Tag: LLM
35 articles
LLM Application Engineering: Key Practices from Demo to P...
Core experience moving LLM applications from prototype to production: context management, error handling, cost control, observability. No basics, just real pitfalls.
Real-time Voice Interaction Pipeline Latency Optimization...
Documents the process of building an ASR→LLM→TTS real-time voice pipeline: why latency is high, how pipeline concurrency reduces first-byte latency, VAD endpoint detection pitfalls, and practical co...
AI Research #135: Gemini 3 Pro Back on Top - MoE, Million...
Explains Gemini 3 Pro's advantages through sparse MoE architecture, million-token context, native multimodal input (text/image/video/PDF), thinking depth control (thinking_level), and Deep Think mode. St...
AI Research #130: Qwen2.5-Omni Practical Applications
Office assistant, education and training, programming and operations, search-enhanced RAG, device control/plugin agents, and companion entertainment. Covers...
AI Research #129: Qwen2.5-Omni-7B Key Specs - VRAM, Conte...
Runs stably at FP16 ~14GB VRAM, with INT8/INT4 quantization (<4GB) enabling deployment on consumer GPUs or edge devices. Combined with FlashAttention 2 and...
AI Research #127: Qwen2.5-Omni Deep Dive - Thinker-Talker...
Engineering breakdown of the Qwen2.5-Omni (2024-2025) Thinker-Talker dual-core architecture: unified Transformer decoder for text/image/video/audio fusion, TMRoPE...
AI Investigation #75: From LLM to LBM - Robot Hierarchica...
The integration of Large Language Models (LLM) with robot real-time control is driving intelligent upgrades in robotics. LLMs show great potential in...
AI Research 13 - LLM and Agent Research: The Rise and Dev...
2024 is called the 'Year of Agents'. LLM trends show parallel development of 'bigger and stronger' and 'smaller and more specialized'. OpenAI o1 series, Claude, and other multimodal models continue...
AI Research 12 - LLM and Agent Research: Overview of Majo...
Major LLM application directions in 2024-2025 include enterprise applications (code assistance, customer service, knowledge management) and consumer applications (general conversation, content crea...
LangChain-26 Custom Agent Complete Tutorial Building a Cu...
This article demonstrates how to create a chat agent in Python using the LangChain library and the GPT-4 model by defining tool functions and integrating them with the LLM to achieve queries for informatio...
LangChain-24 AgentExecutor Comprehensive Guide
This article introduces how to use the LangChain library in Python for document retrieval: loading web content, configuring OpenAIEmbeddings, and integrating the GPT-3.5-turbo model for Q&A. It demonstrates h...
LangChain-25 ReAct Framework Detailed Explanation Integra...
This article introduces ReAct, a framework that uses logical reasoning and action sequences to achieve goal-oriented tasks through LLM decision-making and operations. The core components include Th...
LangChain-22 Text Embedding and FAISS Practical Explanation
This article introduces the key role of text embedding in NLP, how to convert text into real-valued vectors that represent semantic relationships, and how to combine OpenAIEmbeddings and FAISS for eff...
LangChain-23 Vector AI Semantic Search System Vector Data...
This article introduces how to use Chroma vector database to process and retrieve high-dimensional vector embeddings from documents, vectorize them using...
LangChain-20 Document Loaders TextLoader, CSVLoader, PyPD...
This article introduces the document loaders provided by the LangChain library, such as TextLoader, CSVLoader, and DirectoryLoader, and demonstrates how to load and process data in various formats.
LangChain Text Splitter: Character, Word, HTML and Code-b...
This article introduces various TextSplitters in the LangChain library, including character-based, word-based, HTML tag-based, and programming language-based splitters, as well as their application...
LangChain Cache Mechanism: InMemoryCache and SQLiteCache ...
LangChain provides a comprehensive caching mechanism to significantly reduce LLM call latency and costs. Its core includes InMemoryCache (in-memory cache) and SQLiteCache (persistent cache).
LangChain-19 TokenUsage Callback Function Explained
Explains how to integrate an OpenAI GPT-3 model in Python through the LangChain library, demonstrating how to use the `get_openai_callback` function to track token usage while executing requests.
LangChain-16 Using Tools: Mastering LLM Tool Calling
LangChain is currently one of the most popular LLM application development frameworks, specifically designed for building intelligent assistants, automation...
LangChain-17 Function Calling AI Function Calling Explained
Function Calling is a core technology for Large Language Models (like GPT-4, Claude, Gemini) to interact with external systems. It enables AI to not only understand language but also execute tasks,...
LangChain-14 OpenAI Content Moderation (Moderation) Expla...
Content moderation is a core component of modern internet platform safety and compliance, used to identify, filter, and manage user-generated content (UGC) to prevent the spread of illegal, low-qua...
LangChain-15 Intelligent Knowledge Retrieval: AgentExecut...
Build an intelligent knowledge retrieval system using a Wikipedia search plugin, AgentExecutor, and LangChain tools. Covers agent initialization, tool binding, and multi-step reasoning workflows.
LangChain-12 Routing By Semantic Similarity
This article introduces a method that uses large models (such as OpenAI's) and prompt templates to handle unexpected inputs in program design by calculating the similarity between each query and a set of preset templates.
LangChain-13 Memory ConversationBufferMemory: Conversatio...
This article introduces how to use tools in the LangChain library to manage conversation context of large models in Python. Through components like...
LangChain-11 Code Writing FunctionCalling: Autoregressive...
This article introduces how to use the GPT-3.5-Turbo model to write Python code to solve users' abstract calculation problems, such as 2+2 and complex mathematical expressions, demonstrating the mo...
LangChain 09 - Query SQL DB with RUN GPT
This article introduces how to use Python libraries like langchain and ChatOpenAI (GPT-3.5-turbo) combined with a SQLite database to create a program that executes SQL queries and returns results in natu...
LangChain 10 - Agents Langchainhub Guide
This article introduces how to use LangChainHub's Hub mechanism through Python code to easily access and share Prompts. Although the project hasn't been...
LangChain 07 - Multiple Chains
How to use Runnable and Prompts in LangChain to create chainable conversation flows for multi-stage question answering, with practical examples of sequential and parallel chain composition.
LangChain 08 - Query SQL DB with GPT
This article introduces how to use the LangChain framework to import the Chinook SQLite database through a Python script and use a GPT model to execute SQL queries, such as calculating the employee count.
LangChain 05 - RAG Enhanced Conversational Retrieval
This article introduces how to use tools in the LangChain library, such as OpenAIEmbeddings and ChatModels, combined with document retrieval technology, to create a program that generates answers based...
LangChain 06 - RAG with Source Document
Retrieval-Augmented Generation (RAG) with Source Document is an AI technology framework that combines retrieval with large language model generation. Its core...
LangChain 03 - astream_events Streaming Output with FAISS...
This article introduces how to use DocArrayInMemorySearch to vectorize text data, combined with OpenAIEmbeddings and the GPT-3.5 model, to implement relevant information retrieval and answer generation...
LangChain 04 - RAG Retrieval-Augmented Generation
This article explains in detail how to use RAG technology in LangChain, combined with OpenAI's GPT-3.5 model, to improve text generation quality through retrieval and generation. Provides installat...
LangChain 01 - Getting Started: Quick Hello World Guide
This article introduces how to use the LangChain library with OpenAI API and GPT-3.5-turbo model to create a template for generating jokes about specific topics (like cats). The author demonstrates...
LangChain 02 - JsonOutputParser and Streaming JSON Data P...
This article explains how to install and use LangChain and the OpenAI API in Python and how to retrieve a specified country and its population data through async functions, and demonstrates the process of progress...