Tag: LLM
35 articles
LLM Application Engineering: Key Practices from Demo to Production
Core experience moving LLM applications from prototype to production: context management, error handling, cost control, observability. No basics, just real pitfalls.
Real-time Voice Interaction Pipeline Latency Optimization
When building voice interaction systems, latency is the core experience metric.
AI Research #135: Gemini 3 Pro Back on Top - MoE, Million-token Context and Deep Think
Explains Gemini 3 Pro's advantages through sparse MoE architecture, million-token context, native multimodal (text/image/video/PDF), thinking depth control (thinking_leve...
AI Research #130: Qwen2.5-Omni Practical Applications
Office assistant, education and training, programming and operations, search-enhanced RAG, device control/plugin agents, and companion entertainment.
AI Research #129: Qwen2.5-Omni-7B Key Specs - VRAM, Context and Deployment
Runs stably at FP16 ~14GB VRAM, with INT8/INT4 quantization (<4GB) enabling deployment on consumer GPUs or edge devices.
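The VRAM figures in this entry follow directly from parameter count × bytes per weight; a back-of-envelope check in plain Python (assuming a 7B-parameter model and ignoring activation and KV-cache overhead):

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: parameter count * bytes per weight."""
    return n_params * (bits_per_weight / 8) / 1e9

n = 7e9  # approximate parameter count of a 7B model
print(f"FP16: {weight_memory_gb(n, 16):.1f} GB")  # 14.0 GB
print(f"INT8: {weight_memory_gb(n, 8):.1f} GB")   # 7.0 GB
print(f"INT4: {weight_memory_gb(n, 4):.1f} GB")   # 3.5 GB, consistent with the <4GB claim
```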
AI Research #127: Qwen2.5-Omni Deep Dive - Thinker-Talker Dual-core Architecture
Engineering breakdown of Qwen2.5-Omni (2024-2025) Thinker-Talker dual-core architecture: unified Transformer decoder for text/image/video/audio fusion, TMRoPE.
AI Investigation #75: From LLM to LBM - Hierarchical Robot Control Architecture Driven by Large Models
Integrating Large Language Models (LLMs) with real-time robot control is driving intelligent upgrades in robotics.
AI Research 13 - LLM and Agent Research: The Rise and Development of LLM Agents
2024 is called the 'Year of Agents'. LLM trends show parallel development of 'bigger and stronger' and 'smaller and more specialized'.
AI Research 12 - LLM and Agent Research: Overview of Major LLM Application Directions
Major LLM application directions in 2024-2025 include enterprise applications (code assistance, customer service, knowledge management) and consumer applications.
LangChain-26 Custom Agent Complete Tutorial: Building a Custom Agent
A Custom Agent is an agent program tailored by users to specific requirements, capable of executing particular tasks or workflows.
LangChain-24 AgentExecutor Comprehensive Guide
This article shows how to use the LangChain library in Python to retrieve documents, load web content, configure OpenAIEmbeddings, and integrate GPT-3.
LangChain-25 ReAct Framework Detailed Explanation and Integration Practice
This article introduces ReAct, a framework that uses logical reasoning and action sequences to achieve goal-oriented tasks through LLM decision-making and operations.
LangChain-22 Text Embedding and FAISS Practical Explanation
Text Embedding maps high-dimensional data (text, images, etc.) into lower-dimensional dense vector spaces.
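The point of embedding-based retrieval is that semantic similarity becomes geometric proximity; a minimal cosine-similarity sketch in plain Python (the toy 3-dimensional vectors are illustrative, not real model embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; real models produce hundreds of dimensions.
query = [0.9, 0.1, 0.0]
docs = {"cats": [0.8, 0.2, 0.1], "finance": [0.0, 0.1, 0.9]}
best = max(docs, key=lambda k: cosine_similarity(query, docs[k]))
print(best)  # "cats" is geometrically closest to the query
```

A vector database like FAISS does the same nearest-neighbor search, but with indexes that scale to millions of vectors.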
LangChain-23 Vector AI Semantic Search System: Vector Databases and Retrieval
Vector Storage, also known as Vector Database, is a database system specifically optimized for storing and retrieving high-dimensional vector data.
LangChain-20 Document Loaders: TextLoader, CSVLoader, PyPDFLoader and More
This article introduces various document loaders provided by the LangChain library, such as TextLoader, CSVLoader, DirectoryLoader, etc., demonstrating how to load and pr...
LangChain Text Splitter: Character, Word, HTML and Code-based Splitting
This article introduces various TextSplitters in the LangChain library, including character-based, word-based, HTML tag-based, and programming language-based splitters...
LangChain Cache Mechanism: InMemoryCache and SQLiteCache Explained
LangChain provides a comprehensive caching mechanism to significantly reduce LLM call latency and costs. Its core includes InMemoryCache (in-memory cache) and SQLiteCache...
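The caching idea behind both backends is simple: key the cache on the exact (prompt, model) pair and return the stored completion on a repeat call. A minimal in-memory sketch of the pattern (plain Python, not LangChain's actual InMemoryCache implementation):

```python
class SimpleLLMCache:
    """In-memory (prompt, model) -> response store, the pattern behind InMemoryCache."""
    def __init__(self):
        self._store = {}

    def lookup(self, prompt: str, model: str):
        return self._store.get((prompt, model))

    def update(self, prompt: str, model: str, response: str):
        self._store[(prompt, model)] = response

cache = SimpleLLMCache()

def cached_call(prompt: str, model: str, llm_fn) -> str:
    hit = cache.lookup(prompt, model)
    if hit is not None:
        return hit             # cache hit: no API call, no cost, near-zero latency
    response = llm_fn(prompt)  # cache miss: pay for the expensive LLM call once
    cache.update(prompt, model, response)
    return response
```

SQLiteCache follows the same lookup/update contract but persists entries to disk, so hits survive process restarts.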
LangChain-19 TokenUsage Callback Function Explained
Explains how to integrate an OpenAI GPT-3 model in Python through the LangChain library, demonstrating how to use the get_openai_callback function to obtain callbacks and execute...
LangChain-16 Using Tools: Mastering LLM Tool Calling
LangChain is a powerful open-source framework designed to help developers more efficiently build and deploy applications based on Large Language Models (LLMs).
LangChain-17 Function Calling AI Function Calling Explained
Function Calling is a core technology for Large Language Models (like GPT-4, Claude, Gemini) to interact with external systems.
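Function Calling works by giving the model schemas of the available tools, letting it emit a structured call, and executing that call locally; a minimal dispatch sketch (the tool, its registry, and the model's JSON reply are all hypothetical stand-ins):

```python
import json

# Hypothetical tool: in a real system this would hit a weather API.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Tool registry: name -> callable, mirroring the schemas sent to the model.
TOOLS = {"get_weather": get_weather}

# A real LLM returns structured JSON like this after seeing the tool schemas;
# here we hard-code a plausible model reply.
model_reply = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'

call = json.loads(model_reply)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # Sunny in Berlin
```

In practice the result is then fed back to the model as a tool message so it can compose the final answer.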
LangChain-14 OpenAI Content Moderation (Moderation) Explained
Moderation refers to the process of reviewing and managing user-generated content (UGC) through manual or automated means.
LangChain-15 Intelligent Knowledge Retrieval: AgentExecutor Practice
Build an intelligent knowledge retrieval system using Wikipedia search plugin, AgentExecutor, and LangChain tools. Covers agent initialization, tool binding...
LangChain-12 Routing By Semantic Similarity
This article introduces a method using large models (such as OpenAI's) and Prompt templates to handle unexpected inputs in program design by calculating the similarity between...
LangChain-13 Memory ConversationBufferMemory: Conversation Context Management
This article introduces how to use tools in the LangChain library to manage conversation context of large models in Python.
LangChain-11 Code Writing FunctionCalling: Autoregressive Language Modeling
This article introduces how GPT models work based on autoregressive language modeling, which generates coherent text by predicting a probability distribution over the next token.
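Autoregressive generation reduces to repeatedly sampling the next token from a distribution conditioned on the tokens so far; a toy sketch with a hand-written bigram table (the probabilities are hypothetical, and a real GPT computes them with a Transformer over the full context, not a lookup table):

```python
import random

# Toy next-token distributions: P(next | previous token).
bigram = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def generate(seed: int = 0) -> list[str]:
    random.seed(seed)
    tokens, current = [], "<s>"
    while current != "</s>":
        dist = bigram[current]
        # Sample the next token proportionally to its probability.
        current = random.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(current)
    return tokens[:-1]  # drop the end-of-sequence marker

print(generate())
```

Greedy decoding, temperature, and top-p sampling are all just different rules for picking from this same per-step distribution.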
LangChain 09 - Query SQL DB with RUN GPT
RUN GPT provides a powerful database query function, allowing users to query database content in natural language.
LangChain 10 - Agents Langchainhub Guide
This article introduces how to use LangChainHub's Hub mechanism through Python code to easily access and share Prompts.
LangChain 07 - Multiple Chains
How to use Runnable and Prompts in LangChain to create chainable conversation flows for multi-stage question answering, with practical examples of sequential and parallel...
LangChain 08 - Query SQL DB with GPT
This article introduces how to use LangChain framework to import Chinook SQLite database through Python script and use GPT model to execute SQL queries, such as calculati...
LangChain 05 - RAG Enhanced Conversational Retrieval
Conversational Search is an intelligent search technology that combines natural language processing and context understanding capabilities.
LangChain 06 - RAG with Source Document
Retrieval-Augmented Generation (RAG) with Source Document is an AI technology framework that combines retrieval with large language model generation.
LangChain 03 - astream_events Streaming Output with FAISS Practice
This article introduces how to use DocArrayInMemorySearch to vectorize text data, combined with OpenAIEmbeddings and GPT-3.5 model, to implement relevant information retr...
LangChain 04 - RAG Retrieval-Augmented Generation
This article explains in detail how to use RAG technology in LangChain, combined with OpenAI's GPT-3.5 model, to improve text generation quality through retrieval and gen...
LangChain 01 - Getting Started: Quick Hello World Guide
This article introduces how to use the LangChain library with OpenAI API and GPT-3.5-turbo model to create a template for generating jokes about specific topics (like cat...
LangChain 02 - JsonOutputParser and Streaming JSON Data Processing Guide
This article explains how to install and use LangChain and the OpenAI API in Python, retrieving a specified country and its population data through async functions.