LangChain 04 - RAG (Retrieval-Augmented Generation)
RAG Overview
RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval with text generation: relevant documents are retrieved from an external knowledge base and passed to the LLM as context, so answers are grounded in source material rather than only in the model's training data.
RAG Workflow
RAG operates through three core stages:
- Retrieval: search the knowledge base for documents relevant to the user's query
- Augmentation: rank and filter the retrieved documents and insert them into the prompt
- Generation: the LLM produces a response grounded in the augmented context
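The three stages above can be sketched with a minimal, library-free toy pipeline. The sample knowledge base, the word-overlap scoring rule, and the prompt template here are illustrative assumptions, not part of LangChain:

```python
# Toy sketch of the three RAG stages: retrieval, augmentation, generation.
# The knowledge base and scoring rule are made up for illustration.

KNOWLEDGE_BASE = [
    "harrison worked at kensho",
    "bears like to eat honey",
    "kensho is a fintech company",
]

def retrieve(query: str, docs: list[str]) -> list[tuple[int, str]]:
    """Retrieval: score each document by word overlap with the query."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.lower().split())), d) for d in docs]
    return [(s, d) for s, d in scored if s > 0]

def augment(scored: list[tuple[int, str]], top_k: int = 2) -> list[str]:
    """Augmentation: rank by score and keep only the top_k documents."""
    return [d for _, d in sorted(scored, reverse=True)[:top_k]]

def generate(question: str, context: list[str]) -> str:
    """Generation: in real RAG an LLM answers; here we just fill the prompt."""
    return f"Context: {'; '.join(context)}\nQuestion: {question}"

question = "where did harrison work?"
filled_prompt = generate(question, augment(retrieve(question, KNOWLEDGE_BASE)))
print(filled_prompt)
```

In a real system the scoring step uses embedding similarity and the final string is sent to an LLM, but the control flow is the same.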
RAG Advantages
- Reduces hallucination by grounding the model's answers in retrieved source documents
- Makes generated content more accurate, reliable, and verifiable against its sources
Application Scenarios
- Customer Service
- Research Assistant
- Enterprise Knowledge Management
ChatPromptTemplate
- Core class for building chat prompt templates
- Supports reusable templates, dynamic variable insertion, multi-role conversations
- Integrates with LLMChain, Memory, Agents, and OutputParsers
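The core idea behind a chat prompt template, role-tagged messages with variable slots filled at invocation time, can be shown with a toy class. This mimics the shape of ChatPromptTemplate.from_messages but is not the real LangChain API:

```python
# Toy sketch of a chat prompt template: (role, template) pairs with
# {variable} slots filled at call time. Not LangChain's actual class.

class ToyChatPromptTemplate:
    def __init__(self, messages: list[tuple[str, str]]):
        self.messages = messages  # (role, template_string) pairs

    def format(self, **variables) -> list[tuple[str, str]]:
        # Dynamic variable insertion via str.format
        return [(role, text.format(**variables)) for role, text in self.messages]

prompt = ToyChatPromptTemplate([
    ("system", "You are a helpful assistant for {domain} questions."),
    ("human", "{question}"),
])
msgs = prompt.format(domain="finance", question="What is RAG?")
print(msgs)
# → [('system', 'You are a helpful assistant for finance questions.'),
#    ('human', 'What is RAG?')]
```

Because the template is just data plus a format step, the same template can be reused across requests with different variables, which is what makes it composable with chains and parsers.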
Install Dependencies
pip install --upgrade --quiet langchain-core langchain-community langchain-openai
pip install langchain docarray tiktoken
Code Example
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings
# Build an in-memory vector store from two sample texts and expose it as a retriever
vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

# Prompt that restricts the model to the retrieved context
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

model = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()

# Run retrieval and pass the raw question through in parallel,
# producing the {"context": ..., "question": ...} dict the prompt expects
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)

# LCEL pipeline: retrieval -> prompt -> model -> string output
chain = setup_and_retrieval | prompt | model | output_parser
message = chain.invoke("where did harrison work?")
print(message)
Running Result
Harrison worked at Kensho.
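The `|` operator that builds the chain above is just function composition: each stage is a callable, and piping chains them so the output of one feeds the next. A simplified conceptual sketch (not LangChain's actual Runnable implementation, and with a hard-coded stand-in for the LLM):

```python
# Conceptual toy of the LCEL pipe: `a | b` composes two runnables so that
# b.invoke(a.invoke(x)) runs when the chain is invoked.

class ToyRunnable:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # `self | other` returns a new runnable that runs self, then other
        return ToyRunnable(lambda x: other.invoke(self.invoke(x)))

    def invoke(self, x):
        return self.fn(x)

retrieve = ToyRunnable(lambda q: {"context": "harrison worked at kensho",
                                  "question": q})
prompt   = ToyRunnable(lambda d: f"Context: {d['context']}\n"
                                 f"Question: {d['question']}")
model    = ToyRunnable(lambda p: "Harrison worked at Kensho.")  # fake LLM

chain = retrieve | prompt | model
print(chain.invoke("where did harrison work?"))
# → Harrison worked at Kensho.
```

The real chain works the same way, except that the retriever performs an embedding search, the model calls an actual LLM, and the output parser extracts the message text.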