LangChain 04 - RAG Retrieval-Augmented Generation

RAG Overview

RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval with text generation: relevant documents are fetched from a knowledge base and supplied to the LLM as context for its answer.

RAG Workflow

RAG operates through three core stages:

  1. Retrieval: Search the knowledge base for documents relevant to the user's question
  2. Augmentation: Insert the retrieved documents into the prompt as context for the question
  3. Generation: The LLM generates a response grounded in that context
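The three stages above can be sketched in plain Python with a toy in-memory knowledge base and a stubbed LLM; all names here are illustrative, not LangChain APIs:

```python
# Minimal sketch of the three RAG stages (retrieve -> augment -> generate).
# The knowledge base, scoring, and LLM stub are all illustrative.

KNOWLEDGE_BASE = [
    "harrison worked at kensho",
    "bears like to eat honey",
]

def retrieve(question: str) -> list[str]:
    """Stage 1: search the knowledge base (here, naive keyword overlap)."""
    words = set(question.lower().split())
    return [doc for doc in KNOWLEDGE_BASE if words & set(doc.split())]

def augment(question: str, docs: list[str]) -> str:
    """Stage 2: inject the retrieved documents into the prompt as context."""
    context = "\n".join(docs)
    return f"Answer based only on this context:\n{context}\n\nQuestion: {question}"

def generate(prompt: str) -> str:
    """Stage 3: call the LLM with the augmented prompt (stubbed here)."""
    return f"[LLM response to prompt of {len(prompt)} chars]"

question = "where did harrison work?"
answer = generate(augment(question, retrieve(question)))
print(answer)
```

The LangChain example later in this lesson implements the same pipeline, with a vector store replacing the keyword search and ChatOpenAI replacing the stub.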

RAG Advantages

  • Reduces hallucination in large language models by grounding answers in retrieved sources
  • Makes generated content more accurate, reliable, and up to date

Application Scenarios

  • Customer Service
  • Research Assistant
  • Enterprise Knowledge Management

ChatPromptTemplate

  • Core class for building chat prompt templates
  • Supports reusable templates, dynamic variable insertion, multi-role conversations
  • Integrates with LLMChain, Memory, Agents, and OutputParsers

Install Dependencies

pip install --upgrade --quiet langchain-core langchain-community langchain-openai
pip install langchain docarray tiktoken

Code Example

from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings

# Build an in-memory vector store over two example texts,
# then expose it as a retriever
vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()

# Fetch context and pass the question through in parallel,
# then pipe the result into prompt -> model -> output parser
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser

# Invoke the chain with the question string as input
message = chain.invoke("where did harrison work?")
print(message)

Running Result

Harrison worked at Kensho.