LangChain 05 - RAG Enhanced Conversational Retrieval

Conversational Retrieval with RAG

Conversational search combines natural language processing with context tracking: it infers the user's query intent and maintains consistency across multiple turns of a conversation, returning more precise results than single-shot keyword search.

Core Features

  1. Context Understanding:

    • Can remember conversation history (typically 3-5 turns)
    • Understands reference relationships (like “it”, “this”, etc.)
    • Example: the user asks “How’s the weather in Beijing?”, then “What about Shanghai?”; the system resolves the second question as another weather query
  2. Intent Recognition:

    • Distinguishes between query types (information lookup, transaction processing, advice seeking, etc.)
    • Understands implicit intent (e.g., “I have a headache” may signal a medical query)
  3. Multi-modal Response:

    • Combine text, image, video and other forms
    • Provide structured answers (like tables, charts) rather than just web links
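The fixed-size history window mentioned above can be sketched in a few lines of Python. The `ConversationWindow` class below is a hypothetical illustration, not part of any framework:

```python
from collections import deque

class ConversationWindow:
    """Keep only the most recent N turns of a conversation (a common default is 3-5)."""

    def __init__(self, max_turns: int = 5):
        # Each turn is a (user_utterance, system_reply) pair;
        # deque silently drops the oldest turn once max_turns is exceeded.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user: str, system: str) -> None:
        self.turns.append((user, system))

    def history(self) -> str:
        # Flatten the window into a prompt-ready transcript.
        return "\n".join(f"Human: {u}\nAI: {s}" for u, s in self.turns)

window = ConversationWindow(max_turns=2)
window.add_turn("How's the weather in Beijing?", "Sunny, 25 degrees.")
window.add_turn("What about Shanghai?", "Cloudy, 22 degrees.")
window.add_turn("And Guangzhou?", "Rainy, 28 degrees.")  # the Beijing turn is evicted
print(len(window.turns))  # → 2
```

Bounding the window like this keeps the prompt short; for longer conversations, the evicted turns can instead be folded into a running summary (see the summarization solution later in this article).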

Technical Implementation

  1. Architecture Components:

    • Dialogue State Tracker (DST)
    • Natural Language Understanding Module (NLU)
    • Retrieval Augmented Generation (RAG) System
    • Response Generator
  2. Key Technologies:

    • Transformer architecture (like BERT, GPT, etc.)
    • Vector database storage and retrieval
    • Knowledge graph integration
    • Reinforcement learning optimization
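Vector storage and retrieval boils down to ranking documents by embedding similarity. A minimal pure-Python sketch, with made-up three-dimensional embeddings standing in for a real embedding model:

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy "vector database": documents paired with hypothetical embedding vectors.
docs = {
    "harrison worked at kensho": [0.9, 0.1, 0.0],
    "sam worked at home":        [0.1, 0.8, 0.1],
    "wuzikang worked at earth":  [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=1):
    # Rank all documents by similarity to the query embedding, return the top k.
    ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
    return ranked[:k]

print(retrieve([0.85, 0.15, 0.05]))  # → ['harrison worked at kensho']
```

Production systems replace the linear scan with approximate nearest-neighbor indexes, but the ranking principle is the same; the `DocArrayInMemorySearch` retriever used in the code below does this behind the `as_retriever()` interface.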

Application Scenarios

  1. Customer Service:

    • Complex inquiry handling
    • Automated troubleshooting
    • Example: a telecom assistant walking a customer through the full diagnostic flow for “Why is my network so slow?”
  2. E-commerce:

    • Personalized product recommendations
    • Cross-category product comparison
    • Example: “I want a Bluetooth headset under 2000 yuan, suitable for sports”
  3. Medical Health:

    • Preliminary symptom analysis
    • Drug information queries
    • Precautions reminders

Challenges and Solutions

  1. Challenges:

    • Ambiguous queries
    • Long conversation consistency maintenance
    • Domain knowledge updates
  2. Solutions:

    • Active clarification mechanism (“Do you mean X or Y?”)
    • Conversation history summarization technology
    • Real-time knowledge base synchronization
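The active clarification mechanism can be sketched as follows; the `INTENTS` table and `respond` function are hypothetical stand-ins for a real NLU module:

```python
# Hypothetical table of terms with multiple known senses;
# in practice this would come from an NLU / entity-linking component.
INTENTS = {
    "apple": ["the fruit", "the company Apple Inc."],
    "jaguar": ["the animal", "the car brand"],
}

def respond(query: str):
    """If a query term is ambiguous, ask a clarifying question instead of guessing."""
    for term, senses in INTENTS.items():
        if term in query.lower() and len(senses) > 1:
            return f"Do you mean {senses[0]} or {senses[1]}?"
    return None  # unambiguous: fall through to normal retrieval

print(respond("tell me about apple"))
# → Do you mean the fruit or the company Apple Inc.?
```

A real system would score the senses against the conversation history first and only ask when no sense clearly dominates, so the user is not interrupted unnecessarily.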

Latest Developments

  1. Hybrid Architecture:

    • Combine retrieval-based and generation-based methods
    • Example: Use vector retrieval to get relevant documents, then use LLM to generate answer
  2. Personalized Adaptation:

    • User profile integration
    • Interaction style learning
    • Long-term preference memory
  3. Multi-language Support:

    • Cross-language query understanding
    • Culturally adapted response generation

Install Dependencies

pip install --upgrade --quiet langchain-core langchain-community langchain-openai docarray

Code Implementation

from operator import itemgetter

from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.messages import AIMessage, HumanMessage, get_buffer_string
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate, format_document
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings


# Index three short example documents into an in-memory vector store.
vectorstore = DocArrayInMemorySearch.from_texts(
    ["wuzikang worked at earth", "sam worked at home", "harrison worked at kensho"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()


# Prompt that condenses the chat history and follow-up into a standalone question.
_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)


# Prompt that answers strictly from the retrieved context.
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
ANSWER_PROMPT = ChatPromptTemplate.from_template(template)

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")


# Format each retrieved Document and join them into one context string.
def _combine_documents(
    docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"
):
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)


# Step 1: rewrite the follow-up question as a standalone question.
_inputs = RunnableParallel(
    standalone_question=RunnablePassthrough.assign(
        chat_history=lambda x: get_buffer_string(x["chat_history"])
    )
    | CONDENSE_QUESTION_PROMPT
    | ChatOpenAI(temperature=0)
    | StrOutputParser(),
)

# Step 2: retrieve context relevant to the standalone question.
_context = {
    "context": itemgetter("standalone_question") | retriever | _combine_documents,
    "question": lambda x: x["standalone_question"],
}

# Step 3: answer from the retrieved context only.
conversational_qa_chain = _inputs | _context | ANSWER_PROMPT | ChatOpenAI()

message1 = conversational_qa_chain.invoke(
    {
        "question": "what is his name?",
        "chat_history": [],
    }
)
print(f"message1: {message1}")

message2 = conversational_qa_chain.invoke(
    {
        "question": "where did sam work?",
        "chat_history": [],
    }
)
print(f"message2: {message2}")

message3 = conversational_qa_chain.invoke(
    {
        "question": "where did he work?",
        "chat_history": [
            HumanMessage(content="Who wrote this notebook?"),
            AIMessage(content="Harrison"),
        ],
    }
)
print(f"message3: {message3}")

Code Explanation

At initialization, three short documents are indexed into the vector store:

  • “wuzikang worked at earth”
  • “sam worked at home”
  • “harrison worked at kensho”

The chain is then invoked with three questions:

  • “question”: “what is his name?”
  • “question”: “where did sam work?”
  • “question”: “where did he work?”

The referents of “his” and “he” are never stated in the questions themselves; the model resolves them from the indexed documents and, in the third case, the supplied chat history.

Running Result

message1: The name of the person we were just talking about is Wuzikang.
message2: Sam worked at home.
message3: Harrison worked at Kensho.