Token Usage Callback Overview
LangChain tracks token consumption through callback handlers. For OpenAI models, the built-in OpenAICallbackHandler records token usage during language model calls. It can help developers:
- Precisely monitor API call costs
- Optimize prompt design
- Analyze model usage efficiency
Core Functions
The token-usage callback primarily records three types of token data:
- Input tokens (prompt_tokens): Counts the number of prompt tokens sent to the model
- Output tokens (completion_tokens): Counts the number of tokens in the model’s returned results
- Total tokens (total_tokens): The sum of the above two
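The accounting behind these three counters can be sketched as a plain class. In LangChain you would subclass BaseCallbackHandler and attach the handler via callbacks=[...]; here the hook and the fake response object are stand-ins for illustration only:

```python
from types import SimpleNamespace

class TokenCounter:
    """Sketch of the accounting a token-usage callback performs.

    A real LangChain handler would subclass BaseCallbackHandler and
    receive an LLMResult in on_llm_end; the arithmetic is the same.
    """

    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.total_tokens = 0

    def on_llm_end(self, response, **kwargs):
        # OpenAI-style results carry counts in llm_output["token_usage"]
        usage = (response.llm_output or {}).get("token_usage", {})
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)

# Simulate one model call with a fake response object
counter = TokenCounter()
fake = SimpleNamespace(llm_output={"token_usage": {
    "prompt_tokens": 12, "completion_tokens": 14, "total_tokens": 26}})
counter.on_llm_end(fake)
print(counter.prompt_tokens, counter.completion_tokens, counter.total_tokens)
```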
Implementation Methods
Basic Usage Example
from langchain_community.callbacks import OpenAICallbackHandler
from langchain_openai import OpenAI

# Initialize the callback handler
token_usage = OpenAICallbackHandler()

# Pass the callback when creating the LLM instance
llm = OpenAI(
    temperature=0,
    callbacks=[token_usage]
)

# Execute a query
response = llm.invoke("Please introduce the Python language")

# Read the accumulated token usage
print(f"Input tokens: {token_usage.prompt_tokens}")
print(f"Output tokens: {token_usage.completion_tokens}")
print(f"Total tokens: {token_usage.total_tokens}")
Usage in Chain Calls
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Create a prompt template
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Please explain the following topic in detail: {topic}"
)

# Create the chain
chain = LLMChain(
    llm=llm,
    prompt=prompt,
    callbacks=[token_usage]
)

# Execute the chain
result = chain.invoke({"topic": "Machine Learning"})
Advanced Application Scenarios
Batch Processing Monitoring
topics = ["Deep Learning", "Neural Networks", "Natural Language Processing"]
before = token_usage.total_tokens
for topic in topics:
    chain.invoke({"topic": topic})
# The handler accumulates across calls, so the difference from the
# starting value is the batch total
total_tokens = token_usage.total_tokens - before
print(f"Total token consumption for batch processing: {total_tokens}")
Cost Estimation Tool
def estimate_cost(total_tokens):
    # Example rate: $0.002 per 1,000 tokens (verify current pricing for your model)
    return total_tokens / 1000 * 0.002

cost = estimate_cost(total_tokens)
print(f"Estimated API call cost: ${cost:.4f}")
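Real pricing usually differs for input and output tokens, so a more faithful estimator keeps separate rates per model. The rates below are illustrative only; always check your provider's current price list:

```python
# Illustrative per-1K-token rates (not authoritative; check current pricing)
PRICING = {
    "gpt-3.5-turbo": {"prompt": 0.0015, "completion": 0.002},
}

def estimate_cost_detailed(model, prompt_tokens, completion_tokens):
    """Estimate cost using separate prompt/completion rates per model."""
    rates = PRICING[model]
    return (prompt_tokens / 1000 * rates["prompt"]
            + completion_tokens / 1000 * rates["completion"])

cost = estimate_cost_detailed("gpt-3.5-turbo", 1200, 800)
print(f"${cost:.4f}")
```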
Notes
- Multiple Callbacks: Can be combined with other callbacks (such as StdOutCallbackHandler)
- Async Environment: Ensure callback handlers are thread-safe when used in async calls
- Reset Mechanism: Long-running applications should periodically start a fresh handler so counters do not grow unbounded
- Model Differences: Token counting may vary slightly between models
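The thread-safety note above comes down to guarding counter updates with a lock, since concurrent callbacks may fire from multiple threads. A minimal stdlib sketch (the class name is illustrative, not a LangChain API):

```python
import threading

class ThreadSafeTokenCounter:
    """Token counter whose updates are safe under concurrent callbacks."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_tokens = 0

    def add(self, n):
        # The lock ensures read-modify-write is atomic across threads
        with self._lock:
            self.total_tokens += n

counter = ThreadSafeTokenCounter()
threads = [threading.Thread(target=lambda: [counter.add(1) for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.total_tokens)  # 4000
```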
Best Practice Recommendations
- Always enable a token-usage callback in development environments
- Set token consumption alerts for critical business operations
- Regularly analyze historical token usage data
- Optimize prompt structure based on token consumption
- Consider logging token usage to monitoring systems
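The alerting recommendation can be as simple as a budget check that logs a warning when consumption crosses a threshold. A minimal sketch (the budget value and function name are hypothetical):

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("token_usage")

TOKEN_BUDGET = 50_000  # hypothetical per-run budget

def check_budget(total_tokens, budget=TOKEN_BUDGET):
    """Return True and log a warning once consumption exceeds the budget."""
    if total_tokens > budget:
        logger.warning("Token budget exceeded: %d > %d", total_tokens, budget)
        return True
    return False

print(check_budget(60_000))  # over budget: logs a warning
print(check_budget(10_000))  # within budget
```

In production, the same check would typically feed a metrics or monitoring system rather than a local logger.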
Install Dependencies
pip install -qU langchain langchain-core langchain-community langchain-openai
Using get_openai_callback
get_openai_callback is a practical tool built into the LangChain framework that provides:
- Total token consumption (including input and output)
- Total number of API calls
- Estimated cost (USD)
- Detailed records for each call
Usage Example
from langchain_community.callbacks import get_openai_callback
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-3.5-turbo",
)

with get_openai_callback() as cb:
    result = llm.invoke("Tell me a joke")
    print(cb)
Running Results
Tokens Used: 26
Prompt Tokens: 12
Completion Tokens: 14
Successful Requests: 1
Total Cost (USD): $4.6e-05
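The reported cost reconciles with per-token arithmetic. Assuming the then-current gpt-3.5-turbo rates of $0.0015 per 1K prompt tokens and $0.002 per 1K completion tokens (illustrative; pricing changes over time):

```python
# Rates assumed: $0.0015 per 1K prompt tokens, $0.002 per 1K completion tokens
prompt_tokens, completion_tokens = 12, 14
cost = prompt_tokens / 1000 * 0.0015 + completion_tokens / 1000 * 0.002
print(f"${cost:.1e}")  # matches the callback's reported total
```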