## Background

This article explains how GPT models generate text through autoregressive language modeling: at each step the model predicts a probability distribution over the next token, conditioned on everything generated so far, and coherent text emerges from repeating that prediction.
## GPT's Prediction Mechanism

- When generating each token, the model computes a probability distribution over the entire vocabulary, conditioned on the preceding context
- It then selects the next token according to a decoding strategy, such as greedy search, beam search, or temperature sampling
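The two steps above can be sketched with a toy vocabulary. The tokens and logit values below are made up purely for illustration; real models work over vocabularies of tens of thousands of tokens:

```python
import math
import random

random.seed(0)

# Hypothetical next-token logits for a toy 4-token vocabulary
vocab = ["cat", "dog", "the", "ran"]
logits = [2.0, 1.0, 0.5, -1.0]

def softmax(scores):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Greedy search: always pick the highest-probability token
probs = softmax(logits)
greedy_token = vocab[probs.index(max(probs))]

def sample_with_temperature(logits, temperature=0.8):
    """Temperature sampling: rescale logits by 1/T before the softmax,
    then draw at random; higher T flattens the distribution."""
    scaled = [s / temperature for s in logits]
    return random.choices(vocab, weights=softmax(scaled), k=1)[0]

print(greedy_token)                     # the token with the largest logit
print(sample_with_temperature(logits))  # a random draw, usually but not always the same token
```

Greedy search is deterministic; temperature sampling trades some of that determinism for diversity, which is why the same prompt can yield different completions.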
- **Training data scale**: GPT-3, for example, was trained on hundreds of billions of tokens spanning encyclopedias, technical documents, math textbooks, and many other kinds of text
- **Mathematical knowledge acquisition**: during training the model is exposed to a large number of mathematical expressions, from basic operations to complex derivations
## Limitations
- This capability is limited to common operations the model has seen during training
- For more complex or rare mathematical problems, the model may give incorrect answers
- The model does not actually understand mathematical principles; it merely mimics correct expression patterns
## The Large-Number Calculation Problem
Asking GPT to add very long numbers (such as 12311111111111111 + 999999988888888111) exposes two problems:
- **Training data limitations**: large language models typically encounter numbers no longer than 10-15 digits during training
- **Computation method differences**: humans add digit by digit with carries ("column addition"), while the model tries to emit the complete result in one pass
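Python's built-in integers have arbitrary precision, so the sum from the example above is exact. The routine below (the function name is my own) mimics the human "column addition" just described, and agrees with Python's native `+`:

```python
def column_add(a: int, b: int) -> int:
    """Add two non-negative integers digit by digit with a carry,
    the way humans do column addition."""
    xs = [int(d) for d in reversed(str(a))]  # least-significant digit first
    ys = [int(d) for d in reversed(str(b))]
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        carry, digit = divmod(x + y + carry, 10)
        digits.append(digit)
    if carry:
        digits.append(carry)
    return int("".join(str(d) for d in reversed(digits)))

a, b = 12311111111111111, 999999988888888111
print(column_add(a, b))  # identical to Python's exact big-integer addition
print(a + b)
```

An LLM has no such carry loop: it must produce all ~19 output digits from pattern recall alone, which is exactly where the long-number errors come from.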
## Solutions

- Step-by-step guidance: prompt the model to work through the calculation digit by digit
- Programming mode: ask the model to express the calculation as code
- Tool calling (best practice): have the model write Python code and execute it in a Python interpreter
## Installing Dependencies

```shell
pip install --upgrade --quiet langchain-core langchain-experimental langchain-openai
```
## Code Implementation

```python
from langchain_experimental.utilities import PythonREPL

python_repl = PythonREPL()
result = python_repl.run("print(2 + 2)")  # captures and returns the printed output ("4")
```
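Conceptually, a REPL tool like this executes the code string and captures whatever it prints. The stdlib-only sketch below illustrates that idea; it is a simplified stand-in, not LangChain's actual implementation:

```python
import contextlib
import io

def run_python(code: str) -> str:
    """Execute a Python code string and return what it printed.
    A minimal sketch of what a REPL tool must do; NOT LangChain's code."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # WARNING: exec runs arbitrary code with no sandboxing
        exec(code, {})
    return buffer.getvalue()

small = run_python("print(2 + 2)")  # small == "4\n"
big = run_python("print(12311111111111111 + 999999988888888111)")
print(small, big)
```

Because the arithmetic now happens in the interpreter rather than in the model's token predictions, the long-number sum is exact.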
## Complete Example Code

````python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_experimental.utilities import PythonREPL
from langchain_openai import ChatOpenAI

template = """Write some python code to solve the user's problem.
Return only python code in Markdown format, e.g.:
```python
....
```"""

prompt = ChatPromptTemplate.from_messages([("system", template), ("human", "{input}")])
model = ChatOpenAI(model="gpt-3.5-turbo")

def _sanitize_output(text: str):
    """Extract the code between the ```python and ``` fences."""
    _, after = text.split("```python")
    result = after.split("```")[0]
    # Log the raw model reply for debugging
    print("---code---")
    print(text)
    print("---code---")
    return result

# Prompt the model, strip the Markdown fence, then execute the code
chain = prompt | model | StrOutputParser() | _sanitize_output | PythonREPL().run

message = chain.invoke({"input": "whats 2 plus 2"})
print(f"message: {message}")
````
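The fence-stripping step can be checked on its own. Here `_sanitize_output`'s parsing logic is reimplemented (without the debug prints) so it runs with no LangChain dependencies; the sample reply string is made up:

````python
def sanitize_output(text: str) -> str:
    """Strip a Markdown ```python fence, keeping only the code inside
    (same parsing logic as _sanitize_output, minus the debug prints)."""
    _, after = text.split("```python")
    return after.split("```")[0]

# A made-up model reply wrapping code in a Markdown fence
reply = "Here you go:\n```python\nresult = 2 + 2\nprint(result)\n```\nDone."
code = sanitize_output(reply)
print(code)
````

Note that `text.split("```python")` unpacks into exactly two parts, so this parser raises a `ValueError` if the reply contains zero or multiple fenced blocks; production code would want a more tolerant parser.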
## Running Results

````
---code---
```python
result = 2 + 2
print(result)
```
---code---
Python REPL can execute arbitrary code. Use with caution.
message: 4
````
---
## Key Points
| Item | Description |
|------|-------------|
| PythonREPL | LangChain experimental utility that executes Python code strings in the current process and captures their output |
| Function | Supports executing model-generated Python code snippets and capturing execution results |
| Use Cases | When LLM needs to perform mathematical calculations or data processing |
| Security Note | Not sandboxed: as its runtime warning says, "Python REPL can execute arbitrary code. Use with caution" — only run trusted or reviewed code |