The ReAct Pattern: Reason + Act
The core loop behind many production AI agents
The Problem That ReAct Solves
Imagine you hand an agent a question and tell it: “You can search the web. Answer this.” Without any structure around its reasoning, the agent might immediately fire off a search query — but for what, exactly? It might search for the wrong thing, get back irrelevant results, and give up. Or worse, it might confidently answer from memory and get it completely wrong.
Early agent designs suffered from “impulsive action” — the model would call tools without thinking through what it actually needed, leading to wasted calls and compounding errors. The fix was a structured reasoning pattern that forces the agent to think before acting.
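ReAct (Reasoning + Acting) is that pattern. Introduced by Yao et al. in 2022 in “ReAct: Synergizing Reasoning and Acting in Language Models”, it interleaves explicit reasoning steps with action calls, creating a traceable, debuggable chain. The authors report that it outperforms action-only prompting on question-answering and interactive decision-making benchmarks.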
The ReAct Structure
Every step in a ReAct loop follows a strict format:
Thought: [the agent reasons about what it knows and what it needs]
Action: [tool_name(parameters)]
Observation: [the result returned by the tool]
... repeat ...
Thought: I now have enough information to answer.
Final Answer: [the response to the user]
The key insight is the Thought step. By forcing the agent to articulate its reasoning before acting, you get two benefits:
- The agent is less likely to make impulsive, wrong tool calls
- You can read the trace and understand exactly why the agent did what it did
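To make the format concrete, here is a minimal sketch of how a trace could be held in memory and rendered in the layout above. The `ReActStep` class and `render_trace` function are hypothetical names introduced for illustration; they are not part of any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReActStep:
    """One Thought/Action/Observation cycle (hypothetical container)."""
    thought: str
    action: Optional[str] = None       # e.g. 'web_search("...")'; None on the last step
    observation: Optional[str] = None  # tool result; None on the last step


def render_trace(steps: list[ReActStep], final_answer: str) -> str:
    """Render steps in the Thought/Action/Observation layout."""
    lines = []
    for step in steps:
        lines.append(f"Thought: {step.thought}")
        if step.action is not None:
            lines.append(f"Action: {step.action}")
        if step.observation is not None:
            lines.append(f"Observation: {step.observation}")
    lines.append(f"Final Answer: {final_answer}")
    return "\n".join(lines)


trace = [
    ReActStep(
        "I need the host city first.",
        'web_search("2028 Summer Olympics host city")',
        "Los Angeles, California",
    ),
    ReActStep("I now have enough information to answer."),
]
print(render_trace(trace, "The host city is Los Angeles."))
```

Keeping each step as structured data rather than raw text makes it easier to log, filter, and inspect traces later.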
A Full ReAct Trace: Olympics Weather Example
Let’s walk through a realistic example. The user asks:
“What’s the weather forecast for the city where the 2028 Summer Olympics will be held?”
This requires two pieces of information the model may not know with certainty: the location of the 2028 Olympics, and the current weather there. Here’s the full ReAct trace:
Thought: The user wants the weather forecast for the 2028 Olympics host city.
I should verify the location rather than assume, since my training data
may be out of date. I'll search for it.
Action: web_search("2028 Summer Olympics host city location")
Observation: The 2028 Summer Olympics will be held in Los Angeles, California,
USA, from July 14 to July 30, 2028.
Thought: The 2028 Olympics are in Los Angeles. Now I need the current
weather forecast for Los Angeles. I'll use the weather tool.
Action: get_weather(city="Los Angeles, CA")
Observation: Los Angeles, CA - Current: 72°F (22°C), Sunny.
7-day forecast: Highs 70-78°F, mostly sunny with low humidity.
Thought: I now have both pieces of information I need. The 2028 Olympics
are in Los Angeles, and the weather there is sunny with temperatures
in the low 70s Fahrenheit. I can answer the question directly.
Final Answer: The 2028 Summer Olympics will be held in Los Angeles,
California. It is currently 72°F (22°C) and sunny in LA, and the 7-day
forecast shows highs of 70-78°F (21-26°C), mostly sunny skies, and low
humidity.
Notice how each Thought step explains why the next action is being taken. The agent didn’t just fire off two tool calls at random; it reasoned about the dependency: first find the city, then get the weather for that city.
Implementing ReAct in Python
Here’s a working implementation of the ReAct loop using Claude:
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Define tools available to the agent
tools = [
    {
        "name": "web_search",
        "description": "Search the web for current information. Use for facts, news, and recent events.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    },
    {
        "name": "get_weather",
        "description": "Get current weather and 7-day forecast for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Los Angeles, CA'"}
            },
            "required": ["city"]
        }
    }
]


def mock_search(query: str) -> str:
    """Placeholder search backend that returns a canned result."""
    return ("The 2028 Summer Olympics will be held in Los Angeles, California, "
            "USA, from July 14 to July 30, 2028.")


def mock_weather(city: str) -> str:
    """Placeholder weather backend that returns a canned forecast."""
    return (f"{city} - Current: 72°F (22°C), Sunny. "
            "7-day forecast: Highs 70-78°F, mostly sunny with low humidity.")


def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool and return the result as a string."""
    if tool_name == "web_search":
        # In production: call a real search API (Serper, Tavily, etc.)
        return mock_search(tool_input["query"])
    elif tool_name == "get_weather":
        # In production: call a weather API
        return mock_weather(tool_input["city"])
    return f"Error: Unknown tool {tool_name}"


def run_react_agent(user_question: str, max_iterations: int = 10) -> str:
    """Run the ReAct loop until we get a final answer."""
    messages = [{"role": "user", "content": user_question}]
    system_prompt = """You are a helpful assistant that uses tools to answer questions.
Think step by step. Before calling a tool, explain your reasoning with "Thought:".
After getting a result, reflect on what you learned before deciding the next step.
When you have enough information, provide a "Final Answer:" to the user."""

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            system=system_prompt,
            tools=tools,
            messages=messages
        )

        # Check if we're done (no tool calls)
        if response.stop_reason == "end_turn":
            # Join all text blocks into the final response
            return "".join(b.text for b in response.content if b.type == "text")

        # Any stop reason other than "tool_use" (e.g. "max_tokens") cannot make
        # progress, so stop instead of resending the same messages.
        if response.stop_reason != "tool_use":
            return f"Stopped early: {response.stop_reason}"

        # Add assistant's response (including thought + tool call) to messages
        messages.append({"role": "assistant", "content": response.content})

        # Print each thought and execute each tool call
        tool_results = []
        for block in response.content:
            if block.type == "text":
                print(f"\n{block.text}")
            elif block.type == "tool_use":
                print(f"\nAction: {block.name}({json.dumps(block.input)})")
                result = execute_tool(block.name, block.input)
                print(f"Observation: {result}")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        # Add tool results to messages and continue
        messages.append({"role": "user", "content": tool_results})

    return "Max iterations reached without a final answer."


# Run the example
answer = run_react_agent(
    "What's the weather forecast for the city where the 2028 Summer Olympics will be held?"
)
print(f"\nFinal Answer: {answer}")

The loop is straightforward:
- Send the current conversation to the model
- If the model calls tools, execute them and append results
- If the model gives a final answer, return it
- Repeat up to max_iterations times
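The four steps above can be exercised without any API call by swapping the model for a scripted stand-in. The sketch below is a simplified, model-agnostic version of the loop, not the Anthropic implementation: the turn format (a dict with `thought`, `tool`, `args`, and `final` keys) and the names `run_loop` and `scripted_model` are assumptions made for illustration.

```python
from typing import Callable

# Hypothetical turn format:
# {"thought": str, "tool": str, "args": dict} or {"thought": str, "final": str}
Turn = dict


def run_loop(call_model: Callable[[list], Turn], tools: dict, question: str,
             max_iterations: int = 10) -> str:
    history = [("user", question)]
    for _ in range(max_iterations):
        turn = call_model(history)                  # send the conversation
        history.append(("assistant", turn["thought"]))
        if turn.get("final") is not None:           # final answer: stop
            return turn["final"]
        tool = tools.get(turn["tool"])              # otherwise run the tool
        if tool is None:
            result = f"Error: Unknown tool {turn['tool']}"
        else:
            result = tool(**turn["args"])
        history.append(("observation", result))     # append the result
    return "Max iterations reached without a final answer."


def scripted_model(turns: list) -> Callable[[list], Turn]:
    """Return a fake model that replays canned turns, for offline tests."""
    remaining = iter(turns)
    return lambda history: next(remaining)


tools = {
    "web_search": lambda query: "The 2028 Summer Olympics will be held in Los Angeles.",
    "get_weather": lambda city: f"{city}: 72°F, sunny.",
}
script = [
    {"thought": "Find the host city.", "tool": "web_search",
     "args": {"query": "2028 Olympics host"}},
    {"thought": "Now get the weather.", "tool": "get_weather",
     "args": {"city": "Los Angeles, CA"}},
    {"thought": "I have both facts.", "final": "Los Angeles: 72°F and sunny."},
]
print(run_loop(scripted_model(script), tools, "Weather in the 2028 Olympics host city?"))
```

A scripted model like this is one way to unit-test the loop's control flow (termination, unknown tools, the iteration cap) separately from model quality.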
Reading the Thought Steps
One of the most valuable properties of ReAct is debuggability. When your agent fails, you can read the thought trace and diagnose exactly where it went wrong:
- Wrong first action: The agent misunderstood the task. Check your system prompt and how the question is framed.
- Correct action, wrong parameters: The tool schema description isn’t clear enough.
- Correct action and parameters, wrong interpretation of results: The agent misread the observation. Consider adding formatting guidelines to your system prompt.
- Correct reasoning, never terminates: The agent doesn’t know when it has enough information. Add explicit stopping criteria to the system prompt.
Compare this to agents without explicit reasoning — when they fail, you have no visibility into why. ReAct makes debugging tractable.
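Some of these failure patterns can be flagged mechanically before you read the trace by hand. The following is a rough sketch that assumes tool calls were recorded as a list of dicts; the `diagnose` function and the trace format are hypothetical, and the checks cover only unknown tools, missing required parameters, and repeated identical calls (a common sign of a loop that never terminates).

```python
import json


def diagnose(trace: list[dict], schemas: dict[str, list[str]]) -> list[str]:
    """Flag common failure patterns in a recorded trace.

    trace:   [{"tool": str, "args": dict}, ...] in call order (assumed format)
    schemas: tool name -> list of required parameter names
    """
    findings = []
    seen = set()
    for i, call in enumerate(trace):
        tool, args = call["tool"], call["args"]
        if tool not in schemas:
            findings.append(f"step {i}: unknown tool '{tool}'")
            continue
        missing = [p for p in schemas[tool] if p not in args]
        if missing:
            findings.append(f"step {i}: {tool} missing {missing}")
        # Serialize arguments so identical calls compare equal
        key = (tool, json.dumps(args, sort_keys=True))
        if key in seen:
            findings.append(f"step {i}: repeated call to {tool} with identical arguments")
        seen.add(key)
    return findings


schemas = {"web_search": ["query"], "get_weather": ["city"]}
recorded = [
    {"tool": "web_search", "args": {"query": "2028 Olympics host city"}},
    {"tool": "get_weather", "args": {}},
    {"tool": "web_search", "args": {"query": "2028 Olympics host city"}},
]
for finding in diagnose(recorded, schemas):
    print(finding)
```

Checks like these do not replace reading the Thought steps; they only point you to the step worth reading first.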
When to Use ReAct vs. Simpler Approaches
ReAct is not always the right choice. Here’s a decision guide:
Use ReAct when:
- The task requires multiple sequential tool calls where each call depends on the previous result
- You need transparency and debuggability in the agent’s decision-making
- The task is complex enough that “think before acting” meaningfully improves quality
- You’re building a production system where you need to audit agent behavior
Use a simpler direct approach when:
- The task only needs a single tool call (just call the tool directly)
- You have a fixed workflow that doesn’t require dynamic decision-making
- Latency is critical (each thought step costs tokens and time)
- The task is deterministic — the same sequence of actions is always right
Use a more sophisticated planner when:
- The task requires parallel actions (ReAct is inherently sequential)
- You need to backtrack and try different approaches
- The task involves very long horizons (50+ steps)
ReAct is the sweet spot for most practical agent tasks: it’s structured enough to be reliable, simple enough to be fast, and transparent enough to be debuggable.
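For contrast, here is what the "simpler direct approach" might look like when the workflow is fixed. This sketch hard-codes the two-step Olympics lookup with no model in the loop; `lookup_host_city` and `fetch_weather` are hypothetical stand-ins that return canned data.

```python
def lookup_host_city(event: str) -> str:
    """Hypothetical stand-in for a search call; returns a canned answer."""
    return {"2028 Summer Olympics": "Los Angeles, CA"}.get(event, "unknown")


def fetch_weather(city: str) -> str:
    """Hypothetical stand-in for a weather API call."""
    return f"{city}: 72°F (22°C), sunny."


def weather_for_event(event: str) -> str:
    """Fixed workflow: the order of calls never changes, so no agent loop is needed."""
    city = lookup_host_city(event)
    if city == "unknown":
        return f"No host city found for {event}."
    return fetch_weather(city)


print(weather_for_event("2028 Summer Olympics"))
```

When the sequence of calls is known in advance, plain code like this avoids the token and latency cost of Thought steps. The trade-off is that it cannot adapt when a question needs a different sequence.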
The ReAct Prompt Pattern
The power of ReAct isn’t in the code — it’s in the prompt structure. Here’s the core pattern you can adapt to any model:
You are an agent with access to the following tools:
{tool_descriptions}
Use the following format exactly:
Question: the input question you must answer
Thought: reason about what you need to do
Action: tool_name
Action Input: {"param": "value"}
Observation: the result of the action
... (repeat Thought/Action/Observation as needed)
Thought: I now know the final answer
Final Answer: the final answer to the original question
This explicit format steers the model to structure its output consistently, making it easy to parse out tool calls programmatically.
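As a rough illustration of that parsing step, the sketch below extracts either a tool call or a final answer from one model completion written in the format above. It assumes the Action Input is a JSON object on a single line; `parse_step` and its return shapes are hypothetical, not part of any library.

```python
import json
import re

# Assumes "Action Input:" is followed by a JSON object on one line.
ACTION_RE = re.compile(r"Action:\s*(?P<tool>\w+)\s*Action Input:\s*(?P<args>\{.*\})")
FINAL_RE = re.compile(r"Final Answer:\s*(?P<answer>.+)", re.DOTALL)


def parse_step(text: str):
    """Return ("final", answer), ("action", tool, args), or ("invalid", text)."""
    final = FINAL_RE.search(text)
    if final:
        return ("final", final.group("answer").strip())
    action = ACTION_RE.search(text)
    if action:
        try:
            return ("action", action.group("tool"), json.loads(action.group("args")))
        except json.JSONDecodeError:
            pass  # malformed JSON falls through to "invalid"
    return ("invalid", text)


completion = (
    "Thought: I need the host city.\n"
    "Action: web_search\n"
    'Action Input: {"query": "2028 Summer Olympics host city"}'
)
print(parse_step(completion))
```

Returning an explicit "invalid" result lets the calling loop feed the formatting error back to the model as an observation instead of crashing.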
A Common Mistake: Over-Thinking
ReAct can be pushed too far. Some developers add so many thought steps that the agent spends more time reasoning about whether to search than it does actually searching. A good rule of thumb:
- If the next action is obvious, one Thought step is enough
- Only chain multiple Thoughts without actions when genuinely synthesizing complex information
- Set a max_iterations guard (10-15 for most tasks) to prevent runaway loops
The Thought step is a means to better actions, not an end in itself.
Summary
- ReAct solves the “impulsive action” problem by requiring explicit reasoning before each tool call
- The format is: Thought → Action → Observation → repeat → Final Answer
- The Olympics weather example shows how a two-step dependency is handled cleanly
- The Python implementation runs the loop until a final answer is produced or max iterations is hit
- Use ReAct for multi-step, sequential tasks where transparency matters
- Read the thought trace when debugging — it shows exactly where reasoning went wrong
Next up: Tool Use — how to define tools the model can call, and how to build the execution loop that makes it all work.
