Implementing the ReAct Pattern for AI Agents: A Hands-On Tutorial

The ReAct pattern is one of the most widely used reasoning patterns for AI agents. The name stands for Reasoning and Acting. Instead of generating an answer in a single pass, the agent thinks about what to do (Reason), takes an action (Act), observes the result (Observe), and repeats until it has enough information to answer.

This three-step loop is what separates an agent from a chatbot. A chatbot generates responses. An agent reasons about what it needs, takes actions to get it, and synthesizes the results into an answer.

This tutorial builds a ReAct agent from scratch in Python. No frameworks. No abstractions. Just the pattern itself, implemented clearly enough that you understand every component and can adapt it for your own agent systems.

[Figure: visual overview of the ReAct pattern implementation.]

How the ReAct loop works

The core loop is simple:

1. THINK: The model reasons about what to do next
2. ACT: The model calls a tool based on its reasoning
3. OBSERVE: The tool result is fed back to the model
4. REPEAT until the model has a final answer

Each iteration adds to the context: the model sees its previous reasoning, actions, and observations. This growing context lets the agent build on what it’s learned. A research agent might search, read the results, realize it needs more specific information, search again with a refined query, and then synthesize both results.

The key insight is that the model decides when it has enough information to answer. It doesn’t stop after one tool call. It doesn’t stop after a fixed number of iterations. It stops when its reasoning concludes that it can answer the question with the information gathered.

Setting up the tools

Before building the loop, define the tools the agent can use. Each tool has a name, a description (which the model reads to decide when to use it), and an execute function.

class Tool:
    def __init__(self, name: str, description: str, func):
        self.name = name
        self.description = description
        self.func = func

    def execute(self, input_text: str) -> str:
        try:
            return self.func(input_text)
        except Exception as e:
            return f"Error: {str(e)}"


def web_search(query: str) -> str:
    """Simulate a web search (replace with real API)."""
    # In production, call a real search API
    return f"Search results for '{query}': [simulated results would appear here]"


def calculator(expression: str) -> str:
    """Evaluate a math expression with basic input filtering."""
    allowed = set("0123456789+-*/.() ")
    if not all(c in allowed for c in expression):
        return "Error: Invalid characters in expression"
    try:
        # Strip builtins as a precaution; in production, use a safe parser
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}"


# Define available tools
tools = [
    Tool("search", "Search the web for current information", web_search),
    Tool("calculator", "Calculate mathematical expressions", calculator),
]

The tool descriptions matter. The model uses them to decide which tool to call. Vague descriptions lead to wrong tool selections. Be specific about what each tool does and what kind of input it expects.
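For instance, a description that spells out the expected input format gives the model more to work with than a vague one. This is an illustrative sketch; a minimal stand-in for the Tool class is included so the snippet runs on its own.

```python
# Minimal stand-in for the Tool class defined above, so this
# snippet runs on its own.
class Tool:
    def __init__(self, name, description, func):
        self.name = name
        self.description = description
        self.func = func

# A vague description like "does math" gives the model little to go on.
# Spelling out what input the tool expects helps it construct valid calls.
calculator_tool = Tool(
    "calculator",
    "Evaluate an arithmetic expression such as '67500000 * 0.15'. "
    "Input may contain only digits, +, -, *, /, parentheses, and spaces.",
    lambda expr: str(eval(expr)),  # stand-in; use the calculator() above
)

print(calculator_tool.description)
```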

Building the ReAct prompt

The system prompt teaches the model the ReAct pattern. It needs to know the format for reasoning, acting, and when to give a final answer.

def build_system_prompt(tools: list) -> str:
    tool_descriptions = "\n".join(
        f"- {t.name}: {t.description}" for t in tools
    )

    return f"""You are a helpful assistant that answers questions by reasoning
step by step and using tools when needed.

Available tools:
{tool_descriptions}

When you need to use a tool, respond in this exact format:
Thought: [your reasoning about what to do next]
Action: [tool_name]
Action Input: [input to the tool]

When you have enough information to answer, respond in this format:
Thought: [your final reasoning]
Final Answer: [your answer to the question]

Always start with a Thought. Use tools when you need information you
don't have. You can use multiple tools in sequence.
"""

The format is strict on purpose. Parsing the model’s output requires predictable structure. The model learns to follow the format from the format instructions in the system prompt and from the pattern of Thought/Action/Observation in the conversation history.

Building the ReAct loop

The core loop parses the model’s output, executes tool calls, and feeds results back until the model produces a final answer.

import openai

class ReActAgent:
    def __init__(self, tools: list, model: str = "gpt-4o", max_iterations: int = 5):
        self.tools = {t.name: t for t in tools}
        self.model = model
        self.max_iterations = max_iterations
        self.client = openai.OpenAI()

    def run(self, question: str) -> str:
        system_prompt = build_system_prompt(list(self.tools.values()))
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]

        for i in range(self.max_iterations):
            # Get model response
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0.1,
            )
            assistant_message = response.choices[0].message.content

            # Check for final answer
            if "Final Answer:" in assistant_message:
                return self._extract_final_answer(assistant_message)

            # Parse action
            action = self._parse_action(assistant_message)
            if action is None:
                # Model didn't follow format; ask it to try again
                messages.append({"role": "assistant", "content": assistant_message})
                messages.append({
                    "role": "user",
                    "content": "Please respond using the Thought/Action format or give a Final Answer."
                })
                continue

            # Execute tool
            tool_name, tool_input = action
            if tool_name not in self.tools:
                observation = f"Error: Tool '{tool_name}' not found. Available: {list(self.tools.keys())}"
            else:
                observation = self.tools[tool_name].execute(tool_input)

            # Add to conversation
            messages.append({"role": "assistant", "content": assistant_message})
            messages.append({
                "role": "user",
                "content": f"Observation: {observation}"
            })

        return "I was unable to find an answer within the allowed number of steps."

    def _parse_action(self, text: str) -> tuple | None:
        """Extract tool name and input from model response."""
        if "Action:" not in text or "Action Input:" not in text:
            return None

        lines = text.split("\n")
        action_name = None
        action_input = None

        for line in lines:
            if line.strip().startswith("Action:"):
                action_name = line.split("Action:", 1)[1].strip()
            elif line.strip().startswith("Action Input:"):
                action_input = line.split("Action Input:", 1)[1].strip()

        if action_name and action_input:
            return (action_name, action_input)
        return None

    def _extract_final_answer(self, text: str) -> str:
        """Extract the final answer from model response."""
        if "Final Answer:" in text:
            return text.split("Final Answer:", 1)[1].strip()
        return text
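The parsing logic can be sanity-checked in isolation. Here the same algorithm as _parse_action is lifted into a standalone function and run on a made-up model response:

```python
def parse_action(text: str):
    """Standalone copy of _parse_action's logic, for testing."""
    if "Action:" not in text or "Action Input:" not in text:
        return None
    action_name = action_input = None
    for line in text.split("\n"):
        if line.strip().startswith("Action:"):
            action_name = line.split("Action:", 1)[1].strip()
        elif line.strip().startswith("Action Input:"):
            action_input = line.split("Action Input:", 1)[1].strip()
    if action_name and action_input:
        return (action_name, action_input)
    return None

sample = """Thought: I need France's current population.
Action: search
Action Input: population of France 2024"""

print(parse_action(sample))  # ('search', 'population of France 2024')
```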

Running the agent

agent = ReActAgent(tools=tools)
answer = agent.run("What is 15% of the population of France?")
print(answer)

The agent will:

1. Think about what it needs (France’s population)
2. Search for France’s population
3. Observe the result
4. Think about the calculation
5. Use the calculator for 15% of the population
6. Observe the result
7. Give the final answer

Each iteration is visible in the message history. You can log every Thought, Action, and Observation for debugging and evaluation.
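A minimal way to capture that trace is a helper that flattens each message onto one log line; the logger setup here is just one option among many.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("react")

def log_step(role: str, content: str) -> str:
    """Flatten a message onto one log line. Call this wherever run()
    appends to `messages` so every step ends up in the log."""
    line = f"[{role}] " + content.replace("\n", " | ")
    log.info("%s", line)
    return line

log_step("assistant", "Thought: I need France's population\nAction: search")
```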

Adding harness infrastructure

The basic ReAct loop works, but it needs production hardening. Here are three essential additions.

Retry logic for tool failures

Tools fail. APIs time out. Databases go offline. Wrap tool execution with retry logic:

import time

def execute_with_retry(self, tool_name: str, tool_input: str, max_retries: int = 2) -> str:
    for attempt in range(max_retries + 1):
        result = self.tools[tool_name].execute(tool_input)
        if not result.startswith("Error:") or attempt == max_retries:
            return result
        # Linear backoff before retrying
        time.sleep(1 * (attempt + 1))
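The retry behavior can be demonstrated standalone with a flaky stub tool; the stub and its fail-once pattern are contrived for illustration.

```python
import time

def execute_with_retry(tool, tool_input, max_retries=2):
    """Same retry logic, written against a bare callable tool."""
    for attempt in range(max_retries + 1):
        result = tool(tool_input)
        if not result.startswith("Error:") or attempt == max_retries:
            return result
        time.sleep(0.01 * (attempt + 1))  # short delay for the demo

calls = []
def flaky_tool(x):
    """Stub that fails once, then succeeds."""
    calls.append(x)
    return "Error: timeout" if len(calls) < 2 else f"ok: {x}"

print(execute_with_retry(flaky_tool, "query"))  # prints "ok: query" after one retry
```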

Cost tracking

Track token usage per iteration so you can monitor and control costs:

def track_cost(self, response) -> dict:
    usage = response.usage
    # Rates below are illustrative GPT-4o prices (USD per 1K tokens);
    # check your provider's current pricing
    cost = (usage.prompt_tokens * 0.005 / 1000) + (usage.completion_tokens * 0.015 / 1000)
    return {
        "input_tokens": usage.prompt_tokens,
        "output_tokens": usage.completion_tokens,
        "cost_usd": cost,
    }
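To monitor a whole run rather than one response, those per-iteration numbers can be accumulated in a small tracker. The default prices here are illustrative, not authoritative:

```python
class CostTracker:
    """Accumulate token usage across iterations. Default prices are
    illustrative (USD per 1K tokens); substitute your provider's rates."""

    def __init__(self, input_price: float = 0.005, output_price: float = 0.015):
        self.input_price = input_price
        self.output_price = output_price
        self.total = {"input_tokens": 0, "output_tokens": 0, "cost_usd": 0.0}

    def add(self, prompt_tokens: int, completion_tokens: int) -> dict:
        # Call once per iteration with the response.usage figures
        cost = (prompt_tokens * self.input_price
                + completion_tokens * self.output_price) / 1000
        self.total["input_tokens"] += prompt_tokens
        self.total["output_tokens"] += completion_tokens
        self.total["cost_usd"] += cost
        return self.total

tracker = CostTracker()
tracker.add(1200, 150)  # iteration 1
tracker.add(1600, 180)  # iteration 2
print(tracker.total["input_tokens"])  # 2800
```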

Iteration limits with context

The max_iterations parameter prevents infinite loops, but a hard cutoff isn’t ideal. Better to track progress: if the last two iterations made the same tool call with the same input, the agent is stuck. Stop early and return what you have.

def detect_loop(self, messages: list) -> bool:
    """Check if the agent is repeating the same action with the same input."""
    actions = []
    for msg in messages:
        if msg["role"] == "assistant" and "Action:" in msg.get("content", ""):
            # Compare only the Action/Action Input lines, not the Thought,
            # so differently worded reasoning doesn't hide a repeated call
            action_lines = [
                line.strip() for line in msg["content"].split("\n")
                if line.strip().startswith(("Action:", "Action Input:"))
            ]
            actions.append("\n".join(action_lines))
    return len(actions) >= 2 and actions[-1] == actions[-2]
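A simplified standalone version of the check (comparing whole assistant messages) shows the idea on a contrived stuck transcript:

```python
def detect_loop(messages):
    """Simplified loop check: True when the last two assistant
    messages containing an Action are identical."""
    actions = [m["content"] for m in messages
               if m["role"] == "assistant" and "Action:" in m.get("content", "")]
    return len(actions) >= 2 and actions[-1] == actions[-2]

stuck = [
    {"role": "assistant", "content": "Thought: need data\nAction: search\nAction Input: population of France"},
    {"role": "user", "content": "Observation: no results"},
    {"role": "assistant", "content": "Thought: need data\nAction: search\nAction Input: population of France"},
]
print(detect_loop(stuck))  # True
```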

When ReAct works and when it doesn’t

ReAct works well for:
– Research tasks that require multiple information lookups
– Math problems that need calculation tools
– Tasks where the model needs to verify its own assumptions
– Multi-step processes where each step depends on previous results

ReAct struggles with:
– Tasks requiring very long reasoning chains (10+ steps); the context window fills and quality degrades
– Highly parallel tasks where multiple independent lookups could happen simultaneously (ReAct is sequential)
– Tasks where the model doesn’t know what tools are available or relevant

For parallel tasks, consider the router pattern from our multi-agent design patterns guide. For a broader view of agent design patterns including ReAct, see our agent design patterns guide.

Frequently asked questions

How is ReAct different from chain-of-thought prompting?

Chain-of-thought asks the model to reason step by step but doesn’t include actions. The model reasons in its head without interacting with the world. ReAct adds the action step: the model can use tools, observe results, and incorporate external information into its reasoning. Chain-of-thought is reasoning only. ReAct is reasoning plus acting.

What model should I use for ReAct agents?

Any model that follows instructions reliably. GPT-4o, Claude Sonnet, and Gemini Pro all work well. Smaller models (GPT-4o mini, Haiku) work for simple tool-use tasks but struggle with complex multi-step reasoning. Start with a capable model and optimize down after you have working evaluations.

How do I debug a ReAct agent that gives wrong answers?

Log every Thought, Action, Action Input, and Observation. Read the reasoning trace. The bug is almost always in one of three places: the model chose the wrong tool (fix the tool descriptions), the tool returned wrong results (fix the tool), or the model misinterpreted the observations (fix the system prompt to be more explicit about how to interpret results).

Can I use ReAct with function calling instead of text parsing?

Yes, and many production systems do. Instead of parsing “Action: search” from text, use the model’s native function calling feature. The model returns a structured tool call that you execute directly. This is more reliable than text parsing and avoids format errors. The reasoning pattern is the same; only the interface changes.
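As a sketch of the structured side of that approach: the tool is described with a JSON schema, and the model’s tool call arrives as a JSON string of arguments that can be dispatched directly. The schema field names follow OpenAI’s function-calling format; the helper and registry here are illustrative, not a specific library’s API.

```python
import json

# Tool schema in the JSON-schema style used by function-calling APIs
# (field names follow OpenAI's format; treat as illustrative).
search_tool_schema = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for current information",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def dispatch_tool_call(name: str, arguments_json: str, registry: dict) -> str:
    """Execute a structured tool call. `arguments_json` is the JSON string
    the API returns in tool_call.function.arguments; no text parsing."""
    args = json.loads(arguments_json)
    if name not in registry:
        return f"Error: Tool '{name}' not found"
    return registry[name](**args)

registry = {"search": lambda query: f"Search results for '{query}'"}
print(dispatch_tool_call("search", '{"query": "population of France"}', registry))
```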

