There’s a moment in every engineer’s journey when a chatbot stops feeling impressive and starts feeling limiting. You want something that doesn’t just answer — you want something that acts. An AI that can use tools, make decisions in a loop, and carry out a task end-to-end without you holding its hand.
That’s what agentic AI is. And you’re five minutes away from building one.
This tutorial is the first stop in the agentic AI course track here on harnessengineering.academy. By the end, you’ll have a working Python agent that calls a real tool, processes the result, and returns a final answer — all in under 60 lines of code. More importantly, you’ll understand why each piece exists, so you can extend it into something production-worthy.
What Is an Agentic AI System (and Why It Matters Now)
Before we touch code, let’s be precise about what we’re building — because “AI agent” gets used to describe everything from a glorified chatbot to a fully autonomous software engineer.
A chatbot takes your input, runs it through a language model, and returns text. One turn. Done. It has no memory of prior turns unless you explicitly pass them in, and it can’t take any action in the world.
An AI agent is different in three fundamental ways:
- Tools — The model can invoke external functions: search the web, call an API, read a file, run code.
- Memory — The agent maintains context across multiple steps in a task, not just a single exchange.
- Autonomy — The agent decides when to use a tool, which tool to use, and when it’s done — all without you scripting each step.
This isn’t a future-facing concept. According to Gartner’s 2024 AI Hype Cycle, over 70% of enterprises plan to deploy AI agents in production workflows within two years. The global AI agents market is projected to grow from roughly $5 billion in 2024 to over $47 billion by 2030 — a compounding growth rate near 45%. Developers who internalize agentic patterns report building automation prototypes 3 to 5 times faster than with traditional scripting approaches.
Learning to build agents now isn’t getting ahead of the curve — it’s catching up to where the industry already is.
Who This Tutorial Is For
You need to know basic Python — functions, loops, dictionaries. You don’t need to know anything about LLMs, frameworks, or AI beyond “I’ve used ChatGPT.” If you can run a Python script in a terminal, you’re ready.
What You’ll Need Before You Start
Prerequisites
- Python 3.9 or later — Check with `python3 --version`.
- A terminal — Any shell works: bash, zsh, PowerShell on Windows.
- An Anthropic API key — Sign up at console.anthropic.com. The free tier is sufficient for this tutorial.
Install the SDK
```bash
pip install anthropic
```
That’s the only dependency for this tutorial. No LangChain, no AutoGen, no orchestration framework yet — we’re building the bare-metal version first so you understand what those frameworks are abstracting.
Project Folder Structure
Create a folder and a single Python file:
```
my-first-agent/
└── agent.py
```
That’s it. Everything lives in agent.py for now. When your agent grows, you’ll naturally split it — but start minimal.
Step 1 — Define Your Agent’s Goal and Tool
Every agent operates on a loop. The simplest version has four stages:
Perceive → Decide → Act → Observe
- Perceive — The agent receives a task (your message).
- Decide — The model determines whether to use a tool or answer directly.
- Act — If a tool is needed, the agent calls it.
- Observe — The agent feeds the tool result back into the model and loops.
The loop terminates when the model decides it has enough information to answer — signaled by a stop_reason of "end_turn" instead of "tool_use".
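Concretely, the four stages map onto a loop shaped like this. This is a pseudocode sketch, not runnable code; `model` and `tools` are placeholders for the real API calls we wire up in Step 2:

```
# Pseudocode: the perceive-decide-act-observe loop
def agent_loop(task):
    history = [task]                      # Perceive: the task enters context
    while True:
        decision = model(history, tools)  # Decide: answer directly or call a tool?
        if decision.is_final_answer:      # i.e. stop_reason == "end_turn"
            return decision.text
        result = tools[decision.tool_name](decision.tool_input)  # Act
        history.append(result)            # Observe: feed the result back and loop
```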
Choosing Your First Tool
For this tutorial, we’ll build a simple calculator tool. It’s perfect for a first agent because:
- No external API key needed
- Deterministic output — easy to verify correctness
- Forces the model to use a tool rather than guess at arithmetic
Here’s the tool defined as a Python function:
```python
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression and return the result."""
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error: {e}"
```
Gotcha — Never use `eval()` in production. For this tutorial it’s fine because we control the input. In any real system, use a safe math parser like `simpleeval` or `asteval`. We’re keeping it simple here to focus on the agent loop, not input sanitization.
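If you want to drop `eval()` without adding a dependency, the standard library’s `ast` module can do the job. Here is a minimal safe arithmetic evaluator — my own sketch, not part of the tutorial’s code path — that walks the parsed expression tree and permits only basic numeric operations:

```python
import ast
import operator

# Whitelist of permitted AST operators. Anything else raises ValueError.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a pure arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {ast.dump(node)}")
    return _eval(ast.parse(expression, mode="eval"))
```

Function calls, attribute access, and names are all rejected, so `__import__` tricks fail with a `ValueError` instead of executing.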
Writing the System Prompt
The system prompt defines your agent’s role and instructs it on when to use tools. Keep it short and direct:
```python
SYSTEM_PROMPT = """You are a helpful assistant with access to a calculator tool.
When a question requires arithmetic, use the calculator tool rather than guessing.
Always show your reasoning before calling a tool."""
```
Step 2 — Wire Up the Tool-Calling Loop
Here’s the complete agent in one file. Read it through once before we walk it line by line:
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM_PROMPT = """You are a helpful assistant with access to a calculator tool.
When a question requires arithmetic, use the calculator tool rather than guessing.
Always show your reasoning before calling a tool."""

TOOLS = [
    {
        "name": "calculate",
        "description": "Evaluate a mathematical expression. Input must be a valid Python arithmetic expression.",
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "A Python arithmetic expression, e.g. '(15 * 4) + 7'"
                }
            },
            "required": ["expression"]
        }
    }
]

def calculate(expression: str) -> str:
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages,
        )

        # Append the assistant response to the message history
        messages.append({"role": "assistant", "content": response.content})

        # If the model is done, return the final text
        if response.stop_reason == "end_turn":
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text

        # Otherwise, process tool calls
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                tool_input = block.input
                result = calculate(tool_input["expression"])
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })

        # Feed results back into the conversation
        messages.append({"role": "user", "content": tool_results})

if __name__ == "__main__":
    answer = run_agent("If I have 3 groups of 17 items, and then add 42 more, how many do I have total?")
    print(answer)
```
Annotated Walkthrough
Imports and client: Import anthropic and construct the client, which reads your ANTHROPIC_API_KEY from the environment.
TOOLS list: This is how you describe your tool to the model. The input_schema follows JSON Schema format. The model reads this and knows exactly what arguments to pass when it decides to call calculate. This schema is the bridge between the model’s intent and your Python function.
run_agent function: This is the core loop. Notice it uses while True — the loop only breaks when stop_reason == "end_turn". Every iteration either gets the final answer or executes a tool call and loops again.
messages.append after each response: This is how the agent maintains memory within a task. Every exchange — user message, assistant response, tool results — gets added to the running conversation. The model always sees the full history.
tool_results with tool_use_id: When you feed tool output back to the model, you must match it to the original tool call using tool_use_id. Miss this and the model loses track of which result corresponds to which request.
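To make the pairing concrete, here is the shape of the two messages involved. The id value below is made up for illustration; real ids are generated by the API:

```python
# Hypothetical message pair: the tool_result must echo the tool_use block's id.
assistant_turn = {
    "role": "assistant",
    "content": [{
        "type": "tool_use",
        "id": "toolu_abc123",            # generated by the API
        "name": "calculate",
        "input": {"expression": "(3 * 17) + 42"},
    }],
}

user_turn = {
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": "toolu_abc123",   # must match the id above exactly
        "content": "93",
    }],
}
```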
Common Beginner Mistakes
Gotcha 1 — Missing loop termination. If you don’t check `stop_reason == "end_turn"`, your agent loops forever even after the model has its answer. Always gate your loop on the stop reason.

Gotcha 2 — Not passing tool results back. A frequent mistake is calling the tool but forgetting to append the `tool_result` message before the next model call. Without it, the model doesn’t know what the tool returned and will often hallucinate an answer or call the tool again.

Gotcha 3 — Forgetting to append the assistant message. You must add `response.content` to `messages` as an `assistant`-role message before appending tool results as a `user`-role message. The order matters.
Step 3 — Run It and See It Think
Set your API key and run the script:
```bash
export ANTHROPIC_API_KEY="your-key-here"
python3 agent.py
```
You should see something like:
```
Let me calculate that for you.

3 groups of 17 items = 51 items, plus 42 more = 93 items total.

You have **93 items** in total.
```
Behind the scenes, here’s what happened:
- Your message was sent to Claude with the tool definition attached.
- Claude’s response came back with `stop_reason: "tool_use"` and a `tool_use` block containing `{"expression": "(3 * 17) + 42"}`.
- Your Python code called `calculate("(3 * 17) + 42")`, got `"93"`, and sent it back.
- Claude received the result, composed a final answer, and returned `stop_reason: "end_turn"`.
- Your loop extracted the text and printed it.
Adding a Second Tool
Want to add a second capability without rewriting anything? Just define the tool function and append its schema to TOOLS:
```python
from datetime import date

def get_current_date() -> str:
    return date.today().isoformat()
```

Add its schema to the `TOOLS` list:

```python
{
    "name": "get_current_date",
    "description": "Returns today's date in ISO format (YYYY-MM-DD).",
    "input_schema": {
        "type": "object",
        "properties": {},
        "required": []
    }
}
```
Then in your tool dispatch, add a conditional:
```python
if block.name == "calculate":
    result = calculate(tool_input["expression"])
elif block.name == "get_current_date":
    result = get_current_date()
```
The model will now use whichever tool fits the task. You extended the agent’s capabilities without touching the loop logic.
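As the tool count grows, the if/elif chain gets unwieldy. One common refactor — a design sketch, not the tutorial’s code — is a dispatch table mapping tool names to handlers, so adding a tool never touches the loop body:

```python
from datetime import date

def calculate(expression: str) -> str:
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"

def get_current_date() -> str:
    return date.today().isoformat()

# Map tool names to handlers; each handler adapts the tool's input dict
# to the underlying function's signature.
TOOL_HANDLERS = {
    "calculate": lambda inp: calculate(inp["expression"]),
    "get_current_date": lambda inp: get_current_date(),
}

def dispatch(name: str, tool_input: dict) -> str:
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        # Return an error string rather than raising, so the model can recover
        return f"Error: unknown tool '{name}'"
    return handler(tool_input)
```

In the loop you would then call `dispatch(block.name, block.input)` instead of the if/elif chain.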
From 5-Minute Prototype to Production-Ready Harness
Congratulations — you’ve built a working AI agent. But let’s be honest about what you haven’t built yet.
What Breaks When You Scale
The prototype above is fragile in predictable ways:
- No retries. If the API call fails due to a network timeout or rate limit, your agent crashes. In production, you need exponential backoff and retry logic around every model call.
- No observability. You can’t see how long tool calls take, how many tokens you’re burning, or which steps failed. Without logging and tracing, debugging production agents is guesswork.
- No cost controls. The `while True` loop will run until the model says stop. A misbehaving agent or a pathological input could loop dozens of times before terminating, billing tokens on every iteration.
- No error isolation. If a tool handler raises an unhandled exception, the whole agent crashes rather than returning a graceful tool error. Our `calculate()` catches its own errors, but any new tool you add won’t unless you remember to guard it.
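To give a flavor of the retry piece, here is a generic exponential-backoff wrapper. This is a sketch under simplifying assumptions: in a real harness you would catch only transient failures (rate limits, timeouts, 5xx responses), not every exception.

```python
import random
import time

def with_retries(fn, max_attempts=4, base_delay=1.0, retryable=(Exception,)):
    """Call fn(); on a retryable failure, back off exponentially with jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delays of base, 2*base, 4*base, ... plus jitter so that many
            # clients retrying at once don't hammer the API in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
```

You would wrap each model call as `with_retries(lambda: client.messages.create(...))`, narrowing `retryable` to the transient error types of your SDK.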
The Harness Engineering Mindset
This is where harness engineering comes in — the discipline of wrapping AI agents in the infrastructure that makes them reliable, observable, and safe to operate in production. A harness isn’t the agent itself; it’s everything around the agent that makes it trustworthy:
- Retry and circuit-breaker policies
- Token budget management
- Structured logging and tracing
- Input/output validation
- Timeout enforcement
Think of it the same way you think about writing a web server. You wouldn’t ship a Flask app with no error handling, no logging, and no request timeouts. Your AI agent deserves the same engineering rigor. For a deep dive into what a production harness looks like, see our harness engineering fundamentals guide.
Next Steps
Here’s the honest progression from where you are now:
- Add error handling — Wrap `client.messages.create` in a try/except, catch `anthropic.APIError`, and implement basic retry logic.
- Add memory — Explore conversation persistence so your agent can pick up a task where it left off, not just within a single run.
- Add observability — Log every tool call with its inputs, outputs, and latency. You’ll thank yourself the first time something goes wrong at 2 AM.
- Explore multi-agent patterns — Some tasks are too complex for a single agent. Breaking work into specialized sub-agents with a coordinator is one of the most powerful patterns in agentic AI.
Your Agentic AI Learning Path: What’s Next
You’ve completed the foundation. Here’s the recommended progression to go from this five-minute prototype to engineering production-grade agentic systems.
Recommended Learning Progression
Level 1 (You are here): Single-Agent Fundamentals
– Tool calling and the perceive-decide-act-observe loop
– Prompt engineering for agents
– Basic error handling
Level 2: Reliable Single Agents
– Retry and fallback patterns
– Memory architectures (in-context, external vector store, key-value)
– Observability and cost tracking
– Structured output and output validation
Level 3: Multi-Agent Systems
– Orchestrator-worker patterns
– Parallelization and fan-out
– Inter-agent communication and state passing
– Harness patterns for coordinated agents
Level 4: Production Harness Engineering
– Deployment and scaling
– Safety and guardrails
– Evaluation and regression testing for agentic systems
– Incident response for autonomous AI
Resources on harnessengineering.academy
The full agentic AI course track builds directly on what you’ve started here:
- Tool-Calling Deep Dive — Goes beyond the basics into parallel tool calls, streaming, and error recovery.
- Memory Architectures for AI Agents — A practical guide to giving your agent durable memory.
- Multi-Agent Patterns — How to coordinate multiple agents to tackle complex tasks.
- Harness Engineering Fundamentals — The broader discipline that makes agents production-ready.
Practice Projects to Cement the Skills
The fastest way to internalize agentic patterns is to build something you actually want to use. Here are three projects at increasing difficulty:
- Research assistant — An agent with a web search tool that compiles a structured summary on any topic. Beginner-friendly, no external storage needed.
- File organizer — An agent that reads a messy directory, categorizes files by type and date, and moves them into organized subfolders. Introduces file system tools and multi-step planning.
- Code reviewer — An agent that reads a Python file, identifies potential bugs and style issues using static analysis tools, and writes a structured review. Introduces chained tool calls and structured output.
Start Building. Keep Learning.
You now know the core loop behind every AI agent ever built — perceive, decide, act, observe. The prototype you built today uses the same fundamental architecture as the agents running in enterprise workflows right now. The difference is the harness around them.
The harness is what this entire academy is about.
Ready to go deeper? Enroll in the full Agentic AI Course Track on harnessengineering.academy — a structured, hands-on curriculum that takes you from this first agent all the way to production-grade harness engineering. Every lesson includes working code, real-world examples, and the engineering judgment that separates prototype builders from production engineers.
The agents running the next generation of software are being built by people who started exactly where you are right now.
Written by Kai Renner — Senior AI/ML Engineering Leader and founder of harnessengineering.academy. Kai has spent a decade building reliable AI systems in production and writes to make agentic engineering accessible to every developer.