Deploying Your Own AI Agent Trading Bot with Claude: A Step-by-Step Guide

Most trading bot tutorials give you a script that calls an API and places orders. They skip the part that actually matters: the harness — the structure that keeps your agent from making catastrophic decisions at 2 a.m. when the market moves against every assumption baked into your prompt.

In this guide, you will build an AI agent trading bot powered by Claude that is designed from the start to be observable, controllable, and safe to deploy. You will not just get working code — you will understand why the harness is structured the way it is, and what breaks if you skip any of it.

Risk Disclaimer: Algorithmic trading carries real financial risk. This tutorial is for educational purposes. Paper trade (simulate with no real money) until you deeply understand the system’s behavior. Never deploy with funds you cannot afford to lose.

What you will build

By the end of this guide, you will have a running AI agent trading bot that:

Connects to a market data feed and fetches real-time price and volume data
Uses Claude to analyze market conditions and reason through trade signals
Executes paper trades through a brokerage simulation API
Enforces hard risk controls that the agent cannot override
Produces structured execution traces for every decision
Leaves room for a human-in-the-loop approval step before any real money moves (outlined in "Where to go from here")

The architecture is intentionally beginner-friendly but production-aware. The patterns you learn here apply directly to any agentic system you build after this.

Prerequisites

Before you start, make sure you have:

Python 3.11 or higher installed
An Anthropic API key (get one at console.anthropic.com)
A free Alpaca paper trading account (alpaca.markets — paper trading is free and requires no real money)
Basic Python familiarity — you do not need to be a trading expert

Install the required packages:

pip install anthropic alpaca-trade-api python-dotenv pandas

Step 1: Understand the architecture before writing a line of code

The most common mistake beginners make is going straight to code. Spend five minutes here — it will save you hours of debugging.

Your trading agent has four distinct layers:

Data layer — Fetches market data (prices, volume, indicators) from the API
Agent layer — Sends that data to Claude and receives a structured trade decision
Harness layer — Validates Claude’s decision against hard rules before anything executes
Execution layer — Sends approved orders to the brokerage API

The harness layer is the critical one. Claude is a powerful reasoner, but it operates on the context you give it. If your context is stale, incomplete, or malformed, Claude will reason correctly from bad inputs and produce bad outputs. The harness catches that. It also enforces position limits, maximum trade sizes, and stop-loss rules that are coded in Python — not prompts. You cannot accidentally talk Claude out of a Python if statement.

Market Data API
      ↓
Data Fetcher (Python)
      ↓
Context Builder → Claude API → Structured Decision
                                       ↓
                              Harness Validator
                              (hard rules, risk checks)
                                       ↓
                         [APPROVED] → Execution Layer → Brokerage API
                         [REJECTED] → Log + Skip

Keep this diagram in mind throughout the tutorial.
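The flow in the diagram can be sketched as one function that composes the four layers. This is only an illustration under stated assumptions: run_cycle, Rejected, and the fake_* stubs are hypothetical names that do not appear in the tutorial's files, and the stubs return canned values instead of calling any API.

```python
from typing import Callable


class Rejected(Exception):
    """Hypothetical stand-in for the harness's rejection signal."""


def run_cycle(
    fetch: Callable[[], dict],
    decide: Callable[[dict], dict],
    validate: Callable[[dict, dict], dict],
    execute: Callable[[dict, dict], str],
) -> str:
    """Run one pass through the four layers; a rejection stops before execution."""
    snapshot = fetch()                           # data layer
    decision = decide(snapshot)                  # agent layer
    try:
        approved = validate(decision, snapshot)  # harness layer
    except Rejected as reason:
        return f"REJECTED: {reason}"             # log + skip; execute() never runs
    return execute(approved, snapshot)           # execution layer


# Stub layers with canned values, for illustration only
def fake_fetch() -> dict:
    return {"symbol": "AAPL", "latest_price": 189.42}


def fake_decide(snapshot: dict) -> dict:
    return {"action": "BUY", "quantity": 50}


def fake_validate(decision: dict, snapshot: dict) -> dict:
    if decision["quantity"] > 10:
        raise Rejected("quantity exceeds cap of 10")
    return decision


def fake_execute(decision: dict, snapshot: dict) -> str:
    return f"EXECUTED: {decision['action']} {decision['quantity']}"


print(run_cycle(fake_fetch, fake_decide, fake_validate, fake_execute))
# prints: REJECTED: quantity exceeds cap of 10
```

The point of the shape is that the execution layer is only reachable through the validator's return value, so a decision that fails validation has no path to the brokerage.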

Step 2: Set up your environment and credentials

Create a project directory and a .env file for your API keys. Never hardcode credentials in source files.

mkdir trading-agent && cd trading-agent
touch .env agent.py data_fetcher.py harness.py executor.py main.py

Add your credentials to .env:

ANTHROPIC_API_KEY=your_anthropic_key_here
ALPACA_API_KEY=your_alpaca_key_here
ALPACA_SECRET_KEY=your_alpaca_secret_here
ALPACA_BASE_URL=https://paper-api.alpaca.markets

The ALPACA_BASE_URL points to Alpaca’s paper trading environment. This means every trade you execute during this tutorial is simulated — no real money involved.
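A missing or blank variable in .env tends to surface later as a confusing authentication error. One possible fail-fast check is sketched below. The helper name missing_credentials is an assumption of this sketch, not part of the tutorial's files; the variable names are the four from the .env file above.

```python
import os

# The four variables defined in the .env file above
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "ALPACA_API_KEY",
    "ALPACA_SECRET_KEY",
    "ALPACA_BASE_URL",
)


def missing_credentials(env=None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]


# Example with an explicit mapping instead of the real environment
example_env = {"ANTHROPIC_API_KEY": "sk-example", "ALPACA_API_KEY": "   "}
print(missing_credentials(example_env))
# prints: ['ALPACA_API_KEY', 'ALPACA_SECRET_KEY', 'ALPACA_BASE_URL']
```

Called with no argument after load_dotenv(), the function reads the real environment, and a non-empty result is a reasonable cue to stop before any API client is created.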

Step 3: Build the data fetcher

The data fetcher is responsible for pulling current market data and formatting it into something Claude can reason about. Keep this layer focused on data retrieval only — no trading logic here.

# data_fetcher.py
import alpaca_trade_api as tradeapi
import os
from dotenv import load_dotenv

load_dotenv()

api = tradeapi.REST(
    os.getenv("ALPACA_API_KEY"),
    os.getenv("ALPACA_SECRET_KEY"),
    os.getenv("ALPACA_BASE_URL"),
    api_version='v2'
)

def get_market_snapshot(symbol: str) -> dict:
    """
    Fetch current price, recent bar data, and account position.
    Returns a structured dict Claude can reason about directly.
    """
    bars = api.get_bars(symbol, '1Min', limit=10).df
    latest_price = bars['close'].iloc[-1]
    price_change_pct = ((bars['close'].iloc[-1] - bars['close'].iloc[0]) / bars['close'].iloc[0]) * 100
    avg_volume = bars['volume'].mean()

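    # Check if we already hold a position in this symbol.
    # Alpaca reports "no open position" as an APIError. Catching only that
    # lets other failures (a network error, for example) propagate, so a
    # failed lookup is not silently treated as a zero position.
    try:
        position = api.get_position(symbol)
        current_qty = int(position.qty)
        unrealized_pnl = float(position.unrealized_pl)
    except tradeapi.rest.APIError:
        current_qty = 0
        unrealized_pnl = 0.0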

    account = api.get_account()

    return {
        "symbol": symbol,
        "latest_price": round(latest_price, 2),
        "price_change_10min_pct": round(price_change_pct, 3),
        "avg_volume_10min": round(avg_volume, 0),
        "current_position_qty": current_qty,
        "unrealized_pnl": round(unrealized_pnl, 2),
        "account_buying_power": round(float(account.buying_power), 2),
        "account_equity": round(float(account.equity), 2),
    }

Notice the return format is a clean dictionary. This matters for the next step — Claude performs best when context is structured and explicit, not embedded in a wall of prose.
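Because Claude reasons from whatever it is given, it can help to check the snapshot before it is sent. The sketch below assumes a hypothetical helper, snapshot_problems, that is not part of the tutorial's files; the field names are the ones get_market_snapshot returns.

```python
import math

# Keys returned by get_market_snapshot in data_fetcher.py
REQUIRED_FIELDS = (
    "symbol",
    "latest_price",
    "price_change_10min_pct",
    "avg_volume_10min",
    "current_position_qty",
    "unrealized_pnl",
    "account_buying_power",
    "account_equity",
)


def snapshot_problems(snapshot: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means usable."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in snapshot]

    price = snapshot.get("latest_price")
    # NaN compares False against everything, so it needs its own check
    if isinstance(price, (int, float)) and (math.isnan(price) or price <= 0):
        problems.append(f"implausible latest_price: {price}")

    change = snapshot.get("price_change_10min_pct")
    if isinstance(change, float) and math.isnan(change):
        problems.append("price_change_10min_pct is NaN")

    return problems


good = {name: 1.0 for name in REQUIRED_FIELDS}
bad = dict(good, latest_price=float("nan"))
del bad["account_equity"]

print(snapshot_problems(good))  # prints: []
print(snapshot_problems(bad))
# prints: ['missing field: account_equity', 'implausible latest_price: nan']
```

A cycle whose snapshot has any problems can be skipped before a Claude call is made, which also avoids paying for a decision that the harness would have to distrust.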

Step 4: Build the agent layer (Claude integration)

This is where Claude comes in. You will send the market snapshot to Claude and ask for a structured trade decision. The key discipline here: always request structured output so the harness can parse it reliably.

# agent.py
import anthropic
import json
import os
from dotenv import load_dotenv

load_dotenv()

client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

SYSTEM_PROMPT = """You are a conservative trading analysis agent. Your role is to analyze market data
and recommend a trade action. You must ALWAYS respond with a JSON object and nothing else.

Your response format:
{
  "action": "BUY" | "SELL" | "HOLD",
  "quantity": <integer, shares to buy/sell — 0 if HOLD>,
  "reasoning": "<1-2 sentence explanation of the signal>",
  "confidence": "LOW" | "MEDIUM" | "HIGH",
  "risk_flag": <true if conditions are unusual or volatile, false otherwise>
}

Guidelines:
- Prefer HOLD when signals are ambiguous — inaction is a valid position.
- Never recommend more than 10 shares in a single trade.
- Flag risk_flag=true if price change exceeds 2% in either direction in the last 10 minutes.
- LOW confidence means you have weak signal. MEDIUM means reasonable signal. HIGH means strong signal."""

def get_trade_decision(market_data: dict) -> dict:
    """
    Send market snapshot to Claude and parse the structured decision.
    Returns the parsed decision dict, or raises on parse failure.
    """
    user_message = f"Analyze this market data and provide your trade recommendation:\n\n{json.dumps(market_data, indent=2)}"

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": user_message}]
    )

    raw_text = response.content[0].text.strip()

    # Parse and validate the response is proper JSON
    try:
        decision = json.loads(raw_text)
    except json.JSONDecodeError as e:
        raise ValueError(f"Claude returned non-JSON response: {raw_text[:200]}") from e

    return decision

The system prompt does three important things. First, it asks Claude for a specific JSON schema. This is your first harness layer, built into the prompt itself, but it is a request rather than a guarantee, which is why the reply is still parsed and validated in code. Second, it encodes conservative defaults (prefer HOLD, cap at 10 shares). Third, it instructs Claude to flag its own uncertainty through the risk_flag and confidence fields. You will use both in the harness.
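A reply can be valid JSON and still not match the schema, and models sometimes wrap JSON in a Markdown code fence. The sketch below shows one way to tolerate the fence and check each field. The function name parse_decision and the exact checks are assumptions of this sketch; the field names and allowed values come from the system prompt above.

```python
import json

ALLOWED_ACTIONS = ("BUY", "SELL", "HOLD")
ALLOWED_CONFIDENCE = ("LOW", "MEDIUM", "HIGH")
FENCE = "`" * 3  # Markdown code-fence marker


def parse_decision(raw_text: str) -> dict:
    """Parse a model reply into a decision dict; raise ValueError on any problem."""
    text = raw_text.strip()
    # Tolerate a reply wrapped in a code fence, with or without a "json" tag
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text[:4].lower() == "json":
            text = text[4:].strip()
    try:
        decision = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {raw_text[:200]}") from exc

    if not isinstance(decision, dict):
        raise ValueError("expected a JSON object")
    if decision.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {decision.get('action')!r}")
    quantity = decision.get("quantity")
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 0:
        raise ValueError(f"invalid quantity: {quantity!r}")
    if decision.get("confidence") not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unknown confidence: {decision.get('confidence')!r}")
    if not isinstance(decision.get("risk_flag"), bool):
        raise ValueError("risk_flag must be true or false")
    return decision


reply = (
    FENCE + "json\n"
    '{"action": "HOLD", "quantity": 0, "reasoning": "flat", '
    '"confidence": "LOW", "risk_flag": false}\n' + FENCE
)
print(parse_decision(reply)["action"])  # prints: HOLD
```

Raising ValueError keeps the sketch compatible with the error type that get_trade_decision already raises on a parse failure.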

Step 5: Build the harness validator

This is the most important component in the entire system. The harness runs after Claude’s decision and before any trade executes. It enforces rules that are not negotiable — in code, not in prompts.

# harness.py

MAX_TRADE_QUANTITY = 10         # Hard cap: never trade more than 10 shares
MAX_POSITION_SIZE = 20          # Hard cap: never hold more than 20 shares total
MIN_BUYING_POWER = 500.0        # Hard floor: stop trading if account drops this low
MAX_UNREALIZED_LOSS = -200.0    # Hard stop-loss: close position if loss exceeds this

class HarnessViolation(Exception):
    """Raised when a trade decision violates harness rules."""
    pass

def validate_decision(decision: dict, market_data: dict) -> dict:
    """
    Validate Claude's trade decision against hard risk rules.
    Returns approved decision dict or raises HarnessViolation.
    """
    action = decision.get("action")
    quantity = decision.get("quantity", 0)
    confidence = decision.get("confidence", "LOW")
    risk_flag = decision.get("risk_flag", False)
    current_qty = market_data.get("current_position_qty", 0)

    # Rule 6: Trigger automatic stop-loss regardless of Claude's recommendation.
    # Checked first so that none of the blocking rules below can prevent it.
    unrealized_pnl = market_data.get("unrealized_pnl", 0)
    if unrealized_pnl < MAX_UNREALIZED_LOSS and current_qty > 0:
        # Override Claude: force a SELL to cut losses
        decision["action"] = "SELL"
        decision["quantity"] = current_qty
        decision["reasoning"] = f"HARNESS OVERRIDE: Stop-loss triggered. Unrealized P&L: ${unrealized_pnl}"
        return decision

    # Rule 0: Reject malformed decisions before applying any numeric rule
    if action not in ("BUY", "SELL", "HOLD"):
        raise HarnessViolation(f"Trade blocked: Unknown action {action!r}.")
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 0:
        raise HarnessViolation(f"Trade blocked: Invalid quantity {quantity!r}.")

    # Rule 1: Reject trades when Claude itself flags elevated risk
    if risk_flag and action != "HOLD":
        raise HarnessViolation(f"Trade blocked: Claude flagged elevated risk. Action was {action}.")

    # Rule 2: Reject LOW confidence trades; inaction is cheaper than bad trades
    if confidence == "LOW" and action != "HOLD":
        raise HarnessViolation(f"Trade blocked: LOW confidence signal. Action was {action}.")

    # Rule 3: Hard quantity cap regardless of what Claude recommends
    if quantity > MAX_TRADE_QUANTITY:
        raise HarnessViolation(f"Trade blocked: Quantity {quantity} exceeds hard cap of {MAX_TRADE_QUANTITY}.")

    # Rule 4: Enforce maximum position size, and never sell more than is held
    # (selling more than the position would open a short)
    if action == "BUY" and (current_qty + quantity) > MAX_POSITION_SIZE:
        raise HarnessViolation(f"Trade blocked: Would exceed max position size of {MAX_POSITION_SIZE} shares.")
    if action == "SELL" and quantity > current_qty:
        raise HarnessViolation(f"Trade blocked: Cannot sell {quantity} shares with only {current_qty} held.")

    # Rule 5: Stop buying if account buying power falls too low.
    # SELL orders stay allowed so that exposure can still be reduced.
    buying_power = market_data.get("account_buying_power", 0)
    if action == "BUY" and buying_power < MIN_BUYING_POWER:
        raise HarnessViolation(f"Trade blocked: Buying power ${buying_power} below minimum ${MIN_BUYING_POWER}.")

    return decision

The stop-loss override on Rule 6 is a deliberate design choice. The harness does not just block — it can override and force a protective action that Claude may not have recommended. This is exactly the kind of hard control that separates a toy demo from a system you can actually trust.
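The same principle can be applied to the volatility flag. The 2% threshold lives in the system prompt, so it depends on Claude applying it correctly; the harness can recompute it from the raw number. The sketch below assumes a hypothetical helper, harness_risk_flag, and a constant that mirrors the prompt's 2% guideline.

```python
VOLATILITY_THRESHOLD_PCT = 2.0  # mirrors the 2% guideline in the system prompt


def harness_risk_flag(decision: dict, market_data: dict) -> bool:
    """True if Claude flagged risk OR the raw 10-minute move exceeds the threshold."""
    move = abs(market_data.get("price_change_10min_pct", 0.0))
    return bool(decision.get("risk_flag", False)) or move > VOLATILITY_THRESHOLD_PCT


# Claude did not flag risk, but the price fell 2.5% in ten minutes
print(harness_risk_flag({"risk_flag": False}, {"price_change_10min_pct": -2.5}))
# prints: True
```

Feeding this value into Rule 1 in place of Claude's own risk_flag would mean the volatility check no longer depends on the model following the prompt.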

Step 6: Build the executor

The executor is the final gatekeeper. It records a full execution trace for every decision, whether the outcome is a HOLD, a dry run, a submitted order, or a failed submission.

# executor.py
import alpaca_trade_api as tradeapi
import os
import json
from datetime import datetime, timezone
from dotenv import load_dotenv

load_dotenv()

api = tradeapi.REST(
    os.getenv("ALPACA_API_KEY"),
    os.getenv("ALPACA_SECRET_KEY"),
    os.getenv("ALPACA_BASE_URL"),
    api_version='v2'
)

def execute_trade(decision: dict, market_data: dict, dry_run: bool = True) -> dict:
    """
    Execute an approved trade decision.
    dry_run=True logs the trade without actually submitting it.
    Returns a full execution trace for observability.
    """
    trace = {
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated as of Python 3.12
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "symbol": market_data["symbol"],
        "decision": decision,
        "market_snapshot": market_data,
        "dry_run": dry_run,
        "order_submitted": False,
        "order_id": None,
        "error": None,
    }

    action = decision.get("action")

    if action == "HOLD" or decision.get("quantity", 0) == 0:
        trace["result"] = "HOLD — no order submitted"
        _log_trace(trace)
        return trace

    if dry_run:
        trace["result"] = f"DRY RUN: Would submit {action} {decision['quantity']} shares"
        _log_trace(trace)
        return trace

    try:
        order = api.submit_order(
            symbol=market_data["symbol"],
            qty=decision["quantity"],
            side=action.lower(),
            type="market",
            time_in_force="gtc"
        )
        trace["order_submitted"] = True
        trace["order_id"] = order.id
        trace["result"] = f"Order submitted: {action} {decision['quantity']} shares"
    except Exception as e:
        trace["error"] = str(e)
        trace["result"] = "Order submission failed"

    _log_trace(trace)
    return trace

def _log_trace(trace: dict):
    """Append execution trace to a local JSONL log file."""
    with open("execution_traces.jsonl", "a") as f:
        f.write(json.dumps(trace) + "\n")
    print(f"[{trace['timestamp']}] {trace['result']}")

The dry_run=True default is intentional. You must explicitly pass dry_run=False to submit real orders. This prevents accidental live execution during development.
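A second guard can sit next to the dry_run flag: refuse to leave dry-run mode unless the configured endpoint is the paper environment. The sketch below is one possible shape, assuming hypothetical helpers is_paper_endpoint and effective_dry_run; the host name is the one from the .env file in Step 2.

```python
from urllib.parse import urlparse

PAPER_HOST = "paper-api.alpaca.markets"  # host from ALPACA_BASE_URL in Step 2


def is_paper_endpoint(base_url: str) -> bool:
    """True only when the URL's host is exactly the paper trading host."""
    return urlparse(base_url).hostname == PAPER_HOST


def effective_dry_run(requested_dry_run: bool, base_url: str) -> bool:
    """Force dry-run mode whenever the endpoint is not the paper environment."""
    return requested_dry_run or not is_paper_endpoint(base_url)


# dry_run=False is honored on the paper endpoint...
print(effective_dry_run(False, "https://paper-api.alpaca.markets"))  # prints: False
# ...and overridden back to True for any other host
print(effective_dry_run(False, "https://example.invalid"))           # prints: True
```

Passing the result to execute_trade as its dry_run argument would mean a mistyped or swapped base URL degrades to logging rather than to order submission.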

Step 7: Wire everything together

Now connect all four layers into a single run loop:

# main.py
import json
import time
from datetime import datetime, timezone
from data_fetcher import get_market_snapshot
from agent import get_trade_decision
from harness import validate_decision, HarnessViolation
from executor import execute_trade

SYMBOL = "AAPL"
DRY_RUN = True       # Set to False only when you are ready for live paper trading
INTERVAL_SECONDS = 60

def log_rejection(reason: str, decision: dict, market_data: dict):
    """Append a blocked decision to the same JSONL file the executor writes to."""
    trace = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "symbol": SYMBOL,
        "decision": decision,
        "market_snapshot": market_data,
        "result": f"HARNESS BLOCKED: {reason}",
    }
    with open("execution_traces.jsonl", "a") as f:
        f.write(json.dumps(trace) + "\n")
    print(f"[HARNESS BLOCKED] {reason}")

def run_agent_loop():
    print(f"Starting trading agent for {SYMBOL}. Dry run: {DRY_RUN}")
    while True:
        try:
            # Step 1: Fetch market data
            market_data = get_market_snapshot(SYMBOL)
            print(f"Market snapshot: ${market_data['latest_price']} | Change: {market_data['price_change_10min_pct']}%")

            # Step 2: Get Claude's decision
            decision = get_trade_decision(market_data)
            print(f"Claude decision: {decision['action']} | Confidence: {decision['confidence']} | Risk flag: {decision['risk_flag']}")

            # Step 3: Validate through harness
            approved_decision = validate_decision(decision, market_data)

            # Step 4: Execute (or dry-run log)
            execute_trade(approved_decision, market_data, dry_run=DRY_RUN)

        except HarnessViolation as e:
            # Record the rejection so that blocked decisions appear in the traces too
            log_rejection(str(e), decision, market_data)

        except ValueError as e:
            print(f"[PARSE ERROR] {e}")

        except Exception as e:
            print(f"[ERROR] Unexpected: {e}")

        time.sleep(INTERVAL_SECONDS)

if __name__ == "__main__":
    run_agent_loop()

Run it:

python main.py

You will see output like:

Starting trading agent for AAPL. Dry run: True
Market snapshot: $189.42 | Change: 0.312%
Claude decision: HOLD | Confidence: MEDIUM | Risk flag: False
[2026-03-11T09:14:22] HOLD — no order submitted

Step 8: Read your execution traces

Every run writes a structured JSON line to execution_traces.jsonl. This is your observability layer. After a few cycles, inspect it:

python3 -m json.tool --json-lines execution_traces.jsonl | head -60

Look for patterns: How often does the harness block Claude? Which rules fire most? What does Claude’s reasoning look like when it recommends a BUY versus a HOLD? This analysis is how you tune your system before switching off dry run mode.
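Reading raw JSON lines gets tedious after a few dozen cycles, so a small summary helps. The sketch below counts traces by outcome. The function name summarize_traces and the outcome labels are assumptions of this sketch; it relies only on the result, order_submitted, and error fields that the executor writes, and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def summarize_traces(lines) -> Counter:
    """Count traces by outcome, keyed on the fields the executor writes."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        trace = json.loads(line)
        result = trace.get("result", "")
        if trace.get("order_submitted"):
            counts["SUBMITTED"] += 1
        elif trace.get("error"):
            counts["FAILED"] += 1
        elif result.startswith("DRY RUN"):
            counts["DRY RUN"] += 1
        elif result.startswith("HOLD"):
            counts["HOLD"] += 1
        else:
            counts["OTHER"] += 1
    return counts


# Made-up sample lines in the executor's trace shape
sample = [
    json.dumps({"result": "HOLD", "order_submitted": False, "error": None}),
    json.dumps({"result": "HOLD", "order_submitted": False, "error": None}),
    json.dumps({"result": "DRY RUN: Would submit BUY 5 shares", "order_submitted": False, "error": None}),
    json.dumps({"result": "Order submission failed", "order_submitted": False, "error": "timeout"}),
]
print(summarize_traces(sample))
```

To run it against the real log, pass the open file object, since iterating over a file yields one line at a time: with open("execution_traces.jsonl") as f: print(summarize_traces(f)).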

Common failure modes and how to handle them

Claude returns malformed JSON

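This happens occasionally. When it does, json.loads() in agent.py raises json.JSONDecodeError, which get_trade_decision re-raises as ValueError; the main loop catches that and prints a [PARSE ERROR] line without crashing. Add a retry with backoff if you see this frequently, but cap the number of attempts rather than retrying indefinitely.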
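A bounded retry with exponential backoff might look like the sketch below. The helper name call_with_retry and its parameters are assumptions of this sketch. It retries only on ValueError, the error type get_trade_decision raises on a parse failure, and it takes the sleep function as a parameter so the example can run without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on ValueError; re-raise once max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ValueError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            # Delays double each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Demo: a function that fails twice, then succeeds; sleep is faked
attempts = {"count": 0}
recorded_delays = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("bad JSON")
    return "ok"


print(call_with_retry(flaky, sleep=recorded_delays.append))  # prints: ok
print(recorded_delays)                                       # prints: [1.0, 2.0]
```

In the main loop, the call would become something like call_with_retry(lambda: get_trade_decision(market_data)). Keep in mind that each retry is another billed API call.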

Market data fetch fails

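The Alpaca API has rate limits and occasional outages. The generic except Exception in the main loop already keeps a failed fetch from crashing the loop, but it reports the failure as an unexpected error. Wrap get_market_snapshot in its own try/except so that fetch failures are logged distinctly and the cycle is skipped.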

The harness fires on every cycle

This usually means your risk thresholds are too conservative, or your account’s buying power has genuinely dropped. Check the [HARNESS BLOCKED] messages to see which rule is firing. If it is Rule 2 (LOW confidence) on every cycle, your market data may not be providing enough signal for Claude to reason with confidence. In that case, add more context (more bars, RSI, moving averages).

Where to go from here

You now have a working AI agent trading bot with a real harness. A few natural next steps for extending this system:

Add more market context: Include the 50-day moving average, RSI, and volume-weighted average price (VWAP) in your market snapshot. More context gives Claude better signal.
Add human-in-the-loop approval: Before executing BUY/SELL decisions above a threshold size, write the decision to a file and require a manual y/n confirmation. This is a critical safety layer for scaling up.
Add a backtesting harness: Run the same agent logic against historical data before touching live paper trading. Compare the harness-blocked decisions to the approved ones — you may find the harness is your biggest alpha source.
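Monitor costs: Every Claude API call has a cost. At 60-second intervals, you will make about 390 calls per regular trading session (6.5 hours × 60 calls per hour). The loop as written does not check market hours, so if you leave it running around the clock it makes about 1,440 calls per day instead. At current pricing, 390 calls is roughly $0.50-$2.00/day at claude-sonnet-4-6 rates. Factor that into your paper trading P&L.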
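The human-in-the-loop step from the list above can start very small. The sketch below simplifies it to a console prompt and leaves out writing the decision to a file. The names needs_approval and human_approves and the threshold value are assumptions of this sketch; the prompt function is passed in so the example runs without waiting for keyboard input.

```python
from typing import Callable

APPROVAL_THRESHOLD_QTY = 5  # hypothetical: trades this size or larger need sign-off


def needs_approval(decision: dict) -> bool:
    """BUY and SELL decisions at or above the threshold need a human."""
    return (
        decision.get("action") in ("BUY", "SELL")
        and decision.get("quantity", 0) >= APPROVAL_THRESHOLD_QTY
    )


def human_approves(decision: dict, ask: Callable[[str], str] = input) -> bool:
    """Return True only on an explicit 'y'; any other answer is a refusal."""
    if not needs_approval(decision):
        return True
    prompt = f"Approve {decision['action']} {decision['quantity']} shares? [y/N] "
    return ask(prompt).strip().lower() == "y"


big_buy = {"action": "BUY", "quantity": 8}
small_buy = {"action": "BUY", "quantity": 2}

# Simulated answers instead of real keyboard input
print(human_approves(big_buy, ask=lambda prompt: "n"))    # prints: False
print(human_approves(big_buy, ask=lambda prompt: " Y "))  # prints: True
print(human_approves(small_buy, ask=lambda prompt: "n"))  # prints: True (below threshold)
```

Defaulting to refusal on anything other than "y" matches the dry-run default in the executor: the safe outcome is the one that needs no action.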

Ready to go deeper? The harness patterns in this tutorial — structured output, validation before execution, hard overrides, and execution traces — are the same patterns used in production AI agent systems across every domain. Learn how these patterns scale in [harnessengineering.academy’s agent harness fundamentals course].

Summary

Building an AI agent trading bot with Claude is not about finding the right prompt. It is about building the right harness. Here is what you built today:

A data fetcher that gives Claude structured, current market context
A Claude agent layer that requests and parses structured JSON decisions
A harness validator with hard-coded rules Claude cannot override
An executor with dry-run mode and full execution trace logging

The harness is what makes this deployable. Without it, you have a script that asks Claude what to do and hopes for the best. With it, you have a system where every decision is validated, logged, and constrained by rules you control.

Start with paper trading. Read your traces. Understand how the system behaves before you trust it with real money.

Kai Renner is a senior AI/ML engineering leader with a PhD in Computer Engineering and 10+ years building production agent systems. He writes about harness engineering patterns at harnessengineering.academy.