If you have been looking for a structured, Google-native way to build production-ready AI agents, this Google ADK tutorial is your starting point. Google’s Agent Development Kit (ADK) landed as an open-source release in April 2025 and quickly gained 1,000+ GitHub stars in its first week — a signal that developers were ready for a framework with real production intentions, not just demo scaffolding.
This guide assumes you know basic Python but have never touched ADK before. By the end, you will have built a working research assistant agent, connected two agents into a sequential workflow, and have a clear path to deploying on Vertex AI.
What Is Google ADK and Why It Matters for AI Engineers
Google’s Agent Development Kit is an open-source Python framework for building, orchestrating, and deploying AI agents. It lives inside the Vertex AI ecosystem but is framework-agnostic — you can run agents locally during development and push to Vertex AI Agent Engine for production without rewriting your code.
Where ADK stands apart from alternatives like LangGraph, CrewAI, or AutoGen is its design philosophy. ADK is opinionated about three things: composability (agents are modular and reusable), tool-use (tools are first-class primitives, not add-ons), and multi-agent orchestration (coordination patterns are built into the framework, not bolted on afterward). This is not a research framework or a rapid-demo library. It is designed for the full lifecycle — from local prototype to monitored, traceable production deployment.
That does not make it the right choice for every project. LangGraph gives you more control over custom state machines. CrewAI gets you a multi-agent demo faster. But if your goal is building agents that will actually run in production on Google Cloud, ADK gives you the most direct path.
78% of enterprise AI projects in 2025 involve more than one AI model or agent working in concert (Gartner AI Hype Cycle 2025). Multi-agent orchestration is no longer an advanced topic — it is the baseline for production AI work. ADK was built with that reality in mind.
Core Concepts Before You Write a Single Line of Code
ADK is built around three primitives. Understanding them before writing code will save you from confusion later.
Agents, Tools, and Sessions
An Agent is the core reasoning unit. It wraps an LLM (typically Gemini), holds a system prompt, and knows which tools it can call. Think of it as a specialized worker with a job description.
A Tool is a function the agent can invoke — search the web, query a database, call an API, run a calculation. Tools are how agents interact with the world beyond generating text. In ADK, tools are Python functions with type annotations; the framework handles converting them to the format the model expects.
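For example, a custom tool is just a type-annotated Python function with a docstring; ADK inspects the signature and docstring to build the schema the model sees. The `get_word_count` tool below is a hypothetical illustration of that shape, not an ADK built-in:

```python
def get_word_count(text: str) -> dict:
    """Count the words in a piece of text.

    Args:
        text: The text to analyze.

    Returns:
        A dict with the word count, which the model reads as the tool result.
    """
    return {"word_count": len(text.split())}

# Attaching it is one line in the agent definition:
#   agent = LlmAgent(..., tools=[get_word_count])
# ADK wraps plain functions automatically; no schema boilerplate is needed.
```

Returning a dict rather than a bare string gives the model labeled fields to reason over, which tends to produce more reliable tool use.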
A Session is the container for a conversation. It holds the message history, the current state, and any memory the agent needs to carry across turns. Sessions are what make an agent feel continuous rather than stateless.
The Agent Loop and the Runner
ADK agents operate on a standard observe → think → act loop. The agent receives an input, decides whether to call a tool or generate a final response, executes tool calls if needed, observes the results, and repeats until it reaches a stopping condition.
The Runner is the component that manages this loop. It connects your agent to an environment (local terminal, web server, Vertex AI), handles the back-and-forth between the agent and its tools, and manages session lifecycle. You rarely subclass Runner directly — you configure it and let it do its job.
Understanding this separation matters: the Agent defines behavior, the Tools define capabilities, and the Runner manages execution. When something goes wrong, this mental model tells you where to look.
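Stripped of ADK specifics, the loop the Runner manages can be sketched in a few lines. This is a toy illustration with stubbed-out `decide` and `call_tool` functions, not ADK's actual implementation:

```python
def run_agent_loop(user_input: str, max_steps: int = 5) -> str:
    """Toy version of the observe -> think -> act loop."""
    context = [user_input]
    for _ in range(max_steps):
        # Think: the model decides between a tool call and a final answer
        decision = decide(context)
        if decision["type"] == "final_response":
            return decision["text"]
        # Act: execute the requested tool, then observe the result
        result = call_tool(decision["tool"], decision["args"])
        context.append(result)
    return "Stopped: reached max_steps without a final response."

# Stub implementations so the sketch runs standalone
def decide(context):
    if len(context) == 1:
        return {"type": "tool_call", "tool": "search", "args": context[0]}
    return {"type": "final_response", "text": f"Answer based on: {context[-1]}"}

def call_tool(name, args):
    return f"[{name} results for '{args}']"
```

Note the `max_steps` cap: every production agent loop needs a hard stopping condition, because "repeat until done" is an invitation to run forever.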
Stateful vs. Stateless Agents
A stateless agent treats every invocation as independent. It has no memory of previous turns beyond what is in the current session context. This is simpler and cheaper — use it for single-turn tasks like document summarization or one-shot code generation.
A stateful agent persists information across turns using ADK’s SessionService. It can remember user preferences, track progress through a multi-step task, or build up a knowledge base over time. The cost is added complexity in state management. Start stateless, add state when you have a clear reason to.
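The difference is easy to see in miniature. In this illustrative sketch (plain Python, no ADK), the stateless handler treats every call as independent, while the stateful one accumulates context across calls:

```python
def stateless_reply(message: str) -> str:
    """Each call stands alone: no memory of earlier turns."""
    return f"You said: {message}"

class StatefulAgent:
    """Carries a turn count and history across invocations."""

    def __init__(self):
        self.history: list[str] = []

    def reply(self, message: str) -> str:
        self.history.append(message)
        return f"Turn {len(self.history)}: you said {message}"

agent = StatefulAgent()
agent.reply("hello")
print(agent.reply("again"))  # the agent knows this is its second turn
```

In ADK terms, the `history` list is the job the SessionService does for you: the agent code stays the same, and the service decides whether state lives in memory or in durable storage.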
Setting Up Your ADK Environment
Before you build anything, get your environment clean and authenticated. Rushing this step causes authentication errors later that are frustrating to debug.
Prerequisites
- Python 3.10 or higher
- A Google Cloud project with billing enabled
- Vertex AI API enabled on that project
- Google Cloud CLI (`gcloud`) installed
Install and Authenticate
# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install the ADK package
pip install google-adk
# Authenticate with Google Cloud
gcloud auth application-default login
# Set your project
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
Verify the install worked:
adk --version
# Expected: google-adk x.x.x
If you get a `command not found` error, your virtual environment’s bin/ directory is not on your PATH. Run `which python` to confirm you are inside the venv before proceeding.
Smoke Test
Create a file called smoke_test.py and run it to confirm your credentials and Vertex AI access work end-to-end before writing your actual agent.
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
# Minimal agent — no tools, just a model response
agent = LlmAgent(
    name="smoke_test",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant. Reply in one sentence.",
)

session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name="smoke_test", session_service=session_service)

# Run a single turn
session = session_service.create_session(app_name="smoke_test", user_id="test_user")

for event in runner.run(
    user_id="test_user",
    session_id=session.id,
    new_message=types.Content(role="user", parts=[types.Part(text="Say hello.")]),
):
    if event.is_final_response():
        print(event.content.parts[0].text)
If you see a one-sentence response, your environment is ready. If you see an authentication error, run `gcloud auth application-default login` again and confirm your project ID is set correctly.
Building Your First Agent: A Research Assistant
Now build something genuinely useful. This research assistant uses Google Search to answer questions with current information — not just the model’s training data.
from google.adk.agents import LlmAgent
from google.adk.tools import google_search # Built-in ADK search tool
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
# Define the agent
research_agent = LlmAgent(
    name="research_assistant",
    model="gemini-2.0-flash",
    # The system prompt shapes how the agent approaches every task
    instruction="""
    You are a research assistant. When asked a question:
    1. Use the google_search tool to find current, accurate information.
    2. Synthesize what you find into a clear, factual answer.
    3. Always cite your sources by mentioning where you found the information.
    If you cannot find reliable information, say so directly.
    """,
    tools=[google_search],  # Attach the search tool
    description="Researches topics using Google Search and synthesizes the results.",
)

# Wire up the runner with in-memory session storage
session_service = InMemorySessionService()
runner = Runner(
    agent=research_agent,
    app_name="research_app",
    session_service=session_service,
)

def ask(question: str, session_id: str, user_id: str = "learner") -> str:
    """Send a question to the agent and return the final response text."""
    for event in runner.run(
        user_id=user_id,
        session_id=session_id,
        new_message=types.Content(
            role="user",
            parts=[types.Part(text=question)],
        ),
    ):
        if event.is_final_response():
            return event.content.parts[0].text
    return ""

# Create a session and ask a question
session = session_service.create_session(app_name="research_app", user_id="learner")
answer = ask("What are the key features of Google ADK released in 2025?", session.id)
print(answer)
Walk through what each part does:
- `LlmAgent` — the agent definition. `name` is used for logging and tracing. `model` selects the Gemini variant. `instruction` is the system prompt. `tools` is the list of callable functions.
- `google_search` — a built-in ADK tool that wraps the Google Search API. ADK converts your Python tool list into the tool schema the model uses to decide when to call it.
- `InMemorySessionService` — stores session state in memory. Fine for development; swap for `VertexAiSessionService` in production.
- `Runner` — orchestrates the agent loop. It handles the back-and-forth between the LLM, the tool calls, and the session until the agent signals it is done.
- `event.is_final_response()` — ADK streams events as the agent works. Tool call events, intermediate thoughts, and the final response are all separate events. Filter for `is_final_response()` to get the output.
You can also run agents interactively via the CLI during development:
# Start an interactive session with your agent
adk run research_assistant
Connecting Agents into a Workflow
A single agent handles focused tasks well. Complex workflows — where you need to research a topic, then summarize the findings, then generate a report — benefit from a multi-agent pipeline that divides responsibilities. Multi-agent systems also reduce context window exhaustion: each agent sees only the context relevant to its step, which can substantially cut token consumption on complex research tasks compared to a single agent carrying the entire history through every step.
Sequential Agents: Researcher → Summarizer
ADK’s SequentialAgent chains agents in order. Each agent’s output becomes part of the session state available to the next agent.
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.tools import google_search
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
# Step 1: Researcher — gathers raw information
researcher = LlmAgent(
    name="researcher",
    model="gemini-2.0-flash",
    instruction="""
    You are a research specialist. Search for information on the topic provided.
    Gather 3-5 key facts and relevant details. Output a structured list of findings.
    Do not summarize yet — collect raw information thoroughly.
    """,
    tools=[google_search],
    description="Gathers raw research using Google Search.",
    # Store output in session state so the next agent can access it
    output_key="research_findings",
)

# Step 2: Summarizer — condenses findings into a readable summary
summarizer = LlmAgent(
    name="summarizer",
    model="gemini-2.0-flash",
    instruction="""
    You are a technical writer. You will receive raw research findings in the session.
    Access the findings from the 'research_findings' session key.
    Write a clear, 3-paragraph summary suitable for a technical audience.
    Lead with the most important finding. End with practical implications.
    """,
    description="Summarizes research findings into a clear, structured report.",
    output_key="final_summary",
)

# Chain them together with SequentialAgent
pipeline = SequentialAgent(
    name="research_pipeline",
    sub_agents=[researcher, summarizer],
    description="Research a topic and produce a structured summary.",
)

# Run the pipeline
session_service = InMemorySessionService()
runner = Runner(
    agent=pipeline,
    app_name="research_pipeline_app",
    session_service=session_service,
)

session = session_service.create_session(
    app_name="research_pipeline_app", user_id="learner"
)

for event in runner.run(
    user_id="learner",
    session_id=session.id,
    new_message=types.Content(
        role="user",
        parts=[types.Part(text="Research the current state of AI agent frameworks in 2025.")],
    ),
):
    if event.is_final_response():
        print(event.content.parts[0].text)
Common Pitfalls in Multi-Agent Workflows
Multi-agent pipelines introduce failure modes that single agents do not have. Know these before they cost you debugging time:
- Context bloat — if your researcher agent returns too much raw text, the summarizer’s context fills up and quality degrades. Set a max output length in your researcher’s instruction or use ADK’s context windowing controls.
- Tool call loops — an agent that calls a tool, gets an unsatisfying result, and calls the same tool again with slightly different input can loop. Set `max_iterations` on your `LlmAgent` to cap this.
- Missing exit conditions — agents need a clear signal to stop. An instruction that says “keep researching until you are confident” can run many more iterations than intended. Be explicit: “after 3 searches, stop and report what you found.”
- Silent failures between agents — if the researcher fails to populate `output_key`, the summarizer proceeds with empty input and produces plausible-sounding nonsense. Add validation: check that `output_key` is present and non-empty before the next step runs.
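That last pitfall is cheap to guard against. A minimal sketch of such a validation step, assuming you can read the pipeline's session state as a plain dict between steps (the helper name is mine, not an ADK API):

```python
def require_state_key(state: dict, key: str) -> str:
    """Fail fast if a pipeline step did not populate its output_key."""
    value = state.get(key)
    if not value or not str(value).strip():
        raise ValueError(
            f"Pipeline state is missing or empty for '{key}'. "
            "The upstream agent likely failed; do not run the next step."
        )
    return str(value)

# Usage between pipeline steps (illustrative):
#   findings = require_state_key(session.state, "research_findings")
```

Raising loudly here turns a silent garbage-in-garbage-out failure into an error you see immediately in logs and traces.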
Adding Memory and State to Your Agents
For single-session tasks, the message history inside a session is enough context. For agents that need to remember things across multiple sessions — a tutoring agent that tracks what a student has learned, a task manager that knows what you asked it yesterday — you need explicit state management.
Session State vs. Persistent Memory
Session state lives inside the current Session object. It is fast, simple, and lost when the session ends. Use it for passing context between agents in a single workflow run (like output_key above).
Persistent memory survives across sessions. ADK provides hooks to store and retrieve facts using SessionService implementations backed by durable storage. In development, InMemorySessionService works fine. In production, swap it for VertexAiSessionService (backed by Vertex AI’s managed session store) or implement your own on top of Firestore, Redis, or Cloud Spanner.
Practical Example: A Tutoring Agent That Remembers Progress
from google.adk.agents import LlmAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
tutor_agent = LlmAgent(
    name="python_tutor",
    model="gemini-2.0-flash",
    instruction="""
    You are a Python tutor. You track what concepts the student has already covered.
    At the start of each response:
    - Check the session state for 'covered_topics' (a list of topics already taught).
    - Do not re-teach topics already in that list.
    After each teaching response:
    - Add the topic you just covered to 'covered_topics' in session state.
    Teaching style: explain one concept at a time with a short code example.
    Current student level: beginner.
    """,
    description="An adaptive Python tutor that remembers what the student has learned.",
)

session_service = InMemorySessionService()
runner = Runner(
    agent=tutor_agent,
    app_name="tutor_app",
    session_service=session_service,
)

# A persistent user ID means the same student picks up where they left off
USER_ID = "student_42"

# Create or retrieve the session
session = session_service.create_session(
    app_name="tutor_app",
    user_id=USER_ID,
    # Initialize state for a new student
    state={"covered_topics": []},
)

# First lesson
for event in runner.run(
    user_id=USER_ID,
    session_id=session.id,
    new_message=types.Content(
        role="user",
        parts=[types.Part(text="Teach me about Python variables.")],
    ),
):
    if event.is_final_response():
        print(event.content.parts[0].text)

# Check what the agent stored in state
updated_session = session_service.get_session(
    app_name="tutor_app", user_id=USER_ID, session_id=session.id
)
print("Topics covered so far:", updated_session.state.get("covered_topics", []))
For production tutoring applications, replace InMemorySessionService with a Firestore-backed service. This way, a student who returns a week later picks up exactly where they left off, with the full history of covered topics intact.
As a rule of thumb: use in-session state for intra-workflow coordination, Cloud Firestore for user-specific persistent memory that needs to survive across sessions, and Redis for high-frequency state reads where Firestore latency would be noticeable.
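If you want persistence while prototyping without standing up Firestore, the shape of the pattern is small enough to sketch with a JSON file as the durable store. This is an illustrative stand-in, not an ADK service implementation:

```python
import json
from pathlib import Path

class FileStateStore:
    """Durable per-user state: the same load/save shape you would back with Firestore."""

    def __init__(self, root: str = "agent_state"):
        self.root = Path(root)
        self.root.mkdir(exist_ok=True)

    def load(self, user_id: str) -> dict:
        path = self.root / f"{user_id}.json"
        if path.exists():
            return json.loads(path.read_text())
        return {"covered_topics": []}  # default state for a new student

    def save(self, user_id: str, state: dict) -> None:
        (self.root / f"{user_id}.json").write_text(json.dumps(state))

store = FileStateStore()
state = store.load("student_42")
state["covered_topics"].append("variables")
store.save("student_42", state)
# A later session for the same user sees the updated topic list
```

Swapping the file reads and writes for Firestore document operations keeps the calling code identical, which is the point: agents should not care where their memory lives.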
Deploying and Testing ADK Agents in Production
Local development is only half the story. ADK’s value proposition is the end-to-end path to Vertex AI deployment with observability built in.
Deploy to Vertex AI Agent Engine
ADK’s `deploy` command packages your agent and pushes it to managed Google Cloud infrastructure that handles scaling, load balancing, and operations for you. The example below targets Cloud Run; substituting `agent_engine` for `cloud_run` targets Vertex AI Agent Engine instead.
# Deploy your agent to Vertex AI Agent Engine
adk deploy cloud_run \
--project $GOOGLE_CLOUD_PROJECT \
--region $GOOGLE_CLOUD_LOCATION \
--service-name research-assistant \
research_assistant/
After deployment, you get a managed endpoint URL you can call from your application. Vertex AI handles autoscaling — your agent can serve one request or ten thousand without infrastructure changes on your end.
Logging, Tracing, and the ADK Debug UI
ADK integrates with Cloud Trace automatically when deployed to Vertex AI. Every agent invocation produces a structured trace: which tools were called, in what order, with what inputs and outputs, and how long each step took. This is what makes debugging production agents tractable.
During local development, use the ADK web UI:
# Start the local debug UI
adk web
# Opens http://localhost:8000 with a visual interface for running and inspecting agents
The UI shows the full agent loop — every tool call, every intermediate response, the final output — in a structured tree view. Use it to understand what your agent is actually doing before you push to production.
Writing Unit Tests for Your Agents
ADK ships with testing utilities that let you write deterministic tests for non-deterministic agents. The key pattern: mock tool responses so your tests do not make real API calls, then assert on agent behavior given those fixed inputs.
from google.adk.testing import AgentTestCase, mock_tool_response
from research_assistant import research_agent  # your agent definition

class TestResearchAgent(AgentTestCase):
    def test_cites_sources_in_response(self):
        # Mock the google_search tool to return controlled output
        with mock_tool_response(
            "google_search",
            results=[
                {
                    "title": "ADK Launch Blog",
                    "url": "https://cloud.google.com/blog/adk",
                    "snippet": "Google ADK released April 2025...",
                }
            ],
        ):
            response = self.run_agent(
                agent=research_agent,
                message="What is Google ADK?",
            )
            # Assert that the agent referenced the source in its response
            self.assertIn("cloud.google.com", response.text)

    def test_handles_empty_search_results(self):
        with mock_tool_response("google_search", results=[]):
            response = self.run_agent(
                agent=research_agent,
                message="What is the population of Mars?",
            )
            # Agent should acknowledge it could not find information
            self.assertIn("could not find", response.text.lower())
Next Steps: Agent2Agent Protocol and Enterprise Patterns
ADK supports Google’s Agent2Agent (A2A) protocol — a standardized interface that lets agents from different frameworks communicate with each other. This matters once you are building enterprise systems where an ADK agent needs to hand off work to an agent built in LangGraph or a third-party service. ADK’s A2A support means your agents are not locked into a single-vendor ecosystem.
From here, the natural progression is learning enterprise harness patterns: how to add verification loops between agent steps, how to set cost envelopes to prevent runaway token spend, and how to build evaluation pipelines that catch regressions before they reach users. These are the infrastructure layers that separate a working demo from a reliable production system.
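Of these, a cost envelope is the simplest to start with, and the core idea fits in a few lines. A hypothetical token-budget guard is sketched below; the class, numbers, and method names are my own illustration, not an ADK feature:

```python
class TokenBudget:
    """Abort an agent run before it exceeds a fixed token budget."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        """Call after each model turn with the tokens that turn consumed."""
        self.used += tokens
        if self.used > self.max_tokens:
            raise RuntimeError(
                f"Token budget exceeded: {self.used} used of {self.max_tokens}. "
                "Stopping the run instead of letting spend grow unbounded."
            )

budget = TokenBudget(max_tokens=50_000)
budget.charge(12_000)  # within budget, run continues
```

Wiring a check like this into the event loop (charging after each model turn) turns runaway token spend from a surprise on the billing page into an exception in your traces.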
What You Have Built — And Where to Go Next
You have covered the full ADK development arc in this tutorial: environment setup, building a tool-using research agent, chaining agents into a sequential workflow, managing session state for persistent memory, deploying to Vertex AI, and writing agent unit tests. That is a complete foundation for building ADK-powered applications.
The next layer — what separates a reliable production agent from a prototype — is the harness infrastructure around your agents: verification loops, observability instrumentation, cost controls, and evaluation pipelines. These are the patterns that determine whether your agent works at 10 requests per day or 10,000.
The Academy’s full AI agent harness curriculum walks through each of these patterns with the same level of practical detail you found here. If you are serious about building agents that hold up in production, that is where to go next.
Ready to go deeper? Explore the Agent Harness Fundamentals course on harnessengineering.academy — a structured curriculum that takes you from ADK basics through production-grade harness patterns, with code templates and real-world architecture examples at every step.