
Chapter 7: The Agentic Loop (Deep Dive)

File(s) to edit: src/agent.rs -- only the run_with_history stub is new in this chapter. single_turn, execute_tools, and chat were implemented back in Chapter 3; this chapter is a deep-dive walkthrough of the loop you already built, plus a thin new event-emitting variant.

Tests to run: the same Chapter 3 tests still apply (cargo test -p mini-claw-code-starter test_single_turn_ and cargo test -p mini-claw-code-starter test_simple_agent_); there is no dedicated test in the starter for run_with_history -- verify it manually by running the example in Chapter 5b and watching the event stream.

Estimated time: 45 min

Goal

  • Revisit SimpleAgent::chat from Chapter 3 with a careful walk-through of the control flow, the message ordering, and the edge cases. You are not reimplementing it -- you are understanding what you already wrote.
  • Revisit execute_tools and make sure you know why tool errors become result strings rather than propagating -- the rationale links back to the agreement explained in Chapter 6.
  • Implement the one new piece: run_with_history, an event-emitting variant of the main loop that sends an AgentEvent after every turn so a UI layer (built in later chapters) can observe progress.
  • Understand message ordering: why Message::Assistant must be pushed before the matching Message::ToolResult values.

This is the chapter where everything clicks.

In the previous chapters you built the vocabulary (messages), the mouth (provider), and the hands (tools). Now you build the brain -- the loop that ties them all together. The SimpleAgent is the heart of a coding agent. It is the thing that takes a user prompt, talks to an LLM, executes tools, feeds results back, and keeps going until the job is done.

Every coding agent -- Claude Code, Cursor, Aider, OpenCode -- has some version of this loop. The details vary (streaming, permissions, compaction), but the skeleton is identical. Get this right and you have a working agent. Everything else in this book is refinement.

What the SimpleAgent does

Here is the entire agent lifecycle in one sentence: prompt the LLM, check if it wants to use tools, execute those tools, send the results back, repeat until the LLM says it is done.

That is it. The SimpleAgent implements this loop. In the starter it owns two things:

  1. A provider -- the LLM backend (from Chapter 5a / 5b)
  2. A tool set -- the registered tools (from Chapter 6)

A production engine adds a third: a config with safety limits and behavior knobs. As you will see below, the starter omits it.

flowchart TD
    A[User prompt] --> B[SimpleAgent::chat]
    B --> C[Provider.chat]
    C --> D{StopReason?}
    D -->|Stop| E[Return final text]
    D -->|ToolUse| F[execute_tools]
    F --> G[Push Message::Assistant]
    G --> H[Push Message::ToolResult for each result]
    H --> C

If you have read Claude Code's source, this maps to the query engine and the query function. Our version strips away streaming, permissions, hooks, and compaction -- those come in later chapters -- leaving the pure control flow.

The SimpleAgent struct

The starter's SimpleAgent is leaner than a production engine -- no config struct, no max turns, no truncation. Just a provider and tools:

pub struct SimpleAgent<P: Provider> {
    provider: P,
    tools: ToolSet,
}

Generic over P: Provider, so the same agent works with OpenRouterProvider in production and MockProvider in tests. The builder pattern lets you configure it fluently:

let agent = SimpleAgent::new(provider)
    .tool(BashTool::new())
    .tool(ReadTool::new())
    .tool(WriteTool::new());

No surprises. The interesting part is the methods that actually run.

execute_tools: the tool dispatch helper

Before tackling the main loop, we need a helper that takes a slice of ToolCalls from the LLM and produces results. This is execute_tools:

async fn execute_tools(&self, calls: &[ToolCall]) -> Vec<(String, String)> {
    let mut results = Vec::with_capacity(calls.len());
    for call in calls {
        let result = match self.tools.get(&call.name) {
            Some(t) => {
                t.call(call.arguments.clone())
                    .await
                    .unwrap_or_else(|e| format!("error: {e}"))
            }
            None => format!("error: unknown tool `{}`", call.name),
        };
        results.push((call.id.clone(), result));
    }
    results
}

Two stages:

  1. Tool lookup -- If the LLM hallucinates a tool name that does not exist, we return an error string. The model sees "error: unknown tool `foo`" and can recover. This happens more than you might expect, especially with smaller models.

  2. Execute -- Run the tool. If it fails, .unwrap_or_else(|e| format!("error: {e}")) converts the error to a string the model can read.

Note the return type: Vec<(String, String)> -- pairs of (call ID, result string). No ToolResult struct, no truncation, no validation. The starter keeps this simple.

This is a key design decision: tool errors become results, not panics. The agent loop never crashes because a tool failed. The model reads the error, adjusts its approach, and tries again.

The chat() method: the core loop

This is it. The agentic loop. Read it carefully -- it is shorter than you expect.

pub async fn chat(&self, messages: &mut Vec<Message>) -> anyhow::Result<String> {
    let defs = self.tools.definitions();

    loop {
        let turn = self.provider.chat(messages, &defs).await?;

        match turn.stop_reason {
            StopReason::Stop => {
                let text = turn.text.clone().unwrap_or_default();
                messages.push(Message::Assistant(turn));
                return Ok(text);
            }
            StopReason::ToolUse => {
                let results = self.execute_tools(&turn.tool_calls).await;
                messages.push(Message::Assistant(turn));
                for (id, content) in results {
                    messages.push(Message::ToolResult { id, content });
                }
            }
        }
    }
}

Let's break it down.

Tool definitions: collected once

let defs = self.tools.definitions();

We gather tool definitions outside the loop. They do not change between iterations -- the tool set is fixed for the lifetime of the agent. Every call to provider.chat() includes these definitions so the LLM knows which tools are available.

Call the provider

let turn = self.provider.chat(messages, &defs).await?;

Send the full message history and tool definitions to the LLM. The ? propagates provider errors (network failure, auth error, rate limit) directly to the caller. Provider errors are not recoverable by the agent loop -- they need human intervention.

Match the stop reason

match turn.stop_reason {
    StopReason::Stop => { /* final answer */ }
    StopReason::ToolUse => { /* tool dispatch */ }
}

The LLM tells us why it stopped generating. Two possibilities:

  • Stop -- The model is done. It has a final text answer. Extract it, push the assistant message into history, return.
  • ToolUse -- The model wants to use tools. It has populated tool_calls with one or more calls. Execute them, push results, loop.

The two branches

StopReason::Stop -- Clone the text, push the assistant message into history, return. The conversation ends with an Assistant message, ready for the next user turn.

StopReason::ToolUse -- Execute the tools, then push messages in this exact order:

  1. First, Message::Assistant(turn) -- the assistant's response including its tool calls
  2. Then, Message::ToolResult { id, content } for each tool result

This ordering matters. The LLM API expects tool results to follow the assistant message that requested them. Each ToolResult is linked to its ToolCall by the id field. If you push them in the wrong order, the provider will reject the request.

After pushing results, the loop continues. The next iteration sends the entire history -- including the tool calls and their results -- back to the LLM. The model sees what happened and decides what to do next.

Rust concept: ownership and &mut Vec<Message>

The caller owns the message history and passes it as &mut Vec<Message>. This is a deliberate Rust ownership decision -- the agent borrows the history mutably for the duration of the call, but ownership stays with the caller. The alternative would be for the agent to own the Vec, but then the caller could not inspect the history after the call, and multi-turn conversations would require moving the Vec in and out of the agent. &mut is the cleanest solution: the agent pushes messages into the caller's vec, and the caller retains full control afterward.

Three concrete benefits:

  1. Multi-turn conversations -- The caller can push a new Message::User(...) and call chat() again. The agent picks up where it left off with the full context.
  2. Inspection -- After chat() returns, the caller can examine the full message history to see every tool call, every result, every intermediate step.
  3. Persistence -- The caller can serialize the messages to disk for session save/resume.
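
Because ownership stays with the caller, multi-turn usage is just "push and call again". A minimal fragment-style sketch, in the same style as the other snippets here (agent and the message types are assumed to be in scope):

let mut messages = vec![Message::User("What files are in src/?".to_string())];
let first = agent.chat(&mut messages).await?;

// The caller still owns `messages`: push a follow-up and call chat() again.
// The agent resumes with the full context, including earlier tool calls.
messages.push(Message::User("Read the largest one".to_string()));
let second = agent.chat(&mut messages).await?;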

run(): the convenience wrapper

Most of the time you just want to send a prompt and get a response. That is run():

pub async fn run(&self, prompt: &str) -> anyhow::Result<String> {
    let mut messages = vec![Message::User(prompt.to_string())];
    self.chat(&mut messages).await
}

Two lines. Creates a fresh message history with the user prompt, delegates to chat(). The message history is discarded after the call -- use chat() directly if you need to preserve it.

AgentEvent: making it observable

The chat() method returns when the agent is done. That is fine for tests, but a real UI needs to show progress while the loop is running. What tool is being called? How long has it been running? Is it done?

The AgentEvent enum models these updates:

#[derive(Debug)]
pub enum AgentEvent {
    /// A chunk of text streamed from the LLM (streaming mode only).
    TextDelta(String),
    /// A tool is being called.
    ToolCall { name: String, summary: String },
    /// The agent finished with a final response.
    Done(String),
    /// The agent encountered an error.
    Error(String),
}

Four variants covering the lifecycle:

Event       When                        UI use
TextDelta   LLM streams a text chunk    Append to terminal output
ToolCall    A tool is being called      Show "[bash: ls -la]"
Done        Agent loop finished         Display final answer
Error       Unrecoverable error         Show the error message

Note: the starter combines ToolStart/ToolEnd into a single ToolCall event. The summary field is generated by the tool_summary() helper in src/agent.rs, which looks for common argument keys (command, path, question) and formats them like [bash: ls -la].
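
The starter ships tool_summary() for you, but it is worth knowing roughly what it does. A plausible sketch of the behavior described above, assuming tool arguments arrive as a serde_json::Value:

fn tool_summary(name: &str, args: &serde_json::Value) -> String {
    // Pick the most descriptive argument value to display, if any.
    let detail = ["command", "path", "question"]
        .iter()
        .find_map(|key| args.get(key).and_then(|v| v.as_str()));
    match detail {
        Some(d) => format!("[{name}: {d}]"),
        None => format!("[{name}]"),
    }
}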

run_with_events / run_with_history

These methods duplicate the core loop logic but emit events through a tokio::sync::mpsc::UnboundedSender<AgentEvent> channel. The caller creates the channel, passes the sender, and consumes events from the receiver -- typically in a separate task that drives the UI.

pub async fn run_with_events(
    &self,
    prompt: &str,
    events: mpsc::UnboundedSender<AgentEvent>,
) {
    let messages = vec![Message::User(prompt.to_string())];
    self.run_with_history(messages, events).await;
}

run_with_history has the same structure as chat() but with events woven in. It takes ownership of the messages vec and returns the full history. Errors are sent as AgentEvent::Error rather than propagated.

The key differences from chat():

  1. Provider errors are caught with match instead of ?, and sent as AgentEvent::Error.
  2. ToolCall events fire for each tool call, using the tool_summary() helper to produce a one-line description.
  3. Done event fires before pushing the final assistant message, so the UI gets the text immediately.

Note the let _ = events.send(...) pattern. The send can fail if the receiver has been dropped (the UI task crashed or exited early). We ignore the error because the agent should finish its work regardless of whether anyone is watching.
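
Putting those three differences together, one way to write run_with_history looks like the sketch below. Treat it as a reference to check your own implementation against, not a definitive answer -- details such as the exact tool_summary() signature are assumptions:

pub async fn run_with_history(
    &self,
    mut messages: Vec<Message>,
    events: mpsc::UnboundedSender<AgentEvent>,
) -> Vec<Message> {
    let defs = self.tools.definitions();
    loop {
        // Difference 1: provider errors become events, not `?` propagation.
        let turn = match self.provider.chat(&messages, &defs).await {
            Ok(turn) => turn,
            Err(e) => {
                let _ = events.send(AgentEvent::Error(e.to_string()));
                return messages;
            }
        };
        match turn.stop_reason {
            StopReason::Stop => {
                let text = turn.text.clone().unwrap_or_default();
                // Difference 3: Done fires before the final push.
                let _ = events.send(AgentEvent::Done(text));
                messages.push(Message::Assistant(turn));
                return messages;
            }
            StopReason::ToolUse => {
                // Difference 2: a ToolCall event per call, with a one-line summary.
                for call in &turn.tool_calls {
                    let _ = events.send(AgentEvent::ToolCall {
                        name: call.name.clone(),
                        summary: tool_summary(&call.name, &call.arguments),
                    });
                }
                let results = self.execute_tools(&turn.tool_calls).await;
                messages.push(Message::Assistant(turn));
                for (id, content) in results {
                    messages.push(Message::ToolResult { id, content });
                }
            }
        }
    }
}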

Using events in practice

The caller creates an unbounded channel, passes the sender to the agent, and reads events from the receiver -- typically in a separate task:

let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();

let agent_handle = tokio::spawn(async move {
    agent.run_with_events("Fix the bug in main.rs", tx).await
});

while let Some(event) = rx.recv().await {
    match event {
        AgentEvent::ToolCall { summary, .. } => println!("{summary}"),
        AgentEvent::Done(text) => { println!("{text}"); break; }
        AgentEvent::Error(e) => { eprintln!("Error: {e}"); break; }
        _ => {}
    }
}

This two-task pattern is what a TUI builds on. The UI task renders events; the agent task runs the loop. They communicate through the channel.

Error handling philosophy

The agent has two distinct error strategies, and the boundary between them is intentional.

Tool errors become results

When a tool fails -- execution error, unknown tool -- the error becomes a string result that the model sees as a normal tool result. The loop continues. The model reads the error and adapts.

Tool error flow:
  LLM requests bash("some_command")
  -> Tool returns Err(e)
  -> unwrap_or_else converts to "error: {e}"
  -> Pushed as Message::ToolResult { id, content: "error: ..." }
  -> LLM sees error, tries different approach

This is essential for robust agents. Models make mistakes. Tools fail for legitimate reasons. The agent should recover, not crash.
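
A hypothetical recovery, in the same trace style:

  LLM requests read("src/amin.rs")                 <- typo in the path
  -> ToolResult: "error: No such file or directory"
  -> LLM requests bash("ls src/")
  -> ToolResult: "main.rs\nlib.rs\n"
  -> LLM requests read("src/main.rs") and continues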

Provider errors propagate

When the provider fails -- network timeout, authentication error, rate limit, malformed response -- the error propagates up via ? (in chat()) or via AgentEvent::Error (in run_with_history()). The loop stops.

Provider error flow:
  Agent calls provider.chat()
  -> Provider returns Err(network timeout)
  -> chat() returns Err(network timeout)
  -> Caller handles it (retry, show error, etc.)

Provider errors are not the agent's problem. They need human or system-level intervention (check your API key, wait for rate limits, fix your network). The agent does not try to recover.
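
If the caller does want retries for transient failures, that logic lives outside the agent. A hypothetical wrapper, inside a function returning anyhow::Result (not part of the starter):

let mut attempts = 0;
let answer = loop {
    match agent.chat(&mut messages).await {
        Ok(text) => break text,
        Err(e) if attempts < 3 => {
            // Retry with exponential backoff. A failed provider call pushes
            // nothing into `messages`, so the history is still consistent.
            attempts += 1;
            eprintln!("provider error: {e}; retrying ({attempts}/3)");
            tokio::time::sleep(std::time::Duration::from_secs(1 << attempts)).await;
        }
        Err(e) => return Err(e),
    }
};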

Message history management

The order in which messages are pushed into the history is load-bearing. After a tool-use turn:

StopReason::ToolUse => {
    let results = self.execute_tools(&turn.tool_calls).await;
    messages.push(Message::Assistant(turn));    // 1. Assistant message (with tool_calls)
    for (id, content) in results {
        messages.push(Message::ToolResult { id, content });  // 2. Tool results
    }
}

The resulting message sequence looks like:

[User]        "What files are in src/?"
[Assistant]   tool_calls: [bash("ls src/")]      <- includes the tool call
[ToolResult]  "main.rs\nlib.rs\n"                <- linked by call ID
[Assistant]   "There are two files: ..."          <- next LLM response

Why this order?

  1. API requirement: The Claude API (and OpenAI-compatible APIs) require that tool_result messages immediately follow the assistant message that generated the corresponding tool_use. Violating this causes a 400 error.

  2. ID linking: Each Message::ToolResult has an id that matches a ToolCall.id in the preceding assistant message. The LLM uses this to associate results with requests when there are multiple parallel tool calls.

  3. Context for the next turn: The LLM needs to see its own tool calls to understand what it asked for, and the results to know what happened. Both must be present in the history for the next provider.chat() call.

Putting it all together: a complete trace

Let's trace through a realistic scenario. The user asks: "What is 2 + 3?"

The agent has an AddTool registered. The mock provider is configured to return a tool call first, then a final answer.

Turn 0:

messages: [User("What is 2 + 3?")]
  -> provider.chat() returns: ToolUse, tool_calls: [add(a=2, b=3)]
  -> execute_tools: AddTool.call({a:2, b:3}) -> Ok("5")
  -> push: Assistant(tool_calls: [add(a=2, b=3)])
  -> push: ToolResult { id: "call_1", content: "5" }

Turn 1:

messages: [User, Assistant, ToolResult]
  -> provider.chat() returns: Stop, text: "The sum is 5"
  -> push: Assistant(text: "The sum is 5")
  -> return Ok("The sum is 5")

Two provider calls, one tool execution, clean exit. The final message history has 4 entries: User, Assistant (with tool call), ToolResult, Assistant (with text).
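
The same trace, written as a test sketch. The MockProvider scripting calls below are illustrative, not the starter's real API -- check the existing tests for the actual method names:

#[tokio::test]
async fn traces_one_tool_round() {
    // Hypothetical scripting: first turn returns a tool call, second the answer.
    let provider = MockProvider::new()
        .tool_use("add", serde_json::json!({ "a": 2, "b": 3 }))
        .text("The sum is 5");
    let agent = SimpleAgent::new(provider).tool(AddTool::new());

    let mut messages = vec![Message::User("What is 2 + 3?".to_string())];
    let answer = agent.chat(&mut messages).await.unwrap();

    assert_eq!(answer, "The sum is 5");
    // User, Assistant (tool call), ToolResult, Assistant (text)
    assert_eq!(messages.len(), 4);
}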

How this compares to Claude Code

Our SimpleAgent is a teaching implementation. Claude Code's real agent is considerably more complex. Here is what it adds:

Feature         Our agent                     Claude Code
Core loop       loop { match stop_reason }    Same pattern, with async hooks at every stage
Streaming       Separate run_with_events      Integrated SSE streaming with StreamProvider
Permissions     None                          Full permission pipeline checked before every tool call
Max turns       None                          Configurable ceiling on loop iterations
Truncation      None                          Tool result size limits
Compaction      None                          Auto-compacts when approaching the token limit
Hooks           None                          Pre/post tool hooks with shell command execution
Concurrency     Sequential tool execution     Parallel execution for safe tools
Error recovery  Tool errors as results        Same, plus retry logic for transient provider errors

The good news: the architecture is the same. Every feature in the right column plugs into the same loop structure. Permissions are checked in execute_tools before calling t.call(). Compaction runs at the top of the loop when token count is high. Hooks fire around tool execution.
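
For example, a permission gate would slot into the execute_tools match like this -- the permissions field and its allows() method are invented here for illustration:

let result = match self.tools.get(&call.name) {
    Some(t) => {
        // Hypothetical permission check, evaluated before the tool runs.
        if !self.permissions.allows(&call.name, &call.arguments) {
            format!("error: permission denied for `{}`", call.name)
        } else {
            t.call(call.arguments.clone())
                .await
                .unwrap_or_else(|e| format!("error: {e}"))
        }
    }
    None => format!("error: unknown tool `{}`", call.name),
};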

Tests

Run the tests to verify your implementation:

cargo test -p mini-claw-code-starter test_single_turn_  # single_turn tests
cargo test -p mini-claw-code-starter test_simple_agent_  # SimpleAgent tests

What the tests verify

Single-turn tests (test_single_turn_):

  • test_single_turn_direct_response -- provider returns text with StopReason::Stop; verifies the agent returns that text directly
  • test_single_turn_one_tool_call -- provider returns a tool call then a final answer; verifies the agent executes the tool and returns the final text
  • test_single_turn_unknown_tool -- provider requests a tool that is not registered; verifies the agent returns an error string (not a panic) and the loop continues

SimpleAgent tests (test_simple_agent_):

  • test_simple_agent_text_response -- run() with a provider that returns text; verifies the response string
  • test_simple_agent_single_tool_call -- provider scripts a tool call followed by a final answer; verifies the agent loops correctly and returns the final text
  • test_simple_agent_unknown_tool -- provider requests a tool that is not registered; verifies the agent returns an error string (not a panic) and the loop continues
  • test_simple_agent_multi_step_loop -- provider scripts two tool calls then a final answer; verifies the agent loops correctly through multiple tool rounds

Implementation checklist

Open src/agent.rs in the starter. You will see unimplemented!() stubs with doc comments for each method. Here is what to fill in:

  1. SimpleAgent::new -- Initialize with the provider and an empty ToolSet.

  2. SimpleAgent::tool -- Push the tool into self.tools, return self.

  3. execute_tools -- Look up each tool, execute, catch errors. Return Vec<(String, String)>.

  4. chat -- The core loop. Call provider, match stop reason, dispatch tools, push messages, loop.

  5. run -- Create messages with Message::User(prompt), delegate to chat.

  6. run_with_history -- Same loop as chat but emit AgentEvents through a channel. Handle errors as events instead of ?.

  7. run_with_events -- Create messages, delegate to run_with_history.

Start with new and tool. Then implement execute_tools -- you can test it implicitly through run. Then chat, then run. Save the event methods for last.

Key takeaway

The agentic loop is surprisingly small -- a loop, a match on StopReason, and a helper that dispatches tool calls. Every feature a production agent adds (permissions, streaming, compaction, hooks) plugs into this same skeleton. If you understand chat(), you understand the architecture of every coding agent.

What you have now

After this chapter, you have a working coding agent. Not a complete one -- it has no real tools yet (those come in later chapters) -- but the core loop is done. You can register any tool that implements the Tool trait, point it at any provider that implements Provider, and the agent will autonomously loop until it has an answer.

This is the skeleton that everything else hangs on. Every feature you add later -- real tools like Bash and Read, permissions, streaming -- plugs into the loop you just built.
