Overview
Welcome to Build Your Own Mini Coding Agent in Rust. Over the next seven chapters you will implement a mini coding agent from scratch – a small version of programs like Claude Code or OpenCode: a program that takes a prompt, talks to a large language model (LLM), and uses tools to interact with the real world. After that, a series of extension chapters adds streaming, a TUI, user input, plan mode, and more.
By the end of this book you will have an agent that can run shell commands, read
and write files, and edit code, all driven by an LLM. No API key is required
until Chapter 6, and when you get there the default model is
openrouter/free
– a zero-cost endpoint on OpenRouter, no credits needed.
What is an AI agent?
An LLM on its own is a function: text in, text out. Ask it to summarize
doc.pdf and it will either refuse or hallucinate – it has no way to open the
file.
An agent solves this by giving the LLM tools. A tool is just a function your code can run – read a file, execute a shell command, hit an API. The agent sits in a loop:
- Send the user’s prompt to the LLM.
- The LLM decides it needs to read doc.pdf and outputs a tool call.
- Your code executes the read tool and feeds the file contents back.
- The LLM now has the text and returns a summary.
The LLM never touches the filesystem. It just asks, and your code does. That loop – ask, execute, feed back – is the entire idea.
How does an LLM use a tool?
An LLM cannot execute code. It is a text generator. So “calling a tool” really means the LLM outputs a structured request and your code does the rest.
When you send a request to the LLM, you include a list of tool definitions
alongside the conversation. Each definition is a name, a description, and a
JSON schema describing the arguments. For our read tool that looks like:
{
"name": "read",
"description": "Read the contents of a file.",
"parameters": {
"type": "object",
"properties": {
"path": { "type": "string" }
},
"required": ["path"]
}
}
The LLM reads these definitions the same way it reads the user’s prompt – they are just part of the input. When it decides it needs to read a file, it does not run any code. It produces a structured output like:
{ "name": "read", "arguments": { "path": "doc.pdf" } }
along with a signal that says “I’m not done yet – I made a tool call.” Your code parses this, runs the real function, and sends the result back as a new message. The LLM then continues with that result in context.
Here is the full exchange for our “Summarize doc.pdf” example:
sequenceDiagram
participant U as User
participant A as Agent
participant L as LLM
participant T as read tool
U->>A: "Summarize doc.pdf"
A->>L: prompt + tool definitions
L-->>A: tool_call: read("doc.pdf")
A->>T: read("doc.pdf")
T-->>A: file contents
A->>L: tool result (file contents)
L-->>A: "Here is a summary: ..."
A->>U: "Here is a summary: ..."
The LLM’s only job is deciding which tool to call and what arguments to pass. Your code does the actual work.
A minimal agent in pseudocode
Here is that example as code:
tools = [read_file]
messages = ["Summarize doc.pdf"]
loop:
    response = llm(messages, tools)
    if response.done:
        print(response.text)
        break
    // The LLM wants to call a tool -- record its turn, run the tool,
    // and feed the result back.
    messages.append(response)
    for call in response.tool_calls:
        result = execute(call.name, call.args)
        messages.append(result)
That is the entire agent. The rest of this book is implementing each piece –
the llm function, the tools, and the types that connect them – in Rust.
The tool-calling loop
Here is the flow of a single agent invocation:
flowchart TD
A["👤 User prompt"] --> B["🤖 LLM"]
B -- "StopReason::Stop" --> C["✅ Text response"]
B -- "StopReason::ToolUse" --> D["🔧 Execute tool calls"]
D -- "tool results" --> B
- The user sends a prompt.
- The LLM either responds with text (done) or requests one or more tool calls.
- Your code executes each tool and gathers the results.
- The results are fed back to the LLM as new messages.
- Repeat from step 2 until the LLM responds with text.
That is the entire architecture. Everything else is implementation detail.
What we will build
We will build a simple agent framework consisting of:
4 tools:
| Tool | What it does |
|---|---|
| read | Read the contents of a file |
| write | Write content to a file (creating directories as needed) |
| edit | Replace an exact string in a file |
| bash | Run a shell command and capture its output |
1 provider:
| Provider | Purpose |
|---|---|
| OpenRouterProvider | Talks to a real LLM over HTTP via the OpenAI-compatible API |
Tests use a MockProvider that returns pre-configured responses so you can
run the full test suite without an API key.
Project structure
The project is a Cargo workspace with three crates and a tutorial book:
mini-claw-code/
Cargo.toml # workspace root
mini-claw-code/ # reference solution (do not peek!)
mini-claw-code-starter/ # YOUR code -- you implement things here
mini-claw-code-xtask/ # helper commands (cargo x ...)
mini-claw-code-book/ # this tutorial
- mini-claw-code contains the complete, working implementation. It is there so the test suite can verify that the exercises are solvable, but you should avoid reading it until you have tried on your own.
- mini-claw-code-starter is your working crate. Each source file contains struct definitions, trait implementations with unimplemented!() bodies, and doc-comment hints. Your job is to replace the unimplemented!() calls with real code.
- mini-claw-code-xtask provides the cargo x helper with check, solution-check, and book commands.
- mini-claw-code-book is this mdbook tutorial.
Prerequisites
Before starting, make sure you have:
- Rust installed (1.85+ required, for edition 2024). Install from https://rustup.rs.
- Basic Rust knowledge: ownership, structs, enums, pattern matching, and Result/Option. If you have read the first half of The Rust Programming Language book, you are ready.
- A terminal and a text editor.
- mdbook (optional, for reading the tutorial locally). Install with
cargo install mdbook mdbook-mermaid.
You do not need an API key until Chapter 6. Chapters 1 through 5 use the
MockProvider for testing, so everything runs locally.
Setup
Clone the repository and verify things build:
git clone https://github.com/odysa/mini-claw-code.git
cd mini-claw-code
cargo build
Then verify the test harness works:
cargo test -p mini-claw-code-starter ch1
The tests should fail – that is expected! Your job in Chapter 1 is to make them pass.
If cargo x does not work, make sure you are in the workspace root (the
directory containing the top-level Cargo.toml).
Chapter roadmap
| Chapter | Topic | What you build |
|---|---|---|
| 1 | Core Types | MockProvider – understand the core types by building a test helper |
| 2 | Your First Tool | ReadTool – reading files |
| 3 | Single Turn | single_turn() – explicit match on StopReason, one round of tool calls |
| 4 | More Tools | BashTool, WriteTool, EditTool |
| 5 | Your First Agent SDK! | SimpleAgent – generalizes single_turn() into a loop |
| 6 | The OpenRouter Provider | OpenRouterProvider – talking to a real LLM API |
| 7 | A Simple CLI | Wire everything into an interactive CLI with conversation memory |
| 8 | The Singularity | Your agent can now code itself – what’s next |
Chapters 1–7 are hands-on: you write code in mini-claw-code-starter and run
tests to check your work. Chapter 8 marks the transition to extension
chapters (9+) which walk through the reference implementation:
| Chapter | Topic | What it adds |
|---|---|---|
| 9 | A Better TUI | Markdown rendering, spinners, collapsed tool calls |
| 10 | Streaming | StreamingAgent with SSE parsing and AgentEvents |
| 11 | User Input | AskTool – let the LLM ask you clarifying questions |
| 12 | Plan Mode | PlanAgent – read-only planning phase with approval gating |
Chapters 1–7 follow the same rhythm:
- Read the chapter to understand the concepts.
- Open the corresponding source file in mini-claw-code-starter/src/.
- Replace the unimplemented!() calls with your implementation.
- Run cargo test -p mini-claw-code-starter chN to check your work.
Ready? Let’s build an agent.
What’s next
Head to Chapter 1: Core Types to understand the
foundational types – StopReason, Message, and the Provider trait – and
build MockProvider, the test helper you will use throughout the next four
chapters.
Chapter 1: Core Types
In this chapter you will understand the types that make up the agent protocol –
StopReason, AssistantTurn, Message, and the Provider trait. These are
the building blocks everything else is built on.
To verify your understanding, you will implement a small test helper:
MockProvider, a struct that returns pre-configured responses so that you can
test future chapters without an API key.
Goal
Understand the core types, then implement MockProvider so that:
- You create it with a VecDeque<AssistantTurn> of canned responses.
- Each call to chat() returns the next response in sequence.
- If all responses have been consumed, it returns an error.
The core types
Open mini-claw-code-starter/src/types.rs. These types define the protocol
between the agent and any LLM backend.
Here is how they relate to each other:
classDiagram
class Provider {
<<trait>>
+chat(messages, tools) AssistantTurn
}
class AssistantTurn {
text: Option~String~
tool_calls: Vec~ToolCall~
stop_reason: StopReason
}
class StopReason {
<<enum>>
Stop
ToolUse
}
class ToolCall {
id: String
name: String
arguments: Value
}
class Message {
<<enum>>
System(String)
User(String)
Assistant(AssistantTurn)
ToolResult(id, content)
}
class ToolDefinition {
name: &'static str
description: &'static str
parameters: Value
}
Provider --> AssistantTurn : returns
Provider --> Message : receives
Provider --> ToolDefinition : receives
AssistantTurn --> StopReason
AssistantTurn --> ToolCall : contains 0..*
Message --> AssistantTurn : wraps
Provider takes in messages and tool definitions, and returns an
AssistantTurn. The turn’s stop_reason tells you what to do next.
ToolDefinition and its builder
#![allow(unused)]
fn main() {
pub struct ToolDefinition {
pub name: &'static str,
pub description: &'static str,
pub parameters: Value,
}
}
Each tool declares a ToolDefinition that tells the LLM what it can do. The
parameters field is a JSON Schema object describing the tool’s arguments.
Rather than building JSON by hand every time, ToolDefinition has a builder
API:
#![allow(unused)]
fn main() {
ToolDefinition::new("read", "Read the contents of a file.")
.param("path", "string", "The file path to read", true)
}
- new(name, description) creates a definition with an empty parameter schema.
- param(name, type, description, required) adds a parameter and returns self, so you can chain calls.
You will use this builder in every tool starting from Chapter 2.
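For reference, the parameters value this builder call produces is roughly the same JSON Schema you saw in the Overview. Here is a sketch of the equivalent built by hand with serde_json (the exact shape – for example whether parameter descriptions are included – depends on the builder implementation):

use serde_json::json;

// Roughly what ToolDefinition::new("read", ...).param("path", "string", ..., true)
// stores in its `parameters` field.
let parameters = json!({
    "type": "object",
    "properties": {
        "path": { "type": "string", "description": "The file path to read" }
    },
    "required": ["path"]
});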
StopReason and AssistantTurn
#![allow(unused)]
fn main() {
pub enum StopReason {
Stop,
ToolUse,
}
pub struct AssistantTurn {
pub text: Option<String>,
pub tool_calls: Vec<ToolCall>,
pub stop_reason: StopReason,
}
}
The ToolCall struct holds a single tool invocation:
#![allow(unused)]
fn main() {
pub struct ToolCall {
pub id: String,
pub name: String,
pub arguments: Value,
}
}
Each tool call has an id (for matching results back to requests), a name
(which tool to call), and arguments (a JSON value the tool will parse).
Every response from the LLM comes with a stop_reason that tells you why
the model stopped generating:
- StopReason::Stop – the model is done. Check text for the response.
- StopReason::ToolUse – the model wants to call tools. Check tool_calls.
This is the raw LLM protocol: the model tells you what to do next. In
Chapter 3 you will write a function that explicitly matches on
stop_reason to handle each case. In Chapter 5 you will wrap that match
inside a loop to create the full agent.
The Provider trait
#![allow(unused)]
fn main() {
pub trait Provider: Send + Sync {
fn chat<'a>(
&'a self,
messages: &'a [Message],
tools: &'a [&'a ToolDefinition],
) -> impl Future<Output = anyhow::Result<AssistantTurn>> + Send + 'a;
}
}
This says: “A Provider is something that can take a slice of messages and a
slice of tool definitions, and asynchronously return an AssistantTurn.”
The Send + Sync bounds mean the provider must be safe to share across
threads. This is important because tokio (the async runtime) may move tasks
between threads.
Notice that chat() takes &self, not &mut self. The real provider
(OpenRouterProvider) does not need mutation – it just fires HTTP requests.
Making the trait &mut self would force every caller to hold exclusive access,
which is unnecessarily restrictive. The trade-off: MockProvider (a test
helper) does need to mutate its response list, so it must use interior
mutability to conform to the trait.
The Message enum
#![allow(unused)]
fn main() {
pub enum Message {
System(String),
User(String),
Assistant(AssistantTurn),
ToolResult { id: String, content: String },
}
}
The conversation history is a list of Message values:
- System(text) – a system prompt that sets the agent’s role and behavior. Typically the first message in the history.
- User(text) – a prompt from the user.
- Assistant(turn) – a response from the LLM (text, tool calls, or both).
- ToolResult { id, content } – the result of executing a tool call. The id matches the ToolCall::id so the LLM knows which call this result belongs to.
You will use these variants starting in Chapter 3 when building the
single_turn() function.
Why Provider uses impl Future but Tool uses #[async_trait]
You may notice in Chapter 2 that the Tool trait uses #[async_trait] while
Provider uses impl Future directly. The difference is about how the trait
is used:
- Provider is used generically (SimpleAgent<P: Provider>). The compiler knows the concrete type at compile time, so impl Future works.
- Tool is stored as a trait object (Box<dyn Tool>) in a collection of different tool types. Trait objects require a uniform return type, which #[async_trait] provides by boxing the future.
When implementing a trait that uses impl Future, you can simply write
async fn in the impl block – Rust desugars it to the impl Future form
automatically. So while the trait definition says -> impl Future<...>,
your implementation can just write async fn chat(...).
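For example, a minimal provider that just echoes the last user message back could be written like this. EchoProvider is hypothetical – it is not part of the project and is only here to show the shape of an implementation; note the impl uses a plain async fn even though the trait declares impl Future:

struct EchoProvider;

impl Provider for EchoProvider {
    // Rust accepts an `async fn` here and desugars it to the
    // `impl Future<Output = ...> + Send` shape the trait declares.
    async fn chat(
        &self,
        messages: &[Message],
        _tools: &[&ToolDefinition],
    ) -> anyhow::Result<AssistantTurn> {
        // Echo the most recent user message back as a plain text response.
        let text = messages.iter().rev().find_map(|m| match m {
            Message::User(t) => Some(t.clone()),
            _ => None,
        });
        Ok(AssistantTurn { text, tool_calls: Vec::new(), stop_reason: StopReason::Stop })
    }
}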
If this distinction is unclear now, it will click in Chapter 5 when you see both patterns in action.
ToolSet – a collection of tools
One more type you will use starting in Chapter 3: ToolSet. It wraps a
HashMap<String, Box<dyn Tool>> and indexes tools by name, giving O(1)
lookup when executing tool calls. You build one with a builder:
#![allow(unused)]
fn main() {
let tools = ToolSet::new()
.with(ReadTool::new())
.with(BashTool::new());
}
You do not need to implement ToolSet – it is provided in types.rs.
Implementing MockProvider
Now that you understand the types, let’s put them to use. MockProvider is a
test helper – it implements Provider by returning canned responses instead of
calling a real LLM. You will use it throughout chapters 2–5 to test tools and
the agent loop without needing an API key.
Open mini-claw-code-starter/src/mock.rs. You will see the struct and method
signatures already laid out with unimplemented!() bodies.
Interior mutability with Mutex
MockProvider needs to remove responses from a list each time chat()
is called. But chat() takes &self. How do we mutate through a shared
reference?
Rust’s std::sync::Mutex provides interior mutability: you wrap a value in a
Mutex, and calling .lock().unwrap() gives you a mutable guard even through
&self. The lock ensures only one thread accesses the data at a time.
#![allow(unused)]
fn main() {
use std::collections::VecDeque;
use std::sync::Mutex;
struct MyState {
items: Mutex<VecDeque<String>>,
}
impl MyState {
fn take_one(&self) -> Option<String> {
self.items.lock().unwrap().pop_front()
}
}
}
Step 1: The struct fields
The struct already has the field you need: a Mutex<VecDeque<AssistantTurn>>
to hold the responses. This is provided so that the method signatures compile.
Your job is to implement the methods that use this field.
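Concretely, the struct looks roughly like this (the field name in your starter may differ):

use std::collections::VecDeque;
use std::sync::Mutex;

pub struct MockProvider {
    // Canned responses, handed out in FIFO order by chat().
    responses: Mutex<VecDeque<AssistantTurn>>,
}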
Step 2: Implement new()
The new() method receives a VecDeque<AssistantTurn>. We want FIFO order –
each call to chat() should return the first remaining response, not the
last. VecDeque::pop_front() does exactly that in O(1):
flowchart LR
subgraph "VecDeque (FIFO)"
direction LR
A["A"] ~~~ B["B"] ~~~ C["C"]
end
A -- "pop_front()" --> out1["chat() → A"]
B -. "next call" .-> out2["chat() → B"]
C -. "next call" .-> out3["chat() → C"]
So in new():
- Wrap the input deque in a Mutex.
- Store it in Self.
Step 3: Implement chat()
The chat() method should:
- Lock the mutex.
- pop_front() the next response.
- If there is one, return Ok(response).
- If the deque is empty, return an error.
The mock provider intentionally ignores the messages and tools parameters.
It does not care what the “user” said – it just returns the next canned
response.
A useful pattern for converting Option to Result:
#![allow(unused)]
fn main() {
some_option.ok_or_else(|| anyhow::anyhow!("no more responses"))
}
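Putting the lock, the pop_front(), and the ok_or_else pattern together, the body of chat() can be a single expression. Here is a sketch (assuming the field is named responses as above; adjust to the exact signature in your starter):

async fn chat(
    &self,
    _messages: &[Message],
    _tools: &[&ToolDefinition],
) -> anyhow::Result<AssistantTurn> {
    // Ignore the inputs and return the next canned response,
    // or an error once the queue is exhausted.
    self.responses
        .lock()
        .unwrap()
        .pop_front()
        .ok_or_else(|| anyhow::anyhow!("no more responses"))
}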
Running the tests
Run the Chapter 1 tests:
cargo test -p mini-claw-code-starter ch1
What the tests verify
- test_ch1_returns_text: Creates a MockProvider with one response containing text. Calls chat() once and checks the text matches.
- test_ch1_returns_tool_calls: Creates a provider with one response containing a tool call. Verifies the tool call name and id.
- test_ch1_steps_through_sequence: Creates a provider with three responses. Calls chat() three times and verifies they come back in the correct order (First, Second, Third).
These are the core tests. There are also additional edge-case tests (empty responses, exhausted queue, multiple tool calls, etc.) that will pass once your core implementation is correct.
Recap
You have learned the core types that define the agent protocol:
- StopReason tells you whether the LLM is done or wants to call tools.
- AssistantTurn carries the LLM’s response – text, tool calls, or both.
- Provider is the trait any LLM backend implements.
You also built MockProvider, a test helper you will use throughout the next
four chapters to simulate LLM conversations without HTTP requests.
What’s next
In Chapter 2: Your First Tool you will implement the
ReadTool – a tool that reads file contents and returns them to the LLM.
Chapter 2: Your First Tool
Now that you have a mock provider, it is time to build your first tool. You will
implement ReadTool – a tool that reads a file and returns its contents. This
is the simplest tool in our agent, but it introduces the Tool trait pattern
that every other tool follows.
Goal
Implement ReadTool so that:
- It declares its name, description, and parameter schema.
- When called with a {"path": "some/file.txt"} argument, it reads the file and returns its contents as a string.
- Missing arguments or non-existent files produce errors.
Key Rust concepts
The Tool trait
Open mini-claw-code-starter/src/types.rs and look at the Tool trait:
#![allow(unused)]
fn main() {
#[async_trait::async_trait]
pub trait Tool: Send + Sync {
fn definition(&self) -> &ToolDefinition;
async fn call(&self, args: Value) -> anyhow::Result<String>;
}
}
Two methods:
- definition() returns metadata about the tool: its name, a description, and a JSON schema describing its parameters. The LLM uses this to decide which tool to call and how to format the arguments.
- call() actually executes the tool. It receives a serde_json::Value containing the arguments and returns a string result.
ToolDefinition
#![allow(unused)]
fn main() {
pub struct ToolDefinition {
pub name: &'static str,
pub description: &'static str,
pub parameters: Value,
}
}
As you saw in Chapter 1, ToolDefinition has a builder API for declaring
parameters. For ReadTool, we need a single required parameter called "path"
of type "string":
#![allow(unused)]
fn main() {
ToolDefinition::new("read", "Read the contents of a file.")
.param("path", "string", "The file path to read", true)
}
Under the hood, the builder constructs the JSON Schema you saw in Chapter 1.
The last argument (true) marks the parameter as required.
Why #[async_trait] instead of plain async fn?
You might wonder why we use the async_trait macro instead of writing
async fn directly in the trait. The reason is trait object compatibility.
Later, in the agent loop, we will store tools in a ToolSet – a HashMap-backed
collection of different tool types behind a common interface. This requires
dynamic dispatch, which means the compiler needs to know the size of the
return type at compile time.
async fn in traits generates a different, uniquely-sized Future type for
each implementation. That breaks dynamic dispatch. The #[async_trait] macro
automatically rewrites async fn into a method that returns
Pin<Box<dyn Future<...>>>, which has a known, fixed size regardless of
which tool produced it. You write normal async fn code, and the macro
handles the boxing for you.
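Conceptually, the macro performs a rewrite like this (simplified – the real expansion names the lifetimes differently and adds more bounds, but the idea is the same):

// What you write inside the #[async_trait::async_trait] impl block:
async fn call(&self, args: Value) -> anyhow::Result<String> {
    /* ... */
}

// Roughly what the macro expands it to (Pin is std::pin::Pin,
// Future is std::future::Future):
fn call<'a>(
    &'a self,
    args: Value,
) -> Pin<Box<dyn Future<Output = anyhow::Result<String>> + Send + 'a>> {
    Box::pin(async move { /* ... */ })
}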
Here is the data flow when the agent calls a tool:
flowchart LR
A["LLM returns<br/>ToolCall"] --> B["args: JSON Value<br/>{"path": "f.txt"}"]
B --> C["Tool::call(args)"]
C --> D["Result: String<br/>(file contents)"]
D --> E["Sent back to LLM<br/>as ToolResult"]
The LLM never touches the filesystem. It produces a JSON request, your code executes it, and returns a string.
The implementation
Open mini-claw-code-starter/src/tools/read.rs. The struct, Default impl, and
method signatures are already provided.
Remember to annotate your impl Tool for ReadTool block with
#[async_trait::async_trait]. The starter file already has this in place.
Step 1: Implement new()
Create a ToolDefinition and store it in self.definition. Use the builder:
#![allow(unused)]
fn main() {
ToolDefinition::new("read", "Read the contents of a file.")
.param("path", "string", "The file path to read", true)
}
Step 2: definition() – already provided
The definition() method is already implemented in the starter – it simply
returns &self.definition. No work needed here.
Step 3: Implement call()
This is where the real work happens. Your implementation should:
- Extract the "path" argument from args.
- Read the file asynchronously.
- Return the file contents.
Here is the shape:
#![allow(unused)]
fn main() {
async fn call(&self, args: Value) -> anyhow::Result<String> {
// 1. Extract path
// 2. Read file with tokio::fs::read_to_string
// 3. Return contents
}
}
Some useful APIs:
args["path"].as_str()returnsOption<&str>. Use.context("missing 'path' argument")?fromanyhowto convertNoneinto a descriptive error.tokio::fs::read_to_string(path).awaitreads a file asynchronously. Chain.with_context(|| format!("failed to read '{path}'"))?for a clear error message.
That is it – extract the path, read the file, return the contents.
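If you want a concrete shape to check against, a minimal call() body might look like this (a sketch using the APIs above; your error messages can differ):

use anyhow::Context;

async fn call(&self, args: Value) -> anyhow::Result<String> {
    // 1. Extract the "path" argument, turning a missing key into a clear error.
    let path = args["path"]
        .as_str()
        .context("missing 'path' argument")?;
    // 2. Read the file asynchronously, attaching the path to any I/O error.
    let contents = tokio::fs::read_to_string(path)
        .await
        .with_context(|| format!("failed to read '{path}'"))?;
    // 3. Return the contents as the tool result.
    Ok(contents)
}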
Running the tests
Run the Chapter 2 tests:
cargo test -p mini-claw-code-starter ch2
What the tests verify
- test_ch2_read_definition: Creates a ReadTool and checks that its name is "read", description is non-empty, and "path" is in the required parameters.
- test_ch2_read_file: Creates a temp file with known content, calls ReadTool with the file path, and checks the returned content matches.
- test_ch2_read_missing_file: Calls ReadTool with a path that does not exist and verifies it returns an error.
- test_ch2_read_missing_arg: Calls ReadTool with an empty JSON object (no "path" key) and verifies it returns an error.
There are also additional edge-case tests (empty files, unicode content, wrong argument types, etc.) that will pass once your core implementation is correct.
Recap
You built your first tool by implementing the Tool trait. The key patterns:
- ToolDefinition::new(...).param(...) declares the tool’s name, description, and parameters.
- #[async_trait::async_trait] on the impl block lets you write async fn call() while keeping trait object compatibility.
- tokio::fs for async file I/O.
- anyhow::Context for adding descriptive error messages.
Every tool in the agent follows this exact same structure. Once you understand
ReadTool, the remaining tools are variations on the theme.
What’s next
In Chapter 3: Single Turn you will write a function
that matches on StopReason to handle a single round of tool calls.
Chapter 3: Single Turn
You have a provider and a tool. Before jumping to the full agent loop, let’s
see the raw protocol: the LLM returns a stop_reason that tells you whether
it is done or wants to use tools. In this chapter you will write a function
that handles exactly one prompt with at most one round of tool calls.
Goal
Implement single_turn() so that:
- It sends a prompt to the provider.
- It matches on stop_reason.
- If Stop – return the text.
- If ToolUse – execute the tools, send results back, return the final text.
No loop. Just one turn.
Key Rust concepts
ToolSet – a HashMap of tools
The function signature takes a &ToolSet instead of a raw slice or vector:
#![allow(unused)]
fn main() {
pub async fn single_turn<P: Provider>(
provider: &P,
tools: &ToolSet,
prompt: &str,
) -> anyhow::Result<String>
}
ToolSet wraps a HashMap<String, Box<dyn Tool>> and indexes tools by their
definition name. This gives O(1) lookup when executing tool calls instead of
scanning a list. The builder API auto-extracts the name from each tool’s
definition:
#![allow(unused)]
fn main() {
let tools = ToolSet::new().with(ReadTool::new());
let result = single_turn(&provider, &tools, "Read test.txt").await?;
}
match on StopReason
This is the core teaching point. Instead of checking tool_calls.is_empty(),
you explicitly match on the stop reason:
#![allow(unused)]
fn main() {
match turn.stop_reason {
StopReason::Stop => { /* return text */ }
StopReason::ToolUse => { /* execute tools */ }
}
}
This makes the protocol visible. The LLM is telling you what to do, and you handle each case explicitly.
Here is the complete flow of single_turn():
flowchart TD
A["prompt"] --> B["provider.chat()"]
B --> C{"stop_reason?"}
C -- "Stop" --> D["Return text"]
C -- "ToolUse" --> E["Execute each tool call"]
E --> F{"Tool error?"}
F -- "Ok" --> G["result = output"]
F -- "Err" --> H["result = error message"]
G --> I["Push Assistant message"]
H --> I
I --> J["Push ToolResult messages"]
J --> K["provider.chat() again"]
K --> L["Return final text"]
The key difference from the full agent loop (Chapter 5) is that there is no
outer loop here. If the LLM asks for tools a second time, single_turn() does
not handle it – that is what the agent loop is for.
The implementation
Open mini-claw-code-starter/src/agent.rs. You will see the single_turn()
function signature at the top of the file, before the SimpleAgent struct.
Step 1: Collect tool definitions
ToolSet has a definitions() method that returns all tool schemas:
#![allow(unused)]
fn main() {
let defs = tools.definitions();
}
Step 2: Create the initial message
#![allow(unused)]
fn main() {
let mut messages = vec![Message::User(prompt.to_string())];
}
Step 3: Call the provider
#![allow(unused)]
fn main() {
let turn = provider.chat(&messages, &defs).await?;
}
Step 4: Match on stop_reason
This is the heart of the function:
#![allow(unused)]
fn main() {
match turn.stop_reason {
StopReason::Stop => Ok(turn.text.unwrap_or_default()),
StopReason::ToolUse => {
// execute tools, send results, get final answer
}
}
}
For the ToolUse branch:
- For each tool call, find the matching tool and call it. Collect the results into a Vec first – you will need turn.tool_calls for this, so you cannot move turn yet.
- Push Message::Assistant(turn) and then Message::ToolResult for each result. Pushing the assistant turn moves turn, which is why you must collect results beforehand.
- Call the provider again to get the final answer.
- Return final_turn.text.unwrap_or_default().
The tool-finding and execution logic is the same as what you will use in the agent loop (Chapter 5):
#![allow(unused)]
fn main() {
println!("{}", tool_summary(call));
let content = match tools.get(&call.name) {
Some(t) => t.call(call.arguments.clone()).await
.unwrap_or_else(|e| format!("error: {e}")),
None => format!("error: unknown tool `{}`", call.name),
};
}
The tool_summary() helper prints each tool call to the terminal so you can
see which tools the agent is using and what arguments it passed. For example,
[bash: ls -la] or [read: src/main.rs]. (The reference implementation uses
print!("\x1b[2K\r...") instead of println! to clear the thinking...
indicator line before printing – you’ll see this pattern in Chapter 7. A plain
println! works fine for now.)
Error handling – never crash the loop
Notice that tool errors are caught, not propagated. The .unwrap_or_else()
converts any error into a string like "error: failed to read 'missing.txt'".
This string is sent back to the LLM as a normal tool result. The LLM can then
decide what to do – try a different file, use another tool, or explain the
problem to the user.
The same applies to unknown tools – instead of panicking, you send an error message back as a tool result.
This is a key design principle: the agent loop should never crash because of a tool failure. Tools operate on the real world (files, processes, network), and failures are expected. The LLM is smart enough to recover if you give it the error message.
Here is the message sequence for a successful tool call:
sequenceDiagram
participant ST as single_turn()
participant P as Provider
participant T as ReadTool
ST->>P: [User("Read test.txt")] + tool defs
P-->>ST: ToolUse: read({path: "test.txt"})
ST->>T: call({path: "test.txt"})
T-->>ST: "file contents..."
Note over ST: Push Assistant + ToolResult
ST->>P: [User, Assistant, ToolResult]
P-->>ST: Stop: "Here are the contents: ..."
ST-->>ST: return text
And here is what happens when a tool fails (e.g. file not found):
sequenceDiagram
participant ST as single_turn()
participant P as Provider
participant T as ReadTool
ST->>P: [User("Read missing.txt")] + tool defs
P-->>ST: ToolUse: read({path: "missing.txt"})
ST->>T: call({path: "missing.txt"})
T--xST: Err("failed to read 'missing.txt'")
Note over ST: Catch error, use as result
Note over ST: Push Assistant + ToolResult("error: failed to read ...")
ST->>P: [User, Assistant, ToolResult]
P-->>ST: Stop: "Sorry, that file doesn't exist."
ST-->>ST: return text
The error does not crash the agent. It becomes a tool result that the LLM reads and responds to.
Running the tests
Run the Chapter 3 tests:
cargo test -p mini-claw-code-starter ch3
What the tests verify
- test_ch3_direct_response: Provider returns StopReason::Stop. single_turn should return the text directly.
- test_ch3_one_tool_call: Provider returns StopReason::ToolUse with a read tool call, then StopReason::Stop. Verifies the file was read and the final text is returned.
- test_ch3_unknown_tool: Provider returns StopReason::ToolUse for a tool that does not exist. Verifies the error message is sent as a tool result and the final text is returned.
- test_ch3_tool_error_propagates: Provider requests a read on a file that does not exist. The error should be caught and sent back to the LLM as a tool result (not crash the function). The LLM then responds with text.
There are also additional edge-case tests (empty responses, multiple tool calls in one turn, etc.) that will pass once your core implementation is correct.
Recap
You have written the simplest possible handler for the LLM protocol:
- Match on StopReason – the model tells you what to do next.
- No loop – you handle at most one round of tool calls.
- ToolSet – a HashMap-backed collection with O(1) tool lookup by name.
This is the foundation. In Chapter 5 you will wrap this same logic in a loop to create the full agent.
What’s next
In Chapter 4: More Tools you will implement three
more tools: BashTool, WriteTool, and EditTool.
Chapter 4: More Tools
You have already implemented ReadTool and understand the Tool trait pattern.
Now you will implement three more tools: BashTool, WriteTool, and EditTool.
Each follows the same structure – define a schema, implement call() – so this
chapter reinforces the pattern through repetition.
By the end of this chapter your agent will have all four tools it needs to interact with the file system and execute commands.
flowchart LR
subgraph ToolSet
R["read<br/>Read a file"]
B["bash<br/>Run a command"]
W["write<br/>Write a file"]
E["edit<br/>Replace a string"]
end
Agent -- "tools.get(name)" --> ToolSet
Goal
Implement three tools:
- BashTool – run a shell command and return its output.
- WriteTool – write content to a file, creating directories as needed.
- EditTool – replace an exact string in a file (must appear exactly once).
Key Rust concepts
tokio::process::Command
Tokio provides an async wrapper around std::process::Command. You will use it
in BashTool:
#![allow(unused)]
fn main() {
let output = tokio::process::Command::new("bash")
.arg("-c")
.arg(command)
.output()
.await?;
}
This runs bash -c "<command>" and captures stdout and stderr. The output
struct has stdout and stderr fields as Vec<u8>, which you convert to
strings with String::from_utf8_lossy().
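For example, after the output() call above (a small sketch; the stdout/stderr names match the result-building snippet later in this chapter):

// Lossily convert the captured bytes; any invalid UTF-8 becomes U+FFFD.
let stdout = String::from_utf8_lossy(&output.stdout);
let stderr = String::from_utf8_lossy(&output.stderr);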
bail!() macro
The anyhow::bail!() macro is shorthand for returning an error immediately:
#![allow(unused)]
fn main() {
use anyhow::bail;
if count == 0 {
bail!("not found");
}
// equivalent to:
// return Err(anyhow::anyhow!("not found"));
}
You will use this in EditTool for validation.
Make sure to import it: use anyhow::{Context, bail};. The starter file
already includes this import in edit.rs.
create_dir_all
When writing a file to a path like a/b/c/file.txt, the parent directories
might not exist. tokio::fs::create_dir_all creates the entire directory tree:
#![allow(unused)]
fn main() {
if let Some(parent) = std::path::Path::new(path).parent() {
tokio::fs::create_dir_all(parent).await?;
}
}
Tool 1: BashTool
Open mini-claw-code-starter/src/tools/bash.rs.
Schema
Use the builder pattern you learned in Chapter 2:
#![allow(unused)]
fn main() {
ToolDefinition::new("bash", "Run a bash command and return its output.")
.param("command", "string", "The bash command to run", true)
}
Implementation
The call() method should:
- Extract "command" from args.
- Run bash -c <command> using tokio::process::Command.
- Capture stdout and stderr.
- Build a result string:
  - Start with stdout (if non-empty).
  - Append stderr prefixed with "stderr: " (if non-empty).
  - If both are empty, return "(no output)".
Think about how you combine stdout and stderr. If both are present, you want them separated by a newline. Something like:
#![allow(unused)]
fn main() {
let mut result = String::new();
if !stdout.is_empty() {
result.push_str(&stdout);
}
if !stderr.is_empty() {
if !result.is_empty() {
result.push('\n');
}
result.push_str("stderr: ");
result.push_str(&stderr);
}
if result.is_empty() {
result.push_str("(no output)");
}
}
Tool 2: WriteTool
Open mini-claw-code-starter/src/tools/write.rs.
Schema
#![allow(unused)]
fn main() {
ToolDefinition::new("write", "Write content to a file, creating directories as needed.")
.param("path", "string", "The file path to write to", true)
.param("content", "string", "The content to write to the file", true)
}
Implementation
The call() method should:
- Extract "path" and "content" from args.
- Create parent directories if they do not exist.
- Write the content to the file.
- Return a confirmation message like "wrote {path}".
For creating parent directories:
#![allow(unused)]
fn main() {
if let Some(parent) = std::path::Path::new(path).parent() {
tokio::fs::create_dir_all(parent).await
.with_context(|| format!("failed to create directories for '{path}'"))?;
}
}
Then write the file:
#![allow(unused)]
fn main() {
tokio::fs::write(path, content).await
.with_context(|| format!("failed to write '{path}'"))?;
}
Tool 3: EditTool
Open mini-claw-code-starter/src/tools/edit.rs.
Schema
#![allow(unused)]
fn main() {
ToolDefinition::new("edit", "Replace an exact string in a file (must appear exactly once).")
.param("path", "string", "The file path to edit", true)
.param("old_string", "string", "The exact string to find and replace", true)
.param("new_string", "string", "The replacement string", true)
}
Implementation
The call() method is the most interesting of the bunch. It should:
- Extract "path", "old_string", and "new_string" from args.
- Read the file contents.
- Count how many times old_string appears in the content.
- If the count is 0, return an error: the string was not found.
- If the count is greater than 1, return an error: the string is ambiguous.
- Replace the single occurrence and write the file back.
- Return a confirmation like "edited {path}".
The validation is important – requiring exactly one match prevents accidental edits in the wrong place.
flowchart TD
A["Read file"] --> B["Count matches<br/>of old_string"]
B --> C{"count?"}
C -- "0" --> D["Error: not found"]
C -- "1" --> E["Replace + write file"]
C -- ">1" --> F["Error: ambiguous"]
E --> G["Return 'edited path'"]
Useful APIs:
- content.matches(old).count() counts occurrences of a substring.
- content.replacen(old, new, 1) replaces the first occurrence.
- bail!("old_string not found in '{path}'") for the not-found case.
- bail!("old_string appears {count} times in '{path}', must be unique") for the ambiguous case.
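Put together, the validation and replacement might look like this (a sketch, assuming path, old, and new have already been extracted from args; the exact error wording is up to you):

let content = tokio::fs::read_to_string(path)
    .await
    .with_context(|| format!("failed to read '{path}'"))?;

// Require exactly one occurrence before editing.
let count = content.matches(old).count();
if count == 0 {
    bail!("old_string not found in '{path}'");
}
if count > 1 {
    bail!("old_string appears {count} times in '{path}', must be unique");
}

// Replace the single occurrence and write the file back.
let new_content = content.replacen(old, new, 1);
tokio::fs::write(path, new_content)
    .await
    .with_context(|| format!("failed to write '{path}'"))?;

Ok(format!("edited {path}"))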
Running the tests
Run the Chapter 4 tests:
cargo test -p mini-claw-code-starter ch4
What the tests verify
BashTool:
- test_ch4_bash_definition: Checks name is "bash" and "command" is required.
- test_ch4_bash_runs_command: Runs echo hello and checks the output contains "hello".
- test_ch4_bash_captures_stderr: Runs echo err >&2 and checks stderr is captured.
- test_ch4_bash_missing_arg: Passes empty args and expects an error.
WriteTool:
- test_ch4_write_definition: Checks name is "write".
- test_ch4_write_creates_file: Writes to a temp file and reads it back.
- test_ch4_write_creates_dirs: Writes to a/b/c/out.txt and verifies directories were created.
- test_ch4_write_missing_arg: Passes only "path" (no "content") and expects an error.
EditTool:
- test_ch4_edit_definition: Checks name is "edit".
- test_ch4_edit_replaces_string: Edits "hello" to "goodbye" in a file containing "hello world" and checks the result is "goodbye world".
- test_ch4_edit_not_found: Tries to replace a string that does not exist and expects an error.
- test_ch4_edit_not_unique: Tries to replace "a" in a file containing "aaa" (three occurrences) and expects an error.
There are also additional edge-case tests for each tool (wrong argument types, missing arguments, output format checks, etc.) that will pass once your core implementations are correct.
Recap
You now have four tools, and they all follow the same pattern:
- Define a ToolDefinition with ::new(...).param(...) builder calls.
- Return &self.definition from definition().
- Add #[async_trait::async_trait] on the impl Tool block and write async fn call().
This is a deliberate design. The Tool trait makes every tool interchangeable
from the agent’s perspective. The agent does not know or care how a tool works
internally – it only needs the definition (to tell the LLM) and the call method
(to execute it).
What’s next
With a provider and four tools ready, it is time to connect them. In
Chapter 5: Your First Agent SDK! you will build the
SimpleAgent – the core loop that sends prompts to the provider, executes
tool calls, and iterates until the LLM gives a final answer.
Chapter 5: Your First Agent SDK!
This is the chapter where everything comes together. You have a provider that
returns AssistantTurn responses and four tools that execute actions. Now you
will build the SimpleAgent – the loop that connects them.
This is the “aha!” moment of the tutorial. The agent loop is surprisingly short, but it is the engine that makes an LLM into an agent.
What is an agent loop?
In Chapter 3 you built single_turn() – one prompt, one round of tool calls,
one final answer. That is enough when the LLM knows everything it needs after
reading a single file. But real tasks are messier:
“Find the bug in this project and fix it.”
The LLM might need to read five files, run the test suite, edit a source file, run the tests again, and then report back. Each of those is a tool call, and the LLM cannot plan them all upfront because the result of one call determines the next. It needs a loop.
The agent loop is that loop:
flowchart TD
A["User prompt"] --> B["Call LLM"]
B -- "StopReason::Stop" --> C["Return text"]
B -- "StopReason::ToolUse" --> D["Execute tool calls"]
D -- "Push assistant + tool results" --> B
- Send messages to the LLM.
- If the LLM says “I’m done” (StopReason::Stop), return its text.
- If the LLM says “I need tools” (StopReason::ToolUse), execute them.
- Append the assistant turn and tool results to the message history.
- Go to step 1.
That is the entire architecture of every coding agent – Claude Code, Cursor, OpenCode, Copilot. The details vary (streaming, parallel tool calls, safety checks), but the core loop is always the same. And you are about to build it in about 30 lines of Rust.
Goal
Implement SimpleAgent so that:
- It holds a provider and a collection of tools.
- You can register tools using a builder pattern (.tool(ReadTool::new())).
- The run() method implements the tool-calling loop: prompt -> provider -> tool calls -> tool results -> provider -> … -> final text.
Key Rust concepts
Generics with trait bounds
#![allow(unused)]
fn main() {
pub struct SimpleAgent<P: Provider> {
provider: P,
tools: ToolSet,
}
}
The <P: Provider> means SimpleAgent is generic over any type that
implements the Provider trait. When you use MockProvider, the compiler
generates code specialized for MockProvider. When you use
OpenRouterProvider, it generates code for that type. Same logic, different
providers.
ToolSet – a HashMap of trait objects
The tools field is a ToolSet, which wraps a HashMap<String, Box<dyn Tool>>
internally. Each value is a heap-allocated trait object that implements Tool,
but the concrete types can differ. One might be a ReadTool, the next a
BashTool. The HashMap key is the tool’s name, giving O(1) lookup when executing
tool calls.
Why trait objects (Box<dyn Tool>) instead of generics? Because you need a
heterogeneous collection. A Vec<T> requires all elements to be the same
type. With Box<dyn Tool>, you erase the concrete type and store them all
behind the same interface.
This is why the Tool trait uses #[async_trait] – the macro rewrites
async fn into a boxed future with a uniform type across different tool
implementations.
The builder pattern
The tool() method takes self by value (not &mut self) and returns Self:
#![allow(unused)]
fn main() {
pub fn tool(mut self, t: impl Tool + 'static) -> Self {
// push the tool
self
}
}
This lets you chain calls:
#![allow(unused)]
fn main() {
let agent = SimpleAgent::new(provider)
.tool(BashTool::new())
.tool(ReadTool::new())
.tool(WriteTool::new())
.tool(EditTool::new());
}
The impl Tool + 'static parameter accepts any type implementing Tool with
a 'static lifetime (meaning it does not borrow temporary data). Inside the
method, you push it into the ToolSet, which boxes it and indexes it by name.
The implementation
Open mini-claw-code-starter/src/agent.rs. The struct definition and method
signatures are provided.
Step 1: Implement new()
Store the provider and initialize an empty ToolSet:
#![allow(unused)]
fn main() {
pub fn new(provider: P) -> Self {
Self {
provider,
tools: ToolSet::new(),
}
}
}
This one is straightforward.
Step 2: Implement tool()
Push the tool into the set, return self:
#![allow(unused)]
fn main() {
pub fn tool(mut self, t: impl Tool + 'static) -> Self {
self.tools.push(t);
self
}
}
Step 3: Implement run() – the core loop
This is the heart of the agent. Here is the flow:
- Collect tool definitions from all registered tools.
- Create a messages vector starting with the user’s prompt.
- Loop:
  a. Call self.provider.chat(&messages, &defs) to get an AssistantTurn.
  b. Match on turn.stop_reason:
     - StopReason::Stop – the LLM is done, return turn.text.
     - StopReason::ToolUse – for each tool call:
       - Find the matching tool by name.
       - Call it with the arguments.
       - Collect the result.
  c. Push the AssistantTurn as a Message::Assistant.
  d. Push each tool result as a Message::ToolResult.
  e. Continue the loop.
Think about the data flow carefully. After executing tools, you push both the assistant’s turn (so the LLM can see what it requested) and the tool results (so it can see what happened). This gives the LLM full context to decide what to do next.
Gathering tool definitions
At the start of run(), collect all tool definitions from the ToolSet:
#![allow(unused)]
fn main() {
let defs = self.tools.definitions();
}
The loop structure
This is single_turn() (from Chapter 3) wrapped in a loop. Instead of
handling just one round, we match on stop_reason inside a loop:
#![allow(unused)]
fn main() {
loop {
let turn = self.provider.chat(&messages, &defs).await?;
match turn.stop_reason {
StopReason::Stop => return Ok(turn.text.unwrap_or_default()),
StopReason::ToolUse => {
// Execute tool calls, collect results
// Push messages
}
}
}
}
Finding and calling tools
For each tool call, look it up by name in the ToolSet:
#![allow(unused)]
fn main() {
println!("{}", tool_summary(call));
let content = match self.tools.get(&call.name) {
Some(t) => t.call(call.arguments.clone()).await
.unwrap_or_else(|e| format!("error: {e}")),
None => format!("error: unknown tool `{}`", call.name),
};
}
The tool_summary() helper prints each tool call to the terminal – one line
per tool with its key argument, so you can watch what the agent does in real
time. For example: [bash: cat Cargo.toml] or [write: src/lib.rs].
Error handling
Tool errors are caught with .unwrap_or_else() and converted into a string
that gets sent back to the LLM as a tool result. This is the same pattern from
Chapter 3, and it is critical here because the agent loop runs multiple
iterations. If a tool error crashed the loop, the agent would die on the first
missing file or failed command. Instead, the LLM sees the error and can
recover – try a different path, adjust the command, or explain the problem.
> What's in README.md?
[read: README.md] <-- tool fails (file not found)
[read: Cargo.toml] <-- LLM recovers, tries another file
Here is the project info from Cargo.toml...
Unknown tools are handled the same way – an error string as the tool result, not a crash.
Pushing messages
After executing all tool calls for a turn, push the assistant message and the
tool results. You need to collect results first (because the turn is moved
into Message::Assistant):
#![allow(unused)]
fn main() {
let mut results = Vec::new();
for call in &turn.tool_calls {
// ... execute and collect (id, content) pairs
}
messages.push(Message::Assistant(turn));
for (id, content) in results {
messages.push(Message::ToolResult { id, content });
}
}
The order matters: assistant message first, then tool results. This matches the format that LLM APIs expect.
Running the tests
Run the Chapter 5 tests:
cargo test -p mini-claw-code-starter ch5
What the tests verify
- test_ch5_text_response: Provider returns text immediately (no tools). Agent should return that text.
- test_ch5_single_tool_call: Provider first requests a read tool call, then returns text. Agent should execute the tool and return the final text.
- test_ch5_unknown_tool: Provider requests a tool that does not exist. Agent should handle it gracefully (return an error string as the tool result) and continue to get the final text.
- test_ch5_multi_step_loop: Provider requests read twice across two turns, then returns text. Verifies the loop runs multiple iterations.
- test_ch5_empty_response: Provider returns None for text and no tool calls. Agent should return an empty string.
- test_ch5_builder_chain: Verifies that .tool().tool() chaining compiles – a compile-time check for the builder pattern.
- test_ch5_tool_error_propagates: Provider requests a read on a file that does not exist. The error should be caught and sent back as a tool result. The LLM then responds with text. Verifies the loop does not crash on tool failures.
There are also additional edge-case tests (three-step loops, multi-tool pipelines, etc.) that will pass once your core implementation is correct.
Seeing it all work
Once the tests pass, take a moment to appreciate what you have built. With
about 30 lines of code in run(), you have a working agent loop. Here is what
happens when a test runs agent.run("Read test.txt"):
- Messages: [User("Read test.txt")]
- Provider returns: tool call for read with {"path": "test.txt"}
- Agent calls ReadTool::call(), gets file contents
- Messages: [User("Read test.txt"), Assistant(tool_call), ToolResult("file content")]
- Provider returns: text response
- Agent returns the text
The mock provider makes this deterministic and testable. But the exact same
loop works with a real LLM provider – you just swap MockProvider for
OpenRouterProvider.
Recap
The agent loop is the core of the framework:
- Generics (<P: Provider>) let it work with any provider.
- ToolSet (a HashMap of Box<dyn Tool>) gives O(1) tool lookup by name.
- The builder pattern makes setup ergonomic.
- Error resilience – tool errors are caught and sent back to the LLM, not propagated. The loop never crashes from a tool failure.
- The loop is simple: call provider, match on stop_reason, execute tools, feed results back, repeat.
What’s next
Your agent works, but only with the mock provider. In
Chapter 6: The OpenRouter Provider you will implement
OpenRouterProvider, which talks to a real LLM API over HTTP. This is what
turns your agent from a testing harness into a real, usable tool.
Chapter 6: The OpenRouter Provider
Up to now, everything has run locally with the MockProvider. In this chapter
you will implement OpenRouterProvider – a provider that talks to a real LLM
over HTTP using the OpenAI-compatible chat completions API.
This is the chapter that makes your agent real.
Goal
Implement OpenRouterProvider so that:
- It can be created with an API key and model name.
- It converts our internal Message and ToolDefinition types to the API format.
- It sends HTTP POST requests to the chat completions endpoint.
- It parses responses back into AssistantTurn.
Key Rust concepts
Serde derives and attributes
The API types in openrouter.rs are already provided – you do not need to
modify them. But understanding them helps:
#![allow(unused)]
fn main() {
#[derive(Serialize, Deserialize, Clone, Debug)]
pub(crate) struct ApiToolCall {
pub(crate) id: String,
#[serde(rename = "type")]
pub(crate) type_: String,
pub(crate) function: ApiFunction,
}
}
Key serde attributes used:
- #[serde(rename = "type")] – The JSON field is called "type", but type is a reserved keyword in Rust. So the struct field is type_ and serde renames it during serialization/deserialization.
- #[serde(skip_serializing_if = "Option::is_none")] – Omits the field from JSON if the value is None. This is important because the API expects certain fields to be absent (not null) when unused.
- #[serde(skip_serializing_if = "Vec::is_empty")] – Same idea for empty vectors. If there are no tools, we omit the tools field entirely.
The reqwest HTTP client
reqwest is the standard HTTP client crate in Rust. The pattern:
#![allow(unused)]
fn main() {
let response: MyType = client
.post(url)
.bearer_auth(&api_key)
.json(&body) // serialize body as JSON
.send()
.await
.context("request failed")?
.error_for_status() // turn 4xx/5xx into errors
.context("API returned error status")?
.json() // deserialize response as JSON
.await
.context("failed to parse response")?;
}
Each method returns a builder or future that you chain together. The ?
operator propagates errors at each step.
impl Into<String>
Several methods use impl Into<String> as a parameter type:
#![allow(unused)]
fn main() {
pub fn new(api_key: impl Into<String>, model: impl Into<String>) -> Self
}
This accepts anything that can be converted into a String: String, &str,
Cow<str>, etc. Inside the method, call .into() to get the String:
#![allow(unused)]
fn main() {
api_key: api_key.into(),
model: model.into(),
}
dotenvy
The dotenvy crate loads environment variables from a .env file:
#![allow(unused)]
fn main() {
let _ = dotenvy::dotenv(); // loads .env if present, ignores errors
let key = std::env::var("OPENROUTER_API_KEY")?;
}
The let _ = discards the result because it is fine if .env does not exist
(the variable might already be in the environment).
The API types
The file mini-claw-code-starter/src/providers/openrouter.rs starts with a block
of serde structs. These represent the OpenAI-compatible chat completions API
format. Here is a quick summary:
Request types:
- ChatRequest – the POST body: model name, messages, tools
- ApiMessage – a single message with role, content, optional tool calls
- ApiTool / ApiToolDef – tool definition in API format
Response types:
- ChatResponse – the API response: a list of choices
- Choice – a single choice containing a message and a finish_reason
- ResponseMessage – the assistant’s response: optional content, optional tool calls
The finish_reason field on Choice tells you why the model stopped
generating. Map it to StopReason in your chat() implementation:
"tool_calls" becomes StopReason::ToolUse, anything else becomes
StopReason::Stop.
These are already complete. Your job is to implement the methods that use them.
The implementation
Step 1: Implement new()
Initialize all four fields:
#![allow(unused)]
fn main() {
pub fn new(api_key: impl Into<String>, model: impl Into<String>) -> Self {
Self {
client: reqwest::Client::new(),
api_key: api_key.into(),
model: model.into(),
base_url: "https://openrouter.ai/api/v1".into(),
}
}
}
Step 2: Implement base_url()
A simple builder method that overrides the base URL:
#![allow(unused)]
fn main() {
pub fn base_url(mut self, url: impl Into<String>) -> Self {
self.base_url = url.into();
self
}
}
Step 3: Implement from_env_with_model()
- Load .env with dotenvy::dotenv() (ignore the result).
- Read OPENROUTER_API_KEY from the environment.
- Call Self::new() with the key and model.
Use std::env::var("OPENROUTER_API_KEY") and chain .context(...) for a
clear error message if the key is missing.
Step 4: Implement from_env()
This is a one-liner that calls from_env_with_model with the default model
"openrouter/free". This is a free model on OpenRouter – no credits needed
to get started.
Step 5: Implement convert_messages()
This method translates our Message enum into the API’s ApiMessage format.
Iterate over the messages and match on each variant:
- Message::System(text) becomes an ApiMessage with role "system" and content: Some(text.clone()). The other fields are None.
- Message::User(text) becomes an ApiMessage with role "user" and content: Some(text.clone()). The other fields are None.
- Message::Assistant(turn) becomes an ApiMessage with role "assistant". Set content to turn.text.clone(). If turn.tool_calls is non-empty, convert each ToolCall to an ApiToolCall:

  ApiToolCall {
      id: c.id.clone(),
      type_: "function".into(),
      function: ApiFunction {
          name: c.name.clone(),
          arguments: c.arguments.to_string(), // Value -> String
      },
  }

  If tool_calls is empty, set tool_calls: None (not Some(vec![])).
- Message::ToolResult { id, content } becomes an ApiMessage with role "tool", content: Some(content.clone()), and tool_call_id: Some(id.clone()).
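A sketch of the whole match, assuming ApiMessage has fields named role, content, tool_calls, and tool_call_id as described above (check the struct in your starter for the exact names and types):

messages
    .iter()
    .map(|m| match m {
        Message::System(text) => ApiMessage {
            role: "system".into(),
            content: Some(text.clone()),
            tool_calls: None,
            tool_call_id: None,
        },
        Message::User(text) => ApiMessage {
            role: "user".into(),
            content: Some(text.clone()),
            tool_calls: None,
            tool_call_id: None,
        },
        Message::Assistant(turn) => ApiMessage {
            role: "assistant".into(),
            content: turn.text.clone(),
            // Only include tool_calls when the turn actually made some.
            tool_calls: if turn.tool_calls.is_empty() {
                None
            } else {
                Some(
                    turn.tool_calls
                        .iter()
                        .map(|c| ApiToolCall {
                            id: c.id.clone(),
                            type_: "function".into(),
                            function: ApiFunction {
                                name: c.name.clone(),
                                arguments: c.arguments.to_string(),
                            },
                        })
                        .collect(),
                )
            },
            tool_call_id: None,
        },
        Message::ToolResult { id, content } => ApiMessage {
            role: "tool".into(),
            content: Some(content.clone()),
            tool_calls: None,
            tool_call_id: Some(id.clone()),
        },
    })
    .collect()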
Step 6: Implement convert_tools()
Map each &ToolDefinition to an ApiTool:
#![allow(unused)]
fn main() {
ApiTool {
type_: "function",
function: ApiToolDef {
name: t.name,
description: t.description,
parameters: t.parameters.clone(),
},
}
}
Step 7: Implement chat()
This is the main method. It brings everything together:
- Build a ChatRequest with the model, converted messages, and converted tools.
- POST it to {base_url}/chat/completions with bearer auth.
- Parse the response as ChatResponse.
- Extract the first choice.
- Convert tool_calls back to our ToolCall type.
The tool call conversion is the trickiest part. The API returns
function.arguments as a string (JSON-encoded), but our ToolCall stores
it as a serde_json::Value. So you need to parse it:
#![allow(unused)]
fn main() {
let arguments = serde_json::from_str(&tc.function.arguments)
.unwrap_or(Value::Null);
}
The unwrap_or(Value::Null) handles the case where the arguments string is
not valid JSON (unlikely with a well-behaved API, but good to be safe).
Here is the skeleton for the chat() method:
#![allow(unused)]
fn main() {
async fn chat(
&self,
messages: &[Message],
tools: &[&ToolDefinition],
) -> anyhow::Result<AssistantTurn> {
let body = ChatRequest {
model: &self.model,
messages: Self::convert_messages(messages),
tools: Self::convert_tools(tools),
};
let response: ChatResponse = self.client
.post(format!("{}/chat/completions", self.base_url))
// ... bearer_auth, json, send, error_for_status, json ...
;
let choice = response.choices.into_iter().next()
.context("no choices in response")?;
// Convert choice.message.tool_calls to Vec<ToolCall>
// Map finish_reason to StopReason
// Return AssistantTurn { text, tool_calls, stop_reason }
todo!()
}
}
Fill in the HTTP call chain and the response conversion logic.
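For the finish_reason mapping, a single match is enough. A sketch (assuming finish_reason is an Option<String>; adjust if the field is a plain String in your starter):

let stop_reason = match choice.finish_reason.as_deref() {
    Some("tool_calls") => StopReason::ToolUse,
    _ => StopReason::Stop,
};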
Running the tests
Run the Chapter 6 tests:
cargo test -p mini-claw-code-starter ch6
The Chapter 6 tests verify the conversion methods (convert_messages and
convert_tools), the constructor logic, and the full chat() method using a
local mock HTTP server. They do not call a real LLM API, so no API key is
needed. There are also additional edge-case tests that will pass once your core
implementation is correct.
Optional: Live test
If you want to test with a real API, set up an OpenRouter API key:
- Sign up at openrouter.ai.
- Create an API key.
- Create a .env file in the workspace root:
OPENROUTER_API_KEY=sk-or-v1-your-key-here
Then try building and running the chat example from Chapter 7. But first, finish reading this chapter and move on to Chapter 7 where you wire everything up.
Recap
You have implemented a real HTTP provider that:
- Constructs from an API key and model name (or from environment variables).
- Converts between your internal types and the OpenAI-compatible API format.
- Sends HTTP requests and parses responses.
The key patterns:
- Serde attributes for JSON field mapping (rename, skip_serializing_if).
- reqwest for HTTP with a fluent builder API.
- impl Into<String> for flexible string parameters.
- dotenvy for loading .env files.
Your agent framework is now complete. Every piece – tools, the agent loop, and the HTTP provider – is implemented and tested.
What’s next
In Chapter 7: A Simple CLI you will wire everything into an interactive CLI with conversation memory.
Chapter 7: A Simple CLI
You have built every component: a mock provider for testing, four tools, the agent loop, and an HTTP provider. Now it is time to wire them all into a working CLI.
Goal
Add a chat() method to SimpleAgent and write examples/chat.rs so that:
- The agent remembers the conversation – each prompt builds on the previous ones.
- It prints >, reads a line, runs the agent, and prints the result.
- It shows a thinking... indicator while the agent works.
- It keeps running until the user presses Ctrl+D (EOF).
The chat() method
Open mini-claw-code-starter/src/agent.rs. Below run() you will see the chat()
method signature.
Why a new method?
run() creates a fresh Vec<Message> each time it is called. That means the
LLM has no memory of previous exchanges. A real CLI should carry context
forward, so the LLM can say “I already read that file” or “as I mentioned
earlier.”
chat() solves this by accepting the message history from the caller:
#![allow(unused)]
fn main() {
pub async fn chat(&self, messages: &mut Vec<Message>) -> anyhow::Result<String>
}
The caller pushes Message::User(…) before calling, and chat() appends the
assistant turns. When it returns, messages contains the full conversation
history ready for the next round.
The implementation
The loop body is identical to run(). The only differences are:
- Use the provided messages instead of creating a new vec.
- On StopReason::Stop, clone the text before pushing Message::Assistant(turn) – the push moves turn, so you need the text first.
- Push Message::Assistant(turn) so the history includes the final response.
- Return the cloned text.
#![allow(unused)]
fn main() {
pub async fn chat(&self, messages: &mut Vec<Message>) -> anyhow::Result<String> {
let defs = self.tools.definitions();
loop {
let turn = self.provider.chat(messages, &defs).await?;
match turn.stop_reason {
StopReason::Stop => {
let text = turn.text.clone().unwrap_or_default();
messages.push(Message::Assistant(turn));
return Ok(text);
}
StopReason::ToolUse => {
// Same tool execution as run() ...
}
}
}
}
}
The ToolUse branch is exactly the same as in run(): execute each tool,
collect results, push the assistant turn, push the tool results.
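If you want a reference point, the branch might look roughly like this. It assumes your ToolSet exposes a lookup by name (shown as a hypothetical get()) and that tool errors are folded into the result string – mirror whatever your run() implementation already does:
StopReason::ToolUse => {
    // Run every requested tool and collect (id, output) pairs.
    let mut results = Vec::new();
    for call in &turn.tool_calls {
        let output = match self.tools.get(&call.name) {
            Some(tool) => tool
                .call(call.arguments.clone())
                .await
                .unwrap_or_else(|e| format!("error: {e}")),
            None => format!("error: unknown tool '{}'", call.name),
        };
        results.push((call.id.clone(), output));
    }
    // Record the assistant turn, then feed each tool result back.
    messages.push(Message::Assistant(turn));
    for (id, content) in results {
        messages.push(Message::ToolResult { id, content });
    }
}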
Ownership detail
In run() you could do return Ok(turn.text.unwrap_or_default()) directly
because the function was done with turn. In chat() you also need to push
Message::Assistant(turn) into the history. Since that push moves turn, you
must extract the text first:
#![allow(unused)]
fn main() {
let text = turn.text.clone().unwrap_or_default();
messages.push(Message::Assistant(turn)); // moves turn
return Ok(text); // return the clone
}
This is a one-line change from run(), but it matters.
The CLI
Open mini-claw-code-starter/examples/chat.rs. You will see a skeleton with
unimplemented!(). Replace it with the full program.
Step 1: Imports
#![allow(unused)]
fn main() {
use mini_claw_code_starter::{
BashTool, EditTool, Message, OpenRouterProvider, ReadTool, SimpleAgent, WriteTool,
};
use std::io::{self, BufRead, Write};
}
Note the Message import – you need it to build the history vector.
Step 2: Create the provider and agent
#![allow(unused)]
fn main() {
let provider = OpenRouterProvider::from_env()?;
let agent = SimpleAgent::new(provider)
.tool(BashTool::new())
.tool(ReadTool::new())
.tool(WriteTool::new())
.tool(EditTool::new());
}
Same as before – nothing new here. (In Chapter 11
you’ll add AskTool here so the agent can ask you clarifying questions.)
Step 3: The system prompt and history vector
#![allow(unused)]
fn main() {
let cwd = std::env::current_dir()?.display().to_string();
let mut history: Vec<Message> = vec![Message::System(format!(
"You are a coding agent. Help the user with software engineering tasks \
using all available tools. Be concise and precise.\n\n\
Working directory: {cwd}"
))];
}
The system prompt is the first message in the history. It tells the LLM what role it should play. Two things to note:
- No tool names in the prompt. Tool definitions are sent separately to the API. The system prompt focuses on behavior – be a coding agent, use whatever tools are available, be concise.
- Working directory is included. The LLM needs to know where it is so that tool calls like read and bash use correct paths. This is what real coding agents do – Claude Code, OpenCode, and Kimi CLI all inject the current directory (and sometimes platform, date, etc.) into their system prompts.
The history vector lives outside the loop and accumulates every user prompt, assistant response, and tool result across the entire session. The system prompt stays at the front, giving the LLM consistent instructions on every turn.
Step 4: The REPL loop
#![allow(unused)]
fn main() {
let stdin = io::stdin();
loop {
print!("> ");
io::stdout().flush()?;
let mut line = String::new();
if stdin.lock().read_line(&mut line)? == 0 {
println!();
break;
}
let prompt = line.trim();
if prompt.is_empty() {
continue;
}
history.push(Message::User(prompt.to_string()));
print!(" thinking...");
io::stdout().flush()?;
match agent.chat(&mut history).await {
Ok(text) => {
print!("\x1b[2K\r");
println!("{}\n", text.trim());
}
Err(e) => {
print!("\x1b[2K\r");
println!("error: {e}\n");
}
}
}
}
A few things to note:
- history.push(Message::User(…)) adds the prompt before calling the agent. chat() will append the rest.
- print!(" thinking...") shows a status while the agent works. The flush() is needed because print! (no newline) does not flush automatically.
- \x1b[2K\r is an ANSI escape sequence: "erase entire line, move cursor to column 1." This clears the thinking... text before printing the response. It also gets cleared automatically when the agent prints a tool summary (since tool_summary() uses the same escape).
- stdout.flush()? after print! ensures the prompt and thinking indicator appear immediately.
- read_line returns 0 on EOF (Ctrl+D), which breaks the loop.
- Errors from the agent are printed instead of crashing – this keeps the loop alive even if one request fails.
The main function
Wrap everything in an async main:
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Steps 1-4 go here
Ok(())
}
The complete program
Putting it all together, the entire program is about 45 lines. That is the beauty of the framework you built – the final assembly is straightforward because each component has a clean interface.
Running the full test suite
Run the full test suite:
cargo test -p mini-claw-code-starter
This runs all tests from chapters 1 through 7. If everything passes, congratulations – your agent framework is complete and fully tested.
What the tests verify
The Chapter 7 tests are integration tests that combine all components:
- Write-then-read flows: Write a file, read it back, verify contents.
- Edit flows: Write a file, edit it, read back the result.
- Multi-tool pipelines: Use bash, write, edit, and read across multiple turns.
- Long conversations: Five-step tool-call sequences.
There are about 10 integration tests that exercise the full agent pipeline.
Running the chat example
To try it with a real LLM, you need an API key. Create a .env file in the
workspace root:
OPENROUTER_API_KEY=sk-or-v1-your-key-here
Then run:
cargo run -p mini-claw-code-starter --example chat
You will get an interactive prompt. Try a multi-turn conversation:
> List the files in the current directory
thinking...
[bash: ls]
Cargo.toml src/ examples/ ...
> What is in Cargo.toml?
thinking...
[read: Cargo.toml]
The Cargo.toml contains the package definition for mini-claw-code-starter...
> Add a new dependency for serde
thinking...
[read: Cargo.toml]
[edit: Cargo.toml]
Done! I added serde to the dependencies.
>
Notice how the second prompt (“What is in Cargo.toml?”) works without repeating context – the LLM already knows the directory listing from the first exchange. That is conversation history at work.
Press Ctrl+D (or Ctrl+C) to exit.
What you have built
Let’s step back and look at the complete picture:
examples/chat.rs
|
| creates
v
SimpleAgent<OpenRouterProvider>
|
| holds
+---> OpenRouterProvider (HTTP to LLM API)
+---> ToolSet (HashMap<String, Box<dyn Tool>>)
|
+---> BashTool
+---> ReadTool
+---> WriteTool
+---> EditTool
The chat() method drives the interaction:
User prompt
|
v
history: [User, Assistant, ToolResult, ..., User]
|
v
Provider.chat() ---HTTP---> LLM API
|
| AssistantTurn
v
Tool calls? ----yes---> Execute tools ---> append to history ---> loop
|
no
|
v
Append final Assistant to history, return text
In about 300 lines of Rust across all files, you have:
- A trait-based tool system with JSON schema definitions.
- A generic agent loop that works with any provider.
- A mock provider for deterministic testing.
- An HTTP provider for real LLM APIs.
- A CLI with conversation memory that ties it all together.
Where to go from here
This framework is intentionally minimal. Here are ideas for extending it:
Streaming responses – Instead of waiting for the full response, stream
tokens as they arrive. This means changing chat() to return a Stream
instead of a single AssistantTurn.
Token limits – Track token usage and truncate old messages when the context window fills up.
More tools – Add a web search tool, a database query tool, or anything
else you can imagine. The Tool trait makes it easy to plug in new
capabilities.
A richer UI – Add a spinner animation, markdown rendering, or collapsed
tool call display. See mini-claw-code/examples/tui.rs for an example that does
all three using termimad.
The foundation you built is solid. Every extension is a matter of adding to the
existing patterns, not rewriting them. The Provider trait, the Tool trait,
and the agent loop are the building blocks for anything you want to build next.
What’s next
Head to Chapter 8: The Singularity – your agent can now modify its own source code, and we will talk about what that means and where to go from here.
Chapter 8: The Singularity
Your agent can now edit its own source code and start self-evolving. From this point on, you no longer have to write the code yourself.
Extensions
The extension chapters that follow walk through the reference implementation. You don’t need to write the code yourself – read them to understand the design, then let your agent implement them (or do it yourself for practice):
- Chapter 9: A Better TUI – Markdown rendering, spinners, collapsed tool calls.
- Chapter 10: Streaming – Stream tokens as they arrive with StreamingAgent.
- Chapter 11: User Input – Let the LLM ask you clarifying questions.
- Chapter 12: Plan Mode – Read-only planning with approval gating.
Beyond the extension chapters, here are more ideas to explore:
- Parallel tool calls – Execute concurrent tool calls with tokio::join!.
- Token tracking – Truncate old messages when approaching the context limit.
- More tools – Web search, database queries, HTTP requests. The Tool trait makes it easy.
- MCP – Expose your tools as an MCP server or connect to external ones.
Chapter 9: A Better TUI
The chat.rs CLI works, but it dumps plain text and shows every tool call. A
real coding agent deserves markdown rendering, a thinking spinner, and
collapsed tool calls when the agent gets busy.
See mini-claw-code/examples/tui.rs for a reference implementation. It uses:
- termimad for inline markdown rendering in the terminal.
- crossterm for raw terminal mode (used by the arrow-key selection UI in Chapter 11).
- An animated spinner (⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏) that ticks while the agent thinks.
- Collapsed tool calls: after 3 tool calls, subsequent ones are collapsed into a "... and N more" counter to keep the output clean.
The TUI builds on the AgentEvent stream from StreamingAgent (Chapter 10).
The event loop uses tokio::select! to multiplex three sources:
- Agent events (AgentEvent::TextDelta, ToolCall, Done, Error) – render streaming text, tool summaries, or final output.
- User input requests from AskTool (Chapter 11) – pause the spinner and show a text prompt or arrow-key selection list.
- Timer ticks – advance the spinner animation.
This chapter is exposition only – no code to write. Read through
examples/tui.rs to see how the pieces fit together, or ask your mini-claw-code
agent to build a TUI for you.
Chapter 10: Streaming
In Chapter 6 you built OpenRouterProvider::chat(), which waits for the
entire response before returning. That works, but the user stares at a blank
screen until every token has been generated. Real coding agents print tokens as
they arrive – that is streaming.
This chapter adds streaming support and a StreamingAgent – the streaming
counterpart to SimpleAgent. You will:
- Define a StreamEvent enum that represents real-time deltas.
- Build a StreamAccumulator that collects deltas into a complete AssistantTurn.
- Write a parse_sse_line() function that converts raw Server-Sent Events into StreamEvents.
- Define a StreamProvider trait – the streaming counterpart to Provider.
- Implement StreamProvider for OpenRouterProvider.
- Build a MockStreamProvider for testing without HTTP.
- Build StreamingAgent<P: StreamProvider> – a full agent loop with real-time text streaming.
None of this touches the Provider trait or SimpleAgent. Streaming is
layered on top of the existing architecture.
Why streaming?
Without streaming, a long response (say 500 tokens) makes the CLI feel frozen. Streaming fixes three things:
- Immediate feedback – the user sees the first word within milliseconds instead of waiting seconds for the full response.
- Early cancellation – if the agent is heading in the wrong direction, the user can Ctrl-C without waiting for the full response.
- Progress visibility – watching tokens arrive confirms the agent is working, not stuck.
How SSE works
The OpenAI-compatible API supports streaming via
Server-Sent Events (SSE).
You set "stream": true in the request, and instead of one big JSON response,
the server sends a series of text lines:
data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"choices":[{"delta":{"content":" world"},"finish_reason":null}]}
data: {"choices":[{"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Each line starts with data: followed by a JSON object (or the sentinel
[DONE]). The key difference from the non-streaming response: instead of a
message field with the complete text, each chunk has a delta field with
just the new part. Your code reads these deltas one by one, prints them
immediately, and accumulates them into the final result.
Here is the flow:
sequenceDiagram
participant A as Agent
participant L as LLM (SSE)
participant U as User
A->>L: POST /chat/completions (stream: true)
L-->>A: data: {"delta":{"content":"Hello"}}
A->>U: print "Hello"
L-->>A: data: {"delta":{"content":" world"}}
A->>U: print " world"
L-->>A: data: [DONE]
A->>U: (done)
Tool calls stream the same way, but with tool_calls deltas instead of
content deltas. The tool call’s name and arguments arrive in pieces that you
concatenate.
StreamEvent
Open mini-claw-code/src/streaming.rs. The StreamEvent enum is our domain type
for streaming deltas:
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
/// A chunk of assistant text.
TextDelta(String),
/// A new tool call has started.
ToolCallStart { index: usize, id: String, name: String },
/// More argument JSON for a tool call in progress.
ToolCallDelta { index: usize, arguments: String },
/// The stream is complete.
Done,
}
}
This is the interface between the SSE parser and the rest of the application.
The parser produces StreamEvents; the UI consumes them for display; the
accumulator collects them into an AssistantTurn.
StreamAccumulator
The accumulator is a simple state machine. It keeps a running text buffer
and a list of partial tool calls. Each feed() call appends to the
appropriate place:
#![allow(unused)]
fn main() {
pub struct StreamAccumulator {
text: String,
tool_calls: Vec<PartialToolCall>,
}
impl StreamAccumulator {
pub fn new() -> Self { /* ... */ }
pub fn feed(&mut self, event: &StreamEvent) { /* ... */ }
pub fn finish(self) -> AssistantTurn { /* ... */ }
}
}
The implementation is straightforward:
- TextDelta → append to self.text.
- ToolCallStart → grow the tool_calls vec if needed, set the id and name at the given index.
- ToolCallDelta → append to the arguments string at the given index.
- Done → no-op (we handle completion in finish()).
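Here is a minimal sketch of feed(), assuming PartialToolCall is a small struct with id, name, and arguments String fields that derives Default:
pub fn feed(&mut self, event: &StreamEvent) {
    match event {
        StreamEvent::TextDelta(text) => self.text.push_str(text),
        StreamEvent::ToolCallStart { index, id, name } => {
            // Make sure the vec is long enough, then record the call's identity.
            if self.tool_calls.len() <= *index {
                self.tool_calls.resize_with(*index + 1, PartialToolCall::default);
            }
            self.tool_calls[*index].id = id.clone();
            self.tool_calls[*index].name = name.clone();
        }
        StreamEvent::ToolCallDelta { index, arguments } => {
            if self.tool_calls.len() <= *index {
                self.tool_calls.resize_with(*index + 1, PartialToolCall::default);
            }
            self.tool_calls[*index].arguments.push_str(arguments);
        }
        // Completion is handled in finish(), so Done is a no-op here.
        StreamEvent::Done => {}
    }
}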
finish() consumes the accumulator and builds an AssistantTurn:
#![allow(unused)]
fn main() {
pub fn finish(self) -> AssistantTurn {
let text = if self.text.is_empty() { None } else { Some(self.text) };
let tool_calls: Vec<ToolCall> = self.tool_calls
.into_iter()
.filter(|tc| !tc.name.is_empty())
.map(|tc| ToolCall {
id: tc.id,
name: tc.name,
arguments: serde_json::from_str(&tc.arguments)
.unwrap_or(Value::Null),
})
.collect();
let stop_reason = if tool_calls.is_empty() {
StopReason::Stop
} else {
StopReason::ToolUse
};
AssistantTurn { text, tool_calls, stop_reason }
}
}
Notice that arguments is accumulated as a raw string and only parsed as JSON
at the very end. This is because the API sends argument fragments like
{"pa and th": "f.txt"} – they are not valid JSON until concatenated.
Parsing SSE lines
The parse_sse_line() function takes a single line from the SSE stream and
returns zero or more StreamEvents:
#![allow(unused)]
fn main() {
pub fn parse_sse_line(line: &str) -> Option<Vec<StreamEvent>> {
let data = line.strip_prefix("data: ")?;
if data == "[DONE]" {
return Some(vec![StreamEvent::Done]);
}
let chunk: ChunkResponse = serde_json::from_str(data).ok()?;
// ... extract events from chunk.choices[0].delta
}
}
The SSE chunk types mirror the OpenAI delta format:
#![allow(unused)]
fn main() {
#[derive(Deserialize)]
struct ChunkResponse { choices: Vec<ChunkChoice> }
#[derive(Deserialize)]
struct ChunkChoice { delta: Delta, finish_reason: Option<String> }
#[derive(Deserialize)]
struct Delta {
content: Option<String>,
tool_calls: Option<Vec<DeltaToolCall>>,
}
}
For tool calls, the first chunk includes id and function.name (indicating
a new tool call). Subsequent chunks only have function.arguments fragments.
The parser emits ToolCallStart when id is present, and ToolCallDelta for
non-empty argument strings.
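One way to finish the function, replacing the // ... extract events comment above. It assumes DeltaToolCall carries an index plus an optional id and optional function.name / function.arguments – a hypothetical shape; adjust to your chunk structs:
let mut events = Vec::new();
if let Some(choice) = chunk.choices.into_iter().next() {
    if let Some(text) = choice.delta.content {
        if !text.is_empty() {
            events.push(StreamEvent::TextDelta(text));
        }
    }
    for tc in choice.delta.tool_calls.unwrap_or_default() {
        // A present id marks the start of a new tool call.
        if let Some(id) = tc.id {
            events.push(StreamEvent::ToolCallStart {
                index: tc.index,
                id,
                name: tc.function.name.clone().unwrap_or_default(),
            });
        }
        if let Some(args) = tc.function.arguments {
            if !args.is_empty() {
                events.push(StreamEvent::ToolCallDelta { index: tc.index, arguments: args });
            }
        }
    }
}
Some(events)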
StreamProvider trait
Just as Provider defines the non-streaming interface, StreamProvider
defines the streaming one:
#![allow(unused)]
fn main() {
pub trait StreamProvider: Send + Sync {
fn stream_chat<'a>(
&'a self,
messages: &'a [Message],
tools: &'a [&'a ToolDefinition],
tx: mpsc::UnboundedSender<StreamEvent>,
) -> impl Future<Output = anyhow::Result<AssistantTurn>> + Send + 'a;
}
}
The key difference from Provider::chat() is the tx parameter – an mpsc
channel sender. The implementation sends StreamEvents through this channel
as they arrive and returns the final accumulated AssistantTurn. This gives
callers both real-time events and the complete result.
We keep StreamProvider separate from Provider rather than adding a method
to the existing trait. This means SimpleAgent and all existing code are
completely unaffected.
Implementing StreamProvider for OpenRouterProvider
The implementation ties together SSE parsing, the accumulator, and the channel:
#![allow(unused)]
fn main() {
impl StreamProvider for OpenRouterProvider {
async fn stream_chat(
&self,
messages: &[Message],
tools: &[&ToolDefinition],
tx: mpsc::UnboundedSender<StreamEvent>,
) -> anyhow::Result<AssistantTurn> {
// 1. Build request with stream: true
// 2. Send HTTP request
// 3. Read response chunks in a loop:
// - Buffer incoming bytes
// - Split on newlines
// - parse_sse_line() each complete line
// - feed() each event into the accumulator
// - send each event through tx
// 4. Return acc.finish()
}
}
}
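Steps 1 and 2 of that outline might look like this, assuming a request struct that mirrors ChatRequest plus a stream flag (called StreamChatRequest here for illustration) and an api_key field on the provider:
let body = StreamChatRequest {
    model: &self.model,
    messages: Self::convert_messages(messages),
    tools: Self::convert_tools(tools),
    stream: true, // ask the API for SSE chunks instead of one JSON body
};
let mut resp = self.client
    .post(format!("{}/chat/completions", self.base_url))
    .bearer_auth(&self.api_key)
    .json(&body)
    .send()
    .await?
    .error_for_status()?;
let mut acc = StreamAccumulator::new();
// Step 3 reads resp.chunk() in a loop and feeds the accumulator, as shown next.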
The buffering detail is important. HTTP responses may arrive in arbitrary byte
chunks that do not align with SSE line boundaries. So we maintain a String
buffer, append each chunk, and process only complete lines (splitting on \n):
#![allow(unused)]
fn main() {
let mut buffer = String::new();
while let Some(chunk) = resp.chunk().await? {
buffer.push_str(&String::from_utf8_lossy(&chunk));
while let Some(newline_pos) = buffer.find('\n') {
let line = buffer[..newline_pos].trim_end_matches('\r').to_string();
buffer = buffer[newline_pos + 1..].to_string();
if line.is_empty() { continue; }
if let Some(events) = parse_sse_line(&line) {
for event in events {
acc.feed(&event);
let _ = tx.send(event);
}
}
}
}
}
MockStreamProvider
For testing, we need a streaming provider that does not make HTTP calls.
MockStreamProvider wraps the existing MockProvider and synthesizes
StreamEvents from each canned AssistantTurn:
#![allow(unused)]
fn main() {
pub struct MockStreamProvider {
inner: MockProvider,
}
impl StreamProvider for MockStreamProvider {
async fn stream_chat(
&self,
messages: &[Message],
tools: &[&ToolDefinition],
tx: mpsc::UnboundedSender<StreamEvent>,
) -> anyhow::Result<AssistantTurn> {
let turn = self.inner.chat(messages, tools).await?;
// Synthesize stream events from the complete turn
if let Some(ref text) = turn.text {
for ch in text.chars() {
let _ = tx.send(StreamEvent::TextDelta(ch.to_string()));
}
}
for (i, call) in turn.tool_calls.iter().enumerate() {
let _ = tx.send(StreamEvent::ToolCallStart {
index: i, id: call.id.clone(), name: call.name.clone(),
});
let _ = tx.send(StreamEvent::ToolCallDelta {
index: i, arguments: call.arguments.to_string(),
});
}
let _ = tx.send(StreamEvent::Done);
Ok(turn)
}
}
}
It sends text one character at a time (simulating token-by-token streaming)
and each tool call as a start + delta pair. This lets us test StreamingAgent
without any network calls.
StreamingAgent
Now for the main event. StreamingAgent is the streaming counterpart to
SimpleAgent. It has the same structure – a provider, a tool set, and an
agent loop – but it uses StreamProvider and emits AgentEvent::TextDelta
events in real time:
#![allow(unused)]
fn main() {
pub struct StreamingAgent<P: StreamProvider> {
provider: P,
tools: ToolSet,
}
impl<P: StreamProvider> StreamingAgent<P> {
pub fn new(provider: P) -> Self { /* ... */ }
pub fn tool(mut self, t: impl Tool + 'static) -> Self { /* ... */ }
pub async fn run(
&self,
prompt: &str,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> { /* ... */ }
pub async fn chat(
&self,
messages: &mut Vec<Message>,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> { /* ... */ }
}
}
The chat() method is the heart of the streaming agent. Let us walk through
it:
#![allow(unused)]
fn main() {
pub async fn chat(
&self,
messages: &mut Vec<Message>,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> {
let defs = self.tools.definitions();
loop {
// 1. Set up a stream channel
let (stream_tx, mut stream_rx) = mpsc::unbounded_channel();
// 2. Spawn a forwarder that converts StreamEvent::TextDelta
// into AgentEvent::TextDelta for the UI
let events_clone = events.clone();
let forwarder = tokio::spawn(async move {
while let Some(event) = stream_rx.recv().await {
if let StreamEvent::TextDelta(text) = event {
let _ = events_clone.send(AgentEvent::TextDelta(text));
}
}
});
// 3. Call stream_chat — this streams AND returns the turn
let turn = self.provider.stream_chat(messages, &defs, stream_tx).await?;
let _ = forwarder.await;
// 4. Same stop_reason logic as SimpleAgent
match turn.stop_reason {
StopReason::Stop => {
let text = turn.text.clone().unwrap_or_default();
let _ = events.send(AgentEvent::Done(text.clone()));
messages.push(Message::Assistant(turn));
return Ok(text);
}
StopReason::ToolUse => {
// Execute tools, push results, continue loop
// (same pattern as SimpleAgent)
}
}
}
}
}
The architecture has two channels flowing simultaneously:
flowchart LR
SC["stream_chat()"] -- "StreamEvent" --> CH["mpsc channel"]
CH --> FW["forwarder task"]
FW -- "AgentEvent::TextDelta" --> UI["UI / events channel"]
SC -- "feeds" --> ACC["StreamAccumulator"]
ACC -- "finish()" --> TURN["AssistantTurn"]
TURN --> LOOP["Agent loop"]
The forwarder task is a bridge: it receives raw StreamEvents from the
provider and converts TextDelta events into AgentEvent::TextDelta for the
UI. This keeps the provider’s streaming protocol separate from the agent’s
event protocol.
Notice that AgentEvent now has a TextDelta variant:
#![allow(unused)]
fn main() {
pub enum AgentEvent {
TextDelta(String), // NEW — streaming text chunks
ToolCall { name: String, summary: String },
Done(String),
Error(String),
}
}
Using StreamingAgent in the TUI
The TUI example (examples/tui.rs) uses StreamingAgent for the full
experience:
#![allow(unused)]
fn main() {
let provider = OpenRouterProvider::from_env()?;
let agent = Arc::new(
StreamingAgent::new(provider)
.tool(BashTool::new())
.tool(ReadTool::new())
.tool(WriteTool::new())
.tool(EditTool::new()),
);
}
The agent is wrapped in Arc so it can be shared with spawned tasks. Each
turn spawns the agent and processes events with a spinner:
#![allow(unused)]
fn main() {
let (tx, mut rx) = mpsc::unbounded_channel();
let agent = agent.clone();
let mut msgs = std::mem::take(&mut history);
let handle = tokio::spawn(async move {
let _ = agent.chat(&mut msgs, tx).await;
msgs
});
// UI event loop — print TextDeltas, show spinner for tool calls
loop {
tokio::select! {
event = rx.recv() => {
match event {
Some(AgentEvent::TextDelta(text)) => print!("{text}"),
Some(AgentEvent::ToolCall { summary, .. }) => { /* spinner */ },
Some(AgentEvent::Done(_)) => break,
// ...
}
}
_ = tick.tick() => { /* animate spinner */ }
}
}
}
Compare this to the SimpleAgent version from Chapter 9: the structure is
almost identical. The only difference is that TextDelta events let us print
tokens as they arrive instead of waiting for the full Done event.
Running the tests
cargo test -p mini-claw-code ch10
The tests verify:
- Accumulator: text assembly, tool call assembly, mixed events, empty input, multiple parallel tool calls.
- SSE parsing: text deltas, tool call start/delta, [DONE], non-data lines, empty deltas, invalid JSON, full multi-line sequences.
- MockStreamProvider: text responses synthesize char-by-char events; tool call responses synthesize start + delta events.
- StreamingAgent: text-only responses, tool call loops, and multi-turn chat history – all using MockStreamProvider for deterministic testing.
- Integration: mock TCP servers that send real SSE responses to stream_chat() and verify both the returned AssistantTurn and the events sent through the channel.
Recap
- StreamEvent represents real-time deltas: text chunks, tool call starts, argument fragments, and completion.
- StreamAccumulator collects deltas into a complete AssistantTurn.
- parse_sse_line() converts raw SSE data: lines into StreamEvents.
- StreamProvider is the streaming counterpart to Provider – it adds an mpsc channel parameter for real-time events.
- MockStreamProvider wraps MockProvider to synthesize streaming events for testing.
- StreamingAgent is the streaming counterpart to SimpleAgent – same tool loop, but with real-time TextDelta events forwarded to the UI.
- The Provider trait and SimpleAgent are unchanged. Streaming is an additive feature layered on top.
Chapter 11: User Input
Your agent can read files, run commands, and write code – but it can’t ask you a question. If it’s unsure which approach to take, which file to target, or whether to proceed with a destructive operation, it just guesses.
Real coding agents solve this with an ask tool. Claude Code has
AskUserQuestion, Kimi CLI has approval prompts. The LLM calls a special tool,
the agent pauses, and the user types an answer. The answer goes back as a tool
result and execution continues.
In this chapter you’ll build:
- An InputHandler trait that abstracts how user input is collected.
- An AskTool that the LLM calls to ask the user a question.
- Three handler implementations: CLI, channel-based (for TUI), and mock (for tests).
Why a trait?
Different UIs collect input differently:
- A CLI app prints to stdout and reads from stdin.
- A TUI app sends a request through a channel and waits for the event loop to collect the answer (maybe with arrow-key selection).
- Tests need to provide canned answers without any I/O.
The InputHandler trait lets AskTool work with all three without knowing
which one it’s using:
#![allow(unused)]
fn main() {
#[async_trait::async_trait]
pub trait InputHandler: Send + Sync {
async fn ask(&self, question: &str, options: &[String]) -> anyhow::Result<String>;
}
}
The question is what the LLM wants to ask. The options slice is an optional
list of choices – if empty, the user types free-text. If non-empty, the UI can
present a selection list.
AskTool
AskTool implements the Tool trait. It takes an Arc<dyn InputHandler> so
the handler can be shared across threads:
#![allow(unused)]
fn main() {
pub struct AskTool {
definition: ToolDefinition,
handler: Arc<dyn InputHandler>,
}
}
Tool definition
The LLM needs to know what parameters the tool accepts. question is required
(a string). options is optional (an array of strings).
For options, we need a JSON schema for an array type – something param()
can’t express since it only handles scalar types. So first, add param_raw()
to ToolDefinition:
#![allow(unused)]
fn main() {
/// Add a parameter with a raw JSON schema value.
///
/// Use this for complex types (arrays, nested objects) that `param()` can't express.
pub fn param_raw(mut self, name: &str, schema: Value, required: bool) -> Self {
self.parameters["properties"][name] = schema;
if required {
self.parameters["required"]
.as_array_mut()
.unwrap()
.push(serde_json::Value::String(name.to_string()));
}
self
}
}
Now the tool definition uses both param() and param_raw():
#![allow(unused)]
fn main() {
impl AskTool {
pub fn new(handler: Arc<dyn InputHandler>) -> Self {
Self {
definition: ToolDefinition::new(
"ask_user",
"Ask the user a clarifying question...",
)
.param("question", "string", "The question to ask the user", true)
.param_raw(
"options",
json!({
"type": "array",
"items": { "type": "string" },
"description": "Optional list of choices to present to the user"
}),
false,
),
handler,
}
}
}
}
Tool::call
The call implementation extracts question, parses options with a helper,
and delegates to the handler:
#![allow(unused)]
fn main() {
#[async_trait::async_trait]
impl Tool for AskTool {
fn definition(&self) -> &ToolDefinition {
&self.definition
}
async fn call(&self, args: Value) -> anyhow::Result<String> {
let question = args
.get("question")
.and_then(|v| v.as_str())
.ok_or_else(|| anyhow::anyhow!("missing required parameter: question"))?;
let options = parse_options(&args);
self.handler.ask(question, &options).await
}
}
/// Extract the optional `options` array from tool arguments.
fn parse_options(args: &Value) -> Vec<String> {
args.get("options")
.and_then(|v| v.as_array())
.map(|arr| {
arr.iter()
.filter_map(|v| v.as_str().map(String::from))
.collect()
})
.unwrap_or_default()
}
}
The parse_options helper keeps call() focused on the happy path. If
options is missing or not an array, it defaults to an empty vec – the
handler treats this as free-text input.
Three handlers
CliInputHandler
The simplest handler. Prints the question, lists numbered choices (if any), reads a line from stdin, and resolves numbered answers:
#![allow(unused)]
fn main() {
pub struct CliInputHandler;
#[async_trait::async_trait]
impl InputHandler for CliInputHandler {
async fn ask(&self, question: &str, options: &[String]) -> anyhow::Result<String> {
let question = question.to_string();
let options = options.to_vec();
// spawn_blocking because stdin is synchronous
tokio::task::spawn_blocking(move || {
// Display the question and numbered choices (if any)
println!("\n {question}");
for (i, opt) in options.iter().enumerate() {
println!(" {}) {opt}", i + 1);
}
// Read the answer
print!(" > ");
io::stdout().flush()?;
let mut line = String::new();
io::stdin().lock().read_line(&mut line)?;
let answer = line.trim().to_string();
// If the user typed a valid option number, resolve it
Ok(resolve_option(&answer, &options))
}).await?
}
}
/// If `answer` is a number matching one of the options, return that option.
/// Otherwise return the raw answer.
fn resolve_option(answer: &str, options: &[String]) -> String {
if let Ok(n) = answer.parse::<usize>()
&& n >= 1
&& n <= options.len()
{
return options[n - 1].clone();
}
answer.to_string()
}
}
The resolve_option helper keeps the closure body clean. It uses let-chain
syntax (stabilized in Rust 1.88 / edition 2024): multiple conditions joined
with && including let Ok(n) = ... pattern bindings. If the user types "2"
and there are three options, it resolves to options[1]. Otherwise the raw text
is returned.
Note the for loop over options does nothing when the slice is empty – no
special if branch needed.
Use this in simple CLI apps like examples/chat.rs:
#![allow(unused)]
fn main() {
let agent = SimpleAgent::new(provider)
.tool(BashTool::new())
.tool(ReadTool::new())
.tool(WriteTool::new())
.tool(EditTool::new())
.tool(AskTool::new(Arc::new(CliInputHandler)));
}
ChannelInputHandler
For TUI apps, input collection happens in the event loop, not in the tool. The
ChannelInputHandler bridges the gap with a channel:
#![allow(unused)]
fn main() {
pub struct UserInputRequest {
pub question: String,
pub options: Vec<String>,
pub response_tx: oneshot::Sender<String>,
}
pub struct ChannelInputHandler {
tx: mpsc::UnboundedSender<UserInputRequest>,
}
}
When ask() is called, it sends a UserInputRequest through the channel and
awaits the oneshot response:
#![allow(unused)]
fn main() {
#[async_trait::async_trait]
impl InputHandler for ChannelInputHandler {
async fn ask(&self, question: &str, options: &[String]) -> anyhow::Result<String> {
let (response_tx, response_rx) = oneshot::channel();
self.tx.send(UserInputRequest {
question: question.to_string(),
options: options.to_vec(),
response_tx,
})?;
Ok(response_rx.await?)
}
}
}
The TUI event loop receives the request and renders it however it likes –
a simple text prompt, or an arrow-key-navigable selection list using
crossterm in raw terminal mode.
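On the receiving side, a hypothetical slice of the TUI event loop might handle the request like this (input_rx and read_answer are placeholder names, not part of the library):
while let Some(req) = input_rx.recv().await {
    // Pause the spinner, show the question and any numbered options.
    println!("\n {}", req.question);
    for (i, opt) in req.options.iter().enumerate() {
        println!(" {}) {opt}", i + 1);
    }
    let answer = read_answer(&req.options); // hypothetical: free text or arrow-key selection
    // AskTool::call() is awaiting this oneshot; send the answer back.
    let _ = req.response_tx.send(answer);
}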
MockInputHandler
For tests, pre-configure answers in a queue:
#![allow(unused)]
fn main() {
pub struct MockInputHandler {
answers: Mutex<VecDeque<String>>,
}
#[async_trait::async_trait]
impl InputHandler for MockInputHandler {
async fn ask(&self, _question: &str, _options: &[String]) -> anyhow::Result<String> {
self.answers.lock().await.pop_front()
.ok_or_else(|| anyhow::anyhow!("MockInputHandler: no more answers"))
}
}
}
This follows the same pattern as MockProvider – pop from the front, error
when empty. Note that this uses tokio::sync::Mutex (with .lock().await),
not std::sync::Mutex. The reason: ask() is an async fn, and the lock
guard must be held across the .await boundary. A std::sync::Mutex guard is
!Send, so holding it across .await won’t compile. tokio::sync::Mutex
produces a Send-safe guard that works in async contexts. Compare this with
MockProvider from Chapter 1, which uses std::sync::Mutex because its
chat() method doesn’t hold the guard across an .await.
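The struct above is shown without a constructor; one possible shape, assuming the answers are queued up front in the order the test expects:
impl MockInputHandler {
    pub fn new<I, S>(answers: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            // tokio::sync::Mutex, matching the field type discussed above.
            answers: Mutex::new(answers.into_iter().map(Into::into).collect()),
        }
    }
}
A test can then write MockInputHandler::new(["yes", "option two"]) and wrap it in an Arc for AskTool::new().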
Tool summary
Update tool_summary() in agent.rs so that ask_user calls display their question argument in the terminal output:
#![allow(unused)]
fn main() {
let detail = call.arguments
.get("command")
.or_else(|| call.arguments.get("path"))
.or_else(|| call.arguments.get("question")) // <-- new
.and_then(|v| v.as_str());
}
Plan mode integration
ask_user is read-only – it collects information without mutating anything.
Add it to PlanAgent’s default read_only set (see
Chapter 12) so the LLM can ask questions during
planning:
#![allow(unused)]
fn main() {
read_only: HashSet::from(["bash", "read", "ask_user"]),
}
Wiring it up
Add the module to mini-claw-code/src/tools/mod.rs:
#![allow(unused)]
fn main() {
mod ask;
pub use ask::*;
}
And re-export from lib.rs:
#![allow(unused)]
fn main() {
pub use tools::{
AskTool, BashTool, ChannelInputHandler, CliInputHandler,
EditTool, InputHandler, MockInputHandler, ReadTool,
UserInputRequest, WriteTool,
};
}
Running the tests
cargo test -p mini-claw-code ch11
The tests verify:
- Tool definition: schema has question (required) and options (optional array).
- Question only: MockInputHandler returns the answer for a question-only call.
- With options: tool passes options to the handler correctly.
- Missing question: missing question argument returns an error.
- Handler exhausted: empty MockInputHandler returns an error.
- Agent loop: LLM calls ask_user, gets an answer, then returns final text.
- Ask then tool: ask_user followed by another tool call (e.g. read).
- Multiple asks: two sequential ask_user calls with different answers.
- Channel roundtrip: ChannelInputHandler sends a request and receives a response via the oneshot channel.
- param_raw: param_raw() adds an array parameter to ToolDefinition correctly.
Recap
- The InputHandler trait abstracts input collection across CLI, TUI, and tests.
- AskTool lets the LLM pause execution and ask the user a question.
- param_raw() extends ToolDefinition to support complex JSON schema types like arrays.
- Three handlers: CliInputHandler for simple apps, ChannelInputHandler for TUI apps, MockInputHandler for tests.
- Plan mode: ask_user is read-only by default, so it works during planning.
- Purely additive: no changes to SimpleAgent, StreamingAgent, or any existing tool.
Chapter 12: Plan Mode
Real coding agents can be dangerous. Give an LLM access to write, edit,
and bash and it might rewrite your config, delete a file, or run a
destructive command – all before you’ve had a chance to review what it’s doing.
Plan mode solves this with a two-phase workflow:
- Plan – the agent explores the codebase using read-only tools (read, bash, and ask_user). It cannot write, edit, or mutate anything. It returns a plan describing what it intends to do.
- Execute – after the user reviews and approves the plan, the agent runs again with all tools available.
This is exactly how Claude Code’s plan mode works. In this chapter you’ll build
PlanAgent – a streaming agent with caller-driven approval gating.
You will:
- Build PlanAgent<P: StreamProvider> with plan() and execute() methods.
- Inject a system prompt that tells the LLM it's in planning mode.
- Add an exit_plan tool the LLM calls when its plan is ready.
- Implement double defense: definition filtering and an execution guard.
- Let the caller drive the approval flow between phases.
Why plan mode?
Consider this scenario:
User: "Refactor auth.rs to use JWT instead of session cookies"
Agent (no plan mode):
→ calls write("auth.rs", ...) immediately
→ rewrites half your auth system
→ you didn't want that approach at all
With plan mode:
User: "Refactor auth.rs to use JWT instead of session cookies"
Agent (plan phase):
→ calls read("auth.rs") to understand current code
→ calls bash("grep -r 'session' src/") to find related files
→ calls exit_plan to submit its plan
→ "Plan: Replace SessionStore with JwtProvider in 3 files..."
User: "Looks good, go ahead."
Agent (execute phase):
→ calls write/edit with the approved changes
The key insight: the same agent loop works for both phases. The only difference is which tools are available.
Design
PlanAgent has the same shape as StreamingAgent – a provider, a ToolSet,
and an agent loop. Three additions make it a planning agent:
- A HashSet<&'static str> recording which tools are allowed during planning.
- A system prompt injected at the start of the planning phase.
- An exit_plan tool definition the LLM calls when its plan is ready.
#![allow(unused)]
fn main() {
pub struct PlanAgent<P: StreamProvider> {
provider: P,
tools: ToolSet,
read_only: HashSet<&'static str>,
plan_system_prompt: String,
exit_plan_def: ToolDefinition,
}
}
Two public methods drive the two phases:
- plan() – injects the system prompt and runs the agent loop with only read-only tools and exit_plan visible.
- execute() – runs the agent loop with all tools visible.
Both delegate to a private run_loop() that takes an optional tool filter.
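Its signature might look like this – a sketch consistent with how plan() and execute() call it later in this chapter:
async fn run_loop(
    &self,
    messages: &mut Vec<Message>,
    // Some(names) during plan (filter tools), None during execute (allow all).
    allowed: Option<&HashSet<&'static str>>,
    events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> {
    // Same streaming loop as StreamingAgent::chat(), plus the definition
    // filtering, exit_plan handling, and execution guard described below.
    /* ... */
}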
The builder
Construction follows the same builder pattern as SimpleAgent and
StreamingAgent:
#![allow(unused)]
fn main() {
impl<P: StreamProvider> PlanAgent<P> {
pub fn new(provider: P) -> Self {
Self {
provider,
tools: ToolSet::new(),
read_only: HashSet::from(["bash", "read", "ask_user"]),
plan_system_prompt: DEFAULT_PLAN_PROMPT.to_string(),
exit_plan_def: ToolDefinition::new(
"exit_plan",
"Signal that your plan is complete and ready for user review. \
Call this when you have finished exploring and are ready to \
present your plan.",
),
}
}
pub fn tool(mut self, t: impl Tool + 'static) -> Self {
self.tools.push(t);
self
}
pub fn read_only(mut self, names: &[&'static str]) -> Self {
self.read_only = names.iter().copied().collect();
self
}
pub fn plan_prompt(mut self, prompt: impl Into<String>) -> Self {
self.plan_system_prompt = prompt.into();
self
}
}
}
By default, bash, read, and ask_user are read-only. (Chapter 11 added
ask_user so the LLM can ask clarifying questions during planning.) The
.read_only() method lets callers override this – for example, to exclude
bash from planning if you want a stricter mode.
The .plan_prompt() method lets callers override the system prompt – useful
for specialized agents like security auditors or code reviewers.
System prompt
The LLM needs to know it’s in planning mode. Without this, it will try to accomplish the task with whatever tools it sees, rather than producing a deliberate plan.
plan() injects a system prompt at the start of the conversation:
#![allow(unused)]
fn main() {
const DEFAULT_PLAN_PROMPT: &str = "\
You are in PLANNING MODE. Explore the codebase using the available tools and \
create a plan. You can read files, run shell commands, and ask the user \
questions — but you CANNOT write, edit, or create files.\n\n\
When your plan is ready, call the `exit_plan` tool to submit it for review.";
}
The injection is conditional – if the caller already provided a System
message, plan() respects it:
#![allow(unused)]
fn main() {
pub async fn plan(
&self,
messages: &mut Vec<Message>,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> {
if !messages
.first()
.is_some_and(|m| matches!(m, Message::System(_)))
{
messages.insert(0, Message::System(self.plan_system_prompt.clone()));
}
self.run_loop(messages, Some(&self.read_only), events).await
}
}
This means:
- First call: no system message → inject the plan prompt.
- Re-plan call: system message already there → skip.
- Caller provided their own: caller’s system message → respect it.
This is how real agents work. Claude Code switches its system prompt when
entering plan mode. OpenCode uses entirely separate agent configurations with
different system prompts for plan vs build agents.
The exit_plan tool
Without exit_plan, the planning phase ends when the LLM returns
StopReason::Stop – the same way any conversation ends. This is ambiguous:
did the LLM finish planning, or did it just stop talking?
Real agents solve this with an explicit signal. Claude Code has ExitPlanMode.
OpenCode has exit_plan. The LLM calls the tool to say “my plan is ready for
review.”
In PlanAgent, exit_plan is a tool definition stored on the struct – not
registered in the ToolSet. This means:
- During plan: exit_plan is injected into the tool list alongside read-only tools. The LLM can see and call it.
- During execute: exit_plan is not in the tool list. The LLM doesn't know it exists.
When the agent loop sees an exit_plan call, it returns immediately with the
plan text (the LLM’s text from that turn):
#![allow(unused)]
fn main() {
// Handle exit_plan: signal plan completion
if allowed.is_some() && call.name == "exit_plan" {
results.push((call.id.clone(), "Plan submitted for review.".into()));
exit_plan = true;
continue;
}
}
After processing all tool calls in the turn, if exit_plan was among them:
#![allow(unused)]
fn main() {
if exit_plan {
let _ = events.send(AgentEvent::Done(plan_text.clone()));
return Ok(plan_text);
}
}
The planning phase now has two exit paths:
- StopReason::Stop – the LLM stops naturally (backward compatible).
- An exit_plan tool call – the LLM explicitly signals plan completion.
Both work. The exit_plan path is better because it’s unambiguous.
Double defense
Tool filtering still uses two layers of protection:
Layer 1: Definition filtering
During plan(), only read-only tool definitions plus exit_plan are sent to
the LLM. The model literally cannot see write or edit in its tool list:
#![allow(unused)]
fn main() {
let all_defs = self.tools.definitions();
let defs: Vec<&ToolDefinition> = match allowed {
Some(names) => {
let mut filtered: Vec<&ToolDefinition> = all_defs
.into_iter()
.filter(|d| names.contains(d.name))
.collect();
filtered.push(&self.exit_plan_def);
filtered
}
None => all_defs,
};
}
During execute(), allowed is None, so all registered tools are sent –
and exit_plan is not included.
Layer 2: Execution guard
If the LLM somehow hallucinated a blocked tool call, the execution guard
catches it and returns an error ToolResult instead of executing the tool:
#![allow(unused)]
fn main() {
if let Some(names) = allowed
&& !names.contains(call.name.as_str())
{
results.push((
call.id.clone(),
format!(
"error: tool '{}' is not available in planning mode",
call.name
),
));
continue;
}
}
The error goes back to the LLM as a tool result, so it learns the tool is blocked and adjusts its behavior. The file is never touched.
The shared agent loop
Both plan() and execute() delegate to run_loop(). The only parameter
that differs is allowed:
#![allow(unused)]
fn main() {
pub async fn plan(
&self,
messages: &mut Vec<Message>,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> {
// System prompt injection (shown earlier)
self.run_loop(messages, Some(&self.read_only), events).await
}
pub async fn execute(
&self,
messages: &mut Vec<Message>,
events: mpsc::UnboundedSender<AgentEvent>,
) -> anyhow::Result<String> {
self.run_loop(messages, None, events).await
}
}
plan() passes Some(&self.read_only) to restrict tools. execute() passes
None to allow everything.
The run_loop itself is identical to StreamingAgent::chat() from Chapter 10,
with these additions:
- Tool definition filtering (read-only + exit_plan during plan; all during execute).
- The exit_plan handler that breaks the loop when the LLM signals plan completion.
- The execution guard for blocked tools.
Caller-driven approval flow
The approval flow lives entirely in the caller. PlanAgent does not ask for
approval – it just runs whichever phase is called. This keeps the agent
simple and lets the caller implement any approval UX they want.
Here is the typical flow:
#![allow(unused)]
fn main() {
let agent = PlanAgent::new(provider)
.tool(ReadTool::new())
.tool(WriteTool::new())
.tool(EditTool::new())
.tool(BashTool::new());
let mut messages = vec![Message::User("Refactor auth.rs".into())];
// Phase 1: Plan (read-only tools + exit_plan)
let (tx, rx) = mpsc::unbounded_channel();
let plan = agent.plan(&mut messages, tx).await?;
println!("Plan: {plan}");
// Show plan to user, get approval
if user_approves() {
// Phase 2: Execute (all tools)
messages.push(Message::User("Approved. Execute the plan.".into()));
let (tx2, rx2) = mpsc::unbounded_channel();
let result = agent.execute(&mut messages, tx2).await?;
println!("Result: {result}");
} else {
// Re-plan with feedback
messages.push(Message::User("No, try a different approach.".into()));
let (tx3, rx3) = mpsc::unbounded_channel();
let revised_plan = agent.plan(&mut messages, tx3).await?;
println!("Revised plan: {revised_plan}");
}
}
Notice how the same messages vec is shared across phases. This is critical –
the LLM sees its own plan, the user’s approval (or rejection), and all
previous context when it enters the execute phase. Re-planning is just
pushing feedback as a User message and calling plan() again.
sequenceDiagram
participant C as Caller
participant P as PlanAgent
participant L as LLM
C->>P: plan(&mut messages)
P->>L: [read, bash, exit_plan tools only]
L-->>P: reads files, calls exit_plan
P-->>C: "Plan: ..."
C->>C: User reviews plan
alt Approved
C->>P: execute(&mut messages)
P->>L: [all tools]
L-->>P: writes/edits files
P-->>C: "Done."
else Rejected
C->>P: plan(&mut messages) [with feedback]
P->>L: [read, bash, exit_plan tools only]
L-->>P: revised plan
P-->>C: "Revised plan: ..."
end
Wiring it up
Add the module to mini-claw-code/src/lib.rs:
#![allow(unused)]
fn main() {
pub mod planning;
// ...
pub use planning::PlanAgent;
}
That’s it. Like streaming, plan mode is a purely additive feature – no existing code is modified.
Running the tests
cargo test -p mini-claw-code ch12
The tests verify:
- Text response: plan() returns text when the LLM stops immediately.
- Read tool allowed: read executes during planning.
- Write tool blocked: write is blocked during planning; the file is NOT created; an error ToolResult is sent back to the LLM.
- Edit tool blocked: same behavior for edit.
- Execute allows write: write works during execution; the file IS created.
- Full plan-then-execute: end-to-end flow – plan reads a file, approval, execute writes a file.
- Message continuity: messages from the plan phase carry into the execute phase, including the injected system prompt.
- read_only override: .read_only(&["read"]) excludes bash from planning.
- Streaming events: TextDelta and Done events are emitted during planning.
- Provider error: an empty mock propagates errors correctly.
- Builder pattern: chained .tool().read_only().plan_prompt() compiles.
- System prompt injection: plan() injects a system prompt at position 0.
- System prompt not duplicated: calling plan() twice doesn't add a second system message.
- Caller system prompt respected: if the caller provides a System message, plan() doesn't overwrite it.
- exit_plan tool: the LLM calls exit_plan to signal plan completion; plan() returns the plan text.
- exit_plan not in execute: during execute(), exit_plan is not in the tool list.
- Custom plan prompt: .plan_prompt(...) overrides the default.
- Full flow with exit_plan: plan reads a file → calls exit_plan → approve → execute writes a file.
Recap
- PlanAgent separates planning (read-only) from execution (all tools) using a single shared agent loop.
- System prompt: plan() injects a system message telling the LLM it's in planning mode — what tools are available, what's blocked, and that it should call exit_plan when done.
- exit_plan tool: the LLM explicitly signals plan completion, just like Claude Code's ExitPlanMode. This is injected during planning and invisible during execution.
- Double defense: definition filtering prevents the LLM from seeing blocked tools; an execution guard catches hallucinated calls.
- Caller-driven approval: the agent doesn't manage approval – the caller pushes approval/rejection as User messages and calls the appropriate phase.
- Message continuity: the same messages vec flows through both phases, giving the LLM full context.
- Streaming: both phases use StreamProvider and emit AgentEvents, just like StreamingAgent.
- Purely additive: no changes to any existing code.
Chapter 13: Subagents
This chapter is not yet written. It will cover spawning child agents that handle subtasks independently.
Want to contribute? Open an issue or submit a PR!
Chapter 14: MCP: Model Context Protocol
This chapter is not yet written. It will cover integrating MCP servers to give your agent access to external tools and data sources.
Want to contribute? Open an issue or submit a PR!
Chapter 15: Safety Rails
This chapter is not yet written. It will cover adding confirmation prompts, command filtering, timeouts, and other safety measures to your agent.
Want to contribute? Open an issue or submit a PR!