Welcome to the second chapter of the pi-mono tutorial!
In the previous Agent Session chapter, we built the "body" of our AI—the system that saves files and manages history. Now, we are going to look at the "Central Nervous System": the Agent Runtime.
While the Session manages files, the Runtime manages thought.
A raw Large Language Model (LLM) is actually quite simple: you give it text, and it predicts the next text. It doesn't know how to "browse the web" or "run code." It can only write about browsing the web.
To create a true Agent, we need a system that can:
ls command").
This cycle—Think → Act → Observe → Repeat—is the job of the Agent class.
The Agent is chatty. It doesn't just give you the final answer; it emits events for everything happening inside its brain:
message_start: "I'm starting to type."tool_execution_start: "I'm about to run a command."tool_execution_end: "I finished the command."
The Agent keeps a list of AgentMessage objects. This is its short-term working memory. It contains the User's prompt, its own previous replies, and the results of any tools it used.
Unlike a standard chat where you say "Hello" and the bot says "Hi", the Agent Runtime enters a Loop. One user prompt might trigger 10 internal steps of tool usage before the Agent finally replies to you.
Imagine a user asks: "What is the weather in Tokyo?"
get_weather tool."
The Agent class automates this entire dance.
Using the Agent directly is lower-level than using the Session. Here is how you initialize the brain and send a thought.
We create an agent and give it an initial state (like which model to use).
import { Agent } from "@mariozechner/pi-ai";
const agent = new Agent({
initialState: {
model: myGeminiModel, // The AI Brain
tools: [weatherTool], // The AI Hands
}
});
Explanation: We configure the agent with a model and a list of tools it is allowed to use. We'll learn how to build tools in Standard Tools.
We send a message using .prompt().
// Subscribe to see what happens in real-time
agent.subscribe((event) => {
if (event.type === "tool_execution_start") {
console.log(`Agent is using tool: ${event.toolName}`);
}
});
// Trigger the thought loop
await agent.prompt("Check the weather in Tokyo");
Explanation: Calling prompt starts the agentLoop. This function won't return until the Agent has finished everything—including all tool calls and the final text response.
agentLoopHow does the Runtime handle this multi-step process without getting lost?
The logic lives primarily in two files: agent.ts (the class) and agent-loop.ts (the logic).
agent.ts)The class is primarily a container for State.
export class Agent {
private _state: AgentState = {
messages: [], // History
tools: [], // Available capabilities
isStreaming: false // Are we thinking right now?
};
// ... logic to update state ...
}
Explanation: This holds the "Context Window" of the conversation. When you call prompt(), it adds your message to _state.messages and calls the loop.
agent-loop.ts)
This is the heart of the engine. It uses a while (true) loop to process steps until the AI is satisfied.
// Simplified logic from agentLoop function
async function runLoop(...) {
while (true) {
// 1. Ask the LLM for a response based on current history
const message = await streamAssistantResponse(...);
// 2. Did the LLM ask to use a tool?
const toolCalls = message.content.filter(c => c.type === "toolCall");
if (toolCalls.length > 0) {
// 3. Execute the tools
await executeToolCalls(currentContext.tools, message, ...);
// 4. CONTINUE the loop (go back to step 1)
continue;
}
// 5. No tools used? We are done.
break;
}
}
Explanation: This while loop is what makes the agent "autonomous." It will keep looping, executing tools, and feeding results back into the conversation until the LLM decides to write standard text instead of a tool call.
When a tool call is detected, the Agent must run it and format the output.
async function executeToolCalls(tools, message, stream) {
// Find the requested tool
const tool = tools.find(t => t.name === toolCall.name);
// Run the actual function
const result = await tool.execute(toolCall.id, args);
// Create a specific "Tool Result" message
const resultMsg = {
role: "toolResult",
content: result.content
};
return resultMsg;
}
Explanation: The result is wrapped in a special message with the role toolResult. This tells the LLM, "This is exactly what the computer returned when you ran that command."
You might notice functions like steer() in the full source code.
Sometimes an Agent gets stuck in a loop (e.g., trying to read a file that doesn't exist 100 times).
Steering allows the User to inject a message while the loop is running.
The Agent Runtime is the engine that transforms a static LLM into a dynamic worker. It handles the complex logic of chaining thoughts, executing code, and managing the conversation flow.
In this chapter, we learned:
But wait—how do we actually talk to different LLMs like OpenAI, Anthropic, or Gemini without rewriting our code every time? That is handled by the Unified AI Interface.
Next Chapter: Unified AI Interface
Generated by Code IQ