In the previous chapter, Tool Governance and Limits, we gave the AI "hands" to manipulate files and run commands, along with safety gloves to prevent accidents.
Now we face a communication problem. The AI generates text. But how does the application know the difference between the AI saying, "I am going to list your files," and the AI actually trying to execute the list command?
If we don't distinguish between talk and action, the CLI is just a chatbot.
Imagine a police radio channel.
The code "10-4" isn't just noise; it triggers a specific procedure.
The XML Messaging Protocol is the "Radio Code" for our AI. It uses XML tags to wrap specific parts of the conversation so the application Runtime knows exactly what to do.
Large Language Models (LLMs) output a single stream of string characters. Without a protocol, the output is unstructured.
Unstructured (Bad):
"Okay, I will run the list command. ls -la. Did it work?"
Problem: The system doesn't know where the command starts or ends. It might try to execute "Did it work?" as a command.
Structured with XML Protocol (Good):
"Okay, I will run the list command.
<command-name>execute</command-name>
<command-args>ls -la</command-args>"
Solution: The system ignores the chat text and strictly executes what is inside the tags.
We define these "Radio Codes" as constants in xml.ts. This ensures that the prompt generator (which talks to the AI) and the parser (which reads the AI's reply) agree on the exact spelling of the tags.
These are used when the AI wants to do something.
// From xml.ts
export const COMMAND_NAME_TAG = 'command-name'
export const COMMAND_ARGS_TAG = 'command-args'
Explanation: When the parser sees command-name, it knows an executable tool is being requested.
These are used when the System replies to the AI. The AI needs to know if text coming back is a user message or the raw output of a terminal command.
// From xml.ts
export const BASH_STDOUT_TAG = 'bash-stdout' // Success output
export const BASH_STDERR_TAG = 'bash-stderr' // Error output
Explanation: By wrapping terminal output in <bash-stdout>, the AI understands: "This isn't the user talking; this is the result of my code."
Sometimes the AI launches a background agent (a sub-task). We need tags to track the status of these invisible workers.
// From xml.ts
export const TASK_NOTIFICATION_TAG = 'task-notification'
export const STATUS_TAG = 'status'
export const SUMMARY_TAG = 'summary'
Explanation: These tags act like a status report from a remote worker, letting the main AI know if a job is "Thinking," "Running," or "Done."
As a developer using constants, you rarely write these tags manually. Instead, you use these constants to build parsers or format prompts.
Imagine the AI ran echo "Hello". The terminal returned "Hello". We need to package this for the AI's memory.
import { BASH_STDOUT_TAG } from './xml.js'
function formatOutput(rawOutput: string) {
// Wraps the output: <bash-stdout>Hello</bash-stdout>
return `<${BASH_STDOUT_TAG}>${rawOutput}</${BASH_STDOUT_TAG}>`
}
Explanation: We use the constant BASH_STDOUT_TAG. If we ever decide to change the tag name from bash-stdout to terminal-output in the future, we only change it in one file (xml.ts), and the whole app updates.
The system defines a specific list of tags that represent "Computer Output."
import { TERMINAL_OUTPUT_TAGS } from './xml.js'
// TERMINAL_OUTPUT_TAGS is an array ['bash-input', 'bash-stdout', etc...]
function isSystemMessage(tagName: string) {
// Checks if the tag belongs to the machine, not the human
return TERMINAL_OUTPUT_TAGS.includes(tagName)
}
Explanation: This array allows us to filter the chat history. We might want to hide bulky terminal logs from the user but keep them visible to the AI.
How does the conversation actually flow using these tags? Let's look at the lifecycle of a command.
The file xml.ts is purely a dictionary. It doesn't contain logic; it contains definitions. This is crucial for stability.
These tags handle the simulation of a terminal session.
// From xml.ts
// The command actually typed
export const BASH_INPUT_TAG = 'bash-input'
// The standard output (what you usually see)
export const BASH_STDOUT_TAG = 'bash-stdout'
// The standard error (when things break)
export const BASH_STDERR_TAG = 'bash-stderr'
When using features like "Ultraplan" (remote planning) or "Teammates" (multiple AIs), we use special tags to route messages.
// From xml.ts
// Message from another AI agent
export const TEAMMATE_MESSAGE_TAG = 'teammate-message'
// Result from a remote review session
export const REMOTE_REVIEW_TAG = 'remote-review'
Explanation: If the system sees <teammate-message>, it knows to display the text differently, perhaps with a different avatar or color, because it came from a peer, not the user.
Sometimes a message exists but has no text (perhaps it only had an image, or it was a system ping). We use a constant for this to avoid null errors.
// From messages.ts
export const NO_CONTENT_MESSAGE = '(no content)'
Explanation: This isn't an XML tag, but it serves a similar purpose: a standardized signal that tells the system "There is nothing here to read."
In this chapter, we learned:
xml.ts. We never hardcode strings like '<bash-stdout>' in our logic files.This protocol ensures that the Brain (Chapter 1), the Voice (Chapter 2), and the Hands (Chapter 3) all speak the same language.
We have a working AI that can think, speak, act, and communicate clearly. But to be truly useful, it needs access to the outside worldβAPI keys, GitHub tokens, and user credentials. How do we manage these secrets securely?
Next Chapter: OAuth and Environment Configuration
Generated by Code IQ