Welcome to Chapter 3 of the Chrome DevTools MCP tutorial!
In the previous chapter, MCP Context (State Management), we built the Cockpit: a way to manage the browser's state and translate the visual web page into a text map the AI can understand.
Now that the AI can "see" the plane's controls, we need to give it the ability to push buttons and pull levers. We call these Tool Definitions.
Imagine you are playing a role-playing game (RPG). You have a set of Skill Cards: "Attack", "Defend", or "Heal".
In this project, Tools are those Skill Cards. They turn vague AI desires into concrete code execution.
If the AI wants to "Type 'Hello' into the search box," it needs a Tool defined specifically for typing, and it needs to know exactly which box to type into.
Every tool definition has three parts:

1. Name and Description. This tells the AI what the tool is. It includes a unique name (e.g., click, fill_form) and a description. The description is crucial because the AI reads it to decide if this is the right tool for the job.
2. Schema. We use a library called Zod to define strict rules for inputs. Think of Zod as the Bouncer at a club, checking every request at the door: "The rules say I need a uid (string) for the element. Do you have it?"
3. Handler. This is a standard JavaScript/TypeScript function. It receives the validated inputs from Zod and the McpContext we built in Chapter 2. It performs the actual work using Puppeteer.
Let's look at how a tool is actually written in the code. We use a helper function called defineTool.
We need to tell the AI what this tool does.
```typescript
// src/tools/input.ts (Simplified)
export const click = defineTool({
  name: 'click',
  description: `Clicks on the provided element`,
  annotations: {
    category: ToolCategory.INPUT,
  },
  // ... schema and handler go here
});
```
Explanation: The name is how the AI calls it. The description helps the AI understand when to use it.
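The defineTool helper itself can be surprisingly small. Here is a hedged sketch of what such a helper could look like; the field names mirror the snippets in this chapter, but the types and enum values are illustrative, not the project's actual definitions:

```typescript
// Illustrative categories; the real project defines its own set.
enum ToolCategory {
  INPUT = 'input',
  NETWORK = 'network',
  PERFORMANCE = 'performance',
}

// A minimal shape for a tool definition. The real version carries
// richer typing for the schema and handler.
interface ToolDefinition {
  name: string;
  description: string;
  annotations?: { category: ToolCategory };
  schema?: unknown; // Zod shape in the real code
  handler?: (...args: unknown[]) => Promise<void>;
}

// Identity at runtime; its value is type checking plus one place
// to register or post-process every tool if needed.
function defineTool(definition: ToolDefinition): ToolDefinition {
  return definition;
}

const click = defineTool({
  name: 'click',
  description: 'Clicks on the provided element',
  annotations: { category: ToolCategory.INPUT },
});

console.log(click.name); // → click
```

The helper returns its argument unchanged; the payoff is that every tool in the project passes through one well-typed choke point.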
We define strictly what data the AI must provide.
```typescript
// src/tools/input.ts (Simplified)
schema: {
  uid: zod.string()
    .describe('The uid of an element from the snapshot'),
  dblClick: zod.boolean().optional()
    .describe('Set to true for double clicks'),
},
```
Explanation: We require a uid (string). We optionally accept dblClick (boolean). If the AI sends a number for uid, Zod rejects it before it can crash our code.
This is where we actually touch the browser.
```typescript
// src/tools/input.ts (Simplified)
handler: async (request, response, context) => {
  const uid = request.params.uid;

  // 1. Ask the Context (Chapter 2) to find the element
  const handle = await context.getElementByUid(uid);

  // 2. Perform the Puppeteer action
  await handle.click();

  // 3. Tell the AI it worked
  response.appendResponseLine(`Successfully clicked.`);
},
```
Explanation:

1. We read the uid from the parameters.
2. We call context.getElementByUid (which we learned about in MCP Context (State Management)) to get the real button.
3. We perform the click and report success to the AI.

What happens when the AI actually decides to use a tool?
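Conceptually, the server does three things with each tool call: look the tool up by name, validate the arguments against its schema, and run the handler. Here is a simplified, self-contained sketch of that loop; the real project routes calls through the MCP SDK and Zod, so every name below is illustrative:

```typescript
// Illustrative types; the real project gets these from the MCP SDK.
type Handler = (
  params: Record<string, unknown>,
  respond: (line: string) => void,
) => Promise<void>;

interface SimpleTool {
  name: string;
  // Stand-in for a Zod schema: returns an error message or null.
  validate: (params: Record<string, unknown>) => string | null;
  handler: Handler;
}

const registry = new Map<string, SimpleTool>();

registry.set('click', {
  name: 'click',
  validate: p => (typeof p.uid === 'string' ? null : 'uid must be a string'),
  handler: async (p, respond) => {
    // In the real tool, this is where Puppeteer clicks the element.
    respond(`Successfully clicked ${String(p.uid)}.`);
  },
});

async function dispatch(
  name: string,
  params: Record<string, unknown>,
): Promise<string[]> {
  const lines: string[] = [];
  const tool = registry.get(name);
  if (!tool) return [`Error: unknown tool "${name}"`];

  // 1. The Bouncer: reject bad input before the handler runs.
  const error = tool.validate(params);
  if (error) return [`Error: ${error}`];

  // 2. Run the handler and collect its response lines for the AI.
  await tool.handler(params, line => lines.push(line));
  return lines;
}
```

For example, dispatching 'click' with { uid: 'node-7' } yields a success line, while { uid: 7 } is stopped at validation and never touches the browser.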
The project organizes tools into categories to keep things tidy.
Input Tools (src/tools/input.ts): These interact with the page directly. Examples: click, fill (type text), press_key. They all rely on a uid from the McpContext snapshot.

Network Tools (src/tools/network.ts): These inspect traffic, useful for debugging APIs. Examples: list_network_requests, get_network_request.

```typescript
// src/tools/network.ts (Simplified)
export const listNetworkRequests = defineTool({
  name: 'list_network_requests',
  description: `List all requests since last navigation.`,
  schema: { /* ... filters ... */ },
  handler: async (request, response, context) => {
    // 1. Get raw data from the browser bridge
    const data = await context.getDevToolsData();

    // 2. Format it for the AI
    response.attachDevToolsData(data);
  },
});
```
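The "format it for the AI" step matters: the model reads text, not objects, so structured request records ultimately have to be flattened into lines. A hypothetical formatter sketch (the record shape and function name are assumptions, not the project's API):

```typescript
// Hypothetical record shape; real DevTools network data is far richer.
interface RequestRecord {
  method: string;
  url: string;
  status: number;
}

// Turn structured records into compact lines a language model can scan.
function formatRequests(requests: RequestRecord[]): string {
  if (requests.length === 0) return 'No requests since last navigation.';
  return requests
    .map(r => `${r.method} ${r.url} -> ${r.status}`)
    .join('\n');
}

const summary = formatRequests([
  { method: 'GET', url: 'https://example.com/api/items', status: 200 },
  { method: 'POST', url: 'https://example.com/api/login', status: 401 },
]);
console.log(summary);
// GET https://example.com/api/items -> 200
// POST https://example.com/api/login -> 401
```

One line per request keeps the token cost low while still letting the AI spot a failing call (here, the 401) at a glance.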
Performance Tools (src/tools/performance.ts): These are complex tools that run over time. Examples: performance_start_trace, performance_stop_trace.

```typescript
// src/tools/performance.ts (Simplified)
handler: async (request, response, context) => {
  // Check if we are already recording
  if (context.isRunningPerformanceTrace()) {
    response.appendResponseLine('Error: Trace already running');
    return;
  }

  // Update state and start Puppeteer tracing on the current page
  context.setIsRunningPerformanceTrace(true);
  const page = context.getSelectedPage();
  await page.tracing.start({ categories });
},
```
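The important pattern here is the state flag: a long-running tool must refuse to start twice and must refuse to stop when nothing is running. Stripped of Puppeteer, the guard logic looks like this (a self-contained sketch; in the real code the flag lives on McpContext):

```typescript
// Minimal stand-in for the trace flag held by the context.
class TraceState {
  private running = false;

  start(): string {
    if (this.running) return 'Error: Trace already running';
    this.running = true;
    return 'Trace started';
  }

  stop(): string {
    if (!this.running) return 'Error: No trace is running';
    this.running = false;
    return 'Trace stopped';
  }
}

const trace = new TraceState();
console.log(trace.start()); // → Trace started
console.log(trace.start()); // → Error: Trace already running
console.log(trace.stop());  // → Trace stopped
```

Returning an error line instead of throwing keeps the conversation alive: the AI reads "Trace already running" and can decide to stop the trace first rather than crashing the server.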
In this chapter, we defined the Capabilities of our AI agent. Each tool pairs a name and description (so the AI knows when to use it) with a Zod schema (so inputs are validated) and a handler that uses the McpContext to execute the logic in the browser.

We have the Engine (Browser), the Cockpit (Context), and the Controls (Tools). But what happens when the browser talks back? What if a network request fails or a console error appears while the AI is "thinking"?
We need a way to record these events.
Next Chapter: Data Collectors (Event Buffering)