In the previous chapter, Sensory Audio Processing (Hearing), we gave Airi the ability to listen to your voice. Before that, in The "Stage" (Visual Presentation Layer), we gave her a face.
But currently, Airi is like a "Brain in a Jar." She lives inside a web browser window. She can chat with you, but she cannot interact with your computer: she cannot see your other open windows, she cannot click your mouse, and she cannot directly access low-level system capabilities.
In this chapter, we will build the Native Capabilities Bridge. This is the nervous system that connects the digital brain to the physical hardware of your computer.
Web browsers (like Chrome or the one embedded in Airi) are designed to be Sandboxed. This means a website is strictly forbidden from touching your operating system for security reasons. A website cannot say: "Move the user's mouse to the left."
The Problem: We want Airi to be an assistant, not just a chatbot.
The Solution: We create a Bridge.
To understand this system, think of a Mech Suit.
The Pilot (Frontend): This is the HTML/JS interface you see. It is smart but weak. It decides what to do (e.g., "I want to click that button").
The Mech Suit (Backend): This is the native application wrapper. We use two technologies here: Tauri (Rust) and Electron (Node.js).
The Control Linkage (The Bridge): This is the dashboard control panel. When the Pilot pushes a button, a signal travels down a wire to the Mech's hydraulic arms. In coding terms, the Frontend sends a message (Event) to the Backend, and the Backend executes the system command.
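The wiring can be sketched in a few lines of TypeScript. Everything here (`CommandRegistry`, the handler shape) is a toy illustration of the pattern, not Airi's actual API: the frontend only knows command names, while the backend owns the privileged handlers.

```typescript
// A minimal sketch of the bridge pattern (hypothetical names, not Airi's API):
// the Pilot (frontend) dials command names; the Mech (backend) owns the handlers.
type Handler = (args?: unknown) => string

class CommandRegistry {
  private handlers = new Map<string, Handler>()

  // Backend side: register a privileged capability under a name
  register(name: string, handler: Handler): void {
    this.handlers.set(name, handler)
  }

  // Frontend side: all it can do is "dial" a name
  invoke(name: string, args?: unknown): string {
    const handler = this.handlers.get(name)
    if (!handler) throw new Error(`Unknown command: ${name}`)
    return handler(args)
  }
}

const bridge = new CommandRegistry()
bridge.register('start_pass_through', () => 'clicks now pass through')

console.log(bridge.invoke('start_pass_through'))
```

The key design property: the frontend never holds a reference to the dangerous code, only a string naming it, so the backend stays in full control of what is allowed.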
Let's look at a cool feature called Window Pass-Through. Sometimes, you want Airi to float on your screen like a hologram. You want to be able to click through her to the window behind her.
This requires native OS commands. Here is how the Frontend asks for this superpower.
The web page calls a function exposed by the bridge.
// Inside a Vue component or logic file
import { invoke } from '@tauri-apps/api/core'

async function enableGhostMode() {
  // We send a command string to the Rust backend
  await invoke('start_pass_through')
  console.log('Clicks now go through the window!')
}
Explanation:
The invoke function is the magic telephone. We dial 'start_pass_through'. We don't need to know how it works, just that the backend handles it.
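In practice you usually wrap `invoke` with error handling, because the Rust side can return an `Err`. A hedged sketch of that wrapper — the injected `invokeFn` stands in for Tauri's real `invoke` so the pattern can be shown (and run) outside a Tauri window:

```typescript
// invokeFn stands in for Tauri's `invoke`; injected so this runs anywhere.
type InvokeFn = (cmd: string, args?: Record<string, unknown>) => Promise<unknown>

async function safeInvoke(
  invokeFn: InvokeFn,
  cmd: string,
  args?: Record<string, unknown>,
): Promise<{ ok: boolean, error?: string }> {
  try {
    await invokeFn(cmd, args)
    return { ok: true }
  }
  catch (e) {
    // A Rust Err(String) arrives in JS as the promise rejection value
    return { ok: false, error: String(e) }
  }
}

// Example with a fake backend that only knows one command
const fakeInvoke: InvokeFn = async (cmd) => {
  if (cmd !== 'start_pass_through')
    throw new Error(`unknown command ${cmd}`)
}

safeInvoke(fakeInvoke, 'start_pass_through').then(r => console.log(r.ok))
```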
How does a JavaScript command turn into a Windows or macOS system call?
Let's look at crates/tauri-plugin-window-pass-through-on-hover/src/lib.rs. This is where the raw power lives.
Unlike JavaScript, Rust can talk directly to the operating system (Windows API or macOS Cocoa).
// This function is callable from JavaScript
#[tauri::command]
async fn start_pass_through<R: Runtime>(
    window: tauri::Window<R>,
) -> Result<(), String> {
    // Call the helper that speaks "Operating System" language.
    // This changes the window style to be transparent to clicks.
    set_pass_through_enabled(&window, true).map_err(|e| {
        log::error!("Failed to enable pass-through: {e}");
        e.to_string()
    })
}
Explanation:
#[tauri::command]: This attribute tells Tauri "Expose this function to the frontend."
set_pass_through_enabled: This is a lower-level helper (with Windows- and macOS-specific code paths) that changes the window flags.
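The plugin's name, window-pass-through-on-hover, hints at the intended behavior: clicks should pass through only where nothing is drawn, not over the character herself. The decision logic can be sketched in plain TypeScript (`PixelProbe` and `shouldPassThrough` are hypothetical illustrations; the real plugin makes this call natively per OS, and alpha-probing is just one plausible way to detect "empty" areas):

```typescript
// Decide whether clicks should pass through, based on what's under the cursor.
// (Hypothetical sketch; the real plugin toggles this natively per OS.)
interface PixelProbe {
  // Returns the alpha (0 = fully transparent) of the pixel under the cursor
  alphaAt: (x: number, y: number) => number
}

function shouldPassThrough(probe: PixelProbe, x: number, y: number): boolean {
  // Over a transparent pixel (no character drawn there): let clicks through.
  return probe.alphaAt(x, y) === 0
}

// Fake probe: the character occupies x in [100, 200)
const probe: PixelProbe = {
  alphaAt: x => (x >= 100 && x < 200 ? 255 : 0),
}

console.log(shouldPassThrough(probe, 50, 0)) // true: empty area, click through
console.log(shouldPassThrough(probe, 150, 0)) // false: clicking on the character
```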
Airi can also "feel" when you type or move your mouse, even when her window is minimized. This uses a library called rdev.
From crates/tauri-plugin-rdev/src/lib.rs:
// Start a background thread to listen to the OS
std::thread::spawn(move || {
    // 'listen' hooks into the OS global event loop
    listen(move |event: Event| {
        // Convert OS event to a Web Event name
        let event_name = match event.event_type {
            EventType::KeyPress(_) => "keydown",
            EventType::MouseMove { .. } => "mousemove",
            _ => return,
        };

        // Send the news up to the Frontend
        app.emit(event_name, &event);
    });
});
Explanation:
std::thread::spawn: We create a parallel thread so we don't freeze the app.
listen: This connects to the global OS hook. It intercepts every mouse move on your computer.
app.emit: This broadcasts the event back to the web page. Now the Cognitive Brain knows you are active!

Airi uses the Model Context Protocol (MCP) to connect to external tools (like a database or a file explorer). This acts like a universal USB port.
From crates/tauri-plugin-mcp/src/lib.rs:
#[tauri::command]
async fn call_tool(
    state: State<'_, Mutex<McpState>>,
    name: String,
    args: Option<Map<String, Value>>,
) -> Result<CallToolResult, String> {
    // 1. Get the connected tool client
    let state = state.lock().await;
    let client = state.client.as_ref().ok_or("No MCP client connected")?;

    // 2. Send the command to the external tool
    let result = client
        .call_tool(CallToolRequestParam { name, arguments: args })
        .await;

    // 3. Return the result (or the error) to the AI
    result.map_err(|e| e.to_string())
}
Explanation: This allows the AI to say "List Files in Folder X." The request goes from JS -> Rust -> External Tool process, and the result flows all the way back.
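That round trip can be sketched with a mock tool registry. All names here (`ToolClient`, `list_files`, the result shape) are hypothetical illustrations of the idea, not the real MCP types:

```typescript
type ToolResult = { content: string[] }
type ToolFn = (args: Record<string, unknown>) => ToolResult

// Stands in for the external MCP server process
class ToolClient {
  private tools = new Map<string, ToolFn>()

  registerTool(name: string, fn: ToolFn): void {
    this.tools.set(name, fn)
  }

  // Mirrors the Rust `call_tool` command: name + args in, result out
  callTool(name: string, args: Record<string, unknown> = {}): ToolResult {
    const tool = this.tools.get(name)
    if (!tool) throw new Error(`No such tool: ${name}`)
    return tool(args)
  }
}

const client = new ToolClient()
client.registerTool('list_files', args => ({
  content: [`listing of ${String(args.folder)}`],
}))

console.log(client.callTool('list_files', { folder: '/tmp' }).content[0])
```

The important idea is uniformity: the AI never learns each tool's internals, it just sends a name and arguments and gets a structured result back, exactly like `invoke` on the window bridge.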
Finally, all these superpowers need to be registered when the app starts. This happens in the Electron main file: apps/stage-tamagotchi/src/main/index.ts.
app.whenReady().then(async () => {
  // 1. Setup the screen capture capability
  initScreenCaptureForMain()

  // 2. Setup the "Server Channel" (another bridge type)
  const serverChannel = injeca.provide('modules:channel-server',
    () => setupServerChannelHandlers(),
  )

  // 3. Create the actual windows
  setupMainWindow(dependsOn)
})
Explanation: This acts as the bootloader. It initializes the "nervous system" before the "eyes" (windows) even open.
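The key property of this bootloader is ordering: every capability must exist before any window opens and starts calling the bridge. A toy sketch of that guarantee (all names hypothetical):

```typescript
// Records the order in which startup steps run.
const bootLog: string[] = []

function initScreenCapture(): void { bootLog.push('screen-capture') }
function initServerChannel(): void { bootLog.push('server-channel') }
function createWindows(): void { bootLog.push('windows') }

function boot(): void {
  // Capabilities first, windows last: the window's frontend can
  // immediately call any bridge command without racing the setup.
  initScreenCapture()
  initServerChannel()
  createWindows()
}

boot()
console.log(bootLog.join(' -> '))
// screen-capture -> server-channel -> windows
```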
The Native Capabilities Bridge breaks the walls of the web browser.
Now we have a fully functional body. We have senses, muscles, a brain, and a face. But how do we know what is going on inside that complex brain when things go wrong?
Next Chapter: Introspection & Debugging System
Generated by Code IQ