Welcome back! In the previous chapters, we built nodes for macOS Node, iOS Node, and Android Node. We gave OpenClaw "bodies" to interact with the world.
However, there is a missing piece. Right now, to give a voice command, you might have to press a button on the screen. We want a true "Star Trek" experience where you can just speak into the air.
But we can't send everything you say to the Gateway. That would be slow, expensive, and a huge privacy risk. We need a way to filter the noise locally.
Enter Swabble.
Swabble is the "Ear" of OpenClaw. It is a Swift-based engine located in the Swabble/ directory. Its only job is to listen for a specific Wake Word (like "Hey Claw").
Think of Swabble like a guard dog sleeping by the door. It ignores the TV, the wind, and your casual conversations. But the moment it hears a specific sound (the intruder), it barks to wake up the rest of the house.
The Central Use Case: You are in a room chatting with friends. You say, "The weather is nice." Swabble does nothing. Then you say, "Hey Claw, turn on the lights." Swabble detects the phrase "Hey Claw," wakes up the iOS Node, and only then does the phone start recording your command to send to the Gateway.
Swabble is a library designed to be imported into your Apple apps (macOS and iOS).
Swabble runs 100% on your device. It does not need the internet. This ensures that your private conversations never leave the room until you explicitly summon the bot.
Imagine a conveyor belt carrying boxes of sound. The microphone captures sound in small chunks (frames). Swabble looks at these chunks in real-time.
Detection is signaled through a "callback." When Swabble hears the magic word, it flips a switch from false to true, telling the main app to start recording.
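That switch-flipping idea can be sketched in a few lines of Swift. Note that `onWakeWordDetected` and `isAwake` here are illustrative names for the pattern, not necessarily Swabble's exact API:

```swift
// A minimal sketch of the callback pattern (names are illustrative).
var isAwake = false

var onWakeWordDetected: (() -> Void)? = {
    isAwake = true  // the switch flips from false to true
    print("Wake word heard: start recording the command")
}

// When the engine scores a match, it fires the callback:
onWakeWordDetected?()
```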
Because Swabble is a library, you don't run it by itself. You use it inside the macOS Node or iOS Node code.
First, inside your iOS or macOS app code, you bring in the library.
```swift
import Swabble

// Create an instance of the engine
let ear = SwabbleEngine(wakeWord: "Hey Claw")
```
We need to turn the microphone on and feed the data to Swabble.
```swift
// This is a simplified example
microphone.startRecording { audioBuffer in
    // Feed the raw sound to Swabble
    ear.process(frame: audioBuffer)
}
```
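How you obtain `audioBuffer` depends on your app; on Apple platforms the standard route is `AVAudioEngine`. Here is a sketch of wiring a microphone tap to the engine, assuming the `ear` instance from above. The `[Float]` conversion is simplified, and microphone-permission handling is omitted:

```swift
import AVFoundation

let audioEngine = AVAudioEngine()
let input = audioEngine.inputNode
let format = input.outputFormat(forBus: 0)

// Install a tap that hands us ~1024-sample chunks from the microphone
input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
    // Copy channel 0 into a plain [Float] for Swabble
    guard let channels = buffer.floatChannelData else { return }
    let samples = Array(UnsafeBufferPointer(start: channels[0],
                                            count: Int(buffer.frameLength)))
    ear.process(frame: samples)
}

do {
    try audioEngine.start()
} catch {
    print("Could not start audio engine: \(error)")
}
```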
We need to tell Swabble what to do when it hears the name.
```swift
ear.onWakeWordDetected = {
    print("I heard you!")
    // Now we connect to the Gateway
    GatewayClient.startSendingAudio()
}
```
What happens: when Swabble detects the phrase, your onWakeWordDetected closure runs, and only then does the app start streaming audio to the Gateway.

How does Swabble distinguish "Hey Claw" from "Hay Stack"? It uses pattern matching on audio waves. Conceptually, a wake word engine loops through three steps: buffer the incoming audio, score it against a model of the wake word, and fire the callback when the score is high enough.
The core logic lives in Swabble/Sources/SwabbleEngine.swift. It manages the audio stream.
1. The Buffer Manager: We need to hold onto the last few seconds of audio to analyze it. This is often called a "Sliding Window."
```swift
// Swabble/Sources/SwabbleEngine.swift
public class SwabbleEngine {
    // A temporary holder for sound data
    private var audioBuffer: [Float] = []

    // The sensitivity (0.0 to 1.0)
    public var sensitivity: Float = 0.5

    // Fired when the wake word is detected; the host app sets this
    public var onWakeWordDetected: (() -> Void)?

    // The function your app calls when the mic has data
    public func process(frame: [Float]) {
        audioBuffer.append(contentsOf: frame)
        // Check if the buffer matches our model
        detect()
    }
}
```
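One detail before we unpack this code: as written, `audioBuffer` grows forever. A real "Sliding Window" also trims old samples. Here is one sketch of how `process(frame:)` might do that; the 16 kHz sample rate and two-second window are illustrative assumptions, not Swabble's actual values:

```swift
// Keep only the most recent ~2 seconds of audio (a "sliding window").
private let maxSamples = 16_000 * 2  // 2 s at an assumed 16 kHz sample rate

public func process(frame: [Float]) {
    audioBuffer.append(contentsOf: frame)
    // Drop the oldest samples once the window is full
    if audioBuffer.count > maxSamples {
        audioBuffer.removeFirst(audioBuffer.count - maxSamples)
    }
    detect()
}
```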
Explanation:
- audioBuffer: Holds the numbers representing sound.
- process(frame:): This is the entry point. The microphone gives us a small slice of time (a frame), and we add it to our memory.

2. The Detection Logic: This is where the math happens. In a real engine, this uses a trained machine learning model. For this tutorial, we will visualize the logic simply.
```swift
private func detect() {
    // 1. Analyze the buffer
    let score = Model.analyze(self.audioBuffer)

    // 2. Check if it's confident enough
    if score > self.sensitivity {
        // 3. Fire the event!
        self.onWakeWordDetected?()

        // 4. Clear the buffer so we don't trigger twice
        self.audioBuffer.removeAll()
    }
}
```
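Model.analyze is a black box above. As a toy stand-in (real engines use a trained neural network, not raw waveform comparison), one could score the buffer with a normalized cross-correlation against a stored template of the wake word:

```swift
import Foundation

// Toy scorer: cosine similarity between the newest slice of the buffer
// and a pre-recorded template of "Hey Claw". Purely illustrative.
func analyze(_ buffer: [Float], template: [Float]) -> Float {
    guard buffer.count >= template.count else { return 0 }
    let window = Array(buffer.suffix(template.count))
    let dot = zip(window, template).map(*).reduce(0, +)
    let normW = sqrt(window.map { $0 * $0 }.reduce(0, +))
    let normT = sqrt(template.map { $0 * $0 }.reduce(0, +))
    guard normW > 0, normT > 0 else { return 0 }
    return dot / (normW * normT)  // 1.0 means a perfect match
}
```

A score near 1.0 means the waveforms line up; comparing that score against sensitivity gives the threshold check in detect().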
Explanation:
- Model.analyze: This compares the shape of the sound waves in memory against the shape of the word "Hey Claw."
- sensitivity: We allow some margin for error (background noise).
- onWakeWordDetected?(): If it's a match, we call the function provided by the main app.

In this chapter, we learned about Swabble, the privacy-focused listening engine.
It lives in the Swabble/ directory, runs entirely on-device, and wakes the rest of the system with a single callback.
Now our system has a Brain (Gateway), a Body (Nodes), and Ears (Swabble). But as these parts start talking to each other, they need to agree on a strict language. If the iOS Node sends a message saying { "text": "Hello" } but the Gateway expects { "message": "Hello" }, the whole system breaks.
We need a dictionary for our robot language.