Welcome to the final chapter of the Moonshine tutorial!
In the previous chapter, Streaming Inference Engine, we explored the high-performance C++ engine that powers real-time transcription. We learned how it uses memory states and caching to be incredibly fast.
However, most developers don't build mobile apps or web servers in raw C++. They use Swift (iOS), Kotlin/Java (Android), or Python.
Imagine you have a brilliant mathematician (the C++ Core) who solves problems instantly but only speaks ancient Greek. You also have a Project Manager (your Python/Swift app) who needs those answers but only speaks English.
They cannot communicate directly. If you paste C++ code into a Python file, the Python interpreter rejects it with a syntax error before anything runs.
The C API Bridge acts as the diplomat.
This allows us to write the heavy AI logic once in C++, and use it everywhere.
Let's look at the simplest possible interaction: Asking for the library version.
Your Swift app wants to know: "What version of Moonshine is running?"
The C++ core knows the answer: 20000 (Version 2.0).
Here is how the message travels:
The Swift app calls MoonshineAPI.getVersion(), the wrapper calls the C function moonshine_get_version(), and the C++ core returns 20000.
In high-level languages, we pass whole objects around. In the C Bridge, we can't easily pass complex C++ objects to Swift or Java.
Instead, we use Handles. Think of a Handle like a Coat Check Ticket.
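To make the Coat Check idea concrete, here is a minimal sketch of a handle registry. Everything in it (the Transcriber struct, the sketch_ function names, the use of -1 for failure) is hypothetical and chosen for illustration; the excerpts in this chapter do not show how Moonshine stores its objects internally.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for a complex C++ object that cannot cross the bridge.
struct Transcriber {
    std::string model_path;
};

namespace {
std::mutex g_mutex;                                            // guards the registry
std::map<int32_t, std::unique_ptr<Transcriber>> g_transcribers; // the "coat room"
int32_t g_next_handle = 1;                                     // next ticket number
}  // namespace

// Store the object on the C++ side and hand back only its ticket number.
extern "C" int32_t sketch_load_transcriber(const char *path) {
    if (path == nullptr) return -1;  // assumed convention: negative means failure
    std::lock_guard<std::mutex> lock(g_mutex);
    const int32_t handle = g_next_handle++;
    g_transcribers[handle] = std::make_unique<Transcriber>(Transcriber{path});
    return handle;
}

// Redeem the ticket: destroy the object. Returns 0 on success, -1 if unknown.
extern "C" int32_t sketch_free_transcriber(int32_t handle) {
    std::lock_guard<std::mutex> lock(g_mutex);
    return g_transcribers.erase(handle) == 1 ? 0 : -1;
}
```

The calling language only ever sees a plain integer, so it never needs to know how large a Transcriber is or how its memory is laid out.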
Let's visualize the flow of data when an iOS app asks to load a model.
Let's look at the three layers of this bridge: The Contract (C), The Android side (JNI), and the iOS side (Swift).
The file core/moonshine-c-api.h defines the rules of engagement. Notice extern "C". This tells the C++ compiler to turn off name mangling for these declarations: "Don't mess up these function names. Keep them simple so other languages can find them."
// From: core/moonshine-c-api.h
#ifdef __cplusplus
extern "C" {
#endif
// The function name is plain and simple
MOONSHINE_EXPORT int32_t moonshine_get_version(void);
// We return an int32_t (Handle), not a complex Object
MOONSHINE_EXPORT int32_t moonshine_load_transcriber_from_files(
const char *path,
uint32_t model_arch,
...
);
#ifdef __cplusplus
}  // closes extern "C"
#endif
Explanation: This file is the "Menu" that Swift and Java read to know what functions are available to order.
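One consequence of a C contract is that C++ exceptions must not escape through it, because the caller on the other side has no way to catch them. A common pattern is to catch everything at the boundary and return an error code instead. The sketch below assumes hypothetical names (sketch_load_model, the SKETCH_ERROR_ constants) and an assumed convention that negative values signal errors; it is not taken from the Moonshine source.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical error codes; negative so they cannot be confused with a handle.
constexpr int32_t SKETCH_ERROR_INVALID_ARGUMENT = -1;
constexpr int32_t SKETCH_ERROR_UNKNOWN = -2;

// Internal C++ code is free to throw.
static int32_t load_model_or_throw(const std::string &path) {
    if (path.empty()) throw std::invalid_argument("empty model path");
    return 1;  // pretend this is a freshly issued handle
}

// The exported function has C linkage and never lets an exception escape.
extern "C" int32_t sketch_load_model(const char *path) {
    if (path == nullptr) return SKETCH_ERROR_INVALID_ARGUMENT;
    try {
        return load_model_or_throw(path);
    } catch (const std::invalid_argument &) {
        return SKETCH_ERROR_INVALID_ARGUMENT;
    } catch (...) {
        return SKETCH_ERROR_UNKNOWN;
    }
}
```

A wrapper in Swift or Kotlin can then translate a negative return value into its own native error type.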
Swift allows us to call C functions directly, but we need to handle "Pointers" (direct memory addresses). Swift calls this UnsafePointer.
The wrapper swift/Sources/MoonshineVoice/MoonshineAPI.swift makes this safe for the rest of the app.
// From: swift/Sources/MoonshineVoice/MoonshineAPI.swift
func loadTranscriberFromFiles(path: String, ...) throws -> Int32 {
// 1. Convert Swift String to a C-compatible string
let pathCString = path.cString(using: .utf8)!
// 2. Call the C function directly
let handle = moonshine_load_transcriber_from_files(
pathCString,
modelArch.rawValue,
nil, 0, 20000
)
// 3. Return the Ticket # (Handle)
return handle
}
Explanation: Swift handles the conversion of the string path into bytes that C can understand, calls the function, and returns the result.
Java/Kotlin is stricter. It requires a specific intermediary called JNI (Java Native Interface). This is a C++ file that acts as the glue.
Look at android/moonshine-jni/moonshine-jni.cpp.
// From: android/moonshine-jni/moonshine-jni.cpp
// This weird function name maps directly to a Java class method
extern "C" JNIEXPORT jint JNICALL
Java_ai_moonshine_voice_JNI_moonshineLoadTranscriberFromFiles(
JNIEnv *env, jobject, jstring path, ...) {
// 1. Convert Java String to C String (JNI hands us a temporary buffer)
const char *path_str = env->GetStringUTFChars(path, nullptr);
// 2. Call the Core API
int handle = moonshine_load_transcriber_from_files(
path_str, ...
);
// 3. Release the temporary C string so it is not leaked
env->ReleaseStringUTFChars(path, path_str);
// 4. Return the handle to Java
return handle;
}
Explanation: The function name Java_ai_moonshine... tells the Android system: "When the Java class ai.moonshine.voice.JNI calls moonshineLoadTranscriberFromFiles, run this C++ code."
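The naming rule can be sketched as a small helper that builds the expected symbol from a class name and a method name. This is a simplified illustration: jni_symbol_name is a hypothetical helper, and it only handles the two most common cases (dots become underscores, and a literal underscore is escaped as _1). Overloaded native methods and non-ASCII characters follow additional rules that are left out here.

```cpp
#include <string>

// Build the symbol name JNI looks up for a native method (simplified rules).
std::string jni_symbol_name(const std::string &class_name,
                            const std::string &method_name) {
    auto escape = [](const std::string &part) {
        std::string out;
        for (char c : part) {
            if (c == '.') {
                out += '_';   // package separators become underscores
            } else if (c == '_') {
                out += "_1";  // a real underscore is escaped
            } else {
                out += c;
            }
        }
        return out;
    };
    return "Java_" + escape(class_name) + "_" + escape(method_name);
}
```

Calling jni_symbol_name("ai.moonshine.voice.JNI", "moonshineLoadTranscriberFromFiles") reproduces the function name used in the snippet above.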
Sending numbers is easy. Sending a list of text results (The Transcript) is hard.
In C, an array is just a pointer to the first item. In Swift/Java, an array is a smart object with a size and methods. The Bridge must convert them manually.
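For text results, the manual conversion could look like the sketch below, which walks a C-style array (pointer plus count) and copies each string into an owning container. The SketchLine and SketchTranscript structs are assumptions invented for this example; the real transcript layout in core/moonshine-c-api.h is not shown in this chapter.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical C-style layout: a pointer to the first line plus a count.
struct SketchLine {
    const char *text;   // NUL-terminated text owned by the C side
    float start_time;   // seconds from the start of the stream
};
struct SketchTranscript {
    const SketchLine *lines;
    uint64_t line_count;
};

// Copy every line into strings the caller owns, tolerating null pointers.
std::vector<std::string> copy_transcript_text(const SketchTranscript *t) {
    std::vector<std::string> result;
    if (t == nullptr || t->lines == nullptr) return result;
    result.reserve(t->line_count);
    for (uint64_t i = 0; i < t->line_count; ++i) {
        const char *text = t->lines[i].text;
        result.emplace_back(text != nullptr ? text : "");
    }
    return result;
}
```

Copying matters because the C side may reuse or free its buffers after the call returns, while the copied strings remain valid for as long as the caller keeps them.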
Example: Converting Audio Data (Swift to C)
To send audio to the Streaming Inference Engine, we must give C access to the raw memory of the Swift array.
// From: swift/Sources/MoonshineVoice/MoonshineAPI.swift
func addAudioToStream(audioData: [Float], ...) {
// "Open up" the array and give us the memory address (buffer)
audioData.withUnsafeBufferPointer { buffer in
// Pass the memory address (baseAddress) to C
moonshine_transcribe_add_audio_to_stream(
transcriberHandle,
streamHandle,
buffer.baseAddress, // <--- The Pointer
UInt64(audioData.count),
16000,
0
)
}
}
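Explanation: withUnsafeBufferPointer effectively says: "Keep this array's memory in place while the closure runs so I can show the C engine where the data lives." The pointer is only valid inside the closure, so the C side must not hold on to it after the call returns.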
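Because the pointer is only borrowed, a receiving function that needs the samples later has to copy them before returning. The sketch below shows one way to do that; SketchStream and sketch_add_audio are hypothetical names, and whether the real engine copies or consumes the audio immediately is not shown in the excerpts here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stream state that owns its own copy of the audio.
struct SketchStream {
    std::vector<float> pending;
};

// Append `count` samples from a borrowed pointer. Returns 0 on success, -1 on bad input.
int32_t sketch_add_audio(SketchStream &stream, const float *samples, uint64_t count) {
    if (count == 0) return 0;            // nothing to do
    if (samples == nullptr) return -1;   // a length without data is an error
    // Copy now: the caller's buffer may move or vanish once this call returns.
    stream.pending.insert(stream.pending.end(), samples, samples + count);
    return 0;
}
```

After the call, changes to the caller's array no longer affect the samples held in the stream.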
The Cross-Platform Bindings are the unsung heroes of the project. They ensure that app developers can simply call transcriber.start(), unaware of the complex pointer arithmetic happening in the background.
Congratulations! You have completed the Moonshine Architecture Tutorial.
We have traced the journey of a spoken word from the airwaves to the screen.
You are now ready to dig into the source code, build your own voice assistants, or contribute to the project! Happy coding!