Chapter 1: Agent Adapters

Welcome to the world of airi! If you are just getting started, you might be wondering: "How does an AI actually 'live' on the internet?"

Imagine an AI as a pure Brain. It can think, write poetry, and solve math problems, but it floats in a void. It has no hands to type, no eyes to read, and no voice to speak.

Agent Adapters are the physical "bodies" or "suits" that the Brain wears.

In this chapter, we will learn how to build these bodies so our AI can interact with the real world.

The Motivation: Speaking the Right Language

Let's look at a simple use case: We want our AI to reply "Hello!" to a user.

This sounds simple, but every platform is different:

  1. Telegram expects an HTTP request to api.telegram.org.
  2. Discord keeps a WebSocket connection open and sends JSON packets.
  3. Minecraft sends binary packets over TCP.

We don't want our "Brain" to worry about binary packets or WebSockets. We want the Brain to just say: "Send 'Hello' to user Bob."

The Agent Adapter handles the messy details of the specific platform so the Brain doesn't have to.
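One way to picture this split is a small shared interface that every adapter implements. The sketch below is illustrative only: `ChatAdapter`, `FakeTelegramAdapter`, `FakeDiscordAdapter`, and `sayHello` are hypothetical names rather than airi APIs, and the fake adapters record what they would send instead of touching the network.

```typescript
// Hypothetical sketch: these names are illustrative, not taken from the airi codebase.

// The only thing the Brain needs to know about any platform.
interface ChatAdapter {
  platform: string
  sendText(userId: string, text: string): void
}

// Stand-in for a Telegram body: records the HTTP-style call it would make.
class FakeTelegramAdapter implements ChatAdapter {
  platform = 'telegram'
  sent: string[] = []
  sendText(userId: string, text: string): void {
    this.sent.push(`POST sendMessage chat_id=${userId} text=${text}`)
  }
}

// Stand-in for a Discord body: records the JSON payload it would send.
class FakeDiscordAdapter implements ChatAdapter {
  platform = 'discord'
  sent: string[] = []
  sendText(userId: string, text: string): void {
    this.sent.push(JSON.stringify({ recipient: userId, content: text }))
  }
}

// The "Brain" side: it has no idea which platform it is talking to.
function sayHello(adapter: ChatAdapter, userId: string): void {
  adapter.sendText(userId, 'Hello!')
}

const telegramBody = new FakeTelegramAdapter()
const discordBody = new FakeDiscordAdapter()
sayHello(telegramBody, 'bob')
sayHello(discordBody, 'bob')

console.log(telegramBody.sent[0])
console.log(discordBody.sent[0])
```

The same `sayHello` call produces two different platform-level messages, which is the point of the adapter layer: the Brain's intent stays constant while the wire format varies.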

How to Use: Setting Up an Adapter

To bring an adapter to life, we need to initialize it with the specific credentials (keys) for that platform. Let's look at the Telegram Adapter as our example.

1. The Entry Point

Every adapter starts like a standard computer program. It needs to wake up and connect to the platform.

Here is a simplified look at how the Telegram bot starts (derived from services/telegram-bot/src/index.ts):

// Import the setup function for the specific platform
import { startTelegramBot } from './bots/telegram'
import { initDb } from './db'

async function main() {
  // 1. Prepare the database to remember conversations
  await initDb()

  // 2. Wake up the Telegram Adapter body
  await startTelegramBot()
}

// 3. Run the program
main().catch(console.error)

Explanation: This code is the "on switch." It initializes the database (so the AI remembers who it is talking to) and then runs startTelegramBot(), which connects to Telegram's servers.

2. Configuration

An adapter acts as a bridge. On one side is the Platform (Telegram), and on the other side is the AI logic.

In the Discord Adapter, this setup is very explicit (from services/discord-bot/src/index.ts):

// Create a new Discord "Body"
const adapter = new DiscordAdapter({
  // The ID card for Discord
  discordToken: env.DISCORD_TOKEN,

  // The ID card for talking to the Airi Brain
  airiToken: env.AIRI_TOKEN,
  airiUrl: 'ws://localhost:6121/ws',
})

// Connect to the world
await adapter.start()

Explanation: Notice that the adapter needs two sets of keys: one to talk to the outside world (discordToken) and one to talk to the internal AI system (airiToken).
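Because a missing key usually only shows up as a confusing connection error later, it can help to check both sets of keys before starting the adapter. The following is a minimal sketch under stated assumptions: `loadCredentials` and `AdapterCredentials` are hypothetical helpers, and `AIRI_URL` is an assumed variable name that does not appear in the original code.

```typescript
// Hypothetical helper: not part of airi. It only illustrates the "two sets of keys" idea.
interface AdapterCredentials {
  discordToken: string
  airiToken: string
  airiUrl: string
}

function loadCredentials(env: Record<string, string | undefined>): AdapterCredentials {
  // Fail fast, listing every key that is missing or empty.
  const missing = ['DISCORD_TOKEN', 'AIRI_TOKEN'].filter(key => !env[key])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return {
    discordToken: env['DISCORD_TOKEN'] as string,
    airiToken: env['AIRI_TOKEN'] as string,
    // AIRI_URL is an assumed name; fall back to the local address from the example above.
    airiUrl: env['AIRI_URL'] ?? 'ws://localhost:6121/ws',
  }
}

const creds = loadCredentials({ DISCORD_TOKEN: 'discord-secret', AIRI_TOKEN: 'airi-secret' })
console.log(creds.airiUrl)
```

In a real service you would likely pass `process.env` instead of an object literal; the literal is used here so the sketch runs on its own.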

Internal Implementation: Under the Hood

What happens when you send a message to the bot? The Adapter acts as a translator.

The Flow of Information

  1. Input: The Adapter receives a message from the platform.
  2. Processing: It sends this text to The Cognitive Brain to decide what to say.
  3. Output: The Brain gives a response, and the Adapter sends it back to the platform.

Here is a diagram showing how the Adapter sits in the middle:

sequenceDiagram
    participant User
    participant Adapter as Agent Adapter (Telegram)
    participant Brain as Cognitive Brain
    User->>Adapter: Sends "Hello!"
    Note right of Adapter: Adapter converts Telegram JSON<br/>into simple text.
    Adapter->>Brain: Here is the chat history + "Hello!"
    Brain->>Adapter: Reply with "Nice to meet you!"
    Note right of Adapter: Adapter converts text back<br/>into Telegram API call.
    Adapter->>User: Sends "Nice to meet you!"
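The three steps of the flow can be sketched as a single function. This is a simplified illustration, not the real implementation: `IncomingUpdate`, `OutgoingCall`, `Brain`, `stubBrain`, and `handleUpdate` are hypothetical names, and the stub Brain returns canned replies so the example runs without an LLM.

```typescript
// Hypothetical sketch of the translator role. The types are simplified
// stand-ins, not the real Telegram or airi data structures.

// A trimmed-down picture of what a platform might deliver.
interface IncomingUpdate {
  chat: { id: number }
  from: { username: string }
  text: string
}

// What the adapter would hand to the platform's API on the way out.
interface OutgoingCall {
  method: 'sendMessage'
  chatId: number
  text: string
}

// The Brain only ever sees plain text.
type Brain = (chatHistory: string[], input: string) => string

// Stub Brain so the example runs without an LLM.
const stubBrain: Brain = (_chatHistory, input) =>
  input === 'Hello!' ? 'Nice to meet you!' : 'Tell me more.'

function handleUpdate(update: IncomingUpdate, chatHistory: string[], brain: Brain): OutgoingCall {
  // 1. Input: strip the platform envelope down to simple text.
  const input = update.text
  // 2. Processing: let the Brain decide what to say.
  const reply = brain(chatHistory, input)
  // Keep both sides of the exchange for the next turn.
  chatHistory.push(`${update.from.username}: ${input}`, `bot: ${reply}`)
  // 3. Output: wrap the reply in a platform-specific call.
  return { method: 'sendMessage', chatId: update.chat.id, text: reply }
}

const chatLog: string[] = []
const outgoing = handleUpdate(
  { chat: { id: 42 }, from: { username: 'bob' }, text: 'Hello!' },
  chatLog,
  stubBrain,
)
console.log(outgoing)
```

Passing the Brain in as a parameter keeps the adapter testable: the platform translation can be exercised with a stub, and the real model can be swapped in later.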

Deep Dive: The "Action" Code

Let's look at how the Adapter actually performs an action, like sending a message. This logic resides deep inside the adapter's code.

The following example is simplified from services/telegram-bot/src/bots/telegram/agent/actions/send-message.ts.

Step 1: Generating the Thought

First, the adapter asks the LLM (Large Language Model) to generate a response based on the conversation.

// Prepare the request for the Brain
const req = {
  apiKey: env.LLM_API_KEY!,
  model: env.LLM_MODEL!,
  // Format the history so the AI understands it
  messages: message.messages(
    message.system(await messageSplit()), 
    message.user(responseText) // The user's input
  ),
}

// Ask the AI what to write
const res = await generateText(req)

Explanation: The adapter gathers the context and sends it to the AI model. It waits for the text generation to finish.
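To make "formatting the history" more concrete, here is a small sketch of the role/content message list that many chat-completion APIs accept. It assumes an OpenAI-style shape; `ChatMessage` and `buildMessages` are hypothetical and may not match what the `message` helpers in the snippet actually produce.

```typescript
// Assumption: an OpenAI-style role/content message list. The real helpers may differ.
type ChatRole = 'system' | 'user' | 'assistant'

interface ChatMessage {
  role: ChatRole
  content: string
}

// Put the instructions first, then earlier turns, then the newest user input.
function buildMessages(systemPrompt: string, pastTurns: ChatMessage[], userInput: string): ChatMessage[] {
  return [
    { role: 'system', content: systemPrompt },
    ...pastTurns,
    { role: 'user', content: userInput },
  ]
}

const requestMessages = buildMessages(
  'Split your reply into short chat messages.',
  [
    { role: 'user', content: 'Hi' },
    { role: 'assistant', content: 'Hey!' },
  ],
  'How are you?',
)
console.log(requestMessages.map(m => m.role).join(' > '))
// prints "system > user > assistant > user"
```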

Step 2: Acting on the Environment

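Once the AI has text (e.g., "Hello there!"), the Adapter must translate that into a Telegram-specific command. The snippet below assumes the generated text has already been parsed into structuredMessage.messages, a list of short messages to send one at a time.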

// Loop through the generated sentences
for (const item of structuredMessage.messages) {
  
  // Simulate human typing (Show "Typing..." status)
  await botContext.bot.api.sendChatAction(chatId, 'typing')
  
  // Wait a bit to make it feel natural
  await sleep(item.length * 50)

  // THE ACTION: Actually call the Telegram API
  await botContext.bot.api.sendMessage(chatId, item)
}

Explanation: This is the core purpose of the Adapter!

  1. It calls sendChatAction (Specific to Telegram).
  2. It calls sendMessage (Specific to Telegram).

If this were the Minecraft Adapter, these lines would be replaced with code to swing a sword or place a block. The AI "thought" remains the same, but the "action" changes based on the adapter.
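The typing delay in the snippet grows linearly with message length, so a long message could pause for many seconds. Telegram's typing indicator also lasts only about five seconds per `sendChatAction` call, so a capped delay is one reasonable refinement. The helper below is a hypothetical sketch: the 50 ms-per-character rate comes from the snippet above, while `typingDelayMs` and the 5000 ms cap are assumptions added for illustration.

```typescript
// Hypothetical helper: the rate matches the snippet above, the cap is an assumption.
const MS_PER_CHAR = 50
const MAX_DELAY_MS = 5000

// Delay proportional to message length, clamped so long messages do not stall the chat.
function typingDelayMs(text: string): number {
  return Math.min(text.length * MS_PER_CHAR, MAX_DELAY_MS)
}

console.log(typingDelayMs('Hello there!')) // 12 characters * 50 ms = 600
console.log(typingDelayMs('x'.repeat(400))) // 20000 ms uncapped, clamped to 5000
```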

Storing Memories

The Adapter doesn't just pass messages; it also helps record history. In the code, you might see references to recordMessage. This connects to the Central Data & Identity Server, ensuring that no matter which "body" the AI is wearing, it remembers its friends.
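The idea of one memory shared across bodies can be sketched with an in-memory store keyed by person rather than by platform. Everything here is assumed for illustration: the real recordMessage talks to a server and its signature is not shown in this chapter, so `MemoryStore`, `record`, and `recall` are hypothetical names.

```typescript
// Hypothetical in-memory stand-in for a shared history service.
interface StoredMessage {
  platform: string
  text: string
}

class MemoryStore {
  // Keyed by person, not by platform, so history follows the user across bodies.
  private byPerson = new Map<string, StoredMessage[]>()

  record(personId: string, platform: string, text: string): void {
    const entries = this.byPerson.get(personId) ?? []
    entries.push({ platform, text })
    this.byPerson.set(personId, entries)
  }

  recall(personId: string): StoredMessage[] {
    return this.byPerson.get(personId) ?? []
  }
}

const memory = new MemoryStore()
memory.record('bob', 'telegram', 'Hello!')
memory.record('bob', 'discord', 'It is me again.')
console.log(memory.recall('bob').map(m => `${m.platform}: ${m.text}`))
```

Both messages come back from a single lookup even though they arrived through different adapters, which is the behavior the shared server is meant to provide.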

Summary

Agent Adapters are the interface between the AI and the specific platform it lives on.

  1. They wrap the complex API logic (Telegram, Discord, etc.).
  2. They hold the platform credentials and manage connections.
  3. They take the pure text thoughts from the AI and turn them into platform-specific actions (like sending a message or moving a character).

Now that we have a body, we need to understand how the AI actually thinks and decides what to say.

Next Chapter: The Cognitive Brain

