In the previous chapter, Termination Conditions (The Stop Button), we learned how to act as a referee to stop a conversation.
However, up until now, we have mostly treated the conversation as simple strings of text ("Hello", "Stop"). As systems get complex, passing raw strings around becomes dangerous. Was that string a request to run code? Was it a search result? Was it an image?
In this chapter, we explore Messages and Events: the strict protocol that defines exactly what is being sent between agents.
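Imagine a post office where every letter is just a blank piece of paper with no envelope. The clerk has no way to tell what kind of mail it is or how to handle it without reading every page.
In software, sending raw text causes similar confusion: the receiver has to guess what each string is for.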
AutoGen solves this by wrapping content in Typed Messages. Think of these as official envelopes with colored stamps.
- TextMessage: A standard blue envelope for conversation.
- StopMessage: A red envelope that means "Halt."
- MultiModalMessage: A heavy parcel containing images.

By checking the type of the message (the envelope), the system knows exactly what to do without guessing.
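To make the envelope idea concrete, here is a minimal sketch of dispatching on a message's type rather than on its text. The two dataclasses are hypothetical stand-ins that only borrow the names used above (they are not the AutoGen classes), and `handle` is an illustrative helper, not part of any library.

```python
from dataclasses import dataclass


# Stand-in envelope types for illustration only; NOT the AutoGen classes.
@dataclass
class TextMessage:
    content: str
    source: str


@dataclass
class StopMessage:
    content: str
    source: str


def handle(message: object) -> str:
    """Decide what to do by inspecting the envelope, not the text inside it."""
    if isinstance(message, StopMessage):
        return "halt"
    if isinstance(message, TextMessage):
        return "continue"
    return "ignore"


# The word "Stop" inside ordinary text does not end the conversation.
chatty = TextMessage(content="Stop", source="user")
signal = StopMessage(content="Task completed.", source="Reviewer")
print(handle(chatty))
print(handle(signal))
```

With raw strings, both inputs would look like a request to stop; with typed envelopes, only the second one is.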
Everything in AutoGen's communication layer inherits from a base definition. There are two main families you need to know:
- Chat messages (BaseChatMessage): Messages meant for conversation. Agent A sends this to Agent B.
- Agent events (BaseAgentEvent): Signals about what is happening. Use these for logging or UI updates (e.g., "I am thinking...").

TextMessage is the bread and butter of the framework.
from autogen_agentchat.messages import TextMessage
# A standard message
msg = TextMessage(
    content="Hello, world!",
    source="user"
)
print(msg.type) # Output: TextMessage
print(msg.content) # Output: Hello, world!
Modern LLMs (like GPT-4o) have eyes. They can see images. We cannot simply paste an image into a text string, so we use a MultiModalMessage.
from autogen_agentchat.messages import MultiModalMessage
from autogen_core import Image
# Create a message with text AND an image
msg = MultiModalMessage(
    content=[
        "Describe this image.",
        Image.from_file("cat.png")
    ],
    source="user"
)
Why is this better? The Model Client (Chapter 2) sees this specific type and knows exactly how to format the image bytes for OpenAI or Google, saving you from complex encoding work.
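For a sense of the encoding work being avoided, the sketch below builds a request payload by hand. It assumes an OpenAI-style chat format in which an image travels as a base64 data URI inside an `image_url` content part; `image_part` is a hypothetical helper, and other providers expect different shapes. This is not how AutoGen's model clients are implemented, only an illustration of the kind of formatting they take off your hands.

```python
import base64


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Encode raw image bytes as a base64 data URI inside a content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# The eight-byte PNG file signature, used here as a tiny placeholder payload
# (an assumption for the example; a real call would read a full image file).
png_signature = b"\x89PNG\r\n\x1a\n"
payload = [
    {"type": "text", "text": "Describe this image."},
    image_part(png_signature),
]
print(payload[1]["image_url"]["url"])
```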
Some messages aren't for the user to read; they are instructions for the system.
- StopMessage: Used by Termination Conditions (Chapter 5) to signal the end.
- HandoffMessage: Used to transfer control to a specific agent (e.g., "I can't answer this, switching to the ExpertAgent").

from autogen_agentchat.messages import StopMessage
# The system sees this and shuts down the loop
stop_signal = StopMessage(
    content="Task completed successfully.",
    source="Reviewer"
)
While Messages are directed at other agents ("Please fix this code"), Events are broadcasts to the world ("I am currently generating code").
Events are crucial for building User Interfaces (UI). You don't want the user staring at a blank screen while the agent works. You want to show a spinner or a log.
from autogen_agentchat.messages import ThoughtEvent
# The agent broadcasts its internal monologue
event = ThoughtEvent(
    content="I need to calculate the radius first...",
    source="MathAgent"
)
# This is NOT sent to the other agent as chat.
# It is sent to the UI/Console to show progress.
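One way a consumer might separate the two families is sketched below. `ChatMessage`, `AgentEvent`, and `split_stream` are hypothetical stand-ins written for this example, not names imported from AutoGen; the point is only that a single type check decides whether an item goes to the progress display or into the conversation.

```python
from dataclasses import dataclass


# Stand-in base types for illustration; hypothetical, not imported from AutoGen.
@dataclass
class ChatMessage:
    content: str
    source: str


@dataclass
class AgentEvent:
    content: str
    source: str


def split_stream(items):
    """Send events to a progress log and keep chat messages as the transcript."""
    progress_log, transcript = [], []
    for item in items:
        if isinstance(item, AgentEvent):
            progress_log.append(f"[{item.source}] {item.content}")
        elif isinstance(item, ChatMessage):
            transcript.append(item)
    return progress_log, transcript


stream = [
    AgentEvent("I need to calculate the radius first...", "MathAgent"),
    ChatMessage("The radius is 4.", "MathAgent"),
]
progress_log, transcript = split_stream(stream)
print(progress_log)     # what a console or spinner label could display
print(len(transcript))  # how many items the next agent would receive
```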
Sometimes you don't want a poem or a sentence. You want strictly formatted data (JSON) to save into a database.
You can enforce this using StructuredMessage. This uses Pydantic (a Python data validation library) to define the "shape" of the data.
from pydantic import BaseModel
from autogen_agentchat.messages import StructuredMessage
# 1. Define the shape of data you want
class UserInfo(BaseModel):
    name: str
    age: int

# 2. Wrap it in a message
msg = StructuredMessage(
    content=UserInfo(name="Alice", age=30),
    source="DatabaseAgent"
)
print(msg.content.age) # Output: 30 (It's a real integer, not a string!)
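To see what "enforcing a shape" buys you, the sketch below does by hand the kind of check that Pydantic performs automatically. It deliberately uses only the standard library: the `UserInfo` dataclass and the `parse_user_info` helper are hypothetical stand-ins for this example, and Pydantic's real validation is far more thorough.

```python
import json
from dataclasses import dataclass


@dataclass
class UserInfo:
    name: str
    age: int


def parse_user_info(raw: str) -> UserInfo:
    """Parse a JSON string and reject payloads with the wrong shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    age = data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("age must be an integer")
    return UserInfo(name=data["name"], age=age)


user = parse_user_info('{"name": "Alice", "age": 30}')
print(user.age + 1)  # arithmetic works because age is a real integer

try:
    parse_user_info('{"name": "Alice", "age": "thirty"}')
except ValueError as error:
    print(f"rejected: {error}")
```

A malformed reply fails loudly at the boundary instead of surfacing later as a corrupt database row.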
How do these objects travel between agents, or even across the internet to a different server?
They must be Serialized (converted to a standard dictionary/JSON format) and Deserialized (converted back to a Python object).
Let's look at autogen_agentchat/messages.py.
Every message inherits from BaseMessage. This base class provides the magic dump and load methods.
# Simplified internal logic
from typing import Any, Dict

from pydantic import BaseModel

class BaseMessage(BaseModel):
    def dump(self) -> Dict[str, Any]:
        # Convert Python object to a Dictionary
        return self.model_dump(mode="json")
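A full round trip might look like the sketch below. To stay dependency-free it uses a standard-library dataclass as a stand-in for the Pydantic model, so `dump` and `load` here are illustrative re-implementations of the idea rather than the library's code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


# Stand-in message class for illustration; NOT the AutoGen class.
@dataclass
class TextMessage:
    content: str
    source: str
    type: str = "TextMessage"

    def dump(self) -> Dict[str, Any]:
        """Convert the object to a plain, JSON-friendly dictionary."""
        return asdict(self)

    @classmethod
    def load(cls, data: Dict[str, Any]) -> "TextMessage":
        """Rebuild the object from a dictionary produced by dump()."""
        return cls(**data)


original = TextMessage(content="Hello, world!", source="user")
wire = json.dumps(original.dump())             # serialize: object -> dict -> JSON text
restored = TextMessage.load(json.loads(wire))  # deserialize: JSON text -> dict -> object
print(restored == original)
```

Note that the `type` field travels inside the payload; that label is what lets the receiving side pick the right class.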
When a message is received (e.g., from a database or a network stream), the system doesn't know what it is yet. It uses the MessageFactory to inspect the type field.
# Simplified logic from MessageFactory
from typing import Any, Dict

class MessageFactory:
    def __init__(self):
        # A registry of all known envelopes
        self._types = {
            "TextMessage": TextMessage,
            "StopMessage": StopMessage,
            # ...
        }

    def create(self, data: Dict[str, Any]):
        # 1. Peek at the label
        msg_type = data.get("type")
        # 2. Find the matching class
        cls = self._types[msg_type]
        # 3. Create the object
        return cls.load(data)
This factory pattern ensures that if you add a new custom message type, the system can still safely load and understand it as long as it is registered.
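The registration step could be sketched as follows. `SimpleMessageFactory`, its `register` method, and `InvoiceMessage` are hypothetical names invented for this example, and the dataclasses stand in for the real message classes; the sketch only shows why an unregistered type cannot be loaded and a registered one can.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class TextMessage:  # stand-in, not the AutoGen class
    content: str
    source: str
    type: str = "TextMessage"


@dataclass
class InvoiceMessage:  # hypothetical custom message type
    content: str
    source: str
    type: str = "InvoiceMessage"


class SimpleMessageFactory:
    def __init__(self) -> None:
        # Registry mapping the "type" label to the class that can rebuild it.
        self._types: Dict[str, Type[Any]] = {"TextMessage": TextMessage}

    def register(self, message_type: Type[Any]) -> None:
        self._types[message_type.__name__] = message_type

    def create(self, data: Dict[str, Any]) -> Any:
        msg_type = data.get("type")
        if msg_type not in self._types:
            raise ValueError(f"Unknown message type: {msg_type!r}")
        return self._types[msg_type](**data)


factory = SimpleMessageFactory()
data = {"type": "InvoiceMessage", "content": "Total: 42", "source": "Billing"}
try:
    factory.create(data)  # fails: the factory has never seen this label
except ValueError as error:
    print(error)

factory.register(InvoiceMessage)
message = factory.create(data)  # succeeds once the type is registered
print(type(message).__name__)
```

Raising a clear error for unknown labels, rather than letting a bare dictionary lookup fail, is a design choice of this sketch.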
In this chapter, we learned:
- MultiModalMessage allows agents to see images.
- Events (such as ThoughtEvent) are for observing the agent's progress, distinct from conversational Messages.

We have now covered the Actors (Agents), Brains (Models), Hands (Tools), Orchestration (Teams), Rules (Termination), and Language (Messages).
There is one final piece of the puzzle. Who actually delivers these messages? Who manages the memory and the background processes?
Next Chapter: Agent Runtime (The Infrastructure)
Generated by Code IQ