In the previous chapter, Model Clients (The Brains), we gave our agents the ability to think using Large Language Models (LLMs).
However, a brain in a jar cannot change a lightbulb. Even the smartest LLM has limitations: it can only produce text. It cannot actually run a program, check its own arithmetic, or touch the file system.
To solve this, we give agents Code Execution capabilities: the "Hands."
One of AutoGen's most powerful features is allowing agents to write code and execute it immediately.
Imagine you ask an agent: "What is the 100th Fibonacci number?" Rather than guessing the digits from memory, an agent with hands can write a short program, run it, and report the exact result.
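The kind of program the agent might write is simple (this is a hand-written illustration, not actual agent output):

```python
def fib(n: int) -> int:
    # Iteratively compute the n-th Fibonacci number (fib(0) == 0, fib(1) == 1).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(100))  # 354224848179261915075
```

The LLM only has to produce this code correctly; the exact 21-digit answer comes from executing it, not from the model's memory.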
If an AI writes code, you don't want it running directly on your laptop's main operating system. What if it accidentally deletes your files?
AutoGen provides a Sandbox (usually using Docker). It's like a chemistry lab fume hood. The agent can make a mess, run experiments, and create files inside the container, but your personal computer remains safe.
To give an agent hands, we use a Code Executor. The most robust option is the DockerCommandLineCodeExecutor.
You need Docker Desktop installed and running on your machine for this to work.
First, we create the environment where the code will run.
```python
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

# Create a "computer within a computer"
executor = DockerCommandLineCodeExecutor(
    image="python:3-slim",  # The environment (OS + Python)
    work_dir="coding",      # Where files are saved
)

# Start the Docker container
await executor.start()
```
Explanation:

- `image`: We tell Docker to download a lightweight version of Linux with Python 3 installed.
- `work_dir`: Any code the agent writes will be saved in a local folder named `coding`.

Normally, an Agent (Chapter 1) would generate the code. But to understand the "Hands," let's manually give the executor some work.
```python
from autogen_core import CancellationToken
from autogen_core.code_executor import CodeBlock

# Imagine the Agent wrote this:
code = "print('Hello from inside the Docker container!')"

# Wrap it in a CodeBlock
block = CodeBlock(language="python", code=code)

# Run it!
result = await executor.execute_code_blocks(
    code_blocks=[block],
    cancellation_token=CancellationToken(),
)
print(result.output)
```
Output:

```
Hello from inside the Docker container!
```
What just happened? The executor saved your code string to a file in the `coding` folder, ran it inside the Docker container, and captured whatever it printed.
The real magic happens when you combine the Agent (Chapter 1) with the Executor. This creates a feedback loop:
If the code has an error (e.g., a syntax error), the Executor returns the error message. The Agent sees this, thinks "Oops, I made a typo," corrects the code, and runs it again. This allows agents to debug themselves.
How does the text move from the Agent to the Sandbox and back?
(Sequence diagram elided: the Agent hands a code block to the Executor, which runs it inside the Docker container and returns the output to the Agent.)
Let's look inside autogen_ext/code_executors/docker/_docker_code_executor.py to see how this is built.
The core logic resides in the _execute_code_dont_check_setup method. It performs three main steps.
The executor takes the code string and saves it to a file so the operating system can read it.
```python
# Simplified from _docker_code_executor.py
for code_block in code_blocks:
    # 1. Determine the filename (random or specified)
    filename = f"tmp_code_{hash}.py"

    # 2. Write the code to the workspace on your disk
    code_path = self.work_dir / filename
    with code_path.open("w") as fout:
        fout.write(code_block.code)
```
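The `{hash}` placeholder above is derived from the code itself, so the same code block maps to the same file. A plausible standalone sketch of that scheme (the real implementation may pick names differently, e.g. randomly):

```python
import hashlib

def filename_for(code: str, extension: str = "py") -> str:
    # Hash the code so identical blocks reuse the same temp file.
    digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
    return f"tmp_code_{digest}.{extension}"

print(filename_for("print('hi')"))
```

Deterministic names also make it easy to find the file later in the `coding` folder if you want to inspect what the agent ran.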
It determines how to run that file. If it's Python, it uses python. If it's a shell script, it uses sh.
```python
# Simplified command generation
command = ["timeout", "60", "python", filename]
```
The `timeout` prefix kills the process after 60 seconds, so an agent that accidentally writes an infinite loop cannot hang the whole system.
Finally, it uses the Docker library to execute that command inside the isolated container.
```python
# Simplified execution logic
exec_result = container.exec_run(command)

# Decode the result (bytes to string)
output = exec_result.output.decode("utf-8")
exit_code = exec_result.exit_code
```
When using Docker, it is polite to clean up resources so you don't have unused containers eating up your RAM.
```python
await executor.stop()
```
While Python is the most common language for AI agents, the DockerCommandLineCodeExecutor is flexible. It determines how to run code based on the language label:
- `python` -> Runs `python <file>`
- `bash`, `sh`, `shell` -> Runs `sh <file>`
- `pwsh` (PowerShell) -> Runs `pwsh <file>` (if installed in the image)
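That dispatch can be sketched as a small lookup (an illustration of the rule above, not the library's exact code):

```python
def command_for(language: str, filename: str, timeout_s: int = 60) -> list[str]:
    # Map a code block's language label to the program that runs it.
    if language == "python":
        program = "python"
    elif language in ("bash", "sh", "shell"):
        program = "sh"
    elif language == "pwsh":
        program = "pwsh"  # requires PowerShell inside the image
    else:
        raise ValueError(f"unsupported language: {language}")
    # Guard against infinite loops with a timeout, as described above.
    return ["timeout", str(timeout_s), program, filename]

print(command_for("python", "tmp_code_abc.py"))
# -> ['timeout', '60', 'python', 'tmp_code_abc.py']
```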
(Note: For .NET developers, AutoGen also supports dotnet-interactive kernels to run C# code interactively, as seen in InteractiveService.cs, allowing for similar capabilities in a .NET environment.)
In this chapter, we learned how to give agents "Hands": a Code Executor that runs agent-written code safely inside an isolated Docker container, and how that executor writes code to files, builds commands with a timeout, and runs them under the hood. Combined with the Model Client "Brains" from the previous chapter, we now have an agent that can both think and act.
We can build a single smart agent. But complex tasks usually require a team of specialists. How do we make multiple agents talk to each other?
Next Chapter: Teams and Group Chats (The Orchestration)
Generated by Code IQ