In Chapter 4: Sub-Agent Execution System, we built a team of Sub-Agents capable of writing code and executing complex tasks in the background.
But this creates a terrifying problem.
If an AI Agent decides to "clean up disk space" and runs rm -rf /, and that Agent is running directly on your laptop... it deletes your files.
In this chapter, we introduce the Sandboxed Environment. This is the safety layer that ensures deer-flow is a helpful assistant, not a digital hazard.
Imagine a scientist studying a dangerous virus. They don't do it at their kitchen table. They work in a Bio-Secure Lab with thick glass walls and robotic arms.
The Scientist can see inside and manipulate things using the arms, but nothing inside the lab can escape to infect the outside world.
User: "Write a Python script that counts to infinity."
If you run this locally:
With a Sandbox:
Sandbox Abstraction
In deer-flow, we don't hard-code "Docker" everywhere. We create a generic interface (a standard set of controls) that any secure environment must obey.
This is defined in backend/src/sandbox/sandbox.py.
Think of the Sandbox class as a universal remote control for our "Bio-Secure Lab."
Simplified Interface (sandbox.py):
from abc import ABC, abstractmethod

class Sandbox(ABC):
    @abstractmethod
    def execute_command(self, command: str) -> str:
        """Run a terminal command (e.g., 'ls', 'python main.py')"""
        pass

    @abstractmethod
    def write_file(self, path: str, content: str) -> None:
        """Create a file inside the box"""
        pass

    @abstractmethod
    def read_file(self, path: str) -> str:
        """Read a file from inside the box"""
        pass
Explanation: The Lead Agent uses these three buttons. It doesn't care if the sandbox is a Docker container, a Kubernetes Pod, or a Firecracker MicroVM. It just knows it can Execute, Read, and Write.
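To make the "universal remote" idea concrete, here is a minimal sketch of a second backend that plugs into the same interface. InMemorySandbox and save_and_run are hypothetical names invented for this illustration (they are not part of deer-flow); the point is only that calling code written against Sandbox works unchanged with any implementation.

```python
from abc import ABC, abstractmethod


class Sandbox(ABC):
    """Same three-method interface as the simplified sandbox.py above."""

    @abstractmethod
    def execute_command(self, command: str) -> str: ...

    @abstractmethod
    def write_file(self, path: str, content: str) -> None: ...

    @abstractmethod
    def read_file(self, path: str) -> str: ...


class InMemorySandbox(Sandbox):
    """Hypothetical test double: keeps files in a dict and never runs anything."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}
        self.commands: list[str] = []

    def execute_command(self, command: str) -> str:
        # Record the command instead of executing it
        self.commands.append(command)
        return f"(simulated) {command}"

    def write_file(self, path: str, content: str) -> None:
        self.files[path] = content

    def read_file(self, path: str) -> str:
        return self.files[path]


def save_and_run(sandbox: Sandbox, path: str, code: str) -> str:
    """Caller code depends only on the abstract interface, not on Docker."""
    sandbox.write_file(path, code)
    return sandbox.execute_command(f"python {path}")


sandbox = InMemorySandbox()
output = save_and_run(sandbox, "main.py", "print('hi')")
```

Swapping InMemorySandbox for a container-backed class would not require any change to save_and_run, which is the property the abstraction is meant to provide.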
Let's see what happens when the Lead Agent wants to run that dangerous code.
How do we actually implement this "Lab"? deer-flow uses container technology.
When the system starts up (or when a task begins), we spin up a lightweight Linux environment.
In our docker-compose-dev.yaml, we use a specific image for this:
# Inside docker-compose-dev.yaml
provisioner:
  environment:
    - SANDBOX_IMAGE=all-in-one-sandbox:latest
This image is a stripped-down Linux OS containing:
When the Agent calls execute_command("python script.py"), we don't run os.system() (which is dangerous). We use the container's API.
Simplified Logic (Conceptual):
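import docker

docker_client = docker.from_env()  # Connects to the local Docker daemon

class DockerSandbox(Sandbox):
    def execute_command(self, command: str) -> str:
        # We tell Docker to run this inside the specific container ID
        container = docker_client.containers.get(self.id)
        # This runs INSIDE the box, not on the host
        result = container.exec_run(command)
        # exec_run returns (exit_code, output); output is raw bytes
        return result.output.decode("utf-8")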
Explanation: The exec_run function acts like a portal. It teleports the command into the container, runs it there, and teleports the text output back.
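The text that comes back through the portal is only half the story: exec_run also reports the command's exit code, and with demux=True the Docker SDK for Python returns stdout and stderr separately. The sketch below shows one way to surface all three. CommandResult and run_in_container are hypothetical helpers for illustration, not deer-flow code.

```python
from dataclasses import dataclass


@dataclass
class CommandResult:
    exit_code: int
    stdout: str
    stderr: str


def run_in_container(container, command: str) -> CommandResult:
    """Run a command via a Docker SDK container object and keep the streams apart.

    `container` is expected to behave like docker.models.containers.Container.
    """
    # demux=True makes the output a (stdout, stderr) pair; either side may be None
    exit_code, (stdout, stderr) = container.exec_run(command, demux=True)
    return CommandResult(
        exit_code=exit_code,
        stdout=(stdout or b"").decode("utf-8"),
        stderr=(stderr or b"").decode("utf-8"),
    )
```

With a real container, this could be called as run_in_container(docker.from_env().containers.get(container_id), "python script.py"), letting the Agent distinguish a failed script (non-zero exit code, traceback on stderr) from a successful one.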
The Agent needs to see files, but only specific files. We use Volume Mounting.
Imagine a hotel room (The Sandbox).
In docker-compose-dev.yaml:
volumes:
  - ${DEER_FLOW_ROOT}/backend/.deer-flow/threads:/root/workspace
Host path: ${DEER_FLOW_ROOT}/.../threads (a specific folder on your disk).
Container path: /root/workspace (where the Agent thinks it lives).
If the Agent runs rm -rf /root/workspace, it only deletes the temporary thread files, not your actual project code.
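The mapping between the two sides of the mount can be sketched in a few lines. This is an illustrative helper under stated assumptions, not deer-flow code: to_host_path is a hypothetical function, and /srv/threads in the usage line stands in for whatever host folder is actually mounted. It translates a path the Agent sees into the corresponding host path and refuses anything that resolves outside /root/workspace.

```python
import posixpath
from pathlib import PurePosixPath

# Where the Agent thinks it lives (the container side of the volume mount)
CONTAINER_ROOT = PurePosixPath("/root/workspace")


def to_host_path(container_path: str, host_root: str) -> str:
    """Translate a container path into the matching path under the host mount."""
    # Collapse "." and ".." segments so "../" tricks cannot escape the workspace
    normalized = PurePosixPath(posixpath.normpath(container_path))
    try:
        relative = normalized.relative_to(CONTAINER_ROOT)
    except ValueError:
        raise ValueError(f"{container_path} is outside the mounted workspace") from None
    return str(PurePosixPath(host_root) / relative)


host_file = to_host_path("/root/workspace/notes/a.txt", "/srv/threads")
```

Docker enforces the boundary on its own; a check like this is only useful on the host side, for example when the backend reads files the Agent produced.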
The Sandbox isn't just about file isolation; it enforces behavior limits.
In config.yaml, we define how long a command is allowed to run.
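# Simplified Logic
# Note: the Docker SDK's exec_run() has no timeout argument, so this sketch
# assumes the sandbox image ships the coreutils `timeout` command.
def execute_with_timeout(container, command: str, limit_seconds: int = 30) -> str:
    # Allow 30 seconds max; `timeout` kills the command when the limit is hit
    result = container.exec_run(["timeout", str(limit_seconds), "sh", "-c", command])
    if result.exit_code == 124:  # coreutils `timeout` exits with 124 on expiry
        container.kill()  # Emergency Stop
        return "Error: Execution took too long."
    return result.output.decode("utf-8")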
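The same time-limit pattern can be tried without Docker. The following is a minimal sketch that uses a plain local Python process as a stand-in for the sandboxed command, so it demonstrates the timeout mechanics only, not isolation. run_with_timeout is a hypothetical helper, and the half-second limit is chosen just to keep the demo fast.

```python
import subprocess
import sys


def run_with_timeout(args: list[str], limit_seconds: float) -> str:
    """Run a command and give up if it exceeds the time limit."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on expiry
        completed = subprocess.run(
            args, capture_output=True, text=True, timeout=limit_seconds
        )
    except subprocess.TimeoutExpired:
        return "Error: Execution took too long."
    return completed.stdout


# The "count to infinity" script from the start of the chapter, reduced to a busy loop
runaway = [sys.executable, "-c", "while True: pass"]
message = run_with_timeout(runaway, limit_seconds=0.5)
```

The runaway loop is stopped after the limit and the caller gets an error string instead of hanging forever, which is the behavior the sandbox timeout is meant to guarantee.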
We can configure the sandbox to have No Internet Access. This prevents a malicious script (or a confused AI) from uploading your data to a random server.
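One way to get this behavior with the Docker SDK for Python is the network_mode="none" option, which starts the container with no network interfaces besides loopback. The sketch below is an assumption about how such a container could be launched, not a description of deer-flow's provisioner: sandbox_run_options and start_offline_sandbox are hypothetical helpers, and the "sleep infinity" keep-alive command assumes the image includes coreutils.

```python
def sandbox_run_options(image: str, host_dir: str) -> dict:
    """Build keyword arguments for docker's containers.run() (illustrative only)."""
    return {
        "image": image,
        "command": "sleep infinity",  # Assumed keep-alive so exec_run can be used later
        "detach": True,               # Return a Container object immediately
        "network_mode": "none",       # No network access from inside the box
        "volumes": {host_dir: {"bind": "/root/workspace", "mode": "rw"}},
    }


def start_offline_sandbox(image: str, host_dir: str):
    """Start the container; requires the `docker` package and a running daemon."""
    import docker  # Imported lazily so the options helper works without Docker

    client = docker.from_env()
    return client.containers.run(**sandbox_run_options(image, host_dir))


options = sandbox_run_options("all-in-one-sandbox:latest", "/tmp/threads")
```

Inside a container started this way, an attempt to reach an external server fails at the network level, regardless of what the script or the Agent intended.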
In this chapter, we secured our system.
The abstraction (Sandbox class) provides a standard remote control (Read, Write, Execute).
Now we have a system that can:
But there is one problem left. If the AI learns something important about you (e.g., "The user hates Java"), it forgets it as soon as the chat history gets too long. We need a way to store facts permanently.
Next Chapter: Long-Term Memory Updater
Generated by Code IQ