Chapter 5: Structured Memory Systems

In the previous chapter, Interleaved Thinking, we taught our agent to "think" before it acts. It can now plan complex tasks and catch its own mistakes.

However, we still have a major flaw. As soon as the script finishes running, the agent forgets everything. If you tell the agent your name is "Sarah" today, and come back tomorrow, it will ask: "Who are you?"

The Problem: The "50 First Dates" Syndrome

Current LLMs have amnesia. The context window works like RAM: fast while the session runs, but wiped the moment it ends. Long-term memory needs a hard drive.

Use Case: The Family Doctor Agent

Imagine you are building a medical assistant.

  1. Session 1 (Monday): You tell the agent, "I am severely allergic to penicillin."
  2. Session 2 (Friday): You ask, "I have strep throat, what should I take?"

Without Memory: The agent sees "strep throat" -> retrieves standard medical advice -> suggests Amoxicillin (a penicillin derivative). This is dangerous.

With Memory: The agent looks up your profile -> sees "Allergy: Penicillin" -> suggests Azithromycin instead.

In this chapter, we will build a Structured Memory System to save the agent from making dangerous mistakes.

Key Concepts

We don't just want to save the entire chat log (that's messy and expensive). We want to save Structured Facts.

1. The Append-Only Log (The Journal)

The simplest memory. We just write down everything that happened in a file (usually .jsonl). It is perfect for auditing, but hard for the AI to read quickly because it gets too long.
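As a sketch of what the journal could look like, here is a minimal append-only logger. The event shape (role, content, timestamp) and file path are illustrative assumptions, not a fixed format:

```python
import json
import os
import tempfile

# A minimal append-only journal: every event becomes one JSON line.
# The role/content/ts fields are an illustrative shape, not canonical.
def log_event(path, role, content, timestamp):
    entry = {"role": role, "content": content, "ts": timestamp}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

path = os.path.join(tempfile.mkdtemp(), "journal.jsonl")
log_event(path, "user", "I am allergic to penicillin.", "2023-10-27T09:00:00")
log_event(path, "assistant", "Noted.", "2023-10-27T09:00:05")

with open(path) as f:
    events = [json.loads(line) for line in f]
print(len(events))  # 2 entries, in insertion order
```

Because we only ever append, the file is a faithful audit trail, but reading it back means scanning every line ever written.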

2. The Semantic Search (The Librarian)

We turn text into numbers (vectors). When you ask a question, the system finds past notes that "sound similar" to your question. This is often called RAG (Retrieval-Augmented Generation).
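To make the idea concrete, here is a toy version of "sounds similar" using bag-of-words counts as stand-in vectors. Real systems use learned embedding models; this sketch only illustrates the similarity ranking:

```python
import math
from collections import Counter

# Toy "embedding": word counts. Real RAG systems use a trained
# embedding model; this only demonstrates the retrieval idea.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

notes = [
    "User is allergic to penicillin",
    "User lives in Seattle",
    "User prefers morning appointments",
]

query = "is the user allergic to anything"
q = embed(query)
best = max(notes, key=lambda n: cosine(q, embed(n)))
print(best)  # User is allergic to penicillin
```

The note with the most overlapping vocabulary ranks highest, which is the same principle a real vector database applies in a learned embedding space.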

3. The Knowledge Graph (The Mind Map)

This is the most powerful form of structured memory. We store specific relationships between entities.

This structure allows the agent to answer specific questions like "What is Sarah allergic to?" with an exact lookup instead of a guess.

Building a Fact-Based Memory System

Let's build a simple Knowledge Graph system for our Doctor Agent. We will treat memory as a Tool that the agent can read from and write to.

Step 1: Defining the "Fact"

We need a strict format. We don't want the agent scribbling random notes. We want "Triples" (Subject -> Relation -> Object).

# A single memory unit
fact = {
    "subject": "User",
    "relation": "has_allergy",
    "object": "Penicillin",
    "valid_from": "2023-10-27" 
}

Explanation: By forcing this structure, we make it easy for the code to filter data later.
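For example, because every fact shares the same keys, filtering becomes a one-liner rather than a text-comprehension task:

```python
# Every fact has the same shape, so plain list comprehensions work.
facts = [
    {"subject": "User", "relation": "has_allergy", "object": "Penicillin"},
    {"subject": "User", "relation": "lives_in", "object": "Seattle"},
]

allergies = [f["object"] for f in facts if f["relation"] == "has_allergy"]
print(allergies)  # ['Penicillin']
```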

Step 2: Storing the Fact (The Write Operation)

We need a function that saves this JSON object to a file.

import json

def save_fact(fact_dict):
    """Appends a fact to our long-term storage file."""
    with open("memory.jsonl", "a") as f:
        f.write(json.dumps(fact_dict) + "\n")
    return "Fact saved successfully."

Explanation: We use "Append mode" ("a"). We never delete old facts; we just add new ones. This builds a history of the patient.

Step 3: Retrieving the Fact (The Read Operation)

When the user asks a question, we check our database.

def get_facts(subject, relation=None):
    """Finds all facts about a subject, optionally filtered by relation."""
    memories = []
    with open("memory.jsonl", "r") as f:
        for line in f:
            data = json.loads(line)
            if data["subject"] == subject and (relation is None or data["relation"] == relation):
                memories.append(data)
    return memories

Explanation: This is a simplified search. In a real production system (using tools like Zep, Mem0, or Cognee), this logic is handled by a specialized database that is much faster.
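One consequence of never deleting facts is that the same relation can appear multiple times. A common convention (an assumption here, not something the chapter mandates) is "last write wins": the most recent matching line is the current value.

```python
import json
import os
import tempfile

# Append-only storage keeps old facts forever; to get the *current*
# value of a relation, take the last matching line in the file.
path = os.path.join(tempfile.mkdtemp(), "memory.jsonl")
history = [
    {"subject": "User", "relation": "lives_in", "object": "Boston", "valid_from": "2022-01-10"},
    {"subject": "User", "relation": "lives_in", "object": "Seattle", "valid_from": "2023-10-27"},
]
with open(path, "a") as f:
    for fact in history:
        f.write(json.dumps(fact) + "\n")

def current_value(path, subject, relation):
    """Scan the whole file; the last matching fact wins."""
    value = None
    with open(path) as f:
        for line in f:
            fact = json.loads(line)
            if fact["subject"] == subject and fact["relation"] == relation:
                value = fact["object"]
    return value

print(current_value(path, "User", "lives_in"))  # Seattle
```

The older "Boston" fact is still in the file for auditing, but queries resolve to the newest value.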

Internal Implementation: The Memory Loop

How does the agent know when to use memory? We integrate it into the loop we built in Chapter 4: Interleaved Thinking.

The flow changes from Think -> Act to Recall -> Think -> Act -> Memorize.

sequenceDiagram
    participant U as User
    participant M as Memory System
    participant A as Agent
    U->>A: "I have a sore throat."
    Note over A: 1. RECALL: Check specific facts
    A->>M: get_facts(subject="User")
    M->>A: Returns: [{"relation": "has_allergy", "object": "Penicillin"}]
    Note over A: 2. THINK: "User is allergic to Penicillin."
    Note over A: 3. THINK: "Avoid Amoxicillin."
    A->>U: "Since you are allergic to Penicillin..."
    Note over A: 4. MEMORIZE: Did I learn anything new?
    Note over A: No new facts in this turn.
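The loop above can be sketched in a few lines of Python. The `fake_llm` function is a stand-in for a real model call, and the fact store is an in-memory list instead of memory.jsonl; both are assumptions for the sake of a runnable example:

```python
# A sketch of the Recall -> Think -> Act -> Memorize loop.
FACTS = [{"subject": "User", "relation": "has_allergy", "object": "Penicillin"}]

def recall(subject):
    return [f for f in FACTS if f["subject"] == subject]

def fake_llm(prompt):
    # A real agent would call the model here; this stub just proves
    # the recalled facts made it into the prompt.
    if "Penicillin" in prompt:
        return "Since you are allergic to Penicillin, avoid Amoxicillin."
    return "Take Amoxicillin."

def run_turn(user_message):
    # 1. RECALL: load known facts before thinking.
    facts = recall("User")
    # 2. THINK + 3. ACT: give the model both facts and the message.
    prompt = f"Known facts: {facts}\nUser: {user_message}"
    reply = fake_llm(prompt)
    # 4. MEMORIZE: a real loop would extract new facts here.
    return reply

print(run_turn("I have a sore throat, what should I take?"))
```

The key point is ordering: recall happens before the model thinks, so the dangerous default answer never gets generated.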

Deep Dive: Automatic Extraction

We don't want to manually write Python code for every fact. We want the Agent to decide what to save. We do this by giving the Agent a specialized tool called memorize_fact.

The Tool Definition

We give this tool to the LLM (refer to Chapter 3: Tool Design).

memorize_tool = {
    "name": "memorize_fact",
    "description": "Save a permanent fact about the user. Use this when the user tells you about preferences, allergies, or history.",
    "parameters": {
        "type": "object",
        "properties": {
            "relation": {"type": "string", "description": "e.g. 'likes', 'lives_in'"},
            "object": {"type": "string", "description": "The value of the fact"}
        },
        "required": ["relation", "object"]
    }
}
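On the application side, we still need to handle the tool call when the model emits it. A plausible sketch, where the subject is pinned to "User" so the model only supplies the relation and object (the in-memory `store` stands in for writing to memory.jsonl):

```python
# Dispatching a memorize_fact tool call. The subject is pinned to
# "User" on the application side; the model supplies the rest.
store = []

def handle_tool_call(name, arguments):
    if name == "memorize_fact":
        fact = {
            "subject": "User",
            "relation": arguments["relation"],
            "object": arguments["object"],
        }
        store.append(fact)  # in place of save_fact(fact)
        return "Fact saved successfully."
    raise ValueError(f"Unknown tool: {name}")

# Simulate the model deciding to remember a location change.
result = handle_tool_call("memorize_fact", {"relation": "lives_in", "object": "Seattle"})
print(result)              # Fact saved successfully.
print(store[0]["object"])  # Seattle
```

Pinning the subject in code, rather than letting the model choose it, keeps the graph consistent even if the model phrases things oddly.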

The Agent's Behavior

Now, when the user says "I moved to Seattle," the Agent's internal monologue (Interleaved Thinking) kicks in:

  1. User Input: "I moved to Seattle."
  2. Thought: "This is a location change. I should save this."
  3. Tool Call: memorize_fact(relation="lives_in", object="Seattle")
  4. System: Writes to memory.jsonl.
  5. Agent Response: "Noted! I'll remember you are in Seattle."

Why Structure Matters

You might ask: "Why not just save the user's sentence 'I moved to Seattle' directly?"

If we save raw text, the agent has to read and re-process that text every time. By saving User -> lives_in -> Seattle, we create a Knowledge Graph.

This allows for powerful logic later, such as exact lookups ("What is the user allergic to?") and multi-hop queries that chain relations together.
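To see the payoff, index the triples by (subject, relation). Questions become dictionary lookups, and chaining relations gives multi-hop answers that raw text cannot support. The specific triples here are illustrative:

```python
from collections import defaultdict

# Indexing triples by (subject, relation) turns "What is the user
# allergic to?" into a lookup instead of a text-comprehension task.
triples = [
    ("User", "lives_in", "Seattle"),
    ("User", "has_allergy", "Penicillin"),
    ("Seattle", "located_in", "Washington"),
]

graph = defaultdict(list)
for s, r, o in triples:
    graph[(s, r)].append(o)

# Direct lookup:
print(graph[("User", "has_allergy")])  # ['Penicillin']

# Two-hop query: which state is the user in?
city = graph[("User", "lives_in")][0]
print(graph[(city, "located_in")])     # ['Washington']
```

The two-hop query works because "Seattle" is a shared node, not just a word inside a sentence.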

Advanced Memory Systems

In production, you rarely write raw JSONL files. You use specialized frameworks (detailed in the memory-systems skill) that handle the heavy lifting, such as fact extraction, deduplication, tracking when facts become outdated, and fast retrieval.

Summary

In this chapter, we solved the "Amnesia" problem.

  1. RAM vs Hard Drive: We learned to move facts out of the Context Window into permanent storage.
  2. Structured Facts: We learned that storing Subject -> Relation -> Object is better than storing raw text.
  3. The Loop: We added a "Recall" step before the agent thinks, and a "Memorize" step after the agent acts.

Now our agent is smart, thoughtful, and has a long-term memory. But so far, we have only been building one agent.

What happens if the "Doctor Agent" needs to talk to a "Scheduling Agent" to book an appointment? How do they talk to each other without confusing the user?

Next Chapter: Multi-Agent Architecture

