Welcome to the first chapter of the memU tutorial! If you are building an AI application, you probably want it to remember things.
Most simple AI memory systems use a "flat list." Imagine throwing thousands of sticky notes into a giant pile. When you ask a question, the AI has to dig through the entire pile to find the right note. It's messy, slow, and often inaccurate.
memU takes a different approach. It organizes memory like a computer file system (folders and files). This allows your AI to "browse" topics before diving into details, making retrieval smarter and cheaper.
Imagine you are chatting with your AI assistant about two very different things: your coffee habits and the status of a work project.
If you ask, "What is the status?", a "flat" memory system might get confused. It scans everything. It might find a note saying "Coffee status: empty" alongside "Project status: pending".
memU solves this by grouping these facts into hierarchies:
User Preferences -> File: Coffee Habits
Work Projects -> File: Project Apollo
When you ask about work, the AI knows to look in the Work Projects folder, ignoring your coffee habits entirely.
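To make the "browse folders first" idea concrete, here is a toy sketch of hierarchical retrieval. This is not memU's actual API; the names, embeddings, and similarity function are all illustrative stand-ins:

```python
# Toy sketch of hierarchical retrieval: score categories first (cheap),
# then search only inside the winning category.

def similarity(a: list[float], b: list[float]) -> float:
    """Dot product as a stand-in for a real vector-similarity measure."""
    return sum(x * y for x, y in zip(a, b))

# Two "folders", each with its own embedding and a few "files" (facts).
categories = {
    "Work Projects": {
        "embedding": [1.0, 0.0],
        "items": ["Project status: pending"],
    },
    "User Preferences": {
        "embedding": [0.0, 1.0],
        "items": ["Coffee status: empty"],
    },
}

def retrieve(query_embedding: list[float]) -> list[str]:
    # Step 1: browse categories — only a handful of vectors to compare.
    best = max(
        categories,
        key=lambda name: similarity(query_embedding, categories[name]["embedding"]),
    )
    # Step 2: dig into items only inside the best-matching folder.
    return categories[best]["items"]

# A work-flavored query lands in "Work Projects", ignoring coffee entirely.
print(retrieve([0.9, 0.1]))
```

The key point is the two-step lookup: the flat-list approach would compare the query against every item, while the hierarchy narrows the search to one folder first.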
In memU, we don't just store text; we store structured objects. Here are the three core building blocks:
Think of a Resource as the raw source file. This could be a chat log, a PDF document, an image, or a URL. It is the "proof" that something happened.
For example, a chat log saved as conversation_2023_10_27.txt.
A MemoryItem is a specific fact or insight extracted from a Resource. Think of this as a specific file inside a folder.
A MemoryCategory is a high-level topic that groups related items together.
Let's look at how memU defines these structures in Python. These models form the backbone of the system.
The MemoryItem is the most important unit. It holds the summary of the fact and the mathematical "embedding" (vector) that lets the AI search for it.
From src/memu/database/models.py:
class MemoryItem(BaseRecord):
    resource_id: str | None  # Links back to the source (The Book)
    memory_type: str  # e.g., "preference", "event", "fact"
    summary: str  # The actual memory content
    embedding: list[float] | None = None  # Vector for search
    happened_at: datetime | None = None
    extra: dict[str, Any] = {}  # Custom data
The MemoryCategory helps the AI browse. Instead of scanning 10,000 items, it might first scan 50 categories to find the right topic.
class MemoryCategory(BaseRecord):
    name: str  # e.g., "Coffee Preferences"
    description: str  # e.g., "Details about what the user drinks"
    embedding: list[float] | None = None
    summary: str | None = None
How do we put a MemoryItem inside a MemoryCategory? We use a linking table called CategoryItem.
class CategoryItem(BaseRecord):
    item_id: str  # The ID of the specific fact
    category_id: str  # The ID of the folder it belongs to
This flexible design means one specific memory (like "I have a meeting at 2 PM") can live in multiple categories (e.g., both "Calendar" and "Project Apollo").
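Here is a toy sketch of that many-to-many link in action (the IDs are made up for illustration). One fact gets two link rows, one per folder:

```python
# Toy model of the CategoryItem link table: each row attaches one fact
# (item_id) to one folder (category_id).
category_items = [
    {"item_id": "item-meeting", "category_id": "cat-calendar"},
    {"item_id": "item-meeting", "category_id": "cat-apollo"},
]

def categories_for(item_id: str) -> list[str]:
    """All folders a given memory lives in."""
    return [
        link["category_id"]
        for link in category_items
        if link["item_id"] == item_id
    ]

# The same "meeting at 2 PM" memory appears under both folders.
print(categories_for("item-meeting"))
```

Because membership lives in a separate table rather than on the item itself, adding a memory to another folder is just one more link row.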
When memU processes information, it doesn't just save data; it transforms it.
Here is what happens when you feed data into this hierarchical model. This process is managed by the Memorize Pipeline (which we will cover in Chapter 3), but it's important to visualize the data structure now.
You might wonder: What if I have multiple users?
memU uses a clever trick called "Scoped Models." If you look at src/memu/database/models.py, you'll see a function called build_scoped_models.
def build_scoped_models(user_model: type[BaseModel]):
    # Merges your User definition with the Resource definition
    resource_model = merge_scope_model(
        user_model, Resource, name_suffix="Resource"
    )
    # ... repeats for Items and Categories
    return resource_model, ...
This simply means every MemoryItem or Resource automatically gets an extra field (like user_id) attached to it. This ensures that User A never accidentally sees the memories inside User B's folders.
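As a rough illustration of the idea (not memU's implementation, which merges Pydantic models), here is how a scope field like user_id can be bolted onto an existing model dynamically using standard-library dataclasses:

```python
from dataclasses import dataclass, fields, make_dataclass

@dataclass
class Resource:
    url: str  # a simplified "raw source" record

def build_scoped_model(model: type, name_suffix: str) -> type:
    """Create a new dataclass = the original fields + a user_id scope field."""
    return make_dataclass(
        model.__name__ + name_suffix,
        [(f.name, f.type) for f in fields(model)] + [("user_id", str)],
    )

ScopedResource = build_scoped_model(Resource, "Scoped")

# Every record now carries its owner, so queries can always
# filter by user_id and User A never sees User B's memories.
r = ScopedResource(url="conversation.txt", user_id="user-a")
print(r.user_id)
```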
In this chapter, we learned that memU doesn't use a messy flat list. It uses a Hierarchical Data Model: Resources (raw sources) are distilled into MemoryItems (specific facts), which are grouped into MemoryCategories (browsable folders) via CategoryItem links.
This structure provides the foundation for the AI to "think" in an organized way.
But who actually manages these files and folders? Who creates them and searches them? That is the job of the MemoryService, the central brain of the operation.
Next Chapter: MemoryService (The Central Brain)