Welcome back! In the previous chapter, Retrieve Pipeline (The Recall System), we learned how the AI acts as a librarian to find information.
But a librarian is useless without a library.
If you turn off your computer, what happens to the memories the AI just learned? If they are just variables in Python's memory (RAM), they vanish like writing on a whiteboard. We need a permanent place to keep them: a Filing Cabinet.
This is the job of the Storage Layer.
When building an app, you face a dilemma:
You need a system that acts like a whiteboard for testing (fast, temporary) but acts like a steel filing cabinet for production (permanent, secure).
memU solves this using an Abstraction Layer. The rest of the app (the Brain, the Pipelines) doesn't know how data is saved. It just says, "Save this." The Storage Layer handles the details.
The Storage Layer is built on three main ideas.
Imagine you are hiring a construction crew. You tell them: "Build me a storage unit." You don't specify the materials; the crew reads your blueprint and decides whether to pitch a tent or pour concrete. In memU, the Factory plays the role of the crew: this switch happens automatically based on your configuration.
Inside your filing cabinet, you don't throw everything into one pile. You have labeled drawers: one for memory items, one for resources, and so on. Each drawer is a Repository.
To make the AI smart, we store "Vectors" (lists of numbers representing meaning). These can live alongside the other data, or in a specialized engine (such as PostgreSQL with pgvector) for lightning-fast search.
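As a toy illustration of why vectors enable "meaning" search, here is cosine similarity over hand-made embeddings. The numbers are invented for the example; real embeddings come from a model, and real search runs inside the database:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Measures how closely two vectors point in the same direction:
    # 1.0 = identical meaning, 0.0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.0, 1.0]
memories = {
    "likes tea": [0.9, 0.1, 0.8],   # points almost the same way as query
    "owns a car": [0.0, 1.0, 0.1],  # points elsewhere
}
best = max(memories, key=lambda k: cosine_similarity(query, memories[k]))
print(best)  # likes tea
```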
As a user of memU, you rarely touch the database code directly. You simply tell the MemoryService which "Builder" to use.
Here is how you swap the backend in your code:
from memu.app import MemoryService

# OPTION A: The Tent (In-Memory)
# Great for unit tests. Data is lost when the script ends.
service_test = MemoryService(
    database_config={"provider": "inmemory"}
)

# OPTION B: The Vault (SQLite)
# Great for local apps. Data is saved to a file.
service_prod = MemoryService(
    database_config={
        "provider": "sqlite",
        "dsn": "sqlite:///my_memory.db"
    }
)
By changing a single configuration value (provider), you swap the entire storage engine without rewriting any other code.
How does memU pull off this magic trick? Let's look at the flow when you start the app.
The magic starts in src/memu/database/factory.py, where the build_database function acts as the traffic controller.
def build_database(config, user_model):
    # 1. Check what the user wants
    provider = config.metadata_store.provider

    # 2. Return the correct storage engine
    if provider == "inmemory":
        return build_inmemory_database(...)
    elif provider == "sqlite":
        # Only imports SQLite code if actually needed!
        from memu.database.sqlite import build_sqlite_database
        return build_sqlite_database(...)
Why do we import inside the if?
If you are using PostgreSQL, you don't want to crash because you are missing a SQLite driver, and vice versa. This keeps dependencies clean.
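The lazy-import pattern can be sketched in isolation. The build_store function below is a hypothetical stand-in (not memU's factory) that treats Python's stdlib sqlite3 as the "optional" dependency:

```python
def build_store(provider: str):
    if provider == "inmemory":
        return {}  # stand-in for an in-memory store
    elif provider == "sqlite":
        # The driver is imported only when this branch actually runs,
        # so users of other providers never need it installed.
        import sqlite3
        return sqlite3.connect(":memory:")
    raise ValueError(f"Unknown provider: {provider}")

print(type(build_store("sqlite")).__name__)  # Connection
```

If the sqlite3 module were missing, only users who set provider="sqlite" would ever notice.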
Let's look at src/memu/database/sqlite/sqlite.py. This class manages the connection and the repositories.
class SQLiteStore(Database):
    def __init__(self, dsn, ...):
        self._sessions = SQLiteSessionManager(dsn=dsn)

        # 1. Ensure the physical tables exist
        self._create_tables()

        # 2. Open the specific "Drawers"
        self.memory_item_repo = SQLiteMemoryItemRepo(...)
        self.resource_repo = SQLiteResourceRepo(...)

        # 3. Load cache (for speed)
        self.load_existing()
The SQLiteStore is the boss. It holds the connection to the file (my_memory.db) and owns the repositories.
Before we save anything, we need to know what the data looks like. memU uses Pydantic models to define the shape of the data.
In src/memu/database/models.py, we define the "Form" every memory must fill out:
class MemoryItem(BaseRecord):
    # Every memory has these fields
    id: str
    summary: str            # The content
    embedding: list[float]  # The vector
    created_at: datetime

    # Links to other tables
    resource_id: str | None
Because we use this standard model, the rest of the app doesn't care if the underlying database is SQL, NoSQL, or a JSON file. It just sends a MemoryItem.
When you want to find an item, you ask the repository. The repository translates your request into the specific language of the database (e.g., SQL queries).
Conceptual code for a Repository:
class SQLiteMemoryItemRepo:
    def create_item(self, summary, embedding):
        # 1. Convert Python object to SQL row
        row = MemoryItemSQL(summary=summary, ...)

        # 2. Save to file
        self.session.add(row)
        self.session.commit()
        return row
In this chapter, we explored the Storage Layer: the Factory that builds the right backend from your configuration, the Repositories that act as labeled drawers, and the standard data models that keep the rest of the app database-agnostic.
Now we have a Brain (Service), a way to learn (Memorize), a way to remember (Retrieve), and a place to keep it all (Storage).
But there is one final piece of the puzzle. How do we string all these steps together? How do we ensure that "Step A" passes data correctly to "Step B"?
We need an assembly line manager.
Next Chapter: Workflow Pipeline Engine (The Assembly Line)
Generated by Code IQ