In the previous chapter, Object-Oriented System Modeling, we learned how to structure code using Classes and Objects. We built a Parking Lot system where cars occupied spots in our computer's memory.
However, there was a catch: Memory (RAM) is volatile. If the server restarts, the parking lot becomes empty. To make data permanent, we need a Database (Hard Disk). But reading from a hard disk is slow, sometimes 100x slower than reading from RAM.
This creates a dilemma: RAM is fast but forgetful, while the disk remembers everything but is slow.
Caching is the bridge between these two. It is the strategy of keeping frequently used data in fast memory to avoid slow database trips. It acts as your system's "Short-Term Memory."
Imagine we are building a search engine like Google.
If 10,000 people search for "System Design Primer" every hour, we don't want to repeat that slow 2-second database hunt every time.
Solution: The first time someone searches, we save the result in a Cache. The next 9,999 people get the result instantly from memory.
The cache sits between your Web Server and your Database.
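In code, this "check the cache first" pattern (often called cache-aside) looks roughly like the sketch below. `query_database` and its artificial delay are made-up stand-ins for a real database call:

```python
import time

cache = {}  # in-memory cache: query -> result

def query_database(query):
    # Stand-in for a slow database lookup (the chapter's example takes ~2 seconds)
    time.sleep(0.1)
    return f"Results for '{query}'"

def search(query):
    # 1. Check the fast cache first
    if query in cache:
        return cache[query]  # cache hit: served from memory
    # 2. Cache miss: make the slow database trip, then remember the result
    result = query_database(query)
    cache[query] = result
    return result

# The first call pays the delay; the second returns instantly from the cache
print(search("system design primer"))
print(search("system design primer"))
```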
How does the cache find data so fast? It uses a data structure called a Hash Table (or Dictionary in Python).
Imagine a library. Without a catalog, finding a book means scanning every shelf. With a catalog, you look up the call number and walk straight to the right spot. A Hash Table is that catalog.
In a Hash Table, every piece of data has a unique Key.
# A simple Hash Table (Dictionary) in Python
cache = {}
# We map a Query (Key) to a Result (Value)
cache["system design"] = "A guide to building scalable systems..."
cache["python"] = "A popular programming language..."
Accessing cache["system design"] takes the same constant time (O(1) on average), regardless of whether you have 10 items or 10 million items.
Memory is expensive and limited. We cannot cache everything. When the cache gets full, we must decide what to delete to make room for new data. This is called an Eviction Policy.
The most popular policy is LRU (Least Recently Used).
The Rule: When the cache is full, evict the item that has gone unused the longest. Every read or write makes an item "recently used" again.
To build an efficient LRU Cache, we combine two data structures:
- A Hash Table (to instantly get the data).
- A Doubly Linked List (to instantly move items to the front).

Let's trace what happens when we use a cache with a Max Size of 3.
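The trace with a Max Size of 3 can be reproduced with Python's `collections.OrderedDict`, which tracks ordering for us and serves as a stand-in for the hash-table-plus-linked-list combination we build below:

```python
from collections import OrderedDict

MAX_SIZE = 3
cache = OrderedDict()  # keeps keys in the order they were inserted/refreshed

def put(key, value):
    if key in cache:
        cache.move_to_end(key)      # re-accessing makes an item "newest"
    cache[key] = value
    if len(cache) > MAX_SIZE:
        cache.popitem(last=False)   # evict the least recently used item

put("A", 1); put("B", 2); put("C", 3)
print(list(cache))   # ['A', 'B', 'C']  (A is the oldest)
put("A", 1)          # touching A makes it the newest again
put("D", 4)          # cache is full: B, now the oldest, is evicted
print(list(cache))   # ['C', 'A', 'D']
```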
We will write a simplified LRUCache class. This builds on the skills from Object-Oriented System Modeling.
First, we wrap our data in a "Node" so we can link items together.
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None  # Link to the item behind this
        self.next = None  # Link to the item ahead of this
The cache needs a dictionary for lookups and a size limit.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lookup = {}  # The Hash Table
        # We use dummy nodes for Head (newest) and Tail (oldest)
        self.head = Node(0, 0)
        self.tail = Node(0, 0)
        self.head.next = self.tail
        self.tail.prev = self.head
Explanation: self.lookup lets us find a node instantly. The head and tail are dummy "sentinel" nodes: they hold no real data, but because of them the list is never truly empty, so we never have to special-case adding or removing at the ends.
When we read data, we must move it to the "newest" position (right after head).
    def get(self, key):
        if key in self.lookup:
            node = self.lookup[key]
            self._remove(node)  # Helper to detach node
            self._add(node)     # Helper to move to front
            return node.value
        return -1  # Not found
Explanation: If the key exists, we technically "delete" it from its current spot and "re-add" it to the front. This updates its "freshness."
When we save data, we check if we are full. If so, we delete the oldest item.
    def put(self, key, value):
        if key in self.lookup:
            self._remove(self.lookup[key])  # Remove old version
        node = Node(key, value)
        self._add(node)  # Add new version to front
        self.lookup[key] = node
        if len(self.lookup) > self.capacity:
            # Too full! Delete the oldest item (the node just before tail)
            oldest = self.tail.prev
            self._remove(oldest)
            del self.lookup[oldest.key]
Explanation: This is the magic of LRU. We enforce the limit by sacrificing the tail.prev node.
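The `_remove` and `_add` helpers that `get` and `put` call are not shown in the snippets above. Putting all the pieces together, with those helpers filled in as a minimal sketch consistent with the code we wrote:

```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lookup = {}          # The Hash Table
        self.head = Node(0, 0)    # dummy node at the "newest" end
        self.tail = Node(0, 0)    # dummy node at the "oldest" end
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        # Detach node by re-linking its two neighbours to each other
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add(self, node):
        # Insert node right after head (the "newest" slot)
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key in self.lookup:
            node = self.lookup[key]
            self._remove(node)   # detach from its current spot
            self._add(node)      # re-add at the front: refreshes it
            return node.value
        return -1                # Not found

    def put(self, key, value):
        if key in self.lookup:
            self._remove(self.lookup[key])  # remove the old version
        node = Node(key, value)
        self._add(node)
        self.lookup[key] = node
        if len(self.lookup) > self.capacity:
            oldest = self.tail.prev         # sacrifice the tail.prev node
            self._remove(oldest)
            del self.lookup[oldest.key]

# Tracing the Max Size 3 example:
cache = LRUCache(3)
cache.put("A", 1); cache.put("B", 2); cache.put("C", 3)
cache.get("A")         # touch A: it becomes the newest
cache.put("D", 4)      # full: B, the least recently used, is evicted
print(cache.get("B"))  # -1 (evicted)
print(cache.get("A"))  # 1 (still cached)
```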
You don't always need to write this code yourself. Tools like Redis or Memcached are pre-built, high-performance caches that use these exact concepts.
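Within a single Python process, the standard library even ships one ready-made: the `functools.lru_cache` decorator applies exactly this policy to a function's results.

```python
from functools import lru_cache

@lru_cache(maxsize=3)  # keep at most 3 results; evict the least recently used
def slow_search(query):
    # Imagine a slow database hunt happening here
    return f"Results for {query!r}"

slow_search("system design")         # miss: the function body runs
slow_search("system design")         # hit: served straight from the cache
print(slow_search.cache_info())      # reports hits, misses, and current size
```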
You should use Caching when:
- The same data is requested over and over (e.g., popular search results).
- The data changes rarely, so a slightly stale copy is acceptable.
- Fetching the data is expensive (slow database queries, remote API calls).
You should NOT use Caching when:
- The data changes constantly and must always be fresh (e.g., a live account balance).
- Each piece of data is read only once, so a cached copy would never be reused.
Just like we learned in System Architecture Design, a single computer has limits. If we have 100GB of data to cache, but our server only has 16GB of RAM, we need to split the data.
This is called Distributed Caching. We might use a technique called Sharding (e.g., User IDs 1-100 go to Cache Server A, 101-200 go to Cache Server B). We will look closer at distributed systems in Distributed Data Processing.
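A minimal way to pick a shard is to hash the key and take it modulo the number of cache servers. The server names below are made up for illustration:

```python
import zlib

SERVERS = ["cache-a", "cache-b", "cache-c"]  # hypothetical cache servers

def pick_server(key):
    # Hash the key to a stable number, then wrap it onto a server index.
    # Note: Python's built-in hash() is randomized per process for strings,
    # so we use a stable checksum (crc32) instead.
    index = zlib.crc32(key.encode()) % len(SERVERS)
    return SERVERS[index]

# Every process agrees on the same server for the same key
print(pick_server("user:42"))
```

The catch with plain modulo sharding: adding or removing a server changes `len(SERVERS)` and remaps almost every key at once, which is why large systems use consistent hashing instead.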
In this chapter, we learned:
- Caching keeps frequently used data in fast memory (RAM) to avoid slow database trips.
- Hash Tables give instant, constant-time lookups by key.
- An Eviction Policy decides what to delete when the cache is full; LRU evicts the least recently used item.
- An efficient LRU Cache combines a Hash Table with a Doubly Linked List.
- Tools like Redis and Memcached, and techniques like Distributed Caching, apply these same ideas at scale.
Now that we can store objects (Chapter 2) and retrieve them quickly (Chapter 3), we face a new challenge: Relationships.
How do we model connections, like "Friend lists" on Facebook or "Route maps" in Uber? A simple list or cache isn't enough. We need to understand graphs.
Next Chapter: Graph Relationships and Search