💾 CACHING
Caching: The Ultimate Weapon for Reducing Latency
1. Locality of Reference (The Foundational Principle)
NOTE
🎓 Professor Tom: Every caching strategy rests on a single principle: data that was accessed recently is highly likely to be accessed again.
1.1 Temporal Locality
If you have just accessed data item X,
chances are you will access X again in the near future.
Example: a user profile loaded at login is then read repeatedly throughout the session.
1.2 Spatial Locality
If you access data at address X,
chances are you will also access addresses near X.
Example: reading a file means reading adjacent blocks, not jumping around at random.
Memory Hierarchy (Access Time):
| Level | Technology | Latency | Capacity |
|---|---|---|---|
| L1 Cache | SRAM | ~1ns | 64KB |
| L2 Cache | SRAM | ~4ns | 256KB |
| L3 Cache | SRAM | ~10ns | 8-32MB |
| RAM | DRAM | ~100ns | 16-128GB |
| SSD | NAND Flash | ~100μs | 1-4TB |
| HDD | Magnetic | ~10ms | 1-16TB |
| Network | TCP/IP | ~1-100ms | ∞ |
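Spatial locality can be felt even from application code. The toy benchmark below (our own illustration, not from any library) traverses a 2D structure in storage order versus jumping across rows; in CPython the timing gap is muted by object indirection, but the access-pattern difference is exactly the one CPU caches reward:

```python
import timeit

N = 500
matrix = [[1] * N for _ in range(N)]  # each row is a contiguous list

def row_major() -> int:
    # Visit elements in storage order: consecutive accesses touch
    # neighbouring memory (good spatial locality)
    return sum(matrix[i][j] for i in range(N) for j in range(N))

def col_major() -> int:
    # Jump to a different row on every access (poor spatial locality)
    return sum(matrix[i][j] for j in range(N) for i in range(N))

assert row_major() == col_major() == N * N

# In C or NumPy, row-major traversal is typically several times faster;
# here the two timings differ less, but the principle is the same.
print("row-major:", timeit.timeit(row_major, number=3))
print("col-major:", timeit.timeit(col_major, number=3))
```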
2. Caching Patterns
2.1 Cache-Aside (Lazy Loading) - The Most Common Pattern
┌─────────────────────────────────────────────────────────┐
│ CACHE-ASIDE PATTERN │
├─────────────────────────────────────────────────────────┤
│ │
│ READ PATH: │
│ ────────── │
│ 1. App checks Cache │
│ 2. Cache HIT → Return data │
│ 3. Cache MISS → Query Database │
│ 4. Store result in Cache │
│ 5. Return to client │
│ │
│ [App] ──1──► [Cache] ──HIT──► Return │
│ │ │ │
│ │ MISS │
│ │ ▼ │
│ └────3────► [Database] │
│ │ │
│ 4. SET │
│ ▼ │
│ [Cache] │
│ │
│ WRITE PATH: │
│ ─────────── │
│ 1. Write to Database │
│ 2. Invalidate/Delete Cache entry │
│ │
└─────────────────────────────────────────────────────────┘
```python
# Cache-Aside Implementation
def get_user(user_id: str) -> User:
    # 1. Check cache first
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)
    # 2. Cache miss - query database
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    # 3. Populate cache with TTL
    redis.setex(f"user:{user_id}", ttl=3600, value=serialize(user))
    return user

def update_user(user_id: str, data: dict) -> None:
    # 1. Update database (source of truth)
    db.execute("UPDATE users SET ... WHERE id = ?", data, user_id)
    # 2. Invalidate cache (NOT update!)
    redis.delete(f"user:{user_id}")
```
Pros & Cons:
| ✅ Pros | ❌ Cons |
|---|---|
| Simple to implement | Cache miss = slow (DB round-trip) |
| Only cache what's needed | Application handles cache logic |
| Resilient to cache failure | Risk of stale data |
2.2 Write-Through (Data Safety)
┌─────────────────────────────────────────────────────────┐
│ WRITE-THROUGH PATTERN │
├─────────────────────────────────────────────────────────┤
│ │
│ WRITE PATH (Synchronous): │
│ ───────────────────────── │
│ 1. Write to Cache │
│ 2. Cache writes to Database (synchronously) │
│ 3. Return success only after BOTH complete │
│ │
│ [App] ──write──► [Cache] ──sync──► [Database] │
│ │ │ │
│ └───── ACK ────────┘ │
│ │
│ READ: Always from cache (guaranteed fresh!) │
│ │
└─────────────────────────────────────────────────────────┘
Pros & Cons:
| ✅ Pros | ❌ Cons |
|---|---|
| Data always consistent | Write latency = Cache + DB |
| Simple read logic | Writes are slow |
| No stale data | Cache may store rarely-read data |
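The synchronous write path above can be sketched with an in-memory stand-in for the database (the `WriteThroughCache` class and the dict-backed store are our illustration, not a production client):

```python
class WriteThroughCache:
    """Write-through sketch: every write goes to BOTH the backing
    store and the cache, synchronously, before returning."""

    def __init__(self, store: dict):
        self._cache = {}
        self._store = store  # source of truth (dict stand-in for a DB)

    def set(self, key, value):
        # 1. Write to the backing store first (a real system would use
        #    a transaction to keep the two in sync on partial failure)
        self._store[key] = value
        # 2. Then update the cache, so reads are guaranteed fresh
        self._cache[key] = value

    def get(self, key):
        # Reads are served from cache; on a miss, fall back to the store
        if key in self._cache:
            return self._cache[key]
        value = self._store.get(key)
        if value is not None:
            self._cache[key] = value
        return value
```

The cost is visible in `set`: the caller waits for both writes, which is exactly the "Write latency = Cache + DB" drawback in the table above.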
2.3 Write-Back/Write-Behind (Fast but Risky)
┌─────────────────────────────────────────────────────────┐
│ WRITE-BACK PATTERN │
├─────────────────────────────────────────────────────────┤
│ │
│ WRITE PATH (Asynchronous): │
│ ────────────────────────── │
│ 1. Write to Cache only → Return immediately │
│ 2. Cache asynchronously flushes to DB (batched) │
│ │
│ [App] ──write──► [Cache] ──ACK──► │
│ │ │
│ │ (async, batched) │
│ ▼ │
│ [Database] │
│ │
│ ⚠️ RISK: Cache crash = DATA LOSS! │
│ │
└─────────────────────────────────────────────────────────┘
Pros & Cons:
| ✅ Pros | ❌ Cons |
|---|---|
| Ultra-fast writes | Data loss risk on cache failure |
| Batching reduces DB load | Complex consistency management |
| Great for write-heavy | Hard to debug |
CAUTION
🔧 Engineer Raizo: Use write-back only when you can ACCEPT losing data. Examples: page view counters, analytics. NEVER use it for financial transactions!
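The asynchronous flush can be sketched as follows (a minimal, single-process illustration: `WriteBackCache` and the dict-backed store are our own names; a real system would flush to an actual database and handle retries):

```python
import threading
import time

class WriteBackCache:
    """Write-back sketch: writes hit the cache only; a background
    thread flushes dirty keys to the store in batches.
    A crash before a flush loses the dirty keys - the risk above."""

    def __init__(self, store: dict, flush_interval: float = 0.05):
        self._cache = {}
        self._dirty = set()
        self._lock = threading.Lock()
        self._store = store
        self._stop = threading.Event()
        self._thread = threading.Thread(
            target=self._flush_loop, args=(flush_interval,), daemon=True)
        self._thread.start()

    def set(self, key, value):
        with self._lock:
            self._cache[key] = value
            self._dirty.add(key)   # ACK immediately; DB write is deferred

    def _flush_loop(self, interval: float):
        while not self._stop.is_set():
            time.sleep(interval)
            self.flush()

    def flush(self):
        with self._lock:
            batch = {k: self._cache[k] for k in self._dirty}
            self._dirty.clear()
        self._store.update(batch)  # one batched write instead of N

    def close(self):
        self._stop.set()
        self._thread.join()
        self.flush()  # final flush on clean shutdown
```

Note how two rapid `set` calls on the same key collapse into a single store write: that coalescing is where the "batching reduces DB load" benefit comes from.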
3. The Nightmares (Cache Disasters)
3.1 Thundering Herd Problem
┌─────────────────────────────────────────────────────────┐
│ THUNDERING HERD DISASTER │
├─────────────────────────────────────────────────────────┤
│ │
│ SCENARIO: Hot key expires at T=0 │
│ │
│ T=0ms: Cache entry "homepage:v1" EXPIRES │
│ │
│ T=1ms: 1000 requests hit simultaneously │
│ All get CACHE MISS │
│ All query database... │
│ │
│ [Req 1] ──MISS──┐ │
│ [Req 2] ──MISS──┤ │
│ [Req 3] ──MISS──┼────────► [DATABASE] 💥 OVERLOAD! │
│ ... │ (1000 identical queries) │
│ [Req 1000]──MISS┘ │
│ │
│ RESULT: Database CPU 100%, response time 10x, │
│ possible cascade failure │
│ │
└─────────────────────────────────────────────────────────┘
Solutions:
```python
# Solution 1: Distributed Lock (Mutex)
def get_with_lock(key: str) -> Data:
    data = cache.get(key)
    if data:
        return data
    # Only ONE request fetches from DB
    lock_key = f"lock:{key}"
    if cache.setnx(lock_key, "1", ttl=5):  # Acquired lock
        try:
            data = database.query(key)
            cache.set(key, data, ttl=3600)
            return data
        finally:
            cache.delete(lock_key)
    else:
        # Wait for the winner to populate cache
        time.sleep(0.1)
        return get_with_lock(key)  # Retry

# Solution 2: Early Expiration / Probabilistic Refresh
def get_with_probabilistic_refresh(key: str, ttl: int = 3600) -> Data:
    data, remaining_ttl = cache.get_with_ttl(key)
    if data:
        # Refresh early with probability based on remaining TTL
        # When TTL is low, refresh probability increases
        if remaining_ttl < ttl * 0.1:  # Last 10% of TTL
            if random.random() < 0.2:  # 20% chance
                refresh_async(key)  # Background refresh
        return data
    # Normal cache miss handling
    return fetch_and_cache(key)

# Solution 3: Request Coalescing (singleflight pattern)
from singleflight import Group

group = Group()

def get_with_coalescing(key: str) -> Data:
    data = cache.get(key)
    if data:
        return data
    # All concurrent requests for the same key share ONE db call
    return group.do(key, lambda: fetch_and_cache(key))
```
3.2 Cache Penetration
┌─────────────────────────────────────────────────────────┐
│ CACHE PENETRATION │
├─────────────────────────────────────────────────────────┤
│ │
│ ATTACK: Query keys that DON'T EXIST │
│ │
│ [Attacker] ──"user:999999999"──► [Cache] ──MISS │
│ ──"user:888888888"──► ──MISS │
│ ──"user:777777777"──► ──MISS │
│ │ │
│ ▼ │
│ [Database] │
│ (finds nothing) │
│ (but still │
│ queries!) │
│ │
│ Cache never stores misses → Every request hits DB! │
│ │
└─────────────────────────────────────────────────────────┘
Solutions:
```python
# Solution 1: Cache Negative Results
def get_user(user_id: str) -> Optional[User]:
    cached = cache.get(f"user:{user_id}")
    if cached == "NULL_MARKER":  # Cached "not found"
        return None
    if cached:
        return deserialize(cached)
    user = db.query(user_id)
    if user:
        cache.set(f"user:{user_id}", serialize(user), ttl=3600)
    else:
        # Cache the "not found" with a shorter TTL
        cache.set(f"user:{user_id}", "NULL_MARKER", ttl=300)
    return user

# Solution 2: Bloom Filter (Check existence before query)
from pybloom_live import BloomFilter

# Populate filter with all valid keys (at startup or incrementally)
valid_users = BloomFilter(capacity=10_000_000, error_rate=0.01)
for user_id in db.get_all_user_ids():
    valid_users.add(user_id)

def get_user(user_id: str) -> Optional[User]:
    # Quick check: if not in the Bloom Filter, it definitely doesn't exist
    if user_id not in valid_users:
        return None  # No DB query!
    # Bloom Filter says "maybe exists" → normal flow
    return normal_cache_aside_get(user_id)
```
3.3 Cache Avalanche (Snowball Effect)
┌─────────────────────────────────────────────────────────┐
│ CACHE AVALANCHE │
├─────────────────────────────────────────────────────────┤
│ │
│ SCENARIO: All cache entries expire at SAME TIME │
│ (e.g., cache restart, or all keys with same TTL) │
│ │
│ T=0: [K1][K2][K3][K4]...[K1000] ALL EXPIRE! │
│ │ │
│ ▼ │
│ T=1: 100,000 requests → ALL CACHE MISS │
│ │ │
│ ▼ │
│ [DATABASE] 💥💥💥 COMPLETE MELTDOWN │
│ │
└─────────────────────────────────────────────────────────┘
Solutions:
```python
# Solution 1: Randomize TTL
import random

def cache_with_jitter(key: str, data: Any, base_ttl: int = 3600):
    # Add random jitter: TTL between 3000 and 4200 seconds
    jittered_ttl = base_ttl + random.randint(-600, 600)
    cache.set(key, data, ttl=jittered_ttl)

# Solution 2: Multi-tier Fallback
def get_with_fallback(key: str) -> Data:
    # L1: Local in-memory cache (fastest)
    if local_cache.exists(key):
        return local_cache.get(key)
    # L2: Distributed cache (Redis)
    if redis.exists(key):
        data = redis.get(key)
        local_cache.set(key, data, ttl=60)  # Short TTL
        return data
    # L3: Database
    data = database.query(key)
    redis.set(key, data, ttl=3600)
    local_cache.set(key, data, ttl=60)
    return data

# Solution 3: Rate limiting on cache misses
@rate_limit(max_calls=100, period=1)  # 100 DB calls/second max
def fetch_from_database(key: str) -> Data:
    return database.query(key)
```
4. Eviction Policies
| Policy | Description | Use Case |
|---|---|---|
| LRU | Evict Least Recently Used | General purpose, most common |
| LFU | Evict Least Frequently Used | When access frequency matters |
| FIFO | First In, First Out | Simple streaming data |
| TTL | Time-based expiration | Freshness requirements |
| Random | Random eviction | When LRU overhead too high |
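Of these, LRU is the one you will most often implement yourself. A minimal sketch using Python's `collections.OrderedDict`, which keeps keys in insertion/access order (the `LRUCache` name and capacity-based API are our own illustration):

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction sketch: the least recently used key is always
    at the front of the OrderedDict, so eviction is O(1)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None  # cache miss
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used
```

Note that `get` also reorders the key: reads count as "use", which is what distinguishes LRU from plain FIFO.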
5. Decision Matrix
| Scenario | Recommended Pattern |
|---|---|
| Read-heavy, consistency OK | Cache-Aside |
| Strong consistency required | Write-Through |
| Write-heavy, loss OK | Write-Back |
| Hot keys problem | Distributed Lock + Coalescing |
| Invalid key attacks | Bloom Filter |
6. What's Next
Caching takes load off the database, but what happens when the data outgrows a single server? That's when you need Database Scaling.