🐦 DESIGN TWITTER
Timeline & Fan-out Patterns at Scale
🎓 Professor Tom
Twitter is the classic Fan-out problem: how do you deliver a single tweet to millions of followers in real time? This is where you learn the difference between the Push and Pull models.
📊 Back-of-Envelope Calculations
Scale Assumptions
| Metric | Value | Rationale |
|---|---|---|
| Monthly Active Users (MAU) | 500M | Global social platform |
| Daily Active Users (DAU) | 300M | ~60% of MAU |
| Tweets per user per day | 2 | Average, including retweets |
| Timeline reads per user per day | 10 | Users check feed multiple times |
| Average followers per user | 200 | Power law distribution |
| Celebrity threshold | 10K+ followers | Requires special handling |
| Tweet size (avg) | 1 KB | 280 chars + metadata + media refs |
QPS Calculations
Tweet Write Operations:
Daily tweets = 300M DAU x 2 tweets/user = 600M tweets/day
Average Write QPS = 600M / 86,400 seconds = ~7,000 QPS
Peak Write QPS = 7,000 x 3 = ~21,000 QPS
Timeline Read Operations:
Daily reads = 300M DAU x 10 reads/user = 3B reads/day
Average Read QPS = 3B / 86,400 = ~35,000 QPS
Peak Read QPS = 35,000 x 3 = ~100,000 QPS
Read:Write Ratio = 35,000 : 7,000 = 5:1
Calculation Breakdown
- 86,400 = seconds in a day (24 x 60 x 60)
- Peak multiplier = 3x, the industry standard for social apps
- Read-heavy workload -> optimize for reads (caching, fan-out on write)
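These QPS figures can be checked with a few lines of Python (all values are the section's own assumptions):

```python
# Back-of-envelope QPS estimates for the assumed 300M-DAU scale.
DAU = 300_000_000
TWEETS_PER_USER_PER_DAY = 2
READS_PER_USER_PER_DAY = 10
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
PEAK_MULTIPLIER = 3             # industry rule of thumb for social apps

write_qps = DAU * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY
read_qps = DAU * READS_PER_USER_PER_DAY / SECONDS_PER_DAY

print(f"Avg write QPS:  {write_qps:,.0f}")                     # 6,944  (~7,000)
print(f"Peak write QPS: {write_qps * PEAK_MULTIPLIER:,.0f}")   # 20,833 (~21,000)
print(f"Avg read QPS:   {read_qps:,.0f}")                      # 34,722 (~35,000)
print(f"Peak read QPS:  {read_qps * PEAK_MULTIPLIER:,.0f}")    # 104,167 (~100,000)
print(f"Read:write ratio: {read_qps / write_qps:.0f}:1")       # 5:1
```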
Storage Calculations
Tweet Storage:
Daily tweet storage = 600M tweets x 1 KB = 600 GB/day
Monthly tweet storage = 600 GB x 30 = 18 TB/month
Yearly tweet storage = 18 TB x 12 = 216 TB/year
With 3x replication = 216 TB x 3 = 648 TB/year
Media Storage (images, videos):
Assume 20% tweets have media, avg 500 KB per media
Daily media = 600M x 0.2 x 500 KB = 60 TB/day
Monthly media = 60 TB x 30 = 1.8 PB/month
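The storage arithmetic above can likewise be scripted, using the same assumptions (1 KB per tweet, 20% of tweets carrying ~500 KB of media):

```python
# Storage growth estimates derived from the daily tweet volume.
DAILY_TWEETS = 600_000_000
TWEET_SIZE_KB = 1
MEDIA_RATIO = 0.2        # share of tweets with media
MEDIA_SIZE_KB = 500
REPLICATION = 3

daily_tweet_gb = DAILY_TWEETS * TWEET_SIZE_KB / 1_000_000    # KB -> GB
yearly_tweet_tb = daily_tweet_gb * 30 * 12 / 1_000           # GB -> TB
daily_media_tb = DAILY_TWEETS * MEDIA_RATIO * MEDIA_SIZE_KB / 1_000_000_000

print(f"Tweets: {daily_tweet_gb:.0f} GB/day, {yearly_tweet_tb:.0f} TB/year "
      f"({yearly_tweet_tb * REPLICATION:.0f} TB/year with 3x replication)")
print(f"Media:  {daily_media_tb:.0f} TB/day, "
      f"{daily_media_tb * 30 / 1_000:.1f} PB/month")
```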
Timeline Cache (Redis):
Cache last 800 tweets per user (tweet IDs only)
Per user cache = 800 x 8 bytes (tweet_id) = 6.4 KB
Total cache = 300M users x 6.4 KB = ~2 TB
Bandwidth Calculations
Outbound Bandwidth (Timeline reads):
Avg timeline response = 20 tweets x 1 KB = 20 KB
Peak bandwidth = 100,000 QPS x 20 KB = 2 GB/s = 16 Gbps
Inbound Bandwidth (Tweet posts):
Peak inbound = 21,000 QPS x 1 KB = 21 MB/s = 168 Mbps
🏗️ High-Level Architecture
Component Responsibilities
| Component | Responsibility | Technology |
|---|---|---|
| API Gateway | Rate limiting, authentication, routing | Kong/nginx |
| Tweet Service | Create, delete, like tweets | Go/Java microservice |
| Timeline Service | Fetch user's home timeline | Go microservice |
| User Service | User profiles, follow/unfollow | Go microservice |
| Fan-out Workers | Distribute tweets to follower timelines | Kafka consumers |
| Timeline Cache | Pre-computed timelines per user | Redis Sorted Sets |
| Tweet DB | Persistent tweet storage | PostgreSQL (sharded by tweet_id) |
| Social Graph | Follower/following relationships | Cassandra |
🔧 Raizo's Note
Why shard by tweet_id instead of user_id?
If we sharded by user_id, celebrity accounts (Elon Musk, Taylor Swift) would create hot spots: a single shard would hold far too much data. Sharding by tweet_id distributes load more evenly, because tweets are created continuously by many users.
🔄 Core Flows
Flow 1: Tweet Posting (Write Path)
Flow 2: Timeline Retrieval (Read Path)
🎓 Professor Tom
Cursor-based pagination is the best practice for infinite scroll:
- Don't use `OFFSET`: its performance degrades badly on large datasets
- Use `tweet_id` or `timestamp` as the cursor
- Query: `WHERE tweet_id < :cursor ORDER BY tweet_id DESC LIMIT 20`
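The cursor query above can be demonstrated end to end with SQLite (an illustrative stand-in for the sharded tweet store; table and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [(i, f"tweet {i}") for i in range(1, 101)],  # 100 tweets, ids 1..100
)

def timeline_page(cursor_id=None, limit=20):
    """Return the next page of tweet_ids strictly older than the cursor."""
    if cursor_id is None:
        rows = conn.execute(
            "SELECT tweet_id FROM tweets ORDER BY tweet_id DESC LIMIT ?",
            (limit,),
        )
    else:
        rows = conn.execute(
            "SELECT tweet_id FROM tweets WHERE tweet_id < ? "
            "ORDER BY tweet_id DESC LIMIT ?",
            (cursor_id, limit),
        )
    return [r[0] for r in rows]

page1 = timeline_page()            # ids 100..81
page2 = timeline_page(page1[-1])   # ids 80..61 -- no overlap, no OFFSET scan
```

Because each page is keyed off the last id seen, the database never has to skip rows the way `OFFSET` does, so page N is as cheap as page 1.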
💡 Deep Dive: Fan-out Strategies
The Core Problem
When user A posts a tweet, how do A's 10 million followers see that tweet in their own timelines?
There are two main approaches: the Pull model and the Push model.
Strategy 1: Pull Model (Fan-out on Read)
┌─────────────────────────────────────────────────────────────┐
│ PULL MODEL │
│ (Fan-out on Read) │
├─────────────────────────────────────────────────────────────┤
│ │
│ User A posts tweet │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Tweet DB │ ← Store tweet once │
│ └─────────────┘ │
│ │
│ Later, when Follower B opens timeline: │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ 1. Get list of users B follows │ │
│ │ 2. Query tweets from ALL followed users │ │
│ │ 3. Merge & sort by timestamp │ │
│ │ 4. Return top N tweets │ │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
| Pros | Cons |
|---|---|
| ✅ Write is fast (O(1)) | ❌ Read is slow (O(followings)) |
| ✅ No wasted work for inactive users | ❌ High latency at read time |
| ✅ Simple implementation | ❌ Heavy DB load on reads |
| ✅ Always fresh data | ❌ Doesn't scale for users following 1000+ accounts |
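Steps 2-3 of the pull model are essentially a k-way merge of per-author tweet lists. A toy version over in-memory data (hypothetical authors and timestamps, each list already newest-first):

```python
import heapq

# Hypothetical per-author tweet lists, sorted newest-first.
# Each entry is (timestamp, tweet_id).
tweets_by_author = {
    "alice": [(105, "a3"), (101, "a2"), (90, "a1")],
    "bob":   [(104, "b2"), (99, "b1")],
    "carol": [(103, "c1")],
}

def pull_timeline(followed, n=4):
    """Merge the followed users' tweet lists and return the newest n ids."""
    merged = heapq.merge(
        *(tweets_by_author.get(u, []) for u in followed),
        key=lambda t: t[0],
        reverse=True,  # inputs and output are both newest-first
    )
    return [tweet_id for _, tweet_id in list(merged)[:n]]

print(pull_timeline(["alice", "bob", "carol"]))  # ['a3', 'b2', 'c1', 'a2']
```

The merge itself is cheap; the real cost at scale is step 2, fetching every followed user's recent tweets from the database on every read.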
Strategy 2: Push Model (Fan-out on Write)
┌─────────────────────────────────────────────────────────────┐
│ PUSH MODEL │
│ (Fan-out on Write) │
├─────────────────────────────────────────────────────────────┤
│ │
│ User A posts tweet (has 1000 followers) │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Tweet DB │ ← Store tweet │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Fan-out Workers │ │
│ │ Push tweet_id to 1000 follower caches │ │
│ └─────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────┬─────────┬─────────┬─────────┐ │
│ │Timeline │Timeline │Timeline │ ... │ │
│ │ User 1 │ User 2 │ User 3 │ │ │
│ └─────────┴─────────┴─────────┴─────────┘ │
│ │
│ When Follower B opens timeline: │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ Just read from B's pre-computed cache │ ← O(1) │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
| Pros | Cons |
|---|---|
| ✅ Read is instant (O(1)) | ❌ Write is slow (O(followers)) |
| ✅ Low latency for users | ❌ Celebrity problem (100M followers = 100M writes) |
| ✅ Predictable read performance | ❌ Wasted work for inactive followers |
| ✅ Can pre-compute rankings | ❌ High storage for timeline caches |
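The O(followers) write cost is easy to see in a toy fan-out, with plain dicts standing in for the per-follower timeline caches (a sketch only; the Redis implementation appears later in this section):

```python
# In-memory stand-in for per-follower timeline caches.
timelines = {}   # follower_id -> list of (timestamp, tweet_id), newest first

def fan_out(tweet_id, timestamp, followers, cap=800):
    """Push one tweet into every follower's timeline; returns the write count."""
    for follower in followers:
        timeline = timelines.setdefault(follower, [])
        timeline.insert(0, (timestamp, tweet_id))
        del timeline[cap:]            # keep only the newest `cap` entries
    return len(followers)             # O(followers) writes per tweet

writes = fan_out("t1", 100, [f"user{i}" for i in range(1000)])
# 1,000 followers -> 1,000 timeline writes for a single tweet;
# at 100M followers the same post would require 100M writes.
```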
The Celebrity Problem (Justin Bieber Problem)
⚠️ The Problem
When Justin Bieber (100M followers) tweets, the Push model has to perform 100 million writes. At 7,000 write QPS, the fan-out takes about 4 hours to finish!
100,000,000 followers / 7,000 QPS = 14,285 seconds = ~4 hours
Strategy 3: Hybrid Solution (Twitter's Actual Approach)
How Hybrid Works:
Normal users (< 10K followers): Use Push model
- Tweet is fanned out to all follower timeline caches
- Fast read experience for followers
Celebrities (>= 10K followers): Use Pull model
- Tweet is NOT fanned out
- Stored in a separate "celebrity tweets" list
- Merged at read time when follower opens timeline
At read time:
- Fetch pre-computed timeline from cache (pushed tweets)
- Fetch recent tweets from followed celebrities (pulled tweets)
- Merge and sort by timestamp
- Return unified timeline
🎓 Professor Tom
Threshold tuning: 10K is only a starting point. In production, Twitter weighs several factors:
- Follower count
- Tweet frequency
- Follower activity rate
- Time of day
Some accounts can switch between Push and Pull dynamically.
Implementation: Redis Timeline Cache
```python
# Assumes a configured redis-py client `redis` (decode_responses=True),
# plus get_followers / get_celebrity_followings helpers backed by the
# social graph store.
CELEBRITY_THRESHOLD = 10_000

# Push model: add a tweet to each follower's timeline
def fan_out_tweet(tweet_id: str, author_id: str, timestamp: int):
    followers = get_followers(author_id)

    if len(followers) >= CELEBRITY_THRESHOLD:
        # Celebrity: skip fan-out, add to celebrity list
        redis.zadd(f"celebrity_tweets:{author_id}", {tweet_id: timestamp})
        return

    # Normal user: fan-out to all followers in one pipelined round trip
    pipeline = redis.pipeline()
    for follower_id in followers:
        pipeline.zadd(f"timeline:{follower_id}", {tweet_id: timestamp})
        # Keep only the last 800 tweets
        pipeline.zremrangebyrank(f"timeline:{follower_id}", 0, -801)
    pipeline.execute()

# Read timeline with hybrid merge
def get_timeline(user_id: str, count: int = 20) -> list:
    # 1. Get pushed tweets from cache
    pushed_tweets = redis.zrevrange(
        f"timeline:{user_id}", 0, count - 1, withscores=True
    )

    # 2. Get celebrity tweets (pull model)
    celebrity_tweets = []
    for celeb_id in get_celebrity_followings(user_id):
        celebrity_tweets.extend(redis.zrevrange(
            f"celebrity_tweets:{celeb_id}", 0, 10, withscores=True
        ))

    # 3. Merge and sort by timestamp (the sorted-set score)
    all_tweets = pushed_tweets + celebrity_tweets
    all_tweets.sort(key=lambda x: x[1], reverse=True)
    return all_tweets[:count]
```
🔧 Raizo's Note
Production gotcha: Redis ZADD across millions of keys needs careful memory management:
- Set `maxmemory-policy volatile-lru` to auto-evict old timelines (timeline keys need TTLs set for this policy to apply)
- Monitor memory usage with `INFO memory`
- Consider Redis Cluster for horizontal scaling
⚖️ Trade-offs Analysis
Architecture Decision Matrix
| Decision | Option A | Option B | Chosen | Rationale |
|---|---|---|---|---|
| Feed Generation | Push (Fan-out on Write) | Pull (Fan-out on Read) | Hybrid | Balance write cost vs read latency |
| Tweet Storage | Single PostgreSQL | Sharded PostgreSQL | Sharded by tweet_id | Avoid hot spots from celebrities |
| Timeline Cache | Per-user Redis keys | Shared cache | Per-user Sorted Sets | O(1) read, easy pagination |
| ID Generation | Auto-increment | UUID | Snowflake ID | Time-sortable, distributed, no coordination |
| Social Graph | PostgreSQL | Cassandra | Cassandra | Write-heavy (follows), denormalized queries |
| Message Queue | RabbitMQ | Kafka | Kafka | High throughput, replay capability |
Snowflake ID Deep Dive
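A minimal generator matching the 64-bit layout diagrammed below (the epoch constant and worker-id handling here are assumptions; real deployments choose their own epoch):

```python
import threading
import time

class SnowflakeGenerator:
    """64-bit IDs: 41-bit ms timestamp | 10-bit machine id | 12-bit sequence."""
    EPOCH = 1288834974657  # Twitter's snowflake epoch (2010-11-04), in ms

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024        # must fit in 10 bits
        self.machine_id = machine_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit counter
                if self.sequence == 0:
                    # 4096 ids issued this millisecond: spin until the next one
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH) << 22) | (self.machine_id << 12) | self.sequence

gen = SnowflakeGenerator(machine_id=7)
a, b = gen.next_id(), gen.next_id()
assert a < b  # time-sortable: later IDs are numerically larger
```

Because the timestamp occupies the high bits, `ORDER BY id` is equivalent to `ORDER BY time`, which is exactly what the cursor pagination earlier relies on.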
┌─────────────────────────────────────────────────────────────┐
│ SNOWFLAKE ID (64 bits) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┬──────────────┬────────────┬─────────────────┐ │
│ │ Sign(1) │ Timestamp(41)│ Machine(10)│ Sequence(12) │ │
│ │ 0 │ ms since │ Worker ID │ Counter │ │
│ │ │ epoch │ (1024) │ (4096/ms) │ │
│ └─────────┴──────────────┴────────────┴─────────────────┘ │
│ │
│ Benefits: │
│ • Time-sortable: ORDER BY id = ORDER BY time │
│ • Distributed: No central coordinator needed │
│ • High throughput: 4096 IDs per ms per worker │
│ • Compact: 64-bit integer fits in index efficiently │
│ │
└─────────────────────────────────────────────────────────────┘
🚨 Failure Scenarios & Mitigations
Scenario 1: Redis Timeline Cache Failure
| Aspect | Details |
|---|---|
| Impact | Users see stale or empty timelines |
| Detection | Redis health checks, latency spikes |
| Mitigation | Redis Cluster with 3 replicas per shard |
| Fallback | Switch to Pull model from DB |
| Recovery | Rebuild cache from DB on restart |
🔧 Raizo's Note
Circuit Breaker Pattern: when Redis fails, don't let requests pile up. Implement a circuit breaker to fail fast and fall back to the DB. Hystrix or resilience4j are good choices.
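A bare-bones sketch of the pattern (hypothetical class and thresholds; in production you would use resilience4j or similar rather than rolling your own):

```python
import time

class CircuitBreaker:
    """Fail fast after repeated errors; retry the primary after a cooldown."""
    def __init__(self, max_failures=5, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = 0.0

    def call(self, primary, fallback):
        # Open state: skip the failing dependency until the cooldown expires
        if self.failures >= self.max_failures:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()
            self.failures = 0  # half-open: allow one trial call through
        try:
            result = primary()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            self.opened_at = time.monotonic()
            return fallback()

breaker = CircuitBreaker(max_failures=2)

def redis_read():      # stand-in for the Redis timeline fetch
    raise ConnectionError("redis down")

def db_read():         # stand-in for the pull-model DB fallback
    return ["tweet_from_db"]

for _ in range(5):
    timeline = breaker.call(redis_read, db_read)
```

After two consecutive failures the breaker opens, so the remaining calls skip Redis entirely instead of waiting on a dead connection.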
Scenario 2: Celebrity Tweet Storm
| Aspect | Details |
|---|---|
| Impact | Fan-out queue backs up, delays for all users |
| Example | Elon Musk tweets during Super Bowl |
| Detection | Kafka consumer lag monitoring |
| Mitigation | Separate queue for celebrity tweets |
| Solution | Pull model for celebrity followers |
Scenario 3: Database Shard Failure
| Aspect | Details |
|---|---|
| Impact | Tweets on that shard unavailable |
| Detection | DB health checks, connection errors |
| Mitigation | Synchronous replication to standby |
| Failover | Automatic promotion (< 30 seconds) |
| Data Loss | Near-zero with sync replication |
Scenario 4: Kafka Cluster Failure
| Aspect | Details |
|---|---|
| Impact | Fan-out stops, new tweets not distributed |
| Detection | Producer errors, consumer lag |
| Mitigation | Multi-AZ Kafka deployment |
| Fallback | Write directly to cache (degraded mode) |
| Recovery | Replay from Kafka offset on recovery |
🔧 Raizo's Note
Idempotency is critical: fan-out workers MUST be idempotent, because Kafka can deliver messages multiple times (at-least-once). Use tweet_id as the dedup key in the Redis ZADD: re-adding the same member simply overwrites its score, so no duplicate entry is created.
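Sorted-set semantics give this dedup for free; a plain dict models the behavior (member as key, score as value):

```python
# A Redis sorted set keyed by tweet_id behaves like a dict: re-adding the
# same member just overwrites its score, so a redelivered Kafka message
# cannot create a duplicate timeline entry.
timeline = {}   # tweet_id -> timestamp (member -> score)

def zadd(tweet_id, timestamp):
    timeline[tweet_id] = timestamp

zadd("tweet_42", 1700000000)
zadd("tweet_42", 1700000000)   # at-least-once redelivery: a harmless no-op

assert len(timeline) == 1      # still a single entry
```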
💰 Cost Estimation
Monthly Infrastructure Costs (300M DAU Scale)
| Service | Specification | Unit Cost | Monthly Cost |
|---|---|---|---|
| API Servers | 200 × c5.2xlarge (8 vCPU, 16GB) | $0.34/hr | $49,000 |
| Fan-out Workers | 50 × c5.xlarge (4 vCPU, 8GB) | $0.17/hr | $6,100 |
| Timeline Cache | Redis Cluster 500GB (3 replicas) | $0.068/GB/hr | $15,000 |
| Tweet Cache | Redis 100GB | $0.068/GB/hr | $5,000 |
| Tweet DB | PostgreSQL 20TB (sharded, 3 replicas) | $0.115/GB/mo | $8,000 |
| Social Graph | Cassandra 50TB (3 nodes) | $0.50/GB/mo | $25,000 |
| Kafka Cluster | 10 brokers (m5.2xlarge) | $0.384/hr | $12,000 |
| CDN | 500TB egress/month | $0.08/GB | $40,000 |
| Load Balancers | 2 × ALB | $0.025/hr + LCU | $2,000 |
| Media Storage | S3 1.8PB | $0.023/GB/mo | $42,000 |
Cost Summary
| Category | Monthly Cost | % of Total |
|---|---|---|
| Compute (API + Workers) | $55,100 | 27% |
| Caching (Redis) | $20,000 | 10% |
| Database (PostgreSQL + Cassandra) | $33,000 | 16% |
| Message Queue (Kafka) | $12,000 | 6% |
| CDN & Networking | $42,000 | 21% |
| Storage (S3) | $42,000 | 21% |
| Total | $204,100 | 100% |
🎓 Professor Tom
Cost Optimization Strategies:
- Reserved Instances: a 1-year commitment cuts compute cost by 30-40%
- Spot Instances: fan-out workers can run on spot (up to 70% cheaper)
- S3 Intelligent-Tiering: auto-moves old media to cheaper storage
- CDN caching: a higher cache hit ratio means lower origin egress
🔧 Raizo's Note
Hidden costs to watch:
- Data transfer between AZs: $0.01/GB adds up fast
- Redis memory overhead: actual usage is ~1.5x the data size
- Kafka retention: 7 days of retention means storing 7x the daily volume
- Monitoring & logging: CloudWatch/Datadog can run $10K+/month
Cost per User Metrics
Monthly cost: $204,100
DAU: 300,000,000
Cost per DAU per month = $204,100 / 300M = ~$0.00068
Cost per DAU per year = $0.00068 × 12 = ~$0.0082
Revenue per user (ads): ~$5-10/year
Gross margin: Very healthy!
🎯 Interview Checklist
Must-Mention Items ✅
| Topic | Key Points |
|---|---|
| Scale Estimation | 300M DAU, 600M tweets/day, 35K read QPS |
| Fan-out Strategy | Push vs Pull trade-offs, Hybrid solution |
| Celebrity Problem | Why 10K+ followers need special handling |
| Data Storage | Sharding strategy (by tweet_id, not user_id) |
| Caching | Redis Sorted Sets for timeline, cache invalidation |
| ID Generation | Snowflake IDs (time-sortable, distributed) |
Bonus Points 🌟
- Ranking Algorithm: Mention relevance scoring beyond chronological
- Real-time Updates: WebSocket/SSE for live timeline updates
- Search Architecture: Elasticsearch for full-text search
- Trending Topics: Sliding window counters, decay functions
- Anti-spam: Rate limiting, content moderation ML models
- Geo-distribution: Multi-region deployment, data locality
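As an example of the trending-topics bullet above, a hashtag counter over a sliding window might look like this (a toy version with fixed one-minute buckets; names are hypothetical):

```python
from collections import defaultdict, deque

class SlidingWindowCounter:
    """Count hashtag mentions over the last `window_minutes` minute-buckets."""
    def __init__(self, window_minutes=60):
        self.window = window_minutes
        self.buckets = deque()   # (minute, {tag: count}), oldest first

    def record(self, tag, minute):
        if not self.buckets or self.buckets[-1][0] != minute:
            self.buckets.append((minute, defaultdict(int)))
        self.buckets[-1][1][tag] += 1
        # Drop buckets that have fallen out of the window
        while self.buckets[0][0] <= minute - self.window:
            self.buckets.popleft()

    def count(self, tag):
        return sum(counts[tag] for _, counts in self.buckets)

c = SlidingWindowCounter(window_minutes=60)
c.record("#superbowl", minute=0)
c.record("#superbowl", minute=30)
c.record("#superbowl", minute=61)   # the minute-0 bucket has now expired
print(c.count("#superbowl"))        # 2
```

Production systems add a decay function on top so that a burst of recent mentions outranks a steady trickle of older ones.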
Common Mistakes ❌
| Mistake | Why It's Wrong | Better Approach |
|---|---|---|
| Single database | Won't scale past 10M users | Shard from day 1 |
| Pull-only model | High latency for users | Hybrid Push/Pull |
| Push to all followers | Celebrity problem | Threshold-based hybrid |
| Shard by user_id | Hot spots from celebrities | Shard by tweet_id |
| No caching | DB can't handle read QPS | Redis timeline cache |
| Sync fan-out | Blocks tweet posting | Async via Kafka |
⚠️ Interview Red Flags
- Not mentioning scale numbers (QPS, storage)
- Not knowing Fan-out on Write vs Fan-out on Read
- Not addressing the Celebrity problem
- Designing in a single point of failure
- Having no caching strategy
🎓 Key Takeaways
- Hybrid fan-out is the real-world solution; there is no silver bullet
- The read:write ratio drives the architecture (Twitter is 5:1 read-heavy)
- Sharding strategy must consider data distribution, not just application logic
- Snowflake IDs solve distributed ID generation elegantly
- Async processing (Kafka) decouples the write path from fan-out
- Cache everything: Redis Sorted Sets are a perfect fit for timelines
🔗 Navigation
Related Case Studies
| Case Study | Key Learning | Link |
|---|---|---|
| YouTube | Video processing, CDN, adaptive streaming | Design YouTube → |
| Uber | Real-time location, geospatial indexing | Design Uber → |
| WhatsApp | Messaging, E2EE, connection management | Design WhatsApp → |