⚖️ LOAD BALANCING
Orchestrating Data Flow in Distributed Systems
1. Forward Proxy vs Reverse Proxy
1.1 Conceptual Difference
FORWARD PROXY (Client-side representative):
[Client A] ─┐
[Client B] ─┼──► [FORWARD PROXY] ──► [Internet/Servers]
[Client C] ─┘    (Squid, Charles)
Use cases: Corporate network control, anonymity, caching external resources
Key: Servers see PROXY's IP, not Client's IP
────────────────────────────────────────────────────────
REVERSE PROXY (Server-side representative):
                                         ┌──► [Backend Server 1]
[Internet Clients] ──► [REVERSE PROXY] ──┼──► [Backend Server 2]
                       (Nginx, HAProxy)  └──► [Backend Server 3]
Use cases: Load balancing, SSL termination, caching, security
Key: Clients see PROXY's IP, not Server's IP
NOTE
🎓 Professor Tom: A Forward Proxy represents the CLIENT; a Reverse Proxy represents the SERVER. This is the "Trust Boundary" in network security.
2. Load Balancing Strategies
2.1 Round Robin (Simple)
Request Stream: [R1] [R2] [R3] [R4] [R5] [R6]
Server A: [R1] [R4]
Server B: [R2] [R5]
Server C: [R3] [R6]
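The rotation in the diagram can be sketched in a few lines of Python. This is a minimal illustration rather than a production balancer; the server and request names are copied from the diagram, and `round_robin` is a hypothetical helper name.

```python
from itertools import cycle

def round_robin(servers):
    """Return an endless iterator that visits servers in a fixed rotation."""
    return cycle(servers)

servers = ["Server A", "Server B", "Server C"]   # names from the diagram
requests = ["R1", "R2", "R3", "R4", "R5", "R6"]

rotation = round_robin(servers)
assignment = {name: [] for name in servers}
for request in requests:
    # The next server is chosen without looking at load or capacity.
    assignment[next(rotation)].append(request)

print(assignment)
# {'Server A': ['R1', 'R4'], 'Server B': ['R2', 'R5'], 'Server C': ['R3', 'R6']}
```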
PROS: Simple, zero state, even distribution
CONS: Ignores server capacity, current load, no session affinity
2.2 Least Connections (Intelligent)
LB tracks active connections:
Server A: ████████████ (12 connections)
Server B: ████████ (8 connections) ◄── SEND HERE!
Server C: ████████████████ (16 connections)
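The selection shown above can be sketched as follows. The class name `LeastConnectionsBalancer` and its `acquire`/`release` methods are assumptions made for illustration; the connection counts are the ones from the diagram.

```python
class LeastConnectionsBalancer:
    """Tracks active connections per server and picks the least busy one."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # Pick the server with the fewest active connections.
        # Ties go to the server that was registered first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes so the count stays accurate.
        if self.active[server] > 0:
            self.active[server] -= 1

lb = LeastConnectionsBalancer(["Server A", "Server B", "Server C"])
lb.active.update({"Server A": 12, "Server B": 8, "Server C": 16})

chosen = lb.acquire()
print(chosen)  # Server B
```

The `release` call is the part that makes this strategy stateful: every accepted connection must be matched by a release, otherwise the counts drift.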
PROS: Adapts to load, handles slow requests
CONS: Requires connection tracking
2.3 IP Hash (Sticky Sessions)
Client IP: 192.168.1.100
→ hash("192.168.1.100") = 0x7F3A...
→ 0x7F3A... mod 3 = 1 → Server B
Same client ALWAYS goes to same server.
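A minimal sketch of the mapping above, assuming SHA-256 as the hash function (the text does not say which hash is used, so the concrete hash value and the selected server will differ from the illustrative `0x7F3A...`). `pick_server` is a hypothetical helper. Python's built-in `hash()` is salted per process for strings, which is why `hashlib` is used here.

```python
import hashlib

def pick_server(client_ip, servers):
    """Map a client IP to a server index with a process-independent hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["Server A", "Server B", "Server C"]

first = pick_server("192.168.1.100", servers)
second = pick_server("192.168.1.100", servers)
print(first, first == second)  # same server both times
```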
Use cases: Stateful apps, WebSocket connections
3. Layer 4 LB vs Layer 7 LB
| Aspect | L4 LB | L7 LB |
|---|---|---|
| Works at | TCP/UDP (IP:Port) | HTTP (URL, Headers) |
| Speed | 10M+ conn/s | 100K-1M req/s |
| Content routing | ❌ No | ✅ Yes |
| SSL Termination | ❌ Pass-through | ✅ Can decrypt |
| Use case | Edge, high throughput | Intelligent routing |
IMPORTANT
Production Pattern: L4 LB at edge → L7 LB cluster. Never expose L7 directly to internet!
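The "Works at" and "Content routing" rows can be illustrated by comparing what each layer has available when it makes a routing decision. The sketch below is an assumption-laden toy: the `Connection` and `HttpRequest` types, the pool names, and the route table are all hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Connection:
    """All an L4 balancer sees: addresses and ports, no payload."""
    src_ip: str
    src_port: int
    dst_port: int

@dataclass(frozen=True)
class HttpRequest:
    """What an L7 balancer sees after parsing the HTTP request."""
    path: str

def l4_route(conn, backends):
    # Decision based only on the connection tuple.
    key = f"{conn.src_ip}:{conn.src_port}:{conn.dst_port}".encode("utf-8")
    index = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(backends)
    return backends[index]

# Hypothetical path-prefix routing table for the L7 balancer.
L7_ROUTES = [("/api/", "api-pool"), ("/static/", "static-pool")]

def l7_route(request, default_pool="web-pool"):
    # Decision based on request content (here: the URL path).
    for prefix, pool in L7_ROUTES:
        if request.path.startswith(prefix):
            return pool
    return default_pool

print(l7_route(HttpRequest("/api/users")))   # api-pool
print(l7_route(HttpRequest("/index.html")))  # web-pool
print(l4_route(Connection("203.0.113.9", 51234, 443), ["l7-node-1", "l7-node-2"]))
```

In the edge pattern described above, `l4_route` would spread connections across the L7 nodes, and each L7 node would then apply something like `l7_route`.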
4. Consistent Hashing (The Billion-Dollar Algorithm)
4.1 The Problem with Mod Hashing
server_index = hash(key) % N
With N=3: hash("user:1")=5 → 5%3=2 → Server C
Add server D (N=4): 5%4=1 → Server B ❌ MOVED!
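The single-key example above can be extended to many keys to measure how much of the key space is remapped. This is a rough sketch assuming SHA-256 as the hash and synthetic `user:<n>` keys; `stable_hash` and `moved_fraction` are hypothetical helper names.

```python
import hashlib

def stable_hash(key):
    """Process-independent integer hash of a string key."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose server index changes when N goes old_n -> new_n."""
    moved = sum(
        1 for key in keys
        if stable_hash(key) % old_n != stable_hash(key) % new_n
    )
    return moved / len(keys)

keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=3, new_n=4)
print(f"{fraction:.1%} of keys changed servers")
```

A key keeps its server only when `hash % 3 == hash % 4`, which holds for 3 out of every 12 consecutive hash values.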
Result: ~75% of keys MUST MOVE when adding 1 server!
4.2 The Hash Ring Solution
Imagine a circular hash space [0, 2^32-1] (drawn here in degrees for readability):
            0°
            │
       ┌────┴────┐
      /           \
     │  [A: 45°]   │
     │  [B: 120°]  │
      \ [C: 250°] /
       └─────────┘
           180°
Rule: Key routes to FIRST server CLOCKWISE from its position.
Adding server D at 80°: Only keys in (45°, 80°], which previously went to B, remap to D.
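The clockwise rule and the effect of adding D can be checked with a small sketch that uses the degree positions from the diagram directly. The `Ring` class is hypothetical, and key positions are given as whole degrees instead of real hash values to keep the example readable.

```python
from bisect import bisect_left

class Ring:
    """Toy hash ring with server positions given in degrees."""

    def __init__(self, positions):
        # positions: {server_name: angle_in_degrees}
        self._points = sorted((angle, name) for name, angle in positions.items())
        self._angles = [angle for angle, _ in self._points]

    def lookup(self, key_angle):
        # First server at or clockwise from the key; wrap around past 360°.
        i = bisect_left(self._angles, key_angle % 360)
        return self._points[i % len(self._points)][1]

before = Ring({"A": 45, "B": 120, "C": 250})
after = Ring({"A": 45, "B": 120, "C": 250, "D": 80})

# Compare the owner of every whole-degree key position before and after.
moved = [angle for angle in range(360) if before.lookup(angle) != after.lookup(angle)]
print(moved[0], moved[-1], len(moved))  # 46 80 35
```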
All other keys: UNCHANGED!
4.3 Virtual Nodes
Problem: With only 3 points on the ring, the arcs can be very uneven (one server may end up owning e.g. 70% of the key space!)
Solution: Each physical server gets 100-200 virtual positions (only 4 per server are shown here):
Server A → A1@30°, A2@120°, A3@240°, A4@300°
Server B → B1@60°, B2@150°, B3@210°, B4@330°
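A minimal sketch of a ring with virtual nodes, assuming SHA-256 truncated to 32 bits as the ring hash and a `server#i` naming scheme for virtual positions. Both are illustrative choices, not something the text prescribes, and `ConsistentHashRing` and `load_share` are hypothetical names.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

def ring_hash(value):
    """Map a string to a position in [0, 2**32 - 1]."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:4], "big")

class ConsistentHashRing:
    def __init__(self, servers, vnodes=150):
        # Each physical server is placed on the ring `vnodes` times.
        self._ring = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._ring]

    def lookup(self, key):
        # First virtual node at or clockwise from the key, with wrap-around.
        i = bisect_left(self._positions, ring_hash(key))
        return self._ring[i % len(self._ring)][1]

def load_share(ring, servers, keys):
    """Fraction of keys owned by each physical server."""
    counts = Counter(ring.lookup(key) for key in keys)
    return {server: counts[server] / len(keys) for server in servers}

servers = ["Server A", "Server B", "Server C"]
keys = [f"user:{i}" for i in range(20_000)]

single = load_share(ConsistentHashRing(servers, vnodes=1), servers, keys)
balanced = load_share(ConsistentHashRing(servers, vnodes=150), servers, keys)
print("1 vnode per server:   ", single)
print("150 vnodes per server:", balanced)
```

The exact shares depend on the hash function and key set, so no fixed output is shown; the point of the comparison is the spread between servers in each case.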
Result: Much better balance across the ring.
5. Decision Matrix
| Scenario | Recommended Strategy |
|---|---|
| Stateless API | Least Connections |
| WebSocket | IP Hash + Consistent Hashing |
| Distributed Cache | Consistent Hashing + vnodes |
| Database replicas | Least Connections |