💻 Compute Patterns
Level: Core. Solves: choosing the right compute platform for each workload, balancing trade-offs in control, scalability, and operational overhead.
🎯 Objectives (Outcomes)
After applying the material on this page, you will be able to:
- Choose the right compute platform based on workload requirements
- Design a GKE cluster with node pools and Autopilot
- Deploy Cloud Run for containerized HTTP services
- Use Cloud Functions for event-driven processing
- Optimize cost with Spot VMs, CUDs, and scale-to-zero
- Compare with AWS compute options
✅ When to use
| Platform | Use Case | Why |
|---|---|---|
| Cloud Run | HTTP APIs, microservices | Scale-to-zero, no infra management |
| GKE Autopilot | Multi-service app, K8s teams | Managed nodes, pay-per-pod |
| GKE Standard | Complex networking, stateful | Full control, GPU/TPU |
| Cloud Functions | Event triggers | Single-purpose, pay-per-invocation |
| GCE | Legacy, specific OS | Full control, BYOL |
❌ When NOT to use
| Pattern | Problem | Alternative |
|---|---|---|
| GCE for new workloads | Ops overhead | Cloud Run, GKE |
| Cloud Run for long-running jobs | 60 min request timeout | GKE |
| GKE Standard when no customization is needed | Cost, complexity | GKE Autopilot |
| Functions Gen 1 | Limitations (9 min timeout, no concurrency) | Functions Gen 2 |
| VMs without a MIG | No auto-healing | Managed Instance Group |
⚠️ Warning from Raizo
"One team chose GKE Standard for a simple microservice. Six months later, they were spending 40% of their time maintaining the cluster and upgrading nodes. After migrating to Cloud Run, ops effort dropped by 90%. Choose the platform that matches your team's capability."
Decision Framework
Compute Spectrum
┌─────────────────────────────────────────────────────────────────┐
│ GCP COMPUTE SPECTRUM │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MORE CONTROL LESS CONTROL │
│ MORE OPS BURDEN LESS OPS BURDEN │
│ ◄─────────────────────────────────────────────────────────► │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────────┐ │
│ │ GCE │ │ GKE │ │Cloud Run│ │Cloud Functions │ │
│ │ (VMs) │ │ (K8s) │ │(Contain)│ │ (Functions) │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────────────┘ │
│ │
│ You manage: You manage: You manage: You manage: │
│ • OS • Containers • Container • Code only │
│ • Runtime • K8s configs • Code • Dependencies │
│ • Scaling • Networking │
│ • Patching │
│ │
│ GCP manages: GCP manages: GCP manages: GCP manages: │
│ • Hardware • Control • Everything • Everything │
│ • plane • else • else │
│ • Node pools │
│ │
└─────────────────────────────────────────────────────────────────┘

Decision Tree
Compute Engine (GCE)
When to Use GCE
| Use Case | Why GCE |
|---|---|
| Legacy applications | Lift-and-shift without containerization |
| Specific OS requirements | Custom kernels, Windows Server |
| GPU/TPU workloads | Direct hardware access |
| Licensing constraints | BYOL software tied to VMs |
| Stateful workloads | Databases, persistent storage |
Machine Type Selection
┌─────────────────────────────────────────────────────────────────┐
│ MACHINE TYPE FAMILIES │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GENERAL PURPOSE (Most workloads) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ E2: Cost-optimized, burstable (dev/test, small apps) │ │
│ │ N2/N2D: Balanced (web servers, app servers) │ │
│ │ C3: Latest gen, best price-performance │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ COMPUTE OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ C2/C2D: High CPU performance (gaming, HPC, batch) │ │
│ │ H3: Highest per-core performance │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ MEMORY OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ M2/M3: Large in-memory databases (SAP HANA) │ │
│ │ Up to 12TB RAM │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ACCELERATOR OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ A2/A3: NVIDIA GPUs (ML training, inference) │ │
│ │ TPU VMs: Google TPUs (large-scale ML) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

GCE Best Practices
```yaml
# Managed Instance Group (MIG) for production
instanceTemplate:
  machineType: n2-standard-4
  disks:
    - boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-11
  networkInterfaces:
    - network: projects/PROJECT/global/networks/VPC_NAME
      subnetwork: regions/REGION/subnetworks/SUBNET_NAME
      # No external IP for security
  serviceAccounts:
    - email: app-sa@PROJECT.iam.gserviceaccount.com
      scopes:
        - https://www.googleapis.com/auth/cloud-platform
  shieldedInstanceConfig:
    enableSecureBoot: true
    enableVtpm: true
    enableIntegrityMonitoring: true
```

Google Kubernetes Engine (GKE)
GKE Modes
┌─────────────────────────────────────────────────────────────────┐
│ GKE STANDARD vs AUTOPILOT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GKE STANDARD GKE AUTOPILOT │
│ ──────────── ──────────── │
│ • You manage node pools • Google manages nodes │
│ • Full K8s customization • Opinionated defaults │
│ • Pay for nodes (running) • Pay for pods (running) │
│ • Manual scaling/upgrades • Auto scaling/upgrades │
│ • Any workload type • Stateless preferred │
│ │
│ CHOOSE STANDARD WHEN: CHOOSE AUTOPILOT WHEN: │
│ • Need DaemonSets • Want minimal ops │
│ • Specific node configs • Standard workloads │
│ • GPU/TPU workloads • Cost optimization │
│ • Windows containers • Rapid scaling needed │
│ • Privileged containers • New to Kubernetes │
│ │
└─────────────────────────────────────────────────────────────────┘

GKE Architecture Patterns
┌─────────────────────────────────────────────────────────────────┐
│ PRODUCTION GKE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ GKE Cluster │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ Control Plane (Managed) │ │ │
│ │ │ • API Server • etcd • Scheduler │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Node Pools: │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ default │ │ high-mem │ │ gpu │ │ │
│ │ │ n2-std-4 │ │ n2-highmem-8│ │ a2-highgpu │ │ │
│ │ │ Spot: Yes │ │ Spot: No │ │ Spot: Yes │ │ │
│ │ │ Autoscale │ │ Autoscale │ │ Manual │ │ │
│ │ │ 1-10 nodes │ │ 2-20 nodes │ │ 0-4 nodes │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ Features: │ │
│ │ • Workload Identity ✓ │ │
│ │ • Private cluster ✓ │ │
│ │ • Binary Authorization ✓ │ │
│ │ • Network Policy ✓ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

GKE Security Checklist
- [ ] Private cluster (no public endpoint)
- [ ] Workload Identity enabled
- [ ] Shielded GKE nodes
- [ ] Binary Authorization
- [ ] Network Policy enabled
- [ ] Pod Security Standards
- [ ] Regular node auto-upgrade
- [ ] Secrets encrypted with Cloud KMS
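Two of the checklist items above can be expressed directly as Kubernetes manifests. A hedged sketch — the `production` namespace, the service-account names, and `PROJECT` are placeholders, and the GCP-side IAM binding (`roles/iam.workloadIdentityUser` on the GCP service account) must also be granted separately:

```yaml
# Workload Identity: bind a Kubernetes SA to a GCP SA (names are placeholders)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
  annotations:
    iam.gke.io/gcp-service-account: app-sa@PROJECT.iam.gserviceaccount.com
---
# Default-deny NetworkPolicy: pods in the namespace send/receive no traffic
# unless a more specific policy allows it
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```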
Cloud Run
When to Use Cloud Run
┌─────────────────────────────────────────────────────────────────┐
│ CLOUD RUN SWEET SPOT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ✅ IDEAL FOR: ❌ NOT IDEAL FOR: │
│ ───────────── ──────────────── │
│ • HTTP APIs/microservices • Long-running processes │
│ • Web applications • Stateful workloads │
│ • Event-driven processing • GPU/TPU workloads │
│ • Async jobs (Cloud Run Jobs) • Windows containers │
│ • Rapid scaling (0 to N) • Complex networking │
│ • Cost-sensitive (scale to 0) • Persistent connections │
│ │
│ LIMITS: │
│ • Max 60 min request timeout (services) │
│ • Max 24 hours (jobs) │
│ • Max 32 GiB memory, 8 vCPUs │
│ • Max 1,000 concurrent requests per instance │
│ │
└─────────────────────────────────────────────────────────────────┘

Cloud Run Configuration
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"       # Min instances
        autoscaling.knative.dev/maxScale: "100"     # Max instances
        run.googleapis.com/cpu-throttling: "false"  # Always-on CPU
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      containerConcurrency: 80  # Requests per instance
      timeoutSeconds: 300
      serviceAccountName: my-api-sa@PROJECT.iam.gserviceaccount.com
      containers:
        - image: gcr.io/PROJECT/my-api:latest
          resources:
            limits:
              cpu: "2"
              memory: "2Gi"
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: host
```

Cloud Functions
Gen 1 vs Gen 2
| Feature | Gen 1 | Gen 2 |
|---|---|---|
| Runtime | Node, Python, Go, Java | Same + more |
| Max timeout | 9 minutes | 60 minutes |
| Max memory | 8 GB | 32 GB |
| Concurrency | 1 request/instance | Up to 1000 |
| Min instances | No | Yes |
| Traffic splitting | No | Yes |
| Underlying | Proprietary | Cloud Run |
💡 Gen 2 Recommendation
Always use Gen 2 for new functions. Gen 2 is built on Cloud Run, offering better performance, longer timeouts, and concurrency support.
Cloud Functions Use Cases
┌─────────────────────────────────────────────────────────────────┐
│ CLOUD FUNCTIONS PATTERNS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ EVENT-DRIVEN TRIGGERS: │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cloud Storage ──► Function ──► Process file │ │
│ │ (object created) │ │
│ │ │ │
│ │ Pub/Sub ──► Function ──► Transform & forward │ │
│ │ (message) │ │
│ │ │ │
│ │ Firestore ──► Function ──► Send notification │ │
│ │ (document change) │ │
│ │ │ │
│ │ Cloud Scheduler ──► Function ──► Cron job │ │
│ │ (scheduled) │ │
│ │ │ │
│ │ Eventarc ──► Function ──► React to any GCP event │ │
│ │ (audit logs) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Cost Comparison
Pricing Model Summary
| Service | Pricing Model | Scale to Zero | Best For |
|---|---|---|---|
| GCE | Per-second (min 1 min) | No | Predictable, always-on |
| GKE | Nodes + management fee | No (nodes) | Complex, multi-service |
| Cloud Run | Per-request + CPU/memory | Yes | Variable traffic |
| Cloud Functions | Per-invocation + compute | Yes | Event-driven, sporadic |
Cost Optimization Tips
┌─────────────────────────────────────────────────────────────────┐
│ COST OPTIMIZATION STRATEGIES │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GCE: │
│ • Committed Use Discounts (1-3 year): Up to 57% off │
│ • Spot VMs: Up to 91% off (can be preempted) │
│ • Right-sizing recommendations │
│ │
│ GKE: │
│ • Autopilot: Pay only for pods │
│ • Spot node pools for batch workloads │
│ • Cluster autoscaler + node auto-provisioning │
│ │
│ Cloud Run: │
│ • Scale to zero for dev/staging │
│ • CPU throttling for background tasks │
│ • Committed use discounts available │
│ │
│ Cloud Functions: │
│ • Right-size memory allocation │
│ • Use Gen 2 concurrency to reduce instances │
│ • Batch events when possible │
│ │
└─────────────────────────────────────────────────────────────────┘

Best Practices Checklist
- [ ] Use decision tree to select appropriate compute
- [ ] Default to Cloud Run for new HTTP workloads
- [ ] Use GKE Autopilot unless specific Standard features needed
- [ ] Enable Workload Identity for all GKE clusters
- [ ] Use Spot/Preemptible for fault-tolerant workloads
- [ ] Implement proper health checks and graceful shutdown
- [ ] Set resource limits and requests appropriately
- [ ] Use committed use discounts for predictable workloads
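Several of the checklist items (health checks, graceful shutdown, resource limits) come together in a single Deployment manifest. A hedged sketch for GKE — the name, image, port, probe path, and numbers are placeholders to adapt per workload:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      terminationGracePeriodSeconds: 30   # allow in-flight requests to drain
      containers:
        - name: my-api
          image: gcr.io/PROJECT/my-api:latest
          ports:
            - containerPort: 8080
          resources:
            requests:          # what the scheduler reserves
              cpu: 250m
              memory: 256Mi
            limits:            # hard ceiling before throttling/OOM kill
              cpu: "1"
              memory: 512Mi
          readinessProbe:      # gate traffic until the app is ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
          livenessProbe:       # restart the container if the app wedges
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          lifecycle:
            preStop:           # give the load balancer time to deregister
              exec:
                command: ["sleep", "5"]
```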
⚖️ Trade-offs
Trade-off 1: GKE Standard vs Autopilot
| Aspect | GKE Standard | GKE Autopilot |
|---|---|---|
| Control | Full | Limited |
| Cost model | Pay for nodes | Pay for pods |
| Ops overhead | High | Low |
| GPU/TPU | Yes | Limited |
| DaemonSets | Yes | No |
| Best for | Complex, stateful | Standard workloads |
Recommendation: start with Autopilot; migrate to Standard only when you need its specific features.
Trade-off 2: Cloud Run vs Cloud Functions
| Aspect | Cloud Run | Cloud Functions |
|---|---|---|
| Container | Any | Runtime-specific |
| Timeout | 60 min | 60 min (Gen 2) |
| Concurrency | Up to 1,000 | Up to 1,000 (Gen 2) |
| Cold start | Depends on image size | Faster |
| Use case | HTTP services | Event triggers |
Trade-off 3: Spot VMs vs On-Demand
| Aspect | Spot VMs | On-Demand |
|---|---|---|
| Discount | Up to 91% | 0% |
| Availability | Not guaranteed | Guaranteed |
| Preemption | Yes, at any time (legacy Preemptible VMs are also capped at 24h runtime) | No |
| Best for | Batch, fault-tolerant | Stateful, critical |
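On GKE, the Spot trade-off above is usually applied per workload. A hedged sketch of a fault-tolerant batch Job pinned to Spot capacity — the job name and image are placeholders; on Autopilot the `cloud.google.com/gke-spot` nodeSelector alone requests Spot pods, while on Standard it targets node pools created with `--spot`:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
spec:
  backoffLimit: 10            # retries absorb preemption-induced failures
  template:
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        cloud.google.com/gke-spot: "true"   # schedule only on Spot nodes
      tolerations:                          # tolerate the Spot taint, if set
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: worker
          image: gcr.io/PROJECT/batch-worker:latest  # placeholder image
```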
🚨 Failure Modes
Failure Mode 1: Cold Start Latency
🔥 Real-world incident
A Cloud Run service scaled to zero. The first request after two idle hours saw 30s latency (large container image plus DB connection setup). The customer-facing SLA was violated.
| How to detect | How to prevent |
|---|---|
| P99 latency spikes | min-instances > 0 |
| Timeout errors | Startup CPU boost |
| User complaints | Smaller container images |
| Monitoring alerts | Connection pooling |
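The prevention items above map to a few fields in the Cloud Run service spec. A hedged sketch — the service name, image, port, and `/healthz` path are placeholders:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"        # never scale to zero
        run.googleapis.com/startup-cpu-boost: "true" # extra CPU during startup
    spec:
      containers:
        - image: gcr.io/PROJECT/my-api:latest
          startupProbe:        # hold traffic until the app reports ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 1
            failureThreshold: 30
```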
Failure Mode 2: GKE Node Pool Exhaustion
| How to detect | How to prevent |
|---|---|
| Pending pods | Node auto-provisioning |
| Scale-up failures | Multiple node pools |
| Quota errors | Pre-warm capacity |
Failure Mode 3: Spot VM Preemption Storm
| How to detect | How to prevent |
|---|---|
| Sudden capacity drop | Diversify machine types |
| Batch job failures | Checkpointing |
| Service degradation | Mixed Spot + On-Demand |
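One way to implement "Mixed Spot + On-Demand" is soft node affinity: prefer Spot nodes, but let the scheduler fall back to on-demand capacity when Spot is preempted away. A hedged sketch — names and image are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 6
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      affinity:
        nodeAffinity:
          # "preferred" = fall back to on-demand nodes if no Spot is available
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: cloud.google.com/gke-spot
                    operator: In
                    values: ["true"]
      tolerations:
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: worker
          image: gcr.io/PROJECT/worker:latest  # placeholder image
```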
🔐 Security Baseline
Compute Security Requirements
| Requirement | Implementation | Verification |
|---|---|---|
| Workload Identity | GKE, Cloud Run | No SA keys |
| Private clusters | No public nodes | Security scan |
| Shielded VMs | Secure boot enabled | Configuration audit |
| Container scanning | Artifact Registry | Vulnerability scan |
| Binary Authorization | GKE enabled | Policy enforcement |
Security Checklist by Platform
| Platform | Key Security Items |
|---|---|
| GCE | Shielded VMs, OS Login, no public IP |
| GKE | Private cluster, Workload Identity, Binary Auth |
| Cloud Run | No public access (unless needed), SA per service |
| Cloud Functions | VPC connector, SA per function |
📊 Ops Readiness
Metrics to Monitor
| Platform | Key Metrics | Alert Threshold |
|---|---|---|
| GCE | CPU, Memory, Disk | > 80% |
| GKE | Pod restarts, Node status | Restarts > 5 |
| Cloud Run | Request latency, Instance count | P99 > 2s |
| Functions | Execution time, Error rate | Error > 1% |
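As one example, the Cloud Run P99 threshold above can be expressed as a Cloud Monitoring alert policy, deployable with `gcloud alpha monitoring policies create --policy-from-file`. A hedged sketch — display names are placeholders and the notification channel list is left empty to fill in:

```yaml
displayName: cloud-run-p99-latency
combiner: OR
conditions:
  - displayName: P99 request latency > 2s
    conditionThreshold:
      filter: >
        resource.type = "cloud_run_revision"
        AND metric.type = "run.googleapis.com/request_latencies"
      aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_PERCENTILE_99
      comparison: COMPARISON_GT
      thresholdValue: 2000   # milliseconds
      duration: 300s
notificationChannels: []     # add channel resource names here
```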
Runbook Entry Points
| Situation | Runbook |
|---|---|
| High latency | runbook/latency-investigation.md |
| Pod CrashLoopBackOff | runbook/pod-crashloop.md |
| Spot preemption | runbook/spot-preemption-handling.md |
| Cloud Run cold start | runbook/cold-start-optimization.md |
| OOM kills | runbook/memory-optimization.md |
| GKE node issues | runbook/gke-node-troubleshooting.md |
✅ Design Review Checklist
Platform Selection
- [ ] Decision tree followed
- [ ] Team capability matched
- [ ] Cost model understood
- [ ] Scaling requirements met
Security
- [ ] Workload Identity enabled
- [ ] No public IPs unnecessary
- [ ] Container scanning enabled
- [ ] Private networking configured
Operations
- [ ] Health checks implemented
- [ ] Graceful shutdown handled
- [ ] Resource limits set
- [ ] Autoscaling configured
Cost
- [ ] Spot/Preemptible evaluated
- [ ] CUDs for stable workloads
- [ ] Right-sizing applied
- [ ] Scale-to-zero where applicable
📎 Links
- 📎 AWS Compute Decisioning - Comparison with AWS compute options
- 📎 VPC & Networking - Network requirements for compute
- 📎 GCP IAM Model - Workload Identity setup
- 📎 Terraform Modules - IaC patterns for compute
- 📎 GCP Cost & Quotas - Compute cost optimization