💻 Compute Patterns
Level: Core. Solves: choosing the right compute platform for each workload, balancing trade-offs in control, scalability, and operational overhead.
🎯 Objectives (Outcomes)
After applying the material on this page, you will be able to:
- Choose the right compute platform based on workload requirements
- Design a GKE cluster with node pools and Autopilot
- Deploy Cloud Run for containerized HTTP services
- Use Cloud Functions for event-driven processing
- Optimize cost with Spot VMs, CUDs, and scale-to-zero
- Compare with AWS compute options
✅ When to use
| Platform | Use Case | Why |
|---|---|---|
| Cloud Run | HTTP APIs, microservices | Scale-to-zero, no infra management |
| GKE Autopilot | Multi-service app, K8s teams | Managed nodes, pay-per-pod |
| GKE Standard | Complex networking, stateful | Full control, GPU/TPU |
| Cloud Functions | Event triggers | Single-purpose, pay-per-invocation |
| GCE | Legacy, specific OS | Full control, BYOL |
❌ When NOT to use
| Pattern | Problem | Alternative |
|---|---|---|
| GCE for new workloads | Ops overhead | Cloud Run, GKE |
| Cloud Run for long-running jobs | 60 min request timeout | GKE |
| GKE Standard when no customization is needed | Cost, complexity | GKE Autopilot |
| Functions Gen 1 | Limitations (9 min timeout, no concurrency) | Functions Gen 2 |
| VMs without a MIG | No auto-healing | Managed Instance Group |
⚠️ Warning from Raizo
"One team chose GKE Standard for a simple microservice. Six months later, they were spending 40% of their time maintaining the cluster and upgrading nodes. After migrating to Cloud Run, ops effort dropped by 90%. Choose the platform that matches your team's capability."
Decision Framework
Compute Spectrum
┌─────────────────────────────────────────────────────────────────┐
│ GCP COMPUTE SPECTRUM │
├─────────────────────────────────────────────────────────────────┤
│ │
│ MORE CONTROL LESS CONTROL │
│ MORE OPS BURDEN LESS OPS BURDEN │
│ ◄─────────────────────────────────────────────────────────► │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────────┐ │
│ │ GCE │ │ GKE │ │Cloud Run│ │Cloud Functions │ │
│ │ (VMs) │ │ (K8s) │ │(Contain)│ │ (Functions) │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────────────┘ │
│ │
│ You manage: You manage: You manage: You manage: │
│ • OS • Containers • Container • Code only │
│ • Runtime • K8s configs • Code • Dependencies │
│ • Scaling • Networking │
│ • Patching │
│ │
│ GCP manages: GCP manages: GCP manages: GCP manages: │
│ • Hardware • Control • Everything • Everything │
│ • plane • else • else │
│ • Node pools │
│ │
└─────────────────────────────────────────────────────────────────┘

Decision Tree
Compute Engine (GCE)
When to Use GCE
| Use Case | Why GCE |
|---|---|
| Legacy applications | Lift-and-shift without containerization |
| Specific OS requirements | Custom kernels, Windows Server |
| GPU/TPU workloads | Direct hardware access |
| Licensing constraints | BYOL software tied to VMs |
| Stateful workloads | Databases, persistent storage |
Machine Type Selection
┌─────────────────────────────────────────────────────────────────┐
│ MACHINE TYPE FAMILIES │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GENERAL PURPOSE (Most workloads) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ E2: Cost-optimized, burstable (dev/test, small apps) │ │
│ │ N2/N2D: Balanced (web servers, app servers) │ │
│ │ C3: Latest gen, best price-performance │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ COMPUTE OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ C2/C2D: High CPU performance (gaming, HPC, batch) │ │
│ │ H3: Highest per-core performance │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ MEMORY OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ M2/M3: Large in-memory databases (SAP HANA) │ │
│ │ Up to 12TB RAM │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ACCELERATOR OPTIMIZED │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ A2/A3: NVIDIA GPUs (ML training, inference) │ │
│ │ TPU VMs: Google TPUs (large-scale ML) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

GCE Best Practices
```yaml
# Managed Instance Group (MIG) for production
instanceTemplate:
  machineType: n2-standard-4
  disks:
    - boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-11
  networkInterfaces:
    - network: projects/PROJECT/global/networks/VPC_NAME
      subnetwork: regions/REGION/subnetworks/SUBNET_NAME
      # No external IP for security
  serviceAccounts:
    - email: app-sa@PROJECT.iam.gserviceaccount.com
      scopes:
        - https://www.googleapis.com/auth/cloud-platform
  shieldedInstanceConfig:
    enableSecureBoot: true
    enableVtpm: true
    enableIntegrityMonitoring: true
```

Google Kubernetes Engine (GKE)
GKE Modes
┌─────────────────────────────────────────────────────────────────┐
│ GKE STANDARD vs AUTOPILOT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GKE STANDARD GKE AUTOPILOT │
│ ──────────── ──────────── │
│ • You manage node pools • Google manages nodes │
│ • Full K8s customization • Opinionated defaults │
│ • Pay for nodes (running) • Pay for pods (running) │
│ • Manual scaling/upgrades • Auto scaling/upgrades │
│ • Any workload type • Stateless preferred │
│ │
│ CHOOSE STANDARD WHEN: CHOOSE AUTOPILOT WHEN: │
│ • Need DaemonSets • Want minimal ops │
│ • Specific node configs • Standard workloads │
│ • GPU/TPU workloads • Cost optimization │
│ • Windows containers • Rapid scaling needed │
│ • Privileged containers • New to Kubernetes │
│ │
└─────────────────────────────────────────────────────────────────┘

GKE Architecture Patterns
┌─────────────────────────────────────────────────────────────────┐
│ PRODUCTION GKE ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ GKE Cluster │ │
│ │ ┌─────────────────────────────────────────────────┐ │ │
│ │ │ Control Plane (Managed) │ │ │
│ │ │ • API Server • etcd • Scheduler │ │ │
│ │ └─────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Node Pools: │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ default │ │ high-mem │ │ gpu │ │ │
│ │ │ n2-std-4 │ │ n2-highmem-8│ │ a2-highgpu │ │ │
│ │ │ Spot: Yes │ │ Spot: No │ │ Spot: Yes │ │ │
│ │ │ Autoscale │ │ Autoscale │ │ Manual │ │ │
│ │ │ 1-10 nodes │ │ 2-20 nodes │ │ 0-4 nodes │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ Features: │ │
│ │ • Workload Identity ✓ │ │
│ │ • Private cluster ✓ │ │
│ │ • Binary Authorization ✓ │ │
│ │ • Network Policy ✓ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

GKE Security Checklist
- [ ] Private cluster (no public endpoint)
- [ ] Workload Identity enabled
- [ ] Shielded GKE nodes
- [ ] Binary Authorization
- [ ] Network Policy enabled
- [ ] Pod Security Standards
- [ ] Regular node auto-upgrade
- [ ] Secrets encrypted with Cloud KMS
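Two of the checklist items above can be expressed directly as Kubernetes manifests. A hedged sketch — the `production` namespace, the service-account names, and `PROJECT` are placeholders, and the GCP-side IAM binding (`roles/iam.workloadIdentityUser` on the GCP service account) must also be granted separately:

```yaml
# Workload Identity: bind a Kubernetes SA to a GCP SA (names are placeholders)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
  annotations:
    iam.gke.io/gcp-service-account: app-sa@PROJECT.iam.gserviceaccount.com
---
# Default-deny NetworkPolicy: pods in the namespace send/receive no traffic
# unless a more specific policy allows it
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```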
Cloud Run
When to Use Cloud Run
┌─────────────────────────────────────────────────────────────────┐
│ CLOUD RUN SWEET SPOT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ✅ IDEAL FOR: ❌ NOT IDEAL FOR: │
│ ───────────── ──────────────── │
│ • HTTP APIs/microservices • Long-running processes │
│ • Web applications • Stateful workloads │
│ • Event-driven processing • GPU/TPU workloads │
│ • Async jobs (Cloud Run Jobs) • Windows containers │
│ • Rapid scaling (0 to N) • Complex networking │
│ • Cost-sensitive (scale to 0) • Persistent connections │
│ │
│ LIMITS: │
│ • Max 60 min request timeout (services) │
│ • Max 24 hours (jobs) │
│ • Max 32 GiB memory, 8 vCPUs │
│ • Max 1,000 concurrent requests per instance │
│ │
└─────────────────────────────────────────────────────────────────┘

Cloud Run Configuration
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"       # Min instances
        autoscaling.knative.dev/maxScale: "100"     # Max instances
        run.googleapis.com/cpu-throttling: "false"  # Always-on CPU
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      containerConcurrency: 80  # Requests per instance
      timeoutSeconds: 300
      serviceAccountName: my-api-sa@PROJECT.iam.gserviceaccount.com
      containers:
        - image: gcr.io/PROJECT/my-api:latest
          resources:
            limits:
              cpu: "2"
              memory: "2Gi"
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: host
```

Cloud Functions
Gen 1 vs Gen 2
| Feature | Gen 1 | Gen 2 |
|---|---|---|
| Runtime | Node, Python, Go, Java | Same + more |
| Max timeout | 9 minutes | 60 minutes |
| Max memory | 8 GB | 32 GB |
| Concurrency | 1 request/instance | Up to 1000 |
| Min instances | No | Yes |
| Traffic splitting | No | Yes |
| Underlying | Proprietary | Cloud Run |
💡 Gen 2 Recommendation
Always use Gen 2 for new functions. Gen 2 is built on Cloud Run, offering better performance, longer timeouts, and concurrency support.
Cloud Functions Use Cases
┌─────────────────────────────────────────────────────────────────┐
│ CLOUD FUNCTIONS PATTERNS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ EVENT-DRIVEN TRIGGERS: │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Cloud Storage ──► Function ──► Process file │ │
│ │ (object created) │ │
│ │ │ │
│ │ Pub/Sub ──► Function ──► Transform & forward │ │
│ │ (message) │ │
│ │ │ │
│ │ Firestore ──► Function ──► Send notification │ │
│ │ (document change) │ │
│ │ │ │
│ │ Cloud Scheduler ──► Function ──► Cron job │ │
│ │ (scheduled) │ │
│ │ │ │
│ │ Eventarc ──► Function ──► React to any GCP event │ │
│ │ (audit logs) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Cost Comparison
Pricing Model Summary
| Service | Pricing Model | Scale to Zero | Best For |
|---|---|---|---|
| GCE | Per-second (min 1 min) | No | Predictable, always-on |
| GKE | Nodes + management fee | No (nodes) | Complex, multi-service |
| Cloud Run | Per-request + CPU/memory | Yes | Variable traffic |
| Cloud Functions | Per-invocation + compute | Yes | Event-driven, sporadic |
Cost Optimization Tips
┌─────────────────────────────────────────────────────────────────┐
│ COST OPTIMIZATION STRATEGIES │
├─────────────────────────────────────────────────────────────────┤
│ │
│ GCE: │
│ • Committed Use Discounts (1-3 year): Up to 57% off │
│ • Spot VMs: Up to 91% off (can be preempted) │
│ • Right-sizing recommendations │
│ │
│ GKE: │
│ • Autopilot: Pay only for pods │
│ • Spot node pools for batch workloads │
│ • Cluster autoscaler + node auto-provisioning │
│ │
│ Cloud Run: │
│ • Scale to zero for dev/staging │
│ • CPU throttling for background tasks │
│ • Committed use discounts available │
│ │
│ Cloud Functions: │
│ • Right-size memory allocation │
│ • Use Gen 2 concurrency to reduce instances │
│ • Batch events when possible │
│ │
└─────────────────────────────────────────────────────────────────┘

Best Practices Checklist
- [ ] Use decision tree to select appropriate compute
- [ ] Default to Cloud Run for new HTTP workloads
- [ ] Use GKE Autopilot unless specific Standard features needed
- [ ] Enable Workload Identity for all GKE clusters
- [ ] Use Spot/Preemptible for fault-tolerant workloads
- [ ] Implement proper health checks and graceful shutdown
- [ ] Set resource limits and requests appropriately
- [ ] Use committed use discounts for predictable workloads
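Several of the checklist items (health checks, graceful shutdown, resource limits) come together in a single Deployment manifest. A hedged sketch for GKE — the name, image, port, probe path, and numbers are placeholders to adapt per workload:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      terminationGracePeriodSeconds: 30   # allow in-flight requests to drain
      containers:
        - name: my-api
          image: gcr.io/PROJECT/my-api:latest
          ports:
            - containerPort: 8080
          resources:
            requests:          # what the scheduler reserves
              cpu: 250m
              memory: 256Mi
            limits:            # hard ceiling before throttling/OOM kill
              cpu: "1"
              memory: 512Mi
          readinessProbe:      # gate traffic until the app is ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 10
          livenessProbe:       # restart the container if the app wedges
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          lifecycle:
            preStop:           # give the load balancer time to deregister
              exec:
                command: ["sleep", "5"]
```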
⚖️ Trade-offs
Trade-off 1: GKE Standard vs Autopilot
| Aspect | GKE Standard | GKE Autopilot |
|---|---|---|
| Control | Full | Limited |
| Cost model | Pay for nodes | Pay for pods |
| Ops overhead | High | Low |
| GPU/TPU | Yes | Limited |
| DaemonSets | Yes | No |
| Best for | Complex, stateful | Standard workloads |
Recommendation: start with Autopilot; migrate to Standard only when you need its specific features.
Trade-off 2: Cloud Run vs Cloud Functions
| Aspect | Cloud Run | Cloud Functions |
|---|---|---|
| Container | Any | Runtime-specific |
| Timeout | 60 min | 60 min (Gen 2) |
| Concurrency | Up to 1,000 | Up to 1,000 (Gen 2) |
| Cold start | Depends on image size | Faster |
| Use case | HTTP services | Event triggers |
Trade-off 3: Spot VMs vs On-Demand
| Aspect | Spot VMs | On-Demand |
|---|---|---|
| Discount | Up to 91% | 0% |
| Availability | Not guaranteed | Guaranteed |
| Preemption | Yes, at any time (legacy Preemptible VMs are also capped at 24h runtime) | No |
| Best for | Batch, fault-tolerant | Stateful, critical |
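On GKE, the Spot trade-off above is usually applied per workload. A hedged sketch of a fault-tolerant batch Job pinned to Spot capacity — the job name and image are placeholders; on Autopilot the `cloud.google.com/gke-spot` nodeSelector alone requests Spot pods, while on Standard it targets node pools created with `--spot`:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-processor
spec:
  backoffLimit: 10            # retries absorb preemption-induced failures
  template:
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        cloud.google.com/gke-spot: "true"   # schedule only on Spot nodes
      tolerations:                          # tolerate the Spot taint, if set
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: worker
          image: gcr.io/PROJECT/batch-worker:latest  # placeholder image
```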
🚨 Failure Modes
Failure Mode 1: Cold Start Latency
🔥 Real-world incident
A Cloud Run service scaled to zero. The first request after two idle hours saw 30s latency (large container image plus DB connection setup). The customer-facing SLA was violated.
| How to detect | How to prevent |
|---|---|
| P99 latency spikes | min-instances > 0 |
| Timeout errors | Startup CPU boost |
| User complaints | Smaller container images |
| Monitoring alerts | Connection pooling |
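The prevention items above map to a few fields in the Cloud Run service spec. A hedged sketch — the service name, image, port, and `/healthz` path are placeholders:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"        # never scale to zero
        run.googleapis.com/startup-cpu-boost: "true" # extra CPU during startup
    spec:
      containers:
        - image: gcr.io/PROJECT/my-api:latest
          startupProbe:        # hold traffic until the app reports ready
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 1
            failureThreshold: 30
```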
Failure Mode 2: GKE Node Pool Exhaustion
| How to detect | How to prevent |
|---|---|
| Pending pods | Node auto-provisioning |
| Scale-up failures | Multiple node pools |
| Quota errors | Pre-warm capacity |
Failure Mode 3: Spot VM Preemption Storm
| How to detect | How to prevent |
|---|---|
| Sudden capacity drop | Diversify machine types |
| Batch job failures | Checkpointing |
| Service degradation | Mixed Spot + On-Demand |
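One way to implement "Mixed Spot + On-Demand" is soft node affinity: prefer Spot nodes, but let the scheduler fall back to on-demand capacity when Spot is preempted away. A hedged sketch — names and image are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 6
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      affinity:
        nodeAffinity:
          # "preferred" = fall back to on-demand nodes if no Spot is available
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: cloud.google.com/gke-spot
                    operator: In
                    values: ["true"]
      tolerations:
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: worker
          image: gcr.io/PROJECT/worker:latest  # placeholder image
```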
🔐 Security Baseline
Compute Security Requirements
| Requirement | Implementation | Verification |
|---|---|---|
| Workload Identity | GKE, Cloud Run | No SA keys |
| Private clusters | No public nodes | Security scan |
| Shielded VMs | Secure boot enabled | Configuration audit |
| Container scanning | Artifact Registry | Vulnerability scan |
| Binary Authorization | GKE enabled | Policy enforcement |
Security Checklist by Platform
| Platform | Key Security Items |
|---|---|
| GCE | Shielded VMs, OS Login, no public IP |
| GKE | Private cluster, Workload Identity, Binary Auth |
| Cloud Run | No public access (unless needed), SA per service |
| Cloud Functions | VPC connector, SA per function |
📊 Ops Readiness
Metrics to Monitor
| Platform | Key Metrics | Alert Threshold |
|---|---|---|
| GCE | CPU, Memory, Disk | > 80% |
| GKE | Pod restarts, Node status | Restarts > 5 |
| Cloud Run | Request latency, Instance count | P99 > 2s |
| Functions | Execution time, Error rate | Error > 1% |
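As one example, the Cloud Run P99 threshold above can be expressed as a Cloud Monitoring alert policy, deployable with `gcloud alpha monitoring policies create --policy-from-file`. A hedged sketch — display names are placeholders and the notification channel list is left empty to fill in:

```yaml
displayName: cloud-run-p99-latency
combiner: OR
conditions:
  - displayName: P99 request latency > 2s
    conditionThreshold:
      filter: >
        resource.type = "cloud_run_revision"
        AND metric.type = "run.googleapis.com/request_latencies"
      aggregations:
        - alignmentPeriod: 60s
          perSeriesAligner: ALIGN_PERCENTILE_99
      comparison: COMPARISON_GT
      thresholdValue: 2000   # milliseconds
      duration: 300s
notificationChannels: []     # add channel resource names here
```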
Runbook Entry Points
| Situation | Runbook |
|---|---|
| High latency | runbook/latency-investigation.md |
| Pod CrashLoopBackOff | runbook/pod-crashloop.md |
| Spot preemption | runbook/spot-preemption-handling.md |
| Cloud Run cold start | runbook/cold-start-optimization.md |
| OOM kills | runbook/memory-optimization.md |
| GKE node issues | runbook/gke-node-troubleshooting.md |
✅ Design Review Checklist
Platform Selection
- [ ] Decision tree followed
- [ ] Team capability matched
- [ ] Cost model understood
- [ ] Scaling requirements met
Security
- [ ] Workload Identity enabled
- [ ] No public IPs unnecessary
- [ ] Container scanning enabled
- [ ] Private networking configured
Operations
- [ ] Health checks implemented
- [ ] Graceful shutdown handled
- [ ] Resource limits set
- [ ] Autoscaling configured
Cost
- [ ] Spot/Preemptible evaluated
- [ ] CUDs for stable workloads
- [ ] Right-sizing applied
- [ ] Scale-to-zero where applicable
📎 Links
- 📎 AWS Compute Decisioning - Comparison with AWS compute options
- 📎 VPC & Networking - Network requirements for compute
- 📎 GCP IAM Model - Workload Identity setup
- 📎 Terraform Modules - IaC patterns for compute
- 📎 GCP Cost & Quotas - Compute cost optimization