💻 Compute Patterns

Level: Core
Solves: Choosing the right compute platform for a workload, weighing the trade-offs among control, scalability, and operational overhead

🎯 Outcomes

After applying the material on this page, you will be able to:

  • Choose the right compute platform based on workload requirements
  • Design a GKE cluster using node pools or Autopilot
  • Deploy Cloud Run for containerized HTTP services
  • Use Cloud Functions for event-driven processing
  • Optimize cost with Spot VMs, CUDs, and scale-to-zero
  • Compare with AWS compute options

When to use

| Platform | Use case | Why |
|---|---|---|
| Cloud Run | HTTP APIs, microservices | Scale-to-zero, no infra management |
| GKE Autopilot | Multi-service apps, K8s teams | Managed nodes, pay-per-pod |
| GKE Standard | Complex networking, stateful | Full control, GPU/TPU |
| Cloud Functions | Event triggers | Single-purpose, pay-per-invocation |
| GCE | Legacy, specific OS | Full control, BYOL |

When NOT to use

| Pattern | Problem | Alternative |
|---|---|---|
| GCE for new workloads | Ops overhead | Cloud Run, GKE |
| Cloud Run for long-running requests | 60 min request timeout | GKE |
| GKE Standard with no need for customization | Cost, complexity | GKE Autopilot |
| Functions Gen 1 | Limitations | Functions Gen 2 |
| VMs without a MIG | No auto-healing | Managed Instance Group |

⚠️ A warning from Raizo

"A team chose GKE Standard for a simple microservice. Six months later, they were spending 40% of their time maintaining the cluster and upgrading nodes. After migrating to Cloud Run, ops effort dropped by 90%. Choose a platform that matches your team's capability."

Decision Framework

Compute Spectrum

┌─────────────────────────────────────────────────────────────────┐
│                 GCP COMPUTE SPECTRUM                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  MORE CONTROL                              LESS CONTROL         │
│  MORE OPS BURDEN                           LESS OPS BURDEN      │
│  ◄─────────────────────────────────────────────────────────►    │
│                                                                 │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────────┐    │
│  │   GCE   │  │   GKE   │  │Cloud Run│  │Cloud Functions  │    │
│  │  (VMs)  │  │  (K8s)  │  │(Contain)│  │   (Functions)   │    │
│  └─────────┘  └─────────┘  └─────────┘  └─────────────────┘    │
│                                                                 │
│  You manage:  You manage:  You manage:  You manage:            │
│  • OS        • Containers  • Container  • Code only            │
│  • Runtime   • K8s configs • Code       • Dependencies         │
│  • Scaling   • Networking                                       │
│  • Patching                                                     │
│                                                                 │
│  GCP manages: GCP manages: GCP manages: GCP manages:           │
│  • Hardware  • Control     • Everything • Everything           │
│              • plane       • else       • else                 │
│              • Node pools                                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Decision Tree
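
As a rough sketch, the selection guidance in the "when to use" tables above could be encoded as a small function. The `Workload` fields, the order of the checks, and the returned labels are illustrative assumptions, not an official Google Cloud decision procedure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    legacy_or_custom_os: bool = False   # lift-and-shift, custom kernel, BYOL
    needs_node_control: bool = False    # specific node configs, privileged pods
    multi_service_k8s: bool = False     # team already runs on Kubernetes
    event_triggered_only: bool = False  # single-purpose reaction to an event
    max_request_minutes: int = 1        # longest expected request duration


def choose_platform(w: Workload) -> str:
    """Walk the checks from most control to least, as in the compute spectrum."""
    if w.legacy_or_custom_os:
        return "GCE (Managed Instance Group)"
    if w.needs_node_control:
        return "GKE Standard"
    if w.multi_service_k8s or w.max_request_minutes > 60:
        return "GKE Autopilot"
    if w.event_triggered_only:
        return "Cloud Functions (Gen 2)"
    return "Cloud Run"


print(choose_platform(Workload()))
print(choose_platform(Workload(max_request_minutes=90)))
```

The default falls through to Cloud Run, which mirrors the advice to default to Cloud Run for new HTTP workloads.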

Compute Engine (GCE)

When to Use GCE

| Use case | Why GCE |
|---|---|
| Legacy applications | Lift-and-shift without containerization |
| Specific OS requirements | Custom kernels, Windows Server |
| GPU/TPU workloads | Direct hardware access |
| Licensing constraints | BYOL software tied to VMs |
| Stateful workloads | Databases, persistent storage |

Machine Type Selection

┌─────────────────────────────────────────────────────────────────┐
│                 MACHINE TYPE FAMILIES                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  GENERAL PURPOSE (Most workloads)                               │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ E2: Cost-optimized, burstable (dev/test, small apps)    │    │
│  │ N2/N2D: Balanced (web servers, app servers)             │    │
│  │ C3: Latest gen, best price-performance                  │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  COMPUTE OPTIMIZED                                              │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ C2/C2D: High CPU performance (gaming, HPC, batch)       │    │
│  │ H3: Highest per-core performance                        │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  MEMORY OPTIMIZED                                               │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ M2/M3: Large in-memory databases (SAP HANA)             │    │
│  │ Up to 12TB RAM                                          │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ACCELERATOR OPTIMIZED                                          │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ A2/A3: NVIDIA GPUs (ML training, inference)             │    │
│  │ TPU VMs: Google TPUs (large-scale ML)                   │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

GCE Best Practices

yaml
# Managed Instance Group (MIG) for production
instanceTemplate:
  machineType: n2-standard-4
  disks:
    - boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-11
  networkInterfaces:
    - network: projects/PROJECT/global/networks/VPC_NAME
      subnetwork: regions/REGION/subnetworks/SUBNET_NAME
      # No external IP for security
  serviceAccounts:
    - email: app-sa@PROJECT.iam.gserviceaccount.com
      scopes:
        - https://www.googleapis.com/auth/cloud-platform
  shieldedInstanceConfig:
    enableSecureBoot: true
    enableVtpm: true
    enableIntegrityMonitoring: true
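
The practices in the template above (no external IP, a dedicated service account, Shielded VM options) can be checked mechanically. The sketch below assumes the template has already been loaded into a Python dict shaped like the YAML; `lint_instance_template` is a hypothetical helper, and `accessConfigs` is the Compute Engine API field through which an external IP is attached to a network interface.

```python
def lint_instance_template(template: dict) -> list[str]:
    """Return a list of findings; an empty list means all checks passed."""
    findings = []
    for nic in template.get("networkInterfaces", []):
        # An external IP shows up as a non-empty accessConfigs list.
        if nic.get("accessConfigs"):
            findings.append("network interface has an external IP")
    shielded = template.get("shieldedInstanceConfig", {})
    for flag in ("enableSecureBoot", "enableVtpm", "enableIntegrityMonitoring"):
        if not shielded.get(flag):
            findings.append(f"shielded VM option {flag} is not enabled")
    if not template.get("serviceAccounts"):
        findings.append("no dedicated service account configured")
    return findings


good = {
    "networkInterfaces": [{"network": "projects/PROJECT/global/networks/VPC_NAME"}],
    "serviceAccounts": [{"email": "app-sa@PROJECT.iam.gserviceaccount.com"}],
    "shieldedInstanceConfig": {
        "enableSecureBoot": True,
        "enableVtpm": True,
        "enableIntegrityMonitoring": True,
    },
}
print(lint_instance_template(good))
```

A check like this could run in CI before a template is applied, so that drift from the baseline is caught before deployment.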

Google Kubernetes Engine (GKE)

GKE Modes

┌─────────────────────────────────────────────────────────────────┐
│              GKE STANDARD vs AUTOPILOT                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  GKE STANDARD                        GKE AUTOPILOT              │
│  ────────────                        ────────────               │
│  • You manage node pools             • Google manages nodes     │
│  • Full K8s customization            • Opinionated defaults     │
│  • Pay for nodes (running)           • Pay for pods (running)   │
│  • Manual scaling/upgrades           • Auto scaling/upgrades    │
│  • Any workload type                 • Stateless preferred      │
│                                                                 │
│  CHOOSE STANDARD WHEN:               CHOOSE AUTOPILOT WHEN:     │
│  • Need DaemonSets                   • Want minimal ops         │
│  • Specific node configs             • Standard workloads       │
│  • GPU/TPU workloads                 • Cost optimization        │
│  • Windows containers                • Rapid scaling needed     │
│  • Privileged containers             • New to Kubernetes        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

GKE Architecture Patterns

┌─────────────────────────────────────────────────────────────────┐
│              PRODUCTION GKE ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                    GKE Cluster                          │    │
│  │  ┌─────────────────────────────────────────────────┐    │    │
│  │  │              Control Plane (Managed)            │    │    │
│  │  │  • API Server  • etcd  • Scheduler              │    │    │
│  │  └─────────────────────────────────────────────────┘    │    │
│  │                                                         │    │
│  │  Node Pools:                                            │    │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐       │    │
│  │  │  default    │ │  high-mem   │ │    gpu      │       │    │
│  │  │ n2-std-4    │ │ n2-highmem-8│ │ a2-highgpu  │       │    │
│  │  │ Spot: Yes   │ │ Spot: No    │ │ Spot: Yes   │       │    │
│  │  │ Autoscale   │ │ Autoscale   │ │ Manual      │       │    │
│  │  │ 1-10 nodes  │ │ 2-20 nodes  │ │ 0-4 nodes   │       │    │
│  │  └─────────────┘ └─────────────┘ └─────────────┘       │    │
│  │                                                         │    │
│  │  Features:                                              │    │
│  │  • Workload Identity ✓                                  │    │
│  │  • Private cluster ✓                                    │    │
│  │  • Binary Authorization ✓                               │    │
│  │  • Network Policy ✓                                     │    │
│  │                                                         │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

GKE Security Checklist

  • [ ] Private cluster (no public endpoint)
  • [ ] Workload Identity enabled
  • [ ] Shielded GKE nodes
  • [ ] Binary Authorization
  • [ ] Network Policy enabled
  • [ ] Pod Security Standards
  • [ ] Regular node auto-upgrade
  • [ ] Secrets encrypted with Cloud KMS

Cloud Run

When to Use Cloud Run

┌─────────────────────────────────────────────────────────────────┐
│                 CLOUD RUN SWEET SPOT                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ✅ IDEAL FOR:                       ❌ NOT IDEAL FOR:          │
│  ─────────────                       ────────────────           │
│  • HTTP APIs/microservices           • Long-running processes   │
│  • Web applications                  • Stateful workloads       │
│  • Event-driven processing           • GPU/TPU workloads        │
│  • Async jobs (Cloud Run Jobs)       • Windows containers       │
│  • Rapid scaling (0 to N)            • Complex networking       │
│  • Cost-sensitive (scale to 0)       • Persistent connections   │
│                                                                 │
│  LIMITS:                                                        │
│  • Max 60 min request timeout (services)                        │
│  • Max 24 hours (jobs)                                          │
│  • Max 32 GiB memory, 8 vCPUs                                   │
│  • Max 1000 concurrent requests per instance (default 80)       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Cloud Run Configuration

yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"      # Min instances
        autoscaling.knative.dev/maxScale: "100"    # Max instances
        run.googleapis.com/cpu-throttling: "false" # Always-on CPU
        run.googleapis.com/startup-cpu-boost: "true"
    spec:
      containerConcurrency: 80  # Requests per instance
      timeoutSeconds: 300
      serviceAccountName: my-api-sa@PROJECT.iam.gserviceaccount.com
      containers:
        - image: gcr.io/PROJECT/my-api:latest
          resources:
            limits:
              cpu: "2"
              memory: "2Gi"
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: db-host  # Secret Manager secret name (example)
                  key: latest    # On Cloud Run, key is the secret version
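
To reason about `containerConcurrency` together with the min/max scale annotations above, a back-of-the-envelope estimate based on Little's law (requests in flight = arrival rate × latency) can help. This is a rough sketch only: the real autoscaler also reacts to other signals such as CPU utilization, and `estimate_instances` is a hypothetical helper.

```python
import math


def estimate_instances(rps: float, avg_latency_s: float,
                       concurrency: int = 80,
                       min_scale: int = 1, max_scale: int = 100) -> int:
    """Estimate instances needed to serve a steady request rate."""
    in_flight = rps * avg_latency_s              # Little's law: L = lambda * W
    needed = math.ceil(in_flight / concurrency)  # each instance takes `concurrency` requests
    return max(min_scale, min(needed, max_scale))  # clamp to the configured bounds


# 400 req/s at 250 ms means 100 requests in flight, which needs 2 instances at concurrency 80.
print(estimate_instances(rps=400, avg_latency_s=0.25))
# Very high load is capped by maxScale.
print(estimate_instances(rps=50000, avg_latency_s=0.5))
```

If the estimate regularly hits `max_scale`, requests will queue or be rejected, which is a signal to raise the ceiling or reduce per-request latency.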

Cloud Functions

Gen 1 vs Gen 2

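| Feature | Gen 1 | Gen 2 |
|---|---|---|
| Runtime | Node, Python, Go, Java | Same + more |
| Max timeout | 9 minutes | 60 minutes (HTTP); 9 minutes (event-driven) |
| Max memory | 8 GB | 32 GB |
| Concurrency | 1 request/instance | Up to 1000 |
| Min instances | Yes | Yes |
| Traffic splitting | No | Yes |
| Underlying | Proprietary | Cloud Run |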

💡 Gen 2 Recommendation

Always use Gen 2 for new functions. Gen 2 is built on Cloud Run, offering better performance, longer timeouts, and concurrency support.

Cloud Functions Use Cases

┌─────────────────────────────────────────────────────────────────┐
│              CLOUD FUNCTIONS PATTERNS                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  EVENT-DRIVEN TRIGGERS:                                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                                                         │    │
│  │  Cloud Storage ──► Function ──► Process file            │    │
│  │  (object created)                                       │    │
│  │                                                         │    │
│  │  Pub/Sub ──► Function ──► Transform & forward           │    │
│  │  (message)                                              │    │
│  │                                                         │    │
│  │  Firestore ──► Function ──► Send notification           │    │
│  │  (document change)                                      │    │
│  │                                                         │    │
│  │  Cloud Scheduler ──► Function ──► Cron job              │    │
│  │  (scheduled)                                            │    │
│  │                                                         │    │
│  │  Eventarc ──► Function ──► React to any GCP event       │    │
│  │  (audit logs)                                           │    │
│  │                                                         │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
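
As an illustration of the Pub/Sub "transform & forward" pattern above, the sketch below decodes a Pub/Sub message as it arrives inside a CloudEvent payload (base64-encoded under `message.data`) and reshapes it. The order fields and the `handle_pubsub_event` name are hypothetical; in a deployed Gen 2 function this logic would typically be called from a handler registered with the Functions Framework (`@functions_framework.cloud_event`), passing `cloud_event.data`.

```python
import base64
import json


def handle_pubsub_event(event_data: dict) -> dict:
    """Decode the Pub/Sub message body and keep only the fields we forward."""
    raw = base64.b64decode(event_data["message"]["data"])
    order = json.loads(raw)
    # Hypothetical transformation: convert the total to integer cents.
    return {"order_id": order["id"], "total_cents": round(order["total"] * 100)}


# Local simulation of the payload shape delivered for a Pub/Sub trigger.
body = json.dumps({"id": "A1", "total": 12.5}).encode()
payload = {"message": {"data": base64.b64encode(body).decode()}}
print(handle_pubsub_event(payload))
```

Keeping the decoding logic in a plain function makes it testable without deploying or emulating the trigger.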

Cost Comparison

Pricing Model Summary

| Service | Pricing model | Scale to zero | Best for |
|---|---|---|---|
| GCE | Per-second (min 1 min) | No | Predictable, always-on |
| GKE | Nodes + management fee | No (nodes) | Complex, multi-service |
| Cloud Run | Per-request + CPU/memory | Yes | Variable traffic |
| Cloud Functions | Per-invocation + compute | Yes | Event-driven, sporadic |

Cost Optimization Tips

┌─────────────────────────────────────────────────────────────────┐
│              COST OPTIMIZATION STRATEGIES                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  GCE:                                                           │
│  • Committed Use Discounts (1-3 year): Up to 57% off            │
│  • Spot VMs: Up to 91% off (can be preempted)                   │
│  • Right-sizing recommendations                                 │
│                                                                 │
│  GKE:                                                           │
│  • Autopilot: Pay only for pods                                 │
│  • Spot node pools for batch workloads                          │
│  • Cluster autoscaler + node auto-provisioning                  │
│                                                                 │
│  Cloud Run:                                                     │
│  • Scale to zero for dev/staging                                │
│  • Keep CPU throttling on unless background work needs CPU      │
│  • Committed use discounts available                            │
│                                                                 │
│  Cloud Functions:                                               │
│  • Right-size memory allocation                                 │
│  • Use Gen 2 concurrency to reduce instances                    │
│  • Batch events when possible                                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
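
To make the discount figures above concrete, here is a small arithmetic sketch. The hourly rate is a made-up number rather than a published price, and the discounts are the "up to" ceilings quoted above, so real savings will usually be smaller.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def monthly_cost(hourly_rate: float, discount: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one instance for `hours` at a fractional discount."""
    return hourly_rate * hours * (1.0 - discount)


rate = 0.20  # hypothetical on-demand price in $/hour

print(f"on-demand, always on: {monthly_cost(rate):.2f}")
print(f"3-year CUD (57%):     {monthly_cost(rate, discount=0.57):.2f}")
print(f"Spot (91%):           {monthly_cost(rate, discount=0.91):.2f}")
# Scale-to-zero dev environment: billed only for about 8 h x 22 working days.
print(f"scale-to-zero dev:    {monthly_cost(rate, hours=8 * 22):.2f}")
```

The last line shows why scale-to-zero matters for dev and staging: paying only for active hours can rival a deep discount on an always-on instance.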

Best Practices Checklist

  • [ ] Use decision tree to select appropriate compute
  • [ ] Default to Cloud Run for new HTTP workloads
  • [ ] Use GKE Autopilot unless specific Standard features needed
  • [ ] Enable Workload Identity for all GKE clusters
  • [ ] Use Spot/Preemptible for fault-tolerant workloads
  • [ ] Implement proper health checks and graceful shutdown
  • [ ] Set resource limits and requests appropriately
  • [ ] Use committed use discounts for predictable workloads

⚖️ Trade-offs

Trade-off 1: GKE Standard vs Autopilot

| Aspect | GKE Standard | GKE Autopilot |
|---|---|---|
| Control | Full | Limited |
| Cost model | Pay for nodes | Pay for pods |
| Ops overhead | High | Low |
| GPU/TPU | Yes | Limited |
| DaemonSets | Yes | Limited |
| Best for | Complex, stateful | Standard workloads |

Recommendation: Start with Autopilot. Migrate to Standard only when you need specific features.


Trade-off 2: Cloud Run vs Cloud Functions

| Aspect | Cloud Run | Cloud Functions |
|---|---|---|
| Container | Any | Runtime-specific |
| Timeout | 60 min | 60 min (Gen 2, HTTP) |
| Concurrency | Up to 1000 | Up to 1000 (Gen 2) |
| Cold start | Depends on image | Faster |
| Use case | HTTP services | Event triggers |

Trade-off 3: Spot VMs vs On-Demand

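| Aspect | Spot VMs | On-Demand |
|---|---|---|
| Discount | Up to 91% | 0% |
| Availability | Not guaranteed | Guaranteed |
| Preemption | Yes (at any time; the 24h cap applies only to legacy preemptible VMs) | No |
| Best for | Batch, fault-tolerant | Stateful, critical |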

🚨 Failure Modes

Failure Mode 1: Cold Start Latency

🔥 Real-world incident

A Cloud Run service had scaled to zero. The first request after 2 hours of idle time took 30s (large container image plus DB connection setup), violating the customer-experience SLA.

| How to detect | How to prevent |
|---|---|
| P99 latency spikes | min-instances > 0 |
| Timeout errors | Startup CPU boost |
| User complaints | Smaller container images |
| Monitoring alerts | Connection pooling |

Failure Mode 2: GKE Node Pool Exhaustion

| How to detect | How to prevent |
|---|---|
| Pending pods | Node auto-provisioning |
| Scale-up failures | Multiple node pools |
| Quota errors | Pre-warm capacity |

Failure Mode 3: Spot VM Preemption Storm

| How to detect | How to prevent |
|---|---|
| Sudden capacity drop | Diversify machine types |
| Batch job failures | Checkpointing |
| Service degradation | Mixed Spot + On-Demand |
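
Checkpointing, listed above as a mitigation, can be sketched as follows: the job records the index of the next item after each unit of work, so a restarted VM resumes instead of starting over. The file path, the `fail_at` parameter used to simulate a preemption, and the doubling "work" are all illustrative; on Spot VMs the checkpoint would normally live in durable storage such as a Cloud Storage bucket rather than on local disk.

```python
import json
import os
import tempfile


def run_batch(items, checkpoint_path, fail_at=None):
    """Process items in order, persisting the next index after each one."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]  # resume where the last run stopped
    processed = []
    for i in range(start, len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated preemption")
        processed.append(items[i] * 2)  # hypothetical unit of work
        with open(checkpoint_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
    return processed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    try:
        run_batch([1, 2, 3, 4], path, fail_at=2)  # first run is "preempted"
    except RuntimeError:
        pass
    resumed = run_batch([1, 2, 3, 4], path)       # second run resumes at index 2
    print(resumed)
```

The second run processes only the items the first run did not finish, which is what keeps a preemption storm from turning into repeated full restarts.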

🔐 Security Baseline

Compute Security Requirements

| Requirement | Implementation | Verification |
|---|---|---|
| Workload Identity | GKE, Cloud Run | No SA keys |
| Private clusters | No public nodes | Security scan |
| Shielded VMs | Secure boot enabled | Configuration audit |
| Container scanning | Artifact Registry | Vulnerability scan |
| Binary Authorization | GKE enabled | Policy enforcement |

Security Checklist by Platform

| Platform | Key security items |
|---|---|
| GCE | Shielded VMs, OS Login, no public IP |
| GKE | Private cluster, Workload Identity, Binary Auth |
| Cloud Run | No public access (unless needed), SA per service |
| Cloud Functions | VPC connector, SA per function |

📊 Ops Readiness

Metrics to Monitor

| Platform | Key metrics | Alert threshold |
|---|---|---|
| GCE | CPU, Memory, Disk | > 80% |
| GKE | Pod restarts, Node status | Restarts > 5 |
| Cloud Run | Request latency, Instance count | P99 > 2s |
| Functions | Execution time, Error rate | Error > 1% |
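
The thresholds in the table above can be expressed as data and evaluated against metric samples. This is a minimal sketch: the metric key names are hypothetical, and in practice these rules would be defined as alerting policies in a monitoring system rather than in application code.

```python
# Thresholds taken from the table above; metric names are illustrative.
THRESHOLDS = {
    ("GCE", "cpu_utilization"): 0.80,
    ("GKE", "pod_restarts"): 5,
    ("Cloud Run", "p99_latency_s"): 2.0,
    ("Functions", "error_rate"): 0.01,
}


def breached(platform: str, metric: str, value: float) -> bool:
    """True when the observed value exceeds the configured threshold."""
    return value > THRESHOLDS[(platform, metric)]


def firing_alerts(samples):
    """Return the (platform, metric) pairs whose samples breach a threshold."""
    return [(p, m) for (p, m, v) in samples if breached(p, m, v)]


samples = [
    ("GCE", "cpu_utilization", 0.92),
    ("GKE", "pod_restarts", 2),
    ("Cloud Run", "p99_latency_s", 3.1),
    ("Functions", "error_rate", 0.004),
]
print(firing_alerts(samples))
```

Keeping thresholds in one table-like structure makes it easy to review them alongside the runbook entry points.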

Runbook Entry Points

| Situation | Runbook |
|---|---|
| High latency | runbook/latency-investigation.md |
| Pod CrashLoopBackOff | runbook/pod-crashloop.md |
| Spot preemption | runbook/spot-preemption-handling.md |
| Cloud Run cold start | runbook/cold-start-optimization.md |
| OOM kills | runbook/memory-optimization.md |
| GKE node issues | runbook/gke-node-troubleshooting.md |

Design Review Checklist

Platform Selection

  • [ ] Decision tree followed
  • [ ] Team capability matched
  • [ ] Cost model understood
  • [ ] Scaling requirements met

Security

  • [ ] Workload Identity enabled
  • [ ] No unnecessary public IPs
  • [ ] Container scanning enabled
  • [ ] Private networking configured

Operations

  • [ ] Health checks implemented
  • [ ] Graceful shutdown handled
  • [ ] Resource limits set
  • [ ] Autoscaling configured

Cost

  • [ ] Spot/Preemptible evaluated
  • [ ] CUDs for stable workloads
  • [ ] Right-sizing applied
  • [ ] Scale-to-zero where applicable

📎 Links