💰 Cost & Quotas

Level: Ops | Solves: Effective cost and quota management for enterprise workloads, with visibility and control

🎯 Outcomes

After applying the material on this page, you will be able to:

  • Set up Billing Export to BigQuery for cost analysis
  • Configure Budgets with alerts and programmatic actions
  • Purchase Committed Use Discounts at the right time
  • Deploy Spot VMs for fault-tolerant workloads
  • Manage Quotas and request increases proactively
  • Implement a Label Strategy for cost allocation

When to use

| Strategy | Use case | Why |
|---|---|---|
| CUDs | Stable, predictable workloads | Up to 57% discount |
| Spot VMs | Batch, CI/CD, fault-tolerant | Up to 91% discount |
| Budgets | Every project | Prevent surprises |
| Labels | Every resource | Cost allocation |
| Billing export | Organization | Deep analysis |

When NOT to use

| Pattern | Problem | Alternative |
|---|---|---|
| CUDs for variable workloads | Unused commitment is wasted | On-demand + Spot |
| Spot for production databases | Preemption risk | On-demand + CUD |
| No budgets | Bill shock | Always set budgets |
| No labels | Cost allocation is impossible | Mandatory labels |
| Manual quota management | Scale failures | Proactive requests |

⚠️ Warning from Raizo

"The team did not monitor quota. On Black Friday, scale-up failed because the vCPU quota was exhausted. Two hours of downtime. Always request a quota increase BEFORE you need it."

Billing Structure

GCP Billing Hierarchy

┌─────────────────────────────────────────────────────────────────┐
│                 GCP BILLING HIERARCHY                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │              Cloud Billing Account                      │    │
│  │  • Payment method (credit card, invoice)                │    │
│  │  • Billing contact                                      │    │
│  │  • Currency settings                                    │    │
│  └─────────────────────────────────────────────────────────┘    │
│                          │                                      │
│         ┌────────────────┼────────────────┐                     │
│         │                │                │                     │
│         ▼                ▼                ▼                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐             │
│  │  Project A  │  │  Project B  │  │  Project C  │             │
│  │  $500/month │  │  $1200/month│  │  $300/month │             │
│  └─────────────┘  └─────────────┘  └─────────────┘             │
│                                                                 │
│  ENTERPRISE PATTERN:                                            │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  Master Billing Account (Finance team)                  │    │
│  │  ├── Sub-account: Production                            │    │
│  │  │   └── Projects: prod-*                               │    │
│  │  ├── Sub-account: Development                           │    │
│  │  │   └── Projects: dev-*, stg-*                         │    │
│  │  └── Sub-account: Sandbox                               │    │
│  │      └── Projects: sbx-*                                │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Billing Export to BigQuery

sql
-- Enable billing export in Cloud Console
-- Billing > Billing export > BigQuery export

-- Query: Monthly cost by project
-- Credits are summed per row in a subquery. Joining UNNEST(credits)
-- would repeat a row's cost once per credit and inflate total_cost.
SELECT
  project.name AS project_name,
  SUM(cost) AS total_cost,
  SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) AS c), 0)) AS total_credits,
  SUM(cost)
    + SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) AS c), 0)) AS net_cost
FROM `billing_project.billing_dataset.gcp_billing_export_v1_XXXXXX`
WHERE invoice.month = '202401'
GROUP BY project.name
ORDER BY net_cost DESC;

-- Query: Cost by service and SKU
-- Aliases differ from the export's column names (service, sku, cost)
-- so GROUP BY and ORDER BY are unambiguous.
SELECT
  service.description AS service_name,
  sku.description AS sku_name,
  usage.unit AS usage_unit,
  SUM(cost) AS total_cost,
  SUM(usage.amount) AS usage_amount
FROM `billing_project.billing_dataset.gcp_billing_export_v1_XXXXXX`
WHERE invoice.month = '202401'
GROUP BY service_name, sku_name, usage_unit
ORDER BY total_cost DESC
LIMIT 20;
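
The net-cost arithmetic in the first query can be sanity-checked on a handful of rows. The sketch below assumes a simplified row shape (a project name, a `cost`, and a list of credit amounts) rather than the full export schema; `SAMPLE_ROWS` and `net_cost_by_project` are illustrative names. Credits are negative numbers, so net cost is cost plus credits.

```python
from collections import defaultdict

# Hypothetical, simplified rows; the real export has many more fields.
SAMPLE_ROWS = [
    {"project": "prod-api", "cost": 100.0, "credits": [-10.0, -5.0]},
    {"project": "prod-api", "cost": 50.0, "credits": []},
    {"project": "dev-api", "cost": 20.0, "credits": [-20.0]},
]


def net_cost_by_project(rows):
    """Return {project: net_cost}, counting each row's cost exactly once."""
    totals = defaultdict(float)
    for row in rows:
        # Credits are negative amounts, so adding them reduces the cost.
        totals[row["project"]] += row["cost"] + sum(row["credits"])
    return dict(totals)


if __name__ == "__main__":
    for project, net in sorted(net_cost_by_project(SAMPLE_ROWS).items()):
        print(f"{project}: {net:.2f}")
```

Each row contributes its cost once, however many credits it carries, which is the property a per-row credit subquery preserves in SQL.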

Budget Management

Budget Configuration

┌─────────────────────────────────────────────────────────────────┐
│                 BUDGET CONFIGURATION                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  BUDGET TYPES                                                   │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ Specified Amount: Fixed dollar amount                   │    │
│  │ • Example: $10,000/month                                │    │
│  │                                                         │    │
│  │ Last Month's Spend: Dynamic based on history            │    │
│  │ • Example: Alert if 20% higher than last month          │    │
│  │                                                         │    │
│  │ Last Period's Spend: Previous quarter or year as budget │    │
│  │ • Example: Alert if higher than last quarter            │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ALERT THRESHOLDS                                               │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ 50%  → Email notification (early warning)               │    │
│  │ 80%  → Email + Slack notification                       │    │
│  │ 100% → Email + Slack + PagerDuty                        │    │
│  │ 120% → All above + auto-disable billing (optional)      │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  SCOPE OPTIONS                                                  │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ • Billing account (all projects)                        │    │
│  │ • Specific projects                                     │    │
│  │ • Specific services                                     │    │
│  │ • Labels (e.g., team=data-engineering)                  │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Programmatic Budget Alerts

python
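# Cloud Function (Pub/Sub trigger) that routes budget notifications
import base64
import json


def send_email_alert(budget_name, percentage):
    """Placeholder: replace with a real email integration."""
    print(f"[email] {budget_name} at {percentage:.0f}% of budget")


def send_slack_alert(budget_name, percentage):
    """Placeholder: replace with a real Slack integration."""
    print(f"[slack] {budget_name} at {percentage:.0f}% of budget")


def send_pagerduty_alert(budget_name, percentage):
    """Placeholder: replace with a real PagerDuty integration."""
    print(f"[pagerduty] {budget_name} at {percentage:.0f}% of budget")


def budget_alert_handler(event, context):
    """Handle a budget notification delivered through Pub/Sub."""
    pubsub_message = base64.b64decode(event['data']).decode('utf-8')
    budget_notification = json.loads(pubsub_message)

    budget_name = budget_notification['budgetDisplayName']
    cost_amount = budget_notification['costAmount']
    budget_amount = budget_notification['budgetAmount']
    # Absent on routine updates where no threshold has been crossed
    threshold = budget_notification.get('alertThresholdExceeded')
    if threshold is None or not budget_amount:
        return

    # Percentage of the budget consumed so far
    percentage = (cost_amount / budget_amount) * 100

    if threshold >= 1.0:  # 100% threshold crossed
        # Critical: page the on-call engineer
        send_pagerduty_alert(budget_name, percentage)
        # Optional: disable billing
        # disable_billing_for_project(project_id)
    elif threshold >= 0.8:  # 80% threshold crossed
        send_slack_alert(budget_name, percentage)
    else:
        send_email_alert(budget_name, percentage)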
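
The routing logic can be exercised locally before deploying. This is a minimal sketch: `decode_budget_event`, `pick_channels`, and `make_test_event` are hypothetical helpers, and the fake event only mimics the base64-encoded `data` field used by the handler above. It assumes a notification may arrive without `alertThresholdExceeded` when no threshold has been crossed, and it maps thresholds to the channels listed in the alert threshold table.

```python
import base64
import json


def decode_budget_event(event):
    """Decode a Pub/Sub-style event dict into the notification payload."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def pick_channels(notification):
    """Map the crossed threshold to notification channels."""
    threshold = notification.get("alertThresholdExceeded")
    if threshold is None:
        return []  # Routine cost update, no threshold crossed.
    if threshold >= 1.0:
        return ["email", "slack", "pagerduty"]
    if threshold >= 0.8:
        return ["email", "slack"]
    return ["email"]


def make_test_event(cost, budget, threshold=None):
    """Build a fake event for local testing (not a real Pub/Sub message)."""
    payload = {
        "budgetDisplayName": "demo",
        "costAmount": cost,
        "budgetAmount": budget,
    }
    if threshold is not None:
        payload["alertThresholdExceeded"] = threshold
    return {"data": base64.b64encode(json.dumps(payload).encode("utf-8"))}


if __name__ == "__main__":
    event = make_test_event(850, 1000, threshold=0.8)
    print(pick_channels(decode_budget_event(event)))
```

Keeping the channel decision in a pure function separates it from the integrations, so the thresholds can be unit-tested without sending real alerts.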

Cost Optimization

Committed Use Discounts (CUDs)

┌─────────────────────────────────────────────────────────────────┐
│              COMMITTED USE DISCOUNTS                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  RESOURCE-BASED CUDs (Compute Engine)                           │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ Commitment: vCPUs + Memory for 1 or 3 years             │    │
│  │                                                         │    │
│  │ Discount:                                               │    │
│  │ • 1-year: Up to 37% off                                 │    │
│  │ • 3-year: Up to 57% off                                 │    │
│  │                                                         │    │
│  │ Flexibility:                                            │    │
│  │ • Applies within one machine series and region          │    │
│  │ • No mixing across series (N2, N2D, C2, etc.)           │    │
│  │ • Shared across projects if CUD sharing is enabled      │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  SPEND-BASED CUDs (BigQuery, Cloud SQL, etc.)                   │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ Commitment: Dollar amount per hour                      │    │
│  │                                                         │    │
│  │ BigQuery Editions:                                      │    │
│  │ • Standard: No commitment required                      │    │
│  │ • Enterprise: Commit to slots, get discount             │    │
│  │                                                         │    │
│  │ Cloud SQL:                                              │    │
│  │ • 1-year: 25% off                                       │    │
│  │ • 3-year: 52% off                                       │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Spot VMs

┌─────────────────────────────────────────────────────────────────┐
│                    SPOT VMs                                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  DISCOUNT: Up to 91% off on-demand price                        │
│                                                                 │
│  TRADE-OFF: Can be preempted with 30-second warning             │
│                                                                 │
│  IDEAL FOR:                                                     │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ ✅ Batch processing (Dataproc, Dataflow)                │    │
│  │ ✅ CI/CD pipelines                                      │    │
│  │ ✅ Fault-tolerant workloads                             │    │
│  │ ✅ Dev/test environments                                │    │
│  │ ✅ Stateless containers (GKE node pools)                │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  NOT IDEAL FOR:                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ ❌ Production databases                                 │    │
│  │ ❌ Stateful applications                                │    │
│  │ ❌ Long-running jobs without checkpointing              │    │
│  │ ❌ User-facing services (without fallback)              │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  BEST PRACTICE: Mix Spot + On-demand in GKE                     │
│  • Spot node pool for batch workloads                           │
│  • On-demand node pool for critical services                    │
│  • Use pod anti-affinity to spread across pools                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
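
Long-running jobs become Spot-friendly once they checkpoint. The sketch below shows one possible shape under stated assumptions: `run_batch`, `load_checkpoint`, and `save_checkpoint` are hypothetical names, the checkpoint is a small JSON file, and `should_stop` stands in for whatever preemption signal you wire up (for example a flag set by a shutdown script or a SIGTERM handler).

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint)."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["next_index"]


def save_checkpoint(path, next_index):
    """Write the checkpoint atomically so an interruption cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp_path, path)


def run_batch(items, process, path, should_stop=lambda: False):
    """Process items in order, resuming from the checkpoint if one exists."""
    start = load_checkpoint(path)
    for index in range(start, len(items)):
        if should_stop():
            return False  # Interrupted; completed items are checkpointed.
        process(items[index])
        save_checkpoint(path, index + 1)
    return True
```

In practice the checkpoint would need to live on storage that outlives the VM, such as a persistent disk or a Cloud Storage bucket, and `process` should be idempotent because an item can be processed again if the VM disappears between processing and saving.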

Cost Optimization Recommendations

┌─────────────────────────────────────────────────────────────────┐
│           COST OPTIMIZATION CHECKLIST                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  COMPUTE                                                        │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ □ Right-size VMs (use Recommender)                      │    │
│  │ □ Use Spot VMs for fault-tolerant workloads             │    │
│  │ □ Purchase CUDs for predictable workloads               │    │
│  │ □ Use E2 machine types for cost-sensitive workloads     │    │
│  │ □ Schedule non-prod VMs to stop after hours             │    │
│  │ □ Delete unused disks and snapshots                     │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  STORAGE                                                        │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ □ Use appropriate storage class (Standard/Nearline/etc) │    │
│  │ □ Set lifecycle policies for old objects                │    │
│  │ □ Enable Autoclass for unpredictable access patterns    │    │
│  │ □ Delete orphaned persistent disks                      │    │
│  │ □ Use multi-region storage only when needed             │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  NETWORKING                                                     │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ □ Use Premium tier only when needed                     │    │
│  │ □ Minimize cross-region traffic                         │    │
│  │ □ Use Cloud CDN for static content                      │    │
│  │ □ Delete unused external IPs                            │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  DATA                                                           │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ □ Use BigQuery slots for heavy workloads                │    │
│  │ □ Partition and cluster BigQuery tables                 │    │
│  │ □ Set table expiration for temp data                    │    │
│  │ □ Use Dataproc Serverless instead of clusters           │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Quota Management

Quota Types

┌─────────────────────────────────────────────────────────────────┐
│                    QUOTA TYPES                                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  RATE QUOTAS (Requests per time period)                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ • API requests per minute                               │    │
│  │ • Queries per second                                    │    │
│  │ • Operations per day                                    │    │
│  │                                                         │    │
│  │ Example: BigQuery 100 concurrent queries                │    │
│  │ Example: Compute Engine API 20 requests/second          │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  ALLOCATION QUOTAS (Resource limits)                            │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ • Number of VMs per region                              │    │
│  │ • Total vCPUs per region                                │    │
│  │ • Number of VPCs per project                            │    │
│  │                                                         │    │
│  │ Example: 24 vCPUs per region (default)                  │    │
│  │ Example: 15 VPCs per project                            │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  QUOTA INCREASE REQUEST                                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ 1. Go to IAM & Admin > Quotas                           │    │
│  │ 2. Filter by service and quota name                     │    │
│  │ 3. Select quota and click "Edit Quotas"                 │    │
│  │ 4. Enter new limit and justification                    │    │
│  │ 5. Wait for approval (usually 24-48 hours)              │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Quota Monitoring

bash
# List project-level quotas
gcloud compute project-info describe --project=PROJECT_ID

# List regional quotas, one row per quota
gcloud compute regions describe REGION \
  --project=PROJECT_ID \
  --flatten="quotas[]" \
  --format="table(quotas.metric,quotas.limit,quotas.usage)"

# Alert on regional vCPU quota usage.
# The threshold is an absolute vCPU count, not a ratio: 19 is roughly 80%
# of an assumed limit of 24. Adjust it to your own limit.
gcloud alpha monitoring policies create \
  --notification-channels=CHANNEL_ID \
  --display-name="CPU Quota Alert" \
  --combiner=OR \
  --condition-display-name="vCPU quota usage > 80%" \
  --condition-filter='resource.type="consumer_quota" AND metric.type="serviceruntime.googleapis.com/quota/allocation/usage" AND metric.label.quota_metric="compute.googleapis.com/cpus"' \
  --if="> 19" \
  --duration=60s
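
Quota headroom can also be checked in a script. The sketch below assumes the input is the `quotas` list from `gcloud compute regions describe REGION --format=json`, where each entry has `metric`, `limit`, and `usage`; `quota_utilization` and `SAMPLE_QUOTAS` are illustrative names, and the 80% default is only a starting point.

```python
def quota_utilization(quotas, threshold=0.8):
    """Return [(metric, ratio)] for quotas at or above threshold, highest first."""
    flagged = []
    for quota in quotas:
        limit = quota.get("limit", 0)
        if limit <= 0:
            continue  # Skip quotas with no usable limit.
        ratio = quota.get("usage", 0) / limit
        if ratio >= threshold:
            flagged.append((quota["metric"], ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample values, shaped like the JSON `quotas` list.
SAMPLE_QUOTAS = [
    {"metric": "CPUS", "limit": 24.0, "usage": 21.0},
    {"metric": "DISKS_TOTAL_GB", "limit": 4096.0, "usage": 500.0},
    {"metric": "IN_USE_ADDRESSES", "limit": 8.0, "usage": 8.0},
]

if __name__ == "__main__":
    for metric, ratio in quota_utilization(SAMPLE_QUOTAS):
        print(f"{metric}: {ratio:.0%} of limit")
```

Running a check like this on a schedule, per region, gives an early signal to file a quota increase before a scale-up event needs it.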

Labels & Cost Allocation

Label Strategy

yaml
# Required labels for cost allocation
labels:
  # Business context
  cost-center: "cc-12345"        # Finance tracking
  business-unit: "engineering"   # Department
  product: "platform-api"        # Product/service
  
  # Technical context
  environment: "production"      # prod/staging/dev
  team: "platform"               # Owning team
  managed-by: "terraform"        # How it's managed
  
  # Lifecycle
  created-by: "john-doe"         # Creator (label values cannot contain "@" or ".")
  expiry-date: "2024-12-31"      # For temp resources
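
A label policy is easier to enforce when it can be checked automatically. The sketch below is one possible pre-deploy check: `REQUIRED_KEYS` and `validate_labels` are hypothetical names, and the patterns encode a simplified, ASCII-only reading of the label rules (keys start with a lowercase letter; keys and values use lowercase letters, digits, underscores, and hyphens; at most 63 characters).

```python
import re

# Assumption: these five labels are mandatory; adjust to your own policy.
REQUIRED_KEYS = {"cost-center", "business-unit", "product", "environment", "team"}

# Simplified ASCII-only patterns; treat them as an assumption, not the full spec.
KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def validate_labels(labels):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for key in sorted(REQUIRED_KEYS - set(labels)):
        errors.append(f"missing required label: {key}")
    for key, value in labels.items():
        if not KEY_RE.fullmatch(key):
            errors.append(f"invalid key: {key!r}")
        if not VALUE_RE.fullmatch(value):
            errors.append(f"invalid value for {key}: {value!r}")
    return errors


if __name__ == "__main__":
    print(validate_labels({"team": "platform", "created-by": "john@example.com"}))
```

A check like this fits naturally into CI for infrastructure code, where it can reject a change before any unlabeled resource is created.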

Cost Allocation Report

sql
-- BigQuery: Cost by label
SELECT
  labels.value as team,
  SUM(cost) as total_cost
FROM `billing_project.billing_dataset.gcp_billing_export_v1_XXXXXX`,
UNNEST(labels) as labels
WHERE labels.key = 'team'
  AND invoice.month = '202401'
GROUP BY team
ORDER BY total_cost DESC;

-- Cost by environment
SELECT
  COALESCE(
    (SELECT value FROM UNNEST(labels) WHERE key = 'environment'),
    'unlabeled'
  ) as environment,
  SUM(cost) as total_cost
FROM `billing_project.billing_dataset.gcp_billing_export_v1_XXXXXX`
WHERE invoice.month = '202401'
GROUP BY environment
ORDER BY total_cost DESC;
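
The "unlabeled" bucket is worth tracking as a number of its own. The sketch below mirrors the second query on a few in-memory rows; the row shape (a `cost` plus a list of key/value label records) is a simplified assumption, and `cost_by_label`, `unlabeled_share`, and `SAMPLE_ROWS` are illustrative names.

```python
from collections import defaultdict

# Hypothetical rows with labels as repeated key/value records.
SAMPLE_ROWS = [
    {"cost": 60.0, "labels": [{"key": "team", "value": "platform"}]},
    {"cost": 30.0, "labels": [{"key": "team", "value": "data"}]},
    {"cost": 10.0, "labels": []},
]


def cost_by_label(rows, key):
    """Sum cost per label value, bucketing rows without the label."""
    totals = defaultdict(float)
    for row in rows:
        values = [label["value"] for label in row["labels"] if label["key"] == key]
        totals[values[0] if values else "unlabeled"] += row["cost"]
    return dict(totals)


def unlabeled_share(rows, key):
    """Fraction of total cost that carries no value for the given label."""
    totals = cost_by_label(rows, key)
    grand_total = sum(totals.values())
    return totals.get("unlabeled", 0.0) / grand_total if grand_total else 0.0


if __name__ == "__main__":
    print(cost_by_label(SAMPLE_ROWS, "team"))
    print(f"unlabeled share: {unlabeled_share(SAMPLE_ROWS, 'team'):.0%}")
```

A falling unlabeled share over time is a simple way to show that label enforcement is actually working.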

Best Practices Checklist

  • [ ] Enable billing export to BigQuery
  • [ ] Set up budgets with multiple thresholds
  • [ ] Implement label strategy for cost allocation
  • [ ] Review Recommender suggestions weekly
  • [ ] Purchase CUDs for predictable workloads
  • [ ] Use Spot VMs for fault-tolerant workloads
  • [ ] Monitor quota usage and request increases proactively
  • [ ] Schedule non-prod resources to stop after hours

⚖️ Trade-offs

Trade-off 1: CUDs vs On-Demand

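| Aspect | CUDs (1-year) | CUDs (3-year) | On-Demand |
|---|---|---|---|
| Discount | Up to 37% | Up to 57% | 0% |
| Flexibility | Low | Very low | Full |
| Risk | Medium | High | None |
| Best for | Stable prod | Long-term | Variable |

Recommendation: cover 60-80% of baseline usage with CUDs, and use on-demand plus Spot for peaks.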
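
The reasoning behind a partial commitment can be made concrete with a little arithmetic. Because a commitment is paid for whether or not it is used, it only beats on-demand pricing when utilization exceeds one minus the discount. The sketch below is a simplified model under stated assumptions: a single price per unit-hour, no other discounts, and hypothetical function names and usage numbers.

```python
def break_even_utilization(discount):
    """Utilization above which a commitment is cheaper than on-demand."""
    return 1.0 - discount


def blended_hourly_cost(usage_by_hour, committed, on_demand_price, discount):
    """Average hourly cost with `committed` units under a CUD, rest on-demand."""
    committed_price = on_demand_price * (1.0 - discount)
    total = 0.0
    for used in usage_by_hour:
        overflow = max(0.0, used - committed)
        # The commitment is charged in full every hour, used or not.
        total += committed * committed_price + overflow * on_demand_price
    return total / len(usage_by_hour)


if __name__ == "__main__":
    usage = [100.0, 100.0, 140.0, 60.0]  # Hypothetical vCPUs in use per hour.
    for committed in (0.0, 80.0, 140.0):
        cost = blended_hourly_cost(usage, committed, on_demand_price=1.0, discount=0.37)
        print(f"commit {committed:.0f} vCPUs -> average cost {cost:.2f} per hour")
```

With this sample usage, committing to part of the baseline comes out cheaper than either no commitment or committing to the peak, which is the intuition behind covering only the stable portion of demand.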


Trade-off 2: Spot vs On-Demand

| Aspect | Spot VMs | On-Demand |
|---|---|---|
| Discount | Up to 91% | 0% |
| Availability | Not guaranteed | Guaranteed |
| Preemption | Yes (30s warning) | No |
| Best for | Batch, CI/CD | Stateful, critical |

Trade-off 3: Label Enforcement

| Approach | Coverage | Implementation |
|---|---|---|
| Org policy | Enforced | Blocks non-compliant |
| Terraform validation | IaC only | Pre-deploy check |
| Post-hoc audit | Reactive | Reports unlabeled |

🚨 Failure Modes

Failure Mode 1: Budget Overrun

🔥 Real-world incident

A developer created 100 n2-highmem-96 VMs and forgot about them. At the end of the month, the bill was $200K. No budget alerts had been configured.

| How to detect | How to prevent |
|---|---|
| Monthly bill shock | Budgets with alerts |
| Cost spike in billing export | Programmatic actions |
| Finance team escalation | Multiple thresholds |

Failure Mode 2: Quota Exhaustion

| How to detect | How to prevent |
|---|---|
| Scale-up failures | Proactive quota requests |
| API rate limit errors | Quota monitoring |
| Deployment blocked | Multiple regions |

Failure Mode 3: Unused CUDs

| How to detect | How to prevent |
|---|---|
| Low utilization in reports | Right-size before commit |
| Waste in billing analysis | Start with 1-year |
| Over-commitment | Monitor usage first |

🔐 Security Baseline

Billing Security Requirements

| Requirement | Implementation | Verification |
|---|---|---|
| Billing account access | Limited to finance | IAM audit |
| Budget alerts | All projects | Budget review |
| Billing export | Enabled | Export verified |
| Label enforcement | Org policy | Compliance audit |

Access Control

| Role | Scope | Users |
|---|---|---|
| Billing Account Admin | Billing account | Finance only |
| Billing Account User | Project | Team leads |
| Billing Account Viewer | Billing account | Stakeholders |

📊 Ops Readiness

Metrics to Monitor

| Metric | Source | Alert threshold |
|---|---|---|
| Daily spend | Billing export | > 120% of average |
| Budget utilization | Cloud Monitoring | > 80% |
| Quota usage | Quota Monitoring | > 70% |
| CUD utilization | Billing reports | < 80% |
| Unlabeled resources | Asset Inventory | Any |
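
The daily spend threshold can be expressed as a small check. This is a minimal sketch assuming the trailing daily costs have already been pulled from the billing export; `is_spend_spike` is a hypothetical helper and the 1.2 factor corresponds to the 120% threshold in the table.

```python
def is_spend_spike(daily_costs, today_cost, factor=1.2):
    """Return True if today's cost exceeds `factor` times the trailing average."""
    if not daily_costs:
        return False  # No history to compare against.
    average = sum(daily_costs) / len(daily_costs)
    return today_cost > factor * average


if __name__ == "__main__":
    history = [100.0, 110.0, 90.0]  # Hypothetical trailing daily costs.
    print(is_spend_spike(history, 130.0))
    print(is_spend_spike(history, 115.0))
```

A longer trailing window smooths out weekday and weekend differences, at the cost of reacting more slowly to a genuine change in baseline spend.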

Runbook Entry Points

| Scenario | Runbook |
|---|---|
| Budget alert triggered | runbook/budget-alert-response.md |
| Cost spike detected | runbook/cost-spike-investigation.md |
| Quota exhausted | runbook/quota-increase-request.md |
| CUD under-utilized | runbook/cud-optimization.md |
| Unlabeled resources | runbook/label-compliance.md |

Design Review Checklist

Cost Visibility

  • [ ] Billing export enabled
  • [ ] Dashboards configured
  • [ ] Labels strategy defined
  • [ ] Cost allocation reports

Cost Control

  • [ ] Budgets set
  • [ ] Alerts configured
  • [ ] Programmatic actions
  • [ ] Approval workflows

Optimization

  • [ ] CUD analysis done
  • [ ] Spot VM opportunities
  • [ ] Recommender reviewed
  • [ ] Scheduling for non-prod

Quota

  • [ ] Quota monitoring
  • [ ] Proactive requests
  • [ ] Multi-region strategy
  • [ ] Alerting configured

📎 Links