# 🔴 Environments Strategy
**Level:** Advanced · **Solves:** Managing multiple environments (dev, staging, prod) with Terraform in a scalable, maintainable way
## 🎯 Outcomes
After applying the material on this page, you will be able to:
- Choose the right environment strategy for a project
- Use workspaces correctly (and know when not to use them)
- Design a directory-per-environment structure
- Configure Terragrunt for DRY environments
- Implement a safe promotion workflow
- Prevent accidental prod changes
## ✅ When to use
| Strategy | Use case | Why |
|---|---|---|
| Workspaces | Simple setups where every environment shares the same configuration | Quick, no duplication |
| Directory per env | Most teams | Clear separation, recommended |
| Terragrunt | Large orgs that need DRY configuration | Minimal duplication |
| Branch per env | GitOps workflows | Clear promotion |
## ❌ When NOT to use
| Pattern | Problem | Alternative |
|---|---|---|
| Workspaces for prod | Risk of applying to the wrong workspace | Directory per env |
| Single state for all envs | Large blast radius | Separate states |
| Different module versions | Env parity broken | Same versions |
| No prod protection | Accidental changes | `prevent_destroy`, approvals |
::: warning ⚠️ Warning from Raizo
"An engineer forgot to switch workspaces and applied the dev config to prod. Four hours of downtime. Since then we only use directory per env for production."
:::
## Why does environment strategy matter?
::: tip 💡 Professor Tom
"Works on my machine" in IaC becomes "works in dev". Environment parity is critical: prod should match staging, and staging should match dev, in structure, differing only in scale and data. The wrong strategy leads to deployment nightmares.
:::
### Environment Challenges
```
┌─────────────────────────────────────────────────────────────────┐
│                     ENVIRONMENT CHALLENGES                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐        │
│  │     DEV     │     │   STAGING   │     │    PROD     │        │
│  │             │     │             │     │             │        │
│  │ t3.micro    │     │ t3.small    │     │ t3.large    │        │
│  │ 1 replica   │     │ 2 replicas  │     │ 5 replicas  │        │
│  │ no HA       │     │ basic HA    │     │ full HA     │        │
│  └─────────────┘     └─────────────┘     └─────────────┘        │
│                                                                 │
│  QUESTIONS:                                                     │
│  • How to share code but vary configuration?                    │
│  • How to prevent dev changes from affecting prod?              │
│  • How to promote changes through environments?                 │
│  • How to manage state isolation?                               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
## Strategy Comparison
### Overview
| Strategy | State Isolation | Code Duplication | Complexity | Best For |
|---|---|---|---|---|
| Workspaces | Same backend, different state | None | Low | Simple projects |
| Directory per env | Separate backends | Some | Medium | Most teams |
| Terragrunt | Separate backends | Minimal | High | Large orgs |
| Branch per env | Separate backends | High | Medium | GitOps workflows |
## Strategy 1: Terraform Workspaces
### How Workspaces Work
```
┌─────────────────────────────────────────────────────────────────┐
│                      TERRAFORM WORKSPACES                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Same Configuration                                             │
│  ┌───────────────────────────────────────────────────────┐      │
│  │  main.tf, variables.tf, outputs.tf                    │      │
│  └───────────────────────────────────────────────────────┘      │
│                              │                                  │
│          ┌───────────────────┼───────────────────┐              │
│          ▼                   ▼                   ▼              │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐      │
│  │ Workspace     │   │ Workspace     │   │ Workspace     │      │
│  │ "dev"         │   │ "staging"     │   │ "prod"        │      │
│  │               │   │               │   │               │      │
│  │ state:        │   │ state:        │   │ state:        │      │
│  │ env:/dev/     │   │ env:/staging/ │   │ env:/prod/    │      │
│  └───────────────┘   └───────────────┘   └───────────────┘      │
│                                                                 │
│  S3 Backend Structure:                                          │
│  bucket/                                                        │
│  ├── env:/dev/terraform.tfstate                                 │
│  ├── env:/staging/terraform.tfstate                             │
│  └── env:/prod/terraform.tfstate                                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
### Workspace Commands
```bash
# List workspaces
terraform workspace list

# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod

# Switch workspace
terraform workspace select prod

# Show current workspace
terraform workspace show

# Delete a workspace (it cannot be the currently selected one)
terraform workspace delete dev
```
### Using Workspace in Config
```hcl
# variables.tf
variable "environment_config" {
  type = map(object({
    instance_type  = string
    instance_count = number
    enable_ha      = bool
  }))
  default = {
    dev = {
      instance_type  = "t3.micro"
      instance_count = 1
      enable_ha      = false
    }
    staging = {
      instance_type  = "t3.small"
      instance_count = 2
      enable_ha      = true
    }
    prod = {
      instance_type  = "t3.large"
      instance_count = 5
      enable_ha      = true
    }
  }
}

# Assumption: not in the original; aws_instance needs an AMI (or a launch template).
variable "ami_id" {
  type = string
}

# main.tf
locals {
  env = terraform.workspace
  # Fails with an invalid-index error if the selected workspace
  # (for example "default") has no entry in environment_config.
  config = var.environment_config[local.env]
}

resource "aws_instance" "web" {
  count         = local.config.instance_count
  ami           = var.ami_id
  instance_type = local.config.instance_type

  tags = {
    Environment = local.env
  }
}
```
### Workspace Pros & Cons
✅ Pros
- Zero code duplication
- Simple to understand
- Built into Terraform
- Easy to switch environments
❌ Cons
- Same backend = shared access control
- Easy to accidentally apply to wrong workspace
- Limited flexibility for env-specific resources
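- HashiCorp's documentation advises against CLI workspaces when environments need separate credentials or access controls, which is usually the case for production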
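
One way to reduce the wrong-workspace risk listed above is to make the configuration fail fast when the selected workspace is not a known environment. The sketch below assumes Terraform 1.4 or later (for `terraform_data`); the `allowed_workspaces` local and the `workspace_guard` resource name are illustrative and not part of the original configuration.

```hcl
locals {
  # Hypothetical allow-list; keep it in sync with the keys of var.environment_config.
  allowed_workspaces = ["dev", "staging", "prod"]
}

# A resource that exists only to carry the precondition.
resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = contains(local.allowed_workspaces, terraform.workspace)
      error_message = "Workspace \"${terraform.workspace}\" is not one of dev, staging, prod. Run `terraform workspace select <env>` first."
    }
  }
}
```

This only catches the `default` workspace and typos. It does not stop someone from applying to a valid but unintended workspace, such as prod instead of dev, which is why the page recommends separate directories for production.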
## Strategy 2: Directory per Environment (Recommended)
### Directory Structure
```
infrastructure/
├── modules/                      # Shared modules
│   ├── vpc/
│   ├── eks/
│   └── rds/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── backend.tf            # Dev-specific backend
│   │   └── terraform.tfvars      # Dev values
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── backend.tf            # Staging-specific backend
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       ├── backend.tf            # Prod-specific backend
│       └── terraform.tfvars
└── global/                       # Shared resources
    ├── iam/
    └── dns/
```
### Environment Configuration
```hcl
# environments/dev/backend.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "dev/infrastructure/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}

# environments/dev/main.tf
module "vpc" {
  source = "../../modules/vpc"

  name               = "dev"
  cidr_block         = "10.0.0.0/16"
  availability_zones = ["us-east-1a"] # Single AZ for dev
  enable_nat_gateway = true
  single_nat_gateway = true # Cost saving
  tags               = local.common_tags # Assumed to be defined in a locals block elsewhere in this directory
}

# environments/dev/terraform.tfvars
environment    = "dev"
instance_type  = "t3.micro"
instance_count = 1
```
```hcl
# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state-prod" # Separate bucket
    key            = "prod/infrastructure/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock-prod"
    encrypt        = true
    role_arn       = "arn:aws:iam::PROD_ACCOUNT:role/TerraformRole"
  }
}

# environments/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"

  name               = "prod"
  cidr_block         = "10.0.0.0/16" # Same range as dev; use distinct ranges if the VPCs ever need peering
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  enable_nat_gateway = true
  single_nat_gateway = false # HA for prod
  tags               = local.common_tags # Assumed to be defined in a locals block elsewhere in this directory
}

# environments/prod/terraform.tfvars
environment    = "prod"
instance_type  = "t3.large"
instance_count = 5
```
### Directory Strategy Pros & Cons
✅ Pros
- Complete state isolation
- Separate access control per environment
- Clear visibility of env-specific config
- Can have different resources per env
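- Matches HashiCorp's guidance to use separate configurations and backends when environments need their own credentials and access controls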
❌ Cons
- Some code duplication (main.tf per env)
- Need to keep envs in sync manually
- More files to manage
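
The environment configurations above pass `local.common_tags` to the VPC module, but the page does not show where it is defined. A minimal sketch, assuming each environment directory declares it in its own file (the `locals.tf` file name and the tag keys are illustrative):

```hcl
# environments/dev/locals.tf (hypothetical file)

# Declared here only so the snippet is self-contained;
# omit it if variables.tf already declares "environment".
variable "environment" {
  description = "Environment name"
  type        = string
}

locals {
  # Tags passed to every module called from this environment directory.
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}
```

Keeping this file identical across `dev`, `staging`, and `prod` means the only per-environment difference is the value of `environment` in `terraform.tfvars`.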
## Strategy 3: Terragrunt (DRY Approach)
### Terragrunt Structure
```
infrastructure/
├── modules/
│   └── vpc/
├── terragrunt.hcl                # Root config
└── environments/
    ├── terragrunt.hcl            # Env-level config
    ├── dev/
    │   ├── terragrunt.hcl        # Dev-specific
    │   └── vpc/
    │       └── terragrunt.hcl
    ├── staging/
    │   └── ...
    └── prod/
        └── ...
```
### Terragrunt Configuration
```hcl
# environments/terragrunt.hcl (parent)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "company-terraform-state"
    # Resolves to the child's path relative to this file, e.g. dev/vpc/terraform.tfstate
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }
}

# environments/dev/vpc/terragrunt.hcl
include "root" {
  # Returns the nearest terragrunt.hcl found in a parent folder, so the file
  # that defines remote_state must be the first one found when searching upward.
  path = find_in_parent_folders()
}

terraform {
  source = "../../../modules/vpc"
}

inputs = {
  name               = "dev"
  cidr_block         = "10.0.0.0/16"
  availability_zones = ["us-east-1a"]
  enable_nat_gateway = true
  single_nat_gateway = true
}
```
### Terragrunt Commands
```bash
# Apply single module
cd environments/dev/vpc
terragrunt apply

# Apply all modules in environment
cd environments/dev
terragrunt run-all apply

# Plan all modules
terragrunt run-all plan

# Destroy in reverse dependency order
terragrunt run-all destroy
```
## Promotion Workflow
### GitOps Promotion
### Environment Promotion Script
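```bash
#!/usr/bin/env bash
# promote.sh - Promote changes between environments
set -euo pipefail

if [[ $# -ne 2 ]]; then
  echo "Usage: $0 <source-env> <target-env>" >&2
  exit 1
fi

SOURCE_ENV=$1
TARGET_ENV=$2
echo "Promoting from $SOURCE_ENV to $TARGET_ENV"

# Copy tfvars as a candidate file. This copies every value, including
# env-specific ones such as `environment`, so review the diff before using it.
cp "environments/$SOURCE_ENV/terraform.tfvars" "environments/$TARGET_ENV/terraform.tfvars.new"

# Show diff (diff exits 1 when the files differ, so don't let `set -e` abort here)
diff "environments/$TARGET_ENV/terraform.tfvars" "environments/$TARGET_ENV/terraform.tfvars.new" || true

# Plan in target environment
cd "environments/$TARGET_ENV"
terraform plan -var-file=terraform.tfvars.new
```
## Best Practices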
### 1. Environment Parity
```hcl
# Use same module versions across environments
module "vpc" {
  source  = "company/vpc/aws"
  version = "2.1.0" # Same version everywhere

  # Only vary configuration, not structure
  instance_count = var.instance_count
}
```
### 2. Separate State Buckets for Prod
```hcl
# Fragments: these arguments belong inside each environment's backend "s3" block.

# Dev/Staging - shared bucket
bucket = "company-terraform-state"

# Prod - separate bucket, separate account
bucket   = "company-terraform-state-prod"
role_arn = "arn:aws:iam::PROD_ACCOUNT:role/TerraformRole"
```
### 3. Environment-Specific Variables
```hcl
# variables.tf
variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be dev, staging, or prod."
  }
}

# Derive other settings from environment
locals {
  is_production = var.environment == "prod"
  instance_type = local.is_production ? "t3.large" : "t3.micro"
  multi_az      = local.is_production
}
```
### 4. Prevent Accidental Prod Changes
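```hcl
# prod/main.tf
resource "aws_vpc" "main" {
  # ...

  lifecycle {
    # Rejects any plan that would destroy this resource; in-place updates are still allowed.
    prevent_destroy = true
  }
}
```
```yaml
# .github/workflows/terraform.yml (excerpt)
# CI/CD - require approval for prod
jobs:
  apply-prod:
    runs-on: ubuntu-latest
    # Approval is enforced only if the "production" environment has required reviewers configured.
    environment: production
    steps:
      - uses: actions/checkout@v4
      - run: terraform init -input=false
      # A CI runner has no terminal to answer the confirmation prompt.
      - run: terraform apply -input=false -auto-approve
```
## Anti-Patterns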
### ❌ Single State for All Environments
```hcl
# BAD - All environments in one state
resource "aws_instance" "dev_web" { ... }
resource "aws_instance" "staging_web" { ... }
resource "aws_instance" "prod_web" { ... }
```
### ❌ Hardcoded Environment Values
```hcl
# BAD - Hardcoded in main.tf
resource "aws_instance" "web" {
  instance_type = "t3.micro" # What about prod?
}

# GOOD - Variable
resource "aws_instance" "web" {
  instance_type = var.instance_type
}
```
### ❌ Different Module Versions per Environment
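```hcl
# BAD - Version drift
# dev/main.tf
module "vpc" {
  source  = "company/vpc/aws"
  version = "3.0.0" # Latest
}
# Drift occurs when another environment's main.tf pins a different version of the same module.
```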
## Best Practices Checklist
- [ ] Environment parity (same modules, same versions)
- [ ] State isolation per environment
- [ ] Separate prod state bucket/account
- [ ] prevent_destroy on prod resources
- [ ] Approval required for prod changes
- [ ] Clear promotion workflow
- [ ] No hardcoded env values
- [ ] Environment validation in variables
## ⚖️ Trade-offs
### Trade-off 1: Workspaces vs Directories
| Aspect | Workspaces | Directory per Env |
|-----------|------------|-------------------|
| **Simplicity** | High | Medium |
| **State isolation** | Low | High |
| **Access control** | Shared | Separate |
| **Wrong env risk** | High | Low |
**Recommendation**: Use directory per env for production workloads.
---
### Trade-off 2: Code Duplication vs DRY
| Approach | Duplication | Complexity | Flexibility |
|----------|-------------|------------|-------------|
| **Full duplication** | High | Low | Highest |
| **Shared modules** | Low | Medium | High |
| **Terragrunt** | Minimal | High | Medium |
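
Part of Terragrunt's extra complexity is that each unit has its own state, so values must be wired between units explicitly. The sketch below shows one way to do that with a `dependency` block. The `app` unit, the use of a `modules/eks` module from Terragrunt, and the `vpc_id` output and input names are assumptions for illustration; the page does not define them.

```hcl
# environments/dev/app/terragrunt.hcl (hypothetical unit that needs the VPC)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../../modules/eks"
}

# Reads the outputs of the sibling vpc unit from its state.
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  # Assumes the vpc module exposes an output named vpc_id.
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

Declared dependencies are also what lets `terragrunt run-all` order applies and destroys across units.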
---
### Trade-off 3: Promotion Strategy
| Strategy | Safety | Speed |
|----------|--------|-------|
| **Manual promotion** | High | Slow |
| **GitOps (branches)** | High | Medium |
| **Auto-promotion** | Low | Fast |
## 🚨 Failure Modes
### Failure Mode 1: Wrong Workspace Applied to Prod
::: danger 🔥 Real incident
*An engineer forgot to switch workspaces and ran `terraform apply` believing the "dev" workspace was selected, while the prod workspace was still active. The dev config was deployed to prod, causing a 4-hour outage.*
:::
| How to detect | How to prevent |
|----------------|------------------|
| Monitoring alerts | Directory per env |
| User reports | Apply only through CI/CD |
| Plan review | Workspace prompts |
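
When each environment lives in its own AWS account, as the separate prod role in the backend examples suggests, the AWS provider can refuse to run against any other account. A minimal sketch; the `aws_account_id` variable is an assumption introduced for illustration and would be set per environment in `terraform.tfvars`.

```hcl
variable "aws_account_id" {
  description = "ID of the AWS account this environment is allowed to deploy into"
  type        = string
}

provider "aws" {
  region = "us-east-1"

  # The provider errors out if the active credentials belong to any other account.
  allowed_account_ids = [var.aws_account_id]
}
```

This complements directory per env: even if prod credentials are active while working in the dev directory, the plan fails before touching anything.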
---
### Failure Mode 2: Module Version Drift
| How to detect | How to prevent |
|----------------|------------------|
| Different behavior | Pin same versions |
| Promotion failures | Audit version spread |
| Env parity issues | Version management |
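
Pinning can be sketched as follows, using the `company/vpc/aws` module from earlier on this page. The Terraform and provider version constraints are illustrative values, not recommendations from the original text.

```hcl
terraform {
  # Illustrative constraints; choose the versions the team has actually tested.
  required_version = "~> 1.6"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

module "vpc" {
  source  = "company/vpc/aws"
  version = "2.1.0" # Exact pin, identical in dev, staging, and prod
}
```

Committing `.terraform.lock.hcl` keeps provider selections identical across runs. The lock file does not cover module versions, so the exact module pin still matters.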
---
### Failure Mode 3: Env Parity Broken
| How to detect | How to prevent |
|----------------|------------------|
| Works in dev, fails in prod | Same structure |
| Missing resources | Shared modules |
| Config differences | Only vary scale |
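
One way to keep structure identical is to have every environment call a single composition module and pass only scale-related values. In this sketch, `modules/stack` is a hypothetical module that would wrap the shared `vpc`, `eks`, and `rds` modules; it does not appear in the directory layout above.

```hcl
# environments/prod/main.tf (sketch)
variable "instance_type" {
  type = string
}

variable "instance_count" {
  type = number
}

# Hypothetical composition module: the set of resources is defined once,
# so no environment can be missing a resource that another one has.
module "stack" {
  source = "../../modules/stack"

  environment    = "prod"
  instance_type  = var.instance_type  # Scale-only difference, set in terraform.tfvars
  instance_count = var.instance_count
}
```

With this layout, a diff between `environments/dev` and `environments/prod` should show differences only in `terraform.tfvars` and `backend.tf`.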
## 🔐 Security Baseline
### Environment Security
| Requirement | Implementation | Verification |
|-------------|----------------|---------------|
| **Prod state separate** | Separate bucket/account | Config review |
| **Prod access restricted** | IAM/approvals | Access audit |
| **prevent_destroy on prod** | lifecycle block | Code review |
| **Promotion workflow** | PRs, approvals | Process review |
### Access Control per Environment
| Environment | Who Can Apply | Approval |
|-------------|---------------|----------|
| Dev | Engineers | No |
| Staging | Engineers | Optional |
| Prod | CI/CD only | Required |
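
The "CI/CD only" row for prod can be backed by the state backend itself: if only the pipeline's principal may assume the prod role, nobody else can read or write prod state. The sketch below assumes Terraform 1.6 or later, where the S3 backend documents a nested `assume_role` argument in place of the top-level `role_arn`; verify this against the backend documentation for the version in use. `PROD_ACCOUNT` is the placeholder already used on this page.

```hcl
# environments/prod/backend.tf (sketch for Terraform 1.6 or later)
terraform {
  backend "s3" {
    bucket         = "company-terraform-state-prod"
    key            = "prod/infrastructure/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-state-lock-prod"
    encrypt        = true

    # Only principals allowed to assume this role can access prod state.
    assume_role = {
      role_arn = "arn:aws:iam::PROD_ACCOUNT:role/TerraformRole"
    }
  }
}
```

The restriction itself lives in the role's trust policy, which is managed outside this file.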
## 📊 Ops Readiness
### Metrics to Monitor
| Metric | Source | Alert Threshold |
|--------|--------|-----------------|
| Env parity | Module version audit | Any drift |
| Wrong env apply | Audit logs | Any dev->prod |
| Promotion time | CI/CD | > SLA |
| Prod change frequency | CI/CD | Anomaly |
### Runbook Entry Points
| Situation | Runbook |
|------------|---------|
| Wrong env applied | `runbook/wrong-env-recovery.md` |
| Module version drift | `runbook/version-alignment.md` |
| Promotion blocked | `runbook/promotion-troubleshoot.md` |
| Env parity issue | `runbook/env-parity-debug.md` |
## ✅ Design Review Checklist
### Strategy
- [ ] Strategy chosen fits team size
- [ ] State isolation appropriate
- [ ] Code duplication acceptable
- [ ] Terragrunt if needed
### Per Environment
- [ ] Backend configured
- [ ] State isolated
- [ ] Access restricted (prod)
- [ ] prevent_destroy (prod)
### Parity
- [ ] Same modules all envs
- [ ] Same versions all envs
- [ ] Only config differs
- [ ] Promotion tested
### Operations
- [ ] Promotion workflow defined
- [ ] Approvals for prod
- [ ] Rollback plan
- [ ] Runbooks documented
## 📎 Links
- 📎 [State Management](/terraform/foundation/state) - Backend configuration
- 📎 [Module Design](/terraform/core/modules) - Reusable modules
- 📎 [Testing & CI/CD](/terraform/advanced/testing) - Environment pipelines
- 📎 [AWS Landing Zone](/aws/foundation/landing-zone) - Multi-account strategy
- 📎 [GCP Hierarchy](/gcp/foundation/hierarchy) - Project-based environments
- 📎 [Terraform Security](/terraform/security/security) - Secure environments