📚 IaC Fundamentals

Level: Foundation
Solves: Understand the foundations of Infrastructure as Code with Terraform in order to build reproducible, version-controlled infrastructure

🎯 Outcomes

After applying the material on this page, you will be able to:

  • Understand the declarative vs imperative approaches in IaC
  • Use basic HCL (HashiCorp Configuration Language)
  • Configure providers for AWS and GCP
  • Run the Terraform workflow (init, plan, apply)
  • Organize code according to best practices
  • Compare Terraform with CloudFormation/Deployment Manager

When to use

| Approach | Use Case | Reason |
|----------|----------|--------|
| Terraform | Multi-cloud, complex infra | Provider ecosystem, state management |
| CloudFormation | AWS-only, deep integration | Native, no state file to manage |
| Pulumi | Developer-first, existing code | Real programming languages |
| CDK | AWS + programming languages | TypeScript/Python for AWS |

When NOT to use

| Pattern | Problem | Alternative |
|---------|---------|-------------|
| Terraform for a simple setup of 1-2 resources | Overkill | Console or CLI |
| IaC without CI/CD | Manual apply = risk | GitOps workflow |
| Local state for a team | State conflicts | Remote state |
| No code review | Dangerous changes | PR-based workflow |

⚠️ Warning from Raizo

"A junior engineer applied directly from a local machine without running plan first and destroyed 3 production databases. CI/CD and plan review are mandatory for all changes."

Why Infrastructure as Code?

💡 Professor Tom

Manual infrastructure is the biggest source of technical debt in DevOps: it is not reproducible, not auditable, and not scalable. IaC turns infrastructure into code that can be reviewed, tested, and version-controlled like application code.

Problems with Manual Infrastructure

┌─────────────────────────────────────────────────────────────────┐
│                    MANUAL INFRASTRUCTURE CHAOS                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                    Console Clicking                      │    │
│  │                                                          │    │
│  │  Engineer A ──► Creates VPC with settings X              │    │
│  │  Engineer B ──► Creates VPC with settings Y              │    │
│  │  Engineer C ──► "What settings did we use again?"        │    │
│  │                                                          │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  PROBLEMS:                                                      │
│  ❌ Configuration drift between environments                    │
│  ❌ No audit trail of changes                                   │
│  ❌ Cannot reproduce infrastructure                             │
│  ❌ Knowledge locked in individuals' heads                      │
│  ❌ Disaster recovery = "hope and pray"                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Declarative vs Imperative

Imperative Approach (Scripts)

bash
# Imperative: HOW to do it
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-xxx --cidr-block 10.0.1.0/24
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --vpc-id vpc-xxx --internet-gateway-id igw-xxx
# ... 50 more commands
# What if step 3 fails? Rollback? Retry?

Declarative Approach (Terraform)

hcl
# Declarative: WHAT you want
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  
  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Terraform figures out the HOW
# Handles dependencies, ordering, and state

📖 Key Insight

Declarative = Describe the desired end state
Terraform = Figures out how to get there

Terraform Core Concepts

The Terraform Workflow

┌─────────────────────────────────────────────────────────────────┐
│                    TERRAFORM WORKFLOW                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  1. WRITE                                                       │
│     ┌─────────────────────────────────────────────────────┐     │
│     │  main.tf, variables.tf, outputs.tf                  │     │
│     │  Define desired infrastructure state                │     │
│     └─────────────────────────────────────────────────────┘     │
│                           │                                     │
│                           ▼                                     │
│  2. PLAN (terraform plan)                                       │
│     ┌─────────────────────────────────────────────────────┐     │
│     │  Compare desired state vs current state             │     │
│     │  Show what will be created/modified/destroyed       │     │
│     │  NO CHANGES MADE - safe to run anytime              │     │
│     └─────────────────────────────────────────────────────┘     │
│                           │                                     │
│                           ▼                                     │
│  3. APPLY (terraform apply)                                     │
│     ┌─────────────────────────────────────────────────────┐     │
│     │  Execute the plan                                   │     │
│     │  Create/modify/destroy resources                    │     │
│     │  Update state file                                  │     │
│     └─────────────────────────────────────────────────────┘     │
│                           │                                     │
│                           ▼                                     │
│  4. STATE (terraform.tfstate)                                   │
│     ┌─────────────────────────────────────────────────────┐     │
│     │  Source of truth for what Terraform manages         │     │
│     │  Maps config to real-world resources                │     │
│     │  CRITICAL: Must be protected and backed up          │     │
│     └─────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

HCL (HashiCorp Configuration Language)

hcl
# Variables - Input parameters
variable "environment" {
  description = "Environment name"
  type        = string
  default     = "dev"
}

variable "instance_count" {
  description = "Number of instances"
  type        = number
  validation {
    condition     = var.instance_count > 0 && var.instance_count <= 10
    error_message = "Instance count must be between 1 and 10."
  }
}

# Locals - Computed values
locals {
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
    Project     = "enterprise-platform"
  }
  
  name_prefix = "${var.environment}-platform"
}

# Resources - Infrastructure objects
resource "aws_instance" "web" {
  count         = var.instance_count
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
  
  tags = merge(local.common_tags, {
    Name = "${local.name_prefix}-web-${count.index}"
  })
}

# Data Sources - Read existing resources
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
  
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

# Outputs - Export values
output "instance_ids" {
  description = "IDs of created instances"
  value       = aws_instance.web[*].id
}
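
Because `instance_count` declares no default, Terraform needs a value for it before it can plan. The following is a minimal sketch of a `terraform.tfvars` file that supplies both variables; the values are illustrative assumptions, not taken from a real project.

```hcl
# terraform.tfvars (illustrative values, chosen for this example)
environment    = "staging"
instance_count = 3 # must satisfy the validation rule: between 1 and 10
```

A value outside the allowed range, such as `instance_count = 0`, should be rejected with the `error_message` defined in the variable block before any resource is changed.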

Providers

Provider Configuration

hcl
# Provider block - Configure cloud provider
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

# AWS Provider
provider "aws" {
  region = "us-east-1"
  
  default_tags {
    tags = {
      ManagedBy = "terraform"
    }
  }
}

# Multiple provider configurations (aliases)
provider "aws" {
  alias  = "eu"
  region = "eu-west-1"
}

# Use aliased provider
resource "aws_s3_bucket" "eu_bucket" {
  provider = aws.eu
  bucket   = "my-eu-bucket"
}
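
The block above declares the `google` provider under `required_providers` but only configures AWS. As a minimal sketch of the GCP side, assuming a hypothetical project ID and bucket name, a configuration could look like this:

```hcl
# Google Cloud provider (the project ID is a hypothetical placeholder)
provider "google" {
  project = "my-gcp-project-id"
  region  = "us-central1"
}

# A resource managed through the google provider
resource "google_storage_bucket" "assets" {
  name     = "my-gcp-project-id-assets" # bucket names must be globally unique
  location = "US"

  uniform_bucket_level_access = true
}
```

Credentials are deliberately left out of the provider block; they are usually supplied by the environment, for example through Application Default Credentials.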

Provider Version Constraints

| Constraint | Meaning | Example |
|------------|---------|---------|
| `= 5.0.0` | Exact version | Only 5.0.0 |
| `>= 5.0` | Minimum version | 5.0 or higher |
| `~> 5.0` | Pessimistic constraint | >= 5.0, < 6.0 |
| `>= 5.0, < 6.0` | Range | Between 5.0 and 6.0 |

⚠️ Version Pinning Best Practice

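Always pin provider versions in production. `~> 5.0` allows minor and patch updates within 5.x but blocks the next major version (6.0), where breaking changes are expected. To allow patch updates only, pin to three segments, for example `~> 5.0.0`.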
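
The pessimistic operator `~>` lets only the rightmost version segment increase, so the number of segments decides how tight the pin is. A sketch of a patch-level pin follows; the version number `5.31.0` is chosen only for illustration.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Three segments: only the patch number may grow.
      # Equivalent to >= 5.31.0, < 5.32.0
      version = "~> 5.31.0"
    }
  }
}
```

After `terraform init`, the exact version that was selected is recorded in `.terraform.lock.hcl`; committing that file keeps local machines and CI on the same provider build.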

File Organization

Standard Project Structure

terraform-project/
├── main.tf              # Primary resources
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── providers.tf         # Provider configuration
├── versions.tf          # Terraform and provider versions
├── locals.tf            # Local values
├── data.tf              # Data sources
├── terraform.tfvars     # Variable values (gitignored)
└── README.md            # Documentation

Enterprise Project Structure

infrastructure/
├── modules/                    # Reusable modules
│   ├── vpc/
│   ├── eks/
│   └── rds/
├── environments/               # Environment-specific configs
│   ├── dev/
│   │   ├── main.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   └── prod/
├── global/                     # Shared resources
│   ├── iam/
│   └── dns/
└── scripts/                    # Helper scripts
    ├── init.sh
    └── plan.sh
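
To show how an environment directory consumes a shared module in this layout, here is a minimal sketch of what `environments/dev/main.tf` might contain. The input names (`cidr_block`, `environment`) and the `vpc_id` output are assumptions about the module's interface, not something defined on this page.

```hcl
# environments/dev/main.tf
# Input names below are assumed variables of the hypothetical vpc module.
module "vpc" {
  source = "../../modules/vpc"

  cidr_block  = "10.0.0.0/16"
  environment = "dev"
}

# Assumes the module declares an output named vpc_id.
output "vpc_id" {
  description = "ID of the VPC created by the vpc module"
  value       = module.vpc.vpc_id
}
```

Each environment directory keeps its own `backend.tf`, so `dev`, `staging`, and `prod` end up with separate state.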

Resource Dependencies

Implicit Dependencies

hcl
# Terraform automatically detects dependencies
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  vpc_id     = aws_vpc.main.id  # Implicit dependency
  cidr_block = "10.0.1.0/24"
}
# Terraform knows: VPC must exist before subnet

Explicit Dependencies

hcl
# When Terraform can't detect dependency
resource "aws_instance" "web" {
  ami           = "ami-xxx"
  instance_type = "t3.micro"
  
  depends_on = [aws_iam_role_policy_attachment.web_policy]
}

resource "aws_iam_role_policy_attachment" "web_policy" {
  role       = aws_iam_role.web.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

Common Commands

bash
# Initialize working directory
terraform init

# Validate configuration syntax
terraform validate

# Format code
terraform fmt -recursive

# Preview changes
terraform plan -out=tfplan

# Apply changes
terraform apply tfplan

# Show current state
terraform show

# List resources in state
terraform state list

# Destroy infrastructure
terraform destroy

## Best Practices Checklist

- [ ] Use declarative approach (Terraform)
- [ ] Version control all IaC code
- [ ] Use remote state with locking
- [ ] Implement PR-based workflow
- [ ] Run `terraform plan` in CI
- [ ] Format code with `terraform fmt`
- [ ] Validate with `terraform validate`
- [ ] Use consistent file organization

## ⚖️ Trade-offs

### Trade-off 1: Terraform vs CloudFormation/Deployment Manager

| Aspect | Terraform | CloudFormation | Deployment Manager |
|-----------|-----------|----------------|--------------------|
| **Multi-cloud** | Yes | AWS only | GCP only |
| **State management** | External | Built-in | Built-in |
| **Ecosystem** | Broadest | AWS-focused | GCP-focused |
| **Learning curve** | Medium | Medium | Medium |
| **Drift detection** | Manual (`terraform plan`) | Built-in drift detection | N/A |

**Recommendation**: Use Terraform for multi-cloud setups or when you need consistency across environments.

---

### Trade-off 2: Monorepo vs Multi-repo

| Approach | Pros | Cons |
|----------|------|------|
| **Monorepo** | Easy cross-module refs | Large repo, complex CI |
| **Multi-repo** | Clear boundaries | Dependency management |
| **Hybrid** | Balance | More complexity |

---

### Trade-off 3: Workspaces vs Directories

| Approach | Use Case | Complexity |
|----------|----------|------------|
| **Workspaces** | Same config, diff state | Simple |
| **Directories** | Different configs | More flexible |
| **Terragrunt** | DRY configurations | Additional tool |
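
With workspaces, the same configuration is applied against different state files, and per-environment differences are usually derived from the workspace name. A minimal sketch follows, where the sizing map is a hypothetical example:

```hcl
locals {
  # terraform.workspace holds the name of the selected workspace
  # ("default" when no other workspace has been created).
  environment = terraform.workspace

  # Hypothetical per-environment sizing
  instance_type_by_env = {
    dev     = "t3.micro"
    staging = "t3.small"
    prod    = "t3.large"
  }

  # Fall back to the smallest size for unknown workspace names.
  instance_type = lookup(local.instance_type_by_env, local.environment, "t3.micro")
}
```

Once environments start to differ in more than a few values, separate directories tend to be easier to review than conditionals keyed on the workspace name.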

## 🚨 Failure Modes

### Failure Mode 1: State Corruption

::: danger 🔥 Real incident
*Two engineers applied at the same time from their local machines. The state conflicted, and Terraform no longer knew which resources existed. Manual cleanup took 2 days.*
:::

| How to detect | How to prevent |
|----------------|------------------|
| State lock errors | Remote state + locking |
| Resource mismatch | Apply only from CI/CD |
| Plan shows unexpected changes | State backup |
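
Remote state with locking is configured in a `backend` block. The sketch below uses the S3 backend with a DynamoDB lock table; the bucket, key, and table names are hypothetical placeholders, and both the bucket and the table are assumed to exist already.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"        # hypothetical bucket name
    key     = "platform/dev/terraform.tfstate" # path of the state object
    region  = "us-east-1"
    encrypt = true                             # server-side encryption of the state object

    # Hypothetical lock table; it needs a string partition key named LockID.
    dynamodb_table = "example-terraform-locks"
  }
}
```

With locking in place, a second concurrent `terraform apply` fails fast with a lock error instead of overwriting state. Recent Terraform releases also offer S3-native locking through the `use_lockfile` argument; check the backend documentation for the version in use.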

---

### Failure Mode 2: Accidental Destroy

| How to detect | How to prevent |
|----------------|------------------|
| Resources gone | `prevent_destroy` lifecycle |
| Plan shows destroy | Require plan review |
| Data loss | Backup before apply |
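
A `prevent_destroy` lifecycle rule makes Terraform reject any plan that would destroy the resource. The following is a minimal sketch on a database instance; the identifiers, engine, and sizes are illustrative assumptions.

```hcl
resource "aws_db_instance" "orders" {
  identifier        = "orders-prod" # hypothetical name
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app_admin"

  # Let RDS generate the master password and store it in Secrets Manager
  manage_master_user_password = true

  # Second layer of protection, enforced by the AWS API itself
  deletion_protection       = true
  skip_final_snapshot       = false
  final_snapshot_identifier = "orders-prod-final"

  lifecycle {
    # Any plan that would destroy this resource fails with an error
    prevent_destroy = true
  }
}
```

Note that `prevent_destroy` only works while the resource block is still in the configuration. If the whole block is deleted, the rule goes with it, which is why plan review remains necessary.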

---

### Failure Mode 3: Provider Version Mismatch

| How to detect | How to prevent |
|----------------|------------------|
| Init failures | Lock provider versions |
| Different behavior | `.terraform.lock.hcl` |
| CI vs local diff | Consistent Terraform version |

## 🔐 Security Baseline

### IaC Security Requirements

| Requirement | Implementation | Verification |
|-------------|----------------|---------------|
| **No secrets in code** | Variables, Vault | Pre-commit scan |
| **State encryption** | S3 SSE, GCS encryption | Bucket config |
| **Least privilege** | Scoped provider credentials | IAM audit |
| **Code review** | PR-based workflow | Branch protection |
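
For the "no secrets in code" requirement, one common pattern is to declare the secret as a sensitive variable with no default and supply it at runtime. A minimal sketch, where the variable name `db_password` is a hypothetical example:

```hcl
variable "db_password" {
  description = "Database master password, supplied at runtime"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

The value can then come from the `TF_VAR_db_password` environment variable in CI, or from a secrets manager such as Vault, rather than from a committed `.tfvars` file. Marking a variable as sensitive does not keep it out of the state file, which is one reason the state encryption and access restriction requirements above still apply.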

### Security Checklist

| Item | Status |
|------|--------|
| No hardcoded secrets | Required |
| Remote state encrypted | Required |
| State access restricted | Required |
| Provider credentials scoped | Required |
| Pre-commit hooks | Recommended |

## 📊 Ops Readiness

### Metrics to Monitor

| Metric | Source | Alert Threshold |
|--------|--------|-----------------|
| Apply failures | CI/CD | Any |
| Plan drift | Scheduled plans | Any changes |
| State lock wait | Terraform logs | > 5 min |
| Provider errors | Terraform logs | Any |

### Runbook Entry Points

| Situation | Runbook |
|------------|---------|
| State locked | `runbook/terraform-state-lock.md` |
| Apply failed | `runbook/terraform-apply-failure.md` |
| State corruption | `runbook/terraform-state-recovery.md` |
| Provider error | `runbook/terraform-provider-debug.md` |
| Drift detected | `runbook/terraform-drift-resolution.md` |

## ✅ Design Review Checklist

### Code Organization

- [ ] Consistent file structure
- [ ] Modules for reusable code
- [ ] Variables properly typed
- [ ] Outputs documented

### Workflow

- [ ] CI/CD integrated
- [ ] Plan review required
- [ ] Apply from CI only
- [ ] Branch protection

### State

- [ ] Remote backend
- [ ] State locking
- [ ] Encryption enabled
- [ ] Access restricted

### Security

- [ ] No secrets in code
- [ ] Pre-commit hooks
- [ ] Provider least privilege
- [ ] Audit logging

## 📎 Links

- 📎 [State Management](./state) - Details on managing Terraform state
- 📎 [Module Design](/terraform/core/modules) - Designing reusable modules
- 📎 [AWS Landing Zone](/aws/foundation/landing-zone) - Multi-account strategy with Terraform
- 📎 [GCP Resource Hierarchy](/gcp/foundation/hierarchy) - GCP provider patterns
- 📎 [Terraform Security](/terraform/security/security) - Security best practices