🔴 Testing & CI/CD
Level: Advanced
Solves: Ensuring quality and automating deployment for Terraform infrastructure
🎯 Outcomes
After applying the material on this page, you will be able to:
- Set up a Testing Pyramid for IaC
- Use Static Analysis (tflint, checkov)
- Implement Policy-as-Code with OPA/Conftest
- Write Integration Tests with Terratest
- Design a safe CI/CD Pipeline
- Configure OIDC Authentication
✅ When to use
| Test Type | Use Case | Reason |
|---|---|---|
| terraform validate | Every commit | Syntax check |
| TFLint | Every commit | Best practices |
| Checkov | Every PR | Security scan |
| OPA policies | Every PR | Custom policies |
| Terratest | Module release | Integration test |
❌ When NOT to use
| Pattern | Problem | Alternative |
|---|---|---|
| Skip validate | Broken commits | Always validate |
| No security scan | Vulnerabilities shipped | Checkov/tfsec |
| Local apply | No audit, conflicts | CI/CD only |
| No plan review | Accidental destroys | Require approval |
⚠️ Warning from Raizo
"A developer pushed directly and applied without review, destroying the production database. After that the team made PR + approval + CI/CD mandatory. Zero incidents since."
Why does IaC need testing?
💡 Professor Tom
"It works on my machine" in IaC becomes "terraform plan looks good". A plan does not catch logic errors, security misconfigurations, or integration issues. Testing IaC is not a luxury; it is a necessity for production.
Testing Pyramid for IaC
┌─────────────────────────────────────────────────────────────────┐
│                       IAC TESTING PYRAMID                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│                ▲                                                │
│               /│\                                               │
│              / │ \               End-to-End Tests               │
│             /  │  \              (Full infrastructure)          │
│            /   │   \             Slow, expensive                │
│           ─────┼─────                                           │
│          /     │     \           Integration Tests              │
│         /      │      \          (Module + cloud)               │
│        /       │       \         Medium speed                   │
│       ─────────┼─────────                                       │
│      /         │         \       Unit Tests                     │
│     /          │          \      (Static analysis)              │
│    /           │           \     Fast, cheap                    │
│   ─────────────┴─────────────                                   │
│                                                                 │
│  TESTING TYPES:                                                 │
│  • Static Analysis:     terraform validate, fmt, tflint         │
│  • Policy Tests:        OPA, Sentinel, Checkov                  │
│  • Unit Tests:          Module logic without cloud              │
│  • Integration Tests:   Terratest with real cloud               │
│  • E2E Tests:           Full environment deployment             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Static Analysis
terraform validate & fmt
bash
# Validate syntax
terraform validate
# Check formatting
terraform fmt -check -recursive
# Auto-format
terraform fmt -recursive
TFLint
bash
# Install
brew install tflint
# Initialize with plugins
tflint --init
# Run linting
tflint --recursive
hcl
# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.27.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}
rule "terraform_documented_variables" {
  enabled = true
}
rule "aws_instance_invalid_type" {
  enabled = true
}
Checkov (Security Scanning)
bash
# Install
pip install checkov
# Scan directory
checkov -d .
# Scan with specific framework
checkov -d . --framework terraform
# Output as JSON
checkov -d . -o json > checkov-results.json
yaml
# .checkov.yaml
skip-check:
  - CKV_AWS_79 # Skip specific check
  - CKV_AWS_88
framework:
  - terraform
compact: true
Policy-as-Code
Open Policy Agent (OPA)
rego
# policy/terraform.rego
# Written in Rego v0 syntax. On OPA 1.0+ either run with --v0-compatible
# or write each rule head as `deny contains msg if { ... }`.
package terraform
# Note: with AWS provider v4+, ACL and encryption are usually set through the
# separate aws_s3_bucket_acl and aws_s3_bucket_server_side_encryption_configuration
# resources, so the two S3 rules below would need to target those types.
# Deny public S3 buckets
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    resource.change.after.acl == "public-read"
    msg := sprintf("S3 bucket '%s' cannot be public", [resource.address])
}
# Require encryption
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    not resource.change.after.server_side_encryption_configuration
    msg := sprintf("S3 bucket '%s' must have encryption enabled", [resource.address])
}
# Enforce tagging
deny[msg] {
    resource := input.resource_changes[_]
    # Skip pure deletions: their `after` state is null and would always fail
    resource.change.actions[_] != "delete"
    required_tags := {"Environment", "Owner", "Project"}
    provided_tags := {tag | resource.change.after.tags[tag]}
    missing := required_tags - provided_tags
    count(missing) > 0
    msg := sprintf("Resource '%s' missing required tags: %v", [resource.address, missing])
}
bash
# Generate plan JSON
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
# Evaluate with OPA
opa eval --data policy/ --input tfplan.json "data.terraform.deny"
Conftest (OPA Wrapper)
bash
# Install
brew install conftest
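# Run policy tests
# Conftest evaluates package `main` by default, so name the `terraform` package explicitly
conftest test tfplan.json -p policy/ --namespace terraform
# Output
# FAIL - tfplan.json - terraform - S3 bucket 'aws_s3_bucket.public' cannot be public
HashiCorp Sentinel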
sentinel
# policy/require-tags.sentinel
import "tfplan/v2" as tfplan
required_tags = ["Environment", "Owner", "Project"]
# Get all managed resources that are being created
all_resources = filter tfplan.resource_changes as _, rc {
  rc.mode is "managed" and
  rc.change.actions contains "create"
}
# Check that every required tag key is present
main = rule {
  all all_resources as _, resource {
    all required_tags as tag {
      resource.change.after.tags contains tag
    }
  }
}
Integration Testing with Terratest
Basic Terratest Example
go
// test/vpc_test.go
package test
import (
	"testing"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)
func TestVpcModule(t *testing.T) {
	t.Parallel()
	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"name":       "test",
			"cidr_block": "10.0.0.0/16",
		},
	})
	// Clean up after test
	defer terraform.Destroy(t, terraformOptions)
	// Deploy infrastructure
	terraform.InitAndApply(t, terraformOptions)
	// Get outputs
	vpcId := terraform.Output(t, terraformOptions, "vpc_id")
	cidrBlock := terraform.Output(t, terraformOptions, "vpc_cidr_block")
	// Assertions
	assert.NotEmpty(t, vpcId)
	assert.Equal(t, "10.0.0.0/16", cidrBlock)
}
Testing with AWS SDK
go
// Requires one more import: "github.com/gruntwork-io/terratest/modules/aws"
func TestVpcWithAwsSdk(t *testing.T) {
	t.Parallel()
	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"name":       "test",
			"cidr_block": "10.0.0.0/16",
		},
	}
	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)
	vpcId := terraform.Output(t, terraformOptions, "vpc_id")
	// Look the VPC up through Terratest's AWS helpers to confirm it exists in the region
	awsRegion := "us-east-1"
	vpc := aws.GetVpcById(t, vpcId, awsRegion)
	assert.Equal(t, vpcId, vpc.Id)
	// Deeper checks (CIDR block, DNS attributes) can be made with the
	// EC2 DescribeVpcs and DescribeVpcAttribute APIs.
}
Test Stages (Speed Optimization)
go
// Requires one more import:
//   test_structure "github.com/gruntwork-io/terratest/modules/test-structure"
func TestVpcStages(t *testing.T) {
	t.Parallel()
	terraformOptions := &terraform.Options{
		TerraformDir: "../modules/vpc",
	}
	// Teardown is deferred so it runs last; set SKIP_teardown to keep the infrastructure
	defer test_structure.RunTestStage(t, "teardown", func() {
		terraform.Destroy(t, terraformOptions)
	})
	// Stage 1: Deploy (set SKIP_setup to reuse infrastructure from an earlier run)
	test_structure.RunTestStage(t, "setup", func() {
		terraform.InitAndApply(t, terraformOptions)
	})
	// Stage 2: Validate
	test_structure.RunTestStage(t, "validate", func() {
		vpcId := terraform.Output(t, terraformOptions, "vpc_id")
		assert.NotEmpty(t, vpcId)
	})
}
bash
# Skip teardown (leave the infrastructure running for the next run)
SKIP_teardown=true go test -v -run TestVpcStages
# Skip setup (use existing infrastructure)
SKIP_setup=true go test -v -run TestVpcStages
CI/CD Pipeline Design
GitHub Actions Pipeline
yaml
# .github/workflows/terraform.yml
name: Terraform CI/CD
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
env:
  TF_VERSION: "1.6.0"
  AWS_REGION: "us-east-1"
# id-token: write lets jobs request the OIDC token used by configure-aws-credentials
permissions:
  id-token: write
  contents: read
jobs:
  # Stage 1: Static Analysis
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
      - name: Terraform Format Check
        run: terraform fmt -check -recursive
      - name: Terraform Validate
        run: |
          terraform init -backend=false
          terraform validate
      - name: Setup TFLint
        uses: terraform-linters/setup-tflint@v4
      - name: TFLint
        run: |
          tflint --init
          tflint --recursive
  # Stage 2: Security Scan
  security:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write # required to upload SARIF results
    steps:
      - uses: actions/checkout@v4
      - name: Checkov Scan
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: .
          framework: terraform
          output_format: sarif
          output_file_path: checkov.sarif
      - name: Upload SARIF
        if: always() # upload findings even when the scan step fails
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: checkov.sarif
  # Stage 3: Plan
  plan:
    needs: [validate, security]
    runs-on: ubuntu-latest
    environment: ${{ github.event_name == 'pull_request' && 'dev' || 'prod' }}
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
      - name: Terraform Init
        run: terraform init
      - name: Terraform Plan
        id: plan
        run: terraform plan -out=tfplan -no-color
      # Separate step so steps.plan.outputs.stdout keeps the human-readable plan
      - name: Export Plan as JSON
        run: terraform show -json tfplan > tfplan.json
      # Assumes conftest is already installed on the runner
      - name: Policy Check (OPA)
        run: conftest test tfplan.json -p policy/ --namespace terraform
      - name: Upload Plan
        uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: tfplan
  # Stage 4: Apply (main branch only)
  apply:
    needs: plan
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    runs-on: ubuntu-latest
    environment: prod # Requires approval
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
      - name: Download Plan
        uses: actions/download-artifact@v4
        with:
          name: tfplan
      - name: Terraform Init
        run: terraform init
      - name: Terraform Apply
        run: terraform apply -auto-approve tfplan
PR Comment with Plan
yaml
# The job needs `pull-requests: write` permission to post the comment
- name: Comment Plan on PR
  uses: actions/github-script@v7
  if: github.event_name == 'pull_request'
  env:
    # Pass the plan through the environment instead of interpolating it into the script
    PLAN: ${{ steps.plan.outputs.stdout }}
  with:
    script: |
      const output = `#### Terraform Plan 📖
      \`\`\`
      ${process.env.PLAN}
      \`\`\`
      *Pushed by: @${{ github.actor }}*`;
      await github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: output
      })
Pipeline Best Practices
1. OIDC Authentication (No Long-Lived Credentials)
yaml
# GitHub Actions OIDC
# The job (or workflow) must grant `permissions: id-token: write`
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
    aws-region: us-east-1
    # No access keys needed!
hcl
# AWS IAM Role for GitHub Actions
data "aws_caller_identity" "current" {}
resource "aws_iam_role" "github_actions" {
  name = "GitHubActionsRole"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Federated = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/token.actions.githubusercontent.com"
        }
        Action = "sts:AssumeRoleWithWebIdentity"
        Condition = {
          StringEquals = {
            "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          }
          StringLike = {
            # Matches any branch, tag, PR, or environment in this repo;
            # narrow it (e.g. repo:company/infra:ref:refs/heads/main) for production roles
            "token.actions.githubusercontent.com:sub" = "repo:company/infra:*"
          }
        }
      }
    ]
  })
}
2. Environment Protection Rules
yaml
# Require approval for production
environment: prod
# GitHub Settings:
#   - Required reviewers: 2
#   - Wait timer: 5 minutes
#   - Restrict to specific branches
3. Drift Detection Schedule
yaml
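name: Drift Detection
on:
  schedule:
    - cron: '0 */6 * * *' # Every 6 hours
permissions:
  id-token: write # OIDC, as in the main pipeline
  contents: read
jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false # keep terraform's real exit code
      - run: terraform init
      - name: Check for Drift
        id: plan
        run: |
          set +e
          terraform plan -detailed-exitcode -no-color # 0 = clean, 1 = error, 2 = drift
          code=$?
          echo "exitcode=$code" >> "$GITHUB_OUTPUT"
          [ "$code" -ne 1 ] # fail the step only on a real error
      - name: Notify on Drift
        if: steps.plan.outputs.exitcode == '2'
        run: |
          # Send Slack notification
          curl -X POST -H 'Content-Type: application/json' "${{ secrets.SLACK_WEBHOOK }}" \
            -d '{"text": "⚠️ Infrastructure drift detected!"}'
Best Practices Checklist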
- [ ] CI validates every commit
- [ ] Security scan on every PR
- [ ] Plan requires review
- [ ] Apply from CI only
- [ ] OIDC authentication
- [ ] Environment protection rules
- [ ] Drift detection scheduled
- [ ] Test modules before release
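The drift-detection item depends on how `terraform plan -detailed-exitcode` reports its result: exit code 0 means no changes, 1 means the plan itself failed, and 2 means changes are pending. A minimal Go sketch of that classification follows; the `driftStatus` type and `classify` function are illustrative names, not part of any Terraform tooling.

```go
package main

import "fmt"

// driftStatus is a hypothetical type used only in this sketch.
type driftStatus string

const (
	noDrift       driftStatus = "no drift"
	driftDetected driftStatus = "drift detected"
	planError     driftStatus = "plan error"
)

// classify maps the exit code of `terraform plan -detailed-exitcode`
// to a status: 0 = no changes, 2 = changes pending, anything else = error.
func classify(exitCode int) driftStatus {
	switch exitCode {
	case 0:
		return noDrift
	case 2:
		return driftDetected
	default:
		return planError
	}
}

func main() {
	for _, code := range []int{0, 1, 2} {
		fmt.Printf("exit %d: %s\n", code, classify(code))
	}
}
```

Treating exit code 1 separately from 2 matters for alerting: an expired credential or a locked state should page whoever owns the pipeline, not be reported as drift.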
⚖️ Trade-offs
Trade-off 1: Test Coverage vs Speed
| Level | Coverage | Speed | Cost |
|---|---|---|---|
| Static only | Low | Fast | Free |
| + Policies | Medium | Fast | Free |
| + Terratest | High | Slow | Cloud $$ |
| + E2E | Highest | Very slow | Cloud $$$ |
Recommendation: Static + Policies for every PR, Terratest for module releases.
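As a rough illustration of this recommendation combined with the "When to use" table, the sketch below maps a CI trigger to the checks that should run. The trigger names and the `checksFor` function are hypothetical, and treating the levels as cumulative (a PR also runs the per-commit checks) is an assumption of this sketch.

```go
package main

import "fmt"

// trigger is a hypothetical label for what started the pipeline.
type trigger string

const (
	onCommit        trigger = "commit"
	onPullRequest   trigger = "pull_request"
	onModuleRelease trigger = "module_release"
)

// checksFor returns the checks to run for a trigger. Static checks always
// run, policy and security scans are added for PRs and releases, and
// Terratest runs only for module releases because it creates real
// cloud resources.
func checksFor(t trigger) []string {
	checks := []string{"terraform validate", "tflint"}
	if t == onPullRequest || t == onModuleRelease {
		checks = append(checks, "checkov", "opa policies")
	}
	if t == onModuleRelease {
		checks = append(checks, "terratest")
	}
	return checks
}

func main() {
	for _, t := range []trigger{onCommit, onPullRequest, onModuleRelease} {
		fmt.Printf("%s: %v\n", t, checksFor(t))
	}
}
```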
Trade-off 2: Policy Strictness
| Level | Security | Developer Experience |
|---|---|---|
| Loose | Low | Easy |
| Moderate | Medium | Good |
| Strict | High | Friction |
Trade-off 3: CI/CD Platform
| Platform | Integration | Cost |
|---|---|---|
| GitHub Actions | Excellent | Free tier |
| GitLab CI | Excellent | Free tier |
| Terraform Cloud | Built-in | $$$ |
| Jenkins | Manual | Self-hosted |
🚨 Failure Modes
Failure Mode 1: CI Credentials Leaked
🔥 Real-world incident
CI secrets were stored as environment variables and a log file exposed the credentials. An attacker gained AWS admin access. Cleanup cost $50K.
| How to detect | How to prevent |
|---|---|
| Unusual API activity | OIDC, no static keys |
| GuardDuty alerts | Secret scanning |
| Bill spike | Least privilege |
Failure Mode 2: Apply Without Review
| How to detect | How to prevent |
|---|---|
| Unexpected changes | Branch protection |
| Production issues | Environment approval |
| Audit failures | Require reviewers |
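Reviewers are more likely to catch an accidental destroy when the pipeline calls it out explicitly. The following is a minimal sketch, assuming the plan has been exported with `terraform show -json`; it reads only the `resource_changes[].address` and `change.actions` fields, and the `destructiveChanges` function name is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan mirrors only the fields of the `terraform show -json` output used here.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Change  struct {
			Actions []string `json:"actions"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// destructiveChanges returns the addresses of resources whose planned
// actions include "delete" (plain destroys and destroy-and-recreate).
func destructiveChanges(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, fmt.Errorf("parse plan JSON: %w", err)
	}
	var out []string
	for _, rc := range p.ResourceChanges {
		for _, action := range rc.Change.Actions {
			if action == "delete" {
				out = append(out, rc.Address)
				break
			}
		}
	}
	return out, nil
}

func main() {
	// Hand-written sample input, not output from a real plan.
	sample := []byte(`{"resource_changes":[
	  {"address":"aws_db_instance.main","change":{"actions":["delete"]}},
	  {"address":"aws_vpc.main","change":{"actions":["no-op"]}},
	  {"address":"aws_instance.web","change":{"actions":["delete","create"]}}
	]}`)
	addrs, err := destructiveChanges(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("resources that would be destroyed:", addrs)
}
```

A pipeline could fail the plan job, or require an extra approval, whenever the returned list is non-empty.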
Failure Mode 3: Flaky Tests
| How to detect | How to prevent |
|---|---|
| Random failures | Retry logic |
| Rate limits | Test throttling |
| Resource conflicts | Unique naming |
🔐 Security Baseline
CI/CD Security Requirements
| Requirement | Implementation | Verification |
|---|---|---|
| No static credentials | OIDC | Audit secrets |
| Least privilege | Scoped roles | IAM review |
| Secret scanning | GitHub secret scanning | Alert review |
| Audit logging | CloudTrail | Log analysis |
| Environment protection | Required reviewers | Config check |
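For the least-privilege row, it helps to see how wide a `StringLike` condition on the OIDC `sub` claim really is. GitHub issues subjects such as `repo:<org>/<repo>:ref:refs/heads/<branch>`, `repo:<org>/<repo>:environment:<name>`, and `repo:<org>/<repo>:pull_request`. The sketch below is a local approximation of wildcard matching (`*` for any run of characters, `?` for one character), not AWS's implementation, and the `like` function name is hypothetical.

```go
package main

import "fmt"

// like reports whether s matches pattern, where '*' matches any run of
// characters (including '/' and ':') and '?' matches exactly one character.
func like(pattern, s string) bool {
	pi, si := 0, 0
	star, mark := -1, 0 // position of the last '*' and the input index it covers up to
	for si < len(s) {
		switch {
		case pi < len(pattern) && pattern[pi] == '*':
			star, mark = pi, si
			pi++
		case pi < len(pattern) && (pattern[pi] == '?' || pattern[pi] == s[si]):
			pi++
			si++
		case star >= 0:
			// Backtrack: let the last '*' absorb one more character.
			mark++
			pi, si = star+1, mark
		default:
			return false
		}
	}
	for pi < len(pattern) && pattern[pi] == '*' {
		pi++
	}
	return pi == len(pattern)
}

func main() {
	subjects := []string{
		"repo:company/infra:ref:refs/heads/main",
		"repo:company/infra:ref:refs/heads/feature-x",
		"repo:company/infra:pull_request",
		"repo:company/infra-fork:ref:refs/heads/main",
	}
	patterns := []string{
		"repo:company/infra:*",
		"repo:company/infra:ref:refs/heads/main",
	}
	for _, p := range patterns {
		for _, s := range subjects {
			fmt.Printf("%-40s %-45s %v\n", p, s, like(p, s))
		}
	}
}
```

The broad pattern accepts feature branches and pull requests as well as `main`, which is why a production role usually pins the subject to a specific branch or environment.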
Pipeline Security Checklist
| Item | Status |
|---|---|
| OIDC authentication | ☑ Required |
| No hardcoded secrets | ☑ Required |
| Branch protection | ☑ Required |
| Required reviewers | ☑ Required |
| Environment approvals | ☑ Production |
| Audit logging | ☑ Required |
📊 Ops Readiness
Metrics to monitor
| Metric | Source | Alert Threshold |
|---|---|---|
| Pipeline failures | CI/CD | > 5% failure rate |
| Apply duration | CI/CD | > 30 min |
| Security findings | Checkov | Any high/critical |
| Policy violations | OPA | Any |
| Drift detected | Scheduled plan | Any |
Runbook Entry Points
| Situation | Runbook |
|---|---|
| Pipeline failed | runbook/ci-pipeline-debug.md |
| Security finding | runbook/security-finding-triage.md |
| OIDC auth error | runbook/oidc-troubleshooting.md |
| Flaky test | runbook/terratest-debug.md |
| Apply stuck | runbook/terraform-apply-stuck.md |
✅ Design Review Checklist
Pipeline
- [ ] All stages defined
- [ ] OIDC configured
- [ ] Branch protection
- [ ] Environment approvals
Testing
- [ ] Static analysis
- [ ] Security scanning
- [ ] Policy checks
- [ ] Integration tests (modules)
Security
- [ ] No static credentials
- [ ] Secrets masked
- [ ] Least privilege
- [ ] Audit logging
Operations
- [ ] Drift detection
- [ ] PR comments with plan
- [ ] Failure notifications
- [ ] Runbooks documented
📎 Links
- 📎 State Management - State locking for CI/CD
- 📎 Environments Strategy - Multi-env pipelines
- 📎 Security for IaC - Secure pipelines
- 📎 AWS IAM - OIDC role setup
- 📎 GCP IAM - Workload Identity
- 📎 Module Design - Testing modules