Ninja & CCache Performance

HPN Tunnel: techniques for optimizing build time. Ninja instead of Make, CCache for local builds, sccache for distributed teams.

The Problem: Build Time at Scale

┌─────────────────────────────────────────────────────────────────────────┐
│                    BUILD TIME REALITY CHECK                              │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Project size          Clean build       Incremental (1 file)          │
│   ──────────────────    ──────────────    ──────────────────────        │
│   10 files              5 seconds         2 seconds                     │
│   100 files             30 seconds        5 seconds                     │
│   1,000 files           5 minutes         15 seconds                    │
│   10,000 files          45 minutes        30 seconds                    │
│   100,000 files         6+ hours          1 minute                      │
│                                                                         │
│   ⚠️ Every debug cycle = a wait of 30 seconds to several minutes        │
│   ⚠️ Clean rebuild on CI = 45 minutes (cloud bill $$$)                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

💸 COST OF SLOW BUILDS

  • Developer productivity: 45-min rebuild = broken flow state
  • CI costs: GitHub Actions bills by the minute
  • Team frustration: "I'll just push and hope it works"

Ninja Build System

What is Ninja?

Ninja is a build executor, not a build-system generator like CMake. It was designed by a Google engineer for the Chrome project with a single goal: SPEED.

Make vs Ninja

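Aspect                GNU Make                                 Ninja
────────────────────  ───────────────────────────────────────  ────────────────────────────────────────
Startup time          Parses and evaluates Makefiles each run  Loads a simple, generated build.ninja
Dependency checking   Implicit and pattern rules               Explicit, flat dependency graph
Parallel scheduling   Serial unless -j is given                Parallel by default (sized to CPU count)
Progress output       Echoes every command                     Compact status line
Design goal           Flexibility                              Pure speed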

Benchmark (Chrome codebase ~30K files)

┌────────────────────────────────────────────────┐
│            MAKE vs NINJA BENCHMARK             │
├────────────────────────────────────────────────┤
│                                                │
│   Statistic           Make        Ninja        │
│   ────────────────    ────────    ─────────    │
│   Null build          0.8s        0.1s         │
│   Single file change  8s          2s           │
│   Full rebuild        45min       38min        │
│                                                │
│   Null build = no changes, just dependency     │
│   scanning. Ninja 8x faster.                   │
│                                                │
└────────────────────────────────────────────────┘
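
How fast a null build is depends on how cheaply the tool can decide that nothing is stale. The core of that decision is a timestamp comparison, sketched below in C++ under simplifying assumptions: `needs_rebuild` and its parameters are hypothetical, timestamps are plain integers, and real build tools may track more than modification times.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified staleness check: an output must be rebuilt when
// it is missing or when any of its inputs is newer than it.
// Timestamps are plain integers here (e.g. seconds since the epoch).
inline bool needs_rebuild(bool output_exists, long output_mtime,
                          const std::vector<long>& input_mtimes) {
    if (!output_exists) {
        return true;
    }
    return std::any_of(input_mtimes.begin(), input_mtimes.end(),
                       [output_mtime](long m) { return m > output_mtime; });
}
```

A null build runs a check of this kind for every edge in the dependency graph and finds nothing to do, which is why per-run startup cost dominates that row of the benchmark.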

Using Ninja with CMake

powershell
# Installation
# Windows (Chocolatey)
choco install ninja

# macOS
brew install ninja

# Ubuntu/Debian
sudo apt install ninja-build

# Verify
ninja --version

powershell
# Configure CMake with the Ninja generator
cmake -B build -G Ninja

# Build (Ninja chooses a parallel job count from the number of CPU cores)
cmake --build build

# Or invoke Ninja directly
ninja -C build

CMake Default Generator

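cmake
# The generator cannot be selected from inside CMakeLists.txt, because it
# must be known before CMake reads the project. Set it at configure time.

# Option 1: CMAKE_GENERATOR environment variable (CMake 3.15+)
# cmd.exe:     set CMAKE_GENERATOR=Ninja
# PowerShell:  $env:CMAKE_GENERATOR = "Ninja"
# bash/zsh:    export CMAKE_GENERATOR=Ninja

# Option 2: CMakePresets.json in the project root (recommended)

json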
{
  "version": 6,
  "configurePresets": [
    {
      "name": "default",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/build",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Debug"
      }
    },
    {
      "name": "release",
      "inherits": "default",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "default",
      "configurePreset": "default"
    }
  ]
}

powershell
# Usage with presets
cmake --preset default
cmake --build --preset default

CCache — Compiler Cache

What is CCache?

CCache caches compilation results. When you compile a file with the same:

  • Source content (hashed, including the headers it pulls in)
  • Compiler flags
  • Compiler version

→ CCache returns the object file from its cache instead of compiling again.

Installation

powershell
# macOS
brew install ccache

# Ubuntu/Debian
sudo apt install ccache

# Windows (via Chocolatey, or download the binary)
choco install ccache

Integrating with CMake

cmake
# CMakeLists.txt
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
    message(STATUS "Found ccache: ${CCACHE_PROGRAM}")
    set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_PROGRAM})
    set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PROGRAM})
endif()

Or via a CMake Variable

powershell
cmake -B build -DCMAKE_CXX_COMPILER_LAUNCHER=ccache

Cache Statistics

powershell
# Show cache statistics
ccache -s

# Output example (abridged; the exact format varies by ccache version):
# Hits:           12345
# Misses:         678
# Hit rate:       94.8%
# Cache size:     2.1 GB
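
The hit rate in the sample output is hits divided by total lookups. A small C++ sketch of that arithmetic follows; `hit_rate_percent` is a hypothetical helper, not part of ccache.

```cpp
// Hit rate as a percentage: hits / (hits + misses) * 100.
// Returns 0 when there have been no lookups at all.
inline double hit_rate_percent(long hits, long misses) {
    const long total = hits + misses;
    if (total == 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(hits) / static_cast<double>(total);
}
```

With the sample figures, `hit_rate_percent(12345, 678)` evaluates to roughly 94.8.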

Configuration

bash
# ~/.ccache/ccache.conf (ccache 4.x on Linux defaults to ~/.config/ccache/ccache.conf)
max_size = 10G
compression = true
compression_level = 6
hash_dir = false

sccache — Shared/Distributed Cache

What is sccache?

sccache is a CCache alternative from Mozilla. Its key features:

  • Cloud storage backends (S3, GCS, Azure Blob)
  • A cache shared between team members and CI
  • Rust support (in addition to C/C++)

Installation

powershell
# Via cargo (Rust)
cargo install sccache

# macOS
brew install sccache

# Windows
choco install sccache

Local Usage (like CCache)

powershell
# Start sccache server
sccache --start-server

# CMake integration
cmake -B build -DCMAKE_CXX_COMPILER_LAUNCHER=sccache

Team Shared Cache (S3)

bash
# Environment variables
export SCCACHE_BUCKET=my-company-sccache
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export SCCACHE_REGION=us-east-1

# Start with S3 backend
sccache --start-server
┌─────────────────────────────────────────────────────────────────────────┐
│                    SHARED CACHE WORKFLOW                                 │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Developer A (first build):                                            │
│   ───────────────────────────                                           │
│   compile main.cpp → [CACHE MISS] → upload to S3                        │
│                                                                         │
│   Developer B (same code):                                              │
│   ───────────────────────────                                           │
│   compile main.cpp → [CACHE HIT] → download from S3                     │
│   → 50x faster (network is faster than compile)                         │
│                                                                         │
│   CI Server:                                                            │
│   ──────────                                                            │
│   All 10,000 files cached → rebuild in 2 minutes instead of 45          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

distcc — Distributed Compilation

Concept

distcc distributes compilation jobs across multiple machines on the network.

┌──────────┐     ┌────────────────┐
│  Local   │────>│  Build Server 1 │──> Returns .o
│ Machine  │────>│  Build Server 2 │──> Returns .o
│          │────>│  Build Server 3 │──> Returns .o
└──────────┘     └────────────────┘


  Link locally

Setup

bash
# Server machines
distccd --daemon --allow 192.168.1.0/24

# Client
export DISTCC_HOSTS="server1 server2 server3"
cmake -B build -DCMAKE_CXX_COMPILER_LAUNCHER=distcc
cmake --build build -j 30  # 30 jobs across 3 machines

⚠️ DISTCC CAVEATS

  • Preprocessing still runs locally in plain mode, so the client can become the bottleneck; build servers need a matching compiler
  • Network latency can negate the benefit for small files
  • Setup is complex in heterogeneous environments
  • Recommendation: try sccache with a shared cache first

Precompiled Headers (PCH)

Problem: Header Parse Time

cpp
// main.cpp
#include <vector>        // 10K+ lines
#include <string>        // 5K+ lines
#include <algorithm>     // 15K+ lines
#include <iostream>      // 8K+ lines
#include <fmt/core.h>    // 3K+ lines

int main() {
    // 10 lines of actual code
}

Every file that includes these headers parses 40K+ lines, every time it is compiled.

Solution: Precompiled Headers

cmake
# CMakeLists.txt (CMake 3.16+)
add_executable(app src/main.cpp src/other.cpp)

# Build a PCH from commonly used headers
target_precompile_headers(app PRIVATE
    <vector>
    <string>
    <algorithm>
    <iostream>
    <fmt/core.h>
)

PCH Best Practices

cmake
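# Shared PCH across targets
# REUSE_FROM needs a target that actually compiles the PCH, so an INTERFACE
# library will not work. Use a real target with at least one source file.
# src/pch.cpp is a hypothetical, possibly empty, source file.
add_library(pch_target STATIC src/pch.cpp)
target_precompile_headers(pch_target PRIVATE
    <vector>
    <string>
    <memory>
    <unordered_map>
)

# Targets that reuse the PCH must use the same compile options and definitions.
add_executable(app1 src/app1.cpp)
target_precompile_headers(app1 REUSE_FROM pch_target)

add_executable(app2 src/app2.cpp)
target_precompile_headers(app2 REUSE_FROM pch_target)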

Unity Builds (Jumbo Builds)

Concept

Merge multiple .cpp files into one large file to reduce per-file compilation overhead.

cmake
# CMakeLists.txt (CMake 3.16+)
set(CMAKE_UNITY_BUILD ON)
set(CMAKE_UNITY_BUILD_BATCH_SIZE 10)  # 10 files per unity file

add_library(mylib 
    src/a.cpp src/b.cpp src/c.cpp 
    src/d.cpp src/e.cpp src/f.cpp
)

CMake automatically generates:

cpp
// Generated unity source in the build tree (file name and paths simplified)
#include "a.cpp"
#include "b.cpp"
#include "c.cpp"
#include "d.cpp"
#include "e.cpp"
#include "f.cpp"
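
The batching rule can be made concrete with a short C++ sketch. It is not CMake's implementation: `make_unity_files` is a hypothetical helper that splits a source list into batches and returns the text of one unity file per batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: split `sources` into batches of `batch_size` files and
// return the text of one unity file per batch. A batch size of 0 is treated
// as invalid input here and yields no files.
inline std::vector<std::string> make_unity_files(
        const std::vector<std::string>& sources, std::size_t batch_size) {
    std::vector<std::string> unity_files;
    if (batch_size == 0) {
        return unity_files;
    }
    for (std::size_t start = 0; start < sources.size(); start += batch_size) {
        const std::size_t end = std::min(start + batch_size, sources.size());
        std::string text;
        for (std::size_t i = start; i < end; ++i) {
            text += "#include \"" + sources[i] + "\"\n";
        }
        unity_files.push_back(text);
    }
    return unity_files;
}
```

With the six sources above and a batch size of 10, the sketch yields a single unity file. A batch size of 2 would yield three, so a smaller batch size trades fewer savings for more parallelism and cheaper incremental rebuilds.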

Trade-offs

Benefit                             Drawback
──────────────────────────────────  ────────────────────────
Fewer compiler invocations          Higher memory usage
Better optimization opportunities   Harder to debug
Faster initial build                Slower incremental build

💡 WHEN TO USE UNITY BUILDS

  • CI clean builds (no incremental rebuilds)
  • Small projects (< 50 files)
  • Header-heavy codebases

HPN Tunnel: Complete Optimization Stack

cmake
# cmake/BuildOptimizations.cmake

# ============================================
# 1. CCACHE / SCCACHE
# ============================================
find_program(CCACHE_PROGRAM ccache)
find_program(SCCACHE_PROGRAM sccache)

if(SCCACHE_PROGRAM)
    set(CMAKE_C_COMPILER_LAUNCHER ${SCCACHE_PROGRAM})
    set(CMAKE_CXX_COMPILER_LAUNCHER ${SCCACHE_PROGRAM})
    message(STATUS "Using sccache for caching")
elseif(CCACHE_PROGRAM)
    set(CMAKE_C_COMPILER_LAUNCHER ${CCACHE_PROGRAM})
    set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PROGRAM})
    message(STATUS "Using ccache for caching")
endif()

# ============================================
# 2. PRECOMPILED HEADERS
# ============================================
option(ENABLE_PCH "Enable precompiled headers" ON)

function(add_common_pch target)
    if(ENABLE_PCH)
        target_precompile_headers(${target} PRIVATE
            <vector>
            <string>
            <memory>
            <algorithm>
            <unordered_map>
            <functional>
        )
    endif()
endfunction()

# ============================================
# 3. UNITY BUILD (for CI)
# ============================================
option(ENABLE_UNITY_BUILD "Enable unity/jumbo builds" OFF)

if(ENABLE_UNITY_BUILD)
    set(CMAKE_UNITY_BUILD ON)
    set(CMAKE_UNITY_BUILD_BATCH_SIZE 16)
endif()

# ============================================
# 4. LINK TIME OPTIMIZATION
# ============================================
option(ENABLE_LTO "Enable Link Time Optimization" OFF)

if(ENABLE_LTO)
    include(CheckIPOSupported)
    check_ipo_supported(RESULT lto_supported OUTPUT lto_error)
    
    if(lto_supported)
        set(CMAKE_INTERPROCEDURAL_OPTIMIZATION ON)
        message(STATUS "LTO enabled")
    else()
        message(WARNING "LTO not supported: ${lto_error}")
    endif()
endif()

Usage

bash
# Development (CCached, no LTO)
cmake -B build -G Ninja

# CI Release (Unity + LTO)
cmake -B build -G Ninja \
    -DCMAKE_BUILD_TYPE=Release \
    -DENABLE_UNITY_BUILD=ON \
    -DENABLE_LTO=ON

Performance Comparison

┌─────────────────────────────────────────────────────────────────────────┐
│                BUILD TIME COMPARISON (10K files project)                 │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Configuration                    Clean Build    Incremental (1 file) │
│   ──────────────────────────────   ──────────     ────────────────────  │
│   Make (baseline)                  45 min         30 sec                │
│   Ninja                            38 min         8 sec                 │
│   Ninja + CCache (cold)            38 min         8 sec                 │
│   Ninja + CCache (warm)            3 min          2 sec                 │
│   Ninja + CCache + PCH             2 min          1 sec                 │
│   Unity Build (CI only)            20 min         N/A (always clean)   │
│                                                                         │
│   Savings: 45 min → 2 min = 22x faster                                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Next Steps

With fast builds in place, the next step is to automate everything with CI/CD:

🔄 CI/CD GitHub Actions: matrix builds for production-grade projects