
Compilation Pipeline Deep Tech

HIR → MIR → LLVM IR → Machine Code

1. High-Level IR (HIR)

Desugaring Process

HIR is the first representation after the AST. This is where Rust desugars syntactic sugar into a smaller set of basic constructs.

rust
// Source code
for i in 0..10 {
    println!("{}", i);
}

// Desugared HIR (pseudo-code)
{
    let mut iter = IntoIterator::into_iter(0..10);
    loop {
        match Iterator::next(&mut iter) {
            Some(i) => { println!("{}", i); }
            None => break,
        }
    }
}
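
To make the desugaring concrete, here is a minimal sketch that runs the same summation twice: once with `for`, and once with the loop written out by hand in the shape shown above. The function names `sum_with_for` and `sum_desugared` are made up for this illustration, and the hand-written version only approximates what the compiler generates internally.

```rust
/// Sums `0..n` with an ordinary `for` loop.
fn sum_with_for(n: u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}

/// Same computation with the loop written out by hand,
/// roughly in the shape that HIR lowering produces.
fn sum_desugared(n: u32) -> u32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(0..n);
    loop {
        match Iterator::next(&mut iter) {
            Some(i) => total += i,
            None => break,
        }
    }
    total
}

fn main() {
    // Both versions should always agree.
    assert_eq!(sum_with_for(10), sum_desugared(10));
    println!("both loops sum to {}", sum_with_for(10));
}
```

Both functions return the same value for every `n`, which is the point: the `for` loop is only a more convenient spelling of the `loop` + `match` form.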

Viewing HIR

bash
# Print the HIR (nightly only)
rustc +nightly -Z unpretty=hir main.rs

# Or with cargo
cargo +nightly rustc -- -Z unpretty=hir

What gets desugared?

| Syntax | Desugared to |
| --- | --- |
| for x in iter | loop { match iter.next() {...} } |
| x? (on a Result) | match x { Ok(v) => v, Err(e) => return Err(e.into()) } |
| async fn | fn() -> impl Future<Output = T> |
| value.method() | Type::method(&value) (resolved during type checking) |
| x..y | Range { start: x, end: y } |
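
As a rough illustration of the `?` row, the sketch below writes the same function twice, once with `?` and once with an explicit `match`. `AppError` and both function names are hypothetical, and the real lowering goes through the `Try` trait rather than matching on `Result` directly, so treat this as an approximation for the `Result` case only.

```rust
use std::num::ParseIntError;

/// Hypothetical error type, used only to show the `From` conversion.
#[derive(Debug, PartialEq)]
struct AppError(String);

impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError(e.to_string())
    }
}

/// Version using the `?` operator.
fn double_with_question_mark(s: &str) -> Result<i32, AppError> {
    let n = s.parse::<i32>()?;
    Ok(n * 2)
}

/// Version with `?` expanded by hand: unwrap on `Ok`,
/// convert the error and return early on `Err`.
fn double_desugared(s: &str) -> Result<i32, AppError> {
    let n = match s.parse::<i32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(n * 2)
}

fn main() {
    assert_eq!(double_with_question_mark("21"), double_desugared("21"));
    assert_eq!(double_with_question_mark("x"), double_desugared("x"));
    println!("{:?}", double_desugared("21"));
}
```

The `From::from(e)` call is what lets `?` turn a `ParseIntError` into the function's own error type without an explicit conversion at the call site.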

2. Mid-Level IR (MIR)

What is MIR?

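MIR represents each function as a Control Flow Graph (CFG), and it is the representation the borrow checker runs on. Each function is split into basic blocks: straight-line sequences of statements that end in a single terminator such as a branch, a goto, or a return.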

                    MIR STRUCTURE
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   fn example(x: i32) -> i32 {                                   │
│       if x > 0 { x + 1 } else { x - 1 }                         │
│   }                                                             │
│                                                                 │
│   Compiles to MIR:                                              │
│                                                                 │
│   ┌─────────────────┐                                           │
│   │     bb0         │ Entry block                               │
│   │ _2 = Gt(x, 0)   │ Compare                                   │
│   │ switchInt(_2)   │ Branch                                    │
│   └────────┬────────┘                                           │
│       ┌────┴────┐                                               │
│       ▼         ▼                                               │
│   ┌───────┐ ┌───────┐                                           │
│   │  bb1  │ │  bb2  │                                           │
│   │x + 1  │ │x - 1  │                                           │
│   └───┬───┘ └───┬───┘                                           │
│       │         │                                               │
│       └────┬────┘                                               │
│            ▼                                                    │
│       ┌───────┐                                                 │
│       │  bb3  │ Return block                                    │
│       │return │                                                 │
│       └───────┘                                                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Viewing MIR

bash
# View MIR (nightly only)
rustc +nightly -Z unpretty=mir main.rs

# Example MIR output (simplified):
fn example(_1: i32) -> i32 {
    let mut _0: i32;
    let mut _2: bool;

    bb0: {
        _2 = Gt(_1, const 0_i32);
        switchInt(move _2) -> [0: bb2, otherwise: bb1];
    }

    bb1: {
        _0 = Add(_1, const 1_i32);
        goto -> bb3;
    }

    bb2: {
        _0 = Sub(_1, const 1_i32);
        goto -> bb3;
    }

    bb3: {
        return;
    }
}
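
One way to get a feel for basic blocks and terminators is to model the listing above as data and interpret it. The sketch below is a toy, not rustc's real MIR data structures: `Statement`, `Terminator`, `BasicBlock`, `example_cfg`, and `run` are all names invented for this illustration.

```rust
/// How a basic block ends: every block has exactly one terminator.
enum Terminator {
    /// Jump to `zero` if the local is 0 (false), else to `otherwise`.
    SwitchBool { cond: usize, zero: usize, otherwise: usize },
    Goto(usize),
    Return,
}

/// A straight-line statement (only what this example needs).
enum Statement {
    /// locals[dst] = (locals[src] > 0) as i32
    GtZero { dst: usize, src: usize },
    /// locals[dst] = locals[src] + delta
    AddConst { dst: usize, src: usize, delta: i32 },
}

struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

/// Builds the four blocks of `example` from the listing.
fn example_cfg() -> Vec<BasicBlock> {
    vec![
        // bb0: _2 = Gt(_1, 0); switchInt(_2) -> [0: bb2, otherwise: bb1]
        BasicBlock {
            statements: vec![Statement::GtZero { dst: 2, src: 1 }],
            terminator: Terminator::SwitchBool { cond: 2, zero: 2, otherwise: 1 },
        },
        // bb1: _0 = Add(_1, 1); goto -> bb3
        BasicBlock {
            statements: vec![Statement::AddConst { dst: 0, src: 1, delta: 1 }],
            terminator: Terminator::Goto(3),
        },
        // bb2: _0 = Sub(_1, 1); goto -> bb3
        BasicBlock {
            statements: vec![Statement::AddConst { dst: 0, src: 1, delta: -1 }],
            terminator: Terminator::Goto(3),
        },
        // bb3: return
        BasicBlock { statements: vec![], terminator: Terminator::Return },
    ]
}

/// Interprets the CFG: run a block's statements, then follow its terminator.
fn run(blocks: &[BasicBlock], arg: i32) -> i32 {
    // locals[0] = return place (_0), locals[1] = argument (_1), locals[2] = bool (_2)
    let mut locals = [0, arg, 0];
    let mut current = 0;
    loop {
        let block = &blocks[current];
        for stmt in &block.statements {
            match *stmt {
                Statement::GtZero { dst, src } => locals[dst] = (locals[src] > 0) as i32,
                Statement::AddConst { dst, src, delta } => locals[dst] = locals[src] + delta,
            }
        }
        match block.terminator {
            Terminator::SwitchBool { cond, zero, otherwise } => {
                current = if locals[cond] == 0 { zero } else { otherwise };
            }
            Terminator::Goto(target) => current = target,
            Terminator::Return => return locals[0],
        }
    }
}

fn main() {
    let cfg = example_cfg();
    println!("example(5) = {}, example(-3) = {}", run(&cfg, 5), run(&cfg, -3));
}
```

Control flow only ever changes at a terminator, which is what makes this form convenient for analyses such as borrow checking.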

Borrow Checker trên MIR

rust
fn example() {
    let mut x = 5;
    let r = &mut x;  // Borrow starts
    *r = 10;
    println!("{}", x);  // Borrow must end before here
}

// MIR with borrow regions (simplified):
// _2 = &mut _1     // 'a starts
// (*_2) = 10       // last use of _2, so 'a ends here
// _3 = &_1         // OK: 'a is no longer live
//                  // (this would be an error if _2 were used again later)
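
The sketch below shows the same rule in running code, assuming a reasonably recent compiler with non-lexical lifetimes: a borrow stays live only up to its last use. The function name is made up for this example.

```rust
/// Compiles because the mutable borrow `r` is not used after the write.
fn last_use_ends_borrow() -> i32 {
    let mut x = 5;
    let r = &mut x; // mutable borrow starts
    *r = 10;        // last use of `r`: the borrow is not live after this line
    let seen = x;   // OK: this read does not overlap a live borrow
    // *r += 1;     // uncommenting this keeps the borrow live across the read
    //              // above, and the compiler then rejects `let seen = x;`
    seen
}

fn main() {
    println!("x = {}", last_use_ends_borrow());
}
```

Uncommenting the marked line is a quick way to see the borrow checker's view: the error points at the read of `x` and at the later use of `r` that keeps the borrow alive.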

MIR Optimizations

| Optimization | Description |
| --- | --- |
| Constant Propagation | Replace variables with known constant values |
| Dead Code Elimination | Remove unreachable blocks |
| Inlining | Inline small functions |
| Copy Propagation | Eliminate unnecessary copies |
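
To illustrate the idea behind constant propagation, here is a toy pass over straight-line code. It is a simplified sketch and not how rustc implements its MIR passes; `Operand`, `Stmt`, `resolve`, `propagate_constants`, and `sample_block` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Const(i64),
    Local(usize),
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    /// _dst = value
    Assign { dst: usize, value: Operand },
    /// _dst = lhs + rhs
    Add { dst: usize, lhs: Operand, rhs: Operand },
}

/// Replaces a local with its constant value when one is known.
fn resolve(op: &Operand, known: &HashMap<usize, i64>) -> Operand {
    match op {
        Operand::Local(l) => match known.get(l) {
            Some(v) => Operand::Const(*v),
            None => op.clone(),
        },
        Operand::Const(_) => op.clone(),
    }
}

/// One forward pass over a single block of straight-line code.
fn propagate_constants(block: &[Stmt]) -> Vec<Stmt> {
    let mut known: HashMap<usize, i64> = HashMap::new();
    let mut out = Vec::with_capacity(block.len());
    for stmt in block {
        let new_stmt = match stmt {
            Stmt::Assign { dst, value } => Stmt::Assign {
                dst: *dst,
                value: resolve(value, &known),
            },
            Stmt::Add { dst, lhs, rhs } => match (resolve(lhs, &known), resolve(rhs, &known)) {
                // Both sides known: fold the addition into a constant.
                (Operand::Const(a), Operand::Const(b)) => Stmt::Assign {
                    dst: *dst,
                    value: Operand::Const(a.wrapping_add(b)),
                },
                (l, r) => Stmt::Add { dst: *dst, lhs: l, rhs: r },
            },
        };
        // Record new constants; forget locals overwritten with unknown values.
        match &new_stmt {
            Stmt::Assign { dst, value: Operand::Const(c) } => {
                known.insert(*dst, *c);
            }
            Stmt::Assign { dst, .. } | Stmt::Add { dst, .. } => {
                known.remove(dst);
            }
        }
        out.push(new_stmt);
    }
    out
}

/// _1 = 2; _2 = _1 + 3; _3 = _2 + _0   (where _0 is unknown)
fn sample_block() -> Vec<Stmt> {
    vec![
        Stmt::Assign { dst: 1, value: Operand::Const(2) },
        Stmt::Add { dst: 2, lhs: Operand::Local(1), rhs: Operand::Const(3) },
        Stmt::Add { dst: 3, lhs: Operand::Local(2), rhs: Operand::Local(0) },
    ]
}

fn main() {
    for stmt in propagate_constants(&sample_block()) {
        println!("{:?}", stmt);
    }
}
```

In the sample block, `_2` becomes the constant 5, and the last statement keeps its addition because `_0` is not known at compile time.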

3. LLVM IR

What is LLVM?

LLVM is a compiler backend framework. Rust's codegen backend translates MIR into LLVM IR, and LLVM then optimizes that IR and generates machine code for the target architecture.

MIR ──▶ LLVM IR ──▶ LLVM Optimizer ──▶ Machine Code

                        ├── x86_64
                        ├── ARM
                        ├── WASM
                        └── RISC-V

Viewing LLVM IR

bash
# Generate LLVM IR
rustc --emit=llvm-ir main.rs

# Output: main.ll

LLVM IR Example

rust
// Rust source
pub fn add(a: i32, b: i32) -> i32 {
    a + b
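}

llvm
; LLVM IR output (simplified, optimized build; the real symbol name is mangled)
define i32 @add(i32 %a, i32 %b) {
entry:
  ; plain `add`, without `nsw`: when overflow checks are off, Rust integer overflow wraps
  %result = add i32 %a, %b
  ret i32 %result
}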
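
How the `a + b` in `add` behaves on overflow depends on build settings: with overflow checks enabled (the default for debug builds) it panics, otherwise it wraps. The sketch below spells out both behaviors with explicit standard-library methods, so the result does not depend on the build profile. The function names `add_wrapping` and `add_checked` are made up for this example.

```rust
/// Always wraps on overflow, regardless of build profile.
pub fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Reports overflow as `None` instead of wrapping or panicking.
pub fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

fn main() {
    println!("{:?}", add_checked(2, 3));
    println!("{:?}", add_checked(i32::MAX, 1));
    println!("{}", add_wrapping(i32::MAX, 1));
}
```

Compiling each variant with `--emit=llvm-ir` is a simple way to compare the IR that each overflow policy produces.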

LLVM Optimization Levels

| Flag | Optimization | Use case |
| --- | --- | --- |
| -C opt-level=0 | None | Debug builds |
| -C opt-level=1 | Basic | Fast dev builds |
| -C opt-level=2 | Standard | Balanced |
| -C opt-level=3 | Aggressive | Max performance |
| -C opt-level=s | Size | Smaller binary |
| -C opt-level=z | Min size | Smallest binary |

4. The Complete Pipeline

rust
// Source: src/main.rs
fn main() {
    let x = vec![1, 2, 3];
    for i in &x {
        println!("{}", i);
    }
}

Stage 1: Parsing → AST

AST:
├── FnDef "main"
│   └── Block
│       ├── Let "x" = MacroCall "vec!"
│       └── ForLoop
│           ├── Pattern "i"
│           ├── Expr: &x
│           └── Block: MacroCall "println!"

Stage 2: HIR (Desugared)

rust
// for loop becomes:
{
    let mut iter = (&x).into_iter();
    loop {
        match iter.next() {
            Some(i) => { /* println */ }
            None => break,
        }
    }
}

Stage 3: MIR (CFG)

bb0: setup iter
bb1: call next()
bb2: match Some → bb3, None → bb4
bb3: println, goto bb1
bb4: return
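
The block outline above can also be written as an explicit state machine in Rust, which makes the back edge from bb3 to bb1 visible. This is only a sketch: `Block` and `run_loop_as_cfg` are invented names, and the output is collected into a `Vec<String>` instead of printed so that it can be checked.

```rust
/// Labels for the five blocks in the outline.
#[derive(Clone, Copy)]
enum Block {
    Bb0,
    Bb1,
    Bb2,
    Bb3,
    Bb4,
}

/// Walks the blocks by hand and records what `println!` would print.
fn run_loop_as_cfg(x: &[i32]) -> Vec<String> {
    let mut printed = Vec::new();
    let mut iter: Option<std::slice::Iter<'_, i32>> = None;
    let mut next_item: Option<&i32> = None;
    let mut current = Block::Bb0;
    loop {
        current = match current {
            // bb0: set up the iterator
            Block::Bb0 => {
                iter = Some(x.iter());
                Block::Bb1
            }
            // bb1: call next()
            Block::Bb1 => {
                next_item = iter.as_mut().and_then(|it| it.next());
                Block::Bb2
            }
            // bb2: branch on Some / None
            Block::Bb2 => match next_item {
                Some(_) => Block::Bb3,
                None => Block::Bb4,
            },
            // bb3: loop body, then jump back to bb1
            Block::Bb3 => {
                if let Some(i) = next_item {
                    printed.push(format!("{}", i));
                }
                Block::Bb1
            }
            // bb4: return
            Block::Bb4 => return printed,
        };
    }
}

fn main() {
    for line in run_loop_as_cfg(&[1, 2, 3]) {
        println!("{}", line);
    }
}
```

Each `match` arm plays the role of one basic block, and the value it evaluates to is the terminator's jump target.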

Stage 4: LLVM IR → Assembly

asm
; Illustrative x86-64 sketch, not actual compiler output
main:
    push    rbp
    mov     rbp, rsp
    ; ... vector allocation
    ; ... loop setup
.loop:
    call    core::iter::Iterator::next
    test    rax, rax
    je      .done
    ; ... println call
    jmp     .loop
.done:
    pop     rbp
    ret

🎯 Practical Commands

bash
# View all stages
rustc +nightly -Z unpretty=hir main.rs > hir.txt
rustc +nightly -Z unpretty=mir main.rs > mir.txt
rustc --emit=llvm-ir main.rs
rustc --emit=asm main.rs

# With cargo
cargo rustc -- --emit=asm
cargo +nightly rustc -- -Z unpretty=mir  # nightly only

💡 DEBUGGING TIP

When you hit a confusing borrow checker error, try dumping the MIR to see where each borrow starts and where it is last used.