Concurrency Foundations

"Don't communicate by sharing memory; share memory by communicating."
— Go Proverb

🧠 Philosophy: CSP (Communicating Sequential Processes)

Concurrency vs Parallelism

🎓 Professor Tom's Deep Dive: The Difference

Rob Pike's famous definition:

Concurrency is about DEALING with lots of things at once. Parallelism is about DOING lots of things at once.

| Aspect      | Concurrency                     | Parallelism                  |
|-------------|---------------------------------|------------------------------|
| Definition  | Structure, composition          | Execution                    |
| Analogy     | Juggling (1 person, many balls) | Assembly line (many workers) |
| Requirement | Design pattern                  | Multiple CPUs                |
| Example     | Handle 1000 HTTP requests       | Process video on 8 cores     |

Key insight: Concurrency enables parallelism but doesn't require it. You can have concurrency on a single-core machine!

go
// Concurrency: Structure
// Many goroutines waiting, but only execute when needed
for i := 0; i < 1000; i++ {
    go handleRequest(requests[i])  // Concurrent structure
}

// Parallelism: Execution
// Actually running on multiple cores simultaneously
runtime.GOMAXPROCS(8)  // Allow up to 8 goroutines to execute in parallel (8 Ps)

The CSP Model

┌─────────────────────────────────────────────────────────────────────┐
│                    CSP: Communicating Sequential Processes         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Traditional Model (Shared Memory):                               │
│   ┌─────────┐     ┌─────────┐     ┌─────────┐                      │
│   │ Thread1 │────►│  Shared │◄────│ Thread2 │   ← Locks needed!    │
│   └─────────┘     │  Memory │     └─────────┘                      │
│                   └─────────┘                                      │
│                       💥 Race Conditions                           │
│                                                                     │
│   ─────────────────────────────────────────────────────────────     │
│                                                                     │
│   CSP Model (Go's Way):                                            │
│   ┌─────────┐         ┌─────────┐         ┌─────────┐              │
│   │Goroutine│───msg──►│ Channel │───msg──►│Goroutine│              │
│   │    1    │         │  (pipe) │         │    2    │              │
│   └─────────┘         └─────────┘         └─────────┘              │
│                   ✅ No shared memory, no locks!                    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
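The CSP model above fits in a few lines of Go: a minimal sketch in which two goroutines hand a value over a channel instead of mutating shared state.

```go
package main

import "fmt"

func main() {
	ch := make(chan string)

	// Goroutine 1: sends a message through the channel (no shared variable).
	go func() {
		ch <- "hello via channel"
	}()

	// Goroutine "2" (main): the receive synchronizes with the send,
	// so no lock is needed.
	msg := <-ch
	fmt.Println(msg) // → hello via channel
}
```

The unbuffered channel makes the handoff a synchronization point: the send completes only when the receive happens, which is exactly the "communicate to share" idea.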

🚀 Goroutines Internals

"A Goroutine Is Not a Thread"

🔥 Raizo's Critical Distinction

┌─────────────────────────────────────────────────────────────────────┐
│                 OS Thread vs Goroutine                             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   OS Thread:                                                        │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Stack: 1-8 MB (fixed)                                      │   │
│   │  Creation: ~1ms (kernel call)                               │   │
│   │  Context Switch: ~1-10μs (kernel mode)                      │   │
│   │  Scheduling: OS kernel                                       │   │
│   │  Max count: ~10,000 (limited by RAM)                        │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   Goroutine:                                                        │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Stack: 2KB initial (grows dynamically, up to ~1GB)        │   │
│   │  Creation: ~300ns (just struct allocation)                  │   │
│   │  Context Switch: ~200ns (user space)                        │   │
│   │  Scheduling: Go runtime (user space)                        │   │
│   │  Max count: 1,000,000+ (only limited by RAM)                │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   Cost comparison:                                                  │
│   10,000 threads: 10GB-80GB RAM                                    │
│   10,000 goroutines: ~20MB RAM                                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

M:N Scheduler (GMP Model)

go
// Go's M:N scheduler multiplexes:
// N goroutines onto M OS threads (where M << N)

// G = Goroutine (your code)
// M = Machine (OS thread)
// P = Processor (scheduling context; defaults to the number of CPUs)

┌─────────────────────────────────────────────────────────────────────┐
│                       GMP Scheduler Model                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Global Run Queue                                                  │
│   ┌─────────────────────────────────────────────────────────────┐  │
│   │  G₁  G₂  G₃  G₄  G₅  G₆  ...  Gₙ  (waiting goroutines)     │  │
│   └───────────────────────────┬─────────────────────────────────┘  │
│                               │                                     │
│      Steal from global    ◄───┘                                     │
│                               │                                     │
│   ┌───────────────────────────┼───────────────────────────────┐    │
│   │            P₀             │            P₁                  │    │
│   │   ┌───────────────┐       │   ┌───────────────┐           │    │
│   │   │ Local Queue   │       │   │ Local Queue   │           │    │
│   │   │  G₁₀ G₁₁ G₁₂  │       │   │  G₂₀ G₂₁      │           │    │
│   │   └───────┬───────┘       │   └───────┬───────┘           │    │
│   │           │               │           │                    │    │
│   │           ▼               │           ▼                    │    │
│   │   ┌───────────────┐       │   ┌───────────────┐           │    │
│   │   │ M₀ (OS Thread)│       │   │ M₁ (OS Thread)│           │    │
│   │   │ Running: G₁₀  │       │   │ Running: G₂₀  │           │    │
│   │   └───────────────┘       │   └───────────────┘           │    │
│   │           │               │           │                    │    │
│   │           ▼               │           ▼                    │    │
│   │       CPU Core 0          │       CPU Core 1               │    │
│   └───────────────────────────┴───────────────────────────────┘    │
│                                                                     │
│   Work Stealing: P₁ can steal from P₀'s queue if idle              │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Goroutine Creation

go
func main() {
    // Simple goroutine
    go func() {
        fmt.Println("Hello from goroutine!")
    }()
    
    // Goroutine with parameters (pre-Go 1.22, the loop variable was shared!)
    for i := 0; i < 10; i++ {
        go func(n int) {  // Pass the current value as a parameter
            fmt.Println(n)
        }(i)
    }
    
    time.Sleep(time.Second)  // Wait for goroutines (BAD, use WaitGroup!)
}

🔒 Synchronization Primitives

WaitGroup: Waiting for Batch Jobs

go
import "sync"

func ProcessBatch(items []Item) {
    var wg sync.WaitGroup
    
    for _, item := range items {
        wg.Add(1)  // Increment counter BEFORE goroutine
        
        go func(it Item) {
            defer wg.Done()  // Decrement when done
            process(it)
        }(item)
    }
    
    wg.Wait()  // Block until counter reaches 0
    fmt.Println("All items processed!")
}

Mutex: Protecting Shared Data

go
import "sync"

type SafeCounter struct {
    mu    sync.Mutex
    count int
}

func (c *SafeCounter) Increment() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.count++
}

func (c *SafeCounter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.count
}

RWMutex: Read-Heavy Workloads

go
import "sync"

type Cache struct {
    mu   sync.RWMutex
    data map[string]string
}

// Multiple readers can access simultaneously
func (c *Cache) Get(key string) (string, bool) {
    c.mu.RLock()          // Read lock (shared)
    defer c.mu.RUnlock()
    v, ok := c.data[key]
    return v, ok
}

// Writers have exclusive access
func (c *Cache) Set(key, value string) {
    c.mu.Lock()           // Write lock (exclusive)
    defer c.mu.Unlock()
    c.data[key] = value
}

Atomic: Lightning-Fast Counters

go
import "sync/atomic"

// Faster than Mutex for simple counters
var counter int64

func IncrementAtomic() {
    atomic.AddInt64(&counter, 1)
}

func GetCounter() int64 {
    return atomic.LoadInt64(&counter)
}

// Rough, machine-dependent comparison (uncontended):
// Mutex:  ~25ns per operation
// Atomic: ~5ns per operation (~5x faster)

💥 The Enemy: Race Conditions

What Happens?

┌─────────────────────────────────────────────────────────────────────┐
│                      RACE CONDITION                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   counter := 0                                                      │
│                                                                     │
│   Goroutine 1          Time          Goroutine 2                   │
│   ─────────────────────────────────────────────────────────────     │
│   Read counter (0)      t₁                                         │
│                         t₂           Read counter (0)              │
│   Add 1 → temp=1        t₃                                         │
│                         t₄           Add 1 → temp=1                │
│   Write temp(1)         t₅                                         │
│                         t₆           Write temp(1)                 │
│   ─────────────────────────────────────────────────────────────     │
│                                                                     │
│   Expected: counter = 2                                             │
│   Actual:   counter = 1  💥                                        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

The -race Flag

📌 HPN Standard: MANDATORY in CI/CD

bash
# Always run tests with race detector
$ go test -race ./...

# Run binary with race detector
$ go run -race main.go

# Build with race detector (for staging only, 10x slower)
$ go build -race -o myapp-race

CI Pipeline example (GitHub Actions):

yaml
- name: Run tests with race detector
  run: go test -race -v ./...

🎮 Spot the Bug: Data Race

🧩 Challenge: Find the Race Condition

go
package main

import (
    "fmt"
    "sync"
)

func main() {
    counter := 0
    var wg sync.WaitGroup
    
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            counter++  // 🐛 Where's the bug?
        }()
    }
    
    wg.Wait()
    fmt.Println("Counter:", counter)
    // Expected: 1000
    // Actual: usually less than 1000, and different on every run!
}

💡 Solution

The Bug: counter++ is NOT atomic! It's actually 3 operations:

  1. Read counter
  2. Increment
  3. Write back

Multiple goroutines executing these steps concurrently = data race.

Fix 1: Using Mutex

go
var mu sync.Mutex
counter := 0

go func() {
    defer wg.Done()
    mu.Lock()
    counter++
    mu.Unlock()
}()

Fix 2: Using Atomic (Preferred for counters)

go
var counter int64

go func() {
    defer wg.Done()
    atomic.AddInt64(&counter, 1)
}()

Fix 3: Using Channel (Go idiom)

go
results := make(chan int, 1000)

for i := 0; i < 1000; i++ {
    go func() {
        results <- 1  // Send an increment instead of touching shared state
    }()
}

// Collect in one place: only this goroutine owns total
total := 0
for i := 0; i < 1000; i++ {
    total += <-results
}

Detect with race flag:

bash
$ go run -race main.go
==================
WARNING: DATA RACE
Write at 0x00c0000140a8 by goroutine 7:
  main.main.func1()
      /path/main.go:15 +0x4c
...

📊 Summary

| Concept    | Key Point                                      |
|------------|------------------------------------------------|
| CSP        | Communicate via channels, not shared memory    |
| Goroutine  | 2KB stack, 1M+ possible, user-space scheduling |
| GMP Model  | G=Goroutine, M=OS Thread, P=Processor          |
| WaitGroup  | Wait for a batch of goroutines                 |
| Mutex      | Exclusive lock for write-heavy data            |
| RWMutex    | Read-heavy workloads (multiple readers)        |
| Atomic     | Fastest for simple counters                    |
| -race flag | MANDATORY in CI/CD pipelines                   |

➡️ What's Next

Foundations mastered! Next up: Channels & Patterns - worker pools, fan-out/fan-in, and graceful shutdown.