Concurrency Foundations

"Don't communicate by sharing memory; share memory by communicating."
— Go Proverb

🧠 Philosophy: CSP (Communicating Sequential Processes)

Concurrency vs Parallelism

🎓 Professor Tom's Deep Dive: The Difference

Rob Pike's famous definition:

Concurrency is about DEALING with lots of things at once. Parallelism is about DOING lots of things at once.

| Aspect      | Concurrency                     | Parallelism                  |
|-------------|---------------------------------|------------------------------|
| Definition  | Structure, composition          | Execution                    |
| Analogy     | Juggling (1 person, many balls) | Assembly line (many workers) |
| Requirement | Design pattern                  | Multiple CPUs                |
| Example     | Handle 1000 HTTP requests       | Process video on 8 cores     |

Key insight: Concurrency enables parallelism but doesn't require it. You can have concurrency on a single-core machine!

go
// Concurrency: Structure
// Many goroutines waiting, but only execute when needed
for i := 0; i < 1000; i++ {
    go handleRequest(requests[i])  // Concurrent structure
}

// Parallelism: Execution
// Actually running on multiple cores simultaneously
runtime.GOMAXPROCS(8)  // Allow up to 8 goroutines to execute in parallel (8 Ps)

The CSP Model

┌─────────────────────────────────────────────────────────────────────┐
│                    CSP: Communicating Sequential Processes         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Traditional Model (Shared Memory):                               │
│   ┌─────────┐     ┌─────────┐     ┌─────────┐                      │
│   │ Thread1 │────►│  Shared │◄────│ Thread2 │   ← Locks needed!    │
│   └─────────┘     │  Memory │     └─────────┘                      │
│                   └─────────┘                                      │
│                       💥 Race Conditions                           │
│                                                                     │
│   ─────────────────────────────────────────────────────────────     │
│                                                                     │
│   CSP Model (Go's Way):                                            │
│   ┌─────────┐         ┌─────────┐         ┌─────────┐              │
│   │Goroutine│───msg──►│ Channel │───msg──►│Goroutine│              │
│   │    1    │         │  (pipe) │         │    2    │              │
│   └─────────┘         └─────────┘         └─────────┘              │
│                   ✅ No shared memory, no locks!                    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
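The CSP model above fits in a few lines of Go: a minimal sketch in which two goroutines hand a value over a channel instead of mutating shared state.

```go
package main

import "fmt"

func main() {
	ch := make(chan string)

	// Goroutine 1: sends a message through the channel (no shared variable).
	go func() {
		ch <- "hello via channel"
	}()

	// Goroutine "2" (main): the receive synchronizes with the send,
	// so no lock is needed.
	msg := <-ch
	fmt.Println(msg) // → hello via channel
}
```

The unbuffered channel makes the handoff a synchronization point: the send completes only when the receive happens, which is exactly the "communicate to share" idea.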

🚀 Goroutines Internals

"A Goroutine Is Not a Thread"

🔥 Raizo's Critical Distinction

┌─────────────────────────────────────────────────────────────────────┐
│                 OS Thread vs Goroutine                             │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   OS Thread:                                                        │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Stack: 1-8 MB (fixed)                                      │   │
│   │  Creation: ~1ms (kernel call)                               │   │
│   │  Context Switch: ~1-10μs (kernel mode)                      │   │
│   │  Scheduling: OS kernel                                       │   │
│   │  Max count: ~10,000 (limited by RAM)                        │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   Goroutine:                                                        │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │  Stack: 2KB initial (grows dynamically, up to ~1GB)        │   │
│   │  Creation: ~300ns (just struct allocation)                  │   │
│   │  Context Switch: ~200ns (user space)                        │   │
│   │  Scheduling: Go runtime (user space)                        │   │
│   │  Max count: 1,000,000+ (only limited by RAM)                │   │
│   └────────────────────────────────────────────────────────────┘   │
│                                                                     │
│   Cost comparison:                                                  │
│   10,000 threads: 10GB-80GB RAM                                    │
│   10,000 goroutines: ~20MB RAM                                     │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

M:N Scheduler (GMP Model)

go
// Go's M:N scheduler multiplexes:
// N goroutines onto M OS threads (where M << N)

// G = Goroutine (your code)
// M = Machine (OS thread)
// P = Processor (scheduling context; defaults to the number of CPUs)

┌─────────────────────────────────────────────────────────────────────┐
│                       GMP Scheduler Model                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Global Run Queue                                                  │
│   ┌─────────────────────────────────────────────────────────────┐  │
│   │  G₁  G₂  G₃  G₄  G₅  G₆  ...  Gₙ  (waiting goroutines)     │  │
│   └───────────────────────────┬─────────────────────────────────┘  │
│                               │                                     │
│      Steal from global    ◄───┘                                     │
│                               │                                     │
│   ┌───────────────────────────┼───────────────────────────────┐    │
│   │            P₀             │            P₁                  │    │
│   │   ┌───────────────┐       │   ┌───────────────┐           │    │
│   │   │ Local Queue   │       │   │ Local Queue   │           │    │
│   │   │  G₁₀ G₁₁ G₁₂  │       │   │  G₂₀ G₂₁      │           │    │
│   │   └───────┬───────┘       │   └───────┬───────┘           │    │
│   │           │               │           │                    │    │
│   │           ▼               │           ▼                    │    │
│   │   ┌───────────────┐       │   ┌───────────────┐           │    │
│   │   │ M₀ (OS Thread)│       │   │ M₁ (OS Thread)│           │    │
│   │   │ Running: G₁₀  │       │   │ Running: G₂₀  │           │    │
│   │   └───────────────┘       │   └───────────────┘           │    │
│   │           │               │           │                    │    │
│   │           ▼               │           ▼                    │    │
│   │       CPU Core 0          │       CPU Core 1               │    │
│   └───────────────────────────┴───────────────────────────────┘    │
│                                                                     │
│   Work Stealing: P₁ can steal from P₀'s queue if idle              │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Goroutine Creation

go
func main() {
    // Simple goroutine
    go func() {
        fmt.Println("Hello from goroutine!")
    }()
    
    // Goroutine with parameters (pre-Go 1.22, the loop variable was shared!)
    for i := 0; i < 10; i++ {
        go func(n int) {  // Pass the current value as a parameter
            fmt.Println(n)
        }(i)
    }
    
    time.Sleep(time.Second)  // Wait for goroutines (BAD, use WaitGroup!)
}

🔒 Synchronization Primitives

WaitGroup: Waiting for Batch Jobs

go
import "sync"

func ProcessBatch(items []Item) {
    var wg sync.WaitGroup
    
    for _, item := range items {
        wg.Add(1)  // Increment counter BEFORE goroutine
        
        go func(it Item) {
            defer wg.Done()  // Decrement when done
            process(it)
        }(item)
    }
    
    wg.Wait()  // Block until counter reaches 0
    fmt.Println("All items processed!")
}

Mutex: Protecting Shared Data

go
import "sync"

type SafeCounter struct {
    mu    sync.Mutex
    count int
}

func (c *SafeCounter) Increment() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.count++
}

func (c *SafeCounter) Value() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.count
}

RWMutex: Read-Heavy Workloads

go
import "sync"

type Cache struct {
    mu   sync.RWMutex
    data map[string]string
}

// Multiple readers can access simultaneously
func (c *Cache) Get(key string) (string, bool) {
    c.mu.RLock()          // Read lock (shared)
    defer c.mu.RUnlock()
    v, ok := c.data[key]
    return v, ok
}

// Writers have exclusive access
func (c *Cache) Set(key, value string) {
    c.mu.Lock()           // Write lock (exclusive)
    defer c.mu.Unlock()
    c.data[key] = value
}

Atomic: Lightning-Fast Counters

go
import "sync/atomic"

// Faster than Mutex for simple counters
var counter int64

func IncrementAtomic() {
    atomic.AddInt64(&counter, 1)
}

func GetCounter() int64 {
    return atomic.LoadInt64(&counter)
}

// Rough, machine-dependent comparison (uncontended):
// Mutex:  ~25ns per operation
// Atomic: ~5ns per operation (~5x faster)

💥 The Enemy: Race Conditions

What Happens?

┌─────────────────────────────────────────────────────────────────────┐
│                      RACE CONDITION                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   counter := 0                                                      │
│                                                                     │
│   Goroutine 1          Time          Goroutine 2                   │
│   ─────────────────────────────────────────────────────────────     │
│   Read counter (0)      t₁                                         │
│                         t₂           Read counter (0)              │
│   Add 1 → temp=1        t₃                                         │
│                         t₄           Add 1 → temp=1                │
│   Write temp(1)         t₅                                         │
│                         t₆           Write temp(1)                 │
│   ─────────────────────────────────────────────────────────────     │
│                                                                     │
│   Expected: counter = 2                                             │
│   Actual:   counter = 1  💥                                        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

The -race Flag

📌 HPN Standard: MANDATORY in CI/CD

bash
# Always run tests with race detector
$ go test -race ./...

# Run binary with race detector
$ go run -race main.go

# Build with race detector (for staging only, 10x slower)
$ go build -race -o myapp-race

CI Pipeline example (GitHub Actions):

yaml
- name: Run tests with race detector
  run: go test -race -v ./...

🎮 Spot the Bug: Data Race

🧩 Challenge: Find the Race Condition

go
package main

import (
    "fmt"
    "sync"
)

func main() {
    counter := 0
    var wg sync.WaitGroup
    
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            counter++  // 🐛 Where's the bug?
        }()
    }
    
    wg.Wait()
    fmt.Println("Counter:", counter)
    // Expected: 1000
    // Actual: usually less than 1000, and different on every run!
}

💡 Solution

The Bug: counter++ is NOT atomic! It's actually 3 operations:

  1. Read counter
  2. Increment
  3. Write back

Multiple goroutines executing these steps concurrently = data race.

Fix 1: Using Mutex

go
var mu sync.Mutex
counter := 0

go func() {
    defer wg.Done()
    mu.Lock()
    counter++
    mu.Unlock()
}()

Fix 2: Using Atomic (Preferred for counters)

go
var counter int64

go func() {
    defer wg.Done()
    atomic.AddInt64(&counter, 1)
}()

Fix 3: Using Channel (Go idiom)

go
results := make(chan int, 1000)

for i := 0; i < 1000; i++ {
    go func() {
        results <- 1  // Send an increment instead of touching shared state
    }()
}

// Collect in one place: only this goroutine owns total
total := 0
for i := 0; i < 1000; i++ {
    total += <-results
}

Detect with race flag:

bash
$ go run -race main.go
==================
WARNING: DATA RACE
Write at 0x00c0000140a8 by goroutine 7:
  main.main.func1()
      /path/main.go:15 +0x4c
...

📊 Summary

| Concept    | Key Point                                      |
|------------|------------------------------------------------|
| CSP        | Communicate via channels, not shared memory    |
| Goroutine  | 2KB stack, 1M+ possible, user-space scheduling |
| GMP Model  | G=Goroutine, M=OS Thread, P=Processor          |
| WaitGroup  | Wait for a batch of goroutines                 |
| Mutex      | Exclusive lock for write-heavy data            |
| RWMutex    | Read-heavy workloads (multiple readers)        |
| Atomic     | Fastest for simple counters                    |
| -race flag | MANDATORY in CI/CD pipelines                   |

➡️ What's Next

Foundations mastered! Next up: Channels & Patterns - worker pools, fan-out/fan-in, and graceful shutdown.