📡 I/O Architecture & Data Serialization
"Everything is a file" — Unix philosophy.
In Go, nearly every data source implements io.Reader. Master this interface to build efficient data pipelines.
🔌 The Universal Interfaces
io.Reader & io.Writer
```go
// The two most important interfaces in Go (declared in package io)
type Reader interface {
	Read(p []byte) (n int, err error)
}

type Writer interface {
	Write(p []byte) (n int, err error)
}
```
🎓 Professor Tom's Deep Dive: Universality
Many types in the standard library implement one or both of these interfaces:
| Source/Dest | Type | Reader/Writer |
|---|---|---|
| Files | *os.File | Both |
| Network | net.Conn | Both |
| HTTP Body | http.Response.Body (io.ReadCloser) | Reader |
| HTTP Response | http.ResponseWriter | Writer |
| Memory | *bytes.Buffer | Both |
| Compression | *gzip.Reader / *gzip.Writer | Reader / Writer (one each) |
| Encryption | cipher.StreamReader | Reader |
| Strings | *strings.Reader | Reader |
Power: write a function that accepts io.Reader and it works with ALL of these sources!
The Pipe Metaphor: Chaining
```go
// Unix pipe: cat file.txt | gzip | encrypt | nc server 8080
// Go equivalent (needs compress/gzip, crypto/aes, crypto/cipher, io, net, os):
func ProcessAndSend(inputPath string, conn net.Conn, key, iv []byte) error {
	// Open file
	file, err := os.Open(inputPath)
	if err != nil {
		return err
	}
	defer file.Close()

	// Chain: File → Gzip → Encrypt → Network.
	// Writers wrap their destination, so build from the network end backwards.
	block, err := aes.NewCipher(key) // key: 16, 24, or 32 bytes
	if err != nil {
		return err
	}
	// iv must be aes.BlockSize (16) bytes and unique per key.
	// AES-CTR is used here for illustration; it adds no authentication.
	encryptWriter := &cipher.StreamWriter{S: cipher.NewCTR(block, iv), W: conn}
	gzipWriter := gzip.NewWriter(encryptWriter)

	// Data flows through the chain
	if _, err := io.Copy(gzipWriter, file); err != nil {
		return err
	}
	return gzipWriter.Close() // flush buffered compressed data
}
```
Visual Pipeline:
```
┌─────────────────────────────────────────────────────────────────────┐
│                     DATA PIPELINE (Unix Pipes)                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐       │
│  │   File   │───►│   Gzip   │───►│ Encrypt  │───►│ Network  │       │
│  │ (Reader) │    │ (Writer) │    │ (Writer) │    │  (Conn)  │       │
│  └──────────┘    └──────────┘    └──────────┘    └──────────┘       │
│                                                                     │
│  io.Copy(gzipWriter, file)                                          │
│  → Data streams through without loading entire file into RAM!       │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
📖 Efficient File Processing
The Problem: Loading Huge Files
🔥 Raizo's Pitfall: Memory Explosion
```go
// ❌ BAD: Load entire file into memory
func ProcessFileBad(path string) error {
	data, err := os.ReadFile(path) // 10GB file = 10GB RAM!
	if err != nil {
		return err
	}
	lines := strings.Split(string(data), "\n") // string(data) copies every byte: double memory!
	for _, line := range lines {
		process(line)
	}
	return nil
}
```
Memory usage with a 10GB file:
- os.ReadFile: 10GB allocated
- string(data): an additional ~10GB copy of the bytes
- strings.Split: one string header per line on top of that (the substrings share the copied data)
- Total: 20GB+ for a 10GB file!
The Solution: Buffered I/O with bufio
```go
// ✅ GOOD: Stream with constant memory
func ProcessFileGood(path string) error {
	file, err := os.Open(path)
	if err != nil {
		return err
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	// Optional: increase buffer for long lines
	buf := make([]byte, 64*1024)   // 64KB buffer
	scanner.Buffer(buf, 1024*1024) // Max 1MB per line

	for scanner.Scan() {
		line := scanner.Text()
		process(line) // Process one line at a time
	}
	return scanner.Err()
}
```
Memory Comparison
```
┌─────────────────────────────────────────────────────────────────────┐
│               MEMORY USAGE: Processing 10GB Log File                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  os.ReadFile (BAD):                                                 │
│  ┌────────────────────────────────────────────────────────────┐     │
│  │ ████████████████████████████████████████████████ ~20GB RAM │     │
│  └────────────────────────────────────────────────────────────┘     │
│  💥 Server crashes with 512MB RAM!                                  │
│                                                                     │
│  bufio.Scanner (GOOD):                                              │
│  ┌────┐                                                             │
│  │ ██ │ ~64KB buffer (constant!)                                    │
│  └────┘                                                             │
│  ✅ Works perfectly with 512MB RAM                                  │
│                                                                     │
│  Time to first output:                                              │
│  - ReadFile: Must wait for entire 10GB to load                      │
│  - Scanner: Immediate (reads first line instantly)                  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
```
bufio.Reader vs bufio.Scanner
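```go
// Two alternatives for the same open file; pick one.

// bufio.Scanner - Line-by-line processing (most common)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
	line := scanner.Text()
	// process line
}
if err := scanner.Err(); err != nil {
	// handle read error
}

// bufio.Reader - More control (custom delimiters, peeking)
reader := bufio.NewReader(file)
for {
	line, err := reader.ReadString('\n')
	if len(line) > 0 {
		// process line (a final line without '\n' arrives together with io.EOF)
	}
	if err != nil {
		// io.EOF means end of input; report any other error
		break
	}
}
```
📦 Data Serialization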
JSON: The Universal Format
```go
import (
	"encoding/json"
	"time"
)

type User struct {
	ID        int64     `json:"id"`
	Name      string    `json:"name"`
	Email     string    `json:"email,omitempty"` // Omit if empty
	Password  string    `json:"-"`               // Never serialize
	CreatedAt time.Time `json:"created_at"`
}

// Marshal: Struct → JSON bytes
user := User{ID: 1, Name: "Raizo", Email: "raizo@hpn.dev"}
data, err := json.Marshal(user)
// {"id":1,"name":"Raizo","email":"raizo@hpn.dev","created_at":"..."}

// Unmarshal: JSON bytes → Struct
var parsed User
err = json.Unmarshal(data, &parsed)
```
Streaming JSON for APIs
```go
// ❌ BAD: Marshal to bytes, then write
func HandleUserBad(w http.ResponseWriter, r *http.Request) {
	user := getUser()
	data, _ := json.Marshal(user) // Allocates a full byte slice; error ignored
	w.Write(data)
}

// ✅ GOOD: Encode straight to the writer
func HandleUserGood(w http.ResponseWriter, r *http.Request) {
	user := getUser()
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(user); err != nil {
		log.Printf("encode user: %v", err) // headers may already be sent
	}
}

// Benefits:
// - No separate byte slice to allocate and pass around in the handler
//   (Encode still builds each value in an internal pooled buffer before writing)
// - Works with chunked transfer encoding
// - Lower memory footprint for large responses
```
Gob: Go's Binary Format
🎓 Professor Tom's Deep Dive: When to Use Gob
Gob is a Go-specific binary format:
| Aspect | JSON | Gob |
|---|---|---|
| Readable | ✅ Human readable | ❌ Binary |
| Size | Larger | ~50% smaller (varies with data; type info adds overhead for a single small value) |
| Speed | Slower | 2-5x faster (varies with workload) |
| Interop | ✅ Any language | ❌ Go only |
| Use Case | APIs, config | Internal storage, RPC |
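Size and speed figures like these depend heavily on the data, so it is worth measuring with your own payloads. The sketch below is one way to do that, assuming a hypothetical Record type and helper names (EncodedSizes, GobRoundTrip) that do not appear in the original material. It encodes the same slice with both formats and reports the byte counts.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
)

// Record is a hypothetical payload used only for this comparison.
type Record struct {
	ID    int64
	Name  string
	Score float64
}

// EncodedSizes encodes the same records with JSON and Gob and
// returns the number of bytes each format produced.
func EncodedSizes(records []Record) (jsonSize, gobSize int, err error) {
	var jsonBuf, gobBuf bytes.Buffer
	if err = json.NewEncoder(&jsonBuf).Encode(records); err != nil {
		return 0, 0, err
	}
	if err = gob.NewEncoder(&gobBuf).Encode(records); err != nil {
		return 0, 0, err
	}
	return jsonBuf.Len(), gobBuf.Len(), nil
}

// GobRoundTrip encodes and then decodes records to confirm Gob preserves them.
func GobRoundTrip(records []Record) ([]Record, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(records); err != nil {
		return nil, err
	}
	var out []Record
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	records := make([]Record, 1000)
	for i := range records {
		records[i] = Record{ID: int64(i), Name: "user", Score: float64(i) / 2}
	}
	j, g, err := EncodedSizes(records)
	if err != nil {
		panic(err)
	}
	fmt.Printf("json=%d bytes, gob=%d bytes\n", j, g)
}
```

For timing rather than size, the same two encoders can be wrapped in testing.B benchmarks.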
```go
import (
	"encoding/gob"
	"os"
)

// Encode to file (Session is the application's own session struct)
func SaveSession(path string, session *Session) error {
	file, err := os.Create(path)
	if err != nil {
		return err
	}
	defer file.Close()

	encoder := gob.NewEncoder(file)
	return encoder.Encode(session)
}

// Decode from file
func LoadSession(path string) (*Session, error) {
	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer file.Close()

	var session Session
	decoder := gob.NewDecoder(file)
	if err := decoder.Decode(&session); err != nil {
		return nil, err
	}
	return &session, nil
}
```
📌 HPN Application: Penalgo Save Files
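Penalgo uses Gob for its "Save Progress" feature:
```go
// Compact and fast for internal tools
type PenalgoProgress struct {
	UserID        string
	CompletedLabs []string
	Scores        map[string]int
	LastAccess    time.Time
}

// Save: 50% smaller than JSON, 3x faster encode.
// SaveSession only accepts *Session, so the encoder is used directly here.
// dataDir is assumed to be a package-level variable holding the storage directory.
func SaveProgress(userID string, progress *PenalgoProgress) error {
	path := filepath.Join(dataDir, userID+".gob")
	file, err := os.Create(path)
	if err != nil {
		return err
	}
	defer file.Close()
	return gob.NewEncoder(file).Encode(progress)
}
```
📁 Config File Patterns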
Reading JSON Config at Startup
```go
// config/config.go
package config

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// DatabaseConfig and RedisConfig are defined elsewhere in this package.
type Config struct {
	Server   ServerConfig   `json:"server"`
	Database DatabaseConfig `json:"database"`
	Redis    RedisConfig    `json:"redis"`
}

// Note: time.Duration decodes from a JSON number of nanoseconds,
// not from strings such as "5s".
type ServerConfig struct {
	Port         int           `json:"port"`
	ReadTimeout  time.Duration `json:"read_timeout"`
	WriteTimeout time.Duration `json:"write_timeout"`
}

func LoadConfig(path string) (*Config, error) {
	file, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open config: %w", err)
	}
	defer file.Close()

	var cfg Config
	if err := json.NewDecoder(file).Decode(&cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	return &cfg, nil
}

// main.go
func main() {
	cfg, err := config.LoadConfig("configs/config.json")
	if err != nil {
		log.Fatalf("failed to load config: %v", err)
	}
	server := NewServer(cfg)
	server.Run()
}
```
Config Structure
```
project/
├── cmd/
│   └── api/
│       └── main.go
├── configs/
│   ├── config.json       # Default config
│   ├── config.dev.json   # Development overrides
│   └── config.prod.json  # Production settings
├── internal/
│   └── config/
│       └── config.go     # Config types & loader
```
🎮 Scenario Analysis
🧠 Production Challenge
Scenario: You need to process a 10GB log file on a server with only 512MB of RAM.
Would you use os.ReadFile or bufio.Scanner?
💡 Memory management analysis
os.ReadFile Approach (❌ Will Crash)
```go
data, _ := os.ReadFile("10gb.log") // 💥 Out of Memory!
```
The problem:
- os.ReadFile allocates a 10GB slice to hold the entire file
- A server with 512MB of RAM gets OOM-killed almost immediately
- There is no way to reduce the memory footprint with this approach
bufio.Scanner Approach (✅ OK)
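```go
file, err := os.Open("10gb.log")
if err != nil {
	log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
	process(scanner.Text()) // buffer is at most 64KB by default
}
// check scanner.Err() after the loop
```
How it works:
- bufio.Scanner uses an internal buffer (4KB initially, growing to at most 64KB by default)
- On each Scan():
  - Read data into the buffer
  - Find the delimiter (newline)
  - Return the line, then reuse the buffer for the next line
- Memory stays constant (at most ~64KB) regardless of file size!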
Memory Timeline
```
os.ReadFile:
Time 0: ░░░░░░░░░░░░░░░░░░░░ 0MB
Time 1: ██████████████████████████████████████████ 10GB → 💥 OOM

bufio.Scanner:
Time 0: ██ 64KB
Time 1: ██ 64KB (same buffer reused)
Time 2: ██ 64KB
...
Time N: ██ 64KB → ✅ Completes successfully!
```
Production Tips
- Large files: Always use streaming (bufio, io.Copy)
- Memory limit: Set via GOMEMLIMIT (Go 1.19+)
- Monitor: Use runtime.ReadMemStats during development
- Long lines: Configure scanner.Buffer(buf, maxLineSize)
📊 Summary: Module 2 Complete!
| Concept | Key Point |
|---|---|
| io.Reader/Writer | Universal interfaces for all I/O |
| Chaining | Pipe data through transformations |
| bufio | Constant memory for large files |
| JSON | Human-readable, use for APIs |
| Gob | Binary, faster, Go-only internal use |
| Config | Load at startup, fail fast |
🦴 Module 2 Complete: The Skeleton
🎉 Congratulations! Module 2 complete!
You have finished building the "skeleton" of Go:
- ✅ Structs & Interfaces: data structures
- ✅ Memory Layout: alignment and optimization
- ✅ I/O Architecture: the data bloodstream
- ✅ Serialization: communicating with the outside world
- ✅ Reflection: runtime self-awareness
⚡ Next up: Module 3 - Concurrency
"Next, we breathe life into the monster using Goroutines."
- 🔥 Goroutines: Lightweight concurrency (2KB stack)
- 🔥 Channels: Safe communication between goroutines
- 🔥 Context: Cancellation, timeouts, request-scoped data
- 🔥 sync Primitives: Mutex, WaitGroup, atomic operations
- 🔥 Concurrency Patterns: Worker pools, fan-out/fan-in
"Do not communicate by sharing memory; share memory by communicating."