
Logging & Debugging Standard Library

print() is for debugging; logging is for production

Learning Outcomes

After completing this page, you will:

  • 🎯 Understand the logging module architecture (loggers, handlers, formatters)
  • 🎯 Configure handlers for console, file, and remote destinations
  • 🎯 Implement structured logging (JSON) for observability
  • 🎯 Debug effectively with pdb and related strategies
  • 🎯 Avoid common production pitfalls in logging

Logging Architecture

Overview

┌─────────────────────────────────────────────────────────────┐
│                      Your Code                              │
│                                                             │
│   logger.info("User logged in", extra={"user_id": 123})    │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                        Logger                               │
│   - Name hierarchy (app.auth.login)                        │
│   - Level filtering (DEBUG, INFO, WARNING, ERROR, CRITICAL)│
│   - Propagation to parent loggers                          │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                       Handler(s)                            │
│   - StreamHandler (console)                                │
│   - FileHandler (file)                                     │
│   - RotatingFileHandler (size-based rotation)              │
│   - TimedRotatingFileHandler (time-based rotation)         │
│   - SocketHandler, HTTPHandler, etc.                       │
└─────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────┐
│                      Formatter                              │
│   - Format string: "%(asctime)s - %(name)s - %(message)s"  │
│   - JSON formatter for structured logging                  │
└─────────────────────────────────────────────────────────────┘


                         Output
              (Console, File, Network, etc.)
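This pipeline can be wired up in a few lines; a minimal, self-contained sketch (the logger name, format string, and StringIO stand-in for the console are arbitrary):

```python
import io
import logging

# Logger -> Handler -> Formatter -> Output, end to end
stream = io.StringIO()                     # stand-in for console/file output
handler = logging.StreamHandler(stream)    # Handler: where records go
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))

logger = logging.getLogger("app.auth.login")  # Logger: named node in the hierarchy
logger.setLevel(logging.INFO)                 # Level filtering
logger.addHandler(handler)
logger.propagate = False                      # keep the demo self-contained

logger.info("User logged in")
print(stream.getvalue().strip())
# app.auth.login - INFO - User logged in
```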

Log Levels

| Level    | Numeric | When to use                                            |
|----------|---------|--------------------------------------------------------|
| DEBUG    | 10      | Detailed information for debugging; not for production |
| INFO     | 20      | Confirmation that things are working as expected       |
| WARNING  | 30      | Something unexpected happened, but the app still works |
| ERROR    | 40      | A serious problem; some functionality is failing       |
| CRITICAL | 50      | A severe error; the app may be unable to continue      |
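The numeric values are what level filtering actually compares; a quick check (the logger name is arbitrary):

```python
import logging

# Level names are just labels over integers
assert logging.DEBUG == 10
assert logging.CRITICAL == 50

logger = logging.getLogger("level_demo")
logger.setLevel(logging.WARNING)  # threshold: 30

# Records below the threshold are dropped before any handler sees them
assert not logger.isEnabledFor(logging.INFO)  # 20 < 30
assert logger.isEnabledFor(logging.ERROR)     # 40 >= 30
```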

Basic Configuration

Quick Start

python
import logging

# ✅ Basic config (for simple scripts)
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S"
)

logger = logging.getLogger(__name__)

logger.debug("This won't show (level is INFO)")
logger.info("Application started")
logger.warning("Low disk space")
logger.error("Failed to connect to database")
logger.critical("System is shutting down")

Logger Hierarchy

python
import logging

# Logger names form a hierarchy separated by dots
root_logger = logging.getLogger()           # Root logger
app_logger = logging.getLogger("app")       # app
auth_logger = logging.getLogger("app.auth") # app.auth (child of app)
db_logger = logging.getLogger("app.db")     # app.db (child of app)

# Messages propagate UP the hierarchy by default
# app.auth → app → root

# ✅ Best practice: Use __name__ for automatic hierarchy
logger = logging.getLogger(__name__)
# In myapp/auth/login.py → logger name is "myapp.auth.login"
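Propagation means a handler attached to a parent logger sees records emitted by its children; a small self-contained sketch (the names and StringIO output are illustrative):

```python
import io
import logging

stream = io.StringIO()
parent = logging.getLogger("app")
parent.setLevel(logging.INFO)
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
parent.addHandler(handler)

child = logging.getLogger("app.auth")  # no handler of its own
child.info("token issued")             # propagates up to "app"

print(stream.getvalue().strip())
# app.auth: token issued
```

Note that the record keeps the child's name, so the output still tells you exactly where the message originated.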

Handlers - Where Logs Go

StreamHandler (Console)

python
import logging
import sys

logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)

# Console handler
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.INFO)  # Only INFO and above go to console

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
console_handler.setFormatter(formatter)

logger.addHandler(console_handler)

FileHandler

python
import logging
from pathlib import Path

logger = logging.getLogger("myapp")

# Create the log directory
log_dir = Path("logs")
log_dir.mkdir(exist_ok=True)

# File handler
file_handler = logging.FileHandler(
    log_dir / "app.log",
    encoding="utf-8"
)
file_handler.setLevel(logging.DEBUG)  # All levels go to the file

formatter = logging.Formatter(
    "%(asctime)s - %(name)s - %(levelname)s - %(funcName)s:%(lineno)d - %(message)s"
)
file_handler.setFormatter(formatter)

logger.addHandler(file_handler)

RotatingFileHandler

python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger("myapp")

# Rotate when the file reaches 5MB; keep 3 backup files
rotating_handler = RotatingFileHandler(
    "logs/app.log",
    maxBytes=5 * 1024 * 1024,  # 5MB
    backupCount=3,
    encoding="utf-8"
)
# Files: app.log, app.log.1, app.log.2, app.log.3

logger.addHandler(rotating_handler)

TimedRotatingFileHandler

python
import logging
from logging.handlers import TimedRotatingFileHandler

logger = logging.getLogger("myapp")

# Rotate daily at midnight; keep 7 days of backups
timed_handler = TimedRotatingFileHandler(
    "logs/app.log",
    when="midnight",
    interval=1,
    backupCount=7,
    encoding="utf-8"
)
# Files: app.log, app.log.2024-01-01, app.log.2024-01-02, ...

logger.addHandler(timed_handler)

Multiple Handlers Pattern

python
import logging
import sys
from logging.handlers import RotatingFileHandler

def setup_logging(name: str, log_file: str = "app.log") -> logging.Logger:
    """
    Set up a logger with a console handler (INFO+) and a file handler (DEBUG+).
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    
    # Prevent duplicate handlers
    if logger.handlers:
        return logger
    
    # Console: INFO and above
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)
    console.setFormatter(logging.Formatter(
        "%(levelname)s - %(message)s"
    ))
    
    # File: DEBUG and above (more detail)
    file_handler = RotatingFileHandler(
        log_file,
        maxBytes=10 * 1024 * 1024,
        backupCount=5,
        encoding="utf-8"
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - "
        "%(funcName)s:%(lineno)d - %(message)s"
    ))
    
    logger.addHandler(console)
    logger.addHandler(file_handler)
    
    return logger

Formatters - Log Formatting

Format Attributes

| Attribute      | Example                 | Description   |
|----------------|-------------------------|---------------|
| %(asctime)s    | 2024-01-15 10:30:45,123 | Timestamp     |
| %(name)s       | myapp.auth              | Logger name   |
| %(levelname)s  | INFO                    | Level name    |
| %(levelno)d    | 20                      | Level number  |
| %(message)s    | User logged in          | Log message   |
| %(funcName)s   | login                   | Function name |
| %(lineno)d     | 42                      | Line number   |
| %(pathname)s   | /app/auth.py            | Full path     |
| %(filename)s   | auth.py                 | Filename      |
| %(module)s     | auth                    | Module name   |
| %(process)d    | 12345                   | Process ID    |
| %(thread)d     | 140735                  | Thread ID     |
| %(threadName)s | MainThread              | Thread name   |
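These attributes live on the LogRecord itself; building one by hand (every value below is made up) shows how a format string pulls them together:

```python
import logging

formatter = logging.Formatter(
    "%(levelname)s [%(name)s] %(filename)s:%(lineno)d - %(message)s"
)

# A LogRecord constructed manually for illustration
record = logging.LogRecord(
    name="myapp.auth", level=logging.INFO, pathname="/app/auth.py",
    lineno=42, msg="User %s logged in", args=("alice",), exc_info=None,
)

print(formatter.format(record))
# INFO [myapp.auth] auth.py:42 - User alice logged in
```

Note that %(filename)s is derived automatically from the pathname, and %(message)s is msg with its args already interpolated.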

Custom Formatter

python
import logging
from datetime import datetime, timezone

class UTCFormatter(logging.Formatter):
    """Formatter với UTC timestamps."""
    
    converter = datetime.fromtimestamp
    
    def formatTime(self, record, datefmt=None):
        ct = self.converter(record.created, tz=timezone.utc)
        if datefmt:
            return ct.strftime(datefmt)
        return ct.isoformat()

# Usage
formatter = UTCFormatter(
    "%(asctime)s - %(levelname)s - %(message)s"
)

Structured Logging (JSON)

Why JSON Logging?

python
# ❌ Plain text - hard to parse
# 2024-01-15 10:30:45 - INFO - User alice logged in from 192.168.1.1

# ✅ JSON - easy to parse, query, and aggregate
# {"timestamp": "2024-01-15T10:30:45Z", "level": "INFO", 
#  "message": "User logged in", "user": "alice", "ip": "192.168.1.1"}

JSON Formatter

python
import logging
import json
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """JSON formatter cho structured logging."""
    
    def format(self, record: logging.LogRecord) -> str:
        log_data = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        
        # Add extra fields
        if hasattr(record, "user_id"):
            log_data["user_id"] = record.user_id
        if hasattr(record, "request_id"):
            log_data["request_id"] = record.request_id
            
        # Add exception info
        if record.exc_info:
            log_data["exception"] = self.formatException(record.exc_info)
            
        return json.dumps(log_data, ensure_ascii=False)

# Usage
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)

# Log with extra fields
logger.info("User logged in", extra={"user_id": 123, "request_id": "abc-123"})

python-json-logger Library

python
# pip install python-json-logger
from pythonjsonlogger import jsonlogger
import logging

logger = logging.getLogger()

handler = logging.StreamHandler()
formatter = jsonlogger.JsonFormatter(
    "%(asctime)s %(levelname)s %(name)s %(message)s",
    rename_fields={"levelname": "level", "asctime": "timestamp"}
)
handler.setFormatter(formatter)
logger.addHandler(handler)

logger.info("Request processed", extra={
    "user_id": 123,
    "duration_ms": 45,
    "status_code": 200
})

structlog Library

python
# pip install structlog
import structlog

# Configure structlog
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
        structlog.processors.JSONRenderer()
    ],
    wrapper_class=structlog.stdlib.BoundLogger,
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    cache_logger_on_first_use=True,
)

logger = structlog.get_logger()

# Structured logging with context
logger.info("user_login", user_id=123, ip="192.168.1.1")
# {"event": "user_login", "user_id": 123, "ip": "192.168.1.1", 
#  "timestamp": "2024-01-15T10:30:45Z", "level": "info"}

# Bind context for subsequent logs
log = logger.bind(request_id="abc-123")
log.info("processing_started")
log.info("processing_completed", duration_ms=45)

Logging Configuration

python
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    
    "formatters": {
        "standard": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        },
        "json": {
            "()": "myapp.logging.JSONFormatter"
        }
    },
    
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "standard",
            "stream": "ext://sys.stdout"
        },
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "level": "DEBUG",
            "formatter": "json",
            "filename": "logs/app.log",
            "maxBytes": 10485760,  # 10MB
            "backupCount": 5,
            "encoding": "utf-8"
        }
    },
    
    "loggers": {
        "": {  # Root logger
            "handlers": ["console", "file"],
            "level": "DEBUG",
            "propagate": True
        },
        "myapp": {
            "handlers": ["console", "file"],
            "level": "DEBUG",
            "propagate": False
        },
        "sqlalchemy.engine": {
            "handlers": ["file"],
            "level": "WARNING",
            "propagate": False
        }
    }
}

logging.config.dictConfig(LOGGING_CONFIG)

File Config (YAML)

yaml
# logging.yaml
version: 1
disable_existing_loggers: false

formatters:
  standard:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout

loggers:
  myapp:
    level: DEBUG
    handlers: [console]
    propagate: false

python
import logging.config
import yaml
from pathlib import Path

config = yaml.safe_load(Path("logging.yaml").read_text())
logging.config.dictConfig(config)

pdb Debugging

Basic pdb Commands

| Command    | Short  | Description                             |
|------------|--------|-----------------------------------------|
| help       | h      | Show help                               |
| where      | w      | Print stack trace                       |
| list       | l      | List source code                        |
| print expr | p expr | Print expression                        |
| pp expr    |        | Pretty-print expression                 |
| next       | n      | Execute next line                       |
| step       | s      | Step into function                      |
| continue   | c      | Continue execution                      |
| return     | r      | Continue until current function returns |
| break      | b      | Set breakpoint                          |
| clear      | cl     | Clear breakpoint                        |
| quit       | q      | Quit debugger                           |

Breakpoint Methods

python
# Method 1: Built-in breakpoint() (Python 3.7+)
def process_data(data):
    result = transform(data)
    breakpoint()  # Debugger stops here
    return result

# Method 2: pdb.set_trace()
import pdb

def process_data(data):
    result = transform(data)
    pdb.set_trace()
    return result

# Method 3: Conditional breakpoint
def process_items(items):
    for i, item in enumerate(items):
        if i == 100:  # Debug only item 100
            breakpoint()
        process(item)

# Method 4: Post-mortem debugging
try:
    risky_operation()
except Exception:
    import pdb
    pdb.post_mortem()  # Debug at exception point

pdb++ (Enhanced Debugger)

bash
pip install pdbpp
python
# pdb++ transparently replaces pdb and adds:
# - Syntax highlighting
# - Tab completion
# - Sticky mode (shows surrounding code context)
# - Smart command parsing

# Use it exactly like pdb
breakpoint()  # Automatically drops into pdb++

Remote Debugging

python
# pip install remote-pdb
from remote_pdb import RemotePdb

def process():
    RemotePdb("0.0.0.0", 4444).set_trace()
    # Connect: telnet localhost 4444

Debugging Strategies

Strategy 1: Logging Levels

python
import logging

logger = logging.getLogger(__name__)

def complex_algorithm(data):
    logger.debug("Input data: %s", data)
    
    step1_result = step1(data)
    logger.debug("After step1: %s", step1_result)
    
    step2_result = step2(step1_result)
    logger.info("Processing completed: %d items", len(step2_result))
    
    return step2_result

Strategy 2: Context Managers for Timing

python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)

@contextmanager
def log_duration(operation: str):
    start = time.perf_counter()
    logger.debug(f"Starting: {operation}")
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        logger.info(f"Completed: {operation} in {duration:.3f}s")

# Usage
with log_duration("database_query"):
    results = db.query(...)

Strategy 3: Exception Logging

python
import logging
import traceback

logger = logging.getLogger(__name__)

def risky_operation():
    try:
        result = external_api_call()
        return result
    except ConnectionError as e:
        logger.error(
            "API connection failed",
            extra={"error": str(e)},
            exc_info=True  # Include stack trace
        )
        raise
    except ValueError as e:
        logger.warning(f"Invalid response: {e}")
        return None

Strategy 4: Request Context

python
import logging
import uuid
from contextvars import ContextVar

# Context variable for request ID
request_id_var: ContextVar[str] = ContextVar("request_id", default="")

class RequestContextFilter(logging.Filter):
    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

# Setup
logger = logging.getLogger(__name__)
logger.addFilter(RequestContextFilter())

# In a request handler
def handle_request(request):
    request_id_var.set(str(uuid.uuid4()))
    logger.info("Request started")  # Record now carries request_id
    # (reference %(request_id)s in the formatter to display it)
    # ... process request
    logger.info("Request completed")

Production Pitfalls

Pitfall 1: Logging Sensitive Data

python
import logging

logger = logging.getLogger(__name__)

# ❌ BUG: Logging sensitive data
def login(username: str, password: str):
    logger.info(f"Login attempt: {username}, {password}")  # 💀 Password in logs!

# ✅ FIX: Never log credentials
def login(username: str, password: str):
    logger.info("Login attempt: %s", username)
    # Or log only non-sensitive metadata
    logger.debug("Password length: %d", len(password))

Pitfall 2: Expensive String Formatting

python
import logging

logger = logging.getLogger(__name__)

# ❌ SLOW: the f-string is evaluated even when DEBUG is disabled
logger.debug(f"Processing {expensive_computation()}")

# ✅ Lazy %-style formatting defers the string interpolation, not argument
# evaluation - use it when the argument is cheap but its str() is expensive
logger.debug("Processing %s", large_object)  # str() runs only if DEBUG enabled

# ✅ FAST: Guard genuinely expensive computations with a level check
if logger.isEnabledFor(logging.DEBUG):
    logger.debug(f"Processing {expensive_computation()}")

Pitfall 3: Missing Exception Info

python
import logging

logger = logging.getLogger(__name__)

try:
    risky_operation()
except Exception as e:
    # ❌ BUG: Stack trace is lost
    logger.error(f"Error: {e}")
    
    # ✅ FIX: Include exception info
    logger.error("Operation failed", exc_info=True)
    
    # ✅ Or use logger.exception()
    logger.exception("Operation failed")  # Automatically includes exc_info

Pitfall 4: Handler Duplication

python
import logging

# ❌ BUG: Every call adds another handler
def get_logger():
    logger = logging.getLogger("myapp")
    handler = logging.StreamHandler()
    logger.addHandler(handler)  # Duplicate handlers!
    return logger

# ✅ FIX: Check for existing handlers
def get_logger():
    logger = logging.getLogger("myapp")
    if not logger.handlers:
        handler = logging.StreamHandler()
        logger.addHandler(handler)
    return logger

# ✅ BETTER: Configure once at startup
# In main.py or __init__.py
logging.config.dictConfig(LOGGING_CONFIG)

Pitfall 5: Blocking I/O in Async Code

python
import logging
import asyncio

logger = logging.getLogger(__name__)

# ❌ BUG: FileHandler does blocking file I/O on the event loop thread
async def process():
    logger.info("Processing...")  # Blocking I/O!

# ✅ FIX: Use QueueHandler; the real handlers run on a separate thread
from logging.handlers import QueueHandler, QueueListener
import queue

log_queue = queue.Queue()
queue_handler = QueueHandler(log_queue)

# Actual handlers run in a separate thread
file_handler = logging.FileHandler("app.log")
listener = QueueListener(log_queue, file_handler)
listener.start()  # Call listener.stop() at shutdown to flush pending records

logger.addHandler(queue_handler)

Pitfall 6: Log File Permissions

python
import logging
import os
from pathlib import Path

# ❌ BUG: Log file created with the process umask (often world-readable)
handler = logging.FileHandler("app.log")

# ✅ FIX: Restrict permissions before the handler opens the file
log_path = Path("app.log")
log_path.touch(exist_ok=True)
os.chmod(log_path, 0o600)  # Owner read/write only
handler = logging.FileHandler(log_path)  # Opens in append mode; permissions kept

Best Practices Summary

python
import logging
import sys

# === SETUP ===
# ✅ Configure once at application startup
logging.config.dictConfig(LOGGING_CONFIG)

# ✅ Use __name__ for logger hierarchy
logger = logging.getLogger(__name__)


# === LOGGING ===
# ✅ Use appropriate levels
logger.debug("Detailed info for debugging")
logger.info("Normal operation events")
logger.warning("Something unexpected but handled")
logger.error("Error occurred", exc_info=True)
logger.critical("System cannot continue")

# ✅ Lazy formatting
logger.info("User %s logged in from %s", user_id, ip_address)

# ✅ Structured logging với extra
logger.info("Request processed", extra={
    "user_id": user_id,
    "duration_ms": duration,
    "status_code": 200
})


# === SECURITY ===
# ✅ Never log sensitive data
logger.info("Login attempt", extra={"username": username})
# NOT: logger.info(f"Login: {username}:{password}")


# === DEBUGGING ===
# ✅ Use breakpoint() for debugging
def complex_function():
    breakpoint()  # Python 3.7+
    
# ✅ Post-mortem debugging
try:
    risky_code()
except Exception:
    import pdb; pdb.post_mortem()