Property-Based Testing: Let the Machine Find Bugs for You

In March 2020, a payment system processing millions of transactions a day hit a critical bug: the function that serialized floats to JSON lost precision for values between 1e-15 and 1e-7. The QA team had written more than 200 manual test cases, but nobody thought of a value like 0.00000003141592653. After 72 hours of downtime the cause was identified: floating-point rounding in the encode step. A simple property test, decode(encode(x)) == x run against 10,000 random values, could likely have caught this bug in a couple of seconds.

That is the core limitation of example-based testing: you only find the bugs you can imagine. You test 0, 1, -1, 999999, but miss 1e-10, float('inf'), NaN, or a string containing 4-byte UTF-8 characters. Property-based testing flips the approach: instead of listing individual cases, you define the rules the code must obey and let the framework generate hundreds or thousands of inputs, including edge cases people rarely think of.

Quick win: Add a @given(st.text()) decorator to the test of any string-handling function and run it. There is a good chance you will uncover at least one overlooked edge case: the empty string, a null character, or a multi-byte emoji.
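
The quick win above might look like the sketch below. The helper normalize_whitespace is a hypothetical stand-in for "any string-handling function" and is not part of the original text; the two properties are only examples of rules such a helper could be expected to obey.

```python
from hypothesis import given
from hypothesis import strategies as st


def normalize_whitespace(s: str) -> str:
    """Hypothetical helper: collapse runs of whitespace into single spaces."""
    return " ".join(s.split())


@given(st.text())
def test_normalize_is_idempotent(s):
    # Normalizing an already normalized string must change nothing.
    once = normalize_whitespace(s)
    assert normalize_whitespace(once) == once


@given(st.text())
def test_normalize_never_grows(s):
    # Collapsing whitespace can only keep or reduce the length.
    assert len(normalize_whitespace(s)) <= len(s)
```

Run it with pytest; Hypothesis feeds each test empty strings, control characters, and multi-byte text without any of them being listed by hand.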

The mental picture

Example-based vs property-based: two testing philosophies

Example-based testing is like quality control by sampling: you pick 5 products off the line, measure them, and conclude that "the whole batch meets the standard". If you happened to pick the right samples, the conclusion holds. But if the defect only shows up in product number 847, you will never know.

Property-based testing is like enforcing quality rules on the whole production line: "every product must weigh 100 g ± 2 g, measure 10 cm ± 0.5 mm in length, and have no surface cracks". Instead of checking specific products, you define invariants and let the system check thousands of random samples, including boundary cases nobody thought to look at.

┌─────────────────────────────────────────────────────────────────┐
│                    TWO TESTING PHILOSOPHIES                     │
├────────────────────────────┬────────────────────────────────────┤
│     EXAMPLE-BASED          │       PROPERTY-BASED               │
│                            │                                    │
│  You think up the samples: │  You define the rules:             │
│  test([3,1,2]) → [1,2,3]   │  ∀ xs: len(sort(xs)) == len(xs)    │
│  test([])      → []        │  ∀ xs: sort(xs) is ascending       │
│  test([1])     → [1]       │  ∀ xs: sort(xs) permutes xs        │
│                            │                                    │
│  You forgot: [-1,0,1]?     │  Framework generates the inputs:   │
│  [10**100]? [NaN]?         │  [], [0], [-1,1], [NaN, inf]...    │
│                            │                                    │
│  Coverage: your imagination│  Coverage: the input space         │
│  Upkeep: N test cases      │  Upkeep: a few properties          │
│  Finds bugs: anticipated   │  Finds bugs: NOT anticipated       │
└────────────────────────────┴────────────────────────────────────┘

Think of it this way: example-based testing answers the question "does the code work for input X?", while property-based testing asks "does the code obey rule Y for EVERY input the framework can generate?". The second question is far stronger, and Hypothesis is the tool that lets you ask it.

Technical core

Hypothesis basics: `@given` and strategies

@given is the central decorator in Hypothesis. It turns an ordinary test function into a property test by generating inputs automatically from the strategy (the data-generation recipe) you specify:

python

from hypothesis import given, example, settings
from hypothesis import strategies as st

# Property: addition is commutative
@given(st.integers(), st.integers())
def test_addition_is_commutative(a, b):
    assert a + b == b + a

# Property: reversing a string twice returns the original string
@given(st.text())
def test_reverse_involution(s):
    assert s[::-1][::-1] == s

# Combine with @example so specific edge cases are always tested
@given(st.text())
@example("")              # Empty string: always checked
@example("\x00")          # Null character
@example("🎉" * 1000)    # Emoji repeated many times
def test_string_encode_decode(s):
    encoded = s.encode("utf-8")
    assert encoded.decode("utf-8") == s

On each run, Hypothesis generates 100 inputs per test by default. If it finds a failure, it automatically shrinks the input so that the reported failing example is as simple as it can make it.

Built-in strategies

Hypothesis ships with strategies for most Python data types:

python

from hypothesis import strategies as st

# --- Primitive types ---
st.integers()                                    # Any integer
st.integers(min_value=0, max_value=255)          # Bounded range
st.floats()                                      # Includes inf, -inf, NaN
st.floats(allow_nan=False, allow_infinity=False) # Finite numbers only
st.booleans()                                    # True or False
st.text()                                        # Any Unicode string
st.text(min_size=1, max_size=50)                 # Bounded length
st.text(alphabet="abcdef0123456789")             # Restricted alphabet
st.binary()                                      # bytes
st.none()                                        # Always None

# --- Collections ---
st.lists(st.integers())                          # List of integers
st.lists(st.integers(), min_size=1, max_size=20) # Bounded size
st.lists(st.integers(), unique=True)             # No duplicates
st.sets(st.text(min_size=1))                     # Set of strings
st.dictionaries(
    keys=st.text(min_size=1, max_size=10),
    values=st.integers()
)
st.tuples(st.integers(), st.text(), st.booleans())

# --- Combining ---
st.one_of(st.integers(), st.text(), st.none())   # One of several types
st.integers() | st.none()                         # Shorthand syntax
st.sampled_from(["pending", "active", "closed"])  # Pick from a list
st.just(42)                                       # Fixed value

# --- Transforming ---
positive = st.integers(min_value=1)
squares = positive.map(lambda x: x ** 2)          # Map values
even = st.integers().filter(lambda x: x % 2 == 0) # Filter values

Composite strategies: generating complex data

When you need complex domain objects whose internal data is consistent, use @st.composite:

python

from hypothesis import strategies as st
from hypothesis import given
from dataclasses import dataclass
from typing import List

@dataclass
class OrderItem:
    product_id: int
    price: float
    quantity: int

@dataclass
class Order:
    order_id: str
    items: List[OrderItem]
    total: float

@st.composite
def valid_orders(draw):
    """Generate an Order whose total matches the sum of its items."""
    order_id = draw(st.text(
        alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
        min_size=8, max_size=8
    ))

    num_items = draw(st.integers(min_value=1, max_value=5))
    items = []
    total = 0.0

    for _ in range(num_items):
        price = round(draw(st.floats(
            min_value=0.01, max_value=999.99,
            allow_nan=False, allow_infinity=False
        )), 2)
        quantity = draw(st.integers(min_value=1, max_value=20))
        product_id = draw(st.integers(min_value=1000, max_value=9999))

        items.append(OrderItem(
            product_id=product_id,
            price=price,
            quantity=quantity
        ))
        total += price * quantity

    return Order(order_id=order_id, items=items, total=round(total, 2))

@given(valid_orders())
def test_order_total_consistency(order):
    recalculated = sum(
        item.price * item.quantity for item in order.items
    )
    assert abs(order.total - recalculated) < 0.01

Key point: draw() lets later parts of a strategy depend on values already drawn (here the number of items is drawn first, and total is computed from the items that follow), producing internally consistent data instead of randomly combining independent fields.
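
When the fields are independent of each other, st.builds is usually enough and @st.composite is not needed. The sketch below rebuilds only OrderItem that way; the field ranges are copied from the composite example above, and the test name is made up for illustration.

```python
from dataclasses import dataclass

from hypothesis import given
from hypothesis import strategies as st


@dataclass
class OrderItem:
    product_id: int
    price: float
    quantity: int


# Each keyword argument is drawn independently and passed to OrderItem(...).
order_items = st.builds(
    OrderItem,
    product_id=st.integers(min_value=1000, max_value=9999),
    price=st.floats(min_value=0.01, max_value=999.99).map(lambda p: round(p, 2)),
    quantity=st.integers(min_value=1, max_value=20),
)


@given(order_items)
def test_order_item_fields_in_range(item):
    assert 1000 <= item.product_id <= 9999
    assert 0.01 <= item.price <= 999.99
    assert 1 <= item.quantity <= 20
```

The Order type still needs @st.composite, because its total depends on the items that were drawn.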

Stateful testing: RuleBasedStateMachine

An ordinary property test checks a single function call. Stateful testing checks sequences of operations, mimicking how callers actually interact with a system:

python

from hypothesis.stateful import (
    RuleBasedStateMachine, rule, invariant, precondition
)
from hypothesis import strategies as st

class QueueMachine(RuleBasedStateMachine):
    """Compare a custom queue against a list (the reference implementation)."""

    def __init__(self):
        super().__init__()
        self.model = []           # Reference: Python list
        self.queue = MyQueue()    # Implementation under test (MyQueue is assumed to be defined elsewhere)

    @rule(value=st.integers())
    def enqueue(self, value):
        self.model.append(value)
        self.queue.enqueue(value)

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def dequeue(self):
        expected = self.model.pop(0)
        actual = self.queue.dequeue()
        assert actual == expected

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def peek(self):
        assert self.queue.peek() == self.model[0]

    @invariant()
    def size_consistent(self):
        assert len(self.queue) == len(self.model)

    @invariant()
    def empty_consistent(self):
        assert self.queue.is_empty() == (len(self.model) == 0)

# Hypothesis generates random sequences of operations, for example:
# enqueue(5) → enqueue(-3) → peek() → dequeue() → enqueue(0) → ...
TestQueueMachine = QueueMachine.TestCase

Stateful testing is especially effective for data structures, state machines, database operations, and any API with internal state.
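
The state machine above needs a concrete queue to run against. The following is a minimal self-contained sketch: MyQueue here is a hypothetical deque-backed implementation written only for this illustration, and QueueModelMachine is a shortened version of the machine above. It uses run_state_machine_as_test from hypothesis.stateful so it can be called from a plain test function.

```python
from collections import deque

from hypothesis import settings
from hypothesis import strategies as st
from hypothesis.stateful import (
    RuleBasedStateMachine, invariant, precondition, rule,
    run_state_machine_as_test,
)


class MyQueue:
    """Hypothetical FIFO queue backed by collections.deque."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, value):
        self._items.append(value)

    def dequeue(self):
        return self._items.popleft()

    def peek(self):
        return self._items[0]

    def is_empty(self):
        return not self._items

    def __len__(self):
        return len(self._items)


class QueueModelMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.model = []          # Reference behaviour
        self.queue = MyQueue()   # Implementation under test

    @rule(value=st.integers())
    def enqueue(self, value):
        self.model.append(value)
        self.queue.enqueue(value)

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def dequeue(self):
        assert self.queue.dequeue() == self.model.pop(0)

    @invariant()
    def size_consistent(self):
        assert len(self.queue) == len(self.model)


def test_queue_machine():
    # Keep the run small; the numbers are arbitrary choices for this sketch.
    run_state_machine_as_test(
        QueueModelMachine,
        settings=settings(max_examples=50, stateful_step_count=20, deadline=None),
    )
```

Swapping popleft() for pop() in MyQueue.dequeue is a quick way to watch the machine report a short failing sequence of steps.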

Shrinking: reducing the failing example

When Hypothesis finds an input that fails, it does not stop there. It runs a shrinking algorithm to look for a simpler input that still triggers the same failure:

python

from hypothesis import given
from hypothesis import strategies as st

def process_list(xs):
    """Bug: crashes when the list has > 3 elements AND contains a negative number."""
    if len(xs) > 3 and any(x < 0 for x in xs):
        raise ValueError("unexpected negative in long list")
    return sum(xs)

@given(st.lists(st.integers()))
def test_process_list(xs):
    result = process_list(xs)
    assert isinstance(result, int)

# Hypothesis might first hit the bug with an input such as [847, -29103, 0, 7742]
# After shrinking it typically reports: [0, 0, 0, -1]
# → Minimal: 4 elements (the shortest length > 3) plus the simplest negative number

Shrinking makes debugging much faster: instead of reading a 50-element input full of random values, you get a simple 4-element input that reproduces the failure.
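
One way to watch shrinking on its own, outside a test, is hypothesis.find, which searches for a value satisfying a predicate and returns a shrunk one. The sketch below restates the bug condition of process_list as a predicate; triggers_bug is a name invented for this illustration.

```python
from hypothesis import find
from hypothesis import strategies as st


def triggers_bug(xs):
    """Same condition that makes process_list raise."""
    return len(xs) > 3 and any(x < 0 for x in xs)


# find() keeps shrinking while the predicate stays true.
minimal = find(st.lists(st.integers()), triggers_bug)
print(minimal)
```

The printed list is usually four elements long with a single small negative number, though the exact value is up to the shrinker.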

Settings and profiles

Tune Hypothesis behaviour for each context:

python

import os

from hypothesis import HealthCheck, Verbosity, given, settings
from hypothesis import strategies as st

# Configure a single test directly
@given(st.lists(st.integers()))
@settings(
    max_examples=500,             # Number of examples (default: 100)
    deadline=None,                # Disable the per-example time limit
    suppress_health_check=[       # Silence specific health checks
        HealthCheck.too_slow,
        HealthCheck.filter_too_much,
    ],
)
def test_with_custom_settings(xs):
    assert sorted(sorted(xs)) == sorted(xs)

# Profiles: configuration per environment
settings.register_profile("ci", max_examples=1000)
settings.register_profile("dev", max_examples=50)
settings.register_profile("debug", max_examples=10,
                          verbosity=Verbosity.verbose)

# Select the profile explicitly, usually in conftest.py, for example from an
# environment variable:
# HYPOTHESIS_PROFILE=ci pytest tests/
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "dev"))

A practical setup: use max_examples=50 during development (fast), max_examples=1000 on CI (thorough), and verbosity=Verbosity.verbose when debugging (prints every generated example).

In practice

Finding edge cases in a JSON parser/serializer

This is the classic property-based testing problem: check the roundtrip, meaning that encoding and then decoding must return the original value. It sounds simple, but in practice it is full of pitfalls.
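
One such pitfall, sketched with Python's standard json module only: NaN survives the encode/decode trip but never compares equal to itself, and strict JSON does not allow it at all. The helper names below are invented for this illustration.

```python
import json
import math


def naive_roundtrip_ok(value):
    """The roundtrip check one would write first."""
    return json.loads(json.dumps(value)) == value


def is_strict_json(value):
    """True if the value can be encoded without non-standard tokens."""
    try:
        json.dumps(value, allow_nan=False)
        return True
    except ValueError:
        return False


nan = float("nan")

# json.dumps emits the non-standard token NaN, and json.loads reads it back...
assert math.isnan(json.loads(json.dumps(nan)))
# ...but NaN != NaN, so the naive equality check fails.
assert not naive_roundtrip_ok(nan)
# With allow_nan=False the encoder refuses NaN and infinity outright.
assert not is_strict_json(nan)
assert not is_strict_json(float("inf"))
assert naive_roundtrip_ok(0.1) and is_strict_json(0.1)
```

This is why a strategy for JSON-compatible floats normally passes allow_nan=False and allow_infinity=False.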

Step 1: Define a strategy for JSON-compatible data

python

from hypothesis import strategies as st

# JSON only supports: string, number, boolean, null, array, object
json_primitives = st.one_of(
    st.none(),
    st.booleans(),
    st.integers(min_value=-(2**53), max_value=2**53),
    st.floats(allow_nan=False, allow_infinity=False),
    st.text(),
)

# Recursive: JSON can contain nested arrays/objects
json_values = st.recursive(
    json_primitives,
    lambda children: st.one_of(
        st.lists(children, max_size=5),
        st.dictionaries(
            keys=st.text(min_size=0, max_size=20),
            values=children,
            max_size=5
        ),
    ),
    max_leaves=20,
)

Step 2: Property test for the roundtrip

python

import json
from hypothesis import given, settings

@given(json_values)
@settings(max_examples=500)
def test_json_roundtrip(value):
    """encode then decode must return the original value."""
    encoded = json.dumps(value)
    decoded = json.loads(encoded)

    # The strategy excludes NaN and infinity, and Python's json module writes
    # finite floats using their repr, so exact equality is expected here,
    # including for floats nested inside lists and dicts.
    assert decoded == value

Step 3: Test a real custom serializer

Suppose your system has its own serializer for API responses:

python

from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Optional
import json

from hypothesis import given
from hypothesis import strategies as st

@dataclass
class ApiResponse:
    status: str
    data: dict
    timestamp: str
    error: Optional[str] = None

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "ApiResponse":
        parsed = json.loads(raw)
        return cls(**parsed)

# Strategy that generates valid ApiResponse objects
@st.composite
def api_responses(draw):
    status = draw(st.sampled_from(["ok", "error", "partial"]))
    data = draw(st.dictionaries(
        keys=st.text(
            alphabet="abcdefghijklmnopqrstuvwxyz_",
            min_size=1, max_size=15
        ),
        values=st.one_of(
            st.integers(), st.text(), st.booleans(), st.none()
        ),
        max_size=10
    ))
    timestamp = draw(st.datetimes(
        min_value=datetime(2020, 1, 1),
        max_value=datetime(2030, 12, 31),
    )).isoformat()
    error = draw(st.one_of(st.none(), st.text(max_size=200)))

    return ApiResponse(
        status=status, data=data,
        timestamp=timestamp, error=error
    )

@given(api_responses())
def test_api_response_roundtrip(response):
    """Serialize then deserialize must preserve the data."""
    json_str = response.to_json()

    # Valid JSON
    parsed = json.loads(json_str)
    assert isinstance(parsed, dict)

    # Roundtrip
    restored = ApiResponse.from_json(json_str)
    assert restored.status == response.status
    assert restored.data == response.data
    assert restored.timestamp == response.timestamp
    assert restored.error == response.error

Hypothesis will automatically try inputs such as an empty data dict, error strings containing JSON special characters (\", \\, \n), and timestamps at the edges of the allowed range. Cases like these almost never show up in hand-written tests.

Common mistakes

❌ Mistake 1: Testing the implementation instead of a property

python

from collections import Counter

# WRONG: checks that sort uses a particular algorithm, not a property
@given(st.lists(st.integers()))
def test_sort_uses_timsort(xs):
    import sys
    # Checks an implementation detail: meaningless and brittle
    assert sys.version_info >= (3, 0)  # "Python 3 uses Timsort"
    result = sorted(xs)
    assert result is not xs  # Implementation detail

# RIGHT: test properties that EVERY correct sort must satisfy
@given(st.lists(st.integers()))
def test_sort_properties(xs):
    result = sorted(xs)
    # Property 1: same elements, same multiplicities
    assert Counter(result) == Counter(xs)
    # Property 2: ascending order
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    # Property 3: same size
    assert len(result) == len(xs)

A good property does not depend on the implementation: it holds for quicksort, mergesort, or any other sorting algorithm.

❌ Mistake 2: A strategy that is too narrow and misses edge cases

python

# WRONG: only tests small positive numbers; misses negatives, zero, huge values
@given(st.integers(min_value=1, max_value=100))
def test_absolute_value_too_narrow(n):
    assert abs(n) >= 0

# RIGHT: let Hypothesis explore the whole input space
@given(st.integers())
def test_absolute_value_full_range(n):
    result = abs(n)
    assert result >= 0
    assert result == n or result == -n

❌ Mistake 3: Ignoring the shrunk result

python

# WRONG: the test fails, you look at the original input [8472, -291, 0, 77, 42, -1]
# and try to debug with that exact input

# RIGHT: read the shrunk example that Hypothesis reports, e.g.:
# Falsifying example: test_process_list(xs=[0, 0, 0, -1])
# → Immediately clear: the bug needs a list with > 3 elements AND a negative number
# → Fix the actual problem instead of guessing

The shrunk example is a gift: it shows the minimal conditions that trigger the failure. Always read it before you start debugging.

❌ Mistake 4: Unbounded strategies that make tests slow

python

# WRONG: unbounded nested lists can produce large inputs
@given(st.lists(st.lists(st.lists(st.integers()))))
def test_deeply_nested_slow(data):
    pass  # Can be slow and may trip health checks

# RIGHT: bound the size at every level
@given(st.lists(
    st.lists(
        st.integers(min_value=-1000, max_value=1000),
        max_size=10
    ),
    max_size=10
))
def test_nested_bounded(data):
    pass  # Stays fast

❌ Mistake 5: Not using profiles for CI vs local runs

python

# WRONG: hardcoded max_examples, so CI runs too few or local runs take too long
@given(st.integers())
@settings(max_examples=2000)  # Always 2000: slow locally, while CI may want more
def test_hardcoded_examples(n):
    assert n * 0 == 0

# RIGHT: use profiles (register and load them in conftest.py, before tests are defined)
import os
from hypothesis import settings

settings.register_profile("dev", max_examples=50)
settings.register_profile("ci", max_examples=2000)
settings.register_profile("nightly", max_examples=10000)
# Select one explicitly, e.g.: HYPOTHESIS_PROFILE=ci pytest
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "dev"))

@given(st.integers())
def test_with_profile(n):
    assert n * 0 == 0

Under the Hood

How Hypothesis generates and shrinks data

Hypothesis does not generate purely random data. It uses an engine called Conjecture, which was designed around a byte stream (newer releases describe the underlying record as a sequence of typed choices, but the idea is the same):

Generation: Each strategy consumes data from a random buffer and turns it into Python values. Replaying the same buffer reproduces exactly the same values.
Shrinking: When it finds a failing buffer, Hypothesis shrinks the buffer itself: replacing bytes with 0, deleting bytes, reordering blocks. Each smaller buffer is decoded and the test is rerun. This repeats until no further reduction is found.
Targeting: target(float_value) steers Hypothesis towards inputs that maximise the target value, which helps when edge cases live in an unusual region of the input space.
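
Targeting can be sketched as follows. The property and its bound are made up for illustration: the test asks Hypothesis to push the sum of the list upwards, which is the direction in which a violation of the bound would be found if one existed.

```python
from hypothesis import given, settings, target
from hypothesis import strategies as st


@given(st.lists(st.integers(min_value=0, max_value=100), max_size=20))
@settings(max_examples=200)
def test_total_stays_bounded(xs):
    total = sum(xs)
    # Ask Hypothesis to prefer inputs that make this observation larger.
    target(float(total), label="sum of xs")
    # At most 20 elements of at most 100 each, so this bound always holds.
    assert total <= 2000
```

target() only influences which inputs get generated; the assertion is still what decides whether the test passes.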

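The example database

Hypothesis saves failing examples to .hypothesis/examples/. On the next run it replays those saved examples before generating new random data:

.hypothesis/
├── examples/         # Failing examples, replayed on later runs
│   └── <test_hash>   # Entries are keyed per test
└── unicode_data/     # Unicode data cache

Important: CI only replays these examples if it can see the database. Committing .hypothesis/examples/ is one way to do that, but the Hypothesis documentation treats the database as a cache rather than a permanent record, so pin any failure you must never regress on with an explicit @example.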

Performance considerations

Factor	Impact	Recommendation
`max_examples`	Runtime grows roughly linearly	50 (dev), 500-1000 (CI), 10000 (nightly)
Deeply nested strategies	Large generated data, slower tests	Bound `max_size` at every level
`filter()` / `assume()`	Many rejections force regeneration, slower tests	Prefer a precise strategy over filtering
`@st.composite`	Small overhead compared with `st.builds`	Use `st.builds` where possible, `@st.composite` when you need more complex logic
Shrinking	May rerun the test thousands of times	Set `deadline=None` if the test function is slow
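
The filter()/assume() row can be made concrete with a small sketch. The strategy and test names are invented for this illustration; the point is the contrast between rejecting generated values and constructing only valid ones.

```python
from hypothesis import assume, given
from hypothesis import strategies as st

# Rejection: roughly half of the generated integers are thrown away.
evens_filtered = st.integers().filter(lambda x: x % 2 == 0)

# Construction: every generated value is valid by design.
evens_built = st.integers().map(lambda x: x * 2)

# A precise strategy for "any integer except zero".
nonzero = st.one_of(st.integers(max_value=-1), st.integers(min_value=1))


@given(evens_built)
def test_built_values_are_even(n):
    assert n % 2 == 0


@given(st.integers(), st.integers())
def test_divmod_with_assume(a, b):
    assume(b != 0)  # Fine here: zero is rare, so few examples are rejected
    q, r = divmod(a, b)
    assert q * b + r == a


@given(st.integers(), nonzero)
def test_divmod_with_precise_strategy(a, b):
    q, r = divmod(a, b)
    assert q * b + r == a
```

assume() is reasonable when the rejected case is rare; once a condition discards a large share of inputs, building valid data directly keeps the test fast and avoids the filter_too_much health check.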

When NOT to use property-based testing

Property-based testing is not a silver bullet. It is a poor fit when:

No property can be stated: If you cannot say "for every input, the result must satisfy X", there is nothing to test. Example: checking that a UI renders pixel-perfectly.
The test depends on external services: A property test issues hundreds of requests; if each one calls an external API, you will be rate-limited. Use mocks.
The function is too slow: If each call takes 1 second, 100 examples take 100 seconds. Lower max_examples or use example-based tests.
The business logic has no oracle: If a function computes tax from 50 rules and you have no reference implementation, the property test turns into rewriting the implementation.

Rule of thumb: Property-based testing works best for pure functions, serialization/deserialization, data structures, and mathematical operations, where properties are easy to state and there are no side effects.

Checklist to remember

✅ Implementation checklist

[ ] Property tests check rules (invariants), not specific values
[ ] @given(strategy) generates inputs automatically; @example(value) guarantees a given edge case is always tested
[ ] Use st.builds() for simple objects, @st.composite when fields depend on each other
[ ] Bound max_size at every level of nested collections to keep tests fast
[ ] Always read the shrunk example: it shows the minimal conditions that trigger the failure
[ ] Make failing examples available to CI (share .hypothesis/examples/ or pin them with @example)
[ ] Set up profiles: dev (fast), ci (thorough), nightly (exhaustive)
[ ] Roundtrip pattern: decode(encode(x)) == x applies to every serializer
[ ] Use RuleBasedStateMachine to test sequences of operations on stateful systems
[ ] Use assume() only when a more precise strategy is not possible; overusing it slows tests down

Practice exercises

🧠 Quiz

Exercise 1: Roundtrip for a string compression function

Write a property test for the compress / decompress functions below. Make sure that decompress(compress(s)) == s for every input string.

python

def compress(s: str) -> str:
    """Run-length encoding: 'aaabbc' → '3a2b1c'"""
    if not s:
        return ""
    result = []
    count = 1
    for i in range(1, len(s)):
        if s[i] == s[i - 1]:
            count += 1
        else:
            result.append(f"{count}{s[i-1]}")
            count = 1
    result.append(f"{count}{s[-1]}")
    return "".join(result)

def decompress(s: str) -> str:
    """Decompress: '3a2b1c' → 'aaabbc'"""
    import re
    return "".join(
        char * int(num)
        for num, char in re.findall(r"(\d+)(.)", s)
    )

Solution

python

from hypothesis import given, example
from hypothesis import strategies as st

# Strategy: letters only (digits would make the encoding ambiguous).
# Newer Hypothesis releases spell this argument categories=("L",).
safe_text = st.text(
    alphabet=st.characters(whitelist_categories=("L",)),
    max_size=100
)

@given(safe_text)
@example("")
@example("a")
@example("aaa")
def test_compress_decompress_roundtrip(s):
    compressed = compress(s)
    decompressed = decompress(compressed)
    assert decompressed == s

@given(safe_text)
def test_compress_preserves_length(s):
    """Decompressing always returns the original number of characters."""
    if s:
        assert len(decompress(compress(s))) == len(s)

Note: the compress/decompress pair breaks when the input contains digits (for example "a1b" → "1a111b", which decompresses incorrectly) or a newline, which the regex's . does not match. Hypothesis will find this if you use st.text() instead of safe_text. That is the strength of property-based testing: it surfaces failures you did not think of.

Exercise 2: Stateful testing for a Stack

Write a RuleBasedStateMachine that tests the Stack class below. Compare its behaviour against a Python list used as the reference.

python

class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0

Solution

python

from hypothesis.stateful import RuleBasedStateMachine, rule, invariant, precondition
from hypothesis import strategies as st

class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.model = []         # Reference: Python list
        self.stack = Stack()    # Implementation under test

    @rule(value=st.integers())
    def push(self, value):
        self.model.append(value)
        self.stack.push(value)

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def pop(self):
        expected = self.model.pop()
        actual = self.stack.pop()
        assert actual == expected

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def peek(self):
        assert self.stack.peek() == self.model[-1]

    @rule()
    def check_empty(self):
        assert self.stack.is_empty() == (len(self.model) == 0)

    @invariant()
    def size_matches(self):
        assert len(self.stack) == len(self.model)

TestStackMachine = StackMachine.TestCase

Hypothesis generates sequences of operations such as push(0) → push(-1) → pop() → peek() → push(5) → pop() → pop() and checks that every invariant holds after each step.

Exercise 3: Find the bug in a function that merges sorted lists

The function below merges two sorted lists into one sorted list. Write property tests to find the bug:

python

def merge_sorted(a: list[int], b: list[int]) -> list[int]:
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # Bug: only appends the remainder of a and forgets b
    result.extend(a[i:])
    return result

Solution

python

from hypothesis import given
from hypothesis import strategies as st

sorted_lists = st.lists(st.integers(), max_size=50).map(sorted)

@given(sorted_lists, sorted_lists)
def test_merge_sorted_length(a, b):
    """The result must contain exactly len(a) + len(b) elements."""
    result = merge_sorted(a, b)
    assert len(result) == len(a) + len(b)  # FAIL: elements of b are missing

@given(sorted_lists, sorted_lists)
def test_merge_sorted_is_sorted(a, b):
    """The result must be sorted (this one passes even with the bug)."""
    result = merge_sorted(a, b)
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))

@given(sorted_lists, sorted_lists)
def test_merge_sorted_contains_all(a, b):
    """The result must contain every element of both lists."""
    from collections import Counter
    result = merge_sorted(a, b)
    assert Counter(result) == Counter(a) + Counter(b)  # FAIL

Hypothesis typically shrinks to the simplest case: merge_sorted([], [0]) returns [] instead of [0]. Bug: result.extend(b[j:]) is missing.

Fix: Add result.extend(b[j:]) after result.extend(a[i:]).
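
With the fix applied, the function can also be checked against an oracle. The sketch below uses sorted(a + b) as the reference; merge_sorted_fixed is a name chosen here to keep it apart from the buggy version above.

```python
from hypothesis import given
from hypothesis import strategies as st


def merge_sorted_fixed(a: list[int], b: list[int]) -> list[int]:
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    result.extend(a[i:])  # Remainder of a
    result.extend(b[j:])  # Remainder of b (the line that was missing)
    return result


sorted_lists = st.lists(st.integers(), max_size=50).map(sorted)


@given(sorted_lists, sorted_lists)
def test_merge_matches_oracle(a, b):
    # Oracle: concatenating and sorting must give the same list.
    assert merge_sorted_fixed(a, b) == sorted(a + b)
```

A single oracle comparison covers the length, ordering, and element-preservation properties at once, which works here because a trivially correct reference exists.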
