Giao diện
Timsort - The Real-World Champion
"Timsort là lý do Python sort() và Java Arrays.sort() nhanh đến vậy trên dữ liệu thực tế." - HPN
What is Timsort?
Timsort là thuật toán hybrid kết hợp Merge Sort và Insertion Sort, được thiết kế bởi Tim Peters năm 2002 cho Python.
Nó được sử dụng làm thuật toán sort mặc định trong:
- 🐍 Python (
list.sort(),sorted()) - ☕ Java (
Arrays.sort()cho objects) - 🦀 Rust (
slice::sort) - 🍎 Swift (
Array.sort()) - 📱 Android (Java collections)
📘 Tại sao không dùng Quick Sort?
Quick Sort có worst case O(N²) và không stable.
Timsort: O(N log N) guaranteed + stable + tối ưu cho real-world data.
Why Timsort is Fast on Real Data
Key Insight: Real Data Has "Runs"
Dữ liệu thực tế thường có runs - các đoạn đã sorted sẵn.
Real-world data examples:
- Log files: Mostly sorted by timestamp
- Database exports: Often partially sorted
- User lists: Alphabetically grouped
- Stock prices: Runs of increasing/decreasing
Random data: [7, 2, 9, 1, 5, 3, 8] → No runs
Real data: [1, 2, 3, 9, 8, 7, 4, 5, 6] → Runs: [1,2,3], [9,8,7], [4,5,6]Timsort detects và exploits các runs này!
The Algorithm
Phase 1: Identify Runs
python
def find_run(arr, start):
"""
Find a run starting at 'start'.
A run is either:
- Ascending: a[i] <= a[i+1]
- Descending: a[i] > a[i+1] (will be reversed)
"""
if start >= len(arr) - 1:
return start + 1
pos = start + 1
if arr[pos] < arr[start]:
# Descending run
while pos < len(arr) and arr[pos] < arr[pos - 1]:
pos += 1
# Reverse to make ascending
reverse(arr, start, pos)
else:
# Ascending run
while pos < len(arr) and arr[pos] >= arr[pos - 1]:
pos += 1
return posPhase 2: Extend Short Runs with Insertion Sort
python
MIN_RUN = 32 # Typically 32-64
def extend_run(arr, start, run_end, min_run):
"""
If run is shorter than MIN_RUN, extend it using Insertion Sort.
Insertion Sort is fast for small arrays (cache-friendly).
"""
target_end = min(start + min_run, len(arr))
# Binary Insertion Sort for the extension
for i in range(run_end, target_end):
# Use binary search to find insertion point
# Then shift and insert
...Phase 3: Merge Runs with Merge Sort
Key Optimizations
1. Galloping Mode
Khi merge 2 runs mà các elements từ một run liên tục thắng, switch sang galloping mode (tìm kiếm nhị phân thay vì so sánh từng cái).
python
def gallop_right(arr, key, base, length, hint):
"""
Find position where key should be inserted using galloping.
Instead of: 1, 2, 3, 4, 5, 6, 7, 8... (linear)
Gallop: 1, 2, 4, 8, 16... then binary search
Very fast when runs have different "densities"!
"""
...2. Memory-Efficient Merge
Chỉ cần O(min(run1, run2)) extra memory thay vì O(N).
python
def merge_low(arr, run1_start, run1_len, run2_start, run2_len):
"""
Copy smaller run to temp, merge in-place.
"""
# If run1 is smaller, copy run1 to temp
temp = arr[run1_start:run1_start + run1_len]
# Merge temp and run2 back into arr
...3. Stack-Based Merge Order
Dùng stack để ensure merge order tối ưu:
- Merge runs có size tương tự trước
- Avoid expensive large-small merges
python
# Merge invariants (maintain on stack):
# 1. run[i-2] > run[i-1] + run[i]
# 2. run[i-1] > run[i]Implementation
python
# HPN Engineering Standard
# Implementation: Simplified Timsort
from typing import List, Callable, TypeVar
from functools import cmp_to_key
T = TypeVar('T')
MIN_RUN = 32
def calc_min_run(n: int) -> int:
"""
Calculate minimum run length.
Returns value in [32, 64] such that n/min_run is a power of 2 or close.
"""
r = 0
while n >= MIN_RUN:
r |= n & 1
n >>= 1
return n + r
def insertion_sort_range(arr: List[T], left: int, right: int):
"""Insertion sort for [left, right)."""
for i in range(left + 1, right):
key = arr[i]
j = i - 1
while j >= left and arr[j] > key:
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
def merge(arr: List[T], left: int, mid: int, right: int):
"""Merge two sorted runs [left, mid) and [mid, right)."""
left_part = arr[left:mid]
right_part = arr[mid:right]
i = j = 0
k = left
while i < len(left_part) and j < len(right_part):
if left_part[i] <= right_part[j]: # <= for stability
arr[k] = left_part[i]
i += 1
else:
arr[k] = right_part[j]
j += 1
k += 1
while i < len(left_part):
arr[k] = left_part[i]
i += 1
k += 1
while j < len(right_part):
arr[k] = right_part[j]
j += 1
k += 1
def timsort(arr: List[T]) -> None:
"""
Simplified Timsort implementation.
Time: O(N log N) worst case
Space: O(N) for merging
Stable: Yes
Note: Real Timsort has more optimizations (galloping, etc.)
"""
n = len(arr)
if n <= 1:
return
min_run = calc_min_run(n)
# Step 1: Create runs of min_run using insertion sort
for start in range(0, n, min_run):
end = min(start + min_run, n)
insertion_sort_range(arr, start, end)
# Step 2: Merge runs
size = min_run
while size < n:
for left in range(0, n, 2 * size):
mid = min(left + size, n)
right = min(left + 2 * size, n)
if mid < right:
merge(arr, left, mid, right)
size *= 2
# ============================================
# COMPARISON WITH BUILT-IN
# ============================================
def benchmark_sorts(arr_size: int = 10000, runs: int = 100):
"""Compare timsort with built-in sort."""
import time
import random
# Partially sorted data (realistic)
def generate_partial_sorted():
arr = list(range(arr_size))
# Shuffle 10% of elements
for _ in range(arr_size // 10):
i, j = random.randint(0, arr_size-1), random.randint(0, arr_size-1)
arr[i], arr[j] = arr[j], arr[i]
return arr
# Our timsort
total_ours = 0
for _ in range(runs):
arr = generate_partial_sorted()
start = time.perf_counter()
timsort(arr)
total_ours += time.perf_counter() - start
# Built-in sort
total_builtin = 0
for _ in range(runs):
arr = generate_partial_sorted()
start = time.perf_counter()
arr.sort()
total_builtin += time.perf_counter() - start
print(f"Our Timsort: {total_ours/runs*1000:.3f} ms/run")
print(f"Built-in: {total_builtin/runs*1000:.3f} ms/run")
# ============================================
# USAGE EXAMPLE
# ============================================
if __name__ == "__main__":
# Basic usage
arr = [5, 2, 4, 6, 1, 3, 2, 6]
print(f"Before: {arr}")
timsort(arr)
print(f"After: {arr}")
# Stability test
print("\n=== Stability Test ===")
data = [('apple', 3), ('banana', 1), ('apple', 1), ('cherry', 2)]
# Sort by name (stable should preserve order of same names)
data_sorted = sorted(data, key=lambda x: x[0])
print(f"Sorted by name: {data_sorted}")
# apple(3) should still come before apple(1) if stable
# Benchmark
print("\n=== Benchmark (partial sorted data) ===")
benchmark_sorts(10000, 50)Complexity Analysis
| Case | Time | Notes |
|---|---|---|
| Best | O(N) | Already sorted! |
| Average | O(N log N) | Typical case |
| Worst | O(N log N) | Guaranteed |
| Space | O(N) | For merge buffer |
Timsort vs QuickSort
| Aspect | Timsort | QuickSort |
|---|---|---|
| Worst case | O(N log N) ✅ | O(N²) ❌ |
| Stable | Yes ✅ | No ❌ |
| Adaptive | Yes (fast on partial sorted) | No |
| Cache-friendly | Yes | Yes |
| In-place | No (O(N) space) | Yes (O(log N)) |
| Best for | Real-world data | Random data, memory-constrained |
💡 HPN's Rule
"Dùng built-in sort() trong 99% cases. Nó đã là Timsort và được tối ưu hàng thập kỷ."