Practice: Evaluation Metrics

🎯 Objectives

🎯 After this exercise, you will be able to:

Analyze a confusion matrix and draw insights from it
Choose the right metric for a specific problem
Understand the tradeoff between precision and recall

Exercise description

You are working with a model that detects fraudulent transactions (fraud detection). The dataset is highly imbalanced (1% fraud, 99% legitimate), so you need to choose and interpret metrics carefully.
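
To make the imbalance concrete, here is a minimal sketch of a naive baseline that never flags fraud. The numbers are a hypothetical illustration (10,000 transactions with exactly 100 fraud cases, matching the 1% rate above), and the variable names are made up for this example.

```python
# Hypothetical illustration: 10,000 transactions with a 1% fraud rate.
n_total = 10_000
n_fraud = 100

y_true = [1] * n_fraud + [0] * (n_total - n_fraud)  # 1 = fraud, 0 = legitimate
y_pred = [0] * n_total                              # naive baseline: never flag fraud

correct = sum(t == p for t, p in zip(y_true, y_pred))
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

baseline_accuracy = correct / n_total
baseline_recall = true_positives / n_fraud

print(f"accuracy = {baseline_accuracy:.2%}, recall = {baseline_recall:.2%}")
# prints: accuracy = 99.00%, recall = 0.00%
```

A model that catches no fraud at all still scores 99% accuracy, which is why accuracy alone cannot be trusted on this dataset.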

Requirements

Exercise 1: Confusion Matrix Analysis

Analyze the confusion matrix and compute the related metrics.

python

import numpy as np

# Confusion matrix for fraud detection
# Layout (rows = actual class, columns = predicted class):
# [TN, FP]
# [FN, TP]
cm = np.array([
    [9850, 50],    # Actual legitimate: 9850 correct (TN), 50 wrongly flagged (FP)
    [30, 70],      # Actual fraud: 30 missed (FN), 70 detected (TP)
])

def analyze_confusion_matrix(cm):
    """Compute accuracy, precision, recall, F1, and specificity.
    Return a dict with all metrics and a short comment."""
    # TODO: Implement
    pass
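
If the `[TN, FP] / [FN, TP]` layout is unfamiliar, the following sketch shows how such a matrix can be assembled from raw labels. The helper `build_confusion_matrix` and the tiny label lists are hypothetical and are not part of the exercise data. The sketch assumes labels are encoded as 0 (legitimate) and 1 (fraud).

```python
def build_confusion_matrix(y_true, y_pred):
    """2x2 matrix with rows = actual class and columns = predicted class."""
    cm = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        cm[actual][predicted] += 1
    return cm

# Made-up labels: 4 legitimate and 3 fraud transactions.
y_true = [0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1]

cm_small = build_confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = cm_small  # unpack in the same order as the layout comment

print(cm_small)  # [[3, 1], [1, 2]]
```

Reading along a row answers "what happened to the transactions of this actual class", which is the view recall and specificity are built on.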

Exercise 2: Choosing the Right Metric

For each scenario, identify the most important metric.

python

scenarios = [
    {
        'name': 'Fraud Detection',
        'description': 'Detect fraudulent transactions; missed fraud causes large losses',
        'best_metric': None,  # TODO: Fill in
        'reason': None,       # TODO: Fill in
    },
    {
        'name': 'Email Spam Filter',
        'description': 'Filter spam email; sending an important email to spam loses information',
        'best_metric': None,
        'reason': None,
    },
    {
        'name': 'Medical Screening',
        'description': 'Early-stage disease screening; missing a real patient is more dangerous than a false alarm',
        'best_metric': None,
        'reason': None,
    },
]

def choose_metrics(scenarios):
    """Fill in best_metric and reason for each scenario."""
    # TODO: Implement
    pass

Exercise 3: Threshold Optimization

Find the optimal threshold based on business requirements.

python

from sklearn.metrics import precision_recall_curve

def find_optimal_threshold(y_true, y_scores, min_recall=0.9):
    """Find the threshold with the highest precision subject to recall >= min_recall.
    Return: optimal_threshold, precision_at_threshold, recall_at_threshold."""
    # TODO: Implement
    pass

Hints

💡 Show hints

Exercise 1: Accuracy = (TP + TN) / total. On imbalanced data, accuracy can be high while recall is low, so focus on precision and recall.
Exercise 2: Fraud detection and medical screening need high recall (a miss is dangerous). A spam filter needs high precision (a false positive loses information).
Exercise 3: Use precision_recall_curve, keep only the thresholds where recall >= min_recall, and pick the one with the highest precision.
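
The tradeoff behind these hints can be seen by sweeping a decision threshold over a handful of scores. This is a minimal sketch on made-up toy data. The helper `precision_recall_at` is a hypothetical name, and it assumes a transaction is flagged when its score is greater than or equal to the threshold.

```python
def precision_recall_at(y_true, y_scores, threshold):
    """Precision and recall when every score >= threshold is flagged as positive."""
    y_pred = [1 if s >= threshold else 0 for s in y_scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Made-up toy data: 5 positives and 5 negatives, sorted by score for readability.
y_true = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
y_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]

for threshold in (0.25, 0.50, 0.75):
    p, r = precision_recall_at(y_true, y_scores, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.3f}  recall={r:.3f}")
# threshold=0.25  precision=0.714  recall=1.000
# threshold=0.50  precision=0.800  recall=0.800
# threshold=0.75  precision=1.000  recall=0.400
```

Raising the threshold removes false positives (precision goes up) but also drops true positives (recall goes down), which is why the business requirement has to decide where to stop.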

Solution

✅ Show solution

python

import numpy as np
from sklearn.metrics import precision_recall_curve

# Exercise 1
def analyze_confusion_matrix(cm):
    # Unpack in the [TN, FP] / [FN, TP] layout used by the exercise
    tn, fp, fn, tp = cm[0, 0], cm[0, 1], cm[1, 0], cm[1, 1]
    total = cm.sum()
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0
    specificity = tn / (tn + fp) if (tn + fp) > 0 else 0
    return {
        'accuracy': round(accuracy, 4),        # 0.992, misleading!
        'precision': round(precision, 4),      # 0.5833
        'recall': round(recall, 4),            # 0.7
        'f1': round(f1, 4),                    # 0.6364
        'specificity': round(specificity, 4),  # 0.9949
        'comment': 'Accuracy of 99.2% looks good, but recall is only 70%: 30% of fraud is missed!'
    }

# Exercise 2
def choose_metrics(scenarios):
    # Answers are listed in the same order as the scenarios
    answers = [
        {'best_metric': 'Recall', 'reason': 'Missed fraud causes large financial losses, so maximize recall'},
        {'best_metric': 'Precision', 'reason': 'Sending an important email to spam loses information, so minimize FP'},
        {'best_metric': 'Recall', 'reason': 'Missing a real patient is more dangerous than a false alarm, so maximize recall'},
    ]
    for s, a in zip(scenarios, answers):
        s.update(a)
    return scenarios

# Exercise 3
def find_optimal_threshold(y_true, y_scores, min_recall=0.9):
    precisions, recalls, thresholds = precision_recall_curve(y_true, y_scores)
    # precisions and recalls have one more entry than thresholds;
    # the final (precision=1, recall=0) point has no threshold, so drop it.
    valid_mask = recalls[:-1] >= min_recall
    if not valid_mask.any():
        return None, None, None
    valid_idx = np.where(valid_mask)[0]
    # Among thresholds that satisfy the recall constraint, take the highest precision
    best_idx = valid_idx[np.argmax(precisions[:-1][valid_mask])]
    return (
        round(float(thresholds[best_idx]), 4),
        round(float(precisions[best_idx]), 4),
        round(float(recalls[best_idx]), 4),
    )
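
As a sanity check on the Exercise 3 logic, the same selection rule can be written without any dependencies. This is only a sketch under stated assumptions, not the scikit-learn implementation. The function `best_threshold_with_min_recall` and the toy data are hypothetical, candidate thresholds are taken from the observed scores, and a score greater than or equal to the threshold counts as a positive prediction.

```python
def best_threshold_with_min_recall(y_true, y_scores, min_recall=0.9):
    """Return (threshold, precision, recall) with the highest precision among
    thresholds whose recall is at least min_recall, or None if none qualifies."""
    total_pos = sum(y_true)
    best = None
    for threshold in sorted(set(y_scores)):
        flagged = [s >= threshold for s in y_scores]
        tp = sum(1 for t, f in zip(y_true, flagged) if t == 1 and f)
        fp = sum(1 for t, f in zip(y_true, flagged) if t == 0 and f)
        # tp + fp >= 1 here because the threshold is itself an observed score
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        # Strict ">" keeps the lowest threshold when precisions tie
        if recall >= min_recall and (best is None or precision > best[1]):
            best = (threshold, precision, recall)
    return best

# Made-up toy data: 5 positives and 5 negatives.
y_true = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
y_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]

result = best_threshold_with_min_recall(y_true, y_scores, min_recall=0.8)
print(result)  # (0.55, 0.8, 0.8)
```

Tightening the constraint to `min_recall=0.9` on the same data forces the threshold down to 0.30, where recall is 1.0 but precision falls to 5/7.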
