ML Code Revamp (Refactor)
1. Objective
The objective of this workflow is to bridge the "Research to Production" gap. Notebooks are great for exploration (EDA) but terrible for production engineering. They suffer from hidden state, lack of testing, and poor version control. This workflow transforms "Spaghetti Code" into a structured, typed, and tested Python library.
2. Context & Scope
In Scope
This workflow covers Modularization (Functions/Classes), Typing (Type Hints), Configuration separation (Hardcoded vars -> YAML/Env), Logging (Print -> Logger), and Unit Testing.
Assumption: The notebook code already "works" end-to-end (it produces a model or result).
Out of Scope
- Algorithm Change: We are not improving the model accuracy here, only the code quality.
- CI/CD Pipeline: Setting up the build server is "Implement CI/CD".
3. When to Use / When Not to Use
✅ Use This Workflow When
- A model has passed the "POC" stage and needs to be deployed.
- You need to run the code on a schedule (Airflow/Cron).
- Multiple team members need to collaborate on the codebase.
❌ Do NOT Use This Workflow When
- You are strictly doing throw-away EDA (Exploratory Data Analysis).
- The idea is not yet proven (Don't optimize code that might be deleted next week).
4. Inputs (Required/Optional)
Required Inputs
| Input | Description | Format | Example |
|---|---|---|---|
| NOTEBOOK | The source file. | .ipynb | experiment_v1.ipynb |
| REPO_NAME | Target project. | String | fraud-detection-lib |
Optional Inputs
| Input | Description | Default | Condition |
|---|---|---|---|
| STYLE_GUIDE | Formatting rules. | Black/PEP8 | If corp standard exists. |
5. Outputs (Artifacts)
| Artifact | Format | Destination | Quality Criteria |
|---|---|---|---|
| Source Code | .py files | src/ | Modular, Typed, Docstrings. |
| Tests | .py files | tests/ | Coverage > 80%. |
6. Operating Modes
⚡ Fast Mode
Timebox: 1 hour. Scope: Script conversion. Details: Export the notebook to a single main.py, wrap top-level code in if __name__ == "__main__":, and run the Black formatter.
🎯 Standard Mode (Default)
Timebox: 4 hours. Scope: Library structure. Details: Create src/data, src/features, and src/models modules, extract functions, add type hints (def train(df: pd.DataFrame) -> Model), and replace print() with logging.
🔬 Deep Mode
Timebox: 2 days. Scope: OOP & design patterns. Details: Define abstract base classes (e.g., BasePreprocessor), implement a Pipeline pattern, and add Pydantic-based config validation.
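Deep Mode's config validation can be sketched as below. This uses stdlib dataclasses to stay self-contained (Pydantic performs equivalent type and value checks automatically); TrainConfig and its fields are illustrative names, not part of the workflow:

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Validated training configuration (illustrative fields)."""
    learning_rate: float
    n_estimators: int
    data_path: str

    def __post_init__(self) -> None:
        # Fail fast on nonsense values instead of deep inside training.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.n_estimators < 1:
            raise ValueError("n_estimators must be >= 1")


# Typically populated from config.yaml, e.g. TrainConfig(**yaml.safe_load(f))
cfg = TrainConfig(learning_rate=0.1, n_estimators=100, data_path="data/train.csv")
```

The point is that a bad config raises immediately at startup rather than hours into a training run.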
7. Constraints & Guardrails
Technical Constraints
- Global State: Notebooks rely on global variables defined 10 cells up. Functions MUST NOT rely on globals; pass all variables as arguments.
- Reproducibility: Fix random seeds.
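A minimal seed-fixing helper for the reproducibility constraint (the numpy/torch lines are shown as comments, since which RNGs you seed depends on your stack):

```python
import random


def set_seed(seed: int) -> None:
    """Fix random seeds so repeated runs are deterministic."""
    random.seed(seed)
    # In a real project, also seed every other RNG you use, e.g.:
    # np.random.seed(seed)
    # torch.manual_seed(seed)


set_seed(42)
first = random.random()
set_seed(42)
assert random.random() == first  # same seed -> same sequence
```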
Security & Privacy
CAUTION
Secrets in Code: Notebooks often have AWS_KEY = 'AKIA...' hardcoded. When moving to .py, remove ALL credentials; use os.getenv or .env files.
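A sketch of the os.getenv pattern (AWS_KEY is just an example variable name; the fail-loudly helper is our convention, not a standard API):

```python
import os


def get_required_env(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is missing."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Instead of AWS_KEY = 'AKIA...' hardcoded in the notebook:
# aws_key = get_required_env("AWS_KEY")
```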
Compliance
- Licensing: If copying snippets from StackOverflow/Open Source, ensure license compatibility.
8. Procedure
Phase 1: Modularization (The Split)
Objective: Break it down.
Analyze the notebook and identify logical blocks:
- Data loading (src/data/loader.py)
- Preprocessing (src/features/processing.py)
- Training (src/models/train.py)
- Evaluation (src/evaluation/metrics.py)
Create the functions: copy code from cells into functions. Critical: remove any dependency on global-scope variables. If a function needs df, pass df as an argument.
Verify: the src/ folder exists with modules, and no code runs at import time.
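The extraction step above might look like this, sketched for a hypothetical src/data/loader.py (stdlib csv instead of pandas, to keep the example self-contained):

```python
import csv
from pathlib import Path


def load_rows(path: Path) -> list[dict]:
    """Load a CSV into a list of dicts. No globals: the path is an argument."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    # Nothing runs at import time; execution happens only behind this guard.
    print(load_rows(Path("data/train.csv"))[:3])
```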
Phase 2: Professionalization (The Polish)
Objective: Clean it up.
- Typing: Add Python type hints, e.g. def process(x: int) -> float:.
- Docstrings: Add standard docstrings (Args, Returns, Raises).
- Logging: Replace print("Starting...") with logger.info("Starting...").
- Config: Move hardcoded paths/params to config.yaml or params.py.
- Formatting: Run black and ruff (or flake8) to fix style.
Verify: Linter passes with 0 errors.
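Putting the Phase 2 items together on one toy function (normalize is an illustrative example, not taken from any particular notebook):

```python
import logging

logger = logging.getLogger(__name__)


def normalize(values: list[float]) -> list[float]:
    """Scale values to the [0, 1] range.

    Args:
        values: Raw numeric values.

    Returns:
        Values rescaled to [0, 1]; an empty list is returned unchanged.

    Raises:
        ValueError: If all values are identical (zero range).
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("Cannot normalize a constant sequence")
    logger.info("Normalizing %d values", len(values))  # logger, not print()
    return [(v - lo) / (hi - lo) for v in values]
```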
Phase 3: Testing & Entrypoint
Objective: Make it runnable.
Create a main.py (or a CLI via typer/argparse) that orchestrates the pipeline. Write unit tests (pytest).
- Test pure functions (e.g., "clean_text handles empty string").
- Smoke test the full pipeline on a small dummy dataset.
Verify: pytest passes and python main.py runs successfully.
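A sketch of the pure-function test style, assuming a hypothetical clean_text helper; the tests are written pytest-style (plain functions with bare assert statements):

```python
# Function under test (hypothetical; would live in src/features/cleaning.py)
def clean_text(text: str) -> str:
    """Lowercase, strip, and collapse internal whitespace."""
    return " ".join(text.lower().split())


# tests/test_cleaning.py
def test_clean_text_handles_empty_string():
    assert clean_text("") == ""


def test_clean_text_collapses_whitespace():
    assert clean_text("  Hello   WORLD ") == "hello world"
```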
9. Technical Considerations
Import Hell: Relative imports (from .. import config) can be painful. Use absolute imports (from myproject import config) and install the package in "Editable Mode" (pip install -e .).
Data Versioning: Don't hardcode "v1.csv". Use DVC or pass the path as an argument.
Pickle Compatibility: If classes change, old pickles might break. Version your classes or use onnx/standard formats for models.
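One way to guard against stale pickles is to stamp artifacts with a schema version and check it on load. This is a sketch; the _VERSION convention is ours, not a pickle feature:

```python
import pickle

_VERSION = 2  # bump whenever the model class or feature schema changes


def save_model(model: object, path: str) -> None:
    """Persist a model together with its schema version."""
    with open(path, "wb") as f:
        pickle.dump({"version": _VERSION, "model": model}, f)


def load_model(path: str) -> object:
    """Load a model, refusing artifacts written under an older schema."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["version"] != _VERSION:
        raise RuntimeError(
            f"Model version {payload['version']} != code version {_VERSION}; retrain."
        )
    return payload["model"]
```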
10. Quality Gates (Definition of Done)
Checklist
- [ ] No global variables in functions.
- [ ] Secrets removed.
- [ ] Logging implemented.
- [ ] Tests passing.
Validation
| Criterion | Method | Threshold |
|---|---|---|
| Maintainability | Cyclomatic Complexity | < 10 per function |
| Coverage | Pytest-cov | > 80% |
11. Failure Modes & Recovery
| Failure Mode | Symptoms | Recovery Action |
|---|---|---|
| Hidden State Bug | Code fails because df was mutated in cell 5. | Trace variable lifecycle. Use immutable transformations where possible. |
| Import Error | ModuleNotFound. | Fix PYTHONPATH or install local package with -e. |
| Slow Tests | Test suite takes 10 mins. | Mock database/network calls. Don't train a full random forest in unit tests. |
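The "Slow Tests" recovery above can be sketched with unittest.mock from the stdlib; fetch_training_data and run_pipeline are hypothetical names standing in for your real slow calls:

```python
from unittest.mock import patch


def fetch_training_data() -> list[int]:
    """Imagine this hits a database and takes minutes."""
    raise RuntimeError("Should never run in unit tests")


def run_pipeline() -> int:
    """Toy pipeline that depends on the slow fetch."""
    return sum(fetch_training_data())


def test_pipeline_with_mocked_db():
    # Patch the slow call with a tiny in-memory stub.
    with patch(f"{__name__}.fetch_training_data", return_value=[1, 2, 3]):
        assert run_pipeline() == 6
```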
12. Copy-Paste Prompt
TIP
One-Click Agent Invocation Copy the prompt below, replace placeholders, and paste into your agent.
text
Role: Act as a Senior ML Engineer.
Task: Execute the ML Code Revamp (Refactor) workflow.
## Objective & Scope
- **Goal**: Refactor experimental Notebooks into production-quality Python packages.
- **Scope**: Modularization, Type Hinting, Configuration extraction, Logging, and Testing.
## Inputs
- [ ] NOTEBOOK: Source Jupyter Notebook (.ipynb).
- [ ] REPO_NAME: Target Package Name.
- [ ] STYLE_GUIDE: PEP8/Black (default).
## Output Artifacts
- [ ] Source Code (src/ module)
- [ ] Unit Tests (tests/)
- [ ] Main Entrypoint (main.py)
## Execution Steps
1. **Structure**
   - Create package layout (`src/`, `tests/`).
   - Extract logic: Data -> `loader.py`, Features -> `features.py`, Training -> `model.py`.
2. **Refactor**
   - Convert globals to function args.
   - Add Type Hints (`def foo(x: int) -> float`).
   - Replace prints with Logging.
   - Extract secrets to Env/Config.
3. **Test**
   - Write Unit Tests (pytest) for pure functions.
   - Create CLI entrypoint.
   - Verify Code Coverage.
## Quality Gates
- [ ] No global state in functions.
- [ ] Zero Linter errors (Black/Ruff).
- [ ] Type checks pass (Mypy).
- [ ] Tests pass > 80% coverage.
## Failure Handling
- If blocked, output a "Clarification Brief" detailing missing info or blockers.
## Constraints
- **Security**: Remove hardcoded secrets.
- **Reproducibility**: Fix random seeds.
## Command
Now execute this workflow step-by-step.
Inputs
- NOTEBOOK: [Provide: Valid path]
- REPO_NAME: [Provide: Target Name]
Constraints
- Enforce Type Hints.
- Segregate Config/Secrets.
- Create Unit Tests.
Instructions
Execute the following procedure:
Phase 1: Structure. Create the standard directory tree (src/, tests/, config/). Extract notebook cells into specific modules (data, features, model).
Phase 2: Refactor. Rewrite functions to be pure (no globals). Add docstrings and types. Format with Black.
Phase 3: Test. Write pytest cases for core logic. Create a main.py entrypoint using ArgumentParser.
Quality Gates
- [ ] Linter passed (Black/Ruff).
- [ ] Tests passed.
- [ ] CLI runs.
Output Format
- Refactored Python Files.
- Test Report.
---
## Appendix: Change Log
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2026-01-14 | AI Engineering Team | Initial release |