
ML Code Revamp (Refactor)


1. Objective

The objective of this workflow is to bridge the "Research to Production" gap. Notebooks are great for exploration (EDA) but terrible for production engineering. They suffer from hidden state, lack of testing, and poor version control. This workflow transforms "Spaghetti Code" into a structured, typed, and tested Python library.


2. Context & Scope

In Scope

This workflow covers Modularization (Functions/Classes), Typing (Type Hints), Configuration separation (Hardcoded vars -> YAML/Env), Logging (Print -> Logger), and Unit Testing.

Assumption: The notebook code already "works" end-to-end (it produces a model/result).

Out of Scope

  • Algorithm Change: We are not improving the model accuracy here, only the code quality.
  • CI/CD Pipeline: Setting up the build server is "Implement CI/CD".

3. When to Use / When Not to Use

Use This Workflow When

  • A model has passed the "POC" stage and needs to be deployed.
  • You need to run the code on a schedule (Airflow/Cron).
  • Multiple team members need to collaborate on the codebase.

Do NOT Use This Workflow When

  • You are strictly doing throw-away EDA (Exploratory Data Analysis).
  • The idea is not yet proven (Don't optimize code that might be deleted next week).

4. Inputs (Required/Optional)

Required Inputs

| Input | Description | Format | Example |
|-------|-------------|--------|---------|
| NOTEBOOK | The source file. | .ipynb | experiment_v1.ipynb |
| REPO_NAME | Target project. | String | fraud-detection-lib |

Optional Inputs

| Input | Description | Default | Condition |
|-------|-------------|---------|-----------|
| STYLE_GUIDE | Formatting rules. | Black/PEP8 | If a corporate standard exists. |

5. Outputs (Artifacts)

| Artifact | Format | Destination | Quality Criteria |
|----------|--------|-------------|------------------|
| Source Code | .py files | src/ | Modular, typed, with docstrings. |
| Tests | .py files | tests/ | Coverage > 80%. |

6. Operating Modes

⚡ Fast Mode

Timebox: 1 hour. Scope: Script conversion. Details: Export the notebook to a single main.py, wrap top-level code in if __name__ == "__main__":, and run the Black formatter.
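A minimal sketch of the Fast Mode result (the function name and message are illustrative): after exporting with nbconvert, the notebook's top-level code moves inside a function, so importing the module executes nothing.

```python
# main.py -- sketch of the Fast Mode conversion. The exported notebook
# logic lives inside run(), so `import main` triggers no side effects.
def run() -> str:
    # ... notebook logic moved here ...
    return "pipeline finished"

# The guard ensures the pipeline only executes when run as a script.
if __name__ == "__main__":
    print(run())
```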

🎯 Standard Mode (Default)

Timebox: 4 hours. Scope: Library structure. Details: Create src/data, src/features, and src/models modules; extract functions; add type hints (def train(df: pd.DataFrame) -> Model); replace print() with logging.

🔬 Deep Mode

Timebox: 2 days. Scope: OOP & design patterns. Details: Define abstract base classes (e.g., BasePreProcessor), implement a Pipeline pattern, and add Pydantic for config validation.
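A minimal sketch of the Deep Mode shape, using only the standard library (`DropEmpty` is a hypothetical preprocessor; the Pydantic config validation mentioned above is omitted here):

```python
from abc import ABC, abstractmethod


class BasePreProcessor(ABC):
    """Abstract interface every preprocessor must implement."""

    @abstractmethod
    def transform(self, rows: list) -> list: ...


class DropEmpty(BasePreProcessor):
    """Hypothetical concrete step: drops falsy rows (empty dicts, None)."""

    def transform(self, rows: list) -> list:
        return [r for r in rows if r]


class Pipeline:
    """Chains preprocessors in order -- a minimal Pipeline pattern."""

    def __init__(self, steps: list):
        self.steps = steps

    def run(self, rows: list) -> list:
        for step in self.steps:
            rows = step.transform(rows)
        return rows
```

New preprocessing logic then means adding a new `BasePreProcessor` subclass, not editing the pipeline itself.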


7. Constraints & Guardrails

Technical Constraints

  • Global State: Notebook code often relies on global variables defined 10 cells earlier. Functions MUST NOT rely on globals; pass all variables as arguments.
  • Reproducibility: Fix random seeds.
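Both constraints can be illustrated in one sketch (the function and argument names are illustrative): everything the function needs arrives as an argument, and the fixed seed makes the result reproducible.

```python
import random


# Before (notebook style): the cell would read `df` and `SEED` from global
# scope. After: inputs are explicit arguments, so the function is testable.
def train_test_split(rows: list, test_fraction: float, seed: int = 42) -> tuple:
    rng = random.Random(seed)   # local RNG: no global state touched
    shuffled = list(rows)       # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]
```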

Security & Privacy

CAUTION

Secrets in Code: Notebooks often have AWS_KEY = 'AKIA...' hardcoded. When moving to .py files, remove ALL credentials; use os.getenv or .env files instead.
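A minimal sketch of the environment-based pattern (`get_secret` is a hypothetical helper; the variable name stands for whatever credential the notebook hardcoded):

```python
import os


def get_secret(name: str) -> str:
    """Read a credential from the environment, never from source code."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"{name} is not set; export it or use a .env file")
    return value
```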

Compliance

  • Licensing: If copying snippets from StackOverflow/Open Source, ensure license compatibility.

8. Procedure

Phase 1: Modularization (The Split)

Objective: Break it down.

Analyze the notebook. Identify logical blocks:

  1. Data Loading (src/data/loader.py)
  2. Preprocessing (src/features/processing.py)
  3. Training (src/models/train.py)
  4. Evaluation (src/evaluation/metrics.py)

Create the functions: move code from cells into functions. Critical: remove all dependence on global-scope variables. If a function needs df, pass df as an argument.

Verify: src/ folder exists with modules. No code runs at import time.
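A minimal sketch of one extracted module (the file path matches the list above; the CSV columns are illustrative). Only definitions live at module level, so importing it runs no pipeline code.

```python
# src/data/loader.py -- extracted from the data-loading notebook cells.
import csv
from pathlib import Path


def load_rows(path: Path) -> list:
    """Load a CSV file into a list of row dictionaries."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))
```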

Phase 2: Professionalization (The Polish)

Objective: Clean it up.

  • Typing: Add Python type hints (def process(x: int) -> float:).
  • Docstrings: Add standard docstrings (Args, Returns, Raises).
  • Logging: Replace print("Starting...") with logger.info("Starting...").
  • Config: Move hardcoded paths/params to config.yaml or params.py.
  • Formatting: Run black and ruff (or flake8) to fix style.

Verify: Linter passes with 0 errors.
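The typing, docstring, and logging steps combined might look like this (the function itself is illustrative):

```python
import logging

logger = logging.getLogger(__name__)


def normalize(values: list) -> list:
    """Scale values to the [0, 1] range.

    Args:
        values: Raw numeric values.
    Returns:
        Values rescaled to [0, 1]; an empty list passes through unchanged.
    Raises:
        ValueError: If all values are identical (zero range).
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot normalize: all values are identical")
    logger.info("normalizing %d values", len(values))  # logger, not print()
    return [(v - lo) / (hi - lo) for v in values]
```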

Phase 3: Testing & Entrypoint

Objective: Make it runnable.

Create a main.py (or CLI via typer/argparse) that orchestrates the pipeline. Write Unit Tests (pytest).

  • Test pure functions (e.g., "clean_text handles empty string").
  • Smoke test the full pipeline on a small dummy dataset.

Verify: pytest passes. python main.py runs successfully.
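A sketch of what the pure-function tests look like (the toy `clean_text` implementation is included here only so the example is self-contained; in the real repo the test would import it from src/):

```python
# tests/test_processing.py -- pytest discovers test_* functions automatically.
def clean_text(text: str) -> str:
    """Toy stand-in for the real function under test."""
    return " ".join(text.split()).lower()


def test_clean_text_handles_empty_string():
    assert clean_text("") == ""


def test_clean_text_collapses_whitespace_and_lowercases():
    assert clean_text("  Hello   World ") == "hello world"
```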


9. Technical Considerations

Import Hell: Relative imports (from .. import config) can be painful. Use absolute imports (from myproject import config) and install the package in "Editable Mode" (pip install -e .).

Data Versioning: Don't hardcode "v1.csv". Use DVC or pass the path as an argument.
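Passing the path as an argument can be sketched with argparse (the flag name is illustrative):

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """The data path becomes a CLI argument, not a hardcoded 'v1.csv'."""
    parser = argparse.ArgumentParser(description="Train the model")
    parser.add_argument("--data-path", type=Path, required=True,
                        help="Path to the training data file")
    return parser.parse_args(argv)
```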

Pickle Compatibility: If classes change, old pickles might break. Version your classes or use onnx/standard formats for models.
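One lightweight mitigation is to tag every saved artifact with a schema version and refuse to load mismatches (a sketch; the helper names and version constant are illustrative):

```python
import pickle

SCHEMA_VERSION = 2  # bump whenever the saved structure changes


def save_model(model, path: str) -> None:
    """Persist the model alongside a schema version tag."""
    with open(path, "wb") as f:
        pickle.dump({"schema_version": SCHEMA_VERSION, "model": model}, f)


def load_model(path: str):
    """Load a model, failing loudly on a version mismatch."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["schema_version"] != SCHEMA_VERSION:
        raise ValueError(f"incompatible model file (v{payload['schema_version']})")
    return payload["model"]
```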


10. Quality Gates (Definition of Done)

Checklist

  • [ ] No global variables in functions.
  • [ ] Secrets removed.
  • [ ] Logging implemented.
  • [ ] Tests passing.

Validation

| Criterion | Method | Threshold |
|-----------|--------|-----------|
| Maintainability | Cyclomatic complexity | < 10 per function |
| Coverage | pytest-cov | > 80% |

11. Failure Modes & Recovery

| Failure Mode | Symptoms | Recovery Action |
|--------------|----------|-----------------|
| Hidden State Bug | Code fails because df was mutated in cell 5. | Trace the variable lifecycle; use immutable transformations where possible. |
| Import Error | ModuleNotFoundError. | Fix PYTHONPATH or install the local package with -e. |
| Slow Tests | Test suite takes 10 minutes. | Mock database/network calls; don't train a full forest in unit tests. |
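One way to keep such tests fast is dependency injection plus `unittest.mock` (a sketch; the function names are illustrative):

```python
from unittest.mock import Mock


def real_fetch(table: str) -> list:
    """Stands in for a slow database call -- never run in a unit test."""
    raise RuntimeError("network call: mock this in tests")


def row_count(table: str, fetch=real_fetch) -> int:
    """`fetch` is injectable, so tests can swap in a mock."""
    return len(fetch(table))
```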

12. Copy-Paste Prompt

TIP

One-Click Agent Invocation: Copy the prompt below, replace the placeholders, and paste it into your agent.

```text
Role: Act as a Senior ML Engineer.
Task: Execute the ML Code Revamp (Refactor) workflow.

## Objective & Scope
- **Goal**: Refactor experimental Notebooks into production-quality Python packages.
- **Scope**: Modularization, Type Hinting, Configuration extraction, Logging, and Testing.

## Inputs
- [ ] NOTEBOOK: Source Jupyter Notebook (.ipynb).
- [ ] REPO_NAME: Target Package Name.
- [ ] STYLE_GUIDE: PEP8/Black (default).

## Output Artifacts
- [ ] Source Code (src/ module)
- [ ] Unit Tests (tests/)
- [ ] Main Entrypoint (main.py)

## Execution Steps
1. **Structure**
   - Create the package layout (`src/`, `tests/`).
   - Extract logic: Data -> `loader.py`, Features -> `features.py`, Train -> `model.py`.
2. **Refactor**
   - Convert globals to function args.
   - Add Type Hints (`def foo(x: int) -> float`).
   - Replace prints with Logging.
   - Extract secrets to Env/Config.
3. **Test**
   - Write Unit Tests (pytest) for pure functions.
   - Create a CLI entrypoint.
   - Verify Code Coverage.

## Quality Gates
- [ ] No global state in functions.
- [ ] Zero Linter errors (Black/Ruff).
- [ ] Type checks pass (Mypy).
- [ ] Tests pass with > 80% coverage.

## Failure Handling
- If blocked, output a "Clarification Brief" detailing missing info or blockers.

## Constraints
- **Security**: Remove hardcoded secrets.
- **Reproducibility**: Fix random seeds.

## Command
Now execute this workflow step-by-step.
```

Inputs

  • NOTEBOOK: [Provide: Valid path]
  • REPO_NAME: [Provide: Target Name]

Constraints

  • Enforce Type Hints.
  • Segregate Config/Secrets.
  • Create Unit Tests.

Instructions

Execute the following procedure:

Phase 1: Structure — Create the standard directory tree (src/, tests/, config/). Extract notebook cells into specific modules (data, features, model).

Phase 2: Refactor — Rewrite functions to be pure (no globals). Add docstrings and types. Format with Black.

Phase 3: Test — Write pytest cases for core logic. Create a main.py entrypoint using ArgumentParser.

Quality Gates

  • [ ] Linter passed (Black/Ruff).
  • [ ] Tests passed.
  • [ ] CLI runs.

Output Format

  • Refactored Python Files.
  • Test Report.

---

## Appendix: Change Log

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2026-01-14 | AI Engineering Team | Initial release |
