ML Code Revamp (Refactor)
1. Objective
The objective of this workflow is to bridge the "Research to Production" gap. Notebooks are great for exploration (EDA) but terrible for production engineering. They suffer from hidden state, lack of testing, and poor version control. This workflow transforms "Spaghetti Code" into a structured, typed, and tested Python library.
2. Context & Scope
In Scope
This workflow covers Modularization (Functions/Classes), Typing (Type Hints), Configuration separation (Hardcoded vars -> YAML/Env), Logging (Print -> Logger), and Unit Testing.
Assumption: The notebook code already "works" end-to-end (it produces a model or result).
Out of Scope
- Algorithm Change: We are not improving the model accuracy here, only the code quality.
- CI/CD Pipeline: Setting up the build server is "Implement CI/CD".
3. When to Use / When Not to Use
✅ Use This Workflow When
- A model has passed the "POC" stage and needs to be deployed.
- You need to run the code on a schedule (Airflow/Cron).
- Multiple team members need to collaborate on the codebase.
❌ Do NOT Use This Workflow When
- You are strictly doing throw-away EDA (Exploratory Data Analysis).
- The idea is not yet proven (Don't optimize code that might be deleted next week).
4. Inputs (Required/Optional)
Required Inputs
| Input | Description | Format | Example |
|---|---|---|---|
| NOTEBOOK | The source file. | .ipynb | experiment_v1.ipynb |
| REPO_NAME | Target project. | String | fraud-detection-lib |
Optional Inputs
| Input | Description | Default | Condition |
|---|---|---|---|
| STYLE_GUIDE | Formatting rules. | Black/PEP8 | If corp standard exists. |
5. Outputs (Artifacts)
| Artifact | Format | Destination | Quality Criteria |
|---|---|---|---|
| Source Code | .py files | src/ | Modular, Typed, Docstrings. |
| Tests | .py files | tests/ | Coverage > 80%. |
6. Operating Modes
⚡ Fast Mode
Timebox: 1 hour. Scope: Script conversion. Details: Export the notebook to a single main.py, wrap top-level code in if __name__ == "__main__":, and run the Black formatter.
🎯 Standard Mode (Default)
Timebox: 4 hours. Scope: Library structure. Details: Create src/data, src/features, and src/models modules, extract functions, add type hints (def train(df: pd.DataFrame) -> Model), and replace print() with logging.
🔬 Deep Mode
Timebox: 2 days. Scope: OOP & design patterns. Details: Define abstract base classes (e.g., BasePreprocessor), implement a Pipeline pattern, and add Pydantic-based config validation.
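Deep Mode's config validation can be sketched as below. This uses stdlib dataclasses to stay self-contained (Pydantic performs equivalent type and value checks automatically); TrainConfig and its fields are illustrative names, not part of the workflow:

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Validated training configuration (illustrative fields)."""
    learning_rate: float
    n_estimators: int
    data_path: str

    def __post_init__(self) -> None:
        # Fail fast on nonsense values instead of deep inside training.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.n_estimators < 1:
            raise ValueError("n_estimators must be >= 1")


# Typically populated from config.yaml, e.g. TrainConfig(**yaml.safe_load(f))
cfg = TrainConfig(learning_rate=0.1, n_estimators=100, data_path="data/train.csv")
```

The point is that a bad config raises immediately at startup rather than hours into a training run.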
7. Constraints & Guardrails
Technical Constraints
- Global State: Notebooks rely on global variables defined 10 cells up. Functions MUST NOT rely on globals; pass all variables as arguments.
- Reproducibility: Fix random seeds.
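A minimal seed-fixing helper for the reproducibility constraint (the numpy/torch lines are shown as comments, since which RNGs you seed depends on your stack):

```python
import random


def set_seed(seed: int) -> None:
    """Fix random seeds so repeated runs are deterministic."""
    random.seed(seed)
    # In a real project, also seed every other RNG you use, e.g.:
    # np.random.seed(seed)
    # torch.manual_seed(seed)


set_seed(42)
first = random.random()
set_seed(42)
assert random.random() == first  # same seed -> same sequence
```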
Security & Privacy
CAUTION
Secrets in Code: Notebooks often have AWS_KEY = 'AKIA...' hardcoded. When moving to .py, remove ALL credentials; use os.getenv or .env files.
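A sketch of the os.getenv pattern (AWS_KEY is just an example variable name; the fail-loudly helper is our convention, not a standard API):

```python
import os


def get_required_env(name: str) -> str:
    """Read a secret from the environment, failing loudly if it is missing."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Instead of AWS_KEY = 'AKIA...' hardcoded in the notebook:
# aws_key = get_required_env("AWS_KEY")
```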
Compliance
- Licensing: If copying snippets from StackOverflow/Open Source, ensure license compatibility.
8. Procedure
Phase 1: Modularization (The Split)
Objective: Break it down.
Analyze the notebook and identify logical blocks:
- Data loading (src/data/loader.py)
- Preprocessing (src/features/processing.py)
- Training (src/models/train.py)
- Evaluation (src/evaluation/metrics.py)
Create the functions: copy code from cells into functions. Critical: remove any dependency on global-scope variables. If a function needs df, pass df as an argument.
Verify: the src/ folder exists with modules, and no code runs at import time.
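The extraction step above might look like this, sketched for a hypothetical src/data/loader.py (stdlib csv instead of pandas, to keep the example self-contained):

```python
import csv
from pathlib import Path


def load_rows(path: Path) -> list[dict]:
    """Load a CSV into a list of dicts. No globals: the path is an argument."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    # Nothing runs at import time; execution happens only behind this guard.
    print(load_rows(Path("data/train.csv"))[:3])
```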
Phase 2: Professionalization (The Polish)
Objective: Clean it up.
- Typing: Add Python type hints, e.g. def process(x: int) -> float:.
- Docstrings: Add standard docstrings (Args, Returns, Raises).
- Logging: Replace print("Starting...") with logger.info("Starting...").
- Config: Move hardcoded paths/params to config.yaml or params.py.
- Formatting: Run black and ruff (or flake8) to fix style.
Verify: Linter passes with 0 errors.
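Putting the Phase 2 items together on one toy function (normalize is an illustrative example, not taken from any particular notebook):

```python
import logging

logger = logging.getLogger(__name__)


def normalize(values: list[float]) -> list[float]:
    """Scale values to the [0, 1] range.

    Args:
        values: Raw numeric values.

    Returns:
        Values rescaled to [0, 1]; an empty list is returned unchanged.

    Raises:
        ValueError: If all values are identical (zero range).
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("Cannot normalize a constant sequence")
    logger.info("Normalizing %d values", len(values))  # logger, not print()
    return [(v - lo) / (hi - lo) for v in values]
```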
Phase 3: Testing & Entrypoint
Objective: Make it runnable.
Create a main.py (or a CLI via typer/argparse) that orchestrates the pipeline. Write unit tests (pytest).
- Test pure functions (e.g., "clean_text handles empty string").
- Smoke test the full pipeline on a small dummy dataset.
Verify: pytest passes and python main.py runs successfully.
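A sketch of the pure-function test style, assuming a hypothetical clean_text helper; the tests are written pytest-style (plain functions with bare assert statements):

```python
# Function under test (hypothetical; would live in src/features/cleaning.py)
def clean_text(text: str) -> str:
    """Lowercase, strip, and collapse internal whitespace."""
    return " ".join(text.lower().split())


# tests/test_cleaning.py
def test_clean_text_handles_empty_string():
    assert clean_text("") == ""


def test_clean_text_collapses_whitespace():
    assert clean_text("  Hello   WORLD ") == "hello world"
```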
9. Technical Considerations
Import Hell: Relative imports (from .. import config) can be painful. Use absolute imports (from myproject import config) and install the package in "Editable Mode" (pip install -e .).
Data Versioning: Don't hardcode "v1.csv". Use DVC or pass the path as an argument.
Pickle Compatibility: If classes change, old pickles might break. Version your classes or use onnx/standard formats for models.
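One way to guard against stale pickles is to stamp artifacts with a schema version and check it on load. This is a sketch; the _VERSION convention is ours, not a pickle feature:

```python
import pickle

_VERSION = 2  # bump whenever the model class or feature schema changes


def save_model(model: object, path: str) -> None:
    """Persist a model together with its schema version."""
    with open(path, "wb") as f:
        pickle.dump({"version": _VERSION, "model": model}, f)


def load_model(path: str) -> object:
    """Load a model, refusing artifacts written under an older schema."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["version"] != _VERSION:
        raise RuntimeError(
            f"Model version {payload['version']} != code version {_VERSION}; retrain."
        )
    return payload["model"]
```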
10. Quality Gates (Definition of Done)
Checklist
- [ ] No global variables in functions.
- [ ] Secrets removed.
- [ ] Logging implemented.
- [ ] Tests passing.
Validation
| Criterion | Method | Threshold |
|---|---|---|
| Maintainability | Cyclomatic Complexity | < 10 per function |
| Coverage | Pytest-cov | > 80% |
11. Failure Modes & Recovery
| Failure Mode | Symptoms | Recovery Action |
|---|---|---|
| Hidden State Bug | Code fails because df was mutated in cell 5. | Trace variable lifecycle. Use immutable transformations where possible. |
| Import Error | ModuleNotFound. | Fix PYTHONPATH or install local package with -e. |
| Slow Tests | Test suite takes 10 mins. | Mock database/network calls. Don't train a full random forest in unit tests. |
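The "Slow Tests" recovery above can be sketched with unittest.mock from the stdlib; fetch_training_data and run_pipeline are hypothetical names standing in for your real slow calls:

```python
from unittest.mock import patch


def fetch_training_data() -> list[int]:
    """Imagine this hits a database and takes minutes."""
    raise RuntimeError("Should never run in unit tests")


def run_pipeline() -> int:
    """Toy pipeline that depends on the slow fetch."""
    return sum(fetch_training_data())


def test_pipeline_with_mocked_db():
    # Patch the slow call with a tiny in-memory stub.
    with patch(f"{__name__}.fetch_training_data", return_value=[1, 2, 3]):
        assert run_pipeline() == 6
```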
12. Copy-Paste Prompt
TIP
One-Click Agent Invocation Copy the prompt below, replace placeholders, and paste into your agent.
text
Role: Act as a Senior ML Engineer.
Task: Execute the ML Code Revamp (Refactor) workflow.
## Objective & Scope
- **Goal**: Refactor experimental Notebooks into production-quality Python packages.
- **Scope**: Modularization, Type Hinting, Configuration extraction, Logging, and Testing.
## Inputs
- [ ] NOTEBOOK: Source Jupyter Notebook (.ipynb).
- [ ] REPO_NAME: Target Package Name.
- [ ] STYLE_GUIDE: PEP8/Black (default).
## Output Artifacts
- [ ] Source Code (src/ module)
- [ ] Unit Tests (tests/)
- [ ] Main Entrypoint (main.py)
## Execution Steps
1. **Structure**
   - Create package layout (`src/`, `tests/`).
   - Extract logic: Data -> `loader.py`, Features -> `features.py`, Training -> `model.py`.
2. **Refactor**
   - Convert globals to function args.
   - Add Type Hints (`def foo(x: int) -> float`).
   - Replace prints with Logging.
   - Extract secrets to Env/Config.
3. **Test**
   - Write Unit Tests (pytest) for pure functions.
   - Create CLI entrypoint.
   - Verify Code Coverage.
## Quality Gates
- [ ] No global state in functions.
- [ ] Zero Linter errors (Black/Ruff).
- [ ] Type checks pass (Mypy).
- [ ] Tests pass > 80% coverage.
## Failure Handling
- If blocked, output a "Clarification Brief" detailing missing info or blockers.
## Constraints
- **Security**: Remove hardcoded secrets.
- **Reproducibility**: Fix random seeds.
## Command
Now execute this workflow step-by-step.
Inputs
- NOTEBOOK: [Provide: Valid path]
- REPO_NAME: [Provide: Target Name]
Constraints
- Enforce Type Hints.
- Segregate Config/Secrets.
- Create Unit Tests.
Instructions
Execute the following procedure:
Phase 1: Structure. Create the standard directory tree (src/, tests/, config/). Extract notebook cells into specific modules (data, features, model).
Phase 2: Refactor. Rewrite functions to be pure (no globals). Add docstrings and types. Format with Black.
Phase 3: Test. Write pytest cases for core logic. Create a main.py entrypoint using ArgumentParser.
Quality Gates
- [ ] Linter passed (Black/Ruff).
- [ ] Tests passed.
- [ ] CLI runs.
Output Format
- Refactored Python Files.
- Test Report.
---
## Appendix: Change Log
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2026-01-14 | AI Engineering Team | Initial release |