
Performance Tuning for AI & ML Engineering

1. Objective

To perform routine Performance Tuning of AI & ML Engineering systems: profile workloads, identify bottlenecks, and improve runtime, latency, or throughput without degrading model accuracy.

2. When to use / When not to use

When to use:

  • Periodic maintenance, or when measured performance has drifted below target metrics.

When not to use:

  • During a code or release freeze.

3. Inputs (Required/Optional)

Required:

  • Current State (the system to tune and its baseline performance metrics)

Optional:

  • Relevant documentation

4. Outputs (Artifacts)

  • Updated State: the optimized code or configuration, together with the profiling report and validation benchmarks that document the change.

5. Operating Modes

Mode      Description                                Verification Level
Fast      Focus on speed, minimal validation.        Basic syntax/lint checks only.
Standard  Balanced approach.                         Unit tests and standard linting.
Deep      Comprehensive analysis and optimization.   Full test suite, performance profiling, security scan.

6. Constraints & Guardrails

  • No Broken Builds: Ensure all changes pass the build process.
  • Code Style: Strictly adhere to the project's linting and formatting rules.
  • Security: Do not introduce new vulnerabilities; sanitize all inputs.
  • Performance: Avoid O(n^2) or worse complexity unless strictly necessary and documented (see the sketch after this list).
  • Testing: Maintain or improve code coverage; do not degrade it.
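
As a minimal sketch of the Performance constraint above (illustrative only; the functions and data are hypothetical, not project code), the quadratic duplicate scan below is replaced by a linear-time version that uses a set:

```python
def find_duplicates_slow(items):
    """O(n^2): every membership test rescans the prefix of the list."""
    dups = []
    for i, item in enumerate(items):
        if item in items[:i] and item not in dups:
            dups.append(item)
    return dups


def find_duplicates_fast(items):
    """O(n) on average: set membership tests are constant time."""
    seen, dups = set(), set()
    for item in items:
        if item in seen:
            dups.add(item)
        seen.add(item)
    return dups


if __name__ == "__main__":
    data = ["a", "b", "a", "c", "b", "a"]
    assert set(find_duplicates_slow(data)) == find_duplicates_fast(data)
```

When the quadratic form is genuinely required (for example, exact pairwise comparisons), document the reason next to the code, as the constraint demands.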

7. Procedure

Phase 1: Review

  1. Audit AI & ML Engineering assets.
  2. Identify stale items.

Phase 2: Action

  1. Execute Performance Tuning.
  2. Verify changes.

Phase 3: Log

  1. Update changelog.
  2. Notify team.

8. Quality Gates (Definition of Done)

  • [ ] Code compiles/runs without errors.
  • [ ] All new or changed components have equal or improved test coverage.
  • [ ] No new linting errors or warnings introduced.
  • [ ] Documentation updated (inline and external).
  • [ ] Security scan passes (no high/critical severities).

9. Failure Modes & Recovery

Failure Mode    Recovery Action
Build Failure   Check error logs, revert recent changes, verify dependencies.
Test Failure    Isolate failing test, debug logic, or update test if requirements changed.
Linting Error   Run auto-formatter and manually fix remaining issues.
Merge Conflict  Rebase on main, resolve conflicts manually, run tests again.

10. Copy-Paste Prompt

```text
Role: Act as a Lead Engineer specializing in Performance.
Task: Execute the Performance Tuning for AI & ML Engineering workflow.

## Objective & Scope
- **Goal**: Improve the runtime, latency, or throughput of AI/ML systems.
- **Scope**: Profiling code, identifying bottlenecks (CPU/GPU/IO), and applying optimizations.

## Inputs
- [ ] SYSTEM: The code or service to tune.
- [ ] BASELINE_METRICS: Current performance numbers (Latency, QPS).
- [ ] TARGET_METRICS: Desired performance goals.

## Output Artifacts
- [ ] Profiling Report (Flamegraph/Stats)
- [ ] Optimized Code/Config
- [ ] Validation Benchmark

## Execution Steps
1. **Profile**
   - Run profiling tools (cProfile, PyTorch Profiler, line_profiler). Isolate the bottleneck (Data Loading? Compute? Inference?).
2. **Optimize**
   - Apply fixes: Vectorization, Caching, Parallelism, Operator Fusion, or Algorithmic changes.
3. **Verify**
   - Benchmark the optimized system under load. Ensure accuracy is unchanged.

## Quality Gates
- [ ] Bottleneck identified and documented.
- [ ] Target metrics achieved.
- [ ] Functional correctness maintained (Regression test passed).

## Failure Handling
- If blocked, output a "Clarification Brief" detailing missing info or blockers.

## Constraints
- **Quality**: Optimizations must NOT degrade model accuracy.
- **Maintainability**: Avoid overly obscure hacks; comment optimization logic clearly.

## Command
Now execute this workflow step-by-step.
```
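
For Step 1 (Profile) in the prompt above, a minimal sketch using the standard-library cProfile; the helper name `profile_callable` and the workload passed to it are hypothetical, and a torch.profiler variant for GPU workloads is indicated in the comments:

```python
import cProfile
import io
import pstats


def profile_callable(fn, *args, top=20, **kwargs):
    """Run fn under cProfile and print the hottest functions by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = fn(*args, **kwargs)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(top)
    print(stream.getvalue())
    return result


# For GPU-bound workloads, torch.profiler reports per-operator CPU/CUDA time, e.g.:
#   with torch.profiler.profile(activities=[torch.profiler.ProfilerActivity.CPU,
#                                           torch.profiler.ProfilerActivity.CUDA]) as prof:
#       model(batch)
#   print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=20))

if __name__ == "__main__":
    # Hypothetical workload standing in for the system under tuning.
    profile_callable(sorted, list(range(1_000_000, 0, -1)))
```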
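
For Step 2 (Optimize), caching is often the cheapest win when the same expensive computation repeats. The sketch below uses the standard-library functools.lru_cache; `embed_query` is a hypothetical placeholder for an expensive call, not one of this workflow's inputs:

```python
from functools import lru_cache


@lru_cache(maxsize=8192)
def embed_query(text: str) -> tuple:
    """Hypothetical expensive call; repeated queries are served from the cache.

    lru_cache needs hashable arguments and return values that are safe to share,
    so the result is an immutable tuple rather than a mutable array.
    """
    return tuple(float(ord(c)) for c in text)  # placeholder for the real computation


if __name__ == "__main__":
    embed_query("what is operator fusion")   # computed
    embed_query("what is operator fusion")   # cache hit
    print(embed_query.cache_info())          # CacheInfo(hits=1, misses=1, ...)
```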
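
For Step 3 (Verify), a minimal latency comparison that checks correctness first; `baseline_fn`, `optimized_fn`, and their inputs are hypothetical stand-ins for the system before and after tuning:

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, runs=30, **kwargs):
    """Return (median, p95) latency in milliseconds over `runs` timed calls."""
    for _ in range(warmup):               # discard warm-up runs (caches, JIT, lazy init)
        fn(*args, **kwargs)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append((time.perf_counter() - start) * 1e3)
    samples.sort()
    return statistics.median(samples), samples[int(0.95 * (len(samples) - 1))]


def verify(baseline_fn, optimized_fn, inputs, expected):
    """Check functional correctness before reporting the latency delta."""
    assert optimized_fn(*inputs) == expected, "optimization changed the output"
    base_med, base_p95 = benchmark(baseline_fn, *inputs)
    opt_med, opt_p95 = benchmark(optimized_fn, *inputs)
    print(f"median: {base_med:.2f} ms -> {opt_med:.2f} ms")
    print(f"p95:    {base_p95:.2f} ms -> {opt_p95:.2f} ms")
```

For ML systems, the strict equality check would normally be replaced by a tolerance-based comparison or a held-out accuracy measurement, per the constraint that optimizations must not degrade model accuracy.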
