Performance Tuning for AI & ML Engineering
1. Objective
Perform routine performance tuning of AI & ML engineering systems: profile the current state, identify and remove bottlenecks, and verify the improvement without regressing accuracy.
2. When to use / When not to use
When to use:
- Scheduled periodic maintenance, or when performance metrics drift below agreed targets.
When not to use:
- During a code or release freeze.
3. Inputs (Required/Optional)
Required:
- Current State: the system (code/config) to tune and its baseline performance metrics.
Optional:
- Relevant documentation
4. Outputs (Artifacts)
- Updated State: the tuned code/config, plus benchmark results documenting the improvement over baseline.
5. Operating Modes
| Mode | Description | Verification Level |
|---|---|---|
| Fast | Focus on speed, minimal validation. | Basic syntax/lint checks only. |
| Standard | Balanced approach. | Unit tests and standard linting. |
| Deep | Comprehensive analysis and optimization. | Full test suite, performance profiling, security scan. |
6. Constraints & Guardrails
- No Broken Builds: Ensure all changes pass the build process.
- Code Style: Strictly adhere to the project's linting and formatting rules.
- Security: Do not introduce new vulnerabilities; sanitize all inputs.
- Performance: Avoid O(n^2) or worse complexity unless strictly necessary and documented.
- Testing: Maintain or improve code coverage; do not degrade it.
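To illustrate the performance constraint above, here is a minimal, hypothetical sketch (function names are illustrative, not part of this workflow): both versions compute O(n^2) pairwise squared differences, but the vectorized one pushes the inner loops into NumPy's C code, which is the usual first remedy when quadratic work is unavoidable.

```python
import numpy as np

def pairwise_sq_dists_loop(x: np.ndarray) -> np.ndarray:
    """O(n^2) Python loop: clear, but slow for large n."""
    n = len(x)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = (x[i] - x[j]) ** 2
    return out

def pairwise_sq_dists_vec(x: np.ndarray) -> np.ndarray:
    """Same result via broadcasting: the O(n^2) work runs in C."""
    d = x[:, None] - x[None, :]
    return d * d

x = np.array([0.0, 1.0, 3.0])
assert np.allclose(pairwise_sq_dists_loop(x), pairwise_sq_dists_vec(x))
```

When such a rewrite is applied, documenting it (per the constraints above) should note that the asymptotic complexity is unchanged; only the constant factor improves.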
7. Procedure
Phase 1: Review
- Audit AI & ML Engineering assets.
- Identify stale items.
Phase 2: Action
- Execute Performance Tuning.
- Verify changes.
Phase 3: Log
- Update changelog.
- Notify team.
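Phase 1's stale-item audit could be sketched as below; the staleness threshold and the mtime-based heuristic are assumptions for illustration, not mandated by this workflow.

```python
import os
import time

STALE_AFTER_DAYS = 90  # hypothetical threshold; adjust per team policy

def find_stale(root: str) -> list[str]:
    """Return files under `root` not modified within the staleness window."""
    cutoff = time.time() - STALE_AFTER_DAYS * 86400
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

A real audit would likely also consult version-control history or artifact registries, not just filesystem timestamps.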
8. Quality Gates (Definition of Done)
- [ ] Code compiles/runs without errors.
- [ ] All new components have equal or better test coverage.
- [ ] No new linting errors or warnings introduced.
- [ ] Documentation updated (inline and external).
- [ ] Security scan passes (no high/critical severities).
9. Failure Modes & Recovery
| Failure Mode | Recovery Action |
|---|---|
| Build Failure | Check error logs, revert recent changes, verify dependencies. |
| Test Failure | Isolate failing test, debug logic, or update test if requirements changed. |
| Linting Error | Run auto-formatter and manually fix remaining issues. |
| Merge Conflict | Rebase on main, resolve conflicts manually, run tests again. |
10. Copy-Paste Prompt
```text
Role: Act as a Lead Engineer specializing in Performance.
Task: Execute the Performance Tuning for AI & ML Engineering workflow.
## Objective & Scope
- **Goal**: Improve the runtime, latency, or throughput of AI/ML systems.
- **Scope**: Profiling code, identifying bottlenecks (CPU/GPU/IO), and applying optimizations.
## Inputs
- [ ] SYSTEM: The code or service to tune.
- [ ] BASELINE_METRICS: Current performance numbers (Latency, QPS).
- [ ] TARGET_METRICS: Desired performance goals.
## Output Artifacts
- [ ] Profiling Report (Flamegraph/Stats)
- [ ] Optimized Code/Config
- [ ] Validation Benchmark
## Execution Steps
1. **Profile**
   - Run profiling tools (cProfile, PyTorch Profiler, line_profiler). Isolate the bottleneck (Data Loading? Compute? Inference?).
2. **Optimize**
   - Apply fixes: Vectorization, Caching, Parallelism, Operator Fusion, or Algorithmic changes.
3. **Verify**
   - Benchmark the optimized system under load. Ensure accuracy is unchanged.
## Quality Gates
- [ ] Bottleneck identified and documented.
- [ ] Target metrics achieved.
- [ ] Functional correctness maintained (Regression test passed).
## Failure Handling
- If blocked, output a "Clarification Brief" detailing missing info or blockers.
## Constraints
- **Quality**: Optimizations must NOT degrade model accuracy.
- **Maintainability**: Avoid overly obscure hacks; comment optimization logic clearly.
## Command
Now execute this workflow step-by-step.
```
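The Profile step referenced in the prompt can be sketched with the standard-library cProfile; `workload` here is a stand-in for the real SYSTEM under test, not part of this workflow.

```python
import cProfile
import io
import pstats

def workload() -> float:
    """Stand-in for the system under test: a deliberately slow summation."""
    total = 0.0
    for i in range(200_000):
        total += i * 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Rank functions by cumulative time to locate the bottleneck.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

In practice, the top entries of the cumulative-time column point at the code to optimize; for GPU-bound models, PyTorch Profiler plays the same role.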