🔧 Tool Use & Function Calling
🎓 Page Overview
This page covers tool use and function calling in LLM applications in depth, from design patterns to reliability and error handling.
Level: Advanced
Solves: Designing robust tool integrations for LLM agents with proper error handling and reliability
🎯 Function Calling Architecture
Tool Execution Flow
Tool Definition Schema
```json
{
  "name": "search_products",
  "description": "Search products in catalog by query and filters",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query text"
      },
      "category": {
        "type": "string",
        "enum": ["electronics", "clothing", "home"],
        "description": "Product category filter"
      },
      "max_price": {
        "type": "number",
        "description": "Maximum price in USD"
      },
      "limit": {
        "type": "integer",
        "default": 10,
        "maximum": 50
      }
    },
    "required": ["query"]
  }
}
```
🛡️ Reliability Patterns
1. Retry with Exponential Backoff
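```python
import asyncio
import random


# ToolResult is assumed to be defined elsewhere in the application;
# RetryableError is defined under Error Classification.
async def execute_with_retry(
    tool_fn,
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
) -> "ToolResult":
    """Call tool_fn, retrying RetryableError with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return await tool_fn()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            # Double the delay on each attempt, capped at max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            delay += random.uniform(0, delay * 0.1)  # Up to 10% jitter
            await asyncio.sleep(delay)
    # Only reached when max_retries < 1, so no call was ever made
    raise ValueError("max_retries must be at least 1")
```
2. Circuit Breaker Pattern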
| State | Behavior |
|---|---|
| Closed | Normal operation, track failures |
| Open | Fail fast, return cached/fallback |
| Half-Open | Allow limited traffic to test recovery |
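The three states in the table can be sketched as a small synchronous class. This is a minimal illustration under stated assumptions, not a production implementation: the names `CircuitBreaker` and `CircuitOpenError`, the `failure_threshold` and `recovery_timeout` parameters, and the injectable `clock` are all hypothetical, and "limited traffic" in the half-open state is simplified to letting the next call through as a probe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without running the tool."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock  # Injectable so tests can control time
        self.failures = 0
        self.opened_at = None  # None means the circuit has not tripped

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.recovery_timeout:
            return "half_open"  # Cool-down elapsed: allow a probe call
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit open, failing fast")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed probe, or too many failures while closed, (re)opens the circuit
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would typically catch `CircuitOpenError` and return a cached or fallback result, as the Open row of the table suggests.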
3. Timeout Hierarchy
```yaml
timeouts:
  connection: 5s     # Time to establish connection
  read: 30s          # Time to receive response
  total: 60s         # Total operation time
  llm_planning: 10s  # Max time for LLM to decide tool
```
4. Fallback Chain
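One way to read a fallback chain is as an ordered list of handlers that are tried until one succeeds, for example the live tool first, then a cache, then a static default. The sketch below assumes that reading; `run_fallback_chain` and the handler names in the usage note are hypothetical.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_fallback_chain(
    handlers: Sequence[Tuple[str, Callable[[], Any]]],
) -> Tuple[str, Any]:
    """Try each (name, handler) in order; return the first success with its source name."""
    errors: List[str] = []
    for name, handler in handlers:
        try:
            return name, handler()
        except Exception as exc:
            # Record the failure and move on to the next, more degraded, option
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all fallbacks failed: " + "; ".join(errors))
```

A call might look like `run_fallback_chain([("primary", call_api), ("cache", read_cache), ("default", lambda: [])])`, where `call_api` and `read_cache` stand in for real implementations. Returning the source name lets the caller tell the LLM or the user that a degraded result was used.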
🔄 Idempotency
Why Idempotency Matters
| Scenario | Without Idempotency | With Idempotency |
|---|---|---|
| Network retry | Duplicate orders | Safe retry |
| LLM re-planning | Multiple API calls | Single execution |
| Timeout + success | Unknown state | Consistent result |
Idempotency Key Design
```python
import hashlib
import json


def generate_idempotency_key(
    user_id: str,
    tool_name: str,
    params: dict,
    session_id: str,
) -> str:
    """
    Generate deterministic key for idempotent operations.
    """
    # Normalize params so key order does not change the hash
    normalized = json.dumps(params, sort_keys=True)
    key_parts = [
        user_id,
        tool_name,
        # Truncated digest keeps the key short; params must be JSON-serializable
        hashlib.sha256(normalized.encode()).hexdigest()[:16],
        session_id,
    ]
    return ":".join(key_parts)
```
Idempotency Store Pattern
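A store keyed by the idempotency key can return the saved result instead of running the operation again. The in-memory sketch below is only an illustration: `InMemoryIdempotencyStore` and `execute_once` are hypothetical names, and a real deployment would more likely use a shared store with an atomic set-if-absent operation and an expiry, neither of which is shown here.

```python
from typing import Any, Callable, Dict


class InMemoryIdempotencyStore:
    """Single-process sketch; not safe for concurrent callers."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def execute_once(self, key: str, operation: Callable[[], Any]) -> Any:
        # Replay the stored result if this key has already completed
        if key in self._results:
            return self._results[key]
        # Exceptions propagate and nothing is stored, so a failed call can be retried
        result = operation()
        self._results[key] = result
        return result
```

With this shape, a retried `create_order` call that carries the same key returns the original order rather than creating a second one.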
Tool Classification
| Type | Idempotent | Example | Strategy |
|---|---|---|---|
| Read | Yes | get_user_info | Always safe to retry |
| Create | No → Yes | create_order | Require idempotency key |
| Update | Conditional | update_status | Version-based |
| Delete | Naturally | delete_item | Safe (no-op if missing) |
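The "Version-based" strategy for updates can be sketched as an optimistic concurrency check: the caller states which version it read, and the update applies only if that version is still current. The record layout, `VersionConflict`, and the `expected_version` parameter below are assumptions for illustration.

```python
from typing import Dict


class VersionConflict(Exception):
    """The record changed since the caller last read it."""


def update_status(
    records: Dict[str, dict],
    item_id: str,
    new_status: str,
    expected_version: int,
) -> dict:
    record = records[item_id]
    if record["version"] != expected_version:
        # A replayed or stale update is rejected instead of being applied twice
        raise VersionConflict(
            f"expected version {expected_version}, found {record['version']}"
        )
    record["status"] = new_status
    record["version"] += 1
    return record
```

A retry that replays the same request after the first attempt succeeded hits the version check and leaves the record unchanged.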
⚠️ Error Handling
Error Classification
```python
class ToolError(Exception):
    """Base class for tool errors."""


class RetryableError(ToolError):
    """Transient errors that may succeed on retry."""


class ParameterError(ToolError):
    """Invalid parameters from LLM."""


# Renamed from PermissionError to avoid shadowing the Python built-in
class ToolPermissionError(ToolError):
    """User lacks permission for this operation."""


class QuotaError(ToolError):
    """Rate limit or quota exceeded."""


class ExternalServiceError(ToolError):
    """Third-party service failure."""
```
Error Response Format
```json
{
  "success": false,
  "error": {
    "code": "QUOTA_EXCEEDED",
    "message": "API rate limit exceeded. Try again in 60 seconds.",
    "retryable": true,
    "retry_after": 60,
    "user_message": "The service is temporarily busy. Please wait a moment."
  },
  "partial_result": null
}
```
Error Recovery Strategies
| Error Type | Strategy | LLM Prompt |
|---|---|---|
| Parameter Error | Ask LLM to fix | "Parameter 'date' must be YYYY-MM-DD format" |
| Not Found | Alternative search | "No results for X, try broader search?" |
| Permission | Explain limitation | "Cannot access private data" |
| Quota | Wait or degrade | "Service busy, using cached data" |
| Timeout | Partial result | "Partial response, full data unavailable" |
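To connect the error classes with the response format, a tool wrapper can translate exceptions into the structured payload before handing it back to the LLM. The sketch below redefines a few of the classes so it runs on its own. Only `QUOTA_EXCEEDED` comes from the example response; the other codes, the `retry_after` attribute on `QuotaError`, and `to_error_response` itself are assumptions, and the `user_message` field is omitted for brevity.

```python
class ToolError(Exception):
    """Base class for tool errors."""


class RetryableError(ToolError):
    """Transient errors that may succeed on retry."""


class ParameterError(ToolError):
    """Invalid parameters from LLM."""


class QuotaError(ToolError):
    """Rate limit or quota exceeded."""

    def __init__(self, message: str, retry_after: int = 60) -> None:
        super().__init__(message)
        self.retry_after = retry_after  # Seconds to wait; assumed attribute


def to_error_response(exc: ToolError) -> dict:
    """Map a ToolError to the structured error payload."""
    if isinstance(exc, QuotaError):
        code, retryable, retry_after = "QUOTA_EXCEEDED", True, exc.retry_after
    elif isinstance(exc, RetryableError):
        code, retryable, retry_after = "TRANSIENT_FAILURE", True, None
    elif isinstance(exc, ParameterError):
        code, retryable, retry_after = "INVALID_PARAMETERS", False, None
    else:
        code, retryable, retry_after = "TOOL_ERROR", False, None
    return {
        "success": False,
        "error": {
            "code": code,
            "message": str(exc),
            "retryable": retryable,
            "retry_after": retry_after,
        },
        "partial_result": None,
    }
```

The `retryable` flag lets the orchestration layer decide between waiting, degrading, or asking the LLM to correct its parameters, in line with the strategies in the table.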
🔒 Security Considerations
Input Validation
```python
# UserContext, ValidationResult, ALLOWED_TOOLS, get_tool_schema,
# validate_against_schema and rate_limiter are assumed to be defined elsewhere.
def validate_tool_call(
    tool_name: str,
    params: dict,
    user_context: UserContext,
) -> ValidationResult:
    """
    Validate tool call before execution.
    """
    # 1. Tool exists and is enabled
    if tool_name not in ALLOWED_TOOLS:
        return ValidationResult.denied("Unknown tool")

    # 2. User has permission
    if not user_context.can_use(tool_name):
        return ValidationResult.denied("Permission denied")

    # 3. Parameters are valid
    schema = get_tool_schema(tool_name)
    if not validate_against_schema(params, schema):
        return ValidationResult.invalid("Parameter validation failed")

    # 4. Rate limiting
    if not rate_limiter.allow(user_context.id, tool_name):
        return ValidationResult.rate_limited()

    return ValidationResult.allowed()
```
Sandboxing Principles
| Principle | Implementation |
|---|---|
| Least Privilege | Tools have minimal permissions |
| Isolation | Each tool runs in isolated context |
| Audit Logging | All tool calls are logged |
| Output Sanitization | Sensitive data masked in responses |
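The Output Sanitization row can be illustrated with a recursive masking pass over a tool result before it is returned to the LLM or written to logs. The key list, the email pattern, and the function name `mask_sensitive` are assumptions; a real deployment would tune them to its own data.

```python
import re

# Hypothetical set of field names whose values should never leave the tool
SENSITIVE_KEYS = {"password", "api_key", "token", "ssn"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_sensitive(value):
    """Return a copy of a JSON-like value with sensitive data masked."""
    if isinstance(value, dict):
        return {
            k: "***" if str(k).lower() in SENSITIVE_KEYS else mask_sensitive(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [mask_sensitive(v) for v in value]
    if isinstance(value, str):
        # Replace anything that looks like an email address
        return EMAIL_RE.sub("[email]", value)
    return value  # Numbers, booleans and None pass through unchanged
```

Because the function builds new containers, the original tool result stays untouched for internal use.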
Injection Prevention
```python
import string

# Characters allowed through to a shell argument
_SHELL_SAFE = set(string.ascii_letters + string.digits + "._-")


def sanitize_for_shell(param: str) -> str:
    """Prevent command injection."""
    # Allowlist approach: anything outside the safe set becomes "_"
    return "".join(c if c in _SHELL_SAFE else "_" for c in param)


def sanitize_for_sql(param: str) -> str:
    """Parameterized queries are preferred, this is defense-in-depth."""
    # Double single quotes and drop statement separators
    return param.replace("'", "''").replace(";", "")
```
📊 Observability
Tool Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| tool.call.count | Tool invocations | Anomaly detection |
| tool.call.latency_ms | Execution time | p95 > SLO |
| tool.call.error_rate | Failure percentage | > 5% |
| tool.call.retry_rate | Retry frequency | > 10% |
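As a rough sketch of how these metrics could be derived from per-call records, the function below computes the count, a nearest-rank p95 latency, and the error and retry rates. The record fields `duration_ms` and `success` mirror the structured log example on this page; the `retried` flag and the function name `summarize_calls` are assumptions.

```python
from typing import Dict, List


def summarize_calls(calls: List[Dict]) -> Dict[str, float]:
    """Aggregate tool call records into the metrics listed in the table."""
    if not calls:
        raise ValueError("no calls to summarize")
    n = len(calls)
    latencies = sorted(c["duration_ms"] for c in calls)
    # Nearest-rank p95: ceil(0.95 * n), computed with integers to avoid float rounding
    rank = (95 * n + 99) // 100
    errors = sum(1 for c in calls if not c["success"])
    retries = sum(1 for c in calls if c.get("retried", False))
    return {
        "count": n,
        "p95_latency_ms": latencies[rank - 1],
        "error_rate": errors / n,
        "retry_rate": retries / n,
    }
```

The resulting rates can be compared directly against the alert thresholds, for example `error_rate > 0.05`.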
Structured Logging
```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "event": "tool_call",
  "tool_name": "search_products",
  "user_id": "user_123",
  "session_id": "sess_456",
  "parameters": {"query": "laptop", "category": "electronics"},
  "duration_ms": 245,
  "success": true,
  "result_count": 15,
  "idempotency_key": "user_123:search_products:abc123:sess_456"
}
```
📋 Tool Engineering Checklist
Design Phase
- [ ] Define clear tool boundaries and responsibilities
- [ ] Document parameter schemas with examples
- [ ] Classify tools by idempotency requirements
- [ ] Plan error handling strategies
Implementation Phase
- [ ] Implement retry logic with exponential backoff
- [ ] Add circuit breaker for external dependencies
- [ ] Set up idempotency key handling
- [ ] Implement comprehensive input validation
Production Phase
- [ ] Monitor tool latency and error rates
- [ ] Set up alerting for circuit breaker trips
- [ ] Regular security audit of tool permissions
- [ ] Load test tool endpoints
🔗 Cross-References
- 📎 LLM App Architecture - Tool integration in overall design
- 📎 LLM Safety - Preventing tool misuse
- 📎 System Design - API Design - API reliability patterns
- 📎 ML Deployment - Production deployment patterns
📚 Further Reading
- "Function Calling" - OpenAI Documentation
- "Tool Use Best Practices" - Anthropic Guide
- "Building Reliable LLM Applications" - LangChain Docs