Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Wrap your AI calls and AxonFlow handles everything: policy enforcement, PII detection, rate limiting, and audit logging.
How It Works
- Your app calls protect() with an AI function
- AxonFlow extracts the request and evaluates your policies
- If approved, AxonFlow executes the AI call
- The audit trail is logged automatically
- The response is optionally filtered (PII detection)
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const axonflow = new AxonFlow({ licenseKey: process.env.AXONFLOW_LICENSE_KEY });

// Wrap any AI call with protect()
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Analyze this data...' }]
  });
});
Go
import "github.com/getaxonflow/axonflow-sdk-go"
client := axonflow.NewClient(axonflow.AxonFlowConfig{
AgentURL: os.Getenv("AXONFLOW_AGENT_URL"),
LicenseKey: os.Getenv("AXONFLOW_LICENSE_KEY"),
})
// Execute governed query
resp, err := client.ExecuteQuery(
userToken,
"Analyze this data...",
"chat",
nil,
)
Python
import os

from axonflow import AxonFlow

async with AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
) as client:
    response = await client.execute_query(
        user_token="user-jwt",
        query="Analyze this data...",
        request_type="chat",
    )
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Minimal code changes required
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Beginners - Lower learning curve
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// If the request contains PII, it's blocked before reaching OpenAI
try {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({
      model: 'gpt-4',
      messages: [{ role: 'user', content: 'My SSN is 123-45-6789' }]
    });
  });
} catch (error) {
  // error.message: "Request blocked by AxonFlow: PII detected"
}
2. Automatic Audit Logging
Every request is logged for compliance:
// No additional code needed - audit happens automatically
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});
// Audit includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
3. Response Filtering (Enterprise)
PII in responses can be automatically redacted:
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'What is John Smith\'s email?' }]
  });
});
// Response: "The customer's email is [EMAIL REDACTED]"
4. Rate Limiting
Automatic rate limiting per user/tenant:
try {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({ ... });
  });
} catch (error) {
  if (error.message.includes('rate limit')) {
    // Rate limit exceeded - wait and retry
  }
}
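When the limit is hit, a simple exponential backoff is usually enough. The sketch below reuses the axonflow and openai clients from the Quick Start and assumes, as in the example above, that rate-limit errors can be detected by their message; the delays and attempt count are illustrative, not SDK defaults:
// Retry a governed call with exponential backoff on rate-limit errors
async function protectWithRetry<T>(call: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await axonflow.protect(call);
    } catch (error: any) {
      const rateLimited = typeof error?.message === 'string' && error.message.includes('rate limit');
      if (!rateLimited || attempt >= maxAttempts) throw error;
      // Wait 1s, 2s, 4s, ... before the next attempt
      await new Promise((resolve) => setTimeout(resolve, 1000 * 2 ** (attempt - 1)));
    }
  }
}

const response = await protectWithRetry(() =>
  openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Analyze this data...' }]
  })
);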
5. Fail-Open Strategy
In production, if AxonFlow is unavailable, requests proceed with a warning:
const axonflow = new AxonFlow({
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  mode: 'production' // Fail-open if AxonFlow is down
});

// If AxonFlow is unavailable, the AI call still proceeds
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});
Client Wrapping (TypeScript)
For maximum convenience, wrap your entire AI client:
import { AxonFlow, wrapOpenAIClient } from '@axonflow/sdk';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const axonflow = new AxonFlow({ licenseKey: process.env.AXONFLOW_LICENSE_KEY });

// Wrap the entire client - all calls are now protected
const protectedOpenAI = wrapOpenAIClient(openai, axonflow);

// Use normally - governance happens invisibly
const response = await protectedOpenAI.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| Public endpoint | ~50-100ms |
| VPC endpoint | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
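To see what the overhead looks like in your own environment, time a governed call against a direct one. A rough sketch using the clients from the Quick Start (single-sample timings; average over many calls for a real benchmark):
// Rough comparison of a direct call vs. a governed call
const t0 = Date.now();
await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
const directMs = Date.now() - t0;

const t1 = Date.now();
await axonflow.protect(async () => {
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello!' }]
  });
});
const governedMs = Date.now() - t1;

console.log(`direct: ${directMs}ms, governed: ${governedMs}ms, overhead: ~${governedMs - directMs}ms`);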
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Wrap existing calls | Pre-check + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Response Filtering | Yes | No |
| Audit Coverage | 100% automatic | Manual (call audit API) |
| LLM Control | Limited | Full |
| Best For | Simple apps, beginners | Frameworks, performance |
See Choosing a Mode for detailed guidance.
Error Handling
try {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({ ... });
  });
  console.log('Success:', response);
} catch (error) {
  if (error.message.includes('blocked by AxonFlow')) {
    // Policy violation
    console.log('Policy violation:', error.message);
  } else if (error.message.includes('rate limit')) {
    // Rate limit exceeded
    console.log('Rate limited, try again later');
  } else {
    // Other errors (network, API, etc.)
    console.error('Error:', error);
  }
}
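If you handle these cases in several places, it can help to centralize the checks. A small helper sketch, assuming the error messages contain the same substrings used above:
// Map an error thrown by protect() to a coarse category
type GovernanceErrorKind = 'policy_violation' | 'rate_limited' | 'other';

function classifyGovernanceError(error: unknown): GovernanceErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('blocked by AxonFlow')) return 'policy_violation';
  if (message.includes('rate limit')) return 'rate_limited';
  return 'other';
}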
Configuration
const axonflow = new AxonFlow({
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  endpoint: 'https://staging-eu.getaxonflow.com',
  mode: 'production', // 'production' or 'sandbox'
  debug: false,       // Enable debug logging
  timeout: 30000,     // Request timeout in ms
  retry: {
    enabled: true,
    maxAttempts: 3,
    delay: 1000
  },
  cache: {
    enabled: true,
    ttl: 60000 // Cache TTL in ms
  }
});
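A common pattern is to derive these options from the environment, so deployed instances run in fail-open production mode while local development uses sandbox mode with debug logging. A sketch (the NODE_ENV check and the sandbox choice are assumptions, not SDK behavior):
// Pick mode and debug logging from the runtime environment
const isProduction = process.env.NODE_ENV === 'production';

const axonflow = new AxonFlow({
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  mode: isProduction ? 'production' : 'sandbox', // 'production' fail-opens if AxonFlow is down
  debug: !isProduction
});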
Next Steps
- Gateway Mode - For lowest latency
- Choosing a Mode - Decision guide
- TypeScript SDK - Full TypeScript documentation
- Go SDK - Full Go documentation
- Python SDK - Full Python documentation