Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Send your queries to AxonFlow and it handles everything: policy enforcement, LLM routing, PII detection, and audit logging.
Proxy Mode is the only mode that supports custom dynamic policies. Policies created in the Customer Portal UI or via the API are fully enforced in Proxy Mode.
Dynamic policies include:
- Custom content policies
- Role-based access control
- Rate limiting rules
- Custom PII patterns
- Policy versioning and testing
If you need custom policies beyond the built-in security patterns, Proxy Mode is required.
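As a rough sketch of how a custom policy surfaces in code: the call below is the same `executeQuery()` call used throughout this page (assuming a client initialized as in the Quick Start), and the policy name and block reason are hypothetical placeholders for whatever rule you define in the Customer Portal or via the API.
// Sketch: a request evaluated against a hypothetical custom content policy
// (e.g. one created in the Customer Portal named "no-financial-advice").
const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Which stocks should I buy this week?',
  requestType: 'chat',
  context: { provider: 'openai', model: 'gpt-4' },
});

if (response.blocked) {
  // Custom dynamic policies surface exactly like built-in ones
  console.log('Blocked:', response.blockReason);
  console.log('Policies evaluated:', response.policyInfo?.policiesEvaluated);
}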
How It Works
Your application sends each query to AxonFlow with a single call. AxonFlow evaluates your policies, routes the request to the configured LLM provider, and writes an audit entry before returning the response to your app.
Key Benefits:
- You don't manage LLM API keys - AxonFlow routes to configured providers
- Automatic audit trail for every request
- Response filtering for PII before it reaches your app
- One API call for policy check + LLM execution + audit
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk'; // v1.7.1+
const axonflow = new AxonFlow({
endpoint: process.env.AXONFLOW_ENDPOINT,
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
});
// Single call handles: policy check → LLM routing → audit
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'What are the benefits of AI governance?',
requestType: 'chat',
context: {
provider: 'openai',
model: 'gpt-4',
},
});
if (response.blocked) {
console.log('Blocked:', response.blockReason);
} else if (response.success) {
console.log('Response:', response.data);
}
Python
from axonflow import AxonFlow # v0.5.0+
import os
async with AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
) as client:
    response = await client.execute_query(
        user_token="user-123",
        query="What are the benefits of AI governance?",
        request_type="chat",
        context={
            "provider": "openai",
            "model": "gpt-4"
        }
    )
    if response.blocked:
        print(f"Blocked: {response.block_reason}")
    else:
        print(f"Response: {response.data}")
Go
import "github.com/getaxonflow/axonflow-sdk-go" // v1.10.0+
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: os.Getenv("AXONFLOW_ENDPOINT"),
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
})
response, err := client.ExecuteQuery(
"user-123", // userToken
"What are the benefits of AI governance?", // query
"chat", // requestType
map[string]interface{}{ // context
"provider": "openai",
"model": "gpt-4",
},
)
if err != nil {
log.Fatal(err)
}
if response.Blocked {
fmt.Printf("Blocked: %s\n", response.BlockReason)
} else {
fmt.Printf("Response: %v\n", response.Data)
}
Java
import com.getaxonflow.sdk.AxonFlow; // v2.7.1+
import com.getaxonflow.sdk.ExecuteQueryRequest;
import com.getaxonflow.sdk.ExecuteQueryResponse;
import java.util.Map; // needed for Map.of(...) in the context
AxonFlow client = AxonFlow.builder()
.endpoint(System.getenv("AXONFLOW_ENDPOINT"))
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.build();
ExecuteQueryResponse response = client.executeQuery(
ExecuteQueryRequest.builder()
.userToken("user-123")
.query("What are the benefits of AI governance?")
.requestType("chat")
.context(Map.of(
"provider", "openai",
"model", "gpt-4"
))
.build()
);
if (response.isBlocked()) {
System.out.println("Blocked: " + response.getBlockReason());
} else {
System.out.println("Response: " + response.getData());
}
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Single API call for everything
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Multi-provider routing - AxonFlow handles LLM provider selection
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
Response Handling
All SDKs return a consistent response structure:
interface ExecuteQueryResponse {
success: boolean; // True if LLM call succeeded
blocked: boolean; // True if blocked by policy
blockReason?: string; // Why it was blocked
data?: string; // LLM response content
policyInfo?: {
policiesEvaluated: string[]; // Policies that were checked
contextId: string; // Audit context ID
};
tokenUsage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
}
Example: Handling Different Cases
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: userInput,
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - show user-friendly message
console.log('Request blocked:', response.blockReason);
// Log which policies triggered
if (response.policyInfo?.policiesEvaluated) {
console.log('Policies:', response.policyInfo.policiesEvaluated.join(', '));
}
} else if (response.success) {
// Success - display the response
console.log('AI Response:', response.data);
// Track token usage for billing
if (response.tokenUsage) {
console.log(`Tokens used: ${response.tokenUsage.totalTokens}`);
}
} else {
// Error (network, LLM provider issue, etc.)
console.error('Request failed');
}
LLM Provider Configuration
In Proxy Mode, AxonFlow routes requests to your configured LLM providers. Specify the provider in the context:
// OpenAI
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'openai', model: 'gpt-4' },
});
// Anthropic
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'anthropic', model: 'claude-3-sonnet' },
});
// Google Gemini
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Ollama (self-hosted)
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'ollama', model: 'llama2' },
});
// AWS Bedrock
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'bedrock', model: 'anthropic.claude-3-sonnet' },
});
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// PII is blocked before reaching OpenAI
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Process payment for SSN 123-45-6789',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// response.blocked = true
// response.blockReason = "PII detected: US Social Security Number"
2. Automatic Audit Logging
Every request is logged for compliance - no additional code needed:
const response = await axonflow.executeQuery({
userToken: 'user-123', // User identifier for audit
query: 'Analyze this data...',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// Audit automatically includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
// - Latency
3. Response Filtering (Enterprise)
PII in LLM responses can be automatically redacted:
const response = await axonflow.executeQuery({
query: "What is John Smith's email?",
context: { provider: 'openai', model: 'gpt-4' },
});
// response.data = "The customer's email is [EMAIL REDACTED]"
4. Multi-Provider Routing
Route to different providers based on request type or load:
// Fast queries to Gemini
const fastResponse = await axonflow.executeQuery({
query: 'Quick question',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Complex reasoning to Claude
const complexResponse = await axonflow.executeQuery({
query: 'Analyze this complex document...',
context: { provider: 'anthropic', model: 'claude-3-opus' },
});
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| SaaS endpoint | ~50-100ms |
| In-VPC deployment | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
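If you want to verify the overhead in your own environment, you can time a call end to end. This sketch is only a rough measurement: it includes LLM generation time, so it is an upper bound on the routing overhead AxonFlow adds.
// Rough end-to-end timing of a Proxy Mode call. The measured value includes
// LLM generation time, not just AxonFlow's routing overhead.
const start = Date.now();
const response = await axonflow.executeQuery({
  userToken: 'user-123',
  query: 'Quick latency check',
  requestType: 'chat',
  context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
console.log(`Round trip: ${Date.now() - start}ms`);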
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Single executeQuery() call | Pre-check + LLM call + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Static Policies (PII, SQL injection) | ✅ Yes | ✅ Yes |
| Dynamic Policies (custom rules) | ✅ Yes | ❌ No |
| Response Filtering | Yes (automatic) | No |
| Audit Coverage | 100% automatic | Manual |
| LLM API Keys | Managed by AxonFlow | Managed by you |
| LLM Control | AxonFlow routes | You call directly |
| Best For | Custom policies, full governance | Built-in policies only, lowest latency |
See Choosing a Mode for detailed guidance.
Error Handling
try {
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Hello!',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - handle gracefully
showUserMessage('Your request was blocked: ' + response.blockReason);
} else if (response.success) {
// Display the response
displayResponse(response.data);
}
} catch (error) {
// Network/SDK errors
if (error.code === 'ECONNREFUSED') {
console.error('Cannot reach AxonFlow - check your endpoint');
} else if (error.code === 'TIMEOUT') {
console.error('Request timed out');
} else {
console.error('Unexpected error:', error.message);
}
}
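For transient failures (timeouts, brief network issues) a simple retry with backoff is often enough. The sketch below is one possible pattern, not part of the AxonFlow SDK; `handleResponse` is a hypothetical stand-in for your existing blocked/success handling, and the error codes checked are the same ones shown above.
// Hypothetical retry loop for transient failures - not part of the AxonFlow SDK.
// Policy blocks come back as a normal response (blocked = true), so only
// thrown network/timeout errors are retried here.
const maxAttempts = 3;
for (let attempt = 1; attempt <= maxAttempts; attempt++) {
  try {
    const response = await axonflow.executeQuery({
      userToken: 'user-123',
      query: 'Hello!',
      requestType: 'chat',
      context: { provider: 'openai', model: 'gpt-4' },
    });
    handleResponse(response); // your existing blocked/success handling
    break;
  } catch (error) {
    const transient = error.code === 'ECONNREFUSED' || error.code === 'TIMEOUT';
    if (!transient || attempt === maxAttempts) throw error;
    // Exponential backoff: 500ms, 1s, 2s
    await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** (attempt - 1)));
  }
}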
Configuration
TypeScript
const axonflow = new AxonFlow({
endpoint: 'https://your-axonflow.example.com',
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
tenant: 'your-tenant-id',
debug: false,
timeout: 30000,
});
Python
client = AxonFlow(
endpoint="https://your-axonflow.example.com",
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
tenant="your-tenant-id",
timeout=30.0,
)
Go
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: "https://your-axonflow.example.com",
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
Tenant: "your-tenant-id",
Timeout: 30 * time.Second,
})
Java
AxonFlow client = AxonFlow.builder()
.endpoint("https://your-axonflow.example.com")
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.tenant("your-tenant-id")
.timeout(Duration.ofSeconds(30))
.build();
Next Steps
- Gateway Mode - For lowest latency with your own LLM calls
- Choosing a Mode - Decision guide
- LLM Interceptors - Wrapper functions for LLM clients (Python, Go, Java)
- TypeScript SDK - Full TypeScript documentation
- Python SDK - Full Python documentation
- Go SDK - Full Go documentation
- Java SDK - Full Java documentation