Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Send your queries to AxonFlow and it handles everything: policy enforcement, LLM routing, PII detection, and audit logging.
Proxy Mode is the only mode that supports custom dynamic policies. Policies created in the Customer Portal UI or via the API are fully enforced in Proxy Mode.
This includes:
- Custom content policies
- Role-based access control
- Rate limiting rules
- Custom PII patterns
- Policy versioning and testing
If you need custom policies beyond the built-in security patterns, Proxy Mode is required.
Prerequisites
| Language | Minimum Version | SDK Package | Install |
|---|---|---|---|
| TypeScript | Node.js 18+ | @axonflow/sdk v3.8.0 | npm install @axonflow/sdk |
| Python | 3.9+ | axonflow v3.8.0 | pip install axonflow |
| Go | 1.21+ | github.com/getaxonflow/axonflow-sdk-go/v3 v3.8.0 | go get github.com/getaxonflow/axonflow-sdk-go/v3 |
| Java | 11+ | com.getaxonflow:axonflow-sdk v3.8.0 | Add to pom.xml or build.gradle |
You also need:
- A running AxonFlow Agent (local Docker or SaaS endpoint)
- AXONFLOW_CLIENT_ID (required) and AXONFLOW_CLIENT_SECRET (optional for community mode)
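For local development, these can be set as environment variables. The values below are placeholders; the localhost endpoint matches the default used in the Python quick start on this page:

```shell
# Placeholder values - substitute your own Agent endpoint and credentials.
export AXONFLOW_ENDPOINT="http://localhost:8080"      # local Docker Agent
export AXONFLOW_CLIENT_ID="your-client-id"            # required
export AXONFLOW_CLIENT_SECRET="your-client-secret"    # optional for community mode
```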
How It Works
Key Benefits:
- You don't manage LLM API keys - AxonFlow routes to configured providers
- Automatic audit trail for every request
- Response filtering for PII before it reaches your app
- One API call for policy check + LLM execution + audit
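Conceptually, one proxied call performs three steps on the server side: policy check, LLM routing, audit. The sketch below illustrates that flow with stubbed-in functions; every name here is a hypothetical stand-in, not AxonFlow's actual implementation:

```typescript
// Illustrative sketch of what a single proxied call does server-side.
// All names are hypothetical stand-ins, not AxonFlow internals.
type ProxyResult = { blocked: boolean; blockReason?: string; data?: string };

function proxyExecute(
  query: string,
  checkPolicy: (q: string) => string | null, // returns a reason if blocked
  callLLM: (q: string) => string,            // provider routing + LLM call
  audit: (entry: object) => void,            // audit sink
): ProxyResult {
  const reason = checkPolicy(query);         // 1. policy check comes first
  if (reason !== null) {
    audit({ query, blocked: true, reason }); // blocked calls are audited too
    return { blocked: true, blockReason: reason };
  }
  const data = callLLM(query);               // 2. route to the LLM
  audit({ query, blocked: false });          // 3. log for compliance
  return { blocked: false, data };
}
```

The key point the sketch captures: the policy check happens before the LLM is ever called, and the audit entry is written whether or not the request was blocked.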
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk'; // v3.2.0+
const axonflow = new AxonFlow({
endpoint: process.env.AXONFLOW_ENDPOINT,
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
});
// Single call handles: policy check → LLM routing → audit
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'What are the benefits of AI governance?',
requestType: 'chat',
context: {
provider: 'openai',
model: 'gpt-4',
},
});
if (response.blocked) {
console.log('Blocked:', response.blockReason);
} else if (response.success) {
console.log('Response:', response.data);
}
Python
from axonflow import AxonFlow # v3.2.0+
import os
async with AxonFlow(
endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ.get("AXONFLOW_CLIENT_SECRET")  # Optional for community
) as client:
response = await client.execute_query(
user_token="user-123",
query="What are the benefits of AI governance?",
request_type="chat",
context={
"provider": "openai",
"model": "gpt-4"
}
)
if response.blocked:
print(f"Blocked: {response.block_reason}")
else:
print(f"Response: {response.data}")
Go
import (
    "fmt"
    "log"
    "os"

    "github.com/getaxonflow/axonflow-sdk-go/v3" // v3.2.0+
)
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: os.Getenv("AXONFLOW_ENDPOINT"),
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
})
response, err := client.ExecuteQuery(
"user-123", // userToken
"What are the benefits of AI governance?", // query
"chat", // requestType
map[string]interface{}{ // context
"provider": "openai",
"model": "gpt-4",
},
)
if err != nil {
log.Fatal(err)
}
if response.Blocked {
fmt.Printf("Blocked: %s\n", response.BlockReason)
} else {
fmt.Printf("Response: %v\n", response.Data)
}
Java
import com.getaxonflow.sdk.AxonFlow; // v3.2.0+
import com.getaxonflow.sdk.ExecuteQueryRequest;
import com.getaxonflow.sdk.ExecuteQueryResponse;
import java.util.Map;
AxonFlow client = AxonFlow.builder()
.endpoint(System.getenv("AXONFLOW_ENDPOINT"))
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.build();
ExecuteQueryResponse response = client.executeQuery(
ExecuteQueryRequest.builder()
.userToken("user-123")
.query("What are the benefits of AI governance?")
.requestType("chat")
.context(Map.of(
"provider", "openai",
"model", "gpt-4"
))
.build()
);
if (response.isBlocked()) {
System.out.println("Blocked: " + response.getBlockReason());
} else {
System.out.println("Response: " + response.getData());
}
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Single API call for everything
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Multi-provider routing - AxonFlow handles LLM provider selection
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
Response Handling
All SDKs return a consistent response structure:
interface ExecuteQueryResponse {
success: boolean; // True if LLM call succeeded
blocked: boolean; // True if blocked by policy
blockReason?: string; // Why it was blocked
data?: string; // LLM response content
policyInfo?: {
policiesEvaluated: string[]; // Policies that were checked
contextId: string; // Audit context ID
};
tokenUsage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
}
Example: Handling Different Cases
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: userInput,
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - show user-friendly message
console.log('Request blocked:', response.blockReason);
// Log which policies triggered
if (response.policyInfo?.policiesEvaluated) {
console.log('Policies:', response.policyInfo.policiesEvaluated.join(', '));
}
} else if (response.success) {
// Success - display the response
console.log('AI Response:', response.data);
// Track token usage for billing
if (response.tokenUsage) {
console.log(`Tokens used: ${response.tokenUsage.totalTokens}`);
}
} else {
// Error (network, LLM provider issue, etc.)
console.error('Request failed');
}
LLM Provider Configuration
In Proxy Mode, AxonFlow routes requests to your configured LLM providers. Specify the provider in the context:
// OpenAI
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'openai', model: 'gpt-4' },
});
// Anthropic
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'anthropic', model: 'claude-sonnet-4' },
});
// Google Gemini
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Ollama (self-hosted)
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'ollama', model: 'llama3.2' },
});
// AWS Bedrock
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'bedrock', model: 'anthropic.claude-sonnet-4-20250514-v1:0' },
});
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// PII is blocked before reaching OpenAI
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Process payment for SSN 123-45-6789',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// response.blocked = true
// response.blockReason = "PII detected: US Social Security Number"
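The built-in detectors run server-side, so your application never needs pattern-matching code. As a rough illustration only (not AxonFlow's actual patterns), a check for the SSN in the example above can be as simple as:

```typescript
// Illustrative only: a minimal US SSN pattern. AxonFlow's real PII
// detection runs server-side and covers many more formats.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/;

function containsSSN(text: string): boolean {
  return SSN_PATTERN.test(text);
}
```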
2. Automatic Audit Logging
Every request is logged for compliance - no additional code needed:
const response = await axonflow.executeQuery({
userToken: 'user-123', // User identifier for audit
query: 'Analyze this data...',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// Audit automatically includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
// - Latency
3. Response Filtering (Enterprise)
PII in LLM responses can be automatically redacted:
const response = await axonflow.executeQuery({
query: "What is John Smith's email?",
context: { provider: 'openai', model: 'gpt-4' },
});
// response.data = "The customer's email is [EMAIL REDACTED]"
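Conceptually, the redaction resembles a pattern replacement like the sketch below. This is illustrative only; the Enterprise feature runs server-side and is not implemented this way:

```typescript
// Illustrative email redaction - not AxonFlow's actual implementation.
const EMAIL_PATTERN = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g;

function redactEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, "[EMAIL REDACTED]");
}
```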
4. Multi-Provider Routing
Route to different providers based on request type or load:
// Fast queries to Gemini
const fastResponse = await axonflow.executeQuery({
query: 'Quick question',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Complex reasoning to Claude
const complexResponse = await axonflow.executeQuery({
query: 'Analyze this complex document...',
context: { provider: 'anthropic', model: 'claude-opus-4' },
});
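Routing decisions like the two calls above can live in your application code. A minimal client-side heuristic might look like the following; the length threshold and model choices are assumptions for illustration:

```typescript
// Hypothetical routing heuristic: short prompts go to a fast model,
// long ones to a stronger model. Threshold and models are assumptions.
function pickRoute(query: string): { provider: string; model: string } {
  return query.length < 200
    ? { provider: "gemini", model: "gemini-2.0-flash" }
    : { provider: "anthropic", model: "claude-opus-4" };
}
```

The returned object can be passed directly as the `context` of `executeQuery`.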
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| SaaS endpoint | ~50-100ms |
| In-VPC deployment | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
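When configuring client timeouts (see the Configuration section), it can help to budget for the proxy overhead explicitly. The formula and safety margin below are assumptions for illustration, not a recommendation from AxonFlow:

```typescript
// Rough timeout budget: expected LLM latency plus worst-case proxy
// overhead, with a 50% safety margin. All inputs are illustrative.
function timeoutBudgetMs(llmMs: number, proxyOverheadMs: number): number {
  return Math.ceil((llmMs + proxyOverheadMs) * 1.5);
}
```

For example, a typical 2s LLM call through the SaaS endpoint (~100ms overhead) would budget `timeoutBudgetMs(2000, 100)`, i.e. about 3.2s.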
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Single executeQuery() call | Pre-check + LLM call + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Static Policies (PII, SQL injection) | ✅ Yes | ✅ Yes |
| Dynamic Policies (custom rules) | ✅ Yes | ❌ No |
| Response Filtering | ✅ Yes (automatic) | ❌ No |
| Audit Coverage | 100% automatic | Manual |
| LLM API Keys | Managed by AxonFlow | Managed by you |
| LLM Control | AxonFlow routes | You call directly |
| Best For | Custom policies, full governance | Built-in policies only, lowest latency |
See Choosing a Mode for detailed guidance.
Error Handling
TypeScript
import { AxonFlow, PolicyViolationError, AuthenticationError, RateLimitError } from '@axonflow/sdk';
try {
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Hello!',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - handle gracefully
showUserMessage('Your request was blocked: ' + response.blockReason);
} else if (response.success) {
displayResponse(response.data);
}
} catch (error) {
if (error instanceof AuthenticationError) {
// 401 - Invalid credentials
console.error('Authentication failed - check AXONFLOW_CLIENT_ID and AXONFLOW_CLIENT_SECRET');
} else if (error instanceof RateLimitError) {
// 429 - Too many requests
console.error(`Rate limited. Retry after ${error.resetAt}`);
} else if (error.code === 'ECONNREFUSED') {
// Cannot reach AxonFlow Agent
console.error('Cannot reach AxonFlow - check your endpoint URL and that the Agent is running');
} else if (error.code === 'TIMEOUT' || error.name === 'TimeoutError') {
// Request exceeded timeout
console.error('Request timed out - consider increasing timeout or check Agent health');
} else {
console.error('Unexpected error:', error.message);
}
}
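Rate-limit errors are often worth retrying with backoff. The pattern below is a sketch, not part of the SDK; `isRetryable` is a hypothetical predicate you would supply yourself (for example, `e => e instanceof RateLimitError`), and the delay schedule is an assumption:

```typescript
// Exponential backoff schedule: 500ms, 1s, 2s, ... capped at 30s.
function backoffDelays(retries: number, baseMs = 500): number[] {
  return Array.from({ length: retries }, (_, i) =>
    Math.min(baseMs * 2 ** i, 30_000));
}

// Retry wrapper: runs `fn` up to `attempts` times, sleeping between
// failures. Only errors the caller marks retryable are retried.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (e: unknown) => boolean,
  attempts = 3,
): Promise<T> {
  const delays = backoffDelays(attempts - 1); // no sleep after the last try
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i >= delays.length || !isRetryable(e)) throw e;
      await new Promise(resolve => setTimeout(resolve, delays[i]));
    }
  }
}
```

Usage would wrap the `executeQuery` call, e.g. `withRetry(() => axonflow.executeQuery({...}), e => e instanceof RateLimitError)`.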
Go
resp, err := client.ExecuteQuery("user-123", query, "chat", context)
if err != nil {
// Network errors, timeouts, authentication failures
log.Printf("Request failed: %v", err)
return
}
if resp.Blocked {
// Policy violation
log.Printf("Blocked: %s (policies: %v)", resp.BlockReason, resp.PolicyInfo.PoliciesEvaluated)
return
}
if !resp.Success {
// Downstream LLM provider error
log.Printf("LLM call failed: %s", resp.Error)
return
}
fmt.Printf("Result: %v\n", resp.Data)
Python
try:
response = await client.execute_query(
user_token="user-123",
query="Hello!",
request_type="chat",
context={"provider": "openai", "model": "gpt-4"}
)
if response.blocked:
print(f"Blocked: {response.block_reason}")
elif response.success:
print(f"Response: {response.data}")
except axonflow.AuthenticationError:
print("Authentication failed - check credentials")
except axonflow.RateLimitError as e:
print(f"Rate limited - retry after {e.reset_at}")
except ConnectionError:
print("Cannot reach AxonFlow Agent - check endpoint")
except TimeoutError:
print("Request timed out")
Configuration
TypeScript
const axonflow = new AxonFlow({
endpoint: 'https://your-axonflow.example.com',
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
tenant: 'your-tenant-id',
debug: false,
timeout: 30000,
});
Python
client = AxonFlow(
endpoint="https://your-axonflow.example.com",
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
tenant="your-tenant-id",
timeout=30.0,
)
Go
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: "https://your-axonflow.example.com",
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
Tenant: "your-tenant-id",
Timeout: 30 * time.Second,
})
Java
AxonFlow client = AxonFlow.builder()
.endpoint("https://your-axonflow.example.com")
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.tenant("your-tenant-id")
.timeout(Duration.ofSeconds(30))
.build();
Next Steps
- Gateway Mode - For lowest latency with your own LLM calls
- Choosing a Mode - Decision guide
- LLM Interceptors - Wrapper functions for LLM clients (Python, Go, Java)
- TypeScript SDK - Full TypeScript documentation
- Python SDK - Full Python documentation
- Go SDK - Full Go documentation
- Java SDK - Full Java documentation
