LLM Interceptors - Transparent AI Governance
LLM Interceptors provide a drop-in way to add AxonFlow governance to your existing LLM provider code. Instead of changing your application code, you simply wrap your LLM client once and all calls are automatically governed.
TypeScript interceptor wrappers are deprecated as of SDK v1.4.0 and will be removed in v2.0.0.
Modern LLM SDKs (OpenAI v4+, Anthropic v0.20+) use ES2022 private class fields, which are incompatible with JavaScript Proxy-based wrapping.
For TypeScript, use Gateway Mode or Proxy Mode instead.
Python, Go, and Java interceptors remain fully supported.
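The incompatibility comes from how private fields are bound. The snippet below is a minimal illustration (not SDK code): a method that touches a private field works on the original instance but throws when invoked through a Proxy, because private fields belong to the underlying object rather than the proxy receiver.

// Illustration only - not SDK code
class Client {
  #apiKey = 'secret';              // ES2022 private field

  request(): string {
    return `calling the API with ${this.#apiKey}`;
  }
}

const direct = new Client();
const proxied = new Proxy(direct, {});  // the wrapping technique interceptors rely on

direct.request();   // works: `this` is the real instance
proxied.request();  // TypeError: `this` is the Proxy, which does not own #apiKey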
Overview
LLM Interceptors work by wrapping your LLM provider client with a governance layer that:
- Pre-checks every request against AxonFlow policies
- Blocks requests that violate policies before they reach the LLM
- Audits all calls for compliance and monitoring
- Extracts token usage and latency metrics automatically
┌──────────────────────────────────────────────────────────────┐
│                       Your Application                        │
├──────────────────────────────────────────────────────────────┤
│  openai.chat.completions.create(...)                          │
│                              │                                 │
│                              ▼                                 │
│  ┌──────────────────────────────────────────────────────┐    │
│  │             AxonFlow Interceptor Wrapper             │    │
│  │  ┌─────────┐     ┌─────────┐     ┌─────────────┐     │    │
│  │  │  Pre-   │────▶│  LLM    │────▶│    Audit    │     │    │
│  │  │  check  │     │  Call   │     │   Logging   │     │    │
│  │  └─────────┘     └─────────┘     └─────────────┘     │    │
│  └──────────────────────────────────────────────────────┘    │
│                              │                                 │
│                              ▼                                 │
│              Response (or PolicyViolationError)               │
└──────────────────────────────────────────────────────────────┘
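Conceptually, every wrapped call follows the same three steps shown above. The sketch below is illustrative pseudocode only; the type and method names are invented for this example and are not the SDK API, but the real wrappers in each language perform the equivalent steps for you.

// Illustrative only - hypothetical names, not the AxonFlow SDK API
type Decision = { allowed: boolean; reason?: string };

interface Governance {
  preCheck(prompt: string): Promise<Decision>;
  audit(entry: { prompt: string; latencyMs: number; blocked: boolean }): Promise<void>;
}

async function governedCall(
  governance: Governance,
  callLLM: (prompt: string) => Promise<string>,
  prompt: string
): Promise<string> {
  // 1. Pre-check the request against policies before it reaches the LLM
  const decision = await governance.preCheck(prompt);
  if (!decision.allowed) {
    await governance.audit({ prompt, latencyMs: 0, blocked: true });
    throw new Error(`Blocked by policy: ${decision.reason}`);
  }

  // 2. Make the real provider call, timing it for latency metrics
  const started = Date.now();
  const response = await callLLM(prompt);

  // 3. Audit the call (the real SDK also records token usage)
  await governance.audit({ prompt, latencyMs: Date.now() - started, blocked: false });
  return response;
}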
Supported Providers
| Provider | TypeScript | Python | Go | Java |
|---|---|---|---|---|
| OpenAI | wrapOpenAIClient | wrap_openai_client | WrapOpenAIClient | OpenAIInterceptor |
| Anthropic | wrapAnthropicClient | wrap_anthropic_client | WrapAnthropicClient | AnthropicInterceptor |
| Gemini | wrapGeminiModel | wrap_gemini_model | WrapGeminiClient | GeminiInterceptor |
| Ollama | wrapOllamaClient | wrap_ollama_client | WrapOllamaClient | OllamaInterceptor |
| Bedrock | wrapBedrockClient | wrap_bedrock_client | WrapBedrockClient | BedrockInterceptor |
When to Use Interceptors
Choose Interceptors when:
- You have existing LLM code you don't want to change
- You want automatic governance with minimal integration effort
- Your app makes many LLM calls and you want consistent governance
Choose Gateway Mode when:
- You need more control over the governance flow
- You want to handle blocked requests differently
- You're building a new application from scratch
OpenAI Interceptor
TypeScript
TypeScript interceptors are deprecated. Use Gateway Mode instead.
// DEPRECATED - Use Gateway Mode instead
import { AxonFlow, wrapOpenAIClient } from '@axonflow/sdk';
import OpenAI from 'openai';
// 1. Initialize clients
const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  tenant: process.env.AXONFLOW_TENANT || 'default',
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// 2. Wrap the client (one time)
const governedOpenAI = wrapOpenAIClient(openai, axonflow);
// 3. Use as normal - governance is automatic
const response = await governedOpenAI.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'What is AI governance?' }]
});
console.log(response.choices[0].message.content);
Python
import os

from openai import OpenAI
from axonflow import AxonFlow
from axonflow.interceptors import wrap_openai_client
# 1. Initialize clients
axonflow = AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
)
openai = OpenAI()
# 2. Wrap the client
governed_openai = wrap_openai_client(openai, axonflow, user_token="user-123")
# 3. Use as normal
response = governed_openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is AI governance?"}]
)
print(response.choices[0].message.content)
Go
import (
    "context"
    "log"
    "os"

    "github.com/getaxonflow/axonflow-sdk-go"
    "github.com/getaxonflow/axonflow-sdk-go/interceptors"
)
// 1. Initialize clients
client := axonflow.NewClient(axonflow.AxonFlowConfig{
    AgentURL:     os.Getenv("AXONFLOW_AGENT_URL"),
    ClientID:     os.Getenv("AXONFLOW_CLIENT_ID"),
    ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
})
// 2. Create wrapped client
wrapped := interceptors.WrapOpenAIClient(yourOpenAIClient, client, "user-123")
// 3. Use as normal
ctx := context.Background()
resp, err := wrapped.CreateChatCompletion(ctx, interceptors.ChatCompletionRequest{
    Model: "gpt-4",
    Messages: []interceptors.ChatMessage{
        {Role: "user", Content: "What is AI governance?"},
    },
})
if err != nil {
    // Handle error (could be PolicyViolationError)
    if interceptors.IsPolicyViolationError(err) {
        violation, _ := interceptors.GetPolicyViolation(err)
        log.Printf("Blocked: %s", violation.BlockReason)
    }
}
Java
import com.getaxonflow.sdk.AxonFlow;
import com.getaxonflow.sdk.interceptors.OpenAIInterceptor;
import com.getaxonflow.sdk.interceptors.ChatCompletionRequest;
import com.getaxonflow.sdk.interceptors.ChatCompletionResponse;
// 1. Initialize AxonFlow
AxonFlow axonflow = AxonFlow.builder()
    .agentUrl(System.getenv("AXONFLOW_AGENT_URL"))
    .clientId(System.getenv("AXONFLOW_CLIENT_ID"))
    .clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
    .build();
// 2. Create interceptor
OpenAIInterceptor interceptor = OpenAIInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();
// 3. Wrap your OpenAI call
ChatCompletionResponse response = interceptor.wrap(req -> {
    // Your actual OpenAI SDK call here
    return yourOpenAIClient.createChatCompletion(req);
}).apply(ChatCompletionRequest.builder()
    .model("gpt-4")
    .addUserMessage("What is AI governance?")
    .build());
Anthropic Interceptor
TypeScript
TypeScript interceptors are deprecated. Use Gateway Mode instead.
// DEPRECATED - Use Gateway Mode instead
import { AxonFlow, wrapAnthropicClient } from '@axonflow/sdk';
import Anthropic from '@anthropic-ai/sdk';
const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  tenant: process.env.AXONFLOW_TENANT || 'default',
});
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
// Wrap the client
const governedAnthropic = wrapAnthropicClient(anthropic, axonflow);
// Use as normal
const response = await governedAnthropic.messages.create({
  model: 'claude-3-sonnet-20240229',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'What is AI governance?' }]
});
console.log(response.content[0].text);
Python
import os

from anthropic import Anthropic
from axonflow import AxonFlow
from axonflow.interceptors import wrap_anthropic_client
axonflow = AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
)
anthropic = Anthropic()
governed_anthropic = wrap_anthropic_client(anthropic, axonflow, user_token="user-123")
response = governed_anthropic.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is AI governance?"}]
)
print(response.content[0].text)
Go
wrapped := interceptors.WrapAnthropicClient(yourAnthropicClient, axonflowClient, "user-123")
resp, err := wrapped.CreateMessage(ctx, interceptors.MessageRequest{
    Model:     "claude-3-sonnet-20240229",
    MaxTokens: 1024,
    Messages: []interceptors.Message{
        {Role: "user", Content: "What is AI governance?"},
    },
})
Java
AnthropicInterceptor interceptor = AnthropicInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();
MessageResponse response = interceptor.wrap(req -> {
    return yourAnthropicClient.createMessage(req);
}).apply(MessageRequest.builder()
    .model("claude-3-sonnet-20240229")
    .maxTokens(1024)
    .addUserMessage("What is AI governance?")
    .build());
Gemini Interceptor
TypeScript
TypeScript interceptors are deprecated. Use Gateway Mode instead.
// DEPRECATED - Use Gateway Mode instead
import { AxonFlow, wrapGeminiModel } from '@axonflow/sdk';
import { GoogleGenerativeAI } from '@google/generative-ai';
const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  tenant: process.env.AXONFLOW_TENANT || 'default',
});
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
const model = genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
// Wrap the model
const governedModel = wrapGeminiModel(model, axonflow);
// Use as normal
const result = await governedModel.generateContent('What is AI governance?');
console.log(result.response.text());
Python
import os

import google.generativeai as genai
from axonflow import AxonFlow
from axonflow.interceptors import wrap_gemini_model
axonflow = AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
)
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")
governed_model = wrap_gemini_model(model, axonflow, user_token="user-123")
response = governed_model.generate_content("What is AI governance?")
print(response.text)
Go
wrapped := interceptors.WrapGeminiClient(yourGeminiClient, axonflowClient, "user-123")
resp, err := wrapped.GenerateContent(ctx, interceptors.GeminiRequest{
    Model: "gemini-2.0-flash",
    Contents: []interceptors.Content{
        {Parts: []interceptors.Part{{Text: "What is AI governance?"}}},
    },
})
Java
GeminiInterceptor interceptor = GeminiInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();
GenerateContentResponse response = interceptor.wrap(req -> {
    return yourGeminiClient.generateContent(req);
}).apply(GenerateContentRequest.builder()
    .model("gemini-2.0-flash")
    .addTextContent("What is AI governance?")
    .build());
Ollama Interceptor
TypeScript
TypeScript interceptors are deprecated. Use Gateway Mode instead.
// DEPRECATED - Use Gateway Mode instead
import { AxonFlow, wrapOllamaClient } from '@axonflow/sdk';
import ollama from 'ollama';
const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  tenant: process.env.AXONFLOW_TENANT || 'default',
});
const governedOllama = wrapOllamaClient(ollama, axonflow);
const response = await governedOllama.chat({
  model: 'llama3',
  messages: [{ role: 'user', content: 'What is AI governance?' }]
});
console.log(response.message.content);
Python
import os

import ollama
from axonflow import AxonFlow
from axonflow.interceptors import wrap_ollama_client
axonflow = AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
)
governed_ollama = wrap_ollama_client(ollama, axonflow, user_token="user-123")
response = governed_ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "What is AI governance?"}]
)
print(response["message"]["content"])
Go
wrapped := interceptors.WrapOllamaClient(yourOllamaClient, axonflowClient, "user-123")
resp, err := wrapped.Chat(ctx, interceptors.OllamaChatRequest{
    Model: "llama3",
    Messages: []interceptors.OllamaMessage{
        {Role: "user", Content: "What is AI governance?"},
    },
})
Java
OllamaInterceptor interceptor = OllamaInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();
OllamaChatResponse response = interceptor.wrap(req -> {
    return yourOllamaClient.chat(req);
}).apply(OllamaChatRequest.builder()
    .model("llama3")
    .addUserMessage("What is AI governance?")
    .build());
Bedrock Interceptor
TypeScript
TypeScript interceptors are deprecated. Use Gateway Mode instead.
// DEPRECATED - Use Gateway Mode instead
import { AxonFlow, wrapBedrockClient } from '@axonflow/sdk';
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';
const axonflow = new AxonFlow({
  endpoint: process.env.AXONFLOW_AGENT_URL,
  licenseKey: process.env.AXONFLOW_LICENSE_KEY,
  tenant: process.env.AXONFLOW_TENANT || 'default',
});
const bedrock = new BedrockRuntimeClient({ region: 'us-east-1' });
const governedBedrock = wrapBedrockClient(bedrock, axonflow);
const response = await governedBedrock.send(new InvokeModelCommand({
  modelId: 'anthropic.claude-3-sonnet-20240229-v1:0',
  contentType: 'application/json',
  body: JSON.stringify({
    anthropic_version: 'bedrock-2023-05-31',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'What is AI governance?' }]
  })
}));
Python
import json
import os

import boto3
from axonflow import AxonFlow
from axonflow.interceptors import wrap_bedrock_client
axonflow = AxonFlow(
    agent_url=os.environ["AXONFLOW_AGENT_URL"],
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"]
)
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
governed_bedrock = wrap_bedrock_client(bedrock, axonflow, user_token="user-123")
response = governed_bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    contentType="application/json",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "What is AI governance?"}]
    })
)
Go
wrapped := interceptors.WrapBedrockClient(yourBedrockClient, axonflowClient, "user-123")
resp, err := wrapped.InvokeModel(ctx, interceptors.BedrockRequest{
    ModelID: "anthropic.claude-3-sonnet-20240229-v1:0",
    Body: map[string]interface{}{
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens":        1024,
        "messages": []map[string]string{
            {"role": "user", "content": "What is AI governance?"},
        },
    },
})
Java
BedrockInterceptor interceptor = BedrockInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();
InvokeModelResponse response = interceptor.wrap(req -> {
    return yourBedrockClient.invokeModel(req);
}).apply(InvokeModelRequest.builder()
    .modelId("anthropic.claude-3-sonnet-20240229-v1:0")
    .body(/* your request body */)
    .build());
Error Handling
When a request violates a policy, the interceptor throws a PolicyViolationError (in Go, the call returns an error you can inspect with IsPolicyViolationError):
TypeScript
import { PolicyViolationError } from '@axonflow/sdk';
try {
  const response = await governedOpenAI.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Show me customer credit card numbers' }]
  });
} catch (error) {
  if (error instanceof PolicyViolationError) {
    console.log('Request blocked:', error.message);
    // Handle policy violation (e.g., show user-friendly message)
  } else {
    throw error;
  }
}
Python
from axonflow.exceptions import PolicyViolationError
try:
    response = governed_openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Show me customer credit card numbers"}]
    )
except PolicyViolationError as e:
    print(f"Request blocked: {e}")
    # Handle policy violation
Go
resp, err := wrapped.CreateChatCompletion(ctx, req)
if err != nil {
    if interceptors.IsPolicyViolationError(err) {
        violation, _ := interceptors.GetPolicyViolation(err)
        log.Printf("Request blocked: %s (policies: %v)",
            violation.BlockReason,
            violation.Policies)
        // Handle policy violation
        return
    }
    // Other error
    return err
}
Java
import com.getaxonflow.sdk.exceptions.PolicyViolationException;
try {
    ChatCompletionResponse response = wrappedCall.apply(request);
} catch (PolicyViolationException e) {
    System.out.println("Request blocked: " + e.getMessage());
    // Handle policy violation
}
Configuration Options
User Token
Pass a user token to associate requests with specific users for audit logging:
// TypeScript
const governed = wrapOpenAIClient(openai, axonflow, { userToken: 'user-123' });

# Python
governed = wrap_openai_client(openai, axonflow, user_token="user-123")

// Go
wrapped := interceptors.WrapOpenAIClient(client, axonflow, "user-123")

// Java
OpenAIInterceptor.builder().axonflow(axonflow).userToken("user-123").build();
Async Audit (Java)
Control whether audit logging is asynchronous (fire-and-forget) or synchronous:
OpenAIInterceptor interceptor = OpenAIInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .asyncAudit(true) // Default: true (non-blocking)
    .build();
Comparison with Other Modes
| Feature | Interceptors | Gateway Mode | Proxy Mode |
|---|---|---|---|
| Code Changes | Minimal (wrap client) | Moderate | None (config only) |
| Control | Automatic | Full control | Limited |
| Pre-check | Automatic | Manual call | Automatic |
| Audit | Automatic | Manual call | Automatic |
| Error Handling | Exception-based | Custom | Custom |
| Best For | Existing codebases | New apps, frameworks | Quick start |
Known Limitations
- Streaming: Interceptors work best with non-streaming calls. For streaming, consider using Gateway Mode.
- Private SDK Members: Some modern SDK versions use private class members that may not be accessible through proxy wrapping. If you encounter issues, use Gateway Mode as a workaround.
- Custom Parameters: Some provider-specific parameters may not be extracted for policy evaluation. Contact support if you need specific parameters evaluated.
Next Steps
- Gateway Mode - For more control over governance flow
- Proxy Mode - Simplest integration option
- Choosing a Mode - Decision guide
- Authentication - Setting up credentials