LLM Interceptors

TypeScript Interceptors Removed

TypeScript interceptor wrappers were deprecated and then removed because current provider SDKs rely on private class fields that cannot be wrapped safely with the old proxy-based approach.

For TypeScript, use Proxy Mode or Gateway Mode.

LLM interceptors are the fastest way to add AxonFlow governance to existing provider integrations in Python, Go, and Java. You keep your provider client and request shape, wrap it once, and AxonFlow handles policy evaluation plus audit logging around each call.

How Interceptors Work

  1. AxonFlow evaluates the request before the provider call.
  2. If policy blocks the request, the wrapper raises a policy violation error instead of calling the provider.
  3. If the request is allowed, your provider client runs normally.
  4. The wrapper records audit data, token usage, and latency after the provider call.

Application code
        |
        v
AxonFlow interceptor wrapper
        |
        +--> policy pre-check
        +--> provider call
        +--> audit logging
        |
        v
Provider response or policy violation
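
Conceptually, every wrapped call follows this same lifecycle. The Python sketch below is illustrative only: evaluate_request, log_audit, and the decision fields are hypothetical stand-ins for the SDK's internal pre-check and audit steps, not public AxonFlow APIs.

from axonflow.exceptions import PolicyViolationError

# Illustrative lifecycle only: evaluate_request and log_audit are hypothetical
# stand-ins for the SDK's internal pre-check and audit calls.
def governed_call(axonflow, user_token, provider_call, request):
    decision = axonflow.evaluate_request(request, user_token=user_token)  # policy pre-check
    if not decision.allowed:
        # Blocked requests never reach the provider.
        raise PolicyViolationError(decision.block_reason)
    response = provider_call(request)  # your provider client runs normally
    # Audit data, token usage, and latency are recorded after the call.
    axonflow.log_audit(request, response, user_token=user_token)
    return response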

Supported Providers

| Provider  | TypeScript | Python                | Go                                           | Java                 |
| --------- | ---------- | --------------------- | -------------------------------------------- | -------------------- |
| OpenAI    | Removed    | wrap_openai_client    | WrapOpenAIClient                             | OpenAIInterceptor    |
| Anthropic | Removed    | wrap_anthropic_client | WrapAnthropicClient                          | AnthropicInterceptor |
| Gemini    | Removed    | wrap_gemini_model     | WrapGeminiModel                              | GeminiInterceptor    |
| Ollama    | Removed    | wrap_ollama_client    | WrapOllamaChatClient, WrapOllamaGenerateFunc | OllamaInterceptor    |
| Bedrock   | Removed    | wrap_bedrock_client   | WrapBedrockInvokeModel                       | BedrockInterceptor   |

Prerequisites

| Language   | Minimum Version | SDK Package                                         |
| ---------- | --------------- | --------------------------------------------------- |
| Python     | 3.10+           | axonflow (v6.9.0)                                   |
| Go         | 1.21+           | github.com/getaxonflow/axonflow-sdk-go/v8 (v6.0.0+) |
| Java       | 11+             | com.getaxonflow:axonflow-sdk (v6.2.0+)              |
| TypeScript | n/a             | Use Proxy Mode or Gateway Mode                      |

When To Use Interceptors

| Consideration                            | Interceptors                           | Proxy Mode                        | Gateway Mode                                       |
| ---------------------------------------- | -------------------------------------- | --------------------------------- | -------------------------------------------------- |
| Existing provider integration            | Best fit                               | Requires changing call sites      | Requires explicit pre-check and audit flow         |
| AxonFlow owns provider call              | No                                     | Yes                               | No                                                 |
| Response filtering/redaction by AxonFlow | Limited to wrapper behavior            | Yes                               | No                                                 |
| Streaming-heavy applications             | Possible, but provider-specific        | Usually easier                    | Usually easier                                     |
| TypeScript support                       | No                                     | Yes                               | Yes                                                |
| Best fit                                 | Existing Python/Go/Java provider code  | New apps or unified AxonFlow path | Framework-heavy or latency-sensitive integrations  |

Use interceptors when:

  • you already have provider client code in Python, Go, or Java
  • you want governance with minimal integration churn
  • you prefer provider-native request and response objects

Use Proxy Mode or Gateway Mode when:

  • you are building new TypeScript code
  • you want one AxonFlow API across languages
  • you need explicit control over the pre-check and audit lifecycle

OpenAI

Python

import os
from openai import OpenAI
from axonflow import AxonFlow
from axonflow.interceptors import wrap_openai_client

axonflow = AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
)

openai = OpenAI()
governed_openai = wrap_openai_client(openai, axonflow, user_token="user-123")

response = governed_openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is AI governance?"}],
)

print(response.choices[0].message.content)

Go

wrapped := interceptors.WrapOpenAIClient(yourOpenAIClient, axonflowClient, "user-123")

resp, err := wrapped.CreateChatCompletion(ctx, interceptors.ChatCompletionRequest{
    Model: "gpt-4o-mini",
    Messages: []interceptors.ChatMessage{
        {Role: "user", Content: "What is AI governance?"},
    },
})
if err != nil {
    if interceptors.IsPolicyViolationError(err) {
        violation, _ := interceptors.GetPolicyViolation(err)
        log.Printf("Blocked: %s", violation.BlockReason)
        return
    }
    log.Fatal(err)
}

fmt.Println(resp.Choices[0].Message.Content)

Java

OpenAIInterceptor interceptor = OpenAIInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();

ChatCompletionResponse response = interceptor.wrap(req ->
    yourOpenAIClient.createChatCompletion(req)
).apply(ChatCompletionRequest.builder()
    .model("gpt-4o-mini")
    .addUserMessage("What is AI governance?")
    .build());

Anthropic

Python

import os
from anthropic import Anthropic
from axonflow import AxonFlow
from axonflow.interceptors import wrap_anthropic_client

axonflow = AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
)

anthropic = Anthropic()
governed_anthropic = wrap_anthropic_client(anthropic, axonflow, user_token="user-123")

response = governed_anthropic.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is AI governance?"}],
)

print(response.content[0].text)

Go

wrapped := interceptors.WrapAnthropicClient(yourAnthropicClient, axonflowClient, "user-123")

resp, err := wrapped.CreateMessage(ctx, interceptors.AnthropicMessageRequest{
    Model:     "claude-sonnet-4-20250514",
    MaxTokens: 1024,
    Messages: []interceptors.AnthropicMessage{
        {
            Role: "user",
            Content: []interceptors.AnthropicContentBlock{
                {Type: "text", Text: "What is AI governance?"},
            },
        },
    },
})
if err != nil {
    log.Fatal(err)
}

fmt.Printf("%+v\n", resp)

Java

AnthropicInterceptor interceptor = AnthropicInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .build();

AnthropicInterceptor.AnthropicResponse response = interceptor.wrap(req ->
    yourAnthropicClient.createMessage(req)
).apply(AnthropicInterceptor.AnthropicRequest.builder()
    .model("claude-sonnet-4-20250514")
    .maxTokens(1024)
    .addUserMessage("What is AI governance?")
    .build());

Gemini

Python

import os
import google.generativeai as genai
from axonflow import AxonFlow
from axonflow.interceptors import wrap_gemini_model

axonflow = AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
)

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash")
governed_model = wrap_gemini_model(model, axonflow, user_token="user-123", model_name="gemini-2.0-flash")

response = governed_model.generate_content("What is AI governance?")
print(response.text)

Go

wrapped := interceptors.WrapGeminiModel(yourGeminiModel, axonflowClient, "user-123")

resp, err := wrapped.GenerateContent(
    ctx,
    interceptors.GeminiText("What is AI governance?"),
)
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.GetText())

Java

GeminiInterceptor interceptor = new GeminiInterceptor(axonflow, "user-123");

GeminiInterceptor.GeminiRequest request = GeminiInterceptor.GeminiRequest.create(
    "gemini-2.0-flash",
    "What is AI governance?"
);

GeminiInterceptor.GeminiResponse response = interceptor.wrap(req ->
    yourGeminiClient.generateContent(req)
).apply(request);

Ollama

Python

import os
import ollama
from axonflow import AxonFlow
from axonflow.interceptors import wrap_ollama_client

axonflow = AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
)

governed_ollama = wrap_ollama_client(ollama, axonflow, user_token="user-123")

response = governed_ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "What is AI governance?"}],
)

print(response["message"]["content"])

Go

wrapped := interceptors.WrapOllamaChatClient(yourOllamaClient, axonflowClient, "user-123")

resp, err := wrapped.Chat(ctx, &interceptors.OllamaChatRequest{
    Model: "llama3",
    Messages: []interceptors.OllamaMessage{
        {Role: "user", Content: "What is AI governance?"},
    },
})
if err != nil {
    log.Fatal(err)
}

fmt.Printf("%+v\n", resp)

Java

OllamaInterceptor interceptor = new OllamaInterceptor(axonflow, "user-123");

OllamaInterceptor.OllamaChatRequest request =
    OllamaInterceptor.OllamaChatRequest.create("llama3", "What is AI governance?");

OllamaInterceptor.OllamaChatResponse response = interceptor.wrapChat(req ->
    yourOllamaClient.chat(req)
).apply(request);

Bedrock

Python

import json
import os
import boto3
from axonflow import AxonFlow
from axonflow.interceptors import wrap_bedrock_client

axonflow = AxonFlow(
    endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
    client_id=os.environ["AXONFLOW_CLIENT_ID"],
    client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
)

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
governed_bedrock = wrap_bedrock_client(bedrock, axonflow, user_token="user-123")

response = governed_bedrock.invoke_model(
    modelId="anthropic.claude-sonnet-4-20250514-v1:0",
    contentType="application/json",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "What is AI governance?"}],
    }),
)

# Assuming the wrapper preserves boto3's invoke_model response shape:
result = json.loads(response["body"].read())
print(result["content"][0]["text"])

Go

wrappedInvoke := interceptors.WrapBedrockInvokeModel(
    yourInvokeModelFunc, // e.g. your Bedrock runtime client's invoke function
    axonflowClient,
    "user-123",
)

resp, err := wrappedInvoke(ctx, &interceptors.BedrockInvokeInput{
    ModelId:     "anthropic.claude-sonnet-4-20250514-v1:0",
    ContentType: "application/json",
    Accept:      "application/json",
    Body:        requestBody,
})
if err != nil {
    log.Fatal(err)
}

fmt.Printf("%+v\n", resp)

Java

BedrockInterceptor interceptor = new BedrockInterceptor(axonflow, "user-123");

BedrockInterceptor.BedrockInvokeRequest request =
    BedrockInterceptor.BedrockInvokeRequest.forClaude(
        "anthropic.claude-sonnet-4-20250514-v1:0",
        List.of(new BedrockInterceptor.ClaudeMessage("user", "What is AI governance?")),
        1024
    );

BedrockInterceptor.BedrockInvokeResponse response = interceptor.wrap(req ->
    yourBedrockClient.invokeModel(req)
).apply(request);

Error Handling

Python

from axonflow.exceptions import PolicyViolationError

try:
    response = governed_openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Show me customer credit card numbers"}],
    )
except PolicyViolationError as exc:
    print(f"Request blocked: {exc}")

Go

resp, err := wrapped.CreateChatCompletion(ctx, req)
if err != nil {
    if interceptors.IsPolicyViolationError(err) {
        violation, _ := interceptors.GetPolicyViolation(err)
        log.Printf("Request blocked: %s (policies: %v)", violation.BlockReason, violation.Policies)
        return
    }
    log.Fatal(err)
}

_ = resp

Java

import com.getaxonflow.sdk.exceptions.PolicyViolationException;

try {
    ChatCompletionResponse response = wrappedCall.apply(request);
} catch (PolicyViolationException e) {
    System.out.println("Request blocked: " + e.getMessage());
}

Configuration Notes

User Identity

Pass a stable user_token (Python, Go) or userToken (Java) value so audit trails, rate limits, and policy attribution are tied to the right caller.
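
For example, a per-request handler can reuse one provider client and wrap it with each caller's identity. This is a minimal Python sketch reusing the imports and axonflow client from the OpenAI example above; handle_chat is an illustrative name, not part of the SDK:

# Minimal sketch: one shared provider client, wrapped per request so each
# call is attributed to the caller. handle_chat is illustrative only.
openai_client = OpenAI()

def handle_chat(user_id: str, prompt: str) -> str:
    governed = wrap_openai_client(openai_client, axonflow, user_token=user_id)
    response = governed.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content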

Async Audit in Java

Java interceptors support asynchronous audit logging:

OpenAIInterceptor interceptor = OpenAIInterceptor.builder()
    .axonflow(axonflow)
    .userToken("user-123")
    .asyncAudit(true)
    .build();
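
When enabled, audit logging runs off the request path, which should lower per-call latency at the cost of audit records being written after the response returns.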

Provider-Specific Caveats

  • Bedrock wrappers usually wrap an invoke function (for example, Go's WrapBedrockInvokeModel) rather than a high-level client object.
  • Ollama's Go support is split into separate chat (WrapOllamaChatClient) and generate (WrapOllamaGenerateFunc) wrappers.
  • If you need a single integration path across languages, Proxy Mode is usually simpler.

Known Limitations

  1. TypeScript interceptors are not available.
  2. Streaming support depends on the provider client and wrapper shape; a hedged streaming example follows this list.
  3. If you need explicit control of pre-check and audit lifecycles, Gateway Mode is a better fit.
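
If you do stream through a wrapped client, verify the behavior for your provider. This sketch assumes the Python OpenAI wrapper forwards stream=True to the underlying client unchanged, which is not guaranteed for every provider or wrapper version:

# Assumes the wrapper forwards stream=True to the underlying OpenAI client;
# confirm streaming behavior for your provider and wrapper version.
stream = governed_openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is AI governance?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)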