Proxy Mode - Zero-Code AI Governance
Proxy Mode is the simplest way to add governance to your AI applications. Send your queries to AxonFlow and it handles everything: policy enforcement, LLM routing, PII detection, and audit logging.
Proxy Mode is the only mode that supports custom dynamic policies. Policies created in the Customer Portal UI or via the API are fully enforced in Proxy Mode.
This includes:
- Custom content policies
- Role-based access control
- Rate limiting rules
- Custom PII patterns
- Policy versioning and testing
If you need custom policies beyond the built-in security patterns, Proxy Mode is required.
Prerequisites
| Language | Minimum Version | SDK Package | Install |
|---|---|---|---|
| TypeScript | Node.js 18+ | @axonflow/sdk v3.8.0 | npm install @axonflow/sdk |
| Python | 3.9+ | axonflow v3.8.0 | pip install axonflow |
| Go | 1.21+ | github.com/getaxonflow/axonflow-sdk-go/v3 v3.8.0 | go get github.com/getaxonflow/axonflow-sdk-go/v3 |
| Java | 11+ | com.getaxonflow:axonflow-sdk v3.8.0 | Add to pom.xml or build.gradle |
You also need:
- A running AxonFlow Agent (local Docker or SaaS endpoint)
- AXONFLOW_CLIENT_ID (required) and AXONFLOW_CLIENT_SECRET (optional for community mode)
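For local development, these can be set as environment variables. The values below are placeholders; the localhost endpoint matches the default used in the Python quick start on this page:

```shell
# Placeholder values - substitute your own Agent endpoint and credentials.
export AXONFLOW_ENDPOINT="http://localhost:8080"      # local Docker Agent
export AXONFLOW_CLIENT_ID="your-client-id"            # required
export AXONFLOW_CLIENT_SECRET="your-client-secret"    # optional for community mode
```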
How It Works
Key Benefits:
- You don't manage LLM API keys - AxonFlow routes to configured providers
- Automatic audit trail for every request
- Response filtering for PII before it reaches your app
- One API call for policy check + LLM execution + audit
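Conceptually, one proxied call performs three steps on the server side: policy check, LLM routing, audit. The sketch below illustrates that flow with stubbed-in functions; every name here is a hypothetical stand-in, not AxonFlow's actual implementation:

```typescript
// Illustrative sketch of what a single proxied call does server-side.
// All names are hypothetical stand-ins, not AxonFlow internals.
type ProxyResult = { blocked: boolean; blockReason?: string; data?: string };

function proxyExecute(
  query: string,
  checkPolicy: (q: string) => string | null, // returns a reason if blocked
  callLLM: (q: string) => string,            // provider routing + LLM call
  audit: (entry: object) => void,            // audit sink
): ProxyResult {
  const reason = checkPolicy(query);         // 1. policy check comes first
  if (reason !== null) {
    audit({ query, blocked: true, reason }); // blocked calls are audited too
    return { blocked: true, blockReason: reason };
  }
  const data = callLLM(query);               // 2. route to the LLM
  audit({ query, blocked: false });          // 3. log for compliance
  return { blocked: false, data };
}
```

The key point the sketch captures: the policy check happens before the LLM is ever called, and the audit entry is written whether or not the request was blocked.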
Quick Start
TypeScript
import { AxonFlow } from '@axonflow/sdk'; // v3.2.0+
const axonflow = new AxonFlow({
endpoint: process.env.AXONFLOW_ENDPOINT,
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
});
// Single call handles: policy check → LLM routing → audit
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'What are the benefits of AI governance?',
requestType: 'chat',
context: {
provider: 'openai',
model: 'gpt-4',
},
});
if (response.blocked) {
console.log('Blocked:', response.blockReason);
} else if (response.success) {
console.log('Response:', response.data);
}
Python
from axonflow import AxonFlow # v3.2.0+
import os
async with AxonFlow(
endpoint=os.environ.get("AXONFLOW_ENDPOINT", "http://localhost:8080"),
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ.get("AXONFLOW_CLIENT_SECRET")  # Optional for community
) as client:
response = await client.execute_query(
user_token="user-123",
query="What are the benefits of AI governance?",
request_type="chat",
context={
"provider": "openai",
"model": "gpt-4"
}
)
if response.blocked:
print(f"Blocked: {response.block_reason}")
else:
print(f"Response: {response.data}")
Go
import (
    "fmt"
    "log"
    "os"

    "github.com/getaxonflow/axonflow-sdk-go/v3" // v3.2.0+
)
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: os.Getenv("AXONFLOW_ENDPOINT"),
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
})
response, err := client.ExecuteQuery(
"user-123", // userToken
"What are the benefits of AI governance?", // query
"chat", // requestType
map[string]interface{}{ // context
"provider": "openai",
"model": "gpt-4",
},
)
if err != nil {
log.Fatal(err)
}
if response.Blocked {
fmt.Printf("Blocked: %s\n", response.BlockReason)
} else {
fmt.Printf("Response: %v\n", response.Data)
}
Java
import com.getaxonflow.sdk.AxonFlow; // v3.2.0+
import com.getaxonflow.sdk.ExecuteQueryRequest;
import com.getaxonflow.sdk.ExecuteQueryResponse;
import java.util.Map;
AxonFlow client = AxonFlow.builder()
.endpoint(System.getenv("AXONFLOW_ENDPOINT"))
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.build();
ExecuteQueryResponse response = client.executeQuery(
ExecuteQueryRequest.builder()
.userToken("user-123")
.query("What are the benefits of AI governance?")
.requestType("chat")
.context(Map.of(
"provider", "openai",
"model", "gpt-4"
))
.build()
);
if (response.isBlocked()) {
System.out.println("Blocked: " + response.getBlockReason());
} else {
System.out.println("Response: " + response.getData());
}
When to Use Proxy Mode
Best For
- Greenfield projects - Starting fresh with AI governance
- Simple integrations - Single API call for everything
- Response filtering - Automatic PII detection and redaction
- 100% audit coverage - Every call automatically logged
- Multi-provider routing - AxonFlow handles LLM provider selection
Example Use Cases
| Scenario | Why Proxy Mode |
|---|---|
| Customer support chatbot | Simple, automatic audit trail |
| Internal Q&A assistant | Zero-code governance |
| Document summarization | Response filtering for PII |
| Code generation | Block prompt injection attacks |
Response Handling
All SDKs return a consistent response structure:
interface ExecuteQueryResponse {
success: boolean; // True if LLM call succeeded
blocked: boolean; // True if blocked by policy
blockReason?: string; // Why it was blocked
data?: string; // LLM response content
policyInfo?: {
policiesEvaluated: string[]; // Policies that were checked
contextId: string; // Audit context ID
};
tokenUsage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
}
Example: Handling Different Cases
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: userInput,
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - show user-friendly message
console.log('Request blocked:', response.blockReason);
// Log which policies triggered
if (response.policyInfo?.policiesEvaluated) {
console.log('Policies:', response.policyInfo.policiesEvaluated.join(', '));
}
} else if (response.success) {
// Success - display the response
console.log('AI Response:', response.data);
// Track token usage for billing
if (response.tokenUsage) {
console.log(`Tokens used: ${response.tokenUsage.totalTokens}`);
}
} else {
// Error (network, LLM provider issue, etc.)
console.error('Request failed');
}
LLM Provider Configuration
In Proxy Mode, AxonFlow routes requests to your configured LLM providers. Specify the provider in the context:
// OpenAI
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'openai', model: 'gpt-4' },
});
// Anthropic
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'anthropic', model: 'claude-sonnet-4' },
});
// Google Gemini
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Ollama (self-hosted)
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'ollama', model: 'llama3.2' },
});
// AWS Bedrock
const response = await axonflow.executeQuery({
query: 'Hello!',
context: { provider: 'bedrock', model: 'anthropic.claude-sonnet-4-20250514-v1:0' },
});
Features
1. Automatic Policy Enforcement
All requests are checked against your policies before reaching the LLM:
// PII is blocked before reaching OpenAI
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Process payment for SSN 123-45-6789',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// response.blocked = true
// response.blockReason = "PII detected: US Social Security Number"
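The built-in detectors run server-side, so your application never needs pattern-matching code. As a rough illustration only (not AxonFlow's actual patterns), a check for the SSN in the example above can be as simple as:

```typescript
// Illustrative only: a minimal US SSN pattern. AxonFlow's real PII
// detection runs server-side and covers many more formats.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/;

function containsSSN(text: string): boolean {
  return SSN_PATTERN.test(text);
}
```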
2. Automatic Audit Logging
Every request is logged for compliance - no additional code needed:
const response = await axonflow.executeQuery({
userToken: 'user-123', // User identifier for audit
query: 'Analyze this data...',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
// Audit automatically includes:
// - Timestamp
// - User token
// - Request content (sanitized)
// - Response summary
// - Policies evaluated
// - Token usage
// - Latency
3. Response Filtering (Enterprise)
PII in LLM responses can be automatically redacted:
const response = await axonflow.executeQuery({
query: "What is John Smith's email?",
context: { provider: 'openai', model: 'gpt-4' },
});
// response.data = "The customer's email is [EMAIL REDACTED]"
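Conceptually, the redaction resembles a pattern replacement like the sketch below. This is illustrative only; the Enterprise feature runs server-side and is not implemented this way:

```typescript
// Illustrative email redaction - not AxonFlow's actual implementation.
const EMAIL_PATTERN = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g;

function redactEmails(text: string): string {
  return text.replace(EMAIL_PATTERN, "[EMAIL REDACTED]");
}
```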
4. Multi-Provider Routing
Route to different providers based on request type or load:
// Fast queries to Gemini
const fastResponse = await axonflow.executeQuery({
query: 'Quick question',
context: { provider: 'gemini', model: 'gemini-2.0-flash' },
});
// Complex reasoning to Claude
const complexResponse = await axonflow.executeQuery({
query: 'Analyze this complex document...',
context: { provider: 'anthropic', model: 'claude-opus-4' },
});
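Routing decisions like the two calls above can live in your application code. A minimal client-side heuristic might look like the following; the length threshold and model choices are assumptions for illustration:

```typescript
// Hypothetical routing heuristic: short prompts go to a fast model,
// long ones to a stronger model. Threshold and models are assumptions.
function pickRoute(query: string): { provider: string; model: string } {
  return query.length < 200
    ? { provider: "gemini", model: "gemini-2.0-flash" }
    : { provider: "anthropic", model: "claude-opus-4" };
}
```

The returned object can be passed directly as the `context` of `executeQuery`.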
Latency Considerations
Proxy Mode adds latency because requests go through AxonFlow:
| Deployment | Additional Latency |
|---|---|
| SaaS endpoint | ~50-100ms |
| In-VPC deployment | ~10-20ms |
For latency-sensitive applications, consider Gateway Mode.
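When configuring client timeouts (see the Configuration section), it can help to budget for the proxy overhead explicitly. The formula and safety margin below are assumptions for illustration, not a recommendation from AxonFlow:

```typescript
// Rough timeout budget: expected LLM latency plus worst-case proxy
// overhead, with a 50% safety margin. All inputs are illustrative.
function timeoutBudgetMs(llmMs: number, proxyOverheadMs: number): number {
  return Math.ceil((llmMs + proxyOverheadMs) * 1.5);
}
```

For example, a typical 2s LLM call through the SaaS endpoint (~100ms overhead) would budget `timeoutBudgetMs(2000, 100)`, i.e. about 3.2s.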
Comparison with Gateway Mode
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| Integration Effort | Minimal | Moderate |
| Code Changes | Single executeQuery() call | Pre-check + LLM call + Audit |
| Latency Overhead | Higher (~50-100ms) | Lower (~10-20ms) |
| Static Policies (PII, SQL injection) | ✅ Yes | ✅ Yes |
| Dynamic Policies (custom rules) | ✅ Yes | ❌ No |
| Response Filtering | ✅ Yes (automatic) | ❌ No |
| Audit Coverage | 100% automatic | Manual |
| LLM API Keys | Managed by AxonFlow | Managed by you |
| LLM Control | AxonFlow routes | You call directly |
| Best For | Custom policies, full governance | Built-in policies only, lowest latency |
See Choosing a Mode for detailed guidance.
Error Handling
TypeScript
import { AxonFlow, PolicyViolationError, AuthenticationError, RateLimitError } from '@axonflow/sdk';
try {
const response = await axonflow.executeQuery({
userToken: 'user-123',
query: 'Hello!',
requestType: 'chat',
context: { provider: 'openai', model: 'gpt-4' },
});
if (response.blocked) {
// Policy violation - handle gracefully
showUserMessage('Your request was blocked: ' + response.blockReason);
} else if (response.success) {
displayResponse(response.data);
}
} catch (error) {
if (error instanceof AuthenticationError) {
// 401 - Invalid credentials
console.error('Authentication failed - check AXONFLOW_CLIENT_ID and AXONFLOW_CLIENT_SECRET');
} else if (error instanceof RateLimitError) {
// 429 - Too many requests
console.error(`Rate limited. Retry after ${error.resetAt}`);
} else if (error.code === 'ECONNREFUSED') {
// Cannot reach AxonFlow Agent
console.error('Cannot reach AxonFlow - check your endpoint URL and that the Agent is running');
} else if (error.code === 'TIMEOUT' || error.name === 'TimeoutError') {
// Request exceeded timeout
console.error('Request timed out - consider increasing timeout or check Agent health');
} else {
console.error('Unexpected error:', error.message);
}
}
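Rate-limit errors are often worth retrying with backoff. The pattern below is a sketch, not part of the SDK; `isRetryable` is a hypothetical predicate you would supply yourself (for example, `e => e instanceof RateLimitError`), and the delay schedule is an assumption:

```typescript
// Exponential backoff schedule: 500ms, 1s, 2s, ... capped at 30s.
function backoffDelays(retries: number, baseMs = 500): number[] {
  return Array.from({ length: retries }, (_, i) =>
    Math.min(baseMs * 2 ** i, 30_000));
}

// Retry wrapper: runs `fn` up to `attempts` times, sleeping between
// failures. Only errors the caller marks retryable are retried.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (e: unknown) => boolean,
  attempts = 3,
): Promise<T> {
  const delays = backoffDelays(attempts - 1); // no sleep after the last try
  for (let i = 0; ; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i >= delays.length || !isRetryable(e)) throw e;
      await new Promise(resolve => setTimeout(resolve, delays[i]));
    }
  }
}
```

Usage would wrap the `executeQuery` call, e.g. `withRetry(() => axonflow.executeQuery({...}), e => e instanceof RateLimitError)`.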
Go
resp, err := client.ExecuteQuery("user-123", query, "chat", context)
if err != nil {
// Network errors, timeouts, authentication failures
log.Printf("Request failed: %v", err)
return
}
if resp.Blocked {
// Policy violation
log.Printf("Blocked: %s (policies: %v)", resp.BlockReason, resp.PolicyInfo.PoliciesEvaluated)
return
}
if !resp.Success {
// Downstream LLM provider error
log.Printf("LLM call failed: %s", resp.Error)
return
}
fmt.Printf("Result: %v\n", resp.Data)
Python
try:
response = await client.execute_query(
user_token="user-123",
query="Hello!",
request_type="chat",
context={"provider": "openai", "model": "gpt-4"}
)
if response.blocked:
print(f"Blocked: {response.block_reason}")
elif response.success:
print(f"Response: {response.data}")
except axonflow.AuthenticationError:
print("Authentication failed - check credentials")
except axonflow.RateLimitError as e:
print(f"Rate limited - retry after {e.reset_at}")
except ConnectionError:
print("Cannot reach AxonFlow Agent - check endpoint")
except TimeoutError:
print("Request timed out")
Configuration
TypeScript
const axonflow = new AxonFlow({
endpoint: 'https://your-axonflow.example.com',
clientId: process.env.AXONFLOW_CLIENT_ID,
clientSecret: process.env.AXONFLOW_CLIENT_SECRET, // Optional for community
tenant: 'your-tenant-id',
debug: false,
timeout: 30000,
});
Python
client = AxonFlow(
endpoint="https://your-axonflow.example.com",
client_id=os.environ["AXONFLOW_CLIENT_ID"],
client_secret=os.environ["AXONFLOW_CLIENT_SECRET"],
tenant="your-tenant-id",
timeout=30.0,
)
Go
client := axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: "https://your-axonflow.example.com",
ClientID: os.Getenv("AXONFLOW_CLIENT_ID"),
ClientSecret: os.Getenv("AXONFLOW_CLIENT_SECRET"),
Tenant: "your-tenant-id",
Timeout: 30 * time.Second,
})
Java
AxonFlow client = AxonFlow.builder()
.endpoint("https://your-axonflow.example.com")
.clientId(System.getenv("AXONFLOW_CLIENT_ID"))
.clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
.tenant("your-tenant-id")
.timeout(Duration.ofSeconds(30))
.build();
Next Steps
- Gateway Mode - For lowest latency with your own LLM calls
- Choosing a Mode - Decision guide
- LLM Interceptors - Wrapper functions for LLM clients (Python, Go, Java)
- TypeScript SDK - Full TypeScript documentation
- Python SDK - Full Python documentation
- Go SDK - Full Go documentation
- Java SDK - Full Java documentation
