Choosing Between Proxy Mode and Gateway Mode

AxonFlow offers two integration modes to fit different requirements. This guide helps you choose the right one for your application.

Quick Decision Guide

Feature Comparison

| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| **Integration** | | |
| Code changes | Minimal (wrap calls) | Moderate (pre-check + audit) |
| Learning curve | Low | Medium |
| Framework support | Good | Best (LangChain, LlamaIndex, etc.) |
| **Performance** | | |
| Latency overhead | ~50-100ms (public) / ~20-40ms (VPC) | ~10-20ms |
| Request flow | Your App → AxonFlow → LLM | Your App → LLM (direct) |
| **Features** | | |
| Policy enforcement | Automatic | Automatic (pre-check) |
| Audit logging | 100% automatic | Manual (call audit API) |
| Response filtering | Yes (PII redaction) | No |
| Rate limiting | Automatic | Automatic (pre-check) |
| **Control** | | |
| LLM provider | AxonFlow routes | You choose |
| Model selection | Limited | Full control |
| Request modification | Limited | Full control |

When to Choose Proxy Mode

Ideal Use Cases

  1. Greenfield Applications

    • Starting fresh with AI features
    • Want governance from day one
    • No existing LLM integration to preserve
  2. Simple Chatbots

    • Customer support chat
    • Internal Q&A assistants
    • Document summarization
  3. Compliance-Heavy Industries

    • Healthcare (HIPAA)
    • Finance (SOX, PCI)
    • Need 100% automatic audit coverage
  4. Response Filtering Required

    • PII detection and redaction in responses
    • Content moderation
    • Output sanitization

Code Example

// Proxy Mode - Simple, automatic governance
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }]
  });
});
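
If a policy blocks a request, your application still needs a graceful failure path. The sketch below is illustrative only: it assumes axonflow.protect rejects when the wrapped call is blocked or fails, and the error handling shown is not a documented error shape.

// Illustrative sketch: assumes axonflow.protect rejects when a policy blocks
// the wrapped call. The fallback message is a placeholder.
async function answer(prompt) {
  try {
    const response = await axonflow.protect(async () => {
      return openai.chat.completions.create({
        model: 'gpt-4',
        messages: [{ role: 'user', content: prompt }]
      });
    });
    return response.choices[0].message.content;
  } catch (err) {
    // Degrade gracefully: log for operators, return a friendly message to users.
    console.error('LLM request blocked or failed:', err);
    return 'Sorry, this request could not be completed.';
  }
}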

When to Choose Gateway Mode

Ideal Use Cases

  1. Framework Integrations

    • LangChain, LlamaIndex, CrewAI
    • Custom agent frameworks
    • Multi-step reasoning pipelines
  2. Performance-Critical Applications

    • Real-time chat with sub-100ms latency
    • High-frequency trading analysis
    • Interactive code generation
  3. Multi-Provider Applications

    • Using multiple LLM providers
    • Provider failover logic (see the failover sketch after the code example below)
    • Cost optimization routing
  4. Existing Applications

    • Already have LLM integration
    • Don't want to change LLM call patterns
    • Just need governance hooks

Code Example

// Gateway Mode - Full control, lowest latency
// 1. Pre-check
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: token,
  query: prompt,
  dataSources: ['postgres']
});

if (!ctx.approved) throw new Error(ctx.blockReason);

// 2. Direct LLM call
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: prompt }]
});

// 3. Audit
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.choices[0].message.content?.substring(0, 100),
  provider: 'openai',
  model: 'gpt-4',
  tokenUsage: { ... },
  latencyMs
});
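
The audit call above elides tokenUsage and latencyMs. A minimal sketch of one way to capture them, using Date.now() around the direct call and the OpenAI SDK's response.usage counters; the tokenUsage property names passed to AxonFlow are assumptions, not a documented schema.

// Sketch: time the direct call and map OpenAI's usage counters into the audit.
// The tokenUsage property names below are assumptions, not a documented schema.
const startedAt = Date.now();

const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: prompt }]
});

const latencyMs = Date.now() - startedAt;

await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.choices[0].message.content?.substring(0, 100),
  provider: 'openai',
  model: 'gpt-4',
  tokenUsage: {
    promptTokens: response.usage?.prompt_tokens,
    completionTokens: response.usage?.completion_tokens,
    totalTokens: response.usage?.total_tokens
  },
  latencyMs
});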
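
Use case 3 above calls out provider failover. Because Gateway Mode leaves the LLM call in your application, failover is ordinary application code between the pre-check and the audit. The sketch below reuses ctx, openai, and prompt from the example above; the fallback model is purely illustrative, and a second provider client (Anthropic, Bedrock, etc.) could be swapped in the same way.

// Sketch: model failover between the pre-check and the audit.
// The model names are illustrative; any secondary client works the same way.
const startedAt = Date.now();
let modelUsed = 'gpt-4';
let response;

try {
  response = await openai.chat.completions.create({
    model: modelUsed,
    messages: [{ role: 'user', content: prompt }]
  });
} catch (primaryError) {
  // Primary call failed or timed out: retry once on the fallback model.
  modelUsed = 'gpt-3.5-turbo';
  response = await openai.chat.completions.create({
    model: modelUsed,
    messages: [{ role: 'user', content: prompt }]
  });
}

// Audit with whichever model actually served the request.
// (tokenUsage omitted here; capture it as in the previous sketch.)
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.choices[0].message.content?.substring(0, 100),
  provider: 'openai',
  model: modelUsed,
  latencyMs: Date.now() - startedAt
});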

Latency Breakdown

| Mode | Overhead | Notes |
|---|---|---|
| Proxy (Public) | 50-100ms | Full request proxying |
| Proxy (VPC) | 20-40ms | Same-region deployment |
| Gateway | 10-20ms | Audit is non-blocking |

Migration Path

Proxy → Gateway

If you start with Proxy Mode and need lower latency later:

// Before: Proxy Mode
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});

// After: Gateway Mode
const ctx = await axonflow.getPolicyApprovedContext({ ... });
const response = await openai.chat.completions.create({ ... });
await axonflow.auditLLMCall({ contextId: ctx.contextId, ... });

Gateway → Proxy

If you want simpler code with automatic features:

// Before: Gateway Mode
const ctx = await axonflow.getPolicyApprovedContext({ ... });
const response = await openai.chat.completions.create({ ... });
await axonflow.auditLLMCall({ ... });

// After: Proxy Mode
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});

Hybrid Approach

You can use both modes in the same application:

// Use Proxy Mode for simple chat
app.post('/api/chat', async (req, res) => {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: req.body.prompt }]
    });
  });
  res.json(response);
});

// Use Gateway Mode for performance-critical analytics
app.post('/api/analyze', async (req, res) => {
  const ctx = await axonflow.getPolicyApprovedContext({
    userToken: req.headers.authorization,
    query: req.body.query,
    dataSources: ['postgres', 'snowflake']
  });

  if (!ctx.approved) {
    return res.status(403).json({ error: ctx.blockReason });
  }

  // Direct LLM call for lowest latency
  const response = await openai.chat.completions.create({ ... });

  // Fire-and-forget audit
  axonflow.auditLLMCall({ contextId: ctx.contextId, ... }).catch(console.error);

  res.json(response);
});

Decision Matrix

| Requirement | Recommended Mode |
|---|---|
| Fastest integration | Proxy |
| Lowest latency | Gateway |
| Response filtering | Proxy |
| Framework integration (LangChain) | Gateway |
| 100% automatic audit | Proxy |
| Multi-provider routing | Gateway |
| Compliance reporting | Either (both have full audit) |
| Existing LLM code | Gateway |
| New project | Proxy (start simple) |

Summary

  • Start with Proxy Mode if you're new to AxonFlow or building a simple application
  • Use Gateway Mode if you need the lowest latency, integrate with frameworks, or need full control over your LLM calls
  • You can migrate between modes as your requirements evolve
  • A hybrid approach works well for applications with mixed requirements

Next Steps