# Choosing Between Proxy Mode and Gateway Mode
AxonFlow offers two integration modes to fit different requirements. This guide helps you choose the right one for your application.
## Quick Decision Guide

- Choose **Proxy Mode** for the fastest integration, 100% automatic audit logging, and response filtering.
- Choose **Gateway Mode** for the lowest latency, framework integrations, and full control over providers and models.

### Feature Comparison
| Feature | Proxy Mode | Gateway Mode |
|---|---|---|
| **Integration** | | |
| Code changes | Minimal (wrap calls) | Moderate (pre-check + audit) |
| Learning curve | Low | Medium |
| Framework support | Good | Best (LangChain, LlamaIndex, etc.) |
| **Performance** | | |
| Latency overhead | ~50-100ms (public) / ~20-40ms (VPC) | ~10-20ms |
| Request flow | Your App → AxonFlow → LLM | Your App → LLM (direct) |
| **Features** | | |
| Policy enforcement | Automatic | Automatic (pre-check) |
| Audit logging | 100% automatic | Manual (call audit API) |
| Response filtering | Yes (PII redaction) | No |
| Rate limiting | Automatic | Automatic (pre-check) |
| **Control** | | |
| LLM provider | AxonFlow routes | You choose |
| Model selection | Limited | Full control |
| Request modification | Limited | Full control |
## When to Choose Proxy Mode

### Ideal Use Cases
- **Greenfield Applications**
  - Starting fresh with AI features
  - Want governance from day one
  - No existing LLM integration to preserve
- **Simple Chatbots**
  - Customer support chat
  - Internal Q&A assistants
  - Document summarization
- **Compliance-Heavy Industries**
  - Healthcare (HIPAA)
  - Finance (SOX, PCI)
  - Need 100% automatic audit coverage
- **Response Filtering Required**
  - PII detection and redaction in responses
  - Content moderation
  - Output sanitization
### Code Example
```typescript
// Proxy Mode - Simple, automatic governance
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }]
  });
});
```
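If a policy blocks the wrapped call, your application still needs to handle the failure. The sketch below reuses the variables from the example above and assumes that `protect` rejects when a request is blocked; the exact error shape is an assumption, not documented here.

```typescript
// Sketch: handling a blocked request in Proxy Mode.
// Assumption: axonflow.protect rejects (throws) when a policy blocks the call.
try {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({
      model: 'gpt-4',
      messages: [{ role: 'user', content: prompt }]
    });
  });
  console.log(response.choices[0].message.content);
} catch (err) {
  // The error class and fields are illustrative; consult your SDK version for the real shape.
  console.error('Request blocked or failed:', err);
}
```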
## When to Choose Gateway Mode

### Ideal Use Cases
- **Framework Integrations** (a LangChain sketch follows the code example below)
  - LangChain, LlamaIndex, CrewAI
  - Custom agent frameworks
  - Multi-step reasoning pipelines
- **Performance-Critical Applications**
  - Real-time chat with sub-100ms latency
  - High-frequency trading analysis
  - Interactive code generation
- **Multi-Provider Applications** (a failover sketch follows the Latency Breakdown below)
  - Using multiple LLM providers
  - Provider failover logic
  - Cost optimization routing
- **Existing Applications**
  - Already have LLM integration
  - Don't want to change LLM call patterns
  - Just need governance hooks
### Code Example
```typescript
// Gateway Mode - Full control, lowest latency

// 1. Pre-check
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: token,
  query: prompt,
  dataSources: ['postgres']
});
if (!ctx.approved) throw new Error(ctx.blockReason);

// 2. Direct LLM call
const response = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: prompt }]
});

// 3. Audit
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.choices[0].message.content?.substring(0, 100),
  provider: 'openai',
  model: 'gpt-4',
  tokenUsage: { ... },
  latencyMs
});
```
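Framework integrations such as LangChain (one of the use cases above) follow the same three steps; only the direct LLM call in the middle changes. The sketch below is illustrative: it assumes the `@langchain/openai` package, reuses the AxonFlow calls shown above, and picks an arbitrary model name and latency measurement.

```typescript
import { ChatOpenAI } from '@langchain/openai';

// 1. Pre-check with AxonFlow before running the chain
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: token,
  query: prompt,
  dataSources: ['postgres']
});
if (!ctx.approved) throw new Error(ctx.blockReason);

// 2. Direct call through LangChain (no proxying)
const model = new ChatOpenAI({ model: 'gpt-4o' }); // model name is illustrative
const started = Date.now();
const result = await model.invoke(prompt);

// 3. Audit with the same fields as the plain OpenAI example
// (tokenUsage omitted here; include it as above if your framework exposes it)
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: String(result.content).substring(0, 100),
  provider: 'openai',
  model: 'gpt-4o',
  latencyMs: Date.now() - started
});
```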
## Latency Breakdown
| Mode | Overhead | Notes |
|---|---|---|
| Proxy (Public) | 50-100ms | Full request proxying |
| Proxy (VPC) | 20-40ms | Same-region deployment |
| Gateway | 10-20ms | Audit is non-blocking |
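Because Gateway Mode leaves the LLM call in your hands, the provider failover and cost-based routing listed in the Gateway use cases stay ordinary application code. The sketch below keeps things simple by falling back between the two OpenAI models already used in this guide; a second provider's SDK would slot into the fallback branch the same way. The retry logic and model choices are illustrative, not a recommendation.

```typescript
// Sketch: model failover in Gateway Mode. Pre-check once, then try a
// primary model and fall back to a cheaper one; audit whichever model answered.
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: token,
  query: prompt,
  dataSources: ['postgres']
});
if (!ctx.approved) throw new Error(ctx.blockReason);

async function callWithFallback() {
  try {
    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [{ role: 'user', content: prompt }]
    });
    return { model: 'gpt-4', response };
  } catch {
    // Fallback: a cheaper model; another provider's SDK would go here instead
    const response = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: prompt }]
    });
    return { model: 'gpt-3.5-turbo', response };
  }
}

const { model, response } = await callWithFallback();

// Audit records whichever model actually served the request
// (tokenUsage and latencyMs omitted; include them as in the earlier example)
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.choices[0].message.content?.substring(0, 100),
  provider: 'openai',
  model
});
```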
## Migration Path

### Proxy → Gateway
If you start with Proxy Mode and need lower latency later:
```typescript
// Before: Proxy Mode
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});

// After: Gateway Mode
const ctx = await axonflow.getPolicyApprovedContext({ ... });
const response = await openai.chat.completions.create({ ... });
await axonflow.auditLLMCall({ contextId: ctx.contextId, ... });
```
### Gateway → Proxy
If you want simpler code with automatic features:
```typescript
// Before: Gateway Mode
const ctx = await axonflow.getPolicyApprovedContext({ ... });
const response = await openai.chat.completions.create({ ... });
await axonflow.auditLLMCall({ ... });

// After: Proxy Mode
const response = await axonflow.protect(async () => {
  return openai.chat.completions.create({ ... });
});
```
## Hybrid Approach
You can use both modes in the same application:
```typescript
// Use Proxy Mode for simple chat
app.post('/api/chat', async (req, res) => {
  const response = await axonflow.protect(async () => {
    return openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: req.body.prompt }]
    });
  });
  res.json(response);
});

// Use Gateway Mode for performance-critical analytics
app.post('/api/analyze', async (req, res) => {
  const ctx = await axonflow.getPolicyApprovedContext({
    userToken: req.headers.authorization,
    query: req.body.query,
    dataSources: ['postgres', 'snowflake']
  });

  if (!ctx.approved) {
    return res.status(403).json({ error: ctx.blockReason });
  }

  // Direct LLM call for lowest latency
  const response = await openai.chat.completions.create({ ... });

  // Fire-and-forget audit
  axonflow.auditLLMCall({ contextId: ctx.contextId, ... }).catch(console.error);

  res.json(response);
});
```
## Decision Matrix
| Requirement | Recommended Mode |
|---|---|
| Fastest integration | Proxy |
| Lowest latency | Gateway |
| Response filtering | Proxy |
| Framework integration (LangChain) | Gateway |
| 100% automatic audit | Proxy |
| Multi-provider routing | Gateway |
| Compliance reporting | Either (full audit trail in both; Gateway requires the explicit audit call) |
| Existing LLM code | Gateway |
| New project | Proxy (start simple) |
## Summary
- Start with **Proxy Mode** if you're new to AxonFlow or building a simple application
- Use **Gateway Mode** if you need the lowest latency, integrate with frameworks such as LangChain, or need full control over the LLM call
- You can migrate between modes as your requirements evolve
- A hybrid approach works well for applications with mixed requirements