Architecture Overview
For a visual walkthrough of AxonFlow's architecture, including how the control plane works, policy enforcement flow, and multi-agent planning: Watch on YouTube
AxonFlow is a control plane for production AI systems. It doesn't replace your agent frameworks — it makes them operable in production.
The Problem: Why Execution Breaks
Most agent frameworks optimize for authoring workflows, not operating them.
Once agents touch real systems, teams hit familiar problems:
- Silent failures — An agent retries a database write 3 times, each with side effects
- No runtime visibility — Which policy blocked the request? What was the LLM's reasoning?
- Permission gaps — Agent accessed customer data it shouldn't have; discovered in prod
- Compliance gaps — No audit trail for AI decisions; fails regulatory review
Gateways aren't enough. API gateways can rate-limit and authenticate, but they don't understand AI workflows. They can't enforce "block if PII detected" or "require approval for high-risk decisions."
AxonFlow treats agents as long-running, stateful systems that require governance, observability, and control at runtime.
Where AxonFlow Sits
AxonFlow sits between your applications and LLM providers, providing policy enforcement and audit at execution time.
┌─────────────┐ ┌─────────────────────────────┐ ┌─────────────────┐
│ │ │ AxonFlow │ │ LLM Providers │
│ Your │ ───▶ │ ┌─────────┐ ┌─────────┐ │ ───▶ │ OpenAI │
│ App │ │ │ Policy │ │ Audit │ │ │ Anthropic │
│ │ ◀─── │ │ Engine │ │ Log │ │ ◀─── │ Gemini │
└─────────────┘ │ └─────────┘ └─────────┘ │ └─────────────────┘
└─────────────────────────────┘
│
▼
┌─────────────────────────────┐
│ MCP Connectors │
│ Postgres, Salesforce, S3 │
└─────────────────────────────┘
AxonFlow provides:
- Policy enforcement — Block PII, SQLi, dangerous queries before they reach LLMs
- Audit logging — Complete trail of every AI decision for compliance
- Cost controls — Budget limits and usage tracking per tenant
Why AxonFlow often becomes the default control plane
Teams typically start by placing AxonFlow in front of a single workflow or agent to evaluate policy enforcement, auditability, and execution control. As usage grows, running some AI workflows through AxonFlow and others outside it tends to create fragmented audit logs, inconsistent policies, and duplicated observability. For this reason, teams that continue using AxonFlow often standardize on it as a single control plane for AI workflows, while retaining their existing orchestration frameworks and execution logic.
Gateway Mode (Recommended for Existing Stacks)
AxonFlow wraps your existing agent framework with pre-execution checks and post-execution audit:
┌──────────────────────────────────────────────────────────────────────────────┐
│ GATEWAY MODE │
│ │
│ ┌─────────┐ ┌─────────────┐ ┌───────────────────────────────┐ │
│ │ │ ───▶ │ AxonFlow │ ───▶ │ Your Agent Framework │ │
│ │ App │ │ Pre-check │ │ (LangChain / CrewAI / etc) │ │
│ │ │ │ │ │ │ │ │
│ └─────────┘ └─────────────┘ │ ▼ │ │
│ ▲ │ ┌─────────┐ │ │
│ │ │ │ LLM │ │ │
│ │ │ │Provider │ │ │
│ │ │ └────┬────┘ │ │
│ │ ┌─────────────┐ │ │ │ │
│ └────────── │ AxonFlow │ ◀─── │ ▼ │ │
│ │ Audit │ │ Response │ │
│ └─────────────┘ └───────────────────────────────┘ │
│ │
│ Flow: Pre-check → Your LLM call → Audit │
│ Latency overhead: ~15ms │
└──────────────────────────────────────────────────────────────────────────────┘
SDK Integration:
# 1. Pre-check: get policy approval before any LLM call
approval = await axonflow.get_policy_approved_context(query)
if not approval.approved:
    raise PolicyViolation(approval.block_reason)

# 2. Your existing LLM call (LangChain, CrewAI, etc.)
response = await your_langchain_agent.run(query)

# 3. Audit: record what happened
await axonflow.audit_llm_call(approval.context_id, response)
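The three calls compose naturally into a single helper. A minimal sketch, reusing the client and PolicyViolation exception from the snippet above; run_agent is a hypothetical stand-in for whatever callable invokes your framework:
async def governed_call(axonflow, run_agent, query):
    # Pre-check: block the request before any LLM call happens
    approval = await axonflow.get_policy_approved_context(query)
    if not approval.approved:
        raise PolicyViolation(approval.block_reason)

    # Execute with your existing framework (LangChain, CrewAI, ...)
    response = await run_agent(query)

    # Audit: tie the outcome back to the approved context
    await axonflow.audit_llm_call(approval.context_id, response)
    return response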
Use when: You already have LangChain/CrewAI/AutoGen agents and want to add governance.
Proxy Mode (Full Control)
AxonFlow handles the complete request lifecycle, including LLM calls:
┌──────────────────────────────────────────────────────────────────────────────┐
│ PROXY MODE │
│ │
│ ┌─────────┐ ┌─────────────────────────────────────────┐ ┌─────┐ │
│ │ │ │ AxonFlow │ │ │ │
│ │ App │ ───▶ │ ┌────────┐ ┌────────┐ ┌───────┐ │ ───▶ │ LLM │ │
│ │ │ │ │ Policy │───▶│ MAP │───▶│Router │ │ │ │ │
│ │ │ │ │ Engine │ │Planning│ │ │ │ │ │ │
│ └─────────┘ │ └────────┘ └────────┘ └───────┘ │ └──┬──┘ │
│ ▲ │ │ │ │
│ │ │ ┌────────┐ ┌────────┐ ┌───────┐ │ │ │
│ └────────── │ │ Audit │◀───│Response│◀───│ Cost │ │ ◀───────┘ │
│ │ │ Log │ │ Check │ │Control│ │ │
│ │ └────────┘ └────────┘ └───────┘ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Flow: App → AxonFlow (everything) → LLM → AxonFlow → App │
│ Full governance: policies, planning, routing, cost, audit │
└──────────────────────────────────────────────────────────────────────────────┘
SDK Integration:
# AxonFlow handles everything
response = await axonflow.execute_query(
    query="Analyze customer sentiment for Q4",
    request_type="analysis",
)
# Policies, LLM routing, cost tracking, audit — all handled
Use when: Building new applications that need full governance from the start.
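The example above doesn't show how a blocked request surfaces to the caller. One plausible shape, assuming the SDK raises the same PolicyViolation used in the Gateway example (an assumption, not documented behavior):
try:
    response = await axonflow.execute_query(
        query=user_query,
        request_type="analysis",
    )
except PolicyViolation as exc:  # assumed error shape, mirroring the Gateway example
    handle_blocked_request(exc)  # hypothetical application-level handler
    raise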
Choosing Between Gateway and Proxy Mode
| Scenario | Recommendation |
|---|---|
| Existing LangChain/CrewAI agents | Gateway Mode |
| Custom agent framework | Gateway Mode |
| Building from scratch | Proxy Mode |
| Need MAP multi-agent planning | Proxy Mode |
| Migration from existing stack | Gateway Mode first, Proxy Mode later |
Control Plane vs Orchestration
| Aspect | Orchestration (LangChain) | Control Plane (AxonFlow) |
|---|---|---|
| Focus | Chain construction, prompts | Runtime governance, audit |
| When | Authoring time | Execution time |
| Concern | "How do I build this workflow?" | "Should this step be allowed to run?" |
| Output | LLM response | Allow/Block decision + audit trail |
Key insight: AxonFlow doesn't compete with LangChain. LangChain runs your workflow; AxonFlow decides whether each step is allowed to proceed.
Execution Model
Policies Before and After Steps
Every step in a workflow passes through policy evaluation: a pre-check before the step runs and an audit record after it completes.
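From the SDK side this maps onto the same pre-check/audit pair used in Gateway Mode, applied per step rather than per request. A minimal sketch, assuming each step exposes a textual query and an async execute callable (both hypothetical names):
async def run_workflow(axonflow, steps):
    results = []
    for step in steps:
        # Before: evaluate policies against this step's input
        approval = await axonflow.get_policy_approved_context(step.query)
        if not approval.approved:
            raise PolicyViolation(approval.block_reason)

        # The step itself (LLM call, connector query, tool invocation)
        result = await step.execute()

        # After: record the outcome against the approved context
        await axonflow.audit_llm_call(approval.context_id, result)
        results.append(result)
    return results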
Two-Phase Policy Model
AxonFlow uses a two-phase policy model for both LLM calls and MCP connector access:
Phase 1 (System Policies): Pattern-based, compiled, in-memory, no DB lookups. Under 10ms P95.
Phase 2 (Tenant Policies): Condition-based, tenant-aware, cached 5 minutes. Under 30ms P95.
| Access Type | Phase 1 (System) | Phase 2 (Tenant) |
|---|---|---|
| LLM - Proxy Mode | ✅ | ✅ |
| LLM - Gateway Mode | ✅ | ❌ (you handle LLM directly) |
| MCP Connectors | ✅ | ✅ (when MCP_DYNAMIC_POLICIES_ENABLED=true) |
Note: MCP connectors are evaluated independently from LLM mode selection. You can use Gateway Mode for LLM calls (lowest latency) while still having full two-phase policy evaluation on MCP connector access.
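Conceptually, the two phases trade expressiveness for latency: Phase 1 is a handful of pre-compiled patterns checked in memory, while Phase 2 evaluates tenant-specific conditions behind a short-lived cache. The sketch below is illustrative only, not AxonFlow's internals; load_tenant_policies and policy.allows are hypothetical:
import re
import time

# Phase 1: pattern-based system policies, compiled once at startup
SYSTEM_PATTERNS = [
    (re.compile(r";\s*DROP\s+TABLE", re.IGNORECASE), "sqli_detected"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "pii_email_detected"),
]

def phase1(query):
    for pattern, reason in SYSTEM_PATTERNS:
        if pattern.search(query):
            return False, reason
    return True, None

# Phase 2: condition-based tenant policies, cached for 5 minutes
_cache = {}

def phase2(tenant_id, query):
    entry = _cache.get(tenant_id)
    if entry is None or time.monotonic() - entry[1] > 300:
        entry = (load_tenant_policies(tenant_id), time.monotonic())  # hypothetical loader
        _cache[tenant_id] = entry
    for policy in entry[0]:
        if not policy.allows(query):  # hypothetical condition check
            return False, policy.name
    return True, None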
Human-in-the-Loop Approvals (Enterprise)
High-risk decisions can require human approval:
Request → Policy Check → HITL Queue → Human Approves → Execute → Audit
│
└── Human Rejects → Block + Audit
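The approval queue implies an asynchronous result. The SDK surface for this isn't shown here; as a hypothetical sketch, a pending decision might be polled like this (method and field names are assumptions, not the documented API):
import asyncio

async def wait_for_approval(axonflow, request_id, poll_seconds=5):
    # Hypothetical polling loop; a real deployment would more likely
    # rely on a webhook or queue notification than on polling
    while True:
        status = await axonflow.get_approval_status(request_id)  # hypothetical method
        if status.state == "approved":
            return True
        if status.state == "rejected":
            return False
        await asyncio.sleep(poll_seconds)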
Audit and Replay
Both modes provide audit logging, but with different depth:
| Aspect | Gateway Mode | Proxy Mode |
|---|---|---|
| Audit Type | Lightweight | Comprehensive (Decision Chain) |
| Captures | Provider, model, tokens, cost | + Policy triggers, risk levels, outcomes |
| Response Storage | SHA-256 hash (privacy) | Full decision tracing |
| Tamper Detection | Basic | input_hash, output_hash, audit_hash |
| Workflow Reconstruction | Not supported | Full replay via chain_id |
| Compliance | Cost tracking | EU AI Act Article 12 |
Gateway Mode is optimized for cost tracking and usage monitoring.
Proxy Mode provides compliance-grade audit trails with step-by-step decision tracing, risk classification, and execution replay.
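Replay implies every step of a workflow is addressable through its chain_id. A hypothetical sketch of walking a decision chain (method and field names are assumptions, not the documented API):
async def reconstruct_workflow(axonflow, chain_id):
    # Hypothetical: fetch the ordered decision chain for one workflow
    chain = await axonflow.get_decision_chain(chain_id)
    for step in chain.steps:
        print(step.policy_triggers, step.risk_level, step.outcome)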
Multi-Agent Planning (MAP)
MAP turns natural language requests into executable workflows:
# Natural language → Workflow
plan = await axonflow.generate_plan("Book cheapest flight to London next Tuesday")
# Returns: [flight-search.search, flight-search.compare, booking.reserve]
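Because a plan is data before it is execution, it can be inspected or gated before anything runs. A short sketch; plan.steps and execute_plan are assumptions, not documented API:
plan = await axonflow.generate_plan("Book cheapest flight to London next Tuesday")

# Inspect the proposed steps before committing to execution
for step in plan.steps:  # assumed attribute
    print(step)

# Hypothetical: execute the approved plan under full policy enforcement
result = await axonflow.execute_plan(plan)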
Core Services
System Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ AXONFLOW COMPONENTS │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ LLM Providers │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ OpenAI │ │Anthropic│ │ Gemini │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Azure │ │ Ollama │ │ │
│ │ └─────────┘ └─────────┘ │ │
│ └──────────────────▲──────────────────────┘ │
│ │ │
│ ┌─────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ │ │ Agent (:8080) │ │Orchestrator(:8081)│ │
│ │ App │─────▶│ │─────▶│ │──────────┤
│ │ │ │ • System Policy │ │ • Tenant Policy │ │
│ └─────────┘ │ • PII Detection │ │ • LLM Routing │ │
│ │ • SQLi Scanning │ │ • Cost Controls │ │
│ │ • Rate Limits │ │ • MAP Planning │ │
│ │ • Gateway APIs │ │ • Execution Replay│ │
│ │ • MCP Handler │ │ │ │
│ └────────┬─────────┘ └─────────┬──────────┘ │
│ │ │ │
│ │ ┌─────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────┐ ┌──────────────────┐ │
│ │ PostgreSQL (:5432) │ │ Redis (:6379) │ │
│ │ │ │ │ │
│ │ • Policies │ │ • Rate Limits │ │
│ │ • Audit Logs │ │ • Policy Cache │ │
│ │ • Cost Budgets │ │ • Session State │ │
│ │ • Tenant Config │ │ │ │
│ └─────────────────────┘ └──────────────────┘ │
│ │
│ docker compose up -d │
└─────────────────────────────────────────────────────────────────────────────┘
Data Flow:
- App → Agent: All requests enter through the Agent on port 8080
- Agent → Orchestrator: In Proxy Mode, approved requests route to Orchestrator for LLM execution
- Agent ↔ Postgres: System policies loaded at startup, audit logs written per-request
- Agent ↔ Redis: Rate limit counters, policy cache (5-min TTL)
- Orchestrator → LLM: Routes to configured providers based on cost, latency, or policy rules
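Since the Agent is the single entry point, a self-hosted client only needs its address. A minimal sketch; the import path and constructor parameters are assumptions, not the documented API:
from axonflow import AxonFlow  # assumed import path

axonflow = AxonFlow(
    base_url="http://localhost:8080",  # the Agent service is the entry point
    api_key="...",                     # credential handling is deployment-specific
)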
Agent Service (:8080)
The Agent service is the primary entry point for policy enforcement.
Responsibilities:
- System policy evaluation (PII, SQLi, rate limits)
- Request validation and authentication
- Gateway Mode pre-check and audit APIs
- MCP connector orchestration
- Audit logging with decision chain
Specifications:
- Default Count: 5 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- P95 Latency: Under 10ms for policy evaluation
Orchestrator Service (:8081)
The Orchestrator service handles LLM routing, tenant policies, and multi-agent coordination.
Responsibilities:
- Tenant policy enforcement
- LLM routing and failover
- Multi-Agent Planning (MAP)
- Cost tracking and budget enforcement
- Execution replay
Specifications:
- Default Count: 10 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- P95 Latency: Under 30ms for tenant policies
What This Enables
Scales Across Frameworks
Gateway Mode means you don't rewrite your agents. Add governance to LangChain today, CrewAI tomorrow, custom agents next month — same control plane.
Governance Emerges Naturally
Policies are defined once, applied everywhere. A "block SSN in queries" policy works whether the request comes from a chatbot, a batch job, or an internal tool.
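For illustration only, a "block SSN in queries" policy reduces to a content check that never inspects the caller; a Phase 1 style pattern might look like:
import re

# US SSN shape: three digits, two digits, four digits
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def violates_ssn_policy(query):
    # Same check whether the query came from a chatbot,
    # a batch job, or an internal tool
    return SSN.search(query) is not None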
Sticky in Production
Once you have audit trails, cost controls, and approval workflows, they become infrastructure. Teams build on top of them; removing them breaks things.
Performance
| Operation | P95 Latency | Notes |
|---|---|---|
| System Policy Evaluation | Under 10ms | In-memory |
| Tenant Policy Evaluation | Under 30ms | Cached |
| Gateway Pre-check | Under 15ms | System + context |
| MCP Connector Query | Under 50ms | Pooled connections |
Infrastructure (Enterprise)
For enterprise deployments, AxonFlow runs entirely within your AWS VPC.
Key Points:
- All data stays within your VPC
- Multi-AZ deployment for high availability
- Private subnets for compute and database
- Secrets Manager for credential storage
Integration Points
LLM Providers
AxonFlow is model-agnostic — any model available through a supported provider works automatically.
| Provider | Community | Enterprise |
|---|---|---|
| OpenAI | Yes | Yes |
| Anthropic | Yes | Yes |
| Azure OpenAI | Yes | Yes |
| Google Gemini | Yes | Yes |
| Ollama (self-hosted) | Yes | Yes |
| AWS Bedrock | No | Yes |
AI Frameworks
- LangChain, LangGraph, LlamaIndex
- CrewAI, AutoGen, DSPy
- Semantic Kernel, Copilot Studio
- Custom integrations via SDK
MCP Connectors
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
- Storage: S3, GCS, Azure Blob
- Enterprise: Salesforce, Slack, Amadeus, Jira, ServiceNow, Snowflake
Next Steps
- Infrastructure Details - CloudFormation resources
- Security Best Practices - Security configuration
- Gateway Mode Guide - Integration with existing frameworks
- Getting Started - Quick setup guide