Architecture Overview

Architecture deep dive (12 min)

For a visual walkthrough of AxonFlow's architecture, including the control plane, the policy enforcement flow, and multi-agent planning: Watch on YouTube

AxonFlow is a control plane for production AI systems. It doesn't replace your agent frameworks — it makes them operable in production.


The Problem: Why Execution Breaks

Most agent frameworks optimize for authoring workflows, not operating them.

Once agents touch real systems, teams hit familiar problems:

  • Silent failures — An agent retries a database write 3 times, each with side effects
  • No runtime visibility — Which policy blocked the request? What was the LLM's reasoning?
  • Permission gaps — Agent accessed customer data it shouldn't have; discovered in prod
  • Compliance gaps — No audit trail for AI decisions; fails regulatory review

Gateways aren't enough. API gateways can rate-limit and authenticate, but they don't understand AI workflows. They can't enforce "block if PII detected" or "require approval for high-risk decisions."

AxonFlow treats agents as long-running, stateful systems that require governance, observability, and control at runtime.


Where AxonFlow Sits

AxonFlow sits between your applications and LLM providers, providing policy enforcement and audit at execution time.

┌─────────────┐      ┌─────────────────────────────┐      ┌─────────────────┐
│             │      │          AxonFlow           │      │  LLM Providers  │
│    Your     │ ───▶ │  ┌─────────┐   ┌─────────┐  │ ───▶ │  OpenAI         │
│    App      │      │  │ Policy  │   │  Audit  │  │      │  Anthropic      │
│             │ ◀─── │  │ Engine  │   │  Log    │  │ ◀─── │  Gemini         │
└─────────────┘      │  └─────────┘   └─────────┘  │      └─────────────────┘
                     └──────────────┬──────────────┘
                                    │
                                    ▼
                     ┌─────────────────────────────┐
                     │       MCP Connectors        │
                     │  Postgres, Salesforce, S3   │
                     └─────────────────────────────┘

AxonFlow provides:

  • Policy enforcement — Block PII, SQLi, dangerous queries before they reach LLMs
  • Audit logging — Complete trail of every AI decision for compliance
  • Cost controls — Budget limits and usage tracking per tenant
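For example, a PII-bearing query can be rejected before it ever reaches a provider. Here is a minimal sketch using the Gateway Mode pre-check API shown later on this page; the query text and the printed message are illustrative:

# Pre-check rejects the query before any LLM call is made.
# The query text is illustrative; block_reason wording may differ.
approval = await axonflow.get_policy_approved_context(
    "Look up the account for SSN 123-45-6789"
)
if not approval.approved:
    print(f"Blocked: {approval.block_reason}")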

Why AxonFlow often becomes the default control plane

Teams typically start by placing AxonFlow in front of a single workflow or agent to evaluate policy enforcement, auditability, and execution control. As usage grows, running some AI workflows through AxonFlow and others outside it tends to create fragmented audit logs, inconsistent policies, and duplicated observability. For this reason, teams that continue using AxonFlow often standardize on it as a single control plane for AI workflows, while retaining their existing orchestration frameworks and execution logic.

Gateway Mode (Wrap Existing Agents)

AxonFlow wraps your existing agent framework with pre-execution checks and post-execution audit:

┌──────────────────────────────────────────────────────────────────────────────┐
│                                 GATEWAY MODE                                 │
│                                                                              │
│   ┌─────────┐      ┌─────────────┐      ┌───────────────────────────────┐   │
│   │         │ ───▶ │  AxonFlow   │ ───▶ │  Your Agent Framework         │   │
│   │   App   │      │  Pre-check  │      │  (LangChain / CrewAI / etc)   │   │
│   │         │      └─────────────┘      │               │               │   │
│   └─────────┘                           │               ▼               │   │
│        ▲                                │          ┌─────────┐          │   │
│        │                                │          │   LLM   │          │   │
│        │                                │          │Provider │          │   │
│        │                                │          └────┬────┘          │   │
│        │           ┌─────────────┐      │               │               │   │
│        └────────── │  AxonFlow   │ ◀─── │               ▼               │   │
│                    │  Audit      │      │           Response            │   │
│                    └─────────────┘      └───────────────────────────────┘   │
│                                                                              │
│   Flow: Pre-check → Your LLM call → Audit                                    │
│   Latency overhead: ~15ms                                                    │
└──────────────────────────────────────────────────────────────────────────────┘

SDK Integration:

# 1. Pre-check: Get policy approval
approval = await axonflow.get_policy_approved_context(query)
if not approval.approved:
    raise PolicyViolation(approval.block_reason)

# 2. Your existing LLM call (LangChain, CrewAI, etc.)
response = await your_langchain_agent.run(query)

# 3. Audit: Record what happened
await axonflow.audit_llm_call(approval.context_id, response)

Use when: You already have LangChain/CrewAI/AutoGen agents and want to add governance.

Proxy Mode (Full Control)

AxonFlow handles the complete request lifecycle, including LLM calls:

┌──────────────────────────────────────────────────────────────────────────────┐
│                                  PROXY MODE                                  │
│                                                                              │
│   ┌─────────┐      ┌──────────────────────────────────────────┐      ┌─────┐ │
│   │         │      │                 AxonFlow                 │      │     │ │
│   │   App   │ ───▶ │  ┌────────┐    ┌────────┐    ┌────────┐  │ ───▶ │ LLM │ │
│   │         │      │  │ Policy │───▶│  MAP   │───▶│ Router │  │      │     │ │
│   └─────────┘      │  │ Engine │    │Planning│    │        │  │      └──┬──┘ │
│        ▲           │  └────────┘    └────────┘    └────────┘  │         │    │
│        │           │  ┌────────┐    ┌────────┐    ┌────────┐  │         │    │
│        └────────── │  │ Audit  │◀───│Response│◀───│  Cost  │  │ ◀───────┘    │
│                    │  │  Log   │    │ Check  │    │Control │  │              │
│                    │  └────────┘    └────────┘    └────────┘  │              │
│                    └──────────────────────────────────────────┘              │
│                                                                              │
│   Flow: App → AxonFlow (everything) → LLM → AxonFlow → App                   │
│   Full governance: policies, planning, routing, cost, audit                  │
└──────────────────────────────────────────────────────────────────────────────┘

SDK Integration:

# AxonFlow handles everything
response = await axonflow.execute_query(
    query="Analyze customer sentiment for Q4",
    request_type="analysis",
)
# Policies, LLM routing, cost tracking, audit — all handled

Use when: Building new applications, need full governance from the start.

Choosing a Mode

Scenario                           Recommendation
Existing LangChain/CrewAI agents   Gateway Mode
Custom agent framework             Gateway Mode
Building from scratch              Proxy Mode
Need MAP multi-agent planning      Proxy Mode
Migration from existing stack      Gateway Mode first, Proxy Mode later

Control Plane vs Orchestration

Aspect    Orchestration (LangChain)         Control Plane (AxonFlow)
Focus     Chain construction, prompts       Runtime governance, audit
When      Authoring time                    Execution time
Concern   "How do I build this workflow?"   "Should this step be allowed to run?"
Output    LLM response                      Allow/Block decision + audit trail

Key insight: AxonFlow doesn't compete with LangChain. LangChain runs your workflow; AxonFlow decides whether each step is allowed to proceed.
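In practice, that means every orchestrated step can be bracketed by a control-plane decision. Here is a rough sketch using the Gateway Mode calls from above; the steps list and run_step function stand in for your framework's own execution logic:

# Each step is pre-checked, executed by your framework, then audited.
# `steps` and `run_step` are illustrative placeholders, not SDK names.
for step in steps:
    approval = await axonflow.get_policy_approved_context(step.prompt)
    if not approval.approved:
        break  # the control plane blocked this step; stop the workflow
    result = await run_step(step)  # your LangChain/CrewAI execution logic
    await axonflow.audit_llm_call(approval.context_id, result)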


Execution Model

Policies Before and After Steps

Every step in a workflow passes through policy evaluation: a pre-execution check before the step runs, and an audit record after it completes.

Two-Phase Policy Model

AxonFlow uses a two-phase policy model for both LLM calls and MCP connector access:

Phase 1 (System Policies): Pattern-based, compiled, in-memory, no DB lookups. Under 10ms P95.

Phase 2 (Tenant Policies): Condition-based, tenant-aware, cached 5 minutes. Under 30ms P95.

Access Type          Phase 1 (System)   Phase 2 (Tenant)
LLM - Proxy Mode     ✅                 ✅
LLM - Gateway Mode   ✅                 ❌ (you handle LLM directly)
MCP Connectors       ✅                 ✅ (when MCP_DYNAMIC_POLICIES_ENABLED=true)

Note: MCP connectors are evaluated independently from LLM mode selection. You can use Gateway Mode for LLM calls (lowest latency) while still having full two-phase policy evaluation on MCP connector access.
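A sketch of that combination, assuming a connector-query method on the client; query_connector is an assumed name for illustration, not a confirmed SDK call:

# Gateway Mode for the LLM call (Phase 1 pre-check only), while MCP access
# gets both phases when MCP_DYNAMIC_POLICIES_ENABLED=true.
# NOTE: query_connector is an assumed method name, shown for illustration.
approval = await axonflow.get_policy_approved_context(query)
rows = await axonflow.query_connector("postgres", "SELECT ...")
response = await your_langchain_agent.run(query)
await axonflow.audit_llm_call(approval.context_id, response)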

Human-in-the-Loop Approvals (Enterprise)

High-risk decisions can require human approval:

Request → Policy Check → HITL Queue ──▶ Human Approves → Execute → Audit
                                    └─▶ Human Rejects  → Block + Audit
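From the SDK's perspective, a request parked in the HITL queue would surface as a not-yet-approved result. A hypothetical sketch; the status value and wait_for_approval helper are assumptions, since the approval-wait mechanism (polling or webhook) is deployment-specific:

# Hypothetical handling of a request held for human approval.
# `status` and `wait_for_approval` are assumed names, for illustration only.
approval = await axonflow.get_policy_approved_context(query)
if getattr(approval, "status", None) == "pending_approval":
    approval = await wait_for_approval(approval.context_id)  # poll or webhook
if not approval.approved:
    raise PolicyViolation(approval.block_reason)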

Audit and Replay

Both modes provide audit logging, but with different depth:

Aspect                    Gateway Mode                    Proxy Mode
Audit Type                Lightweight                     Comprehensive (Decision Chain)
Captures                  Provider, model, tokens, cost   + Policy triggers, risk levels, outcomes
Response Storage          SHA-256 hash (privacy)          Full decision tracing
Tamper Detection          Basic                           input_hash, output_hash, audit_hash
Workflow Reconstruction   Not supported                   Full replay via chain_id
Compliance                Cost tracking                   EU AI Act Article 12

Gateway Mode is optimized for cost tracking and usage monitoring.

Proxy Mode provides compliance-grade audit trails with step-by-step decision tracing, risk classification, and execution replay.
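A sketch of what replay might look like from the SDK; replay_chain and the step fields are assumed names based on the table above, not a confirmed API:

# Hypothetical reconstruction of a workflow from its decision chain.
# `replay_chain` and the step attributes are assumed names.
chain = await axonflow.replay_chain(chain_id="...")
for step in chain.steps:
    print(step.policy_triggers, step.risk_level, step.outcome)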

Multi-Agent Planning (MAP)

MAP turns natural language requests into executable workflows:

# Natural language → Workflow
plan = await axonflow.generate_plan("Book cheapest flight to London next Tuesday")
# Returns: [flight-search.search, flight-search.compare, booking.reserve]
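Executing the returned plan through the control plane keeps every step under policy. The execute_plan call below is an assumed method name, sketched for illustration:

# Hypothetical execution of the generated plan, step-by-step under policy.
result = await axonflow.execute_plan(plan)  # assumed method name
print(result.steps)  # e.g. outcomes for search, compare, reserve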

Core Services

System Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                             AXONFLOW COMPONENTS                             │
│                                                                             │
│                              ┌─────────────────────────────────────────┐    │
│                              │              LLM Providers              │    │
│                              │  ┌─────────┐  ┌─────────┐  ┌─────────┐  │    │
│                              │  │ OpenAI  │  │Anthropic│  │ Gemini  │  │    │
│                              │  └─────────┘  └─────────┘  └─────────┘  │    │
│                              │  ┌─────────┐  ┌─────────┐               │    │
│                              │  │  Azure  │  │ Ollama  │               │    │
│                              │  └─────────┘  └─────────┘               │    │
│                              └────────────────────▲────────────────────┘    │
│                                                   │                         │
│   ┌─────────┐      ┌──────────────────┐      ┌────┴──────────────┐          │
│   │         │      │  Agent (:8080)   │      │Orchestrator(:8081)│          │
│   │   App   │─────▶│                  │─────▶│                   │          │
│   │         │      │ • System Policy  │      │ • Tenant Policy   │          │
│   └─────────┘      │ • PII Detection  │      │ • LLM Routing     │          │
│                    │ • SQLi Scanning  │      │ • Cost Controls   │          │
│                    │ • Rate Limits    │      │ • MAP Planning    │          │
│                    │ • Gateway APIs   │      │ • Execution Replay│          │
│                    │ • MCP Handler    │      │                   │          │
│                    └────────┬─────────┘      └─────────┬─────────┘          │
│                             │                          │                    │
│                             ▼                          ▼                    │
│                  ┌─────────────────────┐     ┌──────────────────┐           │
│                  │ PostgreSQL (:5432)  │     │  Redis (:6379)   │           │
│                  │                     │     │                  │           │
│                  │ • Policies          │     │ • Rate Limits    │           │
│                  │ • Audit Logs        │     │ • Policy Cache   │           │
│                  │ • Cost Budgets      │     │ • Session State  │           │
│                  │ • Tenant Config     │     │                  │           │
│                  └─────────────────────┘     └──────────────────┘           │
│                                                                             │
│                            docker compose up -d                             │
└─────────────────────────────────────────────────────────────────────────────┘

Data Flow:

  1. App → Agent: All requests enter through the Agent on port 8080
  2. Agent → Orchestrator: In Proxy Mode, approved requests route to Orchestrator for LLM execution
  3. Agent ↔ Postgres: System policies loaded at startup, audit logs written per-request
  4. Agent ↔ Redis: Rate limit counters, policy cache (5-min TTL)
  5. Orchestrator → LLM: Routes to configured providers based on cost, latency, or policy rules
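To sanity-check a local stack after docker compose up -d, both services can be probed on their documented ports. A minimal sketch; the /health path is an assumption, so substitute whatever endpoint your deployment exposes:

# Probe the Agent (:8080) and Orchestrator (:8081) after startup.
# The /health path is an assumption; adjust to your deployment.
import httpx

for name, port in [("agent", 8080), ("orchestrator", 8081)]:
    r = httpx.get(f"http://localhost:{port}/health", timeout=5.0)
    print(name, r.status_code)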

Agent Service (:8080)

The Agent service is the primary entry point for policy enforcement.

Responsibilities:

  • System policy evaluation (PII, SQLi, rate limits)
  • Request validation and authentication
  • Gateway Mode pre-check and audit APIs
  • MCP connector orchestration
  • Audit logging with decision chain

Specifications:

  • Default Count: 5 tasks (configurable 1-50)
  • CPU: 1 vCPU per task
  • Memory: 2 GB per task
  • P95 Latency: Under 10ms for policy evaluation

Orchestrator Service (:8081)

The Orchestrator service handles LLM routing, tenant policies, and multi-agent coordination.

Responsibilities:

  • Tenant policy enforcement
  • LLM routing and failover
  • Multi-Agent Planning (MAP)
  • Cost tracking and budget enforcement
  • Execution replay

Specifications:

  • Default Count: 10 tasks (configurable 1-50)
  • CPU: 1 vCPU per task
  • Memory: 2 GB per task
  • P95 Latency: Under 30ms for tenant policies

What This Enables

Scales Across Frameworks

Gateway Mode means you don't rewrite your agents. Add governance to LangChain today, CrewAI tomorrow, custom agents next month — same control plane.

Governance Emerges Naturally

Policies are defined once, applied everywhere. A "block SSN in queries" policy works whether the request comes from a chatbot, a batch job, or an internal tool.
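What registering such a policy once might look like through the SDK; create_policy and its fields are assumed names for illustration, not a confirmed API:

# Hypothetical one-time policy definition applied to every caller.
# `create_policy` and its fields are assumed names.
await axonflow.create_policy(
    name="block-ssn-in-queries",
    condition={"pii_type": "ssn", "direction": "request"},
    action="block",
)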

Sticky in Production

Once you have audit trails, cost controls, and approval workflows, they become infrastructure. Teams build on top of them; removing them breaks things.


Performance

Operation                  P95 Latency   Notes
System Policy Evaluation   Under 10ms    In-memory
Tenant Policy Evaluation   Under 30ms    Cached
Gateway Pre-check          Under 15ms    System + context
MCP Connector Query        Under 50ms    Pooled connections

Infrastructure (Enterprise)

For enterprise deployments, AxonFlow runs entirely within your AWS VPC.

Key Points:

  • All data stays within your VPC
  • Multi-AZ deployment for high availability
  • Private subnets for compute and database
  • Secrets Manager for credential storage

Integration Points

LLM Providers

AxonFlow is model-agnostic — any model available through a supported provider works automatically.

Provider               Community   Enterprise
OpenAI                 Yes         Yes
Anthropic              Yes         Yes
Azure OpenAI           Yes         Yes
Google Gemini          Yes         Yes
Ollama (self-hosted)   Yes         Yes
AWS Bedrock            No          Yes

AI Frameworks

  • LangChain, LangGraph, LlamaIndex
  • CrewAI, AutoGen, DSPy
  • Semantic Kernel, Copilot Studio
  • Custom integrations via SDK

MCP Connectors

  • Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
  • Storage: S3, GCS, Azure Blob
  • Enterprise: Salesforce, Slack, Amadeus, Jira, ServiceNow, Snowflake

Next Steps