Architecture Overview
For a visual walkthrough of AxonFlow's architecture, including how the control plane works, policy enforcement flow, and multi-agent planning: Watch on YouTube
AxonFlow is a control plane for production AI systems. It doesn't replace your agent frameworks — it makes them operable in production.
The Problem: Why Execution Breaks
Most agent frameworks optimize for authoring workflows, not operating them.
Once agents touch real systems, teams hit familiar problems:
- Silent failures — An agent retries a database write 3 times, each with side effects
- No runtime visibility — Which policy blocked the request? What was the LLM's reasoning?
- Permission gaps — Agent accessed customer data it shouldn't have; discovered in prod
- Compliance gaps — No audit trail for AI decisions; fails regulatory review
Gateways aren't enough. API gateways can rate-limit and authenticate, but they don't understand AI workflows. They can't enforce "block if PII detected" or "require approval for high-risk decisions."
AxonFlow treats agents as long-running, stateful systems that require governance, observability, and control at runtime.
Where AxonFlow Sits
AxonFlow sits between your applications and LLM providers, providing policy enforcement and audit at execution time.
┌─────────────┐ ┌─────────────────────────────┐ ┌─────────────────┐
│ │ │ AxonFlow │ │ LLM Providers │
│ Your │ ───▶ │ ┌─────────┐ ┌─────────┐ │ ───▶ │ OpenAI │
│ App │ │ │ Policy │ │ Audit │ │ │ Anthropic │
│ │ ◀─── │ │ Engine │ │ Log │ │ ◀─── │ Gemini │
└─────────────┘ │ └─────────┘ └─────────┘ │ └─────────────────┘
└─────────────────────────────┘
│
▼
┌─────────────────────────────┐
│ MCP Connectors │
│ Postgres, Salesforce, S3 │
└─────────────────────────────┘
AxonFlow provides:
- Policy enforcement — Block PII, SQLi, dangerous queries before they reach LLMs
- Audit logging — Complete trail of every AI decision for compliance
- Cost controls — Budget limits and usage tracking per tenant
Why AxonFlow often becomes the default control plane
Teams typically start by placing AxonFlow in front of a single workflow or agent to evaluate policy enforcement, auditability, and execution control. As usage grows, running some AI workflows through AxonFlow and others outside it tends to create fragmented audit logs, inconsistent policies, and duplicated observability. For this reason, teams that continue using AxonFlow often standardize on it as a single control plane for AI workflows, while retaining their existing orchestration frameworks and execution logic.
Gateway Mode (Recommended for Existing Stacks)
AxonFlow wraps your existing agent framework with pre-execution checks and post-execution audit:
┌──────────────────────────────────────────────────────────────────────────────┐
│ GATEWAY MODE │
│ │
│ ┌─────────┐ ┌─────────────┐ ┌───────────────────────────────┐ │
│ │ │ ───▶ │ AxonFlow │ ───▶ │ Your Agent Framework │ │
│ │ App │ │ Pre-check │ │ (LangChain / CrewAI / etc) │ │
│ │ │ │ │ │ │ │ │
│ └─────────┘ └─────────────┘ │ ▼ │ │
│ ▲ │ ┌─────────┐ │ │
│ │ │ │ LLM │ │ │
│ │ │ │Provider │ │ │
│ │ │ └────┬────┘ │ │
│ │ ┌─────────────┐ │ │ │ │
│ └────────── │ AxonFlow │ ◀─── │ ▼ │ │
│ │ Audit │ │ Response │ │
│ └─────────────┘ └───────────────────────────────┘ │
│ │
│ Flow: Pre-check → Your LLM call → Audit │
│ Latency overhead: ~15ms │
└──────────────────────────────────────────────────────────────────────────────┘
SDK Integration:
# 1. Pre-check: get policy approval before any LLM call
approval = await axonflow.get_policy_approved_context(query)
if not approval.approved:
    raise PolicyViolation(approval.block_reason)

# 2. Your existing LLM call (LangChain, CrewAI, etc.)
response = await your_langchain_agent.run(query)

# 3. Audit: record what happened
await axonflow.audit_llm_call(approval.context_id, response)
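The three calls compose naturally into a single helper. A minimal sketch, reusing the client and PolicyViolation exception from the snippet above; run_agent is a hypothetical stand-in for whatever callable invokes your framework:
async def governed_call(axonflow, run_agent, query):
    # Pre-check: block the request before any LLM call happens
    approval = await axonflow.get_policy_approved_context(query)
    if not approval.approved:
        raise PolicyViolation(approval.block_reason)

    # Execute with your existing framework (LangChain, CrewAI, ...)
    response = await run_agent(query)

    # Audit: tie the outcome back to the approved context
    await axonflow.audit_llm_call(approval.context_id, response)
    return response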
Use when: You already have LangChain/CrewAI/AutoGen agents and want to add governance.
Proxy Mode (Full Control)
AxonFlow handles the complete request lifecycle, including LLM calls:
┌──────────────────────────────────────────────────────────────────────────────┐
│ PROXY MODE │
│ │
│ ┌─────────┐ ┌─────────────────────────────────────────┐ ┌─────┐ │
│ │ │ │ AxonFlow │ │ │ │
│ │ App │ ───▶ │ ┌────────┐ ┌────────┐ ┌───────┐ │ ───▶ │ LLM │ │
│ │ │ │ │ Policy │───▶│ MAP │───▶│Router │ │ │ │ │
│ │ │ │ │ Engine │ │Planning│ │ │ │ │ │ │
│ └─────────┘ │ └────────┘ └────────┘ └───────┘ │ └──┬──┘ │
│ ▲ │ │ │ │
│ │ │ ┌────────┐ ┌────────┐ ┌───────┐ │ │ │
│ └────────── │ │ Audit │◀───│Response│◀───│ Cost │ │ ◀───────┘ │
│ │ │ Log │ │ Check │ │Control│ │ │
│ │ └────────┘ └────────┘ └───────┘ │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Flow: App → AxonFlow (everything) → LLM → AxonFlow → App │
│ Full governance: policies, planning, routing, cost, audit │
└──────────────────────────────────────────────────────────────────────────────┘
SDK Integration:
# AxonFlow handles everything
response = await axonflow.execute_query(
    query="Analyze customer sentiment for Q4",
    request_type="analysis",
)
# Policies, LLM routing, cost tracking, audit — all handled
Use when: Building new applications that need full governance from the start.
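The example above doesn't show how a blocked request surfaces to the caller. One plausible shape, assuming the SDK raises the same PolicyViolation used in the Gateway example (an assumption, not documented behavior):
try:
    response = await axonflow.execute_query(
        query=user_query,
        request_type="analysis",
    )
except PolicyViolation as exc:  # assumed error shape, mirroring the Gateway example
    handle_blocked_request(exc)  # hypothetical application-level handler
    raise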
Choosing Between Gateway and Proxy Mode
| Scenario | Recommendation |
|---|---|
| Existing LangChain/CrewAI agents | Gateway Mode |
| Custom agent framework | Gateway Mode |
| Building from scratch | Proxy Mode |
| Need MAP multi-agent planning | Proxy Mode |
| Migration from existing stack | Gateway Mode first, Proxy Mode later |
Control Plane vs Orchestration
| Aspect | Orchestration (LangChain) | Control Plane (AxonFlow) |
|---|---|---|
| Focus | Chain construction, prompts | Runtime governance, audit |
| When | Authoring time | Execution time |
| Concern | "How do I build this workflow?" | "Should this step be allowed to run?" |
| Output | LLM response | Allow/Block decision + audit trail |
Key insight: AxonFlow doesn't compete with LangChain. LangChain runs your workflow; AxonFlow decides whether each step is allowed to proceed.
Execution Model
Policies Before and After Steps
Every step in a workflow passes through policy evaluation: a pre-check before the step runs and an audit record after it completes.
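From the SDK side this maps onto the same pre-check/audit pair used in Gateway Mode, applied per step rather than per request. A minimal sketch, assuming each step exposes a textual query and an async execute callable (both hypothetical names):
async def run_workflow(axonflow, steps):
    results = []
    for step in steps:
        # Before: evaluate policies against this step's input
        approval = await axonflow.get_policy_approved_context(step.query)
        if not approval.approved:
            raise PolicyViolation(approval.block_reason)

        # The step itself (LLM call, connector query, tool invocation)
        result = await step.execute()

        # After: record the outcome against the approved context
        await axonflow.audit_llm_call(approval.context_id, result)
        results.append(result)
    return results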
Two-Phase Policy Model
AxonFlow uses a two-phase policy model for both LLM calls and MCP connector access:
Phase 1 (System Policies): Pattern-based, compiled, in-memory, no DB lookups. Under 10ms P95.
Phase 2 (Tenant Policies): Condition-based, tenant-aware, cached 5 minutes. Under 30ms P95.
| Access Type | Phase 1 (System) | Phase 2 (Tenant) |
|---|---|---|
| LLM - Proxy Mode | ✅ | ✅ |
| LLM - Gateway Mode | ✅ | ❌ (you handle LLM directly) |
| MCP Connectors | ✅ | ✅ (when MCP_DYNAMIC_POLICIES_ENABLED=true) |
Note: MCP connectors are evaluated independently from LLM mode selection. You can use Gateway Mode for LLM calls (lowest latency) while still having full two-phase policy evaluation on MCP connector access.
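Conceptually, the two phases trade expressiveness for latency: Phase 1 is a handful of pre-compiled patterns checked in memory, while Phase 2 evaluates tenant-specific conditions behind a short-lived cache. The sketch below is illustrative only, not AxonFlow's internals; load_tenant_policies and policy.allows are hypothetical:
import re
import time

# Phase 1: pattern-based system policies, compiled once at startup
SYSTEM_PATTERNS = [
    (re.compile(r";\s*DROP\s+TABLE", re.IGNORECASE), "sqli_detected"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "pii_email_detected"),
]

def phase1(query):
    for pattern, reason in SYSTEM_PATTERNS:
        if pattern.search(query):
            return False, reason
    return True, None

# Phase 2: condition-based tenant policies, cached for 5 minutes
_cache = {}

def phase2(tenant_id, query):
    entry = _cache.get(tenant_id)
    if entry is None or time.monotonic() - entry[1] > 300:
        entry = (load_tenant_policies(tenant_id), time.monotonic())  # hypothetical loader
        _cache[tenant_id] = entry
    for policy in entry[0]:
        if not policy.allows(query):  # hypothetical condition check
            return False, policy.name
    return True, None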
Human-in-the-Loop Approvals (Enterprise)
High-risk decisions can require human approval:
Request → Policy Check → HITL Queue → Human Approves → Execute → Audit
│
└── Human Rejects → Block + Audit
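The approval queue implies an asynchronous result. The SDK surface for this isn't shown here; as a hypothetical sketch, a pending decision might be polled like this (method and field names are assumptions, not the documented API):
import asyncio

async def wait_for_approval(axonflow, request_id, poll_seconds=5):
    # Hypothetical polling loop; a real deployment would more likely
    # rely on a webhook or queue notification than on polling
    while True:
        status = await axonflow.get_approval_status(request_id)  # hypothetical method
        if status.state == "approved":
            return True
        if status.state == "rejected":
            return False
        await asyncio.sleep(poll_seconds)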
Audit and Replay
Both modes provide audit logging, but with different depth:
| Aspect | Gateway Mode | Proxy Mode |
|---|---|---|
| Audit Type | Lightweight | Comprehensive (Decision Chain) |
| Captures | Provider, model, tokens, cost | + Policy triggers, risk levels, outcomes |
| Response Storage | SHA-256 hash (privacy) | Full decision tracing |
| Tamper Detection | Basic | input_hash, output_hash, audit_hash |
| Workflow Reconstruction | Not supported | Full replay via chain_id |
| Compliance | Cost tracking | EU AI Act Article 12 |
Gateway Mode is optimized for cost tracking and usage monitoring.
Proxy Mode provides compliance-grade audit trails with step-by-step decision tracing, risk classification, and execution replay.
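Replay implies every step of a workflow is addressable through its chain_id. A hypothetical sketch of walking a decision chain (method and field names are assumptions, not the documented API):
async def reconstruct_workflow(axonflow, chain_id):
    # Hypothetical: fetch the ordered decision chain for one workflow
    chain = await axonflow.get_decision_chain(chain_id)
    for step in chain.steps:
        print(step.policy_triggers, step.risk_level, step.outcome)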
Multi-Agent Planning (MAP)
MAP turns natural language requests into executable workflows:
# Natural language → Workflow
plan = await axonflow.generate_plan("Book cheapest flight to London next Tuesday")
# Returns: [flight-search.search, flight-search.compare, booking.reserve]
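Because a plan is data before it is execution, it can be inspected or gated before anything runs. A short sketch; plan.steps and execute_plan are assumptions, not documented API:
plan = await axonflow.generate_plan("Book cheapest flight to London next Tuesday")

# Inspect the proposed steps before committing to execution
for step in plan.steps:  # assumed attribute
    print(step)

# Hypothetical: execute the approved plan under full policy enforcement
result = await axonflow.execute_plan(plan)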
Core Services
System Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ AXONFLOW COMPONENTS │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ LLM Providers │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ OpenAI │ │Anthropic│ │ Gemini │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ │ │
│ │ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Azure │ │ Ollama │ │ │
│ │ └─────────┘ └─────────┘ │ │
│ └──────────────────▲──────────────────────┘ │
│ │ │
│ ┌─────────┐ ┌──────────────────┐ ┌──────────────────┐ │
│ │ │ │ Agent (:8080) │ │Orchestrator(:8081)│ │
│ │ App │─────▶│ │─────▶│ │──────────┤
│ │ │ │ • System Policy │ │ • Tenant Policy │ │
│ └─────────┘ │ • PII Detection │ │ • LLM Routing │ │
│ │ • SQLi Scanning │ │ • Cost Controls │ │
│ │ • Rate Limits │ │ • MAP Planning │ │
│ │ • Gateway APIs │ │ • Execution Replay│ │
│ │ • MCP Handler │ │ │ │
│ └────────┬─────────┘ └─────────┬──────────┘ │
│ │ │ │
│ │ ┌─────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────┐ ┌──────────────────┐ │
│ │ PostgreSQL (:5432) │ │ Redis (:6379) │ │
│ │ │ │ │ │
│ │ • Policies │ │ • Rate Limits │ │
│ │ • Audit Logs │ │ • Policy Cache │ │
│ │ • Cost Budgets │ │ • Session State │ │
│ │ • Tenant Config │ │ │ │
│ └─────────────────────┘ └──────────────────┘ │
│ │
│ docker compose up -d │
└─────────────────────────────────────────────────────────────────────────────┘
Data Flow:
- App → Agent: All requests enter through the Agent on port 8080
- Agent → Orchestrator: In Proxy Mode, approved requests route to Orchestrator for LLM execution
- Agent ↔ Postgres: System policies loaded at startup, audit logs written per-request
- Agent ↔ Redis: Rate limit counters, policy cache (5-min TTL)
- Orchestrator → LLM: Routes to configured providers based on cost, latency, or policy rules
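Since the Agent is the single entry point, a self-hosted client only needs its address. A minimal sketch; the import path and constructor parameters are assumptions, not the documented API:
from axonflow import AxonFlow  # assumed import path

axonflow = AxonFlow(
    base_url="http://localhost:8080",  # the Agent service is the entry point
    api_key="...",                     # credential handling is deployment-specific
)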
Agent Service (:8080)
The Agent service is the primary entry point for policy enforcement.
Responsibilities:
- System policy evaluation (PII, SQLi, rate limits)
- Request validation and authentication
- Gateway Mode pre-check and audit APIs
- MCP connector orchestration
- Audit logging with decision chain
Specifications:
- Default Count: 5 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- P95 Latency: Under 10ms for policy evaluation
Orchestrator Service (:8081)
The Orchestrator service handles LLM routing, tenant policies, and multi-agent coordination.
Responsibilities:
- Tenant policy enforcement
- LLM routing and failover
- Multi-Agent Planning (MAP)
- Cost tracking and budget enforcement
- Execution replay
Specifications:
- Default Count: 10 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- P95 Latency: Under 30ms for tenant policies
What This Enables
Scales Across Frameworks
Gateway Mode means you don't rewrite your agents. Add governance to LangChain today, CrewAI tomorrow, custom agents next month — same control plane.
Governance Emerges Naturally
Policies are defined once, applied everywhere. A "block SSN in queries" policy works whether the request comes from a chatbot, a batch job, or an internal tool.
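For illustration only, a "block SSN in queries" policy reduces to a content check that never inspects the caller; a Phase 1 style pattern might look like:
import re

# US SSN shape: three digits, two digits, four digits
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def violates_ssn_policy(query):
    # Same check whether the query came from a chatbot,
    # a batch job, or an internal tool
    return SSN.search(query) is not None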
Sticky in Production
Once you have audit trails, cost controls, and approval workflows, they become infrastructure. Teams build on top of them; removing them breaks things.
Performance
| Operation | P95 Latency | Notes |
|---|---|---|
| System Policy Evaluation | Under 10ms | In-memory |
| Tenant Policy Evaluation | Under 30ms | Cached |
| Gateway Pre-check | Under 15ms | System + context |
| MCP Connector Query | Under 50ms | Pooled connections |
Infrastructure (Enterprise)
For enterprise deployments, AxonFlow runs entirely within your AWS VPC.
Key Points:
- All data stays within your VPC
- Multi-AZ deployment for high availability
- Private subnets for compute and database
- Secrets Manager for credential storage
Integration Points
LLM Providers
AxonFlow is model-agnostic — any model available through a supported provider works automatically.
| Provider | Community | Enterprise |
|---|---|---|
| OpenAI | Yes | Yes |
| Anthropic | Yes | Yes |
| Azure OpenAI | Yes | Yes |
| Google Gemini | Yes | Yes |
| Ollama (self-hosted) | Yes | Yes |
| AWS Bedrock | No | Yes |
AI Frameworks
- LangChain, LangGraph, LlamaIndex
- CrewAI, AutoGen, DSPy
- Semantic Kernel, Copilot Studio
- Custom integrations via SDK
MCP Connectors
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
- Storage: S3, GCS, Azure Blob
- Enterprise: Salesforce, Slack, Amadeus, Jira, ServiceNow, Snowflake
Next Steps
- Infrastructure Details - CloudFormation resources
- Security Best Practices - Security configuration
- Gateway Mode Guide - Integration with existing frameworks
- Getting Started - Quick setup guide