Decision Mode

AxonFlow's existing integration modes -- Gateway Mode, Proxy Mode, and Workflow Control Plane -- are described in Choosing an Integration Mode. Decision Mode is a new integration option on a different axis: instead of your application code calling AxonFlow, your existing gateway infrastructure calls AxonFlow.

Available in v8.3.0+

Decision Mode is available starting with platform v8.3.0. The Decision API (POST /api/v1/decide) is live at all tiers (Community through Enterprise). Three reference PEP adapters ship as working examples. Envoy ext_authz integration is planned for a future release.

What Decision Mode is

Many platform teams already run their own gateway infrastructure, often several layers of it: an agent gateway, a connector or MCP gateway, an LLM gateway. For those teams, neither rewriting application code nor routing traffic through a new proxy is attractive.

Decision Mode fits here. AxonFlow runs as a standalone policy decision service. Your existing gateways each make one inline call to AxonFlow per request, receive a verdict (allow, deny, or require approval), and enforce it. AxonFlow is consulted; it is never on the traffic path.

This is the well-established PDP/PEP (Policy Decision Point / Policy Enforcement Point) separation used by policy engines across the industry. It has three properties platform teams care about:

Where traffic is required to pass through governed gateways, enforcement does not depend on developer discipline. The gateway is infrastructure that requests pass through by construction. There is no per-application SDK call to omit.
One policy brain, every layer. Because every gateway calls the same decision service, the policy hierarchy is resolved centrally and enforced identically at each stage.
One end-to-end trace. Because every decision goes through the same service, decisions made at different gateway layers correlate into a single trace, which feeds audit logging.

The Decision API builds on the same policy engine as Gateway Mode's POST /api/policy/pre-check -- same engine, different caller. Decision Mode is additive and framework-neutral. Your existing gateways, routers, and providers stay exactly as they are; AxonFlow is added beside them, not inserted into the path.

Quick start

The Decision API is a single endpoint on the Agent: POST /api/v1/decide (port 8080).

Allow verdict

A clean LLM-stage request passes the policy engine and returns verdict: "allow":

curl -s -X POST http://localhost:8080/api/v1/decide \
  -H "Content-Type: application/json" \
  -d '{
    "stage": "llm",
    "caller_identity": {
      "gateway_id": "llm-gateway-01",
      "tenant_id": "acme-prod"
    },
    "target": {
      "type": "llm",
      "model": "gpt-4o",
      "provider": "openai"
    },
    "query": "What is the customer order status?"
  }' | jq .

{
  "verdict": "allow",
  "decision_id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
  "trace_id": "0af7651916cd43dd8448eb211c80319c",
  "stage": "llm",
  "reasons": [],
  "obligations": [],
  "evaluated_policies": [],
  "expires_at": "2026-05-23T10:35:00Z"
}

Deny verdict (SQL injection)

A query containing a SQL injection pattern triggers the built-in SQLi policy and returns verdict: "deny". On deny, the first entry in evaluated_policies is the blocking policy:

curl -s -X POST http://localhost:8080/api/v1/decide \
  -H "Content-Type: application/json" \
  -d '{
    "stage": "tool",
    "caller_identity": {
      "gateway_id": "mcp-gateway-01",
      "tenant_id": "acme-prod"
    },
    "target": {
      "type": "tool",
      "tool": "postgres.query"
    },
    "query": "SELECT * FROM users WHERE id=1 UNION SELECT password FROM credentials"
  }' | jq .

{
  "verdict": "deny",
  "decision_id": "a73e5b1c-2b48-4f2e-a3c4-2e8a3b9f8d1e",
  "trace_id": "7b3c8d2e1a4f5069b8c7d6e5f4a3b2c1",
  "stage": "tool",
  "reasons": ["SQL injection pattern matched"],
  "obligations": [],
  "evaluated_policies": ["sys_sqli_union"],
  "expires_at": "2026-05-23T10:35:00Z"
}

Trace correlation across gateway layers

Pass a W3C traceparent header and the response reuses the same trace_id. This lets multi-layer gateways (agent, MCP, LLM) stitch their decisions into one end-to-end trace:

curl -s -X POST http://localhost:8080/api/v1/decide \
  -H "Content-Type: application/json" \
  -H "traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" \
  -d '{
    "stage": "agent",
    "caller_identity": {
      "gateway_id": "agent-gateway-01",
      "tenant_id": "acme-prod"
    },
    "target": {
      "type": "agent"
    },
    "query": "Investigate the suspicious payment and draft a summary"
  }' | jq .trace_id

"4bf92f3577b34da6a3ce929d0e0e4736"

The trace_id in the response matches the trace-id portion of the inbound traceparent. When no traceparent is provided, AxonFlow mints a fresh 32-hex trace ID.

Auth: in Community mode, no auth header is required. In Enterprise mode, use the same Authorization: Basic <base64> header as Gateway Mode.

Reference PEP adapters

AxonFlow ships three reference PEP adapters covering each gateway layer in the architecture diagram below. Each adapter is a working example you can run locally with Docker Compose, then adapt for your infrastructure.

LLM Gateway

The LLM adapter wraps any HTTP-based LLM endpoint with Decision Mode enforcement. It is a Go HTTP middleware that intercepts every request, calls POST /api/v1/decide, and enforces the verdict before forwarding:

client → adapter (:8888) → your LLM gateway
              ↓
       AxonFlow agent (:8080)
       POST /api/v1/decide

The adapter:

Extracts model and user message from the OpenAI-shaped request body.
Calls the Decision API with stage: "llm" and the configured gateway identity.
On allow: forwards the request to the downstream LLM, propagates traceparent for end-to-end tracing.
On deny: returns a structured JSON error with decision_id, trace_id, and reasons (not a bare 403).
On Decision API failure: applies the configured fail-open or fail-closed posture.

The adapter is configurable via environment variables (AXONFLOW_ENDPOINT, AXONFLOW_GATEWAY_ID, AXONFLOW_FAIL_OPEN) and can also be used as a Go library:

import adapter "github.com/getaxonflow/axonflow/examples/integrations/decision-mode-adapter"

handler := adapter.Middleware(adapter.Config{
    AxonFlowEndpoint: "http://axonflow:8080",
    GatewayID:        "my-llm-gateway",
    FailOpen:         false,
}, yourDownstreamHandler)

A Docker Compose PoC harness is included that runs the full round-trip (agent + adapter + mock LLM). See the adapter source and README for setup instructions.

MCP Gateway

The MCP adapter intercepts JSON-RPC 2.0 requests at the MCP layer. It sits between your MCP client and your MCP server, checking tools/call requests against the Decision API before they reach the tool:

MCP client → MCP adapter (:9090) → your MCP server
                    ↓
             AxonFlow agent (:8080)
             POST /api/v1/decide

The adapter:

Intercepts tools/call requests by default. Other methods (tools/list, resources/read, etc.) pass through without a policy check. Set MCP_INTERCEPT_METHODS to include additional JSON-RPC methods you want governed.
Extracts the tool name and arguments from the JSON-RPC params object.
Calls the Decision API with stage: "tool", target.type: "tool", and target.tool set to the tool name.
On allow: forwards the JSON-RPC request to the upstream MCP server and returns its result.
On deny: returns a JSON-RPC error response (not a bare HTTP 403). The error uses code -32001 with the policy reasons in the data field, so MCP clients see a well-formed error rather than a transport failure.
On needs_approval: returns JSON-RPC error code -32002.
On Decision API failure: applies the configured fail mode (MCP_FAIL_MODE=open or closed, default closed).

Example: a tools/call request for a SQL-injecting tool argument:

curl -s -X POST http://localhost:9090 \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "postgres.query",
      "arguments": {
        "sql": "SELECT * FROM users UNION SELECT password FROM credentials"
      }
    }
  }'

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32001,
    "message": "SQL injection pattern matched",
    "data": {
      "decision_id": "a73e5b1c-2b48-4f2e-a3c4-2e8a3b9f8d1e",
      "trace_id": "7b3c8d2e1a4f5069b8c7d6e5f4a3b2c1",
      "evaluated_policies": ["sys_sqli_union"],
      "reasons": ["SQL injection pattern matched"]
    }
  }
}

The adapter is configured via environment variables (MCP_SERVER_URL, AXONFLOW_ENDPOINT, MCP_GATEWAY_ID, MCP_INTERCEPT_METHODS, MCP_FAIL_MODE). A Docker Compose PoC harness with a mock MCP server is included. See the MCP adapter source for setup instructions.

Agent Gateway

The Agent Gateway adapter uses the same adapter binary as the LLM Gateway, configured with AXONFLOW_STAGE=agent. It sits in front of any HTTP-based agent routing layer and enforces policy before requests reach the agent:

client → adapter-agent (:8889) → your agent backend
                 ↓
          AxonFlow agent (:8080)
          POST /api/v1/decide

Because the adapter is stage-agnostic, the only configuration difference from the LLM adapter is the AXONFLOW_STAGE environment variable:

Variable	LLM Gateway	Agent Gateway
`AXONFLOW_STAGE`	`llm` (default)	`agent`
`LISTEN_ADDR`	`:8888`	`:8889`
`DOWNSTREAM_URL`	Your LLM endpoint	Your agent endpoint
`AXONFLOW_GATEWAY_ID`	e.g. `llm-gateway-01`	e.g. `agent-gateway-01`

A Docker Compose override file is included for running both the LLM and Agent adapters side by side. See the agent gateway Docker Compose for the full setup.

Request and response

Endpoint: POST /api/v1/decide on the Agent (port 8080)

For the full OpenAPI schema, see Agent API Endpoints.

Request body

Field	Type	Required	Description
`stage`	string	Yes	Which gateway layer is calling: `llm`, `tool`, or `agent`.
`caller_identity`	object	No	Gateway-asserted identity. In Enterprise mode, the auth-derived identity is authoritative; body values must match or the request is rejected with 403.
`caller_identity.gateway_id`	string	No	Identifier for the calling gateway instance (for audit).
`caller_identity.org_id`	string	No	Organization scope.
`caller_identity.tenant_id`	string	No	Tenant scope.
`target`	object	No	What the gateway is about to call.
`target.type`	string	No	`llm`, `tool`, or `agent`.
`target.model`	string	No	Model name (when `type=llm`).
`target.provider`	string	No	Provider name (when `type=llm`).
`target.tool`	string	No	Tool name (when `type=tool`).
`query`	string	Yes	The content to evaluate against the policy engine.
`user_token`	string	No	End-user identity token, when the gateway has one to forward.
`context`	object	No	Caller-supplied audit context (string values). Allowlisted keys are propagated into the decision audit record and the OTel span — see Request context propagation.

Response body (HTTP 200)

Field	Type	Description
`verdict`	string	`allow`, `deny`, or `needs_approval`.
`decision_id`	string	Unique identifier for this decision (UUID).
`trace_id`	string	W3C-compatible 32-hex trace identifier. Reuses inbound `traceparent` when present.
`stage`	string	Echoes the request stage.
`reasons`	string[]	Human-readable reasons (populated on deny).
`obligations`	object[]	Required follow-up actions attached to an allow verdict (e.g. `{"type": "redact_pii", "detail": "..."}`). Fulfilled by calling an AxonFlow engine endpoint — never by PEP-side logic. See Obligations and the two-touch flow.
`evaluated_policies`	string[]	Policies that matched. On deny, the first entry is the blocking policy.
`expires_at`	string	ISO 8601 timestamp. The verdict is valid until this time (default: 5 minutes). PEP adapters may cache the result until expiry.

Obligations and the two-touch flow

POST /api/v1/decide is decision-only. It returns a verdict plus obligations and never mutates content — there is no redacted-payload field on the response, and the endpoint never sees the model/tool output. An obligation is self-describing: a redact_pii obligation carries a fulfillment block naming the engine endpoint, method, phase, and content types the PEP uses to discharge it:

{ "type": "redact_pii", "detail": "...",
  "fulfillment": { "endpoint": "/api/v1/mcp/check-input", "method": "POST",
                   "phase": "request", "content_types": ["text/plain"] } }

The PEP fulfills it by calling the named endpoint and forwarding the engine-redacted result — it never redacts itself. Request and response redaction are a symmetric pair: POST /api/v1/mcp/check-input returns an engine-redacted request (redacted_statement), POST /api/v1/mcp/check-output returns an engine-redacted response (redacted_data). So a governed exchange is two touches — decide → fulfill via the named endpoint → forward:

Client → Gateway ─(1) POST /api/v1/decide ──────────────────→ AxonFlow  → allow + obligation (names endpoint)
Gateway → Backend (forward) → raw content
Gateway ─(2) POST <obligation endpoint> (content) ─────────→ AxonFlow  → engine-redacted content
Gateway → Client (forward redacted content)

On the fulfillment call, the check-input (request) leg returns a redaction_evaluated boolean alongside redacted / redacted_statement. It is the load-bearing fail-closed signal: true means the redactor ran (forward redacted_statement); false or absent means the redactor did not run, in which case the PEP must fail closed rather than forward the statement as if clean — redacted: false would otherwise be indistinguishable from "ran, found nothing." The check-output (response) leg has no such field — /decide only emits request-phase obligations, so the response leg fails closed on any unsuccessful check-output round-trip.

A PEP, gateway, SDK, or client must not implement PII detection or redaction itself. The reference platform/shared/pep client carries no PII patterns — its only redaction path is that engine round-trip, and it fails closed if an obligation can't be discharged through the engine. The contract is content-type-aware (an unsupported content type is rejected with 415 rather than forwarded ungoverned), coverage is policy-derived (the PII categories your active policies enable), and gateway detection is connector-agnostic — AxonFlow governs whatever content the PEP submits, with no "enabled connector" prerequisite. For the complete decide → fulfill → forward loop and every fail-closed rule, see Building a Policy Enforcement Point; for PII specifics see PII Detection → Decision Mode: two-touch redaction.

Error responses

Status	Meaning
400	Invalid request body, missing `stage`, or missing `query`.
403	`caller_identity` does not match the authenticated identity (Enterprise mode).
503	Circuit breaker is active. The body carries `verdict: "deny"` as a fail-closed default. A `Retry-After` header is included when available. PEP adapters should apply their configured fail-open or fail-closed posture.

Request context propagation

A PEP can attach audit context to each decision via the request context object. AxonFlow propagates allowlisted keys end-to-end: into the decision's OpenTelemetry span (as request.context.<key> attributes) and into the persisted audit record, so a SIEM can correlate AxonFlow's decision with upstream logs by, for example, session_id.

Allowlist. Only keys matching AXONFLOW_DECISION_CONTEXT_ALLOWLIST (comma-separated) are kept; everything else is dropped. The default covers common agent / session / leader identity headers (x-ai-agent, x-session-id, x-leader-identity) plus a tenant-scoped header family. A trailing * is a prefix match, and matching is case- and separator-insensitive (X-AI-Agent, x-ai-agent, and x_ai_agent all match x-ai-agent).
Canonicalization. Surviving keys are stored in lower_snake_case (X-AI-Agent → x_ai_agent) so joins are deterministic regardless of header casing.
Limits. Non-string values are dropped; each value is capped at 256 bytes; the map is capped at 10 keys (surplus dropped and flagged request.context.truncated=true on the span / context_truncated in the audit record).
Read back. The full map is returned by GET /api/v1/decisions/{id}/explain; the list endpoint GET /api/v1/decisions returns the first 5 keys per decision.

curl -sS http://localhost:8080/api/v1/decide \
  -H 'Content-Type: application/json' \
  -d '{
        "stage": "llm",
        "query": "Summarize the quarterly report",
        "context": {
          "X-AI-Agent": "claude-code",
          "X-Session-ID": "sess-abc123",
          "X-Leader-Identity": "[email protected]"
        }
      }'

The decision span then carries request.context.x_ai_agent=claude-code, request.context.x_session_id=sess-abc123, and [email protected].

Trace correlation

Decision Mode is designed for multi-layer gateway architectures where a single user request crosses several enforcement points. Trace correlation ties those decisions together.

Pass a traceparent header on each call:

The agent gateway calls POST /api/v1/decide with traceparent: 00-<trace-id>-<span-id>-01.
AxonFlow returns the same trace_id in the response.
The agent gateway propagates the traceparent downstream.
The MCP gateway and LLM gateway each call POST /api/v1/decide with the same traceparent.
All three decisions share one trace_id in the audit trail.

When AXONFLOW_OTEL_ENDPOINT is configured, each decision emits an OpenTelemetry span on the axonflow.agent.decision tracer for end-to-end observability. See examples/integrations/otel-tracing/ in the AxonFlow repository for a local setup with Jaeger.

How it differs from existing modes

Gateway Mode, Proxy Mode, and WCP describe who owns the LLM call and how the application talks to AxonFlow. Decision Mode describes who calls AxonFlow's policy engine:

In Gateway Mode, your application code calls AxonFlow's pre-check API.
In Decision Mode, your infrastructure gateway calls AxonFlow's decision API.

These are different axes. They can coexist: a large organization can use Gateway Mode for services where deep, context-rich checks matter, and Decision Mode at its gateway layers for enterprise-wide enforcement -- all against the same policy hierarchy.

Connector-agnostic evaluation

Decision Mode evaluates the query content against the policy engine independently of which gateway calls it. POST /api/v1/decide governs content, not a specific managed connector, so its evaluation is connector-agnostic by design:

stage is recorded for audit and trace correlation; the target descriptor (model / provider / tool) does not scope policy. stage is written to the decision's audit record; the target fields are accepted by the API contract but do not change which policies run. The same category set -- SQL injection, dangerous-operation, sensitive-data, the enabled compliance categories, and the enabled PII categories -- is evaluated for an llm, tool, or agent stage alike. A stage: "tool" call for postgres.query is scanned identically to an llm-stage prompt.
The two per-connector controls in MCP governance do not apply on this path. Neither the static-policy connector allowlist (MCP_STATIC_POLICIES_CONNECTORS) nor per-connector dynamic policies -- rate limits, budgets, time- and role-based access -- are evaluated by POST /api/v1/decide. Those are properties of a managed MCP connector, which a PDP does not have.

This is the same reasoning behind obligation fulfillment being connector-agnostic: applying a connector allowlist at the decision point would let an operator's allowlist silently narrow enforcement for traffic the PDP is meant to govern universally.

If you need per-connector scoping -- for example, evaluating static policies for only a subset of connectors, or applying a rate limit or budget to a specific connector -- govern that traffic through the native MCP governance endpoints, which carry a real connector_type. The static-policy connector allowlist applies to the check-input and check-output checks, and per-connector dynamic policies (rate limits, budgets, time- and role-based access) apply on the request path (check-input). Decision Mode and MCP governance compose: use Decision Mode for uniform, gateway-level content enforcement, and MCP governance where connector-scoped controls matter.

Decision Mode across multiple gateways

Decision Mode architecture: a Policy Decision Point beside your existing gateway, applied across three gateway layers -- agent, MCP, and LLM -- calling one decision service.

The diagram shows Decision Mode applied to a three-layer gateway architecture. A request crosses the agent gateway, the connector (MCP) gateway, and the LLM gateway in turn. Each gateway calls the same AxonFlow decision service, which resolves the policy hierarchy and returns a verdict plus a shared trace identifier. The result is consistent enforcement at every stage and a single, correlated audit trail, without changing the application or the gateways' routing.

Reference adapters are available for all three gateway layers shown in the diagram. See the adapter sections above for setup instructions.

When to use Decision Mode

Consider Decision Mode when:

You already operate your own gateway layers (agent gateway, MCP gateway, LLM gateway) and want centralized policy enforcement without per-application SDK integration.
You have many engineering teams behind those gateways, so per-application integration does not scale and is not auditable.
You answer to a security, risk, or compliance function that wants enforcement to be structural, not discretionary.

If you do not already run your own gateway infrastructure, Gateway Mode or Proxy Mode will serve you better and faster.

Choosing an Integration Mode -- Gateway Mode, Proxy Mode, WCP, and how Decision Mode relates
Deployment Mode Matrix -- operational topology (Community, Evaluation, SaaS, In-VPC)
Architecture Overview -- the component model behind all integration modes
Agent API Endpoints -- full OpenAPI reference including POST /api/v1/decide

What Decision Mode is​

Quick start​

Allow verdict​

Deny verdict (SQL injection)​

Trace correlation across gateway layers​

Reference PEP adapters​

LLM Gateway​

MCP Gateway​

Agent Gateway​

Request and response​

Request body​

Response body (HTTP 200)​

Obligations and the two-touch flow​

Error responses​

Request context propagation​

Trace correlation​

How it differs from existing modes​

Connector-agnostic evaluation​

Decision Mode across multiple gateways​

When to use Decision Mode​

Related pages​