Infrastructure Details

This page describes the infrastructure shape of a typical AxonFlow deployment without hard-coding one exact cloud template or marketplace bundle. That makes it more useful for engineers running the community stack today and for teams reviewing a later move to protected enterprise deployment paths.

Community Runtime Topology

The default self-hosted community setup is the local Docker Compose stack in the main repo. It is the fastest way to bring up the full public runtime and verify request flow, policies, monitoring, and connector behavior.

Services in the Community Stack

| Service | Default Port | Purpose |
|---|---|---|
| Agent | 8080 | Entry point for policy checks, MCP access, and gateway APIs |
| Orchestrator | 8081 | Request execution, tenant-policy logic, workflow APIs |
| PostgreSQL | 5432 | Persistent platform state |
| Redis | 6379 | Caching and coordination |
| Prometheus | 9090 | Metrics collection |
| Grafana | 3000 | Dashboards for local operations |
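A quick way to confirm the stack came up is to check that each default port is accepting TCP connections. The sketch below assumes the ports from the table above, unmodified port mappings, and that you are running it on the Docker host (so the services are reachable on localhost); adjust the host and ports if your Compose file differs.

```python
import socket

# Default community-stack ports from the table above; adjust if you
# remapped them in your Compose file. Host is assumed to be localhost.
SERVICES = {
    "Agent": 8080,
    "Orchestrator": 8081,
    "PostgreSQL": 5432,
    "Redis": 6379,
    "Prometheus": 9090,
    "Grafana": 3000,
}

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "listening" if port_open("localhost", port) else "not reachable"
        print(f"{name:>12} :{port}  {state}")
```

A TCP check only tells you a process is listening; the health endpoints described later confirm the services are actually serving requests.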

What This Topology Is Good For

  • SDK and API integration work
  • validating policy behavior
  • reviewing connector access patterns
  • building early multi-agent workflows
  • local or lower-scale self-hosted environments

Resource Requirements

| Component | Development | Production |
|---|---|---|
| Agent | 1 vCPU, 512 MB | 2-4 vCPU, 2-4 GB |
| Orchestrator | 1 vCPU, 1 GB | 2-4 vCPU, 4-8 GB |
| PostgreSQL | Shared, 256 MB | Dedicated, 2-4 GB (RDS recommended) |
| Redis | Shared, 128 MB | Dedicated, 512 MB-1 GB (ElastiCache recommended) |
| Total minimum | 2 vCPU, 4 GB | 8+ vCPU, 16+ GB |

These are starting points. Actual requirements depend on request volume, policy complexity, and connector count. The Docker Compose stack runs comfortably on a 4 GB laptop for development.

For production scaling: the Agent is CPU-bound during policy evaluation (sub-5ms per request), and the Orchestrator is I/O-bound waiting on LLM providers. Horizontal scaling of both services behind a load balancer is the standard production pattern.
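For the CPU-bound Agent, a back-of-envelope sizing calculation falls out of the per-request evaluation cost. The sketch below is illustrative, not an official sizing formula: the 5 ms figure comes from the paragraph above, while the replica size and the 60% utilization headroom target are assumptions you should replace with your own measurements.

```python
import math

def agent_replicas(requests_per_sec: float,
                   eval_ms: float = 5.0,
                   vcpus_per_replica: int = 2,
                   target_utilization: float = 0.6) -> int:
    """Back-of-envelope replica count for a CPU-bound service.

    Each request consumes roughly eval_ms of CPU time, so one vCPU
    sustains about 1000 / eval_ms requests per second at full load.
    Sizing to target_utilization leaves headroom for traffic spikes.
    """
    per_replica_rps = (1000.0 / eval_ms) * vcpus_per_replica * target_utilization
    return max(1, math.ceil(requests_per_sec / per_replica_rps))

# Example: 1,000 req/s at 5 ms/request on 2-vCPU replicas sized to 60%
# utilization gives 240 req/s of budget per replica, so 5 replicas.
print(agent_replicas(1000))  # -> 5
```

The I/O-bound Orchestrator does not size this way; its replica count is usually driven by concurrent in-flight workflows and provider latency rather than CPU budget.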

Operational Building Blocks

No matter where AxonFlow runs, the same infrastructure concerns show up.

Compute

You need separate runtime capacity for the Agent and Orchestrator because they serve different latency and scaling profiles:

  • Agent capacity is driven by ingress rate, policy evaluation volume, and connector activity
  • Orchestrator capacity is driven by workflow execution, provider routing, and proxy-mode traffic

Data Stores

AxonFlow relies on two core stores:

  • PostgreSQL for durable state such as policies, audit records, workflow state, and configuration
  • Redis for runtime caches and coordination paths such as rate limiting and short-lived execution state
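To make the rate-limiting role concrete, here is a token-bucket limiter of the kind such a coordination path typically implements. This is a deliberately simplified in-memory sketch: in a real multi-replica deployment the counters would live in Redis so every Agent replica enforces one shared limit, and the class and parameter names here are illustrative, not AxonFlow APIs.

```python
import time

class TokenBucket:
    """In-memory token bucket. The production analogue keeps these
    counters in Redis so all service replicas share a single limit."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# A bucket allowing bursts of 5 and a sustained 2 requests/second:
bucket = TokenBucket(rate=2.0, capacity=5)
results = [bucket.allow() for _ in range(7)]  # first 5 pass, then throttled
```

The same shape covers other short-lived coordination state: the durable record of what happened belongs in PostgreSQL, while fast-changing counters and locks belong in Redis.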

Observability

A production-grade setup should expose:

  • health endpoints for the Agent and Orchestrator
  • Prometheus scrape targets
  • dashboards that track request volume, policy outcomes, errors, and latency
  • logs that allow request and incident investigation
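Wiring the two services into Prometheus can look like the fragment below. It assumes each service exposes its metrics on its main port, that Docker Compose DNS resolves the service names `agent` and `orchestrator`, and that the default `/metrics` path applies; the job names are illustrative, so adjust all three to match your deployment.

```yaml
# prometheus.yml fragment -- scrape the Agent and Orchestrator.
scrape_configs:
  - job_name: "agent"
    static_configs:
      - targets: ["agent:8080"]
  - job_name: "orchestrator"
    static_configs:
      - targets: ["orchestrator:8081"]
```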

Community Self-Hosted Guidance

For community users running on their own infrastructure, the practical deployment checklist is:

  1. Run Agent, Orchestrator, PostgreSQL, and Redis with persistent storage for PostgreSQL.
  2. Keep Prometheus and Grafana available in non-trivial environments so you can observe request volume, failure modes, and policy outcomes.
  3. Treat the Agent as the public-facing application entry point and keep the Orchestrator reachable only by trusted internal services.
  4. Back up PostgreSQL if the environment matters operationally or stores audit data you need to retain.
  5. Put TLS, auth, and secret management in front of the runtime before exposing it beyond local development.

Enterprise Deployment Topology

Evaluation and Enterprise deployments generally add:

  • hardened secret and credential management
  • stronger network isolation
  • managed ingress and TLS termination
  • protected compliance and governance modules
  • enterprise operational tooling and portal surfaces

Those deployments are documented in the protected enterprise docs because the exact topology depends on the licensed package and environment. The community docs should point you toward the model, not pretend one AWS-specific shape is universal.

For most serious deployments:

  • expose the Agent to trusted callers
  • keep the Orchestrator on an internal network path
  • restrict PostgreSQL and Redis to private connectivity
  • treat provider credentials and connector credentials as secrets, not inline config

This model works whether you deploy on a single VM, Kubernetes, ECS, Nomad, or other infrastructure you already operate.

Monitoring and Verification

After deployment, verify at minimum:

  • Agent /health responds successfully
  • Orchestrator /health responds successfully
  • Prometheus is scraping both services
  • Grafana dashboards show request and policy activity
  • PostgreSQL persistence survives restarts
  • Redis connectivity is stable under load
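The first two checks in this list can be scripted. The sketch below assumes the default local ports and a plain HTTP 200 response from each service's /health endpoint; swap in your real hostnames and TLS scheme for a non-local deployment.

```python
import urllib.request

def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the service's /health endpoint answers HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers connection refused, DNS failure, timeouts
        return False

# Default community-stack endpoints; adjust for your deployment.
ENDPOINTS = {
    "agent": "http://localhost:8080",
    "orchestrator": "http://localhost:8081",
}

if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        status = "ok" if check_health(url) else "UNREACHABLE"
        print(f"{name:>12}: {status}")
```

The remaining checks (Prometheus targets, Grafana activity, PostgreSQL persistence across restarts, Redis under load) are better verified from their own UIs and by restarting the stack deliberately rather than from a script.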

When Teams Usually Upgrade

Teams commonly start on the community topology for developer onboarding, proof-of-concept work, and initial product integration. They usually move toward Evaluation or Enterprise when they need:

  • broader connector coverage
  • richer governance workflows
  • compliance-specific modules
  • stronger enterprise operations guidance
  • protected customer-portal and admin capabilities