LLM Provider Routing
AxonFlow supports server-side routing across multiple configured LLM providers. This is the page to read when you want to run more than one provider in production and control fallback, cost, latency, or hard provider pinning behavior.
What Routing Actually Controls
Routing applies to AxonFlow-managed provider calls such as:
- Proxy Mode
- MAP and related multi-step routed workflows
In Gateway Mode, your application still chooses and calls the provider directly. AxonFlow governs the request before and after that call, but it is not selecting the provider for you.
Provider Availability by Tier
| Provider | Community | Enterprise |
|---|---|---|
| OpenAI | ✅ | ✅ |
| Anthropic | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ |
| Gemini | ✅ | ✅ |
| Ollama | ✅ | ✅ |
| AWS Bedrock | ❌ | ✅ |
| Custom Provider | ❌ | ✅ |
Server Configuration
Routing is configured on the AxonFlow side:
| Variable | Values | Default | Description |
|---|---|---|---|
| `LLM_ROUTING_STRATEGY` | `weighted`, `round_robin`, `failover` | `weighted` | Routing strategy (Community and Enterprise) |
| `PROVIDER_WEIGHTS` | `provider:weight,...` | Equal weights | Used by weighted routing |
| `DEFAULT_LLM_PROVIDER` | Provider name | None | Primary provider for failover |
| `LLM_PROVIDERS` | `provider1,provider2,...` | All configured | Comma-separated list of enabled provider names. Only providers in this list are considered for routing; if unset, all configured providers are eligible. |
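Putting these variables together, a server environment for weighted routing across three providers might look like the following. This is an illustrative configuration fragment; the provider names and weights are example values, not recommendations.

```shell
# Restrict routing to three named providers
LLM_PROVIDERS=openai,anthropic,ollama

# Weighted strategy: openai gets 60% of traffic, anthropic 30%, ollama 10%
LLM_ROUTING_STRATEGY=weighted
PROVIDER_WEIGHTS=openai:60,anthropic:30,ollama:10

# Also set a primary so a later switch to failover needs no extra changes
DEFAULT_LLM_PROVIDER=openai
```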
Enterprise builds add:
| Variable | Values | Default | Description |
|---|---|---|---|
| `LLM_ROUTING_STRATEGY` | `cost_optimized` | Off unless selected | Enterprise-only routing strategy |
| `PROVIDER_COSTS` | `provider:cost,...` | Runtime defaults | Used by cost-optimized routing |
Routing Strategies
Weighted
```shell
LLM_ROUTING_STRATEGY=weighted
PROVIDER_WEIGHTS=openai:50,anthropic:30,gemini:20
```
Use this for gradual migrations, split traffic validation, and steady multi-provider production routing.
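The selection logic behind weighted routing can be sketched as picking the provider whose weight band a random roll lands in. This is a minimal illustrative sketch, not AxonFlow's internal implementation; the function name and shape are assumptions.

```typescript
type Weights = Record<string, number>;

// Pick a provider with probability proportional to its weight.
// `roll` is a number in [0, 1); it is a parameter here (rather than
// calling Math.random() internally) so the function is deterministic.
function selectWeighted(weights: Weights, roll: number): string {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let cursor = roll * total;
  for (const [provider, weight] of entries) {
    cursor -= weight;
    if (cursor < 0) return provider;
  }
  // Floating-point edge case: fall back to the last provider.
  return entries[entries.length - 1][0];
}
```

With `openai:50,anthropic:30,gemini:20`, rolls below 0.5 select openai, rolls in [0.5, 0.8) select anthropic, and the rest select gemini.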
Round Robin
```shell
LLM_ROUTING_STRATEGY=round_robin
```
Use this when you want balanced distribution across healthy providers.
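Round robin can be sketched as a counter cycling through the healthy subset of configured providers. This sketch is illustrative only; the factory name and health-set parameter are assumptions, not AxonFlow's API.

```typescript
// Returns a selector that cycles through whichever providers are
// currently healthy, in configuration order.
function makeRoundRobin(providers: string[]) {
  let index = 0;
  return (healthy: Set<string>): string => {
    const eligible = providers.filter((p) => healthy.has(p));
    if (eligible.length === 0) throw new Error("no healthy providers");
    const choice = eligible[index % eligible.length];
    index += 1;
    return choice;
  };
}
```

Note that unhealthy providers are skipped rather than counted, so traffic stays balanced across the providers that can actually serve it.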
Failover
```shell
LLM_ROUTING_STRATEGY=failover
DEFAULT_LLM_PROVIDER=anthropic
```
Use this when one provider is primary and the others are operational backup paths.
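Failover selection reduces to: use the primary if it is healthy, otherwise the first healthy backup. The sketch below is a hypothetical illustration of that rule; the function name and parameters are assumptions.

```typescript
// Prefer the configured primary (DEFAULT_LLM_PROVIDER); otherwise
// take the first healthy provider from the ordered backup list.
function selectFailover(
  primary: string,
  backups: string[],
  healthy: Set<string>,
): string {
  if (healthy.has(primary)) return primary;
  for (const provider of backups) {
    if (healthy.has(provider)) return provider;
  }
  throw new Error("no healthy providers");
}
```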
Cost Optimized
```shell
LLM_ROUTING_STRATEGY=cost_optimized
PROVIDER_COSTS=ollama:0,anthropic:0.025,openai:0.03
```
Use this when you want the runtime to choose the cheapest healthy provider automatically. This strategy is enterprise-only.
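Conceptually, cost-optimized routing picks the minimum-cost entry from the `PROVIDER_COSTS` map, restricted to healthy providers. This is a hypothetical sketch of that rule, not AxonFlow's implementation.

```typescript
// Choose the cheapest healthy provider from a provider -> cost map,
// mirroring PROVIDER_COSTS=ollama:0,anthropic:0.025,openai:0.03.
function selectCheapest(
  costs: Record<string, number>,
  healthy: Set<string>,
): string {
  let best: string | null = null;
  for (const [provider, cost] of Object.entries(costs)) {
    if (!healthy.has(provider)) continue;
    if (best === null || cost < costs[best]) best = provider;
  }
  if (best === null) throw new Error("no healthy providers");
  return best;
}
```

With the costs above, a healthy Ollama instance absorbs all traffic at cost 0; if it goes down, routing shifts to the next cheapest healthy provider.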
Request-Level Hints
Clients can provide a provider and model preference in the request context:
```typescript
const response = await client.proxyLLMCall({
  userToken: 'user-123',
  query: 'Draft a security review memo.',
  requestType: 'chat',
  context: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
  },
});
```
By default:
- `context.provider` is a preference; AxonFlow may still route elsewhere if the selected provider is unhealthy or the server strategy says otherwise
To hard-pin a request:
```typescript
const response = await client.proxyLLMCall({
  userToken: 'user-123',
  query: 'Draft a security review memo.',
  requestType: 'chat',
  context: {
    provider: 'anthropic',
    model: 'claude-sonnet-4-20250514',
    strict_provider: true,
  },
});
```
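The soft-preference vs. strict-pin distinction can be sketched as follows. This is an illustrative assumption about the resolution rule, not AxonFlow's actual code: in particular, the assumption that a strict pin fails the request when the pinned provider is unhealthy (rather than rerouting) is inferred from the pinning semantics above.

```typescript
// Resolve a requested provider against current health state.
// - soft preference: fall through to a healthy fallback
// - strict pin: fail rather than silently reroute
function resolveProvider(
  preferred: string,
  strict: boolean,
  healthy: Set<string>,
  fallbacks: string[],
): string {
  if (healthy.has(preferred)) return preferred;
  if (strict) {
    throw new Error(`provider ${preferred} unavailable and strictly pinned`);
  }
  const alternative = fallbacks.find((p) => healthy.has(p));
  if (alternative === undefined) throw new Error("no healthy providers");
  return alternative;
}
```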
When Teams Usually Need More Than One Provider
- A cloud primary plus Ollama fallback for development or regulated environments
- Anthropic for long-context reasoning, OpenAI for general-purpose chat
- Azure OpenAI for production alignment with existing Azure controls, but a second provider for resilience
- Bedrock added later as the company moves from pilot to governed enterprise rollout
Practical Guidance
- Start with one provider and verify policy behavior first.
- Add a second provider when you have a concrete reason: resilience, cost, region, or model fit.
- Use soft preferences first and only enable strict pinning where the downstream workflow truly requires it.
Common Production Use Cases
- OpenAI primary with Anthropic failover for customer-facing copilots
- Ollama plus cloud fallback for private enterprise assistants
- Azure OpenAI for Microsoft-centric deployment with a second provider for resilience
- Bedrock for regulated AWS estates that need enterprise-only provider operations
