# LLM Providers Overview

AxonFlow supports multiple LLM providers out of the box, allowing you to choose the best provider for your use case, compliance requirements, and cost constraints.
## Supported Providers

All LLM providers are available in the open-source edition:
| Provider | Models | Best For |
|---|---|---|
| OpenAI | GPT-4, GPT-4o, GPT-3.5-turbo | General purpose, latest capabilities |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | Long context, safety-focused |
| AWS Bedrock | Claude, Llama, Titan, Mistral | HIPAA compliance, VPC isolation |
| Google Gemini | Gemini Pro, Gemini Ultra | Multimodal, Google ecosystem |
| Ollama | Llama 3.1, Mistral, Mixtral | Self-hosted, air-gapped environments |
| Custom | Any | Bring-your-own providers via the SDK |
## Provider Selection

Choose your LLM provider based on the following factors:
### Compliance Requirements
| Requirement | Recommended Provider |
|---|---|
| HIPAA | AWS Bedrock with VPC endpoints |
| FedRAMP | Ollama (self-hosted) or AWS GovCloud |
| Air-gapped | Ollama |
| Data residency | Bedrock (regional) or Ollama |
### Cost Optimization
| Provider | Cost per 1K tokens (approx) | Best For |
|---|---|---|
| Ollama | $0 (hardware only) | High volume, predictable cost |
| Bedrock (Claude) | $0.015 | HIPAA with cost savings |
| OpenAI (GPT-4o) | $0.005 | General purpose |
| Anthropic (Sonnet) | $0.003 | Cost-effective quality |
### Latency
| Provider | Typical Latency | Best For |
|---|---|---|
| Ollama | 50-200ms | Real-time applications |
| OpenAI | 200-500ms | Interactive apps |
| Bedrock | 300-800ms | Batch processing |
## Configuration

### Environment Variables

The simplest way to configure providers is through environment variables:
```bash
# OpenAI
export OPENAI_API_KEY=sk-xxx

# Anthropic
export ANTHROPIC_API_KEY=sk-ant-xxx

# AWS Bedrock (uses the AWS credential chain)
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=xxx
export AWS_SECRET_ACCESS_KEY=xxx

# Google Gemini
export GOOGLE_API_KEY=xxx

# Ollama
export OLLAMA_ENDPOINT=http://localhost:11434
```
### YAML Configuration

For finer-grained control, such as per-provider priorities and routing weights, use a YAML configuration file:
```yaml
# axonflow.yaml
version: "1.0"

llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
      max_tokens: 4096
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 10
    weight: 0.5

  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
      max_tokens: 8192
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 8
    weight: 0.3

  bedrock:
    enabled: true
    config:
      model: anthropic.claude-3-5-sonnet-20241022-v2:0
      region: us-east-1
      max_tokens: 4096
    priority: 5
    weight: 0.2

  ollama:
    enabled: true
    config:
      endpoint: ${OLLAMA_ENDPOINT:-http://localhost:11434}
      model: llama3.1:70b
    priority: 3
    weight: 0.0  # Fallback only
```
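In the example above, the weights of the three cloud providers (0.5 + 0.3 + 0.2) sum to 1.0, so under weighted routing they would receive roughly a 50/30/20 traffic split. Ollama's weight of 0.0 keeps it out of the weighted rotation, so it only receives traffic as a fallback when the higher-priority providers are unavailable.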
## Multi-Provider Routing

AxonFlow supports intelligent routing across multiple providers.
### Routing Strategies
| Strategy | Description | Use Case |
|---|---|---|
| Priority | Use highest priority available | Failover scenarios |
| Weighted | Distribute by weight | Load balancing |
| Cost-Optimized | Route to cheapest first | Cost reduction |
| Round-Robin | Cycle requests through providers in turn | Even load distribution |
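The strategy would typically be selected in the same YAML file. Below is a minimal sketch; note that the `routing` block and `strategy` key are hypothetical names used for illustration and may not match AxonFlow's actual schema:

```yaml
# axonflow.yaml (sketch -- `routing` and `strategy` are hypothetical
# key names for illustration, not confirmed AxonFlow configuration)
routing:
  strategy: weighted   # one of: priority, weighted, cost-optimized, round-robin
```

With `weighted`, the per-provider `weight` values from the configuration example above would determine the traffic split.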
### Automatic Failover

When a provider fails:

- The request is retried with exponential backoff
- After a threshold number of failures, the provider is marked unhealthy
- Traffic automatically routes to the remaining healthy providers
- Periodic health checks restore the provider once it recovers (a configuration sketch follows this list)
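Here is a minimal sketch of what retry and failover tuning could look like. The `failover` block and every key in it are hypothetical names chosen for illustration; only the behavior (retries with exponential backoff, an unhealthy threshold, and health-check recovery) comes from the description above:

```yaml
# Sketch only: `failover` and its keys are hypothetical names.
failover:
  max_retries: 3              # attempts per request before failing over
  backoff:
    initial_ms: 250           # delay before the first retry
    multiplier: 2.0           # exponential backoff factor
  unhealthy_threshold: 5      # failures before the provider is marked unhealthy
  health_check_interval: 30s  # how often unhealthy providers are re-probed
```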
### Circuit Breaker

A circuit breaker prevents cascading failures (see the sketch after this list):

- Opens after a configurable failure threshold (default: 5)
- Blocks requests to the unhealthy provider while open
- Automatically closes after a reset timeout
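And a matching sketch for the circuit breaker, again with hypothetical key names; the default threshold of 5 is the one stated above, while the reset timeout value is illustrative:

```yaml
# Sketch only: `circuit_breaker` and its keys are hypothetical names.
circuit_breaker:
  failure_threshold: 5   # failures before the circuit opens (default noted above)
  reset_timeout: 60s     # how long the circuit stays open before closing again
```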
## Provider-Specific Guides
- AWS Bedrock Setup - HIPAA-compliant deployment
- Ollama Setup - Self-hosted deployment
- Custom Provider SDK - Build your own provider
## Enterprise Features

Enterprise customers get additional capabilities via the Customer Portal:
- Runtime Configuration - Change providers without redeployment
- Credential Management - Secure API key storage and rotation
- Advanced Monitoring - Per-provider metrics and cost tracking
- SLA Management - Provider-specific SLOs and alerting
See Enterprise Provider Features for details.