# LLM Providers Overview
AxonFlow supports multiple LLM providers out of the box, allowing you to choose the best provider for your use case, compliance requirements, and cost constraints.
## Supported Providers
| Provider | Models | Best For | Edition |
|---|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4, o1, o3-mini | General purpose, latest capabilities | Community |
| Azure OpenAI | GPT-4o, GPT-4, GPT-4-turbo (Azure-hosted) | Azure ecosystem, enterprise security, data residency | Community |
| Anthropic | Claude Opus 4.6, Claude 4.5 Sonnet, Claude 4.5 Haiku, Claude 3.5 Sonnet | Long context, safety-focused, reasoning | Community |
| Google Gemini | Gemini 2.0 Flash, Gemini 1.5 Pro, Gemini 1.5 Flash | Multimodal, Google ecosystem | Community |
| Ollama | Llama 3.3, Mistral, Mixtral, DeepSeek | Self-hosted, air-gapped environments | Community |
| AWS Bedrock | Claude, Llama, Titan, Mistral | HIPAA compliance, VPC isolation | Enterprise |
| Custom | Any | Custom providers via SDK | Community |
## Try It: Route a Query Through AxonFlow
Send a query through AxonFlow and have it routed to your configured LLM provider. AxonFlow handles provider selection, policy enforcement, and audit logging automatically.
**curl**

```bash
curl -X POST http://localhost:8080/api/v1/query/execute \
  -H "Content-Type: application/json" \
  -H "X-Client-Id: my-tenant" \
  -H "X-Client-Secret: your-client-secret" \
  -d '{
    "query": "Explain the difference between symmetric and asymmetric encryption",
    "provider": "openai",
    "model": "gpt-4o"
  }'
```
Response:

```json
{
  "response": "Symmetric encryption uses a single shared key for both encryption and decryption...",
  "provider": "openai",
  "model": "gpt-4o",
  "token_usage": {
    "prompt_tokens": 14,
    "completion_tokens": 210,
    "total_tokens": 224
  },
  "policy_info": {
    "allowed": true,
    "applied_policies": ["pii-global-email", "security-sqli-union"],
    "risk_score": 0.05
  }
}
```
## SDK Examples
**TypeScript**

```typescript
import { AxonFlow } from '@axonflow/sdk';

const client = new AxonFlow({
  endpoint: 'http://localhost:8080',
  clientId: 'my-tenant',
  clientSecret: 'your-client-secret',
});

const result = await client.executeQuery({
  query: 'Explain the difference between symmetric and asymmetric encryption',
  provider: 'openai',
  model: 'gpt-4o',
});

console.log(result.response);
console.log('Tokens used:', result.tokenUsage.totalTokens);
console.log('Policy allowed:', result.policyInfo.allowed);
```
**Python**

```python
from axonflow import AxonFlow

client = AxonFlow(
    endpoint="http://localhost:8080",
    client_id="my-tenant",
    client_secret="your-client-secret",
)

result = client.proxy_llm_call(
    query="Explain the difference between symmetric and asymmetric encryption",
    provider="openai",
    model="gpt-4o",
)

print(result.response)
print(f"Tokens used: {result.token_usage.total_tokens}")
print(f"Policy allowed: {result.policy_info.allowed}")
```
**Go**

```go
package main

import (
	"fmt"

	axonflow "github.com/getaxonflow/axonflow-sdk-go/v3"
)

func main() {
	client := axonflow.NewClient(axonflow.AxonFlowConfig{
		Endpoint:     "http://localhost:8080",
		ClientID:     "my-tenant",
		ClientSecret: "your-client-secret",
	})

	result, err := client.ProxyLLMCall(
		"",
		"Explain the difference between symmetric and asymmetric encryption",
		"chat",
		map[string]interface{}{
			"provider": "openai",
			"model":    "gpt-4o",
		},
	)
	if err != nil {
		panic(err)
	}

	fmt.Println(result.Response)
	fmt.Printf("Tokens used: %d\n", result.TokenUsage.TotalTokens)
	fmt.Printf("Policy allowed: %t\n", result.PolicyInfo.Allowed)
}
```
## Provider Selection
Choose your LLM provider based on:
### Compliance Requirements
| Requirement | Recommended Provider |
|---|---|
| HIPAA | AWS Bedrock with VPC endpoints |
| FedRAMP | Ollama (self-hosted) or AWS GovCloud |
| Air-gapped | Ollama |
| Data residency | Bedrock (regional) or Ollama |
### Cost Optimization
| Provider | Cost per 1K tokens (approx) | Best For |
|---|---|---|
| Ollama | $0 (hardware only) | High volume, predictable cost |
| Bedrock (Claude) | $0.015 | HIPAA with cost savings |
| OpenAI (GPT-4o) | $0.005 | General purpose |
| Anthropic (Sonnet) | $0.003 | Cost-effective quality |
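To turn the per-1K rates above into a budget estimate, here is a quick back-of-envelope calculation. This is illustrative arithmetic only; the rates are approximate and real pricing varies by exact model and the input/output token split.

```python
# Back-of-envelope monthly cost at 10M tokens/month, using the
# approximate per-1K-token rates from the table above.
rates_per_1k = {
    "ollama":           0.0,    # self-hosted: hardware cost only
    "bedrock-claude":   0.015,
    "openai-gpt-4o":    0.005,
    "anthropic-sonnet": 0.003,
}

monthly_tokens = 10_000_000

for provider, rate in rates_per_1k.items():
    cost = monthly_tokens / 1000 * rate
    print(f"{provider}: ${cost:,.2f}/month")
```

At this volume the gap is stark: roughly $150/month on Bedrock Claude versus $30/month on Anthropic Sonnet, which is why weight-based routing toward cheaper providers (see Multi-Provider Routing below) can matter at scale.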
### Latency
| Provider | Typical Latency | Best For |
|---|---|---|
| Ollama | 50-200ms | Real-time applications |
| OpenAI | 200-500ms | Interactive apps |
| Bedrock | 300-800ms | Batch processing |
## Configuration

### Environment Variables
The simplest way to configure providers:
```bash
# OpenAI
export OPENAI_API_KEY=sk-xxx

# Azure OpenAI
export AZURE_OPENAI_ENDPOINT=https://your-resource.cognitiveservices.azure.com
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini

# Anthropic
export ANTHROPIC_API_KEY=sk-ant-xxx

# AWS Bedrock (uses AWS credential chain)
export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=xxx
export AWS_SECRET_ACCESS_KEY=xxx

# Google Gemini
export GOOGLE_API_KEY=xxx

# Ollama
export OLLAMA_ENDPOINT=http://localhost:11434
```
### YAML Configuration
For more control, use YAML configuration:
```yaml
# axonflow.yaml
version: "1.0"
llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
      max_tokens: 4096
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 10
    weight: 0.5
  anthropic:
    enabled: true
    config:
      model: claude-sonnet-4-20250514
      max_tokens: 8192
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 8
    weight: 0.3
  bedrock:
    enabled: true
    config:
      model: anthropic.claude-sonnet-4-20250514-v1:0
      region: us-east-1
      max_tokens: 4096
    priority: 5
    weight: 0.2
  ollama:
    enabled: true
    config:
      endpoint: ${OLLAMA_ENDPOINT:-http://localhost:11434}
      model: llama3.2:latest
    priority: 3
    weight: 0.0  # Fallback only
```
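The `${VAR}` and `${VAR:-default}` placeholders above follow shell-style substitution. Assuming AxonFlow expands them when the configuration is loaded (the example implies this, but the exact mechanism is not documented here), the resolution logic can be sketched as:

```python
import os
import re

# Shell-style placeholder resolution for values like
# ${OLLAMA_ENDPOINT:-http://localhost:11434}. Illustrative sketch only;
# not AxonFlow's actual loader.
_PLACEHOLDER = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_env(value, env=os.environ):
    """Replace ${VAR} and ${VAR:-default} with values from `env`."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        # Fall back to the inline default, or empty string if none given.
        return env.get(name, default if default is not None else "")
    return _PLACEHOLDER.sub(substitute, value)
```

With `OLLAMA_ENDPOINT` unset, the `ollama` endpoint above resolves to `http://localhost:11434`; setting the variable overrides the default.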
## Multi-Provider Routing
AxonFlow supports intelligent routing across multiple providers:
### Routing Strategies
| Strategy | Description | Use Case |
|---|---|---|
| Priority | Use highest priority available | Failover scenarios |
| Weighted | Distribute by weight | Load balancing |
| Cost-Optimized | Route to cheapest first | Cost reduction |
| Round-Robin | Rotate through providers in sequence | Even load distribution |
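As an illustration of the Weighted strategy, here is a minimal sketch of weight-proportional selection using the weights from the YAML example. The names and structure are ours, not AxonFlow's internals; unhealthy providers are skipped, consistent with the failover behavior described in the next section.

```python
import random

# Illustrative weight-proportional provider selection (Weighted strategy).
providers = {
    "openai":    {"weight": 0.5, "healthy": True},
    "anthropic": {"weight": 0.3, "healthy": True},
    "bedrock":   {"weight": 0.2, "healthy": True},
}

def pick_provider(providers, rng=random):
    """Pick a healthy provider with probability proportional to its weight."""
    healthy = {name: p for name, p in providers.items() if p["healthy"]}
    names = list(healthy)
    weights = [healthy[name]["weight"] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Over many requests this routes roughly 50% of traffic to `openai`, 30% to `anthropic`, and 20% to `bedrock`; a provider with weight `0.0` (like `ollama` in the YAML example) is never chosen while others are healthy.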
### Automatic Failover
When a provider fails:
1. The request is retried with exponential backoff
2. After a threshold of failures, the provider is marked unhealthy
3. Traffic automatically routes to healthy providers
4. Health checks restore the provider once it recovers
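The retry step above can be sketched as a simple exponential-backoff loop. The function name, attempt count, and delays are assumptions for the sketch, not AxonFlow configuration keys.

```python
import time

def call_with_retries(call, max_attempts=3, base_delay=0.5):
    """Invoke `call`; on failure, back off exponentially and retry.

    After the final attempt the error is re-raised, at which point the
    router would mark the provider unhealthy and fail over.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1.0s, 2.0s, ...
```

Doubling the delay between attempts gives a transiently overloaded provider time to recover instead of hammering it with immediate retries.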
### Circuit Breaker
Prevents cascading failures:
- Opens after configurable failure threshold (default: 5)
- Blocks requests to unhealthy provider
- Automatically closes after reset timeout
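A minimal sketch of this open/close behavior follows. The class and method names are illustrative (AxonFlow's internals are not shown here); only the failure threshold default mirrors the one stated above, and the sketch closes the breaker directly after the reset timeout rather than modeling a half-open probe state.

```python
import time

class CircuitBreaker:
    """Toy circuit breaker: trips open after N failures, re-closes after a timeout."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow_request(self):
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            self.opened_at = None  # reset timeout elapsed: close again
            self.failures = 0
            return True
        return False  # open: block requests to the unhealthy provider

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()  # trip open

    def record_success(self):
        self.failures = 0
```

While the breaker is open, the router would skip this provider entirely, which is what prevents a struggling backend from dragging down every request.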
## Provider-Specific Guides
- Provider Routing - Configure routing strategies (Community)
- OpenAI Setup - GPT-4o and GPT models (Community)
- Azure OpenAI Setup - Azure-hosted OpenAI models (Community)
- Anthropic Setup - Claude models (Community)
- Google Gemini Setup - Gemini models (Community)
- Ollama Setup - Self-hosted deployment (Community)
- AWS Bedrock Setup - HIPAA-compliant deployment (Enterprise)
- Custom Provider SDK - Build your own provider (Community)
## Community vs Enterprise
| Feature | Community | Enterprise |
|---|---|---|
| OpenAI, Anthropic, Gemini, Ollama | ✅ | ✅ |
| Azure OpenAI (environment variables) | ✅ | ✅ |
| Multi-provider routing and failover | ✅ | ✅ |
| Circuit breaker and health checks | ✅ | ✅ |
| Custom Provider SDK | ✅ | ✅ |
| Azure OpenAI (managed credentials) | ❌ | ✅ |
| AWS Bedrock (HIPAA, VPC isolation) | ❌ | ✅ |
| Runtime provider configuration | ❌ | ✅ |
| Credential encryption and rotation | ❌ | ✅ |
| Per-provider cost controls and budgets | ❌ | ✅ |
| Usage analytics and cost dashboards | ❌ | ✅ |
| SLA management and alerting | ❌ | ✅ |
Enterprise adds Azure OpenAI with managed credentials, AWS Bedrock for HIPAA-compliant deployments, credential encryption and rotation, per-provider cost controls, and usage analytics dashboards. Build custom providers with the Custom Provider SDK to integrate proprietary or specialized LLM endpoints.

Compare Editions | Request Demo | AWS Marketplace
See Enterprise Provider Features for details.
