# Anthropic Claude Setup
AxonFlow supports Anthropic's Claude models for LLM routing and orchestration. Claude is available in the Community edition and is known for its safety-focused design and long context capabilities.
## Prerequisites
- Anthropic account
- API key from Anthropic Console
## Quick Start

### 1. Get API Key

- Go to Anthropic Console
- Click "Create Key"
- Copy the generated key

### 2. Configure AxonFlow

```bash
# Required
export ANTHROPIC_API_KEY=sk-ant-your-api-key-here

# Optional: Specify model (default: claude-3-5-sonnet-20241022)
export ANTHROPIC_MODEL=claude-3-5-haiku-20241022
```

### 3. Start AxonFlow

```bash
docker compose up -d
```
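Once the containers are up, you can confirm the provider is reachable by hitting the health endpoint shown in the Health Checks section below. A quick TypeScript sketch (assumes the default port 8081 used later in this guide):

```typescript
// Poll the provider health endpoint (see Health Checks below)
const health = await fetch('http://localhost:8081/api/v1/llm-providers/anthropic/health');
console.log(await health.json()); // expect "status": "healthy"
```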
## Supported Models

| Model | Context Window | Best For |
|---|---|---|
| `claude-sonnet-4-20250514` | 200K tokens | Latest, balanced quality/speed |
| `claude-opus-4-20250514` | 200K tokens | Highest capability |
| `claude-3-5-sonnet-20241022` | 200K tokens | Fast, high quality (default) |
| `claude-3-5-haiku-20241022` | 200K tokens | Fastest, cost-effective |
| `claude-3-opus-20240229` | 200K tokens | Complex reasoning |
| `claude-3-sonnet-20240229` | 200K tokens | Balanced (legacy) |
| `claude-3-haiku-20240307` | 200K tokens | Fast responses (legacy) |
## Configuration Options

### Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | - | Anthropic API key |
| `ANTHROPIC_MODEL` | No | `claude-3-5-sonnet-20241022` | Default model |
| `ANTHROPIC_ENDPOINT` | No | `https://api.anthropic.com` | API endpoint |
| `ANTHROPIC_TIMEOUT_SECONDS` | No | 120 | Request timeout (seconds) |

Always set `ANTHROPIC_MODEL` explicitly if your API key has limited model access (e.g., only `claude-3-haiku-20240307`). AxonFlow uses the configured model for all requests to this provider.
### YAML Configuration

For more control, use YAML configuration:

```yaml
# axonflow.yaml
llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
      max_tokens: 8192
      timeout: 120s
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 10
    weight: 0.5
```
## Capabilities

The Anthropic provider supports:

- **Chat completions** - Conversational AI
- **Streaming responses** - Real-time token streaming
- **Long context** - Up to 200K tokens
- **Vision** - Image understanding
- **Tool use** - Function calling (see the sketch after this list)
- **Code generation** - Programming assistance
- **Constitutional AI** - Built-in safety alignment
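As an illustration of tool use, here is a minimal sketch with the Anthropic TypeScript SDK; the `get_weather` tool is hypothetical and shown only to demonstrate the request shape:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Declare a tool; the model may respond with a tool_use block
// instead of plain text when the tool is relevant.
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather', // hypothetical tool for illustration
      description: 'Get the current weather for a city',
      input_schema: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'City name' }
        },
        required: ['city']
      }
    }
  ],
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }]
});

// Inspect the response for tool calls
for (const block of message.content) {
  if (block.type === 'tool_use') {
    console.log(`Model requested tool: ${block.name}`, block.input);
  }
}
```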
## Usage Examples

### Proxy Mode (Python SDK)

Proxy mode routes requests through AxonFlow for simple integration:
```python
import asyncio

from axonflow import AxonFlow

async def main():
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain quantum computing",
            request_type="chat",
            context={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},
        )
        print(response.content)

asyncio.run(main())
```
### Proxy Mode (cURL)

```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is quantum computing?",
    "provider": "anthropic",
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500
  }'
```
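If you prefer to stay in TypeScript, the same proxy request can be made with `fetch`; this sketch simply mirrors the cURL payload above, and the logged response shape will match whatever your AxonFlow version returns:

```typescript
// Mirrors the cURL request above: same endpoint, headers, and payload fields
const res = await fetch('http://localhost:8080/api/request', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-User-Token': 'user-123'
  },
  body: JSON.stringify({
    query: 'What is quantum computing?',
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 500
  })
});
console.log(await res.json());
```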
### Gateway Mode (TypeScript SDK)

Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:

```typescript
import { AxonFlow } from '@axonflow/sdk';
import Anthropic from '@anthropic-ai/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain quantum computing'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Anthropic directly, timing the call for the audit record
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const startedAt = Date.now();
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: ctx.approvedData.query }]
});
const latencyMs = Date.now() - startedAt;

// The first content block is not guaranteed to be text, so check its type
const first = message.content[0];
const response = first.type === 'text' ? first.text : '';

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  tokenUsage: {
    promptTokens: message.usage.input_tokens,
    completionTokens: message.usage.output_tokens,
    totalTokens: message.usage.input_tokens + message.usage.output_tokens
  },
  latencyMs
});
```
### Streaming

Anthropic supports server-sent events (SSE) for streaming responses:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const stream = await anthropic.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

for await (const chunk of stream) {
  // Only text deltas carry a .text payload; tool-use streams emit
  // input_json_delta chunks, so check the delta type as well
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}
```
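The SDK's `MessageStream` also exposes event helpers, which can read more cleanly than iterating raw events; a short sketch:

```typescript
// Equivalent approach using the stream's event helpers
const stream = anthropic.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

stream.on('text', (text) => process.stdout.write(text));

// finalMessage() resolves once the complete response has arrived
const message = await stream.finalMessage();
console.log(`\nStop reason: ${message.stop_reason}`);
```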
## Pricing
Anthropic pricing (as of December 2025):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3.5 Haiku | $0.80 | $4.00 |
| Claude 3 Opus | $15.00 | $75.00 |
AxonFlow provides cost estimation via the `/api/cost/estimate` endpoint.
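A rough sketch of calling it from TypeScript; note that the request body fields below are illustrative assumptions rather than a documented schema, so check your AxonFlow API reference for the exact contract:

```typescript
// NOTE: the body fields here are assumptions for illustration,
// not a documented schema
const estimate = await fetch('http://localhost:8080/api/cost/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
    input_tokens: 2000,
    output_tokens: 500
  })
});
console.log(await estimate.json());
```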
## Multi-Provider Routing

Configure Anthropic alongside other providers for intelligent routing:

```yaml
llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 100
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
```
## Health Checks

Check the Anthropic provider health status:

```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/anthropic/health
```

Response:

```json
{
  "provider": "anthropic",
  "status": "healthy",
  "latency_ms": 52,
  "model": "claude-3-5-sonnet-20241022"
}
```

To check all configured providers at once:

```bash
curl http://localhost:8081/api/v1/llm-providers/status
```

Response:

```json
{
  "providers": {
    "anthropic": {
      "status": "healthy",
      "latency_ms": 52,
      "model": "claude-3-5-sonnet-20241022"
    },
    "openai": {
      "status": "healthy",
      "latency_ms": 45,
      "model": "gpt-4o"
    }
  }
}
```
## Error Handling
Common error codes from Anthropic:
| Status | Reason | Action |
|---|---|---|
| 400 | Invalid request | Check request format |
| 401 | Invalid API key | Verify `ANTHROPIC_API_KEY` |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
| 529 | API overloaded | Retry with backoff |
AxonFlow automatically handles retries for transient errors (429, 500, 529).
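In gateway mode you call Anthropic directly, so transient failures are yours to handle. A minimal exponential-backoff sketch (retry counts and delays are illustrative; the official SDK also retries some errors itself via its `maxRetries` client option):

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Retry transient failures (429, 500, 529) with exponential backoff
async function createWithRetry(
  params: Anthropic.MessageCreateParamsNonStreaming,
  maxRetries = 3
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await anthropic.messages.create(params);
    } catch (err) {
      const status = err instanceof Anthropic.APIError ? err.status : undefined;
      const transient = status === 429 || status === 500 || status === 529;
      if (!transient || attempt >= maxRetries) throw err;
      const delayMs = 2 ** attempt * 1000; // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```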
## Best Practices

- **Use appropriate models** - Sonnet for most tasks, Haiku for speed, Opus for complex reasoning
- **Set reasonable timeouts** - The 120s default is good for most use cases
- **Enable fallback providers** - Configure OpenAI/Gemini as backup
- **Monitor costs** - Use AxonFlow's cost dashboard to track usage
- **Leverage long context** - Claude handles up to 200K tokens well
## Troubleshooting

### "Invalid API key"

- Verify the key at Anthropic Console
- Ensure the key hasn't been disabled
- Check for leading/trailing whitespace

### "Model not found"

- Verify the model name format (e.g., `claude-3-5-sonnet-20241022`)
- Check if the model is available in your region
- Note: Older model versions may be deprecated

### "Rate limit exceeded"

- Check usage at Anthropic Console
- Request a rate limit increase
- Implement exponential backoff
## Next Steps

- **LLM Providers Overview** - All supported providers
- **OpenAI Setup** - Alternative provider
- **Google Gemini Setup** - Multimodal capabilities
- **Custom Provider SDK** - Build custom providers