# Anthropic Claude Setup
AxonFlow supports Anthropic's Claude models for LLM routing and orchestration. Claude is available in the Community edition and is known for its safety-focused design and long context capabilities.
## Prerequisites
- Anthropic account
- API key from Anthropic Console
## Quick Start

### 1. Get API Key

- Go to Anthropic Console
- Click "Create Key"
- Copy the generated key

### 2. Configure AxonFlow

```bash
# Required
export ANTHROPIC_API_KEY=sk-ant-your-api-key-here

# Optional: Specify model (default: claude-3-5-sonnet-20241022)
export ANTHROPIC_MODEL=claude-3-5-haiku-20241022
```

### 3. Start AxonFlow

```bash
docker compose up -d
```
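Once the containers are up, you can confirm the provider is reachable by hitting the health endpoint shown in the Health Checks section below. A quick TypeScript sketch (assumes the default port 8081 used later in this guide):

```typescript
// Poll the provider health endpoint (see Health Checks below)
const health = await fetch('http://localhost:8081/api/v1/llm-providers/anthropic/health');
console.log(await health.json()); // expect "status": "healthy"
```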
## Supported Models

| Model | Context Window | Best For |
|---|---|---|
| `claude-sonnet-4-20250514` | 200K tokens | Latest, balanced quality/speed |
| `claude-opus-4-20250514` | 200K tokens | Highest capability |
| `claude-3-5-sonnet-20241022` | 200K tokens | Fast, high quality (default) |
| `claude-3-5-haiku-20241022` | 200K tokens | Fastest, cost-effective |
| `claude-3-opus-20240229` | 200K tokens | Complex reasoning |
| `claude-3-sonnet-20240229` | 200K tokens | Balanced (legacy) |
| `claude-3-haiku-20240307` | 200K tokens | Fast responses (legacy) |
## Configuration Options

### Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | - | Anthropic API key |
| `ANTHROPIC_MODEL` | No | `claude-3-5-sonnet-20241022` | Default model |
| `ANTHROPIC_ENDPOINT` | No | `https://api.anthropic.com` | API endpoint |
| `ANTHROPIC_TIMEOUT_SECONDS` | No | 120 | Request timeout (seconds) |

Always set `ANTHROPIC_MODEL` explicitly if your API key has limited model access (e.g., only `claude-3-haiku-20240307`). AxonFlow uses the configured model for all requests to this provider.
### YAML Configuration

For more control, use YAML configuration:

```yaml
# axonflow.yaml
llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
      max_tokens: 8192
      timeout: 120s
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 10
    weight: 0.5
```
## Capabilities

The Anthropic provider supports:

- **Chat completions** - Conversational AI
- **Streaming responses** - Real-time token streaming
- **Long context** - Up to 200K tokens
- **Vision** - Image understanding
- **Tool use** - Function calling (see the sketch after this list)
- **Code generation** - Programming assistance
- **Constitutional AI** - Built-in safety alignment
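As an illustration of tool use, here is a minimal sketch with the Anthropic TypeScript SDK; the `get_weather` tool is hypothetical and shown only to demonstrate the request shape:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Declare a tool; the model may respond with a tool_use block
// instead of plain text when the tool is relevant.
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  tools: [
    {
      name: 'get_weather', // hypothetical tool for illustration
      description: 'Get the current weather for a city',
      input_schema: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'City name' }
        },
        required: ['city']
      }
    }
  ],
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }]
});

// Inspect the response for tool calls
for (const block of message.content) {
  if (block.type === 'tool_use') {
    console.log(`Model requested tool: ${block.name}`, block.input);
  }
}
```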
## Usage Examples

### Proxy Mode (Python SDK)

Proxy mode routes requests through AxonFlow for simple integration:
```python
import asyncio

from axonflow import AxonFlow

async def main():
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain quantum computing",
            request_type="chat",
            context={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},
        )
        print(response.content)

asyncio.run(main())
```
### Proxy Mode (cURL)

```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is quantum computing?",
    "provider": "anthropic",
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 500
  }'
```
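If you prefer to stay in TypeScript, the same proxy request can be made with `fetch`; this sketch simply mirrors the cURL payload above, and the logged response shape will match whatever your AxonFlow version returns:

```typescript
// Mirrors the cURL request above: same endpoint, headers, and payload fields
const res = await fetch('http://localhost:8080/api/request', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-User-Token': 'user-123'
  },
  body: JSON.stringify({
    query: 'What is quantum computing?',
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 500
  })
});
console.log(await res.json());
```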
### Gateway Mode (TypeScript SDK)

Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:

```typescript
import { AxonFlow } from '@axonflow/sdk';
import Anthropic from '@anthropic-ai/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain quantum computing'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Anthropic directly, timing the call for the audit record
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const startedAt = Date.now();
const message = await anthropic.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: ctx.approvedData.query }]
});
const latencyMs = Date.now() - startedAt;

// The first content block is not guaranteed to be text, so check its type
const first = message.content[0];
const response = first.type === 'text' ? first.text : '';

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
  tokenUsage: {
    promptTokens: message.usage.input_tokens,
    completionTokens: message.usage.output_tokens,
    totalTokens: message.usage.input_tokens + message.usage.output_tokens
  },
  latencyMs
});
```
### Streaming

Anthropic supports server-sent events (SSE) for streaming responses:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const stream = await anthropic.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

for await (const chunk of stream) {
  // Only text deltas carry a .text payload; tool-use streams emit
  // input_json_delta chunks, so check the delta type as well
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}
```
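The SDK's `MessageStream` also exposes event helpers, which can read more cleanly than iterating raw events; a short sketch:

```typescript
// Equivalent approach using the stream's event helpers
const stream = anthropic.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a story' }]
});

stream.on('text', (text) => process.stdout.write(text));

// finalMessage() resolves once the complete response has arrived
const message = await stream.finalMessage();
console.log(`\nStop reason: ${message.stop_reason}`);
```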
## Pricing
Anthropic pricing (as of December 2025):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Sonnet 4 | $3.00 | $15.00 |
| Claude Opus 4 | $15.00 | $75.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| Claude 3.5 Haiku | $0.80 | $4.00 |
| Claude 3 Opus | $15.00 | $75.00 |
AxonFlow provides cost estimation via the `/api/cost/estimate` endpoint.
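A rough sketch of calling it from TypeScript; note that the request body fields below are illustrative assumptions rather than a documented schema, so check your AxonFlow API reference for the exact contract:

```typescript
// NOTE: the body fields here are assumptions for illustration,
// not a documented schema
const estimate = await fetch('http://localhost:8080/api/cost/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
    input_tokens: 2000,
    output_tokens: 500
  })
});
console.log(await estimate.json());
```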
## Multi-Provider Routing

Configure Anthropic alongside other providers for intelligent routing:

```yaml
llm_providers:
  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 100
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
```
## Health Checks

Check the Anthropic provider health status:

```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/anthropic/health
```

Response:

```json
{
  "provider": "anthropic",
  "status": "healthy",
  "latency_ms": 52,
  "model": "claude-3-5-sonnet-20241022"
}
```

To check all configured providers at once:

```bash
curl http://localhost:8081/api/v1/llm-providers/status
```

Response:

```json
{
  "providers": {
    "anthropic": {
      "status": "healthy",
      "latency_ms": 52,
      "model": "claude-3-5-sonnet-20241022"
    },
    "openai": {
      "status": "healthy",
      "latency_ms": 45,
      "model": "gpt-4o"
    }
  }
}
```
## Error Handling
Common error codes from Anthropic:
| Status | Reason | Action |
|---|---|---|
| 400 | Invalid request | Check request format |
| 401 | Invalid API key | Verify `ANTHROPIC_API_KEY` |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
| 529 | API overloaded | Retry with backoff |
AxonFlow automatically handles retries for transient errors (429, 500, 529).
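In gateway mode you call Anthropic directly, so transient failures are yours to handle. A minimal exponential-backoff sketch (retry counts and delays are illustrative; the official SDK also retries some errors itself via its `maxRetries` client option):

```typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Retry transient failures (429, 500, 529) with exponential backoff
async function createWithRetry(
  params: Anthropic.MessageCreateParamsNonStreaming,
  maxRetries = 3
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await anthropic.messages.create(params);
    } catch (err) {
      const status = err instanceof Anthropic.APIError ? err.status : undefined;
      const transient = status === 429 || status === 500 || status === 529;
      if (!transient || attempt >= maxRetries) throw err;
      const delayMs = 2 ** attempt * 1000; // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```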
## Best Practices

- **Use appropriate models** - Sonnet for most tasks, Haiku for speed, Opus for complex reasoning
- **Set reasonable timeouts** - The 120s default is good for most use cases
- **Enable fallback providers** - Configure OpenAI/Gemini as backup
- **Monitor costs** - Use AxonFlow's cost dashboard to track usage
- **Leverage long context** - Claude handles up to 200K tokens well
## Troubleshooting

### "Invalid API key"

- Verify the key at Anthropic Console
- Ensure the key hasn't been disabled
- Check for leading/trailing whitespace

### "Model not found"

- Verify the model name format (e.g., `claude-3-5-sonnet-20241022`)
- Check if the model is available in your region
- Note: Older model versions may be deprecated

### "Rate limit exceeded"

- Check usage at Anthropic Console
- Request a rate limit increase
- Implement exponential backoff
## Next Steps

- **LLM Providers Overview** - All supported providers
- **OpenAI Setup** - Alternative provider
- **Google Gemini Setup** - Multimodal capabilities
- **Custom Provider SDK** - Build custom providers