OpenAI Setup

AxonFlow supports OpenAI's GPT models for LLM routing and orchestration. OpenAI is available in the Community edition.

Prerequisites

  • An OpenAI account with API access
  • A running AxonFlow installation (the steps below use Docker Compose)

Quick Start

1. Get API Key

  1. Go to the OpenAI Platform (https://platform.openai.com/api-keys)
  2. Click "Create new secret key"
  3. Copy the generated key

2. Configure AxonFlow

# Required
export OPENAI_API_KEY=sk-your-api-key-here

# Optional: Specify model (default: gpt-4o)
export OPENAI_MODEL=gpt-4o-mini
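
To confirm the key works before starting AxonFlow, you can list the models it can access against the OpenAI API directly:

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"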

3. Start AxonFlow

docker compose up -d

Supported Models

Model         | Context Window | Best For
gpt-4o        | 128K tokens    | Latest flagship model (default)
gpt-4o-mini   | 128K tokens    | Cost-effective, fast
gpt-4-turbo   | 128K tokens    | Previous flagship
gpt-3.5-turbo | 16K tokens     | Budget-friendly
o1-preview    | 128K tokens    | Advanced reasoning
o1-mini       | 128K tokens    | Fast reasoning

Configuration Options

Environment Variables

Variable               | Required | Default                | Description
OPENAI_API_KEY         | Yes      | -                      | OpenAI API key
OPENAI_MODEL           | No       | gpt-4o                 | Default model
OPENAI_ENDPOINT        | No       | https://api.openai.com | API endpoint
OPENAI_TIMEOUT_SECONDS | No       | 120                    | Request timeout (seconds)

Model Configuration

Always set OPENAI_MODEL explicitly if your API key has limited model access. AxonFlow uses the configured model for all requests to this provider. If not set, it defaults to gpt-4o.
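
For example, if your key only has access to a single model:

export OPENAI_MODEL=gpt-4o-mini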

YAML Configuration

For more control, use YAML configuration:

# axonflow.yaml
llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
      max_tokens: 4096
      timeout: 120s
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 10
    weight: 0.5
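
The ${OPENAI_API_KEY} reference is expanded from the environment when AxonFlow loads the file, so the key itself stays out of version control.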

Capabilities

The OpenAI provider supports:

  • Chat completions - Conversational AI
  • Streaming responses - Real-time token streaming (see the sketch after this list)
  • Function calling - Tool use and structured output
  • Vision - Image understanding (GPT-4o, GPT-4-turbo)
  • JSON mode - Structured output
  • Code generation - Programming assistance
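
When you call OpenAI directly in gateway mode (see below), streaming uses the OpenAI SDK itself. A minimal sketch with the official Node SDK:

import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Request a streaming completion and print tokens as they arrive
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Explain machine learning' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}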

Usage Examples

Proxy Mode (Python SDK)

Proxy mode routes requests through AxonFlow for simple integration:

import asyncio

from axonflow import AxonFlow

async def main() -> None:
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain machine learning",
            request_type="chat",
            context={"provider": "openai", "model": "gpt-4o"},
        )
        print(response.content)

asyncio.run(main())

Proxy Mode (cURL)

curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is machine learning?",
    "provider": "openai",
    "model": "gpt-4o",
    "max_tokens": 500
  }'

Gateway Mode (TypeScript SDK)

Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:

import { AxonFlow } from '@axonflow/sdk';
import OpenAI from 'openai';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain machine learning'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call OpenAI directly
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: ctx.approvedData.query }]
});
const response = completion.choices[0].message.content ?? '';

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'openai',
  model: 'gpt-4o',
  tokenUsage: {
    promptTokens: completion.usage?.prompt_tokens ?? 0,
    completionTokens: completion.usage?.completion_tokens ?? 0,
    totalTokens: completion.usage?.total_tokens ?? 0
  },
  latencyMs: 250
});
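
In a real integration, measure latencyMs around the OpenAI call (e.g., Date.now() before and after) rather than hard-coding it as in the example.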

Pricing

OpenAI pricing (as of December 2025):

Model         | Input (per 1M tokens) | Output (per 1M tokens)
GPT-4o        | $2.50                 | $10.00
GPT-4o-mini   | $0.15                 | $0.60
GPT-4-turbo   | $10.00                | $30.00
GPT-3.5-turbo | $0.50                 | $1.50
o1-preview    | $15.00                | $60.00
o1-mini       | $3.00                 | $12.00

AxonFlow provides cost estimation via the /api/cost/estimate endpoint.
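
The request schema for that endpoint isn't covered here; a hypothetical call, with the port and field names assumed from the proxy examples above, might look like:

# Hypothetical: field names are illustrative, not a documented schema
curl -X POST http://localhost:8080/api/cost/estimate \
  -H "Content-Type: application/json" \
  -d '{"provider": "openai", "model": "gpt-4o", "max_tokens": 500}'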

Multi-Provider Routing

Configure OpenAI alongside other providers for intelligent routing:

llm_providers:
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 100

  anthropic:
    enabled: true
    config:
      model: claude-3-5-sonnet-20241022
    credentials:
      api_key: ${ANTHROPIC_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
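
With this configuration, the priority strategy routes each request to the highest-priority healthy provider (OpenAI at 100), and fallback_enabled lets requests fail over to Anthropic (priority 50) if OpenAI is unavailable.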

Health Checks

Check the OpenAI provider health status:

# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/openai/health

Response:

{
  "provider": "openai",
  "status": "healthy",
  "latency_ms": 45,
  "model": "gpt-4o"
}

To check all configured providers at once:

curl http://localhost:8081/api/v1/llm-providers/status

Response:

{
  "providers": {
    "openai": {
      "status": "healthy",
      "latency_ms": 45,
      "model": "gpt-4o"
    },
    "anthropic": {
      "status": "healthy",
      "latency_ms": 52,
      "model": "claude-3-5-sonnet-20241022"
    }
  }
}

Error Handling

Common error codes from OpenAI:

Status | Reason              | Action
401    | Invalid API key     | Verify OPENAI_API_KEY
429    | Rate limit exceeded | Implement backoff/retry
500    | Server error        | Retry with exponential backoff
503    | Service unavailable | Retry or failover to another provider

AxonFlow automatically handles retries for transient errors (429, 500, 503).

Best Practices

  1. Use appropriate models - GPT-4o for quality, GPT-4o-mini for cost
  2. Set reasonable timeouts - 120s default is good for most use cases
  3. Enable fallback providers - Configure Anthropic/Gemini as backup
  4. Monitor costs - Use AxonFlow's cost dashboard to track usage
  5. Handle rate limits - Implement client-side retry logic for high-volume apps (see the sketch below)
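
A minimal backoff sketch for proxy-mode calls (the fetch-based helper and retry parameters are illustrative, not part of the AxonFlow SDK):

// Retry an HTTP call with exponential backoff on 429/500/503 responses.
// Endpoint and headers mirror the proxy-mode cURL example above.
async function withBackoff(body: object, maxRetries = 5): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch('http://localhost:8080/api/request', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'X-User-Token': 'user-123' },
      body: JSON.stringify(body)
    });
    if (![429, 500, 503].includes(res.status) || attempt >= maxRetries) {
      return res;
    }
    // Wait 1s, 2s, 4s, ... between attempts
    await new Promise(resolve => setTimeout(resolve, 2 ** attempt * 1000));
  }
}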

Troubleshooting

"Invalid API key"

  • Verify the key at the OpenAI Platform (https://platform.openai.com/api-keys)
  • Ensure the key hasn't been revoked
  • Check for leading/trailing whitespace

"Model not found"

  • Verify model name (e.g., gpt-4o, not gpt-4-o)
  • Check if your account has access to the model
  • Some models require specific API access

"Rate limit exceeded"

  • Check usage at the OpenAI usage dashboard (https://platform.openai.com/usage)
  • Consider upgrading your plan
  • Implement exponential backoff

Next Steps