# Azure OpenAI Setup
AxonFlow supports Azure OpenAI Service for LLM routing and orchestration. Azure OpenAI is available in the Community edition.
## Prerequisites
- Azure account with Azure OpenAI access
- A deployed model in Azure OpenAI (e.g., `gpt-4o-mini`)
- Endpoint URL and API key from Azure Portal
## Quick Start
### 1. Deploy a Model in Azure
- Go to Azure Portal
- Create an Azure OpenAI resource (or use Azure AI Foundry)
- Deploy a model (e.g., `gpt-4o-mini`)
- Copy your endpoint URL and API key
### 2. Configure AxonFlow
```bash
# Required
export AZURE_OPENAI_ENDPOINT=https://your-resource.cognitiveservices.azure.com
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini

# Optional
export AZURE_OPENAI_API_VERSION=2024-08-01-preview
```
### 3. Start AxonFlow

```bash
docker compose up -d
```
## Authentication Patterns
Azure OpenAI supports two authentication patterns. AxonFlow auto-detects which to use based on your endpoint URL:
| Endpoint Pattern | Auth Method | Header |
|---|---|---|
| `*.cognitiveservices.azure.com` | Bearer Token | `Authorization: Bearer <key>` |
| `*.openai.azure.com` | API Key | `api-key: <key>` |
Microsoft is transitioning to Azure AI Foundry. New Azure OpenAI resources typically use the `cognitiveservices.azure.com` endpoint pattern (Bearer token auth).
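In code, the detection reduces to a single substring check on the endpoint. Here is a minimal TypeScript sketch of that rule; the `azureAuthHeaders` helper is illustrative, not part of the AxonFlow SDK:

```typescript
// Sketch of the endpoint-based auth detection described above.
// The helper name is illustrative, not an AxonFlow SDK function.
function azureAuthHeaders(endpoint: string, key: string): Record<string, string> {
  // Foundry-style endpoints expect a Bearer token...
  if (endpoint.toLowerCase().includes('cognitiveservices.azure.com')) {
    return { Authorization: `Bearer ${key}` };
  }
  // ...while classic endpoints expect the api-key header.
  return { 'api-key': key };
}

// azureAuthHeaders('https://my-res.cognitiveservices.azure.com', 'abc123')
// => { Authorization: 'Bearer abc123' }
```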
## Configuration Options
### Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | Yes | - | Azure OpenAI endpoint URL |
| `AZURE_OPENAI_API_KEY` | Yes | - | API key or bearer token |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Yes | - | Model deployment name |
| `AZURE_OPENAI_API_VERSION` | No | `2024-08-01-preview` | API version |
| `AZURE_OPENAI_TIMEOUT_SECONDS` | No | `120` | Request timeout in seconds |
In Azure OpenAI, the "model" is your deployment name, which can be any string you choose when deploying. Always use your deployment name (e.g., `my-gpt4o-deployment`), not the underlying model name.
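The distinction matters because the deployment name is a path segment in every request URL, so a mismatch fails with a 404 rather than a model error. A short sketch with hypothetical values:

```typescript
// The deployment name, not the model name, goes into the request path.
// Both values below are hypothetical.
const endpoint = 'https://your-resource.cognitiveservices.azure.com';
const deployment = 'my-gpt4o-deployment'; // whatever you named the deployment

const url =
  `${endpoint}/openai/deployments/${deployment}/chat/completions` +
  '?api-version=2024-08-01-preview';

// Substituting the underlying model name ("gpt-4o") for the deployment name
// yields 404 DeploymentNotFound unless a deployment happens to use that name.
```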
### YAML Configuration
For more control, use YAML configuration:
```yaml
# axonflow.yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
      api_version: 2024-08-01-preview
      timeout: 120s
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 10
```
## Supported Models
Azure OpenAI supports various OpenAI models. Common deployments include:
| Model | Context Window | Best For |
|---|---|---|
| `gpt-4o` | 128K tokens | Latest flagship model |
| `gpt-4o-mini` | 128K tokens | Cost-effective, fast |
| `gpt-4-turbo` | 128K tokens | Previous flagship with vision |
| `gpt-4` | 8K tokens | Standard GPT-4 |
## Capabilities
The Azure OpenAI provider supports:
- **Chat completions** - Conversational AI
- **Streaming responses** - Real-time token streaming
- **Function calling** - Tool use and structured output
- **Vision** - Image understanding (GPT-4o, GPT-4-turbo)
- **JSON mode** - Structured output (see the sketch after this list)
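As an illustration of JSON mode, here is a sketch that calls an Azure deployment directly over REST. `response_format: { type: "json_object" }` is the standard OpenAI chat-completions parameter; a Foundry-style endpoint (Bearer auth) and the Quick Start env vars are assumptions:

```typescript
// Sketch: JSON mode against an Azure deployment, called directly over REST.
// Assumes a Foundry-style endpoint (Bearer auth) and the Quick Start env vars.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

const res = await fetch(
  `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.AZURE_OPENAI_API_KEY}`
    },
    body: JSON.stringify({
      // JSON mode constrains the model to emit a single JSON object.
      response_format: { type: 'json_object' },
      messages: [
        // JSON mode requires the word "JSON" to appear in the prompt.
        { role: 'system', content: 'Reply in JSON.' },
        { role: 'user', content: 'List three Azure regions as {"regions": [...]}.' }
      ],
      max_tokens: 200
    })
  }
);
const data = await res.json();
console.log(JSON.parse(data.choices[0].message.content));
```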
## Usage Examples
### Proxy Mode (Python SDK)
Proxy mode routes requests through AxonFlow for simple integration:
```python
import asyncio

from axonflow import AxonFlow


async def main() -> None:
    # Route the query through the local AxonFlow agent
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain Azure OpenAI",
            request_type="chat",
            context={"provider": "azure-openai"},
        )
        print(response.content)


asyncio.run(main())
```
### Proxy Mode (cURL)
```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is Azure OpenAI?",
    "provider": "azure-openai",
    "max_tokens": 500
  }'
```
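The same proxy-mode request works from TypeScript with plain `fetch`, mirroring the cURL call above:

```typescript
// Proxy-mode request via plain fetch; mirrors the cURL example above.
const res = await fetch('http://localhost:8080/api/request', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-User-Token': 'user-123'
  },
  body: JSON.stringify({
    query: 'What is Azure OpenAI?',
    provider: 'azure-openai',
    max_tokens: 500
  })
});
console.log(await res.json());
```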
### Gateway Mode (TypeScript SDK)
Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:
```typescript
import { AxonFlow } from '@axonflow/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain Azure OpenAI'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Azure OpenAI directly
// Non-null assertions assume the env vars from step 2 of the Quick Start are set.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const apiKey = process.env.AZURE_OPENAI_API_KEY!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

const headers: Record<string, string> = {
  'Content-Type': 'application/json'
};

// Auto-detect auth type (see Authentication Patterns above)
if (endpoint.includes('cognitiveservices.azure.com')) {
  headers['Authorization'] = `Bearer ${apiKey}`;
} else {
  headers['api-key'] = apiKey;
}

const started = Date.now(); // measure real latency for the audit record
const azureResponse = await fetch(
  `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
  {
    method: 'POST',
    headers,
    body: JSON.stringify({
      messages: [{ role: 'user', content: ctx.approvedData.query }],
      max_tokens: 500
    })
  }
);
const completion = await azureResponse.json();
const response = completion.choices[0].message.content;

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'azure-openai',
  model: deployment,
  tokenUsage: {
    promptTokens: completion.usage.prompt_tokens,
    completionTokens: completion.usage.completion_tokens,
    totalTokens: completion.usage.total_tokens
  },
  latencyMs: Date.now() - started
});
```
### Gateway Mode (Go SDK)
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"

	axonflow "github.com/getaxonflow/axonflow-sdk-go"
)

func main() {
	client := axonflow.NewClient(axonflow.Config{
		Endpoint: "http://localhost:8080",
	})
	defer client.Close()

	// 1. Pre-check: get policy approval before touching the LLM
	ctx, err := client.GetPolicyApprovedContext(axonflow.PreCheckRequest{
		UserToken: "user-123",
		Query:     "Explain Azure OpenAI",
	})
	if err != nil {
		panic(err)
	}
	if ctx.Blocked {
		fmt.Printf("Blocked: %s\n", ctx.BlockReason)
		return
	}

	// 2. Call Azure OpenAI directly
	endpoint := os.Getenv("AZURE_OPENAI_ENDPOINT")
	apiKey := os.Getenv("AZURE_OPENAI_API_KEY")
	deployment := os.Getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

	azureURL := fmt.Sprintf("%s/openai/deployments/%s/chat/completions?api-version=2024-08-01-preview",
		endpoint, deployment)

	body, _ := json.Marshal(map[string]interface{}{
		"messages":   []map[string]string{{"role": "user", "content": ctx.ApprovedQuery}},
		"max_tokens": 500,
	})

	req, _ := http.NewRequest("POST", azureURL, bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")

	// Auto-detect auth type (see Authentication Patterns above)
	if strings.Contains(strings.ToLower(endpoint), "cognitiveservices.azure.com") {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	} else {
		req.Header.Set("api-key", apiKey)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	// Response parsing is omitted for brevity; see the TypeScript example above.

	// 3. Audit the call
	client.AuditLLMCall(axonflow.AuditRequest{
		ContextID:       ctx.ContextID,
		ResponseSummary: "Azure OpenAI response",
		Provider:        "azure-openai",
		Model:           deployment,
	})
}
```
## Health Checks
Check the Azure OpenAI provider health status:
```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/azure-openai/health
```
Response:

```json
{
  "provider": "azure-openai",
  "status": "healthy",
  "latency_ms": 125,
  "model": "gpt-4o-mini"
}
```
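A sketch of a periodic poll built on this endpoint; only the URL and response shape come from the example above, the polling logic itself is illustrative:

```typescript
// Sketch: poll the provider health endpoint and log degradation.
// URL and response fields are from the example above; the schedule is ours.
async function checkAzureProviderHealth(): Promise<void> {
  const res = await fetch(
    'http://localhost:8081/api/v1/llm-providers/azure-openai/health'
  );
  const health = await res.json();
  if (health.status !== 'healthy') {
    console.error(`azure-openai degraded: ${JSON.stringify(health)}`);
  } else {
    console.log(`azure-openai healthy (${health.latency_ms} ms)`);
  }
}

setInterval(checkAzureProviderHealth, 60_000); // once a minute
```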
## Error Handling
Common error codes from Azure OpenAI:
| Status | Reason | Action |
|---|---|---|
| 401 | Invalid API key | Verify credentials in Azure Portal |
| 403 | Access denied | Check RBAC permissions |
| 404 | Deployment not found | Verify `AZURE_OPENAI_DEPLOYMENT_NAME` |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
AxonFlow automatically handles retries for transient errors (429, 500).
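Proxy-mode traffic gets those retries automatically; in gateway mode you call Azure yourself, so you need your own retry policy. A minimal exponential-backoff sketch (the attempt count and delays are illustrative, not AxonFlow defaults):

```typescript
// Minimal exponential backoff for direct (gateway-mode) Azure calls.
// Retries 429/5xx and honors Retry-After when Azure sends it.
async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxAttempts = 5
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await fetch(url, init);
    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt === maxAttempts) {
      return res;
    }
    // Prefer Azure's Retry-After header; otherwise back off exponentially.
    const retryAfter = Number(res.headers.get('retry-after'));
    const delayMs = retryAfter > 0 ? retryAfter * 1000 : 2 ** attempt * 250;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```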
## Troubleshooting
### 404 DeploymentNotFound

- Verify `AZURE_OPENAI_DEPLOYMENT_NAME` matches your Azure deployment exactly
- Check the deployment status in Azure Portal (must be "Succeeded")
- Ensure you're using the deployment name, not the model name
### 401 Unauthorized

- For Foundry endpoints (`*.cognitiveservices.azure.com`): verify the Bearer token
- For Classic endpoints (`*.openai.azure.com`): check the API key in Azure Portal
### No Quota Available
Azure OpenAI requires quota allocation:
- Check your subscription type (some free tiers have no quota)
- Request a quota increase in Azure Portal > Quotas
- Consider Azure AI Foundry, which may have different quota policies
### Wrong Authentication Header

If you get auth errors, verify AxonFlow is detecting the correct pattern:

- Foundry (`cognitiveservices.azure.com`) uses `Authorization: Bearer`
- Classic (`openai.azure.com`) uses the `api-key` header
## Best Practices

- **Use Foundry endpoints** - Microsoft is transitioning to Azure AI Foundry
- **Set reasonable timeouts** - The 120s default handles most requests (see the sketch after this list)
- **Enable fallback providers** - Configure OpenAI/Anthropic as backup
- **Monitor costs** - Use AxonFlow's cost dashboard to track usage
- **Handle rate limits** - Azure enforces per-deployment rate limits
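For the timeout practice in gateway mode, a direct Azure call can be bounded client-side with `AbortSignal.timeout`, matching the 120s default. A sketch, assuming Node 17.3+ and a Foundry-style endpoint:

```typescript
// Sketch: bound a direct Azure call to 120s, matching AxonFlow's default.
// Assumes Node 17.3+ (AbortSignal.timeout) and the Quick Start env vars.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

try {
  const res = await fetch(
    `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.AZURE_OPENAI_API_KEY}`
      },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'ping' }],
        max_tokens: 10
      }),
      signal: AbortSignal.timeout(120_000) // abort after 120 seconds
    }
  );
  console.log((await res.json()).choices[0].message.content);
} catch (err) {
  // AbortSignal.timeout raises a TimeoutError DOMException on expiry.
  console.error('Azure call timed out or failed:', err);
}
```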
## Multi-Provider Routing
Configure Azure OpenAI alongside other providers for intelligent routing:
```yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 100
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
```
## Community Examples

Complete code examples are available in the repository.
## Next Steps

- **LLM Providers Overview** - All supported providers
- **OpenAI Setup** - Standard OpenAI API
- **Provider Routing** - Multi-provider configuration
- **Custom Provider SDK** - Build custom providers