
Azure OpenAI Setup

AxonFlow supports Azure OpenAI Service for LLM routing and orchestration. The Azure OpenAI provider is available in the Community edition.

Prerequisites

  • Azure account with Azure OpenAI access
  • A deployed model in Azure OpenAI (e.g., gpt-4o-mini)
  • Endpoint URL and API key from Azure Portal

Quick Start

1. Deploy a Model in Azure

  1. Go to Azure Portal
  2. Create an Azure OpenAI resource (or use Azure AI Foundry)
  3. Deploy a model (e.g., gpt-4o-mini)
  4. Copy your endpoint URL and API key

2. Configure AxonFlow

```bash
# Required
export AZURE_OPENAI_ENDPOINT=https://your-resource.cognitiveservices.azure.com
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini

# Optional
export AZURE_OPENAI_API_VERSION=2024-08-01-preview
```

3. Start AxonFlow

```bash
docker compose up -d
```

Authentication Patterns

Azure OpenAI supports two authentication patterns. AxonFlow auto-detects which to use based on your endpoint URL:

| Endpoint Pattern | Auth Method | Header |
|---|---|---|
| `*.cognitiveservices.azure.com` | Bearer Token | `Authorization: Bearer <key>` |
| `*.openai.azure.com` | API Key | `api-key: <key>` |
Azure AI Foundry

Microsoft is transitioning to Azure AI Foundry. New Azure OpenAI resources typically use the cognitiveservices.azure.com endpoint pattern (Bearer token auth).
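
As an illustration, the detection rule amounts to a substring check on the endpoint host. A minimal Python sketch of the idea (illustrative only, not AxonFlow's actual implementation):

```python
def azure_auth_headers(endpoint: str, api_key: str) -> dict[str, str]:
    """Choose the Azure OpenAI auth header from the endpoint pattern."""
    if "cognitiveservices.azure.com" in endpoint.lower():
        # Foundry-style endpoint: Bearer token auth
        return {"Authorization": f"Bearer {api_key}"}
    # Classic *.openai.azure.com endpoint: api-key header
    return {"api-key": api_key}
```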

Configuration Options

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | Yes | - | Azure OpenAI endpoint URL |
| `AZURE_OPENAI_API_KEY` | Yes | - | API key or bearer token |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Yes | - | Model deployment name |
| `AZURE_OPENAI_API_VERSION` | No | 2024-08-01-preview | API version |
| `AZURE_OPENAI_TIMEOUT_SECONDS` | No | 120 | Request timeout in seconds |
Deployment Name vs Model Name

In Azure OpenAI, the "model" is your deployment name, which can be any string you choose when deploying. Always use your deployment name (e.g., my-gpt4o-deployment), not the underlying model name.
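
The distinction matters because the deployment name is embedded in the request path. A hypothetical example (the deployment name below is invented for illustration):

```python
# The deployment name (your choice at deploy time) goes in the URL;
# the underlying model name ("gpt-4o-mini") never does.
deployment = "my-gpt4o-deployment"  # hypothetical deployment name
url = (
    "https://your-resource.cognitiveservices.azure.com"
    f"/openai/deployments/{deployment}/chat/completions"
    "?api-version=2024-08-01-preview"
)
```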

YAML Configuration

For more control, use YAML configuration:

```yaml
# axonflow.yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
      api_version: 2024-08-01-preview
      timeout: 120s
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 10
```

Supported Models

Azure OpenAI supports various OpenAI models. Common deployments include:

| Model | Context Window | Best For |
|---|---|---|
| gpt-4o | 128K tokens | Latest flagship model |
| gpt-4o-mini | 128K tokens | Cost-effective, fast |
| gpt-4-turbo | 128K tokens | Previous flagship with vision |
| gpt-4 | 8K tokens | Standard GPT-4 |

Capabilities

The Azure OpenAI provider supports:

  • Chat completions - Conversational AI
  • Streaming responses - Real-time token streaming (see the sketch after this list)
  • Function calling - Tool use and structured output
  • Vision - Image understanding (GPT-4o, GPT-4-turbo)
  • JSON mode - Structured output
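
As a rough sketch of streaming, here is how the raw Azure endpoint can be consumed as server-sent events (Python with `requests`; Bearer auth assumed for a Foundry endpoint; how streaming surfaces through AxonFlow's proxy depends on your SDK):

```python
import json
import os

import requests

endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
deployment = os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"]
url = (f"{endpoint}/openai/deployments/{deployment}"
       f"/chat/completions?api-version=2024-08-01-preview")

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {os.environ['AZURE_OPENAI_API_KEY']}",
             "Content-Type": "application/json"},
    json={"messages": [{"role": "user", "content": "Hello"}],
          "stream": True},  # ask Azure for server-sent events
    stream=True,
)
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":  # end-of-stream sentinel
        break
    chunk = json.loads(payload)
    if not chunk.get("choices"):
        continue  # some Azure chunks carry only metadata
    delta = chunk["choices"][0].get("delta", {}).get("content")
    if delta:
        print(delta, end="", flush=True)
```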

Usage Examples

Proxy Mode (Python SDK)

Proxy mode routes requests through AxonFlow for simple integration:

```python
import asyncio

from axonflow import AxonFlow

async def main():
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain Azure OpenAI",
            request_type="chat",
            context={"provider": "azure-openai"}
        )
        print(response.content)

asyncio.run(main())
```

Proxy Mode (cURL)

```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is Azure OpenAI?",
    "provider": "azure-openai",
    "max_tokens": 500
  }'
```

Gateway Mode (TypeScript SDK)

Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:

```typescript
import { AxonFlow } from '@axonflow/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain Azure OpenAI'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Azure OpenAI directly
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const apiKey = process.env.AZURE_OPENAI_API_KEY!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

const headers: Record<string, string> = {
  'Content-Type': 'application/json'
};

// Auto-detect auth type from the endpoint pattern
if (endpoint.includes('cognitiveservices.azure.com')) {
  headers['Authorization'] = `Bearer ${apiKey}`;
} else {
  headers['api-key'] = apiKey;
}

const azureResponse = await fetch(
  `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
  {
    method: 'POST',
    headers,
    body: JSON.stringify({
      messages: [{ role: 'user', content: ctx.approvedData.query }],
      max_tokens: 500
    })
  }
);

const completion = await azureResponse.json();
const response = completion.choices[0].message.content;

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'azure-openai',
  model: deployment,
  tokenUsage: {
    promptTokens: completion.usage.prompt_tokens,
    completionTokens: completion.usage.completion_tokens,
    totalTokens: completion.usage.total_tokens
  },
  latencyMs: 250
});
```

Gateway Mode (Go SDK)

```go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "strings"

    axonflow "github.com/getaxonflow/axonflow-sdk-go"
)

func main() {
    client := axonflow.NewClient(axonflow.Config{
        Endpoint: "http://localhost:8080",
    })
    defer client.Close()

    // 1. Pre-check
    ctx, err := client.GetPolicyApprovedContext(axonflow.PreCheckRequest{
        UserToken: "user-123",
        Query:     "Explain Azure OpenAI",
    })
    if err != nil {
        panic(err)
    }

    if ctx.Blocked {
        fmt.Printf("Blocked: %s\n", ctx.BlockReason)
        return
    }

    // 2. Call Azure OpenAI directly
    endpoint := os.Getenv("AZURE_OPENAI_ENDPOINT")
    apiKey := os.Getenv("AZURE_OPENAI_API_KEY")
    deployment := os.Getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

    azureURL := fmt.Sprintf("%s/openai/deployments/%s/chat/completions?api-version=2024-08-01-preview",
        endpoint, deployment)

    body, _ := json.Marshal(map[string]interface{}{
        "messages":   []map[string]string{{"role": "user", "content": ctx.ApprovedQuery}},
        "max_tokens": 500,
    })

    req, _ := http.NewRequest("POST", azureURL, bytes.NewReader(body))
    req.Header.Set("Content-Type", "application/json")

    // Auto-detect auth type
    if strings.Contains(strings.ToLower(endpoint), "cognitiveservices.azure.com") {
        req.Header.Set("Authorization", "Bearer "+apiKey)
    } else {
        req.Header.Set("api-key", apiKey)
    }

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // 3. Audit the call
    client.AuditLLMCall(axonflow.AuditRequest{
        ContextID:       ctx.ContextID,
        ResponseSummary: "Azure OpenAI response",
        Provider:        "azure-openai",
        Model:           deployment,
    })
}
```

Health Checks

Check the Azure OpenAI provider health status:

```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/azure-openai/health
```

Response:

```json
{
  "provider": "azure-openai",
  "status": "healthy",
  "latency_ms": 125,
  "model": "gpt-4o-mini"
}
```

Error Handling

Common error codes from Azure OpenAI:

| Status | Reason | Action |
|---|---|---|
| 401 | Invalid API key | Verify credentials in Azure Portal |
| 403 | Access denied | Check RBAC permissions |
| 404 | Deployment not found | Verify `AZURE_OPENAI_DEPLOYMENT_NAME` |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |

AxonFlow automatically handles retries for transient errors (429, 500).
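
In Gateway Mode, where your code calls Azure directly, the retry loop is yours to implement. A minimal exponential-backoff sketch in Python (illustrative; tune the status codes and limits to your needs):

```python
import random
import time

import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    """Retry transient Azure errors (429 and 5xx) with exponential backoff."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=120)
        if resp.status_code not in (429, 500, 502, 503):
            return resp
        # Honor Retry-After when Azure provides it, else back off exponentially
        delay = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay + random.uniform(0, 0.5))  # jitter avoids thundering herd
    raise RuntimeError(f"Gave up after {max_retries} retries ({resp.status_code})")
```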

Troubleshooting

404 DeploymentNotFound

  • Verify AZURE_OPENAI_DEPLOYMENT_NAME matches your Azure deployment exactly
  • Check the deployment status in Azure Portal (must be "Succeeded")
  • Ensure you're using the deployment name, not the model name

401 Unauthorized

  • For Foundry endpoints (*.cognitiveservices.azure.com): Verify Bearer token
  • For Classic endpoints (*.openai.azure.com): Check API key in Azure Portal

No Quota Available

Azure OpenAI requires quota allocation:

  1. Check your subscription type (some free tiers have no quota)
  2. Request quota increase in Azure Portal > Quotas
  3. Consider using Azure AI Foundry which may have different quota policies

Wrong Authentication Header

If you get auth errors, verify AxonFlow is detecting the correct pattern:

  • Foundry (cognitiveservices.azure.com) uses Authorization: Bearer
  • Classic (openai.azure.com) uses api-key header

Best Practices

  1. Use Foundry endpoints - Microsoft is transitioning to Azure AI Foundry
  2. Set reasonable timeouts - 120s default handles most requests
  3. Enable fallback providers - Configure OpenAI/Anthropic as backup
  4. Monitor costs - Use AxonFlow's cost dashboard to track usage
  5. Handle rate limits - Azure has per-deployment rate limits

Multi-Provider Routing

Configure Azure OpenAI alongside other providers for intelligent routing:

```yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 100

  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
```

Community Examples

Complete code examples are available in the repository.

Next Steps