# Azure OpenAI Setup
AxonFlow supports Azure OpenAI Service for LLM routing and orchestration. Azure OpenAI is available in the Community edition.
## Prerequisites
- Azure account with Azure OpenAI access
- A deployed model in Azure OpenAI (e.g., `gpt-4o-mini`)
- Endpoint URL and API key from Azure Portal
## Quick Start
### 1. Deploy a Model in Azure
- Go to Azure Portal
- Create an Azure OpenAI resource (or use Azure AI Foundry)
- Deploy a model (e.g., `gpt-4o-mini`)
- Copy your endpoint URL and API key
### 2. Configure AxonFlow
```bash
# Required
export AZURE_OPENAI_ENDPOINT=https://your-resource.cognitiveservices.azure.com
export AZURE_OPENAI_API_KEY=your-api-key
export AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o-mini

# Optional
export AZURE_OPENAI_API_VERSION=2024-08-01-preview
```
### 3. Start AxonFlow

```bash
docker compose up -d
```
## Authentication Patterns
Azure OpenAI supports two authentication patterns. AxonFlow auto-detects which to use based on your endpoint URL:
| Endpoint Pattern | Auth Method | Header |
|---|---|---|
| `*.cognitiveservices.azure.com` | Bearer Token | `Authorization: Bearer <key>` |
| `*.openai.azure.com` | API Key | `api-key: <key>` |
Microsoft is transitioning to Azure AI Foundry. New Azure OpenAI resources typically use the `cognitiveservices.azure.com` endpoint pattern (Bearer token auth).
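In code, the detection reduces to a single substring check on the endpoint. Here is a minimal TypeScript sketch of that rule; the `azureAuthHeaders` helper is illustrative, not part of the AxonFlow SDK:

```typescript
// Sketch of the endpoint-based auth detection described above.
// The helper name is illustrative, not an AxonFlow SDK function.
function azureAuthHeaders(endpoint: string, key: string): Record<string, string> {
  // Foundry-style endpoints expect a Bearer token...
  if (endpoint.toLowerCase().includes('cognitiveservices.azure.com')) {
    return { Authorization: `Bearer ${key}` };
  }
  // ...while classic endpoints expect the api-key header.
  return { 'api-key': key };
}

// azureAuthHeaders('https://my-res.cognitiveservices.azure.com', 'abc123')
// => { Authorization: 'Bearer abc123' }
```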
## Configuration Options
### Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | Yes | - | Azure OpenAI endpoint URL |
| `AZURE_OPENAI_API_KEY` | Yes | - | API key or bearer token |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Yes | - | Model deployment name |
| `AZURE_OPENAI_API_VERSION` | No | `2024-08-01-preview` | API version |
| `AZURE_OPENAI_TIMEOUT_SECONDS` | No | `120` | Request timeout in seconds |
In Azure OpenAI, the "model" is your deployment name, which can be any string you choose when deploying. Always use your deployment name (e.g., `my-gpt4o-deployment`), not the underlying model name.
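The distinction matters because the deployment name is a path segment in every request URL, so a mismatch fails with a 404 rather than a model error. A short sketch with hypothetical values:

```typescript
// The deployment name, not the model name, goes into the request path.
// Both values below are hypothetical.
const endpoint = 'https://your-resource.cognitiveservices.azure.com';
const deployment = 'my-gpt4o-deployment'; // whatever you named the deployment

const url =
  `${endpoint}/openai/deployments/${deployment}/chat/completions` +
  '?api-version=2024-08-01-preview';

// Substituting the underlying model name ("gpt-4o") for the deployment name
// yields 404 DeploymentNotFound unless a deployment happens to use that name.
```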
### YAML Configuration
For more control, use YAML configuration:
```yaml
# axonflow.yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
      api_version: 2024-08-01-preview
      timeout: 120s
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 10
```
## Supported Models
Azure OpenAI supports various OpenAI models. Common deployments include:
| Model | Context Window | Best For |
|---|---|---|
| `gpt-4o` | 128K tokens | Latest flagship model |
| `gpt-4o-mini` | 128K tokens | Cost-effective, fast |
| `gpt-4-turbo` | 128K tokens | Previous flagship with vision |
| `gpt-4` | 8K tokens | Standard GPT-4 |
## Capabilities
The Azure OpenAI provider supports:
- **Chat completions** - Conversational AI
- **Streaming responses** - Real-time token streaming
- **Function calling** - Tool use and structured output
- **Vision** - Image understanding (GPT-4o, GPT-4-turbo)
- **JSON mode** - Structured output (see the sketch after this list)
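As an illustration of JSON mode, here is a sketch that calls an Azure deployment directly over REST. `response_format: { type: "json_object" }` is the standard OpenAI chat-completions parameter; a Foundry-style endpoint (Bearer auth) and the Quick Start env vars are assumptions:

```typescript
// Sketch: JSON mode against an Azure deployment, called directly over REST.
// Assumes a Foundry-style endpoint (Bearer auth) and the Quick Start env vars.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

const res = await fetch(
  `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.AZURE_OPENAI_API_KEY}`
    },
    body: JSON.stringify({
      // JSON mode constrains the model to emit a single JSON object.
      response_format: { type: 'json_object' },
      messages: [
        // JSON mode requires the word "JSON" to appear in the prompt.
        { role: 'system', content: 'Reply in JSON.' },
        { role: 'user', content: 'List three Azure regions as {"regions": [...]}.' }
      ],
      max_tokens: 200
    })
  }
);
const data = await res.json();
console.log(JSON.parse(data.choices[0].message.content));
```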
## Usage Examples
### Proxy Mode (Python SDK)
Proxy mode routes requests through AxonFlow for simple integration:
```python
import asyncio

from axonflow import AxonFlow


async def main() -> None:
    # Route the query through the local AxonFlow agent
    async with AxonFlow(agent_url="http://localhost:8080") as client:
        response = await client.execute_query(
            user_token="user-123",
            query="Explain Azure OpenAI",
            request_type="chat",
            context={"provider": "azure-openai"},
        )
        print(response.content)


asyncio.run(main())
```
### Proxy Mode (cURL)
```bash
curl -X POST http://localhost:8080/api/request \
  -H "Content-Type: application/json" \
  -H "X-User-Token: user-123" \
  -d '{
    "query": "What is Azure OpenAI?",
    "provider": "azure-openai",
    "max_tokens": 500
  }'
```
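The same proxy-mode request works from TypeScript with plain `fetch`, mirroring the cURL call above:

```typescript
// Proxy-mode request via plain fetch; mirrors the cURL example above.
const res = await fetch('http://localhost:8080/api/request', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-User-Token': 'user-123'
  },
  body: JSON.stringify({
    query: 'What is Azure OpenAI?',
    provider: 'azure-openai',
    max_tokens: 500
  })
});
console.log(await res.json());
```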
### Gateway Mode (TypeScript SDK)
Gateway mode gives you full control over the LLM call while AxonFlow handles policy enforcement and audit logging:
```typescript
import { AxonFlow } from '@axonflow/sdk';

const axonflow = new AxonFlow({
  endpoint: 'http://localhost:8080',
  apiKey: 'your-axonflow-key'
});

// 1. Pre-check: Get policy approval
const ctx = await axonflow.getPolicyApprovedContext({
  userToken: 'user-123',
  query: 'Explain Azure OpenAI'
});

if (!ctx.approved) {
  throw new Error(`Blocked: ${ctx.blockReason}`);
}

// 2. Call Azure OpenAI directly
// Non-null assertions assume the env vars from step 2 of the Quick Start are set.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const apiKey = process.env.AZURE_OPENAI_API_KEY!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

const headers: Record<string, string> = {
  'Content-Type': 'application/json'
};

// Auto-detect auth type (see Authentication Patterns above)
if (endpoint.includes('cognitiveservices.azure.com')) {
  headers['Authorization'] = `Bearer ${apiKey}`;
} else {
  headers['api-key'] = apiKey;
}

const started = Date.now(); // measure real latency for the audit record
const azureResponse = await fetch(
  `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
  {
    method: 'POST',
    headers,
    body: JSON.stringify({
      messages: [{ role: 'user', content: ctx.approvedData.query }],
      max_tokens: 500
    })
  }
);
const completion = await azureResponse.json();
const response = completion.choices[0].message.content;

// 3. Audit the call
await axonflow.auditLLMCall({
  contextId: ctx.contextId,
  responseSummary: response.substring(0, 100),
  provider: 'azure-openai',
  model: deployment,
  tokenUsage: {
    promptTokens: completion.usage.prompt_tokens,
    completionTokens: completion.usage.completion_tokens,
    totalTokens: completion.usage.total_tokens
  },
  latencyMs: Date.now() - started
});
```
### Gateway Mode (Go SDK)
```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"

	axonflow "github.com/getaxonflow/axonflow-sdk-go"
)

func main() {
	client := axonflow.NewClient(axonflow.Config{
		Endpoint: "http://localhost:8080",
	})
	defer client.Close()

	// 1. Pre-check: get policy approval before touching the LLM
	ctx, err := client.GetPolicyApprovedContext(axonflow.PreCheckRequest{
		UserToken: "user-123",
		Query:     "Explain Azure OpenAI",
	})
	if err != nil {
		panic(err)
	}
	if ctx.Blocked {
		fmt.Printf("Blocked: %s\n", ctx.BlockReason)
		return
	}

	// 2. Call Azure OpenAI directly
	endpoint := os.Getenv("AZURE_OPENAI_ENDPOINT")
	apiKey := os.Getenv("AZURE_OPENAI_API_KEY")
	deployment := os.Getenv("AZURE_OPENAI_DEPLOYMENT_NAME")

	azureURL := fmt.Sprintf("%s/openai/deployments/%s/chat/completions?api-version=2024-08-01-preview",
		endpoint, deployment)

	body, _ := json.Marshal(map[string]interface{}{
		"messages":   []map[string]string{{"role": "user", "content": ctx.ApprovedQuery}},
		"max_tokens": 500,
	})

	req, _ := http.NewRequest("POST", azureURL, bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")

	// Auto-detect auth type (see Authentication Patterns above)
	if strings.Contains(strings.ToLower(endpoint), "cognitiveservices.azure.com") {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	} else {
		req.Header.Set("api-key", apiKey)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	// Response parsing is omitted for brevity; see the TypeScript example above.

	// 3. Audit the call
	client.AuditLLMCall(axonflow.AuditRequest{
		ContextID:       ctx.ContextID,
		ResponseSummary: "Azure OpenAI response",
		Provider:        "azure-openai",
		Model:           deployment,
	})
}
```
## Health Checks
Check the Azure OpenAI provider health status:
```bash
# Check specific provider health
curl http://localhost:8081/api/v1/llm-providers/azure-openai/health
```
Response:

```json
{
  "provider": "azure-openai",
  "status": "healthy",
  "latency_ms": 125,
  "model": "gpt-4o-mini"
}
```
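A sketch of a periodic poll built on this endpoint; only the URL and response shape come from the example above, the polling logic itself is illustrative:

```typescript
// Sketch: poll the provider health endpoint and log degradation.
// URL and response fields are from the example above; the schedule is ours.
async function checkAzureProviderHealth(): Promise<void> {
  const res = await fetch(
    'http://localhost:8081/api/v1/llm-providers/azure-openai/health'
  );
  const health = await res.json();
  if (health.status !== 'healthy') {
    console.error(`azure-openai degraded: ${JSON.stringify(health)}`);
  } else {
    console.log(`azure-openai healthy (${health.latency_ms} ms)`);
  }
}

setInterval(checkAzureProviderHealth, 60_000); // once a minute
```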
## Error Handling
Common error codes from Azure OpenAI:
| Status | Reason | Action |
|---|---|---|
| 401 | Invalid API key | Verify credentials in Azure Portal |
| 403 | Access denied | Check RBAC permissions |
| 404 | Deployment not found | Verify `AZURE_OPENAI_DEPLOYMENT_NAME` |
| 429 | Rate limit exceeded | Implement backoff/retry |
| 500 | Server error | Retry with exponential backoff |
AxonFlow automatically handles retries for transient errors (429, 500).
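Proxy-mode traffic gets those retries automatically; in gateway mode you call Azure yourself, so you need your own retry policy. A minimal exponential-backoff sketch (the attempt count and delays are illustrative, not AxonFlow defaults):

```typescript
// Minimal exponential backoff for direct (gateway-mode) Azure calls.
// Retries 429/5xx and honors Retry-After when Azure sends it.
async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxAttempts = 5
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await fetch(url, init);
    const retryable = res.status === 429 || res.status >= 500;
    if (!retryable || attempt === maxAttempts) {
      return res;
    }
    // Prefer Azure's Retry-After header; otherwise back off exponentially.
    const retryAfter = Number(res.headers.get('retry-after'));
    const delayMs = retryAfter > 0 ? retryAfter * 1000 : 2 ** attempt * 250;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```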
## Troubleshooting
### 404 DeploymentNotFound

- Verify `AZURE_OPENAI_DEPLOYMENT_NAME` matches your Azure deployment exactly
- Check the deployment status in Azure Portal (must be "Succeeded")
- Ensure you're using the deployment name, not the model name
### 401 Unauthorized

- For Foundry endpoints (`*.cognitiveservices.azure.com`): verify the Bearer token
- For Classic endpoints (`*.openai.azure.com`): check the API key in Azure Portal
### No Quota Available
Azure OpenAI requires quota allocation:
- Check your subscription type (some free tiers have no quota)
- Request a quota increase in Azure Portal > Quotas
- Consider Azure AI Foundry, which may have different quota policies
### Wrong Authentication Header

If you get auth errors, verify AxonFlow is detecting the correct pattern:

- Foundry (`cognitiveservices.azure.com`) uses `Authorization: Bearer`
- Classic (`openai.azure.com`) uses the `api-key` header
## Best Practices

- **Use Foundry endpoints** - Microsoft is transitioning to Azure AI Foundry
- **Set reasonable timeouts** - The 120s default handles most requests (see the sketch after this list)
- **Enable fallback providers** - Configure OpenAI/Anthropic as backup
- **Monitor costs** - Use AxonFlow's cost dashboard to track usage
- **Handle rate limits** - Azure enforces per-deployment rate limits
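For the timeout practice in gateway mode, a direct Azure call can be bounded client-side with `AbortSignal.timeout`, matching the 120s default. A sketch, assuming Node 17.3+ and a Foundry-style endpoint:

```typescript
// Sketch: bound a direct Azure call to 120s, matching AxonFlow's default.
// Assumes Node 17.3+ (AbortSignal.timeout) and the Quick Start env vars.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT!;
const deployment = process.env.AZURE_OPENAI_DEPLOYMENT_NAME!;

try {
  const res = await fetch(
    `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=2024-08-01-preview`,
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.AZURE_OPENAI_API_KEY}`
      },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'ping' }],
        max_tokens: 10
      }),
      signal: AbortSignal.timeout(120_000) // abort after 120 seconds
    }
  );
  console.log((await res.json()).choices[0].message.content);
} catch (err) {
  // AbortSignal.timeout raises a TimeoutError DOMException on expiry.
  console.error('Azure call timed out or failed:', err);
}
```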
## Multi-Provider Routing
Configure Azure OpenAI alongside other providers for intelligent routing:
```yaml
llm_providers:
  azure-openai:
    enabled: true
    config:
      deployment_name: gpt-4o-mini
    credentials:
      endpoint: ${AZURE_OPENAI_ENDPOINT}
      api_key: ${AZURE_OPENAI_API_KEY}
    priority: 100
  openai:
    enabled: true
    config:
      model: gpt-4o
    credentials:
      api_key: ${OPENAI_API_KEY}
    priority: 50

routing:
  strategy: priority
  fallback_enabled: true
```
## Community Examples

Complete code examples are available in the repository.
## Next Steps

- **LLM Providers Overview** - All supported providers
- **OpenAI Setup** - Standard OpenAI API
- **Provider Routing** - Multi-provider configuration
- **Custom Provider SDK** - Build custom providers