# LLM Provider Management API
Use this API to register providers, inspect provider health, update routing weights, and control the provider layer behind POST /api/v1/process. These routes are served by the Orchestrator and are commonly reached through the Agent proxy.
## Overview
Verified routes:
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/llm-provider-types | List available provider factories |
| GET | /api/v1/llm-providers | List configured providers |
| POST | /api/v1/llm-providers | Create a provider |
| GET | /api/v1/llm-providers/{name} | Get provider config |
| PUT | /api/v1/llm-providers/{name} | Update provider config |
| DELETE | /api/v1/llm-providers/{name} | Delete provider config |
| GET | /api/v1/llm-providers/routing | Read routing weights |
| PUT | /api/v1/llm-providers/routing | Update routing weights |
| GET | /api/v1/llm-providers/status | Health for all providers |
| GET | /api/v1/llm-providers/{name}/health | Health for one provider |
| POST | /api/v1/llm-providers/{name}/test | Connectivity test |
## Request-Level Routing Controls (Advanced)
For inference requests sent to `/api/v1/process`, provider selection controls are passed in `context`:

- `context.provider` (string): preferred provider (fallback allowed)
- `context.strict_provider` (boolean, optional): hard-pin the provider for that request (no fallback)
Example:

```bash
curl -X POST http://localhost:8080/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize this report",
    "request_type": "chat",
    "context": {
      "provider": "openai",
      "strict_provider": true
    },
    "user": {"email": "[email protected]", "role": "analyst"},
    "client": {"id": "analytics-app", "tenant_id": "tenant-1"}
  }'
```
Base URL: `http://localhost:8080` (Agent). You can also call the same routes directly on the Orchestrator at `http://localhost:8081`.
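As a sketch of the request-level controls above, a client might assemble the payload programmatically. The helper name and defaults here are illustrative, not part of the API; only the field names come from the curl example.

```python
# Hypothetical helper: build a /api/v1/process payload with optional
# request-level provider pinning. Field names mirror the curl example above.
def build_process_request(query, provider=None, strict=False):
    payload = {"query": query, "request_type": "chat", "context": {}}
    if provider:
        payload["context"]["provider"] = provider  # preferred provider, fallback allowed
        if strict:
            # strict_provider hard-pins the provider for this request only
            payload["context"]["strict_provider"] = True
    return payload

req = build_process_request("Summarize this report", provider="openai", strict=True)
```

Omitting `provider` leaves routing entirely to the configured weights.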
## Listing Providers
Verified list query parameters:
| Query param | Notes |
|---|---|
| page | Default 1 |
| page_size | Default 20, max 100 |
| type | Filter by provider type |
| enabled | true or false |
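As an illustration of those parameters, a client could clamp values to the documented bounds before calling the endpoint. The helper itself is hypothetical; only the parameter names, defaults, and the max of 100 come from the table.

```python
from urllib.parse import urlencode

# Illustrative URL builder for GET /api/v1/llm-providers. Clamping mirrors
# the documented bounds: page >= 1, page_size between 1 and 100.
def list_providers_url(base="http://localhost:8080", page=1, page_size=20,
                       type_=None, enabled=None):
    params = {"page": max(1, page), "page_size": min(max(1, page_size), 100)}
    if type_ is not None:
        params["type"] = type_  # filter by provider factory type
    if enabled is not None:
        params["enabled"] = "true" if enabled else "false"
    return f"{base}/api/v1/llm-providers?{urlencode(params)}"
```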
The list response is a `providers` array plus `pagination`. The provider resource shape includes fields such as `name`, `type`, `endpoint`, `model`, `region`, `enabled`, `priority`, `weight`, `rate_limit`, `timeout_seconds`, `has_api_key`, optional `settings`, and optional `health`.
Verified provider resource fields:
| Field | Notes |
|---|---|
| name | Unique provider name |
| type | Provider factory type |
| endpoint | Optional custom endpoint |
| model | Default model |
| region | Region for Bedrock or region-aware providers |
| enabled | Provider is active for routing |
| priority | Failover ordering |
| weight | Weighted routing value |
| rate_limit | Optional rate limit |
| timeout_seconds | Optional provider timeout |
| has_api_key | Credentials are configured, without exposing the secret |
| settings | Provider-specific configuration like Azure deployment name |
| health.status | Current health state when available |
| health.message | Health detail |
| health.last_checked | Timestamp of the latest health check |
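The resource shape above can be modeled client-side roughly as follows. This is an illustrative sketch, not a published SDK type; the server's actual schema is authoritative.

```python
from dataclasses import dataclass, field
from typing import Optional

# Client-side model of the provider resource described in the table above.
# Optional fields default to None; settings defaults to an empty dict.
@dataclass
class Provider:
    name: str
    type: str
    enabled: bool = True
    endpoint: Optional[str] = None
    model: Optional[str] = None
    region: Optional[str] = None
    priority: Optional[int] = None
    weight: Optional[int] = None
    rate_limit: Optional[int] = None
    timeout_seconds: Optional[int] = None
    has_api_key: bool = False
    settings: dict = field(default_factory=dict)
    health: Optional[dict] = None  # {"status", "message", "last_checked"}
```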
Example list response:

```json
{
  "providers": [
    {
      "name": "openai-primary",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 70,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2026-03-29T11:58:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 1,
    "total_pages": 1
  }
}
```
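Given the `pagination` object shown above, a client can walk all pages with a small loop. Here `fetch_page` is a placeholder for the actual HTTP call returning the JSON shape above; this is a sketch, not a shipped client.

```python
# Walk every page of the list response. fetch_page(page=, page_size=) is a
# stand-in for an HTTP GET against /api/v1/llm-providers.
def iter_providers(fetch_page, page_size=20):
    page = 1
    while True:
        body = fetch_page(page=page, page_size=page_size)
        yield from body["providers"]
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1
```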
## Provider Types and Availability
`GET /api/v1/llm-provider-types` tells you which factories the running build can configure. This is the key detail for edition accuracy: some provider types exist only when the build and license support them.
That means teams evaluating production architecture can use this endpoint to verify what the current deployment actually supports, rather than relying on assumptions from older docs or past installs.
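A pre-flight check along these lines makes that verification concrete. The helper is hypothetical; `available` would be populated from the `GET /api/v1/llm-provider-types` response.

```python
# Confirm the running build supports a provider type before attempting to
# create it, failing fast with the supported alternatives.
def ensure_type_supported(desired, available):
    if desired not in available:
        raise ValueError(
            f"provider type {desired!r} not supported by this build; "
            f"available: {sorted(available)}"
        )
    return desired
```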
## Create and Update Rules
Verified create requirements:
- `name` is required
- `type` is required
- Unsupported provider types return `VALIDATION_ERROR`
- Duplicate names return `CONFLICT`
- License-gated provider registration can return `LICENSE_ERROR`
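A client might map those error codes to exceptions roughly as follows. The response envelope with an `error.code` field is an assumption for illustration; only the three codes themselves come from the list above.

```python
# Sketch of handling the documented create-time error codes.
class ProviderCreateError(Exception):
    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code

def raise_for_create_error(body):
    # Assumes an {"error": {"code": ..., "message": ...}} envelope.
    code = body.get("error", {}).get("code")
    if code in {"VALIDATION_ERROR", "CONFLICT", "LICENSE_ERROR"}:
        raise ProviderCreateError(code, body["error"].get("message", ""))
```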
Minimal example:
```bash
curl -X POST http://localhost:8080/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'client-id:client-secret' | base64)" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }'
```
Create and update fields accepted by the handler:
| Field | Create | Update | Notes |
|---|---|---|---|
| name | Yes | No | Resource identifier |
| type | Yes | No | Must match a registered provider factory |
| api_key | Optional | Optional | Not returned in responses |
| api_key_secret_arn | Optional | Optional | Alternate secret source |
| endpoint | Optional | Optional | Custom base URL |
| model | Optional | Optional | Default model |
| region | Optional | Optional | Region-aware providers |
| enabled | Optional | Optional | Route or disable the provider |
| priority | Optional | Optional | Failover order |
| weight | Optional | Optional | Weighted routing |
| rate_limit | Optional | Optional | Per-provider limit |
| timeout_seconds | Optional | Optional | Provider timeout |
| settings | Optional | Optional | Provider-specific settings |
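Because `name` and `type` are create-only in the table above, a client-side update helper can strip them before issuing a PUT. This is a sketch of one defensive pattern; the server may simply reject immutable fields instead.

```python
# Fields the table marks as updatable; name and type are create-only.
MUTABLE_FIELDS = {
    "api_key", "api_key_secret_arn", "endpoint", "model", "region",
    "enabled", "priority", "weight", "rate_limit", "timeout_seconds",
    "settings",
}

def update_payload(changes):
    """Drop create-only fields so a PUT body contains only mutable keys."""
    return {k: v for k, v in changes.items() if k in MUTABLE_FIELDS}
```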
Single-provider reads return:
```json
{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "has_api_key": true,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }
}
```
## Routing Configuration
The routing API is intentionally simple:
```json
{
  "weights": {
    "openai": 70,
    "anthropic": 30
  }
}
```
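To see how those weights translate into a traffic split, here is an illustrative client-side weighted pick. The real selection happens server-side; this only demonstrates the proportions implied by the weights.

```python
import random

# Pick a provider name with probability proportional to its weight.
def pick_provider(weights, rng=random):
    total = sum(weights.values())
    point = rng.uniform(0, total)
    for name, w in weights.items():
        point -= w
        if point <= 0:
            return name
    return name  # guard against float rounding at the upper edge
```

With the example weights, roughly 70% of picks land on `openai` and 30% on `anthropic`.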
Operationally, teams usually combine these routes in three phases:
1. Start with one provider and confirm connectivity with `/api/v1/llm-providers/{name}/test`.
2. Add a second provider and use `/api/v1/llm-providers/status` plus routing weights for failover or distribution.
3. Push request-level overrides via `/api/v1/process` when specific workflows need hard provider pinning for compliance or quality reasons.
That makes it easy to implement provider failover and weighted routing without having to rewrite client applications. In practice, this becomes much more valuable as teams scale from one provider in community deployments to multi-provider and cost-aware routing in evaluation or enterprise programs.
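The failover phase above can be sketched as a client-visible ordering rule, assuming lower `priority` wins among enabled, healthy providers. The tie-breaking and the exact health filter are assumptions for illustration, not documented server behavior.

```python
# Order enabled, healthy providers by ascending priority (lower wins).
# Provider dicts use the field names from the provider resource table.
def failover_order(providers):
    candidates = [
        p for p in providers
        if p.get("enabled") and (p.get("health") or {}).get("status") == "healthy"
    ]
    return sorted(candidates, key=lambda p: p.get("priority", 1 << 30))
```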
