LLM Provider Management API

Manage LLM providers, routing configuration, and provider health through the Agent API.

Overview

The LLM Provider API allows you to:

  • Configure multiple LLM providers (OpenAI, Anthropic, Azure OpenAI, Gemini, Ollama, Bedrock)
  • Define routing strategies (weighted, round-robin, failover)
  • Monitor provider health and performance
  • Test provider connectivity

Request-Level Routing Controls (Advanced)

For inference requests sent to /api/v1/process, provider selection can be controlled per request through fields in the context object:

  • context.provider (string): preferred provider (fallback allowed)
  • context.strict_provider (boolean, optional): hard-pin provider for that request (no fallback)

Example:

curl -X POST http://localhost:8080/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize this report",
    "request_type": "chat",
    "context": {
      "provider": "openai",
      "strict_provider": true
    },
    "user": {"email": "[email protected]", "role": "analyst"},
    "client": {"id": "analytics-app", "tenant_id": "tenant-1"}
  }'
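
The difference between the two controls: with only context.provider set, the agent prefers that provider but may fall back to another; with strict_provider set to true, the request fails instead of falling back when the pinned provider is unavailable. A minimal sketch of both modes, using Python's requests library against the endpoint above (payload values mirror the curl example):

import requests

BASE_URL = "http://localhost:8080"

def process(query: str, provider: str, strict: bool = False) -> dict:
    """Send an inference request, optionally hard-pinned to one provider."""
    payload = {
        "query": query,
        "request_type": "chat",
        "context": {"provider": provider, "strict_provider": strict},
        "user": {"email": "[email protected]", "role": "analyst"},
        "client": {"id": "analytics-app", "tenant_id": "tenant-1"},
    }
    resp = requests.post(f"{BASE_URL}/api/v1/process", json=payload, timeout=60)
    resp.raise_for_status()  # a strict-pinned request errors rather than falling back
    return resp.json()

# Preferred provider, fallback allowed:
process("Summarize this report", provider="openai")
# Hard pin: fails if openai is unavailable.
process("Summarize this report", provider="openai", strict=True)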

Base URL: http://localhost:8080 (Agent)


Endpoints

GET /api/v1/llm-provider-types

List all available LLM provider types that can be configured.

Request:

curl http://localhost:8080/api/v1/llm-provider-types

Response (200 OK):

{
  "provider_types": [
    {
      "type": "openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "anthropic",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "azure-openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "gemini",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "ollama",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "bedrock",
      "community": false,
      "required_tier": "enterprise"
    }
  ],
  "count": 6
}
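
For example, to discover which provider types are usable without an enterprise license, filter the response on the community flag. A sketch using Python's requests library:

import requests

resp = requests.get("http://localhost:8080/api/v1/llm-provider-types")
resp.raise_for_status()
community_types = [
    pt["type"]
    for pt in resp.json()["provider_types"]
    if pt["community"]  # community == true means no enterprise tier required
]
print(community_types)  # ['openai', 'anthropic', 'azure-openai', 'gemini', 'ollama']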

GET /api/v1/llm-providers

List all configured LLM providers.

Request:

curl http://localhost:8080/api/v1/llm-providers

Response (200 OK):

{
  "providers": [
    {
      "name": "openai",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 60,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    },
    {
      "name": "anthropic",
      "type": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "enabled": true,
      "priority": 2,
      "weight": 40,
      "rate_limit": 500,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 2,
    "total_pages": 1
  }
}
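
Because the response includes both enabled and has_api_key, this endpoint supports a quick audit for misconfigured providers. A sketch using Python's requests library:

import requests

resp = requests.get("http://localhost:8080/api/v1/llm-providers")
resp.raise_for_status()
for p in resp.json()["providers"]:
    # has_api_key indicates credentials are configured; keys themselves are never returned
    if p["enabled"] and not p.get("has_api_key", False):
        print(f"warning: {p['name']} is enabled but has no credentials configured")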

POST /api/v1/llm-providers

Create a new LLM provider configuration.

Request:

curl -X POST http://localhost:8080/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "api_key": "your-azure-api-key",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "settings": {
      "api_version": "2024-02-15-preview",
      "deployment_name": "gpt-4o-deployment"
    }
  }'

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Unique provider identifier |
| type | string | Yes | Provider type (openai, anthropic, etc.) |
| api_key | string | No | API key (or use api_key_secret_arn) |
| api_key_secret_arn | string | No | AWS Secrets Manager ARN for API key |
| endpoint | string | No | Provider endpoint URL |
| model | string | No | Default model to use |
| region | string | No | AWS region (for Bedrock) |
| enabled | boolean | No | Whether provider is active (default: true) |
| priority | integer | No | Priority for failover ordering |
| weight | integer | No | Routing weight (0-100) |
| rate_limit | integer | No | Rate limit per minute |
| timeout_seconds | integer | No | Request timeout in seconds |
| settings | object | No | Provider-specific settings |

Response (201 Created):

{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "has_api_key": true
  }
}
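
As a variation on the curl example above, here is a sketch creating a local Ollama provider, which takes an endpoint but no API key. Field names follow the request-body table; the name, endpoint, and model values are illustrative, and the weight/priority semantics in the comments are assumptions based on the table:

import requests

ollama = {
    "name": "ollama-local",                # unique identifier (illustrative)
    "type": "ollama",
    "endpoint": "http://localhost:11434",  # assumed default Ollama port
    "model": "llama3",                     # illustrative model name
    "enabled": True,
    "priority": 10,                        # assumed: higher number = later in failover order
    "weight": 0,                           # assumed: weight 0 excludes it from weighted routing
}
resp = requests.post("http://localhost:8080/api/v1/llm-providers", json=ollama)
if resp.status_code == 409:
    print("provider name already in use")  # PROVIDER_ALREADY_EXISTS
else:
    resp.raise_for_status()
    print(resp.json()["provider"])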

GET /api/v1/llm-providers/{name}

Get configuration for a specific provider.

Request:

curl http://localhost:8080/api/v1/llm-providers/openai

Response (200 OK):

{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 60,
    "rate_limit": 1000,
    "timeout_seconds": 30,
    "has_api_key": true,
    "health": {
      "status": "healthy",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}
Note: API keys and secrets are never returned in responses, for security reasons. The has_api_key field indicates whether credentials are configured.


PUT /api/v1/llm-providers/{name}

Update an existing provider configuration.

Request:

curl -X PUT http://localhost:8080/api/v1/llm-providers/openai \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "weight": 50,
    "model": "gpt-4o",
    "timeout_seconds": 45
  }'

Request Body:

All fields are optional. Only provided fields will be updated.

| Field | Type | Description |
|-------|------|-------------|
| api_key | string | API key |
| api_key_secret_arn | string | AWS Secrets Manager ARN |
| endpoint | string | Provider endpoint URL |
| model | string | Default model |
| region | string | AWS region |
| enabled | boolean | Active status |
| priority | integer | Failover priority |
| weight | integer | Routing weight (0-100) |
| rate_limit | integer | Rate limit per minute |
| timeout_seconds | integer | Request timeout |
| settings | object | Provider-specific settings |

Response (200 OK):

{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 50,
    "rate_limit": 1000,
    "timeout_seconds": 45,
    "has_api_key": true
  }
}

DELETE /api/v1/llm-providers/{name}

Delete a provider configuration.

Request:

curl -X DELETE http://localhost:8080/api/v1/llm-providers/azure-prod

Response (204 No Content):

No response body on successful deletion.


Routing Configuration

GET /api/v1/llm-providers/routing

Get the current routing weights for all providers.

Request:

curl http://localhost:8080/api/v1/llm-providers/routing

Response (200 OK):

{
  "weights": {
    "openai": 60,
    "anthropic": 40
  }
}

The weights map contains the routing weight (0-100) for each configured provider. Requests are distributed proportionally based on these weights.
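
To illustrate proportional distribution: with weights of 60 and 40, roughly 60% of requests go to the first provider. A minimal sketch of weighted random selection over the weights map (an illustration of the concept, not the agent's actual implementation):

import random

def pick_provider(weights: dict[str, int]) -> str:
    """Pick a provider with probability proportional to its weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

weights = {"openai": 60, "anthropic": 40}
picks = [pick_provider(weights) for _ in range(10_000)]
print(picks.count("openai") / len(picks))  # ~0.6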


PUT /api/v1/llm-providers/routing

Update routing weights for providers.

Request:

curl -X PUT http://localhost:8080/api/v1/llm-providers/routing \
  -H "Content-Type: application/json" \
  -d '{
    "weights": {
      "openai": 40,
      "anthropic": 60
    }
  }'

Request Body:

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| weights | object | Yes | Map of provider name to weight (0-100) |

Response (200 OK):

{
  "weights": {
    "openai": 40,
    "anthropic": 60
  }
}
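
Because each PUT replaces the weights in a single step, traffic can be shifted gradually by issuing a series of small updates, e.g. to canary a provider. A sketch, under the assumption that each intermediate weight state takes effect immediately:

import time
import requests

BASE_URL = "http://localhost:8080"

# Shift traffic toward openai in 10-point steps (hypothetical rollout schedule).
for openai_weight in range(40, 61, 10):
    weights = {"openai": openai_weight, "anthropic": 100 - openai_weight}
    resp = requests.put(f"{BASE_URL}/api/v1/llm-providers/routing",
                        json={"weights": weights})
    resp.raise_for_status()
    print("applied", resp.json()["weights"])
    time.sleep(300)  # observe error rates and latency before the next step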

Health & Testing

GET /api/v1/llm-providers/status

Get health status of all configured providers. This triggers a health check on all providers.

Request:

curl http://localhost:8080/api/v1/llm-providers/status

Response (200 OK):

{
  "providers": {
    "openai": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 245
    },
    "anthropic": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 312
    },
    "ollama": {
      "status": "unhealthy",
      "message": "connection refused",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}
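
Since this endpoint triggers a live health check, it can back a simple monitoring loop. A sketch that flags any provider not reporting healthy:

import requests

resp = requests.get("http://localhost:8080/api/v1/llm-providers/status")
resp.raise_for_status()
for name, health in resp.json()["providers"].items():
    if health["status"] != "healthy":
        # message is populated on failure, e.g. "connection refused"
        print(f"ALERT {name}: {health['status']} ({health['message']})")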

GET /api/v1/llm-providers/{name}/health

Check health of a specific provider. Triggers an immediate health check.

Request:

curl http://localhost:8080/api/v1/llm-providers/openai/health

Response (200 OK):

{
  "name": "openai",
  "health": {
    "status": "healthy",
    "message": "",
    "last_checked": "2025-01-02T10:00:00Z",
    "latency_ms": 234
  }
}

Health Status Values:

| Value | Description |
|-------|-------------|
| healthy | Provider is responding normally |
| degraded | Provider is responding but with elevated latency or error rates |
| unhealthy | Provider is not responding or returning errors |
| unknown | Health check not yet performed |

Provider Types

| Value | Description | Tier |
|-------|-------------|------|
| openai | OpenAI API (GPT-4, GPT-3.5) | Community |
| anthropic | Anthropic API (Claude) | Community |
| azure-openai | Azure OpenAI Service | Community |
| gemini | Google Gemini API | Community |
| ollama | Ollama local models | Community |
| bedrock | AWS Bedrock | Enterprise |

POST /api/v1/llm-providers/{name}/test

Test a provider by sending a simple completion request.

Request:

curl -X POST http://localhost:8080/api/v1/llm-providers/anthropic/test \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Say hello in exactly 3 words.",
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 20
  }'

Response (200 OK):

{
  "status": "success",
  "provider": "anthropic",
  "model": "claude-haiku-4-5-20251001",
  "response": "Hello there, friend!",
  "latency_ms": 693,
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 4,
    "total_tokens": 12
  }
}

Response (500 Error):

{
  "error": {
    "code": "TEST_FAILED",
    "message": "test failed: model not found: claude-invalid-model"
  }
}
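
Combining the list and test endpoints gives a quick smoke test of every configured provider. A sketch using the documented endpoints; the model field is omitted on the assumption that each provider's default model is then used:

import requests

BASE_URL = "http://localhost:8080"

providers = requests.get(f"{BASE_URL}/api/v1/llm-providers").json()["providers"]
for p in providers:
    resp = requests.post(
        f"{BASE_URL}/api/v1/llm-providers/{p['name']}/test",
        json={"prompt": "Say hello in exactly 3 words.", "max_tokens": 20},
    )
    if resp.ok:
        print(f"{p['name']}: ok in {resp.json()['latency_ms']}ms")
    else:
        # error body follows the documented {"error": {"code", "message"}} shape
        print(f"{p['name']}: failed with HTTP {resp.status_code}: {resp.text}")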

Environment Variables

LLM providers can also be configured via environment variables:

| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI endpoint URL |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI deployment name |
| GOOGLE_API_KEY | Google Gemini API key |
| OLLAMA_ENDPOINT | Ollama server endpoint |
| OLLAMA_MODEL | Default Ollama model |
| LLM_ROUTING_STRATEGY | Routing strategy (weighted, round_robin, failover) |
| PROVIDER_WEIGHTS | Weights (e.g., "openai:60,anthropic:40") |
| DEFAULT_LLM_PROVIDER | Default provider name |
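
The PROVIDER_WEIGHTS value is a comma-separated list of name:weight pairs. A sketch of how such a string maps onto the same weights object used by the routing API:

def parse_provider_weights(value: str) -> dict[str, int]:
    """Parse 'openai:60,anthropic:40' into {'openai': 60, 'anthropic': 40}."""
    weights = {}
    for pair in value.split(","):
        name, _, weight = pair.strip().partition(":")
        weights[name] = int(weight)
    return weights

assert parse_provider_weights("openai:60,anthropic:40") == {"openai": 60, "anthropic": 40}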

Error Responses

| HTTP Status | Error Code | Description |
|-------------|------------|-------------|
| 400 | INVALID_PROVIDER_CONFIG | Invalid provider configuration |
| 404 | PROVIDER_NOT_FOUND | Provider does not exist |
| 409 | PROVIDER_ALREADY_EXISTS | Provider name already in use |
| 500 | PROVIDER_TEST_FAILED | Provider test/health check failed |

SDK Examples

Use the AxonFlow SDKs to manage LLM providers programmatically.

List Providers (Go)

providers, err := client.ListLLMProviders()
if err != nil {
    log.Fatal(err)
}
for _, p := range providers.Providers {
    fmt.Printf("%s: %s (weight: %d)\n", p.Name, p.Type, p.Weight)
}

List Providers (Python)

providers = await client.list_llm_providers()
for p in providers.providers:
    print(f"{p.name}: {p.type} (weight: {p.weight}, status: {p.health.status})")

List Providers (TypeScript)

const providers = await client.listLLMProviders();
providers.providers.forEach(p => {
  console.log(`${p.name}: ${p.type} (weight: ${p.weight})`);
});

Get Provider Health (Python)

health = await client.get_llm_provider_health("openai")
print(f"Status: {health.status}, Latency: {health.latency_ms}ms")

Update Routing (TypeScript)

await client.updateLLMProviderRouting({
  weights: { openai: 60, anthropic: 40 }
});

Test Provider (Java)

TestProviderResult result = client.testLLMProvider("anthropic",
    TestProviderRequest.builder().prompt("Hello").build());
System.out.printf("Response: %s (%dms)\n", result.getResponse(), result.getLatencyMs());
