LLM Provider Management API

Manage LLM providers, routing configuration, and provider health through the Orchestrator API.

Overview

The LLM Provider API allows you to:

  • Configure multiple LLM providers (OpenAI, Anthropic, Azure OpenAI, Gemini, Ollama, Bedrock)
  • Define routing strategies (weighted, round-robin, failover)
  • Monitor provider health and performance
  • Test provider connectivity

Base URL: http://localhost:8081 (Orchestrator)


Endpoints

GET /api/v1/llm-provider-types

List all available LLM provider types that can be configured.

Request:

curl http://localhost:8081/api/v1/llm-provider-types

Response (200 OK):

{
  "provider_types": [
    {
      "type": "openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "anthropic",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "azure-openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "gemini",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "ollama",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "bedrock",
      "community": false,
      "required_tier": "enterprise"
    }
  ],
  "count": 6
}
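
On the client side, the response can be filtered to the provider types usable on the current tier. A minimal sketch, assuming only the documented response shape (the sample below is a trimmed two-entry version of it):

```python
import json

# Trimmed sample of a GET /api/v1/llm-provider-types response body
response_body = json.loads("""
{
  "provider_types": [
    {"type": "openai", "community": true, "required_tier": "community"},
    {"type": "bedrock", "community": false, "required_tier": "enterprise"}
  ],
  "count": 2
}
""")

# Keep only the types available on the community tier
community_types = [
    p["type"] for p in response_body["provider_types"] if p["community"]
]
print(community_types)  # ['openai']
```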

GET /api/v1/llm-providers

List all configured LLM providers.

Request:

curl http://localhost:8081/api/v1/llm-providers

Response (200 OK):

{
  "providers": [
    {
      "name": "openai",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 60,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    },
    {
      "name": "anthropic",
      "type": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "enabled": true,
      "priority": 2,
      "weight": 40,
      "rate_limit": 500,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 2,
    "total_pages": 1
  }
}

POST /api/v1/llm-providers

Create a new LLM provider configuration.

Request:

curl -X POST http://localhost:8081/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "api_key": "your-azure-api-key",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "settings": {
      "api_version": "2024-02-15-preview",
      "deployment_name": "gpt-4o-deployment"
    }
  }'

Request Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique provider identifier |
| type | string | Yes | Provider type (openai, anthropic, etc.) |
| api_key | string | No | API key (or use api_key_secret_arn) |
| api_key_secret_arn | string | No | AWS Secrets Manager ARN for API key |
| endpoint | string | No | Provider endpoint URL |
| model | string | No | Default model to use |
| region | string | No | AWS region (for Bedrock) |
| enabled | boolean | No | Whether provider is active (default: true) |
| priority | integer | No | Priority for failover ordering |
| weight | integer | No | Routing weight (0-100) |
| rate_limit | integer | No | Rate limit per minute |
| timeout_seconds | integer | No | Request timeout in seconds |
| settings | object | No | Provider-specific settings |

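Validating the body on the client before POSTing catches the documented constraints early (required name and type, weight in 0-100). A minimal sketch; build_provider_payload is a hypothetical helper, not part of the API:

```python
import json

def build_provider_payload(name: str, ptype: str, **optional) -> dict:
    """Assemble a create-provider request body, enforcing the
    documented field constraints before it is sent."""
    if not name:
        raise ValueError("name is required")
    if not ptype:
        raise ValueError("type is required")
    weight = optional.get("weight")
    if weight is not None and not 0 <= weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    payload = {"name": name, "type": ptype}
    # Drop unset optional fields so defaults apply server-side
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload

payload = build_provider_payload(
    "azure-prod", "azure-openai",
    api_key="your-azure-api-key",
    endpoint="https://my-resource.openai.azure.com",
    model="gpt-4o",
    weight=30,
)
print(json.dumps(payload, indent=2))
```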
Response (201 Created):

{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "has_api_key": true
  }
}

GET /api/v1/llm-providers/{name}

Get configuration for a specific provider.

Request:

curl http://localhost:8081/api/v1/llm-providers/openai

Response (200 OK):

{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 60,
    "rate_limit": 1000,
    "timeout_seconds": 30,
    "has_api_key": true,
    "health": {
      "status": "healthy",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}
> **Note:** API keys and secrets are never returned in responses for security. The has_api_key field indicates whether credentials are configured.


PUT /api/v1/llm-providers/{name}

Update an existing provider configuration.

Request:

curl -X PUT http://localhost:8081/api/v1/llm-providers/openai \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "weight": 50,
    "model": "gpt-4-turbo",
    "timeout_seconds": 45
  }'

Request Body:

All fields are optional. Only provided fields will be updated.

| Field | Type | Description |
| --- | --- | --- |
| api_key | string | API key |
| api_key_secret_arn | string | AWS Secrets Manager ARN |
| endpoint | string | Provider endpoint URL |
| model | string | Default model |
| region | string | AWS region |
| enabled | boolean | Active status |
| priority | integer | Failover priority |
| weight | integer | Routing weight (0-100) |
| rate_limit | integer | Rate limit per minute |
| timeout_seconds | integer | Request timeout |
| settings | object | Provider-specific settings |
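
The partial-update semantics ("only provided fields will be updated") amount to a shallow merge of the request body onto the stored configuration. A minimal sketch of that behaviour; apply_update is illustrative, not the orchestrator's actual code:

```python
def apply_update(existing: dict, update: dict) -> dict:
    """PUT semantics as documented: fields present in the request
    body overwrite the stored value; everything else is untouched."""
    merged = dict(existing)
    merged.update({k: v for k, v in update.items() if v is not None})
    return merged

current = {"name": "openai", "model": "gpt-4o", "weight": 60, "timeout_seconds": 30}
updated = apply_update(
    current, {"model": "gpt-4-turbo", "weight": 50, "timeout_seconds": 45}
)
print(updated)  # name is preserved; the three supplied fields change
```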

Response (200 OK):

{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4-turbo",
    "enabled": true,
    "priority": 1,
    "weight": 50,
    "rate_limit": 1000,
    "timeout_seconds": 45,
    "has_api_key": true
  }
}

DELETE /api/v1/llm-providers/{name}

Delete a provider configuration.

Request:

curl -X DELETE http://localhost:8081/api/v1/llm-providers/azure-prod

Response (204 No Content):

No response body on successful deletion.


Routing Configuration

GET /api/v1/llm-providers/routing

Get the current routing weights for all providers.

Request:

curl http://localhost:8081/api/v1/llm-providers/routing

Response (200 OK):

{
  "weights": {
    "openai": 60,
    "anthropic": 40
  }
}

The weights map contains the routing weight (0-100) for each configured provider. Requests are distributed proportionally based on these weights.
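
Proportional distribution means a provider with weight 60 should receive roughly 60% of traffic when the weights sum to 100. A minimal sketch of weighted selection, assuming only the documented weights map (this mirrors the behaviour, not the orchestrator's implementation):

```python
import random

def pick_provider(weights: dict[str, int], rng: random.Random) -> str:
    """Select a provider with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

weights = {"openai": 60, "anthropic": 40}
rng = random.Random(0)  # seeded for reproducibility
picks = [pick_provider(weights, rng) for _ in range(10_000)]
print(picks.count("openai") / len(picks))  # roughly 0.6
```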


PUT /api/v1/llm-providers/routing

Update routing weights for providers.

Request:

curl -X PUT http://localhost:8081/api/v1/llm-providers/routing \
  -H "Content-Type: application/json" \
  -d '{
    "weights": {
      "openai": 40,
      "anthropic": 60
    }
  }'

Request Body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| weights | object | Yes | Map of provider name to weight (0-100) |

Response (200 OK):

{
  "weights": {
    "openai": 40,
    "anthropic": 60
  }
}

Health & Testing

GET /api/v1/llm-providers/status

Get health status of all configured providers. This triggers a health check on all providers.

Request:

curl http://localhost:8081/api/v1/llm-providers/status

Response (200 OK):

{
  "providers": {
    "openai": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 245
    },
    "anthropic": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 312
    },
    "ollama": {
      "status": "unhealthy",
      "message": "connection refused",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}

GET /api/v1/llm-providers/{name}/health

Check health of a specific provider. Triggers an immediate health check.

Request:

curl http://localhost:8081/api/v1/llm-providers/openai/health

Response (200 OK):

{
  "name": "openai",
  "health": {
    "status": "healthy",
    "message": "",
    "last_checked": "2025-01-02T10:00:00Z",
    "latency_ms": 234
  }
}

Health Status Values:

| Status | Description |
| --- | --- |
| healthy | Provider is responding normally |
| unhealthy | Provider is not responding or returning errors |
| unknown | Health check not yet performed |

POST /api/v1/llm-providers/{name}/test

Test a provider by sending a simple completion request.

Request:

curl -X POST http://localhost:8081/api/v1/llm-providers/anthropic/test \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Say hello in exactly 3 words.",
    "model": "claude-3-haiku-20240307",
    "max_tokens": 20
  }'

Response (200 OK):

{
  "status": "success",
  "provider": "anthropic",
  "model": "claude-3-haiku-20240307",
  "response": "Hello there, friend!",
  "latency_ms": 693,
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 4,
    "total_tokens": 12
  }
}

Response (500 Error):

{
  "error": {
    "code": "TEST_FAILED",
    "message": "test failed: model not found: claude-invalid-model"
  }
}

Environment Variables

LLM providers can also be configured via environment variables:

| Variable | Description |
| --- | --- |
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI endpoint URL |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI deployment name |
| GOOGLE_API_KEY | Google Gemini API key |
| OLLAMA_ENDPOINT | Ollama server endpoint |
| OLLAMA_MODEL | Default Ollama model |
| LLM_ROUTING_STRATEGY | Routing strategy (weighted, round_robin, failover) |
| PROVIDER_WEIGHTS | Weights (e.g., "openai:60,anthropic:40") |
| DEFAULT_LLM_PROVIDER | Default provider name |

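The PROVIDER_WEIGHTS value is a comma-separated list of name:weight pairs. A minimal parsing sketch, assuming only the documented format ("openai:60,anthropic:40"); parse_provider_weights is an illustrative helper, not part of the orchestrator:

```python
def parse_provider_weights(raw: str) -> dict[str, int]:
    """Parse a PROVIDER_WEIGHTS string such as "openai:60,anthropic:40"
    into a provider-name -> weight map."""
    weights: dict[str, int] = {}
    for entry in raw.split(","):
        name, _, value = entry.strip().partition(":")
        if not name or not value:
            raise ValueError(f"malformed entry: {entry!r}")
        weights[name] = int(value)
    return weights

print(parse_provider_weights("openai:60,anthropic:40"))
# {'openai': 60, 'anthropic': 40}
```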
Error Responses

| HTTP Status | Error Code | Description |
| --- | --- | --- |
| 400 | INVALID_PROVIDER_CONFIG | Invalid provider configuration |
| 404 | PROVIDER_NOT_FOUND | Provider does not exist |
| 409 | PROVIDER_ALREADY_EXISTS | Provider name already in use |
| 500 | PROVIDER_TEST_FAILED | Provider test/health check failed |

Next Steps