LLM Provider Management API
Manage LLM providers, routing configuration, and provider health through the Orchestrator API.
Overview
The LLM Provider API allows you to:
- Configure multiple LLM providers (OpenAI, Anthropic, Azure OpenAI, Gemini, Ollama, Bedrock)
- Define routing strategies (weighted, round-robin, failover)
- Monitor provider health and performance
- Test provider connectivity
Base URL: http://localhost:8081 (Orchestrator)
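In addition to the curl commands below, some endpoints are accompanied by short Python sketches. They all build on the small helper shown here; it is an illustrative sketch rather than an official client, and it assumes the requests package is installed and that the Orchestrator at the base URL above accepts unauthenticated requests.

```python
# Minimal helper used by the Python sketches below (illustrative only).
# Assumes `pip install requests` and an unauthenticated local Orchestrator.
import requests

BASE_URL = "http://localhost:8081"  # adjust for your deployment

def api(method, path, **kwargs):
    """Call the Orchestrator and return the parsed JSON body (None for 204)."""
    resp = requests.request(method, f"{BASE_URL}{path}", timeout=30, **kwargs)
    resp.raise_for_status()  # raise on 4xx/5xx responses
    return None if resp.status_code == 204 else resp.json()
```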
Endpoints
GET /api/v1/llm-provider-types
List the LLM provider types available for configuration.
Request:
curl http://localhost:8081/api/v1/llm-provider-types
Response (200 OK):
{
  "provider_types": [
    {
      "type": "openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "anthropic",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "azure-openai",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "gemini",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "ollama",
      "community": true,
      "required_tier": "community"
    },
    {
      "type": "bedrock",
      "community": false,
      "required_tier": "enterprise"
    }
  ],
  "count": 6
}
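For example, a script can use this endpoint to discover which provider types are configurable without an enterprise license. A sketch using the api helper from the Overview, filtering on the community flag:

```python
# List provider types available at the community tier.
types = api("GET", "/api/v1/llm-provider-types")["provider_types"]
community = [t["type"] for t in types if t["community"]]
print("Community provider types:", ", ".join(community))
# -> openai, anthropic, azure-openai, gemini, ollama
```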
GET /api/v1/llm-providers
List all configured LLM providers.
Request:
curl http://localhost:8081/api/v1/llm-providers
Response (200 OK):
{
  "providers": [
    {
      "name": "openai",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 60,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    },
    {
      "name": "anthropic",
      "type": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "enabled": true,
      "priority": 2,
      "weight": 40,
      "rate_limit": 500,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2025-01-02T10:00:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 2,
    "total_pages": 1
  }
}
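A sketch that audits the configured providers with the api helper from the Overview, flagging any that are enabled but have no credentials (note that some types, such as ollama, may not need an API key):

```python
# Flag enabled providers that have no API key configured.
providers = api("GET", "/api/v1/llm-providers")["providers"]
for p in providers:
    if p["enabled"] and not p["has_api_key"]:
        print(f"WARNING: {p['name']} is enabled but has no API key configured")
```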
POST /api/v1/llm-providers
Create a new LLM provider configuration.
Request:
curl -X POST http://localhost:8081/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "api_key": "your-azure-api-key",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "settings": {
      "api_version": "2024-02-15-preview",
      "deployment_name": "gpt-4o-deployment"
    }
  }'
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique provider identifier |
| type | string | Yes | Provider type (openai, anthropic, etc.) |
| api_key | string | No | API key (or use api_key_secret_arn) |
| api_key_secret_arn | string | No | AWS Secrets Manager ARN for API key |
| endpoint | string | No | Provider endpoint URL |
| model | string | No | Default model to use |
| region | string | No | AWS region (for Bedrock) |
| enabled | boolean | No | Whether provider is active (default: true) |
| priority | integer | No | Priority for failover ordering |
| weight | integer | No | Routing weight (0-100) |
| rate_limit | integer | No | Rate limit per minute |
| timeout_seconds | integer | No | Request timeout in seconds |
| settings | object | No | Provider-specific settings |
Response (201 Created):
{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 3,
    "weight": 30,
    "rate_limit": 500,
    "timeout_seconds": 30,
    "has_api_key": true
  }
}
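The same request in Python, using the api helper from the Overview and handling the 409 conflict documented under Error Responses; the configuration values are the placeholders from the curl example:

```python
import requests

config = {
    "name": "azure-prod",
    "type": "azure-openai",
    "api_key": "your-azure-api-key",  # placeholder; consider api_key_secret_arn instead
    "endpoint": "https://my-resource.openai.azure.com",
    "model": "gpt-4o",
    "weight": 30,
    "settings": {
        "api_version": "2024-02-15-preview",
        "deployment_name": "gpt-4o-deployment"
    },
}

try:
    created = api("POST", "/api/v1/llm-providers", json=config)
    print("created provider:", created["provider"]["name"])
except requests.HTTPError as err:
    if err.response.status_code == 409:
        print("provider name already in use; update it with PUT instead")
    else:
        raise
```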
GET /api/v1/llm-providers/{name}
Get configuration for a specific provider.
Request:
curl http://localhost:8081/api/v1/llm-providers/openai
Response (200 OK):
{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 60,
    "rate_limit": 1000,
    "timeout_seconds": 30,
    "has_api_key": true,
    "health": {
      "status": "healthy",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}
For security, API keys and secrets are never returned in responses. The has_api_key field indicates whether credentials are configured.
PUT /api/v1/llm-providers/{name}
Update an existing provider configuration.
Request:
curl -X PUT http://localhost:8081/api/v1/llm-providers/openai \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "weight": 50,
    "model": "gpt-4-turbo",
    "timeout_seconds": 45
  }'
Request Body:
All fields are optional. Only provided fields will be updated.
| Field | Type | Description |
|---|---|---|
| api_key | string | API key |
| api_key_secret_arn | string | AWS Secrets Manager ARN |
| endpoint | string | Provider endpoint URL |
| model | string | Default model |
| region | string | AWS region |
| enabled | boolean | Active status |
| priority | integer | Failover priority |
| weight | integer | Routing weight (0-100) |
| rate_limit | integer | Rate limit per minute |
| timeout_seconds | integer | Request timeout |
| settings | object | Provider-specific settings |
Response (200 OK):
{
  "provider": {
    "name": "openai",
    "type": "openai",
    "endpoint": "https://api.openai.com/v1",
    "model": "gpt-4-turbo",
    "enabled": true,
    "priority": 1,
    "weight": 50,
    "rate_limit": 1000,
    "timeout_seconds": 45,
    "has_api_key": true
  }
}
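Because only the provided fields change, a partial update is a convenient way to toggle a provider without touching the rest of its configuration. A sketch using the api helper from the Overview:

```python
# Temporarily disable a provider; all other settings are left untouched.
api("PUT", "/api/v1/llm-providers/openai", json={"enabled": False})

# Re-enable it later with the same one-field update.
api("PUT", "/api/v1/llm-providers/openai", json={"enabled": True})
```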
DELETE /api/v1/llm-providers/{name}
Delete a provider configuration.
Request:
curl -X DELETE http://localhost:8081/api/v1/llm-providers/azure-prod
Response (204 No Content):
A successful deletion returns no response body.
Routing Configuration
GET /api/v1/llm-providers/routing
Get the current routing weights for all providers.
Request:
curl http://localhost:8081/api/v1/llm-providers/routing
Response (200 OK):
{
  "weights": {
    "openai": 60,
    "anthropic": 40
  }
}
The weights map contains the routing weight (0-100) for each configured provider. Requests are distributed proportionally based on these weights.
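For intuition: with openai at 60 and anthropic at 40, roughly 60% of requests go to openai and 40% to anthropic. The sketch below reproduces that proportional split using the api helper from the Overview; it illustrates the arithmetic, not the Orchestrator's internal selection algorithm.

```python
import random

# Fetch the live weights and simulate 10,000 weighted picks.
weights = api("GET", "/api/v1/llm-providers/routing")["weights"]
names, shares = zip(*weights.items())
picks = random.choices(names, weights=shares, k=10_000)
for name in names:
    print(name, f"{picks.count(name) / len(picks):.0%}")  # ~60% / ~40%
```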
PUT /api/v1/llm-providers/routing
Update routing weights for providers.
Request:
curl -X PUT http://localhost:8081/api/v1/llm-providers/routing \
  -H "Content-Type: application/json" \
  -d '{
    "weights": {
      "openai": 40,
      "anthropic": 60
    }
  }'
Request Body:
| Field | Type | Required | Description |
|---|---|---|---|
weights | object | Yes | Map of provider name to weight (0-100) |
Response (200 OK):
{
  "weights": {
    "openai": 40,
    "anthropic": 60
  }
}
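One common use is shifting traffic gradually from one provider to another, for example during a model migration. A hypothetical sketch using the api helper from the Overview; the 10-point step size and 5-minute pause are arbitrary choices:

```python
import time

# Shift traffic from openai to anthropic in 10-point steps: 60 -> 20.
for openai_weight in range(60, 19, -10):
    api("PUT", "/api/v1/llm-providers/routing",
        json={"weights": {"openai": openai_weight, "anthropic": 100 - openai_weight}})
    print(f"openai={openai_weight} anthropic={100 - openai_weight}")
    time.sleep(300)  # observe each step before shifting further
```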
Health & Testing
GET /api/v1/llm-providers/status
Get the health status of all configured providers. Calling this endpoint triggers an immediate health check on every provider.
Request:
curl http://localhost:8081/api/v1/llm-providers/status
Response (200 OK):
{
  "providers": {
    "openai": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 245
    },
    "anthropic": {
      "status": "healthy",
      "message": "",
      "last_checked": "2025-01-02T10:00:00Z",
      "latency_ms": 312
    },
    "ollama": {
      "status": "unhealthy",
      "message": "connection refused",
      "last_checked": "2025-01-02T10:00:00Z"
    }
  }
}
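A sketch of a simple monitoring pass over this endpoint with the api helper from the Overview, reporting any provider that is not healthy (in the example above, latency_ms appears only for healthy providers):

```python
# Report the health of every provider in one pass.
statuses = api("GET", "/api/v1/llm-providers/status")["providers"]
for name, health in statuses.items():
    if health["status"] == "healthy":
        print(f"{name}: healthy ({health['latency_ms']} ms)")
    else:
        print(f"{name}: {health['status']} - {health['message'] or 'no details'}")
```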
GET /api/v1/llm-providers/{name}/health
Check the health of a specific provider. This triggers an immediate health check.
Request:
curl http://localhost:8081/api/v1/llm-providers/openai/health
Response (200 OK):
{
  "name": "openai",
  "health": {
    "status": "healthy",
    "message": "",
    "last_checked": "2025-01-02T10:00:00Z",
    "latency_ms": 234
  }
}
Health Status Values:
| Status | Description |
|---|---|
| healthy | Provider is responding normally |
| unhealthy | Provider is not responding or returning errors |
| unknown | Health check not yet performed |
POST /api/v1/llm-providers/{name}/test
Test a provider by sending a simple completion request.
Request:
curl -X POST http://localhost:8081/api/v1/llm-providers/anthropic/test \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Say hello in exactly 3 words.",
    "model": "claude-3-haiku-20240307",
    "max_tokens": 20
  }'
Response (200 OK):
{
  "status": "success",
  "provider": "anthropic",
  "model": "claude-3-haiku-20240307",
  "response": "Hello there, friend!",
  "latency_ms": 693,
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 4,
    "total_tokens": 12
  }
}
Response (500 Internal Server Error):
{
  "error": {
    "code": "PROVIDER_TEST_FAILED",
    "message": "test failed: model not found: claude-invalid-model"
  }
}
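Combining the list and test endpoints gives a quick smoke test across every enabled provider. A sketch using the api helper from the Overview; it assumes that omitting model makes the test fall back to each provider's default model, so pass one explicitly if that is not the case:

```python
import requests

# Smoke-test every enabled provider with a tiny completion request.
providers = api("GET", "/api/v1/llm-providers")["providers"]
for p in providers:
    if not p["enabled"]:
        continue
    try:
        # model omitted on the assumption the provider's default is used.
        result = api("POST", f"/api/v1/llm-providers/{p['name']}/test",
                     json={"prompt": "Say hello in exactly 3 words.", "max_tokens": 20})
        print(f"{p['name']}: ok, {result['latency_ms']} ms -> {result['response']!r}")
    except requests.HTTPError as err:
        print(f"{p['name']}: FAILED (HTTP {err.response.status_code})")
```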
Environment Variables
LLM providers can also be configured via environment variables:
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| ANTHROPIC_API_KEY | Anthropic API key |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI endpoint URL |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI deployment name |
| GOOGLE_API_KEY | Google Gemini API key |
| OLLAMA_ENDPOINT | Ollama server endpoint |
| OLLAMA_MODEL | Default Ollama model |
| LLM_ROUTING_STRATEGY | Routing strategy (weighted, round_robin, failover) |
| PROVIDER_WEIGHTS | Weights (e.g., "openai:60,anthropic:40") |
| DEFAULT_LLM_PROVIDER | Default provider name |
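PROVIDER_WEIGHTS uses a comma-separated list of name:weight pairs. A sketch of how that format parses, for instance when generating it from a deployment script:

```python
import os

# Parse the documented "openai:60,anthropic:40" format into a dict.
raw = os.environ.get("PROVIDER_WEIGHTS", "openai:60,anthropic:40")
weights = {}
for pair in raw.split(","):
    name, _, weight = pair.strip().partition(":")
    weights[name] = int(weight)
print(weights)  # {'openai': 60, 'anthropic': 40}
```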
Error Responses
| HTTP Status | Error Code | Description |
|---|---|---|
| 400 | INVALID_PROVIDER_CONFIG | Invalid provider configuration |
| 404 | PROVIDER_NOT_FOUND | Provider does not exist |
| 409 | PROVIDER_ALREADY_EXISTS | Provider name already in use |
| 500 | PROVIDER_TEST_FAILED | Provider test/health check failed |
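Error bodies follow the {"error": {"code": ..., "message": ...}} shape shown in the test example above. Assuming all endpoints use that shape, the codes can be surfaced from the helper's HTTPError like this:

```python
import requests

try:
    api("GET", "/api/v1/llm-providers/no-such-provider")
except requests.HTTPError as err:
    body = err.response.json()  # assumed JSON error body, as in the test example
    print(err.response.status_code, body["error"]["code"], body["error"]["message"])
    # e.g. 404 PROVIDER_NOT_FOUND ...
```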
Next Steps
- Agent Endpoints - Policy enforcement API
- Orchestrator Endpoints - Multi-agent execution
- SDK Documentation - Language-specific SDKs