# LLM Provider Management API
Use this API to register providers, inspect provider health, update routing weights, and control the provider layer behind POST /api/v1/process. These routes are served by the Orchestrator and are commonly reached through the Agent proxy.
## Overview
Verified routes:
| Method | Path | Purpose |
|---|---|---|
| GET | /api/v1/llm-provider-types | List available provider factories |
| GET | /api/v1/llm-providers | List configured providers |
| POST | /api/v1/llm-providers | Create a provider |
| GET | /api/v1/llm-providers/{name} | Get provider config |
| PUT | /api/v1/llm-providers/{name} | Update provider config |
| DELETE | /api/v1/llm-providers/{name} | Delete provider config |
| GET | /api/v1/llm-providers/routing | Read routing weights |
| PUT | /api/v1/llm-providers/routing | Update routing weights |
| GET | /api/v1/llm-providers/status | Health for all providers |
| GET | /api/v1/llm-providers/{name}/health | Health for one provider |
| POST | /api/v1/llm-providers/{name}/test | Connectivity test |
## Request-Level Routing Controls (Advanced)
For inference requests sent to `/api/v1/process`, provider selection controls are passed in `context`:

- `context.provider` (string): preferred provider (fallback allowed)
- `context.strict_provider` (boolean, optional): hard-pin the provider for that request (no fallback)
Example:

```bash
curl -X POST http://localhost:8080/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize this report",
    "request_type": "chat",
    "context": {
      "provider": "openai",
      "strict_provider": true
    },
    "user": {"email": "[email protected]", "role": "analyst"},
    "client": {"id": "analytics-app", "tenant_id": "tenant-1"}
  }'
```
Base URL: `http://localhost:8080` (Agent). You can also call the same routes directly on the Orchestrator at `http://localhost:8081`.
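As a sketch of the request-level controls above, a client might assemble the payload programmatically. The helper name and defaults here are illustrative, not part of the API; only the field names come from the curl example.

```python
# Hypothetical helper: build a /api/v1/process payload with optional
# request-level provider pinning. Field names mirror the curl example above.
def build_process_request(query, provider=None, strict=False):
    payload = {"query": query, "request_type": "chat", "context": {}}
    if provider:
        payload["context"]["provider"] = provider  # preferred provider, fallback allowed
        if strict:
            # strict_provider hard-pins the provider for this request only
            payload["context"]["strict_provider"] = True
    return payload

req = build_process_request("Summarize this report", provider="openai", strict=True)
```

Omitting `provider` leaves routing entirely to the configured weights.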
## Listing Providers
Verified list query parameters:
| Query param | Notes |
|---|---|
| page | Default 1 |
| page_size | Default 20, max 100 |
| type | Filter by provider type |
| enabled | true or false |
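As an illustration of those parameters, a client could clamp values to the documented bounds before calling the endpoint. The helper itself is hypothetical; only the parameter names, defaults, and the max of 100 come from the table.

```python
from urllib.parse import urlencode

# Illustrative URL builder for GET /api/v1/llm-providers. Clamping mirrors
# the documented bounds: page >= 1, page_size between 1 and 100.
def list_providers_url(base="http://localhost:8080", page=1, page_size=20,
                       type_=None, enabled=None):
    params = {"page": max(1, page), "page_size": min(max(1, page_size), 100)}
    if type_ is not None:
        params["type"] = type_  # filter by provider factory type
    if enabled is not None:
        params["enabled"] = "true" if enabled else "false"
    return f"{base}/api/v1/llm-providers?{urlencode(params)}"
```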
The list response is a `providers` array plus `pagination`. The provider resource shape includes fields such as `name`, `type`, `endpoint`, `model`, `region`, `enabled`, `priority`, `weight`, `rate_limit`, `timeout_seconds`, `has_api_key`, optional `settings`, and optional `health`.
Verified provider resource fields:
| Field | Notes |
|---|---|
| name | Unique provider name |
| type | Provider factory type |
| endpoint | Optional custom endpoint |
| model | Default model |
| region | Region for Bedrock or region-aware providers |
| enabled | Provider is active for routing |
| priority | Failover ordering |
| weight | Weighted routing value |
| rate_limit | Optional rate limit |
| timeout_seconds | Optional provider timeout |
| has_api_key | Credentials are configured, without exposing the secret |
| settings | Provider-specific configuration like Azure deployment name |
| health.status | Current health state when available |
| health.message | Health detail |
| health.last_checked | Timestamp of the latest health check |
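The resource shape above can be modeled client-side roughly as follows. This is an illustrative sketch, not a published SDK type; the server's actual schema is authoritative.

```python
from dataclasses import dataclass, field
from typing import Optional

# Client-side model of the provider resource described in the table above.
# Optional fields default to None; settings defaults to an empty dict.
@dataclass
class Provider:
    name: str
    type: str
    enabled: bool = True
    endpoint: Optional[str] = None
    model: Optional[str] = None
    region: Optional[str] = None
    priority: Optional[int] = None
    weight: Optional[int] = None
    rate_limit: Optional[int] = None
    timeout_seconds: Optional[int] = None
    has_api_key: bool = False
    settings: dict = field(default_factory=dict)
    health: Optional[dict] = None  # {"status", "message", "last_checked"}
```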
Example list response:

```json
{
  "providers": [
    {
      "name": "openai-primary",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 70,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2026-03-29T11:58:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 1,
    "total_pages": 1
  }
}
```
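Given the `pagination` object shown above, a client can walk all pages with a small loop. Here `fetch_page` is a placeholder for the actual HTTP call returning the JSON shape above; this is a sketch, not a shipped client.

```python
# Walk every page of the list response. fetch_page(page=, page_size=) is a
# stand-in for an HTTP GET against /api/v1/llm-providers.
def iter_providers(fetch_page, page_size=20):
    page = 1
    while True:
        body = fetch_page(page=page, page_size=page_size)
        yield from body["providers"]
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1
```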
## Provider Types and Availability
`GET /api/v1/llm-provider-types` tells you which factories the running build can configure. This is the key detail for edition accuracy: some provider types exist only when the build and license support them.
That means teams evaluating production architecture can use this endpoint to verify what the current deployment actually supports, rather than relying on assumptions from older docs or past installs.
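A pre-flight check along these lines makes that verification concrete. The helper is hypothetical; `available` would be populated from the `GET /api/v1/llm-provider-types` response.

```python
# Confirm the running build supports a provider type before attempting to
# create it, failing fast with the supported alternatives.
def ensure_type_supported(desired, available):
    if desired not in available:
        raise ValueError(
            f"provider type {desired!r} not supported by this build; "
            f"available: {sorted(available)}"
        )
    return desired
```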
## Create and Update Rules
Verified create requirements:
- `name` is required
- `type` is required
- Unsupported provider types return `VALIDATION_ERROR`
- Duplicate names return `CONFLICT`
- License-gated provider registration can return `LICENSE_ERROR`
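A client might map those error codes to exceptions roughly as follows. The response envelope with an `error.code` field is an assumption for illustration; only the three codes themselves come from the list above.

```python
# Sketch of handling the documented create-time error codes.
class ProviderCreateError(Exception):
    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code

def raise_for_create_error(body):
    # Assumes an {"error": {"code": ..., "message": ...}} envelope.
    code = body.get("error", {}).get("code")
    if code in {"VALIDATION_ERROR", "CONFLICT", "LICENSE_ERROR"}:
        raise ProviderCreateError(code, body["error"].get("message", ""))
```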
Minimal example:
```bash
curl -X POST http://localhost:8080/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'client-id:client-secret' | base64)" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }'
```
Create and update fields accepted by the handler:
| Field | Create | Update | Notes |
|---|---|---|---|
| name | Yes | No | Resource identifier |
| type | Yes | No | Must match a registered provider factory |
| api_key | Optional | Optional | Not returned in responses |
| api_key_secret_arn | Optional | Optional | Alternate secret source |
| endpoint | Optional | Optional | Custom base URL |
| model | Optional | Optional | Default model |
| region | Optional | Optional | Region-aware providers |
| enabled | Optional | Optional | Route or disable the provider |
| priority | Optional | Optional | Failover order |
| weight | Optional | Optional | Weighted routing |
| rate_limit | Optional | Optional | Per-provider limit |
| timeout_seconds | Optional | Optional | Provider timeout |
| settings | Optional | Optional | Provider-specific settings |
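Because `name` and `type` are create-only in the table above, a client-side update helper can strip them before issuing a PUT. This is a sketch of one defensive pattern; the server may simply reject immutable fields instead.

```python
# Fields the table marks as updatable; name and type are create-only.
MUTABLE_FIELDS = {
    "api_key", "api_key_secret_arn", "endpoint", "model", "region",
    "enabled", "priority", "weight", "rate_limit", "timeout_seconds",
    "settings",
}

def update_payload(changes):
    """Drop create-only fields so a PUT body contains only mutable keys."""
    return {k: v for k, v in changes.items() if k in MUTABLE_FIELDS}
```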
Single-provider reads return:
```json
{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "has_api_key": true,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }
}
```
## Routing Configuration
The routing API is intentionally simple:
```json
{
  "weights": {
    "openai": 70,
    "anthropic": 30
  }
}
```
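To see how those weights translate into a traffic split, here is an illustrative client-side weighted pick. The real selection happens server-side; this only demonstrates the proportions implied by the weights.

```python
import random

# Pick a provider name with probability proportional to its weight.
def pick_provider(weights, rng=random):
    total = sum(weights.values())
    point = rng.uniform(0, total)
    for name, w in weights.items():
        point -= w
        if point <= 0:
            return name
    return name  # guard against float rounding at the upper edge
```

With the example weights, roughly 70% of picks land on `openai` and 30% on `anthropic`.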
Operationally, teams usually combine these routes in three phases:
1. Start with one provider and confirm connectivity with `/api/v1/llm-providers/{name}/test`.
2. Add a second provider and use `/api/v1/llm-providers/status` plus routing weights for failover or distribution.
3. Push request-level overrides via `/api/v1/process` when specific workflows need hard provider pinning for compliance or quality reasons.
That makes it easy to implement provider failover and weighted routing without having to rewrite client applications. In practice, this becomes much more valuable as teams scale from one provider in community deployments to multi-provider and cost-aware routing in evaluation or enterprise programs.
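The failover phase above can be sketched as a client-visible ordering rule, assuming lower `priority` wins among enabled, healthy providers. The tie-breaking and the exact health filter are assumptions for illustration, not documented server behavior.

```python
# Order enabled, healthy providers by ascending priority (lower wins).
# Provider dicts use the field names from the provider resource table.
def failover_order(providers):
    candidates = [
        p for p in providers
        if p.get("enabled") and (p.get("health") or {}).get("status") == "healthy"
    ]
    return sorted(candidates, key=lambda p: p.get("priority", 1 << 30))
```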
