LLM Provider Management API

Use this API to register providers, inspect provider health, update routing weights, and control the provider layer behind POST /api/v1/process. These routes are served by the Orchestrator and are commonly reached through the Agent proxy.

Overview

Verified routes:

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/llm-provider-types | List available provider factories |
| GET | /api/v1/llm-providers | List configured providers |
| POST | /api/v1/llm-providers | Create a provider |
| GET | /api/v1/llm-providers/{name} | Get provider config |
| PUT | /api/v1/llm-providers/{name} | Update provider config |
| DELETE | /api/v1/llm-providers/{name} | Delete provider config |
| GET | /api/v1/llm-providers/routing | Read routing weights |
| PUT | /api/v1/llm-providers/routing | Update routing weights |
| GET | /api/v1/llm-providers/status | Health for all providers |
| GET | /api/v1/llm-providers/{name}/health | Health for one provider |
| POST | /api/v1/llm-providers/{name}/test | Connectivity test |

Request-Level Routing Controls (Advanced)

For inference requests sent to /api/v1/process, provider selection controls are passed in context:

  • context.provider (string): preferred provider (fallback allowed)
  • context.strict_provider (boolean, optional): hard-pin provider for that request (no fallback)

Example:

curl -X POST http://localhost:8080/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize this report",
    "request_type": "chat",
    "context": {
      "provider": "openai",
      "strict_provider": true
    },
    "user": {"email": "[email protected]", "role": "analyst"},
    "client": {"id": "analytics-app", "tenant_id": "tenant-1"}
  }'

Base URL: http://localhost:8080 (Agent)

You can also call the same routes directly on the Orchestrator at http://localhost:8081.

Listing Providers

Verified list query parameters:

| Query param | Notes |
| --- | --- |
| page | Default 1 |
| page_size | Default 20, max 100 |
| type | Filter by provider type |
| enabled | true or false |

The list response is a providers array plus pagination. The provider resource shape includes fields such as name, type, endpoint, model, region, enabled, priority, weight, rate_limit, timeout_seconds, has_api_key, optional settings, and optional health.

Verified provider resource fields:

| Field | Notes |
| --- | --- |
| name | Unique provider name |
| type | Provider factory type |
| endpoint | Optional custom endpoint |
| model | Default model |
| region | Region for Bedrock or region-aware providers |
| enabled | Provider is active for routing |
| priority | Failover ordering |
| weight | Weighted routing value |
| rate_limit | Optional rate limit |
| timeout_seconds | Optional provider timeout |
| has_api_key | Credentials are configured, without exposing the secret |
| settings | Provider-specific configuration like Azure deployment name |
| health.status | Current health state when available |
| health.message | Health detail |
| health.last_checked | Timestamp of the latest health check |

Example list response:

{
  "providers": [
    {
      "name": "openai-primary",
      "type": "openai",
      "endpoint": "https://api.openai.com/v1",
      "model": "gpt-4o",
      "enabled": true,
      "priority": 1,
      "weight": 70,
      "rate_limit": 1000,
      "timeout_seconds": 30,
      "has_api_key": true,
      "health": {
        "status": "healthy",
        "last_checked": "2026-03-29T11:58:00Z"
      }
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 1,
    "total_pages": 1
  }
}
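A client can walk the full provider list by following the pagination block until total_pages is reached. A minimal sketch of that loop, assuming the response shape shown above; fetch_page is a hypothetical callable that performs the HTTP GET and returns the decoded JSON:

```python
def list_all_providers(fetch_page, page_size=20):
    """Collect every provider by walking the paginated list endpoint.

    fetch_page(page, page_size) must return the documented response dict
    with "providers" and "pagination" keys.
    """
    providers, page = [], 1
    while True:
        body = fetch_page(page, page_size)
        providers.extend(body["providers"])
        # Stop once the reported total_pages has been consumed.
        if page >= body["pagination"]["total_pages"]:
            return providers
        page += 1
```

Because page_size is capped at 100, a loop like this is the only way to enumerate deployments with more than 100 configured providers.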

Provider Types and Availability

GET /api/v1/llm-provider-types tells you what factories the running build can configure. This is the key detail for edition accuracy: some provider types exist only when the build and license support them.

That means teams evaluating production architecture can use this endpoint to verify what the current deployment actually supports, rather than relying on assumptions from older docs or past installs.
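A pre-flight check against that endpoint avoids a failed create. A sketch, assuming the types response decodes to a list of type identifiers (the helper name is illustrative):

```python
def require_provider_type(available_types, wanted):
    """Fail fast before POSTing a provider the running build cannot create.

    available_types: type identifiers as returned by
    GET /api/v1/llm-provider-types (response shape assumed here).
    """
    if wanted not in available_types:
        raise ValueError(
            f"provider type {wanted!r} is not supported by this build; "
            f"available: {sorted(available_types)}"
        )
    return wanted
```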

Create and Update Rules

Verified create requirements:

  • name is required
  • type is required
  • unsupported provider types return VALIDATION_ERROR
  • duplicate names return CONFLICT
  • license-gated provider registration can return LICENSE_ERROR
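Each of those error codes suggests a distinct client reaction. A hedged sketch of that mapping (the action strings are illustrative, not part of the API response):

```python
# Map the documented create-time error codes to a suggested client action.
CREATE_ERROR_ACTIONS = {
    "VALIDATION_ERROR": "fix the request: unsupported provider type or invalid field",
    "CONFLICT": "choose a different name: a provider with this name already exists",
    "LICENSE_ERROR": "this provider type is license-gated in the current deployment",
}

def classify_create_error(code):
    """Return a human-readable next step for a failed provider create."""
    return CREATE_ERROR_ACTIONS.get(code, "unexpected error code; inspect the response body")
```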

Minimal example:

curl -X POST http://localhost:8080/api/v1/llm-providers \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'client-id:client-secret' | base64)" \
  -d '{
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }'

Create and update fields accepted by the handler:

| Field | Create | Update | Notes |
| --- | --- | --- | --- |
| name | Yes | No | Resource identifier |
| type | Yes | No | Must match a registered provider factory |
| api_key | Optional | Optional | Not returned in responses |
| api_key_secret_arn | Optional | Optional | Alternate secret source |
| endpoint | Optional | Optional | Custom base URL |
| model | Optional | Optional | Default model |
| region | Optional | Optional | Region-aware providers |
| enabled | Optional | Optional | Route or disable the provider |
| priority | Optional | Optional | Failover order |
| weight | Optional | Optional | Weighted routing |
| rate_limit | Optional | Optional | Per-provider limit |
| timeout_seconds | Optional | Optional | Provider timeout |
| settings | Optional | Optional | Provider-specific settings |

Single-provider reads return:

{
  "provider": {
    "name": "azure-prod",
    "type": "azure-openai",
    "endpoint": "https://example.openai.azure.com",
    "model": "gpt-4o",
    "enabled": true,
    "priority": 1,
    "weight": 100,
    "has_api_key": true,
    "settings": {
      "deployment_name": "gpt-4o-prod",
      "api_version": "2024-02-15-preview"
    }
  }
}

Routing Configuration

The routing API is intentionally simple:

{
  "weights": {
    "openai": 70,
    "anthropic": 30
  }
}
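To see how a weighted router might consume that payload, here is a cumulative-weight selection sketch (the Orchestrator's actual algorithm is not specified in this document; r is a uniform draw in [0, 1)):

```python
def pick_provider(weights, r):
    """Select a provider name from {name: weight} given r in [0, 1).

    Each provider owns a slice of the cumulative weight range
    proportional to its weight.
    """
    total = sum(weights.values())
    threshold = r * total
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if threshold < cumulative:
            return name
    return name  # guard against floating-point edge cases near r = 1.0
```

With r drawn uniformly per request, the example weights send roughly 70% of traffic to openai and 30% to anthropic.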

Operationally, teams usually combine these routes in three phases:

  1. Start with one provider and confirm connectivity with /api/v1/llm-providers/{name}/test.
  2. Add a second provider and use /api/v1/llm-providers/status plus routing weights for failover or distribution.
  3. Push request-level overrides via /api/v1/process when specific workflows need hard provider pinning for compliance or quality reasons.
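Phase 2 combines health status with failover priority. A minimal sketch of that ordering, assuming the provider fields documented earlier (enabled, priority, health.status):

```python
def failover_order(providers):
    """Return enabled, healthy providers sorted by ascending priority.

    providers: provider dicts shaped like the documented resource,
    e.g. as returned by GET /api/v1/llm-providers/status
    (health shape assumed).
    """
    eligible = [
        p for p in providers
        if p.get("enabled") and p.get("health", {}).get("status") == "healthy"
    ]
    # Lower priority value means it is tried first.
    return sorted(eligible, key=lambda p: p.get("priority", 0))
```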

That makes it easy to implement provider failover and weighted routing without having to rewrite client applications. In practice, this becomes much more valuable as teams scale from one provider in community deployments to multi-provider and cost-aware routing in evaluation or enterprise programs.