Cost Management
AxonFlow's cost management features help organizations control and optimize LLM spending. Set budgets at multiple levels, receive alerts before limits are reached, and enforce spending policies automatically.
In AxonFlow, budget limits are governance policies. Just like content policies block harmful requests, budget policies can block or downgrade requests that would exceed spending limits.
Why Cost Management Matters
AI agent deployments can quickly become expensive without proper controls:
- Runaway Costs: A buggy agent loop can burn through API credits in minutes
- Budget Allocation: Different teams/projects need separate spending limits
- Visibility: Organizations need to know which agents/workflows cost the most
- Accountability: Costs must be attributable to specific use cases
- Forecasting: Monthly spend needs to be predictable
AxonFlow tracks every token, calculates costs in real time, and enforces budget policies automatically.
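As a rough illustration of how token counts turn into a dollar figure, here is a minimal sketch. The model names and per-million-token prices are hypothetical placeholders, and `request_cost_usd` is not part of the AxonFlow SDK; AxonFlow performs this accounting internally.

```python
# Hypothetical per-million-token prices in USD; real prices vary by provider and model.
PRICES_PER_MILLION = {
    "example-large": {"input": 10.00, "output": 30.00},
    "example-small": {"input": 0.50, "output": 1.50},
}


def request_cost_usd(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost of one request: tokens in each direction times that direction's price."""
    price = PRICES_PER_MILLION[model]
    return (tokens_in * price["input"] + tokens_out * price["output"]) / 1_000_000


cost = request_cost_usd("example-large", tokens_in=2_000, tokens_out=500)
print(f"${cost:.4f}")  # $0.0350
```

Summing this per-request figure over a period gives the usage that a budget is compared against.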
Budget Hierarchy
Budgets can be set at multiple scopes, creating a hierarchy:
Organization Budget ($10,000/month)
│
├── Team: Platform ($5,000/month)
│   ├── Agent: code-reviewer ($1,000/month)
│   └── Agent: test-generator ($500/month)
│
├── Team: Data Science ($3,000/month)
│   └── Workflow: daily-analysis ($100/day)
│
└── Team: Customer Support ($2,000/month)
    └── Agent: ticket-responder ($50/day)
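One way to reason about a hierarchy like this is that a request must fit within every budget on the path from the agent up to the organization. The sketch below models that idea in plain Python. The `Budget` class and helper functions are illustrative only, and the assumption that spend rolls up to every ancestor is ours, not a documented AxonFlow guarantee.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Budget:
    name: str
    limit_usd: float
    used_usd: float = 0.0
    parent: Optional["Budget"] = None


def chain(budget: Budget) -> Iterator[Budget]:
    """Yield the budget itself, then each ancestor up to the root."""
    node = budget
    while node is not None:
        yield node
        node = node.parent


def first_exhausted(budget: Budget, cost_usd: float) -> Optional[Budget]:
    """Return the nearest budget that cost_usd would push over its limit, or None."""
    for node in chain(budget):
        if node.used_usd + cost_usd > node.limit_usd:
            return node
    return None


def record(budget: Budget, cost_usd: float) -> None:
    """Add spend to the budget and to every ancestor (assumed roll-up behavior)."""
    for node in chain(budget):
        node.used_usd += cost_usd


org = Budget("org", 10_000.0)
platform = Budget("platform", 5_000.0, parent=org)
reviewer = Budget("code-reviewer", 1_000.0, parent=platform)

record(reviewer, 990.0)
blocker = first_exhausted(reviewer, 20.0)
print(blocker.name if blocker else "allowed")  # code-reviewer
```

In this model the tightest budget on the path decides the outcome, which is why a small agent budget can stop a request long before the team or organization limit is near.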
Scope Types
| Scope | Use Case |
|---|---|
| organization | Overall company spending limit |
| team | Department or project budget |
| agent | Individual AI agent budget |
| workflow | Specific workflow budget |
| user | Per-user spending limit |
Creating Budgets
Using the SDK
import asyncio

from axonflow import AxonFlow, CreateBudgetRequest, BudgetScope, BudgetPeriod, BudgetOnExceed


async def main():
    async with AxonFlow(endpoint="http://localhost:8081") as client:
        # Organization-level monthly budget
        await client.create_budget(CreateBudgetRequest(
            id="org-monthly",
            name="Organization Monthly Budget",
            scope=BudgetScope.ORGANIZATION,
            limit_usd=10000.0,
            period=BudgetPeriod.MONTHLY,
            on_exceed=BudgetOnExceed.BLOCK,
            alert_thresholds=[50, 80, 90, 100]
        ))
        # Team budget (nested under org)
        await client.create_budget(CreateBudgetRequest(
            id="platform-team",
            name="Platform Team Budget",
            scope=BudgetScope.TEAM,
            scope_id="platform",
            limit_usd=5000.0,
            period=BudgetPeriod.MONTHLY,
            on_exceed=BudgetOnExceed.WARN,
            alert_thresholds=[50, 80, 100]
        ))
        # Agent budget (daily limit)
        await client.create_budget(CreateBudgetRequest(
            id="code-reviewer-daily",
            name="Code Reviewer Daily Limit",
            scope=BudgetScope.AGENT,
            scope_id="code-reviewer",
            limit_usd=50.0,
            period=BudgetPeriod.DAILY,
            on_exceed=BudgetOnExceed.BLOCK,
            alert_thresholds=[80, 100]
        ))


# The remaining snippets on this page assume an open `client` inside an async function.
asyncio.run(main())
Budget Periods
| Period | Reset Frequency |
|---|---|
| daily | Every day at midnight UTC |
| weekly | Every Monday at midnight UTC |
| monthly | First of each month at midnight UTC |
| quarterly | First of each quarter |
| yearly | January 1st |
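To make the reset boundaries concrete, the sketch below computes when the current period began for a given moment. `period_start` is a hypothetical helper, not an SDK function. It also assumes that quarterly and yearly periods reset at midnight UTC like the others, which the table does not state explicitly.

```python
from datetime import datetime, timedelta, timezone


def period_start(period: str, now: datetime) -> datetime:
    """Return the UTC instant at which the current budget period began.

    `now` must be timezone-aware; it is converted to UTC before truncating.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        # datetime.weekday() returns 0 for Monday
        return midnight - timedelta(days=midnight.weekday())
    if period == "monthly":
        return midnight.replace(day=1)
    if period == "quarterly":
        first_month = 3 * ((midnight.month - 1) // 3) + 1  # 1, 4, 7, or 10
        return midnight.replace(month=first_month, day=1)
    if period == "yearly":
        return midnight.replace(month=1, day=1)
    raise ValueError(f"unknown period: {period!r}")


now = datetime(2025, 5, 14, 15, 30, tzinfo=timezone.utc)  # a Wednesday
for name in ("daily", "weekly", "monthly", "quarterly", "yearly"):
    print(name, period_start(name, now).date())
```

Usage recorded before the returned instant belongs to the previous period and no longer counts against the current limit.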
Enforcement Actions
When a budget limit is reached, AxonFlow takes action based on the on_exceed setting:
| Action | Behavior |
|---|---|
| warn | Log warning, send alert, allow request |
| block | Reject request with budget exceeded error |
| downgrade | Switch to a cheaper model (Enterprise) |
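The three actions can be pictured as a small decision function. The sketch below is a simplified model for illustration: `Decision`, `enforce`, and the model names are hypothetical, and treating "used at or above the limit" as exceeded is an assumption about the boundary, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    allowed: bool
    model: str
    message: str = ""


def enforce(on_exceed: str, used_usd: float, limit_usd: float,
            model: str, fallback_model: str) -> Decision:
    """Map the on_exceed setting to an outcome once the limit is reached."""
    if used_usd < limit_usd:
        return Decision(True, model)
    if on_exceed == "warn":
        return Decision(True, model, "budget exceeded; request allowed")
    if on_exceed == "block":
        return Decision(False, model, "budget exceeded; request rejected")
    if on_exceed == "downgrade":
        return Decision(True, fallback_model, "budget exceeded; using cheaper model")
    raise ValueError(f"unknown on_exceed action: {on_exceed!r}")


print(enforce("downgrade", 120.0, 100.0, "example-large", "example-small"))
```

Only `block` stops the request. `warn` and `downgrade` both let it through, so they suit budgets where availability matters more than a hard cap.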
Example: Blocking When Budget Exceeded
# BudgetCheckRequest is assumed to be exported like the other request types
from axonflow import BudgetCheckRequest

# Check budget before making LLM request
decision = await client.check_budget(BudgetCheckRequest(
    team_id="platform",
    agent_id="code-reviewer"
))
if decision.allowed:
    # Safe to make LLM request (`llm` is your application's own LLM client)
    response = await llm.generate(prompt)
else:
    # Budget exceeded
    print(f"Blocked: {decision.message}")
    print(f"Used: ${decision.used_usd:.2f} / ${decision.limit_usd:.2f}")
Alert Thresholds
Configure percentage thresholds to receive alerts before budgets are exceeded:
await client.create_budget(CreateBudgetRequest(
    id="team-budget",
    name="Engineering Team",
    scope=BudgetScope.TEAM,
    scope_id="engineering",
    limit_usd=5000.0,
    period=BudgetPeriod.MONTHLY,
    on_exceed=BudgetOnExceed.BLOCK,
    alert_thresholds=[50, 80, 90, 100]  # Alert at 50%, 80%, 90%, 100%
))
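A useful mental model for threshold alerts is that each threshold fires when usage crosses it, not on every request above it. The sketch below shows that crossing logic in isolation. `crossed_thresholds` is a hypothetical helper, and whether AxonFlow deduplicates alerts in exactly this way is an assumption.

```python
def crossed_thresholds(thresholds, limit_usd, used_before, used_after):
    """Return the percentage thresholds crossed as usage moves from used_before to used_after."""
    before_pct = 100.0 * used_before / limit_usd
    after_pct = 100.0 * used_after / limit_usd
    return [t for t in sorted(thresholds) if before_pct < t <= after_pct]


thresholds = [50, 80, 90, 100]
# One large request takes usage from 48% to 82% of a $5,000 budget.
print(crossed_thresholds(thresholds, 5000.0, 2400.0, 4100.0))  # [50, 80]
# The next small request stays between 80% and 90%, so nothing new fires.
print(crossed_thresholds(thresholds, 5000.0, 4100.0, 4200.0))  # []
```

A single expensive request can cross several thresholds at once, so alert handlers should expect more than one alert for the same budget in quick succession.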
Viewing Alerts
# Get alerts for a budget
alerts = await client.get_budget_alerts("team-budget")
for alert in alerts.alerts:
    print(f"[{alert.created_at}] {alert.message}")
    print(f"  Threshold: {alert.threshold}%")
    print(f"  Amount: ${alert.amount_usd:.2f}")
Monitoring Usage
Real-Time Budget Status
status = await client.get_budget_status("team-budget")
print(f"Budget: {status.budget.name}")
print(f"Used: ${status.used_usd:.2f} / ${status.budget.limit_usd:.2f}")
print(f"Remaining: ${status.remaining_usd:.2f}")
print(f"Percentage: {status.percentage:.1f}%")
print(f"Period: {status.period_start} to {status.period_end}")
if status.is_exceeded:
    print("WARNING: Budget exceeded!")
Usage Summary
usage = await client.get_usage_summary(period="monthly")
print(f"Total Cost: ${usage.total_cost_usd:.2f}")
print(f"Total Requests: {usage.total_requests:,}")
print(f"Tokens In: {usage.total_tokens_in:,}")
print(f"Tokens Out: {usage.total_tokens_out:,}")
Usage Breakdown
Analyze spending by different dimensions:
# By provider
by_provider = await client.get_usage_breakdown("provider", "monthly")
for item in by_provider.items:
    print(f"{item.name}: ${item.cost_usd:.2f} ({item.percentage:.1f}%)")
# By model
by_model = await client.get_usage_breakdown("model", "monthly")
# By team
by_team = await client.get_usage_breakdown("team", "monthly")
# By agent
by_agent = await client.get_usage_breakdown("agent", "monthly")
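Conceptually, a breakdown groups usage records by one dimension and reports each group's share of the total. The sketch below reproduces that aggregation locally over made-up records. The record fields and the `breakdown` function are illustrative assumptions, not the SDK's data model.

```python
from collections import defaultdict

# Made-up usage records; field names are assumptions for this sketch.
records = [
    {"provider": "provider-a", "team": "platform", "cost_usd": 30.0},
    {"provider": "provider-a", "team": "data-science", "cost_usd": 10.0},
    {"provider": "provider-b", "team": "platform", "cost_usd": 60.0},
]


def breakdown(records, dimension):
    """Sum cost per value of `dimension` and attach each group's share of the total."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["cost_usd"]
    grand_total = sum(totals.values())
    items = [
        {
            "name": name,
            "cost_usd": cost,
            "percentage": 100.0 * cost / grand_total if grand_total else 0.0,
        }
        for name, cost in totals.items()
    ]
    # Most expensive group first
    return sorted(items, key=lambda item: item["cost_usd"], reverse=True)


for item in breakdown(records, "provider"):
    print(f"{item['name']}: ${item['cost_usd']:.2f} ({item['percentage']:.1f}%)")
```

Running the same records through different dimensions (here `provider` and `team`) answers different questions from one dataset, which is what the four calls above do against real usage.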
Best Practices
1. Start with Organization Budget
Always set a top-level organization budget as a safety net:
await client.create_budget(CreateBudgetRequest(
    id="org-safety-net",
    name="Organization Safety Net",
    scope=BudgetScope.ORGANIZATION,
    limit_usd=50000.0,  # High limit as safety net
    period=BudgetPeriod.MONTHLY,
    on_exceed=BudgetOnExceed.BLOCK,
    alert_thresholds=[80, 90, 95, 100]
))
2. Use Daily Limits for Agents
Agents can have runaway loops. Daily limits catch issues quickly:
await client.create_budget(CreateBudgetRequest(
    id="agent-daily",
    scope=BudgetScope.AGENT,
    scope_id="my-agent",
    limit_usd=100.0,  # $100/day max
    period=BudgetPeriod.DAILY,
    on_exceed=BudgetOnExceed.BLOCK
))
3. Pre-Check Before Expensive Operations
Always check budgets before calling expensive models:
# `client`, `llm`, and `BudgetExceededError` are assumed to be defined by your application.
async def safe_llm_call(prompt: str, team_id: str):
    # Check budget first
    decision = await client.check_budget(BudgetCheckRequest(team_id=team_id))
    if not decision.allowed:
        raise BudgetExceededError(decision.message)
    # Safe to proceed
    return await llm.generate(prompt)
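To exercise this pre-check pattern without a running AxonFlow instance, you can swap in stand-ins. Everything in the sketch below is hypothetical: `StubClient` only mimics `check_budget` (and takes a plain `team_id` rather than a `BudgetCheckRequest`), `fake_llm` replaces the real model call, and `BudgetExceededError` is defined locally.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubDecision:
    allowed: bool
    message: str = ""


class StubClient:
    """Stand-in for the AxonFlow client; it only mimics check_budget."""

    def __init__(self, exhausted_teams):
        self.exhausted_teams = set(exhausted_teams)

    async def check_budget(self, team_id: str) -> StubDecision:
        if team_id in self.exhausted_teams:
            return StubDecision(False, f"budget exceeded for team {team_id}")
        return StubDecision(True)


class BudgetExceededError(Exception):
    """Raised by the application when a pre-flight check fails."""


async def fake_llm(prompt: str) -> str:
    return f"echo: {prompt}"


async def safe_llm_call(client, prompt: str, team_id: str) -> str:
    decision = await client.check_budget(team_id)
    if not decision.allowed:
        raise BudgetExceededError(decision.message)
    return await fake_llm(prompt)


async def demo():
    client = StubClient(exhausted_teams=["data-science"])
    ok = await safe_llm_call(client, "hello", "platform")
    try:
        await safe_llm_call(client, "hello", "data-science")
        blocked = False
    except BudgetExceededError:
        blocked = True
    return ok, blocked


result = asyncio.run(demo())
print(result)  # ('echo: hello', True)
```

Passing the client in as a parameter, rather than reading a global, is what makes the function easy to test this way.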
4. Set Meaningful Alert Thresholds
Configure alerts that give you time to react:
- 50%: Early warning, halfway through budget
- 80%: Time to review spending patterns
- 90%: Consider reducing usage or increasing budget
- 100%: Budget exhausted
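When an alert arrives, a quick run-rate estimate shows how much time is left to react. The sketch below uses simple linear extrapolation from spend so far. `days_until_exhausted` is a hypothetical helper and is unrelated to the Enterprise usage forecast feature.

```python
def days_until_exhausted(limit_usd: float, used_usd: float, days_elapsed: float):
    """Linear run-rate estimate of the days left before the budget is exhausted.

    Returns None when there is no spend yet, since there is no rate to extrapolate.
    """
    if days_elapsed <= 0 or used_usd <= 0:
        return None
    daily_rate = used_usd / days_elapsed
    remaining = max(limit_usd - used_usd, 0.0)
    return remaining / daily_rate


# The 80% alert fires on day 10 of the month for a $5,000 budget.
print(days_until_exhausted(5000.0, 4000.0, 10))  # 2.5
```

If the estimate is shorter than the time left in the period, that is the signal to reduce usage or raise the limit before the 100% threshold is hit.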
Community vs Enterprise
| Feature | Community | Enterprise |
|---|---|---|
| Usage tracking | ✅ | ✅ |
| Budget limits | ✅ | ✅ |
| Budget hierarchy | ✅ | ✅ |
| Alert thresholds | ✅ | ✅ |
| Pre-flight budget check | ✅ | ✅ |
| Usage breakdown | ✅ | ✅ |
| Usage forecast | ❌ | ✅ |
| Usage export | ❌ | ✅ |
| Alert channels (Slack, email, webhook) | ❌ | ✅ |
| Auto-downgrade to cheaper models | ❌ | ✅ |
| Budget rollover | ❌ | ✅ |
| Cost dashboard | ❌ | ✅ |
Next Steps
- Cost Controls API Reference - Complete API documentation
- Audit Logging - Track all AI interactions
- Policy-as-Code - Define governance rules