AxonFlow vs LangSmith
LangSmith and AxonFlow solve different problems in the AI production stack. LangSmith helps agent teams debug and improve behavior. AxonFlow decides whether production agent actions are allowed to run, records why, and produces the evidence regulated teams need.
This page is an honest comparison. LangSmith is strong in areas where AxonFlow has no plans to compete, and AxonFlow is strong in areas LangSmith does not cover. Many teams use both.
When to use LangSmith
LangSmith is the better choice when your primary need is observability and iteration:
- Trace exploration — LangSmith's trace UI is mature and feature-rich: hierarchical span trees, a messages view for multi-turn conversations, custom dashboards, automatic trace clustering, and alerting via webhooks or PagerDuty.
- Evaluation loops — Dataset management, multiple evaluator types (human, heuristic, LLM-as-judge, pairwise), online evaluation of production traffic, experiment comparison, and CI/CD integration with threshold-based gates.
- Prompt management — Version-controlled prompt hub with commit tags, environment promotion (staging/production), diff comparison, and SDK integration via client.pull_prompt().
- LangGraph-native development — If your agent stack is built on LangGraph, LangSmith provides native integration with deployment infrastructure, human-in-the-loop via interrupt() primitives, and durable execution with automatic checkpointing.
LangSmith is SOC 2 Type II certified and offers HIPAA BAAs for healthcare use cases. It is a strong default for teams that are iterating on agent quality and want deep visibility into what their agents are doing.
When to use AxonFlow
AxonFlow is the better choice when your primary need is enforcement before execution:
- Policy enforcement across LLM, MCP, and workflow layers — AxonFlow governs LLM calls, MCP tool calls, and workflow steps through a single policy engine. Policies are category-based (PII detection, SQL injection, dangerous patterns, sensitive data) with configurable actions: block, flag, log, or require approval.
- Human-in-the-loop approval gates — Platform-managed HITL with a centralized approval queue, expiration timers, webhook notifications, and idempotency-key deduplication. Approvals are infrastructure, not application code.
- Kill switch — A dedicated production kill switch to halt agent actions at the global, organization, or system level. Available in the Enterprise edition.
- Decision Mode — AxonFlow runs as a standalone policy decision service alongside your existing gateways. Your LLM gateway, agent gateway, or MCP gateway each make one inline call to AxonFlow per request and enforce the verdict (allow, deny, or require approval). This is the PDP/PEP separation pattern.
- Cost controls and circuit breaker — Per-org, per-tenant, per-provider LLM cost tracking with configurable budget limits (warn, block, or downgrade on exceed). Automatic circuit breaker trips after repeated policy violations or elevated error rates, preventing runaway spend or abuse. Per-tenant threshold overrides.
- Self-hosted without vendor lock-in — Docker Compose deployment for Community and Enterprise editions. No Kubernetes requirement for basic deployments. Source-available under BSL 1.1.
- Framework-agnostic — Works with LangChain, Google ADK, LiteLLM, n8n, CrewAI, Semantic Kernel, and 15+ frameworks via Gateway Mode, Decision Mode, and dedicated plugins. Not tied to any single orchestration stack.
- Compliance evidence export — Regulatory audit trail exports for RBI, SEBI, and OJK frameworks with date-range filtering, structured JSON and CSV formats, retention policy management, and breach notification workflows.
AxonFlow is the right choice for teams that need to prove what their agents did, why it was allowed, and who approved it.
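As a rough illustration of the category-based model described in the list above, the sketch below maps detection categories to actions and resolves the strictest matching action into a verdict. Every name in it (`Action`, `DETECTORS`, `POLICY`, `decide`) and every regex is hypothetical: this is not AxonFlow's API or its real detection patterns, only the general shape of a decision that a gateway could enforce.

```python
import re
from enum import IntEnum


class Action(IntEnum):
    # Ordered by strictness so max() picks the strictest matching action.
    LOG = 1
    FLAG = 2
    REQUIRE_APPROVAL = 3
    BLOCK = 4


# Hypothetical toy detectors, one regex per policy category.
DETECTORS = {
    "pii": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-like number
    "sql_injection": re.compile(r"(?i)\bunion\s+select\b|;\s*drop\s+table\b"),
    "dangerous_pattern": re.compile(r"rm\s+-rf\s+/"),
}

# Hypothetical policy: which action each category triggers.
POLICY = {
    "pii": Action.REQUIRE_APPROVAL,
    "sql_injection": Action.BLOCK,
    "dangerous_pattern": Action.FLAG,
}


def decide(text: str) -> dict:
    """Return a verdict plus the categories that produced it."""
    matched = [name for name, rx in DETECTORS.items() if rx.search(text)]
    if not matched:
        return {"verdict": "allow", "matched": [], "action": None}
    strictest = max(POLICY[name] for name in matched)
    # BLOCK denies, REQUIRE_APPROVAL parks the request; FLAG and LOG still
    # allow the request but are recorded in the returned decision.
    verdict = {
        Action.BLOCK: "deny",
        Action.REQUIRE_APPROVAL: "require_approval",
    }.get(strictest, "allow")
    return {"verdict": verdict, "matched": matched, "action": strictest.name.lower()}
```

In this sketch the caller, not `decide`, is responsible for acting on the verdict, which mirrors the PDP/PEP split mentioned under Decision Mode.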
When to use both
LangSmith and AxonFlow are complementary. LangSmith handles observability and iteration; AxonFlow handles enforcement and evidence. The integration point is the trace_id.
How it works:
- Your application generates a W3C traceparent header or receives one from an upstream service.
- AxonFlow propagates the trace_id through every decision response and audit log entry.
- LangSmith records the same trace_id in its trace tree.
- When reviewing a LangSmith trace, you can look up the corresponding AxonFlow decision to see which policies were evaluated and what verdict was returned.
Both tools emit OpenTelemetry spans, so a shared OTel backend (Jaeger, Grafana Tempo, Datadog) can correlate the full execution path: application code → AxonFlow decision → LLM call → LangSmith trace.
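The correlation described here depends on the trace ID carried in the W3C `traceparent` header, which in version 00 has the form `version-traceid-parentid-flags` (2, 32, 16, and 2 lowercase hex characters). A minimal sketch of generating such a header and recovering the `trace_id` from it is shown below. The helper names `new_traceparent` and `trace_id_from` are illustrative and are not part of either product's SDK; in practice an OpenTelemetry SDK would normally handle this.

```python
import re
import secrets

# Version-00 traceparent layout: version-traceid-parentid-flags.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_traceparent(sampled: bool = True) -> str:
    """Build a fresh version-00 traceparent header value."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex chars
    parent_id = secrets.token_hex(8)   # 8 random bytes  -> 16 hex chars
    flags = "01" if sampled else "00"  # bit 0 is the "sampled" flag
    return f"00-{trace_id}-{parent_id}-{flags}"


def trace_id_from(header: str) -> str:
    """Extract the trace_id, rejecting malformed or all-zero values."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, _flags = match.groups()
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError(f"invalid traceparent: {header!r}")
    return trace_id
```

The value returned by `trace_id_from` is the key you would use to look up the same request in both systems.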
┌─────────────────────────────────────────────────────────┐
│                    Your Application                     │
│                                                         │
│  1. Build request                                       │
│  2. Call AxonFlow (policy check) ──→ allow / deny       │
│  3. If allowed, call LLM provider                       │
│  4. LangSmith traces the LLM call                       │
│                                                         │
│  Shared: trace_id links AxonFlow decision               │
│          to LangSmith trace                             │
└─────────────────────────────────────────────────────────┘
              │                             │
              ▼                             ▼
     ┌─────────────────┐         ┌─────────────────────┐
     │    AxonFlow     │         │      LangSmith      │
     │                 │         │                     │
     │ Policy engine   │         │ Trace exploration   │
     │ Audit trail     │         │ Eval harness        │
     │ HITL approvals  │         │ Prompt management   │
     │ Kill switch     │         │ Dashboards          │
     │ Evidence export │         │ LangGraph runtime   │
     └─────────────────┘         └─────────────────────┘
              │                             │
              └──────────────┬──────────────┘
                             ▼
                  ┌─────────────────────┐
                  │    OTel Backend     │
                  │ (Jaeger / Grafana / │
                  │      Datadog)       │
                  │                     │
                  │  Correlated spans   │
                  │   from both tools   │
                  └─────────────────────┘
This architecture means neither tool needs to replicate the other's capabilities. AxonFlow does not need a trace explorer; LangSmith does not need a policy engine.
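The four steps in the application box can be sketched as a single request handler. Everything below is a stand-in: `check_policy` and `call_llm` are hypothetical callables rather than AxonFlow or provider SDK calls, and the LangSmith tracing that would wrap step 3 is only marked with a comment. The point is the control flow, where the verdict is enforced before the LLM call and the same `trace_id` travels with both.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    verdict: str      # "allow", "deny", or "require_approval"
    trace_id: str     # echoed back so the decision can be correlated later
    reason: str = ""


class PolicyDenied(Exception):
    """Raised when the policy check does not permit the action."""


def handle_request(
    prompt: str,
    trace_id: str,
    check_policy: Callable[[str, str], Decision],
    call_llm: Callable[[str, str], str],
) -> Optional[str]:
    # Steps 1-2: the request is built by the caller; ask for a verdict first.
    decision = check_policy(prompt, trace_id)
    if decision.verdict == "deny":
        raise PolicyDenied(decision.reason)
    if decision.verdict == "require_approval":
        return None  # parked; a human approval would resume it elsewhere
    if decision.verdict != "allow":
        raise PolicyDenied(f"unknown verdict: {decision.verdict}")  # fail closed
    # Steps 3-4: only now call the provider. In a real setup this call would
    # be traced by LangSmith under the same trace_id.
    return call_llm(prompt, trace_id)


# Hypothetical stubs so the sketch can be run end to end.
def stub_policy(prompt: str, trace_id: str) -> Decision:
    if "drop table" in prompt.lower():
        return Decision("deny", trace_id, "sql_injection")
    return Decision("allow", trace_id)


def stub_llm(prompt: str, trace_id: str) -> str:
    return f"[{trace_id}] echo: {prompt}"
```

Treating an unrecognized verdict as a denial is a design choice of this sketch, made so that a policy service change cannot silently open the gate.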
Feature comparison
The table below compares capabilities as of May 2026. Claims about LangSmith are based on its public documentation. Where a capability is in private beta or planned, that is noted.
| Capability | LangSmith | AxonFlow |
|---|---|---|
| LLM call governance | Spend limits + PII redaction (LLM Gateway, private beta) | Category-based policy engine with configurable actions (block, flag, log, require approval) |
| MCP / tool call governance | Planned | Shipped — MCP server with per-tool policy gates |
| Workflow step gates | Via LangGraph interrupt() (framework-level, requires code changes) | Platform-managed via Workflow Control Protocol (WCP) |
| HITL approval | Via LangGraph interrupt() (code-level, no centralized queue) | Platform-managed with centralized queue, expiration, webhooks, idempotency |
| Self-hosted deployment | Yes — Enterprise add-on, requires Kubernetes | Yes — Docker Compose, no Kubernetes requirement |
| Kill switch | No dedicated feature | Yes — global, org-level, and system-level (Enterprise) |
| Cost controls / budgets | Spend limits in LLM Gateway (private beta) | Per-org, per-tenant, per-provider budgets with warn/block/downgrade actions |
| Circuit breaker | No | Yes — auto-trips on policy violations or error rate; per-tenant thresholds (Enterprise) |
| Framework integration | LangChain / LangGraph native | LangChain, ADK, LiteLLM, n8n, CrewAI, Semantic Kernel, 15+ frameworks |
| Compliance evidence export | Audit logs in OCSF format (Enterprise). No packaged regulatory reports. | Regulatory export endpoints (RBI, SEBI, OJK) with retention policies and breach notification |
| Trace exploration UX | Strong — hierarchical trees, messages view, dashboards, clustering, SmithDB, alerts | No built-in UI — emits OpenTelemetry spans for Jaeger, Grafana, Datadog |
| Eval harness | Strong — datasets, experiments, 4+ evaluator types, online evals, CI/CD | No |
| Prompt management | Yes — versioned hub, commit tags, playground, environment promotion | No |
| Decision Mode (PDP/PEP) | No | Yes — policy decision service for existing gateway infrastructure |
| Drop-in OpenAI compatibility | Via LLM Gateway base_url swap (private beta) | In progress |
| OTel trace correlation | Yes — bidirectional, OTLP export | Yes — OTLP export with W3C traceparent propagation |
| PII detection | PII redaction via Presidio (LLM Gateway, private beta) | 6 regional PII categories (global, US, EU, India, Singapore, Indonesia) with confidence scoring |
| SQL injection detection | No | Yes — 37 detection patterns |
| Source availability | Proprietary (LangChain framework is MIT) | BSL 1.1 (source-available) |
What AxonFlow does not do
AxonFlow is a policy enforcement and evidence platform. It does not:
- Replace LangSmith for observability. AxonFlow emits structured decision data. It does not provide a trace exploration UI, custom dashboards, or automatic trace clustering.
- Provide an eval harness. AxonFlow does not manage evaluation datasets, run experiments, or score model outputs. Teams that need evals should use LangSmith, Braintrust, or a similar evaluation platform.
- Manage prompts. AxonFlow does not store, version, or deploy prompts. It can reference prompt metadata in policy rules, but prompt lifecycle management is out of scope.
- Author agents. AxonFlow integrates with LangChain, LangGraph, Google ADK, CrewAI, n8n, and other frameworks. It governs agent actions; it does not replace agent orchestration.
Further reading
- Decision Mode architecture — how AxonFlow works as a standalone policy decision service
- Choosing an integration mode — Gateway Mode, Proxy Mode, Decision Mode
- Audit logging — how AxonFlow records decisions
- LiteLLM integration — using AxonFlow with LiteLLM proxy
