LangChain + AxonFlow Integration
Prerequisites: Python 3.10+, a running AxonFlow instance (see Getting Started), and `pip install langchain langchain-openai axonflow`
What The Current SDK Surface Gives LangChain Teams
LangChain does not need a separate adapter to benefit from the newer governance surface. The Python SDK already used in this integration now gives you:
- `client.explain_decision(decision_id)` for turning a block, review event, or support ticket into a concrete Decision Explainability record
- audit search filters for `decision_id`, `policy_name`, and `override_id`, which make policy debugging and override cleanup much easier in production
- richer audit and decision correlation around the same governed SDK calls your application already makes
The practical benefit is not "more SDK methods". It is faster incident review, a clearer path from denied requests to an explanation, and cleaner policy debugging for existing LangChain apps.
For teams running serious LangChain apps, those capabilities close common production gaps. When a request is denied because a prompt contains sensitive data, the application can now store the decision_id, route the user to an explanation path, and support internal debugging without re-running the whole interaction blind. When an override is created, support and platform teams can search audit records by decision or override rather than manually stitching together raw logs.
The important limitation is that this does not turn generic LangChain chains into step-gated workflows with checkpoint recovery. If a governed LLM call is blocked in a normal LangChain execution path, the chain stops there. You still get explainability and better audit correlation, but not the checkpoint and resume model described on the LangGraph pages.
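As a concrete sketch of that flow: on a block, keep the decision identifier with the ticket and resolve it to an explainability record on demand. The `decision_id` attribute on the pre-check result and the shape of the returned record are assumptions here; check your SDK version for the exact fields. The client is injected so the handler stays testable.

```python
def explain_blocked_request(axonflow, user_token: str, query: str) -> dict:
    """Run a governed pre-check; on a block, fetch the explainability record.

    `axonflow` is any client exposing get_policy_approved_context() and
    explain_decision(); `decision_id` is an assumed attribute name.
    """
    ctx = axonflow.get_policy_approved_context(user_token=user_token, query=query)
    if ctx.approved:
        return {"approved": True, "context_id": ctx.context_id}
    # Store the decision id with the support ticket, then resolve it on
    # demand instead of re-running the whole interaction blind.
    explanation = axonflow.explain_decision(ctx.decision_id)
    return {
        "approved": False,
        "decision_id": ctx.decision_id,
        "block_reason": ctx.block_reason,
        "explanation": explanation,
    }
```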
What LangChain Governance Means in Practice
LangChain is the most widely used framework for building LLM applications, with hundreds of millions of downloads and a vast ecosystem of integrations. It provides a unified interface across LLM providers, composable chains via LCEL (LangChain Expression Language), retrieval-augmented generation with sensible defaults, and agent patterns for tool-using AI. LangChain makes it straightforward to build a working prototype in under 10 lines of code.
Governance for LangChain means adding the production controls that the framework intentionally does not include: policy enforcement that can block or modify requests before they reach the LLM, PII detection that catches sensitive data in prompts and retrieved documents, SQL injection prevention for agents that interact with databases, per-user cost attribution for billing and budget enforcement, and audit trails that satisfy compliance requirements (HIPAA, GDPR, SOX, EU AI Act). These controls need to work without requiring changes to your existing LangChain chains, agents, or RAG pipelines.
AxonFlow provides two integration modes for LangChain. Gateway mode (recommended) wraps your existing LangChain code with a pre-check and audit call: get_policy_approved_context() evaluates policies before the LLM call, and audit_llm_call() records what happened after. It gives you full control over your LLM provider keys and model selection. Proxy mode routes all LLM calls through AxonFlow with a single proxy_llm_call() request, which handles policy enforcement, LLM routing, PII detection, and audit logging automatically. Proxy mode is simpler and supports custom tenant policies, but it also means AxonFlow owns the governed model call.
For PII detection in LangChain chains, AxonFlow scans both the user query and any context (such as RAG-retrieved documents) against 12+ PII patterns including SSN, credit card numbers, Aadhaar numbers, email addresses, and phone numbers. Detected PII can be blocked, redacted, or logged depending on your policy configuration. For LangChain agents that generate SQL queries, AxonFlow's SQL injection detection scans prompts for 37+ attack patterns before they reach the LLM, preventing prompt injection attacks that could manipulate database-connected agents.
What LangChain Does Well
LangChain is the most popular framework for building LLM applications—with 847 million downloads and growing. Its strengths are real and substantial:
Unified Model Interface: Switch between OpenAI, Anthropic, Google, Bedrock, and Ollama without rewriting your application. This abstraction saves weeks of integration work.
Rapid Prototyping: Build a working agent in under 10 lines of code. LCEL (LangChain Expression Language) makes composing chains intuitive.
Rich Ecosystem: 1,400+ integrations—vector stores, document loaders, retrievers, tools. If you need a connector, it probably exists.
RAG Made Simple: Retrieval-Augmented Generation with sensible defaults. Load documents, create embeddings, query with context—all handled.
LangGraph for State: When you outgrow simple chains, LangGraph provides durable execution, human-in-the-loop patterns, and persistence.
Active Community: Extensive documentation, tutorials, and community support. Issues get addressed quickly.
What LangChain Doesn't Try to Solve
LangChain focuses on making LLM applications easy to build. These concerns are explicitly out of scope:
| Production Requirement | LangChain's Position |
|---|---|
| Policy enforcement at inference time | Not provided—no built-in way to block requests based on content, user, or context |
| PII detection and blocking | Requires external integration (Presidio)—not built in |
| SQL injection prevention | Not addressed—must implement separately |
| Per-user cost allocation | Not provided—no way to attribute costs to users or departments |
| Audit trails for compliance | Requires LangSmith (paid SaaS) or third-party tools |
| Role-based access control | Not addressed—no permission model |
| Request/response logging | Off by default—must configure manually |
This isn't a criticism—it's a design choice. LangChain handles orchestration. Governance is a separate concern.
Where Teams Hit Production Friction
Based on real enterprise deployments, here are the blockers that appear after the prototype works:
1. The Compliance Audit
"Show me every prompt that contained customer identifiers in the last 90 days."
LangChain doesn't log prompts by default. LangSmith does, but if it wasn't enabled from day one, you're reconstructing from application logs. For HIPAA, GDPR, or SOX audits, this is painful.
2. The $30K Invoice
A recursive loop triggers 47,000 API calls over a weekend. The bill arrives. Debugging takes three days because:
- No per-request logging was enabled
- No cost attribution per user or session
- No circuit breaker on spend
LangChain processed every request as intended. Nothing was watching what it processed.
3. The "Why Did It Say That?" Question
A hallucinated figure appears in a financial report. Compliance asks:
- What was the original prompt?
- What context was retrieved?
- What model and temperature were used?
- Who was the requesting user?
This information wasn't captured. LangChain returned a response; what happened between prompt and response was invisible.
4. The Security Review Block
Security review: BLOCKED
- No audit trail for LLM decisions
- PII exposure risk in prompts
- SQL injection not prevented
- Policy enforcement unclear
- Cost controls missing
The prototype worked perfectly. It can't ship.
5. The PII Leak That Wasn't (But Could Have Been)
A support agent was trained on customer service logs. During a security scan, it could be prompted to retrieve customer phone numbers—information that was in the training data. LangChain has no built-in PII detection.
How AxonFlow Plugs In
AxonFlow doesn't replace LangChain. It sits underneath it—providing the governance layer that LangChain intentionally doesn't include:
```
┌─────────────────┐
│    Your App     │
└────────┬────────┘
         │
         v
┌─────────────────┐
│    LangChain    │  <-- Chains, Agents, RAG
└────────┬────────┘
         │
         v
┌─────────────────────────────────┐
│            AxonFlow             │
│  ┌───────────┐  ┌────────────┐  │
│  │  Policy   │  │   Audit    │  │
│  │  Enforce  │  │   Trail    │  │
│  └───────────┘  └────────────┘  │
│  ┌───────────┐  ┌────────────┐  │
│  │   PII     │  │   Cost     │  │
│  │  Detection│  │  Control   │  │
│  └───────────┘  └────────────┘  │
└────────────────┬────────────────┘
                 │
                 v
┌─────────────────┐
│  LLM Provider   │
└─────────────────┘
```
What this gives you:
- Every request logged with user identity and context
- PII detected and blocked before hitting the LLM (SSN, credit cards, etc.)
- SQL injection attempts blocked
- Cost tracked per user, per department, per agent
- Compliance auditors can query 90 days of decisions in seconds
What stays the same:
- Your LangChain code doesn't change
- No new abstractions to learn
- No framework lock-in
Integration Patterns
Pattern 1: Pre-Check + Audit (Gateway Mode) — Recommended
This is the standard pattern for most teams: check every request before the LLM call, audit after:
```python
from langchain_openai import ChatOpenAI
from axonflow import AxonFlow, TokenUsage
import time

def governed_langchain_call(
    user_token: str,
    query: str,
    context: dict = None
) -> str:
    """LangChain call with AxonFlow governance."""
    start_time = time.time()

    with AxonFlow.sync(
        endpoint="http://localhost:8080",
        client_id="langchain-app",
        client_secret="your-secret"
    ) as axonflow:
        # 1. Pre-check: Is this request allowed?
        ctx = axonflow.get_policy_approved_context(
            user_token=user_token,
            query=query,
            context={**(context or {}), "framework": "langchain"}
        )
        if not ctx.approved:
            # Logged: who, what, when, why blocked
            return f"Blocked: {ctx.block_reason}"

        # 2. Your existing LangChain code
        llm = ChatOpenAI(model="gpt-4")
        response = llm.invoke(query)
        latency_ms = int((time.time() - start_time) * 1000)

        # 3. Audit: Record what happened
        axonflow.audit_llm_call(
            context_id=ctx.context_id,
            response_summary=response.content[:200],
            provider="openai",
            model="gpt-4",
            token_usage=TokenUsage(
                prompt_tokens=response.usage_metadata.get("input_tokens", 0),
                completion_tokens=response.usage_metadata.get("output_tokens", 0),
                total_tokens=response.usage_metadata.get("total_tokens", 0)
            ),
            latency_ms=latency_ms
        )
        return response.content
```
Pattern 2: Governed RAG Pipeline — Advanced
Add governance to retrieval-augmented generation:
```python
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
from axonflow import AxonFlow, TokenUsage
import time

class GovernedRAG:
    def __init__(self, vectorstore_path: str):
        self.embeddings = OpenAIEmbeddings()
        self.vectorstore = Chroma(
            persist_directory=vectorstore_path,
            embedding_function=self.embeddings
        )
        self.llm = ChatOpenAI(model="gpt-4")
        self.chain = RetrievalQA.from_chain_type(
            llm=self.llm,
            retriever=self.vectorstore.as_retriever(),
            return_source_documents=True
        )

    def query(self, user_token: str, question: str, context: dict = None) -> dict:
        start_time = time.time()
        context = context or {}

        with AxonFlow.sync(
            endpoint="http://localhost:8080",
            client_id="langchain-rag",
            client_secret="your-secret"
        ) as axonflow:
            # Pre-check with RAG context
            ctx = axonflow.get_policy_approved_context(
                user_token=user_token,
                query=question,
                context={
                    **context,
                    "pipeline": "rag",
                    "data_tier": context.get("data_tier", "standard")
                }
            )
            if not ctx.approved:
                return {"answer": f"Blocked: {ctx.block_reason}", "blocked": True}

            # Execute RAG
            result = self.chain.invoke({"query": question})
            sources = [doc.metadata for doc in result.get("source_documents", [])]
            latency_ms = int((time.time() - start_time) * 1000)

            # Audit with source attribution (replace the placeholder token
            # counts with real usage from your LLM callbacks if available)
            axonflow.audit_llm_call(
                context_id=ctx.context_id,
                response_summary=result["result"][:200],
                provider="openai",
                model="gpt-4",
                token_usage=TokenUsage(prompt_tokens=100, completion_tokens=50, total_tokens=150),
                latency_ms=latency_ms,
                metadata={"source_count": len(sources), "sources": sources[:5]}
            )
            return {"answer": result["result"], "sources": sources, "context_id": ctx.context_id}
```
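A small helper for consuming the dict returned by `GovernedRAG.query()`; this is a sketch of the calling side, not part of the SDK:

```python
def format_rag_result(result: dict) -> str:
    """Render the dict shape returned by GovernedRAG.query for display.

    Blocked results carry {"answer": "Blocked: ...", "blocked": True};
    approved results carry answer, source metadata, and the audit
    context_id for later correlation.
    """
    if result.get("blocked"):
        return result["answer"]
    lines = [result["answer"], ""]
    for src in result.get("sources", [])[:5]:
        lines.append(f"  source: {src}")
    lines.append(f"  audit context: {result.get('context_id')}")
    return "\n".join(lines)
```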
Pattern 3: Proxy Mode — Alternative: For simpler deployments
Route all LLM calls through AxonFlow for automatic governance:
```python
from axonflow import AxonFlow

# All governance happens automatically
with AxonFlow.sync(
    endpoint="http://localhost:8080",
    client_id="langchain-app"
) as axonflow:
    result = axonflow.proxy_llm_call(
        user_token="user-123",
        query="What are the key AI governance regulations?",
        request_type="chat",
        context={"framework": "langchain", "department": "research"}
    )

    if result.blocked:
        print(f"Blocked: {result.block_reason}")
    else:
        print(result.data)  # LLM response
```
Tool-Level Governance (Python SDK v6.0.0+)
GovernedTool wraps any LangChain BaseTool with input/output policy enforcement. AxonFlowChatModel wraps any BaseChatModel with pre-check and audit. Together they provide full governance at both the LLM and tool boundaries.
GovernedTool — Govern Every Tool Call
GovernedTool subclasses BaseTool, so AgentExecutor and any LangChain agent accept it directly. Every tool invocation runs through mcp_check_input (before execution) and mcp_check_output (after execution):
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from axonflow import AxonFlow
from axonflow.adapters import govern_tools

@tool
def search_customers(query: str) -> str:
    """Search the customer database by name."""
    # `db` is your application's database handle
    return db.query("SELECT name, email, order_status FROM customers WHERE name LIKE ?", [f"%{query}%"])

@tool
def send_notification(message: str) -> str:
    """Send a notification to a customer."""
    # `email_service` is your application's mail client
    return email_service.send(message)

with AxonFlow.sync(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    # One line — wraps all tools with input/output governance
    governed = govern_tools([search_customers, send_notification], client)

    llm = ChatOpenAI(model="gpt-4.1-nano")
    agent = create_tool_calling_agent(llm, governed, prompt)  # `prompt` is your agent prompt
    executor = AgentExecutor(agent=agent, tools=governed)

    # Tool calls are now governed:
    # - Input: PII in tool args blocked before execution
    # - Output: PII in tool results redacted before LLM sees them
    result = executor.invoke({"input": "Find John's account"})
```
AxonFlowChatModel — Govern LLM Calls
AxonFlowChatModel wraps any BaseChatModel with pre_check before every LLM call and audit_llm_call after. Drop-in replacement:
```python
from langchain_anthropic import ChatAnthropic
from axonflow import AxonFlow
from axonflow.adapters import AxonFlowChatModel

async with AxonFlow(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    model = AxonFlowChatModel(
        wrapped=ChatAnthropic(model_name="claude-sonnet-4-6"),
        axonflow=client,
    )

    # Use exactly like ChatAnthropic — governance is transparent
    result = await model.ainvoke(
        messages,
        config={"configurable": {"user_token": "user-jwt"}},
    )
```
Combined: Full LLM + Tool Governance
For complete coverage, use both together. AxonFlowChatModel governs LLM calls (async), GovernedTool governs tool calls. Use the async path with ainvoke:
```python
# search_customers, send_notification, and prompt as defined above
async with AxonFlow(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    # Govern the model (LLM calls)
    model = AxonFlowChatModel(
        wrapped=ChatOpenAI(model="gpt-4.1-nano"),
        axonflow=client,
    )

    # Govern the tools (tool calls)
    governed = govern_tools([search_customers, send_notification], client)

    agent = create_tool_calling_agent(model, governed, prompt)
    executor = AgentExecutor(agent=agent, tools=governed)

    # Use ainvoke for async context (AxonFlowChatModel governance is async-only)
    result = await executor.ainvoke({"input": "Find John's account"})
```
This gives you policy enforcement at two boundaries:
- LLM boundary: `pre_check` on every model call, `audit_llm_call` after
- Tool boundary: `mcp_check_input` before tool execution, `mcp_check_output` after (with redaction support)
GovernedTool.invoke() (sync) requires a synchronous AxonFlow client (AxonFlow.sync()). GovernedTool.ainvoke() (async) works with the async client. When combining with AxonFlowChatModel (async-only governance), use the async path throughout.
For details on why these are separate boundaries with different policy capabilities, see Per-Tool Governance.
Other Languages: Go, TypeScript, Java — For polyglot or service-oriented architectures
Go SDK Integration
For Go services orchestrating LangChain or hybrid architectures:
```go
package main

import (
	"context"
	"fmt"

	"github.com/getaxonflow/axonflow-sdk-go/v5"
	openai "github.com/sashabaranov/go-openai"
)

func GovernedLangChainCall(
	ctx context.Context,
	client *axonflow.AxonFlowClient,
	llm *openai.Client, // named llm to avoid shadowing the openai package
	userToken, query string,
	callContext map[string]interface{},
) (string, error) {
	// Pre-check
	callContext["framework"] = "langchain"
	result, err := client.GetPolicyApprovedContext(userToken, query, nil, callContext)
	if err != nil {
		return "", err
	}
	if !result.Approved {
		return "", fmt.Errorf("blocked: %s", result.BlockReason)
	}

	// Make LLM call (the audit step is omitted here; mirror the Python
	// Gateway-mode pattern to record the call afterwards)
	resp, err := llm.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
		Model: openai.GPT4,
		Messages: []openai.ChatCompletionMessage{
			{Role: "user", Content: query},
		},
	})
	if err != nil {
		return "", err
	}
	return resp.Choices[0].Message.Content, nil
}
```
TypeScript SDK Integration
```typescript
import { AxonFlow } from "@axonflow/sdk";
import { ChatOpenAI } from "@langchain/openai";

async function governedLangChainCall(
  userToken: string,
  query: string,
  context?: Record<string, unknown>
): Promise<string> {
  const axonflow = new AxonFlow({
    endpoint: "http://localhost:8080",
    clientId: process.env.AXONFLOW_CLIENT_ID,
    clientSecret: process.env.AXONFLOW_CLIENT_SECRET,
  });
  const startTime = Date.now();

  // Pre-check
  const approval = await axonflow.getPolicyApprovedContext({
    userToken,
    query,
    context: { ...context, framework: "langchain" },
  });
  if (!approval.approved) {
    throw new Error(`Blocked: ${approval.blockReason}`);
  }

  // LangChain call (content may be a string or structured blocks)
  const llm = new ChatOpenAI({ model: "gpt-4" });
  const response = await llm.invoke(query);
  const content = String(response.content);
  const latencyMs = Date.now() - startTime;

  // Audit (replace the placeholder token counts with real usage if available)
  await axonflow.auditLLMCall({
    contextId: approval.contextId,
    responseSummary: content.slice(0, 200),
    provider: "openai",
    model: "gpt-4",
    tokenUsage: { promptTokens: 100, completionTokens: 50, totalTokens: 150 },
    latencyMs,
  });

  return content;
}
```
Java SDK Integration
```java
import com.getaxonflow.sdk.AxonFlow;
import com.getaxonflow.sdk.AxonFlowConfig;
import com.getaxonflow.sdk.types.*;

import java.util.Map;

public class GovernedLangChain {
    private final AxonFlow axonflow;

    public GovernedLangChain() {
        this.axonflow = AxonFlow.create(AxonFlowConfig.builder()
            .endpoint("http://localhost:8080")
            .clientId("langchain-java")
            .clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
            .build());
    }

    public String governedCall(String userToken, String query, Map<String, Object> context)
            throws Exception {
        context.put("framework", "langchain");
        long startTime = System.currentTimeMillis();

        // Pre-check
        PolicyApprovalResult approval = axonflow.getPolicyApprovedContext(
            PolicyApprovalRequest.builder()
                .userToken(userToken)
                .query(query)
                .context(context)
                .build()
        );
        if (!approval.isApproved()) {
            throw new PolicyViolationException(approval.getBlockReason());
        }

        // Your LLM call here (callLLM is your application's LLM client)
        String response = callLLM(query);
        long latencyMs = System.currentTimeMillis() - startTime;

        // Audit
        axonflow.auditLLMCall(AuditOptions.builder()
            .contextId(approval.getContextId())
            .responseSummary(response.substring(0, Math.min(200, response.length())))
            .provider("openai")
            .model("gpt-4")
            .tokenUsage(TokenUsage.of(100, 50))
            .latencyMs(latencyMs)
            .build());

        return response;
    }
}
```
Example Implementations
| Language | SDK | Example |
|---|---|---|
| Python | axonflow | hello-world/python |
| Go | axonflow-sdk-go | hello-world/go |
| TypeScript | @axonflow/sdk | hello-world/typescript |
| Java | axonflow-sdk | hello-world/java |
| HTTP/curl | Raw HTTP | hello-world/http (for PHP, Ruby, Rust, etc.) |
Best Practices
Always use context IDs: The context_id from get_policy_approved_context() must be passed to audit_llm_call() for proper correlation.
Handle blocked requests gracefully: Check ctx.approved before making LLM calls. Return user-friendly messages when blocked.
Always audit, even on errors: Wrap LLM calls in try/except and call audit_llm_call() in both success and error paths.
Use meaningful context: Pass relevant metadata (department, use_case, data_tier) to enable fine-grained policies.
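The "always audit, even on errors" practice can be sketched as a wrapper. The client and LLM callable are injected, the audit mirrors Gateway mode above, and token usage is omitted for brevity; treat the field names as assumptions for your SDK version:

```python
import time

def call_with_guaranteed_audit(axonflow, ctx, llm_invoke, query: str) -> str:
    """Invoke the LLM and audit in both success and error paths.

    `ctx` is the approved pre-check result; `llm_invoke` is any callable
    returning an object with a .content attribute (e.g. llm.invoke).
    """
    start = time.time()
    try:
        response = llm_invoke(query)
        summary = response.content[:200]
        return response.content
    except Exception as exc:
        summary = f"error: {exc}"
        raise
    finally:
        # Runs on both paths, so every call is correlated to its context_id
        axonflow.audit_llm_call(
            context_id=ctx.context_id,
            response_summary=summary,
            provider="openai",
            model="gpt-4",
            latency_ms=int((time.time() - start) * 1000),
        )
```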
Troubleshooting
| Issue | Solution |
|---|---|
| Pre-check returns 401 | Verify client_secret is correct |
| Audit calls failing | Check context_id is from valid pre-check (not expired) |
| High latency | AxonFlow adds under 10ms; if higher, check network to AxonFlow endpoint |
| Policies not applying | Verify context fields match policy conditions; check AxonFlow logs |
