
LangChain + AxonFlow Integration

Prerequisites: Python 3.10+, a running AxonFlow instance (see Getting Started), and pip install langchain langchain-openai axonflow


What The Current SDK Surface Gives LangChain Teams

LangChain does not need a separate adapter to benefit from the newer governance surface. The Python SDK already used in this integration now gives you:

  • client.explain_decision(decision_id) for turning a block, review event, or support ticket into a concrete Decision Explainability record
  • audit search filters for decision_id, policy_name, and override_id, which makes policy debugging and override cleanup much easier in production
  • richer audit and decision correlation around the same governed SDK calls your application already makes

The practical benefit is not "more SDK methods". It is faster incident review, a clearer path from denied requests to an explanation, and cleaner policy debugging for existing LangChain apps.

For teams running serious LangChain apps, those capabilities close common production gaps. When a request is denied because a prompt contains sensitive data, the application can now store the decision_id, route the user to an explanation path, and support internal debugging without re-running the whole interaction blind. When an override is created, support and platform teams can search audit records by decision or override rather than manually stitching together raw logs.
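The application-side flow can be sketched as follows. `PolicyResult` is a stand-in for the object a governed call returns; its field names and the `handle_result` helper are illustrative, and only explain_decision and the audit search filters come from the SDK surface described above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PolicyResult:
    # Stand-in for the result of a governed SDK call; field names are illustrative.
    approved: bool
    decision_id: Optional[str] = None
    block_reason: Optional[str] = None

def handle_result(result: PolicyResult, ticket_store: dict) -> str:
    """On a block, persist the decision_id so support can later call
    client.explain_decision(decision_id) instead of replaying the request."""
    if result.approved:
        return "ok"
    # Store the correlation key for later explainability and audit search.
    ticket_store[result.decision_id] = result.block_reason
    return (
        f"Request blocked ({result.block_reason}). "
        f"Reference {result.decision_id} for review."
    )

blocked = PolicyResult(approved=False, decision_id="dec-123", block_reason="pii_detected")
store: dict = {}
message = handle_result(blocked, store)
```

With the decision_id persisted, a support engineer can later resolve it to a Decision Explainability record, or filter audit records by decision_id or override_id, without re-running the interaction.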

The important limitation is that this does not turn generic LangChain chains into step-gated workflows with checkpoint recovery. If a governed LLM call is blocked in a normal LangChain execution path, the chain stops there. You still get explainability and better audit correlation, but not the checkpoint and resume model described on the LangGraph pages.

What LangChain Governance Means in Practice

LangChain is the most widely used framework for building LLM applications, with a vast ecosystem of integrations. It provides a unified interface across LLM providers, composable chains via LCEL (LangChain Expression Language), retrieval-augmented generation with sensible defaults, and agent patterns for tool-using AI. LangChain makes it straightforward to build a working prototype in under 10 lines of code.

Governance for LangChain means adding the production controls that the framework intentionally does not include: policy enforcement that can block or modify requests before they reach the LLM, PII detection that catches sensitive data in prompts and retrieved documents, SQL injection prevention for agents that interact with databases, per-user cost attribution for billing and budget enforcement, and audit trails that satisfy compliance requirements (HIPAA, GDPR, SOX, EU AI Act). These controls need to work without requiring changes to your existing LangChain chains, agents, or RAG pipelines.

AxonFlow provides two integration modes for LangChain. Gateway mode (recommended) wraps your existing LangChain code with a pre-check and audit call: get_policy_approved_context() evaluates policies before the LLM call, and audit_llm_call() records what happened after. It gives you full control over your LLM provider keys and model selection. Proxy mode routes all LLM calls through AxonFlow with a single proxy_llm_call() request, which handles policy enforcement, LLM routing, PII detection, and audit logging automatically. Proxy mode is simpler and supports custom tenant policies, but it also means AxonFlow owns the governed model call.

For PII detection in LangChain chains, AxonFlow scans both the user query and any context (such as RAG-retrieved documents) against 12+ PII patterns including SSN, credit card numbers, Aadhaar numbers, email addresses, and phone numbers. Detected PII can be blocked, redacted, or logged depending on your policy configuration. For LangChain agents that generate SQL queries, AxonFlow's SQL injection detection scans prompts for 37+ attack patterns before they reach the LLM, preventing prompt injection attacks that could manipulate database-connected agents.
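As a rough illustration of what that scanning involves (this is not AxonFlow's detector, which runs server-side and covers many more categories), a toy client-side pre-filter for two of the listed patterns might look like:

```python
import re

# Illustrative subset of PII patterns. AxonFlow's server-side detector covers
# 12+ categories (Aadhaar numbers, phone numbers, credit cards, etc.).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def find_pii(text: str) -> list[str]:
    """Return the PII categories detected in a prompt or retrieved document."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]
```

The real engine also supports blocking, redaction, and policy-driven handling of matches, so a local filter like this is at most a complement, not a replacement.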


What LangChain Does Well

LangChain is the most popular framework for building LLM applications—with 847 million downloads and growing. Its strengths are real and substantial:

Unified Model Interface: Switch between OpenAI, Anthropic, Google, Bedrock, and Ollama without rewriting your application. This abstraction saves weeks of integration work.

Rapid Prototyping: Build a working agent in under 10 lines of code. LCEL (LangChain Expression Language) makes composing chains intuitive.

Rich Ecosystem: 1,400+ integrations—vector stores, document loaders, retrievers, tools. If you need a connector, it probably exists.

RAG Made Simple: Retrieval-Augmented Generation with sensible defaults. Load documents, create embeddings, query with context—all handled.

LangGraph for State: When you outgrow simple chains, LangGraph provides durable execution, human-in-the-loop patterns, and persistence.

Active Community: Extensive documentation, tutorials, and community support. Issues get addressed quickly.


What LangChain Doesn't Try to Solve

LangChain focuses on making LLM applications easy to build. These concerns are explicitly out of scope:

Production Requirement | LangChain's Position
Policy enforcement at inference time | Not provided—no built-in way to block requests based on content, user, or context
PII detection and blocking | Requires external integration (Presidio)—not built in
SQL injection prevention | Not addressed—must implement separately
Per-user cost allocation | Not provided—no way to attribute costs to users or departments
Audit trails for compliance | Requires LangSmith (paid SaaS) or third-party tools
Role-based access control | Not addressed—no permission model
Request/response logging | Off by default—must configure manually

This isn't a criticism—it's a design choice. LangChain handles orchestration. Governance is a separate concern.


Where Teams Hit Production Friction

Based on real enterprise deployments, here are the blockers that appear after the prototype works:

1. The Compliance Audit

"Show me every prompt that contained customer identifiers in the last 90 days."

LangChain doesn't log prompts by default. LangSmith does, but if it wasn't enabled from day one, you're reconstructing from application logs. For HIPAA, GDPR, or SOX audits, this is painful.

2. The $30K Invoice

A recursive loop triggers 47,000 API calls over a weekend. The bill arrives. Debugging takes three days because:

  • No per-request logging was enabled
  • No cost attribution per user or session
  • No circuit breaker on spend

LangChain processed every request as intended. Nothing was watching what it processed.

3. The "Why Did It Say That?" Question

A hallucinated figure appears in a financial report. Compliance asks:

  • What was the original prompt?
  • What context was retrieved?
  • What model and temperature were used?
  • Who was the requesting user?

This information wasn't captured. LangChain returned a response; what happened between prompt and response was invisible.

4. The Security Review Block

Security review: BLOCKED
- No audit trail for LLM decisions
- PII exposure risk in prompts
- SQL injection not prevented
- Policy enforcement unclear
- Cost controls missing

The prototype worked perfectly. It can't ship.

5. The PII Leak That Wasn't (But Could Have Been)

A support agent was trained on customer service logs. During a security scan, it could be prompted to retrieve customer phone numbers—information that was in the training data. LangChain has no built-in PII detection.


How AxonFlow Plugs In

AxonFlow doesn't replace LangChain. It sits underneath it—providing the governance layer that LangChain intentionally doesn't include:

┌─────────────────┐
│    Your App     │
└────────┬────────┘
         │
         v
┌─────────────────┐
│    LangChain    │   <-- Chains, Agents, RAG
└────────┬────────┘
         │
         v
┌─────────────────────────────────┐
│            AxonFlow             │
│  ┌───────────┐   ┌───────────┐  │
│  │  Policy   │   │   Audit   │  │
│  │  Enforce  │   │   Trail   │  │
│  └───────────┘   └───────────┘  │
│  ┌───────────┐   ┌───────────┐  │
│  │   PII     │   │   Cost    │  │
│  │ Detection │   │  Control  │  │
│  └───────────┘   └───────────┘  │
└────────────────┬────────────────┘
                 │
                 v
┌─────────────────┐
│  LLM Provider   │
└─────────────────┘

What this gives you:

  • Every request logged with user identity and context
  • PII detected and blocked before hitting the LLM (SSN, credit cards, etc.)
  • SQL injection attempts blocked
  • Cost tracked per user, per department, per agent
  • Compliance auditors can query 90 days of decisions in seconds

What stays the same:

  • Your LangChain code doesn't change
  • No new abstractions to learn
  • No framework lock-in

Integration Patterns

Pattern 1: Gateway Mode (recommended default for most teams)

This is the standard pattern—check every request before the LLM, audit after:

from langchain_openai import ChatOpenAI
from axonflow import AxonFlow, TokenUsage
import time

def governed_langchain_call(
    user_token: str,
    query: str,
    context: dict = None
) -> str:
    """LangChain call with AxonFlow governance."""
    start_time = time.time()

    with AxonFlow.sync(
        endpoint="http://localhost:8080",
        client_id="langchain-app",
        client_secret="your-secret"
    ) as axonflow:
        # 1. Pre-check: Is this request allowed?
        ctx = axonflow.get_policy_approved_context(
            user_token=user_token,
            query=query,
            context={**(context or {}), "framework": "langchain"}
        )

        if not ctx.approved:
            # Logged: who, what, when, why blocked
            return f"Blocked: {ctx.block_reason}"

        # 2. Your existing LangChain code
        llm = ChatOpenAI(model="gpt-4")
        response = llm.invoke(query)
        latency_ms = int((time.time() - start_time) * 1000)

        # 3. Audit: Record what happened
        usage = response.usage_metadata or {}
        axonflow.audit_llm_call(
            context_id=ctx.context_id,
            response_summary=response.content[:200],
            provider="openai",
            model="gpt-4",
            token_usage=TokenUsage(
                prompt_tokens=usage.get("input_tokens", 0),
                completion_tokens=usage.get("output_tokens", 0),
                total_tokens=usage.get("total_tokens", 0)
            ),
            latency_ms=latency_ms
        )

        return response.content

Pattern 2: Governed RAG Pipeline (advanced: for RAG systems)

Add governance to retrieval-augmented generation:

from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA
from axonflow import AxonFlow, TokenUsage
import time

class GovernedRAG:
    def __init__(self, vectorstore_path: str):
        self.embeddings = OpenAIEmbeddings()
        self.vectorstore = Chroma(
            persist_directory=vectorstore_path,
            embedding_function=self.embeddings
        )
        self.llm = ChatOpenAI(model="gpt-4")
        self.chain = RetrievalQA.from_chain_type(
            llm=self.llm,
            retriever=self.vectorstore.as_retriever(),
            return_source_documents=True
        )

    def query(self, user_token: str, question: str, context: dict = None) -> dict:
        start_time = time.time()

        with AxonFlow.sync(
            endpoint="http://localhost:8080",
            client_id="langchain-rag",
            client_secret="your-secret"
        ) as axonflow:
            # Pre-check with RAG context
            ctx = axonflow.get_policy_approved_context(
                user_token=user_token,
                query=question,
                context={
                    **(context or {}),
                    "pipeline": "rag",
                    "data_tier": (context or {}).get("data_tier", "standard")
                }
            )

            if not ctx.approved:
                return {"answer": f"Blocked: {ctx.block_reason}", "blocked": True}

            # Execute RAG
            result = self.chain.invoke({"query": question})
            sources = [doc.metadata for doc in result.get("source_documents", [])]
            latency_ms = int((time.time() - start_time) * 1000)

            # Audit with source attribution (token counts are placeholders;
            # read real usage from your LLM callback or response metadata)
            axonflow.audit_llm_call(
                context_id=ctx.context_id,
                response_summary=result["result"][:200],
                provider="openai",
                model="gpt-4",
                token_usage=TokenUsage(prompt_tokens=100, completion_tokens=50, total_tokens=150),
                latency_ms=latency_ms,
                metadata={"source_count": len(sources), "sources": sources[:5]}
            )

            return {"answer": result["result"], "sources": sources, "context_id": ctx.context_id}

Pattern 3: Proxy Mode (alternative: for simpler deployments)

Route all LLM calls through AxonFlow for automatic governance:

from axonflow import AxonFlow

# All governance happens automatically
with AxonFlow.sync(
    endpoint="http://localhost:8080",
    client_id="langchain-app",
    client_secret="your-secret"
) as axonflow:
    result = axonflow.proxy_llm_call(
        user_token="user-123",
        query="What are the key AI governance regulations?",
        request_type="chat",
        context={"framework": "langchain", "department": "research"}
    )

    if result.blocked:
        print(f"Blocked: {result.block_reason}")
    else:
        print(result.data)  # LLM response

Tool-Level Governance (Python SDK v6.0.0+)


GovernedTool wraps any LangChain BaseTool with input/output policy enforcement. AxonFlowChatModel wraps any BaseChatModel with pre-check and audit. Together they provide full governance at both the LLM and tool boundaries.

GovernedTool — Govern Every Tool Call

GovernedTool subclasses BaseTool, so AgentExecutor and any LangChain agent accept it directly. Every tool invocation runs through mcp_check_input (before execution) and mcp_check_output (after execution):

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from axonflow import AxonFlow
from axonflow.adapters import govern_tools

# `db`, `email_service`, and `prompt` are assumed to be defined in your application.

@tool
def search_customers(query: str) -> str:
    """Search the customer database by name."""
    return db.query("SELECT name, email, order_status FROM customers WHERE name LIKE ?", [f"%{query}%"])

@tool
def send_notification(message: str) -> str:
    """Send a notification to a customer."""
    return email_service.send(message)

with AxonFlow.sync(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    # One line — wraps all tools with input/output governance
    governed = govern_tools([search_customers, send_notification], client)

    llm = ChatOpenAI(model="gpt-4.1-nano")
    agent = create_tool_calling_agent(llm, governed, prompt)
    executor = AgentExecutor(agent=agent, tools=governed)

    # Tool calls are now governed:
    # - Input: PII in tool args blocked before execution
    # - Output: PII in tool results redacted before the LLM sees them
    result = executor.invoke({"input": "Find John's account"})

AxonFlowChatModel — Govern LLM Calls

AxonFlowChatModel wraps any BaseChatModel with pre_check before every LLM call and audit_llm_call after. Drop-in replacement:

from langchain_anthropic import ChatAnthropic
from axonflow import AxonFlow
from axonflow.adapters import AxonFlowChatModel

async with AxonFlow(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    model = AxonFlowChatModel(
        wrapped=ChatAnthropic(model_name="claude-sonnet-4-6"),
        axonflow=client,
    )

    # Use exactly like ChatAnthropic — governance is transparent
    result = await model.ainvoke(
        messages,
        config={"configurable": {"user_token": "user-jwt"}},
    )

Combined: Full LLM + Tool Governance

For complete coverage, use both together. AxonFlowChatModel governs LLM calls (async), GovernedTool governs tool calls. Use the async path with ainvoke:

async with AxonFlow(
    endpoint="http://localhost:8080",
    client_id="your-client-id",
    client_secret="your-secret",
) as client:
    # Govern the model (LLM calls)
    model = AxonFlowChatModel(
        wrapped=ChatOpenAI(model="gpt-4.1-nano"),
        axonflow=client,
    )

    # Govern the tools (tool calls)
    governed = govern_tools([search_customers, send_notification], client)

    agent = create_tool_calling_agent(model, governed, prompt)
    executor = AgentExecutor(agent=agent, tools=governed)

    # Use ainvoke for async context (AxonFlowChatModel governance is async-only)
    result = await executor.ainvoke({"input": "Find John's account"})

This gives you policy enforcement at two boundaries:

  • LLM boundary: pre_check on every model call, audit_llm_call after
  • Tool boundary: mcp_check_input before tool execution, mcp_check_output after (with redaction support)

Sync vs Async

GovernedTool.invoke() (sync) requires a synchronous AxonFlow client (AxonFlow.sync()). GovernedTool.ainvoke() (async) works with the async client. When combining with AxonFlowChatModel (async-only governance), use the async path throughout.

For details on why these are separate boundaries with different policy capabilities, see Per-Tool Governance.


Other Languages: Go, TypeScript, Java (for polyglot or service-oriented architectures)

Go SDK Integration

For Go services orchestrating LangChain or hybrid architectures:

package main

import (
	"context"
	"fmt"

	"github.com/getaxonflow/axonflow-sdk-go/v5"
	"github.com/sashabaranov/go-openai"
)

// Note: the OpenAI client parameter is named oai so it does not shadow
// the openai package used below.
func GovernedLangChainCall(
	ctx context.Context,
	client *axonflow.AxonFlowClient,
	oai *openai.Client,
	userToken, query string,
	callContext map[string]interface{},
) (string, error) {
	// Pre-check
	callContext["framework"] = "langchain"
	result, err := client.GetPolicyApprovedContext(userToken, query, nil, callContext)
	if err != nil {
		return "", err
	}

	if !result.Approved {
		return "", fmt.Errorf("blocked: %s", result.BlockReason)
	}

	// Make LLM call
	resp, err := oai.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
		Model: openai.GPT4,
		Messages: []openai.ChatCompletionMessage{
			{Role: "user", Content: query},
		},
	})
	if err != nil {
		return "", err
	}

	return resp.Choices[0].Message.Content, nil
}

TypeScript SDK Integration

import { AxonFlow } from "@axonflow/sdk";
import { ChatOpenAI } from "@langchain/openai";

async function governedLangChainCall(
  userToken: string,
  query: string,
  context?: Record<string, unknown>
): Promise<string> {
  const axonflow = new AxonFlow({
    endpoint: "http://localhost:8080",
    clientId: process.env.AXONFLOW_CLIENT_ID,
    clientSecret: process.env.AXONFLOW_CLIENT_SECRET,
  });

  const startTime = Date.now();

  // Pre-check
  const approval = await axonflow.getPolicyApprovedContext({
    userToken,
    query,
    context: { ...context, framework: "langchain" },
  });

  if (!approval.approved) {
    throw new Error(`Blocked: ${approval.blockReason}`);
  }

  // LangChain call
  const llm = new ChatOpenAI({ model: "gpt-4" });
  const response = await llm.invoke(query);
  const text = response.content as string;
  const latencyMs = Date.now() - startTime;

  // Audit (token counts are placeholders; read real usage from response metadata)
  await axonflow.auditLLMCall({
    contextId: approval.contextId,
    responseSummary: text.slice(0, 200),
    provider: "openai",
    model: "gpt-4",
    tokenUsage: { promptTokens: 100, completionTokens: 50, totalTokens: 150 },
    latencyMs,
  });

  return text;
}

Java SDK Integration

import com.getaxonflow.sdk.AxonFlow;
import com.getaxonflow.sdk.AxonFlowConfig;
import com.getaxonflow.sdk.types.*;

import java.util.Map;

public class GovernedLangChain {
    private final AxonFlow axonflow;

    public GovernedLangChain() {
        this.axonflow = AxonFlow.create(AxonFlowConfig.builder()
            .endpoint("http://localhost:8080")
            .clientId("langchain-java")
            .clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
            .build());
    }

    public String governedCall(String userToken, String query, Map<String, Object> context)
            throws Exception {

        context.put("framework", "langchain");
        long startTime = System.currentTimeMillis();

        // Pre-check
        PolicyApprovalResult approval = axonflow.getPolicyApprovedContext(
            PolicyApprovalRequest.builder()
                .userToken(userToken)
                .query(query)
                .context(context)
                .build()
        );

        if (!approval.isApproved()) {
            throw new PolicyViolationException(approval.getBlockReason());
        }

        // Your LLM call here
        String response = callLLM(query);
        long latencyMs = System.currentTimeMillis() - startTime;

        // Audit (token counts are placeholders)
        axonflow.auditLLMCall(AuditOptions.builder()
            .contextId(approval.getContextId())
            .responseSummary(response.substring(0, Math.min(200, response.length())))
            .provider("openai")
            .model("gpt-4")
            .tokenUsage(TokenUsage.of(100, 50))
            .latencyMs(latencyMs)
            .build());

        return response;
    }
}

Example Implementations

Language | SDK | Example
Python | axonflow | hello-world/python
Go | axonflow-sdk-go | hello-world/go
TypeScript | @axonflow/sdk | hello-world/typescript
Java | axonflow-sdk | hello-world/java
HTTP/curl | Raw HTTP | hello-world/http (for PHP, Ruby, Rust, etc.)

Best Practices

Always use context IDs: The context_id from get_policy_approved_context() must be passed to audit_llm_call() for proper correlation.

Handle blocked requests gracefully: Check ctx.approved before making LLM calls. Return user-friendly messages when blocked.

Always audit, even on errors: Wrap LLM calls in try/except and call audit_llm_call() in both success and error paths.

Use meaningful context: Pass relevant metadata (department, use_case, data_tier) to enable fine-grained policies.
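The "always audit, even on errors" rule can be sketched with a stand-in client. StubAxonFlow below just records calls, and the error keyword argument is illustrative rather than a confirmed SDK parameter; real code would use the AxonFlow client from the patterns above:

```python
class StubAxonFlow:
    """Stand-in that records audit calls; replace with the real AxonFlow client."""
    def __init__(self):
        self.audits = []

    def audit_llm_call(self, context_id, response_summary, error=None, **kwargs):
        self.audits.append({"context_id": context_id,
                            "summary": response_summary,
                            "error": error})

def call_with_audit(axonflow, context_id: str, llm_call) -> str:
    """Audit in both the success and the error path, then re-raise failures."""
    try:
        response = llm_call()
    except Exception as exc:
        # Error path: still record the attempt before propagating the failure.
        axonflow.audit_llm_call(context_id=context_id,
                                response_summary="", error=str(exc))
        raise
    axonflow.audit_llm_call(context_id=context_id,
                            response_summary=response[:200])
    return response

client = StubAxonFlow()
ok = call_with_audit(client, "ctx-1", lambda: "hello world")

def failing_call():
    raise RuntimeError("provider timeout")

try:
    call_with_audit(client, "ctx-2", failing_call)
except RuntimeError:
    pass  # the failure is re-raised after being audited
```

Both paths leave an audit record, so a failed provider call is still visible during incident review.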

Troubleshooting
Issue | Solution
Pre-check returns 401 | Verify client_secret is correct
Audit calls failing | Check that context_id comes from a valid, unexpired pre-check
High latency | AxonFlow adds under 10ms; if higher, check the network path to the AxonFlow endpoint
Policies not applying | Verify context fields match policy conditions; check AxonFlow logs