CrewAI + AxonFlow Integration
Prerequisites: Python 3.10+, AxonFlow running (Getting Started), pip install crewai axonflow
What The Current Governance Surface Gives CrewAI Teams
CrewAI does not need a framework-specific adapter change for these capabilities to matter. The Python SDK and workflow-control APIs already used around crew execution now give you:
- explain_decision() for understanding why a crew run, delegated task, or gated step was denied or held for review
- audit search by decision_id, policy_name, and override_id, which is useful when reconstructing why a crew was blocked or temporarily unblocked
- richer audit and decision correlation around the governed SDK calls the crew already uses
That translates into concrete operational value for multi-agent teams: better support workflows after a deny and cleaner incident reconstruction across delegated work.
For CrewAI specifically, this helps with the messy parts of delegation-heavy systems. When a crew stalls because one delegated task is denied, staff engineers need to understand which decision blocked the run, what policy or override state was involved, and whether the crew can safely resume without replaying everything that came before. The newer audit filters and explainability path make that a first-class workflow instead of a manual forensic exercise.
The important limitation is that a normal CrewAI run does not get LangGraph-style checkpoints or step-gate recovery from these additions alone. If a governed call is blocked, the crew run halts and the surrounding application has to decide what happens next. Explainability and audit search help a lot with that support loop, but they do not create automatic resume semantics for CrewAI.
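Because there is no automatic resume, the surrounding application owns the deny path. A minimal sketch of that support loop, with the AxonFlow client stubbed out so the shape is visible (the real explain_decision() response shape may differ; StubClient and its fields are illustrative):

```python
class StubClient:
    """Stand-in for the AxonFlow SDK client; real response shapes may differ."""

    def explain_decision(self, decision_id: str) -> dict:
        # Illustrative payload only.
        return {
            "decision_id": decision_id,
            "policy_name": "pii-outbound",
            "outcome": "deny",
            "reason": "email address detected in delegated task input",
        }


def handle_crew_denied(client, decision_id: str) -> str:
    """Turn a denied crew run into an actionable support message."""
    explanation = client.explain_decision(decision_id)
    return (
        f"Crew run blocked by policy '{explanation['policy_name']}': "
        f"{explanation['reason']} (decision {explanation['decision_id']})"
    )


message = handle_crew_denied(StubClient(), "dec-123")
```

The point is that the decision_id from the blocked run is enough to reconstruct the why; whether the crew is re-run, partially re-run, or escalated to a human stays an application decision.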
Why CrewAI Needs a Governance Layer
CrewAI is a Python framework for building teams of AI agents that collaborate autonomously on complex tasks. Each agent in a crew has a defined role (Researcher, Writer, Analyst), a backstory, and a set of tools. Agents delegate subtasks to each other, share context, and work through sequential, hierarchical, or consensual processes to produce a final result. This makes CrewAI one of the most popular choices for multi-agent orchestration, with thousands of production deployments.
The governance challenge with CrewAI is fundamentally different from single-agent frameworks. In a CrewAI crew, multiple agents make independent LLM calls, and each agent can delegate work to any other agent in the group. A research agent might pass customer data to a writing agent, which then passes a draft to an editing agent. Every one of these handoffs is an LLM call that could contain PII, trigger cost overruns, or violate compliance policies. Without governance, you have no visibility into what data flows between agents, no control over how many LLM calls a delegation chain produces, and no audit trail showing which agent made which decision.
AxonFlow integrates with CrewAI through gateway mode. The recommended pattern wraps the entire crew execution with a pre-check and audit call, giving you policy enforcement before the crew starts and a complete audit record when it finishes. For teams that need finer control, a per-task governance pattern applies policy checks before each individual task executes, so you can enforce different policies for different agent roles within the same crew.
Both patterns use the same three-step flow: call get_policy_approved_context() to check policies, execute the CrewAI logic, then call audit_llm_call() to record what happened. Your existing CrewAI agent definitions, task configurations, and process types remain unchanged.
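Stripped of CrewAI specifics, that three-step flow reduces to a small wrapper. A sketch with a stubbed client so the shape is visible on its own (the real SDK returns response objects, not dicts):

```python
def run_with_governance(client, user_token: str, query: str, execute):
    """Pre-check -> execute -> audit. `execute` is any callable (e.g. crew.kickoff)."""
    ctx = client.get_policy_approved_context(user_token=user_token, query=query)
    if not ctx["approved"]:
        raise PermissionError(ctx["block_reason"])
    result = execute()
    client.audit_llm_call(context_id=ctx["context_id"], response_summary=str(result)[:200])
    return result


class StubClient:
    """Stand-in client; the real SDK returns objects, not dicts."""

    def __init__(self):
        self.audited = []

    def get_policy_approved_context(self, user_token, query):
        return {"approved": True, "context_id": "ctx-1", "block_reason": None}

    def audit_llm_call(self, context_id, response_summary):
        self.audited.append((context_id, response_summary))


client = StubClient()
result = run_with_governance(client, "user-123", "summarize topic", lambda: "done")
```

Pattern 1 below is this same flow with the real AxonFlow client and a full CrewAI crew plugged into the `execute` position.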
What CrewAI Does Well
CrewAI is a framework for orchestrating autonomous AI agents that collaborate on complex tasks. Its strengths are compelling:
Role-Based Agents: Define agents with specific roles (Researcher, Writer, Analyst). Each agent has a backstory, goals, and expertise.
Task Delegation: Agents delegate subtasks to each other. Complex work is divided naturally.
Process Types: Sequential, hierarchical, and consensual processes. Different collaboration patterns for different needs.
Tool Integration: Each agent can use different tools. Researchers search, writers format, analysts compute.
Memory and Context: Agents remember context across interactions. Long-running collaborations work naturally.
Human-in-the-Loop: Agents can ask for human input when needed. Supervised autonomy is built in.
What CrewAI Doesn't Try to Solve
CrewAI focuses on multi-agent collaboration. These concerns are explicitly out of scope:
| Production Requirement | CrewAI's Position |
|---|---|
| Policy enforcement before agent actions | Not provided—agents act based on their roles, not policies |
| PII detection in agent communications | Not addressed—agents share data freely |
| SQL injection prevention | Not provided—must implement at tool level |
| Per-agent or per-crew cost attribution | Not tracked—requires external monitoring |
| Audit trails for compliance | Not built in—conversations aren't logged by default |
| Cross-agent access control | Not addressed—agents can delegate to any other agent |
| Token budget enforcement | Not provided—crews can consume unlimited tokens |
This isn't a criticism—it's a design choice. CrewAI handles collaboration. Governance is a separate concern.
Where Teams Hit Production Friction
Based on real enterprise deployments, here are the blockers that appear after the prototype works:
1. The Delegation Loop
A researcher agent delegates to an analyst. The analyst delegates back with a question. The researcher re-delegates. This continues. By Monday, 15,000 API calls have accumulated.
CrewAI processed every delegation as intended. Nothing was watching the cost of collaboration.
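AxonFlow's cost controls address this at the gateway, but the failure mode itself is easy to reproduce and guard against in application code. A hedged sketch of a per-crew call budget (all names here are illustrative, not part of either SDK):

```python
class CallBudget:
    """Cap the number of LLM calls a single crew run may make."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.calls = 0

    def spend(self) -> None:
        self.calls += 1
        if self.calls > self.max_calls:
            raise RuntimeError(
                f"Call budget exhausted: {self.calls} calls > limit {self.max_calls}"
            )


budget = CallBudget(max_calls=100)
for _ in range(100):
    budget.spend()  # within budget: fine
try:
    budget.spend()  # the 101st call trips the guard
    tripped = False
except RuntimeError:
    tripped = True
```

A delegation loop then fails fast with an explicit error instead of accumulating silent spend over a weekend.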
2. The Invisible Handoff
A customer service crew handles a complaint. The complaint gets handed off between three agents. The customer asks:
- Who handled my issue?
- What was discussed at each step?
- Who made the final decision?
CrewAI orchestrated the handoffs. Without custom logging, the collaboration history is gone.
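What "collaboration history" needs to capture is concrete: per handoff, who acted, for whom, and when. A minimal sketch of such a record (the field set is illustrative; AxonFlow's actual audit schema may differ):

```python
from dataclasses import dataclass, field
import time


@dataclass
class HandoffRecord:
    """One delegation step in a crew run (illustrative field set)."""
    from_agent: str
    to_agent: str
    summary: str
    timestamp: float = field(default_factory=time.time)


history: list[HandoffRecord] = []
history.append(HandoffRecord("Intake Agent", "Billing Agent", "complaint triaged"))
history.append(HandoffRecord("Billing Agent", "Resolution Agent", "refund proposed"))

# "Who handled my issue?" becomes a query, not a reconstruction:
handlers = [r.to_agent for r in history]
```

With a governance layer, records like these are written as a side effect of the governed calls rather than maintained by hand.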
3. The Cross-Agent Data Leak
An HR agent collects employee information. It delegates a summary to a reporting agent. The reporting agent, designed for external reports, now has internal HR data in its context.
CrewAI has no mechanism to filter data between agents based on sensitivity.
4. The Security Review Block
Security review: BLOCKED
- No audit trail for inter-agent delegation
- PII can flow between agents without filtering
- No policy enforcement per agent role
- Cost controls missing
- No role-based access for agent actions
The multi-agent crew worked perfectly. It can't ship.
5. The Autonomous Tool Abuse
An agent with database access decides to query extensively for a task. 10,000 database queries later, the task is complete—and the database is throttled.
CrewAI gave the agent autonomy. Nothing governed how much autonomy was appropriate.
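The same lesson applies to tool access: autonomy needs a quota. A sketch of a per-task query quota around a database tool (run_query and the limit are illustrative stand-ins, not part of either SDK):

```python
def with_query_quota(fn, max_queries: int):
    """Wrap a tool function so a single task cannot call it more than max_queries times."""
    state = {"count": 0}

    def wrapped(*args, **kwargs):
        state["count"] += 1
        if state["count"] > max_queries:
            raise RuntimeError(f"Query quota ({max_queries}) exceeded for this task")
        return fn(*args, **kwargs)

    return wrapped


def run_query(sql: str) -> str:  # stand-in for a real database tool
    return f"rows for: {sql}"


limited = with_query_quota(run_query, max_queries=3)
results = [limited("SELECT 1") for _ in range(3)]
try:
    limited("SELECT 2")
    quota_hit = False
except RuntimeError:
    quota_hit = True
```

The agent keeps its autonomy inside the quota; the 10,000-query incident becomes an early, attributable error instead of a throttled database.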
How AxonFlow Plugs In
AxonFlow doesn't replace CrewAI. It sits underneath it—providing the governance layer that CrewAI intentionally doesn't include:
┌─────────────────┐
│    Your App     │
└────────┬────────┘
         │
         v
┌─────────────────┐
│     CrewAI      │  <-- Agents, Tasks, Delegation
└────────┬────────┘
         │
         v
┌─────────────────────────────────┐
│            AxonFlow             │
│  ┌───────────┐  ┌────────────┐  │
│  │  Policy   │  │   Audit    │  │
│  │  Enforce  │  │   Trail    │  │
│  └───────────┘  └────────────┘  │
│  ┌───────────┐  ┌────────────┐  │
│  │    PII    │  │    Cost    │  │
│  │ Detection │  │  Control   │  │
│  └───────────┘  └────────────┘  │
└────────────────┬────────────────┘
                 │
                 v
┌─────────────────┐
│  LLM Provider   │
└─────────────────┘
What this gives you:
- Every agent action logged with role and delegation context
- PII detected and blocked before crossing agent boundaries
- SQL injection attempts blocked in agent tools
- Cost tracked per agent, per crew, per user
- Compliance auditors can query the full collaboration history
What stays the same:
- Your CrewAI code doesn't change
- Agent definitions work as before
- No new abstractions to learn
Integration Patterns
Pattern 1: Governed Crew Runner (Python) — Recommended
Recommended default for most teams. Wrap CrewAI crews with AxonFlow governance:
import os
import time
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI
from axonflow import AxonFlow, TokenUsage
class GovernedCrewRunner:
"""Run CrewAI crews with AxonFlow governance."""
def __init__(
self,
        axonflow_url: str | None = None,
        client_id: str | None = None,
client_secret: str = None,
model: str = "gpt-4",
):
self.axonflow_endpoint = axonflow_url or os.getenv("AXONFLOW_ENDPOINT", "http://localhost:8080")
self.client_id = client_id or os.getenv("AXONFLOW_CLIENT_ID", "crewai-app")
self.client_secret = client_secret or os.getenv("AXONFLOW_CLIENT_SECRET")
self.model = model
self.llm = ChatOpenAI(
model=model,
temperature=0.7,
openai_api_key=os.getenv("OPENAI_API_KEY"),
)
def create_agent(
self,
role: str,
goal: str,
backstory: str,
        tools: list | None = None,
) -> Agent:
"""Create a CrewAI agent."""
return Agent(
role=role,
goal=goal,
backstory=backstory,
verbose=True,
llm=self.llm,
tools=tools or [],
)
def run_governed_crew(
self,
user_token: str,
crew: Crew,
inputs: dict,
        context: dict | None = None,
) -> str:
"""Execute a CrewAI crew with AxonFlow governance."""
start_time = time.time()
query = " ".join(f"{k}: {v}" for k, v in inputs.items())
with AxonFlow.sync(
endpoint=self.axonflow_endpoint,
client_id=self.client_id,
client_secret=self.client_secret,
) as axonflow:
# 1. Pre-check
ctx = axonflow.get_policy_approved_context(
user_token=user_token,
query=query,
context={
**(context or {}),
"framework": "crewai",
"agent_count": len(crew.agents),
"agent_roles": [a.role for a in crew.agents],
},
)
if not ctx.approved:
raise PermissionError(f"Crew blocked: {ctx.block_reason}")
try:
# 2. Execute crew
result = crew.kickoff(inputs=inputs)
latency_ms = int((time.time() - start_time) * 1000)
# 3. Audit
axonflow.audit_llm_call(
context_id=ctx.context_id,
response_summary=str(result)[:200],
provider="openai",
model=self.model,
                    token_usage=TokenUsage(prompt_tokens=500, completion_tokens=200, total_tokens=700),  # placeholder counts; substitute real usage if you track it
latency_ms=latency_ms,
metadata={"crew_agents": [a.role for a in crew.agents]},
)
return str(result)
except Exception as e:
axonflow.audit_llm_call(
context_id=ctx.context_id,
response_summary=f"Crew error: {str(e)}",
provider="openai",
model=self.model,
token_usage=TokenUsage(prompt_tokens=0, completion_tokens=0, total_tokens=0),
latency_ms=int((time.time() - start_time) * 1000),
metadata={"error": str(e)},
)
raise
# Usage
runner = GovernedCrewRunner()
# Create agents
researcher = runner.create_agent(
role="Research Analyst",
goal="Find accurate information",
backstory="Expert researcher with analytical skills",
)
writer = runner.create_agent(
role="Content Writer",
goal="Create clear, engaging content",
backstory="Technical writer specializing in documentation",
)
# Create tasks
research_task = Task(
description="Research the latest developments in {topic}",
expected_output="Comprehensive research summary",
agent=researcher,
)
writing_task = Task(
description="Write a blog post based on the research",
expected_output="Well-structured blog post",
agent=writer,
)
# Create and run governed crew
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, writing_task],
process=Process.sequential,
)
result = runner.run_governed_crew(
user_token="user-123",
crew=crew,
inputs={"topic": "AI governance in healthcare"},
context={"department": "marketing"},
)
Pattern 2: Go Service for Crew Governance — For service-oriented architectures
For Go services coordinating CrewAI:
package main
import (
	"context"
	"encoding/json"
	"fmt"

	"github.com/getaxonflow/axonflow-sdk-go/v5"
)
type CrewGovernanceService struct {
client *axonflow.AxonFlowClient
}
func NewCrewGovernanceService(agentURL, clientSecret string) *CrewGovernanceService {
return &CrewGovernanceService{
client: axonflow.NewClient(axonflow.AxonFlowConfig{
Endpoint: agentURL,
ClientID: "crewai-service",
ClientSecret: clientSecret,
}),
}
}
type CrewRequest struct {
UserToken string `json:"user_token"`
CrewID string `json:"crew_id"`
Inputs map[string]interface{} `json:"inputs"`
AgentRoles []string `json:"agent_roles"`
}
type CrewResponse struct {
Approved bool `json:"approved"`
ContextID string `json:"context_id"`
Reason string `json:"reason,omitempty"`
}
func (s *CrewGovernanceService) CheckCrewExecution(ctx context.Context, req CrewRequest) (*CrewResponse, error) {
inputsJSON, _ := json.Marshal(req.Inputs)
result, err := s.client.GetPolicyApprovedContext(
req.UserToken,
string(inputsJSON),
nil,
map[string]interface{}{
"framework": "crewai",
"crew_id": req.CrewID,
"agent_roles": req.AgentRoles,
"agent_count": len(req.AgentRoles),
},
)
if err != nil {
return nil, fmt.Errorf("pre-check failed: %w", err)
}
return &CrewResponse{
Approved: result.Approved,
ContextID: result.ContextID,
Reason: result.BlockReason,
}, nil
}
func (s *CrewGovernanceService) AuditCrewCompletion(
ctx context.Context,
contextID, result string,
latencyMs int,
metadata map[string]interface{},
) error {
_, err := s.client.AuditLLMCall(
contextID,
truncate(result, 200),
"openai",
"gpt-4",
		axonflow.TokenUsage{PromptTokens: 500, CompletionTokens: 200, TotalTokens: 700}, // placeholder counts; substitute real usage if tracked
int64(latencyMs),
metadata,
)
return err
}
func truncate(s string, maxLen int) string {
	// Byte-based truncation: adequate for audit summaries, but note it
	// can split a multi-byte UTF-8 character at the boundary.
	if len(s) <= maxLen {
		return s
	}
	return s[:maxLen]
}
Pattern 3: Per-Task Governance — Advanced: For fine-grained control
Apply governance before each task:
import time

from crewai import Task
from axonflow import AxonFlow

class TaskGovernedCrew:
"""CrewAI with per-task governance checks."""
def __init__(self, axonflow_url: str, client_id: str, client_secret: str):
self.axonflow_url = axonflow_url
self.client_id = client_id
self.client_secret = client_secret
def execute_governed_task(
self,
user_token: str,
task: Task,
        context: dict | None = None,
) -> str:
"""Execute a single task with governance."""
start_time = time.time()
with AxonFlow.sync(
endpoint=self.axonflow_url,
client_id=self.client_id,
client_secret=self.client_secret,
) as axonflow:
ctx = axonflow.get_policy_approved_context(
user_token=user_token,
query=task.description,
context={
**(context or {}),
"task_name": task.description[:50],
"agent_role": task.agent.role if task.agent else "unknown",
},
)
if not ctx.approved:
return f"Task blocked: {ctx.block_reason}"
            result = task.execute()  # note: newer CrewAI releases expose this as task.execute_sync()
axonflow.audit_llm_call(
context_id=ctx.context_id,
response_summary=str(result)[:200],
provider="openai",
model="gpt-4",
latency_ms=int((time.time() - start_time) * 1000),
)
return result
Tool-Level Governance (Python SDK v6.0.0+)
GovernedTool wraps any LangChain BaseTool with AxonFlow input/output policy enforcement. CrewAI accepts LangChain tools via CrewAIBaseTool.from_langchain(), so governed tools work with CrewAI crews.
CrewAI has its own tool type, but it supports LangChain tools through from_langchain. Wrap your tools with govern_tools, then convert:
from langchain_core.tools import tool
from crewai.tools import BaseTool as CrewAIBaseTool
from crewai import Agent, Task, Crew
from axonflow import AxonFlow
from axonflow.adapters import govern_tools
@tool
def search_database(query: str) -> str:
    """Search the internal database."""
    return db.execute(query)  # `db` is an application-provided handle

@tool
def send_report(content: str) -> str:
    """Send a report via email."""
    return email.send(content)  # `email` is an application-provided client
with AxonFlow.sync(
endpoint="http://localhost:8080",
client_id="your-client-id",
client_secret="your-secret",
) as client:
# Wrap LangChain tools with governance
governed = govern_tools([search_database, send_report], client)
# Convert to CrewAI tools
crewai_tools = [CrewAIBaseTool.from_langchain(t) for t in governed]
researcher = Agent(
role="Database Researcher",
goal="Find relevant data",
tools=crewai_tools,
# ...
)
# Tool calls are governed: PII blocked at input, redacted at output
crew = Crew(agents=[researcher], tasks=[...])
result = crew.kickoff()
For full details on GovernedTool, see Per-Tool Governance.
Example Implementations
| Language | SDK | Example |
|---|---|---|
| Python | axonflow | crewai/python |
| Go | axonflow-sdk-go | crewai/go |
