AutoGen + AxonFlow Integration
What AutoGen Does Well
AutoGen is Microsoft's premier framework for building multi-agent AI systems, with strong community adoption and continuous development. Its strengths are real:
Multi-Agent Orchestration: Build teams of autonomous agents that converse, collaborate, and delegate tasks. GroupChat, UserProxyAgent, and AssistantAgent patterns handle complex coordination.
Flexible Agent Design: Define agents with custom behaviors, tool use, and memory. Agents can execute code, call APIs, and interact with humans in the loop.
Code Execution: Built-in sandboxed code execution with Docker support. Agents can write, run, and iterate on code safely.
Human-in-the-Loop: Native patterns for human approval, feedback, and intervention. human_input_mode controls when humans are consulted.
Active Development: Regular releases, strong Microsoft backing, and active community. Issues get addressed, patterns evolve.
What AutoGen Doesn't Try to Solve
AutoGen focuses on agent orchestration and collaboration. These concerns are explicitly out of scope:
| Production Requirement | AutoGen's Position |
|---|---|
| Policy enforcement before agent actions | Not provided—no built-in way to block requests based on content or context |
| PII detection in agent communications | Not addressed—agents can share sensitive data freely between themselves |
| SQL injection prevention | Not provided—code execution is sandboxed but input validation is external |
| Per-agent or per-user cost attribution | Not tracked—no way to attribute API costs to specific agents or users |
| Audit trails for compliance | Requires external logging—agent conversations aren't logged by default |
| Cross-agent access control | Not addressed—any agent can message any other agent in a group |
| Token budget enforcement | Not provided—no built-in cap on how many tokens agents can consume |
This isn't a criticism—it's a design choice. AutoGen handles orchestration. Governance is a separate concern.
Where Teams Hit Production Friction
Based on real enterprise deployments, here are the blockers that appear after the prototype works:
1. The Recursive Agent Loop
A researcher agent asks an analyst agent for data. The analyst requests clarification. The researcher rephrases. This continues. Over the weekend, 23,000 API calls are made before the conversation timeout.
AutoGen processed every message as intended. Nothing was watching the cost of what it processed.
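A hard budget at the orchestration layer stops this class of failure even before a full governance stack is in place. Below is a minimal sketch of the idea, with a stub standing in for a real AutoGen agent; `CallBudget` and `with_budget` are illustrative names, not an AutoGen or AxonFlow API:

```python
class BudgetExceeded(RuntimeError):
    """Raised when a conversation exceeds its API-call budget."""

class CallBudget:
    """Per-conversation call counter (illustrative, not an AxonFlow API)."""
    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.calls = 0

    def charge(self):
        self.calls += 1
        if self.calls > self.max_calls:
            raise BudgetExceeded(f"call budget of {self.max_calls} exhausted")

def with_budget(generate_reply, budget: CallBudget):
    """Wrap an agent's generate_reply so every turn spends budget."""
    def guarded(messages=None, sender=None, **kwargs):
        budget.charge()  # raises before the LLM is ever called
        return generate_reply(messages, sender, **kwargs)
    return guarded

# Demo with a stub in place of a real AutoGen agent
def fake_generate(messages=None, sender=None, **kwargs):
    return "ok"

budget = CallBudget(max_calls=3)
guarded = with_budget(fake_generate, budget)
results = []
try:
    for _ in range(10):  # a runaway back-and-forth loop
        results.append(guarded())
except BudgetExceeded:
    pass
print(len(results))  # → 3
```

The same wrapping trick applied to `agent.generate_reply` would have capped the weekend incident at the budget, not at 23,000 calls.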
2. The "What Did The Agents Discuss?" Question
Compliance asks for an audit trail of a multi-agent decision. The agents reached a conclusion, but:
- What prompts were sent between agents?
- What data was shared in the conversation?
- Which agent made which decision?
- What was the chain of reasoning?
This information wasn't captured. AutoGen facilitated the conversation; it didn't log it for compliance.
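Even a simple in-process transcript recorder answers most of these questions until a real governance layer is wired in. A minimal sketch (illustrative names, not an AxonFlow API; a real deployment needs durable, queryable storage):

```python
import json
import time

class ConversationLog:
    """Append-only transcript of inter-agent messages (illustrative only)."""
    def __init__(self):
        self.entries = []

    def record(self, sender: str, recipient: str, content: str):
        """Capture who said what to whom, with a timestamp."""
        self.entries.append({
            "ts": time.time(),
            "sender": sender,
            "recipient": recipient,
            "content": content,
        })

    def to_jsonl(self) -> str:
        """Serialize for hand-off to an external audit store."""
        return "\n".join(json.dumps(e) for e in self.entries)

log = ConversationLog()
log.record("researcher", "analyst", "Need Q3 revenue by region.")
log.record("analyst", "researcher", "Q3 revenue: $4.2M, EMEA-led.")
print(len(log.entries))  # → 2
```

Recording at the point where messages cross agent boundaries is what makes "which agent made which decision" answerable after the fact.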
3. The PII Exposure in Agent Memory
An HR agent has access to employee data. It shares context with a reporting agent. The reporting agent, designed to produce summaries for managers, now has employee SSNs in its conversation history.
AutoGen has no built-in mechanism to filter PII between agents.
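A crude but useful stopgap is to redact obvious PII before any message is handed to another agent. A minimal sketch, assuming regex-detectable patterns; real PII detection needs much broader coverage (names, addresses, account numbers) and context-aware matching:

```python
import re

# Illustrative patterns only; not exhaustive
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with typed placeholders before hand-off."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED-{label.upper()}]", text)
    return text

# What the HR agent would pass to the reporting agent, after filtering
msg = "Jane Doe, SSN 123-45-6789, contact jane@corp.com"
clean = redact(msg)
print(clean)  # → Jane Doe, SSN [REDACTED-SSN], contact [REDACTED-EMAIL]
```

Run at the boundary between agents, this keeps SSNs out of the reporting agent's conversation history entirely.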
4. The Security Review Block
Security review: BLOCKED
- No audit trail for inter-agent communications
- PII can flow between agents without filtering
- Code execution governed only by Docker isolation
- No role-based access control for agents
- Cost controls missing
The multi-agent system worked perfectly in demo. It can't ship.
5. The Runaway Code Execution
A coding agent generates and executes code in a loop. Each iteration calls an API. The code works, but the API has per-call costs. 50,000 iterations later, the bill arrives.
AutoGen's sandbox prevented security issues. It didn't prevent financial issues.
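The fix is a spend ceiling checked before each call, not a bill read after. A minimal sketch using integer micro-dollar accounting to avoid float drift; the `CostMeter` name and the $0.002 per-call price are assumptions, not an AxonFlow API:

```python
class CostMeter:
    """Accumulates estimated spend in micro-dollars (illustrative only)."""
    def __init__(self, ceiling_microusd: int):
        self.ceiling = ceiling_microusd
        self.spent = 0

    def charge(self, cost_microusd: int) -> bool:
        """Record one call's cost; return False once the ceiling is hit."""
        if self.spent + cost_microusd > self.ceiling:
            return False
        self.spent += cost_microusd
        return True

# $1.00 ceiling, $0.002 per call (assumed price)
meter = CostMeter(ceiling_microusd=1_000_000)
completed = 0
for _ in range(50_000):          # the runaway loop from the anecdote
    if not meter.charge(2_000):  # refuse the call once over budget
        break
    completed += 1
print(completed)  # → 500
```

With the meter consulted inside the code-execution loop, the incident ends at the ceiling instead of at 50,000 iterations.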
How AxonFlow Plugs In
AxonFlow doesn't replace AutoGen. It sits underneath, providing the governance layer that AutoGen intentionally doesn't include:
┌─────────────────┐
│    Your App     │
└────────┬────────┘
         │
         v
┌─────────────────┐
│     AutoGen     │  <-- Agents, GroupChat, Code Execution
└────────┬────────┘
         │
         v
┌─────────────────────────────────┐
│            AxonFlow             │
│  ┌───────────┐  ┌────────────┐  │
│  │  Policy   │  │   Audit    │  │
│  │  Enforce  │  │   Trail    │  │
│  └───────────┘  └────────────┘  │
│  ┌───────────┐  ┌────────────┐  │
│  │    PII    │  │    Cost    │  │
│  │ Detection │  │  Control   │  │
│  └───────────┘  └────────────┘  │
└────────────────┬────────────────┘
                 │
                 v
┌─────────────────┐
│  LLM Provider   │
└─────────────────┘
What this gives you:
- Every agent action logged with agent identity and context
- PII detected and blocked before flowing between agents
- SQL injection attempts blocked even in code generation
- Cost tracked per agent, per user, per conversation
- Compliance auditors can query the full decision chain
What stays the same:
- Your AutoGen code doesn't change
- Agent orchestration patterns work as before
- No new abstractions to learn
Integration Patterns
Pattern 1: Governed Agent Wrapper (Gateway Mode)
Wrap AutoGen agents with AxonFlow governance:
import os
import time
from typing import Dict, List, Optional, Union

from autogen import AssistantAgent, ConversableAgent
from axonflow import AxonFlow


class GovernedAutoGenAgent:
    """Wrapper that adds AxonFlow governance to AutoGen agents."""

    def __init__(
        self,
        axonflow_client: AxonFlow,
        agent: ConversableAgent,
        user_token: str,
        agent_role: str = "assistant"
    ):
        self.axonflow = axonflow_client
        self.agent = agent
        self.user_token = user_token
        self.agent_role = agent_role
        # Wrap the agent's LLM call
        self._original_generate = agent.generate_reply
        agent.generate_reply = self._governed_generate

    def _governed_generate(
        self,
        messages: Optional[List[Dict]] = None,
        sender: Optional[ConversableAgent] = None,
        **kwargs
    ) -> Union[str, Dict, None]:
        """Generate reply with AxonFlow governance."""
        # Extract the latest message content for policy evaluation
        query = ""
        if messages:
            last_message = messages[-1]
            if isinstance(last_message, dict):
                query = last_message.get("content", "")
            else:
                query = str(last_message)

        # Ask for policy approval before the LLM is ever called
        ctx = self.axonflow.get_policy_approved_context(
            user_token=self.user_token,
            query=query,
            context={
                "agent_role": self.agent_role,
                "agent_name": self.agent.name,
                "framework": "autogen"
            }
        )
        if not ctx.approved:
            return f"[BLOCKED by policy: {ctx.block_reason}]"

        # Time the underlying LLM call for the audit record
        llm_start = time.time()
        response = self._original_generate(messages, sender, **kwargs)
        llm_end = time.time()

        response_text = response if isinstance(response, str) else str(response)
        self.axonflow.audit_llm_call(
            context_id=ctx.context_id,
            response_summary=response_text[:200],
            provider="openai",
            model=self.agent.llm_config.get("model", "gpt-4"),
            latency_ms=int((llm_end - llm_start) * 1000)
        )
        return response


# Usage
with AxonFlow.sync(
    agent_url=os.getenv("AXONFLOW_AGENT_URL", "http://localhost:8080"),
    client_id=os.getenv("AXONFLOW_CLIENT_ID"),
    client_secret=os.getenv("AXONFLOW_CLIENT_SECRET")
) as axonflow:
    assistant = AssistantAgent(
        name="research_assistant",
        llm_config={"model": "gpt-4", "api_key": os.getenv("OPENAI_API_KEY")}
    )
    governed = GovernedAutoGenAgent(
        axonflow_client=axonflow,
        agent=assistant,
        user_token="user-123",
        agent_role="research_assistant"
    )
    # Agent is now governed; use `assistant` as usual
Pattern 2: GroupChat with Per-Agent Policies
Apply different policies to different agent roles:
class GovernedGroupChat:
    """GroupChat with AxonFlow governance for each agent."""

    def __init__(self, axonflow: AxonFlow, user_token: str):
        self.axonflow = axonflow
        self.user_token = user_token
        self.agent_policies = {}

    def add_agent(self, agent: AssistantAgent, policy_context: dict):
        """Register an agent with its policy context."""
        self.agent_policies[agent.name] = policy_context
        original_generate = agent.generate_reply

        def governed_generate(messages=None, sender=None, **kwargs):
            last = messages[-1] if messages else {}
            query = last.get("content", "") if isinstance(last, dict) else str(last)

            # Copy so the registered policy context isn't mutated per call
            ctx_data = dict(self.agent_policies.get(agent.name, {}))
            ctx_data["agent_name"] = agent.name
            ctx_data["framework"] = "autogen"

            ctx = self.axonflow.get_policy_approved_context(
                user_token=self.user_token,
                query=query,
                context=ctx_data
            )
            if not ctx.approved:
                return f"[BLOCKED: {ctx.block_reason}]"

            llm_start = time.time()
            response = original_generate(messages, sender, **kwargs)
            self.axonflow.audit_llm_call(
                context_id=ctx.context_id,
                response_summary=str(response)[:200],
                provider="openai",
                model=agent.llm_config.get("model", "gpt-4"),
                latency_ms=int((time.time() - llm_start) * 1000)
            )
            return response

        agent.generate_reply = governed_generate
        return agent


# Different policies per agent role
governed = GovernedGroupChat(axonflow, user_token="analyst-team")

researcher = governed.add_agent(
    AssistantAgent(name="researcher", llm_config={"model": "gpt-4"}),
    policy_context={"role": "researcher", "data_access": ["external", "public"]}
)
analyst = governed.add_agent(
    AssistantAgent(name="analyst", llm_config={"model": "gpt-4"}),
    policy_context={"role": "analyst", "data_access": ["internal", "financial"]}
)
Pattern 3: Java Service Orchestrating AutoGen
For Java services coordinating with AutoGen via REST:
package com.example.autogen;

import com.getaxonflow.sdk.AxonFlow;
import com.getaxonflow.sdk.AxonFlowConfig;
import com.getaxonflow.sdk.PolicyApprovalResult;
import com.getaxonflow.sdk.TokenUsage;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;

public class GovernedAutoGenService {

    private final AxonFlow axonflow;
    private final HttpClient httpClient;
    private final String autogenServiceUrl;

    public GovernedAutoGenService(String autogenServiceUrl) {
        this.autogenServiceUrl = autogenServiceUrl;
        this.httpClient = HttpClient.newHttpClient();
        AxonFlowConfig config = AxonFlowConfig.builder()
            .agentUrl(System.getenv("AXONFLOW_AGENT_URL"))
            .clientId(System.getenv("AXONFLOW_CLIENT_ID"))
            .clientSecret(System.getenv("AXONFLOW_CLIENT_SECRET"))
            .build();
        this.axonflow = AxonFlow.create(config);
    }

    public String executeGovernedConversation(
            String userToken,
            String message,
            String agentTeam,
            Map<String, Object> context
    ) throws Exception {
        context.put("framework", "autogen");
        context.put("agent_team", agentTeam);

        // Policy check before the AutoGen service is invoked
        PolicyApprovalResult approval = axonflow.getPolicyApprovedContext(
            userToken, message, context
        );
        if (!approval.isApproved()) {
            // PolicyViolationException is an application-defined exception
            throw new PolicyViolationException(approval.getBlockReason());
        }

        long startTime = System.currentTimeMillis();
        // NOTE: use a JSON library (e.g. Jackson) in production; naive
        // concatenation breaks on quotes or newlines in `message`
        HttpResponse<String> response = httpClient.send(
            HttpRequest.newBuilder()
                .uri(URI.create(autogenServiceUrl + "/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                    "{\"message\": \"" + message + "\", \"agent_team\": \"" + agentTeam + "\"}"
                ))
                .build(),
            HttpResponse.BodyHandlers.ofString()
        );

        // Token counts aren't available from the REST response here
        axonflow.auditLLMCall(
            approval.getContextId(),
            response.body().substring(0, Math.min(200, response.body().length())),
            "openai", "gpt-4",
            TokenUsage.of(0, 0, 0),
            System.currentTimeMillis() - startTime
        );
        return response.body();
    }
}
Example Implementations
| Language | SDK | Example |
|---|---|---|
| Python | axonflow | autogen/python |
| Java | axonflow-sdk | autogen/java |