Evaluation Rollout Guide
The Evaluation tier exists for the moment when Community is no longer enough to answer the real question. The real question is not "does AxonFlow compile and run?" It is "can this become part of a production-grade operating model?"
This guide is for the engineer or platform owner running that next step.
What Evaluation Is For
Evaluation is best used to validate the controls that usually matter just before production:
- real approval queues
- policy simulation before rollout
- evidence export for governance review
- larger policy, execution, and provider limits
- more realistic workload and operator behavior
The current evaluation tier gives you:
- 50 tenant policies
- 5 organization policies
- 5 custom policy connectors
- 3 LLM providers
- 100 pending approvals
- 14-day audit retention
- policy simulation
- evidence export up to 5000 records and 3 exports per day
- 24-hour approval expiry
- 300/day policy simulation cap
- 50 inputs/run impact report limit
See Community vs Evaluation vs Enterprise for the complete limit profile.
That is enough to run a meaningful internal pilot rather than only a developer proof of concept.
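Before scoping the pilot, it can help to compare what you plan to run against the limits listed above. The following is a minimal sketch in plain Python, not an AxonFlow API: the dictionary keys, the `find_limit_gaps` helper, and the sample pilot plan are all hypothetical names chosen for illustration. Only the numeric limits come from this guide.

```python
# Evaluation-tier limits as listed in this guide.
# The key names are illustrative assumptions, not AxonFlow configuration keys.
EVALUATION_LIMITS = {
    "tenant_policies": 50,
    "organization_policies": 5,
    "custom_policy_connectors": 5,
    "llm_providers": 3,
    "pending_approvals": 100,
    "audit_retention_days": 14,
    "evidence_export_records": 5000,
    "evidence_exports_per_day": 3,
    "policy_simulations_per_day": 300,
    "impact_report_inputs_per_run": 50,
}


def find_limit_gaps(planned, limits=EVALUATION_LIMITS):
    """Return {name: (planned, limit)} for each planned value above its limit."""
    unknown = set(planned) - set(limits)
    if unknown:
        raise KeyError(f"unknown limit names: {sorted(unknown)}")
    return {
        name: (value, limits[name])
        for name, value in planned.items()
        if value > limits[name]
    }


# Hypothetical pilot plan: what the team expects to need at peak.
pilot_plan = {
    "tenant_policies": 32,
    "llm_providers": 4,
    "pending_approvals": 60,
    "audit_retention_days": 30,
}

for name, (value, limit) in find_limit_gaps(pilot_plan).items():
    print(f"{name}: planned {value}, Evaluation limit {limit}")
```

Any entry this reports is worth raising early: it either changes the pilot scope or is an early sign that Enterprise is the realistic landing zone.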
Ready to start? Request an Evaluation License
Pick The Right Evaluation Scope
A strong evaluation scope usually has all three of these:
- one real application or workflow that matters
- one workflow path with meaningful governance or approval risk
- one stakeholder beyond the core engineering team
Weak evaluations are usually too small. They prove that the platform starts, but they do not prove that the organization can operate it.
Good examples:
- a customer-support assistant with governed connector access
- an internal research assistant with redaction and evidence requirements
- a multi-step workflow that requires review before execution of risky actions
Recommended 3-Phase Evaluation
Phase 1: Prove Technical Fit
Use the first phase to answer:
- does the SDK integration fit the app architecture?
- do the right connectors and providers exist?
- do policies catch the right classes of risky behavior?
This is where Community To Enterprise Migration and Deployment Mode Matrix are most useful.
Phase 2: Prove Operational Fit
This is the real evaluation phase. Validate:
- approval queues with real reviewer behavior
- policy simulation before policy rollouts
- execution visibility and incident handling
- evidence export for internal governance review
- whether the current limits are enough for the intended pilot
This is where pages like Human-in-the-Loop and Execution Viewer matter.
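One way to put a number on "real reviewer behavior" is to check how often reviews would miss the 24-hour approval expiry. The sketch below is an illustration only: it assumes the expiry clock starts when the approval request is created and that a review landing exactly at 24 hours still counts, neither of which this guide confirms. The `expiry_rate` helper and the sample latencies are hypothetical; in practice the latencies would come from your own pilot records.

```python
from datetime import timedelta

# 24-hour approval expiry, per the Evaluation limits in this guide.
APPROVAL_EXPIRY = timedelta(hours=24)


def expiry_rate(review_latencies, expiry=APPROVAL_EXPIRY):
    """Fraction of requests whose review took longer than the expiry window."""
    if not review_latencies:
        return 0.0
    expired = sum(1 for latency in review_latencies if latency > expiry)
    return expired / len(review_latencies)


# Hypothetical sample: hours between request creation and reviewer decision.
sample = [timedelta(hours=h) for h in (2, 5, 26, 1, 30, 12, 8, 49)]

print(f"{expiry_rate(sample):.1%} of sampled approvals would have expired")
```

A high rate suggests the approval path needs more reviewers, better routing, or fewer gated actions before the pilot can be called operationally ready.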
Phase 3: Prove Organizational Fit
Use the final phase to answer:
- would security sign off on this rollout model?
- can reviewers and operators use it without engineering babysitting everything?
- does the pilot already point toward identity, portal workflows, or enterprise connectors?
That is the phase where the enterprise decision usually becomes obvious.
Exit Criteria For A Good Evaluation
Before you call the evaluation successful, you should have answers to:
- which workflows deserve approval gates?
- which policies need simulation before rollout?
- how will operators inspect, replay, and export executions?
- what are the first scale or governance limits you are likely to hit?
- is Evaluation enough for the intended production pilot, or is Enterprise the realistic landing zone?
If those answers are still fuzzy, the evaluation probably measured developer excitement more than platform fit.
Signals That Evaluation Should Turn Into Enterprise
The strongest signals are:
- several teams want to share the platform
- non-engineers need approval or portal workflows
- SSO or SCIM becomes mandatory
- security, procurement, or compliance wants stronger operational evidence
- enterprise connectors or provider management become part of the plan
That is when Enterprise Overview and Enterprise Rollout Checklist become more relevant than one more pilot iteration.
