Evaluation Rollout Guide

The Evaluation tier exists for the moment when Community is no longer enough to answer the real question. The real question is not "does AxonFlow compile and run?" It is "can this become part of a production-grade operating model?"

This guide is for the engineer or platform owner running that next step.

What Evaluation Is For

Evaluation is best used to validate the controls that usually matter just before production:

  • real approval queues
  • policy simulation before rollout
  • evidence export for governance review
  • larger policy, execution, and provider limits
  • more realistic workload and operator behavior

The current evaluation tier gives you:

  • 50 tenant policies
  • 5 organization policies
  • 5 custom policy connectors
  • 3 LLM providers
  • 100 pending approvals
  • 14-day audit retention
  • policy simulation
  • evidence export: up to 5,000 records per export, 3 exports per day
  • 24-hour approval expiry
  • policy simulation cap of 300 runs per day
  • impact report limit of 50 inputs per run

See Community vs Evaluation vs Enterprise for the complete limit profile.

That is enough to run a meaningful internal pilot rather than only a developer proof of concept.
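Those limits translate into hard daily ceilings worth checking before you scope the pilot. The sketch below is a hypothetical planning helper, not part of any AxonFlow SDK; it only does arithmetic on the numbers listed above (for example, 5,000 records per export times 3 exports per day bounds how much audit evidence you can extract daily).

```python
# Hypothetical planning helper (not AxonFlow code): sanity-check a pilot
# plan against the Evaluation tier limits listed above.

EVAL_LIMITS = {
    "tenant_policies": 50,
    "org_policies": 5,
    "llm_providers": 3,
    "pending_approvals": 100,
    "export_records_per_run": 5000,
    "exports_per_day": 3,
    "simulations_per_day": 300,
}

def exportable_records_per_day(limits=EVAL_LIMITS):
    """Max audit records extractable per day via evidence export."""
    return limits["export_records_per_run"] * limits["exports_per_day"]

def fits_evaluation(plan, limits=EVAL_LIMITS):
    """Return the limit names a pilot plan would exceed (empty = fits)."""
    exceeded = []
    if plan.get("daily_audit_events", 0) > exportable_records_per_day(limits):
        exceeded.append("evidence export throughput")
    if plan.get("policies", 0) > limits["tenant_policies"]:
        exceeded.append("tenant policies")
    if plan.get("daily_simulations", 0) > limits["simulations_per_day"]:
        exceeded.append("policy simulation cap")
    return exceeded

# A pilot generating 20,000 auditable events/day outruns evidence export:
print(fits_evaluation(
    {"daily_audit_events": 20000, "policies": 12, "daily_simulations": 40}
))
```

If a check like this fails for the pilot you actually intend to run, that is an early signal the Enterprise conversation belongs in the evaluation plan, not after it.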

Ready to start? Request an Evaluation License

Pick The Right Evaluation Scope

A strong evaluation scope usually has all three of these:

  1. one real application or workflow that matters
  2. one workflow path with meaningful governance or approval risk
  3. one stakeholder beyond the core engineering team

Weak evaluations are usually too small. They prove that the platform starts, but they do not prove that the organization can operate it.

Good examples:

  • a customer-support assistant with governed connector access
  • an internal research assistant with redaction and evidence requirements
  • a multi-step workflow that requires review before execution of risky actions

Phase 1: Prove Technical Fit

Use the first phase to answer:

  • does the SDK integration fit the app architecture?
  • do the right connectors and providers exist?
  • do policies catch the right classes of risky behavior?

This is where Community To Enterprise Migration and Deployment Mode Matrix are most useful.
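To make the third question concrete, it helps to write down which action classes a policy should deny outright versus route to approval. The sketch below is a generic illustration of that kind of check; every name in it is hypothetical and it does not represent the AxonFlow policy engine or its schema.

```python
# Illustrative only: a toy policy check that denies some actions outright
# and routes others to an approval queue. All names are hypothetical.

def evaluate(action, policies):
    """Return the first matching policy decision; default is allow."""
    for policy in policies:
        if action["name"] in policy["deny_actions"]:
            return {"decision": "deny", "policy": policy["id"]}
        if action["name"] in policy.get("require_approval", set()):
            return {"decision": "needs_approval", "policy": policy["id"]}
    return {"decision": "allow", "policy": None}

policies = [
    {
        "id": "p-risky-writes",
        "deny_actions": {"bulk_export"},
        "require_approval": {"delete_record"},
    },
]

# Risky writes land in review instead of executing immediately:
print(evaluate({"name": "delete_record"}, policies))
```

The useful test in Phase 1 is not whether a policy mechanism exists, but whether your deny and require-approval sets actually cover the risky behavior your pilot application can produce.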

Phase 2: Prove Operational Fit

This is the real evaluation phase. Validate:

  • approval queues with real reviewer behavior
  • policy simulation before policy rollouts
  • execution visibility and incident handling
  • evidence export for internal governance review
  • whether the current limits are enough for the intended pilot

This is where pages like Human-in-the-Loop and Execution Viewer matter.
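One operational detail worth rehearsing with real reviewers is the 24-hour approval expiry: requests that sit in the queue longer than a day lapse, so reviewer response time becomes a hard constraint. The sketch below is plain datetime arithmetic, not AxonFlow code, assuming each queue item records when it was requested.

```python
# Hedged sketch of the 24-hour approval expiry from the limits above.
# Plain datetime arithmetic; queue item shape is assumed, not AxonFlow's.

from datetime import datetime, timedelta

APPROVAL_TTL = timedelta(hours=24)  # Evaluation-tier approval expiry

def expired(requested_at, now):
    """True once a pending approval has aged past the 24-hour TTL."""
    return now - requested_at >= APPROVAL_TTL

def triage(queue, now):
    """Split a pending-approval queue into live and expired items."""
    live = [a for a in queue if not expired(a["requested_at"], now)]
    dead = [a for a in queue if expired(a["requested_at"], now)]
    return live, dead

now = datetime(2025, 6, 2, 9, 0)
queue = [
    {"id": "a1", "requested_at": now - timedelta(hours=2)},
    {"id": "a2", "requested_at": now - timedelta(hours=30)},
]
live, dead = triage(queue, now)
print([a["id"] for a in live], [a["id"] for a in dead])  # ['a1'] ['a2']
```

If your pilot routinely produces expired approvals, that is a finding about reviewer staffing and escalation paths, which is exactly the operational evidence Phase 2 exists to surface.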

Phase 3: Prove Organizational Fit

Use the final phase to answer:

  • would security sign off on this rollout model?
  • can reviewers and operators use it without engineering babysitting every step?
  • does the pilot already point toward identity, portal workflows, or enterprise connectors?

That is the phase where the enterprise decision usually becomes obvious.

Exit Criteria For A Good Evaluation

Before you call the evaluation successful, you should have answers to:

  1. which workflows deserve approval gates?
  2. which policies need simulation before rollout?
  3. how will operators inspect, replay, and export executions?
  4. what are the first scale or governance limits you are likely to hit?
  5. is Evaluation enough for the intended production pilot, or is Enterprise the realistic landing zone?

If those answers are still fuzzy, the evaluation probably measured developer excitement more than platform fit.

Signals That Evaluation Should Turn Into Enterprise

The strongest signals are:

  • several teams want to share the platform
  • non-engineers need approval or portal workflows
  • SSO or SCIM becomes mandatory
  • security, procurement, or compliance wants stronger operational evidence
  • enterprise connectors or provider management become part of the plan

That is when Enterprise Overview and Enterprise Rollout Checklist become more relevant than one more pilot iteration.