Retry Semantics & Idempotency

Production workflows retry. Agents crash and resume, upstream callers double-fire requests, message queues redeliver, network acknowledgements vanish mid-flight. When that happens, your policy decisions, audit trail, and downstream side-effects need to stay coherent — a retried step should not look like a brand-new decision, and a duplicate complete should not book a second payment.

The Workflow Control Plane exposes two first-class primitives for this:

retry_context — a structured object returned on every /gate response that tells callers exactly which attempt this is, whether a prior attempt completed, and what the prior decision was.
idempotency_key — an optional caller-supplied business-level key that is recorded on the step and must match across /gate and /complete for the same (workflow_id, step_id).

Both are available on the same POST /api/v1/workflows/{workflow_id}/steps/{step_id}/gate and POST /api/v1/workflows/{workflow_id}/steps/{step_id}/complete endpoints you already use. No new endpoints, no new lifecycle. This page covers the wire shape, the rules, and how to migrate from the older cached / decision_source fields.

Why retry_context replaces `cached`

Before retry_context, callers had two fields to work with: cached (a boolean, true when the gate returned a memoized decision) and decision_source (a string, "fresh" or "cached"). Those answer one narrow question: "did you re-evaluate?" They do not tell you:

how many times the step has been gated
whether /complete ever landed for this step
what the prior decision actually was
whether a prior attempt left state that a resumed agent should reuse

retry_context answers all of those explicitly. cached and decision_source are preserved for backward compatibility and will remain on the response, but new integrations should read retry_context directly. See Migration from cached below.

The `retry_context` object

Every /gate response — including the very first call on a step — carries a populated retry_context object:

{
  "decision": "allow",
  "step_id": "step-2",
  "decision_id": "dec_7c41...",
  "cached": false,
  "decision_source": "fresh",
  "retry_context": {
    "gate_count": 1,
    "completion_count": 0,
    "prior_completion_status": "none",
    "prior_output_available": false,
    "prior_output": null,
    "prior_completion_at": null,
    "first_attempt_at": "2026-04-21T15:30:45.123Z",
    "last_attempt_at": "2026-04-21T15:30:45.123Z",
    "last_decision": "allow",
    "idempotency_key": ""
  }
}

Field reference

Field	Type	Description
`gate_count`	integer ≥ 1	Number of times `/gate` has been called for this `(workflow_id, step_id)`, including the current call. First call returns `1`. Increments on every gate call regardless of decision.
`completion_count`	integer ≥ 0	Number of times `/complete` has been successfully called for this step. Always `0` on the first gate call; typically `1` after the step completes.
`prior_completion_status`	enum string	One of `"none"`, `"completed"`, `"gated_not_completed"`. See status values.
`prior_output_available`	bool	`true` iff `prior_completion_status == "completed"`. Mirrors whether `prior_output` could be returned if the caller opts in.
`prior_output`	object or `null`	Always present in the schema. Populated only when the gate call set `?include_prior_output=true` and the prior attempt completed. Otherwise `null`.
`prior_completion_at`	ISO 8601 string or `null`	Timestamp of the prior `/complete`, if any. `null` otherwise.
`first_attempt_at`	ISO 8601 string	Timestamp of the first gate call on this step. On the first call, equals `last_attempt_at`.
`last_attempt_at`	ISO 8601 string	Timestamp of the current gate call — the one that produced this response.
`last_decision`	enum string	Decision of the immediately prior gate call. One of `"allow"`, `"block"`, `"require_approval"`. On the first call, equals the current response's `decision`.
`idempotency_key`	string	Idempotency key the caller supplied on the first gate call, if any. Empty string `""` if the caller never supplied one. Once set on a step, it is immutable for the lifetime of that step. See Idempotency keys.

All timestamps are RFC 3339 / ISO 8601 with a timezone offset (typically Z for UTC).
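
For callers working with the raw JSON rather than an SDK, the field reference above can be mirrored in a small typed wrapper. The following is a minimal sketch, not the SDK's own type: the class name `RetryContext` and the helper `_parse_ts` are hypothetical, and only the field names come from this page.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an RFC 3339 timestamp; 'Z' is rewritten so older Pythons accept it."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass(frozen=True)
class RetryContext:
    """Hypothetical typed view of the retry_context JSON object."""
    gate_count: int
    completion_count: int
    prior_completion_status: str
    prior_output_available: bool
    prior_output: Optional[dict]
    prior_completion_at: Optional[datetime]
    first_attempt_at: datetime
    last_attempt_at: datetime
    last_decision: str
    idempotency_key: str

    @classmethod
    def from_json(cls, raw: dict) -> "RetryContext":
        # Every key is required: the page states the object is always populated.
        return cls(
            gate_count=raw["gate_count"],
            completion_count=raw["completion_count"],
            prior_completion_status=raw["prior_completion_status"],
            prior_output_available=raw["prior_output_available"],
            prior_output=raw["prior_output"],
            prior_completion_at=_parse_ts(raw["prior_completion_at"]),
            first_attempt_at=_parse_ts(raw["first_attempt_at"]),
            last_attempt_at=_parse_ts(raw["last_attempt_at"]),
            last_decision=raw["last_decision"],
            idempotency_key=raw["idempotency_key"],
        )
```

Parsing the timestamps up front means later comparisons (for example, how long ago the first attempt was) work on timezone-aware datetime values rather than strings.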

First-call invariant

When gate_count == 1, the response is fully determined:

completion_count == 0
prior_completion_status == "none"
prior_output_available == false
prior_output == null
prior_completion_at == null
first_attempt_at == last_attempt_at
last_decision equals the current call's decision
idempotency_key is whatever the caller passed, or "" if none was supplied

Callers that want to detect "fresh step, never seen before" should branch on gate_count == 1, not on cached == false. cached is a hint about memoization; gate_count is the authoritative attempt counter.
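
The invariant above is easy to assert in an integration test. This is a sketch under the assumption that the gate response has already been parsed into a plain dict; the function name `first_call_violations` is hypothetical.

```python
def first_call_violations(gate: dict) -> list:
    """Return the retry_context fields that break the first-call invariant.

    Returns an empty list when gate_count != 1, because the invariant
    only constrains the first gate call on a step.
    """
    rc = gate["retry_context"]
    if rc["gate_count"] != 1:
        return []
    expected = {
        "completion_count": 0,
        "prior_completion_status": "none",
        "prior_output_available": False,
        "prior_output": None,
        "prior_completion_at": None,
        "first_attempt_at": rc["last_attempt_at"],
        "last_decision": gate["decision"],
    }
    return [name for name, want in expected.items() if rc[name] != want]
```

An empty result means the response is consistent with a never-seen-before step; any names returned point at the fields worth logging.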

`prior_completion_status` values

Value	Meaning
`"none"`	First gate call for this `(workflow_id, step_id)`. No prior attempt exists.
`"completed"`	A prior gate landed and `/complete` was successfully called for it. The step has already produced output, which can be retrieved via `?include_prior_output=true`.
`"gated_not_completed"`	A prior gate landed but `/complete` never followed. The earlier decision is known, but the agent either crashed before calling `/complete`, is still working, or decided not to proceed. Note that AxonFlow cannot distinguish these three cases — the Workflow Control Plane is cooperative, and only learns about execution outcomes when the agent reports them.

Detecting retries

A retry is any gate call with gate_count > 1. From there, the shape of prior_completion_status tells you what to do:

import logging

log = logging.getLogger(__name__)

def handle_gate(gate, workflow_id, step_id):
    """Return the prior output if the step already completed, else None."""
    ctx = gate.retry_context
    if ctx.gate_count > 1:
        if ctx.prior_completion_status == "completed":
            # Step already finished on a previous attempt. Safe to short-circuit
            # and return the prior output to your workflow.
            # fetch_prior_output is a placeholder for your own lookup.
            return fetch_prior_output(workflow_id, step_id)
        if ctx.prior_completion_status == "gated_not_completed":
            # Prior gate landed but no /complete. Decide whether to re-run or
            # bail based on your own state. AxonFlow will re-gate, but the
            # policy decision in last_decision tells you what was said before.
            log.info("Retrying step that was %s on previous attempt",
                     ctx.last_decision)
    return None  # nothing to reuse; the caller decides whether to execute

In the session-retry case — upstream caller re-fires /gate for the same step after /complete already landed — prior_completion_status == "completed" tells you to skip re-execution and reuse the earlier output. That is the signal that lets you prevent duplicate side-effects on the caller's side, assuming you have access to the prior output.

Requesting prior_output

By default, prior_output is null even when a prior attempt completed. Prior step output may be large or contain sensitive data, so retrieval is opt-in via a query parameter:

curl -X POST \
  "http://localhost:8080/api/v1/workflows/wf_abc123/steps/step-2/gate?include_prior_output=true" \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'my-app:my-secret' | base64)" \
  -d '{
    "step_name": "Transfer funds",
    "step_type": "tool_call"
  }'

When ?include_prior_output=true and prior_completion_status == "completed", the prior_output field is populated with the object that was passed to /complete on the earlier attempt. In every other case — ?include_prior_output=false or omitted, no prior completion, or first call on this step — prior_output is null. Passing ?include_prior_output=true on the very first call for a step is a harmless no-op: there is no prior attempt to surface.

prior_output_available always reflects whether a prior output could be returned, independent of the query parameter. A caller can check prior_output_available first to decide whether to re-issue the gate with ?include_prior_output=true.
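
The check-then-refetch pattern described above might look like the following. It is a sketch only: `call_gate` is a hypothetical callable supplied by the caller that issues the /gate request with or without ?include_prior_output=true and returns the parsed JSON response.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: include_prior_output flag in, parsed /gate JSON out.
GateCall = Callable[[bool], dict]


def gate_with_prior_output(call_gate: GateCall) -> Tuple[dict, Optional[dict]]:
    """Gate once without the prior output; re-issue with it only if it exists."""
    first = call_gate(False)
    if not first["retry_context"]["prior_output_available"]:
        return first, None
    # A prior attempt completed, so ask again with include_prior_output=true.
    second = call_gate(True)
    return second, second["retry_context"]["prior_output"]
```

Because gate_count increments on every gate call, the second request in this pattern counts as another attempt. Callers who care about the counter may prefer to pass ?include_prior_output=true on the first call instead, which the page describes as harmless when there is nothing to return.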

Idempotency keys

idempotency_key is an optional caller-supplied string, up to 255 characters, that pins a step to a business-level identifier such as an invoice number, a customer-initiated wire transfer ID, or a content hash of the underlying request. It gives your workflow a way to assert: "this step is about this transaction, and no other." A key longer than 255 characters is rejected with 400 BAD_REQUEST on both /gate and /complete.
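
One way to derive a key that always fits the 255-character limit is to join the business identifiers and fall back to a content hash when the result is too long. This is an illustrative sketch: the function name, the ":" separator, and the "sha256:" prefix are assumptions, not an AxonFlow convention.

```python
import hashlib

MAX_KEY_LENGTH = 255  # limit stated on this page


def build_idempotency_key(*parts: str) -> str:
    """Join business identifiers into a key, hashing it if it would be too long."""
    key = ":".join(parts)
    if len(key) <= MAX_KEY_LENGTH:
        return key
    # Deterministic fallback: the same parts always yield the same digest.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return "sha256:" + digest
```

Whatever scheme is used, it needs to be deterministic, because a retried call must reproduce exactly the same key as the first gate call.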

Scope and lifetime

Within this release, idempotency_key enforcement is scoped to a single (workflow_id, step_id). Key enforcement across workflow IDs — preventing the same business transaction from being processed twice through two different workflow runs — is a planned future enhancement and not part of this release. This page covers per-step behavior only.

Passing on `/gate`

Supply idempotency_key in the /gate request body:

curl -X POST http://localhost:8080/api/v1/workflows/wf_abc123/steps/step-2/gate \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'my-app:my-secret' | base64)" \
  -d '{
    "step_name": "Wire transfer",
    "step_type": "tool_call",
    "idempotency_key": "payment:wire:acct4471:invoice-7721"
  }'

The key is recorded on the step row by the first gate call and is echoed back in retry_context.idempotency_key on every subsequent gate response for that step. Once set, the key is immutable: subsequent gate calls for the same step must pass the same key, or they receive 409 IDEMPOTENCY_KEY_MISMATCH.

A caller that never supplies an idempotency_key gets the pre-existing behavior: retry_context.idempotency_key is the empty string "", and no key matching is enforced on /complete.

Passing on `/complete`

When /complete lands for a step that set an idempotency key on its gate, the same key must be passed on the completion request:

curl -X POST http://localhost:8080/api/v1/workflows/wf_abc123/steps/step-2/complete \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'my-app:my-secret' | base64)" \
  -d '{
    "output": {"transfer_id": "txn-88f210"},
    "tokens_in": 0,
    "tokens_out": 0,
    "cost_usd": 0,
    "idempotency_key": "payment:wire:acct4471:invoice-7721"
  }'
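
The two curl calls above share everything except the final path segment and the body, which makes it easy to guarantee that the same key reaches both endpoints. The sketch below builds (but does not send) the requests with Python's standard library; `build_step_request` is a hypothetical helper, and the base URL and credentials are the placeholder values from the curl examples.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:8080/api/v1"  # placeholder host from the curl examples


def build_step_request(action: str, workflow_id: str, step_id: str, body: dict,
                       idempotency_key: str, client_id: str,
                       client_secret: str) -> urllib.request.Request:
    """Build a POST for /gate or /complete carrying the given idempotency key."""
    if action not in ("gate", "complete"):
        raise ValueError("action must be 'gate' or 'complete'")
    payload = dict(body)
    if idempotency_key:
        payload["idempotency_key"] = idempotency_key
    token = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url=f"{BASE_URL}/workflows/{workflow_id}/steps/{step_id}/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
```

Passing the key as a single variable to both calls, rather than rebuilding it at each call site, removes one common source of accidental mismatches.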

Mismatch rules

The first /gate call on a step fixes the key (or absence of one) for that step's lifetime. Every subsequent /gate and the /complete call must present exactly the same key state, including the "no key" case on both sides.

First `/gate` key	Subsequent `/gate` or `/complete` key	Result
Present	Same value	✅ Accepted
Present	Different value	❌ `409 IDEMPOTENCY_KEY_MISMATCH`
Present	Omitted	❌ `409 IDEMPOTENCY_KEY_MISMATCH`
Omitted	Present	❌ `409 IDEMPOTENCY_KEY_MISMATCH`
Omitted	Omitted	✅ Accepted (legacy behavior)

Mismatch is enforced on both endpoints: a repeat /gate with a different key fails the same way a /complete with a different key does.
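
The table reduces to a single comparison once "omitted" is normalised to the empty string. The following is a client-side model of that rule for use in tests, not the server's implementation; treating an omitted key and "" as equivalent is an assumption based on how the 409 envelope reports missing keys.

```python
from typing import Optional


def key_state_matches(recorded: str, presented: Optional[str]) -> bool:
    """Model of the mismatch table.

    recorded  -- key stored by the first /gate call ("" if none was supplied)
    presented -- key on a later /gate or /complete call (None or "" if omitted)
    """
    return recorded == (presented or "")
```

A False result corresponds to the rows that return 409 IDEMPOTENCY_KEY_MISMATCH.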

`409 IDEMPOTENCY_KEY_MISMATCH`

A key mismatch on /gate or /complete returns HTTP 409 Conflict with this envelope:

{
  "error": {
    "code": "IDEMPOTENCY_KEY_MISMATCH",
    "message": "idempotency_key does not match the key recorded on the step's first gate call",
    "details": {
      "workflow_id": "wf_41231a72",
      "step_id": "step-2",
      "expected_idempotency_key": "payment:wire:acct4471:invoice-7721",
      "received_idempotency_key": "payment:wire:acct4471:invoice-9999"
    }
  }
}

expected_idempotency_key is the empty string "" when the step was first gated without a key but a later call supplied one.
received_idempotency_key is the empty string "" when the step was first gated with a key but a later call omitted it.

All four SDKs expose this as a typed error:

SDK	Type
Python	`IdempotencyKeyMismatchError`
TypeScript	`IdempotencyKeyMismatchError`
Go	`IdempotencyKeyMismatchError` (match with `errors.As`)
Java	`IdempotencyKeyMismatchException` (subclass of `AxonFlowException`)

Treat a 409 as non-retriable without operator intervention. Either the caller generated the wrong key for the step, or an earlier crash left the step with a key the current caller no longer has. Both cases require a human decision before you can safely proceed.
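
Callers that talk to the HTTP API directly, without an SDK, can turn the envelope into an exception that retry loops know not to retry. This sketch assumes the status code and parsed JSON body are already in hand; `KeyMismatch` and `raise_for_key_mismatch` are hypothetical names, not the SDK types listed above.

```python
class KeyMismatch(Exception):
    """Raised for 409 IDEMPOTENCY_KEY_MISMATCH; should not be retried automatically."""

    def __init__(self, workflow_id: str, step_id: str, expected: str, received: str):
        super().__init__(
            f"idempotency key mismatch on {workflow_id}/{step_id}: "
            f"expected {expected!r}, received {received!r}"
        )
        self.workflow_id = workflow_id
        self.step_id = step_id
        self.expected = expected
        self.received = received


def raise_for_key_mismatch(status_code: int, body: dict) -> None:
    """Raise KeyMismatch if the response is the 409 envelope; otherwise do nothing."""
    if status_code != 409:
        return
    error = body.get("error", {})
    if error.get("code") != "IDEMPOTENCY_KEY_MISMATCH":
        return
    details = error.get("details", {})
    raise KeyMismatch(
        details.get("workflow_id", ""),
        details.get("step_id", ""),
        details.get("expected_idempotency_key", ""),
        details.get("received_idempotency_key", ""),
    )
```

Keeping this exception type out of any generic retry-on-error wrapper is what enforces the "human decision first" guidance in code.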

SDK usage

All four AxonFlow SDKs (Python, Go, TypeScript, Java) expose retry_context as a non-nullable field on the gate response type and accept idempotency_key on both the gate request and the complete request. The exact field names follow each language's idiomatic casing — retry_context in Python, retryContext in TypeScript and Java, RetryContext in Go — and mirror the JSON shape in this page. The include_prior_output option is exposed as a boolean parameter on the SDK's gate method.

See SDK Integration for complete client examples. The WCP lifecycle methods (step_gate, mark_step_completed, LangGraph adapters) all accept the new optional arguments without changing existing call signatures — this is an additive change.

Retry-aware policies (Evaluation tier and above)

The primitives described above — retry_context on the response, idempotency_key on the request, 409 IDEMPOTENCY_KEY_MISMATCH enforcement — are available on every tier, including Community. They are purely wire-level: agents can read and write them on any license.

What requires an Evaluation license or higher is authoring dynamic policies that reference retry state as a condition. In other words: the policy engine resolving step.gate_count > 3 or step.prior_completion_status == "gated_not_completed" at evaluation time. Seven new step.* condition fields are added in this release:

Policy condition field	Source
`step.gate_count`	Integer attempt counter
`step.completion_count`	Integer successful-complete counter
`step.prior_completion_status`	Enum: `"none"`, `"completed"`, `"gated_not_completed"`
`step.prior_output_available`	Bool
`step.last_decision`	Enum: `"allow"`, `"block"`, `"require_approval"`
`step.first_attempt_age_seconds`	Seconds between the first gate call and now
`step.idempotency_key`	The caller-supplied key, empty string if none

Policies using any of these fields unlock rule patterns that used to require custom code, for example "if a prior attempt was gated but never completed and this is attempt 3 or more, require approval", "block when rapid retries happen within 30 seconds", or "escalate severity when a step keeps hitting gated_not_completed".
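
To make the semantics of these fields concrete, the sketch below expresses two such rules as plain Python over a dict of step.* values. It is not AxonFlow policy syntax, which this page does not show; the function names and the thresholds (30 seconds, attempt 3) are illustrative only.

```python
from datetime import datetime, timezone


def first_attempt_age_seconds(first_attempt_at: str, now: datetime) -> float:
    """Seconds between the first gate call and `now` (both timezone-aware)."""
    first = datetime.fromisoformat(first_attempt_at.replace("Z", "+00:00"))
    return (now - first).total_seconds()


def example_decision(step: dict) -> str:
    """Toy evaluation of two retry-aware rules, checked in order."""
    # Rule 1: block rapid retries within 30 seconds of the first attempt.
    if step["gate_count"] > 1 and step["first_attempt_age_seconds"] < 30:
        return "block"
    # Rule 2: a third or later attempt after an unfinished one needs approval.
    if (step["prior_completion_status"] == "gated_not_completed"
            and step["gate_count"] >= 3):
        return "require_approval"
    return "allow"
```

For example, a step first gated at 15:30:45 and retried at 15:30:55 has an age of 10 seconds, so the first rule blocks it.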

Tier gating on create

On a Community license, attempting to create or update a dynamic policy that references any step.* field is rejected at create time with HTTP 403 Forbidden and the error code FEATURE_REQUIRES_EVALUATION_LICENSE:

{
  "error": {
    "code": "FEATURE_REQUIRES_EVALUATION_LICENSE",
    "message": "Retry-aware policy condition \"step.gate_count\" requires Evaluation or Enterprise license. Get a free Evaluation license at https://getaxonflow.com/evaluation-license"
  }
}

The check fires before the tenant policy-count query, so the rejection is immediate and does not consume a DB roundtrip. Evaluation and Enterprise tiers accept these fields at create and evaluate them at runtime.

UX note: retry-aware policies need `retry_policy: "reevaluate"`

By default, WCP step gates are idempotent on retry — a second /gate call with the same (workflow_id, step_id) returns the cached decision from the database without re-running the policy engine. That is exactly what you want for consistent auditability most of the time, but it means a retry-aware policy like "block after 3 attempts" will never re-fire on attempt 4 if the attempt-3 response was cached.

To make retry-aware policies evaluate on each attempt, callers must pass retry_policy: "reevaluate" on the /gate request (or the SDK equivalent). This is consistent with the cache semantics documented in the SDK Integration page.
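
The interaction between caching and a "block after 3 attempts" rule can be seen in a toy in-memory model. This is not how AxonFlow is implemented; `ToyGate` is a hypothetical class that only mimics the behaviour described above for a single step.

```python
from typing import Optional


class ToyGate:
    """In-memory model of one step's gate with a 'block after N attempts' rule."""

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.gate_count = 0
        self.stored_decision: Optional[str] = None

    def gate(self, retry_policy: Optional[str] = None) -> dict:
        self.gate_count += 1  # the counter increments on every gate call
        if self.stored_decision is None or retry_policy == "reevaluate":
            # The policy runs: on the first call, or whenever the caller asks.
            blocked = self.gate_count > self.max_attempts
            self.stored_decision = "block" if blocked else "allow"
            cached = False
        else:
            # Default behaviour: return the stored decision without re-running.
            cached = True
        return {"decision": self.stored_decision, "cached": cached,
                "gate_count": self.gate_count}
```

In this model, four default calls all return "allow" because the first decision is reused, while four calls with retry_policy="reevaluate" return "block" on the fourth.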

Migration from `cached` / `decision_source`

cached and decision_source remain on the gate response and continue to return the same values they did in earlier releases. They are deprecated but not removed; SDKs will keep exposing them for the foreseeable future, and no breaking change is planned in this release.

For new code, the direct replacements are:

Old check	New check
`gate.cached == true`	`gate.retry_context.gate_count > 1`
`gate.cached == false`	`gate.retry_context.gate_count == 1`
`gate.decision_source == "fresh"`	`gate.retry_context.prior_completion_status == "none"`, or a retry on which the policy was re-evaluated
"was there a prior run?"	Branch on `gate.retry_context.prior_completion_status`

The new fields give you more detail (prior decision, prior completion time, prior output), but the old fields still work if all you need is "did the gate re-evaluate this time?".
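
During a migration, a small shim can prefer retry_context and fall back to the legacy field when it is absent. This is a sketch for code that may still see responses without retry_context (for example, recorded fixtures from an older release, which is an assumption); the function name `is_retry` is hypothetical.

```python
def is_retry(gate: dict) -> bool:
    """True if this gate response represents a repeat attempt on the step."""
    retry_context = gate.get("retry_context")
    if retry_context is not None:
        # Authoritative: the attempt counter.
        return retry_context["gate_count"] > 1
    # Legacy fallback: cached only hints at memoization, so this is approximate.
    return bool(gate.get("cached", False))
```

The fallback branch can be deleted once every response the code handles is known to carry retry_context.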

Cooperative control plane, still

retry_context surfaces state that AxonFlow already tracks internally on the step row. It does not change the fact that the Workflow Control Plane is a cooperative control plane: AxonFlow only learns about step outcomes when the agent or orchestrator calls /gate and /complete. A step whose agent crashed between /gate and /complete shows up as prior_completion_status == "gated_not_completed" forever — there is no background reconciliation, no automatic completion, and no cross-workflow deduplication within the scope of this release.

In other words: idempotency_key prevents the agent from accidentally calling /complete twice with contradictory keys on the same step. It does not stop a determined caller from re-creating the workflow with a new workflow_id and re-submitting the same business transaction. Stronger cross-workflow guarantees are a planned future enhancement.

Related

Workflow Control Plane — /gate and /complete lifecycle, request and response shapes, decision types
SDK Integration — client examples for all four languages
Policy Configuration — writing policies that read step context
API Error Codes — IDEMPOTENCY_KEY_MISMATCH placement in the broader error-code catalog
