Payment Agent — Retry & Reconciliation

A payment agent is the canonical case where "retry-on-failure" stops being a safe default and starts becoming an expensive bug. If the downstream bank timeout raced the actual transfer, your agent's next attempt is about to move money twice. This tutorial walks through a production-minded pattern for building that agent on AxonFlow's Workflow Control Plane using two primitives that shipped in platform v7.3.0:

  • retry_context on every /gate response — tells the agent "this is attempt N, a prior attempt left the step in gated_not_completed state, the prior decision was allow"
  • idempotency_key on /gate and /complete — pins the step to a business-level identifier (invoice number, wire transfer ID, content hash) so a retry with a different key is rejected before the payment is re-posted

Both are wire-level capabilities available on every AxonFlow license tier, so this tutorial works on a Community install; no Evaluation license is required for the core flow. An Evaluation-tier extension at the end shows how to push the pattern further with retry-aware policies.

The scenario

Your agent is processing a €500 wire transfer to a vendor for invoice INV-7721. The downstream bank call:

  1. Succeeds on the bank side — the transfer is booked
  2. But the HTTP response to your agent times out at 30 seconds
  3. Your agent never sees the ack, so it never calls /complete
  4. A retry fires

Without idempotency discipline, attempt 2 posts the transfer a second time and the vendor gets paid twice. With retry_context + idempotency_key and a reconciliation step, attempt 2 sees the earlier state, asks the bank for the authoritative status, and either closes the step cleanly (if the first post landed) or re-posts (if it didn't).

Flow

The branch AxonFlow enables is the middle one: your agent can tell, without guessing, that the prior attempt left the step gated but never completed. That is the signal to reconcile with the bank instead of blindly re-posting.

Prerequisites

  • AxonFlow running locally (Getting Started). A Community license is sufficient.
  • One AxonFlow SDK installed (Python, TypeScript, Go, or Java) — or curl + bash if you prefer to stay framework-agnostic.
  • A downstream payments API that supports its own idempotency (every reputable bank API does; Stripe, Adyen, and bank core-banking APIs all accept an Idempotency-Key header).

Scope

This tutorial uses the wire-level capabilities shipped in AxonFlow v7.3.0. retry_context is returned on every gate response on every tier. idempotency_key is enforced per (workflow_id, step_id) on every tier. Cross-workflow idempotency is not in scope today — if your application retries by creating a new workflow, add your own check against a local idempotency store. See Retry Semantics & Idempotency for the full wire shape.

Step 1 — Create the workflow

Every run gets its own workflow_id. You pick the name and optionally pass a trace ID for correlation with upstream logging.

from axonflow import AxonFlow
from axonflow.workflow import CreateWorkflowRequest, WorkflowSource

# Run this inside an async function. The snippets in the later steps
# assume this `client` is still open.
async with AxonFlow(
    endpoint="http://localhost:8080",
    client_id="payment-agent",
    client_secret="your-secret",
) as client:
    workflow = await client.create_workflow(
        CreateWorkflowRequest(
            workflow_name="vendor-payment",
            source=WorkflowSource.EXTERNAL,
            trace_id="upstream-correlation-abc",
        )
    )
    workflow_id = workflow.workflow_id  # e.g. "wf_abc123"

Step 2 — First gate, pinned to the invoice

The critical line in this step is idempotency_key: "payment:wire:INV-7721". The exact string is up to you — AxonFlow treats it as opaque — but it must uniquely identify the business transaction. Common patterns:

  • "payment:wire:<invoice-id>"
  • "payment:card:<order-id>"
  • "payment:refund:<original-payment-id>"
  • SHA-256 hash of a canonical request body when you need one key per request
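
The last pattern in the list above could be sketched as follows. This is a minimal sketch, assuming that a canonical JSON encoding (sorted keys, compact separators) is an acceptable definition of "the same request" for your system. The helper names wire_key and content_hash_key and the "payment:sha256:" prefix are hypothetical and not part of the AxonFlow SDK; AxonFlow treats the resulting string as opaque.

```python
import hashlib
import json


def wire_key(invoice_id: str) -> str:
    """Build a business-level key for a wire transfer (hypothetical helper)."""
    return f"payment:wire:{invoice_id}"


def content_hash_key(request_body: dict) -> str:
    """Derive one key per distinct request body (hypothetical helper).

    Keys are sorted and whitespace is removed so that two dicts with the
    same content always serialize, and therefore hash, identically.
    """
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"payment:sha256:{digest}"
```

Because the key is derived only from the request content, a retry that rebuilds the same request after a crash arrives at the same key without needing any saved state.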

Once set on the first gate call, the key is bound to the step for its lifetime. Passing a different key on any later /gate or /complete is rejected with 409 IDEMPOTENCY_KEY_MISMATCH before downstream side-effects fire.

from axonflow.workflow import StepGateRequest, StepType

IDEMPOTENCY_KEY = "payment:wire:INV-7721"

gate = await client.step_gate(
    workflow_id=workflow_id,
    step_id="transfer",
    request=StepGateRequest(
        step_name="Wire transfer to vendor",
        step_type=StepType.TOOL_CALL,
        step_input={"amount_eur": 500, "vendor_account": "DE89370400440532013000"},
        idempotency_key=IDEMPOTENCY_KEY,
    ),
)

assert gate.decision == "allow"
assert gate.retry_context.gate_count == 1
assert gate.retry_context.prior_completion_status == "none"
assert gate.retry_context.idempotency_key == IDEMPOTENCY_KEY

Step 3 — Post to the bank, hit a timeout

Your agent posts the transfer to the bank with the same idempotency key in the bank's own idempotency header. Critical discipline: if you chose payment:wire:INV-7721 as the AxonFlow key, pass exactly that in the bank's header too. Using the same key across both sides is what makes the reconciliation step in the next section possible — you can ask the bank "did you already settle the transfer with key X?" and get an authoritative answer.

# Pseudocode; substitute your bank client.
try:
    response = bank_client.create_transfer(
        amount_eur=500,
        beneficiary_iban="DE89370400440532013000",
        idempotency_key=IDEMPOTENCY_KEY,
        timeout=30,
    )
    # Happy path: we got an ack. Skip reconciliation and go to Step 6.
except (TimeoutError, ConnectionError):
    # The transfer may or may not have landed. We don't know.
    # Do NOT retry blindly. Go to Step 4.
    pass

At this point, AxonFlow has recorded the gate decision but no /complete has landed. In the database, the step sits in a state that AxonFlow calls gated_not_completed — and that is exactly what the next gate call will reveal.

Step 4 — Retry gate, read retry_context

Your orchestration layer (LangGraph, Temporal, a cron, an upstream queue) eventually retries. Before posting anything to the bank, the agent re-calls /gate with the same workflow ID, step ID, and idempotency key.

gate = await client.step_gate(
    workflow_id=workflow_id,
    step_id="transfer",
    request=StepGateRequest(
        step_name="Wire transfer to vendor",
        step_type=StepType.TOOL_CALL,
        idempotency_key=IDEMPOTENCY_KEY,
    ),
)

assert gate.retry_context.gate_count == 2
assert gate.retry_context.prior_completion_status == "gated_not_completed"
assert gate.retry_context.last_decision == "allow"
assert gate.retry_context.idempotency_key == IDEMPOTENCY_KEY

Three signals are doing the work here:

  • gate_count: 2 — this is a retry, not a first attempt
  • prior_completion_status: "gated_not_completed" — a prior gate landed but /complete never did (the step is in flight, crashed mid-flight, or timed out on the downstream side)
  • last_decision: "allow" — the prior evaluation was allowed; if it had been block, you would not retry at all

At this point the agent must not re-post the transfer blindly. If the first post landed, a second one is a duplicate under any reasonable definition.
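
The three signals can be folded into a single decision function. The following is a minimal sketch, not SDK code: RetryContext is a stand-in dataclass carrying only the fields named in this tutorial, next_action and its return labels ("post", "reconcile", "stop", "review") are hypothetical, and any status value other than "none" or "gated_not_completed" is routed to operator review rather than guessed at.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetryContext:
    """Stand-in for the retry_context fields used in this tutorial."""
    gate_count: int
    prior_completion_status: str
    last_decision: Optional[str] = None


def next_action(ctx: RetryContext) -> str:
    """Map retry state to what the agent should do next."""
    if ctx.last_decision == "block":
        # A blocked prior evaluation means there is nothing to retry.
        return "stop"
    if ctx.prior_completion_status == "gated_not_completed":
        # A gate landed but /complete never did: ask the bank first.
        return "reconcile"
    if ctx.gate_count == 1 and ctx.prior_completion_status == "none":
        # Genuine first attempt: safe to post.
        return "post"
    # Anything unrecognized goes to a human instead of the bank.
    return "review"
```

Keeping this as a pure function makes the retry branching easy to unit test without a running AxonFlow instance or a bank sandbox.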

Step 5 — Reconcile with the bank

This is the step that AxonFlow does not do for you. retry_context tells the agent "the prior attempt is in ambiguous state"; the agent has to ask the downstream system what actually happened. Every serious payments API exposes a lookup-by-idempotency-key endpoint for exactly this reason.

# Pseudocode; substitute your bank client.
# `status.state` is a placeholder for however your provider reports the
# payment state; RetryableError is whatever retryable exception your
# orchestration layer uses.
status = bank_client.get_payment_status(idempotency_key=IDEMPOTENCY_KEY)

if status.state == "settled":
    # Branch A: the first post succeeded. Close the step.
    output = {"bank_ref": status.bank_ref, "settled_at": status.settled_at.isoformat()}
elif status.state == "not_found":
    # Branch B: the first post did not land. Re-post.
    response = bank_client.create_transfer(
        amount_eur=500,
        beneficiary_iban="DE89370400440532013000",
        idempotency_key=IDEMPOTENCY_KEY,
    )
    output = {"bank_ref": response.bank_ref, "settled_at": response.settled_at.isoformat()}
else:
    # Branch C: status is "pending" or "processing". Back off and retry later.
    raise RetryableError(f"bank still settling payment {IDEMPOTENCY_KEY}")

The bank's get_payment_status(idempotency_key=...) pattern varies by provider, but the shape is universal: Stripe calls it "idempotent retrieval", Adyen uses /payments/{paymentPspReference}, and most bank core-banking APIs expose an equivalent /payments/{reference} endpoint. You pass the same key you used on the create call, and the provider tells you whether that create actually ran.
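
To exercise the three branches without a real bank, the reconciliation logic can be run against an in-memory stand-in. This is a sketch under stated assumptions: InMemoryBank, PaymentStatus, and reconcile are hypothetical names, the "state" field and the "REF-n" references are invented for illustration, and a real provider client will have a different shape.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class RetryableError(Exception):
    """Raised when the bank has not reached a final state yet."""


@dataclass
class PaymentStatus:
    state: str                      # "settled", "pending", or "not_found"
    bank_ref: Optional[str] = None


@dataclass
class InMemoryBank:
    """Hypothetical stand-in for a bank client, keyed by idempotency key."""
    payments: Dict[str, PaymentStatus] = field(default_factory=dict)
    posts: int = 0

    def get_payment_status(self, idempotency_key: str) -> PaymentStatus:
        return self.payments.get(idempotency_key, PaymentStatus("not_found"))

    def create_transfer(self, amount_eur: int, beneficiary_iban: str,
                        idempotency_key: str) -> PaymentStatus:
        self.posts += 1
        status = PaymentStatus("settled", bank_ref=f"REF-{self.posts}")
        self.payments[idempotency_key] = status
        return status


def reconcile(bank: InMemoryBank, key: str, amount_eur: int, iban: str) -> dict:
    """Return the step output, posting to the bank only if nothing landed."""
    status = bank.get_payment_status(idempotency_key=key)
    if status.state == "settled":        # Branch A: first post succeeded
        return {"bank_ref": status.bank_ref, "reposted": False}
    if status.state == "not_found":      # Branch B: first post never landed
        created = bank.create_transfer(amount_eur, iban, idempotency_key=key)
        return {"bank_ref": created.bank_ref, "reposted": True}
    # Branch C: still pending or processing, so back off and try later
    raise RetryableError(f"bank still settling payment {key}")
```

Calling reconcile twice with the same key posts to the stand-in bank only once; the second call takes Branch A and returns the reference recorded by the first.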

Step 6 — /complete with matching key

Whichever branch you took in Step 5, when you mark the step complete you must pass the same idempotency key. A mismatch fires 409 IDEMPOTENCY_KEY_MISMATCH — AxonFlow refuses to close the step with a key that does not match the one recorded on the first gate call.

from axonflow.workflow import MarkStepCompletedRequest

await client.mark_step_completed(
    workflow_id=workflow_id,
    step_id="transfer",
    request=MarkStepCompletedRequest(
        output=output,
        idempotency_key=IDEMPOTENCY_KEY,
    ),
)

await client.complete_workflow(workflow_id)

Handling key mismatches explicitly

If an accident in the agent or a downstream state corruption sends a /complete with the wrong key (different invoice, different account, different content hash), AxonFlow refuses. SDKs expose this as a typed error that's worth catching separately from other failures — it is almost always a caller-side integrity bug, not a transient failure.

import logging

from axonflow.exceptions import IdempotencyKeyMismatchError

log = logging.getLogger(__name__)

try:
    await client.mark_step_completed(
        workflow_id=workflow_id,
        step_id="transfer",
        request=MarkStepCompletedRequest(
            output=output,
            idempotency_key="payment:wire:INV-9999",  # wrong!
        ),
    )
except IdempotencyKeyMismatchError as e:
    log.error(
        "refusing to complete step %s: expected key %s, got %s",
        e.step_id, e.expected_idempotency_key, e.received_idempotency_key,
    )
    # Do not retry with a different key. Pause for operator review.

Evaluation-tier extension — retry-aware policy

The pattern above uses only wire-level capabilities, which are available on every tier. Evaluation and Enterprise tiers unlock the ability to author dynamic policies whose conditions read retry state directly. That lets you push "any retry past 5 minutes needs human approval" out of application code and into declarative policy:

curl -X POST http://localhost:8080/api/v1/policies \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic $(echo -n 'payment-agent:your-secret' | base64)" \
  -d '{
    "name": "slow-payment-retry-needs-approval",
    "description": "Any payment retry older than 5 minutes that never completed requires human approval",
    "type": "context_aware",
    "category": "dynamic-compliance",
    "priority": 900,
    "enabled": true,
    "conditions": [
      {"field": "step.prior_completion_status", "operator": "equals", "value": "gated_not_completed"},
      {"field": "step.first_attempt_age_seconds", "operator": "greater_than", "value": 300}
    ],
    "actions": [
      {"type": "require_approval", "config": {"reason": "Retry on un-completed payment older than 5 minutes: verify downstream state before re-posting", "severity": "high"}}
    ]
  }'

A few things to note about the shape:

  • conditions is a flat array, ANDed implicitly — every condition must match.
  • Operators are equals, not_equals, contains, greater_than, less_than, regex, in (not eq / gt).
  • actions is a plural array with each action wrapped in {type, config}. The reason and severity live inside config, not at the top level.
  • type must be one of content, user, risk, cost, context_aware, media, rate-limit, budget, time-access, role-access, mcp, connector. Retry-aware policies are context_aware.
  • category must start with dynamic- or media-. Use dynamic-compliance for retry-aware rules.
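
The shape rules above can be checked locally before the request is sent. The following is a minimal sketch, not AxonFlow's own validation: validate_policy is a hypothetical helper that encodes only the rules listed in this section, and the server remains the authority on what is accepted.

```python
ALLOWED_TYPES = {
    "content", "user", "risk", "cost", "context_aware", "media",
    "rate-limit", "budget", "time-access", "role-access", "mcp", "connector",
}
ALLOWED_OPERATORS = {
    "equals", "not_equals", "contains", "greater_than", "less_than", "regex", "in",
}
CATEGORY_PREFIXES = ("dynamic-", "media-")


def validate_policy(policy: dict) -> list:
    """Return a list of human-readable shape errors (empty if none found)."""
    errors = []
    if policy.get("type") not in ALLOWED_TYPES:
        errors.append(f"unknown type: {policy.get('type')!r}")
    if not str(policy.get("category", "")).startswith(CATEGORY_PREFIXES):
        errors.append("category must start with dynamic- or media-")
    for i, cond in enumerate(policy.get("conditions", [])):
        if cond.get("operator") not in ALLOWED_OPERATORS:
            errors.append(f"conditions[{i}]: unknown operator {cond.get('operator')!r}")
    for i, action in enumerate(policy.get("actions", [])):
        if "type" not in action:
            errors.append(f"actions[{i}]: missing type")
        # reason and severity belong inside config, not at the top level
        for misplaced in ("reason", "severity"):
            if misplaced in action:
                errors.append(f"actions[{i}]: move {misplaced} into config")
    return errors
```

A check like this catches the common slips, such as writing gt instead of greater_than, before the policy request leaves your machine.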

On a Community license this request is rejected at create time with 403 FEATURE_REQUIRES_EVALUATION_LICENSE because any step.* condition requires the Evaluation tier. On Evaluation or Enterprise it's accepted and, combined with retry_policy: "reevaluate" on subsequent gate calls, upgrades a long-running gated_not_completed state from "the agent is supposed to reconcile" into "a human approves the reconciliation path." See Retry Semantics & Idempotency — Retry-aware policies for the full field list and the rejection envelope.

What this protects you from

This pattern closes four specific failure modes that every payment-agent team eventually meets in production:

  1. Duplicate posts on upstream timeout — the canonical case in the walkthrough
  2. Agent crash between /gate and /complete (prior_completion_status: "gated_not_completed" flags the step for reconciliation on the next run)
  3. Wrong transaction pinned to an approved step — the key mismatch check on /complete refuses to close a step with a key that doesn't match the one recorded on the first gate
  4. Stale application state reusing a completed step's key — a later /gate on a completed step with a mismatched key is also rejected

What this pattern does not cover:

  • Cross-workflow replay — if your orchestration layer creates a new workflow_id for the retry, AxonFlow's per-step enforcement does not span workflow IDs in this release. Keep your own idempotency store keyed on the business-level identifier (invoice number, order ID) and check it before creating the replacement workflow. Cross-workflow enforcement is a planned future enhancement.
  • Downstream reconciliation logic itself — AxonFlow surfaces the retry state; your agent still needs to call the bank's lookup endpoint and act on the result.
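
For the cross-workflow case, the local idempotency store could look something like the sketch below. It is illustrative only: IdempotencyStore, its table layout, and the lookup/record methods are hypothetical, SQLite is used purely because it ships with Python, and a production deployment would need a store shared by every process that can create workflows.

```python
import sqlite3
from typing import Optional


class IdempotencyStore:
    """Hypothetical local store mapping business keys to workflow IDs."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS claims ("
            "business_key TEXT PRIMARY KEY, workflow_id TEXT NOT NULL)"
        )

    def lookup(self, business_key: str) -> Optional[str]:
        """Return the workflow ID already bound to this key, if any."""
        row = self._db.execute(
            "SELECT workflow_id FROM claims WHERE business_key = ?",
            (business_key,),
        ).fetchone()
        return row[0] if row else None

    def record(self, business_key: str, workflow_id: str) -> str:
        """Bind the key to workflow_id unless already bound; return the owner."""
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR IGNORE INTO claims VALUES (?, ?)",
                (business_key, workflow_id),
            )
        return self.lookup(business_key)
```

Before creating a replacement workflow, the orchestration layer would call lookup with the business key (for example "payment:wire:INV-7721") and, if it returns an ID, resume that workflow instead of creating a new one.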