v8.0.0 Enterprise Migration Guide
This guide is the detailed enterprise companion to the v7 to v8 Migration Guide. It covers multi-tenant DSN secrets wiring, customer-portal admin handlers, NodeMonitor, Marketplace and community-saas-mode env combinations, the customer-portal API consumer breaking change, and the source-fork audit recipe.
If you are running a single-tenant self-hosted Community deployment, the simple-path community upgrade in the public guide is sufficient — you do not need this page.
Audience
This guide is for operators of:
- In-VPC Enterprise deployments running customer-portal admin handlers, multi-node enforcement, or AWS Marketplace metering.
- Community-SaaS mode deployments running cross-org workers (sweep, recovery, tenant-delete).
- Self-hosted source forks with customized handlers writing to FORCE-RLS-enabled tables.
Deployment-shape decision recap
From the public guide, the env-var requirement matrix:
| Deployment shape | AXONFLOW_DB_APP_ROLE_URL | AXONFLOW_DB_PLATFORM_ADMIN_URL |
|---|---|---|
| Community, single-tenant self-hosted | Yes (or DATABASE_URL fallback in dev) | No |
| Enterprise self-hosted with customer-portal | Yes | Yes |
| Enterprise self-hosted with multi-node enforcement | Yes | Yes |
| Enterprise self-hosted with AWS Marketplace metering | Yes | Yes |
| Community-SaaS mode (sweep / recovery / tenant-delete) | Yes | Yes |
| Single-tenant enterprise without any of the above | Yes | No |
If your deployment matches any row above that requires AXONFLOW_DB_PLATFORM_ADMIN_URL, the rest of this guide applies.
Step-by-step enterprise upgrade
1. Take a fresh database snapshot
The rollback contract for v8.0.0 is snapshot restore + image revert. Per-batch _down.sql migration files exist (099_v9_rls_b1_sparse_tables_down.sql, 103_v9_rls_b9_identity_force_down.sql, etc.) but they revert one batch, not the full identity-model rollback. For a full rollback, you need a snapshot taken immediately before the upgrade.
- AWS RDS: trigger a manual snapshot via the AWS console or with aws rds create-db-snapshot --db-instance-identifier <id> --db-snapshot-identifier <name>. Wait for the snapshot status to reach "available" before proceeding.
- Self-managed Postgres: use your existing backup tool. Confirm the backup completed successfully.
- Containerised Postgres in your own VPC: take an EBS volume snapshot of the underlying disk, OR run a pg_dump to a separate object store.
Record the snapshot identifier. You'll need it for rollback.
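For AWS RDS, the wait can be scripted instead of polled by hand. The sketch below assumes the AWS CLI is installed and configured for the right account and region; wait_for_snapshot is a hypothetical helper name, not something shipped with the platform.

```shell
# wait_for_snapshot is a hypothetical helper, not part of the shipped tooling.
# Usage: wait_for_snapshot <snapshot-identifier>
wait_for_snapshot() {
  snapshot_id="$1"
  # Built-in AWS CLI waiter: polls until the snapshot reaches "available".
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$snapshot_id" || return 1
  # Print the final status so it can be recorded next to the identifier.
  aws rds describe-db-snapshots \
    --db-snapshot-identifier "$snapshot_id" \
    --query 'DBSnapshots[0].Status' --output text
}
```

Calling wait_for_snapshot <name> right after create-db-snapshot keeps the upgrade from starting against a snapshot that is still being created.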
2. Confirm ORG_ID and DEPLOYMENT_KIND env on the agent task
ORG_ID is the deployment's customer/account identifier. Migration 094's Pass-2 backfill stamps this onto every historical empty-org_id row.
| Hosting | Where ORG_ID lives |
|---|---|
| AWS Marketplace / CFN cloudformation-ecs-fargate.yaml | OrganizationID CFN parameter → agent task def's ORG_ID env |
| Docker Compose (self-hosted enterprise) | ORG_ID line in docker-compose.enterprise.yml or .env |
| Custom Kubernetes / EC2 / on-prem | ORG_ID env var on the agent container/process |
If ORG_ID is unset on a production deployment, the agent's deployment-org getter falls back to local-dev-org, and migration 094 will refuse to run (the production-safety guard). Set ORG_ID and redeploy before continuing.
DEPLOYMENT_KIND distinguishes a real production deployment from a local docker-compose / community-mode install. Migration 094 uses it as a defence-in-depth signal alongside ORG_ID.
- CFN-deployed stacks: DEPLOYMENT_KIND=production is hardcoded in cloudformation-ecs-fargate.yaml. No action needed.
- Docker Compose: defaults to dev. If this docker-compose is your production deployment shape, override it by setting DEPLOYMENT_KIND=production in your shell or .env file before running docker-compose up -d.
- Custom hosting: set DEPLOYMENT_KIND=production on the agent task.
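Both values can be sanity-checked in the shell that will launch the agent before anything is deployed. This is a minimal sketch of the two rules described above; check_identity_env is a hypothetical helper, and it does not replace the preflight script or migration 094's own guard.

```shell
# check_identity_env is a hypothetical local pre-check; it is NOT the
# preflight script and does not replace migration 094's own guard.
check_identity_env() {
  # An unset ORG_ID falls back to local-dev-org, which migration 094 refuses.
  if [ -z "${ORG_ID:-}" ] || [ "${ORG_ID:-}" = "local-dev-org" ]; then
    echo "FAIL: ORG_ID is unset or is the local-dev-org fallback"
    return 1
  fi
  # Docker Compose defaults DEPLOYMENT_KIND to dev, so treat unset as dev.
  if [ "${DEPLOYMENT_KIND:-dev}" != "production" ]; then
    echo "WARN: DEPLOYMENT_KIND=${DEPLOYMENT_KIND:-dev} (expected production)"
    return 0
  fi
  echo "PASS: ORG_ID=${ORG_ID} DEPLOYMENT_KIND=production"
}
```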
3. Run the preflight script
The preflight script lives at scripts/deployment/v9_self_hosted_preflight.sh in the getaxonflow/axonflow-enterprise repository (BSL 1.1, source-available). It is dependency-free (bash + psql + optionally aws CLI for ECS discovery). Download it without cloning the repo:
curl -fsSLO https://raw.githubusercontent.com/getaxonflow/axonflow-enterprise/main/scripts/deployment/v9_self_hosted_preflight.sh
chmod +x v9_self_hosted_preflight.sh
To build DATABASE_URL on a Marketplace CFN stack, read the RDS endpoint and the master-password secret ARN from the stack outputs. The template stores the RDS master password in AWS Secrets Manager:
STACK_NAME=<your-stack-name>
DB_ENDPOINT=$(aws cloudformation describe-stacks --stack-name "$STACK_NAME" \
--query 'Stacks[0].Outputs[?OutputKey==`DatabaseEndpoint`].OutputValue' --output text)
DB_SECRET_ARN=$(aws cloudformation describe-stacks --stack-name "$STACK_NAME" \
--query 'Stacks[0].Outputs[?OutputKey==`DatabaseSecretArn`].OutputValue' --output text)
DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id "$DB_SECRET_ARN" \
--query SecretString --output text | jq -r .password)
export DATABASE_URL="postgres://axonflow:${DB_PASSWORD}@${DB_ENDPOINT}:5432/axonflow?sslmode=require"
Run the preflight:
DEPLOYMENT_KIND=production \
ORG_ID=acme-corp \
RDS_INSTANCE_IDENTIFIER=axonflow-acme-prod-db \
./v9_self_hosted_preflight.sh
The script reports PASS/WARN/FAIL across eight checks. Do not proceed if the script returns a FAIL.
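In an automated pipeline it helps to turn that rule into a hard gate. The wrapper below is a sketch only: run_gate is a hypothetical name, and it assumes a failed check shows up either as a non-zero exit status or as the literal word FAIL in the output, which should be confirmed against the real script.

```shell
# run_gate is a hypothetical wrapper. It assumes the wrapped command signals
# a failed check through a non-zero exit status or the literal word FAIL in
# its output; adjust the pattern if the real output differs.
# Usage: run_gate <log-file> <command> [args...]
run_gate() {
  log_file="$1"; shift
  # Keep the full report on disk for the change record.
  if "$@" >"$log_file" 2>&1; then status=0; else status=$?; fi
  if [ "$status" -ne 0 ] || grep -q 'FAIL' "$log_file"; then
    echo "blocked: see $log_file"
    return 1
  fi
  echo "clear"
}
```

With DEPLOYMENT_KIND, ORG_ID, and RDS_INSTANCE_IDENTIFIER exported, run_gate preflight.log ./v9_self_hosted_preflight.sh stops the pipeline on any failed check and leaves the report in preflight.log.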
4. Provision axonflow_app_role and axonflow_platform_admin
v8.0.0 ships scripts/operators/provision-app-role.sh:
curl -fsSLO https://raw.githubusercontent.com/getaxonflow/axonflow-enterprise/main/scripts/operators/provision-app-role.sh
chmod +x provision-app-role.sh
# Operator supplies the passwords — the script does not generate or print them:
export APP_ROLE_PASSWORD="$(openssl rand -base64 32 | tr -d '/+=' | head -c 32)"
export PLATFORM_ADMIN_PASSWORD="$(openssl rand -base64 32 | tr -d '/+=' | head -c 32)"
# Stash both passwords in your secrets manager BEFORE running:
DATABASE_URL="$DATABASE_URL" ./provision-app-role.sh
The script is idempotent. Re-running with mismatched passwords exits 1 unless you set FORCE_RESET=1 to rotate.
Construct the two DSN env vars:
AXONFLOW_DB_APP_ROLE_URL=postgres://axonflow_app_role:$APP_ROLE_PASSWORD@<host>:5432/axonflow?sslmode=require
AXONFLOW_DB_PLATFORM_ADMIN_URL=postgres://axonflow_platform_admin:$PLATFORM_ADMIN_PASSWORD@<host>:5432/axonflow?sslmode=require
Store both in your secrets manager and reference them from your agent + customer-portal + orchestrator task definitions:
- CFN-deployed stacks: store as two secrets, axonflow/<stack>/app-role-url and axonflow/<stack>/platform-admin-url; reference them via Secrets: in the task def YAML.
- Docker Compose: load via a .env file or the --env-file flag.
- Kubernetes: mount as Secret-backed env vars.
5. Per-feature env-var combinations
Required env vars by deployment feature:
| Feature | AXONFLOW_DB_APP_ROLE_URL | AXONFLOW_DB_PLATFORM_ADMIN_URL | Additional env |
|---|---|---|---|
| Stock in-VPC enterprise (no portal, no marketplace, no multi-node) | required | not required | — |
| Customer-portal admin handlers | required | required | ADMIN_API_KEY (from secrets manager) |
| Multi-node enforcement | required | required | ENABLE_NODE_MONITOR=true |
| AWS Marketplace metering | required | required | Marketplace CFN parameters (auto-wired) |
| Community-SaaS mode (sweep + recovery + tenant-delete) | required | required | COMMUNITY_SAAS_SWEEP_ENABLED=true, COMMUNITY_SAAS_REGISTRATION_ENABLED=true |
| GDPR right-to-erasure endpoints (/api/v1/tenant/{id}/delete-*) | required | required | — |
Setting AXONFLOW_DB_PLATFORM_ADMIN_URL to an empty string is treated the same as unset — the refuse-to-boot guard fires.
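Because empty and unset behave the same, a deploy script can catch a blank value before the service ever boots. The sketch below mirrors that rule locally; require_nonempty is a hypothetical helper, and the real guard lives in the service itself.

```shell
# require_nonempty is a hypothetical helper. It treats an empty value the
# same as an unset one, matching the rule described above.
# Usage: require_nonempty VAR_NAME [VAR_NAME...]
require_nonempty() {
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      return 1
    fi
  done
  echo "ok"
}
```

A stack running any cross-org feature would call require_nonempty AXONFLOW_DB_APP_ROLE_URL AXONFLOW_DB_PLATFORM_ADMIN_URL before starting the task.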
Customer-portal API consumer breaking change (v8.0.0)
Affects: direct HTTP API consumers of the customer-portal role APIs. In v8+ the current session-authenticated routes are /api/v1/roles, /api/v1/roles/{id}, /api/v1/users/{email}/roles, and /api/v1/users/{email}/roles/{roleId}. Older internal clients and notes may refer to the pre-v8 customer-portal role paths.
Not affected: the bundled customer-portal UI (ee/platform/customer-portal-ui), which consumes its own backend and updates atomically with the schema rename.
The custom_roles.tenant_id and role_assignments.tenant_id columns were renamed to org_id (migration 111 in the v8.0.0 series). The JSON field in the API response payloads was renamed correspondingly:
- Pre-v8.0.0: {"tenant_id": "acme-corp", "role_name": "...", ...}
- v8.0.0+: {"org_id": "acme-corp", "role_name": "...", ...}
If you wrote a tool that consumes the customer-portal HTTP API for roles or role-assignments directly (your own admin scripts, an SSO provisioning integration, a compliance audit pipeline that polls the role list), update your consumer to read org_id instead of tenant_id. The semantic meaning is unchanged — same row, same value, renamed JSON field.
If you write to these endpoints, the request payload accepts both tenant_id and org_id for one compatibility window; the new canonical field is org_id.
Find your consumers
To find code in your own repos that reads these endpoints or older customer-portal role paths:
grep -rnE '(customer-portal/(roles|role-assignments)|/api/v1/(roles|users/.*/roles))' --include='*.py' --include='*.js' --include='*.ts' --include='*.go' .
For each match, check the response handler for tenant_id references in the role / role-assignment payload — those need to become org_id.
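A consumer that may talk to both pre-v8 and v8 portals during a rolling upgrade can read the new field first and fall back to the old one. This is a minimal sketch using jq; role_org is a hypothetical helper, and the payload shape is the one shown in the before/after examples above.

```shell
# role_org is a hypothetical helper. It reads one role object (JSON) on
# stdin and prints its organisation identifier, preferring the v8 field
# and falling back to the pre-v8 name. Prints nothing if neither exists.
role_org() {
  jq -r '.org_id // .tenant_id // empty'
}
```

Once every environment is on v8.0.0 or later, the tenant_id fallback can be dropped.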
Source-fork audit recipe (customized handlers must wrap FORCE-RLS writes)
This is the full audit recipe, broader in scope than the brief mention in the public guide.
Step A — grep for direct write sites
In your fork's root:
grep -rnE 'db\.(Exec|QueryRow|Query)|tx\.(Exec|QueryRow|Query)' --include='*.go' \
-- platform/ ee/ | grep -E 'INSERT|UPDATE|DELETE'
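On a large fork the raw match list is easier to work through when grouped by file. The sketch below wraps the same grep and prints a per-file count; write_site_worklist is a hypothetical helper. Like the grep it wraps, it only sees SQL keywords on the same line as the call, so queries built in a variable or split across lines still need a manual look.

```shell
# write_site_worklist is a hypothetical helper wrapping the grep above.
# It prints "<count> <file>" for every Go file with candidate write calls,
# busiest files first.
# Usage: write_site_worklist <dir> [<dir>...]
write_site_worklist() {
  grep -rnE 'db\.(Exec|QueryRow|Query)|tx\.(Exec|QueryRow|Query)' \
    --include='*.go' -- "$@" \
    | grep -E 'INSERT|UPDATE|DELETE' \
    | cut -d: -f1 | sort | uniq -c | sort -rn
}
```

Running write_site_worklist platform/ ee/ from the fork root gives the list of files to take into Step B.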
Step B — check the surrounding function for each match
| Pattern | Verdict |
|---|---|
| Call is inside a WithOrgScope(ctx, db, orgID, func(tx *sql.Tx) error { ... }) closure | Safe |
| Call is inside a withRequestOrgScope(r, h.db, fn) closure (customer-portal) | Safe |
| Call is on a *sql.DB opened via OpenPlatformAdminConnection() | Safe (admin pool, BYPASSRLS) |
| Call goes through a SECURITY DEFINER helper (auth_lookup_api_key(), etc.) | Safe |
| None of the above | Needs a wrapper before flipping AXONFLOW_DB_USE_APP_ROLE=true |
Step C — identify which RLS-enabled tables are touched
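Run psql -c '\d <table>' on each candidate table. For a FORCE-RLS table, the policy section of the output is headed "Policies (forced row security enabled):"; a table with RLS enabled but not forced shows a plain "Policies:" heading. FORCE-RLS-enabled tables in v8.0.0 include (non-exhaustive):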
- organizations, tenants
- audit_logs, mcp_query_audits, audit_archive, audit_retention_config, decision_chain
- dynamic_policies, policy_overrides
- connector_configs, connectors, agent_heartbeats, node_violations
- community_saas_registrations
- saml_configurations
- usage_events, deployment_upgrades
- api_keys, customers
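Since the list above is non-exhaustive, the authoritative set can be read from the Postgres catalog in a single query. The sketch below uses pg_class.relforcerowsecurity, the catalog flag behind FORCE ROW LEVEL SECURITY; list_force_rls_tables is a hypothetical helper, and the public schema is an assumption to adjust if your tables live elsewhere.

```shell
# list_force_rls_tables is a hypothetical helper. It assumes DATABASE_URL is
# exported and that the platform tables live in the public schema.
list_force_rls_tables() {
  # -A: unaligned output, -t: rows only, so each line is one table name.
  psql "$DATABASE_URL" -At -c "
    SELECT c.relname
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'public'
      AND c.relkind IN ('r', 'p')
      AND c.relforcerowsecurity
    ORDER BY c.relname"
}
```

Comparing this output with the tables your customized handlers write to narrows the audit to the call sites that can fail under axonflow_app_role.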
Step D — staged rollout
- Stage 1: in staging, set AXONFLOW_DB_USE_APP_ROLE=true and provision both DSNs. Carry production-shaped traffic through at least one full diurnal cycle.
- Stage 2: watch agent + orchestrator + customer-portal logs for pq: new row violates row-level security policy. Each line names the table the customized handler tried to write outside an org-scoped transaction.
- Stage 3: also watch for silent zero-row writes. DELETE under FORCE RLS evaluates USING (not WITH CHECK), so a DELETE outside org scope filters silently to zero rows without error. The failure signature is "row STILL EXISTS after a 'successful' response", not "500 / 42501". Audit your customized DELETE call sites accordingly.
- Stage 4: if violations surface, set AXONFLOW_DB_USE_APP_ROLE=false in staging to keep traffic flowing while you fix the handler. Re-flip once the violation queue is clean.
- Stage 5: production flip after staging is clean for one diurnal cycle.
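For Stage 2, a per-table tally of violations is usually more useful than scrolling raw logs. The sketch below reads log lines on stdin and counts violations by table name; rls_violations_by_table is a hypothetical helper, and it assumes the Postgres error text reaches the log with its quotes unescaped (structured loggers that escape quotes would need a different pattern).

```shell
# rls_violations_by_table is a hypothetical helper. It reads log lines on
# stdin and prints "<table>=<count>" for each table named in a Postgres
# row-level security violation message.
rls_violations_by_table() {
  # Keep only the table name from lines carrying the violation message.
  sed -n 's/.*row-level security policy.*for table "\([^"]*\)".*/\1/p' \
    | sort | uniq -c | awk '{print $2 "=" $1}'
}
```

Piping a day of staging logs through it gives the violation queue referred to in Stage 4. It cannot see the silent zero-row DELETE case from Stage 3, which produces no log line at all.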
Customer-portal admin handlers — DSN format detail
The customer-portal admin API key + DSN wiring landed in v8.0.0:
- The AdminAPIKeySecret CFN resource generates a strong API key, stored in AWS Secrets Manager.
- The customer-portal task definition references the secret as the ADMIN_API_KEY env var.
- Customer-portal admin handlers compare the incoming X-Admin-API-Key header against ADMIN_API_KEY; anonymous requests get HTTP 401.
- Admin handlers open AXONFLOW_DB_PLATFORM_ADMIN_URL for cross-org operations (org create / list / quota update / deletion).
The seed workflow that creates the first org on a fresh stack passes the API key via a masked X-Admin-API-Key header.
NodeMonitor + Marketplace + customer-portal admin DSN combinations
If your stack runs multiple cross-org features simultaneously:
# Agent task def env block (cloudformation-ecs-fargate.yaml):
- Name: AXONFLOW_DB_USE_APP_ROLE
  Value: "true"
- Name: AXONFLOW_DB_APP_ROLE_URL
  ValueFrom: !Ref AppRoleDSNSecretArn
- Name: AXONFLOW_DB_PLATFORM_ADMIN_URL
  ValueFrom: !Ref PlatformAdminDSNSecretArn
- Name: ENABLE_NODE_MONITOR
  Value: "true"
- Name: COMMUNITY_SAAS_SWEEP_ENABLED
  Value: "false"  # only true on csaas-mode stacks
- Name: ENABLE_AWS_MARKETPLACE_METERING
  Value: "true"
# Customer-portal task def env block:
- Name: AXONFLOW_DB_USE_APP_ROLE
  Value: "true"
- Name: AXONFLOW_DB_APP_ROLE_URL
  ValueFrom: !Ref AppRoleDSNSecretArn
- Name: AXONFLOW_DB_PLATFORM_ADMIN_URL
  ValueFrom: !Ref PlatformAdminDSNSecretArn
- Name: ADMIN_API_KEY
  ValueFrom: !Ref AdminAPIKeySecretArn
- Name: ENVIRONMENT
  Value: "production"  # gates HandleForgotPassword to HTTP 501 in production
Rollback path
Before migration 094 has stamped historical rows
If you catch the issue while the agent is still in its boot loop (migration 094 prod-safety branch fired and refused to stamp anything):
- Revert the image tag to the release you were running before the upgrade (pre-v8.0.0).
- Restart the agent + orchestrator services.
- No DB rollback needed — migration 094 either ran cleanly or aborted; no partial-state in between.
After migration 094 has run successfully
Snapshot-restore path. The schema additions are not destructive (new columns + new roles), but migration 094 stamped historical rows with your ORG_ID value. Reverting the platform image will not reverse those stamps.
- Stop the agent + orchestrator services (or scale to 0).
- Restore the pre-upgrade snapshot, using whichever path matches your hosting:
  - AWS RDS rename-then-restore (recommended): rename the live instance, then restore the snapshot with the original identifier. The agent's existing DATABASE_URL resolves to the restored instance.
    aws rds modify-db-instance --db-instance-identifier <live> --new-db-instance-identifier <live>-broken --apply-immediately
    aws rds restore-db-instance-from-db-snapshot --db-instance-identifier <live> --db-snapshot-identifier <snapshot>
    Without --apply-immediately, RDS defers the rename to the next maintenance window. Wait for the rename to finish before running the restore.
  - Restore-with-new-name + Route 53: restore the snapshot with a new identifier, then update DATABASE_URL (CFN parameter) to the new endpoint, or use a Route 53 CNAME to repoint.
  - Self-managed Postgres: stop the live instance, restore from your backup tool, restart.
- Revert the agent + orchestrator + customer-portal image tags to the release you were running before the upgrade (pre-v8.0.0).
- Restart services.
Per-batch rollback for FORCE RLS
Each FORCE RLS migration ships with a _down.sql pair. These revert FORCE on one batch without reverting the broader v8.0.0 schema. Operators do not normally need these; the snapshot-restore path is the contract.
Recommended SDK + plugin floors
| SDK / Plugin | Recommended | Minimum |
|---|---|---|
| Go SDK | v8.1.0 | v8.0.0 |
| Python SDK | v8.1.0 | v8.0.0 |
| TypeScript SDK | v8.1.0 | v8.0.0 |
| Java SDK | v8.1.0 | v8.0.0 |
| Rust SDK | v0.3.1 | v0.2.0 |
| Plugin (claude / cursor / codex) | v1.5.0 | v1.4.0 |
| openclaw | v2.5.0 | v2.4.0 |
The platform's /health endpoint advertises the current floor — client SDKs older than the minimum emit a warning but continue to work through the entire v8.0.0 lifecycle (deprecated-alias compatibility window).
Additional v8.0.0 guarantees in this area
- Agent
  agent_heartbeats UPSERT under FORCE RLS: the agent's sendHeartbeat path wraps the UPSERT in an org-scoped transaction so the heartbeat write succeeds under axonflow_app_role against the FORCE-RLS-enabled table (migration 107). Operators staging the role flip do not need to watch for heartbeat-related RLS violations as a special case; the stock handler is RLS-correct.
- AST audit walker for write-path coverage: a build-time guard preventing customized writes from bypassing the three wrap patterns is in place; CI fails on any unwrapped INSERT/UPDATE/DELETE into an RLS-gated table. The source-fork audit recipe in Steps A to D above is CI-enforced on the stock codebase, so operators maintaining forks inherit the same guard when they pick up the v8.0.0 codebase and customize on top. Step A's manual grep stays useful for pre-merge local checks; Step D's staged rollout is backstopped by the build-time guarantee that no new bypass crept in.
See also
- v7 → v8 Migration Guide — the public-facing companion (deployment-shape decision tree + simple-path community upgrade)
- v8.0.0 Self-Hosted Upgrade Guide — operator-focused stage-by-stage upgrade
- Customer-Portal Admin API — protected admin handler reference
- Audit Logging Guide — what gets logged under FORCE RLS
Enterprise Rollout Checklist
Use this page as part of the protected enterprise operating model:
- confirm the deployment shape in Deployment Operations
- check identity and access requirements in Authentication, SSO Configuration, and SCIM Overview
- connect governance workflows to Policy Management, Approvals Queue, and Audit Logging Guide
- use Support Escalation when the rollout needs escalation paths, incident context, or production-readiness review
