
Phase 18 — Per-engagement bucket migration script

Last updated: 2026-05-08

scripts/auditforge_migrate_buckets.py migrates engagements from the shared platform bucket to per-engagement isolated buckets. This completes the Phase 7 isolation feature for engagements that already existed when the feature flag was off — a real procurement-review checkbox for partner-firm pitches ("all client data is isolated").

Safety properties

  • Dry-run by default: the --execute flag is required to mutate anything
  • Interactive confirmation: the --yes flag is required to skip the [y/N] prompt
  • Source data preserved: source-bucket objects are never deleted; rollback is a single record edit (set source_bucket back to "")
  • Idempotent: re-running on a migrated engagement is a no-op (the record already has source_bucket); re-running mid-migration copies only the missing keys
  • Verification: after the copy, the script lists the target bucket and asserts that object count and total size match the source plan; the record update happens only on successful verification
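The idempotence and verification properties above can be sketched as pure helpers. These are hypothetical names and shapes for illustration, not the script's actual internals:

```python
# Hypothetical helpers illustrating the idempotence and verification
# properties; the real script's internals may differ.

def keys_to_copy(plan: dict[str, int], already_in_target: set[str]) -> list[str]:
    """Idempotence: a re-run copies only keys missing from the target."""
    return [key for key in plan if key not in already_in_target]

def verify_copy(plan: dict[str, int], target_listing: dict[str, int]) -> bool:
    """Post-copy check: object count and total size must match the source plan."""
    return (
        len(target_listing) == len(plan)
        and sum(target_listing.values()) == sum(plan.values())
    )
```

The engagement-record update is gated on the verification returning true, so a failed or partial copy never flips an engagement over to the new bucket.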

What gets copied

For each engagement, the script enumerates and copies every object under these prefixes from settings.BUCKET_NAME to metis-af-<short_id>-<account_id>:

auditforge/engagements/<engagement_id>/...      # findings.json, audit_log shards, source/, index/
auditforge/deliverables/<engagement_id>.<ext>   # cached rendered deliverables (json/md/docx)

Per-engagement S3 keys retain their full path inside the new bucket. The shape of findings_key, audit_log_key, etc. on the engagement record stays valid after migration — the only change is which bucket those keys resolve to (the engagement's source_bucket).
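As an illustration, the bucket-name pattern and key preservation can be sketched as below. The short-ID derivation (stripping the eng- prefix) is an assumption inferred from the sample plan output, not confirmed from the script:

```python
def target_bucket_name(engagement_id: str, account_id: str) -> str:
    # Assumed derivation: strip the "eng-" prefix to get the short ID,
    # matching the sample output (eng-6715f196fa40 ->
    # metis-af-6715f196fa40-741783034843).
    short_id = engagement_id.removeprefix("eng-")
    return f"metis-af-{short_id}-{account_id}"

def target_key(source_key: str) -> str:
    # Keys are preserved verbatim: only the bucket they resolve to changes.
    return source_key
```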

CLI

# Plan (dry-run) all unmigrated engagements
python scripts/auditforge_migrate_buckets.py

# Plan a single engagement
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc

# Execute against one engagement (interactive confirm)
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc --execute

# Execute the whole list non-interactively
python scripts/auditforge_migrate_buckets.py --execute --yes

# Re-verify a previous migration without copying
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc --verify-only
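The flag surface above maps onto a small argparse parser; a minimal sketch, with defaults chosen to reflect the dry-run-by-default and interactive-confirm safety properties (the real parser may define more options):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Sketch of the CLI shown above; defaults reflect the safety properties:
    # dry-run unless --execute, interactive prompt unless --yes.
    p = argparse.ArgumentParser(prog="auditforge_migrate_buckets.py")
    p.add_argument("--engagement-id", help="limit the run to one engagement")
    p.add_argument("--execute", action="store_true", help="actually copy (default: dry-run)")
    p.add_argument("--yes", action="store_true", help="skip the [y/N] prompt")
    p.add_argument("--verify-only", action="store_true", help="re-verify without copying")
    return p
```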

Sample dry-run output:

Source bucket: mobilemetis-metis-indexes-741783034843

======================================================================
Engagement             Target bucket                                   Objs    MB
----------------------------------------------------------------------
eng-6715f196fa40       metis-af-6715f196fa40-741783034843               12   2.83
eng-3f06f14ea94a       metis-af-3f06f14ea94a-741783034843                7   0.41
======================================================================
Total: 2 engagements, 19 objects, 3.24 MB

Where to run it

The script needs IAM access to:

  • List + read objects in settings.BUCKET_NAME (the shared source bucket)
  • Create buckets + put objects in metis-af-* (the new per-engagement buckets)

Practical options:

  1. ECS Exec into the running task (easiest; the task role already has both):
     aws ecs execute-command --cluster metis-demo --task <task-arn> --interactive --command "/bin/bash"
     then python /app/scripts/auditforge_migrate_buckets.py --execute --yes
  2. Local laptop with admin AWS credentials: works if your ~/.aws/credentials has full S3 + STS permissions on the account.
  3. One-off ECS task: run a separate task with the same task role and a custom command override.

If the script encounters AccessDenied (e.g., when run locally without prod IAM), it logs the error and exits 0 with an empty plan: a non-destructive failure mode.
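A minimal sketch of that failure mode, assuming botocore-style exceptions (which carry the error code in a .response attribute). The lister callable is injected so the logic is testable without AWS; names here are illustrative, not the script's API:

```python
NON_FATAL_CODES = {"AccessDenied", "NoSuchBucket"}

def safe_plan(list_source_objects):
    """Build a plan, or return an empty one on expected S3 errors.

    Designed for botocore.exceptions.ClientError, which exposes the error
    code via .response; any exception shaped that way is handled the same,
    so the function can be exercised with a stub exception class.
    """
    try:
        return list(list_source_objects())
    except Exception as exc:
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in NON_FATAL_CODES:
            print(f"S3 error {code}: no plan produced, exiting 0")
            return []  # non-destructive: nothing was mutated
        raise  # anything unexpected still surfaces loudly
```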

Rollback

If a partner reports a problem after migration:

  1. Edit the engagement store (auditforge/engagements.json) to set source_bucket: "" for the affected engagement
  2. Re-save (or upload to S3 if running from local)
  3. Reads/writes go back to the shared bucket — source data is still there because the script never deletes
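The record edit in step 1 is a one-field revert. A minimal sketch operating on the store's JSON text; the mapping of engagement IDs to record dicts is an assumed shape for auditforge/engagements.json, not a confirmed one:

```python
import json

def revert_to_shared_bucket(store_json: str, engagement_id: str) -> str:
    """Set source_bucket back to "" so reads/writes resolve to the shared bucket.

    Assumes the store maps engagement IDs to record dicts; adjust to the
    real layout of auditforge/engagements.json as needed.
    """
    store = json.loads(store_json)
    store[engagement_id]["source_bucket"] = ""  # all other fields untouched
    return json.dumps(store, indent=2)
```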

The new bucket is left in place. Decommissioning is a separate operation (decommission_engagement_bucket(bucket)); intentionally not part of the migrate script.

When to delete the source-bucket objects

After all engagements are verified working from their isolated buckets — typically a 30-day soak period — the operator can clean up the source-bucket copies. That's a separate one-line bulk-delete operation, deliberately not bundled with the migration. Two reasons: (1) keeping the source data around is the rollback path; (2) the migration script's safety guarantees rest on never destructively modifying the source.
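When that cleanup day comes, note that S3's DeleteObjects API accepts at most 1,000 keys per call, so the bulk delete is batched in practice. A sketch of just the batching step (the actual cleanup operation is deliberately outside this script):

```python
def delete_batches(keys: list[str], batch_size: int = 1000) -> list[list[str]]:
    """Chunk keys into DeleteObjects-sized batches (S3 caps each call at 1000)."""
    return [keys[i : i + batch_size] for i in range(0, len(keys), batch_size)]
```

Each batch would then be handed to an S3 client as s3.delete_objects(Bucket=..., Delete={"Objects": [{"Key": k} for k in batch]}).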

Files

  • scripts/auditforge_migrate_buckets.py — CLI entry point + plan/execute logic
  • tests/test_auditforge_migrate_buckets.py — 10 unit tests with mocked S3 covering plan, execute, partial-already-in-target, verify-only, AccessDenied/NoSuchBucket handling, size-mismatch failure, help-message smoke