Phase 18 — Per-engagement bucket migration script¶
Last updated: 2026-05-08
scripts/auditforge_migrate_buckets.py migrates engagements from the shared platform bucket to per-engagement isolated buckets. This completes the Phase 7 isolation feature for engagements that already existed when the feature flag was off — a real procurement-review checkbox for partner-firm pitches ("all client data is isolated").
Safety properties¶
| Property | Implementation |
|---|---|
| Dry-run by default | --execute flag required to mutate anything |
| Interactive confirmation | --yes flag required to skip the [y/N] prompt |
| Source data preserved | Source-bucket objects are never deleted; rollback is a single record edit (revert source_bucket="") |
| Idempotent | Re-running on a migrated engagement is a no-op (record already has source_bucket); re-running mid-migration only copies missing keys |
| Verification | After copy, list target bucket and assert object count + total size match the source plan; record update only happens on successful verification |
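The verification row above reduces to a pure check comparing the copy plan against a listing of the target bucket. A minimal sketch (function name and the key-to-size dict shape are assumptions, not the script's actual API):

```python
def verify_migration(planned: dict, listed: dict) -> bool:
    """Post-copy check: the target bucket listing must match the plan.

    planned / listed map object key -> size in bytes. Both the object
    count and the total byte size must agree; the record update that
    flips an engagement to its new bucket only happens when this passes.
    """
    if len(planned) != len(listed):
        return False
    return sum(planned.values()) == sum(listed.values())
```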
What gets copied¶
For each engagement, the script enumerates and copies every object under these prefixes from settings.BUCKET_NAME to metis-af-<short_id>-<account_id>:
auditforge/engagements/<engagement_id>/... # findings.json, audit_log shards, source/, index/
auditforge/deliverables/<engagement_id>.<ext> # cached rendered deliverables (json/md/docx)
Per-engagement S3 keys retain their full path inside the new bucket. The shape of findings_key, audit_log_key, etc. on the engagement record stays valid after migration — the only change is which bucket those keys resolve to (the engagement's source_bucket).
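The target-bucket naming and the two source prefixes above can be sketched as two small helpers (function names are illustrative, not the script's real internals; note the deliverables prefix ends with a dot because those keys are `<engagement_id>.<ext>`):

```python
def target_bucket(engagement_id: str, account_id: str) -> str:
    # "eng-6715f196fa40" -> short id "6715f196fa40"
    short_id = engagement_id.removeprefix("eng-")
    return f"metis-af-{short_id}-{account_id}"

def source_prefixes(engagement_id: str) -> list:
    # The two key families enumerated in the shared source bucket.
    return [
        f"auditforge/engagements/{engagement_id}/",
        f"auditforge/deliverables/{engagement_id}.",
    ]
```

Because keys keep their full paths, `findings_key` and friends on the engagement record need no rewrite, only the bucket they resolve against changes.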
CLI¶
# Plan (dry-run) all unmigrated engagements
python scripts/auditforge_migrate_buckets.py
# Plan a single engagement
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc
# Execute against one engagement (interactive confirm)
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc --execute
# Execute the whole list non-interactively
python scripts/auditforge_migrate_buckets.py --execute --yes
# Re-verify a previous migration without copying
python scripts/auditforge_migrate_buckets.py --engagement-id eng-abc --verify-only
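The flag surface above maps onto a straightforward argparse parser. A sketch (the real script's parser may differ; defaults mirror the dry-run-by-default safety property):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="auditforge_migrate_buckets.py")
    p.add_argument("--engagement-id", help="limit to one engagement")
    p.add_argument("--execute", action="store_true",
                   help="actually copy objects (default: dry-run plan only)")
    p.add_argument("--yes", action="store_true",
                   help="skip the interactive [y/N] confirmation")
    p.add_argument("--verify-only", action="store_true",
                   help="re-check a prior migration without copying")
    return p
```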
Sample dry-run output:
Source bucket: mobilemetis-metis-indexes-741783034843
======================================================================
Engagement Target bucket Objs MB
----------------------------------------------------------------------
eng-6715f196fa40 metis-af-6715f196fa40-741783034843 12 2.83
eng-3f06f14ea94a metis-af-3f06f14ea94a-741783034843 7 0.41
======================================================================
Total: 2 engagements, 19 objects, 3.24 MB
Where to run it¶
The script needs IAM access to:
- List + read objects in settings.BUCKET_NAME (the shared source bucket)
- Create + put objects in metis-af-* (the new per-engagement buckets)
Practical options:
1. ECS Exec into the running task — easiest, since the task role already has both permissions. Run aws ecs execute-command --cluster metis-demo --task <task-arn> --interactive --command "/bin/bash", then python /app/scripts/auditforge_migrate_buckets.py --execute --yes.

2. Local laptop with admin AWS credentials — works if your ~/.aws/credentials has full S3 + STS permissions on the account.
3. One-off ECS task — run a separate task with the same task role and a custom command override.
If the script encounters AccessDenied (e.g., running locally without prod IAM), it logs and returns 0 with no plan — non-destructive failure mode.
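That failure mode can be sketched as a guard around the listing call (boto3's ClientError includes the error code in its message; the helper name and shape are assumptions):

```python
def plan_or_empty(list_objects):
    """Invoke list_objects(); on AccessDenied/NoSuchBucket (e.g. running
    locally without prod IAM), log and fall back to an empty plan so the
    caller exits 0 without mutating anything."""
    try:
        return list(list_objects())
    except Exception as exc:
        if "AccessDenied" in str(exc) or "NoSuchBucket" in str(exc):
            print(f"[warn] {exc}; returning empty plan")
            return []
        raise  # anything else is a genuine bug and should surface
```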
Rollback¶
If a partner reports a problem after migration:
- Edit the engagement store (auditforge/engagements.json) to set source_bucket: "" for the affected engagement
- Re-save (or upload to S3 if running from local)
- Reads/writes go back to the shared bucket — source data is still there because the script never deletes
The new bucket is left in place. Decommissioning is a separate operation (decommission_engagement_bucket(bucket)); intentionally not part of the migrate script.
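The single-record rollback edit can be sketched as follows, assuming a local copy of the store keyed by engagement id (the real store may live in S3 and have a different shape):

```python
import json
from pathlib import Path

def rollback(store_path: str, engagement_id: str) -> None:
    """Revert an engagement to the shared bucket by clearing its
    source_bucket in the engagement store. Safe because the migration
    script never deleted the source-bucket objects."""
    path = Path(store_path)
    store = json.loads(path.read_text())
    store[engagement_id]["source_bucket"] = ""  # assumed record shape
    path.write_text(json.dumps(store, indent=2))
```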
When to delete the source-bucket objects¶
After all engagements are verified working from their isolated buckets — typically a 30-day soak period — the operator can clean up the source-bucket copies. That's a separate one-line bulk-delete operation, deliberately not bundled with the migration. Two reasons: (1) keeping the source data around is the rollback path; (2) the migration script's safety guarantees rest on never destructively modifying the source.
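When that cleanup does happen, S3's DeleteObjects accepts at most 1,000 keys per call, so the bulk delete reduces to batching the source keys. A sketch of just the batching step (the surrounding boto3 calls are omitted; this is not part of the migration script):

```python
def delete_batches(keys: list, batch_size: int = 1000) -> list:
    """Group source-bucket keys into DeleteObjects-sized batches
    (S3 caps each DeleteObjects request at 1000 keys)."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]
```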
Files¶
- scripts/auditforge_migrate_buckets.py — CLI entry point + plan/execute logic
- tests/test_auditforge_migrate_buckets.py — 10 unit tests with mocked S3 covering plan, execute, partial-already-in-target, verify-only, AccessDenied/NoSuchBucket handling, size-mismatch failure, help-message smoke