Per-engagement S3 bucket isolation (Phase 7)¶
What¶
Each engagement gets its own dedicated S3 bucket. Source documents, findings, deliverables, and audit logs for engagement A are stored in a different physical bucket than engagement B's. A bug or misconfiguration affecting one engagement cannot expose another's data.
Behind a feature flag — AUDITFORGE_PROVISION_PER_ENGAGEMENT_BUCKET=true. Default off so local dev and existing engagements keep using the shared platform bucket with prefix isolation.
Why this matters¶
Prefix isolation in a shared bucket (today's default) is workable for early customers but doesn't pass the SOC 2 Type 2 bar:
- An IAM mistake (bucket policy too permissive, principal granted blanket read access) exposes every engagement at once
- An application bug that constructs the wrong key prefix can leak engagement A's findings into engagement B's response
- Audit-log review of "who read what" requires parsing application logs, not just CloudTrail S3 access logs
- Compliance auditors regularly flag shared-bucket multi-tenancy as inadequate isolation
Per-engagement buckets eliminate the prefix-construction risk class. Each bucket has its own ACL, its own bucket policy, its own CloudTrail-able access pattern. A misconfiguration affects one engagement, not the whole platform.
Architecture¶
engagement-create flow:
1. Generate engagement_id
2. If AUDITFORGE_PROVISION_PER_ENGAGEMENT_BUCKET set:
a. Provision bucket: metis-af-{eng-short}-{account-id}
b. Apply security controls: versioning, AES256 encryption,
public-access-block (all four of BlockPublicAcls / IgnorePublicAcls /
BlockPublicPolicy / RestrictPublicBuckets = true)
c. Tag for cost tracking + discovery
3. Persist engagement.source_bucket = bucket name (or empty)
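The create-time provisioning step (2a–2c) can be sketched as follows, assuming boto3. The function signature, the 12-character short ID, and the lazy `boto3` import are illustrative assumptions, not the exact code in `app/auditforge/buckets.py`; regions outside us-east-1 would also need a `CreateBucketConfiguration`.

```python
# All four public-access-block flags from step 2b.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

def provision_engagement_bucket(engagement_id: str, account_id: str) -> str:
    """Create and harden a dedicated bucket for one engagement (sketch)."""
    import boto3  # lazy import so the module loads without AWS deps installed

    bucket = f"metis-af-{engagement_id[:12]}-{account_id}"
    s3 = boto3.client("s3")
    s3.create_bucket(Bucket=bucket)
    s3.put_bucket_versioning(
        Bucket=bucket, VersioningConfiguration={"Status": "Enabled"}
    )
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )
    s3.put_public_access_block(
        Bucket=bucket, PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK
    )
    # Step 2c: tag for cost tracking + discovery.
    s3.put_bucket_tagging(
        Bucket=bucket,
        Tagging={"TagSet": [{"Key": "engagement_id", "Value": engagement_id}]},
    )
    return bucket
```

This is the ~5 API calls counted in the cost section: one CreateBucket plus four security-control PUTs.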
read/write flow:
1. Caller has engagement object
2. FindingsStore.list/get/edit/etc accept optional bucket=...
3. Pass engagement.source_bucket or None
4. Store internally: bucket override or fall back to self._bucket
(which is the shared platform bucket)
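The override-or-fallback behavior in steps 2–4 reduces to one line; a minimal sketch (class and method names are assumptions, not the real FindingsStore interface):

```python
from typing import Optional

class FindingsStore:
    def __init__(self, shared_bucket: str):
        self._bucket = shared_bucket  # shared platform bucket

    def _resolve_bucket(self, bucket: Optional[str]) -> str:
        # `or` treats both None and the empty string as "no dedicated bucket",
        # which is exactly how pre-flag engagement records fall back.
        return bucket or self._bucket

    def list(self, engagement_id: str, bucket: Optional[str] = None) -> str:
        # Illustrative: real methods would issue S3 reads against `target`.
        target = self._resolve_bucket(bucket)
        return f"s3://{target}/auditforge/engagements/{engagement_id}/findings.json"
```

Callers simply thread `engagement.source_bucket or None` into every store call and never branch on the flag themselves.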
engagement-delete flow:
1. Read engagement.source_bucket
2. If non-empty, decommission_engagement_bucket():
a. List all object versions + delete markers
b. Batch-delete in 1000-object chunks
c. delete_bucket()
3. Best-effort — engagement deletion succeeds even if cleanup fails
(cleanup can be retried offline)
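The delete flow above can be sketched like this, assuming boto3; the 1000-object chunking matches S3's DeleteObjects batch limit, and both versions and delete markers must be removed before `delete_bucket` will succeed on a versioned bucket. Names are illustrative:

```python
from itertools import islice
from typing import Iterable, Iterator, List

def chunked(items: Iterable[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield successive batches, matching S3's 1000-key DeleteObjects limit."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch

def decommission_engagement_bucket(bucket: str) -> None:
    import boto3  # lazy import so the module loads without AWS deps installed

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_object_versions")
    doomed = []
    for page in paginator.paginate(Bucket=bucket):
        # Versioned buckets: every object version AND delete marker must go
        # before delete_bucket() is allowed.
        for rec in page.get("Versions", []) + page.get("DeleteMarkers", []):
            doomed.append({"Key": rec["Key"], "VersionId": rec["VersionId"]})
    for batch in chunked(doomed):
        s3.delete_objects(Bucket=bucket, Delete={"Objects": batch})
    s3.delete_bucket(Bucket=bucket)
```

The caller wraps this in a try/except and logs failures rather than raising, which is what makes engagement deletion best-effort and the cleanup retryable offline.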
What's stored where¶
| Artifact | Per-engagement bucket | Shared bucket |
|---|---|---|
| `findings.json` | ✅ when flag on | ✅ otherwise |
| Cached deliverable (`{id}.json`, `.md`, `.docx`, `-methodology.md`) | ✅ when flag on | ✅ otherwise |
| Audit log shards (`shard-NNNNN.jsonl`) | ✅ when flag on | ✅ otherwise |
| `engagements.json` (engagement index) | — | ✅ always (it's the index) |
| `firms.json` (firm branding) | — | ✅ always (firm-level, not engagement-level) |
| Per-tenant corpus index (`{client_id}/index/*`) | — | ✅ always (Metis tenant data, not engagement data) |
The engagement index lives in the shared bucket because it's how the system finds engagements. Per-engagement buckets are provisioned for engagements; the index that points at them is platform-level.
Bucket name format¶
Format: `metis-af-{eng-short}-{account-id}` (12-character engagement short ID, 12-digit AWS account ID). Total length: 34 characters (`metis-af-` is 9, plus 12 + 1 + 12). Well under S3's 63-character limit. Account ID at the end gives global uniqueness.
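A minimal helper for building the name (the function name is illustrative; the format string is the one used in the engagement-create flow):

```python
def engagement_bucket_name(eng_short: str, account_id: str) -> str:
    """Build the per-engagement bucket name and guard the S3 length limit."""
    name = f"metis-af-{eng_short}-{account_id}"
    assert len(name) <= 63, "S3 bucket names max out at 63 characters"
    return name
```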
Feature flag¶
Environment variable: AUDITFORGE_PROVISION_PER_ENGAGEMENT_BUCKET
Accepted values (case-insensitive): `1`, `true`, or `yes` enable the flag. Any other value (including unset) keeps the shared-bucket default.
For production rollout: set the flag on the ECS task definition. Existing engagements (created before the flag flipped) continue to work — source_bucket is empty on their record, code path falls back to the shared bucket. Only new engagements get dedicated buckets.
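The flag check reduces to a truthy-set membership test; a sketch (the helper name is an assumption, not the one in the codebase):

```python
import os

TRUTHY = {"1", "true", "yes"}

def per_engagement_buckets_enabled() -> bool:
    """True only for 1/true/yes (case-insensitive); unset or anything else is off."""
    raw = os.getenv("AUDITFORGE_PROVISION_PER_ENGAGEMENT_BUCKET", "")
    return raw.strip().lower() in TRUTHY
```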
Migration path for existing engagements (manual)¶
Today: existing engagements use the shared bucket with prefix isolation. To migrate one to a dedicated bucket:
- Provision the bucket: call `provision_engagement_bucket(engagement_id)` from a script
- Copy artifacts (note: `aws s3 sync` matches source keys by prefix, so the deliverables copy needs include/exclude filters rather than a glob in the source path):

```shell
aws s3 sync s3://shared-bucket/auditforge/engagements/{id}/ s3://metis-af-{id}-{acct}/auditforge/engagements/{id}/
aws s3 sync s3://shared-bucket/auditforge/deliverables/ s3://metis-af-{id}-{acct}/auditforge/deliverables/ --exclude "*" --include "{id}*"
aws s3 sync s3://shared-bucket/auditforge/audit_logs/{id}/ s3://metis-af-{id}-{acct}/auditforge/audit_logs/{id}/
```

- Update the engagement record: set `source_bucket` to the new bucket name
- (Optional) Delete the artifacts from the shared bucket once confirmed working
A one-shot migration script is on the roadmap. For now, manual migration is fine — most existing engagements are pre-paid-customer test engagements and don't warrant migration.
AWS account limits¶
Default soft limit: 100 buckets per account. Each engagement = 1 bucket → ~100 engagements before requesting a quota increase. AWS routinely raises this limit on request to 1000+ for legitimate use cases.
Per-account quota math:
- 100 engagements: default
- 1,000 engagements: standard quota raise
- 10,000+: requires an AWS account-team conversation
A firm running 50 engagements/year hits the default limit in ~2 years. Plan for the quota raise as the firm crosses ~30 engagements.
Cost characteristics¶
S3 buckets themselves are free; storage costs are unchanged (data is the same data, just in a different physical bucket). Slight overhead in:
- API calls: each bucket-provisioning is ~5 API calls (CreateBucket + 4 security-control PUTs)
- IAM: bucket policies add to the IAM evaluation cost on read; trivial at engagement scale
- CloudTrail: per-bucket S3 data events are billable separately if enabled; recommend enabling them on per-engagement buckets specifically (not the shared one) to keep costs proportional to high-value data
Code¶
- `app/auditforge/buckets.py` — provisioning, security controls, decommissioning, upload helper
- `app/auditforge/engagement.py` — `source_bucket` field + provisioning at create-time + decommissioning at delete-time
- `app/auditforge/findings.py` — FindingsStore methods accept `bucket=...` override
- `app/auditforge/report.py` — deliverable cache lookup + sync respects per-engagement bucket
- `app/auditforge/runner.py` — audit log writer + findings store calls thread `engagement.source_bucket`
- `app/auditforge_endpoints.py` — endpoint handlers pass `eng.source_bucket or None` to all store calls
Tests¶
The bucket-threading change is covered by the existing FindingsStore + endpoint tests (105 passing as of Phase 7). Provisioning is exercised manually via the env flag — automated provisioning tests use moto or LocalStack and are an open hardening item.
Open follow-ups¶
- Automated provisioning tests with moto / LocalStack
- One-shot migration script for existing engagements (`scripts/auditforge_migrate_to_per_engagement_bucket.py`)
- Per-bucket lifecycle policy: auto-archive `findings.json` older than 90 days to Glacier Deep Archive
- CloudTrail S3 data-events enablement template for per-engagement buckets
- Bucket policy template restricting principals to a single ECS task role + the engagement's own audit log writer
Related¶
- 01-architecture.md — overall storage layout
- 09-runner-and-store.md — runner internals