Public Roadmap
What we've discussed building, in priority order. Items move to changelog.md as they ship.
Last updated: 2026-05-10
🚧 In progress
Nothing actively in flight as of 2026-05-10.
⏳ Next up (weeks)
- Edge-case PDF coverage — text-heavy NIST docs work cleanly. Still need validation against scanned (OCR-only) PDFs, password-protected docs, and embedded-font edge cases. Promote when a real customer corpus surfaces one of these.
📋 Considered (months)
These have been discussed but aren't actively in flight. Customer signal moves them forward.
| Item | Status | Notes |
|---|---|---|
| Classification override UI | Considered | Would let partners override doc-type/jurisdiction classifications pre-ingest. full_rebuild classifies internally today; real corpora will tell us whether that causes problems |
| True resumable multipart uploads | Considered | Per-file granularity is the pragmatic substitute today; if a 50 MB PDF upload drops midway, that whole file must be re-uploaded. Worth fixing if real corpora include many large files |
| Per-engagement bucket for indexes | Considered | Source documents already isolated; index files (FAISS, BM25) still live in the shared bucket. Hardening pass when a customer asks |
| SAML / SSO via Auth0 or Cognito | Considered | At $1,000–2,000 per engagement pricing, mid-market firms don't demand it. Promote if Big-4 conversations materialize |
| WebAuthn / FIDO2 | Considered | Would replace TOTP for hardware-key-based auth. Lower priority than SSO |
| Audit-tech integrations (CCH, Caseware, Thomson Reuters) | Considered | JSON export is the current integration surface. Top-100 firms will eventually demand this |
| Saved searches + webhook notifications | Considered | "Notify me when a finding matches X" workflow. Blocked on SES sandbox + Resend integration |
| Cross-region S3 replication | Considered | Single-region us-east-1 today; deploy once we reach 10 paying customers or a contract requires it |
| Custom audit primitives | Considered | Firms define domain-specific check patterns. Significant scope; defer until volume justifies |
| Cost analytics dashboard | Considered | Per-firm spend, archetype averages, projected per-engagement cost. Helpful for partner pricing conversations |
| SCIM provisioning | Considered | Manual user-create only today. Promote with SSO |
| SIEM forwarding | Considered | CloudWatch logs only today. Worth wiring when a partner asks |
| API documentation (OpenAPI/Swagger) published | Considered | FastAPI already auto-generates the OpenAPI schema; publishing it is a few hours of polish |
| Status page (statuspage.io free tier) | Considered | Half-day of work; defer until first customer impact |
| Reference customer + case study | Critical | Highest-impact non-engineering item — one real co-delivery engagement closes case-study, real-corpus-cost-data, and real-PDF-stress-test gaps simultaneously |
| SOC 2 Type 1 audit | Critical, deferred | Substantive controls in place; auditor-signed report not started. ~3 months, $15–30K. Defer to month 6+ at current pricing |
| Penetration test | Critical, deferred | Table stakes for procurement-heavy enterprise. ~$8–15K. Same timing as SOC 2 |
| Cyber liability insurance | Critical, admin | $2–5K/yr. Limits cumulative exposure across engagements |
| DPAs signed with Anthropic + OpenAI | Critical, admin | Both vendors offer ready-to-sign DPAs. ~1 week of admin work, $0 |
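
The per-file granularity described in the resumable-uploads row can be illustrated with a minimal Python sketch. It is not the product's upload code: the `plan_resume` function, the manifest of file sizes, and the record of server-confirmed bytes are all hypothetical names chosen for illustration. Any file that was not fully confirmed is re-sent from byte zero, which is the cost that row describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadPlan:
    skip: list          # files already fully uploaded
    resend: list        # files to upload again from byte zero
    wasted_bytes: int   # partial bytes discarded by whole-file retries


def plan_resume(manifest: dict, confirmed: dict) -> UploadPlan:
    """Plan a resumed batch upload at per-file granularity.

    manifest:  hypothetical mapping of file name -> total size in bytes.
    confirmed: hypothetical mapping of file name -> bytes the server acknowledged.
    """
    skip, resend, wasted = [], [], 0
    for name, size in manifest.items():
        done = confirmed.get(name, 0)
        if done >= size:
            skip.append(name)      # complete: never re-sent
        else:
            resend.append(name)    # incomplete: restart the whole file
            wasted += done         # partial progress is thrown away
    return UploadPlan(skip, resend, wasted)


MB = 1024 * 1024
plan = plan_resume(
    manifest={"policy.pdf": 2 * MB, "controls.pdf": 50 * MB, "notes.pdf": 1 * MB},
    confirmed={"policy.pdf": 2 * MB, "controls.pdf": 30 * MB},
)
print(plan.skip, plan.resend, plan.wasted_bytes // MB)
```

In this example the 50 MB file that dropped at 30 MB is re-sent in full, so 30 MB of transfer is wasted. True resumable multipart uploads would aim to bring that figure close to zero.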
✅ Recently shipped
See changelog.md for the full history. Recent highlights:
- In-product guided walkthrough (Phase 29) — 2026-05-11
    - Auto-launching interactive tour built with react-joyride
    - 21 steps covering the complete audit workflow
    - Restartable via "Take tour" header button — demo-ready
- Admin-token rotation + per-token rate limits — 2026-05-11
    - Closed admin-token leak (was published in user-manual.md)
    - Daily rate limits on `run`, `investigate_further`, `recompute`, `intake_extract` endpoints
- Hygiene pass — 2026-05-11
    - Signed-URL S3 lifecycle rule (auto-expire exports after 30 days)
    - 15 pre-flag engagements migrated to per-engagement isolated buckets
    - Real-PDF stress test (2 NIST docs, 2.6 MB) ingested in ~50 seconds
- Multi-reviewer collaboration + budget cap UX (Phase 28) — 2026-05-10
- Self-serve corpus onboarding (Phase 25–27) — 2026-05-10
- Per-engagement bucket isolation default-on — 2026-05-10
- Audit log signed URL (Phase 24) — 2026-05-09
- Engagement template library (Phase 23) — 2026-05-09
- Cluster diff over time (Phase 22) — 2026-05-09
- Engagement archive (Phase 21) — 2026-05-08
- Engagement freeze on deliver (Phase 20) — 2026-05-08
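
The per-token daily rate limits mentioned in the admin-token entry above could work along these lines. This is a sketch under assumptions, not the shipped implementation: the `DailyRateLimiter` class, the limit value, and the in-memory counter are hypothetical, and a real deployment would likely keep counts in a shared store rather than process memory.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Optional


class DailyRateLimiter:
    """Counts calls per (token, endpoint, UTC day) in memory."""

    def __init__(self, limits: dict):
        # limits: endpoint name -> max calls per token per UTC day (values are assumptions)
        self.limits = limits
        self.counts = defaultdict(int)

    def allow(self, token: str, endpoint: str, now: Optional[datetime] = None) -> bool:
        """Return True and record the call if the token is under today's limit."""
        limit = self.limits.get(endpoint)
        if limit is None:
            return True  # endpoint is not rate limited in this sketch
        now = now or datetime.now(timezone.utc)
        key = (token, endpoint, now.astimezone(timezone.utc).date())
        if self.counts[key] >= limit:
            return False
        self.counts[key] += 1
        return True


# Hypothetical limit of 2 calls per day on the `run` endpoint.
limiter = DailyRateLimiter({"run": 2})
day1 = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)
day2 = datetime(2026, 5, 12, 9, 0, tzinfo=timezone.utc)

same_day = [limiter.allow("tok-a", "run", day1) for _ in range(3)]
other_token = limiter.allow("tok-b", "run", day1)
next_day = limiter.allow("tok-a", "run", day2)
print(same_day, other_token, next_day)
```

Keying the counter on the UTC date means each token's quota resets at midnight UTC, and one token exhausting its quota does not affect another.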
How this list is maintained
This roadmap is curated rather than mechanically generated. When a feature is discussed with a customer or in an internal review, it lands here as "Considered." When work starts, it moves to "In progress." When it ships, the entry moves to changelog.md with the ship date and a one-paragraph summary.
Items are removed only when explicitly de-scoped (rare). The default disposition is to keep them visible so customers can see what's been weighed.
Contact: chris@base2ml.com to suggest priorities or signal a customer need.