Cross-engagement findings search (Phase 5)
What
A "Search" tab in the AuditForge UI that runs full-corpus, cross-engagement search across every finding the firm has access to. Filters by primitive, severity, status, firm, and free-text match against description / root_cause / auditor_notes. Results link back to per-engagement detail.
Why
Audit firms running 5+ engagements regularly want to ask:
- "Have we found this exact concern before in another client's audit?"
- "Show me every `flow_down_check` finding we've ever produced; I want to refine our internal methodology."
- "Has the firm's senior partner already accepted findings on NIST SP 800-171 r3 in the last quarter?"
- "Which clients of ours have we found `coverage_check` gaps for under DFARS 252.204-7012?"
Without cross-engagement search, the partner has to open each engagement separately and scroll. With it, the firm builds compound knowledge over time — every audit informs the next.
Endpoint
GET /auditforge/findings/search?q=...&primitive=...&severity=...&status=...&firm_id=...&canonical_only=true&limit=200
| Param | Default | Notes |
|---|---|---|
| `q` | `""` | Case-insensitive substring match against `description`, `root_cause`, and `auditor_notes` |
| `primitive` | none | One of the ten primitives |
| `severity` | none | `critical` / `high` / `medium` / `low` |
| `status` | none | `pending` / `accepted` / `rejected` / `refined` |
| `firm_id` | none | Restrict to engagements under this firm |
| `canonical_only` | `true` | Hide raw findings that have been merged into a canonical finding |
| `limit` | `100` | Cap on returned matches (max 200) |
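The filter semantics in the table can be sketched roughly as follows. This is a minimal illustration, not the actual `search_findings` code: the function names are hypothetical, findings are assumed to be plain dicts keyed by the field names above, and the free-text match is assumed to check each of the three text fields separately.

```python
from typing import Any, Optional

DEFAULT_LIMIT = 100  # documented default
MAX_LIMIT = 200      # documented cap
TEXT_FIELDS = ("description", "root_cause", "auditor_notes")


def finding_matches(finding: dict[str, Any], q: str = "",
                    primitive: Optional[str] = None,
                    severity: Optional[str] = None,
                    status: Optional[str] = None) -> bool:
    """Return True if a finding passes every supplied filter (hypothetical helper)."""
    # Exact-match filters: None means "no filter on this field".
    exact = (("primitive", primitive), ("severity", severity), ("status", status))
    for key, wanted in exact:
        if wanted is not None and finding.get(key) != wanted:
            return False
    if not q:
        return True  # an empty query matches everything
    needle = q.lower()
    # Case-insensitive substring match; missing or null fields count as empty.
    return any(needle in (finding.get(field) or "").lower() for field in TEXT_FIELDS)


def search(findings: list[dict[str, Any]], q: str = "",
           limit: int = DEFAULT_LIMIT, **filters: Optional[str]) -> list[dict[str, Any]]:
    """Apply the filters, then clamp the result to the documented maximum."""
    limit = max(1, min(limit, MAX_LIMIT))
    return [f for f in findings if finding_matches(f, q, **filters)][:limit]
```

For example, `search(findings, q="dfars", severity="critical")` would keep only critical findings that mention "dfars" in any of the three text fields.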
Returns:
{
  "query": "...",
  "filters": { "primitive": null, "severity": "critical", ... },
  "matches": [
    {
      "engagement_id": "eng-...",
      "engagement_client_name": "Northstar Defense Inc.",
      "firm_id": "firm-...",
      "finding": { ...full finding object... }
    },
    ...
  ],
  "count": 47,
  "limit": 200
}
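As a rough client-side sketch, the request URL can be assembled from the documented parameters and the response walked by client name. The base URL and both helper names are assumptions made for illustration; only the path, parameter names, and response keys come from this page.

```python
from typing import Any, Optional
from urllib.parse import urlencode


def build_search_url(base_url: str, q: str = "", canonical_only: bool = True,
                     limit: int = 100, **filters: Optional[str]) -> str:
    """Build the GET URL for the search endpoint (hypothetical helper)."""
    params: dict[str, Any] = {"q": q}
    # Leave out filters that were not supplied so the server defaults apply.
    params.update({k: v for k, v in filters.items() if v is not None})
    params["canonical_only"] = "true" if canonical_only else "false"
    params["limit"] = min(limit, 200)  # documented maximum
    return f"{base_url}/auditforge/findings/search?{urlencode(params)}"


def matches_per_client(response: dict[str, Any]) -> dict[str, int]:
    """Count matches per client in a decoded response body."""
    counts: dict[str, int] = {}
    for match in response["matches"]:
        name = match["engagement_client_name"]
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Grouping by `engagement_client_name` is one way to answer the "which clients have we seen this for?" question directly from a single response.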
Performance characteristics
The endpoint loads per-engagement findings on demand from FindingsStore (which caches in memory after first S3 fetch). Response time scales with the number of engagements with matching findings, not the total finding count across the firm.
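The load-once-then-cache behaviour described above could look something like the sketch below. It is not the real `FindingsStore`: the class name, the injected `fetch` callable (standing in for the S3 read), and the `invalidate` method are all assumptions.

```python
from typing import Any, Callable

Finding = dict[str, Any]


class CachingFindingsStore:
    """Hypothetical stand-in for a store that caches findings per engagement."""

    def __init__(self, fetch: Callable[[str], list[Finding]]) -> None:
        self._fetch = fetch  # e.g. a function that reads the findings JSON from S3
        self._cache: dict[str, list[Finding]] = {}
        self.fetch_count = 0  # exposed only to make the caching visible

    def get(self, engagement_id: str) -> list[Finding]:
        # Cold path: the first request for an engagement pays for the remote fetch.
        if engagement_id not in self._cache:
            self.fetch_count += 1
            self._cache[engagement_id] = self._fetch(engagement_id)
        # Warm path: later requests are served from memory.
        return self._cache[engagement_id]

    def invalidate(self, engagement_id: str) -> None:
        """Drop a cached entry, for example after a write to that engagement."""
        self._cache.pop(engagement_id, None)
```

Under this model the cold search pays one fetch per engagement it touches, which is where the gap between warm and cold response times comes from.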
For a firm with 50 engagements averaging 17 findings each (850 total findings), a typical filter-narrowed search completes in 100–300ms warm, 1–3s cold (full S3 fetch of every engagement's findings JSON the first time).
For a firm with 500+ engagements, an indexing layer becomes worthwhile. Open hardening item: add a Whoosh or SQLite FTS5 index over findings, populated by the `update_status` / `edit` / `_persist` write paths.
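One possible shape for the SQLite FTS5 option, using Python's built-in `sqlite3` module, is sketched here. The table name, the `id` key on a finding, and all three function names are assumptions, and the sketch requires an SQLite build with FTS5 enabled. Note that FTS5 matches whole tokens rather than arbitrary substrings, so its results would differ from the current `q` behaviour.

```python
import sqlite3
from typing import Any


def open_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 index; a real index would live on disk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE findings_fts USING fts5("
        "finding_id UNINDEXED, engagement_id UNINDEXED, "
        "description, root_cause, auditor_notes)"
    )
    return conn


def index_finding(conn: sqlite3.Connection, engagement_id: str,
                  finding: dict[str, Any]) -> None:
    """Upsert one finding; intended to be called from each write path."""
    with conn:  # commit both statements together
        conn.execute("DELETE FROM findings_fts WHERE finding_id = ?", (finding["id"],))
        conn.execute(
            "INSERT INTO findings_fts "
            "(finding_id, engagement_id, description, root_cause, auditor_notes) "
            "VALUES (?, ?, ?, ?, ?)",
            (finding["id"], engagement_id,
             finding.get("description") or "",
             finding.get("root_cause") or "",
             finding.get("auditor_notes") or ""),
        )


def search_index(conn: sqlite3.Connection, text: str,
                 limit: int = 100) -> list[tuple[str, str]]:
    """Return (finding_id, engagement_id) pairs, best match first."""
    if not text.strip():
        return []  # an empty query is not a valid FTS5 expression
    # Quote the input so it is treated as a phrase, not as FTS5 query syntax.
    phrase = '"' + text.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT finding_id, engagement_id FROM findings_fts "
        "WHERE findings_fts MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    )
    return rows.fetchall()
```

Deleting before inserting keeps one row per finding, so re-indexing after a status change or edit does not leave stale text searchable.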
Frontend
`FindingsSearch.tsx` is the search page. It lives under the "Search" tab in the AuditForge top nav. Layout:
- Free-text input at the top (Enter to search)
- Four filter dropdowns: primitive, severity, status, firm_id
- "Search" button
- Results: each match is a clickable card showing severity badge, primitive, engagement client, firm id, confidence percentage, two-line description preview, optional auditor-notes preview
- Click a match → navigates to that engagement's detail view with the finding ready to select
Cost
No LLM calls. Pure store reads. Free per search.
Code
- `app/auditforge_endpoints.py`: `search_findings` endpoint
- `frontend/src/components/FindingsSearch.tsx`: UI
- `frontend/src/components/AuditForge.tsx`: Search tab in shell
- `frontend/src/api/auditforge.ts`: `searchFindings` typed call
Open improvements
- Pagination (today: hard limit 200, no cursor)
- Index-backed FTS for sub-100ms search at firm scale
- Date-range filter (currently no filter on `created_at` / `updated_at`)
- Saved searches (partner subscribes to "every new critical finding under firm X")
- Cross-engagement clustering: not just search, but "show me the patterns that appear across engagements" — early version of this is the "systemic patterns" within an engagement (Stage F)