Cross-engagement findings search (Phase 5)

What

A "Search" tab in the AuditForge UI that runs a full-corpus, cross-engagement search over every finding the firm has access to. It filters by primitive, severity, status, and firm, and by free-text match against description, root_cause, and auditor_notes. Each result links back to the detail view of its engagement.

Why

Audit firms running five or more engagements often want to ask questions such as:

"Have we found this exact concern before in another client's audit?"
"Show me every flow_down_check finding we've ever produced — I want to refine our internal methodology."
"Has the firm's senior partner already accepted findings on NIST SP 800-171 r3 in the last quarter?"
"Which clients of ours have we found coverage_check gaps for under DFARS 252.204-7012?"

Without cross-engagement search, the partner has to open each engagement separately and scroll through its findings. With it, the firm's knowledge compounds over time: every audit informs the next.

Endpoint

GET /auditforge/findings/search?q=...&primitive=...&severity=...&status=...&firm_id=...&canonical_only=true&limit=200

| Param | Default | Notes |
|---|---|---|
| `q` | `""` | Case-insensitive substring match against description, root_cause, and auditor_notes |
| `primitive` | none | One of the ten primitives |
| `severity` | none | critical / high / medium / low |
| `status` | none | pending / accepted / rejected / refined |
| `firm_id` | none | Restrict to engagements under this firm |
| `canonical_only` | true | Hide raw findings that have been merged into a canonical finding |
| `limit` | 100 | Cap on returned matches (max 200) |
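The filter semantics in the table can be sketched as a single predicate. This is a minimal illustration, not the actual `search_findings` code: the `Filters` dataclass, the `matches` and `clamp_limit` helpers, and the `merged_into_canonical` field are hypothetical names chosen for the example, and it assumes each finding is a plain dict.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_LIMIT = 100  # documented default
MAX_LIMIT = 200      # documented cap
TEXT_FIELDS = ("description", "root_cause", "auditor_notes")


@dataclass
class Filters:
    q: str = ""
    primitive: Optional[str] = None
    severity: Optional[str] = None
    status: Optional[str] = None
    firm_id: Optional[str] = None
    canonical_only: bool = True
    limit: int = DEFAULT_LIMIT


def matches(finding: dict, firm_id: str, f: Filters) -> bool:
    """Return True if one finding passes every active filter."""
    if f.firm_id is not None and firm_id != f.firm_id:
        return False
    # Exact-match filters apply only when the caller set them.
    for key in ("primitive", "severity", "status"):
        wanted = getattr(f, key)
        if wanted is not None and finding.get(key) != wanted:
            return False
    # Hypothetical flag: assumes a raw finding absorbed by a canonical one is marked this way.
    if f.canonical_only and finding.get("merged_into_canonical"):
        return False
    # Case-insensitive substring match; an empty q matches everything.
    if f.q:
        needle = f.q.lower()
        if not any(needle in str(finding.get(k) or "").lower() for k in TEXT_FIELDS):
            return False
    return True


def clamp_limit(limit: int) -> int:
    """Keep the requested limit within 1..MAX_LIMIT."""
    return max(1, min(limit, MAX_LIMIT))
```

Checking each text field separately, rather than concatenating them, avoids accidental matches that span the boundary between two fields.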

Returns:

{
  "query": "...",
  "filters": { "primitive": null, "severity": "critical", ... },
  "matches": [
    {
      "engagement_id": "eng-...",
      "engagement_client_name": "Northstar Defense Inc.",
      "firm_id": "firm-...",
      "finding": { ...full finding object... }
    },
    ...
  ],
  "count": 47,
  "limit": 200
}
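A caller might build the request URL and summarize the response roughly as follows. This is a sketch under assumptions: the base URL `https://auditforge.example` is a placeholder, `build_search_url` and `matches_per_client` are hypothetical helper names, and the payload is assumed to have exactly the shape shown above.

```python
from collections import Counter
from urllib.parse import urlencode


def build_search_url(base_url: str, **params) -> str:
    """Build the search URL, dropping filters that are unset (None)."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Booleans go out lowercase, as in the documented example (canonical_only=true).
        query[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{base_url}/auditforge/findings/search?{urlencode(query)}"


def matches_per_client(payload: dict) -> Counter:
    """Count matches per client from a search response payload."""
    return Counter(m["engagement_client_name"] for m in payload["matches"])
```

For instance, `build_search_url("https://auditforge.example", q="flow down", severity="critical", canonical_only=True, limit=200)` yields a query string of `q=flow+down&severity=critical&canonical_only=true&limit=200`.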

Performance characteristics

The endpoint loads per-engagement findings on demand from FindingsStore (which caches in memory after first S3 fetch). Response time scales with the number of engagements with matching findings, not the total finding count across the firm.

For a firm with 50 engagements averaging 17 findings each (850 total findings), a typical filter-narrowed search completes in 100–300ms warm, 1–3s cold (full S3 fetch of every engagement's findings JSON the first time).
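The warm/cold gap follows from the fetch-once caching behaviour described above. The sketch below is an illustrative stand-in, not the real FindingsStore: the class name `CachedFindingsStore`, its `get` method, and the injected `fetch` callable (standing in for the S3 read of one engagement's findings JSON) are all assumptions made for the example.

```python
from typing import Callable, Dict, List


class CachedFindingsStore:
    """Fetch an engagement's findings once, then serve them from memory."""

    def __init__(self, fetch: Callable[[str], List[dict]]):
        self._fetch = fetch  # assumed to perform the slow remote read
        self._cache: Dict[str, List[dict]] = {}
        self.fetch_count = 0  # number of slow (cold) fetches performed

    def get(self, engagement_id: str) -> List[dict]:
        if engagement_id not in self._cache:
            # Cold path: one remote fetch per engagement, first time only.
            self.fetch_count += 1
            self._cache[engagement_id] = self._fetch(engagement_id)
        # Warm path: a plain dictionary lookup.
        return self._cache[engagement_id]
```

With 50 engagements, the first search that touches all of them pays for 50 remote fetches; later searches pay for none until the process restarts or the cache is cleared.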

For a firm with 500+ engagements, an indexing layer becomes worthwhile. Open hardening item: add a Whoosh / SQLite FTS5 index over findings, populated by update_status / edit / _persist write paths.
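One possible shape for the SQLite FTS5 option, using Python's built-in `sqlite3` module, is sketched below. The table name `findings_fts`, the helper names, and the `finding_id` key are hypothetical, and the sketch assumes the local SQLite build has FTS5 compiled in. Note that FTS5 matches whole tokens rather than arbitrary substrings, so results would differ from the current `q` behaviour.

```python
import sqlite3

TEXT_FIELDS = ("description", "root_cause", "auditor_notes")


def build_index(conn: sqlite3.Connection) -> None:
    # Id columns are stored but not tokenized; the three text fields are searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS findings_fts USING fts5("
        "finding_id UNINDEXED, engagement_id UNINDEXED, "
        "description, root_cause, auditor_notes)"
    )


def upsert_finding(conn: sqlite3.Connection, finding_id: str,
                   engagement_id: str, finding: dict) -> None:
    """Would be called from each write path so the index tracks the store."""
    conn.execute("DELETE FROM findings_fts WHERE finding_id = ?", (finding_id,))
    conn.execute(
        "INSERT INTO findings_fts VALUES (?, ?, ?, ?, ?)",
        (finding_id, engagement_id, *(finding.get(k) or "" for k in TEXT_FIELDS)),
    )


def search(conn: sqlite3.Connection, query: str, limit: int = 100) -> list:
    """Return (finding_id, engagement_id) pairs, best match first."""
    rows = conn.execute(
        "SELECT finding_id, engagement_id FROM findings_fts "
        "WHERE findings_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Raw user input would need quoting or sanitizing before being passed to `MATCH`, since FTS5 has its own query syntax and characters such as `-` or `"` carry meaning in it.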

Frontend

FindingsSearch.tsx is the search page; it lives under the "Search" tab in the AuditForge top nav. Layout:

Free-text input at the top (Enter to search)
Four filter dropdowns: primitive, severity, status, firm_id
"Search" button
Results: each match is a clickable card showing severity badge, primitive, engagement client, firm id, confidence percentage, two-line description preview, optional auditor-notes preview
Click a match → navigates to that engagement's detail view with the finding ready to select

Cost

No LLM calls. A search is served purely from store reads, so it incurs no per-search cost.

Code

app/auditforge_endpoints.py — search_findings endpoint
frontend/src/components/FindingsSearch.tsx — UI
frontend/src/components/AuditForge.tsx — Search tab in shell
frontend/src/api/auditforge.ts — searchFindings typed call

Open improvements

Pagination (today: hard limit 200, no cursor)
Index-backed FTS for sub-100ms search at firm scale
Date-range filter (currently no filter on created_at / updated_at)
Saved searches (partner subscribes to "every new critical finding under firm X")
Cross-engagement clustering: beyond search, "show me the patterns that appear across engagements". An early version of this exists as the "systemic patterns" detected within a single engagement (Stage F).