Cross-engagement findings search (Phase 5)

What

A "Search" tab in the AuditForge UI that runs full-corpus, cross-engagement search across every finding the firm has access to. Filters by primitive, severity, status, firm, and free-text match against description / root_cause / auditor_notes. Results link back to per-engagement detail.

Why

Audit firms running 5+ engagements regularly want to ask:

  • "Have we found this exact concern before in another client's audit?"
  • "Show me every flow_down_check finding we've ever produced — I want to refine our internal methodology."
  • "Has the firm's senior partner already accepted findings on NIST SP 800-171 r3 in the last quarter?"
  • "Which clients of ours have we found coverage_check gaps for under DFARS 252.204-7012?"

Without cross-engagement search, the partner has to open each engagement separately and scroll. With it, the firm builds compound knowledge over time — every audit informs the next.

Endpoint

GET /auditforge/findings/search?q=...&primitive=...&severity=...&status=...&firm_id=...&canonical_only=true&limit=200
Param           Default  Notes
q               ""       Case-insensitive substring match against description + root_cause + auditor_notes
primitive       none     One of the ten primitives
severity        none     critical / high / medium / low
status          none     pending / accepted / rejected / refined
firm_id         none     Restrict to engagements under this firm
canonical_only  true     Hide raw findings that were merged into a canonical finding
limit           100      Cap on returned matches (max 200)
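
As a rough client-side illustration of the parameters above, the sketch below assembles a request URL. The helper name `build_search_url` and the base URL are hypothetical. Clamping `limit` to 200 on the client is also an assumption, since the source does not say whether the server clamps or rejects larger values.

```python
from urllib.parse import urlencode

MAX_LIMIT = 200  # documented maximum for `limit`


def build_search_url(base_url, q="", primitive=None, severity=None,
                     status=None, firm_id=None, canonical_only=True, limit=100):
    """Assemble a GET URL for /auditforge/findings/search (hypothetical helper).

    Filters left as None are omitted so the server applies its defaults.
    """
    params = {
        "q": q,
        "canonical_only": "true" if canonical_only else "false",
        # Assumption: clamp on the client rather than rely on server behaviour.
        "limit": min(limit, MAX_LIMIT),
    }
    optional = {
        "primitive": primitive,
        "severity": severity,
        "status": status,
        "firm_id": firm_id,
    }
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{base_url}/auditforge/findings/search?{urlencode(params)}"
```

For example, `build_search_url("https://auditforge.example", q="flow down", severity="critical")` yields a URL carrying `q`, `canonical_only`, `limit`, and `severity`, with the unset filters left out.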

Returns:

{
  "query": "...",
  "filters": { "primitive": null, "severity": "critical", ... },
  "matches": [
    {
      "engagement_id": "eng-...",
      "engagement_client_name": "Northstar Defense Inc.",
      "firm_id": "firm-...",
      "finding": { ...full finding object... }
    },
    ...
  ],
  "count": 47,
  "limit": 200
}
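
The filtering semantics described above can be sketched in a few lines of Python. This is not the actual `search_findings` implementation. The in-memory engagement shape (`engagement_id`, `client_name`, `firm_id`, `findings`) is a hypothetical stand-in, `count` is assumed to be the number of returned matches, and `canonical_only` is left out because the source does not describe how merged findings are marked.

```python
TEXT_FIELDS = ("description", "root_cause", "auditor_notes")
MAX_LIMIT = 200  # documented maximum for `limit`


def finding_matches(finding, q="", primitive=None, severity=None, status=None):
    """Return True if a finding passes the exact-match filters and the text query."""
    for key, wanted in (("primitive", primitive),
                        ("severity", severity),
                        ("status", status)):
        if wanted is not None and finding.get(key) != wanted:
            return False
    if not q:
        return True
    needle = q.lower()
    # Case-insensitive substring match, checked field by field.
    return any(needle in str(finding.get(f) or "").lower() for f in TEXT_FIELDS)


def search(engagements, q="", primitive=None, severity=None, status=None,
           firm_id=None, limit=100):
    """Scan engagements and build a response shaped like the documented one."""
    limit = min(limit, MAX_LIMIT)
    results = []
    for eng in engagements:
        if firm_id is not None and eng["firm_id"] != firm_id:
            continue
        for finding in eng["findings"]:
            if len(results) >= limit:
                break
            if finding_matches(finding, q, primitive, severity, status):
                results.append({
                    "engagement_id": eng["engagement_id"],
                    "engagement_client_name": eng["client_name"],
                    "firm_id": eng["firm_id"],
                    "finding": finding,
                })
    return {
        "query": q,
        "filters": {"primitive": primitive, "severity": severity,
                    "status": status, "firm_id": firm_id},
        "matches": results,
        "count": len(results),  # assumption: count of returned matches
        "limit": limit,
    }
```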

Performance characteristics

The endpoint loads per-engagement findings on demand from FindingsStore (which caches in memory after first S3 fetch). Response time scales with the number of engagements with matching findings, not the total finding count across the firm.

For a firm with 50 engagements averaging 17 findings each (850 total findings), a typical filter-narrowed search completes in 100–300 ms when warm and 1–3 s when cold (the first search fetches every engagement's findings JSON from S3).
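
The warm/cold difference comes from the load-on-demand cache. A minimal sketch of that pattern follows. It is not the real FindingsStore: the class name `CachingFindingsLoader` is hypothetical, and the injected `fetch` callable stands in for the S3 read.

```python
class CachingFindingsLoader:
    """Hypothetical load-on-demand cache keyed by engagement id."""

    def __init__(self, fetch):
        # `fetch(engagement_id)` stands in for the S3 read of one
        # engagement's findings JSON and returns a list of findings.
        self._fetch = fetch
        self._cache = {}

    def get(self, engagement_id):
        """Return findings for an engagement, fetching only on first access."""
        if engagement_id not in self._cache:
            self._cache[engagement_id] = self._fetch(engagement_id)  # cold path
        return self._cache[engagement_id]  # warm path

    def invalidate(self, engagement_id):
        """Drop a cached entry, e.g. after a write changes that engagement."""
        self._cache.pop(engagement_id, None)
```

With this shape, the first search after a restart pays one fetch per engagement it touches, and later searches read from memory until an entry is invalidated.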

For a firm with 500+ engagements, an indexing layer becomes worthwhile. Open hardening item: add a Whoosh / SQLite FTS5 index over findings, populated by update_status / edit / _persist write paths.
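
One possible shape for the FTS5 option is sketched below using Python's standard `sqlite3` module. The table name `findings_fts`, the `finding_id` column, and both helper functions are hypothetical, and the sketch assumes the local SQLite build has FTS5 compiled in. Note that FTS5 matches whole tokens by default, so it would not reproduce today's substring semantics for `q` without a different tokenizer.

```python
import sqlite3


def build_index(conn, rows):
    """Create the FTS5 table (if needed) and insert finding rows.

    Each row is (finding_id, engagement_id, description, root_cause, auditor_notes).
    The id columns are stored but not tokenized.
    """
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS findings_fts USING fts5("
        "finding_id UNINDEXED, engagement_id UNINDEXED, "
        "description, root_cause, auditor_notes)"
    )
    conn.executemany("INSERT INTO findings_fts VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def search_index(conn, query, limit=100):
    """Return (finding_id, engagement_id) pairs matching an FTS5 query string."""
    cur = conn.execute(
        "SELECT finding_id, engagement_id FROM findings_fts "
        "WHERE findings_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()
```

In a real integration, the write paths named above (update_status / edit / _persist) would need to update or replace the corresponding row so the index stays in sync with the store.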

Frontend

FindingsSearch.tsx is the search page. Lives under the "Search" tab in the AuditForge top nav. Layout:

  • Free-text input at the top (Enter to search)
  • Four filter dropdowns: primitive, severity, status, firm_id
  • "Search" button
  • Results: each match is a clickable card showing severity badge, primitive, engagement client, firm id, confidence percentage, two-line description preview, optional auditor-notes preview
  • Click a match → navigates to that engagement's detail view with the finding ready to select

Cost

No LLM calls. Pure store reads. Free per search.

Code

  • app/auditforge_endpoints.py: search_findings endpoint
  • frontend/src/components/FindingsSearch.tsx: UI
  • frontend/src/components/AuditForge.tsx: Search tab in shell
  • frontend/src/api/auditforge.ts: searchFindings typed call

Open improvements

  • Pagination (today: hard limit 200, no cursor)
  • Index-backed FTS for sub-100ms search at firm scale
  • Date-range filter (currently no filter on created_at / updated_at)
  • Saved searches (partner subscribes to "every new critical finding under firm X")
  • Cross-engagement clustering: not just search, but "show me the patterns that appear across engagements" — early version of this is the "systemic patterns" within an engagement (Stage F)