Cross-engagement portfolio clusters (Phase 10)¶
What¶
Beyond Stage F's within-engagement pattern detection, this feature surfaces cross-engagement patterns: recurring themes that show up across multiple audits in the firm's portfolio. Findings that share a structural root cause and appear at two or more different clients become candidates for standard remediation playbooks the firm can productize.
This is the moat-strengthening feature. After a firm has run 5+ engagements through AuditForge, the portfolio-clusters view shows what recurs. The Nth audit benefits from the prior N-1; the firm's accumulated judgment compounds into a structured asset.
Use case¶
"We've found this same flow-down gap in 4 of our 7 defense clients. Here's the pattern + a case for templatizing the remediation guidance into a standard engagement deliverable section."
The portfolio-clusters view answers questions like:
- What recurring textual issues appear across our defense-contractor audits?
- Which framework references are most often cited incorrectly across our clients?
- Where in our portfolio is the same root cause showing up — and is it worth productizing?
Architecture¶
Filter → Load all matching findings across firm's engagements (canonical-only by default; primitive + severity filters)
→ Truncate to top-50 by severity × confidence (most material first)
→ Send to Opus call with cluster-extraction system prompt
→ Parse strict JSON: { clusters: [{ theme, member_finding_ids, representative_finding_ids, suggested_standard_remediation, confidence }] }
→ Build full PortfolioCluster records with prevalence math, severity distribution, representative quotes, member engagement IDs
→ Sort by prevalence (more engagements affected first)
→ Cache in S3 keyed by hash(filters), TTL 24h
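The truncation and strict-parse steps of this pipeline can be sketched roughly as follows. This is a minimal illustration, not the code in portfolio_clusters.py: the `Finding` shape, the numeric `SEVERITY_WEIGHT` mapping, and the function names `top_findings` and `parse_clusters` are all assumptions made for the example.

```python
import json
from dataclasses import dataclass

# Assumed numeric weights; the real severity-to-weight mapping is not documented here.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}
MAX_FINDINGS = 50  # the top-50 cap from the pipeline above

# Keys the cluster-extraction prompt is expected to return for each cluster.
REQUIRED_KEYS = {
    "theme",
    "member_finding_ids",
    "representative_finding_ids",
    "suggested_standard_remediation",
    "confidence",
}


@dataclass
class Finding:  # hypothetical minimal shape, for illustration only
    id: str
    engagement_id: str
    severity: str
    confidence: float


def top_findings(findings, limit=MAX_FINDINGS):
    """Keep the `limit` most material findings, ranked by severity weight x confidence."""
    ranked = sorted(
        findings,
        key=lambda f: SEVERITY_WEIGHT[f.severity] * f.confidence,
        reverse=True,
    )
    return ranked[:limit]


def parse_clusters(raw, known_ids):
    """Parse the model reply strictly: reject missing keys and unknown finding ids."""
    clusters = json.loads(raw)["clusters"]
    for cluster in clusters:
        missing = REQUIRED_KEYS - set(cluster)
        if missing:
            raise ValueError(f"cluster missing keys: {sorted(missing)}")
        unknown = set(cluster["member_finding_ids"]) - set(known_ids)
        if unknown:
            raise ValueError(f"cluster references unknown findings: {sorted(unknown)}")
    return clusters
```

Checking `member_finding_ids` against the ids that were actually sent is one way to guard against the model inventing finding references; whether the real parser does this is not stated above.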
Endpoints¶
GET /auditforge/findings/portfolio-clusters?firm_id=...&primitive=...&severity=...
POST /auditforge/findings/portfolio-clusters/recompute?<same params>
GET returns the cached result, or null if no cache entry exists for these filters; it never triggers an LLM call. POST fires an Opus call, computes fresh clusters, caches the result, and returns it. The frontend's "Patterns" tab uses GET on initial load and POST when the partner clicks "Recompute".
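The GET/POST split can be illustrated with a small sketch. It assumes an in-memory dict standing in for the S3-backed cache, a 16-character SHA-256 prefix as the filter key (the actual prefix length is not stated), and that an entry older than the 24-hour TTL is treated as a miss; the function names here are hypothetical, not the real `make_filter_key()` or endpoint handlers.

```python
import hashlib
import json
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL

_cache = {}  # filter key -> (stored_at, result); in-memory stand-in for S3


def filter_key(filters):
    """Hash the filters as canonical JSON so that key order does not change the key."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def get_clusters(filters, now=None):
    """GET semantics: return the cached result, or None if absent or expired."""
    now = time.time() if now is None else now
    entry = _cache.get(filter_key(filters))
    if entry is None or now - entry[0] > CACHE_TTL_SECONDS:
        return None
    return entry[1]


def recompute_clusters(filters, compute, now=None):
    """POST semantics: always recompute, overwrite the cache entry, return the result."""
    now = time.time() if now is None else now
    result = compute(filters)  # in the real flow this is where the Opus call happens
    _cache[filter_key(filters)] = (now, result)
    return result
```

Keeping the read path free of side effects means that loading the Patterns tab never incurs LLM cost; only an explicit recompute does.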
For non-admin session callers, the firm_id parameter is overridden to the caller's firm — partners see only their firm's portfolio patterns.
Output shape¶
{
  "filter_key": "<sha256 prefix>",
  "cached": true,
  "result": {
    "generated_at": "2026-05-09T04:15:00Z",
    "filters": { "firm_id": "firm-...", "primitive": null, "severity": null, "canonical_only": true },
    "total_findings_considered": 47,
    "total_engagements_scanned": 7,
    "cost_cents": 28.4,
    "succeeded": true,
    "clusters": [
      {
        "id": "pc-001",
        "theme": "Subcontract DFARS 252.204-7012 flow-down language is incomplete or absent in Tier-2 vendors across multiple clients",
        "prevalence_summary": "found in 4 of 7 engagements (57%)",
        "member_engagement_ids": ["eng-...", "eng-...", "eng-...", "eng-..."],
        "member_finding_ids": ["cf-...", "cf-...", ...],
        "severity_distribution": { "critical": 2, "high": 5, "medium": 1, "low": 0 },
        "representative_quotes": [
          {
            "engagement_id": "eng-...",
            "finding_id": "cf-...",
            "quote": "Subcontractor shall implement controls aligned with NIST SP 800-171 r2 ...",
            "doc_title": "Subcontract Alpha — Software Services",
            "description_snippet": "The Master Contract §3.4 requires r3 ..."
          },
          ...
        ],
        "suggested_standard_remediation": "Establish a master-contract-to-subcontract reconciliation checklist that ...",
        "confidence": 0.85
      },
      ...
    ]
  }
}
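The derived fields in each cluster (`prevalence_summary`, `severity_distribution`) and the prevalence ordering could be computed along these lines. This is a sketch under assumptions: the helper names are made up for illustration, and rounding the percentage to the nearest whole number is inferred from the "4 of 7 (57%)" example rather than confirmed.

```python
from collections import Counter

SEVERITIES = ("critical", "high", "medium", "low")


def prevalence_summary(member_engagement_ids, total_engagements):
    """Format the share of scanned engagements in which the cluster appears."""
    if total_engagements <= 0:
        raise ValueError("total_engagements must be positive")
    affected = len(set(member_engagement_ids))  # count each engagement once
    pct = round(100 * affected / total_engagements)
    return f"found in {affected} of {total_engagements} engagements ({pct}%)"


def severity_distribution(member_severities):
    """Count member findings per severity, always emitting all four buckets."""
    counts = Counter(member_severities)
    return {severity: counts.get(severity, 0) for severity in SEVERITIES}


def sort_by_prevalence(clusters):
    """Order clusters so those touching the most engagements come first."""
    return sorted(
        clusters,
        key=lambda c: len(set(c["member_engagement_ids"])),
        reverse=True,
    )
```

Prevalence counts distinct engagements rather than findings, so a cluster with eight findings in four engagements still reports "4 of 7".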
Frontend¶
A new "Patterns" tab in the AuditForge nav. Layout:
- Filters: primitive dropdown, severity dropdown, "Recompute" button (with cost confirmation modal)
- Summary line: "47 findings analysed across 7 engagements · 6 patterns surfaced · cached · generated 4 hours ago"
- Cluster cards, sorted by prevalence:
  - Header: cluster id, prevalence summary, severity distribution badges, confidence percentage
  - Theme paragraph (LLM-written one-sentence description)
  - Suggested standard remediation (highlighted call-out)
  - Expandable section: representative verbatim quotes (click engagement ID → drill-through), full list of affected engagement IDs
Cost characteristics¶
| Operation | Cost |
|---|---|
| GET cached | Free |
| POST recompute (≤50 findings, ≤10 engagements) | ~$0.10–$0.30 |
| POST recompute (50 findings cap, dense portfolio) | ~$0.30–$0.40 |
Hard cap on the LLM call: $0.80 per recompute via an isolated CostBudget. Cache TTL: 24 hours.
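To make the hard cap concrete, here is a minimal sketch of a per-recompute budget guard. `RecomputeBudget` and `BudgetExceeded` are hypothetical stand-ins written for this example; the actual CostBudget class and its API are not shown in this document.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the cap."""


class RecomputeBudget:
    """Hypothetical per-recompute budget; 80 cents mirrors the $0.80 hard cap."""

    def __init__(self, cap_cents=80.0):
        self.cap_cents = cap_cents
        self.spent_cents = 0.0

    def charge(self, cents):
        """Record a cost, refusing any charge that would exceed the cap."""
        if self.spent_cents + cents > self.cap_cents:
            raise BudgetExceeded(
                f"charge of {cents} cents would exceed cap of {self.cap_cents} cents"
            )
        self.spent_cents += cents
```

Creating a fresh budget object for each recompute is one way to read "isolated": spending on one recompute cannot eat into the allowance of another.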
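At roughly $0.10–$0.40 per recompute, monthly recomputes cost about $1.20–$4.80 per filter set per year, so a firm maintaining a handful of filter sets spends roughly $5–10 annually on LLM calls. That is negligible against the value of cross-engagement insight.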
Security / access¶
- Admin-token callers and admin-role users see all firms' clusters; can pass firm_id to scope
- Non-admin session callers' firm_id is overridden to their own firm; cross-firm portfolio insight is not exposed
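The scoping rule above amounts to a small decision function. The sketch below assumes a hypothetical `Caller` object with `firm_id` and `is_admin` fields, and treats a missing `firm_id` from an admin as "all firms"; the real session model in auditforge_endpoints.py may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caller:  # hypothetical session shape, for illustration only
    firm_id: Optional[str]
    is_admin: bool


def effective_firm_id(caller, requested_firm_id):
    """Return the firm the query is scoped to; None means all firms (admins only)."""
    if caller.is_admin:
        # Admins may scope to any firm, or omit firm_id to see every firm.
        return requested_firm_id
    # Non-admins are always pinned to their own firm; the request value is ignored.
    return caller.firm_id
```

Overriding the parameter rather than rejecting the request means a non-admin client cannot probe for other firms' data by varying `firm_id`.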
Code¶
- app/auditforge/portfolio_clusters.py: PortfolioCluster + PortfolioClustersResult dataclasses, cluster_portfolio_findings() LLM entry point, S3-backed cache (load/save), make_filter_key() helper
- app/auditforge_endpoints.py: GET /findings/portfolio-clusters + POST /findings/portfolio-clusters/recompute
- frontend/src/components/PortfolioClusters.tsx: Patterns tab UI
- frontend/src/api/auditforge.ts: typed getPortfolioClusters + recomputePortfolioClusters
Open follow-ups¶
- Filter persistence per user — today the partner's filter selection is in-memory; on tab switch it resets. Save to localStorage.
- Date range filter — useful for "patterns from this fiscal year only"
- Per-cluster export — generate a partner-deliverable doc from a cluster (theme + suggested remediation + supporting evidence quotes)
- Cluster diff over time — compare current clusters vs previous month's; surface what's emerging vs what's resolved
- Deeper integration with engagement detail — when viewing a finding, show "this finding is part of pattern PC-003" with one-click pivot
Related¶
- 07-stage-f-deepening.md — within-engagement pattern detection (Stage F)
- 14-stage-e5-consolidation.md — same Opus reasoning approach, but per-engagement
- 19-findings-search.md — flat cross-engagement search (this is the structured cluster equivalent)