AuditForge — Diagrams

Mermaid source for visual artifacts used in engineering docs, sales decks, and investor pitches. Diagrams are built from a single source of truth so they don't drift between audiences. Render with mermaid.live, the GitHub renderer, or any docs site that supports mermaid (the AuditForge docs site does via MkDocs Material).


1. Seven-stage pipeline (with E.5 / F.5)

The end-to-end audit flow. Used in engineering architecture docs and the methodology white paper.

```mermaid
flowchart LR
    A[A. Profile<br/>corpus shape] --> B[B. Catalog<br/>primitive targets]
    B --> C[C. Synthesize<br/>scoped questions]
    C --> D[D. Validate<br/>relevance + dedupe]
    D --> E[E. Investigate<br/>retrieve+reason+verify]
    E -.iterates.-> B
    E --> E5[E.5 Consolidate<br/>cluster by root cause]
    E5 --> F[F. Deepen<br/>patterns + follow-ups]
    F --> F5[F.5 Filter<br/>def/likely/spec/reject]
    F5 --> G[G. Report<br/>white-label deliverable]

    style E5 fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style F5 fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style G fill:#0f3520,stroke:#22c55e,color:#fff
```

Reading guide:

- B → E loops through iterations 1..N (default 3) before E.5 fires.
- E.5 (consolidate) and F.5 (filter) run once at the end of all iterations, not once per iteration.
- The deliverable rendering at G applies firm-level white-label branding.
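
For readers who think in code, the loop boundary is easier to see in orchestration form. A minimal Python sketch: every stage function here is a placeholder standing in for the corresponding pipeline stage, not the actual AuditForge API.

```python
# Hypothetical orchestration sketch; stage names mirror the diagram.
def run_audit(corpus, n_iterations: int = 3):
    profile = profile_corpus(corpus)                    # A. Profile
    raw_findings = []
    for _ in range(n_iterations):                       # B -> E loop, default 3 passes
        catalog = build_catalog(profile, raw_findings)  # B. re-prioritized each pass (E iterates back)
        questions = synthesize_questions(catalog)       # C. scoped questions
        questions = validate_and_dedupe(questions)      # D. relevance + dedupe
        raw_findings += investigate(questions)          # E. retrieve + reason + verify
    canonical = consolidate(raw_findings)               # E.5 runs once, after all iterations
    deepened = deepen(canonical)                        # F. patterns + follow-ups
    filtered = filter_findings(deepened)                # F.5 def/likely/spec/reject
    return render_report(filtered)                      # G. white-label deliverable
```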


2. Per-question investigation (Stage E zoom-in)

What happens to a single question inside Stage E. Used in the engineering deep-dive doc and as a handout when a partner asks "what does the AI actually do per finding?"

```mermaid
flowchart TD
    Q[Question] --> R[Hybrid retrieve<br/>FAISS + BM25]
    R --> CK[Re-rank<br/>archetype-tuned]
    CK --> RZ[Reason<br/>primitive prompt template]
    RZ --> EA[Evidence anchor<br/>verbatim quote match]
    EA --> QV{quote_verified?}
    QV -->|yes| AV[Adversarial verifier<br/>Opus skeptical review]
    QV -->|no| FLAG[Flag for partner review]
    AV --> RC{verifier rec}
    RC -->|keep| OUT[Raw finding<br/>persisted]
    RC -->|refine| OUT2[Raw finding<br/>refined + persisted]
    RC -->|flag| FLAG
    FLAG --> OUT3[Raw finding<br/>flagged + persisted]

    style AV fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style FLAG fill:#3d2530,stroke:#f59e0b,color:#fff
    style OUT fill:#0f3520,stroke:#22c55e,color:#fff
    style OUT2 fill:#0f3520,stroke:#22c55e,color:#fff
    style OUT3 fill:#0f3520,stroke:#22c55e,color:#fff
```

Key points:

- Every finding cites verbatim quotes; the quote_verified flag is set when the quote matches retrieved chunks exactly.
- The adversarial verifier never silently rejects: it can refine, flag, or pass through.
- Even flagged findings reach the partner-review queue with the verifier's commentary attached.
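
The branch structure above compresses to a short sketch. Helper names (hybrid_retrieve, rerank, reason, adversarial_review, persist) are placeholders for the real stage implementations, and the finding schema is assumed:

```python
def investigate_question(question):
    chunks = hybrid_retrieve(question)           # FAISS dense + BM25 sparse, merged
    chunks = rerank(chunks, archetype=question.archetype)
    finding = reason(question, chunks)           # primitive-specific prompt template

    # Evidence anchor: the cited quote must appear verbatim in a retrieved chunk.
    finding.quote_verified = any(finding.quote in c.text for c in chunks)
    if not finding.quote_verified:
        finding.status = "flagged"               # partner review, never silently dropped
        return persist(finding)

    rec = adversarial_review(finding)            # skeptical second-model pass
    if rec.action == "refine":
        finding = rec.refined
    elif rec.action == "flag":
        finding.status = "flagged"
        finding.verifier_comment = rec.comment
    return persist(finding)                      # keep / refine / flag all persist
```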


3. Filter override ruleset (Stage F.5 zoom-in)

Why findings move between filter buckets. Used in the engineering filter doc and procurement-review responses ("why won't AuditForge silently delete an inconvenient finding?").

```mermaid
flowchart TD
    IN[Canonical finding] --> P1[Pass 1: LLM classify]
    P1 --> ST{status}
    ST -->|definitive| KEEP1[Main body]
    ST -->|likely| KEEP2[Main body, qualified language]
    ST -->|speculative| KEEP3[Main body, hedged]
    ST -->|rejected| OV[Pass 2: override ruleset]
    OV --> R1{corroboration ≥ 0.7?}
    R1 -->|yes| UP[Upgrade to speculative]
    R1 -->|no| R2{≥3 quotes from ≥3 docs?}
    R2 -->|yes| UP
    R2 -->|no| R3{regulatory pattern match?}
    R3 -->|yes| UP
    R3 -->|no| REJ[Considered-but-rejected appendix]
    UP --> KEEP3

    style OV fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style UP fill:#0f3520,stroke:#22c55e,color:#fff
    style REJ fill:#3d2530,stroke:#f59e0b,color:#fff
```

Key points:

- The override ruleset is upgrade-only: it can elevate a Pass 1 reject to "speculative" but never demotes a Pass 1 keep.
- Three deterministic rules check corroboration, evidence breadth, and regulatory-pattern match; any single rule firing triggers the override.
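
Because the rules are deterministic, Pass 2 reduces to a few lines. A sketch, assuming the finding carries fields named corroboration_score, quotes, and matched_regulatory_pattern (these names are illustrative, not the real schema):

```python
def apply_override(finding) -> str:
    """Pass 2: upgrade-only override for Pass 1 rejects."""
    distinct_docs = {q.doc_id for q in finding.quotes}
    if (
        finding.corroboration_score >= 0.7                          # rule 1: corroboration
        or (len(finding.quotes) >= 3 and len(distinct_docs) >= 3)   # rule 2: evidence breadth
        or finding.matched_regulatory_pattern                       # rule 3: regulatory pattern
    ):
        return "speculative"   # upgraded into the main body, hedged language
    return "rejected"          # considered-but-rejected appendix
```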


4. Data flow

How a partner's engagement moves through the system from request to deliverable. Used in engineering onboarding and integration documentation.

```mermaid
sequenceDiagram
    participant P as Partner (UI)
    participant A as FastAPI (auditforge_router)
    participant S as run_audit() pipeline
    participant LLM as Anthropic / OpenAI
    participant ST as S3 store
    participant DEL as Deliverable cache

    P->>A: POST /engagement (firm, client, archetype, budget)
    A->>ST: persist engagement record
    A-->>P: 201 engagement object

    P->>A: POST /engagement/{id}/intake
    A->>ST: persist intake
    A-->>P: updated engagement

    P->>A: POST /engagement/{id}/run
    A->>S: spawn background task
    A-->>P: 202 queued

    P->>A: GET /engagement/{id}/stream (SSE)

    loop pipeline stages
        S->>LLM: stage call
        LLM-->>S: response
        S->>ST: persist findings/state
        S->>A: progress event
        A-->>P: SSE event
    end

    S->>DEL: write polished deliverable
    S->>A: run_complete

    P->>A: GET /engagement/{id}/deliverable?format=markdown
    A->>DEL: lookup cached
    DEL-->>A: polished MD
    A-->>P: deliverable
```
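
The same flow from the partner's side, sketched with requests. The endpoints and the run_complete event come straight from the diagram; the payload fields, the id field on the engagement object, and the SSE event framing are assumptions:

```python
import requests

BASE = "https://metis-demo.base2ml.com"   # demo host from the deployment diagram

# 1. Create the engagement (201 with the engagement object)
eng = requests.post(f"{BASE}/engagement", json={
    "firm": "Example LLP",                # payload fields are illustrative
    "client": "Acme Corp",
    "archetype": "remediation_pipeline",
    "budget": 150,
}).json()
eid = eng["id"]                           # assumes the object carries an "id" field

# 2. Intake, then kick off the background run (202 queued)
requests.post(f"{BASE}/engagement/{eid}/intake", json={"documents": []})
requests.post(f"{BASE}/engagement/{eid}/run")

# 3. Follow progress over SSE until run_complete
with requests.get(f"{BASE}/engagement/{eid}/stream", stream=True) as resp:
    for line in resp.iter_lines():
        if line.startswith(b"data:"):
            print(line.decode())
        if b"run_complete" in line:       # event naming per the sequence diagram
            break

# 4. Fetch the cached deliverable
md = requests.get(f"{BASE}/engagement/{eid}/deliverable",
                  params={"format": "markdown"}).text
```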

5. Production deployment topology

Where the code actually runs. Used in deployment docs and security reviews.

```mermaid
flowchart TB
    subgraph internet[" "]
        U[Partner's browser<br/>any device]
    end

    subgraph aws[AWS us-east-1 · Account 741783034843]
        DNS[Vercel DNS<br/>metis-demo.base2ml.com]
        ALB[Application Load Balancer<br/>HTTPS w/ ACM cert]

        subgraph ecs[ECS Fargate · Graviton2 ARM64]
            APP[FastAPI app<br/>+ React SPA static]
        end

        subgraph storage[S3 + SSM]
            S3[Bucket: mobilemetis-metis-indexes-741783034843<br/>auditforge/* + clients/*]
            SSM1[SSM: ANTHROPIC_API_KEY]
            SSM2[SSM: AUDITFORGE_OPENAI_API_KEY]
            SSM3[SSM: ADMIN_TOKEN]
        end

        subgraph llm[External LLM APIs]
            CL[Anthropic API<br/>Sonnet 4.6 / Opus 4.7]
            OAI[OpenAI API<br/>gpt-4o-mini]
        end
    end

    U -->|HTTPS| DNS
    DNS --> ALB
    ALB --> APP
    APP --> S3
    APP -.read on boot.-> SSM1
    APP -.read on boot.-> SSM2
    APP -.read on boot.-> SSM3
    APP -->|HTTPS| CL
    APP -->|HTTPS| OAI

    GH[GitHub<br/>Actions CD] -.push to main.-> ECR[ECR<br/>metis-demo:latest]
    ECR -.pulls on deploy.-> APP

    style APP fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style ALB fill:#0f3520,stroke:#22c55e,color:#fff
    style S3 fill:#1a1a3a,stroke:#3a4a8a,color:#fff
```
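
The three dotted read-on-boot edges are startup-time SSM reads. A minimal boto3 sketch of that pattern (the parameter names match the diagram; the loader itself is illustrative, not the app's actual startup code):

```python
import os
import boto3

ssm = boto3.client("ssm", region_name="us-east-1")

# Read once at container start; secrets never bake into the image or task definition.
for name in ("ANTHROPIC_API_KEY", "AUDITFORGE_OPENAI_API_KEY", "ADMIN_TOKEN"):
    param = ssm.get_parameter(Name=name, WithDecryption=True)
    os.environ[name] = param["Parameter"]["Value"]
```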

6. The ten primitives, mapped

What each primitive looks for at a glance. Used in sales decks (the page that justifies "yes we cover the textual issues a senior partner would catch").

```mermaid
mindmap
    root((10 audit<br/>primitives))
        Coverage
            coverage_check
                Required document missing
                Control category absent
        Consistency
            conflict_check
                Documents take opposing positions
            consistency_check
                Defined term used differently
        Currency
            currency_check
                Superseded standard cited
            citation_integrity_check
                Cited authority does not say claim
        Hierarchy
            flow_down_check
                Parent obligation not propagated
        Logic
            temporal_check
                Required precedence violated
            quantitative_check
                Quantitative inconsistency
            obligation_check
                One-sided obligation
            ambiguity_check
                Dispute-prone language
```
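
The same taxonomy as a flat lookup table, for docs or code that need it without the picture (the dict layout is illustrative; the check names and descriptions are the mindmap's):

```python
# primitive -> (family, what it flags); descriptions condensed from the mindmap
PRIMITIVES = {
    "coverage_check":           ("Coverage",    "required document or control category missing"),
    "conflict_check":           ("Consistency", "documents take opposing positions"),
    "consistency_check":        ("Consistency", "defined term used differently"),
    "currency_check":           ("Currency",    "superseded standard cited"),
    "citation_integrity_check": ("Currency",    "cited authority does not say the claim"),
    "flow_down_check":          ("Hierarchy",   "parent obligation not propagated"),
    "temporal_check":           ("Logic",       "required precedence violated"),
    "quantitative_check":       ("Logic",       "quantitative inconsistency"),
    "obligation_check":         ("Logic",       "one-sided obligation"),
    "ambiguity_check":          ("Logic",       "dispute-prone language"),
}
```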

7. Engagement archetypes

How the four archetypes tune the same engine. Used in sales conversations explaining "why we have four archetypes, not one or twelve."

```mermaid
flowchart LR
    subgraph engine[Same engine, four tunings]
        direction TB
        E1[Catalog priority<br/>weighting]
        E2[Validator<br/>strictness]
        E3[Deliverable<br/>framing template]
    end

    A1[Capability + Leverage<br/>balanced; junior-readable] --> engine
    A2[Remediation Pipeline<br/>coverage/flow-down/obligation heavy] --> engine
    A3[Premium / Defensibility<br/>citation/consistency/conflict heavy] --> engine
    A4[Continuous Monitoring<br/>currency/temporal heavy] --> engine

    engine --> OUT[Same primitives,<br/>different finding emphasis]

    style A2 fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style A3 fill:#3a2952,stroke:#9c3a7a,color:#fff
```

Key point: New audit types (HIPAA, SOC 2, ABA Model Rules) are not new code — they're new archetype configurations or new intake prompts.
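
In code terms, that means an archetype is a configuration object and nothing more. A sketch with the three tuning knobs from the diagram (field names and values are assumptions, not the real schema):

```python
from dataclasses import dataclass

@dataclass
class Archetype:
    name: str
    catalog_weights: dict[str, float]   # primitive -> priority weight (Catalog stage)
    validator_strictness: float         # Stage D validation threshold
    deliverable_template: str           # Stage G framing template

# e.g. the remediation archetype weights coverage / flow-down / obligation heavily
REMEDIATION = Archetype(
    name="remediation_pipeline",
    catalog_weights={"coverage_check": 2.0, "flow_down_check": 2.0,
                     "obligation_check": 2.0},  # unlisted primitives default to 1.0
    validator_strictness=0.5,
    deliverable_template="remediation.md.j2",
)
```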


8. Compound knowledge moat (cross-engagement)

The investor-facing diagram explaining why audit firms can't easily move off AuditForge once they've run 5+ engagements. Used in investor pitches.

```mermaid
flowchart LR
    E1[Engagement 1<br/>17 findings] --> KB[(Firm knowledge base<br/>partner-validated findings,<br/>cross-engagement search)]
    E2[Engagement 2<br/>22 findings] --> KB
    E3[Engagement 3<br/>15 findings] --> KB
    EN[Engagement N<br/>findings] --> KB

    KB --> SEARCH[Cross-engagement<br/>search]
    KB --> PATTERN[Cross-engagement<br/>patterns]
    KB --> METHOD[Methodology<br/>refinement]

    SEARCH --> NEW[New engagement<br/>informed by all priors]
    PATTERN --> NEW
    METHOD --> NEW

    style KB fill:#1e3a52,stroke:#3a7a9c,color:#fff
    style NEW fill:#0f3520,stroke:#22c55e,color:#fff
```

Key point: The Nth audit benefits from the prior N-1. Switching costs compound — leaving AuditForge means abandoning a structured, searchable record of every prior engagement decision.


Where these diagrams render

  • Engineering docs (this directory): GitHub renders mermaid natively. The MkDocs Material site at https://docs.base2ml.com supports mermaid via the pymdownx.superfences extension.
  • Sales decks: render in mermaid.live, screenshot, embed in slides. Or use a presentation tool that supports mermaid directly (Slidev, marp).
  • Investor pitches: same as sales — screenshot the mermaid render or use a renderer-aware tool.
  • Methodology white paper: link to this file from the published methodology, or render the relevant ones inline.

Updating the diagrams

When the architecture changes, update the Mermaid source here, not screenshots somewhere downstream. Sales and investor artifacts should re-render from this source: the cost is small, and screenshot drift is the main reason audit-firm sales material misrepresents the actual product after six months.