AuditForge — Operator Manual¶
For audit-firm partners and associates running engagements through the AuditForge web UI. This is the document you hand to a newly-onboarded associate at the firm.
What you're operating¶
AuditForge is a deep-audit engine. You give it a corpus of contracts, policies, SOPs, and attestations; it produces a structured set of evidence-anchored findings; you review, accept, refine, or reject each one before the deliverable goes to your end client. The deliverable carries your firm's branding, not AuditForge's.
You are the senior reviewer. AuditForge surfaces candidate findings; your judgment decides which ones reach the client.
Roles¶
AuditForge has three user roles within a firm:
| Role | What you can do |
|---|---|
| Admin | Everything — platform-wide. Used for firm/user CRUD and recovery. |
| Partner | Full access to your firm's engagements: create, run, accept/reject/refine/edit findings, investigate further, recompute portfolio patterns. |
| Associate | Read-only. Browse engagements, view findings, export deliverables and audit logs — but can't make decisions on findings. The intent is that the partner reviews and signs off. |
Your firm admin sets your role when creating your account. To change someone's role (e.g., promote an associate to partner), the admin uses POST /auditforge/user/{user_id}/role. Existing sessions are revoked so the new role takes effect on next login.
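If your admin prefers to script the role change rather than click through the UI, a minimal sketch of the call might look like the following. The base URL, auth header scheme, and JSON body shape are assumptions; confirm them against your deployment before relying on this.

```python
import requests

BASE_URL = "https://metis-demo.base2ml.com"   # assumed deployment URL
ADMIN_TOKEN = "<admin token>"                  # admin credential; header scheme is an assumption
user_id = "usr_123"                            # hypothetical ID of the account being promoted

# Promote an associate to partner. The documented behavior is that the user's
# existing sessions are revoked, so the new role takes effect on next login.
resp = requests.post(
    f"{BASE_URL}/auditforge/user/{user_id}/role",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    json={"role": "partner"},                  # body shape is an assumption
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```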
Logging in¶
Open https://metis-demo.base2ml.com/?view=auditforge. The landing page describes the product; scroll down to the auth gate or click into it directly.
Two ways to sign in:
- Email + password (preferred) — your firm admin creates an account for you with an initial password; sign in with email + password and the UI prompts you to rotate the password on first login. Sessions last 12 hours.
- Admin token — legacy path used during pilot onboarding. Contact chris@base2ml.com to obtain a token if your firm needs one for initial setup. Per-user accounts are the preferred path; admin tokens are issued only for the initial setup window and rotated regularly.
MFA (recommended). After signing in, click "Enable MFA" in the header. Scan the QR code with Google Authenticator, 1Password, Authy, or any RFC-6238 authenticator app. Save the 10 backup codes shown at the end of enrollment — each works exactly once and they're shown only this time. After MFA is enabled, every future login requires both your password and a 6-digit code from your authenticator.
If you lose your authenticator device, use a backup code at the login prompt instead of the 6-digit code. If you lose both, contact your firm admin to clear MFA via the admin recovery flow.
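Any RFC-6238 app works because the 6-digit code is a pure function of the shared secret (encoded in the enrollment QR code) and the current time. A minimal illustration using the third-party pyotp library; the secret value is made up, and the digit count and 30-second period are the RFC defaults, assumed rather than confirmed for AuditForge.

```python
import pyotp

secret = "JBSWY3DPEHPK3PXP"        # hypothetical base32 secret from the enrollment QR code

totp = pyotp.TOTP(secret)          # RFC-6238 defaults: SHA-1, 6 digits, 30-second period
print(totp.now())                  # the same 6-digit code your authenticator app shows right now
print(totp.verify("123456"))       # how a server-side check conceptually works (True/False)
```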
The "Sign out" button (top-right) revokes your session server-side. Closing your laptop without signing out leaves the session active until the 12-hour TTL expires.
The guided tour¶
The first time you log in, AuditForge auto-launches a 21-step walkthrough that covers everything below. Skip or step through at your own pace. Restart it any time from the Take tour button in the header. The tour is also designed to be demo-friendly — if a prospect or partner is shadowing your screen, the overlay narrates the product so you don't have to.
The four tabs¶
| Tab | Purpose |
|---|---|
| Engagements | Portfolio dashboard — every audit your firm has run, with summary cards and filters. Default view. |
| Search | Cross-engagement findings search. Free-text + filters across every audit your firm has produced. |
| Patterns | Portfolio clustering. Recurring themes across multiple audits with prevalence + suggested standard remediation. |
| Firms | Branding profiles applied to deliverables. Set up your firm's logo, colors, methodology disclaimer here before you run your first engagement. |
First-time setup: configure your firm¶
Before your first engagement, click the Firms tab and + New firm.
| Field | What to put |
|---|---|
| Display name | "Acme Audit Partners, LLP" — exact text on the cover page |
| Short name | "Acme" — used in tight footers |
| Tagline | "Defense compliance · Pittsburgh" — appears under the firm name on cover |
| Logo URL | Public HTTPS URL to your logo (firm hosts on their own CDN). PNG with transparent background renders best. |
| Primary color | Cover heading bars and accent. Use your firm brand color. |
| Accent color | Highlights and callouts. |
| Confidentiality notice | "CONFIDENTIAL — prepared for [Client] under MSA dated …" — renders as a blockquote above the executive summary |
| Methodology disclaimer | "This audit was conducted by Acme using AI-assisted document review tooling. All findings have been reviewed and validated by a senior partner before delivery." — appended to the methodology section |
| Footer text | "© 2026 Acme Audit Partners, LLP · acme-audit.com · Privileged & Confidential" — every page footer |
| Default archetype | What pre-fills on the new-engagement form. Use Remediation Pipeline if you typically scope follow-up work. |
| Default budget | What pre-fills on the new-engagement form. $15 is a safe default for a synthetic-corpus validation; production audits usually run $5–50. |
You can edit any of these later — changes apply to future deliverable renders. Existing cached deliverables don't auto-regenerate; re-run the deliverable export to pick up changes.
Multiple firm profiles are supported (e.g., parent firm + a sub-brand). Each engagement points at one firm.
Creating an engagement¶
Click + New engagement on the Engagements tab.
Option A — AI-assisted (recommended)¶
Click ✨ AI-assisted intake at the top of the form. Type a paragraph describing the engagement: client industry, frameworks in scope, focus areas, anything you already know is concerning. Two to five sentences is enough.
Click Extract & fill form. The form below populates from the LLM's interpretation. Review and edit anything that's wrong. Cost: ~$0.05 per extract.
Option B — Manual¶
Skip the AI panel and fill the form directly. Required fields are marked with *.
Form fields¶
| Field | What it does |
|---|---|
| Firm | The branding profile applied to the deliverable. Defaults to your firm's primary profile if you have only one. |
| Client name | The end client's name as you want it on the cover ("Northstar Defense Inc."). |
| Corpus | The Metis tenant whose corpus this audit runs against. Must be ingested already; a partner ops engineer handles ingestion separately. |
| Archetype | Audit framing — see 01-architecture.md for the four archetypes. |
| Domain | One sentence about the client and their posture. |
| Audit purpose | One sentence about why this audit and what decision it informs. |
| Frameworks | Comma-separated. Be specific: "NIST SP 800-171, DFARS 252.204-7012, CMMC L2", not "cybersecurity rules". |
| Focus areas | Comma-separated. Domain areas the partner already knows are interesting. |
| Known concerns | Comma-separated. Things the partner suspects are problems before the audit even starts. The catalog stage prioritizes investigating these. |
| Budget | Hard cap on compute spend. The system gracefully downshifts model tiers as it approaches the cap. |
| Max questions | Optional cap on the investigate stage. Useful when you want predictable cost on a large corpus. |
| Adversarial verifier | Toggle. Doubles per-question cost but materially reduces false-positive rate. Default on. |
Click "Create & start audit"¶
The system creates the engagement record, sets the intake, and kicks off the audit pipeline as a background task. You're dropped into the engagement detail view. A live progress banner shows pipeline stage transitions in real time.
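For reference, the form serializes into an intake record the pipeline consumes. A sketch of what that record might look like, with field names and values inferred from the form labels above; the exact schema is an assumption, not a documented API.

```python
# Hypothetical intake payload; keys mirror the form labels, not a confirmed schema.
intake = {
    "firm": "acme",
    "client_name": "Northstar Defense Inc.",
    "corpus": "northstar-2026",                  # Metis tenant; must already be ingested
    "archetype": "remediation_pipeline",
    "domain": "Mid-size defense subcontractor handling CUI under DFARS flow-downs.",
    "audit_purpose": "Pre-assessment ahead of a CMMC L2 certification attempt.",
    "frameworks": ["NIST SP 800-171", "DFARS 252.204-7012", "CMMC L2"],
    "focus_areas": ["incident response", "subcontractor flow-downs"],
    "known_concerns": ["audit-rights clauses missing from two subcontracts"],
    "budget_usd": 15.0,                          # hard cap; model tiers downshift near the cap
    "max_questions": 40,                         # optional cap on the investigate stage
    "adversarial_verifier": True,                # doubles per-question cost, fewer false positives
}
```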
Reviewing findings¶
When the audit completes, the engagement's status flips to findings_review. You'll see a finding count, severity distribution, and the canonical findings in the left-hand list.
The list¶
- Sorted by severity descending, then confidence descending
- Filter chips for severity, status, and primitive at the top
- Each row shows: severity badge, finding title (first sentence of description), primitive, confidence, merged-finding count if it's a canonical
- Rejected findings render with strikethrough; accepted findings show a green pill
The detail panel¶
Click any finding. The right pane shows:
- Finding — the full description
- Root cause — the structural / process failure behind it (often the most useful field for partner judgment)
- Evidence — verbatim quotes from the source documents, anchored to title + section + page. Each quote is what the LLM saw; if a quote doesn't appear in the source, that's a red flag and quote_verified: false will surface
- Remediation — scope of work, estimated effort hours, dependencies, risk if unaddressed. This becomes the "next engagement" sales motion.
- Consolidation footer if this is a canonical: "Consolidated from N raw findings · M primitive angles agreed · corroboration X%"
The four actions¶
| Action | When to use |
|---|---|
| Accept | The finding is correct and well-evidenced. Goes to the deliverable as-is. |
| Reject | The finding is wrong or out of scope. Goes to the "Considered but rejected" appendix with your reason in the auditor notes. |
| Refine (save notes) | The finding is useful but you have additional context. Status stays Refined; notes show in the audit log. |
| ✎ Edit | The finding is correct but the language needs tightening before client delivery. Edit the description, root cause, remediation, severity, or hours inline. Status flips to Refined and the audit log records [EDITED <fields>]. |
The auditor-notes textarea is always visible. Whatever you type there is sent with whichever action button you click.
Investigate further¶
Below the auditor notes is "Investigate further…". Click to spawn a focused mini-audit seeded by this finding.
In the panel that opens: optional steering text ("look at audit-rights flow-down specifically across both subcontracts"), budget cap (default $15), max questions (default 20). Hit Start focused audit. New findings appear in the engagement when the focused run completes (1–3 minutes).
Exporting the deliverable¶
Export buttons in the engagement detail header:
| Format | Use for |
|---|---|
| Print / PDF | Open in a new tab; browser Print → Save as PDF produces a client-deliverable PDF with firm branding |
| Markdown | Inline review, copy/paste into your firm's report template, version-control |
| DOCX | Open in Word, edit, deliver to client |
| JSON | Downstream tooling — your CRM, project management, billing |
| Audit log | Newline-delimited JSON of every LLM call AuditForge made for this engagement. Hand to your client's procurement reviewer when asked "show me how this was conducted." |
The deliverable renders with your firm's branding (cover, methodology disclaimer, footer) and includes the canonical findings (filtered to non-rejected) in the main body, plus a "Considered but rejected" appendix for transparency.
The polished version with the LLM-written executive summary is generated at audit-complete time and cached. Subsequent exports return the cached artifact in milliseconds.
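Because the Audit log export is newline-delimited JSON, it is easy to feed into your own tooling. A minimal sketch that tallies LLM calls per pipeline stage and total recorded spend; the "stage" and "cost_usd" field names are assumptions about the log schema, so inspect one record from your own export first.

```python
import json
from collections import Counter

calls = Counter()
total_cost = 0.0

with open("engagement-audit-log.jsonl") as fh:
    for line in fh:
        if not line.strip():
            continue
        record = json.loads(line)
        # "stage" and "cost_usd" are assumed field names, not a documented schema.
        calls[record.get("stage", "unknown")] += 1
        total_cost += record.get("cost_usd", 0.0)

for stage, n in calls.most_common():
    print(f"{stage}: {n} LLM calls")
print(f"total recorded spend: ${total_cost:.2f}")
```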
Portfolio patterns¶
Click the Patterns tab to see recurring themes across your firm's audit portfolio. The system clusters findings that share a structural root cause and appear in multiple clients' audits.
Each cluster shows:
- A one-sentence theme describing the recurring issue
- Prevalence ("found in 4 of 7 engagements")
- Severity distribution within the cluster
- Suggested standard remediation language the firm can templatize
- Representative quotes from member findings (click engagement IDs to drill through)
Cost: Each "Recompute" click runs an Opus call (~$0.10–$0.40 depending on portfolio size). Results cache for 24 hours, so re-opening the tab is free. The button shows the estimated cost before firing.
Why this matters: When you find the same flow-down gap in 4 of your 7 defense clients, that's a productizable opportunity. The Patterns view surfaces these candidates so your firm can templatize remediation playbooks and price them as offerings.
Cross-engagement search¶
Click the Search tab to find findings across every engagement your firm has run. Useful for:
- "Have we found this exact concern in another client's audit?" — type the concern, see every prior finding that mentions it
- "Show me every flow_down_check we've ever produced" — filter by primitive
- "Has the senior partner already accepted findings on NIST SP 800-171 r3 this quarter?" — combine free-text + status=accepted
- Refining your firm's internal methodology — review every audit's findings on a single concern in one view
The search runs against every engagement visible to your account (all of your firm's audits). Click any result to jump directly to that engagement's detail.
The match shows the engagement client name, firm, primitive, severity, confidence, and a two-line preview. Auditor notes (if any) appear under the description so you can see prior partner judgment at a glance.
Costs you should know¶
| Activity | Typical cost |
|---|---|
| Audit on a small corpus (10–20 docs) | $2–10 |
| Audit on a medium corpus (50–200 docs) | $10–50 |
| Audit on a large corpus (200+ docs) | $50–200 |
| Investigate-further on one finding | $1–3 |
| AI-assisted intake extract | $0.03–0.08 |
| Adversarial verifier (per-question Opus call) | doubles per-question cost |
Cost is tracked per engagement and visible in the engagement detail header. Hard caps prevent runaway spend.
When things go wrong¶
"Failed to load corpora"¶
Your admin token isn't valid or the backend is down. Click "Clear token" in the header and re-enter.
"Account locked. Try again in 15 minutes."¶
You've hit the brute-force lockout — 5 failed login attempts within 15 minutes. Wait 15 minutes for the lockout to expire, or ask your firm admin to clear it. The lockout protects you from credential stuffing; it triggers on wrong passwords, wrong MFA codes, and invalid backup codes. Successful logins (with correct password + MFA) clear the counter.
Audit aborts early with "BudgetExceeded"¶
Your budget cap was too low for the corpus size. The findings produced before the abort are persisted; you can review them but the deliverable won't have an executive summary. Re-run with a higher budget or a max_questions_per_iteration cap.
Finding has quote_verified: false¶
The LLM produced a quote but it didn't match any chunk in the retrieved corpus. Treat as suspicious — verify against the source PDF before accepting.
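Conceptually, the flag means the quote could not be matched to any retrieved chunk. If you want to spot-check a quote yourself against your own text extraction of the source document, a rough whitespace-insensitive substring check like the one below is usually enough; this is purely illustrative, not AuditForge's actual matcher, and the file name and quote are made up.

```python
import re

def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so PDF line breaks don't cause false mismatches.
    return re.sub(r"\s+", " ", text).strip().lower()

def quote_appears(quote: str, source_text: str) -> bool:
    return normalize(quote) in normalize(source_text)

source_text = open("northstar-msa-extracted.txt").read()   # your own extraction of the source doc
quote = "Contractor shall flow down DFARS 252.204-7012 to all subcontractors"
print(quote_appears(quote, source_text))
```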
Live progress banner shows "stream lost — falling back to polling"¶
SSE connection dropped (network blip). Findings still appear, just on a 5-second poll instead of immediately. Refresh the page to retry.
"Engagement has no recorded client_id"¶
You created an engagement but never ran the initial audit. Click Start audit on the engagement detail view; investigate-further requires the initial run to record the corpus binding.
Multi-reviewer collaboration¶
Two patterns of multi-partner collaboration are supported:
Engagement assignment. Each engagement can have one assigned reviewer. Click the small "Assigned: …" or "Unassigned" chip in the engagement header to open a dropdown of admin/partner users in your firm. Pick one to assign; pick "— Unassign —" to clear. The chip is informational — it doesn't restrict who can edit findings (that's role-based), it just signals "this person is the lead reviewer."
Per-finding comments. Below each finding's evidence and remediation sections is a "Reviewer comments" thread. Use it for partner-to-partner context: "I checked clause 4.2, looks clean", "investigate further before accepting", "second opinion needed on severity." Each comment shows author + timestamp; author or an admin can delete. Comments are internal — not rendered in the deliverable.
Don't confuse the comments thread with the public Auditor notes field at the bottom of each finding. Auditor notes flow into the finding record and become part of the audit trail; comments are throwaway internal context.
Onboarding a new client corpus (self-serve)¶
Click + Upload corpus on the Engagements tab (next to + New engagement). This single flow handles everything: creating the engagement, uploading documents, kicking off chunking + indexing.
1. Setup — pick your firm, name the client. The modal creates the engagement immediately.
2. Upload — drag-and-drop your documents (PDF, DOCX, TXT, MD, CSV, XLSX, EML, MBOX, PNG, JPG, TIFF). Files upload one at a time so a flaky network only re-uploads the failing file. Up to 50 MB per file, 500 files per engagement.
3. Start ingest — once all files are uploaded, click Start ingest. A background task chunks the documents and builds the search index. Typically 1–5 minutes depending on corpus size. Live progress shows in the modal; safe to leave it open or close and come back.
4. Open engagement — once ingestion completes, click Open engagement to land in the engagement detail view. Fill the intake (or use AI-assisted extract), pick an archetype, click Start audit.
If a file fails to upload (network blip, bad format), the row shows a Retry button. If ingestion fails (corrupted PDF, malformed CSV), you'll see the error message and can delete the offending file, then retry.
This replaces the older "create a PilotForge tenant manually, upload files via a separate UI, then come back to AuditForge" flow. Onboarding a fresh client corpus is now a 5–10 minute self-serve process.
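If you are staging a large corpus, a quick local pre-check against the documented limits (50 MB per file, 500 files per engagement, supported formats) saves upload round trips. A sketch, assuming the corpus sits in one folder; the extension list mirrors the formats above and the .jpeg alias is an assumption.

```python
from pathlib import Path

ALLOWED = {".pdf", ".docx", ".txt", ".md", ".csv", ".xlsx",
           ".eml", ".mbox", ".png", ".jpg", ".jpeg", ".tiff"}
MAX_BYTES = 50 * 1024 * 1024   # 50 MB per file
MAX_FILES = 500                # per engagement

files = [p for p in Path("corpus/").rglob("*") if p.is_file()]
problems = []
for p in files:
    if p.suffix.lower() not in ALLOWED:
        problems.append(f"{p.name}: unsupported format")
    if p.stat().st_size > MAX_BYTES:
        problems.append(f"{p.name}: {p.stat().st_size / 2**20:.0f} MB exceeds the 50 MB limit")
if len(files) > MAX_FILES:
    problems.append(f"{len(files)} files exceeds the 500-file limit")

print("\n".join(problems) or "corpus looks uploadable")
```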
Saving and reusing templates¶
A firm running the same shape of audit (e.g., CMMC pre-assessment) for many clients shouldn't re-fill the intake every time. Click Save as template in the engagement header (after the engagement has an intake) → the platform captures the firm, archetype, budget, and full intake. Next time you click + New engagement, a "Start from template" picker appears at the top of the form; pick one and the form prefills.
The template is a starting point, not a lock — you can edit any field before kicking off the audit. Editing a template later does not retroactively change engagements already created from it.
Comparing portfolio patterns over time¶
In the Patterns tab, click Show diff vs previous snapshot to see how the firm's accumulated patterns evolved between the latest two recomputes. Each entry is classified as:
- NEW — a pattern that didn't exist before
- GREW — a pattern that gained findings (more clients show this issue)
- SHRANK — a pattern with fewer findings (remediations landed)
- STEADY — same membership as before
- RESOLVED — a pattern that disappeared (no longer detected in current findings)
Useful for partner conversations: "Look — last quarter we had 4 clients with flow-down gaps. Now it's 7. Standard remediation playbook is the right path." Each Recompute captures a fresh snapshot; only the most recent prior snapshot is retained for diff (single-hop, not full history).
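The classification is a straightforward comparison of cluster membership between the last two snapshots. A sketch of that logic as described above, assuming each snapshot maps a pattern key to its member-finding count; this is illustrative, not the platform's actual implementation.

```python
def classify_patterns(previous: dict[str, int], current: dict[str, int]) -> dict[str, str]:
    """Label each pattern NEW / GREW / SHRANK / STEADY / RESOLVED
    by comparing member counts across the two most recent snapshots."""
    labels = {}
    for key, count in current.items():
        if key not in previous:
            labels[key] = "NEW"
        elif count > previous[key]:
            labels[key] = "GREW"
        elif count < previous[key]:
            labels[key] = "SHRANK"
        else:
            labels[key] = "STEADY"
    for key in previous:
        if key not in current:
            labels[key] = "RESOLVED"
    return labels

print(classify_patterns({"flow_down_gap": 4, "stale_policy": 2},
                        {"flow_down_gap": 7, "mfa_gap": 1}))
# {'flow_down_gap': 'GREW', 'mfa_gap': 'NEW', 'stale_policy': 'RESOLVED'}
```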
Downloading the audit log via signed URL¶
For most engagements, click the Audit log button to stream the JSONL download through your browser. For very large engagements (50+ MB), the streaming approach can time out. Click the small URL button next to it instead — the platform materializes the log to S3 and copies a presigned URL to your clipboard (10-minute expiry). Paste the URL into your terminal to download with curl/wget, or into a download manager that handles big files better than the browser.
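If you would rather script the large-log download than use a download manager, a minimal sketch follows; paste in the presigned URL copied to your clipboard (remember the 10-minute expiry), and note the output filename is just an example.

```python
import requests

presigned_url = "PASTE-PRESIGNED-URL-HERE"   # copied by the URL button; expires in 10 minutes

# Stream to disk in 1 MB chunks so a 50+ MB log never sits fully in memory.
with requests.get(presigned_url, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    with open("engagement-audit-log.jsonl", "wb") as fh:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            fh.write(chunk)
```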
Archiving an engagement¶
After an engagement is delivered, an admin can Archive it from the engagement header. Archived engagements drop out of the default engagement list — useful when you have 50+ delivered engagements and only the last ~10 are relevant to today's work. Findings, audit logs, deliverables are all preserved.
To see archived engagements, check the Include archived box at the top of the engagement list. To restore one, open it and click Unarchive in the header. Archive/unarchive are admin-only.
Marking an engagement delivered¶
When the partner is satisfied with the deliverable and has handed it to the client, click Mark delivered in the engagement header. This:
- Stamps delivered_at on the engagement
- Freezes finding mutations: nobody (partner or associate) can change finding status, edit, refine, run a new audit, or use bulk-action until the engagement is unfrozen
- Adds a yellow lock banner across the top of the engagement so anyone reviewing knows it's locked
The chain-of-custody trail is preserved: if a correction is needed later, only an admin can Unfreeze, and that action is logged. After unfreeze, the engagement is back in findings_review state and partners can edit again.
Bulk-reviewing many findings¶
A finding list with 50+ rows is normal on a defense-contractor or healthcare corpus. Click the checkbox at the left edge of any finding row to start bulk selection. The bar above the list shows the count and three buttons:
- Accept all — every selected finding flips to accepted
- Reject all — every selected finding flips to rejected
- Mark refined — every selected finding flips to refined (use when the partner edited the language but the substance stands)
The "Select all visible" button in the filter bar selects everything matching the current filters — handy for "accept every low-severity finding flagged in the last run" or "reject every flagged-by-currency-check finding from a stale draft." Up to 200 findings per call.
After the action, a banner reports accept → 47/50 (with · 3 failed if any IDs went stale). Clear the banner and the selection is empty again.
Lost your authenticator device¶
Use a backup code at the MFA prompt — each of the ten codes shown during enrollment works exactly once. If you've used all ten, ask your firm admin to call POST /auditforge/user/{your_id}/clear-mfa (or use the admin recovery flow). After that you can log in with password alone and re-enroll TOTP from the header.
Forgot your password¶
Ask your firm admin to call POST /auditforge/user/{your_id}/reset-password with a temporary password. The temp password is communicated through a secure side channel (not email — SES is sandboxed). Your existing sessions are revoked; on next login you'll be required to set a new password yourself before you can do anything else. Your TOTP enrollment is not cleared by a password reset — you'll still need your authenticator at login.
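For admins scripting these recovery calls, a sketch of both endpoints follows. The auth header scheme, request bodies, and temporary password value are assumptions, so verify against your deployment before use.

```python
import requests

BASE_URL = "https://metis-demo.base2ml.com"            # assumed deployment URL
HEADERS = {"Authorization": "Bearer <admin-token>"}     # assumed auth scheme
user_id = "usr_123"                                     # hypothetical user ID

# Reset a forgotten password with a temporary one. Documented behavior: existing
# sessions are revoked and the user must set a new password on next login.
requests.post(
    f"{BASE_URL}/auditforge/user/{user_id}/reset-password",
    headers=HEADERS,
    json={"temporary_password": "correct-horse-battery-staple"},   # body shape is an assumption
    timeout=30,
).raise_for_status()

# Clear TOTP enrollment after a lost device and exhausted backup codes.
requests.post(
    f"{BASE_URL}/auditforge/user/{user_id}/clear-mfa",
    headers=HEADERS,
    timeout=30,
).raise_for_status()
```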
Things you cannot do (yet)¶
- Pause / cancel an in-flight audit (you have to wait for budget cap or completion)
- Custom primitives beyond the built-in ten
- White-label DOCX with embedded firm logo image (text-only "Prepared by" today; image embed is Phase 2.6)
- Server-rendered PDF (today: open Print/PDF view in a new tab, use browser Print → Save as PDF)
These are on the roadmap.
Getting help¶
Email chris@base2ml.com with the engagement id. Engagement audit logs are persisted per engagement and exportable; we can replay any decision the system made.