
Audit log export (Phase 14)

What

GET /auditforge/engagement/{id}/audit-log returns the full per-engagement audit log — every LLM call AuditForge made for this engagement, with timestamp, model, input/output tokens, cost, latency, stage metadata, and any tier-downshift hints.
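
A minimal sketch of pulling the log from a script, assuming a bearer token and a hypothetical base URL (the real credential is whatever the standard auth gate expects; see Authentication below):

import requests  # third-party HTTP client, assumed available

BASE_URL = "https://auditforge.example.com"   # hypothetical host
ENGAGEMENT_ID = "eng-..."                     # placeholder
TOKEN = "..."                                 # placeholder credential

resp = requests.get(
    f"{BASE_URL}/auditforge/engagement/{ENGAGEMENT_ID}/audit-log",
    params={"format": "jsonl"},               # or "json" for a single array
    headers={"Authorization": f"Bearer {TOKEN}"},
    stream=True,
)
resp.raise_for_status()
with open(f"{ENGAGEMENT_ID}-audit-log.jsonl", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)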

This is the artifact a partner hands to their end client's procurement reviewer when asked "how was this audit conducted? Show me the reasoning." The methodology white paper describes the process; the audit log is the evidence that the process was followed for this specific engagement.

Why this matters

SOC 2 Type 2 controls require audit-trail retention for security-relevant events. For audit firms, the bar goes higher: their end clients' regulators (DoD CIO for CMMC, HHS OCR for HIPAA, PCAOB for financial audits) sometimes ask for the underlying decision trail. Without an exportable log, the partner has to write a one-off explanation each time. With it, they hand over a structured artifact.

Format

Two output formats:

  • jsonl (default) — application/x-ndjson. Newline-delimited JSON, streaming-safe, one event per line. Easy to grep/awk/pipe into log tooling. Returned as a downloadable file.
  • json — application/json. Single JSON array wrapping all events; easier for downstream code that wants a parseable JSON document. Returned as a normal response.

Each event is a JSON object emitted by LLMClient at every call:

{
  "event": "llm_call",
  "model": "claude-sonnet-4-6",
  "provider": "anthropic",
  "input_tokens": 4128,
  "output_tokens": 1240,
  "cost_cents": 5.32,
  "latency_ms": 8410,
  "downshifted_from": null,
  "budget_utilization": 0.42,
  "metadata": {
    "engagement_id": "eng-...",
    "stage": "investigate",
    "step": "primitive_currency_check",
    "question_id": "q-..."
  },
  "ts": "2026-05-08T20:10:42.512Z"
}

For an engagement with on the order of a hundred LLM calls (typical: ~80 investigate + ~10 catalog + ~5 verifier + 1 consolidate + 1 filter + 1 exec summary ≈ 100 events), the JSONL file is ~30–80 KB.
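
Because each line is one self-contained JSON object, downstream analysis is a few lines of scripting. For example, a sketch that totals cost_cents per stage from a downloaded file (the filename is a placeholder):

import json
from collections import defaultdict

cost_by_stage = defaultdict(float)
with open("eng-...-audit-log.jsonl", encoding="utf-8") as f:  # placeholder filename
    for line in f:
        if not line.strip():
            continue                                          # skip blank lines between shards
        event = json.loads(line)
        stage = event.get("metadata", {}).get("stage", "unknown")
        cost_by_stage[stage] += event.get("cost_cents", 0.0)

for stage, cents in sorted(cost_by_stage.items(), key=lambda kv: -kv[1]):
    print(f"{stage:<15} {cents / 100:.2f} USD")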

Implementation

The audit log writer (app/auditforge/audit_log.py) shards the log:

${CACHE_DIR}/auditforge/audit_logs/{engagement_id}/shard-00000.jsonl
${CACHE_DIR}/auditforge/audit_logs/{engagement_id}/shard-00001.jsonl
...

with a 10 MB shard cap. Each shard uploads to S3:

s3://{bucket}/auditforge/engagements/{engagement_id}/audit_log/shard-NNNNN.jsonl

Per-engagement bucket isolation (Phase 7) is honored: when engagement.source_bucket is set, the audit log lives there; otherwise it lives in the shared platform bucket.
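
The shard-rotation idea, reduced to a sketch (the real AuditLogWriter in app/auditforge/audit_log.py also handles S3 upload and metadata; the class and constant names below are illustrative, not the actual implementation):

import json
import os

SHARD_CAP_BYTES = 10 * 1024 * 1024            # the 10 MB cap described above

class ShardedWriterSketch:                    # stand-in, not the real AuditLogWriter
    def __init__(self, log_dir: str):
        os.makedirs(log_dir, exist_ok=True)
        self.log_dir = log_dir
        self.shard_index = 0

    def _shard_path(self) -> str:
        return os.path.join(self.log_dir, f"shard-{self.shard_index:05d}.jsonl")

    def write_event(self, event: dict) -> None:
        path = self._shard_path()
        # Roll over to the next shard once the current one hits the cap.
        if os.path.exists(path) and os.path.getsize(path) >= SHARD_CAP_BYTES:
            self.shard_index += 1
            path = self._shard_path()
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")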

The export endpoint:

  1. Lists local shards (always preferred — most recent writes)
  2. Lists S3 shards (older, archived from prior task lifetimes)
  3. Dedupes by basename (shard-NNNNN.jsonl — local wins when both present)
  4. For jsonl: streams shards in order with StreamingResponse; client receives a download
  5. For json: materializes everything into an array (small enough; capped at the JSONL byte budget)

A trailing newline is yielded between shards in the jsonl stream so consumers don't accidentally merge two events into one line at shard boundaries.
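
Sketched in FastAPI/Starlette terms (StreamingResponse is mentioned above; the function shape and the s3_shards argument are assumptions — the real endpoint lists and fetches S3 shards itself):

from pathlib import Path
from fastapi.responses import StreamingResponse

def stream_jsonl_export(local_dir: Path, s3_shards: dict, engagement_id: str) -> StreamingResponse:
    # s3_shards: {basename: bytes} already fetched from S3 (hypothetical shape).
    local = {p.name: p for p in local_dir.glob("shard-*.jsonl")}
    names = sorted(set(local) | set(s3_shards))   # shard-NNNNN sorts lexicographically

    def body():
        for name in names:
            # Local shard wins over the archived S3 copy with the same basename.
            data = local[name].read_bytes() if name in local else s3_shards[name]
            yield data
            if not data.endswith(b"\n"):
                yield b"\n"                       # keep shard boundaries on separate lines

    return StreamingResponse(
        body(),
        media_type="application/x-ndjson",
        headers={"Content-Disposition": f'attachment; filename="{engagement_id}-audit-log.jsonl"'},
    )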

Authentication

Standard auth gate + per-engagement firm scoping (Phase 9). Non-admin session callers can only export logs for engagements in their own firm. Cross-firm requests return 404.
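
A sketch of the scoping check, assuming a FastAPI backend and hypothetical engagement.firm_id / user.firm_id / user.is_admin fields (not the actual Phase 9 code):

from fastapi import HTTPException

def require_engagement_access(engagement, user):
    if user.is_admin:
        return engagement
    if engagement is None or engagement.firm_id != user.firm_id:
        # 404 rather than 403, presumably so a cross-firm caller can't confirm the engagement exists.
        raise HTTPException(status_code=404)
    return engagement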

Frontend

Engagement detail header has an "Audit log" button next to the deliverable export buttons. Clicking it downloads the JSONL file as:

{engagement-id}-audit-log.jsonl

(Browser handles the download via Content-Disposition: attachment.)

Cost

Zero. No LLM calls; just S3 reads + streaming.

What the audit log captures

  • Every Anthropic API call (Sonnet, Opus) with prompt tokens / completion tokens / cost
  • Every OpenAI API call (gpt-4o-mini for the mechanical tier) with the same fields
  • Stage attribution (profile, catalog, synthesize, validate, investigate, consolidate, deepen, filter, report, verify, intake_extract, portfolio_clusters)
  • Per-call latency
  • Tier downshifts (when budget governance moved a call to a cheaper tier)
  • Budget utilization at the time of each call

What the audit log does NOT capture

  • LLM prompts and responses themselves (privacy: corpus excerpts may include PII / CUI)
  • User actions (accept/reject/refine clicks) — these are audit-trailed via auditor_notes on each finding instead
  • Engagement-level state transitions (created → cataloging → ... → complete) — these live on the engagement record

If a procurement reviewer asks for prompts + responses, the partner can grant temporary access to the local log directory in a separate, NDA-bounded discussion. We deliberately don't expose those over the API.

Code

  • app/auditforge/audit_log.py — AuditLogWriter (existing); shard format
  • app/auditforge_endpoints.py — GET /engagement/{id}/audit-log with format=jsonl|json
  • frontend/src/api/auditforge.ts — auditLogUrl(engagementId, format)
  • frontend/src/components/EngagementDetail.tsx — "Audit log" button next to deliverable exports

Open follow-ups

  • Date-range filter — ?since=2026-04-01&until=2026-05-01 for partial export
  • Stage filter — ?stage=investigate to extract just the per-question events
  • Signed URLs — generate a time-limited pre-signed S3 URL the partner can hand to a procurement reviewer for direct download (no AuditForge auth required)
  • CSV format — flat tabular export for spreadsheet workflows
  • Live tail via SSE — stream new events as they arrive during a running audit (separate from the existing /stream progress events)