
Audit log export (Phase 14)

What

GET /auditforge/engagement/{id}/audit-log returns the full per-engagement audit log — every LLM call AuditForge made for this engagement, with timestamp, model, input/output tokens, cost, latency, stage metadata, and any tier-downshift hints.
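
A minimal sketch of pulling the log from a script, assuming a bearer token and a hypothetical base URL (the real credential is whatever the standard auth gate expects; see Authentication below):

import requests  # third-party HTTP client, assumed available

BASE_URL = "https://auditforge.example.com"   # hypothetical host
ENGAGEMENT_ID = "eng-..."                     # placeholder
TOKEN = "..."                                 # placeholder credential

resp = requests.get(
    f"{BASE_URL}/auditforge/engagement/{ENGAGEMENT_ID}/audit-log",
    params={"format": "jsonl"},               # or "json" for a single array
    headers={"Authorization": f"Bearer {TOKEN}"},
    stream=True,
)
resp.raise_for_status()
with open(f"{ENGAGEMENT_ID}-audit-log.jsonl", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)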

This is the artifact a partner hands to their end client's procurement reviewer when asked "how was this audit conducted? Show me the reasoning." The methodology white paper describes the process; the audit log is the evidence that the process was followed for this specific engagement.

Why this matters

SOC 2 Type 2 controls require audit-trail retention for security-relevant events. For audit firms, the bar goes higher: their end clients' regulators (DoD CIO for CMMC, HHS OCR for HIPAA, PCAOB for financial audits) sometimes ask for the underlying decision trail. Without an exportable log, the partner has to write a one-off explanation each time. With it, they hand over a structured artifact.

Format

Two output formats:

  • jsonl (default) — application/x-ndjson. Newline-delimited JSON, streaming-safe, one event per line. Easy to grep/awk/pipe into log tooling. Returned as a downloadable file.
  • json — application/json. Single JSON array wrapping all events; easier for downstream code that wants a parseable JSON document. Returned as a normal response.

Each event is a JSON object emitted by LLMClient at every call:

{
  "event": "llm_call",
  "model": "claude-sonnet-4-6",
  "provider": "anthropic",
  "input_tokens": 4128,
  "output_tokens": 1240,
  "cost_cents": 5.32,
  "latency_ms": 8410,
  "downshifted_from": null,
  "budget_utilization": 0.42,
  "metadata": {
    "engagement_id": "eng-...",
    "stage": "investigate",
    "step": "primitive_currency_check",
    "question_id": "q-..."
  },
  "ts": "2026-05-08T20:10:42.512Z"
}

For an engagement with on the order of a hundred LLM calls (typical: ~80 investigate + ~10 catalog + ~5 verifier + 1 consolidate + 1 filter + 1 exec summary ≈ 100 events), the JSONL file is ~30–80 KB.
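
Because each line is one self-contained JSON object, downstream analysis is a few lines of scripting. For example, a sketch that totals cost_cents per stage from a downloaded file (the filename is a placeholder):

import json
from collections import defaultdict

cost_by_stage = defaultdict(float)
with open("eng-...-audit-log.jsonl", encoding="utf-8") as f:  # placeholder filename
    for line in f:
        if not line.strip():
            continue                                          # skip blank lines between shards
        event = json.loads(line)
        stage = event.get("metadata", {}).get("stage", "unknown")
        cost_by_stage[stage] += event.get("cost_cents", 0.0)

for stage, cents in sorted(cost_by_stage.items(), key=lambda kv: -kv[1]):
    print(f"{stage:<15} {cents / 100:.2f} USD")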

Implementation

The audit log writer (app/auditforge/audit_log.py) shards the log:

${CACHE_DIR}/auditforge/audit_logs/{engagement_id}/shard-00000.jsonl
${CACHE_DIR}/auditforge/audit_logs/{engagement_id}/shard-00001.jsonl
...

with a 10 MB shard cap. Each shard uploads to S3:

s3://{bucket}/auditforge/engagements/{engagement_id}/audit_log/shard-NNNNN.jsonl

Per-engagement bucket isolation (Phase 7) is honored: when engagement.source_bucket is set, the audit log lives there; otherwise it lives in the shared platform bucket.
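
The shard-rotation idea, reduced to a sketch (the real AuditLogWriter in app/auditforge/audit_log.py also handles S3 upload and metadata; the class and constant names below are illustrative, not the actual implementation):

import json
import os

SHARD_CAP_BYTES = 10 * 1024 * 1024            # the 10 MB cap described above

class ShardedWriterSketch:                    # stand-in, not the real AuditLogWriter
    def __init__(self, log_dir: str):
        os.makedirs(log_dir, exist_ok=True)
        self.log_dir = log_dir
        self.shard_index = 0

    def _shard_path(self) -> str:
        return os.path.join(self.log_dir, f"shard-{self.shard_index:05d}.jsonl")

    def write_event(self, event: dict) -> None:
        path = self._shard_path()
        # Roll over to the next shard once the current one hits the cap.
        if os.path.exists(path) and os.path.getsize(path) >= SHARD_CAP_BYTES:
            self.shard_index += 1
            path = self._shard_path()
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")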

The export endpoint:

  1. Lists local shards (always preferred — most recent writes)
  2. Lists S3 shards (older, archived from prior task lifetimes)
  3. Dedupes by basename (shard-NNNNN.jsonl — local wins when both present)
  4. For jsonl: streams shards in order with StreamingResponse; client receives a download
  5. For json: materializes everything into an array (small enough; capped at the JSONL byte budget)

A trailing newline is yielded between shards in the jsonl stream so consumers don't accidentally merge two events into one line at shard boundaries.
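
Sketched in FastAPI/Starlette terms (StreamingResponse is mentioned above; the function shape and the s3_shards argument are assumptions — the real endpoint lists and fetches S3 shards itself):

from pathlib import Path
from fastapi.responses import StreamingResponse

def stream_jsonl_export(local_dir: Path, s3_shards: dict, engagement_id: str) -> StreamingResponse:
    # s3_shards: {basename: bytes} already fetched from S3 (hypothetical shape).
    local = {p.name: p for p in local_dir.glob("shard-*.jsonl")}
    names = sorted(set(local) | set(s3_shards))   # shard-NNNNN sorts lexicographically

    def body():
        for name in names:
            # Local shard wins over the archived S3 copy with the same basename.
            data = local[name].read_bytes() if name in local else s3_shards[name]
            yield data
            if not data.endswith(b"\n"):
                yield b"\n"                       # keep shard boundaries on separate lines

    return StreamingResponse(
        body(),
        media_type="application/x-ndjson",
        headers={"Content-Disposition": f'attachment; filename="{engagement_id}-audit-log.jsonl"'},
    )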

Authentication

Standard auth gate + per-engagement firm scoping (Phase 9). Non-admin session callers can only export logs for engagements in their own firm. Cross-firm requests return 404.
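
A sketch of the scoping check, assuming a FastAPI backend and hypothetical engagement.firm_id / user.firm_id / user.is_admin fields (not the actual Phase 9 code):

from fastapi import HTTPException

def require_engagement_access(engagement, user):
    if user.is_admin:
        return engagement
    if engagement is None or engagement.firm_id != user.firm_id:
        # 404 rather than 403, presumably so a cross-firm caller can't confirm the engagement exists.
        raise HTTPException(status_code=404)
    return engagement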

Frontend

Engagement detail header has an "Audit log" button next to the deliverable export buttons. Clicking it downloads the JSONL file as:

{engagement-id}-audit-log.jsonl

(Browser handles the download via Content-Disposition: attachment.)

Cost

Zero. No LLM calls; just S3 reads + streaming.

What the audit log captures

  • Every Anthropic API call (Sonnet, Opus) with prompt tokens / completion tokens / cost
  • Every OpenAI API call (gpt-4o-mini for the mechanical tier) with the same fields
  • Stage attribution (profile, catalog, synthesize, validate, investigate, consolidate, deepen, filter, report, verify, intake_extract, portfolio_clusters)
  • Per-call latency
  • Tier downshifts (when budget governance moved a call to a cheaper tier)
  • Budget utilization at the time of each call

What the audit log does NOT capture

  • LLM prompts and responses themselves (privacy: corpus excerpts may include PII / CUI)
  • User actions (accept/reject/refine clicks) — these are audit-trailed via auditor_notes on each finding instead
  • Engagement-level state transitions (created → cataloging → ... → complete) — these live on the engagement record

If a procurement reviewer asks for prompts + responses, the partner can grant temporary access to the local log directory in a separate, NDA-bounded discussion. We deliberately don't expose those over the API.

Code

  • app/auditforge/audit_log.py — AuditLogWriter (existing); shard format
  • app/auditforge_endpoints.py — GET /engagement/{id}/audit-log with format=jsonl|json
  • frontend/src/api/auditforge.ts — auditLogUrl(engagementId, format)
  • frontend/src/components/EngagementDetail.tsx — "Audit log" button next to deliverable exports

Open follow-ups

  • Date-range filter — ?since=2026-04-01&until=2026-05-01 for partial export
  • Stage filter — ?stage=investigate to extract just the per-question events
  • Signed URLs — generate a time-limited pre-signed S3 URL the partner can hand to a procurement reviewer for direct download (no AuditForge auth required)
  • CSV format — flat tabular export for spreadsheet workflows
  • Live tail via SSE — stream new events as they arrive during a running audit (separate from the existing /stream progress events)