The Half-Life of Documentation¶
Why currency beats comprehensiveness, and what that requires of a system¶
A Base2ML white paper. Third in a series following The Information Paradox and Why Conflict Detection Is Harder Than It Looks.
A document on a shelf is not the same as a document in force¶
Every organization has documents that are technically still in its records but no longer reflect how anything actually works. The 2017 SOP that's been quietly superseded by verbal practice. The handbook section that became obsolete when the regulation changed in 2022 but never got revised. The procedure that everyone knows is wrong but nobody has gotten around to updating.
These documents aren't lost; they're worse than lost. They produce confident, citable answers that are operationally wrong. A new clerk searches the records, finds the 2017 SOP, follows it, and creates a problem the experienced staff would have caught instinctively because they remember that the SOP is no longer in force.
The challenge of organizational knowledge is not primarily about finding documents. Most organizations have search; finding is a partially-solved problem. The harder challenge is knowing which of the documents found are still authoritative — and surfacing the right ones with that authority in mind.
This is the currency problem. It's quietly responsible for a large fraction of the operational mistakes that are traceable to "the documentation said so." We've spent the time since we started this work understanding what currency tracking actually requires, and the answer demands more than the obvious instincts suggest.
Why date metadata isn't enough¶
The first instinct is to lean on the date a document was produced. A 2024 policy is probably more current than a 2017 policy. A revision dated this year is probably more authoritative than one dated five years ago. Sort by date; trust the newest.
This works in narrow cases and breaks in important ones.
A document from 2024 may codify a process that's been outdated since 2022. The author wrote down what they thought the process was; they didn't realize the regulation had changed; the document was approved through a process that didn't include anyone who would have caught the error. The newest document, by date, is wrong. The 2017 document, predating the change, may have been correct for its time but is no longer in force. The 2022 regulatory shift document is the actual current authority.
A 2018 ordinance might still be authoritative on most of what it covers, while two specific sections have been superseded by 2021 and 2024 revisions that didn't replace the whole document. Treating the 2018 doc as wholesale-stale loses correct content; treating it as wholesale-current ignores the supersessions.
A handbook last revised in 2019 might be partially current and partially obsolete. Some sections describe processes that haven't changed; some describe processes that have. Date alone doesn't distinguish.
Date is a useful signal. It is not the signal. A useful currency system has to handle: documents that are old but still authoritative, documents that are new but operationally wrong, documents that are partially superseded, and documents whose authority is implicit (no formal supersession record exists, but everyone knows the practice has changed).
The supersession graph¶
The right mental model for documentation currency is a directed graph, not a sorted list. Documents have relationships: A supersedes B; A modifies section 4 of B; A replaces B for purpose X but not purpose Y; A and B both apply, and the user has to know which one fits the context.
Most organizations maintain this graph informally, in the heads of their experienced staff. Newcomers learn it gradually, with frequent corrections. The graph is real; it's just not written down anywhere systematic.
A currency-aware knowledge system has to make the graph explicit and operationally useful. This means:
Persisted supersession relationships. When a document supersedes another, the system needs a record of that fact — typically operator-supplied, sometimes inferred from explicit textual cues ("This policy replaces Policy 4.1 dated March 2017"). The relationship is metadata about the corpus, separate from the document content itself.
Partial supersession. A revision that replaces a specific section of an older document, leaving the rest in force. This is the harder case. A clean implementation distinguishes "supersedes section 4.2" from "supersedes the entire document." A pragmatic implementation may collapse to whole-document supersession with a note, accepting some loss of fidelity in exchange for operational simplicity.
Currency labels at the chunk level. When a system retrieves a passage, the user needs to see whether that passage is current, legacy, draft, or contested. Sticking the label on the document as a whole can mislead — a passage from a partially-superseded document may itself still be in force. Per-passage currency is the right granularity, even if it's harder to implement.
Replacement pointers. When a passage is legacy, the system should ideally point to what replaced it. "This passage from the 2017 SOP has been superseded by section 3.4 of the 2022 resolution" is a much more useful surface than just "this passage is legacy." The pointer itself becomes navigable; the user can click through to the replacement.
The supersession graph is the substrate. Every other currency-related capability rides on top.
The deterministic versus algorithmic tradeoff¶
There are two broad approaches to populating the supersession graph, and the right answer depends on the organization's volume and operating posture.
Algorithmic. The system uses LLM analysis or heuristic comparison to detect probable supersession relationships automatically. New 2024 doc looks substantively similar to old 2018 doc but has more recent date and clearer authority chain — flag as probable supersession. Operator reviews and accepts/rejects. Over time, the system suggests; the operator confirms.
This works at scale. It produces some false positives (similar documents that don't actually supersede each other) and some false negatives (supersessions where the new document was substantively rewritten). The operator review step contains both error modes. The system's value is in surfacing candidates the operator wouldn't have noticed unaided.
Deterministic. The system requires the operator to mark supersession relationships explicitly when they create or update documents. No automation; no algorithmic guess. The graph is exactly what was deliberately recorded.
This works at small scale and produces high precision. It fails when the corpus is large enough that operators can't keep up, or when supersessions happen organically (informal practice changes without explicit documentation), or when the operators who know the relationships aren't the ones loading the documents.
The right answer for most organizations is a hybrid. Algorithmic suggestion for new candidates; deterministic operator confirmation; explicit operator override at any time. The system provides the scale; the operator provides the authority.
The trap to avoid is fully-algorithmic supersession with no review — confidently rewriting the corpus's authority structure based on an LLM's guess. The cost of getting that wrong, in a regulated environment, is paid in operational mistakes that are traceable to the system's silent decisions. The trap on the other side is fully-deterministic with no automation, which works in a 50-document corpus and doesn't work in a 5000-document one.
What "current" means to a user¶
A second-order point worth dwelling on: the same document can be "current" for one user and "legacy" for another. The borough manager's question about overtime policy in operations is meaningfully different from the same question asked by a litigator preparing a grievance defense. The litigator wants to know what was in force on a specific date in 2019. The manager wants to know what's in force right now. The same retrieval may need to surface the same documents but with different currency framing.
This is not solved by a binary current/legacy tag. It requires the system to support time-scoped queries — "what was in force on date X?" — and to render currency information in a way that makes the temporal context visible to the user. This is a relatively unusual capability in current knowledge systems, and most users don't think to ask for it until they need it.
A pragmatic implementation: time-scoped queries are a separate mode, not the default; the default is "as of now"; the operator can switch modes when the situation requires. Most queries, in most operational contexts, only need "as of now." The minority that need temporal scoping deserve the capability when they need it.
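As a sketch of that two-mode design, under the assumption that each passage carries an effective interval: the default query resolves "as of now," and the same filter answers the litigator's "what was in force on date X" when the mode is switched. The field names are illustrative.

```python
from datetime import date

passages = [
    {"text": "Overtime requires director sign-off.",
     "effective_from": date(2017, 3, 1), "effective_to": date(2022, 6, 30)},
    {"text": "Overtime requires HR pre-approval.",
     "effective_from": date(2022, 7, 1), "effective_to": None},  # still in force
]

def in_force(passages, as_of=None):
    """Time-scoped filter; as_of=None is the default 'as of now' mode."""
    as_of = as_of or date.today()
    return [p for p in passages
            if p["effective_from"] <= as_of
            and (p["effective_to"] is None or as_of <= p["effective_to"])]

# Default mode: the manager's question
print([p["text"] for p in in_force(passages)])
# Time-scoped mode: the litigator's question about 2019
print([p["text"] for p in in_force(passages, as_of=date(2019, 5, 1))])
```

The same corpus, the same filter, two different currency framings — which is exactly why a binary current/legacy tag on the document is not enough.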
The cost of getting currency wrong¶
The cost is asymmetric in a similar way to the conflict-detection asymmetry.
Treating a current document as legacy costs the user some attention — they see the legacy tag, decide to look anyway, find the answer is actually still authoritative, dismiss the tag. Annoying. Bounded.
Treating a legacy document as current is operationally dangerous. The user reads a confident answer based on a passage that was authoritative five years ago and isn't anymore. They act on it. The mistake is concentrated, traceable, and visible.
A currency-aware system should err toward over-flagging legacy. Better to have the user override an over-aggressive legacy tag than to have them act on outdated content as if it were current. Over time, with operator-driven corrections, the noise floor drops.
There is also a corpus-quality dimension worth being honest about. A system can be perfectly designed for currency tracking; if the corpus has no supersession data, the system has nothing to work with. The best technical implementation is rate-limited by the operator's willingness to mark supersessions when they happen. Investment in currency tracking is an investment in operator workflow as much as it is in algorithms.
The deterministic alternative — manual override¶
The cheapest and most reliable currency tracking, in our experience, isn't algorithmic. It's the operator marking specific documents as legacy or draft as they discover them. No LLM call. No re-ingestion. No index rebuild. The mark applies instantly to retrieval, outranks both the regex heuristics and the algorithmic supersession suggestions, and persists in the audit trail.
We position this in our product as the recommended low-cost daily companion to the algorithmic layer. Algorithms run on schedule and surface candidates; operators correct misclassifications in seconds; the corrected state outranks everything else. The combination is more accurate, cheaper, and more accountable than either approach alone.
The lesson worth taking from our experience: don't overengineer the algorithmic side at the expense of making manual override fast and easy. Operators discover currency problems all the time — a clerk reads a passage and immediately knows it's wrong, a manager reviews an answer and recognizes the underlying document is out of date. The cost of capturing that discovery is the friction in the override flow. If the override takes thirty seconds, operators will use it constantly. If it takes two minutes through a forms-heavy admin panel, operators will skip it and the corpus accumulates errors.
The fastest override mechanism we've found: the operator clicks a single button on the document in question, types a one-sentence reason, and the override applies. Anything more elaborate is friction the operator will, eventually, route around — and routing around currency tooling is how legacy content stays present in confident answers.
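The precedence rule behind that flow can be stated in a few lines. This is a hypothetical sketch, not our implementation: the operator's mark is a dictionary write that wins over whatever the automatic classifier said, and every mark lands in an append-only audit trail with its one-sentence reason.

```python
from datetime import datetime, timezone

auto_labels = {"sop-2017": "current"}   # what the algorithmic layer guessed
overrides = {}                          # operator marks: highest precedence
audit_log = []                          # every mark is recorded with a reason

def mark(doc_id, status, reason, operator):
    """One-click override: applies instantly, no re-ingestion or index rebuild."""
    overrides[doc_id] = status
    audit_log.append({"doc": doc_id, "status": status, "reason": reason,
                      "by": operator,
                      "at": datetime.now(timezone.utc).isoformat()})

def currency(doc_id):
    # Precedence: operator override > algorithmic label > default
    return overrides.get(doc_id) or auto_labels.get(doc_id, "current")

mark("sop-2017", "legacy",
     "Superseded in practice by the 2022 resolution", "clerk-a")
print(currency("sop-2017"))  # 'legacy' — the mark outranks the auto label
```

The entire override is one write and one log append; keeping it that small is what keeps the flow at thirty seconds instead of two minutes.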
What to look for¶
If currency tracking matters in your environment, the questions worth asking of any system you evaluate go beyond "does it have date metadata." How does the system represent supersession relationships — as edges in a graph that can be traversed, or as a flat status flag on each document? When a passage is legacy, can the system point at what replaced it? Does the operator have a fast path to mark a document outdated when they discover it — measured in seconds, not minutes — and does that mark outrank the system's automatic classification? When the corpus has been ingested and the operator wants to retroactively correct a misclassification, what's the workflow? Are time-scoped queries — "what was in force on date X?" — supported, or does the system only know about the current state?
The shape of the right answer here is non-obvious. Currency is a graph, not a sorted list. The operator's manual mark outranks the algorithm's guess. The cost of false legacy is bounded; the cost of false current is concentrated and traceable. A system that gets these right is meaningfully more valuable in operational settings than one that doesn't.
If you're working through these tradeoffs and want a sounding board — diagnostic, not pitch — we'd welcome the conversation.
About Base2ML. Base2ML is a Pittsburgh-based company building knowledge-access tools for organizations that need to find what they already have. We work in the specific space where retrieval, authority hierarchy, and conflict surfacing meet operational reality.
Contact. Base2ML · chris@base2ml.com · base2ml.com · docs.base2ml.com
Numbers and percentages are deliberately not invented. Where industry research provided a credible figure we cite it; where it didn't, we say so rather than fabricating one.