Offline Replay Verification: Proving AI Governance Without Trusting the Live System

AI governance verification has a fundamental trust problem: the system that produces governance records is the same system that benefits when those records show compliance. An audit that relies on the live system to verify its own governance is not an independent audit. Offline replay verification addresses this. Given a signed governance export, an archived rule set, and a deterministic evaluation engine, any party, including a regulatory examiner with no access to the production system, can verify that a specific decision was made correctly under the specified governance framework. The verifier recomputes the hashes, the signature, and the verdict, and checks each against the original signed record; every check either matches or fails. No trust in the live system is required.

What Offline Replay Verification Provides

Offline replay verification answers four questions that governance audits must address:

Was this the input that was actually evaluated? The signed governance record contains the canonical hash of the input. The archived input, when canonicalized and hashed, must produce the same hash. If they match, the archived input is the input that was evaluated. If they do not match, either the input was modified after the fact or the record was corrupted.

Was this the governance framework that was actually applied? The signed governance record contains the rule set version hash. The archived rule set, when hashed, must produce the same hash. If they match, the archived rule set is the rule set that governed the decision.

Was the verdict correct given the input and the rule set? Re-evaluating the archived input against the archived rule set on the verifier’s own infrastructure produces a verdict. If the produced verdict matches the verdict in the signed record, the evaluation was correct. If they do not match, the record was falsified, the evaluation engine is not deterministic, or the verifier is running a different engine version than the one that produced the record.

Has the audit record been tampered with? The hash chain linking each record to its predecessor allows verification that no records have been inserted, deleted, or modified between the genesis record and the record under examination. The signature on each record verifies that the record was produced by the authorized signing key.
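The first two questions both reduce to "canonicalize, hash, compare". The following is a minimal sketch of that check, assuming the input is JSON-like data, that canonical form means sorted keys with compact separators in UTF-8, and that the hash is SHA-256. None of those choices is specified above; the field names and helper functions are hypothetical.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Assumed canonical form: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def canonical_hash(value) -> str:
    """SHA-256 over the canonical bytes (the hash function is an assumption)."""
    return hashlib.sha256(canonicalize(value)).hexdigest()


def matches_record(archived, recorded_hash: str) -> bool:
    """True when the archived artifact hashes to the value in the signed record."""
    return canonical_hash(archived) == recorded_hash


# Hypothetical loan input and the hash a governance record would carry for it.
original = {"applicant_id": "A-1001", "amount": 25000, "term_months": 36}
recorded = canonical_hash(original)

reordered = {"term_months": 36, "amount": 25000, "applicant_id": "A-1001"}
tampered = dict(original, amount=26000)

same = matches_record(reordered, recorded)    # key order does not affect the hash
altered = matches_record(tampered, recorded)  # any changed value breaks the match
```

The same function can be pointed at the archived rule set to answer the second question. Whatever canonicalization a real deployment uses, the verifier has to apply exactly the same one, or honest artifacts will fail to match.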

The Offline Verification Protocol

A complete offline verification of a governance decision requires the following artifacts:

The signed governance record: The original record produced at the time of the governance decision, containing: canonical input hash, rule set version hash, verdict, triggered rules, timestamp, chain link, and cryptographic signature.
The rule set archive: The specific rule set version referenced by the rule set version hash in the governance record — the complete rule set at the exact version that was active at the time of the decision.
The canonical input: The normalized form of the input that was evaluated, after all normalization transformations were applied. The original raw input may also be archived to allow independent verification of the normalization.
The chain segment: The subsequence of governance records from the genesis record (or a known-good anchor point) to the record under examination. Required to verify chain integrity.
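The signature verification key: The key needed to verify the record’s signature. HMAC-SHA256 is a symmetric construction, so there is no public half: the verifier must hold the same secret key that signed the record, and anyone holding that key could also produce valid signatures. If verifiers must be able to check signatures without being able to create them, the records need an asymmetric signature scheme (for example Ed25519 or ECDSA), in which case only the public key is distributed.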
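The five artifacts above could be bundled into a single package along the following lines. This is an illustrative sketch only: the class names, field names, and types are assumptions, not a format defined by any particular governance system.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class GovernanceRecord:
    """One signed record; field names are hypothetical."""
    record_id: str
    input_hash: str                   # canonical input hash
    rule_set_hash: str                # rule set version hash
    verdict: str
    triggered_rules: Tuple[str, ...]
    timestamp: str
    prev_hash: str                    # chain link to the preceding record
    signature: str


@dataclass(frozen=True)
class VerificationPackage:
    """Everything a verifier needs, with no reference to the live system."""
    record: GovernanceRecord
    rule_set: Any                     # archived rule set at the referenced version
    canonical_input: bytes            # normalized input that was evaluated
    chain_segment: Tuple[GovernanceRecord, ...]  # anchor ... record, in order
    verification_key: bytes


# Placeholder values, only to show the shape of a package.
record = GovernanceRecord(
    record_id="rec-0002",
    input_hash="<sha256 hex>",
    rule_set_hash="<sha256 hex>",
    verdict="deny",
    triggered_rules=("max-dti",),
    timestamp="2024-01-01T00:05:00Z",
    prev_hash="<sha256 hex>",
    signature="<signature hex>",
)
package = VerificationPackage(
    record=record,
    rule_set={"rules": []},
    canonical_input=b"{}",
    chain_segment=(record,),
    verification_key=b"<key bytes>",
)
```

Freezing both classes mirrors the write-once nature of the records: attempting to assign to a field after construction raises an error rather than silently changing the evidence.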

With these artifacts, the verification protocol is:

Verify the chain segment. For each record from the anchor to the target, verify that the record’s chain link matches the hash of the preceding record. Any break in the chain indicates insertion, deletion, or modification.
Verify the record signature. Compute HMAC-SHA256 over the canonical record content using the signing key. Compare with the signature in the record. Match confirms the record was produced by the authorized signing key.
Verify the rule set archive. Hash the rule set archive. Compare with the rule set version hash in the governance record. Match confirms the archived rule set is the rule set that governed the decision.
Verify the canonical input. Hash the canonical input. Compare with the input hash in the governance record. Match confirms the archived input is the input that was evaluated.
Re-evaluate. Run the canonical input through the deterministic evaluation engine, using the archived rule set. Capture the verdict and triggered rules.
Compare verdicts. The produced verdict must match the verdict in the signed record. The produced triggered rules must match the triggered rules in the record. Match confirms the governance decision was correctly derived from the stated input and rule set.

If all six steps pass, the verification confirms that the input is authentic, the rule set is authentic, the record is authentic, the chain is intact, and the verdict was correctly derived. No trust in the original system is required for any of these confirmations.
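The six steps can be sketched end to end as follows. This is a toy under stated assumptions, not a reference implementation: records are plain dicts, canonical form is sorted-key JSON, the chain link is the SHA-256 of the whole preceding record, and `evaluate` is a stand-in engine with a single hypothetical threshold rule.

```python
import hashlib
import hmac
import json


def canon(obj) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sign(record: dict, key: bytes) -> str:
    """HMAC-SHA256 over every field except the signature itself."""
    body = {k: v for k, v in record.items() if k != "signature"}
    return hmac.new(key, canon(body), hashlib.sha256).hexdigest()


def evaluate(canonical_input: dict, rule_set: dict):
    """Toy deterministic engine: a rule fires when a field exceeds its limit."""
    triggered = sorted(
        r["id"] for r in rule_set["rules"]
        if canonical_input.get(r["field"], 0) > r["max"]
    )
    return ("deny" if triggered else "allow"), triggered


def verify(record, chain_segment, rule_set, canonical_input, key) -> dict:
    results = {}
    # Step 1: each chain link must equal the hash of the preceding record.
    results["chain"] = chain_segment[-1] == record and all(
        cur["prev_hash"] == sha256_hex(canon(prev))
        for prev, cur in zip(chain_segment, chain_segment[1:])
    )
    # Step 2: recompute the signature and compare in constant time.
    results["signature"] = hmac.compare_digest(sign(record, key), record["signature"])
    # Steps 3 and 4: archived rule set and input must hash to the recorded values.
    results["rule_set"] = sha256_hex(canon(rule_set)) == record["rule_set_hash"]
    results["input"] = sha256_hex(canon(canonical_input)) == record["input_hash"]
    # Steps 5 and 6: re-evaluate, then compare verdict and triggered rules.
    verdict, triggered = evaluate(canonical_input, rule_set)
    results["verdict"] = (
        verdict == record["verdict"] and triggered == record["triggered_rules"]
    )
    return results


def make_record(prev, canonical_input, rule_set, key, timestamp) -> dict:
    """Stand-in for the production system writing a signed record."""
    verdict, triggered = evaluate(canonical_input, rule_set)
    rec = {
        "input_hash": sha256_hex(canon(canonical_input)),
        "rule_set_hash": sha256_hex(canon(rule_set)),
        "verdict": verdict,
        "triggered_rules": triggered,
        "timestamp": timestamp,
        "prev_hash": sha256_hex(canon(prev)) if prev else "0" * 64,
    }
    rec["signature"] = sign(rec, key)
    return rec


KEY = b"demo-key-not-for-production"
RULES = {"rules": [{"id": "max-dti", "field": "dti", "max": 0.43}]}
genesis = make_record(None, {"dti": 0.30}, RULES, KEY, "2024-01-01T00:00:00Z")
target_input = {"dti": 0.51}
target = make_record(genesis, target_input, RULES, KEY, "2024-01-01T00:05:00Z")

report = verify(target, [genesis, target], RULES, target_input, KEY)

# A record whose verdict was edited after signing fails steps 2 and 6.
forged = dict(target, verdict="allow")
forged_report = verify(forged, [genesis, forged], RULES, target_input, KEY)
```

Returning one result per step, rather than a single pass/fail flag, lets the verifier report which property failed: a broken chain points to insertion or deletion, while a verdict mismatch with a valid signature points to the engine.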

Building the Verification Package

For a governance system to support offline replay verification, it must maintain four archives:

The governance record archive: Every signed governance record, organized for retrieval by record ID, timestamp, and tenant. Records are immutable — they are written once and never modified.
The rule set version archive: Every rule set version that has been active, indexed by version hash. When a rule set is updated, the previous version is archived before the update takes effect.
The canonical input archive: The canonical form of every input evaluated, indexed by input hash. For privacy-sensitive deployments, canonical inputs may be encrypted with a key held by the tenant — auditors with the tenant’s decryption key can retrieve the input; others cannot.
The chain index: The complete hash chain from genesis to current, with efficient random-access indexing. The chain index enables verification of chain integrity for any subsequence without requiring full chain traversal.
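Indexing an archive by content hash, as described for rule set versions and canonical inputs, can be illustrated with a small in-memory stand-in. The class below is hypothetical and assumes SHA-256; a real deployment would sit on durable, access-controlled storage.

```python
import hashlib


class WriteOnceArchive:
    """In-memory stand-in for an immutable archive indexed by content hash."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        key = hashlib.sha256(blob).hexdigest()
        # Identical content always maps to the same key, so re-archiving is a
        # no-op and an existing entry is never overwritten.
        self._blobs.setdefault(key, blob)
        return key

    def get(self, key: str) -> bytes:
        blob = self._blobs[key]
        # Re-check on read so silent corruption is detected at retrieval time.
        if hashlib.sha256(blob).hexdigest() != key:
            raise ValueError("archived content does not match its index hash")
        return blob


archive = WriteOnceArchive()
v1 = archive.put(b'{"rules":[{"id":"max-dti","max":0.43}]}')
v2 = archive.put(b'{"rules":[{"id":"max-dti","max":0.40}]}')
```

Because the index key is derived from the content, the version hash stored in a governance record is also the lookup key for the rule set that produced it.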

The archive storage requirements for a typical governance deployment are modest. A governance record for a single decision is typically 500–2,000 bytes. At one million decisions per day, the record archive grows by 500 MB to 2 GB per day, which fits comfortably in standard object storage at a cost that is negligible relative to the compliance and audit value the archive provides.
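The growth figures follow directly from the record size and the decision volume. A short calculation, using decimal units and covering only the record archive (canonical inputs and rule set versions are stored separately), makes the arithmetic explicit:

```python
DECISIONS_PER_DAY = 1_000_000
RECORD_BYTES_LOW, RECORD_BYTES_HIGH = 500, 2_000


def daily_growth_bytes(decisions_per_day: int, bytes_per_record: int) -> int:
    """Bytes added to the record archive per day."""
    return decisions_per_day * bytes_per_record


low = daily_growth_bytes(DECISIONS_PER_DAY, RECORD_BYTES_LOW)    # 500 MB per day
high = daily_growth_bytes(DECISIONS_PER_DAY, RECORD_BYTES_HIGH)  # 2 GB per day
yearly_high_tb = high * 365 / 1e12                               # about 0.73 TB per year
```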

The Regulatory Examination Scenario

Consider the scenario where a bank is examined by a federal banking regulator who identifies a loan decision that may not have complied with fair lending requirements. The examiner requests documentation of the governance framework that was applied to the decision and evidence that the framework was actively enforced.

With offline replay verification infrastructure:

The bank provides the examiner with a signed verification package containing the governance record for the decision, the rule set archive for the relevant version, the canonical input, and the chain segment from the relevant period. The examiner’s independent technical reviewer installs the deterministic evaluation engine and runs the verification protocol on air-gapped infrastructure with no connection to the bank’s systems. The verification produces: confirmation that the record was signed by the authorized key, confirmation that the rule set version matches the archived version, confirmation that the canonical input matches the archived input, confirmation that the chain is intact, and confirmation that re-evaluation produces the same verdict as the original record.

The examiner has cryptographic proof that the specific governance framework was applied to the specific decision, that this framework was the version in force at the time, that the record has not been altered, and that the verdict was correctly derived.

Without offline replay verification infrastructure, the bank can answer the same examination request only with log entries, policy documentation, and administrator attestation. These do not constitute cryptographic proof of governance enforcement, and the examination is likely to be longer, more contentious, and less favorable to the bank.

The Zero-Trust Property

The most powerful property of offline replay verification is that it is zero-trust with respect to the original system. The verification requires:

No API calls to the production system
No database access to the production database
No cooperation from system administrators
No trust in the system operator’s claims about what occurred

The verification is cryptographic proof, not verbal assertion. An examiner who does not trust the system operator can verify governance compliance independently. A plaintiff’s counsel who doubts the governance claims can commission an independent verification. A regulator who wants an unimpeachable record of governance enforcement can archive the signed record and verify it at any future point.

This zero-trust property is what distinguishes governance infrastructure from governance documentation. Documentation requires trust in the documenter. Cryptographic proof requires trust only in the math. The math is auditable. The governance is provable. The compliance is demonstrable without cooperation from the party being examined. That is the standard that AI governance in regulated industries must reach — and offline replay verification is the mechanism that reaches it.
