Engineering · Performance · Deterministic Enforcement

How We Built a Deterministic Governance Runtime with Sub-1ms Enforcement

Engineering a governance gate that runs before every LLM call, adds under 1ms of latency, and produces the same verdict for the same input every time — here is how we built it.

EVE Engineering · September 2, 2025 · 7 min read

The first question engineers ask when we describe our governance gate is: how does it stay under a millisecond? The second question is: why does that matter? The answer to the second question explains the answer to the first. Governance that adds meaningful latency to every request will be removed from the hot path. Governance that lives outside the hot path is not governance — it is auditing after the fact. If you want enforcement, you need enforcement latency that production systems can absorb. Sub-millisecond is the target because it makes avoidance economically irrational.

0.3ms minimum gate latency · 0.8ms maximum gate latency · 250–2500× faster than LLM evaluation

The Architecture Decision That Changes Everything

The most important design decision in building a deterministic governance runtime is also the most constraining: no LLM calls in the governance plane. This is not a performance optimization. It is an architectural requirement. The moment you introduce an LLM call into the governance evaluation path, you have introduced three fundamental problems: latency that the request path cannot absorb, verdicts that are not reproducible for the same input, and an adversarial surface inside the evaluator itself.

Eliminating LLM calls from the governance plane means the entire evaluation must run on compiled logic, pattern matching, arithmetic comparison, and cryptographic primitives. This is the constraint that drives every subsequent design decision.

Precompiled Rule Sets

The governance rules — charter principles, policy thresholds, domain-specific constraints — are not evaluated at request time by interpreting a rule document. They are compiled at deployment time into evaluation structures that the runtime can execute directly.

Concretely: a rule like "deny any request where the role claim is VIEWER and the requested action class is WRITE" becomes a pair of field extractors and a comparison, not a natural language string that gets semantically evaluated. The evaluation is O(1) per rule. A set of 14 charter rules plus domain-specific policy thresholds evaluates in microseconds, not milliseconds.
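
As a minimal sketch of what "a pair of field extractors and a comparison" could look like, the snippet below compiles the VIEWER/WRITE rule into a small immutable structure. The names `CompiledRule`, `compile_rule`, `field`, and `evaluate`, and the flat dictionary request shape, are assumptions made for illustration and are not taken from the production runtime.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence, Tuple

Request = Mapping[str, str]


def field(name: str) -> Callable[[Request], Optional[str]]:
    """Build an extractor that reads one field from a request."""
    return lambda request: request.get(name)


@dataclass(frozen=True)
class CompiledRule:
    rule_id: str
    extractors: Tuple[Callable[[Request], Optional[str]], ...]
    expected: Tuple[str, ...]
    verdict: str

    def matches(self, request: Request) -> bool:
        # One lookup per extractor plus one tuple comparison: constant work per rule.
        return tuple(extract(request) for extract in self.extractors) == self.expected


def compile_rule(rule_id: str, conditions: Mapping[str, str], verdict: str) -> CompiledRule:
    """Runs once at deployment time, never in the request path."""
    names = tuple(sorted(conditions))  # fixed field order
    return CompiledRule(
        rule_id=rule_id,
        extractors=tuple(field(name) for name in names),
        expected=tuple(conditions[name] for name in names),
        verdict=verdict,
    )


def evaluate(rules: Sequence[CompiledRule], request: Request) -> Tuple[str, Optional[str]]:
    """Return (verdict, id of the rule that triggered), defaulting to ALLOW."""
    for rule in rules:
        if rule.matches(request):
            return rule.verdict, rule.rule_id
    return "ALLOW", None


RULES = (
    compile_rule("viewer-no-write", {"role": "VIEWER", "action_class": "WRITE"}, "BLOCK"),
)

print(evaluate(RULES, {"role": "VIEWER", "action_class": "WRITE"}))
print(evaluate(RULES, {"role": "EDITOR", "action_class": "WRITE"}))
```

Nothing in `evaluate` interprets text: a rule that cannot be reduced to extractors and expected values simply cannot be passed to `compile_rule`, which is the validation property described above.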

The compilation step also serves as a validation step. A rule that cannot be compiled into a deterministic evaluator is a rule that cannot be enforced deterministically. The compilation boundary forces governance engineers to express rules in terms that can be mechanically applied, not terms that require interpretation.

Pattern Matching at the Hot Path

For certain rule classes — injection detection, prohibited content patterns, authority claim extraction — we use precompiled finite automata. The patterns are compiled from their specification into DFA form at deployment time. A DFA can scan a request of several thousand tokens in a single pass without backtracking, with CPU-cache-friendly memory access patterns.
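
To illustrate the single-pass property, here is a small sketch that compiles a set of literal phrases into a DFA using the standard Aho-Corasick construction. It is an assumption that literal phrases are a fair stand-in for the real pattern specifications, and the sketch scans characters rather than tokens; `compile_dfa` and `scan` are hypothetical names.

```python
from collections import deque


def compile_dfa(patterns):
    """Compile literal patterns into a DFA (Aho-Corasick construction).

    Returns (transitions, accepting): transitions[state] maps a character to
    the next state, accepting[state] is the set of patterns that end there.
    """
    patterns = sorted(set(patterns))  # fixed order, so the output is reproducible
    alphabet = sorted({ch for pattern in patterns for ch in pattern})
    transitions = [{}]
    accepting = [set()]

    # Step 1: build a trie of the patterns.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in transitions[state]:
                transitions.append({})
                accepting.append(set())
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting[state].add(pattern)

    # Step 2: breadth-first pass that fills in every missing transition,
    # so the scanner never has to back up in the input.
    fail = [0] * len(transitions)
    queue = deque()
    for ch in alphabet:
        if ch in transitions[0]:
            queue.append(transitions[0][ch])
        else:
            transitions[0][ch] = 0
    while queue:
        state = queue.popleft()
        for ch in alphabet:
            child = transitions[state].get(ch)
            if child is not None:
                fail[child] = transitions[fail[state]][ch]
                accepting[child] |= accepting[fail[child]]
                queue.append(child)
            else:
                transitions[state][ch] = transitions[fail[state]][ch]
    return transitions, accepting


def scan(dfa, text):
    """Single left-to-right pass: one table lookup per input character."""
    transitions, accepting = dfa
    state = 0
    hits = set()
    for ch in text:
        # Characters outside the pattern alphabet cannot extend any match.
        state = transitions[state].get(ch, 0)
        hits |= accepting[state]
    return hits


DFA = compile_dfa(["ignore previous", "system prompt"])
print(scan(DFA, "please ignore previous instructions"))
```

The scan cost depends on the length of the input, not on the number of patterns, which is why adding rules of this class does not add per-request work.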

DFA compilation has a well-understood tradeoff: large rule sets can produce large automata with significant state space. We manage this through rule stratification — the highest-priority rules form the innermost DFA and are always evaluated first.

The compilation artifacts are deterministic. Given the same rule set and compiler version, the output DFA is always identical. This means the governance behavior is reproducible from the specification alone, independently of the running system.
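
One way to check this reproducibility property is to hash the compilation artifact and compare digests across builds. The sketch below uses a plain serialization step as a stand-in for the real compiler, which is an assumption; `compile_artifact`, `artifact_digest`, and the rule fields `id` and `pattern` are hypothetical, and rule ids are assumed to be unique.

```python
import hashlib
import json


def compile_artifact(rules, compiler_version):
    """Serialize a rule set to bytes with a fixed rule order and fixed key order."""
    body = {
        "compiler_version": compiler_version,
        "rules": sorted(rules, key=lambda rule: rule["id"]),
    }
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def artifact_digest(artifact):
    """Content hash used to compare two independently built artifacts."""
    return hashlib.sha256(artifact).hexdigest()


# The same specification, written in a different order on two machines.
build_a = compile_artifact(
    [{"id": "r2", "pattern": "system prompt"}, {"id": "r1", "pattern": "ignore previous"}],
    compiler_version="1.0.0",
)
build_b = compile_artifact(
    [{"pattern": "ignore previous", "id": "r1"}, {"pattern": "system prompt", "id": "r2"}],
    compiler_version="1.0.0",
)
print(artifact_digest(build_a) == artifact_digest(build_b))
```

Including the compiler version in the hashed body means a compiler upgrade changes the digest even when the rules do not, so a digest mismatch always points to either a rule change or a compiler change.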

Replay Determinism

Every evaluation in the governance plane produces a signed record. The record contains: the input canonicalized and hashed using JCS (RFC 8785 JSON Canonicalization Scheme), the rule set version and hash, the verdict (ALLOW / MODIFY / BLOCK), the specific rules that triggered if any, a timestamp, and an HMAC-SHA256 signature over all the above fields.
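
A minimal sketch of such a record follows. Two assumptions are worth stating: `canonicalize` only approximates JCS (sorted keys, no insignificant whitespace, UTF-8) and does not implement RFC 8785's number serialization or key-ordering rules in full, and SHA-256 is assumed for the input hash. The field names and `make_record` are hypothetical.

```python
import hashlib
import hmac
import json


def canonicalize(value) -> bytes:
    """Approximate JCS: sorted keys, compact separators, UTF-8 output.

    Not a complete RFC 8785 implementation.
    """
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def make_record(request, verdict, triggered_rules, ruleset_version,
                ruleset_hash, timestamp, key: bytes) -> dict:
    """Build the evaluation record and sign every field with HMAC-SHA256."""
    body = {
        "input_hash": hashlib.sha256(canonicalize(request)).hexdigest(),
        "ruleset_version": ruleset_version,
        "ruleset_hash": ruleset_hash,
        "verdict": verdict,  # ALLOW / MODIFY / BLOCK
        "triggered_rules": list(triggered_rules),
        "timestamp": timestamp,
    }
    signature = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


record = make_record(
    request={"role": "VIEWER", "action_class": "WRITE"},
    verdict="BLOCK",
    triggered_rules=["viewer-no-write"],
    ruleset_version="v1",
    ruleset_hash="placeholder-ruleset-hash",  # illustrative value only
    timestamp="2025-09-02T00:00:00Z",
    key=b"demo-key-do-not-use",
)
print(record["verdict"], record["signature"][:16])
```

Canonicalizing before hashing is what makes the input hash independent of key order or whitespace in the original request, so two systems that receive the same logical input compute the same hash.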

A second system with the same rule set and signing key can take any of these records and verify the verdict by re-running the evaluation on the canonicalized input and checking that the input hash, the verdict, and the signature all match the record. It does not need access to the original system. It does not need to call any API. The replay is purely local computation.
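
The replay check can be sketched as a pure function. This assumes the canonical input is retained alongside the record, since an evaluation cannot be re-run from a hash alone, and it reuses the same approximate canonicalization and hypothetical field names (`input_hash`, `ruleset_hash`, `verdict`, `signature`) as any record produced in that shape. `replay` and its `evaluate` parameter are illustrative names.

```python
import hashlib
import hmac
import json


def canonicalize(value) -> bytes:
    """Approximate JCS (sorted keys, compact, UTF-8); not full RFC 8785."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def replay(record, stored_input, evaluate, ruleset_hash, key: bytes) -> bool:
    """Return True only if local computation reproduces the recorded decision."""
    # 1. The record itself must be intact.
    body = {name: value for name, value in record.items() if name != "signature"}
    expected = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["signature"]):
        return False
    # 2. The input being replayed must be the input that was evaluated.
    if hashlib.sha256(canonicalize(stored_input)).hexdigest() != record["input_hash"]:
        return False
    # 3. The local rule set must be the one the record names.
    if ruleset_hash != record["ruleset_hash"]:
        return False
    # 4. Re-run the deterministic evaluation and compare verdicts.
    return evaluate(stored_input) == record["verdict"]


def demo_evaluate(request):
    """Stand-in evaluator for the example: blocks VIEWER requests."""
    return "BLOCK" if request.get("role") == "VIEWER" else "ALLOW"
```

No step in `replay` touches the network or the original deployment; everything it needs is the record, the stored input, the rule set, and the key.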

This property — offline replay without system dependency — is what makes the governance record legally meaningful rather than just operationally useful. An auditor can verify a decision from two years ago without trusting that the infrastructure is in the same state it was when the decision was made.

Latency Budget

In production, our governance gate adds between 0.3ms and 0.8ms to the request latency. The variance is primarily driven by request size (more tokens to scan) and cache state (cold-start on the first request after a deployment).

To put this in context: a call to an external LLM for evaluation purposes adds between 200ms and 2000ms depending on the provider, model, and queue depth. Measured against the gate's 0.8ms worst case (200 / 0.8 = 250 and 2000 / 0.8 = 2500), our gate is 250× to 2500× faster, and it avoids the reproducibility and adversarial surface problems. The latency budget also means we can afford to run the governance evaluation synchronously in the request path, before the LLM call. There is no need for an async governance check that races with the model call. The governance verdict is available before the model ever sees the input.
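
The synchronous placement can be sketched as follows. `handle_request`, `demo_gate`, and the `(verdict, request)` return shape are assumptions for illustration; the point is only that the model callable is not reached unless the gate has already returned.

```python
from typing import Callable, Mapping, Tuple

Request = Mapping[str, str]
Gate = Callable[[Request], Tuple[str, Request]]


def handle_request(request: Request, gate: Gate, call_model: Callable[[Request], str]) -> dict:
    """Run the gate to completion first, then decide whether to call the model."""
    verdict, gated_request = gate(request)  # synchronous: no race with the model call
    if verdict == "BLOCK":
        return {"status": "blocked", "output": None}
    # On ALLOW the request passes through; on MODIFY the gate returns a rewritten one.
    return {"status": verdict.lower(), "output": call_model(gated_request)}


def demo_gate(request: Request) -> Tuple[str, Request]:
    """Stand-in gate for the example: blocks VIEWER write requests."""
    if request.get("role") == "VIEWER" and request.get("action_class") == "WRITE":
        return "BLOCK", request
    return "ALLOW", request


def demo_model(request: Request) -> str:
    """Stand-in for the LLM call."""
    return "model output"


print(handle_request({"role": "VIEWER", "action_class": "WRITE"}, demo_gate, demo_model))
print(handle_request({"role": "EDITOR", "action_class": "WRITE"}, demo_gate, demo_model))
```

Because the gate is an ordinary function call in the request path, a blocked request costs only the gate's latency and the model never receives the input.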

What This Enables That Probabilistic Approaches Cannot

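A governance system that runs deterministically, before the LLM, at sub-millisecond latency enables properties that probabilistic post-hoc approaches cannot provide: the verdict exists before the model sees the input, the same input always produces the same verdict, and any past decision can be replayed offline from its signed record.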

The engineering challenge of building a deterministic governance runtime is real. The compilation pipeline, the DFA construction, the canonicalization scheme, the signing infrastructure — each of these requires careful implementation and ongoing maintenance. But the result is a governance component with properties that LLM-based approaches cannot match: deterministic, replayable, sub-millisecond, adversarially immune. That is what enforcement looks like as infrastructure, not as a feature.

Engineering · Deterministic Governance · DFA Pattern Matching · JCS Canonicalization · Performance · Pre-LLM Enforcement · Audit Trail