The Guard: Post-Generation Filtering
A guard sits after the LLM in the inference pipeline. The model generates a full or partial response, and the guard then evaluates that output — scanning for prohibited content categories, toxicity signals, hallucination markers, PII patterns, or policy violations. If a violation is detected, the guard suppresses the output, substitutes a canned refusal, or flags it for review.
This architecture powers the content moderation layers of most commercial LLM APIs and works well for consumer use cases where the primary concern is output quality and user experience. But it has four structural failure modes that make it unsuitable as the primary compliance control in regulated industries.
How Guards Are Implemented
Guards typically combine one or more of these detection mechanisms:
- Classifier models: A separate, lighter-weight ML model classifying output text across predefined harm categories
- Regex and keyword matching: Rule-based scanning for prohibited strings, PII patterns (SSN, credit card formats), or domain-specific markers
- Embedding similarity: Comparing output embeddings to embeddings of known policy-violating content
- LLM-as-judge: A second LLM call that evaluates whether the primary model's output violates policy — expensive, slow, and still probabilistic
Each mechanism has a well-characterized failure mode. Classifiers have nonzero false negative rates. Regex patterns can be bypassed through paraphrasing or Unicode obfuscation (homoglyphs, fullwidth characters, zero-width characters) unless the text is normalized before matching. Embedding similarity requires reference examples that cover the entire policy violation space. LLM-as-judge inherits the probabilistic limitations of the model it uses and adds 300–2000 ms of latency per request.
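The regex failure mode is easy to reproduce. The sketch below is illustrative only: the SSN pattern, the sample strings, and the function names are assumptions made for this example, not part of any particular guard product. It shows a rule-based scan missing the same number once the hyphens are swapped for a fullwidth variant, and missing a spelled-out paraphrase even after normalization.

```python
import re
import unicodedata

# Hypothetical pattern for US SSNs written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def naive_guard(text: str) -> bool:
    """Return True if the raw text contains an SSN-shaped string."""
    return SSN_PATTERN.search(text) is not None


def normalizing_guard(text: str) -> bool:
    """Apply NFKC normalization first, which folds compatibility characters
    such as the fullwidth hyphen-minus back to their ASCII forms."""
    return SSN_PATTERN.search(unicodedata.normalize("NFKC", text)) is not None


plain = "SSN: 123-45-6789"
# Same digits, separated by U+FF0D FULLWIDTH HYPHEN-MINUS.
obfuscated = "SSN: 123\uff0d45\uff0d6789"
# Same information, spelled out in words.
paraphrased = "SSN: one two three, four five, six seven eight nine"

for label, text in [("plain", plain), ("obfuscated", obfuscated), ("paraphrased", paraphrased)]:
    print(label, naive_guard(text), normalizing_guard(text))
```

Normalization closes the character-substitution gap but does nothing for the paraphrase, which is why rule-based scanning is usually paired with one of the other mechanisms.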
Why Guards Fail for Regulated Industries
The fundamental problems with guards in regulated contexts lie not in their accuracy rates but in their temporal position in the pipeline and in the evidentiary record they produce.
Problem 1: The LLM Has Already Generated the Content
When a guard catches a policy violation, the LLM has already generated the prohibited content. For most consumer applications this is manageable — the user never sees it. But in regulated industries, generation itself can constitute a compliance event. A clinical AI system that generates a medication recommendation it is not authorized to make has violated policy at the moment of generation, regardless of whether the guard subsequently suppresses the output before the clinician sees it.
In streaming response architectures — which are now standard for LLM chat interfaces — the model begins transmitting tokens to the user interface before generation is complete. A post-generation guard operating on the complete response cannot intercept tokens already transmitted. Some architectures buffer the entire response before delivery, but this adds significant perceived latency and is frequently rejected by user experience requirements at scale.
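The trade-off between streaming and buffering can be shown with a small simulation. Everything here is a stand-in: `fake_llm_stream`, the token list, and the placeholder term "drug-X" are invented for illustration and do not model any real provider API.

```python
from typing import Callable, Iterator, List


def fake_llm_stream() -> Iterator[str]:
    """Stand-in for a streaming LLM; the tokens are invented for illustration."""
    yield from ["Take", " 500mg", " of", " drug-X", " daily"]


def violates_policy(text: str) -> bool:
    """Toy post-generation check: flags any mention of the placeholder 'drug-X'."""
    return "drug-X" in text


def stream_then_guard(send: Callable[[str], None]) -> bool:
    """Forward tokens as they arrive, then run the guard on the full response."""
    parts: List[str] = []
    for token in fake_llm_stream():
        send(token)  # the token has already left the server at this point
        parts.append(token)
    return violates_policy("".join(parts))


def buffer_then_guard(send: Callable[[str], None]) -> bool:
    """Hold the full response, run the guard, and deliver only if it passes."""
    full = "".join(fake_llm_stream())
    flagged = violates_policy(full)
    if not flagged:
        send(full)
    return flagged


streamed: List[str] = []
buffered: List[str] = []
print(stream_then_guard(streamed.append), streamed)
print(buffer_then_guard(buffered.append), buffered)
```

In the streaming path the guard flags the response only after every token has been delivered; in the buffered path nothing is delivered, at the cost of the user waiting for the whole generation to finish.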
Problem 2: No Cryptographic Proof of Pre-Evaluation
Regulators and auditors for regulated industries — the OCC and Federal Reserve for banks, FDA for medical devices, state insurance commissioners, government contracting officers — are increasingly sophisticated about AI governance. They do not ask whether you have logging. They ask whether you can prove that a specific policy was evaluated before a specific inference occurred.
A guard operating post-generation cannot produce this proof. Even comprehensive guard logs record that an output was evaluated after generation — not that a policy was evaluated before generation began. This distinction is legally and regulatorily meaningful. Pre-execution evaluation is an affirmative control; post-generation filtering is a best-effort remediation applied to content that already exists.
Audit reality: During a regulatory examination of an AI-assisted loan origination system, the examiner asked: "Show me the record that proves your fair lending policy was evaluated before this credit decision was generated." A post-generation content filter log cannot answer this question. A signed pre-execution decision certificate can.
Problem 3: Guards Cannot Enforce Request-Level Policy
Guards evaluate outputs — they cannot evaluate the context, authorization state, or policy applicability of the request that generated them. A physician's workstation querying a clinical AI for medication dosing is a different policy context from a patient-facing portal asking the same question. Guards operating on output text cannot distinguish these contexts because the output text may be identical. Only a pre-execution gate that evaluates the full request context — user role, session metadata, data classification, organizational policy state — can enforce request-level policy with precision.
Problem 4: Probabilistic Models Cannot Provide Deterministic Guarantees
Classifier-based guards are probabilistic. They have accuracy rates, not guarantees. Consider regulated industries with zero-tolerance requirements: a clinical system that must never suggest a drug the patient is allergic to, or a financial AI that must never recommend a security to a customer whose suitability profile prohibits it. Even a guard that catches 99.9% of violating outputs lets one in a thousand through, which adds up to thousands of missed violations once violating outputs number in the millions. Deterministic policy enforcement, by contrast, evaluates a rule, and the condition is either met or not. Every time. Without exception.
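The scale argument is simple arithmetic. The sketch below assumes a fixed detection rate and hypothetical volumes of violating outputs; real guards have rates that vary by category and input distribution.

```python
def expected_missed(violating_outputs: int, detection_rate: float) -> float:
    """Expected number of violating outputs that a guard fails to catch."""
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return violating_outputs * (1.0 - detection_rate)


# Hypothetical volumes of violating outputs, all at a 99.9% detection rate.
for volume in (10_000, 1_000_000, 10_000_000):
    print(volume, round(expected_missed(volume, 0.999)))
```

At one million violating outputs, a 99.9% detection rate still leaves roughly a thousand undetected.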
The Gate: Pre-Execution Policy Enforcement
A gate sits before the LLM in the inference pipeline. It evaluates the incoming request — including user context, session state, data classification of referenced content, applicable policy set, and organizational authorization state — and makes a disposition decision (ALLOW, BLOCK, or MODIFY) before the LLM processes the request. If the disposition is BLOCK, the LLM never executes. If MODIFY, the request is transformed to bring it within policy scope before execution begins.
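The control flow described above can be sketched in a few lines. The policy rules, the `toy_gate` and `governed_call` names, and the stub model are all hypothetical; the point is only that a BLOCK disposition returns before the model is ever invoked, and a MODIFY disposition changes what the model receives.

```python
import re
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class Disposition(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    MODIFY = "MODIFY"


Request = Dict[str, str]


def toy_gate(request: Request) -> Tuple[Disposition, Request]:
    """Hypothetical policy, invented for illustration."""
    prompt = request["prompt"]
    # Rule 1: patient-facing sessions may not request dosing advice.
    if request["role"] == "patient" and "dosing" in prompt.lower():
        return Disposition.BLOCK, request
    # Rule 2: redact placeholder account numbers before the model sees them.
    redacted = re.sub(r"ACCT-\d+", "[REDACTED]", prompt)
    if redacted != prompt:
        return Disposition.MODIFY, dict(request, prompt=redacted)
    return Disposition.ALLOW, request


def governed_call(request: Request, llm: Callable[[str], str]) -> Tuple[Disposition, Optional[str]]:
    """Evaluate the gate first; call the model only if the request is not blocked."""
    disposition, effective = toy_gate(request)
    if disposition is Disposition.BLOCK:
        return disposition, None  # the model is never invoked
    return disposition, llm(effective["prompt"])


seen_prompts: List[str] = []


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call; records every prompt it receives."""
    seen_prompts.append(prompt)
    return "stub response"


print(governed_call({"role": "patient", "prompt": "What dosing should I take?"}, stub_llm))
print(governed_call({"role": "loan_officer", "prompt": "Summarize ACCT-1234"}, stub_llm))
print(seen_prompts)
```

The stub model only ever sees the redacted loan-officer prompt; the blocked patient request never reaches it.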
Technical Architecture of a Governance Gate
A production-grade governance gate consists of four architectural components operating synchronously in the request path:
1. Request Context Assembly
Before policy evaluation, the gate assembles the full request context: the user identity and role (from session token or API key), data classifications of any content referenced in the request, the organizational policy set applicable to this user and context, request metadata (endpoint, timestamp, request chain identifiers for distributed tracing), and any active policy overrides or time-bounded exceptions. This context assembly is the foundation for request-level policy enforcement — without it, the gate can only evaluate request text content, not authorization state.
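One way to picture the assembled context is as an immutable record built from a few lookups. The field names, the lookup tables, and `assemble_context` below are assumptions made for this sketch; a real deployment would resolve them from an identity provider, a data catalog, and a policy registry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RequestContext:
    user_id: str
    role: str
    data_classifications: Tuple[str, ...]
    policy_set: str
    endpoint: str
    trace_id: str
    received_at: datetime
    active_overrides: Tuple[str, ...] = ()


# Hypothetical in-memory stand-ins for external systems.
SESSIONS = {"tok-123": {"user_id": "u42", "role": "loan_officer"}}
DOC_CLASSIFICATION = {"doc-1": "confidential", "doc-2": "public"}
POLICY_BY_ROLE = {"loan_officer": "lending_v1"}


def assemble_context(session_token: str, referenced_docs: Iterable[str],
                     endpoint: str, trace_id: str) -> RequestContext:
    """Collect identity, data classification, and policy scope into one record."""
    session = SESSIONS[session_token]  # a KeyError here means an unknown session
    classes = tuple(sorted({DOC_CLASSIFICATION[doc] for doc in referenced_docs}))
    return RequestContext(
        user_id=session["user_id"],
        role=session["role"],
        data_classifications=classes,
        policy_set=POLICY_BY_ROLE[session["role"]],
        endpoint=endpoint,
        trace_id=trace_id,
        received_at=datetime.now(timezone.utc),
    )


ctx = assemble_context("tok-123", ["doc-1", "doc-2"], "/v1/chat", "trace-abc")
print(ctx.role, ctx.data_classifications, ctx.policy_set)
```

Freezing the dataclass keeps the context from being altered between assembly and evaluation, so the record that is evaluated is the record that is later certified.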
2. Deterministic Policy Evaluation Engine
The core of a governance gate is a deterministic evaluation engine — not a probabilistic classifier. Policy rules are expressed in structured form (JSON or YAML policy documents) and evaluated against the assembled request context using deterministic logic. A rule like {"block": {"user_role": "patient", "contains_category": "prescription_advice"}} evaluates to a binary result. There is no accuracy rate — the rule executes consistently on every matching input.
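A minimal evaluator for rules of that shape might look like the following. The interpretation is an assumption: every key in a `block` rule must match for the rule to fire, and the request's categories are taken as already determined upstream, since the source does not specify how `contains_category` is computed.

```python
import json
from typing import Any, Dict, List

# The rule from the text, wrapped in a list to form a tiny policy set.
RULES: List[Dict[str, Any]] = json.loads(
    '[{"block": {"user_role": "patient", "contains_category": "prescription_advice"}}]'
)


def rule_matches(conditions: Dict[str, str], context: Dict[str, Any]) -> bool:
    """A rule matches only if every condition it names holds for the context."""
    if "user_role" in conditions and conditions["user_role"] != context["user_role"]:
        return False
    if "contains_category" in conditions and conditions["contains_category"] not in context["categories"]:
        return False
    return True


def evaluate(rules: List[Dict[str, Any]], context: Dict[str, Any]) -> str:
    """Return BLOCK if any block rule matches, otherwise ALLOW."""
    for rule in rules:
        if "block" in rule and rule_matches(rule["block"], context):
            return "BLOCK"
    return "ALLOW"


patient = {"user_role": "patient", "categories": {"prescription_advice"}}
physician = {"user_role": "physician", "categories": {"prescription_advice"}}
print(evaluate(RULES, patient))
print(evaluate(RULES, physician))
```

The two contexts carry the same request category and differ only in role, which is the request-level distinction that an output-only guard has no way to see.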
Speed is essential. The gate must complete evaluation synchronously in the request path without introducing user-perceptible latency. CoreGuard's policy evaluation completes in under 2 milliseconds for standard policy sets, adding negligible overhead to a typical LLM request that takes 300–3000 milliseconds to complete. This performance requires that policy evaluation operates against in-memory compiled rule sets — not database lookups or remote API calls on the critical path.
3. Decision Certificate Generation
After evaluation, the gate generates a Governed Decision Certificate: a cryptographically signed record of the evaluation event. The certificate includes a hash of the request content (binding the certificate to the specific request), the policy set version evaluated (enabling policy change attribution), the evaluation timestamp (which precedes inference), the disposition, the specific policy rules that triggered it, and an HMAC-SHA256 signature computed with the organization's governance signing key.
This certificate is the evidentiary record that answers an auditor's question: "Prove that your policy was evaluated before this inference occurred." The timestamp is pre-inference. The request hash is independently verifiable. The policy version is specified. The signature is verifiable offline, without querying any live system, by any party that holds the signing key. By definition, no post-generation guard architecture can produce an equivalent artifact.
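The certificate structure can be sketched with the Python standard library. The field names, the canonical JSON encoding, and the `issue_certificate` and `verify_certificate` helpers are assumptions for illustration, not CoreGuard's actual certificate format.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def _canonical(payload: Dict[str, Any]) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_certificate(request_body: bytes, policy_version: str, disposition: str,
                      triggered_rules: List[str], key: bytes) -> Dict[str, Any]:
    """Build a certificate payload and attach an HMAC-SHA256 tag over it."""
    payload = {
        "request_sha256": hashlib.sha256(request_body).hexdigest(),
        "policy_version": policy_version,
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
        "disposition": disposition,
        "triggered_rules": triggered_rules,
    }
    signature = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}


def verify_certificate(cert: Dict[str, Any], request_body: bytes, key: bytes) -> bool:
    """Check the tag and confirm the certificate is bound to this request body."""
    payload = {k: v for k, v in cert.items() if k != "signature"}
    expected = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, str(cert.get("signature", "")))
    body_ok = payload.get("request_sha256") == hashlib.sha256(request_body).hexdigest()
    return tag_ok and body_ok


key = b"demo-key-for-illustration-only"
body = b'{"prompt": "Summarize this loan file"}'
cert = issue_certificate(body, "lending_v1@2026-04-01", "ALLOW", [], key)
print(verify_certificate(cert, body, key))
print(verify_certificate({**cert, "disposition": "BLOCK"}, body, key))
```

HMAC is a symmetric construction, so anyone able to verify a certificate can also forge one. Where an external auditor must verify without being trusted with the key, an asymmetric signature scheme is the usual alternative.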
4. Response Binding and BLOCK Handling
For ALLOW dispositions, the gate attaches the certificate identifier to request metadata so the LLM response links back to the pre-execution evaluation record. For MODIFY dispositions, the gate produces a modified request along with a certificate documenting the transformation applied. For BLOCK dispositions, the gate returns a policy-cited response directly — the LLM is never called, and the certificate is the complete interaction record. BLOCK responses are returned with the same latency as the policy evaluation itself: under 2 ms.
Latency Analysis: Gate vs. Guard
The common objection to pre-execution governance is latency. In practice, the answer depends on architecture — and the comparison favors gates:
| Architecture | Added Latency | Position in Pipeline | Produces Pre-Execution Proof? |
|---|---|---|---|
| CoreGuard deterministic policy gate | 0.8 – 2.5 ms | Pre-execution | Yes — HMAC-signed certificate |
| Post-generation classifier guard | 50 – 200 ms | Post-generation | No |
| LLM-as-judge guard | 300 – 2000 ms | Post-generation | No |
| Async logging only (no guard or gate) | < 0.1 ms | Post-generation (async) | No |
For a typical LLM inference that takes 500–1500 ms, a 2 ms pre-execution gate adds at most 0.4% overhead, which is imperceptible to users. LLM-as-judge guards add 20–400% overhead and still operate post-generation, providing neither the performance advantage of deterministic evaluation nor the evidentiary value of pre-execution proof.
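The percentages follow directly from the latency ranges quoted in this section. The helper below is just that arithmetic written out; `overhead_pct` is a name chosen for this sketch.

```python
def overhead_pct(added_ms: float, inference_ms: float) -> float:
    """Added latency as a percentage of the base inference time."""
    if inference_ms <= 0:
        raise ValueError("inference_ms must be positive")
    return 100.0 * added_ms / inference_ms


# Gate: 2 ms added to the slowest and fastest typical inference.
print(overhead_pct(2, 1500), overhead_pct(2, 500))
# LLM-as-judge: best case (300 ms on 1500 ms) and worst case (2000 ms on 500 ms).
print(overhead_pct(300, 1500), overhead_pct(2000, 500))
```

The gate's worst case is 2 ms on a 500 ms inference, or 0.4%. The LLM-as-judge range runs from 20% (300 ms on 1500 ms) to 400% (2000 ms on 500 ms).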
The performance advantage of deterministic gates over probabilistic guards reflects a fundamental architectural principle: deterministic rule evaluation over structured data is orders of magnitude faster than probabilistic neural network forward passes over unstructured text. A governance gate evaluating structured policy rules is doing a lookup table traversal; a classifier guard is running matrix multiplications at GPU memory bandwidth limits.
Integration Patterns
A governance gate integrates into existing LLM infrastructure in three primary patterns, which differ in how much application change they require:
Sidecar Container Pattern (Zero Application Code Change)
Deploy the governance gate as a sidecar container in the same Kubernetes pod as the application layer that calls the LLM API. Configure the application's HTTP proxy to route through the sidecar. All outbound LLM requests pass through the sidecar, which evaluates policy and either forwards with a certificate header or returns a BLOCK response. The application code does not change.
    # docker-compose excerpt
    services:
      app:
        environment:
          HTTPS_PROXY: http://coreguard-sidecar:3128
      coreguard-sidecar:
        image: evecore/coreguard:latest
        environment:
          POLICY_SET: lending_v1
          SIGNING_KEY_REF: vault://governance/signing-key
SDK Integration Pattern (Direct Integration)
For applications where the LLM call is made directly in application code, the governance SDK wraps the provider client. The application calls the SDK; the SDK handles policy evaluation, certificate generation, and provider routing transparently.
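    import os

    from coreguard import GovernedLLMClient

    # Read the API key from the environment (the variable name is an assumption).
    COREGUARD_KEY = os.environ["COREGUARD_KEY"]

    client = GovernedLLMClient(
        policy_set="lending_v1",
        api_key=COREGUARD_KEY,
        provider="anthropic",
    )

    async def governed_chat(conversation, uid):
        # `await` is only valid inside a coroutine, so the call is wrapped here.
        response = await client.chat(
            messages=conversation,
            user_context={"role": "loan_officer", "org_id": "acme", "user_id": uid},
        )
        # Pre-execution proof is attached to every response
        print(response.certificate.disposition)     # "ALLOW"
        print(response.certificate.policy_version)  # "lending_v1@2026-04-01"
        print(response.certificate.timestamp)       # ISO8601 pre-inference timestamp
        return response

    # Call from async code, or run with asyncio.run(governed_chat(conversation, uid)).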
API Proxy Pattern (Network-Level Interception)
For organizations that cannot modify application code or deploy sidecar infrastructure, the governance gate runs as a standalone API proxy endpoint. Traffic is redirected to the governance proxy instead of going to the LLM provider directly, through a DNS or load balancer configuration change rather than a code change. Policy evaluation happens at the network layer, transparent to the application, and the pattern is deployable in under an hour.
What the Gate Cannot Do
Architectural honesty requires acknowledging gate limitations. A gate evaluates the request against policy — it cannot evaluate output content for novel stochastic violations that emerge from LLM generation. A request that passes all policy checks (authorized user, authorized use case, authorized data access) can still produce output containing factual errors, unexpected framing, or stylistic problems that output monitoring might catch.
Mature LLM governance architectures therefore layer both: a pre-execution gate providing deterministic policy enforcement and cryptographic pre-execution proof (the compliance layer), combined with lightweight output monitoring for quality and anomaly detection (the quality layer). These are distinct functions serving distinct purposes, and conflating them leads to choosing the wrong tool for each job — or worse, believing that a quality tool satisfies compliance requirements it was not designed to meet.
Key principle: The gate is the compliance layer — deterministic, certifiable, auditable. Output monitoring is the quality layer — probabilistic, heuristic, advisory. Regulated industries need both, but only the gate produces the evidentiary record that satisfies regulatory audit requirements. No post-generation guard can substitute for it.
Conclusion
The guard versus gate distinction is not academic — it is the difference between a compliance program that can answer an auditor's hardest question ("prove that this policy was evaluated before this inference occurred") and one that cannot. Post-generation content filters are valuable quality controls, but they are architecturally incapable of producing pre-execution evidentiary records, and they cannot prevent violations from occurring at the generation event.
Building a governance gate requires a deterministic policy evaluation engine, a cryptographic certificate generation mechanism, and a synchronous integration architecture that places evaluation before LLM inference begins. The latency cost is under 2 milliseconds — imperceptible to users, and decisive to auditors. Organizations deploying AI in regulated contexts should architect their compliance layer around a pre-execution gate from day one, treating output monitoring as a complementary quality control rather than a compliance substitute.