Healthcare organizations deploying large language models face a compliance landscape that is more complex than many IT and compliance teams anticipate. HIPAA's Security Rule, Privacy Rule, and Breach Notification Rule all apply to AI systems that handle Protected Health Information — and the way LLMs handle PHI creates specific compliance risks that have no direct analog in traditional healthcare IT systems.
The core problem is that LLMs are designed to process natural language, and natural language clinical data is full of PHI. A clinical documentation assistant that receives a dictated note, an AI that summarizes discharge instructions, a chatbot that answers patient questions — each of these systems will routinely encounter patient names, dates of birth, medical record numbers, diagnoses, and treatment details. Every one of those interactions is an ePHI access event subject to HIPAA's technical safeguard requirements.
This checklist covers the 18 HIPAA safeguards most directly implicated by AI deployments, with specific guidance on what each requires technically, where AI systems commonly fail, and how automated enforcement infrastructure addresses each requirement. Use it to assess your current posture and identify gaps before your next OCR examination or security risk analysis update.
What Makes AI Different from Traditional Healthcare IT
Traditional healthcare IT systems — EHRs, PACS, laboratory information systems — have bounded, predictable data handling behavior. They accept structured inputs, perform defined operations, and produce structured outputs. Auditing their ePHI access is a matter of logging which users accessed which records through which interfaces. The data handling is transparent and controllable because the system's behavior is defined by its code, not by its training data.
LLMs introduce three characteristics that create HIPAA compliance challenges traditional IT governance frameworks do not handle well:
Implicit PHI extraction. When a clinician asks an LLM "What are the standard treatment options for a 67-year-old diabetic patient with Stage 3 CKD?", the question itself may constitute PHI if it identifies a specific patient. LLMs cannot reliably distinguish between questions that describe a patient and questions that describe a general clinical scenario — and neither can simple keyword-based filters. The PHI determination depends on context that the LLM may not have access to.
Non-deterministic data handling. Traditional systems handle PHI in defined ways that can be fully documented. An LLM's handling of PHI in a conversation is shaped by its training data, the specific prompt context, and sampling randomness. The same PHI-containing prompt may produce different outputs in different sessions, making it difficult to specify exactly what the system "does" with PHI and whether that handling satisfies the minimum necessary standard.
Training data risk. LLMs trained on healthcare data may have internalized PHI from their training corpus in ways that could surface in model outputs. This is a risk that traditional healthcare IT systems do not present. The OCR has not issued specific guidance on this risk as of 2026, but it represents a potential HIPAA vulnerability that healthcare organizations deploying foundation models need to assess during vendor due diligence.
The 18-Item HIPAA AI Compliance Checklist
The following table maps each HIPAA requirement to its regulatory source, describes the compliance status of a typical LLM deployment without governance infrastructure, and identifies the enforcement architecture required to satisfy the requirement.
| # | HIPAA Requirement | CFR Citation | Typical AI Deployment Status | Enforcement Architecture Required |
|---|---|---|---|---|
| 1 | Unique user identification | §164.312(a)(2)(i) | PARTIAL — API keys often shared across users | Per-user authentication before AI access; enforce in enforcement layer |
| 2 | Automatic logoff | §164.312(a)(2)(iii) | PARTIAL — Session management varies by integration | Enforce session timeout in AI integration layer; certificate links to session ID |
| 3 | Encryption in transit | §164.312(e)(2)(ii) | PASS — TLS standard for API calls | Verify TLS enforced for all AI API endpoints including governance layer |
| 4 | Encryption at rest | §164.312(a)(2)(iv) | PARTIAL — Audit record encryption often not verified | Encrypt enforcement layer audit records at rest; key management policy required |
| 5 | Audit controls — record and examine access | §164.312(b) | FAIL — Standard LLM logs lack required structure | HMAC-signed per-access audit record for every AI interaction with ePHI |
| 6 | Audit controls — tamper-evident records | §164.312(b) | FAIL — Standard logs are mutable | Hash-chained, HMAC-signed records that cannot be altered without detection |
| 7 | Integrity controls for ePHI | §164.312(c)(1) | FAIL — LLM outputs can alter clinical documentation content | Enforce validation rules on AI-generated clinical content before use in records |
| 8 | Minimum necessary standard | §164.502(b) | FAIL — LLMs typically receive full clinical context when partial context would suffice | Enforce minimum necessary data rules in pre-execution layer; block excess PHI fields |
| 9 | Purpose limitation for PHI use | §164.506 | PARTIAL — AI use outside authorized purposes not enforced | Enforce authorized use context in enforcement layer; block off-purpose PHI requests |
| 10 | Business Associate Agreement with AI vendor | §164.308(b)(1) | FAIL — Many AI vendors do not offer HIPAA BAA | Verify BAA before AI vendor onboarding; enforcement layer blocks uncovered endpoints |
| 11 | BAA satisfactory assurances from subcontractors | §164.308(b)(2) | FAIL — Foundation model providers often use subprocessors not covered by your BAA | Flow-down requirement verification in vendor due diligence; document subprocessor coverage |
| 12 | Workforce training on AI PHI handling | §164.308(a)(5) | FAIL — Standard HIPAA training does not cover AI-specific PHI risks | Update workforce training to include AI PHI handling; document completion records |
| 13 | Security risk analysis for AI systems | §164.308(a)(1)(ii)(A) | FAIL — AI systems often not included in formal SRA | Include all AI systems handling ePHI in SRA; document risk analysis and mitigation |
| 14 | Access control — role-based PHI access | §164.312(a)(1) | PARTIAL — AI access often not subject to same role controls as EHR access | Enforce role-based access in enforcement layer; different policy rules per user role |
| 15 | Breach notification for AI incidents | §164.400–414 | PARTIAL — AI-related breaches may not be identified as such | Enforcement layer incident detection; structured breach event records for notification process |
| 16 | Disposal of ePHI — AI conversation logs | §164.310(d)(2)(i) | FAIL — AI conversation logs containing ePHI often retained without disposal schedule | Retention policy enforced by governance layer; disposal audit records for verification |
| 17 | Contingency plan — AI system availability | §164.308(a)(7) | PARTIAL — AI system downtime may not be included in contingency planning | Document AI system in contingency plan; test failover from AI-assisted to manual workflows |
| 18 | PHI use in model training/fine-tuning | §164.502; §164.514 | FAIL — PHI used in fine-tuning without authorization is a HIPAA violation | Explicit prohibition in enforcement layer policy; BAA must cover training data use |
PHI in Prompts: The Immediate Compliance Risk
Of the 18 checklist items, the two that create the most immediate compliance risk in healthcare AI deployments are #5 and #6, the audit control requirements. Healthcare organizations that have deployed LLMs for clinical documentation, patient communication, or administrative functions and capture LLM inputs and outputs only in standard application logs have almost certainly failed the HIPAA audit control requirement. The failure is not an absence of logging; it is that the logs lack the structure the requirement demands.
HIPAA §164.312(b) requires that covered entities implement hardware, software, and/or procedural mechanisms that record and examine activity in information systems that contain or use ePHI. The OCR's interpretive guidance specifies that these mechanisms must produce audit logs that can be reviewed to determine who accessed what ePHI and when, in a format that is meaningful for investigation and that can detect unauthorized access.
A standard LLM application log — timestamp, user ID, prompt hash, response hash, latency — does not satisfy this requirement. It does not show what ePHI was accessed, it does not show what the AI did with the ePHI, and it does not produce a meaningful audit trail for an OCR investigation. The OCR needs to be able to determine, for a specific incident, exactly what patient information was accessed by whom and through what mechanism. A hash of the prompt does not provide that information.
Logging that AI interactions occurred is not the same as audit controls that satisfy HIPAA §164.312(b). The OCR requires records that identify the specific ePHI accessed, the purpose of access, the user who accessed it, and the system through which access occurred. Standard LLM application logs capture none of this in the structured form that §164.312(b) requires.
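To make the contrast with a hash-only application log concrete, here is a minimal sketch of a structured, tamper-evident audit record using only the Python standard library. The field names (`user_id`, `purpose`, `patient_ref`, `phi_fields`), the hard-coded key, and the helper functions are illustrative assumptions, not a prescribed HIPAA schema or any vendor's format. Each record carries the hash of the previous record and an HMAC signature, so altering or removing an earlier entry is detectable on verification.

```python
import hashlib
import hmac
import json

# Assumption: a demo key. A real deployment would fetch this from a key management service.
SIGNING_KEY = b"demo-key-not-for-production"
GENESIS_HASH = "0" * 64  # stand-in "previous hash" for the first record


def _canonical(body: dict) -> bytes:
    # Stable serialization so identical content always yields identical bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_record(chain: list, event: dict) -> dict:
    """Append an audit event, linking it to the previous record and signing it."""
    prev_hash = chain[-1]["record_hash"] if chain else GENESIS_HASH
    body = {"event": event, "prev_hash": prev_hash}
    payload = _canonical(body)
    record = {
        **body,
        "record_hash": hashlib.sha256(payload).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest(),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Return True only if every record is correctly linked, hashed, and signed."""
    prev_hash = GENESIS_HASH
    for record in chain:
        payload = _canonical({"event": record["event"], "prev_hash": record["prev_hash"]})
        expected_sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
        if record["prev_hash"] != prev_hash:
            return False
        if record["record_hash"] != hashlib.sha256(payload).hexdigest():
            return False
        if not hmac.compare_digest(record["signature"], expected_sig):
            return False
        prev_hash = record["record_hash"]
    return True


audit_log = []
append_record(audit_log, {
    "timestamp": "2026-01-15T14:02:11Z",
    "user_id": "clinician-0042",
    "system": "documentation-assistant",
    "purpose": "DISCHARGE_SUMMARY_EDIT",
    "patient_ref": "mrn:EXAMPLE-1",
    "phi_fields": ["clinical_note", "demographics"],
})
append_record(audit_log, {
    "timestamp": "2026-01-15T14:05:40Z",
    "user_id": "clinician-0042",
    "system": "documentation-assistant",
    "purpose": "DISCHARGE_SUMMARY_EDIT",
    "patient_ref": "mrn:EXAMPLE-2",
    "phi_fields": ["clinical_note"],
})
```

Calling `verify_chain(audit_log)` on the untouched log succeeds; editing any field of an earlier record afterwards makes verification fail. The record names the user, the system, the purpose, and the PHI categories touched, which is the information a prompt hash cannot supply.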
Business Associate Agreement Requirements for AI Vendors
The BAA requirement (checklist items #10 and #11) is the most frequently discovered compliance gap during healthcare AI program reviews. The legal analysis is straightforward, but the practical situation is complex:
Any entity that handles ePHI on behalf of a covered entity is a business associate and must execute a BAA before receiving ePHI. An LLM provider whose models receive clinical data containing ePHI is a business associate — it does not matter that the LLM is processing natural language rather than structured health records. The PHI is PHI regardless of its format.
As of 2026, the major foundation model providers have heterogeneous HIPAA BAA postures:
- Microsoft Azure OpenAI Service — Offers HIPAA BAA coverage; requires specific Azure regions and configurations to activate coverage
- AWS Bedrock — Offers HIPAA BAA coverage for Claude models via AWS; organization must have existing BAA with AWS and enable HIPAA-eligible services
- Google Cloud Vertex AI — Offers HIPAA BAA; Healthcare Data Processing Addendum required for ePHI processing
- Anthropic direct API — Does not offer HIPAA BAA as of this writing; use via AWS Bedrock or GCP Vertex required for HIPAA-covered use
- OpenAI direct API — Offers BAA for Enterprise tier only; not available on standard or developer tiers
The subprocessor question (checklist item #11) is more complex. When a healthcare organization executes a BAA with a cloud provider that hosts an LLM, the BAA covers the cloud provider's direct handling of ePHI. But the foundation model provider may use infrastructure subprocessors — GPU cloud providers, inference optimization services, monitoring vendors — whose handling of data processed during inference may not be covered by the BAA. Healthcare organizations should request a complete subprocessor list from AI vendors and verify that each subprocessor is either covered by a downstream BAA or does not handle ePHI.
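The subprocessor review described above reduces to a simple rule: any subprocessor that handles ePHI must be covered by a downstream BAA. A rough sketch of that check follows. The `Subprocessor` type, its fields, and the vendor names are hypothetical, and the inputs would come from the vendor's disclosed subprocessor list rather than from any automated source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subprocessor:
    name: str
    handles_ephi: bool      # does inference data containing ePHI reach this party?
    covered_by_baa: bool    # is there a documented downstream BAA?


def uncovered_subprocessors(subprocessors):
    """Return names of subprocessors that handle ePHI without BAA coverage."""
    return [s.name for s in subprocessors if s.handles_ephi and not s.covered_by_baa]


# Illustrative entries only; real names come from the vendor's subprocessor list.
vendor_subprocessors = [
    Subprocessor("gpu-cloud-a", handles_ephi=True, covered_by_baa=True),
    Subprocessor("inference-optimizer-b", handles_ephi=True, covered_by_baa=False),
    Subprocessor("status-page-c", handles_ephi=False, covered_by_baa=False),
]

gaps = uncovered_subprocessors(vendor_subprocessors)
```

Any name left in `gaps` is a due diligence item to resolve before ePHI is sent to that vendor.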
Minimum Necessary Standard Applied to AI Prompts
HIPAA's minimum necessary standard requires that covered entities make reasonable efforts to limit the use or disclosure of PHI to the minimum necessary to accomplish the intended purpose. For AI systems, this creates a specific technical requirement: the data passed to an AI model in a prompt should include only the PHI elements required for the specific task the AI is performing.
A clinical documentation assistant that needs to improve the clarity of a discharge summary does not need the patient's full medical record history in its context. A patient communication chatbot that is answering a question about appointment scheduling does not need the patient's diagnosis or medication list in its context. An AI that is summarizing a referral letter does not need the patient's insurance information in its context.
The typical minimum necessary violation in AI deployments is the "dump the full record" integration: an LLM connected to an EHR API that returns the full patient record for every AI interaction, regardless of which subset of the record the task actually needs. This is operationally convenient, but it turns every AI interaction into a minimum necessary violation.
The enforcement layer addresses this by implementing data field validation rules. Before a request containing clinical data reaches the AI, the enforcement layer evaluates whether the data fields included in the request are appropriate for the declared task type. A task type of DISCHARGE_SUMMARY_EDIT authorizes clinical note content and demographic data but not insurance, billing, or genetic information. A request for this task type that includes billing data returns MODIFIED with those fields removed before the AI receives the request.
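The field validation rule described above can be sketched as a filter keyed on task type. This is an illustration under assumptions, not a product API: the `ALLOWED_FIELDS` mapping, the field names, the `Decision` type, and the `ALLOWED` disposition are hypothetical, while `MODIFIED` and the `DISCHARGE_SUMMARY_EDIT` task type come from the text.

```python
from dataclasses import dataclass, field

# Assumption: task types map to the PHI field categories they authorize.
ALLOWED_FIELDS = {
    "DISCHARGE_SUMMARY_EDIT": {"clinical_note", "demographics"},
    "APPOINTMENT_SCHEDULING": {"demographics", "appointments"},
}


@dataclass
class Decision:
    disposition: str                      # "ALLOWED", "MODIFIED", or "BLOCKED"
    payload: dict                         # what the model is permitted to receive
    removed_fields: list = field(default_factory=list)


def enforce_minimum_necessary(task_type: str, payload: dict) -> Decision:
    allowed = ALLOWED_FIELDS.get(task_type)
    if allowed is None:
        # Fail closed: an undeclared task type receives no PHI at all.
        return Decision("BLOCKED", {}, sorted(payload))
    kept = {k: v for k, v in payload.items() if k in allowed}
    removed = sorted(k for k in payload if k not in allowed)
    return Decision("MODIFIED" if removed else "ALLOWED", kept, removed)


request = {
    "clinical_note": "Patient stable at discharge.",
    "demographics": {"age": 67},
    "billing": {"payer": "ExamplePlan"},
}
decision = enforce_minimum_necessary("DISCHARGE_SUMMARY_EDIT", request)
```

Here the billing data is stripped and the request is marked `MODIFIED` before anything reaches the model. Failing closed on unknown task types is a design choice in this sketch; it keeps a missing policy entry from silently passing a full record.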
OCR Examination Preparation for AI Systems
The OCR's HIPAA Security Rule audit program has been examining healthcare AI systems with increasing specificity since 2024. Covered entities that receive an OCR audit involving AI systems should expect examination requests that include:
- A list of all AI systems that handle or have access to ePHI
- The BAA documentation for each AI vendor and their subprocessors
- Evidence that each AI system was included in the organization's most recent security risk analysis
- The audit log for AI-related ePHI access for the period under examination
- Evidence that the audit log satisfies §164.312(b) requirements (structured, tamper-evident, access-linked)
- Evidence that the minimum necessary standard is enforced for AI-accessible PHI fields
- The workforce training records showing AI-specific PHI handling training
- The disposal schedule for AI conversation logs containing ePHI
Organizations that cannot produce audit logs meeting the §164.312(b) evidence request (the fifth item in the list above) should expect a finding on §164.312(b) audit controls, historically one of the most frequently cited HIPAA violations in OCR enforcement actions. The enforcement layer's per-interaction, HMAC-signed audit records directly address this examination risk.
A healthcare AI deployment that is governed by a pre-execution enforcement layer with HMAC-signed audit records, minimum necessary data enforcement, and BAA coverage verification can answer every OCR examination request for AI systems with concrete, verifiable evidence. The alternative — reconstructing compliance evidence from unstructured logs after receiving an examination notice — is a much more difficult and less convincing response.
CoreGuard's HIPAA Enforcement Capabilities
CoreGuard addresses the most critical HIPAA compliance gaps for healthcare AI deployments through its pre-execution enforcement layer and certificate generation infrastructure. The key enforcement capabilities relevant to this checklist are:
PHI minimum necessary enforcement. CoreGuard's healthcare policy rules validate that AI requests include only the PHI fields authorized for the declared task type. Task type definitions are configurable and map directly to the organization's minimum necessary policies. Requests with excess PHI fields return MODIFIED with the unauthorized fields removed before AI processing.
BAA coverage verification. CoreGuard's request routing includes endpoint coverage validation — before any ePHI-containing request is routed to an AI endpoint, the enforcement layer verifies that the endpoint is covered under an active BAA. Requests to uncovered endpoints return BLOCKED with reason code HIPAA_BAA_NOT_COVERED.
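The routing check described here can be pictured with a small sketch. It is not CoreGuard's implementation or API; the registry structure, host names, function name, and the `ALLOWED` disposition are assumptions made for illustration. Only `BLOCKED` and the `HIPAA_BAA_NOT_COVERED` reason code come from the description above.

```python
from datetime import date

# Assumption: a registry of endpoint hosts mapped to the expiry date of the BAA on file.
BAA_REGISTRY = {
    "llm.covered-vendor.example": date(2027, 6, 30),
    "llm.expired-vendor.example": date(2024, 1, 1),
}


def check_baa_coverage(endpoint_host: str, contains_ephi: bool, today: date) -> dict:
    """Decide whether a request may be routed to the given AI endpoint."""
    if not contains_ephi:
        # The BAA check applies only to requests that carry ePHI.
        return {"disposition": "ALLOWED", "reason": None}
    expiry = BAA_REGISTRY.get(endpoint_host)
    if expiry is None or expiry < today:
        # No BAA on file, or the BAA has lapsed: do not route ePHI here.
        return {"disposition": "BLOCKED", "reason": "HIPAA_BAA_NOT_COVERED"}
    return {"disposition": "ALLOWED", "reason": None}


today = date(2026, 1, 15)
covered = check_baa_coverage("llm.covered-vendor.example", True, today)
expired = check_baa_coverage("llm.expired-vendor.example", True, today)
unknown = check_baa_coverage("llm.unlisted-vendor.example", True, today)
```

Treating an expired agreement the same as a missing one keeps a lapsed BAA from going unnoticed in routing.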
HMAC-signed audit records. Every AI interaction that passes through the CoreGuard enforcement layer generates a signed audit certificate that includes the interaction timestamp, the requesting user identifier, the task type, the data fields included in the request, the policy rules evaluated, and the final disposition. These certificates form the structured, tamper-evident audit log that §164.312(b) requires.
Purpose limitation enforcement. CoreGuard's policy rules include context-type validation that restricts AI model access to authorized purposes. A model deployed for clinical documentation assistance cannot be used for insurance processing queries within the same enforcement policy — the enforcement layer detects the context mismatch and returns BLOCKED.
For additional context on how automated enforcement addresses healthcare AI compliance, see our articles on AI compliance automation and how to evaluate AI governance vendors. For regulatory context across industries, our coverage of EU AI Act enforcement requirements and CFPB fair lending guidance provides relevant parallels. The technical documentation for CoreGuard's healthcare enforcement capabilities is available in the full documentation.