PII in Healthcare: HIPAA Compliance Checklist for 2026
In January 2025, the HHS Office for Civil Rights settled with a regional hospital network for $4.75 million after an investigation revealed unencrypted patient records sitting in a decommissioned cloud storage bucket — accessible to anyone with the URL. The data had been there for nineteen months. Over 2.3 million patients were affected. The hospital's CISO told investigators they believed the data had been deleted during a system migration.
Healthcare is the most breached industry in the United States for the fourteenth consecutive year. According to the IBM Cost of a Data Breach Report 2025, the average cost of a healthcare data breach reached $11.09 million — nearly three times the cross-industry average. The reason is simple: healthcare organizations handle the most sensitive categories of personally identifiable information (PII) that exist — medical diagnoses, treatment histories, genetic data, insurance identifiers, Social Security numbers — and they do so across sprawling, fragmented systems that were never designed with modern data privacy in mind.
HIPAA's Security Rule, Privacy Rule, and Breach Notification Rule establish the regulatory floor, but the real challenge isn't understanding the law — it's operationalizing it across EHR systems, claims databases, medical imaging archives, research datasets, telehealth platforms, and the hundreds of SaaS tools that touch patient data every day. This checklist gives you a concrete, actionable framework for identifying, classifying, and protecting PII across your healthcare infrastructure in 2026.
Understanding Protected Health Information (PHI) vs. PII

Before building your compliance program, get the definitions right. HIPAA regulates Protected Health Information (PHI) — any individually identifiable health information held or transmitted by a covered entity or its business associates. PII is the broader category: any data that can identify a person, whether or not it's health-related.
The critical distinction: all PHI is PII, but not all PII is PHI. A patient's Social Security number stored in your billing system is both PII and PHI. That same SSN stored in your HR system for an employee who is not a patient is PII but not PHI (though it's still subject to state privacy laws and potentially CCPA).
HIPAA's 18 PHI identifiers include:
- Names
- Geographic data smaller than a state
- Dates (birth, admission, discharge, death) and ages over 89
- Phone numbers, fax numbers, email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers
- Device identifiers and serial numbers
- Web URLs and IP addresses
- Biometric identifiers (fingerprints, voiceprints)
- Full-face photographs
- Any other unique identifying number or code
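Several of these identifiers have predictable formats, which makes them candidates for simple pattern matching. The following is a minimal Python sketch, assuming US-style formats. The pattern set and the `find_identifiers` helper are illustrative names, not part of any particular tool, and names or free-text geography need other techniques.

```python
import re

# Illustrative patterns for a few format-based identifiers (assumed US formats).
# This is not a complete detector for all 18 identifier types.
IDENTIFIER_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_identifiers(text: str) -> list[tuple[str, str]]:
    """Return (identifier_type, matched_text) pairs found in the text."""
    hits = []
    for kind, pattern in IDENTIFIER_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((kind, match.group()))
    return hits


sample = "Pt callback 555-867-5309, SSN 123-45-6789, portal login from 10.0.4.17"
hits = find_identifiers(sample)
```

Pattern matching of this kind produces false positives and misses context-dependent identifiers, so treat it as a first pass rather than proof that a dataset is clean.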
Conduct a Comprehensive PHI Data Inventory

You cannot protect what you cannot find. The first step in any HIPAA compliance program is a complete inventory of where PHI lives across your organization.
Map every system that touches patient data
Start with the obvious — your EHR (Epic, Cerner, Meditech), practice management system, billing platform, and claims clearinghouse. Then dig into the systems people forget:
- Telehealth platforms: Zoom for Healthcare, Doxy.me, Amwell — session recordings, chat logs, intake forms
- Medical imaging: PACS systems, DICOM servers, radiology workstations
- Research databases: Clinical trial data, de-identified datasets (verify they're actually de-identified)
- Communication tools: Secure messaging apps, pager systems, provider-to-provider email
- Analytics platforms: Population health tools, quality reporting dashboards
- Legacy systems: That Access database from 2014 that "nobody uses anymore" but still contains 80,000 patient records
Scan for shadow PHI
The highest-risk PHI isn't in your EHR — it's in the places nobody is monitoring. Run automated PII scans across:
```bash
# Example: Scan shared drives and cloud storage for PHI patterns
privasift scan /mnt/shared-drive /data/exports s3://hospital-analytics-bucket \
  --profile hipaa \
  --format json \
  --output phi-discovery-report.json
```
Common places shadow PHI hides:
- Developer staging environments with production data copies
- Excel exports on shared drives, created for "quick analysis" and never deleted
- Email attachments — patient lists, referral letters, lab results
- Application log files containing patient names, MRNs, or diagnoses in error messages
- Backup tapes and snapshots with no encryption and no retention policy
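Application logs are one of the few items on this list you can address at the source. One option is to redact identifiers before a log line is written. The sketch below uses Python's standard `logging.Filter`. The `MRN-` plus digits format is an assumption for illustration, and `PhiRedactionFilter` and `redact` are hypothetical names.

```python
import logging
import re

MRN_PATTERN = re.compile(r"\bMRN-\d+\b")            # assumed MRN format
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Mask MRNs and SSNs in a log message."""
    text = MRN_PATTERN.sub("[MRN-REDACTED]", text)
    return SSN_PATTERN.sub("[SSN-REDACTED]", text)


class PhiRedactionFilter(logging.Filter):
    """Redact the fully formatted message before any handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already formatted
        return True


# Attach to each handler so records from child loggers are covered too.
handler = logging.StreamHandler()
handler.addFilter(PhiRedactionFilter())
```

Redaction at the handler is a safety net. It does not replace fixing the code paths that put patient identifiers into error messages in the first place.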
Implement Technical Safeguards (the HIPAA Security Rule Checklist)

The HIPAA Security Rule (45 CFR Part 160 and Subparts A and C of Part 164) requires administrative, physical, and technical safeguards. Here's the technical checklist for 2026:
Access controls (§ 164.312(a)(1))
- [ ] Unique user identification for every person accessing PHI
- [ ] Emergency access procedures documented and tested
- [ ] Automatic logoff after inactivity (HIPAA sets no fixed interval; 15 minutes or less is a common policy for clinical workstations)
- [ ] Encryption and decryption of PHI at rest
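As an illustration of the automatic logoff item, here is a minimal idle-timeout check. The 15-minute limit is an assumed policy value rather than a figure taken from the regulation, and `session_expired` is a hypothetical helper that your session middleware would call on each request.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

IDLE_LIMIT = timedelta(minutes=15)  # assumed policy value


def session_expired(last_activity: datetime, now: Optional[datetime] = None) -> bool:
    """Return True when the session has been idle for IDLE_LIMIT or longer."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_activity >= IDLE_LIMIT
```

Passing `now` explicitly keeps the function easy to test; in production code the default of the current UTC time would normally be used.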
Audit controls (§ 164.312(b))
- [ ] Hardware, software, and procedural mechanisms to record and examine access to PHI
- [ ] Audit logs retained for a minimum of 6 years (matching HIPAA's six-year documentation retention requirement in § 164.316(b)(2)(i))
- [ ] Regular audit log reviews (monthly minimum, weekly recommended)
Integrity controls (§ 164.312(c)(1))
- [ ] Mechanisms to authenticate ePHI and confirm it hasn't been altered or destroyed
- [ ] Digital signatures or checksums for PHI in transit
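One way to approach the integrity items is a keyed checksum (HMAC) stored or sent alongside each record. The sketch below uses only the Python standard library. The record fields and the `sign_record` and `verify_record` names are hypothetical, and key management (where the key lives, how it rotates) is out of scope here.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record: dict, key: bytes, expected_tag: str) -> bool:
    """Return True only if the record still matches the tag computed earlier."""
    return hmac.compare_digest(sign_record(record, key), expected_tag)
```

Sorting keys before serializing means two dictionaries with the same content always produce the same tag, regardless of insertion order.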
Transmission security (§ 164.312(e)(1))
- [ ] TLS 1.2+ for all PHI in transit (TLS 1.3 preferred)
- [ ] End-to-end encryption for telehealth sessions
- [ ] Encrypted email or secure messaging for any PHI communication
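For services you build yourself, the TLS floor can be enforced in code rather than left to defaults. A minimal sketch using Python's standard `ssl` module follows; `make_phi_client_context` is a hypothetical helper name.

```python
import ssl


def make_phi_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=make_phi_client_context())`. To require TLS 1.3 only, set `minimum_version` to `ssl.TLSVersion.TLSv1_3` after confirming every peer supports it.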
Implementation example: audit logging for PHI access
```python
import json
import logging
from datetime import datetime, timezone

# Dedicated logger that writes one JSON object per line (JSONL).
phi_logger = logging.getLogger("phi_access")
phi_logger.setLevel(logging.INFO)
handler = logging.FileHandler("/var/log/hipaa/phi_access.jsonl")
phi_logger.addHandler(handler)


def get_request_ip() -> str:
    """Placeholder: return the caller's IP from your web framework's request context."""
    return "0.0.0.0"


def get_session_id() -> str:
    """Placeholder: return the current session ID from your auth layer."""
    return "unknown-session"


def log_phi_access(user_id: str, patient_id: str, resource: str,
                   action: str, reason: str) -> None:
    """Log every PHI access event for HIPAA audit trail."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "patient_id": patient_id,
        "resource": resource,
        "action": action,   # "read", "write", "delete", "export"
        "reason": reason,   # "treatment", "payment", "operations", "research"
        "source_ip": get_request_ip(),
        "session_id": get_session_id(),
    }
    phi_logger.info(json.dumps(entry))


# Usage in your application layer
log_phi_access(
    user_id="dr.chen",
    patient_id="MRN-00482931",
    resource="lab_results",
    action="read",
    reason="treatment",
)
```
Every PHI access (read, write, export, delete) must be logged with who, what, when, why, and from where. These logs are your primary evidence during an OCR investigation.
Business Associate Agreements and Third-Party Risk

Under HIPAA, any vendor that creates, receives, maintains, or transmits PHI on your behalf is a Business Associate (BA). You need a signed Business Associate Agreement (BAA) with every one of them — no exceptions.
Audit your vendor ecosystem
Build a registry of every third party that touches PHI:
| Vendor | PHI Access | BAA Signed | Last Review | Risk Level |
|---|---|---|---|---|
| AWS (hosting) | ePHI storage | Yes, 2025-03 | 2025-09 | High |
| Stripe (payments) | Billing codes, patient IDs | Yes, 2024-11 | 2025-06 | Medium |
| Zoom Healthcare | Session recordings | Yes, 2025-01 | 2025-07 | High |
| SendGrid (email) | Patient names, appt details | Missing | Never | Critical |
A missing BAA is itself a violation. In 2024, HHS fined a dental practice $62,500 for using a cloud-based scheduling tool without a BAA, even though no breach had occurred.
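A registry like this is easy to check automatically. The sketch below flags vendors with no BAA or with a stale review. The `Vendor` dataclass, the `registry_issues` helper, and the 365-day review interval are assumptions for illustration, and the dates use the first of the month because the table records only year and month.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Vendor:
    name: str
    baa_signed: Optional[date]   # None when no BAA is on file
    last_review: Optional[date]  # None when the vendor was never reviewed


def registry_issues(vendors: list, today: date, max_review_age_days: int = 365) -> list:
    """Return (vendor_name, issue) pairs for missing BAAs and overdue reviews."""
    issues = []
    for vendor in vendors:
        if vendor.baa_signed is None:
            issues.append((vendor.name, "missing BAA"))
        if vendor.last_review is None:
            issues.append((vendor.name, "never reviewed"))
        elif (today - vendor.last_review).days > max_review_age_days:
            issues.append((vendor.name, "review overdue"))
    return issues


REGISTRY = [
    Vendor("AWS (hosting)", date(2025, 3, 1), date(2025, 9, 1)),
    Vendor("Stripe (payments)", date(2024, 11, 1), date(2025, 6, 1)),
    Vendor("Zoom Healthcare", date(2025, 1, 1), date(2025, 7, 1)),
    Vendor("SendGrid (email)", None, None),
]
issues = registry_issues(REGISTRY, today=date(2026, 1, 15))
```

Running a check like this on a schedule turns the registry from a static spreadsheet into something that raises a flag when a vendor slips out of compliance.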
What a BAA must include
- How the BA will safeguard PHI
- Permitted uses and disclosures
- Breach notification obligations (HIPAA requires the BA to notify you without unreasonable delay and no later than 60 days after discovery; negotiate for 24-48 hours)
- Return or destruction of PHI at contract termination
- Right for the covered entity to audit the BA's compliance
Monitor business associate compliance
A signed BAA is necessary but not sufficient. Implement ongoing monitoring:
- Request SOC 2 Type II reports annually from high-risk BAs
- Include HIPAA-specific audit rights in your contracts
- Run PII scans on any data received from or shared with BAs to verify scope alignment
- Track sub-processors — your BA's vendors who may also access PHI
Breach Notification and Incident Response
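HIPAA's Breach Notification Rule requires notification to affected individuals without unreasonable delay and no later than 60 days after discovering a breach. Breaches affecting 500 or more individuals must also be reported to HHS within the same window, and a breach affecting more than 500 residents of a single state or jurisdiction must be reported to prominent media outlets there. HHS publishes all large breaches on its public breach portal, informally known as the "Wall of Shame", which creates a significant reputational risk.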
Build a HIPAA-specific incident response plan
Your IR plan should include these HIPAA-specific elements:
1. Risk assessment methodology: HIPAA requires a four-factor risk assessment to determine if a breach triggers notification:
   - Nature and extent of the PHI involved
   - The unauthorized person who used the PHI or to whom it was disclosed
   - Whether the PHI was actually acquired or viewed
   - Extent to which the risk has been mitigated
2. Notification timelines:
   - Individual notice: within 60 days of discovery
   - HHS notice: within 60 days of discovery for breaches of 500+; breaches under 500 are logged and reported to HHS within 60 days after the end of the calendar year
   - State attorney general: check state-specific requirements (many states have shorter windows; California requires notification "in the most expedient time possible")
3. Documentation requirements: Document every step of your investigation and risk assessment. This documentation is what OCR reviews during an investigation.
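The federal timelines above reduce to simple date arithmetic. This sketch computes the outer deadlines only. It assumes that breaches under 500 individuals are reported to HHS within 60 days after the end of the calendar year of discovery, it ignores state-specific windows, and `notification_deadlines` is a hypothetical helper name.

```python
from datetime import date, timedelta

NOTICE_WINDOW = timedelta(days=60)


def notification_deadlines(discovered: date, affected: int) -> dict:
    """Return the latest permissible federal notification dates for a breach."""
    individual_deadline = discovered + NOTICE_WINDOW
    if affected >= 500:
        # Large breaches: HHS is notified on the same clock as individuals.
        hhs_deadline = individual_deadline
    else:
        # Smaller breaches: reported after the calendar year of discovery ends.
        hhs_deadline = date(discovered.year, 12, 31) + NOTICE_WINDOW
    return {"individuals": individual_deadline, "hhs": hhs_deadline}


large = notification_deadlines(date(2026, 3, 10), affected=12000)
small = notification_deadlines(date(2026, 3, 10), affected=40)
```

These are outer limits. The rule also requires notification without unreasonable delay, so an incident response plan should not treat day 60 as the target.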
Automate breach detection
Don't rely on humans to notice breaches. Implement automated monitoring:
```yaml
# Example: Alert rules for PHI-related anomalies
alerts:
  - name: bulk_phi_export
    condition: "phi_records_exported > 1000 AND time_window = '1h'"
    severity: critical
    action: page_security_team

  - name: phi_access_outside_hours
    condition: "phi_access AND time NOT BETWEEN '06:00' AND '22:00' AND user_role != 'on_call'"
    severity: high
    action: notify_compliance

  - name: unauthorized_phi_location
    condition: "pii_scan_detected_phi AND storage_location NOT IN approved_phi_systems"
    severity: critical
    action: quarantine_and_alert
```
The third rule is critical: continuous PII scanning catches PHI that has leaked to unauthorized systems before it becomes a reportable breach.
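If your monitoring stack does not support rules like these directly, the bulk-export condition can be approximated in application code. The sketch below keeps a sliding one-hour window of export events. The `BulkExportMonitor` class is hypothetical, it tracks a single global total rather than per-user totals, and it assumes events arrive in time order.

```python
from collections import deque
from datetime import datetime, timedelta


class BulkExportMonitor:
    """Track exported record counts over a sliding time window."""

    def __init__(self, threshold: int = 1000, window: timedelta = timedelta(hours=1)):
        self.threshold = threshold
        self.window = window
        self._events = deque()  # (timestamp, record_count) pairs, oldest first
        self._total = 0

    def record_export(self, when: datetime, record_count: int) -> bool:
        """Register an export and return True if the window total exceeds the threshold."""
        self._events.append((when, record_count))
        self._total += record_count
        # Drop events that have aged out of the window.
        while self._events and when - self._events[0][0] > self.window:
            _, old_count = self._events.popleft()
            self._total -= old_count
        return self._total > self.threshold
```

A `True` return value is the point at which the rule's action would fire, for example paging the security team.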
De-identification and the Safe Harbor Method
HIPAA provides two methods for de-identifying PHI so it's no longer subject to HIPAA restrictions: Expert Determination (§ 164.514(b)(1)) and Safe Harbor (§ 164.514(b)(2)). Safe Harbor is more commonly used and requires removing all 18 identifiers listed earlier, with no actual knowledge that the remaining information could identify an individual.
Common de-identification failures
De-identification sounds simple but fails in practice:
- Dates not generalized: Safe Harbor requires removing all dates more specific than year. "Admitted March 15, 2025" violates Safe Harbor; "Admitted 2025" does not.
- Zip codes not truncated: Geographic data must be truncated to the first three digits of the zip code, and if that three-digit area contains 20,000 or fewer people, it must be set to "000."
- Free-text fields not scrubbed: A clinical note reading "John Smith, 47-year-old male from Springfield" contains two identifiers: a name and a geographic subdivision. The age would count as a third only if it were over 89.
- Re-identification through combination: Individual fields may be de-identified, but the combination of age + gender + diagnosis + admission year can re-identify patients in small populations.
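The date, zip code, and age rules above can be expressed as small transformation functions. This is a sketch under stated assumptions: the `RESTRICTED_ZIP3` set is a placeholder that must be built from current Census population figures, and the function names are hypothetical.

```python
from datetime import date

# Placeholder: three-digit zip prefixes covering 20,000 or fewer people.
# Build the real set from current Census data; these values are illustrative.
RESTRICTED_ZIP3 = {"036", "059", "102"}


def generalize_date(value: date) -> str:
    """Keep only the year of a date."""
    return str(value.year)


def generalize_zip(zip_code: str) -> str:
    """Truncate to three digits, or '000' for low-population prefixes."""
    prefix = zip_code[:3]
    return "000" if prefix in RESTRICTED_ZIP3 else prefix


def generalize_age(age: int) -> str:
    """Collapse ages over 89 into a single '90+' category."""
    return "90+" if age >= 90 else str(age)
```

For example, `generalize_date(date(2025, 3, 15))` yields `"2025"` and `generalize_zip("62704")` yields `"627"`. These functions cover structured fields only; free-text notes still need separate scrubbing.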
Validate de-identification with automated scanning
```python
# After de-identification, scan the output to verify no PHI remains
import json
import subprocess

result = subprocess.run(
    ["privasift", "scan", "./deidentified_dataset/",
     "--profile", "hipaa", "--format", "json", "--fail-on-detection"],
    capture_output=True,
    text=True,
)

if result.returncode != 0:
    findings = json.loads(result.stdout)
    print(f"De-identification FAILED: {len(findings)} PHI instances found")
    for finding in findings[:10]:
        print(f"  {finding['file']}:{finding['line']} - {finding['type']}: {finding['match']}")
else:
    print("De-identification verified: no PHI detected")
```
Run this validation every time you generate a de-identified dataset. A 2024 study published in JAMIA found that 12% of datasets labeled as "de-identified" in healthcare research repositories still contained recoverable PHI.
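Pattern scanning does not catch the combination risk described under common failures, where fields that are harmless alone single out a patient together. A complementary check is to count how many records share each combination of quasi-identifiers and flag small groups. In this sketch the `small_groups` helper, the field names, and the threshold of 5 are all assumptions for illustration.

```python
from collections import Counter


def small_groups(rows: list, quasi_identifiers: list, k: int = 5) -> dict:
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(row[field] for field in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


rows = [{"age": "47", "sex": "M", "dx": "E11", "year": 2025} for _ in range(5)]
rows.append({"age": "90+", "sex": "F", "dx": "C50", "year": 2025})
risky = small_groups(rows, ["age", "sex", "dx", "year"])
```

Here `risky` contains only the single `90+` record, since the other combination is shared by five rows. Flagged groups can then be generalized further or suppressed before release.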
Ongoing Compliance: Monitoring, Training, and Risk Assessment
HIPAA compliance is not a project — it's a continuous process. The HHS has increasingly emphasized that point-in-time compliance is insufficient.
Annual requirements
- Risk assessment: HIPAA requires a "regular" risk assessment. Industry standard is annual, but after any significant infrastructure change (cloud migration, new EHR deployment, M&A activity), a new assessment is warranted.
- Security awareness training: All workforce members, not just clinical staff. Include phishing simulations — healthcare is the #1 target for phishing attacks.
- Policy review: Update policies to reflect regulatory changes. In 2025, the HHS proposed modifications to the Security Rule that would make encryption and MFA mandatory (previously "addressable").
- BAA review: Audit all business associate agreements for completeness and currency.
Continuous monitoring
- Monthly PII scans across all data stores to catch PHI drift
- Weekly audit log reviews for anomalous access patterns
- Real-time alerting on bulk data exports, access from unusual locations, and privilege escalation
- Quarterly penetration testing of systems containing PHI
Prepare for the 2026 HIPAA Security Rule update
The proposed HIPAA Security Rule modifications (published in the Federal Register in January 2025) signal significant changes:
- Encryption of ePHI at rest and in transit would become required (no longer "addressable")
- Multi-factor authentication for all systems containing ePHI
- Network segmentation requirements for PHI-containing systems
- 72-hour restoration requirement after a security incident
- Technology asset inventory and network mapping requirements
Frequently Asked Questions
What is the difference between PHI and PII under HIPAA?
PII (Personally Identifiable Information) is any data that can identify an individual — names, emails, SSNs, IP addresses. PHI (Protected Health Information) is a subset of PII defined by HIPAA: individually identifiable health information created or received by a covered entity (health plans, healthcare providers, healthcare clearinghouses) or their business associates. The key distinction is the healthcare context. A patient's email address in your EHR is PHI. That same email in a generic marketing database is PII but not PHI. This matters because PHI triggers HIPAA's specific safeguard requirements, breach notification rules, and penalties — which are separate from (and often stricter than) general PII regulations like CCPA.
What are the penalties for HIPAA violations in 2026?
HIPAA penalties are tiered based on the level of culpability. Tier 1 (lack of knowledge): $137 to $68,928 per violation. Tier 2 (reasonable cause): $1,379 to $68,928 per violation. Tier 3 (willful neglect, corrected): $13,785 to $68,928 per violation. Tier 4 (willful neglect, not corrected): a minimum of $68,928 per violation, with an annual maximum of $2,067,813 for violations of an identical provision. These amounts are adjusted annually for inflation, so confirm the current figures in the latest HHS adjustment notice. Criminal penalties can reach $250,000 and up to 10 years imprisonment for offenses committed with intent to sell or use PHI for personal gain. Beyond direct fines, HHS can impose corrective action plans lasting 2-3 years with independent monitoring, an operational burden that often exceeds the fine itself. The reputational damage from appearing on the HHS Breach Portal is harder still to quantify.
How does HIPAA interact with state privacy laws like CCPA?
HIPAA preempts state laws that are "contrary" to HIPAA — but only for covered entities handling PHI. Here's where it gets complicated: if a healthcare organization processes data that falls outside HIPAA's scope (website analytics, marketing data, employee data not related to health plans), that data may be subject to CCPA, state breach notification laws, or other regulations. Additionally, CCPA explicitly exempts medical information governed by CMIA (California Confidentiality of Medical Information Act) and PHI under HIPAA — but only when collected by a covered entity or business associate for HIPAA-covered purposes. The same hospital's patient records are HIPAA-governed, but its website visitor tracking data falls under CCPA. Organizations need to classify data not just by type, but by the regulatory context in which it's processed.
Do we need to encrypt all PHI under HIPAA?
Under the current HIPAA Security Rule, encryption is an "addressable" implementation specification — meaning you must implement it if reasonable and appropriate, or document why an equivalent alternative measure was used. In practice, failing to encrypt ePHI and then experiencing a breach is extremely difficult to defend. Unencrypted PHI that is lost or stolen is presumed to be a breach under the Breach Notification Rule, while properly encrypted PHI (using NIST-approved algorithms) that is lost or stolen is not a reportable breach. The proposed 2025 Security Rule modifications would make encryption mandatory for all ePHI at rest and in transit, removing the "addressable" designation. Even before this rule is finalized, encryption should be treated as mandatory — the cost of implementation is trivial compared to the cost of a breach involving unencrypted PHI.
How often should we scan our systems for unauthorized PHI?
At minimum, monthly automated scans across all data stores, with continuous scanning for high-risk environments (development/staging systems, shared file storage, cloud buckets). The goal is to detect PHI that has drifted outside your controlled perimeter — patient data in log files, exports on shared drives, test databases loaded with production data. Integrate PII scanning into your CI/CD pipeline to catch PHI in code repositories, test fixtures, and configuration files before deployment. After any system migration, data integration project, or new vendor onboarding, run an immediate targeted scan. Organizations that scan quarterly or less frequently regularly discover PHI that has been exposed for months — turning a containable incident into a reportable breach with a longer exposure window and higher penalty risk.
Start Scanning for PII Today
PrivaSift automatically detects PII across your files, databases, and cloud storage — helping you stay GDPR and CCPA compliant without the manual work.
[Try PrivaSift Free →](https://privasift.com)