Identifying Non‑ePHI: Examples, Edge Cases, and Documentation Best Practices
Non-ePHI Identification
Identifying non‑ePHI starts with scope. ePHI is individually identifiable health information held or transmitted electronically by HIPAA covered entities or their business associates. Information falls outside ePHI when it never meets PHI criteria, or when it has been properly de‑identified.
What qualifies as non‑ePHI
- De‑identified datasets that meet the Safe Harbor method or the Expert Determination method.
- Education records subject to FERPA exclusions, including most student health records maintained by schools.
- Employment records held by a covered entity in its role as employer (e.g., HR files, FMLA paperwork in HR systems).
- Aggregate statistics that cannot identify an individual (with small‑cell suppression to reduce data re‑identification risk).
- Public information not created or received by a covered entity for care or payment, when it does not identify a person in a healthcare context.
- Information about individuals deceased for more than 50 years.
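The small‑cell suppression mentioned for aggregate statistics can be sketched as follows. This is a minimal illustration, not a compliance tool: the threshold `MIN_CELL`, the function name `suppressed_counts`, and the sample data are all assumptions chosen for the example, and HIPAA itself does not prescribe a specific minimum cell size.

```python
from collections import Counter

# Assumed policy threshold for illustration; set this per your own release rules.
MIN_CELL = 11

def suppressed_counts(values, min_cell=MIN_CELL):
    """Count occurrences per category and mask any cell smaller than min_cell."""
    counts = Counter(values)
    return {key: (n if n >= min_cell else None) for key, n in counts.items()}

# Made-up sample data: one common, one moderate, and one rare category.
diagnoses = ["asthma"] * 40 + ["diabetes"] * 15 + ["rare_condition"] * 3
table = suppressed_counts(diagnoses)
print(table)
```

If a grand total is published alongside the table, a single masked cell can be recovered by subtraction, so a real release would typically suppress or round additional cells as well.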
What does not qualify (common confusions)
- Limited Data Sets are still PHI and therefore not non‑ePHI.
- Pseudonymized data remains PHI if the token is derived from identifiers (for example, a hashed SSN) or if the recipient can access the key that reverses it.
- Consumer app data may be outside HIPAA, but if a covered entity ingests it for care, it can become PHI.
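To make the pseudonymization point concrete, the sketch below contrasts a token derived from an identifier with a randomly assigned code. The names `derived_token` and `CodeBook` are hypothetical, and the in‑memory dictionary stands in for whatever separately secured key store an organization would actually use.

```python
import hashlib
import secrets

def derived_token(ssn: str) -> str:
    """Hash of an identifier. Anyone who can guess the input can recompute
    the token, so it is derived from information about the individual."""
    return hashlib.sha256(ssn.encode()).hexdigest()

class CodeBook:
    """Random codes with no relationship to the underlying identifier.
    The mapping is the re-identification key and stays with the data holder."""

    def __init__(self):
        self._codes = {}

    def code_for(self, patient_id: str) -> str:
        # Assign a fresh random code the first time an ID is seen,
        # then reuse it so records can be linked over time.
        if patient_id not in self._codes:
            self._codes[patient_id] = secrets.token_hex(8)
        return self._codes[patient_id]

book = CodeBook()
code_a = book.code_for("patient-001")  # hypothetical internal ID
code_b = book.code_for("patient-002")
```

The random code supports longitudinal linkage only for whoever holds the code book, which is why the key must never travel with the released dataset.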
Quick self‑check
- Who holds the data? If it is a HIPAA covered entity or business associate, apply PHI tests first.
- Why was it created or received? Treatment, payment, or operations lean toward PHI.
- Can the data identify a person directly or by reasonable inference? If yes, treat it as PHI unless an exclusion applies or de‑identification is complete and documented.
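The three questions above can be expressed as a small triage function. This is a rough sketch for routing records to the right reviewer, not a legal determination; the class `DataContext`, the function `triage`, and the label strings are assumptions made up for this example.

```python
from dataclasses import dataclass

@dataclass
class DataContext:
    held_by_covered_entity_or_ba: bool      # Who holds the data?
    for_treatment_payment_operations: bool  # Why was it created or received?
    identifiable: bool                      # Still identifiable after any completed de-identification?

def triage(ctx: DataContext) -> str:
    """Return a provisional label that a privacy officer would then confirm."""
    if not ctx.held_by_covered_entity_or_ba:
        return "likely outside HIPAA"
    if not ctx.identifiable:
        return "candidate non-ePHI"
    if ctx.for_treatment_payment_operations:
        return "treat as PHI"
    # Identifiable data held for another purpose, e.g. HR or FERPA records.
    return "review for exclusion"

print(triage(DataContext(True, True, True)))
```

Every label other than "treat as PHI" is only a starting point for review, since context such as later ingestion into a clinical system can change the answer.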
De-identification Methods
HIPAA recognizes two paths to transform PHI into non‑ePHI: the Safe Harbor method and the Expert Determination method. Both aim to reduce data re‑identification risk to a very small level, but they do so differently.
Safe Harbor method
Safe Harbor requires removing 18 identifier categories for the individual and for the individual's relatives, employers, and household members (for example: names; geographic details smaller than a state except permitted three‑digit ZIPs; all elements of dates except year; phone, email, SSN, MRN, health plan IDs; account/vehicle/device numbers; URLs/IPs; biometric identifiers; full‑face images; and any other unique identifiers). You must also have no actual knowledge that the remaining data can identify a person.
Key nuances include replacing the three‑digit ZIP code with 000 when the area it covers contains 20,000 or fewer people, and grouping all ages over 89 into a single "90 or older" category. If you need longitudinal linkage, use a re‑identification code that is not derived from information about the individual, maintain the key separately, and never disclose it.
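A few of these Safe Harbor transformations can be sketched in code. The function names are hypothetical, and `RESTRICTED_ZIP3` holds placeholder values only: a real list of low‑population three‑digit ZIP prefixes has to be built from current Census data.

```python
from datetime import date

# Placeholder prefixes for illustration only; derive the real set from
# current Census population figures (areas with 20,000 or fewer people).
RESTRICTED_ZIP3 = {"036", "059", "102"}

def safe_harbor_zip(zip_code: str) -> str:
    """Keep the first three digits, or '000' for low-population prefixes."""
    zip3 = zip_code[:3]
    return "000" if zip3 in RESTRICTED_ZIP3 else zip3

def safe_harbor_date(d: date) -> int:
    """Drop month and day, keeping only the year."""
    return d.year

def safe_harbor_age(age: int) -> str:
    """Group every age of 90 or above into one category."""
    return "90+" if age >= 90 else str(age)

print(safe_harbor_zip("94110"), safe_harbor_date(date(2021, 3, 14)), safe_harbor_age(93))
```

This sketch does not cover every case. In particular, a date element that itself reveals an age over 89, such as a birth year, needs the same grouping, and the remaining identifier categories still have to be removed.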
Expert Determination method
The Expert Determination method engages a qualified expert to analyze the dataset and certify that the risk of re‑identification is very small. Techniques may include k‑anonymity, l‑diversity, t‑closeness, differential privacy noise, generalization, and suppression. The expert’s report should state the methods, assumptions, thresholds, and validation steps, supporting de‑identification compliance.
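As one example of the techniques an expert might apply, the sketch below measures k‑anonymity: the size of the smallest group of records sharing the same quasi‑identifier values. The function name, column names, and sample rows are assumptions for illustration, and a real Expert Determination would rest on a much broader analysis than this single metric.

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest equivalence class, i.e. the k for
    which the dataset is k-anonymous over the given columns (0 if empty)."""
    classes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(classes.values(), default=0)

# Made-up records; 'dx' is the sensitive attribute, the rest are quasi-identifiers.
rows = [
    {"zip3": "941", "birth_year": 1980, "sex": "F", "dx": "asthma"},
    {"zip3": "941", "birth_year": 1980, "sex": "F", "dx": "flu"},
    {"zip3": "941", "birth_year": 1975, "sex": "M", "dx": "asthma"},
]

k = k_anonymity(rows, ["zip3", "birth_year", "sex"])
print(k)  # the 1975/M record is unique on these columns
```

A low k signals that generalization (for example, wider birth‑year bands) or suppression is needed before release; which threshold counts as acceptable is a judgment the expert documents in the report.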
Choosing the right approach
- Safe Harbor method fits structured data where utility survives removal of all 18 identifier categories, including exact dates and fine‑grained geography.
- Expert Determination method fits complex, high‑dimensional data (free text, images, genomics, geolocation) where tailored risk analysis is needed.
- Whichever path you choose, implement release governance, recipient obligations against re‑identification, and monitoring.
Documentation Best Practices
Good records make your non‑ePHI determinations auditable and repeatable. Document the decision logic, approvals, and controls so teams can rely on consistent standards.
Core records to maintain
- Data inventory and classification showing systems, data elements, holders, purposes, and whether data is PHI, ePHI, or non‑ePHI.
- Non‑ePHI Determination Record indicating rationale (e.g., FERPA exclusions, Safe Harbor removal list, Expert Determination report) and decision date.
- De‑identification playbook detailing techniques, thresholds, validation metrics, and prohibition of data re‑identification.
- Release and sharing logs noting recipients, intended use, retention, and contractual terms on re‑identification and onward sharing.
- Risk assessments addressing residual linkage risks and mitigations (small‑cell suppression, aggregation, rounding, or noise).
- Access control policies specifying who can view, export, or combine non‑ePHI with other datasets.
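One way to keep the Non‑ePHI Determination Record consistent across teams is to give it a fixed, machine‑readable shape. The sketch below is an assumption about what such a record might contain; the class name, field names, and sample values are hypothetical and should be adapted to your own policies.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date

@dataclass
class NonEphiDetermination:
    dataset: str
    rationale: str          # e.g. "safe_harbor", "expert_determination", "ferpa_exclusion"
    decided_on: date
    approvers: list[str]
    evidence: list[str] = field(default_factory=list)  # report IDs, removal lists, etc.

    def to_json(self) -> str:
        """Serialize the record so it can be stored alongside release logs."""
        payload = asdict(self)
        payload["decided_on"] = self.decided_on.isoformat()
        return json.dumps(payload, indent=2)

# Hypothetical example record.
record = NonEphiDetermination(
    dataset="readmissions_extract_v2",
    rationale="safe_harbor",
    decided_on=date(2024, 1, 15),
    approvers=["data owner", "privacy officer", "security"],
    evidence=["safe-harbor-removal-checklist"],
)
print(record.to_json())
```

Storing the rationale and approvers as structured fields makes it easier to audit determinations later and to spot records whose supporting evidence is missing.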
Governance and lifecycle
- Approval workflow: data owner, privacy officer, and security sign‑off before release.
- Versioning: track transformations and keep Expert Determination method reports current when data changes.
- Retention and disposal: define timelines and secure destruction; avoid indefinite retention.
- Training: ensure analysts understand Safe Harbor identifiers and re‑identification pitfalls.
Edge Cases in Non-ePHI Identification
Some scenarios need extra scrutiny because context can flip data between PHI and non‑ePHI.
- School‑based care: Student clinic records are typically FERPA education records (non‑ePHI). Care provided at an affiliated hospital is usually HIPAA PHI.
- Consumer wearables: Data in a fitness app may be outside HIPAA. Once imported into a covered entity’s EHR for care, it becomes PHI.
- Genomic and imaging data: Even without names, uniqueness can be high; prefer Expert Determination method and strict release controls.
- Free‑text notes: Narrative fields often leak identifiers (names, places, events). Use NLP redaction plus expert review.
- Small populations: Rare conditions or granular geography can enable linkage; apply aggregation and minimum‑cell thresholds.
- Photographs and videos: Faces and comparable images are identifiers; even blurred media may reveal identity via tattoos or context.
- Research datasets: A Limited Data Set remains PHI. Non‑ePHI status requires full de‑identification under either the Safe Harbor method or the Expert Determination method.
Best Practices for Handling Non-ePHI
Non‑ePHI still deserves strong stewardship. Treat it as potentially sensitive to prevent accidental re‑identification and to uphold trust.
- Minimize data: Share only elements needed for the use case; prefer aggregates over row‑level data.
- Prevent linkage: Prohibit combining with external datasets; audit merges; consider synthetic data for development work.
- Apply security controls: Encrypt at rest and in transit, enforce role‑based access control policies, log access, and segment networks.
- Contract for safety: Use data use agreements that ban re‑identification, restrict onward sharing, and require breach notification. BAAs are not required for non‑ePHI.
- Validate regularly: Reassess data re‑identification risk when scope, external data availability, or techniques change.
- Prepare for incidents: Maintain response playbooks covering revocation of access, recipient notifications, and dataset recall.
In short, identify scope, choose the right de‑identification pathway, and document decisions. With governance, technical controls, and training, you can use non‑ePHI effectively while keeping privacy risks low.
FAQs
What types of records are excluded from ePHI?
Common exclusions include de‑identified datasets, education records covered by FERPA, employment records kept by an employer (even if the employer is a covered entity), aggregate non‑identifiable statistics, and records of individuals deceased for more than 50 years. Always confirm the holder, purpose, and identifiability before classifying.
How does the Safe Harbor method protect privacy?
Safe Harbor removes 18 identifier categories, such as names, geography smaller than a state, phone numbers, account/device IDs, web identifiers, full‑face images, and date elements other than year. It then requires that you have no actual knowledge that the remaining data could identify a person. Once both conditions are met, the dataset is treated as non‑ePHI for release.
When should expert determination be used for de-identification?
Use Expert Determination when Safe Harbor would destroy utility or when data is complex (free text, images, genomics, detailed geolocation). A qualified expert assesses the dataset and certifies that data re‑identification risk is very small using statistical methods and controls documented in a formal report.
What documentation is required for non-ePHI handling?
Maintain a Non‑ePHI Determination Record, de‑identification methodology and validation notes, risk assessments, release logs, and data use agreements. Keep access control policies, retention schedules, and periodic reviews to prove ongoing de‑identification compliance and responsible stewardship.