HIPAA’s Formal Definition of Protected Health Information (PHI): What Qualifies, What Doesn’t, and Why It Matters
Definition of Protected Health Information
Under the HIPAA Privacy Rule, protected health information (PHI) is a subset of individually identifiable health information. It concerns an individual’s past, present, or future physical or mental health or condition, the provision of health care, or payment for health care.
Information becomes PHI when it is created, received, maintained, or transmitted by Covered Entities or their Business Associates and either directly identifies a person or can reasonably be used to identify them. PHI can exist in any form or medium—electronic, paper, or oral.
Core elements that make data PHI
- Content: health, care, or payment information about an individual.
- Identifiability: data that identifies or could identify the individual.
- Context: held or processed by a Covered Entity or Business Associate.
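The three elements above act as a conjunction: data is PHI only when all three hold. A minimal sketch of that test follows; the `Record` type and its field names are hypothetical, and in practice each flag is a legal and factual judgment rather than a stored boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical summary of a data item; field names are illustrative."""
    has_health_content: bool  # health, care, or payment information
    is_identifiable: bool     # identifies, or could reasonably identify, a person
    held_by_ce_or_ba: bool    # held by a Covered Entity or Business Associate


def is_phi(record: Record) -> bool:
    """Return True only when content, identifiability, and context all apply."""
    return (
        record.has_health_content
        and record.is_identifiable
        and record.held_by_ce_or_ba
    )


# A clinic chart meets all three elements.
clinic_chart = Record(True, True, True)
# Data in a consumer app that is not acting for a Covered Entity fails the context test.
consumer_app_log = Record(True, True, False)

print(is_phi(clinic_chart))      # True
print(is_phi(consumer_app_log))  # False
```

The sketch leaves out the specific exclusions (for example, employment records), which would need their own checks.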
Forms and Mediums of PHI
PHI spans formats: electronic PHI (ePHI) in EHRs, portals, claims systems, backups, and logs; paper records like charts, referral letters, and printed reports; and oral communications such as consultations, handoffs, and voicemails.
PHI also appears in less obvious places: appointment reminders, metadata in documents, spreadsheets exported from analytics tools, screen captures, chat transcripts handled on behalf of a provider, and device data if it ties to a person’s identity.
Typical locations
- Operational systems: EHRs, billing, clearinghouses, customer service platforms.
- Workflows: email, secure messaging, fax servers, and transcription services.
- Storage: archives, disaster recovery sites, mobile devices, and removable media.
Exclusions from PHI Coverage
Some information falls outside HIPAA’s PHI scope. De-identified data that meets HIPAA’s de-identification standards is not PHI. Education records and certain student treatment records protected by the Family Educational Rights and Privacy Act are excluded from the HIPAA Privacy Rule.
Employment records held by a Covered Entity in its role as an employer are not PHI. Consumer health data collected by apps or devices that are not acting on behalf of a Covered Entity or Business Associate generally is not PHI, even if it is sensitive. Information about an individual who has been deceased for more than 50 years is also excluded.
Borderline cases to evaluate
- Research datasets: de-identified sets are not PHI; limited data sets remain PHI with restrictions.
- Vendor-held data: becomes PHI if the vendor is a Business Associate processing on behalf of a Covered Entity.
- Aggregates and dashboards: summaries are PHI if individuals are reasonably re-identifiable.
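For aggregates and dashboards, one common way to lower re-identification risk is to suppress small cells before publishing counts. The sketch below illustrates the idea. The threshold `MIN_CELL_SIZE` and the function name are assumptions for illustration; HIPAA does not define a specific cell-size number, and suppression alone does not establish that a summary is de-identified.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical threshold chosen for illustration; not a HIPAA-defined value.
MIN_CELL_SIZE = 11


def suppressed_counts(
    values: Iterable[str], min_cell_size: int = MIN_CELL_SIZE
) -> Dict[str, Optional[int]]:
    """Count occurrences per category and replace small counts with None."""
    counts = Counter(values)
    return {
        category: (n if n >= min_cell_size else None)
        for category, n in counts.items()
    }


# Example: 12 patients with diagnosis A, 3 with diagnosis B.
diagnoses = ["A"] * 12 + ["B"] * 3
print(suppressed_counts(diagnoses))  # {'A': 12, 'B': None}
```

A suppressed cell can sometimes be recovered from row or column totals, so totals shown alongside the table need the same review.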
The 18 Identifiers of PHI
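HIPAA’s Safe Harbor method specifies a PHI Identifiers List: 18 categories of identifiers, relating to the individual or to the individual’s relatives, employers, or household members, that must all be removed for data to count as de-identified. The 18 identifiers are: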
- Names.
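- All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and equivalent geocodes. The first three digits of a ZIP code may be kept only if the area formed by all ZIP codes sharing those three digits contains more than 20,000 people; otherwise they are replaced with 000.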
- All elements of dates (except year) for dates directly related to an individual, including birth, admission, discharge, and death; and all ages over 89 (and related date elements), except as aggregated to age 90+.
- Telephone numbers.
- Fax numbers.
- Email addresses.
- Social Security numbers.
- Medical record numbers.
- Health plan beneficiary numbers.
- Account numbers.
- Certificate/license numbers.
- Vehicle identifiers and serial numbers, including license plates.
- Device identifiers and serial numbers.
- Web URLs.
- IP addresses.
- Biometric identifiers, including finger and voice prints.
- Full-face photographs and comparable images.
- Any other unique identifying number, characteristic, or code (except permitted re-identification codes).
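Most of the identifiers above are simply dropped, but ZIP codes, dates, and ages are generalized rather than removed. The following sketch shows those three transformations under stated assumptions: the function names are hypothetical, and `restricted_prefixes` stands in for the set of three-digit ZIP prefixes covering 20,000 or fewer people, which has to come from current Census data and is not supplied here.

```python
from datetime import date
from typing import Optional, Set, Union


def generalize_zip(zip_code: str, restricted_prefixes: Set[str]) -> str:
    """Keep the first three ZIP digits, or return '000' for restricted prefixes."""
    prefix = zip_code.strip()[:3]
    if len(prefix) < 3 or not prefix.isdigit() or prefix in restricted_prefixes:
        return "000"
    return prefix


def cap_age(age: int) -> Union[int, str]:
    """Aggregate all ages over 89 into a single '90+' category."""
    return "90+" if age >= 90 else age


def safe_birth_year(birth_date: date, age: int) -> Optional[int]:
    """Reduce a birth date to its year; drop the year too if it would reveal an age over 89."""
    return None if age >= 90 else birth_date.year


# The prefix set below is a placeholder, not the real restricted list.
restricted = {"123"}
print(generalize_zip("12345", restricted))      # 000
print(generalize_zip("54321", restricted))      # 543
print(cap_age(93))                              # 90+
print(safe_birth_year(date(1950, 5, 1), 74))    # 1950
```

Applying these functions covers only part of Safe Harbor; the remaining identifiers still have to be removed, and the no-actual-knowledge condition still applies.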
De-Identification and Its Impact
HIPAA recognizes two De-Identification Standards. Under Safe Harbor, you remove all 18 identifiers and must have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify an individual. Under Expert Determination, a qualified expert applies generally accepted statistical or scientific methods, determines that the re-identification risk is very small, and documents the methods and results.
De-identified data falls outside the HIPAA Privacy Rule and Security Rule, enabling broader use and sharing. However, a limited data set—where some identifiers remain (for example, certain dates and city, state, or ZIP)—is still PHI and requires a data use agreement with specific permitted purposes and safeguards.
Operational implications
- Design pipelines that either remove identifiers (Safe Harbor) or maintain expert documentation (Expert Determination).
- Control re-identification keys; store them separately with strict access and audit trails.
- Continuously assess small-population and mosaic risks when combining datasets.
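The key-control point can be sketched with random re-identification codes held in a lookup that lives apart from the de-identified data. The class and method names below are hypothetical. Codes are generated randomly rather than derived from the identifier (for example, by hashing it), on the assumption that a permitted re-identification code must not be derived from information about the individual. The in-memory dictionaries stand in for a separate, access-controlled, audited store.

```python
import secrets
from typing import Dict


class ReidentificationVault:
    """Hypothetical holder of the code-to-identifier mapping.

    In practice this mapping would sit in a separate system with strict
    access controls and audit logging, not in process memory.
    """

    def __init__(self) -> None:
        self._code_by_id: Dict[str, str] = {}
        self._id_by_code: Dict[str, str] = {}

    def code_for(self, identifier: str) -> str:
        """Return a stable random code for an identifier, creating one if needed."""
        code = self._code_by_id.get(identifier)
        if code is None:
            code = secrets.token_hex(16)  # 32 hex characters, unrelated to the identifier
            self._code_by_id[identifier] = code
            self._id_by_code[code] = identifier
        return code

    def reidentify(self, code: str) -> str:
        """Look up the original identifier; raises KeyError for unknown codes."""
        return self._id_by_code[code]


vault = ReidentificationVault()
code = vault.code_for("MRN-001")          # "MRN-001" is a made-up identifier
assert vault.code_for("MRN-001") == code  # the same input maps to the same code
assert vault.reidentify(code) == "MRN-001"
```

Only the codes travel with the de-identified dataset; anyone without access to the vault cannot map them back.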
Importance of PHI Definition for Compliance
Clear scoping of PHI anchors your privacy and security program: data inventories, risk analyses, access controls, and minimum necessary policies all depend on knowing exactly what is PHI. This definition drives training, monitoring, and how you apply the HIPAA Security Rule to ePHI systems.
It also governs third-party risk. When vendors create, receive, maintain, or transmit PHI on behalf of a Covered Entity, they become Business Associates and need business associate agreements, appropriate safeguards, and incident reporting duties. Mapping flows of PHI across systems and vendors lets you prevent leaks and enforce least privilege.
Finally, the PHI scope triggers breach notification and supports individual rights, including access, amendments, and accounting of disclosures. Treat the PHI Identifiers List and de-identification methods as practical tools for tailoring controls while enabling compliant data use.
Conclusion
PHI is individually identifiable health information held by Covered Entities or Business Associates, regardless of medium. Knowing what qualifies—and what does not—lets you apply the HIPAA Privacy Rule precisely, use de-identification to reduce risk, and build workflows that protect people while enabling responsible data use.
FAQs
What information is considered protected health information under HIPAA?
PHI is individually identifiable health, care, or payment information that identifies a person (or could reasonably do so) when created, received, maintained, or transmitted by a Covered Entity or its Business Associate in any form—electronic, paper, or oral.
How does de-identified data differ from PHI?
De-identified data meets one of two standards: either all 18 Safe Harbor identifiers have been removed and the holder has no actual knowledge that the remaining information could identify an individual, or a qualified expert has determined, using statistical or scientific methods, that the re-identification risk is very small. Because it no longer identifies individuals, de-identified data is not PHI under HIPAA; limited data sets remain PHI and require data use agreements.
Which entities are covered by HIPAA regulations regarding PHI?
HIPAA covers health plans, health care clearinghouses, and health care providers that conduct standard transactions electronically (in practice, most providers), along with their Business Associates that create, receive, maintain, or transmit PHI on their behalf.