HIPAA’s Definition of Protected Health Information (PHI): Scope, Exclusions, and De‑Identification Rules


Kevin Henry

HIPAA

January 23, 2024

6 minutes read

Understanding what counts as PHI is essential to apply HIPAA’s Privacy Rule correctly. This guide explains the definition, scope, exclusions, and the de-identification standards you can use to remove privacy risks while preserving data utility.

Definition of PHI

Under the Privacy Rule, protected health information is a subset of individually identifiable health information created or received by a covered entity or business associate. It relates to an individual’s past, present, or future health status, care, or payment for care, and it either identifies the person or could reasonably be used to identify them.

Individually Identifiable Health Information

Information is “individually identifiable” if it includes direct identifiers (like a name) or indirect details that, in combination, make a person reasonably identifiable. The definition applies regardless of format—electronic (ePHI), paper, or oral.

Scope of PHI

PHI spans all media and workflows where health information is handled by regulated parties. If you create, receive, maintain, or transmit individually identifiable health information on behalf of a regulated party, you are handling PHI.

Who is regulated

  • Covered Entity: health plans, most health care providers that conduct standard electronic transactions, and health care clearinghouses.
  • Business Associate: any person or organization that creates, receives, maintains, or transmits PHI while performing a function or service for a covered entity (for example, a claims processor, analytics vendor, or cloud host).

What PHI can include

PHI can be clinical (diagnoses, lab results, prescriptions), administrative (billing details, claim numbers), or contextual (care dates, locations of service), so long as it is tied—directly or indirectly—to a specific individual.

Exclusions from PHI

  • De-identified information: data that meets HIPAA’s de-identification standards is not PHI.
  • Education records covered by FERPA, along with the student treatment records that FERPA defines separately, are excluded from HIPAA’s definition of PHI.
  • Employment records held by a covered entity in its role as employer are not PHI.
  • Information about a decedent more than 50 years after death is no longer PHI.
  • Health information held by entities that are neither covered entities nor business associates (and not acting for one) is not PHI under HIPAA, though other laws may apply.

De-Identification of PHI

De-identified data neither identifies an individual nor leaves a reasonable basis to believe it could be used to identify one. Once PHI is de-identified, it is no longer subject to the Privacy Rule. HIPAA recognizes two de-identification standards: the Safe Harbor method and the Expert Determination method.

Limited Data Set is not de-identified

A Limited Data Set (LDS) removes direct identifiers but permits certain fields like dates and general locations (city, state, ZIP). An LDS remains PHI, may be disclosed only for research, public health, or health care operations, and requires a Data Use Agreement. It is distinct from fully de-identified data.

Re-Identification Risk

Both de-identification methods, Safe Harbor and Expert Determination, aim to reduce re-identification risk to an acceptable level. Managing risk depends on what fields remain, how the data is shared, and what other data sources could be combined with it.

Safe Harbor Method for De-Identification

Safe Harbor requires removing the following 18 identifiers of the individual and of the individual’s relatives, employers, and household members. The covered entity must also have no actual knowledge that the remaining information could identify the individual:

  • Names
  • Geographic subdivisions smaller than a state (street address, city, county, precinct, ZIP code), except the initial three digits of a ZIP code if the area formed by combining all ZIP codes that share those three digits contains more than 20,000 people; otherwise, replace the three digits with 000
  • All elements of dates (except year) for dates directly related to the individual, including birth, admission, discharge, and death; all ages over 89, and all date elements (including year) that indicate such an age, must be removed or aggregated into a single “age 90 or older” category
  • Telephone numbers
  • Fax numbers
  • Email addresses
  • Social Security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate/license numbers
  • Vehicle identifiers and serial numbers, including license plates
  • Device identifiers and serial numbers
  • Web URLs
  • IP address numbers
  • Biometric identifiers, including finger and voice prints
  • Full-face photographs and comparable images
  • Any other unique identifying number, characteristic, or code (except a permitted re-identification code as described by HIPAA)

After removal, confirm you have no actual knowledge that the individual can still be identified from remaining data alone or in combination.
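
The ZIP, date, and age rules above can be sketched in code. This is a minimal illustration, not a complete Safe Harbor implementation: the record layout, the field names, and the contents of `SPARSE_ZIP3` and `DROP_FIELDS` are assumptions made for the example, and a real pipeline would need to cover all 18 identifier categories and use a ZIP prefix list derived from current Census data.

```python
from datetime import date

# Assumption: 3-digit ZIP prefixes covering 20,000 or fewer people.
# These values are placeholders; build the real set from current Census data.
SPARSE_ZIP3 = {"036", "059", "102"}

# Assumption: field names in this hypothetical record layout that hold
# direct identifiers and are dropped outright (not a complete list).
DROP_FIELDS = {"name", "phone", "email", "ssn", "mrn"}


def generalize_zip(zip_code: str) -> str:
    """Keep the first three ZIP digits, or "000" for sparse or malformed values."""
    prefix = zip_code.strip()[:3]
    if len(prefix) != 3 or not prefix.isdigit() or prefix in SPARSE_ZIP3:
        return "000"
    return prefix


def generalize_age(age: int) -> str:
    """Collapse every age over 89 into a single "90+" category."""
    return "90+" if age > 89 else str(age)


def to_safe_harbor(record: dict) -> dict:
    """Return a copy with listed identifiers dropped and the rest generalized."""
    out = {k: v for k, v in record.items() if k not in DROP_FIELDS}
    out["zip"] = generalize_zip(out["zip"])
    out["age"] = generalize_age(out["age"])
    # Only the year of a date tied to the individual is kept.
    out["admission_year"] = out.pop("admission_date").year
    return out


# Hypothetical input record used only to exercise the functions.
example = {
    "name": "Jane Doe",
    "mrn": "A-1001",
    "zip": "94110",
    "age": 93,
    "admission_date": date(2023, 5, 17),
    "diagnosis": "J45.909",
}
cleaned = to_safe_harbor(example)
```

The sketch builds a new dictionary rather than editing the input, so the original record stays intact. Free-text fields would need separate review, since identifiers can hide in notes.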

Expert Determination Method for De-Identification

This pathway uses a qualified expert who applies statistical, scientific, or technical principles to determine and document that the risk of re-identification is very small for the intended data release and context.

What the expert evaluates

  • Likelihood of matching with external data and the availability of such data
  • Uniqueness and granularity of remaining fields (e.g., rare diagnoses, small geographies)
  • Safeguards and controls around access, sharing, and retention
  • Transformation methods (generalization, suppression, aggregation, or noise)
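
One way to picture the uniqueness factor above is to count how many records share the same combination of remaining fields. The sketch below uses group size (the idea behind k-anonymity) as an illustration only: HIPAA does not prescribe a particular technique or threshold, and the field names and sample rows are hypothetical.

```python
from collections import Counter


def group_sizes(rows, quasi_identifiers):
    """Count rows per distinct combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group, or 0 for an empty dataset."""
    return min(group_sizes(rows, quasi_identifiers).values(), default=0)


def singled_out_rows(rows, quasi_identifiers):
    """Return rows whose quasi-identifier combination is unique in the data."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


# Hypothetical sample data: two records share a ZIP prefix and birth year,
# one record stands alone.
rows = [
    {"zip3": "941", "birth_year": 1980, "diagnosis": "asthma"},
    {"zip3": "941", "birth_year": 1980, "diagnosis": "diabetes"},
    {"zip3": "100", "birth_year": 1952, "diagnosis": "rare condition"},
]
fields = ["zip3", "birth_year"]
k = smallest_group_size(rows, fields)
outliers = singled_out_rows(rows, fields)
```

A smallest group of one means at least one person is unique on those fields, which is the kind of signal that would prompt further generalization or suppression before release.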

Documentation and maintenance

The expert provides written methods, assumptions, and results, including the data’s sharing context. Reassess if the data, environment, or external data landscape changes.

Use and Disclosure of De-Identified Data

Data that meet HIPAA de-identification standards are not PHI, so you may use and disclose them without an authorization, minimum necessary analysis, or Business Associate Agreement. Ethical, contractual, or state-law constraints can still apply, so adopt safeguards and audit practices.

Re-identification codes

  • You may assign a code to permit re-identification by the originating covered entity.
  • The code cannot be derived from, or related to, information about the individual and must not be translatable by recipients.
  • Do not disclose the re-identification mechanism and do not use the code for other purposes.
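
The rules above can be illustrated with a small lookup table kept by the originating covered entity. This is a sketch under stated assumptions: the class name `ReidentificationCodebook` and the internal ID format are hypothetical, and the codes are random rather than hashes of a record number, because a hash would be derived from information about the individual.

```python
import secrets


class ReidentificationCodebook:
    """Maps random codes to internal IDs; held only by the originating entity."""

    def __init__(self):
        self._id_to_code = {}
        self._code_to_id = {}

    def code_for(self, internal_id: str) -> str:
        """Return the existing code for an ID, or issue a new random one."""
        if internal_id not in self._id_to_code:
            code = secrets.token_hex(16)  # 128 random bits, unrelated to the ID
            while code in self._code_to_id:  # guard against an unlikely collision
                code = secrets.token_hex(16)
            self._id_to_code[internal_id] = code
            self._code_to_id[code] = internal_id
        return self._id_to_code[internal_id]

    def reidentify(self, code: str) -> str:
        """Look up the internal ID for a code; raises KeyError if unknown."""
        return self._code_to_id[code]


# Hypothetical usage: the code travels with the de-identified record,
# while the codebook itself never leaves the covered entity.
book = ReidentificationCodebook()
code = book.code_for("MRN-1")
```

Recipients see only the opaque code. Without access to the codebook they have no way to translate it back, which is the property the bullets above describe.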

Limited Data Set and BAAs

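Sharing a Limited Data Set requires a Data Use Agreement that defines permitted uses, authorized recipients, and safeguards. If a vendor receives only de-identified data and no PHI, a Business Associate Agreement is generally not required. If a vendor handles PHI on your behalf, a BAA is required; when a business associate receives only an LDS for a health care operations function, a compliant Data Use Agreement can satisfy that requirement.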

Conclusion

HIPAA’s definition of PHI hinges on whether information is individually identifiable and handled by a covered entity or business associate. When you cannot avoid sharing data, apply the de-identification standards—Safe Harbor or Expert Determination—to lower re-identification risk and enable responsible use.

FAQs

What information qualifies as protected health information under HIPAA?

PHI is individually identifiable health information related to health status, care, or payment that is created or received by a covered entity or business associate. It includes identifiers or data combinations that could reasonably identify the person, in any form—electronic, paper, or oral.

How does the Safe Harbor method differ from the Expert Determination method for de-identification?

Safe Harbor is rule-based: remove 18 specific identifiers and ensure no actual knowledge of identifiability. Expert Determination is risk-based: a qualified expert applies statistical or scientific techniques and documents that the re-identification risk is very small for the specific data release and context.

Are there any exclusions to what is considered PHI?

Yes. De-identified data, education records covered by FERPA (along with the student treatment records FERPA defines separately), employment records held by a covered entity as an employer, and information about someone who has been deceased for more than 50 years are not PHI. Health information held by non-covered entities not acting for a covered entity also falls outside HIPAA.

Can de-identified data be re-identified under HIPAA rules?

HIPAA permits a covered entity to assign a non-derivable code so it can re-identify its own de-identified data. Recipients must not receive the re-identification mechanism. If data are re-identified or linked back to a person, the resulting information is PHI and the Privacy Rule applies.
