What Is De-Identified Information? Definition, Examples, and HIPAA Requirements

Kevin Henry

HIPAA

September 17, 2025

Definition of De-Identified Information

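De-identified information is health information that does not identify an individual and for which there is no reasonable basis to believe it could be used to identify one. Under the HIPAA Privacy Rule, data reach this status only after being processed under one of two recognized methods, so that the people described are not identifiable to any practical degree.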

Once data is de-identified, it is no longer protected health information (PHI) under HIPAA. You may use or disclose it without patient authorization, although strong privacy and security practices remain essential for health information privacy.

Methods of De-Identification

Safe Harbor Method

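The Safe Harbor Method removes a specific list of 18 identifiers of the individual and of the individual's relatives, employers, and household members. It also requires that the covered entity have no actual knowledge that the remaining data could be used, alone or in combination with other information, to identify a person. It is straightforward to operationalize and is widely used for routine disclosures and publications.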

Expert Determination Method

The Expert Determination Method relies on a qualified expert who applies accepted statistical or scientific techniques to conclude that the risk of re-identification is very small. The expert documents methods, assumptions, and results, and recommends controls (for example, generalization, suppression, or data-use restrictions).

Common Techniques Applied by Experts

  • Generalizing values (for example, age bands, year-only dates, 3-digit ZIPs when permitted)
  • Suppressing rare or unique combinations that raise re-identification risk
  • Perturbing values via rounding or adding bounded noise
  • Ensuring any re-identification code is not derived from identifiers and cannot be reversed
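
The generalization and suppression techniques above can be sketched in a few lines. This is a minimal illustration, not an expert's method: the field names, the 10-year band width, and the group-size threshold `k` are all assumptions, and HIPAA does not prescribe a specific value for any of them.

```python
from collections import Counter

def age_band(age, width=10):
    """Generalize an exact age to a band such as '40-49'; 90 and over becomes '90+'."""
    if age >= 90:
        return "90+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def suppress_small_groups(records, quasi_identifiers, k=5):
    """Drop records whose quasi-identifier combination occurs fewer than k times."""
    def key(rec):
        return tuple(rec[q] for q in quasi_identifiers)
    group_sizes = Counter(key(rec) for rec in records)
    return [rec for rec in records if group_sizes[key(rec)] >= k]

# Hypothetical records; the third has a unique age band / ZIP3 combination.
records = [
    {"age_band": age_band(42), "zip3": "021", "dx": "asthma"},
    {"age_band": age_band(47), "zip3": "021", "dx": "asthma"},
    {"age_band": age_band(93), "zip3": "021", "dx": "asthma"},
]
kept = suppress_small_groups(records, ["age_band", "zip3"], k=2)
print(len(kept))  # 2
```

Counting how many records share each combination of quasi-identifiers is one common way to find rare combinations. An expert would typically also weigh which external data sources are reasonably available to a recipient.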

18 Identifiers to Remove

  1. Names.
  2. All geographic subdivisions smaller than a state, including street address, city, county, precinct, and ZIP code; you may keep the initial three digits of a ZIP code only if the geographic unit formed by all ZIP codes sharing those three digits contains more than 20,000 people, otherwise replace them with 000.
  3. All elements of dates (except year) for dates directly related to an individual, including birth, admission, discharge, and death; ages over 89, and any date elements (including year) that indicate such an age, must be aggregated into a single 90+ category.
  4. Telephone numbers.
  5. Fax numbers.
  6. Email addresses.
  7. Social Security numbers.
  8. Medical record numbers.
  9. Health plan beneficiary numbers.
  10. Account numbers.
  11. Certificate or license numbers.
  12. Vehicle identifiers and serial numbers, including license plates.
  13. Device identifiers and serial numbers.
  14. Web URLs.
  15. IP addresses.
  16. Biometric identifiers, including finger and voice prints.
  17. Full-face photographs and comparable images.
  18. Any other unique identifying number, characteristic, or code, except a permitted re-identification code under the HIPAA Privacy Rule.
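
As a rough sketch of how the ZIP code, date, and age rules in the list above might be applied to a single record, consider the following. The field names, the `DROP_FIELDS` set, and the contents of `LOW_POPULATION_ZIP3` are hypothetical placeholders; a real pipeline would build the low-population prefix set from current Census data and map its own schema to all 18 categories.

```python
from datetime import date

# Placeholder prefixes standing in for 3-digit ZIP areas with 20,000 or fewer people.
# Assumption: the real set must be derived from current Census Bureau data.
LOW_POPULATION_ZIP3 = {"036", "059"}

# Hypothetical field names for identifiers that are dropped outright.
DROP_FIELDS = {"name", "street", "city", "county", "phone", "fax", "email",
               "ssn", "mrn", "plan_id", "account_no", "license_no",
               "vehicle_id", "device_serial", "url", "ip", "photo"}

def safe_harbor_zip(zip_code, low_population=LOW_POPULATION_ZIP3):
    """Keep the first three digits unless the prefix covers too few people."""
    prefix = zip_code[:3]
    return "000" if prefix in low_population else prefix

def safe_harbor_record(rec, as_of):
    """Return a copy of rec with ZIP, dates, and age generalized."""
    out = {k: v for k, v in rec.items() if k not in DROP_FIELDS}
    out["zip3"] = safe_harbor_zip(out.pop("zip"))
    birth = out.pop("birth_date")
    # Age in whole years as of the reference date.
    age = as_of.year - birth.year - ((as_of.month, as_of.day) < (birth.month, birth.day))
    if age >= 90:
        out["age"] = "90+"        # top-code ages over 89
        out["birth_year"] = None  # the year alone would reveal an age over 89
    else:
        out["age"] = age
        out["birth_year"] = birth.year
    out["admit_year"] = out.pop("admit_date").year  # keep the year only
    return out

rec = {"name": "Jane Doe", "mrn": "A123", "zip": "03601",
       "birth_date": date(1930, 5, 1), "admit_date": date(2024, 3, 9),
       "dx": "asthma"}
clean = safe_harbor_record(rec, as_of=date(2024, 3, 9))
print(clean["zip3"], clean["age"], clean["admit_year"])  # 000 90+ 2024
```

Field-level transforms like these cover only the structured part of the checklist. Free-text notes and the "no actual knowledge" condition still need separate review.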

HIPAA Requirements for De-Identified Information

The HIPAA Privacy Rule recognizes two paths—Safe Harbor and Expert Determination—to create de-identified data. Once de-identified, the information is not PHI, so HIPAA’s use, disclosure, and minimum necessary standards no longer apply to that dataset. However, the de-identification process itself uses PHI, so only authorized workforce members or business associates may perform it.

HIPAA permits a covered entity to retain a re-identification code that links back to the original records if: (1) the code is not derived from or related to information about the individual, (2) it cannot otherwise be translated to identify the person, and (3) the covered entity does not use or disclose the code for any other purpose and does not disclose the mechanism for re-identification. Avoid hashes of direct identifiers for Safe Harbor, as those are derived from the identifiers.
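
One way to satisfy the "not derived from" condition is to draw each code from a random source and keep the mapping in a crosswalk table that never leaves the covered entity. The sketch below assumes an in-memory dictionary and a hypothetical `Crosswalk` class; a production system would need durable, access-controlled storage.

```python
import secrets

class Crosswalk:
    """Maps record identifiers to random codes; held only by the covered entity."""

    def __init__(self):
        self._code_by_id = {}

    def code_for(self, record_id):
        # The code comes from a cryptographic random generator,
        # so it carries no information about record_id.
        if record_id not in self._code_by_id:
            self._code_by_id[record_id] = secrets.token_hex(16)
        return self._code_by_id[record_id]

    def reidentify(self, code):
        # Reverse lookup is possible only with access to this table.
        for record_id, known in self._code_by_id.items():
            if known == code:
                return record_id
        return None

crosswalk = Crosswalk()
code = crosswalk.code_for("MRN-001")  # hypothetical identifier
print(crosswalk.code_for("MRN-001") == code)  # True
```

Unlike a hash of the medical record number, a random code cannot be recomputed by someone who knows or guesses the identifier.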

If you engage a vendor to de-identify data, a Business Associate Agreement is required because the vendor accesses PHI. For Expert Determination, maintain written documentation of expert credentials, methods, risk findings, and any recommended safeguards.

Examples of De-Identified Information

  • A statewide dashboard showing annual counts of emergency department visits by condition, with patient ages in 5-year bands (top-coded at 90+) and only state-level geography.
  • A research dataset where birthdates are replaced with year of birth, ages over 89 are grouped as 90+, ZIPs are reduced to 3 digits when allowed, and all 18 identifiers are removed.
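  • Utilization logs in which device serial numbers are replaced by non-derivable random codes, timestamps are reduced to year, and locations are generalized to state. Finer time units such as week-of-year go beyond what Safe Harbor allows and would need an Expert Determination.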
  • Quality reports that aggregate outcomes by year and facility type, avoiding small cells that could reveal individuals in sparse categories.

Re-Identification of De-Identified Data

Re-identification risk is the possibility that data could be linked back to a person using other reasonably available information. Safe Harbor reduces this risk substantially but does not eliminate it; Expert Determination explicitly targets a very small residual risk with technical and contractual controls.

Practical safeguards include cell-size thresholds, suppression of rare outliers, data-sharing agreements that prohibit re-identification, secure enclaves for analysis, and periodic risk reassessments. If data become re-identified—or you gain actual knowledge that individuals are identifiable—treat the information as PHI and apply HIPAA obligations.
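
A cell-size threshold, the first safeguard listed above, can be sketched as follows. The default of 11 and the field names are assumptions for illustration only; HIPAA does not set a specific threshold.

```python
from collections import Counter

def suppressed_counts(rows, group_by, min_cell_size=11):
    """Count rows per cell and replace counts below the threshold with None."""
    counts = Counter(tuple(row[g] for g in group_by) for row in rows)
    return {cell: (n if n >= min_cell_size else None)
            for cell, n in counts.items()}

# Hypothetical visit rows: one well-populated cell and one sparse cell.
rows = ([{"year": 2024, "condition": "asthma"}] * 12
        + [{"year": 2024, "condition": "rare"}] * 3)
table = suppressed_counts(rows, ["year", "condition"])
print(table[(2024, "asthma")])  # 12
print(table[(2024, "rare")])    # None
```

If row or column totals are published alongside the cells, a suppressed value can sometimes be recovered by subtraction, so additional cells may need to be withheld as well.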

Limited Data Set

A Limited Data Set (LDS) is not fully de-identified. It excludes direct identifiers but may include certain fields useful for analysis, such as dates and general geography, making it valuable for research, public health, and health care operations under the HIPAA Privacy Rule.

What May Remain in an LDS

  • City, state, and ZIP code
  • All elements of dates (for example, admission, discharge, service dates, DOB, DOD)
  • Other non-direct identifiers necessary for the project

What Must Be Removed from an LDS

  • Names and postal address lines other than city, state, ZIP
  • Telephone and fax numbers; email addresses
  • SSNs, MRNs, health plan numbers, account numbers
  • Certificate/license numbers; vehicle and device identifiers
  • Web URLs and IP addresses
  • Biometric identifiers and full-face photos

Unlike Safe Harbor, the LDS removal list does not include the catch-all for any other unique identifying number, characteristic, or code.
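
The contrast between what an LDS keeps and what it drops can be sketched as a simple field filter. The field names in `LDS_EXCLUDED` are hypothetical and would need to be mapped to a real schema.

```python
from datetime import date

# Hypothetical field names for the direct identifiers an LDS must exclude.
LDS_EXCLUDED = {"name", "street", "phone", "fax", "email", "ssn", "mrn",
                "plan_id", "account_no", "license_no", "vehicle_id",
                "device_serial", "url", "ip", "biometric", "photo"}

def to_limited_data_set(rec):
    """Drop direct identifiers; dates and city, state, ZIP pass through unchanged."""
    return {k: v for k, v in rec.items() if k not in LDS_EXCLUDED}

rec = {"name": "Jane Doe", "mrn": "A123", "city": "Concord", "state": "NH",
       "zip": "03301", "admit_date": date(2024, 3, 9), "dx": "asthma"}
lds = to_limited_data_set(rec)
print(sorted(lds))  # ['admit_date', 'city', 'dx', 'state', 'zip']
```

Because full dates and five-digit ZIP codes survive the filter, the output remains PHI and can be shared only under a Data Use Agreement.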

Data Use Agreement (DUA) Essentials

Sharing an LDS requires a Data Use Agreement that specifies permitted purposes, prohibits re-identification and contact attempts, limits disclosures, and mandates safeguards. An LDS is still PHI, so HIPAA applies to its handling and security, even though individual authorization is not required when a compliant DUA is in place.

How an LDS Differs from De-Identified Data

  • De-identified data: not PHI; no DUA required by HIPAA; lower re-identification risk than an LDS.
  • LDS: remains PHI; DUA required; retains dates and general geography to preserve utility.

Conclusion

De-identified information protects individuals while preserving analytic value. You can achieve it by removing the 18 identifiers under the Safe Harbor Method or by obtaining an Expert Determination, then manage the remaining re-identification risk with strong technical and contractual safeguards. Use a Limited Data Set with a DUA when you need dates or limited geography to advance research and operations responsibly.

FAQs

What is the Safe Harbor Method for de-identification?

The Safe Harbor Method requires removing 18 specific identifiers and ensuring you have no actual knowledge that the remaining data could identify someone. It offers a clear, checklist-driven path to compliance and is widely used for routine data sharing.

What identifiers must be removed for de-identified data?

You must remove names; geography below the state level (with limited 3-digit ZIP use); all elements of dates except year, with ages over 89 grouped as 90+; phone, fax, and email; SSN, MRN, plan and account numbers; certificate/license numbers; vehicle and device IDs; URLs and IPs; biometric identifiers; full-face photos; and any other unique identifying number, characteristic, or code other than a permitted re-identification code.

How does HIPAA regulate de-identified information?

HIPAA’s Privacy Rule recognizes Safe Harbor and Expert Determination as valid methods. After valid de-identification, the data are no longer PHI and HIPAA’s use and disclosure limits no longer apply, though you should still maintain strong health information privacy safeguards and document your approach.

Can de-identified data be re-identified?

Yes. Some re-identification risk always remains, especially when the data are combined with external datasets. Expert Determination aims to reduce that risk to a very small level, and contracts such as a data-sharing agreement or Data Use Agreement can prohibit re-identification and impose safeguards to keep the risk low.
