HIPAA De-Identification Safe Harbor Method Explained: The 18 Identifiers You Must Remove
Overview of HIPAA De-Identification
The HIPAA Privacy Rule allows you to use or disclose health data without authorization once it is de-identified under the De-Identification Standards. Two pathways exist: the HIPAA Safe Harbor method and the Expert Determination method. Safe Harbor is rule-based; Expert Determination is risk-based.
Once de-identification is complete, the data are no longer considered protected health information (PHI) for HIPAA purposes. That permits broader sharing and analytics while still protecting patient privacy. This article focuses on Safe Harbor and what it takes to apply it correctly.
You will learn which identifiers must be removed, how to treat geography and dates, and when Expert Determination may preserve more data utility. The goal is a practical, defensible de-identification process that gives up no more analytic value than necessary.
Safe Harbor Method Requirements
Safe Harbor has two requirements. First, you must remove 18 specific identifiers of the individual and of the individual's relatives, employers, or household members. Second, you must have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify the person.
This is a standards-based compliance exercise: you suppress defined fields, generalize certain details, and document decisions. Although Safe Harbor does not require a formal re-identification risk model, you should still review small cells, uncommon combinations, and free text to avoid implicit identifiers.
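The small-cell review mentioned above can be sketched as a simple frequency count. This is a minimal illustration, not a Safe Harbor requirement: the function name `small_cells`, the field names, and the threshold are all assumptions chosen for the example, and HIPAA does not prescribe a specific cell-size cutoff.

```python
from collections import Counter

def small_cells(records, fields, threshold=5):
    """Return value combinations of `fields` shared by fewer than `threshold` records.

    The threshold is an illustrative assumption, not a number taken from HIPAA.
    """
    counts = Counter(tuple(r.get(f) for f in fields) for r in records)
    return {combo: n for combo, n in counts.items() if n < threshold}

# Hypothetical, already-generalized records (state and year only).
records = [
    {"state": "VT", "birth_year": 1950, "sex": "F"},
    {"state": "VT", "birth_year": 1950, "sex": "F"},
    {"state": "VT", "birth_year": 1931, "sex": "M"},
]

flagged = small_cells(records, ["state", "birth_year", "sex"], threshold=2)
print(flagged)  # {('VT', 1931, 'M'): 1}
```

Combinations flagged this way are candidates for manual review, further generalization, or suppression before release.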
Implementation steps you can follow
- Inventory your data schema and map every field to the 18 categories.
- Delete, mask, or generalize fields per Safe Harbor; keep only what is permitted (for example, year—not month/day—of dates).
- If you assign a re-identification code, ensure it is not derived from or related to information about the individual, do not use or disclose it for any other purpose, and do not disclose the linkage mechanism.
- Scan notes for direct or indirect identifiers; redact or tokenize before release.
- Record your de-identification workflow and decisions to evidence compliance.
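The first two steps above can be sketched as a field-to-action map applied row by row. This is a minimal sketch under stated assumptions: the column names, the `FIELD_ACTIONS` table, and the helper `apply_field_actions` are hypothetical, and dates are assumed to be ISO `YYYY-MM-DD` strings. It does not handle the age-over-89 rule, which needs separate logic.

```python
# Hypothetical mapping from column names to Safe Harbor handling.
FIELD_ACTIONS = {
    "patient_name": "drop",       # names
    "mrn": "drop",                # medical record numbers
    "email": "drop",              # electronic mail addresses
    "admit_date": "year_only",    # dates: keep only the year
    "state": "keep",              # state-level geography is permitted
    "diagnosis_code": "keep",     # not one of the 18 identifiers
}

def apply_field_actions(row, actions=FIELD_ACTIONS):
    """Return a copy of `row` with each field kept, dropped, or reduced to a year."""
    out = {}
    for field, value in row.items():
        # Fields missing from the map are dropped, so new columns fail closed.
        action = actions.get(field, "drop")
        if action == "keep":
            out[field] = value
        elif action == "year_only" and value:
            out[field + "_year"] = int(str(value)[:4])  # assumes ISO YYYY-MM-DD
    return out

row = {
    "patient_name": "Jane Roe",
    "mrn": "A-1009",
    "admit_date": "2021-03-14",
    "state": "OH",
    "diagnosis_code": "E11.9",
    "notes": "free text not yet reviewed",
}
clean = apply_field_actions(row)
print(clean)
```

Dropping unmapped fields by default is a design choice for this sketch: a column added to the schema later is excluded until someone explicitly classifies it.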
The 18 Identifiers List
The following identifiers must be removed if they pertain to the individual or to relatives, employers, or household members:
- Names.
- All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and equivalent geocodes, except the initial three digits of a ZIP code under the population rule described below.
- All elements of dates (except year) directly related to an individual, including birth date, admission, discharge, and death; plus all ages over 89 and any date elements (including year) indicative of such age, unless grouped as 90 or older.
- Telephone numbers.
- Fax numbers.
- Electronic mail addresses.
- Social Security numbers.
- Medical record numbers.
- Health plan beneficiary numbers.
- Account numbers.
- Certificate/license numbers.
- Vehicle identifiers and serial numbers, including license plate numbers.
- Device identifiers and serial numbers.
- Web Universal Resource Locators (URLs).
- Internet Protocol (IP) address numbers.
- Biometric identifiers, including finger and voice prints.
- Full-face photographs and any comparable images.
- Any other unique identifying number, characteristic, or code, except a re-identification code created by the covered entity that is not derived from personal information and whose mechanism is not disclosed.
Removing Geographic Subdivisions
Under Safe Harbor, you must remove all geographic subdivisions smaller than a state. That includes street address, city, county, precinct, full ZIP code, and any equivalent geocodes that could pinpoint a person’s location.
The three-digit ZIP rule
You may retain only the initial three digits of a ZIP code if, according to current publicly available Census Bureau data, the geographic unit formed by combining all ZIP codes with those three digits contains more than 20,000 people. If it contains 20,000 or fewer, you must replace the three digits with 000.
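The three-digit rule can be sketched as a lookup against a population table. This is an illustrative sketch: `generalize_zip` and the `zip3_population` dictionary are hypothetical, and the population figures in the usage lines are made up for the example. In practice the table would be built from current Census data.

```python
def generalize_zip(zip_code, zip3_population):
    """Return the permitted three-digit ZIP prefix, or "000" when it is not permitted.

    `zip_code` should be a string so leading zeros are preserved.
    `zip3_population` maps three-digit prefixes to population counts.
    """
    digits = "".join(ch for ch in str(zip_code) if ch.isdigit())
    if len(digits) < 3:
        return "000"
    zip3 = digits[:3]
    # Prefixes missing from the table are treated as small (conservative choice).
    population = zip3_population.get(zip3, 0)
    return zip3 if population > 20000 else "000"

# Illustrative population counts only; not real Census figures.
populations = {"123": 25000, "999": 20000}
print(generalize_zip("12345-6789", populations))
print(generalize_zip("99999", populations))
```

The comparison is strictly greater than 20,000, so a prefix with exactly 20,000 residents is replaced with 000.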
Practical guidance
- Retain state and permitted three-digit ZIPs; remove street/city/county and precise geocodes (e.g., latitude/longitude, census block).
- Avoid rare location hints in notes—such as unique facility names, isolated towns, or exact travel routes—that can function as implicit identifiers.
- If detailed geography is essential for analysis, consider the Expert Determination method to generalize or perturb locations while keeping a very small re-identification risk.
Handling Dates and Ages
Safe Harbor allows you to keep only the year for dates directly related to an individual. The rule names birth, admission, discharge, and death dates, and the same treatment applies to other individual-level dates such as service and specimen collection dates. Remove month, day, and any finer granularity (e.g., timestamps, week numbers, event times, or exact intervals that reveal a specific date).
Individuals older than 89 must not be reported with their exact age. Instead, group them into a single category of “age 90 or older,” and remove any related date elements—including year—that would reveal an age over 89. For those under 90, you may retain age in years or the year component of dates.
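The age and date rules above can be sketched as two small helpers. This is a minimal sketch: the function names, the `"90+"` label, and the `as_of` reference date (the date at which ages are computed for the dataset) are assumptions for illustration.

```python
from datetime import date

def safe_harbor_age(age_years):
    """Top-code ages: 89 and under pass through, 90 and over become one category."""
    return "90+" if age_years >= 90 else age_years

def safe_harbor_birth_year(birth_date, as_of):
    """Return the birth year, or None when the year would reveal an age over 89."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    age = as_of.year - birth_date.year - (0 if had_birthday else 1)
    return None if age >= 90 else birth_date.year

print(safe_harbor_age(89))
print(safe_harbor_age(94))
print(safe_harbor_birth_year(date(1980, 6, 15), date(2024, 1, 1)))
print(safe_harbor_birth_year(date(1930, 6, 15), date(2024, 1, 1)))
```

Returning `None` for the birth year keeps the year from being combined with other fields to recover an exact age above 89.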
If you need seasonal or monthly temporal patterns, Safe Harbor will be restrictive. In that case, evaluate Expert Determination, which can justify alternative transformations (such as month-level dates or shifted dates) while meeting the “very small” risk threshold.
De-Identification of Contact Information
Direct communication channels—telephone, fax, email—must be removed entirely. Do not keep partial phone numbers, email usernames, or reversible encodings. Hashes derived from these fields typically count as unique identifiers and should not be shared under Safe Harbor.
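Contact details also leak into free-text fields, where dropping a column is not enough. The following is a rough sketch of a pattern-based scrub for emails and US-style phone numbers. The patterns and the `redact_contacts` helper are assumptions for illustration and will miss many formats, so output still needs review before release.

```python
import re

# Deliberately simple patterns; they are illustrative, not exhaustive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

def redact_contacts(text):
    """Replace email addresses and US-style phone numbers with fixed placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

note = "Call 555-123-4567 or mail jo@example.org."
print(redact_contacts(note))  # Call [PHONE] or mail [EMAIL].
```

Fixed placeholders are used rather than masked or hashed values, consistent with the point above that partial or derived values should not be kept.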
Digital contact traces also require suppression. Remove URLs, IP addresses, device identifiers, messaging handles, and any accounts that can route communication to a person. If you need linkage across records, use a nondisclosive, randomly generated code that is not mathematically related to the underlying contact data.
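A randomly generated linkage code of the kind described above can be sketched with Python's standard `secrets` module. The helper name `assign_link_codes` and the 16-byte code length are assumptions for the example. The returned mapping is the linkage mechanism, so it would stay inside the protected environment and never ship with the data.

```python
import secrets

def assign_link_codes(record_keys):
    """Map each internal key to a random code that is unrelated to the key's value."""
    codes = {}
    used = set()
    for key in record_keys:
        if key in codes:
            continue  # the same person keeps the same code across records
        code = secrets.token_hex(16)
        while code in used:  # collisions are extremely unlikely, but cheap to rule out
            code = secrets.token_hex(16)
        used.add(code)
        codes[key] = code
    return codes

# Hypothetical internal keys; only the codes would appear in the released dataset.
codes = assign_link_codes(["a@example.org", "b@example.org", "a@example.org"])
print(len(codes))
```

Because the codes come from a random source rather than a hash of the contact field, knowing an email address gives no way to compute its code.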
For operational use, maintain contact fields in a protected environment and release only the de-identified dataset. This approach removes the identifiers from what you share while preserving internal workflows such as follow-up or quality checks.
Expert Determination Method Comparison
Expert Determination relies on a person with appropriate knowledge of and experience with generally accepted statistical and scientific principles, who determines that the risk of re-identification is very small and documents the methods and results. Instead of a fixed list, the expert applies techniques such as generalization, suppression, noise addition, swapping, and safeguards inspired by differential privacy.
Choose Safe Harbor when you can meet analytic needs with year-only dates and geography limited to state and permitted three-digit ZIPs, and when removing the 18 identifiers does not undermine utility. Choose Expert Determination when you need richer detail, such as month/day of service, five-digit ZIPs, precise ages, or fine-grained locations, and are prepared to document the residual risk analysis and controls.
Key takeaways
- Safe Harbor is straightforward and prescriptive: remove the 18 identifiers and confirm you have no actual knowledge that the remaining data could identify an individual.
- Expert Determination is flexible and utility-preserving but requires specialized analysis, documentation, and ongoing risk management.
- Both methods aim to protect the privacy of health information while enabling lawful data use under the HIPAA de-identification standard.
FAQs
What is the HIPAA Safe Harbor method?
It is a de-identification pathway under the HIPAA Privacy Rule that requires removing 18 specified identifiers of the individual (and of relatives, employers, or household members) and having no actual knowledge that the remaining data could identify a person. Data meeting these conditions are no longer PHI for HIPAA purposes.
Which 18 identifiers must be removed for de-identification?
The 18 categories are: names; geographic subdivisions smaller than a state (with a limited three-digit ZIP exception); all elements of dates except year plus ages over 89; phone numbers; fax numbers; email addresses; Social Security numbers; medical record numbers; health plan beneficiary numbers; account numbers; certificate/license numbers; vehicle identifiers and serial numbers (including plates); device identifiers and serial numbers; URLs; IP addresses; biometric identifiers (e.g., finger and voice prints); full-face photos and comparable images; and any other unique identifying number, characteristic, or code (with a narrow, nonderived internal re-identification code exception).
How does the Expert Determination method differ from Safe Harbor?
Safe Harbor is rule-based: you remove the 18 identifiers and satisfy the “no actual knowledge” clause. Expert Determination is risk-based: a qualified expert applies statistical methods to show the re-identification risk is very small, enabling finer detail (e.g., month-level dates or granular geography) with documented safeguards.
Can ages over 89 be included in de-identified data?
Under Safe Harbor, you cannot report specific ages above 89 or dates that reveal those ages. You must group them into a single category of “90 or older.” Under Expert Determination, more detailed age reporting may be possible if an expert demonstrates the residual risk remains very small and appropriate protections are in place.