HIPAA Safe Harbor Checklist: 18 Identifiers to Remove for Compliance


Kevin Henry

HIPAA

May 01, 2024

8 minutes read

The HIPAA Safe Harbor method gives you a clear, rule-based path to Privacy Rule compliance by removing specific identifiers from Protected Health Information. When you strip these Unique Identifiers and have no actual knowledge that the remaining data could identify a person, you may use or disclose the dataset as de-identified information.

This checklist explains the Safe Harbor rules, details the 18 identifiers, clarifies geographic and age exceptions, contrasts de-identification methods, and outlines practical tools and safeguards to reduce re-identification risk.

HIPAA Safe Harbor Method

Safe Harbor is a prescriptive approach: remove a defined set of identifiers and ensure you do not actually know the data could still identify someone. It is straightforward to operationalize and audit, making it a common choice for privacy programs.

What Safe Harbor requires

  • Identify all data elements that qualify as Protected Health Information (PHI).
  • Remove the 18 identifiers listed below, whether they belong to the individual or to the individual’s relatives, employers, or household members.
  • Apply the geographic and age exceptions exactly as specified.
  • Retain documentation showing the steps you took for Privacy Rule compliance.
  • Confirm you have no actual knowledge that remaining data could identify an individual alone or in combination.

Scope: what counts as PHI

PHI is individually identifiable health information held or transmitted by a covered entity or business associate. In practice, assume any data that links health facts to a person—directly or through linkage with other data—requires Safe Harbor or another permitted de-identification approach.

18 Identifiers to Remove

  1. Names.
  2. All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and equivalent geocodes (see the exception below).
  3. All elements of dates (except year) directly related to an individual (for example, birth, admission, discharge, and death dates), plus all ages over 89 and all date elements (including year) indicative of such age; these ages and dates may instead be aggregated into a single category of “age 90 or older.”
  4. Telephone numbers.
  5. Fax numbers.
  6. Email addresses.
  7. Social Security numbers.
  8. Medical record numbers.
  9. Health plan beneficiary numbers.
  10. Account numbers.
  11. Certificate and license numbers.
  12. Vehicle identifiers and serial numbers, including license plate numbers.
  13. Device identifiers and serial numbers.
  14. Web URLs.
  15. IP address numbers.
  16. Biometric identifiers (for example, finger or voice prints).
  17. Full-face photographic images and any comparable images.
  18. Any other unique identifying number, characteristic, or code, except a re-identification code that meets the conditions described under Important notes.
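
As a rough illustration of the list above, the sketch below drops fields that map to the identifier categories from a single record. The field names in `DIRECT_IDENTIFIER_FIELDS` are hypothetical; you would need to map your own schema to the 18 categories, and a denylist like this does nothing for identifiers hidden in free text or for the geographic and date rules covered later.

```python
# Hypothetical field names; map your own schema to the 18 categories.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "county", "phone", "fax", "email",
    "ssn", "mrn", "health_plan_id", "account_number", "license_number",
    "vehicle_id", "license_plate", "device_serial", "url", "ip_address",
    "biometric_id", "photo",
}


def drop_direct_identifiers(record: dict) -> dict:
    """Return a copy of the record without fields flagged as identifiers."""
    return {
        field: value
        for field, value in record.items()
        if field.lower() not in DIRECT_IDENTIFIER_FIELDS
    }


if __name__ == "__main__":
    row = {"Name": "Jane Doe", "mrn": "12345", "diagnosis_code": "J45"}
    print(drop_direct_identifiers(row))
```

A stricter variant is an allowlist that keeps only fields you have reviewed and approved, so that new columns are excluded by default.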

Important notes

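  • Internal record keys (re-identification codes) can be retained only if they are not derived from or related to information about the individual, cannot otherwise be translated to identify the individual, and you neither use the code for any other purpose nor disclose the mechanism for re-identification.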
  • Free-text fields frequently leak identifiers. Use data masking, redaction, or suppression rather than relying on keyword filters alone.
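
To make the free-text point concrete, here is a minimal pattern-based redaction sketch. The patterns and the `redact` helper are illustrative assumptions, not a complete detector: they catch some US-style phone numbers, SSNs, emails, URLs, and IPv4 addresses, and they miss names and places entirely, which is why pattern matching should be only one layer.

```python
import re

# Illustrative patterns only; real notes need broader coverage and review.
PATTERNS = {
    "URL": re.compile(r"https?://\S+"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str) -> str:
    """Replace each pattern match with a bracketed label such as [PHONE]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Call 555-123-4567 or mail jo@example.org"))
    # Names pass through untouched, which shows the limit of this approach.
    print(redact("Seen by Dr. Alvarez"))
```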

Geographic Data Exception

Under Safe Harbor, you must remove geographic data smaller than a state. A narrow exception permits retaining the initial three digits of a ZIP code if, according to current publicly available Census Bureau data, the geographic unit formed by combining all ZIP codes that share those three digits contains more than 20,000 people. If that unit contains 20,000 or fewer people, replace the three digits with 000.

What you may keep

  • State or broader geography.
  • The three-digit ZIP prefix when the population of that three-digit area exceeds 20,000; otherwise record 000.
  • Generalized regions you create (for example, multi-state groupings) that do not enable identification.

What you must remove

  • Street address, city, county, precinct, and full ZIP codes.
  • Equivalent geocodes such as latitude/longitude, census block, or GPS traces.
  • Location details in notes that pinpoint residences, workplaces, or small venues.
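
The ZIP code rule above can be sketched as a small function. The `zip3_population` lookup is an assumption: in practice you would build it from current Census Bureau data, and the numbers in the usage example are made up for illustration.

```python
def generalize_zip(zip_code: str, zip3_population: dict, threshold: int = 20_000) -> str:
    """Return the 3-digit ZIP prefix if its area exceeds the threshold, else '000'."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        return "000"  # malformed or partial ZIP: fall back to the safe value
    prefix = digits[:3]
    # Unknown prefixes are treated as population 0, so they also become '000'.
    population = zip3_population.get(prefix, 0)
    return prefix if population > threshold else "000"


if __name__ == "__main__":
    # Made-up populations, for illustration only.
    lookup = {"021": 1_000_000, "036": 15_000}
    print(generalize_zip("02139-4307", lookup))
    print(generalize_zip("03601", lookup))
```

Treating unknown prefixes and malformed values as 000 is a deliberate fail-safe choice: the function never emits a prefix it could not verify.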

Age Data Exception

For dates tied to an individual, you may keep only the year. Month, day, and any finer precision (for example, hour) must be removed. This applies to birth, admission, discharge, and death dates, and any similar event dates.

For age, you may report ages 89 and younger as recorded. For anyone 90 or older, do not disclose exact age or any date elements (including year) that would reveal the person is 90 or older. Instead, aggregate to a single category of “age 90 or older.”
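
A minimal sketch of the date and age rules follows. The function names are hypothetical, and the sketch assumes you have a full birth date and a reference date (`as_of`) for computing age.

```python
from datetime import date
from typing import Optional


def age_on(birth: date, as_of: date) -> int:
    """Completed years of age on the reference date."""
    years = as_of.year - birth.year
    if (as_of.month, as_of.day) < (birth.month, birth.day):
        years -= 1
    return years


def safe_harbor_age(birth: date, as_of: date) -> str:
    """Exact age for 89 and younger; a single category for 90 and older."""
    age = age_on(birth, as_of)
    return "90 or older" if age >= 90 else str(age)


def safe_harbor_birth_year(birth: date, as_of: date) -> Optional[int]:
    """Birth year, withheld (None) when it would reveal an age of 90 or older."""
    return None if age_on(birth, as_of) >= 90 else birth.year


def year_only(event: date) -> int:
    """Keep only the year of an admission, discharge, or similar date."""
    return event.year


if __name__ == "__main__":
    born = date(1934, 6, 1)
    print(safe_harbor_age(born, date(2024, 5, 1)))
    print(safe_harbor_age(born, date(2024, 6, 1)))
    print(year_only(date(2023, 11, 17)))
```

Note that an event year combined with an exact age can also reveal a birth year, so for people in the 90 or older category you would review event years together with any age field.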

De-Identification Methods

HIPAA permits two pathways: the Safe Harbor method described here and the Expert Determination Method. Organizations often combine Safe Harbor with additional Statistical De-Identification techniques to further reduce re-identification risk.

Safe Harbor vs. Expert Determination Method

  • Safe Harbor: rule-based removal of 18 identifiers plus the “no actual knowledge” requirement. Fast to implement, easy to verify.
  • Expert Determination: a qualified expert applies statistical or scientific principles to conclude the risk of identification is very small, documenting methods, assumptions, and residual risk.
  • When to choose which: Safe Harbor is ideal for standardized releases; Expert Determination fits complex datasets where utility requires tailored transformations.

Techniques that enhance de-identification utility

  • Data masking and redaction for free text and notes.
  • Generalization and binning (for example, age bands, wide date ranges, coarser geography).
  • Suppression and small-cell rules to remove rare combinations.
  • Tokenization or pseudonymization to preserve linkability without revealing identity.
  • Keyed hashing (for example, HMAC with a secret key stored apart from the data) for stable, non-reversible linking across systems.
  • K-anonymity, l-diversity, or t-closeness checks as part of Statistical De-Identification.
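
Two of the linking techniques above can be sketched with the Python standard library. Both helper names are hypothetical. `keyed_token` derives a stable token from an identifier with HMAC-SHA256, while `random_token` assigns a random token and stores the mapping in a lookup table (`vault`) that you would keep inside your own environment. Because a keyed hash is still derived from the identifier, whether it can stay in a Safe Harbor release is a question to settle with your privacy office; the random token is the variant that is not derived from the individual's information.

```python
import hashlib
import hmac
import secrets


def keyed_token(value: str, key: bytes) -> str:
    """Stable token: the same value and key always give the same hex digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def random_token(value: str, vault: dict) -> str:
    """Random token not derived from the value; the vault holds the mapping."""
    if value not in vault:
        vault[value] = secrets.token_hex(16)
    return vault[value]


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # in practice, load from a key manager
    print(keyed_token("MRN-0001", key))

    vault = {}
    print(random_token("MRN-0001", vault))
```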

Documentation for Privacy Rule compliance

  • Keep a data inventory, transformation log, and approvals for each release.
  • Record the logic for geographic and age exceptions and any masking rules.
  • For Expert Determination, retain the expert’s report and scope of applicability.

Compliance Tools

Technology helps you consistently find and remove Unique Identifiers while preserving analytic value and auditability.

  • Data discovery and classification: scan structured and unstructured sources for PHI elements and patterns.
  • De-identification pipelines: configurable Data Masking, tokenization, redaction, and format-preserving transformations.
  • Access control and encryption: role-based permissions, encryption at rest/in transit, and key management segregation.
  • Data loss prevention and monitoring: prevent leakage of identifiers in exports, logs, and messaging systems.
  • Metadata, lineage, and approval workflows: prove how a dataset became de-identified and who authorized each step.
  • Testing and QA harnesses: synthetic test data, sampling checks, and automated small-cell detection.

Data Re-Identification Considerations

Even after Safe Harbor, re-identification risk can arise from linkage attacks, rare conditions, small populations, or descriptive notes. Reduce risk by minimizing quasi-identifiers, limiting precision, and governing downstream sharing.

Common risk amplifiers

  • Granular time, location, or facility details that were not fully generalized.
  • Rare diagnoses, procedures, or device models that make a record distinctive.
  • Small cells in cross-tabulations and stratified reports.
  • Free-text narratives with names, places, or events.
  • Combining releases from different departments or time periods.

Mitigations that work in practice

  • Apply small-cell suppression, top- and bottom-coding, and broader buckets for dates and ages.
  • Introduce calibrated noise or swapping where appropriate; test utility vs. risk.
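  • Use a limited data set under a data use agreement when Safe Harbor is not feasible; a limited data set remains PHI, so it may be shared only for research, public health, or health care operations.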
  • Run re-identification simulations and document results before release.
  • Set retention limits, redisclosure controls, and recipient obligations to prevent misuse.
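
The small-cell mitigation above can be sketched as a k-anonymity style check over quasi-identifiers. The threshold `k=5` and the column names are assumptions for illustration; HIPAA does not prescribe a specific value, so you would choose one with your privacy team.

```python
from collections import Counter


def small_cells(records: list, quasi_identifiers: list, k: int = 5) -> dict:
    """Return quasi-identifier combinations that appear fewer than k times."""
    counts = Counter(
        tuple(record.get(q) for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


def suppress_small_cells(records: list, quasi_identifiers: list, k: int = 5) -> list:
    """Drop records whose quasi-identifier combination is a small cell."""
    risky = small_cells(records, quasi_identifiers, k)
    return [
        record
        for record in records
        if tuple(record.get(q) for q in quasi_identifiers) not in risky
    ]


if __name__ == "__main__":
    rows = [{"age_band": "40-49", "zip3": "021"} for _ in range(5)]
    rows.append({"age_band": "90 or older", "zip3": "000"})
    print(small_cells(rows, ["age_band", "zip3"]))
    print(len(suppress_small_cells(rows, ["age_band", "zip3"])))
```

Dropping records is the bluntest option; generalizing the offending values into broader buckets often preserves more utility and can be re-checked with the same `small_cells` function.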

Conclusion

Use this HIPAA Safe Harbor checklist to remove the 18 identifiers, apply the geographic and age exceptions precisely, and layer Statistical De-Identification and Data Masking where needed. With sound tooling, documentation, and risk controls, you can protect privacy while preserving analytic value and meeting Privacy Rule compliance.

FAQs

What are the 18 identifiers to remove for HIPAA Safe Harbor compliance?

They are: names; geographic subdivisions smaller than a state; all elements of dates (except year) tied to an individual, and for those 90+, aggregate age and related dates to “90 or older”; telephone numbers; fax numbers; email addresses; Social Security numbers; medical record numbers; health plan beneficiary numbers; account numbers; certificate/license numbers; vehicle identifiers and serials (including license plates); device identifiers and serials; web URLs; IP addresses; biometric identifiers (for example, finger or voice prints); full-face photos and comparable images; and any other unique identifying number, characteristic, or code.

How does the Expert Determination method differ from Safe Harbor?

Safe Harbor removes a fixed list of identifiers and requires you to have no actual knowledge of identifiability. The Expert Determination Method is risk-based: a qualified expert uses statistical techniques to show the chance of re-identification is very small, documents the analysis, and specifies conditions under which the conclusion holds.

What exceptions exist for geographic and age data under HIPAA?

You may keep state-level geography and the first three digits of a ZIP code only when the combined area has more than 20,000 people; otherwise record 000. For dates, keep only the year. For individuals age 90 or older, do not disclose exact age or date elements that reveal age; aggregate to “90 or older.”

How can organizations prevent data re-identification after de-identification?

Combine Safe Harbor with Statistical De-Identification: generalize ages and dates, suppress small cells, mask free text, tokenize linkable fields, add calibrated noise where needed, and govern sharing with data use agreements, access controls, and monitoring. Test residual risk before release and document decisions for ongoing compliance.
