The Two HIPAA De-Identification Methods: Requirements, Examples, and Compliance Tips


Kevin Henry

HIPAA

March 02, 2025


Safe Harbor Method Requirements

What Safe Harbor Means

Under HIPAA’s de-identification standard, the Safe Harbor method requires you to remove 18 specified identifiers from patient data and to have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify an individual. Information that meets this standard is no longer treated as Protected Health Information under the Privacy Rule.

Identifier Removal Guidelines

To use Safe Harbor, remove these 18 identifiers for the individual and for relatives, employers, or household members:

  • Names
  • Geographic details smaller than a state (street, city, county, precinct), including ZIP codes except as noted below
  • All elements of dates (except year) directly related to an individual; ages over 89 must be aggregated into a single 90+ category
  • Telephone numbers
  • Fax numbers
  • Email addresses
  • Social Security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate or license numbers
  • Vehicle identifiers and serial numbers, including license plates
  • Device identifiers and serial numbers
  • Web URLs
  • IP addresses
  • Biometric identifiers (e.g., fingerprints, voiceprints)
  • Full-face photographs and comparable images
  • Any other unique identifying number, characteristic, or code (with limited allowance for a non-derivable, secured re-linkage code)

Rules for Dates and Geography

  • Dates: Remove day and month for birth, admission, discharge, and death dates; keep only the year. Aggregate ages 90 and above into “90+,” and drop any year that would reveal such an age (for example, a birth year).
  • Geography: You may retain the first three digits of a ZIP code only when the combined population of all ZIP codes sharing those three digits exceeds 20,000, based on current Census Bureau data; otherwise, replace the prefix with 000.
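
The date, age, and ZIP rules above can be sketched in a few lines of Python. This is a minimal illustration, not a complete Safe Harbor implementation: the record layout, the function names, and the `LOW_POPULATION_ZIP3` set are assumptions made for the example, and the two prefixes in that set are placeholders that would need to be rebuilt from current Census data.

```python
from datetime import date

# Hypothetical lookup of 3-digit ZIP prefixes covering 20,000 or fewer people.
# The entries are illustrative placeholders; build the real set from current
# Census Bureau population data before relying on it.
LOW_POPULATION_ZIP3 = {"036", "059"}


def top_code_age(age: int) -> str:
    """Collapse ages 90 and above into a single '90+' category."""
    return "90+" if age >= 90 else str(age)


def safe_harbor_zip3(zip_code: str, low_population=LOW_POPULATION_ZIP3) -> str:
    """Keep the ZIP3 prefix only when it is well formed and not low-population."""
    prefix = zip_code.strip()[:3]
    if len(prefix) < 3 or not prefix.isdigit() or prefix in low_population:
        return "000"
    return prefix


def deidentify(record: dict, as_of: date) -> dict:
    """Apply year-only dates, age top-coding, and ZIP3 handling to one record."""
    birth = record["birth_date"]
    # Subtract one year if the birthday has not yet occurred as of the given date.
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    age = as_of.year - birth.year - (0 if had_birthday else 1)
    return {
        # A birth year would reveal an age of 90 or more, so it is dropped there.
        "birth_year": None if age >= 90 else birth.year,
        "age": top_code_age(age),
        "admit_year": record["admit_date"].year,
        "zip3": safe_harbor_zip3(record["zip"]),
    }
```

Returning a new dictionary rather than editing the record in place means any field not explicitly handled is dropped by default, which is the safer failure mode for this kind of pipeline.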

Implementation Tips

  • Inventory fields across tables and free text, not just obvious columns.
  • Automate scrubbing for direct identifiers and scan notes for embedded identifiers.
  • Use a random, non-derivable re-linkage code stored separately with tight access controls.
  • Validate outputs and document that you have no actual knowledge that the remaining data could identify an individual.
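
As one possible way to produce a random, non-derivable re-linkage code, the sketch below draws each code from Python's `secrets` module instead of hashing or encoding the original identifier. The function name and the use of medical record numbers as input are assumptions for illustration.

```python
import secrets


def assign_relinkage_codes(record_ids):
    """Map each original identifier to a random code that cannot be derived from it.

    The codes come from a cryptographically secure random source. They are not
    hashes or encodings of the identifier, so the code alone reveals nothing
    about the person it belongs to.
    """
    crosswalk = {}
    for record_id in record_ids:
        if record_id not in crosswalk:
            crosswalk[record_id] = secrets.token_hex(16)  # 32 hex characters
    return crosswalk


# The returned crosswalk is the only way back to the original identifiers.
# Store it apart from the de-identified data, under tight access controls.
crosswalk = assign_relinkage_codes(["MRN-001", "MRN-002", "MRN-001"])
```

Only the random codes travel with the released dataset; the crosswalk stays with the covered entity.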

Expert Determination Method Process

Core Idea

The Expert Determination pathway relies on a qualified expert’s determination that the risk of re-identification is very small, given the data, context, and controls. Instead of removing a fixed list of fields, you apply a statistical risk assessment tailored to your use case.

Step-by-Step Process

  1. Define the use case, data recipients, and environment (access controls, contractual limits, and sharing scope).
  2. Inventory attributes, including quasi-identifiers (e.g., age, ZIP3, dates, rare diagnoses) that could enable linkage.
  3. Model plausible attacks and external data sources an adversary could use.
  4. Measure re-identification risk on the dataset and within subgroups.
  5. Apply transformations to reduce risk while preserving utility.
  6. Re-evaluate risk and iterate until it meets a “very small” threshold.
  7. Produce a written qualified expert determination describing methods, assumptions, results, and required safeguards.
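
To make step 4 concrete, the sketch below groups records by their quasi-identifier values and reports a few simple sample-based risk figures. It is a deliberate simplification: an actual expert determination would also weigh population-level uniqueness, plausible attacks, and the release context, and the field names used here are hypothetical.

```python
from collections import Counter


def risk_summary(records, quasi_identifiers):
    """Summarize re-identification risk from equivalence class sizes.

    An equivalence class is the set of records that share the same values for
    every quasi-identifier. A record in a class of size n is treated as having
    a 1/n chance of being singled out.
    """
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    smallest = min(classes.values())
    return {
        "k": smallest,                               # smallest class size
        "max_risk": 1 / smallest,                    # worst-case record risk
        "avg_risk": len(classes) / len(records),     # mean of 1/n over records
        "unique_records": sum(1 for n in classes.values() if n == 1),
    }


sample = [
    {"age_band": "40-44", "zip3": "941"},
    {"age_band": "40-44", "zip3": "941"},
    {"age_band": "40-44", "zip3": "941"},
    {"age_band": "40-44", "zip3": "100"},
    {"age_band": "85-89", "zip3": "941"},
]
summary = risk_summary(sample, ["age_band", "zip3"])
```

Running the same function on a subgroup (for example, only records with a rare diagnosis) gives the within-subgroup view that step 4 calls for.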

Common Privacy Techniques

  • Generalization and suppression (e.g., convert birthdate to age bands, truncate geography to state or region).
  • k-anonymity, l-diversity, and t-closeness to limit unique or homogeneous records.
  • Top/bottom coding and microaggregation for outliers and small cells.
  • Date shifting, binning, rounding, and partial masking.
  • Noise addition, data swapping, or synthetic data for difficult features.
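
Two of these techniques, generalization and suppression, combine naturally into a k-anonymity check. The sketch below bands ages into 5-year groups and then drops records whose quasi-identifier combination appears fewer than `k` times. The field names and the default `k=5` are assumptions for illustration; HIPAA does not prescribe a value of k, and the appropriate threshold is for the expert to justify.

```python
from collections import Counter


def age_band(age, width=5):
    """Generalize an exact age into a band such as '40-44'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_and_suppress(records, k=5):
    """Band ages, then suppress records in classes smaller than k.

    Returns the retained records and the number of suppressed records, so the
    utility cost of the transformation can be reported alongside the result.
    """
    generalized = [{**r, "age": age_band(r["age"])} for r in records]
    sizes = Counter((r["age"], r["zip3"]) for r in generalized)
    kept = [r for r in generalized if sizes[(r["age"], r["zip3"])] >= k]
    return kept, len(generalized) - len(kept)
```

If too many records are suppressed, widening the age bands or coarsening geography usually recovers more rows than lowering `k`.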

Controls That Matter

Risk depends on the release context. Contracts, access controls, audit trails, and recipient training materially reduce re-identification risk. The expert’s opinion often conditions approval on these safeguards to maintain data privacy compliance.


Examples of De-Identification

Electronic Health Record Table for Outcomes Research

  • Safe Harbor: Remove all direct identifiers; retain only the year of events and the state; collapse ages 90 and above into a single category.
  • Expert Determination: Keep narrower age bands (e.g., 5-year groups) and ZIP3 when population thresholds and controls support a very small re-identification risk.

Claims Dataset for Cost Benchmarking

  • Safe Harbor: Remove member IDs and reduce dates to year only; replace provider IDs with random codes.
  • Expert Determination: Retain quarter-level service dates and provider specialties after applying k-anonymity and limiting access to approved analysts.

Patient-Generated Wearables Data

  • Safe Harbor: Strip device serials, GPS traces below state, and exact timestamps (keep year).
  • Expert Determination: Keep week-of-year and coarse geohash after jittering locations and enforcing a data use agreement.

Free-Text Clinical Notes

  • Safe Harbor: Use NLP to redact names, addresses, contact info, and any embedded identifiers.
  • Expert Determination: Permit limited clinical context (e.g., month-level timeframe) after risk testing and manual QA of de-identification output.
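
For pattern-shaped identifiers in notes, a first automated pass can be as simple as the sketch below. It uses regular expressions rather than NLP, so it is a swapped-in, much weaker technique: it will not catch names, addresses, or unusual formats, which still need a trained model or manual review. The patterns and placeholder labels are assumptions and cover only common US formats.

```python
import re

# Ordered list of (placeholder, pattern). These cover only a few common
# formats and are illustrative, not exhaustive.
PATTERNS = [
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)")),
    # Whole dates are replaced, which is stricter than keeping the year.
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]


def redact(text):
    """Replace each matched identifier with its placeholder label."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


note = "Call 555-123-4567 or email jane.doe@example.org; SSN 123-45-6789; seen 03/14/2024."
print(redact(note))
# prints: Call [PHONE] or email [EMAIL]; SSN [SSN]; seen [DATE].
```

Labeled placeholders, rather than blank deletions, keep the note readable and make it easier to audit what the pass removed.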

Medical Imaging Repository

  • Safe Harbor: Remove full-face images and DICOM tags with identifiers; keep year of acquisition.
  • Expert Determination: Allow retention of non-facial images with constrained site metadata after small-cell evaluation and suppression of rare modality-site combinations.

Compliance Best Practices

  • Adopt a written policy that defines when to use Safe Harbor versus Expert Determination and aligns with your de-identification standards.
  • Minimize data: collect only what you need; drop fields not needed for the stated purpose.
  • Use repeatable pipelines for scrubbing identifiers and quality checks on outputs.
  • Train teams on Protected Health Information scope, identifier removal guidelines, and re-identification risk.
  • Enforce contracts and access controls; monitor sharing and maintain audit logs.
  • Reassess risk when variables, recipients, or external data landscapes change.

Risks and Limitations of Each Method

Safe Harbor

  • Pros: Clear checklist; quick to implement; widely understood by partners.
  • Cons: Utility loss from removing dates and geography; may not address uniqueness in small cohorts; residual risk if free text or rare combinations persist.

Expert Determination

  • Pros: Better data utility; tailored controls; explicit statistical risk assessment.
  • Cons: Requires qualified expertise, time, and maintenance; results depend on assumptions about data and environment.

Documentation and Record-Keeping

For Safe Harbor

  • Field inventory and mapping of each removed or transformed identifier.
  • Evidence of date handling and 90+ age aggregation.
  • ZIP3 population threshold checks when used.
  • Statement that you have no actual knowledge that the remaining information could identify an individual, plus testing notes.
  • Design and storage details for any non-derivable re-linkage code.

For Expert Determination

  • Qualified expert determination report covering methods, assumptions, thresholds, results, and required safeguards.
  • Risk metrics before and after transformations; small-cell and uniqueness analyses.
  • Access model, data use agreement terms, and monitoring plan.
  • Versioning of datasets, parameters, and release notes for each refresh.

Retention and Oversight

  • Define retention periods for analyses, outputs, and approvals; keep decision logs for audits.
  • Track incidents and corrective actions; schedule periodic reviews of re-identification risk.

Consulting Qualified Experts

When to Engage an Expert

  • When Safe Harbor removes too much utility (e.g., time-sensitive analytics or granular geography).
  • When datasets include rare conditions, small populations, or complex quasi-identifiers.
  • Before new releases to unfamiliar recipients or broader audiences.

Who Qualifies

A qualified expert has relevant education, training, and experience in statistical disclosure control, privacy engineering, and health data. Look for a track record with qualified expert determinations, published methods, and familiarity with operational safeguards.

How to Work with the Expert

  • Define the purpose, users, environment, and acceptable risk threshold up front.
  • Provide data dictionaries and sample extracts to speed risk modeling.
  • Pilot transformations on a subset, then scale; agree on refresh triggers and monitoring.

Conclusion

Safe Harbor offers speed and clarity through strict identifier removal guidelines, while Expert Determination maximizes utility by managing re-identification risk statistically and operationally. Choose the path that fits your goals and controls, document decisions thoroughly, and revisit risk as data and contexts evolve.

FAQs

What identifiers must be removed under the Safe Harbor method?

You must remove 18 categories: names; sub-state geography (with limited ZIP3 use); all elements of dates except year, with ages 90 and above aggregated into a single category; phone numbers, fax numbers, and email addresses; Social Security, medical record, health plan beneficiary, and account numbers; certificate/license numbers; vehicle and device identifiers; URLs and IP addresses; biometric identifiers; full-face photos and comparable images; and any other unique identifying number, characteristic, or code, except a secure, non-derivable re-linkage code stored separately.

How does the Expert Determination method reduce re-identification risk?

A qualified expert conducts statistical risk assessment, models plausible attacks, and applies privacy techniques like generalization, suppression, noise addition, and k-anonymity. The expert also factors in contractual, technical, and administrative controls so the overall chance of re-identification remains very small for the intended recipients and context.

What documentation is required for HIPAA de-identification?

For Safe Harbor, keep a field-by-field log of identifiers removed, date/geography treatments, testing results, and the rationale that you lack actual knowledge of re-identification risk. For Expert Determination, maintain the expert’s written opinion, methods, thresholds, risk metrics, required safeguards, dataset versions, and conditions for refresh or revocation.

When should an organization consult a qualified expert?

Engage an expert when Safe Harbor undermines analytical value, when data includes rare or high-risk variables, when sharing beyond tightly controlled teams, or when you need to retain finer-grained dates or locations. An expert helps balance data utility with data privacy compliance and provides defensible documentation.
