Decoding PHI Identifiers for HIPAA Compliance: A Comprehensive Guide

Kevin Henry

HIPAA

January 09, 2024

6 minutes read

Understanding how Protected Health Information is defined and what counts as an identifier is essential to your HIPAA compliance program. This guide distills the HIPAA Privacy Rule’s de-identification standards, clarifies all 18 PHI identifiers, and shows how to manage risk with practical safeguards and audit-ready procedures.

Definition of Protected Health Information

Protected Health Information (PHI) is individually identifiable health information that relates to a person’s past, present, or future physical or mental health or condition, the provision of healthcare, or payment for care. PHI is covered whether it is electronic, paper, or oral, as long as it is created, received, maintained, or transmitted by a covered entity or a business associate under the HIPAA Privacy Rule.

Information becomes PHI when it identifies an individual directly, or when there is a reasonable basis to believe it can identify one indirectly, for example when combined with other data. De-identified data is not PHI. Education records protected by FERPA and employment records held by a covered entity in its capacity as an employer are not PHI. Your goal is to minimize identifiability while preserving utility for care, operations, and analytics.

Overview of the 18 PHI Identifiers

HIPAA’s Safe Harbor de-identification standards require removal of these 18 identifiers, whether they belong to the individual or to the individual’s relatives, employers, or household members, before data is considered de-identified. Use this list as a checklist during data disclosure reviews and compliance audit procedures:

  1. Names.
  2. All geographic subdivisions smaller than a state (e.g., street address, city, county, precinct, ZIP code, geocodes) with a limited exception for certain three-digit ZIPs.
  3. All elements of dates (except year) directly related to an individual, and ages over 89 (which must be grouped as 90+).
  4. Telephone numbers.
  5. Fax numbers.
  6. Email addresses.
  7. Social Security numbers.
  8. Medical record numbers.
  9. Health plan beneficiary numbers.
  10. Account numbers.
  11. Certificate/license numbers.
  12. Vehicle identifiers and serial numbers, including license plate numbers.
  13. Device identifiers and serial numbers.
  14. Web URLs.
  15. IP addresses.
  16. Biometric identifiers, including finger and voice prints.
  17. Full-face photographic images and comparable images.
  18. Any other unique identifying number, characteristic, or code (with special rules for non-derivable re-identification codes).

Alternatively, the Expert Determination method can document that re-identification risk is very small, allowing more data utility under documented de-identification standards and controls.

Geography and time fields often reveal identity. Under Safe Harbor, you must remove geographic subdivisions smaller than a state. A limited exception permits the first three digits of a ZIP code if, according to current Census Bureau data, the area formed by all ZIP codes sharing those three digits contains more than 20,000 people; otherwise replace the three digits with “000.” Treat GPS coordinates and fine-grained geocodes as direct identifiers.
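
The three-digit ZIP rule above can be sketched in a few lines of Python. This is a minimal illustration, not a vetted implementation: the function name `generalize_zip` and the `zip3_population` lookup are assumptions, and the population figures are made up for the example. In practice the lookup would be built from current Census Bureau data.

```python
from typing import Mapping

# Safe Harbor threshold: the three-digit area must contain MORE than 20,000 people.
MIN_POPULATION = 20_000


def generalize_zip(zip_code: str, zip3_population: Mapping[str, int]) -> str:
    """Return the first three ZIP digits if the area is large enough, else "000"."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        return "000"  # malformed input: fail closed
    prefix = digits[:3]
    # Prefixes missing from the lookup count as population 0 and are suppressed.
    if zip3_population.get(prefix, 0) > MIN_POPULATION:
        return prefix
    return "000"


# Hypothetical population figures, for illustration only.
example_population = {"100": 1_500_000, "036": 12_000}
print(generalize_zip("10027", example_population))       # 100
print(generalize_zip("03601-1234", example_population))  # 000
```

Failing closed on malformed or unknown ZIP codes is a design choice in this sketch: it loses some utility but avoids releasing a prefix whose population has not been checked.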

Dates linked to a person (birth, admission, discharge, death, and appointment dates) must be generalized to year only. Ages over 89, and any date element (including year) that reveals such an age, must be aggregated into a single “90 or older” category. For analytics, use identifier aggregation techniques to preserve trends without compromising privacy: year instead of exact dates and state instead of street or city satisfy Safe Harbor, while finer units such as quarter or county fall outside it and need support from an Expert Determination.
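
As a rough sketch of the date and age rules, the Python below reduces dates to a year and folds ages over 89 into one category. The function names and the output field names (`age`, `birth_year`, `admission_year`) are hypothetical. The sketch also withholds the birth year for anyone over 89, on the assumption that a birth year would otherwise reveal the age.

```python
from datetime import date
from typing import Dict, Optional, Union


def age_on(born: date, on: date) -> int:
    """Age in completed years on a given day."""
    had_birthday = (on.month, on.day) >= (born.month, born.day)
    return on.year - born.year - (0 if had_birthday else 1)


def generalize_age(age: int) -> str:
    """Report ages over 89 as a single "90+" category."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return "90+" if age > 89 else str(age)


def deidentify_dates(birth: date, admission: date) -> Dict[str, Union[str, int, None]]:
    """Keep only year-level information, and hide the birth year for ages over 89."""
    age = age_on(birth, admission)
    birth_year: Optional[int] = None if age > 89 else birth.year
    return {
        "age": generalize_age(age),
        "birth_year": birth_year,
        "admission_year": admission.year,
    }


print(deidentify_dates(date(1980, 3, 15), date(2024, 3, 15)))
print(deidentify_dates(date(1930, 5, 1), date(2024, 1, 9)))
```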

Contact and Account Identifiers

Contact fields like telephone, fax, and email are PHI identifiers when tied to health information. Numeric identifiers—Social Security, medical record, health plan beneficiary, account, and certificate/license numbers—are high risk because they enable cross-system matching.

Digital and transactional identifiers also qualify: URLs, IP addresses, device serials, and vehicle identifiers (including license plates). When sharing data, mask, tokenize, or remove these values. For internal use, apply minimum necessary access, format-preserving tokenization for workflows, and strict key management to separate tokens from real values.
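
One way to picture tokenization is a vault that swaps each real value for a random token and keeps the mapping to itself. The sketch below is an in-memory illustration only: the class name `TokenVault` and the `tok_` prefix are assumptions, the tokens are not format-preserving, and a real deployment would keep the mapping in a separate, access-controlled store. Because the tokens are random rather than derived from the identifier, they cannot be reversed without the vault.

```python
import secrets
from typing import Dict


class TokenVault:
    """Maps identifiers to random tokens; only the vault can reverse the mapping."""

    def __init__(self) -> None:
        self._token_by_value: Dict[str, str] = {}
        self._value_by_token: Dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token for a value, creating one on first use."""
        token = self._token_by_value.get(value)
        if token is None:
            token = "tok_" + secrets.token_hex(8)
            while token in self._value_by_token:  # regenerate on the rare collision
                token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Look up the real value; callers should be tightly restricted."""
        return self._value_by_token[token]


vault = TokenVault()
mrn_token = vault.tokenize("MRN-0012345")  # made-up medical record number
print(mrn_token)
```

The same input always yields the same token, so tokenized records can still be joined across tables without exposing the underlying number.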

Biometric and Image Identifiers

Biometric identifiers include finger and voice prints; full-face photographic images and comparable images are explicitly listed as identifiers. Treat other biometric modalities with similar caution because they can uniquely identify a person or be linked with health information.

Effective biometric data protection combines encryption in transit and at rest, storage of templates rather than raw images, liveness detection, access controls, and short retention with documented deletion. Keep biometric systems behind strong authentication, segregate them from general networks, and audit access routinely.

HIPAA’s Privacy Rule governs permissible uses and disclosures, while the Security Rule mandates administrative, physical, and technical safeguards for electronic PHI. Perform a risk analysis, document a risk management plan, and enforce the minimum necessary standard across workflows and systems.

Key safeguards include role-based access, unique user IDs, automatic logoff, encryption, integrity controls, audit logging, facility and device protections, and workforce training with sanctions for violations. Maintain business associate agreements, incident response and breach notification procedures, and a records retention schedule aligned to your regulatory obligations.

Robust compliance audit procedures should trace each identifier class to specific controls, test access and logging, validate de-identification protocols (Safe Harbor or Expert Determination), and verify that policy and practice match. Regularly rehearse breach scenarios and document lessons learned.
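
Tracing each identifier class to controls lends itself to a simple coverage check. The Python below is a sketch under assumptions: the short labels for the 18 classes, the function name `untraced_identifiers`, and the control IDs in the example are all hypothetical placeholders for whatever your control catalog uses.

```python
from typing import Dict, List

# Short labels for the 18 Safe Harbor identifier classes (labels are illustrative).
IDENTIFIER_CLASSES = [
    "names", "geography", "dates_and_ages", "telephone", "fax", "email",
    "ssn", "medical_record_number", "health_plan_beneficiary_number",
    "account_number", "certificate_license_number", "vehicle_identifier",
    "device_identifier", "url", "ip_address", "biometric",
    "full_face_image", "other_unique_identifier",
]


def untraced_identifiers(control_map: Dict[str, List[str]]) -> List[str]:
    """Return identifier classes that have no control mapped to them."""
    return [c for c in IDENTIFIER_CLASSES if not control_map.get(c)]


# Hypothetical control IDs, for illustration only.
example_map = {
    "names": ["AC-01", "LOG-03"],
    "ssn": ["ENC-02", "TOK-01"],
    "email": [],  # listed, but nothing mapped yet
}
print(untraced_identifiers(example_map))
```

An auditor could run a check like this before fieldwork to see which identifier classes still lack a documented control.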

Strategies for PHI Protection and Risk Mitigation

Start with data mapping to locate all PHI, then minimize collection and retention. Use identifier aggregation, suppression, and generalization to reduce identifiability while meeting your analytic needs. Apply data loss prevention, encryption, and network segmentation as part of a layered health information security model.

Adopt privacy by design in new projects, require vendor due diligence and continuous monitoring, and restrict PHI in free-text fields. For analytics and AI, prefer de-identified datasets under formal de-identification standards and keep re-identification keys in a separate, tightly controlled environment.
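
Free-text fields are a common place for identifiers to slip through. The following Python is a minimal sketch of a pattern-based redactor, assuming US-style formats. The function name `redact_free_text` and the patterns are illustrative, they catch only well-formed values, and they do not detect names, addresses, or dates, so treat this as a first-pass filter rather than a de-identification method.

```python
import re
from typing import Dict, Tuple

# Order matters: URLs and emails are replaced first so that later patterns
# cannot match fragments inside them.
PATTERNS = [
    ("URL", re.compile(r"https?://\S+")),
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-. ])\d{3}[-. ]\d{4}(?!\d)")),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


def redact_free_text(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace pattern matches with [LABEL] and count what was found."""
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


note = "Pt called from 555-867-5309, email jane@example.org, SSN 123-45-6789."
redacted, found = redact_free_text(note)
print(redacted)  # Pt called from [PHONE], email [EMAIL], SSN [SSN].
print(found)
```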

Continuously improve with metrics: unauthorized access rates, time-to-revoke accounts, de-identification coverage, and audit findings closed. This turns compliance into an operational discipline that withstands growth and change.
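
Two of these metrics reduce to simple arithmetic, sketched below in Python. The function names and the definitions themselves are assumptions for illustration: time-to-revoke is taken as hours between a workforce member's departure and account revocation, and de-identification coverage as the share of shared datasets that went through de-identification.

```python
from datetime import datetime


def hours_to_revoke(terminated_at: datetime, revoked_at: datetime) -> float:
    """Hours between a workforce member's departure and account revocation."""
    return (revoked_at - terminated_at).total_seconds() / 3600


def deidentification_coverage(deidentified: int, shared: int) -> float:
    """Fraction of shared datasets that went through de-identification."""
    if shared <= 0:
        raise ValueError("no datasets were shared in this period")
    return deidentified / shared


# Made-up figures, for illustration only.
print(hours_to_revoke(datetime(2024, 1, 9, 9, 0), datetime(2024, 1, 9, 15, 30)))  # 6.5
print(deidentification_coverage(45, 50))  # 0.9
```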

FAQs

What are the 18 PHI identifiers under HIPAA?

They include names; sub-state geography; elements of dates (except year) and ages over 89; phone, fax, and email; Social Security, medical record, health plan beneficiary, account, and certificate/license numbers; vehicle and device identifiers; URLs; IP addresses; biometric identifiers (finger and voice prints); full-face photos and comparable images; and any other unique identifying number, characteristic, or code.

How does HIPAA define Protected Health Information?

PHI is individually identifiable health information related to a person’s health, care, or payment for care, held or transmitted by a covered entity or business associate, in any form. If the data cannot reasonably identify an individual—through Safe Harbor removal of the 18 identifiers or Expert Determination—it is not PHI.

What measures protect biometric identifiers under HIPAA?

Use encryption, access controls, and audit logging; store biometric templates instead of raw images; implement liveness detection; limit access to a need-to-know basis; set short retention with secure deletion; and separate biometric systems and keys. These steps strengthen biometric data protection and support Security Rule compliance.

When can PHI be considered de-identified?

PHI is de-identified when either (1) all 18 identifiers are removed and you have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify someone, or (2) a qualified expert determines and documents that the risk of re-identification is very small, with appropriate technical and organizational controls.

In summary, decoding PHI identifiers lets you apply the HIPAA Privacy Rule confidently, design effective de-identification standards, and operationalize health information security. With clear controls, identifier aggregation, and disciplined compliance audit procedures, you can protect privacy while enabling legitimate data use.
