Celiac Disease Registry Data and HIPAA: What Counts as PHI and How to Comply
Overview of Celiac Disease Registries
Celiac disease registries consolidate real-world information to advance understanding, improve care, and enable research. They typically combine electronic health record extracts, lab results, endoscopy and pathology findings, patient-reported outcomes, and follow-up data across multiple sites.
Because registry records can link health details to specific individuals, the way you collect, store, and share them is governed by HIPAA when a covered entity or business associate is involved. This overview is for general information and supports your HIPAA compliance planning; always consult your privacy office or counsel for program-specific requirements.
Common data elements
- Demographics (age, sex, race/ethnicity), contact details, and enrollment dates.
- Diagnostic information (serology such as tTG-IgA/EMA, biopsy results, Marsh classification, HLA-DQ2/DQ8 typing).
- Clinical history, comorbidities (e.g., type 1 diabetes, thyroid disease), medications, diet adherence, and complications.
- Utilization data (encounters, procedures), outcomes, quality metrics, and patient-reported symptoms.
- Biospecimen metadata, imaging, wearable or app-collected signals, and follow-up timelines.
How registry context shapes privacy obligations
Whether data are Protected Health Information depends on who holds the data, the presence of identifiers, and the purpose of use (operations, public health, or research). A hospital-run quality registry faces different HIPAA duties than a patient advocacy registry that collects information directly from participants without receiving PHI from a covered entity.
Definition of Protected Health Information
Protected Health Information (PHI) is individually identifiable health information created or received by a covered entity or business associate that relates to an individual's health status, the provision of care, or payment for care, and that either identifies the individual or provides a reasonable basis for identifying them. PHI can exist in any medium: paper, oral, or electronic (ePHI).
Direct identifiers commonly encountered
- Names.
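- Geographic subdivisions smaller than a state (street address, city, county, precinct, and ZIP code; under safe harbor, the first three ZIP digits may be kept only when that three-digit area contains more than 20,000 people).
- All elements of dates (except year) directly related to an individual (birth, admission, discharge, death), plus all ages over 89 unless aggregated into a single category of 90 or older.
- Telephone numbers.
- Fax numbers.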
- Email addresses.
- Social Security numbers.
- Medical record numbers.
- Health plan beneficiary numbers.
- Account numbers.
- Certificate and license numbers.
- Vehicle identifiers and license plates.
- Device identifiers and serial numbers.
- Web URLs.
- IP address numbers.
- Biometric identifiers (finger/voice prints).
- Full-face photographs and comparable images.
- Any other unique identifying number, characteristic, or code (except an allowed re-identification code).
Less obvious identifiers
Free-text clinical notes, rare combinations of traits, small-population geographies, and granular timestamps can indirectly identify participants. Genetic data (e.g., detailed HLA typing) may increase re-identification risk when combined with other attributes.
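One common way to surface rare combinations of traits is to count how many records share each combination of quasi-identifiers and flag any group smaller than a chosen threshold. The sketch below is illustrative only: the field names (`age_band`, `zip3`, `hla`), the sample rows, and the threshold are assumptions, and HIPAA does not prescribe a specific group size.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(rec.get(field) for field in quasi_identifiers) for rec in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical registry rows; field names and values are illustrative.
rows = [
    {"age_band": "30-39", "zip3": "021", "hla": "DQ2"},
    {"age_band": "30-39", "zip3": "021", "hla": "DQ2"},
    {"age_band": "30-39", "zip3": "021", "hla": "DQ2"},
    {"age_band": "70-79", "zip3": "597", "hla": "DQ8"},
]

risky = small_groups(rows, ["age_band", "zip3", "hla"], k=3)
print(risky)  # {('70-79', '597', 'DQ8'): 1}
```

Groups flagged this way are candidates for suppression or further generalization before any release.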
HIPAA Applicability and Limitations
HIPAA applies to covered entities (providers, health plans, clearinghouses) and their business associates handling PHI on their behalf. If a celiac disease registry is operated by a covered entity, or by a business associate acting on its behalf, HIPAA governs the registry's collection, use, and disclosure of PHI.
HIPAA generally does not apply to data a participant submits directly to a consumer application that is not offered by a covered entity or business associate. However, other laws and institutional policies may still apply, and HIPAA governs any disclosure a covered entity makes into the registry, so that disclosure needs a valid pathway such as an authorization, an IRB/Privacy Board waiver, or a business associate arrangement.
Typical scenarios
- Healthcare operations: A hospital quality-improvement registry may use PHI under HIPAA operations without individual authorization, subject to minimum necessary and access controls.
- Research: A university-led research registry usually needs IRB review and either a HIPAA Authorization or a waiver/alteration granted by an IRB/Privacy Board.
- Advocacy registry: A non-covered sponsor collecting data directly from patients may not be subject to HIPAA but should adopt strong privacy practices and clear notices.
Minimum necessary and role-based access
Limit PHI to the least amount needed for the task. Define roles (e.g., data curator, site investigator, analyst) and grant only the data necessary for those roles, revoking access when duties change.
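Role-based minimum necessary access can be sketched as a filter that returns only the fields a role is entitled to see. This is a minimal illustration, not a complete access-control system: the role names follow the examples above, while the field names and the `ROLE_FIELDS` mapping are hypothetical and would come from your own policy.

```python
# Hypothetical role-to-field map; real roles and fields come from your policy.
ROLE_FIELDS = {
    "data_curator": {"study_id", "name", "mrn", "ttg_iga", "marsh_class"},
    "site_investigator": {"study_id", "ttg_iga", "marsh_class", "enrollment_date"},
    "analyst": {"study_id", "ttg_iga", "marsh_class"},
}


def view_for_role(record, role):
    """Return only the fields the role may see; unknown roles get nothing."""
    allowed = ROLE_FIELDS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "study_id": "S-0001",
    "name": "Jane Doe",
    "mrn": "12345",
    "ttg_iga": 42.0,
    "marsh_class": "3b",
    "enrollment_date": "2023-04-01",
}

print(view_for_role(record, "analyst"))
# {'study_id': 'S-0001', 'ttg_iga': 42.0, 'marsh_class': '3b'}
```

Defaulting unknown roles to an empty view means a missing or revoked role fails closed rather than exposing the full record.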
Data De-Identification and Limited Data Sets
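Data De-Identification removes the ability to reasonably identify individuals. Under the safe harbor method, all 18 listed identifiers of the individual and of the individual's relatives, employers, and household members are removed, and the data holder has no actual knowledge that the remaining information could identify someone; under expert determination, a qualified expert applies statistical or scientific principles and documents that the re-identification risk is very small.
A Limited Data Set (LDS) is not fully de-identified. It excludes direct identifiers such as names, street addresses, and record numbers but may include dates, city, state, five-digit ZIP code, and other non-direct elements. An LDS may be shared only for research, public health, or health care operations, and sharing it requires a Data Use Agreement.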
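As a first-pass triage, a dataset's columns can be compared against lists of direct identifiers and of elements that are allowed only in a Limited Data Set. The sketch below assumes hypothetical column names and deliberately short lists; a column-name check cannot inspect free text or replace a safe harbor review or expert determination.

```python
# Illustrative, incomplete lists; column names are hypothetical.
DIRECT_IDENTIFIERS = {"name", "street_address", "phone", "email", "ssn", "mrn"}
LDS_ONLY_ELEMENTS = {"birth_date", "admission_date", "city", "zip5"}


def classify_columns(columns):
    """Give a rough tier for a dataset based only on its column names."""
    cols = set(columns)
    if cols & DIRECT_IDENTIFIERS:
        return "identified PHI"
    if cols & LDS_ONLY_ELEMENTS:
        return "limited data set candidate"
    return "de-identification candidate"


print(classify_columns(["mrn", "ttg_iga"]))               # identified PHI
print(classify_columns(["admission_date", "ttg_iga"]))    # limited data set candidate
print(classify_columns(["birth_year", "state", "ttg_iga"]))  # de-identification candidate
```

The word "candidate" is intentional: the result only tells you which review pathway to start with, not that the dataset has met it.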
Practical de-identification tips for celiac registries
- Generalize or bin ages; under safe harbor, collapse ages over 89 into a single 90-or-older category.
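- Reduce geographic precision (e.g., 3-digit ZIP, replaced with 000 where the three-digit area has 20,000 or fewer people) and shift dates consistently per participant to preserve intervals; because shifted dates are still dates, this approach typically relies on expert determination or a Limited Data Set rather than safe harbor.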
- Suppress outliers and unique combinations; limit free text or scrub it with NLP tools.
- Use pseudonymous study IDs and store the linkage key separately with strict access controls.
- Review small cell counts before release to prevent inadvertent disclosure.
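Three of the tips above (age binning, ZIP truncation, and consistent date shifting) can be sketched with the Python standard library. Everything named here is an assumption for illustration: the function names, the ten-year bands, the `restricted_prefixes` set (a placeholder, not the authoritative list of low-population ZIP prefixes), the offset window, and the secret, which in practice would come from a managed key store. Whether shifted dates are acceptable depends on the de-identification method you rely on.

```python
import hashlib
import hmac
from datetime import date, timedelta


def age_band(age):
    """Bin age into ten-year bands; 90 and over collapse into one category."""
    if age >= 90:
        return "90+"
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def zip3(zip_code, restricted_prefixes):
    """Keep the first three ZIP digits, or '000' for restricted prefixes."""
    prefix = zip_code[:3]
    return "000" if prefix in restricted_prefixes else prefix


def date_offset_days(participant_id, secret, max_days=182):
    """Derive a stable per-participant offset in [-max_days, max_days]."""
    digest = hmac.new(secret, participant_id.encode(), hashlib.sha256).digest()
    span = 2 * max_days + 1
    return int.from_bytes(digest[:8], "big") % span - max_days


def shift_date(d, participant_id, secret):
    """Shift a date by the participant's offset so intervals are preserved."""
    return d + timedelta(days=date_offset_days(participant_id, secret))


secret = b"replace-with-a-managed-secret"  # placeholder value
biopsy = date(2023, 3, 1)
follow_up = date(2023, 3, 15)

shifted_gap = shift_date(follow_up, "S-0001", secret) - shift_date(biopsy, "S-0001", secret)
print(age_band(34), age_band(93), zip3("02139", {"036"}))  # 30-39 90+ 021
print(shifted_gap.days)  # 14
```

Deriving the offset from a keyed hash keeps it stable for each participant without storing a per-person offset table, but anyone holding the secret can reverse the shift, so the secret needs the same protection as a linkage key.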
Pseudonymization versus de-identification
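Pseudonymized data remain PHI in the hands of anyone who holds or can obtain the key that re-links them to identities. A dataset is considered de-identified only when it meets one of the HIPAA standards: the listed identifiers are removed under safe harbor, or an expert determines that the re-identification risk is very small.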
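The separation between research records and the linkage key can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the `LinkageTable` class, the `CD-` prefix, and the `mrn` field name are hypothetical, and a real linkage table would live in a separate, access-controlled store. Random IDs are used so that the study ID is not derived from any information about the participant.

```python
import secrets


class LinkageTable:
    """Maps real identifiers to random study IDs; kept apart from research data."""

    def __init__(self):
        self._by_mrn = {}

    def study_id_for(self, mrn):
        # Assign a random ID on first sight, then reuse it for the same person.
        if mrn not in self._by_mrn:
            self._by_mrn[mrn] = "CD-" + secrets.token_hex(8)
        return self._by_mrn[mrn]


def pseudonymize(record, linkage, id_field="mrn"):
    """Replace the identifying field with a study ID from the linkage table."""
    out = {key: value for key, value in record.items() if key != id_field}
    out["study_id"] = linkage.study_id_for(record[id_field])
    return out


linkage = LinkageTable()
visit_1 = pseudonymize({"mrn": "12345", "ttg_iga": 42.0}, linkage)
visit_2 = pseudonymize({"mrn": "12345", "ttg_iga": 18.5}, linkage)

print("mrn" in visit_1)                              # False
print(visit_1["study_id"] == visit_2["study_id"])    # True
```

The output records can be linked across visits without exposing the medical record number, but they are not de-identified by this step alone.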
Ready to simplify HIPAA compliance?
Join thousands of organizations that trust Accountable to manage their compliance needs.
Data Access and Use Agreements
A Data Use Agreement (DUA) is mandatory for sharing a Limited Data Set. It specifies permitted uses, who may access the data, safeguards, reporting of violations, and return or destruction at project end.
For de-identified data, organizations commonly require a Data Access and Confidentiality Agreement to set expectations on no re-identification, redisclosure limits, security controls, and publication review. Internal access can be governed by role-based agreements and standard operating procedures.
Key clauses to include
- Permitted purposes and prohibition on attempts to identify individuals.
- Named authorized users and restrictions on subcontractors or downstream sharing.
- Security measures (encryption, access logs, multi-factor authentication) and breach notification timelines.
- Compliance with IRB approvals, training requirements, and data retention/destruction plans.
- Audit rights, sanctions for misuse, and acknowledgment of data provenance.
Data Security and Compliance Measures
Build a layered security program aligned to the HIPAA Security Rule: administrative, technical, and physical safeguards. Start with a risk analysis and implement controls proportionate to the sensitivity and scale of your registry.
- Administrative: policies and procedures, workforce training, vendor risk management, incident response, and contingency planning with tested backups.
- Technical: encryption in transit and at rest, role-based access, MFA/SSO, network segmentation, audit logging, and regular vulnerability management.
- Physical: controlled server room access, device and media controls, and secure disposal of hardware and paper records.
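Audit logging is easier to trust when tampering is detectable. One illustrative approach, which the Security Rule does not specifically require, is to chain entries by hash so that editing an earlier entry invalidates the chain. The entry fields, function names, and sample values below are assumptions for the sketch.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(body):
    """Hash an entry body in a stable key order."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, user, action, resource, timestamp):
    """Append an access event that commits to the previous entry's hash."""
    body = {
        "user": user,
        "action": action,
        "resource": resource,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    log.append({**body, "hash": _digest(body)})


def verify_chain(log):
    """Return True only if every entry matches its hash and its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "analyst_a", "read", "registry/S-0001", "2024-01-05T10:00:00Z")
append_entry(log, "curator_b", "update", "registry/S-0001", "2024-01-05T10:05:00Z")
print(verify_chain(log))  # True

log[0]["user"] = "someone_else"  # simulate tampering with an earlier entry
print(verify_chain(log))  # False
```

A chain like this detects edits but does not prevent them, so it complements write-restricted log storage rather than replacing it.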
Working with vendors and cloud platforms
Execute Business Associate Agreements when vendors handle PHI. Validate isolation of environments, data residency needs, logging/monitoring coverage, and documented change management. Limit production PHI in development or testing.
Ongoing HIPAA Compliance
Maintain a compliance calendar: periodic risk assessments, access reviews, training refreshers, tabletop exercises, and policy updates. Document decisions, exceptions, and mitigations to demonstrate due diligence.
Informed Consent and Participant Rights
The Informed Consent Process explains what data will be collected, how it will be used, risks and benefits, confidentiality protections, and withdrawal options. For research, consent is separate from a HIPAA Authorization, which permits use/disclosure of PHI for the study.
Participants have rights to access their information, request amendments, and receive an accounting of certain disclosures when applicable. If they revoke authorization or withdraw from the registry, you must stop new data collection, though uses and disclosures already made in reliance on the authorization may stand as permitted by law and policy.
Special considerations for genetic data and biospecimens
Genetic and biospecimen-linked data can carry heightened re-identification risk. Use conservative de-identification, clear consent language about future use and sharing, and governance that reviews secondary uses and re-contact plans.
Participant communications
Provide concise privacy notices, honor contact preferences, and create easy channels for questions or complaints. Be transparent about data sharing with collaborators and about any clinically actionable findings policies.
Conclusion
To run a compliant celiac disease registry, classify data accurately as PHI, de-identified, or a Limited Data Set; apply the minimum necessary standard; use the right agreements (DUA or Data Access and Confidentiality Agreement); and implement strong security with continuous oversight. Clear consent and participant-centered practices complete a defensible privacy program.
FAQs
What types of data in a celiac disease registry are considered PHI under HIPAA?
Any individually identifiable health information created or received by a covered entity or business associate qualifies as PHI. Beyond obvious items like names and contact details, PHI includes medical record numbers, full dates tied to care, ZIP codes and other geography below the state level, images that could identify a person, device IDs, IP addresses, and any other unique identifiers when linked to health details such as serology, biopsy results, or treatment history.
How does de-identification affect HIPAA requirements?
Once data are properly de-identified under HIPAA (safe harbor removal of all 18 identifiers or expert determination showing very small re-identification risk), HIPAA’s Privacy Rule no longer governs sharing of those data. However, institutional policies, contracts, or IRB conditions may still restrict use, and risk controls should persist to prevent re-identification.
When is a Data Use Agreement necessary for registry data?
A Data Use Agreement is required when sharing a Limited Data Set that excludes direct identifiers but retains elements like dates and general location. The DUA defines permitted uses, who may access the data, safeguards, reporting obligations, and return or destruction terms. For fully de-identified data, organizations often use a Data Access and Confidentiality Agreement instead.
How is informed consent managed in celiac disease registries?
For research registries, an IRB-approved Informed Consent Process explains collection, use, risks, protections, and withdrawal rights; a separate HIPAA Authorization usually permits use/disclosure of PHI. Operational registries may rely on HIPAA allowances without consent, but transparency and opt-out mechanisms are good practice. Participants can withdraw prospectively and exercise rights to access and request corrections where applicable.