2008 TEPR De-identification of clinical data 20080501.ppt ... TEPR De... · Info Security Analyst...

De-Identification of Clinical Data

TEPR Conference 2008
Ft. Lauderdale, Florida
May 17 - 21, 2008

Sepideh Khosravifar, CISSP
Info Security Analyst IV

Slide 1

cmw1 Craig M. Winter, 4/25/2008

Background

One of the major challenges facing Medical Informatics is creating data sets for research and testing that maintain patient confidentiality. De-identification is a required element of information integration, reducing the risks of unauthorized disclosure.

Anonymization

Anonymization is the process that removes the association between a dataset and the data subject. It can be done in the following ways:

(1) Removing or transforming identifying characteristics in the data set so that the association is not unique and relates to more than one data subject

(2) Increasing the population in the data subjects set so that the association between the data set and the data subject is not unique.

Source: ISO/IEC DTS 25237
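
Approach (1) can be illustrated with a minimal sketch. The field names (age, zip, diagnosis) and the generalization rules (ten-year age bands, three-digit zip prefixes) are assumptions chosen for illustration; they are not taken from the ISO specification. The check simply confirms that, after transformation, every combination of characteristics relates to more than one data subject.

```python
from collections import Counter

def generalize(record):
    """Coarsen identifying characteristics (hypothetical field names)."""
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "zip3": record["zip"][:3],
        "diagnosis": record["diagnosis"],
    }

def smallest_group(records, keys):
    """Size of the smallest group of records that share the same key values."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())

# Made-up sample data, not real patients.
patients = [
    {"age": 34, "zip": "33301", "diagnosis": "asthma"},
    {"age": 37, "zip": "33304", "diagnosis": "diabetes"},
    {"age": 52, "zip": "33311", "diagnosis": "asthma"},
    {"age": 58, "zip": "33312", "diagnosis": "hypertension"},
]

generalized = [generalize(p) for p in patients]
before = smallest_group(patients, ["age", "zip"])
after = smallest_group(generalized, ["age_band", "zip3"])
print(before, after)  # 1 2
```

A smallest group of 1 means at least one record is still unique on those characteristics; after generalization each combination here is shared by two records.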

Pseudonymization

Pseudonymization is a particular type of anonymization that both removes the association with a data subject and adds an association between a particular set of characteristics relating to the data subject and one or more pseudonyms. It provides a means for information to be linked to the same person across multiple data records without revealing the identity of the person as a data subject.

Source: ISO/IEC DTS 25237
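
The linking property can be sketched with a keyed hash. This is one possible technique, not one the slide or the ISO document prescribes; the key, the "PSN-" prefix, and the field names patient_id and name are all hypothetical. The same patient identifier always yields the same pseudonym, so records can be linked without carrying the identity.

```python
import hashlib
import hmac

# Assumption: a secret key held only by the pseudonymization service.
SECRET_KEY = b"example-key-held-by-pseudonymization-service"

def pseudonym(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym from a patient identifier with HMAC-SHA256."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "PSN-" + digest[:16]

def pseudonymize(record: dict) -> dict:
    """Drop direct identifiers (hypothetical fields) and attach the pseudonym."""
    out = {k: v for k, v in record.items() if k not in ("patient_id", "name")}
    out["pseudonym"] = pseudonym(record["patient_id"])
    return out

# Made-up records: two visits for one person, one visit for another.
visit_1 = pseudonymize({"patient_id": "MRN-001", "name": "Jane Doe", "test": "HbA1c"})
visit_2 = pseudonymize({"patient_id": "MRN-001", "name": "Jane Doe", "test": "lipids"})
other = pseudonymize({"patient_id": "MRN-002", "name": "John Roe", "test": "HbA1c"})

print(visit_1["pseudonym"] == visit_2["pseudonym"])  # True
print(visit_1["pseudonym"] == other["pseudonym"])    # False
```

Without the key, the pseudonym cannot be recomputed from a guessed identifier, which is why a plain unkeyed hash is usually considered weaker for this purpose.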

Re-identification

Pseudonymization through the trusted third party can support re-identification where the implementation requires re-identification such as supporting case investigation and other public health event detection and management. Reasons for re-identification that should be considered include:

– Verification and validation of data integrity
– Checking for suspected duplicate records
– Enabling requests for additional data
– Linking to supplement research information variables
– Compliance audits
– Informing data subjects or their care providers of significant findings
– Facilitating follow-up research
– Law enforcement.
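
A trusted third party that supports re-identification could be sketched as below. The class, its method names, and the purpose labels are hypothetical; the labels only mirror the reasons listed above. The sketch assumes the third party keeps the only copy of the pseudonym-to-identity mapping and records every re-identification request.

```python
import secrets

# Hypothetical labels for the re-identification reasons listed above.
ALLOWED_PURPOSES = {
    "data_integrity", "duplicate_check", "additional_data", "research_linkage",
    "compliance_audit", "notify_findings", "follow_up_research", "law_enforcement",
}

class TrustedThirdParty:
    """Holds the only mapping between identities and pseudonyms."""

    def __init__(self):
        self._to_pseudonym = {}
        self._to_identity = {}
        self.audit_log = []

    def pseudonymize(self, patient_id):
        """Return the existing pseudonym for a patient, or issue a new random one."""
        if patient_id not in self._to_pseudonym:
            token = "PSN-" + secrets.token_hex(8)
            self._to_pseudonym[patient_id] = token
            self._to_identity[token] = patient_id
        return self._to_pseudonym[patient_id]

    def reidentify(self, token, purpose):
        """Reverse a pseudonym only for an allowed purpose, and log the request."""
        if purpose not in ALLOWED_PURPOSES:
            raise PermissionError(f"purpose not permitted: {purpose}")
        self.audit_log.append((token, purpose))
        return self._to_identity[token]

ttp = TrustedThirdParty()
token = ttp.pseudonymize("MRN-001")
print(ttp.reidentify(token, "duplicate_check"))  # MRN-001
```

Random tokens, unlike derived ones, reveal nothing about the identifier even if the algorithm is known; the cost is that the mapping table itself must be protected and retained.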

Issues Requiring Consideration

Frequency and types of errors in de-identification method. De-id tools are subject to at least two types of errors:

(a) Failure to remove information that constitutes one of the 18 HIPAA Safe Harbor data elements (Undermarking),

(b) Removal of more information than is required (Overmarking), rendering records less useful and informative.
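
Both error types can be counted by comparing a tool's output with a manually annotated reference. The sketch below assumes each token position in a note is labeled either as an identifier or not; the function name and the sample positions are hypothetical.

```python
def marking_errors(reference_positions, tool_positions):
    """Compare token positions a reviewer marked as identifiers with those the tool removed."""
    reference, marked = set(reference_positions), set(tool_positions)
    undermarked = reference - marked  # identifiers the tool failed to remove
    overmarked = marked - reference   # non-identifying tokens the tool removed
    return undermarked, overmarked

# Made-up example: token positions in one clinical note.
reference = {1, 2, 7}    # positions a human reviewer marked as identifiers
tool_output = {1, 2, 5}  # positions the de-id tool removed

under, over = marking_errors(reference, tool_output)
print(sorted(under), sorted(over))  # [7] [5]
```

Undermarking is a privacy failure and overmarking is a utility loss, so the two counts are usually reported separately rather than combined into one score.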

NHIN Anonymization Guidelines

Anonymization Guidelines compared with HIPAA de-identification [45 CFR 164.514(b)(2)(i)]

HIPAA: (A) Names;
Guideline: 3) Replace patient, contact, next of kin, provider, technician and any other person name data with fabricated data.
Guideline: 10) Replace all employer, practice, laboratory, etc. names with fabricated names.

HIPAA: (B) All geographic subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code
Guideline: 6) Replace all geographic location data (patient, provider, etc.) smaller than a state with fabricated data, including street address, city, county, precinct and zip code.
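
The two treatments of geographic data can be contrasted in a short sketch. The field names, the fabricated values, and the low_population_zip3 parameter are assumptions for illustration. The full HIPAA text allows the three-digit zip prefix only when the area it covers holds more than 20,000 people and otherwise requires 000; the caller is assumed to supply the set of prefixes that fail that test.

```python
# Hypothetical field names for sub-state location data.
SUB_STATE_FIELDS = ("street", "city", "county", "precinct", "zip")

# Hypothetical fabricated replacements.
FABRICATED = {
    "street": "123 Example St",
    "city": "Anytown",
    "county": "Sample County",
    "precinct": "00",
    "zip": "00000",
}

def nhin_style(address):
    """Guideline 6: replace everything smaller than a state with fabricated data."""
    out = dict(address)
    for field in SUB_STATE_FIELDS:
        if field in out:
            out[field] = FABRICATED[field]
    return out

def safe_harbor_style(address, low_population_zip3=frozenset()):
    """Element (B): keep only the state and the first three zip digits."""
    zip3 = address.get("zip", "")[:3]
    if zip3 in low_population_zip3:
        zip3 = "000"
    return {"state": address.get("state"), "zip3": zip3}

addr = {"street": "1 Main St", "city": "Fort Lauderdale", "state": "FL", "zip": "33301"}
print(nhin_style(addr)["city"], nhin_style(addr)["state"])  # Anytown FL
print(safe_harbor_style(addr))  # {'state': 'FL', 'zip3': '333'}
```

The guideline is stricter than the HIPAA element here: it fabricates the whole zip code rather than retaining a prefix.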

NHIN Anonymization Guidelines

HIPAA: (C) All elements of dates (except year) for dates directly related to an individual, including admission date, discharge date, date of death; birth date, all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
Guideline: 2) Replace all registration data columns with fabricated data (except for: gender_code, date_of_birth)
Guideline: a) Offset date_of_birth by random number of days between 1 and 90 into the past.
Guideline: a) Offset all result dates by a random number of days (between 1 and 90) into the past.

HIPAA: (D) Telephone numbers;
Guideline: 5) Replace all telephone and fax numbers with a fabricated number, for example 222-555-1111. Use the fictitious exchange code “555” in all cases.
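
The date offset in the guidelines and the age aggregation in element (C) might look like the following sketch. The function names are hypothetical, and drawing a fresh offset for every date is an assumption; the guidelines do not say whether one offset is reused per patient.

```python
import random
from datetime import date, timedelta

def offset_into_past(original, rng):
    """Shift a date between 1 and 90 days into the past, as the guidelines describe."""
    return original - timedelta(days=rng.randint(1, 90))

def age_category(age):
    """Aggregate ages over 89 into a single category, per HIPAA element (C)."""
    return "90 or older" if age > 89 else str(age)

rng = random.Random()  # unseeded, so offsets are not reproducible
birth = date(1950, 6, 15)  # made-up date_of_birth
shifted = offset_into_past(birth, rng)

print(1 <= (birth - shifted).days <= 90)    # True
print(age_category(89), age_category(93))   # 89 90 or older
```

If intervals between a patient's events matter for research, one design choice is to draw a single offset per patient and apply it to all of that patient's dates, which preserves the spacing between events.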

NHIN Anonymization Guidelines

HIPAA: (E) Fax numbers;
Guideline: 5) Replace all telephone and fax numbers with a fabricated number, for example 222-555-1111. Use the fictitious exchange code “555” in all cases.

HIPAA: (F) Electronic mail addresses;
Guideline: 7) Replace all email addresses, URLs and IP addresses with fabricated data.

HIPAA: (G) Social security numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.
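
Guidelines 5 and 7 can be sketched for free text with regular expressions. The patterns below are simplified assumptions that cover common US phone formats and ordinary email addresses; a production tool would need broader patterns and testing against undermarking. The fabricated email address is hypothetical.

```python
import re

# Simplified patterns (assumptions): US-style phone/fax numbers and plain email addresses.
PHONE_RE = re.compile(r"(?<!\d)\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

FAKE_PHONE = "222-555-1111"         # example number given in guideline 5
FAKE_EMAIL = "patient@example.com"  # hypothetical fabricated address

def scrub_contacts(text):
    """Replace phone/fax numbers and email addresses with fabricated values."""
    text = PHONE_RE.sub(FAKE_PHONE, text)
    return EMAIL_RE.sub(FAKE_EMAIL, text)

note = "Call (954) 462-1234 or fax 954-462-9999; email jane.doe@hospital.example."
print(scrub_contacts(note))
# Call 222-555-1111 or fax 222-555-1111; email patient@example.com.
```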

NHIN Anonymization Guidelines

HIPAA: (H) Medical record numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.

HIPAA: (I) Health plan beneficiary numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.

HIPAA: (J) Account numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.
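
Guideline 4 could be sketched as a fabricator that swaps every digit for a random one while keeping the layout of the number. The class and its kind labels are hypothetical. Returning the same fabricated value for the same real value is an added assumption, so that records stay linkable inside one test data set; the guideline itself only asks for fabricated numbers.

```python
import secrets

class IdentifierFabricator:
    """Replaces identifier digits with random digits, keeping separators and length."""

    def __init__(self):
        self._mapping = {}

    def fabricate(self, kind, real_value):
        """Return a fabricated value; repeated calls for the same input agree."""
        key = (kind, real_value)
        if key not in self._mapping:
            # Note: a random result could by chance equal a real number;
            # a fuller implementation would check for and reject such collisions.
            self._mapping[key] = "".join(
                str(secrets.randbelow(10)) if ch.isdigit() else ch
                for ch in real_value
            )
        return self._mapping[key]

fab = IdentifierFabricator()
fake_ssn = fab.fabricate("ssn", "123-45-6789")          # made-up input
fake_mrn = fab.fabricate("medical_record", "MRN0004521")

print(fake_ssn == fab.fabricate("ssn", "123-45-6789"))  # True
print(len(fake_ssn), fake_mrn[:3])                      # 11 MRN
```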

NHIN Anonymization Guidelines

HIPAA: (K) Certificate/license numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.

HIPAA: (L) Vehicle identifiers and serial numbers, including license plate numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.

HIPAA: (M) Device identifiers and serial numbers;
Guideline: 4) Replace all Social Security Numbers, order numbers, account numbers, patient ID numbers, Medicare/Medicaid numbers, certificate or licensing numbers, etc., with fabricated numbers.

NHIN Anonymization Guidelines

HIPAA: (N) Web Universal Resource Locators (URLs);
Guideline: 7) Replace all email addresses, URLs and IP addresses with fabricated data.

HIPAA: (O) Internet Protocol (IP) address numbers;
Guideline: 7) Replace all email addresses, URLs and IP addresses with fabricated data.

HIPAA: (P) Biometric identifiers, including finger and voice prints;
Guideline: 11) Replace any other data that can be considered part of the HIPAA 18 individual identifiers.

NHIN Anonymization Guidelines

HIPAA: (Q) Full face photographic images and any comparable images; and
Guideline: 11) Replace any other data that can be considered part of the HIPAA 18 individual identifiers.

HIPAA: (R) Any other unique identifying number, characteristic, or code
Guideline: 2a) Retain gender_code.
Guideline: 11) Replace any other data that can be considered part of the HIPAA 18 individual identifiers.

HITSP Pseudonymize Transaction: Patient Pseudo Identifying Information

Person Identifier Cross-Reference (PIX) Manager Query

Patient Identity Feed

Standards

• Health Insurance Portability and Accountability Act (HIPAA)
• Health Level Seven (HL7)
• Integrating the Healthcare Enterprise (IHE) IT Infrastructure Technical Framework (ITI-TF)
• International Organization for Standardization (ISO) Health Informatics - Pseudonymization, Technical Specification # 25237

Summary

Data de-identification systems can help accomplish organizations' goals of improving quality of care, promoting research, and protecting privacy. However, producing anonymous data that remains specific enough to be useful is often a very difficult task. Although new technology offers some good choices, technical solutions alone remain inadequate. Technology must work with policy for the most effective solutions.

References

• ISO/IEC DTS 25237, “Pseudonymization Practices for the Protection of Personal Health Information and Health Related Services”
• HITSP Pseudonymize Transaction Ready for Implementation V2.1
• Nationwide Health Information Network (NHIN)

Contact Information

Sepideh Khosravifar, CISSP
For Department of Veteran Affairs
SAIC - Information Security Analyst
[email protected] office

Questions?