463.9 Health Information Technology · 2020. 3. 5.
463.9 Health Information Technology
Computer Security II
CS463/ECE424
University of Illinois
Announcement
• Midterm
  – March 10 (Tue), during class, this room (SC 0216)
  – Closed book, closed note, no calculator
  – 12 multi-choice questions + 6 essay questions (sample questions posted via Git)
• Tips
  – Focus on slides, watch videos (https://echo360.org/)
  – Practice with quiz questions and sample questions
Outline
• Privacy and genomic data
• Privacy protecting genomic research using
  – Private set intersection
  – SGX
Genome
• Contains all of the biological information needed to build and maintain a “living example” of an organism
• Encoded in DNA, a polymer of nucleotides
  – A, G, C, T
• Human Genome:
  – Approximately 3 billion nucleotides
  – Stored in 23 chromosome pairs (plus mtDNA)
[BaldiBDGT11]
Cost Per Genome
New Frontiers
• Better understanding of human genome
• Many individuals have access to key parts of their genomes
• Precision medicine enabled
• Testing possible not only in-vitro but also in-silico
Privacy Concerns
• Genomic data carry sensitive information that may reveal
  – identity,
  – predisposition to diseases,
  – and even facial features.
• Disclosure may propagate the privacy risks to blood relatives.
• Individuals have marked differences in the way they want their data utilized for research.
• Data are irrevocable once they are disseminated.
• New privacy threats may emerge over time with new discoveries of human genetics and the advance of attack methods.
  – Many aggregate results have been removed from the public domain hosted by NIH.
Genetic Exceptionalism
How Special is Genomic Data?
Evans, James P., and Wylie Burke. "Genetic exceptionalism. Too much of a good thing?." Genetics in Medicine 10, no. 7 (2008): 500-501.
McGuire, Amy L., Rebecca Fisher, Paul Cusenza, Kathy Hudson, Mark A. Rothstein, Deven McGraw, Stephen Matteson, John Glaser, and Douglas E. Henley. "Confidentiality, privacy, and security of genetic and genomic test information in electronic health records: points to consider." Genetics in Medicine 10, no. 7 (2008): 495-499.
[Naveed15]
PETs for Computation on Genomic Data
• Homomorphic encryption
• Differential privacy
• Secret sharing
• Secure multi-party computation (MPC)
  – Garbled circuits
• Secure two-party computation
  – Private Set Intersection (PSI)
• Trusted execution environments
  – SGX
Privacy-Preserving Genetic Paternity Test (1 of 2)
Strawman Approach for Paternity Test
• On average, ~99.5% of any two human genomes are identical
• Parents and children have even more similar genomes
• Compare candidate’s genome with that of the alleged child:
  – Test positive if % of matching nucleotides is > 99.5 + τ
First-Attempt Privacy-Preserving Protocol
• Use an appropriate secure two-party protocol for the comparison
• PROs: High-accuracy and error resilience
• CONs: Performance not promising (3 billion symbols in input)
  – Experiments showed computation takes a few days
[Baldi11]
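A minimal sketch of the strawman comparison above, computed in the clear with no privacy protection. The function names, the toy 1,000-nucleotide sequences, and the assumption that both genomes are already aligned and of equal length are illustrative and not taken from [Baldi11].

```python
def matching_percentage(genome_a: str, genome_b: str) -> float:
    """Percentage of positions at which two aligned sequences agree."""
    if len(genome_a) != len(genome_b) or not genome_a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    matches = sum(1 for x, y in zip(genome_a, genome_b) if x == y)
    return 100.0 * matches / len(genome_a)


def strawman_paternity_test(candidate: str, child: str, tau: float,
                            baseline: float = 99.5) -> bool:
    """Positive if the matching percentage exceeds baseline + tau."""
    return matching_percentage(candidate, child) > baseline + tau


def substitute(seq: str, positions: list[int]) -> str:
    """Hypothetical helper: change the nucleotide at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)


child = "ACGT" * 250                               # toy sequence, 1,000 nucleotides
father = substitute(child, [10, 500])              # differs at 2 positions
stranger = substitute(child, [1, 2, 3, 4, 5, 6])   # differs at 6 positions

print(strawman_paternity_test(father, child, tau=0.2))
print(strawman_paternity_test(stranger, child, tau=0.2))
```

On real inputs the loop runs over roughly 3 billion symbols, which is why running the same comparison inside a generic secure two-party protocol is so slow.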
Privacy-Preserving Genetic Paternity Test (2 of 2)
• Improved Protocol
  – ~99.5% of any two human genomes are identical
  – Why don’t we compare only the remaining 0.5%?
But… We don’t know (yet) where exactly these 0.5% occur!
Using Private Set Intersection Cardinality for privacy-preserving comparison, it takes about 1 hour
Image from dna-testing-for-paternity.com
Private Set Intersection Cardinality (PSI-CA)
[Figure: the server holds S = {s1,…,sw} and the client holds C = {c1,…,cw}. After the protocol, the server's output is ⊥ (nothing) and the client's output is |S ∩ C|.]
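A toy sketch of one way to realize PSI-CA, using commutative blinding by modular exponentiation plus shuffling. This is a textbook-style construction chosen for brevity and is not necessarily the protocol of [BaldiBDGT11]. The prime P, the hashing step, and all function names are assumptions; the parameters are far too weak for real use, and both parties are simulated in a single process.

```python
import hashlib
import math
import secrets

# Toy modulus: the Mersenne prime 2**127 - 1 (illustration only, not secure).
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an item to a nonzero square modulo P (collisions are negligible)."""
    digest = hashlib.sha256(item.encode()).digest()
    return pow(int.from_bytes(digest, "big") % P, 2, P)


def random_exponent() -> int:
    """Secret exponent coprime to P - 1, so that blinding is one-to-one."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind_and_shuffle(values, key: int) -> list[int]:
    """Raise every value to the secret key, then hide the original order."""
    out = [pow(v, key, P) for v in values]
    secrets.SystemRandom().shuffle(out)
    return out


def psi_ca(client_set, server_set) -> int:
    a = random_exponent()  # known only to the client
    b = random_exponent()  # known only to the server

    # Client -> server: client items blinded with a.
    msg1 = blind_and_shuffle([hash_to_group(c) for c in set(client_set)], a)

    # Server -> client: client items re-blinded with b and reshuffled,
    # plus the server's own items blinded with b.
    client_ab = blind_and_shuffle(msg1, b)
    server_b = blind_and_shuffle([hash_to_group(s) for s in set(server_set)], b)

    # Client: finish blinding the server items and count equal values.
    server_ab = {pow(v, a, P) for v in server_b}
    return sum(1 for v in client_ab if v in server_ab)


print(psi_ca({"rs1", "rs2", "rs3"}, {"rs2", "rs3", "rs4"}))
```

Because the server reshuffles the client's blinded items before returning them, the client can count matches but cannot tell which of its own items matched; the server sees only blinded values and learns nothing about the result.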
PPGT Strategy
• In-vitro emulation – RFLP-based paternity test
  – Restriction Fragment Length Polymorphism (RFLP) analysis: a difference between samples of homologous DNA molecules from differing locations of restriction enzyme sites
  – DNA sample is cut into fragments by enzymes
    • Fragments separated according to their lengths by gel electrophoresis
    • Paternity test is positive if enough fragments have the same length
• RFLP-based PPGPT – Reduction to PSI-CA
  – Participants: “client” (receives the result), “server” (remains oblivious)
  – Public input: τ, enzymes E = {e1,...,ej}, markers M = {mk1,...,mkl}
  – Private input: digitized genomes
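A rough sketch of the in-silico RFLP emulation and its reduction to a set-intersection count. Several simplifications are assumptions made here: each enzyme is modeled by a recognition string and cuts at the start of every occurrence, a fragment is kept only if it contains a marker, τ is treated as the minimum number of matching fragments, and the intersection is computed in the clear where the private version would run PSI-CA. The sequences and function names are invented for illustration.

```python
def digest(genome: str, enzyme_sites: list[str]) -> list[str]:
    """Cut the genome at the start of every occurrence of any recognition site."""
    cuts = {0, len(genome)}
    for site in enzyme_sites:
        start = genome.find(site)
        while start != -1:
            cuts.add(start)
            start = genome.find(site, start + 1)
    ordered = sorted(cuts)
    return [genome[a:b] for a, b in zip(ordered, ordered[1:])]


def marked_fragment_lengths(genome: str, enzyme_sites: list[str],
                            markers: list[str]) -> set[tuple[str, int]]:
    """(marker, fragment length) pairs for fragments that contain a marker."""
    return {(m, len(frag))
            for frag in digest(genome, enzyme_sites)
            for m in markers if m in frag}


def rflp_test(client_genome: str, server_genome: str, enzyme_sites: list[str],
              markers: list[str], tau: int) -> tuple[int, bool]:
    client_set = marked_fragment_lengths(client_genome, enzyme_sites, markers)
    server_set = marked_fragment_lengths(server_genome, enzyme_sites, markers)
    matches = len(client_set & server_set)  # PSI-CA output in the private version
    return matches, matches >= tau


E = ["GAATTC"]            # EcoRI recognition sequence
M = ["ACGTAC", "TTGACA"]  # toy markers
father = "AAACGTACAA" + "GAATTC" + "CCTTGACACC" + "GAATTC" + "GGGG"
child = father
stranger = "AAACGTACAATT" + "GAATTC" + "CCTTGACACC" + "GAATTC" + "GGGG"

print(rflp_test(father, child, E, M, tau=2))
print(rflp_test(father, stranger, E, M, tau=2))
```

Only the small sets of (marker, length) pairs enter the comparison, not the 3 billion nucleotides, which is what makes the PSI-CA step cheap.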
Privacy-Preserving RFLP-based Paternity Test
[Figure: Private Set Intersection Cardinality yields the test result (number of fragments with the same length).]
Remarks
• Why compare fragment lengths?
  – Isn’t it more accurate to compare actual contents?
  – In reality, RFLP yields “false positives” with very low probability
  – This approach increases resilience to sequencing errors
• Performance Evaluation
  – About 1 min pre-processing to emulate enzyme digestion process
  – About 10 ms computation time on Intel Core i5 with 25 fragments
  – Less than 1 s on a smartphone (Nokia N900, 600 MHz CPU)
  – Extending to 50 fragments doubles computation time and increases accuracy by orders of magnitude
  – Communication overhead: only a few KBs
Kawasaki Disease
• Kawasaki Disease, also known as KD or mucocutaneous lymph node syndrome, is a disease in which blood vessels throughout the body become inflamed.
• It is rare (about 1 in 1000 under age of 5).
• Its cause is not well understood, but there seem to be both genetic and environmental effects.
• It can be serious and hard to treat.
PRINCESS Study on Kawasaki
• Challenge to get enough subjects to provide statistical power for studies of a rare disease.
  – No studies on KD genomics for African Americans
• Sharing genomic data is complicated by privacy rules of many institutions and governments.
• Approach: PRINCESS framework for Privacy-protecting Rare disease International Network Collaboration via Encryption through Software guard extensionS
[Chen16]
PRINCESS Framework
Genome Analysis of KD Trios
• Transmission Disequilibrium Test (TDT) is a family-based test for disease traits that uses the genotype information from both parents and a child.
• Used to test seventy-two Kawasaki disease (KD) children and their biological parents from:
  – Rady Children’s Hospital San Diego (RCHSD) (N = 45),
  – Emory University (N = 21) in Atlanta, and
  – Imperial College in London (N = 6)
• Examined > 695,784 SNPs
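A small sketch of the standard TDT statistic for a single SNP, computed in the clear. With b the number of times heterozygous parents transmit the allele of interest to the affected child and c the number of times they do not, the statistic is (b − c)² / (b + c), compared against a chi-square distribution with 1 degree of freedom. The genotype encoding (count of the allele of interest: 0, 1, or 2), the function name, and the toy trios are assumptions; this is not the PRINCESS implementation.

```python
import math


def tdt(trios):
    """trios: iterable of (father, mother, child) genotypes, each coded as the
    number of copies (0, 1, or 2) of the allele of interest."""
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        het = sum(1 for g in parents if g == 1)
        if het == 0:
            continue  # only heterozygous parents are informative
        # Copies the child must have received from homozygous parents.
        from_hom = sum(g // 2 for g in parents if g != 1)
        transmitted = child - from_hom
        if not 0 <= transmitted <= het:
            raise ValueError("genotypes are not Mendelian-consistent")
        b += transmitted
        c += het - transmitted
    if b + c == 0:
        raise ValueError("no heterozygous parents in the sample")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return b, c, stat, p_value


toy_trios = [(1, 0, 1), (1, 0, 1), (1, 2, 2), (1, 1, 2), (1, 0, 0)]
b, c, stat, p_value = tdt(toy_trios)
print(b, c, round(stat, 3), round(p_value, 3))
```

The statistic only needs the two counts b and c per SNP, so each site can contribute its trios and the counts can be aggregated across sites.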
PRINCESS Security Architecture
PRINCESS Data Management
Efficiency Hypothesis
• Solutions based on SGX are not expected to introduce significant computational overhead or big restrictions on data analysis operations.
• By contrast, these are common to software-based techniques such as the SMC (garbled circuit) FlexSC framework and Homomorphic Encryption based HElib framework.
• SGX therefore makes secure large-scale, inter-continental, genetic analysis feasible in practice.
Performance Study
• Used simulated data to produce scale tests.
• Used AWS nodes for multiple clients (up to 12).
• Assorted conservative compromises were made in measuring crypto analytics
  – For instance, HElib does not support division, so only addition and multiplication operations were measured.
Performance Comparison
Performance Breakdown
Identified SNPs
References
• [BaldiBDGT11] Countering GATTACA: Efficient and Secure Testing of Fully-Sequenced Human Genomes. Pierre Baldi, Roberta Baronio, Emiliano De Cristofaro, Paolo Gasti, and Gene Tsudik, CCS 2011.
• [Naveed15] Privacy in the Genomic Era, Muhammad Naveed, Erman Ayday, Ellen W. Clayton, Jacques Fellay, Carl A. Gunter, Jean-Pierre Hubaux, Bradley A. Malin, and XiaoFeng Wang. ACM Computing Surveys 48, 1, Article 6, August, 2015. Associated online tutorial on genomics for computer scientists.
The Genomic Data Chasm
Clinical
• Tests for specialists seeking genetic markers for diagnosis or treatment of specific conditions
• Whole Genome Sequencing (WGS) for
  – PCPs who want to identify high likelihood concerns
  – Researchers
  – Subsequent “in silico” testing
Direct to Consumer (DTC)
• Enables broad access at low cost for diverse reasons
• Examples: paternity testing and genealogy studies
• Controversial issues with quality of results and their interpretation
• Disruptive influence on clinical testing
General Practitioner Report versus Medical Geneticist Report
Design Approaches for the Display of Genetic Test Results, C Bushell, M Ferber, L Gatzke, K Johnson, V Jongeneel, K Schahl. Individualizing Medicine Conference, 2012.
Sample DTC Report on Genomic Susceptibility to Disease
23andme.com.
Details on Markers
23andme.com.
Adventurous DTC Vendors
genepartner.com.
Discussion
• What security and privacy issues are raised by DTC genomics?
• How would you like to see your DNA data managed? What about the DNA of your relatives?
• Should it be legal to obtain your DNA without your consent?