INNOVATIONS IN MEDICAL GENOMICS: HOW TO ENABLE ADVANCES WHILE MANAGING PRIVACY AND SECURITY RISKS?
Forrest Briscoe ([email protected]) and Barbara Gray ([email protected])*
with assistance from Celeste Diaz Ferraro
Penn State University
SEPTEMBER 2017
“How do we support the very noble cancer moonshot and a healthier society and environment, without eroding what are truly our fundamental values of individualism and privacy?” (Genomics company CEO)
AS GENOMICS IS INTEGRATED INTO HEALTH CARE DELIVERY, NEW PRIVACY AND DATA-SECURITY RISKS NEED TO BE CONSIDERED
Human genome sequencing is transforming health care. Sequencing is the innovation at the heart of the shift to precision medicine, reflected in the 21st Century Cures Act (2016) and other recent initiatives. As sequencing becomes incorporated into routine medical care, millions of Americans will have their genomic data generated in order to diagnose, treat, or prevent disease. At the same time, that data will be added to genome databases that are proliferating across the country. These databases are necessary to realize progress in medical care – but they also carry particular privacy and security risks that are not widely appreciated. In particular, genomic data reveal detailed physical and mental characteristics for each person sequenced, as well as for his or her family members and offspring for generations. These databases are inevitably vulnerable to breaches, and their contents cannot be anonymized effectively.
In this white paper, we summarize privacy and security risks associated with medical genomics, and indicate some of the options for risk mitigation that are being developed.1 Some options could affect the pace of innovation and progress in science and medicine. Hence, leaders in the genomics field are debating these risks and benefits. Yet the broader public also needs to be engaged in this debate, since their genomes are entering the genome databases. Citizens and policymakers need to grapple with important trade-offs inherent in how we treat genomic data. Hence we conclude with three questions about genomics that need to be debated by informed American citizens and policymakers: Who should have the right to make decisions about your genome? How closely should you hold onto your genome? What standards are desirable for securing genomic databases? Addressing these questions will help ensure that we strike the right balance of benefits and risks in this rapidly developing field.
Privacy & Security in Medical Genomics: A Primer and Call for Public Engagement
ADVANCES IN GENOME SEQUENCING TECHNOLOGY ARE TRANSFORMING HEALTH CARE AND THE LIFE SCIENCES
The genome is that unique sequence of deoxyribonucleic acid (DNA) molecules which represents each individual’s biochemical code.2 The first human genome was sequenced over 15 years ago, and now the promise of genomics is being realized. In that time, advances in the technologies used to generate and digitally store that sequence have made it feasible to incorporate genomic data into a wide range of medical, scientific and commercial activities. In medicine, genomics is moving to center stage in the search for effective therapies and drugs; it is also being incorporated into the clinical diagnosis, treatment, and prevention of disease. Genomics is also transforming many scientific fields, from biology and anthropology to public health and the social sciences, shedding new light on research topics from human evolution to societal inequality. And a range of commercial firms, such as Google, Apple, IBM, Amazon and Alibaba, aim to use genomics in order to tailor consumer products and services according to users’ genetic profiles, allowing these companies to more precisely manage relations with users and anticipate those users’ needs and activities.
The field of genomics is large, complex, and rapidly expanding. A diverse ecosystem of organizations is involved, including hospitals and health care organizations, academic departments and research teams, pharmaceutical and biotechnology firms, consumer technology companies, genomics device manufacturers and sequencing labs, grant funding organizations, venture capitalists, regulatory agencies, citizen science and patient advocacy organizations, and more. Major genomics entities are organized as for-profit companies, non-profit organizations, and government agencies, reflecting a range of differing interests and governance structures. A web of inter-organizational partnerships and alliances crisscrosses the field. A 2013 study estimated the genomic sector’s contribution to the economy at $25 billion in direct output and $40 billion in indirect output; observers anticipate the footprint of the sector will grow rapidly in the coming decade.3
In this white paper, we focus on medical genomics, which is the largest component of the overall genomics field. Changes now occurring in the health care system suggest that in the next few years a significant portion of Americans will be asked to have their genome sequenced as part of routine medical care.4 This development has far-reaching consequences that are only dimly appreciated at present. Our aim in this paper is to provide an initial risk assessment focused on the privacy and security consequences of this development. In particular, we focus on privacy risks. Medical genomics has great potential to improve the human condition, and our aim is not to question that potential. Rather, we hope that in sensitizing the public to more of the implications of medical genomics, we can stimulate public discussion and deliberation on the best ways to balance privacy concerns with other interests driving this field.
CLINICAL HEALTH CARE IS THE GATEWAY THROUGH WHICH A MAJORITY OF AMERICANS ARE LIKELY TO BE INTRODUCED TO SEQUENCING
With rapid advances in the feasibility of genome sequencing, leaders in health care are now forging a key role for medical genomics in the delivery of clinical patient care. Indeed, in major hospitals and health care organizations, medical genomics is already being integrated with patient data and other personal health information, so they can be used together in diagnosing and treating patients.5 At a recent conference, a panel of leading experts concluded that within a decade, over 1 billion humans worldwide will be sequenced, and that sequencing will have become part of routine patient care in the U.S.6 For example, a patient presenting symptoms of diabetes will have her genome analyzed for mutations known to make common diabetes drugs ineffective; if found, the patient’s physician would be informed and a different drug prescribed. A patient arriving in the E.R. with chest pains will be analyzed for genetic heart-rhythm mutations which, if found, would alter critical care delivery in life-saving ways.7
This incorporation of genomic data into clinical care takes us toward an individualized approach to health care that is being called precision medicine. Proponents argue that as the field advances, the practice of medicine will be revolutionized through targeted therapies and custom drugs based on knowledge of a specific patient’s genome. Already, prior to being prescribed a drug such as warfarin or Plavix, a patient can have his or her genome checked for mutations that are known to reduce the efficacy of those drugs.8 In addition, genomic testing for prostate cancer can now identify patients at risk of the more aggressive form of the disease, leading to more targeted screening and treatment of those patients.9 Routine “cell-free DNA” sequencing on healthy patients is aimed at early detection of a variety of cancers. Powerful new gene editing capabilities are also now making it possible to curtail, and even cure, certain diseases by identifying the offending mutations in an individual and then using gene therapy to fix them. In 2016, such approaches were found successful against diseases including hemophilia and non-Hodgkin’s lymphoma.10 As these examples illustrate, precision medicine involves the intensive use of genomics in biomedical research, as well as the routine use of genomics in clinical diagnosis and treatment.
GENOMICS DATABASES PLAY A CRUCIAL – AND CONTROVERSIAL – ROLE
The linchpin of this revolution is the genomics database. The reuse, combination, and sharing of human genomic data in order to create these databases is an essential requirement for realizing the medical advances chronicled above.
To see why this is the case, consider how a scientist discovers the genetic mutations (called variants) that cause a specific disease. At a minimum, she needs to compare genomes from many people with the disease, against genomes from many people without the disease. Both groups need to be present in sufficient numbers in order to be statistically confident about the association between the variants and the disease. If the disease is rare, the database needs to be larger in order to include enough disease-carrying individuals. If the disease has many variants associated with it, and they occur in complex combinations – a situation that is common – then an even larger database is needed to ensure all those combinations are included. This statistical comparison process is made more challenging because of the sheer size of the human genome. Although most of the 6 billion base pairs in each genome are identical across individuals, there are still a few million base pairs that vary between people that may need to be analyzed for potential disease-causing variants.
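
The sample-size reasoning in this paragraph can be illustrated with a small sketch. It is not taken from the white paper: the function names, carrier counts, and prevalence figures are hypothetical, and a simple Pearson chi-square test on a 2x2 table stands in for the more elaborate association methods used in practice.

```python
def expected_cases(database_size, prevalence):
    """Expected number of people with the disease in a database of this size."""
    return database_size * prevalence


def chi_square_2x2(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 case-control table."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical numbers: the variant is carried by 60% of cases and 40% of controls.
small_study = chi_square_2x2(6, 4, 4, 6)          # 10 cases, 10 controls
large_study = chi_square_2x2(600, 400, 400, 600)  # 1,000 cases, 1,000 controls

# 3.84 is the conventional 5% critical value for 1 degree of freedom.
print(f"small study: {small_study:.1f}, significant: {small_study > 3.84}")
print(f"large study: {large_study:.1f}, significant: {large_study > 3.84}")

# A rare disease (assumed prevalence 1 in 10,000) yields far fewer cases than a
# common one (assumed 1 in 100) in a database of the same size.
print(expected_cases(1_000_000, 1 / 10_000))
print(expected_cases(1_000_000, 1 / 100))
```

With identical carrier frequencies, only the larger study clears the significance threshold, which is why databases must grow as diseases become rarer or variant combinations more numerous.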
The net result of this situation is that researchers require access to large databases comprised of DNA data from thousands – and in some cases millions – of individuals. What’s more, those databases are most useful to researchers when they are linked to additional information about the individuals within them. For example, to identify the genomes of disease-carrying individuals and disease-free individuals, genomic data may need to be linked to medical records that report the disease status of each person in the database. Hence efforts are underway to compile and connect multiple databases together, to create ever-larger pools of genomic data coupled with other medical and personal information on the individuals contained in them.
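
As a minimal sketch of the linkage described above, the following joins a genomic dataset to medical records by a shared participant identifier in order to separate disease-carrying from disease-free individuals. The identifiers, field names, and data are hypothetical and chosen only for illustration.

```python
def label_cases_and_controls(genotypes, medical_records, disease):
    """Split participant IDs into cases and controls using linked medical records."""
    cases, controls = [], []
    for participant_id in genotypes:
        record = medical_records.get(participant_id)
        if record is None:
            continue  # no linked record, so disease status is unknown: exclude
        if disease in record["diagnoses"]:
            cases.append(participant_id)
        else:
            controls.append(participant_id)
    return cases, controls


# Hypothetical genomic data keyed by participant ID.
genotypes = {
    "P001": {"snp_a": "AG"},
    "P002": {"snp_a": "AA"},
    "P003": {"snp_a": "GG"},
}
# Hypothetical medical records; P003 has no linked record.
medical_records = {
    "P001": {"diagnoses": {"type 2 diabetes"}},
    "P002": {"diagnoses": set()},
}

cases, controls = label_cases_and_controls(genotypes, medical_records, "type 2 diabetes")
print(cases, controls)
```

The same shared identifier that makes the genomic data useful for research is also what ties it to a person's medical history.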
Many people have – wittingly or unwittingly – contributed their DNA sequences to these databases simply by participating in medical testing or by using direct-to-consumer genomics testing services, often without realizing their data has now become available for academic and medical research, and in some cases also for additional commercial research. Although contributors sign consent forms mentioning their data will be reused, few seem aware of the implications, as we describe below. On the other hand, actors in the genomics field are well aware of the value of genomics databases. As a result, health care organizations, pharmaceutical companies, scientific teams, government agencies, and entrepreneurs are all engaged in a genomics database gold-rush, trying to assemble, secure, monetize and mine these databases for the secrets they hold. Such databases are quietly regarded as strategic assets by forward-thinking medical organizations such as Kaiser Permanente, Geisinger Health System, and the Veterans Health Administration, by pharmaceutical firms such as Roche and AstraZeneca, and by consumer technology firms such as Google and 23andMe.
DISCLOSED GENOMIC DATA REVEAL FUTURE HEALTH, BEHAVIORAL TENDENCIES, AND OTHER TRAITS, PLUS DETAILS OF FAMILY MEMBERS
Despite the increase in and popularization of genomics, the risks associated with this field in general, and genomic databases in particular, may be largely underappreciated by patients. The latent value (and latent harm) of disclosing one’s genomic information is already significant. Consider, for example, that genomic information now reveals an individual’s susceptibility to substance abuse, odds of developing depression and early Alzheimer’s disease, a baseline for innate mathematical ability, tendency toward aggression, and probabilities for developing a wide range of both common and rare genetic diseases.11 Further, advances in statistical analysis mean that even when one part of a person’s genome is disclosed, data from other parts of that person’s genome can be inferred with relative certainty. In one famous example, when James Watson (the co-discoverer of the structure of DNA) made his genome publicly available, he withheld data for the variant that predicts early Alzheimer’s disease. But scientists showed how to infer that variant using other parts of his sequence.12
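
The kind of inference described above can be sketched with a toy example. It is a deliberately simplified assumption, not the method used in the Watson case: it supposes a hypothetical reference panel in which a disclosed neighboring variant tends to be inherited together with the withheld one, and estimates the withheld allele by simple counting.

```python
from collections import Counter


def infer_hidden_allele(panel, observed_allele):
    """Guess a withheld allele from a disclosed neighboring allele.

    panel: (neighbor_allele, hidden_allele) pairs from a reference population.
    Returns the most likely hidden allele and its estimated probability.
    """
    counts = Counter(hidden for neighbor, hidden in panel if neighbor == observed_allele)
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    allele, count = counts.most_common(1)[0]
    return allele, count / total


# Hypothetical panel of 100 people: neighbor allele "A" usually travels with "risk".
PANEL = (
    [("A", "risk")] * 45 + [("A", "normal")] * 5
    + [("G", "risk")] * 5 + [("G", "normal")] * 45
)

guess, probability = infer_hidden_allele(PANEL, "A")
print(guess, probability)
```

Withholding a single variant offers limited protection when nearby disclosed variants are strongly correlated with it.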
Further, disclosure that occurs today will also increase one’s vulnerability to future risks that are as yet unknown. As the genomic revolution advances, both the value – and the potential harm – of an individual’s genomic data will continue to increase, long after the moment when he or she was initially sequenced. Within 5 years, expanded sequencing of a person’s metagenome, which includes his or her personal microbiome, will reveal further details of a “genetic fingerprint,” encompassing fine-grained information about ethnicity/national origin, places an individual has visited, and even recent contact with other people. Clinical collection of expanded metagenomic data will increase in the near future, as scientists uncover the role of the microbiome in triggering and regulating disease, and in serving as indicators of the environmental exposures affecting disease.13 Metagenomic data will also improve on existing abilities to predict an individual’s socioeconomic circumstances from his or her genome.14
Importantly, genomic disclosure risks also extend to family members. Because so much of one person’s DNA is identical to that of his or her relatives, including current and future descendants, these disclosures affect not only an individual but also their family members. Knowledge of one individual’s genome, or even just some of their variant data, can reveal useful information on the disease probabilities and personal characteristics of that person’s parents, siblings, children, and even extended kin up to 5 degrees of separation.15
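
To give a rough sense of scale, the sketch below uses a standard approximation from genetics that is not stated in the white paper: the expected fraction of inherited DNA shared between two relatives halves with each degree of relatedness (parents, siblings, and children are first-degree relatives). The function name is hypothetical, and actual sharing between any two relatives varies around these averages.

```python
def expected_shared_fraction(degree):
    """Expected fraction of inherited DNA shared with a relative of the given degree."""
    if degree < 1:
        raise ValueError("degree of relatedness must be at least 1")
    return 0.5 ** degree


# Expected sharing for first- through fifth-degree relatives.
for degree in range(1, 6):
    print(degree, f"{expected_shared_fraction(degree):.1%}")
```

Even at five degrees of separation the expected overlap is not zero, which is consistent with the point that one person's disclosure carries information about extended kin.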
Given the inherent value in personal genomic data, many entities may have an interest in acquiring it – including employers, educational institutions, insurance and financial firms, and even romantic partners. Although the 2008 Genetic Information Nondiscrimination Act (GINA) currently protects against abuse by some of these parties (specifically, employers and health insurers are prohibited from discriminating against individuals based on genomic data), it does not cover life, disability or long-term care insurance. Educational or financial discrimination is also possible, if educators or lenders – which are not subject to GINA’s non-discrimination provisions – begin to screen applicants according to desirable genomic features. Genomics data is already being used to determine the eligibility of some students to participate in collegiate athletics.16 State and non-state actors also have a strong interest in obtaining genomic data (see text box).17

State and Non-State Actors Also Seek Genomic Data
Beyond these market actors, who else is interested in genomic data? State and non-state actors also have an interest in gaining access to genomic data for a country’s citizens. Foreign entities have conducted large-scale hacking of U.S. health care databases. Such data can provide valuable medical and behavioral insights on a target country’s future business leaders and governmental officials. As such, these data will have increasing value on the black market, to both criminals and states. If genomic signatures become incorporated into cryptographic and information security tools, the value of obtaining such genomic data will increase further.

A potential parallel to the current moment in genomic data disclosure might exist in the early years of online social media. Early social media users failed to anticipate how disclosure of their embarrassing or dubious pictures on their Facebook page might jeopardize their future employment options. As people became more aware of how online data could be used against them, social media users appear to have become more careful, and social media companies have responded with more options for individuals to control their data and limit disclosure risks. For DNA data, although GINA currently offers some protections against the use of disclosed genetic information, those protections are incomplete, and unlikely to keep pace with future attributes yet to be revealed from previously released DNA.
In short, genomic data appears to be uniquely informative of an individual’s future life expectations, challenges and opportunities and, ominously, those of their progeny as well. This suggests that an individual’s data should be treated with unique care. But as we describe in more detail below, there are many scenarios in clinical use and beyond in which identifiable genomic data can potentially be released – intentionally or not – and cause harm to an individual and their family members.
THE LEGAL AND REGULATORY FRAMEWORK TO PROTECT GENOMIC DATA IS LIMITED
As the value and risk associated with genomic data disclosure come into focus, the U.S. legal-regulatory framework protecting it has not kept up. Even when genomic data is generated in the health care setting, it is not protected in the same manner as a patient’s personal health information. Personal health information is protected under the Health Insurance Portability and Accountability Act (HIPAA), which circumscribes how that information can be stored, used, and shared. No clear regulatory framework for protecting genomic data exists. In its absence, organizations appear to be taking an approach with three basic pillars – consent, de-identification, and cybersecurity – each of which suffers from significant limitations.18
Consent is the legal process through which individuals provide approval for their genomic data to be generated and used. In part to facilitate genomics research, a new model of “broad consent” has been developed, under which patients and others agreeing to DNA sequencing are told something like, “Your genomic data and health information will be studied along with information from other participants in this research, and it will be stored for future studies by this and other research teams.”19 This approach is legally adequate, because under current U.S. law, genomic data itself is regarded as a form of property, subject to contract law, rather than governed by any universal privacy right of the individual whom the data describes. Under this approach, patients are rarely informed of specific ways their data will be subsequently used or the consequences of such use.20

How does the U.S. Government’s Involvement in Genomics Compare with Other Countries?
Governmental involvement in genomics varies greatly across countries. Relative to the U.S., there is greater involvement in the United Kingdom, China, Iceland, France, and other countries where the government is collecting genomic data on many or all of its citizens. Why the differences across countries? In part, differences may reflect societal valuation of personal privacy. However, differences also reflect the state’s pre-existing role in the financing and provision of health care; where that role is large (such as the UK), there is a mandate for the government to invest in the collection and use of genomics databases to control the nation’s health care costs.

De-identification is intended to ensure that people who access a genomics database cannot tell who the individuals are within it. HIPAA regulations detail a set of steps needed to de-identify data, involving removing 18 specified individually-identifying data items, such as name, telephone number, and fingerprint details, which make it possible to uniquely identify a person. If a genomics database is free-standing, or linked to items other than those 18 specified ones, it falls outside of the jurisdiction of HIPAA, and in that state it can be reused and shared relatively easily. In fact, strictly speaking, in that non-HIPAA format, genomic data can be reused and shared for research purposes without patient consent.21
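
A minimal sketch of identifier-based de-identification follows. It is illustrative only: the field names and record are hypothetical, and the identifier set contains just the three examples named in the text, not the full list of 18 items.

```python
# Illustrative subset only: the three examples named in the text, not the full list of 18.
DIRECT_IDENTIFIERS = {"name", "telephone_number", "fingerprint"}


def deidentify(record):
    """Return a copy of the record with the listed direct identifiers removed."""
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS
    }


# Hypothetical patient record that includes genomic data.
record = {
    "name": "Jane Doe",
    "telephone_number": "555-0100",
    "diagnosis": "type 2 diabetes",
    "genotype": {"snp_a": "AG", "snp_b": "CC"},
}

clean = deidentify(record)
print(sorted(clean))
```

Note that the genotype passes through untouched: removing listed fields says nothing about whether the remaining data can itself identify a person.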
Unfortunately, genomic data itself appears to allow for re-identification, whether or not it is linked to any other items. Indeed, genomic data appears to be one of the best ways to uniquely identify an individual that has ever been discovered. Researchers have demonstrated a number of ways that a person can be re-identified by anyone with access to a (de-identified) genomic database. For example, if you know just a fraction of a person’s DNA sequence from another source, you can use that information to re-identify them in the genomic database. Or, if you know fragments of a person’s phenotype information, such as eye and skin color, height, and a few other items, you can also use that to re-identify that person in a genomic database.22
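
The first re-identification route described above can be sketched as a simple lookup. The database, record labels, and variant names below are hypothetical, and real attacks work at far larger scale, but the principle is the same: each additional known variant narrows the set of candidate records.

```python
def matching_records(database, known_genotypes):
    """Return IDs of records consistent with every genotype known from another source."""
    return [
        record_id
        for record_id, genotypes in database.items()
        if all(genotypes.get(snp) == call for snp, call in known_genotypes.items())
    ]


# Hypothetical de-identified database: no names, only genotypes.
DEIDENTIFIED_DB = {
    "record_17": {"snp_a": "AA", "snp_b": "CT", "snp_c": "GG", "snp_d": "AT"},
    "record_42": {"snp_a": "AG", "snp_b": "CT", "snp_c": "GG", "snp_d": "TT"},
    "record_58": {"snp_a": "AG", "snp_b": "CC", "snp_c": "AG", "snp_d": "TT"},
}

# A fragment of the target's DNA obtained elsewhere (hypothetical).
one_variant = {"snp_a": "AG"}
two_variants = {"snp_a": "AG", "snp_b": "CT"}

print(matching_records(DEIDENTIFIED_DB, one_variant))
print(matching_records(DEIDENTIFIED_DB, two_variants))
```

Once a single record remains, every other variant in that record becomes attributable to the named person.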
While a primary concern with re-identification is disclosure of private information manifest in the person’s genome, as described above, there is also a privacy risk in just learning that a person is included within a given database. Because many databases focus on specific diseases, such as autism spectrum disorder or mitochondrial diseases, knowing that a person is in such a database reveals that this individual is a carrier for that disease, and their family members are likely carriers as well.
The third pillar is cybersecurity. However, the framework for ensuring cybersecurity in genomics is flexible, and depends in large part on the varying policies of each organization involved. In health care, HIPAA shapes cybersecurity, but genomics data can be kept outside of HIPAA jurisdiction. Further, even HIPAA-protected data is often breached (see text box).23 For research projects and organizations receiving federal funding (including many medical organizations, academic institutions, and pharmaceutical companies), an internal Institutional Review Board (IRB) approves genomics cybersecurity practices, along with other aspects of clinical or academic genomics research. IRBs, in turn, try to ensure that best practice guidelines for data security are followed, and they can enforce internal penalties for non-compliance as they deem appropriate.24

How Often Do Data Breaches Occur in the U.S. Health Care System?
According to the Institute for Critical Infrastructure Technology, nearly half of the U.S. population had personal health care data compromised in just one year. Anthem’s data breach, disclosed in 2015, involved 78 million records with personally identifiable information. This suggests that personal data protections mandated by HIPAA are not providing adequate protection, even of non-genomic data. These sensitive personal health data are being successfully targeted by nation state actors, cybercriminals, and hacktivists.

UNCOVERING THE FULL LIFECYCLE OF DNA DATA
To further explain how genomic data are being used, we developed a portrait of the “lifecycle of DNA data,” tracing an arc from its creation during clinical diagnosis and treatment to its subsequent reuse for clinical and research purposes in other settings. While the process is by no means standardized, this general lifecycle model nonetheless captures the broad set of steps that are common across many settings. The lifecycle model (see Table 1) offers a useful means for identifying where privacy and security vulnerabilities associated with the use of clinical DNA data can originate, and what steps need to be in place to ensure the privacy and security of the data as it is transferred from an individual to a large database and beyond.
The top row of Table 1 shows four basic steps in how DNA data is handled: (1) the generation of the sequenced data, (2) the analysis of variants in the data, (3) clinical interpretation of data, and (4) subsequent data sharing and reuse. The next two rows in the table summarize the key activities involved in each step, and specify who handles the data. The bottom two rows summarize the implications of each step for the privacy and security of DNA data. In the sections below, we first move left-to-right across the top half of the table to explain each step in the lifecycle. Then we go left-to-right through the bottom half of the table, discussing the privacy and security issues, and measures to address them, at each step.
Step 1. Generation of DNA data
The first step is the decision to create DNA data for the patient. This decision originates in a clinical setting in consultation with an individual’s health care provider because the patient has symptoms the clinician believes may be better understood through genomic analysis. This step involves obtaining a biosample and getting it sequenced. For genomics research, the biosample is usually obtained via a blood sample or a “spit kit” – a small container in which the individual deposits saliva – which is then sent to the lab for sequencing.
Prior to submitting their biosample, patients are asked to sign a consent form that indicates they understand how their data will be used and the risks involved. In addition, in many cases, patients are asked to meet with a genetic counselor, and to read additional materials, in order to increase their understanding of the implications of having their genome sequenced, the information it may reveal, and the decisions they may face as a result. The people “touching the data” in this step include nurses, lab technicians and bioinformaticians who are either employees of the medical organization that is requesting the DNA data creation, or employees of a sequencing lab contracted to collect and sequence the data.
Table 1. Steps in the lifecycle of DNA data, and attendant privacy and security risks

Steps (columns, left to right): 1. Generating DNA data; 2. Analysis of variants; 3. Clinical interpretation; 4. Further data sharing and reuse

Row: Summary
Step 1: Individual makes decision to collect DNA data and raw sequence data is generated.
Step 2: Individual’s DNA data processed & analyzed to identify variants.
Step 3: DNA database linked to individual’s other health data, and analyzed by clinicians for insight into medical condition.
Step 4: Findings from analysis may be shared, and data are shared for subsequent reanalysis.

Row: Key activities
Step 1: Patient seeks medical care, or joins health care system. Patient signs consent form. Biosample collected at clinic and sent to sequencing lab: biosample is prepared and run through sequencing machine; resulting raw DNA data is stored temporarily.
Step 2: Data moves to bioinformaticians. Patient’s raw DNA data processed and annotated using variant database.
Step 3: DNA data analysis moves to clinicians. DNA data analysis linked to other data in patient’s health record, and to larger genomics & medical databases. May include: visualization of specific patient’s DNA; scrutiny of specific patient’s genotype-phenotype associations; sharing and discussion of data, within local clinical team and possibly other external clinical or research consultants. Diagnosis and treatment options given to patient.
Step 4: Patient’s DNA data merged with others’ DNA data in larger database. Findings from analysis shared with other researchers and clinicians working in similar areas. Data may be shared for reuse with: clinicians treating other patients; NIH, which requires funded projects to deposit data for academic reuse; pharma firms for drug discovery; commercial firms and other types of organizations.

Row: Who handles data?
Step 1: Sequencing lab employees.
Step 2: Employees of sequencing lab, medical organization, and/or third-party bioinformatics company.
Step 3: Employees of sequencing lab, medical organization, and/or third-party bioinformatics company; also other clinicians, researchers and genetic counselors consulted for diagnosis/treatment.
Step 4: Other clinicians, academic researchers, pharmaceutical employees, and governmental entities that acquire access to databases.

Row: Data privacy & security risks
Step 1: Consent form does not allow individual to easily appreciate implications for self and family. Sample or data crossing state/legal jurisdictions.
Steps 2 and 3: Behavior of lab employees and clinicians: carelessness or theft of data; clinician’s urge to diagnose patient increases incentive for rapid data sharing, careless use of communication technology. Technology: storage and transit of data via internet, local computers or cloud.
Step 4: Individuals handling data are not subject to privacy & security policies of original organization; may not be as careful with data; goals may also differ. Re-identification of data may be possible despite efforts to anonymize it.

Row: Ways to address data privacy & security
Step 1: Require expanded genetic counseling. Increase individual control over data. Update consent process with “opt-in” to data reuse.
Steps 2 and 3: Behavior of lab employees and clinicians: screening, compliance training and sanctions; monitoring and auditing. Technology: database access (encryption, authentication, authorization); secure platform for communication of genomics & health information.
Step 4: Make sharing contingent on recipient organization replicating privacy and security standards of sharing organization. Maintain individual’s opt-in/opt-out rights, enabled by technology and organizational processes.
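
One of the remedies listed in the table, an “opt-in” to data reuse enforced by technology, might be sketched as a consent check applied before any record leaves the originating organization. The record structure and purpose labels below are hypothetical and are not drawn from any existing system.

```python
def shareable_records(records, purpose):
    """Return IDs of records whose owners have opted in to the given reuse purpose."""
    return [record["id"] for record in records if purpose in record["opted_in_purposes"]]


# Hypothetical records, each carrying the reuse purposes the individual agreed to.
records = [
    {"id": "P001", "opted_in_purposes": {"clinical_care", "academic_research"}},
    {"id": "P002", "opted_in_purposes": {"clinical_care"}},
    {"id": "P003", "opted_in_purposes": {"clinical_care", "academic_research", "commercial_research"}},
]

print(shareable_records(records, "academic_research"))
print(shareable_records(records, "commercial_research"))
```

Under this design, sharing is denied by default: a record is released for a purpose only if that purpose was explicitly granted, which is the reverse of the broad consent model described earlier.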
Most often, the sequencing lab is separate from the clinical organization, although some hospitals or clinics have an in-house sequencing lab. The regulatory oversight of sequencing labs is not clearly defined currently. Some sequencing occurs in labs that have obtained CLIA certification by the Centers for Medicare and Medicaid Services (CMS), but this is not a requirement to conduct sequencing. The initial result of sequencing an individual is the creation of raw data files, which then move to step 2 for analysis.
Step 2. Individual’s DNA data is processed and analyzed to identify variants
In this step, the individual’s data is largely in the hands of bioinformaticians who work for the in-house or contracted sequencing lab, or for an external consulting organization specializing in DNA analysis. In its entirely raw form, DNA data is not very useful. And although sequencing technology has advanced greatly in the last decade, the processing of raw data into usable form requires a considerable amount of skill, experience, and time on the part of bioinformaticians.
Depending on the type of sequencing done (e.g., genotyping, whole genome or whole exome sequencing), the raw data files will have different formats, and their processing and analysis will require different numbers of steps. For example, for whole genome and exome sequencing, the raw data produced by the sequencing of a single individual consists of a large number of small data files, each of which contains a tiny section of the genome, and which together form a sequencing library. That library is then subjected to a 3-step process of “assembly” and “alignment” (in which the original DNA sequence is reconstructed in its proper order) and finally “annotation” (comparison with a well-known reference genome or variant database) to identify and catalogue variants found along the reconstructed sequence. Although this process is automated, bioinformaticians often visually inspect the results using a gene browser to help increase accuracy.
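
A toy version of the alignment and variant-identification steps is sketched below. It is a simplification under stated assumptions, not a description of any production pipeline: the reference sequence and reads are invented, each read is placed at the reference position with the fewest mismatches, and a variant is reported wherever the majority base across reads differs from the reference. Real pipelines use dedicated tools and handle sequencing errors, insertions, and deletions.

```python
from collections import Counter, defaultdict


def align(read, reference):
    """Return the reference offset where the read has the fewest mismatches."""
    best_offset, best_mismatches = 0, len(read) + 1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mismatches:
            best_offset, best_mismatches = offset, mismatches
    return best_offset


def call_variants(reads, reference):
    """List (position, reference_base, observed_base) where reads disagree with the reference."""
    pileup = defaultdict(Counter)
    for read in reads:
        offset = align(read, reference)
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    variants = []
    for position in sorted(pileup):
        consensus, _ = pileup[position].most_common(1)[0]
        if consensus != reference[position]:
            variants.append((position, reference[position], consensus))
    return variants


# Invented 20-base reference; the sampled individual differs at position 9 (G -> T).
REFERENCE = "ACGTTGCAAGCTTAGGCTAA"
READS = ["ACGTTGCA", "TGCAATCT", "CAATCTTA", "TAGGCTAA"]

print(call_variants(READS, REFERENCE))
```

The output of a step like this, a short list of positions where the individual differs from the reference, is what is passed on for annotation and clinical interpretation.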
[Figure: Examples of genomic sequencing devices. Illumina MiSeq (https://www.illumina.com/systems/sequencing-platforms.html); Oxford Nanopore Technologies MinION (https://nanoporetech.com/products/minion)]
Step 3. DNA data is used for clinical interpretation
In the third step, the analytic results are provided back to clinicians who interpret them for the patient. The lab results include key findings about variations from clinical cases of similar patients in other genomic databases. It is worth noting that even though they are based on digital data, these findings are not free from potential errors, and in fact they always involve a degree of subjective interpretation. The variant information helps clinicians interpret the data of the current patient being treated. In this step, variant information is also linked to health records, so that clinicians can gain insight into the patient’s condition and recommend treatment options.
At this point in the process, if health care providers are dealing with a difficult diagnosis or treatment situation, they are likely to consult with in-house and/or external clinicians in order to gather additional information to assist them in making a diagnosis. For this purpose, they may seek genomic data on patients with similar symptoms, or medical record data on patients with similar gene variants. This step requires sharing and discussion of the patient genomic data and phenotypic data (i.e. medical records) from multiple individuals. Who is involved at this step depends on the specific diagnostic situation, but it would usually include the patient’s primary clinical team, clinicians at other medical facilities, and even internal or external bioinformaticians if subsequent scrutiny of the genomic data is deemed necessary to reach a diagnosis.
Step 4. Further data sharing and reuse
In the final step, the patient’s DNA data is nearly always reused and shared beyond the original purpose for which it was created. Initially the focus of attention in the patient’s data was likely on variants tied to his or her specific symptoms or disease. However, the patient’s larger genome is valuable for myriad other purposes and this data is highly sought after by other clinicians and researchers. Hence it is common practice for an individual’s DNA data (and its associated phenotypic data) to be de-identified and then added to a larger DNA database for further research and clinical uses. Although in many cases the patient has signed a consent form that authorizes additional uses in general, they rarely are informed about the specific ways their data will be reused. In addition, genomic data that is considered de-identified can be shared without consent, as noted above.
Databases that contain merged DNA and clinical data from many individuals are critical for researchers aiming to discover new variants and tease out their clinical implications. Variant discoveries from new data can then be compared with prior knowledge from pre-existing variant databases, gradually growing and refining the overall pool of genomic knowledge.
As the initial patient’s data is merged into a larger de-identified database, that database may in turn be shared further afield. Clinical organizations such as health systems creating these databases may enter into partnership or licensing agreements with other organizations (such as other clinics, academic research institutes or drug companies searching for new pharmacogenomic drugs) to enable re-use of their data. Such partnerships facilitate the transfer of DNA data from clinical to commercial settings. In doing so, these partnerships also blur the line between clinical, research, and commercial genomics (see text box).25 If the patient’s sequencing was done as part of a federally-funded research study, their data will also become part of a database deposited with the National Institutes of Health (NIH) and made available for other researchers.26 At this point in the lifecycle, individual patients no longer have control of their individual DNA data and most likely will never learn the results of subsequent analyses that utilize it.
In sum, the lifecycle of DNA data shows how genomes sequenced for individual patient care become critical to new research discoveries, as well as profit-making endeavors by entrepreneurial organizations, through database creation, sharing, and reuse.

Data Sharing Blurs the Lines Between Clinical Care, Research, and Consumer Firms
Because genomics databases are seen as valuable for a range of purposes, including clinical care, biomedical research, and commercial purposes, partnerships are forged that allow data sharing and reuse across organizations in all three of these sectors. For example, Foundation Medicine – a publicly traded corporation – offers a service to patients and their doctors involving sequencing coupled with tailored treatment recommendations based on that sequencing. Using client data, Foundation is assembling a growing database of cancer patient and tumor DNA, from which it hopes to derive new insights for tailored treatments. In November 2015, Roche, one of the world’s largest pharmaceutical firms, acquired a majority share in Foundation, giving it access to their data for drug research. At the same time, another partnership links Foundation’s genomics data on 20,000 cancer patients to Flatiron Health’s clinical data on the drugs, treatments, and health outcomes of those patients from medical records. They promise eventual public disclosure of these de-identified data. And through ongoing ownership and governance control in Foundation, the consumer technology giant Google is also accessing Foundation’s data for their commercial research and development purposes.

PRIVACY AND SECURITY RISKS AT EACH STEP IN THE DNA DATA LIFECYCLE
Given the nature of genetic data, and specifically how much it can reveal about an individual and his or her family, many people may want to maintain their privacy regarding this kind of information. However, across this DNA data lifecycle, many privacy and security risks can be identified. In this section, we consider the risks at each step, along with potential remedies that may help to ameliorate them. Before taking stock of the specific risks, it is worth pausing to consider the meaning of privacy, both in general and in the specific context of DNA data.
In broad strokes, privacy involves being able to choose what parts of oneself to disclose to different audiences.27 Although the United States lacks a broad legal right to privacy, many Americans, as well as the United Nations, regard a degree of privacy to be a fundamental human right.28
In the context of personal information, such as DNA data, privacy means having control over that information, such that a person can confidently choose to disclose it to one trusted party for one specific purpose without concern that it will be passed to other parties or used for other purposes. Maintaining privacy therefore also logically requires mechanisms to ensure the security of that information while it is being used by the other party, preventing unauthorized disclosure and further uses. Of course, when an individual’s information becomes part of a larger database that encompasses data on many other people, the maintenance of privacy and security no longer depends just on how their specific information is treated; instead, much depends on the policies and practices related to the larger database.29 Hence, much of the risk – and potential remedy – concerning DNA data involves the databases and how they are managed.
In step 1 of the DNA lifecycle, when genomic data are collected for sequencing, patients must give their consent for the process to begin. However, they are unlikely to be fully aware of the pathways their data are going to travel. Patients often consent to sequencing because their doctor hopes it can help in their diagnosis and treatment; that usage is naturally their priority. But in most situations, their data will also be added to a DNA database, and reused for other research and clinical purposes. Patients typically sign consent forms that say their data will be reused in unspecified ways (see text box).30 Given an inability to truly anonymize DNA data through de-identification, that onward journey carries disclosure risks that rise as more individuals handle the data. All those individuals could accidentally disclose the data, or themselves engage in its theft, making the data available to parties who would use it against the patients. Although those risks may be reduced, as we discuss below, they can never be eliminated.
Update on Federal Consent Rules for Research
A January 2017 update to the U.S. Common Rule – which applies to research projects and organizations receiving federal funding – has implications for the treatment of genomics data. The update allows researchers to be in compliance if they obtain ‘broad consent’ from individuals whose biosamples are collected (as described above). An alternative approach, requiring researchers to inform individuals about each specific way their data are being reused, was rejected due to the burden it would have placed on researchers.1 Many clinical and commercial genomics organizations are adopting this broad consent approach as well.
The consent forms used by commercial genomics companies and other DNA data services tend to be similarly non-specific in detailing the reuses of an individual’s data (see text box).31
What can be done to enhance privacy and security of DNA data at this initial step? At a minimum, during the consent process, patients need to be educated about the benefits and risks. Before providing consent and being sequenced, the patient can be required to meet with a genetic counselor, to learn about the potential benefits and risks associated with DNA data generation and analysis – as well as about the potential benefits and risks of further sharing and reuse of data. Patients can also be given the choice to opt out of data reuse later in the lifecycle.
The consent process itself can be modified so that patients are able to choose the degree of database inclusion and further uses they want. For example, they could opt for inclusion in the healthcare organization’s internal database, but not for further sharing with external partners. They could also choose to opt in to reuse in instances that they personally support (for example, certain medical or scientific studies they care about), but stay out of reuse in instances they do not support (for example, certain commercial uses).
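To make the idea of graduated consent concrete, the choices described above can be sketched as a small data structure. This is only an illustrative sketch: the scope names, the `ConsentRecord` class, and the study identifier are hypothetical and do not describe any existing consent system. The key design assumption is "deny by default", meaning that a reuse is permitted only if the patient has explicitly opted in.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Scope(Enum):
    """Hypothetical reuse scopes, loosely following the examples in the text."""
    CLINICAL_CARE = "clinical_care"          # the patient's own diagnosis and treatment
    INTERNAL_DATABASE = "internal_database"  # the healthcare organization's own database
    EXTERNAL_RESEARCH = "external_research"  # sharing with outside research partners
    COMMERCIAL = "commercial"                # sharing with commercial partners


@dataclass
class ConsentRecord:
    patient_id: str
    # Only the original clinical use is allowed until the patient opts in to more.
    allowed_scopes: Set[Scope] = field(default_factory=lambda: {Scope.CLINICAL_CARE})
    # Individual studies the patient personally supports.
    supported_studies: Set[str] = field(default_factory=set)

    def permits(self, scope: Scope, study_id: Optional[str] = None) -> bool:
        """Deny by default: a reuse is allowed only if the patient opted in."""
        if scope in self.allowed_scopes:
            return True
        # A patient may support one specific study without opting in to a whole scope.
        return study_id is not None and study_id in self.supported_studies

    def opt_out(self, scope: Scope) -> None:
        """Withdraw a previously granted scope later in the lifecycle."""
        self.allowed_scopes.discard(scope)


record = ConsentRecord("patient-001")
record.allowed_scopes.add(Scope.INTERNAL_DATABASE)  # join the internal database only
record.supported_studies.add("diabetes-study")      # plus one study the patient supports

print(record.permits(Scope.INTERNAL_DATABASE))                   # internal reuse
print(record.permits(Scope.EXTERNAL_RESEARCH, "diabetes-study")) # the supported study
print(record.permits(Scope.COMMERCIAL))                          # commercial reuse
```

A record of this kind only helps if every downstream database checks it before each reuse, which is part of the non-trivial cost of such provisions.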
Without such genomic counseling and opt-out provisions, consent in the clinical setting represents an individual loss of control and creates potential privacy risks. At the same time, it is important to recognize that these provisions would come at a non-trivial cost. In particular, researchers would lose the freedom to easily re-analyze the data that came from individuals who opted out. A further issue is the treatment of infants and children in genomics databases. These minors cannot provide informed consent, but after sequencing in the clinical setting, their data is being reused in the same ways that adult genomic data is being reused.
Excerpts from a commercial genomics consent form
The company 23andMe, which shares genomic data in some form with more than a dozen other entities, provides its customers with a consent form, privacy statement, and terms-of-service. Their consent “key points” 500-word summary describes some aspects of their genomic data sharing and associated risks:
“23andMe researchers who conduct analyses will have access to your genetic and other personal information, but not to your name, contact, or credit card information.”
“23andMe may share some data with external research partners and in scientific publications. These data will be summarized across enough customers to minimize the chance that your personal information will be exposed.”
The 23andMe privacy form elaborates on data sharing:
“We may share anonymized and aggregate information with third-parties; anonymized and aggregate information is any information that has been stripped of your name and contact information and aggregated with information of others or anonymized so that you cannot reasonably be identified as an individual.”
The longer terms-of-service form also states:
“Genetic Information that you share with family, friends or employers may be used against your interests. Even if you share Genetic Information that has no or limited meaning today, that information could have greater meaning in the future as new discoveries are made.”
In steps 2 and 3, data security risks come to the fore, as activities related to the storage and handling of DNA data present many points of vulnerability. Although healthcare organizations are relatively sophisticated when it comes to protecting personal health data, because of HIPAA laws, nonetheless they are frequent targets of attack, and suffer frequent data spills. Indeed, 90% of healthcare organizations and associated firms responding to a 2016 Ponemon survey reported they had experienced a data breach, and 64% reported a breach that involved leaking patient medical records, in just the last two years.32 It is too early to know about the occurrence of genomic spills, but they are likely to be occurring now and in the near future as well. In one recent incident, Quest Diagnostics, which handles genomics data for a large IBM Watson initiative, reported a major spill of patient lab data (see inset).33
For genomic data management, there are many hardware and software options for the storage and transit of DNA data. Data can be stored on an organization’s local servers or in a cloud service such as Amazon Web Services or Google Cloud. Data can be moved via dedicated internal cables within an organization, or over the Internet, or physically via portable hard drives. For DNA data that is stored only (or largely) in the cloud, data transit can be limited through cloud computing practices, essentially requiring any analysis software to be “brought to the data” in the cloud rather than the data being analyzed by software on a local machine.
Each of these alternative data management options can be made relatively more secure using a number of organizational best-practices:
• Encryption: Ensuring that data is encrypted in transit and at rest.
• Authentication: Verifying the identity of individuals accessing the data. Two-factor authentication, involving a token sent to a mobile device or key fob, provides additional validation. For sensitive data, in-person authentication could be required.
• Authorization: Narrowing the number of people with access to data based on the project or task, and limiting the duration of that access.
• Monitoring and auditing: Assessing and improving system security, and tracking details of use, unauthorized use and compliance. This includes routine vulnerability assessments and penetration tests. Using a blockchain ledger system may be one way to verify the history of data access throughout the lifecycle.
• De-identification: Stripping an individual’s identity from the DNA data is useful, but it does not achieve true anonymization in many circumstances (as we discussed above).
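Two of the practices above, time-limited authorization and auditing of access history, can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `AccessGrant` and `AuditLog` classes, the user and project names, and the timestamps are hypothetical. The audit log uses a simple hash chain, which is the tamper-evidence idea underlying a blockchain ledger, not a full distributed blockchain.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessGrant:
    """A hypothetical grant: one user, one project, valid until `expires_at`."""
    user: str
    project: str
    expires_at: float  # seconds since the epoch


class AuditLog:
    """Append-only log in which each entry commits to the hash of the previous one."""

    def __init__(self):
        self.entries = []

    def append(self, event):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": prev_hash, "hash": entry_hash})

    def verify(self):
        """Recompute every hash; altering an earlier entry breaks the chain."""
        prev_hash = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


def request_access(grants, log, user, project, now):
    """Allow access only under an unexpired grant for this project; log every attempt."""
    allowed = any(
        g.user == user and g.project == project and now < g.expires_at
        for g in grants
    )
    log.append({"user": user, "project": project, "time": now, "allowed": allowed})
    return allowed


grants = [AccessGrant("alice", "cardio-study", expires_at=1000.0)]
log = AuditLog()
print(request_access(grants, log, "alice", "cardio-study", now=500.0))   # within the grant
print(request_access(grants, log, "alice", "cardio-study", now=1500.0))  # grant has expired
print(log.verify())                                                      # chain is intact
```

Denied attempts are logged as well as successful ones, since a pattern of refused requests is itself a signal worth auditing. A chain like this shows that a record was altered after the fact, but it cannot stop an authorized insider from misusing data they were legitimately allowed to read.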
“Quest Diagnostics, a New Jersey-based medical laboratory company, disclosed a data breach affecting about 34,000 people on Monday. Digital intruders stole personal and medical information of customers – including names, dates of birth, lab results…. Attackers gained access to the data on November 26 through an improperly secured mobile app that lets patients share and store electronic health records.”
Beyond these current best-practices, there are also efforts to allow users to query databases as to their contents, and to aggregate and analyze data across multiple databases, all without exposing the actual underlying data to the user. However, these efforts have proven to be challenging. For example, in 2015 a “beacon” system was developed by the Global Alliance for Genomics and Health (GA4GH) to allow people to quickly query many databases (via their beacons) about whether they contain individuals with certain specific variants. However, this system was quickly shown to be vulnerable to re-identification attacks.34 In another example, “Datashield” software was developed to allow pooled statistical analyses across many genomic databases, returning pooled results without revealing information about any specific dataset that went into the pooling.35
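A toy example may help show why a query service that returns only yes/no answers can still leak information. The sketch below is a deliberately simplified illustration of the general idea, not the published attack and not the GA4GH interface: the `Beacon` class, the variant labels, and the cohort are all made up. The assumption is that someone who already holds a copy of a person's variants asks the beacon about each one in turn.

```python
class Beacon:
    """Toy beacon: answers only whether any member carries a given variant."""

    def __init__(self, genomes):
        # `genomes` maps a member ID to that member's set of variant labels.
        # Only the pooled set is kept, so no individual record is ever returned.
        self._variants = set().union(*genomes.values())

    def has_variant(self, variant):
        return variant in self._variants


def fraction_confirmed(beacon, known_variants):
    """Fraction of one person's known variants that the beacon confirms."""
    hits = sum(beacon.has_variant(v) for v in known_variants)
    return hits / len(known_variants)


# Hypothetical cohort with made-up variant labels.
cohort = {
    "p1": {"chr1:100A>G", "chr2:200C>T", "chr7:300G>A"},
    "p2": {"chr1:100A>G", "chr3:50T>C"},
}
beacon = Beacon(cohort)

member = cohort["p1"]                                    # someone who is in the database
outsider = {"chr1:100A>G", "chr9:10A>T", "chr11:77G>C"}  # someone who is not

print(fraction_confirmed(beacon, member))    # every variant is confirmed
print(fraction_confirmed(beacon, outsider))  # only the common variant is confirmed
```

A person whose rare variants are all confirmed is very likely a member of the database, which matters when membership itself is sensitive (for example, a database of patients with a particular disease). Real attacks of this kind rely on statistical tests over many variants rather than a simple fraction.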
At the same time, despite these technological protections, all these options are vulnerable to risks that arise through human compliance behavior. In general, many data spills result from (non-)compliance behavior among individuals who are authorized to use sensitive data.36 Carelessness and theft by employees pose serious risks to data management. Minimizing such behavioral threats to data privacy and security is best handled through robust staff screening and training, coupled with sanctions for employees and subcontractors who violate policies and procedures, and physical workplace security. In designing training and sanctions, it is critical to recognize that as the intrinsic value of a genomics database increases, so does the financial temptation for insiders to become involved in a data spill.
In step 3, data security risks also arise as physicians and other clinicians share individual patient genomic data and interpretations with colleagues in the course of their work. Out of a desire to help their patients, physicians and other providers often search for other patients whose DNA variants and clinical symptoms appear similar to those of the patient they are treating. In this process, they will share data about a patient with other clinicians, and request that these clinicians share data on other patients with them. In the era of genomic sequencing, this sharing can involve genomic data – variant lists, interpretations, genome snippets or sections of interest, or even whole exome or genome data. In this process, clinicians are often motivated to act quickly and efficiently, given both their professional desire to help a suffering patient, and their own busy schedules.
These forces increase the incentive to share a patient’s genomic data, along with other personal health information, in a manner that is not secure. In particular, the electronic communication platforms used for this purpose are an important consideration. In recent years, it has become increasingly common for physicians to share information from their mobile devices, using email, text, or apps that do not comply with HIPAA requirements for data security, and this has been a major cause of health data breaches, as in the Quest data breach reported above.37 Mobile communications transmitted over wireless networks can be particularly vulnerable to interception. When genomic data is linked with other patient information, it is clearly sensitive; even in isolation, as we have shown, genomic data must be treated with care.
Hence all communication of patient data needs to take place using a secure platform – one that complies with HIPAA requirements, maintaining encryption in transit and at rest. Of course, this poses a complication for sharing among clinicians who do not use the same secure platform. And the security of these communications is still limited by the extent to which clinicians are compliant in maintaining the security of their own mobile devices.
Step 4 involves the DNA data’s onward journey beyond the original setting and purpose for which it was generated. In many ways, this step represents the greatest risk. The data is changing hands and moving across organizational boundaries, being entrusted to more individuals, exposed to multiple organizational policies and routines, all of which increase the chances of loss, accidental disclosure or theft. Organizations receiving DNA data may have different goals, and their employees may be subject to different privacy and security practices, and these differences may grow with the passage of time, as illustrated in a recent court case involving DNA collected for medical research (see text box).38
A court case involving disputed DNA data reuse
In 1993, DNA samples from members of a small American Indian tribe, the Havasupai, were obtained for medical research purposes, using broad consent. The initial study approved by the tribe involved diabetes, but subsequently the DNA data were used by other researchers in other studies, on topics related to mental health, migration, and inbreeding. In 2003, a tribal member learned about other research while attending a university lecture, leading to a lawsuit that was ultimately settled out of court in 2010. Issues in the lawsuit, Arizona Board of Regents v. Havasupai Tribe, included lack of informed consent, violation of civil rights, unapproved use of data, and violation of medical confidentiality/re-identification.
When genomics databases are shared across organizational boundaries – i.e. when data is being transferred to reside in another organization, or data access is being provided to members of another organization – there needs to be an accountability structure that ensures data privacy and security in the receiving organization. An obvious template for this accountability structure is provided by HIPAA. Under HIPAA, after it was updated with the 2009 HITECH Act, when a hospital or health company shares data that contains personal health information, the receiving organization has to sign a HIPAA Business Associate (BA) contract, in which they agree to a set of data privacy and security practices that largely mirror those required within the hospital or health company itself. The BA entity becomes liable for responding to data breaches as well. Such BA contracts are signed by third-party organizations for insurance claims processing, hospital consultants, and even transcriptionists handling personal health information. Such an approach does not apply to genomics data, but it could.
As in the primary organization, data security practices in the organizations that receive access to shared data should also be set to a high standard. If data are transferred to a partner organization, the management of genomics data at that partner organization naturally poses risks that are similar to those we inventoried above in steps 2 and 3. In particular, there are data breach risks arising in the storage and use of the data, regardless of whether it is physically located on local servers or in the cloud, and whether or not it is moved between locations over the internet. The same organizational best-practices can help mitigate those risks. If data is not transferred, but access is provided to members of a partner organization, then the same human compliance behavior concerns arise for those partner organization employees who are given access.
The severity of concerns with genomic database sharing was underscored in the conclusions of a recent assessment by the American Association for the Advancement of Science (AAAS) and the Federal Bureau of Investigation on this topic (see inset).39
“Beyond access controls, encryption, and other common data and cybersecurity technologies, no solutions exist that prevent or mitigate attacks on databases or the cyberinfrastructure that support Big Data in the life sciences, which could result in consequences to the life science, commercial, and health sectors.”
What factors are likely to be associated with an increased risk of privacy and security compromise in partner organizations? Research on the incidence of wrongdoing and accident events in organizational and scientific fields provides a useful guide.40 In particular, the research suggests heightened risk when data sharing includes:
• Emerging innovators. Small and recently-founded organizations tend to lack the resources, experience and scale to have developed and funded internal compliance systems. And the culture of innovation in emerging ventures often encourages breaking with industry rules. Examples of emerging innovators include new specialized sequencing labs and bioinformatics firms, consumer-facing startups, and citizen-science organizations that lack funding and experience. (Of course, larger organizations may be more visible, to hackers as well as everyone else, but smaller organizations tend to be more vulnerable.)
• Organizations based in weaker regulatory jurisdictions. Examples include community hospitals and medical offices that have not previously engaged in clinical research, which could fall outside federal Common Rule jurisdiction and lack IRB experience. Other examples include commercial genomics companies and patient advocacy organizations that do not handle patient medical records, implying they fall outside HIPAA jurisdiction and lack experience handling sensitive patient data. Partner organizations may also be based in state jurisdictions with looser regulations.41 An extreme (but common) example is partner organizations based outside the legal jurisdiction of the United States altogether. For example, a major U.S. genomics company, Human Longevity Inc., recently formed partnerships with the British-Swedish pharmaceutical firm AstraZeneca and the South Africa-based health and life insurance company Discovery Ltd.42
• Use of partners and outside contractors. As the number of partner organizations increases, accountability structures become decentralized and more diffuse, and there is a greater chance that compliance standards will conflict between organizations. More partners also add complexity, which increases the chance for errors or gaps in procedure that create vulnerabilities. For example, a clinical organization may have crisscrossing partnerships with pharmaceutical companies, genomics startups, academic researchers, and patient advocacy groups – all of which are common in the genomics field. Data brokers are also used to share genomic data, adding another layer of organizational complexity (see text box).43
• Links between for-profit and non-profit entities. These partnership arrangements will mix conflicting legal jurisdictions, and they are also likely to mix conflicting financial and societal objectives. Those factors increase the chances that compliance standards will differ, and complicate compliance oversight.
• Organizations experiencing rapid change. Organizations that are restructuring, merging or being acquired, rapidly expanding or contracting, experiencing financial hardship or going through bankruptcy are all more likely to experience breakdowns and gaps in compliance, as standard operating procedures are suspended (see text box).44 They are also likely to have increased staff turnover, bringing an elevated risk from newly hired employees who lack training, and from outgoing employees who may be more willing to retaliate against their former employer.
Genomic Data Brokers
In response to growing demand for human genomic data, a new crop of start-up companies is serving as data brokers, offering to pay individuals for access to their genomic data, which the broker then sells to research studies. One example of this broker function, a startup venture called DNASimple, offers to give consumers control over which research studies their data are given to, and to later destroy a customer’s data upon their request.
What happens to a genomics database in bankruptcy?
As genomics databases proliferate in the private sector, the question arises concerning how they would be treated in a bankruptcy proceeding. Although personal data is partly protected under Federal Trade Commission rules, and enforced by state consumer protection authorities, this does not stop the sale of genomic data to another entity during bankruptcy. The rule of thumb is that whatever privacy policies the bankrupt company has in place will have to be replicated in the entity that buys the data. HIPAA can also constrain the sale of personal health data, but de-identified genomic data would probably be exempt from this constraint.
Professional training. Across the entire lifecycle, there is a broad need for clinicians, bioinformaticians, and others who handle genomic data to be trained in the privacy and security risks associated with these data. Currently, the primary vehicle for this is employee training – a highly decentralized and therefore uneven vehicle. Educational institutions can also play an important role, by incorporating privacy and security topics into the curriculum of degree programs in medicine, bioinformatics, and related fields. There is also likely to be a need for commercial training programs, and advisory services to help disseminate and establish best-practices in organizational training programs across healthcare organizations. Many practicing physicians, for example, will need continuing medical education (CME) related to genomics in coming years, representing an opportunity to ensure they are exposed to privacy and security concerns.
Self-regulation in the private sector. Market forces and self-regulation can play an important role in reducing these risks. Yet in the absence of an external accountability mechanism, self-regulating market actors only bear part of the cost of a genome data breach that occurs. Instead, further costs are shouldered by individuals whose data are disclosed (and their family members, whose data are indirectly disclosed), who incur harm if those data are used against them subsequently. Through this lens, genomic data-spill risks are a negative externality that may merit regulation.45 While the threat of competition or theft from rivals should incentivize genomic database security, at the same time competitive pressure increases the incentive to enter partnerships to accelerate discovery ahead of competitors – increasing the data sharing risks posed by those partnerships.
Public data sharing initiatives. In the public sector, commitments to open science are aimed at accelerating scientific advances. These initiatives, while laudable, also pose risks to the extent that public sharing of genomic datasets increases their exposure. For example, a genomic database of cancer patients, linked to their medical record data, was recently released by the American Association for Cancer Research, in collaboration with Sage Bionetworks, for public research access. While data privacy provisions were included in the design of the public data release, it remains to be seen whether these data are vulnerable to disclosure and reidentification attacks.46 On a larger scale, the National Institutes of Health collects genomics data into several centralized archives designed for sharing and reuse by other researchers, increasing the benefits as well as the risks of data reuse (see text box).47
Centralized Collection and Sharing of U.S. Human Genomics Data
The central collection and integration of many databases occurs within a single public sector entity, the U.S. National Center for Biotechnology Information's (NCBI) database of Genotypes and Phenotypes (dbGaP). Such central collections may bring additional exposure risks associated with the scale of the database and the number of sharing events to be managed (20,178 approved data requests as of July 1, 2015). The dbGaP database has already experienced several known data security incidents, and is likely to be a target for future hacking. There are other risks associated with unintended future uses of such data collections, including future governmental reuse for forensic investigation, to identify victims in the wake of mass casualty events, or for citizenship verification.
THE BIG PICTURE: WE NEED GREATER SOCIETAL ATTENTION TO PRIVACY AND SECURITY OF GENOMIC DATA
Genomics offers bold promises for revolutionizing medicine as we know it, and some of those promises are already being realized. There are tremendous potential benefits from genomics, revealing new life-saving and life-enhancing discoveries for precision medical care. Genomics databases will play an important role in achieving those breakthroughs. At the same time, we also need to appreciate the serious risks that the disclosure of sequenced DNA results poses for individuals.
If indeed the genomics field is at a critical inflection point, as many believe, then this is a crucial point for us to wrestle with the tensions inherent in promoting future research while at the same time safeguarding individual privacy. The lifecycle view we offered in this paper – showing how DNA data is generated and processed for use in precision medicine – identifies potential risks to privacy and security at each step. Our intention in setting out this lifecycle perspective is to provide a cautionary tale, indicating where data breaches could occur in clinical practice, despite breach prevention efforts currently employed.
Should we be having broader public deliberation about how DNA privacy and security issues can best be addressed? What is the public’s role in realizing the twin goals of accelerating discovery and treatment – while ensuring that privacy and security are respected? This discussion should include all relevant stakeholders – not just those who are already involved and invested in the current genomics field, but also representatives of those ordinary citizens who are soon to be affected by the genomics revolution (whether they like it or not).
Our analysis revealed three fundamental questions that we believe warrant broad societal reflection and deliberation if we are to reach for the promise of medical genomics while simultaneously mitigating risks of disclosure. Exploration of these questions should be at the core of public deliberations about the privacy and security of genomic data. There are of course many other pressing questions – involving legal-regulatory frameworks, economic impacts, and national interests, among other things. But we believe these three questions should take priority because they start from a recognition of the fundamental nature of the genome.
Three Fundamental Questions
Question #1: “Who should have the right to make decisions about your genome?”
The tension underlying this question pits individual vs. societal rights. Basically, the question is to what extent can I control what happens to my sequenced DNA results? Although no one “owns” their DNA data, an individual-centric approach to DNA data means that individuals retain control over when their DNA data is extracted and to what uses it is put. Presumably this would entail individuals making informed decisions about when they choose to be sequenced, what their sequenced data can be used for, and under what circumstances their data can be used by others. While this is the purpose of informed consent, at the time that consent is given, individuals are asked to agree to – but cannot possibly fully appreciate – the potential, unnamed uses to which their data may be put in the future. Indeed, the current common practice of obtaining “broad consent” limits individual control further by requiring people being sequenced to agree to a broad range of unspecified future reuse and sharing scenarios.
The most obvious answer to this question from a privacy and security perspective is that individuals ought to retain control over their DNA and its use. However, as we noted above, there are many research and commercial organizations that depend on large databases of DNA data to do their work. “Nearly three-quarters of all genomics companies provide tools (both physical and in the cloud) to pharmaceutical companies and academic research institutions.”48 In addition, medical organizations are scrambling to build bigger (and better) DNA databases to enable new research and improved clinical practice, and the National Institutes of Health actively promote the sharing of DNA databases when funding innovative medical research.
Just as the HeLa cell line that originated from Henrietta Lacks was instrumental in advancing biomedical research,49 these sequence databases – in this case originating from many people rather than just one – are likely to be instrumental for progress in medicine and healthcare. Indeed, creating and sharing these databases is believed to be critical for success in the White House Precision Medicine Initiative, the Cancer Moonshot, and the 21st Century Cures Act.50 So a compelling counterargument to individual control is that implementing full individual control could grind genomic research to a halt, and greatly slow progress in medicine and science. In this view, the scientific community should be empowered and trusted to decide how individual genomic data should be used, because they are best positioned with the relevant expertise to weigh the potential benefits and costs of its use overall.
Everyday citizens should have control of the uses of their own genomic data vs. The scientific community should decide how DNA databases are handled
Question #2: “How closely should you hold on to your genome?”
How proprietary individuals feel about their sequenced DNA data runs the gamut from those who want to zealously guard it to those who are willing, and even eager, to circulate it freely, either unaware of or indifferent to potential future risks to themselves and their families. Still others cite a sense of inevitability of widely-shared DNA data and even the potential likelihood of peer pressure to share it that could emanate from suspicion that those who do not must be hiding damaging evidence about themselves. Indeed, within a culture of widespread disclosure of personal data through social media, arguing in favor of privacy may be a losing battle. There are even persuasive arguments that under some conditions there may be an obligation to reveal your genomic information (see text box).51
Still, if the societal response to question #1 is that individuals should have the right to protect their own DNA from use by others and, instead, retain it as a treasured private possession that should be kept secure in a digital safety deposit box, then attention to how we are going to enable that is needed. At a minimum, this deserves an informed public discussion, which brings us to the third question.
DNA data is just another bit of personal data to share with relative openness during a wide range of social transactions vs. DNA is a treasured possession that needs to be kept secure in a digital safety deposit box
A Duty to Reveal Your Genome?
Genomic data often provide information of relevance to relatives, including information about increased disease risk, and biological parent/child/sibling status. This is giving rise to a range of unsettled ethical and legal questions. Under what conditions does this information create a duty for an individual to actually reveal information from his or her genomic data to relatives who are impacted? And beyond that individual, could a physician or other clinician treating both the person sequenced and his or her relatives have a duty to inform the relatives? What responsibility is borne by the sequencing labs, research organizations, or medical organizations that acquire and store this information? Beyond ethical considerations, could these actors be exposed to claims of negligence, malpractice, or other legal liabilities?
Question#3:“Whatstandardsaredesirableforsecuringgenomicdatabases?”Theprivacyconsiderationsnotwithstanding,aswehaveshownabove,therisksofdata
breachesofsequencedDNAdataremainasrealpossibilitieseitherbecauseofhuman
ignoranceorcarelessnessinthehandlingofDNAresultsornefariousactivitiessuchastheft
throughhacking.WithsequenceddataonlyfallingunderHIPAAprotectionsonceitislinkedto
apatient’smedicalrecord,thepotentialforunsecuretransmissionofthesedataaccidentallyor
intentionallyisgreaterthanzero,andthepotentialforre-identificationofanonymizeddata
certainlyexistsandbecomesevenhigherwhenthedataislinkedtohealthrecords.
As our lifecycle model shows, DNA passes through many hands in its trajectory from
collection to reuse. The responsibility for securing DNA currently falls to the individual
organizations doing the sequencing, interpreting the results, presenting them to patients, or
reanalyzing big DNA databases. While technical protocols for ensuring data security are
available, they only work as deterrents if organizations (and their employees) are assiduous in
their use and enforcement. An unanswered question is whether specific standards
(similar to HIPAA) for the handling of genomic data are needed at a societal level
to ensure that individual organizations take this responsibility seriously. Those
who oppose such regulations (such as research agencies) make the reasonable
claim that strict regulations will slow progress in realizing the promise of
genomics.
Could a model for best practices in the
privacy and security of sensitive data exist
in another industry? Some observers
believe the financial services industry may
provide an instructive example (see text
box).52
Tight monitoring of the use of genomic databases will help protect individual and family
privacy
vs
Unrestricted databases will help our nation be at the
forefront of genomic science and innovation
Learning from the Financial Services Industry?
The experiences of the investment banking sector
could provide a partial roadmap for data
policies and practices in the field of medical
genomics. Financial service firms handle large
volumes of sensitive customer data, and share
data across institutions on a regular basis. A
combination of government regulation and
voluntary initiatives has led to relatively uniform
and rigorous data security practices which, thus
far, appear to have limited the scope and success
of data breaches. As investment banks adapt
practices for new technological platforms,
including cloud computing and mobile access,
there may be valuable lessons for medical
genomics.
What Action is Needed Now?
Answering these questions will require careful, well-informed societal deliberation at the
broadest level possible. The best way to include many stakeholders with a legitimate interest in
genomic data is through a public, collaborative exploration of these questions and how best to
resolve them. As a society, we need to consider how best to structure such a dialogue to
ensure that all interested stakeholders (patients, family members, clinicians, researchers,
insurers, advocacy groups, businesses, regulators, etc.) can thoroughly explore and debate
alternative scenarios for how to safeguard the privacy and security of DNA data.
*About the authors: We are social scientists who study the development and transformation of
industries and scientific fields. We teach and conduct research on the emergence of new
standards and practices in these fields. Our research has addressed questions such as: How does a
diverse set of organizations in a field come to agree on collective standards and governance
procedures? How do the risks and benefits of new technologies and practices come to be
perceived and communicated among these organizations? How do voluntary industry
associations, formal state regulations, and social movements influence this process? What
facilitates collaboration within scientific communities? And within organizations, what are the
structural and leadership characteristics that affect adoption of new standards and practices?
We both have undergraduate degrees in science, and doctorates in social science, specifically
in the field of Management & Organization.
We wrote this white paper because, from our vantage as social scientists who study field
transformation, we believe the field of medical genomics is in a watershed moment. Many of
the issues the field faces involve ethical decisions with uncertain outcomes for many. Given
these stakes, we also believe it is important to increase public understanding and involvement
in decisions about the standards for this rapidly developing field. Even though the issues are
complex and technical, we need the involvement of both insiders and outsiders. We hope to
stimulate widespread discussion of the issues raised here.
[email protected]@psu.edu.
APPENDICES: Examples of genome browsers
UCSC Genome Browser
http://genome.ucsc.edu
Omicsoft Genome Browser
http://www.omicsoft.com/genome-browser/
APPENDIX: Examples of Individual Genomics Reports
Medical Genomics (Genomics Advisor)
http://projects.iq.harvard.edu/smartgenomics
Metagenomic Profile (Biome Organisms)
https://www.onecodex.com
ICB Breeder Tool (Canine Genomics)
http://www.instituteofcaninebiology.org/blog/the-icb-breeder-tool-available-now
ENDNOTES
1 This paper is based on in-depth interviews with more than 25 leaders in the genomics field, and review of over 200 archival documents, conducted during the fall of 2016. Our informants included leaders in genomics biomedical research, clinical healthcare delivery, healthcare regulation, commercial genomics, venture capital, data security, research oversight.
2 The shift in medical terminology from “genetics” to “genomics” reflects a shift from the use of tailored genetic tests for specific mutations toward the use of sequencing technology to generate data on a person’s whole genome (or significant sections of it).
3 Battelle 2013 report: “The Impact of Genomics on the US Economy.” http://web.ornl.gov/sci/techresources/Human_Genome/publicat/2013BattelleReportImpact-of-Genomics-on-the-US-Economy.pdf
4 Topol, E. 2015. The patient will see you now: The future of medicine. Philadelphia: Perseus Books.
5 Examples of specific initiatives integrating genomics into clinical care include Kaiser Permanente’s lung cancer initiative (https://share.kaiserpermanente.org/article/new-clinical-trials-use-genetic-testing-to-personalize-lung-cancer-treatment/), Geisinger Health System’s autism initiative (http://www.geisinger.org/for-researchers/initiatives-and-projects/pages/simons-vip.html), InterMountain Healthcare’s Precision Genomics Cancer Initiative (https://intermountainhealthcare.org/services/cancer-care/precision-genomics/research/), Partners HealthCare’s service for patients with suspected but undiagnosed rare genetic diseases (http://personalizedmedicine.partners.org/laboratory-for-molecular-medicine/tests/genome.aspx), and the Texas Medical Center’s Clinical Cancer Genetics Program which coordinates genetic testing and high-risk cancer surveillance for families with hereditary cancer syndromes (https://www.mdanderson.org/prevention-screening/family-history/hereditary-cancer-syndromes.html). Several hospitals and medical centers are participating in the NIH-funded Electronic Medical Records and Genomics (eMERGE) Network initiative to integrate genomics and patient medical records into clinical care (https://www.genome.gov/27540473/electronic-medical-records-and-genomics-emerge-network/).
6 Precision Medicine World Conference, Closing Panel, January 24, 2017, Mountain View, CA.
7 Schwartz, P. J., Crotti, L., & Insolia, R. (2012). Long-QT syndrome from genetics to management. Circulation: Arrhythmia and Electrophysiology, 5(4), 868-877.
8 FDA Drug Safety Communication: Reduced effectiveness of Plavix (clopidogrel) in patients who are poor metabolizers of the drug. http://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/ucm203888.htm; Dean L. Warfarin Therapy and the Genotypes CYP2C9 and VKORC1. 2012 Mar 8 [Updated 2016 Jun 8]. In: Medical Genetics Summaries [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2012-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK84174/
9 Cooperberg, M. R., Davicioni, E., Crisan, A., Jenkins, R. B., Ghadessi, M., & Karnes, R. J. (2015). Combined value of validated clinical and genomic risk stratification tools for predicting prostate cancer mortality in a high-risk prostatectomy cohort. European Urology, 67(2), 326-333.
10 Grupp, S. A., et al. (2014). T cells engineered with a chimeric antigen receptor (CAR) targeting CD19 (CTL019) have long term persistence and induce durable remissions in children with relapsed, refractory ALL. Blood, 124(21), 380-380.
11 Stranger, B. E., Stahl, E. A., & Raj, T. (2011). Progress and promise of genome-wide association studies for human complex trait genetics. Genetics, 187(2), 367-383. Robinson, M. R., Wray, N. R., & Visscher, P. M. (2014). Explaining additional genetic variation in complex traits. Trends in Genetics, 30(4), 124-132.
12 D. Nyholt, C. Yu, and P. Visscher. On Jim Watson’s APOE status: Genetic information is hard to hide. European Journal of Human Genetics, 17:147–149, 2009.
13 Hooper, L. V., Littman, D. R., & Macpherson, A. J. (2012). Interactions between the microbiota and the immune system. Science, 336(6086), 1268-1273. Honda, K., & Littman, D. R. (2016). The microbiota in adaptive immune homeostasis and disease. Nature, 535(7610), 75-84.
14 Rietveld, C. A., Medland, S. E., Derringer, J., Yang, J., Esko, T., Martin, N. W., ... & Albrecht, E. (2013). GWAS of 126,559 individuals identifies genetic variants associated with educational attainment. Science, 340(6139), 1467-1471. Krapohl, E., & Plomin, R. (2016). Genetic link between family socioeconomic status and children’s educational achievement estimated from genome-wide SNPs. Molecular Psychiatry, 21(3), 437-443.
15 Humbert, M., Ayday, E., Hubaux, J. P., & Telenti, A. (2013, November). Addressing the concerns of the Lacks family: quantification of kin genomic privacy. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security (pp. 1141-1152). ACM.
16 Wagner, J. K. (2013). Playing with heart and soul… and genomes: sports implications and applications of personal genomics. PeerJ, 1, e120.
17 https://www.bloomberg.com/news/articles/2015-06-05/u-s-government-data-breach-tied-to-theft-of-health-care-records
18 Other regulatory actors could also become more involved in genomics in the future. The FDA has been working to develop a “flexible, adaptive regulatory approach” to ensuring the accuracy and safety of genomic sequencing. However, after significant stakeholder input on proposed regulations over several years, as of January 2017, they have not issued new regulations, nor is there a timeline for issuing them. The Federal Communications Commission (FCC) also has a particular interest in privacy in the digital age, which could apply to genomics. In October 2016, the FCC passed new regulations that ensure consumer privacy by limiting how internet provider companies use and sell customer data. The FCC has also expressed an interest in consumer health data, but since the FCC does not have jurisdiction over non-profit organizations, and many genomics entities in healthcare and academia are non-profit, their jurisdictional reach is limited.
19 NHGRI IRB Guide to Writing Consent Forms - Version 2.0 (November 25, 2015). Available at https://www.genome.gov/27528182/irb-forms-templates-and-guides/
20 In contrast, European privacy laws provide individuals with greater rights to their personal data, but those laws are being updated, and their application to genomics is not yet clear. See Townend, D. (2016). EU Laws on Privacy in Genomic Databases and Biobanking. The Journal of Law, Medicine & Ethics, 44(1), 128-142.
21 U.S. Department of Health and Human Services. 2016. Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance
Portability and Accountability Act (HIPAA) Privacy Rule. https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#coveredentities; Presidential Commission for the Study of Bioethical Issues. 2012. Privacy and Progress in Whole Genome Sequencing. http://bioethics.gov/sites/default/files/PrivacyProgress508_1.pdf
22 Nyholt, D. R. 2012. Using Genomic Data to Make Indirect (and Unauthorized) Estimates of Disease Risk. Public Health Genomics 15:303–311; Zaaijer, S., Gordon, A., Piccone, R., Speyer, D., & Erlich, Y. (2016). Democratizing DNA Fingerprinting. bioRxiv, 061556.
23 Institute for Critical Infrastructure Technology (ICIT) (2016). Hacking Healthcare IT in 2016: Lessons the Healthcare Industry Can Learn from the OPM Breach. http://icitech.org/wp-content/uploads/2016/01/ICIT-Brief-Hacking-Healthcare-IT-in-2016.pdf
24 Global Alliance for Genomics and Health. Security Technology Infrastructure: Standards and Implementation Practices for Protecting the Privacy and Security of Shared Genomic and Clinical Data. Version 2.0, August 9, 2016. https://genomicsandhealth.org/category/search-topics/security
25 Flatiron Health and Foundation Medicine Unveil Powerful Oncology Information Resource to Advance Precision Medicine. November 3, 2016 Press Release. http://investors.foundationmedicine.com/releasedetail.cfm?releaseid=997385
26 NIH’s Genomic Data Sharing Policy, document available at https://gds.nih.gov/03policy2.html.
27 The Right to Privacy in the Digital Age, United Nations Office of the High Commissioner, http://www.ohchr.org/EN/Issues/DigitalAge/Pages/DigitalAgeIndex.aspx.
28 Acquisti, A., Brandimarte, L., & Loewenstein, G. (2015). Privacy and human behavior in the age of information. Science, 347(6221), 509-514. For attitudes about privacy of health information, see Dimitropoulos, L., Patel, V., Scheffler, S. A., & Posnack, S. (2011). Public attitudes toward health information exchange: perceived benefits and concerns. American Journal of Managed Care, 17(12 Spec No.), SP111-6.
29 Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.
30 Jaschik, S. U.S. Issues Final Version of ‘Common Rule’ on Research Involving Humans. Inside Higher Ed, January 19, 2017. Available at http://insidehighered.com. National Academy of Sciences. Optimizing the Nation's Investment in Academic Research: A New Regulatory Framework for the 21st Century. Available at http://www.nap.edu/download/21824
31 https://www.23andme.com/about/consent/
32 Ponemon Institute/IBM, Sixth Annual Benchmark Study on Privacy & Security of Healthcare Data. May 2016. http://www2.idexpertscorp.com/ponemon2016.
33 http://fortune.com/2016/12/13/quest-diagnostics-data-breach-health/
34 Shringarpure, S. S., & Bustamante, C. D. (2015). Privacy risks from genomic data-sharing beacons. The American Journal of Human Genetics, 97(5), 631-646.
35 Heatherly, R. (2016). Privacy and Security within Biobanking: The Role of Information Technology. The Journal of Law, Medicine & Ethics, 44(1), 156-160.
36 Ponemon Institute, Sixth Annual Benchmark Study on Privacy & Security of Healthcare Data. May 2016. http://www2.idexpertscorp.com/ponemon2016.
37 Thomson, L. 2015. Health Care Data Breaches and Information Security: Addressing Threats and Risks to Patient Data. Chapter 15 in Peabody, A. (Ed.). (2013). Health Care IT: The essential lawyer's guide to health care information technology and the law. American Bar Association. Available at http://www.americanbar.org/content/dam/aba/publications/books/healthcare_data_breaches.authcheckdam.pdf
38 Eveleth, R. 2015. Genetic Testing and Tribal Identity. The Atlantic, January 26. Also see Havasupai Tribe and the Lawsuit Settlement Aftermath. http://genetics.ncai.org/case-study/havasupai-Tribe.cfm
39 Berger, K. M., & Roderick, J. (2014). National and transnational security implications of Big Data in the life sciences. Washington, DC: American Association for the Advancement of Science. http://www.aaas.org/sites/default/files/AAAS-FBI-UNICRI_Big_Data_Report_111014.pdf
40 Trevino, L. K., & Nelson, K. A. (2010). Managing business ethics. John Wiley & Sons. D. Palmer, K. Smith-Crowe and R. Greenwood (2016), Organizational Wrongdoing: Key Perspectives and New Directions (Cambridge, UK: Cambridge University Press), including chapters 5 ("Bad Apples, Bad Barrels and Bad Cellars: A ‘Boundaries’ Perspective on Professional Misconduct.") and 7 (“She blinded me with Science: The Sociology of Scientific Misconduct.”). Leveson, N., Dulac, N., Marais, N., Carroll, J. 2009. Moving Beyond Normal Accidents and High Reliability Organizations: A Systems Approach to Safety in Complex Systems. Organization Studies Vol 30, Issue 2-3, pp. 227–249. Roberts, K. H., Bea, R., & Bartles, D. L. (2001). Must accidents happen? Lessons from high-reliability organizations. The Academy of Management Executive, 15(3), 70-78. Kochan, T. A., Smith, M., Wells, J. C., & Rebitzer, J. B. (1994). Human resource strategies and contingent workers: The case of safety and health in the petrochemical industry. Human Resource Management, 33(1), 55-77.
41 There is increasing variation across states in the extent to which genomics is regulated in research, clinical, and commercial settings. See https://www.genome.gov/policyethics/legdatabase
42 http://www.humanlongevity.com/human-longevity-inc-and-discovery-ltd-to-offer-whole-exome-whole-genome-and-cancer-genome-sequencing-to-discovery-insurance-clients-in-south-africa-and-the-united-kingdom/
43 https://www.dnasimple.org
44 Thomson, L. L. (2015). Personal Data for Sale in Bankruptcy. American Bankruptcy Institute Journal, 34(6), 32.
45 Acquisti, A., C. Taylor and L. Wagman. 2016. The Economics of Privacy. Journal of Economic Literature, Vol. 52, No. 2. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2580411##
46 American Association for Cancer Research. http://www.aacr.org/RESEARCH/RESEARCH/PAGES/AACR-PROJECT-GENIE-DATA.ASPX
47 Compliance Statistics for Policies that Govern Data Submission, Access and Use of Genomic Data. NIH Genomic Data Sharing (GDS). Available at https://gds.nih.gov/20ComplianceStatistics_dbGap.html
48 Devos, L., T. Wang and S. Iyer. 2016. The Genomics Inflection Point: Implications for Healthcare. Rock Health Special Topics Report. https://rockhealth.com/reports/the-genomics-inflection-point-implications-for-healthcare/
49 Skloot, Rebecca (2010). The Immortal Life of Henrietta Lacks, New York City: Random House.
50 See https://ghr.nlm.nih.gov/primer/precisionmedicine/initiative and https://www.cancer.gov/research/key-initiatives/moonshot-cancer-initiative
51 Conley, J. 2017. Williams and Beyond: Legal and Policy Issues in the Regulation of Genetic Testing. UNC Center for Genomics and Society. Presentation, February 9, 2017. http://www.genomicslawreport.com.
52 Cyber Security: Confronting the Threat. 2015. Accenture Consulting. https://www.accenture.com/_acnmedia/Accenture/next-gen/top-ten-challenges/challenge9/pdfs/Accenture-2016-Top-10-Challenges-09-Cyber-Security.pdf