Khoury ashg2014
![Page 1: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/1.jpg)
Separating Signal from Noise in the Age of
Genomics & Big Data:
A Public Health Approach
Muin J. Khoury MD, PhD
CDC Office of Public Health Genomics
NCI Epidemiology & Genomics Research Program
![Page 2: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/2.jpg)
Outline
Big Data & Causation in the Age of Genomics
Promises of Genomics & Big Data
Challenges of Genomics & Big Data
A Public Health Approach to Realize Potential of
Genomics & Big Data
![Page 3: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/3.jpg)
A Case Study: Searching for Needles in the
Haystack: The CDC HuGE Navigator
http://www.hugenavigator.net/HuGENavigator/home.do
![Page 4: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/4.jpg)
Text Mining Tool To Find HuGE Articles
in Published Literature
PubMed signal-to-noise ratio is very low
Support Vector Machine (SVM) tool generated in 2008
Based on >3800 words in text, extensively validated
Sensitivity & specificity >97%
Since 2008, the genetic epidemiology literature has changed considerably
Performance of the SVM model was significantly reduced (60%)
In 2014, a retrained SVM using >4500 words pushed sensitivity and specificity to >90%
Yu W et al. BMC Bioinformatics, 2008
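To make the metrics on this slide concrete, the following is a minimal Python sketch of how sensitivity and specificity are computed for a binary article classifier, and of why a low signal-to-noise ratio still matters when both exceed 97%. The labels, predictions, and the 1% prevalence are made-up illustrative values, not figures from Yu W et al., and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = relevant article)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of flagged articles that are truly relevant (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Toy data (assumed): 4 relevant and 6 irrelevant articles.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")

# Assumed prevalence: 1 in 100 PubMed records is a relevant article.
ppv = positive_predictive_value(0.97, 0.97, 0.01)
print(f"PPV at 1% prevalence: {ppv:.3f}")  # about 0.25
```

Under these assumed numbers, only about one flagged article in four would be truly relevant, which is one way to read the "needles in the haystack" framing.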
![Page 5: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/5.jpg)
Application of Data Mining in the Prediction
of Type 2 Diabetes in the United States
1999-2004 National Health and Nutrition Examination Survey
Developed and validated SVM models for diabetes, undiagnosed diabetes & prediabetes using numerous variables in the survey
Discriminative ability: area under the ROC curve of 84% and 73%
Validated known risk factors for diabetes
Not clear which models are best, which variables to use, or how applicable they are to other populations
Proof of concept only
Yu W et al. BMC Medical Informatics 2010
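For readers unfamiliar with the area under the ROC curve, a minimal Python sketch follows. It uses the pairwise (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. The labels and scores are invented for illustration and are unrelated to the NHANES models; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = has diabetes, score = model output.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, so values such as 84% and 73% indicate useful but imperfect discrimination.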
![Page 6: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/6.jpg)
The IOM Ecological Model & the Need for Multilevel Analysis of “Causation”
Obesity Example NEJM 2007;357:404-7
IOM Ecological Model
![Page 7: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/7.jpg)
“We will all be surrounded by a personal cloud of billions of data points” L. Hood (ISB)
Genomics & Big Data
The Genome is Just the Beginning
![Page 8: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/8.jpg)
Big Data: From Association to Prediction
How about Causation?
Association
Replication
Classification
Prediction
?CAUSATION
Does Big Data care about “Causation”?
Intervention is based on cause-effect
relationships
![Page 9: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/9.jpg)
The Promises of Genomics & Big Data
The Economist
![Page 10: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/10.jpg)
The Promises of Genomics & Big Data
Workup of Rare & Familial Diseases
NEJM June 2014
![Page 11: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/11.jpg)
The Promises of Genomics & Big Data
Improved Disease Classification
![Page 12: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/12.jpg)
The Promises of Genomics & Big Data
Improved Measurement of the “Environment”
http://www.niehs.nih.gov/research/programs/geh/geh_newsletter/2014/4/spotlight/index.cfm
![Page 13: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/13.jpg)
The Promises of Genomics & Big Data
Better Understanding of Natural History
G Ginsburg
![Page 14: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/14.jpg)
The Promises of Genomics & Big Data
Stratified Prevention (One size does not fit all)
No one is average: “population medicine: let’s get over it” (E. Topol)
![Page 15: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/15.jpg)
The Promises of Genomics & Big Data
Precision Medicine
![Page 16: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/16.jpg)
The Promises of Genomics & Big Data
Pathogen Genomics
![Page 17: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/17.jpg)
The Promises of Genomics & Big Data
Public Health Practice
“As cholera swept through London in the
mid-19th century, a physician named John
Snow painstakingly drew a paper map
indicating clusters of homes where the
deadly waterborne infection had struck. In
an iconic feat in public health history, he
implicated the Broad Street pump as the
source of the scourge—a founding event in
modern epidemiology. Today, Snow might
have crunched GPS information and disease
prevalence data and solved the problem
within hours”
http://www.hsph.harvard.edu/news/magazine/big-datas-big-visionary/?utm_source=SilverpopMailing&utm_medium=email&utm_campaign=Kiosk%2009.25.14_academic%20(1)&utm_content
![Page 18: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/18.jpg)
Some Promises of Genomics & Big Data
Workup of Rare & Familial Diseases
Improved Disease Classification
Improved Measurement of the “Environment”
Better Understanding of Disease Natural History
Stratified Prevention
Precision Medicine
Pathogen Genomics
Public Health Practice
![Page 19: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/19.jpg)
The Challenges of Genomics & Big Data
Problems of Study Designs & Hidden Biases
“…claims are based upon complex
(and we believe flawed)
analyses…there are far simpler
alternative explanations for the
patterns they observed. We believe
that the authors have not excluded
important alternative explanations”
G. Breen
“Schizophrenia is Eight Different Diseases
Not One” USA Today (9/15/2014)
“Eight types of schizophrenia? Not so
fast” Genomes Unzipped (9/30/2014)
Am J Psychiatry Sep 2014
![Page 20: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/20.jpg)
![Page 21: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/21.jpg)
The Challenges of Genomics & Big Data
Analytic Issues: Dealing with Complexity
Prediction of LDL cholesterol response to statin using transcriptomic and
genetic variation. Kyungpil Kim et al. Genome Biology, Sep 2014
![Page 22: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/22.jpg)
The Challenges of Genomics & Big Data
Reproducibility
Figure labels: lots of input variables; molecularly defined disease subsets & precursors; millions of genetic variants
![Page 23: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/23.jpg)
Am J Clin Nutrition 2013
![Page 24: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/24.jpg)
The Challenges of Genomics & Big Data
Causation, Ecologic Fallacies & Hubris
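A small Python sketch of an ecologic fallacy, assuming entirely invented data: three groups in which the group-level averages of an exposure and an outcome rise together, while within every group the individual-level relationship runs the opposite way. The group names, values, and helper functions are hypothetical and serve only to illustrate why group-level associations cannot be read as individual-level effects.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented (exposure, outcome) pairs for individuals in three groups.
groups = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}


def group_level_r(groups):
    """Correlation computed on group means only (the ecologic view)."""
    mean_x = [sum(x for x, _ in pts) / len(pts) for pts in groups.values()]
    mean_y = [sum(y for _, y in pts) / len(pts) for pts in groups.values()]
    return pearson(mean_x, mean_y)


def within_group_r(groups):
    """Correlation computed separately inside each group."""
    return {
        name: pearson([x for x, _ in pts], [y for _, y in pts])
        for name, pts in groups.items()
    }


print("group-level r:", round(group_level_r(groups), 3))
print("within-group r:", {k: round(v, 3) for k, v in within_group_r(groups).items()})
```

In this toy setup the group means are perfectly positively correlated while each group shows a perfectly negative individual-level correlation, so an inference drawn from the averages alone would point in the wrong direction.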
![Page 25: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/25.jpg)
‘The Scientific Method Itself is Growing
Obsolete.’ (A. Butte, Sep 2014)
“...implicit assumption that big data are a substitute for, rather than a supplement to, traditional data collection and analysis.”
http://blogs.kqed.org/science/audio/how-big-data-is-changing-medicine/
Garbage In, Garbage Out (GIGO)
![Page 26: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/26.jpg)
The Challenges of Genomics & Big Data
Beyond Prediction: From Validity to Utility
![Page 27: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/27.jpg)
The Challenges of Genomics & Big Data
Challenges of Population Stratification & Precision
Medicine
![Page 28: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/28.jpg)
Some Challenges of Genomics & Big Data
Problems of Study Designs & Hidden Biases
Analytic Issues: Dealing with Complexity
Reproducibility and Replication
Causation vs Association: Ecologic Fallacies &
Hubris
Translation: from Validity into Utility and
Implementation
Challenges of Population Stratification &
Personalized Medicine
![Page 29: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/29.jpg)
A Public Health Translation Framework
for Genomics & Big Data
Figure: translation cycle with Knowledge Integration at its center
Stages: Discovery; Application; Evidence-based Recommendation or Policy; Health Care & Prevention Programs; Population Health
Phase labels: T0, T1, T2, T3, T4
Activity labels: Development; Evaluation; Implementation Science; Effectiveness & Outcomes Research; CER; PCOR; Economics; ELSI
Disciplines: Basic, Clinical & Population Sciences
Khoury MJ et al, AJPH, 2012
![Page 30: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/30.jpg)
A Public Health Approach to Realizing
Promises of Genomics & Big Data
1. Use a Strong Epidemiologic Foundation
The study of distribution and determinants of disease occurrence and outcomes in populations, and using resulting knowledge to improve health and prevent disease
Fundamental science of medicine and public health
Human Genome Epidemiology (HuGE)- Beyond Gene Discovery
New Brand of “Big Data Epidemiology” 2010
![Page 31: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/31.jpg)
![Page 32: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/32.jpg)
Epidemiologic Cohort Studies: The NCI Cohort Consortium
• Major role in identifying specific carcinogenic environmental agents
▫ Asbestos – Lung
▫ Benzene – Leukemia
▫ Smoking – many diseases
• Exposure/risk factor assessment prior to onset of disease
▫ Overcome recall/selection biases
• Permit absolute measures of risk/incidence rates
▫ Relevant for public health policies
• Valuable resource for studying repeated measures and multiple outcomes
• Investigators responsible:
– 40+ high-quality cohorts
– 4+ million people
• Coordinated, interdisciplinary approach
• Tackle important scientific questions, economies of scale, and opportunities to quicken the pace of research
• Focused so far mostly on etiology, but adapting to include outcomes
![Page 33: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/33.jpg)
Epidemiology Data Sharing & Harmonization
Nature, August 27, 2014
![Page 34: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/34.jpg)
A Public Health Approach to Realizing
Promises of Genomics & Big Data
2. Develop a Robust Knowledge Integration
Process
![Page 35: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/35.jpg)
![Page 36: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/36.jpg)
Components of Knowledge Integration
• Knowledge Management: Integration of knowledge from disparate sources & disciplines
• Knowledge Synthesis: Systematic synthesis of scientific findings
▫ Accumulating evidence on a cancer outcome
– Minimize waste in repeat funding
▫ Identify scientific gaps
– Inform research priorities
• Knowledge Translation
▫ Stakeholder engagement
▫ Evidence-based information
▫ Decision support tools
![Page 37: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/37.jpg)
Interpretation
“The Bottleneck for Realizing Personalized Medicine”
(Good et al. Genome Biology Sep 2014)
![Page 38: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/38.jpg)
The NIH BD2K Initiative Can Help
![Page 39: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/39.jpg)
A Public Health Approach to Realizing
Promises of Genomics & Big Data
3. Use (and not avoid) Principles of Evidence-
based Medicine and Population Screening
![Page 40: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/40.jpg)
Guidelines We Can Trust (IOM, 2011)
![Page 41: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/41.jpg)
Guidelines We Can Trust in Genomic Medicine (Schully S et al. Genetics in Medicine 2014)
![Page 42: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/42.jpg)
CDC-Sponsored
EGAPP Working Group
• Independent, multidisciplinary, non-federal panel established in 2004
• Established a systematic, evidence-based process to assess validity & utility of genomic tests & family health history applications.
• New methods for evidence synthesis and modeling in 2013, including next generation sequencing and stratified cancer screening based on family history
• 10 recommendation statements to date:
– Colorectal cancer, breast cancer, heart disease, clotting disorders, depression, prostate cancer, diabetes, and more
• Clinical Validity vs Clinical Utility
• Uncovered evidence gaps that require additional research
• Principles can be applied to other “Big Data”
![Page 43: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/43.jpg)
Evidence-based Classification of Genomic
Applications in Practice
Tier 1
Tier 2
Tier 3
http://www.cdc.gov/genomics/gtesting/tier.htm
![Page 44: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/44.jpg)
Evidence-based Binning of the Genome
Genetics in Medicine 2011
![Page 45: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/45.jpg)
A Public Health Approach to Realizing
Promises of Genomics & Big Data
4. Develop a Robust T2+ Translational
Research Agenda
![Page 46: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/46.jpg)
Limited Translational Research in Genomics Beyond the Bedside
Khoury MJ, 2007; Schully, 2012; Clyne M, 2014
T0 ↔ T1 ↔ T2 ↔ T3 ↔ T4
T1: Discovery to Application
T2: Application to Guideline
T3: Guideline to Practice
T4: Practice to Population Health Impact
<1% of published genomics research is in T2 – T4
Multiple clinical and population scientific disciplines involved
![Page 47: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/47.jpg)
Cancer Genomics Research Funding T2+
Public Health Genomics 2010
![Page 48: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/48.jpg)
A MultiDisciplinary T2+ Research Agenda
Comparative Effectiveness Research
Patient-centered Outcomes Research
Behavioral, Social & Communication Sciences
Economic Studies
Surveillance & Population Monitoring
![Page 49: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/49.jpg)
A Public Health Approach to Realizing
Promises of Genomics & Big Data
Use a Strong Epidemiologic Foundation
Develop a Robust Knowledge Integration
Process
Use (and not avoid) Principles of Evidence-
based Medicine and Population Screening
Develop a Robust T2+ Research Agenda
(Learning Health systems, Consumer
Involvement etc..)
![Page 50: Khoury ashg2014](https://reader034.fdocuments.in/reader034/viewer/2022042602/55940fc51a28abaf478b4688/html5/thumbnails/50.jpg)
In Summary
“Big Data” is agnostic to disease causation
Numerous promises for the health impact of genomics & Big Data; the leading edge of genomics & Big Data is beginning to be applied
But numerous challenges face genomics & Big Data, so we should not overpromise & underdeliver
A “Public Health” translational approach is needed to realize the potential of genomics & Big Data