2/16/18 Guest Lecturer: Dr. Leslie McIntosh
Fran Berman, Data and Society, CSCI 4370/6370
Data and Society
Lecture 5: Data and Health
2/16/18
Guest Lecturer: Dr. Leslie McIntosh
Announcements 2/16
• Op-Ed drafts returned today after break.
• Op-Ed finals due March 2. Please turn in hard copies of the draft and the final copy on March 2 at 9:00 a.m. If you will be doubling your draft grade, let Fran ([email protected]) know by March 1.
• Wednesday class, February 21, starts at 8:30 a.m.
• Guest speaker Dr. Leslie McIntosh today on data and health
| Wednesday Section | Friday Lecture: First Half of Class | Friday Lecture: Second Half of Class | Assignments |
| --- | --- | --- | --- |
| January 17: NO class | January 19 L1: CLASS INTRO AND LOGISTICS | Presentation Model / Op-Ed Instructions | Op-Ed instructions |
| January 24: NO class | January 26 L2: BIG DATA 1 | 4 Presentations | |
| January 31: NO class | February 2 L3: BIG DATA 2 -- IoT | 4 Presentations | |
| February 7: NO class | February 9 L4: DATA AND SCIENCE | 4 Presentations | Op-Ed due Feb. 9 |
| February 14: 3 Presentations | February 16 L5: DATA AND HEALTH / LESLIE McINTOSH GUEST SPEAKER | 5 Presentations | Op-Ed drafts returned Feb. 21 |
| February 21: 5 Presentations | February 23 L6: DATA STEWARDSHIP AND PRESERVATION | 4 Presentations | Research Paper instructions |
| February 28: 5 Presentations | March 2 L7: DATA INFRASTRUCTURE | 4 Presentations | Op-Ed Final due March 2 |
| March 7: 5 Presentations | March 9: NO CLASS / PAPER PREPARATION | | |
| March 14: Spring Break | March 16: SPRING BREAK | | |
| March 21: NO class | March 23: NO CLASS / PAPER PREPARATION | | |
| March 28: 5 Presentations | March 30 L8: DATA RIGHTS, POLICY, REGULATION | 4 Presentations | Research Paper due March 28 |
| April 4: NO class | April 6 L9: DATA AND ETHICS | 4 Presentations | |
| April 11: 5 Presentations | April 13 L10: DATA AND COMMUNICATION | 4 Presentations | |
| April 18: 5 Presentations | April 20 L10: DATA FUTURES | 4 Presentations | |
| April 25: 5 Presentations | April 27 L11: HOT TOPICS / TBD | | |
Dr. Leslie McIntosh
• PhD, MPH
• Research Data Alliance, Executive Director - US
• Ripeta, LLC, CEO
• Formerly Director for the Center for Biomedical Informatics and Professor for the Department of Pathology and Immunology at Washington University, St. Louis School of Medicine
• Biomedical Informatics, reproducibility researcher, global player in health data
A Journey through Biomedical Data using a Reproducibility Lens
Leslie D. McIntosh, PhD, MPH
Research Data Alliance, Executive Director - US
Ripeta, LLC, CEO
@mcintold
Disclosures
SafeT, LLC Board Member
Asteris, LLC Advisor
Ripeta, LLC Founder
Team for Reproducible Research
NLM Supplement
Cynthia Hudson-Vitale
Anthony Juehne
Rosalia Alcoser
Xiaoyan Lui
Brad Evanoff
Research Data Alliance
Cynthia Hudson-Vitale
Anthony Juehne
Snehil Gupta
Connie Zabarovskaya
Brian Romine
RDA Collaborators
Andreas Rauber
Stefan Pröll
Funding Support
Washington University Institute of Clinical and Translational Sciences
NIH CTSA Grant Number UL1TR000448 and UL1TR000448-09S1
MacArthur Foundation 2016 Adoption Seeds program Foundation through a sub-contract with Research Data Alliance
An Informatics Perspective
BDaaS: Biomedical Data as a Service
[Diagram: Researchers, Data Broker, i2b2 Application, Biomedical Data Repository]
Move some of the responsibility of reproducibility
[Diagram: Biomedical Researcher, Biomedical Pipeline]
Challenges
Clinical recommendations based upon restricted-use data pose challenges for research transparency and accessibility.
Given the protected nature of much biomedical research, what is needed to call our research reproducible?
How do we as biomedical researchers facilitate research reproducibility?
What is reproducibility research anyway?
Why do we care?
What is Reproducible Research?
Definitions
Replicable - independent people collecting new data and using the same methods
Reproducible - independent people analyzing the same data
V. Stodden, “Trust Your Science? Open Your Data and Code,” Amstat News, 1 July 2011; http://magazine.amstat.org/blog/2011/07/01/trust-your-science/
Why do we care?
Climategate, 2009
Duke’s Precision Medicine Bust, 2007 - 2011+
(Lack of) Reproducibility of Psychological Science, 2015
Reproducibility of Cancer Biology, 2017
https://cos.io/our-services/research/rpcb-overview/?imm_mid=0eceb8&cmp=em-data-na-na-newsltr_20170201
https://elifesciences.org/collections/reproducibility-project-cancer-biology
| Paper | Conclusion | Focus of key experiment | Replication results | Citations |
| --- | --- | --- | --- | --- |
| Sirota, M. et al. Sci. Transl. Med. 3, 96ra77 (2011) | Public gene expression data can identify unintuitive uses for old drugs | Growth of tumours treated with an anti-ulcer drug | Substantially reproduced | 334 |
| Sugahara, K. N. et al. Science 328, 1031–1035 (2010) | A tumour-penetrating peptide enhances the effects of cancer drugs | Growth of peptide-treated tumours | Not reproduced | 495 |
| Willingham, S. B. et al. Proc. Natl Acad. Sci. USA 109, 6662–6667 (2012) | Blocking contact between CD47 and another protein inhibits tumour | Growth and metastasis of treated tumours | Uninterpretable | 290 |
| Delmore, J. E. et al. Cell 146, 904–917 (2011) | Blocking a protein sequence damps down pro-cancer genes | Gene expression in treated cells; growth of treated tumours | Substantially reproduced | 1059 |
| Berger, M. F. et al. Nature 485, 502–506 (2012) | Sequencing reveals gene that is frequently mutated in melanoma and accelerates growth | Tumour formation in cells carrying mutations | Uninterpretable | 428 |
Relative risk of Alzheimer's between men and women: Record corrected
https://www.sciencedaily.com/releases/2017/08/170828124531.htm
https://jamanetwork.com/journals/jamaneurology/article-abstract/2649260?resultClick=1&redirect=true
Privacy and Ethics
Transparency
Accessibility
Personal Privacy
Population Vulnerability
“U.S. soldiers are revealing sensitive and dangerous information by jogging”
Back to Reproducibility
The purpose of this study is to:
● Assess the current research reproducibility practices of investigators who use electronic health records (EHR) for secondary analysis, then
● Develop a framework
1. Do you have a well-formed, well-defined question?
2. Do you have data to answer your question?
3. If you get an answer, is the answer relevant or meaningful?
Pertinence: Can the data answer the research hypothesis?
Replicability/Reproducibility: Is all the information available to reuse the data for a study?
Quality: Are the data of sound quality to want to reuse them?
How a business begins...
[Diagram: Scientific Method, Science]
Ricochet Concept
[Diagram: Science, Scientific Method]
Work-to-Date
Ricochet Outputs
Publication
RepeAT: A Framework to Assess Empirical Reproducibility in Biomedical Research
BioMed Central (https://doi.org/10.1186/s12874-017-0377-6)
Repos:
https://github.com/ripeta
https://osf.io/ppnwa/
International Engagement
Research Data Alliance
Ricochet Framework
100+ unique variables
5 categories
● bibliographic
● database & data collection
● data mining & cleaning
● data analysis
● data sharing & documentation
Ricochet Framework
| Section | Item | Response options |
| --- | --- | --- |
| Publication Overview and Bibliographic Information (21 items) | Is the research hypothesis-driven or hypothesis-generating? | Hypothesis Driven; Hypothesis Generating; Unclear |
| Database and Data Collection (63 items) | Publication states database(s) source(s) of data? | Yes/No |
| Database and Data Collection (63 items) | *Publication states database(s) source(s) of data in the following location: | Not Stated; Supplementary materials; Appendix; Body of Text |
| Database and Data Collection (63 items) | Query methodology | Manual extraction; Digital extraction through query interface; Digital extraction through honest broker; Not Applicable/Not Stated |
| Database and Data Collection (63 items) | *Does the shared query script for database contain comments and/or notations for ease of reproducibility? | Yes/No |
| Methods: Data Mining and Cleaning (19 items) | Does the research involve natural language processing or text mining? | Yes/No |
| Methods: Data Mining and Cleaning (19 items) | *Is the text mining software application proprietary or open? If multiple applications were used, please select all options that apply. | 1. Proprietary; 2. Mixed; 3. Open |
Ricochet Framework
| Section | Item | Response options |
| --- | --- | --- |
| Methods: Data Analysis (15 items) | Does the author state analysis methodology and process? | Yes/No |
| Methods: Data Analysis (15 items) | Does the author indicate the software used to develop the analysis code? | Yes/No |
| Methods: Data Analysis (15 items) | *Is the analysis software proprietary or open? | Proprietary; Open |
| Data Sharing and Data Documentation (36 items) | Is the finalized dataset shared? | Yes; No |
| Data Sharing and Data Documentation (36 items) | *Where is the finalized dataset shared? | Affiliated Research Center Website; Author’s Institution or Department Website; Data Registry; Journal or Publication’s Website; GitHub; Other |
| Data Sharing and Data Documentation (36 items) | Is there a clear process for requesting the data? | Yes; No |
Ricochet Framework has been tested for inter-rater reliability and face validity
Ricochet Example
Pharmacogenomics. 2012 Mar; 13(4): 407–418. Published online 2012 Feb 13. doi: 10.2217/pgs.11.164
Here we test: whether the published associations between steady-state warfarin dose and variants in warfarin pharmacogenes could be replicated in BioVU; how implementing published pharmacogenomic algorithms affects dosing error; and if an improved algorithm for African–Americans can be generated using variants associated with stable dose in this population.
Hypothesis Stated = Yes
Ricochet Example
Cases were identified in BioVU, the Vanderbilt DNA Biobank, which accrues DNA samples extracted from blood remaining from routine clinical testing after it has been retained for 3 days and is scheduled to be discarded [27].
Data Source Stated = Yes
Data Source Cited = Yes
Ricochet Example
The R programming language was used for regression analyses, diagnostic-test calculations, and to implement and evaluate the algorithms (R Foundation for Statistical Computing, Vienna, Austria).
Software Stated = Yes
Software Cited = Yes
Software Version = No
Ricochet Example
Analyses Code Present = Partially
Supplementary Table 4. Equations of novel Expanded Genetic algorithm.
Expanded Genetic: 5.9487517
- 0.0073436353 * race (AA=0,EA=1)
- 0.025161445 * age (in years)
+ 0.058138499 * sex (F=0,M=1)
+ 1.1848957 * bsa (kg/m2)
+ 0.068020571 * smoking status (nonsmoker=0,smoker=1)
+ 0.058578086 * VTE indication (no=0,yes=1)
- 0.10646416 * atrial fibrillation indication (no=0,yes=1)
- 0.8142521 * amiodarone use (no=0,yes=1)
- 0.64877338 * CYP2C9*2 (wt=0,heterozygote=1,homozygote=2)
- 1.0601067 * CYP2C9*3 (wt=0,heterozygote=1,homozygote=2)
- 1.9737831 * CYP2C9*6 (wt=0,heterozygote=1,homozygote=2)
- 1.0622944 * CYP2C9*8 (wt=0,heterozygote=1,homozygote=2)
+ 0.24749973 * CYP4F2 (wt=0,heterozygote=1,homozygote=2)
- 0.31996754 * CALU (wt=0,heterozygote=1,homozygote=2)
- 0.87262446 * VKORC1 (wt=0,heterozygote=1,homozygote=2)
= log[weekly warfarin dose]
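The published equation is a plain linear model, which is why the slide scores analysis code as only partially present: the coefficients are given, but no runnable code. A minimal Python sketch of how the equation could be evaluated follows. The dictionary keys and the function name `log_weekly_dose` are hypothetical labels chosen for this illustration; the coefficients are transcribed from the slide. The slide does not state the base of the logarithm, so the sketch returns the log value only.

```python
# Intercept and coefficients transcribed from the "Expanded Genetic" equation.
# The key names are hypothetical labels, not identifiers from the paper.
INTERCEPT = 5.9487517
COEFFICIENTS = {
    "race": -0.0073436353,      # AA=0, EA=1
    "age": -0.025161445,        # in years
    "sex": 0.058138499,         # F=0, M=1
    "bsa": 1.1848957,           # body surface area, units as given on the slide
    "smoker": 0.068020571,      # nonsmoker=0, smoker=1
    "vte": 0.058578086,         # VTE indication, no=0, yes=1
    "afib": -0.10646416,        # atrial fibrillation indication, no=0, yes=1
    "amiodarone": -0.8142521,   # amiodarone use, no=0, yes=1
    "cyp2c9_2": -0.64877338,    # genotypes: wt=0, heterozygote=1, homozygote=2
    "cyp2c9_3": -1.0601067,
    "cyp2c9_6": -1.9737831,
    "cyp2c9_8": -1.0622944,
    "cyp4f2": 0.24749973,
    "calu": -0.31996754,
    "vkorc1": -0.87262446,
}


def log_weekly_dose(patient):
    """Return log[weekly warfarin dose] for a dict of predictor values."""
    missing = COEFFICIENTS.keys() - patient.keys()
    if missing:
        raise ValueError(f"missing predictors: {sorted(missing)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)


# Illustrative input only: every predictor at its reference value except age.
example = {name: 0 for name in COEFFICIENTS}
example["age"] = 50
example_log_dose = log_weekly_dose(example)
```

Requiring every predictor to be supplied explicitly avoids silently treating a missing value as zero, which would change the result without warning.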
Ricochet Results
Ricochet Methods
Manually curating manuscripts
Using NLP and neural network algorithms to mine data
Reproducibility in an Evolving EHR Data Repository
Next Steps
Refine RepeAT using mined literature
Incorporate more variables into tool
Incorporate data citability into WU EHR repository
https://www.rd-alliance.org/system/files/documents/RDA-DC-Recommendations_150609.pdf
Moving Biomedical Big Data Sharing Forward
1. Integrate the RDA recommendations for Data Citation of Evolving Data into the local EHR repository
2. Contribute all source code back to the i2b2 GitHub
3. Gather feedback about RDA WGDC-compliant i2b2 code from established i2b2 installations
MacArthur Foundation 2016 Adoption Seeds program Foundation through a sub-contract with Research Data Alliance
Implementation
R1 and R2 Implementation
PostgreSQL Extension “temporal_tables”
[Diagram, steps 1-3: RDC.table (columns c1, c2, c3, sys_period); RDC.hist_table* (columns c1, c2, c3, sys_period); triggers]
*stores history of data changes
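ETL Incrementals
[Diagram: each source data record is checked, "Update?" or "Insert?"]
Update: the old data moves to RDC.hist_table with sys_period (2016-9-8 00:00, 2016-9-9 00:00); the new row in RDC.table gets sys_period (2016-9-9 00:00, NULL).
Insert: the new row in RDC.table gets sys_period (2016-9-9 00:00, NULL).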
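The update and insert paths above can be sketched in a few lines of Python. This is an in-memory illustration of the idea only, not the PostgreSQL extension or the team's ETL code. The function `apply_incremental`, the dictionary layout, and the "unchanged" shortcut are assumptions made for this sketch.

```python
from datetime import datetime


def apply_incremental(current, history, key, values, now):
    """Apply one source record at time `now`, keeping closed versions in `history`.

    current: dict mapping key -> {"values": ..., "sys_period": (start, None)}
    history: list of superseded row versions with a closed sys_period
    """
    old = current.get(key)
    if old is None:
        action = "insert"
    elif old["values"] == values:
        return "unchanged"  # assumption: identical data creates no new version
    else:
        # Close the old version's period and move it to the history table.
        history.append({
            "key": key,
            "values": old["values"],
            "sys_period": (old["sys_period"][0], now),
        })
        action = "update"
    # The live row always has an open-ended period (end = None, like NULL).
    current[key] = {"values": values, "sys_period": (now, None)}
    return action


# Replaying the dates shown on the slide for a hypothetical record "p1".
current, history = {}, []
first = apply_incremental(current, history, "p1", {"c1": "a"}, datetime(2016, 9, 8))
second = apply_incremental(current, history, "p1", {"c1": "b"}, datetime(2016, 9, 9))
```

After the second call, the history list holds the old version with period (2016-9-8, 2016-9-9) and the current table holds the new version with an open period, matching the slide.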
R3, R7, R8, R9, and R10 Implementation
PostgreSQL Extension “temporal_tables”
[Diagram, steps 1-3: RDC.table and RDC.hist_table combined in RDC.table_with_history (view)]
• functions
• triggers
• query audit tables
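A rough Python sketch of what a view over current and history rows, plus a query audit table, makes possible: re-running a query "as of" an earlier time, and recording a hash of the result so a later run can show whether the data changed. All names here (`rows_as_of`, `audited_query`, the audit fields) are hypothetical, and the in-memory lists stand in for database tables.

```python
import hashlib
import json
from datetime import datetime


def rows_as_of(current_rows, history_rows, when):
    """Return the row versions valid at `when` from the union of both tables."""
    valid = []
    for row in list(current_rows) + list(history_rows):
        start, end = row["sys_period"]
        if start <= when and (end is None or when < end):
            valid.append(row)
    # A stable sort order keeps the result hash deterministic.
    return sorted(valid, key=lambda r: r["key"])


def result_hash(rows):
    """Hash keys and values only, so the same data always hashes the same."""
    payload = json.dumps([[r["key"], r["values"]] for r in rows], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def audited_query(audit_log, current_rows, history_rows, when, executed_at):
    """Run an as-of query and append a record of it to the audit log."""
    rows = rows_as_of(current_rows, history_rows, when)
    audit_log.append({
        "as_of": when.isoformat(),
        "executed_at": executed_at.isoformat(),
        "result_sha256": result_hash(rows),
    })
    return rows


# Hypothetical data: one record whose value changed on 2016-9-9.
current_rows = [{"key": "p1", "values": {"c1": "b"},
                 "sys_period": (datetime(2016, 9, 9), None)}]
history_rows = [{"key": "p1", "values": {"c1": "a"},
                 "sys_period": (datetime(2016, 9, 8), datetime(2016, 9, 9))}]
audit_log = []
run_time = datetime(2018, 2, 16)
before = audited_query(audit_log, current_rows, history_rows, datetime(2016, 9, 8, 12), run_time)
after = audited_query(audit_log, current_rows, history_rows, datetime(2016, 9, 10), run_time)
```

Comparing the two stored hashes shows that the result for this record differs between the two points in time, while repeating either query reproduces its hash.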
Data Reproducibility Workflow
Bonus Feature: Determine if Change Occurred
Other thoughts...
Other Information
Resources
GitHub - Research Reproducibility Assessment Tool: https://github.com/CBMIWU/Research_Reproducibility
Bibliography for NLM funded study: https://www.zotero.org/groups/biomedical_informatics_resrepro
Attribution
Bakery
https://www.flickr.com/photos/puthoor_photo/9615954804/
London Underground
https://www.flickr.com/photos/jvk/151830527/
www.flickr.com/photos/wttw/14763924/
http://bit.ly/1WL4k1U
UEA Climate Research Unit
By ChrisO - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=8730683
Reproducibility Duke University PMI Bust
http://www.eurekalert.org/multimedia/pub/1579.php
Reproducibility of Psychological Findings
doi: 10.1126/science.aac4716
Bibliography on Reproducible Research
1. Landis, S. C. et al. A call for transparent reporting to optimize the predictive value of preclinical research. Nature 490, 187–191 (2012).
2. Freedman, L. P. & Inglese, J. The increasing urgency for standards in basic biologic research. Cancer Res. 74, 4024–4029 (2014).
3. Khoury, M. J. et al. Transforming Epidemiology for 21st Century Medicine and Public Health. Cancer Epidemiology Biomarkers & Prevention 22, 508–516 (2013).
4. Laine, C., Goodman, S. N., Griswold, M. E. & Sox, H. C. Reproducible Research: Moving toward Research the Public Can Really Trust. Ann Intern Med 146, 450–453 (2007).
5. Announcement: Reducing our irreproducibility. Nature 496, 398–398 (2013).
6. Russell, J. F. If a job is worth doing, it is worth doing twice. Nature 496, 7–7 (2013).
7. Vasilevsky, N. A. et al. On the reproducibility of science: unique identification of research resources in the biomedical literature. PeerJ 1, e148 (2013).
8. Begley, C. G. & Ioannidis, J. P. A. Reproducibility in Science: Improving the Standard for Basic and Preclinical Research. Circulation Research 116, 116–126 (2015).
9. Peng, R. D. Reproducible research and Biostatistics. Biostatistics 10, 405–408 (2009).
10. Bustin, S. A. The reproducibility of biomedical research: Sleepers awake! Biomolecular Detection and Quantification 2, 35–42 (2014).
11. Stodden, V. et al. Setting the Default to Reproducible: Reproducibility in Computational and Experimental Mathematics. (2013-02-16). https://icerm.brown.edu/tw12-5-rcem/icerm_report.pdf
12. Johnson, R. B. & Onwuegbuzie, A. J. Mixed Methods Research: A Research Paradigm Whose Time Has Come. Educational Researcher 33, 14–26 (2004).
13. Tenopir, C. et al. Data Sharing by Scientists: Practices and Perceptions. PLoS ONE 6, e21101 (2011).
14. Research Data Alliance, Data Citation of Evolving Data. https://rd-alliance.org/sites/default/files/RDA-DC-Recommendations_150508_1.pdf
15. http://www.nature.com/news/cancer-reproducibility-project-releases-first-results-1.21304#/water
Questions
Break
Op-Ed Draft Grades
Final Op-Ed grade is Draft Grade + Final Grade or 2 X Draft Grade
[Chart: DRAFT GRADES (out of 15); horizontal axis from 10 to 15 in steps of 0.5, vertical axis from 0 to 8]
Calculating your Grade So Far (GSF)
GSF (a %) =
Presentation grade / 15 [if you’ve given a presentation]
+ Draft grade / 15
+ Draft grade / 15 [if you’ll be doubling your Draft Grade]
+ Participation Grade / 15 [Assume 15 if you will miss no more than 2 classes and fully participate in class]
Example: You got a 13/15 on your presentation and are doubling your draft grade of 12/15 and will participate fully in class. In that case, your GSF is
13/15 + 12/15 + 12/15 + 15/15 = 52/60 = 87%
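The calculation can be sketched as a small Python function. This is one reading of the slide, assuming (as the worked example's 52/60 suggests) that the points are summed and divided by 15 for each component counted. The function name `grade_so_far` and its parameters are hypothetical.

```python
def grade_so_far(draft, presentation=None, doubling_draft=False,
                 participation=15.0, max_points=15.0):
    """Return the Grade So Far as a percentage.

    draft: Op-Ed draft grade out of 15
    presentation: presentation grade out of 15, or None if not yet given
    doubling_draft: True if the draft grade also counts as the final Op-Ed grade
    participation: assumed 15 for full participation, as on the slide
    """
    earned = [draft, participation]
    if presentation is not None:
        earned.append(presentation)
    if doubling_draft:
        earned.append(draft)
    # Normalise by the maximum available across the components counted.
    return 100.0 * sum(earned) / (max_points * len(earned))


# The slide's example: 13/15 presentation, doubled 12/15 draft, full participation.
example_gsf = grade_so_far(12, presentation=13, doubling_draft=True)
```

Rounding `example_gsf` to the nearest whole percent gives the 87% shown in the example.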
Discussion Article for February 23
• “How the data that Internet companies collect can be used for the public good”, Harvard Business Review, https://hbr.org/2018/01/how-the-data-that-internet-companies-collect-can-be-used-for-the-public-good
Presentations
Presentation Articles for February 21
• “The Australian Square Kilometre Array Pathfinder Finally Hits the Big Data Highway”, Phys. Org., https://phys.org/news/2017-01-australian-square-kilometre-array-pathfinder.html [Ben H]
• “Precision Medical Treatments Have a Quality Control Problem,” NPR, https://www.npr.org/sections/health-shots/2017/12/29/572853436/precision-medical-treatments-have-a-quality-control-problem [Diego C]
• “Precision Medicine in the million genome era”, Genetic Engineering and Biotechnology News, http://www.genengnews.com/gen-articles/precision-medicine-research-in-the-million-genome-era/5944 [Halley F]
• “New Technologies Bring Marine Archaeology Treasures to Light”, The Guardian, https://www.theguardian.com/science/2016/dec/29/new-technologies-bring-marine-archaeology-treasures-to-light [Julia F]
• “Relative risk of Alzheimer’s between men and women: Record corrected”, Science Daily, https://www.sciencedaily.com/releases/2017/08/170828124531.htm [Sam S-F]
Presentation Articles for February 23
• “The Lost NASA Tapes: Restoring Lunar Images after 40 Years in the Vault”, ComputerWorld, http://www.computerworld.com/article/2525935/computer-hardware/the-lost-nasa-tapes--restoring-lunar-images-after-40-years-in-the-vault.html?page=1 [Cameron M]
• “A Long-Lost Data Trove Uncovers California’s Sterilization Program”, The Atlantic, https://www.theatlantic.com/health/archive/2017/01/california-sterilization-records/511718/ [Justin T]
• “Will today’s digital movies exist in 100 years?”, IEEE Spectrum, http://spectrum.ieee.org/consumer-electronics/standards/will-todays-digital-movies-exist-in-100-years [Sean P]
• “Can DNA hard drives solve our looming data storage crisis?”, SingularityHub, https://singularityhub.com/2016/10/21/can-dna-hard-drives-solve-our-looming-data-storage-crisis/ [John L]
Presentation Articles for February 28
• “Text Messages: Preservation Lessons for Mobile E-Discovery”, Lexology, https://www.lexology.com/library/detail.aspx?g=2ec513e9-daed-4e17-9bbb-c26dd7f28be7
• “CMS releases more than one petabyte of open data”, Phys. Org., https://phys.org/news/2017-12-cms-petabyte.html
• “Librarians Saving the Internet”, Science Friday, https://apps.sciencefriday.com/data/librarians.html
• “Stewards Mine Data to keep F1 Races honest and drivers safe,” NY Times, https://www.nytimes.com/2017/07/15/sports/autoracing/stewards-mine-data-keep-formula-one-drivers-honest-races-safe.html?_r=0
• “As Climate Change Fades from Government Sites, A Struggle to Archive Data”, Frontline, https://www.pbs.org/wgbh/frontline/article/as-climate-change-fades-from-government-sites-a-struggle-to-archive-data/
Presentation Articles for March 2
• “Lasers reveal a Mayan civilization so dense it blew experts’ minds,” New York Times, https://www.nytimes.com/2018/02/03/world/americas/mayan-city-discovery-laser.html
• “Plagiarism software unveils a new source for 11 of Shakespeare’s plays”, New York Times, https://www.nytimes.com/2018/02/07/books/plagiarism-software-unveils-a-new-source-for-11-of-shakespeares-plays.html
• “Attuned to temblors: how well can scientists forecast massive earthquakes?” Christian Science Monitor, https://www.csmonitor.com/Environment/2017/1122/Attuned-to-temblors-How-well-can-scientists-forecast-massive-earthquakes
• “The Human Cell Atlas: From Vision to Reality”, Nature.com, https://www.nature.com/news/the-human-cell-atlas-from-vision-to-reality-1.22854
Presentation Articles for Today
• “How the end of Net Neutrality will affect the Internet of Things”, Network World, https://www.networkworld.com/article/3244251/net-neutrality/how-the-end-of-net-neutrality-will-affect-iot.html [Trulee H]
• “How Big Tech is Going After your Health Care,” NY Times, https://www.nytimes.com/2017/12/26/technology/big-tech-health-care.html [Daniel C]
• “Healthcare is hemorrhaging data. AI is here to help.”, Wired, https://www.wired.com/story/health-care-is-hemorrhaging-data-ai-is-here-to-help/ [Matthew M]
• “Researchers use WWII Code-breaking Techniques to Interpret Brain Data,” Phys. Org., https://phys.org/news/2017-12-wwii-code-breaking-techniques-brain.html [Chandler M]