Reproducible research: First steps.

Richard Layton May 6, 2015 First steps towards reproducible research

Transcript of Reproducible research: First steps.

Page 1: Reproducible research: First steps.

Richard Layton

May 6, 2015

First steps towards reproducible research

Page 2: Reproducible research: First steps.

Credibility turns on the success or failure of attempts to reproduce findings.

Kenneth Rogoff & Carmen Reinhart

In economic models

• coding errors

• selective exclusion of available data

• unconventional weighting of summary statistics

Thomas Herndon, Michael Ash, & Robert Pollin (2013). Does high public debt consistently stifle economic growth? A critique of Reinhart and Rogoff. Working paper series 322. Political Economy Research Institute, U Mass Amherst.

Page 3: Reproducible research: First steps.

Credibility turns on the success or failure of attempts to reproduce findings.

Jason deBruyn (Jan 23, 2015) Trial involving disgraced scientist and bunk Duke research to begin Monday. Triangle Business Journal.

In cancer therapy models

• data falsification

• retracted journal articles

• terminated clinical trials

• civil suit by patients

Anil Potti

Page 4: Reproducible research: First steps.

Credibility turns on the success or failure of attempts to reproduce findings.

1000 years of temperature variation: the “hockey stick” graph by Michael Mann

In climate science models

• flawed research methods

• evasion of FOIA requests

• leaked emails

• media hype

Fred Pearce (2010-02-09) Climate change debate overheated after sceptic grasped 'hockey stick'. The Guardian.

Page 5: Reproducible research: First steps.

“Computational science today faces a credibility crisis.”

Victoria Stodden, UIUC

Without access to the code and data that underlie scientific discoveries, published findings are all but impossible to verify.

Page 6: Reproducible research: First steps.

What can reproducible research do for you?

Your closest collaborator is you six months ago, but you don't reply to emails.

Paul Wilson

Engineering Physics

UW–Madison

Page 7: Reproducible research: First steps.

This workflow is probably familiar.

Page 8: Reproducible research: First steps.

Karl Broman

Biostatistics & Medical Informatics

UW–Madison

If you do anything “by hand” once, you’ll do it 100 times.

Page 9: Reproducible research: First steps.

Principle 1.

Blend computing, results, and narrative.

Open a script.

Write content.

Embed the code that creates output.

Some narrative.

<<>>=
hist(co2)
@

Discuss result.

More narrative.

Page 10: Reproducible research: First steps.

Principle 1.

Blend computing, results, and narrative.

<<>>=

hist(co2)

@

Render the text and

code outputs.

Report title

Introduction.

Some narrative.

Discuss result.

More narrative.

Page 11: Reproducible research: First steps.

Report title

Introduction.

Some narrative.

Discuss result.

More narrative.

Some narrative.

<<>>=

hist(co2)

@

Discuss result.

Changes in the script? Render a new report.
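
The re-render step can be sketched in R. This is a minimal illustration, assuming the knitr package is installed; `build_report`, the file names, and the switch from `hist(co2)` to `plot(co2)` are hypothetical choices made for the example, not part of the original slides. It knits the .Rnw source to .tex only; turning that into a PDF would need a LaTeX installation as well.

```r
# Sketch: rebuild a report from its .Rnw source each time the code changes.
# `build_report` and the file names are hypothetical.
build_report <- function(code_line, work_dir) {
  rnw <- c(
    "\\documentclass{article}",
    "\\begin{document}",
    "Some narrative.",
    "<<>>=",
    code_line,            # the embedded code that creates the output
    "@",
    "Discuss result.",
    "\\end{document}"
  )
  old_wd <- setwd(work_dir)
  on.exit(setwd(old_wd))
  writeLines(rnw, "report.Rnw")
  # knit() runs the chunk and weaves code, figure, and narrative into report.tex
  knitr::knit("report.Rnw", output = "report.tex", quiet = TRUE)
  file.path(work_dir, "report.tex")
}

if (requireNamespace("knitr", quietly = TRUE)) {
  work_dir <- tempfile("report-")
  dir.create(work_dir)
  tex_file <- build_report("hist(co2)", work_dir)  # first version of the report
  tex_file <- build_report("plot(co2)", work_dir)  # code changed: knit again
}
```

Nothing is copied or pasted between the two runs: the second call regenerates the figure and the report from the edited source.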

Page 12: Reproducible research: First steps.

.Rnw

Example

Page 13: Reproducible research: First steps.

.Rnw

render

Example

Page 14: Reproducible research: First steps.

.Rnw

render

Example

Page 15: Reproducible research: First steps.

The same report in Markdown.

.Rmd

Page 16: Reproducible research: First steps.

The same report in Markdown.

render

.Rmd

Page 17: Reproducible research: First steps.

render

.Rmd

Page 18: Reproducible research: First steps.

.Rmd

Edit the output option.

No change to the rest of the file.

render

Same report with a different output format.
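
A rough sketch of this idea in R follows. It assumes the rmarkdown package and pandoc are available for the rendering step (the code checks and skips rendering otherwise); `make_rmd` and the report text are hypothetical, chosen to mirror the earlier slides.

```r
# Sketch: the same .Rmd source, rendered to different formats by editing
# only the `output` field. `make_rmd` is a hypothetical helper.
fence <- strrep("`", 3)  # chunk delimiter, built here to keep the listing simple

make_rmd <- function(output_format) {
  c(
    "---",
    "title: \"Report title\"",
    paste("output:", output_format),
    "---",
    "",
    "Some narrative.",
    "",
    paste0(fence, "{r}"),
    "hist(co2)",
    fence,
    "",
    "Discuss result."
  )
}

html_source <- make_rmd("html_document")
word_source <- make_rmd("word_document")
changed <- which(html_source != word_source)  # only the output line differs

# Rendering needs the rmarkdown package and pandoc, so it is guarded here.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmd_file <- file.path(tempdir(), "report.Rmd")
  writeLines(word_source, rmd_file)
  out_file <- rmarkdown::render(rmd_file, quiet = TRUE)  # path to the .docx
}
```

Comparing the two sources line by line shows that a single line of the header changes; the narrative and the code chunk are untouched.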

Page 19: Reproducible research: First steps.

render

.Rmd

Page 20: Reproducible research: First steps.

Principle 2. Organize for reproducibility from the beginning.

1. Everything is a script

2. Every script is connected

3. File management is planned
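
One way to picture these three points is a small driver written in base R. This is a sketch under stated assumptions: the project folder, the script names `01-gather.R` and `02-design.R`, and their contents are all hypothetical, and the built-in `co2` data set stands in for real project data.

```r
# Sketch of a tiny project in which each step is a script.
project <- tempfile("project-")
dir.create(project)

# 1. Everything is a script: two small scripts, written here for illustration.
writeLines(c(
  "csv_path <- file.path(project, 'co2.csv')",
  "write.csv(data.frame(ppm = as.numeric(co2)), csv_path, row.names = FALSE)"
), file.path(project, "01-gather.R"))

writeLines(c(
  "dat <- read.csv(csv_path)",
  "graph_path <- file.path(project, 'co2-hist.pdf')",
  "pdf(graph_path)",
  "hist(dat$ppm)",
  "invisible(dev.off())"
), file.path(project, "02-design.R"))

# 2. Every script is connected: run them in order, so each one uses
#    what the previous one produced.
scripts <- sort(list.files(project, pattern = "\\.R$", full.names = TRUE))
for (script in scripts) source(script, local = TRUE)

# 3. File management is planned: every output sits in the project folder.
outputs <- list.files(project)
```

In an R Markdown report, the same `source()` calls could sit in a code chunk, and `knitr::include_graphics(graph_path)` is one way to place the saved graph in the text.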

Page 21: Reproducible research: First steps.

Data script

# gather data

read(xlsx)

# wrangle data

write(csv)
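
A data script along these lines might look like the sketch below. It makes one loud assumption: the built-in `co2` series stands in for the spreadsheet so the example runs without external files. In a real project the gather step would more likely read the workbook, for example with `readxl::read_excel()`. The file name and column names are hypothetical.

```r
# data.R (hypothetical name): gather raw data, wrangle it, save a CSV.

# gather data
# Assumption: `co2` (monthly Mauna Loa CO2, a ts object) replaces read(xlsx).
raw <- co2

# wrangle data: one row per month, plain columns instead of a ts object
co2_tidy <- data.frame(
  year  = as.integer(floor(time(raw) + 1e-8)),  # small offset guards rounding
  month = as.integer(cycle(raw)),
  ppm   = as.numeric(raw)
)

# write(csv): the file the design script will read
csv_path <- file.path(tempdir(), "co2.csv")
write.csv(co2_tidy, csv_path, row.names = FALSE)
```

Because the script ends by writing a file, later scripts depend only on that CSV, not on objects left in memory from an interactive session.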

Page 22: Reproducible research: First steps.

Design script

# analysis

read(csv)

# create graph

write(PDF)

write(PNG)
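
The design step can be sketched in base R as follows. The file names and the helper `draw_graph` are hypothetical, and the first two lines only create a stand-in CSV so the sketch runs on its own; in a real project that file would come from the data script.

```r
# design.R (hypothetical name): read the CSV, draw the graph, save PDF and PNG.

# Stand-in for the data script's output, so this sketch is self-contained.
csv_path <- file.path(tempdir(), "co2-design.csv")
write.csv(data.frame(ppm = as.numeric(co2)), csv_path, row.names = FALSE)

# analysis: read(csv)
dat <- read.csv(csv_path)

# create graph: one plotting function reused for every output device
draw_graph <- function(d) {
  hist(d$ppm, main = "Atmospheric CO2 at Mauna Loa", xlab = "CO2 (ppm)")
}

# write(PDF)
pdf_path <- file.path(tempdir(), "co2-hist.pdf")
pdf(pdf_path, width = 6, height = 4)
draw_graph(dat)
invisible(dev.off())

# write(PNG); PNG support depends on how R was built, so check first
png_path <- file.path(tempdir(), "co2-hist.png")
if (capabilities("png")) {
  png(png_path, width = 600, height = 400)
  draw_graph(dat)
  invisible(dev.off())
}
```

Keeping the plotting code in one function means the PDF and PNG versions cannot drift apart when the graph is revised.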

Page 23: Reproducible research: First steps.

Narrative script (.Rmd)

Narrative.

```{r}
source(gather)
source(design)
```

include(graph)

render

Report

Page 24: Reproducible research: First steps.

.Rmd

reproducible

report

non-reproducible

documents

Your future self thanks you.

Page 25: Reproducible research: First steps.

Summary: two principles.

Organize for reproducibility from the beginning.

Explicitly link computing, results, and narrative.

Page 26: Reproducible research: First steps.

To learn more,

Victoria Stodden, Friedrich Leisch, & Roger D. Peng (2014)

Christopher Gandrud (2015)

Yihui Xie (2013)

Page 27: Reproducible research: First steps.

One Script to rule them all,

One Script to find them,

One Script to bring them all

And in the Markdown bind them.

Page 28: Reproducible research: First steps.

Image credits

1. Image of Reinhart and Rogoff, reprinted under Creative Commons license, courtesy of The Commentator, http://www.thecommentator.com/privacy_policy.

2. Image of Anil Potti, from WPDE.com, http://www.carolinalive.com/ © 2015 Sinclair Communications, LLC.

3. “Hockey stick” graph from Mann, Bradley, & Hughes, Nature, 1998. Reprinted from The Guardian, © 2015 Guardian News and Media Limited, http://www.theguardian.com/environment/2010/feb/02/hockey-stick-graph-climate-change.

4. Image of Victoria Stodden, from YouTube, speaking on "Reproducible Research: A Digital Curation Agenda" at the 7th International Digital Curation Conference, University of Bath, Bristol, UK, Dec 6, 2011. Creative Commons attribution license.

5. Bing images for the MATLAB logo, Microsoft Word, Excel, & PowerPoint, and for Adobe PDF are reprinted under Creative Commons license.

6. Other unattributed clipart courtesy of https://openclipart.org/, used with permission.