Rmarkdown for Reproducible...

24
Rmarkdown for Reproducible Research Andrew Caines | Linguistics, Cambridge | apc38 27 November 2015 Andrew Caines | Linguistics, Cambridge | apc38 Rmarkdown for Reproducible Research 27 November 2015 1/9

Transcript of Rmarkdown for Reproducible...

Page 1: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

Rmarkdown for Reproducible Research

Andrew Caines | Linguistics, Cambridge | apc38

27 November 2015

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 1 / 9

Page 2: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)

plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 3: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by R

markdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 4: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)

R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 5: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunks

uses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 6: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion program

can produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 7: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documents

maintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 8: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

source file / output file paradigm (cp LaTeX, HTML, markdown)plain text file (.Rmd), rendered by Rmarkdown = plain-text authoring syntax, increasingly popular(e.g. GitHub)R Markdown = implementation of markdown, can take R codechunksuses knitr package (successor to Sweave) and ‘Pandoc’ text fileconversion programcan produce HTML, PDF and Word documentsmaintained by R Studio, now R Markdown v2

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 2 / 9

Page 9: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

simple syntax: shallow learning curve

human-readable: emphasis on communication and clarity, ratherthan technique and complexitytransparent formatting: wysiwygembedded computation: share with collaborators, reviewers,colleagues, students, critics

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 3 / 9

Page 10: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

simple syntax: shallow learning curvehuman-readable: emphasis on communication and clarity, ratherthan technique and complexity

transparent formatting: wysiwygembedded computation: share with collaborators, reviewers,colleagues, students, critics

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 3 / 9

Page 11: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

simple syntax: shallow learning curvehuman-readable: emphasis on communication and clarity, ratherthan technique and complexitytransparent formatting: wysiwyg

embedded computation: share with collaborators, reviewers,colleagues, students, critics

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 3 / 9

Page 12: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

simple syntax: shallow learning curvehuman-readable: emphasis on communication and clarity, ratherthan technique and complexitytransparent formatting: wysiwygembedded computation: share with collaborators, reviewers,colleagues, students, critics

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 3 / 9

Page 13: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

What is R Markdown?

. . . integrating source code, statistical output, and text in RMarkdown is a model of reproducibility.

Such transparency facilitates comprehension, defensibility, andfurther research or testing.

R Markdown helps to bring the vision for reproducibility instatistical analysis articulated by Gentleman & Temple Lang[2004] to reality.

Baumer & Udwin 2015, ‘R Markdown’, WIRES: ComputationalStatistics

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 4 / 9

Page 14: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

How to use R Markdown

installationinstall.packages('rmarkdown')

library(rmarkdown)

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 5 / 9

Page 15: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

How to use R Markdown

installationinstall.packages('rmarkdown')library(rmarkdown)

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 5 / 9

Page 16: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

How to use R Markdownsimple syntax, human-readable, transparent formatting

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 6 / 9

Page 17: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

How to use R Markdownembedded computation

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 7 / 9

Page 18: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?

disappointing :-/Computational linguistics >(open) Linguisticsculture of Shared Tasks, for exampleCurrent project: proprietary data + 4p paper + no shared code +authors moved onMartijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 19: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?disappointing :-/

Computational linguistics >(open) Linguisticsculture of Shared Tasks, for exampleCurrent project: proprietary data + 4p paper + no shared code +authors moved onMartijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 20: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?disappointing :-/Computational linguistics >(open) Linguistics

culture of Shared Tasks, for exampleCurrent project: proprietary data + 4p paper + no shared code +authors moved onMartijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 21: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?disappointing :-/Computational linguistics >(open) Linguisticsculture of Shared Tasks, for example

Current project: proprietary data + 4p paper + no shared code +authors moved onMartijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 22: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?disappointing :-/Computational linguistics >(open) Linguisticsculture of Shared Tasks, for exampleCurrent project: proprietary data + 4p paper + no shared code +authors moved on

Martijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 23: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

R Markdown and Reproducibility

User experience?disappointing :-/Computational linguistics >(open) Linguisticsculture of Shared Tasks, for exampleCurrent project: proprietary data + 4p paper + no shared code +authors moved onMartijn Wieling’s statistics course: Rmd, HTML

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 8 / 9

Page 24: Rmarkdown for Reproducible Researchbioinformatics-core-shared-training.github.io/reproducibility_worksho… · further research or testing. R Markdown helps to bring the vision for

End

useful linksRStudio: R MarkdownUdwin & Baumer arXiv.org paper (= WIRES paper)Shiny & R Markdownmy GitHub ‘replication’ repo

contact meapc38my website[@cainesap](https://twitter.com/cainesap)

Andrew Caines | Linguistics, Cambridge | apc38Rmarkdown for Reproducible Research 27 November 2015 9 / 9