Data-Driven Discovery

Post on 08-Oct-2020

Data-Driven Discovery @ Moore

Carly Strasser, PhD
ALPSP, 15 September 2017

Patient Care, Environment, Bay Area, Science

Patient Care

Marine Microbes

Environment, Science, Bay Area

Data-Driven Discovery

EPiQS TMT …

Computational X

Data Intensive

Data-Driven

Data Deluge

Big Data

Data Science

From Flickr by deltaMike

Carly Strasser

From Flickr by melocactus

From Flickr by solarnu

From Wikimedia

From Wikimedia

More, bigger data…

…means more computation, stats, and programming

The need for data science is ubiquitous

Computational skills

Math & Stats skills

Domain expertise

Academia makes it hard for researchers to engage

Career tracks

Credit & incentives

Training

Barriers to interdisciplinary work

Data-Driven Discovery @ Moore

$60M

6 Years

Institutions: 50%, $30M, 3 grantees
People: 35%, $21M, 14 grantees
Practices: 15%, $9M, 7 grantees

Institutions: 50%, $30M, 3 grantees

How can we change the way institutions think about data-driven research(ers)?

Careers

Space

Tools

Training

Open Science

Data Science Studies

Themes

People: 35%, $21M, 14 grantees

Computational skills

Math & Stats skills

Domain expertise

Metagenomics

Statistics

Mathematics

Computer science

Gene expression

Astrophysics

Imaging

Ecology

Evolution

Gene expression

Computer science

Environmental

CS visualization

Astrophysics

DDD Investigators

$1.5M per investigator for 5 years

Practices: 15%, $9M, 7 grantees

Promote tools and techniques for data-driven research

Interactive lab notebook for sharing workflows

Code cell

Results of running code cell

Narrative/instructions

Code

Narrative

Info on dependencies

● View notebooks in browser
● Interactive
● All dependencies captured
● Reproducibility!

beta.mybinder.org

Oran Viriyincy CC-BY

We are part of the scientific infrastructure

Our decisions affect the quality of the science

research process

Requirements
✓ Meetings
✓ Budgets
✓ Progress
✓ Annual reports

The foundation’s general policy is that Data and Intellectual Property must be managed and disseminated in a manner that leads to the greatest impact. Accordingly, in most cases, Data and Intellectual Property should be owned by the grantee and made available at no cost or, when justified, at a reasonable cost.

Website: carlystrasser.net
Email: carlystrasser@gmail.com
Twitter: @carlystrasser #MooreData