Science Commons Open Notebook Science Talk

Post on 06-May-2015


Jean-Claude Bradley presents at the Science Commons Symposium on Feb 20, 2010 at the Microsoft Campus in Redmond. The talk covers doing Open Notebook Science using free and hosted tools, including new archiving protocols developed with Andrew Lang.

Transcript of Science Commons Open Notebook Science Talk

Using Free Hosted Web 2.0 Tools for Open Notebook Science

Jean-Claude Bradley

February 20, 2010

Science Commons Symposium

Associate Professor of Chemistry, Drexel University

The case for Open Notebook Science

1. Is our current system working?
2. Is ONS difficult or expensive to implement?
3. Does ONS prevent peer-reviewed publication?
4. Can ONS data be easily discoverable?
5. Can ONS information be easily archived and cited?
6. Is ONS compatible with IP protection?

How bad is our current system? Try to find the solubility of EGCG.

=2.3 g/L

WTF?!

The End of the Chain of Provenance

The NaH oxidation controversy

Information spreads quickly through the blogosphere

15% NMR yield

Khalid Mirza and Marshall Moritz

Top results on a Google search

The Scandal of Bell’s Lab Notebook

Motivation: Faster Science, Better Science

Open Notebook Science Logos (Andy Lang, Shirley Wu)

Sharing: how much and when

There are NO FACTS, only measurements embedded within assumptions

Open Notebook Science maintains the integrity of data provenance by making assumptions explicit

TRUST

PROOF

The solubility of 4-chlorobenzaldehyde

The Log makes Assumptions Explicit

The Rationale of Findings Explicit

Raw Data Made Public

Splatter?

Some liquid

YouTube for demonstrating experimental set-up

Calculations Made Public on Google Spreadsheets

Revision History on Google Spreadsheets

Wiki Page History

Comparing Wiki Page Versions

Proof of Purity with interactive NMR spectrum using JSpecView and JCAMP-DX

Linking to Molecules in Chemistry Databases

Experimental Spectra and User-Deposited Data on ChemSpider

(Andy Lang, Tony Williams)

Open Data JCAMP spectra for education

(Andy Lang, Tony Williams, Robert Lancashire)

Database Curation via Game Playing

Over 100,000 spectrum views so far - worldwide

Link Spectral Game to Open Educational Content

The Ugi reaction: can we predict precipitation?

Can we predict solubility in organic solvents?

Crowdsourcing Solubility Data

ONS Submeta Award Winners

ONS Challenge Judges

Teaching Lab: Brent Friesen (Dominican University)

Solubility Experiment List

Solubilities collected in a Google Spreadsheet

Rajarshi Guha’s Live Web Query using Google Viz API

WE ARE HERE

How can the scientific process become more automated?

Semi-Automated Measurement of solubility via web service analysis of JCAMP-DX files (Andy Lang)
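A JCAMP-DX file is plain text in which metadata is stored as labeled records of the form ##LABEL=value. The sketch below only shows how such records might be read before any analysis; it does not reproduce Andy Lang's web service, and the function name and sample contents are hypothetical.

```python
def parse_jcamp_header(text):
    """Collect ##LABEL=value records from JCAMP-DX text into a dict."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip data rows and anything that is not a labeled record.
        if not line.startswith("##") or "=" not in line:
            continue
        label, value = line[2:].split("=", 1)
        records[label.strip().upper()] = value.strip()
    return records


# Hypothetical file contents, not a spectrum from the notebook.
sample = """##TITLE=hypothetical sample spectrum
##JCAMP-DX=5.01
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ
##END=
"""

header = parse_jcamp_header(sample)
print(header["TITLE"], "|", header["DATA TYPE"])
```

A real analysis step would go on to read the tabulated data points and integrate peaks; that part is left out here because the slides do not describe it.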

Solubility Measurement Requests: DoSol sheet

• Outlier Bot: flags measurements with high standard deviation to mean ratios
• Google Analytics queries: new solvent/solute searches
• Solubility request form: researcher in Israel requesting pyrene in acetonitrile solubility for environmental soil contamination study
• Application based models: high priority Ugi reactants
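The Outlier Bot idea, flagging solute/solvent pairs whose repeated measurements have a high standard deviation to mean ratio, could be sketched roughly as follows. This is an illustration only: the 0.5 threshold, the function name, and the example values are assumptions, not the bot's actual settings or data.

```python
from statistics import mean, stdev


def flag_outliers(measurements, max_ratio=0.5):
    """Return the (solute, solvent) pairs whose stdev/mean exceeds max_ratio.

    max_ratio is a hypothetical threshold chosen for illustration.
    """
    flagged = []
    for pair, values in measurements.items():
        if len(values) < 2:
            continue  # a sample standard deviation needs two or more values
        average = mean(values)
        if average == 0:
            continue  # the ratio is undefined for a zero mean
        if stdev(values) / average > max_ratio:
            flagged.append(pair)
    return flagged


# Hypothetical solubilities in mol/L, not values from the notebook.
example = {
    ("solute A", "methanol"): [0.50, 0.52, 0.49],
    ("solute B", "ethanol"): [0.10, 0.90, 0.30],
}
print(flag_outliers(example))
```

Pairs flagged this way would be candidates for a repeat measurement request rather than automatic rejection.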

Solubility Prediction (Andy Lang’s Model)

Understanding in addition to empirical modeling

Missed in a prior publication on solubility for this compound

Data provenance: From Wikipedia to…

…the lab notebook and raw data

Including links to the literature

• Concentration (0.4, 0.2, 0.07 M)
• Solvent (methanol, ethanol, acetonitrile, THF)
• Excess of some reagents (1.2 eq.)

How does Open Notebook Science fit with traditional publication?

Paper written on Wiki

References to papers, blog posts, lab notebook pages, raw data

Paper on Journal of Visualized Experiments (JoVE)

Pre-print on Nature Precedings

ChemSpider Automated Mark-up of Chemical Names

BUT…

Open Access: the Choice that Keeps Giving.. and Giving…

Beware of your addiction to metrics: redundancy will reduce them

Cameron Neylon’s Notebooks

Other Open Notebooks

Anthony Salvagno’s Notebook (Steve Koch group)

Traditional Lab Notebook (unpublished)

Traditional Journal Article

Open Access Journal Article

Open Notebook Science (full transparency)

CLOSED OPEN

Traditional Paper Textbook, F2F lectures

Lecture Notes public

Assigned problems public

Archived Lectures Public and free online textbooks

RESEARCH

TEACHING

Where do Libraries fit in the communication of science and education in the Open/Closed Continuum?

The Missing Pieces of the Puzzle

• Automatic Backup of Science 2.0 Data

• Archiving of Open Notebooks

• Science 2.0 Community Needed Resources - Preservation, Cataloging, Archiving, Cite-ability

Librarians and Science 2.0

"The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library, with the purpose of offering permanent access for researchers, historians, and scholars to historical collections that exist in digital format."

The Internet Archive is not practical for practitioners of Open Notebook Science or Science 2.0

Good concept but.....

Most pages look like this....

Where We Began: The ONS backup spreadsheet and ONSPreserver

Publishing Google Spreadsheets as XLS
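A published spreadsheet can be addressed by a URL that asks for the XLS form, as in the archived Solubilities Summary Sheet citation later in this talk. The sketch below rebuilds that URL pattern from a spreadsheet key. The function name is hypothetical, and the URL format is the one shown in the 2009 citation, so it may no longer be served by Google.

```python
from urllib.parse import urlencode


def published_xls_url(key):
    """Build the published-spreadsheet URL that requests XLS output.

    Follows the pattern seen in the talk's own citation; treat the
    endpoint as historical rather than a current Google API.
    """
    base = "http://spreadsheets.google.com/pub"
    return base + "?" + urlencode({"key": key, "output": "xls"})


# Key taken from the Solubilities Summary Sheet citation in this talk.
url = published_xls_url("plwwufp30hfq0udnEmRD1aQ")
print(url)
```

A backup script would fetch such a URL on a schedule and store the file with a timestamp; the download step is omitted here.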

Where We Are Now

ONSArchive: Semi-Automated Snapshot of the Entire Scientific Record

Snapshot is Self-Contained and Live on the Internet

Lulu.com Data Disks

DSpace – Handle (hdl)

Lulu.com - ISBN

Google Spreadsheets

Google Documents

Web Services

ChemSpider & Indiana

Real Time Linear Regression, Unit Conversions, Style Sheet, etc.

Data Book

Bradley, Jean-Claude; Lang, Andrew. Solubilities Summary Sheet. Open Notebook Science Challenge. 2009-12-11. URL: http://spreadsheets.google.com/pub?key=plwwufp30hfq0udnEmRD1aQ&output=xls. Accessed: 2009-12-11. (Archived by WebCite® at http://www.webcitation.org/5lx5ry3BV)

More about the ONSarchive project:

Conclusions

1. Is our current system working? NO
2. Is ONS difficult or expensive to implement? NO
3. Does ONS prevent peer-reviewed publication? NO – but depends on publisher
4. Can ONS data be easily discoverable? YES
5. Can ONS information be easily archived and cited? YES
6. Is ONS compatible with IP protection? Maybe to a limited extent