Linking Data and Publications: the Chemistry Way. Simon Coles, School of Chemistry, University of Southampton, U.K. s.j.coles@soton.ac.uk. CLADDIER workshop, Chilworth, Southampton, 15th May 2007. This work is licensed under a Creative Commons Licence, Attribution-ShareAlike 3.0: http://creativecommons.org/licenses/by-sa/3.0/


Page 1: Linking Data and Publications: the Chemistry Way Simon Coles School of Chemistry, University of Southampton, U.K. s.j.coles@soton.ac.uk CLADDIER workshop.

                                                             

Linking Data and Publications:

the Chemistry Way

Simon Coles

School of Chemistry,

University of Southampton, U.K.

s.j.coles@soton.ac.uk

CLADDIER workshop Chilworth, Southampton 15th May 2007

This work is licensed under a Creative Commons Licence, Attribution-ShareAlike 3.0

http://creativecommons.org/licenses/by-sa/3.0/

Page 2:

The Research Data Lifecycle

Research & e-Science workflows

Aggregator services: national, commercial

Repositories: institutional, e-prints, subject, data, learning objects

Data curation: databases & databanks

Validation

Harvesting metadata

Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media

Deposit / self-archiving

Peer-reviewed publications: journals, conference proceedings

Publication

Validation

Data analysis, transformation, mining, modelling

Searching, harvesting, embedding

Presentation services: subject, media-specific, data, commercial portals

Resource discovery, linking, embedding

Linking

Page 3:

Current Situation - Data Deluge

[Figure: chemical structure diagrams]

30,000,000

2,000,000

450,000

Page 4:

Current Situation – Data and Publishing

Page 5:

Separating Data from Interpretations

Underlying data (Institutional data repository)

Intellect & Interpretation (Journal article, report, etc.)

Page 6:

The eCrystals Public Data Archive

http://ecrystals.chem.soton.ac.uk

Page 7:

Laboratory IRs and Data Management

Page 8:

The R4L Repository

Deposit

Search / Browse

Create new compound

Add experiment data and metadata

Page 9:

eCrystals ‘Federation’ Model

Aggregator services

Institutional data repositories

Deposit, Validation

Publication

Validation

Data analysis

Search, harvest

Presentation services / portals

Data discovery, linking, citation

Laboratory repository

Deposit

Publishers: peer-review journals, conference proceedings, etc.

Curation

Preservation

Subject Repository

Institution Library & Information Services

Data creation & capture in “Smart lab”

Page 10:

Metadata standards: Dublin Core

About 15 core elements
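For reference, the simple Dublin Core element set contains fifteen elements. The sketch below lists them and checks a record against that list; the helper name `unknown_elements` and the sample record are hypothetical and only illustrate the idea.

```python
# The fifteen elements of the Dublin Core Metadata Element Set (simple Dublin Core).
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def unknown_elements(record):
    """Return the keys of `record` that are not simple Dublin Core elements."""
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Hypothetical record, for illustration only.
record = {
    "title": "Example crystal structure",
    "creator": "A. Researcher",
    "formula": "C6H6",  # not a simple Dublin Core element
}

print(unknown_elements(record))
```

Anything the check reports (here the chemical formula) is information that needs a qualified or domain-specific extension rather than one of the core elements.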

Page 11:

Metadata Publication

ecrystals.chem.soton.ac.uk/perl/oai2
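Assuming the address above is an OAI-PMH 2.0 endpoint serving unqualified Dublin Core (`oai_dc`), a harvester might build its request and read titles out of the response roughly as follows. The `http://` scheme, the function names and the sample response are assumptions made for illustration; the sample is not real eCrystals output, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Address from the slide; the http:// scheme is an assumption.
BASE_URL = "http://ecrystals.chem.soton.ac.uk/perl/oai2"

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_titles(xml_text):
    """Extract every dc:title from an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iterfind(".//dc:title", NS)]


# Hypothetical response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example compound</dc:title>
          <dc:creator>A. Researcher</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

print(list_records_url(BASE_URL))
print(parse_titles(SAMPLE_RESPONSE))
```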

Page 12:

Metadata Publication

• Using simple Dublin Core
  • Crystal structure
  • Title (Systematic IUPAC Name)
  • Authors
  • Affiliation
  • Creation Date

• Additional chemical information through Qualified Dublin Core
  • Empirical formula
  • International Chemical Identifier (InChI)
  • Compound Class & Keywords

• Specifies which ‘datasets’ are present in an entry

• DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145

• Rights & Citation http://ecrystals.chem.soton.ac.uk/rights.html

• Application Profile http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
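As a rough sketch of how such an entry could be put together, the code below combines simple Dublin Core fields with extra chemical fields and derives the DOI link from an entry number. The `chem` namespace URI, the element names under it, the `record` wrapper element and the sample values (benzene) are all hypothetical; the real element names are defined by the application profile linked above. The DOI pattern is generalised from the single example on the slide, which is also an assumption.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
CHEM_NS = "http://example.org/chem-terms/"  # hypothetical namespace, not the real profile


def doi_url(entry_id):
    """Resolver URL for an entry, following the pattern of the DOI shown on the slide."""
    return f"http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/{entry_id}"


def build_record(simple, chemical):
    """Build an XML record from simple DC fields plus additional chemical fields."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("chem", CHEM_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in simple.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    for name, value in chemical.items():
        ET.SubElement(root, f"{{{CHEM_NS}}}{name}").text = value
    return root


# Illustrative values only (benzene), not taken from the archive.
simple = {
    "title": "benzene",
    "creator": "A. Researcher",
    "type": "Crystal structure",
    "identifier": doi_url(145),
}
chemical = {
    "empiricalFormula": "C6H6",
    "inchi": "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
}

record = build_record(simple, chemical)
print(ET.tostring(record, encoding="unicode"))
```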

Page 13:

Aggregating Datasets

CCDC

CDS

Page 14:

Aggregating Datasets

Page 15:

Search and Discovery

Page 16:

Controlled Vocabulary and Semantics

http://www.rsc.org/Publishing/Journals/ProjectProspect/index.asp

Page 17:

Linking Data and Publications

• Link data and associated ‘publications’

• Dataset annotated with metadata

• Semantic publishing on WWW and in journals

http://www.ukoln.ac.uk/projects/ebank-uk/pilot/
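One way to picture the linking described above is to record the relationship on both sides, so that a reader can move from an article to its data and back. The sketch below does this with plain dictionaries. The relation names `isReferencedBy` and `references` are borrowed from the DCMI Metadata Terms; the function name and the article identifier are hypothetical, and this is not presented as the eBank implementation.

```python
def link_dataset_to_publication(dataset, publication):
    """Record the link on both records so it can be followed in either direction."""
    dataset.setdefault("isReferencedBy", []).append(publication["identifier"])
    publication.setdefault("references", []).append(dataset["identifier"])


dataset = {
    "identifier": "http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145",
    "type": "crystal structure",
}
publication = {
    "identifier": "doi:10.9999/hypothetical.article",  # hypothetical article DOI
    "type": "journal article",
}

link_dataset_to_publication(dataset, publication)
print(dataset["isReferencedBy"])
print(publication["references"])
```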

Page 18:

The Future?

Database Citation Services

Literature Citation Services

Controlled Vocabulary & Semantics