Checking, Curating And Qualifying Chemistry

Qualifying Online Information Resources for Chemists, Antony Williams

description

A presentation given at UNC Chapel Hill on online chemistry resources and how ChemSpider and ChemMantis are contributing.

Transcript of Checking, Curating And Qualifying Chemistry

Page 1: Checking, Curating And Qualifying Chemistry

Qualifying Online Information Resources for Chemists

Antony Williams

Page 2: Checking, Curating And Qualifying Chemistry

Building a Structure Centric Community for Chemists

Access to Information

For me…
  PhD: Libraries primary source of information
  PostDoc/Academia: Libraries and librarians
  Eastman Kodak: Software tools and databases
  Kodak and ACD/Labs: Replaced by the internet
  Today: The Internet enhanced by a network of collaborators…

Librarians have become gurus in using software systems to resource information

Page 3: Checking, Curating And Qualifying Chemistry

The Language of Chemistry

My language…

Page 4: Checking, Curating And Qualifying Chemistry

And its dialects…

Page 5: Checking, Curating And Qualifying Chemistry

As a chemist…

I look for information about chemicals/chemistry
  What is a particular structure?
  What alternative names/identifiers?
  Reaction synthesis?
  Physical properties?
  Analytical data?
  Purchase?
  Tell me more?
  Similar stuff – what other compounds are “like” mine?

Page 6: Checking, Curating And Qualifying Chemistry

Searching and Reading Articles…

Searching articles based on chemical structure and substructure is very expensive… but is changing

The web IS “tool-ready” so when will publishers deliver?
  Structures can be shown
  Spectra can be interactive
  Graphics don’t need to be static
  Publishers can enhance their articles (Project Prospect from the RSC is an example)

Page 7: Checking, Curating And Qualifying Chemistry

Publications

Page 8: Checking, Curating And Qualifying Chemistry

Enable Electronic Articles…

Structures are the language of chemistry

Show structures to chemists and search/link from there…

Page 9: Checking, Curating And Qualifying Chemistry

Allow Integration…

Page 10: Checking, Curating And Qualifying Chemistry

And Extend to Patents…

Page 11: Checking, Curating And Qualifying Chemistry

What can be done?

Page 12: Checking, Curating And Qualifying Chemistry

Blogs, Wikis, Forums and Collaborative Science

I have two blogs, one forum and a full blog reader…
  http://www.chemspider.com/blog
  http://www.chemspider.com/chemunicating (ChemConnector)
  http://forum.chemspider.com/

They are catalytic for collaborations, getting questions answered, garnering comments and feedback

There are upsides and downsides: http://www.chemspider.com/blog/the-joys-and-frustrations-of-6-months-blogging-in-the-chemistry-community.html

Page 13: Checking, Curating And Qualifying Chemistry

Collaborative Knowledge Management for Chemists – Wikipedia, Built by a Network

Page 14: Checking, Curating And Qualifying Chemistry

Wikipedia Chemistry Curation project

  Only ca. 5000 organic structures
  A year of work for a team of 6 people
  Many errors removed in the process.
  Slow and torturous process
  CAS collaborating in the process

Page 15: Checking, Curating And Qualifying Chemistry

Wikipedia via ChemSpider…

Page 16: Checking, Curating And Qualifying Chemistry

The Quality of Data Online…

Content is king – quality costs. Curation is expensive!

Data online are “filthy”.
  Gathering data is the “easy part”
  Structures are COMMONLY incorrect

Informatics tools exist already
  Hold millions of structures and associated data
  Structure/substructure/text searching
  Data downloads, data uploads, editing, annotation

Page 17: Checking, Curating And Qualifying Chemistry

Caution! Question Everything!

Page 18: Checking, Curating And Qualifying Chemistry

Question Everything
www.dhmo.org

Page 19: Checking, Curating And Qualifying Chemistry

Quality of Structures!!!

Page 20: Checking, Curating And Qualifying Chemistry

Quality of Structures

Page 21: Checking, Curating And Qualifying Chemistry

InChIs
  Structure but NOT substructure
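
To make the "structure but NOT substructure" point concrete, here is a minimal sketch in Python. It is not taken from the presentation: the three-record index and the helper names (`lookup_exact`, `connectivity_layer`, `naive_text_hits`) are hypothetical, and the InChI strings are standard InChIs for ethanol, benzene and toluene used purely for illustration. It suggests why an InChI works well as an exact-match key, while plain text matching on InChI strings does not behave like a substructure search.

```python
# Hypothetical in-memory index keyed by InChI string (illustrative data only).
INDEX = {
    "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3": "ethanol",
    "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H": "benzene",
    "InChI=1S/C7H8/c1-7-5-3-2-4-6-7/h2-6H,1H3": "toluene",
}


def lookup_exact(inchi):
    """Exact structure lookup: the whole InChI string is the key."""
    return INDEX.get(inchi)


def connectivity_layer(inchi):
    """Return the connectivity ('c') layer of an InChI, or None if absent."""
    # Skip the version prefix and the formula layer, then find the 'c' layer.
    for layer in inchi.split("/")[2:]:
        if layer.startswith("c"):
            return layer
    return None


def naive_text_hits(query_inchi):
    """Names of records whose InChI contains the query's connectivity layer as text.

    This is deliberately naive: it treats the identifier as plain text.
    """
    layer = connectivity_layer(query_inchi)
    if layer is None:
        return []
    return [name for inchi, name in INDEX.items() if layer in inchi]


benzene = "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"

# Exact match works: the identifier retrieves its own record.
print(lookup_exact("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))

# Text matching only finds benzene itself. Toluene contains a benzene ring,
# but its atoms are numbered differently, so the text does not match.
print(naive_text_hits(benzene))
```

A genuine substructure search has to work on the structure itself (the atom and bond graph), typically through a cheminformatics toolkit, rather than on the identifier text.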

Page 22: Checking, Curating And Qualifying Chemistry

Conclusions

  The internet enables chemistry – and at a reduced cost
  Web 2.0 is here and improving quality – to benefit 3.0
  Question Quality!
  Crowdsourcing for expansion, curation and integration
  Classical models may die quite quickly – business models must change soon or fail
  Publishers – heed the proliferation of InChIs for Chemistry

Page 23: Checking, Curating And Qualifying Chemistry

The ChemSpider Journal – 12/2008
www.chemspider.com