ChemSpider – disseminating data and enabling an abundance of chemistry platforms
![Page 1: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/1.jpg)
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov, Dmitry Ivanov, Colin Batchelor, Jon Steele and David Sharpe
ACS New Orleans April 2013
![Page 2: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/2.jpg)
ChemSpider
• >28.5 million unique chemicals from >400 data sources
• Focus on improving data quality, enhancing functionality, integrating and enabling
![Page 3: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/3.jpg)
![Page 4: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/4.jpg)
Some usage statistics
• ca. 200 visitors at any one time, ~30,000 visits per day
• Mar 4-Apr 3, 2013
– Visits = 731,656
– Unique Visitors = 527,008
• Independent servers to support other projects
![Page 5: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/5.jpg)
Access ChemSpider
• APIs
– Programmatic access used by mobile apps, funded consortia projects, many academic groups
• Widgets
– UI components for embedding in other websites
• Data
– Data access, downloads, reuse, licensing
![Page 6: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/6.jpg)
Supporting the Semantic Web
rdf.chemspider.com/CSID
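The address pattern on this slide puts a ChemSpider ID (CSID) directly after the host name. A minimal sketch of building such an address is given here. The function name `chemspider_rdf_url`, the `http` scheme and the example CSID are assumptions made for illustration; the slide gives only the host and path pattern, and nothing here is fetched from the service.

```python
def chemspider_rdf_url(csid: int) -> str:
    """Build an RDF address for a ChemSpider ID using the slide's pattern."""
    # Reject booleans, non-integers and non-positive values up front.
    if isinstance(csid, bool) or not isinstance(csid, int) or csid <= 0:
        raise ValueError("csid must be a positive integer")
    # Assumption: plain http scheme; the slide shows only host/CSID.
    return f"http://rdf.chemspider.com/{csid}"


# Arbitrary example CSID, used only to show the shape of the address.
example_url = chemspider_rdf_url(2157)
print(example_url)
```

Validating the identifier before formatting keeps malformed addresses out of any downstream linked-data tooling.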
![Page 7: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/7.jpg)
ChemSpider Resources for Chemistry
![Page 8: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/8.jpg)
![Page 9: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/9.jpg)
From this… to this
Simplified interface
ChemSpider Audiences
![Page 10: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/10.jpg)
Substance Pages
![Page 11: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/11.jpg)
![Page 12: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/12.jpg)
It is so difficult to navigate…
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
Competitors?
IP?
![Page 13: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/13.jpg)
• 3-year knowledge management IMI project
• Integrating chemistry and biology data and delivering using semantic web technologies
• Open source code, open data and open standards
• Academics, Pharma companies, Publishers….
![Page 14: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/14.jpg)
ChemSpider Contributions
• The host of the chemistry services
– Supplier of “standardized” chemical data files
– Chemistry searching (structure, substructure etc.)
– Provider of data in RDF format
– Curator and data quality checking
• Now building the Open PHACTS chemical registration system
![Page 15: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/15.jpg)
ChemSpider Contributions
• Supplier of chemistry UI components
• “Quality Police” for data checking
• Chemical Validation and Standardization Platform
• Nanopublications from RSC publications
![Page 16: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/16.jpg)
• FP7 Initiative. PharmaSea: increasing value and flow in the marine biodiscovery pipeline
![Page 17: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/17.jpg)
PharmaSea
• Dereplication via ChemSpider
• Segregation of natural products datasets
• Analytical data algorithms & integration
– Mass spec searching – predicted fragmentation
– NMR feature searching – NMR prediction
– Computer-assisted structure elucidation
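Dereplication by mass spec searching can be illustrated with a simple accurate-mass lookup: an observed neutral monoisotopic mass is compared against known compounds and only candidates within a ppm tolerance are kept. The sketch below is an illustration of that general idea, not the PharmaSea or ChemSpider implementation. The names `Candidate` and `dereplicate`, the 5 ppm default and the three-compound reference list are all assumptions chosen for the example.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error of observed vs. reference, in parts per million."""
    return (observed - reference) / reference * 1e6


def dereplicate(
    observed_mass: float,
    candidates: List[Candidate],
    tolerance_ppm: float = 5.0,  # assumed tolerance, instrument dependent
) -> List[Tuple[Candidate, float]]:
    """Return candidates within the tolerance, closest match first."""
    scored = [(c, ppm_error(observed_mass, c.monoisotopic_mass)) for c in candidates]
    hits = [(c, err) for c, err in scored if abs(err) <= tolerance_ppm]
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Tiny illustrative reference list (a real search would query a database).
KNOWN = [
    Candidate("caffeine", 194.0804),
    Candidate("glucose", 180.0634),
    Candidate("aspirin", 180.0423),
]

matches = dereplicate(180.0630, KNOWN)
for candidate, err in matches:
    print(f"{candidate.name}: {err:.1f} ppm")
```

A ppm window rather than an absolute window keeps the tolerance proportional to mass, which is how accurate-mass instruments are usually specified. Fragmentation or NMR evidence would then be needed to separate candidates that share a mass.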
![Page 18: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/18.jpg)
Integrate to instruments and software
• Integration to analytical instrumentation vendors already in place
– Agilent, Bruker, Thermo, Waters
• Also, cheminformatics vendors link to ChemSpider
– Accelrys, ACD/Labs, ChemAxon, iChemLabs, and…
![Page 19: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/19.jpg)
Natural Products Updates
• Names hard, Structures “Obvious”
• New content based on monthly updates of the database
• Click through to the Natural Products Updates entry
![Page 20: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/20.jpg)
National Chemical Database Service
![Page 21: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/21.jpg)
Chemical Database Service
• National Chemical Database Service for UK Academics
• Integrating Commercial Databases and Services
• Chemicals, analytical data, prediction algorithms
• Development of data repository
![Page 22: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/22.jpg)
Retrosynthetic Analysis
![Page 23: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/23.jpg)
Publications - a summary of work
• Scientific publications are a summary of work
– Is all work reported?
– How much science is lost to pruning?
– What of value sits in notebooks and is lost?
• How much data is lost?
– How many compounds never reported?
– How many syntheses fail or succeed?
– How many characterization measurements?
![Page 24: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/24.jpg)
Community Repository for Data
• Funding agencies encourage sharing of data
• Increasing availability of “Open Data”
• Institutional repositories offer no specific domain support
• Develop a community repository for chemistry data – private, public, embargoed
• Provides data to develop models/algorithms
![Page 25: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/25.jpg)
Community Repository for Data
• Automated depositions of data
• DOI’ed data objects for citation purposes
• A database of reference data, but validated by the community
• National services feeding the repository – crystallography, mass spectrometry
• Integrate to blogging tools for chemistry
• Integrate to Electronic Lab Notebooks as feeds
![Page 26: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/26.jpg)
Model Building with Community Data
• Community data as a basis of model building
– Consume data from available databases, community data, new publications and build predictive algorithms for the community
– How many algorithms are reported and lost? How much repeat work is done in the domain of algorithmic development?
![Page 27: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/27.jpg)
Recognition on Data
IC50 Measurements for 62 substituted benzoxazoles
ChemSpider Data Repository: DOI: 10.1356/CSID784.4
![Page 28: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/28.jpg)
Integrate to electronic lab notebooks
![Page 29: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/29.jpg)
E-Lab Notebooks
• Previous work with IDBS and University of Cambridge
• Working on LabTrove integration with U. Southampton
• Integration between ELNs and:
– ChemSpider
– ChemSpider Reactions
– CDS Repository
• Publish data from ELNs and issue DOIs
• Data aggregated into fully indexed ESI format for publication
![Page 30: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/30.jpg)
Support for Chemical Reactions
• Integrating mined reaction data from patents (Daniel Lowe)
• Will also incorporate and integrate: Methods of Organic Synthesis, Catalysts and Catalyzed Reactions and…
![Page 31: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/31.jpg)
Micro-publishing Chemical Reactions
![Page 32: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/32.jpg)
ChemSpider SyntheticPages
![Page 33: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/33.jpg)
Retrosynthetic Analysis
![Page 34: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/34.jpg)
Inside our Publication Archive
• How much data is in the archive, in the publications and in the supplementary info?
– How many compounds for ChemSpider?
– How many syntheses for ChemSpider reactions?
– How many characterization measurements?
• Property Data
• Spectral Data
• Graphs and charts to be used for modeling?
![Page 35: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/35.jpg)
What if we could capture it all?
Digitally Enhancing the RSC Archive
![Page 36: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/36.jpg)
Start with data in publications
![Page 37: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/37.jpg)
Recent Work
![Page 38: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/38.jpg)
Comparison of Spectra
![Page 39: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/39.jpg)
Data Validation and Curation Required
![Page 40: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/40.jpg)
CVSP: Validation and Standardization
![Page 41: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/41.jpg)
Data Validation and Curation Required
Encouraging Participation with Rewards and RECOGNITION
![Page 42: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/42.jpg)
Manual Curation
• Integrated commenting, curating and validation platform across ALL eScience and publishing platforms
• All integrated to a central RSC profile and feeding the AltMetrics tools
![Page 43: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/43.jpg)
Structure Review
![Page 44: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/44.jpg)
Where we are now…
![Page 45: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/45.jpg)
Rewards and Recognition
Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP.
The First Step badge is awarded when a user submits (& has published) their 1st CSSP article.
![Page 46: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/46.jpg)
Future Recognition in AltMetrics?
ChemSpider
![Page 47: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/47.jpg)
Why is ChemSpider “different”
• Interfaces for integration
• Sharing of data – and increasingly open
• Open for community participation
– Deposition
– Annotation
– Curation
• We are clear…the world is changing
![Page 48: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/48.jpg)
Internet Data
The Future
Commercial Software
Pre-competitive Data
Open Science
Open Data
Publishers
Educators
Open Databases
Chemical Vendors
Small organic molecules
Undefined materials
Organometallics
Nanomaterials
Polymers
Minerals
Particle bound
Links to Biologicals
![Page 49: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/49.jpg)
Acknowledgments
• The RSC eScience and infrastructure teams
• Our data providers, depositors, collaborators and curators
• Daniel Lowe for Reaction Data
• William Brouwer, Penn State
• Software providers – OpenEye, ChemDoodle, ACD/Labs, GGA Software, Open Source (Jmol, JSpecView, OpenBabel)
![Page 50: ChemSpider – disseminating data and enabling an abundance of chemistry platforms](https://reader035.fdocuments.in/reader035/viewer/2022062703/554e7d8eb4c9054a698b5298/html5/thumbnails/50.jpg)
Thank you
Email: [email protected]
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
Slides: www.slideshare.net/AntonyWilliams