How the web has weaved a web of interlinked chemistry data final
![Page 1: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/1.jpg)
How the Web Has Weaved a Web of Interlinked Chemistry Data
Antony Williams, ACS Anaheim, March 29th 2011
![Page 2: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/2.jpg)
Data on the Web
![Page 3: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/3.jpg)
Where is Chemistry Online?
Property databases
Compound aggregators
Screening assay results
Scientific publications
Encyclopedic articles (Wikipedia)
Metabolic pathway databases
ADME/Tox data
Blogs/Wikis and Open Notebook Science
Contributing Open Source code to projects
![Page 4: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/4.jpg)
How to Connect Chemicals…
![Page 5: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/5.jpg)
Chemistry on the Internet
100s of websites serving up chemistry data, SDF files of structures and data
RSC’s ChemSpider “links” chemistry on the internet
Over 25 million compounds, over 400 data sources
Allows community deposition, curation, annotation
Integrating properties, publications, patents, media
Text, structure, substructure, similarity searching
![Page 6: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/6.jpg)
www.chemspider.com
![Page 7: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/7.jpg)
Search for a Chemical
![Page 8: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/8.jpg)
We Have Delivered the Vision
“Build a Structure Centric Community to Serve Chemists”
Integrate chemical structure data on the web
![Page 9: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/9.jpg)
How Did We Build It?
We deal in Molfiles or SDF files
We do rudimentary filtering prior to deposition – valence checking, charge imbalance etc.
We have our own “business logic” to standardize
We use InChI to “aggregate tautomers”
Link out to external sites where possible using IDs
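The aggregation step can be sketched as grouping deposited records by their InChI string. This is a minimal sketch, not ChemSpider's actual pipeline: it assumes each record already carries a precomputed Standard InChI, and the record fields, the function name and the hand-typed InChI strings are illustrative assumptions.

```python
from collections import defaultdict

# Hand-typed Standard InChI strings, used here only for illustration.
# The two tautomer names below are assumed to share the first string,
# because Standard InChI treats the mobile hydrogen as shared.
PYRIDONE_INCHI = "InChI=1S/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)"
CYCLOHEXANE_INCHI = "InChI=1S/C6H12/c1-2-4-6-5-3-1/h1-6H2"


def aggregate_by_inchi(records):
    """Group deposited records that share the same InChI string.

    `records` is an iterable of dicts with a hypothetical "inchi" field.
    Returns a dict mapping each InChI to the list of records carrying it.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["inchi"]].append(record)
    return dict(groups)


# Three depositions from three hypothetical sources.
records = [
    {"source": "A", "name": "2-pyridone", "inchi": PYRIDONE_INCHI},
    {"source": "B", "name": "2-hydroxypyridine", "inchi": PYRIDONE_INCHI},
    {"source": "C", "name": "cyclohexane", "inchi": CYCLOHEXANE_INCHI},
]

compounds = aggregate_by_inchi(records)
for inchi, members in compounds.items():
    print(inchi, [m["name"] for m in members])
```

Keying on the identifier rather than on names means two sources that draw different tautomers of one compound end up on the same record, which is the effect the slide describes.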
![Page 10: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/10.jpg)
Inherited Errors
We have inherited errors
All public compound databases, including ours, have errors
“Incorrect” structures – assertions, timelines etc.
“Incorrect” names associated with structures
Properties
Links
Publications
ENORMOUS CHALLENGE
![Page 11: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/11.jpg)
The Structure of Vitamin K?
![Page 12: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/12.jpg)
MeSH
A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K 1 (phytomenadione) derived from plants, VITAMIN K 2 (menaquinone) from bacteria, and synthetic naphthoquinone provitamins, VITAMIN K 3 (menadione). Vitamin K 3 provitamins, after being alkylated in vivo, exhibit the antifibrinolytic activity of vitamin K. Green leafy vegetables, liver, cheese, butter, and egg yolk are good sources of vitamin K
![Page 13: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/13.jpg)
The Structure of Vitamin K1?
![Page 14: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/14.jpg)
What is the Structure of Vitamin K1?
![Page 15: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/15.jpg)
CAS’s Common Chemistry
![Page 16: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/16.jpg)
Wikipedia
![Page 17: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/17.jpg)
![Page 18: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/18.jpg)
![Page 19: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/19.jpg)
ChEBI – Manual Curation
![Page 20: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/20.jpg)
![Page 21: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/21.jpg)
![Page 22: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/22.jpg)
PubChem
![Page 23: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/23.jpg)
![Page 24: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/24.jpg)
“2-methyl-3-(3,7,11,15-tetramethylhexadec-2-enyl)naphthalene-1,4-dione”
Variants of systematic names on PubChem
2-methyl-3-[(E,7R,11R)-3,7,11,15-tetramethyl
2-methyl-3-[(E,7S,11R)-3,7,11,15-tetramethyl
2-methyl-3-[(E,7R,11S)-3,7,11,15-tetramethyl
2-methyl-3-[(E,7S,11S)-3,7,11,15-tetramethyl
2-methyl-3-[(E,11S)-3,7,11,15-tetramethyl
2-methyl-3-[(E)-3,7,11,15-tetramethyl
2-methyl-3-(3,7,11,15-tetramethyl
2-methyl-3-[(E)-3,7,11,15-tetramethyl
![Page 25: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/25.jpg)
Public Domain Databases
Our databases are a mess…
Non-curated databases are proliferating errors
We source and deposit data between databases
Original sources of errors hard to determine
Curation is time-consuming and challenging
![Page 26: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/26.jpg)
![Page 27: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/27.jpg)
Consider searching each of these chemical databases by chemical name (systematic name, trade name or synonym). Please mark each online resource according to how much you generally trust the results.
![Page 28: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/28.jpg)
![Page 29: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/29.jpg)
To report at Denver ACS…
An examination of quality in databases – inter/intra lab comparison of processes for 150 drugs
Five separate organizations, 8 individuals
The Wikipedia List of the “200 Top Selling Drugs”
![Page 30: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/30.jpg)
![Page 31: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/31.jpg)
Vytorin: Ezetimibe/Simvastatin
![Page 32: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/32.jpg)
Vytorin: Ezetimibe/Simvastatin
![Page 33: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/33.jpg)
Vytorin: Ezetimibe/Simvastatin
![Page 34: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/34.jpg)
Vytorin: Ezetimibe/Simvastatin
![Page 35: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/35.jpg)
Vytorin: Ezetimibe/Simvastatin
![Page 36: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/36.jpg)
Taxol: Paclitaxel 44 structures
![Page 37: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/37.jpg)
Taxol: Paclitaxel Bioassay Data
![Page 38: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/38.jpg)
Taxol: Paclitaxel Bioassay Data
Most Bioassay data associated with structure with one ambiguous stereocenter
![Page 39: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/39.jpg)
Columns: Drug Name, Generic Name, ChEBI, ChemSpider, CAS Common Chemistry, ChemIDPlus, DailyMed, DrugBank, PubChem, Wikipedia
Spiriva (Tiotropium Bromide): No Hits, No Hits, 4/0
Depakote (Valproate semisodium): No Structure
Basen (Voglibose): No Hits, No Hits, 2/1
Symbicort 1) (Budesonide): 8/1
Symbicort 2) (Formoterol): WRONG, No Hits, 6/1
Vytorin 1) (Ezetimibe): No Hits
Vytorin 2) (Simvastatin): 2/1
Taxol (Paclitaxel): 44/1
Thalidomid (Thalidomide): No Hits
Zocor (Simvastatin): 2/1
Crestor (Rosuvastatin): No Hits, 2/1
![Page 40: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/40.jpg)
Entity-Extraction and Mark-up
![Page 41: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/41.jpg)
Entity-Extraction and Mark-up
![Page 42: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/42.jpg)
Success Depends on Dictionaries
![Page 43: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/43.jpg)
Nature Chemistry
![Page 44: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/44.jpg)
RSC Prospect
![Page 45: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/45.jpg)
Validated “Dictionaries”
The following resources do NOT have structures to link to ChemSpider…but are linked:
Google Scholar
PubMed
DailyMed
RSC Databases and Backfile
How did we link these resources to ChemSpider? Validated Name Look-up!
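A validated name look-up of this kind can be sketched as a dictionary from curated names to record identifiers, applied to free text. This is only an illustrative sketch: the dictionary contents, the placeholder IDs and the function name are assumptions, and it matches single-word names only.

```python
import re

# Hypothetical validated dictionary: lower-cased name -> record ID.
# The IDs are placeholders, not real ChemSpider IDs.
VALIDATED_NAMES = {
    "paclitaxel": 1001,
    "taxol": 1001,        # trade name pointing at the same record
    "simvastatin": 1002,
}


def find_validated_names(text, dictionary=VALIDATED_NAMES):
    """Return (matched text, record ID) pairs for every dictionary name in `text`.

    Matching is case-insensitive and token-based, so only single-word
    names are recognised in this sketch.
    """
    hits = []
    for match in re.finditer(r"[A-Za-z][A-Za-z0-9\-]*", text):
        key = match.group(0).lower()
        if key in dictionary:
            hits.append((match.group(0), dictionary[key]))
    return hits


hits = find_validated_names(
    "Taxol (paclitaxel) was compared with simvastatin and aspirin."
)
print(hits)
```

Names missing from the dictionary, such as "aspirin" here, are simply not linked, which is why the quality of the mark-up depends on how well the dictionary has been validated.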
![Page 46: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/46.jpg)
Extend the Vision
“Build a Structure Centric Community to Serve Chemists”
Integrate chemical structure data on the web
Create a “structure-based hub” to information, data and algorithmic predictions
![Page 47: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/47.jpg)
Integrate other services..
We will integrate with systems of value to the community
Many interfaces now available for integration
NMRShiftDB
ACD/Labs Name Generation
ChemAxon Chemicalize
What others???
![Page 48: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/48.jpg)
![Page 49: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/49.jpg)
Extend the Vision
“Build a Structure Centric Community to Serve Chemists”
Integrate chemical structure data on the web
Create a “structure-based hub” to information, data and algorithmic predictions
Let chemists contribute their own data
Allow the community to curate/correct data
![Page 50: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/50.jpg)
How Did We Build It (cont.)
Ask users to add…
Descriptions/Syntheses/Commentaries
Links to PubMed articles
Links to articles via DOIs
Add spectral data
Add Crystallographic Information Files
Add photos
Add MP3 files
Add Videos
![Page 51: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/51.jpg)
Complex Data and Information
![Page 52: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/52.jpg)
![Page 53: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/53.jpg)
Kind Contributions!
![Page 54: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/54.jpg)
Crowdsourcing “Vitamin H”
![Page 55: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/55.jpg)
“Curate” Identifiers
![Page 56: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/56.jpg)
“Curate” Identifiers
![Page 57: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/57.jpg)
Crowdsourcing Works
>130 people have deposited data and participated in data curation
Curators at different levels check each other's work
More curators and depositors are encouraged!
![Page 58: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/58.jpg)
Accessibility and Reuse
It’s a shame to go it alone!!!
Can we “collectively” improve the quality of chemistry on the Internet?
![Page 59: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/59.jpg)
All DBs should take comments!
![Page 60: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/60.jpg)
Proof-of-concept curation sharing
Presently collaborating with DrugBank to enable “curation sharing”
Setting up services for monitoring curations and edits – starting with “identifiers”
![Page 61: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/61.jpg)
The Social Network
Career-wise NOT having a personal presence online will be a detriment
Self-marketing
Establishing a profile
Getting on the record
Collaborative Science
Demonstrating a skill set
Measured using alternative metrics
Contributing to the public peer review process
![Page 62: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/62.jpg)
Social Networking Tools
A growing number of social networking tools:
Facebook
Twitter
LinkedIn
Flickr
YouTube
Blogs
Communities
Collaborative environments
![Page 63: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/63.jpg)
Chemistry Social Networking
Methods of sharing MY chemistry online include:
Wikis or blogs
Slideshare for presentations
YouTube for videos
Flickr, Wikimedia etc. for images
PubChem for assay data
NMRShiftDB for NMR assignments
GoogleDocs for data
![Page 64: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/64.jpg)
Drivers in the Social Network
Anonymity is a choice in the social networks
Anonymity in peer-review will likely become less important and may be generational
I may want acknowledgment if…
I share my data
I review a paper
I share my expertise
![Page 65: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/65.jpg)
The Alt-Metrics Manifesto
http://altmetrics.org/manifesto/
![Page 66: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/66.jpg)
Enabled by ORCID…
![Page 67: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/67.jpg)
What will enhance OUR network?
The “semantic web”
Mobile technologies
More participation
Use of standards: JCAMP, InChI
![Page 68: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/68.jpg)
RDF and the semantic web
Using RDF permalinks
http://www.chemspider.com/Chemical-Structure.7787.rdf
Using a Search Term
http://www.chemspider.com/rdf.ashx?q=cyclohexane
http://rdf.chemspider.com/cyclohexane
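The three URL forms above can be assembled programmatically. The sketch below only builds the strings shown on the slide and makes no request; the function names are hypothetical, and whether these endpoints still respond is not something the slide establishes.

```python
from urllib.parse import quote, urlencode

BASE = "http://www.chemspider.com"
RDF_HOST = "http://rdf.chemspider.com"


def rdf_permalink(csid):
    """RDF permalink for a record, given its numeric ChemSpider ID."""
    return f"{BASE}/Chemical-Structure.{int(csid)}.rdf"


def rdf_search_url(term):
    """RDF search URL with the term passed as the `q` query parameter."""
    return f"{BASE}/rdf.ashx?{urlencode({'q': term})}"


def rdf_short_url(term):
    """Short form with the percent-encoded search term as the path."""
    return f"{RDF_HOST}/{quote(term)}"


print(rdf_permalink(7787))
print(rdf_search_url("cyclohexane"))
print(rdf_short_url("cyclohexane"))
```

Using `urlencode` and `quote` keeps names that contain spaces or brackets, as many systematic names do, safe to embed in a URL.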
![Page 69: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/69.jpg)
![Page 70: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/70.jpg)
Enabled through InChIs
![Page 71: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/71.jpg)
Mobile Support
![Page 72: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/72.jpg)
Licensing “My Work” Online
The complex nature of licensing “my” chemistry
Blogs - copyrighted and creative commons
Wikis - mixed licensing, depends on the host(s)
Data – much value in sharing data as “Open Data”
Often, people can make money from your work!
Police your own “licensing” – how many people have read the Facebook and Twitter agreements?!
![Page 73: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/73.jpg)
Who declares data as Open?
Data licensing is very interesting and can spark “interesting” conversations.
Opinions differ:
Are images data?
Are assertions data?
What on a ChemSpider record is data?
We allow people to declare their data as Open and add an Open Data button at upload
![Page 74: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/74.jpg)
Acknowledgments
RSC|ChemSpider team
All data source providers
>100 curators and annotators, and growing…
Service providers:
ACD/Labs
ChemAxon
GGA Software Services
Google
PubMed
….
![Page 75: How the web has weaved a web of interlinked chemistry data final](https://reader036.fdocuments.in/reader036/viewer/2022062511/54bd43964a7959c91e8b45d6/html5/thumbnails/75.jpg)
ChemSpider Training Session
ChemSpider: A Community Resource for Chemical Data
Wednesday, March 30th
8:30-11:00 AM
Anaheim Convention Center, Room 211 A