Building A Community Resource For The Life Sciences
Uploaded by antony-williams-chemconnector-orcid-0000-0002-2668-4821
Transcript of Building A Community Resource For The Life Sciences
![Page 1: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/1.jpg)
Building A Community Platform to Support Chemistry and the Life Sciences
![Page 2: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/2.jpg)
Where Would You look? What Do You Trust?
![Page 3: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/3.jpg)
Chemistry on the Internet TODAY
Chemistry searches are generally limited to text-based searches across the internet
Data are dirty: sorting the wheat from the chaff. Who can you trust?
Too many searches are required to find data
![Page 4: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/4.jpg)
Chemistry on the Internet TODAY
Chemistry searches are generally limited to text-based searches across the internet
Data are dirty: sorting the wheat from the chaff. Who can you trust?
Too many searches are required to find data
![Page 5: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/5.jpg)
![Page 6: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/6.jpg)
![Page 7: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/7.jpg)
The Final Search Strategy
![Page 8: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/8.jpg)
All Those Names, One Structure
A problem to solve…
![Page 9: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/9.jpg)
Chemistry on the Internet TODAY
Chemistry searches are generally limited to text-based searches across the internet
Data are dirty: sorting the wheat from the chaff. Who can you trust?
Too many searches are required to find data
![Page 10: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/10.jpg)
Trustworthy Chemistry?
Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug Discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science
![Page 11: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/11.jpg)
Where Would You look? What Do You Trust?
![Page 12: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/12.jpg)
Structural Data for Life Sciences
DailyMed
![Page 13: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/13.jpg)
Lack of Stereochemistry
![Page 14: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/14.jpg)
Incorrect Structures
![Page 15: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/15.jpg)
Ugh…
![Page 16: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/16.jpg)
Drugs are REALLY Messy
![Page 17: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/17.jpg)
Vancomycin
Who will curate?
How would you clean such a large dataset?
Assertions!!!
![Page 18: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/18.jpg)
The EXPERTS must get it right?!
![Page 19: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/19.jpg)
Wikipedia, C&E News, PubChem C&E News (from ACS)
![Page 20: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/20.jpg)
Chemistry on the Internet TODAY
Chemistry searches are generally limited to text-based searches across the internet
Data are dirty: sorting the wheat from the chaff. Who can you trust?
Too many searches are required to find data
![Page 21: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/21.jpg)
Just “Public Compound” Databases
PubChem
Drugbank
ChEBI/ChEMBL
KEGG
LipidMAPs
ChemIDPlus
eMolecules
ZINC
Lots of chemical vendors
ChemSpider
![Page 22: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/22.jpg)
media.obsessable.com
As few interfaces as possible
What do humans want?
![Page 23: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/23.jpg)
A Pragmatic Vision
“Build a Structure Centric Community to Serve Chemists”
Integrate chemical structure data on the web
Create a “structure-based hub” to information and data
Provide access to structure-based “algorithms”
Let chemists contribute their own data
Allow the community to curate/correct data
![Page 24: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/24.jpg)
Answer Questions
Questions a chemist might ask…
What is the melting point of n-heptanol?
What is the chemical structure of Xanax?
Chemically, what is phenolphthalein?
What are the stereocenters of cholesterol?
Where can I find publications about xylene?
What are the different trade names for Ketoconazole?
What is the NMR spectrum of Aspirin?
What are the safety handling issues for Thymol Blue?
![Page 25: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/25.jpg)
ChemSpider Searches
![Page 26: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/26.jpg)
![Page 27: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/27.jpg)
Search “OEA”
![Page 28: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/28.jpg)
Search OEA
![Page 29: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/29.jpg)
Link Farm Connections
![Page 30: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/30.jpg)
Link Farm Connections
![Page 31: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/31.jpg)
Search OEA
![Page 32: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/32.jpg)
Search OEA
![Page 33: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/33.jpg)
Google Books
![Page 34: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/34.jpg)
Google Scholar
![Page 35: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/35.jpg)
Linked Patents for OEA
![Page 36: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/36.jpg)
![Page 37: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/37.jpg)
Google Patents
![Page 38: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/38.jpg)
Microsoft Academic Search
![Page 39: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/39.jpg)
RSC Journals
![Page 40: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/40.jpg)
RSC Databases
![Page 41: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/41.jpg)
Statistics for Today
Almost 25 million compounds from >350 data sources
About 7000 unique users per day and up to ½ million transactions per day
A crowdsourced deposition and curation platform
Grows daily – more depositions, more links, more data
![Page 42: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/42.jpg)
Searching Chemistry on the Internet
How complete a result set will we get if we search for “chemicals” by name?
Is there a better way to link chemistry databases? Linking by “names” is dangerous
Chemists want structure and SUBstructure searching
![Page 43: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/43.jpg)
The InChI Identifier
![Page 44: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/44.jpg)
Multiple Layers
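The layered structure of an InChI string can be illustrated with a minimal sketch. It assumes the simple case of a standard InChI whose second field is the molecular formula; the function name `split_inchi_layers` and the human-readable layer labels are illustrative choices, not part of any InChI software. Ethanol is used as the example input.

```python
# Single-letter prefixes that introduce the layers of an InChI string.
# The descriptive labels are informal names chosen for this sketch.
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protonation",
    "b": "double-bond stereo",
    "t": "tetrahedral stereo",
    "m": "stereo inversion flag",
    "s": "stereo type",
    "i": "isotopes",
}


def split_inchi_layers(inchi):
    """Split an InChI string into (layer name, value) pairs.

    Simplification: the field after the version is assumed to be the
    molecular formula, which holds for ordinary neutral compounds.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = [("version", parts[0])]
    if len(parts) > 1:
        layers.append(("formula", parts[1]))
    for part in parts[2:]:
        # The first character names the layer; the rest is its content.
        layers.append((LAYER_NAMES.get(part[:1], "other"), part[1:]))
    return layers


ETHANOL_INCHI = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

if __name__ == "__main__":
    for name, value in split_inchi_layers(ETHANOL_INCHI):
        print(f"{name}: {value}")
```

Each layer adds detail on top of the previous ones, which is what allows two records to be compared at different levels of structural precision.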
![Page 45: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/45.jpg)
InChI Strings Hash to InChIKeys
![Page 46: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/46.jpg)
Link the Internet with InChIKeys!
Taken from: Rafael Sidi’s Blog
![Page 47: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/47.jpg)
Vancomycin – Search the Internet
![Page 48: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/48.jpg)
Vancomycin
Search Molecular SKELETON
Search Full Molecule
![Page 49: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/49.jpg)
Full Molecule Search: 4 Hits
![Page 50: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/50.jpg)
Full Skeleton Search: 104 Hits
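The difference between the two searches can be sketched in a few lines, assuming the index is simply a list of InChIKeys. The first 14-character block of an InChIKey encodes the molecular skeleton (connectivity), while the second block covers stereochemistry and the other layers, so a skeleton search compares only the first block. The function names and the small key list are illustrative, not ChemSpider's implementation or data.

```python
def skeleton_block(inchikey):
    """Return the first 14-character block of an InChIKey."""
    return inchikey.split("-")[0]


def full_matches(query, keys):
    """Keys identical to the query: same skeleton, stereo, isotopes, charge."""
    return [k for k in keys if k == query]


def skeleton_matches(query, keys):
    """Keys that share the query's first block, i.e. the same skeleton."""
    target = skeleton_block(query)
    return [k for k in keys if skeleton_block(k) == target]


# Illustrative index: the key shown in this deck, a variant with the same
# first block but no stereo information, and the key for ethanol.
QUERY = "RCINICONZNJXQF-MZXODVADSA-N"
INDEXED_KEYS = [
    "RCINICONZNJXQF-MZXODVADSA-N",
    "RCINICONZNJXQF-UHFFFAOYSA-N",
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
]

if __name__ == "__main__":
    print(len(full_matches(QUERY, INDEXED_KEYS)), "full-molecule hits")
    print(len(skeleton_matches(QUERY, INDEXED_KEYS)), "skeleton hits")
```

A skeleton search always returns at least as many hits as a full-molecule search, because it also picks up records that differ only in stereochemistry or are missing it altogether.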
![Page 51: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/51.jpg)
![Page 52: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/52.jpg)
![Page 53: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/53.jpg)
![Page 54: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/54.jpg)
Vancomycin
![Page 55: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/55.jpg)
Vancomycin on ChemSpider 1 compound – 3 days
![Page 56: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/56.jpg)
InChIKeys
RCINICONZNJXQF-MZXODVADSA-N
Make the internet searchable by adding InChIKeys
Publishers add InChIKeys to papers now…
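Because an InChIKey has a fixed shape, keys embedded in a page or paper can be picked out with a regular expression. The sketch below assumes standard or non-standard version 1 keys (14 letters, then 8 letters plus an S/N flag and the version letter A, then one protonation letter); the function name and the sample sentence are made up for illustration.

```python
import re

# 14 letters, hyphen, 8 letters + S/N flag + version "A", hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"\b[A-Z]{14}-[A-Z]{8}[SN]A-[A-Z]\b")


def find_inchikeys(text):
    """Return every InChIKey-shaped token found in the text, in order."""
    return INCHIKEY_RE.findall(text)


# Made-up sentence containing the key shown on this slide.
SAMPLE_TEXT = (
    "The isolated compound (InChIKey: RCINICONZNJXQF-MZXODVADSA-N) "
    "was characterised by NMR."
)

if __name__ == "__main__":
    print(find_inchikeys(SAMPLE_TEXT))
```

The pattern checks only the shape of a key, not whether it corresponds to a real structure; that still requires a lookup against a database that stores the key.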
![Page 57: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/57.jpg)
InChIKeys
RCINICONZNJXQF-MZXODVADSA-N
Make the internet searchable by adding InChIKeys
Publishers add InChIKeys to papers now…
is what???
![Page 58: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/58.jpg)
The InChI “Resolver”
![Page 59: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/59.jpg)
InChI Resolver to DOIs
Structure Search the Web
![Page 60: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/60.jpg)
Most Chemistry is NOT Published
Only a fraction of chemistry is published
Only a tiny fraction of chemistry is patented
What of the “Lost Chemistry” - never published and cannot be abstracted
Reactions performed
Structures made and studied
Spectra acquired and then disposed of
Available chemicals never found
![Page 61: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/61.jpg)
Crowd-sourcing Curation and Deposition
Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
![Page 62: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/62.jpg)
Building a Structure Centric Community for Chemists
Multi-level Curation and Approval
![Page 63: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/63.jpg)
Semantic Markup: Project Prospect
![Page 64: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/64.jpg)
Name-Structure Pairs
![Page 65: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/65.jpg)
Semantic Linking of Structures
What would you want to link off a structure?
Chemical suppliers
Other publications
Analytical Data
Related Reactions
Wikipedia
Patents
“Everything”
![Page 66: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/66.jpg)
Org Prep Daily (Blog)
![Page 67: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/67.jpg)
ChemSpider SyntheticPages
![Page 68: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/68.jpg)
Chemistry on the Internet FUTURE
The semantic web for chemistry is in place
Crowdsourced contributions are commonplace
Chemists will search by structure/substructure
Chemistry articles indexed and searchable
Reduced number of searches to find data
Data are integrated – compounds, vendors, syntheses, data, publications and patents
A world of Open Access and Open Data
![Page 69: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/69.jpg)
ChemSpider Web Services
![Page 70: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/70.jpg)
![Page 71: Building A Community Resource For The Life Sciences](https://reader035.fdocuments.in/reader035/viewer/2022062419/5581a39fd8b42afd4c8b4a2f/html5/thumbnails/71.jpg)
Thank you
[email protected]: ChemSpiderman
www.chemspider.com/blog
SLIDES: www.slideshare.net/AntonyWilliams