The Royal Society of Chemistry and its adoption of semantic web technologies for chemistry at the epoch of a federated world
Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov
ACS, 248th National Meeting
San Francisco, CA
August 11th 2014
Who is involved?
29 partners
Research questions
ChEMBL
DrugBank
Gene Ontology
WikiPathways
UniProt
ChemSpider
UMLS
ConceptWiki
ChEBI
TrialTrove
GVKBio
GeneGo
TR Integrity
“Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
“What is the selectivity profile of known p38 inhibitors?”
“Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
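The first question above can be read as three filters applied together: pathway membership, assay type, and a potency threshold. The following is a minimal Python sketch of that reading only. The `Activity` record, its field names, the function `potent_functional_hits`, and the placeholder records are all hypothetical illustrations and do not reflect the Open PHACTS data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One hypothetical activity record (field names are assumptions)."""
    compound: str
    target: str
    assay_type: str    # e.g. "functional" or "binding"
    potency_um: float  # potency in micromolar


def potent_functional_hits(activities, pathway_targets, max_potency_um=1.0):
    """Return compounds with a functional-assay activity below the
    potency threshold against any target in the given pathway."""
    hits = {
        a.compound
        for a in activities
        if a.target in pathway_targets
        and a.assay_type == "functional"
        and a.potency_um < max_potency_um
    }
    return sorted(hits)


# Placeholder records, not real measurements.
records = [
    Activity("compound-A", "target-1", "functional", 0.2),
    Activity("compound-B", "target-1", "binding", 0.05),
    Activity("compound-C", "target-2", "functional", 5.0),
    Activity("compound-D", "target-9", "functional", 0.1),
]
pathway = {"target-1", "target-2"}
hits = potent_functional_hits(records, pathway)
```

In practice each filter draws on a different source (pathways, assay annotations, activity values), which is why the question needs integrated data.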
Open PHACTS Explorer: web-based search interface
explorer.openphacts.org
Discovery Platform
Open PHACTS API (dev.openphacts.org): applications can query the pharmacological data within Open PHACTS.
Open PHACTS applications: external bespoke applications using the Open PHACTS API.
chembionavigator.org
pharmatrek.org
• Compound-protein interactions
• Physicochemical properties
• Gene information
• Biological pathways
Workflow tools: Pipeline Pilot, KNIME, R
Open PHACTS UI: http://explorer.openphacts.org/
ChemBioNavigator
Open PHACTS API: https://dev.openphacts.org/
KNIME
Open PHACTS Architecture
Micro-article
Compounds
Reaction
Analytical Data
Text and References
Technical view - unification
Chemistry Validation and Standardization Platform
DrugBank dataset (6516 records)
J. Brecher, IUPAC: Graphical Representation of Stereochemical Configuration, Section ST-1.1.10
DB06287
PubChem, DrugBank, ChemSpider
Imatinib
Mesylate
What Is Gleevec?
Ambiguities
How is this a semantic web problem? Why can’t people just be clear?
People may be working with faulty data.
Salts, say, may make little difference to the effects of an active ingredient.
People may assume a one-to-one mapping between a gene and the gene product (protein, ncRNA) that it codes for.
What’s in a lens?
Identifier
Title (dct:title)
Description (dct:description)
Documentation link (dcat:landingPage)
Creator (pav:createdBy)
Timestamp (pav:createdOn)
Equivalence rules (bdb:linksetJustification)
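The lens fields listed above map naturally onto a small record type. The sketch below is one possible Python representation, not the Open PHACTS implementation. The class name `Lens`, the method `to_metadata`, and every example value (including the `example.org` URLs) are placeholders invented for illustration; only the vocabulary terms come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Lens:
    """Hypothetical container for the lens metadata listed on the slide."""
    identifier: str
    title: str
    description: str
    landing_page: str
    created_by: str
    created_on: datetime
    justifications: tuple  # equivalence rules accepted by this lens

    def to_metadata(self):
        """Key each field by the vocabulary term named on the slide."""
        return {
            "identifier": self.identifier,
            "dct:title": self.title,
            "dct:description": self.description,
            "dcat:landingPage": self.landing_page,
            "pav:createdBy": self.created_by,
            "pav:createdOn": self.created_on.isoformat(),
            "bdb:linksetJustification": list(self.justifications),
        }


# All values below are placeholders, not real lens definitions.
example_lens = Lens(
    identifier="example-lens",
    title="Example lens",
    description="Treats records as equivalent only under the listed rules.",
    landing_page="http://example.org/lenses/example-lens",
    created_by="http://example.org/people/someone",
    created_on=datetime(2014, 8, 11),
    justifications=("http://example.org/justification/same-inchi",),
)
metadata = example_lens.to_metadata()
```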
Equivalence rules
The BridgeDB vocabulary adds metadata that provides a justification for treating two URIs alike, thus allowing the researcher to determine whether their circumstances fit.
owl:sameAs ≤ skos:exactMatch ≤ skos:closeMatch ≤ rdfs:seeAlso
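One way to read the chain above is as an ordering from the strongest equivalence claim (owl:sameAs) to the weakest (rdfs:seeAlso). Under that reading, a consumer can pick a threshold and keep only links at least that strong. The Python sketch below assumes this interpretation; the names `PREDICATE_ORDER`, `at_least_as_strict`, and `filter_links`, and the placeholder link identifiers, are illustrative and not part of any Open PHACTS or BridgeDB API.

```python
# Predicates ordered from strongest to weakest claim, as in the chain above.
PREDICATE_ORDER = [
    "owl:sameAs",
    "skos:exactMatch",
    "skos:closeMatch",
    "rdfs:seeAlso",
]
RANK = {predicate: i for i, predicate in enumerate(PREDICATE_ORDER)}


def at_least_as_strict(predicate, threshold):
    """True if `predicate` makes a claim at least as strong as `threshold`."""
    return RANK[predicate] <= RANK[threshold]


def filter_links(links, threshold):
    """Keep (source, predicate, target) links that meet the threshold."""
    return [link for link in links if at_least_as_strict(link[1], threshold)]


# Placeholder links between hypothetical record identifiers.
links = [
    ("a:1", "owl:sameAs", "b:1"),
    ("a:1", "skos:closeMatch", "c:7"),
    ("a:1", "rdfs:seeAlso", "d:3"),
]
kept = filter_links(links, "skos:closeMatch")
```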
The ChEBI and CHEMINF ontologies provide a rich set of relations (many of which were developed for this project) to relate one molecule to another.
ChEBI (http://www.ebi.ac.uk/chebi)
has part
is tautomer of
CHEMINF (http://code.google.com/p/semanticchemistry/)
has component with uncharged counterpart
has counterpart molecular entity
has normalized counterpart
has OPS normalized counterpart
has PubChem normalized counterpart
has uncharged counterpart
similar to
similar to by PubChem 2D similarity algorithm
similar to by PubChem 3D similarity algorithm
has same connectivity as
is isotopologue of
is stereoisomer of
subClassOf (standard relation in RDF)
has isotopically unspecified parent
has stereoundefined parent
Link: skos:closeMatch; Reason: non-salt form
Link: skos:exactMatch; Reason: drug name
[Figure: lens spectrum from Strict (analysing) to Relaxed (browsing, exploring)]
Strict lens: skos:exactMatch (InChI)
Relaxed lens: skos:exactMatch (InChI) and skos:closeMatch (Drug Name)
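The effect of switching between a strict and a relaxed lens can be sketched as grouping records over only those links the lens accepts. The Python example below uses a small union-find for the grouping. It is an illustration under stated assumptions: the function `equivalence_classes`, the record identifiers, and the two links are hypothetical stand-ins for the imatinib and imatinib mesylate case, not actual Open PHACTS linksets.

```python
def equivalence_classes(links, accepted):
    """Group identifiers joined by links whose (predicate, reason)
    pair is in `accepted`. Uses union-find with path halving."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, predicate, reason in links:
        root_a, root_b = find(a), find(b)  # registers both identifiers
        if (predicate, reason) in accepted:
            parent[root_a] = root_b

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical links; the identifiers are placeholders.
links = [
    ("drugbank:imatinib", "chemspider:imatinib",
     "skos:exactMatch", "InChI"),
    ("chemspider:imatinib", "chemspider:imatinib-mesylate",
     "skos:closeMatch", "Drug Name"),
]
strict_lens = {("skos:exactMatch", "InChI")}
relaxed_lens = strict_lens | {("skos:closeMatch", "Drug Name")}

strict_groups = equivalence_classes(links, strict_lens)
relaxed_groups = equivalence_classes(links, relaxed_lens)
```

Under the strict lens the salt stays a separate entity, which suits analysis; under the relaxed lens all three records merge, which suits browsing.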
What does the Open PHACTS Chemistry Registration System do?
Takes in structures from ChEMBL, ChEBI, DrugBank, PDB, Thomson Reuters.
Normalizes structures according to rules based on FDA guidelines.
Generates counterpart molecules: uncharged forms and fragments.
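The fragment step can be illustrated with plain string handling, since SMILES separates disconnected components with a dot. The Python sketch below is a toy and is not how the registration system works: real normalization and charge handling require a cheminformatics toolkit, and using string length as a proxy for fragment size is a crude assumption. The function names `fragments` and `largest_fragment` are hypothetical.

```python
def fragments(smiles):
    """Split a SMILES string into its dot-separated components."""
    return [part for part in smiles.split(".") if part]


def largest_fragment(smiles):
    """Pick the longest component string as a rough stand-in for the
    parent molecule (assumption: longer string means larger fragment)."""
    return max(fragments(smiles), key=len)


# Imatinib mesylate written as two components: imatinib and
# methanesulfonic acid.
IMATINIB = (
    "CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)"
    "NC4=NC=CC(=N4)C5=CN=CC=C5"
)
MESYLATE_ACID = "CS(=O)(=O)O"
salt = IMATINIB + "." + MESYLATE_ACID

parts = fragments(salt)
parent = largest_fragment(salt)
```

Linking the salt record to `parent` is the kind of relationship the "non-salt form" justification describes.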
Chemistry Validation and Standardization Platform
Input pipeline
Compounds domain
Navigation in chemical space
Reactions domain
Analytical data domain
Crystallography domain
Standards
Share in a “proper way”
APIs, endpoints and widgets
Dimensions and complexity of science
Handling complex content
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known pathways?
Working on now?
Connections to disease?
Expressed in right cell type?
Competitors?
IP?
Machine learning