Post on 17-Jan-2016
“Good annotation practice” for chemical data: ChEBI experience
Kirill Degtyarenko
European Patent Office
Good anNODation practice
• Good Naming practice: how to give most appropriate names
• Good Ontology practice: how to link the entity of interest by defined logical relationships to other entities
• Good Drawing practice: how to draw unambiguous 2-D diagrams
Good Naming Practice, or How to Give Most Appropriate Names
Systematic Name (IUPAC)
2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid
[Structure diagram with locants 1 to 6 numbered on each of the two rings]
Common Name
• flufenamic acid (INN English)
• acide flufénamique (INN French)
• ácido flufenámico (INN Spanish)
• acidum flufenamicum (INN Latin)
• Flufenaminsäure (German)
[Structure diagram of flufenamic acid]
The Unpronounceables
CHEBI:48935
(E)-roxithromycin
IUPAC name:
(3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
[Structure diagrams: (E)-roxithromycin and (Z)-roxithromycin]
CHEBI:32109 (Z)-roxithromycin
What is the common name of roxithromycin?
• CHEBI:48935 (E)-roxithromycin; INN: roxithromycin
• CHEBI:48844 roxithromycin
[Structure diagrams labelled (E)-roxithromycin and (Z)-roxithromycin]
What is thiamine?
• CHEBI:18385 thiamine(1+), aka thiamine
• CHEBI:33283 thiamine(1+) chloride; INN: thiamine
• CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride
[Structure diagrams of the three entities]
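The thiamine ambiguity can be illustrated with a small name index. This is only a sketch: the identifiers and names are the ones listed on the slide, while the `ENTRIES` layout and the helpers `build_name_index` and `resolve` are hypothetical and do not reflect how ChEBI stores its data.

```python
from collections import defaultdict

# Identifiers and names as given on the slide; the dict layout is an assumption.
ENTRIES = {
    "CHEBI:18385": {"name": "thiamine(1+)", "synonyms": ["thiamine"]},
    "CHEBI:33283": {"name": "thiamine(1+) chloride", "synonyms": ["thiamine"]},  # INN
    "CHEBI:49105": {
        "name": "thiamine(2+) dichloride",
        "synonyms": ["thiamine chloride hydrochloride", "thiamine hydrochloride"],
    },
}


def build_name_index(entries):
    """Map every lower-cased name or synonym to the set of IDs that carry it."""
    index = defaultdict(set)
    for chebi_id, record in entries.items():
        for label in [record["name"], *record["synonyms"]]:
            index[label.lower()].add(chebi_id)
    return index


def resolve(index, label):
    """Return the sorted IDs a label could refer to (empty list if unknown)."""
    return sorted(index.get(label.lower(), ()))


index = build_name_index(ENTRIES)
ambiguous = resolve(index, "thiamine")
unambiguous = resolve(index, "thiamine hydrochloride")
```

Here `ambiguous` holds two identifiers, while `unambiguous` holds one, which is the point of the slide: a common name alone may not pick out a single chemical entity.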
Plurals and singulars
• Problem is not unique to ChEBI
• Cf. phenol vs phenols; phenol metabolism vs phenols metabolism
• Bad solution: article use ("a phenol metabolism"?)
• Solution: prepositional phrases ("metabolism of phenols")
Good Drawing Practice, or How to Draw Unambiguous 2-D Diagrams
Linear forms of monosaccharides
[Structure diagrams: alternative 2-D drawings of linear (open-chain) monosaccharide forms]
Pyranose forms of monosaccharides
[Structure diagrams: alternative 2-D drawings of pyranose forms]
Fused systems
(R)-camphor
[Structure diagrams: an ambiguous and an unambiguous drawing of (R)-camphor]
Square planar geometry
InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2
[Structure diagrams: cisplatin and transplatin]
SMILES: [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H]
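The InChI and SMILES strings on this slide carry no stereo information, so they fit both isomers; only the drawing (or another geometry-aware description) tells them apart. A minimal sketch of that idea follows. The ligand coordinates are hypothetical 2-D positions chosen for illustration, with Pt at the origin, and the helper names are not from any cheminformatics library.

```python
import math

# Hypothetical 2-D ligand positions around Pt at the origin (illustrative only).
CISPLATIN = {"N1": (-1.0, 1.0), "Cl1": (1.0, 1.0), "Cl2": (1.0, -1.0), "N2": (-1.0, -1.0)}
TRANSPLATIN = {"N1": (-1.0, 1.0), "Cl1": (1.0, 1.0), "N2": (1.0, -1.0), "Cl2": (-1.0, -1.0)}


def connectivity(ligands):
    """Donor elements bonded to Pt, ignoring geometry (what the strings encode)."""
    return sorted(name.rstrip("0123456789") for name in ligands)


def angle_deg(p, q):
    """Angle p-Pt-q in degrees, with Pt at the origin."""
    dot = p[0] * q[0] + p[1] * q[1]
    norm = math.hypot(*p) * math.hypot(*q)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def classify(ligands):
    """Label a square planar drawing 'cis' or 'trans' from the Cl-Pt-Cl angle."""
    return "cis" if angle_deg(ligands["Cl1"], ligands["Cl2"]) < 135.0 else "trans"


same_connectivity = connectivity(CISPLATIN) == connectivity(TRANSPLATIN)
labels = (classify(CISPLATIN), classify(TRANSPLATIN))
```

Both drawings share one connection table, yet the angle test separates them, which is why a square planar centre needs an unambiguous 2-D depiction.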
Uncertainty and ambiguity in chemistry
• Compositional uncertainty
• Positional uncertainty
• Configurational uncertainty
• Conformational uncertainty
Compositional uncertainty
Examples:
• an alkali metal cation
• vanadate(V) anion
• [2H]ethanol
Positional uncertainty
Examples:
• L-bromohistidine residue
• pteroic acid (several tautomers)
Configurational uncertainty
Examples:
• androstane
• rel-(2R,3R)-2-amino-3-methylpentanoic acid
• tetradec-11-enoic acid
Conformational uncertainty
Examples:
• cyclohexane: chair, boat, twist
• protein secondary structure: α, β, …
Good Ontology Practice, or How to Link the Entity of Interest by Defined Logical Relationships to Other Entities
ChEBI ontology
• Molecular structure ontology
• Subatomic particle ontology
• Biological role ontology
• Application ontology
Relationships in ChEBI
• ∆ Is A (generic)
• ⋄ Is Part Of (generic)
• ♯ Is Conjugate Acid Of (specific)
• ♭ Is Conjugate Base Of (specific)
• Is Enantiomer Of (specific)
• Is Tautomer Of (specific)
• ℛ Is Substituent Group From (specific)
• ℋ Has Parent Hydride (specific)
• ℱ Has Functional Parent (specific)
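The relationship list can be captured as a small enumeration. This is a sketch only: the labels and the generic/specific tags come from the slide, whereas the class `Relationship` and the helper `labels_with_scope` are hypothetical names, not part of any ChEBI software.

```python
from enum import Enum


class Relationship(Enum):
    """Relationship types from the slide; each value is (label, scope)."""

    IS_A = ("is a", "generic")
    IS_PART_OF = ("is part of", "generic")
    IS_CONJUGATE_ACID_OF = ("is conjugate acid of", "specific")
    IS_CONJUGATE_BASE_OF = ("is conjugate base of", "specific")
    IS_ENANTIOMER_OF = ("is enantiomer of", "specific")
    IS_TAUTOMER_OF = ("is tautomer of", "specific")
    IS_SUBSTITUENT_GROUP_FROM = ("is substituent group from", "specific")
    HAS_PARENT_HYDRIDE = ("has parent hydride", "specific")
    HAS_FUNCTIONAL_PARENT = ("has functional parent", "specific")

    def __init__(self, label, scope):
        # Unpack the tuple value into two readable attributes.
        self.label = label
        self.scope = scope


def labels_with_scope(scope):
    """List relationship labels tagged with the given scope, in slide order."""
    return [rel.label for rel in Relationship if rel.scope == scope]


generic = labels_with_scope("generic")
specific = labels_with_scope("specific")
```

With this layout, `generic` contains the two relationships tagged generic on the slide and `specific` the remaining seven.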
Is A relationship
L-cysteine ∆ cysteine (L-cysteine is a cysteine)
[Structure diagrams of L-cysteine and cysteine]
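Is Part Of
L-cysteinium ⋄ L-cysteine hydrochloride
• L-cysteinium is part of L-cysteine hydrochloride
• L-cysteine hydrochloride has part L-cysteinium
[Structure diagrams of L-cysteinium and L-cysteine hydrochloride (with Cl-)]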
Is Enantiomer Of
L-cysteine is enantiomer of D-cysteine
[Structure diagrams: L-cysteine, D-cysteine and a third structure, with two ∆ (is a) links]
Is Tautomer Of
1H-pyrrole, 2H-pyrrole, 3H-pyrrole
[Structure diagrams of the three tautomers]
Is Conjugate Acid Of
L-cysteinium ♯ L-cysteine ♯ L-cysteinate(1–) ♯ L-cysteinate(2–)
(each species is conjugate acid of the next)
[Structure diagrams of the four species]
Is Conjugate Base Of
L-cysteinate(2–) ♭ L-cysteinate(1–) ♭ L-cysteine ♭ L-cysteinium
(each species is conjugate base of the next)
[Structure diagrams of the four species]
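Acid/base relationships
Series: L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–)
Each adjacent pair is linked in both directions: ♯ (is conjugate acid of) towards the less protonated species and ♭ (is conjugate base of) towards the more protonated one.
[Structure diagrams of the four species]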
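The paired acid/base links for the cysteine series can be generated from the ordered protonation series. The sketch below uses the four species named on the slide; the function `acid_base_links` and the triple layout are illustrative assumptions.

```python
# Species from the slide, ordered from most to least protonated.
SERIES = ["L-cysteinium", "L-cysteine", "L-cysteinate(1–)", "L-cysteinate(2–)"]


def acid_base_links(series):
    """Return (subject, relationship, object) triples for each adjacent pair.

    Every pair yields two reciprocal statements, one per direction.
    """
    links = []
    for acid, base in zip(series, series[1:]):
        links.append((acid, "is conjugate acid of", base))
        links.append((base, "is conjugate base of", acid))
    return links


links = acid_base_links(SERIES)
```

Four species give three adjacent pairs and therefore six statements, matching the three ♯ and three ♭ symbols on the slide.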
Is Substituent Group From
• L-cysteinyl ℛ L-cysteine
• L-cysteine residue ℛ L-cysteine
• L-cysteino ℛ L-cysteine
[Structure diagrams of L-cysteine and the three groups, with * marking free valences]
Has Parent Hydride
salutaridinol ℋ morphinan
• salutaridinol has parent hydride morphinan
• morphinan is parent hydride of salutaridinol
[Structure diagrams of salutaridinol and morphinan]
Has Functional Parent
7-O-acetylsalutaridinol ℱ salutaridinol
• 7-O-acetylsalutaridinol has functional parent salutaridinol
• salutaridinol is functional parent of 7-O-acetylsalutaridinol
[Structure diagrams of 7-O-acetylsalutaridinol and salutaridinol]
Live annotation demo
Going to SourceForge…
Reading a request…
Going to curator tool…
Search result…
Adding new entry…
Editing new entry…
Success!
Let’s draw
Success again!
Using ACD/Name (1)
Using ACD/Name (2)
Adding IUPAC name (1)
Adding IUPAC name (2)
Classifying (1)
Classifying (2)
Classifying (3)
Classifying (4)
The last touch (1)
The last touch (2)
Responding to request…
A job well done…
The team
• Rafael Alcántara
• Michael Ashburner
• Volker Ast *
• Michael Darsow *
• Paula de Matos
• Marcus Ennis
• Janna Hastings
• Alan McNaught *
• Chris Steinbeck
• Martin Zbinden *
Thanks
• Kristian Axelsen
• Hélène Courrier
• Anne Morgat
• Ian Unwin
• Our faithful Users
• EU: funding