InChI vs IUPAC nomenclature: Aspects to be aware of when using Standard InChI

Post on 10-May-2015


Features of IUPAC nomenclature that cannot be represented in Standard InChI will be examined to draw attention to cases where the use of Standard InChI (and, in some cases, even non-standard InChI) may result in a loss of information. These areas include the representation of tautomers and mixtures of stereoisomers.

Transcript of InChI vs IUPAC nomenclature: Aspects to be aware of when using Standard InChI

Daniel Lowe

Unilever Centre for Molecular Science Informatics

University of Cambridge


• InChI is used for checking the correctness of the results of name-to-structure conversion

– 172,249 name and InChI pairs are used for routine regression testing

• Failures arising from stereochemistry can be distinguished from constitutional failures
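One way such a distinction could be made is sketched below. It is an illustration only, not the test harness used in the talk: the function names are hypothetical, and it assumes that dropping the `/b`, `/t`, `/m` and `/s` layers leaves the constitutional part of the identifier.

```python
STEREO_PREFIXES = ("b", "t", "m", "s")  # double-bond, tetrahedral, inversion, type


def strip_stereo(inchi: str) -> str:
    """Return the InChI with its /b, /t, /m and /s layers removed."""
    head, formula, *layers = inchi.split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join([head, formula, *kept])


def classify(expected: str, actual: str) -> str:
    """Label a regression result as a match, a stereo failure or a constitutional failure."""
    if expected == actual:
        return "match"
    if strip_stereo(expected) == strip_stereo(actual):
        return "stereo"
    return "constitutional"


# The two alanine InChIs from this talk differ only in their stereo layers.
with_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
without_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
print(classify(with_stereo, without_stereo))
```

A result labelled "stereo" means the name was resolved to the right connectivity but to a different (or missing) stereo description.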

Plausible interpretations of “alanine”:

InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)

InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1

1H-tetrazole 2H-tetrazole 3H-tetrazole 4H-tetrazole

InChI=1S/CH2N4/c1-2-4-5-3-1/h1H,(H,2,3,4,5)
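The reason all four names collapse to one Standard InChI can be read from the `/h` layer: the group `(H,2,3,4,5)` records a hydrogen shared between atoms 2 to 5 rather than attached to any one of them. The sketch below pulls such mobile-hydrogen groups out of an InChI string; the helper name and the regular expression are assumptions for illustration and cover only the simple `(H,...)` and `(Hn,...)` forms.

```python
import re

# Matches groups such as (H,2,3,4,5) or (H2,1,2,3); an optional charge
# marker such as (H-,...) is tolerated but not reported.
MOBILE_H = re.compile(r"\(H(\d*)(?:-\d*)?,([\d,]+)\)")


def mobile_h_groups(inchi: str):
    """Return [(hydrogen_count, [atom numbers])] from the first /h layer."""
    head, formula, *layers = inchi.split("/")
    for layer in layers:
        if layer.startswith("h"):
            return [
                (int(count) if count else 1, [int(a) for a in atoms.split(",")])
                for count, atoms in MOBILE_H.findall(layer)
            ]
    return []


tetrazole = "InChI=1S/CH2N4/c1-2-4-5-3-1/h1H,(H,2,3,4,5)"
print(mobile_h_groups(tetrazole))
```

For tetrazole this reports one hydrogen spread over four atoms, so the identifier by itself cannot say which tautomer was named.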

Not all tautomers are equally important

A fixed hydrogen layer can always be removed but, once removed, cannot be losslessly re-added

• Conditions can lead to a particular tautomer being far more representative of a compound than another

• Not all tautomers readily interconvert

• A particular tautomer could be the reactive species
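The one-way nature of the fixed hydrogen layer can be sketched with plain string handling. The helper below is hypothetical and deliberately simple: it assumes the `/f` layer and everything after it belong to the fixed-H description, and it does not handle a reconnected `/r` layer that may follow. The input is the non-standard ibuprofen InChI from this talk.

```python
def drop_fixed_h(inchi: str) -> str:
    """Return the InChI truncated before its fixed-H (/f) layer, if it has one."""
    head, formula, *layers = inchi.split("/")
    for i, layer in enumerate(layers):
        if layer.startswith("f"):
            return "/".join([head, formula, *layers[:i]])
    return inchi


ibuprofen_fixed_h = (
    "InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15"
    "/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H"
)
print(drop_fixed_h(ibuprofen_fixed_h))
```

Going in this direction only discards text. Going back would require knowing which atom of the `(H,14,15)` group carried the hydrogen, and that information is no longer in the string.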

cyclo-tris(tetracarbonylosmium) (3 Os—Os)

InChI=1S/12CO.3Os/c12*1-2;;;

InChI=1/12CO.3Os/c12*1-2;;;/rC12O12Os3/c13-1-25(2-14,3-15,4-16)26(5-17,6-18,7-19,8-20)27(25,9-21,10-22,11-23)12-24
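The difference between the two identifiers above can be made concrete by counting components. In this sketch (function names are hypothetical; only the leading multiplier of each dot-separated formula part is interpreted), the main layers describe disconnected fragments, while the part after `/r` describes a single reconnected structure.

```python
import re


def split_reconnected(inchi: str):
    """Split an InChI into (main layers, reconnected layers or None)."""
    head, *layers = inchi.split("/")
    for i, layer in enumerate(layers):
        if i > 0 and layer.startswith("r"):
            # Drop the 'r' prefix so the reconnected part starts with its formula.
            return layers[:i], [layer[1:], *layers[i + 1:]]
    return layers, None


def component_count(formula_layer: str) -> int:
    """Count components in a formula layer such as '12CO.3Os'."""
    total = 0
    for part in formula_layer.split("."):
        multiplier = re.match(r"\d+", part)
        total += int(multiplier.group()) if multiplier else 1
    return total


osmium = (
    "InChI=1/12CO.3Os/c12*1-2;;;/rC12O12Os3/c13-1-25(2-14,3-15,4-16)"
    "26(5-17,6-18,7-19,8-20)27(25,9-21,10-22,11-23)12-24"
)
main, reconnected = split_reconnected(osmium)
print(component_count(main[0]), component_count(reconnected[0]))
```

The main layers see twelve CO units and three osmium atoms as separate components; only the reconnected layer records them as one molecule.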

(RS)-2-(4-(2-methylpropyl)phenyl)propanoic acid

InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H

/s1 = absolute stereochemistry

/s2 = relative stereochemistry

/s3 = racemic stereochemistry
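The `/s` values listed above can be read back from an identifier with a small lookup. This is a minimal sketch with a hypothetical helper name; it reports only the first `/s` layer it finds and returns `None` when there is none.

```python
STEREO_TYPES = {"1": "absolute", "2": "relative", "3": "racemic"}


def stereo_type(inchi: str):
    """Return 'absolute', 'relative' or 'racemic' from the first /s layer, else None."""
    head, formula, *layers = inchi.split("/")
    for layer in layers:
        if layer.startswith("s"):
            return STEREO_TYPES.get(layer[1:])
    return None


ibuprofen = (
    "InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15"
    "/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H"
)
print(stereo_type(ibuprofen))
```

Note that the racemic flag in this example sits in a non-standard identifier (prefix `InChI=1/` rather than `InChI=1S/`).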

[Structure diagram with stereocentre group labels &1, &1 and &2]

A mixture of relative and absolute stereochemistry, and systems with multiple groups of relative stereochemistry are not yet supported

beta-cypermethrin (4 exact structures)

Helical stereochemistry

hexahelicene

(M)-hexahelicene (P)-hexahelicene


Axial stereochemistry

(Sa)-6,6′-dinitrobiphenyl-2,2′-dicarboxylic acid

(Ra)-6,6′-dinitrobiphenyl-2,2′-dicarboxylic acid

Conclusions

• Where useful, greater specificity than Standard InChI can be achieved by using extra layers

• InChI does not yet support all corner cases of stereochemistry

Any Questions?

Email: daniel@nextmovesoftware.com