ChemAxon UGM, San Diego, USA 25th September 2013
Recent improvements in Marvin v6: Reaction Atom Mapping and its Application to
Reaction Validation in Pharmaceutical ELNs
Daniel Lowe and Roger Sayle
NextMove Software
Cambridge, UK
What is Atom-Mapping?
Mapping algorithm
Why Perform Atom-Mapping?
• Assigning roles to reagents
• Normalization of reactions for registration
Why Perform Atom-Mapping?
• More precise database searches
– Solvents/catalysts can be distinguished from reactants
– Allows the relationship between the reactant atoms and product atoms to be made explicit
Example
• I want to find reactions converting an alkene to a cyclopropane so I search for C=C>>C1CC1
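Atom maps are what make this kind of search precise. The sketch below is a minimal illustration in plain Python and is not taken from the slides. It assumes reaction SMILES in which mapped atoms carry a `:n` map class inside brackets (for example `[CH2:1]`), and the helper names `map_numbers` and `shared_maps` are hypothetical. It extracts the map numbers on each side of `>>`, so a mapped query can require that the two alkene carbons are the same atoms that end up in the cyclopropane ring.

```python
import re

# Matches the atom-map class inside a bracket atom, e.g. the "1" in "[CH2:1]".
_MAP_RE = re.compile(r"\[[^\]]*?:(\d+)\]")


def map_numbers(smiles: str) -> set:
    """Return the set of atom-map numbers found in a SMILES fragment."""
    return {int(n) for n in _MAP_RE.findall(smiles)}


def shared_maps(reaction_smiles: str) -> set:
    """Map numbers that occur on both the reactant and the product side."""
    # Reaction SMILES has the form reactants>agents>products.
    reactants, _agents, products = reaction_smiles.split(">")
    return map_numbers(reactants) & map_numbers(products)


unmapped = "C=C>>C1CC1"
mapped = "[CH2:1]=[CH2:2]>>[CH2:1]1[CH2:2]C1"

# Without maps nothing ties the alkene carbons to the ring carbons.
print(shared_maps(unmapped))
# With maps, both alkene carbons are tracked into the product.
print(shared_maps(mapped))
```

With the unmapped query, any reaction that has an alkene somewhere in a reactant and a cyclopropane somewhere in the product is a hit, even if the two are unrelated; the mapped form rules those out.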
Why Perform Atom-Mapping?
• Identifying suspect reactions:
ChemAxon atom mapping
Atom mapping modes
• Complete
• Changing
• Matching
Methodology
Test set                                                           Reactions
Pharmaceutical ELN subset                                             18,244
ChemReact68 database                                                  67,926
SPRESI database subset                                                 5,230
Reactions extracted from 2008-2011 USPTO patent applications*        562,872
* Lowe, D. M. Automated Extraction of Reactions from the Patent Literature. 243rd ACS National Meeting & Exposition, San Diego, CA, March 27, 2012.
Metrics used
• Were all product atoms mapped?
– Measures recall
• How many C-C bonds were broken?
– Measures precision
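The two metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the evaluation code used for the talk: the `MappedReaction` record and its fields are hypothetical, C-C bonds are recorded as pairs of atom-map numbers, and the average is taken over reactions that received a complete mapping.

```python
from dataclasses import dataclass, field


@dataclass
class MappedReaction:
    """Toy record of one mapper's output (hypothetical structure)."""
    product_atoms: int          # heavy atoms in the product
    mapped_product_atoms: int   # how many of those received a map number
    # C-C bonds between mapped atoms, as frozensets of two map numbers
    reactant_cc_bonds: set = field(default_factory=set)
    product_cc_bonds: set = field(default_factory=set)


def fully_mapped(rxn: MappedReaction) -> bool:
    """Recall-style check: did every product atom get a map number?"""
    return rxn.mapped_product_atoms == rxn.product_atoms


def cc_bonds_broken(rxn: MappedReaction) -> int:
    """Reactant C-C bonds whose two mapped atoms are not bonded in the product."""
    return len(rxn.reactant_cc_bonds - rxn.product_cc_bonds)


def summarize(reactions):
    """Return (percent fully mapped, average C-C bonds broken per mapping)."""
    mapped = [r for r in reactions if fully_mapped(r)]
    percent = 100.0 * len(mapped) / len(reactions)
    # Assumption: average only over reactions with a complete mapping.
    avg = sum(cc_bonds_broken(r) for r in mapped) / len(mapped) if mapped else 0.0
    return percent, avg


bond = lambda a, b: frozenset({a, b})
r1 = MappedReaction(3, 3, {bond(1, 2)}, {bond(1, 2), bond(2, 3)})  # no C-C bond broken
r2 = MappedReaction(4, 4, {bond(1, 2), bond(3, 4)}, {bond(1, 2)})  # bond 3-4 broken
r3 = MappedReaction(5, 3)                                          # incomplete mapping
percent, avg = summarize([r1, r2, r3])
print(percent, avg)
```

A mapping that breaks many C-C bonds is usually chemically implausible, which is why a lower average is read as higher precision.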
Ability to map all product atoms
[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 80) for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]
C-C bonds broken
[Bar chart: average number of C-C bonds broken per mapping, lower is better (y-axis, 0.0 to 1.2), for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]
[Figure with panels labelled Marvin 5.10, ChemDraw 12 and Marvin 6.0]
Speed Comparison*
[Bar chart: reactions mapped per second (y-axis, 0 to 350) for Marvin 5.12, Marvin 6.0 and Marvin 6.0 (multithreaded)]
* Comparison performed on the PharmaELN dataset on an i7-2600
Difficult cases
ΔT
Areas for improvement: implicit stoichiometry
Areas for improvement: many choices for reactant atom mapping
Consensus Methods
[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 100) on the PharmaELN test set; bars: Marvin 6.0, ChemDraw 12, Marvin 6 + ChemDraw 12, Consensus Result*]
* Marvin 6.0 + ChemDraw 12 + 2 variants of GGA’s Indigo toolkit + InfoChem ICMap + Pipeline Pilot + MDL Cheshire
Beyond atom mapping
• Missing reactants (often for routine reactions)
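One simple way to flag a record with a missing reactant is an element balance: if the product contains more atoms of some element than all recorded reactants together supply, something was left out. The sketch below is an assumption-laden illustration, not the method used in the talk. It assumes flat molecular formulae such as `C7H6O2` are available, the helper names `parse_formula` and `unexplained_atoms` are hypothetical, and hydrogen is ignored by default because it is often supplied by unrecorded reagents.

```python
import re
from collections import Counter

# One element symbol followed by an optional count, e.g. "C7", "N", "Cl2".
_ELEMENT_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Parse a flat molecular formula such as 'C2H6O' into element counts."""
    counts = Counter()
    for symbol, number in _ELEMENT_RE.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def unexplained_atoms(reactant_formulae, product_formula, ignore=("H",)):
    """Elements the product has in excess of what the reactants supply."""
    supplied = Counter()
    for formula in reactant_formulae:
        supplied += parse_formula(formula)
    needed = parse_formula(product_formula)
    excess = needed - supplied  # Counter subtraction keeps only positive counts
    for element in ignore:
        excess.pop(element, None)
    return dict(excess)


# Amide formation recorded without the amine: benzoic acid -> N-methylbenzamide
print(unexplained_atoms(["C7H6O2"], "C8H9NO"))
# Same reaction with methylamine recorded as a second reactant
print(unexplained_atoms(["C7H6O2", "CH5N"], "C8H9NO"))
```

A non-empty result only says that atoms are unaccounted for; it does not identify which reactant is missing.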
Beyond atom mapping
• Change of stereoisomer or chiral resolution
(E)-3-{8-[2-(4-Isopropyl-1,3-thiazol-2-yl)ethyl]-2-methoxy-4-oxo-4H-pyrido[1,2-a]pyrimidin-3-yl}-2-propenoic acid (1 mg) was dissolved in CDCl3 (0.5 ml) and irradiated with light from a fluorescent lamp for 19 hours. The solvent was evaporated to obtain the title compound (1 mg).
Atom mapping + classification
[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 100); groups: atom mapping algorithms alone, combined with NameRXN; series: Marvin 6.0, ChemDraw 12, Consensus Result; annotation: verified / recognised by NameRXN (71%)]
Conclusions
• Marvin v6’s atom mapping algorithm provides large improvements in recall, precision and speed over v5
• Atom mapping in some cases isn’t as simple as finding a maximum common subgraph mapping
• Classification algorithms can be useful for the validation of some reactions
Acknowledgements
• Zsolt Mohacsi and Istvan Rabel, ChemAxon
• Ed Griffen and Nick Tomkinson, AstraZeneca
• Andrew Wooster, GSK
• Hans Kraut, InfoChem
• Thank you for your time.