Impressions of Spectrus Workbook - ACD/Labs · 2013. 11. 8. · Intro . NMR at Janssen Belgium: •...
Impressions of Spectrus Workbook
ASV tools – did they improve?
ACD/Labs NMR Software Symposium
Alex De Groot
SMASH, September 22, 2013
Intro
NMR at Janssen Belgium:
• Discovery: 2 NMR specialists (1.7 FTE), ±40 chemists in Open Access
• Development: 5 NMR specialists, ±20 chemists
• ACD/Labs software version 12.0
Evaluating ASV systems for 3 years
Beta-tester for the ACD/Spectrus platform. The only one who works with ACD/NMR Workbook.
Frequent feedback to ACD/Labs about bugs, suggestions, improvements...
Part 1 : Impressions of ACD/Spectrus
– Previously:
• Interpreting the proton spectrum
• Looking at the 2D spectra and from time to time transferring assignments to them
– ACD/NMR Workbook v12.01:
• Assignments automatically transferred: great! But...
• Lots of problems with overlap, difficult to correct
• More time spent correcting than was gained from the transfer
• Advised colleagues against starting to work with Workbook
– ACD/NMR Workbook v2012 (Spectrus):
• More automation and much better automatic results
• Problematic peaks are remedied more easily
• A whole other way of working: everything is automatic; check for unpicked and unassigned peaks; find the issues from the red arrows on the structure
• Great for giving feedback to doubting chemists
Whole other way of working
[Figure: the same atom-numbered structure shown in two panels, labelled 13C-HMBC and 15N-HMBC]
Whole other way of working
DBU:
[Figure: 1H spectrum R600109_1H.esp in CHLOROFORM-d, chemical shift axis 9.5 to 0.5 ppm, with integrals, peak labels and atom assignments; 53 H's / 53 H's (spectra / structure). Atom-numbered structure shown: Telaprevir]
SOP and points of attention:
• Reference H (and C) and put dark regions on solvent peaks
• Auto align: verify that all 2D spectra are aligned well in both dimensions!
• Auto signal analysis: make sure all 2D spectra are picked well, especially NOESY!
• Auto assign
• Check all strange connection arrows in all 2D spectra and all wrongly colored atoms
[Figure: HSQC-ed spectrum atahri_440_2.003.001.2rr.esp, F2 chemical shift axis 4.0 to 3.2 ppm, with cross-peaks labelled J(3.29, 37.8, 0) (5a, 5) and H(4.05, 42.23, 0) (3a, 3)]
Improvement opportunities
• Take the already picked 2D spectra into account for the automatic assignment.
E.g.: a split CH2 is assigned to two proton multiplets which are not on the same carbon
• Suggestions from you?
Part 2 : ASV tools – did they improve?
NMR at Discovery Janssen Belgium
Open access:
• 80% of approvals are done by the chemists themselves
• 20% are requests for expert help → 25% of these are wrong
So only 5% is wrong????
We do not check, but we “accidentally” encounter errors...
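The 5% figure only counts the errors that reach an expert. A minimal sketch of that arithmetic follows. The error rate among self-approved structures is unknown, so the values tried for it are purely hypothetical, as is the function name:

```python
def overall_error_rate(expert_share, expert_error_rate, self_error_rate):
    """Overall fraction of wrong structures across all approvals."""
    self_share = 1.0 - expert_share
    return expert_share * expert_error_rate + self_share * self_error_rate

# Figures from the slide: 20% of approvals go to an expert, 25% of those are wrong.
known = overall_error_rate(0.20, 0.25, 0.0)
print(f"errors seen by experts: {known:.0%} of all approvals")

# Hypothetical error rates for the 80% that nobody checks (assumption, not data).
for assumed in (0.0, 0.05, 0.10):
    total = overall_error_rate(0.20, 0.25, assumed)
    print(f"assumed self-approval error rate {assumed:.0%} -> overall {total:.0%}")
```

The 5% is therefore a lower bound: any errors in the unchecked 80% come on top of it.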
[Figure: structure drawings (atom labels R, O, N, Cl, C, F)]
... wrong top aromatic ring delivered
[Figure: two structure drawings]
Aliphatics very symmetric; the exchangeable proton was a sharp singlet!
... 11 products were assumed by the chemist to have closed as macrocycles...
What about external chemistry???
In 2011: 47 wrong external compounds found!
... and we were not even searching for them...
A few examples of wrong starting materials used for internal reactions:
[Figure: example structures of wrong starting materials]
External CRO chemistry wrong (a few examples):
[Figure: example structures]
CRO: bigger problems. A fast check at the request of a chemist found a very obvious ortho-meta error:
23 analogue compounds had already been synthesized at the CRO, with NMR measured!
All showed 2 triplets, but no CRO chemist or analyst ever noticed these non-matching peaks????
Those 23 had already been screened internally and were at risk of influencing chemistry design.
[Figure: 1H spectrum EO7218_17_001.001.001.1R.esp, aromatic region 8.3 to 6.5 ppm, with two structure drawings carrying R and OCH3 groups]
Magic?
Automated Structure Verification
or Pandora’s box?
• 3 vendors evaluated : ACD/Labs, Mnova and Bruker
• Not focused on big numbers, but on getting an idea of trustworthiness, reliability, structural insight, efficiency, problems and ease of use
• 2 sets of compounds:
– Fragment compounds → 3 wrong sets designed
– Wrong chemistry compounds
• 2 screening techniques: 1H, 1H+HSQC
• 2 alterations: prediction training, concurrent verification
ASV test at Janssen
Test sets
Figure 1: Example of structures used for false positive evaluation
Set I:
• 37 molecules
• Fragment library
• 1 correct structure + 3 invented wrong structures
Set II:
• 77 molecules
• Synthesized by chemists
• 1 correct structure: real structures elucidated by an NMR expert
• 1 wrong structure: proposed by chemists
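With one correct structure and one or more wrong structures per molecule, each verification run yields a pass or fail verdict per proposed structure. A minimal sketch of how pass rates and false positive rates could be tallied from such verdicts is given below. The function name, the category labels and the example verdicts are hypothetical and are not data from the study:

```python
from collections import defaultdict

def pass_rates(results):
    """Return the fraction of 'pass' verdicts per structure category.

    results: iterable of (category, passed) pairs, one per verified structure.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for category, ok in results:
        total[category] += 1
        if ok:
            passed[category] += 1
    return {cat: passed[cat] / total[cat] for cat in total}

# Hypothetical verdicts, made up for illustration only.
example = [
    ("correct", True), ("correct", True), ("correct", False),
    ("wrong", True), ("wrong", False), ("wrong", False), ("wrong", False),
]
rates = pass_rates(example)
print(f"pass rate (correct structures): {rates['correct']:.0%}")
print(f"false positive rate (wrong structures): {rates['wrong']:.0%}")
```

A pass on a correct structure counts toward the pass rate; a pass on a wrong structure is a false positive.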
Main screen of ACD/NMR Expert
Traffic light colors: match factor
[Figure: screenshot of the main screen with an example structure]
1H verification Set I (fragments)

ACD/Labs with or without training: training is not a real improvement
[Chart values, axis 0% to 100%]
                   best tautomer   hetero   regio   multiregio
1D new                  86%          41%     58%       10%
1D training new         78%          34%     44%        9%

Comparison between ACD/Labs new and old version (1D): improvement for multiregio isomers
[Chart values, axis 0% to 100%]
                   best tautomer   hetero   regio   multiregio
1D ACD new              86%          41%     58%       10%
1D ACD old              81%          41%     59%       35%
1H verification Set I (fragments)
Set I: 1D comparison of ACD/Labs v2012, CMC assist v2 and Mnova v8.1
1D ACD/Labs is better on multiregio, while still keeping a high pass rate
[Chart values, axis 0% to 100%]
                   correct   hetero   regio   multiregio
ACD Spectrus         86%       41%     58%       10%
CMC Assist           78%       54%     62%       30%
Mnova                57%       16%     38%       19%

Adding 2D HSQC-edited information
Set I, combined 1D + HSQC verification
Comparison between ACD/Labs old and new, and Mnova 8.1
[Chart values, axis 0% to 100%]
                   correct   hetero   regio   multiregio
ACD 1D new           86%       41%     58%       10%
ACD comb. old        81%       43%     62%       16%
ACD comb. new        68%       16%     35%        7%
Mnova                49%       14%     27%        5%
Set I: 1D concurrent verification, comparison of new and old version
ACD manual concurrent verification
New version is better in regioisomers, but more ambiguous
[Chart values, axis 0% to 100%]
             correct   hetero   regio   multiregio   ambiguous
ACD old        49%       5%      19%       0%          27%
ACD new        32%       8%       3%       0%          57%
Set I: combined 1D+2D concurrent verification, comparison of new and old version
ACD manual concurrent verification
[Chart values, axis 0% to 100%]
             correct   hetero   regio   multiregio   ambiguous
ACD old        41%      16%       5%       0%          38%
ACD new        65%       0%       0%       0%          35%
Set 2: chemistry compounds
Set II: 1D verification, comparison between new ACD, Mnova and CMC
[Chart values, axis 0% to 100%]
                 correct   wrong
ACD Spectrus       49%      23%
Mnova              17%      17%
CMC assist         49%      32%
Set 2: chemistry compounds
Set II: combined 1D & HSQC verification, comparison of ACD/Labs, CMC assist, Mnova and ACD/Labs manual concurrent
[Chart values, axis 0% to 80%]
                   correct   wrong
ACD                  62%       9%
Mnova                21%       5%
CMC assist           52%      30%
ACD concurrent       69%       2%
Set 2: combined 1D+2D automatic concurrent verification
ACD/Labs automatic concurrent verification (β-version!)
– 10 isomers invented
– 5-step verification
– Alternative structure between isomers: 9% for good and 12% for wrong
[Chart values, axis 0% to 100%; "-" marks a bar without a value label]
                   correct   wrong
fit                  30%       3%
no fit               61%      82%
several               9%       6%
good suggested        -        9%
Conclusions
– Many failures still come from processing!
– ACD/Labs has the best balance between pass rate and false positives
– Mnova has many colors, but with the suggested cut-off of 0.3 low pass rates are observed
– CMC-assist is different, but a higher number of false positives is observed
– General performance conclusion:
• All are good at finding big mistakes in structures, but all have a hard time discriminating common regio-isomers
• No magic: still a lot of false positives, especially for proton only; we cannot fully rely on it
• Better than nothing: every error caught is one fewer
• Good for checking CRO data, but we are afraid chemists would blindly take green for granted
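The trade-off between pass rate and false positives depends on where the cut-off on the match score is placed. A minimal sketch of that effect follows, assuming a score where higher means a better match and a pass means the score is at or above the cut-off. The scores, their scale and the function name are hypothetical and do not reproduce any vendor's scoring:

```python
def rates_at_cutoff(scores_correct, scores_wrong, cutoff):
    """Pass rate and false positive rate when 'score >= cutoff' counts as a pass."""
    pass_rate = sum(s >= cutoff for s in scores_correct) / len(scores_correct)
    false_pos = sum(s >= cutoff for s in scores_wrong) / len(scores_wrong)
    return pass_rate, false_pos

# Hypothetical match scores for correct and wrong structures (made up).
correct = [0.9, 0.6, 0.4, 0.2, 0.1]
wrong = [0.5, 0.25, 0.1, -0.2, -0.6]

for cutoff in (0.0, 0.3, 0.5):
    p, f = rates_at_cutoff(correct, wrong, cutoff)
    print(f"cut-off {cutoff:.1f}: pass rate {p:.0%}, false positives {f:.0%}")
```

Raising the cut-off removes false positives but also fails more correct structures, which is the balance the conclusions above refer to.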
Automated Structure Verification
Acknowledgements
My students:
Benjamin Guguen (2011)
Léa Drieu (2012)
Nicolas Lefèvre (2013)
ACD macro tools support
Sergey Golotvin
Thank you!
Questions?