“Found in Translation” – Neural machine translation models for chemical reaction prediction
Philippe Schwaller (@phisch124)
Theophile Gaudin, Riccardo Pisoni, David Lanyi, Costas Bekas & Teodoro Laino
IBM Research Zurich, Switzerland
Alpha Lee, University of Cambridge
Chem. Sci., 2018, 9, 6091-6098
Exploring the nearly endless chemical space
Design, Make, Test
De novo molecules: VAE / GAN / RL
Synthetic route planning, reaction outcome prediction
Experimental verification
Credits to Marwin Segler
Chemical reaction prediction
Data
US patents
Lowe (2012,2017)
text-mining
[Figure: example reaction with atom-mapped reactants and products]
Jin et al. (NIPS 2017)
filtering
SMARTS
- reactants>>products
- partly correct atom-mapping
- >1M reactions
Benchmark dataset (USPTO_500k)
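The reactants>>products notation above can be handled with plain string operations. The following is a minimal sketch, not the preprocessing used to build the benchmark: the helper names `split_reaction` and `strip_atom_maps` are hypothetical, and the atom-map removal is a simple regex that only drops the `:n` labels inside bracket atoms.

```python
import re


def split_reaction(rxn: str):
    """Split 'reactants>reagents>products' into three lists of molecule SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators")
    # Molecules within one part are separated by '.'; an empty part gives [].
    return tuple([mol for mol in part.split(".") if mol] for part in parts)


def strip_atom_maps(smiles: str) -> str:
    """Remove atom-map labels such as ':12' from bracket atoms (assumed simple form)."""
    return re.sub(r":\d+\]", "]", smiles)


reactants, reagents, products = split_reaction("CC=C1CCCCC1.Cl>>CCC1(Cl)CCCCC1")
```

Here `reagents` is an empty list because nothing is written between the two `>` characters.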
Representing molecules for ML
Molecular fingerprints: 000010000….0100
Text-based representations
- SMILES / InChI
- CN1C=NC2=C1C(=O)N(C(=O)N2C)C
Graph-based
1D 2D 3D
3D structure
Atoms as letters, molecules as words
SMILES-to-SMILES prediction with sequence-2-sequence models
Schwaller et al., Chem. Sci., 2018, 9, 6091-6098 / Nam & Kim, arXiv:1612.09529
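To treat atoms as letters, a SMILES string first has to be cut into tokens. The sketch below shows one way to do this with a regular expression. The pattern is an assumption made for illustration (bracket atoms, `Br`, `Cl` and two-digit ring closures are kept as single tokens, everything else is one character) and is not necessarily the tokenizer used in the cited work.

```python
import re

# Assumed token pattern: bracket atoms, two-letter halogens, %nn ring closures,
# then any remaining single non-space character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|\S")


def tokenize(smiles: str) -> list:
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


caffeine_tokens = tokenize("CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
source_tokens = tokenize("Br.CC=C1C")
```

Joining the tokens with spaces gives the kind of input sequence a translation model reads, and joining them without spaces restores the original string.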
How do Seq-2-Seq models work?
Step-by-step
Inspired by Hendrik’s talk in Zurich
Step-by-step with attention
Link to reaction prediction?
- Reactants to products
Source: Br . C C = C 1 C …
Prediction: C
Attention weights
Plotting the attention weights at every decoder time step.
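What is plotted at each decoder time step is one row of weights over the source tokens. A minimal sketch of how such a row could be computed is given below, assuming dot-product scoring followed by a softmax. The scoring function used by the model in this talk is not stated on the slides, so this choice and the function names are assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(decoder_state, encoder_states):
    """Return (weights over source positions, weighted sum of encoder states)."""
    # Dot-product score between the decoder state and each encoder state.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Toy example: two source tokens, decoder state aligned with the first one.
weights, context = attention([10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Stacking the `weights` rows for all decoder steps gives a target-by-source matrix that can be drawn as a heat map.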
Attention weights
Chloro Buchwald-Hartwig amination: US20170240534A1
Source: Reactants
Target: Product
SMILES-2-SMILES
Trained end-2-end
Template-free
Fully data-driven
No chemical knowledge incorporated
No atom-mapping required
Attention weights and confidence score
Less than 0.5% invalid SMILES
Limitations
Only as good as the training data
Difficult to include negative data
Chemically unreasonable predictions possible
What other template-free approaches exist?
Graph neural networks
• Jin et al. (MIT, 2017): Weisfeiler-Lehman Networks (WLDN)
  - Network 1: Reaction center identification
  - Network 2: Product candidate ranking
  - Trained separately
  - Outperforms template-based by 10% on USPTO_500k dataset
• Bradshaw et al. (Cambridge, 2018): Electron path prediction
  - Gated Graph Neural Networks
  - Outperforms WLDN on USPTO_350k dataset (subset without more difficult reactions, e.g. cycloadditions)
Fundamental limitation: require atom-mapped training sets
How do we perform compared to others?
Top-1 accuracy on USPTO_MIT benchmark dataset

Reactants > reagents, Products
Data set        Jin et al., WLDN   Schwaller et al., Seq-2-seq   Bradshaw et al., GGNN   Our new model
MIT_500k        79.6 %             80.3 %                        -                       88.8 %
Subset, 350k    84.0 %             -                             87.0 %                  -

Reactants & reagents mixed, Products
Data set        Jin et al.         Schwaller et al.              Our new model
MIT_500k        74.0 %             <74.0 %                       87.3 %

Schwaller et al.: Chem. Sci., 2018, 9, 6091-6098
Jin et al.: NIPS, 2017, 30, 2607-2616
Bradshaw et al.: arXiv:1805.10970
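Top-1 accuracy counts a reaction as correct only when the highest-ranked predicted product matches the recorded one. The sketch below illustrates the metric under the assumption that predictions and targets are already canonical SMILES strings, so exact string comparison is meaningful. The function name and the toy data are hypothetical.

```python
def top_k_accuracy(predictions, targets, k=1):
    """Fraction of targets found among the first k ranked predictions.

    predictions: one ranked list of SMILES strings per reaction
    targets:     one ground-truth SMILES string per reaction
    """
    hits = sum(1 for ranked, target in zip(predictions, targets)
               if target in ranked[:k])
    return hits / len(targets)


# Toy data: the first reaction is right at rank 1, the second only at rank 2.
toy_predictions = [
    ["CCC1(Cl)CCCCC1", "CC(Cl)C1CCCCC1"],
    ["CCO", "CCN"],
]
toy_targets = ["CCC1(Cl)CCCCC1", "CCN"]

top1 = top_k_accuracy(toy_predictions, toy_targets, k=1)
top2 = top_k_accuracy(toy_predictions, toy_targets, k=2)
```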
What else can we do?
Reaction scoring
When we provide the model with reactants>>products:
CC=C1CCCCC1.Cl>>CCC1(Cl)CCCCC1   Markovnikov        Score: 0.99
CC=C1CCCCC1.Cl>>CC(Cl)C1CCCCC1   Anti-Markovnikov   Score: 0.001
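The slides do not say how the score is computed. One common choice for sequence models, assumed here purely for illustration, is the product of the probabilities the decoder assigns to each token of the given product. The per-token probabilities below are made-up numbers, not model outputs.

```python
import math


def sequence_score(token_probs):
    """Product of per-token probabilities, accumulated in log space."""
    # Summing logs avoids underflow for long sequences; every p must be > 0.
    return math.exp(sum(math.log(p) for p in token_probs))


# Hypothetical per-token probabilities for two candidate products.
likely_product = [0.999, 0.998, 0.999, 0.997]
unlikely_product = [0.999, 0.02, 0.30, 0.997]

high = sequence_score(likely_product)
low = sequence_score(unlikely_product)
```

Under this assumption a single improbable token is enough to pull the whole score down, which is consistent with the large gap between the two reactions above.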
IBM RXN for Chemistry
Freely available now: research.ibm.com/ai4chemistry
#RXNFORCHEMISTRY
Booth #530
[email protected] / @phisch124
Questions?
Philippe Schwaller