CINF 13, ACS Fall 2017, Washington, D.C.
pistachio: Search and Faceting of Large Reaction Databases
John Mayfield, Daniel Lowe, Roger Sayle
What do Synthetic Chemists Want from Their Reaction Systems?
Data, Classification, Diagrams, Search
HazELNut Filbert NameRXN Cobnut
Accelrys Pipeline Pilot (AstraZeneca, AbbVie & Hoffmann-La Roche)
ChemAxon JChem Cartridge (GlaxoSmithKline & Novartis)
Elsevier Reaxys (Hoffmann-La Roche, AstraZeneca, Merck)
Perkin Elmer Informatics (formerly CambridgeSoft) eNotebook v9, v11 or v13 or Symyx ELN v5.x or v6.x
Oracle Server version 10, 11 or
Microsoft Windows, Linux or Mac OS
Infrastructure for liberating and processing reactions from Electronic Lab Notebooks (ELNs)
To 7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (Peakdale) (220 mg, 1.025 mmol) and (3,4-dimethoxyphenyl)boronic acid (187 mg, 1.025 mmol) in 1,4-dioxane (3 mL) and water (1.5 mL) was added sodium carbonate (435 mg, 4.10 mmol) and tetrakis(triphenylphosphine)palladium(0) (110 mg, 0.095 mmol). The reaction was heated in the microwave at 80° C. for 2 hours and at 100° C. for a further 2 hours. The solvent was removed and the residue was suspended in DMSO, filtered and purified by MDAP. Appropriate fractions were combined and the solvent removed to give 7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid (25 mg, 7%) as a yellow solid.
[0517]
US 2016/16966 A1
Daniel M. Lowe. Extraction of chemical structures and reactions from the literature. Ph.D. Thesis, University of Cambridge, 2012
Unstructured text to a structured reaction table (LeadMine + ChemicalTagger)
Product Properties
• 7-(3,4-dimethoxyphenyl)-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 25 mg, 7% yield, yellow solid
Reactant Properties
• 7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid: 220 mg, 1.025 mmol
• (3,4-dimethoxyphenyl)boronic acid: 187 mg, 1.025 mmol
Agent Properties
• 1,4-dioxane: 3 mL
• water: 1.5 mL
• sodium carbonate: 435 mg, 4.10 mmol
• tetrakis(triphenylphosphine)palladium(0): 110 mg, 0.095 mmol
• DMSO
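One way to picture the structured table is as a list of typed records, one per component. The sketch below is illustrative only: the `Component` class, its field names, and the `equivalents` helper are assumptions made for this example, not the schema used by LeadMine, ChemicalTagger, or Pistachio. The quantities are those quoted in paragraph [0517], and the chloro acid is arbitrarily taken as the reference reactant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """One row of the structured reaction table (hypothetical schema)."""
    name: str
    role: str  # "reactant", "agent" or "product"
    mass_mg: Optional[float] = None
    amount_mmol: Optional[float] = None
    volume_ml: Optional[float] = None


def equivalents(component, reference):
    """Molar equivalents relative to a reference reactant, or None if unknown."""
    if component.amount_mmol is None or reference.amount_mmol is None:
        return None
    return component.amount_mmol / reference.amount_mmol


# Values copied from the patent paragraph quoted above.
chloride = Component(
    "7-chloro-4-oxo-4,5-dihydrofuro[2,3-d]pyridazine-2-carboxylic acid",
    "reactant", mass_mg=220, amount_mmol=1.025)
base = Component("sodium carbonate", "agent", mass_mg=435, amount_mmol=4.10)
catalyst = Component("tetrakis(triphenylphosphine)palladium(0)", "agent",
                     mass_mg=110, amount_mmol=0.095)
dioxane = Component("1,4-dioxane", "agent", volume_ml=3)

print(round(equivalents(base, chloride), 2))      # base loading
print(round(equivalents(catalyst, chloride), 3))  # catalyst loading
```

Once the quantities are fields rather than free text, derived values such as base and catalyst loading become simple arithmetic over the records.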
Christos Nicolaou et al. The Proximal Lilly Collection: Mapping, Exploring and Exploiting Feasible Chemical Space. J. Chem. Inf. Model., 2016, 56 (7), pp 1253–1266
Nadine Schneider et al. Big Data from Pharmaceutical Patents: A Computational Analysis of Medicinal Chemists’ Bread and Butter. J. Med. Chem., 2016, 59 (9), pp 4385–4402
Nadine Schneider et al. Development of a Novel Fingerprint for Chemical Reactions and Its Application to Large-Scale Reaction Classification and Similarity. J. Chem. Inf. Model., 2015, 55 (1), pp 39–53
Nadine Schneider et al. What’s What: The (Nearly) Definitive Guide to Reaction Role Assignment. J. Chem. Inf. Model., 2016, 56 (12), pp 2336–2346
Connor Coley et al. Prediction of Organic Reaction Outcomes Using Machine Learning. ACS Cent. Sci., 2017, 3 (5), pp 434–443
Data impact
Public subset released in 2014 as CC-Zero
Pistachio expands the scope of the data and uses Atom-Atom Maps from NameRxn
sketch extraction
NextMove’s Praline
Example 26. Epizyme Inc. 1-phenoxy-3-(alkylamino)-propan-2-ol derivatives as CARM1 inhibitors and uses thereof (US 09718816 B2), Aug. 1, 2017
[Figure: sketch from Example 26, US 09718816 B2, with extracted steps labelled Step 1, Step 2, Step 3, Step 4, etc.]
John May, et al. Sketchy Sketches: Hiding Chemistry in Plain Sight. Seventh Joint Sheffield Conference on Cheminformatics. 2016
total reactions over time
[Figure: cumulative reaction details by year, 1997 to 2018, y-axis 0 to 3.5M; series: EPO Applications, EPO Grants, USPTO Applications, USPTO Grants]
reaction diagrams
Good reaction diagrams are essential in communicating synthetic chemistry
Layout can be stored or generated
• When extracting from text, layout must be generated
• Generated diagrams can be unsatisfactory for display
[Figure: reaction diagrams as laid out by ChemDraw and OEChem]
Generated from SMILES for US 2016/16966 A1 [0517]
[Figure: reaction diagrams as laid out by ChemAxon and BIOVIA]
Generated from SMILES for US 2016/16966 A1 [0517]
diagram improvements
Typical workarounds:
• Separately render molecules
• Hide agents and list separately
What do humans do:
• Wrap products below
• Abbreviate functional groups and agents
• Orientate reactants to products and vice versa
• Hide agents and list as text
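The "hide agents and list as text" idea can be sketched on a reaction SMILES, where the three `>`-separated fields hold reactants, agents, and products. This is a minimal illustration, not Pistachio's or CDK's implementation: the label table is hypothetical, fragments are matched by exact string rather than by canonical structure, and the toy Suzuki coupling is not taken from the slides.

```python
from collections import Counter

# Hypothetical label table: agent fragments (exactly as written) -> text label.
AGENT_LABELS = {
    ("[Na+]", "[Na+]", "[O-]C([O-])=O"): "Na2CO3",
    ("C1COCCO1",): "1,4-dioxane",
    ("O",): "H2O",
}


def agents_as_text(agents):
    """Turn the agent field of a reaction SMILES into a short text list."""
    remaining = Counter(frag for frag in agents.split(".") if frag)
    labels = []
    # Try multi-fragment entries first so salts are labelled before lone ions.
    for frags, label in sorted(AGENT_LABELS.items(), key=lambda kv: -len(kv[0])):
        need = Counter(frags)
        while all(remaining[f] >= n for f, n in need.items()):
            remaining -= need
            labels.append(label)
    # Anything without a label is listed as its raw fragment.
    labels.extend(sorted(remaining.elements()))
    return "; ".join(labels)


def hide_agents(rxn_smiles):
    """Return (agent-free reaction SMILES, agents as text)."""
    reactants, agents, products = rxn_smiles.split(">")
    return f"{reactants}>>{products}", agents_as_text(agents)


# Toy example: bromobenzene + phenylboronic acid -> biphenyl.
rxn = ("Brc1ccccc1.OB(O)c1ccccc1"
       ">C1COCCO1.O.[Na+].[Na+].[O-]C([O-])=O>"
       "c1ccc(cc1)-c1ccccc1")
diagram_smiles, agent_text = hide_agents(rxn)
print(diagram_smiles)
print(agent_text)
```

Only the reactants and products are then passed to the layout step, and the agent text can be placed above or below the arrow.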
[Figure: reaction diagrams from Pistachio + CDK (Abbreviated + Aligned) and Pistachio + CDK (Abbreviated)]
Generated from SMILES for US 2016/16966 A1 [0517]
reaction detail view
namerxn
Assigns names to 900+ reactions using transformations
Can guarantee perfect Atom-Atom Mapping
• Atom-Atom Mapping is an output not an input
• MCS mappers struggle with rearrangements:
4.1.6 Cyclic Beckmann rearrangement
concepts and rxno
1 Heteroatom alkylation and arylation
  .7 O-substitution
    .1 Chan-Lam ether coupling
    .2 Diazomethane esterification
    .3 Ethyl esterification
    .4 Hydroxy to methoxy
    .5 Hydroxy to triflyloxy
    .6 Methyl esterification
    .n
2 Acylation and related processes
  .6 O-acylation to ester
    .1 Ester Schotten-Baumann
    .2 Esterification (generic)
    .3 Fischer-Speier esterification
    .4 Baeyer-Villiger oxidation
    .5 Yamaguchi esterification
    .6 Hydroxy to imidazolecarbonyloxy
    .7 Imidazolecarbonyl to ester
    .8 Hydroxy to acetoxy
    .9 Steglich esterification
    .n
Esterification (7)
Chan-Lam coupling (3)
Schotten-Baumann Reaction (9)
RXNO: http://github.com/rsc-ontologies/rxno
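The dotted codes in the hierarchy (for example 2.6.3) can be read as a path from class to subclass to named reaction. The sketch below is an illustration under that reading: the `NAMES` table holds a handful of entries copied from the slide, and the helper functions are hypothetical, not part of NameRxn.

```python
from collections import Counter

# A few entries copied from the hierarchy on the slide: code -> name.
NAMES = {
    "1": "Heteroatom alkylation and arylation",
    "1.7": "O-substitution",
    "1.7.1": "Chan-Lam ether coupling",
    "1.7.6": "Methyl esterification",
    "2": "Acylation and related processes",
    "2.6": "O-acylation to ester",
    "2.6.1": "Ester Schotten-Baumann",
    "2.6.3": "Fischer-Speier esterification",
    "2.6.9": "Steglich esterification",
}


def ancestors(code):
    """'2.6.3' -> ['2', '2.6', '2.6.3']."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def path(code):
    """Names from the top-level class down to the given code."""
    return [NAMES.get(c, "?") for c in ancestors(code)]


def roll_up(codes, depth):
    """Count reactions per node at a given depth of the tree."""
    return Counter(".".join(c.split(".")[:depth]) for c in codes)


print(" > ".join(path("2.6.3")))
print(roll_up(["1.7.1", "1.7.6", "2.6.1"], depth=1))
```

Because every code carries its ancestors as a prefix, counts at any level of the tree can be obtained by truncating the code, with no separate parent lookup.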
result facets
Provides a summary over the key concepts of the results
Cut through information deluge and refine search
• Reaction Types (NextMove ontology tree)
• Drug Targets (ChEMBL ontology tree)
• Disease Targets (MeSH ontology tree)
• Yields
• Affiliation (NextMove ontology tree)
• Publication Date, Documents, Authors
facet calculation
Resource expensive: O(n) in the size of the result set
• Client, server, or database?
• Overhead copying and transferring data that is not needed
• Calculate when requested or up-front?
Custom cartridge:
2.9 seconds to summarise all 6.6 million rows (Intel(R) Core(TM) i7-6900K CPU @ 3.20GHz)
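The O(n) cost comes from having to visit every row of the result set once to build the counts. A minimal in-memory sketch of that single pass is shown below; it does not represent the custom cartridge, and the field names, the 10% yield buckets, and the sample rows are all assumptions for illustration.

```python
from collections import Counter


def yield_bucket(value, width=10):
    """Map a percentage yield to a bucket label such as '80-90%'."""
    if value is None:
        return "unknown"
    low = min(int(value // width) * width, 100 - width)
    return f"{low}-{low + width}%"


def summarise(rows, fields=("reaction_type", "affiliation", "yield")):
    """One pass over the result set, one Counter per facet."""
    facets = {field: Counter() for field in fields}
    for row in rows:  # O(n) in the number of result rows
        for field in fields:
            value = row.get(field)
            if field == "yield":
                value = yield_bucket(value)
            elif value is None:
                value = "unknown"
            facets[field][value] += 1
    return facets


# Hypothetical rows, for illustration only.
rows = [
    {"reaction_type": "2.6.3", "affiliation": "Org A", "yield": 85},
    {"reaction_type": "2.6.3", "affiliation": "Org B", "yield": 7},
    {"reaction_type": "1.7.1", "affiliation": "Org A", "yield": None},
]
facets = summarise(rows)
print(facets["reaction_type"].most_common())
print(facets["yield"])
```

Running this kind of loop next to the data returns only the small count tables to the client, which is the copying overhead the bullets above refer to.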
one entry point
Systematic Name, Date Range, Trivial Name, Yield Range, Affiliation, Reaction SMARTS, Disease Target, Document, Line Formula, SMILES, InChI, Author, Protein Target, Collection, Reaction Type (NameRxn), SMARTS, Source
… and logical combinations thereof
suggestions
Based on global frequency
Based on context frequency
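The difference between the two rankings can be shown with a small prefix-completion sketch. Everything here is hypothetical: the term counts are made-up numbers, and the slides do not describe how Pistachio actually builds or ranks its suggestions.

```python
from collections import Counter


def suggest(prefix, term_counts, limit=3):
    """Terms starting with the prefix, most frequent first (ties by name)."""
    prefix = prefix.lower()
    matches = [(term, n) for term, n in term_counts.items()
               if term.lower().startswith(prefix)]
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in matches[:limit]]


def context_counts(rows, field):
    """Term frequencies within the current result set only."""
    return Counter(row[field] for row in rows)


# Made-up global frequencies across the whole database.
global_counts = Counter({
    "Sulfonamide formation": 800,
    "Suzuki coupling": 500,
    "Swern oxidation": 50,
})

# Made-up current result set, e.g. after the user has already applied a filter.
current_rows = [
    {"reaction_type": "Swern oxidation"},
    {"reaction_type": "Swern oxidation"},
    {"reaction_type": "Suzuki coupling"},
]

print(suggest("s", global_counts))
print(suggest("s", context_counts(current_rows, "reaction_type")))
```

With the global counts the common term wins; with the context counts the term that dominates the current results moves to the top.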
structure search technology
NextMove’s Arthor Technology
Up to 100x faster than state-of-the-art
Combination of SMARTS compilation and efficient storage
Preliminary PostgreSQL integration
Benchmark: ~3.5K queries against ~7M structures (eMolecules 2014), all on the same hardware:
36s Arthor
56m BIOVIA Direct (Oracle)
1h Bingo (NoSQL)
1h54m Bingo (PostgreSQL)
2h6m Bingo (Oracle)
2h41m JChem (Oracle)
5h9m RDCart (PostgreSQL)
13h54m pgchem (PostgreSQL)
1d1h52m mychem (MySQL)
3d1h13m orchem (Oracle)
John May and Roger Sayle, Substructure Search Face-off, May 2015
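The benchmark timings can be converted to seconds to see how the relative speeds work out. The sketch below only restates the numbers on the slide; the `to_seconds` helper is written for this example. The next-fastest entry, BIOVIA Direct, comes out at roughly 93 times the Arthor time, which is presumably where the "up to 100x" figure comes from.

```python
import re

# Timings as listed on the slide.
TIMINGS = {
    "Arthor": "36s",
    "BIOVIA Direct (Oracle)": "56m",
    "Bingo (NoSQL)": "1h",
    "Bingo (PostgreSQL)": "1h54m",
    "Bingo (Oracle)": "2h6m",
    "JChem (Oracle)": "2h41m",
    "RDCart (PostgreSQL)": "5h9m",
    "pgchem (PostgreSQL)": "13h54m",
    "mychem (MySQL)": "1d1h52m",
    "orchem (Oracle)": "3d1h13m",
}

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '1d1h52m' to seconds."""
    parts = re.findall(r"(\d+)([dhms])", text)
    if not parts or "".join(n + unit for n, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(n) * UNIT_SECONDS[unit] for n, unit in parts)


baseline = to_seconds(TIMINGS["Arthor"])
for name, duration in TIMINGS.items():
    ratio = to_seconds(duration) / baseline
    print(f"{name}: {ratio:.0f}x the Arthor time")
```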
refining structure search
Intention can be refined by qualifiers
• Role: {structure} product
• Substructure: {structure} substructure, {structure} substructure product
• Make/Break: Synthesis of {structure}
• Combined with other terms: {structure} substructure product and yield of 80%
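To make the qualifier forms concrete, here is a deliberately naive parser for the four query shapes listed above. It is a sketch under stated assumptions, not Pistachio's query parser: the output dictionary, the set of role keywords other than "product", and the whitespace tokenisation are all invented for illustration.

```python
import re

# Only "product" appears on the slide; the other roles are assumed.
ROLES = {"reactant", "agent", "product"}


def parse_clause(clause):
    """Parse '{structure} [substructure] [role]' or 'Synthesis of {structure}'."""
    tokens = clause.split()
    query = {"structure": None, "match": "exact", "role": None, "make": False}
    if [t.lower() for t in tokens[:2]] == ["synthesis", "of"]:
        query["make"] = True  # structure must be formed by the reaction
        tokens = tokens[2:]
    if tokens and tokens[-1].lower() in ROLES:
        query["role"] = tokens.pop().lower()
    if tokens and tokens[-1].lower() == "substructure":
        query["match"] = "substructure"
        tokens.pop()
    query["structure"] = " ".join(tokens)
    return query


def parse_query(text):
    """Split on 'and', then parse each clause."""
    clauses = re.split(r"\s+and\s+", text.strip())
    parsed = []
    for clause in clauses:
        m = re.fullmatch(r"yield of (\d+)%", clause, flags=re.IGNORECASE)
        parsed.append({"yield": int(m.group(1))} if m else parse_clause(clause))
    return parsed


print(parse_query("7H-purine substructure product and yield of 80%"))
print(parse_query("Synthesis of 7H-purine"))
```

The point of the qualifiers is that the same structure text yields different search constraints depending on the trailing keywords, so the parser strips them from the end before treating the remainder as the structure.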
make/break example
Find: 7H-purine substructure product
Find: Synthesis of 7H-purine
Namerxn example
Find: 7H-purine-8-one substructure chlorination
Find: [*:1][CH2:2]Cl>>[*:1][CH2:2]F
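In the reaction SMARTS query, the `:1` and `:2` labels are atom maps that tie atoms on the left of `>>` to the same atoms on the right, while the unlabelled Cl and F are the atoms lost and gained. A small sketch of reading those maps with a regular expression follows; it handles only the simple `left>>right` form and is not a SMARTS parser.

```python
import re


def split_transform(rxn_smarts):
    """Split 'left>>right' into its two sides."""
    left, _, right = rxn_smarts.partition(">>")
    return left, right


def atom_maps(pattern):
    """Collect atom-map numbers such as the 1 in '[*:1]'."""
    return {int(n) for n in re.findall(r":(\d+)\]", pattern)}


left, right = split_transform("[*:1][CH2:2]Cl>>[*:1][CH2:2]F")
conserved = atom_maps(left) & atom_maps(right)
print(sorted(conserved))
```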
Acknowledgements
Noel O’Boyle (NextMove Software), Egon Willighagen (CDK)
James Davison, Matt Swain (Vernalis)
What do Synthetic Chemists Want from Their Reaction Systems?
Data, Classification, Diagrams, Search
pistachio: http://www.nextmovesoftware.com/pistachio.html
Come find me around ACS for a demo!
See also: CINF 90