The Schrödinger KNIME extensions
Jean-Christophe Mozziconacci
Volker Eyrich
KNIME UGM, Zurich, March 2013
Computational Chemistry and Cheminformatics in a workflow environment
What are the Schrödinger extensions?
• Computational chemistry and drug design
• 150+ nodes
• Linux, Mac, Windows, 32 and 64 bit
Molecular Mechanics – MacroModel
Molecular Dynamics – Desmond
Quantum Mechanics – Jaguar
Cheminformatics – Canvas
Pharmacophore modeling – Phase
Combinatorial Libraries – CombiGlide
Docking – Glide
Protein Structure Prediction – Prime, IFD
Protein preparation
Ligand preparation – LigPrep, Epik
Property generation – QikProp ...
Filtering
Tools for data and structure manipulation
Scripting – shell, Python
Reporting
Library design: Library enumeration
Homology modeling
• Model building and refinement ◄
• Induced Fit Docking ◄
Real World Examples
• Vendor database preparation
• SiteMap and Glide grid generation
• SiteMap and clustering
Metanodes: Run Maestro 1:1, SiteMap, Run PyMOL, Jaguar pKa, Glide grid writer
General tools
• Split and align multimers
• Python script, Chemistry external tool, Run Maestro command node use cases…
• Output column structure option philosophy
KNIME desktop
• Workflows in the current workspace
• GroupBy, loop examples …
Simplest, most exciting, new and improved◄ workflows
KNIME workflow page – http://www.schrodinger.com/knimeworkflows/
Cheminformatics
• Substructure Search
• Clustering, diversity selection, similarity search
• Database analysis, MCS
Docking and post-processing
• Protein preparation and Glide grid generation
• Virtual screening, Ensemble docking
• Loop over docking parameters
• Validate docking parameters using the Maestro connector node
Pharmacophore modeling
• Phase Shape screening, hypothesis identification and database screening
Molecular Mechanics
• Compare conformational search methods
• Conformational search and post-processing
Molecular Dynamics: Desmond simulation
Quantum mechanics
• ESP charges
• Conformational search and QM optimization (using the Report designer)
Workshop: QM – ESP charges and surface
Binding shape site clustering and ensemble docking
Osguthorpe, D. J.; Sherman, W.; Hagler, A. T. Chem Biol Drug Des 2012; 80: 182–193
Environment configurable in the preferences
• For a stand-alone installation
• Other ways ($SCHRODINGER/run, start-up script, hunt based)
• Set-up diagnosis node for the KNIME Server
– Environment, scratch directories
– License, Backend installed
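The kind of checks listed above can be sketched in a few lines of Python. This is only an illustration, not the Set-up diagnosis node itself: the function name diagnose_setup and the check labels are hypothetical, and license and backend checks are left out because the slides do not say how they are done.

```python
import os
import tempfile


def diagnose_setup(env=None, scratch_dir=None):
    """Return simple pass/fail checks for the environment and scratch directory.

    Hypothetical helper: it only looks at the SCHRODINGER variable and at
    one scratch directory (the system temp directory by default).
    """
    env = os.environ if env is None else env
    scratch_dir = scratch_dir or tempfile.gettempdir()
    install_dir = env.get("SCHRODINGER", "")
    return {
        "SCHRODINGER is set": bool(install_dir),
        "installation directory exists": bool(install_dir)
        and os.path.isdir(install_dir),
        "scratch directory is writable": os.path.isdir(scratch_dir)
        and os.access(scratch_dir, os.W_OK),
    }


if __name__ == "__main__":
    for check, ok in diagnose_setup().items():
        print(("OK  " if ok else "FAIL"), check)
```

Passing env and scratch_dir explicitly makes it possible to test a server configuration without changing the current shell.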
Batch command generation
Node annotations: “Structure file”, “Text file”, “GUIsetting”
• The command line to run a workflow in batch is generated from the node annotations
These are similar to the KNIME in Maestro annotations, e.g. “Structure file” for Molecule reader and writer nodes
eg $SCHRODINGER/run -FROM maestro KNIME_batch.py QM.zip –printcmd
$SCHRODINGER/knime -batch -nosave -maxThreads 1 -nosplash -workflowFile=QM.zip
-option=8380,DataURL/0,/tmp/Aniline.smi,String
-option=8386,output_file_name,/tmp/QMprotocolOutput.mae,String
-option=8399,value/value,localhost:2,String
• Workflows easily run in Seurat
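The -option arguments in the command above follow the pattern node ID, parameter name, value, type. A minimal sketch of assembling such a command line in Python follows; build_batch_command is a hypothetical helper, not part of the Schrödinger or KNIME distributions, and the values are copied from the example above.

```python
def build_batch_command(workflow_zip, options, max_threads=1):
    """Assemble a KNIME batch command line as a list of arguments.

    `options` holds one (node_id, parameter, value, type) tuple per
    -option argument. "$SCHRODINGER" is kept literal so that the shell
    expands it when the command is run.
    """
    args = [
        "$SCHRODINGER/knime", "-batch", "-nosave",
        "-maxThreads", str(max_threads),
        "-nosplash", f"-workflowFile={workflow_zip}",
    ]
    for node_id, parameter, value, value_type in options:
        args.append(f"-option={node_id},{parameter},{value},{value_type}")
    return args


command = build_batch_command("QM.zip", [
    (8380, "DataURL/0", "/tmp/Aniline.smi", "String"),
    (8386, "output_file_name", "/tmp/QMprotocolOutput.mae", "String"),
    (8399, "value/value", "localhost:2", "String"),
])
print(" ".join(command))
```

Keeping the options as tuples makes it easy to swap the input file or host per run while the workflow archive stays unchanged.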
Parameter flow variables
• Any backend command line option not exposed in the node configuration panel
• Option types: set a value, activate a flag, take a value or structure from an input column, add an extra output column
• Metanode GUI with the Quick form nodes
e.g. SiteMap-sitebox = 3, SiteMap-ligmae = :CT lig:
Sitemap -HOST localhost:4 -j sitemap_-732349751_1 -maxsites 5 -modphobic 3 -keeplogs no
-sitebox 3 -ligmae sitemap_-732349751_in_1_CT_lig.mae -prot sitemap_-732349751_in_1.mae
• Glide ligand docking, Prime nodes, Protein preparation wizard, SiteMap, some
Jaguar, MacroModel and Canvas nodes. Simple to activate for other nodes when
needed.
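The SiteMap example above suggests a naming convention: a flow variable called node name, hyphen, option becomes a backend option, and a value written between colons refers to an input column. A rough sketch of that mapping, under those assumptions only (flow_variables_to_options is a hypothetical function, not the nodes' actual code, and "OtherNode-flag" is an invented variable used to show filtering):

```python
def flow_variables_to_options(node_name, flow_variables):
    """Translate '<node>-<option>' flow variables into backend options.

    Values written as ':column name:' are returned separately as column
    references, because they have to be resolved for each input row.
    """
    prefix = node_name + "-"
    options, column_refs = [], {}
    for name, value in flow_variables.items():
        if not name.startswith(prefix):
            continue  # variable is meant for another node
        option = "-" + name[len(prefix):]
        text = str(value)
        if len(text) > 2 and text.startswith(":") and text.endswith(":"):
            column_refs[option] = text[1:-1]
        else:
            options += [option, text]
    return options, column_refs


options, column_refs = flow_variables_to_options(
    "SiteMap",
    {"SiteMap-sitebox": 3, "SiteMap-ligmae": ":CT lig:", "OtherNode-flag": "x"},
)
print(options)      # ['-sitebox', '3']
print(column_refs)  # {'-ligmae': 'CT lig'}
```

In the command line shown above, the column reference ends up as a generated file name after -ligmae; that per-row step is left out of the sketch.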
Workflow examples and Workflow list node
• Workflow examples available as a workspace under $SCHRODINGER/knime-v.../tutorial/
• Workflow list node (free, in a separate plugin feature)
Lists the nodes and workflows in the workspace
– Latest modified workflows
– Workflows containing specific nodes (e.g. as an example to create a new one)
– Compare several versions of a workflow (date, complexity)
– Find a workflow buried in groups
– Workspace clean up (size on disk)
SiteMap and Run PyMOL nodes
• SiteMap
– Identify potential binding sites.
– Evaluate a single binding site region (using the parameter flow variables)
• Run PyMOL (free, in a separate plugin feature)
Standard input/output and Glide ligand docking node
• More nodes read and write PDB and SDF directly, so no converters are needed:
Glide ligand docking, Run Maestro command, Assign bond order, Split by Structure...
• Generated properties are extracted automatically:
Prime MM-GBSA and Glide ligand docking
• Glide ligand docking with 1 output
New Chemistry external tool node
(Free, in a separate plugin feature)
• Optional input/output ports, output column structure options, column name.
• Reads maegz files, input/output pdb, output Surface type
• Flow variables, accessible by name
• Basename keyword, add extra columns to the output
Why give the Schrödinger extensions a try in 2013?
• Stand-alone installation configuration - in the Preferences
• No more missing options in the nodes – parameter flow variables
• Easier metanode creation – new Chemistry External Tool node (+quick form nodes)
• Workflow list (free)
• Batch command generation – based on node annotations
• Fewer conversions between nodes from several providers - input/output pdb and sdf
Release in May 2013
Jean-Christophe Mozziconacci, Schrödinger KNIME extensions Product Manager
Volker Eyrich, Vice President of Technology
Ravikiran Kuppuraj, Main developer
Schrödinger Developers
Workshop
• ESP charges
– Jaguar ESP
– Semi-empirical optimization
• Python script node, new Chemistry external tool node
– ESP surfaces – Parameter flow variables
– (conformational search and QM refinement)
• SiteMap and clustering
– SiteMap and clustering
– Docking and result analysis
– Molecular Dynamics
• Workflows in the workspace
– Workflows currently available
– New workflow list node
• Other workflow page examples...
ftp://ftp.schrodinger.com/support/hidden/jcmozzic/QM_EXP.zip, QM_ESP_2013.zip,
SiteMap_and_clustering.zip, SiteMap_and_clustering_2013.zip
(new SiteMap node; can be opened with KNIME 2.7 + Suite 2012)
2013 new features
Based on KNIME 2.7
Infrastructure improvements
• Environment for a stand-alone installation configurable in the preferences.
Set-up diagnosis node.
• Generation of the command line to be run in batch, based on node annotations,
with a batch execution setting panel, so workflows can easily be run in Seurat
• Parameter flow variables to use any backend command line option not exposed in
the node configuration panel
• Workflow examples available in the installation
New nodes
• SiteMap
• Run PyMOL
• Workflow list – lists the nodes and workflows in the workspace
New functionalities
• KNIME in Maestro – input structures from files
• More nodes read and write PDB and SDF,
e.g. Glide ligand docking, Assign bond order, Split by Structure
• Prime MM-GBSA and Glide ligand docking – generated properties are extracted automatically
• Parameter flow variable – Glide ligand docking, Prime nodes, Protein preparation
wizard, SiteMap, some Jaguar, MacroModel and Canvas nodes.
• Glide ligand docking – with 1 output, sdf output type and other new functionalities
• Chemistry external tool – with optional input/output ports, access to the flow
variables by name, basename and other new keywords.
• And many other fixes and minor improvements
What are the Schrödinger extensions?
• Modelling and computational chemistry in a workflow environment
• 150+ nodes covering the whole Schrödinger Suite
• Run on Linux and Windows, 32 and 64 bit versions
Molecular Mechanics
- MacroModel Single Point Energy, Minimization, Coordinate Scan
- ConfGen
- Conformational Search
- Premin, Impref, Uffmin
Quantum Mechanics
- Jaguar Single Point Energy, Minimization
- NMR Shielding Constants
- Jaguar Charges
Molecular Dynamics
- Desmond System builder
- Desmond Molecular Dynamics
- Trajectory extract frames and manipulation
- Trajectory reader, CMS reader
Cheminformatics
Fingerprint Based Tools
- Fingerprint Generation
- Generate Pairwise Matrix, and 2 Inputs
- Similarity Matrix, Dissimilarity Selection
- Build Report and Hierarchical Clustering
Filters and Mining Tools
- MCS
- Substructure Search
- Structure, REOS Filters
Utilities and Converters
- PCA, MDS
- Combine Fingerprints, Concatenate Bitvectors, Convert Fingerprint to Bitvector, etc.
Modeling
- Bayes Model Building, Prediction
- PLS Model Building, Prediction
Pharmacophore Modeling
- Phase Shape
- Phase DB Query, File Query
- Phase DB Creation
- Phase Hypothesis Identification
Combinatorial Libraries
- CombiGlide Reagent Preparation and Library Enumeration
- Fragments from Molecules and joiner
Docking and Scoring
- Glide Grid Generation
- Glide Ligand Docking, Ensemble Docking
- XP Visualizer
Post-processing
- Prime MM-GBSA
- Embrace Minimization
- Strain Rescore, Pose Entropy
- Pose Filter, Glide Merge, Sort Results
- Glide Ensemble Merge
Protein Structure Prediction
- BLAST
- Prime Build Homology Model
- Prime Side Chain Sampling, Minimization
- IFD and individual steps
Schrödinger nodes
• Generate, manipulate, analyze and visualize chemical data and structures
• Interactive and automated analysis. Presentation and communication of results
Protein Preparation
- Protein Preparation Wizard
- Protein Assignment
Ligand Preparation
- LigPrep and the individual steps
- Epik
Property Generation
- QikProp, Molecular Descriptors
- Canvas descriptors
Filtering
- Ligfilter, Ligparse, Propfilter
Scripting
- Run Maestro Command
- Chemistry External nodes
- Python Script nodes
Reporting
- Run Maestro, Run Canvas
- Spreadsheet viewer
- Table viewer
Tools
Data Manipulation
- Compare Ligands
- Lookup and Add Columns
- Group and Ungroup MAE
Structure Manipulation
- Add Hydrogens
- Delete Atoms
- Split by Structure
- MAE atom/bond property Parser
- Extract, Set and Delete MAE Properties
- Set Molecule Title, MAE Index
Utilities
- Get PDB
- Align Binding Sites
- Protein Structure Alignment
- RMSD
- Assign Bond Orders
- Unique Title Check
- Check PDB Name
- SD Format Checker
- Generate Smarts, Unique Smiles
- RRHO Entropy
- Boltzmann Population
- Volume Overlap Matrix
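As an illustration of what a Boltzmann population calculation involves, here is a minimal sketch of the standard formula for conformer energies in kcal/mol. It is not the node's implementation: the function name is hypothetical, and degeneracies and entropy corrections (such as the RRHO term listed above) are ignored.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K)


def boltzmann_populations(energies_kcal, temperature=298.15):
    """Return the fractional population of each conformer.

    Energies are shifted by the lowest value before exponentiation, which
    leaves the result unchanged and avoids overflow for large energies.
    """
    lowest = min(energies_kcal)
    weights = [
        math.exp(-(energy - lowest) / (GAS_CONSTANT_KCAL * temperature))
        for energy in energies_kcal
    ]
    total = sum(weights)
    return [weight / total for weight in weights]


if __name__ == "__main__":
    # Three conformers at 0, 1 and 2 kcal/mol above the minimum
    print(boltzmann_populations([0.0, 1.0, 2.0]))
```

Only energy differences matter, so absolute energies from any consistent method can be passed in directly.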
Readers/Writers
- CSV Reader (several inputs)
- Molecule Reader and Writer
- SD, PDB, Mol2 Reader nodes
- Sequence, Alignment Readers and Writers
- Fingerprint Reader and Writer
- Hypothesis Reader and Writer
- Glide Grid and Multiple Grid Reader
- Variable Based Glide Grid Reader
Converters
- String-to-Type
- Molecule-to-MAE, MAE-to-Pdb, to-SD, to-Smiles and to-Mol2, SD-to-Smiles
- PoseViewer-to-Complexes and Complexes-to-PoseViewers
- Hartree-to-kcal/mol and kJ-to-kcal
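The two energy converters correspond to simple multiplications. A short sketch with the usual conversion factors (function names are hypothetical; the nodes themselves may use more digits):

```python
HARTREE_TO_KCAL_PER_MOL = 627.509474  # 1 Hartree in kcal/mol
KJ_PER_KCAL = 4.184                   # thermochemical calorie


def hartree_to_kcal_per_mol(energy_hartree):
    """Convert an energy from Hartree to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ to kcal (or kJ/mol to kcal/mol)."""
    return energy_kj / KJ_PER_KCAL


if __name__ == "__main__":
    # A 0.001 Hartree difference between two QM energies, in kcal/mol
    print(hartree_to_kcal_per_mol(0.001))
```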