The Schrödinger KNIME extensions

Jean-Christophe Mozziconacci, Volker Eyrich

KNIME UGM, Zurich, March 2013

Computational Chemistry and Cheminformatics in a workflow environment

What are the Schrödinger extensions?

• Computational chemistry and drug design

• 150+ nodes

Linux, Mac, Windows 32 and 64 bit

Molecular Mechanics – MacroModel

Molecular Dynamics – Desmond

Quantum Mechanics – Jaguar

Cheminformatics – Canvas

Pharmacophore modeling – Phase

Combinatorial Libraries – CombiGlide

Docking – Glide

Protein Structure Prediction – Prime, IFD

Protein preparation

Ligand preparation – LigPrep, Epik

Property generation – QikProp ...

Filtering

Tools for data and structure manipulation

Scripting – shell, Python

Reporting

Library design: Library enumeration

Homology modeling

• Model building and refinement ◄

• Induced Fit Docking ◄

Real World Examples

• Vendor database preparation

• SiteMap and Glide grid generation

• SiteMap and clustering

Metanodes: Run Maestro 1:1, SiteMap, Run PyMOL, Jaguar pKa, Glide grid writer

General tools

• Split and align multimers

• Python script, Chemistry external tool, Run Maestro command node use-cases…

• Output column structure option philosophy

KNIME desktop

• Workflows in the current workspace

• GroupBy, loop examples …

Simplest, most exciting, new and improved◄ workflows

KNIME workflow page – http://www.schrodinger.com/knimeworkflows/

Cheminformatics

• Substructure Search

• Clustering, diversity selection, similarity search

• Database analysis, MCS

Docking and post-processing

• Protein preparation and Glide grid generation

• Virtual screening, Ensemble docking

• Loop over docking parameters

• Validate docking parameters

using the Maestro connector node

Pharmacophore modeling

• Phase Shape screening, hypothesis identification and database screening

Molecular Mechanics

• Compare conformational search methods

• Conformational search and post-processing

Molecular Dynamics: Desmond simulation

Quantum mechanics

• ESP charges

• Conformational search and QM optimization

(using the Report designer)

Workshop: QM – ESP charges and surface

Binding site shape clustering and ensemble docking: Osguthorpe, D. J.; Sherman, W.; Hagler, A. T. Chem Biol Drug Des 2012; 80: 182–193

Environment configurable in the preferences

• For stand-alone installations

• Other ways ($SCHRODINGER/run, start-up script, hunt based)

• Set-up diagnosis node for the KNIME Server

– Environment, scratch directories

– License, Backend installed
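The kind of checks such a diagnosis might perform can be sketched in a few lines of Python. This is only an illustration, not the node's implementation: the function name `diagnose_setup` and the selection of checks are assumptions, and the license and backend checks are left out because they depend on the installation.

```python
import os
import tempfile


def diagnose_setup(env=None, scratch_dir=None):
    """Return simple pass/fail checks for a Schrödinger set-up (hypothetical helper)."""
    env = os.environ if env is None else env
    scratch_dir = scratch_dir or tempfile.gettempdir()
    schrodinger = env.get("SCHRODINGER", "")
    return {
        # Environment: the installation directory must be known and present.
        "SCHRODINGER is set": bool(schrodinger),
        "SCHRODINGER directory exists": bool(schrodinger) and os.path.isdir(schrodinger),
        # Scratch directory: jobs need a place to write temporary files.
        "scratch directory is writable": os.path.isdir(scratch_dir)
        and os.access(scratch_dir, os.W_OK),
    }


if __name__ == "__main__":
    for check, passed in diagnose_setup().items():
        print(f"{'OK  ' if passed else 'FAIL'} {check}")
```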

Batch command generation

Node annotation keywords: “Structure file”, “Text file”, “GUIsetting”

• The command line to be run in batch is generated from the node annotations

The annotations are similar to the KNIME in Maestro ones, e.g. “Structure file” for Molecule reader and writer nodes

e.g. $SCHRODINGER/run -FROM maestro KNIME_batch.py QM.zip -printcmd

$SCHRODINGER/knime -batch -nosave -maxThreads 1 -nosplash -workflowFile=QM.zip

    -option=8380,DataURL/0,/tmp/Aniline.smi,String

    -option=8386,output_file_name,/tmp/QMprotocolOutput.mae,String

    -option=8399,value/value,localhost:2,String

• Workflows easily run in Seurat
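The structure of the batch command above can be made explicit with a small Python sketch. It only reassembles the flags and `-option=<node id>,<setting>,<value>,<type>` entries shown on the slide; the helper name `build_batch_command` is an assumption, and this is not how KNIME_batch.py itself is implemented.

```python
def build_batch_command(workflow_file, options, max_threads=1,
                        launcher="$SCHRODINGER/knime"):
    """Assemble a KNIME batch command line as a list of arguments.

    `options` holds (node_id, setting_name, value, value_type) tuples,
    one per annotated node setting, as in the slide example.
    """
    args = [
        launcher, "-batch", "-nosave",
        "-maxThreads", str(max_threads),
        "-nosplash", f"-workflowFile={workflow_file}",
    ]
    for node_id, name, value, value_type in options:
        args.append(f"-option={node_id},{name},{value},{value_type}")
    return args


# Values copied from the QM.zip example above.
options = [
    (8380, "DataURL/0", "/tmp/Aniline.smi", "String"),
    (8386, "output_file_name", "/tmp/QMprotocolOutput.mae", "String"),
    (8399, "value/value", "localhost:2", "String"),
]
command = build_batch_command("QM.zip", options)
print(" ".join(command))
```

Keeping the arguments as a list makes it easy to swap the input file or host per run before joining them into a shell command.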

KNIME in Seurat

Parameter flow variables

• Any backend command line option not exposed in the node configuration panel

• Value, activate, value/structure from an input column, extra output column

• Metanode GUI with the Quick form nodes

e.g. SiteMap-sitebox = 3 and SiteMap-ligmae = :CT lig:

Sitemap -HOST localhost:4 -j sitemap_-732349751_1 -maxsites 5 -modphobic 3 -keeplogs no -sitebox 3 -ligmae sitemap_-732349751_in_1_CT_lig.mae -prot sitemap_-732349751_in_1.mae

• Available for Glide ligand docking, Prime nodes, Protein preparation wizard, SiteMap, and some Jaguar, MacroModel and Canvas nodes. Simple to activate for other nodes when needed.
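The mapping from parameter flow variables to extra backend options can be illustrated with a short Python sketch. It assumes the naming convention suggested by the SiteMap example (`<node name>-<option>` becomes `-<option> <value>`); the function name is hypothetical, and values that refer to an input column structure (such as `:CT lig:`, which the node turns into a file name) are not modelled here.

```python
def flow_variables_to_args(flow_variables, node_prefix):
    """Turn '<node_prefix>-<option>' flow variables into command-line arguments."""
    marker = node_prefix + "-"
    args = []
    for name, value in flow_variables.items():
        if name.startswith(marker):
            # 'SiteMap-sitebox' = 3  ->  '-sitebox', '3'
            args.extend(["-" + name[len(marker):], str(value)])
    return args


flow_variables = {"SiteMap-sitebox": 3, "knime.workspace": "/tmp/workspace"}
extra_args = flow_variables_to_args(flow_variables, "SiteMap")
print(extra_args)
```

Variables without the node prefix are ignored, so unrelated flow variables in the workflow do not leak onto the backend command line.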

Workflow examples and Workflow list node

• Workflow examples available as a workspace under $SCHRODINGER/knime-v.../tutorial/

• Workflow list node (free, in a separate plugin feature)

Lists the nodes and workflows in the workspace

– Latest modified workflows

– Workflows containing specific nodes (e.g. as an example for creating a new one)

– Compare several versions of a workflow (date, complexity)

– Find a workflow buried in groups

– Workspace clean up (size on disk)
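A rough idea of what listing workflows involves can be given in plain Python. The sketch relies on the fact that a KNIME workflow is stored as a folder containing a `workflow.knime` file; everything else (function name, returned fields) is an assumption and not the Workflow list node's implementation.

```python
import os
from pathlib import Path


def list_workflows(workspace):
    """List workflow folders in a workspace, most recently modified first."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(workspace):
        if "workflow.knime" in filenames:
            folder = Path(dirpath)
            files = [p for p in folder.rglob("*") if p.is_file()]
            rows.append({
                "workflow": str(folder.relative_to(workspace)),
                "modified": (folder / "workflow.knime").stat().st_mtime,
                "size_bytes": sum(p.stat().st_size for p in files),
            })
            # Do not descend further: sub-folders belong to this workflow
            # (node settings, metanodes), not to separate workflows.
            dirnames[:] = []
    return sorted(rows, key=lambda row: row["modified"], reverse=True)


if __name__ == "__main__":
    for row in list_workflows(Path.cwd()):
        print(row["workflow"], row["size_bytes"])
```

Sorting by modification time covers the "latest modified workflows" use, and the size column supports workspace clean-up.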

SiteMap and Run PyMOL nodes

• SiteMap

– Identify potential binding sites.

– Evaluate a single binding site region (using the parameter flow variables)

• Run PyMOL (free, in a separate plugin feature)

Standard input/output and Glide ligand docking node

• More nodes read and write PDB and SDF directly, so no converters are needed,

e.g. Glide ligand docking, Run Maestro command, Assign bond order, Split by Structure...

• Generated properties are extracted automatically:

Prime MM-GBSA and Glide ligand docking

• Glide ligand docking with 1 output

New Chemistry external tool node

(Free, in a separate plugin feature)

• Optional input/output ports, output column structure options, column name.

• Reads maegz files, input/output pdb, output Surface type

• Flow variables, accessible by name

• Basename keyword, add extra columns to the output

Why give the Schrödinger extensions a try in 2013?

• Stand-alone installation configuration - in the Preferences

• No more missing options in the nodes – parameter flow variables

• Easier metanode creation – new Chemistry External Tool node (+quick form nodes)

• Workflow list (free)

• Batch command generation – based on node annotations

• Fewer conversions between nodes from several providers - input/output pdb and sdf

Release in May 2013

Jean-Christophe Mozziconacci, Schrödinger KNIME extensions Product Manager

Volker Eyrich, Vice President of Technology

Ravikiran Kuppuraj, Main developer

Schrödinger Developers

The Schrödinger KNIME extensions

2013 KNIME UGM workshop

Workshop

• ESP charges

– Jaguar ESP

– Semi empirical optimization

• Python script node, new Chemistry external tool node

– ESP surfaces – Parameter flow variables

– (conformational search and QM refinement)

• SiteMap and clustering

– SiteMap and clustering

– Docking and result analysis

– Molecular Dynamics

• Workflows in the workspace

– Workflows currently available

– New workflow list node

• Other workflow page examples...

ftp://ftp.schrodinger.com/support/hidden/jcmozzic/QM_EXP.zip, QM_ESP_2013.zip,

SiteMap_and_clustering.zip, SiteMap_and_clustering_2013.zip

(new SiteMap node; can be opened with KNIME 2.7 + Suite 2012)

2013 new features

Based on KNIME 2.7

Infrastructure improvements

• Environment for a stand-alone installation configurable in the preferences.

Set-up diagnosis node.

• Generation of the command line to be run in batch, based on node annotations, plus a batch execution settings panel, so workflows can easily be run in Seurat

• Parameter flow variables to use any backend command line option not exposed in the node configuration panel

• Workflow examples available in the installation

New nodes

• SiteMap

• Run PyMOL

• Workflow list – lists the nodes and workflows in the workspace

New functionalities

• KNIME in Maestro – input structures from files

• More nodes read and write PDB and SDF, e.g. Glide ligand docking, Assign bond order, Split by Structure

• Prime MM-GBSA and Glide ligand docking – automatically extract the generated properties

• Parameter flow variables – Glide ligand docking, Prime nodes, Protein preparation wizard, SiteMap, some Jaguar, MacroModel and Canvas nodes.

• Glide ligand docking – with 1 output, sdf output type and other new functionalities

• Chemistry external tool – with optional input/output ports, access to the flow variables by name, basename and other new keywords.

• And many other fixes and minor improvements

What are the Schrödinger extensions?

• Modelling and computational chemistry in a workflow environment

• 150+ nodes covering the whole Schrödinger Suite

• Run on Linux and Windows, 32 and 64 bit versions

Molecular Mechanics

- MacroModel Single Point Energy, Minimization, Coordinate Scan

- ConfGen

- Conformational Search

- Premin, Impref, Uffmin

Quantum Mechanics

- Jaguar Single Point Energy, Minimization

- NMR Shielding Constants

- Jaguar Charges

Molecular Dynamics

- Desmond System builder

- Desmond Molecular Dynamics

- Trajectory extract frames and manipulation

- Trajectory reader, CMS reader

Cheminformatics

Fingerprint Based Tools

- Fingerprint Generation

- Generate Pairwise Matrix, and 2 Inputs

- Similarity Matrix, Dissimilarity Selection

- Build Report and Hierarchical Clustering

Filters and Mining Tools

- MCS

- Substructure Search

- Structure, REOS Filters

Utilities and Converters

- PCA, MDS

- Combine Fingerprints, Concatenate Bitvectors, Convert Fingerprint to Bitvector, etc

Modeling

- Bayes Model Building, Prediction

- PLS Model Building, Prediction

Pharmacophore Modeling

- Phase Shape

- Phase DB Query, File Query

- Phase DB Creation

- Phase Hypothesis Identification

Combinatorial Libraries

- CombiGlide Reagent Preparation and Library Enumeration

- Fragments from Molecules and joiner

Docking and Scoring

- Glide Grid Generation

- Glide Ligand Docking, Ensemble Docking

- XP Visualizer

Post-processing

- Prime MM-GBSA

- Embrace Minimization

- Strain Rescore, Pose Entropy

- Pose Filter, Glide Merge, Sort Results

- Glide Ensemble Merge

Protein Structure Prediction

- BLAST

- Prime Build Homology Model

- Prime Side Chain Sampling, Minimization

- IFD and individual steps

Schrödinger nodes

• Generate, manipulate, analyze and visualize chemical data and structures

• Interactive and automated analysis. Presentation and communication of results

Protein Preparation

- Protein Preparation Wizard

- Protein Assignment

Ligand Preparation

- LigPrep and the individual steps

- Epik

Property Generation

- QikProp, Molecular Descriptors

- Canvas descriptors

Filtering

- Ligfilter, Ligparse, Propfilter

Scripting

- Run Maestro Command

- Chemistry External nodes

- Python Script nodes

Reporting

- Run Maestro, Run Canvas

- Spreadsheet viewer

- Table viewer

Tools

Data Manipulation

- Compare Ligands

- Lookup and Add Columns

- Group and Ungroup MAE

Structure Manipulation

- Add Hydrogens

- Delete Atoms

- Split by Structure

- MAE atom/bond property Parser

- Extract, Set and Delete MAE Properties

- Set Molecule Title, MAE Index

Utilities

- Get PDB

- Align Binding Sites

- Protein Structure Alignment

- RMSD

- Assign Bond Orders

- Unique Title Check

- Check PDB Name

- SD Format Checker

- Generate Smarts, Unique Smiles

- RRHO Entropy

- Boltzmann Population

- Volume Overlap Matrix
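As an illustration of the calculation behind a Boltzmann population utility, conformer populations can be derived from relative energies with the standard Boltzmann weighting. This is a generic sketch, not the node's code; the function name, the kcal/mol unit and the default temperature are assumptions.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # molar gas constant R in kcal/(mol*K)


def boltzmann_populations(energies_kcal_per_mol, temperature_k=298.15):
    """Return the fractional population of each conformer from its energy."""
    rt = GAS_CONSTANT_KCAL * temperature_k
    lowest = min(energies_kcal_per_mol)
    # Energies are shifted by the minimum so the exponentials stay well scaled.
    weights = [math.exp(-(e - lowest) / rt) for e in energies_kcal_per_mol]
    total = sum(weights)
    return [w / total for w in weights]


populations = boltzmann_populations([0.0, 0.5, 2.0])
print([round(p, 3) for p in populations])
```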

Readers/Writers

- CSV Reader (several inputs)

- Molecule Reader and Writer

- SD, PDB, Mol2 Reader nodes

- Sequence, Alignment Readers and Writers

- Fingerprint Reader and Writer

- Hypothesis Reader and Writer

- Glide Grid and Multiple Grid Reader

- Variable Based Glide Grid Reader

Converters

- String-to-Type

- Molecule-to-MAE, MAE-to-Pdb, to-SD, to-Smiles and to-Mol2, SD-to-Smiles

- PoseViewer-to-Complexes and Complexes-to-PoseViewers

- Hartree-to-kcal/mol and kJ-to-kcal
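The two energy converters correspond to simple unit conversions, sketched below with the usual factors (1 Hartree ≈ 627.5095 kcal/mol, 1 kcal = 4.184 kJ). The function names are illustrative and not taken from the nodes.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # 1 Hartree expressed in kcal/mol
KJ_PER_KCAL = 4.184                 # thermochemical calorie


def hartree_to_kcal_per_mol(energy_hartree):
    """Convert an energy from Hartree to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def kj_to_kcal(energy_kj):
    """Convert an energy from kJ (or kJ/mol) to kcal (or kcal/mol)."""
    return energy_kj / KJ_PER_KCAL


# Example: a QM energy difference of 0.001 Hartree between two conformers.
print(hartree_to_kcal_per_mol(0.001))
```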