Parameter Sweep Workflows for Modelling Carbohydrate Recognition

ProSim Project

Tamas Kiss, Gabor Terstyanszky, Noam Weingarten, Pamela Greenwell, Hans Heindl

AHM’09, Oxford, UK, 07-09 December 2009

https://engage.cpc.wmin.ac.uk

The research interest

• The motivation:
  • Understanding how sugars interact with their protein partners may lead to development of new treatment methods for many diseases.
• The obstacle:
  • Investigation of the binding of proteins to sugars in “wet laboratory” (in vitro) experiments is expensive and time consuming
    • Expensive substrates
    • Sophisticated machinery
• The solution:
  • Use “in silico” tools (computer simulation) to select best binding candidates
  • In vitro work only on selected candidates

The research task

[Figure labels: binding pocket, sugar (ligand), protein (receptor)]

The research interest

• Advantages of in silico methods:
  • Better focusing wet laboratory resources:
    • Better planning of experiments by selecting best molecules to investigate in vitro
    • Reduced time and cost
    • Increased number of molecules screened
• Problems of in silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for bio-scientists
    • Unix command line interfaces + software packages (Amber, GROMACS)
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?

What can we gain via the simulation?

1. Validation and refinement of in-silico modelling tools
2. Filter potential scenarios for wet lab experiments

The researcher’s interest

• What does the researcher want?
  • Run the simulations faster
    • Use compute resources – National Grid Service (NGS)
  • Run the simulations
    • Using seamless access to compute resources → web based interface
    • Combining many simulation, analysis and visualisation tools → workflows
    • Running multiple docking experiments to investigate different protein and sugar combinations → parameter study
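
The parameter study mentioned above amounts to enumerating every protein and sugar pairing and creating one docking job per pair. A minimal sketch in Python of that enumeration, assuming hypothetical PDB file names and a hypothetical job record layout (neither comes from the ProSim project):

```python
from itertools import product


def sweep_jobs(receptors, ligands):
    """Return one job description per receptor/ligand combination."""
    return [
        {"job_id": index, "receptor": receptor, "ligand": ligand}
        for index, (receptor, ligand) in enumerate(product(receptors, ligands))
    ]


# Hypothetical input files, for illustration only.
receptors = ["receptor_a.pdb", "receptor_b.pdb"]
ligands = ["sugar_1.pdb", "sugar_2.pdb", "sugar_3.pdb"]

jobs = sweep_jobs(receptors, ligands)
for job in jobs:
    print(job["job_id"], job["receptor"], job["ligand"])
```

The number of jobs grows as the product of the two input lists, which is why such sweeps are a natural fit for distributed compute resources.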

Westminster Grid Application Support Service (W-GRASS)

Applications Ported by W-GRASS

Bio- and Life Science
- Molecular Dynamics Simulation using CHARMm
- Patient Readmission Analysis with R
- GAMESS-UK - ab initio molecular electronic structure program
- MultiBayes - program for analysing DNA sequences of genes
- ProSim - Modelling Protein Carbohydrate Recognition in-silico – application
- In silico Modelling Using AutoDock

Engineering
- DASP - Digital Alias-free Signal Processing
- Extraction of X-RAY Diffraction Profiles
- Cellular Automata-Based Laser Dynamics

Multi-media
- Rendering portal - Grid-based on-line rendering service

Physics
- VisIVO – Visualisation Interface to the Virtual Observatory

ProSim – Protein Molecule Simulation on the Grid

• Funded by the JISC ENGAGE program
  • Engaging Research with e-Infrastructure
  • Promote the greater engagement of academic researchers in the UK with the UK's e-Infrastructure
• ProSim objectives:
  – Define user requirements and user scenarios of protein molecule simulation
  – Identify, test and select software packages for protein molecule simulation
  – Automate the protein molecule simulation creating workflows and parameter study support
  – Develop application specific graphical user interfaces
  – Run protein molecule simulation on the UK National Grid Service and make it available for the bioscience research community

The User Scenario

[Workflow diagram. Inputs: PDB file 1 (Receptor), PDB file 2 (Ligand). Steps: Energy Minimization (Gromacs), Validate (Molprobity), Check (Molprobity), Perform docking (AutoDock), Molecular Dynamics (Gromacs). The steps are grouped into Phase 1 to Phase 4.]
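
The four phases can be read as a small dependency graph. The sketch below is one way to express it in Python, assuming that receptor preparation (phase 1) and ligand preparation (phase 2) are independent of each other and that both must finish before docking (phase 3), which in turn precedes refinement (phase 4). The phase names are illustrative labels, not identifiers from the ProSim workflow, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each phase to the phases it depends on (assumed dependencies).
PHASE_DEPENDENCIES = {
    "phase1_receptor_preparation": set(),
    "phase2_ligand_preparation": set(),
    "phase3_docking": {"phase1_receptor_preparation", "phase2_ligand_preparation"},
    "phase4_refinement": {"phase3_docking"},
}


def execution_batches(dependencies):
    """Group phases into batches; phases in one batch could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(PHASE_DEPENDENCIES)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

Under these assumptions the two preparation phases land in the same batch, so they could be submitted at the same time.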

The User Scenario in detail

Phase 1 – selection and preparation of receptor
• Receptor source: public repository, local database or user provided
• Preparation and standardisation
• Solvation and charge neutralization
• Energy minimisation
• Validation

Phase 2 – selection and preparation of ligand
• Ligand source: built using SMILES, public repository, local database or user provided
• Solvation
• Energy minimisation

The User Scenario

Phase 3 – docking ligand to receptor
• Prepare docking: docking parameters and grid-space – AutoGrid
• Docking and selection of best results according to total energy – AutoDock
• 10 AutoDock executions, 100 genetic algorithm runs each

Phase 4 – refining the ligand-receptor molecule (performed on 10 best results of the AutoDock simulation)
• Solvation of the ligand-receptor structure
• Energy minimisation – GROMACS
• Molecular dynamics – GROMACS MPI version
• Molecule trajectory data analysis
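
With 10 AutoDock executions of 100 genetic algorithm runs each, phase 3 yields 10 × 100 = 1000 candidate results, of which 10 go on to phase 4. A minimal sketch of that selection step in Python, under two assumptions not stated on the slide: the 10 best are taken from the pooled results of all executions, and "best" means lowest total energy. The energies are random placeholders; real values would be parsed from AutoDock output.

```python
import heapq
import random

EXECUTIONS = 10  # AutoDock executions (from the slide)
GA_RUNS = 100    # genetic algorithm runs per execution (from the slide)
BEST_N = 10      # results passed on to phase 4 (from the slide)


def placeholder_docking_results(seed=0):
    """Generate stand-in (execution, run, energy) records so the sketch runs alone."""
    rng = random.Random(seed)
    return [
        {"execution": execution, "run": run, "energy": rng.uniform(-12.0, 0.0)}
        for execution in range(EXECUTIONS)
        for run in range(GA_RUNS)
    ]


def select_best(results, n=BEST_N):
    """Return the n results with the lowest energy, in ascending order."""
    return heapq.nsmallest(n, results, key=lambda record: record["energy"])


results = placeholder_docking_results()
best = select_best(results)
print(len(results), "docking results,", len(best), "selected for phase 4")
```

Each selected result would then seed one refinement job, which is where the phase 4 parameter sweep comes from.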

The Workflow in g-USE

• A combination of GEMLCA and standard g-USE jobs

• Executed on 5 different sites of the UK NGS

• Parameter sweeps in phases 3 and 4

Running simulations

• Set input parameters
• Upload input files
• Select executor sites
• Follow execution progress

Typical execution time: 24 hours

User views

• Researchers (or End-Users)
  • Minimal computer, Grid and portal skills
  • Only interested in running their own research
  • Import, parameterize, execute and visualise workflows
• Application Developers (and/or Expert Users)
  • Computer literate researcher or software engineer
  • Define user scenarios and design new experiments
  • Create, test, deploy and modify workflows
  • Communicate with end-users and consider their requirements

The ProSim visualiser

• Visualisation in a newly developed portlet
• Allows visualisation of receptor, ligand and docked molecules at any phase during and after simulation (if the necessary files have already been generated)
• Allows two molecules to be visualised and compared at a time
• Energy, pressure, temperature and other important statistics are also displayed
• Uses the KiNG (Kinemage, Next Generation) visualisation tool

Lessons learned

• Communication between scientists and Grid experts is extremely difficult
  • More than 50% of the total project time is spent on communication and on describing/understanding user requests/requirements
• Novice Grid users require totally transparent access to Grid resources
  • Users are interested in their research and not in Globus, MPI or WMS

Future plans

• Make workflow more flexible to accommodate numerous different user scenarios

• Investigate further scenarios such as virtual screening of many ligands to one selected receptor

Thank you for your attention! Any questions?

https://engage.cpc.wmin.ac.uk
