Investigating Approaches to Speeding Up Systems Biology Using BOINC-Based Desktop Grids Simon J E...

Post on 16-Dec-2015


Investigating Approaches to Speeding Up Systems Biology Using BOINC-Based Desktop Grids

Simon J E Taylor (1)

Mohammadmersad Ghorbani (1)

David Gilbert (1)

Annette Payne (1)

Tamas Kiss (2)

Daniel Farkas (2)

(1) ICT Innovation Group/Centre for Synthetic and Systems Biology, Department of Information Systems and Computing, Brunel University, UK (X.Y@brunel.ac.uk)

(2) Centre for Parallel Computing, University of Westminster, London, UK (initial.Y@wmin.ac.uk)

Overview

Systems Biology
Description of Application
Grid enabling core component of application
Experiments
Conclusion

Systems Biology

Systems biology addresses the systematic study of biological and biochemical systems in terms of complex interactions rather than individual molecular components.  Computational modelling is used to construct and simulate an abstract model of a biological system for analysis. 

Steps of ODE based biochemical modelling

Model

Models are described in the Systems Biology Markup Language (SBML)

Complexity of the MAPK model: contains 732 species and ~244 parameters

Graph generated by CellDesigner from MAPK.xml

Simulation

Simulation: solving the system of differential equations in ODE (ordinary differential equation) based models.

SBMLODEsolver extracts the parameters (kinetic rates) and the ODEs and computes the concentrations of the species. The result shows how the species concentrations change over time.
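To make the simulation step concrete, here is a minimal sketch of integrating an ODE-based model over time. It is not SBMLODEsolver and does not read SBML: it assumes a toy one-reaction model A -> B with a mass-action rate k*[A], and it uses a hand-written classical Runge-Kutta (RK4) integrator. The names `rk4_step`, `simulate`, `k`, `a0` and `b0` are illustrative assumptions.

```python
from typing import Callable, List, Tuple

State = List[float]


def rk4_step(f: Callable[[float, State], State], t: float, y: State, h: float) -> State:
    """Advance the state y by one classical Runge-Kutta (RK4) step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(k: float, a0: float, b0: float,
             t_end: float, steps: int) -> List[Tuple[float, float, float]]:
    """Simulate the toy reaction A -> B (rate k*[A]); return rows of (time, [A], [B])."""
    def rhs(_t: float, y: State) -> State:
        rate = k * y[0]          # mass-action kinetics (assumed for this toy model)
        return [-rate, rate]     # d[A]/dt, d[B]/dt

    h = t_end / steps
    t, y = 0.0, [a0, b0]
    rows = [(t, y[0], y[1])]
    for i in range(steps):
        y = rk4_step(rhs, t, y, h)
        t = (i + 1) * h
        rows.append((t, y[0], y[1]))
    return rows


if __name__ == "__main__":
    # Print every 20th row as tab-separated text: time, [A], [B].
    for t, a, b in simulate(k=0.5, a0=1.0, b0=0.0, t_end=10.0, steps=100)[::20]:
        print(f"{t:.1f}\t{a:.6f}\t{b:.6f}")
```

Each output row is one time point, which mirrors the idea of a result that tracks species concentrations over time.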

Simulation output

The output of a simulation is a text file that can be opened in Excel or other analysis tools

Why use Grid

Parameter scanning runs the simulation repeatedly over a range of values for each selected parameter.

E.g. parameter scanning of the MAPK model on a typical desktop PC:

11 hours for 2 parameters; 3 months for 3 parameters.
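The growth in run time comes from the number of parameter combinations, which multiplies with every scanned parameter. The following sketch enumerates a full-factorial scan; the helper names `linspace` and `parameter_grid` and the parameter names `k1`, `k2`, `k3` are hypothetical, and the value ranges are made up for illustration.

```python
import itertools
from typing import Dict, Iterator, List


def linspace(lo: float, hi: float, n: int) -> List[float]:
    """Return n evenly spaced values from lo to hi inclusive."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def parameter_grid(ranges: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield one dict per combination of parameter values (full-factorial scan)."""
    names = list(ranges)
    for combo in itertools.product(*(ranges[name] for name in names)):
        yield dict(zip(names, combo))


if __name__ == "__main__":
    two = {"k1": linspace(0.1, 1.0, 100), "k2": linspace(0.1, 1.0, 100)}
    three = dict(two, k3=linspace(0.1, 1.0, 100))
    # Count combinations without materialising the whole grid in memory.
    print(sum(1 for _ in parameter_grid(two)))    # 100 * 100 simulations
    print(sum(1 for _ in parameter_grid(three)))  # 100 * 100 * 100 simulations
```

With 100 values per parameter, two parameters need 10,000 simulations and three need 1,000,000, which is why a single desktop PC quickly becomes impractical.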

Grid Architecture

WLDG (Westminster Local Desktop Grid)

SZDG – SZTAKI Desktop Grid

WS-PGRADE portal (gUSE)

Components (architecture diagram):

WS-PGRADE (User Interface/Portal)

gUSE (Workflow Processor & Services)

gUSE DG Submitter

3G Bridge

WLDG

BOINC Server: BOINC Task DB, Data Server, Scheduler

BOINC Clients (three shown), each running GenWrapper and the Legacy Application

SZDG Server

Workflow Description

Workflow

WorkUnits

Tasks

Job (work unit) Description

Inputs: the SBML model, which is an XML file, and an instruction to run ODEsolver n times for a range of parameters (batching simulations in a job). Size: 1 MB.

Output: a zip file containing the results for all jobs. Size: 1.5 MB.
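Batching simulations into jobs can be sketched as splitting the list of simulations into fixed-size chunks, one chunk per work unit. This is an illustrative sketch only, not the workflow's actual code; the function name `batch` and the parameter `per_job` are assumptions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batch(items: Sequence[T], per_job: int) -> List[List[T]]:
    """Split items into consecutive jobs of at most per_job simulations each."""
    if per_job < 1:
        raise ValueError("per_job must be at least 1")
    return [list(items[i:i + per_job]) for i in range(0, len(items), per_job)]


if __name__ == "__main__":
    simulations = list(range(6400))      # stand-ins for 6400 parameter sets
    jobs = batch(simulations, per_job=100)
    print(len(jobs), len(jobs[0]))       # number of jobs, simulations in the first job
```

A larger `per_job` means fewer work units to schedule and transfer but longer-running jobs; a smaller value gives more parallelism at the cost of more per-job overhead.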

Workflow Generator

Generates XML files for the BioNessie application (port 1 of the generator connects to port 0 of BioNessie)

Input port 2 of BioNessie sets the number of simulations per job

Control Flow of the Ported Application

Figure: Control Flow of the Ported SIMAP Application

The University of Westminster Local DG

• Over 1500 Windows PCs from 6 different campuses

Lifecycle of a node:

1. PCs are primarily used by students/staff

2. If unused, they switch to Desktop Grid mode

3. No more work from the DG server -> shutdown (green solution)


1. New Cavendish Street: 576 nodes
2. Marylebone Campus: 559 nodes
3. Regent Street: 395 nodes
4. Wells Street: 31 nodes
5. Little Titchfield Street: 66 nodes
6. Harrow Campus: 254 nodes

Experimentation

Experiment 1: Several runs for different job and simulation sizes. Results: job completion is highly variable -> Exp 3

Experiment 2: Fixed number of simulations with different numbers of simulations per job. Results: speedup for some simulations/job values.

Exp 1 -> Experiment 3: Calculating point speedup at time steps.

Exp 2 & Exp 3 -> Experiment 4: Comparing point-based speedup for a fixed number of simulations and different numbers of simulations per job.
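The slides do not define point speedup, so the following is only one plausible reading, stated as an assumption: at time t, point speedup is the sequential time that the work finished so far would have taken on one PC, divided by t. The function name `point_speedup`, its parameters, and the sample numbers are all hypothetical.

```python
import bisect
from typing import List, Tuple


def point_speedup(completion_times: List[float], sims_per_job: int,
                  seq_time_per_sim: float,
                  sample_times: List[float]) -> List[Tuple[float, float]]:
    """Return (t, speedup) pairs under the assumed definition:
    speedup(t) = (jobs finished by t * sims_per_job * seq_time_per_sim) / t.
    """
    done = sorted(completion_times)
    result = []
    for t in sample_times:
        if t <= 0:
            raise ValueError("sample times must be positive")
        finished_jobs = bisect.bisect_right(done, t)   # jobs completed at or before t
        sequential_equivalent = finished_jobs * sims_per_job * seq_time_per_sim
        result.append((t, sequential_equivalent / t))
    return result


if __name__ == "__main__":
    # Made-up completion times (minutes) for four jobs of 5 simulations each,
    # assuming one simulation takes 4 minutes on a single PC.
    for t, s in point_speedup([10, 20, 20, 40], sims_per_job=5,
                              seq_time_per_sim=4.0, sample_times=[10, 20, 40]):
        print(t, s)
```

Under this reading, a few slow jobs at the end of a run pull the speedup down even when most jobs finish early, which matches the kind of variable completion behaviour the experiments set out to examine.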

Experiment 3: Speedup dynamics for 100*100 simulations. 30 min to complete ~50% of the jobs; 2 h and 30 min to complete the others.

Experiment 4: Dynamics of job completion for different simulations/job

Experiment 4: Point speedup for different simulations/job, 6400 simulations.

Conclusions

WLDG meets the requirements of the parameter scanning application at the design and implementation level.

Batching simulations into jobs shows speedups for some simulations/job sizes.

Further experiments may show the optimal simulations/job value for different numbers of jobs.

Questions