Slides-SBAC-PAD2014 -Walid SAAD


Transcript of Slides-SBAC-PAD2014 -Walid SAAD

Page 1: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid as a Data Desktop Grid:

modeling and implementation on top of Redis

UNIVERSITY OF TUNIS
Ecole Supérieure des Sciences et Techniques de Tunis
Laboratoire de Recherche en Technologies de l’Information et de la Communication & Génie Electrique (LaTICE)

UNIVERSITY OF PARIS XIII
Université Sorbonne Paris Cité
Laboratoire d’Informatique de Paris Nord (LIPN)

Walid SAAD, Leila ABIDI, Heithem ABBES, Christophe CERIN and Mohamed JEMNI

SBAC-PAD-2014, 24 October 2014, Paris, France

Page 2: Slides-SBAC-PAD2014 -Walid SAAD

Outline

Background

Data-intensive applications

Desktop Grid overview

BonjourGrid Meta-Middleware

Wide Area BonjourGrid

Data management approach

Design on top of Redis

Formal modeling

Performance evaluation and experimentations

Conclusion and future work


Page 3: Slides-SBAC-PAD2014 -Walid SAAD

Background: E-science data-intensive applications

• E-science and enterprise applications deal with huge amounts of data (Big Data), for example in bioinformatics, medical imaging and high-energy physics;

• In order to process large data sets, applications (Bag-of-Tasks and DAG workflows) typically need a high-performance computing infrastructure: enabling Data Grids has become a challenge.

Page 4: Slides-SBAC-PAD2014 -Walid SAAD

Background: Issues and motivations

• In recent years, Desktop Grid computing systems have formed some of the largest computing platforms for solving e-science and engineering applications at low cost;

• With the emergence of Cloud computing, Desktop Grids will only survive if we are able to transform the old-fashioned client/server architecture into a new web-oriented architecture that delivers services on demand.

• Where to take resources from? How to coordinate the resources? How to manage big data transparently for end users?

Page 5: Slides-SBAC-PAD2014 -Walid SAAD

Background: Desktop Grid

Objectives:

• A form of distributed and voluntary computing.

• Uses idle resources over the Internet or LANs.

• Executes scientific applications at low cost.

Taxonomies:

• Computational Desktop Grid (High-Throughput Computing).

• Data Desktop Grid (data-intensive applications).

Middleware:

• BOINC, Condor, XtremWeb, United Devices, OurGrid, etc.

Page 6: Slides-SBAC-PAD2014 -Walid SAAD

Background: Desktop Grid

Architectures:

• Centralized or hierarchical.

• Limited scalability due to the master-worker paradigm.

• Permanent administrative monitoring and vulnerability to failures.

Configuration and deployment process:

• Complicated installation and configuration procedure for an ordinary user.

• Users cannot use their preferred middleware to manage the jobs.

BonjourGrid Desktop Grid:

• A new generation of decentralized institutional Desktop Grids, based on the P2P and Publish/Subscribe paradigms.

• Orchestration of multiple instances of existing desktop grid middleware (Boinc, Condor, XtremWeb).

Page 7: Slides-SBAC-PAD2014 -Walid SAAD

BonjourGrid: How it works?

• Computing Element (CE) = 1 coordinator + N workers

• A computing element for each user

• Each user can specify a middleware for his CE

[Figure: users A, B, C and D, each with their own computing element; legend: Coordinator, Worker, Idle]

Page 8: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid: Main contributions

• Include a new tier for data-intensive management in BonjourGrid:

o Coordinate and orchestrate computing and data platforms into a unified Desktop Grid system;

o In addition to the computing system (Condor, Boinc or XtremWeb), the user can select the desired data manager protocol in a transparent and decentralized manner.

• Extend the resources coordination protocol of BonjourGrid:

o Formal modeling using colored Petri nets and verification with CPN-Tools;

o A wide-area implementation based on Redis (a popular network technology).

Page 9: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid: Abstraction layers

[Figure: layered architecture of a Computing Element]

• Connection to BonjourGrid: Publish/Subscribe (DNS-SD Bonjour protocol)

• Resources selection (discovery and matchmaking)

• Deployment of a computing system

• Computing Element (job scheduler + data caches, accessed through APIs):

o Job scheduler: Condor, Boinc, XtremWeb

o Local cache (Level 1): Bitdew

o Remote cache (Level 2): Stork, over GridFTP, SRM, SRB, Amazon S3, FTP, HTTP, Bittorent

Page 10: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid: User interaction with BonjourGrid

[Figure: the per-user BonjourGrid interface (application specification + configuration file), the local cache (Bitdew), the remote cache (Stork), external data servers (SRM, SRB, GridFTP, etc.), the job scheduler (Condor, Boinc, XW) and the Computing Element (coordinator, workers, idle nodes)]

1. Create Coordinator (job scheduler, data cache, data URL, etc.)
2. Get input data (URL)
3. Computing Element
4. Distribute data (ID)
5. Schedule job
6. Put output data (URL)
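The six steps above can be sketched as a driver script. The slides only name the steps, so every function name, URL and parameter below is a hypothetical stand-in, not the BonjourGrid API:

```python
# Hypothetical sketch of the six-step user interaction; all names are
# illustrative stand-ins, since the slides only name the steps.

def run_user_application(spec: dict) -> list:
    log = []
    # 1. Create a coordinator configured with a job scheduler and a data cache.
    log.append("create_coordinator(%s, %s)" % (spec["scheduler"], spec["cache"]))
    # 2. Fetch input data from an external data server by URL.
    log.append("get_input_data(%s)" % spec["input_url"])
    # 3. Build the computing element (1 coordinator + N workers).
    log.append("build_computing_element(%d)" % spec["workers"])
    # 4. Distribute data to the workers by identifier, before job submission.
    log.append("distribute_data(%s)" % spec["data_id"])
    # 5. Schedule the job on the chosen middleware.
    log.append("schedule_job()")
    # 6. Upload the output data to a URL.
    log.append("put_output_data(%s)" % spec["output_url"])
    return log

steps = run_user_application({
    "scheduler": "Condor", "cache": "Bitdew", "workers": 8,
    "input_url": "ftp://host/in", "data_id": "d42",
    "output_url": "ftp://host/out",
})
```

The point is only the ordering: data distribution (step 4) happens before job scheduling (step 5), which is why the data-manager variant fares better in the BLAST experiments later in the deck.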

Page 11: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid: Resources orchestration

• Redis terminology: SUBSCRIBE(CHANNEL-NAME), PUBLISH(CHANNEL-NAME, MESSAGE)

• A part of the interaction events between BonjourGrid components is shown on the slide.


Page 12: Slides-SBAC-PAD2014 -Walid SAAD

Wide Area BonjourGrid: Formal modeling with CPN-Tools

• The analysis of the results returned by CPN-Tools yields satisfaction and more confidence in the BonjourGrid system:

o We found no deadlock states (i.e., states that admit no executable transitions).

o All possible transitions are executable and all possible events can eventually happen.

• How does the modeling serve as a guideline for the implementation?

o Add control places in order to control publications and subscriptions.

o With Redis, the SUBSCRIBE event should occur before PUBLISH, otherwise the message will be lost.

o Put the SUBSCRIBE events at the beginning of our implementation.
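The subscribe-before-publish rule can be demonstrated with a minimal in-memory stand-in for Redis pub/sub. This is a sketch of the fire-and-forget channel semantics only, not of the BonjourGrid implementation; the channel and message names are hypothetical:

```python
from collections import defaultdict

class MiniBroker:
    """Tiny in-memory stand-in for Redis pub/sub: fire-and-forget channels."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> list of inboxes

    def subscribe(self, channel):
        # Register an inbox on a channel and return it to the caller.
        inbox = []
        self.subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        # Like Redis PUBLISH: deliver to the *current* subscribers only and
        # return how many received the message. With zero subscribers the
        # message is simply dropped -- there is no persistence.
        receivers = self.subscribers.get(channel, [])
        for inbox in receivers:
            inbox.append(message)
        return len(receivers)

broker = MiniBroker()

# Publishing before anyone subscribes: the message is lost (0 receivers).
lost = broker.publish("coordinator-events", "worker-ready")

# Subscribing first, as the CPN model recommends, delivers the message.
inbox = broker.subscribe("coordinator-events")
delivered = broker.publish("coordinator-events", "worker-ready")
```

The return value of `publish` mirrors why the control places matter: a component that publishes before its peers subscribe gets no error, only silently dropped messages.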

Page 13: Slides-SBAC-PAD2014 -Walid SAAD

Experimentations: Performance evaluation of Redis

• Analyze the performance and scalability of the Redis protocol for discovering and registering services;

• Measure the response time of Redis when managing resources coming from different networks;

• Grid5000 testbed using 300 nodes on the Nancy, Grenoble and Toulouse sites:

o Redis package (client and server tools)

o Python scripts for starting the Redis server (Start-Redis-Server()), registering services (Register-Service()) and discovering services (Browse-Service())

o Several test scenarios (sequential or simultaneous) using one or multiple sites.
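The slides name the three scripts but not their contents; below is a minimal sketch of what Register-Service() and Browse-Service() might look like, using a plain dictionary in place of a live Redis server. The record fields and service-naming scheme are assumptions:

```python
import time

registry = {}  # stand-in for Redis server state: service name -> record

def register_service(name, host, port):
    """Hypothetical Register-Service(): publish a service record."""
    registry[name] = {"host": host, "port": port, "ts": time.time()}

def browse_service(prefix):
    """Hypothetical Browse-Service(): discover services by name prefix."""
    return sorted(n for n in registry if n.startswith(prefix))

register_service("coordinator-nancy-1", "172.16.0.1", 6379)
register_service("coordinator-nancy-2", "172.16.0.2", 6379)
found = browse_service("coordinator-nancy")
```

Against a real Redis server, the dictionary would be replaced by keys or sets on the server, and discovery would be a key-pattern or set lookup rather than a local scan.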

Page 14: Slides-SBAC-PAD2014 -Walid SAAD

Experimentations: Simultaneous registration in multiple sites

• First node: run the Redis server;

• Second node: browse services;

• 300 nodes: publish services.

[Figure: Grid5000 frontend host; on the Nancy site, (1) Redis_Server() and (2) Browse_Service(); Register_Service() calls (3) from Nancy and (4, 5) from the Grenoble and Toulouse sites]

Page 15: Slides-SBAC-PAD2014 -Walid SAAD

Experimentations: Simultaneous registration in multiple sites

• Registration time:

o Increases from one site to another and is roughly proportional to the distance between the site and the Redis server.

o Varies from 10 ms (Nancy site) to 48 ms (Toulouse).

o Redis was not saturated and the plots are almost linear (high scalability of the Redis protocol).

• Discovery time:

o Plots are not linear and time varies between 2 and 210 ms.

o The Redis server processes client connections sequentially when they use the same CHANNEL.
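Per-call times such as those above can be collected with a simple wrapper; the stubbed operation below stands in for a real Register-Service() round trip, which is not shown on the slides:

```python
import time

def timed_ms(operation, *args):
    """Run one call and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = operation(*args)
    return result, (time.perf_counter() - start) * 1000.0

def fake_register(name):
    # Stand-in for a Register-Service() round trip to the Redis server.
    return "registered:" + name

result, ms = timed_ms(fake_register, "coordinator-nancy-1")
```

Repeating the call across 300 nodes and plotting the per-node milliseconds is what produces the registration and discovery curves discussed above.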

Page 16: Slides-SBAC-PAD2014 -Walid SAAD

Experimentations: Wide Area BonjourGrid

• Investigate the performance and scalability of BonjourGrid with a data manager when performing data-intensive BLAST computations: comparing nucleotide query sequences (DNA sequences) against a DNA database (Genebase);

• Grid5000 testbed using 300 nodes on the Nancy, Grenoble and Toulouse sites:

o Redis package (client and server tools)

o BonjourGrid (orchestration protocol, Condor middleware, data manager)

Page 17: Slides-SBAC-PAD2014 -Walid SAAD

Experimentations: Wide Area BonjourGrid

• Remote data (SCP vs. Stork):

o Stork shows only a slight difference compared to the SCP protocol.

• Computing Element (Redis):

o Time increases slightly (varies from 130 to 250 s).

• Local data (Bitdew):

o Time to share and download data (DNA Genbase, DNA sequence and the BLAST program).

o Maximal value: Human BLAST (1035).

• BLAST task (blastn):

o Varies, respectively for both scenarios, from 1380 to 1795 s and from 420 to 538 s.

o BonjourGrid with a data manager is better (data are scheduled on workers before job submission).

Page 18: Slides-SBAC-PAD2014 -Walid SAAD

Conclusion and future work: Conclusions

• We have implemented, with the Publish/Subscribe paradigm, a new release of the BonjourGrid meta-middleware in which multiple computing systems and data management frameworks are orchestrated in a transparent and decentralized manner;

• We have proposed a formal model using colored Petri nets;

• With different case studies, we have evaluated the performance and scalability of BonjourGrid with data manager functionality over 300 machines on the Grid5000 testbed.

Page 19: Slides-SBAC-PAD2014 -Walid SAAD

Conclusion and future work: Future work

• Evaluate the communication overhead between the data manager protocols;

• Integrate our work into the Univ. Paris 13 SlapOS Cloud system to offer elastic Desktop Grid computing with Data Managers as a Service.

slapos.cloud.univ-paris13.fr

Page 20: Slides-SBAC-PAD2014 -Walid SAAD

Thanks, any questions?

{walid.saad, christophe.cerin, leila.abidi}@lipn.univ-paris13.fr