Yoshihiro Nakajima, Mitsuhisa Sato, Yoshiaki Aida,Taisuke Boku

Integrating Computing Resources on Multiple Grid-enabled Job Scheduling Systems Through a Grid RPC System. Yoshihiro Nakajima, Mitsuhisa Sato, Yoshiaki Aida, Taisuke Boku. Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid, 2006. Reporter: Tung-Yen Haieh


Page 1

Integrating Computing Resources on Multiple Grid-enabled Job Scheduling Systems Through a Grid RPC System

Yoshihiro Nakajima, Mitsuhisa Sato, Yoshiaki Aida, Taisuke Boku

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid, 2006

Reporter: Tung-Yen Haieh

Page 2

Outline

Introduction

Design of Grid RPC System Integrating Computing Resources on a Multiple Grid-enabled Job Scheduling System

Experimental Results

Conclusion

Page 3

Introduction (cont.)

The demand for high-throughput computing is increasing, and several grid-enabled job scheduling systems (GJSSs) that support high-throughput computing have been developed, such as XtremWeb, Condor, and CyberGRIP.

Page 4

Introduction (cont.)

However, each GJSS has its own user interface, and the management policy for the GJSS may also differ from site to site.

They propose a framework for integrating and utilizing computing resources managed by GJSSs in different organizations by using Grid RPC-style programming.
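As a rough illustration of what Grid RPC-style programming looks like from the client side, the sketch below issues several asynchronous calls and then collects the results. It is not the OmniRPC API: the names RpcClient, register, call_async, and run_parameter_sweep are hypothetical, and local threads stand in for remote computing resources.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, List


class RpcClient:
    """Hypothetical client; local threads stand in for remote workers."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._procedures: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        # Assumption: in a real system the procedure lives on a remote node.
        self._procedures[name] = func

    def call_async(self, name: str, *args: Any) -> Future:
        # Returns immediately; the caller collects the result later.
        return self._pool.submit(self._procedures[name], *args)

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def square(x: int) -> int:
    return x * x


def run_parameter_sweep(values: List[int]) -> List[int]:
    client = RpcClient()
    client.register("square", square)
    try:
        # Issue every call first, then wait for all of them.
        pending = [client.call_async("square", v) for v in values]
        return [future.result() for future in pending]
    finally:
        client.close()
```

Issuing all calls before waiting is what lets independent tasks run in parallel; the results come back in the order the calls were made.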

Page 5

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

Page 6

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

Page 7

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

The proposed system realizes the following objectives:

A uniform and parallel programming model by remote procedure call on the grid-enabled job scheduling system.

A fault-tolerant Grid RPC system on the computing resource side.

Page 8

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

Simultaneous exploitation of massive computing resources provided on sites that are managed by different organizations.

An easy-to-use execution environment from a cluster to Grid-enabled Job Scheduling Systems without any change in the application source program.

Page 9

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

General APIs to absorb differences between GJSSs.

General APIs to adapt to new GJSSs.

Automatic deployment of execution programs on remote computing resources.
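One way to picture "general APIs to absorb differences between GJSSs" is an adapter interface with one implementation per scheduler. The sketch below is only an illustration under that assumption: SchedulerAdapter, register_adapter, make_adapter, and the toy InMemoryAdapter are hypothetical names and do not come from the paper or from OmniRPC.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class SchedulerAdapter(ABC):
    """Hypothetical uniform interface; one subclass per GJSS."""

    @abstractmethod
    def submit(self, command: str) -> str:
        """Submit a job and return a backend-specific job id."""

    @abstractmethod
    def is_finished(self, job_id: str) -> bool:
        """Report whether the job has completed."""


# Registry that lets a new GJSS be added without touching client code.
ADAPTERS: Dict[str, Type[SchedulerAdapter]] = {}


def register_adapter(name: str) -> Callable[[Type[SchedulerAdapter]], Type[SchedulerAdapter]]:
    def decorator(cls: Type[SchedulerAdapter]) -> Type[SchedulerAdapter]:
        ADAPTERS[name] = cls
        return cls
    return decorator


@register_adapter("in-memory")
class InMemoryAdapter(SchedulerAdapter):
    """Toy backend that treats every submitted job as already finished."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, command: str) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = command
        return job_id

    def is_finished(self, job_id: str) -> bool:
        return job_id in self._jobs


def make_adapter(name: str) -> SchedulerAdapter:
    # Client code selects a backend by name only.
    return ADAPTERS[name]()
```

Under this design, supporting another scheduler means writing one more subclass and registering it; the calling code stays unchanged.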

Page 10

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

Page 11

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

They have extended OmniRPC for the proposed system as follows:

An OmniRPC agent process was added to handle protocol conversion between the OmniRPC client program and each GJSS server.

Page 12

Design of Grid RPC System Integrating Computing Resources on a Multiple GJSS (cont.)

The remote executable module of OmniRPC can handle I/O data through files.

Alternative methods are available to manage the information of the remote function.

Easy-to-use APIs by which the proposed system can adapt to new GJSSs are provided.
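The agent and the file-based I/O described above can be pictured roughly as follows. This is a sketch under stated assumptions, not OmniRPC's actual protocol: the JSON file layout, the file names, and the functions agent_call and remote_executable are all hypothetical, and the "remote" step runs inline instead of being submitted to a GJSS.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, List

# Hypothetical table of functions known to the remote executable.
REMOTE_FUNCTIONS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}


def remote_executable(input_path: Path, output_path: Path) -> None:
    """Stand-in for the remote module: read arguments from a file,
    run the requested function, and write the result to another file."""
    request = json.loads(input_path.read_text())
    result = REMOTE_FUNCTIONS[request["function"]](*request["args"])
    output_path.write_text(json.dumps({"result": result}))


def agent_call(function: str, args: List[Any]) -> Any:
    """Stand-in for the agent: turn an RPC request into input and output
    files that a job scheduler could stage, then decode the reply."""
    with tempfile.TemporaryDirectory() as workdir:
        input_path = Path(workdir) / "call.in.json"
        output_path = Path(workdir) / "call.out.json"
        input_path.write_text(json.dumps({"function": function, "args": args}))
        # Assumption: a real agent would submit a job to the GJSS here.
        remote_executable(input_path, output_path)
        return json.loads(output_path.read_text())["result"]
```

Passing data through files matters because a job scheduler typically stages files in and out of a job and does not offer the client a direct connection to the worker.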

Page 13

Experimental Results (cont.)

The GJSSs used as backends of OmniRPC are XtremWeb version 1.5, CyberGRIP version 2.2 (CyberGRIP uses JTX), Condor version 7.10.7, and Open Source Grid Engine version 6.0u6.

Page 14

Experimental Results (cont.)

Page 15

Page 16

Experimental Results (cont.)

Page 17

Conclusion

They have presented a framework for a parallel programming model by remote procedure calls bridging between large-scale computing resource pools managed by multiple GJSSs.

They found that the proposed system can achieve approximately the same performance as using OmniRPC and can handle interruptions in worker programs on remote nodes.
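The slides do not say how interruptions are handled. One common approach, shown here purely as an assumption, is to resubmit a task when its worker is interrupted. The names WorkerInterrupted, call_with_resubmission, and make_flaky_task are hypothetical.

```python
from typing import Callable, Dict, Optional, TypeVar

T = TypeVar("T")


class WorkerInterrupted(Exception):
    """Hypothetical signal that a remote worker stopped before finishing."""


def call_with_resubmission(task: Callable[[], T], max_attempts: int = 3) -> T:
    """Run the task, resubmitting it if the worker is interrupted."""
    last_error: Optional[WorkerInterrupted] = None
    for _ in range(max_attempts):
        try:
            return task()
        except WorkerInterrupted as error:
            last_error = error  # remember the failure and try again
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def make_flaky_task(failures: int) -> Callable[[], str]:
    """Build a task that is interrupted a fixed number of times, then succeeds."""
    state: Dict[str, int] = {"remaining": failures}

    def task() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise WorkerInterrupted("worker node went away")
        return "done"

    return task
```

For example, call_with_resubmission(make_flaky_task(2)) succeeds on the third attempt, while a task that is interrupted more often than max_attempts allows raises RuntimeError.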