Post on 09-Jul-2015
NIIF Grid development portfolio
Szalai Ferenc, NIIF Institute
szferi@niif.hu
http://www.clustergrid.hu
http://gug.grid.niif.hu
Introduction
● Main force: ClusterGrid infrastructure
– about 1000 computational nodes organized in 32 clusters
● Two main directions:
– Infrastructure development: virtualization, distributed storage infrastructure, managing a huge number of nodes, etc.
– Middleware development: Grid Underground (GUG) project
ClusterGrid Architecture
Numbers:
1000 nodes
32 sites (clusters)
22 TB storage
Pre-GUG history
● ClusterGrid since 2002
● First middleware: Condor Flock
– problems: strong centralization, required central authentication with LDAP, high I/O overhead because of the shadow processes, etc.
● Second middleware: centralized broker based on Apache/PostgreSQL/PHP
– since then Condor is just an LRMS
– problems: missing storage, missing interoperability
● Final solution :) GUG
Grid UnderGround
● New-generation ClusterGrid middleware
● In use in the production system since Feb 2006
● Design goals:
– pure web service based framework (no WSRF)
– using selected GGF, W3C standards
– simplify service development
– focus on core services (info, storage, job management, security, monitoring)
– KISS: Keep It Simple, Stupid
– desktop and HPC aware: low memory and CPU usage
– open source development (http://www.sourceforge.net/projects/gug)
GUG Architecture
● Pure Python framework:
– framework runs as a single daemon
– manages threads
– handles network communication over HTTP(S)/SOAP
– every service is a dynamically loadable plugin of the framework; services use backends to separate interfaces from functions
● Mandatory services:
– Manager service: manages the simple lifecycle of other services. Remote management is also possible.
– Grid Information System: p2p system to route advertisements, such as service descriptions (better than UDDI)
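The plugin idea can be sketched in a few lines of Python. This is only an illustration, not the GUG code: the class name Manager and its methods load, stop and status are hypothetical, and the standard library class collections.Counter stands in for a real service class.

```python
import importlib


class Manager:
    """Hypothetical manager: loads service classes by name, tracks their state."""

    def __init__(self):
        self.services = {}  # service id -> [instance, state]

    def load(self, service_id, module_name, class_name, **kwargs):
        # Import the module at run time, so a service is a plugin
        # named in configuration rather than hard-coded in the framework.
        module = importlib.import_module(module_name)
        cls = getattr(module, class_name)
        self.services[service_id] = [cls(**kwargs), "running"]

    def stop(self, service_id):
        self.services[service_id][1] = "stopped"

    def status(self, service_id):
        return self.services[service_id][1]


manager = Manager()
# Stand-in plugin: any importable class works for the illustration.
manager.load("counter", "collections", "Counter")
print(manager.status("counter"))
```

A real framework would also expose load and stop over SOAP to allow the remote management mentioned above; that part is left out here.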
Example service
class Test:
    def __init__(self, id, local_gis_url, config):
        pass

    def _get_description(self, site_id):
        # XML description of this service for the given site
        return """<?xml version='1.0'?>
<ServiceDescription>
    <Site>%s</Site>
</ServiceDescription>
""" % site_id

    def echo(self, x509, x):
        # The only public operation: return the input unchanged
        return x
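A service class of this shape can be tried out locally, without the framework. The sketch below repeats the class so that it runs on its own; the constructor arguments (service id, GIS URL, empty config) and the site name are placeholder values, and passing None as the x509 argument is an assumption made only for this local call.

```python
import xml.etree.ElementTree as ET


class Test:
    def __init__(self, id, local_gis_url, config):
        pass

    def _get_description(self, site_id):
        return """<?xml version='1.0'?>
<ServiceDescription>
    <Site>%s</Site>
</ServiceDescription>
""" % site_id

    def echo(self, x509, x):
        return x


# Placeholder arguments; in production the framework would supply them.
service = Test("test-1", "https://localhost:8443/gis", {})

# The description is plain XML, so it can be checked with a standard parser.
description = ET.fromstring(service._get_description("example-site"))
site = description.findtext("Site")

reply = service.echo(None, "hello")
print(site, reply)
```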
GIS
● separates data and metadata
● advertisements: (metadata, data) tuple
● data: XML description of anything, like a service, resource, etc.
● metadata: source of data, TTL, etc.
● simple routing algorithm based on a static peer list and TTL (we like news feeds :)
● two main sources of data:
– services (get_description function)
– GIS backends: standalone data providers
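The routing idea can be sketched as follows. The slides do not say how the TTL is measured, so this sketch assumes it is a hop count; the duplicate check by advertisement id is also an assumption, and GisNode, receive and the metadata keys are hypothetical names.

```python
class GisNode:
    """Hypothetical GIS peer holding advertisements as (metadata, data) tuples."""

    def __init__(self, name):
        self.name = name
        self.peers = []   # static peer list, configured up front
        self.store = {}   # advertisement id -> (metadata, data)

    def receive(self, metadata, data):
        ad_id = metadata["id"]
        # Drop advertisements that were already seen or whose TTL ran out.
        if ad_id in self.store or metadata["ttl"] <= 0:
            return
        self.store[ad_id] = (metadata, data)
        # Forward to every peer with the TTL reduced by one hop.
        forwarded = dict(metadata, ttl=metadata["ttl"] - 1)
        for peer in self.peers:
            peer.receive(forwarded, data)


a, b, c = GisNode("a"), GisNode("b"), GisNode("c")
a.peers = [b]
b.peers = [a, c]
c.peers = [b]

# With a TTL of 2 the advertisement is stored on a and b, but not on c.
a.receive({"id": "ad-1", "source": "a", "ttl": 2}, "<ServiceDescription/>")
print(sorted(n.name for n in (a, b, c) if "ad-1" in n.store))
```

The duplicate check also stops the advertisement from circulating forever when the peer lists form a loop, as they do between a and b here.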
GUG Core Services
● VOService (security)
– every entity is identified by an X509 certificate
– every VO should set up at least one VO service
– manages authorization information, organized into a tree
– manages VO membership like a mailing list
● Job management components
– Exec: runs and manages jobs on SMP systems (useful on desktops)
– Job Controller: uses the GGF BES interface and GGF JSDL. Interfaces with common LRMSs (e.g. Condor, Exec), no scheduling
– SuperScheduler: uses the same interface and data model as Job Controller; it is a grid-level scheduler
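For readers who have not seen JSDL, the sketch below builds a minimal job description with the Python standard library. The element names and namespaces follow the GGF JSDL 1.0 schema; which subset of JSDL the GUG Job Controller accepts is not stated in these slides, so this is only a generic illustration, and make_jsdl is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespaces of JSDL 1.0 and its POSIX application extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def make_jsdl(executable, arguments):
    """Return a minimal JSDL document describing one POSIX executable."""
    job = ET.Element("{%s}JobDefinition" % JSDL)
    description = ET.SubElement(job, "{%s}JobDescription" % JSDL)
    application = ET.SubElement(description, "{%s}Application" % JSDL)
    posix = ET.SubElement(application, "{%s}POSIXApplication" % POSIX)
    ET.SubElement(posix, "{%s}Executable" % POSIX).text = executable
    for argument in arguments:
        ET.SubElement(posix, "{%s}Argument" % POSIX).text = argument
    return ET.tostring(job, encoding="unicode")


document = make_jsdl("/bin/echo", ["hello"])
print(document)
```

Because the Job Controller and the SuperScheduler share one interface and data model, the same kind of document could in principle be handed to either of them.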
GUG Core Services
● Storage management components:
– file based architecture
– Storage Controller: stores and gives back files using a transport independent protocol like SRM
– ShareDirectory: directory and file sharing (same interface as Storage Controller)
– File System Service: metadata catalog
– Storage Manager: provides a POSIX-like interface (mkdir, ls, mv, cp, etc.), creates replicas on Storage Controllers, manages file system entity types as plugins: file, directory, shared directory, etc.
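The division of labour between these components can be sketched with in-memory stand-ins. Everything below is hypothetical: the class and method names, the fixed replica count of 2, and the use of a plain dictionary in place of the File System Service catalog.

```python
class StorageController:
    """Stand-in for a Storage Controller: stores and gives back file contents."""

    def __init__(self):
        self.files = {}

    def put(self, key, content):
        self.files[key] = content

    def get(self, key):
        return self.files[key]


class StorageManager:
    """POSIX-like front end: a catalog of paths plus replicas on controllers."""

    def __init__(self, controllers, replicas=2):
        self.controllers = controllers
        self.replicas = replicas
        self.catalog = {"/": {"type": "directory"}}  # path -> metadata entry

    def mkdir(self, path):
        self.catalog[path] = {"type": "directory"}

    def write(self, path, content):
        # Place a copy of the file on the first `replicas` controllers.
        targets = self.controllers[: self.replicas]
        for controller in targets:
            controller.put(path, content)
        self.catalog[path] = {"type": "file", "replicas": targets}

    def read(self, path):
        return self.catalog[path]["replicas"][0].get(path)

    def ls(self, path):
        # List the direct children of a directory, using only the catalog.
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.catalog if p.startswith(prefix))
        return sorted(n for n in names if n and "/" not in n)


controllers = [StorageController(), StorageController(), StorageController()]
manager = StorageManager(controllers)
manager.mkdir("/grid")
manager.mkdir("/grid/tmp")
manager.write("/grid/tmp/szoveg", b"text")
print(manager.ls("/grid/tmp"))
```

Note that ls touches only the catalog, while read goes to a controller; this mirrors the separation between the metadata catalog and the file stores listed above.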
GUG Services and UI
● Additional services:
– Compiler service: creates binaries from source for all available platforms. Uses the job management components
● User Interface:
– modular command line interface: 'grid' command:
$ grid storage ls /grid/tmp R
/grid/tmp:
d 20060412 14:04 proba

/grid/tmp/proba:
8 20060412 14:05 szoveg
8 20060412 14:06 szoveg.1
8 20060412 14:06 masnev
$ grid job submit testjob.jsdl
– graphical and web interfaces coming soon
Real-life experience is good
● Seamless transition to the new middleware: the UI has an 'almost' compatible interface with the old ClusterGrid broker
● GUG has already been tested with real-life applications:
– virtual screening (pharmacy)
– nonlinear dynamics (physics)
– compiler optimization (IT research)
– usual ClusterGrid applications