Development of the distributed monitoring system for the NICA cluster

Ivan Slepov (LHEP, JINR)

Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013


The MultiPurpose Detector (MPD) to study Heavy Ion Collisions at NICA


Software for MultiPurpose Detector

ROOT + FairRoot (FairBase + FairSoft software packages) = MpdRoot Framework

MpdRoot Framework components:

- Detector simulation

- Data reconstruction

- Event analysis


Computing resources for MPD data processing

- CPU: 128 Xeon cores now, ~10 000 Xeon cores in the future

- GPU: ~1500 Tesla cores


Motivation for developing the monitoring system

- Computing resources information (free space, memory, CPU, etc.)

- System load (load average, processes)

- MPD software information (FairSoft version)

- Cluster software information (SGE, xrootd, PROOF)

- User tasks monitoring (batch processing and interactive jobs)

MPD users need more information about all of their own cluster nodes and the public computers!
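As an illustration of the kind of per-node data listed above (free space, memory, CPU, load average), here is a minimal sketch of a collector. It is not the author's code: the slides say the real system uses BASH scripts, and Python is used here only for readability. The function names `parse_meminfo` and `collect_node_metrics` and the metric keys are hypothetical, and the memory reading assumes a Linux node with `/proc/meminfo`.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def collect_node_metrics(path="/"):
    """Take one snapshot of disk, CPU, load and memory figures for this node."""
    disk = shutil.disk_usage(path)
    metrics = {
        "disk_free_bytes": disk.free,
        "disk_total_bytes": disk.total,
        "cpu_count": os.cpu_count(),
    }
    try:
        # 1-, 5- and 15-minute load averages; available on Unix only.
        metrics["load_average"] = os.getloadavg()
    except (AttributeError, OSError):
        metrics["load_average"] = None
    try:
        # Assumption: a Linux node that exposes /proc/meminfo.
        with open("/proc/meminfo") as handle:
            metrics["mem_free_kb"] = parse_meminfo(handle.read()).get("MemFree")
    except OSError:
        metrics["mem_free_kb"] = None
    return metrics


if __name__ == "__main__":
    for name, value in collect_node_metrics().items():
        print(f"{name}: {value}")
```

Run on a single node, the script prints one snapshot; a cluster-wide view would need something to execute it on every node and gather the results.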


Monitoring system schemes

Scheme 1, for collecting general information:

- MySQL DB

- BASH scripts

- DSH software

- Cron run job

- PHP scripts

- Web interface

Scheme 2, for collecting information about user tasks and providing data management:

- Web interface

- PHP scripts

- DSH software

- BASH scripts

- MySQL DB
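One possible reading of Scheme 1 is that samples are collected periodically, written to the database, and later read back by the web layer. The slides do not show the arrows between components, so that flow is an assumption. The sketch below illustrates only the store-and-read part, under further assumptions: `sqlite3` stands in for MySQL so the example is self-contained, and the table `node_status`, its columns, and the node names are hypothetical.

```python
import sqlite3
import time


def init_db(conn):
    """Create the (hypothetical) table that holds one row per collected sample."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node_status ("
        " node TEXT NOT NULL,"
        " collected_at REAL NOT NULL,"
        " load_1min REAL,"
        " disk_free_gb REAL)"
    )


def store_sample(conn, node, load_1min, disk_free_gb, collected_at=None):
    """Insert one sample, as a periodic collection job might do."""
    if collected_at is None:
        collected_at = time.time()
    conn.execute(
        "INSERT INTO node_status VALUES (?, ?, ?, ?)",
        (node, collected_at, load_1min, disk_free_gb),
    )
    conn.commit()


def latest_per_node(conn):
    """Return the newest sample for each node, as a web page might display it."""
    return conn.execute(
        "SELECT s.node, s.collected_at, s.load_1min, s.disk_free_gb"
        " FROM node_status AS s"
        " JOIN (SELECT node, MAX(collected_at) AS newest"
        "       FROM node_status GROUP BY node) AS m"
        " ON s.node = m.node AND s.collected_at = m.newest"
        " ORDER BY s.node"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    store_sample(conn, "node01", 0.5, 120.0)
    store_sample(conn, "node02", 0.1, 300.0)
    for row in latest_per_node(conn):
        print(row)
```

Keeping every sample and selecting the newest row per node at read time is one design choice; overwriting a single row per node would be simpler but would lose the history.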


Web interface for the monitoring system

1. MPD software information

2. Computing resources information

3. System load

4. User tasks monitoring


Monitoring system web interface: User tasks


Monitoring system web interface: Interactive nodes


Access to the monitoring system on the website mpd.jinr.ru


Thank you for your attention!


Motivation for developing the monitoring system

MPD users need more information about all of their own cluster nodes and the public computers!

Why, when the grid concept, for example, puts a layer of abstraction over the resources?

Because the MPD software is still under development and needs testing and debugging.