A Hardware Based Cluster Control and Management System

Ralf Panse (panse@kip.uni-heidelberg.de)

Kirchhoff Institute of Physics, University of Heidelberg, Germany

CHEP 04, Interlaken, 27.9.-01.10.

Outline

• Motivation

• CIA card

• Remote Control

• Monitoring

• Debugging

• Summary

Motivation

[Diagram: an administrator (?) facing three tasks: Repair, Install, Monitor.]

Cluster Control

Hardware solutions

• Wake on LAN

• Remote Management Boards

• BIOS Serial Console

• Remote controlled power socket

Software solutions

• Disk Image

• Virtual Network Computing (VNC)

• Terminal Server

• SSH, Telnet
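
As an illustration of the first hardware option above, Wake on LAN works by broadcasting a "magic packet": six 0xFF bytes followed by the target MAC address repeated 16 times. The sketch below is not from the talk; the function names, the placeholder MAC address, and the choice of UDP port 9 are assumptions made for the example.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the target MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is an assumed, common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


# Placeholder MAC address, used only to show the packet layout.
packet = build_magic_packet("00:11:22:33:44:55")
```

Note that Wake on LAN can only switch a node on; it offers no reset, console, or monitoring function.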

Cluster Interface Agent (CIA) Card

• PCI expansion card

• Low profile PCI form factor

• Independent of the computer

• Remote installation

• Remote power cycle

• Monitoring

• Debugging

• Automatic administration

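CIA Network

[Diagram: the CIA cards in the computing cluster are connected to the administrator through a separate service network (TCP/IP); users reach the computing cluster through the cluster network.]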

Embedded System

• Altera Excalibur (ARM922T)

• Flash memory (8-16 MB)

• SDRAM (32-64 MB)

• Ethernet

• Linux

Remote Control

[Diagram: the administrator reaches the CIA cards inside the computing cluster over the service network (TCP/IP), using a Java application, a web browser, or a VNC viewer.]

Video Control

[Diagram: video data travels over the PCI bus to the CIA card, which acts as the video card; a VNC server under Linux forwards the video data over the service network (TCP/IP).]

Keyboard/Mouse Control

[Diagram: keyboard/mouse data arrives over the service network (TCP/IP) at the CIA card, which acts as a PCI master and passes the keyboard/mouse data on over the PCI bus.]

Data Control

[Diagram: a floppy image is sent over the service network (TCP/IP) to the CIA card, which acts as a floppy drive and delivers the floppy data to the node.]
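
To make the floppy emulation idea concrete, the following sketch shows how a sector request could be served from a flat floppy image. It assumes the common 3.5" 1.44 MB geometry (80 cylinders, 2 heads, 18 sectors per track, 512-byte sectors); the slides do not state which formats the CIA card supports, and the function names are hypothetical.

```python
SECTOR_SIZE = 512
# Assumed geometry of a 3.5" 1.44 MB floppy disk.
CYLINDERS, HEADS, SECTORS_PER_TRACK = 80, 2, 18
IMAGE_SIZE = CYLINDERS * HEADS * SECTORS_PER_TRACK * SECTOR_SIZE  # 1,474,560 bytes


def chs_to_offset(cylinder: int, head: int, sector: int) -> int:
    """Map a cylinder/head/sector address to a byte offset in the image.

    Cylinders and heads count from 0; sectors count from 1.
    """
    if not (0 <= cylinder < CYLINDERS and 0 <= head < HEADS
            and 1 <= sector <= SECTORS_PER_TRACK):
        raise ValueError("CHS address outside the assumed geometry")
    lba = (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)
    return lba * SECTOR_SIZE


def read_sector(image: bytes, cylinder: int, head: int, sector: int) -> bytes:
    """Return the 512 bytes of one sector from an in-memory floppy image."""
    offset = chs_to_offset(cylinder, head, sector)
    return image[offset:offset + SECTOR_SIZE]
```

With such a mapping, the image can stay on the administrator's side of the service network and only the requested sectors need to be delivered to the node.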

Power/Reset Switch

[Diagram: connection to the power/reset switch of the cluster node mainboard.]

Monitoring

• PCI scan

• ADC: Temperature, Voltage, Acoustics

• POST code
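
A PCI scan walks every bus/device/function address and reads the configuration header; a vendor ID of 0xFFFF means nothing responded. The sketch below decodes the standard header fields and enumerates devices through a caller-supplied `read_config` function. That callback, and the names `parse_header` and `scan`, are assumptions for illustration; how the CIA card actually issues configuration reads is not described on the slides.

```python
import struct
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

NO_DEVICE = 0xFFFF  # vendor ID read back when no function responds
Address = Tuple[int, int, int]  # (bus, device, function)


def parse_header(raw: bytes) -> Optional[Dict[str, int]]:
    """Decode the first 16 bytes of a PCI configuration header."""
    vendor_id, device_id = struct.unpack_from("<HH", raw, 0x00)
    if vendor_id == NO_DEVICE:
        return None
    revision, prog_if, subclass, base_class = struct.unpack_from("<BBBB", raw, 0x08)
    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "revision": revision,
        "prog_if": prog_if,
        "subclass": subclass,
        "base_class": base_class,
        # Bit 7 of the header type marks a multifunction device.
        "multifunction": (raw[0x0E] >> 7) & 1,
    }


def scan(read_config: Callable[[int, int, int], bytes],
         buses: Iterable[int] = range(256)) -> Iterator[Tuple[Address, Dict[str, int]]]:
    """Yield ((bus, device, function), header) for every function that responds."""
    for bus in buses:
        for dev in range(32):
            for fn in range(8):
                info = parse_header(read_config(bus, dev, fn))
                if info is None:
                    if fn == 0:
                        break  # no function 0: the slot is empty
                    continue
                yield (bus, dev, fn), info
                if fn == 0 and not info["multifunction"]:
                    break  # single-function device
```

Comparing the result of such a scan with a stored reference list is one simple way to notice a card that has stopped responding.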

Debug

• POST code

• PCI scan

• DMI information

• Sensor data (temperature, acoustics, voltage)

• Test programs

  - Memory

  - CPU

  - Hard disc

CIA-Prototype

Implemented features:

• Power control (computer)

• USB device mock-up

• Full memory access (PCI bus)

• PCI scans

• Video card functionality

• VNC server (text mode)

• Web control
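
For the text-mode VNC server, the screen contents have to be turned from video memory into characters. The sketch below assumes the standard 80x25 VGA text layout, in which each cell is two bytes (a code page 437 character followed by an attribute byte). It is only an illustration of that layout, not the prototype's firmware, and `decode_text_screen` is a hypothetical name.

```python
COLS, ROWS = 80, 25   # assumed standard VGA text mode
BYTES_PER_CELL = 2    # character code, then attribute (colour) byte


def decode_text_screen(buffer: bytes) -> list:
    """Return the screen as a list of ROWS strings, each COLS characters wide."""
    expected = COLS * ROWS * BYTES_PER_CELL
    if len(buffer) < expected:
        raise ValueError(f"need {expected} bytes, got {len(buffer)}")
    # Every second byte is a character; the bytes in between are attributes.
    chars = bytes(buffer[0:expected:2])
    text = chars.decode("cp437")
    return [text[row * COLS:(row + 1) * COLS] for row in range(ROWS)]
```

A server built this way would only need to resend rows whose text changed since the last update, which keeps the traffic on the service network small.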

Control Application

Outlook

• PCI multifunction device / PCI bridge

• Emulating IDE controller

• Analyse hard disc acoustics

• SVGA support

• Automatic system installation

Summary

Independent of operating system

Own network interface

VGA functionality

Direct hardware control (PCI, USB, Power)

Simple and reliable system (no moving components)

Independent of computer architecture