NWU and HPC
High Performance Computing
Attie Juyn & Wilhelm van Belkum
Agenda
The birth of an HPC…
• Part A: management perspective
• Part B: technical perspective
Background
• Various departmental compute clusters
• A flagship project at the CHPC
• Fragmented resources and effort
At last year’s conference, our vision was ….
To establish an Institutional HPC
• Level 1 (entry level): personal workstation (~10 GFlops)
• Level 2: departmental compute cluster (40-100 GFlops)
• Level 3: institutional HPC (1-2 TFlops)
• Level 4: national/international HPC (>2 TFlops)
University Strategy
• Increased focus on research: to develop into a balanced teaching-learning & research university
• As a result of the merger, a central IT department
The Challenge: to innovate
• Sustainability: HPC must be a service, not a project or experiment
  – Funding model must enable constant renewal
  – Support model with clear responsibilities
• Reliability: redundant design principles (DR capability)
  – 24x7x365 (not 99.99%)
• Availability: standardised user interface (not root)
  – Equally accessible on all campuses
• Efficiency: power, cooling, etc.
HPC (IT) success criteria
Sustainability, Efficiency, Reliability, Availability (the key issues of this decade) & Performance
Enabling factors
• A spirit of co-operation: key researchers & IT agreeing on what should be done
• A professional, experienced IT team supporting ±200 servers in 4 distributed data centers
• A well managed, state-of-the-art infrastructure resulting from the merger period
• Management trust & commitment
• International support & connections: networks, grids, robust & open software
Project milestones
• March 2007: first discussions & documentation of vision
• April 2007: budget compilation and submission
• 27 November 2007: project and budget approved
• December 2007: CHPC Conference, tested our vision
• 17 March 2008: Dr Bruce Becker visits Potchefstroom (first discussions of gLite, international & SA grids)
• 18 March 2008: grid concept presented to IT Directors
• May 2008: established POC cluster, testing software
• June-October 2008: recruitment & training of staff
• July 2008: Grid Conference at UCT & SA Grid initiation
• August-September 2008: detailed planning & testing
• October 2008: tenders & ordered equipment
• November 2008 - January 2009: implementation
Management principles
• A dedicated research facility (not for general computing)
• To serve researchers in approved research programmes of all three campuses
• Implemented, maintained and supported by Institutional IT (IT should do the IT)
• Configured to international standards & best practice (to be shown later)
• Parallel applications only
• Usage governed by an institutional and representative governance body
• Sustainability subject to acceptable ROI (to justify future budgets)
The New World Order
Mainframe, Mini Computer, PC, Clusters & Grids, Vector Supercomputer
(Source: 2006 UC Regents)
Technical goals
Build an Institutional High Performance Computing facility, based on Beowulf cluster principles, coexisting with and linking to the existing departmental clusters and the national and international computational grids.
Beowulf cluster
• The term "Beowulf cluster" refers to a cluster of workstations (usually Intel architecture, but not necessarily) running some flavor of Linux that is utilized as a parallel computation resource.
• The main idea is to use commodity, off-the-shelf computing components with Open Source software to create a networked cluster of workstations.
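To make the "commodity hardware plus open-source software" idea concrete, here is a minimal sketch of the kind of MPI program such a cluster runs in parallel. It assumes only that one of the MPI implementations named later in the software landscape (e.g. MPICH or LAM) is installed; the file name hello_mpi.c is purely illustrative.

/* hello_mpi.c - minimal MPI example for a Beowulf-style cluster.
   Illustrative only; assumes an MPI implementation (e.g. MPICH or LAM)
   is installed. Compile with mpicc, launch with mpirun. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                   /* start the MPI runtime       */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);     /* this process's rank         */
    MPI_Comm_size(MPI_COMM_WORLD, &size);     /* total number of processes   */
    MPI_Get_processor_name(name, &name_len);  /* which cluster node we're on */

    printf("Hello from rank %d of %d on node %s\n", rank, size, name);

    MPI_Finalize();                           /* shut the runtime down       */
    return 0;
}

Typical usage: compile with mpicc hello_mpi.c -o hello_mpi, then run mpirun -np 16 ./hello_mpi to spread 16 processes across the compute nodes.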
History of Clusters - The first Beowulf
• 07/2002: design system
• 08/2002 to 11/2002: build system
• 03/2003: system in production
• 7-8 months from concept to production
• Moore's Law: 18-month half-life of performance and cost (see the worked reading below); useful life 3-4 years
(Source: 2006 UC Regents)
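A rough worked reading of the 18-month figure, assuming price/performance doubles every 18 months (an assumption, not a number from the slide):

\[
P_{\text{rel}}(t) = 2^{-t/18\,\text{months}}, \qquad
P_{\text{rel}}(36\,\text{months}) = 2^{-2} = 0.25, \qquad
P_{\text{rel}}(48\,\text{months}) = 2^{-8/3} \approx 0.16
\]

So after the quoted 3-4 year useful life, a cluster delivers only roughly 16-25% of the price/performance of then-current hardware, which is why the funding model must enable constant renewal.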
The Evolved Cluster
Figure: an evolved cluster built around a Resource Manager and Scheduler, with a Job Queue, Compute Nodes, a Myrinet interconnect, and License, Identity and Allocation Managers, serving Admin and User roles; a separate Resource Manager/Scheduler pair fronts the Departmental Cluster.
(Source: Cluster Resources, Inc.)
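The figure only names these components. As a purely conceptual sketch (not the actual Torque/MOAB or any vendor implementation), the division of labour between the resource manager, job queue and scheduler can be illustrated like this, with all names and numbers hypothetical:

/* toy_scheduler.c - conceptual sketch of a cluster workload manager.
   Hypothetical simplification: real systems (Torque + MAUI/MOAB, SLURM, ...)
   add priorities, backfill, reservations, accounting and fault handling. */
#include <stdio.h>

#define NODES 4
#define JOBS  6

typedef struct { int id; int nodes_needed; } Job;

/* "Resource manager": which compute nodes are currently free. */
static int node_free[NODES] = {1, 1, 1, 1};

static int free_nodes(void)
{
    int n = 0;
    for (int i = 0; i < NODES; i++) n += node_free[i];
    return n;
}

static void allocate(int count, int job_id)
{
    for (int i = 0; i < NODES && count > 0; i++) {
        if (node_free[i]) {
            node_free[i] = 0;
            count--;
            printf("job %d -> node %d\n", job_id, i);
        }
    }
}

int main(void)
{
    /* "Job queue": submitted jobs waiting to run (FIFO order). */
    Job queue[JOBS] = {{1,1},{2,2},{3,1},{4,4},{5,1},{6,2}};

    /* "Scheduler": start each job only when enough nodes are free;
       this toy version simply stops at the first job that does not fit. */
    for (int j = 0; j < JOBS; j++) {
        if (queue[j].nodes_needed <= free_nodes()) {
            allocate(queue[j].nodes_needed, queue[j].id);
        } else {
            printf("job %d waits (%d node(s) needed, %d free)\n",
                   queue[j].id, queue[j].nodes_needed, free_nodes());
            break;
        }
    }
    return 0;
}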
Cluster and Grid software landscape
• Users and Admins access the stack via a Portal, CLI, GUI or Application
• Applications: parallel (MPI, PVM, LAM, MPICH) and serial
• Grid middleware: GLOBUS, CROWN Grid, gLite, UNICORE (USA, Chinese, EGEE and EU initiatives)
• Grid workload manager: scheduler, policy manager, integration platform
• Cluster workload manager: scheduler, policy manager, integration platform (LoadLeveler, PBSpro, PBS, SGE, Condor(G), LSF, SLURM, Nimrod, MOAB, MAUI)
• Resource manager / provisioning: Rocks, Oscar, Torque
• Operating system: CentOS, Scientific Linux, RedHat, Solaris, UNICOS, AIX, Windows, Mac OS X, HP-UX, other
• Hardware (cluster or SMP)
• Security spans all layers
Departmental Compute Cluster
CHPC (May 2007)
"iQudu" (isiXhosa name for kudu), "Tshepe" (Sesotho name for springbok) and Impala
• 160-node Linux cluster
• Each node with 2 × dual-core AMD Opteron 2.6 GHz Rev F processors and 16 GB of RAM
• InfiniBand 10 Gb/s cluster interconnect
• 50 TB SAN
• 640 processing cores (2.5 TFlops)
• 2x IBM p690 with 32 x 1.9GHz Power4+ CPUs
• 32GB of RAM each
The #1 and #13 in the world (2007)
• BlueGene/L - eServer Blue Gene Solution (IBM, 212,992 Power cores), DOE/NNSA/LLNL - USA: 478.2 trillion floating-point operations per second (teraflops) on LINPACK
• MareNostrum - BladeCenter JS21 cluster, PPC 970 2.3 GHz, Myrinet (IBM, 10,240 Power cores), Barcelona Supercomputing Centre - Spain: 63.83 teraflops
The #4 and #40 in the world (2008)
As of November 2008, #1: Roadrunner
• Roadrunner - BladeCenter QS22/LS21 cluster, 12,240 × PowerXCell 8i 3.2 GHz + 6,562 × dual-core Opteron 1.8 GHz
• DOE/NNSA/LANL - United States
• 1.105 petaflops
Reliability & Availability of HPC
HPC (IT) success criteria
Sustainability, Efficiency, Reliability, Availability (the key issues of this decade) & Performance
Introducing - Utility Computing
• Grid workload manager: Condor, MOAB
• Data Center (resource manager) ↔ HPC (resource manager)
• First phase: swapping & migrating of hardware
• Second phase: dynamic load shifting at the resource manager (RM) level
(The grid/cluster software stack figure from the "Cluster and Grid software landscape" slide is repeated here: hardware, operating system, resource managers, cluster and grid workload managers, parallel/serial applications, security, and user/admin access.)
• HP BL460c: 8 × 3 GHz Xeon cores, 12 MB L2, 1333 MHz FSB, 10 GB memory (96 GFlops)
• HP BL2x220c: 16 × 3 GHz Xeon cores (192 GFlops)
• HP c7000 enclosure: up to 16 BL460c (1.536 TFlops) or up to 16 BL2x220c (3.072 TFlops)
• HP Modular Cooling System G2: up to 4 HP c7000; 512 CPU cores / 5.12 TFlops with BL460c, 1024 CPU cores / 12.288 TFlops with BL2x220c
• HP BLc Virtual Connect Ethernet
• D-Link xStack DSN-3200: 10.5 TB RAID 5, 80,000 I/O operations per second
HP ProLiant BL460c
• Processor: up to two dual- or quad-core Intel Xeon processors
• Memory: FBDIMM 667 MHz, 8 DIMM slots, 32 GB max
• Internal storage: 2 hot-plug SFF SAS HDDs, standard RAID 0/1 controller with optional BBWC
• Networking: 2 integrated multifunction Gigabit NICs
• Mezzanine slots: 2 mezzanine expansion slots
• Management: Integrated Lights-Out 2 Standard Blade Edition
BL460c Internal View
• Embedded Smart Array controller integrated on the drive backplane
• 8 fully buffered DIMM slots, DDR2 667 MHz
• Two mezzanine slots: one x4, one x8
• Two hot-plug SAS/SATA drive bays
• Mezzanine options: QLogic QMH2462 2-port 4 Gb FC HBA; NetXen NC512m 2-port 10GbE-KX4; Mellanox 2-port 4X DDR (20 Gb) InfiniBand
HP ProLiant BL2x220c G5
• Processor: up to two dual- or quad-core Intel Xeon processors per board
• Memory: registered DDR2 (533/667 MHz), 4 DIMM sockets per board, 16 GB max (with 4 GB DIMMs)
• Internal storage: 1 non-hot-plug SFF SATA HDD per board
• Networking: 2 integrated Gigabit NICs per board
• Mezzanine slots: 1 PCIe mezzanine expansion slot (x8, Type I) per board
• Management: Integrated Lights-Out 2 Standard Blade Edition
• Density: 32 server blades in a 10U enclosure; 16 server blades in a 6U enclosure; 2 blades per half-height enclosure bay
HP ProLiant BL2x220c G5 Internal View
• Two mezzanine slots: two x8 (both reside on the bottom board)
• 2 × optional SATA HDDs
• Top and bottom PCA, side by side
• 2 × 2 CPUs
• 2 × 4 DIMM slots, DDR2 533/667 MHz
• 2 × embedded 1 Gb Ethernet dual-port NICs
• Server board connectors
Servers and other racked equipment
• Half-height blade server: up to 16 per 10U enclosure
• Max. capacity: HP Modular Cooling System G2 with up to 4 HP c7000, 1024 CPU cores, 12.288 TFlops
NWU HPC Hardware Spec.
• 16 × dual quad-core Intel Xeon E5450
  – 3 GHz CPU, 12 MB L2, 1333 MHz FSB, 80 W power
  – 16 × HP BL460c
  – 10 GB memory per node
  – HP c7000 enclosure
  – HP Modular Cooling System G2 (MCS G2)
  – D-Link iSCSI DSN-3200 (20 TB disk)
• 16 × dual quad-core Intel Xeon E5450
  – 3 GHz CPU, 12 MB L2, 1333 MHz FSB, 80 W power
  – 8 × HP BL2x220c
  – 10 GB memory per node
  – HP c7000 enclosure
  – HP Modular Cooling System G2 (MCS G2)
  – D-Link iSCSI DSN-3200 (20 TB disk)
• 32 × 8 × 3 GHz × 4 = 3.072 TFlops (256 cores)
• 32 × 10 GB = 320 GB memory
• 2 × 10 TB storage
• Gigabit Ethernet interconnect: 42.23 microseconds latency (InfiniBand ≈ 4 microseconds)
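The 3.072 TFlops figure follows from peak FLOPS = nodes × cores per node × clock × FLOPs per cycle. The sketch below reproduces the arithmetic; the 4 FLOPs per cycle per core is an assumption (2 SSE units × 2 double-precision FLOPs, typical for Xeon E5450 cores), not a number stated on the slide.

/* peak_flops.c - back-of-envelope theoretical peak for the NWU cluster.
   The flops_cycle factor is an assumption, not taken from the slide. */
#include <stdio.h>

int main(void)
{
    const double nodes       = 32.0;  /* 16 x BL460c + 8 x BL2x220c (2 nodes per blade) */
    const double cores       = 8.0;   /* dual quad-core Xeon E5450 per node             */
    const double clock_ghz   = 3.0;   /* 3 GHz                                          */
    const double flops_cycle = 4.0;   /* assumed SSE double-precision throughput        */

    double peak_gflops = nodes * cores * clock_ghz * flops_cycle;
    printf("Theoretical peak: %.3f TFlops over %.0f cores\n",
           peak_gflops / 1000.0, nodes * cores);
    return 0;
}

This prints "Theoretical peak: 3.072 TFlops over 256 cores", matching the slide's figure.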
NWU HPC/Grid
• Campus grid, connected via the university wide area network / Internet
• Total of 45 Mbps, of which 34.2 Mbps international
• Links: Telkom, SANREN
SANREN Vision and the Players
Players: InfraCo, SEACOM, SA-Grid, CHPC, NWU, C4, UOVS
TE-North is a new cable currently being laid across the Mediterranean Sea: cable laying to start Oct. '08, final splicing April '09, service launch June '09.
International Grid
High Performance Computing @ NWU - roll-out timeline (2008-2009)
• 28 November 2008: HPC
• 30 June 2009: Campus grid
• 29 November 2009: National grid
• 20 December 2009: International grid
High Performance Computing @ North-West University
Scientific Linux
Sustainable, Efficient, Reliable, High Availability & Performance @ >3 TFlops