NEC and CSCS: A First Ivy Bridge Deployment
Jason Temple, Luc Corbeil, HPC Solutions, CSCS
What is CSCS
Office building
• 5 floors
• 2’600 m²
• Minergie standard
• Offices for 55 people
• Two conference rooms
CSCS in a nutshell
• An autonomous unit of the Swiss Federal Institute of Technology in Zurich (ETH Zurich)
  – Founded in 1991 in Manno
  – Relocated to Lugano in 2012
• Develops and promotes technical and scientific services
  – for the Swiss research community in the field of high-performance computing
• Enables world-class scientific research
  – by pioneering, operating and supporting leading-edge supercomputing technologies
Management of Computing Resources
• Two groups of highly skilled HPC system administrators
• National Systems
  – Systems operated for the benefit of the Swiss scientific community
  – Time allocated through calls for proposals with peer review
• HPC Solutions
  – Hosting and management of HPC resources for Swiss organizations
  – Current partners:
    – MeteoSwiss (24/7 operational weather forecasting)
    – EPFL/BlueBrain
    – CHiPP (processing of data from the LHC)
New project for HPC Solutions
• Community of ETHZ professors needed HPC resources
  – Theoretical physics
  – Seismology
  – Polymer physics
  – Astrophysics
  – Molecular dynamics
• CSCS acts as a broker
  – Pooling of resources
  – Reduced overhead for each professor
  – Provider of a turnkey solution
Requirements
• Common analysis of the requirements
  – Scientific
    – Subset of applications: Gromacs, VASP, AMR, SSE
  – Technical
    – Facilities constraints
    – Cluster design
    – Reliability, Availability, Serviceability (RAS)
  – Best solution within budget
Solution
• NEC’s offer was selected
  – Full competitive WTO process
• First Intel Ivy Bridge cluster in Switzerland
  – Maximum power consumption was lower than expected
  – Throughput per node improved
System Hardware Specifications
• 340 standard compute nodes with two sockets and 20 cores each
  – Two Intel Ivy Bridge EP E5-2660 v2 @ 2.2 GHz
  – 312 + 4 nodes with 32 GB (DDR3, 1600 MHz), 24 nodes with 64 GB
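As a quick aggregate check (my arithmetic, not stated on the slide): 340 nodes × 20 cores = 6,800 cores in total, and (312 + 4) × 32 GB + 24 × 64 GB = 11,648 GB ≈ 11.6 TB of combined main memory.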
System Hardware Specifications
• FDR fat tree, fully non-blocking, Mellanox SX6036
• >300 TB NEC LXFS v3 (based on Lustre 2.3), >15 GB/s
  – Built from NEC SNA260-FS building blocks
• <100 kW under typical load
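For scale (a rough estimate, not from the slides, assuming the figure covers the whole system): 100 kW spread over 340 compute nodes works out to about 100,000 / 340 ≈ 294 W per node, a budget that also has to cover the interconnect and storage.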
System Management, Compilers and Tools
• LX C3 (LX Cluster Command and Control)
  – Node provisioning
  – Cluster monitoring
• Slurm scheduler (installed and managed by CSCS)
• Intel Cluster Studio (a minimal usage sketch follows this list)
  – Intel C, C++ and Fortran compilers
  – OpenMP version 3
  – Intel MPI (MPI 2.2 for C/C++ and Fortran)
  – Mathematical libraries
  – Intel VTune XE for Linux
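To make the programming environment concrete, here is a minimal hybrid MPI + OpenMP “hello world” in C, the kind of code this stack targets. It is an illustrative sketch, not taken from the deck: the file name, the ranks-per-node/threads-per-rank layout, and the exact flags are my assumptions (older Intel compilers spell the OpenMP flag -openmp, newer ones -qopenmp).

    /* hello_hybrid.c — hypothetical MPI + OpenMP sketch (not from the slides).
     *
     * Compile with the Intel MPI wrapper for the Intel C compiler, e.g.:
     *     mpiicc -openmp hello_hybrid.c -o hello_hybrid
     * Submit through Slurm, e.g. two ranks per two-socket node,
     * ten threads per rank (one rank per ten-core socket):
     *     export OMP_NUM_THREADS=10
     *     srun -N 2 --ntasks-per-node=2 --cpus-per-task=10 ./hello_hybrid
     */
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each MPI rank spawns an OpenMP team; every thread reports
         * its position in the rank/thread grid. */
        #pragma omp parallel
        {
            printf("rank %d/%d, thread %d/%d\n",
                   rank, size,
                   omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

With two MPI ranks per node and ten OpenMP threads per rank, one process maps naturally onto each ten-core Ivy Bridge socket, matching the node layout described in the hardware specifications.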
NEC: a solid HPC partner
• Very professional approach
  – Detailed installation planning
  – Plan followed with only minor variations
  – No bad surprises
• Great benchmarking expertise
  – Prudent, conservative commitments
  – All performance commitments were exceeded
• Efficient support
  – Personalized attention
  – Proactive replacement of suspect hardware
Conclusion: Success Story!
Thank you for your attention. Questions?