Singapore Students to Compete in

SC14

by Team NUS

7 Oct 2014

Contents

• SC14 Student Cluster Competition

• Team Formation

• Training

• Exciting Hardware

• Advanced Software

• Challenges

• Knowledge and Experience

SC14 Student Cluster Competition

• 17 Nov – 19 Nov, New Orleans, Louisiana, USA

• 48 Hours

• 12 University Teams

• 3120 W power limit

• 4 Parallel Applications

• Fastest system wins
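The 3120 W cap is the central design constraint: it fixes how many nodes a team can field. A minimal sketch of the budgeting arithmetic, with purely illustrative per-component wattages (assumptions, not Team NUS's measured figures):

```python
# Rough power-budget check for a cluster under the SC14 3120 W cap.
# Every per-component wattage below is an illustrative assumption,
# not a measured value from Team NUS's machine.

POWER_CAP_W = 3120  # hard limit from the competition rules

node = {
    "cpus": 2 * 145,   # two server CPUs at ~145 W TDP each (assumed)
    "gpu": 235,        # one accelerator board (assumed)
    "memory": 60,      # DIMMs (assumed)
    "overhead": 100,   # fans, disks, board, PSU losses (assumed)
}
node_watts = sum(node.values())

switch_watts = 130  # interconnect switch (assumed)

# Integer division: partial nodes do not exist
max_nodes = (POWER_CAP_W - switch_watts) // node_watts
print(f"Per-node draw: {node_watts} W, max nodes under cap: {max_nodes}")
```

With these assumed numbers the budget allows only a handful of nodes, which is why teams obsess over power efficiency per component.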

The Students

6 undergraduate students from National University of Singapore

• Chen Liang (Yr 4 Comp Engineering)

• David Heryanto (Yr 4 Comp Science)

• Ho Wei Xiong (Yr 5 Math + Comp Science double degree)

• Liu Jin Frank (Yr 4 Comp Science)

• Li Yin (Yr 4 Comp Science)

• Yu Fangzhou (Yr 4 Comp Science)

The Mentors

• Prof. Deng Yuefan (NUS Visiting Professor)

• Kevin Siswandi (A*CRC)

• Jonathan Low (A*CRC)

• Special Thanks:

• Dr. Marek Michalewicz (A*CRC)

• Dr. Tan Tin Wee (A*CRC - NUS)

The Sponsors

• A*CRC (hardware + logistics)

• Intel (Servers and CPUs)

• IBM (Power8)

• NVIDIA (GPUs)

• Samsung (Memory Cards)

Training Intensity Timeline

Figure 1: Weekly training hours, April to September (0 to 15 hours per week)

Training Scope

• Hardware Architecture Design
  • Component specs analysis
  • Power efficiency analysis

• Application Learning and Testing
  • Understanding the scientific theory behind competition applications
  • Application fine-tuning
  • Run-time and speed-up comparison
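The run-time and speed-up comparisons above boil down to two ratios. A small sketch (the timings are made-up illustrative numbers, not Team NUS's measurements):

```python
# Speed-up and parallel efficiency from measured run times,
# the kind of comparison done while tuning competition applications.
# The timings dictionary holds made-up illustrative numbers.

def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, cores):
    """Speed-up divided by core count: 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / cores

timings = {1: 400.0, 4: 110.0, 16: 35.0, 28: 24.0}  # cores -> seconds (assumed)
t1 = timings[1]
for cores, t in sorted(timings.items()):
    print(f"{cores:3d} cores: speed-up {speedup(t1, t):5.2f}, "
          f"efficiency {efficiency(t1, t, cores):.0%}")
```

Falling efficiency at higher core counts is the usual signal that communication or serial sections are starting to dominate.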

Training Scope

• System setup
  • Operating system
  • Internet sharing
  • Remote access
  • Password-free SSH
  • File system sharing
  • Networking through TCP and InfiniBand
  • MPI
  • CPU frequency tuning
  • Job scheduling
  • Hardware management system
  • System monitoring tools
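On the monitoring side, even a few lines of scripting go a long way during a 48-hour run. A bare-bones sketch of a node monitor using only the Python standard library (a toy example, not the team's actual tooling; `/proc/meminfo` is Linux-specific):

```python
# Minimal node monitor: report load average and memory use.
# Standard library only; /proc/meminfo exists only on Linux,
# so meminfo_kb() degrades gracefully elsewhere.
import os

def load_average():
    # 1-, 5-, and 15-minute load averages
    return os.getloadavg()

def meminfo_kb():
    # Parse /proc/meminfo into {field: kB}; empty dict if unavailable
    if not os.path.exists("/proc/meminfo"):
        return {}
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            field, value = line.split(":")
            info[field] = int(value.split()[0])
    return info

if __name__ == "__main__":
    one, five, fifteen = load_average()
    mem = meminfo_kb()
    total = mem.get("MemTotal", 0)
    used = total - mem.get("MemFree", 0)
    print(f"load {one:.2f}/{five:.2f}/{fifteen:.2f}, "
          f"mem used {used // 1024} MiB of {total // 1024} MiB")
```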

A glimpse of the supercomputer speed

Figure 1: 4-core notebook running a parallel program
Figure 2: 28-core HPC server running a parallel program

A glimpse of the supercomputer speed

Figure 1: ADCIRC sample input time (s) against cores, test performed by Team NUS on an SGI server
Figure 2: NAMD test performed by Team NUS on an SGI server

Working with today’s best technology

Figure 1: IBM Power S822L server (Available since June 2014)

• Processors
  • Two 10-core processors (3.42 GHz)
  • Each core supports up to 8 threads
  • Processor-to-memory bandwidth: 192 GB/s per socket
  • 512 KB L2 cache per core
  • 8 MB L3 cache per core
  • 16 MB L4 cache per socket

• Memory
  • Up to 1 TB

• OS
  • Linux (RHEL 7)

• Supported compilers
  • XL compilers (optimal)
  • GNU compilers

• Supported math libraries
  • ESSL (optimal)
  • OpenBLAS

• MPI
  • Open MPI

Working with today’s best technology

Figure 1: Intel S2600WTT Server (Available since Q4 2014)

• Processors
  • Two 14-core Xeon E5-2697 v3 CPUs (2.6 GHz)
  • 35 MB cache

• Memory
  • DDR4
  • 24 DIMM slots, up to 3 TB (using 128 GB DIMMs)

• OS
  • Linux (CentOS 7)

Working with today's best technology

• Supported compilers
  • Intel compilers (optimal)
  • GNU compilers

• Supported math libraries
  • MKL (optimal)
  • OpenBLAS

• MPI
  • Intel MPI (optimal)
  • Open MPI

Working with today’s best technology

Figure 1: ADCIRC sample input time (s) against cores, performed by Team NUS on the IBM Power S822L and an Intel server with E5-2697 v3 (Haswell) CPUs

Figure 2: LINPACK benchmark, N = 10000, on a single node over Ethernet and InfiniBand, performed by Team NUS (data labels: 1006.556 and 540.317 GFlops)

Working with today’s best technology

Figure 1: NVIDIA Tesla K40 GPU (passive)

• Performance
  • Peak double-precision floating-point performance: 1.43 Tflops
  • Peak single-precision floating-point performance: 4.29 Tflops

• Memory
  • 12 GB GDDR5, bandwidth 288 GB/s
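The peak-Tflops figures follow directly from unit counts and clock speed. A sketch using the commonly published K40 configuration (core counts and the 745 MHz base clock are taken from public spec sheets, not from this document, so verify them against NVIDIA's board specification in the references):

```python
# Where the K40's peak-Tflops numbers come from:
# peak = units x 2 (a fused multiply-add counts as 2 flops) x clock.
# 2880 CUDA cores, 960 DP units and 745 MHz base clock are the
# commonly published K40 figures (assumption: check the board spec).

CLOCK_HZ = 745e6          # base clock
SP_CORES = 2880           # single-precision CUDA cores
DP_UNITS = 960            # double-precision units (1/3 of SP)
FLOPS_PER_CYCLE = 2       # one FMA per cycle = 2 floating-point ops

sp_peak = SP_CORES * FLOPS_PER_CYCLE * CLOCK_HZ
dp_peak = DP_UNITS * FLOPS_PER_CYCLE * CLOCK_HZ
print(f"SP peak: {sp_peak / 1e12:.2f} Tflops, DP peak: {dp_peak / 1e12:.2f} Tflops")
```

The result matches the 4.29 / 1.43 Tflops quoted above, which is a useful sanity check when comparing cards.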

Working with today’s best technology

Figure 1: NAMD test performed by Team NUS with and without GPU

Figure 2: LINPACK benchmark on 2 nodes, performed by Team NUS with and without GPU acceleration

Working with today’s best technology

Figure 1: Mellanox InfiniBand cards, cables and switch

Working with today’s best technology

Figure 2: Graph 500 benchmark on 2 nodes (Scale = 15, Edgefactor = 20, NBFS = 64), run-time comparison in seconds between Ethernet and InfiniBand, performed by Team NUS
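For context on the figure's parameters: Scale and Edgefactor fix the problem size, and Graph 500 scores are usually quoted in traversed edges per second (TEPS). A sketch of that arithmetic (the total search time is an assumed illustrative number, not a Team NUS result):

```python
# Graph 500 problem size and TEPS from the parameters in the figure.
# The 12.5 s total search time is an illustrative assumption.

SCALE = 15        # graph has 2**SCALE vertices
EDGEFACTOR = 20   # average edges per vertex
NBFS = 64         # number of breadth-first searches timed

vertices = 2 ** SCALE
edges = EDGEFACTOR * vertices

mean_bfs_seconds = 12.5 / NBFS          # assumed total time over 64 searches
teps = edges / mean_bfs_seconds          # traversed edges per second
print(f"{vertices} vertices, {edges} edges, ~{teps / 1e6:.1f} MTEPS")
```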

Advanced scientific applications

• ADCIRC (Advanced Circulation Model Framework)

• NAMD (Nanoscale Molecular Dynamics Program)

• MATLAB Seismic Data Analysis Application

ADCIRC

• Simulate water elevation changes over time.

Figure 1: Gulf of Mexico 2D mesh grid
Figure 2: Gulf of Mexico water elevation during Hurricane Isabel
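ADCIRC itself solves the shallow-water equations on unstructured meshes; as rough intuition for what such a solver does, here is a toy 1-D linearised shallow-water step (all parameters assumed, and this finite-difference scheme is a stand-in, not ADCIRC's numerics):

```python
# Toy 1-D linearised shallow-water model: a Gaussian elevation bump
# splits into two outgoing waves. Explicit finite differences on a
# staggered grid; not ADCIRC's method, just the same physical idea.
import math

G, DEPTH = 9.81, 10.0                  # gravity, still-water depth (assumed)
N, DX = 200, 100.0                     # grid cells, spacing in metres
DT = 0.5 * DX / math.sqrt(G * DEPTH)   # time step safely under the CFL limit

# Initial elevation: a 1 m Gaussian bump mid-domain; water at rest
eta = [math.exp(-((i - N // 2) * DX / 500.0) ** 2) for i in range(N)]
u = [0.0] * N

def step(eta, u):
    # u[i] sits between eta[i] and eta[i+1]; closed boundaries
    for i in range(N - 1):
        u[i] -= G * DT / DX * (eta[i + 1] - eta[i])
    for i in range(1, N):
        eta[i] -= DEPTH * DT / DX * (u[i] - u[i - 1])
    return eta, u

for _ in range(100):
    eta, u = step(eta, u)
print(f"max elevation after 100 steps: {max(eta):.3f} m")
```

The CFL-limited time step is the same constraint that makes real storm-surge runs expensive: finer meshes force smaller time steps.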

NAMD

• Simulate particle movement and energy change over time.

Figure 1: Ubiquitin in a water box and in a water sphere. Hydrogen atoms are colored black for contrast.
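NAMD integrates Newton's equations of motion for millions of atoms; the core idea can be shown with a velocity-Verlet step for a single particle in a harmonic well (a toy sketch with assumed units, not NAMD's integrator):

```python
# Velocity-Verlet integration, the workhorse of molecular dynamics,
# demonstrated on one particle in a harmonic potential. Good
# integrators keep total energy nearly constant over long runs.

K, M, DT = 1.0, 1.0, 0.01   # spring constant, mass, time step (assumed units)

def force(x):
    return -K * x

def verlet_step(x, v):
    a = force(x) / M
    x = x + v * DT + 0.5 * a * DT * DT   # advance position
    a_new = force(x) / M
    v = v + 0.5 * (a + a_new) * DT       # advance velocity with averaged force
    return x, v

def energy(x, v):
    return 0.5 * M * v * v + 0.5 * K * x * x

x, v = 1.0, 0.0
e0 = energy(x, v)
for _ in range(10_000):
    x, v = verlet_step(x, v)
print(f"relative energy drift: {abs(energy(x, v) - e0) / e0:.2e}")
```

The near-zero drift after 10,000 steps is why Verlet-family integrators are standard in MD codes.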

MATLAB Seismic Data Analysis Application

• Use seismic wave signal to find underground surface topology.
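The MATLAB application's analysis is far richer, but the basic geometry can be shown in a toy calculation: for a horizontal reflector, two-way travel time and wave speed give depth directly (all values below are illustrative assumptions):

```python
# Simplest seismic inference: depth of a horizontal reflector from
# the two-way travel time of a reflected wave. depth = v * t / 2
# because the wave travels down and back up. Values are assumed.

def reflector_depth(two_way_time_s, velocity_m_s):
    """Depth (m) of a horizontal reflector from two-way travel time."""
    return velocity_m_s * two_way_time_s / 2.0

# e.g. a reflection arriving after 1.2 s in 2000 m/s rock (assumed)
print(f"reflector depth: {reflector_depth(1.2, 2000.0):.0f} m")
```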

Challenges

• Hardware
  • Unfamiliar with HPC hardware
  • Lots of terms and jargon to figure out
  • Lack of online documentation for the latest hardware
  • Need to troubleshoot hardware issues
  • Steep learning curve for setting up the cluster and installing/compiling the correct software and middleware

• Competition applications
  • Need to understand the input/output data formats
  • Need to understand the workflow of each application
  • Need to understand configuration parameters when compiling and running applications
  • Need to debug compilation and runtime errors
  • Lots of manual testing that needs to be automated

Knowledge and Experience

• Parallel computing theory

• Scientific application models

• Hardware specs and performance measure

• System setup, backup, configuration, communication

• Advanced Linux usage

Thanks

Q&A

References

• Power S822L specs

• http://www-03.ibm.com/systems/sg/power/hardware/s812l-s822l/

• Intel S2600WTT specs

• http://ark.intel.com/products/82156/Intel-Server-Board-S2600WTT

• Tesla K40 specs

• http://www.nvidia.com/content/PDF/kepler/Tesla-K40-PCIe-Passive-Board-Spec-BD-06902-001_v05.pdf

• Images

• http://www.112it.pl/_categoryPhoto/24614.jpg

• http://exxactcorp.com/uploads/product/77b1531b36b4b77b613daa85292092b7.jpg

• http://www.storagereview.com/images/StorageReview-Mellanox-InfiniBand.jpg

• http://adcirc.org/home/documentation/example-problems/hurricane-isabel-example/

• http://www.ks.uiuc.edu/Training/Tutorials/namd/namd-tutorial-win-html/node8.html

Student Cluster Competition: A*CRC Story

Summary and Take-Aways

Acknowledgements

• People: Dr Marek Michalewicz, Prof. Yuefan Deng, Prof. Tan Tin Wee, Stephen Wong, Lim Ching Kwang, Dr Jonathan Low, Paul Hiew Ngee Heng, Dr Dominic Chien, Dr Michael Sullivan, Dr Liou Sing Wu, Dr Gabriel Noaje, Nebojsa Novakovic, A*CRC Comp. System Grp, A*CRC Operations Grp
• Sponsors: SGI, Micron, NVIDIA, Intel, Mellanox, HP, IBM, TechSource, 3M