Getting Started on Topsail Charles Davis ITS Research Computing February 10, 2010.


Outline

History of Topsail

Structure of Topsail

File Systems on Topsail

Compiling on Topsail

Topsail and LSF

Initial Topsail Cluster

Initially: 1040 CPU Dell Linux Cluster
• 520 dual socket, single core nodes

Infiniband interconnect

Intended for capability research

Housed in ITS Franklin machine room

Fast and efficient for large computational jobs

Topsail Upgrade 1

Topsail upgraded to 4,160 CPUs
• Replaced blades with dual-socket, quad-core blades

Intel Xeon 5345 (Clovertown) processors
• Quad-core, with 8 CPUs per node

Increased the number of processors but decreased individual processor speed (was 3.6 GHz, now 2.33 GHz)

Decreased energy usage and the resources needed for the cooling system

Summary: slower clock speed, better memory bandwidth, less heat
• Benchmarks tend to run at the same speed per core
• Topsail shows a net ~4X improvement
• Of course, this number is VERY application dependent

Topsail – Upgraded Blades

52 chassis: basis of node names
• Each holds 10 blades -> 520 blades total
• Nodes = cmp-chassis#-blade#

Old compute blades: Dell PowerEdge 1855
• 2 single-core Intel Xeon EM64T 3.6 GHz processors
• 800 MHz FSB
• 2 MB L2 cache per socket
• Intel NetBurst microarchitecture

New compute blades: Dell PowerEdge 1955
• 2 quad-core Intel 2.33 GHz processors
• 1333 MHz FSB
• 4 MB L2 cache per socket
• Intel Core 2 microarchitecture
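
The CPU totals quoted for the cluster follow directly from the chassis and blade counts. The short shell sketch below redoes that arithmetic and prints a node name in the cmp-chassis#-blade# pattern. The function names are illustrative only, and the numbering style (plain integers, no zero padding) is an assumption, since the slides give only the pattern.

```shell
#!/bin/sh
# total_cpus CHASSIS BLADES_PER_CHASSIS SOCKETS_PER_BLADE CORES_PER_SOCKET
total_cpus() {
    echo $(( $1 * $2 * $3 * $4 ))
}

# node_name CHASSIS BLADE -> cmp-<chassis>-<blade>
# Assumption: plain integers without zero padding.
node_name() {
    echo "cmp-$1-$2"
}

echo "Old blades: $(total_cpus 52 10 2 1) CPUs"   # 52 * 10 * 2 * 1
echo "New blades: $(total_cpus 52 10 2 4) CPUs"   # 52 * 10 * 2 * 4
echo "Example node: $(node_name 12 3)"
```

The two totals reproduce the 1040 and 4,160 CPU figures given for the initial and upgraded cluster.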

Topsail Upgrade 2

Most recent Topsail upgrade (Feb/Mar ‘09)

Refreshed much of the infrastructure

Improved IBRIX filesystem

Replaced and improved Infiniband cabling

Moved cluster to ITS-Manning building

•Better cooling and UPS

Current Topsail Architecture

Login node: 8 CPU @ 2.3 GHz Intel EM64T, 12 GB memory

Compute nodes: 4,160 CPU @ 2.3 GHz Intel EM64T, 12 GB memory

Shared disk: 39TB IBRIX Parallel File System

Interconnect: Infiniband 4x SDR

64bit Linux Operating System

Multi-Core Computing

Processor Structure on Topsail

• 500+ nodes

• 2 sockets/node

• 1 processor/socket

• 4 cores/processor (Quad-core)

• 8 cores/node

http://www.tomshardware.com/2006/12/06/quad-core-xeon-clovertown-rolls-into-dp-servers/page3.html

Multi-Core Computing

The trend in High Performance Computing is towards multi-core or many core computing.

More cores at slower clock speeds for less heat

Now, dual and quad core processors are becoming common.

Soon 64+ core processors will be common

•And these may be heterogeneous!

The Heat Problem

Figure taken from: Jack Dongarra, UT

More Parallelism

Figure taken from: Jack Dongarra, UT

Infiniband Connections

Connections come in single (SDR), double (DDR), and quad (QDR) data rates.
• Topsail is SDR.

Single data rate is 2.5 Gbit/s in each direction per link.

Links can be aggregated: 1x, 4x, 12x.
• Topsail is 4x.

Links use 8B/10B encoding (10 bits carry 8 bits of data), so the useful data transmission rate is four-fifths of the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s per link, respectively.

Data rate for Topsail is 8 Gbit/s, i.e. 1 GB/s (4x SDR).
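
The link-rate arithmetic can be checked with a few lines of shell. This is only a sketch: the function name is made up for illustration, and rates are kept in Mbit/s so that integer arithmetic is exact.

```shell
#!/bin/sh
# ib_data_rate_mbit RAW_MBIT_PER_LINK LINKS
# Useful data rate in Mbit/s after 8B/10B encoding (8 data bits per 10 raw bits).
ib_data_rate_mbit() {
    echo $(( $1 * $2 * 8 / 10 ))
}

# An SDR link carries 2.5 Gbit/s raw, i.e. 2500 Mbit/s; Topsail aggregates 4 links.
sdr_4x=$(ib_data_rate_mbit 2500 4)
echo "4x SDR: ${sdr_4x} Mbit/s = $(( sdr_4x / 8 )) MB/s"
```

Calling the function with 5000 or 10000 Mbit/s and one link gives the DDR and QDR per-link figures in the same way.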

Topsail Network Topology

Infiniband Benchmarks

Point-to-point (PTP) intranode communication on Topsail for various MPI send types

Peak bandwidth:
• 1288 MB/s

Minimum latency (1-way):
• 3.6 µs

Infiniband Benchmarks

Scaled aggregate bandwidth for MPI Broadcast on Topsail

Note good scaling throughout the tested range (from 24-1536 cores)

Login to Topsail

Use ssh to connect:
• ssh topsail.unc.edu

SSH Secure Shell with Windows

For using interactive programs with X-Windows display:
• ssh -X topsail.unc.edu
• ssh -Y topsail.unc.edu

Off-campus users (i.e. domains outside of unc.edu) must use a VPN connection

Topsail File Systems

39TB IBRIX Parallel File System

Split into Home and Scratch Space

Home: /ifs1/home/my_onyen

Scratch: /ifs1/scr/my_onyen

Mass Storage

•Only Home is backed up

•/ifs1/home/my_onyen/ms

File System Limits

500GB Total Limit per User

Home – 15GB limit for Backups

Scratch:

•No limit except 500GB total

•Not backed up

•Periodically cleaned

Few installed packages/programs

Compiling on Topsail

Modules

Serial Programming
• Intel Compiler Suite for Fortran77, Fortran90, C and C++ (recommended by Research Computing)
• GNU

Parallel Programming
• MPI
• OpenMP
  • Must use Intel Compiler Suite
  • Compiler tag: -openmp
  • Must set OMP_NUM_THREADS in submission script
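
As a rough illustration of setting OMP_NUM_THREADS in a submission script, the sketch below prints an LSF script for a single-node OpenMP run. The function name and the executable name my_openmp_code are hypothetical; the #BSUB options are the ones used elsewhere in these slides, with span[ptile=N] keeping all threads' slots on one node.

```shell
#!/bin/sh
# openmp_job_script THREADS EXECUTABLE
# Prints an LSF submission script for a single-node OpenMP run.
# The executable is assumed to have been built with the Intel compiler
# and the -openmp flag, e.g. icc -openmp -o my_openmp_code my_openmp_code.c
openmp_job_script() {
    threads=$1
    exe=$2
    cat <<EOF
#BSUB -n ${threads}
#BSUB -R "span[ptile=${threads}]"
#BSUB -o out.%J
#BSUB -e err.%J
export OMP_NUM_THREADS=${threads}
${exe}
EOF
}

openmp_job_script 8 ./my_openmp_code
```

Redirecting the output to a file (say run.omp, a made-up name) would give a script that can be submitted with bsub < run.omp.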

Compiling Modules

Module commands

•module – list commands

•module avail – list modules

•module add – add module temporarily

•module list – list modules being used

•module clear – remove module temporarily

Add module using startup files
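
One way to make a module persistent is to add it from a shell startup file. The fragment below is a sketch intended for ~/.bashrc; the choice of startup file, the function name, and the guard are all assumptions, since the slides do not say which file Topsail expects. The module name is the MVAPICH/Intel one used in the test compile in this deck.

```shell
# Sketch for ~/.bashrc (assumed startup file).
# load_default_modules MODULE...
# Adds each module if the module command exists, so the file stays
# harmless on machines without it.
load_default_modules() {
    if command -v module >/dev/null 2>&1; then
        for m in "$@"; do
            module add "$m"
        done
    fi
    return 0
}

load_default_modules hpc/mvapich-intel-11
```

After the next login, module list should show the module without any manual module add.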

Available Compilers

Intel – ifort, icc, icpc

GNU – gcc, g++, gfortran

Libraries – BLAS/LAPACK

MPI:
• mpicc/mpiCC
• mpif77/mpif90

mpixx is just a wrapper around the Intel or GNU compiler
• Adds location of MPI libraries and include files
• Provided as a convenience

Test MPI Compile

Copy cpi.c to scratch directory:
• cp /ifs1/scr/cdavis/Topsail/cpi.c /ifs1/scr/my_onyen/.

Add Intel module:
• module load hpc/mvapich-intel-11

Confirm Intel module:
• which mpicc

Compile code:
• mpicc -o cpi cpi.c

MPI/OpenMP Training

Courses are taught throughout the year by Research Computing: http://learnit.unc.edu/workshops

Next course:
• MPI – Summer
• OpenMP – March 3rd

Running Programs on Topsail

Upon ssh to Topsail, you are on the login node.

Programs SHOULD NOT be run on the login node.

Submit programs to the compute nodes (4,160 CPUs in total).

Submit jobs using Load Sharing Facility (LSF).

Job Scheduling Systems

Allocates compute nodes to job submissions based on user priority, requested resources, execution time, etc.

Many types of schedulers

•Load Sharing Facility (LSF) – Used by Topsail

•IBM LoadLeveler

•Portable Batch System (PBS)

•Sun Grid Engine (SGE)

Load Sharing Facility (LSF)

[Figure: LSF architecture diagram with numbered steps 1-13. Labels: bsub app; submission host (LIM, Batch API); master host (MLIM, MBD); queue; execution host (SBD, child SBD, LIM, RES); user job; load information from other hosts.]

LIM – Load Information Manager
MLIM – Master LIM
MBD – Master Batch Daemon
SBD – Slave Batch Daemon
RES – Remote Execution Server

Submitting a Job to LSF

For a compiled MPI job:
• bsub -n "< number CPUs >" -o out.%J -e err.%J -a mvapich mpirun ./mycode

bsub – LSF command that submits job to compute node

bsub -o and bsub -e
• Job output saved to file in submission directory (%J is replaced by the job ID)

Queue System on Topsail

Topsail uses queues to distribute jobs.

Specify queue with -q in bsub:
• bsub -q week …

No -q specified = default queue (week)

Queues vary depending on size and required time of jobs

See listing of queues:
• bqueues

Topsail Queues

Queue     Time Limit   Jobs/User   CPU/Job
int       2 hrs        128         ---
debug     2 hrs        128         ---
day       24 hrs       1024        4-128
week      1 week       1024        4-128
512cpu    4 days       1024        32-1024
128cpu    4 days       1024        32-128
32cpu     2 days       1024        4-32
chunk     4 days       1024        Batch Jobs

• Most jobs do not scale very well over 128 cpu.
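
Before submitting, it can help to check a CPU request against the per-job ranges listed for each queue. The helper below is a hypothetical sketch that hard-codes the CPU/Job ranges from the queue listing in these slides; it is not an LSF command, and queues without a listed numeric range (int, debug, chunk) are simply not checked.

```shell
#!/bin/sh
# queue_cpu_range QUEUE -> prints "MIN MAX" CPUs per job,
# or nothing for queues without a listed numeric range.
queue_cpu_range() {
    case "$1" in
        day|week) echo "4 128" ;;
        512cpu)   echo "32 1024" ;;
        128cpu)   echo "32 128" ;;
        32cpu)    echo "4 32" ;;
    esac
}

# queue_allows QUEUE NCPUS -> exit status 0 if NCPUS fits the listed range.
queue_allows() {
    range=$(queue_cpu_range "$1")
    [ -n "$range" ] || return 0     # no listed range: nothing to check
    min=${range% *}
    max=${range#* }
    [ "$2" -ge "$min" ] && [ "$2" -le "$max" ]
}

queue_allows 32cpu 16 && echo "16 CPUs fit in 32cpu"
queue_allows 128cpu 16 || echo "16 CPUs are below the 128cpu minimum"
```

The authoritative limits are whatever bqueues reports on the cluster; the hard-coded values only mirror the slide.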

Submission Scripts

Easier to write a submission script that can be edited for each job submission.

Example script file – run.hpl:

#BSUB -n "< number CPUs >"
#BSUB -e err.%J
#BSUB -o out.%J
#BSUB -a mvapich
mpirun ./mycode

Submit with: bsub < run.hpl

More bsub options

bsub -x (NO LONGER IN USE!)
• Exclusive use of a node
• Use extensively when first testing code

bsub -n 4 -R span[ptile=4]
• Forces all 4 processors to be on same node
• Similar to -x

bsub -J job_name

See the man pages for a complete description
• man bsub

Performance Test

Gromacs MD simulation of bulk water

Simulation setups:
• Case 1: -n 8 -R span[ptile=1]
• Case 2: -n 8 -R span[ptile=8]

Simulation times (1 ns MD):
• Case 1: 1445 sec
• Case 2: 1255 sec

Placing all 8 processes on 1 node improved speed by only 13%
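
The 13% figure is the reduction in run time relative to the slower case. A minimal shell check, with an illustrative function name and integer arithmetic that rounds down:

```shell
#!/bin/sh
# percent_saved SLOW_SECONDS FAST_SECONDS
# Reduction in run time relative to the slower run, rounded down.
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

echo "Case 2 run time is $(percent_saved 1445 1255)% shorter than case 1"
```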

Following Job After Submission

bjobs
• bjobs -l JobID
• Shows current status of job

bhist
• bhist -l JobID
• More detailed information regarding job history

bkill
• bkill -r JobID
• Ends job prematurely

Submit Test MPI Job

Submit the test MPI program on Topsail
• bsub -q week -n 4 -o out.%J -e err.%J -a mvapich mpirun ./cpi

Follow submission: bjobs

Output stored in out.%J file
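
The %J in the -o and -e options is replaced by the LSF job ID, which bsub reports at submission time. The sketch below pulls that ID out of the confirmation line so the output file names can be predicted. The sample line and job number are made up for illustration, and the message wording is assumed to follow the usual LSF form.

```shell
#!/bin/sh
# job_id_from_bsub: read bsub's confirmation text on stdin, print the job ID.
# Assumes the usual LSF wording: Job <ID> is submitted to queue <NAME>.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

sample='Job <12345> is submitted to queue <week>.'   # illustrative, not real output
id=$(printf '%s\n' "$sample" | job_id_from_bsub)
echo "stdout -> out.$id, stderr -> err.$id"
```

On the cluster, the same function could be fed directly from the bsub command instead of the sample string, and the resulting ID passed to bjobs -l.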

Pre-Compiled Programs on Topsail

Some applications are precompiled for all users:
• /ifs1/apps
• Amber, Gaussian, Gromacs, NetCDF, NWChem, R

Add module to path using module commands:
• module avail – shows available applications
• module add – add specific application

Once the module command is used, the executable is added to the PATH

Test Gaussian Job on Topsail

Add Gaussian application to path:
• module add apps/gaussian-03e01
• module list

Copy input com file:
• cp /ifs1/scr/cdavis/Topsail/water.com .

Check that executable has been added to path:
• echo $PATH

Submit job:
• bsub -q week -n 4 -e err.%J -o out.%J g03 water.com

Common Error 1

If job immediately dies, check err.%J file

err.%J file has error:
• Can't read MPIRUN_HOST

Problem: MPI environment settings were not correctly applied on compute node

Solution: Include mpirun in bsub command

Common Error 2

Job immediately dies after submission

err.%J file is blank

Problem: ssh passwords and keys were not correctly set up at initial login to Topsail

Solution:
• cd ~/.ssh/
• mv id_rsa id_rsa-orig
• mv id_rsa.pub id_rsa.pub-orig
• Log out of Topsail
• Log in to Topsail and accept all defaults

Interactive Jobs

To run long shell scripts on Topsail, use int queue

bsub -q int -Ip /bin/bash
• This bsub command provides a prompt on compute node
• Can run program or shell script interactively from compute node

Totalview debugger can also be run interactively from Topsail

Further Help with Topsail

More details about using Topsail can be found in the Getting Started on Topsail help document
• http://help.unc.edu/?id=6214
• http://keel.isis.unc.edu/wordpress/ (ON CAMPUS)

For assistance with Topsail, please contact the ITS Research Computing group
• Email: [email protected]

For immediate assistance, see manual pages on Topsail:
• man <command>