
How to Use Our Cluster for the Tutorials

Dr. Robert Schade, HPC Advisor, Paderborn Center for Parallel Computing, Paderborn University

February 11th, 2020

2

PC2 and Cluster Systems

3

Structure of HPC in Germany

Tier 0
– Partnership for Advanced Computing in Europe (PRACE)
– all users from Europe

Tier 1
– national high-performance computing centers
– users from Germany
– financing: Gauß Center for Supercomputing

Tier 2
– high-performance computing centers
– users from the corresponding state and Germany
– financing: state/BMBF Forschungsbautenprogramm

Tier 3
– local computing centers
– financing: DFG Großgeräteanträge

4

Paderborn Center for Parallel Computing (PC²)

• Central scientific institute of Paderborn University

– Roots in theoretical computer science
– Founded in 1991
– Member of the Gauß-Alliance for high-performance computing
– Tier-3 (users in NRW) and tier-2 (users from all of Germany) HPC center

• Research and service provider for high-performance computing (HPC)

– Development of methods to use parallel computer systems
– Operation of HPC systems for users from NRW and beyond
– Support for users with consulting and services

Members of the Gauß-Alliance

5

HPC infrastructure at PC²

selected HPC production and research systems:

system | start | description
OCuLUS | 2013 | ClusterVision, 9920 cores, Intel SNB, 200 TFlop/s, QDR InfiniBand, 31 Nvidia K20x GPUs, 16 Nvidia GTX 1080 Ti, 2 Nvidia RTX 2080 Ti
Pling3 | 2016 | Megware, 1024 cores, Intel HSW, 40 TFlop/s, FDR InfiniBand
XLC | 2017 | 8 nodes, Intel BDW, 10GE, FPGA accelerators: 8x AlphaData ADM-PCIE-7V3 and 8x ADM-PCIE-8K5 FPGA
Harp | 2017 | 10 nodes, Intel Xeon BDW+FPGA CPUs, 10GE
Noctua | 2018 | Cray CS500, 272 nodes, Intel Xeon Gold 6148, 10880 cores, >700 TFlop/s, 0.7 PB Lustre ClusterStor, 100G Omni-Path, 32 Intel Stratix 10 FPGAs
Noctua (phase 2) | approx. Q1/2021 | Petaflop class

6

HPC infrastructure at PC²


Systems for this Winter School:

OCuLUS

Noctua

7

Noctua (Athene noctua, the Latin name of the little owl, German "Steinkauz")

• Setup at the end of 2018, Cray CS500
• 100 Gbit/s interconnect, 700 TB parallel file system
• 10880 Intel Xeon cores (Skylake-SP)

– 256 compute nodes (40 cores, 192 GB RAM)

→ 96 compute nodes reserved for this winter school

– 16 FPGA nodes (40 cores, 192 GB RAM, 2x Intel Stratix 10 FPGA)

• Intel Stratix 10 FPGA
– Bittware 520N cards, 32 GB DDR4 RAM
– 10 TFlop/s SP
– 4 dedicated network ports for direct FPGA-FPGA connections
– Optical switch for FPGA-FPGA connections

8

Noctua Phase 2

• Acquisition/setup in 2020/2021
• >1000 TFlop/s DP performance
• Together with a new data center building

9

Accessing Noctua

10

How to get access

● fill out terms of use and data privacy form

● return the form to one of the tutors or the registration desk

● visit https://dev.noctua.pc2.uni-paderborn.de/winterschool2020/

● click on link with your account name

11

How to get access

● click “Connect”

● enter account password

12

How to get access

● click "Connect"

● enter account password

● → virtual desktop environment (that runs on our cluster)

– material for the tutorials is under "Winter School" on the desktop
– disconnect by closing the browser window

13

How to run compute jobs

14

Compute Jobs

● Compute job = sequence of commands and programs

● needs resources (see the sketch after this list):

– working time (walltime)
– compute nodes, CPU cores
– memory (RAM, system memory)
– accelerators (GPUs, FPGAs, ...)
– software licenses
– disk space
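A minimal sketch of how such resources are expressed as options of the Slurm workload manager used here (introduced on the following slides). The option names are standard Slurm; the concrete values and the GPU/license/disk requests are only illustrative assumptions and may not apply to every PC² partition:

#!/bin/bash
#SBATCH --time=2:00:00            # working time (walltime)
#SBATCH --nodes=2                 # number of compute nodes
#SBATCH --ntasks-per-node=40      # CPU cores (tasks) per node
#SBATCH --mem=180G                # memory (RAM) per node; illustrative value
#SBATCH --gres=gpu:1              # accelerators, e.g. one GPU (only on GPU partitions)
#SBATCH --licenses=somelic:1      # software licenses; "somelic" is a placeholder
#SBATCH --tmp=100G                # minimum local disk space; illustrative value

# run some program here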

15

Workload Manager

On Noctua, the workload manager (scheduler) is Slurm (Simple Linux Utility for Resource Management). The user submits a compute job to the workload manager, which plans its execution (when and where on the compute nodes) and handles the accounting of resources.

Compute jobs are usually defined with jobscripts:

#!/bin/bash
#SBATCH -N 2
#SBATCH ...

echo "Hello from job"
#run some program here

16

Workload Manager: Noctua

Workload manager: Slurm
Node usage: exclusive (only one job per node)

Compute nodes of Noctua:

– 2 CPUs per node (Intel Xeon Gold 6148)
– 20 cores per CPU
– 192 GB DDR4 RAM per node (96 GB per socket)
– 100 Gbit/s Omni-Path interconnect to the other nodes
– 2.8 TFlop/s DP per node

A job has to be able to use a whole node (40 CPU cores)!
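These per-node figures are consistent with the system totals quoted earlier for Noctua (272 nodes, 10880 cores, >700 TFlop/s); a quick check:

272 nodes × 40 cores/node = 10880 cores
272 nodes × 2.8 TFlop/s DP/node ≈ 762 TFlop/s DP (hence ">700 TFlop/s")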

17

Workload Manager: Noctua (Slurm)

Nodes are allocated exclusively to a compute job.
→ Required information for a compute job:

● Name of the compute job?                    -J "solve bla"      --job-name="solve bla"
● For how long?                               -t 2:00:00          --time=2:00:00
● To which project should it be billed?       -A hpc-lco-usrtr    --account=hpc-lco-usrtr
● How many nodes?                             -N 4                --nodes=4
● To which partition should it be submitted?  -p batch            --partition=batch

18

Workload Manager: Noctua (Slurm)

Examples:

● Passing the options on the command line and submitting a plain script (here called job.sh):

sbatch -J "short single-node job" -N 1 -t 00:30:00 -A hpc-lco-usrtr -p batch job.sh

#!/bin/bash

echo "Hello from job"
#run some program here

● is equivalent to putting the options into the jobscript itself as #SBATCH directives and running "sbatch job.sh":

#!/bin/bash
#SBATCH -J "short single-node job"
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

echo "Hello from job"
#run some program here

Time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". Here, -t 00:30:00 requests 30 minutes.
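For illustration, the same formats written as #SBATCH directives (the values are arbitrary examples):

#SBATCH -t 30          # 30 minutes                  ("minutes")
#SBATCH -t 00:30:00    # 30 minutes                  ("hours:minutes:seconds")
#SBATCH -t 2-00        # 2 days                      ("days-hours")
#SBATCH -t 1-06:30     # 1 day, 6 hours, 30 minutes  ("days-hours:minutes")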

19

Workload Manager: Noctua (Slurm)

Steps for submitting a job (see the shell sketch after this list):

1) Take an example jobscript from Desktop/Winter_School/Example_Jobscripts

2) Create a work directory

3) Copy the jobscript to your work directory, e.g., as job.sh

   – Look at the jobscript and edit the "PLEASE CHANGE" parts

4) Submit the script for execution:

   for cp2k and cp-paw: "sbatch job.sh"

   for orca: "sbatch job.sh h2o" if your input file is h2o.inp

5) This returns: "Submitted batch job [JOBID]"

6) The JOBID can be used to track your job, e.g., to cancel it, ….
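A minimal shell sketch of these steps for a CP2K or CP-PAW run; the work-directory name and the example-jobscript file name are placeholders, and the Desktop path is assumed to be reachable from your home directory:

mkdir my_first_job                                     # 2) create a work directory (placeholder name)
cp ~/Desktop/Winter_School/Example_Jobscripts/cp2k_example.sh my_first_job/job.sh   # 1)+3) copy an example jobscript (placeholder file name)
cd my_first_job
nano job.sh                                            # edit the "PLEASE CHANGE" parts (any editor works; nano is just an example)
sbatch job.sh                                          # 4) submit; prints "Submitted batch job [JOBID]"
squeue                                                 # 6) track the job via its JOBID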

20

Workload Manager: Noctua (Slurm)

● state of submitted jobs: squeue

  JOBID PARTITION       NAME    USER ST  TIME NODES NODELIST(REASON)
 188262     short everything rschade PD  0:00     2 (Resources)        ← pending job (resources not available)
 188252     short     Answer rschade  R  0:13     2 cn-[0252-0253]
 188253     short         to rschade  R  0:13     2 cn-[0254-0255]
 188254     short        the rschade  R  0:13     2 fpga-[0001-0002]
 188255     short   ultimate rschade  R  0:13     2 fpga-[0003-0004]
 188256     short   question rschade  R  0:13     2 fpga-[0005-0006]   ← running jobs (state R)
 188257     short         of rschade  R  0:13     2 fpga-[0007-0008]
 188258     short       life rschade  R  0:13     2 fpga-[0009-0010]
 188259     short        the rschade  R  0:13     2 fpga-[0011-0012]
 188260     short   universe rschade  R  0:13     2 fpga-[0013-0014]
 188261     short        and rschade  R  0:13     2 fpga-[0015-0016]

● predict the start time of a job: spredict

  JOBID PARTITION       NAME    USER ST  TIME NODES NODELIST(REASON)           START_TIME
 188262     short everything rschade PD  0:00     2 (Resources)       2019-05-18T22:14:31   ← estimated start time

● Attention: On Noctua you will only see your own jobs!!!

→ use scluster or sinfo to see the overall cluster utilization:

Compute nodes: cn-[0001-0256]
nodes in total:  256
nodes in use:    176 (currently occupied with jobs)
nodes free:       80
nodes drained:     0 (currently not accepting jobs)
nodes offline:     0 (currently offline, e.g. because of a hardware problem)
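For the full parameters of a single job (also listed on the summary slide at the end), standard Slurm provides scontrol; the JOBID below is the pending job from the squeue listing above:

scontrol show job 188262     # shows state, allocated nodes, time limit, work directory, ...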

21

Workload Manager: Noctua (Slurm)

Cancel jobs: scancel

scancel JOBID

scancel --name=JOBNAME
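A short usage sketch with the pending job from the squeue listing above (JOBID 188262, job name "everything"):

scancel 188262                 # cancel by job ID
scancel --name=everything      # cancel your jobs with this job name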

22

Environment Modules

You don't have to install any software:

● ORCA, CP2K, CP-PAW and TRAVIS are installed on our system
● you only need to load a module to use one of them, e.g.,

module load winterschool2020/orca

● if not needed anymore, unload with

module unload winterschool2020/orca

● list the currently loaded modules with

module list
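Inside a jobscript the module is loaded before the program is started, as on the summary slide at the end; a minimal sketch (the job name is a placeholder, and the actual program invocation depends on the code, so it is left as a comment):

#!/bin/bash
#SBATCH -J "orca-example"      # placeholder job name
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

module reset                   # start from a clean module environment
module load winterschool2020/orca

# run the program provided by the module here (ORCA in this case)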

23

Things you should be aware of:

● Example jobscripts for ORCA, CP2K and CP-PAW can be found on the virtual desktop in Winter_School/Example_Jobscripts

● Don't store any personal, private or important data on our cluster system or in the virtual desktop.

● All participants share the Unix account usrtr001 but work in different home-directories and have individual desktops.

● The home-directories are located on our high-performance parallel file system so that you can directly perform calculations in this file system.

● Always submit a compute job (i.e., use sbatch) when you want to perform a calculation.

● Use proper job names so that you can find your jobs again in the job list.

● Don't cancel other people's compute jobs.

24

Noctua

Login                         | https://dev.noctua.pc2.uni-paderborn.de
Workload manager              | Slurm
Node usage                    | exclusive (only one job per node)
CPU cores per node            | 40 (dual-socket with 20 cores per socket)
Main memory per node          | 192 GB
Submit a job                  | sbatch (#SBATCH)
Account                       | --account=, -A
Job name                      | --job-name=, -J
Time limit                    | --time=, -t
Number of nodes               | --nodes=, -N
Partition                     | --partition=, -p (short, batch, long, fpga)
Details of a job              | scontrol show job JOBID
Cancel a job                  | scancel JOBID
Job status                    | squeue
Predicted job start time      | spredict
Cluster utilization overview  | scluster, sinfo

Example jobscript:

#!/bin/bash
#SBATCH -J "single-node"
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

module reset
module load ...

echo "Hello from job"
#run some program here