
How to Use Our Cluster for the Tutorials

Dr. Robert Schade, HPC Advisor, Paderborn Center for Parallel Computing, Paderborn University

February 11th, 2020

2

PC2 and Cluster Systems

3

Structure of HPC in Germany

Tier 0
– Partnership for Advanced Computing in Europe (PRACE)
– all users from Europe

Tier 1
– national high-performance computing centers
– users from Germany
– financing: Gauß Center for Supercomputing

Tier 2
– high-performance computing centers
– users from the corresponding state and Germany
– financing: state/BMBF Forschungsbautenprogramm

Tier 3
– local computing centers
– financing: DFG Großgeräteanträge

4

Paderborn Center for Parallel Computing (PC²)

• Central scientific institute of Paderborn University

– Roots in theoretical computer science
– Founded in 1991
– Member of the Gauß-Alliance for high-performance computing
– Tier-3 (users in NRW) and tier-2 (users from all of Germany) HPC center

• Research and service provider for high-performance computing (HPC)

– Development of methods to use parallel computer systems
– Operation of HPC systems for users from NRW and beyond
– Support for users with consulting and services

Members of the Gauß-Alliance

5

HPC infrastructure at PC²

selected HPC production and research systems:

system | start | description
OCuLUS | 2013 | ClusterVision, 9920 cores, Intel SNB, 200 TFlop/s, QDR InfiniBand, 31 Nvidia K20x GPUs, 16 Nvidia GTX 1080 Ti, 2 Nvidia RTX 2080 Ti
Pling3 | 2016 | Megware, 1024 cores, Intel HSW, 40 TFlop/s, FDR InfiniBand
XLC | 2017 | 8 nodes, Intel BDW, 10GE, FPGA accelerators: 8x AlphaData ADM-PCIE-7V3 and 8x ADM-PCIE-8K5 FPGA
Harp | 2017 | 10 nodes, Intel Xeon BDW+FPGA CPUs, 10GE
Noctua | 2018 | Cray CS500, 272 nodes, Intel Xeon Gold 6148, 10880 cores, >700 TFlop/s, 0.7 PB Lustre ClusterStor, 100G Omni-Path, 32 Intel Stratix 10 FPGAs
Noctua (phase 2) | approx. Q1/2021 | Petaflop class

6

HPC infrastructure at PC²


Systems for this Winter School:

OCuLUS

Noctua

7

Noctua (Athene noctua, the Latin name of the little owl, German "Steinkauz")

• Setup at the end of 2018, Cray CS500
• 100 Gbit/s interconnect, 700 TB parallel file system
• 10880 Intel Xeon cores (Skylake-SP)

– 256 compute nodes (40 cores, 192 GB RAM)

→ 96 compute nodes reserved for this winter school

– 16 FPGA nodes (40 cores, 192 GB RAM, 2x Intel Stratix 10 FPGA)

• Intel Stratix 10 FPGA
– Bittware 520N cards, 32 GB DDR4 RAM
– 10 TFlop/s SP
– 4 dedicated network ports for direct FPGA-FPGA connections
– Optical switch for FPGA-FPGA connections

8

Noctua Phase 2

• Acquisition/setup in 2020/2021
• >1000 TFlop/s DP performance
• Together with a new data center building

9

Accessing Noctua

10

How to get access

● fill out terms of use and data privacy form

● return the form to one of the tutors or the registration desk

● visit https://dev.noctua.pc2.uni-paderborn.de/winterschool2020/

● click on link with your account name

11

How to get access

● click “Connect”

● enter account password

12

How to get access

● click "Connect"

● enter account password

● → virtual desktop environment (that runs on our cluster)

– material for the tutorials is under "Winter School" on the desktop
– disconnect by closing the browser window

13

How to run compute jobs

14

Compute Jobs

● Compute job = sequence of commands and programs

● needs resources (see the sketch after this list):

– working time (walltime)
– compute nodes, CPU cores
– memory (RAM, system memory)
– accelerators (GPUs, FPGAs, ...)
– software licenses
– disk space
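A minimal sketch of how such resources are expressed as options of the Slurm workload manager used here (introduced on the following slides). The option names are standard Slurm; the concrete values and the GPU/license/disk requests are only illustrative assumptions and may not apply to every PC² partition:

#!/bin/bash
#SBATCH --time=2:00:00            # working time (walltime)
#SBATCH --nodes=2                 # number of compute nodes
#SBATCH --ntasks-per-node=40      # CPU cores (tasks) per node
#SBATCH --mem=180G                # memory (RAM) per node; illustrative value
#SBATCH --gres=gpu:1              # accelerators, e.g. one GPU (only on GPU partitions)
#SBATCH --licenses=somelic:1      # software licenses; "somelic" is a placeholder
#SBATCH --tmp=100G                # minimum local disk space; illustrative value

# run some program here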

15

Workload Manager

On Noctua, the workload manager (scheduler) is Slurm (Simple Linux Utility for Resource Management). The user submits a compute job to the workload manager, which plans its execution (when and where on the compute nodes) and handles the accounting of resources.

Compute jobs are usually defined with jobscripts:

#!/bin/bash
#SBATCH -N 2
#SBATCH ...

echo "Hello from job"
#run some program here

16

Workload Manager: Noctua

Workload manager: Slurm
Node usage: exclusive (only one job per node)

Compute nodes of Noctua:

– 2 CPUs per node (Intel Xeon Gold 6148)
– 20 cores per CPU
– 192 GB DDR4 RAM per node (96 GB per socket)
– 100 Gbit/s Omni-Path interconnect to the other nodes
– 2.8 TFlop/s DP per node

A job has to be able to use a whole node (40 CPU cores)!
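These per-node figures are consistent with the system totals quoted earlier for Noctua (272 nodes, 10880 cores, >700 TFlop/s); a quick check:

272 nodes × 40 cores/node = 10880 cores
272 nodes × 2.8 TFlop/s DP/node ≈ 762 TFlop/s DP (hence ">700 TFlop/s")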

17

Workload Manager: Noctua (Slurm)

Nodes are allocated exclusively to a compute job.
→ Required information for a compute job:

● Name of the compute job?                    -J "solve bla"      --job-name="solve bla"
● For how long?                               -t 2:00:00          --time=2:00:00
● To which project should it be billed?       -A hpc-lco-usrtr    --account=hpc-lco-usrtr
● How many nodes?                             -N 4                --nodes=4
● To which partition should it be submitted?  -p batch            --partition=batch

18

Workload Manager: Noctua (Slurm)

Examples:

● Passing the options on the command line and submitting a plain script (here called job.sh):

sbatch -J "short single-node job" -N 1 -t 00:30:00 -A hpc-lco-usrtr -p batch job.sh

#!/bin/bash

echo "Hello from job"
#run some program here

● is equivalent to putting the options into the jobscript itself as #SBATCH directives and running "sbatch job.sh":

#!/bin/bash
#SBATCH -J "short single-node job"
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

echo "Hello from job"
#run some program here

Time formats: "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds". Here, -t 00:30:00 requests 30 minutes.
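For illustration, the same formats written as #SBATCH directives (the values are arbitrary examples):

#SBATCH -t 30          # 30 minutes                  ("minutes")
#SBATCH -t 00:30:00    # 30 minutes                  ("hours:minutes:seconds")
#SBATCH -t 2-00        # 2 days                      ("days-hours")
#SBATCH -t 1-06:30     # 1 day, 6 hours, 30 minutes  ("days-hours:minutes")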

19

Workload Manager: Noctua (Slurm)

Steps for submitting a job (see the shell sketch after this list):

1) Take an example jobscript from Desktop/Winter_School/Example_Jobscripts

2) Create a work directory

3) Copy the jobscript to your work directory, e.g., as job.sh

   – Look at the jobscript and edit the "PLEASE CHANGE" parts

4) Submit the script for execution:

   for cp2k and cp-paw: "sbatch job.sh"

   for orca: "sbatch job.sh h2o" if your input file is h2o.inp

5) This returns: "Submitted batch job [JOBID]"

6) The JOBID can be used to track your job, e.g., to cancel it, ….
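A minimal shell sketch of these steps for a CP2K or CP-PAW run; the work-directory name and the example-jobscript file name are placeholders, and the Desktop path is assumed to be reachable from your home directory:

mkdir my_first_job                                     # 2) create a work directory (placeholder name)
cp ~/Desktop/Winter_School/Example_Jobscripts/cp2k_example.sh my_first_job/job.sh   # 1)+3) copy an example jobscript (placeholder file name)
cd my_first_job
nano job.sh                                            # edit the "PLEASE CHANGE" parts (any editor works; nano is just an example)
sbatch job.sh                                          # 4) submit; prints "Submitted batch job [JOBID]"
squeue                                                 # 6) track the job via its JOBID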

20

Workload Manager: Noctua (Slurm)

● state of submitted jobs: squeue

  JOBID PARTITION       NAME    USER ST  TIME NODES NODELIST(REASON)
 188262     short everything rschade PD  0:00     2 (Resources)        ← pending job (resources not available)
 188252     short     Answer rschade  R  0:13     2 cn-[0252-0253]
 188253     short         to rschade  R  0:13     2 cn-[0254-0255]
 188254     short        the rschade  R  0:13     2 fpga-[0001-0002]
 188255     short   ultimate rschade  R  0:13     2 fpga-[0003-0004]
 188256     short   question rschade  R  0:13     2 fpga-[0005-0006]   ← running jobs (state R)
 188257     short         of rschade  R  0:13     2 fpga-[0007-0008]
 188258     short       life rschade  R  0:13     2 fpga-[0009-0010]
 188259     short        the rschade  R  0:13     2 fpga-[0011-0012]
 188260     short   universe rschade  R  0:13     2 fpga-[0013-0014]
 188261     short        and rschade  R  0:13     2 fpga-[0015-0016]

● predict the start time of a job: spredict

  JOBID PARTITION       NAME    USER ST  TIME NODES NODELIST(REASON)           START_TIME
 188262     short everything rschade PD  0:00     2 (Resources)       2019-05-18T22:14:31   ← estimated start time

● Attention: On Noctua you will only see your own jobs!!!

→ use scluster or sinfo to see the overall cluster utilization:

Compute nodes: cn-[0001-0256]
nodes in total:  256
nodes in use:    176 (currently occupied with jobs)
nodes free:       80
nodes drained:     0 (currently not accepting jobs)
nodes offline:     0 (currently offline, e.g. because of a hardware problem)
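For the full parameters of a single job (also listed on the summary slide at the end), standard Slurm provides scontrol; the JOBID below is the pending job from the squeue listing above:

scontrol show job 188262     # shows state, allocated nodes, time limit, work directory, ...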

21

Workload Manager: Noctua (Slurm)

Cancel jobs: scancel

scancel JOBID

scancel --name=JOBNAME
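A short usage sketch with the pending job from the squeue listing above (JOBID 188262, job name "everything"):

scancel 188262                 # cancel by job ID
scancel --name=everything      # cancel your jobs with this job name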

22

Environment Modules

You don't have to install any software:

● ORCA, CP2K, CP-PAW and TRAVIS are installed on our system
● you only need to load a module to use one of them, e.g.,

module load winterschool2020/orca

● if not needed anymore, unload with

module unload winterschool2020/orca

● list the currently loaded modules with

module list
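Inside a jobscript the module is loaded before the program is started, as on the summary slide at the end; a minimal sketch (the job name is a placeholder, and the actual program invocation depends on the code, so it is left as a comment):

#!/bin/bash
#SBATCH -J "orca-example"      # placeholder job name
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

module reset                   # start from a clean module environment
module load winterschool2020/orca

# run the program provided by the module here (ORCA in this case)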

23

Things you should be aware of:

● Example jobscripts for ORCA, CP2K and CP-PAW can be found on the virtual desktop in Winter_School/Example_Jobscripts

● Don't store any personal, private or important data on our cluster system or in the virtual desktop.

● All participants share the Unix account usrtr001 but work in different home-directories and have individual desktops.

● The home-directories are located on our high-performance parallel file system so that you can directly perform calculations in this file system.

● Always submit a compute job (i.e., use sbatch) when you want to perform a calculation.

● Use proper job names so that you can find your jobs again in the job list.

● Don't cancel other people's compute jobs.

24

Noctua

Login                         | https://dev.noctua.pc2.uni-paderborn.de
Workload manager              | Slurm
Node usage                    | exclusive (only one job per node)
CPU cores per node            | 40 (dual-socket with 20 cores per socket)
Main memory per node          | 192 GB
Submit a job                  | sbatch (#SBATCH)
Account                       | --account=, -A
Job name                      | --job-name=, -J
Time limit                    | --time=, -t
Number of nodes               | --nodes=, -N
Partition                     | --partition=, -p (short, batch, long, fpga)
Details of a job              | scontrol show job JOBID
Cancel a job                  | scancel JOBID
Job status                    | squeue
Predicted job start time      | spredict
Cluster utilization overview  | scluster, sinfo

Example jobscript:

#!/bin/bash
#SBATCH -J "single-node"
#SBATCH -N 1
#SBATCH -t 00:30:00
#SBATCH -A hpc-lco-usrtr
#SBATCH -p batch

module reset
module load ...

echo "Hello from job"
#run some program here