Introduction to HPC resources for BCB 660

Nirav Merchant [email protected] www.iplantcollaborative.org


Page 1: Introduction to HPC resources for BCB 660

Introduction to HPC resources for BCB 660

Nirav Merchant [email protected]

Page 2: Introduction to HPC resources for BCB 660

Topic Coverage

- What is parallel computing?
- General overview of HPC systems
- Overview of batch systems (and why we need them)
- Getting started with Ranger
- Understanding the default user environment
- Introduction to modules (and why we need them)
- Submitting your first job (and monitoring it)
- Moving your data in and out of HPC systems
- Q/A

Page 3: Introduction to HPC resources for BCB 660

What is computing?

von Neumann Architecture

Named after the Hungarian mathematician John von Neumann, who first authored the general requirements for an electronic computer in his 1945 papers.

Since then, virtually all computers have followed this basic design of:

- Memory (RAM)
- Control Unit
- Arithmetic Logic Unit (ALU); together with the control unit, this forms the CPU
- Input/Output (e.g. keyboard)

Page 4: Introduction to HPC resources for BCB 660

What does it look like (your computer)?

Image courtesy Univ. of Washington

Page 5: Introduction to HPC resources for BCB 660

What is Parallel Computing?

Parallel computing: use of multiple processors or computers working together on a common task.

- Each processor works on part of the problem
- Processors can exchange information

A good introduction to concepts for parallel programming is at: https://computing.llnl.gov/tutorials/parallel_comp/
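The idea that each processor works on part of the problem can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names `split_into_chunks` and `parallel_sum` are made up for this example, and threads stand in for processors. In standard CPython, threads do not speed up this kind of arithmetic, so the sketch shows how the work is divided and combined rather than a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_workers):
    """Split items into n_workers contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def parallel_sum(items, n_workers=4):
    """Each worker sums its own chunk; the partial sums are then combined."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # The "exchange of information" step: combine the partial results.
    return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(1, 101))
    print(parallel_sum(data) == sum(data))
```

On a real cluster the same pattern is usually expressed with separate processes on separate nodes, but the split, compute, combine structure is the same.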

Page 6: Introduction to HPC resources for BCB 660

Why we need it

- Traditional software is written to execute serially, i.e. one task at a time running on one CPU
- As the size of data (tasks) increases, we need to utilize multiple CPUs
- Size of data also has implications for how much RAM and disk space is required for the task (we may need more RAM or disk than fits on one computer)

Page 7: Introduction to HPC resources for BCB 660

HPC systems: Not very different

Image courtesy TACC at Univ of Texas

Page 8: Introduction to HPC resources for BCB 660

Some Terminology (Jargon) of HPC

- HPC: High Performance Computing = Super Computing
- Node: one self-contained computer (many of which are connected together to form a “cluster”)
- CPU, socket, processor, core: often used loosely as synonyms; strictly, a socket holds one processor (CPU), which contains one or more cores
- Interconnect: networking between nodes (can be fiber optic or regular Ethernet like your computers), e.g. InfiniBand or GigE

Page 9: Introduction to HPC resources for BCB 660

More Terminology (Jargon) of HPC

- Scalability: ability to use additional resources to execute tasks faster
- Embarrassingly parallel: data-parallel tasks where each task is independent and not much communication or coordination is required among tasks
- Observed speedup: “wall time” taken for the serial task divided by wall time for the parallel task
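The observed speedup definition above is a simple ratio, sketched here in Python. The function names and the timings in the example are hypothetical. `parallel_efficiency` (speedup divided by core count) is not on the slide; it is added here as a commonly used companion measure of scalability.

```python
def observed_speedup(serial_wall_s, parallel_wall_s):
    """Wall time of the serial run divided by wall time of the parallel run."""
    if parallel_wall_s <= 0:
        raise ValueError("parallel wall time must be positive")
    return serial_wall_s / parallel_wall_s


def parallel_efficiency(serial_wall_s, parallel_wall_s, n_cores):
    """Speedup per core; 1.0 would mean perfect scaling (added for illustration)."""
    return observed_speedup(serial_wall_s, parallel_wall_s) / n_cores


if __name__ == "__main__":
    # Hypothetical timings: 1 hour serially, 5 minutes on 16 cores.
    print(observed_speedup(3600, 300))
    print(parallel_efficiency(3600, 300, 16))
```

An embarrassingly parallel workload tends to show efficiency close to 1.0, because little time is lost to communication between tasks.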

Page 10: Introduction to HPC resources for BCB 660

Types of HPC

- Shared memory: all CPUs (processors) have access to shared RAM
- Distributed memory: each CPU (processor) has its own local memory, but can be connected to other nodes via a fast interconnect

Page 11: Introduction to HPC resources for BCB 660

Again, why do we need it?

Limits of single CPU computing:

- Performance
- Available memory (disk and RAM)

Parallel computing allows one to:

- Execute tasks that don’t fit on a single CPU
- Complete tasks in a reasonable time

Again, please check https://computing.llnl.gov/tutorials/parallel_comp/ for a basic intro to parallel computing concepts

Page 12: Introduction to HPC resources for BCB 660

RANGER

Compute power: 504 Teraflops

- 3,936 four-socket nodes
- 62,976 cores, 2.0 GHz AMD Opteron

Memory: 125 Terabytes

- 2 GB/core, 32 GB/node

Disk subsystem: 1.7 PB storage (Lustre Parallel File System)

- 1 PB in /work filesystem

Interconnect: 8 Gb/s InfiniBand

Lonestar and other machines have similar (in some cases much larger) specs
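The RANGER figures can be cross-checked with a little arithmetic. The sketch below takes the node count, memory per node and clock rate from the slide. The 16 cores per node follows from 32 GB/node at 2 GB/core. The value of 4 floating-point operations per core per cycle is an assumption about this Opteron generation, not something the slide states.

```python
NODES = 3936
SOCKETS_PER_NODE = 4
MEM_PER_NODE_GB = 32
MEM_PER_CORE_GB = 2
CLOCK_GHZ = 2.0
FLOPS_PER_CYCLE = 4  # assumption: not given on the slide

# 32 GB per node at 2 GB per core implies 16 cores per node.
cores_per_node = MEM_PER_NODE_GB // MEM_PER_CORE_GB
cores_per_socket = cores_per_node // SOCKETS_PER_NODE
total_cores = NODES * cores_per_node

# Aggregate memory, in GB and in decimal terabytes (1 TB = 1000 GB).
total_mem_gb = NODES * MEM_PER_NODE_GB
total_mem_tb = total_mem_gb / 1000

# Theoretical peak: cores x clock (GHz) x flops per cycle gives GFlops.
peak_tflops = total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000

if __name__ == "__main__":
    print(total_cores, cores_per_socket)
    print(total_mem_tb)
    print(peak_tflops)
```

Under that assumption the numbers line up with the slide: 62,976 cores, roughly 125 TB of memory, and a peak of about 504 Teraflops.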

Page 13: Introduction to HPC resources for BCB 660

Filesystem Access

HOME

- Store your source code and build your executables here
- Use $HOME to reference your home directory in scripts

WORK

- Store large files here
- This file system is NOT backed up, use $ARCHIVE for important files!
- Use $WORK to reference this directory in scripts

SCRATCH

- Store large input or output files here, TEMPORARILY
- This file system is NOT backed up, use $ARCHIVE for important files!
- Use $SCRATCH to reference this directory in scripts

ARCHIVE

- Massive, long-term storage and archive system
- Check with staff before using this on your account
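Referencing these directories through their environment variables, rather than hard-coded paths, keeps scripts portable between accounts. A minimal Python sketch follows; the helper names and the `src`, `input` and `tmp` sub-directory layout are hypothetical choices for this example, and only the variable names HOME, WORK and SCRATCH come from the slide.

```python
import os
from pathlib import Path


def resolve_storage_dir(var_name, environ=None):
    """Return the directory named by an environment variable such as WORK."""
    env = os.environ if environ is None else environ
    value = env.get(var_name)
    if not value:
        raise KeyError(f"${var_name} is not set in this environment")
    return Path(value)


def job_paths(job_name, environ=None):
    """Hypothetical layout: code under HOME, large inputs under WORK, temp files under SCRATCH."""
    return {
        "source": resolve_storage_dir("HOME", environ) / "src" / job_name,
        "input": resolve_storage_dir("WORK", environ) / job_name / "input",
        "tmp": resolve_storage_dir("SCRATCH", environ) / job_name / "tmp",
    }


if __name__ == "__main__":
    # A fake environment so the sketch also runs away from the cluster.
    fake_env = {"HOME": "/home/user", "WORK": "/work/user", "SCRATCH": "/scratch/user"}
    for role, path in job_paths("myjob", fake_env).items():
        print(role, path)
```

On a login node the `environ` argument can be omitted so the real values are used.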

Page 14: Introduction to HPC resources for BCB 660

Limits on your filesystem

Page 15: Introduction to HPC resources for BCB 660

How is it connected

Page 16: Introduction to HPC resources for BCB 660

MUST READ THIS

Please visit the TACC new user guide for RANGER. You will pick up many hints that will make your life MUCH easier for running tasks on TACC resources.

http://www.tacc.utexas.edu/user-services/user-guides/ranger-user-guide

http://goo.gl/0xyN5 (same as above)

Page 17: Introduction to HPC resources for BCB 660

Batch, Module system

- With multiple users, we need a way to organize tasks
- We need a way to assign suitable resources to the tasks (track, prioritize)
- With multiple software packages, we need a way to deal with conflicts in version and dependency per task
- The batch scheduler used on all TACC systems is SGE (Sun Grid Engine), now owned by Oracle

Page 18: Introduction to HPC resources for BCB 660

Batch submission

Page 19: Introduction to HPC resources for BCB 660

RANGER: Queue Options

Page 20: Introduction to HPC resources for BCB 660

Common SGE commands

Page 21: Introduction to HPC resources for BCB 660

Let's get working

ssh [email protected]

Page 22: Introduction to HPC resources for BCB 660

Module Commands

Page 23: Introduction to HPC resources for BCB 660

Compbio stack/modules

Page 24: Introduction to HPC resources for BCB 660

But my favorite app is …

- Modules are for global use; it is hard to get cutting-edge code as modules (limited staff time)
- You can always compile and use your own versions without waiting for a module to be built
- When possible, build your applications from source rather than running pre-compiled binaries
- If you choose to use “make install”, you will need to tell the “configure” script where to install:

    ./configure --prefix=$HOME/bin

- For best performance, use the Intel compilers
- For best compatibility, use the gcc compilers
- More in the “bleeding edge s/w” slide

Page 25: Introduction to HPC resources for BCB 660

Preparing for tasks

Number of cores and nodes to use is set with:

    #$ -pe Nway 16*M

- N represents the number of cores to utilize per node (Ranger: 1 ≤ N ≤ 16; Lonestar: 1 ≤ N ≤ 12)
- M is the number of nodes to utilize
- The TOTAL number of cores used is thus N*M
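The relationship between N, M and the `-pe` line can be worked through with a small Python helper. This is a sketch only: the function names are hypothetical, and the default of 16 cores per node applies to Ranger as described on the slide. Passing a different `cores_per_node` for other machines is an assumption.

```python
def pe_directive(tasks_per_node, nodes, cores_per_node=16):
    """Build the '#$ -pe Nway 16*M' line for N tasks per node on M nodes."""
    if not 1 <= tasks_per_node <= cores_per_node:
        raise ValueError(f"N must be between 1 and {cores_per_node}")
    if nodes < 1:
        raise ValueError("M must be at least 1")
    # The second number is always cores-per-node times the node count.
    return f"#$ -pe {tasks_per_node}way {cores_per_node * nodes}"


def total_cores_used(tasks_per_node, nodes):
    """Total cores actually used by the job: N * M."""
    return tasks_per_node * nodes


if __name__ == "__main__":
    # 8 tasks on each of 2 Ranger nodes.
    print(pe_directive(8, 2))
    print(total_cores_used(8, 2))
```

Choosing N smaller than the core count of a node leaves cores idle but gives each task a larger share of that node's memory.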

Page 26: Introduction to HPC resources for BCB 660

Preparing a job submission

Page 27: Introduction to HPC resources for BCB 660

Some more SGE options

Page 28: Introduction to HPC resources for BCB 660

Working with bleeding edge s/w

http://genomics.tacc.utexas.edu/projects/ls4compbio/wiki

http://goo.gl/QYnIo (same URL as above, just short)

Let's look at the tutorial section towards the end of the page

Page 29: Introduction to HPC resources for BCB 660

More from that page

Page 30: Introduction to HPC resources for BCB 660

Getting data in and out

- SCP will work well for most smaller files
- Specialized options (bbcp and gridftp) need special end point installation
- As your files get larger (10 GB+), it gets time consuming to move them around
- Easier to move your data into the iPlant data store from your desktop/server (parallel transfers)
- Pull that data where you need it (and push more into it)
- Command line and GUI options (including dropbox for science)
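A rough estimate shows why large files become time consuming to move. The Python sketch below is back-of-the-envelope arithmetic. The function name, the link speeds and the `efficiency` parameter are illustrative assumptions, not measurements of any TACC or iPlant link.

```python
def transfer_time_seconds(size_gb, bandwidth_mbps, efficiency=1.0):
    """Estimate transfer time for size_gb gigabytes over a bandwidth_mbps megabit/s link.

    efficiency (0 to 1] is a crude allowance for protocol overhead and sharing.
    """
    if bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    size_bits = size_gb * 1e9 * 8  # decimal gigabytes to bits
    return size_bits / (bandwidth_mbps * 1e6 * efficiency)


if __name__ == "__main__":
    # Hypothetical links: 100 Mb/s office network vs. a 1000 Mb/s campus link.
    for mbps in (100, 1000):
        minutes = transfer_time_seconds(10, mbps) / 60
        print(f"10 GB at {mbps} Mb/s: about {minutes:.1f} minutes")
```

Real transfers are usually slower than this ideal figure, which is part of the case for parallel transfer tools.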

Page 31: Introduction to HPC resources for BCB 660

iPlant data store

Details at: http://goo.gl/4xzhA

Connecting from RANGER:

    module load irods
    iinit

- Answer the prompts using info from the above link
- You are now connected (without future need of passwords to the iPlant data store)

Page 32: Introduction to HPC resources for BCB 660
Page 33: Introduction to HPC resources for BCB 660

From RANGER

After loading the irods module, i.e. module load irods

Page 34: Introduction to HPC resources for BCB 660

Parametric Launcher

You have many tasks that you want to run and they are naturally parallel (“embarrassingly parallel”).

Parametric Job Launcher: a simple utility for submitting multiple serial applications simultaneously.

    % module load launcher

2 key components:

- paramlist: execution commands
- launcher.sge: job submission script
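When there are many independent tasks, the list of execution commands is easier to generate than to type. The Python sketch below assumes that the paramlist file holds one complete command per line; check the launcher documentation for the exact format. The helper names, the `./my_tool` command and the sample file names are hypothetical.

```python
from pathlib import Path


def build_paramlist(command_template, inputs):
    """Return one command line per input, filling {input} in the template."""
    return [command_template.format(input=name) for name in inputs]


def write_paramlist(lines, path="paramlist"):
    """Write the commands to a file, one per line (assumed paramlist format)."""
    Path(path).write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Hypothetical tool and input files, for illustration only.
    samples = ["sample1.fa", "sample2.fa", "sample3.fa"]
    commands = build_paramlist("./my_tool --in {input} --out {input}.out", samples)
    for line in commands:
        print(line)
    # write_paramlist(commands)  # uncomment to create the file
```

Each line is an independent serial job, which is what makes this kind of workload embarrassingly parallel.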

Page 36: Introduction to HPC resources for BCB 660

Gratitude

TACC staff for slides:

- Matt Vaughn
- Michael Gonzalez
- And many more

URL: http://www.tacc.utexas.edu/user-services/user-guides/