
Welcome New Users!

Getting Started with HPC

Erin Shaw & Cesar Sul
Advanced Cyberinfrastructure Research & Education Facilitators (ACI-REF)
USC Center for High-Performance Computing (HPC)

Spring 2017

1. What is HPC?

§  HPC is USC's Center for High-Performance Computing.
•  HPC advances USC's mission by providing the infrastructure and support necessary for research computing.
•  It exists to help advance scientific discovery at USC.

§  HPC is a world-class supercomputing center!
•  As part of "standing up" an upgraded system, HPC runs and publishes standard performance benchmarks.
•  It is currently ranked the 12th fastest academic supercomputer in the U.S. by TOP500.org, the international supercomputer ranking site.

ITS Information Technology Services

Douglas Shook, USC CIO
Maureen Dougherty, USC HPC Director
Randolph Hall, USC VP of Research & Faculty Executive Director, HPC*

*The HPC Faculty Advisory Committee advises the CIO about the faculty's research needs related to the university's HPC resources.

[Diagram: HPC sits within the ITS Data Center]

HPC User Base

§  HPC is a USC-wide resource.
•  HPC resources are available at no charge to USC faculty, researchers and students.
•  Most users are from Dornsife, Viterbi and Keck.
•  Others are from business, psychology, cinema, pharmacy and elsewhere.
•  There are 11 class accounts.

§  HPC is housed within the ITS data center and is monitored around the clock by ITS staff.

HPC Facilitation

§  Request assistance
•  Email hpc@usc.edu (email again!)
•  Drop in to Office Hours
-  Every Tuesday @ 2:30pm (UPC LVL3M)
-  After workshops, when scheduled, Wednesdays @ 4:00pm (HSC NML203)
•  Request a lab/individual consultation

§  Learn more!
•  Visit https://hpcc.usc.edu
•  Attend a workshop (when scheduled)
-  UPC: Fridays, 2:30-4:30pm, VPD 106
-  HSC: Wednesdays, 1:30-3:30pm, MCA 249

2. HPC Accounts

§  HPC accounts are project-based; faculty, staff, researchers and graduate students can apply for up to two HPC project accounts
•  HPC refers to the applicant as the PI of the project
•  The PI can add members to their project group (or be the sole member)
-  e.g., a professor adds students to a class project
-  e.g., an investigator adds graduate students to a research project
•  Members can belong to multiple projects, including their own

§  Projects are allocated a core-hours and disk-space quota
•  The PI can ask HPC to increase hours and space through the website
•  Use $ mybalance / $ myquota to monitor compute hours / disk space, as shown below
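For example, from any head node (a sketch; the exact output layout is illustrative):

  $ mybalance -h     # remaining compute-hour balance for each of your projects
  $ myquota          # disk usage and quotas for your home and project directories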

2. HPC Special Accounts

§  Class accounts
•  Instructors can create class accounts for their students, for teaching and class assignments.
-  We had eleven courses across the academic year

§  Secure data accounts (NEW!)
•  As of January 2017, HPC can now be used with sensitive and restricted-access data.
•  Previously, you could not store or process data or documents on HPC if they belonged to a category of legally protected or high-risk information.

3. HPC Computing Cluster

§  A computer cluster… consists of connected computers (nodes) that work together… nodes are connected to each other through fast local area networks… with each node running its own instance of an operating system… usually includes software for high-performance distributed computing

3. HPC Computing Cluster

[Photos: a simple, home-built cluster; network cables; one rack in the HPC cluster; multiple racks in the HPC cluster; rows of racks in the HPC cluster!]

3. HPC Computing Cluster

[Diagram: cluster architecture behind the network firewall]

•  Head Nodes*: hpc-login2, hpc-login3
   $ ssh NetID@hpc-login3.usc.edu
•  Data Transfer Node*: hpc-transfer
•  Compute Nodes (HPC Cluster): >2,700 nodes, >32K cores (CentOS)
•  Fast Networks: Myrinet (10 Gbit/sec), Infiniband (56 Gbit/sec)
•  Data Storage: 2.4 PB (total), 328 TB (/staging)

*Only head nodes can access the Internet. Head nodes and the DTN are shared by all users.

4. Working on HPC from a Laptop or Desktop

§  A secure network is required
•  Use USC Secure Wireless or USC Ethernet to connect from USC
•  Use a Virtual Private Network (VPN) client to connect from outside USC

§  A secure shell (ssh) is required (a shell is Linux's command line interface)
•  On Macs, use Terminal, a native application
-  Additionally, install XQuartz (www.xquartz.org) for GUI viewing
•  On Windows, install X-Win32 from software.usc.edu
-  Or install another personal favorite, e.g. PuTTY, SecureShell, etc.

§  To connect
•  On a Mac Terminal, type "ssh -X <YourUSCNetID>@hpc-login2.usc.edu"
•  On Windows, configure the ssh connection for hpc-login2.usc.edu
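For example, a first login from a Mac might look like this (a sketch; ttrojan is a placeholder NetID and the prompt lines are illustrative):

  $ ssh -X ttrojan@hpc-login2.usc.edu    # -X enables X11 forwarding for GUI programs
  ttrojan@hpc-login2.usc.edu's password:
  [ttrojan@hpc-login2 ~]$                # logged in, starting in your home directory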

4. Working on HPC from a Laptop or Desktop

§  A secure file transfer protocol (sftp) is required for transferring data files
•  Use the Linux commands scp and rsync for command line transfers (examples below)
•  Use one of the many sftp client applications available, e.g.,
-  Filezilla is available for both Mac and Windows
-  Choose your favorite (e.g., SecureShell supports both login and file transfer)
-  See https://itservices.usc.edu/sftp/ for options

§  To connect
•  Configure the sftp connection for hpc-transfer.usc.edu
-  hpc-transfer is a dedicated DTN (data transfer node)
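A minimal sketch of command-line transfers through the DTN (ttrojan and the project name ab1 are placeholders):

  # copy a local file up to your project directory
  $ scp results.tar.gz ttrojan@hpc-transfer.usc.edu:/home/rcf-proj/ab1/ttrojan/

  # mirror a local folder into staging; rsync resumes and skips up-to-date files
  $ rsync -avz dataset/ ttrojan@hpc-transfer.usc.edu:/staging/ab1/ttrojan/dataset/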

5. HPC File System

§  File systems control how data is stored and used on a disk
§  Data is presented to the HPC cluster from path "/home"

"root", denoted by a forward slash ("/"), is the top level of the Linux operating system's file system directory hierarchy
-  Type the commands "cd /" then "ls" to view the directories under root

/ (root)
├ auto
├ bin
├ home     ← HPC user home and project directories
├ lib
├ lib64
├ mnt
├ sbin
├ staging  ← HPC data staging directories
├ tmp
├ usr      ← HPC-maintained software is in /usr/usc

5. HPC File System

§  The locations of your project, applications, codes, libraries, data, etc. are all specified by unique paths

To find your current path, type pwd (print your working directory):

  $ pwd
  /home/rcf-proj/T/trojan

rcf-proj indicates that this is a project directory, where "T" is the project and "trojan" is the user.

  $ pwd
  /home/rcf-40/trojan

rcf-xx indicates that this is a user's home directory, where "trojan" is the user.

5. HPC File System

§  Paths are either absolute or relative

Absolute paths contain root, or a symbol that expands to a full path. Examples:

  /home/rcf-proj/hpc/hpcuser   '/' starts at the top (root) level
  ./mycatphotos/cat.jpg        '.' resolves to the path of your current directory
  ~/.bashrc                    '~' resolves to the path of your home directory

All other paths are interpreted relative to your current directory. Examples:

  $ cd mycatphotos             change directory to 'mycatphotos' (in the current dir)
  $ cat mycatphotos/cat.dat    display the contents of file cat.dat in 'mycatphotos'

[Diagram: /home/rcf-xx/ containing the user directories csul/, shaw/ and chris/]

5. File System: Home Directory

§  Users log in to their home directory
  /home/rcf-40/<user_name>
-  the value of the environment variable $HOME
•  Private directory; only the user can modify files
•  Backed up daily

§  User quotas
•  1 GB of disk quota and a 100,000-file quota
-  applications may install hidden files here

§  Used for
•  Logging in, setting up your environment and storing personal files
-  not for installation, computation or large storage

5. Project Directory

[Diagram: /home/rcf-proj/ containing proj1/ (member jimi/) and proj2/ (members csul/, shaw/ and chris/)]

§  Every project has its own directory
  /home/rcf-proj/<project_name>
•  The PI is the owner; every member has a subdirectory
•  The project quota (max 2 TB) is shared among all members
•  Backed up daily

§  Used for
•  Installing software, running jobs and storing data
-  only the PI can create shared subdirectories and install software at the top level

§  Permissions
•  By default, member subdirectories have group read access
-  members can make theirs private (see the sketch below); never set permissions so others can write
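For instance, a member could make their subdirectory private, and later restore the default, like this (a sketch; ab1 and ttrojan are placeholder project and user names):

  $ cd /home/rcf-proj/ab1
  $ chmod g-rwx ttrojan    # remove all group access: the directory is now private
  $ chmod g+rx ttrojan     # restore the default group read access, if desired
  $ ls -l                  # verify; never grant group or world write (g+w, o+w)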

5. Staging Directory

§  Every project has a staging directory
  /staging/<project_name>
-  same structure as the project directory
-  for staging data for jobs (copy data/results to/from it), as sketched below
•  Lots of space (~328 TB)
-  no quotas!
-  /staging is cleared during semi-annual downtimes
•  Data is not backed up
-  store a copy of your data somewhere else

§  Staging is a parallel file system
•  It has faster r/w access rates than the project file system

[Diagram: /staging/ containing proj1/ (jimi/) and proj2/ (csul/, shaw/ and chris/)]
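A typical staging workflow, sketched assuming ab1 is your project name (keep the master copy in your project directory, since /staging is cleared at downtimes and is not backed up):

  $ cp -r /home/rcf-proj/ab1/ttrojan/inputs /staging/ab1/ttrojan/    # stage inputs
  (run your job, reading and writing under /staging/ab1/ttrojan)
  $ cp -r /staging/ab1/ttrojan/results /home/rcf-proj/ab1/ttrojan/   # save results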

5. Temporary storage on compute node

§  Every job has access to local storage (~60 GB to 1.8 TB)

  $TMPDIR
-  equal to /tmp/{your_job_id}

  /scratch
-  combines $TMPDIR from the first 20 nodes
-  /scratch is available to all nodes of a job

•  Fastest r/w rates
•  Only accessible while on the compute node

§  Data is not backed up!
•  Compute node directories are cleaned at the end of every job
-  copy data back to your project or staging directories

[Diagram: under / (root), $TMPDIR/{your_data} exists on one node only, while scratch/{your_data} spans one or more nodes]
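Inside a PBS job script the pattern might look like this (a sketch; myprogram, in.dat and the ab1 paths are placeholders):

  cp /staging/ab1/ttrojan/in.dat $TMPDIR/    # stage input onto fast node-local disk
  cd $TMPDIR
  /home/rcf-proj/ab1/ttrojan/bin/myprogram in.dat out.dat   # heavy I/O stays local
  cp out.dat /staging/ab1/ttrojan/           # copy results off before the job ends;
                                             # $TMPDIR is wiped when the job exits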

6. HPC Computing Resources

§  HPC has two computing clusters
•  ~1,700 nodes on the original Myrinet (10 Gbps) interconnect cluster
•  ~1,300 nodes on the newer Infiniband (56 Gbps) interconnect cluster

•  264 Hewlett-Packard SL250, dual Xeon 8-core 2.6 GHz CPUs, dual NVIDIA K20 GPUs containing 2,496 cores, each with 64 GB memory
•  448 Hewlett-Packard SL230, dual Xeon 8-core 2.6 GHz CPUs, with 64 GB memory
•  288 Lenovo nx360 m5, dual Xeon 8-core 2.6 GHz CPUs, with 64 GB memory
•  19 Lenovo nx360 m5, 2.6 GHz, dual NVIDIA K40 GPUs containing 2,880 cores, each with 64 GB memory
•  5 Lenovo nx360 m5, 2.6 GHz, dual NVIDIA K80 GPUs containing 2x2,496 cores, each with 64 GB memory (condo'd by a research group, not public)

§  Run jobs on compute nodes!

6. HPC Computing Cluster (June 2016)

Index  Vendor  Model     CPU Number & Type            Core Speed  Memory  GPU Number & Type
1      Dell    R910      Quad Intel Xeon Decacore     2.0 GHz     1 TB    -
2      HP      SL160     Dual Intel Xeon Hexcore      3.0 GHz     24 GB   -
3      HP      DL165     Dual Intel Xeon Dodecacore   2.3 GHz     48 GB   -
4      Oracle  X2200     Dual AMD Opteron Dualcore    2.3 GHz     16 GB   -
5      Dell    PE1950    Dual Intel Xeon Quadcore     2.5 GHz     12 GB   -
6      Oracle  X2200     Dual AMD Opteron Quadcore    2.3 GHz     16 GB   -
7      IBM     DX360     Dual Intel Xeon Hexcore      2.6 GHz     24 GB   -
8      HP      SL250S    Dual Intel SB Xeon Octocore  2.6 GHz     64 GB   Dual NVIDIA K20
9      HP      SL230S    Dual Intel SB Xeon Octocore  2.6 GHz     64 GB   -
10     Lenovo  NX360 M5  Dual Intel Xeon Octocore     2.6 GHz     64 GB   -
11     Lenovo  NX360 M5  Dual Intel Xeon Octocore     2.6 GHz     64 GB   Dual NVIDIA K40

type  queue(s)          nodes  ppn  gpus  avx       /tmp     core        cpu      model    network
1     large-mem         4      40   -     -         1.8 TB   decacore    xeon     r910     myri
2     large main quick  8      12   -     -         140 GB   hexcore     xeon     sl160    myri
3     large main quick  67     24   -     -         895 GB   dodecacore  xeon     dl165    myri
4     large main quick  26     8    -     -         60 GB    dualcore    opteron  x2200    myri
5     large main quick  54     8    -     -         60 GB    quadcore    xeon     pe1950   myri
6     large main quick  138    8    -     -         60 GB    quadcore    opteron  x2200    myri
7     large main quick  4      12   -     -         200 GB   hexcore     xeon     dx360    myri
8     large main quick  237    16   2     avx       850 GB   octocore    xeon     sl250s   IB
9     large main quick  45     16   -     avx       5500 GB  octocore    xeon     sl230s   IB
10    large main quick  217    16   -     avx avx2  5500 GB  octocore    xeon     nx360m5  IB
11    large main quick  19     16   2     avx avx2  5500 GB  octocore    xeon     nx360m5  IB

Node names by type:
1:  hpc-1t-1 hpc-1t-2 hpc-1t-3 hpc-1t-4
2:  hpc0965-0972
3:  hpc0981-1021 hpc1044-1050 hpc1123-1128 hpc1196-1200 hpc1223-1230
4:  hpc1723-1728 hpc1734-1739 hpc1741-1742 hpc1744-1754 hpc1756
5:  hpc2283-2318 hpc2320-2337
6:  hpc2349-2370 hpc2470 hpc2472-2481 hpc2483 hpc2486-2505 hpc2510-2544 hpc2546-2559 hpc2561-2580 hpc2582-2597 hpc2600
7:  hpc2758-2761
8:  hpc3025-3027 hpc3031-3264
9:  hpc3648-3688 hpc3695 hpc3766-3768
10: hpc3769-3792 hpc3888-4056 hpc4081-4104
11: hpc3817-3834 hpc3852


7. Running Jobs on the Cluster

[Diagram: submitting work from the head nodes (hpc-login2, hpc-login3) to the compute nodes]

Let's do our CGA homework…

  $ ssh NetID@hpc-login3.usc.edu

Let's test this on the cluster…

  $ qsub -I -l nodes=2:ppn=8        (wait in queue)
  (interactive job) $ myprogram

Let's submit this to the cluster…

  $ qsub myjob.pbs                  (goes to the job scheduler*)
  (batch job) $ myprogram           (non-interactive batch job; results are printed to a file)

*HPC uses the TORQUE/PBS resource manager and the Moab cluster scheduler. Jobs are scheduled based on the order submitted, the number and types of nodes requested, and the time required.

7. Submitting a Job – Batch Mode

§  Use a PBS script to submit a job to the HPC cluster
1.  Add PBS computing resource requests
2.  Add shell commands
3.  Submit your job to the queue: $ qsub myjob.pbs

§  Example: myjob.pbs

  #!/bin/bash
  #PBS -l nodes=2:ppn=16
  #PBS -l walltime=02:00:00

  # change directory
  cd /path/to/myproject

  # set path/environment variables
  source /usr/usc/sas/default/setup.sh

  # run program
  sas my.sas

7. Submitting a Job – Interactive mode

§  PBS has a special job submission mode that allows you to access allocated computing resources interactively, for testing only

Example: request 1 node with 8 processors per node for one hour:

  $ qsub -I -l nodes=1:ppn=8 -l walltime=01:00:00

§  When an interactive job is accepted, a new login shell will start on the first compute node
•  You can run programs as many times as you want until the requested time expires (usually up to two hours)
•  Extremely useful for compiling/debugging/testing your code and preparing your PBS scripts
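A sketch of a short interactive session (the compiler setup path and myprogram are placeholders):

  $ qsub -I -l nodes=1:ppn=8 -l walltime=01:00:00
  qsub: waiting for job … to start           # a shell opens on the first compute node
  $ cd /home/rcf-proj/ab1/ttrojan
  $ source /usr/usc/intel/default/setup.sh   # set up a compiler (see section 8)
  $ make && ./myprogram test_input.dat       # edit, rebuild and rerun as needed
  $ exit                                     # releases the nodes back to the queue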

7. Submitting a Job - PBS commands

§  Job control commands
  qsub                  submit a job
  qdel                  delete a job

§  Job monitoring commands
  qstat -u <user_id>    show my queue status
  showstart <job_id>    show the queuing schedule
  checkjob <job_id>     show job statistics

§  See Adaptive Computing's Torque (PBS) documentation
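Applied to a single job, the submit-and-monitor loop might look like this (a sketch; the job id 123456 is hypothetical):

  $ qsub myjob.pbs       # prints the new job's id, e.g. 123456
  $ qstat -u ttrojan     # is the job queued (Q), running (R) or complete (C)?
  $ showstart 123456     # the scheduler's estimate of the job's start time
  $ checkjob 123456      # detailed job statistics and any hold/block reasons
  $ qdel 123456          # remove the job from the queue if you change your mind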

7. Submitting a Job – Queues

§  There are four queues available to the public

§  Each queue has different constraints
•  Number of queued jobs, nodes, "walltime", simultaneous jobs
•  The largemem queue is only for highly parallel jobs

§  By default, a queue will be selected for you based on walltime
•  Some research labs have their own nodes and queues (-q)

Queue Name  Maximum Jobs Queued  Maximum Node Count  Maximum Wall Time  Maximum Jobs per User
main        1000                 99                  24 hours           10
quick       100                  4                   1 hour             10
large       100                  256                 24 hours           1
largemem    100                  1                   336 hours          1
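You can also name a queue explicitly with -q; for example, a hypothetical largemem script header (limits taken from the table above):

  #PBS -q largemem
  #PBS -l nodes=1:ppn=40         # largemem allows at most 1 node
  #PBS -l walltime=300:00:00     # must stay under the 336-hour limit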

8. Installed Software

§  HPC maintains software in /usr/usc/
•  Includes compilers; statistical, mathematical and simulation programs; numerical libraries; licensed applications and more…

§  You can also install software in your project directory
•  HPC can help with this

  $ ls /usr/usc
  acml/    fftw/      imp/    mpich2/    qespresso/
  amber/   gaussian/  intel/  mpich-mx/  qiime/
  aspera/  gflags/    iperf/  mvapich2/  R/
  bbcp/    git/       java@   NAMD/      root/
  bin/     globus/    jdk/    ncview/    sas/
  (many more)

8. Installed Software

§  To use software in /usr/usc
•  Select a version
-  use "default" for the most recent version
•  In each version directory are two setup scripts:
-  setup.sh (for use with the bash shell, which is the default)
-  setup.csh (for use with the "t" or "c" shells)
•  Type the following to set up the environment for the software:
-  $ source setup.sh

  $ ls /usr/usc/python
  2.6.5/  2.7.6/  2.7.8/  3.3.3/  3.4.3/  3.4.5/  3.5.1/  3.5.2/  default@

  $ ls /usr/usc/python/default/
  bin/  include/  lib/  man/  setup.csh  setup.sh  share/
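Putting it together for the python installation shown above (a sketch; the available version directories will change over time):

  $ ls /usr/usc/python                         # list the installed versions
  $ source /usr/usc/python/default/setup.sh    # "default" points at the newest version
  $ which python                               # confirm the shell now finds HPC's python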

9. HPC Policies

§  Required reading
•  https://hpcc.usc.edu/support/accounts/hpcc-policies/

§  Resource limits
•  On head nodes, jobs (24 hours), allocations (core hours, disk space)

§  Scheduled downtimes
•  Twice a year, for upgrades and maintenance

§  Protected data allowed within HPC Secure Data Accounts (only)
•  HPC is now HIPAA-compliant

§  Play well with others, practice safe computing
•  i.e., share public nodes, not private data!

9. USC Policies

§  Recommended reading
•  It is your responsibility to abide by these

§  Information Security
•  http://policy.usc.edu/info-security/

§  Network Infrastructure Use
•  http://policy.usc.edu/network-infrastructure/

§  Privacy of Personal Information
•  http://policy.usc.edu/info-privacy/

§  Digital Millennium Copyright Act Compliance
•  http://cio.usc.edu/copyright/policy/

10. Linux Commands & Concepts

Environment
  bash shell, .bashrc
  environment variables ($PATH)

Keyboard navigation
  <up>/<down>: show prev/next cmd
  <ctl-a>/<ctl-e>: move to beginning/end of line
  <alt-f>/<alt-b>: move forward/back a word
  <tab>, <tab-tab>: autocomplete

Special characters
  "*": wildcard
  "/", "~": root dir, home dir
  ".", "..": current dir, parent dir
  ">", "<": redirect output/input
  "|": pipe output to input of next cmd

Navigation
  $ pwd  $ cd  $ ls (-alh)  $ cp / mv  $ touch / rm (-i)  $ mkdir / rmdir  $ history

Reading/editing files
  $ cat, $ less  $ nano ($ vi, $ emacs)

Permissions
  $ chmod  $ chgrp

Information
  $ mybalance -h  $ myquota  $ top  $ du -h  $ man  $ echo [$PATH]  $ wc  $ sort

Tools
  $ alias  $ wget  $ for (loop)

10. Linux Commands & Concepts (applied)

$  mybalance -h
$  myquota
$  top
$  pwd
$  ls, ls -l, ls -F --color
$  man ls
$  echo hello
$  <up> <up> <up>  <down> <down> <down>
$  echo $PATH
$  cd .., pwd, ls

$  cd /home/rcf-proj/<myproj>/<mydir>
$  mkdir test, ls, rmdir test, ls
$  mkdir workshop, ls
$  cd workshop, pwd
$  touch a.a, ls
$  cp a.a b.b, ls
$  mv b.b c.c, ls
$  rm c.c, ls
$  alias rm='rm -i'
$  rm a.a, ls

10. Linux Commands & Concepts (applied)

$  alias cdp='cd /path/to/proj'
$  cd ~, pwd
$  cdp, pwd
$  cd ~, ls -alh .bash*
$  cat .bashrc
$  cp .bashrc .bashrc_ori
$  nano ~/.bashrc

  alias rm='rm -i'
  alias cdp='cd /path/to/proj'
  alias ll='ls -hlt'

$  cdp
$  ll workshop
$  chmod g+w workshop, ls -l
$  chmod g-rw workshop, ls -l
$  cd workshop
$  wget https://hpcc.usc.edu, ls
$  wc -l index.html
$  cat index.html | grep @
$  cat index.html | grep jpg

$  ls /usr/usc/mat
   <tab> to autocomplete (fails); <tab><tab> to show candidates
$  ls /usr/usc/matl
   <tab> to autocomplete (succeeds); <tab><tab> to show candidates
$  ls /usr/usc/matlab/

$  ls *
$  for i in *; do echo $i; done

10. Linux Commands & Concepts (applied)

$  history
$  history >> history.out
$  less history.out
$  less history.out | grep wget
$  ls -t /usr/usc/ > ls.out
$  less ls.out
$  less ls.out | sort
$  less ls.out | sort -f | head -n 5
$  du -h * | sort -n
$  du -h --summarize

10. Linux References

§  HPC workshop
•  Introduction to Linux, PBS & the HPC Cluster

§  Lynda video (access via USC)
•  https://www.lynda.com/Linux-tutorials/Learn-Linux-Command-Line-Basics/435539-2.html

§  Software Carpentry tutorial
•  http://swcarpentry.github.io/shell-novice/

§  O'Reilly Books directory
•  http://www.linuxdevcenter.com/cmd/

§  Many, many websites… use search

Appendix I – DDDT!*

§  Don't share your password
•  Goes without saying!

§  Don't set permissions so others can write to your directory
•  It makes it easy for others to overwrite and delete your files
•  Create a group-shared directory for your group instead

§  Don't run compute-intensive jobs on head nodes
•  Use a compute node. Everyone is watching ($ top).

§  Don't read/write/copy zillions of tiny files
•  Use a database (lmdb, mysql) or a large file to combine data, as sketched below.

*Don't Do Dumb Things!
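For example, rather than writing many tiny files, bundle them (a sketch using standard tools; the file names are placeholders):

  $ tar -czf tinyfiles.tar.gz tinyfiles/    # one archive instead of zillions of files
  $ cat part_*.csv > all_parts.csv          # or append small records into one large file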