Training Day: High Performance Computing Cluster

Transcript of "Training Day: High Performance Computing Cluster" (bioinfo.genotoul.fr/wp-content/uploads/Diapo_Cluster_EN.pdf)

Page 1

Training Day: High Performance Computing Cluster

Page 2

Pre-requisite: Linux

Today:

● Connect to the « genotoul » server

● Basic command-line usage

● Filesystem Hierarchy Standard

● Useful tools (find, sort, cut, grep...)

● Transferring & compressing files

● How to use the High Performance Computing Cluster (compute nodes)

Page 3

Objectives

➔ To make the best use of the computational power
➔ How to submit jobs on compute nodes
➔ How to manage your jobs (status, kill...)
➔ Autonomy, self-mastery

Page 4

Planning of the day

Part I: 09h00 - 12h00
  Compute nodes environment
  Open Grid Engine
  Practical 1

Part II: 14h00 - 17h00
  Submit arrays of jobs
  Practical 2
  Parallel environments
  Practical 3

Page 5

Connection to the « genotoul » cluster

Users connect from the Internet over ssh to the « genotoul » login nodes, which give access to the storage facilities and the compute nodes:

● node001 to node068: 2720 INTEL cores, 17 TB of memory
● ceri001 to ceri034: 1632 AMD cores, 12 TB of memory
● bigmem01: 64 INTEL cores, 1 TB of memory
● smp: 240 INTEL cores, 3 TB of memory

Page 6

Connection to genotoul

● Pre-requisite: ask for a Linux account: http://bioinfo.genotoul.fr/index.php?id=81

● SSH connection to the login nodes (use PuTTY on a Windows desktop): genotoul.toulouse.inra.fr

● Linux command line (terminal session)
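For example, from a Linux or macOS terminal (a minimal sketch; "username" is a placeholder for your own login):

$ ssh username@genotoul.toulouse.inra.fr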

Page 7

Vocabulary: Cluster / Node

● Cluster: a set of nodes

● Node: a large computer (with several CPUs)

(Diagram: one node containing several CPUs)

Page 8

Vocabulary: CPU / Core

● CPU: Central Processing Unit
● Core

(Diagram: one dual-core CPU)

Page 9

Login nodes: alias « genotoul »

● Each server = 32 INTEL cores, 128 GB of memory
● Linux 64 bits, based on the CentOS 6 distribution
● Hundreds of simultaneous users
● Secured (SSH only), daily backups
● FUNCTIONS:
➔ To provide development environments
➔ To test your scripts before data analysis
➔ To launch batches on the cluster nodes
➔ To follow the execution of jobs
➔ To retrieve result data in the /save directory

Page 10

Login nodes: alias « genotoul »

● Environment dedicated to bioinformatics
➔ Software in /usr/local/bioinfo/src (e.g. blastall, clustalw, iprscan, megablast, wu-blast, ...)
➔ Genomics databanks in /bank

● Development languages
➔ Shell, Perl, C++, Java, Python...

● Editing tools
➔ nedit, geany, nano, emacs, vi, ...

Page 11

Access to cluster nodes

● Interactive mode: for beginners / for remote display

● Batch access: for intensive usage (most jobs)

● Communication between the server and the compute nodes is managed by the grid scheduler. No direct SSH access to the nodes.
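The two access modes, side by side (a sketch using commands detailed later in this training; myscript.sh is a placeholder):

$ qlogin             # interactive: opens a shell on a compute node
$ qsub myscript.sh   # batch: submits a script to the scheduler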

Page 12

Data storage

Drive bay

Page 13

Disk spaces

/usr/local/bioinfo/   Bioinformatics software
/bank/                International genomics databanks
/home/                User configuration files (ONLY) (100 MB user quota)
/save/                User disk space (with BACKUP) (250 GB user quota)
/work/                HPC TEMPORARY disk space (1 TB user quota)
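A typical pattern follows from these quotas (a sketch; the per-user paths /work/$USER and /save/$USER are assumptions, check your own account layout):

$ cd /work/$USER                 # compute in the large TEMPORARY space
$ cp results.txt /save/$USER/    # from a login node, keep final results in the backed-up space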

Page 14

HPC environment

High Performance Computing:
● The workspace is exactly the same as on the genotoul servers (software, databanks, disk spaces).
● Exception: permission rights on the disk spaces (read-only on the /save directory).
● Tips:
➔ Submission and control from genotoul
➔ Portable binaries (no need to recompile)
➔ Facilities to retrieve results

Page 15

Cluster nodes

High Performance Computing cluster:
● node001 to node068 (INTEL)
● ceri001 to ceri034 (AMD)
● bigmem01
● smp

Page 16

Cluster nodes

● INTEL cluster: 68 nodes purchased in 2014 => each 20 cores (40 threads), 256 GB memory

● AMD cluster: 34 nodes purchased in 2012 => each 48 cores (48 threads), 384 GB memory

● BIGMEM: 1 node purchased in 2012 => 32 cores (64 threads), 1 TB memory

● SMP: 1 node purchased in 2014 => 120 cores (240 threads), 3 TB memory

● High-performance clustered file system (GPFS) on /work

Page 17

Planning of the day

Part I: 09h00 - 12h00
  Compute nodes environment
  Open Grid Engine
  Practical 1

Part II: 14h00 - 17h00
  Submit arrays of jobs
  Practical 2
  Parallel environments
  Practical 3

Page 18

Grid Engine is responsible for accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel, or interactive user jobs.

It also manages and schedules the allocation of distributed resources such as processors and memory.
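In practice, interaction with the scheduler comes down to a handful of commands, each covered in the following slides (a minimal sketch; myscript.sh is a placeholder):

$ qsub myscript.sh   # submit a job script to the scheduler
$ qstat              # check the state of your jobs
$ qdel <job-ID>      # delete a job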

Page 19

OGE (Open Grid Engine)

Queues available for users:

Queue            Access     Priority  Max time   Max slots
workq (default)  everyone   300       96 h       4120
unlimitq         everyone   100       unlimited  680
smpq             on demand  0         unlimited  240
hypermemq        on demand  0         unlimited  96
interq (qlogin)  everyone   100       48 h       40
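A non-default queue is requested at submission time with the -q option (a sketch; myscript.sh is a placeholder):

$ qsub -q unlimitq myscript.sh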

Page 20

OGE (Open Grid Engine)

Resource quota limitations

They depend on your genotoul Linux group (contributeurs, INRA and/or REGION, autres).

Max slots:

Group          workq (group)  workq (user)  unlimitq (group)  unlimitq (user)
contributeurs  4120           1024          680               256
INRA / REGION  3264           512           128               48
autres         1088           256           32                8

Page 21

OGE (Open Grid Engine)

Default parameters:
● workq
● 1 core
● 8 GB memory maximum
● Write access only to the /work directory (temporary disk space)
● 1 TB disk quota per user (on the /work directory)
● Files not accessed for 120 days are automatically purged
● 100,000 h of computing time annually (more on demand)

Page 22

OGE (Open Grid Engine)

qrsh (interactive mode)
qlogin (interactive mode with graphical redirection)

Connected:

[laborie@genotoul2 ~]$ qlogin
Your job 2470388 ("QLOGIN") has been submitted
waiting for interactive job to be scheduled ...
Your interactive job 2470388 has been successfully scheduled.
Establishing /SGE/ogs/inra/tools/qlogin_wrapper.sh session to host node001 ...
[laborie@node001 ~]$

Disconnected:

[laborie@node001 ~]$ exit
logout
/SGE/ogs/inra/tools/qlogin_wrapper.sh exited with exit code 0
[laborie@genotoul2 ~]$

Page 23

OGE (Open Grid Engine)

qsub: batch submission

1 - First write a script (e.g. myscript.sh) with the command lines, as follows:

#$ -o /work/.../output.txt
#$ -e /work/.../error.txt
#$ -q workq
#$ -m bea
# My command lines I want to run on the cluster
blastall -d swissprot -p blastx -i /save/.../z72882.fa

2 - Then submit the job with the qsub command, as follows:

$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted
(the number 15660 is the job ID)

Page 24

OGE (Open Grid Engine)

Job submission: basic options

● -N job_name: give a name to the job
● -q queue_name: specify the batch queue
● -o output_file_name: redirect the standard output
● -e error_file_name: redirect the error output
● -m bea: mail sending options (b: begin, a: abort, e: end)
● -l mem=8G: ask for 8 GB of memory (minimum reservation)
● -l h_vmem=10G: fix the maximum memory consumption
● -l myarch=intel / amd: choose the processor architecture
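Several of these options can be combined on one command line (a sketch; the job name is a placeholder and the /work/... paths are left elided as in the slides):

$ qsub -N my_blast -q workq -o /work/.../output.txt -e /work/.../error.txt -m bea -l mem=8G -l h_vmem=10G myscript.sh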

Page 25

OGE (Open Grid Engine)

Job submission: some examples

● Default (workq, 1 core, 8 GB memory max):

$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted

● More memory (workq, 1 core, 32 / 36 GB memory):

$ qsub -l mem=32G -l h_vmem=36G myscript.sh
Your job 15661 ("myscript.sh") has been submitted

● More cores (workq, 8 cores, 8*8 GB memory):

$ qsub -pe parallel_smp 8 myscript.sh
Your job 15662 ("myscript.sh") has been submitted

Page 26

OGE (Open Grid Engine)

Job submission: some examples

Script edition:

$ nedit myscript.sh

### head of myscript.sh ###
#!/bin/bash
#$ -m a
#$ -l mem=32G
#$ -l h_vmem=36G

# My program starts here
ls
### end of myscript.sh ###

Submission:

$ qsub myscript.sh
Your job 15660 ("myscript.sh") has been submitted

Page 27

OGE (Open Grid Engine)

Monitoring jobs: qstat

$ qstat
job-ID prior name user state submit/start queue slots ja-task-ID

job-ID: job identifier
prior: priority of the job
name: job name
user: user name
state: current state of the job (see below)
submit/start at: submit/start date
queue: batch queue name
slots: number of slots asked for the job
ja-task-ID: job array task identifier (see below)

Page 28

OGE (Open Grid Engine)

Monitoring jobs: qstat

● state: current state of the job

➢ d(eletion): job is being deleted
➢ E(rror): job is in an error state
➢ h(old), w(aiting): job is pending
➢ t(ransferring): job is about to be executed
➢ r(unning): job is running

● man qstat: see all options of the qstat command

Page 29

OGE (Open Grid Engine)

qstat -f: full format display

$ qstat -f
queuename            qtype resv/used/tot. load_avg arch      states
---------------------------------------------------------------------------------
hypermemq@bigmem01   BIP   0/25/64        25.21    linux-x64
   2654562 502.47578 scriptIMR. pbert        r 02/01/2015 10:43:21  24
   3417296 510.00000 spades.sh  klopp        r 02/23/2015 09:50:08   1
---------------------------------------------------------------------------------
hypermemq@bigmem02   BIP   0/3/32          2.00    linux-x64
   2717127 500.10764 bayesian_m lbrousseau   r 02/03/2015 20:28:58   2
   2822735 505.00000 LasMap     faraut       r 02/11/2015 14:29:35   1
---------------------------------------------------------------------------------
interq@node001       IP    0/13/40         2.12    linux-x64
   3455759 501.10143 QLOGIN     mmolettadena r 02/23/2015 15:21:13   1
   3456700 501.10143 QLOGIN     mmolettadena r 02/23/2015 15:33:25   1
   3456911 506.13893 QLOGIN     smehdi       r 02/23/2015 15:36:48   1

Page 30

OGE (Open Grid Engine)

Deleting a job: qdel

$ qstat -u laborie
job-ID  prior     name  user    state submit/start at     queue         slots ja-task-ID
------------------------------------------------------------------------------------------------------
3629151 512.54885 sleep laborie r     02/25/2015 16:23:03 workq@node002 1

$ qdel 3629151
laborie has registered the job 3629151 for deletion

Page 31

Connection to the « genotoul » cluster (recap)

Users connect from the Internet over ssh to the « genotoul » login nodes, which provide:
● Access to the platform
● Development (scripts)
● Job submission to the cluster: qrsh, qlogin, qsub, qstat, qdel
● File transfer to /save

Storage facilities (from the compute nodes): /save: read-only; /work: read + write

Compute nodes (workq, hypermemq, smpq):
● node001 to node068: 2720 INTEL cores, 17 TB of memory
● ceri001 to ceri034: 1632 AMD cores, 12 TB of memory
● bigmem: 64 INTEL cores, 1 TB of memory
● smp: 240 INTEL cores, 3 TB of memory

Page 32

Monitoring genotoul cluster

Page 33

Practical

Part 1

Page 34

Planning of the day

Part I: 09h00 - 12h00
  Compute nodes environment
  Open Grid Engine
  Practical 1

Part II: 14h00 - 17h00
  Submit arrays of jobs
  Practical 2
  Parallel environments
  Practical 3

Page 35

Array of jobs concept

➔ Concept: segment a job into smaller atomic jobs
➔ Improves processing time very significantly
  (the calculation is performed on multiple processing cores)

Page 36

Execution on a single core

Ex.1: blast in basic mode (GenBank nucleotide sequence reference)

Input: seqs.fa (multi-fasta file) + the NT databank

$ qsub script.sh

script.sh:
blastn+ -db nt -query seqs.fa

Page 37

Execution on 3 cores

Ex.2: blast in split mode

seqs.fa is split into seq1.fa, seq2.fa, seq3.fa

$ qsub script1.sh
$ qsub script2.sh
$ qsub script3.sh

script1.sh: blastn+ -db nt -query seq1.fa
script2.sh: blastn+ -db nt -query seq2.fa
script3.sh: blastn+ -db nt -query seq3.fa

Page 38

Execution on 3 cores

Ex.3: blast in job array mode

seqs.fa is split (split ..., for i in ...) into seq1.fa, seq2.fa, seq3.fa

script.sh:
blastx+ -d nt -i seq1.fa
blastx+ -d nt -i seq2.fa
blastx+ -d nt -i seq3.fa

$ qarray script.sh

Page 39

Ex.3: blast in job array mode (equivalence)

$ qarray script.sh          (script.sh contains the 3 blast lines)

  is equivalent to:

$ qsub script1.sh
$ qsub script2.sh
$ qsub script3.sh           (each scriptN.sh contains one blast line)

Page 40

Tools

Split a fasta file: fastasplit <path> <dirpath>

Sequence Input Options:
-f --fasta [mandatory]
-o --output [mandatory]
-c --chunk [2]

Example:

$ mkdir out_split
$ fastasplit -f seqs.fa -o out_split -c 6

Page 41

Tools

Create a multi-commands file:

1 rm script.sh
2 for f in `ls out_split/*`
3 > do
4 > echo blastn+ -query $f -db ensembl_danio_rerio -o $f.blast >> script.sh
5 > done

(1) If you execute the 'for' loop a second time, you MUST DELETE script.sh first, as '>>' appends lines to the file if it already exists.

➢ ` : the backtick, the character on the '7' key (2)
➢ for: $f will loop over the result of the command between ` ... ` (2), i.e. the output of the split
➢ do: syntactically required (3)
➢ echo: prints to the screen (4)
➢ >>: redirects the screen output to the file script.sh (4)
➢ done: syntactically required (5)
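End to end, the job-array workflow of the last few slides can be sketched as follows (a sketch, assuming the fastasplit tool and qarray wrapper shown above; the databank name comes from the example, and -f makes rm silent on the first run):

$ mkdir out_split
$ fastasplit -f seqs.fa -o out_split -c 6
$ rm -f script.sh
$ for f in `ls out_split/*`
> do
> echo blastn+ -query $f -db ensembl_danio_rerio -o $f.blast >> script.sh
> done
$ qarray script.sh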

Page 42

Practical

Part 2

Page 43

Planning of the day

Part I: 09h00 - 12h00
  Compute nodes environment
  Open Grid Engine
  Practical 1

Part II: 14h00 - 17h00
  Submit arrays of jobs
  Practical 2
  Parallel environments
  Practical 3

Page 44

OGE (Open Grid Engine)

Previous use of the cluster: 1 job = 1 thread (one core)

$ qarray script.sh

script.sh:
blastx+ -d nt -i seq1.fa
blastx+ -d nt -i seq2.fa

blast1, blast2: each blast uses 1 core

Page 45

OGE (Open Grid Engine)

Parallel environments: if the program was developed for it, 1 job can use multiple threads.

$ qsub -pe parallel_smp 2 script.sh

script.sh:
blastx+ -num_threads 2 -d nt -i seqs.fa

The blast uses 2 cores.

Page 46

OGE (Open Grid Engine)

Parallel environments

Visualisation:
qconf -spl                 (list the parallel environments)
qconf -sp <parallel_env>   (show the configuration of one environment)

Utilisation: qsub -pe <parallel_env> <n slots> myscript.sh
● smp: X cores on the same node (multi-thread, OpenMP)
● parallel_fill: fill up the node, then use other nodes (MPI)
● parallel_rr: X cores on strictly different nodes (MPI)
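For example (a sketch; the slot counts are placeholders):

$ qsub -pe parallel_smp 4 myscript.sh   # 4 cores on a single node (multi-thread)
$ qsub -pe parallel_rr 16 myscript.sh   # 16 slots spread across different nodes (MPI programs only)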

Page 47

OGE (Open Grid Engine)

Parallel environments: smp

Shared memory within a single node.
Requires a program optimized for it (e.g. for blast, do not use more than 8 threads).

(Diagram: a multi-threaded blast within one node)

Page 48

OGE (Open Grid Engine)

Parallel environments: rr / fill

Only for MPI programs (Message Passing Interface).
Read the software's manual before using it.
Not optimized for blast!

(Diagram: thread1, thread2 and thread3 run on different nodes)

Page 49

OGE (Open Grid Engine)

Parallel environments

Examples:

qsub -hard -l myarch=intel … myscript.sh (use intel nodes)

qsub -soft -l myarch=intel … myscript.sh (use intel nodes only if they are free)

qsub -pe parallel_fill 32 -soft -l myarch=intel … myscript.sh

qsub -pe parallel_smp N -hard -l myarch=intel … myscript.sh

Why does this job stay waiting in the queue?

qsub -q workq -pe parallel_smp 20 -l mem=12G … myscript.sh

(Hint: -l resource requests are counted per slot, so this job asks for 20 × 12 GB = 240 GB of memory on a single node.)

Page 50

OGE (Open Grid Engine)

qstat -r: resource requirements

$ qstat -r
3193243 516.61063 tneg_V1_UC aghozlane qw 02/19/2015 12:16:10
    Full jobname:   tneg_V1_UC35_0_GL0032312
    Requested PE:   parallel_rr 8
    Hard Resources: h_stack=256M (0.000000)
                    h_vmem=50G (0.000000)
                    memoire=50G (0.000000)
                    pri_work=true (2400.000000)

Page 51

OGE (Open Grid Engine)

qstat -t: sub-tasks (parallel jobs)

$ qstat -t
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node012 MASTER
                                                             workq@node012 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node014 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node015 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node016 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node017 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node018 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node019 SLAVE
3191467 516.61063 tneg_MH034 aghozlane r 02/25/2015 09:02:18 workq@node020 SLAVE

Page 52

Practical

Part 3