Beowulf Training v2.1
8/8/2019 Beowulf Training v2.1
http://slidepdf.com/reader/full/beowulf-training-v21 1/51
Beowulf HPC Cluster Training
Tan Wee Chuan
Senior Support Engineer
Centre for Academic Computing
Updated: 21 Feb 08
Our Focus Today
Beowulf Cluster Setup and Access
Grid Engine
Cluster Software
Job Submission
Hands-on
Research Resources & CAC Website
Cluster Physical Setup
32-bit platform
Intel Xeon 3.06GHz Processors
4GB mem/node
32-bit frontend node
18 x 32-bit compute CPU
2 x 32-bit express CPU
64-bit platform
AMD Opteron 2.4GHz Processors
4GB mem/node
64-bit frontend node
38 x 64-bit compute CPU
2 x 64-bit express CPU
User Disk Quota:
- 8 GB (soft)
- 12 GB (hard)
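The quota figures can be compared against actual usage. The following is only a sketch: it assumes 1 GB means 1024 x 1024 KB, the function name `quota_status` is hypothetical, and the cluster may well have its own quota tool that should be preferred.

```shell
#!/bin/bash
# Hypothetical helper: classify a usage figure (in KB) against the
# 8 GB soft and 12 GB hard quota stated on the slide.
SOFT_KB=$((8 * 1024 * 1024))    # 8 GB soft limit, assuming 1 GB = 1024*1024 KB
HARD_KB=$((12 * 1024 * 1024))   # 12 GB hard limit, same assumption

quota_status() {
    local used_kb=$1
    if [ "$used_kb" -ge "$HARD_KB" ]; then
        echo "over hard limit"
    elif [ "$used_kb" -ge "$SOFT_KB" ]; then
        echo "over soft limit"
    else
        echo "ok"
    fi
}

# Example with fixed numbers (5 GB and 9 GB of usage)
quota_status $((5 * 1024 * 1024))
quota_status $((9 * 1024 * 1024))
```

On the cluster, the usage figure could come from `du`, for example `quota_status "$(du -sk "$HOME" | cut -f1)"`.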
Cluster Health Monitoring
http://beowulf.smu.edu.sg/ganglia
Cluster Access
Beowulf Cluster Information
Files Transfer (on campus)
Files Transfer (off campus)
Login to the Cluster (anywhere)
Beowulf Cluster Info
Host name: beowulf.smu.edu.sg
Login ID: Your SMU userid
Password: Your SMU password
Access Protocol: SSH 2
Software to Use:
• Secure shell clients such as:
• Putty (\\beowulf\resources or Google it)
• Linux or Unix OS.
Shared Documents and WIKI page: http://research2.smu.edu.sg/CAC/HPC/
Files Transfer (on campus)
Type \\beowulf into the address bar of Windows Explorer
Files Transfer (off campus)
Previous method (but with SMU VPN on)
Use WinSCP
In the WinSCP login window, put in the hostname, user id and password
Choose SFTP (allow SCP fallback)
Login to the Cluster
• Using Putty
Hostname
You can put in your userid to save typing
You can create and save settings under a Connection Name
You can change the background colour and font size
Login to the Cluster
What you see after you login
Exercises: Some Commands
Directory operations
- Present working directory
- List directory
- Change directory
- Make directory
File operations
- Copy files
- Delete files
- Moving files
Creating and editing files
Exercise: Directory Operations
Present working directory: pwd
List directory
  simple listing: ls
  full listing: ls -l
  full listing with screen scroll: ls -l | more
Change directory
  absolute path: cd [full pathname] eg: cd /opt/matlab
  relative path: cd [foldername] eg: cd beowulf-samples
  return to home: cd
Make directory: mkdir [new folder name] eg: mkdir set1
Remove directory: rm -r [folder name] eg: rm -r set1
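The directory commands can be tried out safely in a throw-away folder. This is a minimal sketch; the `scratch` variable and the use of `mktemp` are choices made for illustration and are not part of the training material.

```shell
#!/bin/bash
# Practise the directory commands in a throw-away folder so nothing
# in your home directory is touched.
scratch=$(mktemp -d)      # temporary practice folder (location chosen by mktemp)
cd "$scratch" || exit 1

pwd                       # present working directory
mkdir set1                # make directory
ls -l                     # full listing, shows set1
cd set1                   # change directory (relative path)
cd "$scratch" || exit 1   # back again (absolute path)
rm -r set1                # remove directory and its contents
ls                        # set1 is gone
```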
Exercise: File Operations
Copy files: cp [file to copy] [location]
Examples: copy within same folder: cp data1.txt data2.txt
copy to another folder with same name: cp data1.txt set1
copy to another folder and rename: cp data1.txt set1/data.txt
Move / Rename files: mv [file to move] [location]
Examples:
move to another folder: mv data1.txt set1
move to another folder and rename: mv data1.txt set1/data.dat
Delete files: rm [file1] [file2] [file3]
Example: rm data1.txt run*.java
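The copy, move and delete examples can be run end to end in a practice folder. This is a sketch only; the file contents and the temporary folder are made up for illustration.

```shell
#!/bin/bash
# Work in a throw-away folder with one sample data file.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir set1
echo "1 2 3" > data1.txt

cp data1.txt data2.txt        # copy within same folder
cp data1.txt set1             # copy to another folder with same name
cp data1.txt set1/data.txt    # copy to another folder and rename
mv data2.txt set1/data.dat    # move to another folder and rename
rm data1.txt                  # delete the original

ls set1                       # lists the three files now inside set1
```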
Exercise: Create / Edit Files
Do not use Microsoft Notepad, Wordpad or Word. On Windows, use UNIX friendly text editors, then copy the file over
In the cluster, choice of “vi” or “pico”
Using vi: vi [file name]
Action                          Typing
To go into insert text mode     i
To exit insert text mode        Press esc
To copy a line                  yy
To delete a line                dd
To save                         :w
To save and quit                :wq
To quit without save            :q (use :q! to discard unsaved changes)
Grid Engine
qsub
qstat (or jobwatch)
qdel
qsub
Use this command to submit a job to the cluster
A job consists of your program codes and a text file (job script) “describing” the job
Using qsub to submit the job script
{ screenshot: my job script; my matlab code & data file }
Syntax: qsub [job script name]
qstat
Use this command to check job queues and job status
Using qstat to check the job queue
Job state (qw: queueing; t: transfer; r: running; Eqw: error)
Syntax:
Action                          Command
See all submitted jobs          qstat
See only your jobs              qstat -u [userid]
See all cluster queues          qstat -f
Per second update of status     jobwatch -u
{use Ctrl-C to break update}
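The job state can also be picked out of the qstat listing by a script. The sketch below assumes the usual Grid Engine layout, with the job id in column 1 and the state in column 5; the function name `job_state` is hypothetical, and the sample listing (including the queue name) is made up for illustration rather than copied from the cluster.

```shell
#!/bin/bash
# Hypothetical helper: read qstat-style output on stdin and print the
# state of one job. Assumes job id in column 1 and state in column 5.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Made-up sample listing for illustration only (not real cluster output)
sample='job-ID  prior    name       user   state submit/start at     queue   slots
-----------------------------------------------------------------------
    101 0.55500 mycode.sh  wctan  r     02/21/2008 10:00:00 all.q   1
    102 0.00000 mycode.sh  wctan  qw    02/21/2008 10:05:00         1'

echo "$sample" | job_state 101    # state of job 101
echo "$sample" | job_state 102    # state of job 102
```

On the cluster the same function could be fed from the real command, for example `qstat -u [userid] | job_state [jobid]`.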
qstat -f
{ screenshot of qstat -f output, cropped }
Labels on the listing: AMD 64-bit; x86 are Intel 32-bit
qdel
Use this command to delete a submitted job
First, use “qstat” to locate job id
Then, use qdel to delete the job
Syntax: qdel [jobid]
Summary
qsub (for job submission)
qstat / jobwatch (for job monitoring)
qdel (for job deletion)
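The three commands fit together as one cycle: submit, monitor, delete. The sketch below is hedged in two ways: the helper `extract_job_id` is hypothetical, and it assumes qsub confirms a submission with the usual Grid Engine wording `Your job 12345 ("mycode.sh") has been submitted`. Away from the cluster it only demonstrates the parsing.

```shell
#!/bin/bash
# Hypothetical helper: pull the job id out of the confirmation line that
# qsub prints. Assumes the wording
#   Your job 12345 ("mycode.sh") has been submitted
extract_job_id() {
    awk '/^Your job / { print $3 }'
}

if command -v qsub >/dev/null 2>&1; then
    # On the cluster: submit, list your jobs, then delete the job again
    job_id=$(qsub mycode.sh | extract_job_id)
    qstat -u "$USER"
    qdel "$job_id"
else
    # Elsewhere: demonstrate the parsing on a sample confirmation line
    job_id=$(echo 'Your job 12345 ("mycode.sh") has been submitted' | extract_job_id)
fi
echo "job id: $job_id"
```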
Job Submission
Overview
Software Available on the 2 Platforms
Software Specific Job Submissions
Common Errors
Software Available
The mainstream list of software available across both platforms:
Software                                32-bit                               64-bit
C / C++ / Fortran / IMSL for Fortran    YES (intel & gnu)                    YES (pathscale & gnu)
Gauss pseudocode                        YES                                  YES
ILOG Cplex                              NO                                   YES
JAVA                                    YES                                  YES
MATLAB                                  YES                                  YES
Compiled MATLAB codes                   YES (source 32-bit library names)    YES (source 64-bit library names)
R                                       YES                                  YES
STATA                                   NO                                   YES
Job Submission Script
A job submission script consists of 2 parts:
- Grid Engine settings (switches)
- Software specifics
Sample:
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l amd64=1
. /etc/profile.d/java.sh
./mycode
--------------------------
The lines starting with #$ are the GE switches; the lines after them are the software specifics
GE Switch
The switches and the meaning:
Switch                   Meaning
#$ -j y                  Merge the error and output stream into a single file
#$ -cwd                  Stands for “current working directory”. Means to take the current folder as reference for file paths
#$ -m e                  Send an email notification when the job ends
#$ -M [email address]    Send job notification to this email address
One of the following switches needs to be indicated depending on the platform of your job:
#$ -l intel32=1          Send job to the Intel 32-bit standard queue
#$ -l amd64=1            Send job to the AMD 64-bit standard queue
#$ -l xp=1               Send job to the Intel 32-bit express queue (max. 24 hrs before kill)
#$ -l xp64=1             Send job to the AMD 64-bit express queue (max. 24 hrs before kill)
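Because exactly one platform switch must be chosen, a small generator can guard against typing an unknown one. This is a sketch, not part of the cluster's tooling: `make_job_script` and its arguments are hypothetical, and the email switches are left out because the address is user-specific.

```shell
#!/bin/bash
# Hypothetical helper: write a job script for one of the four platform
# switches listed above. Arguments: output file, platform, launch line.
make_job_script() {
    local outfile=$1 platform=$2 launch=$3
    case "$platform" in
        intel32|amd64|xp|xp64) ;;    # the four switches from the table
        *) echo "unknown platform: $platform" >&2; return 1 ;;
    esac
    # \$ keeps the dollar sign literal so the file contains "#$ ..." lines
    cat > "$outfile" <<EOF
#! /bin/bash
#\$ -j y
#\$ -cwd
#\$ -l $platform=1
$launch
EOF
}

jobdir=$(mktemp -d)
make_job_script "$jobdir/mycode.sh" amd64 "./compute_matrix > output.txt"
cat "$jobdir/mycode.sh"
```

The generated file can then be submitted with qsub in the usual way.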
Software Specifics
Each software has its way of launching:
Software: C / C++ / Fortran compiled binary
  Launching: ./[compiled filename] > [output file]
  Example: ./compute_matrix > output.txt
Software: Gauss pseudocode
  Launching: gsrun -b [filename.e.gcg]
  Example: gsrun -b ols.e.gcg
Software: ILOG Cplex
  Launching: cplex < [cplex script name] > [output file]
  Example: cplex < optim_val > output.txt
Software: JAVA
  Launching: . /etc/profile.d/java.sh
             java [filename]
  Example: . /etc/profile.d/java.sh
           java testfile > output.txt
Software: MATLAB
  Launching: matlab -r '[filename]'
  Example: matlab -r 'matrix'
Software: Compiled MATLAB codes
  Launching: . /etc/profile.d/matlab.sh
             ./[compiled filename]
  Example: . /etc/profile.d/matlab.sh
           ./matrix_compiled
Software: R
  Launching: R -b --vanilla < [filename.R] > [output file]
  Example: R -b --vanilla < Rcode.R > output.txt
Software: STATA
  Launching: stata -b do [filename]
  Example: stata -b do sortdata
C / C++
32-bit compilation
Use the Intel compiler “icc”:
example: icc -static matrix.cpp -o matrix
64-bit compilation
Move to the 64-bit platform first (“devel64”)
There are 2 licenses available for the Pathscale compilers
Use the Pathscale compiler “pathcc” (C) or “pathCC” (C++)
example: pathcc matrix.c -o matrix
         pathCC matrix.cpp -o matrix
C / Fortran Script
Form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
./compiled_filename > output.txt
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
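After several submissions a folder can hold many “mycode.sh.oXXXX” files, so it helps to look at the newest one first. A minimal sketch: the helper name `latest_stream_file` is hypothetical, and the job number and error message in the demo are made up.

```shell
#!/bin/bash
# Hypothetical helper: print the name of the newest error / output
# stream file for a job script, e.g. mycode.sh.o1234.
latest_stream_file() {
    ls -t "$1".o* 2>/dev/null | head -n 1
}

# Stand-in stream file, for illustration only (made-up job number and message)
demo=$(mktemp -d)
cd "$demo" || exit 1
echo "./compiled_filename: No such file or directory" > mycode.sh.o1234

newest=$(latest_stream_file mycode.sh)
echo "Newest stream file: $newest"
tail -n 20 "$newest"      # the last lines usually hold the error message
```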
Gauss
Gauss Runtime Library is available on both the 32-bit and 64-bit compute nodes
Gauss codes cannot be run directly and have to be compiled into Gauss pseudocode
Launch Gauss on your choice of platform (32-bit or 64-bit) and compile your Gauss code into pseudocode form
example: compile mygausscode.e
Exit Gauss and find a new file with extension “.e.gcg”
Gauss
Form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
gsrun -b mygausscode.e.gcg
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
ILOG Cplex
ILOG Cplex is available only on the 64-bit platform and each job utilises a license from the 20 network licenses available on campus
To run a Cplex optimization on a model:
- Formulate and program the model
- Write a cplex script to read the model and run optimization
Filename: problem.lp
--------------------------
maximize
 x1 + 2 x2 + 3 x3
subject to
 -x1 + x2 + x3 <= 20
 x1 - 3 x2 + x3 <= 30
bounds
 0 <= x1 <= 40
 0 <= x2
 0 <= x3
End
--------------------------
Filename: cplex-script
--------------------------
read problem.lp
optimize
display solution variables x1-x3
quit
--------------------------
ILOG Cplex
Form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l amd64=1
cplex < cplex-script > out.txt
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics. Cplex runs only on the 64-bit platform, so the platform switch must be amd64 (or xp64 for the express queue)
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
JAVA
The Sun Java Software Development Kit (SDK) is available on both the 32-bit and 64-bit platforms. You can compile your JAVA code on the frontend nodes before job submission
To submit your job, form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
. /etc/profile.d/java.sh
java myjavacode
--------------------------
The lines starting with #$ are the GE switches; the last two lines are the code specifics
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
MATLAB
MATLAB is available on both the 32-bit and 64-bit compute nodes.
The versions on the main frontend nodes are full versions (to support compiling) while the compute nodes have only the following toolboxes:
- Financial
- Optimization
- Splines
- Statistics
Each job utilises the network licenses available on campus
We encourage users to compile MATLAB codes to conserve licenses whenever possible (see Compiled MATLAB Codes)
MATLAB
Form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
matlab -r 'mymatlabcode'
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
Compiled MATLAB Codes
A compiled MATLAB code binary uses the MATLAB Runtime Library to execute
This way of running your MATLAB code does not use any of the network licenses, which are shared by the entire campus
Most MATLAB codes can be compiled but there are minor exceptions
To compile a MATLAB code, it has to be turned into a function
example:
--------------------------
function main              <- Function header
mat1 = magic(4)            <- Your matlab code
mat2 = mat1 * mat1
exit                       <- exit the code
--------------------------
Compiled MATLAB Codes
To compile the function, choose your platform of choice and launch MATLAB
Then run the MATLAB compile command “mcc”
example: mcc -mv mymatlabcode.m
It even automatically sources your own sub-functions. Exit MATLAB to free the Compiler license (only 1 on campus)
The output files created follow your filename:
mymatlabcode
mymatlabcode.ctf
mymatlabcode*.c
mymatlabcode.prj
mccExcludedFiles.log
(The .c, .prj and .log files can be removed)
Compiled MATLAB Codes
In the job submission script, the platform-dependent Runtime Libraries (RT) have to be “exported” into the environment
eg: mycode.sh (the 32-bit RT Library exported in a single line)
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
. /etc/profile.d/matlab.sh
./mymatlabcode
--------------------------
The lines starting with #$ are the GE switches; the last two lines are the code specifics
Compiled MATLAB Codes
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
If the RT libraries are not exported, you will see such errors:
./mymatlabcode: error while loading shared libraries: libmwmclmcrrt.so.7.5: cannot open shared object file: No such file or directory
You can always find updated export paths in the general-submission script in the beowulf-samples folder
R
R is available on both the 32-bit and 64-bit platforms.
We can include contributed extension packages based on your needs
To submit R jobs, form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l intel32=1
R -b --vanilla < myRcode.R
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
STATA
STATA is available only on the 64-bit platform.
To submit STATA jobs, form your job submission script. Use the text editor “vi” or “pico” or use “Editpad”. Normally, we use a file extension “.sh” to name scripts
Example: mycode.sh
--------------------------
#! /bin/bash
#$ -j y
#$ -cwd
#$ -m e
#$ -M [email protected]
#$ -l amd64=1
stata -b do mySTATAcode.do
--------------------------
The lines starting with #$ are the GE switches; the last line is the code specifics. STATA runs only on the 64-bit platform, so the platform switch must be amd64 (or xp64 for the express queue)
Submit the job submission script: qsub mycode.sh
Check state: qstat (OR) jobwatch -u
If error, check the error / output stream file “mycode.sh.oXXXX” and make corrections
Common Errors
Job state is “Eqw” (error queuing)
Cause: - was the job submission script edited using MS Notepad or Wordpad?
Solution: - delete the job from the queue
          - delete the job script and redo a new job script using a Linux friendly text editor
Job disappeared immediately from queue but results are missing
Cause: - execution terminated due to errors like missing export variables or files
Solution: - check the error / output stream file for hints of problem
          - check the software-specific export variables for mistakes if any
          - make sure that the path to execute a binary is correct
Error / output stream file shows “segmentation fault”
Cause: - usually due to wrong platform of execution for compiled codes
Solution: - check the GE switch “-l [platform]=1” and make sure that the compiled binary runs on the correct platform
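The slides do not say why a script edited in Notepad or Wordpad ends up in “Eqw”. One common reason with Windows editors is carriage-return (CRLF) line endings, and the sketch below assumes that is the cause here. The helper names `has_crlf` and `strip_crlf` are hypothetical; rewriting the script in a Linux friendly editor, as the slide recommends, remains the safe route.

```shell
#!/bin/bash
# Hypothetical helpers, assuming the Notepad problem is Windows (CRLF)
# line endings.
has_crlf() {
    grep -q "$(printf '\r')" "$1"      # true if any carriage return is present
}
strip_crlf() {
    tr -d '\r' < "$1" > "$1.unix" && mv "$1.unix" "$1"
}

# Demo on a stand-in job script written with Windows line endings
demo=$(mktemp -d)
printf '#! /bin/bash\r\n#$ -cwd\r\n./mycode\r\n' > "$demo/mycode.sh"

if has_crlf "$demo/mycode.sh"; then
    echo "Windows line endings found, converting"
    strip_crlf "$demo/mycode.sh"
fi
```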
Exercise - MATLAB
Step 1: move into the beowulf samples matlab sub folder
cd ~/beowulf-samples/matlab
Step 2: see what’s in the folder, through Windows Explorer (\\beowulf) or with
ls (OR) ls -l
Step 3: examine the matlab code “matrix.m”
Step 4: edit the job submission script “matlab-submit.sh” and do the following:
Remove 1 hash (#) from the line ##$ -cwd
Remove 1 hash (#) from the line ##$ -l intel32=1
Step 5: back to the shell, make sure you’re in the right sub-folder. You can check (/home/wctan/beowulf-samples/matlab):
pwd
Step 6: submit the job
qsub matlab-submit.sh
Step 7: check the job status
qstat (OR) jobwatch -u
Step 8: check the folder again when the job ends. There should be an output file named “output.txt”
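Step 4 can also be done from the shell instead of a text editor. This is a sketch that works on a stand-in copy of matlab-submit.sh: only the two “##$” lines are taken from the exercise, the rest of the file contents are assumed for illustration.

```shell
#!/bin/bash
# Step 4 as a command: remove one hash from the two switch lines.
# The stand-in file below is assumed; the real matlab-submit.sh in
# beowulf-samples may look different.
demo=$(mktemp -d)
cd "$demo" || exit 1
cat > matlab-submit.sh <<'EOF'
#! /bin/bash
#$ -j y
##$ -cwd
##$ -l intel32=1
matlab -r 'matrix' > output.txt
EOF

# Write the edited copy to a temporary file, then replace the original
sed -e 's/^##\$ -cwd/#$ -cwd/' \
    -e 's/^##\$ -l intel32=1/#$ -l intel32=1/' \
    matlab-submit.sh > matlab-submit.tmp && mv matlab-submit.tmp matlab-submit.sh

grep '^#\$ -' matlab-submit.sh    # show the switches that are now active
```

After the edit, steps 6 to 8 continue as written: qsub matlab-submit.sh, then qstat, then look for output.txt.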