Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II
![Page 1: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/1.jpg)
KAUST Supercomputing Laboratory
Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II
George Markomanolis, Computational Scientist
June 23rd, 2016
![Page 2: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/2.jpg)
Outline
KAUST King Abdullah University of Science and Technology 2
❖ Introduction
❖ Test case
❖ Reveal
![Page 3: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/3.jpg)
Introduction - Components of CrayPat
❖ Module perftools-base
• pat_build – Instruments the program to be analyzed
• pat_report – Generates text reports from the performance data captured during program execution and exports data for use in other programs
• Cray Apprentice2 – A graphical analysis tool that can be used to visualize and explore the performance data captured during program execution
• Reveal – A graphical source code analysis tool that can be used to correlate performance analysis data with annotated source code listings, to identify key opportunities for optimization (it works only with the Cray compiler)
![Page 4: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/4.jpg)
Case study
❖ Application from the seismic group: an acoustic wave solver
• Why this application? A user asked for it
• MPI application
• Tested on 3 nodes with 96 cores in total on Shaheen II
![Page 5: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/5.jpg)
Prepare for the tutorial
• Connect to Shaheen II and copy the material:
• ssh -X [email protected]
• cp /scratch/tmp/model_reveal.tgz .
• tar zxvf model_reveal.tgz
• cd model_reveal
• Reservation name: k1056_141
![Page 6: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/6.jpg)
Reveal
A tool to port your application to the OpenMP programming model
![Page 7: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/7.jpg)
Reveal
❖ Reveal is Cray’s next-generation integrated performance analysis and code optimization tool.
• Source code navigation using whole program analysis (data provided by the Cray compilation environment only)
• Coupling with performance data collected during execution by CrayPAT. Understand which high-level serial loops could benefit from parallelism.
• Enhanced loop mark listing functionality.
• Dependency information for targeted loops
• Assists users in optimizing code by providing variable scoping feedback and suggested compiler directives.
![Page 8: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/8.jpg)
Prepare for Reveal
❖ Load Perftools
• module unload darshan
• module load perftools-base/6.3.2
• module load perftools/6.3.2
❖ Execute the MPI version
• cd model_reveal
• make clean
• make
• In the submit.sh file, change the account to your account number and submit the job
§ sbatch submit.sh
• tail -n 10 testdata.XXX.err
§ 1m46.361s
Reservation: k1056_141
![Page 9: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/9.jpg)
Prepare the application for Reveal
❖ Compile the version for Reveal tool
• make clean -f Makefile_reveal
• In the Makefile_reveal file
§ $(CC) -h profile_generate -hpl=data.pl -h noomp $< -o $@ $(CFLAGS)
§ ${CC} -h profile_generate -hpl=data.pl -h noomp -c $< CrayData.c
§ Reveal needs the object files, so modify the Makefile if needed
• make -f Makefile_reveal
• The folder data.pl is created in the current folder
• Instrument your application
§ pat_build -w CrayData.exe
§ The new executable is called CrayData.exe+pat; use it in submit.sh in place of CrayData.exe
![Page 10: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/10.jpg)
Submit the job for Reveal tool
❖ Submit your job script and do not forget the reservation name (--reservation=…)
• sbatch submit.sh
❖ A performance file (extension .xf) is created; if not, something went wrong in the previous steps
❖ Generate the report and the ap2 file
• pat_report -o report.txt CrayData.exe+pat+58072-37t.xf
❖ Execute Reveal
• reveal data.pl CrayData.exe+pat+58072-37t.ap2
![Page 11: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/11.jpg)
Reveal – Loop Performance
![Page 12: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/12.jpg)
Reveal – Scoping
![Page 13: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/13.jpg)
Reveal – Program view
![Page 14: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/14.jpg)
Reveal – Function View
![Page 15: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/15.jpg)
Reveal – Array View
![Page 16: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/16.jpg)
Reveal – Compiler Messages
![Page 17: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/17.jpg)
Reveal – Loop Performance
![Page 18: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/18.jpg)
Reveal – Scoping Tool
![Page 19: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/19.jpg)
Reveal – Scoping Results
![Page 20: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/20.jpg)
Reveal – OpenMP pragmas
![Page 21: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/21.jpg)
Reveal – Inserted OpenMP pragmas
![Page 22: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/22.jpg)
Clean the code from unresolved issues and observe OpenMP pragmas
❖ vim CrayData.c
❖ Remove the lines containing "unresolved", but only if you are sure.
#pragma omp parallel for default(none) \
        private (i1,i2,u) \
        shared (nxpad,nzpad)
#pragma omp parallel for default(none) \
        private (ix,ib,ibz) \
        shared (nxpad,nb,nzpad,bndr,p0) \
        lastprivate (w)
![Page 23: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/23.jpg)
Check an OpenMP pragma and its validation
#pragma omp parallel for default(none) private (ix,ib,ibz) \
        shared (nxpad,nb,nzpad,bndr,p0) \
        lastprivate (w)
for(ix=0; ix<nxpad; ix++) {
    for(ib=0; ib<nb; ib++) {
        w = bndr[nb-ib-1];
        ibz = nzpad-ib-1;
        p0[ix][ib ] *= w; /* top sponge */
        p0[ix][ibz] *= w; /* bottom sponge */
    }
}
for(ib=0; ib<nb; ib++) {
    ibx = nxpad-ib-1;
    for(iz=0; iz<nzpad; iz++) {
        p0[ib ][iz] *= w; /* left sponge */
        p0[ibx][iz] *= w; /* right sponge */
    }
}
![Page 24: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/24.jpg)
Clean the code from unresolved issues, compile and run
❖ vim CrayData.c
❖ Remove the lines with unresolved if you are sure.
❖ Compile your application with MPI and OpenMP
• make -f Makefile_omp
• The new executable is called CrayData_omp.exe
• Comment out the active srun line in submit.sh and uncomment the next srun call.
• Also uncomment the line with OMP_NUM_THREADS=2
• Now we will execute the application with 48 MPI processes (ntasks) and 2 threads per MPI process (cpus-per-task)
• srun --ntasks=48 --ntasks-per-node=16 --ntasks-per-socket=8 --hint=nomultithread --cpus-per-task=2 ./CrayData_omp.exe
![Page 25: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/25.jpg)
Different cases and results
❖ Results for 2 threads
• Change accordingly:
§ export OMP_NUM_THREADS=2
§ srun --ntasks=48 --ntasks-per-node=16 --ntasks-per-socket=8 --hint=nomultithread --cpus-per-task=2 ./CrayData_omp.exe
• 51.211s (2.08X over the 106.361s MPI-only run)
❖ Results for 4 threads
• Change accordingly:
§ export OMP_NUM_THREADS=4
§ srun --ntasks=24 --ntasks-per-node=8 --ntasks-per-socket=4 --hint=nomultithread --cpus-per-task=4 ./CrayData_omp.exe
• 24.815s (4.29X over 106.361s)
![Page 26: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/26.jpg)
Different cases and results
❖ Results for 8 threads
• 12.222s (8.70X over 106.361s)
❖ Results for 16 threads
• Change accordingly:
§ export OMP_NUM_THREADS=16
§ srun --ntasks=6 --ntasks-per-node=2 --ntasks-per-socket=1 --hint=nomultithread --cpus-per-task=16 ./CrayData_omp.exe
• 8.895s (11.96X over 106.361s)
![Page 27: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/27.jpg)
The original version was improved 19.19 times
[Bar chart: Execution time (in sec.): Original version 170.67, Optimized MPI version 106.36, MPI+OpenMP 8.895]
![Page 28: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/28.jpg)
Validation
Original version Optimized MPI+OpenMP
![Page 29: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/29.jpg)
Summary
❖ Reveal is an easy-to-use tool
❖ The user should be careful, though, and pay attention to the compiler messages
❖ You can achieve a large speedup with this tool
❖ We need to investigate more complicated applications
![Page 30: Porting an MPI application to hybrid MPI+OpenMP with Reveal tool on Shaheen II](https://reader034.fdocuments.in/reader034/viewer/2022042618/58a37ed31a28abaa488b6945/html5/thumbnails/30.jpg)
KAUST Supercomputing Laboratory