Parallel Performance Wizard: a Performance Analysis Tool for UPC (and other PGAS Models)
Max Billingsley III (1), Adam Leko (1), Hung-Hsun Su (1), Dan Bonachea (2), Alan D. George (1)
(1) Electrical and Computer Engineering Dept., University of Florida
(2) Computer Science Div., UC Berkeley
Outline of Talk
Review of PGAS talk
The goal of PPW
Current status of PPW
Using PPW
Continuing Work
How can we make PPW as useful as possible?
Review of PGAS talk
Motivation for performance tools supporting PGAS models
  printf() doesn't cut it for optimizing programs written using PGAS models such as UPC
  Good tools can really enhance productivity
  Currently poor support for UPC from existing tools
Overview of the GASP tool interface
  Event-based interface between the performance tool and the GAS model compiler / runtime system
Overview and demonstration of PPW
  New performance tool designed for PGAS models
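To make the event-based interface concrete, below is a rough sketch of the tool-side hooks a performance tool implements. The function and event names follow the GASP specification, but the exact signatures, the header name, and the printf-based bodies are illustrative assumptions here, not PPW's actual implementation.

  /* Sketch of the tool side of GASP (signatures paraphrased from the GASP
   * specification; details may differ).  The instrumented UPC compiler and
   * runtime call these hooks around each measurable operation. */
  #include <stdio.h>
  #include "gasp.h"   /* header assumed to be supplied by the GASP-enabled UPC implementation */

  /* Called once per thread at startup so the tool can create its per-thread state. */
  gasp_context_t gasp_init(gasp_model_t srcmodel, int *argc, char ***argv)
  {
      return NULL;  /* a real tool would allocate and return a context here */
  }

  /* Called at the start and end of each instrumented event, e.g. a upc_memget,
   * a barrier, or (when compiled with --inst-functions) a user function entry/exit. */
  void gasp_event_notify(gasp_context_t context, unsigned int evttag,
                         gasp_evttype_t evttype,
                         const char *filename, int linenum, int colnum, ...)
  {
      if (evttype == GASP_START)
          printf("begin event %u at %s:%d\n", evttag, filename, linenum);
      else if (evttype == GASP_END)
          printf("end event %u at %s:%d\n", evttag, filename, linenum);
      /* Event-specific arguments (such as transfer sizes) arrive via the varargs. */
  }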
The goal of PPW
Help UPC users achieve maximum productivity in optimizing the performance of their applications, by providing detailed experimental performance data and helping them make sense of this data.
Parallel Performance Wizard – current status
Beta version of PPW available now: http://www.hcs.ufl.edu/ppw/
  We even have a Java WebStart version you can test-drive quickly from any computer
PPW currently includes many features that should make it useful for UPC developers
  UPC-specific array layout visualization
PPW has complete instrumentation support on one UPC implementation
  Berkeley UPC 2.3.16 beta includes complete support for PPW by implementing GASP
Using PPW
The UPC developer takes the following steps:
1. Build the application using PPW's compiler wrapper scripts:
   ppwupcc --inst-functions -o upc_app upc_app.c
2. Execute the instrumented application, using the ppwrun script to set up the environment:
   ppwrun --profile --output=upc_app.par upcrun -N 32 ./upc_app
3. Open the resulting file using the PPW GUI
   Transfer the file to a workstation and start the GUI
Continuing work on PPW and GASP
PPW
  Add additional PPW visualization features
    Scalability charts
  More interesting analysis functionality
GASP
  Add support for additional PGAS models
  Help other tools take advantage of GASP
Nano Case Study, NPB2.4 IS
Nano Case Study Intro
PPW looks pretty, but how useful is it for real apps?
Examined the GWU NPB2.4 IS benchmark and looked for interesting things
Point of the study
  See if the tool tells us anything interesting
  NOT to pick apart a particular implementation (the example yesterday illustrated my own bad UPC code)
NPB 2.4 on Marvel (8 dual-core pr. SMP)
NPB2.4 on Mu Cluster (Quadrics & Opteron)
Close-up of SMP Comm. Pattern
Close-up of Cluster Comm. Pattern
The Culprit
  /*
   * Equivalent to the MPI_Alltoall + MPI_Alltoallv in the C + MPI version
   * of the NAS Parallel Benchmark.
   */
  for (i = 0; i < THREADS; i++) {
      upc_memget(&infos[i], &send_infos_shd[MYTHREAD][i], sizeof(send_info));
  }
  …
  for (i = 0; i < THREADS; i++) {
      …
      upc_memget(key_buff2 + total_displ,
                 key_buff1_shd + i + infos[i].displ * THREADS,
                 infos[i].count * sizeof(INT_TYPE));
      …
  }
(Collectives!)
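As the "(Collectives!)" note suggests, the fixed-size metadata portion of this exchange maps naturally onto the UPC collectives library. Below is a minimal sketch of that idea, assuming a static-THREADS compilation and a blocked shared layout; the array names, the stand-in send_info definition, and the wrapper function are illustrative, not taken from the benchmark.

  #include <upc.h>
  #include <upc_collective.h>

  typedef struct { int count; int displ; } send_info;   /* stand-in for the benchmark's struct */

  /* Hypothetical layout (static THREADS assumed, so THREADS may be used as a
   * block size): block k with affinity to thread j holds the metadata that
   * thread j sends to thread k. */
  shared [THREADS] send_info src_infos[THREADS * THREADS];
  shared [THREADS] send_info dst_infos[THREADS * THREADS];

  void exchange_metadata(void)
  {
      /* One collective call replaces the manual loop of upc_memget operations
       * for the fixed-size exchange; the variable-size key exchange in the
       * second loop would still need upc_memget or a vendor extension. */
      upc_all_exchange(dst_infos, src_infos, sizeof(send_info),
                       UPC_IN_ALLSYNC | UPC_OUT_ALLSYNC);
  }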
Other Interesting Things
Sum reduction
Broadcast
Interesting Reduction Find
How many remote references?
  upc_forall(thr_cnt = 1; thr_cnt < THREADS; thr_cnt <<= 1; continue) {
      …
      upc_memget(local_array, ptrs[MYTHREAD + thr_cnt],
                 size * sizeof(elem_t));
      …
  }
What about now?
  shared elem_t *shared *ptrs;
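Because ptrs is a pointer to shared pointers, every ptrs[...] lookup in the loop above is itself a remote reference, on top of the upc_memget. A hedged sketch of the usual remedy follows, assuming ptrs refers to THREADS entries; the local_ptrs name and the copy step are illustrative, not from the benchmark.

  /* Privatize the shared pointer table once, so the reduction loop no longer
   * performs a remote lookup of ptrs[...] on every iteration.
   * Assumes ptrs refers to THREADS entries; local_ptrs is a hypothetical local copy. */
  shared elem_t *local_ptrs[THREADS];
  upc_memget(local_ptrs, ptrs, THREADS * sizeof(shared elem_t *));
  …
  /* inside the upc_forall loop, use local_ptrs[MYTHREAD + thr_cnt]
   * in place of ptrs[MYTHREAD + thr_cnt] */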
Comm. Leak, Visually
How can we make PPW as useful as possible?
We would like feedback on the tool
  Try the PPW beta and provide feedback! www.hcs.ufl.edu/ppw
Help us improve GASP
  What can we do to help language implementers add GASP support?
Other ideas regarding UPC performance analysis?
Interoperability
Some key issues:
  Usefulness of interoperating with other similar PGAS models?
  "Dusty deck" MPI code