
Transcript of Application Performance Profiling and Prediction in Grid Environment

Page 1: Application Performance Profiling and Prediction in Grid Environment

Presented by: Marlon Bright

14 July 2008

Advisor: Masoud Sadjadi, Ph.D.

REU – Florida International University

Page 2: Application Performance Profiling and Prediction in Grid Environment

Outline

Grid Enablement of Weather Research and Forecasting Code (WRF)

Profiling and Prediction Tools
Research Goals
Project Timeline
Current Progress
Challenges
Remaining Work


Page 3: Application Performance Profiling and Prediction in Grid Environment

Motivation – Weather Research and Forecasting Code (WRF)

Goal – Improved Weather Prediction
Accurate and timely results
Precise location information

WRF Status
Over 160,000 lines of code (mostly Fortran and C)
Runs on a single machine or cluster
Single domain
Fine resolution -> high resource requirements

How to overcome this? Through grid enablement

Expected Benefits to WRF
More available resources – different domains
Faster results
Improved accuracy


Page 4: Application Performance Profiling and Prediction in Grid Environment

System Overview

Web-Based Portal
Grid Middleware (Plumbing)
Job-Flow Management
Meta-Scheduling
○ Performance Prediction
Profiling and Benchmarking
Development Tools and Environments
Transparent Grid Enablement (TGE)
○ TRAP: static and dynamic adaptation of programs
○ TRAP/BPEL, TRAP/J, TRAP.NET, etc.
GRID superscalar: a programming paradigm for dynamically parallelizing a sequential application in a computational grid


Page 5: Application Performance Profiling and Prediction in Grid Environment

Performance Prediction

An IMPORTANT part of meta-scheduling

Allows for optimal usage of grid resources through "smarter" meta-scheduling:
Many users overestimate job requirements
Reduced idle time for compute resources
Could save costs and energy

Optimal resource selection for the most expedient job return time


Page 6: Application Performance Profiling and Prediction in Grid Environment
Page 7: Application Performance Profiling and Prediction in Grid Environment

Amon / Aprof

Amon – a monitoring program that runs on each compute node, recording new processes

Aprof – a regression analysis program that runs on the head node; it takes Amon's measurements as input to make execution time predictions (within a cluster and between clusters)
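Conceptually, Aprof fits a regression model that maps measured machine and process parameters to elapsed time. Below is a minimal Python sketch of that idea, assuming a linear least-squares model over explanatory variables such as inverse clock speed and inverse node count (variables suggested by the sample Amon output later in this deck; the exact model Aprof fits is described in the HPGC-2008 paper referenced at the end). The training numbers are made up for illustration.

```python
import numpy as np

# Hypothetical Amon measurements: one row per benchmark run.
# Columns: 1/clock_speed_MHz, 1/node_count (assumed explanatory variables).
X = np.array([
    [1 / 2297.7, 1 / 2],
    [1 / 2297.7, 1 / 4],
    [1 / 2297.7, 1 / 8],
    [1 / 3591.4, 1 / 4],
])
# Observed elapsed times [msec] for those runs (illustrative values only).
y = np.array([2881e3, 1860e3, 1200e3, 1500e3])

# Add an intercept column and solve the least-squares problem,
# standing in for Aprof's regression step.
A = np.hstack([np.ones((X.shape[0], 1)), X])
coef, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)

# Predict elapsed time for a new configuration (e.g., 8 nodes at 3591 MHz).
x_new = np.array([1.0, 1 / 3591.4, 1 / 8])
print("predicted elapsed time [msec]:", x_new @ coef)
```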


Page 8: Application Performance Profiling and Prediction in Grid Environment

Amon / Aprof Monitoring and Prediction


Page 9: Application Performance Profiling and Prediction in Grid Environment

Amon / Aprof Approach to Modeling Resource Usage


Page 10: Application Performance Profiling and Prediction in Grid Environment

Sample Amon Output

Process --- (464) ---

name: wrf.exe

cpus: 8

inv clock: 1/2297.700 [MHz]

inv cache size: 1/1024 [KB]

elapsed time: 1234232 [msec]

utime: 1233890 [msec] 1236360 [msec]

stime: 560 [msec] 1420 [msec]

intr: 44959

ctxt switch: 84394

fork: 89

storage R: 0 [blocks] 0 [blocks]

storage W: 0 [blocks]

network Rx: 4188840 [bytes]

network Tx: 2106854 [bytes]


Page 11: Application Performance Profiling and Prediction in Grid Environment

Sample Aprof Output

name: wrf_arw_DM.exe
elapsed time: 5.783787e+06
===========================================================
explanatory:       value          parameter      std.dev
-----------------  -------------  -------------  -------------
:                  1.000000e+00   5.783787e+06   1.982074e+05
===========================================================
predicted:         value          residue rms    std.dev
-----------------  -------------  -------------  -------------
elapsed time:      5.783787e+06   4.246451e+06   1.982074e+05
===========================================================


Page 12: Application Performance Profiling and Prediction in Grid Environment

Sample Query Automation Script Output

adj. cpu speed, processors, actual, predicted, rms, std. dev, actual difference (%)
3591.363, 1, 5222, 5924.82, 1592.459, 415.3491, 13.4588280352
3591.363, 2, 2881, 3246.283, 1592.459, 181.5382, 12.6790350573
3591.363, 3, 2281, 2353.438, 1592.459, 105.334, 3.17571240684
3591.363, 4, 1860, 1907.015, 1592.459, 69.19778, 2.52768817204
3591.363, 5, 1681, 1639.161, 1592.459, 49.83672, 2.48893515764
3591.363, 6, 1440, 1460.592, 1592.459, 39.5442, 1.43
3591.363, 7, 1380, 1333.043, 1592.459, 34.76459, 3.40268115942
3591.363, 8, 1200, 1237.381, 1592.459, 33.27651, 3.11508333333
3591.363, 9, 1200, 1162.977, 1592.459, 33.56231, 3.08525
3591.363, 10, 1080, 1103.454, 1592.459, 34.68943, 2.17166666667
3591.363, 11, 1200, 1054.753, 1592.459, 36.15324, 12.1039166667
3591.363, 12, 1080, 1014.169, 1592.459, 37.70271, 6.09546296296
3591.363, 13, 1200, 979.8292, 1592.459, 39.22018, 18.3475666667
3591.363, 14, 1021, 950.3947, 1592.459, 40.65455, 6.91530852106
3591.363, 15, 1020, 924.8848, 1592.459, 41.9872, 9.32501960784
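The last column is the percent difference between actual and predicted runtimes: |actual - predicted| / actual * 100. As a quick sanity check, this small Python sketch (the hard-coded rows are copied from the table above) recomputes it:

```python
# Recompute the "actual difference (%)" column from the table above:
# |actual - predicted| / actual * 100.
rows = [
    (1, 5222, 5924.82),
    (2, 2881, 3246.283),
    (3, 2281, 2353.438),
]
for procs, actual, predicted in rows:
    pct = abs(actual - predicted) / actual * 100
    print(f"{procs} processors: {pct:.10f}% difference")
# First row: |5222 - 5924.82| / 5222 * 100 = 13.4588280352, matching the table.
```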


Page 13: Application Performance Profiling and Prediction in Grid Environment

Previous Findings for Amon / Aprof

Experiments were performed on two clusters at FIU - Mind (16 nodes) and GCB (8 nodes)
Experiments were run to predict for different numbers of nodes and CPU loads (i.e. 2, 3, …, 14, 15 nodes and 20%, 30%, …, 90%, 100% load)
Aprof predictions were within 10% error versus the actual recorded runtimes, both within Mind and GCB and between Mind and GCB
Conclusion: the first-step assumption was valid -> move on to extending the research to higher numbers of nodes


Page 14: Application Performance Profiling and Prediction in Grid Environment

Paraver / Dimemas

o Dimemas – a simulation tool for the parametric analysis of the behavior of message-passing applications on a configurable parallel platform

o Paraver – a tool that allows for performance visualization and analysis of trace files generated from actual executions and by Dimemas

Tracefiles are generated by MPItrace, which is linked into the executable


Page 15: Application Performance Profiling and Prediction in Grid Environment

Dimemas Simulation Process Overview

1. Link MPItrace into the application; at run time it generates a tracefile (.mpit) for each node the application runs on

2. Use the CEPBA tool 'mpi2prv' to convert the .mpit files into one .prv file

3. Load the file into Paraver using an XML filtering file (provided by CEPBA) to reduce the tracefile, eliminating 'perturbed regions' (i.e. much of the initialization)

4. Open the tracefile in Paraver using the 'useful_duration' configuration file and adjust the scales to fit the events

5. Identify the computation iterations and compose a smaller tracefile by selecting a few iterations, preserving communications and eliminating initialization phases


Page 16: Application Performance Profiling and Prediction in Grid Environment

Paraver tracefile with iterations selected, cut, and ready for Dimemas conversion.


Page 17: Application Performance Profiling and Prediction in Grid Environment

Simulation Process (cont'd)

6. Convert the new tracefile to Dimemas format (.trf) using the CEPBA-provided 'prv2trf' tool

7. Load the tracefile into the Dimemas simulator, configure the target machine, and generate a Dimemas configuration file from that information

8. Call the simulator, with or without the option of generating a Paraver (.prv) tracefile for viewing

Great news: you only have to go through this process once, provided you do it for the maximum number of nodes you will simulate! Once the configuration file is generated, different numbers of nodes can be simulated through alterations to the file, as sketched below.
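A driver for that node-count sweep could look like the following Python sketch. It is hypothetical: the simulator command name, its arguments, and the edit_node_count helper are assumptions, since the Dimemas configuration format and CLI are not reproduced in this deck.

```python
import shutil
import subprocess

def edit_node_count(cfg_in: str, cfg_out: str, nodes: int) -> None:
    """Hypothetical helper: copy the Dimemas configuration file and
    rewrite whatever record encodes the simulated node count.
    The real edit depends on the configuration file format."""
    shutil.copy(cfg_in, cfg_out)
    # ... rewrite the node-count record inside cfg_out here ...

# Sweep the same node counts used in the Amon benchmarks.
for nodes in (8, 16, 32, 64, 96, 128):
    cfg = f"wrf_{nodes}nodes.cfg"
    edit_node_count("wrf_max_nodes.cfg", cfg, nodes)
    # Command name and flags are placeholders, not the documented CLI.
    subprocess.run(["dimemas_simulator", cfg], check=True)
```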


Page 18: Application Performance Profiling and Prediction in Grid Environment

Dimemas Simulator Results


Page 19: Application Performance Profiling and Prediction in Grid Environment

Goals

1. Extend the Amon/Aprof research to larger numbers of nodes, a different architecture, and a different version of WRF (Version 2.2.1).

2. Compare/contrast Aprof predictions with Dimemas predictions in terms of accuracy and prediction computation time.

3. Analyze if/how Amon/Aprof could be used in conjunction with Dimemas/Paraver for optimized application performance prediction and, ultimately, meta-scheduling.


Page 20: Application Performance Profiling and Prediction in Grid Environment

Timeline

End of June:
Get MPItrace linking properly with the WRF version compiled on GCB, then Mind - COMPLETE
a) Install Amon and Aprof on MareNostrum and ensure proper functioning - AMON COMPLETE; APROF IN FINAL STAGES
b) Run Amon benchmarks on MareNostrum - COMPLETE

Early/Mid July:
Use and analyze Aprof predictions within MareNostrum (and possibly between MareNostrum, GCB, and Mind) - IN PROGRESS
Use the generated MPI/OpenMP tracefiles (Paraver/Dimemas) to predict within (and possibly between) Mind, GCB, and MareNostrum - IN PROGRESS

Late July/Early August:
Experiment with how well Amon and Aprof relate to / could possibly be combined with Dimemas
Analyze how the findings relate to the bigger picture
Make optimizations to the grid enablement of WRF
Compose a paper presenting the significant findings


Page 21: Application Performance Profiling and Prediction in Grid Environment


Page 22: Application Performance Profiling and Prediction in Grid Environment

General

Completed reading of related-works papers
Well advanced in Linux studies
Established an effective collaboration/working relationship with the developers of Dimemas and Paraver


Page 23: Application Performance Profiling and Prediction in Grid Environment

Amon

Installed on MareNostrum
Adjusted source code to properly read node information from MareNostrum (this will be documented on the Wiki so it can be considered when configuring on new architectures)


Page 24: Application Performance Profiling and Prediction in Grid Environment

Amon (cont'd)

Automated benchmarking shell script developed:
Starts Amon on each compute node returned by the system scheduler
Executes WRF with one process per node for:
○ Node counts of 8, 16, 32, 64, 96, and 128
○ CPU loads of 25%, 50%, 75%, and 100% (enforced through the CPULimit program)
Writes results (to be used as Aprof input) to an organized results directory of …/<cpu load percentage>/<number of nodes>/<timestamp of run>/<amon output by node>, as sketched below
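For illustration, the sweep and directory layout could be driven as in this Python sketch; the results root and the WRF/Amon launch step are placeholders, since the actual shell script is not reproduced here.

```python
import os
import time

# Node counts and CPU loads from the benchmarking script described above.
NODE_COUNTS = (8, 16, 32, 64, 96, 128)
CPU_LOADS = (25, 50, 75, 100)
RESULTS_ROOT = "results"  # placeholder for the actual results location

for load in CPU_LOADS:
    for nodes in NODE_COUNTS:
        stamp = time.strftime("%Y%m%d-%H%M%S")
        # Layout: .../<cpu load percentage>/<number of nodes>/<timestamp of run>/
        run_dir = os.path.join(RESULTS_ROOT, str(load), str(nodes), stamp)
        os.makedirs(run_dir, exist_ok=True)
        # Placeholder for the real work: start Amon on every node, throttle
        # CPUs with CPULimit, run WRF with one process per node, then copy
        # each node's Amon output into run_dir.
```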


Page 25: Application Performance Profiling and Prediction in Grid Environment

Aprof

Installed on MareNostrum
Adjusted source code to change the way Aprof reads in information:
Before: input files had to specify the number of bytes of the process listing in the process header (this was very complicated and error prone; Aprof was inconsistent in loading MareNostrum data)
Now: input files simply need to separate process entries with one or more blank lines, as in the sketch below
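Parsing that blank-line-separated format is straightforward; here is an illustrative Python sketch (the field layout follows the sample Amon output shown earlier; the function itself is not part of Aprof):

```python
import re

def read_process_entries(path: str) -> list[dict]:
    """Split an Amon-style output file into per-process records,
    where entries are separated by one or more blank lines."""
    with open(path) as f:
        text = f.read()
    entries = []
    for block in re.split(r"\n\s*\n+", text.strip()):
        record = {}
        for line in block.splitlines():
            # Field lines look like "name: wrf.exe" in the sample output.
            if ":" in line:
                key, value = line.split(":", 1)
                record[key.strip()] = value.strip()
        if record:
            entries.append(record)
    return entries
```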


Page 26: Application Performance Profiling and Prediction in Grid Environment

Aprof (cont’d)

Script developed that combines the Amon output from all nodes and edits it into the read-in format Aprof requires

Aprof query automation script adjusted/developed for MareNostrum:
Queries Aprof for prediction information for the different cases (number of nodes; CPU percentage loads)
Compares the predicted values to the actual values returned by each run


Page 27: Application Performance Profiling and Prediction in Grid Environment

Dimemas / Paraver

Paraver tracefile successfully generated and visualized with GUI on MareNostrum

Dimemas tracefile successfully generated from Paraver on MareNostrum

Configuration file for MareNostrum developed

Prediction simulations will begin shortly


Page 28: Application Performance Profiling and Prediction in Grid Environment

Significant Challenges Overcome

Amon:
Adjustment of source code for proper functioning on MareNostrum
Development of a benchmarking script conforming to the system architecture of MareNostrum (i.e. going through its scheduler; one process per node; etc.)

Aprof:
Adjustment of source code for less complex, more consistent data input
Development of prediction and comparison scripts for MareNostrum


Page 29: Application Performance Profiling and Prediction in Grid Environment

Significant Challenges Overcome (cont'd)

Dimemas/Paraver:
MPItrace properly linked in with WRF on GCB and Mind
Paraver and Dimemas tracefiles successfully generated, and the configuration file set up for MareNostrum
WRF Version 2.2 installed and compiled on Mind


Page 30: Application Performance Profiling and Prediction in Grid Environment

Remaining Work

Scripting Dimemas prediction simulations for the same scenarios as those of Amon and Aprof
Finalizing the Aprof prediction/comparison script so that Aprof's performance on MareNostrum's new architecture can be analyzed
Deciding if and how to compare results from MareNostrum, GCB, and Mind (i.e. the same version of WRF would have to be running in all three locations)
Experimenting with how well Amon and Aprof relate to / could possibly be combined with Dimemas


Page 31: Application Performance Profiling and Prediction in Grid Environment

References

S. Masoud Sadjadi, Liana Fong, Rosa M. Badia, Javier Figueroa, Javier Delgado, Xabriel J. Collazo-Mojica, Khalid Saleem, Raju Rangaswami, Shu Shimizu, Hector A. Duran Limon, Pat Welsh, Sandeep Pattnaik, Anthony Praino, David Villegas, Selim Kalayci, Gargi Dasgupta, Onyeka Ezenwoye, Juan Carlos Martinez, Ivan Rodero, Shuyi Chen, Javier Muñoz, Diego Lopez, Julita Corbalan, Hugh Willoughby, Michael McFail, Christine Lisetti, and Malek Adjouadi. Transparent grid enablement of weather research and forecasting. In Proceedings of the Mardi Gras Conference 2008 - Workshop on Grid-Enabling Applications, Baton Rouge, Louisiana, USA, January 2008.
http://www.cs.fiu.edu/~sadjadi/Presentations/Mardi-Gras-GEA-2008-TGE-WRF.ppt

S. Masoud Sadjadi, Shu Shimizu, Javier Figueroa, Raju Rangaswami, Javier Delgado, Hector Duran, and Xabriel Collazo. A modeling approach for estimating execution time of long-running scientific applications. In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium (IPDPS-2008), the Fifth High-Performance Grid Computing Workshop (HPGC-2008), Miami, Florida, April 2008.
http://www.cs.fiu.edu/~sadjadi/Presentations/HPGC-2008-WRF%20Modeling%20Paper%20Presentationl.ppt

"Performance/Profiling". Presented by Javier Figueroa in the Special Topics in Grid Enablement of Scientific Applications class, 13 May 2008.


Page 32: Application Performance Profiling and Prediction in Grid Environment

Acknowledgements

REU
PIRE
BSC
Masoud Sadjadi, Ph.D. - FIU
Rosa Badia, Ph.D. - BSC
Javier Delgado - FIU
Javier Figueroa - UM
