Pablo Project
• http://www-pablo.cs.uiuc.edu/Projects/Pablo/
• Goal: portable performance data environment for parallel systems
• Pablo Version 5.0 components
– SDDF Library
– TraceLibrary
– I/O Analysis programs
– Analysis GUI
– SvPablo
Self Defining Data Format - SDDF
• Performance data description language that specifies both data record structures and data record instances
• Supports definition of records containing scalars and arrays of the base types found in most programming languages
• Developed to link Pablo instrumentation software to Pablo analysis environment
SDDF (cont.)
• Goals - compactness, portability, generality, extensibility
• ASCII and binary formats (binary contains flag indicating byte ordering)
• SDDF interface library -- library of C++ classes for writing and interpreting files in SDDF format
• FileStats utility -- shows types of records and range of values appearing in SDDF file
SDDF Example
// "description" "IO Seek"
"Seek" {
    // "Time" "Timestamp"
    int "Timestamp"[];
    // "Seconds" "Floating Point Timestamp"
    double "Seconds";
    // "Event ID" "Corresponding event"
    // "700013" "lseek"
    // "700015" "fseek"
    int "Event Identifier";
    // "Node" "Processor number"
    int "Processor Number";
    // "Duration" "Event duration in seconds"
    double "Duration";
    // "File ID" "Unique file identifier"
    // "Number Bytes" "Number of bytes traversed"
    int "Number Bytes";
    // "Offset" "Byte offset from position indicated by Whence"
    int "Offset";
    // "Whence" "Indicates file position that Offset is measured from"
    // "0" "SEEK_SET"
    // "1" "SEEK_CUR"
    // "2" "SEEK_END"
    int "Whence";
};;
SDDF Example (cont.)
"Seek" {
    [2] { 201803857, 0 },
    20.1803857, 70013, 0, 0.0031946, 3, 0, 0, 0
};;
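As a minimal sketch, the following C function formats a record instance in the ASCII layout of the example above. It does not use the Pablo SDDF interface library (which is C++). `SeekRecord` and `format_seek_record` are hypothetical names, and the assignment of the sixth value to a file ID field is an assumption based on the order of the descriptive comments in the record definition.

```c
#include <stdio.h>

/* Hypothetical in-memory form of the "Seek" record (not a Pablo type). */
typedef struct {
    int    timestamp[2];
    double seconds;
    int    event_id;
    int    processor;
    double duration;
    int    file_id;      /* assumed: sixth value in the example instance */
    int    num_bytes;
    int    offset;
    int    whence;
} SeekRecord;

/* Format the record as a one-line ASCII instance.
   Returns the value reported by snprintf. */
int format_seek_record(const SeekRecord *r, char *buf, size_t len)
{
    return snprintf(buf, len,
        "\"Seek\" { [2] { %d, %d }, %.7f, %d, %d, %.7f, %d, %d, %d, %d };;",
        r->timestamp[0], r->timestamp[1], r->seconds, r->event_id,
        r->processor, r->duration, r->file_id, r->num_bytes,
        r->offset, r->whence);
}
```

The seven-digit precision is chosen only to reproduce the values in the example; the slides do not state what precision the SDDF library itself writes.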
Pablo TraceLibrary
• Basic trace library with extensions for procedure tracing, loop tracing, NX message passing tracing, I/O tracing, MPI tracing
• Basic trace library
– functions traceEvent, countEvent, startTimeEvent, endTimeEvent
– event ID specifies type of event that is being traced
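The slides name the basic routines but do not give their signatures, so the sketch below only illustrates the idea of counting and interval timing keyed by event ID. Every name in it (`demo_count_event`, `demo_start_time_event`, `demo_end_time_event`, `DemoEvent`) is hypothetical, and the time is passed in as an argument so that the example is deterministic.

```c
#define DEMO_MAX_EVENTS 16

/* Per-event-ID state; an illustrative stand-in, not Pablo's data structure. */
typedef struct {
    long   count;    /* occurrences recorded by demo_count_event */
    double start;    /* time of the pending start, if running */
    double total;    /* accumulated interval time */
    int    running;  /* nonzero between a start and its matching end */
} DemoEvent;

DemoEvent demo_events[DEMO_MAX_EVENTS];

static int demo_valid_id(int id)
{
    return id >= 0 && id < DEMO_MAX_EVENTS;
}

/* Record one occurrence of the event. Returns 0 on success, -1 on a bad ID. */
int demo_count_event(int id)
{
    if (!demo_valid_id(id)) return -1;
    demo_events[id].count++;
    return 0;
}

/* Mark the start of a timed interval for the event. */
int demo_start_time_event(int id, double now)
{
    if (!demo_valid_id(id) || demo_events[id].running) return -1;
    demo_events[id].start = now;
    demo_events[id].running = 1;
    return 0;
}

/* Close the interval and add its length to the event's total. */
int demo_end_time_event(int id, double now)
{
    if (!demo_valid_id(id) || !demo_events[id].running) return -1;
    demo_events[id].total += now - demo_events[id].start;
    demo_events[id].running = 0;
    return 0;
}
```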
Pablo TraceLibrary (cont.)
• Extensions provide wrapper functions for management of event ID’s for various event types
• Procedure and loop tracing done manually by inserting calls to TraceLibrary routines into application source code
• Default mode is to dump trace buffer contents to a trace file, but it’s possible to have trace data output sent to a socket for real-time analysis
TraceLibrary Scalability
• Documentation states that TraceLibrary monitors and dynamically alters volume, frequency, and types of event data by
– associating a user-specified maximum trace level with each event and
– substituting less invasive data recording (e.g., event counts rather than complete event traces) if maximum user-specified rate is exceeded
• Unclear if these measures are taken automatically by high-level trace library or if they must be explicitly called by user at low level
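The substitution described above can be pictured with a small sketch. This is not TraceLibrary code: `DemoThrottle`, `demo_throttle_init`, and `demo_record` are hypothetical, and the one-second window used to estimate the event rate is an assumption made for illustration.

```c
typedef enum { DEMO_FULL_TRACE, DEMO_COUNT_ONLY } DemoMode;

typedef struct {
    double   max_rate;      /* user-specified maximum events per second */
    double   window_start;  /* start time of the current one-second window */
    long     in_window;     /* events seen in the current window */
    long     traced;        /* events recorded as complete trace records */
    long     counted;       /* events recorded only as counts */
    DemoMode mode;
} DemoThrottle;

void demo_throttle_init(DemoThrottle *t, double max_rate)
{
    t->max_rate = max_rate;
    t->window_start = 0.0;
    t->in_window = 0;
    t->traced = 0;
    t->counted = 0;
    t->mode = DEMO_FULL_TRACE;
}

/* Record one event at time `now` and return the mode that was used for it. */
DemoMode demo_record(DemoThrottle *t, double now)
{
    if (now - t->window_start >= 1.0) {
        /* New window: reset the rate estimate and return to full tracing. */
        t->window_start = now;
        t->in_window = 0;
        t->mode = DEMO_FULL_TRACE;
    }
    t->in_window++;
    if ((double)t->in_window > t->max_rate)
        t->mode = DEMO_COUNT_ONLY;   /* rate exceeded: fall back to counting */

    if (t->mode == DEMO_FULL_TRACE)
        t->traced++;
    else
        t->counted++;
    return t->mode;
}
```

With a maximum rate of 3, the first three events in a window are traced in full and later events in that window are only counted.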
I/O Extension to TraceLibrary
• I/O instrumentation requires changes to application source code
• I/O trace initialization and termination routines must be called before and after calling any other I/O trace routines
• I/O trace bracketing routines provided for I/O requests that are not implemented as library calls (e.g., getc macro in C and Fortran I/O statements that are part of the language)
I/O Extension (cont.)
• I/O instrumentation options for C programs
– Manually replace standard I/O calls with tracing counterparts
– Define IOTRACE so that pre-processor replaces standard I/O calls with tracing counterparts
• I/O instrumentation of Fortran programs
– Manually bracket each I/O call with I/O trace library bracketing routines
I/O Extension (cont.)
• Programs containing calls to I/O extension interface routines must be linked with
– Pablo Trace Extension Library libPabloTraceExt.a
– Pablo Base Trace Library libPabloTrace.a
Sample C program - No Instrumentation
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    fp = fopen("/etc/motd", "r");
    if (fp != NULL) {
        cnt = fread(buffer, sizeof(char), 1024, fp);
        fclose(fp);
    }
    return 0;
}
Sample C program - Manual Instrumentation
#include "IOTrace.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    initIOTrace();    /* Initialize I/O Extension */

    fp = traceFOPEN("/etc/motd", "r");
    if (fp != NULL) {
        cnt = traceFREAD(buffer, sizeof(char), 1024, fp);
        traceFCLOSE(fp);
    }

    /* Trace termination routines */
    endIOTrace();
    endTracing();
    return 0;
}
Sample C program - Preprocessor Replacement
#define IOTRACE
#include "IOTrace.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp;
    char buffer[1024];
    size_t cnt;

    initIOTrace();    /* Initialize I/O Extension */

    fp = fopen("/etc/motd", "r");
    if (fp != NULL) {
        cnt = fread(buffer, sizeof(char), 1024, fp);
        fclose(fp);
    }

    /* Trace termination routines */
    endIOTrace();
    endTracing();
    return 0;
}
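The slides do not show what IOTrace.h contains, so the following is only a sketch of how preprocessor replacement of this kind can work in general. `DEMO_IOTRACE`, `demo_trace_fopen`, `demo_fopen_calls`, and `demo_file_exists` are hypothetical names; the real header may be organized differently.

```c
#include <stdio.h>

long demo_fopen_calls = 0;   /* stands in for writing a trace record */

/* Hypothetical tracing counterpart: note the call, then forward it.
   The macro below is not yet defined here, so this calls the real fopen. */
FILE *demo_trace_fopen(const char *path, const char *mode)
{
    demo_fopen_calls++;
    return fopen(path, mode);
}

#define DEMO_IOTRACE   /* plays the role that IOTRACE plays in the sample */

#ifdef DEMO_IOTRACE
/* Redefining a standard library name as a macro is not guaranteed by
   ISO C, but it is the usual way this style of replacement is done. */
#define fopen(path, mode) demo_trace_fopen((path), (mode))
#endif

/* Application code is written with plain fopen; the preprocessor
   reroutes the call to the tracing counterpart at compile time. */
int demo_file_exists(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) return 0;
    fclose(fp);
    return 1;
}
```

The design trade-off is the one the slides describe: the application source stays unchanged apart from the define and the include, at the cost of hiding which calls are instrumented.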
Sample Fortran program - No Instrumentation
      integer i
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      i = 0
      write(2, 100) i
      close(2)
 100  format('Node ', i3)
      end
Sample Fortran program - Manual Instrumentation
#include "fIOTrace.h"
      integer i
      call initIOTrace()

      call traceOpenBegin('/tmp/f', i)
      open(unit=2,file='/tmp/f',form='formatted',status='new')
      call traceOpenEnd(2)

      i = 0
      call traceWriteBegin(2,1,0)
      write(2, 100) i
      call traceWriteEnd(9)

      call traceCloseBegin(2)
      close(2)
      call traceCloseEnd()

 100  format('Node ',i3)
      call endIOTrace()
      call endTracing()
      end
MPI TraceLibrary Extension
• MPI profiling library that can be linked in without making source code changes
• Each MPI process outputs a trace file labeled with the process number
• Insert call to SetTraceFileName() immediately after MPI_Init() to control location of trace file
MPI Extension (cont.)
• Disable tracing by calling MPI_Pcontrol(0)
• Re-enable tracing by calling MPI_Pcontrol(1)
• Link with Pablo Trace Extension Library (libPabloTraceExt.a) and Pablo Base Trace Library (libPabloTrace.a)
• Merge per-process trace files using the SDDF utility MergePabloTraces
Pablo Trace File Analysis
• Command-line FileStats program scans SDDF file and reports record types, min and max values for each field, and count of each record type.
• SDDFStatistics GUI for generating and browsing statistics from an SDDF file
• Pablo I/O analysis command-line routines
• Pablo Analysis GUI
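The per-field summary that FileStats is described as reporting (minimum, maximum, and count) can be sketched in a few lines of C. This is not FileStats source code; `DemoFieldStats`, `demo_stats_init`, and `demo_stats_add` are hypothetical names, and reading the values out of an SDDF file is left out.

```c
#include <float.h>

/* Running summary for one numeric field of one record type. */
typedef struct {
    long   count;
    double min;
    double max;
} DemoFieldStats;

void demo_stats_init(DemoFieldStats *s)
{
    s->count = 0;
    s->min = DBL_MAX;     /* any real value will be smaller */
    s->max = -DBL_MAX;    /* any real value will be larger */
}

/* Fold one field value into the summary. */
void demo_stats_add(DemoFieldStats *s, double value)
{
    s->count++;
    if (value < s->min) s->min = value;
    if (value > s->max) s->max = value;
}
```

One such summary would be kept per field of each record type, for example one for the "Duration" field of the "Seek" records.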
SDDFStatistics
• Statistics for entire file are displayed along top of display
• Record types are displayed in panel at lower left
• Clicking on a record type brings up statistics for each field of that record type
• Clicking on a field displays a histogram summarizing values for that field
• Clicking on an array field type brings up statistics for each dimension of that field
SDDFStatistics display
SDDFStatistics Usage
• SDDFStatistics [-toolkitoption …] [-loadSummary filename] [-openSDDF filename]
• Or use runSDDFStatistics script which invokes the SDDFStatistics program after setting environment variables so that required resources can be located
I/O Analysis Programs
• IOstats generates a report of application I/O activity summarized by I/O request type.
• IOstatsTable produces table summarizing information about I/O operations.
• IOtotalsByPE produces a report showing the total count, duration, and bytes involved for various operations by processor.
I/O Analysis Programs (cont.)
• LifetimeIOstats produces a report summarizing I/O activity by processor and file, prints a histogram of the file lifetimes, and prints total time spent in I/O calls for each procedure.
• FileRegionIOstats generates a report of application I/O activity summarized by file region. Each file is divided spatially into regions whose size is set by calling enableFileRegionSummaries().
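Dividing a file spatially into fixed-size regions amounts to integer division of the byte offset by the region size. The sketch below shows that mapping and one way of attributing a request's bytes to the regions it touches. The names `demo_region_index` and `demo_add_request` are hypothetical, and splitting a request that crosses a region boundary is an assumption; the slides do not say how FileRegionIOstats handles that case.

```c
/* Map a byte offset to a zero-based region index, or -1 on bad input. */
long demo_region_index(long offset, long region_size)
{
    if (region_size <= 0 || offset < 0) return -1;
    return offset / region_size;
}

/* Add the bytes of one request to per-region totals, splitting the
   request where it crosses a region boundary.
   Returns 0 on success, -1 on bad input or if the request runs past
   the last tracked region. */
int demo_add_request(long *bytes_per_region, long nregions,
                     long region_size, long offset, long nbytes)
{
    if (region_size <= 0 || offset < 0 || nbytes < 0) return -1;
    while (nbytes > 0) {
        long idx = offset / region_size;
        long room, chunk;
        if (idx >= nregions) return -1;
        room = region_size - offset % region_size;  /* bytes left in region */
        chunk = nbytes < room ? nbytes : room;
        bytes_per_region[idx] += chunk;
        offset += chunk;
        nbytes -= chunk;
    }
    return 0;
}
```

For example, with 100-byte regions a 120-byte request at offset 150 contributes 50 bytes to region 1 and 70 bytes to region 2.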
I/O Analysis Programs (cont.)
• TimeWindowIOstats produces a report from Time Window Summary trace records. The execution time of the program is divided into time windows whose size is set by calling enableTimeWindowSummaries().
• SyncIOfileIDs processes a trace file containing I/O trace events where many different file IDs may be associated with a given file, and writes a new file where every I/O trace event associated with a particular file (as determined by the file name) has the same file ID.
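The core of the file ID synchronization described above is a lookup from file name to a single canonical ID. A minimal sketch, assuming a small fixed-size table and using the hypothetical names `DemoFileTable` and `demo_canonical_id` (none of this is taken from SyncIOfileIDs itself):

```c
#include <string.h>

#define DEMO_MAX_FILES 32
#define DEMO_NAME_LEN  128

/* Table of file names seen so far; the index is the canonical file ID. */
typedef struct {
    char names[DEMO_MAX_FILES][DEMO_NAME_LEN];
    int  n;
} DemoFileTable;

/* Return the canonical ID for a file name, assigning the next free ID
   the first time the name is seen. Returns -1 if the table is full or
   the name is too long to store. */
int demo_canonical_id(DemoFileTable *t, const char *name)
{
    int i;
    for (i = 0; i < t->n; i++) {
        if (strcmp(t->names[i], name) == 0)
            return i;
    }
    if (t->n == DEMO_MAX_FILES || strlen(name) >= DEMO_NAME_LEN)
        return -1;
    strcpy(t->names[t->n], name);
    return t->n++;
}
```

Rewriting a trace would then replace the file ID in each I/O event with the ID returned for that event's file name.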
I/O Characterization Research using Pablo
• Detailed characterization of I/O behavior of scalable applications and existing parallel file systems
• Goals
– Enable application developers to achieve higher fraction of peak I/O performance on existing parallel file systems
– Help system software developers design better parallel file systems
I/O Research (cont.)
• Target Platforms
– Intel Paragon
– IBM SP
– Convex Exemplar
– SGI Origin 2000
I/O Research (cont.)
• The Scalable I/O (SIO) Initiative has targeted a number of application codes for study, including:
– PRISM incompressible Navier-Stokes calculations
– SAR Synthetic Aperture Radar application
– HF Hartree-Fock calculations
– ESCAT SMC electron scattering
– RENDER ray-identification rendering
Pablo and Virtual Reality
• Problem
– Very large volume of captured performance data for parallel systems
– Human-computer interface is bandwidth-limited
• Proposed solution
– Immerse users in virtual world so that users can explore, viscerally experience, and modify the dynamic behavior of application and system software on a massively parallel system
Avatar
• Pablo virtual reality system
• Operates with workstation monitor, head-mounted display, and the CAVE
• Presentation metaphors
– Scattercube Matrix
• generalization of 2-d scatterplot matrix
• shows 3-d projections of sparsely populated, N-dimensional space
– Time Tunnel
• event level display of processor and inter-processor behavior
Pablo Analysis GUI
• Toolkit of data transformation modules capable of processing SDDF records
• Supports graphical connection of performance data transformation modules in style of AVS
• By graphically connecting modules and interactively selecting trace data records, user specifies desired data transformation and presentations
• Expert users can develop and add new data analysis modules
Analysis GUI (cont.)
• Module types
– Data analysis
• Mathematical transforms (counts, sums, ratios, max, min, average, trig functions, etc.)
• Synthesis of vectors and arrays from scalar input data
– Data presentation - bar graphs, bubble charts, strip charts, contour plots, interval plots, kiviat diagrams, 2-d and 3-d scatter plots, matrix displays, pie charts, polar plots
Pablo Analysis GUI Main Window
Module Creation Window
Module Connection
Configuring a Module (BarGraph)
Graph Execution
Graph with Synthesize Vector Module