Lecture 2a: Performance Measurement
Goals of Performance Analysis
The goal of performance analysis is to provide quantitative information about the performance of a computer system
Goals of Performance Analysis
Compare alternatives
• When purchasing a new computer system, to provide quantitative information
Determine the impact of a feature
• In designing a new system or upgrading, to provide before-and-after comparison
System tuning
• To find the parameters that produce the best overall performance
Identify relative performance
• To quantify the performance relative to previous generations
Performance debugging
• To identify the performance problems and correct them
Set expectations
• To determine the expected capabilities of the next generation
Performance Evaluation
Performance Evaluation steps:
1. Measurement / Prediction
• What to measure? How to measure?
• Modeling for prediction
  • Simulation
  • Analytical Modeling
2. Analysis & Reporting
• Performance metrics
Performance Measurement
Interval Timers
• Hardware Timers
• Software Timers
Performance Measurement
Hardware Timers
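• Counter value is read from a memory location
• Time is calculated as
Time = (x2 - x1) × Tc
where x1 and x2 are the counter values read at the start and at the end of the event, and Tc is the clock period
[Figure: a clock with period Tc drives an n-bit counter, which the processor reads over the memory bus]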
Performance Measurement
Software Timers
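• Interrupt-based
• When an interrupt occurs, the interrupt-service routine increments the timer value, which is read by a program
• Time is calculated as
Time = (x2 - x1) × T’c
where x1 and x2 are the timer values read at the start and at the end of the event, and T’c is the period of the prescaled clock
[Figure: a clock with period Tc drives a prescaling counter, whose output with period T’c goes to the processor interrupt input]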
Performance Measurement
Timer Rollover
Occurs when an n-bit counter undergoes a transition from its maximum value 2^n - 1 to zero
There is a trade-off between rollover time and accuracy
T’c      32-bit       64-bit
10 ns    42 s         5850 years
1 µs     1.2 hours    0.5 million years
1 ms     49 days      0.5 × 10^9 years
Timers
Solutions to rollover:
1. Use a 64-bit integer (rollover takes over half a million years with a 1 µs tick)
2. Timer returns two values:
• One represents seconds
• One represents microseconds since the last second
With a 32-bit seconds value, the rollover time is over 100 years (2^32 s is about 136 years)
Performance Measurement
Interval Timers
T0 = Read current time;
Event being timed ();
T1 = Read current time;
Time for the event is: T1 - T0
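Performance Measurement
Timer Overhead
[Figure: timeline of a timed event, divided into four intervals]
T1: from initiating read_time until the current time is read
T2: from the current time being read until the event begins
T3: the event itself
T4: from the end of the event (read_time initiated) until the current time is read
Measured time:
Tm = T2 + T3 + T4
Desired measurement:
Te = Tm - (T2 + T4)
   = Tm - (T1 + T2) since T1 = T4
Timer overhead:
Tovhd = T1 + T2
Te should be 100-1000 times greater than Tovhd.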
Performance Measurement
Timer Resolution
Resolution is the smallest change that can be detected by an interval timer.
nT’c < Te < (n+1)T’c
An event of this duration is reported as either nT’c or (n+1)T’c, depending on when it starts relative to the clock ticks.
If T’c is large relative to the event being measured, it may be impossible to measure the duration of the event.
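Performance Measurement
Measuring Short Intervals
Te < T’c
[Figure: an event of duration Te shorter than the clock period T’c; the measured value is 1 if a clock tick occurs during the event and 0 otherwise]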
Performance Measurement
Measuring Short Intervals
Solution: Repeat the measurement n times. The number of 1s measured approximates a binomial distribution.
Average execution time: T’e = (m/n) × T’c
m: number of 1s measured
Performance Measurement
Measuring Short Intervals
Solution: Repeat the event n times. Measure the total execution time (Tt)
Average execution time: T’e = (Tt / n) - h
Tt: total execution time of n repetitions
h: repetition overhead
[Figure: n back-to-back repetitions of the event Te spanning several clock periods T’c, with total time Tt]
Performance Measurement
Time
• Elapsed time / wall-clock time / response time
  • Latency to complete a task, including disk access, memory access, I/O, operating system overhead, and everything else (includes time consumed by other programs in a time-sharing system)
• CPU time
  • The time the CPU is computing, not including I/O time or waiting time
  • User time / user CPU time
    • CPU time spent in the program
  • System time / system CPU time
    • CPU time spent in the operating system performing tasks requested by the program
Performance Measurement
UNIX time command
90.7u 12.9s 2:39 65%
• 90.7u: user time (seconds)
• 12.9s: system time (seconds)
• 2:39: elapsed time (minutes:seconds)
• 65%: CPU time as a percentage of elapsed time, (90.7 + 12.9) / 159 s
Drawbacks:
• Resolution is in milliseconds
• Different sections of the code cannot be timed
Timers
A timer is a function, subroutine or program that can be used to return the amount of time spent in a section of code.
With a timer that returns the current time:
t0 = timer();
…
< code segment >
…
t1 = timer();
time = t1 - t0;
With a timer that returns the time elapsed since its argument:
zero = 0.0;
t0 = timer(&zero);
…
< code segment >
…
t1 = timer(&t0);
time = t1;
Timers
Read:
Wadleigh, Crawford pg 130-136 for:
time, clock, gettimeofday, etc.
Timers
Measuring Timer Resolution
main()
{
    . . .
    zero = 0.0;
    t0 = timer(&zero);
    t1 = 0.0;
    j = 0;
    /* grow the work done in foo() until the timer reports a nonzero time */
    while (t1 == 0.0) {
        j++;
        zero = 0.0;
        t0 = timer(&zero);
        foo(j);
        t1 = timer(&t0);
    }
    printf("It took %d iterations for a nonzero time\n", j);
    if (j == 1)
        printf("timer resolution <= %13.7f seconds\n", t1);
    else
        printf("timer resolution is %13.7f seconds\n", t1);
}

foo(n)
{
    . . .
    i = 0;
    for (k = 0; k < n; k++)
        i++;
    return (i);
}
Timers
Measuring Timer Resolution
Using clock():
It took 682 iterations for a nonzero time
timer resolution is 0.0200000 seconds
Using times():
It took 720 iterations for a nonzero time
timer resolution is 0.0200000 seconds
Using getrusage():
It took 7374 iterations for a nonzero time
timer resolution is 0.0002700 seconds
Timers
Spin Loops
Spin loops are used for codes that take less time to run than the resolution of the timer.
The first call to a function may require an inordinate amount of time. Therefore the minimum of all times may be desired.
main()
{
    . . .
    zero = 0.0;
    t2 = 100000.0;
    for (j = 0; j < n; j++) {
        t0 = timer(&zero);
        foo(j);
        t1 = timer(&t0);
        t2 = min(t2, t1);   /* keep the smallest time seen so far */
    }
    /* t2 / n is a per-repetition time only if each call to foo()
       runs the code segment n times */
    t2 = t2 / n;
    printf("Minimum time is %13.7f seconds\n", t2);
}

foo(n)
{
    . . .
    < code segment >
}
Profilers
A profiler automatically inserts timing calls into applications.
It is used to identify the portions of the program that consume the largest fraction of the total execution time.
It may also be used to find system-level bottlenecks in a multitasking system.
Profilers may alter the timing of a program’s execution.
Profilers
Data collection techniques
• Sampling-based
  • This type of profiler uses a predefined clock; at every multiple of the clock tick, the program is interrupted and its state information is recorded.
  • They give a statistical profile of the program's behavior.
  • They may miss some important events.
• Event-based
  • Events are defined (e.g. entry into a subroutine) and data about these events are collected.
  • The collected information shows the exact execution frequencies.
  • It has a substantial run-time overhead and memory requirement.
Information kept
• Trace-based: The profiler keeps all the information it collects.
• Reductionist: Only statistical information is collected.
Performance Evaluation
Performance Evaluation steps:
1. Measurement / Prediction
• What to measure? How to measure?
• Modeling for prediction
  • Simulation
  • Analytical Modeling
  • Queuing Theory
2. Analysis & Reporting
• Performance metrics
Predicting Performance
The performance of simple kernels can be predicted to a high degree of accuracy
Theoretical performance and peak performance must be close
Measured performance should preferably be over 80% of the theoretical peak performance
Homework 1
Write a C program to measure the execution time (elapsed time) of an addition operation (i.e. a=b+c). Run your program on both Windows and Linux systems. Use a timer that has at least µs (microsecond) resolution.
Prepare a one-page report and explain the following:
• Your method to measure time
• Your code
• Specifications of the system on which you run your code (processor, clock speed, etc.)
• Your measurement results
• Comments on your results