Dynamic Floating-Point Cancellation Detection
Michael O. Lam (Presenter)
Jeffrey K. Hollingsworth
G. W. Stewart
University of Maryland, College Park
2
Background (Floating-Point Representation 101)
Floating-point represents real numbers as ± sig × 2^exp
  Sign bit
  Significand (“mantissa” or “fraction”)
  Exponent
Floating-point numbers have finite binary precision
  Single precision: 24 binary digits (~7 decimal digits)
  Double precision: 53 binary digits (~16 decimal digits)
Examples:
  π     decimal 3.141592…   binary 11.0010010…
  1/10  decimal 0.1         binary 0.0001100110…
[Figure: image from Wikipedia (“Single precision”)]
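
The sign/significand/exponent split above can be inspected directly in C. The following is a minimal sketch, not part of the original slides: the type `fp_parts` and the helper names are hypothetical, and it relies only on the standard `frexp` and `ldexp` functions, which normalize the significand to [0.5, 1) rather than the [1, 2) convention used in the IEEE bit layout.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical helper type: x == sign * sig * 2^exp, with 0.5 <= sig < 1. */
typedef struct { int sign; double sig; int exp; } fp_parts;

/* Split a finite double into sign, significand, and binary exponent. */
fp_parts decompose(double x) {
    fp_parts p;
    assert(isfinite(x));              /* frexp is only meaningful for finite x */
    p.sign = signbit(x) ? -1 : 1;
    p.sig = frexp(fabs(x), &p.exp);
    return p;
}

/* Rebuild the value; ldexp scales by a power of two without rounding here. */
double recompose(fp_parts p) {
    return p.sign * ldexp(p.sig, p.exp);
}

/* Significand widths reported by the platform (24 and 53 on IEEE 754 systems). */
int single_precision_bits(void) { return FLT_MANT_DIG; }
int double_precision_bits(void) { return DBL_MANT_DIG; }

/* Print the decomposition with enough digits to show the stored value. */
void print_parts(double x) {
    fp_parts p = decompose(x);
    printf("%.17g = %+d * %.17g * 2^%d\n", x, p.sign, p.sig, p.exp);
}
```

Calling `print_parts(0.1)` prints the value actually stored for 1/10 with 17 significant digits, which makes visible that the repeating binary expansion has been cut off at 53 bits.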
3
Motivation
Finite precision causes round-off error
  Compromises ill-conditioned calculations
  Hard to detect and diagnose
Increasingly important as HPC scales
  Need to balance speed and accuracy
    Lower precision is faster
    Higher precision is more accurate
  Industry-standard double precision may still fail on long-running computations
4
Previous Solutions
Analytical
  Requires numerical analysis expertise
  Conservative static error bounds are largely unhelpful
Ad hoc
  Run experiments at different precisions
  Increase precision where necessary
  Tedious and time-consuming
5
Instrumentation Solution
Automated (vs. manual)
  Minimize developer effort
  Ensure consistency and correctness
Binary-level (vs. source-level)
  Include shared libraries without source code
  Include compiler optimizations
Runtime (vs. compile time)
  Dataset and communication sensitivity
6
Solution Components
Dyninst-based instrumentation utility (“mutator”)
  Cross-platform
  No special hardware required
  Stack walking and binary rewriting
Shared library with runtime analysis routines
  Flexibility and ease of development
Java-based log viewer GUI
  Cross-platform
  Minimal development effort
7
Analysis Process
Run mutator
  Find floating-point instructions
  Insert calls to shared library
Run instrumented program
  Executes analysis alongside original program
  Stores results in a log file
View output with GUI
8
Analysis Types
Cancellation detection
Shadow-value analysis
9
Cancellation
Loss of significant digits during subtraction operations
Cancellation is a symptom, not the root problem
Indicates that a loss of information has occurred that may cause problems later
Examples (significant digits in parentheses):

    1.613647  (7)          1.613647  (7)
  - 1.613635  (7)        - 1.613647  (7)
    0.000012  (2)          0.000000  (0)
  (5 digits cancelled)   (all digits cancelled)

    1.6136473
  - 1.6136467
    0.0000006
10
Detecting Cancellation
For each addition/subtraction:
  Extract value of each operand
  Calculate result and compare magnitudes (binary exponents)
  If e_ans < max(e_x, e_y), there is a cancellation
For each cancellation event:
  Calculate “priority”: max(e_x, e_y) - e_ans
  If above threshold, save event information to log
  For some events, record operand values
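
The exponent comparison above can be sketched in plain C. This is an illustration under stated assumptions, not the tool's actual runtime routine: the function names are hypothetical, exponents are taken from the standard `frexp`, a subtraction x - y is treated as x + (-y), and an exactly zero result is assumed to count as cancelling all 53 significand bits of a double.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Binary exponent e of x, where x = sig * 2^e and 0.5 <= |sig| < 1.
   Zero has no meaningful exponent; callers handle it separately. */
static int binary_exponent(double x) {
    int e;
    frexp(x, &e);
    return e;
}

/* Priority max(e_x, e_y) - e_ans of the operation x + y, i.e. roughly the
   number of leading binary digits that cancelled; 0 means no cancellation. */
int cancellation_priority(double x, double y) {
    double ans = x + y;
    if (x == 0.0 || y == 0.0) return 0;      /* nothing to cancel against */
    if (ans == 0.0) return DBL_MANT_DIG;     /* assumption: total cancellation */
    int ex = binary_exponent(x);
    int ey = binary_exponent(y);
    int emax = ex > ey ? ex : ey;
    int eans = binary_exponent(ans);
    return eans < emax ? emax - eans : 0;
}

/* Write an event to the log only if its priority is above the threshold.
   Returns 1 if an event was logged, 0 otherwise. */
int check_and_log(double x, double y, int threshold, FILE *log) {
    assert(log != NULL);
    int prio = cancellation_priority(x, y);
    if (prio <= threshold) return 0;
    fprintf(log, "cancellation: %.17g + %.17g, priority %d\n", x, y, prio);
    return 1;
}
```

For 1.613647 - 1.613635 this yields a priority of 17 binary digits, which corresponds to about 5 decimal digits.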
11
12
13
Experiments
Gaussian elimination
  Benefits of partial pivoting
  Differing runtime behavior of popular algorithms
14
Gaussian Elimination
A → [L, U]
Partial pivoting
  Nominally to avoid division by zero
  Also avoids inaccurate results from small pivots
  This can be detected using cancellation
[Figure: row swap]
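
The effect of a small pivot can be reproduced on a 2x2 system. The sketch below is an illustrative example rather than code from the talk: the function `solve2x2` and the test system are assumptions chosen so that the exact solution is very close to x1 = 1, x2 = 1.

```c
#include <assert.h>
#include <math.h>

/* Solve the 2x2 system
       a11*x1 + a12*x2 = b1
       a21*x1 + a22*x2 = b2
   by Gaussian elimination. If use_pivoting is nonzero, the rows are swapped
   first so that the larger of |a11|, |a21| becomes the pivot.
   Assumes a nonsingular matrix; there is no other error handling. */
void solve2x2(double a11, double a12, double a21, double a22,
              double b1, double b2, int use_pivoting,
              double *x1, double *x2) {
    if (use_pivoting && fabs(a21) > fabs(a11)) {
        double t;
        t = a11; a11 = a21; a21 = t;
        t = a12; a12 = a22; a22 = t;
        t = b1;  b1  = b2;  b2  = t;
    }
    assert(a11 != 0.0);              /* pivot must be nonzero */
    double m = a21 / a11;            /* multiplier; huge when the pivot is tiny */
    a22 -= m * a12;                  /* eliminate x1 from the second row */
    b2  -= m * b1;
    *x2 = b2 / a22;                  /* back substitution */
    *x1 = (b1 - a12 * *x2) / a11;    /* with a tiny pivot, b1 - a12*x2 cancels */
}
```

With a11 = 1e-20, a12 = a21 = a22 = 1, b1 = 1, and b2 = 2, the unpivoted solve returns x1 = 0 because the subtraction in the last line cancels completely, while the pivoted solve returns x1 = 1 and x2 = 1.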
15
[Figure: Gaussian elimination diagram labeled “pivot”, “loss of data”, and “cancellation”]
16
Gaussian Cancellation
Cancellation Counts

log(diag. element size)     -2     -4     -6     -8
Threshold                    1      7     13     17   Estimate
Matrix size
  10 x 10                   66     37     37     34         25
  15 x 15                  225    123    122    122        100
  20 x 20                  663    247    252    257        225
  25 x 25                 1227    394    423    441        400
17
Gaussian Elimination
This suggests that cancellation can be used to detect the effects of a small pivot
Useful in sparse elimination with limited ability to pivot
Threshold must be kept high enough
18
Gaussian Elimination
A → [L, U]
Two methods: Classical and Bordered
19
[Figure: size of diagonal elements vs. iterations of algorithm; panels: Classical, Bordered]

                        Classical               Bordered
threshold               1   2   3   4   5       1   2   3   4   5
smallest diag. value
  10^-5                14   8   1   0   0       8   7   6   5   4
  10^-10               29  23  16  11   3       8   8   7   7   6
  10^-15               39  33  27  21  17       9   9   9   8   8
20
Gaussian Elimination
Classical method: many small cancellations
Bordered method: fewer but larger cancellations
Our tool can detect these differences and inform the developer, who can then make decisions regarding which algorithm to use
21
Other Results
Approximate nearest neighbor
  More cancellations in denser point sets
SPEC benchmarks milc and lbm
  Cancellations in error calculations indicate good results
SPEC benchmark povray
  Cancellations indicate color black
22
Conclusions
It is important to vary the threshold
  Most calculations have background cancellations
  Small cancellations can hide large ones
Cancellation results require interpretation by someone who is familiar with the algorithm
Properly employed, cancellation detection can help find “trouble spots” in numerical codes
23
Ongoing Research
Shadow value analysis
  Replace floating-point numbers with pointers to auxiliary information (higher precision, etc.)

  #include <stdio.h>
  double x = 1.0;
  void func(void) { double y = 4.0; x = x + y; }
  int main(void) { func(); printf("%f\n", x); return 0; }

[Figure: the values 1.000, 4.000, and 5.000, with the label “shadow value”]
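
The shadow-value idea can be illustrated with a simplified sketch. This is not the tool's implementation: instead of rewriting values into pointers at the binary level, the hypothetical `shadowed` struct below simply carries a double-precision shadow next to a single-precision value, and every name in it is an assumption made for illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical shadowed number: the value the program computes in single
   precision, plus a double-precision "shadow" tracking the same quantity. */
typedef struct {
    float  value;
    double shadow;
} shadowed;

shadowed shadow_make(double v) {
    shadowed s = { (float)v, v };
    return s;
}

/* Perform the addition twice: once in the original precision, once on shadows. */
shadowed shadow_add(shadowed a, shadowed b) {
    shadowed r;
    r.value  = a.value + b.value;
    r.shadow = a.shadow + b.shadow;
    return r;
}

/* Relative difference between the computed value and its shadow. */
double shadow_rel_error(shadowed s) {
    if (s.shadow == 0.0) return fabs((double)s.value);
    return fabs(((double)s.value - s.shadow) / s.shadow);
}

/* Add `tiny` to `start` n times, as a long-running accumulation might. */
shadowed accumulate(double start, double tiny, long n) {
    assert(n >= 0);
    shadowed acc = shadow_make(start);
    shadowed inc = shadow_make(tiny);
    for (long i = 0; i < n; i++) acc = shadow_add(acc, inc);
    return acc;
}
```

Accumulating 1e-8 into 1.0 a million times leaves the single-precision value at exactly 1.0, because each increment is below half a unit in the last place, while the shadow reaches about 1.01. Comparing the two exposes an error of roughly 1% that the single-precision run alone would not reveal.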
24
Shadow Value Analysis
Current status: allows programmers to automatically test their entire program in different precisions
Next step: selectively instrument particular code blocks or data structures
Goal: automated floating-point analysis and recommendation framework
25
Thank you!
Code available upon request
Questions?
26
[Figure: size of diagonal elements vs. iterations of algorithm; panels: Classical, Bordered]

threshold                1        2        3        4        5
                       C   B    C   B    C   B    C   B    C   B
smallest diag. value
  10^-5               14   8    8   7    1   6    0   5    0   4
  10^-10              29   8   23   8   16   7   11   7    3   6
  10^-15              39   9   33   9   27   9   21   8   17   8
(C = Classical, B = Bordered)
27
Gaussian Cancellation
log(pivot)      -2     -4     -6     -8
Threshold        1      7     13     17

n = 10
  Count         66     37     37     34
  Trunc         55     37     37     34
  Est           25     25     25     25
n = 15
  Count        225    123    122    122
  Trunc        154    122    122    122
  Est          100    100    100    100
n = 20
  Count        663    247    252    257
  Trunc        298    245    252    257
  Est          225    225    225    225
n = 25
  Count       1227    394    423    441
  Trunc        447    381    423    441
  Est          400    400    400    400