CHAPTER 4
Round-Off and Truncation Errors
Numerical Accuracy
Truncation error: method dependent
Errors which result from using an approximation rather than an exact procedure, for example truncating the series
$f(x_i + h) = f(x_i) + h f'(x_i) + \frac{h^2}{2!} f''(x_i) + \frac{h^3}{3!} f'''(x_i) + \cdots$
Round-off error: machine dependent
Errors which result from not being able to adequately represent the true value, that is, from using an approximate number to represent an exact number
$\pi \approx 3.1416, \quad e \approx 2.71828$
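Taylor Series Expansion
Construction of finite-difference formula
Numerical accuracy: discretization error
Base point x = a
$f(x) = c_0 + c_1 (x-a) + c_2 (x-a)^2 + c_3 (x-a)^3 + \cdots, \quad c_0 = f(a)$
$f'(x) = c_1 + 2 c_2 (x-a) + 3 c_3 (x-a)^2 + \cdots, \quad c_1 = f'(a)$
$f''(x) = 2 c_2 + 6 c_3 (x-a) + \cdots, \quad c_2 = f''(a)/2!$
$f'''(x) = 6 c_3 + \cdots, \quad c_3 = f'''(a)/3!$
$f^{(m)}(x) = m!\, c_m + (m+1)!\, c_{m+1} (x-a) + \cdots, \quad c_m = f^{(m)}(a)/m!$
$f(x) = \sum_{m=0}^{\infty} c_m (x-a)^m = \sum_{m=0}^{\infty} \frac{f^{(m)}(a)}{m!} (x-a)^m$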
Taylor series expansions
$f(x_{i+1}) = f(x_i) + h f'(x_i) + \frac{h^2}{2!} f''(x_i) + \frac{h^3}{3!} f'''(x_i) + \cdots$, with step size $h = x_{i+1} - x_i$
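As a small illustration of how the Taylor expansion leads to a finite-difference formula, the sketch below drops every term after $h f'(x_i)$ and solves for the derivative (the forward difference). The test function sin(x), the point x0 = 1 and the step sizes are assumptions chosen only for this example. The leading neglected term is $\frac{h}{2} f''(x_i)$, so halving h should roughly halve the error.

```matlab
% Forward difference from the truncated Taylor series:
%   f(x+h) ~ f(x) + h*f'(x)   =>   f'(x) ~ (f(x+h) - f(x))/h
f = @(x) sin(x);                 % assumed test function
x0 = 1;                          % assumed evaluation point
exact = cos(x0);                 % exact derivative of sin at x0
h = [0.1 0.05 0.025];            % assumed step sizes, each half the previous
approx = (f(x0 + h) - f(x0)) ./ h;
err = abs(approx - exact);       % truncation (discretization) error
ratio = err(1:end-1) ./ err(2:end);   % close to 2 for a first-order formula
disp([h' approx' err'])
disp(ratio)
```

The ratios are not exactly 2 because the neglected higher-order terms still contribute a little at these step sizes.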
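Taylor Series and Remainder
Taylor series (base point x = a)
$f(x) = \sum_{m=0}^{n} \frac{f^{(m)}(a)}{m!} (x-a)^m + R_n$
$\quad = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n$
Remainder
$R_n = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x-a)^{n+1}$, where $\xi$ lies between $a$ and $x$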
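The remainder formula can be checked numerically. The sketch below uses $f(x) = e^x$ with base point a = 0, x = 1 and n = 4, all of which are assumptions picked for illustration. Every derivative of $e^x$ is $e^x$, and $e^\xi$ lies between $e^a$ and $e^x$, so the actual truncation error should fall between the two bounds computed here.

```matlab
% Truncation error of the n-th order Taylor polynomial of exp(x) about a
x = 1; a = 0; n = 4;                       % assumed values for the example
m = 0:n;
approx = sum(exp(a) .* (x - a).^m ./ factorial(m));   % series up to order n
err = exp(x) - approx;                     % actual truncation error
% Remainder R_n = f^(n+1)(xi)/(n+1)! * (x-a)^(n+1), with a <= xi <= x
lower = exp(a) * (x - a)^(n+1) / factorial(n+1);
upper = exp(x) * (x - a)^(n+1) / factorial(n+1);
fprintf('%.6f <= %.6f <= %.6f\n', lower, err, upper);
```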
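Truncation Error
Taylor series expansion
$f(x_{i+1}) = f(x_i) + h f'(x_i) + \frac{h^2}{2!} f''(x_i) + \frac{h^3}{3!} f'''(x_i) + \cdots$
Example (higher-order terms truncated), with $x_i = 0$, $h = x$, $x_{i+1} = x$:
$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots$
$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots$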
Power Series and Polynomials
Each term $(x-a)^m$ becomes more nonlinear as m increases
A MATLAB Script
Filename: fun_exp.m

    function total = fun_exp()
    % Evaluate exponential function exp(x)
    % by Taylor series expansion
    % f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
    x = input('enter the value of x = ');
    n = input('enter the order n = ');
    term = 1; total = term;          % zeroth-order term
    for i = 1 : n
        term = term*x/i;             % x^i/i! built from the previous term
        total = total + term;
    end

MATLAB For Loops
Filename: fun_exp2.m

    function total = fun_exp2()
    % Evaluate exponential function exp(x)
    % by Taylor series expansion
    % f(x) = 1 + x + x^2/2! + x^3/3! + ... + x^n/n!
    x = input('enter the value of x = ');
    n = input('enter the order n = ');
    term(1) = 1; total(1) = term(1); % store every term and partial sum
    for i = 1 : n
        term(i+1) = term(i)*x/i;
        total(i+1) = total(i) + term(i+1);
    end
    % Display the results
    disp('i term(i) sum(i)')
    a = 1:n+1; [a' term' total']
Truncation Error
$x = 10, \quad e^{x} = 22026.4658$

     n         term           sum
     0       1.0000        1.0000
     1      10.0000       11.0000
     2      50.0000       61.0000
     3     166.6667      227.6667
     4     416.6667      644.3334
     5     833.3334     1477.6667
     6    1388.8890     2866.5557
     7    1984.1272     4850.6826
     8    2480.1589     7330.8418
     9    2755.7322    10086.5742
    10    2755.7322    12842.3066
    11    2505.2112    15347.5176
    12    2087.6760    17435.1934
    13    1605.9045    19041.0977
    14    1147.0746    20188.1719
    15     764.7164    20952.8887
    16     477.9478    21430.8359
    17     281.1458    21711.9824
    18     156.1921    21868.1738
    19      82.2064    21950.3809
    20      41.1032    21991.4844
    21      19.5729    22011.0566
    22       8.8968    22019.9531
    23       3.8682    22023.8223
    24       1.6117    22025.4336
    25       0.6447    22026.0781
    26       0.2480    22026.3262
    27       0.0918    22026.4180
    28       0.0328    22026.4512
    29       0.0113    22026.4629
    30       0.0038    22026.4668
    31       0.0012    22026.4688
    32       0.0004    22026.4688
    33       0.0001    22026.4688
    34       0.0000    22026.4688
    35       0.0000    22026.4688
    36       0.0000    22026.4688
    37       0.0000    22026.4688
    38       0.0000    22026.4688
    39       0.0000    22026.4688
    40       0.0000    22026.4688
Truncation Error
$x = -10, \quad e^{x} = 0.45399 \times 10^{-4}$

     n             term              sum
     0        1.0000000        1.0000000
     1      -10.0000000       -9.0000000
     2       50.0000000       41.0000000
     3     -166.6666718     -125.6666718
     4      416.6666870      291.0000000
     5     -833.3333740     -542.3333740
     6     1388.8890381      846.5556641
     7    -1984.1271973    -1137.5715332
     8     2480.1589355     1342.5874023
     9    -2755.7321777    -1413.1447754
    10     2755.7321777     1342.5874023
    11    -2505.2111816    -1162.6237793
    12     2087.6760254      925.0522461
    13    -1605.9045410     -680.8522949
    14     1147.0745850      466.2222900
    15     -764.7164307     -298.4941406
    16      477.9477539      179.4536133
    17     -281.1457520     -101.6921387
    18      156.1920776       54.4999390
    19      -82.2063599      -27.7064209
    20       41.1031799       13.3967590
    21      -19.5729427       -6.1761837
    22        8.8967924        2.7206087
    23       -3.8681707       -1.1475620
    24        1.6117378        0.4641758
    25       -0.6446951       -0.1805193
    26        0.2479596        0.0674404
    27       -0.0918369       -0.0243965
    28        0.0327989        0.0084024
    29       -0.0113100       -0.0029076
    30        0.0037700        0.0008624
    31       -0.0012161       -0.0003537
    32        0.0003800        0.0000263
    33       -0.0001152       -0.0000889
    34        0.0000339       -0.0000550
    35       -0.0000097       -0.0000647
    36        0.0000027       -0.0000620
    37       -0.0000007       -0.0000627
    38        0.0000002       -0.0000625
    39        0.0000000       -0.0000626
    40        0.0000000       -0.0000626

The alternating series settles at -0.0000626, which has the wrong sign compared with the true value $0.45399 \times 10^{-4}$.
How to reduce error? Use $e^{-10} = 1/e^{10} = 1/22026.4658$
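The comparison for x = -10 can be reproduced with a short sketch. It assumes the tabulated values came from single-precision arithmetic (the digits are consistent with that, but the source does not say so), and the variable names are chosen only for this example. Summing the alternating series directly suffers from cancellation between terms of size 10^3, while summing the all-positive series for e^10 and taking the reciprocal does not.

```matlab
% exp(-10) two ways, in single precision (assumed) with n = 40 terms
x = single(10); n = 40;
pos = single(1); tp = single(1);   % partial sum and term for exp(+10)
neg = single(1); tn = single(1);   % partial sum and term for exp(-10)
for i = 1:n
    tp = tp*x/i;     pos = pos + tp;   % all terms positive
    tn = tn*(-x)/i;  neg = neg + tn;   % alternating signs, heavy cancellation
end
exact = exp(-10);                      % double-precision reference
direct = double(neg);                  % series summed directly
recip = 1/double(pos);                 % reciprocal of the series for exp(10)
err_direct = abs(direct - exact)/exact;
err_recip = abs(recip - exact)/exact;
fprintf('direct: %g  (rel. error %g)\n', direct, err_direct);
fprintf('1/e^10: %g  (rel. error %g)\n', recip, err_recip);
```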
Round-off Errors
Computers can represent numbers only to a finite precision
Most important for real numbers; integer math can be exact, but limited
How do computers represent numbers?
Binary representation of the integers and real numbers in computer memory
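Single precision: 32 bits (23 for the digits, 8 for the signed exponent, 1 for the sign); $2^8 = 256$
largest: $(0.111\ldots11)_2 \times 2^{128}$, of order $10^{38}$
smallest: $(0.100\ldots00)_2 \times 2^{-127}$, of order $10^{-39}$
Double precision: 64 bits (52 for the digits, 11 for the signed exponent, 1 for the sign); $2^{11} = 2048$
largest: $(0.111\ldots11)_2 \times 2^{1024}$
smallest: $(0.100\ldots00)_2 \times 2^{-1023}$
MATLAB uses double precision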
Order of operation
Addition problem:
exact result: $0.99 + 0.0044 + 0.0042 = 0.9986$
with 3-digit arithmetic:
$(0.99 + 0.0044) + 0.0042 = 0.994 + 0.0042 = 0.998$
$0.99 + (0.0044 + 0.0042) = 0.99 + 0.0086 = 0.999$
Round-off error: the result depends on the order of the additions
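The same effect shows up in double precision, only much further out in the digits. The values 0.1, 0.2 and 0.3 below are an assumption chosen for the sketch (they are not from the slides); none of them is exactly representable in binary, so the two groupings round differently.

```matlab
% Floating-point addition is not associative
a = (0.1 + 0.2) + 0.3;
b = 0.1 + (0.2 + 0.3);
fprintf('%.17g\n%.17g\n', a, b);   % print all stored digits
same = (a == b);                   % false: the two sums differ in the last bit
disp(same)
```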
Cancellation error
$x^2 - bx + 1 = 0$
$x_1 = \frac{b + r}{2}, \quad x_2 = \frac{b - r}{2}, \quad r = \sqrt{b^2 - 4}$
If b is large, r is close to b
Difference of two numbers very close to each other: potential for greater error!
Rationalize:
$x_2 = \frac{b - r}{2} = \frac{(b - r)(b + r)}{2(b + r)} = \frac{b^2 - r^2}{2(b + r)} = \frac{4}{2(b + r)} = \frac{2}{b + r}$
Try b = 97: $x^2 - 97x + 1 = 0$ (r = 96.9794)
$x_2$ (3 sig. figs.)
exact: 0.01031
standard: 0.01050
rationalized: 0.01031
Corresponding to "cancellation, critical arithmetic"
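Cancellation also bites in double precision once b is large enough. The sketch below solves $x^2 - bx + 1 = 0$ for the small root both ways, using the standard form $(b - r)/2$ and the rationalized form $2/(b + r)$ with $r = \sqrt{b^2 - 4}$. The value b = 1e8 is an assumption chosen so that the loss of digits is visible; the reference value uses the fact that the product of the roots is 1, so $x_2 = 1/x_1 \approx 1/b$.

```matlab
% Small root of x^2 - b*x + 1 = 0, standard vs. rationalized formula
b = 1e8;                           % assumed large coefficient
r = sqrt(b^2 - 4);                 % r is very close to b
x2_standard = (b - r)/2;           % subtracts two nearly equal numbers
x2_rational = 2/(b + r);           % no subtraction, no cancellation
x2_ref = 1/b;                      % x1*x2 = 1 and x1 is within 1e-8 of b
rel_standard = abs(x2_standard - x2_ref)/x2_ref;
rel_rational = abs(x2_rational - x2_ref)/x2_ref;
fprintf('standard:     %.16g (rel. error %.2g)\n', x2_standard, rel_standard);
fprintf('rationalized: %.16g (rel. error %.2g)\n', x2_rational, rel_rational);
```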
Significant Figures
48.9 mph? 48.95 mph?
Significant Digits
The places which can be used with confidence
32-bit machine: about 7 significant digits
64-bit machine: about 15 to 17 significant digits
Double precision: reduces round-off error, but increases CPU time
$\pi = 3.14159265358979323846264\ldots$
$\sqrt{2} = 1.41421356237309\ldots$
$e = 2.71828182845904\ldots$
3.25/1.96 = 1.65816326530612... (from MATLAB)
But in practice only report 1.65 (chopping) or 1.66 (rounding)!
Why? Because we don't know what is beyond the second decimal place
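The digit counts for single and double precision can be estimated from the machine epsilon, the gap between 1 and the next larger representable number. This is a rough sketch; the variable names are illustrative, and the conversion to decimal digits via log10 is an estimate rather than an exact count.

```matlab
% Approximate number of significant decimal digits per precision
eps_single = eps('single');             % 2^-23
eps_double = eps('double');             % 2^-52
digits_single = -log10(double(eps_single));   % roughly 7
digits_double = -log10(eps_double);           % roughly 16
fprintf('single: %.1f digits, double: %.1f digits\n', digits_single, digits_double);
q = 3.25/1.96;                          % the quotient quoted on the slide
fprintf('%.14f\n', q);
```

MATLAB prints many digits of q, but with inputs known only to three figures just the first three are meaningful.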
False Significant Figures
Rounding:
3.245/1.964 = 1.65224032586558...
3.254/1.955 = 1.66445012787724...
Chopping:
3.250/1.969 = 1.65058405281869...
3.259/1.960 = 1.66275510204082...
Accuracy and precision
Accuracy - How closely a measured or computed value agrees with the true value
Precision - How closely individual measured or computed values agree with each other
Accuracy is getting all your shots near the target. Precision is getting them close together.
[Figure: target diagrams labeled "More Accurate" and "More Precise"]
Numerical Errors
The difference between the true value and the approximation
True value = approximation + true error
$E_t = \text{true value} - \text{approximation} = x^* - x$
$\text{Relative error} = \frac{\text{true error}}{\text{true value}} = \frac{x^* - x}{x^*}$
or in percent
$\varepsilon_t = \frac{x^* - x}{x^*} \times 100\%$
Approximate Error
But the true value is not known
If we knew it, we wouldn't have a problem
Use approximate error
$\varepsilon_a = \frac{\text{approximate error}}{\text{approximation}} \times 100\% = \frac{\text{present approx.} - \text{previous approx.}}{\text{present approximation}} \times 100\%$
$\text{Relative error} = \frac{x_{new} - x_{old}}{x_{new}} \times 100\%$
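The approximate error is what an iterative computation can actually monitor. The sketch below keeps adding Taylor terms of $e^x$ until the approximate relative error drops below a tolerance, then compares with the true error, which is available here only because exp(x) is known. The value x = 0.5 and the tolerance of 1e-6 percent are assumptions for the example.

```matlab
% Stop a series when the approximate relative error is small enough
x = 0.5; tol = 1e-6;               % assumed input and tolerance (percent)
total = 1; term = 1; i = 0; ea = 100;
while ea > tol
    i = i + 1;
    old = total;                   % previous approximation
    term = term*x/i;               % next Taylor term x^i/i!
    total = total + term;          % present approximation
    ea = abs((total - old)/total)*100;   % approximate error, percent
end
et = abs((exp(x) - total)/exp(x))*100;   % true error, percent
fprintf('terms: %d, ea = %.2e%%, et = %.2e%%\n', i, ea, et);
```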
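Number Systems
Base-10 (Decimal): 0,1,2,3,4,5,6,7,8,9
Base-8 (Octal): 0,1,2,3,4,5,6,7
Base-2 (Binary): 0,1 (off/on, close/open, negative/positive charge)
Other non-decimal systems: 1 lb = 16 oz, 1 ft = 12 in, ½”, ¼”, ...
Decimal System (base 10)
$5{,}129 = 5 \times 10^3 + 1 \times 10^2 + 2 \times 10^1 + 9 \times 10^0$
$0.3125 = 3 \times 10^{-1} + 1 \times 10^{-2} + 2 \times 10^{-3} + 5 \times 10^{-4}$
Binary System (base 2)
$(101101)_2 = 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 45$
$(0.1011)_2 = 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3} + 1 \times 2^{-4} = \frac{11}{16}$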
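The two binary expansions can be checked in MATLAB. This is a minimal sketch: `bin2dec` and `dec2bin` are built-in functions for integers, while the fractional part is summed by hand because those functions do not handle fractions. The variable names are illustrative.

```matlab
% Binary to decimal: integer part with bin2dec, fraction by positional weights
n = bin2dec('101101');                  % 1*2^5 + 0*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0
frac_bits = [1 0 1 1];                  % digits of 0.1011 (base 2)
f = sum(frac_bits .* 2.^(-(1:4)));      % 1/2 + 0/4 + 1/8 + 1/16
back = dec2bin(45);                     % decimal back to a binary string
fprintf('%d  %.4f  %s\n', n, f, back);
```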
Integer Representation
Signed magnitude method
Use the first bit of a word to indicate the sign: 0 for negative (off), 1 for positive (on)
Remaining bits are used to store a number
Sign | Number
+ 1 0 1 0 0 1 0 1 1 0
off / on, close / open, negative / positive

Integer Representation
8-bit word
Sign | Number: $2^6\ 2^5\ 2^4\ 2^3\ 2^2\ 2^1\ 2^0$
largest number: $(1111111)_{\text{base 2}} = (127)_{\text{base 10}}$
smallest number: $(0000000)_{\text{base 2}} = (0)_{\text{base 10}}$
+/- 0000000 are the same, therefore we may use “-0” to represent “-128”
Total numbers = $2^8$ = 256 (-128 to 127)
Integer Representation
16-bit word
Range: -32,768 to 32,767
$1 \times 2^{14} + 1 \times 2^{13} + \cdots + 1 \times 2^1 + 1 \times 2^0 = 32{,}767$
Overflow: > 32,767 (cannot represent 43,000 A&M students)
Underflow: < -32,768 (magnitude too large)
32-bit word
Range: -2,147,483,648 to 2,147,483,647
9 significant digits
Overflow: world population 6 billion
Underflow: budget deficit -$100 billion
Integer Operations
Integer arithmetic can be exact as long as you don't get remainders in division (7/2 = 3 in integer math) or overflow the maximum integer
For an 8-bit computer max = 127 (min = -128)
So 123 + 45 = overflow and -74 * 2 = underflow
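These integer cases can be tried with MATLAB's `int8` and `int32` types. One caveat for this sketch: MATLAB integer arithmetic saturates at the type limits instead of raising an overflow error, and plain `/` on integers rounds to nearest, so `idivide` with the `'fix'` option is used here to get the truncated quotient.

```matlab
% 8-bit signed integers: range -128 to 127
a = int8(123) + int8(45);                  % 168 does not fit, saturates at 127
b = int8(-74) * 2;                         % -148 does not fit, saturates at -128
q = idivide(int32(7), int32(2), 'fix');    % truncated integer division
fprintf('%d  %d  %d\n', a, b, q);
fprintf('int8 range: %d to %d\n', intmin('int8'), intmax('int8'));
```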
Floating-Point Representation
Real numbers (also called floating-point numbers) are represented differently
For fractions or very large numbers
Store as: sign | signed exponent | mantissa
sign is 1 or 0 for negative or positive
exponent is the power (positive or negative) of the base
mantissa contains the significant digits

Floating-Point Representation
sign of number | signed exponent $e_1 e_2 \cdots e_m$ | mantissa $d_1 d_2 d_3 \cdots d_p$
$N = \pm\, .d_1 d_2 d_3 \cdots d_p \times B^e = m B^e$
m: mantissa
B: base of the number system
e: “signed” exponent
Note: the mantissa is usually “normalized” if the leading digit is zero
[Figure: word layouts for integer representation and floating-point number representation]
Decimal Representation
8-bit word
sign | signed exponent | number
$\pm$ | $\pm\ 10^1\ 10^0$ | $10^{-1}\ 10^{-2}\ 10^{-3}\ 10^{-4}$
1|095|1467 (base: B = 10)
mantissa: $m = -(1 \times 10^{-1} + 4 \times 10^{-2} + 6 \times 10^{-3} + 7 \times 10^{-4}) = -0.1467$
signed exponent: $e = +(9 \times 10^1 + 5 \times 10^0) = 95$
$(1|095|1467)_{\text{base 10}} = m B^e = -0.1467 \times 10^{95}$
Floating-Point Representation
8-bit word (without normalization)
sign | signed exponent | number
$\pm$ | $\pm\ 2^1\ 2^0$ | $2^{-1}\ 2^{-2}\ 2^{-3}\ 2^{-4}$
0|111|0101 (base: B = 2)
mantissa: $m = +(0 \times 2^{-1} + 1 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4}) = 5/16$
signed exponent: $e = -(1 \times 2^1 + 1 \times 2^0) = -3$
$(0|111|0101)_{\text{base 2}} = m B^e = (5/16) \times 2^{-3} = 5/128$
Normalization
Remove the leading zero by lowering the exponent ($d_1 = 1$ for all numbers)
if m < 1/2, multiply by 2 to remove the leading 0
floating-point numbers allow fractions and very large numbers to be represented, but take up more memory and CPU time
$1\ \text{in}^2 = (1/144)\ \text{ft}^2 = 0.006944\ \text{ft}^2$ (less accurate)
$1\ \text{in}^2 = 0.694444 \times 10^{-2}\ \text{ft}^2$ (normalization)
$\frac{1}{B} \le m < 1$
base 10: $\frac{1}{10} \le m < 1$, that is $0.1 \le m < 1$
base 2: $\frac{1}{2} \le m < 1$
Binary Representation
8-bit word (with normalization)
sign | signed exponent | number
$\pm$ | $\pm\ 2^1\ 2^0$ | $2^{-1}\ 2^{-2}\ 2^{-3}\ 2^{-4}$
1|011|1001 (base: B = 2)
mantissa: $m = -(1 \times 2^{-1} + 0 \times 2^{-2} + 0 \times 2^{-3} + 1 \times 2^{-4}) = -9/16$
signed exponent: $e = +(1 \times 2^1 + 1 \times 2^0) = 3$
$(1|011|1001)_{\text{base 2}} = m B^e = (-9/16) \times 2^{3} = -9/2$
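The 8-bit layout used in these examples (1 sign bit, 1 exponent sign bit, 2 exponent bits, 4 mantissa bits, with 1 meaning negative) can be decoded mechanically. The sketch below is only for this teaching format, not for IEEE 754 as real hardware stores numbers; the variable names are illustrative.

```matlab
% Decode the slides' 8-bit toy format: sign | exponent sign, 2 bits | 4 mantissa bits
words = {'10111001', '01110101'};       % the two example words
N = zeros(1, numel(words));
for k = 1:numel(words)
    b = words{k} - '0';                 % character string to a 0/1 vector
    m_sign = 1 - 2*b(1);                % bit 1: 1 means negative mantissa
    e_sign = 1 - 2*b(2);                % bit 2: 1 means negative exponent
    e = e_sign * (2*b(3) + b(4));       % exponent magnitude from bits 3 and 4
    m = m_sign * sum(b(5:8) .* 2.^(-(1:4)));   % mantissa from bits 5 to 8
    N(k) = m * 2^e;                     % N = m * B^e with B = 2
end
disp(N)                                 % -9/2 and 5/128
```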
Single Precision
A real variable (number) is stored in four 8-bit words (4 bytes), or 32 bits (64 bits for supercomputers)
bit (binary digit): 0 or 1
nibble: 4 bits, $2^4 = 16$ possible values
byte: 8 bits, $2^8 = 256$ possible values
32 bits: 23 for the digits, 8 for the signed exponent, 1 for the sign
largest: $(0.111\ldots11)_2 \times 2^{128} = 0.340280 \times 10^{39}$
smallest: $(0.100\ldots00)_2 \times 2^{-127} = 0.29387 \times 10^{-38}$
Double Precision
A real variable is stored in eight 8-bit words (8 bytes), or 64 bits
16 words, 128 bits for supercomputers
64 bits: 52 for the digits, 11 for the signed exponent, 1 for the sign
signed exponent: $2^{10} = 1024$
largest: $(0.111\ldots11)_2 \times 2^{1024}$
smallest: $(0.100\ldots00)_2 \times 2^{-1023}$
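MATLAB reports the actual limits of its floating-point types through `realmax` and `realmin`. A note of caution for this sketch: MATLAB follows IEEE 754, which stores a biased exponent and an implicit leading bit, so the smallest normalized values it reports differ somewhat from those of the simplified sign/exponent/mantissa layout, while the largest values have the same order of magnitude.

```matlab
% Range of single and double precision as reported by MATLAB
big_s = realmax('single');      % about 3.4028e38
small_s = realmin('single');    % smallest normalized single, 2^-126
big_d = realmax('double');      % about 1.7977e308
small_d = realmin('double');    % smallest normalized double, 2^-1022
over = realmax * 2;             % beyond the range: overflows to Inf
fprintf('single: %g to %g\n', small_s, big_s);
fprintf('double: %g to %g\n', small_d, big_d);
disp(over)
```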
Round-off Errors
Floating point characteristics contribute to round-off error (limited bits for storage)
Limited range of quantities can be represented
A finite number of quantities can be represented
The interval between numbers increases as the numbers grow

Example - three significant digits
0.0100 0.0101 0.0102 ...... 0.0999 (0.0001 increment)
0.100 0.101 0.102 ...... 0.999 (0.001 increment)
1.00 1.01 1.02 ...... 9.99 (0.01 increment)
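The growing interval between representable numbers can be seen directly in double precision with `eps(x)`, which returns the distance from x to the next larger double. The sample magnitudes below are assumptions chosen for the sketch.

```matlab
% Spacing between adjacent doubles grows with the magnitude of the number
s1 = eps(1);          % 2^-52
s2 = eps(1000);       % 2^-43
s3 = eps(1e10);       % 2^-19
fprintf('%g  %g  %g\n', s1, s2, s3);
% Near 1e16 the spacing is 2, so adding 1 is lost to rounding
absorbed = (1e16 + 1 == 1e16);
disp(absorbed)
```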
MATLAB
Finite number of real quantities (integers, real numbers or text) can be represented
For 8-bit, $2^8 = 256$ quantities
For 16-bit, $2^{16} = 65536$ quantities
MATLAB uses double precision: 8 bytes = 64 bits, more than $10^{19}$ ($2^{64}$) quantities