Lecture 6 Linear System Error


MATH 3795

Lecture 6. Sensitivity of the Solution of a Linear System

Dmitriy Leykekhman

Fall 2008

Goals

Understand how the solution of $Ax = b$ changes when $A$ or $b$ changes.

Condition number of a matrix (with respect to inversion).

Vector and matrix norms.

D. Leykekhman - MATH 3795 Introduction to Computational Mathematics Sensitivity of the Solution – 1



Linear Systems

Given $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, we are interested in the solution $x \in \mathbb{R}^n$ of $Ax = b$.

Suppose that instead of $A$ and $b$ we are given $A + \Delta A$ and $b + \Delta b$, where $\Delta A \in \mathbb{R}^{n \times n}$ and $\Delta b \in \mathbb{R}^n$. How do these perturbations in the data change the solution of the linear system?

First we need to understand how to measure the size of vectors and of matrices. This leads to vector norms and matrix norms.




Vector Norms

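Definition

A (vector) norm on $\mathbb{R}^n$ is a function

$$\|\cdot\| : \mathbb{R}^n \to \mathbb{R}, \qquad x \mapsto \|x\|,$$

which for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ satisfies

1. $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$,

2. $\|\alpha x\| = |\alpha| \, \|x\|$,

3. $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).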




Vector Norms

The most frequently used norms on $\mathbb{R}^n$ are given by

$$\|x\|_2 = \Big( \sum_{i=1}^{n} x_i^2 \Big)^{1/2}, \qquad \text{2-norm}.$$

MATLAB's built-in function: norm(x) or norm(x, 2).

More generally, for any $p \in [1, \infty)$,

$$\|x\|_p = \Big( \sum_{i=1}^{n} |x_i|^p \Big)^{1/p}, \qquad p\text{-norm}.$$

MATLAB's built-in function: norm(x, p).

$$\|x\|_\infty = \max_{i=1,\dots,n} |x_i|, \qquad \infty\text{-norm}.$$

MATLAB's built-in function: norm(x, inf).




Vector Norms

Example

Let $x = (1, -2, 3, -4)^T$. Then

$$\|x\|_1 = 1 + 2 + 3 + 4 = 10,$$

$$\|x\|_2 = \sqrt{1 + 4 + 9 + 16} = \sqrt{30} \approx 5.48,$$

$$\|x\|_\infty = \max\{1, 2, 3, 4\} = 4.$$
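The three norms of this example can be reproduced with a short Python sketch. The slides use MATLAB's norm; the helper `p_norm` below is a hypothetical stand-in written with the standard library only, not part of the lecture.

```python
import math

def p_norm(x, p):
    """Vector p-norm of x; p is a number >= 1 or math.inf."""
    if p == math.inf:
        return max(abs(xi) for xi in x)
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [1, -2, 3, -4]
print(p_norm(x, 1))         # 1-norm
print(p_norm(x, 2))         # 2-norm, equals sqrt(30)
print(p_norm(x, math.inf))  # infinity-norm
```

The infinity-norm is handled as a separate case because it is the limit of the p-norms and cannot be evaluated from the sum formula directly.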




Vector Norms

The boundaries of the unit balls defined by $\{x \in \mathbb{R}^n : \|x\|_p \le 1\}$.

One can show the following useful inequalities:








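$\|x\|_\infty \le \|x\|_2 \le \|x\|_1$.

Let $\|\cdot\|$ be any vector norm on $\mathbb{R}^n$. Then

$$\|x + y\| \ge \big|\, \|x\| - \|y\| \,\big| \qquad \text{for all } x, y \in \mathbb{R}^n.$$

Cauchy-Schwarz inequality:

$$|x^T y| \le \|x\|_2 \, \|y\|_2 \qquad \text{for all } x, y \in \mathbb{R}^n.$$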




Vector Norms

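Theorem

Vector norms on $\mathbb{R}^n$ are equivalent, i.e. for every two vector norms $\|\cdot\|_a$ and $\|\cdot\|_b$ on $\mathbb{R}^n$ there exist constants $c_{ab}$, $C_{ab}$ (depending on the vector norms $\|\cdot\|_a$ and $\|\cdot\|_b$, but not on $x$) such that

$$c_{ab} \|x\|_b \le \|x\|_a \le C_{ab} \|x\|_b \qquad \forall x \in \mathbb{R}^n.$$

In particular, for any $x \in \mathbb{R}^n$ we have the inequalities

$$\frac{1}{\sqrt{n}} \|x\|_1 \le \|x\|_2 \le \|x\|_1,$$

$$\|x\|_\infty \le \|x\|_2 \le \sqrt{n} \, \|x\|_\infty,$$

$$\|x\|_\infty \le \|x\|_1 \le n \, \|x\|_\infty.$$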
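The three chains of inequalities can be spot-checked numerically. The following Python sketch tests them on random vectors; the function name `check_equivalence`, the sample size, and the small relative tolerance for rounding are assumptions made for this illustration. Passing the check illustrates the theorem but does not prove it.

```python
import math
import random

def check_equivalence(x, tol=1e-12):
    """Return True if the three inequality chains hold for x (up to rounding)."""
    n = len(x)
    n1 = sum(abs(v) for v in x)
    n2 = math.sqrt(sum(v * v for v in x))
    ninf = max(abs(v) for v in x)
    slack = 1.0 + tol  # relative slack for floating point rounding
    return (
        n1 / math.sqrt(n) <= n2 * slack and n2 <= n1 * slack
        and ninf <= n2 * slack and n2 <= math.sqrt(n) * ninf * slack
        and ninf <= n1 * slack and n1 <= n * ninf * slack
    )

random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(1000)]
all_hold = all(check_equivalence(x) for x in samples)
print(all_hold)
```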




Matrix Norms

Definition

A matrix norm on $\mathbb{R}^{m \times n}$ is a function

$$\|\cdot\| : \mathbb{R}^{m \times n} \to \mathbb{R}, \qquad A \mapsto \|A\|,$$

which for all $A, B \in \mathbb{R}^{m \times n}$ and $\alpha \in \mathbb{R}$ satisfies

1. $\|A\| \ge 0$, and $\|A\| = 0 \Leftrightarrow A = 0$ (zero matrix),

2. $\|\alpha A\| = |\alpha| \, \|A\|$,

3. $\|A + B\| \le \|A\| + \|B\|$ (triangle inequality).

Warning: Matrix-norms and vector-norms are denoted by the same symbol $\|\cdot\|$. However, as we will see shortly, vector-norms and matrix-norms are computed very differently. Thus, before computing a norm we need to examine carefully whether it is applied to a vector or to a matrix. It should be clear from the context which norm, a vector-norm or a matrix-norm, is used.




Matrix Norms. First Approach.

View a matrix $A \in \mathbb{R}^{m \times n}$ as a vector in $\mathbb{R}^{mn}$, by stacking the columns of the matrix into a long vector.






Apply the vector-norms to this vector of length $mn$.






This will give matrix norms. For example, if we apply the 2-vector-norm, then

$$\|A\|_F = \Big( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \Big)^{1/2}.$$

This is called the Frobenius norm. (We will use $\|A\|_2$ to denote a different matrix norm.)







This approach is not very useful.




Matrix Norms. Second Approach.

We want to solve linear systems $Ax = b$.

Find a vector $x$ such that if we multiply $A$ by this vector (we apply $A$ to this vector), then we obtain $b$.





View a matrix $A \in \mathbb{R}^{m \times n}$ as a linear mapping, which maps a vector $x \in \mathbb{R}^n$ into a vector $Ax \in \mathbb{R}^m$:

$$A : \mathbb{R}^n \to \mathbb{R}^m, \qquad x \mapsto Ax.$$









How do we define the size of a linear mapping?









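Compare the size of the image $Ax \in \mathbb{R}^m$ with the size of $x$. This leads us to look at

$$\sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$

Here $Ax \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$ are vectors and $\|\cdot\|$ are vector norms (in $\mathbb{R}^m$ and $\mathbb{R}^n$).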




Matrix Norms

Let $p \in [1, \infty]$. The following identities are valid:

$$\sup_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \sup_{\|x\|_p = 1} \|Ax\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p.$$





One can show that

$$\|A\|_p = \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} \tag{1}$$

defines a matrix norm.

Note that on the left hand side in (1) the symbol $\|\cdot\|_p$ refers to the p-matrix-norm, while on the right hand side in (1) the symbol $\|\cdot\|_p$ refers to the p-vector-norm applied to the vectors $Ax \in \mathbb{R}^m$ and $x \in \mathbb{R}^n$, respectively.





Matrix Norms

For the most commonly used matrix-norms (1) with $p = 1$, $p = 2$, or $p = \infty$, there exist rather simple representations.

Let $\|\cdot\|_p$ be the matrix norm defined in (1). Then

$$\|A\|_1 = \max_{j=1,\dots,n} \sum_{i=1}^{m} |a_{ij}| \qquad \text{(maximum column norm)};$$

$$\|A\|_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{n} |a_{ij}| \qquad \text{(maximum row norm)};$$

$$\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} \qquad \text{(spectral norm)},$$

where $\lambda_{\max}(A^T A)$ is the largest eigenvalue of $A^T A$.
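The maximum row norm formula can be compared against definition (1) directly. For $p = \infty$ the maximum of $\|Ax\|_\infty$ over $\|x\|_\infty = 1$ is attained at a vector whose entries are all $\pm 1$ (a standard fact assumed here, not shown in the slides), so a brute-force search over sign vectors suffices for small $n$. The matrix and function names below are hypothetical.

```python
from itertools import product

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def max_row_sum(A):
    """Closed-form infinity-norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def inf_norm_by_search(A):
    """Infinity-norm from definition (1), searching all sign vectors."""
    n = len(A[0])
    return max(
        max(abs(v) for v in matvec(A, s))
        for s in product((-1, 1), repeat=n)
    )

A = [[2, -1, 0], [1, 3, -4]]  # hypothetical 2x3 sample matrix
print(max_row_sum(A), inf_norm_by_search(A))
```

The search visits $2^n$ sign vectors, so it is only meant as a check of the formula, not as a practical way to compute the norm.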





Matrix Norms

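Example

Let

$$A = \begin{pmatrix} 1 & 3 & -6 \\ -2 & 4 & 2 \\ 2 & 1 & -1 \end{pmatrix}.$$

Then

$$\|A\|_1 = \max\{5, 8, 9\} = 9,$$

$$\|A\|_\infty = \max\{10, 8, 4\} = 10,$$

$$\|A\|_2 = \sqrt{\max\{3.07, 23.86, 49.06\}} \approx 7.0045,$$

$$\|A\|_F = \sqrt{76} \approx 8.718.$$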
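These four values can be recomputed with a short Python sketch. The slides do not say how the eigenvalues of $A^T A$ were obtained; the power iteration used in `norm_2` below is a swapped-in technique chosen so that the example needs only the standard library, and the iteration count is an arbitrary assumption.

```python
import math

A = [[1, 3, -6], [-2, 4, 2], [2, 1, -1]]

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    """Frobenius norm: 2-norm of the stacked entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def norm_2(A, iters=200):
    """Estimate sqrt(lambda_max(A^T A)) by power iteration on G = A^T A."""
    m, n = len(A), len(A[0])
    G = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(c * c for c in w))  # tends to lambda_max once v has unit length
        v = [c / lam for c in w]
    return math.sqrt(lam)

print(norm_1(A), norm_inf(A), norm_2(A), norm_fro(A))
```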





Matrix Norms

Two important inequalities.

Theorem

For any $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times k}$ and $x \in \mathbb{R}^n$, the following inequalities hold:

$$\|Ax\|_p \le \|A\|_p \, \|x\|_p \qquad \text{(compatibility of matrix and vector norm)}$$

and

$$\|AB\|_p \le \|A\|_p \, \|B\|_p \qquad \text{(submultiplicativity of matrix norms)}.$$

Note that for the identity matrix $I$,

$$\|I\|_p = \max_{x \ne 0} \frac{\|Ix\|_p}{\|x\|_p} = 1.$$

Compare this with the first approach in which we view $I$ as a vector of length $n^2$. For example, the Frobenius norm (2-vector norm) is

$$\|I\|_F = \sqrt{n}.$$
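A small numerical check of submultiplicativity and of the two identity-matrix norms may help. The sketch below uses the infinity-norm; the second factor `B` and the size `n = 4` are hypothetical choices made for this illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    """Frobenius norm."""
    return math.sqrt(sum(a * a for row in A for a in row))

A = [[1, 3, -6], [-2, 4, 2], [2, 1, -1]]
B = [[0, 1, 2], [3, -1, 0], [1, 1, 1]]  # hypothetical second factor
n = 4
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Submultiplicativity: ||AB|| <= ||A|| ||B||
print(norm_inf(matmul(A, B)), "<=", norm_inf(A) * norm_inf(B))
# Identity: induced norm is 1, Frobenius norm is sqrt(n)
print(norm_inf(I), norm_fro(I))
```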





Error Analysis

Let

$$Ax = b \tag{2}$$

be the original system, where $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$.



Let

$$(A + \Delta A)\tilde{x} = b + \Delta b \tag{3}$$

be the perturbed system, where $\Delta A \in \mathbb{R}^{n \times n}$ and $\Delta b \in \mathbb{R}^n$ represent the perturbations in $A$ and $b$, respectively.



What is the error $\Delta x = \tilde{x} - x$ between the solution $x$ of the exact linear system (2) and the solution $\tilde{x}$ of the perturbed linear system (3)?

Use the representation $\tilde{x} = x + \Delta x$.





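Error Analysis. Perturbation in b only

The original linear system,

$$Ax = b,$$

where $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$. The perturbed linear system,

$$A(x + \Delta x) = b + \Delta b,$$

where $\Delta b \in \mathbb{R}^n$ represents the perturbations in $b$.

Subtracting, we get

$$A \Delta x = \Delta b, \qquad \text{or} \qquad \Delta x = A^{-1} \Delta b.$$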






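Take norms:

$$\|\Delta x\| = \|A^{-1} \Delta b\| \le \|A^{-1}\| \, \|\Delta b\|. \tag{4}$$

To estimate the relative error, note that $Ax = b$ and as a result

$$\|b\| = \|Ax\| \le \|A\| \, \|x\| \quad \Rightarrow \quad \frac{1}{\|x\|} \le \|A\| \frac{1}{\|b\|}. \tag{5}$$

Combining (4) and (5) we get

$$\frac{\|\Delta x\|}{\|x\|} \le \|A\| \, \|A^{-1}\| \frac{\|\Delta b\|}{\|b\|}. \tag{6}$$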





Error Analysis. Perturbation in b only

Definition

The (p-) condition number $\kappa_p(A)$ of a matrix $A$ (with respect to inversion) is defined by

$$\kappa_p(A) = \|A\|_p \, \|A^{-1}\|_p.$$

Set $\kappa_p(A) = \infty$ if $A$ is not invertible. MATLAB's built-in function: cond(A).

If

$$Ax = b$$

and

$$A(x + \Delta x) = b + \Delta b,$$

then the relative error between the solutions obeys

$$\frac{\|\Delta x\|_p}{\|x\|_p} \le \kappa_p(A) \frac{\|\Delta b\|_p}{\|b\|_p}.$$
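The bound can be observed on a small, nearly singular system. The Python sketch below uses a hypothetical 2-by-2 matrix, right-hand side, and perturbation (none of them from the slides) and the infinity-norm; it is an illustration under these assumptions rather than a general implementation.

```python
def inverse_2x2(A):
    """Explicit inverse of a 2x2 matrix (fine for a demo, not for general use)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def norm_inf_vec(x):
    return max(abs(v) for v in x)

def norm_inf_mat(A):
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, 1.0], [1.0, 1.0001]]  # hypothetical nearly singular matrix
b = [2.0, 2.0001]                # chosen so that x is close to (1, 1)
db = [0.0, 0.0001]               # small perturbation of b

A_inv = inverse_2x2(A)
x = matvec(A_inv, b)
dx = matvec(A_inv, db)           # Delta x = A^{-1} Delta b

kappa = norm_inf_mat(A) * norm_inf_mat(A_inv)
rel_error = norm_inf_vec(dx) / norm_inf_vec(x)
bound = kappa * norm_inf_vec(db) / norm_inf_vec(b)
print(kappa, rel_error, bound)
```

Here a relative change of about 5e-5 in $b$ changes the solution by roughly 100 percent, which is consistent with a condition number of about 4e4.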





Error Analysis. General Case.

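Let

$$Ax = b \tag{7}$$

be the original system, where $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$.

Let

$$(A + \Delta A)(\Delta x + x) = b + \Delta b \tag{8}$$

be the perturbed system, where $\Delta A \in \mathbb{R}^{n \times n}$ and $\Delta b \in \mathbb{R}^n$ represent the perturbations in $A$ and $b$, respectively.

If $\|A^{-1}\|_p \, \|\Delta A\|_p < 1$, then

$$\frac{\|\Delta x\|_p}{\|x\|_p} \le \frac{\kappa_p(A)}{1 - \kappa_p(A) \frac{\|\Delta A\|_p}{\|A\|_p}} \left( \frac{\|\Delta A\|_p}{\|A\|_p} + \frac{\|\Delta b\|_p}{\|b\|_p} \right). \tag{9}$$

If $\kappa_p(A)$ is small, we say that the linear system is well conditioned.

Otherwise, we say that the linear system is ill conditioned.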





Error Analysis. Example. Hilbert Matrix

Example

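Hilbert Matrix $H \in \mathbb{R}^{n \times n}$ with entries

$$h_{ij} = \int_0^1 x^{i+j-2} \, dx = \frac{1}{i + j - 1}.$$

For $n = 4$,

$$H = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{pmatrix},$$

$$H^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix}.$$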





Error Analysis. Example. Hilbert Matrix.

Example

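We compute that the condition number of a Hilbert matrix grows very fast with $n$. For $n = 4$,

$$\|H\|_1 = \frac{25}{12}, \qquad \|H^{-1}\|_1 = 13620, \qquad \kappa_1(H) = 28375,$$

$$\|H\|_\infty = \frac{25}{12}, \qquad \|H^{-1}\|_\infty = 13620, \qquad \kappa_\infty(H) = 28375,$$

$$\|H\|_2 \approx 1.5, \qquad \|H^{-1}\|_2 \approx 1.03 \cdot 10^4, \qquad \kappa_2(H) \approx 1.55 \cdot 10^4.$$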
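The 1-norm values for $n = 4$ can be verified in exact rational arithmetic. The Python sketch below uses the standard library's `fractions.Fraction`; the inverse is typed in from the matrix given for $n = 4$ rather than computed, and the helper names are hypothetical.

```python
from fractions import Fraction

n = 4
# Hilbert matrix with exact rational entries h_ij = 1 / (i + j - 1)
H = [[Fraction(1, i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]

# Inverse as stated for n = 4
H_inv = [
    [16, -120, 240, -140],
    [-120, 1200, -2700, 1680],
    [240, -2700, 6480, -4200],
    [-140, 1680, -4200, 2800],
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))

identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
is_inverse = matmul(H, H_inv) == identity

norm_H = norm_1(H)
norm_H_inv = norm_1(H_inv)
kappa_1 = norm_H * norm_H_inv
print(is_inverse, norm_H, norm_H_inv, kappa_1)
```

Because $H$ and $H^{-1}$ are symmetric, the maximum column sum and the maximum row sum coincide, which is why the 1-norm and infinity-norm values agree.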





Error Analysis. Example. Hilbert Matrix.

Example

We consider the linear systems

$$Hx = b.$$

For given $n$ we set $x_{ex} = (1, \dots, 1)^T \in \mathbb{R}^n$, and compute $b = H x_{ex}$.

Then we compute the solution of the linear system $Hx = b$ using the LU-decomposition and compute the relative error between the exact solution $x_{ex}$ and the computed solution $x$.

 n    $\kappa_\infty(H)$    $\|x_{ex} - x\|_\infty / \|x_{ex}\|_\infty$

 4    2.837500e+004    2.958744e-013

 5    9.436560e+005    5.129452e-012

 6    2.907028e+007    5.096734e-011

 7    9.851949e+008    2.214796e-008

 8    3.387279e+010    1.973904e-007

 9    1.099651e+012    4.215144e-005

10    3.535372e+013    5.382182e-004
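The experiment can be approximated in Python. The sketch below swaps in plain Gaussian elimination with partial pivoting (mathematically the same elimination as an LU solve, but not MATLAB's routine), so the digits it prints will generally differ from the table; only the growth of the error with $n$ is expected to be similar. All function names are hypothetical.

```python
def hilbert(n):
    """Hilbert matrix in double precision, h_ij = 1 / (i + j - 1) with 1-based i, j."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        # Bring the largest remaining entry of column k to the pivot position
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def relative_error(n):
    """Relative infinity-norm error for the exact solution (1, ..., 1)."""
    H = hilbert(n)
    b = [sum(row) for row in H]  # b = H * x_ex with x_ex = (1, ..., 1)
    x = solve(H, b)
    return max(abs(1.0 - xi) for xi in x)  # ||x_ex||_inf = 1

errors = {n: relative_error(n) for n in range(4, 11)}
for n, err in errors.items():
    print(n, err)
```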





Error Analysis.

If we use finite precision arithmetic, then rounding causes errors in the input data. Using $m$-digit floating point arithmetic it holds that

$$\frac{|x - fl(x)|}{|x|} \le 0.5 \cdot 10^{-m+1}.$$

Thus, if we solve the linear system in $m$-digit floating point arithmetic, then, as a rule of thumb, we may approximate the input errors due to rounding by

$$\frac{\|\Delta A\|}{\|A\|} \approx 0.5 \cdot 10^{-m+1}, \qquad \frac{\|\Delta b\|}{\|b\|} \approx 0.5 \cdot 10^{-m+1}.$$

If the condition number of $A$ is $\kappa(A) = 10^{\alpha}$, then

$$\frac{\|\Delta x\|}{\|x\|} \le \frac{10^{\alpha}}{1 - 10^{\alpha - m + 1}} \left( 0.5 \cdot 10^{-m+1} + 0.5 \cdot 10^{-m+1} \right) \approx 10^{\alpha - m + 1},$$

provided $10^{\alpha - m + 1} < 1$.

Rule of thumb: If the linear system is solved in $m$-digit floating point arithmetic and if the condition number of $A$ is of the order $10^{\alpha}$, then only $m - \alpha - 1$ digits in the solution can be trusted.
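The rule of thumb is easy to evaluate for the Hilbert matrices above. The Python sketch below assumes $m = 16$ decimal digits as a rough stand-in for IEEE double precision; that value and the function name `trusted_digits` are assumptions, not taken from the slides.

```python
import math

def trusted_digits(m, kappa):
    """Rule-of-thumb estimate m - alpha - 1, where kappa is of the order 10**alpha."""
    alpha = math.log10(kappa)
    return m - alpha - 1

m = 16  # assumed decimal digits of double precision
# Condition numbers for n = 4 and n = 10 from the Hilbert matrix table
for n, kappa in [(4, 2.837500e4), (10, 3.535372e13)]:
    print(n, round(trusted_digits(m, kappa), 2))
```

For $n = 10$ the estimate leaves between one and two digits, which is the same order as the relative error of about 5e-4 reported in the Hilbert matrix experiment; the rule is meant as a conservative guide, not a sharp prediction.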





Summary.

If the condition number of a matrix $A$ is large, then small errors in the data may lead to large errors in the solution.

Rule of thumb: If the linear system is solved in $m$-digit floating point arithmetic and if the condition number of $A$ is of the order $10^{\alpha}$, then only $m - \alpha - 1$ digits in the solution can be trusted.

