
ECE 645 – Matrix Equations and Least Squares Estimation

J. V. Krogmeier

March 24, 2014

Contents

1 Matrix Equations
2 Exact Solution of Linear Equations
3 Inner Product and Norm
4 The Projection Theorem and Related
5 Eigenvalues and Eigenvectors
6 Projection Matrices
7 Construction of Orthogonal Projectors
8 An Adjoint Theorem
9 Left and Right Inverses
10 Some Theorems on the Solution of Ax = b
   10.1 Over-determined Full Rank Case
   10.2 Under-determined Full Rank Case
   10.3 The General Case
11 The Singular Value Decomposition
12 Deterministic Least Squares Problem
13 Statistical Least Squares and the Additive Linear Model
14 Identifiability in the Additive Linear Model
15 The Gauss-Markov Theorem
   15.1 Reprise of Least Squares
   15.2 Usually Least Squares is Not the Best We Can Do

1 Matrix Equations

    1. Consider a matrix equation of the form

\[
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\ b_2 \\ \vdots \\ b_m
\end{bmatrix}.
\]

    The usual shorthand for the linear system of equations above is

    Ax = b (1)

where A is an m × n matrix, x is an n × 1 vector, and b is an m × 1 vector.

Goal: To study the exact and approximate solutions to linear systems as in Eq. (1). In the process we will make extensive use of the ideas of orthogonality, projection, and eigenvalue decomposition. Introductory treatments of these topics consume entire volumes, so no attempt will be made at completeness. See any of a number of good texts on linear algebra [1, 2, 3, 4].
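To make the shorthand of Eq. (1) concrete, here is a minimal NumPy sketch (the matrix and vectors are made-up numbers, used only for illustration) that builds a small over-determined system and verifies that a known x satisfies Ax = b:

```python
import numpy as np

# A hypothetical 3x2 system Ax = b: m = 3 equations, n = 2 unknowns.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([1.0, -1.0])
b = A @ x                          # b is constructed to lie in the range of A

print(A.shape, x.shape, b.shape)   # (3, 2) (2,) (3,)
print(np.allclose(A @ x, b))       # True: x solves the system by construction
```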

    2. Assume familiarity with the following concepts (see the first four chapters of [1]):

    (a) The usual rules for matrix addition and multiplication.


(b) The notions of a basis for a vector space, change of basis, and the matrix representation of a linear transformation with respect to a particular basis. Similarity transformations and diagonalization.

(c) The transpose A^T and Hermitian transpose A^H of a matrix A.

    (d) Determinant of a matrix |A|, the rank of a matrix, row rank, column rank, the inverse of a square matrix.

(e) The standard orthonormal basis of the vector space R^n or C^n is the collection of n vectors {e_j : 1 ≤ j ≤ n}, where e_j denotes the n-vector whose only nonzero entry is a one in the j-th position, i.e.,

\[
e_j = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \leftarrow j\text{-th position}.
\]


2 Exact Solution of Linear Equations

1. Let A be an m × n matrix with elements from C. The null space of A is the following subspace of C^n:

η(A) = {x ∈ C^n : Ax = 0}.

The range space or column space of A is the following subspace of C^m:

R(A) = {y = Ax : x ∈ C^n},

i.e., that subspace spanned by the column vectors of A. The (column) rank of the matrix A is defined to be the dimension of its range space.

2. It is an important and interesting fact that the rank of A is also equal to the dimension of the column space of A^T (or, equivalently, the row space of A). Another notion of the rank of a rectangular matrix is the order of its largest nonzero minor [1]; all of these notions (row rank, column rank, and determinantal rank) turn out to be equal.

    3. Some simple theorems:

Thm 1. Let A be an m × n matrix. Then

rank(A) + dim{η(A)} = n.
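A quick numerical check of these facts, using an arbitrary made-up matrix (an illustrative sketch, not part of the development): the nullity is obtained by counting the negligible singular values of A.

```python
import numpy as np

# Arbitrary 3x4 example with linearly dependent rows and columns.
A = np.array([[1.0, 2.0, 3.0, 5.0],
              [2.0, 4.0, 6.0, 10.0],
              [1.0, 0.0, 1.0, 1.0]])
m, n = A.shape

r = np.linalg.matrix_rank(A)        # rank of A (dimension of its column space)
r_T = np.linalg.matrix_rank(A.T)    # rank of A^T (row space of A)

# dim{eta(A)} = number of negligible singular values = n - r.
s = np.linalg.svd(A, compute_uv=False)
nullity = n - int(np.sum(s > 1e-10 * s[0]))

print(r == r_T)           # True: row rank equals column rank
print(r + nullity == n)   # True: Thm 1
```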


Thm 2. Let A be an m × n matrix and b be an m-vector, both considered as constant and known. Then, considering the n-vector solutions x to the matrix equation Ax = b, one can say that

(a) there exists a solution x if and only if b ∈ R(A).

(b) if x_o is a particular solution of the matrix equation, then the complete set of solutions to Eq. (1) is given by

{x_o + x_η : x_η ∈ η(A)}.

(c) a solution is unique only if η(A) = {0}.

(d) the matrix equation has a unique solution for any b if and only if n = m and rank(A) = n. In this case, the unique solution is given by

x = A^{-1} b

where A^{-1} is the (unique) inverse matrix for A. It solves the matrix equation A A^{-1} = A^{-1} A = I.
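The following small NumPy sketch (made-up numbers, for illustration only) checks parts (a) and (b) on an under-determined system: consistency is tested by comparing rank(A) with the rank of the augmented matrix [A b], and shifting a particular solution by a null-space vector yields another solution.

```python
import numpy as np

# Made-up under-determined example: 2 equations, 3 unknowns.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# (a) b lies in R(A) exactly when appending b does not increase the rank.
consistent = np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)
print(consistent)                                # True

# (b) a particular solution plus any null-space vector is again a solution.
x_o = np.linalg.lstsq(A, b, rcond=None)[0]       # one particular solution
x_eta = np.linalg.svd(A)[2][-1]                  # spans eta(A) here (rank 2, n = 3)
print(np.allclose(A @ x_eta, 0.0))               # True
print(np.allclose(A @ (x_o + 5.0 * x_eta), b))   # True
```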

Thm 3. Let A and B be any two matrices such that AB is defined. Then

    rank(AB) ≤ min{rank(A), rank(B)}.
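As a sanity check (an illustrative sketch with made-up random matrices, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2))   # rank 2 with probability one
B = rng.standard_normal((2, 5))   # rank 2 with probability one

print(np.linalg.matrix_rank(A @ B)
      <= min(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B)))   # True
```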


Thm 4. Let A be an m × n matrix with nonzero rank equal to r. Then there exist nonsingular matrices U (m × m) and V (n × n) such that

\[
A = U \begin{bmatrix} I_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix} V.
\]
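One way to exhibit such U and V numerically is a sketch built on the SVD (developed later in these notes); this is not necessarily the construction intended by the theorem's proof, and the matrix is made up.

```python
import numpy as np

# Arbitrary 4x3 example of rank 2 (third column = first + second).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
m, n = A.shape
r = np.linalg.matrix_rank(A)

# SVD: A = Us diag(s) Vs^H.  Absorb the nonzero singular values into U so
# that the middle factor becomes [[I_r, 0], [0, 0]].
Us, s, Vsh = np.linalg.svd(A)
d = np.ones(m)
d[:r] = s[:r]
U = Us @ np.diag(d)            # m x m, nonsingular
V = Vsh                        # n x n, nonsingular (unitary)

middle = np.zeros((m, n))
middle[:r, :r] = np.eye(r)
print(np.allclose(U @ middle @ V, A))   # True
```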

    3 Inner Product and Norm

1. In the vector space C^n the standard inner product of two vectors x and y is defined as

\[
\langle x, y \rangle = y^H x = \sum_{k=1}^{n} x_k y_k^*.
\]

    This inner product defines a norm

‖x‖ = √〈x, x〉.

2. Two vectors x and y are said to be orthogonal with respect to the standard inner product if 〈x, y〉 = 0. This may be used to define orthogonal subsets and subspaces [1, page 111]. If V is a subspace of C^n we define

V⊥ = {w ∈ C^n : 〈w, v〉 = 0 for all v ∈ V}

and call it the orthogonal complement of V.
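A brief NumPy illustration of the standard inner product and induced norm on C^3 (made-up vectors, with the complex conjugation handled explicitly):

```python
import numpy as np

# Two made-up vectors in C^3.
x = np.array([1.0 + 1.0j, 2.0, 0.0])
y = np.array([1.0j, 0.0, 3.0])

inner = np.conj(y) @ x                        # <x, y> = y^H x = sum_k x_k y_k^*
norm_x = np.sqrt((np.conj(x) @ x).real)       # ||x|| = sqrt(<x, x>)

print(inner)
print(np.isclose(norm_x, np.linalg.norm(x)))  # True: matches NumPy's 2-norm
print(np.isclose(np.conj(y) @ x, 0.0))        # False here: x and y are not orthogonal
```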


3. There are many variations on this theme; see [1, pages 104-113]. In fact, it can be shown that the most general inner product on C^n can be written as

〈x, y〉_Q = y^H Q x

where Q is an n × n positive definite matrix, i.e., x^H Q x > 0 for all nonzero vectors x.

Example. Taking Q = diag(w_1, ..., w_n) with each w_k > 0 gives the weighted inner product 〈x, y〉_Q = Σ_k w_k x_k y_k^*, which reduces to the standard inner product when every w_k = 1.
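A numerical sketch of the weighted case just described (made-up weights and vectors):

```python
import numpy as np

# Diagonal positive definite weight matrix Q, as in the example above.
Q = np.diag([1.0, 4.0, 0.5])
x = np.array([1.0, -1.0, 2.0])
y = np.array([2.0, 0.0, 1.0])

inner_Q = np.conj(y) @ Q @ x        # <x, y>_Q = y^H Q x
print(inner_Q)                      # 2*1*1 + 0*4*(-1) + 1*0.5*2 = 3.0

# Positive definiteness: x^H Q x > 0 for this (and any other) nonzero x.
print((np.conj(x) @ Q @ x).real > 0)   # True
```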

    4 The Projection Theorem and Related

    1. The following result is essential to all that follows. It is commonly known as the

    Projection Theorem. We only state it in the generality required here. For more

    information see [5].

Thm 5. [Projection Theorem] Let H be a finite dimensional vector space over R or C with an inner product 〈·, ·〉 and induced norm ‖ · ‖ (i.e., H is either R^n or C^n and the inner product is of the form 〈x, y〉 = y^H Q x). Let M be a subspace of H.


Then for any x ∈ H there exists a unique vector y∗ ∈ M such that

‖x − y∗‖ ≤ ‖x − y‖

for all y ∈ M. Moreover, the unique vector y∗ is characterized by the orthogonality condition

〈x − y∗, y〉 = 0 for all y ∈ M.

    Proof.
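Since the proof is omitted here, a numerical illustration may help (a sketch with a made-up subspace M spanned by the columns of a matrix B; the minimizer is computed with an ordinary least squares solve):

```python
import numpy as np

# M = column space of B, a made-up 2-dimensional subspace of R^4.
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
x = np.array([1.0, 2.0, 3.0, 4.0])

# Closest point y* in M: minimize ||x - B c|| over c.
c = np.linalg.lstsq(B, x, rcond=None)[0]
y_star = B @ c

# Orthogonality condition: x - y* is orthogonal to every column of B,
# hence to every y in M.
print(np.allclose(B.T @ (x - y_star), 0.0))   # True

# y* is at least as close to x as a couple of other points of M.
for y in (B @ np.array([1.0, 1.0]), B @ np.array([-2.0, 3.0])):
    print(np.linalg.norm(x - y_star) <= np.linalg.norm(x - y))   # True
```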

Thm 6. [Dual Projection Theorem] Let H be a finite dimensional vector space over R or C with an inner product 〈·, ·〉 and induced norm ‖ · ‖. Let M be a subspace, x ∈ H, and define the linear variety

V = x + M = {x + y : y ∈ M}.

Then there is a unique vector v∗ ∈ V of smallest norm, i.e., such that

‖v∗‖ ≤ ‖v‖

for all v ∈ V. Moreover, v∗ ∈ M⊥.

    Proof.
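Again, a numerical sketch in place of the omitted proof (made-up M and x; the minimum-norm element of V = x + M is the component of x in M⊥):

```python
import numpy as np

# M = column space of B; V = x + M is a linear variety in R^4.
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 1.0]])
x = np.array([2.0, -1.0, 0.0, 3.0])

# Minimum-norm element of V: v* = x - (projection of x onto M).
c = np.linalg.lstsq(B, x, rcond=None)[0]
v_star = x - B @ c

print(np.allclose(B.T @ v_star, 0.0))   # True: v* lies in M-perp

# v* has norm no larger than other elements x + B y of V.
for y in (np.array([1.0, 1.0]), np.array([-3.0, 2.0])):
    print(np.linalg.norm(v_star) <= np.linalg.norm(x + B @ y))   # True
```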


Thm 7. [Corollary to Dual Projection Theorem] Let H be finite dimensional over R or C and suppose that {y_1, y_2, ..., y_m} is a linearly independent set of vectors from H. Then among all vectors x ∈ H satisfying the constraints

\[
\langle x, y_1 \rangle = c_1, \quad \langle x, y_2 \rangle = c_2, \quad \ldots, \quad \langle x, y_m \rangle = c_m
\]

there is a unique x_0 having smallest norm. It is given by

\[
x_0 = \sum_{k=1}^{m} \beta_k y_k
\]

where the β_k solve Gβ = c in terms of the so-called Gram matrix

\[
G = G(y_1, y_2, \ldots, y_m) =
\begin{bmatrix}
\langle y_1, y_1 \rangle & \langle y_2, y_1 \rangle & \cdots & \langle y_m, y_1 \rangle \\
\langle y_1, y_2 \rangle & \langle y_2, y_2 \rangle & \cdots & \langle y_m, y_2 \rangle \\
\vdots & \vdots & & \vdots \\
\langle y_1, y_m \rangle & \langle y_2, y_m \rangle & \cdots & \langle y_m, y_m \rangle
\end{bmatrix}
\]

of the y_k.

    Proof.
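A worked numerical instance of Thm 7 (made-up constraint vectors y_k and constants c_k): build the Gram matrix, solve Gβ = c, and confirm that x_0 meets the constraints with smaller norm than another feasible vector.

```python
import numpy as np

# Constraint vectors y_1, y_2 (columns of Y) and right-hand sides c, made up.
Y = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
c = np.array([1.0, 4.0])

# For real vectors the Gram matrix G[j, k] = <y_k, y_j> is simply Y^T Y.
G = Y.T @ Y
beta = np.linalg.solve(G, c)
x0 = Y @ beta                        # x0 = sum_k beta_k y_k

# x0 satisfies the constraints <x0, y_k> = c_k ...
print(np.allclose(Y.T @ x0, c))      # True

# ... and has smaller norm than the feasible vector x0 + w, where w is
# orthogonal to both y_1 and y_2 (so the constraints are unchanged).
w = np.array([2.0, -2.0, 1.0])
print(np.allclose(Y.T @ w, 0.0))                      # True
print(np.linalg.norm(x0) < np.linalg.norm(x0 + w))    # True
```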


5 Eigenvalues and Eigenvectors

1. Let A be a square matrix, say n × n. A nonzero vector v is said to be an eigenvector of A corresponding to the eigenvalue λ ∈ C if

    Av = λv.

2. The eigenvalues are the solutions to the polynomial equation (the characteristic polynomial)

|λI − A| = 0.

3. The following properties are often useful. They apply to a square matrix A that is Hermitian symmetric, i.e., A^H = A. (A numerical check follows the list below.)

    (a) A has real eigenvalues.

(b) There exists an orthonormal basis for C^n consisting of eigenvectors of A; in particular, A is diagonalizable.

(c) In any case, the eigenvectors of A corresponding to distinct eigenvalues are orthogonal.

(d) If, in addition, A is positive definite then its eigenvalues are positive. If A is nonnegative definite then its eigenvalues are nonnegative.
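The check promised above, for a made-up Hermitian matrix (properties (a) and (b) verified via NumPy's eigh, which is specialized to Hermitian inputs):

```python
import numpy as np

# A made-up 3x3 Hermitian matrix (A^H = A).
A = np.array([[2.0, 1.0 - 1.0j, 0.0],
              [1.0 + 1.0j, 3.0, 1.0j],
              [0.0, -1.0j, 1.0]])
print(np.allclose(A, A.conj().T))     # True: Hermitian

lam, V = np.linalg.eigh(A)            # eigh returns real eigenvalues for Hermitian A
print(np.all(np.isreal(lam)))         # True: property (a)

# Columns of V form an orthonormal eigenvector basis: V^H V = I and A V = V diag(lam).
print(np.allclose(V.conj().T @ V, np.eye(3)))    # True: property (b)
print(np.allclose(A @ V, V @ np.diag(lam)))      # True: A is diagonalizable
```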