Transcript of “2. SVD-Based Tensor Decompositions” (pi.math.cornell.edu/~scan/Tensor2.pdf)

Page 1:

Four Talks on Tensor Computations

2. SVD-Based Tensor Decompositions

Charles F. Van Loan

Cornell University

SCAN Seminar

November 3, 2014


Page 2:

What is this Lecture About?

Pushing the SVD Envelope

The SVD is a powerful tool for exposing the structure of a matrix and for capturing its essence through optimal, data-sparse representations:

A = ∑_{k=1}^{rank(A)} σk uk vk^T ≈ ∑_{k=1}^{r} σk uk vk^T

For a tensor A, let’s try for something similar...

A = ∑ whatever!


Page 3:

What We will Need...

A Mechanism for Updating

U^T A V = Σ

We will need operations that can transform the given tensor into something simple.

The Notion of an Abbreviated Decomposition

vec(A) = (V ⊗ U) vec(Σ)

       = a structured sum of the V(:, i) ⊗ U(:, j)

       ≈ a SHORTER structured sum of the V(:, i) ⊗ U(:, j)

We’ll need a way to “pull out” the essential part of a decomposition.


Page 4:

What is this Lecture About?

Outline

The Mode-k Product and the Tucker Product

The Tucker Representation and Its Properties

The Higher-Order SVD of a Tensor

An ALS Framework for Reduced-Rank Tucker Approximation

Approximation via the Kronecker Product SVD


Page 5:

The Mode-k Matrix Product

Main Idea

Given A ∈ IR^{n1×n2×n3}, a mode k, and a matrix M, we apply M to every mode-k fiber.

Recall that

A(2) = [ a111  a211  a311  a411  a112  a212  a312  a412
         a121  a221  a321  a421  a122  a222  a322  a422
         a131  a231  a331  a431  a132  a232  a332  a432 ]

is the mode-2 unfolding of A ∈ IR^{4×3×2} and its columns are its mode-2 fibers.


Page 6:

The Mode-k Matrix Product: An Example

A Mode-2 Example When A ∈ IR^{4×3×2}

[ b111  b211  b311  b411  b112  b212  b312  b412
  b121  b221  b321  b421  b122  b222  b322  b422
  b131  b231  b331  b431  b132  b232  b332  b432
  b141  b241  b341  b441  b142  b242  b342  b442
  b151  b251  b351  b451  b152  b252  b352  b452 ]

  =

[ m11  m12  m13
  m21  m22  m23
  m31  m32  m33    ·   [ a111  a211  a311  a411  a112  a212  a312  a412
  m41  m42  m43          a121  a221  a321  a421  a122  a222  a322  a422
  m51  m52  m53 ]        a131  a231  a331  a431  a132  a232  a332  a432 ]

Note that (1) B ∈ IR^{4×5×2} and (2) B(2) = M · A(2).
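A quick numerical check of (2), as a sketch in the Tensor Toolbox (assumes the toolbox is on the path; the sizes are illustrative):

% Verify that B = A x2 M satisfies B_(2) = M * A_(2).
A = tenrand([4 3 2]);        % random 4x3x2 tensor
M = randn(5,3);              % M maps length-3 mode-2 fibers to length 5
B = ttm(A,M,2);              % B is 4x5x2
E = double(tenmat(B,2)) - M*double(tenmat(A,2));
norm(E,'fro')                % should be ~ 1e-15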


Page 7:

The Mode-k Matrix Product: Another Example

A mode-1 example when the tensor A is second order...

[ b11  b12  b13      [ m11  m12      [ a11  a12  a13
  b21  b22  b23 ]  =   m21  m22 ]  ·   a21  a22  a23 ]

(The fibers of A are its columns.)

A mode-2 example when the tensor A is second order...

[ b11  b12        [ m11  m12  m13
  b21  b22    =     m21  m22  m23      [ a11  a21
  b31  b32          m31  m32  m33  ·     a12  a22
  b41  b42 ]        m41  m42  m43 ]      a13  a23 ]

(The fibers of A are its rows.)


Page 8:

The Mode-k Product: Definitions

Mode-1

If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m1×n1}, then the mode-1 product

B = A ×1 M ∈ IR^{m1×n2×n3}

is defined by

B(i1, i2, i3) = ∑_{k=1}^{n1} M(i1, k) · A(k, i2, i3)

Two Equivalent Formulations...

B(1) = M · A(1)

vec(B) = (I_{n3} ⊗ I_{n2} ⊗ M) · vec(A)
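As a sanity check on the vec formulation, here is a small sketch in plain MATLAB (column-major vec; the sizes are illustrative):

% Verify vec(B) = (I_{n3} kron I_{n2} kron M) * vec(A) for the mode-1 product.
n = [4 3 2]; m1 = 5;
A = randn(n); M = randn(m1,n(1));
A1 = reshape(A,n(1),n(2)*n(3));            % mode-1 unfolding
B  = reshape(M*A1,[m1 n(2) n(3)]);         % B = A x1 M
d  = kron(eye(n(3)),kron(eye(n(2)),M))*A(:) - B(:);
norm(d)                                    % should be ~ 1e-15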


Page 9:

The Mode-k Product: Definitions

Mode-2

If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m2×n2}, then the mode-2 product

B = A ×2 M ∈ IR^{n1×m2×n3}

is defined by

B(i1, i2, i3) = ∑_{k=1}^{n2} M(i2, k) · A(i1, k, i3)

Two Equivalent Formulations...

B(2) = M · A(2)

vec(B) = (I_{n3} ⊗ M ⊗ I_{n1}) · vec(A)


Page 10:

The Mode-k Product: Definitions

Mode-3

If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m3×n3}, then the mode-3 product

B = A ×3 M ∈ IR^{n1×n2×m3}

is defined by

B(i1, i2, i3) = ∑_{k=1}^{n3} M(i3, k) · A(i1, i2, k)

Two Equivalent Formulations...

B(3) = M · A(3)

vec(B) = (M ⊗ I_{n2} ⊗ I_{n1}) · vec(A)


Page 11:

The Mode-k Product: Properties

Successive Products in the Same Mode

If A ∈ IR^{n1×n2×n3} and M1, M2 ∈ IR^{nk×nk}, then

(A ×k M1) ×k M2 = A ×k (M2 M1).

Successive Products in Different Modes

If A ∈ IR^{n1×n2×n3}, Mk ∈ IR^{mk×nk}, Mj ∈ IR^{mj×nj}, and k ≠ j, then

(A ×k Mk) ×j Mj = (A ×j Mj) ×k Mk

The order is not important.
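A quick check that products in different modes commute, as a sketch (assumes the Tensor Toolbox; the sizes are illustrative):

% (A x1 M1) x2 M2 should equal (A x2 M2) x1 M1.
A  = tenrand([4 3 2]);
M1 = randn(6,4); M2 = randn(5,3);
X = ttm(ttm(A,M1,1),M2,2);
Y = ttm(ttm(A,M2,2),M1,1);
E = double(X) - double(Y);
norm(E(:))     % should be ~ 1e-15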


Page 12:

Matlab Tensor Toolbox: Mode-k Matrix Product Using ttm

n = [2 5 4 7];
A = tenrand(n);
k = 2;              % apply M in mode 2, so M needs n(k) = 5 columns
M = randn(5,5);
B = ttm(A,M,k);     % B = A xk M


Page 13:

The Tucker Product

Definition

The Tucker Product X of the tensor

S ∈ IR^{r1×r2×r3}

with the matrices

U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}

is given by

X = S ×1 U1 ×2 U2 ×3 U3 = ((S ×1 U1) ×2 U2) ×3 U3

It is a succession of mode-k products.


Page 14:

The Tucker Product

Notation

[[ S ; U1, U2, U3 ]] = S ×1 U1 ×2 U2 ×3 U3


Page 15:

The Tucker Product

As a Scalar Summation...

If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

X(i1, i2, i3) = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)


Page 16:

The Tucker Product

As a Scalar Summation and as a Sum of Rank-1 Tensors...

If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

X(i1, i2, i3) = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

X = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(:, j1) ∘ U2(:, j2) ∘ U3(:, j3)


Page 17:

The Tucker Product

As a Giant Matrix-Vector Product...

If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

vec(X) = (U3 ⊗ U2 ⊗ U1) · vec(S)


Page 18:

Matlab Tensor Toolbox: Tucker Products

function B = TuckerProd(A,M)
% A is an n(1)-by-...-by-n(d) tensor.
% M is a length-d cell array with
% M{k} an m(k)-by-n(k) matrix.
% B is an m(1)-by-...-by-m(d) tensor given by
%    B = A x1 M{1} x2 M{2} ... xd M{d}
% where "xk" denotes the mode-k matrix product.
B = A;
for k=1:length(A.size)
    B = ttm(B,M{k},k);
end
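A minimal usage sketch (assumes the Tensor Toolbox; the sizes are illustrative):

n = [4 3 2]; m = [5 6 7];
A = tenrand(n);
M = {randn(m(1),n(1)), randn(m(2),n(2)), randn(m(3),n(3))};
B = TuckerProd(A,M);   % B is a 5-by-6-by-7 tensor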


Page 19:

Matlab Tensor Toolbox: Tucker Tensor Set-Up

n = [5 8 3]; m = [4 6 2];

F = randn(n(1),m(1)); G = randn(n(2),m(2));

H = randn(n(3),m(3));

S = tenrand(m);

X = ttensor(S,{F,G,H});

A ttensor is a structure with two fields that is used to represent a tensor in Tucker form. In the above, X.core houses the core tensor S while X.U is a cell array of the matrices F, G, and H that define the tensor X.


Page 20:

The Tucker Product Representation

The Challenge

Given A ∈ IR^{n1×n2×n3}, compute

S ∈ IR^{r1×r2×r3}

and

U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}

such that

A = S ×1 U1 ×2 U2 ×3 U3

is an “illuminating” Tucker product representation of A.


Page 21:

The Tucker Product Representation

A Simple but Important Result

If A ∈ IR^{n1×n2×n3} and U1 ∈ IR^{n1×n1}, U2 ∈ IR^{n2×n2}, and U3 ∈ IR^{n3×n3} are nonsingular, then

A = S ×1 U1 ×2 U2 ×3 U3

where

S = A ×1 U1^{-1} ×2 U2^{-1} ×3 U3^{-1}.

We will refer to the Uk as the inverse factors and S as the core tensor.

Matrix case (d = 2): A = U1 (U1^{-1} A U2^{-T}) U2^T = U1 S U2^T


Page 22:

Proof.

A = A ×1 (U1 U1^{-1}) ×2 (U2 U2^{-1}) ×3 (U3 U3^{-1})

  = ( A ×1 U1^{-1} ×2 U2^{-1} ×3 U3^{-1} ) ×1 U1 ×2 U2 ×3 U3

  = S ×1 U1 ×2 U2 ×3 U3


Page 23:

The Tucker Product Representation

If the U’s are Orthogonal

If A ∈ IR^{n1×n2×n3} and U1 ∈ IR^{n1×n1}, U2 ∈ IR^{n2×n2}, and U3 ∈ IR^{n3×n3} are orthogonal, then

A = S ×1 U1 ×2 U2 ×3 U3

with

S = A ×1 U1^T ×2 U2^T ×3 U3^T.


Page 24:

The Tucker Product Representation

If the U’s are from the Modal Unfolding SVDs...

Suppose A ∈ IR^{n1×n2×n3} is given. If

A(1) = U1 Σ1 V1^T
A(2) = U2 Σ2 V2^T
A(3) = U3 Σ3 V3^T

are SVDs and

S = A ×1 U1^T ×2 U2^T ×3 U3^T,

then

A = S ×1 U1 ×2 U2 ×3 U3

is the higher-order SVD of A.


Page 25:

Matlab Tensor Toolbox: Computing the HOSVD

function [S,U] = HOSVD(A)
% A is an n(1)-by-...-by-n(d) tensor.
% U is a length-d cell array with the
% property that U{k} is the left singular
% vector matrix of A's mode-k unfolding.
% S is an n(1)-by-...-by-n(d) tensor given by
%    S = A x1 U{1}' x2 U{2}' ... xd U{d}'
S = A;
for k=1:length(A.size)
    C = tenmat(A,k);
    [U{k},Sigma,V] = svd(C.data);
    S = ttm(S,U{k}',k);
end
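A reconstruction check, as a sketch (assumes the Tensor Toolbox and the HOSVD function above):

% A should be recovered by applying the factors to the core.
A = tenrand([4 3 2]);
[S,U] = HOSVD(A);
B = S;
for k=1:3
    B = ttm(B,U{k},k);
end
E = double(B) - double(A);
norm(E(:))     % should be ~ 1e-15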


Page 26:

The Higher-Order SVD (HOSVD)

The HOSVD of a Matrix

If d = 2 then A is a matrix and the HOSVD is the SVD. Indeed, if

A   = A(1) = U1 Σ1 V1^T
A^T = A(2) = U2 Σ2 V2^T

then we can set U = U1 = V2 and V = U2 = V1. Note that

S = (A ×1 U1^T) ×2 U2^T = U1^T A U2 = U1^T A V1 = Σ1.


Page 27:

The HOSVD

Core Tensor Properties

If

A(1) = U1 Σ1 V1^T,   A(2) = U2 Σ2 V2^T,   A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(1) = U1 S(1) (U3 ⊗ U2)^T   and   S(1) = Σ1 V1^T (U3 ⊗ U2)

It follows that the rows of S(1) are mutually orthogonal and that the singular values of A(1) are the 2-norms of these rows.
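A numerical illustration of the row-norm property, as a sketch (assumes the Tensor Toolbox and the HOSVD function above):

% The 2-norms of the rows of S_(1) are the singular values of A_(1).
A = tenrand([4 3 2]);
[S,U] = HOSVD(A);
S1 = double(tenmat(S,1));
rowNorms = sqrt(sum(S1.^2,2));
d = rowNorms - svd(double(tenmat(A,1)));
norm(d)     % should be ~ 1e-15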


Page 28:

The HOSVD

Core Tensor Properties

If

A(1) = U1 Σ1 V1^T,   A(2) = U2 Σ2 V2^T,   A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(2) = U2 S(2) (U3 ⊗ U1)^T   and   S(2) = Σ2 V2^T (U3 ⊗ U1)

It follows that the rows of S(2) are mutually orthogonal and that the singular values of A(2) are the 2-norms of these rows.


Page 29:

The HOSVD

Core Tensor Properties

If

A(1) = U1 Σ1 V1^T,   A(2) = U2 Σ2 V2^T,   A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(3) = U3 S(3) (U2 ⊗ U1)^T   and   S(3) = Σ3 V3^T (U2 ⊗ U1)

It follows that the rows of S(3) are mutually orthogonal and that the singular values of A(3) are the 2-norms of these rows.


Page 30:

The Core Tensor S is Graded

S(1) = Σ1 V1^T (U3 ⊗ U2)   ⇒   ‖ S(j, :, :) ‖F = σj(A(1)),  j = 1:n1

S(2) = Σ2 V2^T (U3 ⊗ U1)   ⇒   ‖ S(:, j, :) ‖F = σj(A(2)),  j = 1:n2

S(3) = Σ3 V3^T (U2 ⊗ U1)   ⇒   ‖ S(:, :, j) ‖F = σj(A(3)),  j = 1:n3

Entries get smaller as you move away from S(1, 1, 1).


Page 31:

The Higher-Order SVD (HOSVD)

The HOSVD as a Multilinear Sum

If A = S ×1 U1 ×2 U2 ×3 U3 is the HOSVD of A ∈ IR^{n1×n2×n3}, then

A(i1, i2, i3) = ∑_{j1=1}^{n1} ∑_{j2=1}^{n2} ∑_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

The HOSVD as a Matrix-Vector Product

If A = S ×1 U1 ×2 U2 ×3 U3 is the HOSVD of A ∈ IR^{n1×n2×n3}, then

vec(A) = (U3 ⊗ U2 ⊗ U1) · vec(S)

Note that U3 ⊗ U2 ⊗ U1 is orthogonal.


Page 32:

Problem 1. Formulate an HOQRP factorization for a tensor A ∈ IR^{n1×···×nd} that is based on the QR-with-column-pivoting factorizations

A(k) Pk = Qk Rk

for k = 1:d. Does the core tensor have any special properties?

Problem 2. Does this inequality hold?

‖ A − Ar ‖F^2 ≤ ∑_{j=r1+1}^{n1} σj(A(1))^2 + ∑_{j=r2+1}^{n2} σj(A(2))^2 + ∑_{j=r3+1}^{n3} σj(A(3))^2

Can you do better?

Problem 3. Show that

| A(i1, i2, i3) − Xr(i1, i2, i3) | ≤ min{ σ_{r1+1}(A(1)), σ_{r2+1}(A(2)), σ_{r3+1}(A(3)) }


Page 33:

The Truncated HOSVD

Abbreviate the HOSVD Expansion...

A(i1, i2, i3) = ∑_{j1=1}^{n1} ∑_{j2=1}^{n2} ∑_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

Ar(i1, i2, i3) = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

What can we say about the “thrown away” terms?


Page 34:

The Truncated HOSVD

Abbreviate the HOSVD Expansion...

A(i1, i2, i3) = ∑_{j1=1}^{n1} ∑_{j2=1}^{n2} ∑_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

Ar(i1, i2, i3) = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

‖ S(j, :, :) ‖F = σj(A(1)),  j = 1:n1
‖ S(:, j, :) ‖F = σj(A(2)),  j = 1:n2
‖ S(:, :, j) ‖F = σj(A(3)),  j = 1:n3

Entries get smaller as you move away from S(1, 1, 1).


Page 35:

Modal Rank

Definition

We say that

Ar(i1, i2, i3) = ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

has modal rank (r1, r2, r3).


Page 36:

The Tucker Nearness Problem

A Tensor Optimization Problem

Given A ∈ IR^{n1×n2×n3} and integers r1 ≤ n1, r2 ≤ n2, and r3 ≤ n3, minimize

‖ A − B ‖F

over all tensors B ∈ IR^{n1×n2×n3} that have modal rank (r1, r2, r3).


Page 37:

The Tucker Nearness Problem

A Tensor Optimization Problem

Given A ∈ IR^{n1×n2×n3} and integers r1 ≤ n1, r2 ≤ n2, and r3 ≤ n3, minimize

‖ A − B ‖F

over all tensors B ∈ IR^{n1×n2×n3} that have modal rank (r1, r2, r3).

A Matrix Optimization Problem

Given A ∈ IR^{n1×n2} and integer r ≤ min{n1, n2}, minimize

‖ A − B ‖F

over all matrices B ∈ IR^{n1×n2} that have rank r.

The matrix problem has a happy solution via the SVD A = UΣV^T:

Bopt = σ1 U(:, 1) V(:, 1)^T + · · · + σr U(:, r) V(:, r)^T
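In MATLAB, the matrix solution is a few lines, sketched here to fix ideas:

% Truncated-SVD solution of the matrix nearness problem.
A = randn(6,5); r = 2;
[U,Sigma,V] = svd(A);
Bopt = U(:,1:r)*Sigma(1:r,1:r)*V(:,1:r)';
s = svd(A);
norm(A-Bopt,'fro') - norm(s(r+1:end))   % should be ~ 1e-15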


Page 38:

The Tucker Nearness Problem

The Plan...

Develop an Alternating Least Squares framework for minimizing

‖ A − ∑_{j1=1}^{r1} ∑_{j2=1}^{r2} ∑_{j3=1}^{r3} S(j1, j2, j3) · U1(:, j1) ∘ U2(:, j2) ∘ U3(:, j3) ‖F

Equivalent to finding U1, U2, and U3 (all with orthonormal columns) and a core tensor S ∈ IR^{r1×r2×r3} so that

‖ vec(A) − (U3 ⊗ U2 ⊗ U1) vec(S) ‖F

is minimized.


Page 39:

The Tucker Nearness Problem

The “Removal” of S

Since S must minimize

‖ vec(A) − (U3 ⊗ U2 ⊗ U1) · vec(S) ‖

and U3 ⊗ U2 ⊗ U1 has orthonormal columns, we see that

vec(S) = (U3^T ⊗ U2^T ⊗ U1^T) · vec(A)

and so our goal is to choose the Ui so that

‖ ( I − (U3 ⊗ U2 ⊗ U1)(U3^T ⊗ U2^T ⊗ U1^T) ) vec(A) ‖

is minimized.


Page 40:

The Tucker Nearness Problem

Reformulation...

Since U3 ⊗ U2 ⊗ U1 has orthonormal columns, it follows that our goal is to choose orthonormal Ui so that

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖

is maximized.

If Q has orthonormal columns then

‖ (I − QQ^T) a ‖2^2 = ‖ a ‖2^2 − ‖ Q^T a ‖2^2


Page 41:

The Tucker Nearness Problem

Three Reshapings of the Objective Function...

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖

  = ‖ U1^T · A(1) · (U3 ⊗ U2) ‖F

  = ‖ U2^T · A(2) · (U3 ⊗ U1) ‖F

  = ‖ U3^T · A(3) · (U2 ⊗ U1) ‖F

Sets the stage for an alternating least squares solution approach...


Page 42:

Alternating Least Squares Framework

A Sequence of Three Linear Problems...

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖

  = ‖ U1^T · A(1) · (U3 ⊗ U2) ‖F   ⇐ 1. Fix U2 and U3 and maximize with U1.

  = ‖ U2^T · A(2) · (U3 ⊗ U1) ‖F   ⇐ 2. Fix U1 and U3 and maximize with U2.

  = ‖ U3^T · A(3) · (U2 ⊗ U1) ‖F   ⇐ 3. Fix U1 and U2 and maximize with U3.

These max problems are SVD problems...


Page 43:

How do you maximize ‖ Q^T M ‖F where Q ∈ IR^{m×r} has orthonormal columns, M ∈ IR^{m×n}, and r ≤ n?

If

M = UΣV^T

is the SVD of M, then

‖ Q^T M ‖F^2 = ‖ Q^T UΣV^T ‖F^2 = ‖ Q^T UΣ ‖F^2 = ∑_{k=1}^{n} σk^2 ‖ Q^T U(:, k) ‖2^2.

The best you can do is to set Q = U(:, 1:r).
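A small numerical comparison, as a sketch:

% Q = U(:,1:r) maximizes ||Q'*M||_F over Q with r orthonormal columns.
M = randn(6,5); r = 2;
[U,~,~] = svd(M);
best = norm(U(:,1:r)'*M,'fro');
[Q,~] = qr(randn(6,r),0);      % a random competitor
norm(Q'*M,'fro') <= best       % should display logical 1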


Page 44:

Alternating Least Squares Framework

A Sequence of Three Linear Problems...

Repeat:

1. Compute the SVD A(1) · (U3 ⊗ U2) = Ũ1 Σ1 Ṽ1^T and set U1 = Ũ1(:, 1:r1).

2. Compute the SVD A(2) · (U3 ⊗ U1) = Ũ2 Σ2 Ṽ2^T and set U2 = Ũ2(:, 1:r2).

3. Compute the SVD A(3) · (U2 ⊗ U1) = Ũ3 Σ3 Ṽ3^T and set U3 = Ũ3(:, 1:r3).

Initial guess via the HOSVD.
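In code, the sweep might look like the following sketch (assumes the Tensor Toolbox and the HOSVD function above; the ranks and sizes are illustrative):

% Rank-(2,2,2) Tucker approximation by alternating SVDs.
A = tenrand([4 3 2]); r = [2 2 2];
[~,U0] = HOSVD(A);                       % initial guess via the HOSVD
U = cell(1,3);
for k=1:3, U{k} = U0{k}(:,1:r(k)); end
for sweep = 1:10
    for k = 1:3
        modes = setdiff(1:3,k);
        % Contract A with the other two (transposed) factors...
        Y = ttm(A,{U{modes(1)}',U{modes(2)}'},modes);
        % ...then keep the r(k) dominant left singular vectors of Y_(k).
        [W,~,~] = svd(double(tenmat(Y,k)));
        U{k} = W(:,1:r(k));
    end
end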


Page 45:

Matlab Tensor Toolbox: The Function tucker_als

n = [5 6 7];
% Generate a random tensor...
A = tenrand(n);
for r = 1:min(n)
    % Find the closest modal rank-[r r r] ttensor...
    X = tucker_als(A,[r r r]);
    % Display the fit...
    E = double(X) - double(A);
    fit = norm(reshape(E,prod(n),1));
    fprintf('r = %1d, fit = %5.3e\n',r,fit);
end

The function tucker_als returns a ttensor. Default values for the number of iterations and the termination criterion can be modified:

X = tucker_als(A,[r r r],'maxiters',20,'tol',.001)


Page 46:

Approximation via Sums of Low-Rank Tensor Products

Motivation

Unfold A ∈ IR^{n×n×n×n} into an n²-by-n² matrix A.

Express A as a sum of Kronecker products:

A = ∑_{k=1}^{r} σk Bk ⊗ Ck,   Bk, Ck ∈ IR^{n×n}

Back to tensor:

A = ∑_{k=1}^{r} σk Ck ∘ Bk

i.e.,

A(i1, i2, j1, j2) = ∑_{k=1}^{r} σk Ck(i1, i2) Bk(j1, j2)

There is an optimal way of doing this.


Page 47:

The Nearest Kronecker Product Problem

Reshaping the Objective Function (3-by-2 case)

‖ [ a11 a12 a13 a14
    a21 a22 a23 a24        [ b11 b12
    a31 a32 a33 a34    −     b21 b22   ⊗  [ c11 c12
    a41 a42 a43 a44          b31 b32 ]      c21 c22 ] ‖F
    a51 a52 a53 a54
    a61 a62 a63 a64 ]

  =

‖ [ a11 a21 a12 a22        [ b11
    a31 a41 a32 a42          b21
    a51 a61 a52 a62    −     b31   · [ c11 c21 c12 c22 ] ‖F
    a13 a23 a14 a24          b12
    a33 a43 a34 a44          b22
    a53 a63 a54 a64 ]        b32 ]


Page 48:

The Nearest Kronecker Product Problem

Minimizing the Objective Function (3-by-2 case)

It is a nearest rank-1 problem:

φA(B, C) = ‖ [ a11 a21 a12 a22        [ b11
               a31 a41 a32 a42          b21
               a51 a61 a52 a62    −     b31   · [ c11 c21 c12 c22 ] ‖F
               a13 a23 a14 a24          b12
               a33 a43 a34 a44          b22
               a53 a63 a54 a64 ]        b32 ]

         = ‖ Ã − vec(B) vec(C)^T ‖F

with SVD solution Ã = UΣV^T:

vec(B) = √σ1 · U(:, 1)
vec(C) = √σ1 · V(:, 1)
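The whole construction fits in a short sketch in plain MATLAB (the block sizes are illustrative):

% Nearest Kronecker product min ||A - kron(B,C)||_F via the rearrangement.
m1 = 3; n1 = 2;                  % B is m1-by-n1
m2 = 2; n2 = 2;                  % C is m2-by-n2
A = randn(m1*m2, n1*n2);
R = zeros(m1*n1, m2*n2);         % the "tilde matrix"
row = 0;
for j = 1:n1                     % blocks in column-major order: A11,A21,...,A12,...
    for i = 1:m1
        row = row + 1;
        Aij = A((i-1)*m2+1:i*m2, (j-1)*n2+1:j*n2);
        R(row,:) = Aij(:)';      % each row is vec(Aij)'
    end
end
[U,Sigma,V] = svd(R);
B = sqrt(Sigma(1,1))*reshape(U(:,1),m1,n1);
C = sqrt(Sigma(1,1))*reshape(V(:,1),m2,n2);
norm(A - kron(B,C),'fro')        % minimal over all B and C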


Page 49:

The Nearest Kronecker Product Problem

The “Tilde Matrix”

A = [ a11 a12 a13 a14        [ A11 A12
      a21 a22 a23 a24          A21 A22
      a31 a32 a33 a34    =     A31 A32 ]
      a41 a42 a43 a44
      a51 a52 a53 a54
      a61 a62 a63 a64 ]

implies

Ã = [ a11 a21 a12 a22        [ vec(A11)^T
      a31 a41 a32 a42          vec(A21)^T
      a51 a61 a52 a62          vec(A31)^T
      a13 a23 a14 a24    =     vec(A12)^T
      a33 a43 a34 a44          vec(A22)^T
      a53 a63 a54 a64 ]        vec(A32)^T ] .


Page 50:

The Kronecker Product SVD (KPSVD)

Theorem

If

A = [ A11    · · ·  A1,c2
      ·      · · ·  ·
      Ar2,1  · · ·  Ar2,c2 ]       Ai2,j2 ∈ IR^{r1×c1}

then there exist U1, . . . , UrKP ∈ IR^{r2×c2}, V1, . . . , VrKP ∈ IR^{r1×c1}, and scalars σ1 ≥ · · · ≥ σrKP > 0 such that

A = ∑_{k=1}^{rKP} σk Uk ⊗ Vk.

The sets {vec(Uk)} and {vec(Vk)} are orthonormal and rKP is the Kronecker rank of A with respect to the chosen blocking.


Page 51:

The Kronecker Product SVD (KPSVD)

Constructive Proof

Compute the SVD of the tilde matrix Ã:

Ã = UΣV^T = ∑_{k=1}^{rKP} σk uk vk^T

and define the Uk and Vk by

vec(Uk) = uk,   vec(Vk) = vk

for k = 1:rKP, i.e.,

Uk = reshape(uk, r2, c2),   Vk = reshape(vk, r1, c1)


Page 52:

The Kronecker Product SVD (KPSVD)

Nearest rank-r

If r ≤ rKP, then

Ar = ∑_{k=1}^{r} σk Uk ⊗ Vk

is the nearest matrix to A (in the Frobenius norm) that has Kronecker rank r.


Page 53:

The Nearest Kronecker Product Problem

The Objective Function in Tensor Terms

φA(B, C) = ‖ A − B ⊗ C ‖F

         = √( ∑_{i1=1}^{r1} ∑_{j1=1}^{c1} ∑_{i2=1}^{r2} ∑_{j2=1}^{c2} ( A(i1, j1, i2, j2) − B(i2, j2) C(i1, j1) )^2 )

We are trying to approximate an order-4 tensor with a pair of order-2 tensors.

Analogous to approximating a matrix (an order-2 tensor) with a rank-1 matrix (a pair of order-1 tensors).


Page 54:

The Nearest Kronecker Product Problem

Harder

φA(B, C, D) = ‖ A − B ⊗ C ⊗ D ‖F

            = √( ∑_{i1=1}^{r1} ∑_{j1=1}^{c1} ∑_{i2=1}^{r2} ∑_{j2=1}^{c2} ∑_{i3=1}^{r3} ∑_{j3=1}^{c3} ( A(i1, j1, i2, j2, i3, j3) − B(i3, j3) C(i2, j2) D(i1, j1) )^2 )

We are trying to approximate an order-6 tensor with a triplet of order-2 tensors.


Page 55:

References

L. De Lathauwer, B. De Moor, and J. Vandewalle (2000a). “A Multilinear Singular Value Decomposition,” SIAM J. Matrix Anal. Appl., 21, 1253–1278.

W. Hackbusch and B.N. Khoromskij (2007). “Tensor-Product Approximation to Operators and Functions in High Dimensions,” Journal of Complexity, 23, 697–714.


L. De Lathauwer, B. De Moor, and J. Vandewalle (2000b). “On the Best Rank-1 and Rank-(r1,r2,...,rN) Approximation of Higher-Order Tensors,” SIAM J. Matrix Anal. Appl., 21, 1324–1342.

T.G. Kolda (2001). “Orthogonal Tensor Decompositions,” SIAM J. Matrix Anal. Appl., 23, 243–255.

T.G. Kolda (2003). “A Counterexample to the Possibility of an Extension of the Eckart–Young Low-Rank Approximation Theorem for the Orthogonal Rank Tensor Decomposition,” SIAM J. Matrix Anal. Appl., 24, 762–767.


Page 56:

Summary

Key Words

The Mode-k Matrix Product is a contraction between a tensor and a matrix that produces another tensor.

The modal rank of a tensor is the vector of mode-k unfolding ranks.

The Higher-Order SVD of a tensor A assembles the SVDs of A’s modal unfoldings.

The Tucker Nearness Problem for a given tensor A involves finding the nearest tensor that has a given modal rank. It is solved via alternating LS problems that involve SVDs.

The Kronecker Product SVD characterizes a block matrix as a sum of Kronecker products. By applying it to an unfolding of a tensor A, an outer product expansion for A is obtained.
