Four Talks on Tensor Computations
2. SVD-Based Tensor Decompositions
Charles F. Van Loan
Cornell University
SCAN Seminar
November 3, 2014
⊗ Four Talks on Tensor Computations ⊗ 2. SVD-Based Tensor Decompositions 1 / 51
What is this Lecture About?
Pushing the SVD Envelope
The SVD is a powerful tool for exposing the structure of a matrix and for capturing its essence through optimal, data-sparse representations:
A = Σ_{k=1}^{rank(A)} σk uk vk^T ≈ Σ_{k=1}^{r} σk uk vk^T
For a tensor A, let’s try for something similar...
A = Σ whatever!
What We Will Need...
A Mechanism for Updating
UTAV = Σ
We will need operations that can transform the given tensor into something
simple.
The Notion of an Abbreviated Decomposition
vec(A) = (V ⊗ U) vec(Σ)
= a structured sum of the V (:, i)⊗ U(:, j)
≈ a SHORTER structured sum of the V (:, i)⊗ U(:, j)
We’ll need a way to “pull out” the essential part of a decomposition.
What is this Lecture About?
Outline
The Mode-k Product and the Tucker Product
The Tucker Representation and Its Properties
The Higher-Order SVD of a Tensor
An ALS Framework for Reduced-Rank Tucker Approximation
Approximation via the Kronecker Product SVD
The Mode-k Matrix Product
Main Idea
Given A ∈ IR^{n1×n2×n3}, a mode k, and a matrix M, we apply M to every mode-k fiber.
Recall that
A(2) =
a111 a211 a311 a411 a112 a212 a312 a412
a121 a221 a321 a421 a122 a222 a322 a422
a131 a231 a331 a431 a132 a232 a332 a432
is the mode-2 unfolding of A ∈ IR^{4×3×2} and its columns are its mode-2 fibers.
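Not from the talk (which uses the Matlab Tensor Toolbox): the unfolding above can be checked in NumPy. A small sketch; the helper name `unfold` and the Fortran-order reshape are our choices, picked to reproduce the fiber ordering shown on the slide.

```python
import numpy as np

def unfold(A, k):
    # Mode-k unfolding: rows indexed by mode k; column (i1, i3) sits at
    # position i1 + n1*i3, i.e. the earlier free index varies fastest,
    # matching the ordering on the slide.
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

A = np.arange(24, dtype=float).reshape(4, 3, 2)   # a 4-by-3-by-2 tensor
A2 = unfold(A, 1)                                 # the mode-2 unfolding A_(2)

assert A2.shape == (3, 8)
# Column i1 + 4*i3 of A_(2) is the mode-2 fiber A(i1, :, i3).
assert np.array_equal(A2[:, 1 + 4*1], A[1, :, 1])
```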
The Mode-k Matrix Product: An Example
A Mode-2 Example When A ∈ IR4×3×2
b111 b211 b311 b411 b112 b212 b312 b412
b121 b221 b321 b421 b122 b222 b322 b422
b131 b231 b331 b431 b132 b232 b332 b432
b141 b241 b341 b441 b142 b242 b342 b442
b151 b251 b351 b451 b152 b252 b352 b452
=
m11 m12 m13
m21 m22 m23
m31 m32 m33
m41 m42 m43
m51 m52 m53
a111 a211 a311 a411 a112 a212 a312 a412
a121 a221 a321 a421 a122 a222 a322 a422
a131 a231 a331 a431 a132 a232 a332 a432
Note that (1) B ∈ IR4×5×2 and (2) B(2) = M · A(2).
The Mode-k Matrix Product: Another Example
A mode-1 example when the tensor A is second order...

b11 b12 b13     m11 m12     a11 a12 a13
b21 b22 b23  =  m21 m22  ·  a21 a22 a23

(The fibers of A are its columns.)
A mode-2 example when the tensor A is second order...

b11 b12     m11 m12 m13     a11 a21
b21 b22  =  m21 m22 m23  ·  a12 a22
b31 b32     m31 m32 m33     a13 a23
b41 b42     m41 m42 m43

(The fibers of A are its rows.)
The Mode-k Product: Definitions
Mode-1
If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m1×n1}, then the mode-1 product

B = A ×1 M ∈ IR^{m1×n2×n3}

is defined by

B(i1, i2, i3) = Σ_{k=1}^{n1} M(i1, k) · A(k, i2, i3)

Two Equivalent Formulations...

B(1) = M · A(1)

vec(B) = (In3 ⊗ In2 ⊗ M) · vec(A)
The Mode-k Product: Definitions
Mode-2
If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m2×n2}, then the mode-2 product

B = A ×2 M ∈ IR^{n1×m2×n3}

is defined by

B(i1, i2, i3) = Σ_{k=1}^{n2} M(i2, k) · A(i1, k, i3)

Two Equivalent Formulations...

B(2) = M · A(2)

vec(B) = (In3 ⊗ M ⊗ In1) · vec(A)
The Mode-k Product: Definitions
Mode-3
If A ∈ IR^{n1×n2×n3} and M ∈ IR^{m3×n3}, then the mode-3 product

B = A ×3 M ∈ IR^{n1×n2×m3}

is defined by

B(i1, i2, i3) = Σ_{k=1}^{n3} M(i3, k) · A(i1, i2, k)

Two Equivalent Formulations...

B(3) = M · A(3)

vec(B) = (M ⊗ In2 ⊗ In1) · vec(A)
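The three definitions can be exercised numerically. A NumPy sketch (ours, not Toolbox code) that realizes B = A ×k M through the unfolding identity B(k) = M · A(k) and checks the scalar formula; the helper names `unfold`, `fold`, and `modek_product` are our own.

```python
import numpy as np

def unfold(A, k):
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

def fold(M, k, shape):
    # Inverse of unfold: reshape M back and move mode k into place.
    full = list(shape)
    moved = [M.shape[0]] + full[:k] + full[k+1:]
    return np.moveaxis(np.reshape(M, moved, order="F"), 0, k)

def modek_product(A, M, k):
    # B = A x_k M, realized through B_(k) = M * A_(k).
    return fold(M @ unfold(A, k), k, A.shape)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 2))
M = rng.standard_normal((5, 3))
B = modek_product(A, M, 1)            # mode-2 product: B is 4-by-5-by-2

assert B.shape == (4, 5, 2)
assert np.allclose(unfold(B, 1), M @ unfold(A, 1))
# Scalar definition: B(i1,i2,i3) = sum_k M(i2,k) A(i1,k,i3).
assert np.allclose(B[2, 4, 1], M[4, :] @ A[2, :, 1])
```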
The Mode-k Product: Properties
Successive Products in the Same Mode
If A ∈ IR^{n1×n2×n3} and M1, M2 ∈ IR^{nk×nk}, then

(A ×k M1) ×k M2 = A ×k (M2 M1).
Successive Products in Different Modes
If A ∈ IR^{n1×n2×n3}, Mk ∈ IR^{mk×nk}, Mj ∈ IR^{mj×nj}, and k ≠ j, then

(A ×k Mk) ×j Mj = (A ×j Mj) ×k Mk
The order is not important.
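Both rules are easy to confirm numerically; a NumPy check (ours), using `tensordot` for the mode-k product. Applying M1 then M2 in the same mode composes as M2M1, since B(k) = M · A(k) stacks matrix products on the left.

```python
import numpy as np

def modek(A, M, k):
    # Mode-k product: contract M's columns against A's mode k,
    # then move the resulting axis back into position k.
    return np.moveaxis(np.tensordot(M, A, axes=(1, k)), 0, k)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 2))
M1 = rng.standard_normal((3, 3))
M2 = rng.standard_normal((3, 3))
N = rng.standard_normal((5, 4))

# Same mode: M1 first, then M2, composes as M2*M1.
assert np.allclose(modek(modek(A, M1, 1), M2, 1), modek(A, M2 @ M1, 1))
# Different modes: the order does not matter.
assert np.allclose(modek(modek(A, N, 0), M1, 1),
                   modek(modek(A, M1, 1), N, 0))
```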
Matlab Tensor Toolbox: Mode-k Matrix Product Using ttm
n = [2 5 4 7];
A = tenrand(n);     % random n(1)-by-...-by-n(4) tensor
k = 2;              % which mode to multiply along
M = randn(5,5);     % M must have n(k) = 5 columns
B = ttm(A,M,k);     % B = A x_k M
The Tucker Product
Definition
The Tucker product X of the tensor

S ∈ IR^{r1×r2×r3}

with the matrices

U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}

is given by

X = S ×1 U1 ×2 U2 ×3 U3 = ((S ×1 U1) ×2 U2) ×3 U3
It is a succession of mode-k products.
The Tucker Product
Notation
[[ S ; U1, U2, U3 ]]
≡
S ×1 U1 ×2 U2 ×3 U3
The Tucker Product
As a Scalar Summation...
If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

X(i1, i2, i3) = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)
The Tucker Product
As a Scalar Summation and as a Sum of Rank-1 Tensors...
If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

X(i1, i2, i3) = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

X = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(:, j1) ∘ U2(:, j2) ∘ U3(:, j3)
The Tucker Product
As a Giant Matrix-Vector Product...
If S ∈ IR^{r1×r2×r3}, U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}, and

X = [[ S ; U1, U2, U3 ]]

then

vec(X) = (U3 ⊗ U2 ⊗ U1) · vec(S)
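The vec identity can be verified directly; a NumPy sketch (ours), where vec is the column-major flattening and `modek` is our mode-k product helper.

```python
import numpy as np

def modek(A, M, k):
    return np.moveaxis(np.tensordot(M, A, axes=(1, k)), 0, k)

rng = np.random.default_rng(0)
S = rng.standard_normal((2, 3, 2))    # core tensor
U1 = rng.standard_normal((4, 2))
U2 = rng.standard_normal((5, 3))
U3 = rng.standard_normal((3, 2))

# X = [[S; U1, U2, U3]]
X = modek(modek(modek(S, U1, 0), U2, 1), U3, 2)

# vec(X) = (U3 kron U2 kron U1) vec(S), with column-major vec.
vecX = X.reshape(-1, order="F")
vecS = S.reshape(-1, order="F")
assert np.allclose(vecX, np.kron(U3, np.kron(U2, U1)) @ vecS)
```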
Matlab Tensor Toolbox: Tucker Products
function B = TuckerProd(A,M)
% A is an n(1)-by-...-by-n(d) tensor.
% M is a length-d cell array with
% M{k} an m(k)-by-n(k) matrix.
% B is an m(1)-by-...-by-m(d) tensor given by
%    B = A x1 M{1} x2 M{2} ... xd M{d}
% where "xk" denotes the mode-k matrix product.
B = A;
for k = 1:length(A.size)
    B = ttm(B,M{k},k);
end
Matlab Tensor Toolbox: Tucker Tensor Set-Up
n = [5 8 3]; m = [4 6 2];
F = randn(n(1),m(1)); G = randn(n(2),m(2));
H = randn(n(3),m(3));
S = tenrand(m);
X = ttensor(S,{F,G,H});
A ttensor is a structure with two fields that is used to represent a tensor in Tucker form. In the above, X.core houses the core tensor S while X.U is a cell array of the matrices F, G, and H that define the tensor X.
The Tucker Product Representation
The Challenge
Given A ∈ IR^{n1×n2×n3}, compute

S ∈ IR^{r1×r2×r3}

and

U1 ∈ IR^{n1×r1}, U2 ∈ IR^{n2×r2}, U3 ∈ IR^{n3×r3}

such that

A = S ×1 U1 ×2 U2 ×3 U3

is an “illuminating” Tucker product representation of A.
The Tucker Product Representation
A Simple but Important Result
If A ∈ IR^{n1×n2×n3} and U1 ∈ IR^{n1×n1}, U2 ∈ IR^{n2×n2}, and U3 ∈ IR^{n3×n3} are nonsingular, then

A = S ×1 U1 ×2 U2 ×3 U3

where

S = A ×1 U1^{-1} ×2 U2^{-1} ×3 U3^{-1}.

We will refer to the Uk as the inverse factors and S as the core tensor.

In matrix terms (d = 2): A = U1 (U1^{-1} A U2^{-T}) U2^T = U1 S U2^T
Proof.

A = A ×1 (U1 U1^{-1}) ×2 (U2 U2^{-1}) ×3 (U3 U3^{-1})

  = (A ×1 U1^{-1} ×2 U2^{-1} ×3 U3^{-1}) ×1 U1 ×2 U2 ×3 U3

  = S ×1 U1 ×2 U2 ×3 U3
The Tucker Product Representation
If the U’s are Orthogonal
If A ∈ IR^{n1×n2×n3} and U1 ∈ IR^{n1×n1}, U2 ∈ IR^{n2×n2}, and U3 ∈ IR^{n3×n3} are orthogonal, then

A = S ×1 U1 ×2 U2 ×3 U3

with

S = A ×1 U1^T ×2 U2^T ×3 U3^T.
The Tucker Product Representation
If the U’s are from the Modal Unfolding SVDs...
Suppose A ∈ IR^{n1×n2×n3} is given. If

A(1) = U1 Σ1 V1^T
A(2) = U2 Σ2 V2^T
A(3) = U3 Σ3 V3^T

are SVDs and

S = A ×1 U1^T ×2 U2^T ×3 U3^T,

then

A = S ×1 U1 ×2 U2 ×3 U3

is the higher-order SVD of A.
Matlab Tensor Toolbox: Computing the HOSVD
function [S,U] = HOSVD(A)
% A is an n(1)-by-...-by-n(d) tensor.
% U is a length-d cell array with the
% property that U{k} is the left singular
% vector matrix of A's mode-k unfolding.
% S is an n(1)-by-...-by-n(d) tensor given by
%    S = A x1 U{1}' x2 U{2}' ... xd U{d}'
S = A;
for k = 1:length(A.size)
    C = tenmat(A,k);
    [U{k},Sigma,V] = svd(C.data);
    S = ttm(S,U{k}',k);
end
The Higher-Order SVD (HOSVD)
The HOSVD of a Matrix
If d = 2 then A is a matrix and the HOSVD is the SVD. Indeed, if

A = A(1) = U1 Σ1 V1^T
A^T = A(2) = U2 Σ2 V2^T

then we can set U = U1 = V2 and V = U2 = V1. Note that

S = (A ×1 U1^T) ×2 U2^T = (U1^T A) ×2 U2^T = U1^T A U2 = U1^T A V1 = Σ1.
The HOSVD
Core Tensor Properties
If

A(1) = U1 Σ1 V1^T,  A(2) = U2 Σ2 V2^T,  A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(1) = U1 S(1) (U3 ⊗ U2)^T  and  S(1) = Σ1 V1^T (U3 ⊗ U2)

It follows that the rows of S(1) are mutually orthogonal and that the singular values of A(1) are the 2-norms of these rows.
The HOSVD
Core Tensor Properties
If

A(1) = U1 Σ1 V1^T,  A(2) = U2 Σ2 V2^T,  A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(2) = U2 S(2) (U3 ⊗ U1)^T  and  S(2) = Σ2 V2^T (U3 ⊗ U1)

It follows that the rows of S(2) are mutually orthogonal and that the singular values of A(2) are the 2-norms of these rows.
The HOSVD
Core Tensor Properties
If

A(1) = U1 Σ1 V1^T,  A(2) = U2 Σ2 V2^T,  A(3) = U3 Σ3 V3^T

are SVDs and

A = S ×1 U1 ×2 U2 ×3 U3

then

A(3) = U3 S(3) (U2 ⊗ U1)^T  and  S(3) = Σ3 V3^T (U2 ⊗ U1)

It follows that the rows of S(3) are mutually orthogonal and that the singular values of A(3) are the 2-norms of these rows.
The Core Tensor S is Graded
S(1) = Σ1 V1^T (U3 ⊗ U2)  ⇒  ‖ S(j, :, :) ‖F = σj(A(1)),  j = 1:n1
S(2) = Σ2 V2^T (U3 ⊗ U1)  ⇒  ‖ S(:, j, :) ‖F = σj(A(2)),  j = 1:n2
S(3) = Σ3 V3^T (U2 ⊗ U1)  ⇒  ‖ S(:, :, j) ‖F = σj(A(3)),  j = 1:n3

The entries get smaller as you move away from S(1, 1, 1).
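A NumPy sketch of the HOSVD (ours; the HOSVD function above is the Tensor Toolbox version) that also confirms the graded-core property for mode 1; the helpers `unfold` and `modek` are our own names.

```python
import numpy as np

def unfold(A, k):
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

def modek(A, M, k):
    return np.moveaxis(np.tensordot(M, A, axes=(1, k)), 0, k)

def hosvd(A):
    # Uk = left singular vectors of the mode-k unfolding;
    # S = A x1 U1' x2 U2' x3 U3'.
    U = [np.linalg.svd(unfold(A, k))[0] for k in range(A.ndim)]
    S = A
    for k, Uk in enumerate(U):
        S = modek(S, Uk.T, k)
    return S, U

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 2))
S, U = hosvd(A)

# Reconstruction: A = S x1 U1 x2 U2 x3 U3.
B = S
for k, Uk in enumerate(U):
    B = modek(B, Uk, k)
assert np.allclose(B, A)

# Graded core: || S(j,:,:) ||_F = sigma_j(A_(1)).
sigma1 = np.linalg.svd(unfold(A, 0), compute_uv=False)
assert np.allclose([np.linalg.norm(S[j]) for j in range(4)], sigma1)
```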
The Higher-Order SVD (HOSVD)
The HOSVD as a Multilinear Sum
If A = S ×1 U1 ×2 U2 ×3 U3 is the HOSVD of A ∈ IR^{n1×n2×n3}, then

A(i1, i2, i3) = Σ_{j1=1}^{n1} Σ_{j2=1}^{n2} Σ_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)
The HOSVD as a Matrix-Vector Product
If A = S ×1 U1 ×2 U2 ×3 U3 is the HOSVD of A ∈ IR^{n1×n2×n3}, then
vec(A) = (U3 ⊗ U2 ⊗ U1) · vec(S)
Note that U3 ⊗ U2 ⊗ U1 is orthogonal.
Problem 1. Formulate an HOQRP factorization for a tensor A ∈ IR^{n1×···×nd} that is based on the QR-with-column-pivoting factorizations

A(k) Pk = Qk Rk

for k = 1:d. Does the core tensor have any special properties?
Problem 2. Does this inequality hold?

‖ A − Ar ‖F² ≤ Σ_{j=r1+1}^{n1} σj(A(1))² + Σ_{j=r2+1}^{n2} σj(A(2))² + Σ_{j=r3+1}^{n3} σj(A(3))²

Can you do better?
Problem 3. Show that

|A(i1, i2, i3) − Ar(i1, i2, i3)| ≤ min{ σ_{r1+1}(A(1)), σ_{r2+1}(A(2)), σ_{r3+1}(A(3)) }
The Truncated HOSVD
Abbreviate the HOSVD Expansion...
A(i1, i2, i3) = Σ_{j1=1}^{n1} Σ_{j2=1}^{n2} Σ_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

Ar(i1, i2, i3) = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)
What can we say about the “thrown away” terms?
The Truncated HOSVD
Abbreviate the HOSVD Expansion...
A(i1, i2, i3) = Σ_{j1=1}^{n1} Σ_{j2=1}^{n2} Σ_{j3=1}^{n3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

Ar(i1, i2, i3) = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

‖ S(j, :, :) ‖F = σj(A(1)),  j = 1:n1
‖ S(:, j, :) ‖F = σj(A(2)),  j = 1:n2
‖ S(:, :, j) ‖F = σj(A(3)),  j = 1:n3

The entries get smaller as you move away from S(1, 1, 1).
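The truncated HOSVD is easy to sketch in NumPy (ours, not Toolbox code). The final assertion is an empirical run of the inequality posed in Problem 2 on one random instance, not a proof.

```python
import numpy as np

def unfold(A, k):
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

def modek(A, M, k):
    return np.moveaxis(np.tensordot(M, A, axes=(1, k)), 0, k)

def truncated_hosvd(A, ranks):
    # Keep only the leading r_k left singular vectors in each mode.
    U = [np.linalg.svd(unfold(A, k))[0][:, :r] for k, r in enumerate(ranks)]
    S = A
    for k, Uk in enumerate(U):
        S = modek(S, Uk.T, k)
    Ar = S
    for k, Uk in enumerate(U):
        Ar = modek(Ar, Uk, k)
    return Ar

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5, 4))
Ar = truncated_hosvd(A, (3, 3, 3))

# Empirical check of the Problem-2 bound: the squared error is at most
# the sum of the squared discarded singular values over the three modes.
err2 = np.linalg.norm(A - Ar) ** 2
bound = sum(np.sum(np.linalg.svd(unfold(A, k), compute_uv=False)[3:] ** 2)
            for k in range(3))
assert err2 <= bound + 1e-10
```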
Modal Rank
Definition
We say that

Ar(i1, i2, i3) = Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(i1, j1) · U2(i2, j2) · U3(i3, j3)

has modal rank (r1, r2, r3).
The Tucker Nearness Problem
A Tensor Optimization Problem
Given A ∈ IR^{n1×n2×n3} and integers r1 ≤ n1, r2 ≤ n2, and r3 ≤ n3,

minimize ‖ A − B ‖F

over all tensors B ∈ IR^{n1×n2×n3} that have modal rank (r1, r2, r3).
The Tucker Nearness Problem
A Tensor Optimization Problem
Given A ∈ IR^{n1×n2×n3} and integers r1 ≤ n1, r2 ≤ n2, and r3 ≤ n3,

minimize ‖ A − B ‖F

over all tensors B ∈ IR^{n1×n2×n3} that have modal rank (r1, r2, r3).

A Matrix Optimization Problem

Given A ∈ IR^{n1×n2} and integer r ≤ min{n1, n2}, minimize

‖ A − B ‖F

over all matrices B ∈ IR^{n1×n2} that have rank r.

The matrix problem has a happy solution via the SVD A = UΣV^T:

Bopt = σ1 U(:, 1) V(:, 1)^T + · · · + σr U(:, r) V(:, r)^T
The Tucker Nearness Problem
The Plan...
Develop an Alternating Least Squares framework for minimizing

‖ A − Σ_{j1=1}^{r1} Σ_{j2=1}^{r2} Σ_{j3=1}^{r3} S(j1, j2, j3) · U1(:, j1) ∘ U2(:, j2) ∘ U3(:, j3) ‖F

Equivalent to finding U1, U2, and U3 (all with orthonormal columns) and a core tensor S ∈ IR^{r1×r2×r3} so that

‖ vec(A) − (U3 ⊗ U2 ⊗ U1) vec(S) ‖F

is minimized.
The Tucker Nearness Problem
The “Removal” of S

Since S must minimize

‖ vec(A) − (U3 ⊗ U2 ⊗ U1) · vec(S) ‖

and U3 ⊗ U2 ⊗ U1 has orthonormal columns, we see that

vec(S) = (U3^T ⊗ U2^T ⊗ U1^T) · vec(A)

and so our goal is to choose the Ui so that

‖ ( I − (U3 ⊗ U2 ⊗ U1)(U3^T ⊗ U2^T ⊗ U1^T) ) vec(A) ‖

is minimized.
The Tucker Nearness Problem
Reformulation...
Since U3 ⊗ U2 ⊗ U1 has orthonormal columns, it follows that our goal is to choose orthonormal Ui so that

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖

is maximized.

If Q has orthonormal columns then

‖ (I − QQ^T) a ‖2² = ‖ a ‖2² − ‖ Q^T a ‖2²
The Tucker Nearness Problem
Three Reshapings of the Objective Function...

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖ = ‖ U1^T · A(1) · (U3 ⊗ U2) ‖F
                                  = ‖ U2^T · A(2) · (U3 ⊗ U1) ‖F
                                  = ‖ U3^T · A(3) · (U2 ⊗ U1) ‖F

Sets the stage for an alternating least squares solution approach...
Alternating Least Squares Framework
A Sequence of Three Linear Problems...

‖ (U3^T ⊗ U2^T ⊗ U1^T) · vec(A) ‖
  = ‖ U1^T · A(1) · (U3 ⊗ U2) ‖F   ⇐ 1. Fix U2 and U3 and maximize with U1.
  = ‖ U2^T · A(2) · (U3 ⊗ U1) ‖F   ⇐ 2. Fix U1 and U3 and maximize with U2.
  = ‖ U3^T · A(3) · (U2 ⊗ U1) ‖F   ⇐ 3. Fix U1 and U2 and maximize with U3.

These max problems are SVD problems...
How do you maximize ‖ Q^T M ‖F where Q ∈ IR^{m×r} has orthonormal columns, M ∈ IR^{m×n}, and r ≤ n?

If

M = UΣV^T

is the SVD of M, then

‖ Q^T M ‖F² = ‖ Q^T UΣV^T ‖F² = ‖ Q^T UΣ ‖F² = Σ_{k=1}^{n} σk² ‖ Q^T U(:, k) ‖2².

The best you can do is to set Q = U(:, 1:r).
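The argument can be stress-tested numerically; a NumPy sketch (ours) comparing the optimal choice Q = U(:, 1:r) against random orthonormal competitors.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 8, 6, 3
M = rng.standard_normal((m, n))

U = np.linalg.svd(M)[0]
Qbest = U[:, :r]                      # leading r left singular vectors
best = np.linalg.norm(Qbest.T @ M)    # its square is sigma_1^2+...+sigma_r^2

for _ in range(200):
    # A random Q with orthonormal columns, via thin QR.
    Q = np.linalg.qr(rng.standard_normal((m, r)))[0]
    assert np.linalg.norm(Q.T @ M) <= best + 1e-10
```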
Alternating Least Squares Framework
A Sequence of Three Linear Problems...
Repeat:

1. Compute the SVD A(1) · (U3 ⊗ U2) = Ū1 Σ1 V1^T and set U1 = Ū1(:, 1:r1).
2. Compute the SVD A(2) · (U3 ⊗ U1) = Ū2 Σ2 V2^T and set U2 = Ū2(:, 1:r2).
3. Compute the SVD A(3) · (U2 ⊗ U1) = Ū3 Σ3 V3^T and set U3 = Ū3(:, 1:r3).

Initial guess via the HOSVD.
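A bare-bones NumPy analogue (ours, not the Toolbox's tucker_als) of the sweep above; folding the other two factors into A realizes A(k)·(U3 ⊗ U2) etc., after which each step keeps the dominant left singular vectors.

```python
import numpy as np

def unfold(A, k):
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

def modek(A, M, k):
    return np.moveaxis(np.tensordot(M, A, axes=(1, k)), 0, k)

def tucker_sweeps(A, ranks, sweeps=10):
    # Initial guess via the truncated HOSVD.
    U = [np.linalg.svd(unfold(A, k))[0][:, :r] for k, r in enumerate(ranks)]
    for _ in range(sweeps):
        for k in range(3):
            # Fold the other two factors into A: the mode-j products by
            # U_j' (j != k) realize A_(k)(U3 kron U2) and its analogues.
            Y = A
            for j in range(3):
                if j != k:
                    Y = modek(Y, U[j].T, j)
            U[k] = np.linalg.svd(unfold(Y, k))[0][:, :ranks[k]]
    S = A
    for k in range(3):
        S = modek(S, U[k].T, k)
    return S, U

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5, 4))
S, U = tucker_sweeps(A, (2, 2, 2))

# Rebuild the modal-rank-(2,2,2) approximation.
X = S
for k in range(3):
    X = modek(X, U[k], k)
assert all(np.allclose(Uk.T @ Uk, np.eye(2)) for Uk in U)
assert np.linalg.norm(A - X) <= np.linalg.norm(A)
```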
Matlab Tensor Toolbox: The Function tucker_als
n = [ 5 6 7 ];
% Generate a random tensor ...
A = tenrand(n);
for r = 1:min(n)
% Find the closest rank-[r r r] ttensor ...
X = tucker_als(A,[r r r]);
% Display the fit ...
E = double(X)-double(A);
fit = norm(reshape(E,prod(n),1));
fprintf(’r = %1d, fit = %5.3e\n’,r,fit);
end
The function tucker_als returns a ttensor. Default values for the number of iterations and the termination criterion can be modified:

X = tucker_als(A,r,'maxiters',20,'tol',.001)
Approximation via Sums of Low-Rank Tensor Products
Motivation
Unfold A ∈ IR^{n×n×n×n} into an n²-by-n² matrix A.

Express A as a sum of Kronecker products:

A = Σ_{k=1}^{r} σk Bk ⊗ Ck,   Bk, Ck ∈ IR^{n×n}

Back to tensor:

A = Σ_{k=1}^{r} σk Ck ∘ Bk

i.e.,

A(i1, i2, j1, j2) = Σ_{k=1}^{r} σk Ck(i1, i2) Bk(j1, j2)

There is an optimal way of doing this.
The Nearest Kronecker Product Problem
Reshaping the Objective Function (3-by-2 case)
With A ∈ IR^{6×4}, B ∈ IR^{3×2}, and C ∈ IR^{2×2}:

A =
a11 a12 a13 a14
a21 a22 a23 a24
a31 a32 a33 a34
a41 a42 a43 a44
a51 a52 a53 a54
a61 a62 a63 a64

B =
b11 b12
b21 b22
b31 b32

C =
c11 c12
c21 c22

Ã (the rearrangement of A) =
a11 a21 a12 a22
a31 a41 a32 a42
a51 a61 a52 a62
a13 a23 a14 a24
a33 a43 a34 a44
a53 a63 a54 a64

and

‖ A − B ⊗ C ‖F = ‖ Ã − vec(B) vec(C)^T ‖F

where vec(B) = (b11, b21, b31, b12, b22, b32)^T and vec(C) = (c11, c21, c12, c22)^T.
The Nearest Kronecker Product Problem
Minimizing the Objective Function (3-by-2 case)
It is a nearest rank-1 problem:

φA(B, C) = ‖ A − B ⊗ C ‖F = ‖ Ã − vec(B) vec(C)^T ‖F

with SVD solution: if Ã = UΣV^T, then

vec(B) = √σ1 · U(:, 1)
vec(C) = √σ1 · V(:, 1)
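The rearrangement argument translates directly into code; a NumPy sketch (ours) of the nearest-Kronecker-product computation. The function name `nearest_kron` and the block-scanning order (column-major over the block grid, matching the slide) are our choices.

```python
import numpy as np

def nearest_kron(A, m1, n1, m2, n2):
    # Minimize || A - B kron C ||_F with B m1-by-n1 and C m2-by-n2,
    # via the rank-1 SVD of the rearranged ("tilde") matrix.
    blocks = [A[i*m2:(i+1)*m2, j*n2:(j+1)*n2].reshape(-1, order="F")
              for j in range(n1) for i in range(m1)]   # vec of each block
    Atil = np.array(blocks)                            # (m1*n1) x (m2*n2)
    U, s, Vt = np.linalg.svd(Atil)
    B = np.sqrt(s[0]) * U[:, 0].reshape(m1, n1, order="F")
    C = np.sqrt(s[0]) * Vt[0].reshape(m2, n2, order="F")
    return B, C

rng = np.random.default_rng(0)
B0 = rng.standard_normal((3, 2))
C0 = rng.standard_normal((2, 2))

# An exact Kronecker product is recovered (up to how the scale and
# sign are split between B and C).
B, C = nearest_kron(np.kron(B0, C0), 3, 2, 2, 2)
assert np.allclose(np.kron(B, C), np.kron(B0, C0))
```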
The Nearest Kronecker Product Problem
The “Tilde Matrix”
A =
a11 a12 a13 a14
a21 a22 a23 a24
a31 a32 a33 a34
a41 a42 a43 a44
a51 a52 a53 a54
a61 a62 a63 a64
=
A11 A12
A21 A22
A31 A32

(a 3-by-2 grid of 2-by-2 blocks) implies

Ã =
a11 a21 a12 a22
a31 a41 a32 a42
a51 a61 a52 a62
a13 a23 a14 a24
a33 a43 a34 a44
a53 a63 a54 a64
=
vec(A11)^T
vec(A21)^T
vec(A31)^T
vec(A12)^T
vec(A22)^T
vec(A32)^T
The Kronecker Product SVD (KPSVD)
Theorem
If

A =
A11   ···  A1,c2
 ⋮     ⋱    ⋮
Ar2,1 ···  Ar2,c2
              with blocks Ai2,j2 ∈ IR^{r1×c1},

then there exist U1, ..., U_rKP ∈ IR^{r2×c2}, V1, ..., V_rKP ∈ IR^{r1×c1}, and scalars σ1 ≥ ··· ≥ σ_rKP > 0 such that

A = Σ_{k=1}^{rKP} σk Uk ⊗ Vk.

The sets {vec(Uk)} and {vec(Vk)} are orthonormal and rKP is the Kronecker rank of A with respect to the chosen blocking.
The Kronecker Product SVD (KPSVD)
Constructive Proof
Compute the SVD of the rearranged matrix Ã:

Ã = UΣV^T = Σ_{k=1}^{rKP} σk uk vk^T

and define the Uk and Vk by

vec(Uk) = uk,  vec(Vk) = vk

for k = 1:rKP, i.e.,

Uk = reshape(uk, r2, c2),  Vk = reshape(vk, r1, c1).
The Kronecker Product SVD (KPSVD)
Nearest rank-r
If r ≤ rKP, then

Ar = Σ_{k=1}^{r} σk Uk ⊗ Vk

is the nearest matrix to A (in the Frobenius norm) that has Kronecker rank r.
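The constructive proof is short enough to run; a NumPy sketch (ours) of the full KPSVD for one blocking, summed back to reproduce A. The function name `kpsvd` is our own.

```python
import numpy as np

def kpsvd(A, r2, c2, r1, c1):
    # A is blocked as an r2-by-c2 grid of r1-by-c1 blocks.
    # Returns (sigma_k, U_k, V_k) with A = sum_k sigma_k * kron(U_k, V_k).
    blocks = [A[i*r1:(i+1)*r1, j*c1:(j+1)*c1].reshape(-1, order="F")
              for j in range(c2) for i in range(r2)]
    Atil = np.array(blocks)
    U, s, Vt = np.linalg.svd(Atil, full_matrices=False)
    return [(s[k],
             U[:, k].reshape(r2, c2, order="F"),
             Vt[k].reshape(r1, c1, order="F"))
            for k in range(len(s))]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))          # 3-by-2 grid of 2-by-2 blocks
terms = kpsvd(A, 3, 2, 2, 2)

# Summing every term reproduces A; truncating after r terms gives the
# nearest matrix of Kronecker rank r with respect to this blocking.
full = sum(s * np.kron(Uk, Vk) for s, Uk, Vk in terms)
assert np.allclose(full, A)
```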
The Nearest Kronecker Product Problem
The Objective Function in Tensor Terms
φA(B, C) = ‖ A − B ⊗ C ‖F
         = sqrt( Σ_{i1=1}^{r1} Σ_{j1=1}^{c1} Σ_{i2=1}^{r2} Σ_{j2=1}^{c2} ( A(i1, j1, i2, j2) − B(i2, j2) C(i1, j1) )² )

We are trying to approximate an order-4 tensor with a pair of order-2 tensors.

Analogous to approximating a matrix (an order-2 tensor) with a rank-1 matrix (a pair of order-1 tensors).
The Nearest Kronecker Product Problem
Harder

φA(B, C, D) = ‖ A − B ⊗ C ⊗ D ‖F
            = sqrt( Σ_{i1=1}^{r1} Σ_{j1=1}^{c1} Σ_{i2=1}^{r2} Σ_{j2=1}^{c2} Σ_{i3=1}^{r3} Σ_{j3=1}^{c3} ( A(i1, j1, i2, j2, i3, j3) − B(i3, j3) C(i2, j2) D(i1, j1) )² )

We are trying to approximate an order-6 tensor with a triplet of order-2 tensors.
References
L. De Lathauwer, B. De Moor, J. Vandewalle (2000a). “A Multilinear SingularValue Decomposition”, SIAM J. Matrix Anal. Appl., 21, 1253–1278.
W. Hackbusch and B.N. Khoromskij (2007). “Tensor-product Approximation toOperators and Functions in High Dimensions,” Journal of Complexity, 23,697–714.
L. De Lathauwer, B. De Moor, and J. Vandewalle (2000b). “On the Best Rank-1and Rank-(r1,r2,...,rN) Approximation of Higher-Order Tensors,” SIAM J.Matrix Analysis and Applications, 21, 1324–1342.
T.G. Kolda (2001). “Orthogonal Tensor Decompositions,” SIAM Journal onMatrix Analysis and Applications, 23, 243–255.
T.G. Kolda (2003). “A Counterexample to the Possibility of an Extension of theEckart-Young Low-Rank approximation Theorem for the Orthogonal RankTensor Decomposition,” SIAM Journal on Matrix Analysis and Applications,24, 762–767.
Summary
Key Words
The Mode-k Matrix Product is a contraction between a tensor and a matrix that produces another tensor.

The Modal rank of a tensor is the vector of its mode-k unfolding ranks.

The Higher-Order SVD of a tensor A assembles the SVDs of A's modal unfoldings.

The Tucker Nearness Problem for a given tensor A involves finding the nearest tensor that has a given modal rank. It is solved via alternating least squares problems that involve SVDs.

The Kronecker Product SVD characterizes a block matrix as a sum of Kronecker products. By applying it to an unfolding of a tensor A, an outer product expansion for A is obtained.