Page 1

Maximum Likelihood Matrix Completion Under Sparse Factor Models:

Error Guarantees and Efficient Algorithms

Jarvis Haupt

Department of Electrical and Computer Engineering, University of Minnesota

Institute for Computational and Experimental Research in Mathematics (ICERM)
Workshop on Approximation, Integration, and Optimization

October 1, 2014

Page 2

Section 1

Background and Motivation

Page 3

A Classical Example

Sampling Theorem: (Whittaker/Kotelnikov/Nyquist/Shannon, 1930's-1950's)

Original Signal (Red); Samples (Black)

Accurate Recovery (and Imputation) via Ideal Low-Pass Filtering

when Original Signal is Bandlimited

Basic "Formula" for Inference: To draw inferences from limited data (or here, to impute missing elements), one needs to leverage underlying structure in the signal being inferred.

Page 4

A Contemporary Example

Matrix Completion: (Candes & Recht; Keshavan et al.; Candes & Tao; Candes & Plan; Negahban & Wainwright; Koltchinskii et al.; Davenport et al.; ... 2009-)

Samples

Accurate Recovery (and Imputation) via Convex Optimization

when Original Matrix is Low-Rank

The low-rank modeling assumption is commonly utilized in collaborative filtering applications (e.g. the Netflix prize), to describe settings where each observed value depends on only a few latent factors or features.

Page 5

Beyond Low Rank Models?

Low-Rank Models: All columns of the matrix are well-approximated as vectors in a common linear subspace.

Union of Subspaces Model: All columns of the matrix are well-approximated as vectors in a union of linear subspaces.

Union of subspaces models are at the essence of sparse subspace clustering (Elhamifar & Vidal; Soltanolkotabi et al.; Eriksson et al.; Balzano et al.) and dictionary learning (Olshausen & Field; Aharon et al.; Mairal et al.; ...).

Here, we examine the efficacy of such models in matrix completion tasks.

Page 6

Section 2

Problem Statement

Page 7

"Sparse Factor" Data Models

We assume the unknown $X^* \in \mathbb{R}^{n_1 \times n_2}$ we seek to estimate admits a factorization of the form

$$X^* = D^* A^*, \qquad D^* \in \mathbb{R}^{n_1 \times r}, \ A^* \in \mathbb{R}^{r \times n_2},$$

where

• $\|D^*\|_{\max} \triangleq \max_{i,j} |D^*_{i,j}| \leq 1$ (essentially to fix scaling ambiguities)

• $\|A^*\|_{\max} \leq A_{\max}$ for a constant $0 < A_{\max} \leq (n_1 \vee n_2)$

• $\|X^*\|_{\max} \leq X_{\max}/2$ for a constant $X_{\max} \geq 1$

Our Focus: Sparse factor models, characterized by (approximately or exactly) sparse $A^*$.
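
For intuition, here is a minimal sketch (my own illustration, not from the slides) of generating a matrix satisfying this sparse factor model; the dimensions and per-column sparsity mirror the synthetic experiment later in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r = 100, 1000, 20        # ambient dimensions and inner dimension r
k = 4                            # nonzeros per column of A*
A_max = 1.0

D_star = rng.uniform(-1.0, 1.0, size=(n1, r))      # enforces ||D*||_max <= 1
A_star = np.zeros((r, n2))
for j in range(n2):                                 # exactly k-sparse columns of A*
    support = rng.choice(r, size=k, replace=False)
    A_star[support, j] = rng.uniform(-A_max, A_max, size=k)

X_star = D_star @ A_star   # X* = D* A*; rescale if ||X*||_max <= X_max/2 must hold exactly
```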

Page 9

Observation Model

We observe $X^*$ only at a subset $S \subseteq \{1, 2, \ldots, n_1\} \times \{1, 2, \ldots, n_2\}$ of its locations. For some $\gamma \in (0, 1]$, each $(i,j)$ is in $S$ independently with probability $\gamma$; we interpret $\gamma = m (n_1 n_2)^{-1}$, so that $m = \gamma n_1 n_2$ is the nominal number of observations.

The observations $\{Y_{i,j}\}_{(i,j) \in S} \triangleq Y_S$ are conditionally independent given $S$, and are modeled via the joint density

$$p_{X^*_S}(Y_S) = \prod_{(i,j) \in S} \underbrace{p_{X^*_{i,j}}(Y_{i,j})}_{\text{scalar densities}}.$$
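
A small sketch of this observation model (my own; the function name and the use of NaN placeholders for unobserved entries are assumptions, and any of the scalar likelihoods discussed later can be plugged in as noise_fn):

```python
import numpy as np

def sample_observations(X_star, gamma, noise_fn, rng):
    """Keep each entry (i,j) independently with probability gamma, and return
    noisy observations Y at the kept locations (NaN elsewhere) plus the mask S."""
    n1, n2 = X_star.shape
    mask = rng.random((n1, n2)) < gamma
    Y = np.where(mask, noise_fn(X_star, rng), np.nan)
    return Y, mask

# e.g. Gaussian observations (anticipating the AWGN model discussed later):
# Y, mask = sample_observations(X_star, gamma=0.4,
#     noise_fn=lambda X, r: X + 0.25 * r.standard_normal(X.shape),
#     rng=np.random.default_rng(1))
```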

Page 10

Estimation Approach

We estimate $X^*$ via a sparsity-penalized maximum likelihood approach: for $\lambda > 0$, we take

$$\widehat{X} = \arg\min_{X = DA \in \mathcal{X}} \left\{ -\log p_{X_S}(Y_S) + \lambda \cdot \|A\|_0 \right\}.$$

The set $\mathcal{X}$ of candidate reconstructions is any subset of $\mathcal{X}'$, where

$$\mathcal{X}' \triangleq \left\{ X = DA : D \in \mathcal{D},\ A \in \mathcal{A},\ \|X\|_{\max} \leq X_{\max} \right\},$$

where

• $\mathcal{D}$: the set of all matrices $D \in \mathbb{R}^{n_1 \times r}$ whose elements are discretized to one of $L$ uniformly-spaced values in the range $[-1, 1]$

• $\mathcal{A}$: the set of all matrices $A \in \mathbb{R}^{r \times n_2}$ whose elements either take the value zero, or are discretized to one of $L$ uniformly-spaced values in the range $[-A_{\max}, A_{\max}]$
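
As a concrete reading of this objective, here is a sketch (my own; the discretized sets and the max-norm constraint are omitted for brevity, and neg_log_lik is an assumed callable implementing one of the scalar models discussed later):

```python
import numpy as np

def penalized_objective(D, A, Y, mask, lam, neg_log_lik):
    """Evaluate -log p_{X_S}(Y_S) + lam * ||A||_0 for the candidate X = DA,
    where neg_log_lik(x, y) returns the scalar value -log p_x(y)."""
    X = D @ A
    data_fit = np.sum(neg_log_lik(X[mask], Y[mask]))   # sum over observed (i,j) in S
    return data_fit + lam * np.count_nonzero(A)

# e.g. Gaussian noise with variance sigma2 (up to an additive constant):
# obj = penalized_objective(D, A, Y, mask, lam=0.1,
#                           neg_log_lik=lambda x, y: (y - x)**2 / (2 * sigma2))
```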

Page 12

Section 3

Error Bounds

Page 13

A General "Sparse Factor" Matrix Completion Error Guarantee

Theorem (A. Soni, S. Jain, J.H., and S. Gonella, 2014)

Let $\beta > 0$ and set $L = (n_1 \vee n_2)^{\beta}$. If $C_D$ satisfies $C_D \geq \max_{X \in \mathcal{X}} \max_{i,j} D(p_{X^*_{i,j}} \| p_{X_{i,j}})$, then for any $\lambda \geq 2 (\beta + 2) \left(1 + \frac{2 C_D}{3}\right) \log(n_1 \vee n_2)$, the sparsity-penalized ML estimate

$$\widehat{X} = \arg\min_{X = DA \in \mathcal{X}} \left\{ -\log p_{X_S}(Y_S) + \lambda \cdot \|A\|_0 \right\}$$

satisfies the (normalized, per-element) error bound

$$\frac{\mathbb{E}_{S, Y_S}\left[ -2 \log A(p_{\widehat{X}}, p_{X^*}) \right]}{n_1 n_2} \leq \frac{8 C_D \log m}{m} + 3 \min_{X = DA \in \mathcal{X}} \left\{ \frac{D(p_{X^*} \| p_X)}{n_1 n_2} + \left( \lambda + \frac{4 C_D (\beta + 2) \log(n_1 \vee n_2)}{3} \right) \left( \frac{n_1 r + \|A\|_0}{m} \right) \right\}.$$

Here:

$A(p_X, p_{X^*}) \triangleq \prod_{i,j} A(p_{X_{i,j}}, p_{X^*_{i,j}})$, where $A(p_{X_{i,j}}, p_{X^*_{i,j}}) \triangleq \mathbb{E}_{p_{X^*_{i,j}}}\left[ \sqrt{p_{X_{i,j}} / p_{X^*_{i,j}}} \right]$ is the Hellinger affinity;

$D(p_{X^*} \| p_X) \triangleq \sum_{i,j} D(p_{X^*_{i,j}} \| p_{X_{i,j}})$, where $D(p_{X^*_{i,j}} \| p_{X_{i,j}}) \triangleq \mathbb{E}_{p_{X^*_{i,j}}}\left[ \log(p_{X^*_{i,j}} / p_{X_{i,j}}) \right]$ is the KL divergence.

Next, we instantiate this result for some specific cases (using a specific choice of $\beta$, $\lambda$).

Page 15

Additive White Gaussian Noise Model

Suppose each observation is corrupted by zero-mean AWGN with known variance $\sigma^2$, so that

$$p_{X^*_S}(Y_S) = \frac{1}{(2\pi\sigma^2)^{|S|/2}} \exp\left( -\frac{1}{2\sigma^2} \sum_{(i,j) \in S} (Y_{i,j} - X^*_{i,j})^2 \right).$$

Let $\mathcal{X} = \mathcal{X}'$, essentially (a discretization of) a set of rank- and max-norm-constrained matrices.

Gaussian Noise (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\left( (\sigma^2 + X_{\max}^2) \left( \frac{n_1 r + \|A^*\|_0}{m} \right) \log(n_1 \vee n_2) \right).$$

Page 16

AWGN – Our Result in Context

Gaussian Noise (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\left( (\sigma^2 + X_{\max}^2) \left( \frac{n_1 r + \|A^*\|_0}{m} \right) \log(n_1 \vee n_2) \right).$$

Compare with the result of (Koltchinskii et al., 2011): when $X^*$ is max-norm- and rank-constrained, nuclear-norm penalized optimization yields an estimate satisfying

$$\frac{\|X^* - \widehat{X}\|_F^2}{n_1 n_2} = O\left( (\sigma^2 + X_{\max}^2) \left( \frac{(n_1 + n_2) r}{m} \right) \log(n_1 \vee n_2) \right)$$

with high probability.

Note: Our guarantees can have improved error performance in the case where $\|A^*\|_0 \ll n_2 r$. The two bounds coincide when $A^*$ is not sparse (take $\|A^*\|_0 = n_2 r$ in our error bounds).
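
As a rough, back-of-the-envelope illustration of this comparison (my own; constants and the common noise and log factors are ignored), the two sampling-complexity factors can be evaluated for the dimensions of the synthetic experiment later in the talk:

```python
n1, n2, r = 100, 1000, 20       # dimensions from the synthetic experiment
A_star_nnz = 4 * n2             # ||A*||_0 with 4 nonzeros per column
m = int(0.4 * n1 * n2)          # roughly 40% sampling rate

sparse_factor_rate = (n1 * r + A_star_nnz) / m     # factor in our bound
low_rank_rate = (n1 + n2) * r / m                  # factor in the low-rank bound

print(f"sparse-factor rate ~ {sparse_factor_rate:.2f}")   # ~0.15
print(f"low-rank rate      ~ {low_rank_rate:.2f}")        # ~0.55
```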

Page 17

AWGN Model (Extension to Approximately Sparse Factor Model)

Recall: For $p \leq 1$, a vector $x \in \mathbb{R}^n$ is said to belong to a weak-$\ell_p$ ball of radius $R > 0$, denoted $x \in w\ell_p(R)$, if its ordered elements $|x_{(1)}| \geq |x_{(2)}| \geq \cdots \geq |x_{(n)}|$ satisfy

$$|x_{(i)}| \leq R\, i^{-1/p} \quad \text{for all } i \in \{1, 2, \ldots, n\}.$$

With this, we can state a variant of the above for the case where the columns of $A^*$ are approximately sparse.

Gaussian Noise (Approximately Sparse Factor Model)

Consider the same Gaussian noise model described above. If for some $p \leq 1$ all columns of $A^*$ belong to a weak-$\ell_p$ ball of radius $A_{\max}$, then for $\alpha = 1/p - 1/2$,

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\left( A_{\max}^2 \left( \frac{n_2}{m} \right)^{\frac{2\alpha}{2\alpha + 1}} + (\sigma^2 + X_{\max}^2) \left( \frac{n_1 r}{m} + \left( \frac{n_2}{m} \right)^{\frac{2\alpha}{2\alpha + 1}} \right) \log(n_1 \vee n_2) \right).$$

Note: $\left( \frac{n_2}{m} \right)^{\frac{2\alpha}{2\alpha + 1}} \leq n_2\, m^{-\frac{2\alpha}{2\alpha + 1}}$ $\Leftarrow$ aggregate error of estimating $n_2$ vectors in $w\ell_p$ from noisy observations.
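
For concreteness, here is a small sketch (my own; the function name and the NumPy-based check are assumptions) that tests the weak-$\ell_p$ ball condition above for a given vector:

```python
import numpy as np

def in_weak_lp_ball(x, p, R):
    """Check whether x lies in the weak-l_p ball of radius R,
    i.e. the i-th largest magnitude is at most R * i**(-1/p)."""
    mags = np.sort(np.abs(x))[::-1]              # |x_(1)| >= |x_(2)| >= ...
    i = np.arange(1, x.size + 1)
    return bool(np.all(mags <= R * i ** (-1.0 / p)))

# Example: a geometrically decaying vector satisfies the condition for p = 1, R = 2
x = 2.0 * 0.5 ** np.arange(10)
print(in_weak_lp_ball(x, p=1.0, R=2.0))          # True
```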

Page 19

Additive Laplace Noise Model

Suppose each observation is corrupted by additive Laplace noise with known parameter $\tau > 0$, so that

$$p_{X^*_S}(Y_S) = \left( \frac{\tau}{2} \right)^{|S|} \exp\left( -\tau \sum_{(i,j) \in S} |Y_{i,j} - X^*_{i,j}| \right).$$

Let $\mathcal{X} = \mathcal{X}'$, essentially (a discretization of) a set of rank- and max-norm-constrained matrices.

Laplace Noise (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\Bigg( \underbrace{\left( \frac{1}{\tau^2} + X_{\max}^2 \right)}_{O(\text{variance} \,+\, X_{\max}^2)} \; \tau X_{\max} \underbrace{\left( \frac{n_1 r + \|A^*\|_0}{m} \right)}_{\substack{\text{"parametric-like" form, similar to the} \\ \text{sparse-model Gaussian-noise case}}} \log(n_1 \vee n_2) \Bigg).$$

Can also obtain results for the approximately sparse case here, analogously to above...

Page 20

Poisson-distributed Observations

Suppose that each element of $X^*$ satisfies $X^*_{i,j} \geq X_{\min}$ for some $X_{\min} > 0$, and that each observation is Poisson-distributed, so that $Y_S \in \mathbb{N}^{|S|}$ and

$$p_{X^*_S}(Y_S) = \prod_{(i,j) \in S} \frac{(X^*_{i,j})^{Y_{i,j}}\, e^{-X^*_{i,j}}}{(Y_{i,j})!}.$$

Let $\mathcal{X} = \{ X \in \mathcal{X}' : X_{i,j} \geq 0 \text{ for all } (i,j) \in \{1, \ldots, n_1\} \times \{1, \ldots, n_2\} \}$ (to allow only non-negative rate estimates).

Poisson-distributed Observations (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\Bigg( \underbrace{\left( X_{\max} + X_{\max}^2\, \frac{X_{\max}}{X_{\min}} \right)}_{\substack{O(\text{worst-case variance} \,+\, X_{\max}^2) \\ \text{when } X_{\max}/X_{\min} = O(1)}} \left( \frac{n_1 r + \|A^*\|_0}{m} \right) \log(n_1 \vee n_2) \Bigg).$$

Can also obtain results for the approximately sparse case here, analogously to above...

Page 21

One-bit Observations

Let $F : \mathbb{R} \to [0, 1]$ be a differentiable link function with $f(t) = \frac{d}{dt} F(t)$. Suppose each observation $Y_{i,j}$ for $(i,j) \in S$ is Bernoulli$(F(X^*_{i,j}))$-distributed, so that

$$p_{X^*_S}(Y_S) = \prod_{(i,j) \in S} \left[ F(X^*_{i,j}) \right]^{Y_{i,j}} \left[ 1 - F(X^*_{i,j}) \right]^{1 - Y_{i,j}}.$$

Assume $F(X_{\max}) < 1$, $F(-X_{\max}) > 0$, and $\inf_{|t| \leq X_{\max}} f(t) > 0$.

One-bit Observations (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\left( \left( \frac{c_{F, X_{\max}}}{c'_{F, X_{\max}}} \right) \left( \frac{1}{c_{F, X_{\max}}} + X_{\max}^2 \right) \left( \frac{n_1 r + \|A^*\|_0}{m} \right) \log(n_1 \vee n_2) \right),$$

where

$$c_{F, X_{\max}} \triangleq \left( \sup_{|t| \leq X_{\max}} \frac{1}{F(t)(1 - F(t))} \right) \cdot \left( \sup_{|t| \leq X_{\max}} f^2(t) \right), \qquad c'_{F, X_{\max}} \triangleq \inf_{|t| \leq X_{\max}} \frac{f^2(t)}{F(t)(1 - F(t))}.$$

Can also obtain results for the approximately sparse case here, analogously to above...
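
As a quick numerical illustration (my own, not in the slides), the constants $c_{F, X_{\max}}$ and $c'_{F, X_{\max}}$ can be evaluated for a logistic link $F(t) = 1/(1 + e^{-t})$, for which $f(t) = F(t)(1 - F(t))$ and the relevant suprema and infima over $|t| \leq X_{\max}$ are attained at $t = 0$ or at the endpoints:

```python
import numpy as np

def logistic_link_constants(x_max):
    """Evaluate c_{F,Xmax} and c'_{F,Xmax} for the logistic link F(t) = 1/(1+exp(-t)).
    Here f(t) = F(t)(1-F(t)), which is maximized at t = 0 and minimized on
    [-x_max, x_max] at the endpoints t = +/- x_max."""
    F = lambda t: 1.0 / (1.0 + np.exp(-t))
    f = lambda t: F(t) * (1.0 - F(t))             # derivative of the logistic link
    c = (1.0 / f(x_max)) * f(0.0) ** 2            # sup 1/(F(1-F)) times sup f^2
    c_prime = f(x_max)                            # f^2/(F(1-F)) = f, minimized at the endpoints
    return c, c_prime

print(logistic_link_constants(1.0))
```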

Page 22

Comparisons to "One-bit Matrix Completion"

One-bit Observations (Exact Sparse Factor Model)

If $A^*$ is exactly sparse with $\|A^*\|_0$ nonzero elements, the sparsity-penalized ML estimate satisfies

$$\frac{\mathbb{E}_{S, Y_S}\left[ \|X^* - \widehat{X}\|_F^2 \right]}{n_1 n_2} = O\left( \left( \frac{c_{F, X_{\max}}}{c'_{F, X_{\max}}} \right) \left( \frac{1}{c_{F, X_{\max}}} + X_{\max}^2 \right) \left( \frac{n_1 r + \|A^*\|_0}{m} \right) \log(n_1 \vee n_2) \right).$$

Compare with the low-rank recovery result of (Davenport et al., 2012): maximum likelihood optimization over a set of max-norm- and nuclear-norm-constrained candidates yields an estimate satisfying

$$\frac{\|X^* - \widehat{X}\|_F^2}{n_1 n_2} = O\left( C_{F, X_{\max}}\, X_{\max} \sqrt{\frac{(n_1 + n_2) r}{m}} \right)$$

with high probability, where $C_{F, X_{\max}}$ is analogous to the $(c_{F, X_{\max}} / c'_{F, X_{\max}})$ factor in our bounds.

Extra loss of $X_{\max} \log(n_1 \vee n_2)$ in our bound, but faster "parametric-like" dependence on $m$ (in addition to the "sparse factor" improvement). Lower bounds for the "sparse factor" model are still open (we think!).

Page 23

Section 4

Algorithmic Approach

Page 24

A Non-Convex Problem...

Our optimizations take the general form

$$\min_{D \in \mathbb{R}^{n_1 \times r},\, A \in \mathbb{R}^{r \times n_2},\, X \in \mathbb{R}^{n_1 \times n_2}} \ \sum_{i,j} -s_{i,j} \log p_{X_{i,j}}(Y_{i,j}) + I_{\mathcal{X}}(X) + I_{\mathcal{D}}(D) + I_{\mathcal{A}}(A) + \lambda \|A\|_0 \quad \text{s.t.} \quad X = DA,$$

where $s_{i,j} = 1$ if $(i,j) \in S$ (and $0$ otherwise) and $I_{\mathcal{X}}(\cdot)$, $I_{\mathcal{D}}(\cdot)$, $I_{\mathcal{A}}(\cdot)$ are indicator functions.

Multiple sources of non-convexity:

• the $\ell_0$ regularizer

• the discretized sets $\mathcal{D}$ and $\mathcal{A}$

• the inherent bilinearity of the model!

We propose an approach based on the Alternating Direction Method of Multipliers (ADMM).

Page 27

A General-Purpose ADMM-based Approach

We form the augmented Lagrangian

$$\mathcal{L}(D, A, X, \Lambda) = -\sum_{i,j} s_{i,j} \log p_{X_{i,j}}(Y_{i,j}) + I_{\mathcal{X}}(X) + I_{\mathcal{D}}(D) + I_{\mathcal{A}}(A) + \lambda \|A\|_0 + \mathrm{tr}\left( \Lambda^\top (X - DA) \right) + \frac{\rho}{2} \|X - DA\|_F^2,$$

where $\Lambda$ is the Lagrange multiplier for the equality constraint and $\rho > 0$ is a parameter, and solve:

(S1) $\quad X^{k+1} := \arg\min_{X \in \mathbb{R}^{n_1 \times n_2}} \mathcal{L}(D^k, A^k, X, \Lambda^k)$

(S2) $\quad A^{k+1} := \arg\min_{A \in \mathbb{R}^{r \times n_2}} \mathcal{L}(D^k, A, X^{k+1}, \Lambda^k)$

(S3) $\quad D^{k+1} := \arg\min_{D \in \mathbb{R}^{n_1 \times r}} \mathcal{L}(D, A^{k+1}, X^{k+1}, \Lambda^k)$

(S4) $\quad \Lambda^{k+1} = \Lambda^k + \rho (X^{k+1} - D^{k+1} A^{k+1})$.
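
A minimal, self-contained sketch of the resulting loop, assuming a Gaussian likelihood (so that S1 has the closed form shown on the next slide), with simple box projections standing in for the indicator functions and a single hard-thresholding pass for S2; the function name, defaults, and these simplifications are my own, not the authors' implementation:

```python
import numpy as np

def admm_sparse_factor_gaussian(Y, mask, r, lam, sigma2, rho=1.0,
                                A_max=1.0, X_max=1.0, n_iters=200, seed=0):
    """Minimal ADMM sketch for the sparse-factor completion objective with a
    Gaussian likelihood. Steps S1-S4 follow the slide above, but each
    subproblem gets only a single simplified pass per outer iteration."""
    rng = np.random.default_rng(seed)
    n1, n2 = Y.shape
    D = rng.uniform(-1.0, 1.0, size=(n1, r))
    A = rng.uniform(-A_max, A_max, size=(r, n2))
    Lam = np.zeros((n1, n2))
    s = mask.astype(float)                        # s_ij = 1 iff (i,j) was observed
    Yf = np.nan_to_num(Y)                         # unobserved entries never enter the fit

    for _ in range(n_iters):
        # S1: elementwise prox of the Gaussian negative log-likelihood (closed form)
        Z = D @ A - Lam / rho
        X = (s * Yf / sigma2 + rho * Z) / (s / sigma2 + rho)
        X = np.clip(X, -X_max, X_max)             # box projection standing in for I_X

        # S2: one hard-thresholding (majorization-minimization) pass for A
        T = X + Lam / rho
        step = 1.0 / max(np.linalg.norm(D, 2) ** 2, 1e-12)
        A = A + step * (D.T @ (T - D @ A))        # gradient step on (rho/2)||T - DA||_F^2
        A[A ** 2 < 2.0 * lam * step / rho] = 0.0  # hard threshold (prox of the l0 term)
        A = np.clip(A, -A_max, A_max)

        # S3: least-squares update of D, projected onto the box [-1, 1]
        D = np.clip(T @ np.linalg.pinv(A), -1.0, 1.0)

        # S4: dual ascent on the coupling constraint X = DA
        Lam = Lam + rho * (X - D @ A)

    return D, A, D @ A
```

With Y and mask produced as in the earlier observation-model sketch, a call such as admm_sparse_factor_gaussian(Y, mask, r=20, lam=0.1, sigma2=0.0625) returns factor estimates and the completed matrix D @ A; these parameter values are illustrative only.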

Page 28

Efficiently Solvable Subproblems

We relax $\mathcal{D}$, $\mathcal{A}$, $\mathcal{X}$ to closed convex sets, and solve S1-S4 iteratively, as follows...

Step S1: After simplification, the solution can be written in terms of scalar prox functions:

$$X^{k+1}_{i,j} = \arg\min_{X_{i,j} \in \mathbb{R}} -s_{i,j} \log p_{X_{i,j}}(Y_{i,j}) + I_{\mathcal{X}}(X_{i,j}) + \frac{\rho}{2} \left( X_{i,j} - (D^k A^k)_{i,j} + (\Lambda^k)_{i,j}/\rho \right)^2 \triangleq \mathrm{prox}_{-s_{i,j} \log p_{\cdot}(Y_{i,j}) + I_{\mathcal{X}}(\cdot)} \left( (D^k A^k)_{i,j} - (\Lambda^k)_{i,j}/\rho \right).$$

(Closed form for three of our examples; use Newton's Method for the one-bit model with a probit or logit link.)

Step S2: The subproblem takes the form

$$\min_{A \in \mathbb{R}^{r \times n_2}} \ I_{\mathcal{A}}(A) + \lambda \|A\|_0 + \frac{\rho}{2} \|X^{k+1} - D^k A + \Lambda^k / \rho\|_F^2.$$

(Solved via "majorization-minimization"; Iterative Hard Thresholding (Blumensath & Davies, 2008).)

Step S3: The subproblem takes the form

$$\min_{D \in \mathbb{R}^{n_1 \times r}} \ I_{\mathcal{D}}(D) + \frac{\rho}{2} \|X^{k+1} - D A^{k+1} + \Lambda^k / \rho\|_F^2.$$

(Efficiently solved via Newton's Method or in closed form.)
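
For the one-bit model with a logit link, the scalar prox in Step S1 has no closed form; a minimal sketch of the Newton iteration alluded to above (my own; the function name, fixed iteration count, and initialization are assumptions) might look like:

```python
import numpy as np

def prox_onebit_logit(y, s, v, rho, x_max, n_newton=20):
    """Scalar prox for Step S1 under the one-bit (Bernoulli-logit) model:
    minimize over x:  -s * [y*log F(x) + (1-y)*log(1-F(x))] + (rho/2)*(x - v)**2,
    with F(x) = 1/(1+exp(-x)), followed by projection onto [-x_max, x_max].
    Here v plays the role of (D^k A^k)_{ij} - (Lambda^k)_{ij}/rho from the slide above."""
    x = v
    for _ in range(n_newton):
        F = 1.0 / (1.0 + np.exp(-x))
        grad = -s * (y - F) + rho * (x - v)       # derivative of the objective
        hess = s * F * (1.0 - F) + rho            # strictly positive, so Newton is safe
        x -= grad / hess
    return float(np.clip(x, -x_max, x_max))
```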

Page 31

Section 5

Experimental Results

Page 32

A Comparison with Synthetic Data

Preliminary Experimental Results: We evaluated each of these methods on matrices of size $100 \times 1000$ with $r = 20$ and 4 nonzero elements per column of $A^*$, for varying sampling rates (and different likelihood models). For each, we evaluated the average (over 5 trials) normalized reconstruction error as a function of the sampling rate.

Gaussian and Laplace noises have the same variances.

For sampling rates $\gamma > 10^{-0.4} \approx 40\%$, the error exhibits the predicted decay (slope of $\approx -1$ on the log-log scale).

[Figure: average normalized reconstruction error, $\log_{10}\!\big(\mathbb{E}\|\widehat{X} - X^*\|_F^2 / (n_1 n_2)\big)$, versus $\log_{10}(\gamma)$, for the Gaussian, Laplace, and Poisson observation models.]

Page 33

Imaging Example – Gaussian Noise

Original $512 \times 512$ image reshaped into a $256 \times 1024$ matrix ($0.005 \leq X^*_{i,j} \leq 1.05$ for all $i, j$)

Inner dimension r = 25, noise standard deviation: σ = 0.01, sampling rate = 50%

[Figure panels: Original Image, Samples, Estimated Image, Estimated A]

Page 34

Imaging Example – Laplace Noise

Original $512 \times 512$ image reshaped into a $256 \times 1024$ matrix ($0.005 \leq X^*_{i,j} \leq 1.05$ for all $i, j$)

Inner dimension $r = 25$, noise standard deviation $\sqrt{2}/\tau = 0.01$, sampling rate = 50%

[Figure panels: Original Image, Samples, Estimated Image, Estimated A]

Page 35

Imaging Example – Poisson-distributed Observations

Original $512 \times 512$ image reshaped into a $256 \times 1024$ matrix ($0.005 \leq X^*_{i,j} \leq 1.05$ for all $i, j$)

Inner dimension r = 25, sampling rate = 50%

[Figure panels: Original Image, Samples, Estimated Image, Estimated A]

Page 36

Imaging Example – One-bit Observations

Original $512 \times 512$ image reshaped into a $256 \times 1024$ matrix ($0.005 \leq X^*_{i,j} \leq 1.05$ for all $i, j$)

Inner dimension r = 25, sampling rate = 50%

[Figure panels: Original Image, Samples, Estimated Image, Estimated A]

Page 37

Section 6

Acknowledgments

Page 38

Acknowledgments

Collaborators/Co-authors:

Akshay Soni (UMN ECE PhD Student), Swayambhoo Jain (UMN ECE PhD Student), and Prof. Stefano Gonella (UMN Civil Engr.)

Research Support: NSF EARS (Enhancing Access to the Radio Spectrum) Program; DARPA Young Faculty Award

[email protected]

www.ece.umn.edu/~jdhaupt

(Special thanks to Prof. Julian Wolfson, UMN Dept. of Biostatistics, for the Beamer Template!)
