Bayesian Robust Principal Component Analysis
Presenter: Raghu Ranganathan
ECE / CMR
Tennessee Technological University
January 21, 2011
Reading Group
(Xinghao Ding, Lihan He, and Lawrence Carin)
Paper contribution
■ The problem of matrix decomposition into low-rank and sparse components is considered employing a hierarchical approach
■ The matrix is assumed noisy, with unknown and possibly non-stationary noise statistics
■ The Bayesian framework approximately infers the noise statistics in addition to the low-rank and sparse outlier contributions
■ The model proposed is robust to a broad range of noise levels without having to change the hyper-parameter settings
■ In addition, a Markov dependency between successive rows of the matrix is inferred by the Bayesian model to exploit additional structure in the observed matrix, particularly in video applications
Introduction
■ Most high-dimensional data such as images, biological data, and social network data (Netflix data) reside in a low-dimensional subspace or low-dimensional manifold
Noise models
■ In low-rank matrix representations, two types of noise models are usually considered
■ The first causes small-scale perturbations to all matrix elements, e.g. i.i.d. Gaussian noise added to each element.
■ In this case, if the noise energy is small compared to the dominant singular values of the SVD, it does not significantly affect the principal vectors
■ The second is sparse noise of arbitrary magnitude impacting a small subset of matrix elements; for example, a moving object against a static background in video manifests such sparse noise
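The contrast between the two noise types can be illustrated numerically; the matrix size, rank, and noise levels below are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a rank-2 matrix of size 50 x 40.
m, n, r = 50, 40, 2
L = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Noise type 1: small dense perturbations on every entry.
E_dense = 0.01 * rng.standard_normal((m, n))

# Noise type 2: sparse outliers of arbitrary magnitude on ~5% of entries.
S = np.zeros((m, n))
mask = rng.random((m, n)) < 0.05
S[mask] = 10.0 * rng.standard_normal(mask.sum())

sv_clean = np.linalg.svd(L, compute_uv=False)
sv_dense = np.linalg.svd(L + E_dense, compute_uv=False)
sv_sparse = np.linalg.svd(L + S, compute_uv=False)

# By Weyl's inequality, the dense perturbation shifts each singular
# value by at most its (small) spectral norm; the sparse outliers are
# not small in norm and can disturb the spectrum substantially.
print(sv_clean[:3], sv_dense[:3], sv_sparse[:3])
```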
Convex optimization approach
Bayesian approach
■ The observation matrix is considered to be of the form Y = L (low-rank) + S (sparse) + E (noise), with both sparse noise S and dense noise E present.
■ In the proposed Bayesian model, the noise statistics of E are approximately learned, along with S and L.
■ The proposed model is robust to a broad range of noise variances
■ The Bayesian model infers approximations to the posterior distributions of the model parameters, obtaining approximate probability distributions for L, S, and E
■ An advantage of the Bayesian model is that prior knowledge is employed in the inference
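A minimal sketch of the assumed generative form Y = L + S + E; the dimensions, sparsity rate, and noise precision are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 60, 40, 3

# Low-rank term L = U V^T.
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
L = U @ V.T

# Sparse term S: a few entries with large magnitude.
S = np.zeros((m, n))
idx = rng.random((m, n)) < 0.03
S[idx] = 8.0 * rng.standard_normal(idx.sum())

# Dense noise E with precision gamma (variance 1/gamma); in the
# Bayesian model this precision is unknown and inferred from Y.
gamma = 100.0
E = rng.normal(0.0, gamma ** -0.5, (m, n))

Y = L + S + E  # the observed matrix
```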
Bayesian approach
■ The Bayesian framework exploits the anticipated structure in the sparse component.
■ In video analysis, it is desired to separate the spatially localized moving objects (sparse component) from the static or quasi-static background (low-rank component) in the presence of frame-dependent additive noise E.
■ The correlation between the sparse components of the video from frame to frame (column to column in the matrix) has to be considered
■ In this paper, a Markov dependency in time and space is assumed between the sparse components of consecutive matrix columns
■ This structure is incorporated into the Bayesian framework, with the Markov parameters inferred through the observed matrix
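The assumed frame-to-frame correlation of the sparse support can be sketched as a two-state Markov chain; the transition probabilities below are illustrative values, not probabilities inferred by the model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two-state Markov chain on the sparsity indicator z_t of one pixel
# across frames: an active (foreground) pixel tends to stay active,
# and a background pixel tends to stay background.
p01 = 0.05  # P(z_t = 1 | z_{t-1} = 0), illustrative
p11 = 0.90  # P(z_t = 1 | z_{t-1} = 1), illustrative

T = 200
z = np.zeros(T, dtype=int)
for t in range(1, T):
    p = p11 if z[t - 1] == 1 else p01
    z[t] = int(rng.random() < p)

# The chain yields temporally correlated runs of activity, unlike
# independent Bernoulli draws with the same marginal rate.
print(z[:40])
```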
Bayesian Robust PCA
■ The work in this paper is closely related to the low-rank matrix completion problem where we try to approximate a matrix (with noisy entries) by a low-rank matrix and to predict the missing entries
■ When the matrix Y = L + S + E has entries missing at random, the proposed model can also estimate the missing entries (in terms of the low-rank term L)
■ The S term is defined as a sparse set of matrix entries; the locations of its nonzero entries must be inferred while estimating the values of L, S, and E
■ Typically, in Bayesian inference, a sparseness promoting prior is imposed on the desired signal, and the posterior distribution of the sparse signal is inferred.
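A draw from a beta-Bernoulli sparseness-promoting prior can be sketched as follows; the hyperparameters are illustrative, and the point is that it produces exactly-zero entries:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Beta-Bernoulli sparseness prior: pi ~ Beta(a, b) with small mean
# a / (a + b), then indicator z_i ~ Bernoulli(pi). Entries with
# z_i = 0 are exactly zero, unlike Laplacian shrinkage, which only
# pushes many entries close to zero.
a, b = 1.0, 99.0                       # illustrative hyperparameters
pi = rng.beta(a, b)
z = rng.random(n) < pi                 # support indicators
values = rng.standard_normal(n)        # magnitudes where active
s = np.where(z, values, 0.0)           # one draw of the sparse signal

print(np.count_nonzero(s), "nonzero entries out of", n)
```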
Bayesian Low-rank and Sparse Model
C. Noise component
■ The measurement noise is drawn i.i.d. from a Gaussian distribution, and the noise affects all measurements
■ The noise variance is assumed unknown, and is learned within the model inference. Mathematically, each noise entry is modeled as E_mn ~ N(0, γ⁻¹), with a gamma prior placed on the precision γ
■ The model can learn different noise variances for different parts of E, i.e. each column of Y (each frame) in general has its own noise level. The noise structure is then modified to E_mn ~ N(0, γ_n⁻¹), with a separate precision γ_n for each column n
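The idea that each frame carries its own noise level can be illustrated with a simple per-column moment estimate; the model itself infers the precisions jointly with L and S, and the values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 500, 8

# Non-stationary noise: each column (frame) has its own std deviation.
sigmas = np.array([0.05, 0.05, 0.2, 0.2, 0.5, 0.5, 1.0, 1.0])
E = rng.standard_normal((m, n)) * sigmas

# A per-column moment estimate recovers the noise levels, mimicking
# what the model learns via per-column precisions gamma_n.
est = E.std(axis=0)
print(np.round(est, 2))
```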
Relation to the optimization based approach
■ In the Bayesian model, the noise variance is not required to be known a priori; the model learns it during inference
■ For the low-rank component, instead of the nuclear-norm constraint ||L||_* (an ℓ1 constraint on the singular values), the Gaussian prior together with the beta-Bernoulli distribution is used to impose low rank
■ For the sparse component, instead of the ℓ1 constraint ||S||_1, an ℓ2 constraint together with the beta-Bernoulli distribution is employed to enforce sparsity
■ Compared to the Laplacian prior (which gives many small entries close to 0), the beta-Bernoulli prior yields exactly zero values
■ In Bayesian learning, numerical methods are used to estimate the distributions of the unknown parameters, whereas the optimization-based approach seeks a single point minimizing a function similar to the negative log-likelihood −log p(Y | ·, H)
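For comparison, the optimization-based counterpart (Principal Component Pursuit) can be sketched with a basic ADMM iteration; this is a generic textbook sketch, not the authors' implementation, and the defaults for lam and mu are common heuristics:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Soft thresholding: prox of tau * l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_admm(Y, lam=None, mu=None, n_iter=300):
    """Principal Component Pursuit:
    min ||L||_* + lam * ||S||_1  subject to  Y = L + S."""
    m, n = Y.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = 0.25 * m * n / np.abs(Y).sum() if mu is None else mu
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    Z = np.zeros_like(Y)  # dual variable for the constraint Y = L + S
    for _ in range(n_iter):
        L = svt(Y - S + Z / mu, 1.0 / mu)
        S = shrink(Y - L + Z / mu, lam / mu)
        Z = Z + mu * (Y - L - S)
    return L, S
```

Unlike the Bayesian model, this returns point estimates only, and the noise variance is not modeled at all: every entry of Y must be explained by L or S.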
Markov dependency of Sparse Term in Time and Space
Posterior inference
Experimental results
B. Video example
■ The application of video surveillance with a fixed camera is considered
■ The objective is to reconstruct a near-static background and the moving foreground from a video sequence
■ The data are organized such that column m of Y is constructed by concatenating all pixels of frame m of a grayscale video sequence
■ The background is modeled as the low-rank component, and the moving foreground as the sparse component.
■ The rank r is usually small for a static background, and the sparse components across frames (columns of Y) are strongly correlated, modeled by a Markov dependency
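The column-per-frame arrangement described above can be sketched as follows; the frame count and resolution are illustrative:

```python
import numpy as np

# Illustrative shapes: 30 grayscale frames of 48 x 64 pixels.
n_frames, h, w = 30, 48, 64
video = np.random.default_rng(5).random((n_frames, h, w))

# Column m of Y holds all pixels of frame m (raster order), so the
# static background spans a low-rank part of Y and the moving
# foreground a sparse, frame-to-frame correlated part.
Y = video.reshape(n_frames, h * w).T   # shape (h * w, n_frames)

# Any frame is recovered by reshaping its column back to an image.
frame0 = Y[:, 0].reshape(h, w)
```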
Conclusions
■ The authors have developed a new robust Bayesian PCA framework for analysis of matrices with sparsely distributed noise of arbitrary magnitude
■ The Bayesian approach is found to be robust to densely distributed noise, and the noise statistics may be inferred based on the data, with no tuning of hyperparameters
■ In addition, the model allows the noise statistics to vary from frame to frame, while the Markov structure captures the frame-to-frame correlation of the sparse component
■ Future research directions would involve a moving camera, for which the background would reside in a low-dimensional manifold rather than a low-dimensional linear subspace
■ The Bayesian framework may be extended to infer the properties of the low-dimensional manifold