Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix...
Transcript of Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix...
![Page 1: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/1.jpg)
Non-negative Matrix Factorization via Alternating Updates
Yingyu Liang, Princeton University
Simons workshop, Berkeley, Nov 2016
Andrej RisteskiYuanzhi Li
Joint work with
![Page 2: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/2.jpg)
Non-convex Problems in ML
Dictionary learningMatrix completion
Deep learning
![Page 3: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/3.jpg)
Non-convex Problems in ML
Dictionary learningMatrix completion
Deep learning
NP-hard to solve in the worst case
![Page 4: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/4.jpg)
Non-convex Problems in ML
• In practice, often solved by “local improvement”
• Gradient descent and variants
![Page 5: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/5.jpg)
Non-convex Problems in ML
• In practice, often solved by “local improvement”
• Gradient descent and variants
• Alternating update
![Page 6: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/6.jpg)
Non-convex Problems in ML
• In practice, often solved by “local improvement”
• Gradient descent and variants
• Alternating update
Goal: provable guarantees of simple algos under natural assumptions
When and why do such simple algos work for the hard problems?
![Page 7: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/7.jpg)
Non-convex Problems in ML
• In practice, often solved by “local improvement”
• Gradient descent and variants
• Alternating update
Goal: provable guarantees of simple algos under natural assumptions
When and why do such simple algos work for the hard problems?
This work: alternating update for Non-negative Matrix Factorization
![Page 8: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/8.jpg)
Non-negative Matrix Factorization (NMF)
• Given: matrix
• Find: non-negative matrices
![Page 9: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/9.jpg)
NMF in ML Applications
Basic tool in machine learning
• Topic modeling [Blei-Ng-Jordan03, Arora-Ge-Kannan-Moitra12, Arora-Ge-Moitra12,…]
0.05
0.04
0.08
0.05 0.05 0.05
0.2
0.3
0.1
0.2
0.8
the
physics
soccer
……
……
Doc1 Doc2 Doc3 ……
rain
Weather Sport Science ……
the
soccer
……
……
physics
rain
Doc1 Doc2 Doc3 ……
![Page 10: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/10.jpg)
NMF in ML Applications
Basic tool in machine learning
• Topic modeling [Blei-Ng-Jordan03, Arora-Ge-Kannan-Moitra12, Arora-Ge-Moitra12,…]
• Computer vision [Lee-Seung97, Lee-Seung99, Buchsbaum-Bloch02,…]
![Page 11: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/11.jpg)
NMF in ML Applications
Basic tool in machine learning
• Topic modeling [Blei-Ng-Jordan03, Arora-Ge-Kannan-Moitra12, Arora-Ge-Moitra12,…]
• Computer vision [Lee-Seung97, Lee-Seung99, Buchsbaum-Bloch02,…]
• Many others: network analysis, information retrieval, ……
![Page 12: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/12.jpg)
Worst Case v.s. Practical Heuristic
Worst case analysis [Arora-Ge-Kannan-Moitra12]
• Upper bound:
• Lower bound: no algo, assuming ETH
Alternating updates: typical heuristic, suggested by Lee-Seung
![Page 13: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/13.jpg)
Analyzing Non-convex Problems
• Generative model: the input data is generated from (a distribution defined by) a ground-truth solution
• Warm start: good initialization not far away from the ground-truth
Warm start
Update
![Page 14: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/14.jpg)
Beyond Worst Case for NMF
• Separability-based assumptions [Arora-Ge-Kannan-Moitra12]
• Motivated by topic modeling: each column of (topic) has an anchor word
• Lots of subsequent work [Arora-Ge-Moitra12, Arora-Ge-Halpern-Mimno-Moitra-Sontag-
Wu-Zhu12, Gillis-Vavasis14, Ge-Zhou15, Bhattacharyya-Goyal-Kannan-Pani16, …]
0.05 0.05 0.05
0.2 0 0
0.3
0.1
Weather Sport Science
the
soccer
……
physics
rain
![Page 15: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/15.jpg)
Beyond Worst Case for NMF
• Separability-based assumptions [Arora-Ge-Kannan-Moitra12]
• Motivated by topic modeling: each column of (topic) has an anchor word
• Lots of subsequent work [Arora-Ge-Moitra12, Arora-Ge-Halpern-Mimno-Moitra-Sontag-
Wu-Zhu12, Ge-Zhou15, Bhattacharyya-Goyal-Kannan-Pani16,…]
• Variational inference [Awasthi-Risteski15]
• Alternating update method on objective
• Requires relatively strong assumptions on , and/or a warm start depending on its dynamic range (not realistic)
![Page 16: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/16.jpg)
Outline
• Introduction
• Our model, algorithm and main results
• Analysis of the algorithm
![Page 17: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/17.jpg)
Generative Model
Each column of is i.i.d. example from
![Page 18: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/18.jpg)
Generative Model
Each column of is i.i.d. example from
• (A1): columns of are linearly independent
![Page 19: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/19.jpg)
Generative Model
Each column of is i.i.d. example from
• (A1): columns of are linearly independent
• (A2): are independent random variables
where is a parameter
![Page 20: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/20.jpg)
Our Algorithm
Parameters:
0
Known as Rectified Linear Units (ReLU) in Deep Learning
1
![Page 21: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/21.jpg)
Our Algorithm
Parameters:
are two independent examples, and are their decodings
![Page 22: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/22.jpg)
Our Algorithm
Parameters:
![Page 23: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/23.jpg)
Our Algorithm
Parameters:
![Page 24: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/24.jpg)
Warm Start
• (A3): warm start with error
![Page 25: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/25.jpg)
Warm Start
• (A3): warm start with error
Aligned with truth Not too much
![Page 26: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/26.jpg)
Main Result
![Page 27: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/27.jpg)
Analysis: Overview
Show the algo will
1. Maintain and
2. Decrease the potential function
![Page 28: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/28.jpg)
Analysis: Effect of ReLU
•
![Page 29: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/29.jpg)
Analysis: Effect of ReLU
•
![Page 30: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/30.jpg)
Analysis: Effect of ReLU
•
Much less noise after
![Page 31: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/31.jpg)
Analysis: Change of Error Matrix
•
• Therefore,
![Page 32: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/32.jpg)
More General Results
1. Each column of is generated i.i.d. by
• The decoding is
• So can* tolerate large adversarial noise,
• and tolerate zero-mean noise much larger than signal
2. Distribution of only needs to satisfy some moment conditions
![Page 33: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/33.jpg)
Conclusion
• Beyond worse case analysis of NMF
• generative model with mild condition on the feature matrix
• Provable guarantee of alternating update algorithm
• Strong denoising effect by ReLU + non-negativity
![Page 34: Non-negative Matrix Factorization via Alternating Updates · 2020-01-03 · Non-negative Matrix Factorization via Alternating Updates Yingyu Liang, Princeton University Simons workshop,](https://reader033.fdocuments.in/reader033/viewer/2022060305/5f0953ed7e708231d4264e34/html5/thumbnails/34.jpg)
Conclusion
• Beyond worse case analysis of NMF
• generative model with mild condition on the feature matrix
• Provable guarantee of alternating update algorithm
• Strong denoising effect by ReLU + non-negativity
Thanks! Q&A