Scalable Maximum Margin Factorization by Active Riemannian Subspace Search
Yan Yan, Mingkui Tan, Ivor W. Tsang, Yi Yang, Chengqi Zhang and Qinfeng Shi
QCIS, University of Technology, Sydney
ACVT, The University
Collaborative filtering for recommendation systems
• Goal
  • Recover missing ratings by low-rank matrix completion
• Real world applications
  • Recommend TV shows/movies on Netflix
  • Recommend artists/music tracks on Xiami
  • Recommend products on Taobao…
• Data that can be used
  • Partially observed rating data from users on items
• A specific output of recommendation systems
  • The predicted ranking scores of users on unseen items
Problem setup of matrix completion
• Reconstruct the rating matrix X with a low-rank constraint
• Y is the observed matrix
• The problem is NP-hard
• Approach: matrix factorization
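A minimal sketch of the factorization approach, assuming the rating matrix is modeled as X = U V^T with U and V having r columns each. For simplicity it fits a squared loss on the observed entries by plain gradient descent; this is an illustrative stand-in, not the loss or the optimizer presented in these slides. The function name `factorize` and its parameters are hypothetical.

```python
import numpy as np

def factorize(Y, mask, rank, steps=500, lr=0.01, seed=0):
    """Fit X = U @ V.T to the observed entries of Y by gradient descent.

    Y    : (m, n) array of ratings (values outside `mask` are ignored)
    mask : (m, n) boolean array, True where a rating is observed
    rank : assumed number of latent factors r
    """
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    losses = []
    for _ in range(steps):
        R = mask * (U @ V.T - Y)              # residual on observed entries only
        losses.append(0.5 * np.sum(R ** 2))   # squared loss before this update
        # simultaneous gradient step on both factors
        U, V = U - lr * (R @ V), V - lr * (R.T @ U)
    return U, V, losses
```

The product `U @ V.T` then supplies predicted scores for the unobserved entries. The rank has to be passed in by hand here, which is exactly the limitation the rank-detection challenge refers to.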
Challenges
• Real-world rating data take discrete values
  • Maximum margin matrix factorization
• Existing methods usually require repeated SVDs
  • Our optimization avoids repeated SVDs and applies cheaper QR decompositions
• The latent rank r is usually unknown and can differ across datasets
  • An automatic method to detect the rank
Maximum margin matrix factorization (M3F)
• Hinge loss: appropriate for discrete rating data in the real world
• M3F for binary values (-1/+1)
• From binary values to ordinal values
  • Suppose the ratings take ordinal values in {1, ..., L}
  • Introduce L+1 thresholds
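The slide formulas for these losses did not survive extraction, so the following is only a sketch under stated assumptions. It assumes ratings in {1, ..., L} and uses the common "all-threshold" hinge construction from the M3F literature, with the two outermost of the L+1 thresholds taken as -inf and +inf so that only the L-1 interior thresholds are passed in. The function names are hypothetical.

```python
import numpy as np

def hinge(z):
    """Hinge function h(z) = max(0, 1 - z)."""
    return np.maximum(0.0, 1.0 - z)

def binary_hinge_loss(X, Y, mask):
    """Sum of hinge losses over observed entries, labels Y in {-1, +1}."""
    return float(np.sum(hinge(Y[mask] * X[mask])))

def ordinal_hinge_loss(X, Y, mask, thetas):
    """All-threshold hinge loss (an assumed construction).

    thetas : ascending interior thresholds, length L-1, for ratings 1..L.
    A score x for rating y is penalized unless it sits at least a unit
    margin above every threshold l < y and below every threshold l >= y.
    """
    x = X[mask][:, None]                                # (n_obs, 1)
    y = Y[mask][:, None]                                # (n_obs, 1)
    levels = np.arange(1, len(thetas) + 1)[None, :]     # l = 1..L-1
    sign = np.where(levels >= y, 1.0, -1.0)             # +1: threshold should lie above x
    return float(np.sum(hinge(sign * (thetas[None, :] - x))))
```

With L = 2 and a single threshold at 0, `ordinal_hinge_loss` coincides with `binary_hinge_loss` when rating 1 is mapped to -1 and rating 2 to +1.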
Differential Geometry of Fixed-rank Matrices
• Retraction
• Retraction can be cheaply calculated without SVD in
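One well-known way to compute such a retraction on the manifold of rank-r matrices is sketched below; it may differ in detail from the construction on the original slide, whose formula is missing. It assumes the current point is stored as X = U diag(s) V^T and the tangent vector as xi = U M V^T + Up V^T + U Vp^T, with Up orthogonal to U and Vp orthogonal to V. Two thin QR factorizations reduce the problem to a 2r x 2r core, so no SVD of the full m x n matrix is needed (the small core is still decomposed by SVD). The name `retract` and the argument layout are hypothetical.

```python
import numpy as np

def retract(U, s, V, M, Up, Vp):
    """Rank-r truncation of X + xi without forming the m x n matrix.

    X  = U @ diag(s) @ V.T           (U: m x r, V: n x r, orthonormal columns)
    xi = U @ M @ V.T + Up @ V.T + U @ Vp.T, with U.T @ Up = 0 and V.T @ Vp = 0
    Returns orthonormal U_new, singular values s_new, orthonormal V_new.
    """
    r = len(s)
    Qu, Ru = np.linalg.qr(Up)            # thin QR, Qu is m x r
    Qv, Rv = np.linalg.qr(Vp)            # thin QR, Qv is n x r
    # X + xi = [U Qu] @ S @ [V Qv].T with a small 2r x 2r core S
    S = np.block([[np.diag(s) + M, Rv.T],
                  [Ru, np.zeros((r, r))]])
    Us, ss, Vst = np.linalg.svd(S)       # SVD of the 2r x 2r core only
    U_new = np.hstack([U, Qu]) @ Us[:, :r]
    V_new = np.hstack([V, Qv]) @ Vst[:r, :].T
    return U_new, ss[:r], V_new
```

The cost is dominated by the two QR factorizations and the final products, which scale linearly in m and n for fixed r.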
Experiments
Data sets               # users     # items     # ratings
Binary-syn              1,000       1,000       All
Ordinal-syn-small       1,000       1,000       All
Ordinal-syn-large       20,000      20,000      All
Movielens 1M            6,040       3,952       1,000,209
Movielens 10M           71,567      10,681      10,000,054
Netflix                 480,189     17,770      100,480,507
Yahoo! Music Track 1    1,000,990   624,961     262,810,175
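To make the sparsity of the real data sets concrete, the observed fraction of each user-item matrix can be computed directly from the table. The helper name `density` is hypothetical; the numbers are copied from the rows above.

```python
def density(n_users, n_items, n_ratings):
    """Fraction of the user-item matrix that is observed."""
    return n_ratings / (n_users * n_items)

# (users, items, ratings) for the real-world data sets in the table
datasets = {
    "Movielens 1M": (6_040, 3_952, 1_000_209),
    "Movielens 10M": (71_567, 10_681, 10_000_054),
    "Netflix": (480_189, 17_770, 100_480_507),
    "Yahoo! Music Track 1": (1_000_990, 624_961, 262_810_175),
}

for name, dims in datasets.items():
    print(f"{name}: {density(*dims):.4%} of entries observed")
```

Every one of these matrices has fewer than 5% of its entries observed, which is why methods that touch only observed entries and low-rank factors matter at this scale.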
Conclusion
• Two challenges in M3F: scalability and latent factor detection
• BNRCG addresses the scalability problem by exploiting Riemannian geometry
• ARSS-M3F applies an efficient and simple method to detect the latent factor
• Extensive experiments demonstrate the proposed method can provide competitive performance.