MF-DMPC: Matrix Factorization with Dual Multiclass Preference Context for Rating Prediction
Jing Lin, Weike Pan, Zhong Ming
National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering,
Shenzhen University, [email protected], {panweike,mingz}@szu.edu.cn
Lin, Pan and Ming (SZU) MF-DMPC 1 / 26
Introduction
Context in RS
[Figure: an n × m user-item rating matrix with observed ratings and missing entries ("?"). Internal context: the rating matrix itself. External context: social network, weather, date, place, and item content (e.g., "La La Land": Comedy, Drama, Music; "Avatar": Action, Adventure, Fantasy).]
Using external context in RS may cause the problems of model inflexibility, computational burden and system incompatibility. So algorithms that only use internal context (collaborative filtering) still occupy an important place in the research community.
Introduction
Motivation
MF-MPC combines the two complementary categories of collaborative filtering: neighborhood-based and model-based.
MF-MPC improves on SVD (a typical kind of matrix factorization method) by adding a term transformed from the multiclass preference context (MPC) of a certain user, which represents the user similarities of a neighborhood-based method.
In this paper, we further introduce a matrix factorization model that combines not only user similarities but also item similarities.
Introduction
The Derivation Process of MF-DMPC (1)
[Diagram (a): user-based MF-MPC model (u = 1, 2, …, n; i = 1, 2, …, m), built from a user-specific latent feature vector, an item-specific latent feature vector, a user-specific latent preference vector, and the user bias, item bias and global average.]
According to Pan et al., MF-MPC refers to matrix factorization with (user-based) multiclass preference context.
Introduction
The Derivation Process of MF-DMPC (2)
[Diagram (b): item-based MF-MPC model (u = 1, 2, …, n; i = 1, 2, …, m), built from a user-specific latent feature vector, an item-specific latent feature vector, an item-specific latent preference vector, and the user bias, item bias and global average.]
Symmetrically, we introduce item-based multiclass preference context to represent item similarities.
Introduction
The Derivation Process of MF-DMPC (3)
[Diagram (c): MF-DMPC model (u = 1, 2, …, n; i = 1, 2, …, m), which combines the user-specific and item-specific latent feature vectors, both the user-specific and item-specific latent preference vectors, and the user bias, item bias and global average.]
Finally, we introduce both user-based and item-based MPC (dual MPC) into the prediction rule to obtain our improved model called matrix factorization with dual multiclass preference context (MF-DMPC).
Introduction
Advantage of Our Solution
MF-DMPC inherits the high accuracy and good explainability of MF-MPC and performs even better.
As a matter of fact, our model is a more generic method which successfully exploits the complementarity between user-based and item-based neighborhood information.
Preliminaries
Problem Definition
In this paper, we study the problem of making good use of internal context in recommendation systems.
We will only need an incomplete rating matrix $R = \{(u, i, r_{ui})\}$ for our task, where $u$ is the ID of one of the $n$ users (rows), $i$ is the ID of one of the $m$ items (columns), and $r_{ui}$ is the recorded rating of user $u$ to item $i$.
As a result, we will build an improved model to estimate the missing entries of the rating matrix.
Preliminaries
Notations
Table: Some notations and explanations.
Symbol                                  Meaning
n                                       number of users
m                                       number of items
u, u′                                   user IDs
i, i′                                   item IDs
M                                       multiclass preference set
r_ui ∈ M                                rating of user u to item i
R = {(u, i, r_ui)}                      rating records of training data
y_ui ∈ {0, 1}                           indicator: y_ui = 1 if (u, i, r_ui) ∈ R, y_ui = 0 otherwise
I^r_u, r ∈ M                            items rated by user u with rating r
I_u                                     items rated by user u
U^r_i, r ∈ M                            users who rate item i with rating r
U_i                                     users who rate item i
µ ∈ ℝ                                   global average rating value
b_u ∈ ℝ                                 user bias
b_i ∈ ℝ                                 item bias
d                                       number of latent dimensions
U_u·, N^r_u· ∈ ℝ^{1×d}                  user-specific latent feature vectors
V_i·, M^r_i· ∈ ℝ^{1×d}                  item-specific latent feature vectors
R^te = {(u, i, r_ui)}                   rating records of test data
r̂_ui                                    predicted rating of user u to item i
T                                       iteration number in the algorithm
Preliminaries
Prediction Rule of SVD
In the state-of-the-art matrix factorization based model, the SVD model, the prediction rule for the rating of user u to item i is as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + b_u + b_i + \mu, \qquad (1)$$

where $U_{u\cdot} \in \mathbb{R}^{1\times d}$ and $V_{i\cdot} \in \mathbb{R}^{1\times d}$ are the user-specific and item-specific latent feature vectors, respectively, and $b_u$, $b_i$ and $\mu$ are the user bias, the item bias and the global average, respectively.
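As a concrete illustration, Eq. (1) is just a dot product plus three scalars. The toy sizes, names (`U`, `V`, `b_u`, `b_i`, `mu`) and values below are our own placeholders, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 4, 5, 3  # toy sizes: users, items, latent dimensions

U = rng.normal(scale=0.1, size=(n, d))  # user-specific latent feature vectors U_u.
V = rng.normal(scale=0.1, size=(m, d))  # item-specific latent feature vectors V_i.
b_u = np.zeros(n)                       # user biases
b_i = np.zeros(m)                       # item biases
mu = 3.5                                # global average rating

def predict_svd(u, i):
    """Eq. (1): r_ui = U_u. V_i.^T + b_u + b_i + mu."""
    return U[u] @ V[i] + b_u[u] + b_i[i] + mu
```

With zero biases and small latent vectors, predictions start close to the global average, which is why the biases and µ are typically initialized this way.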
Preliminaries
Preference Generalization Probability of MF-MPC
In the MF-MPC model, the rating of user u to item i, $r_{ui}$, can be represented in a probabilistic way as,

$$P\big(r_{ui} \,\big|\, (u, i);\ (u, i', r_{ui'}),\ i' \in \textstyle\bigcup_{r\in\mathcal{M}} \mathcal{I}^r_u \backslash \{i\}\big), \qquad (2)$$

which means that $r_{ui}$ depends not only on the (user, item) pair $(u, i)$, but also on the examined items $i' \in \mathcal{I}_u \backslash \{i\}$ and the categorical score $r_{ui'} \in \mathcal{M}$ of each such item. Notice that multiclass preference context (MPC) refers to the condition $(u, i', r_{ui'}),\ i' \in \bigcup_{r\in\mathcal{M}} \mathcal{I}^r_u \backslash \{i\}$.
Preliminaries
Prediction Rule of MF-MPC
In order to introduce MPC into an MF based model, we need a user-specific latent preference vector $U^{MPC}_{u\cdot}$ for user u,

$$U^{MPC}_{u\cdot} = \sum_{r \in \mathcal{M}} \frac{1}{\sqrt{|\mathcal{I}^r_u \backslash \{i\}|}} \sum_{i' \in \mathcal{I}^r_u \backslash \{i\}} M^r_{i'\cdot}. \qquad (3)$$

Notice that $M^r_{i\cdot} \in \mathbb{R}^{1\times d}$ is a classified item-specific latent feature vector and $\frac{1}{\sqrt{|\mathcal{I}^r_u \backslash \{i\}|}}$ plays as a normalization term for the preference of class r.

By adding the neighborhood information $U^{MPC}_{u\cdot}$ to the SVD model, we get the MF-MPC prediction rule for the rating of user u to item i,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + U^{MPC}_{u\cdot} V_{i\cdot}^\top + b_u + b_i + \mu, \qquad (4)$$

where $U_{u\cdot}$, $V_{i\cdot}$, $b_u$, $b_i$ and $\mu$ are exactly the same as in the SVD model.
MF-MPC generates better recommendation performance than SVD and SVD++, and also contains them as particular cases.
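A minimal NumPy sketch of Eq. (3) may help. The toy rating dictionary and the names `ratings`, `M_vec` and `u_mpc` are our own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Toy data of our own: ratings[u][j] = r_uj, with rating classes M = {1,...,5}
ratings = {0: {0: 5, 1: 3, 2: 5}, 1: {0: 4, 2: 2}}
classes = [1, 2, 3, 4, 5]  # multiclass preference set M
m_items, d = 3, 4
rng = np.random.default_rng(1)
# M^r_{i.}: one classified item-specific latent vector per (class, item) pair
M_vec = {r: rng.normal(scale=0.1, size=(m_items, d)) for r in classes}

def u_mpc(u, i):
    """Eq. (3): sum over classes r of the normalized sum of M^r_{i'.}
    for items i' rated r by user u, excluding the target item i."""
    out = np.zeros(d)
    for r in classes:
        items = [j for j, rj in ratings[u].items() if rj == r and j != i]
        if items:
            out += sum(M_vec[r][j] for j in items) / np.sqrt(len(items))
    return out
```

Excluding the target item i from its own context is essential at training time; otherwise the model would see the rating it is trying to predict.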
Our Model
Dual Multiclass Preference Context
We first define item-based multiclass preference context (item-based MPC) $V^{MPC}_{i\cdot}$ to represent item similarities. Symmetrically, we have

$$V^{MPC}_{i\cdot} = \sum_{r \in \mathcal{M}} \frac{1}{\sqrt{|\mathcal{U}^r_i \backslash \{u\}|}} \sum_{u' \in \mathcal{U}^r_i \backslash \{u\}} N^r_{u'\cdot}, \qquad (5)$$

where $N^r_{u\cdot} \in \mathbb{R}^{1\times d}$ is a classified user-specific latent feature vector.

So we have the item-based MF-MPC prediction rule,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + V^{MPC}_{i\cdot} U_{u\cdot}^\top + b_u + b_i + \mu. \qquad (6)$$

We can introduce both user-based and item-based neighborhood information into the matrix factorization method by keeping both $U^{MPC}_{u\cdot}$ and $V^{MPC}_{i\cdot}$ in the model, collectively called dual multiclass preference context (DMPC).
Our Model
Prediction Rule of MF-DMPC
For matrix factorization with dual multiclass preference context, the prediction rule for the rating of user u to item i is defined as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + U^{MPC}_{u\cdot} V_{i\cdot}^\top + V^{MPC}_{i\cdot} U_{u\cdot}^\top + b_u + b_i + \mu, \qquad (7)$$

with all notations described above. We call our new model "MF-DMPC" for short.
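Putting the pieces together, Eq. (7) can be sketched end-to-end in NumPy. All sizes, data and names below (`ratings`, `Mr`, `Nr`, `mpc_sum`, `predict_dmpc`) are our own toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, d = 3, 4, 2
classes = [1, 2, 3, 4, 5]
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (2, 1): 5}  # toy (u, i) -> r_ui

U = rng.normal(scale=0.1, size=(n, d))
V = rng.normal(scale=0.1, size=(m, d))
Mr = {r: rng.normal(scale=0.1, size=(m, d)) for r in classes}  # M^r_{i.}
Nr = {r: rng.normal(scale=0.1, size=(n, d)) for r in classes}  # N^r_{u.}
b_u = np.zeros(n); b_i = np.zeros(m); mu = 4.0

def mpc_sum(vectors, ids, dim=d):
    """Normalized sum over one preference class (inner term of Eqs. (3)/(5))."""
    if not ids:
        return np.zeros(dim)
    return sum(vectors[j] for j in ids) / np.sqrt(len(ids))

def predict_dmpc(u, i):
    """Eq. (7): SVD term + user-based MPC term + item-based MPC term."""
    u_mpc = sum(mpc_sum(Mr[r], [j for (uu, j), rr in ratings.items()
                                if uu == u and rr == r and j != i])
                for r in classes)
    v_mpc = sum(mpc_sum(Nr[r], [uu for (uu, j), rr in ratings.items()
                                if j == i and rr == r and uu != u])
                for r in classes)
    return U[u] @ V[i] + u_mpc @ V[i] + v_mpc @ U[u] + b_u[u] + b_i[i] + mu
```

Note the symmetry: the user-based term scans user u's other rated items, while the item-based term scans item i's other raters.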
Our Model
Optimization Problem
With the prediction rule, we can learn the model parameters via the following minimization problem,

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \Big[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(u, i) \Big], \qquad (8)$$

where

$$\mathrm{reg}(u, i) = \frac{\alpha_m}{2} \sum_{r\in\mathcal{M}} \sum_{i'\in\mathcal{I}^r_u\backslash\{i\}} \|M^r_{i'\cdot}\|_F^2 + \frac{\alpha_n}{2} \sum_{r\in\mathcal{M}} \sum_{u'\in\mathcal{U}^r_i\backslash\{u\}} \|N^r_{u'\cdot}\|_F^2 + \frac{\alpha_u}{2}\|U_{u\cdot}\|^2 + \frac{\alpha_v}{2}\|V_{i\cdot}\|^2 + \frac{\beta_u}{2}\|b_u\|^2 + \frac{\beta_v}{2}\|b_i\|^2$$

is the regularization term used to avoid overfitting, and $\Theta = \{U_{u\cdot}, V_{i\cdot}, b_u, b_i, \mu, M^r_{i\cdot}, N^r_{u\cdot}\}$, $u = 1, 2, \ldots, n$, $i = 1, 2, \ldots, m$, $r \in \mathcal{M}$.

Notice that the objective function of MF-DMPC is quite similar to that of MF-MPC. The difference lies in the "dual" MPC, i.e., $V^{MPC}_{i\cdot} U_{u\cdot}^\top$ in the prediction rule, and $\frac{\alpha_n}{2} \sum_{r\in\mathcal{M}} \sum_{u'\in\mathcal{U}^r_i\backslash\{u\}} \|N^r_{u'\cdot}\|_F^2$ in the regularization term.
Our Model
Gradients
Using the stochastic gradient descent (SGD) algorithm, we have the gradients of the model parameters,

$$\nabla U_{u\cdot} = -e_{ui} (V_{i\cdot} + V^{MPC}_{i\cdot}) + \alpha_u U_{u\cdot} \qquad (9)$$
$$\nabla V_{i\cdot} = -e_{ui} (U_{u\cdot} + U^{MPC}_{u\cdot}) + \alpha_v V_{i\cdot} \qquad (10)$$
$$\nabla b_u = -e_{ui} + \beta_u b_u \qquad (11)$$
$$\nabla b_i = -e_{ui} + \beta_v b_i \qquad (12)$$
$$\nabla \mu = -e_{ui} \qquad (13)$$
$$\nabla M^r_{i'\cdot} = \frac{-e_{ui} V_{i\cdot}}{\sqrt{|\mathcal{I}^r_u\backslash\{i\}|}} + \alpha_m M^r_{i'\cdot}, \quad i' \in \mathcal{I}^r_u\backslash\{i\},\ r \in \mathcal{M}, \qquad (14)$$
$$\nabla N^r_{u'\cdot} = \frac{-e_{ui} U_{u\cdot}}{\sqrt{|\mathcal{U}^r_i\backslash\{u\}|}} + \alpha_n N^r_{u'\cdot}, \quad u' \in \mathcal{U}^r_i\backslash\{u\},\ r \in \mathcal{M}, \qquad (15)$$

where $e_{ui} = (r_{ui} - \hat{r}_{ui})$ is the difference between the true rating and the predicted rating.
Our Model
Update Rules
We have the update rules,
$$\theta = \theta - \gamma \nabla\theta \qquad (16)$$
where γ is the learning rate, and θ ∈ Θ is a model parameter to be learned.
Our Model
Algorithm of MF-DMPC
1: Initialize model parameters Θ
2: for t = 1, …, T do
3:   for t2 = 1, …, |R| do
4:     Randomly pick up a rating from R
5:     Calculate the gradients via Eqs. (9)-(15)
6:     Update the parameters via Eq. (16)
7:   end for
8:   Decrease the learning rate γ ← γ × 0.9
9: end for
Figure: The algorithm of MF-DMPC.
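The loop above can be sketched as runnable NumPy code. For brevity this sketch updates only the plain SVD parameters (Eqs. (9)-(13)); the MPC gradients of Eqs. (14)-(15) would be applied at the same step. The toy data, sizes and hyperparameter values are our own, not the paper's experimental settings:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, d = 5, 6, 2
data = [(0, 1, 4.0), (1, 2, 3.0), (2, 0, 5.0), (3, 4, 2.0)]  # toy (u, i, r) triples

U = 0.01 * rng.normal(size=(n, d))
V = 0.01 * rng.normal(size=(m, d))
b_u = np.zeros(n); b_i = np.zeros(m)
mu = float(np.mean([r for _, _, r in data]))
alpha, beta, gamma, T = 0.01, 0.01, 0.01, 50

def rmse():
    return np.sqrt(np.mean([(r - (U[u] @ V[i] + b_u[u] + b_i[i] + mu)) ** 2
                            for u, i, r in data]))

before = rmse()
for t in range(T):                                 # step 2: epochs
    for _ in range(len(data)):                     # step 3
        u, i, r = data[rng.integers(len(data))]    # step 4: random pick from R
        e = r - (U[u] @ V[i] + b_u[u] + b_i[i] + mu)  # e_ui
        # steps 5-6: gradients (Eqs. (9)-(13), MPC terms omitted), update (Eq. (16))
        U[u], V[i] = (U[u] - gamma * (-e * V[i] + alpha * U[u]),
                      V[i] - gamma * (-e * U[u] + alpha * V[i]))
        b_u[u] -= gamma * (-e + beta * b_u[u])
        b_i[i] -= gamma * (-e + beta * b_i[i])
        mu -= gamma * (-e)
    gamma *= 0.9                                   # step 8: decay learning rate
after = rmse()
```

On this toy set the training RMSE drops from roughly the spread around the global average toward a much smaller fitted error.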
Our Model
Analysis
The time complexity of SVD++, MF-MPC and the proposed MF-DMPC is ordered as MF-DMPC > MF-MPC > SVD++, mainly because of the traversal needed to calculate $U^{MPC}_{u\cdot}$ and $V^{MPC}_{i\cdot}$ in MF-DMPC, $U^{MPC}_{u\cdot}$ in MF-MPC, and $U^{OPC}_{u\cdot}$ (oneclass preference context) in SVD++.
As for space complexity, we can reckon it from the size of the dominating model parameter vectors shown in the table below (since $M^r_{i\cdot}$ is item-indexed and $N^r_{u\cdot}$ is user-indexed, they contribute m|M| and n|M| vectors, respectively).

Table: The size of the dominating model parameter vectors in different models.

Model       Dominating parameter vectors       Size (×d)
SVD         U_u·, V_i·                         n + m
SVD++       U_u·, V_i·, M_i·                   n + m + m
MF-MPC      U_u·, V_i·, M^r_i·                 n + m + m|M|
MF-DMPC     U_u·, V_i·, M^r_i·, N^r_u·         n + m + m|M| + n|M|
Experiments
Data Sets and Evaluation Metrics
We choose the same data sets used in previous research about MF-MPC for convenience (see the table below). We use five-fold cross validation in the empirical studies.
Table: Statistics of the data sets used in the experiments.
Data set   n        m        |R| + |R^te|    |R|/n/m    n/m
ML100K     943      1,682    100,000         5.04%      0.56
ML1M       6,040    3,952    1,000,209       3.35%      1.53
ML10M      71,567   10,681   10,000,054      1.05%      6.70
We adopt mean absolute error (MAE) and root mean square error (RMSE) as evaluation metrics:

$$\mathrm{MAE} = \sum_{(u,i,r_{ui})\in R^{te}} |r_{ui} - \hat{r}_{ui}| \,/\, |R^{te}|$$

$$\mathrm{RMSE} = \sqrt{\sum_{(u,i,r_{ui})\in R^{te}} (r_{ui} - \hat{r}_{ui})^2 \,/\, |R^{te}|}$$
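The two metrics transcribe directly into code. The toy records and the constant predictor below are our own illustration:

```python
import numpy as np

def mae(test_records, predict):
    """MAE over R^te; test_records holds (u, i, r_ui) triples."""
    return float(np.mean([abs(r - predict(u, i)) for u, i, r in test_records]))

def rmse(test_records, predict):
    """RMSE over R^te; squares penalize large errors more than MAE does."""
    return float(np.sqrt(np.mean([(r - predict(u, i)) ** 2
                                  for u, i, r in test_records])))

# toy sanity check with a constant predictor
records = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0)]
const = lambda u, i: 4.0
```

Because RMSE squares the residuals, it always satisfies RMSE ≥ MAE on the same data, which is why both are commonly reported together.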
Experiments
Baselines and Parameter Settings
In order to find out the effects of introducing different kinds of MPC into the matrix factorization (MF) model, we compare the performance of SVD (see Eq. (1)) against that achieved by matrix factorization with user-based MPC (see Eq. (4)), item-based MPC (see Eq. (6)) and dual MPC (see Eq. (7)).
We configure the parameter settings of the factorization-based methods as follows:
The learning rate: γ = 0.01.
The number of latent dimensions: d = 20.
The iteration number: T = 50.
The tradeoff parameters are searched through experiments using the first copy of each data set and the RMSE metric, under the following conditions: α_u = α_v = β_u = β_v = α, α ∈ {0.001, 0.01, 0.1}; for user-based MF-MPC, α_m = α; for item-based MF-MPC, α_n = α; for MF-DMPC, α_m, α_n ∈ {0.001, 0.01, 0.1}.
Experiments
Results
Table: Recommendation performance of our MF-DMPC and other baselinemethods on three Movielens data sets.
Data     Method               MAE              RMSE             Parameters (α, α_m, α_n)
ML100K   SVD                  0.7446±0.0033    0.9445±0.0035    (0.01, N/A, N/A)
         User-Based MF-MPC    0.7123±0.0028    0.9102±0.0029    (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.7038±0.0021    0.9008±0.0025    (0.01, N/A, 0.01)
         MF-DMPC              0.7011±0.0025    0.8991±0.0024    (0.01, 0.01, 0.01)
ML1M     SVD                  0.7017±0.0016    0.8899±0.0023    (0.01, N/A, N/A)
         User-Based MF-MPC    0.6613±0.0015    0.8465±0.0017    (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.6587±0.0009    0.8439±0.0013    (0.01, N/A, 0.01)
         MF-DMPC              0.6564±0.0016    0.8434±0.0017    (0.01, 0.01, 0.01)
ML10M    SVD                  0.6067±0.0007    0.7913±0.0009    (0.01, N/A, N/A)
         User-Based MF-MPC    0.5965±0.0006    0.78135±0.0007   (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.6024±0.0006    0.7900±0.0008    (0.01, N/A, 0.01)
         MF-DMPC              0.5955±0.0005    0.78133±0.0007   (0.01, 0.001, 0.1)
Experiments
Observations
We can make the following observations:

The performance of the factorization framework greatly improves when introducing multiclass preference context.

Among all kinds of MPC, dual MPC contributes the most to minimizing the prediction error.

On the MovieLens data sets, whether user-based or item-based MPC is more helpful depends on the ratio of user group size to item group size (n/m). Item-based MF-MPC normally performs better when n/m is of a moderate size, while user-based MPC becomes more important as a massive number of users gets involved (n/m grows large). (This may be affected by additional factors such as the density of the rating matrix.)

The performance of MF-DMPC is in a way bounded by the better result between user-based and item-based MF-MPC: it is just slightly better than that result. This slight improvement supports the view that MF-DMPC successfully strikes a balance between user-based and item-based MPC, showing MF-DMPC to be a more generic method.
Conclusions and Future Work
Conclusions
We present a novel collaborative filtering method that joins neighborhood information to a factorization model for rating prediction.
Specifically, we extend multiclass preference context (MPC) to include two types, i.e., user-based and item-based, and combine them in one single prediction rule in order to achieve better recommendation performance than the reference models.
Conclusions and Future Work
Future Work
We are interested in studying issues such as the efficiency and robustness of factorization-based algorithms with preference context.
We also expect some advanced strategies, perhaps involving adversarial sampling, denoising or multilayer perceptrons, to be used in the algorithm.
Thank You
Thank you!
We thank Mr. Yunfeng Huang for his assistance in code review and helpful discussions.
We thank the support of the National Natural Science Foundation of China Nos. 61502307 and 61672358, and the Natural Science Foundation of Guangdong Province No. 2016A030313038.