MF-DMPC: Matrix Factorization with Dual Multiclass Preference Context for Rating Prediction
Jing Lin, Weike Pan, Zhong Ming
National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering,
Shenzhen University, [email protected], {panweike,mingz}@szu.edu.cn
Lin, Pan and Ming (SZU) MF-DMPC 1 / 26
Introduction
Context in RS
[Figure: an n × m user-item rating matrix with observed ratings and missing entries ("?"). Internal context: the rating matrix itself. External context: social network, weather, date, place, and item content (e.g., "La La Land": Comedy, Drama, Music; "Avatar": Action, Adventure, Fantasy).]
Using external context in RS may cause the problems of model inflexibility, computational burden and system incompatibility. So algorithms that only use internal context (collaborative filtering) still occupy an important place in the research community.
Introduction
Motivation
MF-MPC combines the two complementary categories of collaborative filtering: neighborhood-based and model-based.
MF-MPC improves on SVD (a typical kind of matrix factorization method) by adding a term transformed from the multiclass preference context (MPC) of a certain user, which represents the user similarities of a neighborhood-based method.
In this paper, we further introduce a matrix factorization model that combines not only user similarities but also item similarities.
Introduction
The Derivation Process of MF-DMPC (1)
[Diagram (a): user-based MF-MPC model (u = 1, 2, …, n; i = 1, 2, …, m), built from a user-specific latent feature vector, an item-specific latent feature vector, a user-specific latent preference vector, and the user bias, item bias and global average.]
According to Pan et al., MF-MPC refers to matrix factorization with (user-based) multiclass preference context.
Introduction
The Derivation Process of MF-DMPC (2)
[Diagram (b): item-based MF-MPC model (u = 1, 2, …, n; i = 1, 2, …, m), built from a user-specific latent feature vector, an item-specific latent feature vector, an item-specific latent preference vector, and the user bias, item bias and global average.]
Symmetrically, we introduce item-based multiclass preference context to represent item similarities.
Introduction
The Derivation Process of MF-DMPC (3)
[Diagram (c): MF-DMPC model (u = 1, 2, …, n; i = 1, 2, …, m), which combines the user-specific and item-specific latent feature vectors, both the user-specific and item-specific latent preference vectors, and the user bias, item bias and global average.]
Finally, we introduce both user-based and item-based MPC (dual MPC) into the prediction rule to obtain our improved model called matrix factorization with dual multiclass preference context (MF-DMPC).
Introduction
Advantage of Our Solution
MF-DMPC inherits the high accuracy and good explainability of MF-MPC and performs even better.
As a matter of fact, our model is a more generic method which successfully exploits the complementarity between user-based and item-based neighborhood information.
Preliminaries
Problem Definition
In this paper, we study the problem of making good use of internal context in recommendation systems.
We will only need an incomplete rating matrix $R = \{(u, i, r_{ui})\}$ for our task, where $u$ is the ID of one of the $n$ users (rows), $i$ is the ID of one of the $m$ items (columns), and $r_{ui}$ is the recorded rating of user $u$ to item $i$.
As a result, we will build an improved model to estimate the missing entries of the rating matrix.
Preliminaries
Notations
Table: Some notations and explanations.
Symbol                                  Meaning
n                                       number of users
m                                       number of items
u, u′                                   user IDs
i, i′                                   item IDs
M                                       multiclass preference set
r_ui ∈ M                                rating of user u to item i
R = {(u, i, r_ui)}                      rating records of training data
y_ui ∈ {0, 1}                           indicator: y_ui = 1 if (u, i, r_ui) ∈ R, y_ui = 0 otherwise
I^r_u, r ∈ M                            items rated by user u with rating r
I_u                                     items rated by user u
U^r_i, r ∈ M                            users who rate item i with rating r
U_i                                     users who rate item i
µ ∈ ℝ                                   global average rating value
b_u ∈ ℝ                                 user bias
b_i ∈ ℝ                                 item bias
d                                       number of latent dimensions
U_u·, N^r_u· ∈ ℝ^{1×d}                  user-specific latent feature vectors
V_i·, M^r_i· ∈ ℝ^{1×d}                  item-specific latent feature vectors
R^te = {(u, i, r_ui)}                   rating records of test data
r̂_ui                                    predicted rating of user u to item i
T                                       iteration number in the algorithm
Preliminaries
Prediction Rule of SVD
In the state-of-the-art matrix factorization based model, the SVD model, the prediction rule for the rating of user u to item i is as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + b_u + b_i + \mu, \qquad (1)$$

where $U_{u\cdot} \in \mathbb{R}^{1\times d}$ and $V_{i\cdot} \in \mathbb{R}^{1\times d}$ are the user-specific and item-specific latent feature vectors, respectively, and $b_u$, $b_i$ and $\mu$ are the user bias, the item bias and the global average, respectively.
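As a concrete illustration, Eq. (1) is just a dot product plus three scalars. The toy sizes, names (`U`, `V`, `b_u`, `b_i`, `mu`) and values below are our own placeholders, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 4, 5, 3  # toy sizes: users, items, latent dimensions

U = rng.normal(scale=0.1, size=(n, d))  # user-specific latent feature vectors U_u.
V = rng.normal(scale=0.1, size=(m, d))  # item-specific latent feature vectors V_i.
b_u = np.zeros(n)                       # user biases
b_i = np.zeros(m)                       # item biases
mu = 3.5                                # global average rating

def predict_svd(u, i):
    """Eq. (1): r_ui = U_u. V_i.^T + b_u + b_i + mu."""
    return U[u] @ V[i] + b_u[u] + b_i[i] + mu
```

With zero biases and small latent vectors, predictions start close to the global average, which is why the biases and µ are typically initialized this way.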
Preliminaries
Preference Generalization Probability of MF-MPC
In the MF-MPC model, the rating of user u to item i, $r_{ui}$, can be represented in a probabilistic way as,

$$P\big(r_{ui} \,\big|\, (u, i);\ (u, i', r_{ui'}),\ i' \in \textstyle\bigcup_{r\in\mathcal{M}} \mathcal{I}^r_u \backslash \{i\}\big), \qquad (2)$$

which means that $r_{ui}$ depends not only on the (user, item) pair $(u, i)$, but also on the examined items $i' \in \mathcal{I}_u \backslash \{i\}$ and the categorical score $r_{ui'} \in \mathcal{M}$ of each such item. Notice that multiclass preference context (MPC) refers to the condition $(u, i', r_{ui'}),\ i' \in \bigcup_{r\in\mathcal{M}} \mathcal{I}^r_u \backslash \{i\}$.
Preliminaries
Prediction Rule of MF-MPC
In order to introduce MPC into an MF based model, we need a user-specific latent preference vector $U^{MPC}_{u\cdot}$ for user u,

$$U^{MPC}_{u\cdot} = \sum_{r \in \mathcal{M}} \frac{1}{\sqrt{|\mathcal{I}^r_u \backslash \{i\}|}} \sum_{i' \in \mathcal{I}^r_u \backslash \{i\}} M^r_{i'\cdot}. \qquad (3)$$

Notice that $M^r_{i\cdot} \in \mathbb{R}^{1\times d}$ is a classified item-specific latent feature vector and $\frac{1}{\sqrt{|\mathcal{I}^r_u \backslash \{i\}|}}$ plays as a normalization term for the preference of class r.

By adding the neighborhood information $U^{MPC}_{u\cdot}$ to the SVD model, we get the MF-MPC prediction rule for the rating of user u to item i,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + U^{MPC}_{u\cdot} V_{i\cdot}^\top + b_u + b_i + \mu, \qquad (4)$$

where $U_{u\cdot}$, $V_{i\cdot}$, $b_u$, $b_i$ and $\mu$ are exactly the same as in the SVD model.
MF-MPC generates better recommendation performance than SVD and SVD++, and also contains them as particular cases.
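A minimal NumPy sketch of Eq. (3) may help. The toy rating dictionary and the names `ratings`, `M_vec` and `u_mpc` are our own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Toy data of our own: ratings[u][j] = r_uj, with rating classes M = {1,...,5}
ratings = {0: {0: 5, 1: 3, 2: 5}, 1: {0: 4, 2: 2}}
classes = [1, 2, 3, 4, 5]  # multiclass preference set M
m_items, d = 3, 4
rng = np.random.default_rng(1)
# M^r_{i.}: one classified item-specific latent vector per (class, item) pair
M_vec = {r: rng.normal(scale=0.1, size=(m_items, d)) for r in classes}

def u_mpc(u, i):
    """Eq. (3): sum over classes r of the normalized sum of M^r_{i'.}
    for items i' rated r by user u, excluding the target item i."""
    out = np.zeros(d)
    for r in classes:
        items = [j for j, rj in ratings[u].items() if rj == r and j != i]
        if items:
            out += sum(M_vec[r][j] for j in items) / np.sqrt(len(items))
    return out
```

Excluding the target item i from its own context is essential at training time; otherwise the model would see the rating it is trying to predict.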
Our Model
Dual Multiclass Preference Context
We first define item-based multiclass preference context (item-based MPC) $V^{MPC}_{i\cdot}$ to represent item similarities. Symmetrically, we have

$$V^{MPC}_{i\cdot} = \sum_{r \in \mathcal{M}} \frac{1}{\sqrt{|\mathcal{U}^r_i \backslash \{u\}|}} \sum_{u' \in \mathcal{U}^r_i \backslash \{u\}} N^r_{u'\cdot}, \qquad (5)$$

where $N^r_{u\cdot} \in \mathbb{R}^{1\times d}$ is a classified user-specific latent feature vector.

So we have the item-based MF-MPC prediction rule,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + V^{MPC}_{i\cdot} U_{u\cdot}^\top + b_u + b_i + \mu. \qquad (6)$$

We can introduce both user-based and item-based neighborhood information into the matrix factorization method by keeping both $U^{MPC}_{u\cdot}$ and $V^{MPC}_{i\cdot}$ in the model, collectively called dual multiclass preference context (DMPC).
Our Model
Prediction Rule of MF-DMPC
For matrix factorization with dual multiclass preference context, the prediction rule for the rating of user u to item i is defined as follows,

$$\hat{r}_{ui} = U_{u\cdot} V_{i\cdot}^\top + U^{MPC}_{u\cdot} V_{i\cdot}^\top + V^{MPC}_{i\cdot} U_{u\cdot}^\top + b_u + b_i + \mu, \qquad (7)$$

with all notations described above. We call our new model "MF-DMPC" for short.
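Putting the pieces together, Eq. (7) can be sketched end-to-end in NumPy. All sizes, data and names below (`ratings`, `Mr`, `Nr`, `mpc_sum`, `predict_dmpc`) are our own toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, d = 3, 4, 2
classes = [1, 2, 3, 4, 5]
ratings = {(0, 0): 5, (0, 1): 3, (1, 0): 4, (2, 1): 5}  # toy (u, i) -> r_ui

U = rng.normal(scale=0.1, size=(n, d))
V = rng.normal(scale=0.1, size=(m, d))
Mr = {r: rng.normal(scale=0.1, size=(m, d)) for r in classes}  # M^r_{i.}
Nr = {r: rng.normal(scale=0.1, size=(n, d)) for r in classes}  # N^r_{u.}
b_u = np.zeros(n); b_i = np.zeros(m); mu = 4.0

def mpc_sum(vectors, ids, dim=d):
    """Normalized sum over one preference class (inner term of Eqs. (3)/(5))."""
    if not ids:
        return np.zeros(dim)
    return sum(vectors[j] for j in ids) / np.sqrt(len(ids))

def predict_dmpc(u, i):
    """Eq. (7): SVD term + user-based MPC term + item-based MPC term."""
    u_mpc = sum(mpc_sum(Mr[r], [j for (uu, j), rr in ratings.items()
                                if uu == u and rr == r and j != i])
                for r in classes)
    v_mpc = sum(mpc_sum(Nr[r], [uu for (uu, j), rr in ratings.items()
                                if j == i and rr == r and uu != u])
                for r in classes)
    return U[u] @ V[i] + u_mpc @ V[i] + v_mpc @ U[u] + b_u[u] + b_i[i] + mu
```

Note the symmetry: the user-based term scans user u's other rated items, while the item-based term scans item i's other raters.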
Our Model
Optimization Problem
With the prediction rule, we can learn the model parameters via the following minimization problem,

$$\min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \Big[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(u, i) \Big], \qquad (8)$$

where

$$\mathrm{reg}(u, i) = \frac{\alpha_m}{2} \sum_{r\in\mathcal{M}} \sum_{i'\in\mathcal{I}^r_u\backslash\{i\}} \|M^r_{i'\cdot}\|_F^2 + \frac{\alpha_n}{2} \sum_{r\in\mathcal{M}} \sum_{u'\in\mathcal{U}^r_i\backslash\{u\}} \|N^r_{u'\cdot}\|_F^2 + \frac{\alpha_u}{2}\|U_{u\cdot}\|^2 + \frac{\alpha_v}{2}\|V_{i\cdot}\|^2 + \frac{\beta_u}{2}\|b_u\|^2 + \frac{\beta_v}{2}\|b_i\|^2$$

is the regularization term used to avoid overfitting, and $\Theta = \{U_{u\cdot}, V_{i\cdot}, b_u, b_i, \mu, M^r_{i\cdot}, N^r_{u\cdot}\}$, $u = 1, 2, \ldots, n$, $i = 1, 2, \ldots, m$, $r \in \mathcal{M}$.

Notice that the objective function of MF-DMPC is quite similar to that of MF-MPC. The difference lies in the "dual" MPC, i.e., $V^{MPC}_{i\cdot} U_{u\cdot}^\top$ in the prediction rule, and $\frac{\alpha_n}{2} \sum_{r\in\mathcal{M}} \sum_{u'\in\mathcal{U}^r_i\backslash\{u\}} \|N^r_{u'\cdot}\|_F^2$ in the regularization term.
Our Model
Gradients
Using the stochastic gradient descent (SGD) algorithm, we have the gradients of the model parameters,

$$\nabla U_{u\cdot} = -e_{ui} (V_{i\cdot} + V^{MPC}_{i\cdot}) + \alpha_u U_{u\cdot} \qquad (9)$$
$$\nabla V_{i\cdot} = -e_{ui} (U_{u\cdot} + U^{MPC}_{u\cdot}) + \alpha_v V_{i\cdot} \qquad (10)$$
$$\nabla b_u = -e_{ui} + \beta_u b_u \qquad (11)$$
$$\nabla b_i = -e_{ui} + \beta_v b_i \qquad (12)$$
$$\nabla \mu = -e_{ui} \qquad (13)$$
$$\nabla M^r_{i'\cdot} = \frac{-e_{ui} V_{i\cdot}}{\sqrt{|\mathcal{I}^r_u\backslash\{i\}|}} + \alpha_m M^r_{i'\cdot}, \quad i' \in \mathcal{I}^r_u\backslash\{i\},\ r \in \mathcal{M}, \qquad (14)$$
$$\nabla N^r_{u'\cdot} = \frac{-e_{ui} U_{u\cdot}}{\sqrt{|\mathcal{U}^r_i\backslash\{u\}|}} + \alpha_n N^r_{u'\cdot}, \quad u' \in \mathcal{U}^r_i\backslash\{u\},\ r \in \mathcal{M}, \qquad (15)$$

where $e_{ui} = (r_{ui} - \hat{r}_{ui})$ is the difference between the true rating and the predicted rating.
Our Model
Update Rules
We have the update rules,
$$\theta = \theta - \gamma \nabla\theta \qquad (16)$$
where γ is the learning rate, and θ ∈ Θ is a model parameter to be learned.
Our Model
Algorithm of MF-DMPC
1: Initialize model parameters Θ
2: for t = 1, …, T do
3:   for t2 = 1, …, |R| do
4:     Randomly pick up a rating from R
5:     Calculate the gradients via Eqs. (9)-(15)
6:     Update the parameters via Eq. (16)
7:   end for
8:   Decrease the learning rate γ ← γ × 0.9
9: end for
Figure: The algorithm of MF-DMPC.
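The loop above can be sketched as runnable NumPy code. For brevity this sketch updates only the plain SVD parameters (Eqs. (9)-(13)); the MPC gradients of Eqs. (14)-(15) would be applied at the same step. The toy data, sizes and hyperparameter values are our own, not the paper's experimental settings:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, d = 5, 6, 2
data = [(0, 1, 4.0), (1, 2, 3.0), (2, 0, 5.0), (3, 4, 2.0)]  # toy (u, i, r) triples

U = 0.01 * rng.normal(size=(n, d))
V = 0.01 * rng.normal(size=(m, d))
b_u = np.zeros(n); b_i = np.zeros(m)
mu = float(np.mean([r for _, _, r in data]))
alpha, beta, gamma, T = 0.01, 0.01, 0.01, 50

def rmse():
    return np.sqrt(np.mean([(r - (U[u] @ V[i] + b_u[u] + b_i[i] + mu)) ** 2
                            for u, i, r in data]))

before = rmse()
for t in range(T):                                 # step 2: epochs
    for _ in range(len(data)):                     # step 3
        u, i, r = data[rng.integers(len(data))]    # step 4: random pick from R
        e = r - (U[u] @ V[i] + b_u[u] + b_i[i] + mu)  # e_ui
        # steps 5-6: gradients (Eqs. (9)-(13), MPC terms omitted), update (Eq. (16))
        U[u], V[i] = (U[u] - gamma * (-e * V[i] + alpha * U[u]),
                      V[i] - gamma * (-e * U[u] + alpha * V[i]))
        b_u[u] -= gamma * (-e + beta * b_u[u])
        b_i[i] -= gamma * (-e + beta * b_i[i])
        mu -= gamma * (-e)
    gamma *= 0.9                                   # step 8: decay learning rate
after = rmse()
```

On this toy set the training RMSE drops from roughly the spread around the global average toward a much smaller fitted error.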
Our Model
Analysis
The time complexity of SVD++, MF-MPC and the proposed MF-DMPC is ordered as MF-DMPC > MF-MPC > SVD++, mainly because of the traversal needed to calculate $U^{MPC}_{u\cdot}$ and $V^{MPC}_{i\cdot}$ in MF-DMPC, $U^{MPC}_{u\cdot}$ in MF-MPC, and $U^{OPC}_{u\cdot}$ (oneclass preference context) in SVD++.
As for space complexity, we can reckon it from the size of the dominating model parameter vectors shown in the table below (since $M^r_{i\cdot}$ is item-indexed and $N^r_{u\cdot}$ is user-indexed, they contribute m|M| and n|M| vectors, respectively).

Table: The size of the dominating model parameter vectors in different models.

Model       Dominating parameter vectors       Size (×d)
SVD         U_u·, V_i·                         n + m
SVD++       U_u·, V_i·, M_i·                   n + m + m
MF-MPC      U_u·, V_i·, M^r_i·                 n + m + m|M|
MF-DMPC     U_u·, V_i·, M^r_i·, N^r_u·         n + m + m|M| + n|M|
Experiments
Data Sets and Evaluation Metrics
We choose the same data sets used in previous research about MF-MPC for convenience (see the table below). We use five-fold cross validation in the empirical studies.
Table: Statistics of the data sets used in the experiments.
Data set   n        m        |R| + |R^te|    |R|/n/m    n/m
ML100K     943      1,682    100,000         5.04%      0.56
ML1M       6,040    3,952    1,000,209       3.35%      1.53
ML10M      71,567   10,681   10,000,054      1.05%      6.70
We adopt mean absolute error (MAE) and root mean square error (RMSE) as evaluation metrics:

$$\mathrm{MAE} = \sum_{(u,i,r_{ui})\in R^{te}} |r_{ui} - \hat{r}_{ui}| \,/\, |R^{te}|$$

$$\mathrm{RMSE} = \sqrt{\sum_{(u,i,r_{ui})\in R^{te}} (r_{ui} - \hat{r}_{ui})^2 \,/\, |R^{te}|}$$
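The two metrics transcribe directly into code. The toy records and the constant predictor below are our own illustration:

```python
import numpy as np

def mae(test_records, predict):
    """MAE over R^te; test_records holds (u, i, r_ui) triples."""
    return float(np.mean([abs(r - predict(u, i)) for u, i, r in test_records]))

def rmse(test_records, predict):
    """RMSE over R^te; squares penalize large errors more than MAE does."""
    return float(np.sqrt(np.mean([(r - predict(u, i)) ** 2
                                  for u, i, r in test_records])))

# toy sanity check with a constant predictor
records = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0)]
const = lambda u, i: 4.0
```

Because RMSE squares the residuals, it always satisfies RMSE ≥ MAE on the same data, which is why both are commonly reported together.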
Experiments
Baselines and Parameter Settings
In order to find out the effects of introducing different kinds of MPC into the matrix factorization (MF) model, we compare the performance of SVD (see Eq. (1)) against that achieved by matrix factorization with user-based MPC (see Eq. (4)), item-based MPC (see Eq. (6)) and dual MPC (see Eq. (7)).
We configure the parameter settings of the factorization-based methods as follows:
The learning rate: γ = 0.01.
The number of latent dimensions: d = 20.
The iteration number: T = 50.
The tradeoff parameters are searched through experiments using the first copy of each data set and the RMSE metric, under the following conditions: α_u = α_v = β_u = β_v = α, α ∈ {0.001, 0.01, 0.1}; for user-based MF-MPC, α_m = α; for item-based MF-MPC, α_n = α; for MF-DMPC, α_m, α_n ∈ {0.001, 0.01, 0.1}.
Experiments
Results
Table: Recommendation performance of our MF-DMPC and other baselinemethods on three Movielens data sets.
Data     Method               MAE              RMSE             Parameters (α, α_m, α_n)
ML100K   SVD                  0.7446±0.0033    0.9445±0.0035    (0.01, N/A, N/A)
         User-Based MF-MPC    0.7123±0.0028    0.9102±0.0029    (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.7038±0.0021    0.9008±0.0025    (0.01, N/A, 0.01)
         MF-DMPC              0.7011±0.0025    0.8991±0.0024    (0.01, 0.01, 0.01)
ML1M     SVD                  0.7017±0.0016    0.8899±0.0023    (0.01, N/A, N/A)
         User-Based MF-MPC    0.6613±0.0015    0.8465±0.0017    (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.6587±0.0009    0.8439±0.0013    (0.01, N/A, 0.01)
         MF-DMPC              0.6564±0.0016    0.8434±0.0017    (0.01, 0.01, 0.01)
ML10M    SVD                  0.6067±0.0007    0.7913±0.0009    (0.01, N/A, N/A)
         User-Based MF-MPC    0.5965±0.0006    0.78135±0.0007   (0.01, 0.01, N/A)
         Item-Based MF-MPC    0.6024±0.0006    0.7900±0.0008    (0.01, N/A, 0.01)
         MF-DMPC              0.5955±0.0005    0.78133±0.0007   (0.01, 0.001, 0.1)
Experiments
Observations
We can make the following observations:

The performance of the factorization framework greatly improves when introducing multiclass preference context.

Among all kinds of MPC, dual MPC contributes the most to minimizing the prediction error.

On the MovieLens data sets, whether user-based or item-based MPC is more helpful depends on the ratio of user group size to item group size (n/m). Item-based MF-MPC normally performs better when n/m is of a moderate size, while user-based MPC becomes more important as a massive number of users gets involved (n/m grows large). (This may be affected by additional factors such as the density of the rating matrix.)

The performance of MF-DMPC is in a way bounded by the better result between user-based and item-based MF-MPC: it is just slightly better than that result. This slight improvement supports the view that MF-DMPC successfully strikes a balance between user-based and item-based MPC, showing MF-DMPC to be a more generic method.
Conclusions and Future Work
Conclusions
We present a novel collaborative filtering method that joins neighborhood information to a factorization model for rating prediction.
Specifically, we extend multiclass preference context (MPC) to include two types, i.e., user-based and item-based, and combine them in one single prediction rule in order to achieve better recommendation performance than the reference models.
Conclusions and Future Work
Future Work
We are interested in studying issues such as the efficiency and robustness of factorization-based algorithms with preference context.
We also expect some advanced strategies, perhaps involving adversarial sampling, denoising or multilayer perceptrons, to be used in the algorithm.
Thank You
Thank you!
We thank Mr. Yunfeng Huang for his assistance in code review and helpful discussions.
We thank the support of the National Natural Science Foundation of China Nos. 61502307 and 61672358, and the Natural Science Foundation of Guangdong Province No. 2016A030313038.