
Dual Similarity Learning for Heterogeneous One-Class Collaborative Filtering

Xiancong Chen a,b,c, Weike Pan a,b,c,*, Zhong Ming a,b,c,*

[email protected], {panweike,mingz}@szu.edu.cn

a National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China
b Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen, China
c College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

Chen, Pan and Ming (SZU), DSLM, IEEE BigComp 2020


Introduction

Problem Definition

Problem: In this paper, we study the heterogeneous one-class collaborative filtering (HOCCF) problem.
Input: For a user u ∈ U, we have a set of purchased items, i.e., I^P_u, and a set of examined items, i.e., I^E_u.
Goal: Our goal is to exploit these two types of one-class feedback and recommend a ranked list of items for each user u.


Introduction

Challenges

1. The sparsity of the target feedback (i.e., purchases).
2. The ambiguity and noise of the auxiliary feedback (i.e., examinations).


Introduction

Overview of Our Solution

[Figure: illustration of the dual similarity, linking the target user and the target item to purchased/examined neighbors via the item-side similarities s_{i'i}, s_{ji} and the user-side similarities s_{u'u}, s_{wu}.]

Dual Similarity Learning Model (DSLM):
Learn the similarity s_{i'i} between a target item i and a purchased item i', and the similarity s_{ji} between a target item i and an examined item j.
Learn the similarity s_{u'u} between a target user u and a user u' who purchased item i, and the similarity s_{wu} between a target user u and a user w who examined item i.


Introduction

Advantages of Our Solution

By introducing the auxiliary feedback (i.e., examinations), DSLM is able to alleviate the sparsity problem to some extent.

DSLM learns not only the similarity among items, but also the similarity among users, which is useful to capture the correlations between users and items.

DSLM strikes a good balance between the item-based similarity and the user-based similarity.


Introduction

Notations

Notation : Explanation

n : number of users
m : number of items
u, u', w ∈ {1, 2, ..., n} : user IDs
i, i', j ∈ {1, 2, ..., m} : item IDs
U = {u}, |U| = n : the whole set of users
I = {i}, |I| = m : the whole set of items
R^P = {(u, i)} : the whole set of purchases
R^E = {(u, i)} : the whole set of examinations
R^A = {(u, i)} : the set of absent pairs
I^P_u = {i | (u, i) ∈ R^P} : the set of purchased items w.r.t. user u
I^E_u = {i | (u, i) ∈ R^E} : the set of examined items w.r.t. user u
U^P_i = {u | (u, i) ∈ R^P} : the set of users that have purchased item i
U^E_i = {u | (u, i) ∈ R^E} : the set of users that have examined item i
U_{u·}, P_{u'·}, E_{w·} ∈ R^{1×d} : users' latent vectors
V_{i·}, P_{i'·}, E_{j·} ∈ R^{1×d} : items' latent vectors
b_u, b_i ∈ R : user bias and item bias
d : number of latent features
r_{ui}, r^{(ℓ)}_{ui} : true and predicted preference of user u on item i
ρ : sampling parameter
γ : learning rate
T, L, L_0 : iteration numbers
λ∗, α∗, β∗ : tradeoff parameters


Background

Factored Item Similarity Model (FISM)

In FISM [Kabbur et al., 2013], we can estimate the preference of user u towards item i by aggregating the similarity between item i and all the items purchased by user u (i.e., I^P_u \ {i}),

$$\hat{r}_{ui} = \frac{1}{\sqrt{|I^P_u \setminus \{i\}|}} \sum_{i' \in I^P_u \setminus \{i\}} P_{i' \cdot} V_{i \cdot}^T, \qquad (1)$$

where we can regard the term $\frac{1}{\sqrt{|I^P_u \setminus \{i\}|}} \sum_{i' \in I^P_u \setminus \{i\}} P_{i' \cdot}$ as a certain virtual user profile w.r.t. the target feedback, denoting the distinct preference of user u.
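As a concrete illustration of the aggregation in Eq. (1), here is a minimal NumPy sketch (not the authors' code); the arrays P (item context factors) and V (item factors), both of shape m x d, and the function name are assumptions made only for illustration.

```python
import numpy as np

def fism_predict(purchased_items, i, P, V):
    """Sketch of Eq. (1): aggregate the similarities between item i and the
    other items purchased by the user, via a 'virtual user profile'."""
    others = [i2 for i2 in purchased_items if i2 != i]  # I^P_u \ {i}
    if not others:
        return 0.0
    virtual_user = P[others].sum(axis=0) / np.sqrt(len(others))  # virtual user profile
    return float(virtual_user @ V[i])  # sum over i' of P_{i'.} V_{i.}^T
```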


Background

Transfer via Joint Similarity Learning (TJSL)

TJSL [Pan et al., 2016] introduces a new similarity term in a similarway to that of FISM, through which the knowledge from the auxiliaryfeedback (i.e., examinations) can be transferred. Then, the preferenceestimation of user u towards item i becomes as follows,

$$\sum_{i' \in I^P_u \setminus \{i\}} s_{i'i} + \sum_{j \in I^{E(\ell)}_u} s_{ji}, \quad I^{E(\ell)}_u \subseteq I^E_u, \qquad (2)$$

where $\sum_{j \in I^{E(\ell)}_u} s_{ji} = \frac{1}{\sqrt{|I^{E(\ell)}_u|}} \sum_{j \in I^{E(\ell)}_u} E_{j \cdot} V_{i \cdot}^T$, and the term $\frac{1}{\sqrt{|I^{E(\ell)}_u|}} \sum_{j \in I^{E(\ell)}_u} E_{j \cdot}$ can be regarded as a virtual user profile w.r.t. the auxiliary feedback.


Method

Dual Similarity Learning Model (DSLM)

In TJSL [Pan et al., 2016], two similarities among items are learned. Symmetrically, we define the similarity among users as follows,

$$\sum_{u' \in U^P_i \setminus \{u\}} s_{u'u} + \sum_{w \in U^{E(\ell)}_i} s_{wu}, \quad U^{E(\ell)}_i \subseteq U^E_i, \qquad (3)$$

where $\sum_{u' \in U^P_i \setminus \{u\}} s_{u'u} = \frac{1}{\sqrt{|U^P_i \setminus \{u\}|}} \sum_{u' \in U^P_i \setminus \{u\}} P_{u' \cdot} U_{u \cdot}^T$ and $\sum_{w \in U^{E(\ell)}_i} s_{wu} = \frac{1}{\sqrt{|U^{E(\ell)}_i|}} \sum_{w \in U^{E(\ell)}_i} E_{w \cdot} U_{u \cdot}^T$. Intuitively, we can also regard the terms $\frac{1}{\sqrt{|U^P_i \setminus \{u\}|}} \sum_{u' \in U^P_i \setminus \{u\}} P_{u' \cdot}$ and $\frac{1}{\sqrt{|U^{E(\ell)}_i|}} \sum_{w \in U^{E(\ell)}_i} E_{w \cdot}$ as the virtual item profiles w.r.t. the target feedback and the auxiliary feedback, respectively.


Method

Prediction Rule

The predicted preference of user u on item i,

$$r^{(\ell)}_{ui} = \sum_{i' \in I^P_u \setminus \{i\}} s_{i'i} + \sum_{j \in I^{E(\ell)}_u} s_{ji} + b_u + b_i + \sum_{u' \in U^P_i \setminus \{u\}} s_{u'u} + \sum_{w \in U^{E(\ell)}_i} s_{wu}, \quad I^{E(\ell)}_u \subseteq I^E_u, \ U^{E(\ell)}_i \subseteq U^E_i, \qquad (4)$$

where I^{E(ℓ)}_u is the set of likely-to-prefer items selected from I^E_u, and U^{E(ℓ)}_i is the set of potential users, selected from U^E_i, that are likely to purchase item i.
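The prediction rule (4) can be sketched in NumPy as below; this is an illustrative sketch rather than the authors' implementation, and the argument names (index lists for the four neighbor sets, factor matrices U, V, P_user, E_user, P_item, E_item, and bias arrays bu, bi) are assumed names.

```python
import numpy as np

def dslm_predict(u, i, Ipu, Ieu_l, Upi, Uei_l, U, V, P_user, E_user, P_item, E_item, bu, bi):
    """Sketch of Eq. (4): item-side similarities + biases + user-side similarities.
    Ipu, Ieu_l, Upi, Uei_l are index lists for I^P_u, I^{E(l)}_u, U^P_i, U^{E(l)}_i."""
    def agg(rows, M):  # (1 / sqrt(|S|)) * sum_{k in S} M[k], with an empty-set guard
        return M[rows].sum(axis=0) / np.sqrt(len(rows)) if len(rows) else np.zeros(M.shape[1])
    items_p = [i2 for i2 in Ipu if i2 != i]   # I^P_u \ {i}
    users_p = [u2 for u2 in Upi if u2 != u]   # U^P_i \ {u}
    score = agg(items_p, P_item) @ V[i]       # sum of s_{i'i}
    score += agg(Ieu_l, E_item) @ V[i]        # sum of s_{ji}
    score += agg(users_p, P_user) @ U[u]      # sum of s_{u'u}
    score += agg(Uei_l, E_user) @ U[u]        # sum of s_{wu}
    return float(score + bu[u] + bi[i])
```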


Method

Objective Function

The objective function of DSLM is as follows,

$$\min_{\Theta^{(\ell)},\ I^{E(\ell)}_u \subseteq I^E_u,\ U^{E(\ell)}_i \subseteq U^E_i} \sum_{(u,i) \in R^P \cup R^A} f^{(\ell)}_{ui}, \qquad (5)$$

where $f^{(\ell)}_{ui} = \frac{1}{2}(r_{ui} - r^{(\ell)}_{ui})^2 + \frac{\lambda_u}{2}\|U_{u \cdot}\|^2_F + \frac{\lambda_p}{2}\sum_{u' \in U^P_i \setminus \{u\}}\|P_{u' \cdot}\|^2_F + \frac{\lambda_e}{2}\sum_{w \in U^{E(\ell)}_i}\|E_{w \cdot}\|^2_F + \frac{\beta_u}{2} b_u^2 + \frac{\alpha_v}{2}\|V_{i \cdot}\|^2_F + \frac{\alpha_p}{2}\sum_{i' \in I^P_u \setminus \{i\}}\|P_{i' \cdot}\|^2_F + \frac{\alpha_e}{2}\sum_{j \in I^{E(\ell)}_u}\|E_{j \cdot}\|^2_F + \frac{\beta_v}{2} b_i^2$, and the model parameters are $\Theta^{(\ell)} = \{U_{u \cdot}, P_{u' \cdot}, E_{w \cdot}, V_{i \cdot}, P_{i' \cdot}, E_{j \cdot}, b_u, b_i\}$. Note that R^A is a set of negative feedback used to complement the target feedback, where r_{ui} = 1 if (u, i) ∈ R^P and r_{ui} = 0 otherwise.
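For illustration, a small sketch of the per-pair loss f^{(ℓ)}_{ui} in (5), reusing the assumed model layout of the prediction sketch above; the `lam` dictionary keys standing in for λ∗, α∗ and β∗ are hypothetical names.

```python
import numpy as np

def pair_loss(r_ui, pred, model, u, i, users_p, users_e, items_p, items_e, lam):
    """Sketch of f_ui in Eq. (5): squared error plus L2 regularization terms."""
    sq = lambda M, rows: sum(float(M[k] @ M[k]) for k in rows)  # sum of squared norms
    loss = 0.5 * (r_ui - pred) ** 2
    loss += 0.5 * lam["u"] * float(model["U"][u] @ model["U"][u])
    loss += 0.5 * lam["p"] * sq(model["P_user"], users_p) + 0.5 * lam["e"] * sq(model["E_user"], users_e)
    loss += 0.5 * lam["v"] * float(model["V"][i] @ model["V"][i])
    loss += 0.5 * lam["ap"] * sq(model["P_item"], items_p) + 0.5 * lam["ae"] * sq(model["E_item"], items_e)
    loss += 0.5 * lam["bu"] * model["bu"][u] ** 2 + 0.5 * lam["bi"] * model["bi"][i] ** 2
    return loss
```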


Method

Gradients (1/2)

To learn the parameters Θ^{(ℓ)}, we use the stochastic gradient descent (SGD) algorithm and have the gradients of the model parameters for a randomly sampled pair (u, i) ∈ R^P ∪ R^A,

$$\nabla U_{u \cdot} = -e_{ui} \frac{1}{\sqrt{|U^P_i \setminus \{u\}|}} \sum_{u' \in U^P_i \setminus \{u\}} P_{u' \cdot} - e_{ui} \frac{1}{\sqrt{|U^{E(\ell)}_i|}} \sum_{w \in U^{E(\ell)}_i} E_{w \cdot} + \lambda_u U_{u \cdot}, \qquad (6)$$

$$\nabla V_{i \cdot} = -e_{ui} \frac{1}{\sqrt{|I^P_u \setminus \{i\}|}} \sum_{i' \in I^P_u \setminus \{i\}} P_{i' \cdot} - e_{ui} \frac{1}{\sqrt{|I^{E(\ell)}_u|}} \sum_{j \in I^{E(\ell)}_u} E_{j \cdot} + \alpha_v V_{i \cdot}, \qquad (7)$$

$$\nabla P_{u' \cdot} = -e_{ui} \frac{1}{\sqrt{|U^P_i \setminus \{u\}|}} U_{u \cdot} + \lambda_p P_{u' \cdot}, \quad u' \in U^P_i \setminus \{u\}, \qquad (8)$$


Method

Gradients (2/2)

$$\nabla E_{w \cdot} = -e_{ui} \frac{1}{\sqrt{|U^{E(\ell)}_i|}} U_{u \cdot} + \lambda_e E_{w \cdot}, \quad w \in U^{E(\ell)}_i, \qquad (9)$$

$$\nabla P_{i' \cdot} = -e_{ui} \frac{1}{\sqrt{|I^P_u \setminus \{i\}|}} V_{i \cdot} + \alpha_p P_{i' \cdot}, \quad i' \in I^P_u \setminus \{i\}, \qquad (10)$$

$$\nabla E_{j \cdot} = -e_{ui} \frac{1}{\sqrt{|I^{E(\ell)}_u|}} V_{i \cdot} + \alpha_e E_{j \cdot}, \quad j \in I^{E(\ell)}_u, \qquad (11)$$

$$\nabla b_u = -e_{ui} + \beta_u b_u, \quad \nabla b_i = -e_{ui} + \beta_v b_i, \qquad (12)$$

where e_{ui} = r_{ui} - r^{(ℓ)}_{ui} is the difference between the true preference and the predicted preference.


Method

Update Rules

We have the update rules,

$$\theta^{(\ell)} \leftarrow \theta^{(\ell)} - \gamma \nabla \theta^{(\ell)}, \qquad (13)$$

where γ is the learning rate, and θ^{(ℓ)} ∈ Θ^{(ℓ)} is a model parameter to be learned.
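A minimal sketch of one SGD step for a sampled pair, combining the gradients (6)-(12) with the update rule (13); the model dictionary, the index-list arguments, and the `lam` keys for the tradeoff parameters are assumed names, continuing the earlier sketches.

```python
import numpy as np

def sgd_step(u, i, e_ui, model, users_p, users_e, items_p, items_e, gamma, lam):
    """One SGD step for a sampled (u, i) with error e_ui = r_ui - r^(l)_ui.
    Index lists: users_p is U^P_i without u, users_e is U^{E(l)}_i,
    items_p is I^P_u without i, items_e is I^{E(l)}_u."""
    U, V = model["U"], model["V"]
    Pu, Eu = model["P_user"], model["E_user"]   # user-side context factors P_{u'.}, E_{w.}
    Pi, Ei = model["P_item"], model["E_item"]   # item-side context factors P_{i'.}, E_{j.}

    def agg(rows, M):  # (1 / sqrt(|S|)) * sum of the selected rows, with an empty-set guard
        return M[rows].sum(axis=0) / np.sqrt(len(rows)) if len(rows) else np.zeros(M.shape[1])

    grad_U = -e_ui * (agg(users_p, Pu) + agg(users_e, Eu)) + lam["u"] * U[u]   # Eq. (6)
    grad_V = -e_ui * (agg(items_p, Pi) + agg(items_e, Ei)) + lam["v"] * V[i]   # Eq. (7)
    for u2 in users_p:  # Eq. (8)
        Pu[u2] -= gamma * (-e_ui / np.sqrt(len(users_p)) * U[u] + lam["p"] * Pu[u2])
    for w in users_e:   # Eq. (9)
        Eu[w] -= gamma * (-e_ui / np.sqrt(len(users_e)) * U[u] + lam["e"] * Eu[w])
    for i2 in items_p:  # Eq. (10)
        Pi[i2] -= gamma * (-e_ui / np.sqrt(len(items_p)) * V[i] + lam["ap"] * Pi[i2])
    for j in items_e:   # Eq. (11)
        Ei[j] -= gamma * (-e_ui / np.sqrt(len(items_e)) * V[i] + lam["ae"] * Ei[j])
    model["bu"][u] -= gamma * (-e_ui + lam["bu"] * model["bu"][u])             # Eq. (12)
    model["bi"][i] -= gamma * (-e_ui + lam["bi"] * model["bi"][i])
    U[u] -= gamma * grad_U                                                     # Eq. (13)
    V[i] -= gamma * grad_V
```

In the full algorithm, this step is applied to every pair drawn from R^P ∪ R^A in each of the T inner iterations.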


Method

Identification of I^{E(ℓ)}_u and U^{E(ℓ)}_i

Note that we identify U^{E(ℓ)}_i and I^{E(ℓ)}_u in the following way:

For each user u ∈ U^E_i, we estimate the preference for the target item i, i.e., r^{(ℓ)}_{ui}, and take the τ|U^E_i| (τ ∈ (0, 1]) users with the highest scores as the potential users that are likely to purchase the target item i.

For each item j ∈ I^E_u, similarly, we estimate the preference r^{(ℓ)}_{uj}, and take the τ|I^E_u| items with the highest scores as the candidate items.

Finally, we save the model and data of the last L_0 epochs. The estimated preference is the average value of r^{(ℓ)}_{ui}, where ℓ ranges from L - L_0 + 1 to L.


Method

Algorithm of DSLM

1: Input: R^P, R^E, T, L, L_0, ρ, γ, λ∗, α∗, β∗.
2: Output: U^{E(ℓ)}, I^{E(ℓ)} and Θ^{(ℓ)}, ℓ = L - L_0 + 1, ..., L.
3: Let U^{E(1)} = U^E, I^{E(1)} = I^E, τ = 1
4: for ℓ = 1, ..., L do
5:   Initialize the model Θ^{(ℓ)}
6:   for t = 1, ..., T do
7:     Randomly sample R^A ⊂ R \ R^P with |R^A| = ρ|R^P|
8:     for t_2 = 1, ..., |R^P ∪ R^A| do
9:       Randomly pick up (u, i) ∈ R^P ∪ R^A
10:      Calculate r^{(ℓ)}_{ui} via (4)
11:      Calculate ∇θ, θ ∈ Θ^{(ℓ)} via (6)-(12)
12:      Update θ, θ ∈ Θ^{(ℓ)} via (13)
13:    end for
14:  end for
15:  if ℓ > L - L_0 then
16:    Save the current model and data (Θ^{(ℓ)}, U^{E(ℓ)}, I^{E(ℓ)})
17:  end if
18:  if L > 1 and L > ℓ then
19:    τ ← τ × 0.9
20:    Select I^{E(ℓ+1)}_u with |I^{E(ℓ+1)}_u| = τ|I^E_u| for each user u
21:    Select U^{E(ℓ+1)}_i with |U^{E(ℓ+1)}_i| = τ|U^E_i| for each item i
22:  end if
23: end for


Experiments

Datasets and Evaluation Metrics

For direct comparison, we use the three datasets from TJSL [Pan et al., 2016], including ML100K, ML1M and Alibaba2015.

Table: Description of the datasets used in the experiments.

Dataset      |U|    |I|    |R^P|   |R^E|    |R^P_te|
ML100K       943    1682   9438    45285    2153
ML1M         6040   3952   90848   400083   45075
Alibaba2015  7475   5275   9290    62659    2322

For performance evaluation, we adopt two commonly used ranking-oriented metrics, i.e., Precision@5 and NDCG@5.

The data and code are available at http://csse.szu.edu.cn/staff/panwk/publications/DSLM/
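For reference, the two metrics above can be computed as in the following sketch; these are the common binary-relevance definitions, which the slides do not spell out, so treat the exact formulas as an assumption.

```python
import numpy as np

def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k recommended items that appear in the user's test set."""
    return len(set(ranked[:k]) & set(relevant)) / k

def ndcg_at_k(ranked, relevant, k=5):
    """NDCG@k with binary relevance; the ideal DCG assumes min(k, |relevant|) hits."""
    dcg = sum(1.0 / np.log2(pos + 2) for pos, item in enumerate(ranked[:k]) if item in relevant)
    idcg = sum(1.0 / np.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / idcg if idcg > 0 else 0.0
```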


Experiments

Baselines and Parameter Configurations

For comparative studies, we include the following competitive methods:

BPR (Bayesian personalized ranking) [Rendle et al., 2009]
FISM (factored item similarity model) [Kabbur et al., 2013]
TJSL (transfer via joint similarity learning) [Pan et al., 2016]
RBPR (role-based Bayesian personalized ranking) [Peng et al., 2016]

For parameter configurations, we fix the number of latent dimensions d = 20, the learning rate γ = 0.01 and the sampling parameter ρ = 3, and search the tradeoff parameters from {0.001, 0.01, 0.1} and the best iteration number T from {100, 500, 1000} via the NDCG@5 performance.


Experiments

Main Results

Dataset      Metric        BPR             FISM            TJSL            RBPR            DSLM
ML100K       Precision@5   0.0552±0.0006   0.0628±0.0015   0.0697±0.0016   0.0654±0.0013   0.0694±0.0014
ML100K       NDCG@5        0.0874±0.0020   0.1029±0.0017   0.1133±0.0047   0.1058±0.0047   0.1140±0.0022
ML1M         Precision@5   0.0928±0.0008   0.0971±0.0013   0.1012±0.0011   0.1086±0.0009   0.1107±0.0015
ML1M         NDCG@5        0.1121±0.0010   0.1189±0.0008   0.1248±0.0010   0.1327±0.0016   0.1353±0.0012
Alibaba2015  Precision@5   0.0050±0.0006   0.0046±0.0003   0.0071±0.0004   0.0076±0.0005   0.0087±0.0006
Alibaba2015  NDCG@5        0.0138±0.0017   0.0126±0.0009   0.0200±0.0008   0.0220±0.0013   0.0269±0.0017

Observations:
In all cases, TJSL, RBPR and DSLM perform significantly better than BPR and FISM, which shows the effectiveness of introducing the auxiliary feedback to assist the task of learning users' preferences.
DSLM performs better than TJSL and RBPR in most cases, e.g., it is the best on ML1M and Alibaba2015, and is comparable with TJSL on ML100K, which clearly shows the usefulness of the dual similarity in capturing the correlations between users and items.


Conclusions and Future Work

Conclusion

We propose a novel solution, i.e., the dual similarity learning model (DSLM), for a recent and important recommendation problem called heterogeneous one-class collaborative filtering (HOCCF).

In particular, we jointly learn the dual similarity among both users and items so as to exploit the complementarity well. Extensive empirical studies on three public datasets clearly show the effectiveness of our solution.


Conclusions and Future Work

Future Work

For future work, we are interested in further improving our DSLM by exploring pairwise ranking models [Yu et al., 2018] and deep learning models [Shi et al., 2019].


Thank you

Thank you!

Weike Pan and Zhong Ming are the corresponding authors for this work. We thank the anonymous reviewers for their expert and constructive comments and suggestions, and acknowledge the support of the National Natural Science Foundation of China under Nos. 61872249, 61836005 and 61672358.


References

Kabbur, S., Ning, X., and Karypis, G. (2013). FISM: Factored item similarity models for top-n recommender systems. In KDD '13, pages 659-667.

Pan, W., Liu, M., and Ming, Z. (2016). Transfer learning for heterogeneous one-class collaborative filtering. IEEE Intelligent Systems, 31(4):43-49.

Peng, X., Chen, Y., Duan, Y., Pan, W., and Ming, Z. (2016). RBPR: Role-based Bayesian personalized ranking for heterogeneous one-class collaborative filtering. In IFUP '16.

Rendle, S., Freudenthaler, C., Gantner, Z., and Schmidt-Thieme, L. (2009). BPR: Bayesian personalized ranking from implicit feedback. In UAI '09, pages 452-461.

Shi, C., Han, X., Li, S., Wang, X., Wang, S., Du, J., and Yu, P. (2019). Deep collaborative filtering with multi-aspect information in heterogeneous networks. IEEE Transactions on Knowledge and Data Engineering.

Yu, R., Zhang, Y., Ye, Y., Wu, L., Wang, C., Liu, Q., and Chen, E. (2018). Multiple pairwise ranking with implicit feedback. In CIKM '18, pages 1727-1730.
