CLiMF: Collaborative Less-is-More Filtering


Description

RecSys 2012 presentation slides for CLiMF, a collaborative filtering algorithm based on a novel ranking objective.

Transcript of CLiMF: Collaborative Less-is-More Filtering

Page 1: CLiMF: Collaborative Less-is-More Filtering

RecSys 2012, Dublin, Ireland, September 13, 2012

CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering

Yue Shi (a), Alexandros Karatzoglou (b, presenter), Linas Baltrunas (b), Martha Larson (a), Alan Hanjalic (a), Nuria Oliver (b)

(a) Delft University of Technology, Netherlands
(b) Telefonica Research, Spain

Page 2: CLiMF: Collaborative Less-is-More Filtering


Top-k Recommendations

Page 3: CLiMF: Collaborative Less-is-More Filtering


Implicit feedback

Page 4: CLiMF: Collaborative Less-is-More Filtering


Models for Implicit feedback

• Classification or learning to rank
• Binary pairwise ranking loss functions (hinge, AUC loss)
• Sample from the non-observed/irrelevant entries (a minimal sketch follows the figure below)

[Figure: example items labeled Friend / Not a friend]
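To make the bullets above concrete, here is a minimal sketch of a sampled pairwise hinge loss over a dot-product model for implicit feedback; this is a generic illustration rather than the method proposed in these slides, and all names are placeholders.

```python
import numpy as np

def pairwise_hinge_loss(U, V, user, pos_items, n_items, margin=1.0, n_samples=5, rng=None):
    """Sampled pairwise hinge loss for one user: each observed (positive) item
    should score at least `margin` higher than sampled non-observed items."""
    rng = rng or np.random.default_rng()
    pos_items = set(pos_items)
    loss = 0.0
    for j in pos_items:                      # observed ("friend") items
        for _ in range(n_samples):
            k = rng.integers(n_items)        # sample a non-observed candidate
            if k in pos_items:               # skip accidental positives
                continue
            f_pos = U[user] @ V[j]           # score of the observed item
            f_neg = U[user] @ V[k]           # score of the sampled item
            loss += max(0.0, margin - (f_pos - f_neg))
    return loss
```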

Page 5: CLiMF: Collaborative Less-is-More Filtering

Learning to Rank in CF

• The point-wise approach: reduce ranking to a regression, classification, or ordinal regression problem [OrdRec@RecSys 2011]; f(user, item) → R
• The pair-wise approach: reduce ranking to pair-wise classification [BPR@UAI 2009]; f(user, item_1, item_2) → R
• The list-wise approach: direct optimization of IR measures, list-wise loss minimization [CoFiRank@NIPS 2008]; f(user, item_1, ..., item_n) → R

Page 6: CLiMF: Collaborative Less-is-More Filtering

Ranking metrics

List-wise ranking measures for implicit feedback:

Reciprocal rank: $RR = \frac{1}{\mathrm{rank}_i}$

Average precision [TFMAP@SIGIR 2012]: $AP = \frac{1}{|S|} \sum_{k=1}^{|S|} P(k)$

Page 7: CLiMF: Collaborative Less-is-More Filtering

Ranking metrics

[Figure: example ranked list with AP = 1, RR = 1]

Page 8: CLiMF: Collaborative Less-is-More Filtering

Ranking metrics

[Figure: example ranked list with AP = 0.66, RR = 1]

Page 9: CLiMF: Collaborative Less-is-More Filtering


Less is more

•  Focus at the very top of the list

• Try to get at least one interesting item at the top of the list

• MRR is a particularly important measure in domains that provide users with only a few recommendations, e.g., Top-3 or Top-5

Page 10: CLiMF: Collaborative Less-is-More Filtering


Ingredients

• What kind of model should we use? A factor model.

• Which ranking measure do we need to optimize for a good Top-k recommender? MRR captures the quality of Top-k recommendations.

• But MRR is not smooth, so what can we do? We can perhaps find a smooth version of MRR.

• How do we make the proposed solution scalable? With a fast learning algorithm (SGD); smoothness gives us gradients.

Page 11: CLiMF: Collaborative Less-is-More Filtering


Model

[Figure: user-item matrix with observed entries marked +, factorized into user and item latent factors]

$f_{ij} = \langle U_i, V_j \rangle$
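A minimal sketch of how such a factor model produces Top-k recommendations: score every item by the inner product $f_{ij} = \langle U_i, V_j \rangle$ and keep the k best unseen items. Array shapes and names here are illustrative assumptions.

```python
import numpy as np

def top_k(U, V, user, seen, k=5):
    """Score all items for `user` as f_ij = <U_i, V_j> and return the
    top-k item indices, excluding already observed items."""
    scores = V @ U[user]                 # f_ij for every item j
    scores[list(seen)] = -np.inf         # never recommend observed items
    return np.argsort(-scores)[:k]

# Example: d = 10 latent factors, 100 users, 1000 items
rng = np.random.default_rng(0)
U = rng.normal(size=(100, 10))
V = rng.normal(size=(1000, 10))
print(top_k(U, V, user=0, seen={3, 17}))
```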

Page 12: CLiMF: Collaborative Less-is-More Filtering


The Non-smoothness of Reciprocal Rank

• Reciprocal Rank (RR) of a ranked list of items for a given user i, where Y_ij is the binary relevance of item j and R_ij is its rank in the list:

$RR_i = \sum_{j=1}^{N} \frac{Y_{ij}}{R_{ij}} \prod_{k=1}^{N} \bigl(1 - Y_{ik}\, I(R_{ik} < R_{ij})\bigr)$
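For intuition, a direct transcription of this RR definition into code, with ranks derived from model scores; the variable names and the toy example are illustrative, not taken from the slides.

```python
import numpy as np

def reciprocal_rank(y, scores):
    """RR_i = sum_j (Y_ij / R_ij) * prod_k (1 - Y_ik * I(R_ik < R_ij)),
    i.e. 1 / rank of the highest-ranked relevant item."""
    # R_ij: rank of item j when items are sorted by descending score
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(scores) + 1)

    rr = 0.0
    for j in range(len(scores)):
        if not y[j]:
            continue
        # the product is 1 only if no other relevant item is ranked above j
        blocked = any(y[k] and ranks[k] < ranks[j] for k in range(len(scores)))
        rr += (1.0 / ranks[j]) * (0.0 if blocked else 1.0)
    return rr

# Toy example: items 2 and 5 are relevant
y = np.array([0, 0, 1, 0, 0, 1])
scores = np.array([0.9, 0.1, 0.8, 0.3, 0.2, 0.7])
print(reciprocal_rank(y, scores))   # relevant item 2 sits at rank 2 -> RR = 0.5
```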

Page 13: CLiMF: Collaborative Less-is-More Filtering

Non-smoothness

[Figure: ranked list of 8 items with predicted scores F_i = 0.81, 0.75, 0.64, 0.61, 0.58, 0.55, 0.49, 0.43; the relevant item sits at rank 2, so RR = 0.5]

Page 14: CLiMF: Collaborative Less-is-More Filtering

Non-smoothness

[Figure: after a small change in the scores (F_i = 0.84, 0.82, 0.56, 0.50, 0.45, 0.40, 0.32, 0.31) the relevant item is still at rank 2, so RR stays at 0.5]

Page 15: CLiMF: Collaborative Less-is-More Filtering

Non-smoothness

[Figure: another small change (F_i = 0.85, 0.84, 0.56, 0.50, 0.45, 0.40, 0.32, 0.31) pushes the relevant item to rank 1, and RR jumps to 1]

Page 16: CLiMF: Collaborative Less-is-More Filtering


Page 17: CLiMF: Collaborative Less-is-More Filtering


How can we get a smooth-MRR?

• Borrow techniques from learning-to-rank:

$I(R_{ik} < R_{ij}) \approx g(f_{ik} - f_{ij}), \qquad \frac{1}{R_{ij}} \approx g(f_{ij}), \qquad \text{where } g(x) = \frac{1}{1 + e^{-x}}$

Page 18: CLiMF: Collaborative Less-is-More Filtering


MRR Loss function

$RR_i \approx \sum_{j=1}^{N} Y_{ij}\, g(f_{ij}) \prod_{k=1}^{N} \bigl(1 - Y_{ik}\, g(f_{ik} - f_{ij})\bigr)$

$U_i, V = \arg\max_{U_i, V} \, RR_i, \qquad f_{ij} = \langle U_i, V_j \rangle$

Complexity: $O(N^2)$
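A literal transcription of the smoothed RR above for a single user, mainly to make the O(N^2) double loop over items visible; y stands for the binary relevance vector Y_i and f for the vector of scores f_ij (names are illustrative).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_rr(y, f):
    """RR_i ~= sum_j Y_ij g(f_ij) prod_k (1 - Y_ik g(f_ik - f_ij)),
    written literally from the slide: the double loop over all N items is O(N^2)."""
    n = len(f)
    rr = 0.0
    for j in range(n):
        prod = 1.0
        for k in range(n):
            prod *= 1.0 - y[k] * sigmoid(f[k] - f[j])
        rr += y[j] * sigmoid(f[j]) * prod
    return rr
```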

Page 19: CLiMF: Collaborative Less-is-More Filtering


MRR loss function II

• Using the concavity and monotonicity of the logarithm, we obtain a lower bound:

$L(U_i, V) = \sum_{j=1}^{N} Y_{ij} \Bigl[ \ln g(f_{ij}) + \sum_{k=1}^{N} \ln\bigl(1 - Y_{ik}\, g(f_{ik} - f_{ij})\bigr) \Bigr]$

Complexity: $O(n_i^2)$ per user, where $n_i$ is the number of relevant items for user $i$
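A sketch of the lower bound for one user. Because of the Y_ij and Y_ik factors, only the user's relevant items enter the sums, so the cost depends on the (small) number of relevant items rather than the full catalogue; `relevant` and `f` are illustrative names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lower_bound(relevant, f):
    """L(U_i, V) = sum_{j in relevant} [ ln g(f_ij)
                    + sum_{k in relevant} ln(1 - g(f_ik - f_ij)) ]
    where f[j] = f_ij is the model score of item j for user i."""
    total = 0.0
    for j in relevant:                  # loops run only over relevant items
        total += np.log(sigmoid(f[j]))
        for k in relevant:
            total += np.log(1.0 - sigmoid(f[k] - f[j]))
    return total
```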

Page 20: CLiMF: Collaborative Less-is-More Filtering


Optimization

• The objective is smooth, so we can compute the gradients $\frac{\partial E}{\partial U_i}$ and $\frac{\partial E}{\partial V_j}$

• We use stochastic gradient descent (SGD)

• Overall scalability is linear in the number of relevant items: $O(dS)$

$E(U, V) = \sum_{i=1}^{M} \sum_{j=1}^{N} Y_{ij} \Bigl[ \ln g(U_i^{\mathsf{T}} V_j) + \sum_{k=1}^{N} \ln\bigl(1 - Y_{ik}\, g(U_i^{\mathsf{T}} V_k - U_i^{\mathsf{T}} V_j)\bigr) \Bigr] - \frac{\lambda}{2}\bigl(\lVert U \rVert^2 + \lVert V \rVert^2\bigr)$
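To make the optimization concrete, here is a minimal per-user SGD (ascent) step obtained by differentiating the objective E above with respect to U_i and the relevant V_j. It is a sketch written from the slide's formula, not the authors' code, and the hyperparameter names (lam, lr) are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def climf_user_update(U, V, i, relevant, lam=0.001, lr=0.01):
    """One SGD ascent step on the objective E for user i.
    `relevant` holds the item indices with Y_ij = 1."""
    Ui = U[i]
    f = {j: Ui @ V[j] for j in relevant}            # f_ij = <U_i, V_j>

    # Gradient of E with respect to U_i
    dU = -lam * Ui
    for j in relevant:
        dU += sigmoid(-f[j]) * V[j]
        for k in relevant:
            dU += sigmoid(f[k] - f[j]) * (V[j] - V[k])

    # Gradient of E with respect to each relevant V_j
    dV = {}
    for j in relevant:
        g_j = sigmoid(-f[j])
        for k in relevant:
            g_j += sigmoid(f[k] - f[j]) - sigmoid(f[j] - f[k])
        dV[j] = g_j * Ui - lam * V[j]

    # Ascent step (we are maximizing E)
    U[i] += lr * dU
    for j in relevant:
        V[j] += lr * dV[j]
```

A full training loop would sweep this update over all users for several epochs.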

Page 21: CLiMF: Collaborative Less-is-More Filtering

What's different?

• The CLiMF reciprocal-rank loss essentially pushes relevant items apart

• In the process, at least one item ends up high up in the list

Page 22: CLiMF: Collaborative Less-is-More Filtering


Conventional loss

Page 23: CLiMF: Collaborative Less-is-More Filtering


CLiMF MRR-loss

Page 24: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Data sets

• Epinions:
  • 346K observations of trust relationships
  • 1767 users; 49288 trustees
  • 99.85% sparseness
  • Avg. friends/trustees per user: 73.34

Page 25: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Data sets

• Tuenti:
  • 798K observations of trust relationships
  • 11392 users; 50000 friends
  • 99.86% sparseness
  • Avg. friends/trustees per user: 70.06

Page 26: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Experimental protocol

[Diagram: per-user split of friends/trustees into training data (Given 5) and held-out test data]

Page 27: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Experimental protocol

[Diagram: per-user split of friends/trustees into training data (Given 10) and held-out test data]

Page 28: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Experimental protocol

[Diagram: per-user split of friends/trustees into training data (Given 15) and held-out test data]

Page 29: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Experimental protocol

[Diagram: per-user split of friends/trustees into training data (Given 20) and held-out test data]
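For clarity, a sketch of what a "Given N" split typically means: for each user, N observed friends/trustees are kept for training and the rest are held out for testing. This reading of the protocol is an assumption, and the function below is purely illustrative.

```python
import random

def given_n_split(user_items, n, seed=0):
    """For each user, keep n randomly chosen observed items for training
    and hold out the remainder as test items. Users with <= n items are skipped."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in user_items.items():
        items = list(items)
        if len(items) <= n:
            continue
        rng.shuffle(items)
        train[user] = items[:n]
        test[user] = items[n:]
    return train, test
```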

Page 30: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Evaluation metrics

$MRR = \frac{1}{|S|} \sum_{i=1}^{|S|} \frac{1}{\mathrm{rank}_i}$

• P@5
• 1-call@5: ratio of test users who have at least one relevant item in their Top-5
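A sketch of how the three metrics can be computed from each test user's ranked recommendation list; the data structures and names are illustrative assumptions.

```python
import numpy as np

def evaluate(ranked_lists, test_items, k=5):
    """ranked_lists[u]: items ranked for user u (best first, training items excluded);
    test_items[u]: the user's held-out relevant items."""
    rr, p_at_k, one_call = [], [], []
    for u, ranking in ranked_lists.items():
        relevant = set(test_items[u])
        hits = [r + 1 for r, item in enumerate(ranking) if item in relevant]
        rr.append(1.0 / hits[0] if hits else 0.0)              # reciprocal rank
        top_k_hits = sum(1 for item in ranking[:k] if item in relevant)
        p_at_k.append(top_k_hits / k)                          # P@5
        one_call.append(1.0 if top_k_hits > 0 else 0.0)        # 1-call@5
    return np.mean(rr), np.mean(p_at_k), np.mean(one_call)
```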

Page 31: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Scalability

Page 32: CLiMF: Collaborative Less-is-More Filtering

Experimental Evaluation: Competition

• Pop: naive baseline, recommends items by their overall popularity

• iMF (Hu et al., ICDM'08): optimizes a squared error loss

• BPR-MF (Rendle et al., UAI'09): optimizes AUC

Page 33: CLiMF: Collaborative Less-is-More Filtering

[Figure: MRR on Epinions for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 34: CLiMF: Collaborative Less-is-More Filtering

[Figure: P@5 on Epinions for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 35: CLiMF: Collaborative Less-is-More Filtering

[Figure: 1-call@5 on Epinions for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 36: CLiMF: Collaborative Less-is-More Filtering

[Figure: MRR on Tuenti for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 37: CLiMF: Collaborative Less-is-More Filtering

[Figure: P@5 on Tuenti for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 38: CLiMF: Collaborative Less-is-More Filtering

[Figure: 1-call@5 on Tuenti for Pop, iMF, BPR-MF, and CLiMF at Given 5, 10, 15, and 20]

Page 39: CLiMF: Collaborative Less-is-More Filtering


Conclusions and Future Work

• Contribution
  • A novel method for implicit feedback data with some nice properties (Top-k quality, speed)

• Future work
  • Use CLiMF to avoid duplicate or very similar recommendations in the top-k part of the list
  • Optimize other evaluation metrics for top-k recommendation
  • Take the social network of users into account

Page 40: CLiMF: Collaborative Less-is-More Filtering


Thank you!

Telefonica Research is looking for interns! Contact: [email protected] or [email protected]