Boolean matrix factorisation for collaborative filtering

34
Boolean Matrix Factorisation for Collaborative Filtering: An FCA- based approach Dmitry Ignatov 1 , Elena Nenova 2 , Andrey Konstantinov 1 , Natalia Konstantinova 3 1 National Research University Higher School of Economics, Moscow, Russia 2 Imhonet Research, Moscow, Russia 3 University of Wolverhampton, UK AIMSA 2014, Sept. 12, Varna, Bulgaria

description

We propose a new approach for Collaborative fi ltering which is based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (MovieLens dataset) we compare the approach with an SVD-based one in terms of Mean Average Error (MAE). One of the experimental consequences is that it is enough to have a binary-scaled rating data to obtain almost the same quality in terms of MAE by BMF as for the SVD-based algorithm in case of non-scaled data.

Transcript of Boolean matrix factorisation for collaborative filtering

Page 1: Boolean matrix factorisation for collaborative filtering

Boolean Matrix Factorisation for Collaborative Filtering: An FCA-based approach

Dmitry Ignatov1, Elena Nenova2, Andrey Konstantinov1, Natalia Konstantinova3

1National Research University Higher School of Economics, Moscow, Russia

2Imhonet Research, Moscow, Russia3University of Wolverhampton, UK

AIMSA 2014, Sept. 12, Varna, Bulgaria

Page 2: Boolean matrix factorisation for collaborative filtering

Outline

• Problem Statement• Basic Matrix Factorisation (MF) Techniques• FCA-based Boolean Matrix Factorisation– FCA definitions– FCA and Recommender Systems– FCA-based BMF

• General Scheme of Experiments• Experiments• Conclusion & Future Plans

Page 3: Boolean matrix factorisation for collaborative filtering

Problem Statement

• Recommender Systems is a rapidly growing area (ACM RecSys conference series since 2007)

• Matrix Factorisation techniques seems to be an industry standard (SVD, NMF, PLSA etc.)

• What about Boolean Matrix Factorisation or/and FCA?

• Hence why not to develop FCA-based BMF technique, evaluate it, and compare with the state-of-the-art techniques?

Page 4: Boolean matrix factorisation for collaborative filtering

Outline

• Problem Statement• Basic Matrix Factorisation (MF) Techniques• FCA-based Boolean Matrix Factorisation– FCA definitions– FCA and Recommender Systems– FCA-based BMF

• General Scheme of Experiments• Experiments• Conclusion & Future Plans

Page 5: Boolean matrix factorisation for collaborative filtering

Basic MF Techniques. SVD ● Singular Value Decomposition

are orthogonal matrices

where

Page 6: Boolean matrix factorisation for collaborative filtering

SVD Example

Page 7: Boolean matrix factorisation for collaborative filtering

Basic MF Techniques. NMF

• Non-negative Matrix Factorisation

Page 8: Boolean matrix factorisation for collaborative filtering

Basic MF Techniques. NMF

Page 9: Boolean matrix factorisation for collaborative filtering

Basic MF Techniques. NMF

• Boolean Matrix Factorisation

Page 10: Boolean matrix factorisation for collaborative filtering

Outline

• Problem Statement• Basic Matrix Factorisation (MF) Techniques• FCA-based Boolean Matrix Factorisation– FCA definitions– FCA and Recommender Systems– FCA-based BMF

• General Scheme of Experiments• Experiments• Conclusion & Future Plans

Page 11: Boolean matrix factorisation for collaborative filtering

Formal Concept Analysis[Wille, 1982, Ganter & Wille, 1999]

Romeo & Juliet The Puppets Master

Ubik Ivanhoe

Kate x x

Mike x x

Alex x x

David x x x

Definition 1. Formal Context is a triple (G, M, I ), where G is a set of (formal) objects, M is a set of (formal) attributes, and I G M is the incidence relation which shows that object g G posseses an attribute m M.

Example. Books recommender

Page 12: Boolean matrix factorisation for collaborative filtering

Formal Concept Analysis

R&J PM Ub Iv

Kate x x

Mike x x

Alex x x

David x x x

Example

Definition 2. Derivation operators (defining Galois connection)

AI := { m M | gIm for all g A } is the set of attributes common to all objects in A

BI := { g G | gIm for all m B } is the set of objects that have all attributes from B

{Kate, Mike}I = {RJ}

{Ubik} I = {Mike, Alex, David}

{RJ,PM} I = {}G

{} IG =M

Page 13: Boolean matrix factorisation for collaborative filtering

Formal Concept Analysis

Example • A pair ({Kate, Mike},{R&J}) is a formal concept

• ({Alex, David} ,{Ubik}) doesn‘t form a formal concept, because {Ubik} I{Alex, David}

• ({Alex, David} {PM, Ubik}) is a formal concept

Definition 3. (A, B) is a formal concept of (G, M, I) iff A G, B M, AI = B, and BI = A .

A is the extent and B is the intent of the concept (A, B).

is a set of all concepts of the context (G, M, I)( , , )G M IB

R&J PM Ub Iv

Kate x x

Mike x x

Alex x x

David x x x

Page 14: Boolean matrix factorisation for collaborative filtering

FCA and Graphs

a b c d

Kate x xMike x xAlex x xDavid x x x

Formal Context Bipartite graphFormal Concept

(maximal rectangle)Biclique

Page 15: Boolean matrix factorisation for collaborative filtering

FCA & Recommender Systems

• Collaborative Recommending using Formal Concept Analysis (du Boucher-Ryan & Bridge, 2006)

• Concept-based Recommendations for Internet Advertisement (Ignatov & Kuznetsov, 2008)

• FCA-based Recommender Models and Data Analysis for Crowdsourcing Platform Witology (Ignatov et al., 2014)

Page 16: Boolean matrix factorisation for collaborative filtering

FCA-based BMF

• Belohlavek & Vyhodil, 2010

Page 17: Boolean matrix factorisation for collaborative filtering

FCA-based BMF

• Belohlavek & Vyhodil, 2010

Page 18: Boolean matrix factorisation for collaborative filtering

Example 1

Page 19: Boolean matrix factorisation for collaborative filtering

Example 2

Page 20: Boolean matrix factorisation for collaborative filtering

Outline

• Problem Statement• Basic Matrix Factorisation (MF) Techniques• FCA-based Boolean Matrix Factorisation– FCA definitions– FCA and Recommender Systems– FCA-based BMF

• General Scheme of Experiments• Experiments• Conclusion & Future Plans

Page 21: Boolean matrix factorisation for collaborative filtering

General Scheme of Experiments

Page 22: Boolean matrix factorisation for collaborative filtering

kNN approach

• Adomavicus & Tuzhilin, 2005• Predicted rating of user c for item s

• sim(c ,c′ ) is similarity between users c′ and c, e.g. cosine-based or Pearson correlation

Page 23: Boolean matrix factorisation for collaborative filtering

Outline

• Problem Statement• Basic Matrix Factorisation (MF) Techniques• FCA-based Boolean Matrix Factorisation– FCA definitions– FCA and Recommender Systems– FCA-based BMF

• General Scheme of Experiments• Experiments• Conclusion & Future Plans

Page 24: Boolean matrix factorisation for collaborative filtering

Dataset

• MovieLens dataset:– 943 users,– 1682 movies,– every user have rated at least 20 movies,– 100000 ratings,– training set 80000 ratings,– test set 20000 ratings.

Page 25: Boolean matrix factorisation for collaborative filtering

Experiments

Page 26: Boolean matrix factorisation for collaborative filtering

Experiments

• MAE for SVD and BMF at 80% coverage level

• Number of factors for SVD and BMF at different coverage level

Page 27: Boolean matrix factorisation for collaborative filtering

Experiments• Comparison of kNN- approach and BMF-based approaches by

Precision and Recall

Page 28: Boolean matrix factorisation for collaborative filtering

Experiments

• Scaling influence on the recommendations quality for BMF in terms of MAE

Page 29: Boolean matrix factorisation for collaborative filtering

Experiments

• MAE dependence on scaling and number of nearest neighbors for 80% coverage.

Page 30: Boolean matrix factorisation for collaborative filtering

Experiments• MAE dependence on data filtration algorithm and the number

of nearest neighbors.

Page 31: Boolean matrix factorisation for collaborative filtering

Experiments

• Speed up of PLSA convergence

Page 32: Boolean matrix factorisation for collaborative filtering

Conclusion

• BMF-based RA is similar to state-of-the-art techniques in terms of MAE and demonstrates good Precision and Recall

• Probably low scalability is the main drawback of the approach

• BMF: O(k|G||M|3) versus SVD: O(|G||M|2+|M|3)

Page 33: Boolean matrix factorisation for collaborative filtering

Future Prospects

• BMF-based RS in Triadic Case (e.g., folksonomy data)

• BMF-based RS for Graded and Ordinal Data • BMF-based RS for simultaneous factorisation of

user-features, user-items, and items-features matrices

• BMF and Least Square based imputation techniques

• Scalability Issues

Page 34: Boolean matrix factorisation for collaborative filtering

Thank you!