CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama...

43
CS249: ADVANCED DATA MINING Instructor: Yizhou Sun [email protected] May 31, 2017 Recommender Systems II

Transcript of CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama...

Page 1: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

CS249: ADVANCED DATA MINING

Instructor: Yizhou [email protected]

May 31, 2017

Recommender Systems II

Page 2: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommender Systems

•Recommendation via Information Network Analysis

•Hybrid Collaborative Filtering with Information Networks

•Graph Regularization for Recommendation

• Summary

2

Page 3: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Traditional View of Recommendation

3

Avatar TitanicAliens Revolutionary Road

Page 4: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommendation Paradigm

4

recommender system recommendation

user

feedback

external knowledge

product features

user-item feedback

Collaborative FilteringE.g., K-Nearest Neighbor (Sarwar WWW’01), Matrix Factorization (Hu ICDM’08, Koren IEEE-CS’09), Probabilistic Model (Hofmann SIGIR’03)

Content-Based MethodsE.g., (Balabanovic Comm. ACM’ 97, Zhang SIGIR’02)

Hybrid MethodsE.g., Content-Based CF (Antonopoulus, IS’06), External Knowledge CF (Ma WSDM’11)

Page 5: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

An Example of Traditional Method: Matrix Factorization

5

𝑅: Rating Matrix 𝑅: Estimated Rating Matrix

Page 6: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Challenges

•How to address the data sparsity and cold start issues?

•How to leverage different sources of information?

6

Page 7: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Solution: A Heterogeneous Information Network View of Recommendation

7

Avatar TitanicAliens Revolutionary Road

James Cameron

Kate Winslet

Leonardo Dicaprio

Zoe Saldana Adventure

Romance

Page 8: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

What Are Information Networks?

• A network where each node represents an entity (e.g.,

user in a social network) and each link (e.g., friendship)

a relationship between entities.

• Nodes/links may have attributes, labels, and weights.

• Links may carry rich semantic information.

8

Page 9: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

We are living in a connected world!

9

Page 10: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Even in Biomedical Domain

10

Gene

Patient

Symptom

Microbe

carriedBy

cause

Drug

Compound

Side Effect

similarTo

con

tain

Disease

Disease

Page 11: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommender Systems

•Recommendation via Information Network Analysis

•Hybrid Collaborative Filtering with Information Networks

•Graph Regularization for Recommendation

• Summary

11

Page 12: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommendation Paradigm

12

recommender system recommendation

user

feedback

external knowledge

product features

user-item feedback

Collaborative FilteringE.g., K-Nearest Neighbor (Sarwar WWW’01), Matrix Factorization (Hu ICDM’08, Koren IEEE-CS’09), Probabilistic Model (Hofmann SIGIR’03)

Content-Based MethodsE.g., (Balabanovic Comm. ACM’ 97, Zhang SIGIR’02)

Hybrid MethodsE.g., Content-Based CF (Antonopoulus, IS’06), External Knowledge CF (Ma WSDM’11)

Page 13: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Problem Definition

13

recommender system recommendation

user

feedback

information network

implicit user feedback

hybrid collaborative filteringwith information networks

Page 14: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommend with Trust and Distrust Relationships [Ma et al., RecSys’09]

•Users can be easily influenced by the friends they trust, and prefer their friends’ recommendations.

14

Where to have dinner? Ask

Ask

Ask

Good

Very Good

Cheap & Delicious

Page 15: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Trust and Distrust Graph

15

𝑺𝑻: Trust Graph 𝑺𝑫: Distrust Graph

R: User Item Rating Matrix

Page 16: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommendation with Trust and Distrust Relationships

16

𝑺𝑻: Trust Graph

𝑺𝑫: Distrust Graph

Page 17: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Results

•Dataset: Epinions

•Metric: RMSE

17

Page 18: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Hybrid Collaborative Filtering with Networks

•Utilizing network relationship informationcan enhance the recommendation quality

•However, most of the previous studies onlyuse single type of relationship between usersor items (e.g., social network Ma,WSDM’11, trustrelationship Ester, KDD’10, service membershipYuan, RecSys’11)

18

Page 19: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

The Heterogeneous Information Network View of Recommender System

19

Avatar TitanicAliens Revolution-ary Road

James Cameron

Kate Winslet

Leonardo Dicaprio

Zoe Saldana Adventure

Romance

Page 20: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Relationship Heterogeneity Alleviates Data Sparsity

20

# of users or items

A small numberof users and itemshave a largenumber of ratings

Most users and items havea small number of ratings

# o

f ra

tin

gs

Collaborative filtering methods suffer from data sparsity issue

• Heterogeneous relationships complement each other

• Users and items with limited feedback can be connected to the

network by different types of paths

• Connect new users or items (cold start) in the information

network

Page 21: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Relationship Heterogeneity Based Personalized Recommendation Models (Yu et al., WSDM’14)

21

Different users may have different behaviors or preferences

Aliens

James Cameron fan

80s Sci-fi fan

Sigourney Weaver fan

Different users may beinterested in the samemovie for different reasons

Two levels of personalizationData level• Most recommendation methods use

one model for all users and rely on

personal feedback to achieve

personalization

Model level• With different entity relationships, we

can learn personalized models for

different users to further distinguish

their differences

Page 22: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Preference Propagation-Based Latent Features

22

Alice

Bob

Kate Winslet

Naomi Watts

Titanic

revolutionary road

skyfall

King Konggenre: drama

Sam Mendes

tag: Oscar NominationCharlie

Generate L different meta-path (path types)

connecting users and items

Propagate user implicit feedback along each meta-

path

Calculate latent-features for users and items for each

meta-path with NMFrelated method

Ralph Fiennes

Page 23: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Luser-cluster similarity

Recommendation Models

23

Observation 1: Different meta-paths may have different importance

Global Recommendation Model

Personalized Recommendation Model

Observation 2: Different users may require different models

ranking score

the q-th meta-path

features for user i and item j

c total soft user clusters

(1)

(2)

Page 24: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Parameter Estimation

24

• Bayesian personalized ranking (Rendle UAI’09)

• Objective function

minΘ

sigmoid function

for each correctly ranked item pairi.e., 𝑢𝑖 gave feedback to 𝑒𝑎 but not 𝑒𝑏

Soft cluster users with NMF + k-means

For each user cluster, learn one

model with Eq. (3)

Generate personalized model for each user on the

fly with Eq. (2)

(3)

Learning Personalized Recommendation Model

Page 25: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Experiment Setup

•Datasets

• Comparison methods:• Popularity: recommend the most popular items to

users

• Co-click: conditional probabilities between items

• NMF: non-negative matrix factorization on userfeedback

• Hybrid-SVM: use Rank-SVM with plain features(utilize both user feedback and information network)

25

Page 26: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Performance Comparison

26

HeteRec personalized recommendation (HeteRec-p) provides the best recommendation results

p

Page 27: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Performance under Different Scenarios

27

HeteRec–p consistently outperform other methods in different scenariosbetter recommendation results if users provide more feedbackbetter recommendation for users who like less popular items

p p

user

Page 28: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommender Systems

•Recommendation via Information Network Analysis

•Hybrid Collaborative Filtering with Information Networks

•Graph Regularization for Recommendation

• Summary

28

Page 29: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

From Graph Regularization Point of View

• Why additional links help?• They define new similarity metrics between users or items.

• How to integrate this assumption into recommendation?• Use graph regularization to force two entities to be similar in latent

space, if they are similar in graph

• The original form of graph regularization

•1

2∑𝑤𝑖𝑗 𝑓𝑖 − 𝑓𝑗

2= 𝑓′𝐿𝑓

• 𝑤𝑖𝑗 ∶ 𝑠𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦 𝑜𝑓 𝑛𝑜𝑑𝑒 𝑖 𝑎𝑛𝑑 𝑗

• 𝑓𝑖: some latent representation for node i• L: Laplacian matrix of W, i.e., 𝐿 = 𝐷 − 𝑊,

• 𝑤ℎ𝑒𝑟𝑒 𝐷 𝑖𝑠 𝑎 𝑑𝑖𝑎𝑔𝑜𝑛𝑎𝑙 𝑚𝑎𝑡𝑟𝑖𝑥 𝑎𝑛𝑑 𝐷𝑖𝑖 = ∑𝑗 𝑤𝑖𝑗

29

Page 30: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommender Systems with Social Regularization [Ma et al., WSDM’11]

• Input: Social Relation + Rating Matrix

30

Page 31: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Two Regularization Forms

• Model 1: Average-based Regularization

• We are similar to the average of our friends

• Model2: Individual-based Regularization

• We are similar to each of our friends

31

Similarity can be propagated via friends: transitivity!

Page 32: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

How to compute similarity between two users?

•Cosine similarity (VSS)

•Pearson correlation coefficient (PCC)

32

Page 33: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Results

33

Page 34: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Meta-Path-based Regularization [Yu et al., IJCAI-HINA’13]

• What if it is more than one type of relation?

• Solution:• Use meta-path to generate similarity relation between items,

e.g., movie-director-movie

• Learn the importance score for each meta-path

34

Rating Data Heterogeneous Information Network

Page 35: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Notations

•We have n users and m items.•

•By computing similarity scores of all item pairs along certain meta-path, we can get a similarity matrix•

•With L different meta-paths, we can calculate L similarity matrices as •

35

Page 36: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Objective Function

36

Approximate R with U V product Regularization on U V

Regularization on θ, which is the importance score for each meta-path

Similar items measured from HIN should have similar low-rank representations

Page 37: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Equivalent Objective Function Using Graph Laplacian

37

Similar items measured from HIN should have similar low-rank representations

Page 38: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Dataset

•We combine IMDb + MovieLens100K

38

We random sample training datasets of different sizes (0.4, 0.6, and 0.8)

Page 39: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Results

39

Page 40: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Recommender Systems

•Recommendation via Information Network Analysis

•Hybrid Collaborative Filtering with Information Networks

•Graph Regularization for Recommendation

• Summary

40

Page 41: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Summary

• Recommendation via Information Network Analysis

• Users and items are embedded in a heterogeneous

information network

• Recommendation can be considered as a link prediction

problem

• Hybrid Collaborative Filtering with Information Networks

• Propagate the feedback via meta-paths

• Graph Regularization for Recommendation

• Similar items/users should have similar latent vectors

41

Page 42: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

More about Course Project

•Presentation

• 20mins+5minsQ&A

•Time arrangement

• June 5: Team 1-4

• June 7: Team 5-8

•Course Project Final Report + Data (link) + Code

•Due June 12

42

Page 43: CS249: ADVANCED DATA MININGweb.cs.ucla.edu/~yzsun/classes/2017Spring_CS249/...skyfall genre: drama King Kong Sam Mendes Charlie tag: Oscar Nomination Generate L different meta-path

Peer Evaluation Questions

1. Is the

proposed

problem

interesting

and novel?

2. Is the

problem

formalization

reasonable?

3. Is the

solution solid

and

reasonable?

4. To what

extent the

project

achieves the

claimed goal?

5. How good is

the

presentation?

43