
Computer Based Recommendation Systems in Online Business

Jun Zhang and Xiwei Wang
Department of Computer Science
University of Kentucky
Lexington, KY 40506-0633
USA

CONTENTS

Introduction to recommendation systems

Popular algorithms used in recommendation systems

Comparison of recommendation system algorithms using real online market data

Concluding remarks and future work

1

BACKGROUND

The booming e‐Business facilitates profound social behavior changes. More and more people choose to shop online instead of going to real stores.

2

BACKGROUND

Supermarkets provide blind recommendation information to all customers.

3

BACKGROUND

Online merchants provide targeted recommendations to customers who have previously visited their websites, in order to better market their merchandise (e.g., books, movies, CDs, web pages, newsgroup messages).

4

BACKGROUND – RECOMMENDER SYSTEM

What is a Recommendation System?

A recommendation system is a program that uses computer algorithms to predict users’ purchase interests by profiling their shopping patterns.

Recommendation systems provide users with personalized suggestions for products or services.

5

BACKGROUND – RECOMMENDER SYSTEM CONT.

Many online stores provide recommendations (e.g., Amazon, eBay).

Recommendation systems have been shown to substantially increase sales at online stores.

6

From a business perspective, it is viewed as part of Customer Relationship Management (CRM).

BACKGROUND – AN EXAMPLE

7

RECOMMENDATION SYSTEMS

Two types of recommendation systems:

Content‐based filtering

Collaborative filtering

Content‐based filtering performs profiling by extracting feature values from content a user consumed in the past, and recommends new content with similar feature values.

8

CONTENT-BASED FILTERING

9

COLLABORATIVE FILTERING

Most recommendation systems rely on the Collaborative Filtering (CF) technique.

In CF‐based recommendation systems, shopping history is analyzed in order to establish connections between users and products.

A profile is created by evaluating contents used by a user in the past, and recommendations are made by evaluating users with similar profiles.

10

COLLABORATIVE FILTERING CONT.

Maintain a database of many users’ ratings of a variety of items.

For a given user, find other similar users whose ratings strongly correlate with the current user.

Recommend items rated highly by these similar users, but not rated by the current user.

Almost all existing commercial recommendation systems use this approach (e.g., Amazon).
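The three steps above (keep a rating database, find correlated users, recommend what they rated highly) can be sketched as follows. This is a minimal illustration rather than the method of any particular store: the function name `recommend_user_based`, the toy ratings, and the convention that 0 means "not rated" are all assumptions made for the example.

```python
import numpy as np

def recommend_user_based(ratings, user, n_neighbors=2, top_n=3):
    """Rank unrated items for `user` from the ratings of correlated users.

    `ratings` is a users x items array where 0 means "not rated"
    (an assumption of this sketch, not something the slides specify).
    """
    ratings = np.asarray(ratings, dtype=float)
    # Pearson correlation between every pair of user rows.
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(ratings)
    sims = np.nan_to_num(corr[user])  # constant rows yield NaN; treat as 0
    sims[user] = 0.0                  # a user is not their own neighbour
    # Keep the most similar users, but only those that correlate positively.
    neighbors = [v for v in np.argsort(-sims)[:n_neighbors] if sims[v] > 0]
    scores = np.zeros(ratings.shape[1])
    for v in neighbors:
        scores += sims[v] * ratings[v]   # similarity-weighted vote
    scores[ratings[user] > 0] = -np.inf  # skip items the user already rated
    ranked = np.argsort(-scores)[:top_n]
    return [int(i) for i in ranked if scores[i] > 0]

# Hypothetical ratings: 4 users x 5 items.
toy_ratings = [
    [5, 4, 0, 0, 1],
    [5, 5, 4, 0, 1],
    [1, 0, 0, 5, 4],
    [4, 5, 5, 0, 0],
]
print(recommend_user_based(toy_ratings, user=0))
```

Weighting each neighbour's ratings by its correlation is one common design choice; a production system would also need to handle rating scales and sparsity more carefully.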

11

COLLABORATIVE FILTERING CONT.

12

EXAMPLE OF TRANSACTIONS

Each user may purchase several items.

Each item could be purchased by a few users.

13

User   Items
u01    Bread, Milk
u02    Bread, Diaper, Beer, Eggs
u03    Milk, Diaper, Beer, Coke
u04    Bread, Milk, Diaper, Beer
u05    Bread, Milk, Diaper, Coke
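As a rough sketch, transactions like these can be turned into a binary user-item matrix, which is the input most of the models in this talk work on. The helper name `build_user_item_matrix` and the choice of 1 for "purchased" and 0 otherwise are assumptions for illustration.

```python
def build_user_item_matrix(transactions):
    """Return (users, items, matrix) with matrix[u][i] = 1 if user u bought item i."""
    users = sorted(transactions)
    items = sorted({item for bought in transactions.values() for item in bought})
    column = {item: j for j, item in enumerate(items)}
    matrix = [[0] * len(items) for _ in users]
    for row, user in enumerate(users):
        for item in transactions[user]:
            matrix[row][column[item]] = 1
    return users, items, matrix

# The five transactions from the slide.
transactions = {
    "u01": ["Bread", "Milk"],
    "u02": ["Bread", "Diaper", "Beer", "Eggs"],
    "u03": ["Milk", "Diaper", "Beer", "Coke"],
    "u04": ["Bread", "Milk", "Diaper", "Beer"],
    "u05": ["Bread", "Milk", "Diaper", "Coke"],
}
users, items, matrix = build_user_item_matrix(transactions)
for user, row in zip(users, matrix):
    print(user, row)
```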

THE USER-ITEM RATING MATRIX

14

Jenny    1  1  1  0  0
Emily    2  2  2  0  0
Diane    5  5  5  0  0
Jeremy   0  0  0  2  2
Stefan   0  0  0  1  1

Each row is a user vector; each column is an item vector.

RECOMMENDATION MODELS

Four basic models of recommendation systems are studied:

Item Popularity‐based Model (IP)

Item Similarity‐based Model (IS)

SVD‐based Latent Factor Model (SVD)

Bipartite Graph Model (BG)

15

WHICH ONE IS THE BEST?

There is no definitive answer as to which recommendation algorithm is the best.

Most published comparison results are based on a few special datasets.

These datasets are twisted

We need some comparison results based on real commercial datasets

16

ITEM POPULARITY-BASED MODEL (IP)

The most primitive model in recommendation systems (RS).

Recommend the most popular, most viewed, or best‐selling items to users.

It ignores the user’s preferences, but can be used as an auxiliary component in some recommendation systems.

There is a filtering step for IP to improve the prediction result.

17

FILTERING STEP IN IP

If a user prefers to view an item just once, then the algorithm should not recommend the items that have already been viewed by this user;

If a user prefers to view an item several times, the items that have been viewed by this user could be presented to him/her again.

This step can also be applied to other recommendation models.
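A minimal sketch of the item popularity-based model with the filtering step might look like this. The function name `recommend_popular`, the `filter_viewed` flag, and the reuse of the toy transactions from the earlier slide are assumptions; the slides do not give an implementation.

```python
from collections import Counter

def recommend_popular(views, user, top_n=10, filter_viewed=True):
    """Recommend the globally most viewed items.

    `views` maps each user to the list of items they viewed. With
    `filter_viewed=True`, items the user has already seen are dropped,
    which corresponds to the "view an item just once" case.
    """
    counts = Counter(item for items in views.values() for item in items)
    seen = set(views.get(user, [])) if filter_viewed else set()
    ranked = [item for item, _ in counts.most_common() if item not in seen]
    return ranked[:top_n]

views = {
    "u01": ["Bread", "Milk"],
    "u02": ["Bread", "Diaper", "Beer", "Eggs"],
    "u03": ["Milk", "Diaper", "Beer", "Coke"],
    "u04": ["Bread", "Milk", "Diaper", "Beer"],
    "u05": ["Bread", "Milk", "Diaper", "Coke"],
}
print(recommend_popular(views, "u01", top_n=3))
print(recommend_popular(views, "u01", top_n=3, filter_viewed=False))
```

Switching `filter_viewed` off models the second case, where items a user has already viewed may be shown again.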

18

ITEM SIMILARITY-BASED MODEL (IS)

In similarity‐based recommendation, the prediction is based on the similarity between items.

19

[Figure: links between User1, User2, User3 and Item1, Item2, Item3, Item4.]

Item4 can be recommended to User1, and Item1 can be recommended to User3.

SIMILARITY MEASURE

Central to most item‐oriented approaches is a similarity measure between items.

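The Pearson correlation coefficient is a measure of the strength of linear dependence between two variables.

It measures the tendency of users to rate items i and j similarly:

\[
\rho_{ij} = \frac{\operatorname{cov}(x_i, x_j)}{\sigma_{x_i}\,\sigma_{x_j}}
= \frac{\sum_{k=1}^{d} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}
{\sqrt{\sum_{k=1}^{d} (x_{ik} - \bar{x}_i)^2}\;\sqrt{\sum_{k=1}^{d} (x_{jk} - \bar{x}_j)^2}}
\]

where x_ik is the entry for item i and user k, and the sums run over the d users.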
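The Pearson similarity between items can be computed for all item pairs at once. The sketch below assumes a users x items rating array (so each item is a column) and uses the hypothetical name `pearson_item_similarity`; items with constant ratings get similarity 0 by convention here, which is a choice of this example.

```python
import numpy as np

def pearson_item_similarity(ratings):
    """Return the items x items matrix of Pearson correlations."""
    X = np.asarray(ratings, dtype=float).T          # one row per item
    centered = X - X.mean(axis=1, keepdims=True)    # x_ik minus the item mean
    cov = centered @ centered.T                     # numerators for all pairs
    norms = np.sqrt(np.diag(cov))                   # per-item root sum of squares
    denom = np.outer(norms, norms)
    sim = np.zeros_like(cov)
    # Divide only where the denominator is non-zero (non-constant items).
    np.divide(cov, denom, out=sim, where=denom > 0)
    return sim

# Toy 5 users x 5 items rating matrix.
R = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 1, 1],
])
S = pearson_item_similarity(R)
print(np.round(S, 2))
```

In this toy matrix, items 0 and 1 are rated identically by every user, so their similarity is 1, while items from different groups are negatively correlated.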

IMPROVED SIMILARITY MEASURE

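To improve the reliability of the similarity measure, a modification can be applied to the equation: each entry x_ik is replaced by a modified entry x'_ik.

[Equation garbled in extraction: x'_ik is defined from x_ik using a term in 1/log(nc_k).]

where nc_k denotes the number of items that user k has viewed.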

THE MODEL

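We also take item popularity into account. The predicted score has a similarity part and a popularity part.

[Equation garbled in extraction: the prediction r̂_ui combines a similarity term, summed over the items j in S(i;u), with a popularity term built from np_i and N.]

where S(i;u) is the set of items that were viewed by user u, np_i denotes the view count of item i, and N is the global maximum view count.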

SVD-BASED LATENT FACTOR MODEL (SVD)

Latent factor model focuses on reducing dimensionality of the user‐item rating matrix to discover some “latent factors”.

Original matrix = Factor matrix * … * Factor matrixOriginal matrix: sparse, not orderedFactor matrix: compact, ordered

23

SINGULAR VALUE DECOMPOSITION

A is an m‐by‐n matrix with rank(A) = r. It can be decomposed into three matrices:

\[
A_{m \times n} = U D_1 V^T
\]

U is an m‐by‐m orthogonal matrix and V is an n‐by‐n orthogonal matrix.

\[
D_1 = \begin{pmatrix}
D_{r \times r} & O_{r \times (n-r)} \\
O_{(m-r) \times r} & O_{(m-r) \times (n-r)}
\end{pmatrix}
\]

is the m‐by‐n singular value matrix, where D is a diagonal matrix with the singular values on its diagonal.

SINGULAR VALUE DECOMPOSITION CONT.

The singular values in D have the property σ1 ≥ σ2 ≥ … ≥ σr > 0.

\[
D = \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_r)
\]

SVD-BASED LATENT FACTOR MODEL CONT.

26

Singular Value Decomposition (SVD)

REDUCE THE DIMENSION

If we only focus on those r non‐trivial singular values, the effective dimensions of the SVD matrices U, D and V will be m×r, r×r and n×r .

To reduce the dimension of the data, we can retain the k largest singular values of D and discard the rest:

\[
A_{m \times n} \approx U_{m \times k}\, D_{k \times k}\, V_{n \times k}^T
\]

The truncated factors are expected to capture the underlying latent structure of the original data.
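The rank-k truncation can be tried out with NumPy's `numpy.linalg.svd`. The sketch below is illustrative only: `truncated_svd` is a hypothetical helper, and the 5 x 5 matrix is one reading of the small rating matrix used as the example in this talk.

```python
import numpy as np

def truncated_svd(A, k):
    """Return (U_k, D_k, Vt_k) keeping only the k largest singular values."""
    U, s, Vt = np.linalg.svd(np.asarray(A, dtype=float), full_matrices=False)
    # numpy returns the singular values in descending order.
    return U[:, :k], np.diag(s[:k]), Vt[:k, :]

A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 1, 1],
])
U_k, D_k, Vt_k = truncated_svd(A, k=2)
A_k = U_k @ D_k @ Vt_k          # rank-2 approximation of A

print(np.round(np.diag(D_k), 2))
print(np.round(A_k, 2))
```

Because this particular matrix has rank 2, keeping k = 2 singular values reproduces it up to rounding; on real, noisy data the truncation discards the weaker factors.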

AN EXAMPLE

28

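\[
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
2 & 2 & 2 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 0 & 0 & 2 & 2 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
\approx
\begin{pmatrix}
0.18 & 0 \\
0.36 & 0 \\
0.90 & 0 \\
0 & 0.89 \\
0 & 0.44
\end{pmatrix}
\begin{pmatrix}
9.49 & 0 \\
0 & 3.16
\end{pmatrix}
\begin{pmatrix}
0.58 & 0.58 & 0.58 & 0 & 0 \\
0 & 0 & 0 & 0.71 & 0.71
\end{pmatrix}
\]

Rows of the original matrix: Jenny, Emily, Diane, Jeremy, Stefan. The first three columns are daily use items; the last two columns are digital items.

First factor: user – category similarity matrix (columns: housewife, digital fans).

Second factor: strength of each category (3.16 is the strength of digital fans).

Third factor: item – category similarity matrix (rows: housewife, digital fans).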

SVD-BASED RECOMMENDATION

The SVD‐based model factorizes the user‐item matrix into two lower‐rank matrices, i.e., a “user‐factor” matrix P of size m×f and an “item‐factor” matrix Q of size n×f.

Each user u and item i can be represented by f‐dimensional factor vectors p_u and q_i.

The prediction of a rating from user u on item i can be computed by

\[
\hat{r}_{ui} = p_u^T q_i
\]

SVD-BASED RECOMMENDATION CONT.

Decompose the user‐item rating matrix R into three matrices:

\[
R = U D Q^T
\]

where U and Q are orthonormal matrices and D is a diagonal matrix.

It can be inferred that

\[
P = U D = R Q, \qquad R = P Q^T
\]

SVD-BASED RECOMMENDATION CONT.

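If we use r_u to denote the u‐th row of the rating matrix R, then, since P = UD = RQ and R = PQ^T, the user factor vector can be obtained by taking the product of r_u and Q, i.e.,

\[
p_u = r_u Q
\]

and

\[
\hat{r}_{ui} = r_u Q q_i^T
\]

where p_u and q_i are treated as row vectors (q_i is the i‐th row of Q).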
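The two formulas above translate almost directly into NumPy. This is a sketch under stated assumptions: the item-factor matrix Q is taken to be the first f right singular vectors of R, and the helper names `fit_item_factors` and `predict_rating`, the toy matrix, and the "new user" row are all made up for illustration.

```python
import numpy as np

def fit_item_factors(R, f):
    """Return Q (n x f): the first f right singular vectors of R as columns."""
    _, _, Vt = np.linalg.svd(np.asarray(R, dtype=float), full_matrices=False)
    return Vt[:f, :].T

def predict_rating(r_u, Q, i):
    """Predict the rating of item i for a user with rating row r_u."""
    p_u = np.asarray(r_u, dtype=float) @ Q   # p_u = r_u Q
    return float(p_u @ Q[i])                 # r_hat_ui = p_u q_i^T

R = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 1, 1],
])
Q = fit_item_factors(R, f=2)

# A hypothetical new user who has only rated item 0.
new_user = [3, 0, 0, 0, 0]
print(round(predict_rating(new_user, Q, 1), 2))  # score for an item in the same group
print(round(predict_rating(new_user, Q, 3), 2))  # score for an item in the other group
```

Note that a new user can be scored without refitting the model, because p_u only requires the user's rating row and Q.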

BIPARTITE GRAPH MODEL (BG)

In BG, users and items are modeled as vertices of a graph.

32

BIPARTITE GRAPH MODEL (BG) CONT.

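In the Bipartite Graph model, all item nodes form a finite Markov chain with transition matrix P.

\[
p_{ij} = P(t_i \mid t_j) = \sum_{k=1}^{m} \bigl( P(t_i \mid u_k)\, P(u_k \mid t_j) \bigr)
\]

\[
P(t_i \mid u_k) = \frac{r_{ki}}{\sum_{j=1}^{n} r_{kj}},
\qquad
P(u_k \mid t_j) = \frac{r_{kj}}{\sum_{l=1}^{m} r_{lj}}
\]

Here p_ij is the probability that a chain ends in t_i given the initial node t_j.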

PREDICTION BASED ON TRANSITION MATRIX

Given the previous click history of user u, the rating of item i can be predicted by

\[
\hat{r}_{ui} = \sum_{j=1}^{n} \bigl( p_{ij}\, T_u(t_j) \bigr),
\qquad
T_u(t_j) = \frac{r_{uj}}{\sum_{l=1}^{n} r_{ul}}
\]

where T_u is the initial state vector for user u in a Markov chain and T_u(t_j) is the component corresponding to item j.
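A compact sketch of the bipartite graph model follows: it builds the item-to-item transition matrix from a users x items rating (or click count) array and then takes one step of the Markov chain from the user's own click distribution. The function names `transition_matrix` and `predict_scores` and the toy data are assumptions of this example.

```python
import numpy as np

def transition_matrix(R):
    """Return P with P[i, j] = sum_k P(t_i | u_k) * P(u_k | t_j)."""
    R = np.asarray(R, dtype=float)
    row_sums = R.sum(axis=1, keepdims=True)   # total per user
    col_sums = R.sum(axis=0, keepdims=True)   # total per item
    # item_given_user[k, i] = P(t_i | u_k); zero rows stay zero.
    item_given_user = np.divide(R, row_sums, out=np.zeros_like(R), where=row_sums > 0)
    # user_given_item[k, j] = P(u_k | t_j); zero columns stay zero.
    user_given_item = np.divide(R, col_sums, out=np.zeros_like(R), where=col_sums > 0)
    return item_given_user.T @ user_given_item

def predict_scores(R, P, u):
    """Return the predicted scores of all items for user u."""
    r_u = np.asarray(R, dtype=float)[u]
    total = r_u.sum()
    T_u = r_u / total if total > 0 else np.zeros_like(r_u)  # initial state vector
    return P @ T_u

R = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 1, 1],
])
P = transition_matrix(R)
scores = predict_scores(R, P, u=0)
print(np.round(P, 2))
print(np.round(scores, 2))
```

Each column of P sums to 1 whenever the corresponding item has at least one rating and every user has at least one rating, which is what makes it a valid transition matrix.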

EXPERIMENTAL STUDY

The datasets in the experiment are click histories from several online shopping websites.

35

pid:13505646 - siteId:9093 - uid:08097540 - date:2010-08-08
pid:16062417 - siteId:9102 - uid:95429188 - date:2010-08-08
pid:12546546 - siteId:7167 - uid:71516943 - date:2010-08-08
pid:691224 - siteId:4266 - uid:07079557 - date:2010-08-08
pid:4577421 - siteId:4266 - uid:07079557 - date:2010-08-08
…
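Records in this format can be grouped into per-user click histories with a small parser. The sketch below assumes that `pid` identifies the clicked product, `siteId` the shop, and `uid` the user, which the slide does not state explicitly; the names `RECORD` and `parse_clicks` are hypothetical.

```python
import re
from collections import defaultdict

# One click record: pid, siteId, uid and an ISO date, as shown on the slide.
RECORD = re.compile(
    r"pid:(\d+) - siteId:(\d+) - uid:(\d+) - date:(\d{4}-\d{2}-\d{2})"
)

def parse_clicks(text):
    """Group clicked product ids by (siteId, uid), in order of appearance."""
    history = defaultdict(list)
    for pid, site_id, uid, _date in RECORD.findall(text):
        # Ids are kept as strings so that leading zeros in uid survive.
        history[(site_id, uid)].append(pid)
    return dict(history)

sample = (
    "pid:13505646 - siteId:9093 - uid:08097540 - date:2010-08-08\n"
    "pid:691224 - siteId:4266 - uid:07079557 - date:2010-08-08\n"
    "pid:4577421 - siteId:4266 - uid:07079557 - date:2010-08-08\n"
)
clicks = parse_clicks(sample)
for key, pids in clicks.items():
    print(key, pids)
```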

STATISTICS ON DATASETS

36

EVALUATION STRATEGY

The dataset is divided into three sets: a training set, a test set, and a last transaction set.

Our goal is to train the model on the training set and apply it to the test set to predict the last transaction of each test user.

37

user0: item0, item1, item2, item3

EVALUATION STRATEGY

38

[Figure: the original data set (Xaction 1 to Xaction 300000) is split into a training set (Xaction 1 to Xaction 250000) and a test set (Xaction 250001 to Xaction 300000).]

EVALUATION STRATEGY CONT.

The quality of the results is measured by the hit rate:

\[
\text{hitRate} = \frac{\#\,\text{correctly predicted test users}}{\#\,\text{test users}}
\]

Top-10 recommendation is used to verify the models.
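The hit rate is straightforward to compute once each test user has a list of recommended items and a held-out last item. In this sketch a test user counts as correctly predicted when the held-out item appears in that user's recommendation list; the function name `hit_rate` and the toy data are assumptions.

```python
def hit_rate(recommendations, last_items):
    """Fraction of test users whose held-out last item was recommended.

    `recommendations` maps user -> list of recommended items (e.g. a top-10 list);
    `last_items` maps user -> the item of that user's last transaction.
    """
    if not last_items:
        return 0.0
    hits = sum(
        1 for user, item in last_items.items()
        if item in recommendations.get(user, [])
    )
    return hits / len(last_items)

recs = {"u1": ["a", "b", "c"], "u2": ["c", "d"], "u3": ["e"]}
held_out = {"u1": "b", "u2": "x", "u3": "e", "u4": "a"}
print(hit_rate(recs, held_out))
```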

PARAMETER STUDY

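γ in the item similarity-based model

40

[Equation repeated from the item similarity-based model: the prediction r̂_ui with its similarity and popularity terms; garbled in extraction.]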

TOP-10 PREDICTION ON DATASETS

41

20,471 users, 499 items

60 factors for SVD

TOP-10 PREDICTION ON DATASETS CONT.

42

148,409 users, 1,004 items

70 factors for SVD


43


100 factors for SVD


44


94 factors for SVD

SUMMARY

Item Popularity‐based model is not suitable for most datasets but could be used as an auxiliary component.

If the dataset has few items but lots of users, SVD‐based model is a good choice.

The Item Similarity‐based model and the Bipartite Graph model are built on a similar idea, so they perform very similarly. They are suitable for datasets with a “normal” number of items and users.

The filtered Bipartite Graph model is a “won’t‐be‐wrong” method in most cases.

FUTURE CONSIDERATIONS – INCREMENTAL DATA

How to handle incremental data?

The amount of shopping data increases every minute, every day

How to handle the new data?

Do we re‐compute the entire data every day, or do we just compute the new data and add it to the existing data?

How do the strategies affect the accuracy of the recommendations?

46

FUTURE CONSIDERATION – IMPUTATION STRATEGIES

When a person did not buy an item, it does not necessarily mean that the person dislikes the item.

The item usually receives a rating of 0, which is not the actual rating.

The sparse matrix must be imputed by filling in the missing values.

Different imputation strategies will lead to different results, in terms of accuracy and computational efficiency

47

FUTURE CONSIDERATION – DATA PRIVACY

Small merchants do not have the manpower to maintain a recommendation system

They usually buy the services from a third party

They have to provide their customer data to the third party for analysis

This may leak private customer data or trade secrets.

How to pre‐process customer data so that data privacy is preserved?

48

Professor Jun Zhang

Department of Computer Science
University of Kentucky
Lexington, KY 40506-0633, USA

E-mail: [email protected]
Web: www.cs.uky.edu/~jzhang
Tel: 13540021323

TOP-N PREDICTIONS

50


60 factors for SVD

TOP-N PREDICTIONS CONT.

51


70 factors for SVD


52


94 factors for SVD


53


100 factors for SVD

CONCLUSION AND FUTURE WORK

We presented the concept and background of recommender systems in e‐Business.

There are two classes of recommendation techniques: Content‐based filtering and Collaborative filtering. The latter is extensively used in popular online shopping websites.

We illustrated four basic collaborative filtering algorithms and conducted an empirical study of them on four datasets from a retargeting company.

54

CONCLUSION AND FUTURE WORK

The filtering step of the Item Popularity‐based model has a good effect on the Item Similarity and Bipartite Graph models but has no effect at all on the SVD model.

We discovered a strategy to choose models in terms of the features of the datasets.

The models could be combined to provide better prediction accuracy.

Privacy issues in recommender systems. (Future)

55

International Conference on Business Computing and Global Informatization

BCGIn 2013

September 13-15, 2013Changsha, China

www.bcgin.org
