Computer Based Recommendation Systems in Online Business
Jun Zhang and Xiwei Wang
Department of Computer Science
University of Kentucky
Lexington, KY 40506-0633
USA
CONTENTS
Introduction to recommendation systems
Popular algorithms used in recommendation systems
Comparison of recommendation system algorithms using real online market data
Concluding remarks and future work
1
BACKGROUND
The e-business boom is driving profound changes in social behavior. More and more people choose to shop online instead of going to physical stores.
2
BACKGROUND
Supermarkets provide blind recommendation information to all customers.
3
BACKGROUND
Online merchants provide targeted recommendations to customers who have previously visited their website, in order to better market their merchandise (e.g., books, movies, CDs, web pages, newsgroup messages).
4
BACKGROUND – RECOMMENDER SYSTEM
What is a Recommendation System?
A recommendation system is a program that uses computer algorithms to predict users' purchase interests by profiling their shopping patterns.
Recommendation systems provide users with personalized suggestions for products or services.
5
BACKGROUND – RECOMMENDER SYSTEM CONT.
Many online stores provide recommendations (e.g., Amazon, eBay).
Recommendation systems have been shown to substantially increase sales at online stores.
6
From a business perspective, it is viewed as part of Customer Relationship Management (CRM).
BACKGROUND – AN EXAMPLE
7
RECOMMENDATION SYSTEMS
Two types of recommendation systems:
Content-based filtering
Collaborative filtering
Content-based filtering performs profiling by extracting feature values from content used in the past and recommends new content with similar feature values.
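The content-based idea above can be sketched as follows. This is only an illustration under stated assumptions: the item names, the three-component feature vectors, the averaging of features into a profile, and the use of cosine similarity are all hypothetical choices, not something the slides specify.

```python
import math

# Hypothetical item feature vectors: [fiction, science, cooking]
ITEM_FEATURES = {
    "book_a": [1.0, 0.0, 0.0],
    "book_b": [0.9, 0.1, 0.0],
    "book_c": [0.0, 1.0, 0.0],
    "book_d": [0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_profile(used_items):
    """Profile = average feature vector of the content the user consumed in the past."""
    vectors = [ITEM_FEATURES[item] for item in used_items]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend_by_content(used_items, top_n=2):
    """Rank unseen items by similarity between their features and the user profile."""
    profile = build_profile(used_items)
    candidates = [item for item in ITEM_FEATURES if item not in used_items]
    candidates.sort(key=lambda item: cosine(profile, ITEM_FEATURES[item]), reverse=True)
    return candidates[:top_n]


print(recommend_by_content(["book_a"]))
```

A user who has only read book_a is profiled as a fiction reader, so the item with the closest feature vector is ranked first.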
8
CONTENT-BASED FILTERING
9
COLLABORATIVE FILTERING
Most recommendation systems rely on the Collaborative Filtering (CF) technique.
In CF‐based recommendation systems, shopping history is analyzed in order to establish connections between users and products.
A profile is created by evaluating contents used by a user in the past, and recommendations are made by evaluating users with similar profiles.
10
COLLABORATIVE FILTERING CONT.
Maintain a database of many users’ ratings of a variety of items.
For a given user, find other similar users whose ratings strongly correlate with those of the current user.
Recommend items rated highly by these similar users, but not rated by the current user.
Almost all existing commercial recommendation systems use this approach (e.g., Amazon).
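The three steps above can be sketched in a few lines. The ratings, user names, and the thresholds `min_sim` and `min_rating` are hypothetical, and Pearson correlation over co-rated items is assumed as the meaning of "strongly correlate"; real systems differ in many details.

```python
import math

# Hypothetical ratings database: user -> {item: rating}
RATINGS = {
    "alice": {"i1": 5, "i2": 4, "i3": 1},
    "bob": {"i1": 4, "i2": 5, "i3": 2, "i4": 5},
    "carol": {"i1": 1, "i2": 2, "i3": 5, "i5": 4},
}


def pearson(a, b):
    """Pearson correlation of two users over the items both have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[item] for item in common]
    ys = [b[item] for item in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)) * math.sqrt(
        sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den if den else 0.0


def recommend(user, min_sim=0.5, min_rating=4):
    """Recommend items rated highly by similar users but not rated by `user`."""
    mine = RATINGS[user]
    scores = {}
    for other, theirs in RATINGS.items():
        if other == user:
            continue
        sim = pearson(mine, theirs)
        if sim < min_sim:
            continue  # not a similar user
        for item, rating in theirs.items():
            if item not in mine and rating >= min_rating:
                scores[item] = max(scores.get(item, 0.0), sim * rating)
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice"))
```

Here bob's ratings correlate positively with alice's while carol's correlate negatively, so only bob's highly rated, unseen item is suggested to alice.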
11
COLLABORATIVE FILTERING CONT.
12
EXAMPLE OF TRANSACTIONS
Each user may purchase several items.
Each item could be purchased by a few users.
13
User   Items
u01    Bread, Milk
u02    Bread, Diaper, Beer, Eggs
u03    Milk, Diaper, Beer, Coke
u04    Bread, Milk, Diaper, Beer
u05    Bread, Milk, Diaper, Coke
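As a small sketch, the five transactions listed above can be turned into a binary user-item matrix (1 = purchased, 0 = not purchased). The binary encoding and the alphabetical ordering of users and items are assumptions made for illustration.

```python
# Transactions from the example: user -> purchased items
TRANSACTIONS = {
    "u01": ["Bread", "Milk"],
    "u02": ["Bread", "Diaper", "Beer", "Eggs"],
    "u03": ["Milk", "Diaper", "Beer", "Coke"],
    "u04": ["Bread", "Milk", "Diaper", "Beer"],
    "u05": ["Bread", "Milk", "Diaper", "Coke"],
}


def to_matrix(transactions):
    """Return (users, items, matrix) with matrix[u][i] = 1 if user u bought item i."""
    users = sorted(transactions)
    items = sorted({item for basket in transactions.values() for item in basket})
    matrix = [
        [1 if item in transactions[user] else 0 for item in items] for user in users
    ]
    return users, items, matrix


users, items, matrix = to_matrix(TRANSACTIONS)
print(items)
for user, row in zip(users, matrix):
    print(user, row)
```

Each row of the result is one user's purchase vector and each column is one item's vector, which is the layout the recommendation models operate on.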
THE USER-ITEM RATING MATRIX
14
Jenny    1 1 1 0 0
Emily    2 2 2 0 0
Diane    5 5 5 0 0
Jeremy   0 0 0 2 2
Stefan   0 0 0 1 1
Each row is a user vector; each column is an item vector.
RECOMMENDATION MODELS
Four basic models of recommendation systems are studied:
Item Popularity-based Model (IP)
Item Similarity‐based Model (IS)
SVD‐based Latent Factor Model (SVD)
Bipartite Graph Model (BG)
15
WHICH ONE IS THE BEST?
There has been no definite answer as to which recommendation algorithm is the best.
Most published comparison results are based on a few special datasets.
These datasets may be distorted and may not represent real commercial data.
We need comparison results based on real commercial datasets.
16
ITEM POPULARITY-BASED MODEL (IP)
The most primitive model in RS.
Recommend most popular, most viewed, or best selling items to users.
It overlooks user’s preferences, but can be used as an auxiliary component in some recommendation systems.
There is a filtering step for IP to improve the prediction result.
17
FILTERING STEP IN IP
If a user prefers to view an item just once, then the algorithm should not recommend the items that have already been viewed by this user;
If a user prefers to view an item several times, the items that have been viewed by this user could be presented to him/her again.
This step can also be applied to other recommendation models.
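A minimal sketch of the popularity model with the filtering step described above. The view histories and the `allow_repeats` flag (whether a user's already-viewed items may be shown again) are hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical view histories: user -> items viewed
HISTORIES = {
    "u1": ["a", "b"],
    "u2": ["a", "c"],
    "u3": ["a", "b", "d"],
}


def popular_items(histories, user, top_n=3, allow_repeats=False):
    """Recommend the globally most viewed items, optionally filtering seen ones."""
    counts = Counter(item for views in histories.values() for item in views)
    ranked = [item for item, _ in counts.most_common()]
    if not allow_repeats:
        # Filtering step: drop items this user has already viewed.
        seen = set(histories.get(user, []))
        ranked = [item for item in ranked if item not in seen]
    return ranked[:top_n]


print(popular_items(HISTORIES, "u1", top_n=2))
print(popular_items(HISTORIES, "u1", top_n=2, allow_repeats=True))
```

Without the filter every user receives the same list; with it, the list is at least personalized by what the user has already seen.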
18
ITEM SIMILARITY-BASED MODEL (IS)
In similarity‐based recommendation, the prediction is based on the similarity between items.
19
[Figure: diagram connecting User1, User2 and User3 to Item1, Item2, Item3 and Item4]
Item4 can be recommended to User1, and Item1 can be recommended to User3.
SIMILARITY MEASURE
Central to most item‐oriented approaches is a similarity measure between items.
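The Pearson correlation coefficient is a measure of the strength of linear dependence between two variables:
$$\rho_{ij} = \frac{\operatorname{cov}(i, j)}{\sigma_i \sigma_j} = \frac{\sum_{k=1}^{d} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{d} (x_{ik} - \bar{x}_i)^2}\,\sqrt{\sum_{k=1}^{d} (x_{jk} - \bar{x}_j)^2}}$$
Here $x_{ik}$ is the rating of item i by user k, $\bar{x}_i$ is the mean rating of item i, and the sums run over the d users.
It measures the tendency of users to rate items i and j similarly.
20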
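A short sketch of the item-to-item Pearson correlation, assuming the ratings are stored as a users-by-items list of lists. The matrix values and the function name `pearson_items` are illustrative, and returning 0.0 for a constant column is a convention chosen here, not one stated in the slides.

```python
import math

# Illustrative rating matrix: rows are users, columns are items
R = [
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 1, 1],
]


def pearson_items(ratings, i, j):
    """Pearson correlation between the rating columns of items i and j."""
    xi = [row[i] for row in ratings]
    xj = [row[j] for row in ratings]
    d = len(ratings)
    mean_i = sum(xi) / d
    mean_j = sum(xj) / d
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(xi, xj))
    sd_i = math.sqrt(sum((a - mean_i) ** 2 for a in xi))
    sd_j = math.sqrt(sum((b - mean_j) ** 2 for b in xj))
    if sd_i == 0 or sd_j == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sd_i * sd_j)


print(pearson_items(R, 0, 1))  # items rated identically by every user
print(pearson_items(R, 0, 3))  # items rated by disjoint groups of users
```

Items that every user rates the same way correlate perfectly, while items rated by disjoint user groups correlate negatively.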
IMPROVED SIMILARITY MEASURE
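To improve the reliability of the similarity measure, a modification can be applied to the equation:
[Equation garbled in the source: the modified rating x'_ik is obtained from x_ik using a factor that involves log nc_k.]
where nc_k denotes the number of items that user k has viewed.
21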
THE MODEL
We also take into account the item popularity.
[Equation garbled in the source: the predicted rating r̂_ui combines a similarity term over the items j in S(i;u) with a popularity term based on np_i / N.]
where S(i;u) is the set of items that were viewed by user u, np_i denotes the view count of item i, and N is the global maximum view count.
22
SVD-BASED LATENT FACTOR MODEL (SVD)
The latent factor model focuses on reducing the dimensionality of the user-item rating matrix to discover some "latent factors".
Original matrix = Factor matrix * … * Factor matrix
Original matrix: sparse, not ordered
Factor matrix: compact, ordered
23
SINGULAR VALUE DECOMPOSITION
A is an m-by-n matrix with rank(A) = r. It can be decomposed into three matrices:
$$A_{m \times n} = U D_1 V^T$$
U is an m-by-m orthogonal matrix and V is an n-by-n orthogonal matrix.
$$D_1 = \begin{pmatrix} D_{r \times r} & O_{r \times (n-r)} \\ O_{(m-r) \times r} & O_{(m-r) \times (n-r)} \end{pmatrix}$$
is the singular value matrix, where D is a diagonal matrix with the singular values on its diagonal.
24
SINGULAR VALUE DECOMPOSITION CONT.
The singular values in D have the property σ1 ≥ σ2 ≥ … ≥ σr > 0:
$$D = \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_r)$$
25
SVD-BASED LATENT FACTOR MODEL CONT.
26
Singular Value Decomposition (SVD)
REDUCE THE DIMENSION
If we only focus on the r non-trivial singular values, the effective dimensions of the SVD matrices U, D and V will be m×r, r×r and n×r.
To reduce the dimension of the data, we can retain the k largest singular values of D and discard the rest:
$$A_{m \times n} \approx U_{m \times k} D_{k \times k} V_{n \times k}^T$$
We expect this to capture the underlying latent structure of the original data.
27
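AN EXAMPLE
28
Rating matrix (rows: Jenny, Emily, Diane, Jeremy, Stefan) and its decomposition A = U D V^T:
$$\begin{pmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 0.18 & 0 \\ 0.36 & 0 \\ 0.90 & 0 \\ 0 & 0.89 \\ 0 & 0.44 \end{pmatrix} \begin{pmatrix} 9.49 & 0 \\ 0 & 3.16 \end{pmatrix} \begin{pmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \\ 0 & 0 & 0 & 0.71 & 0.71 \end{pmatrix}$$
Item categories (columns of the rating matrix): daily use, digital
U: user – category similarity matrix (columns: housewife, digital fans)
D: category strengths (columns: housewife, digital fans); 3.16 is the strength of digital fans
V^T: item – category similarity matrix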
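As a quick check of the example, the sketch below multiplies the three factor matrices back together, using the two-decimal values shown on the slide, and also shows what is left when only the largest singular value is kept (k = 1). The function name `reconstruct` and the plain-list representation are illustrative choices.

```python
# Factor values from the example, rounded to two decimals as on the slide
U = [[0.18, 0.0], [0.36, 0.0], [0.90, 0.0], [0.0, 0.89], [0.0, 0.44]]
D = [9.49, 3.16]  # singular values on the diagonal of D
VT = [[0.58, 0.58, 0.58, 0.0, 0.0], [0.0, 0.0, 0.0, 0.71, 0.71]]


def reconstruct(u, d, vt, k):
    """Compute U_k D_k V_k^T using only the first k singular values."""
    rows, cols = len(u), len(vt[0])
    return [
        [sum(u[r][f] * d[f] * vt[f][c] for f in range(k)) for c in range(cols)]
        for r in range(rows)
    ]


full = reconstruct(U, D, VT, k=2)
rank1 = reconstruct(U, D, VT, k=1)

for row in full:
    print([round(x) for x in row])
print()
for row in rank1:
    print([round(x) for x in row])
```

With both factors the product rounds back to the original ratings; with k = 1 only the block belonging to the first (larger) factor survives and the rows of Jeremy and Stefan become zero.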
SVD-BASED RECOMMENDATION
The SVD-based model factorizes the user-item matrix into two lower-rank matrices, i.e., a "user-factor" matrix $P_{m \times f}$ and an "item-factor" matrix $Q_{n \times f}$.
Each user u and item i can be represented as f-dimensional factor vectors $p_u$ and $q_i$.
The prediction of a rating from user u on item i can be computed by
$$\hat{r}_{ui} = p_u^T q_i$$
29
SVD-BASED RECOMMENDATION CONT.
Decompose the user-item rating matrix R into three submatrices:
$$R = U D Q^T$$
where U and Q are orthonormal matrices and D is a diagonal matrix.
It can be inferred that
$$P = U D = R Q, \qquad R = P Q^T$$
30
SVD-BASED RECOMMENDATION CONT.
Since $R = P Q^T$ and $P = U D = R Q$, if we use $r_u$ to denote the u-th row of the rating matrix R, then the user factor vector can be obtained by taking the product of $r_u$ and Q, i.e.,
$$p_u = r_u Q$$
and
$$\hat{r}_{ui} = r_u Q q_i^T$$
31
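The projection p_u = r_u Q and the prediction r̂_ui = r_u Q q_i^T can be sketched with plain lists. The item-factor matrix Q below is an assumed two-factor example (three items loading on one factor, two on the other), and the user row r_u is hypothetical.

```python
import math

# Assumed item-factor matrix Q (n items x f factors) with orthonormal columns
Q = [[1 / math.sqrt(3), 0.0] for _ in range(3)] + [
    [0.0, 1 / math.sqrt(2)] for _ in range(2)
]


def user_factors(r_u, q):
    """p_u = r_u Q: project a user's rating row onto the latent factors."""
    n_factors = len(q[0])
    return [sum(r_u[i] * q[i][f] for i in range(len(q))) for f in range(n_factors)]


def predict(r_u, q, i):
    """r_hat_ui = p_u . q_i = r_u Q q_i^T."""
    p_u = user_factors(r_u, q)
    return sum(p * q_if for p, q_if in zip(p_u, q[i]))


r_u = [3, 0, 0, 0, 0]  # hypothetical user who rated only the first item
scores = [predict(r_u, Q, i) for i in range(len(Q))]
print(scores)
```

The rating on the first item is spread over all three items that share its factor, so the two unrated items of that group receive a positive score while the items of the other group stay at zero.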
BIPARTITE GRAPH MODEL (BG)
In BG, users and items are modeled as vertices of a graph.
32
BIPARTITE GRAPH MODEL (BG) CONT.
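In the Bipartite Graph model, all item nodes form a finite Markov chain with transition matrix P.
$$p_{ij} = P(t_i \mid t_j) = \sum_{k=1}^{m} P(t_i \mid u_k)\, P(u_k \mid t_j)$$
$$P(t_i \mid u_k) = \frac{r_{ki}}{\sum_{l=1}^{n} r_{kl}}, \qquad P(u_k \mid t_j) = \frac{r_{kj}}{\sum_{l=1}^{m} r_{lj}}$$
Here $p_{ij}$ is the probability that a chain with initial node $t_j$ ends in $t_i$.
33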
PREDICTION BASED ON TRANSITION MATRIX
Given the previous click history of user u, the rating of item i can be predicted by
$$\hat{r}_{ui} = \sum_{j=1}^{n} p_{ij}\, T_u(t_j)$$
where $T_u$ is the initial state vector for user u in the Markov chain and $T_u(t_j)$ is the component corresponding to item j:
$$T_u(t_j) = \frac{r_{uj}}{\sum_{l=1}^{n} r_{ul}}$$
34
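A compact sketch of the bipartite graph model: it builds the item-to-item transition matrix from a users-by-items matrix and then scores items for one user from that user's normalized click vector. The click matrix is hypothetical, and skipping users or items with no clicks is an assumption made to avoid division by zero.

```python
# Hypothetical click matrix: rows are users, columns are items
R = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]


def transition_matrix(r):
    """p[i][j] = sum_k P(t_i | u_k) * P(u_k | t_j)."""
    m, n = len(r), len(r[0])
    row_sum = [sum(r[k]) for k in range(m)]
    col_sum = [sum(r[k][j] for k in range(m)) for j in range(n)]
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if col_sum[j] == 0:
                continue  # item j was never clicked
            p[i][j] = sum(
                (r[k][i] / row_sum[k]) * (r[k][j] / col_sum[j])
                for k in range(m)
                if row_sum[k] > 0
            )
    return p


def predict(r, p, u):
    """Score every item for user u from the normalized click vector T_u."""
    n = len(p)
    total = sum(r[u])
    if total == 0:
        return [0.0] * n
    t_u = [r[u][j] / total for j in range(n)]
    return [sum(p[i][j] * t_u[j] for j in range(n)) for i in range(n)]


P = transition_matrix(R)
scores = predict(R, P, 0)
print(scores)
```

The first user never clicked the third item, yet it receives a positive score because another user links it to an item the first user did click; the fourth item, which is two hops further away, scores zero after one transition.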
EXPERIMENTAL STUDY
The datasets in the experiment are click histories from several online shopping websites.
35
pid:13505646 - siteId:9093 - uid:08097540 - date:2010-08-08
pid:16062417 - siteId:9102 - uid:95429188 - date:2010-08-08
pid:12546546 - siteId:7167 - uid:71516943 - date:2010-08-08
pid:691224 - siteId:4266 - uid:07079557 - date:2010-08-08
pid:4577421 - siteId:4266 - uid:07079557 - date:2010-08-08
…
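A small sketch of how records in this format could be parsed into per-user click histories. It assumes that fields are separated by " - ", that keys and values are separated by the first colon, and that pid and uid identify the product and the user; the function names are hypothetical.

```python
from collections import defaultdict

LOG = """\
pid:13505646 - siteId:9093 - uid:08097540 - date:2010-08-08
pid:16062417 - siteId:9102 - uid:95429188 - date:2010-08-08
pid:12546546 - siteId:7167 - uid:71516943 - date:2010-08-08
pid:691224 - siteId:4266 - uid:07079557 - date:2010-08-08
pid:4577421 - siteId:4266 - uid:07079557 - date:2010-08-08
"""


def parse_line(line):
    """Split one record into a {field: value} dictionary."""
    return dict(field.split(":", 1) for field in line.split(" - "))


def click_history(log):
    """Group clicked product ids by (site, user), preserving log order."""
    history = defaultdict(list)
    for line in log.splitlines():
        line = line.strip()
        if not line:
            continue
        record = parse_line(line)
        history[(record["siteId"], record["uid"])].append(record["pid"])
    return dict(history)


for key, pids in click_history(LOG).items():
    print(key, pids)
```

Grouping by site as well as user keeps each website's data separate, which matches the idea of one dataset per shopping site.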
STATISTICS ON DATASETS
36
EVALUATION STRATEGY
The dataset has been divided into three sets, namely the training set, the test set and the last transaction set.
Our goal is to use the training set to train the model and apply it to the test set to predict the last transaction of the test users.
37
user0: item0, item1, item2, item3
EVALUATION STRATEGY
38
Xaction 1, Xaction 2, ..., Xaction 250000
Xaction 250001, ..., Xaction 300000
Xaction 1, Xaction 2, ..., Xaction 250000, Xaction 250001
Xaction 250001, ..., Xaction 300000
Original Data Set
Training Set
Test Set
EVALUATION STRATEGY CONT.
The quality of the results is measured by the hit rate:
$$\text{hitRate} = \frac{\#\,\text{correctly predicted test users}}{\#\,\text{test users}}$$
Use Top-10 recommendation to verify the models.
39
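The hit rate can be sketched as below: for each test user, the last item in the history is held out, the model recommends a top-N list from the remaining items, and a hit is counted when the held-out item appears in that list. The `recommend` callable, the toy histories, and the assumption that every test history has at least one item are hypothetical.

```python
def hit_rate(recommend, test_histories, top_n=10):
    """Fraction of test users whose held-out last item is in the top-N list."""
    if not test_histories:
        return 0.0
    hits = 0
    for items in test_histories.values():
        *observed, last = items  # hold out the last transaction
        if last in recommend(observed)[:top_n]:
            hits += 1
    return hits / len(test_histories)


# Toy usage: a stand-in "model" that always recommends the same two items
def fixed_recommender(observed):
    return ["item3", "item4"]


TEST_HISTORIES = {
    "user0": ["item0", "item1", "item2", "item3"],
    "user1": ["item1", "item9"],
}

print(hit_rate(fixed_recommender, TEST_HISTORIES, top_n=10))
```

Any of the four models can be plugged in as `recommend`, which makes the comparison between models a matter of swapping one function.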
PARAMETER STUDY
γ in the item similarity-based model (the model equation from THE MODEL slide is repeated here)
40
TOP-10 PREDICTION ON DATASETS
41
20,471 users, 499 items
60 factors for SVD
TOP-10 PREDICTION ON DATASETS CONT.
42
148,409 users, 1,004 items
70 factors for SVD
TOP-10 PREDICTION ON DATASETS CONT.
43
70,049 users, 2,303 items
100 factors for SVD
TOP-10 PREDICTION ON DATASETS CONT.
44
112,738 users, 94 items
94 factors for SVD
SUMMARY
Item Popularity‐based model is not suitable for most datasets but could be used as an auxiliary component.
If the dataset has few items but lots of users, SVD‐based model is a good choice.
The Item Similarity-based model and the Bipartite Graph model are based on a similar idea, so they perform very similarly. They are suitable for datasets with a "normal" number of items and users.
The filtered Bipartite Graph model is a "won't-be-wrong" method in most cases.
45
FUTURE CONSIDERATIONS – INCREMENTAL DATA
How to handle incremental data?
The amount of shopping data increases every minute, every day
How to handle the new data?
Do we recompute the model on the entire dataset every day, or do we process only the new data and add the result to the existing model?
How do the strategies affect the accuracy of the recommendations?
46
FUTURE CONSIDERATION – IMPUTATION STRATEGIES
When a person did not buy an item, it does not tell us whether the person likes or dislikes the item
The item usually receives a rating of 0, which is not the actual rating
The sparse matrix must be imputed by filling in the missing values
Different imputation strategies will lead to different results, in terms of accuracy and computational efficiency
47
FUTURE CONSIDERATION – DATA PRIVACY
Small merchants do not have the manpower to maintain a recommendation system
They usually buy the services from a third party
They have to provide their customer data to the third party for analysis
This may leak private customer data or trade secrets
How to pre‐process customer data so that data privacy is preserved?
48
Professor Jun Zhang
Department of Computer Science
University of Kentucky
Lexington, KY 40506-0633, USA
E-mail: [email protected]
www.cs.uky.edu/~jzhang
Tel: 13540021323
TOP-N PREDICTIONS
50
20,471 users, 499 items
60 factors for SVD
TOP-N PREDICTIONS CONT.
51
148,409 users, 1,004 items
70 factors for SVD
TOP-N PREDICTIONS CONT.
52
112,738 users, 94 items
94 factors for SVD
TOP-N PREDICTIONS CONT.
53
70,049 users, 2,303 items
100 factors for SVD
CONCLUSION AND FUTURE WORK
We presented the concept and background of recommender systems in e-business.
There are two classes of recommendation techniques: Content‐based filtering and Collaborative filtering. The latter is extensively used in popular online shopping websites.
We illustrated four basic collaborative filtering algorithms and conducted an empirical study of them on four datasets from a retargeting company.
54
CONCLUSION AND FUTURE WORK
The filtering step of the Item Popularity model has a good effect on the Item Similarity and Bipartite Graph models but has no effect at all on the SVD model.
We identified a strategy for choosing a model based on the characteristics of the dataset.
The models could be combined to provide better prediction accuracy.
Privacy issues in recommender systems. (Future)
55
International Conference on Business Computing and Global Informatization
BCGIn 2013
September 13-15, 2013
Changsha, China
www.bcgin.org