Survey of Recommendation Systems
![Page 1: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/1.jpg)
Survey of Recommendation Systems
![Page 2: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/2.jpg)
Outline
• Introduction
• Collaborative Filtering Algorithm
• Challenges
• Experiments (demo)
• Summary
• Future work
![Page 4: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/4.jpg)
Introduction
• What is a recommendation system?
– Recommend related items
– Personalized experiences
• How to build a recommendation system?
– Content-Based
– Collaborative Filtering Algorithm
• Examples
– Amazon
– Youa
![Page 5: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/5.jpg)
Examples
Browsing a book
Recommendations
Rating?
![Page 7: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/7.jpg)
CF Algorithm
• Memory-Based
– User-Based
– Item-Based
• Model-Based
– Bayes
– Clustering
![Page 8: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/8.jpg)
User-Based CF Algorithm
![Page 9: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/9.jpg)
User-Based CF Algorithm
User by Item Matrix:
Table 1: An example of user-item matrix
Table 2: A simple example of ratings matrix
![Page 10: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/10.jpg)
User-Based CF Algorithm
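Voting: $v_{i,j}$ is the vote of user $i$ on item $j$.
Mean vote:
$$\bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{i,j}$$
where $I_i$ is the set of items on which user $i$ voted.
Predicted vote:
$$p_{a,j} = \bar{v}_a + \kappa \sum_{i=1}^{n} w(a,i)\,(v_{i,j} - \bar{v}_i)$$
where $w(a,i)$ are the weights of the $n$ similar users and $\kappa$ is a normalizer.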
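The mean vote and predicted vote can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: the function names, the dictionary layout, and the sample votes are assumptions, and the normalizer is taken to be one over the sum of absolute weights, which is one common choice.

```python
def mean_vote(votes):
    """Average of the votes a user actually cast (votes: item -> rating)."""
    return sum(votes.values()) / len(votes)


def predict_vote(active_votes, other_votes, weights, item):
    """Predicted vote of the active user on `item`.

    other_votes: user -> {item -> rating}
    weights:     user -> similarity weight w(a, i) to the active user
    Assumption: kappa = 1 / sum(|w(a, i)|) over users who voted on `item`.
    """
    weighted_dev = 0.0
    norm = 0.0
    for user, votes in other_votes.items():
        if item not in votes:
            continue  # this user cannot contribute to the prediction
        w = weights[user]
        weighted_dev += w * (votes[item] - mean_vote(votes))
        norm += abs(w)
    if norm == 0:
        return mean_vote(active_votes)  # fall back to the active user's mean
    return mean_vote(active_votes) + weighted_dev / norm


# Hypothetical data: two neighbours have voted on item "C".
active = {"A": 4, "B": 2}
others = {"u1": {"A": 5, "C": 3}, "u2": {"B": 1, "C": 5}}
weights = {"u1": 0.5, "u2": 1.0}
prediction = predict_vote(active, others, weights, "C")
```

Each neighbour contributes its deviation from its own mean rather than its raw vote, so a generous rater and a strict rater are put on a comparable footing.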
![Page 11: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/11.jpg)
Similarity Computation
Vector Cosine-Based Similarity
Correlation-Based Similarity (Pearson)
Other Similarities
![Page 12: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/12.jpg)
Vector Cosine-Based Similarity
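Vector cosine similarity:
$$w_{A,B} = \cos(\vec{A}, \vec{B}) = \frac{\vec{A} \cdot \vec{B}}{\|\vec{A}\| \, \|\vec{B}\|}$$
Adjusted cosine similarity (for users who rate on different scales):
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}}$$
where $U$ is the set of users who rated both items $i$ and $j$, and $\bar{r}_u$ is the mean rating of user $u$.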
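The two similarity measures can be compared with a short Python sketch. The function names, data layout, and sample ratings are assumptions made for illustration only.

```python
import math


def cosine(a, b):
    """Plain vector cosine between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adjusted_cosine(ratings, i, j):
    """Adjusted cosine between items i and j.

    ratings: user -> {item -> rating}. Each user's own mean rating is
    subtracted first, which compensates for different rating scales.
    """
    num = den_i = den_j = 0.0
    for votes in ratings.values():
        if i in votes and j in votes:
            mean_u = sum(votes.values()) / len(votes)
            di, dj = votes[i] - mean_u, votes[j] - mean_u
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    den = math.sqrt(den_i * den_j)
    return num / den if den else 0.0


# Hypothetical ratings: u1 rates low overall, u2 rates high overall.
ratings = {
    "u1": {"x": 1, "y": 2, "z": 3},
    "u2": {"x": 5, "y": 4, "z": 3},
}
plain = cosine([1, 5], [3, 3])                 # items x and z as raw vectors
adjusted = adjusted_cosine(ratings, "x", "z")  # after removing user means
```

On this sample the raw cosine is positive, while the adjusted cosine is strongly negative, because each user rates `x` and `z` on opposite sides of their own mean.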
![Page 13: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/13.jpg)
Correlation-Based Similarity
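Pearson correlation between users $u$ and $v$:
$$w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$$
where $I$ is the set of items rated by both users, and $\bar{r}_u$ is the mean rating of user $u$ over those items.
Thus, in the example in Table 2, we have $w_{1,5} = 0.756$.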
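A minimal Python sketch of the Pearson correlation over co-rated items follows. The two rating vectors are illustrative values assumed for this sketch, not a transcription of Table 2, which did not survive in the transcript.

```python
import math


def pearson(a, b):
    """Pearson correlation of two users' ratings on their co-rated items."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


# Hypothetical co-rated ratings of two users on three items.
w = pearson([4, 5, 5], [2, 3, 5])
print(round(w, 3))  # 0.756
```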
![Page 14: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/14.jpg)
Prediction Computation
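Weighted Sum of Others’ Ratings:
$$P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$$
where $\bar{r}_a$ and $\bar{r}_u$ are the mean ratings of the active user $a$ and of user $u$, $w_{a,u}$ is the weight between them, and $U$ is the set of users who have rated item $i$.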
For the simple example in Table 4, using the user-based CF algorithm, to
predict the rating for U1 on I2, we have
![Page 15: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/15.jpg)
Recommendations I
Rating Prediction Algorithm:
a) Calculate $P_{a,i}$ for each item $i$ using the prediction computation formula.
b) Recommend the top-N highest-rated items that the active user $a$ has not purchased.
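Step (b) can be sketched in a few lines of Python. The function name and the sample predictions are hypothetical, and the predicted ratings are assumed to have been computed already in step (a).

```python
def top_n_by_prediction(predicted, purchased, n):
    """Return the n items with the highest predicted rating, skipping
    anything the active user has already purchased.

    predicted: item -> predicted rating, purchased: set of items.
    """
    candidates = [
        (item, score) for item, score in predicted.items()
        if item not in purchased
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in candidates[:n]]


# Hypothetical predictions; item "c" is already owned.
predicted = {"a": 4.5, "b": 3.0, "c": 4.9, "d": 2.0}
recommended = top_n_by_prediction(predicted, {"c"}, 2)
```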
![Page 16: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/16.jpg)
Recommendations II
K Nearest Neighbors Algorithm:
a) Find k most similar users (KNN).
b) Identify a set of items, C, purchased by the
group together with their frequency.
c) Recommend the top-N most frequent items in
C that the active user has not purchased.
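The three steps above can be sketched in Python as follows. The function name, the data layout, and the alphabetical tie-breaking rule are assumptions of this sketch. The similarity values are assumed to come from one of the similarity measures described earlier.

```python
from collections import Counter


def knn_top_n(similarity, purchases, active, k, n):
    """Frequency-based top-N recommendation.

    similarity: other user -> similarity to the active user
    purchases:  user -> set of purchased items (includes the active user)
    """
    # a) the k most similar users
    neighbours = sorted(similarity, key=similarity.get, reverse=True)[:k]
    # b) the set C of items they purchased, with frequencies
    counts = Counter(item for u in neighbours for item in purchases[u])
    # c) the most frequent items the active user has not purchased
    owned = purchases[active]
    ranked = sorted(
        (item for item in counts if item not in owned),
        key=lambda item: (-counts[item], item),  # ties broken alphabetically
    )
    return ranked[:n]


# Hypothetical data.
similarity = {"u1": 0.9, "u2": 0.8, "u3": 0.1}
purchases = {
    "me": {"a"},
    "u1": {"a", "b", "c"},
    "u2": {"b", "d"},
    "u3": {"e"},
}
recommended = knn_top_n(similarity, purchases, "me", k=2, n=2)
```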
![Page 17: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/17.jpg)
Item-Based CF Algorithm
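Correlation-Based Similarity:
$$w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$$
where $U$ is the set of users who rated both items $i$ and $j$, $r_{u,i}$ is the rating of user $u$ on item $i$, and $\bar{r}_i$ is the average rating of the $i$th item by those users.
[Figure: user-item matrix]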
![Page 18: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/18.jpg)
Prediction Computation
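Simple Weighted Average:
$$P_{u,i} = \frac{\sum_{n \in N} r_{u,n} \, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$$
where $N$ is the set of other items rated by user $u$, $w_{i,n}$ is the weight between items $i$ and $n$, and $r_{u,n}$ is the rating of user $u$ on item $n$.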
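A minimal Python sketch of the simple weighted average for item-based prediction follows. The function name and sample values are hypothetical, and the item-to-item weights are assumed to be precomputed.

```python
def predict_item_based(user_ratings, item_weights):
    """Predict a user's rating for a target item.

    user_ratings: item n -> rating r_{u,n} given by the user
    item_weights: item n -> weight w_{i,n} between the target item i and n
    Returns None when no rated item has a weight to the target item.
    """
    shared = [n for n in user_ratings if n in item_weights]
    num = sum(user_ratings[n] * item_weights[n] for n in shared)
    den = sum(abs(item_weights[n]) for n in shared)
    return num / den if den else None


# Hypothetical data: the user rated n1 and n2; the target item is closer to n1.
prediction = predict_item_based({"n1": 4, "n2": 2}, {"n1": 0.75, "n2": 0.25})
```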
![Page 19: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/19.jpg)
Extensions
• Default Voting
• Inverse User Frequency
• Case Amplification
![Page 20: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/20.jpg)
Default Voting
Problem:
• pair-wise similarity is computed only from the ratings in
the intersection of the items both users have rated.
• too few votes at the beginning
Solution: assume default values for the missing ratings, which can improve CF prediction performance.
Alternative: dimensionality reduction, such as SVD or PCA.
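The default-voting idea can be sketched in Python as below. This is a simplified illustration: the helper name and the default value of 3 are assumptions, and the sketch only fills in ratings over the union of the two users' items so that a similarity measure has more than the intersection to work with.

```python
def with_default_votes(a, b, default):
    """Align two users' ratings over the union of their rated items,
    substituting `default` wherever one of them has no rating.

    a, b: item -> rating. Returns two equal-length lists.
    """
    items = sorted(set(a) | set(b))
    return (
        [a.get(item, default) for item in items],
        [b.get(item, default) for item in items],
    )


# Hypothetical users with only one item in common.
vec_a, vec_b = with_default_votes({"x": 5, "y": 4}, {"y": 2, "z": 1}, default=3)
```

The resulting vectors can be passed to any of the similarity measures above.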
![Page 21: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/21.jpg)
Inverse User Frequency
Definition:
$$f_j = \log(n / n_j)$$
where nj is the number of users who have rated item j and
n is the total number of users.
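A short Python sketch of the inverse user frequency follows. The function name and data layout are assumptions. Items rated by every user receive a weight of zero, while rarely rated items receive larger weights.

```python
import math


def inverse_user_frequency(ratings):
    """Compute f_j = log(n / n_j) for every item j.

    ratings: user -> {item -> rating}; n is the number of users and
    n_j the number of users who rated item j.
    """
    n = len(ratings)
    counts = {}
    for votes in ratings.values():
        for item in votes:
            counts[item] = counts.get(item, 0) + 1
    return {item: math.log(n / n_j) for item, n_j in counts.items()}


# Hypothetical data: "hit" is rated by everyone, "niche" by one user.
ratings = {
    "u1": {"hit": 5, "niche": 4},
    "u2": {"hit": 3},
    "u3": {"hit": 4},
    "u4": {"hit": 2},
}
f = inverse_user_frequency(ratings)
```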
![Page 22: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/22.jpg)
Case Amplification
$$w'_{i,j} = w_{i,j} \cdot |w_{i,j}|^{\rho - 1}$$
where $\rho$ is the case amplification power, $\rho \ge 1$; a typical choice is $\rho = 2.5$. Case amplification reduces noise in the data.
It tends to favor high weights, as small values raised to a power become negligible.
For example, if $w_{i,j} = 0.9$, the weight remains high ($0.9^{2.5} \approx 0.8$);
if $w_{i,j} = 0.1$, it becomes negligible ($0.1^{2.5} \approx 0.003$).
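The two numeric examples can be checked with a one-line Python function. The function name is hypothetical. The weight's sign is preserved because only its absolute value is raised to the power.

```python
def amplify(w, rho=2.5):
    """Case amplification: w * |w|^(rho - 1), which keeps the sign of w."""
    return w * abs(w) ** (rho - 1)


high = amplify(0.9)       # about 0.77, still a strong weight
low = amplify(0.1)        # about 0.003, effectively ignored
negative = amplify(-0.9)  # about -0.77, sign preserved
```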
![Page 23: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/23.jpg)
Model-Based CF Algorithm
• Simple Bayesian CF Algorithm
• Clustering CF Algorithm
![Page 24: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/24.jpg)
Simple Bayesian CF Algorithm
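Simple Bayesian:
$$\text{class} = \arg\max_{j \in \text{classSet}} \; p(\text{class}_j) \prod_{o} P(X_o = x_o \mid \text{class}_j)$$
Laplace Estimator:
$$P(X_i = x_i \mid Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$$
where $|X_i|$ is the number of distinct values that $X_i$ can take.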
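A minimal Python sketch of a simple Bayesian classifier with the Laplace estimator follows. The function names, the training-data layout, and the like/dislike labels are assumptions made for illustration. The class prior is left unsmoothed and only the conditional probabilities use the Laplace estimator, which is one possible reading of the formulas.

```python
from collections import Counter


def laplace(count_xy, count_y, n_values):
    """Laplace-smoothed estimate of P(X = x | Y = y)."""
    return (count_xy + 1) / (count_y + n_values)


def predict_class(train, query, classes, n_values):
    """Pick the class with the highest prior * product of likelihoods.

    train:    list of (features, label), features being name -> value
    query:    name -> value for the instance to classify
    n_values: number of distinct values a feature can take
    """
    label_counts = Counter(label for _, label in train)
    best, best_score = None, -1.0
    for c in classes:
        score = label_counts[c] / len(train)  # unsmoothed prior p(class)
        for name, value in query.items():
            matches = sum(
                1 for feats, label in train
                if label == c and feats.get(name) == value
            )
            score *= laplace(matches, label_counts[c], n_values)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical data: the feature is another user's opinion of each item,
# and the label is the active user's opinion of the same item.
train = [
    ({"u2": "like"}, "like"),
    ({"u2": "like"}, "like"),
    ({"u2": "dislike"}, "dislike"),
]
guess = predict_class(train, {"u2": "like"}, ["like", "dislike"], n_values=2)
```

Without the `+ 1` in the numerator, any feature value never seen together with a class would force that class's score to zero.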
![Page 25: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/25.jpg)
Simple Bayesian CF Algorithm
Example in Table 4, to produce the rating for U1 on I2 using the
Simple Bayesian CF algorithm and the Laplace Estimator:
![Page 26: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/26.jpg)
Clustering CF Algorithm
For two data objects, $X = (x_1, x_2, \ldots, x_n)$ and $Y = (y_1, y_2, \ldots, y_n)$, the popular Minkowski distance is defined as
$$d(X, Y) = \sqrt[q]{\sum_{i=1}^{n} |x_i - y_i|^q}$$
where $n$ is the number of dimensions of the objects and $q$ is a positive integer.
When $q = 1$, $d$ is the Manhattan distance; when $q = 2$, $d$ is the Euclidean distance.
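The Minkowski distance and its two special cases can be sketched in Python as follows. The function name and sample points are hypothetical.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length vectors."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)


manhattan = minkowski([0, 0], [3, 4], q=1)  # |3| + |4|
euclidean = minkowski([0, 0], [3, 4], q=2)  # sqrt(3^2 + 4^2)
```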
![Page 27: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/27.jpg)
Evaluation Metrics
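Mean Absolute Error and Normalized Mean Absolute Error:
$$\mathrm{MAE} = \frac{\sum_{\{i,j\}} |p_{i,j} - r_{i,j}|}{n}, \qquad \mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}$$
where $p_{i,j}$ is the predicted rating of user $i$ on item $j$, $r_{i,j}$ is the actual rating, $n$ is the total number of ratings, and $r_{\max}$ and $r_{\min}$ are the upper and lower bounds of the ratings.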
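Both metrics are short to compute. The Python sketch below uses hypothetical function names and made-up ratings on a 1 to 5 scale.

```python
def mae(predicted, actual):
    """Mean absolute error over paired predicted and actual ratings."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


def nmae(predicted, actual, r_min, r_max):
    """MAE normalized by the width of the rating scale."""
    return mae(predicted, actual) / (r_max - r_min)


# Hypothetical predictions versus actual ratings on a 1-5 scale.
error = mae([4, 3, 5], [5, 3, 3])
normalized = nmae([4, 3, 5], [5, 3, 3], r_min=1, r_max=5)
```

Normalizing by the scale width makes errors comparable across datasets that use different rating ranges.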
![Page 29: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/29.jpg)
Challenges
• Data sparsity
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
![Page 31: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/31.jpg)
Demo
• Tools: Mahout, a scalable machine learning and data mining library, http://mahout.apache.org/
• Data: MovieLens, http://www.movielens.org/
![Page 33: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/33.jpg)
Conclusions
CF category: Memory-based CF
Representative techniques: item-based/user-based top-N recommendations
Main advantages:
1. easy implementation
2. new data can be added easily and incrementally
3. need not consider the content of the items being recommended
4. scale well with co-rated items
Main shortcomings:
1. are dependent on human ratings
2. performance decreases when data are sparse
3. cannot recommend for new users and items
4. have limited scalability for large datasets
![Page 34: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/34.jpg)
Conclusions
CF category: Model-based CF
Representative techniques:
1. Bayesian belief nets CF
2. Clustering CF
3. CF using dimensionality reduction techniques (SVD, PCA)
Main advantages:
1. better address the sparsity, scalability and other problems
2. improve prediction performance
3. give an intuitive rationale for recommendations
Main shortcomings:
1. expensive model-building
2. trade-off between prediction performance and scalability
3. lose useful information for dimensionality reduction techniques
![Page 36: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/36.jpg)
Future work
• Scalability
• Real-time
![Page 37: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/37.jpg)
Q & A
![Page 38: Survey of Recommendation Systems](https://reader033.fdocuments.in/reader033/viewer/2022052821/5549238bb4c905a54c8b91f2/html5/thumbnails/38.jpg)
References
J. Breese, D. Heckerman, and C. Kadie, “Empirical analysis of predictive algorithms for collaborative filtering,” in Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence (UAI ’98), 1998.
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative filtering recommendation algorithms,” in Proc. of the WWW Conference, 2001.
K. Miyahara and M. J. Pazzani, “Collaborative filtering with the simple Bayesian classifier,” in Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, pp. 679–689, 2000.
L. H. Ungar and D. P. Foster, “Clustering methods for collaborative filtering,” in Proceedings of the Workshop on Recommendation Systems, AAAI Press, 1998.
X. Su and T. M. Khoshgoftaar, “A survey of collaborative filtering techniques,” Advances in Artificial Intelligence, vol. 2009, Article ID 421425, 19 pages, 2009.