![Page 1: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/1.jpg)
A quick introduction to item-based collaborative filtering
Pham Cong Dinh (@pcdinh)
PHPDay Saigon 2012
![Page 2: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/2.jpg)
Outline
● PHP popularity and the challenge of producing engaging content
● Recommendation engines at work
● How to build an item-based collaborative filtering recommendation engine
![Page 3: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/3.jpg)
PHP is everywhere
● W3Techs report, 2012 [chart]
![Page 4: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/4.jpg)
PHP website distribution
● Reported by builtwith.com in 2012: more than 28 million sites run PHP [chart]
![Page 5: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/5.jpg)
You have a website. Now what?
![Page 6: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/6.jpg)
Information overload
From http://bethesignal.org/
OR
no engaging content?
![Page 7: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/7.jpg)
Why a recommendation system?
![Page 8: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/8.jpg)
Recommendation engine at work
![Page 9: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/9.jpg)
Recommendation engine at work
![Page 10: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/10.jpg)
Build a recommendation system
● Collaborative filtering: user-based and item-based
– Filtering: automatic predictions about the interests of a user
– Collaborative: uses preference or taste information gathered from many users
![Page 11: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/11.jpg)
Item-based collaborative filtering
● Model-based
– The similarities between different items in the data set are calculated
– Ratings are predicted for user-item pairs not present in the data set
![Page 12: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/12.jpg)
Steps to do item-based collaborative filtering
● Data collection and representation (preferences/taste …)
● Finding the relationships and determining the similarity
● Recommendation computation: recommendations/suggestions/discoveries (to produce engaging content)
![Page 13: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/13.jpg)
Collaborative filtering: data collection
● Data collection and representations (preferences/taste …)
– Clicks
– Likes, favorites
– Watch, read
– Survey
– Ratings
– Others …
● E.g.: find the set of movies that user X likes
(user, item)
✗ X, 1
✗ X, 2
✗ Y, 1
✗ Y, 2
✗ Z, 2
✗ Z, 3
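The (user, item) pairs on this slide can be indexed in both directions, which is the representation the later similarity steps need. The following is a minimal Python sketch, not code from the talk; the function names `index_by_user` and `index_by_item` are hypothetical, and the talk itself targets PHP.

```python
from collections import defaultdict

# (user, item) preference pairs copied from the slide.
pairs = [("X", 1), ("X", 2), ("Y", 1), ("Y", 2), ("Z", 2), ("Z", 3)]


def index_by_user(pairs):
    """Map each user to the set of items they liked (hypothetical helper)."""
    liked = defaultdict(set)
    for user, item in pairs:
        liked[user].add(item)
    return dict(liked)


def index_by_item(pairs):
    """Map each item to the set of users who liked it (hypothetical helper)."""
    fans = defaultdict(set)
    for user, item in pairs:
        fans[item].add(user)
    return dict(fans)


items_liked_by = index_by_user(pairs)
users_who_liked = index_by_item(pairs)

print(items_liked_by["X"])   # the set of items user X likes
print(users_who_liked[2])    # the set of users who like item 2
```

The item-to-users index is the one item-based filtering relies on, because two items are compared through the users they share.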
![Page 14: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/14.jpg)
Collaborative filtering: Similarity (1)
● Finding the relationships and determining the similarity
– The similarity between two items is measured over all the users who have interacted with (rated) both items
● E.g.: find a group of movies similar to the set of movies that we know user X likes
![Page 15: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/15.jpg)
Collaborative filtering: Similarity (2)
● Manhattan distance: |x1 – x2| + |y1 – y2|
User(x, y), plotted on axes X and Y:
– Amy(5, 5)
– Bill(2, 5)
– Jim(1, 4)
Item(x1, x2, x3) → Ratings:
– Snow Crash(5, 2, 1)
– Girl with the Dragon Tattoo(5, 5, 1)
Manhattan distance:
→ Amy – Bill: |5 – 2| + |5 – 5| = 3
→ Snow Crash – Girl with the Dragon Tattoo: |5 – 5| + |2 – 5| + |1 – 1| = 3
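The Manhattan distance on this slide can be checked with a few lines of Python. This is a small sketch rather than code from the talk; the function name `manhattan_distance` is an illustrative choice, and the vectors are the ones shown on the slide.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


# User vectors and item rating vectors as given on the slide.
amy, bill = (5, 5), (2, 5)
snow_crash = (5, 2, 1)
dragon_tattoo = (5, 5, 1)

print(manhattan_distance(amy, bill))                  # 3
print(manhattan_distance(snow_crash, dragon_tattoo))  # 3
```

The same function serves both comparisons: only the choice of vectors (users or items) changes.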
![Page 16: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/16.jpg)
Collaborative filtering: Similarity (3)
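● Cosine similarity (often loosely called cosine distance): the cosine of the angle between the two rating vectors. Values range from -1 (opposite) through 0 (unrelated) to 1 (same direction); with non-negative ratings the range is 0 to 1
Item(x1, x2, x3) → Ratings:
– Snow Crash(5, 2, 1)
– Girl with the Dragon Tattoo(5, 5, 1)
Cosine similarity:
→ Snow Crash – Girl with the Dragon Tattoo: (5×5 + 2×5 + 1×1) / (√(5×5 + 2×2 + 1×1) × √(5×5 + 5×5 + 1×1)) = 36 / (√30 × √51) ≈ 0.92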
PHP: https://github.com/aoiaoi/CosineSimilarity/blob/master/CosineSimilarity.php
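For readers who prefer a compact version to the linked PHP file, here is a Python sketch of the standard cosine similarity: the dot product divided by the product of the two vector lengths, where each length is the square root of a sum of squares. It is not a port of the linked code. The function name and the zero-vector fallback are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: an all-zero vector is treated as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


snow_crash = (5, 2, 1)
dragon_tattoo = (5, 5, 1)

# 36 / (sqrt(30) * sqrt(51)), roughly 0.92
print(round(cosine_similarity(snow_crash, dragon_tattoo), 2))
```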
![Page 17: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/17.jpg)
Collaborative filtering: Similarity (4)
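● Pearson correlation coefficient: ranges from -1 (perfect negative correlation) through 0 (no linear relationship) to +1 (perfect positive correlation)
● Measures how much the ratings given by common users to a pair of items deviate from the average ratings for those items
● Correlation is essentially the average product of the two items' standardized ratings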
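The formula on this slide did not survive in the transcript, so the sketch below uses the textbook definition of the Pearson coefficient: the sum of products of deviations from the mean, divided by the square root of the product of the summed squared deviations. The function name and the choice to return 0.0 for a constant vector are assumptions for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sum of products of deviations from each vector's mean.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption for this sketch: a constant vector carries no correlation signal.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Item rating vectors reused from the earlier similarity slides.
snow_crash = (5, 2, 1)
dragon_tattoo = (5, 5, 1)
print(round(pearson(snow_crash, dragon_tattoo), 2))  # roughly 0.69
```

Because each vector is centered on its own mean, an item that is rated consistently higher than another can still correlate perfectly with it.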
![Page 18: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/18.jpg)
Collaborative filtering: Similarity (5)
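● Euclidean distance: the "ordinary" straight-line distance between two points.
● Values: the distance itself runs from 0 upward; once converted to a similarity score, values range from near 0 (unrelated) to 1 (identical)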
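The slide's formula is not preserved here, so the conversion below is an assumption: 1 / (1 + d) is one common way to map a Euclidean distance d onto the (0, 1] range the slide describes, and the talk may have used a different mapping. The function names are illustrative.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean_similarity(a, b):
    """Assumed mapping 1 / (1 + d): 1.0 for identical vectors, toward 0 as d grows."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


snow_crash = (5, 2, 1)
dragon_tattoo = (5, 5, 1)

print(euclidean_distance(snow_crash, dragon_tattoo))    # 3.0
print(euclidean_similarity(snow_crash, dragon_tattoo))  # 0.25
```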
![Page 19: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/19.jpg)
Collaborative filtering: Similarity (6)
● Spearman distance: the squared Euclidean distance between two rank vectors.
● Spearman rank correlation: ranges from -1 (a perfect negative correlation) to +1 (a perfect positive correlation)
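Both Spearman measures operate on ranks rather than raw ratings. The sketch below assumes tied values share the average of their rank positions, and it computes the rank correlation as the Pearson correlation of the two rank vectors; all function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_distance(a, b):
    """Sum of squared differences between the two rank vectors."""
    return sum((x - y) ** 2 for x, y in zip(average_ranks(a), average_ranks(b)))


def spearman_correlation(a, b):
    """Pearson correlation computed on ranks instead of raw values."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: constant ranks carry no signal
    return cov / math.sqrt(var_a * var_b)


print(spearman_distance([1, 2, 3], [3, 2, 1]))            # 8.0
print(spearman_correlation([1, 2, 3, 4], [1, 4, 9, 16]))  # 1.0
```

The second call shows why ranks are useful: the relationship is not linear, yet the ordering agrees perfectly, so the rank correlation is 1.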
![Page 20: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/20.jpg)
Collaborative filtering: Similarity (7)
● Adjusted Euclidean distance: takes the length of the vectors into account
![Page 21: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/21.jpg)
Collaborative filtering: Recommendation computations
● Calculate the similarity between each item A that user X has watched/bought/liked and the items that user X has not watched/bought/liked
● Score all the candidate items (e.g. apply a weighted algorithm, such as a weighted average of the user's scores for the similar items)
● Sort
● Return the top-N items
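The steps on this slide can be sketched end to end in Python. Everything specific here is an assumption for illustration: the ratings are made-up sample data, the similarity is cosine over co-rating users, and the score is a similarity-weighted average of the user's own ratings. The talk does not state which combination it used.

```python
import math

# Hypothetical sample ratings (user -> {item: rating}); not from the slides.
ratings = {
    "X": {"A": 5, "B": 4},
    "Y": {"A": 5, "B": 5, "C": 4},
    "Z": {"B": 2, "C": 5, "D": 4},
    "W": {"A": 4, "C": 2, "D": 1},
}


def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, rated in ratings.items():
        for item, score in rated.items():
            items.setdefault(item, {})[user] = score
    return items


def similarity(vec_i, vec_j):
    """Cosine similarity restricted to users who rated both items."""
    common = set(vec_i) & set(vec_j)
    if not common:
        return 0.0
    dot = sum(vec_i[u] * vec_j[u] for u in common)
    norm_i = math.sqrt(sum(vec_i[u] ** 2 for u in common))
    norm_j = math.sqrt(sum(vec_j[u] ** 2 for u in common))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def recommend(user, ratings, n=3):
    """Score unseen items by a similarity-weighted average, then keep the top n."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scored = []
    for candidate in items:
        if candidate in seen:
            continue
        sims = {s: similarity(items[candidate], items[s]) for s in seen}
        total = sum(sims.values())
        if total > 0:
            score = sum(sims[s] * seen[s] for s in seen) / total
            scored.append((candidate, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


print(recommend("X", ratings, n=2))
```

With this sample data, item C ranks slightly ahead of item D for user X. Note that an item pair sharing only one rater always gets a cosine of 1, which is one reason practical systems often require a minimum number of co-raters.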
![Page 22: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/22.jpg)
Collaborative filtering: Other issues
● Accuracy of rating predictions. To evaluate accuracy when predicting ratings of unrated items for the active user, use Mean Absolute Error (MAE).
● Accuracy of recommendations. To evaluate the accuracy of recommendations, use Mean Average Precision (MAP), defined as the mean of the Average Precision (AP) values over a set of queries (in a recommender system, a query can be thought of as one user's request for recommended items).
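Both metrics are short enough to sketch in Python. The function names are illustrative. Average Precision has more than one normalization in common use; the version below divides by the number of relevant items, which is an assumption rather than something the slide specifies.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def average_precision(recommended, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    # Assumed normalization: divide by the number of relevant items.
    return total / len(relevant)


def mean_average_precision(queries):
    """Mean AP over (recommended_list, relevant_set) pairs, one pair per user."""
    return sum(average_precision(rec, rel) for rec, rel in queries) / len(queries)


print(mean_absolute_error([4, 3, 5], [5, 3, 3]))  # 1.0
print(mean_average_precision([
    (["a", "b"], {"a"}),  # relevant item ranked first: AP = 1.0
    (["a", "b"], {"b"}),  # relevant item ranked second: AP = 0.5
]))  # 0.75
```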
![Page 23: Speaker pham cong dinh](https://reader034.fdocuments.in/reader034/viewer/2022051617/5593c3251a28abaf4a8b4757/html5/thumbnails/23.jpg)
The End
● Q & A