Post on 03-Aug-2015
Recommender Systems
Simona Dakova
Web Technologies – Prof. Dr. Ulrik Schroeder – WS 2010/11
The slides are licensed under a Creative Commons Attribution 3.0 License
Overview
Motivation
Netflix Prize Competition
Collaborative filtering approaches
Content-based techniques
Hybrid recommenders
Summary
We live in an age of information overload!
“We are leaving the age of Information and entering the Age of Recommendation” -The Long Tail (Chris Anderson)
Netflix: 2/3 of the movies rented were recommended
Google News: 38% more click-throughs
Amazon: 35% of sales come from recommendations
They try to attract you!
Why recommenders?
Enhance e-commerce and boost sales
Turn browsers into buyers
Recommender vs. Search:
Discover the items you are looking for, matching your preferences
Limited list of results
Personalize your website content to the profile of an individual user
Discover interesting items
Automated personalization
Increase usage and satisfaction
Netflix Prize Competition
$1,000,000 if you “only” improve the existing system by 10%!
Contest started in 2006
Annual progress prize of $50,000
Gained great popularity in academic circles
The Winner
BellKor's Pragmatic Chaos
10.5% improvement in July 2009
Recommender System = ?
Definition:
Algorithms/systems for information filtering that attempt to recommend items the user might like
Items:
Advertising messages, Investment choices, Restaurants, Cafes, Music tracks, Movies, TV programs, Books, Clothes, Supermarket goods, Tags, News articles, Online mates, Research papers
User Profiling
Understand people's needs and interests
Explicit Data Collection
Ask for rating of items
Rank a set of items
Ask for detailed information/feedback
CON: not well received by users, not ubiquitous
Implicit Data Collection
Purchasing history
Items viewed
Navigational patterns
Obtain a list of items watched or listened to
Analyze social data
CON: Privacy concerns
Technology overview
RECOMMENDERS
Collaborative filtering (CF)
  Memory-based CF algorithms
  Model-based CF algorithms
Content-based filtering (CB)
Hybrid recommenders
Collaborative filtering (CF)
Memory-based CF algorithms:
• prediction based on past ratings
• compute similarities between users/items
• make predictions according to the calculated weight (similarity)
Model-based CF algorithms:
• learn a model from users' ratings
• use the model to predict the probabilistic rating of the active user on a given item
Memory-based CF Algorithms
Use the entire user-item matrix or a sample of it
Steps:
1. For the active user/item, identify its neighbors
Similarity computation
Pearson correlation
Vector cosine-based similarity
2. Neighborhood-based prediction / Top-N recommendation
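The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `ratings` dictionary is made-up toy data, `pearson`, `cosine` and `predict` are hypothetical helper names, and the prediction uses one common variant (a mean-centered weighted average over the k most similar users who rated the item).

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}; items a user has not rated are absent.
ratings = {
    "u1": {"i1": 5, "i2": 8, "i3": 7},
    "u2": {"i1": 10, "i2": 1, "i3": 2},
    "u3": {"i1": 4, "i2": 9, "i3": 8, "i4": 9},
    "u4": {"i1": 9, "i2": 2, "i3": 1, "i4": 3},
    "ua": {"i1": 5, "i2": 9, "i3": 8},  # active user, rating for i4 unknown
}

def pearson(a, b):
    """Pearson correlation computed over the items both users have rated."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = (sqrt(sum((a[i] - mean_a) ** 2 for i in common))
           * sqrt(sum((b[i] - mean_b) ** 2 for i in common)))
    return num / den if den else 0.0

def cosine(a, b):
    """Vector cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    num = sum(a[i] * b[i] for i in common)
    den = (sqrt(sum(a[i] ** 2 for i in common))
           * sqrt(sum(b[i] ** 2 for i in common)))
    return num / den if den else 0.0

def predict(active, item, ratings, sim=pearson, k=2):
    """Step 1: find the k most similar users who rated `item`.
    Step 2: add their similarity-weighted, mean-centered ratings to the
    active user's own mean rating."""
    me = ratings[active]
    my_mean = sum(me.values()) / len(me)
    neighbors = sorted(
        ((sim(me, r), r) for u, r in ratings.items() if u != active and item in r),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    num = sum(w * (r[item] - sum(r.values()) / len(r)) for w, r in neighbors)
    den = sum(abs(w) for w, _ in neighbors)
    return my_mean + num / den if den else my_mean

prediction = predict("ua", "i4", ratings)
```

In this toy data the active user agrees with u3 and disagrees with u4, so the predicted rating for i4 is pulled above the active user's average. Passing `sim=cosine` swaps in the other similarity measure without changing the prediction step.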
User-based vs. Item-based
User-based = You may like it because your “friends” liked it
Item-based = You may like it because you like similar items
Example user-item rating matrix (sparse: empty cells are unrated, so rows list fewer than five ratings):
i1 i2 i3 i4 i5
u1 5 8 7 8
u2 10 1
u3 2 10 9 9
u4 2 9 9 10
u5 1 5 1
ua 2 9 10
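The difference between the two views comes down to whether rows or columns of the user-item matrix are compared. The sketch below illustrates this with a small made-up matrix (the slide's own matrix cannot be used because its empty cell positions are not recoverable). The names `matrix`, `user_similarity` and `item_similarity` are hypothetical, and cosine similarity over co-rated entries is just one possible choice.

```python
from math import sqrt

# Hypothetical toy matrix: rows are users, columns are items, None = unrated.
items = ["i1", "i2", "i3", "i4"]
matrix = {
    "u1": [5, 8, 7, 8],
    "u2": [10, 1, 2, None],
    "u3": [2, 10, 9, 9],
    "ua": [2, 9, None, 10],  # active user, has not rated i3
}

def cosine(x, y):
    """Cosine similarity over positions where both vectors hold a rating."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    num = sum(a * b for a, b in pairs)
    den = sqrt(sum(a * a for a, _ in pairs)) * sqrt(sum(b * b for _, b in pairs))
    return num / den if den else 0.0

def user_similarity(u, v):
    # User-based: compare two rows of the matrix.
    return cosine(matrix[u], matrix[v])

def item_similarity(i, j):
    # Item-based: compare two columns of the matrix.
    col_i = [row[items.index(i)] for row in matrix.values()]
    col_j = [row[items.index(j)] for row in matrix.values()]
    return cosine(col_i, col_j)

# User-based view: which user rates most like the active user?
nearest_user = max((u for u in matrix if u != "ua"),
                   key=lambda u: user_similarity("ua", u))
# Item-based view: which item is rated most like the unrated item i3?
nearest_item = max((i for i in items if i != "i3"),
                   key=lambda i: item_similarity("i3", i))
```

A user-based recommender would then lean on the ratings of `nearest_user` for i3, while an item-based recommender would lean on the active user's own rating of `nearest_item`.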
Model-based CF Algorithms
Train your system to recognize complex patterns in user-item data (ratings)
Make the recommendation based on the trained model
Relies on machine learning and data mining algorithms
[Figure: all ratings (r1 to r9, r11) → Train → MODEL (only a subset of the ratings: r3, r4, r7, r8) → RECOMMENDATION]
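As one concrete instance of the train-then-recommend idea, the sketch below fits a small matrix factorization model with stochastic gradient descent. The slides do not name a specific model, so this choice is an assumption made for illustration. The training triples, the hyperparameters and the function names (`train_mf`, `predict`, `rmse`) are likewise hypothetical.

```python
import random
from math import sqrt

# Hypothetical training ratings as (user, item, rating) triples on a 1-10 scale.
train = [
    ("u1", "i1", 5), ("u1", "i2", 8), ("u1", "i4", 7),
    ("u2", "i1", 10), ("u2", "i3", 1),
    ("u3", "i1", 2), ("u3", "i3", 10), ("u3", "i4", 9),
    ("u4", "i2", 2), ("u4", "i3", 9), ("u4", "i4", 9),
    ("ua", "i1", 2), ("ua", "i3", 9), ("ua", "i4", 10),
]

def train_mf(triples, n_factors=2, lr=0.01, reg=0.05, epochs=500, seed=0):
    """Learn a global mean plus low-dimensional user and item factor vectors."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in triples) / len(triples)
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u, _, _ in triples}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _, i, _ in triples}
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - (mu + sum(pu * qi for pu, qi in zip(P[u], Q[i])))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q

def predict(model, user, item):
    """Recommendation step: only the trained model is consulted, not the raw ratings."""
    mu, P, Q = model
    return mu + sum(pu * qi for pu, qi in zip(P[user], Q[item]))

def rmse(model, triples):
    """Root mean squared error of the model on a list of rating triples."""
    return sqrt(sum((r - predict(model, u, i)) ** 2 for u, i, r in triples) / len(triples))

model = train_mf(train)
unknown = predict(model, "ua", "i2")  # a rating that was never observed
```

Once training is done, predictions need only the factor vectors, which is why model-based methods tend to scale better at recommendation time than neighborhood search over the full matrix.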
Limitations and problems of CF
Depend on human ratings
Data sparsity
Cold start: new user and new item problems
Scalability
Synonymy
Shilling attacks
Gray/Black sheep
Content-based recommenders
Content-based recommendation (CB)
For items containing textual information (keywords)
Information Retrieval
Compares similarity of the features of given items
Example: Movie recommendation application
Analyze common features among the movies
Recommend only the movies that have a high degree of similarity to whatever the user’s preferences are
Large similarity
Small similarity
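A content-based movie recommender along these lines can be sketched with keyword vectors and cosine similarity. The catalogue, its keywords and the function names below are made up for illustration, and TF-IDF weighting is one standard information-retrieval choice rather than something the slides prescribe.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical movie catalogue: title -> descriptive keywords.
movies = {
    "Alien": ["space", "horror", "monster"],
    "Star Wars": ["space", "adventure", "robots"],
    "Wall-E": ["space", "robots", "animation"],
    "Notting Hill": ["romance", "comedy", "london"],
}

def tfidf_vectors(docs):
    """Weight each keyword by term frequency times inverse document frequency."""
    n = len(docs)
    df = Counter(term for words in docs.values() for term in set(words))
    return {
        name: {t: tf * log(n / df[t]) for t, tf in Counter(words).items()}
        for name, words in docs.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse keyword vectors."""
    num = sum(w * b.get(t, 0.0) for t, w in a.items())
    den = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return num / den if den else 0.0

def recommend(liked, docs):
    """Build a user profile from liked items and rank the remaining items by similarity."""
    vecs = tfidf_vectors(docs)
    profile = {}
    for title in liked:
        for term, weight in vecs[title].items():
            profile[term] = profile.get(term, 0.0) + weight
    scores = {t: cosine(profile, v) for t, v in vecs.items() if t not in liked}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranked = recommend(["Star Wars"], movies)
```

For a user who liked "Star Wars", the movie sharing the most distinctive keywords ranks first, and a movie with no keywords in common scores zero. No other users' ratings are involved at any point.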
Limitations and problems of CB
Limited content analysis
Explicitly associated features
Multimedia data – relies on tagging
Same set of features – indistinguishable
Overspecialization
Difficult to recognize synonyms, concepts, or new emerging words
New user problem
Hybrid recommenders
Hybrid recommenders
Use a combination of CF and CB
Implementing methods separately and combining their predictions
Incorporating CB characteristics into a CF approach or vice versa
Constructing a general unifying model that incorporates both
Example: content-boosted collaborative filtering
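User-item matrix with missing ratings (x):
     i1  i2  i3  i4
u1    5   8   x   7
u2   10   x   1   x
u3    2   x  10   9
u4    x   2   9   9
ua    2   x   9  10
Content predictor fills in the missing ratings:
     i1  i2  i3  i4
u1    5   8   7   7
u2   10   4   1   8
u3    2   5  10   9
u4    6   2   9   9
ua    2   3   9  10
Collaborative filtering on the filled matrix → RECOMMENDATION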
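The content-boosted pipeline shown above can be sketched as follows. The sparse matrix and the filled-in values are taken from the slide; everything else is an assumption for illustration. In particular, `content_predict` is a hypothetical stand-in that simply looks up the slide's filled-in values instead of running a trained content model, and the collaborative step is a plain Pearson-weighted neighborhood prediction rather than any specific published algorithm.

```python
from math import sqrt

items = ["i1", "i2", "i3", "i4"]
# Sparse matrix from the slide; None marks a missing rating (x).
sparse = {
    "u1": [5, 8, None, 7],
    "u2": [10, None, 1, None],
    "u3": [2, None, 10, 9],
    "u4": [None, 2, 9, 9],
    "ua": [2, None, 9, 10],
}
# Values the slide shows in place of each x after the content predictor has run.
slide_fill = {
    ("u1", "i3"): 7, ("u2", "i2"): 4, ("u2", "i4"): 8,
    ("u3", "i2"): 5, ("u4", "i1"): 6, ("ua", "i2"): 3,
}

def content_predict(user, item):
    """Hypothetical stand-in for a content-based predictor trained on item features."""
    return slide_fill[(user, item)]

def densify(matrix):
    """Replace every missing rating with the content-based pseudo-rating."""
    return {
        u: [r if r is not None else content_predict(u, items[j]) for j, r in enumerate(row)]
        for u, row in matrix.items()
    }

def pearson(x, y):
    """Pearson correlation between two fully rated rows."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def predict(dense, active, item):
    """Collaborative step: mean-centered, similarity-weighted average over all other users."""
    j = items.index(item)
    me = dense[active]
    my_mean = sum(me) / len(me)
    num = den = 0.0
    for u, row in dense.items():
        if u == active:
            continue
        w = pearson(me, row)
        num += w * (row[j] - sum(row) / len(row))
        den += abs(w)
    return my_mean + num / den if den else my_mean

dense = densify(sparse)
boosted = predict(dense, "ua", "i2")
```

Because the matrix is dense after the content step, every user can act as a neighbor for every item, which is how this hybrid sidesteps the sparsity problem of pure CF.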
Pros/Cons of Hybrid Recommenders
Advantages
Address limitations of pure CF or CB systems
Provide more accurate recommendations
Performance improvement
Overcome sparsity
Disadvantages
Complexity
Expensive to build
The winning solution of the Netflix contest
A blend of several complex algorithms into a hybrid recommender system
Main improvement:
Incorporate temporal effects that cause movie and user biases as well as the changing user preferences
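To give a feel for what blending means, the sketch below combines two predictors with a single weight fitted by least squares on held-out ratings. It is a deliberately simplified illustration and not the winning team's blending procedure. The held-out ratings, both prediction lists and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical held-out ratings and the predictions of two different recommenders.
truth = [4.0, 2.0, 5.0, 3.0]
pred_a = [3.5, 2.5, 4.5, 3.5]
pred_b = [5.0, 1.0, 5.0, 2.0]

def blend_weight(a, b, y):
    """Weight w minimizing the squared error of w*a + (1-w)*b against y."""
    num = sum((ai - bi) * (yi - bi) for ai, bi, yi in zip(a, b, y))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return num / den if den else 0.5

def blend(a, b, w):
    """Weighted average of the two prediction lists."""
    return [w * ai + (1 - w) * bi for ai, bi in zip(a, b)]

def rmse(pred, y):
    """Root mean squared error, the metric used in the Netflix Prize."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))

w = blend_weight(pred_a, pred_b, truth)
blended = blend(pred_a, pred_b, w)
```

On the data used to fit the weight, the blend cannot be worse than either predictor alone, since w = 1 and w = 0 reproduce the individual predictors. The gain is largest when the two predictors make different kinds of errors.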
Summary
Techniques / Advantages / Limitations

Collaborative: memory-based algorithms (neighborhood-based CF, Top-N recommendation)
Advantages:
• easy implementation
• no content considered
Limitations:
• data sparsity
• cold start problem
• limited scalability

Collaborative: model-based algorithms (machine learning / data mining algorithms)
Advantages:
• deal better with sparsity, scalability
• intuitive rationale
Limitations:
• expensive modeling
• trade-off between performance and scalability

Content-based: information retrieval
Advantages:
• no data about other users
• recommendation for new/unpopular items
• predictions for users with unique tastes
Limitations:
• limited content analysis
• overspecialization
• new user problem

Hybrids: combination of collaborative and content-based approaches
Advantages:
• overcome limitations of pure collaborative and content-based recommendations
• more accurate recommendations
• performance improvement
Limitations:
• complexity
• expensive to build