Demystifying Recommendation Systems
About Rumman
• Senior Data Scientist and Instructor at Metis
• Practicing Data Scientist
• Find me on Twitter @ruchowdh
• Visit my website at rummanchowdhury.com
• Check out my jobs page • …and my blog
About Metis
• Data Science Bootcamp
• Part of Kaplan
• Accredited by ACCET
• 12 weeks, full-time, including 60 hours of online pre-work
• Evening and weekend training courses
• Third party financing options
• $3,000 scholarship for women, underrepresented minority groups, and veterans or members of the U.S. military
Overview
• What is a recommendation engine?
• What are the types of recommendation systems?
• What are the drawbacks of the most common recommendation engines and how do I deal with them?
• How do I fine-tune my model?
What are recommendation systems?
Automated systems that seek to suggest whether a given item (product, event, movie, song, etc.) will be desirable to a user.
Or, more data science-y: systems that predict what a user’s rating will be for items they have not yet rated.
Where does a recommendation system lie in the space of data science and analytics?
• Descriptive
  • Averages, percentages, etc.
  • Explains what happened, after or during the event
• Predictive
  • Uses modeling of past behavior to make predictions about the future
• Prescriptive
  • Informed decisions about what actions should be taken, based on data
How do I pick the best kind of recommender system for my data?
• What is your existing data?
• How quickly does your inventory change?
• How much information can you get on a user? (explicit and implicit)
• Does your model need to scale well?
What are the kinds of recommendation systems?
• Search (knowledge-based)
  • Pros: items will be close matches to expressed needs, no cold-start issues
  • Cons: static, requires manual tagging, will not work well with very similar inventories or rapidly changing inventories
  • Example: Amazon’s basic search
• Content-based
  • Items are mapped into an item-feature space based on their characteristics, and recommendations are based on specified characteristics
  • Pros: easier comparison between items
  • Cons: cold-start problem, need good content descriptions, need item ratings
  • Example: Search for ‘ai’ vs ‘AI’, ‘mit’ vs ‘MIT’
• Collaborative filtering
  • Based on user and item similarities
  • Pros: can provide less-obvious matches
  • Cons: cold-start problem for new users and new items, requires a feedback rating
Limitations, or: ask yourself, do you really need a recommendation engine?
• Recommendation systems have to update immediately.
• You need a sufficiently inexpensive model and the bandwidth to return results fast.
• You have more information than you think:
  • existing item popularity
  • geography based on IP address
  • cookies
How does Content-Based recommendation work?
• Users and items are represented by vectors in a feature space
• Approaches:
  • Map users and items to the same feature space, then compute the distance between a user and an item.
Example: Content-Based Recommendation
Features = (big box office, aimed at kids, famous actors)
Items (movies):
Finding Nemo = (5, 5, 2)
Mission Impossible = (3, -5, 5)
Jiro Dreams of Sushi = (-4, -5, -5)
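Predicted ratings*:
Finding Nemo: (-3)(5) + (2)(5) + (2)(2) = -1
Mission Impossible: (-3)(3) + (2)(-5) + (2)(5) = -9
Jiro Dreams of Sushi: (-3)(-4) + (2)(-5) + (2)(-5) = -8
* Ratings for a user with a described preference of (-3, 2, 2) for these features, computed as the dot product of the user and item vectors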
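The scoring in this example can be sketched in a few lines of Python. The item vectors and the user preference are the ones listed above; the function and variable names are illustrative only, and the score is simply the dot product of the two vectors.

```python
# Item feature vectors from the example:
# (big box office, aimed at kids, famous actors)
items = {
    "Finding Nemo": (5, 5, 2),
    "Mission Impossible": (3, -5, 5),
    "Jiro Dreams of Sushi": (-4, -5, -5),
}

# Described preferences of one user for the same three features.
user_preference = (-3, 2, 2)


def predict_rating(user, item):
    """Score an item as the dot product of the user and item vectors."""
    return sum(u * i for u, i in zip(user, item))


scores = {title: predict_rating(user_preference, vec) for title, vec in items.items()}

# Titles ordered from highest to lowest predicted score.
ranking = sorted(scores, key=scores.get, reverse=True)

for title in ranking:
    print(title, scores[title])
```

A dot product is only one way to compare a user and an item in the same feature space; a distance measure would fit the same structure.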
How does Content-Based recommendation work?
• Another option is to create features from user+item pairs and use a classifier to predict like/dislike
• Each user/item pair has a labeled outcome, such as purchased/not purchased. You can train a model to predict purchase behavior.
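As a rough sketch of this second approach, the snippet below trains a tiny logistic regression on user+item pair features using plain gradient descent. Everything here is an assumption made for illustration: the two features, the six made-up training rows, the learning rate, and the choice of logistic regression itself, which is just one possible classifier.

```python
import math

# Hypothetical user+item pair features (all values made up):
# (how well the item matches the user's stated tastes, item price tier)
rows = [
    (1.0, 1.0), (1.0, 0.0), (1.0, 0.5),   # good match
    (0.0, 1.0), (0.0, 0.0), (0.0, 0.5),   # poor match
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = purchased, 0 = not purchased


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_purchase_probability(weights, bias, x):
    """Probability that this user+item pair ends in a purchase."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias with full-batch gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_purchase_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train_logistic(rows, labels)

# Score a new, unseen pair: good taste match, low price tier.
print(predict_purchase_probability(weights, bias, (1.0, 0.2)))
```

In practice the pair features would come from real user and item attributes, and a library classifier would usually replace the hand-written training loop.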
How does Collaborative Filtering work?
• Collaborative filtering refers to a family of methods for predicting ratings where instead of thinking about users and items in terms of a feature space, we are only interested in the existing user-item ratings themselves.
• In this case, our dataset is a ratings matrix whose rows correspond to users and whose columns correspond to items.
Example: Netflix movie recommendations
• Method 1: Item-based CF, a.k.a. neighborhood methods or memory-based CF
  • Ratings data are used to create an item-item similarity matrix.
  • Recommendations are made based on the items most similar to those a user has already rated highly.
  • This method does not scale well.
  • Why? You need a fully populated matrix of item-item similarity, which grows with the square of the number of items. This doesn’t work well if you have a lot of items or if your items change a lot.
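A minimal sketch of item-based CF follows, under several assumptions: the ratings matrix is made up, a 0 stands for "not rated" and is treated as a plain zero when computing similarity (a simplification), and cosine similarity is used as the item-item measure. The names are illustrative.

```python
import math

# Hypothetical ratings matrix: rows are users, columns are items.
# 0 means "not rated". All numbers are made up.
item_names = ["Finding Nemo", "Mission Impossible", "Jiro Dreams of Sushi", "Toy Story"]
ratings = [
    [5, 1, 0, 5],
    [4, 2, 1, 4],
    [1, 5, 4, 0],
    [5, 0, 0, 0],  # target user: has only rated Finding Nemo
]
n_items = len(item_names)


def column(matrix, j):
    """All users' ratings for item j."""
    return [row[j] for row in matrix]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Fully populated item-item similarity matrix (n_items x n_items).
item_similarity = [
    [cosine(column(ratings, i), column(ratings, j)) for j in range(n_items)]
    for i in range(n_items)
]


def recommend(user_row, top_n=2):
    """Rank unrated items by similarity to what the user rated, weighted by rating."""
    rated = [i for i, r in enumerate(user_row) if r != 0]
    scores = {}
    for j in range(n_items):
        if user_row[j] != 0:
            continue  # skip items the user has already rated
        scores[item_names[j]] = sum(item_similarity[j][i] * user_row[i] for i in rated)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend(ratings[3]))
```

The nested loop that builds `item_similarity` touches every pair of items, which is where the scaling problem comes from.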
• Method 2: Model-based CF uses matrix decomposition via singular value decomposition (SVD) to reduce dimensionality and extract latent variables.
  • We express users and items in terms of these variables.
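The idea can be sketched with NumPy's `numpy.linalg.svd`. This is an illustration under assumptions, not a production recipe: the ratings matrix is made up and fully observed, and `k = 2` latent variables is an arbitrary choice. Real ratings matrices have many missing entries, so they typically need imputation or a factorization fitted only to the observed ratings.

```python
import numpy as np

# Hypothetical, fully observed ratings matrix (rows: users, columns: items).
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

# Decompose into left singular vectors, singular values, right singular vectors.
U, singular_values, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2  # number of latent variables to keep (arbitrary for this sketch)

# Users and items expressed in terms of the k latent variables.
user_factors = U[:, :k] * singular_values[:k]  # shape (n_users, k)
item_factors = Vt[:k, :].T                     # shape (n_items, k)

# Rank-k approximation: predicted rating = user factors . item factors.
approx = user_factors @ item_factors.T

print(np.round(approx, 2))
```

Keeping only the largest singular values is what reduces dimensionality: each user and item is described by `k` numbers instead of a full row or column of ratings.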
Why is model-based CF preferred?
• Scalable, flexible, accurate, and domain-independent; requires no explicit information.
What are the drawbacks, and how can I address them?
Let’s discuss the drawbacks
• Cold-start problem!
• Data is typically very sparse
• Need granularity in your data
Drawback: Cold Start problem
• Build an initial profile based on implicit data, then evolve it based on explicit feedback as it comes in.
• You can use content-based information to ease cold-start and data sparsity problems. This is sometimes called a ‘hybrid’ filtering method.
Drawback: Sparsity of Data
• In the famous Netflix Prize dataset, ~99% of possible ratings were missing.
• Data is skewed and sparse
  • or, most people don’t rate a lot and most items aren’t rated
  • the items that are rated are often rated constantly
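To check how sparse and skewed your own data is, a quick count is usually enough. The sketch below assumes a small made-up ratings table where `None` marks a missing rating; the variable names are illustrative.

```python
# Hypothetical ratings: rows are users, columns are items, None = not rated.
ratings = [
    [5, None, None, None],
    [None, None, 3, None],
    [4, None, None, None],
]

n_possible = sum(len(row) for row in ratings)
n_observed = sum(1 for row in ratings for r in row if r is not None)

# Fraction of possible user-item ratings that are missing.
sparsity = 1 - n_observed / n_possible

# How many ratings each item received; a few items often collect most of them.
ratings_per_item = [
    sum(1 for row in ratings if row[j] is not None)
    for j in range(len(ratings[0]))
]

print(f"sparsity: {sparsity:.0%}", ratings_per_item)
```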
Drawback: Granularity of data
• Traditional model-based CF works well for non-binary data (e.g., a 5-star rating). It doesn’t work well for binary data (e.g., click/no click, purchased/did not purchase).
• You will need to tweak your measurements of item similarities
Quick overview of measurement
• Non-binary ratings:
  • Pearson correlation coefficient
  • Euclidean distance
  • Manhattan distance
• Binary ratings:
  • Jaccard similarity
  • Cosine similarity
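For reference, here is a plain-Python sketch of the five measures. The function names and the sample inputs are illustrative assumptions. Note that Euclidean and Manhattan are distances (smaller means more similar), while Pearson, Jaccard, and cosine are similarities (larger means more similar).

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length rating lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b) if sd_a and sd_b else 0.0


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def jaccard(a, b):
    """Overlap between two sets, e.g. the sets of items two users clicked."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Non-binary example: two users' star ratings for the same three items.
print(pearson([5, 4, 1], [4, 5, 2]), euclidean([5, 4, 1], [4, 5, 2]), manhattan([5, 4, 1], [4, 5, 2]))

# Binary example: items purchased by two users, as sets and as 0/1 vectors.
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}), cosine([1, 1, 0], [1, 0, 0]))
```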
How do I refine my model?
Normalization
• Some items are rated significantly higher (e.g., blockbuster movies, Oscar winners)
• Some users rate consistently lower (or higher) than the norm
• Ratings can change over time
• Need to offset per user
• Need to offset per item
• Ex: The mean rating across all users for item x is some value. How does it differ from the mean rating across all items? How does my rating differ from the mean rating of that item?
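One simple way to apply these offsets is a baseline of the form global mean + user offset + item offset. The sketch below is the plain, unregularized version on made-up data; the names and the four ratings are assumptions for illustration.

```python
# Hypothetical (user, item, rating) triples.
ratings = [
    ("ann", "Finding Nemo", 5),
    ("ann", "Mission Impossible", 3),
    ("bob", "Finding Nemo", 3),
    ("bob", "Mission Impossible", 1),
]

# Mean rating across all users and all items.
global_mean = sum(r for _, _, r in ratings) / len(ratings)


def mean_offset(index):
    """Mean rating per user (index 0) or per item (index 1), minus the global mean."""
    groups = {}
    for row in ratings:
        groups.setdefault(row[index], []).append(row[2])
    return {key: sum(vals) / len(vals) - global_mean for key, vals in groups.items()}


user_offset = mean_offset(0)  # how far each user rates from the norm
item_offset = mean_offset(1)  # how far each item is rated from the norm


def baseline(user, item):
    """Expected rating from the offsets alone; unknown users or items get offset 0."""
    return global_mean + user_offset.get(user, 0.0) + item_offset.get(item, 0.0)


# What is left after normalization is what a CF model would then try to explain.
residuals = [(u, i, r - baseline(u, i)) for u, i, r in ratings]
print(user_offset, item_offset, residuals)
```

Here ann rates one star above the norm and bob one star below, so after subtracting the baseline their ratings look identical.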
Capturing data trends
• Rating distributions:
  • ratings aren’t random, they follow a distribution, so model this distribution
• Feature importance: You can regress on your feature vectors to get an understanding of which features impact ratings
• Feature generation: Characterize your users and create one-hot features (this can save a lot of time, and help with cold-start problems)
Temporal factors
• There can be an upward trend of ratings over time
• Seasonal shifts due to holidays, awards, etc.
• Anchoring (e.g., an item is judged against a previous iteration or version of that item)
Conclusions
• Think about your data, your capabilities, and your needs prior to creating a recommendation system
• Consider the pros and cons of each type
• Refine your model thoughtfully