Jane Recommendation Engines

Adam Rogers, Data Scientist at Jane.com Jane Recommendation Engine

Transcript of Jane Recommendation Engines

Page 1: Jane Recommendation Engines

Adam Rogers, Data Scientist at Jane.com

Jane Recommendation Engine

Page 2: Jane Recommendation Engines

Jane Recommendation Overview

Page 3: Jane Recommendation Engines

Why Recommendations?

Amazon: 35% of sales come from recommendations (2006)

Netflix estimates that 75 percent of viewer activity is driven by recommendations (2013, Wired)

Page 4: Jane Recommendation Engines

How does it work?

[Diagram: Application, User Events, Kinesis, three Lambda functions, DB]
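As a rough illustration of the Kinesis-to-Lambda step, here is a minimal sketch of a Lambda handler that decodes user events from a Kinesis batch. The base64-encoded `record["kinesis"]["data"]` field is the standard shape of a Kinesis-triggered Lambda event. The event schema (`user_id`, `product_id`, `action`), the `make_record` helper, and the returned counts are assumptions for illustration, not Jane's actual pipeline.

```python
import base64
import json
from collections import Counter


def handler(event, context):
    """Decode user events from a Kinesis-triggered Lambda invocation."""
    counts = Counter()
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded under record["kinesis"]["data"].
        payload = base64.b64decode(record["kinesis"]["data"])
        user_event = json.loads(payload)
        # Hypothetical schema: {"user_id": ..., "product_id": ..., "action": ...}
        counts[(user_event["product_id"], user_event["action"])] += 1
    # A real function would write to the DB here; this sketch only returns counts.
    return {f"{product}:{action}": n for (product, action), n in counts.items()}


def make_record(user_event):
    """Build a fake Kinesis record for local testing (hypothetical helper)."""
    data = base64.b64encode(json.dumps(user_event).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}
```

Calling `handler({"Records": [make_record({...})]}, None)` locally exercises the decoding path without any AWS resources.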

Page 5: Jane Recommendation Engines

Collaborative Filtering

Amazon’s “Users also Purchased”

Recommend products based on shared activity with other users

Predicts what other product-user mappings are likely based on current ones

www.amazon.com
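The "shared activity" idea can be sketched with plain co-occurrence counts: two products are related if the same users bought both. This is a toy, single-machine illustration with made-up data. The function names are hypothetical, and a production system would typically normalise the raw counts rather than use them directly.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(purchases):
    """Count how many users bought each pair of products."""
    counts = defaultdict(lambda: defaultdict(int))
    for products in purchases.values():
        # De-duplicate so a repeat purchase by one user counts once.
        for a, b in combinations(sorted(set(products)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def also_purchased(product, purchases, n=3):
    """Return up to n products most often bought alongside `product`."""
    neighbours = cooccurrence_counts(purchases).get(product, {})
    # Highest count first; ties broken alphabetically for stable output.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Made-up purchase histories, keyed by user.
purchases = {
    "alice": ["scarf", "boots", "hat"],
    "bob": ["scarf", "boots"],
    "carol": ["scarf", "hat", "mug"],
    "dave": ["boots", "mug"],
}
```

With this data, `also_purchased("scarf", purchases, n=2)` ranks "boots" and "hat" (two shared buyers each) ahead of "mug" (one).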

Page 6: Jane Recommendation Engines
Page 7: Jane Recommendation Engines

The Tools: Spark, Mahout, CloudSearch

Spark:

Fast Parallel Data Processing and Machine Learning

Scales to massive amounts of data

Mahout:

Parallel Linear Algebra (Matrix Operations) and Machine Learning

Spark and Mahout together enable fast collaborative filtering on massive datasets

CloudSearch:

AWS’s fast full-text search engine, built on Solr

CloudSearch lets you run weighted queries on recommended products, so you can use multiple facets and actions in your recommendations
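The weighting idea can be shown without any search engine: score each candidate product per action type, then blend the scores with per-action weights. The sketch below is plain Python, not the CloudSearch API. The action names, scores, and weights are all made up for illustration.

```python
def blend_scores(indicators, weights):
    """Combine per-action recommendation scores into one ranked product list."""
    combined = {}
    for action, scores in indicators.items():
        # Actions without a configured weight contribute nothing.
        w = weights.get(action, 0.0)
        for product, score in scores.items():
            combined[product] = combined.get(product, 0.0) + w * score
    # Highest blended score first; ties broken alphabetically.
    return sorted(combined, key=lambda p: (-combined[p], p))


# Hypothetical per-action scores for candidate products.
indicators = {
    "purchase": {"boots": 0.9, "hat": 0.4},
    "view": {"hat": 0.8, "mug": 0.6},
}
# Assumption: a purchase signal counts three times as much as a view.
weights = {"purchase": 3.0, "view": 1.0}
```

Here `blend_scores(indicators, weights)` puts "boots" first because its purchase signal outweighs the view-only evidence for "mug".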

Page 8: Jane Recommendation Engines

http://s6.postimg.org/r0m8bpjw1/recommender_architecture.png

Page 9: Jane Recommendation Engines

Jane’s Recommendation Challenges

“Cold Start Problem” To the Max

No long-lived products to use as baseline for new ones

Every day ⅓ of products are brand new

This means we need to use events from as far back as we reasonably can in our calculation

http://www.beautifulonraw.com/raw-food-blog/wp-content/uploads/2010/06/Shivering.jpg

Page 10: Jane Recommendation Engines

Other Types of Recommenders

Content

Popular

User Similarity

"Collaborative Filtering in Recommender Systems" by Moshanin - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:Collaborative_Filtering_in_Recommender_Systems.jpg#/media/File:Collaborative_Filtering_in_Recommender_Systems.jpg

Page 11: Jane Recommendation Engines

Content Recommendations

Recommend items that are similar to the given item

Based on information contained in the item - title, description, images, etc.

Avoids the “Cold Start” problem

However, a user may not want to buy two very similar things

Page 12: Jane Recommendation Engines

Word Embeddings

Page 13: Jane Recommendation Engines

Word Embeddings

http://spark-public.s3.amazonaws.com/neuralnets/images/Lecture4/turian.png

Page 14: Jane Recommendation Engines

Content Recommendations with Word Embeddings

Calculate word embeddings on text within product (description, title, tags, etc.)

Compute distances between “embedded” product information

Euclidean distance is poor in such high dimensions; try cosine, Mahalanobis, or others

N nearest neighbors to the product in question are your recommendation
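The three steps above can be sketched end to end in a few lines. This assumes each product is embedded as the average of its word vectors, which is one common choice and not necessarily the one used at Jane. The 3-dimensional vectors and the catalog are toy values invented for the example; real embeddings would come from a trained model and have hundreds of dimensions.

```python
import math

# Toy word vectors, made up for illustration only.
WORD_VECTORS = {
    "wool": [0.9, 0.1, 0.0],
    "scarf": [0.8, 0.2, 0.1],
    "knit": [0.85, 0.15, 0.05],
    "hat": [0.7, 0.3, 0.1],
    "ceramic": [0.0, 0.9, 0.2],
    "mug": [0.1, 0.8, 0.3],
}


def embed(text):
    """Embed product text as the average of its known word vectors."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in text")
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def nearest_products(target, catalog, n=2):
    """Return the n products whose text is closest to the target's text."""
    target_vec = embed(catalog[target])
    distances = [
        (cosine_distance(target_vec, embed(text)), name)
        for name, text in catalog.items()
        if name != target
    ]
    return [name for _, name in sorted(distances)[:n]]


# Hypothetical catalog: product id -> descriptive text.
catalog = {
    "p1": "wool scarf",
    "p2": "knit hat",
    "p3": "ceramic mug",
    "p4": "knit scarf",
}
```

For "p1" (wool scarf), the nearest neighbours are the other knitwear items, with the ceramic mug far behind. The brute-force scan is fine for a toy catalog; a large catalog would likely need an approximate nearest-neighbour index.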

Page 15: Jane Recommendation Engines

Improving Content Recommendations

Remove meaningless, common stopwords

Weight your embedded vectors on given criteria

Use category information

Get creative with your data - different patterns in each dataset
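The first two ideas, stopword removal and weighting, might look like the sketch below. The stopword list, the field names, and the choice to weight by product field (title versus description) are assumptions for illustration; the slide does not say which criteria were used.

```python
# A tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "with", "in"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def weighted_embedding(fields, field_weights, word_vectors):
    """Weighted average of word vectors across product fields.

    fields:        field name -> text, e.g. {"title": ..., "description": ...}
    field_weights: field name -> weight (fields not listed default to 1.0)
    word_vectors:  word -> vector (list of floats)
    """
    total = None
    weight_sum = 0.0
    for field, text in fields.items():
        w = field_weights.get(field, 1.0)
        for word in tokens(text):
            vec = word_vectors.get(word)
            if vec is None:
                continue  # skip words the embedding model does not know
            if total is None:
                total = [0.0] * len(vec)
            total = [t + w * x for t, x in zip(total, vec)]
            weight_sum += w
    if total is None:
        raise ValueError("no known words in any field")
    return [t / weight_sum for t in total]
```

With a title weight of 2.0, one title word pulls the product vector as hard as two description words.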

Page 16: Jane Recommendation Engines

Improving, cont.

Can “embed” images in a similar fashion using deep networks

Compute distance between embedded images

Combine image distances and text distances to give combined distance metric

Determine nearest neighbors from new distance metric
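One simple way to combine the two distances is a weighted sum, sketched below. The weighted-sum form, the `text_weight` parameter, and the precomputed distance tables are assumptions; the slide does not specify how the distances were combined. The sketch also assumes both distances are on comparable scales (for example, both cosine distances).

```python
def combined_distance(text_dist, image_dist, text_weight=0.5):
    """Blend text and image distances; text_weight=1.0 ignores images."""
    return text_weight * text_dist + (1.0 - text_weight) * image_dist


def nearest_by_combined(target, text_dists, image_dists, n=2, text_weight=0.5):
    """Rank other products by combined distance to `target`.

    text_dists / image_dists: target -> {other product -> distance}
    """
    scored = [
        (combined_distance(text_dists[target][other],
                           image_dists[target][other],
                           text_weight), other)
        for other in text_dists[target]
    ]
    return [name for _, name in sorted(scored)[:n]]


# Made-up precomputed distances from product p1 to three others.
text_dists = {"p1": {"p2": 0.1, "p3": 0.5, "p4": 0.3}}
image_dists = {"p1": {"p2": 0.9, "p3": 0.2, "p4": 0.3}}
```

With equal weights, "p4" wins because it is moderately close on both signals, while "p2" is close in text but far in appearance.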

Page 17: Jane Recommendation Engines

Summary

Recommendations are a powerful (and these days, standard and necessary) tool for improving customer interaction, conversion, etc.

Collaborative filtering is a proven algorithm for relevant recommendations (given lots of user data and products)

Great tools for building collaborative filtering recommendation systems exist (AWS, Spark, etc.) but you need to adapt to your specific needs

Content recommendations can supplement the weaknesses of collaborative filtering

Get creative to improve the quality of your recommendations

Page 18: Jane Recommendation Engines

Sources

http://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf

"Collaborative Filtering in Recommender Systems" by Moshanin - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:Collaborative_Filtering_in_Recommender_Systems.jpg#/media/File:Collaborative_Filtering_in_Recommender_Systems.jpg

https://aws.amazon.com/kinesis/streams/