Jane Recommendation Engines

Adam Rogers, Data Scientist at Jane.com

Jane Recommendation Engine

Jane Recommendation Overview

Amazon: 35% of sales come from recommendations (2006)

Netflix estimates that 75 percent of viewer activity is driven by recommendations (2013, Wired)

Why Recommendations?

How does it work?

[Diagram: Application user events -> Kinesis -> Lambda functions -> DB]
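To make the event flow above concrete, here is a minimal sketch of what one of the Lambda steps might look like when it is triggered by Kinesis. It is not the code used at Jane: the event field names (`user_id`, `product_id`, `action`) are assumptions for illustration, and the DB write is left out. The only detail taken from AWS itself is that each Kinesis record arrives base64-encoded under `record["kinesis"]["data"]`.

```python
import base64
import json


def parse_kinesis_records(event):
    """Decode the user events carried in a Kinesis-triggered Lambda event."""
    interactions = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded under record["kinesis"]["data"].
        payload = base64.b64decode(record["kinesis"]["data"])
        user_event = json.loads(payload)
        # user_id / product_id / action are hypothetical field names.
        interactions.append(
            (user_event["user_id"], user_event["product_id"], user_event["action"])
        )
    return interactions


def handler(event, context):
    """Lambda entry point; a real function would also write the rows to the DB."""
    interactions = parse_kinesis_records(event)
    return {"processed": len(interactions)}


def make_test_event(user_events):
    """Build an event shaped like a Kinesis-triggered Lambda event, for local runs."""
    return {
        "Records": [
            {
                "kinesis": {
                    "data": base64.b64encode(
                        json.dumps(e).encode("utf-8")
                    ).decode("ascii")
                }
            }
            for e in user_events
        ]
    }
```

Running `handler(make_test_event([...]), None)` locally exercises the decoding path without any AWS resources.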

Collaborative Filtering

Amazon’s “Users also Purchased”

Recommend products based on shared activity with other users

Predicts what other product-user mappings are likely based on current ones

www.amazon.com
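As a rough illustration of recommending from shared activity, the sketch below counts how often two products appear in the same user's history and scores unseen products by those counts. This is plain co-occurrence counting on made-up data, chosen for readability; it is not the Spark and Mahout job described in this talk, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(purchases):
    """Count how many users interacted with each pair of products.

    purchases: dict mapping user id -> set of product ids.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for products in purchases.values():
        for a, b in combinations(sorted(products), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user, purchases, counts, n=3):
    """Score products the user has not seen by co-occurrence with their own."""
    seen = purchases.get(user, set())
    scores = defaultdict(int)
    for product in seen:
        for other, count in counts[product].items():
            if other not in seen:
                scores[other] += count
    # Highest score first; ties broken by product id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:n]]


# Made-up activity data for illustration only.
PURCHASES = {
    "ann": {"scarf", "boots"},
    "bob": {"scarf", "boots", "hat"},
    "cat": {"scarf", "hat"},
    "dan": {"scarf"},
}
```

With this data, `recommend("ann", PURCHASES, cooccurrence_counts(PURCHASES))` suggests the hat, because other users who bought a scarf or boots also bought it.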

The Tools: Spark, Mahout, CloudSearch

Spark: Fast parallel data processing and machine learning

Scales to massive amounts of data

Mahout: Parallel linear algebra (matrix operations) and machine learning

Spark and Mahout together enable fast collaborative filtering on massive datasets

CloudSearch: AWS’s fast full-text search engine, built on Solr

CloudSearch lets you run weighted queries on recommended products, so you can use multiple facets and actions in your recommendations
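One way to picture a weighted query over several actions is to build a CloudSearch structured query in which each action gets its own field and boost. The sketch below only assembles the query string; the field naming scheme (`purchase_indicators`, `view_indicators`) and the boost values are assumptions, not Jane's actual index layout. It relies on the `term` operator with `field` and `boost` options and on `matchall` from the CloudSearch structured query syntax.

```python
def build_weighted_query(history_by_action, boosts):
    """Build a CloudSearch structured query from a user's recent activity.

    history_by_action: dict action -> list of product ids the user acted on.
    boosts: dict action -> positive integer weight for that action.
    Field names such as "purchase_indicators" are hypothetical.
    """
    terms = []
    for action in sorted(history_by_action):
        field = f"{action}_indicators"
        for product_id in history_by_action[action]:
            terms.append(
                f"(term field={field} boost={boosts[action]} '{product_id}')"
            )
    if not terms:
        # No history yet: fall back to matching every document.
        return "matchall"
    return "(or " + " ".join(terms) + ")"
```

The resulting string would be sent with the structured query parser selected, for example through the `search` call of boto3's `cloudsearchdomain` client with `queryParser="structured"`. Product ids are assumed to contain no quote characters.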

http://s6.postimg.org/r0m8bpjw1/recommender_architecture.png

Jane’s Recommendation Challenges

“Cold Start Problem” To the Max

No long-lived products to use as baseline for new ones

Every day ⅓ of products are brand new

This means we need to use events from as far back as we reasonably can in our calculation

http://www.beautifulonraw.com/raw-food-blog/wp-content/uploads/2010/06/Shivering.jpg

Other Types of Recommenders

Content

Popular

User Similarity

"Collaborative Filtering in Recommender Systems" by Moshanin - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:Collaborative_Filtering_in_Recommender_Systems.jpg#/media/File:Collaborative_Filtering_in_Recommender_Systems.jpg

Content Recommendations

Recommend items that are similar to the given item

Based on information contained in the item - title, description, images, etc.

Avoids the “Cold Start” problem

However, a user may not want to buy two very similar things

Word Embeddings

http://spark-public.s3.amazonaws.com/neuralnets/images/Lecture4/turian.png

Content Recommendations with Word Embeddings

Calculate word embeddings on text within product (description, title, tags, etc.)

Compute distances between “embedded” product information

Euclidean distance is poor in such high dimensions - try cosine, Mahalanobis, or others

N nearest neighbors to the product in question are your recommendation
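The three steps above can be sketched in a few lines. This is a toy version under stated assumptions: the word vectors are tiny made-up 3-dimensional values rather than trained embeddings, a product is embedded by simply averaging its word vectors, and cosine distance is used as the metric. Function and variable names are hypothetical.

```python
import math


def embed(text, word_vectors):
    """Average the vectors of the known words in a product's text."""
    vectors = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not vectors:
        return None
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / norm


def nearest_products(target_id, product_vectors, n=2):
    """Return the ids of the n products closest to target_id."""
    target = product_vectors[target_id]
    distances = [
        (cosine_distance(target, vec), pid)
        for pid, vec in product_vectors.items()
        if pid != target_id
    ]
    return [pid for _, pid in sorted(distances)[:n]]


# Made-up vectors and product text, for illustration only.
WORD_VECTORS = {
    "red": [1.0, 0.0, 0.2], "crimson": [0.9, 0.1, 0.2],
    "scarf": [0.1, 1.0, 0.0], "shawl": [0.2, 0.9, 0.1],
    "lamp": [0.0, 0.1, 1.0], "desk": [0.1, 0.0, 0.9],
}
PRODUCT_TEXT = {"p1": "red scarf", "p2": "crimson shawl", "p3": "desk lamp"}
PRODUCT_VECTORS = {pid: embed(t, WORD_VECTORS) for pid, t in PRODUCT_TEXT.items()}
```

Here `nearest_products("p1", PRODUCT_VECTORS, n=1)` picks the crimson shawl over the desk lamp, even though the two product texts share no words.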

Improving Content Recommendations

Remove meaningless, common stopwords

Weight your embedded vectors on given criteria

Use category information

Get creative with your data - different patterns in each dataset
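Two of the ideas above, stopword removal and weighting, might look like the following sketch. The stopword list is deliberately tiny, and weighting by product field (for example, counting title words more than description words) is just one possible choice of "given criteria"; both are assumptions for illustration.

```python
STOPWORDS = {"a", "an", "and", "the", "of", "for", "with", "in"}  # tiny illustrative list


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, and drop stopwords."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w and w not in STOPWORDS]


def weighted_embedding(fields, field_weights, word_vectors):
    """Weighted average of word vectors across a product's text fields.

    fields: dict field name -> text.
    field_weights: dict field name -> weight (defaults to 1.0 if missing).
    """
    total = None
    weight_sum = 0.0
    for name, text in fields.items():
        weight = field_weights.get(name, 1.0)
        for word in tokenize(text):
            vec = word_vectors.get(word)
            if vec is None:
                continue  # skip words without an embedding
            if total is None:
                total = [0.0] * len(vec)
            for i, x in enumerate(vec):
                total[i] += weight * x
            weight_sum += weight
    if total is None or weight_sum == 0:
        return None
    return [x / weight_sum for x in total]
```

Category information could be folded in the same way, for instance as an extra field with its own weight.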

Improving, cont.

Can “embed” images in a similar fashion using deep networks

Compute distance between embedded images

Combine image distances and text distances to give combined distance metric

Determine nearest neighbors from new distance metric
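A simple way to combine the two distances is a weighted sum, sketched below. The linear blend and the `text_weight` parameter are assumptions; the slides do not say how the distances were combined. The sketch also assumes both distances were precomputed and sit on comparable scales (for example, both cosine distances).

```python
def combined_distance(text_dist, image_dist, text_weight=0.5):
    """Blend a text distance and an image distance into one number."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    return text_weight * text_dist + (1.0 - text_weight) * image_dist


def nearest_by_combined(target, candidates, text_dists, image_dists,
                        n=3, text_weight=0.5):
    """Rank candidates by blended distance to the target product.

    text_dists / image_dists: dict (target, candidate) -> precomputed distance.
    """
    scored = [
        (
            combined_distance(
                text_dists[(target, c)], image_dists[(target, c)], text_weight
            ),
            c,
        )
        for c in candidates
        if c != target
    ]
    return [c for _, c in sorted(scored)[:n]]
```

Changing `text_weight` shifts the ranking between "reads alike" and "looks alike", so it is worth tuning against real click or purchase data.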

Summary

Recommendations are a powerful (and these days, standard and necessary) tool for improving customer interaction, conversion, etc.

Collaborative filtering is a proven algorithm for relevant recommendations (given lots of user data and products)

Great tools for building collaborative filtering recommendation systems exist (AWS, Spark, etc.) but you need to adapt to your specific needs

Content recommendations can supplement the weaknesses of collaborative filtering

Get creative to improve the quality of your recommendations

Sources

http://www.cs.umd.edu/~samir/498/Amazon-Recommendations.pdf

"Collaborative Filtering in Recommender Systems" by Moshanin - Own work. Licensed under CC BY-SA 3.0 via Commons - https://commons.wikimedia.org/wiki/File:Collaborative_Filtering_in_Recommender_Systems.jpg#/media/File:Collaborative_Filtering_in_Recommender_Systems.jpg

https://aws.amazon.com/kinesis/streams/
