Real-time recommendations for retail: Architecture, algorithms, and design
![Page 1: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/1.jpg)
REAL-TIME RECOMMENDATIONS FOR RETAIL: ARCHITECTURE, ALGORITHMS, AND DESIGN
Juliet Hougland and Jonathan Natkins
![Page 2: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/2.jpg)
Who Are We?
Jonathan Natkins: Field Engineer at WibiData. Before that, Cloudera Software Engineer. Before that, Vertica Software/Field Engineer.
Juliet Hougland: Data Scientist, previously at WibiData. MS in Applied Math. BA in Math-Physics.
![Page 3: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/3.jpg)
Recommendations in Retail
Personalized versus Non-Personalized
![Page 4: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/4.jpg)
![Page 5: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/5.jpg)
![Page 6: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/6.jpg)
Recommender Contexts
Taste History: based on everything you know about a user; interests over months/years
Current Taste: based on a user’s immediate history; interests over minutes/hours
Ephemeral: extreme version of current taste; for example, location
Demographic*: similar to taste history, but less subjective; geographic region, age bracket, etc.
![Page 7: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/7.jpg)
Why Does Real-Time Matter?
Relevancy
![Page 8: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/8.jpg)
I am a Special Snowflake
Natty
![Page 9: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/9.jpg)
Requirements for a Real-Time System
General System Requirements
Handle millions of customers/users
Support collection and storage of complex data (static and event-series)
Real-Time System Requirements
Quickly retrieve subsets of data for a single user
Aggregate/derive new, first-class data per user
![Page 10: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/10.jpg)
What is Kiji?
The Kiji project is a modular, open-source framework for building real-time applications that collect, store, and analyze entity-centric data
kiji.org
github.com/kijiproject
![Page 11: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/11.jpg)
![Page 12: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/12.jpg)
Three Challenges
Developing models for use in real-time
Scoring models in real-time
Deploying models into a production environment
![Page 13: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/13.jpg)
![Page 14: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/14.jpg)
![Page 15: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/15.jpg)
How Can We Make Real-Time Models?
Population interests change slowly
Individual interests change quickly
Models don’t need to be retrained frequently
Application of a model should be fast
![Page 16: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/16.jpg)
A Common Workflow
Train a model over the entire dataset
Save fitted model parameters to a file or another table
Access the model parameters when generating new recommendations based on new data
This is EXPENSIVE
![Page 17: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/17.jpg)
Developing Models
KijiExpress
Scala interface for interacting with Kiji data
Uses Scalding for designing complex dataflows
Model Lifecycle
Allows analysts and data scientists to break apart a model into phases
![Page 18: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/18.jpg)
Scoring Models in Real-Time
Batch isn’t real-time
![Page 19: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/19.jpg)
![Page 20: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/20.jpg)
![Page 21: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/21.jpg)
Scoring Models in Real-Time
Batch isn’t real-time
[Chart: Number of Users against Number of Interactions, annotated "A few users with many interactions" and "A lot of users with few interactions"]
![Page 22: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/22.jpg)
![Page 23: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/23.jpg)
![Page 24: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/24.jpg)
![Page 25: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/25.jpg)
Fresheners Compute Lazily
Client
KijiScoring Server HBase
Read a column
Get from HBase
Freshness Policy
NO (not fresh): run the Scorer
![Page 26: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/26.jpg)
Fresheners Compute Lazily
Client
KijiScoring Server HBase
Read a column
Get from HBase
Freshness Policy
Scorer
Yes, return to client
Write back for next time
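The lazy read path in the freshener diagram can be sketched in a few lines of Python. This is only an illustration of the pattern and not the KijiScoring API: the class `FreshenedColumn`, the age-based freshness rule, the in-memory dict standing in for HBase, and the injectable `clock` are all assumptions made for the example.

```python
import time


class FreshenedColumn:
    """Read-through cache: recompute a stored value only when a read finds it stale."""

    def __init__(self, store, scorer, max_age_seconds, clock=time.time):
        self.store = store              # stands in for HBase: {key: (value, written_at)}
        self.scorer = scorer            # hypothetical scorer: function from key to a fresh value
        self.max_age = max_age_seconds  # hypothetical freshness policy: maximum allowed age
        self.clock = clock              # injectable so the behaviour can be tested

    def is_fresh(self, written_at):
        # Freshness policy: a value is fresh if it was written recently enough.
        return self.clock() - written_at <= self.max_age

    def read(self, key):
        cached = self.store.get(key)                 # "Get from HBase"
        if cached is not None and self.is_fresh(cached[1]):
            return cached[0]                         # fresh: "Yes, return to client"
        value = self.scorer(key)                     # stale or missing: run the scorer
        self.store[key] = (value, self.clock())      # "Write back for next time"
        return value
```

Nothing is computed until a client actually reads the column, so the many users with few interactions cost nothing between visits.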
![Page 27: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/27.jpg)
Kiji Application Stack
![Page 28: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/28.jpg)
Deployment Challenges
![Page 29: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/29.jpg)
Kiji Model Repository
Link between application and models
Stores Freshener metadata: FreshnessPolicy, Scorer, attached column, location of trained model
Stores Scorer code
Code repository makes model scoring code available to the application from a central location
New models can be deployed to the Model Repository and made immediately available to the application
![Page 30: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/30.jpg)
Kiji Model Repository
![Page 31: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/31.jpg)
Retail Recommendation
![Page 32: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/32.jpg)
Types of Recommenders
Recommendation Algorithms
Collaborative Filtering Methods: Memory-Based, Model-Based
Content-Based Methods
![Page 33: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/33.jpg)
Content-Based Recommenders
Orange-Nosed
Lab Assistant
Meeps a lot
Build models around entities using features that we think reflect inherent characteristics
![Page 34: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/34.jpg)
Content-Based Recommenders
safer
faster knife
![Page 35: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/35.jpg)
Pandora: Content-Based
Expertly Characterized Music
![Page 36: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/36.jpg)
Collaborative Filtering
Represent user-item affinities as a sparse matrix
Users ≈ Rows (e.g., Beaker)
Items ≈ Columns (e.g., Banana Slicer, Pineapple Slicer)
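A sparse user-item matrix can be held as a nested dictionary that stores only the observed pairs. The sketch below is one possible in-memory layout, not how Kiji stores data; the function name `build_affinities`, the second user `bunsen`, and the affinity values are made up for illustration.

```python
from collections import defaultdict


def build_affinities(events):
    """Return {user: {item: affinity}} from (user, item, affinity) tuples.

    Only observed user-item pairs are stored, which is what makes the matrix sparse.
    """
    matrix = defaultdict(dict)
    for user, item, affinity in events:
        matrix[user][item] = affinity
    return dict(matrix)


# Hypothetical events: rows are users, columns are items.
events = [
    ("beaker", "banana_slicer", 5.0),
    ("beaker", "pineapple_slicer", 3.0),
    ("bunsen", "banana_slicer", 4.0),
]
affinities = build_affinities(events)
```

A missing key simply means "no known affinity", so memory grows with the number of interactions rather than with users times items.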
![Page 37: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/37.jpg)
Aspirational Ratings
I put in my queue… I actually watch
![Page 38: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/38.jpg)
![Page 39: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/39.jpg)
Collaborative Filtering: How It Works
Simple aggregate predictors
Similar Users
Similar Products
![Page 40: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/40.jpg)
Similar Entities
What do we mean by similar?
Jaccard Index: a measure of set similarity
Cosine Similarity: the cosine of the angle between two vectors
Pearson Correlation: statistical measure, similar to cosine
Naively, we could compare every entity to every other
…But that would not scale well with increasing numbers of entities, since the number of pairs grows quadratically
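For reference, the three similarity measures named on this slide can be written out directly. This is a plain-Python sketch using the textbook definitions; the function names are chosen for the example, and a production system would more likely use a vectorized library.

```python
import math


def jaccard(a, b):
    """Jaccard index of two sets: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([x - mean_u for x in u], [y - mean_v for y in v])
```

Jaccard suits binary events such as "bought or not", while cosine and Pearson use the rating values themselves.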
![Page 41: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/41.jpg)
Building the Similarity Matrix
![Page 42: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/42.jpg)
Collaborative Filtering: Is This Useful?
Problem: Too much data!
Tracking user preferences and all their events generates huge amounts of data
Problem: Too little data!
Dimensions of user-space and item-space are usually very large
More variables make it more difficult to generate user preferences
Problem: Cold start
If you don’t know anything about a user, what should you recommend?
Problem: More ratings means slower computations
Identifying neighborhoods of entities is expensive
![Page 43: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/43.jpg)
Collaborative Filtering: Why Is It Useful?
Because it works
Content-agnostic
All that matters is co-occurrence of events
![Page 44: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/44.jpg)
Amazon: Item-Item Collaborative Filtering
Used for personalized recommendations
Fill screen real estate with related items
Produces specific, but non-creepy recommendations
Linden, G.; Smith, B.; York, J., "Amazon.com recommendations: item-to-item collaborative filtering," IEEE Internet Computing, vol. 7, no. 1, pp. 76-80, Jan/Feb 2003
![Page 45: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/45.jpg)
Item-Item Collaborative Filtering
Beaker buys a banana slicer
Then:
Generate list of candidate items to predict ratings for
Predict ratings for candidate items
Select Top-N items
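The three steps on this slide can be sketched as a single function. This is a minimal illustration, not the presenters' implementation: it assumes a precomputed item-item similarity table, uses a similarity-weighted average as the predictor, and the item names and numbers in the example are invented.

```python
import heapq


def recommend(user_ratings, item_sims, n=3):
    """Item-item collaborative filtering: candidates, predicted ratings, top N.

    user_ratings: {item: rating} for one user.
    item_sims: {item: {other_item: similarity}}, assumed precomputed.
    """
    # Step 1: candidates are neighbors of the items the user rated, minus those items.
    candidates = {
        neighbor
        for item in user_ratings
        for neighbor in item_sims.get(item, {})
        if neighbor not in user_ratings
    }

    # Step 2: predict a rating as the similarity-weighted average of the user's ratings.
    def predict(candidate):
        num = den = 0.0
        for item, rating in user_ratings.items():
            sim = item_sims.get(item, {}).get(candidate, 0.0)
            num += sim * rating
            den += abs(sim)
        return num / den if den else 0.0

    scored = [(predict(c), c) for c in candidates]

    # Step 3: keep the N candidates with the highest predicted rating.
    return [item for _, item in heapq.nlargest(n, scored)]


# Hypothetical data for one user.
beaker = {"banana_slicer": 5.0, "garlic_press": 1.0}
sims = {
    "banana_slicer": {"pineapple_slicer": 0.9, "apple_corer": 0.4, "garlic_press": 0.1},
    "garlic_press": {"apple_corer": 0.6, "pineapple_slicer": 0.1, "banana_slicer": 0.1},
}
top = recommend(beaker, sims, n=2)
```

Here the pineapple slicer ranks first because it is most similar to the highly rated banana slicer.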
![Page 46: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/46.jpg)
Accessing External Data
KeyValueStore API enables external data access when applying a model
External data might be…
Trained model parameters
Hierarchical/Taxonomic data
Geo-lookup
Store external data flexibly
Text files, sequence files, Kiji tables, etc.
Data access is decoupled from use during execution
If the data doesn’t fit in memory, put it in a table
![Page 47: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/47.jpg)
![Page 48: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/48.jpg)
![Page 49: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/49.jpg)
How Much Less Work Can We Do?
We can choose a predictor that allows us to truncate a sum
There are two ways terms in the sum of our predictor can be small:
No rating
Small similarity
Ignore unrated items
![Page 50: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/50.jpg)
How Much Less Work Can We Do?
We can choose a predictor that allows us to truncate a sum
There are two ways terms in the sum of our predictor can be small:
No rating
Small similarity
Ignore dissimilar items
![Page 51: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/51.jpg)
How Much Less Work Can We Do?
If we only present a few recommendations, we don’t need to predict ratings for all items
Choose your candidate set to estimate ratings wisely or infer from nearest neighbors
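The truncation idea from these slides can be made concrete with a weighted-average predictor. The sketch below is an assumption about what such a predictor might look like, not the one used in the talk: the function name, the `k` and `min_sim` parameters, and the sample values are all hypothetical.

```python
import heapq


def truncated_prediction(user_ratings, sims_to_candidate, k=2, min_sim=0.1):
    """Predict one candidate's rating from a truncated similarity-weighted sum.

    user_ratings: {item: rating} for one user.
    sims_to_candidate: {item: similarity between that item and the candidate}.
    Returns None when no term survives truncation.
    """
    # Ignore unrated items: only items the user rated can contribute a term.
    # Ignore dissimilar items: drop terms whose similarity is below min_sim.
    terms = [
        (sims_to_candidate.get(item, 0.0), rating)
        for item, rating in user_ratings.items()
        if sims_to_candidate.get(item, 0.0) >= min_sim
    ]
    # Truncate the sum to the k most similar rated items.
    top = heapq.nlargest(k, terms)
    den = sum(sim for sim, _ in top)
    if not den:
        return None
    return sum(sim * rating for sim, rating in top) / den


# Hypothetical values: "d" is very similar but unrated, "c" is rated but dissimilar.
ratings = {"a": 5.0, "b": 3.0, "c": 1.0}
sims = {"a": 0.8, "b": 0.2, "c": 0.05, "d": 0.9}
estimate = truncated_prediction(ratings, sims, k=2, min_sim=0.1)
```

Only the terms for `a` and `b` are summed, so the cost per prediction is bounded by `k` regardless of catalog size.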
![Page 52: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/52.jpg)
Organizing Data in Item-Item CF
![Page 53: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/53.jpg)
Accessing Data During Freshening
![Page 54: Real-time recommendations for retail: Architecture, algorithms, and design](https://reader036.fdocuments.in/reader036/viewer/2022062301/56815c41550346895dca42ec/html5/thumbnails/54.jpg)
Want to Know More?
The Kiji Project
kiji.org
github.com/kijiproject
Questions about this presentation?
Twitter: @JulietHougland or @nattyice
Email: [email protected]