Graph Based Recommendation Systems at eBay
planet-cassandra -
Modeling taste with Cassandra
Affinity is based on user tastes, preferences, and interests
What is a taste profile?
Stuff I like
Stuff I don’t like
Operational definition: the set of things you like and dislike
Challenge: how do you build a taste profile for someone?
Inferring correlations
Thesis: Likes are correlated
[Figure: graph of nodes A, B, C, D, E, with one edge marked "?"]
1) User A: Democrat, likes arugula
2) User B: Republican, dislikes arugula
3) User C indicates: Democrat
What would we infer is User C’s affinity for arugula?
Answer: User C would like arugula
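Inferring correlations
[Figure: scatter plot of <affinity for Obama, affinity for arugula>. Axis labels: Dislike Obama, Like Obama, Dislike arugula, Like arugula. Plotted points: <-3, -3>, <-2, -1.5>, <1, 1>, <3, 2.5>, <2, ?>. Point labels: User A, User B, User C.]
If someone’s affinity for Obama is 2.0, what is their affinity for arugula?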
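Discovering latent factors
[Figure: the same scatter plot with an additional axis labeled Liberal and Conservative. Axis labels: Dislike Obama, Like Obama, Dislike arugula, Like arugula. User points (labels User A, User B, User C): <-3, -3>, <-2, -1.5>, <1, 1>, <3, 2>, <2, 1.5>. Item points: Arugula <4, 4>, Iceberg <-4, -4>, GOP <-5, -5>, Obama <5, 5>.]
Predict 1.5 for how much this person will like arugula.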
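One way to land near the predicted 1.5 is an ordinary least-squares line through the four fully known points on this slide. This is only an illustrative sketch: the slides do not say how the Liberal/Conservative axis was fitted, so the choice of least squares and the helper name `fit_line` are assumptions.

```python
# Illustrative sketch: ordinary least squares is an assumption here;
# the slides do not state how the latent axis was fitted.

def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# <affinity for Obama, affinity for arugula> pairs taken from the slide.
known = [(-3, -3), (-2, -1.5), (1, 1), (3, 2)]
slope, intercept = fit_line(known)

# Predict the arugula affinity of someone whose Obama affinity is 2.0.
predicted_arugula = slope * 2.0 + intercept
print(round(predicted_arugula, 2))
```

With these four points the fit gives roughly 1.47, which is consistent with the slide's rounded prediction of 1.5.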
Taste space = many latent factors
[Figure: three-dimensional taste space with axes Liberal/Conservative, Masculine/Feminine, and Extroverted/Introverted. Labeled coordinates: <0.5, 2.4, -0.4>, <-0.5, -3.1, 0.1>, <0.7, 4.4, -0.1>. Point labels: A, B.]
What is a taste profile?
Stuff I like (close to me in taste space)
Stuff I don’t like (far away in taste space)
Operational definition: a coordinate in taste space
Challenge: how do you calculate taste coordinates?
Calculating taste coordinates
[Figure: graph of nodes A, B, C, D, E with known coordinates <1, -2>, <1, -0.5>, <-1, 2>, <1, -1>, edges weighted 2 or -2, and one node with unknown coordinates <x, y>.]
Edge weight = dot product of nodes, to constrain similar items to be close to each other.
Assume edge weights of:
+2 = “love”
-2 = “hate”
Democratic node must solve:
1*x - 2*y = 2 (edge from A)
1*x - 1*y = 2 (edge from C)
Solution = <2, 0>
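The two edge equations for the Democratic node can be checked with a few lines of code. The sketch below uses Cramer's rule on the 2x2 system; the function name `solve_2x2` is hypothetical, and the slides do not say which solver the real system uses.

```python
# Sketch: solve for the unknown node <x, y> from its two "love" edges.
# Each edge states that dot(neighbor, unknown) == edge weight.

def solve_2x2(a, b, weights):
    """Solve dot(a, v) = weights[0] and dot(b, v) = weights[1] for v."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("neighbor coordinates are parallel; no unique solution")
    x = (weights[0] * b[1] - a[1] * weights[1]) / det
    y = (a[0] * weights[1] - weights[0] * b[0]) / det
    return x, y


node_a = (1, -2)   # coordinates of A from the slide
node_c = (1, -1)   # coordinates of C from the slide
democratic = solve_2x2(node_a, node_c, (2, 2))
print(democratic)  # (2.0, 0.0)
```

A node with more edges than dimensions gives more equations than unknowns, so an exact solution will generally not exist and some best-fit method would be needed instead.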
Updating taste coordinates
[Figure: two copies of the taste graph for nodes A, B, C, with edges weighted 2 or -2. One copy shows coordinates <0.75, -2.5>, <1, -1>, <1, -0.5>, <1, -0.5>, <-1, 0.5>, <-1, 2>; the other shows <1, -2>, <1, -1>, <1, -1>, <1, -0.5>, <-1, 0.5>, <-1, 2>.]
User A purchases a camera...
Resulting in blue coordinates changing.
v1 System overview - Model updates
[Diagram: Rec. Engine, Updater, Taste graph, and Reco. DB (User -> coord, Item -> coord)]
1) Receive event (e.g., Purchase)
2a) Write Purchase edge
2b) Read other edges for this user and item
3) Write user and item coordinates
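The update steps above can be sketched with in-memory dictionaries standing in for the taste graph and the reco DB. Everything here is an assumption made for illustration: the names `taste_graph`, `reco_db`, `fit_coordinate`, and `handle_event` are hypothetical, coordinates are two-dimensional, and the least-squares refit is one plausible reading of "edge weight = dot product", not the documented algorithm.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the taste graph and the reco DB.
taste_graph = defaultdict(list)  # node id -> [(neighbor id, edge weight)]
reco_db = {}                     # node id -> (x, y) taste coordinate


def fit_coordinate(edges):
    """Least-squares <x, y> with dot(neighbor, <x, y>) close to each weight."""
    sxx = sxy = syy = bx = by = 0.0
    for neighbor, weight in edges:
        nx, ny = reco_db[neighbor]
        sxx += nx * nx
        sxy += nx * ny
        syy += ny * ny
        bx += nx * weight
        by += ny * weight
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # too few independent edges to pin down a coordinate
    return ((bx * syy - sxy * by) / det, (sxx * by - sxy * bx) / det)


def handle_event(user_id, item_id, weight):
    # 2a) Write the new edge, once per endpoint.
    taste_graph[user_id].append((item_id, weight))
    taste_graph[item_id].append((user_id, weight))
    # 2b) Read each endpoint's edges, then 3) write its new coordinate.
    for node in (user_id, item_id):
        edges = [(n, w) for n, w in taste_graph[node] if n in reco_db]
        coord = fit_coordinate(edges)
        if coord is not None:
            reco_db[node] = coord


# Usage with the numbers from the "Calculating taste coordinates" slide.
reco_db.update({"item_a": (1.0, -2.0), "item_c": (1.0, -1.0)})
taste_graph["user_u"].append(("item_a", 2))
handle_event("user_u", "item_c", 2)
print(reco_db["user_u"])
```

With "love" edges to both items, the user's coordinate comes out at the same <2, 0> as the worked example, while the item with only one usable edge keeps its old coordinate.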
v1 System overview - Rec serving
[Diagram: Rec. Engine, Updater, Taste graph, and Reco. DB (User -> coord, Item -> coord)]
1) Page load requests recommendations
2) Rec. engine finds other cameras close to user’s coordinates
3) Recommendations shown to user
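Step 2 can be illustrated as a nearest-neighbor lookup in taste space. The sketch below assumes Euclidean distance as the meaning of "close" and scans every candidate, which is fine for a toy example but is not a claim about how the production engine searches 2 billion items. The function name `recommend` and the camera ids are hypothetical.

```python
import heapq
import math


def recommend(user_coord, item_coords, k=3):
    """Return the ids of the k items nearest to user_coord."""
    return heapq.nsmallest(
        k,
        item_coords,
        key=lambda item_id: math.dist(user_coord, item_coords[item_id]),
    )


# Hypothetical candidate cameras with two-dimensional taste coordinates.
cameras = {
    "camera_1": (1.0, -0.5),
    "camera_2": (-1.0, 2.0),
    "camera_3": (1.0, -1.0),
}
top = recommend((1.0, -2.0), cameras, k=2)
print(top)  # ['camera_3', 'camera_1']
```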
v1 Taste Graph data size
40 billion edges
2 billion item nodes
200 million user nodes
5TB of data, takes up 10TB with Replication Factor of 2
We expect this to quadruple next year as we get more events and add new types of edges
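The sizing figures above combine as simple arithmetic. The sketch below only restates the slide's numbers; it assumes decimal terabytes and applies the 200-byte vector size from the schema slide, ignoring any per-row storage overhead.

```python
# Back-of-the-envelope sizing from the figures on the slides.
raw_tb = 5                # data before replication
replication_factor = 2
growth = 4                # "quadruple next year"

replicated_tb = raw_tb * replication_factor    # on-disk footprint today
next_year_tb = replicated_tb * growth          # footprint after quadrupling

# Share of the raw data taken by taste vectors (200 bytes per node),
# ignoring storage overhead: an assumption, not a measured figure.
node_count = 2_000_000_000 + 200_000_000
node_vectors_tb = node_count * 200 / 1e12

print(replicated_tb, next_year_tb, node_vectors_tb)
```

On these numbers the replicated footprint grows from 10 TB to 40 TB, and the taste vectors account for under half a terabyte of raw data, so the edges make up the bulk of the storage.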
v1 Taste Graph DB configuration
32 Linux machines
128GB RAM
1TB iSCSI SSD
10 GigE NIC
Cassandra version 1.0.8
8GB JVM heap space
Size-tiered compaction strategy
v1 Taste Graph schema
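User Edges: row key user_id; column names (timestamp, edge_type, item_id), …; column values <empty>
Item Edges: row key item_id; column names (timestamp, edge_type, user_id), …; column values <empty>
User Nodes: row key user_id; column name tastevector; column value 200 bytes (50 floats)
Item Nodes: row key item_id; column name tastevector; column value 200 bytes (50 floats)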
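The "200 bytes (50 floats)" figure matches 50 single-precision values at 4 bytes each. The sketch below shows one way such a taste vector could be packed into a 200-byte column value; the big-endian byte order and the function names are assumptions, since the slides do not describe the actual serialization.

```python
import struct

DIMENSIONS = 50
# 50 single-precision floats; big-endian byte order is an assumption.
VECTOR_FORMAT = f">{DIMENSIONS}f"


def encode_taste_vector(vector):
    """Pack a 50-dimensional taste vector into a 200-byte blob."""
    if len(vector) != DIMENSIONS:
        raise ValueError(f"expected {DIMENSIONS} floats, got {len(vector)}")
    return struct.pack(VECTOR_FORMAT, *vector)


def decode_taste_vector(blob):
    """Unpack a 200-byte blob back into a list of 50 floats."""
    return list(struct.unpack(VECTOR_FORMAT, blob))


blob = encode_taste_vector([0.5] * DIMENSIONS)
print(len(blob))  # 200
```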
v1 Real-time taste updates
Edges and nodes read per second
v1 Real-time taste updates
Edges and nodes written per second