Polyvalent recommendations
Uploaded by ted-dunning
1©MapR Technologies - Confidential
Polyvalent Recommendations
2
Multiple Kinds of Behavior for Recommending Multiple Kinds of Things
3
Contact:
– [email protected]
– @ted_dunning
Slides and such (available late tonight):
– http://www.slideshare.net/tdunning
Hashtags: #mapr #recommendations
4
Polyvalent recommendation is a new approach to recommendation that is both simpler and much more powerful than traditional approaches. The idea is to combine user, item, and content recommendations into a single query that can be implemented with a very simple architecture.
5
Recommendations
Often known (inaccurately) as collaborative filtering
Actors interact with items
– observe successful interaction
We want to suggest additional successful interactions
Observations inherently very sparse
6
Examples
Customers buying books (Linden et al.)
Web visitors rating music (Shardanand and Maes) or movies (Riedl et al.), (Netflix)
Internet radio listeners not skipping songs (Musicmatch)
Internet video watchers watching >30 s
7
Dyadic Structure
Functional
– Interaction: actor -> item*
Relational
– Interaction ⊆ Actors x Items
Matrix
– Rows indexed by actor, columns by item
– Value is count of interactions
Predict missing observations
8
Recommendation Basics
History:
User  Thing
1     3
2     4
3     4
2     3
3     2
1     1
2     1
9
Recommendation Basics
History as matrix:
     t1 t2 t3 t4
u1    1  0  1  0
u2    1  0  1  1
u3    0  1  0  1
(t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once, (t3, t4) once
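A minimal Python sketch of how the history list turns into the actor-by-item matrix. The (user, thing) pairs are copied from the History slide; the variable names are illustrative and not from the talk.

```python
# (user, thing) pairs copied from the History slide
history = [(1, 3), (2, 4), (3, 4), (2, 3), (3, 2), (1, 1), (2, 1)]

users = sorted({u for u, _ in history})
things = [1, 2, 3, 4]

# Functional view: actor -> set of items
by_user = {u: set() for u in users}
for u, t in history:
    by_user[u].add(t)

# Matrix view: rows indexed by user, columns by thing,
# value = number of interactions between that user and that thing
A = [[sum(1 for (u2, t2) in history if u2 == u and t2 == t) for t in things]
     for u in users]

for u, row in zip(users, A):
    print(f"u{u}", row)
```

The functional and matrix views carry the same information; the matrix form is the one the later slides multiply.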
10
A Quick Simplification
Users who do h
Also do r
User-centric recommendations
Item-centric recommendations
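The formulas on this slide did not survive transcription. One reading, assumed here rather than recovered from the slide, is that with A the user x item matrix and h a history vector over items, the user-centric form computes A'(A h) and the item-centric form computes (A'A) h, and the two agree. A small sketch using the matrix from the "History as matrix" slide; the history vector h is hypothetical.

```python
# User x item matrix from the "History as matrix" slide
A = [[1, 0, 1, 0],
     [1, 0, 1, 1],
     [0, 1, 0, 1]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    Nt = transpose(N)
    return [[sum(a * b for a, b in zip(row, col)) for col in Nt] for row in M]

h = [1, 0, 0, 0]  # hypothetical history: a user who has only done t1

At = transpose(A)
user_centric = matvec(At, matvec(A, h))   # find similar users first, then their items
item_centric = matvec(matmul(At, A), h)   # precompute item-item matrix, then apply h

print(user_centric)
print(item_centric)
```

The item-centric grouping is the one that lets the item x item matrix be computed offline and reused for every user.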
11
Recommendation Basics
Cooccurrence
     t1 t2 t3 t4
t1    2  0  2  1
t2    0  1  0  1
t3    2  0  2  1
t4    1  1  1  2
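The off-diagonal counts can also be accumulated one user at a time, without forming a matrix. This is a sketch of that pair-counting approach, assuming the per-user item lists from the History slide; the names are illustrative.

```python
from collections import Counter
from itertools import combinations

# Items seen by each user, from the History slide
by_user = {"u1": ["t1", "t3"], "u2": ["t1", "t3", "t4"], "u3": ["t2", "t4"]}

cooccur = Counter()
for items in by_user.values():
    # every unordered pair of distinct items in one user's history cooccurs once
    for a, b in combinations(sorted(set(items)), 2):
        cooccur[(a, b)] += 1  # stored once, as (a, b) with a < b

for pair, n in sorted(cooccur.items()):
    print(pair, n)
```

Pairs that never cooccur are simply absent from the counter, which is why this form suits very sparse observations.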
12
Problems with Raw Cooccurrence
Very popular items co-occur with everything
– Welcome document
– Elevator music
That isn’t interesting
– We want anomalous cooccurrence
13
Recommendation Basics
Cooccurrence
     t1 t2 t3 t4
t1    2  0  2  1
t2    0  1  0  1
t3    2  0  2  1
t4    1  1  1  2

         t3   not t3
t1        2   1
not t1    1   1
14
Root LLR Details
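In R
# Unnormalized entropy of a table of counts; the (k==0) term makes empty cells contribute 0
entropy = function(k) { -sum(k*log((k==0)+(k/sum(k)))) }
# Root LLR score of a contingency table k
rootLLr = function(k) { sqrt((entropy(rowSums(k)) + entropy(colSums(k)) - entropy(k)) / 2) }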
Like sqrt(mutual information * N/2)
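For readers without R, here is a direct Python translation of the two functions, offered as a sketch rather than a reference implementation. It keeps the division by 2 exactly as written on the slide; the function names and the clamp against rounding error are additions. The sample tables are the ones from the "Spot the Anomaly" slide.

```python
import math

def entropy(counts):
    """Unnormalized entropy: -sum(k * log(k / N)); zero counts contribute 0."""
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)

def root_llr(table):
    """Score a contingency table (list of rows) like the R rootLLr on the slide."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    cells = [k for r in table for k in r]
    g = entropy(rows) + entropy(cols) - entropy(cells)
    return math.sqrt(max(g, 0.0) / 2)  # clamp tiny negative rounding error

# Tables from the "Spot the Anomaly" slide
tables = [
    [[13, 1000], [1000, 100000]],
    [[1, 0], [0, 2]],
    [[1, 0], [0, 10000]],
    [[10, 0], [0, 100000]],
]
for t in tables:
    print(t, round(root_llr(t), 2))
```

The three tables with empty off-diagonal cells should come out close to the 0.98, 2.26 and 7.15 listed on that slide, and the first table lands near its 0.44.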
15
Spot the Anomaly
Root LLR is roughly like standard deviations
         A      not A
B        13     1000
not B    1000   100,000
Root LLR: 0.44

         A      not A
B        1      0
not B    0      2
Root LLR: 0.98

         A      not A
B        1      0
not B    0      10,000
Root LLR: 2.26

         A      not A
B        10     0
not B    0      100,000
Root LLR: 7.15
16
Threshold by Score
Cooccurrence
     t1 t2 t3 t4
t1    2  0  2  1
t2    0  1  0  1
t3    2  0  2  1
t4    1  1  1  2
17
Threshold by Score
Significant cooccurrence => Indicators
     t1 t2 t3 t4
t1    1  0  0  1
t2    0  1  0  1
t3    0  0  1  1
t4    1  0  0  1
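A sketch of how thresholding by score could produce indicators. Several things here are assumptions and not taken from the slides: each item pair gets a 2x2 table counted over users (both, first only, second only, neither), only pairs that cooccur more than chance are kept, and the cutoff of 0.9 is arbitrary. With only three users the result is not expected to reproduce the illustrative indicator matrix on the slide.

```python
import math
from itertools import permutations

def entropy(counts):
    total = sum(counts)
    return -sum(k * math.log(k / total) for k in counts if k > 0)

def root_llr(k11, k12, k21, k22):
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    g = entropy(rows) + entropy(cols) - entropy([k11, k12, k21, k22])
    return math.sqrt(max(g, 0.0) / 2)

# Users who touched each item, from the History slide
users_of = {"t1": {1, 2}, "t2": {3}, "t3": {1, 2}, "t4": {2, 3}}
n_users = 3
THRESHOLD = 0.9  # hypothetical cutoff, chosen only for this toy data

indicators = {item: [] for item in users_of}
for a, b in permutations(users_of, 2):
    k11 = len(users_of[a] & users_of[b])   # users with both a and b
    k12 = len(users_of[a]) - k11           # a without b
    k21 = len(users_of[b]) - k11           # b without a
    k22 = n_users - k11 - k12 - k21        # neither
    # the score is unsigned, so keep only pairs that cooccur more than chance
    if k11 * k22 > k12 * k21 and root_llr(k11, k12, k21, k22) >= THRESHOLD:
        indicators[a].append(b)

print(indicators)
```

The sign check matters because a pair that never cooccurs can score just as high as one that always does.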
18
Decomposition for Cooccurrence
Can use SVD for cooccurrence
But first one or two singular vectors just encode popularity … ignore those
V^T projects items into concept space, V projects back into item space
Thresholding reconstructed cooccurrence matrix is another way to get indicators
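In symbols, a sketch assuming the decomposition is applied to the user x item matrix A (the slide does not say which matrix is decomposed). The index set S, which skips the first j popularity-dominated singular vectors and keeps the next ones up to rank k, is notation introduced here for illustration.

```latex
A = U \Sigma V^{\top}
\quad\Longrightarrow\quad
A^{\top} A = V \Sigma^{2} V^{\top}

\hat{K} = V_{S}\, \Sigma_{S}^{2}\, V_{S}^{\top},
\qquad S = \{\, j+1, \dots, k \,\}
```

Thresholding the entries of the reconstructed matrix then plays the same role as thresholding by score.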
19
What’s right about this?
20
Virtues of Current State of the Art
Lots of well publicized history
– Netflix, Amazon, Overstock
Lots of support
– Mahout, commercial offerings like Myrrix
Lots of existing code
– Mahout, commercial codes
Proven track record
Well socialized solution
21
What’s wrong about this?
22
Cross Occurrence
We don’t have to do co-occurrence
We can do cross-occurrence
Result is cross-recommendation
23
Fundamental Algorithmics
Cooccurrence: K = A’A
A is users x items, K is items x items
Product has general shape of matrix
K tells us “users who interacted with x also interacted with y”
24
Fundamental Algorithmic Structure
Cooccurrence
Matrix approximation by factoring
LLR
25
But Wait ...
Does it have to be that way?
26
But why not ...
Why just dyadic learning?
Why not triadic learning?
Why not cross learning?
27
For example
Users enter queries (A)
– (actor = user, item = query)
Users view videos (B)
– (actor = user, item = video)
A’A gives query recommendation
– “did you mean to ask for”
B’B gives video recommendation
– “you might like these videos”
28
The punch-line
B’A recommends videos in response to a query
– (isn’t that a search engine?)
– (not quite, it doesn’t look at content or meta-data)
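A toy sketch of the cross-occurrence product B’A. The queries, videos and both matrices are hypothetical; the only requirement carried over from the slides is that A and B share the same users as rows. Raw counts are used for ranking here, where in practice one would presumably keep only anomalous cross-occurrences.

```python
# Hypothetical toy data: rows are the same three users in both matrices
queries = ["flamenco", "guitar"]
videos = ["v1", "v2", "v3"]

A = [[1, 0],   # user 0 searched "flamenco"
     [1, 1],   # user 1 searched both
     [0, 1]]   # user 2 searched "guitar"
B = [[1, 0, 0],   # user 0 watched v1
     [1, 1, 0],   # user 1 watched v1 and v2
     [0, 1, 1]]   # user 2 watched v2 and v3

def cross_occurrence(B, A):
    """Return B'A: rows indexed by B's items (videos), columns by A's items (queries)."""
    return [[sum(b_row[i] * a_row[j] for b_row, a_row in zip(B, A))
             for j in range(len(A[0]))]
            for i in range(len(B[0]))]

BtA = cross_occurrence(B, A)

def videos_for(query):
    """Videos watched by users who issued this query, most frequent first."""
    j = queries.index(query)
    scored = [(BtA[i][j], videos[i]) for i in range(len(videos))]
    return [v for score, v in sorted(scored, key=lambda x: (-x[0], x[1])) if score > 0]

print(videos_for("flamenco"))
print(videos_for("guitar"))
```

Nothing in this lookup reads video titles or descriptions, which is the sense in which it is not quite a search engine.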
29
Real-life example
Query: “Paco de Lucia”
Conventional meta-data search results:
– “hombres del paco” times 400
– not much else
Recommendation based search:
– Flamenco guitar and dancers
– Spanish and classical guitar
– Van Halen doing a classical/flamenco riff
30
Real-life example
31
Hypothetical Example
Want a navigational ontology?
Just put labels on a web page with traffic
– This gives A = users x label clicks
Remember viewing history
– This gives B = users x items
Cross recommend
– B’A = label to item mapping
After several users click, results are whatever users think they should be
32
But wait, there’s more!
33
[Figure: matrix labeled “users” and “things”]
34
[Figure: matrix labeled “users”, “thing type 1” and “thing type 2”]
35
36
Summary
Input: Multiple kinds of behavior on one set of things
Output: Recommendations for one kind of behavior with a different set of things
Cross recommendation is a special case
37
Now again, without the scary math
39
Input Data
User transactions
– user id, merchant id
– SIC code, amount
– Descriptions, cuisine, …
Offer transactions
– user id, offer id
– vendor id, merchant ids
– offers, views, accepts
Derived user data
– merchant ids
– anomalous descriptor terms
– offer & vendor ids
Derived merchant data
– local top40
– SIC code
– vendor code
– amount distribution
40
Cross-recommendation
Per merchant indicators
– merchant ids
– chain ids
– SIC codes
– indicator terms from text
– offer vendor ids
Computed by finding anomalous (indicator => merchant) rates
43
Search-based Recommendations
Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant ids
– Indicator industry (SIC) ids
– Indicator offers
– Indicator text
– Local top40
Sample query
– Current location
– Recent merchant descriptions
– Recent merchant ids
– Recent SIC codes
– Recent accepted offers
– Local top40
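A toy illustration of the document-and-query idea: each merchant document carries indicator fields, and a query assembled from a user's recent behavior is matched against them. Everything below is hypothetical, including the field names, the values, and the overlap-count score, which stands in for whatever relevance scoring a real search engine would apply. No search-engine API is used.

```python
# Hypothetical merchant documents; field names loosely follow the sample
# document slide, and all values are made up for illustration.
docs = [
    {"merchant_id": "m1", "description": "tapas bar",
     "indicator_merchants": {"m2", "m3"}, "indicator_sic": {"5812"},
     "indicator_terms": {"spanish", "wine"}},
    {"merchant_id": "m2", "description": "guitar shop",
     "indicator_merchants": {"m4"}, "indicator_sic": {"5733"},
     "indicator_terms": {"music"}},
]

# Query built from a user's recent behavior (also hypothetical)
query = {
    "indicator_merchants": {"m3", "m4"},   # recent merchant ids
    "indicator_sic": {"5812"},             # recent SIC codes
    "indicator_terms": {"wine", "music"},  # terms from recent merchant descriptions
}

def score(doc, query):
    """Count indicator values shared between the document and the query."""
    return sum(len(doc[field] & values) for field, values in query.items())

# Highest overlap first; ties broken by merchant id for a stable order
ranked = sorted(docs, key=lambda d: (-score(d, query), d["merchant_id"]))
for d in ranked:
    print(d["merchant_id"], score(d, query))
```

The point of the design is that recommendation becomes an ordinary query against indexed fields, so the serving side needs nothing beyond a search index.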
44
[Diagram: indexing architecture. Labels: complete history, cooccurrence (Mahout), item meta-data, Solr indexers, Solr indexing, index shards]
45
[Diagram: search architecture. Labels: user history, web tier, Solr search, Solr indexers, item meta-data, index shards]
46
Objective Results
At a very large credit card company
History is all transactions, all web interaction
Processing time cut from 20 hours per day to 3
Recommendation engine load time decreased from 8 hours to 3 minutes
Recommendation quality increased visibly
47
We are hiring!
48
Thank You