Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?

21
Preference Elicitation in Mangaki: Is Your Taste Kinda Weird? Jill-Jênn Vie June 22, 2016 1 / 21

Transcript of Preference Elicitation in Mangaki: Is Your Taste Kinda Weird?

Preference Elicitation in Mangaki:Is Your Taste Kinda Weird?

Jill-Jênn Vie

June 22, 2016

1 / 21

France is the 2nd manga consumer in the world

1. Japan - 500M volumes2. France - 13M volumes3. US - 9M volumes

Japan Expo 2015, summer festival about Japanese culture250k tickets over 4 dayse130 per personMore than e8M every year

2 / 21

Mangaki

Let’s use algorithms to discoverpearls of Japanese culture!

23 members:3 mad developers

1 graphic designer1 intern

Feb 2014 - PhD about Adaptive TestingOct 2014 - MangakiOct 2015 - Student Demo Cup winner, Microsoft prizeFeb 2016 - Japanese Culture Embassy Prize, Paris

We’re flying to Tokyo!

3 / 21

Mangaki

2000 users150 → 14000 works anime / manga / OST

290000 ratings fav / like / dislike / neutral / willsee / wontsee

People rate a few works Preference Elicitation

And receive recommendations Collaborative Filtering

4 / 21

Collaborative Filtering

ProblemUsers u = 1, . . . , n and items i = 1, . . . ,mEvery user u rates some items Ru(rui : rating of user u on item i)How to guess unknown ratings?

k -nearest neighbor

Similarity score between users:

score(u, v) =Ru · Rv

||Ru|| · ||Rv ||.

Let us find the k nearest neighbors of the userAnd recommend what they liked that he did not rate

5 / 21

Preference Elicitation

ProblemWhat questions to ask adaptively to a new user?

4 decksPopularityControversy (Reddit)

controversy(L,D) = (L + D)min(L/D,D/L)

Most likedPrecious pearls: few rates but almost no dislike

Problem: most people cannot rate the controversial items

6 / 21

Yahoo’s Bootstrapping Decision Trees

Good balance between like, dislike and unknown outcomes.

7 / 21

Matrix Completion

If the matrix of ratings is assumed low rank:

M =

R1R2...

Rn

= = CP

Every line Ru is a linear combination of few profiles P.

1. Explicable profiles

If P P1 : adventure P2 : romance P3 : plot twistAnd Cu 0,2 −0,5 0,6⇒ u likes a bit adventure, dislikes romance, loves plot twists.

2. We get user2vec & item2vec in the same space!⇒ An user likes items that are close to him.

8 / 21

Mangaki’s Explained Profiles

P1 loves stories for adults (Ghibli/SF), hates stories for teensPaprika : In the near future, a machine that allows to enteranother person’s dreams for psychotherapy treatment has beenrobbed. The doctors investigate.

P2 likes the most popular works, hates really weird worksSame-family homosexual romances:- Mira is a high school student in love with his father, Kyosuke.- They are involved both in a romantic and sexual relationship.- Trouble arises when Mira finds adoption papers.- Mira thinks [his dad] is cheating on him with a girl, which turns out tobe his mother.- Meanwhile, Mira is chased by his best friend, who is also in lovewith him.- And the end, it turns out that Kyosuke is his uncle and everything isright again.

9 / 21

user2vec: plotting every user on a map

10 / 21

user2vec: plotting every user on a map

11 / 21

Modelling Diversity: Determinantal Point Processes

Let’s sample a few items that are far from each other and askthem to the user.

12 / 21

Determinantal Point Processes

We want to sample over n itemsK : n × n similarity matrix over items (positive semidefinite)

P is a determinantal point process if Y is drawn such that:

∀A ⊂ Y , P(A ⊆ Y ) ∝ det(KA)

Example

K =

1 2 3 42 5 6 73 6 8 94 7 9 0

A = {1,2,4} will be includedwith probability prop. to

KA = det

1 2 42 5 74 7 0

13 / 21

Link with diversity

The determinant is the volume of the vectorsNon-correlated (diverse) vectors will increase the volumeWe need to sample k diverse elements efficiently

14 / 21

Results

We assume the eigenvalues of K are known.

Kulesza & Taskar, ICML 2011Sampling k diverse elements from a DPP of size n hascomplexity O(nk2).

Kang (Samsung), NIPS 2013

We found an algorithm ϵ-close with complexity O(k3 log(k/ϵ)).

Rebeschini & Karbasi, COLT 2015No, you did not.Your proof of complexity is false, but at least your algorithmsamples correctly.

Vie, RecSysFR 2016Please calm down guys, we’re all friends here.

15 / 21

Work in progress: Preference Elicitation in Mangaki

Keep the 100 most popular worksPick 5 diverse works using k -DPPAsk the user to rate themEstimate the user’s vectorFilter less informative works accordinglyRepeat.

16 / 21

Preference elicitation: sampling a diverse subset

17 / 21

Estimating according to the answers and filtering

18 / 21

Sampling again

19 / 21

Refining estimate

20 / 21

Thanks for listening!

@MangakiFR

research.mangaki.fr

We’re onGitHub!

21 / 21