[RecSys '13] Pairwise Learning: Experiments with Community Recommendation on LinkedIn


Amit Sharma*, Baoshi [email protected], [email protected]




Typical online recommendation interfaces

Community Recommendation on LinkedIn

Observed preference

User u joins a community y, giving a pair (u, y).

The recommendation problem

Given a set of (u, y) tuples, predict for each user u a set R(u) of recommended communities.

A content-based approach

Owing to the rich profile data for users, we use a content-based model that computes similarity between users and groups.

An intuitive logistic model (point-wise)

fu, fy: features of user u and community y

wi: parameters for the model

Communities that a user has joined are relevant.
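The slide's formula did not survive extraction, so the following is only a minimal sketch of a point-wise logistic recommender under stated assumptions. The element-wise product used as the user-community similarity feature, and the helper names `similarity_features`, `pointwise_relevance` and `recommend`, are hypothetical and not taken from the talk.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def similarity_features(f_u, f_y):
    # Assumption: one similarity feature per attribute, formed as the
    # element-wise product of user and community attribute values.
    return [a * b for a, b in zip(f_u, f_y)]

def pointwise_relevance(w, f_u, f_y):
    # Probability that community y is relevant to user u.
    x = similarity_features(f_u, f_y)
    return sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)))

def recommend(w, f_u, communities, k):
    # Score every community independently and return the top k names.
    ranked = sorted(
        communities,
        key=lambda name: pointwise_relevance(w, f_u, communities[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical data: attribute 0 = machine learning interest, 1 = Cornell education.
w = [1.0, 1.0]
user = [1.0, 0.0]
communities = {"ML Group": [1.0, 0.0], "Cornell Alumni": [0.0, 1.0]}
top = recommend(w, user, communities, k=1)
```

In training, each (u, y) join would be a positive example for this classifier; how negatives are sampled is not stated on the slide.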

Understanding implicit feedback from users

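[Figure: a ranked list of five recommendations, numbered 1 to 5, in which the user clicked item 2. The click implies that 2 is better than 1.]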

Can pairwise learning help for community recommendation?

● A reliable technique used in search engines. [Joachims 01]

● Has been proposed for some collaborative filtering models. [Rendle et al. 09, Pessiot et al. 07]

● Empirical evidence shows promising results. [Balakrishnan and Chopra 10]

Caveat

Learning time is quadratic in the number of communities.

How fast is the inference?

Outline

● Propose pairwise models for content-based recommendation

● Augment pairwise learning with a latent preference model

● Show both offline and online evaluation on LinkedIn data for our proposed models

Expressing pairwise preference

We establish a pair (yi, yj) if yi was ranked higher than yj and only yj was selected by the user.
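A minimal sketch of this pair-construction rule, assuming the input is the ranked list shown to the user plus the set of items the user selected. The function name `build_pairs` is hypothetical.

```python
def build_pairs(ranked, selected):
    """Return (y_i, y_j) pairs where y_i was ranked above y_j,
    y_j was selected by the user, and y_i was not."""
    chosen = set(selected)
    pairs = []
    for pos_j, y_j in enumerate(ranked):
        if y_j not in chosen:
            continue
        # Every unselected item shown above y_j loses to y_j.
        for y_i in ranked[:pos_j]:
            if y_i not in chosen:
                pairs.append((y_i, y_j))
    return pairs

# The earlier implicit-feedback example: five items shown, item 2 clicked.
pairs = build_pairs([1, 2, 3, 4, 5], [2])
```

Items ranked below the selected one produce no pairs, since the user may never have looked at them.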

We can define a ranking function h such that:

Building a pairwise logistic recommender

Maximizing the likelihood of observed preference among pairs:

Model 1: Feature Difference Model

Assuming h to be a linear function,

Equivalent to logistic classification with features (yj - yi)

Ranking: We can simply rank by computing the weighted sum of features for each community.
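A minimal sketch of the feature difference idea, assuming a linear h with no bias term and plain stochastic gradient ascent on the pairwise log-likelihood. The learning rate, epoch count and toy feature vectors are hypothetical; the talk does not say how the model was fitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def train_feature_difference(pairs, dim, lr=0.5, epochs=200):
    """pairs: list of (f_i, f_j), where community j was preferred over i."""
    w = [0.0] * dim
    for _ in range(epochs):
        for f_i, f_j in pairs:
            d = [b - a for a, b in zip(f_i, f_j)]   # features (yj - yi)
            g = 1.0 - sigmoid(dot(w, d))            # d/dw of log sigmoid(w . d)
            w = [w_k + lr * g * d_k for w_k, d_k in zip(w, d)]
    return w

def rank(w, communities):
    # No pairs needed at inference: one weighted sum per community.
    return sorted(communities, key=lambda name: dot(w, communities[name]), reverse=True)

communities = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
w = train_feature_difference([(communities["a"], communities["b"])], dim=2)
order = rank(w, communities)
```

Learning touches pairs, but ranking costs one dot product per community, which addresses the inference-speed caveat.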

Model 2: Logistic Loss Model

Assuming a more general ranking function:

Ranking: As long as we choose h to be a non-decreasing function, we can still rank by computing a weighted sum of features for each community.
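The ranking claim can be checked with a small sketch. The exact h used in the talk is not recoverable from the slide; here the sigmoid stands in as one strictly increasing choice, and the weights and feature vectors are hypothetical.

```python
import math

def weighted_sum(w, f):
    return sum(w_k * f_k for w_k, f_k in zip(w, f))

def h(x):
    # Assumed ranking function: the sigmoid, which is strictly increasing.
    return 1.0 / (1.0 + math.exp(-x))

def rank(communities, score):
    return sorted(communities, key=lambda name: score(communities[name]), reverse=True)

w = [0.8, -0.3, 1.5]  # hypothetical learned weights
communities = {
    "A": [1.0, 0.0, 0.2],
    "B": [0.1, 1.0, 0.9],
    "C": [0.5, 0.5, 0.0],
}

by_sum = rank(communities, lambda f: weighted_sum(w, f))
by_h = rank(communities, lambda f: h(weighted_sum(w, f)))
```

Both orderings agree because applying a monotone function to every score cannot swap any two of them.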

Pairwise learning improves the classification of pairs

...but the gains are only slight.

Task: For each pair, predict which community is more preferred by a user
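One way to score this task is pairwise accuracy: the fraction of held-out pairs whose preferred community gets the higher score. This is a sketch under that assumption; the slide does not name the metric it plots, and the weights and pairs below are hypothetical.

```python
def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def pairwise_accuracy(w, test_pairs):
    """test_pairs: list of (f_i, f_j), where community j was actually preferred."""
    correct = sum(1 for f_i, f_j in test_pairs if dot(w, f_j) > dot(w, f_i))
    return correct / len(test_pairs)

w = [1.0, 0.0]  # hypothetical weights
test_pairs = [
    ([0.0, 0.0], [1.0, 0.0]),  # preferred item scores higher: correct
    ([1.0, 0.0], [0.0, 1.0]),  # preferred item scores lower: wrong
]
accuracy = pairwise_accuracy(w, test_pairs)
```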

Digging deeper: Joining statistics for LinkedIn communities

FACT: Most users join different types of groups.

Possible hypothesis: There are different reasons for joining different types of groups.

Random sample, 1M users

Digging deeper: the effect of group types

[Figure: User1 and User2 each compare the same pair of groups, Cornell Alumni and ML Group. Each panel shows an Interest Feature and an Education Feature, a ">" comparison between the two groups, and a PREFERRED marker.]

When learning a single weight for each feature, varying preferences of users may cancel out the effects.
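A toy illustration of the cancellation, under simplifying assumptions: two features (interest, education), two users who see identical features for the two groups but prefer opposite ones, and a single shared weight vector fitted by batch gradient ascent on the pairwise log-likelihood. All numbers are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def train(pairs, dim, lr=0.5, epochs=100):
    """Batch gradient ascent; pairs are (f_other, f_preferred)."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for f_other, f_pref in pairs:
            d = [b - a for a, b in zip(f_other, f_pref)]
            g = 1.0 - sigmoid(dot(w, d))
            for k in range(dim):
                grad[k] += g * d[k]
        w = [w_k + lr * g_k for w_k, g_k in zip(w, grad)]
    return w

ml_group = [1.0, 0.0]        # matches on the interest feature
cornell_alumni = [0.0, 1.0]  # matches on the education feature

user1_pair = (cornell_alumni, ml_group)  # User1 prefers ML Group
user2_pair = (ml_group, cornell_alumni)  # User2 prefers Cornell Alumni

w_user1_only = train([user1_pair], dim=2)
w_both = train([user1_pair, user2_pair], dim=2)
```

Trained on User1 alone, the interest weight turns positive and the education weight negative. Trained on both users, the two opposite pairs contribute gradients that cancel, so both weights stay at zero and the model cannot separate the groups for either user.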

Different reasons for joining a community can be treated as a set of latent preferences within a user

[Figure: diagram relating three elements: User, Core preference, and Pair of communities.]

Model 3: Pairwise PLSI model

Extend the Probabilistic Latent Semantic Indexing recommendation model for pairwise learning [Hofmann 02]

We assume users are composed of a set of latent preferences. Each user differs in how she combines the available latent preferences.

Latent preferences over pairs help retain differing user preferences

[Figure: the same User1 and User2 comparison of Cornell Alumni and ML Group, with Interest Feature and Education Feature panels and ">" comparisons, now routed through two latent preferences, z1 and z2.]

User1 puts more weight on z1’s preference. User2 puts more weight on z2’s preference.

Some details about the model

Number of core preferences (Z): small, ~ {2, 4, 8}

Choosing probability models: Use logistic loss or feature difference for modeling conditional preference.

Multinomial model for modeling the probability of a latent preference given a user.
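A minimal sketch of how these pieces could combine, assuming the feature difference form for the conditional preference: the probability that a user prefers yj over yi is a mixture over latent preferences z, weighted by a per-user multinomial P(z | u). The parameters below are set by hand for illustration; fitting them (for example with EM, as in PLSI) is not shown, and the function name `pair_preference` is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(w, f):
    return sum(a * b for a, b in zip(w, f))

def pair_preference(p_z_given_u, w_by_z, f_i, f_j):
    """P(yj preferred over yi | u) = sum_z P(z | u) * sigmoid(w_z . (f_j - f_i))."""
    d = [b - a for a, b in zip(f_i, f_j)]
    return sum(p_z * sigmoid(dot(w_z, d)) for p_z, w_z in zip(p_z_given_u, w_by_z))

# Z = 2 hypothetical latent preferences over (interest, education) features.
w_by_z = [
    [3.0, -3.0],  # z1: interest-driven
    [-3.0, 3.0],  # z2: education-driven
]
user1 = [0.9, 0.1]  # P(z | User1): mostly z1
user2 = [0.1, 0.9]  # P(z | User2): mostly z2

ml_group = [1.0, 0.0]
cornell_alumni = [0.0, 1.0]

p1 = pair_preference(user1, w_by_z, cornell_alumni, ml_group)  # User1: ML over Cornell
p2 = pair_preference(user2, w_by_z, cornell_alumni, ml_group)  # User2: ML over Cornell
```

Because each latent preference keeps its own weights, opposite preferences no longer cancel: the same pair gets a high probability for User1 and a low one for User2.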

Ranking

Thus, we can still rank communities individually (without constructing pairs).

Evaluation

Offline evaluation: Evaluated on group join data on linkedin.com during the summer of 2012.

Train-test data separated chronologically.
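A minimal sketch of a chronological split, assuming each join event carries a date. The event tuples and the cutoff date are hypothetical; the talk states only that the data came from the summer of 2012.

```python
from datetime import date

def chronological_split(events, cutoff):
    """events: list of (user, community, join_date).
    Joins before the cutoff train the model; later joins test it."""
    train = [e for e in events if e[2] < cutoff]
    test = [e for e in events if e[2] >= cutoff]
    return train, test

events = [
    ("u1", "ML Group", date(2012, 6, 15)),
    ("u2", "Cornell Alumni", date(2012, 7, 2)),
    ("u1", "Cornell Alumni", date(2012, 8, 20)),
]
train, test = chronological_split(events, cutoff=date(2012, 8, 1))
```

Splitting by time rather than at random keeps the model from training on joins that happened after the ones it is tested on.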

Pairwise PLSI improves performance on learning pairwise preferences

Pairwise PLSI leads to more successful recommendations

Online evaluation

● Tested out Logistic Loss and Feature Difference models on 5% of LinkedIn users, and the baseline model on the rest

● Measured average click-through-rate (CTR) over 2 weeks

● The feature difference model showed a 5% increase in CTR; the logistic loss model showed a 3% increase.

Conclusion: Pairwise learning can be a useful addition.

However, gains may depend on the context / domain.

Important to understand and model the special characteristics of a target domain.

Thank you

Amit Sharma, @amt_shrma

www.cs.cornell.edu/~asharma
