Recommendations @ LinkedIn

description

How does LinkedIn think about solving the recommendation problem as a platform? How do we leverage Hadoop to scale our systems and speed up innovation?

Transcript of Recommendations @ LinkedIn

Page 1: Recommendations @ LinkedIn

Recommendations @ LinkedIn

Page 2: Recommendations @ LinkedIn

Think Platform

Leverage Hadoop

Page 3: Recommendations @ LinkedIn

The world’s largest professional network

Over 50% of members are now international

LinkedIn Members (Millions)

2004: 2
2005: 4
2006: 8
2007: 17
2008: 32
2009: 55
2010: 90

135M+*

75% of Fortune 100 Companies use LinkedIn to hire

Company Pages: >2M**

New Members joining: ~2/sec

*as of Nov 4, 2011
**as of June 30, 2011

Page 4: Recommendations @ LinkedIn

Recommendations Opportunity

Page 5: Recommendations @ LinkedIn

The Recommendations Opportunity

Pandora Search for People

Events You May Be Interested In

Groups browse maps

Page 6: Recommendations @ LinkedIn

50%

Page 7: Recommendations @ LinkedIn

Positions

Education

Summary

Experience

Skills

Page 8: Recommendations @ LinkedIn

Are all titles the same?

- Software Engineer
- Technical Yahoo
- Member Technical Staff
- Software Development Engineer
- SDE

Page 9: Recommendations @ LinkedIn

Are all companies the same?

‘IBM’ has 8000+ variations

- ibm – ireland
- ibm research
- T J Watson Labs
- International Bus. Machines

Page 10: Recommendations @ LinkedIn

Recommendation Trade-offs: The need for a common platform

Real Time

Time Independent

Page 11: Recommendations @ LinkedIn

Recommendation Trade-offs: The need for a common platform

Content Analysis

Collaborative

Page 12: Recommendations @ LinkedIn

Recommendation Trade-offs: The need for a common platform

Recall

Precision

Page 13: Recommendations @ LinkedIn

Related Titles, Related Companies, Related Industries

Profile fields: Title, Specialty, Education, Experience, Location, Industry, Seniority, Skills

Field pairs compared:

- Specialty -> Specialty
- Seniority -> Seniority
- Skills -> Skills
- Title -> Title
- Summary -> Summary
- Title -> Related Title
- Education -> Education
- ...

Matching

Binary:

- Exact match
- Exact match in bucket

Soft match:

v1 = tf * idf

cos θ = (v1 · v2) / (|v1| * |v2|)

Scores shown: 0.58, 0.94, 0.26, 0.18, 0.98, 0.16, 0.40
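
The soft match on this slide scores a pair of text fields as the cosine of the angle between their tf-idf vectors. A minimal sketch of that computation follows, assuming whitespace tokenization and a smoothed idf; the slides specify neither, and the function names and sample skill lists are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed idf (an assumption, not from the slides) keeps weights positive.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in tf}
        )
    return vectors


def cosine(v1, v2):
    """cos(theta) = (v1 . v2) / (|v1| * |v2|) for sparse dict vectors."""
    dot = sum(w * v2.get(t, 0.0) for t, w in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical skill fields for a member and two jobs.
member_skills = "java hadoop machine learning".split()
job_skills = "hadoop machine learning pig".split()
unrelated_skills = "sales marketing".split()

vecs = tfidf_vectors([member_skills, job_skills, unrelated_skills])
sim_match = cosine(vecs[0], vecs[1])      # overlapping terms: between 0 and 1
sim_unrelated = cosine(vecs[0], vecs[2])  # no shared terms: 0.0
```

Sparse dictionaries keep the vectors small when the vocabulary is large and each field mentions only a few terms.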

Page 14: Recommendations @ LinkedIn

Importance weight vector (Skills -> Skills)

Similarity score vector (Skills -> Skills)

Normalization, Scoring & Ranking

Filtering: Location, Company, Industry

Feedback

0.94

0.70
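
The pipeline on this slide combines an importance weight vector with a similarity score vector, then normalizes, ranks, and filters. One way to read that is a weighted average over field pairs followed by a hard filter, sketched below. The weights, the candidate records, and the equality-based filter are all hypothetical; the slides do not state how the weights are learned or how filtering is implemented.

```python
def score(similarities, weights):
    """Weighted average of per-field-pair similarity scores, normalized by total weight."""
    total_weight = sum(weights.get(pair, 0.0) for pair in similarities)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights.get(pair, 0.0) * s for pair, s in similarities.items())
    return weighted / total_weight


def recommend(candidates, weights, required, top_k=10):
    """Filter candidates on required attributes, then rank by weighted score."""
    kept = [
        c for c in candidates
        if all(c["attrs"].get(k) == v for k, v in required.items())
    ]
    scored = [(c["id"], score(c["similarities"], weights)) for c in kept]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical importance weights per field pair.
weights = {"skills->skills": 0.5, "title->title": 0.3, "summary->summary": 0.2}

# Hypothetical candidates with precomputed similarity scores.
candidates = [
    {"id": "job1", "attrs": {"location": "US"},
     "similarities": {"skills->skills": 0.94, "title->title": 0.58, "summary->summary": 0.26}},
    {"id": "job2", "attrs": {"location": "US"},
     "similarities": {"skills->skills": 0.40, "title->title": 0.98, "summary->summary": 0.16}},
    {"id": "job3", "attrs": {"location": "UK"},
     "similarities": {"skills->skills": 0.98, "title->title": 0.98, "summary->summary": 0.98}},
]

results = recommend(candidates, weights, required={"location": "US"})
```

Here `job3` has the highest raw similarity but is removed by the location filter before ranking.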

Page 15: Recommendations @ LinkedIn

Technologies

Page 16: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling
• Blending Recommendation Algorithms
• Grandfathering
• Model Selection
• A/B Testing
• Tracking and Reporting

Page 17: Recommendations @ LinkedIn

Scaling: Billions of Recommendations

Latency > 1 sec

Latency < 1 sec

Recall = Low

Latency < 1 sec

Recall = High

Minhashing
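
The slide names minhashing as the option that keeps latency under a second while recall stays high, but does not show how it is applied. Below is a minimal sketch of standard MinHash, which estimates the Jaccard similarity of two token sets from short signatures so that likely matches can be found without comparing every pair in full. The hash construction, the signature length, and the sample profiles are assumptions for illustration.

```python
import hashlib


def _hash(seed, token):
    """64-bit hash of a token under a given seed (one 'hash function' per seed)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=128):
    """Signature = minimum hash value of the set under each seeded hash function."""
    tokens = set(tokens)
    if not tokens:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree approximates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Hypothetical skill sets for two profiles.
profile_a = {"java", "hadoop", "pig", "hive", "scala", "python"}
profile_b = {"java", "hadoop", "pig", "hive", "ruby", "go"}

sig_a = minhash_signature(profile_a)
sig_b = minhash_signature(profile_b)
estimate = estimated_jaccard(sig_a, sig_b)  # approximates exact_jaccard(...)
exact = exact_jaccard(profile_a, profile_b)
```

In practice the signatures are usually split into bands and bucketed so that only items sharing a bucket are compared; that step is omitted here.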

Page 18: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling ✔
• Blending Recommendation Algorithms
• Grandfathering
• Model Selection
• A/B Testing
• Tracking and Reporting

Page 19: Recommendations @ LinkedIn

Blending Recommendation Algorithms

Co-View Impact Latency ~ Minutes

Complexity = High

Co-View Impact Latency ~ Hours

Complexity = Low

Page 20: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling ✔
• Blending Recommendation Algorithms ✔
• Grandfathering
• Model Selection
• A/B Testing
• Tracking and Reporting

Page 21: Recommendations @ LinkedIn

Grandfathering: Adding and Changing Features

No Time Guarantees

Minimal Disruption

Next Profile Edit

Time ~ Week

Significant Systems Work

Parallel Feature Extraction Pipeline

Time ~ Hour

Minimal Disruption

Grandfather When Ready

Page 22: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling ✔
• Blending Recommendation Algorithms ✔
• Grandfathering ✔
• Model Selection
• A/B Testing
• Tracking and Reporting

Page 23: Recommendations @ LinkedIn

Model Selection

• Features: Content, Collaborative
• Models: SVM, Logistic Regression, Decision Trees
• Parameters: L1+L2 Regularization

Page 24: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling ✔
• Blending Recommendation Algorithms ✔
• Grandfathering ✔
• Model Selection ✔
• A/B Testing
• Tracking and Reporting

Page 25: Recommendations @ LinkedIn

A/B Testing: Is Option A Better Than Option B? Let’s Test

Traffic split: 10% to A (New Model), 90% to B (Old Model)

Send 10% of members who have more than 100 connections, AND who have logged in within the past week, AND who are based in Europe
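
The targeting rule above combines an eligibility filter with a percentage split. A minimal sketch of one common way to do this is shown below: hash the member ID into one of 100 buckets so that assignment is deterministic, and route eligible members in the first 10 buckets to the new model. The field names, the experiment name, and the hashing scheme are hypothetical; the slides do not describe how LinkedIn assigns members.

```python
import hashlib
from datetime import datetime, timedelta


def eligible(member, now):
    """More than 100 connections, logged in within the past week, based in Europe."""
    return (
        member["connections"] > 100
        and now - member["last_login"] <= timedelta(days=7)
        and member["region"] == "Europe"
    )


def bucket(member_id, experiment, buckets=100):
    """Deterministically map a member to a bucket in [0, buckets)."""
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def assign(member, now, experiment="new-model-test", treatment_percent=10):
    """Return 'A' (new model) for the targeted slice, 'B' (old model) otherwise."""
    if eligible(member, now) and bucket(member["id"], experiment) < treatment_percent:
        return "A"
    return "B"


now = datetime(2011, 11, 4)
# Hypothetical member who fails the connection-count criterion.
small_network = {
    "id": 42,
    "connections": 50,
    "last_login": now - timedelta(days=1),
    "region": "Europe",
}
variant = assign(small_network, now)  # "B": not eligible, so stays on the old model
```

Including the experiment name in the hash keeps bucket assignments independent across experiments, so the same members are not always in the treatment group.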

Page 26: Recommendations @ LinkedIn

Hadoop Case Studies

• Scaling ✔
• Blending Recommendation Algorithms ✔
• Grandfathering ✔
• Model Selection ✔
• A/B Testing ✔
• Tracking and Reporting

Page 27: Recommendations @ LinkedIn

Tracking and Reporting: K-way joins across billions of rows

Up to the minute reporting

Nearsightedness

K-way join complexity

Lacks up to the minute reporting

Simple k-way joins
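
The batch option on this slide gives up up-to-the-minute reporting in exchange for simple k-way joins. As a small in-memory illustration of why the batch form is simple, the sketch below groups records from k sources by a shared key, the same shape a reduce-side join takes in a MapReduce job. The source names and records are hypothetical, and a real job would stream over distributed data rather than hold everything in a dictionary.

```python
from collections import defaultdict


def k_way_join(sources):
    """Inner-join k sources on a shared key.

    sources: dict mapping source name -> iterable of (key, record) pairs.
    Returns: dict mapping key -> {source name: [records]} for keys present in all sources.
    """
    # "Map/shuffle" step: group every record by its join key and source.
    grouped = defaultdict(lambda: defaultdict(list))
    for name, rows in sources.items():
        for key, record in rows:
            grouped[key][name].append(record)

    # "Reduce" step: keep only keys that appear in every source.
    joined = {}
    for key, by_source in grouped.items():
        if len(by_source) == len(sources):
            joined[key] = {name: by_source[name] for name in sources}
    return joined


# Hypothetical tracking events keyed by recommendation ID.
sources = {
    "impressions": [("rec1", {"member": 1}), ("rec2", {"member": 2})],
    "clicks": [("rec1", {"position": 3})],
    "applies": [("rec1", {"job": "j9"})],
}

report = k_way_join(sources)
```

Only `rec1` appears in all three sources, so it is the only key in `report`.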

Page 28: Recommendations @ LinkedIn

Think Platform

Leverage Hadoop

Page 29: Recommendations @ LinkedIn

Come work with us at LinkedIn

LinkedIn

Applied Research

Engineer

You