Sahand Negahban Sewoong Oh Devavrat Shah Yale + UIUC + MIT.

Page 1

Sahand Negahban Sewoong Oh Devavrat Shah

Yale + UIUC + MIT

Rank centrality: Ranking from comparisons

Page 2

Some scenarios

o Given partial preferences
  o Compute global ranking with scores to reflect intensity
o Sports
  o Outcome of games between teams/players
o Social recommendations
  o Ratings of few restaurants/movies
o Competitive conference/Graduate admission
  o Ordering of few papers/applicants

Page 3

Revealed preferences

o Partial preferences are revealed in different forms
  o Sports: Win and Loss
  o Social: Starred rating
  o Conferences: Scores
o All can be viewed as pair-wise comparisons
  o IND beats AUS: IND > AUS
  o South Indies ***** vs MTR ***: SI > MTR
  o Ranking Paper 10/10 vs Other Paper 5/10: Ranking > Other

Page 4

Data and Decision

o Revealed preferences lead to
  o Bag of pair-wise comparisons
  o Sports, Social, Conferences, Transactions, etc.
o Question of interest
  o Obtain global ranking over objects of interest
    o Teams/Players, Restaurants, Papers, Applicants.
  o Along with intensity/score for each object
  o Using given partial preferences/pair-wise comparisons

Page 5

Data and Decision

o Q1. Given weighted comparison graph G=(V, E, A)
  o Find ranking of/scores associated with objects
o Q2. When possible (e.g. Conference/Crowd-Sourcing), choose G so as to
  o Minimize the number of comparisons required to find ranking/scores

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21; annotation: # times 1 defeats 2]
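The weighted comparison graph in Q1 can be built directly from a bag of pair-wise outcomes. A minimal sketch in Python, assuming a hypothetical input format of (winner, loser) index pairs, 0-based indices rather than the slide's 1-based labels, and reading A12 as the number of times 1 defeats 2. The function names are illustrative and not from the talk.

```python
import numpy as np

def comparison_counts(n, outcomes):
    """Return the n x n matrix A with A[i, j] = number of times i defeated j.

    `outcomes` is a hypothetical input format: an iterable of
    (winner, loser) pairs with 0-based object indices.
    """
    A = np.zeros((n, n), dtype=int)
    for winner, loser in outcomes:
        A[winner, loser] += 1
    return A

def undirected_edges(A):
    """Edge set E: unordered pairs (i, j), i < j, compared at least once."""
    compared = (A + A.T) > 0
    i_idx, j_idx = np.nonzero(np.triu(compared, k=1))
    return list(zip(i_idx.tolist(), j_idx.tolist()))

# Object 0 beat object 1 twice and lost once; object 2 beat object 1 once.
outcomes = [(0, 1), (0, 1), (1, 0), (2, 1)]
A = comparison_counts(3, outcomes)
E = undirected_edges(A)
```

Objects 0 and 2 are never compared here, so the pair (0, 2) is not an edge of G.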

Page 6

Rank aggregation: Model

o We posit
  o Distribution over permutations as ground-truth
  o Pair-wise comparisons are drawn from this distribution

[Figure: panels labeled Data, Distribution, Ranking; orderings of A, B, C with probabilities 0.25 and 0.75; comparison graph on nodes 1 to 6 with edge weights A12 and A21]

Page 7

Rank aggregation: Background

o Input: complete preference (not comparisons)
o Axiomatic impossibility [Arrow ’51]
o Some algorithms
  o Kemeny optimal: minimize disagreements
    o Extended Condorcet Criteria
    o NP-hard, 2-approx algorithm [Dwork et al ’01]
  o Borda count: average position is score
    o Simple
    o Useful axiomatic properties [Young ’74]

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21; two example rankings, 2 > 3 > 4 > 1 > 5 > 6 and 6 > 2 > 5 > 1 > 4 > 3]

Page 8

Rank aggregation: Background

o Algorithm with comparisons
  o Variant of Kemeny optimal:
    o NP-hard
  o Variant of Borda count: average position from comparison?
    o If pij = Aij/(Aij + Aji) represent pair-wise marginal distribution
    o Then, Borda count is given as
    o Requires: G complete, many comparisons per pair
o Also see (short list of related works): [Diaconis ’87], [Alder et al ’87], [Braverman-Mossel ’09], [Caramanis et al ’11], [Fernoud et al ’11], [Duchi et al ’12]…

[Ammar, Shah ’11]

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]
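A Borda-style score can be computed from the pair-wise marginals pij = Aij/(Aij + Aji). The sketch below is one natural reading, not necessarily the exact normalization used on the slide: since the position of i in a ranking is one plus the number of objects ranked above it, the expected number of objects that i beats is the sum of pij over j ≠ i, and that sum is used as the score. The function name is illustrative.

```python
import numpy as np

def borda_from_comparisons(A):
    """Borda-style score s[i] = sum over j != i of p_ij,
    with p_ij = A[i, j] / (A[i, j] + A[j, i]).

    Assumes G is complete: every pair compared at least once.
    """
    A = np.asarray(A, dtype=float)
    total = A + A.T
    off_diag = ~np.eye(A.shape[0], dtype=bool)
    if np.any(total[off_diag] == 0):
        raise ValueError("every pair must be compared at least once")
    p = np.zeros_like(A)
    p[off_diag] = A[off_diag] / total[off_diag]
    return p.sum(axis=1)

# A[i, j] = number of times i defeated j (0-based indices).
A = np.array([[0, 3, 4],
              [1, 0, 2],
              [0, 2, 0]])
scores = borda_from_comparisons(A)
ranking = np.argsort(-scores)  # best object first
```

The check for uncompared pairs reflects the slide's requirement that G be complete; with few comparisons per pair the estimates pij are also noisy.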

Page 9

Rank aggregation: Model

o General model
  o Effectively impossible to do aggregation
o Practically
  o Restrict choice model
  o Popularly utilized model is instance of Thurstone’s ’27
    o Used for transportation system (cf. McFadden)
    o TrueSkill uses for ranking online gamers
    o Pricing in airline industry (cf. Talluri and Van Ryzin)
    o …

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]

Page 10

Choice model

o Choice model (distribution over permutations) [Bradley-Terry-Luce (BTL) or MNL Model]
  o Each object i has an associated weight wi > 0
  o When objects i and j are compared
    o P(i > j) = wi/(wi + wj)
o Sampling model
  o Edges E of graph G are selected
  o For each (i,j) ∈ E, sample k pair-wise comparisons
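The BTL choice model and the sampling model can be simulated in a few lines. A minimal sketch, assuming 0-based object indices and the convention A[i, j] = number of times i beats j; the function name, the seed, and the example weights are illustrative choices, not values from the talk.

```python
import numpy as np

def sample_btl_comparisons(w, edges, k, rng):
    """Sample k comparisons for each edge (i, j) under the BTL model.

    P(i beats j) = w[i] / (w[i] + w[j]); A[i, j] counts wins of i over j.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        p_i_wins = w[i] / (w[i] + w[j])
        wins_i = rng.binomial(k, p_i_wins)  # number of the k games won by i
        A[i, j] += wins_i
        A[j, i] += k - wins_i
    return A

rng = np.random.default_rng(0)
w = [3.0, 1.0, 1.0]          # hypothetical weights
edges = [(0, 1), (1, 2)]     # comparison graph: a path 0 - 1 - 2
A = sample_btl_comparisons(w, edges, k=1000, rng=rng)
```

With these weights object 0 should win about three quarters of its games against object 1, and objects 1 and 2 should split theirs roughly evenly.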

Page 11

Rank centrality

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]

o Random walk on comparison graph G=(V,E,A)
  o d = max (undirected) vertex degree of G
  o For each edge (i,j):
    o Pij = (Aji + 1)/(Aij + Aji + 2) × 1/(d+1)
  o For each node i:
    o Pii = 1 - Σj≠i Pij
o Let G be connected
  o Let s be the unique stationary distribution of RW P
o Ranking:
  o Use s as scores of objects
  o Closely related to Dwork et al ’01 + Saaty ’03
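The random walk above translates into a short computation. A minimal sketch, assuming A[i, j] = number of times i beats j with 0-based indices and a connected comparison graph; the stationary distribution is found by plain power iteration, which is an implementation choice made here and not something the slides specify. The function name and iteration limits are illustrative.

```python
import numpy as np

def rank_centrality(A, num_iters=10_000, tol=1e-12):
    """Return (s, P): stationary scores and the transition matrix.

    P[i, j] = (A[j, i] + 1) / (A[i, j] + A[j, i] + 2) / (d + 1) on edges,
    P[i, i] = 1 - sum_{j != i} P[i, j], d = max undirected degree.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    compared = (A + A.T) > 0            # undirected edge set of G
    np.fill_diagonal(compared, False)
    d = compared.sum(axis=1).max()
    P = np.zeros((n, n))
    P[compared] = ((A.T + 1.0) / (A + A.T + 2.0))[compared] / (d + 1)
    P[np.diag_indices(n)] = 1.0 - P.sum(axis=1)
    s = np.full(n, 1.0 / n)             # start from the uniform distribution
    for _ in range(num_iters):
        s_next = s @ P                  # one step of the walk: s <- s P
        done = np.abs(s_next - s).sum() < tol
        s = s_next
        if done:
            break
    return s / s.sum(), P

# Path graph 0 - 1 - 2: 0 beat 1 eight times out of ten, 1 beat 2 six out of ten.
A = np.array([[0, 8, 0],
              [2, 0, 6],
              [0, 4, 0]])
s, P = rank_centrality(A)
ranking = np.argsort(-s)  # best object first
```

The walk moves from i to j more often when j has beaten i more often, so probability mass accumulates on strong objects. In this example the scores come out proportional to (21, 7, 5).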

Page 12

Rank centrality

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]

o Random walk on comparison graph G=(V,E,A)
  o Let s be the unique stationary distribution of RW P
o Ranking:
  o Use s as scores of objects
  o That is, object i has higher score if
    o It beats object j with higher score,
    o Or, beats many objects.
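This intuition can be read off the stationarity equation for s. A short worked derivation, using the Rank centrality transition probabilities Pij = (Aji + 1)/(Aij + Aji + 2) × 1/(d+1) on edges. The last step replaces the empirical win fraction by its BTL value wi/(wi + wj); that idealization is an illustrative assumption, not a statement from the slides.

```latex
% Stationarity of s for the walk P (requires amsmath):
\begin{align*}
s_i = \sum_{j} s_j P_{ji}
\quad&\Longleftrightarrow\quad
s_i \sum_{j \neq i} P_{ij} = \sum_{j \neq i} s_j P_{ji},
\qquad
P_{ji} = \frac{1}{d+1}\cdot\frac{A_{ij}+1}{A_{ij}+A_{ji}+2}.
\intertext{The mass flowing into $i$ from $j$ is $s_j$ times (roughly) the
fraction of comparisons that $i$ won against $j$, so $s_i$ is large when $i$
beats high-scoring objects or beats many objects. In the idealized case
$P_{ji} = \frac{1}{d+1}\cdot\frac{w_i}{w_i+w_j}$:}
w_j P_{ji} &= \frac{1}{d+1}\cdot\frac{w_i w_j}{w_i+w_j} = w_i P_{ij},
\end{align*}
% so s proportional to w satisfies detailed balance and is stationary.
```

Under that idealization the stationary scores recover the BTL weights up to normalization.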

Page 13

Rank centrality

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]

o Random walk on comparison graph G=(V,E,A)
  o Let s be the unique stationary distribution of RW P
o Ranking:
  o Use s as scores of objects
  o Compared to variant of Borda count:

Page 14

Rank centrality: experiment

International Cricket Ranking

Page 15

Rank centrality: simulation

o Error(s) =
o G: Erdos-Renyi graph with edge prob. d/n

[Figure: simulation plots; axis labels d/n and k]
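The comparison graph for such a simulation can be drawn as follows. A minimal sketch, assuming each of the n(n-1)/2 unordered pairs is included independently with probability d/n; the function name, seed, and the values n = 200, d = 10 are illustrative and not taken from the talk.

```python
import numpy as np

def erdos_renyi_edges(n, d, rng):
    """Sample an Erdos-Renyi graph on n nodes with edge probability d / n.

    Returns the edges as a list of pairs (i, j) with i < j.
    """
    p = d / n
    # Draw one uniform number per ordered pair, keep the strict upper triangle
    # so that each unordered pair is decided exactly once.
    upper = np.triu(rng.random((n, n)) < p, k=1)
    i_idx, j_idx = np.nonzero(upper)
    return list(zip(i_idx.tolist(), j_idx.tolist()))

rng = np.random.default_rng(1)
edges = erdos_renyi_edges(n=200, d=10, rng=rng)
```

The expected number of edges is (d/n) × n(n-1)/2, about 995 for these values, and the average degree is close to d.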

Page 16

Rank centrality: performance

o Theorem 1 (Negahban-Oh-Shah).
  o Let R = maxij wi/wj.
  o Let G be Erdos-Renyi graph.
  o Under Rank centrality, with d = Ω(log n)
  o That is, sufficient to have O(R^5 n log n) samples
    o Optimal dependence on n for ER graph
    o Dependence on R ?

Page 17

Rank centrality: performance

o Theorem 1 (Negahban-Oh-Shah).
  o Let R = maxij wi/wj.
  o Let G be Erdos-Renyi graph.
  o Under Rank centrality, with d = Ω(log n)
  o Information theoretic lower-bound: for any algorithm

Page 18

Rank centrality: performance

o Theorem 2 (Negahban-Oh-Shah).
  o Let R = maxij wi/wj.
  o Let G be any connected graph:
    o L = D^-1 E be its Laplacian
    o Δ = 1 - λmax(L)
    o κ = dmax/dmin
  o Under Rank centrality, with kd = Ω(log n)
  o That is, number of samples required O(R^5 κ^2 n log n × Δ^-2)
    o Graph structure plays role through its Laplacian
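The graph quantities in Theorem 2 are easy to compute for a given G. A minimal sketch, assuming E is the 0/1 adjacency matrix of a connected undirected graph and that λmax(L) denotes the largest absolute value among the eigenvalues of L = D^-1 E other than the top eigenvalue 1; the slide does not spell out that convention, so treat it as an assumption. The function name is illustrative.

```python
import numpy as np

def graph_quantities(adj):
    """Return (gap, kappa) for a connected undirected graph.

    gap   = 1 - lambda_max(L), with L = D^{-1} E
    kappa = d_max / d_min
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # D^{-1} E is similar to the symmetric matrix D^{-1/2} E D^{-1/2},
    # so both have the same real eigenvalues.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    sym = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eig = np.sort(np.linalg.eigvalsh(sym))      # ascending, eig[-1] == 1
    lam_max = max(eig[-2], -eig[0])             # assumed meaning of lambda_max
    return 1.0 - lam_max, deg.max() / deg.min()

complete4 = np.ones((4, 4)) - np.eye(4)         # complete graph on 4 nodes
star4 = np.zeros((4, 4))                        # star: node 0 joined to 1, 2, 3
star4[0, 1:] = star4[1:, 0] = 1.0

gap_complete, kappa_complete = graph_quantities(complete4)
gap_star, kappa_star = graph_quantities(star4)
```

The complete graph has gap 2/3 and κ = 1, while the star has κ = 3 and, being bipartite, gap 0 under this convention, which illustrates how strongly the sample bound depends on the choice of G.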

Page 19

Rank centrality and Graph choice

o Theorem 2 (Negahban-Oh-Shah).
  o Under Rank centrality, with kd = Ω(log n)
  o That is, number of samples required O(R^5 κ^2 n log n × Δ^-2)
o Choice of graph G
  o Subject to constraints, choose G so that
    o Spectral gap Δ is maximized
    o SDP [Boyd, Diaconis, Xiao ’04]

Page 20

Some remarks on proof

Page 21

Some remarks on proof

Page 22

Some remarks on proof

o Bound on
  o Use of comparison theorem [Diaconis-Saloff Coste ’94]++
o Bound on
  o Use of (modified) concentration of measure inequality for matrices
o Finally, use this to further bound Error(s)

Page 23

Rank centrality for Admissions, Conferences,…

[Figure: comparison graph on nodes 1 to 6 with edge weights A12 and A21]

o MIT admission system
o ACM conferences (MobiHoc ’11, Sigmetrics ’13)
  o Has been used for efficient reviewing over the past few years
o Daily polls (cf. A. Ammar)
  o polls.mit.edu
o Netflix
  o ?

Page 24

Concluding remarks

o Pair-wise comparisons
  o Universal way to look at partial preferences
o Rank centrality
  o Simple and intuitive algorithm for rank aggregation
o The comparison graph plays important role in aggregation
  o Choose G to maximize spectral gap of natural RW