Applications of Voting Theory to Information Mashups

ICSC 2008 Julia Grace, IBM Almaden Research Applications of Voting Theory to Information Mashups Alfredo Alba Varun Bhagwan Julia Grace Daniel Gruhl Kevin Haas Meenakshi Nagarajan Jan Pieper Christine Robson Nachiketa Sahoo


Page 1: Applications of Voting Theory to Information Mashups

ICSC 2008 Julia Grace, IBM Almaden Research

Applications of Voting Theory to Information Mashups

Alfredo Alba, Varun Bhagwan, Julia Grace, Daniel Gruhl, Kevin Haas, Meenakshi Nagarajan, Jan Pieper, Christine Robson, Nachiketa Sahoo

Page 2: Applications of Voting Theory to Information Mashups

Overview
• BBC approached IBM in 2007
  – Goal: Create a better music chart that is more reflective of current tastes and trends in popular music
• Billboard charts are no longer relevant
  – Do not reflect music listened to and purchased online
• Looked to online music communities for data
  – Page views, music listens, blog posts
• We needed a way of combining these sources
  – Y.A.M.? (Yet another mashup?)

Page 3: Applications of Voting Theory to Information Mashups

Overview
• Traditional Mashup
  – Google Maps + Craigslist
  – Music Mashups
    • Interweaving 2 tracks
  – Always same modalities
    • Similar, homogeneous data sets
Combine “like” data by simple summation
• Information Mashup
  – Data from disparate online music communities
  – Different modalities [views, listens, posts]

Page 4: Applications of Voting Theory to Information Mashups

Overview
• New means of combining/mashing our data
  – New methodology for mashups
• Our Approach
  – Voting theory
    • Think of our data sources as constituents in an election

Page 5: Applications of Voting Theory to Information Mashups

Music Mashup
• End Goal: Gauge Popularity
• Challenging
  – Diverse data silos
  – Different sites have different demographics and user bases
  – Data volumes vary widely
    • MySpace: 13,697,565
    • Bebo: 10,194
  – Data itself comes in different flavors
How do you represent the “voices” of each of these music communities in a single Top-10 list?

Page 6: Applications of Voting Theory to Information Mashups

Voting Theory
• Voting Systems
  – Designed to combine many “voices” into a single decision that is representative of all communities
Different voting systems have different priorities, resulting in different outcomes
• You have to choose the voting system that is right for your circumstances
• We are not going to invent a new voting system
  – Examine several well-known systems

Page 7: Applications of Voting Theory to Information Mashups

Example: US Presidential Election
• US Presidential Election uses a Delegate System
  – Guarantees states with larger populations don’t always drastically sway elections
  – This methodology was chosen because it reflected what was important at the time of implementation
• Bush vs. Gore 2000 Presidential Election

Page 8: Applications of Voting Theory to Information Mashups

How to Choose a Voting System?
• Voting theory: how “good” a voting system is varies from person to person and situation to situation
• A metric is needed to gauge the quality of a voting system for your circumstances

Page 9: Applications of Voting Theory to Information Mashups

How to Choose a Voting System?
• Example:
  – Delegate System in United States
  – Equal voice for each state by population was the priority
• How to evaluate the quality of a Top-10 list?
  – Ideally we would create lists and perform a massive user study to determine the best voting system
  – This is not feasible and does not scale
  – So we need some heuristics to gauge the quality of our lists
Fortunately, this is a solved problem… Voting theory employs a “Social Welfare Function” to gauge the quality of a voting system

Page 10: Applications of Voting Theory to Information Mashups

Social Welfare Functions
• What is a Social Welfare Function? Definition:
“Mathematical means to quantify the attributes that you prioritize in a voting system (e.g. all communities have a voice, most popular candidate wins)”
• “Simple example”
  – Situation: People only care if their first choice wins the election
  – Resulting Social Welfare Function: Measure how many people had their first choice picked
• We will use a Social Welfare Function to determine which voting system is best for combining the data in our mashup into our Top-10 list of music
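The “simple example” above can be sketched in a few lines. This is only an illustration, assuming each voter submits a ranked ballot with the best choice first; the function name, the ballots, and “Artist C” are hypothetical and do not come from the talk.

```python
def first_choice_welfare(ballots, winner):
    """Fraction of voters whose first choice matches the winner."""
    if not ballots:
        return 0.0
    satisfied = sum(1 for ballot in ballots if ballot and ballot[0] == winner)
    return satisfied / len(ballots)


# Hypothetical ballots: one ranking per voter, best choice first.
ballots = [
    ["Coldplay", "Rihanna", "Artist C"],
    ["Rihanna", "Coldplay", "Artist C"],
    ["Coldplay", "Artist C", "Rihanna"],
    ["Artist C", "Coldplay", "Rihanna"],
]

score = first_choice_welfare(ballots, "Coldplay")
print(score)
```

A higher score means more voters are satisfied under this particular function; a different priority would need a different function.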

Page 11: Applications of Voting Theory to Information Mashups

Well-established Social Welfare Functions
• Spearman Footrule: Christine
  – Type A personality
  – Preservation of position in the rankings
  – Entire ranking reflected accurately
    • Middle-range artists should be in the middle, low-range towards the end, etc.
  – For example:
    • Christine ranked Coldplay #2, so she will be happy if Coldplay is #2 in the final list
• Precision Optimal Aggregation: Julia
  – Representation (not rank)
  – For example:
    • Julia had Rihanna in her list, so she would like Rihanna to be in the final list
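The two viewpoints can be made concrete with a small sketch. It uses the standard Spearman footrule (sum of absolute position differences) and a simple overlap fraction as a stand-in for the “representation” idea; the exact formulas used in the talk are not given in the slides, so both functions, the handling of missing artists, and the artists “Artist A” and “Artist B” are assumptions.

```python
def spearman_footrule(source_ranking, final_ranking):
    """Sum of |position in source - position in final| over the source's artists.

    Assumption of this sketch: an artist missing from the final list is
    treated as sitting one place past its end. Lower is better.
    """
    final_pos = {artist: i for i, artist in enumerate(final_ranking, start=1)}
    missing = len(final_ranking) + 1
    return sum(
        abs(i - final_pos.get(artist, missing))
        for i, artist in enumerate(source_ranking, start=1)
    )


def representation(source_ranking, final_ranking):
    """Fraction of the source's artists that appear anywhere in the final list."""
    if not source_ranking:
        return 0.0
    final_set = set(final_ranking)
    return sum(a in final_set for a in source_ranking) / len(source_ranking)


# Hypothetical lists, best first.
christine = ["Artist A", "Coldplay", "Rihanna"]
julia = ["Rihanna", "Artist B", "Coldplay"]
final = ["Rihanna", "Coldplay", "Artist A"]

print(spearman_footrule(christine, final))  # position-sensitive view
print(representation(julia, final))         # membership-only view
```

In this toy data Coldplay keeps position #2, which is what the footrule rewards, while the overlap measure only checks that Rihanna made the final list at all.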

Page 12: Applications of Voting Theory to Information Mashups

Voting Systems
• We evaluated 8 well-established voting systems
• Important to keep in mind
  – We are not electing a single candidate, we are creating a rank-ordered list
  – Position of artists matters just as much as who is #1

Page 13: Applications of Voting Theory to Information Mashups

Voting Systems
• Total vote (i.e. election by popular vote)
  – Tally counts, listens, etc. regardless of “type” of data
  – Easy to understand, very transparent
  – Modalities with very large amounts of data tend to dominate the vote
• Weighted votes
  – Use a multiplier so that postings count more than listens, delegate, count rank
• Semi-Proportional
  – Each source gets the same number of votes regardless of how many people vote
• Delegates
  – Each source gets a set number of votes, decided in advance
• Simple Rank (Nauru)
  – Every candidate gets a position vote; the candidate with the smallest number is the winner
• Inverse Rank
  – Close to Rank, except it uses 1/number and the biggest number wins; gives more weight to being close to the top of the list
• Run-off
  – When ½ the sources agree on a candidate, that candidate is elected
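Three of the systems listed above (total vote, simple rank, inverse rank) are easy to compare side by side. The following is a rough sketch, not the authors' implementation: the function names, the tie-breaking by artist name, the penalty position for artists a source does not list, and all counts are assumptions made for illustration.

```python
from collections import defaultdict


def total_vote(counts_by_source):
    """Sum raw counts across sources; the highest total comes first."""
    totals = defaultdict(float)
    for counts in counts_by_source.values():
        for artist, n in counts.items():
            totals[artist] += n
    return sorted(totals, key=lambda a: (-totals[a], a))


def _ranking(counts):
    """Order one source's artists by count, highest first (ties by name)."""
    return sorted(counts, key=lambda a: (-counts[a], a))


def simple_rank(counts_by_source):
    """Sum each artist's position across sources; the smallest sum comes first.

    Assumption: an artist absent from a source gets position len(source) + 1.
    """
    artists = {a for counts in counts_by_source.values() for a in counts}
    score = dict.fromkeys(artists, 0)
    for counts in counts_by_source.values():
        pos = {a: i for i, a in enumerate(_ranking(counts), start=1)}
        for a in artists:
            score[a] += pos.get(a, len(counts) + 1)
    return sorted(artists, key=lambda a: (score[a], a))


def inverse_rank(counts_by_source):
    """Sum 1/position across sources; the largest sum comes first."""
    score = defaultdict(float)
    for counts in counts_by_source.values():
        for i, a in enumerate(_ranking(counts), start=1):
            score[a] += 1.0 / i
    return sorted(score, key=lambda a: (-score[a], a))


# Hypothetical counts: one very large source and two small ones.
data = {
    "big_site": {"Rihanna": 900_000, "Coldplay": 500_000, "Artist C": 100_000},
    "small_site_1": {"Coldplay": 120, "Artist C": 90, "Rihanna": 10},
    "small_site_2": {"Coldplay": 80, "Rihanna": 40, "Artist C": 30},
}

print(total_vote(data))
print(simple_rank(data))
print(inverse_rank(data))
```

With these made-up numbers the total vote follows the largest source, while both rank-based systems put Coldplay first because two of the three sources rank it at the top.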

Page 14: Applications of Voting Theory to Information Mashups

Election Setup
1. Data preparation: Crawled, extracted, cleaned, mined, analyzed…
2. Applied a voting system
  • Total Vote, Nauru, Run-off, etc.
  • Output: Top-10 list of popular artists
3. Tested Top-10 list against SWF
  1. Is Julia happy?
  2. Is Christine happy?

Page 15: Applications of Voting Theory to Information Mashups

Total Votes
• Total Votes: simple summing

[Figure: Top-10 ranking, #1 to #10. Key: Precision Optimal Aggregation SWF; Spearman Footrule SWF; contribution of combined ranking for the artist from each source]

YouTube and Bebo are nearly sole contributors to Rihanna being #1

Explanation: YouTube dominates all other music communities – it was coincidental that Bebo was also able to contribute to the rankings

YouTube video view counts are so high they dominate all other communities
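The domination effect is easy to reproduce with made-up numbers. The sketch below measures how much of an artist's summed total comes from each source, first on raw counts and then after rescaling every source to the same total weight, which is one possible reading of the “Semi-Proportional” idea. The source names, counts, and function names are hypothetical.

```python
def contribution_shares(counts_by_source, artist):
    """Fraction of an artist's summed total contributed by each source."""
    per_source = {s: c.get(artist, 0) for s, c in counts_by_source.items()}
    total = sum(per_source.values())
    if total == 0:
        return {s: 0.0 for s in per_source}
    return {s: n / total for s, n in per_source.items()}


def normalize_sources(counts_by_source):
    """Rescale each source so its counts sum to 1 (equal weight per source)."""
    normalized = {}
    for source, counts in counts_by_source.items():
        total = sum(counts.values())
        normalized[source] = {
            a: (n / total if total else 0.0) for a, n in counts.items()
        }
    return normalized


# Hypothetical counts: a huge video site next to a small community.
data = {
    "video_site": {"Rihanna": 9_000_000, "Coldplay": 1_000_000},
    "small_site": {"Rihanna": 2_000, "Coldplay": 8_000},
}

raw = contribution_shares(data, "Rihanna")
balanced = contribution_shares(normalize_sources(data), "Rihanna")
print(raw)
print(balanced)
```

On raw counts the large site supplies nearly all of the artist's total; after normalization the small community's preferences become visible in the combined score.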

Page 16: Applications of Voting Theory to Information Mashups

Nauru
• Election system used on the Pacific island nation of Nauru

[Figure: Top-10 ranking, #1 to #10. Key: Precision Optimal Aggregation SWF; Spearman Footrule SWF; contribution of combined ranking for the artist from each source]

Significantly more even distribution of sources

Nauru maximized the Precision Optimal Aggregation SWF

All communities contribute!

Page 17: Applications of Voting Theory to Information Mashups

Run-off
• From the top, select artists one at a time from each source in a fixed order

[Figure: Top-10 ranking, #1 to #10. Key: Precision Optimal Aggregation SWF; Spearman Footrule SWF; contribution of combined ranking for the artist from each source]

Significantly more even distribution of sources

Run-off maximized the Spearman Footrule SWF

All communities contribute!
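The selection rule on this slide (“from the top, select artists one at a time from each source in a fixed order”) reads like a round-robin merge. Below is a minimal sketch of that reading only; it does not model the majority-agreement description given in the earlier list of voting systems, and the function name, the skipping of already-chosen artists, and the sample rankings are assumptions.

```python
def round_robin_runoff(rankings, k=10):
    """Visit sources in a fixed order, taking each one's best unchosen artist."""
    chosen = []
    seen = set()
    cursors = [0] * len(rankings)
    while len(chosen) < k:
        progressed = False
        for idx, ranking in enumerate(rankings):
            # Skip artists that an earlier pick already placed in the list.
            while cursors[idx] < len(ranking) and ranking[cursors[idx]] in seen:
                cursors[idx] += 1
            if cursors[idx] < len(ranking):
                artist = ranking[cursors[idx]]
                chosen.append(artist)
                seen.add(artist)
                cursors[idx] += 1
                progressed = True
                if len(chosen) == k:
                    break
        if not progressed:
            break  # every source is exhausted
    return chosen


# Hypothetical per-source rankings, best first.
rankings = [
    ["Rihanna", "Coldplay", "Artist C"],
    ["Coldplay", "Artist D", "Rihanna"],
    ["Artist E", "Rihanna", "Coldplay"],
]

print(round_robin_runoff(rankings, k=4))
```

Because every source gets a turn in each round, no single community can fill the list on its own, which fits the slide's note that all communities contribute.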

Page 18: Applications of Voting Theory to Information Mashups


http://www.bbc.co.uk/soundindex/

Page 19: Applications of Voting Theory to Information Mashups

Lessons Learned
• Choosing a voting methodology depends on what you prioritize
• Think hard about what your Social Welfare Function is
  – Deciding factor in how to combine data
  – How you measure the success of your mashup

Page 20: Applications of Voting Theory to Information Mashups

Conclusion
• A novel approach to mashups
• We feel this is the future of information mashups from different modalities

Page 21: Applications of Voting Theory to Information Mashups

Thank you
• Any questions?
• Julia Grace ([email protected])