Eric bieschke slides

Proprietary & Confidential Proprietary & Confidential MLconf November 2013

Transcript of Eric bieschke slides

Page 1: Eric bieschke slides

MLconf, November 2013

Page 2: Eric bieschke slides

The Data

“The files are in the computer.” – Derek Zoolander

Page 3: Eric bieschke slides

200+ million registered users

70+ million active monthly users

Average Pandora listener listens for 17 hours a month

More than 80% of listening occurs on mobile and other connected devices

8.06% of total US radio listening hours

Pandora

Page 4: Eric bieschke slides

1.47+ billion listening hours in October

30+ billion thumbs

5+ billion stations

Approximately one out of every two US smartphone users has listened to Pandora in the past month

Pandora

Page 5: Eric bieschke slides

Experimentation & Metrics

“It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.” – Richard Feynman

Page 6: Eric bieschke slides

A/B Testing

All improvements begin as a hypothesis.

Hypotheses beget experiments.

Experiments are tried against real Pandora listeners.

When an experiment beats the current algorithm, ship it!

Rinse, wash, repeat.

A/B testing is how you leverage scale. More data lets you build stronger models and try fancy, data-intensive algorithms, but the big win comes from unlocking A/B testing. Online evaluation > Offline evaluation.
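One common way to decide whether an experiment "beats the current algorithm" is a two-proportion z-test on a success rate such as thumb-up rate. The sketch below is only an illustration under that assumption; the slides do not say which test or metric Pandora uses, and the function names, the 0.05 threshold, and the counts are all hypothetical.

```python
import math

def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two_sided_p) for H0: both arms share the same success rate."""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def should_ship(successes_a, trials_a, successes_b, trials_b, alpha=0.05):
    """Ship only if arm B (the experiment) is better and the gap is significant."""
    z, p_value = two_proportion_z_test(successes_a, trials_a, successes_b, trials_b)
    return z > 0 and p_value < alpha

# Made-up counts: control arm A versus experiment arm B.
decision = should_ship(5000, 10000, 5200, 10000)
```

With more listeners per arm the standard error shrinks, which is one concrete sense in which scale "unlocks" A/B testing: smaller improvements become detectable.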

Page 7: Eric bieschke slides

Metrics

How you judge experiments shapes where you are headed.

Choose the wrong measuring stick and you wind up in the wrong place.

Choose the right measuring stick and progress is inevitable.

Improvements come both from better hypotheses to test in experiments and from better measuring sticks.

Incremental improvements tend to come from hypotheses.

Leapfrog improvements tend to come from better measuring sticks.

Page 8: Eric bieschke slides

Evolution of Big Picture Metrics

Thumb up percentage

Total listening hours

Listener return rate

Machine learning doesn’t exist in a vacuum. Make sure you’re optimizing the right thing.

Approach problems by first deciding what you’re trying to achieve, then think technology. If ML isn’t the right tool for the job, don’t use it.
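To make the three big picture metrics concrete, here is a minimal sketch that computes each from a toy session log. The record layout, the sample data, and the definition of "return" (a listener active in the first week who listens again afterwards) are assumptions for illustration; the slides do not define them.

```python
from collections import defaultdict

# Hypothetical records: (listener_id, day, hours_listened, thumbs_up, thumbs_down)
sessions = [
    ("a", 1, 1.5, 3, 1),
    ("a", 8, 2.0, 2, 0),
    ("b", 1, 0.5, 0, 2),
    ("c", 2, 1.0, 1, 1),
    ("c", 9, 3.0, 4, 0),
]

def thumb_up_percentage(sessions):
    """Thumbs up as a percentage of all thumbs given."""
    ups = sum(s[3] for s in sessions)
    downs = sum(s[4] for s in sessions)
    return 100.0 * ups / (ups + downs)

def total_listening_hours(sessions):
    """Sum of listening hours across all sessions."""
    return sum(s[2] for s in sessions)

def listener_return_rate(sessions, first_week_end=7):
    """Share of listeners first seen by first_week_end who listen again later."""
    days = defaultdict(set)
    for listener, day, *_ in sessions:
        days[listener].add(day)
    cohort = [l for l, d in days.items() if min(d) <= first_week_end]
    returned = [l for l in cohort if max(days[l]) > first_week_end]
    return len(returned) / len(cohort)
```

Note how the three can disagree: listener "b" lowers thumb-up percentage and return rate but still adds to total hours, which is why the choice of measuring stick matters.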

Page 9: Eric bieschke slides

Deeper Metrics

Relevance

Prediction accuracy

Musical diversity

Novelty / Surprisal

Awesomeness

These metrics all support our big picture goal at Pandora: Connecting people with music they love.
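Two of these deeper metrics have simple textbook formulations. The sketch below uses self-information, -log2(p), for surprisal and mean pairwise Euclidean distance between feature vectors for diversity. These are standard definitions chosen for illustration, not necessarily the ones Pandora uses, and the function names are hypothetical.

```python
import math
from itertools import combinations

def surprisal_bits(play_probability):
    """Self-information -log2(p): the rarer the track, the higher the score."""
    return -math.log2(play_probability)

def mean_pairwise_distance(vectors):
    """Average Euclidean distance over all pairs: one simple diversity score."""
    pairs = list(combinations(vectors, 2))
    return sum(math.dist(u, v) for u, v in pairs) / len(pairs)

# A track a listener would hear half the time carries 1 bit of surprisal.
one_bit = surprisal_bits(0.5)
```

A playlist of near-identical tracks scores close to zero on the diversity measure, while a track the listener almost never encounters scores high on surprisal.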

Page 10: Eric bieschke slides

How It Works

“Truth is what works.” – William James

Page 11: Eric bieschke slides

“There is no silver bullet.”

Page 12: Eric bieschke slides

People are truly unique

No single approach to music recommendations works for everybody

Using a variety of recommendation techniques and combining them intelligently works – Pandora uses 50+ algorithms

The more varied the individual techniques the stronger the ensemble – seek orthogonality

Ensemble Recommendations
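As a rough illustration of "combining them intelligently", the sketch below blends per-track scores from several recommenders with a weighted sum. The slides do not describe how Pandora combines its algorithms, so the weighting scheme, the recommender names, and the scores here are all assumptions.

```python
def blend_scores(scores_by_recommender, weights):
    """Rank tracks by a weighted sum of scores; omitted tracks contribute 0."""
    blended = {}
    for name, scores in scores_by_recommender.items():
        for track, score in scores.items():
            blended[track] = blended.get(track, 0.0) + weights[name] * score
    # Highest blended score first; ties broken by track name for determinism.
    return sorted(blended, key=lambda t: (-blended[t], t))

# Hypothetical scores from two very different techniques.
scores = {
    "content": {"t1": 0.9, "t2": 0.2},
    "collab": {"t2": 0.8, "t3": 0.7},
}
ranking = blend_scores(scores, {"content": 0.5, "collab": 0.5})
```

The blend helps most when the recommenders make different mistakes, which is the practical meaning of seeking orthogonality. In a real system the weights would themselves be tuned, for example through A/B tests.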

Page 13: Eric bieschke slides

25 music analysts

13 years in development

Up to 450 attributes identified per track – everything from the melody, harmony, and instrumentation to rhythm, vocals, and lyrics

As yet, the human ear still understands music better than machines do

Content-Based Recommendations

The Music Genome Project
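A content-based recommender can be sketched as a nearest-neighbor search over per-track attribute vectors. The example below uses cosine similarity over three made-up attributes; the real Music Genome Project attributes, their encoding, and the similarity function Pandora uses are not given in the slides, so everything named here is hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two attribute vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(seed, catalog):
    """Rank every other track by similarity to the seed track."""
    others = [t for t in catalog if t != seed]
    return sorted(others, key=lambda t: -cosine_similarity(catalog[seed], catalog[t]))

# Hypothetical analyst scores for three attributes per track.
catalog = {
    "track_a": (1.0, 0.0, 0.5),
    "track_b": (0.9, 0.1, 0.4),
    "track_c": (0.0, 1.0, 0.0),
}
neighbors = most_similar("track_a", catalog)
```

Because this approach needs only the attributes of a track, it can recommend a song that no listener has played yet.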

Page 14: Eric bieschke slides

At small scale, matrix factorization (MF) techniques work wonders

At Pandora scale, MF techniques make less sense for many problems

Don’t waste cycles doing something fancy when scale allows you to simply measure

Simple item-item recommenders win at scale

Collaborative Filtering
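One simple form of item-item recommender just counts how often two tracks are liked by the same listener and recommends the most frequent companions. The sketch below is a minimal version of that idea with made-up data; it is not Pandora's implementation, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_cooccurrence(liked_by_listener):
    """For each track, count how often each other track shares a listener."""
    co = defaultdict(Counter)
    for tracks in liked_by_listener.values():
        for a, b in permutations(set(tracks), 2):
            co[a][b] += 1
    return co

def recommend(co, track, k=2):
    """Top-k co-liked tracks, most frequent first, ties broken by name."""
    ranked = sorted(co[track].items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]

# Hypothetical thumbs-up history per listener.
likes = {
    "u1": ["x", "y"],
    "u2": ["x", "y", "z"],
    "u3": ["x", "z"],
    "u4": ["x", "y"],
}
co = build_cooccurrence(likes)
```

There is no model to fit here: the counts are a direct measurement, so their reliability grows with the amount of listening data.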

Page 15: Eric bieschke slides

Our listeners know what they want (most of the time)

Pandora is a platform for listeners to cooperate in making the music better for themselves

We build, grow, measure, and enhance this ecosystem – but mostly we stay out of the way

Pandora is awesome because our listeners are awesome

Collective Intelligence – reinforcement learning
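The slide names reinforcement learning without detail. As one very small illustration of learning from listener feedback, the sketch below uses an epsilon-greedy bandit that mostly plays the track with the best smoothed thumb-up rate and occasionally explores. The class, its parameters, and the smoothing prior are assumptions, not a description of Pandora's system.

```python
import random

class ThumbBandit:
    """Epsilon-greedy choice among candidate tracks, learning from thumbs."""

    def __init__(self, tracks, epsilon=0.1, seed=0):
        self.ups = {t: 0 for t in tracks}
        self.plays = {t: 0 for t in tracks}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def estimate(self, track):
        # Smoothed thumb-up rate; the +1/+2 prior puts unplayed tracks at 0.5.
        return (self.ups[track] + 1) / (self.plays[track] + 2)

    def choose(self):
        candidates = sorted(self.plays)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)  # explore
        return max(candidates, key=self.estimate)  # exploit

    def feedback(self, track, thumbed_up):
        self.plays[track] += 1
        self.ups[track] += int(thumbed_up)
```

Each thumb shifts the estimates a little, so the station improves as listeners react, and the system itself only has to record the feedback and act on it.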

Page 16: Eric bieschke slides

Eric Bieschke

@ericbke

http://pandora.com/careers/