MTurk > Machine Learning

description

Machine learning is hard; asking another person's opinion is easy. In this presentation we talk about how Polyvore (www.polyvore.com) uses Amazon's Mechanical Turk to answer questions too hard for the fastest machines and the best classifiers. We also reveal the secret sauce that helped boost our Mturk answer accuracy from 60% to over 90%.

Transcript of MTurk > Machine Learning

1. Mturk > Machine Learning
Bhaskar Rao, Polyvore

2. What is Polyvore?
An online fashion community
3. Discover your style
4. How Big is Polyvore?
6/24/11
1M sets
Sets created monthly
7 minutes
Average time on site
10M visitors
Unique visitors to Polyvore monthly
1.5M clips
Images clipped monthly
12.4%
Of Polyvore's users visit 100+ times monthly
140M views
Pageviews on Polyvore monthly
5. Polyvores in the Wild
General Behavior
Collect & Create
Clip from the Internet, organize, tag.
Create sets, make collections.
Consume
Explore, search, browse.
Like stuff, leave comments, build social networks.
Share
Embed in an offsite instance.
Get alerts for offsite activity.
6. 7. 8. 9. 10.
11. What is the Mechanical Turk?
12. The Turk (circa 1770)
Invented in 1770 by Wolfgang von Kempelen.
The first machine that could play chess.
Beat challengers like Napoleon and Benjamin Franklin.
Hoax revealed 50 years later.
wikipedia.com
13. Amazon Mechanical Turk (circa 2007)
Artificial AI
Crowd-sourced marketplace
HIT (Human Intelligence Task) = question
24/7; hundreds of thousands of on-demand workers
14. Amazon Mechanical Turk (circa 2007)
mturk.com
15. Why Turk?
16. The power of The Turk.
Surveys
Startup idea validation
Training Classifiers
Gathering data
Attempt to find Jim Gray.
Validating recommendations
Removing Porn
Art
17. Power of the Turk : Wisdom of Crowds
Source: crowdflower.com
18. Power of the Turk : Replacing Journalism?
mybossisarobot.com
19. Mturk > ML if (problem == hard OR time == startup)
Mturk is a fantastic resource for startups!
Classify 10,000 sites as stores: 2-3 weeks of researcher time ($7,000+) vs. 1 day of Mturk ($500).
Porn removal
Find the official website of Chanel
Is Chanel a fashion brand?
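The cost comparison on this slide can be worked out per site. The sketch below uses only the slide's figures (10,000 sites, $7,000+ of researcher time, $500 of Mturk spend); the per-site breakdown and the variable names are our own arithmetic, not something stated in the talk, and the researcher figure is a lower bound.

```python
# Figures from the slide; the per-site breakdown is our own arithmetic.
SITES = 10_000
RESEARCHER_COST_USD = 7_000  # lower bound: the slide says "$7,000+"
MTURK_COST_USD = 500


def cost_per_item(total_usd: float, items: int) -> float:
    """Average spend per classified item."""
    return total_usd / items


researcher_per_site = cost_per_item(RESEARCHER_COST_USD, SITES)
mturk_per_site = cost_per_item(MTURK_COST_USD, SITES)
savings_ratio = RESEARCHER_COST_USD / MTURK_COST_USD

print(f"Researcher: ${researcher_per_site:.2f} per site")
print(f"Mturk:      ${mturk_per_site:.2f} per site")
print(f"Mturk is at least {savings_ratio:.0f}x cheaper")
```

Under these numbers Mturk comes to about 5 cents per site against roughly 70 cents for researcher time, before counting the difference in turnaround (1 day vs. 2-3 weeks).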
20. Mturk > ML if (problem == hard OR time == startup OR ...)
Mturk is a fantastic resource for startups!
Classify 10,000 sites as stores: 2-3 weeks of researcher time ($7,000+) vs. 1 day of Mturk ($500): POSSIBLE
Porn / not-porn classification
Find the official website of Chanel
Perfect logo of Chanel
21. Mturk > ML if (problem == hard OR time == startup OR ...)
Mturk is a fantastic resource for startups!
Classify 10,000 sites as stores: 2-3 weeks of researcher time ($7,000+) vs. 1 day of Mturk ($500).
Porn / not-porn classification: HARD
Find the official website of Chanel
Is Chanel a fashion brand?
22. Mturk > ML if (problem == hard OR time == startup OR ...)
Mturk is a fantastic resource for startups!
Classify 10,000 sites as stores: 2-3 weeks of researcher time ($7,000+) vs. 1 day of Mturk ($500).
Porn / not-porn classification
Find the official website of Chanel: YIKES!
Is Chanel a fashion brand?
23. Mturk > ML if (problem == hard OR time == startup OR ...)
Mturk is a fantastic resource for startups!
Classify 10,000 sites as stores: 2-3 weeks of researcher time ($7,000+) vs. 1 day of Mturk ($500).
Porn / not-porn classification
Find the official website of Chanel
Is Chanel a fashion brand? IMPOSSIBLE?
24. How to Turk?
The basics...
Designing complex crowdsourcing tasks is hard
Stick to simple tasks
Iterate
25. The golden rule
We are all human: you and I and the Mturk workers.
Say hello at www.turkernation.com
Get feedback
Be fair
Do not get ripped off
26. Ready, Set, Fire
Is this website an e-commerce store?
Fire 50 questions
60% accuracy
FAIL!
Twitter.com
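A pilot like this one is scored by comparing the crowd's answers with answers you trust. The sketch below is a minimal illustration of that check, assuming you have hand-labeled the same 50 sites yourself; the function name, the site names, and the labels are all hypothetical, and the 30-out-of-50 split is chosen only to reproduce the 60% figure on the slide.

```python
def accuracy(worker_answers: dict[str, str], gold_answers: dict[str, str]) -> float:
    """Fraction of hand-labeled questions that the crowd answered correctly."""
    scored = [q for q in gold_answers if q in worker_answers]
    if not scored:
        raise ValueError("no overlapping questions to score")
    correct = sum(worker_answers[q] == gold_answers[q] for q in scored)
    return correct / len(scored)


# Hypothetical pilot: 50 sites labeled by hand, 30 of which the crowd got right.
gold = {f"site{i}.example": "store" for i in range(50)}
crowd = {
    site: ("store" if i < 30 else "not store")
    for i, site in enumerate(gold)
}

print(f"{accuracy(crowd, gold):.0%}")  # prints 60%
```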
27. How to design a HIT?
28. Supervision needed...
29. Retry 50 questions.
Allow only reputable workers
New HIT design after feedback
That should do it, right?
30. 80%
31. Better? NO!
Call a crowdsourcing company?
Hire an army?
Write a classifier?
32. EUREKA: The golden rule, REDUX
Qualification tests, duh!
So very overlooked, or so very obvious?
Automate it all.
Training data for Mturk?
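The qualification-test idea can be sketched as a small grading step: workers first answer questions whose correct answers you already know, and only those who score well enough are allowed onto the real HITs. The code below is an illustrative local model, not Polyvore's system and not the Mturk API; the class name, the 0.8 pass mark, and the sample data are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class QualificationTest:
    answer_key: dict[str, str]  # question id -> correct answer, labeled by us
    pass_mark: float = 0.8      # assumed threshold, not taken from the talk

    def score(self, submission: dict[str, str]) -> float:
        """Fraction of the answer key the worker got right (blanks count as wrong)."""
        correct = sum(submission.get(q) == a for q, a in self.answer_key.items())
        return correct / len(self.answer_key)

    def passes(self, submission: dict[str, str]) -> bool:
        return self.score(submission) >= self.pass_mark


def qualified_workers(test: QualificationTest,
                      submissions: dict[str, dict[str, str]]) -> set[str]:
    """Worker ids whose test score meets the pass mark."""
    return {worker for worker, sub in submissions.items() if test.passes(sub)}


# Hypothetical test of five tricky "is this a store?" questions.
key = {"q1": "yes", "q2": "no", "q3": "yes", "q4": "no", "q5": "yes"}
test = QualificationTest(answer_key=key)
submissions = {
    "worker_a": dict(key),                 # 5 of 5 correct
    "worker_b": {**key, "q5": "no"},       # 4 of 5 correct
    "worker_c": {"q1": "yes", "q2": "no"}, # 2 of 5, three left blank
}
print(sorted(qualified_workers(test, submissions)))
```

With the assumed 0.8 pass mark, worker_a and worker_b qualify and worker_c does not. The tricky questions double as training material, since a worker who studies them learns what the requester means by the question.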
33. 97%
34. The process (successful Mturk recipe)
Design a HIT
Iterate on design
Answer a few tricky ones.
Upload the HITs
Go home and drink beer and watch reruns
Next day: 87%+ accuracy (usually).
35. Best Practices
Automate it all
$ASK->ask($Question, $Options)
$ASK->final_answer()
35
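The two calls on this slide suggest a thin wrapper: one method to post a question with its options, another to collapse the workers' replies into a single answer. The slide does not show how that wrapper works, so the following is only a Python sketch of one plausible shape. The backend callable, the majority-vote rule, and the 0.6 agreement threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical backend: given a question and its options, returns one answer
# per worker. A real one would post HITs and collect results.
Backend = Callable[[str, Sequence[str]], list[str]]


class Ask:
    def __init__(self, backend: Backend, min_agreement: float = 0.6):
        self.backend = backend
        self.min_agreement = min_agreement  # assumed threshold
        self.answers: list[str] = []

    def ask(self, question: str, options: Sequence[str]) -> None:
        """Send the question out and keep only answers that are valid options."""
        replies = self.backend(question, options)
        self.answers = [a for a in replies if a in options]

    def final_answer(self) -> Optional[str]:
        """Majority answer, or None when the workers do not agree strongly enough."""
        if not self.answers:
            return None
        answer, votes = Counter(self.answers).most_common(1)[0]
        if votes / len(self.answers) < self.min_agreement:
            return None
        return answer


def fake_backend(question: str, options: Sequence[str]) -> list[str]:
    """Stand-in for real workers: four say yes, one says no."""
    return ["yes", "yes", "no", "yes", "yes"]


asker = Ask(fake_backend)
asker.ask("Is this website an e-commerce store?", ["yes", "no"])
print(asker.final_answer())  # prints yes
```

Returning None on weak agreement leaves room for a follow-up step, such as asking more workers or answering the question yourself.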
36. Maximum Awesome
What happens if you meld a classifier, Mturk, and yourself into an Unholy Q&A System?
Answer a few questions, and the system self-calibrates.
NEXT TECH TALK
37. Thank You
Questions?
www.polyvore.com/cgi/about