Movie vs movie

Irmak Sirer @frrmack movievsmovie.datasco.pe
  • Date posted: 13-Jan-2015
  • Category: Technology

description

What are your top ten favorite movies of all time? This is a very difficult question. But why? Irmak Sirer explains the challenges of measuring how much we like movies, books, songs, or products, combining insights from diverse sources like the Netflix Prize, Duncan Watts' social experiments, and the beginnings of Facebook. The better we get at measuring and ranking levels of enjoyment, the better we can customize websites, sort search results, find other people with similar tastes, and recommend products. So can we overcome these challenges? Drumroll... Yes, we can.

Transcript of Movie vs movie

1. Irmak Sirer, movievsmovie.datasco.pe

2. How much do we like things?
3. AGE 7: Oh cool. Pretty good. Space and stuff.
4. AGE 14: Omigod Omigod Omigod. Epic masterpiece is epic!!!!1! I'm in love with Leia.
5. AGE 17: WTF?
6-7. AGE 30: When you think about it, it's not that good. Ah, who am I kidding? It's amazing. I'm still in love with Leia.
8. I mean... look at her.
9-10. What determines how much I like a movie? Is my reaction to a movie / book / song predictable?
11. How much will I like The Book of Eli?
12. 2006: Cinematch. 1 billion user ratings, 55,000 movies.
13-21. Cinematch: I have a soulmate in taste (Irmak, Frrmack). Watched the same movies. Gave the exact same ratings. Except The Book of Eli. Frrmack watched The Book of Eli: "Oh man, it was FANTASTIC!" Predict.
22-29. No perfect soulmates in real life. Almost soulmates 1 to 4 around Irmak, shown as 87% soulmate, 74% soulmate, 82% soulmate, 95% soulmate.
30. Cinematch: Works well for movies that everybody rates.
31. Cinematch: Quite bad with movies that only few people rate.
32. Cinematch: Some movies are especially difficult to predict. Biggest error source: popular but weird. 15% of all errors from ONE movie.
33-38. Trivial: Mean score of everyone. Error (RMSE): 1.0540 stars. Cinematch error (RMSE): 0.9525 stars, 9.6% lower. Better rankings, better recommendations: +8.6%, +1200% people watch top recommendation (BigChaos Netflix Prize Report).
39-41. Cinematch error: 0.9525 stars. 2006: $1,000,000 for a 10% improvement. Bring it down to: error 0.8563 stars.
42. BellKor's Pragmatic Chaos
43-46. How did they do it? Before: Solid assumptions (stamped WRONG). You have a certain taste. Your taste dictates a hidden rating for Book of Eli. When you watch it, this rating is revealed to you.
47-49. How did they do it? After: Your rating changes with time. It depends on... how many you rated that day; your average rating for the day; which movies you rated on this day; shown Netflix prediction.
50-54. Trivial: Mean score of everyone, error 1.0540 stars. Cinematch error: 0.9525 stars. Your time dependent rating tendencies: error 0.9278 stars, 12.0% lower than the trivial error, without looking at which movies you like/hate! (Y. Koren, The BellKor Solution to the Netflix Grand Prize. 2009)
55-60. What does this suggest? We cannot compare a movie (Book of Eli) with all others we've seen. We compare it to a limited set. Liking (real time & remembered) depends on time and mood. Other people's opinions affect our own (followers / hipsters).
61-67. An experiment. Music Lab: A website for downloading music. Same website: music download and rating (M.J. Salganik, P.S. Dodds, D.J. Watts. Science, 311:854-856, 2006). Alternative A: Other people's ratings invisible. More or less equal ratings. Alternative B: All ratings visible. Several songs snowball in popularity. It's different songs for each trial.
68. Social influence plays a big part in determining hits and misses.
69. Problems with rating movies: We cannot compare a movie with all others we've seen. We compare it to a limited set. Liking (real time & remembered) depends on time and mood. Other people's opinions affect our own.
70. Degree of liking is sensitive and vague. "Amazing!" Tuesday 3am. "Total garbage" Sunday 12pm.
71. Degree of liking is sensitive and vague. Liking (real time & remembered) depends on time and mood. Other people's opinions affect our own.
72. Degree of liking is sensitive and vague. Dependent on many other environmental factors besides our taste.
73. Degree of liking is sensitive and vague. We cannot compare a movie with all others we've seen. We compare it to a limited set.
74. Degree of liking is sensitive and vague. Difficult to describe accurately and consistently with a number.
75. Predicting aside, can I even reliably rate & rank movies I've seen in terms of enjoyment?
76-78. What are your top twenty movies? (Irmak, Frrmack) Well... Ummm... I like Star Wars.
79. Degree of liking is sensitive and vague. Can't we do something about this?
80. Degree of liking is sensitive and vague.
81-82. Enjoyment from a movie is very high dimensional information. Rating means projecting this onto a single dimension.
83. ?
84. But sometimes you just want to do the best projection you can. What is my top twenty?
85. Degree of liking is sensitive and vague. We cannot compare a movie with all others we've seen. We compare it to a limited set.
86-91. Trying to rate Star Wars. 1: Map enjoyment to a specific scale. 2: Choose corresponding rating for this degree of liking.
92-95. Trying to rate Star Wars. But we cannot keep this entire history of enjoyment in mind. We fuzzily remember a small subset. We map based on this subset.
96. SAMPLING
97. BIASED SAMPLING
98-99. Tuesday
100-101. Friday
102. Degree of liking is sensitive and vague. Can't we do something about this?
103. We can certainly handle single comparisons?
104-106. We can certainly handle single comparisons: less vague, little information.
107. I can manually compare it with all others.
108. And find exactly where it belongs: right after Indiana Jones, right before The Princess Bride.
109. Full ranking: Compare all pairs.
110. 1,000,000 comparisons? That's a bit too much effort for me.
111-114. We don't need all of them. If [image], [image], I have some information about [image].
115. Compare a random sample of pairs.
116. Use a ranking algorithm that utilizes all the information. Good idea!
117-119. Elo rating system
120-125. Elo rating system: hotness 7.00, hotness range -1.50 to +1.50, shown with 7.12; hotness 8.00, range -1.50 to +1.50, shown with 7.68.
126. Elo rating system: 7.00 (range -150 to +150): 36% to win. 8.00 (range -150 to +150): 64% to win.
127. Elo rating system: How do we find out what these ranges are?
128. Elo rating system: Start with the same guess for every contender: 5.00, 5.00, 5.00, 5.00, 5.00, 5.00.
129-131. Elo rating system: 5.00 vs 5.00? Afterwards: 5.12 and 4.88. Update the best guesses accordingly.
132-133. Elo rating system: 5.12 vs 5.00? Afterwards: 5.24 and 4.88.
134-135. Elo rating system: 5.24 vs 5.00? Afterwards: 5.14 and 5.10.
136. We don't need all comparisons. If [image], I have some information about [image].
137-140. Elo rating system: 7.61 vs 4.02? 89% to win, 11% to win. Updates shown: +.02 and -.02 (slide 139); -.53 and +.53 (slide 140).
141-142. Elo rating system: 9.07, 8.42, 6.40, 4.88, 4.20, 3.03. We now have scores on a single scale (estimates of people's appreciation levels).
143. Elo rating system: ...and a ranking: 1. 9.07, 2. 8.42, 3. 6.40, 4. 4.88, 5. 4.20, 6. 3.03.
144. Degree of liking is sensitive and vague. Can we somehow apply this to movies, then?
145-148. We can do better: Bayesian ranking algorithms. Glicko (The Elo Killer), 1999. TrueSkill, 2007.
149. Bayesian ranking: 4.46 and 4.01, each with a +- range.
150. Degree of liking is sensitive and vague. Liking (real time & remembered) depends on time and mood. Other people's opinions affect our own.
151. Bayesian ranking: 4.46 and 4.01, each with a +- range.
152. Bayesian ranking: 4.46 vs 4.01: 82% to win, 3% to draw, 15% to win.
153. Bayesian ranking: ?
154. Bayesian ranking. Elo: Best guess for the center: 4.3.
155-158. Bayesian ranking. Bayesian: It could be centered around 4.3. It could also be centered around 4.2, or centered around 4.4. Less likely but even around 4.5.
159-160. Bayesian ranking. [Plot: probability over rating, axis 3.5 to 5, centered at 4.3, width labeled "uncertainty".]
161. Few comparisons: Lots of uncertainty (anything from 2.3 to 4.5 is quite possible). [Plot: probability over rating, axis 2.0 to 5.]
162. After many comparisons: Quite sure (pretty much between 4.11 and 4.18). [Plot: probability over rating, axis 2.0 to 5.]
163. Bayesian ranking: ?
164-165. Bayesian ranking: Lord of the Rings, Star Wars. [Plot: rating axis 2.0 to 5.0.]
166. "How did they do it? After: Your rating changes with time." A small, constant increase in uncertainty before each comparison. [Plot: probability over rating, axis 3.5 to 5, width labeled "uncertainty".]
167. Degree of liking is sensitive and vague. Great! We have a system!
168. How many is too many? I don't want to spend too much time on this.
169-175. Minimum Effort, Maximum Information. [Slides show 1 / 3 / 5 rating scales.] Not reliable by itself. Still carries a lot of information.
176. What else can we do? I don't want to spend too much time on this.
177-181. Minimum Effort, Maximum Information. 98% to win, 1% to draw, 1% to win. 98% to win: Did not learn anything new. 2% to win: Quite a bit of new information. I can calculate the expected amount of information from a comparison!
182-185. Minimum Effort, Maximum Information. Certain about both movies: Won't learn a lot. Don't know much about either: Will learn a lot regardless of outcome.
186. What are your top twenty movies? (Irmak, Frrmack)
187. movievsmovie.datasco.pe
188. Quantifying human reactions is hard: books, celebrities, songs, TV shows, food, importance of issues, politicians, what to spend fun budget on, products, teams in different sports.
189. Degree of liking is sensitive and vague. "Amazing!" Tuesday 3am. "Total garbage" Sunday 12pm.
190-191. Quantifying reactions is very useful: customized websites; sorting search results; recommendations; connecting with other people of similar tastes; identifying meaningful groups of similar products / people; understanding your own preferences.
192-193. Quantifying human reactions is hard. Start with a rating, pose the correct comparisons. Every decision gets us closer.
194. Degree of liking is sensitive and vague. "Amazing!" Tuesday 3am. "Total garbage" Sunday 12pm.
195-196. Many comparisons for a movie over different days average out mood and other factors. We can't do much about social influence, but we should just accept that as a natural part of how much we like things.
197. Degree of liking is sensitive and vague. "Amazing!" Tuesday 3am. "Total garbage" Sunday 12pm.
198. A great way of collecting desired data is to make it fun.
199. movievsmovie.datasco.pe
200. Thanks
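
The Elo slides (win chances, then "update the best guesses accordingly") and the "expected amount of information from a comparison" slides can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the code behind movievsmovie.datasco.pe. The constants `SCALE` and `K` and all function names are hypothetical. `SCALE = 4.0` assumes the standard logistic Elo curve, with 4.0 slide points playing the role of 400 Elo points, which is consistent with the 36% / 64% and 89% / 11% figures on the slides. `K = 0.24` is chosen only so that 5.00 vs 5.00 moves to 5.12 and 4.88. The information measure is plain binary entropy and ignores draws and rating uncertainty, so it is simpler than the Bayesian systems (Glicko, TrueSkill) the talk goes on to mention.

```python
import math

# Assumption: 4.0 points on the slides' scale act like 400 points in standard Elo.
SCALE = 4.0
# Assumption: step size picked so that 5.00 vs 5.00 becomes 5.12 / 4.88.
K = 0.24


def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / SCALE))


def update(winner, loser):
    """Return the new (winner, loser) ratings after one comparison.

    An expected win barely moves the ratings; an upset moves them a lot.
    """
    surprise = 1.0 - expected_score(winner, loser)
    return winner + K * surprise, loser - K * surprise


def expected_information(p_win):
    """Binary entropy (in bits) of a win/lose outcome; draws are ignored."""
    if p_win <= 0.0 or p_win >= 1.0:
        return 0.0
    p_lose = 1.0 - p_win
    return -(p_win * math.log2(p_win) + p_lose * math.log2(p_lose))


if __name__ == "__main__":
    print(round(expected_score(7.00, 8.00), 2))   # 0.36
    print(round(expected_score(7.61, 4.02), 2))   # 0.89
    new_winner, new_loser = update(5.00, 5.00)
    print(round(new_winner, 2), round(new_loser, 2))  # 5.12 4.88
    # A 98% favorite is a far less informative matchup than a coin flip.
    print(expected_information(0.98) < expected_information(0.50))  # True
```

With these assumed constants, a second win for the 5.12 movie over a fresh 5.00 movie gives roughly 5.24 and 4.88, in line with the next step on the slides. The later steps and the +.02 / -.53 updates on the slides do not follow from a single fixed `K`, so treat the numbers as illustrative.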