Views
http://www.fastcompany.com/magazine/142/it-had-to-be-you.html
The History of YouTube
In five years, YouTube has completely reshaped the Internet, media, and political landscapes.
• February 14, 2005: Chad Hurley, Steve Chen, and Jawed Karim begin work on a "Flickr or HotorNot for video." They register youtube.com the next day.
• April 23, 2005: "Me at the Zoo," 19 seconds of Karim in front of the elephants at the San Diego Zoo, is the first video posted to the site.
• December 15, 2005: YouTube officially debuts.
• October 9, 2006: Google buys YouTube for $1.65 billion.
• October 12, 2009: YouTube passes 1 billion video views a day but remains unprofitable.
YouTube Video Categories
• Autos & Vehicles
• Comedy
• Education
• Entertainment
• Film & Animation
• Gaming
• Howto & Style
• Music
• News & Politics
• Nonprofits & Activism
• People & Blogs
• Pets & Animals
• Science & Technology
• Sports
• Travel & Events
Observational Study
• What we analyzed:
– 30 random videos from each of the 15 YouTube video categories (every third video)
– Length of video
– Number of views
– Number of comments
• How we collected data:
– Split up the categories among the three of us, 5 categories each
– Recorded the variables listed above for every third video in the assigned category
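The "every third video" rule can be sketched in a few lines of Python. This is only an illustration: the listing `videos` is hypothetical, and starting the count at the third item is an assumption, since the slides do not say where the count began.

```python
def every_third(items, sample_size=30):
    """Return the 3rd, 6th, 9th, ... items until sample_size items are collected."""
    return items[2::3][:sample_size]

# Hypothetical category listing; the real listing came from browsing YouTube.
videos = [f"video_{i}" for i in range(1, 101)]
sample = every_third(videos)
print(len(sample), sample[0], sample[-1])
```

A category needs at least 90 listed videos for this rule to yield a full sample of 30.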
Linear Regression T-test: Length of Video vs. Number of Views
[Scatter plot, Collection 1: Length_of_Video (thousands) on the horizontal axis, Number_of_Views on the vertical axis. Fitted line: Number_of_Views = -550 Length_of_Video + 3.5e+06; r2 = 0.00025]
Linear Regression T-test: Length of Video vs. Number of Views Residual Plot
[Scatter plot with residual plot below it, Collection 1: both panels use Length_of_Video (thousands) on the horizontal axis. Fitted line: Number_of_Views = -550 Length_of_Video + 3.5e+06; r2 = 0.00025]
Linear Regression T-test: Exploratory Data Analysis
• Negative
• Linear
• Very weak
• Scattered residual plot
• Correlation (r) = -0.01581 (negative, matching the sign of the fitted slope)
• Coefficient of determination (r2) = 0.00025
• About 0.025% of the variation in the number of views is explained by the linear relationship with the length of the video
• Overall, in our sample the number of views decreases very slightly as the length of the video increases; however, the relationship is far too weak to conclude that the two variables are related. With r2 this close to zero, the length of a video explains essentially none of the variation in its views.
Linear Regression T-test: Assumptions and Checks
Assumptions and checks:
• 2 independent SRS: Check
• True relationship is linear: Assumed
Linear Regression T-test: Length of Video vs. Number of Views
Ho: β = 0    Ha: β < 0
t = b / SE_b = -0.3315
P(t < -0.3315, df = 448) = 0.37
We fail to reject Ho because the P-value is greater than α = 0.05. We do not have sufficient evidence that the slope of the population regression line of number of views on length of video is less than 0. Thus, we cannot conclude that the number of views decreases as the length of the video increases.
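As a rough check on this test, the t statistic for a regression slope can be recovered from the correlation alone using t = r·sqrt(n - 2) / sqrt(1 - r²). The sketch below assumes n = 450 (30 videos in each of 15 categories, which gives df = 448) and takes r as negative because the fitted slope is negative. The P-value uses a standard normal approximation instead of the t distribution, which is reasonable only because df is large.

```python
import math
from statistics import NormalDist

def slope_t_statistic(r: float, n: int) -> float:
    """t statistic for Ho: beta = 0, from the sample correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def lower_tail_p(t: float) -> float:
    """Approximate P(T < t) with a standard normal (assumes large df)."""
    return NormalDist().cdf(t)

# Assumed inputs: r rounded from the slides, n = 30 videos x 15 categories.
t = slope_t_statistic(-0.0158, 450)
p = lower_tail_p(t)
print(round(t, 3), round(p, 2))
```

Because r is rounded, the result agrees with the reported t of about -0.33 and P of about 0.37 only approximately.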
Linear Regression T-test: Number of Views vs. Number of Comments
[Scatter plot, Collection 1: Number_of_Views (millions) on the horizontal axis, Number_of_Comments on the vertical axis. Fitted line: Number_of_Comments = 0.00183 Number_of_Views + 40; r2 = 0.73; r = 0.8544]
Linear Regression T-test: Number of Views vs. Number of Comments Residual Plot
[Scatter plot with residual plot below it, Collection 1: both panels use Number_of_Views (millions) on the horizontal axis. Fitted line: Number_of_Comments = 0.00183 Number_of_Views + 40; r2 = 0.73; r = 0.8544]
Linear Regression T-test: Exploratory Data Analysis
• Positive
• Linear
• Moderately strong
• Scattered residual plot
• Correlation (r) = 0.8544
• Coefficient of determination (r2) = 0.73
• 73% of the variation in the number of comments is explained by the linear relationship with the number of views
• Overall, for our population of YouTube videos, as the number of views increases, the number of comments also tends to increase. Thus, as more people view a video, the number of comments goes up.
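The fitted line Number_of_Comments = 0.00183 Number_of_Views + 40 can be used to predict comments and compute residuals. This minimal sketch assumes the equation is expressed in raw counts (not the "millions" units shown on the plot axis), and the observed video in the example is hypothetical.

```python
SLOPE = 0.00183    # predicted additional comments per additional view
INTERCEPT = 40.0   # predicted comments for a video with zero views

def predict_comments(views: float) -> float:
    """Predicted number of comments from the fitted regression line."""
    return SLOPE * views + INTERCEPT

def residual(views: float, observed_comments: float) -> float:
    """Observed minus predicted comments; positive means more comments than predicted."""
    return observed_comments - predict_comments(views)

predicted = predict_comments(1_000_000)
res = residual(1_000_000, 2_500)  # hypothetical video: 1,000,000 views, 2,500 comments
print(round(predicted), round(res))
```

Under that assumption, a video with one million views is predicted to have roughly 1,870 comments, or about 1.8 comments per thousand views.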
Linear Regression T-test: Assumptions and Checks
Assumptions and checks:
• 2 independent SRS: Check
• True relationship is linear: Assumed
Linear Regression T-test: Number of Views vs. Number of Comments
Ho: β = 0    Ha: β > 0
t = b / SE_b = 34.66
P(t > 34.66, df = 448) < 0.0001
We reject Ho because the P-value is less than α = 0.05. We have sufficient evidence that the slope of the population regression line of number of comments on number of views is greater than 0. Thus, as the number of views increases, the number of comments also increases.
Calculated Means: Average Number of Views in Each Category
Category: Mean
Autos and Vehicles: 15,189.7
Comedy: 40,503.7
Education: 5,407.67
Entertainment: 113,041
Film and Animation: 33,559.4
Gaming: 76,944.5
Howto and Style: 27,768.9
Music: 49,350,400
News and Politics: 32,863.1
Nonprofits and Activism: 11,691.8
People and Blogs: 24,751.4
Pets and Animals: 54,006.5
Science and Technology: 8,613.77
Sports: 109,538
Travel and Events: 2,032.5
[Bar chart: average number of views by category, all 15 categories including Music. Axes: Category and Number of Views (0 to 60,000,000)]
Calculated Means: Average Number of Views in Each Category
[Bar chart: average number of views by category with Music excluded. Axes: Category and Number of Views (0 to 120,000)]
Calculated Means: Exploratory Data Analysis
• Music had by far the highest average number of views of all of the categories at 49,350,400. Entertainment, Sports, and Gaming had the next highest averages, respectively, although none of them came close to Music.
• This suggests that, in our sample, music videos draw far more views than videos in any other category.
• Travel and Events, Education, and Science and Technology had the lowest average number of views, respectively. This is most likely because people use YouTube for entertainment purposes.
1-Sample T-interval: Average Number of Views in Each Category
Assumptions and checks:
• SRS: Check
• Normal population or n ≥ 30: n = 30
Interval: x̄ ± t* · s / √n
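The one-sample t-interval, mean ± t*·s/√n, can be sketched as follows for the Music category. The mean (49,350,400), standard deviation (50,547,300), and n = 30 come from the slides; the critical value t* = 2.045 is the standard table value for 95% confidence with df = 29 and is supplied here as an assumption, not computed.

```python
import math

def t_interval(mean: float, sd: float, n: int, t_star: float) -> tuple[float, float]:
    """Confidence interval for a population mean: mean +/- t_star * sd / sqrt(n)."""
    margin = t_star * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Music category values from the slides; t_star is a table value for df = 29.
low, high = t_interval(mean=49_350_400, sd=50_547_300, n=30, t_star=2.045)
print(f"({low:,.0f}, {high:,.0f})")
```

With a rounded critical value, the endpoints land close to, but not exactly on, the interval reported for Music.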
1-Sample T-interval: Average Number of Views in Each Category
We are 95% confident that the true mean number of views for each category lies in the following interval:
Autos and Vehicles: (3000.53, 27378.9)
Comedy: (-386.52, 81393.9)
Education: (96.4158, 10718.9)
Entertainment: (14718.7, 211363)
Film and Animation: (5052.9, 62065.9)
Gaming: (51236, 102653)
Howto and Style: (10583.1, 44954.7)
Music: (3.04757 x 10^7, 6.82251 x 10^7)
News and Politics: (21382.2, 44344)
Nonprofits and Activism: (5045.54, 18388.1)
People and Blogs: (10304.3, 39198.5)
Pets and Animals: (-5397.94, 113411)
Science and Technology: (3529.7, 13697.8)
Sports: (-58631.4, 277707)
Travel and Events: (970.212, 3094.79)
1-Sample T-interval: Difference Between Max and Min of the Interval for Number of Views in Each Category
[Bar chart: interval width by category with Music excluded. Axes: Category and Number of Views (0 to 400,000)]
[Bar chart: interval width by category, all 15 categories including Music. Axes: Category and Number of Views (0 to 80,000,000)]
1-Sample T-interval: Exploratory Data Analysis
• Music had the largest difference between the max and min of its interval at 37,749,400 (68,225,100 minus 30,475,700), which is why we removed it from the second chart in order to get a better view of the other categories.
• Travel and Events had the smallest difference between the max and min of its interval at 2,124.578.
• We think that Music had the widest interval because its standard deviation was so large, at 50,547,300, due to several outliers with views in the hundreds of millions.
• Following Music were Sports and then Entertainment. We think these intervals were wide for the same reason as Music: a few data points were outliers.
• The narrower intervals following Travel and Events included Science and Technology, Nonprofits and Activism, and Education.
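The interval widths discussed here are simply each interval's upper bound minus its lower bound. The sketch below recomputes them from the 95% intervals listed on the earlier slide and ranks the categories; the variable names are illustrative only.

```python
# 95% confidence intervals for mean views, as listed on the slides.
intervals = {
    "Autos and Vehicles": (3000.53, 27378.9),
    "Comedy": (-386.52, 81393.9),
    "Education": (96.4158, 10718.9),
    "Entertainment": (14718.7, 211363),
    "Film and Animation": (5052.9, 62065.9),
    "Gaming": (51236, 102653),
    "Howto and Style": (10583.1, 44954.7),
    "Music": (3.04757e7, 6.82251e7),
    "News and Politics": (21382.2, 44344),
    "Nonprofits and Activism": (5045.54, 18388.1),
    "People and Blogs": (10304.3, 39198.5),
    "Pets and Animals": (-5397.94, 113411),
    "Science and Technology": (3529.7, 13697.8),
    "Sports": (-58631.4, 277707),
    "Travel and Events": (970.212, 3094.79),
}

# Width of each interval: upper bound minus lower bound.
widths = {name: high - low for name, (low, high) in intervals.items()}

# Categories ordered from widest to narrowest interval.
ranked = sorted(widths, key=widths.get, reverse=True)
for name in ranked:
    print(f"{name}: {widths[name]:,.3f}")
```

Sorting by width makes the scale gap obvious: the Music interval is wider than the next widest, Sports, by roughly two orders of magnitude.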
Application to Population
• For the population of YouTube videos, the number of views decreases very slightly as the length of the video increases; however, our data do not show a meaningful relationship between the two variables. Thus, we found no evidence that the length of a video deters the average viewer.
• For the population of YouTube videos, as the number of views increases, the number of comments also increases. Thus, the more people view a video, the more comments it is likely to receive.
• The extreme variation in the average number of views is most likely due to the fact that people use YouTube for entertainment purposes.
• YouTube is a social site, so popularity and recognition will affect whether a person watches a video or not.
Bias and Error
• A video registers a view even if the viewer does not watch it all the way through.
• There are many videos on YouTube, all dispersed among the given categories. Because each video must fit in one, some videos are only loosely related to their category's topic.
– Some of the categories leave less room for variation, such as News and Politics, while others, like Comedy, could encompass many different types of video.
• Because the number of videos varies by category, there is less variation among the videos we chose in the categories with fewer videos.
• Some videos do not have much appeal, whether because of their subject matter or because better versions exist. These videos will not get views regardless of how long they have been posted.
• Some videos had comments disabled, which we counted as zero comments. Perhaps they would have received comments if comments had been allowed.
• There are some strong outliers, like the hour-and-a-half-long video.
Personal Opinions
• Overall, the length of the video did not appear to affect the number of views. This surprised us because we expected that people would not want to sit through a longer video.
• The margin of error for the mean number of views for Music was surprisingly large. We attributed this to Justin Bieber because his three videos included in the data had enormous numbers of views.
• We were not surprised that the less “entertaining” categories had fewer views because YouTube is a social site, so popularity and recognition will affect whether a person watches a video or not.