A Geographical Characterization of YouTube: a Latin American View

Fernando Duarte, Fabrício Benevenuto, Virgílio Almeida, Jussara Almeida

Federal University of Minas Gerais – Brazil

Outline

• Motivation and Goals
• YouTube Features
• Crawler and Sampling
• Geographical Characterization
• Conclusions and Future Work

Motivation and Goals

• YouTube is a popular online social video-sharing service that generates high volumes of Internet traffic

• YouTube popularity in Latin America (from www.alexa.com)
– 6th in Argentina and Paraguay
– 5th in Brazil, Mexico, Chile, and Peru
– 4th in Ecuador and Venezuela

• Goal: characterize the influence of users' geographical location on traffic and social relationships
– Focus on Latin America

YouTube Features

Users – user interactions
• add users as friends
• subscribe to another user

Users – videos
• watch videos
• upload videos (unlimited)
• add videos as favorites
• post a comment on a video
• respond to a video with another video
• rate a video

Videos
• have a list of 20 related videos
• are distributed across 14 categories

Sampling Mechanism

• Sampling strategy: collect information on popular videos and analyze the user interactions around these videos

• First crawler: collects video metadata
– Starts from the most-viewed video of all time and collects the related videos recursively, in snowball fashion
– The snowball follows a breadth-first scheme

• Second crawler: collects metadata of the users found by the first crawler
– Users who uploaded videos, posted comments, or posted video responses
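The breadth-first snowball described above can be sketched roughly as follows. This is a minimal illustration, not the authors' crawler: `get_related` is a hypothetical callback standing in for whatever returns a video's related-video list, `max_tiers` is an assumed name for the depth limit, and the toy graph uses made-up ids.

```python
from collections import deque


def snowball_sample(seed, get_related, max_tiers):
    """Breadth-first snowball sample starting from `seed`.

    get_related(video_id) is a hypothetical callback returning the
    related video ids of a video. Returns a dict mapping each collected
    video id to the tier (BFS depth) at which it was first reached.
    """
    tier_of = {seed: 0}
    queue = deque([seed])
    while queue:
        video = queue.popleft()
        tier = tier_of[video]
        if tier == max_tiers:
            continue  # videos in the last tier are collected but not expanded
        for related in get_related(video):
            if related not in tier_of:  # skip videos already collected
                tier_of[related] = tier + 1
                queue.append(related)
    return tier_of


# Toy related-video graph (made-up ids, for illustration only).
toy_graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["e"], "e": []}
sample = snowball_sample("a", lambda v: toy_graph.get(v, []), max_tiers=2)
```

With `max_tiers=2`, video `e` is not collected because it is only reachable through `d`, which sits in the last tier and is not expanded.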

Crawler Architecture

[Diagram: a Server connected to Client 1, Client 2, ..., Client 7]

• Collected information on over 2 million videos, exhausting 6 tiers in 11 days (from April 3rd to 14th)

• 96 of the 100 most popular videos of all time are part of the sample

• Parallel crawler
– Server coordinates the snowball sampling
– Server avoids redundant data collection
– 7 Linux boxes
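One way to picture a server that coordinates the sampling and avoids redundant collection is a shared frontier guarded by a "seen" set. The sketch below is only an assumption about how such coordination could work: it uses threads in a single process in place of separate machines, the `Coordinator` class and its method names are hypothetical, and it does not enforce tier limits or strict breadth-first order across clients.

```python
import threading
from queue import Queue, Empty


class Coordinator:
    """Hypothetical server: hands out video ids and filters duplicates."""

    def __init__(self, seed):
        self._lock = threading.Lock()
        self._seen = {seed}
        self._frontier = Queue()
        self._frontier.put(seed)

    def next_task(self, timeout=0.1):
        """Return the next video id to crawl, or None if none arrives in time."""
        try:
            return self._frontier.get(timeout=timeout)
        except Empty:
            return None

    def report(self, related_ids):
        """Enqueue only the ids that no client has been given before."""
        with self._lock:
            for video_id in related_ids:
                if video_id not in self._seen:
                    self._seen.add(video_id)
                    self._frontier.put(video_id)

    def collected(self):
        with self._lock:
            return set(self._seen)


def client(coordinator, fetch_related):
    """Hypothetical client loop: crawl tasks until the frontier stays empty."""
    while True:
        video = coordinator.next_task()
        if video is None:
            return
        coordinator.report(fetch_related(video))


# Toy related-video graph (made-up ids) crawled by 7 client threads.
toy_graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": ["e"], "e": []}
coordinator = Coordinator("a")
clients = [
    threading.Thread(target=client, args=(coordinator, lambda v: toy_graph.get(v, [])))
    for _ in range(7)
]
for t in clients:
    t.start()
for t in clients:
    t.join()
collected = coordinator.collected()
```

Keeping the "seen" set on the server side means each video id is handed out once, regardless of how many clients report it as related.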

Statistics of videos and users collected

• The USA is responsible for 28% of videos and 38% of users
• 7% of users are from Latin America (LA); they are responsible for 7% of uploads and 6% of views
• 13% of users have no country information (empty field)
• # views > # comments > # video responses

Latin American Users

• The table is sorted by number of users
• Users from Brazil, Mexico, and Argentina contributed the most videos, but Peru ranks first in uploads per user
• In terms of traffic (watched videos), Brazil, Mexico, and the Virgin Islands lead the ranking
• LA users have an average of 22 favorite videos and 2 friends
– Orkut and Myspace users have an average of 30 and 137 friends, respectively
• We conjecture that most users interact with friends on other online social networks and use YouTube mainly to watch videos

Video Popularity

• The curve of the number of views does not descend linearly
• The top 10% most popular LA videos concentrate 76% of the views, which favors caching
• LA videos are viewed and discussed less, generating less traffic than other videos
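A concentration figure like the one above (share of views held by the top 10% of videos) can be computed from per-video view counts along these lines. The function name `top_share` and the view counts are made up for illustration; they are not the dataset from the talk.

```python
def top_share(view_counts, top_fraction=0.10):
    """Fraction of all views that goes to the top `top_fraction` of videos."""
    if not view_counts:
        raise ValueError("view_counts must not be empty")
    ranked = sorted(view_counts, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))  # at least one video
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Made-up view counts for ten videos; the single most popular one dominates.
views = [760, 60, 50, 40, 30, 20, 15, 10, 10, 5]
share = top_share(views)
```

A high share means that caching a small set of videos could serve a large part of the requests.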

Video Duration

• About 80% of the videos are shorter than 5 minutes
• There is no difference across regions

Use of Social Features

• LA users interact less on YouTube than other users.

Use of Social Features

• Although LA users are less interactive overall, there are LA users with 2400 friends, users who uploaded 1400 videos, and users who sent more than 1200 comments.

User Interactions

• Observe the percentage of comments for videos from LA, USA and others.

• Plot the distribution of this percentage

[Figure: Latin American videos]

Textual Interactions

[Figure panels: Latin American videos; USA videos]

• The probability that an LA video has more than 60% of its comments from LA users is 0.32 (for comments from USA users, it is only 0.08)
• Videos have a higher probability of receiving comments from users in the same region
• Potential use of CDNs (assuming that the number of views is also influenced by geographical factors)
• Few LA users interact with videos from the USA/others, but USA/other users interact with LA users
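The locality measure above can be read as follows: for each video, take the fraction of its comments that come from a given region, then estimate how often that fraction exceeds 60%. The sketch below assumes each video is represented simply as a list of commenter regions; the function names and the sample data are hypothetical and do not reproduce the talk's numbers.

```python
def region_comment_fraction(commenter_regions, region):
    """Fraction of one video's comments posted by users from `region`."""
    if not commenter_regions:
        return 0.0
    matches = sum(1 for r in commenter_regions if r == region)
    return matches / len(commenter_regions)


def prob_fraction_exceeds(videos, region, threshold=0.60):
    """Empirical probability that a video gets more than `threshold`
    of its comments from `region`. Videos without comments are ignored."""
    fractions = [region_comment_fraction(v, region) for v in videos if v]
    if not fractions:
        return 0.0
    return sum(1 for f in fractions if f > threshold) / len(fractions)


# Made-up LA videos, each given as the regions of its commenters.
la_videos = [
    ["LA", "LA", "LA", "USA"],      # 75% of comments from LA
    ["LA", "USA", "USA", "other"],  # 25% from LA
    ["LA", "LA", "other"],          # about 67% from LA
    ["USA", "other"],               # 0% from LA
]
p_la = prob_fraction_exceeds(la_videos, "LA")
p_usa = prob_fraction_exceeds(la_videos, "USA")
```

In this toy sample two of the four videos pass the 60% threshold for LA commenters and none do for USA commenters.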

Conclusions and Future Work

• We present a geographical characterization of YouTube, highlighting a number of differences between users from Latin America and users from other regions

• Main findings
– Videos uploaded by LA users have different characteristics from videos uploaded by users from other regions: they are viewed and discussed less
– The most popular videos concentrate most of the views, suggesting the use of caching
– Interactions are strongly influenced by geographical location, suggesting the use of CDNs to improve performance

• Future work
– Analyze the impact of language on traffic and user behavior
– Explore social network characteristics of interactions between users and videos across different regions