Lecture 5: Social Web Data Analysis (2012)
Uploaded by: lora-aroyo
Category: Technology
![Page 1: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/1.jpg)
Social Web
Lecture 5
How can we MINE, ANALYSE and VISUALISE the Social Web? (1)
Marieke van Erp
The Network Institute
VU University Amsterdam
Monday, March 5, 12
![Page 2: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/2.jpg)
Why?
• UGC (user-generated content) provides an enormous wealth of data
• insights into users’ daily lives
• insights into communities
• insights into trends
![Page 3: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/3.jpg)
What’s the added value of mining social web data for the individual?
![Page 4: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/4.jpg)
To whom it may concern
• Politicians
• Companies
• Governmental institutions
• You?
![Page 5: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/5.jpg)
The Age of Big Data
• 25 billion tweets on Twitter in 2010, by 175 million users
• 360 billion pieces of content on Facebook in 2010, by 600 million different users
• 35 hours of video uploaded to YouTube every minute
• 130 million photos uploaded to Flickr per month
![Page 6: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/6.jpg)
Questions to Ask
• Who uploads/talks? (age, gender, nationality, community)
• What are the trending topics?
• What else do these users like?
• Who are the most/least active users?
• etc.
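Two of these questions (most active users, trending topics) can be approximated with simple counting. The sketch below is only an illustration: the `POSTS` sample and the function names are invented here, and hashtags are assumed to be a usable stand-in for "topics".

```python
from collections import Counter
import re

# Hypothetical sample data: (user, text) pairs invented for illustration.
POSTS = [
    ("anna", "Loving the #socialweb lecture today"),
    ("ben", "Big data everywhere #bigdata #socialweb"),
    ("anna", "Reading about #bigdata products"),
    ("carl", "Coffee first"),
    ("anna", "More #socialweb homework"),
]

HASHTAG = re.compile(r"#(\w+)")


def most_active_users(posts, n=3):
    """Return the n users with the most posts as (user, count) pairs."""
    return Counter(user for user, _ in posts).most_common(n)


def trending_topics(posts, n=3):
    """Return the n most frequent hashtags (lower-cased) as (tag, count) pairs."""
    tags = Counter()
    for _, text in posts:
        tags.update(tag.lower() for tag in HASHTAG.findall(text))
    return tags.most_common(n)


if __name__ == "__main__":
    print(most_active_users(POSTS))
    print(trending_topics(POSTS))
```

Questions about age, gender or community would need profile fields that this toy sample does not contain.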
![Page 7: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/7.jpg)
The Rise of the Data Scientist
http://radar.oreilly.com/2010/06/what-is-data-science.html
![Page 8: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/8.jpg)
The Rise of the Data Scientist
• Data Science enables the creation of data products
• Data products are applications that acquire their value from the data, and create more data as a result.
• Users are in a feedback loop: they constantly provide information about the products they use, which gets used in the data product.
![Page 9: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/9.jpg)
Popular Data Products
![Page 10: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/10.jpg)
Data Mining 101
(Inspired by George Tziralis’ FOSS Conf’09, John Elder IV’s Salford Systems Data Mining Conf. and Toon Calders’ slides)
Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data.
http://www.freefoto.com/images/33/12/33_12_7---Pebbles_web.jpg
![Page 11: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/11.jpg)
Data Mining 101
• Databases
• Statistics
• Artificial Intelligence
![Page 12: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/12.jpg)
Steps
• Data input & exploration
• Preprocessing
• Data mining algorithms
• Evaluation & Interpretation
![Page 13: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/13.jpg)
Data Input & Exploration
• What data do I need to answer question X?
• What variables are in the data?
• What are the basic statistics of my data?
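A first exploration pass can be as small as listing the variables and summarising one numeric field. This is a minimal sketch using only the standard library; the `USERS` records and their field names are made up for illustration.

```python
import statistics

# Hypothetical records: one dict per user, fields invented for illustration.
USERS = [
    {"name": "anna", "age": 24, "likes": 120},
    {"name": "ben", "age": 31, "likes": 15},
    {"name": "carl", "age": None, "likes": 48},
    {"name": "dina", "age": 27, "likes": 15},
]


def variables(records):
    """List every variable (dict key) that occurs in the data."""
    return sorted(set().union(*records))


def summarise(records, field):
    """Basic statistics for one numeric field, ignoring missing values."""
    values = [r[field] for r in records if r.get(field) is not None]
    summary = {"count": len(values), "missing": len(records) - len(values)}
    if values:
        summary.update(
            min=min(values),
            max=max(values),
            mean=statistics.mean(values),
            median=statistics.median(values),
        )
    return summary


if __name__ == "__main__":
    print(variables(USERS))
    print(summarise(USERS, "age"))
    print(summarise(USERS, "likes"))
```

Counting missing values early is a deliberate choice: it shows whether a question can be answered with the data at hand before any mining starts.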
![Page 14: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/14.jpg)
Are all likes equal? Do they all mean the same?
Do people like for the same reason?
Do ‘likes’ mean the same across the different systems?
![Page 15: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/15.jpg)
Input & Exploration in ‘LikeMiner’
![Page 16: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/16.jpg)
Preprocessing
• Cleanup!
• Choose a suitable data model
• What happens if you integrate data from multiple sources?
• Reformat your data
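The preprocessing steps above can be sketched as follows. Everything here is an assumption made for illustration: the two sources, their field names, and the choice to treat source A's timestamps as UTC. The point is that integrating sources means mapping them onto one data model and then removing empty and duplicate records.

```python
from datetime import datetime, timezone

# Hypothetical raw records from two sources with different field names.
SOURCE_A = [
    {"user": " Anna ", "ts": "2012-03-05T10:15:00", "text": "Hello  world"},
]
SOURCE_B = [
    {"screen_name": "ANNA", "created": 1330942500, "message": "Hello world"},
    {"screen_name": "ben", "created": 1330946100, "message": ""},
]


def clean_text(text):
    """Collapse runs of whitespace and trim the ends."""
    return " ".join(text.split())


def from_a(rec):
    """Map a source-A record onto the common model (timestamps assumed UTC)."""
    return {
        "user": rec["user"].strip().lower(),
        "time": datetime.fromisoformat(rec["ts"]).replace(tzinfo=timezone.utc),
        "text": clean_text(rec["text"]),
    }


def from_b(rec):
    """Map a source-B record (Unix timestamps) onto the common model."""
    return {
        "user": rec["screen_name"].strip().lower(),
        "time": datetime.fromtimestamp(rec["created"], tz=timezone.utc),
        "text": clean_text(rec["message"]),
    }


def integrate(a, b):
    """Merge both sources, dropping empty posts and exact duplicates."""
    merged, seen = [], set()
    for rec in [from_a(r) for r in a] + [from_b(r) for r in b]:
        key = (rec["user"], rec["time"], rec["text"])
        if not rec["text"] or key in seen:
            continue
        seen.add(key)
        merged.append(rec)
    return merged


if __name__ == "__main__":
    for record in integrate(SOURCE_A, SOURCE_B):
        print(record)
```

In this toy data the same post appears in both sources under different spellings of the user name, which is the kind of problem that only shows up once sources are combined.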
![Page 17: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/17.jpg)
Preprocessing in ‘LikeMiner’
![Page 18: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/18.jpg)
Data mining algorithms
• Classification: Generalising a known structure and applying it to new data
• Association: Finding relationships between variables
• Clustering: Discovering groups and structures in data
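As one concrete instance of the association task, the sketch below derives simple "users who like X also like Y" rules from co-occurrence counts, using the usual support and confidence measures. The `LIKES` data and the thresholds are invented for illustration, and only pairs of items are considered, which is far simpler than a full association-rule miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical "baskets": the set of pages each user likes.
LIKES = [
    {"coffee", "cycling", "jazz"},
    {"coffee", "jazz"},
    {"coffee", "cycling"},
    {"jazz", "cinema"},
    {"coffee", "jazz", "cinema"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) for item pairs.

    support    = fraction of baskets containing both items
    confidence = fraction of baskets with lhs that also contain rhs
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in pair_rules(LIKES):
        print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```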
![Page 19: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/19.jpg)
How do you know you measured what you wanted to measure?
![Page 20: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/20.jpg)
Mining in ‘LikeMiner’
• Filter users by interests
• Construct user graphs
• Run PageRank on the graphs to mine representativeness
• Result: a set of influential users
• Compare page topics to user interests to find the pages most representative of each topic
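To make the PageRank step concrete, here is a small power-iteration sketch on a toy user graph. It is not LikeMiner's implementation: the graph, the meaning of an edge (here, "u points to v"), the damping factor of 0.85 and the fixed iteration count are all assumptions chosen for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    graph maps each node to the list of nodes it points to.
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in graph.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical user graph, invented for illustration.
GRAPH = {
    "anna": ["ben"],
    "ben": ["anna"],
    "carl": ["ben"],
    "dina": ["ben", "anna"],
}

if __name__ == "__main__":
    scores = pagerank(GRAPH)
    for user in sorted(scores, key=scores.get, reverse=True):
        print(f"{user}: {scores[user]:.3f}")
```

Users with the highest scores would be the candidates for the "influential users" set; how the real system builds its graphs and filters by interest is not shown on the slide.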
![Page 21: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/21.jpg)
Interpreting your results
![Page 22: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/22.jpg)
Data Mining is not easy
![Page 23: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/23.jpg)
![Page 24: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/24.jpg)
Mining Social Web Data
source: http://kunau.us/wp-content/uploads/2011/02/Screen-shot-2011-02-09-at-9.03.46-PM-w600-h900.png
![Page 25: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/25.jpg)
Single Person
Source: http://infosthetics.com/archives/2011/12/all_the_information_facebook_knows_about_you.html
See also: http://www.youtube.com/watch?feature=player_embedded&v=kJvAUqs3Ofg
![Page 26: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/26.jpg)
Populations
http://www.brandrants.com/brandrants/obama/
![Page 27: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/27.jpg)
Brand Sentiment via Twitter
http://flowingdata.com/2011/07/25/brand-sentiment-showdown/
![Page 28: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/28.jpg)
Assignment 3: Data Analysis
• Analyse an existing social data analysis report
• Apply the same analyses to your own data
• Write a research report
http://www.actmedia.eu/media/img/text_zones/English/small_38421.jpg
![Page 29: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/29.jpg)
Final Assignment: Your SocWeb App
• Create a Social Web app with your group
• Use structured data, relationships between entities, data analysis, visualisation
• Write an individual research report on one of the main aspects of your app
Image Source: http://blog.compete.com/wp-content/uploads/2012/03/Like.jpg
![Page 30: Lecture 5: Social Web Data Analysis (2012)](https://reader033.fdocuments.in/reader033/viewer/2022042813/54bc2dab4a795919528b4619/html5/thumbnails/30.jpg)
Hands-on Teaser
• Your Facebook Friends’ popularity in a spreadsheet
• Locations of your Facebook Friends
• Tag Cloud of your wall posts
image source: http://www.flickr.com/photos/bionicteaching/1375254387/
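The tag cloud item boils down to word counting plus a mapping from counts to font sizes. The sketch below assumes the wall posts are already available as plain strings; fetching them from Facebook is not shown, and the stopword list, sample posts and size range are invented for illustration.

```python
from collections import Counter
import re

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "my"}


def tag_weights(posts, top=5, min_size=10, max_size=40):
    """Map the most frequent words to font sizes between min_size and max_size."""
    words = Counter(
        word
        for post in posts
        for word in re.findall(r"[a-z']+", post.lower())
        if word not in STOPWORDS and len(word) > 2
    )
    common = words.most_common(top)
    if not common:
        return {}
    high, low = common[0][1], common[-1][1]
    span = (high - low) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: min_size + (count - low) * (max_size - min_size) // span
        for word, count in common
    }


# Hypothetical wall posts, invented for illustration.
WALL = [
    "Happy birthday to my friend",
    "Birthday party in the park",
    "Party party party",
]

if __name__ == "__main__":
    print(tag_weights(WALL, top=3))
```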