What’s in a Country Name – Twitter Hashtag Analysis of #singapore
presented by
Aravind Sesagiri Raamkumar
PhD Student, Division of Information Studies
WKWSCI, NTU
Agenda
• Introduction
• Literature Review
• Research Objectives
• Results & Discussion
• Conclusion & Future Work
Introduction
• Twitter, introduced in 2006, popularized micro-blogging through posts called tweets, each limited to 140 characters
• Main activities: tweeting (including posting multimedia content), personal messaging, retweeting, and real-time information searching
• Hashtags are keywords prefixed with the symbol “#”, used for content categorization and for participation in conversations (popular examples are #FIFAWorldCup and #OWS)
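As a minimal sketch of the hashtag definition above, the following Python snippet pulls hashtags out of a tweet's text. The pattern (a "#" followed by letters, digits or underscores) and the function names are assumptions made for illustration; Twitter's own tokenization rules are more involved.

```python
import re

# Assumption: a hashtag is "#" followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")

def extract_hashtags(tweet):
    """Return the hashtags found in a tweet, lowercased and without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(tweet)]

def has_hashtag(tweet, tag):
    """Case-insensitive check for one hashtag, e.g. '#singapore'."""
    return tag.lower().lstrip("#") in extract_hashtags(tweet)
```

Lowercasing matters here because the sample contains both #singapore and #Singapore, which should count as the same hashtag.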
Introduction
• Why study hashtags?
– To understand the growth and decline of hashtags
– Do hashtags really help in content categorization?
– To provide inputs for hashtag recommendation algorithms
– To understand the dynamics around hashtags in basic usage and community participation
Literature Review
• Hashtag Studies
– Studies have taken either a holistic approach [7] or an event-specific approach (e.g., US presidential elections) [3]
– Pöschko analyzed tweets containing hashtags to develop five hashtag classes: Geolocation, Person, Organization, Event and Category [7]
– Bruns and Burgess showed that hashtags are formed through pre-planning, quickly reached consensus, or protracted debate [8]
– Yang et al. validated the dual purpose of hashtags in content grouping and community participation, and proposed a machine-learning-based algorithm for predicting hashtag usage [9]
Where to focus?
• Hashtag Studies
– Prior hashtag studies have concentrated on major events rather than on commonplace hashtags used on a daily basis, such as location-based hashtags (e.g., country names)
• Tweet Classification
– Tweet classification approaches have not used hashtags as a frame of reference
Research Questions & Methods
Overarching Research Question
• Why do users make use of the hashtag #singapore in their tweets?
Individual Research Questions and Methods
• What are the categories that represent the tweets?
– Method: Manual & Automatic Tweet Classification
• What are the communication patterns between users using #singapore?
– Method: Social Network Analysis
• Does the provenance data of tweets provide any new insights?
– Method: Content Analysis
Methods
• Manual Tweet Classification
– Performed by three coders on 500 tweets selected from two days of the total period
– Hashtag #singapore used as the frame of reference
– Categories were identified by the first coder and passed to the other two coders for classification
• Automatic Tweet Classification
– The statistical programming language R, with the machine learning library RTextTools, was used for the exercise
– Maximum Entropy was shortlisted as the machine learning text classification algorithm
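The study itself used R with RTextTools. For readers who want to see what a maximum entropy text classifier does, here is a rough pure-Python sketch, which is not the study's code. It fits a multinomial logistic regression (the same model family as maximum entropy) on bag-of-words features by batch gradient ascent. The toy tweets, the whitespace tokenizer, the learning rate and all function names are assumptions made for illustration; only the category codes (LEN, CLL, CD) come from the slides.

```python
import math
from collections import defaultdict

def tokenize(text):
    # Deliberately crude: lowercase and split on whitespace.
    return text.lower().split()

def predict_proba(weights, tokens):
    # Softmax over per-label scores, shifted by the max for numerical stability.
    scores = {label: sum(w[t] for t in tokens if t in w)
              for label, w in weights.items()}
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def train_maxent(examples, epochs=200, learning_rate=0.1):
    """Fit weights[label][token] by batch gradient ascent on the log-likelihood."""
    labels = sorted({label for _, label in examples})
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        gradient = {label: defaultdict(float) for label in labels}
        for text, gold in examples:
            tokens = tokenize(text)
            probs = predict_proba(weights, tokens)
            for label in labels:
                # Observed minus expected feature count for this label.
                error = (1.0 if label == gold else 0.0) - probs[label]
                for token in tokens:
                    gradient[label][token] += error
        for label in labels:
            for token, g in gradient[label].items():
                weights[label][token] += learning_rate * g / len(examples)
    return weights

def classify(weights, text):
    probs = predict_proba(weights, tokenize(text))
    return max(probs, key=probs.get)

# Hypothetical training tweets, invented for this sketch (not from the data set).
TOY_TWEETS = [
    ("train delay at the station this morning #singapore", "LEN"),
    ("new exhibition opens downtown this weekend #singapore", "LEN"),
    ("sunset view at gardens by the bay #singapore", "CLL"),
    ("lunch at marina bay with a view #singapore", "CLL"),
    ("discount sale on beauty boxes today #singapore", "CD"),
    ("flash sale half price shoes today #singapore", "CD"),
]
weights = train_maxent(TOY_TWEETS)
```

Calling `classify(weights, "flash sale on shoes #singapore")` then returns the label whose weights best match the tweet's tokens. A real run would train on the manually coded tweets and hold some of them out for evaluation.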
Methods (contd…)
• Social Network Analysis
– Explored communication patterns between user accounts that posted tweets with #singapore in the sample set
– Sample Tweet: @Stanbridge @NatGeo We hope you enjoyed your time at #50GreatestPhotos and we look forward to your next visit! #singapore
– Gephi tool used for building graphs
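One plausible way to prepare such data for Gephi is to turn each tweet into directed edges from its author to every account it mentions, then write those edges to a CSV file with Source, Target and Weight columns, which Gephi's spreadsheet importer recognizes. The sketch below assumes tweets arrive as (author, text) pairs. The author name `some_venue` is hypothetical, since the slides do not say who posted the sample tweet.

```python
import csv
import io
import re
from collections import Counter

# Assumption: a mention is "@" followed by letters, digits or underscores.
# This simple pattern would also match the domain part of an email address.
MENTION_RE = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Count directed (author -> mentioned user) edges over (author, text) pairs."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION_RE.findall(text):
            if mentioned.lower() != author.lower():  # skip self-mentions
                edges[(author, mentioned)] += 1
    return edges

def to_edge_csv(edges):
    """Serialize the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

SAMPLE = [(
    "some_venue",  # hypothetical author name
    "@Stanbridge @NatGeo We hope you enjoyed your time at #50GreatestPhotos "
    "and we look forward to your next visit! #singapore",
)]
edges = mention_edges(SAMPLE)
```

For the sample tweet this yields two edges, one to Stanbridge and one to NatGeo, each with weight 1.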
Data Collection
• TweetArchivist was used to extract live data from Twitter for the hashtag #singapore for the period Aug 20th to Sep 1st 2013

Date     Tweet Count   Day of week
Aug-20   1360          Tue
Aug-21   2305          Wed
Aug-22   2776          Thur
Aug-23   3780          Fri
Aug-24   2753          Sat
Aug-25   2524          Sun
Aug-26   2387          Mon
Aug-27   3045          Tue
Aug-28   2635          Wed
Aug-29   3298          Thur
Aug-30   3574          Fri
Aug-31   2930          Sat
Sep-01   2888          Sun
Sep-02   3009          Mon
Sep-03   2644          Tue
Sep-04   2676          Wed
Sep-05   2554          Thur
Results & Discussion
Results & Discussion – Tweet Classification
• Local Events and News (LEN)
– N = 5140 (28.9%)
– e.g., The Strike That Rattled #Singapore: A WSJ Investigation. The first of a 5-part series this week. http://t.co/nAKrimR4OG #China #migrants
• Current Location and Landmarks (CLL)
– N = 4973 (27.9%)
– e.g., Sky walk , #singapore #trip #nature @ Gardens By The Bay http://t.co/Uhxxknsn1A
• Asia Related and Unrelated topics (ARU)
– N = 2822 (15.9%)
– e.g., Almost Chinese hates Commie Govt http://t.co/YVH8Ni9su4 #Taiwan #Turkey #India #Thailand #Singapore #Brazil #Vietnam
• Commercial Deals (CD)
– N = 2142 (12.0%)
– e.g., Have a beautiful week ahead, bellas! #bellabox #singapore #beauty http://t.co/tVvypOCa9L
• Tourism and Travel Related (TTR)
– N = 1302 (7.3%)
– e.g., Had a fantastic time in #Singapore. Should be home in 14 hours. #ChangiAirport
• National Identity and Group Reference (NGR)
– N = 1214 (6.8%)
– e.g., Morning #SGFBloggers. Lets make it a positive and productive week!positive #singapore #love #quotes http://t.co/cBR8sw3Tpt
• Personal Events and Rants (PER)
– N = 205 (1.1%)
– e.g., In what should've been a normal 30mins ride to work, I'm now 30 mins late for the meeting #Traffic #singapore
Results & Discussion – Social Network Analysis
• Directed graph built with ‘user mentions’ data
• Popular network actors are news agencies, celebrities and commercial bodies
• Retweeting appears to be a major activity
Results & Discussion – Provenance Data
Source                Tweet Count   Percentage
Instagram             3206          18.01%
web                   1844          10.36%
Twitter for iPhone    1629          9.15%
Twitter for Android   1432          8.05%
dlvr.it               1212          6.81%
twitterfeed           891           5.01%
RoundTeam             686           3.85%
Tweet Old Post        598           3.36%
HootSuite             555           3.12%
TweetAdder v4         547           3.07%

• Instagram is the most used source: a higher percentage of tweets (18%) carry images of Singapore landmarks

URL Presence   Tweet Count   Percentage
Not Present    5256          29.53%
Present        12542         70.47%

• Information or content sharing is the most prevalent activity, with 70% of tweets containing URLs

Type           Count   Percentage
Normal Tweet   12165   68.35%
Retweet        5633    31.65%

• At the detailed level, retweeting is not the dominant activity (31.65% of tweets)
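The percentages in the provenance tables can be checked with a few lines of arithmetic. The sketch below assumes that every table shares the same denominator, namely the sum of normal tweets and retweets. The helper name `share` is made up for this example.

```python
# Denominator: normal tweets + retweets, taken from the Type table.
TOTAL_TWEETS = 12165 + 5633

def share(count, total=TOTAL_TWEETS):
    """Percentage of the sample, rounded to two decimals."""
    return round(100 * count / total, 2)

retweet_share = share(5633)     # Retweet row
url_share = share(12542)        # URL "Present" row
instagram_share = share(3206)   # Instagram row
```

The URL presence counts (5256 + 12542) add up to the same total of 17798, which supports the shared-denominator assumption.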
Conclusion
• Seven themes/categories were identified as part of a tweet classification exercise.
• #singapore is prominent in tweets about local events, local news, users’ current location and landmark related information sharing.
• Users who share content from Instagram make use of the hashtag in a more prominent way.
• News agencies, commercial bodies and celebrities make use of the hashtag more than common individuals.
Future Work
• A similar classification exercise will be carried out for different country names so as to validate the identified categories.
– #singapore data from Facebook and Google Plus to be studied for cross-verification
– Intent to put forth a classification scheme for location based hashtag studies
– Provide inputs to hashtag recommendation algorithms
Thank You
Selective Bibliography
[1] Cheong, M., & Lee, V. (2010). Dissecting Twitter: A Review on Current Microblogging Research and Lessons from Related Fields. From Sociology to Computing in Social Networks (pp. 343–362). Springer Vienna.
[2] Sriram, B., Fuhry, D., Demir, E., Ferhatosmanoglu, H., & Demirbas, M. (2010). Short text classification in twitter to improve information filtering. Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR ’10 (pp. 841–842). New York, New York, USA: ACM Press. doi:10.1145/1835449.1835643
[3] Lin, Y., Margolin, D., Keegan, B., Baronchelli, A., & Lazer, D. (2012). # Bigbirds Never Die : Understanding Social Dynamics of Emergent Hashtags.
[4] Abel, F., Herder, E., Houben, G., & Henze, N. (2011). Cross-system User Modeling and Personalization on the Social Web. User Modeling and User-Adapted Interaction, 22(3), 1–42.
[5] Godin, F., Schrauwen, B., & Walle, R. Van De. (2013). Using Topic Models for Twitter Hashtag Recommendation. Proceedings of the 22nd international conference on World Wide Web companion (pp. 593–596).
[6] Magnani, M., Montesi, D., Nunziante, G., & Rossi, L. (2011). Conversation Retrieval from Twitter. Proceedings of the 33rd European Conference on IR Research, ECIR 2011 (pp. 780–783).
[7] Pöschko, J. (2011). Exploring Twitter Hashtags (pp. 1–12). Retrieved from http://arxiv.org/abs/1111.6553
[8] Bruns, A., & Burgess, J. (2011). The Use of Twitter Hashtags in the Formation of Ad Hoc Publics (pp. 25–27).
[9] Yang, L., Sun, T., Zhang, M., & Mei, Q. (2012). We Know What @ You # Tag : Does the Dual Role Affect Hashtag Adoption ? Proceedings of the 21st international conference on World Wide Web (pp. 261–270).
[10] Zangerle, E., & Gassler, W. (2011). Recommending # -Tags in Twitter. Proceedings of the Workshop on Semantic Adaptive Social Web (SASWeb 2011) (pp. 67–78).
[11] Li, T., & Wu, Y. (2011). Twitter Hash Tag Prediction Algorithm. ICOMP’11-The 2011 International Conference on Internet Computing.
[12] Kywe, S. M. (2012). On Recommending Hashtags in Twitter Networks. Social Informatics, 7710, 337–350.
Appendix
Table 2.2 (a) Classification schemes from previous twitter studies
• Java et al (2007): Conversations; URL sharing; News reporting; Daily chatter
• Jansen et al (2009): Info seeking; Info providing; Comment/Sentiment
• Honeycutt & Herring (2009): About addressee; Advertise; Exhort; Info for others; Info for self; Meta-commentary; Media use; Express opinion; Other's experience; Self experience; Solicit info; Other misc
• Pear Analytics (2009): Mainstream News; Spam; Self-promotion of businesses; Babble; Conversations; Pass-along messages (retweets)
• Horn (2010): C1: News, Events, Company; C2: Factual, Opinionated

Table 2.2 (b) Classification schemes from previous twitter studies
• Sriram et al (2010): News; Opinions; Deals; Events; Private Messages
• Dann (2010): Conversational; Pass along; News; Status; Phatic; Spam
• Sandra et al (2010): Movies; Books; Music; Apps; Games
• Naaman et al (2010): Info sharing; Self promotion; Opinions/Complaints; Statements & Random Thoughts; Me now; Question to followers; Presence Maintenance; Anecdote
• Rosa (2011): News; Sports; Science & Technology; Entertainment; Money/Business; Just for Fun