Dynamics of Trends and Attention in Chinese Social Media

arXiv:1312.0649v1 [cs.SI] 2 Dec 2013


Louis Lei Yu, Sitaram Asur, Bernardo A. Huberman

Abstract—There has been a tremendous rise in the growth of online social networks all over the world in recent years. This growth has enabled users to generate a large amount of real-time content at an incessant rate, all competing with each other to attract enough attention and become popular trends. While Western online social networks such as Twitter have been well studied, the popular Chinese microblogging network Sina Weibo has had relatively lower exposure. In this paper, we analyze in detail the temporal aspect of trends and trend-setters in Sina Weibo, contrasting it with earlier observations in Twitter. We find that there is a vast difference in the content shared in China when compared to a global social network such as Twitter. In China, the trends are created almost entirely due to the retweets of media content such as jokes, images and videos, unlike Twitter where it has been shown that the trends tend to have more to do with current global events and news stories. We take a detailed look at the formation, persistence and decay of trends and examine the key topics that trend in Sina Weibo. One of our key findings is that retweets are much more common in Sina Weibo and contribute a lot to creating trends. When we look closer, we observe that most trends in Sina Weibo are due to the continuous retweets of a small percentage of fraudulent accounts. These fake accounts are set up to artificially inflate certain posts, causing them to shoot up into Sina Weibo’s trending list, where they are in turn displayed as the most popular topics to users.

Index Terms—social network; web structure analysis; temporal analysis; China; social computing

Louis Lei Yu, Department of Mathematics and Computer Science, Gustavus Adolphus College, 800 W College Ave, St Peter, MN 56082, phone: (415)374-9197, fax: (507) 933-7041, e-mail: [email protected]

Sitaram Asur, Social Computing Lab, HP Labs, 1501 Page Mill Road, Palo Alto, CA 94304, phone: (650) 857-1501, fax: (650) 852-8156, e-mail: [email protected]

Bernardo A. Huberman, Social Computing Lab, HP Labs, 1501 Page Mill Road, Palo Alto, CA 94304, phone: (650) 857-1501, fax: (650) 852-8156, e-mail: [email protected]

1 INTRODUCTION

In the past few years, social media services, as well as the users who subscribe to them, have grown at a phenomenal rate. This immense growth has been witnessed all over the world, with millions of people of different backgrounds using these services on a daily basis. This widespread generation and consumption of content has created an extremely complex and competitive online environment where different types of content compete with each other for the attention of users. It is very interesting to study how certain types of content, such as a viral video, a news article, or an illustrative picture, manage to attract more attention than others, thus bubbling to the top in terms of popularity. Through their visibility, these popular topics contribute to the collective awareness, reflecting what is considered important. They can also be powerful enough to affect the public agenda of the community.

There have been prior studies on the characteristics of trends and trend-setters in Western online social media ([1], [2]). In this paper, we examine in detail a significantly less-studied but equally fascinating online environment: Chinese social media, in particular Sina Weibo, China’s biggest microblogging network.

Over the years there have been news reports on various Internet phenomena in China, from the surfacing of certain viral videos to the spreading of rumors ([3]) to the so-called “human flesh search engines”: a primarily Chinese Internet phenomenon of massive search using online media such as blogs and forums ([4]). These stories seem to suggest that many events happening in Chinese online social networks are unique products of China’s culture and social environment.

Due to the vast global connectivity provided by social media, netizens all over the world are now connected to each other like never before; they can now share and exchange ideas with ease. It could be argued that the manner in which the sharing occurs should be similar across countries. However, China’s unique cultural and social environment suggests that the way individuals share ideas might be different from that in Western societies [5]. For example, Internet users in China are considerably younger, so they are likely to respond to different types of content than Internet users in Western societies. The number of Internet users in China is larger than that in the U.S., and the majority of users live in large urban cities. One would expect that the way these users share information can be even more chaotic. An important question is to what extent topics have to compete with each other in order to capture users’ attention in this dynamic environment. Furthermore, as documented by [6], it is known that the information shared between individuals in Chinese social media is monitored. Hence another interesting question is what types of content netizens would respond to and what kinds of popular topics would emerge under such constant surveillance.

Given the above questions, we present an analysis of the evolution of trends in Sina Weibo. We monitored the evolution of the top trending keywords in Sina Weibo for 30 days. First, we analyzed the model of growth in these trends and examined the persistence of these topics over time. In this regard, we investigated whether topics initially ranked higher tend to stay in the list of top 50 trending topics longer. Subsequently, by analyzing the timestamps of tweets, we looked at the propagation and decay of the trends in Sina Weibo and compared them to earlier observations of Twitter [1].

Our findings are as follows:

• We discovered that the majority of trends in Sina Weibo arise from frivolous content, such as jokes and funny images and photos, unlike Twitter where the trends are mainly news-driven.

• We established that retweets play a greater role in Sina Weibo than in Twitter, contributing more to the generation and persistence of trends.

• Upon examining the tweets in detail, we made an important discovery. We observed that many trending keywords in Sina Weibo are heavily manipulated and controlled by certain fraudulent accounts. The irregular activities of these accounts made certain tweets more visible to users in general.

• We found significant evidence suggesting that a large percentage of the trends in Sina Weibo are due to artificial inflation by fraudulent accounts. The users we identified as fraudulent were 1.08% of the total users sampled, but they were responsible for 49% of the total retweets (32% of the total tweets).

• We evaluated some methods to identify fraudulent accounts. After we removed the tweets associated with fraudulent accounts, the evolution of the tweets containing trending keywords follows the same process of persistence and decay as the one in Twitter.

The rest of the paper is organized as follows. In Section 2 we provide background information on the development of the Internet in China and on the Sina Weibo social network. In Section 3 we survey some related work on trends and spam in social media. In Section 4, we perform a detailed analysis of trending topics in Sina Weibo. In Section 5, we provide a discussion of our findings.

2 BACKGROUND

In this Section, we provide some background information on the Internet in China, the development of Chinese social media services, and Sina Weibo, the most popular microblog service in China.

2.1 The Internet in China

The development of the Internet industry in China over the past decade has been impressive. According to a survey from the China Internet Network Information Center (CNNIC), by July 2008 the number of Internet users in China had reached 253 million, surpassing the U.S. as the world’s largest Internet market [7]. Furthermore, the number of Internet users in China as of 2010 was reported to be 420 million.

Despite this, the fractional Internet penetration rate in China is still low. The 2010 survey by CNNIC on the Internet development in China [8] reports that the Internet penetration rate in the rural areas of China is on average 5.1%. In contrast, the Internet penetration rate in the urban cities of China is on average 21.6%. In metropolitan cities such as Beijing and Shanghai, the Internet penetration rate has reached over 45%, with Beijing being 46.4% and Shanghai being 45.8% [8].

According to the survey by CNNIC in 2010 [7], China’s cyberspace is dominated by urban students between the ages of 18 and 30 (see Figure 1 and Figure 2, taken from [7]).

Fig. 1. Age Distribution of Internet Users in China

The Government plays an important role in fostering the advance of the Internet industry in China. Tai [6] points out four major stages of Internet development in China, “with each period reflecting a substantial change not only in technological progress and application, but also in the Government’s approach to and apparent perception of the Internet.”

Fig. 2. The Occupation Distribution of Internet Users in China

According to The Internet in China¹ released by the Information Office of the State Council of China:

The Chinese government attaches great importance to protecting the safe flow of Internet information, actively guides people to manage websites in accordance with the law and use the Internet in a wholesome and correct way.

2.2 Chinese Online Social Networks

Online social networks are a major part of the Chinese Internet culture [3]. Netizens² in China organize themselves using forums, discussion groups, blogs, and social networking platforms to engage in activities such as exchanging viewpoints and sharing information [3]. According to The Internet in China:

Vigorous online ideas exchange is a major characteristic of China’s Internet development, and the huge quantity of BBS posts and blog articles is far beyond that of any other country. China’s websites attach great importance to providing netizens with opinion expression services, with over 80% of them providing electronic bulletin service. In China, there are over a million BBSs and some 220 million bloggers. According to a sample survey, each day people post over three million messages via BBS, news commentary sites, blogs, etc., and over 66% of Chinese netizens frequently place postings to discuss various topics, and to fully express their opinions and represent their interests. The new applications and services on the Internet have provided a broader scope for people to express their opinions. The newly emerging online services, including blog, microblog, video sharing and social networking websites are developing rapidly in China and provide greater convenience for Chinese citizens to communicate online. Actively participating in online information communication and content creation, netizens have greatly enriched Internet information and content.

1. “The Internet in China” by the Information Office of the State Council of the People’s Republic of China is available at http://www.scio.gov.cn/zxbd/wz/201006/t667385.htm

2. A netizen is a person actively involved in online communities [9].

2.3 Sina Weibo

Sina Weibo was launched by the Sina corporation, China’s biggest web portal, in August 2009. It has been reported by the Sina corporation that Sina Weibo now has 250 million registered accounts and generates 90 million posts per day. Similar to Twitter, a user profile in Sina Weibo displays the user’s name, a brief description of the user, and the numbers of followers and followees the user has. There are three types of user accounts in Sina Weibo: regular user accounts, verified user accounts, and expert (star) user accounts. A verified user account typically represents a well-known public figure or organization in China.

Twitter users can address tweets to other users and can mention others in their tweets. A common practice in Twitter is “retweeting”, or rebroadcasting someone else’s messages to one’s followers. The equivalent of a retweet in Sina Weibo is instead shown as two amalgamated entries: the original entry and the current user’s actual entry, which is a commentary on the original entry.

Sina Weibo has another functionality absent from Twitter: the comment. When a Sina Weibo user makes a comment, it is not rebroadcast to the user’s followers. Instead, it can only be accessed under the original message.

3 RELATED WORK

In this Section, we provide a survey of papers in two related areas: spam detection and the study of trends in social networks. In each area, we present work on both Western social networks and Chinese social networks.

3.1 Spam Detection in Twitter

Spam and bot detection in social networks is a relatively recent area of research, motivated by the vast popularity of social websites such as Twitter and Facebook. It draws on research from several areas of computer science such as computer security, machine learning, and network analysis.

In the 2010 work by Benevenuto et al. [10], the authors examine spam detection in Twitter by first collecting a large dataset of more than 54 million users, 1.9 billion links, and 1.8 billion tweets. After exploring content and behavior attributes, they developed an SVM classifier and were able to detect spammers with 70% precision and non-spammers with 96% precision. As an insightful follow-up, the authors used χ² statistics to evaluate the importance of the attributes they used in their model.

The second paper with direct application to spam detection in Twitter was by Wang [11]. Wang motivated his research with the statistic that an estimated 3% of messages in Twitter are spam. The dataset used in this study was relatively small, gathering information from 25,847 users, 500 thousand tweets, and 49 million follower/friend relationships. Wang used decision trees, neural networks, SVM, and naive Bayesian models.


Finally, Lee et al. [12] described a different approach to detect spammers. They created honeypot user accounts in Twitter and recorded the features of users who interact with these accounts. They then used these features to develop a classifier with high precision.

3.2 Spam Detection in General Online Social Networks

In social bookmarking websites, Markines et al. [13] used just 6 features (tag spam, tag blur, document structure, number of ads, plagiarism, and valid links) to develop a classifier with 98% accuracy.

On Facebook, Boshmaf et al. successfully launched a network of social bots [14]. Despite Facebook’s bot detection system, the authors were able to achieve an 80% infiltration rate over 8 weeks.

In online ad exchanges, advertisers pay websites for each user that clicks through an ad to their website. The way fraud occurs in this domain is for bots to click through ads on a website owned by the botnet owners. The money at stake in this case has made the bots employed very sophisticated. The botnet owners use increasingly stealthy, distributed traffic to avoid detection. Stone et al. examined various attacks and prevention techniques in cost-per-click ad exchanges [15]. Yu et al. [16] gave a sophisticated approach to detect low-rate bot traffic by developing a model that examines query logs to detect coordination across bots within a botnet.

3.3 Spam Detection in Chinese Online Social Networks

Some studies have been done on spam and bot detection in Chinese online social networks [17], [18]. Xu et al. [19] observed the spammers in Sina Weibo and found that the spammers can be classified into two categories: promoters and robot accounts.

Lin et al. [20] presented an analysis of spamming behaviors in Sina Weibo. Using methods such as proactive honeypots, keyword-based search, and buying spammer samples directly from online merchants, they were able to collect a large set of spammer samples. Through their analysis they found three representative spamming behaviors: aggressive advertising, repeated duplicate reposting, and aggressive following.

spammer identification system. Through tests with real data it is demonstrated that the system can effectively detect the spamming behaviors and identify spammers in Sina Weibo.

3.4 Battling the “Internet Water Army” in Chinese Online Social Networks

One relevant area of research is the study of the “Online Water Army”³. The term refers to full-time or part-time paid posters hired by PR companies to help raise the popularity of a specific company or person by posting articles, replies, and comments in online social networks. According to CCTV⁴, these paid posters in China help their customers using one of the following three tactics: 1. promoting a specific product, company or person; 2. smearing or slandering competitors; 3. helping to delete negative posts or comments.

st in BBS systems, and online social networks.

In the work by Chen et al. [21], the authors examined comments on Chinese news websites such as Sina.com and Sohu.com and used reply, activity, and semantic features to develop an SVM classifier via the LIBSVM Python library with 95% accuracy at detecting paid posters. Interesting information discussed in the paper includes the organizational structure of PR firms which hire the paid posters and the choice of features: percentage of replies, average interval time of posts, active days, and number of reports commented on.

3.5 Measuring Influences in Online Social Networks

For many years the structural properties of various Western social networks have been well studied by sociologists and computer scientists [22] [23] [24] [25].

In social network analysis, social influence refers to the concept of people modifying their behavior to bring them closer to the behavior of their friends. A social-affiliation network consists of nodes representing individuals, links representing friendships, and nodes representing foci: “social, psychological, legal, or physical entities around which joint activities are organized (e.g., workplace, social groups)” [26]. If A and B are friends, and F is a focus that A participates in, then over time B may come to participate in the same focus due to A’s involvement; this is called a membership closure [26].

3. e.g., http://shuijunwang.com or http://www.51shuijun.net

4. see report in Chinese at http://news.cntv.cn/china/20101107/102619.shtml

Agarwal et al. [27] examined methods to identify influential bloggers in the blogosphere. They discovered that the most influential bloggers are not necessarily the most active. Backstrom et al. [28] studied the characteristics of membership closure in LiveJournal. Crandall et al. [29] studied the adaptation of influences between editors of Wikipedia articles.

Romero et al. [30] measured retweets in Twitter and found that passivity was a major factor when it comes to message forwarding. Based on this result, they presented a measure of social influences that takes into account the passivity of the audience in social networks.

3.6 The Study of Trends in Twitter

There are various studies on trends in Twitter [2] [31] [32] [33].

One of the most extensive investigations into trending topics in Twitter was by Asur et al. [34]. The authors examined the growth and persistence of trending topics in Twitter and observed that it follows a log-normal distribution of popularity. Accordingly, most topics faded from popularity relatively quickly, while a few topics lasted for long periods of time. They estimated the average duration of topics to be around 20-40 minutes. When they examined the content of the trends, they observed that traditional notions of influence such as the frequency of posting and the number of followers were not the main drivers of popularity in trends. Rather, it was the resonating nature of the content that was important. An interesting finding was that news topics from traditional media sources such as CNN, New York Times and ESPN were shown to be some of the most popular and long-lasting trending topics in Twitter, suggesting that Twitter amplifies some of the broader trends occurring in society.

Cha et al. [35] explored user influences on Twitter trends and discovered some interesting results. First, users with many followers were found to not be very effective in generating mentions or retweets. Second, the most influential users tend to influence more than one topic. Third, influences were found to not arise spontaneously, but instead as the result of focused efforts, often concentrating on one topic.

3.7 Social Influences and the Propagation of Information in Chinese Social Networks

Researchers have analyzed the structure of various Chinese offline social networks [36] [37] [38] [39] [40].

There have been only a few studies on social influences in Chinese online social networks. Jin [3] studied the structure and interface of Chinese online Bulletin Board Systems (BBS) and the behavioral patterns of its users. Xin [41] conducted a survey of BBS’s influence on University students in China. Yu et al. [42] looked at the adaptation of books, movies, music, events and discussion groups on Douban, the largest online media database and one of the largest online communities in China.

In a similar area, there are some studies on the structural properties and the characteristics of information propagation in Chinese online social networks [43], [44], [45], [46], [47]. Yang et al. [48] noted that various information services (e.g., eBay, Orkut, and Yahoo!) encountered serious challenges when entering China. They presented an empirical study of social interactions among Chinese netizens based on over 4 years of comprehensive data collected from Mitbbs (www.mitbbs.com), the most frequently used online forum for Chinese nationals who are studying or working abroad.

Lin et al. [49] presented a comparison of the interaction patterns between two of the largest online social networks in China: Renren and Sina Weibo. Niu et al. [50] gave an empirical analysis of Renren, showing that it follows an exponentially truncated power law in-degree distribution and has a short average node distance.

King et al. [5] studied the concept of guanxi, a unique dyadic social construct, as applied to the interaction between web sites in China. Chang et al. [51] studied a special case of the propagation of information in Chinese online social networks: the sending and receiving of messages containing wishes and moral support. They provided analysis on the data from Linkwish, a micro social network for wish sharing with users mainly from Taiwan, Hong Kong, and Macao.

Fan et al. [52] looked at the propagation of emotion in Sina Weibo. They found that the correlation of anger among users is significantly higher than that of joy, which indicates that anger could spread more quickly and broadly in the network. In addition, there is a stronger sentiment correlation between a pair of users if they share more interactions. Finally, users with a larger number of friends have a more significant sentiment influence on their neighborhoods.

4 ANALYSIS OF TRENDS AND TREND-SETTERS IN SINA WEIBO

4.1 The Trending Keywords

Sina Weibo offers a list of 50 keywords that appear most frequently in users’ tweets. They are ranked according to the frequency of appearances in the last hour. This is similar to Twitter, which also presents a constantly updated list of trending topics: keywords that are most frequently used in tweets over a period of time. We extracted these keywords over a period of 30 days (from June 18th, 2011 to July 18th, 2011) and retrieved all the corresponding tweets containing these keywords from Sina Weibo.

We first monitored the hourly evolution of the top 50 keywords in the trending list for 30 days. We observed that the average time spent by each keyword in the hourly trending list is 6 hours, and that the distribution of the number of hours each topic remains on the top 50 trending list follows a power law (as shown in Figure 3 a). The distribution suggests that only a few topics exhibit long-term popularity. Another interesting observation is that many of the keywords tend to disappear from the top 50 trending list after a certain amount of time and then later reappear. We examined the distribution of the number of times keywords reappear in the top 50 trending list (Figure 3 b). We observe that this distribution follows a power law as well.
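
The two quantities behind Figure 3 (hours spent on the list, and number of reappearances) can be computed from a sequence of hourly snapshots of the trending list. The following is a minimal sketch, not the code used in this study: the input format (one set of keywords per hour) and the function names `trending_stats` and `histogram` are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def trending_stats(snapshots):
    """Return (hours_on_list, reappearances) per keyword.

    snapshots: list of sets in chronological order, one per hour; each set
    holds the keywords on the trending list in that hour (assumed format).
    """
    hours = Counter()          # total hours each keyword spent on the list
    runs = defaultdict(int)    # number of separate contiguous stays
    previous = set()
    for current in snapshots:
        for kw in current:
            hours[kw] += 1
            if kw not in previous:   # keyword (re)entered the list this hour
                runs[kw] += 1
        previous = current
    # A keyword with k separate stays on the list reappeared k - 1 times.
    reappearances = {kw: runs[kw] - 1 for kw in hours}
    return dict(hours), reappearances


def histogram(values):
    """Count how many keywords share each value (e.g. hours on the list)."""
    return dict(sorted(Counter(values).items()))


# Toy example: three keywords over five hourly snapshots.
snapshots = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}, {"a", "b"}]
hours, reappear = trending_stats(snapshots)
```

Plotting `histogram(hours.values())` and `histogram(reappear.values())` on log-log axes is one simple way to inspect whether the counts are heavy-tailed.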

Both the above observations are very similar to the earlier study of trending topics in Twitter by [1]. However, one important difference with Twitter is that the average trending time is significantly higher in Sina Weibo (in Twitter it was 20-40 minutes). This suggests that Weibo may not have as many topics competing for attention as Twitter.

Following our observation that some keywords stay in the top 50 trending list longer than others, we wanted to investigate whether topics that are ranked higher initially tend to stay in the top 50 trending list longer. We separated the top 50 trending keywords into two ranked sets of 25 each: the top 25 and the bottom 25. Figure 4 plots the percentage of topics placed in the bottom 25 against the number of hours these topics stayed in the top 50 trending list. We can observe that topics that do not last are usually the ones that are in the bottom 25. On the other hand, the long-trending topics spend most of their time in the top 25, which suggests that items that become very popular are more likely to stay longer in the top 50. Intuitively, this means that items that attract phenomenal attention initially are not likely to dissipate quickly from people’s interests.

Fig. 4. Distribution of trending times for topics in the bottom 25 of the top 50 trend list
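
The split into top 25 and bottom 25 can be sketched as follows. This is an illustrative sketch only: it assumes each hourly snapshot is available as a ranked list of keywords, and the function name `bottom_half_share` and its `cutoff` parameter are hypothetical.

```python
from collections import defaultdict


def bottom_half_share(ranked_snapshots, cutoff=25):
    """Map each keyword to (hours_trending, share_of_hours_below_cutoff).

    ranked_snapshots: list of lists, one per hour; each inner list holds the
    trending keywords ordered from rank 1 downward (assumed format).
    """
    total = defaultdict(int)    # hours on the list
    bottom = defaultdict(int)   # hours spent at a rank worse than cutoff
    for ranking in ranked_snapshots:
        for rank, kw in enumerate(ranking, start=1):
            total[kw] += 1
            if rank > cutoff:
                bottom[kw] += 1
    return {kw: (total[kw], bottom[kw] / total[kw]) for kw in total}


# Toy example with a two-item list, so the cutoff is rank 1.
toy = [["a", "b"], ["a", "c"], ["a", "b"]]
shares = bottom_half_share(toy, cutoff=1)
```

Grouping the resulting shares by `hours_trending` gives the kind of relation between trending time and bottom-half placement that Figure 4 summarizes.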


Fig. 3. Distributions of trending time and the number of times topics reappeared

4.2 The Evolution of Tweets

Next, we investigate the process of persistence and decay for the trending topics in Sina Weibo. In particular, we want to measure the distribution of the time intervals between tweets containing the trending keywords. We continuously monitored the keywords in the top 50 trending list, and for each trending topic we retrieved all the tweets containing the keyword from the time the topic first appeared in the top 50 trending list until the time it disappeared. Accordingly, we collected complete data for 811 topics over the course of 30 days (from June 20th, 2011 to July 20th, 2011). In total we collected 574,382 tweets from 463,231 users. Among the 574,382 tweets, 35% (202,267 tweets) are original tweets and 65% (372,115 tweets) are retweets. 40.3% of the total users (187,130 users) retweeted at least once in our sample.

We measured the number of tweets that each topic gets in 10-minute intervals, from the time the topic starts trending until the time it stops. From this we can sum the tweet counts over time to obtain the cumulative number of tweets $N_q(t_i)$ of topic $q$ for any time frame $t_i$. This is given as:

$$N_q(t_i) = \sum_{\tau=1}^{i} n_q(t_\tau) \tag{1}$$

where $n_q(t)$ is the number of tweets on topic $q$ in time interval $t$. We then calculate the ratios $C_q(t_i, t_j) = N_q(t_i)/N_q(t_j)$ for topic $q$ for time frames $t_i$ and $t_j$.
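To make Equation (1) and the ratio concrete, the following sketch computes both from a list of per-interval tweet counts. The function names and the sample counts are hypothetical; only the two formulas are taken from the text.

```python
from itertools import accumulate

def cumulative_counts(interval_counts):
    """N_q(t_i) for i = 1..len(interval_counts), from the counts n_q(t)."""
    return list(accumulate(interval_counts))

def ratio(interval_counts, i, j):
    """C_q(t_i, t_j) = N_q(t_i) / N_q(t_j), with 1-based interval indices.

    Raises ZeroDivisionError if no tweet has arrived by interval j.
    """
    cumulative = cumulative_counts(interval_counts)
    return cumulative[i - 1] / cumulative[j - 1]

# Hypothetical counts for one topic in consecutive 10-minute intervals.
counts = [4, 6, 5, 10, 8, 7, 3, 2, 4, 1]
print(cumulative_counts(counts))  # [4, 10, 15, 25, 33, 40, 43, 45, 49, 50]
print(ratio(counts, 10, 2))       # 5.0
print(ratio(counts, 8, 3))        # 3.0
```

Collecting one such ratio per topic for a fixed pair of time frames gives the sample whose distribution is studied next.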

Figure 5 shows the distribution of $C_q(t_i, t_j)$ over all topics for two arbitrarily chosen pairs of time frames, (10, 2) and (8, 3), selected so that $t_i > t_j$, with $t_i$ relatively large and $t_j$ small.

These figures suggest that the ratios $C_q(t_i, t_j)$ follow log-normal distributions. We tested and confirmed that this is indeed the case.
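The text does not say which test was used. One possible check, shown here purely as an assumption, is to take the logarithm of each ratio and apply a normality test to the result, since a variable is log-normal exactly when its logarithm is normal. The sketch uses the Jarque-Bera statistic, which is asymptotically chi-square distributed with 2 degrees of freedom under normality; the function name and the sample ratios are hypothetical.

```python
import math

def jarque_bera_on_logs(ratios):
    """Return (JB statistic, asymptotic p-value) for log(ratios).

    Assumes positive ratios that are not all identical.
    """
    logs = [math.log(r) for r in ratios]
    n = len(logs)
    mean = sum(logs) / n
    m2 = sum((x - mean) ** 2 for x in logs) / n
    m3 = sum((x - mean) ** 3 for x in logs) / n
    m4 = sum((x - mean) ** 4 for x in logs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    # The chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    return jb, math.exp(-jb / 2)

# Hypothetical ratios C_q(t_10, t_2), one per topic.
sample = [2.1, 3.4, 1.8, 5.2, 2.9, 4.1, 2.2, 3.0, 6.5, 2.6]
statistic, p_value = jarque_bera_on_logs(sample)
```

A small p-value would speak against log-normality. The approximation is only reliable for reasonably large samples, so a sample of this size is for illustration only.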

Fig. 5. The distribution of $C_q(t_i, t_j)$ over all topics for two arbitrarily chosen pairs of time frames: (10, 2) and (8, 3)

This finding agrees with the result from a similar experiment on Twitter trends. Asur et al. [1] argued that the log-normal distribution occurs due to the multiplicative process involved in the growth of trends, which incorporates the decay of novelty as well as the rate of propagation. The intuitive explanation is that at each time step the number of new tweets (original tweets or retweets) on a topic is a multiple of the number of tweets that we already have. The number of past tweets, in turn, is a proxy for the number of users that are aware of the topic up to that point. These users discuss the topic on different forums, including Twitter, essentially creating an effective network through which the topic spreads. As more users talk about a particular topic, many others are likely to learn about it, which gives the spreading its multiplicative nature. On the other hand, the monotonically decreasing decay process characterizes the decline in timeliness and novelty of the topic as it slowly becomes obsolete.
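The multiplicative argument can be illustrated with a small simulation. One way to formalize it, used here as an assumption, is the recurrence N(t) = (1 + g(t) x(t)) N(t - 1), where g(t) is a decreasing novelty factor and x(t) is positive random noise. The decay g(t) = 1/t, the uniform noise and the starting count are arbitrary choices made for this sketch.

```python
import math
import random

def simulate_topic(steps, rng, initial=10.0):
    """Cumulative count after each step of N(t) = (1 + g(t) * x(t)) * N(t - 1)."""
    count = initial
    history = []
    for t in range(1, steps + 1):
        novelty = 1.0 / t              # assumed decay g(t), decreasing in t
        shock = rng.uniform(0.0, 1.0)  # assumed positive i.i.d. noise x(t)
        count *= 1.0 + novelty * shock
        history.append(count)
    return history

rng = random.Random(0)
log_ratios = []
for _ in range(2000):
    history = simulate_topic(10, rng)
    # log C(t_10, t_2): a sum of log growth factors over steps 3 to 10
    log_ratios.append(math.log(history[9] / history[1]))
```

Because the logarithm of each ratio is a sum of independent terms, a histogram of `log_ratios` can be inspected for the roughly bell-shaped form that the log-normal argument predicts.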

However, while only 35% of the tweets in Twitter are retweets, the percentage is much larger in Sina Weibo: in our sample, 65% of the tweets are retweets. This implies that topics trend mainly because some content has been retweeted many times. Thus, Sina Weibo users are more likely to learn about a particular topic through retweets.

4.3 Trend-setters in Sina Weibo

For every new trending keyword we retrieved the most retweeted tweets in the past hour and compiled a list of the most retweeted users. Table 1 lists the top 20 most retweeted authors appearing in at least 10 trending topics each. The influential authors are ranked according to the ratio between the number of times the authors’ tweets are retweeted and the number of trending topics these tweets appeared in.

From Table 1 we observed that only 4 out of the top 20 influential authors were verified accounts. The 4 verified accounts represent an urban fashion magazine, a fashion brand, an online travel magazine, and a Chinese celebrity. The other 16 influential authors are unverified accounts. They all seem to have a strong focus on collecting user-contributed jokes, movie trivia, quizzes, stories and so on. This is in sharp contrast to the topics that are popular in Twitter as reported by [1]. When we looked at a longer list of authors we observed the same trend. The most popular items were all related to frivolous content and media, unlike Twitter, which had a strong affinity towards news and current events.

The “# of Tweets” column in Table 1 gives the number of unique tweets that have been retweeted. We can observe that the rate at which they have been retweeted is phenomenal. For example, the top retweeted user posted 37 tweets, which in total were retweeted 1,194,999 times.
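The ranking criterion and the retweet rate can be reproduced from the counts in Table 1. The sketch below uses three rows of the table; the helper names are illustrative.

```python
# (description, times retweeted, tweets, trending topics), copied from Table 1.
rows = [
    ("Fashion Magazine", 1194999, 37, 12),
    ("Fashion Brand", 849404, 21, 13),
    ("Funny Jokes", 3667566, 438, 121),
]

def per_topic(row):
    """Ranking ratio: times retweeted divided by number of trending topics."""
    return row[1] / row[3]

def per_tweet(row):
    """Average number of retweets received by each unique tweet."""
    return row[1] / row[2]

for row in sorted(rows, key=per_topic, reverse=True):
    print(f"{row[0]:<18}{per_topic(row):>10.0f} per topic{per_tweet(row):>10.0f} per tweet")
```

For the top account this gives roughly 99,583 retweets per trending topic and about 32,297 retweets per tweet.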

TABLE 1
Top 20 Retweeted Users in At Least 10 Trending Topics

| Rank | Account Description | Verified | # of Times Retweeted | # of Tweets | # of Topics |
|---|---|---|---|---|---|
| 1 | Fashion Magazine | Yes | 1194999 | 37 | 12 |
| 2 | Fashion Brand | Yes | 849404 | 21 | 13 |
| 3 | Travel Magazine | Yes | 127737 | 123 | 21 |
| 4 | Gourmet Factory | No | 553586 | 86 | 12 |
| 5 | Horoscopes | No | 1545955 | 101 | 38 |
| 6 | Silly Jokes | No | 3210130 | 258 | 81 |
| 7 | Good Movies | No | 1497968 | 140 | 38 |
| 8 | Wonderful Quotes | No | 602528 | 39 | 17 |
| 9 | Global Music | No | 697308 | 116 | 22 |
| 10 | Funny Jokes | No | 3667566 | 438 | 121 |
| 11 | Creative Ideas | No | 742178 | 111 | 25 |
| 12 | Chinese singer | Yes | 284600 | 25 | 10 |
| 13 | Good Music | No | 323022 | 52 | 12 |
| 14 | Movie Factory | No | 1509003 | 230 | 59 |
| 15 | Strange Stories | No | 1668910 | 250 | 66 |
| 16 | Beautiful Pictures | No | 435312 | 33 | 18 |
| 17 | Global Music | No | 432444 | 65 | 18 |
| 18 | Female Fashion | No | 809440 | 87 | 34 |
| 19 | Useful Tips | No | 735070 | 153 | 31 |
| 20 | Funny Quizzes | No | 589477 | 77 | 25 |

4.4 The Evolution of Retweets and Original Tweets

In the next experiment, we separate the tweets in Sina Weibo into original tweets and retweets and calculate the densities of ratios between cumulative retweet/original tweet counts measured in different time frames. Figure 6 shows the distributions of original tweet/retweet ratios over all topics for two arbitrarily chosen pairs of time frames: (10, 2) and (8, 3).

We find (as the last two sub-figures in Figure 6 show) that the distributions of ratios for original tweets follow the log-normal distribution. However, we observe (as the first two sub-figures in Figure 6 show) that for retweets, the distributions do not satisfy all the properties of the log-normal distribution. This is indicated by the large number of low retweet ratios in the distribution. Furthermore, there are high spikes in the low-ratio region of the distribution.

4.5 Identifying Spam Activity in Sina Weibo

From Figure 6 in the previous section we observed that there is a high percentage of low ratios in the distribution of retweet ratios. A low ratio means that most of a topic’s retweets had already arrived by the earlier time frame, which suggests that for many of the topics there is an initial flurry of retweets. We hypothesize that this is due to the activities of certain users in Sina Weibo. When these users post a tweet, they tend to set up many other fake accounts to continuously retweet it, expecting that the high retweet numbers will propel the tweet into the Sina Weibo hourly trending list. This would then cause other users to notice the tweet more after it has emerged as the top hourly, daily, or weekly trend setter.

We attempt to verify the above hypothesis empirically. We define a spamming account as one that is set up for the purpose of repeatedly retweeting certain messages, thus giving these messages artificially inflated popularity. According to our hypothesis, the users who retweet abnormally often are more likely to be spam accounts.

Figure 7 a) illustrates the distribution of the number of users against their corresponding number of retweets (over all topics). Figure 7 b) illustrates the distribution of the number of users against the number of topics that they caused to trend by their retweets. We observe that both distributions in Figure 7 follow a power law. This implies that there are certain users who retweet a lot, and that a small number of users are responsible for a large number of topics. Next, we investigate who these users are. We manually checked the top 40 accounts that retweeted the most. To our surprise, 37 of these 40 accounts could no longer be accessed. That is, when we queried the accounts’ IDs, we retrieved a message from Sina Weibo stating that the account has been removed and can no longer be accessed (see Figure 8).

According to Sina Weibo’s frequently asked questions page, if a user sends a tweet containing illegal or sensitive information, the tweet will be immediately deleted by Sina Weibo’s administrators; however, the user’s account will remain active. For this reason we assume that if an account was active one month ago and can no longer be reached, it has very likely performed malicious activities such as spamming and has hence been deleted.

Fig. 6. The densities of ratios between cumulative original tweets/retweets counts measured in two arbitrary time frames: (10, 2) and (8, 3)

Next, we inspect the user accounts with the most retweets in our sample and the number of accounts they retweeted. We see that although these accounts retweeted a lot, they mostly retweet messages from only a few users. We reorganize the users who retweeted by the ratio between the number of times he/she retweeted and the number of users he/she retweeted. We refer to this as the user-retweet ratio. Table 2 lists the top 10 users with the highest user-retweet ratios. We note that each of these users retweets posts from only one account. We observe that this is true for the top 30 accounts with the highest user-retweet ratios.
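The user-retweet ratio can be computed directly from a log of retweets. The sketch below assumes that each retweet is recorded as a pair of the retweeting user and the author of the original post; this record format, the function name and the sample log are hypothetical.

```python
from collections import defaultdict

def user_retweet_ratios(retweets):
    """retweets: iterable of (retweeter_id, original_author_id) pairs.

    Returns {retweeter_id: (retweet_count, distinct_authors, ratio)}.
    """
    counts = defaultdict(int)
    authors = defaultdict(set)
    for retweeter, author in retweets:
        counts[retweeter] += 1
        authors[retweeter].add(author)
    return {
        user: (counts[user], len(authors[user]), counts[user] / len(authors[user]))
        for user in counts
    }

# Hypothetical log: "u1" retweets one author four times, "u2" spreads out.
log = [("u1", "a")] * 4 + [("u2", "a"), ("u2", "b"), ("u2", "c")]
ratios = user_retweet_ratios(log)
print(ratios["u1"])  # (4, 1, 4.0)
print(ratios["u2"])  # (3, 3, 1.0)
```

A user who retweets many authors once each has a ratio of 1, while a user who repeatedly retweets a single author has a ratio equal to his/her retweet count, as in every row of Table 2.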

Fig. 7. The distribution of the number of users’ retweets and of the number of topics in which users’ retweets trend

TABLE 2
The top 10 accounts with the highest user-retweet ratios (u-r ratio)

| User ID | Number of Retweets | Number of Users Retweeted | U-R Ratio |
|---|---|---|---|
| 1840241580 | 134 | 1 | 134 |
| 2241506824 | 125 | 1 | 125 |
| 1840263604 | 68 | 1 | 68 |
| 1840237192 | 64 | 1 | 64 |
| 1840251632 | 64 | 1 | 64 |
| 2208320854 | 55 | 1 | 55 |
| 2208320990 | 51 | 1 | 51 |
| 2208329370 | 48 | 1 | 48 |
| 2218142513 | 47 | 1 | 47 |
| 1843422117 | 44 | 1 | 44 |

Next, we conduct the following experiment: starting from the users with the highest user-retweet ratios, we used a crawler to automatically visit and retrieve each user’s Sina Weibo account. We thus measured the percentage of user accounts that can still be accessed (as opposed to being directed to the error page), organized by user-retweet ratio (Table 3). We observe that only 12% of the accounts with user-retweet ratios of 30 or above are active, and that as user-retweet ratios decrease, the percentage of active accounts generally increases. We consider this to be strong evidence for the hypothesis that user accounts with high user-retweet ratios are likely to be spam accounts.

We observe that in some cases, an account with a lower user-retweet ratio can still be a spam account. For example, an account could retweet a number of posts from other spam accounts, thus reducing the suspicion of being detected as a spam account itself.
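The accessibility measurement summarized in Table 3 amounts to grouping accounts by user-retweet ratio and computing the share that is still active in each group. In the sketch below the activity flag is assumed to have been obtained beforehand (for example by the crawler); the function names, the truncation of fractional ratios and the sample data are assumptions made for illustration.

```python
def bucket(ratio):
    """Map a user-retweet ratio to the row labels used in Table 3."""
    value = int(ratio)  # assumption: fractional ratios are truncated
    if value >= 30:
        return ">=30"
    if value >= 20:
        return "20-29"
    if value >= 11:
        return "11-19"
    return str(value)

def active_share(accounts):
    """accounts: iterable of (ratio, is_active). Returns {bucket: percent active}."""
    totals, active = {}, {}
    for ratio, is_active in accounts:
        key = bucket(ratio)
        totals[key] = totals.get(key, 0) + 1
        active[key] = active.get(key, 0) + (1 if is_active else 0)
    return {key: 100.0 * active[key] / totals[key] for key in totals}

# Hypothetical accounts: (user-retweet ratio, profile still accessible).
sample = [(134, False), (44, False), (31, True), (30, False), (2, True), (2, True)]
print(active_share(sample))  # {'>=30': 25.0, '2': 100.0}
```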

4.6 Removing Spammers in Sina Weibo

From our sample, after automatically checking each account, we identified 4,985 accounts that had been deleted by the Sina Weibo administrators. We call these 4,985 accounts “suspected spam accounts”. There were 463,231 users in our sample, and 187,130 of them retweeted at least once. Thus we identified 1.08% of the total users (2.66% of the users that retweeted) as suspected spam accounts.

TABLE 3
The percentage of accounts whose profiles can still be accessed, organized by user-retweet ratio

| Ratio | Percentage of Active Accounts | Percentage of Inactive Accounts |
|---|---|---|
| ≥30 | 12% | 88% |
| 20 – 29 | 38% | 63% |
| 11 – 19 | 16% | 84% |
| 10 | 22% | 78% |
| 9 | 12% | 88% |
| 8 | 16% | 84% |
| 7 | 15% | 85% |
| 6 | 21% | 79% |
| 5 | 30% | 70% |
| 4 | 58% | 42% |
| 3 | 80% | 20% |
| 2 | 96% | 4% |
| 1 | 92% | 8% |

Fig. 8. An Example of an Error Page

Next, in order to measure the effect of spam on the Weibo network, we removed from our sample all retweets disseminated by suspected spam accounts as well as posts published by them (and then later retweeted by others). We hypothesize that by removing these retweets, we can eliminate the influence of the suspected spam accounts. We observed that after these posts were removed, we were left with only 189,686 retweets in our sample (51% of the original total retweets). In other words, by removing retweets associated with suspected spam accounts, we removed 182,429 retweets, which is 49% of the total retweets and 32% of the total tweets (both retweets and original tweets) in our sample. This result is very interesting because it shows that a large share of the retweets in our sample are associated with suspected spam accounts. The spam accounts are therefore artificially inflating the popularity of topics, causing them to trend.
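The removal step and the reported percentages can be sketched as follows. The filter reflects one reading of the procedure, namely dropping a retweet when either the retweeting account or the author of the original post is a suspected spam account; the function name and the record format are hypothetical, while the three totals are the figures reported in the text.

```python
def remove_spam_retweets(retweets, suspected):
    """Drop retweets made by, or of posts written by, suspected spam accounts.

    retweets: iterable of (retweeter_id, original_author_id) pairs.
    suspected: set of suspected spam account ids.
    """
    return [
        (retweeter, author)
        for retweeter, author in retweets
        if retweeter not in suspected and author not in suspected
    ]

# Figures reported in the text.
total_tweets = 574_382
total_retweets = 372_115
remaining_retweets = 189_686

removed = total_retweets - remaining_retweets
print(removed)                                # 182429
print(round(100 * removed / total_retweets))  # 49
print(round(100 * removed / total_tweets))    # 32
```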

To see the difference after the posts associated with suspected spam accounts were removed, we recalculated the distribution of retweet ratios for arbitrarily chosen pairs of time frames. Figure 9 illustrates the distribution for time frames (10, 2). We observed that the distribution is now much smoother and seems to follow the log-normal distribution. We performed the log-normal test and verified that this is indeed the case.

Fig. 9. The distribution of retweet ratios for time frame (10, 2) after the removal of tweets associated with suspected spam accounts

4.7 Spammers and Trend-setters

We found 6,824 users in our sample whose tweets were retweeted. However, the total number of users who retweeted at least one person’s tweet was 187,130, so the relationship is very skewed. Figure 10 illustrates the distribution of the number of times users were retweeted. This distribution follows a power law.

Fig. 10. The distribution of the frequency of retweets of user posts

We discovered that the number of users whose tweets were retweeted by the suspected spam accounts was 4,665, which is a surprising 68% of the users who were retweeted in our sample. This shows that the suspected spam accounts affect a majority of the trend-setters in our sample, helping them raise the retweet numbers of their posts and thereby making their posts appear on the trending list. The overall effect of the spammers is very significant. We also observed that 98% of the total trending keywords can be found in posts retweeted by suspected spam accounts. Thus it can also be argued that many of the trends themselves are artificially generated, which is a very important result.

4.8 Examples of Spam Accounts

Next, we investigate the activities of typical spam accounts in Sina Weibo. We have shown that accounts with high user-retweet ratios are likely to be spam accounts. Although the majority of the accounts had already been deleted by the administrator, we manually inspected 100 currently existing accounts with high user-retweet ratios and found that 95 clearly participate in spamming activities. The other 5 were regular users supporting their favorite singers and celebrities by repeatedly retweeting their posts, which can also be construed as spam; however, we exclude those from our list of suspected spam accounts. Figure 11 illustrates two examples of the activities of suspected spam accounts.

Fig. 11. Example of a spam account

First, we observe that the suspected spam accounts we inspected tend to repeatedly retweet the same post with the goal of increasing the retweet number of said post. Next, these repeated retweets tend to occur very close to each other in time, with long breaks between each set. Finally, we observe that the replies left by spam accounts often do not make any sense (see the comments circled in Figure 11). Chen et al. [21] had similar findings, and explained that this is because paid posters are mainly interested in finishing the job as quickly as possible; thus they tend to retweet multiple times in short bursts and leave gibberish as replies. We observe that the replies in Figure 11 a) and b) are not proper sentences.
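The burst pattern described above (retweets close together, separated by long breaks) can be made measurable by grouping an account’s retweet times into bursts. The following sketch is not part of the original analysis; the function name, the 60-second gap threshold and the sample timestamps are assumptions.

```python
def split_into_bursts(timestamps, max_gap=60):
    """Group retweet times (in seconds) into bursts separated by gaps > max_gap."""
    bursts = []
    for moment in sorted(timestamps):
        if bursts and moment - bursts[-1][-1] <= max_gap:
            bursts[-1].append(moment)
        else:
            bursts.append([moment])
    return bursts

# Hypothetical retweet times, in seconds, for one account and one post.
times = [0, 5, 12, 20, 7200, 7204, 7215, 14400]
bursts = split_into_bursts(times)
print([len(b) for b in bursts])  # [4, 3, 1]
```

An account whose retweets of the same post fall into a few dense bursts with hours in between would match the behavior observed in Figure 11.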

For the 4,665 users whose tweets were retweeted by at least one suspected spam account, we calculate the percentage of retweets that come from spam accounts and the percentage of retweeting accounts that are suspected spam accounts. We selected only accounts for which at least 50% of the retweeting accounts are suspected spam accounts. From our manual inspection we found mainly three types of accounts:

1) Verified accounts from celebrities and reality show contestants: We hypothesize that they employ spam accounts to boost the popularity of their posts, making it seem like the posts were retweeted by many fans;

2) Verified accounts from companies: We hypothesize that they employ spam accounts to boost the perceived popularity of their products;

3) Unverified accounts whose posts consist of ads for products: We hypothesize that these accounts employ spam accounts to distribute the ads and to boost the perceived popularity of their products, hoping other users will notice and distribute them (see Figure 12 for an example).

Fig. 12. Example of an account using spam
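The selection of accounts described before the list can be sketched as follows, under the reading that an account is selected when at least half of the distinct accounts retweeting it are suspected spam accounts. The function names, the record format and the sample log are hypothetical.

```python
from collections import defaultdict

def spam_shares(retweets, suspected):
    """retweets: iterable of (retweeter_id, original_author_id) pairs.

    Returns {author: (share of retweets made by suspected accounts,
                      share of distinct retweeters that are suspected)}.
    """
    suspected = set(suspected)
    retweeters = defaultdict(list)
    for retweeter, author in retweets:
        retweeters[author].append(retweeter)
    shares = {}
    for author, users in retweeters.items():
        distinct = set(users)
        shares[author] = (
            sum(user in suspected for user in users) / len(users),
            len(distinct & suspected) / len(distinct),
        )
    return shares

def mostly_spam_boosted(shares, threshold=0.5):
    """Authors for whom at least `threshold` of distinct retweeters are suspected."""
    return sorted(a for a, (_, by_account) in shares.items() if by_account >= threshold)

# Hypothetical log with two suspected accounts, "s1" and "s2".
log = [("s1", "a"), ("s1", "a"), ("s2", "a"), ("u1", "a"),
       ("u1", "b"), ("u2", "b"), ("s1", "b")]
shares = spam_shares(log, {"s1", "s2"})
print(mostly_spam_boosted(shares))  # ['a']
```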

5 DISCUSSION AND FUTURE WORK

We have examined the tweets relating to the trending topics in Sina Weibo. First we analyzed the growth and persistence of trends. When we looked at the distribution of tweets over time, we observed that there was a significant difference when contrasted with Twitter. The effect of retweets in Sina Weibo was significantly higher than in Twitter. We also found that many of the accounts that contribute to trends tend to operate as user-contributed online magazines, sharing amusing pictures, jokes, stories and anecdotes. Such posts tend to receive a large number of responses from users and thus retweets. Yang et al. [48] have shown similar results about Mitbbs users forwarding amusing messages and “virtual gifts” to online friends. The effect of this is similar to that of sending “a cyber greeting card”. This phenomenon can also be observed in text messages sent between individuals’ cell phones in China [53]. This is interesting in the context of the strong censorship in Chinese social media. It can be hypothesized that under such circumstances, it is these kinds of “safe” topics that can emerge.

When we examined the retweets in more detail, we made an important discovery. We found that 49% of the retweets in Sina Weibo containing trending keywords were actually associated with fraudulent accounts. We observed that these accounts comprised a small fraction of the users (1.08% of the total users) but were responsible for a large percentage of the total retweets for the trending keywords. These fake accounts are responsible for artificially inflating certain posts, thus creating fake trends in Sina Weibo.

We relate our findings to the questions we raised in the introduction. There is strong competition among content in online social media to become popular and trend, and this gives users motivation to artificially inflate topics to gain a competitive edge. We hypothesize that certain accounts in Sina Weibo employ fake accounts to repeatedly retweet their tweets in order to propel them to the top of the trending list, thus gaining prominence as top trend setters (and becoming more visible to other users). We found evidence suggesting that the accounts that do so tend to be verified accounts with commercial purposes.

It is clear that the owners of these user-contributed online magazines see this as a business opportunity to gain an audience for their content. They can start by generating and propagating popular content and subsequently begin inserting advertisements among the jokes in their Sina Weibo accounts. The artificial inflation makes it an even more effective campaign.

We have found that we can effectively detect suspected spam accounts using user-retweet ratios. This can lead to future work such as using machine learning to identify other spamming techniques. In the future, we would like to examine the behavior of these fake accounts that contribute to artificial inflation in Sina Weibo to learn how successful they are in influencing trends.

REFERENCES

[1] S. Asur, B. A. Huberman, G. Szabo, and C. Wang, “Trends in social media - persistence and decay,” in 5th International AAAI Conference on Weblogs and Social Media, 2011.

[2] B. A. Huberman, D. M. Romero, and F. Wu, “Social networks that matter: Twitter under the microscope,” Computing Research Repository, 2008.

[3] L. Jin, “Chinese outline BBS sphere: what BBS has brought to China,” Master’s thesis, Massachusetts Institute of Technology, April 2009.

[4] B. Wang, B. Hou, Y. Yao, and L. Yan, “Human flesh search model incorporating network expansion and gossip with feedback,” in Proceedings of the 2009 13th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications. IEEE Computer Society, 2009, pp. 82–88.

[5] V. King, L. Yu, and Y. Zhuang, “Guanxi in the chinese web,” in Proceedings of the 2009 IEEE International Conference on Computational Science and Engineering, vol. 4. IEEE Computer Society, 2009, pp. 9–17.

[6] T. Z. Xue, The Internet in China: Cyberspace and Civil Society. Routledge, 2006.

[7] CNNIC. (2010) The 21st statistics report on the internet development in china (in chinese). [Online]. Available: http://www.cnnic.cn/index/0E/00/11/index.htm

[8] ——. (2010) Survey report on internet development in rural china (in chinese). [Online]. Available: http://www.cnnic.cn/en/index/00/02/index.htm

[9] F. Y. Wang, “Beyond x 2.0: where should we go?” IEEE Intelligent Systems, vol. 24, no. 3, pp. 2–4, 2009.

[10] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, “Detecting spammers on twitter,” in Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS), vol. 6. National Academy Press, 2010.

[11] A. Wang, “Detecting spam bots in online social networking sites: a machine learning approach,” Data and Applications Security and Privacy XXIV, pp. 335–342, 2010.

[12] K. Lee, J. Caverlee, and S. Webb, “Uncovering social spammers: social honeypots + machine learning,” in Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 2010, pp. 435–442.

[13] B. Markines, C. Cattuto, and F. Menczer, “Social spam detection,” in Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web. ACM, 2009, pp. 41–48.

[14] Y. Boshmaf, I. Muslukhov, K. Beznosov, and M. Ripeanu, “The socialbot network: when bots socialize for fame and money,” in Proceedings of the 27th Annual Computer Security Applications Conference. ACM, 2011, pp. 93–102.

[15] B. Stone-Gross, R. Stevens, A. Zarras, R. Kemmerer, C. Kruegel, and G. Vigna, “Understanding fraudulent activities in online ad exchanges,” in Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference. ACM, 2011, pp. 279–294.

[16] F. Yu, Y. Xie, and Q. Ke, “Sbotminer: large scale search bot detection,” in Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010, pp. 421–430.

[17] L. Liu and K. Jia, “Detecting spam in chinese microblogs - a study on sina weibo,” in Computational Intelligence and Security (CIS), 2012 Eighth International Conference on, 2012, pp. 578–581.

[18] Y. Zhou, K. Chen, L. Song, X. Yang, and J. He, “Feature analysis of spammers in social networks with active honeypots: A case study of chinese microblogging networks,” in Advances in Social Networks Analysis and Mining (ASONAM), 2012 IEEE/ACM International Conference on, 2012, pp. 728–729.

[19] X. Yong, Z. Yi, and C. Kai, “Observation on spammers in sina weibo,” in Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013). Atlantis Press, 2013.

[20] C. Lin, J. He, Y. Zhou, X. Yang, K. Chen, and L. Song, “Analysis and identification of spamming behaviors in sina weibo microblog,” in Proceedings of the 7th Workshop on Social Network Mining and Analysis, ser. SNAKDD ’13, 2013, pp. 5:1–5:9.

[21] C. Chen, K. Wu, V. Srinivasan, and X. Zhang, “Battling the internet water army: Detection of hidden paid posters,” CoRR, vol. abs/1111.4297, 2011.

[22] M. Jamali and H. Abolhassani, “Different aspects of social network analysis,” in Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, 2006, pp. 66–72.

[23] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee, “Measurement and analysis of online social networks,” in Proceedings of the 7th SIGCOMM Conference on Internet Measurement. ACM, 2007, pp. 29–42.

[24] M. Buchanan, Nexus: Small Worlds and the Groundbreaking Theory of Networks. W. W. Norton & Company, May 2003.

[25] R. Kumar, J. Novak, and A. Tomkins, “Structure and evolution of online social networks,” in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2006, pp. 611–617.

[26] M. McPherson, L. Smith-Lovin, and J. M. Cook, “Birds of a feather: homophily in social networks,” Annual Review of Sociology, vol. 27, no. 1, pp. 415–444, 2001.

[27] N. Agarwal, H. Liu, L. Tang, and P. S. Yu, “Identifying the Influential Bloggers in a Community,” WSDM’08, 2008.

[28] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan, “Group formation in large social networks: membership, growth, and evolution,” in Proceedings of the 12th International Conference on Knowledge Discovery and Data Mining. ACM, 2006, pp. 44–54.


[29] D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and S. Suri, “Feedback effects between similarity and social influence in online communities,” in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2008, pp. 160–168.

[30] D. M. Romero, W. Galuba, S. Asur, and B. A. Huberman, “Influence and passivity in social media,” in 20th International World Wide Web Conference (WWW’11), 2011.

[31] H. Kwak, C. Lee, H. Park, and S. Moon, “What is twitter, a social network or a news media?” in Proceedings of the 19th international conference on World wide web, ser. WWW ’10, 2010, pp. 591–600.

[32] M. Mathioudakis and N. Koudas, “Twittermonitor: trend detection over the twitter stream,” in Proceedings of the 2010 international conference on Management of data, ser. SIGMOD ’10, 2010, pp. 1155–1158.

[33] S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts, “Who says what to whom on twitter,” in Proceedings of the 20th international conference on World wide web, ser. WWW ’11, 2011, pp. 705–714.

[34] S. Asur, B. A. Huberman, G. Szabo, and C. Wang, “Trends in social media: Persistence and decay,” in 5th International AAAI Conference on Weblogs and Social Media. AAAI, 2011, pp. 434–437.

[35] M. Cha, H. Haddadi, F. Benevenuto, and K. P. Gummadi, “Measuring user influence in twitter: The million follower fallacy,” in 4th International AAAI Conference on Weblogs and Social Media (ICWSM). AAAI, 2010.

[36] Y. Bian, “Bringing strong ties back in: indirect ties, network bridges, and job searches in china,” American Sociological Review, vol. 62, no. 3, pp. 366–385, 1997.

[37] D. Ruan, “Interpersonal networks and workplace controls in urban china,” The Australian Journal of Chinese Affairs, vol. 29, pp. 89–105, 1993.

[38] J.-L. Farh, A. S. Tsui, K. Xin, and B.-S. Cheng, “The influence of relational demography and guanxi: the Chinese case,” Organization Science, vol. 9, no. 4, pp. 471–488, 1998.

[39] Y. Bian, R. Breiger, D. Davis, and J. Galaskiewicz, “Occupation, class, and social networks in urban china,” Social Forces, vol. 83, no. 4, pp. 1443–1468, 2005.

[40] P. J. Carrington, J. Scott, and S. Wasserman, Eds., Models and Methods in Social Network Analysis. Cambridge University Press, 2005.

[41] M. Xin, “Chinese bulletin board system’s influence upon university students and ways to cope with it (in chinese),” Journal of Nanjing University of Technology (Social Science Edition), vol. 4, pp. 100–104, 2003.

[42] L. Yu and V. King, “The evolution of friendships in chinese online social networks,” in Proceedings of the 2010 IEEE Second International Conference on Social Computing, ser. SOCIALCOM ’10, 2010, pp. 81–87.

[43] Z.-J. Zhong, “Social networking services (sns) in china,” International Journal of e-Business Management, vol. 4, no. 1, pp. 66–69, 2010.

[44] B. Zhang, X. Guan, M. J. Khan, and Y. Zhou, “A time-varying propagation model of hot topic on BBS sites and blog networks,” Information Sciences, vol. 187, no. 0, pp. 15–32, 2012.

[45] M. Chan, X. Wu, Y. Hao, R. Xi, and T. Jin, “Microblogging, online expression, and political efficacy among young chinese citizens: The moderating role of information and entertainment needs in the use of weibo,” Cyberpsy., Behavior, and Soc. Networking, vol. 15, no. 7, pp. 345–349, 2012.

[46] H. Chen and E. Haley, “The lived meanings of product placement in social network sites (snss) among urban chinese white-collar professional users: A story of happy network,” Journal of Interactive Advertising, vol. 11, no. 1, pp. 11–16, 2010.

[47] S.-C. Chu and S. M. Choi, “Electronic word-of-mouth in social networking sites: A cross-cultural study of the united states and china,” Journal of Global Marketing, vol. 24, no. 3, pp. 263–281, 2011.

[48] J. Yang, M. S. Ackerman, and L. A. Adamic, “Virtual gifts and guanxi: Supporting social exchange in a chinese online community,” in Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work, ser. CSCW ’11, 2011, pp. 45–54.

[49] J. Lin, Z. Li, D. Wang, K. Salamatian, and G. Xie, “Analysis and comparison of interaction patterns in online social network and social media,” in Computer Communications and Networks (ICCCN), 2012 21st International Conference on, 2012, pp. 1–7.

[50] J. Niu, J. Peng, L. Shu, C. Tong, and W. Liao, “An empirical study of a chinese online social network–renren,” Computer, vol. 46, no. 9, pp. 78–84, 2013.

[51] G. Chang, H.-S. Huang, and J. Y. jen Hsu, “Detecting chinese wish messages in social media: An empirical study,” in ICWSM, 2013.

[52] R. Fan, J. Zhao, Y. Chen, and K. Xu, “Anger is more influential than joy: Sentiment correlation in weibo,” CoRR, vol. abs/1309.2402, 2013.

[53] Y. Xia, “Chinese use of mobile texting for social interactions: Cultural implications in the use of communication technology,” Intercultural Communication Studies, vol. 21, no. 2, p. 131, 2012.