Epidemics in Blogspace

Date posted: 25-Feb-2016

Hasan T Karaoglu

Outline
  • Introduction
  • Blogs are different!
  • Methods are different!
  • Contents are different!
  • Some methods on Some Content of Some Blogs
  • Discussion

Introduction
  • Blogs are a popular way to share personal journals, discuss matters of public opinion, have collaborative conversations, and aggregate content on similar topics.
  • Blogs also disseminate new content and novel ideas.
  • How does content spread across blogspace, what kinds of content spread, and at what rate?

Introduction - Epidemics
  • Epidemics: one way of modeling these aspects
  • Physics of Information Diffusion
  • Disease Propagation Model: Susceptible, Infected, Recovered, Mutation?
  • Threshold Model for Social Networks
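The disease propagation model named on the slide (susceptible, infected, recovered) can be sketched as a discrete-time simulation. This is a minimal illustration, not a model taken from the talk: the graph layout, the per-step probabilities `p_infect` and `p_recover`, and the function name `simulate_sir` are all assumptions, and mutation is not modeled.

```python
import random


def simulate_sir(neighbors, seeds, p_infect, p_recover, steps, rng):
    """Discrete-time SIR on a graph given as {node: [neighbor, ...]}.

    All parameter names are hypothetical; the slides give no parameters.
    Returns one {node: state} snapshot per step, starting with step 0.
    """
    state = {node: "S" for node in neighbors}
    for seed in seeds:
        state[seed] = "I"
    history = [dict(state)]
    for _ in range(steps):
        new_state = dict(state)
        for node, status in state.items():
            if status != "I":
                continue
            # An infected blog tries to infect each susceptible neighbor.
            for nb in neighbors[node]:
                if state[nb] == "S" and rng.random() < p_infect:
                    new_state[nb] = "I"
            # It may then recover and stay immune.
            if rng.random() < p_recover:
                new_state[node] = "R"
        state = new_state
        history.append(dict(state))
    return history


# Invented three-blog chain, for illustration only.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
history = simulate_sir(chain, ["a"], 0.5, 0.3, 10, random.Random(42))
print(history[-1])
```

Passing the random generator in explicitly keeps runs reproducible, which helps when comparing spreading rates across parameter settings.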

Blogs are different
  • YouTube, Flickr (Content Sharing)
  • Amazon
  • CNN, MSNBC (Web)
  • LinkedIn (Professional Networking)
  • Orkut, Facebook, Yonja (Social Networking)
  • Twitter (?)
  • Blogger, Blogspot, LiveJournal, Slashdot (Blogspace)

  • High level of reciprocity: in-degree and out-degree are symmetric, in contrast to the Web (high-authority sites).
  • Average path length is very short compared to the Web. (Directionality?)
  • Joint degree distribution: high-degree nodes connect to other high-degree nodes.
  • Epidemics on the network core? YouTube celebrities?
  • Strongly connected core analysis: slowly increasing shortest path, high clustering.
  • Strong local clustering: people tend to be introduced to other people via mutual friends.
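Two of the structural properties listed above, reciprocity and local clustering, can be measured on a link graph roughly as follows. This is a sketch under assumptions: the graph is a plain dictionary of out-links, clustering ignores link direction, and the function names and sample blogs are invented for illustration.

```python
def reciprocity(out_links):
    """Fraction of directed links A->B for which B->A also exists."""
    edges = {(a, b) for a, targets in out_links.items() for b in targets if a != b}
    if not edges:
        return 0.0
    mutual = sum(1 for (a, b) in edges if (b, a) in edges)
    return mutual / len(edges)


def local_clustering(out_links, node):
    """Share of a node's neighbor pairs that are linked to each other.

    Link direction is ignored here, which is a simplifying assumption.
    """
    undirected = {}
    for a, targets in out_links.items():
        for b in targets:
            if a != b:
                undirected.setdefault(a, set()).add(b)
                undirected.setdefault(b, set()).add(a)
    nbrs = sorted(undirected.get(node, set()))
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in undirected[nbrs[i]]
    )
    return linked / (k * (k - 1) / 2)


# Invented four-blog graph: a and b link to each other, c links out to d.
links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["d"], "d": []}
print(reciprocity(links), local_clustering(links, "a"))
```

A high reciprocity value together with high local clustering is the pattern the slides attribute to blog networks.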

Methods are different
  • Epidemics
  • Gossip
  • Influence Map (Word of Mouth)
  • Recommendation Based Web (Data) Mining
  • Mathematical Modeling (Markov Chains, Information Theory, ...)

Contents are different
  • Recommendation
  • News (Political, Fun, Paparazzi)
  • Gossip
  • Media (Music, News, Excerpts)

The horizontal axis represents the time difference between the video upload and the blog linking. The vertical axis shows the cumulative distribution plot (CDF) of the number of links. Next to each plot we show the median age of linked videos in units of days. The median age for videos in the news category is 2 days, and some links appeared within a few seconds to minutes of the video upload. Very few news videos were linked after a year of being uploaded. This demonstrates that news videos that spread in the blogosphere are topical and young. The other video categories show a pattern of much delayed discovery: the median age of a comedy video is 72 days at the time it was linked by a blog. The median age of videos for entertainment is 125 days, and it is 357 days for music! This indicates that bloggers post about recent events when it comes to news and politics, but also enjoy rediscovering old content (nearly one year old) for other video topics.
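The two quantities read off that plot, the median age and the cumulative distribution of link ages, can be computed from raw ages as sketched below. The sample ages are invented for illustration and are not the data behind the figure; `empirical_cdf` is a hypothetical helper name.

```python
from statistics import median


def empirical_cdf(ages_in_days):
    """Return sorted (age, fraction of links with age <= that value) pairs."""
    ordered = sorted(ages_in_days)
    n = len(ordered)
    cdf = []
    for i, age in enumerate(ordered, start=1):
        if cdf and cdf[-1][0] == age:
            cdf[-1] = (age, i / n)  # merge tied ages into one step
        else:
            cdf.append((age, i / n))
    return cdf


# Invented ages (days between video upload and blog link).
news_ages = [0, 1, 2, 2, 400]
print(median(news_ages))
print(empirical_cdf(news_ages))
```

A category whose CDF rises steeply near zero is being linked while it is fresh; a slow rise means links keep arriving long after upload.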

Some methods on Some Content of Some Blogs

Infection inference technique introduced by Adamic et al.
  • Link inference
  • Link classification
  • Classifier training
  • Problems and Challenges

Patterns Used for Classifier Training
  • The number of common blogs explicitly linked to by both blogs (indicating whether the two blogs are in the same community)
  • The number of non-blog links (i.e. URLs) shared by the two blogs
  • Text similarity
  • Order and frequency of repeated infections: specifically, the number of times one blog mentions a URL before the other, and the number of times they both mention the URL on the same day
  • In-link and out-link counts for the two blogs

Text Similarity

$s(A,B) = \dfrac{n_{AB}}{\sqrt{n_A}\,\sqrt{n_B}}$
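The similarity score can be sketched in code if the slide's formula is read as a cosine similarity, that is, $n_{AB}$ divided by the product of the square roots of $n_A$ and $n_B$. That reading is an assumption, as is the interpretation of $n_A$ and $n_B$ as the number of distinct items (terms or URLs) in each blog and $n_{AB}$ as the number they share.

```python
from math import sqrt


def similarity(items_a, items_b):
    """Cosine-style overlap: n_AB / (sqrt(n_A) * sqrt(n_B)).

    n_A, n_B: number of distinct items in each blog (assumed meaning).
    n_AB: number of distinct items the two blogs share (assumed meaning).
    """
    set_a, set_b = set(items_a), set(items_b)
    if not set_a or not set_b:
        return 0.0
    n_ab = len(set_a & set_b)
    return n_ab / (sqrt(len(set_a)) * sqrt(len(set_b)))


# Invented term lists for two blogs.
blog_a = ["epidemic", "blog", "link", "meme"]
blog_b = ["blog", "link", "meme", "video", "news", "music", "web", "graph", "core"]
print(similarity(blog_a, blog_b))
```

Under this reading the score is 1 for two blogs with identical item sets and 0 for blogs that share nothing, regardless of how much each blog posts.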

Timing of Infection

Link Inference
  • Blog URL and text similarity patterns
  • Three-way classifier (57%): reciprocated links, one-way links, unlinked pairs
  • Two-way classifier (SVM 91.2%, logistic regression 91.9%): linked vs. unlinked pairs

Infection Inference
  • Timing features: $n_{A\text{-before-}B}/n_A$, $n_{A\text{-after-}B}/n_A$, $n_{A\text{-same-day-}B}/n_A$
  • Timing patterns (75%)
  • With all 6 timing patterns and text/blog similarity patterns (61-75%)
  • Link-in / link-out counts
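The three timing ratios for blog A can be computed from per-URL mention days as sketched below. The input layout (a dictionary from URL to the day of first mention) and the function name are assumptions made for illustration; the slide does not say how the counts were gathered.

```python
def timing_features(mentions_a, mentions_b):
    """Timing ratios for blog A relative to blog B.

    mentions_a, mentions_b: {url: day number of first mention} (assumed layout).
    Each count is divided by n_A, the number of URLs mentioned by A.
    """
    n_a = len(mentions_a)
    if n_a == 0:
        return {"before": 0.0, "after": 0.0, "same_day": 0.0}
    before = after = same_day = 0
    for url, day_a in mentions_a.items():
        if url not in mentions_b:
            continue  # B never mentioned this URL
        day_b = mentions_b[url]
        if day_a < day_b:
            before += 1
        elif day_a > day_b:
            after += 1
        else:
            same_day += 1
    return {
        "before": before / n_a,
        "after": after / n_a,
        "same_day": same_day / n_a,
    }


# Invented mention days for four URLs.
blog_a = {"u1": 1, "u2": 5, "u3": 3, "u4": 7}
blog_b = {"u1": 2, "u2": 4, "u3": 3}
print(timing_features(blog_a, blog_b))
```

Calling the function with the arguments swapped yields the same three counts normalized by B's URL total; whether that is how the slide arrives at six timing patterns is a guess.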

Visualization
  • Heuristics using classifiers
  • Two types of graph: Directed Acyclic Graph, Most likely tree
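The slide does not say how the two graphs are built, so the following is only one plausible sketch. It assumes that each blog has a known infection day, that a classifier supplies a score for "source infected target" (the `score` callable, hypothetical), and that only blogs infected earlier can be sources.

```python
def infection_dag(infection_day, score, threshold):
    """All plausible infection links: earlier blog -> later blog with score >= threshold.

    Edges only run forward in time, so the result cannot contain a cycle.
    """
    edges = []
    for src, day_src in infection_day.items():
        for dst, day_dst in infection_day.items():
            if day_src < day_dst and score(src, dst) >= threshold:
                edges.append((src, dst))
    return sorted(edges)


def most_likely_tree(infection_day, score):
    """Keep, for every blog, only its single highest-scoring earlier source."""
    parent = {}
    for blog, day in infection_day.items():
        candidates = [other for other, d in infection_day.items() if d < day]
        if candidates:
            parent[blog] = max(candidates, key=lambda src: score(src, blog))
        else:
            parent[blog] = None  # earliest blog(s): no known source
    return parent


# Invented infection days and classifier scores.
days = {"a": 1, "b": 2, "c": 3}
scores = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.7}


def lookup(src, dst):
    return scores.get((src, dst), 0.0)


print(infection_dag(days, lookup, 0.5))
print(most_likely_tree(days, lookup))
```

The DAG keeps every candidate route above the threshold, while the tree commits to one source per blog, which is easier to draw but hides uncertainty.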

Epidemic Propagation Model by Gruhl et al.
  • Topics
  • Individuals

Topics
  • Topic = Chatter + Spike + (Resonance)


There is a community of bloggers interested in any topic that appears in postings. On any given day, some of the bloggers express new thoughts on the topic, or react to topical postings by other bloggers. This constitutes the chatter on that topic.

Occasionally, an event occurring in the real world induces a reaction from bloggers, and we see a spike in the number of postings on a topic. Spikes do not typically propagate through blogspace, in the sense that bloggers typically learn about spikes not from other blogs, but instead from a broad range of channels including mainstream media. Thus, we can assume all informed authors are aware of the topical event and have an opportunity to write about it.

On rare occasions, the chatter reaches resonance, i.e., someone makes a posting to which everyone reacts sharply, thereby causing a spike. The main characteristic of resonance is that a spike arises from either no external input or a very small external input. The formation of order (a spike) out of chaos (chatter) has been observed in a variety of situations [26], though observation of our data reveals that this happens very rarely in blogspace. In fact, the only sustained block re-posting meme that we observed in our data consisted of the 'aoccdrnig to rscheearch at an elingsh uinervtisy it deosnt mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer is at the rghit pclae' story, which came out of nowhere, spiked and died in about 2 weeks (with most postings over a four-day period).
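One simple way to separate chatter from spikes in a daily post count series is to flag days that rise far above the usual level. The rule used here (mean plus `k` standard deviations) is an illustrative assumption, not the definition used by Gruhl et al., and the counts are invented.

```python
from statistics import mean, pstdev


def find_spike_days(daily_posts, k=2.0):
    """Return (threshold, indices of days whose count exceeds it).

    threshold = mean + k * standard deviation; k is a hypothetical tuning knob.
    Days at or below the threshold are treated as ordinary chatter.
    """
    threshold = mean(daily_posts) + k * pstdev(daily_posts)
    spikes = [day for day, count in enumerate(daily_posts) if count > threshold]
    return threshold, spikes


# Invented series: steady chatter with one burst on day 4.
counts = [3, 4, 2, 3, 40, 5, 3, 4, 2, 3]
threshold, spikes = find_spike_days(counts)
print(threshold, spikes)
```

Telling a resonance spike from an externally driven one needs more than counts, for example checking whether the burst traces back to a single posting, so that distinction is left out of this sketch.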


aoccdrnig to rscheearch at an elingsh uinervtisy it deosnt mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer is at the rghit pclae


Individuals
  • Power-law Characteristic for Individuals
  • Different Posting Behaviors for Individuals
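A power-law characteristic can be checked by estimating the exponent from per-individual counts. The sketch below uses the standard continuous maximum-likelihood estimate, $\alpha = 1 + n / \sum_i \ln(x_i / x_{\min})$. Applying it to post counts, the choice of `x_min`, and the sample data are all assumptions; the slide does not state which quantity follows the power law or how it was fitted.

```python
from math import log


def power_law_exponent(values, x_min=1.0):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only values >= x_min are used; x_min is a hypothetical cutoff.
    """
    tail = [v for v in values if v >= x_min]
    log_sum = sum(log(v / x_min) for v in tail)
    if log_sum == 0:
        raise ValueError("need at least one value above x_min")
    return 1.0 + len(tail) / log_sum


# Invented posts-per-blogger counts: many light posters, a few heavy ones.
posts_per_blogger = [1, 1, 1, 2, 2, 3, 5, 8, 20, 75]
print(power_law_exponent(posts_per_blogger))
```

Because counts are discrete, this continuous estimate is only approximate for small values, so it is best read as a rough indicator.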