Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State...

42
Arizona State University Data Mining and Machine Learning Lab Searching for Credible Informa=on via Social Media Mining Searching for Credible Informa3on via Social Media Mining Huan Liu Data Mining and Machine Learning Lab Arizona State University

Transcript of Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State...

Page 1: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=onviaSocialMediaMining

SearchingforCredibleInforma3onviaSocialMediaMining

HuanLiu

DataMiningandMachineLearningLabArizonaStateUniversity

Page 2: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ThankstoFormerandCurrentPhDStudentsofDMML

•  RezaZafarni,AsstProf,SyracuseU•  XiaHu,AsstProf,TexasA&MU•  MagdielGalan,Intel•  ShamanthKumar,CastlightHealth•  PritamGundecha,IBMResAlmaden•  JiliangTang,AsstProf,MSU•  HuijiGao,LinkedIn•  AliAbbasi,MachineZone•  SalemAlelyani,AsstProf,KingKhalidU•  XufeiWang,LinkedIn•  GeoffreyBarbier,AFRL•  LeiTang,Clari•  ZhengZhao,Google•  Ni3nAgarwal,ChairProf,UALR•  SaiMoturu,PostDoc,MITMediaLab•  LeiYu,AsscProf,BinghamtonU,NY

•  RobertTrevino,AFRL•  YunzhongLiu,LeEco,US•  SomnathShahapurkar,FICO•  FredMorstaXer•  IsaacJones•  SuhasRanganath•  SuhangWang•  TahoraNazer•  JundongLi•  LiangWu•  GhazalehBeigi•  KaiShu•  Jus3nSampson

Page 3: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

False,Misleading,andInaccurateInforma3on

•  Spam•  Fraud•  FakeNews•  Rumor•  UrbanLegend•  Gossip•  Informa3oncanbe:true,false,oruncertain•  BigData:6th`V’EveryoneShouldKnowAbout

– Vulnerability–  Socialmediahasall6V’s

3

Disinforma*on(purposeful)

Misinforma*on(uninten*onal)&Disinforma*on

Page 4: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

SpaminSocialMedia

•  Unwantedcontentinforma3ongeneratedbyspammingusersascomments,chat,fakerequeststhatareusedtopromoteproductsorspreadmaliciousinforma3on.

4

–  Fakereviews – Maliciouslinks –  Fakerequests

Page 5: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Fraud(Scam)inSocialMedia

•  Asocialmediafraudisdefraudingand/ortakingadvantageofsocialmediauserswiththeuseofsocialmediaservices.

5

–  Swindlemoney –  Stealpersonalinforma3on

Page 6: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FakeNewsWebsitesandSocialMedia

•  Fakenewswebsitesdeliberatelypublishhoaxes,propaganda,anddisinforma3ontodrivetrafficexacerbatedbysocialmedia

•  Fakenewscanaffectdomes3cpoli3cs,inflamedbysocialmedia,duetolimitedresourcestochecktheveracityofclaims–  Easyto“like”and“share”,buttakingefforttocheck,albeitjustafewclicksaway(effortasymmetry)

•  Fakenews+SocialmediaCyberwarfare

6

Page 7: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FakeNewsIsRampantinSocialMedia

•  Fakenewsspreadsonsocialmedia–  Spreadsrapidly

–  Evolvesfast

7

• Crossovertoothernetworks • Modifiedcontent

Page 8: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FakeNewsCanCauseRealHarm

•  Pizzagate:storiesoffakenewsfromRedditleadtorealshoo3ng

•  Afalserumorerased$136billionin10minutes

8

Fake News Onslaught Targets Pizzeria as Nest of Child-Trafficking, New York Times, 2016

Page 9: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Rumors

• Wikipedia:“Atalltaleofexplana3onscircula3ngfrompersontopersonandpertainingtoanobject,event,orissueinpublicconcern”.

•  Rumorscanbetrueorfalse.

9

–  Falserumor

Page 10: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

GossipinSocialMedia

•  Gossipisidlechatandrumoraboutpersonaland/orprivateaffairsofothers.

•  Socialmediaallowsforfaster,alargerscaleof,andmoreconvenientidlechat.

10

–  Celebrity:“ObamasmovingtoAsheville”

–  Friends:People“aremuchmorelikelytogossipwhenastoryunitesafamiliarpersonwithaninteres3ngscenario.“

FamiliaritywithInterestBreedsGossip:Contribu3onsofEmo3on,Expecta3on,andReputa3on, PLoS ONE, 2014

Page 11: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

UrbanLegendinSocialMedia

•  Fic3onalstorieswithmacabreelementsrootedinlocalpopularculture.– Onsocialmedia,itdevelopsfasterandspreadswider

•  Insummary,itisimpera3vetostudycredibilitychecking

11

• UrbanlegendofFengshui

Page 12: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

OnCredibilityChecking

•  Studyingdifferenttypesofcredibilityandtheneedfordifferentdataandinforma3onsourcesincredibilitychecking– Wedon’thavetoreinventwheelsinsocialmediaminingandcan“standontheshoulderofgiants”

– Machinesdifferfromhumansincredibilitychecking

•  AboutCredibilityChecking–  TypesofCredibility(socialsciences,psychology,CS)– AspectsofCredibilityChecking–  ComponentsofCredibilityCheckinginSocialMedia

12

Page 13: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FourTypesofCredibility

•  Presumedcredibility(generalassump3ons)–  “Ourfriendsusuallytelltruth”

•  Reputedcredibility(basedonthirdpar3es’reports)–  Forinstance,pres3giousawardsorofficial3tles

•  Surfacecredibility(simpleinspec3on)–  “Peoplejudgeabookbyitscover”

•  Experiencedcredibility(first-handexperience)–  “Timecantell”(路遥知马力,日久见人心)

•  Anynewtypetoexploreinsocialmedia? 13

Page 14: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

AspectsofCredibilityChecking(CC)

•  CanweturnCCintoaproblemeasierforusersorAMTurks(withoutmuchexper3se)tocheck?

•  IssuesaboutCredibilityCheckingMeasures–  Reputa3onandHistory(3me)– AccuracyandRelevance–  TransparencyandIntegrity(consistency)–  Responsefromindependentsources(consistency)

•  Implica3onorimpactassessment– Noteverypieceoffakenewsisdisastrous–  “Warnornottowarn”:howtobalance?

14

Page 15: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

News/Post

Fake

Yes NoUncertain

•  Recipients

•  Senders

•  Sourceofinforma3on

•  Content

•  Networkcontext•  Crowdsourcing(fact-checkingsites,e.g.,Snopes)•  Groundtruth(mul3faceted,goldstandard)

Exper=se,experienceBackground,occupa=on

Reputa=onLengthofonlinepresenceSocialnetworks

ProvenanceReputa=on,Cura=on/Edi=ngLength

Wri=ngstyleTopicsURLsMul=media

Topicthread(Outlierdetec=on)RetweetsRepliesComments

ComponentsinCredibilityCheckinginSocialMedia

15

Page 16: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

SearchingforCredibleInforma=on

16

CredibleData

Spam

Bots(automa=callygeneratedcontent)

FakeNews

Rumor

•  AUniqueChallenge–  Groundtruth

•  Addi3onalChallenges–  Credibilityverifica3on–  Dynamicchange–  Timeliness

•  Alterna3veApproaches–  RumorDetec3on–  SpamDetec3on–  BotDetec3on–  InferringDistrust

Page 17: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

UsingSocialMediaforCredibilityChecking

•  VelocityandVolume–  6,000tweetspersecond,5millionperdayonTwiXer–  55millionstatusand300millionphotosperdayonFB

•  Variety–  Geo-spa3al,textual,pictorial,temporal,socialdimensions–  Crossmodality(e.g.,geotaggedpictures)

•  Veracity–  Truthfulnessandaccuracyofinforma3on

•  Usebigdata,mul3-sourceinfo,andsocialnetworkstocompensateforlackofexper3se(以其之矛还其之盾)

17

Page 18: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

18

Adecentbreakdownofallthingsrealandfakenew

s.hX

p://imgur.com

/7xHaUXf

Page 19: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

RumorDetec=on

•  Rumor:unverifiedandrelevantinforma3onthatcirculatesinthecontextofambiguity.

•  Goal:detec3ngemergingrumorswithminimuminforma3onasearlyaspossible–  Ifinterven3onisnotfeasible,getearlywarningorprepared

•  Challenges:–  Howtoovercomethelackofinforma3oninasingletweet?–  Howtodetectrumorsintheirforma3vestage?

19

Page 20: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

InsufficientInforma=oninaSingleTweet

•  Asingletweetcouldbedamaging,butcontainsliXleinforma3onw/ocontextfordetec3on

•  Treatbatchesoftweetsas“conversa3ons”•  Basedonkeywordsimilari3es•  Basedonreplychains

20

...

1to9tweets 10+tweets

PointofAcceptableAccuracy

•  Aggregateconversa3ons•  Sharedhashtags•  Commonlinks•  Cosinesimilarity

Page 21: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Detec=onofEmergingRumors

•  Emergentdetec3on-linkthefirsttweetinarumorwiththosealreadyposted

•  Standardrumorclassifica3onsarenoteffec3veforsmallconversa3ons–  Lackofnetworkandsta3s3caldata–  Datasparsityissues

•  Implicitlinkingworkseffec3velyfordetec3ngsmallrumorcascades

21

Page 22: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

BotDetec=on

•  Bots–  Innocuous:relayinforma3onfromofficialsources–  Malicious:spreadrumorsandfalseinforma3on

•  Goal:RemovebotsfromsocialmediadatawithhighRecall–  WHY?

•  Challenges–  Acquiringgroundtruth–  IncreasingRecallwithoutsignificantlyreducingPrecision

22

Page 23: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

BotsinSocialMedia

•  BotsonTwiXer:–  TwiXerclaims5%of230Musersarebots.– Onestudyfound20Mbotaccounts=9%**.–  24%ofalltweetsaregeneratedbybots***.

•  5-11%ofFacebookaccountsarefake****.

*hXp://blogs.wsj.com/digits/2014/03/21/new-report-spotlights-twiXers-reten3on-problem/**hXp://www.nbcnews.com/technology/1-10-twiXer-accounts-fake-say-researchers-2D11655362***hXps://sysomos.com/inside-twiXer/most-ac3ve-twiXer-user-data****hXp://thenextweb.com/facebook/2014/02/03/facebook-es3mates-5-5-11-2-accounts-fake/ 23

Page 24: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FindingGroundTruth

•  ThreestatesofaTwiXeruser:–  Ac3ve–  Suspended–  Deleted

•  Idea:–  Usethesestatesas

labels–  Twosnapshotsof

eachuseristaken

24

Suspended

Deleted

Ac3ve

Ini=alCrawl•  Findsseedsetofusers.•  CrawlsProfile,Network,...

StatusonTwiXerasalabelingmechanism

Page 25: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

GroundTruth-Honeypots

•  Actasobviousbotaccounts•  AXractotherbotaccounts•  Botsareiden3fiedwhentheyfollowouraccount•  Assump=on:Realusersdonotfollowbots

25

Page 26: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Honeypots-Logic

•  Post“Luring”Content–  Postcontentthatwillbeseen–  trendingtopics,hashtags,

“famous”tweets...•  MaintainNetwork

Connec=ons–  “Followback”,Retweets–  Famebegetsfame

•  PromoteOtherHoneypots–  Retweeteachother’stweets–  Men3oneachother

HoneypotAccounts

ChooseHoneypot,

h

RetweetRandomHoneypot

10%

SampleRandomTweet,t

90% hretweets

t

30%

hcopiest70%

Recordh’snewfriends

Wait10s

Follownew

friends

26

Page 27: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

BoostOR

•  BasedonAdaBoost•  TrytoincreaseRecallwithoutdras3cdecreaseinPrecision

•  Itera3velyupdatetheweightofinstances:–  Unchangedifcorrectlyclassified–  Decreasediffalsenega3ve–  Increasediffalseposi3ve

27

Page 28: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Trust-DistrustPredic=on

•  Goal–  Trustanddistrustrela3onscanplayanimportantroleinhelpingonlineuserscollectreliableinforma3on

–  Findingtrustworthyusersandreliableinforma3onisofsignificantimportance

–  Howtopredicttrustrela3onsbetweenusers?

•  Challenges–  Trustrela3onsareextremelysparse–  Distrustrela3onsareevensparserthantrustones–  Findingsubs*tutefeaturesindica3veoftrustanddistrust

28

Page 29: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

TrustandEmo=ons

•  Accordingtopsychology,user’semo3onscanbestrongindicatorsoftrustanddistrustrela3ons

•  Emo3onalinforma3onismoreavailablethanthatoftrust/distrust

•  Thereexistsacorrela3onbetweenemo3onsandtrust/distrustrela3ons

29

Page 30: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

ModelingEmo=onalInforma=on

•  Userswithposi3ve(nega3ve)emo3onsaremorelikelytoestablishtrust(distrust)rela3ons

•  Userswithhighposi3ve(nega3ve)emo3onstrengthsaremorelikelytoestablishtrust(distrust)

•  TheEmo3onalTrustDistrustframeworkETD–  Low-rankmatrixfactoriza3on

–  Emo3onalinforma3onregulariza3on

30

Page 31: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

StudyingBiasinSocialMediaData

•  TwiXersharesitsdata–  “Firehose”feed-100%-costly–  “StreamingAPI”feed-1%-free

• Weusuallyobtaindataviasampling–  IsthesampleddatafromtheStreamingAPIrepresenta3veofthetrueac3vityonTwiXer’sFirehose?

•  Challenges– Howtodetermineifthesampleisbiasedwhenwedonothaveaccesstothewholedata?

– Howtoobtainanunbiasedsample?

31

Page 32: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Twicer’sStreamingAPIvs.Firehose

•  DatafromFirehoseandStreamingAPIhasbeencollectedforspecificperiodof3metoperformanalysis

•  Morethan90%ofallgeotaggedtweetsareavailableviaStreamingAPIandthereisnotsignificantdifferenceinloca3ondistribu3on

•  Basedonin-degreecentralityandbetweennesscentralityinuser-userretweetnetworks,theStreamingAPIfinds~50%ofthekeyusers

32

Page 33: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Mi=ga=ngBiasinTwicer’sStreamingAPI

CanwefindbiaswithouttheFirehose?

Es3ma3ngBiasfromStreamingAPI:–  ObtaintrendofhashtagfromSampleAPIandStreamingAPI

–  BootstrapSampleAPItoobtainconfidenceintervals

–  MarkregionswhereStreamingAPIisoutsideofconfidenceintervals

Mi3ga3ngBias:–  Leveragemul3plecrawlerstomaximizedataforeachquery

–  RoundRobinSpliyng

33

Page 34: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Time-Cri=calInforma=oninCrisisResponse

• Socialmediaisusedtorequestforimmediateassistanceduringcrisis

• Time-cri3calpostsdemandimmediateaXen3on• Addressingthesequeriespromptlycanhelpinemergencyresponse

• Howcanthesepostsbedis3nguishedfromothers?

• WhatIsRequiredinFindingTime-Cri*calResponses?– Userswithexper3seorknowledge– Fastresponse– Relevantanswers

34

Page 35: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

FindingTime-Cri=calResponses

• Manyques3onsaskedduringcrisisshouldbeimmediatelyaXended

• Manyrespondersarebusy• Howcanwefindapromptresponderwhocanprovidearelevantanswer?

• ChallengesofIden3fyingPromptResponders–  Howdowees3matethereply*meofuserstoiden3fypromptresponders?

–  Timelinessandrelevance:howdoweintegrate3melinesswithrelevancetorankcandidateresponders?

35

Page 36: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Informa=onSeekinginSocialMedia

• Socialmediaisusedtorequestforhelpduringcrisis

• Addressingthesequeriespromptlycanhelpinemergencyresponse

36

Page 37: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

Iden=fyingCandidateResponders

•  Timeliness–  Theusercanrespondmorequicklyifsheisavailablesoonazertheques3onisposted.Itcanbees3matedusingthepreviouspos3ng3mes

–  Auserrespondstoques3onsfasterifshehasrepliedpromptlytosimilarques3onsinthepast

•  Relevance–  Userswhosepreviouscontentissimilartotheques3onhavehigherrelevanceandtheirresponseismorelikelytobearelevantanswer

•  Timelinessandrelevanceareintegratedbycombiningtherankingscores

37

Page 38: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

SearchingforCredibleInforma=on

38

CredibleData

Spam

Bots(automa=callygeneratedcontent)

FakeNews

Rumor

•  AUniqueChallenge–  Groundtruth

•  Addi3onalChallenges–  Credibilityverifica3on–  Dynamicchange–  Timeliness

•  Alterna3veApproaches–  RumorDetec3on–  SpamDetec3on–  BotDetec3on–  InferringDistrust

以其之矛还其之盾

Page 39: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

ThankYouAll

•  ProfessorYang’skindinvita3onandwarmhospitality•  FundingsupportfromONR,NSF,ARO,amongothers•  DMMLLabformerandcurrentmembers,andLiangWuforhelpingwiththeprepara3onofthispresenta3on

Searchfor“HuanLiu”formoreinforma3onaboutDMML

HLiu,FMorstaXer,JTang,andRZafarani.``Thegood,thebad,andtheugly:uncoveringnovelresearchopportuni=esinsocialmediamining",inTrendsofDataScience,Interna3onalJournalonDataScienceandAnaly3cs,SpringerInterna3onalPublishingSwitzerland.September,2016.DOI10.1007/s41060-016-0023-0

39

Page 40: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

40

•  scikit-feature–anopensourcefeatureselec3onrepositoryinPython

•  SocialCompu3ngRepository

RepositoriesandRecentBooks

Page 41: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

41hcp://dmml.asu.edu/smm/

Page 42: Searching for Credible Informaon via Social Media Mininghuanliu/papers/BJUT12222016.pdfArizona State University Data Mining and Machine Learning Lab Searching for Credible Informaon

ArizonaStateUniversityDataMiningandMachineLearningLab SearchingforCredibleInforma=on BJUT2016

References

1.  [BeigiSDM’16]GhazalehBeigi,JiliangTang,SuhangWang,andHuanLiu.“Exploi3ngEmo3onalInforma3onforTrust/DistrustPredic3on”.SIAMInterna3onalConferenceonDataMining(SDM16),May5-7,2016.Miami,Florida.

2.  [MorstaXerASONAM’16]FredMorstaXer,LiangWu,TahoraH.Nazer,KathleenM.Carley,andHuanLiu.“ANewApproachtoBotDetec3on:StrikingtheBalanceBetweenPrecisionandRecall”,IEEE/ACMInterna3onalConferenceonAdvancesinSocialNetworkAnalysisandMining(ASONAM2016),August18-21,SanFrancisco,CA.

3.  [MorstaXerWWW’14]FredMorstaXer,JürgenPfeffer,HuanLiu.WhenisitBiased?AssessingtheRepresenta3venessofTwiXer'sStreamingAPI”,WWWWebScience2014.

4.  [MorstaXerICWSM’13]FredMorstaXer,JürgenPfeffer,HuanLiu,KathleenMCarley.IstheSampleGoodEnough?ComparingDatafromTwiXer'sStreamingAPIwithTwiXer'sFirehose”,ICWSM2013.

5.  [SampsonCIKM’16]Jus3nSampson,FredMorstaXer,LiangWuandHuanLiu.“LeveragingtheImplicitStructurewithinSocialMediaforEmergentRumorDetec3on",shortpaper,ACMInterna3onalConferenceofInforma3onandKnowledgeManagement(CIKM2016),October24-28,2016.Indianapolis,Indiana.

6.  [SampsonICDM’15]Jus3nSampson,FredMorstaXer,RezaZafarani,andHuanLiu.“Real-TimeCrisisMappingUsingLanguageDistribu3on”.Demo.InProceedingsofIEEEInterna3onalConferenceonDataMining(ICDM2015),November14-17,2015.Atlan3cCity,NJ.

42