Internet Application Censorship: Studies of Weibo in China and … · 2020-01-17 · User timeline:...

Internet Application Censorship: Studies of Weibo in China and Twitter in Turkey Dan S. Wallach, Rice University

Page 1:

Internet Application Censorship: Studies of Weibo in China and Twitter in Turkey Dan S. Wallach, Rice University

Page 2:

Today, two short studies

Weibo: at one point China’s largest social network.
Twitter: broadly used worldwide, particularly popular in the Middle East.
We’re looking at web applications, not at networks or the “Great Firewall of China”.

Page 3:

The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions

Tao Zhu, Independent Researcher
David Phipps, Bowdoin College
Adam Pridgen, Rice University
Jedidiah R. Crandall, University of New Mexico
Dan S. Wallach, Rice University

Page 4:

Microblogging sites in China

[Figure: timeline of Chinese microblogging sites, marked March 2006, July 2009, and August 2009.]

http://en.wikipedia.org/wiki/Microblogging_in_China

Page 5:

Sina Weibo

● 503 million registered users as of Dec 2012.
   o More than half are on mobile devices.
● About 100 million messages are posted each day on Sina Weibo.
● Promotes visibility of social issues.

http://en.wikipedia.org/wiki/Sina_Weibo

Page 6:

Weibo’s influence: Wukan incident - 2011

(The village name) vs (Neologism)

Page 7:

Sina Weibo

● Strict controls over the posts.

Page 8:

Introduction of our research

● Detecting a censorship event within 1-2 minutes of its occurrence.
● Three strategies the Weibo system uses to target sensitive content quickly.
● Performing a topical analysis of the deleted posts.

Page 9:

Methodology

1. Identifying the sensitive user group
2. Crawling posts of the sensitive user group
3. Detecting deletions

Page 12:

Identifying the sensitive user group

● Use outdated sensitive keywords from China Digital Times.
● Start with 25 sensitive users.
● Sensitive group reaches 3,567 users after 15 days.
● More than 4,500 deletions daily.

Page 13:

Crawling

● User timeline:
   o Weibo user timeline API returns the most recent 50 posts of the specified user.
   o Query 3,567 sensitive users once per minute.
      §  100 accounts for API calls.
      §  300 concurrent Tor circuits.
   o Four-node cluster running Hadoop and HBase.
      §  2.38 million posts from July 20 to September 8, 2012.
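The crawl figures on this slide imply a steady request load that can be worked out directly. The sketch below is only back-of-the-envelope arithmetic: it assumes one timeline call per user per minute and an even spread of calls across accounts and circuits, neither of which the slide states. The variable names are illustrative.

```python
# Figures taken from the slide.
USERS = 3567            # sensitive users polled
API_ACCOUNTS = 100      # accounts used for API calls
TOR_CIRCUITS = 300      # concurrent Tor circuits

# Assumption: exactly one timeline call per user per minute.
calls_per_minute = USERS * 1

# Assumption: load is spread evenly over accounts and circuits.
calls_per_account_per_hour = calls_per_minute * 60 / API_ACCOUNTS
calls_per_circuit_per_minute = calls_per_minute / TOR_CIRCUITS

print(f"{calls_per_minute} timeline calls per minute")
print(f"{calls_per_account_per_hour:.1f} calls per account per hour")
print(f"{calls_per_circuit_per_minute:.2f} calls per Tor circuit per minute")
```

Under those assumptions each account makes a little over two thousand calls an hour, which suggests why the load was split across many accounts and circuits.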

Page 14:

Detecting deletions

[Figure: a diff between "Our database" and "Latest 50 posts" yields "Deleted Post".]
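The diff step can be sketched as a set comparison between stored post IDs and the IDs in the latest timeline response. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, and it assumes post IDs increase over time so that posts older than the returned window are not mistaken for deletions.

```python
def find_missing_posts(stored_ids, latest_ids):
    """Return stored post IDs that fall inside the latest window but are absent.

    Assumption: IDs grow monotonically with posting time, so any stored ID
    newer than the oldest visible ID should still appear in the response.
    """
    latest = set(latest_ids)
    if not latest:
        # An empty response is ambiguous (error, empty account), so flag nothing.
        return set()
    oldest_visible = min(latest)
    return {pid for pid in stored_ids
            if pid > oldest_visible and pid not in latest}


# Post 4 is inside the visible window (3..5) but missing; posts 1 and 2
# are simply older than the window and are not flagged.
missing = find_missing_posts({1, 2, 3, 4, 5}, [3, 5])
print(missing)
```

A post flagged this way is only a candidate; the error returned when fetching it individually is what separates the deletion types.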

Page 15:

Detecting deletions

[Figure: timeline marked t0, t1, t2, ..., tn.]

The lifetime of a deleted post = tn - t0

Page 16:

Detecting deletions

● Permission-denied or system deletion
   o “Permission denied” error.
   o Caused by censorship events.
   o The post still exists but cannot be accessed by users.
● General deletion
   o “Post does not exist” error.
   o May be caused by user self-deletion or censorship events.
   o The post does not exist.
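The two deletion types can be told apart by the error returned when a missing post is fetched. The sketch below matches on the two error phrases quoted on the slide; the real API's error codes and exact message text are not given here, so the string matching and all names are assumptions for illustration.

```python
from enum import Enum


class Deletion(Enum):
    SYSTEM = "permission-denied (system) deletion"
    GENERAL = "general deletion"
    UNKNOWN = "unknown"


def classify_deletion(error_message: str) -> Deletion:
    """Map an error message to a deletion type (hypothetical matching rule)."""
    text = error_message.lower()
    if "permission denied" in text:
        # Post still exists but is inaccessible: attributed to censorship.
        return Deletion.SYSTEM
    if "does not exist" in text:
        # Post is gone: either self-deletion or censorship.
        return Deletion.GENERAL
    return Deletion.UNKNOWN


print(classify_deletion("Permission denied"))
print(classify_deletion("Post does not exist"))
```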

Page 17:

Detecting deletions

12

Permission-denied deletion 4.5%

General deletion 8.3%

2.38 Million user timeline posts

Page 18:

Detecting deletions

● Permission-denied deletion or system deletion
   §  Around 1,500 permission-denied deletions.
   §  Compare with WeiboScope, which tracks around 300,000 users and sees no more than 100 permission-denied deletions daily.

Page 19:

Distribution of deleted posts

[Figure: two panels, "Whole lifetime" and "First two hours".]

Page 20:

Strategies to target sensitive content

1. Weibo has filtering mechanisms as a proactive, automated defense.
2. Weibo targets specific users, such as those who frequently post sensitive content.
3. When a sensitive post is found, a moderator will use automated searching tools to find all of its related reposts, and delete them all at once.

Page 21:

1. Keywords list filtering

● Weibo has filtering mechanisms as a proactive, automated defense.
   o Explicit filtering

[Screenshot message: "Sorry, the content violates the relevant laws and regulations. If you need help, please contact customer service."]

Page 22:

1. Keywords list filtering

● Weibo has filtering mechanisms as a proactive, automated defense.
   o Explicit filtering
   o Implicit filtering

[Screenshot message: "Your post has been submitted successfully. Currently, there is a delay caused by server data synchronization. Please wait for 1 to 2 minutes. Thank you very much."]

Page 23:

1. Keywords list filtering

● Weibo has filtering mechanisms as a proactive, automated defense.
   o Explicit filtering
   o Implicit filtering
   o Camouflaged posts
   o Surveillance keywords list?
      §  Without such a list, the cost would be too high.

Page 24:

2. Targeting specific users

● Weibo targets specific users, such as those who frequently post sensitive content.

Page 25:

3. Finding all related reposts

● When a sensitive post is found, a moderator can find all of its related reposts, and delete them all at once.

Page 26:

Censors work in the night

Page 27:

Censors catch up in the morning

Page 28:

Conclusion

[Figure: two panels, "Whole lifetime" and "First two hours".]

Page 29:

Known Unknowns: An Analysis of Twitter Censorship in #Turkey

Rima Tanash, Rice University
Zhouhan Chen, Rice University
Tanmay Thakur, University of Houston
Chris Bronk, University of Houston
Devika Subramanian, Rice University
Dan S. Wallach, Rice University

Page 30:

Disclaimer

•  Our research is focused on quantifying the extent of Twitter censorship in a given country.
•  We do not disclose private user information, or share Twitter datasets.

Page 33:

Motivation

•  Social media played a significant role during the recent wave of uprisings in the Middle East (a.k.a. Arab Spring).
   -  Demonstrations were orchestrated via Twitter & Facebook.
   -  Resulted in the overthrow of some Arab dictatorships.

Page 34:

Government censorship

•  Clearly, governments have great interest in controlling social media.

Page 35:

Government censorship in the news

Page 37:

Types of censorship

•  Network level:
   •  Block entire service.
   •  China blocks western social media.
   •  Turkey blocks Twitter, March 2014.
•  Application level:
   •  Service is not blocked; censorship is internal to the service.
   •  Example: Sina Weibo (Zhu et al.): keyword filtering, etc.
•  But what about application censorship in western social media?

Page 38:

Twitter censorship

•  January 2012: Twitter “Country-Withheld Content”.
•  Twitter publishes censorship requests on ChillingEffects.org.
•  Twitter publishes bi-annual transparency reports.

Page 39:

Removal requests

https://transparency.twitter.com/removal-requests/2015/jan-jun

Page 40:

Removal requests: USA

Page 41:

Removal requests: Turkey

Page 42:

This map raises many questions

•  What’s up with Turkey?
•  Why are there 0 requests from Arabic-speaking countries such as Saudi Arabia and the UAE?
   •  They have a reputation for censorship.
   •  Twitter is very popular in these countries.

Page 43:

Are Twitter transparency reports complete?

•  “NOTE: The data in these reports is as accurate as possible, but may not be 100% comprehensive.”
•  Twitter does not post the notices to Chilling Effects when they “… are legally prohibited from doing so”.

Prohibited by whom? And which laws?

Page 44:

Request sample from Chilling Effects

(Redactions by Twitter.)

Page 45:

Why Turkey?

•  It represents the darkest blob on the map.
•  Turkey banned Twitter, in its entirety, in 2014.

Page 46:

Research Questions

•  Can we confirm the number of withheld tweets reported for Turkey in the Transparency Reports?
•  Can we find unreported tweets in Turkey?
•  Can we extract and analyze topics being withheld?

Page 47:

Agenda

•  Methodology
   •  Validating censored tweets
   •  Finding interesting users
   •  Crawling
•  Findings
•  Topic analysis
•  Bypassing censorship
•  Future work

Page 55:

How to validate if tweets are censored in Turkey?

•  View tweets from inside Turkey and check if they are invisible.
   •  Use a free Turkish proxy ✗
   •  Use the PlanetLab network ✗
•  While analyzing a sample of known censored tweets, we observed a special field in the tweet structure, called:

"id_str" : "51707756700291... "in_reply_to_user_id":null, "favorited":false, "withheld_in_countries":["TR"] ...

•  If this field is reliable, then we can crawl from home!

Page 56:

Validating our observation

•  Tor browser bundle with ExitNode {TR}.
•  Result:
   •  We confirmed that all tweets that contained the "withheld_in_countries" field were indeed invisible in Turkey.
   •  Therefore, we can crawl from the USA. ✓
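Checking the field is a simple dictionary lookup once a tweet has been parsed from JSON. The sketch below assumes the tweet is already available as a decoded JSON object; the helper names and the sample tweet are made up for illustration, and only the "withheld_in_countries" field name comes from the slides.

```python
import json


def withheld_countries(tweet: dict) -> list:
    """Return the country codes a tweet is withheld in (empty if the field is absent)."""
    return list(tweet.get("withheld_in_countries") or [])


def is_withheld_in(tweet: dict, country_code: str = "TR") -> bool:
    """True if the tweet carries the withheld field and lists the given country."""
    return country_code in withheld_countries(tweet)


# Made-up sample; real tweets carry many more fields.
sample = json.loads(
    '{"id_str": "1", "favorited": false, "withheld_in_countries": ["TR"]}'
)
print(is_withheld_in(sample))            # withheld in Turkey
print(is_withheld_in({"id_str": "2"}))   # field absent
```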

Page 57:

Crawling

•  Goal: collect censored tweets and examine their content.
•  Free Twitter public API.
•  Start with 689 seed sensitive users:
   •  Source: Chilling Effects website.
   •  Spider out in the social group for new users.
•  API with geo bounding boxes of three major Turkish cities.
•  Crawl users’ historical timelines.

Page 58:

Revisiting collected tweets

•  Methodology: collect now, verify later.
   •  Censorship doesn’t happen immediately.
   •  Inspect if the “withheld_in_countries” field is present.
•  Tradeoff: # users vs. time granularity.
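"Collect now, verify later" can be sketched as a second pass over stored tweet IDs that re-fetches each tweet and inspects the field. The fetch function is passed in as a parameter because the slides do not say which API call was used; `fetch_tweet` and the in-memory store in the example are hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, Optional


def revisit(tweet_ids: Iterable[str],
            fetch_tweet: Callable[[str], Optional[dict]],
            country_code: str = "TR") -> Dict[str, bool]:
    """Re-fetch previously collected tweets and record whether each is now withheld."""
    results = {}
    for tweet_id in tweet_ids:
        tweet = fetch_tweet(tweet_id)
        if tweet is None:
            continue  # tweet no longer retrievable; nothing to inspect
        fields = tweet.get("withheld_in_countries") or []
        results[tweet_id] = country_code in fields
    return results


# Stand-in for a real API client: a dictionary lookup.
store = {
    "1": {"id_str": "1", "withheld_in_countries": ["TR"]},
    "2": {"id_str": "2"},
}
print(revisit(["1", "2", "3"], store.get))
```

Each revisit costs one call per tweet under a fixed rate limit, which is one way to read the tradeoff between the number of users tracked and how often each can be rechecked.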

Page 62:

Results

Overall we collected: 20+ mil tweets
Experiment duration (all methods): 10/2014 - 5/2015
Number of interesting users: 7,642 users
Number of reported withheld tweets by 6/15: 3,981 tweets
Number of reported withheld users by 6/15: 204 users
Number of withheld tweets we collected: 266,407 tweets
Number of withheld users we identified: 46 users
Number of withheld tweets not including those from withheld accounts: 205,451 tweets

Page 63:

Results

Overall we collected: ~20 mil tweets
Time range**: 10/2014 - 5/2015
Interesting users: 7,642
Number of reported withheld tweets by 6/15: 3,981 tweets
Number of reported withheld accounts by 6/15: 204 users
Number of withheld tweets we collected: 266,407 tweets
Number of withheld accounts we identified: 46 users
Number of withheld tweets not including those from withheld accounts: 205,451 tweets

There are at least two orders of magnitude more withheld tweets in Turkey than what Twitter reported.
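The size of the gap can be checked from the table's own numbers. The snippet below only divides the collected counts by the reported count and takes a base-10 logarithm; it takes the slide's figures at face value and says nothing about how Twitter counts a "withheld tweet". Variable names are illustrative.

```python
import math

reported = 3981                   # withheld tweets reported by 6/15
collected = 266407                # withheld tweets collected
collected_excl_accounts = 205451  # excluding tweets from withheld accounts

ratio_all = collected / reported
ratio_excl = collected_excl_accounts / reported

print(f"all withheld tweets: {ratio_all:.1f}x reported, "
      f"10^{math.log10(ratio_all):.2f}")
print(f"excluding withheld accounts: {ratio_excl:.1f}x reported, "
      f"10^{math.log10(ratio_excl):.2f}")
```

On these figures the collected counts come to roughly 67 and 52 times the reported count.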

Page 64:

Two orders of magnitude

•  “NOTE: The data in these reports is as accurate as possible, but may not be 100% comprehensive.”
•  It is indeed neither 100% comprehensive nor accurate.

Page 67:

Deduplication

•  Maybe Twitter reports censored copies as one event?
•  Note: the tweets we collected are unique by ID.
•  We removed duplications: copy/paste & retweets (details in the paper).
•  Reduced the number of tweets to: 88,276.
•  → One order of magnitude higher than what Twitter reported.
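One simple way to collapse copy/paste duplicates and retweets is to normalize the tweet text and keep the first tweet seen for each normalized form. The slides defer the actual procedure to the paper, so the sketch below is an assumed stand-in rather than the authors' method; the "RT @user:" prefix rule, the function names, and the sample tweets are all illustrative.

```python
import re

# Assumed retweet convention: text starting with "RT @username:".
RT_PREFIX = re.compile(r"^rt\s+@\w+:?\s*", re.IGNORECASE)


def normalize(text: str) -> str:
    """Strip a leading retweet marker, collapse whitespace, and lowercase."""
    text = RT_PREFIX.sub("", text.strip())
    return re.sub(r"\s+", " ", text).lower()


def deduplicate(tweets):
    """Keep the first tweet for each distinct normalized text."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = normalize(tweet["text"])
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


tweets = [
    {"id_str": "1", "text": "Hello  world"},
    {"id_str": "2", "text": "RT @bob: hello world"},
    {"id_str": "3", "text": "other"},
]
print([t["id_str"] for t in deduplicate(tweets)])
```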

Page 69:

Topic analysis

•  Topic analysis is valuable for understanding the political aims of the Turkish censors.
•  Term frequency-inverse document frequency (tf-idf): standard machine learning algorithm.
•  We extracted the top 5 topics with 10 words for each topic.
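The tf-idf weighting named above scores a term highly when it is frequent in one document but rare across the collection. The sketch below is a minimal pure-Python version using whitespace tokenization and the plain log(N/df) form of idf; the slides do not say which variant, tokenizer, or topic-grouping step was used, so those choices and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=10):
    """Top-k terms by score, ties broken alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


docs = ["media freedom media", "minister transport", "media minister"]
scores = tf_idf(docs)
print(top_terms(scores[0], k=2))
```

Grouping documents into topics is a separate step that this sketch does not attempt.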

Page 70:

Topic analysis

•  Words: media, dishonest, freedom, anti-Semitic references, minister, etc.
•  We can infer that strongly worded and vulgar political discussions are being targeted by Turkey’s censorship authorities.

Turkish “topic” | English translation
doğan aydın medya şerefsiz grubu medyası vatan hurriyet değil köpeği | Aydin Dogan (a person who owns the biggest media in Turkey) media dishonest group home freedom not dog
cort şekerbank şap şikin hikayesi dikkat chp ibrahim karaca | Not meaningful: Sekerbank (a Turkish financial institution) attention in story Ibrahim Karaca (a Turkish name)
koçun vehbi oğlu ın aydın rahmi doğan yahudi nahum | Vehbi Koç (a Turkish entrepreneur/philanthropist) son enlightened womb rising Jewish
elvan lütfi bakan bakanı ulaştırma tt aptal bi www adam | Lütfi Elvan (a Turkish government minister) stupid man minister transportation
davutoğlu ahmet başbakan lan pic sikeyim yahudi gavat göt vatan | Ahmet Davutoglu (current Turkish prime minister) prime minister man bastard fuck Jewish pimp ass

Page 74:

Bypassing censorship

•  In April 2015, we followed a group of withheld accounts in Turkey, and noticed that some users were still tweeting from inside the country despite their accounts being withheld.
•  How?
   •  VPN & Tor.

Page 77:

Users connecting to Tor in Turkey

1.  Tor metric: number of users connecting to Tor in Turkey daily.
2.  Distribution of withheld tweets by month (log scale).

Peaks:
1.  5/2013: Taksim Square protest.
2.  3/2014: Twitter ban in Turkey.
3.  11/2014 & 12/2014: peak in “withheld content” censorship.

Page 78:

Bypassing censorship

•  Turns out, it is even easier!
•  Change the “Location” setting in the Twitter application from “Turkey” to “USA”, for example.
•  We expect many users to start using this method in the future.

Page 79:

What if Twitter becomes vigilant?

•  Users will revert to VPNs or Tor.

Page 80:

Country-level censorship is hard

•  If “country withholding” mechanisms don’t work, countries will demand global Twitter censorship.

Page 81:

Future work

•  Measuring censorship in other countries.
•  Twitter imposes restrictive rate limits.
•  Political science collaboration ongoing.

Page 82:

Conclusion

1.  We provided methods to find unpublished censored tweets.
2.  We showed that the size of censorship in Turkey is at least two orders of magnitude larger than what Twitter reported.
3.  We introduced a new simple method to bypass Twitter censorship by changing the location setting.
4.  We extracted censored topics using a machine learning clustering algorithm and found that most of the censored topics are political.

Page 83:

FAQs:

http://www.cs.rice.edu/~rst5/twitterTurkey/