Chipping Away at Censorship with User-Generated Content Sam Burnett, Nick Feamster and Santosh...

Chipping Away at Censorship with User-Generated Content

Sam Burnett, Nick Feamster and Santosh Vempala

Internet Censorship is a Problem

• 12 censors
• 11 monitors
• 25% of population
• More on the way

See http://rsf.org for more

It’s Not Only China…at Home, Too

Intro to Internet Censorship

[Diagram: Alice (censored net), firewall, Bob (uncensored net). The censor can block traffic or punish the user.]

Solution: Use a Helper

The helper sends messages to and from blocked hosts on your behalf.

[Diagram: Alice (censored net), firewall, helper, Bob (uncensored net).]

Design Goals for the Helper

• Be robust against blocking
• Be deniable against user identification
• Require no dedicated infrastructure

What about Proxies and Mixnets? (e.g., Tor)

[Diagram: Alice (censored net), firewall, two proxies, Bob (uncensored net).]

• Censors can block proxies if the proxy list is public
• Not deniable if encryption is incriminating
• Requires dedicated infrastructure (network of proxies)

What About Covert Channels? (e.g., Infranet)

• Not entirely robust against blocking
• More deniable because messages are hidden
• Requires dedicated infrastructure (Web servers)

[Diagram: Alice (censored net), firewall, unblocked host usenix.org, Bob (uncensored net).]

Collage: Let User-Generated Content Help Defeat Censorship

• Robust by using redundancy
• Users generate innocuous-looking traffic
• No dedicated infrastructure required

[Diagram: Alice; user-generated content hosts; Bob, a Flickr user.]

Why Might Collage Work?

• Lots of User-Generated Content (UGC)
– More than 4 billion Flickr images
– A day of video uploaded to YouTube every minute
• Many sites host UGC
• We have tools to store censored data in UGC
– Steganography, watermarking

Outline

• Background and Design Goals
• Collage Design
• Performance and Demo

Collage, Step-by-Step

Step 1: Obtain message
Step 2: Pick message identifier
• Application specific
• Only intended recipient should know it
Step 3: Obtain cover media
• Your personal photos
• Generous users
Step 4: Embed message in cover
• Next slide
Step 5: Upload UGC to content host
Step 6: Find and download UGC
Step 7: Decode message from UGC

[Diagram: Bob, content host, Alice; a message and a vector combine into an embedded vector.]

Embedding Messages in Vectors

• Encrypt the message using the identifier
• Generate chunks using erasure coding
– Generate many chunks, recover from any k-subset
– Allows splitting among many vectors, robustness
• Embed chunks into vectors
– Steganography: hard to detect
– Watermarking: hard to remove

Do the reverse to decode
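
The "generate many chunks, recover from any k-subset" property can be illustrated with a minimal sketch. This is not Collage's actual coder: it assumes a toy Reed-Solomon-style code over the prime field GF(257) built from Lagrange interpolation, and the names `encode`, `decode`, `k`, and `n` are hypothetical. Encryption and embedding into vectors are left out.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi)] over GF(P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(message: bytes, k: int, n: int):
    """Return n chunks (index, symbols); any k of them recover the message."""
    if not 1 <= k <= n <= P:
        raise ValueError("need 1 <= k <= n <= 257")
    padded = message + bytes(-len(message) % k)  # zero-pad to a multiple of k
    # Each row of k bytes defines a polynomial of degree < k through (0..k-1, bytes).
    rows = [list(enumerate(padded[i:i + k])) for i in range(0, len(padded), k)]
    # Chunk x holds every row's polynomial evaluated at x.
    return [(x, [_interpolate(row, x) for row in rows]) for x in range(n)]


def decode(chunks, k: int, length: int) -> bytes:
    """Rebuild the message from any k distinct chunks and its original length."""
    chosen = chunks[:k]
    if len(chosen) < k:
        raise ValueError("not enough chunks")
    out = bytearray()
    for r in range(len(chosen[0][1])):
        points = [(x, symbols[r]) for x, symbols in chosen]
        # Re-evaluate the row polynomial at 0..k-1 to get the original bytes back.
        out.extend(_interpolate(points, i) for i in range(k))
    return bytes(out[:length])
```

For example, `encode(msg, k=3, n=7)` yields seven chunks that could be spread over seven vectors, and `decode` succeeds with any three of them, so losing up to four vectors to blocking or deletion does not lose the message.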

Where Are Bob’s Vectors?

• Crawling all of Flickr is not an option
• Must agree on a subset of a content host without any immediate communication

Solution: A predictable way of mapping message identifiers to subsets of content hosts

Solution: Task Mapping

[Diagram: the message identifier http://nytimes.com is mapped to tasks such as “Look at JohnDoe’s videos on YouTube” and “Search for blue flowers on Flickr”.]

• Receivers perform these tasks to get vectors
• Senders publish vectors so that when receivers perform tasks, they get the sender’s vectors

1. Hash the identifier
2. Hash the tasks
3. Map identifier to closest tasks
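
The three mapping steps above can be sketched as follows. The talk only says "hash" and "closest", so the details here are assumptions: SHA-256 as the hash, a ring of size 2^256, and clockwise distance as the closeness measure. The function name `map_identifier_to_tasks` is hypothetical.

```python
import hashlib

RING = 2 ** 256  # assumption: positions live on a ring the size of the SHA-256 output


def _position(text: str) -> int:
    """Hash a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def map_identifier_to_tasks(identifier: str, tasks, count: int):
    """Return the `count` tasks whose hashes follow the identifier's hash most closely."""
    start = _position(identifier)

    def clockwise_distance(task: str) -> int:
        return (_position(task) - start) % RING

    return sorted(tasks, key=clockwise_distance)[:count]
```

Because each task's position depends only on its own hash, a sender and a receiver who each run `map_identifier_to_tasks("http://nytimes.com", tasks, 2)` pick the same tasks without talking to each other, and adding or removing an unrelated task in one party's list leaves the selection unchanged unless that task falls among the closest ones.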

How Does Collage Meet the Design Goals?

• Robust against blocking
– Erasure coding
– Many content hosts
• Deniable against user identification
– Traffic only to/from content hosts
– Depends upon task construction
• Require no dedicated infrastructure
– Messages stored on content hosts

How Do You Start Using Collage?

Send & Receive Messages
1. Distribute software
– CDROM
– Spam everyone
– A secure network
2. Refresh task list
– Receive using Collage
– Online resource
3. Message identifier
– Application specific

Help Censored Users
1. Donate your UGC vectors
– Photos on Flickr
– Tweets on Twitter
– Etc.
2. Write Collage applications

Outline

• Background and Design Goals
• Collage Design
• Performance and Demo

Performance Metrics

• Sender and receiver traffic overhead
• Sender and receiver transfer time
• Storage required on content hosts

But these metrics can vary a lot:
• Different content hosts
• Different tasks
• Different applications

Case Study

                   News Articles   Covert Tweets
Content host       Flickr          Twitter
Message size       30 KB           140 bytes
Vectors needed     5               30
Storage needed     600 KB          4 KB
Sending traffic    1,200 KB        1,100 KB
Sending time       5 minutes       60 minutes
Receiving traffic  6,000 KB        600 KB
Receiving time     2 minutes       ½ minute

Experiments performed on a 768/128 Kbps DSL connection
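
The case-study figures can be turned into overhead factors with a little arithmetic. The sketch below uses only the numbers reported on the slide. It assumes 1 KB = 1000 bytes, 1 Kbps = 1000 bits per second, and that "768/128" means 768 Kbps down and 128 Kbps up. The names `CASES`, `overhead`, and `min_transfer_seconds` are hypothetical.

```python
KB = 1000  # assumption: 1 KB = 1000 bytes

# All sizes in bytes, taken from the case-study slide.
CASES = {
    "News Articles": {"message": 30 * KB, "storage": 600 * KB,
                      "sent": 1200 * KB, "received": 6000 * KB},
    "Covert Tweets": {"message": 140, "storage": 4 * KB,
                      "sent": 1100 * KB, "received": 600 * KB},
}


def overhead(case: str, field: str) -> float:
    """Bytes of `field` spent per byte of message."""
    c = CASES[case]
    return c[field] / c["message"]


def min_transfer_seconds(num_bytes: int, link_kbps: int) -> float:
    """Lower bound on transfer time: payload bits divided by the link rate."""
    return num_bytes * 8 / (link_kbps * 1000)


for name in CASES:
    factors = {f: round(overhead(name, f), 1) for f in ("storage", "sent", "received")}
    print(name, factors)
```

Under these assumptions a news article costs 20 times its size in storage, 40 times in sending traffic, and 200 times in receiving traffic. The raw link time is at least 75 seconds to send 1,200 KB at 128 Kbps and 62.5 seconds to receive 6,000 KB at 768 Kbps, so the reported 5 and 2 minutes sit above those lower bounds.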

Demo of a Collage Application

What Should You Do Now?

• Try out the demo application
• Donate your photos
– For now, only Flickr Pro users
– Embed news articles when you upload photos

Visit http://gtnoise.net/collage

Conclusion

• Collage evades Internet censorship by tunneling messages inside user-generated content
– Robust against blocking
– Deniable against user identification
– Requires no dedicated infrastructure
• More work needed
– Statistical deniability against traffic analysis
– Learn timing behavior from users
– Tor bridge discovery

http://gtnoise.net/collage

Yes, Steganography is Broken

• Doesn’t mean it isn’t still practical
• Collage can use new information hiding techniques as they come along
– Not dependent upon a particular technology
• Use many hiding techniques in parallel

Can Bob Be Behind the Firewall?

Comparison to Other Systems

• Onion routing: Tor
– Robustness: Layered encryption; bridges
– Deniability: Encryption
– Infrastructure: Network of relay servers
• Anonymous remailing: Mixminion
– Robustness: Many hops; mixing
– Deniability: Encryption
– Infrastructure: Network of remailers
• Covert channels: Infranet
– Robustness: Could use many proxy forwarders
– Deniability: Browsing regular Web sites; steganography
– Infrastructure: Custom Web servers
• Covert channels: CovertFS
– Robustness: Multiple content hosts
– Deniability: Browsing regular Web sites; steganography
– Infrastructure: None
• Covert channels: Collage
– Robustness: Erasure coding; many content hosts
– Deniability: Browsing content hosts; info hiding
– Infrastructure: None

“A Web based Covert File System”, HotOS ’07

• Stores files inside of Flickr photos
• Intended for personal file storage
• Collage is more robust
– Erasure coding
– Automatically spread data among content hosts
• Could be implemented using Collage
• Implementation not publicly available

Where Could We Store Content?

• Photo sharing sites (Wikipedia lists 71)
• Video sharing sites (Wikipedia lists 68)
• Web forums, blogs
– Plugins for popular forum and blogging software

What If Senders and Receivers Disagree on Task Lists?

• Consistent hashing ensures some agreement
• Depends on number of tasks per identifier:
– If identifier mapped to 2 tasks, can still communicate with 25% disagreement
– If identifier mapped to 4 tasks, can still communicate with 50% disagreement

Agreeing on Message Identifiers

• Goal: Agree without communicating
• For news reader application:
– Public knowledge, hopefully censor doesn’t block
• Another option:
– Exchange keys with communicators beforehand
– Identifier is, e.g., ciphertext of current date
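
The pre-shared-key option can be sketched as below. The slide suggests the ciphertext of the current date; since the Python standard library has no cipher, this sketch swaps in HMAC-SHA256 as the keyed function, which is an assumption and not what the slide specifies. The function name `daily_identifier` is hypothetical.

```python
import datetime
import hashlib
import hmac


def daily_identifier(shared_key: bytes, day: datetime.date) -> str:
    """Derive the day's message identifier from a key exchanged beforehand.

    HMAC-SHA256 stands in for "ciphertext of current date": both are keyed,
    so only holders of the key can compute the identifier.
    """
    date_text = day.isoformat().encode("ascii")  # e.g. b"2010-08-11"
    return hmac.new(shared_key, date_text, hashlib.sha256).hexdigest()
```

Sender and receiver each call `daily_identifier(key, datetime.date.today())` and arrive at the same identifier with no further communication, while the identifier changes every day and is not predictable without the key.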

What Affects Deniability?

• Task construction:
– The content host
– Popularity
– The sequence of Web requests made
– Injection of think times
– Injection of garbage requests
– Alternative routes to the same vectors
• Frequency of task execution
• Are you uploading suspicious vectors?
– E.g., tagging pictures with bogus terms

Censorship Is an Arms Race

• Popular systems will eventually be blocked
• We must continually create systems that are harder to compromise