Sneakernet: Clouds with Mobility (Transcript)
Kenneth Church (Johns Hopkins), James Hamilton (Amazon)
Clouds with Mobility
• Standard view of Clouds
– Big datacenters with no mobility
– But mobility is a big opportunity
• Moore's Law: everything is getting better
– But at different rates
• Mobility Gap (18x per decade): moveable media (flash/disk) >> wires
• Sneakernet alternatives to WANs
– Media on the move in:
• Shipping containers
• Car trunks
• Laptops
• Cell phones
WANs v. SneakerNet: Throughput, Cost, Latency, Convenience
• For tiny payloads (MBs) – Wires (WANs)
• For modest payloads (GBs/TBs) – Post Office (Jim Gray; http://aws.amazon.com/importexport)
• For serious payloads (PBs) – Man with a shipping container (3 PBs/day)
– Like a man with a van
– Benchmarks:
• 5.3 containers ≈ 2008 AT&T backbone (16 PBs/day)
• ½ container (3 PBs) is big enough for the Internet Archive
(1 man can drive the Internet a day's drive in a day)
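The container benchmarks above are easy to check. A sketch, assuming decimal units (1 PB = 10^15 bytes) and a 24-hour driving day; the helper name is ours:

```python
# Back-of-the-envelope check on the "man with a shipping container" benchmark.
def sneakernet_bps(payload_bytes, seconds):
    """Effective throughput of physically moving payload_bytes in `seconds`."""
    return payload_bytes * 8 / seconds

container = 3e15          # 3 PBs per container
day = 24 * 3600
tbps = sneakernet_bps(container, day) / 1e12
print(f"Container truck ≈ {tbps:.2f} Tbps")                    # ≈ 0.28 Tbps, i.e. ~¼ Tbps
print(f"Containers ≈ 2008 AT&T backbone: {16 / 3:.1f}")        # 16 PBs/day ÷ 3 PBs ≈ 5.3
```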
The Mobility Gap: Moveable Media >> Wires
1024x/57x ≈ 18x per decade

Law            Resource                    Growth per Year  Growth per Decade
Nielsen's Law  Internet bandwidth to home  1.5x             57x
Moore's Law    CPU                         1.6x             100x
               Internet backbone           1.7x             256x
Kryder's Law   Disk capacity               2.0x             1024x
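The per-decade column follows from compounding the per-year rate (the slide rounds to round numbers), and the 18x Mobility Gap is the ratio of the Kryder and Nielsen decade factors. A sketch:

```python
# Derive the per-decade figures from the per-year growth rates in the table.
rates = {"Nielsen's Law (bandwidth to home)": 1.5,
         "Moore's Law (CPU)": 1.6,
         "Kryder's Law (disk capacity)": 2.0}
for law, per_year in rates.items():
    print(f"{law}: {per_year ** 10:.0f}x per decade")

# The Mobility Gap: disk grows 1024x/decade, home bandwidth only ~57x.
gap = 2.0 ** 10 / 1.5 ** 10
print(f"Mobility Gap: {gap:.0f}x per decade")   # ≈ 18x
```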
The Mobility Gap → SneakerNet
• In the limit (time, payload size)
– Eventually (when the gap becomes wide enough)
– For large enough payloads
• SneakerNet >> Wires
– Throughput, Cost, Latency, Convenience
• More obvious: Throughput, Cost
• Less obvious: Latency, Convenience
• Intuitively, wires (speed of light) are fast (and trucks are sloooow)
– But for large payloads (TBs),
• Trucks are faster (and more convenient)
• A truck can move the Internet (3 PBs) a day's drive in a day: Truck ≈ ¼ Tbps >> WAN ≈ 10 Gbps
– Especially for a one-shot ad hoc transfer
• It is quicker and easier to hire a man with a shipping container
• than to provision new network capacity
– Provisioning new network capacity (private lines)
• Not quick (or convenient)
• Usually requires long-term commitments
• Subject to availability (there isn't that much capacity; AT&T transferred just 16 PBs/day in 2008)
The Mobility Gap → SneakerNet (More Consequences)
• Eventually, when disk becomes infinitely more plentiful than networking,
– Do whatever it takes to alleviate network bottlenecks
– Even if doing so consumes vast quantities of disk
• Cache everything everywhere forever
Gray's Legacy: Amazon Web Services
http://aws.amazon.com/importexport
Throughput, Cost, Latency, Convenience

Internet Speed  Time to Transfer 1 TB           When to Consider
                (at 80% Network Utilization)    Sneakernet
T1 (1.5 Mbps)   82 days                         100 GB or more
10 Mbps         13 days                         600 GB or more
T3 (45 Mbps)    3 days                          2 TB or more
100 Mbps        1 to 2 days                     5 TB or more
1000 Mbps       Less than 1 day                 60 TB or more
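The transfer-time column is simple arithmetic. A sketch that reproduces it within rounding, assuming "1 TB" means 2^40 bytes and the classic T1/T3 line rates (1.544 and 44.736 Mbps); the helper name is ours:

```python
# Reproduce the "time to transfer 1 TB" column at 80% network utilization.
def days_to_transfer(payload_bytes, link_bps, utilization=0.8):
    """Days to push payload_bytes over a link running at the given utilization."""
    seconds = payload_bytes * 8 / (link_bps * utilization)
    return seconds / 86400

TB = 2 ** 40  # bytes
for name, bps in [("T1 (1.544 Mbps)", 1.544e6), ("10 Mbps", 10e6),
                  ("T3 (44.736 Mbps)", 44.736e6), ("100 Mbps", 100e6),
                  ("1000 Mbps", 1000e6)]:
    print(f"{name}: {days_to_transfer(TB, bps):.1f} days")
# → 82.4, 12.7, 2.8, 1.3, 0.1 days
```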
Jim Gray's Motivation for SneakerNet
• Gray's Question:
– What is the best way to move a terabyte from place to place?
• The Next Generation Internet (NGI) promised
– Gbps desktop-to-desktop by 2000
• So, if you have NGI,
– then a 1 TB transfer takes 8k seconds (a few hours)
• Unfortunately, most of us are still waiting for NGI
– We still use 1-100 Mbps
• So, it takes us days or months to move 1 TB
– Using the Last Generation Internet (LGI)
• UPDATE: We're still waiting for NGI…
– Worse: with the Mobility Gap, we'll always be waiting
Update to Gray's Question
After 10 years (1000x more disk capacity): 1 TB → 1 PB
• Gray's Question: What is the best way to move a terabyte from place to place?
• Updated Version: What is the best way to move a petabyte from place to place?

Internet Speed  Time to Transfer 1 PB           When to Consider
                (at 80% Network Utilization)    Sneakernet
100 Mbps        1157 days                       5 TB or more
1 Gbps          116 days                        60 TB or more
10 Gbps         12 days                         600 TBs or more
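The same arithmetic reproduces the petabyte rows, assuming decimal units (1 PB = 10^15 bytes) and 80% utilization; the helper name is ours:

```python
# Reproduce the "time to transfer 1 PB" column at 80% network utilization.
def days_to_transfer(payload_bytes, link_bps, utilization=0.8):
    return payload_bytes * 8 / (link_bps * utilization) / 86400

PB = 1e15  # bytes
for name, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    print(f"{name}: {days_to_transfer(PB, bps):.0f} days")
# → 1157, 116, 12 days
```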
• Standard clouds:
– Datacenters tied to physical locations
• Generalized clouds:
– Datacenters + Data on the Move
• Shuttling between datacenters
– Shipping containers
• Shuttling between work and home (as we commute to work)
– Laptops
– Car trunks
– Cell phones (flash in pockets)
Petabytes on the Move
The Cloud of the Future: 4M PBs, Mostly on the Move

                                                    Storage/Unit  Units  Total
Data Centers                                        1k PBs        1k     1M PBs
Shipping Containers (on trucks between datacenters) 3 PBs         1k     3k PBs
Cars                                                10 TBs        100M   1M PBs
Laptops & Netbooks                                  1 TB          1B     1M PBs
Cell phones & Flash                                 100 GBs       10B    1M PBs
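The 4M-PBs headline is the sum of the rows. A sketch, assuming decimal units:

```python
# Recompute the "4M PBs" total from the table rows.
TB, PB = 1e12, 1e15
fleet = {  # name: (storage per unit in bytes, number of units)
    "Data centers":        (1000 * PB, 1e3),
    "Shipping containers": (3 * PB,    1e3),
    "Cars":                (10 * TB,   100e6),
    "Laptops & netbooks":  (1 * TB,    1e9),
    "Cell phones & flash": (100e9,     10e9),
}
total = sum(per_unit * units for per_unit, units in fleet.values())
print(f"Total ≈ {total / PB / 1e6:.1f}M PBs")   # ≈ 4.0M PBs, mostly on the move
```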
Challenges
• Last Inch:
– Easy enough to drive a shipping container from here to there,
• but loading & unloading?
• Can we design a shipping container so
– it can be plugged into a cloud
– as easily as plugging a USB disk into a laptop?
– Ditto for car trunks and cell phones
• Naming Conventions (and File Futures)
– How do we refer to blobs that
• used to be here
• or will be there?
– Anything you can do with a file,
• I would like to do with a file future
• Normally, you can't do much with a file until you have it on your machine
– Exceptions: scheduled cron jobs
– Suggestion: drag & drop file futures
The Last Inch: On-Ramps & Off-Ramps for the Information Super-Highway
• 1 Packet per Day scenarios:
– e.g., daily commute → point-to-point data connection between work and home
• Phones (GBs/day)
– Small payloads (MBs) → Wireless (radio stack, Bluetooth, WiFi)
– Larger payloads → Wired USB (power: can't afford to run WiFi all day)
• Laptops (1 TB/day)
• Disks in Car Trunks (10 TBs/day)
– Power is less of a concern (than with phones):
• Plenty of power to run WiFi all day
– Small payloads (½ TB/day):
• Wireless (8 hours of 144 Mbps WiFi in both garages)
– Larger payloads: Wires (or carry disks)
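The ½ TB/day wireless figure follows from the 144 Mbps WiFi rate. A quick check, assuming decimal units:

```python
# Check the "½ TB/day over WiFi" figure: 8 hours of 144 Mbps in each garage.
hours, mbps = 8, 144
bytes_per_day = mbps * 1e6 / 8 * hours * 3600
print(f"{bytes_per_day / 1e12:.2f} TB/day")   # ≈ 0.52 TB/day, i.e. about ½ TB
```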
Another 1 Packet per Day Scenario
• Ship Shipping Containers (3 PBs/day)
– Man with a shipping container
• (like a man with a van)
– Scenario: Replace expensive batteries & generators with geo-redundancy
• Shuttle containers back and forth between two datacenters that are a day's drive apart
• Driver connects a bunch of jumper cables at the end of his shift (and the data off-loads overnight)
– 1 Tbps cables can transfer a PB in hours (3 PBs per night)
» Patch panel with 100 slots for 10 Gbps
– Chilled water
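The overnight off-load adds up: 100 slots of 10 Gbps is 1 Tbps in aggregate, which moves a petabyte in a couple of hours. A quick check, assuming decimal units:

```python
# Check the overnight off-load via a 100-slot x 10 Gbps patch panel.
slots, slot_gbps = 100, 10
tbps = slots * slot_gbps / 1000                     # 1 Tbps aggregate
hours_per_pb = 1e15 * 8 / (tbps * 1e12) / 3600
print(f"{hours_per_pb:.1f} h per PB, {3 * hours_per_pb:.1f} h for 3 PBs")
# → 2.2 h per PB, 6.7 h for 3 PBs: done well before the morning shift
```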
Wires v. Sneakernet for 1 Packet per Day Scenarios
Throughput, Cost, Latency, Convenience
In the limit, the Mobility Gap will eventually favor Sneakernet for large payloads

Wires                       Throughput               Monthly Rent  Capital
Consumer Broadband to Home  1 Mbps (= 11 GB/day)     $50/month     Negligible
Enterprise WANs             10 Gbps (= 108 TBs/day)  $10k/month    Negligible

Sneakernet           Packet Size  Throughput @ 1 Packet/Day  Monthly Rent  Capital
Shipping Containers  3 PBs        ¼ Tbps                     Negligible    $4M
Cars                 10 TBs       1000 Mbps                  Negligible    $1000
Laptops & Netbooks   1 TB         100 Mbps                   Negligible    $1000
Flash on Phones      100 GBs      10 Mbps                    Negligible    $100
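The sneakernet throughput column is the payload size delivered once per day. A sketch, assuming decimal units; the helper name is ours:

```python
# Effective throughput of each sneakernet "packet" at 1 packet per day.
def mbps_at_one_per_day(payload_bytes):
    return payload_bytes * 8 / 86400 / 1e6

payloads = {"Shipping container (3 PBs)": 3e15,
            "Car trunk (10 TBs)": 10e12,
            "Laptop (1 TB)": 1e12,
            "Phone flash (100 GBs)": 100e9}
for name, size in payloads.items():
    # The slide rounds these to ¼ Tbps, 1000, 100, and 10 Mbps respectively.
    print(f"{name}: {mbps_at_one_per_day(size):,.0f} Mbps")
```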
Naming Conventions & File Futures
• How do we refer to blobs on the move?
• Separate signaling from payloads
• Scenario: Email with large (GBs → PBs) attachments
– Payload (Attachment):
• Don't fail for large attachments
• Rather, fall back to sneakernet (if necessary)
– Signaling → URLs with serial numbers
• Tracking service: where's my package?
• Permalink: a link that I can give to friends and family
• A query to a search service:
– Are there any other copies of this blob that are easier to get to from where I am right now?
– Other blobs like this one?
– Who else is interested in this blob?
– Updates?
• Signaling URLs → File futures:
– Anything you can do with a file,
– I would like to do with a file future
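One way to picture a file future is a handle that carries the signaling (a serial-numbered tracking URL) while the payload may still be on a truck. A purely hypothetical sketch; all names here (FileFuture, tracking_url, deliver) are illustrative, not an existing API:

```python
# Hypothetical "file future": signaling travels by wire, the payload by sneakernet.
class FileFuture:
    def __init__(self, blob_id, tracking_url):
        self.blob_id = blob_id            # serial number in the signaling URL
        self.tracking_url = tracking_url  # "where's my package?" service
        self._local_path = None           # set once the payload lands

    def is_local(self):
        return self._local_path is not None

    def deliver(self, path):
        """Called by the transport (wire or truck) when the blob arrives."""
        self._local_path = path

    def open(self):
        # Normally you can't do much with a file until it is on your machine;
        # a future at least tells you where it is and when to retry.
        if not self.is_local():
            raise BlockingIOError(f"blob {self.blob_id} still in transit; "
                                  f"track it at {self.tracking_url}")
        return open(self._local_path, "rb")
```

Drag & drop could then enqueue such handles, and cron-style jobs could retry open() until delivery.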
From Point-to-Point Scenarios → Haggling
• Build out sneakernet slowly
• Start with Point-to-Point Scenarios
– Transactions:
• e.g., post office, man with a shipping container
– Subscriptions:
• e.g., commute data connection between work & home
• Killer Apps
– Transactions:
• Ad hoc copies, email with large attachments
– Subscriptions:
• Backup, mirrors, geo-redundancy, remote sync, hub & spoke networks, (big) podcasts, CDNs (Content Distribution Networks)
Haggling: http://ica1www.epfl.ch/haggle/
• Build out sneakernet slowly
• Start with Point-to-Point Scenarios
• When take-rates are sufficiently high → Haggling (and more)
– Switching/Hitchhiking:
• Cars exchange packets when parked near one another
– Warehouses (moving companies):
• Aggregate loads to fill packets
– CDNs (Content Distribution Networks):
• Caching, geo-distribution, syncing, compression
– Peer-to-Peer
– Hub & Spokes (Fed-Ex)
Hub and Spokes (Fed-Ex)
• Most routes make connections in major hub cities
• Hub & Spoke Network Buildout Plan:
– Hubs: More network capacity at work
– Spokes: As consumers commute to work,
• They sync their homes to work
• Where there is more network capacity
• Airline Analogy
– Work → Hub Airport
– Daily Commute → Commuter Airline
Conclusions
• Advantages & Disadvantages
– Throughput, Cost, Latency & Convenience
• Large payloads (PBs) → Sneakernet
• Small payloads (MBs) → Wires
• Mobility Gap (disk/flash are getting better faster than wires)
– In the limit, for large enough payloads, Mobility Gap → Sneakernet
• Hybrids: Email with large attachments
– Large attachment → Sneakernet
– Email + Signaling URLs → Wires
• 1 Packet per Day Scenarios
– Daily commute → Data connection between work & home
• Hub & Spoke Network Buildout: Work → Hub Airport; Daily Commute → Commuter Airline
– Backup Power (Batteries & Generators) → Geo-Redundancy
• Man with shipping container shuttles between datacenters
• Research Challenges: Last Inch & Naming Conventions (File Futures)
SneakerNet            Packet Size  Throughput @ 1 Packet/Day
Shipping Containers   3 PBs        ¼ Tbps
Cars                  10 TBs       1000 Mbps
Laptops               1 TB         100 Mbps
Phones                100 GBs      10 Mbps

Wires               Throughput
Enterprise WAN      10 Gbps
Consumer Broadband  1 Mbps