Designing, improving, and installing a collaborative cache for disconnected villages
Sibren Isaacman
Group Talk, Feb 25, 2009
Closing the Digital Divide
• Connectivity problems in rural regions
– Large infrastructure overhead
– High cost/bit and low bandwidth
• $3000/Mbps/month
– Extremely high latency
• 80% of North American adults have access to the internet, compared to 5% of Africans
• Technology for developing regions has been named a “Millennium Development Goal” by the UN
– Increased access to the information on the internet and digital classrooms can change lives
Our solution
• Collaborative caching and predictive prefetching increase usability
– Decreases number of roundtrips
– Reduces miss rates by up to 89%
• Decrease sent bits (often directly related to cost) by as much as 6x
• Latency for average page access time reduced up to 90%
Outline
• Our previous work
– C-LINK
– Simulation
• Current efforts at improvement
• The real deployment
Related work
• DTNs and network connectivity
– DakNet [Pentland, 2004]
– KioskNet [Seth, 2006]
– DTLSR [Demmer, 2007]
– TEK [Thies, 2002]
• Collaborative Caching and Web Proxies
– Summary cache [Fan, 2000]
– Cache Digests [Rousskov, 1998]
– Squid [Wessels, 1998]
– DitTorrent [Saif, 2006]
DTNs are thought to have too much delay for interactive Web access
Caching strategies thus far have not looked at disconnected networks
Design Goals of C-LINK
• Allow web access over any network layer
– DTN, VSAT, cellular
– Miss rates must be brought down
– Data requested by one node may be used by another
• Possibly serving stale data in the short term
• Must adhere to constraints imposed by the environment
Environmental constraints
• Severe storage constraints
– Tens of GB per machine
• Heterogeneous devices
• Multiple transport options
• Frequent power interruptions
– At both node and system level
C-LINK
[Architecture diagram. Village: Node 1 through Node n, each running a Web Browser, an Interface, and a Load Manager; the Kiosk runs the Kiosk daemon and the Notifier. A Network Layer connects the village to the City, where the Proxy runs the City Fetch Engine and reaches the Internet. Animation labels, in order: “Page A, please”, “A”, “Page B, please”, “B”, “Page A, please”, “Node 1 has it”.]
Interface
• Handles requests from a generic web browser
– Captures socket dump from browser
– Returns pages or “waiting” message
• Searches for files locally first
– Notifies Kiosk if file is found
• Contacts Kiosks and Load Managers on behalf of Web browser
Kiosk daemon
• Point of contact for Interfaces
– DHCP server
• Maintains hash map of URLs to machines
– Returns last known IP address when URL is requested
• Determines which network requests should be sent on
• Sends files to Load Managers
– Or temporary local storage
• Notes cacheability of pages to refetch “non-cacheable” pages
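The URL-to-machine map can be pictured with a minimal sketch. The class and method names below (`KioskDirectory`, `record`, `evict`, `lookup`) are hypothetical and not taken from C-LINK; the sketch only assumes that the map stores one last known IP address per URL, as the slide states.

```python
from typing import Dict, Optional


class KioskDirectory:
    """Hypothetical URL-to-machine map, as a Kiosk daemon might keep it."""

    def __init__(self) -> None:
        # URL -> last known IP address of the node holding the page
        self._holder: Dict[str, str] = {}

    def record(self, url: str, ip: str) -> None:
        # Called when a node reports that it now stores the page.
        self._holder[url] = ip

    def evict(self, url: str, ip: str) -> None:
        # Called when a node reports an eviction; reports from a node
        # that is not the recorded holder are ignored.
        if self._holder.get(url) == ip:
            del self._holder[url]

    def lookup(self, url: str) -> Optional[str]:
        # Last known IP address, or None when no node is known to hold it.
        return self._holder.get(url)
```

A `None` result from `lookup` would correspond to a village-wide miss, at which point the request has to leave the village.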
City Fetch-Engine
• Makes connections to internet servers
– Puts browser’s socket dump into file
– Dumps response into file to send back
• Selects network connection over which to send data back
• Prefetches pages
– Simple parsing for embedded files/links
– More complex models
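The “simple parsing” step might look roughly like the sketch below, which uses Python's standard `html.parser` to separate embedded files from outgoing links. The tag choices (`img`, `script`, `link`, `a`) and the names `LinkCollector` and `prefetch_candidates` are assumptions for illustration, not the Fetch-Engine's actual rules.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects embedded resources and hyperlinks from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.embedded = []  # images, scripts, stylesheets
        self.links = []     # pages the user might visit next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.embedded.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.embedded.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))


def prefetch_candidates(base_url, html):
    """Return (embedded, links) as lists of absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.embedded, parser.links
```

Embedded files are needed to render the requested page at all, so they are the natural first candidates; which hyperlinks to follow is where the “more complex models” would come in.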
Notifier
• Woken up by network when pages reach village
• Determines whether prefetched pages should be retained or discarded
• Informs Kiosk daemon of page’s arrival and name of original requestor
Load Manager
• Serves requested pages to Interfaces
– Notifies Interfaces if page has been removed
• Maintains and enforces storage quotas
– May be dynamically tuned or statically apportioned
– Separate cache space for prefetched and explicitly requested pages
• Determines eviction policies
– LRU queue maintained
• Notifies Kiosk daemon on page eviction
• Inserts pages into local cache
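The quota and eviction behaviour can be sketched as two LRU queues with separate byte budgets. This is a sketch under assumptions: the class name `LoadManagerCache`, the `on_evict` callback (standing in for the notification to the Kiosk daemon), and the statically fixed quotas are all hypothetical.

```python
from collections import OrderedDict


class LoadManagerCache:
    """Two LRU spaces, one for requested and one for prefetched pages."""

    def __init__(self, requested_quota, prefetched_quota, on_evict=None):
        self.quota = {"requested": requested_quota, "prefetched": prefetched_quota}
        self.used = {"requested": 0, "prefetched": 0}
        # Each OrderedDict maps URL -> size, oldest entry first.
        self.pages = {"requested": OrderedDict(), "prefetched": OrderedDict()}
        # Hypothetical hook standing in for "notify the Kiosk daemon".
        self.on_evict = on_evict or (lambda url: None)

    def insert(self, url, size, kind="requested"):
        if size > self.quota[kind]:
            return False  # the page can never fit in this space
        space = self.pages[kind]
        if url in space:
            self.used[kind] -= space.pop(url)  # replace the old copy
        # Evict least recently used pages until the new page fits.
        while self.used[kind] + size > self.quota[kind]:
            old_url, old_size = space.popitem(last=False)
            self.used[kind] -= old_size
            self.on_evict(old_url)
        space[url] = size
        self.used[kind] += size
        return True

    def get(self, url):
        # A hit refreshes the page's position in its LRU queue.
        for space in self.pages.values():
            if url in space:
                space.move_to_end(url)
                return True
        return False
```

Keeping the two spaces separate means a burst of prefetched pages cannot push out pages a user explicitly asked for.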
Prototype system
• 3 Pentium 3 computers running KioskNet
– The “proxy” runs the City Fetch-Engine
– The “kiosk” runs the Kiosk daemon and Notifier
• 1 Core Duo laptop configured as KioskNet’s “ferry”
• 3 Pentium 4 computers running standard Hardy Heron Kubuntu
– Run both Interface and Load Manager
– 1 GB caches
• No prefetching
Observations
• 19628 requests made by students
– 44% were local hits
– 20% were collaborative hits
– 60% of hits were for “non cacheable” content
• Average time to display a page was 300 msec
– Local pages could be served in 5 ms
– Pages elsewhere in the village may take up to 1 second to display
Outline
• Our previous work
– C-LINK
– Simulation
• Current efforts at improvement
• The real deployment
Simulator
• Trace Driven
– Cambodia, Blackboard, and Prototype traces
– Page accesses in trace are assumed to be requests
– Nodes enter the cache on their first access
– Track hits, misses, and latencies
• Accurate model of the system previously described
• Collaborative caching turned “off” results in nodes maintaining individual caches
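A toy version of the trace replay makes the “on” versus “off” comparison concrete. It is not the simulator described above: it assumes unbounded caches, ignores latency and connectivity, and assumes a collaborative hit does not copy the page to the requester. The function name `replay` and the `(node, url)` trace format are hypothetical.

```python
def replay(trace, collaborative=True):
    """Count local hits, collaborative hits, and misses for a trace.

    trace is an iterable of (node, url) pairs in access order.
    """
    caches = {}  # node -> set of URLs stored at that node
    counts = {"local": 0, "collaborative": 0, "miss": 0}
    for node, url in trace:
        # A node enters the system on its first access.
        cache = caches.setdefault(node, set())
        if url in cache:
            counts["local"] += 1
        elif collaborative and any(
            url in other for name, other in caches.items() if name != node
        ):
            counts["collaborative"] += 1
        else:
            counts["miss"] += 1
            cache.add(url)  # the fetched page is stored at the requester
    return counts
```

Replaying the same trace with `collaborative=False` gives the individual-cache baseline, so the difference in misses is the benefit attributable to sharing.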
Tunable parameter: Cache Size
• Size of cache at user nodes
– Based on previous work, range from 0-100 KB per node
• Sizes must be scaled appropriately for number of users
• Multiplicative factor for cache size at Kiosk
– The kiosk may be slightly better than a user machine
• Use 1.25x, 10x, 100x, and infinite space
Tunable parameter: Network Delay
• “leave time” – how frequently requests leave the village
• “length of trip” – the length of a round trip
• Define three “networks”:
– Instant – leave time=0; length=0
– Bus – leave time=60; length=60
– Hybrid – leave time=60; length=30
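One way to read the two parameters is sketched below. The departure model is an assumption: requests are taken to leave at multiples of the leave time, and a miss waits for the next departure plus one round trip. The slide gives no time unit, so none is assumed here; `NETWORKS` and `miss_latency` are hypothetical names.

```python
import math

# (leave time, length of trip) for the three networks on the slide
NETWORKS = {"instant": (0, 0), "bus": (60, 60), "hybrid": (60, 30)}


def miss_latency(t, network):
    """Delay seen by a request issued at time t that misses in the village."""
    leave_time, length = NETWORKS[network]
    if leave_time == 0:
        departure = t  # requests leave immediately
    else:
        # Assumed: departures happen at 0, leave_time, 2 * leave_time, ...
        departure = math.ceil(t / leave_time) * leave_time
    return (departure - t) + length
```

Under this reading, a request issued at time 10 on the Bus network waits 50 for the departure and 60 for the trip; the Hybrid network has the same wait but half the trip length.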
Tunable parameter: User Connectivity
• Use random connections or traces obtained from CRAWDAD
• Random connections may be “balanced” with means of 90 min or “unbalanced” with means matching the CRAWDAD distribution
• Nodes use selected distribution to determine length of time connected or disconnected
[Figures: Time in range (CRAWDAD); Time out of range (CRAWDAD)]
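A “balanced” random connection pattern could be generated as in the sketch below. The slide specifies only the 90 minute means; the exponential distribution, the function name `connectivity_schedule`, and the choice to start connected are assumptions made for illustration.

```python
import random


def connectivity_schedule(total_minutes, mean_in=90.0, mean_out=90.0, seed=None):
    """Return a list of (start, end, connected) intervals covering the run."""
    rng = random.Random(seed)
    schedule = []
    t, connected = 0.0, True  # assumed: every node starts in range
    while t < total_minutes:
        mean = mean_in if connected else mean_out
        # Assumed distribution: exponential with the requested mean.
        duration = rng.expovariate(1.0 / mean)
        end = min(t + duration, total_minutes)
        schedule.append((t, end, connected))
        t, connected = end, not connected
    return schedule
```

An “unbalanced” pattern would use different values for `mean_in` and `mean_out`, chosen to match the trace distribution.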
Exploring the network layer
7-14x reduction in miss rates; saving 75% of bytes transmitted
2x reduction in miss rates; saving 28% of bytes transmitted
Exploring Cache Size
Average latencies are less than 10 minutes
Greater than 5x improvement at low cache sizes
Exploring Kiosk Limits
Average latencies reduced 2x with little extra space at kiosk
Additional storage provides little benefit
Exploring node movement
Previously examined cases are worst cases
Unbalanced motion shows improvements of 3x
Prototype system
13% improvement in the best case
Results reflect abnormal usage patterns and general web traffic
Outline
• Our previous work
– C-LINK
– Simulation
• Current efforts at improvement
• The real deployment
Prefetching
• Need intelligent prefetching
• Coding is nearly completed to deliver all embedded content on the page
Replication
• Data accessed more than N times should be replicated in the network
– Protects against failures
– Speeds up page delivery
• Modifications to the simulator complete
– Comparisons of N=999 (very little replication) and N=3 (the average number of requests)
• 10% improvements in miss rate and number of reachable pages
– Numbers may be better because of forced replication when node was out of range
• Previously, only one copy was known by the kiosk
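The replication rule can be sketched as an access counter with a threshold N. Where the replica is placed is not stated on the slide; this sketch assumes the requesting node keeps a copy once the page has been accessed more than N times. `ReplicationTracker` and its methods are hypothetical names.

```python
from collections import Counter


class ReplicationTracker:
    """Tracks access counts and which nodes hold a copy of each page."""

    def __init__(self, threshold):
        self.threshold = threshold      # N on the slide
        self.access_counts = Counter()  # URL -> number of accesses
        self.holders = {}               # URL -> set of nodes with a copy

    def store(self, url, node):
        self.holders.setdefault(url, set()).add(node)

    def access(self, url, requester):
        """Record one access; return True if the requester should replicate."""
        self.access_counts[url] += 1
        holders = self.holders.setdefault(url, set())
        if self.access_counts[url] > self.threshold and requester not in holders:
            holders.add(requester)  # assumed placement: at the requester
            return True
        return False
```

With `threshold=999` almost nothing is ever replicated, while `threshold=3` replicates on the fourth access, which matches the two settings compared above.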
Resiliency
[Diagram: the village layout again (Node 1 through Node n, each with Web Browser, Interface, and Load Manager; a Kiosk with Notifier and Kiosk daemon). Animation labels, in order: “Need a Kiosk”, “Random back off” (twice), “I’m the new Kiosk”, followed by a second Notifier and Kiosk daemon, “Page list” (twice), and finally “I’m back!” at the Kiosk.]
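The labels suggest a failover in which nodes that notice the Kiosk is missing back off for a random time, the first to finish announces itself as the new Kiosk, and the remaining nodes send it their page lists. The sketch below follows that reading, which is an interpretation of the diagram rather than a documented protocol; `elect_kiosk`, `rebuild_directory`, and `max_backoff` are hypothetical.

```python
import random


def elect_kiosk(node_ids, max_backoff=5.0, seed=None):
    """Pick the node whose random backoff timer expires first."""
    rng = random.Random(seed)
    backoff = {node: rng.uniform(0.0, max_backoff) for node in node_ids}
    winner = min(backoff, key=backoff.get)
    return winner, backoff


def rebuild_directory(page_lists):
    """Rebuild the URL-to-node map from the page lists the nodes send."""
    directory = {}
    for node, urls in page_lists.items():
        for url in urls:
            directory[url] = node
    return directory
```

In a real network two timers could expire close enough together to collide, so a deployed version would need a tie-breaking step that this sketch leaves out.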
Outline
• Our previous work
– C-LINK
– Simulation
• Current efforts at improvement
• The real deployment
Cinco Pinos, Nicaragua
Library
• Attracts users from the general population
– Students from school & others
• As an alternative to pay-to-use, full-service internet, it will likely draw less wealthy users
• Just off bus route
• Owned by CODER, Comision de Desarrollo Rural
– Local NGO, enclosed and protected space
– Already have 1 computer in library which they used to give computer lessons to local youth
– Already invested in creating development programs for locals
• Other libraries nearby allow for exploration of multi-kiosk effects later
Equipment
• 6 computers with Kubuntu
• 2 single board computers from Soekris Engineering
– 433 to 600 MHz AMD Geode LX single chip processor with CS5536 companion chip
– 128-1024 MB DDR-SDRAM, soldered on board
– 2 Serial ports, DB9 and 10 pins internal header
– Power LED, Disk LED, Error LED, Network LEDs
– Mini-PCI type III socket
• Atheros 802.11 card inserted
• 1 mini-box M300-LCD
– Intel mini-ITX Atom 1.6 GHz
– 1 GB RAM
– VGA, Serial, PS/2
– PCI card with mini-PCI adapter
• Atheros 802.11 card inserted
• 3 antennas
– 2 outdoor and 1 vehicle mount
– 2.4 GHz omni-directional
Measurement
• Quantitative
– Similar to measurements taken on prototype
• May pre-load some pages
– Include measurements of number of evictions and cache readjustments
• Qualitative
– User experiences
– Suggested improvements
– Perceived value of system
– Future features
The End
On to Nicaragua!