
MiddleMan: A Video Caching Proxy Server

NOSSDAV 2000

Brian Smith
Department of Computer Science
Cornell University, Ithaca, NY

Soam Acharya
Inktomi Corporation, Foster City, CA

VOW (Video on the Web): 1999-2000

• Remote sites
  – e.g. CNN, Mars Pathfinder, ESPN
  – Clinton's testimony
  – "micro broadcasting" stations
• Local sites
  – corporate intranets: training videos
  – university-wide lecture distribution


Problems With VOW

• Network unreliability
• Client heterogeneity
• Server bottlenecks
• Content Distribution Networks
  – only distribute a portion of a web site
  – don't improve latency as much as …
• My solution: cache transmission

[Figure: CNN server, ISP, T1 link, 28.8/56 Kbps client links.]

Our Goals

• Low-end, low-cost approach
• Scalable
• Utilize a highly connected network of powerful commodity PCs

How Does It Work?

[Figure: video web caching proxy system; proxies (P) and a coordinator communicating over TCP, fetching from CNN.]

Why is this an attractive idea?

• Consider a university campus
  – 50-1000 PCs
  – 4 GB disk each
  – 100 MB for caching
  – 5-100 GB aggregate cache
• System architecture
  – Proxies: cache on each end system; act as both client and server
  – Coordinator: manages the cache (a minimal sketch follows)
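The coordinator's bookkeeping can be pictured as a map from video blocks to the proxies that currently hold them. The sketch below is a minimal illustration under that reading; the Coordinator class, its method names, and the block-naming scheme are hypothetical, not taken from the paper.

```python
# Hypothetical sketch: the coordinator maps each (video, block index) to the
# proxy currently holding that block, so requests can be redirected without
# the coordinator ever touching video data itself.
from typing import Dict, Optional, Tuple

BlockId = Tuple[str, int]  # (video URL, block index)

class Coordinator:
    def __init__(self) -> None:
        self.block_map: Dict[BlockId, str] = {}  # block -> proxy address

    def register(self, video: str, index: int, proxy: str) -> None:
        """A proxy reports that it now stores this block."""
        self.block_map[(video, index)] = proxy

    def evict(self, video: str, index: int) -> None:
        """A proxy reports that it dropped this block."""
        self.block_map.pop((video, index), None)

    def locate(self, video: str, index: int) -> Optional[str]:
        """Return the proxy holding the block, or None on a miss
        (the requesting proxy then fetches from the origin server)."""
        return self.block_map.get((video, index))
```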

Roadmap

• Survey Summary
• Architecture: Detailed system architecture
• Analysis: Architecture analysis and results
• Comparison with other designs
• Conclusion and future direction

Characterizing Videos On The Web

• mid-April 1997 to May 1997
• about 57,000 video titles
• 100 GB of data
• Web video size: ~1 MB
• videos are WORMs (write once, read many)

Characterizing User Access To Videos On The World Wide Web

• Traces from an ongoing enterprise VoW trial:
  – 2-year period
  – 13,100 requests
  – 246 titles
• Video browsing patterns:
  – partial browsing: only 61% of all movie accesses went to completion
  – high temporal locality
  – file size trends …

Roadmap

• Survey Summary
• Architecture: Detailed system architecture
• Analysis: Architecture analysis and results
• Comparison with other designs
• Conclusion and future direction

MiddleMan Design Principles

• Proxy cluster
  – proxies plus one coordinator per LAN
• Fragmentation
  – videos fragmented into equal-sized file fragments (1 MB "cache lines"); a sketch follows this list
• Partial video storage policy
  – don't have to keep the entire video lying around
  – replace on a block-by-block basis

How It Works I

[Figure: proxy cluster; proxies (P) and a coordinator (C) handling a request, with misses fetched from the WWW server.]

How It Works II

[Figure: proxy cluster; proxies (P) and a coordinator (C) continuing the request flow from the WWW server.]

MiddleMan: Actual Architecture

[Figure: a LAN with local proxies (LP), storage proxies (SP), and a coordinator (C), connected to the WWW; the legend distinguishes communication (control messages) from data transfer.]

Linking Multiple Proxy Clusters

[Figure: coordinators (C) of multiple proxy clusters linked to one another.]

Roadmap

• Survey Summary
• Architecture: Detailed system architecture
• Analysis: Architecture analysis and results
• Comparison with other designs
• Conclusion and future direction

Issues To Analyze

• Local cache size
• Size of file blocks
• Caching algorithm
• Controlling proxy load

Performance Analysis (via Simulations)

• Caching
  – analyze byte hit rates of various caching algorithms (a simulator sketch follows this list)
• Load balancing
  – load analysis on the proxies
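The byte-hit-rate analysis can be reproduced in miniature with a trace-driven simulator. Below is a minimal sketch using LRU over 1 MB blocks; the trace format (video id, bytes viewed) and the capacity parameter are assumptions for illustration, not the authors' simulator.

```python
# Replay a request trace against an LRU cache of 1 MB blocks and report the
# byte hit rate (bytes served from cache / total bytes requested).
from collections import OrderedDict

BLOCK = 1 << 20  # 1 MB blocks

def byte_hit_rate(trace, capacity_blocks):
    """trace: iterable of (video_id, bytes_viewed) pairs."""
    cache = OrderedDict()            # block id -> True, kept in LRU order
    hit_bytes = total_bytes = 0
    for video, nbytes in trace:
        for offset in range(0, nbytes, BLOCK):
            blk = (video, offset // BLOCK)
            size = min(BLOCK, nbytes - offset)
            total_bytes += size
            if blk in cache:
                hit_bytes += size
                cache.move_to_end(blk)          # mark as most recently used
            else:
                cache[blk] = True
                if len(cache) > capacity_blocks:
                    cache.popitem(last=False)   # evict least recently used
    return hit_bytes / total_bytes if total_bytes else 0.0
```

Swapping the eviction line for LFU or FIFO bookkeeping is what the comparison in the next chart varies.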

The Configuration Tested

• 44 proxies
• Individual proxy cache size:
  – 50 MB (8.6% of video server size)
  – 25 MB (4.3%)
  – 12 MB (2.1%)
• Block size: 1 MB
• Three sub-traces: cdt, sm, campus
• Trace durations:
  – 6 months
  – 2 years

Overall Cdt Cache Performance

[Chart: byte hit rate (0–90%) for Perfect, LRU, LFU, FIFO, and LRU-3 replacement at per-proxy cache sizes of 12 MB (2.1%), 25 MB (4.3%), and 50 MB (8.6%).]

Caching Conclusion

• Not much difference in caching algorithm performance
  – cannot evaluate the architecture solely based on byte hit rate
  – should also consider load balancing behavior

Load Balancing

• How to define/measure "load"?
  – Dynamic: traffic in the past hour, peak number of connections, peak bandwidth
  – Cumulative: byte traffic through a proxy
• Tested the following algorithms:
  – LRU
  – histLRUpick
    • run LRU-2, LRU-3, LRU-4 in parallel to obtain multiple candidates for block replacement
    • pick the least loaded candidate block (see the sketch after this list)
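A minimal sketch of the histLRUpick selection step. It assumes that LRU-2, LRU-3, and LRU-4 each nominate one victim block and that "least loaded" means the candidate whose hosting proxy has moved the fewest bytes so far; that interpretation, and every name below, are illustrative assumptions rather than the paper's exact definition.

```python
# Given replacement candidates from LRU-2/LRU-3/LRU-4, evict the candidate
# sitting on the least loaded proxy (assumption: load = cumulative bytes).
from typing import Dict, Iterable, Tuple

Block = Tuple[str, int]          # (video id, block index)

def pick_victim(candidates: Iterable[Block],
                block_owner: Dict[Block, str],
                proxy_load: Dict[str, int]) -> Block:
    """Return the candidate block whose owning proxy has the lowest load."""
    return min(candidates, key=lambda blk: proxy_load[block_owner[blk]])

# Example: three LRU-k instances nominate different victims; the block on the
# proxy with the smallest cumulative traffic is chosen for replacement.
owners = {("cnn.mpg", 3): "p1", ("lecture.mpg", 7): "p2", ("mars.mpg", 0): "p3"}
load = {"p1": 900, "p2": 150, "p3": 400}   # cumulative MB served per proxy
print(pick_victim(owners.keys(), owners, load))   # ('lecture.mpg', 7)
```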

Measuring Dynamic Load Balancing Performance

[Figure: per-proxy (P1, P2, P3) connection counts over time; the metric tracks the total number of connections and the number of connections at the busiest (max) proxy.]
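That metric is easy to compute from per-proxy connection samples. A minimal sketch, assuming the samples arrive as one dictionary of per-proxy connection counts per time step (an assumed data layout):

```python
# At each time step, compare the number of connections at the busiest proxy
# with the total number of connections across all proxies.
def max_vs_total(samples):
    """samples: list of dicts mapping proxy -> open connections at time t.
    Returns a list of (connections at max proxy, total connections)."""
    return [(max(counts.values()), sum(counts.values())) for counts in samples]

# Example with three proxies sampled at three instants:
print(max_vs_total([{"p1": 4, "p2": 1, "p3": 2},
                    {"p1": 2, "p2": 5, "p3": 2},
                    {"p1": 1, "p2": 1, "p3": 1}]))
# [(4, 7), (5, 9), (1, 3)]
```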

Example Proxy Connection Plot

[Chart: number of connections over time for LRU vs. histLRUpick; 44 proxies, 50 MB each.]

Cumulative Load Balancing: Max/Min

[Chart: maximum and minimum cumulative per-proxy traffic (MBytes, 0–16,000) for LRU and histLRUpick with 44, 22, and 11 proxies.]

Load balancing: Conclusions

• histLRUpick performs the best
• as the number of proxies decreases, individual load increases
• room for improvement:
  – RWFQ
  – replication

Roadmap

• Survey Summary
• Architecture: Detailed system architecture
• Analysis: Architecture analysis and results
• Comparison with other designs
• Conclusion and future direction

Comparison to Other Proxy Caches

[Figure: proxy caches (P) fetching from www servers across the Internet.]

• Document characteristics
  – fragment, distribute video
  – keep initial portions of video
• Caching approach
  – other approaches are optimized for:
    • small documents
    • high reference locality
• Proxy architecture
  – distributed vs. centralized

Conclusion

• Caching in general is a good idea:
  – high hit rates for a relatively small cache size
• histLRUpick:
  – high hit rates
  – effective dynamic and cumulative load balancing
• Used Jigsaw, HTTP/1.1, and GET/PUTs to build a prototype storage proxy server (a sketch follows)
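The prototype described here was built on Jigsaw; the sketch below is not that prototype but a minimal Python stand-in showing the same idea of a storage proxy that accepts blocks with HTTP PUT and serves them back with GET. The storage directory and the URL-to-file mapping are assumptions.

```python
# A tiny HTTP block-storage server: PUT stores a block under a name derived
# from the request path, GET returns it (404 if absent).
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = "/tmp/middleman-blocks"   # assumed storage directory

class BlockHandler(BaseHTTPRequestHandler):
    def _path(self):
        # Map the request path to a flat file name inside STORE.
        return os.path.join(STORE, self.path.strip("/").replace("/", "_"))

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        os.makedirs(STORE, exist_ok=True)
        with open(self._path(), "wb") as f:
            f.write(self.rfile.read(length))
        self.send_response(201)
        self.end_headers()

    def do_GET(self):
        try:
            with open(self._path(), "rb") as f:
                data = f.read()
        except FileNotFoundError:
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("", 8080), BlockHandler).serve_forever()
```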

Future Work

• Other load balancing schemes
• Fault tolerance
• FF/REW support
• Security/authentication
• Proxy cluster cooperation

Dynamic Load Balancing Performance: 22 proxies × 100 MB

[Chart: number of connections over time for LRU vs. histLRUpick.]

• reduce the number of proxies
• raise the cache size of each proxy to keep the same global cache size