Distributed Caching Algorithms
for Content Distribution Networks
Sem Borst†, Varun Gupta‡, Anwar Walid†
†Alcatel-Lucent Bell Labs, ‡CMU
BCAM Seminar
Bilbao, September 30, 2010
Introduction
Scope: personalized/on-demand delivery of high-definition
video through service provider
• CatchUp TV / PauseLive TV features
• NPVR (Network Personal Video Recorder) capabilities
• Movie libraries / VoD (Video-on-Demand)
• User-generated content
Unicast nature defies conventional broadcast TV paradigm
Caching strategies
Focus: ‘hierarchical’ network architecture
Store popular content closer to network edge to reduce
traffic load, capital expense and performance bottlenecks
[Figure: five-level hierarchy (VHO, IO, CO, DSLAM, STB) with a cache at each level; annotations show per-level fractions of content stored vs. traffic served: 1%/10%, 9%/35%, 20%/30%, 30%/20%, 40%/5%]
Caching strategies (cont’d)
Typically there are caches installed at only one or two levels
[Figure: same hierarchy with caches at only two levels; annotations show stored/served fractions 10%/45%, 30%/30%, 60% (or 100%)/25%]
Caching strategies (cont’d)
Two interrelated problems
• Design: optimal cache locations and sizes
(joint work with Marty Reiman)
• Operation: efficient (dynamic) placement of content
items
Popularity statistics
Cache effectiveness strongly depends on locality / commonality in user requests
[Figure: request frequencies vs. content item rank 1, 2, 3, …, N]
Popularity statistics (cont’d)
Empirical data suggests that rank statistics resemble Zipf-Mandelbrot distribution
Relative frequency of n-th most popular item is

    p_n = H / (q + n)^α,  n = 1, …, N,

with
• α ≥ 0: shape parameter
• q ≥ 0: shift parameter
• H = [∑_{n=1}^{N} 1 / (q + n)^α]^{−1}: normalization constant

Ideal hit ratio for cache of size B ≤ N is

    R = ∑_{n=1}^{B} H / (q + n)^α
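The formulas above can be sketched directly in Python; the parameter values below are illustrative assumptions, not data from the talk.

```python
# Zipf-Mandelbrot popularity model and the ideal hit ratio for a cache
# holding the B most popular of N items.

def zipf_mandelbrot(N, alpha, q):
    """Relative frequencies p_n = H / (q + n)^alpha for n = 1..N."""
    weights = [1.0 / (q + n) ** alpha for n in range(1, N + 1)]
    H = 1.0 / sum(weights)          # normalization constant
    return [H * w for w in weights]

def ideal_hit_ratio(p, B):
    """Ideal hit ratio: total popularity mass of the B most popular items."""
    return sum(p[:B])

p = zipf_mandelbrot(N=10_000, alpha=0.8, q=10.0)
print(f"hit ratio for B=1000: {ideal_hit_ratio(p, 1000):.3f}")
```

Sweeping `alpha` over [0, 2] with this sketch reproduces the qualitative behavior in the next slide: a steeper distribution makes even a small cache very effective.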
Popularity statistics (cont’d)
Shape parameter α varies with content type, and strongly
impacts cache effectiveness
[Figure] Hit ratio R as function of shape parameter α for various cache sizes (B = 100, 500, 1000) and population of N = 10,000 content items
Popularity statistics (cont’d)
Zipf-Mandelbrot distribution is inherently static, and difficult to reconcile with dynamic phenomena
• Dynamic content ingestion and removal
• Time-varying popularity, request-at-most-once
Both adverse and favorable implications
• Requires agile caching policies and (implicit) popularity estimation, negatively affecting caching performance
• Causes popularity distribution to be steeper (higher α values over shorter time intervals), improving potential caching effectiveness
Optimal content placement
Consider symmetric scenario (cache sizes, popularity distributions)
For now, assume strictly hierarchical topology: content can
only be requested from parent node
Caches should be filled with most popular content items
from lowest level up
[Figure: hierarchy VHO – IO – CO – DSLAM – STB, caches filled with most popular items from lowest level up]
Greedy content placement strategy
Whenever node receives request for item, its local ‘popularity estimate’ for that item is updated
If requested item is not stored in local cache, then
• Request is forwarded to parent node
• Popularity estimate for requested item is compared with
that for ‘marginal’ item, which may then be evicted and
replaced
Provable ‘convergence’ to optimal content placement
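A minimal sketch of this greedy rule for a single cache, assuming unit-size items and raw request counts as the implicit popularity estimate (both simplifications not mandated by the talk):

```python
# Greedy placement at one node: count requests as the popularity
# estimate; on a miss, evict the 'marginal' (least popular) cached item
# if the requested item now looks more popular than it.
from collections import Counter

class GreedyCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = set()
        self.counts = Counter()      # local popularity estimates

    def request(self, item):
        """Return True on a local hit; on a miss (request forwarded to
        the parent), possibly replace the marginal cached item."""
        self.counts[item] += 1
        if item in self.items:
            return True
        if len(self.items) < self.capacity:
            self.items.add(item)
        else:
            marginal = min(self.items, key=lambda i: self.counts[i])
            if self.counts[item] > self.counts[marginal]:
                self.items.remove(marginal)
                self.items.add(item)
        return False

cache = GreedyCache(capacity=2)
for item in ["a", "b", "a", "c", "a", "b", "b", "c"]:
    cache.request(item)
print(sorted(cache.items))   # ['a', 'b']
```

After enough requests the cache holds the locally most popular items, which is the convergence the slide refers to.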
Optimal content placement (cont’d)
Relies on two strong (though reasonable) assumptions
• Symmetric popularity distributions and cache sizes
• Strictly hierarchical topology
What if popularity distributions are spatially heterogeneous?
Or what if content can be requested from peers as well?
Optimal content placement (cont’d)
Assume there are caches installed at only two levels
[Figure: hierarchy VHO – IO – CO – DSLAM – STB with caches installed at two levels only]
Optimal content placement (cont’d)
Consider cluster of M nodes at same level in hierarchy
Cluster nodes are either directly connected or indirectly via
common parent node
[Figure: root node above parent node 0, with leaf nodes 1, 2, …, M below]
Optimal content placement (cont’d)
Some notation
• c_0: transfer cost from root node −1 to parent node 0
• c_i: transfer cost from parent node 0 to leaf node i
• c_ij < c_0 + c_i: transfer cost from leaf node j to leaf node i

Then

    f_ij := c_0 + c_i              if j = i
            c_0                    if j = 0
            0                      if j = −1
            c_0 + c_i − c_ij > 0   if j ≠ −1, 0, i

represents transport cost savings achieved by transferring data to leaf node i from node j instead of root node
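The piecewise definition translates directly into code; the cost figures below are made-up numbers purely for illustration.

```python
# Transport-cost savings f_ij of serving leaf i from node j instead of
# the root (node -1). Node 0 is the parent; nodes 1..M are leaves.

def savings(i, j, c0, c, c_pair):
    """f_ij for leaf i served from node j."""
    if j == i:
        return c0 + c[i]                     # item already at leaf i
    if j == 0:
        return c0                            # fetched from the parent
    if j == -1:
        return 0.0                           # fetched from root: no savings
    return c0 + c[i] - c_pair[(i, j)]        # fetched from a peer leaf

c0 = 2.0                                 # root -> parent
c = {1: 1.0, 2: 1.0}                     # parent -> leaf
c_pair = {(1, 2): 1.5, (2, 1): 1.5}      # leaf <-> leaf
print(savings(1, 2, c0, c, c_pair))      # 2 + 1 - 1.5 = 1.5
```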
Optimal content placement (cont’d)
Problem of maximizing cost savings may be formulated as
    max  ∑_{i=1}^{M} ∑_{n=1}^{N} s_n d_in ∑_{j=0}^{M} f_ij x_jin                    (1)
    sub  ∑_{n=1}^{N} s_n x_in ≤ B_i,   i = 0, 1, …, M                               (2)
         x_jin ≤ x_jn,   i = 1, …, M,  j = 0, 1, …, M,  n = 1, …, N                 (3)
         ∑_{j=0}^{M} x_jin ≤ 1,   i = 1, …, M,  n = 1, …, N                         (4)

with B_i denoting cache size of i-th node, s_n size of n-th item, d_in demand for n-th item at i-th node
Inter-level cache cooperation
Allow for heterogeneous demands, but assume cij =∞, i.e.,
content can only be fetched from parent node and not from
peers
For compactness, denote cmin := mini=1,...,M ci
Proposition
For arbitrary demands, greedy content placement strategy
is guaranteed to achieve at least fraction
    ((M − 1) c_min + M c_0) / ((M − 1) c_min + (2M − 1) c_0)  ≥  M / (2M − 1)

of maximum achievable cost savings
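A quick numerical sanity check of the guarantee, with arbitrary sample costs (any positive values work, since the ratio is increasing in c_min and equals M/(2M − 1) at c_min = 0):

```python
# Guarantee fraction from the proposition, checked against its lower bound.
def guarantee(M, c_min, c0):
    return ((M - 1) * c_min + M * c0) / ((M - 1) * c_min + (2 * M - 1) * c0)

for M in (2, 5, 10):
    for c_min in (0.1, 1.0, 10.0):
        for c0 in (0.5, 2.0):
            assert guarantee(M, c_min, c0) >= M / (2 * M - 1)

print(guarantee(10, 1.0, 2.0))   # 29/47, comfortably above 10/19
```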
Intra-level cache cooperation
Now suppose content can be requested from peers as well
Intra-level connectivity allows distributed caches to cooperate and act as single logical cache, and makes caching at lower levels more cost-effective
Greedy optimization of local hit rate will lead to complete
replication of cache content
Cache cooperation improves aggregate hit rate across cache
cluster, at expense of lower local hit rate
Optimal trade-off and degree of replication depend on cost of intra-level transfers relative to transfers from parent or root node
Intra-level cache cooperation (cont’d)
Assume symmetric transport costs, cache sizes and demands: B_i ≡ B, c_i ≡ c, c_ij ≡ c′, and d_in ≡ d_n

For compactness, denote c″ := M(c + c_0) − (M − 1)c′ > c′

Problem (1)–(4) may be simplified to

    max  ∑_{n=1}^{N} s_n d_n (c″ p_n + (M − 1) c′ q′_n + M c_0 x_0n)     (5)
    sub  ∑_{n=1}^{N} s_n x_0n ≤ B_0                                      (6)
         ∑_{n=1}^{N} s_n (p_n + (M − 1) q′_n) ≤ M B                      (7)
         p_n + x_0n ≤ 1,   n = 1, …, N                                   (8)
         q′_n + x_0n ≤ 1,   n = 1, …, N                                  (9)

Knapsack-type problem structure
Intra-level cache cooperation (cont’d)
Optimal solution of content placement problem has relatively simple structure
Distinguish between two cases
• Mc ≥ (M−1)c′: more advantageous to store un-replicated
content in leaf nodes than in parent node
• Mc ≤ (M − 1)c′: more attractive to store un-replicated
content in parent node than in leaf nodes
with c cost between parent and leaf node and c′ cost between two leaf nodes
Case Mc ≥ (M − 1)c′
[Figure: root node, parent node and leaf nodes 1, 2, …, M, items colored by popularity tier]
Four popularity ‘tiers’
• Highly popular (red): replicated in all leaves (p_n = 1, q′_n = 1)
• Fairly popular (pink): stored in single leaf (p_n = 1)
• Mildly popular (yellow): stored in parent node (x_0n = 1)
• Hardly popular (green): stored in root node only
Case Mc ≥ (M − 1)c′ (cont’d)
[Figure: optimal values of p_n, q′_n and x_0n as function of item rank n, with thresholds separating the four tiers]
Case Mc ≤ (M − 1)c′
[Figure: root node, parent node and leaf nodes 1, 2, …, M, items colored by popularity tier]
Four popularity ‘tiers’
• Highly popular (red): replicated in all leaves (p_n = 1, q′_n = 1)
• Fairly popular (pink): stored in common parent (x_0n = 1)
• Mildly popular (yellow): stored in single leaf (p_n = 1)
• Hardly popular (green): stored in root node only
Case Mc ≤ (M − 1)c′ (cont’d)
[Figure: optimal values of p_n, q′_n and x_0n as function of item rank n, with thresholds separating the four tiers]
Local-Greedy algorithm
For convenience, assume B_0 = 0, s_n = 1 for all n = 1, …, N
If requested item is not stored in local cache, then
• Item is fetched from peer if cached elsewhere in cluster
and otherwise from root node
• Value of requested item is compared with ‘marginal’ cache value, i.e., value provided by marginal item in local cache, which may then be evicted and replaced

    Value of item n = c′ d_n   if stored elsewhere in cluster
                      c″ d_n   otherwise
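One step of this rule can be sketched as follows; the cost and demand numbers are illustrative, and the names `c_peer` (for c′) and `c_rp` (for c″) are my own labels.

```python
# One Local-Greedy decision: on a miss, evict the lowest-value cached
# item if the requested item is worth more. An item held by a peer in
# the cluster is worth c'*d_n locally; one held only at the root is
# worth c''*d_n.

def item_value(n, demand, stored_elsewhere, c_peer, c_rp):
    return (c_peer if stored_elsewhere else c_rp) * demand[n]

def local_greedy_step(cache, requested, demand, cluster_contents, c_peer, c_rp):
    """Return the (possibly updated) local cache, a set of item ids."""
    if requested in cache:
        return cache
    value = lambda n: item_value(n, demand, n in cluster_contents, c_peer, c_rp)
    marginal = min(cache, key=value)          # least valuable cached item
    if value(requested) > value(marginal):
        cache = (cache - {marginal}) | {requested}
    return cache

demand = {1: 10.0, 2: 4.0, 3: 1.0}
cluster = {1}                     # item 1 already cached at a peer
cache = {1, 3}
cache = local_greedy_step(cache, 2, demand, cluster, c_peer=1.0, c_rp=4.0)
print(sorted(cache))              # [1, 2]: item 3 is evicted for item 2
```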
Local-Greedy algorithm (cont’d)
May get stuck in suboptimal configuration

[Figure: globally optimal configuration vs. local optimum]

• Duplicating red item less valuable than single yellow item
• Duplicating yellow item less valuable than single green item
Local-Greedy algorithm (cont’d)
Performance guarantees (competitive ratios)
• Symmetric demands: within factor 4/3 from optimal
• Arbitrary demands: within factor 2 from optimal
Numerical experiments
• M = 10 leaf nodes, each with cache of size B = 1 TB
• Unit transport cost c0 = 2, c = 1, c′ = 1
• N = 10,000 content items, with common size S = 2 GB
• Each leaf node can store K = B/S content items
Numerical experiments (cont’d)
• Each leaf receives average of 1 request every 160 sec,
i.e., total request rate per leaf is ν = 0.00625 sec−1
• Zipf-Mandelbrot popularity distribution with shape parameter α and shift parameter q, i.e.,
    p_n = H / (q + n)^α,  n = 1, …, N,

with normalization constant

    H = [∑_{n=1}^{N} 1 / (q + n)^α]^{−1}

• Request rate for n-th item at each leaf node is d_n = p_n ν
Gains from cooperative caching
Compare minimum bandwidth cost with that in two other
scenarios
• Full replication: each leaf node stores K most popular
items
• No replication: only single copy of MK most popular
items is stored in one of leaf nodes
Without caching, bandwidth cost would be M ν S (c + c_0) = 10 × 0.00625 × 2 × 3 = 0.375 GBps = 3 Gbps
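The baseline arithmetic, reproduced directly from the slide's numbers: with no caching, every request travels root → parent → leaf.

```python
# No-caching bandwidth cost M * nu * S * (c + c0) from the slides.
M = 10                    # leaf nodes
nu = 1.0 / 160.0          # requests per second per leaf (= 0.00625)
S = 2.0                   # item size in GB
c0, c = 2.0, 1.0          # unit transport costs

cost_GBps = M * nu * S * (c + c0)
cost_Gbps = cost_GBps * 8
print(cost_GBps, cost_Gbps)
```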
[Figure] Bandwidth cost as function of shape parameter α for various scenarios and cache sizes
Some observations
• Caching effectiveness improves as popularity distribution gets steeper: bandwidth costs markedly decrease with increasing values of α
• Even when collective cache space can only hold 10% of total content, bandwidth costs reduce to fraction of those without caching, as long as value of α is not too low
• Best between either full or zero replication is often not much worse than optimal content placement; however, neither full nor zero replication performs well across entire range of α values
• Critical to adjust degree of replication to steepness of popularity distribution; Local-Greedy algorithm does just that
Performance of Local-Greedy algorithm
Various leaf nodes receive requests over time, sampled from Zipf-Mandelbrot popularity distribution

If requested item is not presently stored, node decides whether to cache it and if so, which currently stored item to evict

Distinguish three scenarios for initial placement
• Full replication: each leaf node stores 500 top items
• No replication: only single copy of 5000 top items is stored in one of leaf nodes
• Random: each leaf stores 500 randomly selected items

In optimal placement, items 1 through 165 fully replicated, and single copies of items 166 through 3515 stored
[Figure] Performance ratio as function of number of requests, with static or dynamic popularities
[Figure] Bandwidth savings as function of number of requests, with inaccurate popularity estimates
Some observations
• Local-Greedy algorithm gets progressively closer to optimum as system receives more requests and replaces items over time
• After only 3000 requests (out of a population of 10,000 items), Local-Greedy algorithm has come to within 1% of optimum, and stays there
• Performs markedly better than worst-case ratio of 3/4 might suggest
• While algorithm seems to ‘converge’ for all three initial states, scenario with no replication appears to be most favorable one, due to fact that in optimal placement only items 1 through 165 are fully replicated