Sleepers & Workaholics Caching Strategies in Mobile Computing Dr. Daniel Barbará Dr. Tomasz...

Post on 18-Dec-2015



Sleepers & Workaholics

Caching Strategies in Mobile Computing

Dr. Daniel Barbará

Dr. Tomasz Imielinski

About Me

Peter Rosegger
- 5th year Computer Science
- Specialization: Databases
- Graduation: December 2007

Sleepers & Workaholics: Caching Strategies in Mobile Computing

Dr. Daniel Barbará
- Professor at George Mason University
- Several patents associated with mobile caching

Dr. Tomasz Imielinski
- Professor at Rutgers University
- Senior VP: Search Technology at Ask.com

1994: 16 million cellular subscribers in the US

The Future of Mobile Computing

Use Habits:
- Large # of users
- Check weather, stocks, scores, etc.
- Mobile between cells (& wireless networks)

Hardware:
- Low-powered palmtop machines
- Poor battery life
- Narrow bandwidth

The Future of Mobile Computing

Query complex databases, but…
- Frequently powered off to save battery
- Frequently changing cells
- Network traffic must be minimized to conserve bandwidth

Why Caching is Important

Conserve:

1. COMPUTATIONAL RESOURCES

2. BATTERY LIFE

3. BANDWIDTH

Traditional Strategies Fail

Server lacks knowledge of:
- Which units are in its cell
- Which units are powered ON

Client caches cannot be tracked

The Solution

Purpose of Sleepers & Workaholics:

"…to propose a taxonomy of different cache invalidation strategies and study the impact of clients' disconnection times on their performance."

Strategies

- Timestamps (TS)
- Amnesic Terminals (AT)
- Signatures (SIG)

Control Strategy: No Cache (NC)

Timestamps

- Cache entries have timestamps
- Synchronous, history based, uncompressed reports

SERVER: Notify clients of identifiers of items changed within last w seconds, each with the timestamp of its latest change

CLIENT: For each item in cache:
- If in report with a newer timestamp, purge from cache
- Otherwise, update timestamp to the time of the report
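The client side of TS can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `TSClient`, the report layout (a mapping from item identifier to the time of its latest update), and the rule that a client drops its whole cache after missing more than w seconds of reports are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TSClient:
    """Hypothetical client for the Timestamps (TS) strategy."""
    w: float                                    # report window, in seconds
    cache: dict = field(default_factory=dict)   # item_id -> (value, timestamp)
    last_report_time: Optional[float] = None

    def on_report(self, report_time: float, report: dict) -> None:
        """Apply one invalidation report (item_id -> time of latest update)."""
        # Assumption: after a gap longer than w, the report no longer covers
        # everything the client missed, so the whole cache is discarded.
        if (self.last_report_time is not None
                and report_time - self.last_report_time > self.w):
            self.cache.clear()
        for item_id in list(self.cache):
            value, cached_ts = self.cache[item_id]
            if item_id in report and cached_ts < report[item_id]:
                del self.cache[item_id]                    # stale copy: purge
            else:
                self.cache[item_id] = (value, report_time)  # still valid
        self.last_report_time = report_time
```

A client built with `TSClient(w=100)` that hears a report at time 10 listing item `"a"` as updated at time 5 would purge an older copy of `"a"` and restamp everything else with time 10.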

Amnesic Terminals

- Cache entries have identifiers
- Synchronous, history based, uncompressed reports

SERVER: Notify clients of identifiers of items changed since the last invalidation report

CLIENT: For each item in cache:
- If in report, purge from cache
- If NOT in report, do nothing
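A comparable Python sketch for the AT client follows. The class name `ATClient` and the rule that a client which misses a report forgets its entire cache (the "amnesia" in the name) are assumptions made for this illustration; the slide itself only states the purge rule.

```python
class ATClient:
    """Hypothetical client for the Amnesic Terminals (AT) strategy."""

    def __init__(self, interval: float):
        self.interval = interval      # time between reports (L on later slides)
        self.cache = {}               # item_id -> value
        self.last_report_time = None

    def on_report(self, report_time: float, changed_ids: set) -> None:
        """Apply one report listing identifiers changed since the last report."""
        missed_a_report = (
            self.last_report_time is not None
            and report_time - self.last_report_time > self.interval
        )
        if missed_a_report:
            # Assumption: the report only covers one interval, so a client
            # that slept through a report cannot trust anything it holds.
            self.cache.clear()
        else:
            for item_id in changed_ids:
                self.cache.pop(item_id, None)   # purge changed items
        self.last_report_time = report_time
```

Compared with TS, the client keeps no timestamps and does nothing for items absent from the report, which is why the report can be shorter.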

Signatures

- Checksums calculated over value of data to form Signature
- Signatures combined using XOR
- Synchronous, state based, compressed reports

SERVER: Broadcasts the set of combined signatures

CLIENT: Item in cache is declared invalid if it belongs to “too many” unmatching signatures (suspected of being out of date)
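The signature idea can be illustrated with a toy Python sketch. Everything named here is an assumption made for illustration: SHA-256 truncated to 32 bits stands in for the checksum, `make_subsets` picks each item with probability 1/(f+1), the client compares the newly broadcast combined signatures with the ones it heard earlier, and `threshold` is a hypothetical fraction standing in for "too many".

```python
import hashlib
import random


def item_signature(value, bits: int = 32) -> int:
    """Checksum over an item's value (truncated SHA-256, chosen for the sketch)."""
    digest = hashlib.sha256(repr(value).encode()).digest()
    return int.from_bytes(digest[: bits // 8], "big")


def make_subsets(item_ids, m: int, f: int, seed: int = 0):
    """Draw m random subsets; each item joins a subset with probability 1/(f+1)."""
    rng = random.Random(seed)
    return [{i for i in item_ids if rng.random() < 1 / (f + 1)} for _ in range(m)]


def combined_signatures(db: dict, subsets) -> list:
    """XOR the item signatures of every subset into one combined signature."""
    sigs = []
    for subset in subsets:
        acc = 0
        for item_id in subset:
            acc ^= item_signature(db[item_id])
        sigs.append(acc)
    return sigs


def invalid_items(cached_ids, subsets, old_sigs, new_sigs, threshold: float) -> set:
    """Flag cached items that sit in too many subsets whose signature changed."""
    changed = [old != new for old, new in zip(old_sigs, new_sigs)]
    invalid = set()
    for item_id in cached_ids:
        votes = [c for s, c in zip(subsets, changed) if item_id in s]
        if votes and sum(votes) > threshold * len(votes):
            invalid.add(item_id)
    return invalid
```

Because the report size depends on the number of subsets rather than on how long a unit slept, a client can wake up after a long nap and still diagnose its cache, at the price of occasional false diagnoses.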

Analysis

Calculate THROUGHPUT for each strategy…

- $L$ = time between invalidation report broadcasts
- $W$ = bandwidth
- $B_C$ = # bits in the broadcast (invalidation reports)

Number of bits available for answering queries (cache misses):

$$LW - B_C$$

Analysis

- $T$ = THROUGHPUT; queries per interval handled by the system
- $h$ = cache hit rate, in $[0, 1]$
- $b_q$ = # bits for a query
- $b_a$ = # bits to answer a query

Traffic (in bits) due to cache misses:

$$T(1 - h)(b_q + b_a)$$

Throughput

$$T(1 - h)(b_q + b_a) = LW - B_C$$

$$T = \frac{LW - B_C}{(1 - h)(b_q + b_a)}$$
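The throughput formula is easy to evaluate directly. The Python helper below is a small sketch; the function name and the sample numbers are illustrative assumptions, not values from the slides.

```python
def throughput(L: float, W: float, B_C: float, h: float,
               b_q: float, b_a: float) -> float:
    """Queries per interval: T = (L*W - B_C) / ((1 - h) * (b_q + b_a))."""
    if not 0 <= h < 1:
        raise ValueError("hit rate h must lie in [0, 1)")
    return (L * W - B_C) / ((1 - h) * (b_q + b_a))


# Illustrative numbers: 10 s interval, 1000 bits/s, 2000-bit report,
# 50% hit rate, 32-bit queries, 128-bit answers.
T = throughput(L=10, W=1000, B_C=2000, h=0.5, b_q=32, b_a=128)
print(T)  # 100.0
```

A larger report shrinks the numerator, while a higher hit rate shrinks the denominator, so each strategy trades report size against hit rate.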

Effectiveness of a Strategy

$$e = \frac{T}{T_{max}}$$

Maximal Throughput

Server knows:
- What units are in the cell
- What those units have in their caches

Server can:
- Instantaneously notify units when an item changes

$B_C = 0$

$h$ = Maximal Hit Ratio

Maximal Hit Ratio

The Hit Ratio achieved in ideal conditions:

$$\mathrm{MHR} = \int_0^\infty \lambda e^{-\lambda\tau} e^{-\mu\tau}\, d\tau$$

$$\mathrm{MHR} = \frac{\lambda}{\lambda + \mu}$$
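The closed form follows from the integral once the two exponentials are combined. A worked derivation, assuming $\lambda > 0$ and $\mu > 0$ (the slide does not define them; here $\lambda$ is read as the query rate and $\mu$ as the update rate):

```latex
\begin{aligned}
\mathrm{MHR}
  &= \int_0^\infty \lambda e^{-\lambda\tau} e^{-\mu\tau}\, d\tau
   = \lambda \int_0^\infty e^{-(\lambda + \mu)\tau}\, d\tau \\
  &= \lambda \left[ -\frac{e^{-(\lambda + \mu)\tau}}{\lambda + \mu} \right]_0^\infty
   = \frac{\lambda}{\lambda + \mu}
\end{aligned}
```

Under that reading, queries that arrive much more often than updates push the ratio toward 1, and update-heavy items push it toward 0.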

Maximal Throughput

$$T_{max} = \frac{LW}{(1 - \mathrm{MHR})(b_q + b_a)}$$

$B_C = 0$

$h$ = Maximal Hit Ratio
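Dividing the general throughput by the maximal throughput cancels the $(b_q + b_a)$ factor, so effectiveness depends only on the share of bandwidth spent on reports and on the two hit ratios. A short Python sketch of that ratio, with hypothetical function names and illustrative numbers:

```python
def max_hit_ratio(lam: float, mu: float) -> float:
    """MHR = lam / (lam + mu), the closed form on the Maximal Hit Ratio slide."""
    return lam / (lam + mu)


def effectiveness(L: float, W: float, B_C: float, h: float, mhr: float) -> float:
    """e = T / T_max = (1 - B_C / (L*W)) * (1 - MHR) / (1 - h)."""
    if not (0 <= h < 1 and 0 <= mhr < 1):
        raise ValueError("hit ratios must lie in [0, 1)")
    report_share = B_C / (L * W)   # fraction of the interval spent on the report
    return (1 - report_share) * (1 - mhr) / (1 - h)


# Illustrative numbers: a report using 20% of the interval, a hit rate of
# 0.5, and an ideal hit ratio of 0.75 give an effectiveness of about 0.4.
e = effectiveness(L=10, W=1000, B_C=2000, h=0.5, mhr=max_hit_ratio(3, 1))
```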

No Caching

- No invalidation report
- No intervals

$B_C = 0$

$h = 0$

$$T_{nc} = \frac{LW}{b_q + b_a}$$

Timestamps

$$T_{TS} = \frac{LW - n_c(\log(n) + b_T)}{(b_q + b_a)(1 - h_{ts})}$$

Amnesic Terminals

$$T_{AT} = \frac{LW - n_L \log(n)}{(b_q + b_a)(1 - h_{at})}$$

Signatures

Consider the probability of false diagnosis:
- Probability of a false positive
- Probability of a false negative

$$T_{SIG} = \frac{LW - 6g(f + 1)\left(\ln\left(\frac{1}{\delta}\right) + \ln(n)\right)}{(b_q + b_a)(1 - h_{sig})}$$
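The three strategies differ in the report-size term subtracted from $LW$. The Python sketch below computes those terms side by side. The slides do not define the symbols, so the readings used here are assumptions: $n$ is the number of items in the database, $n_c$ and $n_L$ count the items changed in the last w and L seconds, $b_T$ is the size of a timestamp in bits, $g$ is the size of a signature in bits, and $\log$ in the TS and AT formulas is taken as base 2 (bits per identifier).

```python
import math


def report_bits_ts(n: int, n_c: int, b_T: int) -> float:
    """TS report: n_c entries, each an identifier plus a timestamp."""
    return n_c * (math.log2(n) + b_T)


def report_bits_at(n: int, n_L: int) -> float:
    """AT report: n_L identifiers only."""
    return n_L * math.log2(n)


def report_bits_sig(n: int, g: int, f: int, delta: float) -> float:
    """SIG report: 6 g (f + 1) (ln(1/delta) + ln(n)) bits."""
    return 6 * g * (f + 1) * (math.log(1 / delta) + math.log(n))


# Illustrative numbers only: a 1024-item database with few recent changes.
print(report_bits_ts(n=1024, n_c=10, b_T=32))   # 420.0
print(report_bits_at(n=1024, n_L=5))            # 50.0
```

Note that the SIG term does not depend on how many items changed, only on the database size and the scheme's parameters.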

Asymptotic Analysis

Analyze throughput in extreme cases:
- As probability of sleeping $s \to 0$, $s \to 1$

Analyze throughput as system parameters vary:
- Database size
- Update frequency
- Bandwidth
- Etc.

Workaholics

Unit sleeps less and less: $s \to 0$
- All hit ratios approach the same value
- SIG lags behind TS and AT by a factor of $p_{nf}$

BEST THROUGHPUT: AT, because its report is the shortest

Sleepers

Unit sleeps more and more: $s \to 1$
- All hit ratios approach 0

BEST THROUGHPUT:
- No Caching eventually wins as s becomes very large
- For practical purposes, SIG is the best choice

Infrequent Updates

[Figure: effectiveness as s ranges from 0 to 1]

Increase Database Size & Bandwidth

[Figure: effectiveness as s ranges from 0 to 1]

Update Intensive

[Figure: effectiveness as s ranges from 0 to 1]

Increase Database Size & Bandwidth

[Figure: effectiveness as s ranges from 0 to 1]

Conclusions on Effectiveness

Strategy depends on circumstances:
- SIG is best for sleepers
- TS is best for query-intensive scenarios, but…
- AT is best for workaholics

How can we improve effectiveness?

Relax: Consistency of the Cache

Depending on data type, data may not need to be exact…

EX: stocks, weather, etc.

Makes shorter invalidation reports possible

How Do We Decide to Update?

- Consider cached copies to be quasi-copies

- Each quasi-copy has a coherency condition attached to it

Coherency Conditions:
- Delay Condition: updated based on time
- Arithmetic Condition: updated based on difference between data and quasi-copy
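The two coherency conditions can be expressed as simple predicates. The Python sketch below is one possible reading: the names `QuasiCopy`, `max_age`, and `max_diff` are hypothetical, and the slides do not say how the thresholds are chosen.

```python
from dataclasses import dataclass


@dataclass
class QuasiCopy:
    """A cached value that is allowed to drift from the server's value."""
    value: float
    cached_at: float   # time the copy was last refreshed


def violates_delay(copy: QuasiCopy, now: float, max_age: float) -> bool:
    """Delay condition: the copy may lag the server by at most max_age."""
    return now - copy.cached_at > max_age


def violates_arithmetic(copy: QuasiCopy, current_value: float,
                        max_diff: float) -> bool:
    """Arithmetic condition: the copy may differ by at most max_diff."""
    return abs(current_value - copy.value) > max_diff


# A stock quote cached at t=0 with value 100.0: still acceptable at t=30
# under a 60 s delay bound, and acceptable while the price stays within 0.5.
quote = QuasiCopy(value=100.0, cached_at=0.0)
stale = violates_delay(quote, now=30.0, max_age=60.0)                     # False
drifted = violates_arithmetic(quote, current_value=101.0, max_diff=0.5)   # True
```

Only items whose condition is violated need to appear in an invalidation report, which is how relaxed consistency shortens reports.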

Adaptive Invalidation Reports

- Start with TS strategy
- Use algorithms to optimize strategy.

Examples:
- If an item is queried very often by units that sleep a lot, include it in reports for longer
- If an item changes frequently, do not bother caching

Criticism

- Units rarely powered down
  - Battery life better than predicted
  - Battery life does not dictate use
- Units still lose reception frequently
  - Today’s most common “sleeper” condition, explicitly excluded from definition in S&W
- Bandwidth better than predicted

However…
- Adjust “sleeper” to include lost reception
- Caching is still important
  - Endless demand for computational resources
  - Endless demand for battery life
  - Endless demand for more bandwidth