Towards Adaptive Caching for Parallel and Distributed Simulation


Page 1: Towards Adaptive Caching for Parallel and Distributed Simulation

Maria Hybinette, UGA 1

Towards Adaptive Caching for

Parallel and Distributed Simulation

Abhishek Chugh & Maria Hybinette

Computer Science Department

The University of Georgia

WSC-2004

Page 2: Towards Adaptive Caching for Parallel and Distributed Simulation

[Example: airspace simulation, Atlanta-Munich]

Simulation Model Assumptions

Collection of Logical Processes (LPs)
Assume LPs do not share state variables
Communicate by exchanging time-stamped messages

[Diagram: LPs exchanging time-stamped messages]

Page 3: Towards Adaptive Caching for Parallel and Distributed Simulation

Problem & Goal

Problem: Inefficiency in PDES: redundant computations

Observation: Computations repeat:
» Long simulation runs
» Cyclic systems
» Communication network simulations

Goal: Increase efficiency by reusing computations

Page 4: Towards Adaptive Caching for Parallel and Distributed Simulation

Approach

Cache computations and re-use them when they repeat, instead of re-computing.

[Diagram: LPs exchanging messages through the cache]

Page 5: Towards Adaptive Caching for Parallel and Distributed Simulation

Approach: Adaptive Caching

Cache computations and re-use them when they repeat, instead of re-computing.

Generic caching mechanism independent of simulation engine and application

Caveat: Several factors impact the effectiveness of caching

» Proposal: an adaptive approach

[Diagram: LPs, messages, and the cache]

Page 6: Towards Adaptive Caching for Parallel and Distributed Simulation

Factors Affecting Caching Effectiveness

Cache size
Cost of looking up and updating the cache
Execution time of the computation
Probability of a hit: the hit rate

Page 7: Towards Adaptive Caching for Parallel and Distributed Simulation

Effective Caching Cost

E(Cost_use_cache) =
    hit_rate * Cost_lookup_hit
  + (1 - hit_rate) * (Cost_lookup_miss + Cost_computation + Cost_insert)

Page 8: Towards Adaptive Caching for Parallel and Distributed Simulation

Caching is Not Always a Good Idea

E(Cost_use_cache) =
    hit_rate * Cost_lookup_hit
  + (1 - hit_rate) * (Cost_lookup_miss + Cost_computation + Cost_insert)

When the hit rate is low, or the computation is very fast, caching loses.
Caching is worthwhile only when Cost_use_cache < Cost_computation.

Page 9: Towards Adaptive Caching for Parallel and Distributed Simulation

How Much Speedup is Possible?

Neglecting cache warm-up and fixed costs:

Expected speedup = Cost_computation / Cost_use_cache

Upper bound (hit_rate = 1) = Cost_computation / Cost_lookup

In our experiments, Cost_computation / Cost_lookup ≈ 3.5

Page 10: Towards Adaptive Caching for Parallel and Distributed Simulation

Related Work

Function caching: replace application-level function calls with cache queries
» Introduced by Bellman (1957); Michie (1968)
» Incremental computations: Pugh & Teitelbaum (1989); Liu & Teitelbaum (1995)
» Sequential discrete-event simulation: Staged Simulation, Walsh & Sirer (2003): function caching + currying (breaking up computations), re-ordering, and pre-computation

Decision tool techniques for PADS: multiple runs of similar simulations
» Simulation cloning: Hybinette & Fujimoto (1998); Chen, Turner, et al. (2002); Straßburger (2000)
» Updateable simulations: Ferenci et al. (2002)

Related optimization techniques
» Lazy re-evaluation: West (1988)

Page 11: Towards Adaptive Caching for Parallel and Distributed Simulation

Overview of Adaptive Caching

At execution time:

1. Warm-up phase, for each function:
   a) Monitor: hit rate, query time, function run time
   b) Determine the utility of using the cache

2. Main execution phase, for each function:
   a) Use the cache (or not), depending on the results from 1
   b) Randomly sample hit rate, query time, function run time
      » Revise the decision if conditions change

Page 12: Towards Adaptive Caching for Parallel and Distributed Simulation

What’s New

The decision to use the cache is made dynamically
» in response to unpredictable local conditions for each LP at execution time

Relieves the user of having to know whether something is worth caching
» the adaptive method automatically identifies caching opportunities and rejects poor caching choices

Easy-to-use caching API
» independent of application or simulation kernel
» cache middleware

Distributed cache
» each LP maintains its own independent cache

Page 13: Towards Adaptive Caching for Parallel and Distributed Simulation

Pseudo-Code Example

// LP CODE WITH CACHING
LP_init()
{
    cacheInitialize(argc, argv);
}


Page 15: Towards Adaptive Caching for Parallel and Distributed Simulation

Pseudo-Code Example

// LP CODE WITH CACHING
LP_init()
{
    cacheInitialize(argc, argv);
}

Proc(state, msg, MyPE)
{
    retval = cacheCheckStart(currentstate, event);
    if (retval == NULL)
    {
        /* original LP code: compute new state and events to be scheduled */

        /* allow the cache to save the results */
        cacheCheckEnd(newstate, newevents);
    }
    else
    {
        newstate = retval.state;
        newevents = retval.events;
    }
    schedule(newevents);
}


Page 17: Towards Adaptive Caching for Parallel and Distributed Simulation

Implementation

Page 18: Towards Adaptive Caching for Parallel and Distributed Simulation

Caching Middleware

Simulation Application

Cache Middleware

Simulation Kernel

Page 19: Towards Adaptive Caching for Parallel and Distributed Simulation

Caching Middleware (Hit)

Simulation Application

Cache Middleware

Simulation Kernel

Check cache (state/message) -> cache hit

Page 20: Towards Adaptive Caching for Parallel and Distributed Simulation

Caching Middleware (Miss)

Simulation Application

Cache Middleware

Simulation Kernel

Check cache (state/message) -> cache miss

Miss, or cache lookup expensive

On a miss: cache the new state & message

Page 21: Towards Adaptive Caching for Parallel and Distributed Simulation

Cache Implementation

Hash table with separate chaining
Input: current state & message
Output: state and output message(s)
Hash function: djb2 (by Dan Bernstein; used in Perl)

Page 22: Towards Adaptive Caching for Parallel and Distributed Simulation

Memory Management

Distributed cache: one for each LP
Pre-allocate a memory pool for each LP's cache during the initialization phase
The upper limit is parameterized

Page 23: Towards Adaptive Caching for Parallel and Distributed Simulation

Experiments

3 sets of experiments with P-Hold:
» Proof of concept (no adaptive caching): hit rate
» Evaluation of the impact of cache size and simulation running time on speedup (no caching / caching)
» Evaluation of adaptive caching with regard to the cost of event computation

16-processor SGI Origin 2000
» 4 processors used

Time stamps "curried" out

Page 24: Towards Adaptive Caching for Parallel and Distributed Simulation

[Figure: hit rate (%) vs. progress (simulated time), for cache sizes 90 KB (10%), 25000 KB (25%), and 10000 KB (100%)]

Hit Rate versus Progress

As expected, the hit rate increases as the cache size increases
The maximum hit rate is reached with the large cache
The hit rate sets an upper bound on speedup

Page 25: Towards Adaptive Caching for Parallel and Distributed Simulation

Speedup vs Cache Size

[Figure: speedup (no caching / caching) vs. cache size (KB), for 5 msec and 3 msec event computations]

Speedup improves as the size of the cache increases
Beyond a size of 9,000 KB, speedup declines and levels off
Performance is better for simulations whose computations have higher latency

Page 26: Towards Adaptive Caching for Parallel and Distributed Simulation

Speedup vs Cost_computation

Non-adaptive caching suffers a slowdown (speedup of 0.82) for low-latency computations; the speedup improves to 1 as the computational latency approaches 1.5 msec.

[Figure: speedup (caching / no caching) vs. computational latency (msec), non-adaptive]

Page 27: Towards Adaptive Caching for Parallel and Distributed Simulation

Speedup vs Cost_computation

Adaptive caching tracks the cost of consulting the cache compared with the cost of running the actual computation.

Adaptive caching holds the speedup at 1 for small computational latencies (it selects performing the computation instead of consulting the cache).

[Figure: speedup (caching / no caching) vs. computational latency (msec), non-adaptive vs. adaptive]

Page 28: Towards Adaptive Caching for Parallel and Distributed Simulation

Summary & Future Work

Summary:
Middleware implementation that requires no major structural revision of application code
Best-case speedup approaches 3.5; worst-case speedup is 1 (speedup is limited by a hit rate of 70%)
With randomly generated information (such as time stamps), caching may become ineffective unless precautions are taken

Future work:
Function caching instead of LP caching
Look at series of functions to jump forward
Adaptive replacement strategies

Page 29: Towards Adaptive Caching for Parallel and Distributed Simulation

Closing

“A sword wielded poorly will kill its owner”

-- Ancient Proverb

Page 30: Towards Adaptive Caching for Parallel and Distributed Simulation

Pseudo-Code Example

// ORIGINAL LP CODE
LP_init()
{
    //
    //
    //
    //
}

Proc(state, msg, MyPE)
{
    val1 = fancy_function(msg->param1, state->key_part);
    val2 = fancier_function(msg->param3);
    state->key_part = val1 + val2;
}


Page 32: Towards Adaptive Caching for Parallel and Distributed Simulation

Pseudo-Code Example

// ORIGINAL LP CODE
LP_init()
{
    //
    //
    //
    //
}

Proc(state, msg, MyPE)
{
    val1 = fancy_function(msg->param1, state->key_part);
    val2 = fancier_function(msg->param3);
    state->key_part = val1 + val2;
}

// LP CODE WITH CACHING
LP_init()
{
    cache_init(FF1, SIZE1, 2, fancy_function);
    cache_init(FF2, SIZE2, 1, fancier_function);
}

Proc(state, msg, MyPE)
{
    val1 = cache_query(FF1, msg->param1, state->key_part);
    val2 = cache_query(FF2, msg->param3);
    state->key_part = val1 + val2;
}

Page 33: Towards Adaptive Caching for Parallel and Distributed Simulation

Approach

Cache computations and re-use them when they repeat, instead of re-computing.

[Diagram: collection of LPs]