Relaxed Consistency Models

Outline

Lazy Release Consistency

TreadMarks DSM system

Review: what makes a good consistency model?

Model is a contract between memory system and programmer
◦ Programmer follows some rules about reads and writes
◦ Model provides guarantees

Model embodies a tradeoff
◦ Intuitive for programmer vs. can be implemented efficiently

TreadMarks high-level goals

Better DSM performance

Run existing parallel (and “correct”) code.

What specific problems with IVY does TreadMarks want to fix?

False sharing: two machines use different variables on the same page, and at least one of them writes
◦ IVY will make the page bounce back and forth
◦ However, it doesn’t need to do so if the two processes (threads) are working on different variables.

Send only written bytes – not whole pages

Goal 1: Reducing the data to be sent

Goal: don’t send the whole page, just the written bytes.

On M1 write fault:
◦ tell other hosts to invalidate but keep a hidden copy.
◦ M1 itself also keeps a hidden copy.

On M2 fault:
◦ M2 asks M1 for recent modifications.
◦ M1 “diffs” its current page against its hidden copy.
◦ M1 sends the differences to M2.
◦ M2 applies the diffs to its hidden copy to produce the up-to-date version.
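The hidden-copy-and-diff steps above can be sketched in a few lines of Python. This is only an illustration: the function names (`make_twin`, `make_diff`, `apply_diff`), the 8-byte "page", and the byte-level diff encoding are assumptions made for the example, not TreadMarks' actual data structures.

```python
def make_twin(page: bytearray) -> bytes:
    """Snapshot the page before the first write (the "hidden copy")."""
    return bytes(page)


def make_diff(twin: bytes, page: bytearray) -> list:
    """Return (offset, new_byte) for every byte that differs from the twin."""
    return [(i, new) for i, (old, new) in enumerate(zip(twin, page)) if old != new]


def apply_diff(page: bytearray, diff: list) -> None:
    """Patch a page in place with the bytes recorded in a diff."""
    for offset, value in diff:
        page[offset] = value


# M1 takes a write fault, keeps a hidden copy, then writes two bytes.
m1_page = bytearray(8)          # hypothetical 8-byte page, all zeros
m1_twin = make_twin(m1_page)
m1_page[2] = 7
m1_page[5] = 9

# M2 faults and asks M1 for recent modifications; M1 sends only the diff.
diff = make_diff(m1_twin, m1_page)

# M2 applies the diff to its own (old) copy of the page.
m2_page = bytearray(8)
apply_diff(m2_page, diff)
```

Only two (offset, byte) pairs cross the network here instead of the whole page, which is the point of Goal 1.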

Goal 2: allow multiple readers+writers

To cope with false sharing
◦ no invalidation when a machine writes
◦ no r/w -> r/o demotion when a machine reads
◦ so, there will be multiple “different” copies of a page! Which should a reader look at?

Diffs help here: can merge writes to same page

But, when to send the diffs?
◦ No invalidations, no page faults, so what triggers sending diffs?

Release Consistency

Think about how you write multi-threaded code. To access shared data, you first acquire a lock, then access the data, and finally release the lock. This is considered the “correct” programming practice.

In a distributed environment, suppose we have a lock server. Each process must get a lock from the lock server before accessing the shared resources.

Thus, we can send out write diffs on release to all copies of pages written.

This is a new consistency model!

Release Consistency Model

M0 won’t see M1’s writes until M1 releases a lock

so machines can temporarily disagree on memory contents

If the programs always follow the locking rules:
◦ Locks force order -> no stale reads -> like sequential consistency

But, if you do not follow this guideline (don’t lock)
◦ reads can return stale data
◦ concurrent writes to same variable -> trouble (data race)

Benefit?

multiple machines can have copies of a page, even when 1 or more writes
◦ no bouncing of pages due to false sharing
◦ read copies can co-exist with writers
◦ relies on write diffs; otherwise can’t reconcile concurrent writes to same page

Lazy Release Consistency Model

Do we really need to update the pages at the moment a lock is released? Suppose you never use a variable that is updated by some other process in the system. You do not need to be notified of updates to that variable.

Only fetch write diffs on acquire of a lock, and only fetch from the previous holder of that lock. Thus nothing happens at the time of a write or release.

This is called the Lazy Release Consistency model (LRC), and it is another new consistency model!

LRC hides some writes that RC reveals.

Benefit?
◦ if you don’t acquire lock on object, you don’t have to fetch updates to it
◦ if you use just some variables on a page, no need to fetch writes to others
◦ less network traffic

Conventional DSM Implementation

Sequential vs Release Consistency

Sequential consistency
◦ Every write is broadcast
◦ More message passing

Release consistency
◦ Writes are broadcast only at synchronization points
◦ More memory overhead

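Read-Write False Sharing

[Figure: timeline of two processes sharing one page. Operations shown: w(x) w(x) w(x) and r(y) r(y) r(x).]

Read-Write False Sharing

[Figure: same timeline with a synch point. Operations shown: w(x) w(x), r(y) r(y) r(x), synch.]

Write-Write False Sharing

[Figure: timeline of two processes sharing one page. Operations shown: w(x) w(x) w(x) and w(y) w(y) r(x), with a synch point.]

Multiple-Writer False Sharing

[Figure: timeline of two processes sharing one page. Operations shown: w(x) w(x) w(x) and w(y) w(y) r(x), with a synch point.]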

Example 1 (false sharing)

x and y are on the same page. (a: acquire, r: release)

M0: a1 for (…) x++ r1

M1: a2 for (…) y++ r2 a1 print x, y r1

What does IVY do?

What does TreadMarks do?
◦ M0 and M1 both get a cached writeable copy of the page
◦ when they release, each computes a diff against the original page
◦ M1’s a1 causes it to pull write diffs from the last holder of lock 1, so M1 updates x in its page.
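Example 1 can be traced with a small simulation. This is a sketch under stated assumptions: the page is 8 bytes holding two little-endian 32-bit integers (x at offset 0, y at offset 4), the loop counts 3 and 5 are arbitrary, and the helper names are made up for the illustration.

```python
import struct

PAGE_SIZE = 8            # assumed layout: x at offset 0, y at offset 4
X_OFF, Y_OFF = 0, 4


def read_int(page, off):
    return struct.unpack_from("<i", page, off)[0]


def write_int(page, off, value):
    struct.pack_into("<i", page, off, value)


def make_diff(twin, page):
    """Map offset -> new byte for each byte that changed since the twin."""
    return {i: page[i] for i in range(len(page)) if page[i] != twin[i]}


original = bytes(PAGE_SIZE)          # the page before anyone writes
m0_page = bytearray(original)        # both machines hold a writeable copy
m1_page = bytearray(original)

# M0: a1 for (...) x++ r1
for _ in range(3):
    write_int(m0_page, X_OFF, read_int(m0_page, X_OFF) + 1)
m0_diff = make_diff(original, m0_page)   # computed against the original page

# M1: a2 for (...) y++ r2
for _ in range(5):
    write_int(m1_page, Y_OFF, read_int(m1_page, Y_OFF) + 1)

# M1: a1 -> pull write diffs from the last holder of lock 1 (M0) and apply them
for off, byte in m0_diff.items():
    m1_page[off] = byte

x, y = read_int(m1_page, X_OFF), read_int(m1_page, Y_OFF)
```

M0's diff touches only the bytes of x, so applying it on M1 leaves M1's own writes to y intact. The page never has to bounce between the two machines.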

Example 2 (LRC)

x and y on same page

M0: a1 x=1 r1

M1: a2 y=1 r2

M2: a1 print x r1

What does IVY do?

What does TreadMarks do?
◦ M2 only asks the previous holder of lock 1 for write diffs
◦ M2 does not see M1’s modification to y, even though it is on the same page
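Example 2 can be modelled with a toy version of lazy release consistency. The `Lock` and `Machine` classes below are hypothetical and heavily simplified: a lock object stands in for "ask the previous holder of this lock", each machine holds at most one lock at a time, and writes are tracked per variable rather than as byte diffs.

```python
class Lock:
    """Toy lock: accumulates the writes released under it so far."""

    def __init__(self):
        self.diffs = {}          # variable name -> last value released


class Machine:
    def __init__(self, name):
        self.name = name
        self.memory = {"x": 0, "y": 0}   # local copy of the shared page
        self.pending = {}                # writes made since the last acquire

    def acquire(self, lock):
        # Lazy step: updates are pulled only here, and only for this lock.
        self.memory.update(lock.diffs)
        self.pending = {}

    def write(self, var, value):
        self.memory[var] = value
        self.pending[var] = value

    def release(self, lock):
        # Nothing is pushed to other machines at release time.
        lock.diffs.update(self.pending)


lock1, lock2 = Lock(), Lock()
m0, m1, m2 = Machine("M0"), Machine("M1"), Machine("M2")

m0.acquire(lock1); m0.write("x", 1); m0.release(lock1)   # M0: a1 x=1 r1
m1.acquire(lock2); m1.write("y", 1); m1.release(lock2)   # M1: a2 y=1 r2

m2.acquire(lock1)                                        # M2: a1 print x r1
seen = (m2.memory["x"], m2.memory["y"])
m2.release(lock1)
```

In this model M2 ends up with x = 1 but still holds the old y = 0, because it never acquired lock 2. An eager protocol would have pushed M1's write to M2 at r2.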

Discussion

Q: is LRC a win over IVY if each variable is on a separate page? (No)

Q: why is LRC a reasonably intuitive model for programmers?

It is the same as sequential consistency if the programmers always lock and unlock around shared accesses (i.e., follow the rules defined by LRC).

But non-locking code does not work, like v=f(); done=1;

Example 3 (motivate vector timestamps)

M0: a1 x=1 r1

M1: a1 a2 y=x r2 r1

M2: a2 print x, y r2

What’s the “right” answer?
◦ we need to define what LRC guarantees
◦ answer: when you acquire a lock, you see all writes by the previous holder and all writes the previous holder saw

What does TreadMarks do for example 3?

◦ M2 and M1 need to decide what M2 needs and doesn’t already have

TreadMarks uses “vector timestamps”
◦ each machine numbers its releases (i.e. write diffs)
◦ M1 tells M2: at release, had seen M0’s writes through #20, and so on for each machine:
  ◦ 0:20
  ◦ 1:25
  ◦ 2:19
  ◦ 3:36
  ◦ ……
◦ this is a “vector timestamp”
◦ M2 remembers a vector timestamp of writes it has seen
◦ M2 compares with M1’s VT to see what writes it needs from other machines.
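The comparison step can be sketched as follows. M1's vector timestamp uses the four entries listed on the slide; M2's values, and the function names `missing_intervals` and `merge`, are made up for illustration and are not taken from TreadMarks.

```python
def missing_intervals(mine, theirs):
    """For each machine, the release numbers `theirs` covers that `mine` lacks."""
    need = {}
    for machine, seen_by_them in theirs.items():
        seen_by_me = mine.get(machine, 0)
        if seen_by_them > seen_by_me:
            need[machine] = range(seen_by_me + 1, seen_by_them + 1)
    return need


def merge(mine, theirs):
    """Entry-wise maximum: what the acquirer has seen after fetching the diffs."""
    machines = mine.keys() | theirs.keys()
    return {m: max(mine.get(m, 0), theirs.get(m, 0)) for m in machines}


m1_vt = {0: 20, 1: 25, 2: 19, 3: 36}   # sent by M1 (values from the slide)
m2_vt = {0: 18, 1: 25, 2: 22, 3: 30}   # M2's own record (hypothetical values)

need = missing_intervals(m2_vt, m1_vt)
m2_vt_after = merge(m2_vt, m1_vt)
```

With these numbers M2 needs M0's releases 19 through 20 and machine 3's releases 31 through 36. It needs nothing for machines 1 and 2, where it is already at least as up to date as M1.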

Discussion

VTs order writes to same variable by different machines:
◦ M0: a1 x=1 r1 a2 y=9 r2
◦ M1: a1 x=2 r1
◦ M2: a1 a2 z = x + y r2 r1
◦ M2 is going to hear “x=1” from M0, and “x=2” from M1.
◦ How does M2 know what to do?

Could the VTs for two values of the same variable not be ordered?

M0: a1 x=1 r1

M1: a2 x=2 r2

M2: a1 a2 print x r2 r1
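Whether two writes are ordered can be checked by comparing their vector timestamps entry by entry. The sketch below uses the standard vector-clock comparison; the function names and the concrete timestamp values are assumptions chosen to match the two scenarios discussed here.

```python
def leq(a, b):
    """True if every entry of VT a is <= the matching entry of VT b."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in a.keys() | b.keys())


def happened_before(a, b):
    return leq(a, b) and not leq(b, a)


def concurrent(a, b):
    return not leq(a, b) and not leq(b, a)


# Different locks: M0 writes x=1 under lock 1, M1 writes x=2 under lock 2.
# Neither machine has seen the other's release.
x1_vt = {0: 1, 1: 0}
x2_vt = {0: 0, 1: 1}
unordered = concurrent(x1_vt, x2_vt)

# Same lock: M1 acquires lock 1 after M0's release, so its VT includes it.
x2_vt_same_lock = {0: 1, 1: 1}
ordered = happened_before(x1_vt, x2_vt_same_lock)
```

In the two-lock case neither timestamp dominates the other, so the system has no basis for calling either value of x the latest. This is why writes to the same variable need to be protected by the same lock.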

Programmer rules / system guarantees?

Programmer must lock around all writes to shared variables, to order writes to the same variable; otherwise “latest value” is not well defined

to read latest value, must lock

if no lock for read, guaranteed to see values that contributed to the variables you did lock

Example of when LRC might work too hard

M0: a2 z=99 r2 a1 x=1 r1

M1: a1 y=x r1

TreadMarks will send z to M1 because it comes before x=1 in VT order.
◦ Assuming x and z are on the same page.
◦ Even if on different pages, M1 must invalidate z’s page.

But M1 doesn’t use z

How could a system understand that z isn’t needed?
◦ Require locking of all data you read, which relaxes the causal part of the LRC model

Q: could TreadMarks work without using VM page protection?

It uses VM to
◦ detect writes, to avoid making hidden copies (for diffs) if not needed
◦ detect reads to pages, to know whether to fetch a diff

Neither is really crucial, so TreadMarks doesn’t depend on VM as much as IVY does

IVY used VM faults to decide what data has to be moved and when

TM uses acquire()/release() and diffs for that purpose

TreadMarks Implementation

Looks a lot like pthreads

Implicit message passing

Implicit process creation

Only standard Unix system calls
◦ Message passing
◦ Memory management

TreadMarks Code

Eager vs. Lazy RC

Eager RC
◦ Sends messages at release of a lock or at barriers
◦ Broadcasts messages to all nodes

Lazy RC
◦ Sends messages when locks are acquired
◦ Message goes only to the required node

Eager vs. Lazy RC

Memory Consistency

Done by creating diffs

Eager RC creates diffs at barriers

Lazy RC creates diffs at the first use of a page

Twin Creation

Diff Organization

Vector Timestamps

[Figure: timeline of three processes. p1: w(x) rel. p2: acq w(y) rel. p3: acq r(x) r(y). Vector timestamps shown: 000, 000, 000, 100, 110.]

Diff chain in Proc 4

Garbage Collection

Used to merge all diffs – recover memory

Occurs only at barriers

All nodes that have a page must have all diffs of that page.

DSM successful?

Clusters of cooperating machines are hugely successful

DSM not so much
◦ main justification is transparency for existing threaded code
◦ that's not interesting for new apps
◦ and transparency makes it hard to get high performance

MapReduce or message-passing or shared storage more common than DSM

Thank You! Any Questions?
