Formalisms and Verification for Transactional Memories Vasu Singh EPFL Switzerland.

Post on 20-Dec-2015

Formalisms and Verification for Transactional Memories

Vasu Singh

EPFL Switzerland

Part 1:

Verification for Pure Transactional Programs

Part 2:

Formalisms for Mixed Transactional Programs

(Parametrized Opacity)

Pure Transactional Programs

• All operations within transactions

• No non-transactional operations

Interaction

Pure Transactional Program

Transactional Memory Algorithm

Hardware

Memory operations may be reordered by the hardware

Relaxed memory models

• For reasons of performance, hardware may transform the sequence of instructions of a thread

• Fences enforce ordering under relaxed memory models, but they carry a high performance overhead
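The effect of such reordering can be seen on a classic two-thread "store buffering" test, which is not taken from the slides. The sketch below enumerates all interleavings under SC, and then again under a simplified model that lets each thread swap its write with the following read of a different variable (the relaxation TSO permits). Modelling the relaxation as a per-thread instruction swap is an assumption made for brevity, and all names are illustrative.

```python
from itertools import combinations

def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that keeps each thread's own order."""
    n, m = len(t1), len(t2)
    for pos in combinations(range(n + m), n):
        pos = set(pos)
        i1, i2 = iter(t1), iter(t2)
        yield [next(i1) if k in pos else next(i2) for k in range(n + m)]

def run(schedule):
    """Execute one schedule on a shared memory initialised to 0."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, a, b in schedule:
        if kind == "store":      # ("store", variable, value)
            mem[a] = b
        else:                    # ("load", register, variable)
            regs[a] = mem[b]
    return (regs["r1"], regs["r2"])

def outcomes(orders1, orders2):
    """All (r1, r2) results over the allowed per-thread instruction orders."""
    return {run(s)
            for t1 in orders1
            for t2 in orders2
            for s in interleavings(t1, t2)}

T1 = [("store", "x", 1), ("load", "r1", "y")]
T2 = [("store", "y", 1), ("load", "r2", "x")]

sc = outcomes([T1], [T2])                              # program order only
relaxed = outcomes([T1, T1[::-1]], [T2, T2[::-1]])     # write/read may swap

print(sorted(sc))
print(sorted(relaxed - sc))
```

Under SC the test has three outcomes; the relaxed model adds r1 = 0, r2 = 0, which is the outcome a fence between the write and the read would rule out.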

Verification Problem

Does a given TM algorithm guarantee atomicity for every transactional program under a given memory model?

How do we verify?

1. Formalize common memory models

2. Capture the behavior of STM algorithms under relaxed memory models

3. Build a specification (of, say, opacity) at hardware-level atomicity

4. Implement a tool to check the correctness of an STM algorithm using the spec

Relaxed Memory Language

• A new language to write concurrent programs under relaxed memory models

• Syntax: Statements execute atomically in hardware

• Semantics: Parametrized by the memory model M

• We express TM algorithms in RML

Homework slide on RM 101

A := 1; B := 1; r1 := D; r3 := C; A := 2

C := 1; D := 1; r2 := B; r4 := A; C := 2

How many possible valuations for r1, r2, r3, r4?

On SC? 7

On TSO? 1 more

On PSO? 7 more

On RMO? 1 more

Manually: a few minutes, at least!

RML: less than a second on a dual-core 2.8 GHz machine

Our Tool

FOIL

Inputs:

• RML description of an STM algorithm A

• Memory model M

Possible outputs:

• A is correct under M

• A is correct under M with fences at …

• A is not correct under SC

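Our Tool

Inputs: RML description of an STM algorithm A, memory model M, and Spec

1. Compute L(A,M)

2. L(A,M) subset of Spec?

– YES: A is correct under M

– NO: go to step 3

3. L(A,SC) subset of Spec?

– NO: A is not correct under SC

– YES: add fence to A, then check again under M

4. After adding a fence:

– YES: A is correct under M with fences at …

– NO: add another fence to A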
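The decision flow of the tool can be sketched as a small driver loop. This is a minimal illustration under stated assumptions, not FOIL's implementation: `is_correct` stands in for the check "L(A,M) subset of Spec?", `add_fence` for whatever fence-placement strategy the tool uses (the slides do not describe it), and `max_fences` is an arbitrary bound. The toy algorithm, a set of fence positions, is entirely made up.

```python
def verify_with_fences(algorithm, model, is_correct, add_fence, max_fences=16):
    """Hypothetical driver: returns (verdict, algorithm).

    is_correct(algorithm, model) -> bool  plays the role of the inclusion check.
    add_fence(algorithm) -> new algorithm, or None when no placement is left.
    """
    if is_correct(algorithm, model):
        return ("correct", algorithm)
    if not is_correct(algorithm, "SC"):
        # Fences cannot repair an algorithm that is already wrong under SC.
        return ("incorrect under SC", algorithm)
    for _ in range(max_fences):
        algorithm = add_fence(algorithm)
        if algorithm is None:
            break
        if is_correct(algorithm, model):
            return ("correct with fences", algorithm)
    return ("gave up", algorithm)


# Toy stand-ins: an "algorithm" is just the set of positions that hold a fence.
REQUIRED = {"PSO": {2}, "RMO": {2, 5}}   # made-up positions, for illustration

def toy_is_correct(fences, model):
    return REQUIRED.get(model, set()) <= fences

def toy_add_fence(fences):
    candidates = [p for p in range(8) if p not in fences]
    return fences | {candidates[0]} if candidates else None


verdict, fenced = verify_with_fences(frozenset(), "RMO",
                                     toy_is_correct, toy_add_fence)
print(verdict, sorted(fenced))
```

The toy strategy adds fences in position order, so it does not look for a small fence set; a real tool would presumably be more selective.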

Our experiments

• Wrote DSTM, TL2, and McRT STM in RML without fences

• Found the STM algorithms correct under SC and TSO

• FOIL places the fences required for correctness under the more relaxed PSO and RMO

• The set of inserted fences matches those in the official implementation for TL2

Mixed Transactional Programs

• No formal framework

• We try to define one

Mixed Transactional Programs

Transactional Memory Algorithm

Hardware

Non-transactional interaction

Mixed Transactional Program

atomic { x := 1; x := 2 }

r1 := x

A strong correctness property

Strong atomicity / Strong isolation:

Transactions are isolated from other transactions and non-transactional operations

A Common Quote

• “Strong atomicity is expensive to achieve”

• Questions:

– What is strong atomicity precisely?

– How expensive?

Specifying correctness

Strong Atomicity

• Precise part:

– Transactions are isolated from other transactions, and also from non-transactional operations

• Ambiguous part:

– What is the interaction between non-transactional operations?

– Two definitions:

  • Every non-transactional operation executes as a transaction (Larus et al.)

  • Non-transactional operations execute according to a relaxed memory model (Martin et al.)

Precise part 1

atomic { x := 1; x := 2 }

r1 := x

Two possibilities for r1, allowed by both Larus and Martin:

• r1 = 0

• r1 = 2

Precise part 2

atomic { x := 1; r1 := x }

x := 2

Only possibility by both Larus and Martin: r1 = 1

Ambiguous part 1

atomic { x := 1; y := 1 }

r1 := y; r2 := x

Allowed by both Larus and Martin:

• r1 = 0, r2 = 0

• r1 = 0, r2 = 1

• r1 = 1, r2 = 1

Allowed only by Martin:

• r1 = 1, r2 = 0
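The split between the two definitions can be reproduced by brute force. The sketch below treats the transaction as a single atomic step and places it before, between, or after the two non-transactional reads. Running the reads in program order is used here as a stand-in for the Larus-style definition, and additionally allowing the two reads to be swapped as a stand-in for a relaxed memory model; both are simplifying assumptions, and the names are illustrative.

```python
def outcomes(read_orders):
    """Results (r1, r2) of  r1 := y; r2 := x  against  atomic { x := 1; y := 1 }."""
    results = set()
    for order in read_orders:                    # order in which the reads take effect
        for tx_pos in range(len(order) + 1):     # transaction runs before read number tx_pos
            mem = {"x": 0, "y": 0}
            regs = {}
            for i, (reg, var) in enumerate(order):
                if i == tx_pos:
                    mem.update(x=1, y=1)         # the whole transaction, atomically
                regs[reg] = mem[var]
            results.add((regs["r1"], regs["r2"]))
    return results

IN_ORDER = [("r1", "y"), ("r2", "x")]
REORDERED = [("r2", "x"), ("r1", "y")]

both = outcomes([IN_ORDER])
martin_only = outcomes([IN_ORDER, REORDERED]) - both

print(sorted(both))
print(sorted(martin_only))
```

The only extra outcome once the reads may be reordered is r1 = 1, r2 = 0, matching the case marked as allowed only by Martin.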

Ambiguous part 2

atomic { x := 1 }

x := 2

r1 := x

Can r1 be 42?

Depends on the memory model!

If the memory model allows out-of-thin-air values, r1 can be 42.

Parametrized Opacity

Motivation

• Separate the concerns:

– Memory model “contract” for non-tx

– Strong atomicity for tx

Intuition

• Opacity for transactions

• Isolation of transactions from non-transactional operations

• Non-transactional operations respect the memory model

How Expensive: Complexity Analysis

Basically …

• We want to know what is so expensive about strong atomicity

– Does it require non-transactional operations to perform a long sequence of operations?

– Does it require non-transactional operations to wait indefinitely for transactions to finish?

– Is it impossible to achieve?

Uninstrumented TM

• Let us study these first

• No overhead for non-tx operations

• What can we achieve?

• Under what conditions?

Classes of memory models

• Four classes based on restriction of reorderings:

– RR: does not allow reordering two read instructions to different variables

– RW: does not allow reordering a read followed by a write to a different variable

– WR: does not allow reordering a write followed by a read to a different variable

– WW: …

Examples

• SC: RR, RW, WR, WW

• PSO: RR, RW

• RMO: RR_d, RW_d

• Java: RW_d, RR_d

• Alpha: RW_d

NULL memory model

• Every pair of operations can be reordered

• NULL not in { RR, RW, WR, WW }

• Even the most relaxed memory models enforce an order between a load and a dependent store

• NULL memory model is not practical

Results

• Parametrized opacity can be obtained with uninstrumented TMs only under NULL memory model

• For every non-NULL memory model, it is impossible to achieve parametrized opacity without instrumentation
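The classification and the result above can be written down as a small lookup table. The sketch below copies the class memberships from the Examples slide; reading the `_d` suffix as "the restriction applies only to dependent pairs" is an assumption based on the remark about loads and dependent stores, and the function names are hypothetical.

```python
# Restrictions each memory model enforces, as listed on the Examples slide.
RESTRICTIONS = {
    "SC":    {"RR", "RW", "WR", "WW"},
    "PSO":   {"RR", "RW"},
    "RMO":   {"RR_d", "RW_d"},
    "Java":  {"RW_d", "RR_d"},
    "Alpha": {"RW_d"},
    "NULL":  set(),            # every pair of operations can be reordered
}

def may_reorder(model, first, second, dependent=False):
    """True if `model` lets `first` ('R' or 'W') be reordered with a later
    `second` that accesses a different variable.
    Assumption: an 'XY_d' entry restricts only dependent pairs."""
    kind = first + second
    held = RESTRICTIONS[model]
    if kind in held:
        return False
    if dependent and kind + "_d" in held:
        return False
    return True

def uninstrumented_tm_suffices(model):
    """Result from the slides: only the NULL model admits an uninstrumented TM."""
    return not RESTRICTIONS[model]

for name in RESTRICTIONS:
    print(name, uninstrumented_tm_suffices(name))
```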

Proof idea

1. Assume the memory model restricts the order of two instructions

2. Create a counterexample history that is not opaque parametrized by this memory model

3. Do this for every possible restriction (RR, RW, WR, WW)

What about some instrumentation?

Instrumented TM

• Change the semantics of non-transactional operations

• Reads are no longer just loads and writes are no longer just stores

• Used to make non-tx operations tx-aware

• Example:

– Non-transactional writes are required to hold the lock that is used by transactional writes before performing a write

Example of instrumented TM [Shpeisman et al., PLDI 07]

• Every object accessed in a transaction has a tx record

• A tx record is in one of the states: shared, exclusive, private, exclusive anonymous

• Tx operations as in common TMs

• Non-tx read: check that no tx write interferes

• Non-tx write: get exclusive access to the tx record

Expensive!

Instrumented TMs

• The instrumented TM guarantees parametrized opacity with respect to SC!

• You saw it was “expensive”: a non-tx read or write may wait indefinitely for a tx to complete

Can we do better?

In other words…

• Can we provide parametrized opacity for some memory models with constant-time instrumentation for non-tx accesses ?

• Or even better, can we make do with constant-time instrumentation for non-tx writes only (uninstrumented reads)?

Instrumented TMs

Let M be a memory model which relaxes read/write to read order (e.g. RMO, Java, and Alpha)

Theorem: It is possible to obtain opacity parametrized by M with constant-time instrumentation for writes and no instrumentation for reads.

Intuition of the construction

• Transactions:

– Use a global lock from start to finish

– Use CAS to store the updates to memory on commit

• Non-transactional reads: just a load

• Non-transactional writes:

– Maintain a per-process version number (vp)

– Increment vp before every write

– Store the pair <vp, value> in the variable (this ensures that the CAS operations in the transactions do not suffer from the ABA problem)
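The construction outlined above can be sketched as follows. This is an illustration under several assumptions, not the construction from the paper: Python has no hardware CAS, so a small lock inside `Cell` stands in for the atomicity of a single store or CAS; the tag stored with each value is `(pid, vp)`, where including the process id is an addition of this sketch; and when a commit-time CAS fails the sketch simply leaves the non-transactional value in place, a case the slides do not spell out. All class and method names are hypothetical.

```python
import threading

class Cell:
    """Shared variable holding a (tag, value) pair."""
    def __init__(self, value=0):
        self._pair = (None, value)
        self._hw = threading.Lock()    # stands in for hardware store/CAS atomicity

    def load(self):
        return self._pair

    def store(self, pair):
        with self._hw:
            self._pair = pair

    def cas(self, expected, new):
        with self._hw:
            if self._pair != expected:
                return False
            self._pair = new
            return True


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.vp = 0                    # per-process version number

    def nontx_read(self, cell):
        return cell.load()[1]          # uninstrumented: just a load

    def nontx_write(self, cell, value):
        self.vp += 1                   # increment vp before every write
        cell.store(((self.pid, self.vp), value))


GLOBAL_LOCK = threading.Lock()

class Transaction:
    def __enter__(self):
        GLOBAL_LOCK.acquire()          # held from start to finish
        self.seen = {}                 # cell -> pair observed at first access
        self.writes = {}               # cell -> buffered new value
        return self

    def read(self, cell):
        if cell in self.writes:
            return self.writes[cell]
        return self.seen.setdefault(cell, cell.load())[1]

    def write(self, cell, value):
        self.seen.setdefault(cell, cell.load())
        self.writes[cell] = value

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                for cell, value in self.writes.items():
                    old = self.seen[cell]
                    # Fails if any non-tx write intervened, even one that
                    # restored the old value, because its tag is fresh.
                    cell.cas(old, (old[0], value))
        finally:
            GLOBAL_LOCK.release()
        return False


x = Cell(0)
p = Process(pid=1)

with Transaction() as t:
    t.write(x, t.read(x) + 1)

with Transaction() as t:
    v = t.read(x)
    p.nontx_write(x, 7)
    p.nontx_write(x, v)                # same value as before, new version
    t.write(x, v + 1)

print(x.load())
```

In the second transaction the variable holds the same value at commit as when it was read, yet the CAS fails because the version tag changed, which is the ABA case the versioned writes are meant to catch. Non-transactional writes do a constant amount of work and never wait for the global lock.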

What does this tell us?

• Existing TM implementations for strong atomicity promise “too much”: they even guarantee ordering of non-transactional accesses

• This “too much” is the reason for poor performance

Putting Tim’s talk in perspective

• + Worry about programming language constructs: we care about relaxed memory models

• + Define intuitive correctness properties for programmers: we formalize strong atomicity

• - Need to define semantics/properties at the level of TM implementations: to know what can be implemented efficiently and what cannot be

An Analogy

• Java memory model is weak: programmers use synchronization to ensure sequentially consistent behavior (DRF property)

• Similarly, let opacity parametrized by Java be a “weak” notion. Let explicit synchronization guarantee opacity parametrized by SC

Putting Tim’s talk in perspective

• +- Let TM implementations not promise everything needed for the programmer:

  + we let a TM implementation guarantee opacity with respect to weaker memory models, and let program-level synchronization take care of the rest

  - we keep the guarantee of transactions strong (opacity)

Conclusion

• TM implementations should not enforce order of non-transactional operations

• Non-transactional operations should still satisfy just the “contract” of the memory model

• We can hope to use these relaxations to make TM implementations with strong isolation guarantees faster

Papers

• Part 1:

“STM on Relaxed Memory Models” [with Guerraoui, Henzinger]

• Part 2:

“Transactions in the Jungle” [with Guerraoui, Henzinger, Kapalka]

Questions ?