Scheduling-based TM Contention Management A survey talk

34
Scheduling-based TM Contention Management A survey talk 3 rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome Danny Hendler Ben-Gurion university Danny Hendler

description

Scheduling-based TM Contention Management A survey talk. Danny Hendler Ben-Gurion university. Danny Hendler. TM: a research toy?. TM. “Why STM can be more than a research toy.” Dragojevic et al., 2009. “ TM:Why it is only a research toy.” Cascaval et al., 2008. - PowerPoint PPT Presentation

Transcript of Scheduling-based TM Contention Management A survey talk

Page 1: Scheduling-based TM Contention Management A survey talk

Scheduling-based TM Contention ManagementA survey talk

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, Rome

Danny Hendler Ben-Gurion university

Danny Hendler

Page 2: Scheduling-based TM Contention Management A survey talk

TM: a research toy?

TM

“TM:Why it is only a research toy.”

Cascaval et al., 2008

“Why STM can be more than a research toy.”

Dragojevic et al., 2009

There is consensus that TM performance must be improved. Specifically:

“In some workloads, performance degraded when we used too many concurrent threads. One possible alternative to improving performance in these cases would be to modify the thread scheduler so it avoids running more concurrent threads than is optimal for a given workload, based on the

information provided by the STM runtime. “ [Dragojevic et al.]

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 3: Scheduling-based TM Contention Management A survey talk

TM scheduling: rationale

Transactional threads controlled by TM-aware schedulero Kernel-level, user-level

Richer “tool-box“ for reducing and/or preventing transaction conflicts

Improve performance under high-contention

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 4: Scheduling-based TM Contention Management A survey talk

TM system

Non TM-scheduledthreads

Contention

Manager

ContentionDetection

arbitrate

proceed

Abort/retry, wait

“Conventional” (non-scheduling) contention management

greedy

Aggressive

Karma

Polka

Suicide

Polite

“Polymorphic contention management”, [Guerraoui, Herlihy & Pochon DISC'05]

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 5: Scheduling-based TM Contention Management A survey talk

Conventional Contention Management is often problematic

Loser resumes execution after a waiting period May resume execution too early May resume execution too late

Repeated collisions occur under high contention Livelocks Performance may become worse than single lock

Scheduling-based CM to the rescue.

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 6: Scheduling-based TM Contention Management A survey talk

Talk outline

Preliminaries The first TM schedulers Later user-land work Kernel support

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 7: Scheduling-based TM Contention Management A survey talk

“Adaptive Transaction Scheduling for transactional memory systems” [Yoo & Lee, SPAA'08]

“CAR-STM: Scheduling-based collision avoidance and resolution for software transactional memory” [Dolev, Hendler & Suissa, PODC '08]

“Steal-on-abort: dynamic transaction reordering to reduce conflicts in transactional memory” [Ansari , Jarvis, Kirkham, Kotsedilis, Lujan and Watson, HiPEAC'09]

The first TM schedulers

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 8: Scheduling-based TM Contention Management A survey talk

A single scheduling queue

Per-thread Contention Intensity (CI) computed

Adaptive mechanism CI below threshold transaction begins normally CI above threshold transaction serialized (queued)

Adaptive Transaction Scheduling (ATS)(Yoo & Lee, SPAA'08)

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 9: Scheduling-based TM Contention Management A survey talk

time

Contention Intensity

ThresholdTimeline 1

Timeline 2

Timeline 3

Timeline 4

Timeline 5

Queued TransactionExecuting Transaction

Timeline flows from top to bottom

An average of all the CIs from running threads

Transactions begin execution without resorting to the scheduler

As contention starts to increase,some transactions call the scheduler

As more transactions get serialized,contention intensity starts to decrease

Contention intensity subsides below threshold

More transactions start without the scheduler to exploit more parallelism

ATS adaptively varies the number of concurrent transactions according to the dynamic contention feedback

Behavior of a Queue-Based Scheduler

ATS: adaptive parallelism control

Yoo & Lee, Transaction Scheduling.

Page 10: Scheduling-based TM Contention Management A survey talk

CAR-STM (Collision Avoidance and Resolution for STM) (Dolev, Hendler & Suissa, PODC'08)

Per-core transaction queues

Serialize conflicting transactions

Contention avoidance: attempt to avoid even first collision

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 11: Scheduling-based TM Contention Management A survey talk

CAR-STM high-level architecture

Transaction queue #1

TQ thread

TQ thread

Transaction thread

T-Info

Core #1

Serializing

contention mgr.

Dispatcher

Collision

Avoider

Core #k

Transaction queue #k

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 12: Scheduling-based TM Contention Management A survey talk

Execution time: STMBench7*

R/W dominated workloads

(*) “STMBench7: a benchmark for STM”, [Guerraoui, Kapalka & Viteck., Eurosys'07]

Page 13: Scheduling-based TM Contention Management A survey talk

Shortcomings of first TM schedulers

May restrict parallelism too much

ATS: a single serialization queue

CAR-STM: at most a single transactional thread per coreo High overheads even in the lack of contention

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 14: Scheduling-based TM Contention Management A survey talk

Talk outline

Preliminaries The first TM schedulers Later user-land work Kernel support

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 15: Scheduling-based TM Contention Management A survey talk

A low-overhead serializer

Avoid repeated collisions while minimizing over-serialization

No per-core queues

Adaptive

• “On the impact of serializing contention management on STM performance”, [Heber, Hendler & Suissa., opodis'09]

• “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10]

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 16: Scheduling-based TM Contention Management A survey talk

Transactional threads

Conditionvariables

A low-overhead serializer (cont'd)

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 17: Scheduling-based TM Contention Management A survey talk

1) t Identifies a collision

2) t calls contention manager: ABORT_OTHER

3) t changes status of t' to ABORT (writes that t is winner)

tt'

4) t' identifies it was aborted

A low-overhead serializer (cont'd)

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 18: Scheduling-based TM Contention Management A survey talk

t

t'

5) t' rolls back transaction and goes to sleep on the condition variable of t

6) Eventually t commits and broadcasts on its condition variable…

A low-overhead serializer (cont'd)

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 19: Scheduling-based TM Contention Management A survey talk

t'

A low-overhead serializer (cont'd)

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 20: Scheduling-based TM Contention Management A survey talk

A low-overhead serializer (cont'd)Stabilization mechanism

Algorithm is adaptiveo Serializing mode / “Conventional’’ mode

Prevents “mode-oscillations”:o Shifting to serialization-mode reduces perceived

contentiono Should use two thresholds

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 21: Scheduling-based TM Contention Management A survey talk

Throughput of CM conventional algorithms is low

CAR-STM over-serializes compared withlow-overhead serializer

Stabilization mechanism helps

A low-overhead serializer (cont'd)Experimental evaluation

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 22: Scheduling-based TM Contention Management A survey talk

“Shrink” – collision prevention/avoidance

Predicts future accesses based on past accesseso Read-set predicted based on past few committed/aborted TXs

(temporal locality)o Write-set predicted based on immediately preceding aborted

TX

Serialize with a probability proportional to the number of threads currently serialized (serialization affinity), if thread's success rate is low and a collision is predicted

• “Preventing versus curing: avoiding conflicts in transactional memories”, [Dragojevic, Guerraoui, Singh and Singh, PODC'09]

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 23: Scheduling-based TM Contention Management A survey talk

“Shrink” – collision prevention (cont'd)

Don't serialize when contention is low

Update statistics, release lock if you own it

Serialize only if contention is high &a collision is “predicted”

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 24: Scheduling-based TM Contention Management A survey talk

“Shrink” – experimental evaluationSTMBench7, read-write workload

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 25: Scheduling-based TM Contention Management A survey talk

Talk outline

Preliminaries The first TM schedulers Later user-land work Kernel support

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 26: Scheduling-based TM Contention Management A survey talk

Scheduling Support for Transactional Memory Contention Management

Implement CM scheduling support in the kernel scheduler (Linux & OpenSolaris) (Strict) serialization Soft serialization Time-slice extension

Different mechanisms for communication between user-level STM library and kernel scheduler

• “Scheduling support for TM contention management”, [Maldonado, Felber, Fedorova, Hendler, Lawall, Marlier, Muller & Suissa PPoPP'10]

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 27: Scheduling-based TM Contention Management A survey talk

TM Library / Kernel Communication via Shared Memory Segment (Ser-k algorithm)

User code notifies kernel on events such as: transaction start, commit and abort (in which case thread yields)

Kernel code handles moving thread between ready and blocked queues

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 28: Scheduling-based TM Contention Management A survey talk

Soft Serialization

Instead of blocking, reduce loser thread priority and yield

Efficient in scenarios where loser transactions may take a different execution path when retrying

Priority should be restored upon commit or when conflicting transactions terminate

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 29: Scheduling-based TM Contention Management A survey talk

Time-slice extention

Preemption in the midst of a transaction increases conflict “window of vulnerability”

Defer preemption of transactional threads avoid CPU monopolization by bounding number of

extensions and yielding after commit

May be combined with serialization/soft serialization

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 30: Scheduling-based TM Contention Management A survey talk

Evaluation (STMBench7, 16-core AMD

Opterom)

Conventional CM deteriorates when

threads>cores

Serializing by local spinning is efficient as long as threads ≤ cores

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 31: Scheduling-based TM Contention Management A survey talk

Evaluation - STMBench7 throughput

Serializing by sleeping on condition var is best when threads>cores, since system call overhead is negligible

(long transactions)All strict serialization schemes significantly reduce aborts

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 32: Scheduling-based TM Contention Management A survey talk

“Transactional scheduling for read-dominated workloads” [Attiya & Milani, OPODIS'09]

“Taking the heat off transactions: dynamic selection of pessimistic concurrency control [Sonmez, Harris, Cristal, Unsal & Valeo, IPDPS'09]

“Proactive transaction scheduling for contention management”[Blake, Dreslinky & Mudge, MICRO'09]

“Improving performance by reducing aborts in HTM”[Ansari, Khan, Lujan, Kotselidis, Kirkham and Watson, HIPEAC'10]

“Window-based greedy contention management for TM”[Sharma, Estrade & Busch, DC'10]

“On Transaction Scheduling in distributed TM systems][Kim & Ravindran, 2010]

“Kernel-assisted Scheduling and Deadline Support for STM”[Maldonado, Marlier, Felber, Lawall, Muller & Riviere, DSN'11]

“Adaptive thread scheduling techniques for improving scalability of STM”[Chan, Lam & Wang, 2011]

Additional TM scheduling work

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 33: Scheduling-based TM Contention Management A survey talk

Scheduling support for TMConclusions & future work

Scheduling-based CM results in improved throughput under high contentiono Overhead is negligible when contention is low

Lightweight kernel support can improve performance and efficiency for some workloads

Dynamically selecting best CM algorithm for workload at hand is a challenging research directiono Machine learning?

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler

Page 34: Scheduling-based TM Contention Management A survey talk

Thank you.

3rd workshop on the Theory of Transactional Memory, Sep 22-23, 2011, RomeDanny Hendler