Locality-Conscious Lock-Free Linked Lists Anastasia Braginsky & Erez Petrank 1.


Locality-Conscious Lock-Free Linked Lists

Anastasia Braginsky & Erez Petrank


Lock-Free Locality-Conscious Linked Lists

List of constant-size "containers", with minimal and maximal bounds on the number of elements in a container

Traverse the list quickly to the relevant container

Lock-free, locality-conscious, fast access, scalable

[Figure: example sorted key sequence 3, 7, 9, 12, 18, 25, 26, 31, 40, 52, 63, 77, 89, 92]

Non-blocking Algorithms

Ensure progress in a finite number of steps.

A non-blocking algorithm is:

◦wait-free if per-thread progress is guaranteed in a bounded number of steps

◦lock-free if system-wide progress is guaranteed in a bounded number of steps

◦obstruction-free if a single thread executing in isolation for a bounded number of steps will make progress.
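
To make the lock-free definition concrete, here is a minimal sketch of a CAS retry loop on a shared counter. It is not taken from the slides: `AtomicInt` and `lock_free_increment` are hypothetical names, and the internal lock only stands in for the atomicity of a hardware compare-and-swap, which Python does not expose.

```python
import threading

class AtomicInt:
    """Simulated atomic word (assumption: the lock models a hardware CAS)."""
    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        # Atomically replace the value only if it still equals `expected`.
        with self._guard:
            if self._value != expected:
                return False
            self._value = new
            return True

def lock_free_increment(counter):
    # A failed CAS means another thread's CAS succeeded in the meantime,
    # so the system as a whole made progress even though this thread retries.
    while True:
        seen = counter.load()
        if counter.compare_and_swap(seen, seen + 1):
            return seen + 1

def worker(counter, rounds):
    for _ in range(rounds):
        lock_free_increment(counter)

counter = AtomicInt()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.load())  # 4 threads x 1000 increments
```

An individual thread may retry an unbounded number of times here, which is why such a loop is lock-free but not wait-free.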


Existing Lock-Free List Designs

J. D. VALOIS, Lock-free linked lists using compare-and-swap, in Proc. PODC, 1995.

T. L. HARRIS, A pragmatic implementation of non-blocking linked-lists, in DISC 2001.

M. M. MICHAEL, Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects, IEEE Transactions on Parallel and Distributed Systems, 2004.

M. FOMITCHEV and E. RUPPERT, Lock-free linked lists and skip lists, in Proc. PODC, 2004.


Outline

Introduction

A list of memory chunks

Design of in-chunk list

Merges & Splits via freezing

Empirical results

Summary


The List Structure

A list consists of

◦A list of memory chunks

◦A list in each chunk (chunk implementation)

When a chunk gets too sparse or dense, the update operations on the list are stopped and the chunk is split or merged with its preceding chunk.
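
The two-level structure can be sketched as follows: a search first skips whole chunks, then looks inside the one relevant chunk. This is a simplified single-threaded model, not the authors' code. The `Chunk` class, its `min_key` field, and the use of a dict in place of the in-chunk list are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Chunk:
    min_key: float                                  # hypothetical: smallest key this chunk may hold
    entries: dict = field(default_factory=dict)     # key -> data, stands in for the in-chunk list
    next_chunk: Optional["Chunk"] = None

def find_chunk(head, key):
    """Skip over chunks until reaching the one whose key range may contain `key`."""
    chunk = head
    while chunk.next_chunk is not None and chunk.next_chunk.min_key <= key:
        chunk = chunk.next_chunk
    return chunk

def search(head, key):
    """Return the data stored under `key`, or None if the key is absent."""
    return find_chunk(head, key).entries.get(key)

# Two chunks holding keys 3, 14 and 25, 67, 89.
chunk_b = Chunk(25, {25: "A", 67: "D", 89: "M"})
chunk_a = Chunk(float("-inf"), {3: "G", 14: "K"}, chunk_b)
print(search(chunk_a, 67), search(chunk_a, 14), search(chunk_a, 20))
```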


An Example of a List of Fixed-Sized Memory Chunks

[Figure: HEAD → Chunk A (Key 3: G, Key 14: K) → Chunk B (Key 25: A, Key 67: D, Key 89: M) → NULL. Each chunk has an EntriesHead pointer to its in-chunk list and a NextChunk pointer to the following chunk.]

When No More Space for Insertion

[Figure: Chunk A holds keys 3, 6, 9, and 14 and has no free entry left for inserting Key 12 (Data: H), so Chunk A is frozen. Chunk B (keys 25, 67, 89) follows it in the list.]

Split

[Figure: the entries of the frozen Chunk A, together with the pending Key 12 (Data: H), end up in two new chunks: Chunk C (keys 3, 6, 9) and Chunk D (keys 12, 14).]


When a Chunk Gets Sparse

[Figure: HEAD → Chunk C (keys 3, 6, 9) → Chunk D (key 14) → Chunk B (keys 25, 67, 89) → NULL. Chunk D has become sparse, so Chunks C and D are frozen together, one as freeze master and the other as freeze slave.]

Merge

[Figure: the entries of the frozen Chunks C (keys 3, 6, 9) and D (key 14) are copied into a new Chunk E (keys 3, 6, 9, 14), which precedes Chunk B (keys 25, 67, 89).]


A List of Fixed-Sized Memory Chunks

[Figure: HEAD → Chunk A (Key 3: G, Key 14: K) → Chunk B (Key 25: A, Key 67: D, Key 89: M) → NULL, with an EntriesHead and a NextChunk pointer in each chunk.]

The Structure of an Entry

An entry takes 2 machine words:

◦KeyData word: Data (32 bit), Key (31 bit), Freeze bit

◦NextEntry word: Next entry pointer (62 bit), Delete bit, Freeze bit

Freeze bit: to mark chunk entries frozen.

A ⊥ (bottom) value is not allowed as a key value. It means that the entry is not allocated.
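
As an illustration of the two-word layout, the sketch below packs and unpacks both words with plain integer arithmetic. The field widths come from the slide, but the bit positions within each word, the encoding of ⊥ as key 0, and all function names are assumptions.

```python
KEY_BITS, DATA_BITS, NEXT_BITS = 31, 32, 62
BOTTOM = 0  # assumption: the reserved key value is encoded as 0, so a zeroed word is unallocated

def pack_key_data(key, data, frozen=False):
    """KeyData word: data in the high 32 bits, then the 31-bit key, freeze bit lowest."""
    assert BOTTOM < key < (1 << KEY_BITS)
    assert 0 <= data < (1 << DATA_BITS)
    return (data << (KEY_BITS + 1)) | (key << 1) | int(frozen)

def unpack_key_data(word):
    frozen = bool(word & 1)
    key = (word >> 1) & ((1 << KEY_BITS) - 1)
    data = word >> (KEY_BITS + 1)
    return key, data, frozen

def pack_next(index, deleted=False, frozen=False):
    """NextEntry word: 62-bit next-entry reference, then the delete bit, then the freeze bit."""
    assert 0 <= index < (1 << NEXT_BITS)
    return (index << 2) | (int(deleted) << 1) | int(frozen)

def unpack_next(word):
    return word >> 2, bool(word & 2), bool(word & 1)

word = pack_key_data(key=14, data=9)
print(unpack_key_data(word))
```

Keeping each word at 64 bits is what allows a single CAS to change a field and check the freeze bit at the same time.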


The Structure of a Chunk

Each chunk holds:

◦Head: a dummy entry

◦Counter (4 in the example)

◦NextChunk pointer

◦new pointer

◦MergeBuddy pointer

◦Freeze State (2 bits)

◦An array of entries of size MAX

[Figure: an example entries array. Live entries: Key 7 (Data 89), Key 11 (Data 13), Key 14 (Data 9), Key 22 (Data 13). Entries with Deleted bit 1: Key 23 (Data 53), Key 24 (Data 78). Unallocated entries have Key ⊥.]

Initiating a Freeze

When a process p realizes that

◦A chunk is full, or

◦A chunk is sparse, or

◦A chunk is in the process of being frozen,

then p starts a freeze or helps another process that has already started one.


The Freeze Process Starts by:

Going over all the entries in the array and setting their freeze bit

Finishing

◦insertions of all currently allocated entries that are not yet in the list

◦deletions of entries already marked as deleted but still in the list
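
The first step, setting the freeze bit on every entry, might look like the sketch below. It is a single-threaded model: `Word.cas` stands in for an atomic compare-and-swap, the freeze bit is assumed to be the lowest bit of each word, and `mark_chunk_frozen` is a hypothetical name.

```python
FREEZE_BIT = 1  # assumption: the freeze bit is the lowest bit of each word

class Word:
    """Stand-in for one atomic machine word."""
    def __init__(self, value=0):
        self.value = value

    def cas(self, expected, new):
        if self.value != expected:
            return False
        self.value = new
        return True

def mark_chunk_frozen(words):
    """Set the freeze bit of every word, retrying a word if a concurrent update wins."""
    for word in words:
        while True:
            seen = word.value
            if seen & FREEZE_BIT:
                break                      # already frozen, possibly by a helper
            if word.cas(seen, seen | FREEZE_BIT):
                break                      # frozen with its current content preserved

words = [Word(0), Word(0b1010), Word(0b0111)]
mark_chunk_frozen(words)
print([bin(w.value) for w in words])
```

Because the bit is set with a CAS on the whole word, several helping threads can run the same loop and the content of each word is preserved.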


Chunk List is Different from Known Lock-Free Linked Lists

Non-private insertion: entry is visible when allocated, even before linking to the list.

Allow help with insertion.

Boundary conditions causing merges and splits.


Entry Allocation

1. Entry is allocated at the beginning of the insertion process.

2. Find a zeroed entry, with ⊥ key value.

3. Allocate by swapping (CAS) the KeyData word to the desired value.

◦ Upon a failure of the CAS command, goto 2.

◦ A frozen entry cannot be allocated.

4. If no entry is found, a freeze starts.

Next, use the allocated entry for list insertion…

[Figure: entries array (k = key, d = data, f = freeze bit): (k:3, d:9, f:1), (k:4, d:2, f:1), (k:8, d:5, f:0), (k:⊥, d:0, f:1), (k:⊥, d:0, f:0).]

[Figure, next frame: the free, unfrozen entry (k:⊥, d:0, f:0) has been allocated and now holds (k:6, d:2, f:0).]
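
The allocation steps can be sketched as a scan that tries to CAS a fully zeroed KeyData word to the desired value. This is a single-threaded model with assumed names (`Word`, `pack`, `allocate_entry`) and an assumed bit layout. Because the expected value is the all-zero word, an unallocated entry whose freeze bit is already set can never be claimed.

```python
def pack(key, data, frozen=0):
    # Assumed layout: data in the high bits, then the key, freeze bit lowest.
    return (data << 32) | (key << 1) | frozen

class Word:
    """Stand-in for one atomic machine word."""
    def __init__(self, value=0):
        self.value = value

    def cas(self, expected, new):
        if self.value != expected:
            return False
        self.value = new
        return True

def allocate_entry(entries, key, data):
    """Return the index of the claimed entry, or None if the chunk has no free entry."""
    wanted = pack(key, data)
    for index, word in enumerate(entries):
        # A failed CAS means someone else took or froze this entry: keep scanning.
        if word.value == 0 and word.cas(0, wanted):
            return index
    return None  # no entry found: the caller should start a freeze

# The array from the slide: three used entries, one frozen free entry, one free entry.
entries = [Word(pack(3, 9, 1)), Word(pack(4, 2, 1)), Word(pack(8, 5)), Word(1), Word(0)]
print(allocate_entry(entries, 6, 2))   # claims the last entry
print(allocate_entry(entries, 7, 1))   # nothing left to claim
```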


Insertion Algorithm

1. Record entry’s next pointer value in savedNext.

2. Find a location for adding the new entry.

◦ If the key already exists (in a different entry), free the allocated entry by clearing it and return.

3. CAS entry’s next pointer from savedNext to the next entry in the list.

4. CAS previous entry’s next pointer to the newly allocated entry.

◦ If any CAS fails, goto 1 (restarting from the beginning of a chunk).

5. Increase the counter and return.

[Figure: the newly allocated entry (k:6, d:2, f:0) is linked between its previous entry (k:4) and its next entry (k:8).]
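
The five steps can be followed in this simplified sketch of an in-chunk sorted list. It runs single-threaded, `cas_next` merely imitates an atomic CAS on the next pointer, and the freeze and delete bits are left out. The `Entry`, `ChunkList`, and `cas_next` names are assumptions, not the authors' identifiers.

```python
class Entry:
    def __init__(self, key=None, data=None):
        self.key, self.data, self.next = key, data, None

def cas_next(entry, expected, new):
    # Stand-in for an atomic compare-and-swap on the NextEntry word.
    if entry.next is not expected:
        return False
    entry.next = new
    return True

class ChunkList:
    def __init__(self):
        self.head = Entry()   # dummy head entry
        self.counter = 0

    def insert(self, entry):
        while True:                                        # "goto 1" on any failure
            saved_next = entry.next                        # step 1
            prev, cur = self.head, self.head.next          # step 2: find the location
            while cur is not None and cur.key < entry.key:
                prev, cur = cur, cur.next
            if cur is entry:
                return True                                # already linked, e.g. by a helper
            if cur is not None and cur.key == entry.key:
                entry.key = entry.data = None              # key exists: clear the entry
                return False
            if not cas_next(entry, saved_next, cur):       # step 3
                continue
            if not cas_next(prev, cur, entry):             # step 4
                continue
            self.counter += 1                              # step 5
            return True

    def keys(self):
        out, cur = [], self.head.next
        while cur is not None:
            out.append(cur.key)
            cur = cur.next
        return out

lst = ChunkList()
for k in (3, 4, 8):
    lst.insert(Entry(k, 0))
lst.insert(Entry(6, 2))
print(lst.keys(), lst.counter)
```

Restarting the search from the chunk's dummy head bounds the cost of a failed CAS by the chunk size rather than by the length of the whole list.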


Deletion

Standard implementation, except for taking care not to drop below the minimum number of entries

Counter always holds a lower bound on the actual number of entries.

◦ increased after actual insert

◦ decreased before actual delete

Decrementing the counter below the minimum allowed number initiates a freeze

A frozen entry cannot be marked as deleted
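
One way to read the counter rule is that a deletion first reserves its decrement and only then unlinks the entry. The sketch below shows this under stated assumptions: `MIN_ENTRIES` is an arbitrary example value, `Counter.cas` imitates an atomic CAS in a single-threaded model, and `try_reserve_delete` is a hypothetical helper.

```python
MIN_ENTRIES = 2  # hypothetical minimum, chosen only for this example

class Counter:
    """Stand-in for the chunk's atomic entry counter."""
    def __init__(self, value):
        self.value = value

    def cas(self, expected, new):
        if self.value != expected:
            return False
        self.value = new
        return True

def try_reserve_delete(counter):
    """Decrement before the actual delete; refuse if the chunk would get too sparse."""
    while True:
        seen = counter.value
        if seen - 1 < MIN_ENTRIES:
            return False          # the caller must initiate a freeze instead
        if counter.cas(seen, seen - 1):
            return True           # safe to mark the entry deleted and unlink it

counter = Counter(3)
print(try_reserve_delete(counter), counter.value)
print(try_reserve_delete(counter), counter.value)
```

Since insertions increment only after linking and deletions decrement before unlinking, the counter never exceeds the real number of entries, so a passed check is always conservative.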


Freezing

Phase I: Marking entries with frozen bits

◦Non-frozen entries can still change concurrently

Phase II: List stabilization

◦Everything is frozen; now finish all incomplete operations.

Phase III: Decision

◦Split, merge, or copy.

Phase IV: Recovery

◦Implementation of the above decision
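
The Phase III decision can be pictured as a function of how many live entries the stabilized chunk holds. The slides do not give the exact thresholds, so the rule below and the `MIN_ENTRIES` and `MAX_ENTRIES` values are assumptions, chosen to agree with the split and merge examples shown earlier.

```python
MIN_ENTRIES, MAX_ENTRIES = 2, 4  # hypothetical bounds for the example

def decide(live_entries):
    """Pick the recovery action for a frozen, stabilized chunk (assumed rule)."""
    if live_entries >= MAX_ENTRIES:
        return "split"   # chunk is full: distribute entries over two new chunks
    if live_entries < MIN_ENTRIES:
        return "merge"   # chunk is sparse: combine with a neighboring chunk
    return "copy"        # within bounds: copy into one fresh, unfrozen chunk

print(decide(4), decide(1), decide(3))
```

The "copy" case matters because a chunk can never be unfrozen: a chunk frozen while still within bounds has to be replaced by a fresh copy.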


Phase IV - Recovery

Allocate new chunk or chunks locally

Copy the frozen data to the new chunk

Execute the operation that initially caused the freeze

Attach the new chunk to the frozen one

Replace frozen chunk(s) with new chunk(s) in the entire List’s data structure
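
The copy step of recovery can be illustrated with plain dictionaries standing in for frozen chunks. This is only a sketch: the split point, the function names, and folding the pending operation into the copy (the slide lists it as a separate step) are assumptions. It omits the CAS-based attachment of the new chunks to the list.

```python
def split(frozen_entries, pending=None):
    """Copy a frozen chunk's entries, plus an optional pending insert, into two new chunks."""
    items = dict(frozen_entries)
    if pending is not None:
        key, data = pending
        items[key] = data                    # the operation that caused the freeze
    ordered = sorted(items.items())
    middle = (len(ordered) + 1) // 2         # assumed split point: first chunk gets the extra entry
    return dict(ordered[:middle]), dict(ordered[middle:])

def merge(first, second):
    """Copy the entries of two frozen neighboring chunks into one new chunk."""
    return dict(sorted({**first, **second}.items()))

# Frozen chunk with keys 3, 6, 9, 14 and a pending insert of key 12.
chunk_c, chunk_d = split({3: "G", 6: "B", 9: "C", 14: "K"}, pending=(12, "H"))
print(list(chunk_c), list(chunk_d))

# Later, key 12 is deleted and the sparse chunk is merged with its neighbor.
chunk_e = merge({3: "G", 6: "B", 9: "C"}, {14: "K"})
print(list(chunk_e))
```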


Remarks

Search can run on a frozen chunk (and is not delayed).

◦Wait-free except for the use of the hazard pointer mechanism

A chunk can never be unfrozen


The Test Environment

Platform: SUN FIRE with UltraSPARC T1

8-core processor, each core running 4 hyper-threads.

OS: Solaris 10

Chunk size set to virtual page size -- 8KB.

◦All accesses inside a chunk are on the same page


Workload

Each test had two stages:

◦ Stage I:

Insertions (only) of N random keys (in order to obtain a substantial list)

N: 10^3, 10^4, 10^5, 10^6

◦ Stage II:

Insertions, deletions and searches in parallel

N operations overall, of which 15% are insertions, 15% deletions, and 70% searches.

Results are reported for runs of 32 concurrent threads.
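
A Stage II operation mix with these proportions could be generated as in the sketch below. The slides do not describe the generator, so the function name, the fixed seed, and the `key_space` parameter are assumptions. The list would still need to be divided among the 32 threads.

```python
import random

def stage_two_operations(n, key_space, seed=0):
    """Build n operations: 15% insertions, 15% deletions, 70% searches, in random order."""
    rng = random.Random(seed)
    inserts = deletes = n * 15 // 100
    searches = n - inserts - deletes
    kinds = ["insert"] * inserts + ["delete"] * deletes + ["search"] * searches
    rng.shuffle(kinds)
    return [(kind, rng.randrange(key_space)) for kind in kinds]

ops = stage_two_operations(1000, key_space=10**6)
print(len(ops), sum(1 for kind, _ in ops if kind == "search"))
```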


Reference for Comparison

Michael’s lock-free linked list, implemented in C according to the pseudo-code from

◦ MICHAEL, M. M., Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects, IEEE Transactions on Parallel and Distributed Systems, 2004.

◦ Uses hazard pointers.

A Java implementation of the lock-free linked list provided in the book “The Art of Multiprocessor Programming”

◦ Garbage collection is assumed.


Comparison with Michael’s List: Total Time

Stage I total time by N, in seconds (plotted on a logarithmic scale):

N          Original List    Chunk List
1000       0.01             0.16
10000      0.56             1.16
100000     27.00            4.90
1000000    368.08           24.68

◦Already at N = 20000 we get the same performance

◦More than 10 times faster

Stage II total time by N, in seconds (plotted on a logarithmic scale):

N          Original List    Chunk List
1000       0.01             0.004
10000      1.15             0.071
100000     33.91            2.050
1000000    237.93           20.269

◦Constantly better performance

◦For substantial lists, more than 10 times faster
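
The "more than 10 times" remarks can be checked against the reported totals with a few lines of arithmetic. The sketch below assumes that, within each stage, the first four values of the chart belong to the original list and the last four to the chunk list, in order of increasing N.

```python
sizes = [10**3, 10**4, 10**5, 10**6]
stage1 = {"original": [0.01, 0.56, 27.00, 368.08], "chunk": [0.16, 1.16, 4.90, 24.68]}
stage2 = {"original": [0.01, 1.15, 33.91, 237.93], "chunk": [0.004, 0.071, 2.050, 20.269]}

def speedups(stage):
    """Ratio of original-list time to chunk-list time for each N."""
    return [orig / chunk for orig, chunk in zip(stage["original"], stage["chunk"])]

for name, stage in (("Stage I", stage1), ("Stage II", stage2)):
    for n, ratio in zip(sizes, speedups(stage)):
        print(f"{name}  N={n:>7}  speedup={ratio:.2f}x")
```

A ratio below 1 means the chunk list was slower, which in these numbers happens only in Stage I for the two smallest values of N.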


Comparison with Michael’s List: Single Operation Average

[Figure: single-operation average time, Michael’s list vs. chunk list.]

◦Better performance as lists become more substantial

◦Again, constantly better performance

Comparison with Lock-Free List in Java: Total Times

[Figure: total times, Java lock-free list vs. chunk list.]

Comparison with Lock-Free List in Java: Single Operation Average

[Figure: single-operation average time, Java lock-free list vs. chunk list.]


Conclusion

New lock-free algorithm for a chunked linked list

Fast due to:

◦Skips over chunks

◦Restarting from the beginning of a chunk

◦Locality-conscious

May be useful for other structures that can use the chunks

Good empirical results for substantial lists


Questions?


Thank you !!
