how to implement any concurrent data structure · effort in 2012–2014 The Future(s) of Shared...

how to implement any concurrent data structure marcos k. aguilera vmware jointly with irina calciu siddhartha sen mahesh balakrishnan


Page 1:

how to implement any concurrent data structure

marcos k. aguilera

vmware

jointly with irina calciu, siddhartha sen, mahesh balakrishnan

Page 2:

Where to find more information about this work

How to Implement Any Concurrent Data Structure. By Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera. Communications of the ACM, 2018

Black-box Concurrent Data Structures for NUMA Architectures. Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera. ASPLOS, 2017

Page 3:

concurrent data structures are everywhere

kernel

application libraries

applications

Page 4:

but efficient ones are hard to design

locks

transactional memory

lock-free and wait-free

Page 5:

effort in 2012–2014

The Future(s) of Shared Data Structures. Alex Kogan and Maurice Herlihy. PODC 2014

Concurrent Updates with RCU: Search Tree as an Example. Maya Arbel and Hagit Attiya. PODC 2014

Dynamic-Sized Nonblocking Hash Tables. Yujie Liu, Kunlong Zhang and Michael Spear. PODC 2014

Efficient Lock-free Binary Search Trees. Bapi Chatterjee, Nhan Nguyen and Philippas Tsigas. PODC 2014

The Amortized Complexity of Non-blocking Binary Search Trees. Faith Ellen, Panagiota Fatourou, Joanna Helga and Eric Ruppert. PODC 2014

The Adaptive Priority Queue with Elimination and Combining. Irina Calciu, Hammurabi Mendes and Maurice Herlihy. DISC 2014

Solo-fast Universal Constructions for Deterministic Abortable Objects. Claire Capdevielle, Colette Johnen and Alessia Milani. DISC 2014

On Deterministic Abortable Objects. Vassos Hadzilacos and Sam Toueg. PODC 2013

Leaplist: Lessons Learned in Designing TM-Supported Range Queries. Hillel Avni, Nir Shavit, and Adi Suissa. PODC 2013

The SkipTrie: Low-Depth Concurrent Search without Rebalancing. Rotem Oshman and Nir Shavit. PODC 2013

Pragmatic Primitives for Non-blocking Data Structures. Trevor Brown, Faith Ellen, and Eric Ruppert. PODC 2013

Lock-Free Data Structure Iterators. Erez Petrank and Shahar Timnat. DISC 2013

Practical Non-blocking Unordered Lists. Kunlong Zhang, Yujiao Zhao, Yajun Yang, Yujie Liu and Michael Spear. DISC 2013

Atomic snapshots in expected $O(\log^3 n)$ steps using randomized helping. James Aspnes and Keren Censor-Hillel. DISC 2013

An Optimal Implementation of Fetch-and-Increment. Faith Ellen and Philipp Woelfel. DISC 2013

On the Time and Space Complexity of Randomized Test-And-Set. George Giakkoupis and Philipp Woelfel. PODC 2012

Universal Constructions that Ensure Disjoint-Access Parallelism and Wait-Freedom. Faith Ellen, Panagiota Fatourou, Eleftherios Kosmas, Alessia Milani, and Corentin Travers. PODC 2012

Faster than Optimal Snapshots (for a While). James Aspnes, Hagit Attiya, Keren Censor-Hillel, and Faith Ellen. PODC 2012

Strongly Linearizable Implementations: Possibilities and Impossibilities. Maryam Helmi, Lisa Higham, and Philipp Woelfel. PODC 2012

CBTree: A Practical Concurrent Self-Adjusting Search Tree. Yehuda Afek, Haim Kaplan, Boris Korenfeld, Adam Morrison, Robert E. Tarjan. DISC 2012

Efficient Fetch-and-Increment. Faith Ellen, Vijaya Ramachandran, Philipp Woelfel. DISC 2012

Page 6:

problems with concurrent data structure design

herculean effort for each data structure

rigid designs

an even greater problem…

Page 7:

problems with concurrent data structure design

herculean effort for each data structure

rigid designs

an even greater problem… new hardware architectures

Page 8:

our options?

1. underutilize the system

2. develop new data structures… for each new architecture

3. we think there is a better way

Page 9:

architecture-aware black-box data structures

sequential data structures

transformation 1 → architecture 1

transformation 2 → architecture 2

transformation 3 → architecture 3

Page 10:

architecture-aware black-box data structures

sequential data structures

transformation 1 → architecture 1

transformation 2 → architecture 2

transformation 3 → architecture 3

FOCUS OF REST OF TALK: NUMA architecture

Page 11:

the NR algorithm

Page 12:

NUMA architecture: Non-Uniform Memory Access

❖ local access more efficient

[Diagram: two nodes; each node has four cores with one cache per core, a further cache shared by the node, and its own memory]

Page 13:

evaluation

Intel Xeon E7-4850 v3

56 cores, 4 nodes

2.2 GHz

512 GB RAM

L3 35 MB

L2 256 KB

L1 64 KB

Page 14:

[Plot: skip list priority queue, 10% updates. x-axis: # threads (ticks 1, 28, 56, 84, 110); y-axis: ops/µs (ticks 0 to 60). Series: (NR) Node Replication, (LF) Lock-free, (FC) Flat Combining, (FC+) FC + RWL, (RWL) Readers-Writer Lock, (SL) Spinlock]

Page 15:

[Plot: data structure in REDIS, 10% updates. x-axis: # threads (ticks 1, 28, 56, 84, 110); y-axis: ops/µs (ticks 0 to 6). Series: (NR) Node Replication, (FC+) FC + RWL, (RWL) Readers-Writer Lock, (FC) Flat Combining, (SL) Spinlock]

Page 16:

the transformation

given single-threaded

execute(op,parameters)

isReadOnly(op)

we produce multi-threaded

execute(op,parameters)

works well in NUMA servers
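
The shape of this black-box interface can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `SeqMap` is a hypothetical single-threaded structure, `is_read_only` is a Python spelling of the slide's `isReadOnly`, and `GlobalLockWrapper` is a deliberately naive stand-in that uses one global lock. It is not the NR algorithm; it only shows that the wrapper needs nothing from the structure beyond these two functions.

```python
import threading


class SeqMap:
    """Hypothetical single-threaded structure exposing the two black-box hooks."""

    def __init__(self):
        self.data = {}

    def execute(self, op, *params):
        # Single entry point: the wrapper never looks inside the structure.
        if op == "put":
            key, value = params
            self.data[key] = value
            return None
        if op == "get":
            (key,) = params
            return self.data.get(key)
        raise ValueError(f"unknown op: {op}")

    def is_read_only(self, op):
        # Tells the wrapper which operations do not modify state.
        return op == "get"


class GlobalLockWrapper:
    """Naive stand-in for the transformation: one lock around every call.

    A real transformation would use is_read_only() to treat readers
    differently; this placeholder ignores it.
    """

    def __init__(self, seq):
        self._seq = seq
        self._lock = threading.Lock()

    def execute(self, op, *params):
        # Same signature as the sequential execute(), now safe to call
        # from many threads.
        with self._lock:
            return self._seq.execute(op, *params)
```

Callers use the multi-threaded `execute` exactly as they would the single-threaded one, for example `GlobalLockWrapper(SeqMap()).execute("put", "k", 1)`.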

Page 17:

key ideas

1. replicate data structure across (NUMA) nodes: state machine approach with a shared log

2. provide efficient NUMA-aware log: large effort to optimize log

Page 18:

the transformation

[Diagram: two NUMA nodes; each holds a local replica accessed by that node's threads]

Page 19:

the transformation

[Diagram: two NUMA nodes; each holds a local replica, a local tail, and that node's threads; both nodes share one log, which has a log tail]
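
The replication idea in these diagrams can be sketched in a few lines of Python. This is a single-threaded illustration under stated assumptions, not the NR implementation: `SeqStack`, `SharedLog`, and `Replica` are hypothetical names, there is no concurrency control at all, and the rule that a read first replays up to the current log tail is a simplification chosen here to keep reads up to date.

```python
class SeqStack:
    """Hypothetical deterministic single-threaded structure."""

    def __init__(self):
        self.items = []

    def execute(self, op, *params):
        if op == "push":
            self.items.append(params[0])
            return None
        if op == "pop":
            return self.items.pop() if self.items else None
        if op == "peek":
            return self.items[-1] if self.items else None
        raise ValueError(f"unknown op: {op}")

    def is_read_only(self, op):
        return op == "peek"


class SharedLog:
    """One log shared by all replicas; entries are update operations."""

    def __init__(self):
        self.entries = []

    def append(self, op, params):
        self.entries.append((op, params))
        return len(self.entries)      # the new log tail

    def tail(self):
        return len(self.entries)


class Replica:
    """Per-node copy of the structure plus a local tail into the log."""

    def __init__(self, log):
        self.log = log
        self.seq = SeqStack()
        self.local_tail = 0           # entries below this index are applied

    def _replay(self, upto):
        # State machine approach: apply log entries in order, so every
        # replica passes through the same sequence of states.
        result = None
        while self.local_tail < upto:
            op, params = self.log.entries[self.local_tail]
            result = self.seq.execute(op, *params)
            self.local_tail += 1
        return result

    def execute(self, op, *params):
        if self.seq.is_read_only(op):
            # Reads are not logged: catch up, then read the local replica.
            self._replay(self.log.tail())
            return self.seq.execute(op, *params)
        # Updates are logged; the last entry replayed is our own.
        new_tail = self.log.append(op, params)
        return self._replay(new_tail)
```

With two replicas on one log, an update issued through either replica becomes visible through the other as soon as that replica replays the log, because both apply the same entries in the same order.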

Page 20:

how to implement log?

key observation: coordination within node cheaper than across nodes

within node: we use flat combining

across nodes: we use lock-free appending to log
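
The two-level split on this slide can be sketched as follows. This is an illustrative sketch, not the authors' code: `AtomicInt`, `SharedLog.reserve`, and `NodeCombiner` are hypothetical names, the compare-and-swap is emulated with a lock because Python exposes no hardware CAS, the log is a fixed-size array, and the sketch covers only appending (it does not apply entries to a replica or return results). Within a node, threads publish operations in per-thread slots and whichever thread gets the combiner lock gathers them into a batch; across nodes, the combiner claims a run of log entries with a single CAS on the shared tail.

```python
import threading
import time


class AtomicInt:
    """Emulated atomic word; the lock stands in for a hardware CAS."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


class SharedLog:
    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.tail = AtomicInt(0)

    def reserve(self, n):
        # Lock-free style append: retry the CAS until n consecutive
        # entries are claimed; no thread ever holds a log-wide lock.
        while True:
            start = self.tail.load()
            if start + n > len(self.slots):
                raise RuntimeError("log full (a real log would recycle entries)")
            if self.tail.compare_and_swap(start, start + n):
                return start


class NodeCombiner:
    """Flat combining for the threads of one node."""

    def __init__(self, log, node_id, num_threads):
        self.log = log
        self.node_id = node_id
        self.pending = [None] * num_threads   # one publication slot per thread
        self.combiner_lock = threading.Lock()

    def submit(self, thread_idx, op):
        self.pending[thread_idx] = op          # publish the operation
        while self.pending[thread_idx] is not None:
            if self.combiner_lock.acquire(blocking=False):
                try:
                    self._combine()            # we are the combiner
                finally:
                    self.combiner_lock.release()
            else:
                time.sleep(0)                  # another thread is combining

    def _combine(self):
        batch = [(i, op) for i, op in enumerate(self.pending) if op is not None]
        if not batch:
            return
        start = self.log.reserve(len(batch))   # one CAS for the whole batch
        for offset, (_, op) in enumerate(batch):
            self.log.slots[start + offset] = (self.node_id, op)
        for i, _ in batch:
            self.pending[i] = None             # signal completion to owners
```

The point of the design is that cross-node traffic is limited to one CAS per batch, while the cheaper intra-node coordination absorbs the per-operation cost.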

Page 21:

correctness

linearizability [Herlihy Wing 1990]: each operation appears to take effect instantaneously at a point between its invocation and response
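
To make the definition concrete, here is a small brute-force checker for histories of a single read/write register. It is an illustration only, with assumptions stated plainly: the tuple layout `(invoke, response, kind, value)` and the function names are invented for this sketch, the object is a plain register rather than any structure from the talk, and trying every permutation is only feasible for tiny histories.

```python
from itertools import permutations


def respects_real_time(order):
    # If b responded before a was invoked, b must not be placed after a.
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if b[1] < a[0]:
                return False
    return True


def legal_register(order, initial):
    # Sequential specification: a read returns the latest written value.
    value = initial
    for _, _, kind, v in order:
        if kind == "write":
            value = v
        elif v != value:
            return False
    return True


def is_linearizable(history, initial=None):
    """history: list of (invoke, response, kind, value) tuples,
    where kind is "write" or "read" and value is written or returned."""
    return any(
        respects_real_time(order) and legal_register(order, initial)
        for order in permutations(history)
    )
```

For example, a read that overlaps a write may return either the old or the new value, but a read invoked after the write has responded must return the new value; the checker accepts the first kind of history and rejects a stale read in the second.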

Page 22:

whence performance comes

• trade memory + computation for less communication
  • compact representation of operations
  • limited cross-node synchronization and contention

• enable parallelism
  • combiners across nodes
  • readers within a node
  • readers and the combiner on the same node

• leverage batching


Page 23:

conclusion

• fundamental changes in hardware

• exposed to software developers

• take-away: instead of individual data structures, let’s develop general architecture-aware techniques