Scalable Flat-Combining Based Synchronous Queues
Danny Hendler, Itai Incze, Nir Shavit and Moran Tzafrir
Presentation by Uri Golani
Overview
•Synchronous queue
•Synchronous queue using single-combiner flat combining
•Synchronous queue using parallel flat combining
•Benchmarks
Synchronous queue
•Suited for handoff designs
•Each put must wait for a get, and vice versa
•No capacity
•Does not permit null elements
•Does not impose order (unfair)
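The handoff behaviour listed above can be tried with the JDK's own java.util.concurrent.SynchronousQueue. This is only a minimal sketch: the class name HandoffDemo and the helper handOff are made up for illustration, and the JDK class is not the flat-combining queue discussed in this talk.

```java
import java.util.concurrent.SynchronousQueue;

class HandoffDemo {
    // Hands one item from a producer thread to the calling (consumer) thread.
    static String handOff(String item) {
        SynchronousQueue<String> queue = new SynchronousQueue<>(); // unfair by default
        Thread producer = new Thread(() -> {
            try {
                queue.put(item); // blocks until a consumer takes the item
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String received = queue.take(); // blocks until a producer hands over an item
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handOff("hello"));
        // No capacity: with no consumer waiting, a non-blocking offer fails.
        System.out.println(new SynchronousQueue<String>().offer("x")); // false
    }
}
```

Passing null to put throws a NullPointerException, which matches the "no null elements" bullet.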
So what is flat combining?
It means that all requests are published in a sequential data structure, and a single thread combines them by traversing it.
F.C. algorithm's attributes:
•Publication record – per thread
•Publication list
•Global lock
•Private stack
•Count
A thread's publication record:
[Figure: a publication record with Request and Item fields and an "is linked" flag]
Publication list
•A list containing a static head and publication records
•A thread has at most one P.R. in the list
•Adding a P.R. to the list involves a CAS on the head of the list
•Removing a P.R. does not involve a CAS; therefore head.next is never removed
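A minimal sketch of such a list, under some assumptions: the names PublicationList, Record, add, removeStale and size are hypothetical, the head is modelled as an AtomicReference to the newest record rather than a static sentinel node, and the staleness test is left to the caller. It shows why insertion needs a CAS while removal by the combiner does not.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

class PublicationList {
    static final class Record {
        volatile Object request; // what the owning thread wants done
        volatile Object item;    // filled in by the combiner
        volatile Record next;    // next (older) record in the list
        volatile boolean active; // true while the record is linked
    }

    private final AtomicReference<Record> head = new AtomicReference<>();

    // Any thread: link a record at the head with a CAS retry loop.
    void add(Record r) {
        r.active = true;
        Record first;
        do {
            first = head.get();
            r.next = first;
        } while (!head.compareAndSet(first, r));
    }

    // Combiner only (lock held): unlink stale records with plain pointer writes.
    // The newest record is never unlinked, so cleanup never writes the
    // reference that add() updates with CAS.
    int removeStale(Predicate<Record> isStale) {
        Record prev = head.get();
        if (prev == null) return 0;
        int removed = 0;
        for (Record cur = prev.next; cur != null; cur = cur.next) {
            if (isStale.test(cur)) {
                prev.next = cur.next;
                cur.active = false; // the owner must re-add it before its next request
                removed++;
            } else {
                prev = cur;
            }
        }
        return removed;
    }

    // Number of linked records (for illustration only).
    int size() {
        int n = 0;
        for (Record r = head.get(); r != null; r = r.next) n++;
        return n;
    }
}
```

Even if every record is considered stale, the record added last stays linked, which mirrors the "head.next is never removed" rule.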
Global lock
•Allows only one thread at a time to traverse the publication list and act as the combiner
•Publication records can still be added to the publication list while the lock is held
Count
•A thread increments this field when it grabs the lock and becomes the combiner
•Every predefined number of increments, old requests are cleaned up from the publication list
Private stack
•A member of the construct itself, not allocated per combiner
•Stores push/pop operations during the combiner's traversal
•Keeps the overflow of requests for the next combining rounds
FC synchronous queue – get/put methods
•1. Allocate a publication record if it is null
•2. while (true)
▫2.1 Check that the P.R. is still active and in the publication list; otherwise CAS it to the head of the publication list
▫2.2 Try to acquire the lock (CAS) to become the combiner, and call Combine() if that succeeds
▫2.3 Check whether the publication record's item is not null; if so, break
Combine()
•1. Increment count
•2. for (COMBINING_ROUND)
▫2.1 Traverse the publication list, combining complementary requests and pushing overflows onto the stack
▫2.2 if (count % CLEAN_UP == 0)
   2.2.1 Remove old P.R.s
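The get/put loop and Combine() can be put together roughly as follows. This is a simplified sketch and not the authors' code: the names FcSyncQueue, Record, exchange and match are hypothetical, the value of COMBINING_ROUNDS is assumed, completion is signalled with a pending flag instead of a null check on the item, and the periodic cleanup step is left out, so records stay linked for the lifetime of the queue.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

class FcSyncQueue<T> {
    private static final int COMBINING_ROUNDS = 4; // assumed value, not from the slides

    private static final class Record<T> {
        volatile boolean pending; // set by the owner, cleared by the combiner
        boolean isPut;            // written by the owner before pending = true
        T item;                   // put: value offered; get: value received
        boolean onStack;          // touched only while holding the lock
        Record<T> next;           // fixed once the record is linked
    }

    private final AtomicReference<Record<T>> head = new AtomicReference<>();
    private final AtomicBoolean lock = new AtomicBoolean(false);
    private final ArrayDeque<Record<T>> stack = new ArrayDeque<>(); // combiner-only
    private long count;                                             // combiner-only
    private final ThreadLocal<Record<T>> local = ThreadLocal.withInitial(this::link);

    // Step 1: allocate this thread's record and CAS it onto the head of the list.
    private Record<T> link() {
        Record<T> r = new Record<>();
        do {
            r.next = head.get();
        } while (!head.compareAndSet(r.next, r));
        return r;
    }

    public void put(T x) {
        if (x == null) throw new NullPointerException("null elements are not permitted");
        exchange(true, x);
    }

    public T take() {
        return exchange(false, null);
    }

    private T exchange(boolean isPut, T x) {
        Record<T> r = local.get();
        r.isPut = isPut;
        r.item = x;
        r.pending = true;                          // publish the request
        while (r.pending) {                        // step 2: spin on the local record
            if (lock.compareAndSet(false, true)) { // step 2.2: try to become the combiner
                try { combine(); } finally { lock.set(false); }
            } else {
                Thread.yield();
            }
        }
        T result = r.item;
        r.item = null;
        return isPut ? null : result;
    }

    private void combine() {
        count++; // would drive the periodic cleanup, which this sketch omits
        for (int round = 0; round < COMBINING_ROUNDS; round++) {
            for (Record<T> r = head.get(); r != null; r = r.next) {
                if (!r.pending || r.onStack) continue;
                Record<T> top = stack.peek();
                if (top != null && top.isPut != r.isPut) {
                    stack.pop();
                    match(top, r);    // complementary requests: hand the item over
                } else {
                    r.onStack = true; // overflow: keep it for later in the traversal
                    stack.push(r);
                }
            }
        }
    }

    private void match(Record<T> a, Record<T> b) {
        Record<T> put = a.isPut ? a : b;
        Record<T> get = a.isPut ? b : a;
        get.item = put.item;
        a.onStack = false;
        b.onStack = false;
        get.pending = false; // volatile writes release both waiting threads
        put.pending = false;
    }
}
```

Because a request is pushed only when the stack is empty or its top holds a request of the same kind, the stack always contains puts only or gets only, and anything left on it is carried over to later rounds and later combiners.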
Single combiner overlay:
[Figure: the publication list starts at Head and links the records (Request, Item) of threads A, B, C and D; next to it are Count and the combiner's private stack, which holds push and pop requests]
1. Thread writes a push or pop request and spins on its local record
2. Thread acquires the lock, becomes the combiner, and updates count
3. Combiner traverses the list, collecting requests into the stack and matching them to other requests along the list
4. Infrequently, new records are CASed by threads to the head of the list, and old ones are removed by the combiner
Parallel flat combining
•Uses two levels of combining:
•1. Dynamic level – multiple combiners working in parallel
•2. Exchange level – combining the overflows of the dynamic level with a single combiner
Dynamic level
•The publication list is divided into sub lists of limited size
•Each sub list is headed by a combiner node and has its own lock and private stack
•Multiple combiners can work on the sub lists, so the work on the publication list is done in parallel
Exchange level
•Has a publication list, a private stack and a lock
•Each publication record represents one sub list in the dynamic level
•A publication record's item consists of a list of overflow requests (gets/puts) from the dynamic level
•Combining is done by a single combiner
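How requests flow between the two levels can be illustrated with a purely sequential counting model. It is an illustration only, with no concurrency; the class TwoLevelModel and its method match are hypothetical. Each sub list first matches its own puts against its own gets, and only the leftovers (the overflow) reach the exchange level.

```java
import java.util.Arrays;

class TwoLevelModel {
    // Each row describes one sub list: {number of puts, number of gets}.
    // Returns {pairs matched at the dynamic level,
    //          pairs matched at the exchange level,
    //          requests still waiting}.
    static int[] match(int[][] subLists) {
        int dynamicPairs = 0;
        int overflowPuts = 0;
        int overflowGets = 0;
        for (int[] s : subLists) {
            int local = Math.min(s[0], s[1]); // matched by the sub list's own combiner
            dynamicPairs += local;
            overflowPuts += s[0] - local;     // leftovers go to the exchange level
            overflowGets += s[1] - local;
        }
        int exchangePairs = Math.min(overflowPuts, overflowGets);
        int waiting = overflowPuts + overflowGets - 2 * exchangePairs;
        return new int[] { dynamicPairs, exchangePairs, waiting };
    }

    public static void main(String[] args) {
        // Balanced sub lists: everything is matched in parallel.
        System.out.println(Arrays.toString(match(new int[][] { {4, 4}, {4, 4} }))); // [8, 0, 0]
        // Skewed sub lists: every pair has to go through the single exchange combiner.
        System.out.println(Arrays.toString(match(new int[][] { {8, 0}, {0, 8} }))); // [0, 8, 0]
    }
}
```

In this model the exchange level only does work when a sub list holds more requests of one kind than of the other.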
Parallel combiner overlay:
[Figure: the dynamic FC publication list, starting at its head, is split into sub lists by the 1st, 2nd and 3rd combiner nodes; each combiner node has its own Count and private stack and is followed by the records (Request, Item) of threads A, B, C, D, E and G. The request overflows of each sub list are passed to the record (Request, Item) of the 1st, 2nd or 3rd combiner in the exchange FC publication list, which has its own head, Count and private stack]
Is parallel flat combining linearizable?
Well, it is.
Operations can be linearized at the point of release of the first of two combined requests.
It can be viewed as an object whose history is made of a sequence of pairs, each consisting of a push followed by a pop (i.e. push, pop, push, pop...)
Pitfalls of parallel flat combining:
•Performance depends heavily on the balance of request types across the various sub lists