
McRT-Malloc: A Scalable Non-Blocking Transaction Aware Memory Allocator

Ali Adl-Tabatabai

Ben Hertzberg

Rick Hudson

Bratin Saha


Goals of McRT-Malloc

Scalable

• Performance scales linearly with the number of processors, then stays flat as you add more SW threads

• Preemption safety

• Implies a lock-free approach to all structures

Allows other scalable McRT algorithms to use malloc and remain scalable

Transactional memory awareness

• Avoid memory blowup within a transaction

• Avoid freeing bits needed to validate other transactions

• Enable object-level conflict detection in STM

Best of class


Block Data Structure

Heap divided into aligned 16K blocks

• 18 significant bits

Block

• Owned by a single thread during allocation

• Blocks segregated into bins according to object size

• Meta data header

– Free lists

– Bump pointer

– Next/previous block

– Object size and usage info

• No per-object headers

• Free blocks on non-blocking LIFO queue

– 46 bits for update timestamp
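One reading of the "18 significant bits" and "46 bits" figures is that a block reference and an update timestamp share a single 64-bit word (18 + 46 = 64), so the queue head can be replaced with one atomic operation. The sketch below illustrates that packing. It assumes 32-bit addresses and places the timestamp in the upper bits; the layout and all names (`pack_head`, `head_block`, `head_stamp`, `next_head`) are hypothetical, not taken from the slides.

```c
#include <stdint.h>

#define BLOCK_SHIFT 14u   /* 16K = 2^14 bytes, so the low 14 address bits are zero */
#define INDEX_BITS  18u   /* 32 - 14 significant bits identify a block */
#define INDEX_MASK  ((UINT64_C(1) << INDEX_BITS) - 1)

/* Pack a 16K-aligned 32-bit block address and a timestamp into one word.
   Assumed layout: timestamp in the upper 46 bits, block index in the lower 18. */
static uint64_t pack_head(uint32_t block_addr, uint64_t stamp)
{
    uint64_t index = (uint64_t)(block_addr >> BLOCK_SHIFT);
    return (stamp << INDEX_BITS) | index;   /* stamp wraps modulo 2^46 */
}

/* Recover the block address from a packed head word. */
static uint32_t head_block(uint64_t head)
{
    return (uint32_t)((head & INDEX_MASK) << BLOCK_SHIFT);
}

/* Recover the timestamp from a packed head word. */
static uint64_t head_stamp(uint64_t head)
{
    return head >> INDEX_BITS;
}

/* Head word for a queue update: new block, timestamp advanced by one.
   A compare-and-swap from old_head to this value fails if any other
   update happened in between, even one that restored the same block. */
static uint64_t next_head(uint64_t old_head, uint32_t new_block_addr)
{
    return pack_head(new_block_addr, head_stamp(old_head) + 1);
}
```

Bumping the timestamp on every update is a common way to keep a compare-and-swap on a LIFO head from being fooled when the same block reappears at the head (the ABA problem).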

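Figure: layout of one 16K block from 0xABCD0000 to 0xABCD4000. The meta data header starts at 0xABCD0000, the boundary 0xABCD0040 is marked after it, and an object pointer points into the block.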
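Because blocks are aligned to 16K and objects carry no headers, the block's meta data can be found from any object address by clearing the low 14 bits. The following is a minimal sketch of that address arithmetic. The function names are hypothetical, and `HEADER_SIZE` of 0x40 is an assumption read off the addresses in the figure, not a documented value.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  ((uintptr_t)16 * 1024)  /* aligned 16K blocks */
#define HEADER_SIZE ((uintptr_t)0x40)       /* assumed from the figure */

/* Address of the block (and of its meta data header) containing obj_addr. */
static uintptr_t block_base(uintptr_t obj_addr)
{
    return obj_addr & ~(BLOCK_SIZE - 1);
}

/* Index of an object within its block, given the block's object size.
   The size would come from the header, since every object in a block
   has the same size. */
static size_t object_index(uintptr_t obj_addr, size_t object_size)
{
    return (size_t)(obj_addr - block_base(obj_addr) - HEADER_SIZE) / object_size;
}
```

For example, any address between 0xABCD0000 and 0xABCD3FFF maps back to the block at 0xABCD0000.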


Object Allocation and Freeing

Each thread owns the block it allocates in

Trick - Free uses two linked free lists per block

• Private free list for block owner avoids atomic instructions

• Public list for other threads uses atomic instructions and a non-blocking algorithm

Trick - Fresh block uses frontier pointer to avoid free list initialization

Then allocates from private free list

Privatize entire public list as needed with atomic xchg
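The allocation order described above (frontier first, then the private list, then privatizing the public list) can be sketched with C11 atomics. This is an illustration under assumptions, not the McRT-Malloc source: the `block` layout and every function name here are hypothetical, and `atomic_exchange` stands in for the xchg instruction.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct free_node { struct free_node *next; } free_node;

typedef struct block {
    char *frontier;                    /* bump pointer into never-used space */
    char *limit;                       /* end of the block */
    size_t object_size;                /* must be >= sizeof(free_node) */
    free_node *private_list;           /* touched by the owner only */
    _Atomic(free_node *) public_list;  /* other threads push frees here */
} block;

static void block_init(block *b, void *mem, size_t bytes, size_t object_size)
{
    b->frontier = mem;
    b->limit = (char *)mem + bytes;
    b->object_size = object_size;
    b->private_list = NULL;            /* no free list to build for a fresh block */
    atomic_init(&b->public_list, NULL);
}

/* Owner-only allocation: frontier, then private list, then public list. */
static void *block_alloc(block *b)
{
    if ((size_t)(b->limit - b->frontier) >= b->object_size) {
        void *p = b->frontier;
        b->frontier += b->object_size;
        return p;
    }
    if (b->private_list == NULL)       /* take the whole public list at once */
        b->private_list = atomic_exchange(&b->public_list, NULL);
    free_node *n = b->private_list;
    if (n == NULL)
        return NULL;                   /* block is full */
    b->private_list = n->next;
    return n;
}

/* Free by the owning thread: plain stores, no atomic instructions. */
static void block_free_owner(block *b, void *p)
{
    free_node *n = p;
    n->next = b->private_list;
    b->private_list = n;
}

/* Free by any other thread: lock-free push with compare-and-swap. */
static void block_free_remote(block *b, void *p)
{
    free_node *n = p;
    free_node *head = atomic_load(&b->public_list);
    do {
        n->next = head;                /* head is refreshed when the CAS fails */
    } while (!atomic_compare_exchange_weak(&b->public_list, &head, n));
}
```

In this sketch the public list is only ever pushed onto by other threads and emptied wholesale by the owner, so no thread pops a single node from it, which is the operation that usually makes lock-free stacks vulnerable to ABA.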


McRT-Malloc: A Transaction Aware Memory Allocator

Three problems

1. Speculative memory allocation and de-allocation inside transactions can cause space blowup

2. Transactional conflict detection and frees

3. Object-based conflict detection in C/C++

Garbage collection also solves these issues


Allocation with STM

Speculatively allocate or free inside transaction

• Valid at commit - rolled back on abort

Balanced – both malloc and free within transaction

• Memory is transaction-local and must be reused to prevent memory blowup

transaction {
    for (i = 0; i < big_number; i++) {
        foo = malloc(size);
        free(foo);
    }
}


Solution

Use sequence numbers to track allocation relationships

• Sequence counter per-thread (thread-local)

• Every transaction (even nested) takes a new (incremented) sequence number upon start

• Every allocation in the transaction is tagged with its sequence number

The relationship of an object being freed in a given transaction is determined by sequence number:

• seq(object) < seq(transaction) → speculative free

• seq(object) == seq(transaction) → balanced free
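The sequence-number rule can be written as a small comparison. The sketch below uses hypothetical names (`seq_thread`, `txn_begin`, `classify_free`) and covers only the two cases listed on the slide; it assumes the caller never passes an object whose sequence number is greater than the transaction's, since the slide does not say how that case is handled.

```c
#include <stdint.h>

typedef enum { FREE_SPECULATIVE, FREE_BALANCED } free_kind;

/* Per-thread (thread-local) sequence counter. */
typedef struct {
    uint64_t counter;
} seq_thread;

/* Every transaction start, nested or not, takes a fresh sequence number. */
static uint64_t txn_begin(seq_thread *t)
{
    return ++t->counter;
}

/* Classify a free by comparing the tag stored at allocation time with the
   sequence number of the transaction doing the free.
   Precondition (assumed): object_seq <= txn_seq. */
static free_kind classify_free(uint64_t object_seq, uint64_t txn_seq)
{
    return object_seq == txn_seq ? FREE_BALANCED      /* allocated in this transaction */
                                 : FREE_SPECULATIVE;  /* allocated before it started */
}
```

A balanced free concerns memory that never escaped the transaction, which is why it can be reused immediately; a speculative free has to wait for commit because an abort must undo it.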


Monitors != Transactions

• STM uses bits in object to validate at commit

• Monitors (locks) are pessimistic: they allow only one thread inside a critical section

• Transactions are optimistic: they allow multiple threads inside a critical section

• This causes problems when freeing an object


Thread 1 (deleting node 2) and Thread 2 (deleting node 3) both run:

nodeDelete(int key) {
    ptr = head of list;
    transaction {
        while( ptr->next->key != key ) {
            ptr = ptr->next;
        } /* end while */
        temp = ptr->next;
        ptr->next = ptr->next->next;
    } /* validate & end transaction */
    free(temp); /* Anyone using? */
}


At this point there is only a read/read (non-)conflict


Now we have a read/write conflict. Thread 1 commits and thread 2 will abort.


STM Version information needed for validation is destroyed along with object 2


Thread 2 wakes up


The bits thread 2 is relying on to detect and resolve the conflict by aborting are now garbage


Solution

Delay the actual free and reuse until in a consistent state

A global epoch (timestamp) is maintained and incremented periodically

Each thread locally remembers the global epoch of the last time it entered or exited a top level transaction

• Set as part of TransactionBegin and TransactionAbort/Commit

Each free is noted, together with the current global epoch, in a thread-local buffer

When the buffer fills, each thread's epoch is queried

All frees noted before the minimum epoch are then freed "for real"

Cost is O(number of frees), not O(number of memory accesses)
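The delayed-free scheme can be sketched as a buffer of (pointer, epoch) pairs that is filtered against the minimum per-thread epoch. This is a single-threaded illustration of the bookkeeping only: the names, the fixed sizes `MAX_THREADS` and `BUF_CAP`, and the choice to hand released pointers back to the caller are all assumptions, and a real implementation would need atomic accesses to the shared epochs.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 4   /* hypothetical limits for the sketch */
#define BUF_CAP     8

typedef struct { void *ptr; uint64_t epoch; } pending_free;

typedef struct {
    uint64_t global_epoch;               /* incremented periodically */
    uint64_t thread_epoch[MAX_THREADS];  /* epoch at last top-level txn begin/end */
} epoch_state;

typedef struct {
    pending_free buf[BUF_CAP];           /* thread-local buffer of delayed frees */
    size_t count;
} free_buffer;

/* Called from TransactionBegin and TransactionAbort/Commit. */
static void note_txn_boundary(epoch_state *s, int tid)
{
    s->thread_epoch[tid] = s->global_epoch;
}

/* Note a free with the current global epoch. Returns 0 if the buffer is
   full, in which case the caller should run reclaim() first. */
static int defer_free(const epoch_state *s, free_buffer *fb, void *p)
{
    if (fb->count == BUF_CAP)
        return 0;
    fb->buf[fb->count].ptr = p;
    fb->buf[fb->count].epoch = s->global_epoch;
    fb->count++;
    return 1;
}

/* Query every thread's epoch and move frees older than the minimum into
   out[], which must hold BUF_CAP entries. Those pointers can then be
   freed "for real". Returns how many were released. */
static size_t reclaim(const epoch_state *s, int nthreads, free_buffer *fb, void **out)
{
    uint64_t min = s->thread_epoch[0];
    for (int i = 1; i < nthreads; i++)
        if (s->thread_epoch[i] < min)
            min = s->thread_epoch[i];

    size_t released = 0, kept = 0;
    for (size_t i = 0; i < fb->count; i++) {
        if (fb->buf[i].epoch < min)
            out[released++] = fb->buf[i].ptr;   /* no transaction can still see it */
        else
            fb->buf[kept++] = fb->buf[i];       /* keep waiting */
    }
    fb->count = kept;
    return released;
}
```

The work done here is proportional to the number of buffered frees and the number of threads, independent of how many memory accesses the transactions perform.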


McRT-Malloc Beats Hoard

The Machias benchmark mimics the producer-consumer pattern with a minimal workload

(Normalized so X axis indicates linear scaling)

Figure: McRT Malloc vs. Hoard. X axis: SW threads (1, 2, 4, 8, 16, 32, 64, 128). Y axis: HW thread normalized time, ticks at 1, 10, 100, 1000 (lower is better). Series: Hoard at 100%, 50%, 25%, 12.5%, and 0% sharing, plus a second set at the same five sharing levels with no prefix in the legend.


McRT STM Malloc Running Machias

Figure: McRT STM Malloc. X axis: SW threads (1, 2, 4, 8, 16, 32, 64, 128). Y axis: HW thread normalized time, ticks at 1, 10, 100, 1000 (lower is better). Series: 100%, 50%, 25%, 12.5%, and 0% sharing.


McRT STM vs. McRT Malloc Running Machias

Figure: McRT STM Malloc vs. McRT Malloc. X axis: SW threads (1, 2, 4, 8, 16, 32, 64, 128). Y axis: HW thread normalized time, ticks at 1, 10, 100, 1000 (lower is better). Series: Transactional, Non-Transactional.


McRT STM vs. McRT Malloc Memory Usage Running Machias

Figure: Memory Usage For Machias. X axis: Machias scenario (1 to 39). Y axis: MBytes used (0 to 40). Series: McRT-STM-Malloc, McRT Malloc.


Conclusion

Best of class scalable malloc implementation

Non-blocking to enable other McRT algorithms to be non-blocking and still use malloc

Solved memory blowup within a transaction

Solved premature freeing problem for STM with optimistic concurrency

Enabled object granularity conflict detection in C


Questions