Pay-to-use strong atomicity on conventional hardware
![Page 1: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/1.jpg)
Pay-to-use strong atomicity on conventional hardware
Martín Abadi, Tim Harris, Mojtaba Mehrara
Microsoft Research
![Page 2: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/2.jpg)
Our approach
• Strong semantics: atomic, retry, ... What, ideally, should these constructs do?
• Programming discipline(s): What does it mean for a program to use the constructs correctly?
• Low-level semantics & actual implementations: transactions, optimistic concurrency, program transformations, weak memory models, ...
![Page 3: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/3.jpg)
Programming disciplines
• Which programs are correctly synchronized?
[Diagram: classes of programs, labelled in order: All programs; Violation-free programs; Obeying dynamic separation; Obeying static separation. Axis labels: "More implementation flexibility" and "More programs correctly synchronized".]
![Page 4: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/4.jpg)
Strong atomicity
• Direct accesses work like single-access transactions
• We would like:
– Implementation flexibility; ongoing innovation in STM/hybrid techniques, optimizations, ...
  • Invisible / visible readers
  • In-place / deferred updates
  • Eager / lazy conflict detection
– No overhead on direct accesses
– Robust performance, not dependent on success of static analyses
![Page 5: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/5.jpg)
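Strong atomicity: implementation
[Diagram: the physical address space is mapped into the virtual address space as two views, the Tx-heap and the Direct-heap. Memory accesses from atomic blocks go to the Tx-heap; direct memory accesses go to the Direct-heap.]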
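The two-view layout can be illustrated with a minimal sketch, assuming a temporary file stands in for the physical heap and two shared mappings of it stand in for the Tx-heap and Direct-heap views. The names `make_two_views`, `direct_view` and `tx_view` are hypothetical, and the sketch shows only the aliasing, not the page protection.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def make_two_views(size=PAGE):
    """Map one backing file twice; both mappings alias the same pages."""
    backing = tempfile.TemporaryFile()
    backing.truncate(size)                            # reserve `size` bytes of storage
    direct_view = mmap.mmap(backing.fileno(), size)   # stands in for the Direct-heap
    tx_view = mmap.mmap(backing.fileno(), size)       # stands in for the Tx-heap
    return backing, direct_view, tx_view


backing, direct_view, tx_view = make_two_views()

# A store through the Tx-heap view is visible through the Direct-heap view,
# because both views are backed by the same storage.
tx_view[0:4] = (1234).to_bytes(4, "little")
value_seen_directly = int.from_bytes(direct_view[0:4], "little")
print(value_seen_directly)
```

Because the views alias the same storage, protecting pages of one view restricts direct accesses without affecting accesses made through the other view.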
![Page 6: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/6.jpg)
Writes from atomic blocks
[Diagram: physical address space; virtual address space with Tx-heap and Direct-heap views; direct memory accesses; memory accesses from atomic blocks.]
1. Atomic block attempts to write to a field of an object
![Page 7: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/7.jpg)
Writes from atomic blocks
2. Revoke direct access to the page holding the direct view of the object
![Page 8: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/8.jpg)
Writes from atomic blocks
3. Use underlying STM write primitives
![Page 9: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/9.jpg)
Writes from atomic blocks
4. Restore direct access once the underlying transaction has finished and an access violation (AV) occurs
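The four steps can be modelled in a few lines. This is a single-threaded sketch under stated assumptions, not the authors' implementation: the classes `Heap`, `Transaction` and `AccessViolation` are hypothetical, page protection is a boolean per page rather than a hardware permission, and the STM write primitive is replaced by a simple deferred-update log.

```python
class AccessViolation(Exception):
    """Stands in for the hardware fault raised on a protected page."""

    def __init__(self, page):
        super().__init__(page)
        self.page = page


class Heap:
    def __init__(self, num_pages, page_size=4):
        self.page_size = page_size
        self.memory = [0] * (num_pages * page_size)  # shared physical storage
        self.direct_ok = [True] * num_pages          # is the direct view accessible?
        self.busy = [0] * num_pages                  # unfinished transactions per page

    def direct_read(self, addr):
        page = addr // self.page_size
        if not self.direct_ok[page]:
            raise AccessViolation(page)
        return self.memory[addr]

    def handle_av(self, page):
        """Step 4: restore direct access only once the transaction has finished."""
        if self.busy[page]:
            return False  # a real handler would wait for the transaction here
        self.direct_ok[page] = True
        return True


class Transaction:
    def __init__(self, heap):
        self.heap = heap
        self.pending = {}   # assumed deferred updates: addr -> value
        self.pages = set()

    def write(self, addr, value):                 # step 1: write from an atomic block
        page = addr // self.heap.page_size
        if page not in self.pages:
            self.pages.add(page)
            self.heap.busy[page] += 1
            self.heap.direct_ok[page] = False     # step 2: revoke direct access
        self.pending[addr] = value                # step 3: placeholder for the STM write

    def commit(self):
        for addr, value in self.pending.items():
            self.heap.memory[addr] = value
        for page in self.pages:
            self.heap.busy[page] -= 1             # protection is restored lazily, on an AV


heap = Heap(num_pages=2)
tx = Transaction(heap)
tx.write(1, 42)

try:
    heap.direct_read(1)
    faulted = False
except AccessViolation as av:
    faulted = True
    restored_early = heap.handle_av(av.page)      # transaction still running

tx.commit()

try:
    heap.direct_read(1)
except AccessViolation as av:
    restored_late = heap.handle_av(av.page)       # transaction finished: restore

value = heap.direct_read(1)
```

In this model, access to the page stays revoked after commit and is restored only when a later direct access faults, matching the wording of step 4.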
![Page 10: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/10.jpg)
Avoiding Access Violations
1. Safe accesses in runtime system code
– Virtual method tables and array length
– Memory allocation structures (e.g. free list)
– STM implementation structures
– GC implementation
Forward all these to the TX-heap at compile time
![Page 11: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/11.jpg)
Avoiding Access Violations
2. Safe accesses in normal code
– Normal writes to locations that haven’t been read or written in a TX
– Normal reads from locations that haven’t been written in a TX
3. Safe accesses in TX code
– TX writes to locations that haven’t been read or written outside TXs
– TX reads from locations that haven’t been written outside TXs
Callouts: "Forward to TX-heap"; "Avoid page-level tracking"
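The four safety conditions in items 2 and 3 can be written as a small predicate. This is a sketch assuming the sets of locations touched inside and outside transactions are already known (for example from a static analysis); `AccessSets` and `is_safe` are hypothetical names, and locations are plain strings here.

```python
from dataclasses import dataclass, field


@dataclass
class AccessSets:
    """Locations read or written inside transactions (tx_*) and outside (normal_*)."""
    tx_reads: set = field(default_factory=set)
    tx_writes: set = field(default_factory=set)
    normal_reads: set = field(default_factory=set)
    normal_writes: set = field(default_factory=set)


def is_safe(sets, location, in_tx, is_write):
    """Return True if the access meets the slide's definition of a safe access."""
    if in_tx:
        if is_write:
            # TX write: location never read or written outside TXs
            return (location not in sets.normal_reads
                    and location not in sets.normal_writes)
        # TX read: location never written outside TXs
        return location not in sets.normal_writes
    if is_write:
        # Normal write: location never read or written in a TX
        return location not in sets.tx_reads and location not in sets.tx_writes
    # Normal read: location never written in a TX
    return location not in sets.tx_writes


sets = AccessSets(
    tx_reads={"a", "c"},
    tx_writes={"b"},
    normal_reads={"a", "d"},
    normal_writes={"d"},
)

print(is_safe(sets, "a", in_tx=False, is_write=False))  # normal read of a
print(is_safe(sets, "a", in_tx=False, is_write=True))   # normal write of a
```

Read-read sharing between transactional and normal code stays safe under these rules; only a write on one side conflicts with any access on the other.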
![Page 12: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/12.jpg)
Sample Code

    private int ComputeUniqueSegments (int nthreads) {
        int numUniqueSegment = 0;
        for (int i = 0; i < nthreads; i++)
            numUniqueSegment += this.uniqueSegments[i].Count;
        return numUniqueSegment;
    }

    Genome_Sequencer_ComputeUniqueSegments::loop:
        mov eax,dword ptr [edi+0x20]         // load uniqueSegments array reference
        cmp ebx,dword ptr [eax+0x4]          // check index against array bounds
        jae outOfRange
        mov ecx,dword ptr [eax+ebx*4+0x08]   // load array element
        mov eax,dword ptr [ecx]              // load Count function pointer
        call dword ptr [eax+0x88]            // call Count (get) function
        add ebp,eax                          // add it to numUniqueSegment
        add ebx,1
        cmp ebx,esi
        jl loop

Access immutable runtime-system data:

    cmp ebx,dword ptr [eax+0x40000004]         // check index against array bounds
    mov eax,dword ptr [ecx+0x40000000]         // load Count function pointer
    call dword ptr [eax+0x40000088]            // call Count (get) function

Safe normal access:

    mov ecx,dword ptr [eax+ebx*4+0x40000008]   // load array element
    mov eax,dword ptr [edi+0x40000020]         // load uniqueSegments array reference
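The rewritten instructions differ from the originals only in their displacement, which grows by 0x40000000. A minimal sketch of that rewrite follows, assuming (as the slide's instructions suggest) that the Tx-heap view sits at a fixed offset of 0x40000000 from the direct view. The function `forward_to_tx_heap` is hypothetical and works on instruction text for illustration only; a compiler would apply the change to its intermediate representation.

```python
import re

TX_VIEW_OFFSET = 0x40000000  # offset inferred from the rewritten instructions

_DISPLACEMENT = re.compile(r"\+0x([0-9a-fA-F]+)\]")


def forward_to_tx_heap(instruction):
    """Rewrite a '[...+0xNN]' memory operand so it addresses the Tx-heap view."""
    def bump(match):
        return "+0x%08x]" % (int(match.group(1), 16) + TX_VIEW_OFFSET)

    if _DISPLACEMENT.search(instruction):
        return _DISPLACEMENT.sub(bump, instruction)
    # Operand has no displacement yet: add one.
    return instruction.replace("]", "+0x%08x]" % TX_VIEW_OFFSET)


for original in (
    "mov eax,dword ptr [edi+0x20]",
    "cmp ebx,dword ptr [eax+0x4]",
    "mov ecx,dword ptr [eax+ebx*4+0x08]",
    "mov eax,dword ptr [ecx]",
    "call dword ptr [eax+0x88]",
):
    print(forward_to_tx_heap(original))
```

Since the change is confined to a constant inside the addressing mode, a forwarded access costs the same number of instructions as the original one.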
![Page 13: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/13.jpg)
Exploiting Safe Accesses
• Implemented by extending Steensgaard’s points-to analysis
• Only safe accesses from normal code were beneficial
• Little benefit from identifying safe accesses from inside atomic blocks

#page-table changes:

|        | Genome | Delaunay | Labyrinth | Vacation |
|--------|--------|----------|-----------|----------|
| Before | 31 K   | 43       | 147       | 41 K     |
| After  | 31 K   | 39       | 36        | 38 K     |
| Ratio  | 99%    | 90%      | 36%       | 92%      |
![Page 14: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/14.jpg)
Patching access violations
• Patch sites of AVs
• Our heuristic:
– Patch on first AV
– Also change page protection as normal
• Future work:
– Remove patches if they become unnecessary
– Make multiple patches to bound worst-case performance
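The "patch on first AV" heuristic can be sketched as a small simulation. The class `PatchingRuntime` and the path names are hypothetical. The sketch assumes that a patched site uses an STM-aware access path that no longer depends on page protection, and that the faulting page's transaction has already finished when the handler runs.

```python
class PatchingRuntime:
    """Toy model: code sites that fault once are patched and never fault again."""

    def __init__(self):
        self.patched_sites = set()    # code locations rewritten after an AV
        self.protected_pages = set()  # pages whose direct view is revoked
        self.av_count = 0

    def protect(self, page):
        """A transaction revokes direct access to `page`."""
        self.protected_pages.add(page)

    def access(self, site, page):
        """Perform a normal-code access from `site` to `page`; return the path taken."""
        if site in self.patched_sites:
            return "patched"                    # assumed STM-aware path, never faults
        if page in self.protected_pages:
            self.av_count += 1
            self.patched_sites.add(site)        # patch on first AV
            self.protected_pages.discard(page)  # also change page protection as normal
            return "av"
        return "direct"


rt = PatchingRuntime()
rt.protect(0)
first = rt.access("site_A", 0)    # faults, gets patched, page is unprotected
rt.protect(0)                     # a later transaction revokes the page again
second = rt.access("site_A", 0)   # patched site: no second fault
third = rt.access("site_B", 0)    # a different, unpatched site still faults once
fourth = rt.access("site_C", 1)   # unprotected page, unpatched site
print(first, second, third, fourth, rt.av_count)
```

Under this heuristic each code site can fault at most once, which is why the slide lists removing unnecessary patches as future work: a patched site keeps taking the slower path even when its pages are no longer contended.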
![Page 15: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/15.jpg)
Results - Vacation
[Bar chart: execution time (s), axis 0 to 10. Bar labels in order: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis.]
![Page 16: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/16.jpg)
Results - Delaunay
[Bar chart: execution time (s), axis 0 to 7. Bar labels in order: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis.]
![Page 17: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/17.jpg)
Results - Genome
[Bar chart: execution time (s), axis 0 to 3. Bar labels in order: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis.]
![Page 18: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/18.jpg)
Results - Labyrinth
[Bar chart: execution time (s), axis 7.8 to 9.2. Bar labels in order: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis.]
![Page 19: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/19.jpg)
Scaling
[Four line charts, one per benchmark (Labyrinth, Vacation, Delaunay, Genome): normalized execution time (axis 0 to 1.2) against #Threads (1 to 8). Series: SA (patch AV + analysis) and WA.]
![Page 20: Pay-to-use strong atomicity on conventional hardware](https://reader031.fdocuments.in/reader031/viewer/2022012922/568160b8550346895dcfde8b/html5/thumbnails/20.jpg)
Conclusion
• Weak atomicity is an obstacle to providing clear semantics for TM models
• We use conventional memory-protection hardware to provide strong atomicity
• This comes at a low performance cost, but a high runtime complexity cost
• The performance hit can be lowered by compile-time analysis