Data Representation Synthesis PLDI’2011, ESOP’12, PLDI’12 Peter Hawkins, Stanford University...
-
Upload
osborn-lester -
Category
Documents
-
view
216 -
download
1
Transcript of Data Representation Synthesis PLDI’2011, ESOP’12, PLDI’12 Peter Hawkins, Stanford University...
Data Representation SynthesisPLDI’2011, ESOP’12, PLDI’12
Peter Hawkins, Stanford UniversityAlex Aiken, Stanford University
Kathleen Fisher, TuftsMartin Rinard, MITMooly Sagiv, TAU
http://theory.stanford.edu/~hawkinsp/
Research Interests
• Verifying properties of low level data structure manipulations
• Synthesizing low level data structure manipulations– Composing ADT operations
• Concurrent programming via smart libraries
Motivation
• Sophisticated data structures exist• Increase abstraction level• Simplifies programming– Concurrency
• Integrated into modern PL• But hard to use– Composing several operations
next next next next next
table[0]
[1][2][3][4][5]
null
index= 2 index = 0 index = 3 index = 1 index= 52 0 3 1 1
thttpd: Web ServerConn
Map
Representation Invariants:1. n: Map. v: Z. table[v] =n n->index=v
2. n: Map. n->rc =|{n' : Conn . n’->file_data = n}|
static void add_map(Map *m){ int i = hash(m); … table[i] = m ; … m->index= i ; … m->rc++;
}
thttptd:mmc.c
Representation Invariants:1. n: Map. v:Z. table[v] =n index[n]=v
2. n:Map. rc[n] =|{n' : Conn . file_data[n’] = n}|
broken
restored
attr = new HashMap();…
Attribute removeAttribute(String name){ Attribute val = null; synchronized(attr) { found = attr.containsKey(name) ; if (found) { val = attr.get(name); attr.remove(name); } } return val;}
TOMCAT Motivating ExampleTOMCAT 5.*
Invariant: removeAttribute(name) returns the removed value or null if the name does not exist
attr = new ConcurrentHashMap();…
Attribute removeAttribute(String name){ Attribute val = null; /* synchronized(attr) { */ found = attr.containsKey(name) ; if (found) { val = attr.get(name); attr.remove(name); } /* } */ return val;}
TOMCAT Motivating ExampleTOMCAT 6.*
attr.put(“A”, o);
attr.remove(“A”);
o
Invariant: removeAttribute(name) returns the removed value or null if the name does not exist
Multiple ADTs
Invariant: Every element that added to eden is either in eden or in longterm
public void put(K k, V v) { if (this.eden.size() >= size) { this.longterm.putAll(this.eden); this.eden.clear(); } this.eden.put(k, v);}
filesystem=1s_list
s_files
filesystem=2s_list
s_files
filesystems
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
Filesystem Example
Filesystem Example
Access Patterns
• Find all mounted filesystems
• Find cached files on each filesystem
• Iterate over all used or unused cached files in Least-Recently-Used order
filesystem=1s_list
s_files
filesystem=2s_list
s_files
filesystems
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
11
Desired Properties of Data Representations
• Correctness
• Efficiency for access patterns
• Simple to reason about
• Easy to evolve
Specifying Combination of Shared Data Structures
• Separation Logic• First order logic+ reachability• Monadic 2nd order logic on trees• Graph grammars• …
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
filesystems filesystem=1s_list
s_files
filesystem=2s_list
s_files
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
{fs, file} {inuse}
fs file inuse
1 17 F
1 34 T
1 42 F
2 5 T
2 5 F
Filesystem Example
Filesystem ExampleContainers: View 1
fs:1 fs:2
file:14file:6
file:2 file:7 file:5
filesystem=1s_list
s_files
filesystem=2s_list
s_files
filesystems
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
Filesystem ExampleContainers: View 2
filesystem=1s_list
s_files
filesystem=2s_list
s_files
filesystems
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
fs: 2, file:7fs: 1,
file:14
fs: 2, file:5
inuse:T inuse:F
fs: 1, file:2
Filesystem ExampleContainers: Both Views
fs:1 fs:2
file:14file:6
file:2 file:7 file:5
filesystem=1s_list
s_files
filesystem=2s_list
s_files
filesystems
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_list
file_in_use
file_unused
fs: 2, file:7fs: 1,
file14
fs: 2, file:5
inuse:T inuse:F
fs file inuse
{fs, file} { inuse}
Decomposing Relations
• High-level shape descriptor• A rooted directed acyclic graph• The root represents the whole
relation• The sub-graphs rooted at each
node represent sub-relations• Edges are labeled with columns• Each edge maps a relation into a
sub-relation• Different types of edges for
different primitive data-structures
• Shared nodes correspond to shared sub-relations
fs file inuse
{fs, file} { inuse} inuse
fs, file
fs inuse
file
list
list
array
list
Filesystem Examplefs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F
inuse
fs, file
fs inuse
filefile inuse
14 F
6 T
2 F
file inuse
7 T
5 F
inuse
F
inuse
T
inuse
F
inuse
Tinuse
F
fs:1 fs:2
file:14 file:6 file:2 file:7 file:5
Memory Decomposition(Left)
inuse
fs, file
fs inuse
file
inuse:F inuse:T inuse:F Inuse:T Inuse:F
fs:1 fs:2
file:14 file:6 file:2 file:7 file:5
Filesystem Examplefs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F
fs file2 71 6
fs file1 142 51 2
fs:2file:7
inuse
T
inuse
Tinuse
F
inuse
F
inuse
F
fs:1file:6
fs:1file:14
fs:2file:5
fs:1file:2
inuse:T inuse:F
inuse
fs, file
fs inuse
file
Memory Decomposition(Right)
fs:2file:7
fs:1file:6
fs:1file:14
fs:2file:5
fs:1file:2
inuse:T inuse:F
inuse:F inuse:T inuse:F Inuse:T Inuse:F
inuse
fs, file
fs inuse
file
Decomposition Instance
fs:2file:7
fs:1file:6
fs:1file:14
fs:2file:5
fs:1file:2
inuse:T inuse:F
inuse:F inuse:T inuse:F Inuse:T Inuse:F
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
fs:1
file:14 file:6
file:2
file:7file:5
fs:2
fs file inuse
{fs, file} { inuse} inuse
fs, file
fs inuse
file
Shape Descriptor
fs:2file:7
fs:1file:6
fs:1file:14
fs:2file:5
fs:1file:2
inuse:T inuse:F
inuse:F inuse:T inuse:F Inuse:T Inuse:F
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
fs:1
file:14 file:6
file:2
file:7file:5
fs:2
fs file inuse
{fs, file} { inuse} inuse
fs, file
fs inuse
file
Memory State
fs:2file:7
fs:1file:6
fs:1file:14
fs:2file:5
fs:1file:2
inuse:T inuse:F
inuse:F inuse:T inuse:F Inuse:T Inuse:F
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
fs:1
file:14 file:6
file:2
file:7file:5
fs file inuse
{fs, file} { inuse} inuse
fs, file
fs inuse
filefs:2
list
list
array
list
s_list
f_fs_list f_fs_list f_fs_list
f_list f_list
f_list
Memory State
fs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F
fs file inuse
{fs, file} { inuse} inuse
fs, file
fs inuse
file
list
list
array
list
filesystems
filesystem=1s_list
s_files
filesystem=2s_list
s_files
file=14f_list
f_fs_list
file=6f_list
f_fs_list
file=2f_list
f_fs_list
file=7f_list
f_fs_list
file=5f_list
f_fs_listfile_in_use
file_unused
Adequacy
A decomposition is adequate if it can represent every possible relation matching a relational specification
Adequacy
enforces sufficient conditions for adequacy
Not every decomposition is a good representation of a relation
Adequacy of Decompositions
• All columns are represented• Nodes are consistent with functional
dependencies– Columns bound to paths leading to a common
node must functionally determine each other
Respect Functional Dependencies
file
inuse
{file} {inuse}
Respect Functional Dependencies
file,fs
inuse
{file, fs} {inuse}
Adequacy and Sharing
fs, file
fs inuse
file
inuseColumns bound on a path to an object x must functionallydetermine columns bound on any other path to x
{fs, file}{inuse, fs, file}
Adequacy and Sharing
fs
fs inuse
file
inuse
Columns bound on a path to an object x must functionallydetermine columns bound on any other path to x
{fs, file} {inuse, fs}
Decompositions and Aliasing• A decomposition is an
upper bound on the set of possible aliases
• Example: there are exactly two paths to any instance of node w
The RelC Compiler[PLDI’11]
RelationalSpecification
Programmer
Decomposition
Data structure designer
RelC C++
The RelC Compilerfs fileinuse{fs, file} {inuse}
foreach <fs, file, inuse> filesystems s.t. fs= 5 do …
RelC C++
inuse
fs, file
fs
list inuse
filelist
array
list
Relational Specification
Graph decomposition
Query Plans
foreach <fs, file, inuse> filesystems if inuse=T do …
fs, file
fs
list inuse
file
inuse
list
array
list
scan
scanCost proportional to the number of files
Query Plans
foreach <fs, file, inuse> filesystems if inuse=T do …
fs, file
fs
list inuse
file
inuse
list
array
list
access
scan
Cost proportional to the number of files in use
Removal and graph cutsremove <fs:1>
fs
list
file
inuse
list
array
list
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
filesystems fs:1s_list
s_files
fs:2s_list
s_files
file:14f_list
f_fs_list
file:6f_list
f_fs_list
f_listf_fs_list
file:7f_list
f_fs_list
file:5f_list
f_fs_list
inuse:T
inuse:F
file:2
Removal and graph cutsremove <fs:1>
fs file inuse
1 14 F
2 7 T
2 5 F
1 6 T
1 2 F
filesystems fs:2s_list
s_files
file:7f_list
f_fs_list
file:5f_list
f_fs_list
inuse:T
inuse:F
fs
list
file
inuse
list
array
list
Abstraction Theorem
• If the programmer obeys the relational specification and the decomposition is adequate and if the individual containers are correct
• Then the generated low-level code maintains the relational abstraction
relation relationremove <fs:1>
memorystate
memorystate
low level code remove <fs:1>
40
Filesystem ExampleConcurrent
^
•Lock-based concurrent code is difficult to write and difficult to reason about•Hard to choose the
correct granularity of locking•Hard to use
concurrent containers correctly
41
Granularity of Locking
Fine-grained locking
Coarse-grained locking
More complex
Easier to reason aboutEasier to change
More scalable
Lower overheadLess complex
42
Concurrent Containers• Concurrent containers are hard to write• Excellent concurrent containers exist (e.g.,
Java’s ConcurrentHashMap and ConcurrentSkipListMap)
• Composing concurrent containers is tricky– It is not usually safe to update containers in paralel– Lock association is tricky
• In practice programmers often* use concurrent containers incorrectly
* 44% of the time [Shacham et al., 2011]
43
Concurrent Data Representation Synthesis[PLDI’12]
RelScala
Compiler
Concurrent Data Structures,Atomic Transactions
Relational Specification
Concurrent Decomposition
44
Concurrent Decompositions
• Concurrent decompositions describe how to represent a relation using concurrent containers and locks
• Concurrent decompositions specify: – the choice of containers– the number and placement of
locks– which locks protect which data= ConcurrentHashMap
= TreeMap
45
Concurrent Interface
Interface consisting of atomic operations:
46
Goal: Serializability
Thread 1
Thread 2
Time
Every schedule of operations is equivalent to some serial schedule
47
Goal: Serializability
Time
Thread 1
Thread 2
Every schedule of operations is equivalent to some serial schedule
48
Two-Phase Locking
Two phase locking protocol:
• Well-locked: To perform a read or write, a thread must hold the corresponding lock
• Two-phase: All lock acquisitions must precede all lock releases
Attach a lock to each piece of data
Theorem [Eswaran et al., 1976]: Well-locked, two-phase transactions are serializable
49
Two Phase Locking
Attach a lock to every edge
Problem 2: Too many locks
Decomposition Decomposition Instance
We’re done!
Problem 1: Can’t attach locks to container entries
Lock Placements
1. Attach locks to nodes
Decomposition Decomposition Instance
51
Coarse-Grained LockingDecomposition Decomposition Instance
52
Finer-Grained LockingDecomposition Decomposition Instance
53
Lock StripingDecomposition Decomposition Instance
54
Lock Placements: Domination
Decomposition Decomposition Instance
Locks must dominate the edges they protect
55
Lock Placements: Path-ClosureAll edges on a path between an edge and its lock must share the same lock
56
Decompositions and Aliasing• A decomposition is an upper
bound on the set of possible aliases
• Example: there are exactly two paths to any instance of node w
• We can reason about the relationship between locks and heap cells soundly
57
Lock Ordering
Prevent deadlock via a topological order on locks
58
Queries and Deadlock
2. lookup(tv)
1. acquire(t)
3. acquire(v)
4. scan(vw)
Query plans must acquire the correct locks in the correct order
Example: find files on a particular filesystem
Autotuner
• Given a fixed set of primitive types– list, circular list, doubly-linked list, array, map, …
• A workload• Exhaustively enumerate all the adequate
decompositions up to certain size• The compiler can automatically pick the best
representation for the workload
Directed Graph Example (DFS)• Columns
src dst weight• Functional Dependencies
– {src, dst} {weight}
• Primitive data types– map, list
…
src
dst
weight
map
list
src
dst
dst
src
weight
map
listlist
mapdst
src
weightm
aplist
dst
weight
map srcmap
weight
dst
listsrclist
61
Concurrent SynthesisFind optimal combination of
Decomposition Shape ContainerData Structures
ConcurrentHashMap
ConcurrentSkipListMap
CopyOnWriteArrayList
Array
HashMap
TreeMap
LinkedList
Lock Implementations
ReentrantReadWriteLock
ReentrantLock
Lock Striping Factors
Lock Placement
62
Concurrent Graph Benchmark
• Start with an empty graph• Each thread performs 5 x 105 random
operations• Distribution of operations a-b-c-d (a% find
successors, b% find predecessors, c% insert edge, d% remove edge)
• Plot throughput with varying number of threads Based on Herlihy’s benchmark of concurrent maps
Black = handwritten,isomorphic to blue
Key
= ConcurrentHashMap= HashMap
...
63
Results: 35-35-20-1035% find successor, 35% find predecessor, 20% insert edge, 10% remove edge
(Some) Related Projects
• Relational synthesis: [Cohen & Campbell 1993], [Batory & Thomas 1996], [Smaragdakis & Batory 1997], [Batory et al. 2000] [Manevich, 2012] …
• Two-phase locking and Predicate Locking [Eswaran et al., 1976], Tree and DAG locking protocols [Attiya et al., 2010], Domination Locking [Golan-Gueta et al., 2011]
• Lock Inference for Atomic Sections: [McCloskey et al.,2006], [Hicks, 2006], [Emmi, 2007]
Summary
• Decompositions describe how to represent a relation using concurrent containers
• Lock placements capture different locking strategies
• Synthesis explores the combined space of decompositions and lock placements to find the best possible concurrent data structure implementations
The Big Idea
• Simpler code• Correctness by construction• Find better combinations of data structures
and locking
Benefits
Relational Specification
Low Level Concurrent Code
Transactional Objects with Foresight
Guy GuetaG. Ramalingam
Mooly SagivEran Yahav
Motivation
• Modern libraries provide atomic ADTs• Hidden synchronization complexity• Not composable– Sequences of operations are not necessarily
atomic
Transactional Objects with Foresight
• The client declares the intended use of an API via temporal specifications (foresight)
• The library utilizes the specification– Synchronize between operations which do not
serialize with foresight• Foresight can be automatically inferred with
sequential static analysis
Example: Positive CounterOperation A@mayUse(Inc);Inc()Inc()@mayUse()
Operation B@mayUse(Dec);Dec()Dec()@mayUse()
• Composite operations are not atomic• even when Inc/Dec are atomic
• Enforce atomicity using future information • Future information can be utilized for effective
synchronization
Possible Executions (initial value=0)
Main Results
• Define the notion of ADTs that utilize foresight information– Show that they are strict generalization of existing locking
mechanisms• Create a Java Map ADT which utilizes foresight • Static analysis for inferring conservative foresight
information• Performance is comparable to manual
synchronization• Can be used for real composite operations
The RelC Compilerfs fileinuse{fs, file} {inuse}
foreach <fs, file, inuse> filesystems s.t. fs= 5 do …
RelC C++
inuse
fs, file
fs
list inuse
filelist
array
list
Relational Specification
Graph decomposition