Motivation Parallel programming is difficult Culprit: Non-determinism Interleaving of parallel...

59

Transcript of Motivation Parallel programming is difficult Culprit: Non-determinism Interleaving of parallel...

Page 1: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.
Page 2: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

MotivationMotivation

Parallel programming is difficult

Culprit: Non-determinism• Interleaving of parallel threads• But required to harness parallelism

Sequential programs produce deterministic results

To make parallel programming easy• We want determinism• Same input => semantically same output.

Page 3: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Determinism EffortsDeterminism Efforts Language design [DPJ, X10,Yada]

• Deterministic by design: Types and annotations• Constrains programming• May need non-determinism for better performance

Deterministic runtime [DMP, Kendo]

Race detection • Absence of data races does not imply determinism• Data races could be benign and could help in improving

performance

Determinism Checker [SingleTrack, Torrellas et al.]• Dynamic determinism checker.

Determinism with respect to abstraction [Rinard et al.]

Page 4: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Our GoalOur Goal

Provide a framework that can express relevant determinism directly and easily• Separate from functional correctness

specification• Allow data races and healthy non-

determinism• Language independent• Useful for QA

Page 5: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Determinism specification: A sweet spot?

• Lightweight, but precise.

Determinism specification: A sweet spot?

• Lightweight, but precise.

Our GoalOur Goal How to specify correctness of parallelism?

Implicit:

No sources ofnon-determinism(no data races)

Implicit:

No sources ofnon-determinism(no data races)

Explicit:

Full functional correctness.

Explicit:

Full functional correctness.

Page 6: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

OutlineOutline

Motivation

Deterministic Specification [FSE’09]

Checking: Active Testing

Experimental Evaluation

Future Work + Conclusions

Page 7: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Deterministic SpecificationDeterministic Specification

Goal: Specify deterministic behavior.• Same initial parameters => same image.• Non-determinism is internal.

// Parallel fractal render mandelbrot(params, img); // Parallel fractal render mandelbrot(params, img);

Page 8: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Deterministic SpecificationDeterministic Specification

Program: , Initial State: , Schedules:

Specifies: Two runs from same initial program state have same result state for any pair of schedules

deterministic { // Parallel fractal render mandelbrot(params, img);}

deterministic { // Parallel fractal render mandelbrot(params, img);}

111'

0100 ))'()(('.,, sssssss

P

s0

, '

Page 9: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

double A[][], b[], x[];...deterministic { // Solve A*x = b in parallel lufact_solve(A, b, x);}

double A[][], b[], x[];...deterministic { // Solve A*x = b in parallel lufact_solve(A, b, x);}

Deterministic SpecificationDeterministic Specification

Too restrictive – different schedules may give slightly different floating-point results.

Page 10: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

set t = new RedBlackTreeSet();deterministic { t.add(3) || t.add(5);}

set t = new RedBlackTreeSet();deterministic { t.add(3) || t.add(5);}

Deterministic SpecificationDeterministic Specification

Too restrictive – internal structure of set may differ depending on order of adds.

Page 11: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

deterministic { // Parallel branch-and-bound Tree t = min_phylo_tree(data);}

deterministic { // Parallel branch-and-bound Tree t = min_phylo_tree(data);}

Deterministic SpecificationDeterministic Specification

Too restrictive – search can correctly return any tree with optimal cost.

Page 12: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Semantic DeterminismSemantic Determinism

Too strict to require every interleaving to give exact same program state:

deterministic { P }

deterministic { P }

111'

0100 ))'()(('.,, sssssss

Page 13: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Semantic DeterminismSemantic Determinism

Too strict to require every interleaving to give exact same program state:

deterministic { P }

deterministic { P } Predicate!

Should beuser-defined.

111'

0100 ))'()(('.,, sssssss

Page 14: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Semantic DeterminismSemantic Determinism

Too strict to require every interleaving to give exact same program state:

Specifies: Final states are equivalent.

deterministic { P } assert Post(s1,s1’)

deterministic { P } assert Post(s1,s1’)

)',())'()(('.,, 111'

0100 ssPostsssss

Page 15: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

double A[][], b[], x[];...deterministic { // Solve A*x = b in parallel lufact_solve(A, b, x);} assert (|x – x’| < ε)

double A[][], b[], x[];...deterministic { // Solve A*x = b in parallel lufact_solve(A, b, x);} assert (|x – x’| < ε)

Semantic DeterminismSemantic Determinism

“Bridge” predicate

Page 16: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Resulting sets are semantically equal.

set t = new RedBlackTreeSet();deterministic { t.add(3) || t.add(5);} assert (t.equals(t’))

set t = new RedBlackTreeSet();deterministic { t.add(3) || t.add(5);} assert (t.equals(t’))

Semantic DeterminismSemantic Determinism

Page 17: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

deterministic { // Parallel branch-and-bound Tree t = min_phylo_tree(data);} assert (t.cost == t’.cost())

deterministic { // Parallel branch-and-bound Tree t = min_phylo_tree(data);} assert (t.cost == t’.cost())

Semantic DeterminismSemantic Determinism

Page 18: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Too strict – initial states must be identical• Not compositional.

Preconditions for DeterminismPreconditions for Determinism set t = … deterministic { t.add(3) || t.add(5); } assert (t.equals(t’)) … deterministic { t.add(4) || t.add(6); } assert (t.equals(t’))

set t = … deterministic { t.add(3) || t.add(5); } assert (t.equals(t’)) … deterministic { t.add(4) || t.add(6); } assert (t.equals(t’))

Page 19: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Preconditions for DeterminismPreconditions for Determinism Too strict to require identical initial states:

deterministic { P} assert Post(s1,s1’)

deterministic { P} assert Post(s1,s1’)

)',())'()(('.,, 111'

0100 ssPostsssss

Page 20: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Preconditions for DeterminismPreconditions for Determinism Too strict to require identical initial states:

deterministic assume (s0 = s0’) { P} assert Post(s1,s1’)

deterministic assume (s0 = s0’) { P} assert Post(s1,s1’)

Post(s1, s1)

))'()''()(('.,,', 001'

01000 ssssssss

Page 21: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Preconditions for DeterminismPreconditions for Determinism Too strict to require identical initial states:

deterministic assume (s0 = s0’) { P} assert Post(s1,s1’)

deterministic assume (s0 = s0’) { P} assert Post(s1,s1’)

Predicate! Should beuser-defined.

Post(s1, s1)

))'()''()(('.,,', 001'

01000 ssssssss

Page 22: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Preconditions for DeterminismPreconditions for Determinism Too strict to require identical initial states:

Specifies:

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

Post(s1, s1)

))',(Pr)''()(('.,,', 001'

01000 ssessssss

Page 23: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

Bridge predicates/assertionsBridge predicates/assertions

“Bridge”predicate

“Bridge”assertion

Page 24: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

set t = ...deterministic assume (t.equals(t’) { t.add(4) || t.add(6);} assert (t.equals(t’))

set t = ...deterministic assume (t.equals(t’) { t.add(4) || t.add(6);} assert (t.equals(t’))

Specifies: Semantically equal sets yield semantically equal sets.

Preconditions for DeterminismPreconditions for Determinism

Page 25: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

AdvantageAdvantage

Separately check parallelism and functional correctness.• Show parallelism is outwardly deterministic.• Reason about correctness sequentially.• Decompose correctness proof!

Example:• Write Cilk program and prove (or test)

sequential correctness.• Add parallelism, answers should not change

Page 26: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Determinism vs. AtomicityDeterminism vs. Atomicity Internal vs. external parallelism/non-determinism

• Complementary notions

AtomicDeterministic

“Closed program” “Open program”

Page 27: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Other TestingOther TestingUse sequential program as spec

deterministic Pre(s0,s0’) {

if (*) {

SequentialPgm;

} else {

ParallelPgm;

}

} assert Post(s,s’);

Regression testing

deterministic Pre(s0,s0’) {

if (*) {

PgmVersion1;

} else {

PgmVersion2;

}

} assert Post(s,s’);

Page 28: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

OutlineOutline

Motivation

Deterministic Specification [FSE’09]

Checking: Active Testing [PLDI 08,09, FSE 08, CAV 09]

Experimental Evaluation

Future Work + Conclusions

Page 29: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Checking: Active TestingChecking: Active Testing

Predicting and Exploring

“interesting” Schedules

Page 30: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Active Testing:Active Testing:Predict and Test Potential BugsPredict and Test Potential Bugs

Predict potential bugs:• Data races: Eraser or lockset based• Atomicity violations: cycle in transactions

and happens-before relation• Deadlocks: cycle in resource acquisition

graph• Memory model bugs: cycle in happens-

before relation

Test schedules to create those bugs

[ASE 07, PLDI 08, FSE 08, PLDI 09, FSE 09, CAV 09]

Page 31: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Active Testing Cartoon: Phase IActive Testing Cartoon: Phase I

31

Potential Collision

1

2

1

2

3

Page 32: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Active Testing Cartoon: Phase IIActive Testing Cartoon: Phase II

32

1

2

1

2

3

Page 33: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

OutlineOutline

Motivation

Deterministic Specification

Checking: Active Testing

Experimental Evaluation

Future Work + Conclusions

Page 34: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Ease of Asserting DeterminismEase of Asserting Determinism

Implemented a deterministic assertion library for Java.

Manually added deterministic assertions to 13 Java benchmarks with 200 – 4k LoC

Typically 5-10 minutes per benchmark• Functional correctness very difficult.

Page 35: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Ease of Use: ExampleEase of Use: Example

Deterministic.open();Predicate eq = new Equals();Deterministic.assume(width, eq);… (9 parameters total) …Deterministic.assume(gamma, eq);

// Compute fractal in threadsint matrix[][] = …;

Deterministic.assert(matrix, eq);Deterministic.close();

Page 36: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Effectiveness in Finding BugsEffectiveness in Finding Bugs

13 Java benchmarks of 200 – 4k LoC

Ran benchmarks on ~100 schedules• Schedules with data races and other

“interesting” interleavings (active testing)

For every pair of executions ofdeterministic Pre { P } Post:

check that:

Pre(s0, s0) Post(s1, s1))''()( 1

'010 ssss

Page 37: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Experiments: Java Grande ForumExperiments: Java Grande Forum

Benchmark LoCData Races

Found | Errors

High-Level Races

Found | Errors

sor 300 2 0 0 0

moldyn 1.3k 2 0 0 0

lufact 1.5k 1 0 0 0

raytracer 1.9k 3 1 0 0

montecarlo 3.6k 1 0 2 0

Page 38: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Experiments: Parallel Java LibExperiments: Parallel Java Lib

Benchmark LoCData Races

Found | Errors

High-Level Races

Found | Errors

pi 150 9 0 1+ 1

keysearch3 200 3 0 0+ 0

mandelbrot 250 9 0 0+ 0

phylogeny 4.4k 4 0 0+ 0

tsp* 700 6 0 2 0

Page 39: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Experimental EvaluationExperimental Evaluation

Across 13 benchmarks:

Found 40 data races.• 1 violates deterministic assertions.

Page 40: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Experimental EvaluationExperimental Evaluation

Across 13 benchmarks:

Found 40 data races.• 1 violates deterministic assertions.

Found many interesting interleavings(non-atomic methods, lock races, etc.)• 1 violates deterministic assertions.

Page 41: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Determinism ViolationDeterminism Violation

Pair of calls to nextDouble() mustbe atomic.

deterministic { // N trials in parallel. foreach (n = 0; n < N; n++) { x = Random.nextDouble(); y = Random.nextDouble(); … }} assert (|pi - pi’| < 1e-10)

deterministic { // N trials in parallel. foreach (n = 0; n < N; n++) { x = Random.nextDouble(); y = Random.nextDouble(); … }} assert (|pi - pi’| < 1e-10)

Page 42: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

OutlineOutline

Motivation

Deterministic Specification

Checking: Active Testing

Experimental Evaluation

Future Work + Conclusions

Page 43: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

How to verify: Essential RuleHow to verify: Essential Rule

deterministic assume(Φ1)

A11;A12;A13|| A21;A22;A23|| A31;A32;A33

} assert (Φ2)

Page 44: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

How to verify: Essential RuleHow to verify: Essential Rule

deterministic assume(Φ1)

A11;A12;A13|| A21;A22;A23|| A31;A32;A33

} assert (Φ2)

Find an invariant Φ and prove that for all i, j, l, m

deterministic assume(Φ)

Aij|| Alm

} assert (Φ)

and Φ1 → Φ and Φ → Φ2

Page 45: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Prove for the followingProve for the followingdeterministic assume(true) {

parallel while (!wq.is_empty()) {

[work = wq.get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

} assert(best=best’)

Page 46: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Decompose the proofDecompose the proof

Let us prove that P ≈ S

That is prove that

deterministic assume(Pre(S0,S’0)) {

if (*) P else S

} assert (Post(S,S’))

Page 47: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Decompose the proofDecompose the proof

Let us prove that P ≈ S

Create a non deterministic sequential program NS [Galois]

P ≡ NS ≈ S

Page 48: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Decompose the proofDecompose the proof

Let us prove that P ≈ S

Create a non deterministic sequential program NS

P ≡ NS ≈ S

deterministic assume(S0=S’0) {

if (*) P else NS

} assert (S=S’)

deterministic assume(Pre(S0,S’0)) {

if (*) NS else S

} assert (Post(S,S’))

Page 49: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

How to verify?How to verify?while (!wq.is_empty()) {

[work = wq.nondet_get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

deterministic assume(true) {

parallel while (!wq.is_empty()) {

[work = wq.get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

} assert(best=best’)

Not deterministic sequential program

Page 50: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

How to verify?How to verify?while (!wq.is_empty()) {

[work = wq.nondet_get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

Barrier()

deterministic assume(true) {

parallel while (!wq.is_empty()) {

[work = wq.get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

} assert(best=best’)

Not deterministic sequential program

Page 51: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

How to verify?How to verify?while (!wq.is_empty()) {

[work = wq.nondet_get();]

if (bound(work) >= best && non_det)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

Barrier()

deterministic assume(true) {

parallel while (!wq.is_empty()) {

[work = wq.get();]

if (bound(work) >= best)

continue;

if (size(work) <= threshold) {

s = find_best_solution(work);

[best = min(best,cost(s));]

} else {

[wq.add_all(split(work));]

}

}

} assert(best=best’)

Not deterministic sequential program

Page 52: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Verifying DeterminismVerifying Determinism

Verify determinismof each piece.

No need to considercross product of allinterleavings.

P

P

P

P

P

P

Page 53: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Verifying DeterminismVerifying Determinism

Compositional reasoning for determinism?

P

Q

Q

Q

Q

Q

Q

P P

Page 54: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

SummarySummary

“Bridge” predicates and assertions• Simple to assert natural determinism• Semantic determinism

Active Testing• Direct search based on imprecise bug reports

Verify/prove determinism• Prove exact equivalence with non deterministic

sequential programs• Prove semantic equivalence between non

deterministic sequential programs and sequential programs

Page 55: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.
Page 56: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.
Page 57: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Deterministic Assertion LibraryDeterministic Assertion Library Implemented assertion library for Java:

Records set to check: eq.apply(set0,set0’) => eq.apply(set,set’)

Predicate eq = new Equals();Deterministic.open();Deterministic.assume(set, eq); ...Deterministic.assert(set, eq);Deterministic.close();

Predicate eq = new Equals();Deterministic.open();Deterministic.assume(set, eq); ...Deterministic.assert(set, eq);Deterministic.close();

Page 58: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

Checking DeterminismChecking Determinism

Run P on some number of schedules.

For every pair and ofexecutions of P:

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

deterministic assume Pre(s0,s0’) { P} assert Post(s1,s1’)

s0 s1

s0 s1

Pre(s0,s0) Post(s1,s1)

Page 59: Motivation  Parallel programming is difficult  Culprit: Non-determinism Interleaving of parallel threads But required to harness parallelism  Sequential.

OutlineOutline

Motivation (3 min)

Deterministic Specification (6 min)

Experimental Evaluation (5 min)

Related Work (3 min)

Future Work + Conclusions (3 min)