Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world...

123
Towards Certified Separate Compilation for Concurrent Programs Hanru Jiang* Hongjin LiangSiyang Xiao* Junpeng Zha* Xinyu Feng* University of Science and Technology of China Nanjing University

Transcript of Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world...

Page 1: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Towards Certified Separate Compilation for

Concurrent ProgramsHanru Jiang* Hongjin Liang†

Siyang Xiao* Junpeng Zha* Xinyu Feng†

* University of Science and Technology of China† Nanjing University

Page 2: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compilers are NOT Trustworthy

[PLDI 2011]

• 11 open-source/commercial compilers were tested

• Found 325 bugs, in EVERY compiler!

Page 3: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compilers are NOT Trustworthy

[PLDI 2011]

• 11 open-source/commercial compilers were tested

• Found 325 bugs, in EVERY compiler!

Verification of compiler correctness helps:

“The striking thing about our CompCert results is that the middle end bugs we found in all other compilers are absent.”

Page 4: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compilation Correctness

S

T

Compiler

Source (e.g. C)

Target (e.g. assembly)

∀S, T . T = Compiler(S) ⟹ T ⊆ S

Correct(Compiler) :

Semantic preservation: T has no more observable behaviors (e.g. I/O events by print) than S.

Page 5: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compiler Verification

Page 6: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compiler Verification• Leroy’06: Formal certification of a compiler back-end

• Lochbihler’10: Verifying a compiler for Java threads

• Myreen’10: Verified just-in-time compiler on x86

• Sevcik et al.’11: Relaxed-memory concurrency and verified compilation

• Zhao et al.’13: Formal verification of SSA-based optimizations for LLVM

• Kumar et al.’14: CakeML: A verified implementation of ML

• Stewart et al.’15: Compositional CompCert

• Kang et al.’16: Lightweight Verification of Separate Compilation

• …

Page 7: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compiler Verification• Leroy’06: Formal certification of a compiler back-end

• Lochbihler’10: Verifying a compiler for Java threads

• Myreen’10: Verified just-in-time compiler on x86

• Sevcik et al.’11: Relaxed-memory concurrency and verified compilation

• Zhao et al.’13: Formal verification of SSA-based optimizations for LLVM

• Kumar et al.’14: CakeML: A verified implementation of ML

• Stewart et al.’15: Compositional CompCert

• Kang et al.’16: Lightweight Verification of Separate Compilation

• …

Limited support of separate compilation and concurrency!

Page 8: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Real-world programs may consist of multiple components, which will be compiled independently.

Page 9: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

Real-world programs may consist of multiple components, which will be compiled independently.

// Module S1 extern void g(int *x); int f(){ int a = 0, b = 0; g(&b); return a + b; }

// Module S2 void g(int *x){ *x = 3; }

Page 10: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

Page 11: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

T1Target (e.g. assembly)

Compiler-1

Page 12: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

T1 T2Target (e.g. assembly)

Compiler-1 Compiler-2

Page 13: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

T1 T2Target (e.g. assembly)

Compiler-1 Compiler-2

Interaction

Page 14: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

T1 T2Target (e.g. assembly)

Compiler-1 Compiler-2

Interaction

Different compilersDifferent

compilers

Page 15: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation

S1 S2Source (e.g. C)

Interaction

T1 T2Target (e.g. assembly)

Compiler-1 Compiler-2

Interaction

Different compilersDifferent

compilers

Different Languages

Different languages

Page 16: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation of Concurrent Programs

S1 S2Source (e.g. C)

T1 T2Target (e.g. assembly)

Compiler-1 Compiler-2

Interaction

Interaction

Page 17: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation of Concurrent Programs

S1 S2Source (e.g. C)

Parallel Composition

T1 T2Target (e.g. assembly)

Parallel Composition

||

||

Compiler-1 Compiler-2

Page 18: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Separate Compilation of Concurrent Programs

S1 S2Source (e.g. C)

Parallel Composition

T1 T2Target (e.g. assembly)

Parallel Composition

||

||

Can we reuse existing certified compilers (e.g. CompCert) for separate compilation of concurrent programs?

Compiler-1 Compiler-2

Page 19: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compiler-1 Compiler-2

Compositional CompCert’s Argument…[Stewart et al. POPL’15]

DRFS1 S2Source (e.g. C)

T1 T2Target (e.g. assembly)

||

||

YES, for data-race-free (DRF) concurrent programs

Page 20: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

r1 = 1; r1 = r1 + 1; lock(); x = 1; y = x + 1; unlock();

r2 = 2; r2 = r2 + 1; lock(); x = 2; y = x + 1; unlock();

interleaving

Intuition of the Argument: Interleaving <=> Non-preemptive for DRF Programs

Page 21: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

r1 = 1; r1 = r1 + 1; lock(); x = 1; y = x + 1; unlock();

r2 = 2; r2 = r2 + 1; lock(); x = 2; y = x + 1; unlock();

interleaving

No race

Intuition of the Argument: Interleaving <=> Non-preemptive for DRF Programs

Page 22: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

r1 = 1; r1 = r1 + 1; lock(); x = 1; y = x + 1; unlock();

r2 = 2; r2 = r2 + 1; lock(); x = 2; y = x + 1; unlock();

interleaving

No race

Intuition of the Argument: Interleaving <=> Non-preemptive for DRF Programs

Page 23: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

r1 = 1; r1 = r1 + 1; lock(); x = 1; y = x + 1; unlock();

r2 = 2; r2 = r2 + 1; lock(); x = 2; y = x + 1; unlock();

interleaving

No race Non-preemptive: yield control at

certain points only

sequential

sequentialr1 = 1; r1 = r1 + 1; yield; x = 1; y = x + 1; yield;

r2 = 2; r2 = r2 + 1; yield; x = 2; y = x + 1; yield;

Intuition of the Argument: Interleaving <=> Non-preemptive for DRF Programs

Page 24: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

r1 = 1; r1 = r1 + 1; lock(); x = 1; y = x + 1; unlock();

r2 = 2; r2 = r2 + 1; lock(); x = 2; y = x + 1; unlock();

interleaving

No race Non-preemptive: yield control at

certain points only

sequential

sequentialr1 = 1; r1 = r1 + 1; yield; x = 1; y = x + 1; yield;

r2 = 2; r2 = r2 + 1; yield; x = 2; y = x + 1; yield;

Plausible, but need to address several key challenges

Intuition of the Argument: Interleaving <=> Non-preemptive for DRF Programs

Page 25: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges

Page 26: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

DRFS1 S2||

Different Languages

Different languages

Page 27: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

DRF

||

S1 S2

T1 T2

||

Page 28: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

DRF

||

S1 S2

T1 T2

||

||T1’ T2’

… …

Page 29: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

DRF

||

S1 S2

T1 T2

||

||T1’ T2’

… …

DRF?

Page 30: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

Page 31: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

• How to support benign-race and relaxed memory models?

Page 32: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Challenges• How to formulate DRF in language independent manner?

• How to prove DRF-preservation, compositionally?

• How to support benign-race and relaxed memory models?

“… synchronization primitives are commonly implemented with assembly code that has data races.”

—— Hans-J. Boehm, HotPar’11

Page 33: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Work

Page 34: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Work• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- With both external function calls & multi-threaded code

Page 35: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Work• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- With both external function calls & multi-threaded code

• Framework extension:

- Supports x86-TSO + confined benign-races

Page 36: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Work• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- With both external function calls & multi-threaded code

• Framework extension:

- Supports x86-TSO + confined benign-races

• CASCompCert:

- Extends CompCert with Concurrency + Abstraction + Separate compilation

- Reuses considerable amount of CompCert proofs

- Racy x86-TSO impl. of locks as synchronization library

Page 37: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 38: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 39: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRF Data-race: read-write / write-write conflicts

Memory …

write read/write

Page 40: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRFData-race: read-write / write-write conflicts

Page 41: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRF

Why language-independent?To support cross-language interaction

Data-race: read-write / write-write conflicts

S1 S2Interaction

May in different languagesMay in different languages

Page 42: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRF

Why language-independent?To support cross-language interactionabstract away lang. details

e.g. interaction semantics[Stewart et al. POPL’15]

Data-race: read-write / write-write conflicts

S1 S2Interaction

May in different languagesMay in different languages

Page 43: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRF

Why language-independent?To support cross-language interactionabstract away lang. details

e.g. interaction semantics[Stewart et al. POPL’15]

Data-race: read-write / write-write conflicts

S1 S2Interaction

May in different languagesMay in different languages

NO concrete reads/writes

Page 44: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Language-Independent DRF

Why language-independent?To support cross-language interactionabstract away lang. details

e.g. interaction semantics[Stewart et al. POPL’15]

Data-race: read-write / write-write conflicts

S1 S2Interaction

May in different languagesMay in different languages

NO concrete reads/writes

How to formulate DRF if we do not even know the concrete reads/writes?

Page 45: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Page 46: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws) read-set write-set

Page 47: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

Page 48: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

rs

Page 49: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

rs

rs

Page 50: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

rs

rs

Arbitrarily differentArbitrarily different

Page 51: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

rs

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

rsS

rs

ws

Arbitrarily differentArbitrarily different

Page 52: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

rs

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws)

• well-defined language extensional characterization of footprints

read-set write-set

rsS

rs rsS

ws

ws

Arbitrarily differentArbitrarily different Same

Page 53: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws) read-set write-set

• well-defined language extensional characterization of footprints

Page 54: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Solution: Abstract Footprints

DRF(S1 || S2)

Defined in terms of footprint disjointness

• footprints δ ::= (rs, ws) read-set write-set

• well-defined language extensional characterization of footprints

Page 55: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 56: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 57: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compositional CompCert’s Argument…

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Page 58: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compositional CompCert’s Argument…

T1 | T2 ⊆ S1 | S2

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

Page 59: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Compositional CompCert’s Argument…

T1 | T2 ⊆ S1 | S2

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

?

Non-preemptive

Page 60: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

?

Page 61: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

Page 62: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

Trivial

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

Page 63: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

Trivial

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

?

Page 64: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

Page 65: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

?

Page 66: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

?

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

Page 67: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

?

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Non-preemptive

How to prove DRF-preservation?

Page 68: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

?

Page 69: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

T1 | T2 ≤ S1 | S2 ≲

Footprint- preserving simulation

?

Page 70: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

T1 | T2 ≤ S1 | S2 ≲

Footprint- preserving simulation

?

Page 71: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

T1 | T2 ≤ S1 | S2 ≲

Footprint- preserving simulation

DRF preservation

Page 72: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

T1 ≤ S1 /\ T2 ≤ S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

≲ ≲

T1 | T2 ≤ S1 | S2 ≲

Footprint- preserving simulation

DRF preservation Compositionality

Page 73: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Ideas

T1 | T2 ⊆ S1 | S2

T1 ≤ S1 /\ T2 ≤ S2

TrivialDRF(T1 || T2)

T1 || T2 ⊆ S1 || S2

DRF(S1 || S2) T1 = Comp(S1) T2 = Comp(S2)

≲ ≲

T1 | T2 ≤ S1 | S2 ≲

Footprint- preserving simulation

DRF preservation Compositionality

Our compiler correctness

Page 74: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ)

Solution: Footprint-Preserving Simulation

Source state

Target state≲

Target has smaller footprints, so cannot introduce more races

Page 75: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ)

Solution: Footprint-Preserving Simulation

Target has smaller footprints, so cannot introduce more races

Page 76: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ)

(T’, σ’)

Solution: Footprint-Preserving Simulation

Target has smaller footprints, so cannot introduce more races

Page 77: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ)

(S’, Σ’)*

(T’, σ’)

Solution: Footprint-Preserving Simulation

Zero-or-multiple steps

Target has smaller footprints, so cannot introduce more races

Page 78: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ)

≤ ≤

(S’, Σ’)*

(T’, σ’)

Solution: Footprint-Preserving Simulation

Zero-or-multiple steps

≲ ≲

Target has smaller footprints, so cannot introduce more races

Page 79: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ) …

≤ ≤

(S’, Σ’)*

(T’, σ’)

Solution: Footprint-Preserving Simulation

≲ ≲

Target has smaller footprints, so cannot introduce more races

Page 80: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ) …

≤ ≤

(S’, Σ’)*

(T’, σ’)

Solution: Footprint-Preserving Simulation

≲ ≲

Δ

δ

Δ, δ: Footprints

Target has smaller footprints, so cannot introduce more races

Page 81: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

(T, σ)

(S, Σ) …

≤ ≤

(S’, Σ’)*

(T’, σ’)

Solution: Footprint-Preserving Simulation

≲ ≲

Δ

δ

Δ, δ: Footprints ⊆

Target has smaller footprints, so cannot introduce more races

Page 82: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 83: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Outline of this Talk

• Language-independent DRF formulation

• DRF-preservation and key proof structures

• Supporting x86-TSO and confined benign-races in CASCompCert

Page 84: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

Lib

zCall

T1 ||

Callz

T2

DRFClient

Page 85: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

lock_rel: … mov $1, %eax lock xchg %eax, L …

Lib

zCall

T1 ||

Callz

T2

DRFClient

Page 86: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

lock_rel: … mov $1, %eax lock xchg %eax, L …

Lib

zCall

T1 ||

Callz

T2

DRF ! inefficient

lock_rel: … mov $1, L …

[spin-lock impl. in Linux 2.6]

Client

Page 87: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

lock_rel: … mov $1, %eax lock xchg %eax, L …

Lib

zCall

T1 ||

Callz

T2

DRF ! inefficient

lock_rel: … mov $1, L …

[spin-lock impl. in Linux 2.6]

Client

Racy

Page 88: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

lock_rel: … mov $1, %eax lock xchg %eax, L …

Lib

zCall

T1 ||

Callz

T2

DRF ! inefficient

lock_rel: … mov $1, L …

[spin-lock impl. in Linux 2.6]

Client

Relaxed memory model, e.g. x86-TSO

Racy

Page 89: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

DRF Imposes Strong Restriction on Libraries

lock_rel: … mov $1, %eax lock xchg %eax, L …

Lib

zCall

T1 ||

Callz

T2

DRF ! inefficient

lock_rel: … mov $1, L …

[spin-lock impl. in Linux 2.6]

?

Client

Relaxed memory model, e.g. x86-TSO

Racy

Page 90: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Idea

Racy RF

Page 91: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Idea

Racy Lib’

zCall

T1 ||

Callz

T2

Client

Racy RF

Page 92: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Idea• Confined benign-races:

• Racy libraries and client code run in separate memory regions

Racy Lib’

zCall

T1 ||

Callz

T2

Client

Racy RF

Page 93: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Idea• Confined benign-races:

• Racy libraries and client code run in separate memory regions• Client code be well-synchronized

Racy Lib’

zCall

T1 ||

Callz

T2

Client Well-synchronized

Racy RF

Page 94: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Our Idea• Confined benign-races:

• Racy libraries and client code run in separate memory regions• Client code be well-synchronized• Racy libraries have race-free abstraction

Racy Lib’

zCall

T1 ||

Callz

T2

z

Call

T1 ||Call

z

T2

DRF

LibRace-free

abstraction

Client Well-synchronized

Racy RF

Page 95: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Source P Clight CallLock…

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Page 96: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Source P Clight CallLock…

Multi-threaded

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Page 97: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Source P Clight CallLock…

Multi-threaded race-free abstraction of spin-locks for synchronization

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Page 98: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Source P Clight CallLock…

Multi-threaded race-free abstraction of spin-locks for synchronization

DRF

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Page 99: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Source PDRF

CallLock…

Race-free

Page 100: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert …

Source PDRF

CallLock…

x86…

Race-free

Page 101: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source PDRF

Call

Call

Lock

Lock

x86…

Race-free

Page 102: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source PDRF

Call

Call

Lock

Lock

x86…

Race-free

x86-SC

Page 103: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

Manually impl.

Lock-Impl

DRF

with benign-races

Call

Call

Lock

Lock

x86…

Race-free

x86-SC

Page 104: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

Manually impl.

identity …

Lock-ImplTarget P’TSO

DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 105: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

Manually impl.

identity …

Lock-ImplTarget P’TSO

DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 106: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

Manually impl.

identity …

Lock-Impl

Target P’TSO

DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 107: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

≤oidentity …

Lock-Impl

Target P’TSO

DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 108: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

≤oidentity …

Lock-Impl

Target P’TSO

DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 109: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

≤oidentity …

Lock-Impl

Target P’TSO

⊆DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 110: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

≤oidentity …

Lock-Impl

Target P’TSO

⊆DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 111: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Clight

Supporting Confined Benign-Race & x86-TSO in Our CASCompCert

Target P

CompCert … identity

Source P

≤oidentity …

Lock-Impl

Target P’TSO

⊆DRF

with benign-races

Call

Call

Call

Lock

Lock

x86…

x86-TSO…

Race-free

x86-SC

x86-TSO semantics

Page 112: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

CompCert PassesCompCert C Clight C#minor Cminor

CminorSel

RTL

LTLLinearMachPower PC

RISC-V

x86

ARM

SimplLocals

TunnelingCleanupLabels

Tailcall, Renumber, Inlining, Constprop, CSE,

Deadcode,Unusedglob

20 passes

Page 113: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Verified 12/20 Passes in CASCompCert

CompCert C Clight C#minor Cminor

CminorSel

RTL

LTLLinearMach

SimplLocals

TunnelingCleanupLabels

Tailcall, Renumber, Inlining, Constprop, CSE,

Deadcode,Unusedglob

20 passesVerified 12/

Power PC

RISC-V

x86

ARM

Page 114: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Verified 12/20 Passes in CASCompCert

CompCert C Clight C#minor Cminor

CminorSel

RTL

LTLLinearMach

SimplLocals

TunnelingCleanupLabels

Tailcall, Renumber, Inlining, Constprop, CSE,

Deadcode,Unusedglob

20 passesVerified 12/

Power PC

RISC-V

x86

ARM

Including all the translation passes from Clight to x86

Page 115: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Reused Considerable Amount of CompCert Proofs. Framework is Challenging to Implement.Towards Certified Separate Compilation for Concurrent Programs PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA

Compilation passes andframework

Spec ProofCompCert Ours CompCert Ours

Cshmgen 515 1021 1071 1503Cminorgen 753 1556 1152 1251Selection 336 500 647 783RTLgen 428 543 821 862Tailcall 173 328 275 405Renumber 86 245 117 358Allocation 704 785 1410 1700Tunneling 131 339 166 475Linearize 236 371 349 733CleanupLabels 126 387 161 388Stacking 730 1038 1108 2135Asmgen 208 338 571 1128Compositionality (Lem. 6) 580 2249DRF preservation (Lem. 8) 358 1142Semantics equiv. (Lem. 9) 1540 4718Lifting 813 1795

Figure 13. Lines of code (using coqwc) in Coq

Here the premises 1-3 are similar to those required inDef. 11. In addition, the premise 4 requires that the x86-TSOcode πo of the object be simulated by γo . The simulation(tl, geo, πo) !

o (slCImp, geo,γo) is an extension of Liang andFeng [19] with the support of TSO semantics for the low-level code. Due to space limit, we omit the definition here.The refinement relation ⊑′ is a weaker version of ⊑ (see

Sec. 3.2). It does not preserve termination (the formal defini-tion omitted here). This is because our simulation!o for theobject code does not preserve termination for now, whichwe leave as future work.

Theorem 15 can be derived from Thm. 14 (for the compila-tion from Clight to x86-SC), and from Lem. 16 below, sayingthe x86-TSO code refines the x86-SC client code and thesource object code (we use tlsc and tltso as shorter notationsfor tlx86-SC and tlx86-TSO respectively).

Lemma 16 (Restore SC semantics for DRF x86 programs).Let Πsc = {(tlsc, ge1, π1), . . . , (tlsc, gem, πm ), (slCImp, geo,γo )},and Πtso = {(tltso, ge1, π1), . . . , (tltso, gem, πm ), (tltso, geo, πo )}.For any f1 . . . fn , if

1. Safe(let Πsc in f1 ∥ . . . ∥ fn ) and DRF(let Πsc in f1 ∥ . . . ∥ fn ),2. (tltso, geo, πo ) !

o (slCImp, geo,γo ),

then let Πtso in f1 ∥ . . . ∥ fn ⊑′ let Πsc in f1 ∥ . . . ∥ fn .

As explained before, Lem. 16 can be viewed as a strength-ened DRF-guarantee theorem for x86-TSO in that, if we letγo contain only skip and geo = ∅, Lem. 16 implies the DRF-guarantee of x86-TSO.

7.4 Proof Efforts in Coq

In Coq we have mechanized the framework (Fig. 2) andthe extended framework (Fig. 3) and proved all the relatedlemmas. We have verified all the CompCert passes in Fig. 11.Statistics of our Coq implementation and proofs are de-

picted in Fig. 13. Adapting the compilation correctness proofsfrom CompCert is relatively lightweight. For most passes

our proofs are within 300 lines of code more than the origi-nal CompCert proofs. The Stacking pass introduces moreadditional proofs, mostly caused by arguments marshallingfor supporting cross-language linking. In our experience,adapting CompCert’s original compilation proofs to our set-tings takes less than one person week per translation pass(except for Stacking). For simpler passes such as Tailcall,Linearize, Allocation, and RTLgen, it takes less than oneperson day per pass.By contrast, implementing our framework is more chal-

lenging, which took us about 1 person year. In particular,proving the equivalence between non-preemptive and pre-emptive semantics for DRF programs took us more time thanexpected, although it seems to be a well-known folklore the-orem. The co-inductive proofs there involve a large numberof non-trivial cases of reordering threads’ executions.

8 Related Work and ConclusionCompiler verification. Variouswork extends CompCert [16]to support separate compilation or concurrency. We havediscussed Compositional CompCert [2, 29] in Sec. 1 and 2.SepCompCert [15] extends CompCert with the support ofsyntactical linking. Their approach requires all the compila-tion units be compiled by CompCert. They do not supportcross-language linking or concurrency as we do.

CompCertTSO [27] compiles ClightTSO programs to thex86-TSO machine. It does not support cross-language link-ing, and its proof for the two CompCert passes Stackingand Cminorgen are not compositional. By contrast, we haveverified these two passes using our compositional simulation.For the other compositional passes, CompCertTSO relies ona thread-local simulation, which is stronger than ours. Itrequires that the source and the target always generate thesame memory events (excepts for those local variables thatcan be stored in registers). As a result, some optimizations(such as constant propagation and CSE) in CompCertTSOhave to be more restrictive.

As an extension of CompCertTSO, Jagannathan et al. [12]allow the compiler to inject racy code such as the efficientspin lock in Fig. 10. They propose a refinement calculus onthe racy code to ensure the compilation correctness. Theirwork looks similar to our extended framework in Fig. 3, butsince they use TSO semantics for both the source and targetprograms, they do not need to handle the gap between theSC and TSO semantics, so they do not need the source to beDRF as in our work.Podkopaev et al. [24] prove correctness of the compila-

tion from the promising semantics (which is a high-leveloperational relaxed model) to the operational ARMv8-POPmachine. They develop whole-program simulations to dealwith the complicated relaxed behaviors. Later on they verifycompilations from the promising semantics to declarativehardware models such as POWER, ARMv7 and ARMv8 [25].

156

Page 116: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Reused Considerable Amount of CompCert Proofs. Framework is Challenging to Implement.

Compiler verification: 100 - 400 more lines of Coq proof for most passes

Towards Certified Separate Compilation for Concurrent Programs PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA

Compilation passes andframework

Spec ProofCompCert Ours CompCert Ours

Cshmgen 515 1021 1071 1503Cminorgen 753 1556 1152 1251Selection 336 500 647 783RTLgen 428 543 821 862Tailcall 173 328 275 405Renumber 86 245 117 358Allocation 704 785 1410 1700Tunneling 131 339 166 475Linearize 236 371 349 733CleanupLabels 126 387 161 388Stacking 730 1038 1108 2135Asmgen 208 338 571 1128Compositionality (Lem. 6) 580 2249DRF preservation (Lem. 8) 358 1142Semantics equiv. (Lem. 9) 1540 4718Lifting 813 1795

Figure 13. Lines of code (using coqwc) in Coq

Here the premises 1-3 are similar to those required inDef. 11. In addition, the premise 4 requires that the x86-TSOcode πo of the object be simulated by γo . The simulation(tl, geo, πo) !

o (slCImp, geo,γo) is an extension of Liang andFeng [19] with the support of TSO semantics for the low-level code. Due to space limit, we omit the definition here.The refinement relation ⊑′ is a weaker version of ⊑ (see

Sec. 3.2). It does not preserve termination (the formal defini-tion omitted here). This is because our simulation!o for theobject code does not preserve termination for now, whichwe leave as future work.

Theorem 15 can be derived from Thm. 14 (for the compila-tion from Clight to x86-SC), and from Lem. 16 below, sayingthe x86-TSO code refines the x86-SC client code and thesource object code (we use tlsc and tltso as shorter notationsfor tlx86-SC and tlx86-TSO respectively).

Lemma 16 (Restore SC semantics for DRF x86 programs).Let Πsc = {(tlsc, ge1, π1), . . . , (tlsc, gem, πm ), (slCImp, geo,γo )},and Πtso = {(tltso, ge1, π1), . . . , (tltso, gem, πm ), (tltso, geo, πo )}.For any f1 . . . fn , if

1. Safe(let Πsc in f1 ∥ . . . ∥ fn ) and DRF(let Πsc in f1 ∥ . . . ∥ fn ),2. (tltso, geo, πo ) !

o (slCImp, geo,γo ),

then let Πtso in f1 ∥ . . . ∥ fn ⊑′ let Πsc in f1 ∥ . . . ∥ fn .

As explained before, Lem. 16 can be viewed as a strength-ened DRF-guarantee theorem for x86-TSO in that, if we letγo contain only skip and geo = ∅, Lem. 16 implies the DRF-guarantee of x86-TSO.

7.4 Proof Efforts in Coq

In Coq we have mechanized the framework (Fig. 2) andthe extended framework (Fig. 3) and proved all the relatedlemmas. We have verified all the CompCert passes in Fig. 11.Statistics of our Coq implementation and proofs are de-

picted in Fig. 13. Adapting the compilation correctness proofsfrom CompCert is relatively lightweight. For most passes

our proofs are within 300 lines of code more than the origi-nal CompCert proofs. The Stacking pass introduces moreadditional proofs, mostly caused by arguments marshallingfor supporting cross-language linking. In our experience,adapting CompCert’s original compilation proofs to our set-tings takes less than one person week per translation pass(except for Stacking). For simpler passes such as Tailcall,Linearize, Allocation, and RTLgen, it takes less than oneperson day per pass.By contrast, implementing our framework is more chal-

lenging, which took us about 1 person year. In particular,proving the equivalence between non-preemptive and pre-emptive semantics for DRF programs took us more time thanexpected, although it seems to be a well-known folklore the-orem. The co-inductive proofs there involve a large numberof non-trivial cases of reordering threads’ executions.

8 Related Work and ConclusionCompiler verification. Variouswork extends CompCert [16]to support separate compilation or concurrency. We havediscussed Compositional CompCert [2, 29] in Sec. 1 and 2.SepCompCert [15] extends CompCert with the support ofsyntactical linking. Their approach requires all the compila-tion units be compiled by CompCert. They do not supportcross-language linking or concurrency as we do.

CompCertTSO [27] compiles ClightTSO programs to thex86-TSO machine. It does not support cross-language link-ing, and its proof for the two CompCert passes Stackingand Cminorgen are not compositional. By contrast, we haveverified these two passes using our compositional simulation.For the other compositional passes, CompCertTSO relies ona thread-local simulation, which is stronger than ours. Itrequires that the source and the target always generate thesame memory events (excepts for those local variables thatcan be stored in registers). As a result, some optimizations(such as constant propagation and CSE) in CompCertTSOhave to be more restrictive.

As an extension of CompCertTSO, Jagannathan et al. [12]allow the compiler to inject racy code such as the efficientspin lock in Fig. 10. They propose a refinement calculus onthe racy code to ensure the compilation correctness. Theirwork looks similar to our extended framework in Fig. 3, butsince they use TSO semantics for both the source and targetprograms, they do not need to handle the gap between theSC and TSO semantics, so they do not need the source to beDRF as in our work.Podkopaev et al. [24] prove correctness of the compila-

tion from the promising semantics (which is a high-leveloperational relaxed model) to the operational ARMv8-POPmachine. They develop whole-program simulations to dealwith the complicated relaxed behaviors. Later on they verifycompilations from the promising semantics to declarativehardware models such as POWER, ARMv7 and ARMv8 [25].

156

Page 117: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Reused Considerable Amount of CompCert Proofs. Framework is Challenging to Implement.

Compiler verification: 100 - 400 more lines of Coq proof for most passes

Towards Certified Separate Compilation for Concurrent Programs PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA

Compilation passes andframework

Spec ProofCompCert Ours CompCert Ours

Cshmgen 515 1021 1071 1503Cminorgen 753 1556 1152 1251Selection 336 500 647 783RTLgen 428 543 821 862Tailcall 173 328 275 405Renumber 86 245 117 358Allocation 704 785 1410 1700Tunneling 131 339 166 475Linearize 236 371 349 733CleanupLabels 126 387 161 388Stacking 730 1038 1108 2135Asmgen 208 338 571 1128Compositionality (Lem. 6) 580 2249DRF preservation (Lem. 8) 358 1142Semantics equiv. (Lem. 9) 1540 4718Lifting 813 1795

Figure 13. Lines of code (using coqwc) in Coq

Here the premises 1-3 are similar to those required inDef. 11. In addition, the premise 4 requires that the x86-TSOcode πo of the object be simulated by γo . The simulation(tl, geo, πo) !

o (slCImp, geo,γo) is an extension of Liang andFeng [19] with the support of TSO semantics for the low-level code. Due to space limit, we omit the definition here.The refinement relation ⊑′ is a weaker version of ⊑ (see

Sec. 3.2). It does not preserve termination (the formal defini-tion omitted here). This is because our simulation!o for theobject code does not preserve termination for now, whichwe leave as future work.

Theorem 15 can be derived from Thm. 14 (for the compila-tion from Clight to x86-SC), and from Lem. 16 below, sayingthe x86-TSO code refines the x86-SC client code and thesource object code (we use tlsc and tltso as shorter notationsfor tlx86-SC and tlx86-TSO respectively).

Lemma 16 (Restore SC semantics for DRF x86 programs).Let Πsc = {(tlsc, ge1, π1), . . . , (tlsc, gem, πm ), (slCImp, geo,γo )},and Πtso = {(tltso, ge1, π1), . . . , (tltso, gem, πm ), (tltso, geo, πo )}.For any f1 . . . fn , if

1. Safe(let Πsc in f1 ∥ . . . ∥ fn ) and DRF(let Πsc in f1 ∥ . . . ∥ fn ),2. (tltso, geo, πo ) !

o (slCImp, geo,γo ),

then let Πtso in f1 ∥ . . . ∥ fn ⊑′ let Πsc in f1 ∥ . . . ∥ fn .

As explained before, Lem. 16 can be viewed as a strength-ened DRF-guarantee theorem for x86-TSO in that, if we letγo contain only skip and geo = ∅, Lem. 16 implies the DRF-guarantee of x86-TSO.

7.4 Proof Efforts in Coq

In Coq we have mechanized the framework (Fig. 2) andthe extended framework (Fig. 3) and proved all the relatedlemmas. We have verified all the CompCert passes in Fig. 11.Statistics of our Coq implementation and proofs are de-

picted in Fig. 13. Adapting the compilation correctness proofsfrom CompCert is relatively lightweight. For most passes

our proofs are within 300 lines of code more than the origi-nal CompCert proofs. The Stacking pass introduces moreadditional proofs, mostly caused by arguments marshallingfor supporting cross-language linking. In our experience,adapting CompCert’s original compilation proofs to our set-tings takes less than one person week per translation pass(except for Stacking). For simpler passes such as Tailcall,Linearize, Allocation, and RTLgen, it takes less than oneperson day per pass.By contrast, implementing our framework is more chal-

lenging, which took us about 1 person year. In particular,proving the equivalence between non-preemptive and pre-emptive semantics for DRF programs took us more time thanexpected, although it seems to be a well-known folklore the-orem. The co-inductive proofs there involve a large numberof non-trivial cases of reordering threads’ executions.

8 Related Work and ConclusionCompiler verification. Variouswork extends CompCert [16]to support separate compilation or concurrency. We havediscussed Compositional CompCert [2, 29] in Sec. 1 and 2.SepCompCert [15] extends CompCert with the support ofsyntactical linking. Their approach requires all the compila-tion units be compiled by CompCert. They do not supportcross-language linking or concurrency as we do.

CompCertTSO [27] compiles ClightTSO programs to thex86-TSO machine. It does not support cross-language link-ing, and its proof for the two CompCert passes Stackingand Cminorgen are not compositional. By contrast, we haveverified these two passes using our compositional simulation.For the other compositional passes, CompCertTSO relies ona thread-local simulation, which is stronger than ours. Itrequires that the source and the target always generate thesame memory events (excepts for those local variables thatcan be stored in registers). As a result, some optimizations(such as constant propagation and CSE) in CompCertTSOhave to be more restrictive.

As an extension of CompCertTSO, Jagannathan et al. [12]allow the compiler to inject racy code such as the efficientspin lock in Fig. 10. They propose a refinement calculus onthe racy code to ensure the compilation correctness. Theirwork looks similar to our extended framework in Fig. 3, butsince they use TSO semantics for both the source and targetprograms, they do not need to handle the gap between theSC and TSO semantics, so they do not need the source to beDRF as in our work.Podkopaev et al. [24] prove correctness of the compila-

tion from the promising semantics (which is a high-leveloperational relaxed model) to the operational ARMv8-POPmachine. They develop whole-program simulations to dealwith the complicated relaxed behaviors. Later on they verifycompilations from the promising semantics to declarativehardware models such as POWER, ARMv7 and ARMv8 [25].

156

Page 118: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Reused Considerable Amount of CompCert Proofs. Framework is Challenging to Implement.

Compiler verification: 100 - 400 more lines of Coq proof for most passes

Towards Certified Separate Compilation for Concurrent Programs PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA

Compilation passes andframework

Spec ProofCompCert Ours CompCert Ours

Cshmgen 515 1021 1071 1503Cminorgen 753 1556 1152 1251Selection 336 500 647 783RTLgen 428 543 821 862Tailcall 173 328 275 405Renumber 86 245 117 358Allocation 704 785 1410 1700Tunneling 131 339 166 475Linearize 236 371 349 733CleanupLabels 126 387 161 388Stacking 730 1038 1108 2135Asmgen 208 338 571 1128Compositionality (Lem. 6) 580 2249DRF preservation (Lem. 8) 358 1142Semantics equiv. (Lem. 9) 1540 4718Lifting 813 1795

Figure 13. Lines of code (using coqwc) in Coq

Here the premises 1-3 are similar to those required inDef. 11. In addition, the premise 4 requires that the x86-TSOcode πo of the object be simulated by γo . The simulation(tl, geo, πo) !

o (slCImp, geo,γo) is an extension of Liang andFeng [19] with the support of TSO semantics for the low-level code. Due to space limit, we omit the definition here.The refinement relation ⊑′ is a weaker version of ⊑ (see

Sec. 3.2). It does not preserve termination (the formal defini-tion omitted here). This is because our simulation!o for theobject code does not preserve termination for now, whichwe leave as future work.

Theorem 15 can be derived from Thm. 14 (for the compila-tion from Clight to x86-SC), and from Lem. 16 below, sayingthe x86-TSO code refines the x86-SC client code and thesource object code (we use tlsc and tltso as shorter notationsfor tlx86-SC and tlx86-TSO respectively).

Lemma 16 (Restore SC semantics for DRF x86 programs).Let Πsc = {(tlsc, ge1, π1), . . . , (tlsc, gem, πm ), (slCImp, geo,γo )},and Πtso = {(tltso, ge1, π1), . . . , (tltso, gem, πm ), (tltso, geo, πo )}.For any f1 . . . fn , if

1. Safe(let Πsc in f1 ∥ . . . ∥ fn ) and DRF(let Πsc in f1 ∥ . . . ∥ fn ),2. (tltso, geo, πo ) !

o (slCImp, geo,γo ),

then let Πtso in f1 ∥ . . . ∥ fn ⊑′ let Πsc in f1 ∥ . . . ∥ fn .

As explained before, Lem. 16 can be viewed as a strength-ened DRF-guarantee theorem for x86-TSO in that, if we letγo contain only skip and geo = ∅, Lem. 16 implies the DRF-guarantee of x86-TSO.

7.4 Proof Efforts in Coq

In Coq we have mechanized the framework (Fig. 2) andthe extended framework (Fig. 3) and proved all the relatedlemmas. We have verified all the CompCert passes in Fig. 11.Statistics of our Coq implementation and proofs are de-

picted in Fig. 13. Adapting the compilation correctness proofsfrom CompCert is relatively lightweight. For most passes

our proofs are within 300 lines of code more than the origi-nal CompCert proofs. The Stacking pass introduces moreadditional proofs, mostly caused by arguments marshallingfor supporting cross-language linking. In our experience,adapting CompCert’s original compilation proofs to our set-tings takes less than one person week per translation pass(except for Stacking). For simpler passes such as Tailcall,Linearize, Allocation, and RTLgen, it takes less than oneperson day per pass.By contrast, implementing our framework is more chal-

lenging, which took us about 1 person year. In particular,proving the equivalence between non-preemptive and pre-emptive semantics for DRF programs took us more time thanexpected, although it seems to be a well-known folklore the-orem. The co-inductive proofs there involve a large numberof non-trivial cases of reordering threads’ executions.

8 Related Work and ConclusionCompiler verification. Variouswork extends CompCert [16]to support separate compilation or concurrency. We havediscussed Compositional CompCert [2, 29] in Sec. 1 and 2.SepCompCert [15] extends CompCert with the support ofsyntactical linking. Their approach requires all the compila-tion units be compiled by CompCert. They do not supportcross-language linking or concurrency as we do.

CompCertTSO [27] compiles ClightTSO programs to thex86-TSO machine. It does not support cross-language link-ing, and its proof for the two CompCert passes Stackingand Cminorgen are not compositional. By contrast, we haveverified these two passes using our compositional simulation.For the other compositional passes, CompCertTSO relies ona thread-local simulation, which is stronger than ours. Itrequires that the source and the target always generate thesame memory events (excepts for those local variables thatcan be stored in registers). As a result, some optimizations(such as constant propagation and CSE) in CompCertTSOhave to be more restrictive.

As an extension of CompCertTSO, Jagannathan et al. [12]allow the compiler to inject racy code such as the efficientspin lock in Fig. 10. They propose a refinement calculus onthe racy code to ensure the compilation correctness. Theirwork looks similar to our extended framework in Fig. 3, butsince they use TSO semantics for both the source and targetprograms, they do not need to handle the gap between theSC and TSO semantics, so they do not need the source to beDRF as in our work.Podkopaev et al. [24] prove correctness of the compila-

tion from the promising semantics (which is a high-leveloperational relaxed model) to the operational ARMv8-POPmachine. They develop whole-program simulations to dealwith the complicated relaxed behaviors. Later on they verifycompilations from the promising semantics to declarativehardware models such as POWER, ARMv7 and ARMv8 [25].

156

Framework impl.: > 60k LoC, ~ 1 person year

Page 119: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Conclusion

Page 120: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Conclusion• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- Well-defined language for language-independent DRF

- Footprint-preserving simulation for DRF preservation

Page 121: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Conclusion• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- Well-defined language for language-independent DRF

- Footprint-preserving simulation for DRF preservation

• Framework extension:

- Support x86-TSO + confined benign-races

Page 122: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Conclusion• Language independent verification framework

- Key semantics components + proof structures

- Supports separate compilation for race-free concurrent programs

- Well-defined language for language-independent DRF

- Footprint-preserving simulation for DRF preservation

• Framework extension:

- Support x86-TSO + confined benign-races

• CASCompCert:

- Reused considerable amount of CompCert proofs

- Racy x86-TSO impl. of locks as synchronization library

Page 123: Towards Certified Separate Compilation for Concurrent Programs · S1 S2 Source (e.g. C) Real-world programs may consist of multiple components, which will be compiled independently.

Thank you!