Provable Multicore Schedulers with Ipanema: Application to ...

Provable Multicore Schedulers with Ipanema: Application to Work-Conservation Baptiste Lepers Redha Gouicem Damien Carver Jean-Pierre Lozi Nicolas Palix Virginia Aponte Willy Zwaenepoel Julien Sopena Julia Lawall Gilles Muller

Transcript of Provable Multicore Schedulers with Ipanema: Application to ...

Page 1: Provable Multicore Schedulers with Ipanema: Application to ...

Provable Multicore Schedulers with Ipanema: Application to Work-Conservation

Baptiste Lepers, Redha Gouicem, Damien Carver, Jean-Pierre Lozi, Nicolas Palix, Virginia Aponte, Willy Zwaenepoel, Julien Sopena, Julia Lawall, Gilles Muller

Page 2: Provable Multicore Schedulers with Ipanema: Application to ...

Work conservation

“No core should be left idle when a core is overloaded”

Core 0 Core 1 Core 2 Core 3

Non work-conserving situation: core 0 is overloaded, other cores are idle

Page 3: Provable Multicore Schedulers with Ipanema: Application to ...

Problem

Linux (CFS) suffers from work conservation issues

[Figure: per-core activity over time (axes: Core, Time in seconds); legend: core is mostly idle, core is mostly overloaded]

[Lozi et al. 2016]

Page 4: Provable Multicore Schedulers with Ipanema: Application to ...

Problem

FreeBSD (ULE) suffers from work conservation issues

[Figure: per-core activity over time (axes: Core, Time in seconds); legend: core is idle, core is overloaded]

[Bouron et al. 2018]

Page 5: Provable Multicore Schedulers with Ipanema: Application to ...

Problem

Work conservation bugs are hard to detect

No crash, no deadlock. No obvious symptom.

137x slowdown on HPC applications

23% slowdown on a database

[Lozi et al. 2016]

Page 6: Provable Multicore Schedulers with Ipanema: Application to ...

This talk

Formally prove work conservation

Page 7: Provable Multicore Schedulers with Ipanema: Application to ...

Work Conservation Formally

(∃c . O(c)) ⇒ (∀c′ . ¬I(c′))

If a core is overloaded, no core is idle
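
As a rough illustration, the predicate above can be written as an executable check. This is only a sketch under assumptions the slides do not spell out: O(c) is taken to mean "two or more runnable threads on c" and I(c) to mean "no runnable thread on c". The function names and the list-of-thread-counts representation are hypothetical.

```python
from typing import Sequence


def overloaded(nr_threads: int) -> bool:
    # Assumption: a core is overloaded when it holds two or more runnable threads.
    return nr_threads >= 2


def idle(nr_threads: int) -> bool:
    # Assumption: a core is idle when it holds no runnable thread.
    return nr_threads == 0


def is_work_conserving(cores: Sequence[int]) -> bool:
    # (exists c. O(c)) implies (forall c'. not I(c'))
    some_overloaded = any(overloaded(n) for n in cores)
    none_idle = not any(idle(n) for n in cores)
    return (not some_overloaded) or none_idle
```

With this encoding, `is_work_conserving([4, 0, 0, 0])` is false (the situation on the earlier slide where core 0 is overloaded and the other cores are idle), while `is_work_conserving([0, 1, 0, 1])` is true because no core is overloaded.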

Core 0 Core 1

Page 8: Provable Multicore Schedulers with Ipanema: Application to ...

Work Conservation Formally

(∃c . O(c)) ⇒ (∀c′ . ¬I(c′))

If a core is overloaded, no core is idle

Core 0 Core 1

Does not work for realistic schedulers!

Page 9: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Page 10: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Observe (state of every core)

Lock (one core – less overhead)

Act (e.g., steal threads from locked core)

Based on possibly outdated observations!

time
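
The observe / lock / act pattern can be sketched in a few lines. This is a simplified model, not scheduler code: `Core`, `try_steal`, and the `concurrent_event` hook (which stands in for events racing on other cores) are hypothetical names, at least two cores are assumed, and the thief's own run queue is updated without a lock for brevity.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Core:
    runqueue: List[str] = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)


def try_steal(cores: List[Core], thief: int,
              concurrent_event: Optional[Callable[[], None]] = None) -> bool:
    # Observe: read every core's load without taking any lock.
    loads = [len(c.runqueue) for c in cores]
    busiest = max((i for i in range(len(cores)) if i != thief),
                  key=lambda i: loads[i])
    # Hypothetical hook: lets a caller simulate a block/termination
    # that happens between the observation and the lock.
    if concurrent_event is not None:
        concurrent_event()
    # Lock: only the chosen core is locked.
    with cores[busiest].lock:
        # Act: the observation may be stale, so re-check under the lock.
        if len(cores[busiest].runqueue) < 2:
            return False
        cores[thief].runqueue.append(cores[busiest].runqueue.pop())
        return True
```

If nothing changes after the observation, the steal succeeds. If the busiest core's threads terminate before the lock is taken, `try_steal` returns `False` even though the decision looked correct when it was made.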

Page 11: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Core 0 Core 1 Core 2 Core 3

Runs load balancing

Page 12: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Core 0 Core 1 Core 2 Core 3

Observes load (no lock)

Page 13: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Core 0 Core 1 Core 2 Core 3

Locks busiest

Ideal scenario: no change since observations

Page 14: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Core 0 Core 1 Core 2 Core 3

Possible scenario:

Locks “busiest”

Busiest might have no thread left! (Concurrent blocks/terminations.)

Page 15: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Core 0 Core 1 Core 2 Core 3

(Fail to) steal from busiest

Page 16: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #1

Concurrent events & optimistic concurrency

Definition of Work Conservation must take concurrency into account!

Observe

Lock

Act

Based on possibly outdated observations!

time

Page 17: Provable Multicore Schedulers with Ipanema: Application to ...

Concurrent Work Conservation Formally

Definition of overloaded with “failure cases”:

∃c . (O(c) ∧ ¬fork(c) ∧ ¬unblock(c) …)

If a core is overloaded (but not because a thread was concurrently created)

Page 18: Provable Multicore Schedulers with Ipanema: Application to ...

Concurrent Work Conservation Formally

∃c . (O(c) ∧ ¬fork(c) ∧ ¬unblock(c) …) ⇒ ∀c′ . ¬(I(c′) ∧ …)
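
A minimal sketch of how this relaxed definition might be checked on a snapshot of the cores. It models only the two exclusions named on the slide (`fork` and `unblock`); the terms elided by "…" on both sides are deliberately not filled in. `CoreState`, the thresholds for O and I, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CoreState:
    nr_threads: int
    forked: bool = False     # a thread was concurrently created on this core
    unblocked: bool = False  # a thread was concurrently unblocked on this core


def counts_as_overloaded(c: CoreState) -> bool:
    # O(c) and not fork(c) and not unblock(c); further elided terms are not modeled.
    # Assumption: O(c) means two or more runnable threads.
    return c.nr_threads >= 2 and not c.forked and not c.unblocked


def counts_as_idle(c: CoreState) -> bool:
    # I(c'); the elided conjuncts on the idle side are not modeled.
    # Assumption: I(c') means no runnable thread.
    return c.nr_threads == 0


def concurrent_wc(cores: Sequence[CoreState]) -> bool:
    some_overloaded = any(counts_as_overloaded(c) for c in cores)
    some_idle = any(counts_as_idle(c) for c in cores)
    return (not some_overloaded) or (not some_idle)
```

Under this encoding, a core that is overloaded only because of a concurrent fork does not count as a violation: `concurrent_wc([CoreState(3, forked=True), CoreState(0)])` holds, whereas the same snapshot without the `forked` flag does not.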

Page 19: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #2

Existing scheduler code is hard to prove

Schedulers handle millions of events per second

Historically: low-level C code.

Page 20: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #2

Existing scheduler code is hard to prove

Schedulers handle millions of events per second

Historically: low-level C code.

Code should be easy to prove AND efficient!

Page 21: Provable Multicore Schedulers with Ipanema: Application to ...

Challenge #2

Existing scheduler code is hard to prove

Schedulers handle millions of events per second

Historically: low-level C code.

Code should be easy to prove AND efficient! ⇒

Domain Specific Language (DSL)

Page 22: Provable Multicore Schedulers with Ipanema: Application to ...

DSL advantages

Trade expressiveness for expertise/knowledge:

Robustness: (static) verification of properties

Explicit concurrency: explicit shared variables

Performance: efficient compilation

Page 23: Provable Multicore Schedulers with Ipanema: Application to ...

DSL-based proofs

[Diagram: DSL policy ➔ WhyML code ➔ proof; DSL policy ➔ C code ➔ kernel module]

DSL: close to C

Easy to learn and to compile to WhyML and C

Page 24: Provable Multicore Schedulers with Ipanema: Application to ...

DSL-based proofs

Proof on all possible interleavings

Page 25: Provable Multicore Schedulers with Ipanema: Application to ...

DSL-based proofs

[Diagram: Core 0 runs load balancing along a time axis]

Proof on all possible interleavings

Split code in blocks (1 block = 1 read or write to a shared variable)

Page 26: Provable Multicore Schedulers with Ipanema: Application to ...

DSL-based proofs

[Diagram: Core 0 runs load balancing along a time axis while Core 1 … Core N issue concurrent fork and terminate events]

Proof on all possible interleavings

Split code in blocks (1 block = 1 read or write to a shared variable)

Simulate execution of concurrent blocks on N cores

Concurrent WC must hold at the end of the load balancing

Page 27: Provable Multicore Schedulers with Ipanema: Application to ...

DSL-based proofs

[Diagram: Core 0 runs load balancing along a time axis while Core 1 … Core N issue concurrent fork and terminate events]

Proof on all possible interleavings

Split code in blocks (1 block = 1 read or write to a shared variable)

Simulate execution of concurrent blocks on N cores

Concurrent WC must always hold!

DSL ➔ few shared variables ➔ tractable
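
The idea of checking a property over every interleaving of blocks can be illustrated with a toy brute-force enumerator. This is not the WhyML proof from the talk, only a sketch: the two-core model (one thief running load balancing, one victim receiving a concurrent fork), the block functions, and the `forked` flag are all hypothetical, and overloaded/idle are assumed to mean "two or more threads" and "no thread".

```python
from itertools import combinations
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

State = Dict[str, int]
Block = Callable[[State], None]


def interleavings(a: Sequence[Block], b: Sequence[Block]) -> Iterator[List[Block]]:
    # Pick which slots of the merged schedule belong to `a`;
    # the order inside `a` and inside `b` is preserved.
    total = len(a) + len(b)
    for slots in combinations(range(total), len(a)):
        chosen, ia, ib = set(slots), iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(total)]


def observe(s: State) -> None:
    s["seen"] = s["victim"]  # unlocked read of the victim's load
    s["forked"] = 0          # forks before this point are visible, not concurrent


def steal(s: State) -> None:
    # Locked re-check: act only if both the observation and the current load allow it.
    if s["seen"] >= 2 and s["victim"] >= 2:
        s["victim"] -= 1
        s["thief"] += 1


def fork(s: State) -> None:
    s["victim"] += 1
    s["forked"] = 1          # remember that a thread was created concurrently


def plain_wc(s: State) -> bool:
    return not (s["victim"] >= 2 and s["thief"] == 0)


def concurrent_wc_holds(s: State) -> bool:
    return not (s["victim"] >= 2 and not s["forked"] and s["thief"] == 0)


def explore(balancer: Sequence[Block], events: Sequence[Block], initial: State,
            prop: Callable[[State], bool]) -> Tuple[int, List[List[str]]]:
    explored, violations = 0, []
    for schedule in interleavings(balancer, events):
        state = dict(initial)
        for block in schedule:
            block(state)
        explored += 1
        if not prop(state):  # property checked at the end of load balancing
            violations.append([blk.__name__ for blk in schedule])
    return explored, violations


initial = {"victim": 1, "thief": 0, "seen": 0, "forked": 0}
explored, plain_bad = explore([observe, steal], [fork], initial, plain_wc)
_, concurrent_bad = explore([observe, steal], [fork], initial, concurrent_wc_holds)
```

In this toy model the naive property fails on the schedules where the fork lands after the observation, while the concurrent variant holds on all three schedules. The number of schedules grows combinatorially with the number of blocks, which is why keeping the set of shared variables small matters.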

Page 28: Provable Multicore Schedulers with Ipanema: Application to ...

Evaluation

CFS-CWC (365 LOC): hierarchical CFS-like scheduler

CFS-CWC-FLAT (222 LOC): single-level CFS-like scheduler

ULE-CWC (244 LOC): BSD-like scheduler

Page 29: Provable Multicore Schedulers with Ipanema: Application to ...

Less idle time

FT.C (NAS benchmark)

Page 30: Provable Multicore Schedulers with Ipanema: Application to ...

Comparable or better performance

NAS benchmarks (lower is better)

Page 31: Provable Multicore Schedulers with Ipanema: Application to ...

Comparable or better performance

Sysbench on MySQL (higher is better)

Page 32: Provable Multicore Schedulers with Ipanema: Application to ...

Conclusion

Work conservation: not straightforward! … new formalism: concurrent work conservation!

Complex concurrency scheme … proofs made tractable using a DSL.

Performance: similar or better than CFS.