Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating...
philip-hancock -
![Page 1: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/1.jpg)
Execution Replay and Debugging
![Page 2: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/2.jpg)
Contents
![Page 3: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/3.jpg)
Introduction
• Parallel program: set of co-operating processes
• Co-operation using
  – shared variables
  – message passing
• Developing parallel programs is considered difficult:
  – normal errors as in sequential programs
  – synchronisation errors (deadlock, races)
  – performance errors
We need good development tools
![Page 4: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/4.jpg)
Debugging of parallel programs
• Most used technique: cyclic debugging
• Requires repeatable equivalent executions
• Is a problem for parallel programs: lots of non-determinism present
• Solution: execution replay mechanism:
  – record phase: trace information about the non-deterministic choices
  – replay phase: force an equivalent re-execution using the trace, allowing the use of intrusive debugging techniques
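The two phases can be illustrated with a minimal sketch, assuming a single process whose only non-determinism is a random number source. The `ReplayLog` class and its `choose` method are hypothetical names introduced for this illustration; they do not belong to any tool named in these slides.

```python
import random


class ReplayLog:
    """Hypothetical helper: records non-deterministic results, then replays them."""

    def __init__(self):
        self.trace = []       # the recorded non-deterministic choices
        self.mode = "record"
        self._pos = 0

    def start_replay(self):
        self.mode = "replay"
        self._pos = 0

    def choose(self, source):
        if self.mode == "record":
            value = source()              # really perform the non-deterministic call
            self.trace.append(value)      # and trace its outcome
            return value
        value = self.trace[self._pos]     # replay: force the recorded outcome
        self._pos += 1
        return value


def program(log):
    # Three non-deterministic choices; everything else is deterministic.
    return [log.choose(lambda: random.randint(0, 99)) * 2 for _ in range(3)]


log = ReplayLog()
recorded_run = program(log)   # record phase
log.start_replay()
replayed_run = program(log)   # replay phase: equivalent re-execution
```

Because only the outcomes of the non-deterministic calls are traced, the replayed run can be slowed down by a debugger without changing its result.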
![Page 5: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/5.jpg)
Non-determinism
• Classes:
  – external vs. internal non-determinism
  – desired vs. undesired non-determinism
• Important: the amount of non-determinism depends on the abstraction level. E.g. a semaphore P()-operation can be fully deterministic while consisting of a number of non-deterministic spinlocking operations.
![Page 6: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/6.jpg)
Causes of Non-determinism
– In sequential programs:
  • program code (self modifying code?)
  • program input (disk, keyboard, network, ...)
  • certain system calls (gettimeofday())
  • interrupts, signals, ...
– In parallel programs:
  • accesses to shared variables: race conditions (synchronisation races and data races)
– In distributed programs:
  • promiscuous receive operations
  • test operations for non-blocking message operations
![Page 7: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/7.jpg)
Main Issues in Execution Replay
• recorded execution = original execution:
  – trace as little as possible in order to limit the overhead
    • in time
    • in space
• replayed execution = recorded execution:
  – faithful re-execution: trace enough
![Page 8: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/8.jpg)
Execution Replay Methods
• Two types: content- vs. ordering-based
  – content-based: force each process to read the same value or to receive the same message as during the original execution
  – ordering-based: force each process to access the variables or to receive the message in the same logical order as during the original execution
![Page 9: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/9.jpg)
Logical Clocks for Ordering-based Methods
• A clock C() attaches a timestamp C(x) to an event x
• Used for tracing the logical order of events
• Clock condition: $a \rightarrow b \Rightarrow C(a) < C(b)$
• Clocks are strongly consistent if $a \rightarrow b \Leftrightarrow C(a) < C(b)$
• New timestamp is the increment of the maximum of the old timestamps of the process and the object
![Page 10: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/10.jpg)
Scalar Clocks
• Aka Lamport Clocks
• Simple and fast update algorithm: $SC'(p) = SC'(o) = \max(SC(p), SC(o)) + 1$
• Scales very well with the number of processes
• Provides only limited information:
  $SC(a) < SC(b) \Rightarrow a \rightarrow b \ \text{or}\ a \parallel b$
  $SC(a) > SC(b) \Rightarrow b \rightarrow a \ \text{or}\ a \parallel b$
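The scalar update rule on this slide (new timestamp = maximum of the process and object timestamps, plus one) can be sketched as follows. The dictionaries `clock` and `obj` and the function names are illustrative assumptions, not taken from the slides.

```python
def scalar_update(sc_process, sc_object):
    """Return the new timestamp shared by the process and the object."""
    return max(sc_process, sc_object) + 1


clock = {"p1": 0, "p2": 0}   # one scalar clock per process
obj = {"M": 0}               # one scalar timestamp per shared object


def access(process, name):
    """Timestamp one access of `process` to the shared object `name`."""
    new = scalar_update(clock[process], obj[name])
    clock[process] = obj[name] = new
    return new


t1 = access("p1", "M")   # p1 touches M first
t2 = access("p1", "M")   # p1 touches M again
t3 = access("p2", "M")   # p2 inherits p1's progress through M
```

The third access receives timestamp 3 although it is the first event of `p2`: the object carries the ordering information from one process to the other.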
![Page 11: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/11.jpg)
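Vector Clocks
• A vector clock for a program using N processes consists of N scalar values
• Update: $VC'(p) = VC'(o) = \sup(VC(p), VC(o)) + (0, \ldots, 0, 1, 0, \ldots, 0)$
• Such a clock is strongly consistent: by comparing vector timestamps one can deduce concurrency information:
  $VC(a) < VC(b) \Leftrightarrow a \rightarrow b$
  $VC(a) > VC(b) \Leftrightarrow b \rightarrow a$
  $VC(a)$ and $VC(b)$ incomparable $\Leftrightarrow a \parallel b$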
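A minimal sketch of the vector clock update and comparison, assuming timestamps are stored as tuples with one slot per process. The function names are chosen for this illustration only.

```python
def vc_update(vc_process, vc_object, index):
    """Component-wise maximum (sup), then increment the slot of process `index`."""
    new = [max(a, b) for a, b in zip(vc_process, vc_object)]
    new[index] += 1
    return tuple(new)


def happened_before(vc_a, vc_b):
    """a -> b iff VC(a) <= VC(b) in every component and VC(a) != VC(b)."""
    return vc_a != vc_b and all(x <= y for x, y in zip(vc_a, vc_b))


def concurrent(vc_a, vc_b):
    """a || b iff the two distinct timestamps are incomparable."""
    return (vc_a != vc_b
            and not happened_before(vc_a, vc_b)
            and not happened_before(vc_b, vc_a))


a = vc_update((0, 0), (0, 0), 0)   # first event of process 0
b = vc_update((0, 0), (0, 0), 1)   # first event of process 1, on another object
c = vc_update(b, a, 1)             # process 1 now touches the object of event a
```

Here `a` and `b` are concurrent, while `a` happened before `c`; a scalar clock could not tell these two situations apart.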
![Page 12: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/12.jpg)
An Example Program
• A parallel program with two threads, communicating using shared variables: A, B, MA and MB. Local variables are x and y.
• M is used as a mutex using an atomic swap operation provided by the CPU:
  swap(memloc, value): atomically store value in [memloc] and return the previous contents of [memloc]
![Page 13: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/13.jpg)
An Example Program (II)
• Lock operation on a mutex M is implemented (in a library) as:
  while (swap(M, 1) == 1);
• Unlock operation on a mutex M is implemented as:
  M = 0;
• All variables are initially 0
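The swap-based lock can be simulated in software. This is only a sketch: Python has no atomic swap instruction, so a `threading.Lock` named `_atomic` (an assumption of this example) stands in for the atomicity the CPU would provide, and the unlock store goes through it as well so that the simulation stays faithful. The `memory` dictionary and the `worker` function are likewise illustrative.

```python
import threading

_atomic = threading.Lock()            # stands in for the CPU's atomicity guarantee
memory = {"M": 0, "counter": 0}       # all variables are initially 0


def swap(memloc, value):
    """Atomically store value in memory[memloc] and return the old contents."""
    with _atomic:
        old = memory[memloc]
        memory[memloc] = value
        return old


def lock(m):
    while swap(m, 1) == 1:
        pass                          # spin until the swap returns 0


def unlock(m):
    with _atomic:                     # a plain store on real hardware
        memory[m] = 0


def worker(n):
    for _ in range(n):
        lock("M")
        memory["counter"] += 1        # critical section
        unlock("M")


threads = [threading.Thread(target=worker, args=(100,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The number of failed swaps per lock operation differs from run to run, which is exactly the low-level non-determinism that hides behind a deterministic-looking lock.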
![Page 14: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/14.jpg)
An Example Program (III)
• The example program:
  Thread 1: L(MA); A=8; U(MA); L(MB); B=7; U(MB);
  Thread 2: B=6; L(MB); x=B; U(MB); L(MA); y=A; U(MA);
![Page 15: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/15.jpg)
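A Possible Execution: Low Level View
[Figure: timeline of the two threads at the level of individual memory operations]
Thread 1: swap(MA,1)→0; A=8; MA=0; swap(MB,1)→0; B=7; MB=0
Thread 2: B=6; swap(MB,1)→1; swap(MB,1)→1; swap(MB,1)→1; swap(MB,1)→0; x=B; MB=0; swap(MA,1)→0; y=A; MA=0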
![Page 16: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/16.jpg)
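A Possible Execution: High Level View
[Figure: timeline of the two threads at the level of lock and unlock operations, with a time axis]
Thread 1: L(MA); A=8; U(MA); L(MB); B=7; U(MB)
Thread 2: B=6; L(MB); x=B; U(MB); L(MA); y=A; U(MA)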
![Page 17: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/17.jpg)
Recap
• A content-based replay method: the value read by each load operation is stored
• Trace generation of 1 MB/s was measured on a VAX 11/780
• Impractical method: the time needed to record the large amount of trace information modifies the initial execution
• One advantage: possible to replay a subset of the processes in isolation.
![Page 18: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/18.jpg)
Recap: Example
[Figure: the low-level execution annotated with the value logged for each read]
Thread 1 logs: 0 (swap(MA,1)); 0 (swap(MB,1))
Thread 2 logs: 1, 1, 1, 0 (swap(MB,1)); 7 (x=B); 0 (swap(MA,1)); 8 (y=A)
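The advantage of content-based replay, replaying one process in isolation, can be sketched as follows. The trace `[1, 1, 1, 0, 7, 0, 8]` is this example's reading of the values annotated on the slide for thread 2 and should be treated as an assumption; `make_replayer` and the `read`/`write` hooks are hypothetical names.

```python
def make_replayer(trace):
    """Return read/write hooks that feed logged values back to a thread."""
    values = iter(trace)

    def read(_location):
        return next(values)       # every read returns the logged value

    def write(_location, _value):
        pass                      # writes need no shared memory during replay

    return read, write


def thread2(read, write):
    """Thread 2 of the example program, expressed against the hooks."""
    write("B", 6)
    while read("MB") == 1:        # L(MB): value returned by swap(MB,1)
        pass
    x = read("B")
    write("MB", 0)                # U(MB)
    while read("MA") == 1:        # L(MA): value returned by swap(MA,1)
        pass
    y = read("A")
    write("MA", 0)                # U(MA)
    return x, y


trace_thread2 = [1, 1, 1, 0, 7, 0, 8]
x, y = thread2(*make_replayer(trace_thread2))
```

Thread 1 never runs here, yet thread 2 spins three times and ends with the same local values as in the recorded execution. The price is that every single read has to be logged.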
![Page 19: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/19.jpg)
Instant Replay
• First ordering-based replay method
• Developed for CREW-algorithms
• Each shared object receives a version number that is updated or logged at each CREW-operation:
  – read: the version number is logged
  – write:
    • the version number is incremented
    • the number of preceding read operations is logged
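The record-phase bookkeeping described on this slide can be sketched as below. The class `CrewObject` and the functions `logged_read` and `logged_write` are hypothetical names, and the sketch covers only the logging, not the replay phase or the CREW locking itself.

```python
class CrewObject:
    """A shared object with the bookkeeping fields used during recording."""

    def __init__(self):
        self.version = 0
        self.reads_since_write = 0


def logged_read(obj, process_log):
    process_log.append(obj.version)            # read: log the version number
    obj.reads_since_write += 1


def logged_write(obj, process_log):
    process_log.append(obj.reads_since_write)  # write: log preceding reads
    obj.version += 1                           # and increment the version
    obj.reads_since_write = 0


MA = CrewObject()
writer_log, reader_log = [], []
logged_write(MA, writer_log)   # writer: version becomes 1, logs 0 reads
logged_read(MA, reader_log)    # reader: logs version 1
```

During replay, a reader would wait until the object reaches the logged version, and a writer would wait until the logged number of reads has taken place.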
![Page 20: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/20.jpg)
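Instant Replay: Example
[Figure: high-level view with the lock operations split into writer (Lw/Uw) and reader (Lr/Ur) variants]
Thread 1: Lw(MA); A=8; Uw(MA); Lw(MB); B=7; Uw(MB)
Thread 2: B=6; Lr(MB); x=B; Ur(MB); Lr(MA); y=A; Ur(MA)
Annotations in thread 1 (twice): version: 1, log 0 reads
Annotations in thread 2 (twice): log version 1
Marked in the figure: PROBLEM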
![Page 21: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/21.jpg)
Netzer
• Widely cited method
• Attaches a vector clock to each process. The clocks attach a timestamp to each memory operation.
• Uses vector clocks to detect concurrent (racing) memory operations
• Automatically traces transitive reduction of the dependencies
![Page 22: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/22.jpg)
Netzer: Basic Idea
[Figure: the operations swap(MB,1)→0, B=7 and B=6 (shown twice), with the question: Is this order guaranteed?]
![Page 23: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/23.jpg)
Netzer: Transitive Reduction
[Figure: the operations B=7, MB=0, swap(MB,1)→0 and x=B]
![Page 24: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/24.jpg)
Netzer: Example
[Figure: the low-level execution of the example program (threads 1 and 2), without timestamps]
![Page 25: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/25.jpg)
Netzer: Example
[Figure: the low-level execution annotated with vector timestamps]
Thread 1: swap(MA,1)→0 (1,0); A=8 (2,0); MA=0 (3,0); swap(MB,1)→0 (4,0); B=7 (5,1); MB=0 (6,4)
Thread 2: B=6 (0,1); swap(MB,1)→1 (4,2); swap(MB,1)→1 (4,3); swap(MB,1)→1 (4,4); swap(MB,1)→0 (6,5); x=B (6,6); MB=0 (6,7); swap(MA,1)→0 (6,8); y=A (6,9); MA=0 (6,10)
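The vector timestamps of this example can be reproduced with a short simulation. This is a simplified sketch, not Netzer's published algorithm: it treats every access alike (no read/write distinction), and it traces a dependency whenever the object's last timestamp is not already covered by the accessing thread's clock. The interleaving in `execution` is one order consistent with the slide and is an assumption of this example.

```python
N = 2  # two threads; index 0 is thread 1, index 1 is thread 2


def sup(a, b):
    return tuple(max(x, y) for x, y in zip(a, b))


def leq(a, b):
    return all(x <= y for x, y in zip(a, b))


# (thread index, shared location) for each memory operation, in execution order
execution = [
    (1, "B"),                                    # B=6
    (0, "MA"), (0, "A"), (0, "MA"), (0, "MB"),   # swap(MA,1)->0; A=8; MA=0; swap(MB,1)->0
    (1, "MB"),                                   # swap(MB,1)->1
    (0, "B"),                                    # B=7
    (1, "MB"), (1, "MB"),                        # two more failed swaps
    (0, "MB"),                                   # MB=0
    (1, "MB"), (1, "B"), (1, "MB"),              # swap(MB,1)->0; x=B; MB=0
    (1, "MA"), (1, "A"), (1, "MA"),              # swap(MA,1)->0; y=A; MA=0
]

proc = [(0,) * N for _ in range(N)]   # vector clock per thread
obj = {}                              # last timestamp per shared location
stamps = [[] for _ in range(N)]       # timestamps in program order, per thread
traced = []                           # dependencies not implied by earlier ones

for thread, loc in execution:
    last = obj.get(loc, (0,) * N)
    if not leq(last, proc[thread]):
        traced.append((loc, last))    # the order is not guaranteed: trace it
    new = list(sup(proc[thread], last))
    new[thread] += 1
    proc[thread] = obj[loc] = tuple(new)
    stamps[thread].append(proc[thread])
```

In this run `x=B` is not traced: its order after `B=7` already follows from the traced dependency between `MB=0` and the successful `swap(MB,1)`, which is the transitive reduction at work.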
![Page 26: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/26.jpg)
Netzer: Example
[Figure: the same execution and vector timestamps as on the previous slide]
![Page 27: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/27.jpg)
Netzer: Problems
• Size of vector clock grows with the number of processes
  – the method doesn’t scale well
  – programs that create threads dynamically?
• A vector timestamp has to be attached to all shared memory locations: huge space overhead.
• The method basically detects all data and synchronisation races and replays them.
![Page 28: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/28.jpg)
ROLT
• Attaches a Lamport clock to each process. The clocks attach a timestamp to each memory operation.
• Does not detect racing operations, but merely re-executes them in the same order.
• Also automatically traces transitive reduction of the dependencies
![Page 29: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/29.jpg)
ROLT: Example
[Figure: the low-level execution annotated with scalar timestamps]
Thread 1 timestamps, in program order: 1, 2, 3, 4, 5, 8
Thread 2 timestamps, in program order: 1, 5, 6, 7, 9, 10, 11, 12, 13, 14
![Page 30: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/30.jpg)
ROLT: Example
[Figure: the low-level execution annotated with scalar timestamps and the traced information]
Thread 1 timestamps, in program order: 1, 2, 3, 4, 5, 8
Thread 2 timestamps, in program order: 1, 5, 6, 7, 9, 10, 11, 12, 13, 14
Traced: thread 1: (5,8); thread 2: (1,5), (7,9)
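The scalar timestamps and the traced pairs of this example can be reproduced with the sketch below. It assumes that a pair (old, new) is traced whenever a thread's clock jumps by more than one, which is this example's reading of the "Traced" annotation on the slide; the interleaving in `execution` is likewise an assumption consistent with the figure.

```python
# (thread index, shared location) for each memory operation, in execution order;
# index 0 is thread 1, index 1 is thread 2
execution = [
    (1, "B"),                                    # B=6
    (0, "MA"), (0, "A"), (0, "MA"), (0, "MB"),   # swap(MA,1)->0; A=8; MA=0; swap(MB,1)->0
    (1, "MB"),                                   # swap(MB,1)->1
    (0, "B"),                                    # B=7
    (1, "MB"), (1, "MB"),                        # two more failed swaps
    (0, "MB"),                                   # MB=0
    (1, "MB"), (1, "B"), (1, "MB"),              # swap(MB,1)->0; x=B; MB=0
    (1, "MA"), (1, "A"), (1, "MA"),              # swap(MA,1)->0; y=A; MA=0
]

clock = [0, 0]        # Lamport clock per thread
obj = {}              # last timestamp per shared location
stamps = [[], []]     # timestamps in program order, per thread
traced = [[], []]     # (old, new) clock jumps, per thread

for thread, loc in execution:
    new = max(clock[thread], obj.get(loc, 0)) + 1
    if new != clock[thread] + 1:
        # The clock was pushed forward by another thread: record the jump.
        traced[thread].append((clock[thread], new))
    clock[thread] = obj[loc] = new
    stamps[thread].append(new)
```

Only three pairs of integers are traced for sixteen memory operations, which illustrates why the trace files stay small.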
![Page 31: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/31.jpg)
ROLT: Example
[Figure: the same execution and scalar timestamps as on the previous slide]
![Page 32: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/32.jpg)
ROLT: Example
[Figure: the same execution and scalar timestamps as on the previous slide]
![Page 33: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/33.jpg)
ROLT using three phases
• Problem: high overhead due to the tracing of all memory operations
• Solution: only record/replay the synchronisation operations (subset of all race conditions)
• Problem: no correct replay possible if the execution contains a data race
• Solution: add a third phase for detecting the data races
![Page 34: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/34.jpg)
ROLT using three phases
• Phase 1: record the order of the synchronisation races
• Phase 2: replay the synchronisation races while using intrusive data race detection techniques
• Phase 3: replay the synchronisation races and use cyclic debugging techniques to find the 'normal' errors
![Page 35: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/35.jpg)
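ROLT: Example
[Figure: high-level view with scalar timestamps on the synchronisation operations only]
Thread 1: L(MA) 1; A=8; U(MA) 2; L(MB) 3; B=7; U(MB) 4
Thread 2: B=6; L(MB) 5; x=B; U(MB) 6; L(MA) 7; y=A; U(MA) 8
Traced: thread 1: -; thread 2: (0,5)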
![Page 36: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/36.jpg)
ROLT
• ROLT replays synchronisation races and detects data races.
• The method scales well and has a small space and time overhead.
• Produces small trace files.
• A total order is imposed, which introduces artificial dependencies.
![Page 37: Execution Replay and Debugging. Contents Introduction Parallel program: set of co-operating processes Co-operation using shared variables message passing.](https://reader034.fdocuments.in/reader034/viewer/2022052607/5a4d1b687f8b9ab0599b1c68/html5/thumbnails/37.jpg)
Conclusions