TRACES
Code Padding to Improve the WCET Calculability
Christine Rochange and Pascal Sainrat
Institut de Recherche en Informatique de Toulouse, Toulouse

WCET evaluation
Static WCET analysis
IPET: Implicit Path Enumeration Technique
  flow analysis
  low-level analysis
  WCET computation
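
Implicit Path Enumeration Technique
[Figure: control-flow graph with blocks A, B, C, D, E and edges A→B, B→C, B→E, C→D, E→D, D→A]
$x_A = 1 + x_{DA} = 1 + x_{AB}$
$x_B = x_{AB} = x_{BC} + x_{BE}$
$x_C = x_{BC} = x_{CD}$
$x_D = x_{CD} + x_{ED} = x_{DA}$
$x_E = x_{BE} = x_{ED}$
$x_{BC} = x_{BE}$
$x_{DA} \le N$
$T = \sum_i x_i \cdot t_i^{max} + \sum_{i,j} x_{ij} \cdot \delta_{ij}$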
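A minimal sketch of the IPET formulation above for the five-block graph. Real tools hand these constraints to an ILP solver; this sketch simply enumerates all edge counts. The loop bound `N = 4`, the block times (5 cycles) and the per-edge corrections (-4 cycles) are hypothetical values chosen for illustration, and all function names are assumptions.

```python
from itertools import product

N = 4                                   # hypothetical loop bound (x_DA <= N)
EDGES = ["DA", "AB", "BC", "BE", "CD", "ED"]
T_MAX = {b: 5 for b in "ABCDE"}         # hypothetical worst-case block times
DELTA = {e: -4 for e in EDGES}          # hypothetical per-edge corrections


def block_counts(x):
    """Block execution counts derived from the edge counts x."""
    return {"A": 1 + x["DA"], "B": x["AB"], "C": x["BC"],
            "D": x["DA"], "E": x["BE"]}


def feasible(x):
    """Flow constraints of the slide, written on edge counts."""
    return (1 + x["DA"] == 1 + x["AB"]            # x_A
            and x["AB"] == x["BC"] + x["BE"]      # x_B
            and x["BC"] == x["CD"]                # x_C
            and x["CD"] + x["ED"] == x["DA"]      # x_D
            and x["BE"] == x["ED"]                # x_E
            and x["BC"] == x["BE"]                # extra flow fact
            and x["DA"] <= N)                     # loop bound


def total_time(x):
    """Objective T: block terms plus edge terms."""
    counts = block_counts(x)
    return (sum(counts[b] * T_MAX[b] for b in counts)
            + sum(x[e] * DELTA[e] for e in EDGES))


def wcet_by_enumeration():
    """Return (T, x) maximising T over all feasible edge counts."""
    best = None
    for values in product(range(N + 1), repeat=len(EDGES)):
        x = dict(zip(EDGES, values))
        if feasible(x) and (best is None or total_time(x) > best[0]):
            best = (total_time(x), x)
    return best
```

With these illustrative numbers, `wcet_by_enumeration()` selects four loop iterations (`x["DA"] == 4`) and a total of 21 cycles.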
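
Pipelined execution
[Figure: reservation tables over the pipeline stages FETCH (F), FU1, FU2 and COMPL. (C) for two basic blocks B1 and B2. Each block takes 5 cycles in isolation and the sequence B1-B2 takes 6 cycles, i.e. a timing effect of -4 cycles.]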

Long Timing Effects (1)
[Figure: reservation tables (FETCH, FU1, FU2, COMPL.) for three blocks A, B and C. Each block takes 5 cycles in isolation and each pair of consecutive blocks takes 6 cycles (pairwise timing effects of -4). Summing these terms gives 5 + 5 + 5 - 4 - 4 = 7 cycles, but the sequence A-B-C actually takes 8 cycles: a long timing effect of +1.]
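The numbers on this slide (5 cycles per block, 6 per pair, 8 for A-B-C) can be turned into timing effects mechanically. The sketch below assumes the usual definition in which the effect of a sequence is its measured time minus all terms already attributed to its blocks and shorter sub-sequences; the function name and the input layout are hypothetical.

```python
def timing_effects(measured):
    """Derive timing effects from measured execution times.

    measured maps a tuple of consecutive blocks to its cycle count and
    must contain every contiguous sub-sequence of each entry.
    """
    delta = {}
    # Shorter sequences first, so nested effects are known when needed.
    for seq in sorted(measured, key=len):
        if len(seq) < 2:
            continue
        known = sum(measured[(b,)] for b in seq)
        for length in range(2, len(seq)):
            for start in range(len(seq) - length + 1):
                known += delta[seq[start:start + length]]
        delta[seq] = measured[seq] - known
    return delta


measured = {
    ("A",): 5, ("B",): 5, ("C",): 5,
    ("A", "B"): 6, ("B", "C"): 6,
    ("A", "B", "C"): 8,
}
effects = timing_effects(measured)
```

Here `effects` holds -4 for A-B, -4 for B-C and +1 for A-B-C, matching the slide.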
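
Long Timing Effects (2)
$t_{1 \ldots n} = \sum_{i=1}^{n} t_i + \sum_{1 \le j \le k \le n} \delta_{j \ldots k}$
Terms for a five-block sequence A-B-C-D-E (J. Engblom):
  $t_A \quad t_B \quad t_C \quad t_D \quad t_E$
  $\delta_{AB} \quad \delta_{BC} \quad \delta_{CD} \quad \delta_{DE}$
  $\delta_{ABC} \quad \delta_{BCD} \quad \delta_{CDE}$
  $\delta_{ABCD} \quad \delta_{BCDE}$
  $\delta_{ABCDE}$
$t_{AB}$, $t_{ABC}$, $t_{ABCD}$: execution times of the sequences A-B, A-B-C and A-B-C-D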
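A small sketch of the summation on this slide: the time of a sequence is the sum of its block times plus the timing effects of all its contiguous sub-sequences. It assumes that effects are only defined for sub-sequences of at least two blocks (the single-block case being covered by the block times) and that a missing effect is zero; the function name and the dictionary layout are hypothetical.

```python
def sequence_time(blocks, t, delta):
    """Execution time of a block sequence from block times and effects.

    t     maps a block name to its isolated execution time.
    delta maps a tuple of consecutive blocks to its timing effect;
          absent entries are treated as 0 (an assumption).
    """
    total = sum(t[b] for b in blocks)
    n = len(blocks)
    for j in range(n):
        for k in range(j + 1, n):
            total += delta.get(tuple(blocks[j:k + 1]), 0)
    return total


t = {"A": 5, "B": 5, "C": 5}
short_only = {("A", "B"): -4, ("B", "C"): -4}
with_long = dict(short_only)
with_long[("A", "B", "C")] = 1
```

Using the values of the previous slide, `sequence_time(["A", "B", "C"], t, short_only)` gives 7 cycles, while including the long timing effect (`with_long`) gives the actual 8 cycles.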

Motivation
Long timing effects are:
  difficult to quantify (they might span very long sequences)
  difficult to integrate into WCET computation
Long timing effects increase the variability of execution times
Our goal: eliminate long timing effects

Outline
Our approach: code padding
Implementation
  software framework
  analysis algorithms
    to identify resource requirements
    to compute safe padding lengths
Experimental results
Concluding remarks

Code padding
[Figure: reservation tables (FETCH, FU1, FU2, COMPL.) showing a filler instruction added to the code]

Example (1)
block i   inst i1, inst i2, …, inst ini   requires a 4-cycle delay
block j   inst j1, inst j2, …, inst jnj   requires a 3-cycle delay
block k   inst k1, …, inst knk            requires a 1-cycle delay

Example (2)
block i   inst i1, inst i2, …, inst ini   4-cycle delay: nop × 8
block j   inst j1, inst j2, …, inst jnj   3-cycle delay: nop × 6
block k   inst k1, …, inst knk            1-cycle delay: nop × 2
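The nop counts on this slide (8, 6 and 2 nops for delays of 4, 3 and 1 cycles) suggest two filler instructions per cycle of delay. The sketch below generalises this under the assumption that one cycle of delay costs as many nops as the pipeline fetches per cycle; the function name and the `issue_width` parameter are hypothetical.

```python
def filler_nops(delay_cycles, issue_width=2, filler="nop"):
    """Return the inline filler sequence for a given delay.

    Assumes the pipeline consumes `issue_width` fillers per cycle, so a
    delay of d cycles needs d * issue_width filler instructions.
    """
    if delay_cycles < 0:
        raise ValueError("delay must be non-negative")
    return [filler] * (delay_cycles * issue_width)


# Delays required by blocks i, j and k in the example.
padding = {name: filler_nops(d) for name, d in {"i": 4, "j": 3, "k": 1}.items()}
```

With the default width of 2, the three blocks need 8, 6 and 2 nops, 16 filler instructions in total.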

Example (3)
block i   inst i1, inst i2, …, inst ini   4-cycle delay: bl delay4
block j   inst j1, inst j2, …, inst jnj   3-cycle delay: bl delay3
block k   inst k1, …, inst knk            1-cycle delay: nop nop
filler block:
  delay4: nop nop
  delay3: nop nop
  delay2: blr
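
Code padding framework
C source code -> gcc compiler -> assembly code -> gas assembler -> object code
object code -> CFG extractor -> list of basic blocks
list of basic blocks -> cycle-level simulator -> execution traces of block sequences
execution traces of block sequences -> interference analysis -> padding lengths
assembly code + padding lengths -> code padding -> safe padded assembly code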

Analyzing resource requirements (1)
Requirements of a basic block
foreach block B do {
    ff[B] ← first fetch cycle of B;
    lf[B] ← last fetch cycle of B + 1;
    foreach resource R do {
        n[R] ← cycle at which R is needed;
        r[R] ← cycle at which R is released;   // 0 if R not used by B
        n[R,B] ← n[R] – ff[B];
        r[R,B] ← r[R] – lf[B];                 // 0 if R not used by B
    }
    d[B] ← 0;
}
[Figure: reservation table (FETCH, FU1, FU2, COMPL.) over cycles 1 to 5]
Example:
    ff[B2] = 1, lf[B2] = 2
    n[FU1,B2] = 0, r[FU1,B2] = 0
    n[FU2,B2] = 1, r[FU2,B2] = 3
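The per-block analysis above can be sketched as follows. The input layout is an assumption: fetch cycles and per-resource (needed, released) cycles are taken as already extracted from a simulation trace, with `None` marking a resource the block does not use. The absolute cycles 2 and 5 for FU2 in the usage lines are back-computed from the slide's relative values, not read from the original figure.

```python
def block_requirements(fetch_cycles, usage):
    """Relative resource requirements of one basic block.

    fetch_cycles: cycles in which the block's instructions are fetched.
    usage: resource -> (needed_cycle, released_cycle), or None if unused.
    Returns (ff, lf, n, r), n and r being relative to ff and lf.
    """
    ff = min(fetch_cycles)          # first fetch cycle
    lf = max(fetch_cycles) + 1      # last fetch cycle + 1
    n, r = {}, {}
    for resource, span in usage.items():
        if span is None:            # resource not used by the block
            n[resource], r[resource] = 0, 0
        else:
            needed, released = span
            n[resource] = needed - ff
            r[resource] = released - lf
    return ff, lf, n, r


# Block B2 of the slide: fetched in cycle 1, FU1 unused.
ff, lf, n, r = block_requirements([1], {"FU1": None, "FU2": (2, 5)})
```

This reproduces the slide's example: `ff = 1`, `lf = 2`, `n["FU2"] = 1` and `r["FU2"] = 3`, with zeros for FU1.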

Analyzing resource requirements (2)
Requirements of a sequence
foreach sequence B1-…-Bx (x < n) do {
    lf[Bx] ← last fetch cycle of Bx + 1;
    foreach resource R do {
        r[R] ← cycle at which R is released;   // 0 if R not used by any Bi
        r[R,B1-…-Bx] ← r[R] – lf[Bx];
    }
}
[Figure: reservation table (FETCH, FU1, FU2, COMPL.) over cycles 1 to 6]
Example:
    lf[B2] = 3
    r[FU1,B1-B2] = 2, r[FU2,B1-B2] = 3
    r[FU1,B2] = 0
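The same idea applied to a sequence of blocks, as a short sketch. It assumes the release cycle of each resource has been read from the trace of the whole sequence, with `None` for a resource that no block of the sequence uses; the absolute release cycles 5 and 6 in the usage lines are back-computed from the slide's relative values.

```python
def sequence_release(last_fetch_cycle, released):
    """Release times of resources relative to the end of a sequence.

    last_fetch_cycle: last fetch cycle of the final block Bx.
    released: resource -> absolute release cycle, or None if unused.
    """
    lf = last_fetch_cycle + 1
    return {res: 0 if cycle is None else cycle - lf
            for res, cycle in released.items()}


# Sequence B1-B2 of the slide: B2's last fetch cycle is 2, so lf[B2] = 3.
r_seq = sequence_release(2, {"FU1": 5, "FU2": 6})
```

The result gives `r[FU1,B1-B2] = 2` and `r[FU2,B1-B2] = 3`. Compared with `r[FU1,B2] = 0` for B2 alone, it shows that B1 keeps FU1 busy beyond what B2 requires on its own.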

Computing padding lengths (1)
Depth-1 strategy
objective: r[R,A-B] == r[R,B]
algorithm:
foreach sequence A-B do
    foreach resource R do
        if r[R,A-B] ≠ r[R,B] then {
            d ← StrictDelay(R,A-B);   // computes the padding length (iterative trials)
            if d > d[B] then
                d[B] ← d;
        }
example: r[FU1,B1-B2] = 2 > r[FU1,B2] = 0
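A sketch of the depth-1 loop above. `StrictDelay` is computed by iterative trials in the original framework and is not specified on the slide, so it is passed in as a callback. The stand-in used in the usage lines (the difference between the two release times) is purely hypothetical and only serves to make the example executable.

```python
def depth1_padding(pairs, r_seq, r_block, strict_delay):
    """Padding length d[B] for each block B, depth-1 strategy.

    pairs:        iterable of two-block sequences (A, B).
    r_seq:        (A, B) -> {resource: release time after A-B}.
    r_block:      B -> {resource: release time after B alone}.
    strict_delay: callback (resource, (A, B)) -> padding length;
                  a placeholder for the iterative StrictDelay.
    """
    d = {}
    for a, b in pairs:
        d.setdefault(b, 0)
        for resource, alone in r_block[b].items():
            if r_seq[(a, b)][resource] != alone:
                delay = strict_delay(resource, (a, b))
                if delay > d[b]:
                    d[b] = delay
    return d


# Values from the slides; the lambda is NOT the authors' StrictDelay.
r_block = {"B2": {"FU1": 0, "FU2": 3}}
r_seq = {("B1", "B2"): {"FU1": 2, "FU2": 3}}
d = depth1_padding(
    [("B1", "B2")], r_seq, r_block,
    strict_delay=lambda res, seq: r_seq[seq][res] - r_block[seq[1]][res],
)
```

Only FU1 violates the objective here, so with this stand-in the padding for B2 comes out as 2; a real `StrictDelay` may return a different length.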

Computing padding lengths (2)
Depth-n strategy
analyze (n+1)-block sequences (B0-B1-…-Bn)
objectives:
    for i < n: if r[R,B0-…-Bi] > n[R,Bi+1] : r[R,B0-…-Bi] == r[R,B1-…-Bi]
    r[R,B0-…-Bn] == r[R,B1-…-Bn]

Computing padding lengths (3)
Example: depth-4 algorithm
foreach sequence A-B-C-D-E do
    foreach resource R do
        if (n[R,C] > 0) && (r[R,A-B] > n[R,C]) && (r[R,A-B] > r[R,B]) then {
            d ← MinimumDelay(R,A-B-C);
            if d > d[B] then
                d[B] ← d;
        }
        elsif (n[R,D] > 0) && (r[R,A-B-C] > n[R,D]) && (r[R,A-B-C] > r[R,B-C]) then {
            d ← MinimumDelay(R,A-B-C);
            ...

Experimental results (1)
Code size increase (depth-1)
benchmark     2-way     4-way
matmul        35.24%    76.19%
ludcmp        16.51%    28.20%
jfdctint      11.37%    126.97%
bsort         31.25%    76.25%
heapsort      25.00%    51.47%
insertsort    23.81%    59.52%
MEAN          23.86%    69.77%
[Figure: increase of code size (0% to 80%) against the depth of the analysis (depth-1 to depth-4), for 2-way and 4-way pipelines]

Experimental results (2)
WCET increase
[Figure: increase of the real WCET (0% to 60%) against the depth of the analysis (depth-1 to depth-4), for 2-way and 4-way pipelines]

Concluding remarks
Inter-block long timing effects make the WCET analysis complex and pessimistic
Code padding prevents long timing effects and limits the variability of partial execution times
The cost of padding can be acceptable:
  code size (around 20% for a 2-way pipeline)
  real WCET increase (around 20%)
Future work: cost on the estimated WCET?