Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E...

34
Precise and Efficient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer , Rouven Naujoks Johannes Gutenberg-Universit¨ at Mainz Saarland University Max-Planck-Institut f¨ ur Informatik LCTES 2011, Chicago

Transcript of Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E...

Page 1: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Precise and Efficient Parametric Path Analysis

Ernst Althaus Sebastian Altmeyer Rouven Naujoks

Johannes Gutenberg-Universitat MainzSaarland University

Max-Planck-Institut fur Informatik

LCTES 2011 Chicago

1 Parametric Timing AnalysisMotivationToolchainPath Analysis

2 Singleton Loop ModelWhat is a Singleton LoopExploit the Singleton Loop ModelRuntimeBeyond Single Loops

3 EvaluationStructure of the BenchmarksPerformance EvaluationPrecision

4 Conclusions

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 2 20

Parametric Timing Analysis ndash Why

bull timing analysis essential for hard real-time systems

bull many systems depend on input parameters(operating system schedulers etc)

bull only two possible solutions

1 assume upper bounds on the unknown parametersrArr highly overapproximated execution-time bound

2 restart the analysis for all parameter assignmentsrArr very high analysis time

bull parametric timing analysis delivers timing formula instead of anumeric value

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 3 20

Parametric Timing Analysis ndash How

CFG Reconstruction

ValueLoop Analysis

Pipeline Analysis

Path Analysis

CFG Reconstruction extracts the control flow graph from theexecutable

ValueLoop Analysis determines values for registers and memoryaccesses determines loop bounds and parametric loopbound expressions

Pipeline Analysis derives bounds on the execution times T (vi ) ofall basic blocks

Path Analysis combines execution times of basic blocks and loopbounds to determine longest execution path

Framework according to [1]

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 4 20

Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2

bull Implicit path enumeration (IPET [4])

bull Control flow graph and the loop bounds are transformed intoflow constraints

bull Upper bounds for the execution times used as weights

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 5 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 2: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

1 Parametric Timing AnalysisMotivationToolchainPath Analysis

2 Singleton Loop ModelWhat is a Singleton LoopExploit the Singleton Loop ModelRuntimeBeyond Single Loops

3 EvaluationStructure of the BenchmarksPerformance EvaluationPrecision

4 Conclusions

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 2 20

Parametric Timing Analysis ndash Why

bull timing analysis essential for hard real-time systems

bull many systems depend on input parameters(operating system schedulers etc)

bull only two possible solutions

1 assume upper bounds on the unknown parametersrArr highly overapproximated execution-time bound

2 restart the analysis for all parameter assignmentsrArr very high analysis time

bull parametric timing analysis delivers timing formula instead of anumeric value

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 3 20

Parametric Timing Analysis ndash How

CFG Reconstruction

ValueLoop Analysis

Pipeline Analysis

Path Analysis

CFG Reconstruction extracts the control flow graph from theexecutable

ValueLoop Analysis determines values for registers and memoryaccesses determines loop bounds and parametric loopbound expressions

Pipeline Analysis derives bounds on the execution times T (vi ) ofall basic blocks

Path Analysis combines execution times of basic blocks and loopbounds to determine longest execution path

Framework according to [1]

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 4 20

Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2

bull Implicit path enumeration (IPET [4])

bull Control flow graph and the loop bounds are transformed intoflow constraints

bull Upper bounds for the execution times used as weights

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 5 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 3: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Parametric Timing Analysis ndash Why

bull timing analysis essential for hard real-time systems

bull many systems depend on input parameters(operating system schedulers etc)

bull only two possible solutions

1 assume upper bounds on the unknown parametersrArr highly overapproximated execution-time bound

2 restart the analysis for all parameter assignmentsrArr very high analysis time

bull parametric timing analysis delivers timing formula instead of anumeric value

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 3 20

Parametric Timing Analysis ndash How

CFG Reconstruction

ValueLoop Analysis

Pipeline Analysis

Path Analysis

CFG Reconstruction extracts the control flow graph from theexecutable

ValueLoop Analysis determines values for registers and memoryaccesses determines loop bounds and parametric loopbound expressions

Pipeline Analysis derives bounds on the execution times T (vi ) ofall basic blocks

Path Analysis combines execution times of basic blocks and loopbounds to determine longest execution path

Framework according to [1]

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 4 20

Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2

bull Implicit path enumeration (IPET [4])

bull Control flow graph and the loop bounds are transformed intoflow constraints

bull Upper bounds for the execution times used as weights

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 5 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 4: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Parametric Timing Analysis ndash How

CFG Reconstruction

ValueLoop Analysis

Pipeline Analysis

Path Analysis

CFG Reconstruction extracts the control flow graph from theexecutable

ValueLoop Analysis determines values for registers and memoryaccesses determines loop bounds and parametric loopbound expressions

Pipeline Analysis derives bounds on the execution times T (vi ) ofall basic blocks

Path Analysis combines execution times of basic blocks and loopbounds to determine longest execution path

Framework according to [1]

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 4 20

Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2

bull Implicit path enumeration (IPET [4])

bull Control flow graph and the loop bounds are transformed intoflow constraints

bull Upper bounds for the execution times used as weights

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 5 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 5: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2

bull Implicit path enumeration (IPET [4])

bull Control flow graph and the loop bounds are transformed intoflow constraints

bull Upper bounds for the execution times used as weights

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 5 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 6: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 7: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Parametric Path Analysis Longest Paths via ILP

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

maxsum

i

sumnjisininc(vi )

T (vi )nj

n1 = 1

n1 = n2 + n3

n2 + n5 = n4 + n6

n4 = n5

n3 + n6 = 1

n4 lt= bl middot n2 bl middot c

bull Non-linear inequalities rArr need for approximation

bull Need to solve an ILP parametric PIP [3]

bull slow and imprecise in case of parametric ILPbla

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 6 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 8: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 9: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Determing Longest Paths in Control Flow Graphs

Problem is NP-hard in general

But may be solved efficently for restricted graphsrArr Singleton-Loop Model

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 7 20

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 10: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

What is a Singleton Loop

Idea Code for hard real-time systems often well structured

bl

v1

v2

n1

v3

n2

v4n4

n5

v5

n6

n3

- A loop in a CFG is a stronglyconnected component (SCC)

- Structured loops (no Gotosetc) have a single entry node

A singleton loop is a SCC with exactly one entry nodeA singleton loop graph is a CFG that contaisn only singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 8 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 11: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 12: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v1

v1

v1

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node N

bull set weight of incident edgesappropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 13: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graph

bull Longest Path Computation inpolynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 14: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Detailed Explanation

v0

v1

v2

v3v4

v5

v6

v0

N

v6

w1 w2

bull Assume we know for each loop (byrecursion)

bull Longest paths from its entrynode to its portal nodes

bull Contract loop to artifical node Nbull set weight of incident edges

appropriatelyw1 = lps(v1 v4) + w(v4 v6)w2 = lps(v1 v5) + w(v5 v6)

bull Left with a directed acyclic graphbull Longest Path Computation in

polynomial time

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 9 20

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 15: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

The Singleton-Loop Model - How to recurse

v1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 by

bull two nodes v in1 vout

1 withbull in- and outgoing edges of

v1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 16: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 17: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 18: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

The Singleton-Loop Model - How to recurse

v in1 vout

1

v2

v3v4

v5

bull Given a loop L with loopbound bL

bull Recallwant to determine LPs fromentry node to portal nodes

bull Replace entry node v1 bybull two nodes v in

1 vout1 with

bull in- and outgoing edges ofv1 assigned accordingly

bull Recurse algorithm on thisnew graph

bull we know LP(vout1 vi ) and

LP(vout1 v in

1 )

bull lps(v1 pi ) = (bL minus 1) middotlps(vout

1 v in1 ) + lps(vout

1 pi ))

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 10 20

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 19: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Runtime Properties

Worst Case Running Times

bull numeric bounds O(|V ||E |)bull symbolic bounds O(|V ||E |+ |V |2 middot x middot s(x)) where

bull x is the of symbolic boundsbull s(x) is the output size

Output Size

In the worst case22xminus1 le s(x) le 22x

Output Sensitivity

The algorithm is polynomial output sensitive ie its running timeis polynomial in the input size and in the output size

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 11 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 20: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 21: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 22: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Beyond the Singleton-Loop Model

What happens if CFG has non-singleton loops

v1

v2v3

v4

v2 v3

v4

Convert the CFG

Each CFG can be transformed into an Singleton Loop Graph

Disadvantage Comes at the cost of increased running time(eg symbolic bounds can be doubled)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 12 20

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 23: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

How useful is the new approach in practise

Depends on

bull How well does the Singleton-Loop Model fits real CFGs

bull How does the Singleton-Loop Approach perform

bull How much precision is gained

Testsetting

bull Benchmarks from Malardalen WCET benchmark suite

bull Compiled via gcc to the ARM7 processor

bull Analyzed on an Intel Core2Duo 2GHz 2 GB Ram withUbuntu 910

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 13 20

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 24: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Number of Singleton Loops

bull Only 8 of 33 test-cases exhibit non-singleton loops(adpcm cnt compress duff matmult ndes ns qsort-exam)

bull Only in one case (compress) a higher number ofloop-duplications (65) is needed (all others lt 10)

Deeply nested loops unstructured code segments calls to externallibrary functions causes non-singleton loops

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 14 20

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 25: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Compare perfomance to

Numerica Path Analysis

bull ILP Formulation with lp solve (free lp-solver)bull ILP Formulation with CPLEX

(commercial lp-solver)

Parametric Path Analysis

bull Parametirc ILP Formulation with PIP [3](free parametric lp-solver)

bull Parametric timing analysis by Bygde andLisper [5 2]resorts to C-level not to binary level uses apolyhedran approachdirect comparison not possible

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 15 20

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 26: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Performance Evaluation ndash Test-Cases are ver small

Testcases from Malardalen WCET benchmark suite are very small(all are solved in less than 1 second by all approaches)

NameSize (in Byte) Singleton duplicated

C-File Exec Graph loops

s-graph-1 208273 235222 yes -s-graph-2 468944 292305 yes -s-graph-3 702670 386961 yes -s-graph-4 936396 481609 yes -s-graph-5 670452 284593 yes -

ns-graph-1 90274 215433 no 77ns-graph-2 315562 247443 no 77ns-graph-3 766144 426427 no 77ns-graph-4 990502 520579 no 5ns-graph-5 979908 518338 no 9ns-graph-6 942084 502580 no 74

Larger benchmarks created by combining and duplicating originaltest-cases from the benchmark suite

(s-graph-X are singleton loop graphs ns-graph-X non-singleton loop graphs)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 16 20

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 27: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Performance Evaluation Numeric

NameRuntime (s)

SingletonLoop lp solve CPLEX

nsichneu 002 086 005s-graph-1 003 346 008s-graph-2 005 1369 008s-graph-3 008 3085 011s-graph-4 011 5731 018s-graph-5 011 1088 013

adpcm 004 007 002compress 03 003 003statemate 005 03 004

ns-graph-1 097 45 004ns-graph-2 095 1458 005ns-graph-3 101 4813 012ns-graph-4 014 921 011ns-graph-5 016 1133 012ns-graph-6 064 659 017

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 17 20

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 28: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Performance Evaluation Parametric

NameRuntime of parameters

0 1 2 3 4 6 8

s-graph-1 003 003 003 003 003 - -s-graph-2 005 005 005 005 005 005 006s-graph-3 008 008 008 008 008 008 011s-graph-4 011 011 011 011 012 012 012s-graph-5 011 012 012 012 012 012 013

ns-graph-1 097 1 173 19 202 211 211ns-graph-2 095 203 203 204 236 235 238ns-graph-3 101 101 101 101 124 342 344ns-graph-4 014 022 028 03 033 045 045ns-graph-5 016 028 033 062 067 111 111ns-graph-6 064 064 064 066 071 119 119

measurements only of singleton loop methodall other approaches fail to solve these test-cases

(PIP and Bygdersquos approach [2] handle at most two parameters)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 18 20

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 29: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Evaluation Precision of the Parametric FormulasInsertsort Matmult

TimePIP(n) = 156n2 + 674n + 1186

TimeSingleton(n) = 131n2 + 71n + 1185

TimePIP(n) =

386n3 + 782n2

+ 790n + 643 if n gt 1

2992 if n le 1

TimeSingleton(n) = 111n3+164n2+845n+793

bull Singleton Loop Method is precise

bull PIP suffers fro imprecision due to loop bound transformation

bull Bygdersquos approach is precise in most but not in all cases

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 19 20

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 30: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

ConclusionsSingleton Loop Graphs are restricted CFG that enable computation of

bull numeric timing bound in polynomial time

bull parametric timing bound in output-polynomial time(significant improvment over former methods) and

bull precise parametric timing bounds

All CFGs can be transformed to singleton loop graphs(at the cost of performance loss)

Evaluation showed that

bull most benchmarks fit the singleton loop model

bull singleton loop approach can compete with CPLEX

bull enable fast and precise computation of parametric timing bounds

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 31: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Thanks for your attention

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 32: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

ReferencesS Altmeyer C Humbert B Lisper and R Wilhelm

Parametric timing analyis for complex architectures

In RTCSArsquo08 2008

S Bygde A Ermedahl and B Lisper

An efficient algorithm for parametric wcet calculation

In RTCSArsquo09 2009

P Feautrier

The parametric integer programmingrsquos home httpwwwpipliborg

Y-T S Li and S Malik

Performance analysis of embedded software using implicit pathenumeration

In DAC rsquo95

B Lisper

Fully automatic parametric worst-case execution time analysis

In (WCET 03)

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 33: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Appendix Malardalen Benachmark SuiteName

Size (in Byte) Singleton duplicatedC-File Exec Graph loops

adpcm 26582 156759 no 5bs 4248 144447 yes -bs100 2779 144629 yes -cnt 2880 149801 yes -compress 13411 149804 no 65cover 5026 148301 yes -crc 5168 145615 yes -duff 2374 144739 no 6edn 10563 150682 yes -expint 4288 145867 yes -fac 426 144148 yes -fdct 8863 147128 yes -fft1 6244 153303 yes -fibcall 3499 144152 yes -fir 11965 151589 yes -insertsort 3892 144305 yes -jannecomplex 1564 144242 yes -jfdctint 16028 146858 yes -lcdnum 1678 144509 yes -lms 7720 157868 yes -ludpcm 5160 151848 yes -matmult 3737 145083 no 4minver 5805 152845 yes -ndes 7345 148689 no 2ns 10436 149567 no 7nsichneu 118351 176240 yes -prime 904 144538 yes -qsort-exam 4535 146468 no 3qurt 4898 151214 yes -recursion 620 144341 yes -select 4494 146283 yes -sqrt 3567 154282 yes -statemate 52618 162879 no 7Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions
Page 34: Precise and Efficient Parametric Path Analysisaltmeyer/slides/paramPath-lctes11.pdfPrecise and E cient Parametric Path Analysis Ernst Althaus, Sebastian Altmeyer, Rouven Naujoks Johannes

Appendix Imprecision of Parametric ILP Approach

v1L1

v2

v3

L2

v5

v6

v7

bull Only one loop taken in actual execution

bull parametric ILP needs to upper bound entry node both loopsare part of the WCET Path

Althaus Altmeyer Naujoks Precise and Efficient Parametric Path Analysis 20 20

  • Parametric Timing Analysis
    • Motivation
    • Toolchain
    • Path Analysis
      • Singleton Loop Model
        • What is a Singleton Loop
        • Exploit the Singleton Loop Model
        • Runtime
        • Beyond Single Loops
          • Evaluation
            • Structure of the Benchmarks
            • Performance Evaluation
            • Precision
              • Conclusions