Fundamentals of Program Analysis Dr. Manuel Oriol
Chair of Software Engineering
Fundamentals of Program Analysis
Dr. Manuel Oriol
Topics
Abstract Syntax Tree
Control Flow Graph
Fundamentals of Dataflow Analysis
    Available Expressions Analysis
    Live Variables Analysis
    Very Busy Expressions Analysis
    Reaching Definition Analysis
Program Analysis
A set of techniques for analyzing programs in order to prove properties about them.
Several techniques can be distinguished:
    Data flow analysis
    Constraint-based analysis
    Abstract Interpretation
    Types and effects analysis
    …
Structures
Source Code
Abstract Syntax Trees (AST)
Control-Flow Graphs (CFG)
Object code
Abstract Syntax Tree?
    a := 1
    b := a + 2
    if (b > 3) then
        Result := b
    else
        Result := a
    end

[Figure: abstract syntax tree of the program]
    Program
        :=
            a
            1
        :=
            b
            +
                a
                2
        If… Then… Else… End
            >
                b
                3
            :=
                Result
                b
            :=
                Result
                a
How are ASTs computed? Compilers…
Use a lexical analyzer that gives you the tokens (e.g. lex, flex, gelex…)
Couple it with a parser that groups the tokens and gives the structure (e.g. yacc, bison, geyacc…); generally the structure is an AST
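The two stages can be sketched by hand in Python for a single assignment of the toy language used in these slides. This is only an illustration: the token pattern, the function names and the tuple-based AST are assumptions of the sketch, and it does not use lex or yacc.

```python
import re

# Hypothetical token pattern for the toy language (an assumption of this sketch).
TOKEN = re.compile(r"\s*(:=|[A-Za-z_]\w*|\d+|[-+><()])")
OPERATORS = {"+", "-", ">", "<"}

def tokenize(text):
    """Lexical analysis: split the source text into a flat list of tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_expr(tokens):
    """expr ::= operand [operator operand], returned as a nested tuple."""
    left = tokens.pop(0)
    if tokens and tokens[0] in OPERATORS:
        op = tokens.pop(0)
        return (op, left, tokens.pop(0))
    return left

def parse_assign(tokens):
    """assignment ::= name ':=' expr, returned as the AST node (':=', name, expr)."""
    name = tokens.pop(0)
    if tokens.pop(0) != ":=":
        raise SyntaxError("expected ':='")
    return (":=", name, parse_expr(tokens))

ast = parse_assign(tokenize("b := a + 2"))
print(ast)  # prints (':=', 'b', ('+', 'a', '2'))
```

The nested tuple mirrors the tree drawn for `b := a + 2`: the `:=` node has the children `b` and `+`, and `+` has the children `a` and `2`.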
Advantages and Disadvantages of ASTs
Advantages:
    No more noise (comments, keywords, etc.)
    Not ambiguous
    Exact
Disadvantages:
    Many similar forms: if/switch, for/foreach…
    Expressions may become big, and going through them may become costly.
So???
Control-Flow Graph
    a := 1
    b := a + 2
    if (b > 3) then
        Result := b
    else
        Result := a
    end
    b := a

[Figure: control-flow graph of the program]
    a:=1 → b:=a+2 → b>3
    b>3 → (yes) Result:=b → b:=a
    b>3 → (no) Result:=a → b:=a
How are CFGs computed? Compilers…
Take the AST
Group the statements
Remove the control-flow instructions and keep only jumps and tests
Transform jumps into transitions
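The steps above can be sketched in Python for programs made of simple statements and if/else only. The statement encoding (strings for simple statements, `("if", test, then_block, else_block)` tuples for conditionals) and the function name `build_cfg` are assumptions of this sketch, not a standard format; loops are left out.

```python
def build_cfg(stmts):
    """Return (nodes, edges) for a list of statements.

    nodes is a list of labels; edges is a list of (source, target, tag)
    triples where tag is "yes", "no" or "" for an unconditional transition.
    """
    nodes, edges = [], []

    def add(label, preds):
        # Create a node and connect every pending predecessor to it.
        nodes.append(label)
        node = len(nodes) - 1
        for pred, tag in preds:
            edges.append((pred, node, tag))
        return node

    def walk(block, preds):
        # preds: (node, tag) pairs that flow into the next statement.
        for stmt in block:
            if isinstance(stmt, tuple):
                _, test, then_block, else_block = stmt
                t = add(test, preds)
                # The test is kept as a node; both branches join afterwards.
                preds = walk(then_block, [(t, "yes")]) + walk(else_block, [(t, "no")])
            else:
                preds = [(add(stmt, preds), "")]
        return preds

    walk(stmts, [])
    return nodes, edges

program = ["a:=1", "b:=a+2", ("if", "b>3", ["Result:=b"], ["Result:=a"]), "b:=a"]
nodes, edges = build_cfg(program)
```

For the program of the previous slide this yields six nodes, with the `yes` edge going from `b>3` to `Result:=b` and the `no` edge going to `Result:=a`.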
Control-Flow Graph with blocks…
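    a := 1
    b := a + 2
    if (b > 3) then
        Result := b
    else
        Result := a
        b := a
    end
    b := a

[Figure: control-flow graph of the program, with statements grouped into blocks]
    [a:=1; b:=a+2] → [b>3]
    [b>3] → (yes) [Result:=b] → [b:=a]
    [b>3] → (no) [Result:=a; b:=a] → [b:=a]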
Advantages and Disadvantages
Advantages:
    no more similar forms
    expressions are easy to understand: only statements
Disadvantages:
    introduce temporaries
    less understandable with respect to the original code and the general structure
So what can we do with that?
Example: Prove that there is no call on a Void reference in this program:
    create b.make (10)
    from a := 1 until a > 10 loop
        b.test
        a := a + 1
    end
And what about this program?
    create b.make (10)
    from a := 1 until a > 10 loop
        if (a = 10) then
            b := Void
        else
            b.test
        end
        a := a + 1
    end
Dataflow Analysis
As we saw in the previous examples:
There is a need for an analysis of data manipulations
Traditionally, it leads to optimizations: statements that have no influence may be removed, statements may be extracted from loops, etc.
Based on all paths through programs
It is the analysis of labels par excellence
Technical Framework
The set of labels for a node is what really matters, as it is what we want to prove.
Labels are the first thing to build and describe
The rule to build the set of labels is itself the analysis
There are many ways to describe it, but people usually use set theory and describe it as a transition system over the sets attached to statements:
In(S), Out(S), succ(S), pred(S)
Notations
[Figure: a statement S0 with predecessors S-1 and S'-1 and successors S1 and S'1]
S0: In(S0) = {…}, Out(S0) = {…}
pred(S0) = {S-1, S'-1}
succ(S0) = {S1, S'1}
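As a small illustration, pred and succ can be derived from the edge list of a control-flow graph. The sketch below is in Python; the function name `pred_succ` and the use of strings as statement names are assumptions made for the example.

```python
from collections import defaultdict

def pred_succ(edges):
    """Build the pred and succ maps from a list of (source, target) edges."""
    pred, succ = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)
        pred[dst].add(src)
    return pred, succ

# The graph of the figure: two predecessors and two successors around S0.
edges = [("S-1", "S0"), ("S'-1", "S0"), ("S0", "S1"), ("S0", "S'1")]
pred, succ = pred_succ(edges)
```

In(S) and Out(S) are then two more maps from statements to sets of labels, filled in by the analysis itself.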
Example: Attachment Analysis
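Labels ::= varname=attached | varname=detached | varname=error
$\mathrm{In}(S) = \bigcup_{S_0 \in \mathrm{pred}(S)} \mathrm{Out}(S_0)$
$\mathrm{Out}(S = \text{create varname.creationFeature}) = (\mathrm{In}(S) \setminus \{\text{varname=detached}\}) \cup \{\text{varname=attached}\}$
$\mathrm{Out}(S = \text{varname := Void}) = (\mathrm{In}(S) \setminus \{\text{varname=attached}\}) \cup \{\text{varname=detached}\}$
$\mathrm{Out}(S = \text{varname0 := varname1.FeatureName(…)}) = \begin{cases} \mathrm{In}(S) \cup \{\text{varname1=error}\} & \text{if } (\text{varname1=detached}) \in \mathrm{In}(S) \\ \mathrm{In}(S) \cup \{\text{varname0=attached}\} & \text{otherwise} \end{cases}$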
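These rules can be sketched in Python and run on the two loop programs shown earlier. The sketch simplifies in two ways that are assumptions of the example: calls are procedure calls without an assignment target (so only the error case of the last rule applies), and every other statement leaves the labels unchanged. The statement encoding and the names `attachment_analysis` and `has_error` are hypothetical.

```python
def attachment_analysis(stmts, edges):
    """stmts: node -> (kind, variable), kind in 'create', 'void', 'call', 'other'.
    Returns Out(S) for every node as a set of (variable, state) labels."""
    pred = {n: set() for n in stmts}
    for src, dst in edges:
        pred[dst].add(src)
    out = {n: frozenset() for n in stmts}
    changed = True
    while changed:  # iterate until no Out set changes any more
        changed = False
        for n, (kind, var) in stmts.items():
            in_n = frozenset().union(*(out[p] for p in pred[n]))
            if kind == "create":
                new = (in_n - {(var, "detached")}) | {(var, "attached")}
            elif kind == "void":
                new = (in_n - {(var, "attached")}) | {(var, "detached")}
            elif kind == "call" and (var, "detached") in in_n:
                new = in_n | {(var, "error")}
            else:
                new = in_n
            if new != out[n]:
                out[n], changed = new, True
    return out

def has_error(out):
    return any(state == "error" for labels in out.values() for _, state in labels)

# First program: create b; a := 1; until-test; b.test; a := a + 1 (back to the test).
loop1 = {0: ("create", "b"), 1: ("other", "a"), 2: ("other", None),
         3: ("call", "b"), 4: ("other", "a")}
edges1 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 2)]

# Second program: the loop body is if (a = 10) then b := Void else b.test end.
loop2 = {0: ("create", "b"), 1: ("other", "a"), 2: ("other", None), 3: ("other", None),
         4: ("void", "b"), 5: ("call", "b"), 6: ("other", "a")}
edges2 = [(0, 1), (1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6), (6, 2)]

out1 = attachment_analysis(loop1, edges1)
out2 = attachment_analysis(loop2, edges2)
```

With these rules no error label appears for the first program, while the second gets `b=error` at `b.test`. At run time the second loop exits right after `b := Void`, so the label is a false alarm: the analysis follows every path of the graph and ignores the values tested by the conditions.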
Applications of the algorithm
    create b.make (10)
    from a := 1 until a > 10 loop
        b.test
        a := a + 1
    end

    create b.make (10)
    from a := 1 until a > 10 loop
        if (a = 10) then
            b := Void
        else
            b.test
        end
        a := a + 1
    end
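Available Expressions Analysis
An expression is available at a point p if its value has definitely been computed on every path to p and the last value computed for it is still correct.
Labels ::= {expression}
$\mathrm{In}(S) = \bigcap_{S_0 \in \mathrm{pred}(S)} \mathrm{Out}(S_0)$
$\mathrm{Out}(S = \text{varname := expression0}) = (\mathrm{In}(S) \cup \mathrm{Gen}(S)) \setminus \mathrm{Kill}(S)$
$\mathrm{Gen}(S) = \{\text{expression0}\}$, $\mathrm{Kill}(S) = \{e \mid \text{varname} \in e\}$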
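A minimal Python sketch of this analysis follows, under stated assumptions: each statement is encoded as (assigned variable or None, expression text, variables read), tests generate their expression and kill nothing, the entry node starts with an empty In set, and every other node has at least one predecessor. The function name and the encoding are hypothetical.

```python
def available_expressions(stmts, edges, entry):
    """Return In(S) for every node as a set of expression texts."""
    universe = frozenset((text, frozenset(vs)) for _, text, vs in stmts.values())
    pred = {n: [] for n in stmts}
    for src, dst in edges:
        pred[dst].append(src)
    out = {n: universe for n in stmts}  # optimistic start for a "must" analysis
    avail_in = {}
    changed = True
    while changed:
        changed = False
        for n, (target, text, vs) in stmts.items():
            if n == entry:
                avail_in[n] = frozenset()
            else:
                avail_in[n] = frozenset.intersection(*(out[p] for p in pred[n]))
            gen = {(text, frozenset(vs))}
            # (In ∪ Gen) minus every expression that reads the assigned variable.
            new_out = frozenset(e for e in avail_in[n] | gen if target not in e[1])
            if new_out != out[n]:
                out[n], changed = new_out, True
    return {n: {text for text, _ in facts} for n, facts in avail_in.items()}

program = {
    0: ("a", "a - b", {"a", "b"}),
    1: ("b", "a + b", {"a", "b"}),
    2: (None, "b > 3", {"b"}),
    3: ("Result", "a - b", {"a", "b"}),
    4: ("Result", "a + b", {"a", "b"}),
}
edges = [(0, 1), (1, 2), (2, 3), (2, 4)]
avail = available_expressions(program, edges, entry=0)
```

On this program neither `a - b` nor `a + b` is available when the branches are entered, because each of the first two assignments overwrites a variable of the expression it has just computed.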
Application of the algorithm
    a := a - b
    b := a + b
    if (b > 3) then
        Result := a - b
    else
        Result := a + b
    end
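Live Variables Analysis
A variable v is live at a point p if it may be used in the future before it is overwritten.
Labels ::= {varname}
$\mathrm{Out}(S) = \bigcup_{S_0 \in \mathrm{succ}(S)} \mathrm{In}(S_0)$
$\mathrm{In}(S = \text{varname := expression}) = (\mathrm{Out}(S) \setminus \mathrm{Kill}(S)) \cup \mathrm{Gen}(S)$
$\mathrm{Kill}(S) = \{\text{varname}\}$, $\mathrm{Gen}(S) = \{\text{vname} \mid \text{vname} \in \text{expression}\}$
$\mathrm{In}(S \mid \text{expression} \in S) = \mathrm{Out}(S) \cup \{\text{vname} \mid \text{vname} \in \text{expression}\}$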
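The equations can be sketched in Python as a backward iteration. The encoding (node -> (defined variable or None, set of used variables)) and the assumption that nothing is live at the exit of the program are choices made for this example only.

```python
def live_variables(stmts, edges):
    """Return (live_in, live_out), two maps from node to a set of variable names."""
    succ = {n: [] for n in stmts}
    for src, dst in edges:
        succ[src].append(dst)
    live_in = {n: set() for n in stmts}
    live_out = {n: set() for n in stmts}
    changed = True
    while changed:
        changed = False
        for n in reversed(list(stmts)):  # visiting nodes backwards converges faster
            defined, used = stmts[n]
            live_out[n] = set().union(*(live_in[s] for s in succ[n]))
            new_in = (live_out[n] - {defined}) | used
            if new_in != live_in[n]:
                live_in[n], changed = new_in, True
    return live_in, live_out

# a := 1; b := a + 2; if (b > 3) then Result := b else Result := a; b := a end; b := a
program = {
    0: ("a", set()),
    1: ("b", {"a"}),
    2: (None, {"b"}),
    3: ("Result", {"b"}),
    4: ("Result", {"a"}),
    5: ("b", {"a"}),
    6: ("b", {"a"}),
}
edges = [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]
live_in, live_out = live_variables(program, edges)
```

Here `b` is not in `live_out[5]`: the value assigned by `b := a` in the else branch is overwritten before any use, which is the kind of statement an optimizer may remove.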
Application of the algorithm
    a := 1
    b := a + 2
    if (b > 3) then
        Result := b
    else
        Result := a
        b := a
    end
    b := a
Very Busy Expressions Analysis
An expression e is very busy at a point p if every future execution path evaluates e before the value of e is changed.
Labels ::= {expressions}
$\mathrm{Out}(S) = \bigcap_{S_0 \in \mathrm{succ}(S)} \mathrm{In}(S_0)$
$\mathrm{In}(S = \text{varname := …}) = \mathrm{Out}(S) \setminus \{e_0 \in \mathrm{Out}(S) \mid \text{varname} \in e_0\}$ (apply first)
$\mathrm{In}(S \ni \text{expression0}) = \mathrm{Out}(S) \cup \{\text{expression0}\}$ (apply last)
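A possible Python sketch of these two rules, applied in the stated order (kill first, then add the statement's own expression). The encoding (assigned variable or None, expression text or None for a constant, variables read) and the empty Out set at exit nodes are assumptions of the example.

```python
def very_busy_expressions(stmts, edges):
    """Return (vb_in, vb_out) as maps from node to a set of expression texts."""
    universe = frozenset((t, frozenset(v)) for _, t, v in stmts.values() if t)
    succ = {n: [] for n in stmts}
    for src, dst in edges:
        succ[src].append(dst)
    vb_in = {n: universe for n in stmts}  # optimistic start for a "must" analysis
    vb_out = {}
    changed = True
    while changed:
        changed = False
        for n in reversed(list(stmts)):
            target, text, vs = stmts[n]
            if succ[n]:
                vb_out[n] = frozenset.intersection(*(vb_in[s] for s in succ[n]))
            else:
                vb_out[n] = frozenset()  # nothing is evaluated after the exit
            kept = {e for e in vb_out[n] if target not in e[1]}  # apply first
            gen = {(text, frozenset(vs))} if text else set()     # apply last
            new_in = frozenset(kept | gen)
            if new_in != vb_in[n]:
                vb_in[n], changed = new_in, True
    texts = lambda table: {n: {t for t, _ in facts} for n, facts in table.items()}
    return texts(vb_in), texts(vb_out)

# a := 2; if (b > 3) then Result := a + b else Result := a + b end
program = {
    0: ("a", None, set()),
    1: (None, "b > 3", {"b"}),
    2: ("Result", "a + b", {"a", "b"}),
    3: ("Result", "a + b", {"a", "b"}),
}
edges = [(0, 1), (1, 2), (1, 3)]
vb_in, vb_out = very_busy_expressions(program, edges)
```

In this example `a + b` is very busy right after `a := 2`, since both branches evaluate it; it is not very busy before that assignment because `a` changes first.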
Application of the algorithm
    a := 2
    if (b > 3) then
        Result := a + b
    else
        Result := a + b
    end
Reaching Definition Analysis (also def-use)
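A definition (an assignment) of v reaches a point in a program if there is a path from the definition to that point with no other definition of v in between.
Labels ::= {varname:statementNumber}
$\mathrm{In}(S) = \bigcup_{S_0 \in \mathrm{pred}(S)} \mathrm{Out}(S_0)$
$\mathrm{Out}(S = \text{varname := …}) = (\mathrm{In}(S) \setminus \{\text{varname:statementNumber0} \in \mathrm{In}(S)\}) \cup \{\text{varname:statementNumber}(S)\}$
$\mathrm{Out}(S) = \mathrm{In}(S)$ otherwise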
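The same rules as a short Python sketch. Labels are (variable, statement number) pairs; the statement numbering from 1 to 7 and the function name are assumptions made for the example.

```python
def reaching_definitions(stmts, edges):
    """stmts: statement number -> assigned variable (None for a test).
    Returns (rd_in, rd_out), maps from statement number to a set of labels."""
    pred = {n: [] for n in stmts}
    for src, dst in edges:
        pred[dst].append(src)
    rd_in = {n: set() for n in stmts}
    rd_out = {n: set() for n in stmts}
    changed = True
    while changed:
        changed = False
        for n, var in stmts.items():
            rd_in[n] = set().union(*(rd_out[p] for p in pred[n]))
            if var is None:
                new_out = set(rd_in[n])  # Out(S) = In(S) otherwise
            else:
                # Drop older definitions of var, then add this statement's own.
                new_out = {(v, k) for v, k in rd_in[n] if v != var} | {(var, n)}
            if new_out != rd_out[n]:
                rd_out[n], changed = new_out, True
    return rd_in, rd_out

# 1: a := 1   2: b := a + 2   3: b > 3   4: Result := b
# 5: Result := a   6: b := a (else branch)   7: b := a (after end)
program = {1: "a", 2: "b", 3: None, 4: "Result", 5: "Result", 6: "b", 7: "b"}
edges = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 6), (4, 7), (6, 7)]
rd_in, rd_out = reaching_definitions(program, edges)
```

Two definitions of `b` (statements 2 and 6) reach the final statement, one per branch, and both are replaced by `b:7` after it.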
Application of the algorithm
    a := 1
    b := a + 2
    if (b > 3) then
        Result := b
    else
        Result := a
        b := a
    end
    b := a
Types of analysis…
Forward Analysis: the analysis goes in the same direction as the edges
Backward Analysis: the analysis goes in the opposite direction to the edges
Must analysis: all branches are taken into account
May analysis: one branch is enough
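The two choices can be made explicit as parameters of one generic solver. In the four analyses of these slides, Available Expressions is forward/must, Reaching Definition is forward/may, Live Variables is backward/may and Very Busy Expressions is backward/must. The Python sketch below is one possible formulation; the function `solve`, its parameters and the two small examples are assumptions, not part of the original material.

```python
def solve(nodes, edges, transfer, forward=True, may=True, universe=frozenset()):
    """Round-robin solver. Returns (before, after): the facts where the flow
    arrives at a node and where it leaves it (In/Out when forward, Out/In when backward)."""
    if not forward:
        edges = [(dst, src) for src, dst in edges]  # backward: follow reversed edges
    sources = {n: [s for s, d in edges if d == n] for n in nodes}
    start = frozenset() if may else frozenset(universe)
    after = {n: start for n in nodes}
    before = {}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            facts = [after[s] for s in sources[n]]
            if not facts:
                before[n] = frozenset()                     # boundary node
            elif may:
                before[n] = frozenset().union(*facts)       # one branch is enough
            else:
                before[n] = frozenset.intersection(*facts)  # all branches must agree
            new = frozenset(transfer(n, before[n]))
            if new != after[n]:
                after[n], changed = new, True
    return before, after

# Backward/may: live variables on  a := 1; b := a + 2; Result := b
defs = {0: "a", 1: "b", 2: "Result"}
uses = {0: set(), 1: {"a"}, 2: {"b"}}
live_out, live_in = solve([0, 1, 2], [(0, 1), (1, 2)],
                          lambda n, facts: (facts - {defs[n]}) | uses[n],
                          forward=False, may=True)

# Forward/must: a diamond where one branch generates x and the other x and y.
gen = {0: set(), 1: {"x"}, 2: {"x", "y"}, 3: set()}
must_in, must_out = solve([0, 1, 2, 3], [(0, 1), (0, 2), (1, 3), (2, 3)],
                          lambda n, facts: facts | gen[n],
                          forward=True, may=False, universe={"x", "y"})
```

Only `x` survives at the join of the diamond because the must analysis intersects the two branches; a may analysis would keep both `x` and `y`.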
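Termination of the Dataflow Analyses???
Label sets ordered by inclusion generally form a lattice; when the set of possible labels is finite, this lattice has finite height
Most transfer functions are monotonic on these lattices, so iterating them reaches a fixed point and the analysis terminates
It is however necessary to verify that they are!!!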
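For a small, finite set of labels the monotonicity of a transfer function can be checked by brute force. The Python sketch below does this over all subsets; the helper names and the two sample functions are hypothetical, and the check is only practical for a handful of labels.

```python
from itertools import combinations

def powerset(labels):
    """All subsets of a finite label set, as frozensets."""
    items = sorted(labels)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_monotonic(transfer, labels):
    """Check that x ⊆ y implies transfer(x) ⊆ transfer(y) for all subsets x, y."""
    subsets = powerset(labels)
    return all(transfer(x) <= transfer(y)
               for x in subsets for y in subsets if x <= y)

def gen_kill(facts):
    # A transfer function of the gen/kill shape used by the four analyses.
    return (facts - {"b"}) | {"a"}

def toggle(facts):
    # Adds "a" only when it is absent: a larger input can give a smaller output.
    return frozenset({"a"}) - facts

labels = {"a", "b"}
```

Functions of the gen/kill shape pass the check, whereas `toggle` fails it, and iterating `toggle` from the empty set alternates between `{}` and `{"a"}` without ever stabilizing.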
Exercise 1
Find applications for each of the analyses we previously described:
Available Expressions Analysis
Live Variables Analysis
Very Busy Expressions Analysis
Reaching Definition Analysis
Exercise 2: Password leak
How to prove statically that a password is not going to be passed from its original definition to any output?
Setting:
    PASSWORD is a class
    Specify the interface
    Specify the analysis and its limitations
Exercise 3: Access rights for files
How to verify that a file can only be accessed by a thread that acquired the right to do it?
Setting:
    FILE is a class that has a limited number of features
    SECURITY_MANAGER grants accesses
    Specify the analysis and its limitations
Conclusions
Abstract Syntax Tree
Control Flow Graph
Fundamentals of Dataflow Analysis
    Available Expressions Analysis
    Live Variables Analysis
    Very Busy Expressions Analysis
    Reaching Definition Analysis