TIP Language and Type Analysis - USTC
TIP Language and Type Analysis
Yu Zhang
Course web site: http://staff.ustc.edu.cn/~yuzhang/pldpa
Type Analysis and Unification 1
Resources
• Static Program Analysis
- http://cs.au.dk/~amoeller/
- TIPC: implemented in C++17
- tipg4: implemented using ANTLR4
Anders Møller
Questions about Programs
• Does the program terminate on all inputs?
• How large can the heap/stack frame become during
execution?
• Can sensitive information leak to non-trusted users?
• Can non-trusted users affect sensitive information?
• Data races?
• SQL injections?
• …
Note: SQL injection tricks a server into executing malicious SQL commands by inserting SQL into, for example, web form submissions.
Program Points
Any point in the program
= any value of the PC
Invariants:
A property holds at a program point if it holds in any such
state for any execution with any input
Questions about Program Points
• Will the value of x be read in the future?
• Is the variable x initialized before it is read?
• What is a lower and upper bound on the value of
the integer variable x?
• Can the pointer p be null?
• Which variables can p point to?
• Do p and q point to disjoint structures in the heap?
• …
Why are the Answers Interesting?
• Increase efficiency
- Resource usage
- Optimization
• Ensure correctness
- Verify behavior
- Catch bugs early
• Support program understanding
• Enable refactorings
Programs that reason about programs
• Soundness: don’t miss any errors
• Completeness: don’t raise false alarms
• Termination: always give an answer
Rice’s theorem, 1953
• H.G. Rice: Classes of recursively enumerable
sets and their decision problems
• Rice’s theorem: Any nontrivial property of the behavior of
programs in a Turing-complete language is undecidable!
• All nontrivial properties of recursively enumerable languages are undecidable
- Trivial property: either true for all programs or false for all programs
- Nontrivial property: any property that is not trivial
Approximation
• Approximate answers may be decidable!
- Output yes/no => output yes/no/unknown
• The approximation must be conservative
• More subtle approximations if not only yes/no
- E.g. memory usage, pointer targets
False positives and false negatives
• False positives (false alarms): prevented by completeness
• False negatives (missed errors): prevented by soundness
The Engineering Challenge
• A correct but trivial approximation algorithm may
just give the useless answer every time
• The engineering challenge is to give the useful
answer often enough to fuel the client application
• … and to do so within reasonable time and space
• Hard (but fun) part of static analysis
A Constraint-based Approach
• Conceptually separates the analysis specification
from algorithmic aspects and implementation
details
Challenging Features in Modern PLs
• Higher-order functions
• Mutable records or objects, arrays
• Integer or floating-point computations
• Dynamic dispatching
• Inheritance
• Exceptions
• Reflection
• …
TIP Language
TIP: Tiny Imperative Programming language
TIP and its Implementation
• TIP language
- Minimal C-style syntax
- Enough features to make static analysis challenging
and fun
• Implementation
- Scala: https://github.com/cs-au-dk/TIP/
- C++ 17: https://github.com/matthewbdwyer/tipc
Expressions in TIP
Statements in TIP
• In conditions, 0 is false, all other values are true
• The output statement writes an integer value to
the output stream
Functions in TIP
• The optional var block declares a collection of
uninitialized variables
• Function calls are an extra kind of expression
Pointers
• No pointer arithmetic
Records
• Records are passed by value (like structs in C)
• For simplicity, values of record fields cannot be
records
Functions as Values
• Functions are first-class values
• The name of a function is like a variable that
refers to that function
• Generalized function calls
• Function values suffice to illustrate the main
challenges with methods (in OO languages) and
higher-order functions (in functional languages)
Programs
• A program is a collection of functions
• The function named main initiates execution
- Its arguments are taken from the input stream
- Its result is placed on the output stream
• We assume that all declared identifiers are unique
TIP Examples
• Recursive factorial function
• Iterative factorial function
Control flow graphs
• Iterative factorial function
Normalization
• Normalization: flatten nested expressions, using
fresh variables
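A minimal Python sketch of this flattening step. It assumes expressions are encoded as nested tuples such as ('+', left, right), with strings for variables and integers for constants. The tuple encoding, the function name normalize, and the fresh names t1, t2, ... are illustrative assumptions, not the TIP implementation.

```python
from itertools import count


def normalize(expr, fresh=None, out=None):
    """Flatten a nested binary expression into single-operator assignments.

    Returns (atom, assignments), where atom is a variable or constant
    that holds the value of expr once the assignments have run.
    """
    if fresh is None:
        fresh = count(1)
    if out is None:
        out = []
    if not isinstance(expr, tuple):       # variable name or integer constant
        return expr, out
    op, left, right = expr
    l_atom, _ = normalize(left, fresh, out)
    r_atom, _ = normalize(right, fresh, out)
    tmp = f"t{next(fresh)}"               # fresh variable (hypothetical naming scheme)
    out.append((tmp, op, l_atom, r_atom))
    return tmp, out


# (a * b) + (c - 1) becomes three assignments with one operator each
atom, stmts = normalize(('+', ('*', 'a', 'b'), ('-', 'c', 1)))
for target, op, l, r in stmts:
    print(f"{target} = {l} {op} {r}")
```

After normalization every operator is applied only to variables or constants, which keeps later analyses simple because each statement performs one step.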
Type analysis and unification
Type Errors
• Reasonable restrictions on operations:
- Arithmetic operators apply only to integers
- Comparisons apply only to like values
- Only integers can be input and output
- Conditions must be integers
- Only functions can be called
- The * operator only applies to pointers
- Field lookup can only be performed on records
- The fields being accessed are guaranteed to be present
• Violations result in runtime errors
• No type annotations in TIP
Type Checking
• Can type errors occur during runtime?
- undecidable
• Use conservative approximation
- A program is typable if it satisfies some type constraints
- These are systematically derived from the syntax tree
- If typable, then no runtime errors occur
- But some programs will be unfairly rejected (slack)
[Figure: the typable programs form a subset of the programs with no type errors; the programs in between are the slack]
Challenges
• Fighting slack
- Make the type checker a
bit more clever
- An eternal struggle
- And a great source of
publications
• The type checker may be
unsound
• Ex. covariant arrays in Java
- Covariant arrays: if B is a subclass of A, the following code is allowed in Java: A[ ] a=new B[ ];
- The mapping from a class to its array type preserves the original inheritance relationship
Types
• Types describe the possible values
• These describe integers, pointers, functions, and
records
• Types are terms generated by this grammar
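One way to represent such type terms in code is sketched below. It assumes one Python class per kind of type named above (integers, pointers, functions, records) plus type variables. The class and field names are illustrative assumptions and do not come from the TIP implementations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntType:
    def __str__(self):
        return "int"


@dataclass(frozen=True)
class TypeVar:
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class PointerType:
    target: object                      # the type being pointed to

    def __str__(self):
        return f"↑{self.target}"


@dataclass(frozen=True)
class FunctionType:
    params: Tuple[object, ...]          # parameter types
    result: object                      # return type

    def __str__(self):
        return f"({', '.join(map(str, self.params))}) → {self.result}"


@dataclass(frozen=True)
class RecordType:
    fields: Tuple[Tuple[str, object], ...]   # (field name, field type) pairs

    def __str__(self):
        return "{" + ", ".join(f"{n}: {t}" for n, t in self.fields) + "}"


# A function taking an int and a pointer to some unknown type, returning int
f = FunctionType((IntType(), PointerType(TypeVar("t"))), IntType())
print(f)
```

Because every type is a constructor applied to sub-terms, type constraints become equalities between terms, which is what makes unification applicable.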
Type constraints
Generating constraints
Generating constraints
Polymorphic types
Exercise
• Generate and solve the constraints
• Then try with y = alloc 8 replaced by y = 42
Generating constraints
• This is the idea, but not directly expressible in TIP
types
Generating constraints
• Exercise: Field write statements?
General Terms
Unification
• An equality between two terms with variables
- k(X,b,Y) = k(f(Y,Z), Z, d(Z))
• A solution (a unifier) is an assignment from
variables to terms that makes both sides equal
- X = f(d(b),b)
- Y = d(b)
- Z = b
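The example above can be reproduced with a small unifier. This is a sketch of the textbook recursive algorithm with an occurs check, not the linear-time algorithm discussed later. It assumes variables are plain strings ("X") and constructor terms are tuples whose first element is the constructor name, so the constant b is ("b",). The function names are illustrative.

```python
def is_var(t):
    return isinstance(t, str)            # assumption: variables are plain strings


def walk(t, subst):
    """Follow variable bindings until a non-variable or an unbound variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """Does variable v occur inside term t (under the current bindings)?"""
    t = walk(t, subst)
    if is_var(t):
        return t == v
    return any(occurs(v, arg, subst) for arg in t[1:])


def unify(s, t, subst=None):
    """Return a substitution (dict) that makes s and t equal, or None."""
    subst = {} if subst is None else subst
    s, t = walk(s, subst), walk(t, subst)
    if is_var(s):
        if is_var(t) and s == t:
            return subst
        if occurs(s, t, subst):
            return None                  # would require an infinite term
        subst[s] = t
        return subst
    if is_var(t):
        return unify(t, s, subst)
    if s[0] != t[0] or len(s) != len(t):
        return None                      # constructor or arity mismatch
    for a, b in zip(s[1:], t[1:]):
        if unify(a, b, subst) is None:
            return None
    return subst


def resolve(t, subst):
    """Apply the substitution exhaustively to t."""
    t = walk(t, subst)
    if is_var(t):
        return t
    return (t[0],) + tuple(resolve(a, subst) for a in t[1:])


def show(t):
    if is_var(t):
        return t
    return t[0] if len(t) == 1 else f"{t[0]}({','.join(show(a) for a in t[1:])})"


lhs = ("k", "X", ("b",), "Y")
rhs = ("k", ("f", "Y", "Z"), "Z", ("d", "Z"))
subst = unify(lhs, rhs)
for v in ("X", "Y", "Z"):
    print(v, "=", show(resolve(v, subst)))
```

Applying the resulting substitution to both sides yields the same term, which is exactly what it means for the assignment to be a unifier.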
Unification errors
• Constructor error
- d(X) = e(X)
• Arity error
- a = a(X)
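The two kinds of failure can be told apart by comparing only the top-level constructors. The helper below is a hypothetical illustration; it assumes constructor terms are tuples whose first element is the constructor name, so the constant a is ("a",).

```python
def clash(s, t):
    """Classify why two constructor terms cannot be unified at the top level.

    Returns None when the constructors and arities agree, in which case
    unification would continue with the arguments.
    """
    if s[0] != t[0]:
        return "constructor error"
    if len(s) != len(t):
        return "arity error"
    return None


print(clash(("d", "X"), ("e", "X")))   # d(X) = e(X)
print(clash(("a",), ("a", "X")))       # a = a(X)
```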
Linear unification algorithm
• 1978, by Paterson and Wegman
• In time O(n)
- Finds a most general unifier
- Or decides that none exists
• Can be used as a back-end for type checking
• … but only for finite terms
Recursive data structures
[[p]] = [[alloc null]] = ↑[[null]] = ↑↑t
[[p]] = ↑[[p]] = ↑↑[[p]]
Solution: [[p]] = t, where t = ↑t
Regular terms
• Infinite but (eventually) repeating
- e(e(e(e(e(e(…))))))
- d(a, d(a, d(a,…)))
- f(f(f(f(…), f(…)), f(f(…), f(…))), f(f(f(…), f(…)), f(f(…),
f(…))))
• Only finitely many different subtrees
• A non-regular term
- f(a,f(d(a), f(d(d(a)), f(d(d(d(a))),…))))
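A regular term can be stored as a finite graph with cycles, which is one way to see why it has only finitely many different subtrees. The sketch below assumes a hypothetical Node class; each reachable node is the root of one subterm.

```python
class Node:
    """A term node: a constructor name plus child nodes (cycles allowed)."""

    def __init__(self, ctor, children=()):
        self.ctor = ctor
        self.children = list(children)


def reachable(root):
    """Return all nodes reachable from root, each visited once."""
    seen = {}
    stack = [root]
    while stack:
        n = stack.pop()
        if id(n) in seen:
            continue
        seen[id(n)] = n
        stack.extend(n.children)
    return list(seen.values())


# e(e(e(...))): a single node whose only child is itself
e = Node("e")
e.children = [e]

# d(a, d(a, d(a, ...))): a 'd' node whose second child is itself
a = Node("a")
d = Node("d")
d.children = [a, d]

print(len(reachable(e)), len(reachable(d)))
```

The non-regular term above has no such finite representation, because each nesting level introduces a new subterm d(…d(a)…) that differs from all earlier ones.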
http://users-cs.au.dk/amoeller/spa/
3.3 Solving Constraints with Unification
Regular unification
• 1976, Huet
• Use a union-find algorithm to solve the
unification problem for regular terms in O(n·A(n))
• A(n) is the inverse Ackermann function
- Smallest k such that n < Ack(k,k)
- This is never bigger than 5 for any practical value of n
• See the TIP implementation tipc
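To see why A(n) stays tiny, the definition above can be evaluated directly. The sketch assumes Ack is the standard two-argument Ackermann–Péter function, which the slide does not spell out; the function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ack(m, n):
    """Two-argument Ackermann-Péter function (assumed definition of Ack)."""
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))


def inverse_ack(n):
    """Smallest k such that n < Ack(k, k)."""
    for k in range(4):        # Ack(0,0), ..., Ack(3,3) are 1, 3, 7, 61
        if n < ack(k, k):
            return k
    # Ack(4,4) is far too large to compute or store, so every n that a
    # program can actually handle is smaller than it.
    return 4


for n in (0, 2, 6, 60, 61, 10**100):
    print(n, inverse_ack(n))
```

Under this definition the values of Ack(k,k) jump from 61 at k = 3 to an astronomically large number at k = 4, so the factor A(n) behaves like a small constant in practice.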
Union-Find
• Add a new node x that initially is its own parent
• Find the canonical representative of x by traversing the path to the root, performing path compression on the way
• Find the canonical representatives of x and y, and make one the parent of the other unless they are already equivalent
https://github.com/matthewbdwyer/tipc/blob/main/src/semantic/types/solver/UnionFind.cpp
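The three operations described above can be sketched in a few lines of Python. This is an illustrative version, not the code in UnionFind.cpp; the class and method names are assumptions, and it omits union by rank, which the near-linear bound normally also relies on.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        """Add a new node x that initially is its own parent."""
        self.parent.setdefault(x, x)

    def find(self, x):
        """Return the canonical representative of x, compressing the path."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the classes of x and y unless they are already equivalent."""
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry


uf = UnionFind()
for name in ("a", "b", "c", "d"):
    uf.make_set(name)
uf.union("a", "b")
uf.union("c", "d")
print(uf.find("a") == uf.find("b"), uf.find("a") == uf.find("c"))
```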
Union-Find (simplified)
Implementation Strategy
• Representation of the different kinds of types
(including type variables)
• Map from AST nodes to type variables
• Union-Find
• Traverse AST, generate constraints, unify
- Reply type error if unification fails
- When unifying a type variable with e.g. a function type, it is
useful to pick the function type as representation
- For outputting solution, assign names to type variables (that
are roots), and be careful about recursive types
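The unification part of this strategy can be sketched as follows. The code assumes hypothetical Var and Cons classes for type variables and constructed types, keeps the union-find forest in a module-level dictionary, and picks the constructed type as representative when it meets a type variable. It leaves out the AST traversal and constraint generation, and it is not the tipc code.

```python
class Var:
    """A type variable; the name is only used for display."""

    def __init__(self, name):
        self.name = name


class Cons:
    """A constructed type such as int, a pointer (↑) or a function type."""

    def __init__(self, ctor, *args):
        self.ctor, self.args = ctor, args


parent = {}                                # union-find forest over type terms


def find(t):
    parent.setdefault(t, t)
    while parent[t] is not t:
        parent[t] = parent[parent[t]]      # path halving
        t = parent[t]
    return t


def unify(a, b):
    ra, rb = find(a), find(b)
    if ra is rb:
        return
    if isinstance(ra, Var):
        parent[ra] = rb                    # the non-variable becomes representative
    elif isinstance(rb, Var):
        parent[rb] = ra
    elif ra.ctor == rb.ctor and len(ra.args) == len(rb.args):
        parent[ra] = rb                    # merge first so cyclic terms terminate
        for x, y in zip(ra.args, rb.args):
            unify(x, y)
    else:
        raise TypeError(f"cannot unify {ra.ctor} with {rb.ctor}")


# Constraints in the style of the recursive data structure example
p, an, nul, t = Var("[[p]]"), Var("[[alloc null]]"), Var("[[null]]"), Var("t")
unify(p, an)
unify(an, Cons("↑", nul))
unify(nul, Cons("↑", t))
unify(p, Cons("↑", p))

rep = find(p)
print(rep.ctor, find(rep.args[0]) is rep)
```

The representative of [[p]] is a pointer type whose target has the same representative, that is, a recursive type. A printer that naively expands representatives would loop on such a type, which is why the output step needs the care mentioned above.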
The Complicated Function
Solutions
Recursive types
Infinitely many solutions
• Polymorphic function
(which is not expressible in TIP type language)
Recursive and polymorphic types
Slack – let-polymorphism
Slack – let-polymorphism
Slack – flow-insensitivity
Other programming errors