Taming Undefined Behavior in LLVM
SIGPL 2017 Summer
Juneyoung Lee
Software Foundations Lab (Advisor: Chung-Kil Hur)

1. Undefined Behavior (UB)?
2. UB in LLVM IR
3. Problem of UB in IR & Solution

What is Undefined Behavior?

ISO/IEC 9899:2011, Programming languages – C
  int a[4]; *(a + 4) = 123;     Undefined Behavior!

C ≠ Assembly
• C abstract machine!
  int a[4]; *(a + 4) = 123;
Memory: the array lives in a block l, and a → (l, 0).
The store through a + 4 points outside block l: ?

UB & Compiler
C:           State 1 → State 2 → UB
Asm:         State 1 → State 2 → State 3 → …      (Compile)
Asm (opt.):  State 1 → State 2 → State 3’ → …     (Compile’)
Optimizer can change UB into anything!

UB and Optimization: Eliminating Redundant Load
  int a[4]; int b[4]; int i;
C:    a[0] = 10; b[i] = 20; output(a[0]);
Asm:  a[0] = 10; b[i] = 20; output(10);
With i = 3, b[i] is in bounds. With i = 4, b[i] is out of bounds: UB.

UB and Optimization: Register Promotion
  int a; int b[4]; int i;
C:    a = t;   b[i] = 20; output(a);
Asm:  eax = t; b[i] = 20; output(eax);
With i = 4, b[i] is out of bounds: UB.

Undefined Behavior in C
• Many operations are defined to produce UB
- To support optimizations
- > 200 cases in total!
• Let me introduce two famous cases:
1. Pointer overflow
2. Signed integer overflow

UB and Optimization: Pointer Overflow
  int a[4]; .. a + 4 ..     OK
  int a[4]; .. a + 5 ..     UB
This guarantees that “p + i” never overflows 2^32!

UB and Optimization: Pointer Overflow
  int* p; int a; int b;
C:    output(p + a > p + b);
Asm:  output(a > b);
With p = 0xFFFFFF00, a = 0x100, b = 0:
- C:   p + a = 0x0 (Overflow!), so p + a > p + b is false
- Asm: a > b is true
The two outputs differ only when p + a overflows, which is UB.

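The mismatch on this slide can be reproduced without invoking any real UB by modelling addresses as 32-bit unsigned integers, whose wraparound is well defined in C. This is only a sketch: the helper names are hypothetical, and offsets are treated as raw byte offsets (not scaled by sizeof(int)), as the slide's numbers suggest.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models a 32-bit address space with unsigned arithmetic, where wraparound
   is well defined. Both helper names are hypothetical. */

/* What the unoptimized code computes on a 32-bit machine: p + a > p + b. */
static bool compare_addresses(uint32_t p, int32_t a, int32_t b) {
    uint32_t pa = p + (uint32_t)a;  /* wraps modulo 2^32 */
    uint32_t pb = p + (uint32_t)b;
    return pa > pb;
}

/* What the optimizer emits after cancelling p on both sides: a > b. */
static bool compare_offsets(int32_t a, int32_t b) {
    return a > b;
}
```

Calling compare_addresses(0xFFFFFF00u, 0x100, 0) and compare_offsets(0x100, 0) gives the two different answers shown on the slide, while inputs that do not wrap agree.
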
UB and Optimization: Signed Integer Overflow
  int t = 2147..47; .. t + 1 ..     UB
  int t = -2147..48; .. (-t) ..     UB
1. Signed integer overflow (SIO) is known to be ‘dangerous’.
2. Treating it as UB gives many more optimization opportunities!

Signed Integer Overflow is Dangerous!
• It’s on the “CWE/SANS Top 25 Software Errors” [1].
  “The 2011 CWE/SANS Top 25 Most Dangerous Software Errors is a list of the most widespread and critical errors that can lead to serious vulnerabilities in software.”
• Sanitize your application with..
- Runtime Test: IOC [2] / Static Analysis Tool: cppcheck
• Use unsigned int
[1] http://cwe.mitre.org/top25/
[2] Will Dietz, Peng Li, John Regehr, and Vikram Adve, Understanding Integer Overflow in C/C++, ICSE’12

UB and Optimization: Signed Integer Overflow
  int* p;
C:    int i;   for (i = 0; i <= N; i++) p[i] = 10;
Asm:  int64 i; for (i = 0; i <= N; i++) p[i] = 10;
When i reaches INT32_MAX:
- C (32-bit i):     i++ gives INT32_MIN
- Asm (64-bit i):   i++ gives INT32_MAX+1
The two differ only when i++ overflows, which is UB.

UB and Optimization: Signed Integer Overflow (cont.)
C:    t = -x / -y          Asm:  t = x / y            (x = INT32_MIN, y = 2)
C:    t = (x*c) / c’       Asm:  t = x * (c/c’)       (values on the slide: -1, INT32_MIN)

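To see why the first rewrite needs overflow to be UB, the negation can be modelled with explicit two's-complement wraparound. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and converting an out-of-range unsigned value back to int32_t is implementation-defined in C11 (mainstream compilers wrap, which is assumed here).

```c
#include <stdint.h>

/* Two's-complement negation with wraparound, computed in unsigned arithmetic
   so that this model itself never overflows a signed type. The conversion
   back to int32_t is implementation-defined; wrapping is assumed. */
static int32_t wrapping_neg(int32_t v) {
    return (int32_t)(0u - (uint32_t)v);
}

/* Source expression: t = -x / -y, evaluated with wrapping negation.
   The caller must keep y nonzero and avoid the INT32_MIN / -1 case. */
static int32_t div_of_negations(int32_t x, int32_t y) {
    return wrapping_neg(x) / wrapping_neg(y);
}

/* Optimized expression: t = x / y. */
static int32_t div_plain(int32_t x, int32_t y) {
    return x / y;
}
```

With x = INT32_MIN and y = 2, the wrapped negation leaves x unchanged, so the two functions return results of opposite sign; for ordinary inputs such as (6, 3) they agree.
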
Summary
• UB is the result of an erroneous operation.
• A well-written program should not have UB.
• UB helps the compiler do more optimizations.

Undefined Behavior in LLVM IR

IR in Compiler
C → IR (Optimization!) → Asm

UB & Optimization
IR:              State 1 → State 2 → UB
IR (Optimize):   State 1 → State 2 → State 3 → …
[Diagram: state traces for C, IR, IR, and Asm. The C trace reaches UB after State 1, the IR traces reach UB after State 2 or later, and the Asm trace continues through State 3, State 4, …]

Undefined Behavior in LLVM IR?
UB in C ≠ UB in IR!

UB in C ≠ UB in IR: Peephole Optimization
  int* p; int a; int b;
IR (before):  output(p + a > p + b)
IR (after):   output(a > b)
With p = 0xFFFFFF00, a = 0x100, b = 0: p + a = 0x0 (Overflow!), so the first outputs false and the second outputs true.
C’s UB Model: Pointer Arithmetic Overflow is Undefined Behavior, so the overflowing case is UB.

UB in C ≠ UB in IR: Loop Invariant Code Motion
IR (before):
  ...
  for (i = 0; i < n; ++i) {
    a[i] = p + 0x100
  }
IR (after):
  q = p + 0x100
  for (i = 0; i < n; ++i) {
    a[i] = q
  }
With n = 0 and p = 0xFFFFFF00: the original loop body never runs, but the hoisted q = p + 0x100 is always computed. Overflow!
C’s UB Model: Pointer Arithmetic Overflow is Undefined Behavior, so only the transformed program has UB.

UB in C ≠ UB in IR: Poison Value: A Deferred UB
C’s UB Model: Pointer Arithmetic Overflow is Undefined Behavior
LLVM’s UB Model: Pointer Arithmetic Overflow is A Poison “Value”
IR (before):
  ...
  for (i = 0; i < n; ++i) {
    a[i] = p + 0x100
  }
IR (after):
  q = p + 0x100
  for (i = 0; i < n; ++i) {
    a[i] = q
  }
With n = 0 and p = 0xFFFFFF00: q = p + 0x100 overflows, but the result is poison rather than UB, and the loop never uses it.

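The deferred-UB idea can be sketched as a small model in which pointer addition returns a tagged value instead of failing immediately. Everything here is an assumption made for illustration: the Ptr type and both function names are hypothetical, and "use" is counted loosely as every store of the poison value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A toy pointer value: either a 32-bit address or poison. */
typedef struct {
    bool is_poison;
    uint32_t addr;     /* ignored when is_poison is true */
} Ptr;

/* Pointer addition in the poison model: overflow yields poison, not UB. */
static Ptr ptr_add(uint32_t p, uint32_t offset) {
    Ptr r;
    r.is_poison = offset > UINT32_MAX - p;  /* would wrap past 2^32 */
    r.addr = p + offset;                    /* wraps; ignored if poison */
    return r;
}

/* Hoisted loop: q is computed once, then stored n times.
   Returns how many times a poison value was actually touched. */
static size_t run_hoisted(uint32_t p, size_t n, Ptr *a) {
    Ptr q = ptr_add(p, 0x100);
    size_t poison_uses = 0;
    for (size_t i = 0; i < n; ++i) {
        a[i] = q;
        if (q.is_poison) ++poison_uses;
    }
    return poison_uses;
}
```

With p = 0xFFFFFF00 and n = 0, run_hoisted computes a poison q but reports zero uses, which is why hoisting is acceptable under the poison model.
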
UB in C ≠ UB in IR: Poison Value: A Deferred UB
LLVM’s UB Model: Pointer Arithmetic Overflow is A Poison “Value”
IR (before):  output(p + a > p + b)
IR (after):   output(a > b)
With p = 0xFFFFFF00, a = 0x100, b = 0: p + a = 0x0 (Overflow!) is Poison, and using it in the output raises UB.

Is UB in IR only for C?
• Example: Java
- Type checker: “Function args are either null or dereferenceable.”
- Put the ‘dereferenceable_or_null’ tag on them!
- It’s UB for them to be invalid pointers

Summary
• C’s UB ≠ LLVM IR’s UB.
• The notion of ‘deferred UB’ enables further optimizations.
• UB works for well-typed languages, too.

Problem of UB in LLVM IR & Solution

Taming Undefined Behavior in LLVM
PLDI 2017 Barcelona
Seoul National Univ.: Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-Kil Hur
Azul Systems: Sanjoy Das
Google: David Majnemer
University of Utah: John Regehr
Microsoft Research: Nuno P. Lopes

Problem of Poison
Expression: output((p + a) > (p + b)), with p = 0xFFFFFF00 and a = 0x100
- p + a overflows: poison
- > receives poison: Propagate
- output receives poison: Raise UB
“Poison is Sometimes Too Poisonous”

Problems with LLVM’s UB: Global Value Numbering (GVN)
Before:
  if (x == y) {
    .. use x ..
  }
After:
  if (x == y) {
    .. use y ..
  }
With x = 0 and y = poison, x == y is poison. The original body uses x (0), but the rewritten body uses y (poison), which can raise UB.
LLVM’s UB Model: Branching on poison is ???
For GVN to be valid: Branching on poison is Undefined Behavior. Then both programs already have UB at the branch.

Problems with LLVM’s UB: Loop Unswitching (LU)
Before:
  while (n > 0) {
    if (cond) A
    else B
  }
After:
  if (cond)
    while (n > 0) { A }
  else
    while (n > 0) { B }
With n = 0 and cond = poison: the original never reaches the branch on cond, but the transformed program branches on poison first.
LLVM’s UB Model: Branching on poison is Undefined Behavior, so only the transformed program has UB.

Inconsistency in LLVM
• GVN + LU is inconsistent: GVN needs branching on poison to be UB, while LU needs it not to be.
• We found a miscompilation bug in LLVM due to the inconsistency (LLVM Bugzilla 31652).
- It is being discussed in the community
- No solution has been found yet

Overview
Existing Approaches:
- Values, from more defined to less: Defined values, Undef. values, Poison values, UB
- Can’t Control Poison
- GVN + LU: Inconsistent
- Complex
Our Approach (freeze):
- Values, from more defined to less: Defined values, Poison values, UB
- Can Control Poison
- GVN + LU: Consistent
- Simpler

Key Idea: “Freeze”
• Introduce a new instruction: y = freeze x
• Semantics:
- When x is a defined value: freeze x = x
- When x is a poison value: freeze x = Nondet. Choice of A Defined Value (0, 1, 2, . . .)

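The freeze semantics above can be written down as a toy model. This is a sketch, not LLVM's implementation: the Value type and all function names are hypothetical, and the nondeterministic choice is represented by a caller-supplied parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* A toy IR value: either a defined 32-bit integer or poison. */
typedef struct {
    bool is_poison;
    int32_t bits;      /* meaningful only when is_poison is false */
} Value;

static Value make_defined(int32_t v) { Value r = { false, v }; return r; }
static Value make_poison(void)       { Value r = { true, 0 };  return r; }

/* y = freeze x. A defined x passes through unchanged. A poison x becomes
   some defined value; the nondeterministic choice is modelled by the
   caller-supplied `choice` parameter. */
static Value freeze(Value x, int32_t choice) {
    return x.is_poison ? make_defined(choice) : x;
}

/* Branching on poison is UB in the proposed model; report it through a flag
   instead of misbehaving. Returns the branch direction that was taken. */
static bool branch_on(Value cond, bool *raised_ub) {
    *raised_ub = cond.is_poison;
    return !cond.is_poison && cond.bits != 0;
}
```

Branching directly on make_poison() sets the UB flag, whereas branching on freeze(make_poison(), c) never does, whichever c is picked.
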
Our Solution: Loop Unswitching
Before:
  while (n > 0) {
    if (cond) A
    else B
  }
After:
  if (freeze(cond))
    while (n > 0) { A }
  else
    while (n > 0) { B }
With n = 0 and cond = poison: freeze(cond) is nondeterministically true or false, so the new branch does not raise UB.
Our UB Model: Branching on poison is Undefined Behavior

Summary of Freeze
• Branching on freeze(poison) => Nondet.
- Used for Loop Unswitching
• Branching on poison => UB
- Used for Global Value Numbering
Compilers can control poison!
Freeze can also fix many other UB-related problems.

Further Example: Hoisting Division
Before:
  // bitwise-or
  k = x | 0x1
  while (n > 0)
    use(100 / k)
After:
  // bitwise-or
  k = x | 0x1
  t = 100 / k
  while (n > 0)
    use(t)
With x = poison and n = 0: k is poison, so the hoisted 100 / k divides by poison and raises UB, although the original loop never executes the division.
LLVM does not currently support this hoisting.
With k = freeze(x) | 0x1: freeze(x) is a defined value, so k is non-zero and the hoisted division is safe.
Freeze can make LLVM support it!

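The claim that freeze(x) | 0x1 is always a safe divisor can be checked with a small model. As before, this is only a sketch under assumptions: Value, freeze, and the other names are hypothetical, and the `choice` parameter stands in for the nondeterministic value that freeze picks for poison.

```c
#include <stdbool.h>
#include <stdint.h>

/* A toy IR value: either a defined 32-bit integer or poison. */
typedef struct {
    bool is_poison;
    int32_t bits;      /* meaningful only when is_poison is false */
} Value;

/* freeze: defined values pass through, poison becomes `choice`. */
static Value freeze(Value x, int32_t choice) {
    Value r = { false, x.is_poison ? choice : x.bits };
    return r;
}

/* k = freeze(x) | 0x1. The low bit is always set, so k is never zero. */
static int32_t frozen_divisor(Value x, int32_t choice) {
    return freeze(x, choice).bits | 0x1;
}

/* Hoisted computation: t = 100 / k, evaluated before the loop. */
static int32_t hoisted_quotient(Value x, int32_t choice) {
    return 100 / frozen_divisor(x, choice);
}
```

Whatever value is chosen for a poison x, frozen_divisor returns an odd number, so hoisted_quotient never divides by zero.
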
Implementation
• Target: LLVM 4.0 RC 4 (Mar. 2017)
• Add Freeze instruction to LLVM IR
• Bug Fixes Using Freeze
- Loop Unswitching Optimization
- C Bitfield Translation to LLVM IR
- InstCombine Optimizations
* More details are given in the paper

Experiment Results
• Benchmarks (4.6M LOC):
- SPEC CPU2006
- LLVM Nightly Test
- Large Single File Benchmarks
• Compilation Time: ± 1%
• Compilation Memory Usage: Max + 2%
• Generated Code Size: ± 0.5%
• Execution Time: ± 3%
* More details are given in the paper
“Freeze” Can Fix UB Semantics Without Significant Performance Penalty

Summary
• Modern compilers’ UB models cannot support some textbook optimizations.
• We propose “freeze” to fix such problems.
• Freeze has little impact on performance.