SAFECode

SAFECode: Enforcing Alias Analysis for Weakly Typed Languages

Dinakar Dhurjati

University of Illinois at Urbana-Champaign

Joint work with Sumant Kowshik, Vikram Adve


Weakly Typed Languages (C/C++)

• Weak semantic guarantees
  – Undetected array bounds errors, dangling pointer errors, type cast errors, uninitialized pointers, etc.

Memory safety violations

Any static analysis is suspect

Widely Ignored


Static Analysis Tools

[Figure: a C program goes to a normal compiler; software tools (e.g. ESP, BLAST) take the program and a property, rely on the core analyses (alias analysis, call graph, type information), and answer yes or no]

Memory errors invalidate core analyses


Why not use safe languages?

• Large body of legacy applications in C/C++

• Porting is not easy
  – Automatic memory management or GC
  – Wrappers for library calls because of metadata on pointers

Java, C#, safe dialects of C (e.g. CCured, Cyclone)


Our Solution: SAFECode

Not a safe language : tolerates errors

• Completely automatic, no wrappers, no GC

• Works for nearly all C programs

• Low overhead (less than 30% in our experiments)

• Provides sound analysis platform
  – Sound operational semantics for C based on core analyses

• Masks dangling pointer, array bounds errors

• Ensures memory safety (defined later)


SAFECode as Analysis Platform

[Figure: the C program goes through SAFECode, which emits a C program with checks for the normal compiler; software verification tools (e.g. ESP, BLAST) take a property, rely on the core analyses (alias analysis, call graph, type information), and answer yes or no]

SAFECode enforces core analyses, memory safety


Outline

• Motivation & Overview

• Background

• Approach

• Formalization

• Evaluation

• Summary


Background - Alias Analysis

A static summary of memory objects and their connectivity

Restriction: flow-insensitive, unification based

P = malloc(2 * sizeof(int));
P[i] = ….
struct BigT *Q = (struct BigT *)P;
Q->field8 = …

[Figure: P and Q point to the same node, marked TU with flags S, A, which has an outgoing edge labeled field]

struct List* head = makeList(20);

[Figure: head points to a node struct List (TK), flag H, with fields next and val]

TK : Type Known, TU : Type Unknown


Background - Automatic Pool Allocation (APA) [LattnerAdve:PLDI05]

• Each node instance uses separate pool

• Pool is destroyed if not accessible

Partitions heap into pools based on alias analysis

[Figure: two List nodes (flag H, fields next and val), one in Pool 1 and one in Pool 2; pointer labels head, x, y]
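The two bullets above can be illustrated with a minimal sketch in C. It is not the APA runtime: the names Pool, pool_init, pool_alloc, pool_destroy, pool_contains, and make_list are hypothetical, and the pool is assumed to be a single fixed-size slab. It only shows the idea that each list instance draws its nodes from its own pool, and that the whole pool is released at once when the list is no longer reachable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool: a bump allocator over one malloc'ed slab. */
typedef struct Pool {
    char  *slab;      /* backing storage owned by the pool */
    size_t elem_size; /* size of every object in this pool */
    size_t capacity;  /* number of objects the slab can hold */
    size_t used;      /* number of objects handed out so far */
} Pool;

typedef struct List {
    struct List *next;
    int          val;
} List;

/* Returns 1 on success, 0 if the slab could not be allocated. */
int pool_init(Pool *p, size_t elem_size, size_t capacity) {
    p->slab = malloc(elem_size * capacity);
    p->elem_size = elem_size;
    p->capacity = capacity;
    p->used = 0;
    return p->slab != NULL;
}

/* Hands out the next unused object, or NULL when the pool is full. */
void *pool_alloc(Pool *p) {
    if (p->slab == NULL || p->used == p->capacity) return NULL;
    return p->slab + p->elem_size * p->used++;
}

/* Releases every object of the pool in one step. */
void pool_destroy(Pool *p) {
    free(p->slab);
    p->slab = NULL;
    p->used = 0;
}

/* True if ptr lies inside the allocated part of the pool's slab. */
int pool_contains(const Pool *p, const void *ptr) {
    if (p->slab == NULL) return 0;
    uintptr_t lo = (uintptr_t)p->slab;
    uintptr_t hi = lo + p->elem_size * p->used;
    uintptr_t a  = (uintptr_t)ptr;
    return a >= lo && a < hi;
}

/* Builds a list of up to n nodes, all allocated from the given pool. */
List *make_list(Pool *p, int n) {
    List *head = NULL;
    for (int i = 0; i < n; i++) {
        List *node = pool_alloc(p);
        if (node == NULL) break; /* pool exhausted: return what we have */
        node->val = i;
        node->next = head;
        head = node;
    }
    return head;
}
```

Calling make_list with two different pools yields two lists whose nodes never share a pool, which is the partitioning property the slide describes.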


Outline

• Motivation & Overview

• Background

• Approach

• Formalization

• Evaluation

• Summary


SAFECode Approach : Enforce Core Analyses

• Alias analysis

• Call graph – Run-time checks on indirect calls

• Type information – Subset of alias analysis
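A run-time check on an indirect call might look like the sketch below. It assumes the call graph yields, for a given call site, a fixed set of permitted targets. The names funccheck, checked_call, allowed_targets, and the three example functions are hypothetical and are not taken from the SAFECode runtime.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

int inc(int x) { return x + 1; }
int dbl(int x) { return x * 2; }
int neg(int x) { return -x; }

/* Hypothetical: the targets the call graph predicts for one call site. */
static const handler_fn allowed_targets[] = { inc, dbl };

/* Returns 1 if f is one of the predicted targets, 0 otherwise. */
int funccheck(handler_fn f) {
    size_t n = sizeof allowed_targets / sizeof allowed_targets[0];
    for (size_t i = 0; i < n; i++) {
        if (allowed_targets[i] == f) return 1;
    }
    return 0;
}

/* Performs the indirect call only if the check passes.
 * Returns 1 and stores the result in *out on success;
 * returns 0 and leaves *out untouched if the target is not allowed. */
int checked_call(handler_fn f, int arg, int *out) {
    if (!funccheck(f)) return 0;
    *out = f(arg);
    return 1;
}
```

With this table, a call through inc or dbl proceeds, while a call through neg is rejected because the call graph did not list it for this site.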


Enforcing Alias Analysis

• Check if tmp points to corresponding node

• Normal allocators
  – Memory objects are scattered in the heap
  – Each check at run-time is extremely expensive

[Figure: tmp points to a node struct List (TK), flag H, with fields next and val]


Insight 1 – Use Automatic Pool Allocation (APA)

• Each node instance uses separate pool

• Pool is destroyed if not accessible

Partitions heap into pools based on alias analysis

[Figure: two List nodes (flag H, fields next and val), one in Pool 1 and one in Pool 2; pointer labels head, x, y]


The Pool Bounds Check

• Pool is a list of pages (each of size 2^k)

• Pool maintains a hash table of the start addresses of the pages

• Poolcheck on a pointer p
  – Mask lower k bits of p, see if it is in the hash table
  – Alignment check for TK pools

Poolcheck: involves hash lookups
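The check described above can be sketched as follows. This is an illustration under stated assumptions, not the SAFECode implementation: k is fixed at 12, the hash table is a small open-addressed array, objects in a TK pool are assumed to be laid out back to back from the start of each page, and the names Pool, pool_add_page, slot_of, and this poolcheck signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12u                          /* k: pages are 2^k bytes */
#define PAGE_SIZE   ((uintptr_t)1 << PAGE_BITS)
#define TABLE_SLOTS 64u                          /* must be a power of two */

typedef struct Pool {
    uintptr_t pages[TABLE_SLOTS]; /* set of page start addresses; 0 = empty */
    size_t    count;              /* number of pages recorded */
    size_t    elem_size;          /* object size for TK pools; 0 for TU pools */
} Pool;

/* Maps a page start address to its first probe position. */
static size_t slot_of(uintptr_t page) {
    return (size_t)((page >> PAGE_BITS) * 2654435761u) & (TABLE_SLOTS - 1);
}

/* Records a page in the pool. Returns 0 if the address is not page
 * aligned or the table is full (one slot is always kept empty). */
int pool_add_page(Pool *p, uintptr_t page) {
    if (page == 0 || (page & (PAGE_SIZE - 1)) != 0) return 0;
    if (p->count + 1 >= TABLE_SLOTS) return 0;
    size_t i = slot_of(page);
    while (p->pages[i] != 0) {
        if (p->pages[i] == page) return 1;       /* already present */
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    p->pages[i] = page;
    p->count++;
    return 1;
}

/* Returns 1 if ptr falls in a page of the pool and, for TK pools,
 * sits on an object boundary; returns 0 otherwise. */
int poolcheck(const Pool *p, const void *ptr) {
    uintptr_t addr = (uintptr_t)ptr;
    uintptr_t page = addr & ~(PAGE_SIZE - 1);    /* mask the lower k bits */
    size_t i = slot_of(page);
    while (p->pages[i] != 0) {                   /* linear probing */
        if (p->pages[i] == page) {
            if (p->elem_size == 0) return 1;     /* TU pool: bounds only */
            return (addr - page) % p->elem_size == 0;
        }
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return 0;                                    /* page not in this pool */
}
```

The cost of each check is one mask plus a short hash probe, which is why the slides go on to look for ways to avoid most of these checks statically.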


Insight 2 : Mostly static checking for TK pools

3 sufficient properties

Type Known Pools
  Typed accesses: free
  Correct alignment: free
  No pool bounds violations: pool bounds checks

Type Unknown Pools
  Pool bounds checks on all operations

• Solution
  – Type homogeneity, do not release memory from pool (Insight 3)

• Release memory from pool when pool is inaccessible (Insight 4)


Formalization as a Type System

Soundness theorem ensures core analyses are never invalidated

Original program:

  x = malloc(4);
  y = x;
  free(x);
  y = malloc(4);

Pool-allocated program with pool-annotated pointer types:

  poolinit(ρ’, int) PP’ {
    poolinit(ρ, int) PP {
      int*ρ x,y;
      int*ρ’ z;
      x = poolalloc(PP, 1); // allocate one element
      y = x; // type checks
      poolfree(PP, x);
      y = poolalloc(PP, 1); // malloc semantics different
    }
  }

[Figure: x and y point to an Int node in pool ρ; z points to an Int node in pool ρ’]
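The "malloc semantics different" comment can be made concrete with a small sketch. It assumes a type-homogeneous pool of ints whose freed slots go back to the pool itself and never to the system allocator, so a dangling pointer still refers to an int inside the same pool. IntPool, int_pool_init, int_pool_alloc, int_pool_free, and run_slide_example are hypothetical names, not the poolalloc and poolfree of the SAFECode runtime.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 16

/* Hypothetical type-homogeneous pool: every slot only ever holds an int. */
typedef struct IntPool {
    int    slots[POOL_CAPACITY];      /* storage for the objects */
    int   *free_list[POOL_CAPACITY];  /* stack of released slots */
    size_t free_count;                /* entries on the free stack */
    size_t next_unused;               /* first slot never handed out */
} IntPool;

void int_pool_init(IntPool *p) {
    p->free_count = 0;
    p->next_unused = 0;
}

/* Reuses a released slot if there is one, else takes a fresh slot. */
int *int_pool_alloc(IntPool *p) {
    if (p->free_count > 0) return p->free_list[--p->free_count];
    if (p->next_unused < POOL_CAPACITY) return &p->slots[p->next_unused++];
    return NULL;
}

/* Returns the slot to the pool; the memory stays inside the pool. */
void int_pool_free(IntPool *p, int *obj) {
    if (obj != NULL && p->free_count < POOL_CAPACITY) {
        p->free_list[p->free_count++] = obj;
    }
}

/* Mirrors the slide: x is freed, y is reallocated from the same pool.
 * Returns the value read through the dangling pointer x. */
int run_slide_example(void) {
    IntPool pp;
    int_pool_init(&pp);
    int *x = int_pool_alloc(&pp);
    int *y = x;                 /* type checks: both point into pool pp */
    *x = 42;
    int_pool_free(&pp, x);      /* x and y are now dangling */
    y = int_pool_alloc(&pp);    /* reuses the slot x still points to */
    *y = 7;
    return *x;                  /* still an int from the same pool */
}
```

In this sketch the dangling read through x is a logic error that goes undetected, but it cannot reach memory outside the pool or memory of another type, which is the sense in which the alias analysis stays valid.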


Static Analysis Using SAFECode

• Flow-sensitive analysis – Only change is in malloc semantics

• Flow-insensitive analyses – don’t require any changes

e.g., ESP, BLAST

Sound Analyses for C are now possible


Evaluation (Run-time Overhead)

• Olden, Ptrdist, 3 system daemons [Full list in the paper]

• No source changes necessary

• Compared with CCured on Olden [See paper]

Program    SAFECode ratio
bh         1.03
bisort     1.00
em3d       1.27
treeadd    0.99
tsp        0.99
yacr2      1.30
ks         1.12
anagram    1.23
ftpd       1.00
fingerd    1.03
ghttpd     1.07

1.0 ≡ no pool allocation + no SAFECode passes


Related Work

Solution           Performance    Error detection/prevention   Sound analysis   Memory management
Pure C
Purify, Valgrind   Several 100x   Some                         -                -
SafeC              5x             Some                         -                -
Jones-Kelly        5-6x           Some                         -                -
SFI                Over 2x        Few                          -                -
Yong               Over 2x        Some                         -                -
SAFECode           Up to 1.30     Some                         Yes              -
Modified C
CCured             Up to 1.87     All                          Yes              GC
Cyclone            1x-2x          All                          Yes              Regions + GC


Two errors we don’t detect

• Detecting array bounds overflow
  – A low overhead backwards-compatible solution [ICSE 2006]

• Detecting dangling pointer dereference
  – Efficient detection for some kinds of programs [DSN 2006]


Conclusion

• Sound operational semantics for C + core analyses

• Guarantee alias analysis with low overhead

We guarantee memory safety without detecting some errors

- Control flow integrity

- Data access integrity (type information)

- Analysis integrity

http://safecode.cs.uiuc.edu
