
Verifying Dereference Safety via Expanding-Scope Analysis

Alexey Loginov (GrammaTech, Inc.)

Joint work with: E. Yahav, S. Chandra, S. Fink (IBM TJ Watson) N. Rinetzky (Tel-Aviv University) M.G. Nanda (IBM IRL)


Why Null-Dereference Analysis?

Common problem
…or symptom of other problems
› Null-dereference warning may help in identifying root cause

Relevant to all software

Specification is obvious (absence of NPE)
› Requires no user interaction


Why Sound Null-Dereference Analysis?

Safety guarantees are important in some domains

Results can become an in-code specification, e.g., via JSR 305
› Annotations can help with code understanding
› Annotations can simplify future analyses (e.g., after modifications)

Precise and efficient sound analysis is challenging
› Lessons carry over to other static analyses


Example answers expected

 1. class A {
 2.   final A a = new A();
 3.   static main() {
 4.     B b = new B();
 5.     initB(b);
 6.     a.foo(b);          // okay
 7.   }
 8.   foo(B b) {
 9.     b.f.fun();         // okay
10.     b.f.f.gun();       // null-deref.
11.   }
12.   static initB(B b) {
13.     b.f = new F();     // okay
14.     b.f.f = null;      // okay
15.   }
16. }

Interprocedural information is often needed

– Allocations in callers (e.g., new B()) common

– Allocations in callees (e.g., new F()) common
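A runnable rendering of the example, as a sketch only. The slide does not define B or F, so their fields and the empty fun() and gun() bodies are assumptions. The field a is made static here so that the static main can reach it. Running main fails with a NullPointerException at the b.f.f.gun() call, which is the one dereference the analysis is expected to flag.

```java
/** Assumed shape: F has a field f of type F, because the slide dereferences b.f.f. */
final class F {
    F f;
    void fun() {}
    void gun() {}
}

/** Assumed shape: B has a single field f of type F. */
final class B {
    F f;
}

final class A {
    // Static (unlike on the slide) so that the static main below can use it.
    static final A a = new A();

    public static void main(String[] args) {
        B b = new B();            // allocation in the caller
        initB(b);
        a.foo(b);                 // safe: a is final and initialized
    }

    void foo(B b) {
        b.f.fun();                // safe: initB assigned b.f
        b.f.f.gun();              // NullPointerException: initB set b.f.f to null
    }

    static void initB(B b) {
        b.f = new F();            // allocation in the callee
        b.f.f = null;
    }
}
```

Deciding that b and b.f are safe inside foo requires looking at both main (where b is allocated) and initB (where b.f is allocated), which is the interprocedural information the slide refers to.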


Common approaches

Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 1: pessimistic assumptions about callers/callees
› Result: a sea of false alarms


Results of pessimistic intraproc. analysis

 1. class A {
 2.   final A a = new A();
 3.   static main() {
 4.     B b = new B();
 5.     initB(b);
 6.     a.foo(b);          // null deref.
 7.   }
 8.   foo(B b) {
 9.     b.f.fun();         // two null derefs.
10.     b.f.f.gun();       // null deref.
11.   }
12.   static initB(B b) {
13.     b.f = new F();     // null deref.
14.     b.f.f = null;      // okay
15.   }
16. }

Reports four false alarms

– Only real error is on line 10


Common approaches

Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 2: optimistic assumptions about callers/callees
› Result: missing real errors (catching the most glaring ones)


Results of optimistic intraproc. analysis

 1. class A {
 2.   final A a = new A();
 3.   static main() {
 4.     B b = new B();
 5.     initB(b);
 6.     a.foo(b);          // okay
 7.   }
 8.   foo(B b) {
 9.     b.f.fun();         // okay
10.     b.f.f.gun();       // okay
11.   }
12.   static initB(B b) {
13.     b.f = new F();     // okay
14.     b.f.f = null;      // okay
15.   }
16. }

Misses the real error on line 10


Common approaches

Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 3: mostly optimistic assumptions
› Detects inconsistencies in programmer’s beliefs
• Test x == null: belief that x could be null before test
• Dereference of x without a test: belief that x cannot be null
› Allow analysis to dismiss assumptions contradicted by beliefs
› Result: missing real errors, reporting safe dereferences as unsafe
• Generally, few false alarms but many missed errors
• Same result as option 2 (optimistic assumptions) in our example
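A small illustration of the two beliefs, as a sketch. The class and method names are hypothetical and do not come from the slides. The dereference expresses the belief that s cannot be null, and the later test expresses the belief that it could be; a belief-based tool reports the contradiction.

```java
final class BeliefExample {
    static int length(String s) {
        int n = s.length();   // dereference without a test: belief that s cannot be null
        if (s == null) {      // test after the dereference: belief that s could be null
            return 0;         // never reached: a null s already failed on the line above
        }
        return n;
    }
}
```

The running example contains no null tests at all, so there are no beliefs to contradict. That is why this option behaves like option 2 there and misses the error on line 10.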


Prospects for interprocedural analysis

Whole-program analysis cannot scale to large software
› Majority of instructions are relevant to null-dereference analysis
• Can’t prune down program to a small relevant subset

Need mechanism to break down a program’s complexity


Expanding-Scope Analysis

Holy Grail
› Cost: INTRAprocedural analysis
› Precision: INTERprocedural (whole-program) analysis

Staged approach
› Analyze dereferences with limited interprocedural context
› Verify dereferences with the least amount of context
› Increase interprocedural context for harder cases
› In simplest form
• Start with local analysis (with pessimistic assumptions)
– Verify some dereferences without considering context
• Consider remaining dereferences with extra level of context
– Verify some dereferences within a call subtree of immediate callers
• …
› We refer to individual analyses as Limited-Scope Analyses
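The simplest form of the staged approach can be sketched as a driver loop. This is an illustration under assumptions, not SALSA's implementation: the LimitedScopeAnalysis interface, the unverified method, and the scopeLimit parameter are hypothetical names. Each round retries only the dereferences that cheaper rounds could not verify.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class ExpandingScopeDriver {
    /** Hypothetical limited-scope analysis: tries to prove one dereference safe
     *  using the given number of levels of caller context (0 = local analysis). */
    interface LimitedScopeAnalysis<D> {
        boolean verify(D dereference, int scope);
    }

    /** Returns the dereferences that no scope in 0..scopeLimit could verify. */
    static <D> Set<D> unverified(Set<D> dereferences,
                                 LimitedScopeAnalysis<D> analysis,
                                 int scopeLimit) {
        Set<D> remaining = new LinkedHashSet<>(dereferences);
        for (int scope = 0; scope <= scopeLimit && !remaining.isEmpty(); scope++) {
            final int currentScope = scope;
            // Drop every dereference verified at this scope; harder cases stay
            // in the set and are retried with one more level of context.
            remaining.removeIf(d -> analysis.verify(d, currentScope));
        }
        return remaining;
    }
}
```

Whatever is left in the returned set after the last scope becomes a warning.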


Expanding-Scope Analysis

[Figure: call graph whose nodes are labelled f, surrounding the dereference … f.foo() …]


Expanding-Scope Analysis

[Figure: call graph of the example, one node per method]

main:   B b = new B(); initB(b); a.foo(b);
initB:  b.f = new F(); b.f.f = null;
foo:    b.f.fun(); b.f.f.gun();


Abstract Domain

Product of three abstract domains

1. Abstract domain for may-alias analysis
• Implementation: flow- & context-insensitive Andersen-style

2. Abstract domain for must-alias analysis
• Implementation: demand-driven (based on def-use chains)

3. Set APnn of non-null access paths
• Access paths denote l-value expressions:
– (VarId | StaticFieldId).InstanceFieldId*
• Finiteness of domain guaranteed by (parameterized) bounds on
– Size of APnn
– Maximal length of access paths in APnn

› Only the final component (set of non-null access paths APnn) changes
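A minimal sketch of a bounded APnn, under stated assumptions. Access paths are encoded as dotted strings with a simple identifier as the root, path length is taken to be the number of instance fields after the root, and a path that would exceed either bound is simply not tracked. The slides do not say how the tool counts length or enforces the bounds, and the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Sketch of a bounded set of non-null access paths such as "b" or "b.f". */
final class BoundedNonNullPaths {
    private final int maxPaths;    // bound on the size of APnn
    private final int maxLength;   // bound on the number of fields in one path
    private final Set<String> paths = new LinkedHashSet<>();

    BoundedNonNullPaths(int maxPaths, int maxLength) {
        this.maxPaths = maxPaths;
        this.maxLength = maxLength;
    }

    /** Number of instance fields after the root: "b" is 0, "b.f.f" is 2. */
    static int length(String path) {
        return path.split("\\.").length - 1;
    }

    /** Records the path as non-null if the bounds allow it; returns whether it is tracked. */
    boolean add(String path) {
        if (paths.contains(path)) {
            return true;
        }
        if (length(path) > maxLength || paths.size() >= maxPaths) {
            return false;          // not tracked: the path is treated as possibly null
        }
        return paths.add(path);
    }

    boolean isKnownNonNull(String path) {
        return paths.contains(path);
    }
}
```

Refusing to track a path only loses a non-null fact, so the bounds can cost precision (more warnings) but not soundness.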


Transfer Functions (statements)

Statement                        Transfer function

v = null                         APnn \ {v.γ | γ ∈ Γ}
v = new T()                      APnn ∪ {v}
v = w                            APnn ∪ {v.γ | w.γ ∈ APnn}
v = w.f                          APnn ∪ {v.γ | w.f.γ ∈ APnn} ∪ mustAlias(w)
v.f = null                       (APnn \ {e′.f.γ | e′ ∈ mayAlias(v), γ ∈ Γ}) ∪ mustAlias(v)
v.f = w                          APnn ∪ {e′.f.γ | w.γ ∈ APnn, e′ ∈ mustAlias(v)} ∪ mustAlias(v)
…v.foo()…, …v[i]…, …v.length…    APnn ∪ mustAlias(v)

Let Γ = InstanceFieldId* (sequences of instance fields)
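A minimal Java sketch of these transfer functions, reading each row as a set union or set difference over APnn. The assumptions: access paths are dotted strings, the alias oracles are supplied by the caller as functions and include the variable itself, and the class and method names are hypothetical. The code follows the table row by row and adds no kill sets beyond the ones the table shows.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

final class NonNullTransfer {
    // Hypothetical alias oracles: each maps a variable to a set of access paths.
    private final Function<String, Set<String>> mustAlias;
    private final Function<String, Set<String>> mayAlias;

    NonNullTransfer(Function<String, Set<String>> mustAlias,
                    Function<String, Set<String>> mayAlias) {
        this.mustAlias = mustAlias;
        this.mayAlias = mayAlias;
    }

    /** True if path is prefix itself or prefix followed by one or more fields. */
    private static boolean extendsPath(String path, String prefix) {
        return path.equals(prefix) || path.startsWith(prefix + ".");
    }

    /** v = null: remove v and every path rooted at v. */
    Set<String> assignNull(Set<String> ap, String v) {
        Set<String> out = new HashSet<>(ap);
        out.removeIf(p -> extendsPath(p, v));
        return out;
    }

    /** v = new T(): add v. */
    Set<String> assignNew(Set<String> ap, String v) {
        Set<String> out = new HashSet<>(ap);
        out.add(v);
        return out;
    }

    /** v = w: for every non-null path rooted at w, add the same path rooted at v. */
    Set<String> copy(Set<String> ap, String v, String w) {
        Set<String> out = new HashSet<>(ap);
        for (String p : ap) {
            if (extendsPath(p, w)) {
                out.add(v + p.substring(w.length()));
            }
        }
        return out;
    }

    /** v = w.f: copy from w.f, and record that w was dereferenced. */
    Set<String> load(Set<String> ap, String v, String w, String f) {
        Set<String> out = copy(ap, v, w + "." + f);
        out.addAll(mustAlias.apply(w));
        return out;
    }

    /** v.f = null: remove e'.f and its extensions for every may-alias e' of v. */
    Set<String> storeNull(Set<String> ap, String v, String f) {
        Set<String> out = new HashSet<>(ap);
        for (String e : mayAlias.apply(v)) {
            out.removeIf(p -> extendsPath(p, e + "." + f));
        }
        out.addAll(mustAlias.apply(v));
        return out;
    }

    /** v.f = w: add e'.f.γ for every must-alias e' of v and non-null path w.γ. */
    Set<String> store(Set<String> ap, String v, String f, String w) {
        Set<String> out = new HashSet<>(ap);
        for (String e : mustAlias.apply(v)) {
            for (String p : ap) {
                if (extendsPath(p, w)) {
                    out.add(e + "." + f + p.substring(w.length()));
                }
            }
        }
        out.addAll(mustAlias.apply(v));
        return out;
    }

    /** v.foo(), v[i], v.length: v and its must-aliases are non-null afterwards. */
    Set<String> deref(Set<String> ap, String v) {
        Set<String> out = new HashSet<>(ap);
        out.addAll(mustAlias.apply(v));
        return out;
    }
}
```

Removal uses may-alias information and addition uses must-alias information, so imprecise aliasing can only shrink APnn.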


Transfer Functions (conditions)

Condition         Transfer function

v == null         on true branch:  v ∈ APnn ? ⊥ : APnn
                  on false branch: APnn ∪ mustAlias(v)

v instanceof T    on true branch:  APnn ∪ mustAlias(v)
                  on false branch: APnn

v == w            on true branch:  APnn ∪ (mustAlias(w) if v ∈ APnn) ∪ (mustAlias(v) if w ∈ APnn)
                  on false branch: APnn
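A sketch of the condition transfer functions in Java. It assumes that testing a known non-null v against null makes the true branch infeasible, which is modelled here as an empty Optional. The must-alias sets are passed in directly, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

final class NonNullConditions {
    private static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(a);
        out.addAll(b);
        return out;
    }

    /** v == null, true branch: empty if v is known non-null, else APnn unchanged. */
    static Optional<Set<String>> nullTestTrue(Set<String> ap, String v) {
        if (ap.contains(v)) {
            return Optional.empty();   // assumed meaning: this branch cannot execute
        }
        return Optional.of(new HashSet<>(ap));
    }

    /** v == null, false branch, and v instanceof T, true branch: v is non-null. */
    static Set<String> knownNonNull(Set<String> ap, Set<String> mustAliasV) {
        return union(ap, mustAliasV);
    }

    /** v == w, true branch: each side inherits non-nullness from the other. */
    static Set<String> equalTrue(Set<String> ap, String v, String w,
                                 Set<String> mustAliasV, Set<String> mustAliasW) {
        Set<String> out = new HashSet<>(ap);
        if (ap.contains(v)) {
            out.addAll(mustAliasW);
        }
        if (ap.contains(w)) {
            out.addAll(mustAliasV);
        }
        return out;
    }
}
```

The remaining branches (instanceof false, v == w false) leave APnn unchanged, so they need no code.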


Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)

Real OO applications (e.g., web applications) have wide call graphs
› High scope limits are too expensive to analyze

New stages help stave off the need for high scope limits
1. Pruning
• Verifies dereferences of (non-null) final and stationary fields
2. Special local (scope-0) analyses
a. Caller-guarantee analysis (top-down in call graph)
– Propagates callers’ guarantees to callees
– E.g., for references passed as arguments down deep call chains
b. Callee-guarantee analysis (bottom-up in call graph)
– Propagates callees’ guarantees up to callers
– E.g., for field initializations in deep initialization call chains
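The caller-guarantee idea can be sketched as a fixpoint over call sites. This is a deliberately simplified illustration, not SALSA's analysis. It assumes every method has a single reference parameter, that a parameter counts as guaranteed non-null only if every call site passes a non-null argument, and that methods with no known callers get no guarantee. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CallerGuaranteeSketch {
    /** What a call site passes for the callee's single reference parameter. */
    enum Arg { NEW_OBJECT, MAYBE_NULL, CALLER_PARAM }

    static final class CallSite {
        final String caller;
        final String callee;
        final Arg arg;

        CallSite(String caller, String callee, Arg arg) {
            this.caller = caller;
            this.callee = callee;
            this.arg = arg;
        }
    }

    /** Maps each method to whether its parameter is non-null at every call site. */
    static Map<String, Boolean> guarantees(List<CallSite> sites) {
        Map<String, Boolean> nonNull = new HashMap<>();
        for (CallSite s : sites) {
            nonNull.putIfAbsent(s.caller, false); // no known callers: no guarantee
            nonNull.put(s.callee, true);          // optimistic start, refuted below
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (CallSite s : sites) {
                boolean argNonNull = s.arg == Arg.NEW_OBJECT
                        || (s.arg == Arg.CALLER_PARAM && nonNull.get(s.caller));
                if (!argNonNull && nonNull.get(s.callee)) {
                    nonNull.put(s.callee, false); // one unsafe call site refutes it
                    changed = true;
                }
            }
        }
        return nonNull;
    }
}
```

A guarantee established at the top of a call chain reaches a deep callee through forwarded parameters without analyzing the intermediate method bodies together, which is the case the slide singles out.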


Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)

[Figure: analysis stages]
pruning
caller-guarantee
callee-guarantee
scope-1: subtrees of depth 1 from parents
scope-2: subtrees of depth 2 from grandparents
symbolic (output: high priority / low priority)


Steps of staged interproc. analysis

 1. class A {
 2.   final A a = new A();
 3.   static main() {
 4.     B b = new B();
 5.     initB(b);
 6.     a.foo(b);
 7.   }
 8.   foo(B b) {
 9.     b.f.fun();
10.     b.f.f.gun();
11.   }
12.   static initB(B b) {
13.     b.f = new F();
14.     b.f.f = null;
15.   }
16. }

Pruning (final & stationary fields)

Limited-scope analysis
1. Scope-0 (local analysis)
2. Scope-1 analysis

1. Caller-guarantee (local) analysis
2. Callee-guarantee (local) analysis
3. Scope-1 analysis

Facts shown on the slide: b ∈ APnn, b ∈ APnn, b.f ∈ APnn


Experimental results

21 (mostly open-source) applications
› ~3K-465K bytecodes; ~300-37K dereferences

Avg: ~90% of dereferences verified soundly and automatically
› ~8% dismissed by Pruning
› ~77% dismissed by caller-guarantee analysis
› ~5% dismissed by remaining stages

Final scope limit: between 2 and 5 (chosen heuristically)
› Diminishing returns after local analyses (caller-/callee-guarantee)
› Higher scope limits useful in the absence of caller/callee guarantees

Max. access-path length: 2 for all but four applications
› Higher access-path lengths had no effect for most applications
› Helped C-like applications (direct field dereferences without getters)


Experimental results

Expected many false alarms due to simple abstract domain

Implemented heuristic symbolic path-validity checking
› This phase selected ~20% as high-priority warnings
› Surprisingly low incidence of false alarms due to path-correlation

Biggest domain shortcoming: not tracking access-path types
› Causes unnecessarily high cost of verifying certain dereferences
• Includes too many irrelevant code portions when verifying a dereference
› Produces false alarms due to examining type-infeasible paths

Results are encouraging given the simplicity of the domain


Tool-User Interaction

The output includes suggested annotations
› Ordered by the number of warnings guaranteed to be dismissed
• Actual number would require an alternate abstract domain
› Current annotation options
• Field f is non-null
• Parameter p or return value of method foo() is non-null

User may choose to accept some annotations
› We studied annotations for 8 benchmarks with high warning counts
› A few hours’ effort on unfamiliar code
• Result: 30% decrease in warning counts
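What the three annotation options could look like in code, as a sketch. The @NonNull annotation below is a locally defined, hypothetical stand-in so that the example compiles on its own. It is not the JSR 305 annotation and not the tool's output format, and the Account class is invented for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Objects;

/** Hypothetical stand-in for a non-null annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD})
@interface NonNull {}

final class Account {
    @NonNull private final String owner;     // option: field is non-null

    Account(@NonNull String owner) {         // option: parameter is non-null
        // An accepted annotation can be checked with a runtime assertion.
        this.owner = Objects.requireNonNull(owner, "owner");
    }

    @NonNull String owner() {                // option: return value is non-null
        return owner;
    }
}
```

With such annotations accepted, a dereference like account.owner().length() no longer needs caller or callee context to be verified.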


Summary

Novel expanding-scope analysis
› Applicable to multiple abstract domains

Scalable and precise null-dereference analysis
› Staged analysis makes a simple abstract domain effective

Vision: improve programs’ specifications and robustness
› Cleanse programs by examining warnings and suggested annotations
› Check accepted annotations with assertions or symbolic techniques
› Extend the program’s specification and analyzability via annotations
