Download - Shadow Execution Frameworks for LLVM IR and JavaScriptkubitron/courses/... · Shadow Execution Frameworks for LLVM IR and JavaScript Motivation! ... • Handle all tricky C pointer

Transcript
Page 1: Shadow Execution Frameworks for LLVM IR and JavaScriptkubitron/courses/... · Shadow Execution Frameworks for LLVM IR and JavaScript Motivation! ... • Handle all tricky C pointer

Berkeley Parlab

1. Motivation

2. Overview

6. Evaluation for JavaScript 4. Evaluation for LLVM IR

Liang Gong & Cuong Nguyen* * Name are in alphabetical order. Each author makes equal contributions to the project.

5. Techniques and Challenges for JavaScript

Original Javascript: var a = b; Simplified Transformed Javascript: var a = J$.W(9, 'a', J$.R(5, 'b', b), a);

J$.W = function( … ) { … } J$.R = function( … ) { … }

Code Transformation:

Jalangi Framework Runtime Code:

Different places that may import JavaScript code •  Inline,  between  <script> and   </script> •  External  file:  src a2ribute  of  a    <script> tag  •  HTML  event  handler  a2ribute,  such  as  onclick  or onmouseover •  URL:  javascript: protocol •  Ajax:  jQuery.getScript()  or    importScript() •  Generated  by  Script:  src_elem.innerHTML = “function(){}”

More complex code execution model/environment: •  JS  Code  may  be  triggered  at  different  Kme    (page  loading,  specific  event,  asynchronous  call)  •  Different  JS  Context  

We add the following hooks: •  Binary  OperaKon  •  Unary  OperaKon  •  Variable  Read/Write  •  Field  Get/Put  •  FuncKon/Method  Call/Enter/Return  •  Script  Enter/Return  •  Object/FuncKon/Regexp/Array  Literal  •  CondiKon/Switch  

With hooks you can do: •  Debugging  •  Performance  analysis  •  Monitoring  dynamic  behaviors  •  Record  and  replay  •  Almost  any  dynamic  analysis  

Detect  a  NaN  in   AWer  diagnosing,  confirm  that  is  a  bug.

Check NaN Bug [ < 100 Loc]

[object Object].d <= undefined c <= undefined [object Object].refcount <= NaN

Found interesting operations in the following websites’ homepages

Simply  loading  the  Facebook  homepage,  the  analysis  detects  hundreds  of  this  kind  of  interesKng  operaKons.

Operating uninitialized variable

0  

10  

20  

30  

40  

50  

60  

70  

Vector  Chart   Bitmap  Game   Vector  Chart   Mobile  HTML5  CharKng  

Mobile  HTML5  Gaming  

AWer  TransformaKon  Before  TransformaKon  

Slow  down  is  not  obvious

Demo Available

Shadow Execution Frameworks for LLVM IR and JavaScript

Motivation •  Lesson learnt: code instrumentation/transformation and

computation of extra information are foundation to dynamic analysis.

•  Goal: minimize the efforts to build dynamic analysis by generalizing this approach with high scalability.

Contributions •  Design and implement an LLVM Shadow Execution System.

•  Based on instrumentation, concolic execution and shadow execution. •  Perform instrumentation at the instruction level. The analysis is

heavyweight but powerful. •  Support multiple languages: C/C++, Java, OpenCL, Fortran, etc.

•  Design and implement a JavaScript Dynamic Analysis Framework. •  JavaScript is the most popular programming language for client-side web programming. •  Facilitate analyzing any real-world webpages. •  Analysis module plug in and remove on the fly. •  Powerful analysis interaction based on web console. •  Detect real-world bugs using the framework.

Shadow Execution •  Associate a shadow value with each concrete value. •  Carry useful information: error propagation, array

bound checking, symbolic value, etc. •  Perform side-by-side with the concrete execution.

Concrete execution assists shadow execution.

Challenges •  Modeling dynamic allocation and pointer

arithmetic in shadow execution. •  Handle all tricky C pointer arithmetic

(castings, pointer arithmetic, function pointers, etc.).

•  Array/struct and their combinations. •  Handling uninterpreted calls. •  With/without side effects + return

arbitrary types (pointer, array, struct). •  Model malloc.

•  Achieving scalability: fast variable lookup and lazy construction of objects.

Evaluation of the Core Interpreter •  Robustness: interpret 50 commonly used Linux core utilities

(rm/ls/mkdir …). •  Check for each memory write that the interpretation and

concrete execution writes the same value. •  Check at each branch that two executions do not diverge.

•  Performance:

Evaluation of the Usability •  Implement an array-out-of-bound analysis (< 300 LoC). •  Can trace back to the root causes of the problem. •  Differentiate harmful and benign bugs. •  Demo is available.

•  Implement a NaN detection analysis (in progress).

Portable  to:

[object Object].now = end – start; NaN “30% 0”

Core Interpreter (Resemble a compiler) •  Front end: instrument all 57 LLVM instructions + compute stack frame

size for local/global variables + create fast lookup indices for variables. •  Back end: re-interpret all 57 LLVM instructions with shadow execution

+ model heap/stack and runtime environment.

2x in average 4x in the worst case

Input: target application under analysis. Output: instrumented/transformed codes with hooks. Extensibility: override hooks to perform additional analysis.

%a = alloca double %b = alloca double %1 = load %a %2 = load %b %3 = add %1 %2

LLVM bitcode

Instrumenter (LLVM Pass)

Callback Interpreter (Dynamically loaded/

Perform analysis)

my_alloca(double) %a = alloca double

… my_load(%my_a) %1 = load %a

… my_add(%my_1, %my_2) %3 = add %1 %2

LLVM bitcode

Analysis Result

3. Techniques and Challenges for LLVM IR

LLVM Shadow Execution Framework Interesting Analysis Applications

Check NaN:• Find  a  bug  in  jQuery-­‐1.8.3• Find  meaningless(maybe)  operations

Hack Frontend Code:• Modify  frontend  dynamic  effects• For  program  understanding  purposes

• Graphviz online• AttackMap

Generate Runtime Call Graph

Analyze Performance Issue:• Function  or  operation  counterCheck Polymorphic getField Statements:• Function  or  operation  counter