CACHETOR: Detecting Cacheable Data to Remove Bloat

Khanh Nguyen, Guoqing Xu

UC Irvine, USA


Introduction

• Bloat: Excessive work to accomplish simple tasks

• Modern software suffers from bloat [Xu et al., FoSER 2010]

• It is difficult for compilers to remove the penalty

• One pattern: repeated computations that have the same inputs and produce the same outputs

• 4 out of 18 of IBM’s best practices* are to reuse data

* www.ibm.com/software/webservers/appserv/ws_bestpractices.pdf

Example

float[] fValues = {0.0f, 1.0f, 2.3f, 1.0f, 1.0f, 3.4f, 1.0f, 1.0f, . . . , 1.0f};

int[] iValues = new int[fValues.length];

for (int i = 0; i < fValues.length; i++) {
    iValues[i] = Float.floatToIntBits(fValues[i]);
}

(adapted from sunflow, an open-source image rendering system)


With the result for 1.0 cached:

int cached_result = Float.floatToIntBits(1.0f);

for (int i = 0; i < fValues.length; i++) {
    if (fValues[i] == 1.0f) iValues[i] = cached_result;
    else iValues[i] = Float.floatToIntBits(fValues[i]);
}

float[] fValues = {?, ?, ?, ?, . . . , ?};
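The caching idea from the example can be tried out with a small self-contained sketch. The class name `FloatBitsCache` and the method `convert` are hypothetical, and the choice of 1.0f as the frequent value is an assumption that would normally come from a profile rather than from the source code.

```java
import java.util.Arrays;

final class FloatBitsCache {
    // Converts each element with Float.floatToIntBits, reusing a precomputed
    // result for the frequent value 1.0f (assumed hot value, not known statically).
    static int[] convert(float[] fValues) {
        final int cachedResult = Float.floatToIntBits(1.0f);
        int[] iValues = new int[fValues.length];
        for (int i = 0; i < fValues.length; i++) {
            if (fValues[i] == 1.0f) {
                iValues[i] = cachedResult;          // reuse instead of recomputing
            } else {
                iValues[i] = Float.floatToIntBits(fValues[i]);
            }
        }
        return iValues;
    }

    public static void main(String[] args) {
        float[] sample = {0.0f, 1.0f, 2.3f, 1.0f, 1.0f, 3.4f};
        System.out.println(Arrays.toString(convert(sample)));
    }
}
```

The cached and uncached paths return the same bits for every input, so the transformation only pays off when the hot value really dominates the input.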

The Big Picture


• Dynamic Dependence Analysis produces a Dependence Profile/Graph

• Cachetor analyzes the graph with 3 detectors:

• I-Cachetor: Inst.: a = b+c;

• D-Cachetor: Obj.: a = new A();

• M-Cachetor: Call: a = f();

• Introduction

• Scalable algorithms for the dependence analysis

• 3 detectors

• Evaluations

In Theory


• Full Value Profiling

• Full Dynamic Slicing

Cachetor

In Practice

• Abstract Value Profiling

• Abstract Dynamic Slicing

Overview


• Combine value profiling and dynamic slicing in a mutually beneficial and scalable manner

• Distinct values are used to abstract instruction instances

• Result: an abstract dependence graph

• Nodes: abstract representations of runtime instances

• Edges: dependence relationships between nodes
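One way to picture such a graph is the minimal sketch below. It is not Cachetor's implementation: the class `AbstractDependenceGraph`, its `Node` type, and the `record` method are hypothetical names, and the sketch assumes a node is identified by a static instruction id plus an abstracted value, so that all runtime instances producing the same abstract value collapse into one node with a frequency count.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class AbstractDependenceGraph {
    // One node stands for all runtime instances of a static instruction
    // that created the same abstract value (assumed node identity).
    static final class Node {
        final int instructionId;
        final long abstractValue;

        Node(int instructionId, long abstractValue) {
            this.instructionId = instructionId;
            this.abstractValue = abstractValue;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Node)) return false;
            Node n = (Node) o;
            return instructionId == n.instructionId && abstractValue == n.abstractValue;
        }

        @Override public int hashCode() {
            return Objects.hash(instructionId, abstractValue);
        }
    }

    private final Map<Node, Integer> frequency = new HashMap<>();
    private final Map<Node, Set<Node>> dependsOn = new HashMap<>();

    // Records one executed instruction instance and its dependence edges.
    Node record(int instructionId, long abstractValue, Node... operands) {
        Node n = new Node(instructionId, abstractValue);
        frequency.merge(n, 1, Integer::sum);
        Set<Node> deps = dependsOn.computeIfAbsent(n, k -> new HashSet<>());
        Collections.addAll(deps, operands);
        return n;
    }

    int nodeCount() { return frequency.size(); }

    int frequencyOf(Node n) { return frequency.getOrDefault(n, 0); }

    Set<Node> dependenciesOf(Node n) {
        return dependsOn.getOrDefault(n, Collections.<Node>emptySet());
    }
}
```

Because repeated instances merge into existing nodes, the graph grows with the number of distinct abstract values rather than with the length of the execution.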

Equivalence Class


• Instruction i has an unbounded number of instruction instances e1 … en

• f1(inst. instance) = value created; instances that create the same value fall into the same equivalence class, so the number of classes is UNBOUNDED

• f2 maps the values created into a BOUNDED domain of SIZE N (Top-N? Hashing?)

• Choice: hashing, f2 = value % N
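The hashing step (value % N) can be sketched as follows. The class `ValueAbstraction` and method `slot` are hypothetical names. Using `Math.floorMod` so that negative values also land in the range 0 to N-1 is an assumption of this sketch; the slides only state value % N.

```java
final class ValueAbstraction {
    // f2: maps an unbounded value to one of n slots via value % n.
    // Math.floorMod keeps the slot non-negative for negative values
    // (an assumption of this sketch, not stated on the slides).
    static int slot(long value, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return (int) Math.floorMod(value, (long) n);
    }

    public static void main(String[] args) {
        int n = 8;
        // 5 and 13 are distinct values but share slot 5 when n = 8.
        System.out.println(slot(5, n) + " " + slot(13, n));
    }
}
```

Distinct values that share a slot become indistinguishable after abstraction, which is the price paid for bounding the number of equivalence classes.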

Another Abstraction Level

• Context sensitive:

• To distinguish entities based on the calling context

• To improve the tool’s precision

• Please refer to our paper for details


Cacheability

• Quantitative measurement indicating how likely a program entity is to keep producing/containing identical values

• Compute cacheability for 3 kinds of program entities:

• Instruction: a = b+c;

• Data structure: a = new A();

• Method call: a = f();

• Rank and report top entities


Cachetor

• Introduction

• Scalable algorithms for the dependence analysis

• 3 detectors

• Evaluations


I-Cachetor

• Detect instructions that create identical values

• Compute cacheability for each static instruction (Inst.CM)

• Cacheability (figure): the abstract values of one instruction, with frequencies 1, 4, 2, 1, give 4/8 = 0.5
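The slide shows only the numbers, not the formula. The sketch below assumes that the 4/8 is the largest frequency among an instruction's abstract values divided by the total number of instances. That reading matches the frequencies 1, 4, 2, 1 but is an assumption, and `InstructionCacheability` and `cacheability` are hypothetical names.

```java
final class InstructionCacheability {
    // Assumed reading of the slide: cacheability = highest frequency of any
    // abstract value / total number of instruction instances.
    static double cacheability(int[] frequencies) {
        long total = 0;
        int max = 0;
        for (int f : frequencies) {
            if (f < 0) {
                throw new IllegalArgumentException("frequencies must be non-negative");
            }
            total += f;
            max = Math.max(max, f);
        }
        // An instruction that never executed has no evidence of cacheability.
        return total == 0 ? 0.0 : (double) max / total;
    }

    public static void main(String[] args) {
        System.out.println(cacheability(new int[]{1, 4, 2, 1}));
    }
}
```

Under this reading, an instruction that always produces the same value scores 1.0, and the score drops as its values spread over more slots.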

D-Cachetor: Overview

• 2 steps:

• Step 1: detect cacheable individual objects

• Step 2: detect cacheable data structures

• Compute cacheability for each allocation site node

D-Cachetor: Step 1

• Compute cacheability for each object (Obj.CM), not considering reference relationships

• Focus: instructions that write primitive-typed fields

Figure: a = new A() (1); field writes a.f = b<2,3>, a.g = c<3,3>, a.… = …, a.h = d<5,7>; labels 1, 2, …, t

D-Cachetor: Step 2

• Group objects using the reference relationships

• Compute DataStructureCM

• Focus: instructions that write reference-typed fields

• Add only objects whose Obj.CM is within a range

Figure: ds = new DS() (2); a = new A() (4); b = new B() (6); c = new C() (2); d = new D() (7)
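The grouping in Step 2 might look roughly like the traversal below. This is a sketch under several assumptions: the names `DataStructureGrouping`, `Obj`, and `members` are hypothetical, the range is taken as a closed interval [lo, hi], and the traversal continues through objects that fall outside the range. The slides do not say how DataStructureCM is then computed from the members, so that part is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class DataStructureGrouping {
    // Hypothetical object node: an allocation site label, its Obj.CM,
    // and the objects it references through reference-typed fields.
    static final class Obj {
        final String site;
        final double objCM;
        final List<Obj> refs = new ArrayList<>();

        Obj(String site, double objCM) {
            this.site = site;
            this.objCM = objCM;
        }
    }

    // Breadth-first walk over reference edges from the root; an object is
    // added to the data structure only if its Obj.CM lies within [lo, hi].
    static List<Obj> members(Obj root, double lo, double hi) {
        List<Obj> included = new ArrayList<>();
        Set<Obj> visited = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Obj o = queue.poll();
            if (o.objCM >= lo && o.objCM <= hi) {
                included.add(o);
            }
            for (Obj r : o.refs) {
                if (visited.add(r)) {   // add() returns false for already-seen objects
                    queue.add(r);
                }
            }
        }
        return included;
    }
}
```

The identity-based visited set keeps the walk finite on cyclic structures such as doubly linked lists.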

M-Cachetor

• Detect method calls that have the same inputs and produce the same outputs

• Compute CallSiteCM

• For each call site c: a = f(), CallSiteCM is:

• If a is primitive: CallSiteCM = Inst.CM of c

• If a is reference: CallSiteCM = the average of DataStructureCM of all data structures rooted at a
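The two cases of the CallSiteCM rule can be written down directly. In this sketch the names `CallSiteCacheability` and `callSiteCM` are hypothetical, the inputs are assumed to have been computed already by the other detectors, and returning 0.0 when no data structure is rooted at a is an assumption the slides do not cover.

```java
import java.util.List;

final class CallSiteCacheability {
    // For a call site c: a = f():
    //   primitive a  -> CallSiteCM = Inst.CM of c
    //   reference a  -> CallSiteCM = average DataStructureCM of the data
    //                   structures rooted at a
    static double callSiteCM(boolean returnsPrimitive,
                             double instCM,
                             List<Double> dataStructureCMs) {
        if (returnsPrimitive) {
            return instCM;
        }
        if (dataStructureCMs.isEmpty()) {
            return 0.0;   // assumption: no rooted data structure, no evidence
        }
        double sum = 0.0;
        for (double cm : dataStructureCMs) {
            sum += cm;
        }
        return sum / dataStructureCMs.size();
    }
}
```

For example, a call returning a reference whose rooted data structures score 0.25 and 0.75 gets a CallSiteCM of 0.5, while a call returning a primitive simply inherits the Inst.CM of the call instruction.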

Implementation

• Jikes RVM 3.1.1

• Optimizing-compiler-only mode

• Context-sensitive

• Evaluated on 14 benchmarks from DaCapo & Java Grande


Overheads


Figure: Time and Space overheads (X) per benchmark: antlr, bloat, fop, hsqldb, luindex, lusearch, pmd, xalan, avrora, sunflow, euler, moldyn, montecarlo, raytracer, and Geo. Mean. Time axis: 0 to 600; Space axis: 0 to 10.

Geo. Mean = 201.96X (Time), 1.98X (Space)

Case Studies


Program      Time Reduction   Space Reduction   GC Runs Reduction   GC Time Reduction
montecarlo   12.1%            98.7%             70.0%               89.2%
raytracer    19.1%            1.2%              33.3%               30.2%
euler        20.5%            0.4%              40.0%               44.8%
bloat        13.1%            12.6%             -7.3%               -4.0%
xalan        5.2%             0.1%              -0.7%               -1.1%

False Positives


Program      D-Cachetor   M-Cachetor
montecarlo   2            6
raytracer    3            4
euler        1            7
bloat        1            4
xalan        4            5

Number of false positives identified among the top 20 items in the reports of D-Cachetor and M-Cachetor.

False Positives Sources

• Handling of floating point values

• Context-sensitive reporting

• Missing the actual values

• Hashing-induced false positives


Conclusions

• Cachetor: a novel tool that supports detection of cacheable data to improve performance

• Scalable combination of value profiling and dynamic slicing

• 3 detectors that can detect cacheable:

o Instructions

o Data structures

o Method calls

• Large optimization opportunities can be found from Cachetor’s reports


THANK YOU!

Questions - Comments?


What happened in montecarlo?

public void runSerial() {
    results = new Vector(nRunsMC);
    // Now do the computation.
    PriceStock ps;
    for (int iRun = 0; iRun < nRunsMC; iRun++) {
        ps = new PriceStock();
        ps.setInitAllTasks(initAllTasks);
        ps.setTask(tasks.elementAt(iRun));
        ps.run();
        results.addElement(ps.getResult());
    }
}

{Calculate the result on the fly}

private void processSerial() {
    processResults();
}

ps.setTask(iRun, (long) iRun * 11);

private void initTasks(int nRunsMC) {
    tasks = new Vector(nRunsMC);
    for (int i = 0; i < nRunsMC; i++) {
        String header = "MC run " + String.valueOf(i);
        ToTask task = new ToTask(header, (long) i * 11);
        tasks.addElement((Object) task);
    }
}
