Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications UC Irvine USA Khanh...


Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications

UC Irvine, USA

Khanh Nguyen, Kai Wang, Yingyi Bu,

Lu Fang, Jianfei Hu, Harry Xu


BIG DATA

Scalability
◦ JVM crashes due to OutOfMemory error at an early stage

Management cost
◦ GC time accounts for up to 50% of the execution time [Bu et al. ISMM ’13]

High cost of the managed runtime is a fundamental problem!


Golden rule for scalability

The number of heap objects and references must not grow proportionally with the cardinality of the dataset

Facade

Non-intrusive technique

Operate at compiler level

Much more general and practical

Semi-automatic

Statically bound the number of data objects in the heap


Facade execution model

Data representation
◦ Use the off-heap, native memory to store unbounded data items

Data manipulation
◦ Create heap objects only for control purposes
◦ Bounded object pooling

Many to One


Benefits from Facade

Significantly reduced GC time

Reduced memory consumption

Reduced memory access costs

Reduced execution time

Improved scalability


Static bound of data objects

Orig.: O(s)
• s : cardinality of the data set

Facade: O(t*n+p)
• t : number of threads
• n : number of data types
• p : number of pages

Number of data objects (GraphChi, OSDI ’12)
Orig.: 14,257,280,923
Facade: 1,363


Data representation

Memory address is used as object reference (pointer) = pageRef

[Figure: a record in native memory with fields id, students, name]
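A minimal sketch of this representation, assuming a single page backed by a direct `ByteBuffer` and a byte offset standing in for the memory address. `NativePage`, the field offsets, and the 24-byte Professor layout are hypothetical names and numbers chosen for illustration; they are not Facade's actual record layout or API.

```java
import java.nio.ByteBuffer;

// Hypothetical single-page native store; not Facade's actual runtime API.
final class NativePage {
    // Assumed layout of a Professor record (24 bytes):
    // [0..3] type ID, [4..7] id, [8..15] students ref, [16..23] name ref.
    static final int TYPE_ID_OFFSET = 0;
    static final int ID_OFFSET = 4;
    static final int STUDENTS_OFFSET = 8;
    static final int NAME_OFFSET = 16;
    static final int PROFESSOR_SIZE = 24;

    private final ByteBuffer page; // off-heap memory: its contents are not traced by the GC
    private int next = 8;          // offset 0 is reserved so that 0 can mean "null"

    NativePage(int capacityBytes) {
        page = ByteBuffer.allocateDirect(capacityBytes);
    }

    // Bump-allocates a record, writes its type ID, and returns its pageRef.
    long allocate(int size, int typeId) {
        if (next + size > page.capacity()) {
            throw new OutOfMemoryError("page full");
        }
        long ref = next;
        next += size;
        page.putInt((int) ref + TYPE_ID_OFFSET, typeId);
        return ref;
    }

    // Field accessors take a pageRef plus a field offset instead of an object.
    int getInt(long ref, int offset) { return page.getInt((int) ref + offset); }
    void putInt(long ref, int offset, int value) { page.putInt((int) ref + offset, value); }
    long getLong(long ref, int offset) { return page.getLong((int) ref + offset); }
    void putLong(long ref, int offset, long value) { page.putLong((int) ref + offset, value); }
}
```

With this layout, `page.putInt(pRef, NativePage.ID_OFFSET, 42)` plays the role of `p.id = 42`, and the program holds a `long` where it used to hold a `Professor` reference.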


Data manipulation

Object references are substituted by pageRef

Objects are created as facades for control purposes

Orig.: Professor p = f;
Facade: long pRef = fRef;

D: user-specified data class
DF: Facade class
D to DF: automatic


p.addStudent(s);

ProfessorFacade pf = professorPool[0];
StudentFacade sf = studentPool[0];
pf.pageRef = pRef;
sf.pageRef = sRef;
pf.addStudent(sf);

Have only pRef and sRef

void addStudent(StudentFacade sf) {
    long thisRef = this.pageRef;
    long sRef = sf.pageRef;
    //...
}
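The transformed call can be fleshed out as the sketch below. It assumes a hypothetical off-heap store (`OffHeap`) and a made-up Professor layout (a count followed by up to four student references); only `pageRef`, `professorPool`, `studentPool`, and `addStudent` come from the slide. It is single-threaded for brevity.

```java
import java.nio.ByteBuffer;

// Hypothetical off-heap store shared by the facades below.
final class OffHeap {
    static final ByteBuffer MEM = ByteBuffer.allocateDirect(4096);
    private static int next = 8; // offset 0 is reserved as the null reference

    static long alloc(int size) {
        long ref = next;
        next += size;
        return ref;
    }
}

final class StudentFacade {
    long pageRef; // the only state a facade carries
}

final class ProfessorFacade {
    // Assumed layout: [0..3] student count, then MAX_STUDENTS 8-byte student refs.
    static final int MAX_STUDENTS = 4;
    static final int SIZE = 4 + 8 * MAX_STUDENTS;
    long pageRef;

    void addStudent(StudentFacade sf) {
        long thisRef = this.pageRef;
        long sRef = sf.pageRef;
        int count = OffHeap.MEM.getInt((int) thisRef);
        if (count == MAX_STUDENTS) {
            throw new IllegalStateException("record full");
        }
        // Store the student's pageRef in the next free slot of the record.
        OffHeap.MEM.putLong((int) thisRef + 4 + 8 * count, sRef);
        OffHeap.MEM.putInt((int) thisRef, count + 1);
    }

    int studentCount() { return OffHeap.MEM.getInt((int) pageRef); }
    long studentAt(int i) { return OffHeap.MEM.getLong((int) pageRef + 4 + 8 * i); }
}

final class TransformedCaller {
    static final ProfessorFacade[] professorPool = { new ProfessorFacade() };
    static final StudentFacade[] studentPool = { new StudentFacade() };

    // Transformed form of p.addStudent(s): the caller holds only pRef and sRef.
    static void addStudent(long pRef, long sRef) {
        ProfessorFacade pf = professorPool[0];
        StudentFacade sf = studentPool[0];
        pf.pageRef = pRef;
        sf.pageRef = sRef;
        pf.addStudent(sf);
    }
}
```

However many professors and students exist off-heap, the heap holds one `ProfessorFacade` and one `StudentFacade`; each call rebinds them to the records it operates on.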


Orig.

p.addStudents(s1, s2, s3, s4, s5)

Facade

pf = professorPool[0];
sf1 = studentPool[0];
sf2 = studentPool[1];
sf3 = studentPool[2];
sf4 = studentPool[3];
sf5 = studentPool[4];
pf.addStudents(sf1, sf2, sf3, sf4, sf5)

Pools are statically created; bounded by the max # operands of type Professor/Student
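The bound on pool size can be illustrated with a short sketch. `StudentFacadePool`, `PooledStudentFacade`, and `bind` are hypothetical names, and the maximum operand count is passed in by hand here, whereas the slides indicate that it is determined statically.

```java
// Hypothetical facade type; in Facade such classes are generated by the compiler.
final class PooledStudentFacade {
    long pageRef;
}

final class StudentFacadePool {
    private final PooledStudentFacade[] pool;

    // maxOperands: the largest number of Student operands any call site needs,
    // e.g. 5 for addStudents(s1, s2, s3, s4, s5). Assumed to be known up front.
    StudentFacadePool(int maxOperands) {
        pool = new PooledStudentFacade[maxOperands];
        for (int i = 0; i < maxOperands; i++) {
            pool[i] = new PooledStudentFacade();
        }
    }

    // Rebinds the facade reserved for the given operand position to a record.
    // No heap allocation happens here, however many records are processed.
    PooledStudentFacade bind(int operandIndex, long pageRef) {
        PooledStudentFacade facade = pool[operandIndex];
        facade.pageRef = pageRef;
        return facade;
    }

    int size() {
        return pool.length;
    }
}
```

Binding a million different records to operand position 0 returns the same facade object a million times, which is why the number of heap objects depends on the operand count rather than on the dataset size.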


Challenge 1

Dynamic dispatch
◦ Use type ID in the record’s header
◦ Parameter facade pool
◦ Separated receiver facade pool

p.addStudent(s);

ProfessorFacade pf = FacadeRuntime.resolve(pRef);
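One way such a resolve step could work is sketched below, assuming the type ID occupies the first four bytes of every record and that one receiver facade per type is kept in an array indexed by type ID. `DispatchRuntime`, `PersonFacade`, `ProfFacade`, and the type IDs are hypothetical; the slide names only `FacadeRuntime.resolve`, whose real implementation may differ.

```java
import java.nio.ByteBuffer;

class PersonFacade {
    long pageRef;
    String describe() { return "person"; }
}

class ProfFacade extends PersonFacade {
    @Override
    String describe() { return "professor"; }
}

// Hypothetical stand-in for the runtime's resolve step.
final class DispatchRuntime {
    static final int PERSON_TYPE = 0;
    static final int PROFESSOR_TYPE = 1;

    private static final ByteBuffer MEM = ByteBuffer.allocateDirect(1024);
    // Receiver facade pool: one facade per type, indexed by type ID.
    private static final PersonFacade[] RECEIVERS = { new PersonFacade(), new ProfFacade() };
    private static int next = 8; // offset 0 is reserved as the null reference

    // Allocates a 16-byte record and stores its type ID in the header.
    static long allocate(int typeId) {
        long ref = next;
        next += 16;
        MEM.putInt((int) ref, typeId);
        return ref;
    }

    // Reads the type ID from the record header and returns the receiver
    // facade of the matching class, bound to the record.
    static PersonFacade resolve(long ref) {
        int typeId = MEM.getInt((int) ref);
        PersonFacade receiver = RECEIVERS[typeId];
        receiver.pageRef = ref;
        return receiver;
    }
}
```

Calling `DispatchRuntime.resolve(ref).describe()` then goes through ordinary JVM virtual dispatch on the facade, so the overriding method runs for a professor record even though the caller only holds a `long`.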


Challenge 2

Concurrency:
◦ Thread-local facade pooling
◦ Global lock pool to support object locks

Object locks

Orig.:
enterMonitor(o);
…
exitMonitor(o);

Facade:
Get a free lock l from the lock pool;
Write l into the lock field of oRef
l.compareAndInc();
enterMonitor(l);
…
exitMonitor(l);
if (l.compareAndDec() == 0) {
    Write 0 into the lock field of oRef
    return l to the pool;
}
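The lock-pool protocol can be approximated with the simplified sketch below. It swaps in a few substitutes that are not from the slides: a `HashMap` stands in for the lock field in each record's header, `ReentrantLock` stands in for the monitor, and a plain counter guarded by `synchronized` replaces `compareAndInc`/`compareAndDec`. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

final class LockPool {
    private static final class PooledLock {
        final ReentrantLock lock = new ReentrantLock();
        int users; // threads holding or waiting; guarded by the LockPool monitor
    }

    private final ArrayDeque<PooledLock> free = new ArrayDeque<>();
    // Stand-in for the lock field stored in each record's header.
    private final Map<Long, PooledLock> lockField = new HashMap<>();

    LockPool(int size) {
        for (int i = 0; i < size; i++) {
            free.push(new PooledLock());
        }
    }

    void enterMonitor(long oRef) {
        PooledLock l;
        synchronized (this) {
            l = lockField.get(oRef);
            if (l == null) {
                l = free.pop(); // throws NoSuchElementException if the pool is exhausted
                lockField.put(oRef, l);
            }
            l.users++;
        }
        l.lock.lock(); // block outside the bookkeeping monitor
    }

    void exitMonitor(long oRef) {
        PooledLock l;
        synchronized (this) {
            l = lockField.get(oRef);
        }
        if (l == null) {
            throw new IllegalMonitorStateException("record is not locked");
        }
        l.lock.unlock();
        synchronized (this) {
            if (--l.users == 0) {
                // Last user: clear the lock field and return the lock to the pool.
                lockField.remove(oRef);
                free.push(l);
            }
        }
    }

    synchronized int freeLocks() {
        return free.size();
    }
}
```

In this sketch the pool only needs as many locks as there are records locked at the same time, which is the property that keeps the number of lock objects independent of the number of records.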


Memory management

Allocation
◦ High-performance parallel allocator
  ◦ Thread-local managers
  ◦ Uses different size classes

Insights:
◦ Data-processing functions are iteration-based
◦ Each iteration processes a distinct data partition
◦ Data objects in each iteration have disjoint lifetimes

Deallocation
◦ Use a user-provided pair of calls to recycle pages: iteration_start() and iteration_end()
◦ Iterations are well-defined: it took us only a few minutes to find iterations and insert callbacks in GraphChi
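Iteration-based recycling can be sketched as follows. This is a single-threaded illustration with one fixed page size; it leaves out the thread-local managers and size classes mentioned above. `IterationPageManager`, the 4096-byte page size, and the encoding of a reference as page index plus offset are assumptions, and `iterationStart()`/`iterationEnd()` are Java-style stand-ins for the `iteration_start()`/`iteration_end()` callbacks.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class IterationPageManager {
    static final int PAGE_SIZE = 4096; // hypothetical page size

    private final List<ByteBuffer> pages = new ArrayList<>();
    private int usedPages;          // pages handed out in the current iteration
    private int offset = PAGE_SIZE; // bump pointer; PAGE_SIZE forces a fresh page

    void iterationStart() {
        usedPages = 0;
        offset = PAGE_SIZE;
    }

    // Bump-allocates size bytes and returns a reference that encodes
    // the page index in the high 32 bits and the offset in the low 32 bits.
    long allocate(int size) {
        if (size > PAGE_SIZE) {
            throw new IllegalArgumentException("oversized records are not handled here");
        }
        if (offset + size > PAGE_SIZE) {
            if (usedPages == pages.size()) {
                pages.add(ByteBuffer.allocateDirect(PAGE_SIZE));
            }
            usedPages++;
            offset = 0;
        }
        long ref = ((long) (usedPages - 1) << 32) | offset;
        offset += size;
        return ref;
    }

    // Recycles every page used in this iteration in one step; no per-record
    // free is needed because records do not outlive their iteration.
    // Recycled pages are not zeroed, so old bytes remain until overwritten.
    void iterationEnd() {
        usedPages = 0;
        offset = PAGE_SIZE;
    }

    void putInt(long ref, int value) { pages.get((int) (ref >>> 32)).putInt((int) ref, value); }
    int getInt(long ref) { return pages.get((int) (ref >>> 32)).getInt((int) ref); }
    int pagesCreated() { return pages.size(); }
}
```

Running several iterations that each allocate the same amount of data creates pages only during the first iteration; later iterations reuse them, so native memory stays bounded by the largest iteration instead of growing with the whole dataset.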


Other supports

Optimizations:
◦ Object inlining for records whose size is known statically
◦ Oversized pages for large arrays
◦ Type specialization
◦ …

Supports most of Java 7 features

Details can be found in the paper


Experiments

GraphChi [Kyrola et al. OSDI ’12]
◦ High-performance graph analytical framework for a single machine

Hyracks [Borkar et al. ICDE ’11]
◦ Data parallel platform to run data-intensive jobs on a cluster of shared-nothing machines

GPS [Salihoglu and Widom SSDBM ’13]
◦ Distributed graph processing system for large graphs

3 frameworks, 7 applications


GraphChi

[Figure: two bar charts, Connected Component and Page Rank; bars for total time, update time, load time, GC time, and memory; series 4G, 6G, 8G; y-axis from 0 to 1.4. Annotations: 6.4x reduction and 36.7% improv. on the first chart; 4x reduction and 5.8% improv. on the second.]


GraphChi - Throughput

[Figure: throughput (edges/sec, x 10^5) against number of edges (x 10^8); series PR, CC, PR', CC'; legend: Original, Facade]

1.4x improvement


Hyracks

External Sort
[Figure: total time, GC time, and memory for the 3G, 5G, 10G, 14G, and 19G sets; y-axis from 0 to 1.4]
[Figure: memory usage (GB), Original vs. Facade, for the 3G, 5G, 10G, 14G, and 19G sets]
◦ 31x reduction in GC time
◦ 32% reduction in mem. consumption

Word Count
◦ The original program crashed on all of these sets, thus no figure


GPS

Page Rank, KMeans & Random Walk
◦ Reduction in largest graph: 120M vertices, 1.7B edges

[Figure: two bar charts of reduction (%) for PageRank, KMeans, and RandomWalk; first chart: 17.3, 13.5, 10.9; second chart: 30.8, 23.1, 32.1]


GPS

Page Rank, KMeans & Random Walk
◦ Average cumulative reduction

[Figure: two bar charts of reduction (%) for PageRank, KMeans, and RandomWalk; first chart: 2.75, 3.43, 4.47; second chart: 15.63, 8.87, 30.31]


Results

GraphChi (Page Rank & Connected Components)
◦ Up to 6.4x reduction in GC time
◦ Up to 28% reduction in memory usage (6+GB datasets)
◦ Up to 48% reduction in execution time
◦ 1.4x improvement in throughput

Hyracks (Word Count & External Sort)
◦ 3.8x improvement in scalability
◦ Up to 88x reduction in GC time
◦ Up to 32% reduction in memory usage
◦ Up to 10% reduction in execution time

GPS (Page Rank, KMeans & Random Walk)
◦ Up to 40% reduction in GC time
◦ Up to 15% reduction in execution time

Max reduction
◦ GC time: 88x
◦ Execution time: 48%
◦ Memory usage: 32%


Conclusion

Facade is a complete package:
◦ Compiler: automatically transforms existing programs
◦ Runtime system: runs on top of the JVM, i.e., no modification of the JVM

Thank you!