Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications UC Irvine USA Khanh...

Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications

UC IrvineUSA

Khanh Nguyen, Kai Wang, Yingyi Bu,

Lu Fang, Jianfei Hu, Harry Xu

BIG DATA

Scalability ◦JVM crashes

due to OutOfMemory error at early stage

Management cost◦GC time accounts

for up to 50% of the execution time

[Bu et al. ISMM ’13]

High cost of the managed runtime is a fundamental problem!

Golden rule for scalabilityThe number of heap objects and

references must not grow

proportionally with the cardinality of the dataset

FacadeNon-intrusive technique

Operate at compiler level

Much more general and practical

Semi-automatic

Statically bound the number of data objects in the heap

Facade execution model

Use the off-heap, native memory to store unbounded data items

Data Represent

Data Manipulati

onCreate heap objects only for control purposes ◦Bounded object pooling

Many to One

Benefits from Facade

Significantly reduced GC time

Reduced memory consumption

Reduced memory access costs

Reduced execution time

Improved scalability

Static bound of data objects

• s : cardinality of the data set

• t : number of threads• n : number of data types• p : number of pages

O(t*n+p)

14,257,280,9231,363

Facade(GraphChi OSDI

‘12)

Data representation

Memory address is used as object reference (pointer)= pageRef

Native memory

students

Data manipulation

Object references are substituted by pageRef

Objects are created as facades for control purposes

Professor p = f; long pRef = fRef;

Duser-specified data class

DFFacade class

automatic

p.addStudent(s);

ProfessorFacade pf = professorPool[0]; StudentFacade sf = studentPool[0];

pf.pageRef = pRef; sf.pageRef = sRef; pf.addStudent(sf);

Have only pRef and sRef

void addStudent(StudentFacade sf) { long thisRef = this.pageRef; long sRef = sf.pageRef; //...}

p.addStudents (s1,s2,s3,s4,s5)

pf = professorPool[0];

sf1 = studentPool[0];sf2 = studentPool[1];sf3 = studentPool[2];sf4 = studentPool[3];sf5 = studentPool[4];

pf.addStudents (sf1,sf2,sf3,sf4,sf5)

Facade

statically created; bounded by the max # operands of type Professor/Student

Challenge 1

Dynamic dispatch◦Use type ID in the record’s header

◦Parameter facade pool◦Separated receiver facade pool

p.addStudent(s);

ProfessorFacade pf = FacadeRuntime.resolve(pRef);

Challenge 2Concurrency:

◦Thread-local facade pooling◦Global lock pool to support object

enterMonitor(o); … exitMonitor(o);

Get a free lock l from the lock pool;

Write l into the lock field of oRef

l.compareAndInc();enterMonitor(l);…exitMonitor(l);if(l.compareAndDec()

== 0){ Write 0 into the lock

field of oRef return l to the pool;}

Object locks

Memory managementAllocation

◦ High-performance parallel allocator Thread-local managers Uses different size classes

Insights:◦ Data-processing functions are iteration-

based◦ Each iteration processes distinct data

partition◦ Data objects in each iteration have disjoint

lifetime

Deallocation◦ Use a user-provided pair of calls to recycle

pages: iteration_start() && iteration_end()

◦ Iterations are well-defined --- it took us only a few minutes to find iterations and insert callbacks in GraphChi

Other supports

Optimizations:◦Object inlining for records whose size is known statically

◦Oversized pages for large arrays◦Type specialization◦…

Support most of Java 7 featuresDetails can be found in the paper

Experiments

GraphChi [Kyrola et al. OSDI’12]

◦High-performance graph analytical framework for a single machine

Hyracks [Borkar et al. ICDE’11]

◦Data parallel platform to run data-intensive jobs on a cluster of shared-nothing machines

GPS [Salihoglu and Widom SSDBM’13]

◦Distributed graph processing system for large graphs

3 frameworks, 7 applications

GraphChi

Total time Update time

Load time GC time Memory0

0.20.40.60.8

11.21.4

4G6G8G

6.4x reduction

improv.

Total time Update time

Load time GC time Memory0

4G6G8G

4x reductio

improv.

Connected Component

Page Rank

GraphChi - Throughput

1 3 5 7 9 11 13 15 171

PR CC PR' CC'

Number of edges x 108

Original

Facade1.4x

improvement

3G 5G 10G 14G 19G0

Total timeGC timeMemory

Hyracks31x reduction in GC

timeExternal Sort

3G 5G 10G 14G 19G0

Original Facade

Word CountThe original program

crashed in all of these sets thus no figure

32% reduction in mem. consumption

GPSPage Rank, KMeans & Random Walk

◦ Reduction in largest graph: 120M vertices, 1.7B edges

PageRank KMeans RandomWalk0

17.313.5

PageRank KMeans RandomWalk

GPSPage Rank, KMeans & Random Walk

◦ Average cumulative reduction

PageRank KMeans RandomWalk0

ResultsGraphChi (Page Rank & Connected Components)

◦ Up to 6.4x reduction in GC time ◦ Up to 28% reduction in memory usage (6+GB

datasets)

◦ Up to 48% reduction in execution time◦ 1.4x improvement in throughput

Hyracks (Word Count & External Sort)◦ 3.8x improvement in scalability◦ Up to 88x reduction in GC time◦ Up to 32% reduction in memory usage◦ Up to 10% reduction in execution time

GPS (Page Rank, KMeans & Random Walk) ◦ Up to 40% reduction in GC time◦ Up to 15% reduction in execution time

Max reductionGC time: 88x

Execution time: 48%Memory usage: 32%

ConclusionFacade is a complete package:◦Compiler: automatically transform existing programs

◦Runtime system: run on top of JVM, i.e., no modification of JVMThank

Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications UC Irvine USA Khanh...

Documents

Transcript of Facade: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications UC Irvine USA Khanh...

Khanh-Nguyen - Gearman - distributed process solution

Document of The World Bank · IEG wishes to thank Ms. Khanh Linh Thi Le, Ms. Hoa Chau Nguyen, Ms. Nuong Dieu Nguyen, Ms. Dung Thi Thuy Dao, Ms. Phuong Minh Le, and Ms. Khai Hoan Nguyễn

Portfolio Khanh Ha

Self-Assembly-Driven Nematizationsciortif/PDF/2014/la500127n.pdf · Self-Assembly-Driven Nematization Khanh Thuy Nguyen,† Francesco Sciortino,†,‡ and Cristiano De Michele*,†

ONE HEALTH INTERVENTIONSOne+Health+Interventions+-+An...David Mutonga, P&R, East Africa Liz Mumford, World Health Organization, Geneva Nguyen Cong Khanh, Ministry of Health, Vietnam

Speculative Region-based Memory Management for Big Data Systems Khanh Nguyen, Lu Fang, Harry Xu, Brian Demsky Donald Bren School of Information and Computer.

Plane Design: Names: Noble Wu, Bobby Walsh, Arda Selekler, Khanh Nguyen, Mulu, Weird Dave.

Electronic Supporting InformationS1 Electronic Supporting Information Co-Delivery of Nitric Oxide and Antibiotic using Polymeric Nanoparticles Thuy-Khanh Nguyen,a Ramona Selvanayagam,a

Interface selection and flow/interface association ... Nguyen Thanh Loc, Mac Quy Khanh, Tran Thi Bich Thuy, Le Hong Phuong, Dau Huy Ngoc, Nguyen Van Cuong, Pham Duy Minh, Dr. Lulu,

F : A Compiler and Runtime for (Almost) Object …guoqingx/papers/nguyen-asplos15.pdfFACADE: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications Khanh Nguyen Kai

Activity 1.6 Factors affecting mango quality...Nguyen Van Phong, Nguyen Khanh Ngoc, Peter Johnson SOFRI 15 October 2019 Improving smallholder farmer incomes through strategic market

BUDDHIST APPROACH TO ECONOMIC SUSTAINABLE ...331 BUDDHIST APPROACH TO ECONOMIC SUSTAINABLE DEVELOPMENT by Nguyen Ngoc Duy Khanh* ABSTRACT Production and consumption are two decisive

The Institutional Foundations of China's Market Transition by Yingyi

CACHETOR Detecting Cacheable Data to Remove Bloat Khanh Nguyen Guoqing Xu UC Irvine USA.

Research Article Linear Approximation and Asymptotic ...downloads.hindawi.com/journals/mpe/2016/8031638.pdfUniversity of Khanh Hoa, Nguyen Chanh Str., Nha Trang City, Vietnam Department

PHOTO: VU BAO KHANH Farmers’ Gourmet...6 Ms Nguyen Thi Huong and her family – Moc Chau district, Son La provinceMs Nguyen Thi Luyen and her family – Moc Chau district, Son La

Current situation of cassava in Vietnam and the … situation of... · Web viewbreeding of improved cultivars Hoang Kim1, Nguyen Van Bo2, Nguyen Phuong1, Hoang Long 3, Tran Cong Khanh

Econ 322 Group Project BY KHANH NGUYEN AND LIAM JOUETT.

Privacy Protection in Social Networks Instructor: Assoc. Prof. Dr. DANG Tran Khanh Present : Bui Tien Duc Lam Van Dai Nguyen Viet Dang.

SIMULATION OSCE (FACEM) – DOUBLE STATION · PDF file Khanh Nguyen Scenario 3 SIMULATION OSCE (FACEM) – DOUBLE STATION You are the consultant in a