1. RIOT: I/O-Efficient Numerical Computing in Yi Zhang Herodotos Herodotou Jun Yang.


RIOT: I/O-Efficient Numerical Computing in

Yi Zhang Herodotos Herodotou Jun Yang


What is R?

• R: an open-source language/environment
  – Statistical computing, graphics
  – Comprehensive R Archive Network
    • 1639 packages as of Dec 08
  – Interpretive execution
  – High-level constructs
    • Arrays, matrices
    • Code example:
        a <- 1:100
        …
        d <- a+b^2+c
• Common to languages for numerical/statistical computing


Big-Data Challenge

• R assumes all data in main memory
  – If not, VM starts swapping data from/to disk
  – Excessive I/O, poor performance
  – Example:
      # n points with coordinates stored in x[1:n], y[1:n]
      d <- sqrt((x-xs)^2+(y-ys)^2)+sqrt((x-xe)^2+(y-ye)^2)
      s <- sample(n, 100)  # draw 100 samples from 1:n
      z <- d[s]            # extract elements of d whose indices are in s

[Figure: points (x, y) with start S(xs,ys) and end E(xe,ye); x, y and the intermediate results x-xs, (x-xs)^2, …, y-ye fill memory and spill to the swap/paging file]


Opportunities

• Avoiding intermediate results
  – Multiple large intermediate results are generated
  – Can we avoid them without hand-coding loops?
    • for (i in 1:n) { d[i] <- sqrt((x[i]-xs)^2+…)+… }
• Deferred and selective evaluation
  – Each expression is evaluated in full immediately
  – Can we defer evaluation until really necessary?
    • Just compute the 100 elements from d picked by s
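The selective-evaluation opportunity can be illustrated by hand on the distance example from the Big-Data Challenge slide. The sketch below is not RIOT; it only contrasts plain R's full evaluation with a manual rewrite that computes just the 100 sampled elements. The value of n, the seed, and the S/E coordinates are made-up assumptions.

```r
# Illustrative data; n, the seed, and the S/E coordinates are assumptions.
set.seed(42)
n <- 100000
x <- runif(n); y <- runif(n)
xs <- 0; ys <- 0; xe <- 1; ye <- 1

s <- sample(n, 100)   # 100 sample indices from 1:n

# Plain R: every sub-expression becomes a length-n temporary, then d is indexed.
d <- sqrt((x - xs)^2 + (y - ys)^2) + sqrt((x - xe)^2 + (y - ye)^2)
z_full <- d[s]

# Hand-optimized: push the selection down to x and y first,
# so every temporary has only 100 elements.
x_s <- x[s]; y_s <- y[s]
z_sel <- sqrt((x_s - xs)^2 + (y_s - ys)^2) + sqrt((x_s - xe)^2 + (y_s - ye)^2)

# Both routes give the same 100 values.
stopifnot(isTRUE(all.equal(z_full, z_sel)))
```

The rewrite is exactly the kind of hand-tuning the slide calls tedious; the point of deferring evaluation is to obtain the second form automatically from the first.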


Existing Solutions

• Rewrite and hand-optimize code
  – Tedious, not quite reusable
• Use I/O-efficient libraries
  – SOLAR [Toledo’96], DRA [Nieplocha’96], etc.
  – But efficient individual operations are not enough
• Build/extend a DB
  – RasDaMan [Baumann’99], AML [Marathe’02], ASAP [Stonebraker’07], …
  – Must rewrite using a new language (often SQL)
  – Explicit boundary between DB and host language



R with I/O Transparency

• Attain I/O efficiency without explicit user intervention
• Run legacy code with no or minimal modification
• No need to learn new languages/libraries
• No boundary between host language and backend processing


RIOT

• Implemented as an R package
  – New types, same interfaces: dbvector, dbmatrix, …
  – Uses R’s generics mechanism for transparency

1. New class definition:
     setClass("dbvector", representation(size="numeric", …))
2. Method overloading:
     setMethod("+", signature(e1="dbvector", e2="dbvector"),
               function(e1, e2) { .Call("add_dbvectors", e1, e2) })
3. Implementation:
     SEXP add_dbvectors(SEXP e1, SEXP e2) { … }
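The class-plus-generic pattern above can be tried out without any C code. The following is a minimal sketch, assuming a toy dbvector that keeps its values in an in-memory `data` slot (a hypothetical slot added here for illustration) and a pure-R method body in place of the slide's `.Call("add_dbvectors", …)`.

```r
library(methods)

# Toy stand-in for RIOT's dbvector. The `data` slot is a hypothetical
# addition; the real class presumably refers to storage outside R.
setClass("dbvector", representation(size = "numeric", data = "numeric"))

# Overload "+" for two dbvectors. The slide dispatches to C via .Call();
# here the body is plain R so the sketch is self-contained.
setMethod("+", signature(e1 = "dbvector", e2 = "dbvector"),
  function(e1, e2) {
    stopifnot(e1@size == e2@size)
    new("dbvector", size = e1@size, data = e1@data + e2@data)
  })

a <- new("dbvector", size = 3, data = c(1, 2, 3))
b <- new("dbvector", size = 3, data = c(10, 20, 30))
s <- a + b   # user code looks exactly like ordinary vector addition
```

Because dispatch happens on the argument classes, existing code that writes `a + b` needs no change; only the objects it is handed differ.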


RIOT-DB: Hidden DB Backend

• A strawman solution: map large arrays to DB tables
  – E.g. vector: V(i,v); matrix: M(i,j,v)
  – Computation → query:
      a+b → SELECT A.I, A.V+B.V FROM A,B WHERE A.I=B.I
  – Leverages power of DB only at intra-operation level!
• Key: translate operations to view definitions
  – Build up larger and larger views a step at a time
  – Evaluate only when needed → deferred evaluation
  – Query optimization → selective evaluation + more
  – Iterator-style execution → no intermediate results

Example:
  d <- sqrt((x-xs)^2+(y-ys)^2)+…
      CREATE VIEW T1(I,V) AS SELECT X.I, X.V-xs FROM X;
      CREATE VIEW T2(I,V) AS SELECT T1.I, POW(T1.V,2) FROM T1;
      …
      CREATE VIEW D(I,V) AS SELECT T6.I, T6.V+T12.V FROM T6,T12 WHERE T6.I=T12.I;
  …
  z <- d[s]
      CREATE VIEW Z(I,V) AS SELECT S.I, D.V FROM D,S WHERE D.I=S.V;

  Query executed once z is needed (after view expansion and optimization):
      SELECT S.I, SQRT(POW(X.V-xs,2)+POW(Y.V-ys,2)) + SQRT(POW(X.V-xe,2)+POW(Y.V-ye,2))
      FROM X,Y,S WHERE X.I=Y.I AND X.I=S.V
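The view mechanism has a rough analogue in plain R: record each operation as a closure and compute elements only when they are indexed. The sketch below is an illustration of deferred and selective evaluation under that assumption, not RIOT-DB's implementation; `lazyvec` and `as_lazy` are hypothetical names, and no database is involved.

```r
# Hypothetical lazy vector: n is the length, f(i) computes the elements at indices i.
lazyvec <- function(n, f) structure(list(n = n, f = f), class = "lazyvec")

# Wrap a plain numeric vector or a scalar constant (scalars are recycled).
as_lazy <- function(v) {
  if (inherits(v, "lazyvec")) return(v)
  if (length(v) == 1L) return(lazyvec(1L, function(i) rep(v, length(i))))
  lazyvec(length(v), function(i) v[i])
}

# Binary arithmetic (+, -, ^, ...) composes closures instead of computing,
# much like stacking one view definition on top of another.
Ops.lazyvec <- function(e1, e2) {
  op <- match.fun(.Generic)
  a <- as_lazy(e1); b <- as_lazy(e2)
  lazyvec(max(a$n, b$n), function(i) op(a$f(i), b$f(i)))
}

# Math functions such as sqrt() are deferred the same way.
Math.lazyvec <- function(x, ...) {
  fn <- match.fun(.Generic)
  lazyvec(x$n, function(i) fn(x$f(i), ...))
}

# Indexing is the only place where elements are actually computed.
"[.lazyvec" <- function(x, i) x$f(i)

# Usage on the running example (data and coordinates are made up).
set.seed(1)
n <- 10000
x <- runif(n); y <- runif(n)
xs <- 0; ys <- 0; xe <- 1; ye <- 1
lx <- as_lazy(x); ly <- as_lazy(y)
d <- sqrt((lx - xs)^2 + (ly - ys)^2) + sqrt((lx - xe)^2 + (ly - ye)^2)  # nothing computed yet
s <- sample(n, 100)
z <- d[s]   # computes only the 100 requested elements
```

Unlike a DB optimizer, this sketch does no rewriting or cost-based planning; it only shows why deferring evaluation makes selective evaluation possible.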


RIOT-DB Demo

• RIOT-DB built using MySQL with the MyISAM engine


Performance of RIOT-DB

• Plain R
• RIOT-DB variants
  – RIOT-DB/Strawman: use DB to store arrays and execute individual ops; no use of views to defer evaluation
  – RIOT-DB/MatNamed: use views, but compute/materialize every named object
  – RIOT-DB: full version; defer/optimize across statements


Lessons Learned

• DB-style inter-operation optimization is really the key!

• Can we do better?
  – DB arrays carry too much overhead (ASAP [Stonebraker’07])
    • Extra columns in V(i, v), M(i, j, v), …; more for higher dims
  – SQL & relational algebra may not be the right abstraction
    • Advanced data layouts and complex ops are awkward

RIOT: The Next Generation
  – A new expression algebra closer to numerical computation
  – Flexible array storage/layout options
  – Optimizations better tailored for numerical computation
  – … and more


RIOT Expression Algebra

• Analogous to the view mechanism, but more flexible

• Operators
  – +, -, *, /, [, …
  – A[idxRange]<-newVals: turn updates into functional ops
    • Instead of in-place updates, log them & define Anew over (Aold,log)
  – X%*%Y (matrix multiply) etc.: built-in, for high-level opt.
    • E.g. matrix chain multiplication: (XY)Z or X(YZ)?
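Why the (XY)Z versus X(YZ) choice matters can be seen from a simple operation count. The sketch below counts scalar multiplications only, as a stand-in for cost; the function name `chain_costs` and the matrix shapes are assumptions for illustration, and RIOT's actual cost model is not given on the slide.

```r
# Multiplying a (p x q) matrix by a (q x r) matrix takes p*q*r scalar multiplications.
# Each argument is a c(rows, cols) pair for X, Y, Z respectively.
chain_costs <- function(dimX, dimY, dimZ) {
  p <- dimX[1]; q <- dimX[2]; r <- dimY[2]; s <- dimZ[2]
  stopifnot(dimY[1] == q, dimZ[1] == r)   # shapes must be conformable
  c("(XY)Z" = p * q * r + p * r * s,      # XY is p x r, then times Z
    "X(YZ)" = q * r * s + p * q * s)      # YZ is q x s, then X times it
}

# Assumed shapes: X is 1000 x 10, Y is 10 x 1000, Z is 1000 x 10.
costs <- chain_costs(c(1000, 10), c(10, 1000), c(1000, 10))
best <- names(which.min(costs))
```

With these shapes (XY)Z needs 2e7 multiplications and X(YZ) needs 2e5, so an optimizer that sees `%*%` as a built-in operator can pick the cheaper order, which it could not do if the multiplication were hidden inside opaque code.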


Processing/Layout Optimization

• Matrix multiplication T = A (n1×n2) × B (n2×n3), with fixed memory size M

R: Plain algorithm
  For each row i of A:
    For each column j of B:
      T[i,j] <- sum(A[i,] * B[,j])

BNLJ-inspired algorithm
  Read as many rows of A as possible:
    Use one block to scan B in column-major order:
      Update elements in T

Blocked algorithm
  Divide memory into 3 equal parts
  Divide each matrix into square blocks
  For each chunk (i,j) in T:
    For k=1…p:
      Read chunk (i,k) from A and chunk (k,j) from B
      chunk T(i,j) += A(i,k) %*% B(k,j)
    Write chunk T(i,j)

RIOT-DB
  Hashjoin-sort-aggregate

[Figure: A × B = T, one diagram per algorithm]

Optimal I/O cost: n1·n2·n3/(B·M^(1/2))
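The blocked algorithm can be sketched in a few lines of R. This is an in-memory illustration only: subsetting stands in for reading a chunk from disk and assignment stands in for writing it back, the chunk size `bs` is a free parameter rather than something derived from M, and `blocked_matmul` is a hypothetical name.

```r
blocked_matmul <- function(A, B, bs) {
  stopifnot(ncol(A) == nrow(B), bs >= 1)
  n1 <- nrow(A); n2 <- ncol(A); n3 <- ncol(B)
  # Split the index range 1..n into consecutive chunks of at most bs indices.
  chunks <- function(n) split(seq_len(n), ceiling(seq_len(n) / bs))
  out <- matrix(0, n1, n3)
  for (I in chunks(n1)) {
    for (J in chunks(n3)) {
      acc <- matrix(0, length(I), length(J))   # chunk T(i,j), held "in memory"
      for (K in chunks(n2)) {
        # "Read" chunk (i,k) of A and chunk (k,j) of B, then accumulate.
        acc <- acc + A[I, K, drop = FALSE] %*% B[K, J, drop = FALSE]
      }
      out[I, J] <- acc                         # "write" chunk T(i,j) once
    }
  }
  out
}

# Small check against R's built-in multiply; sizes need not be multiples of bs.
set.seed(7)
A <- matrix(rnorm(7 * 5), 7, 5)
B <- matrix(rnorm(5 * 6), 5, 6)
T_blocked <- blocked_matmul(A, B, bs = 3)
stopifnot(isTRUE(all.equal(T_blocked, A %*% B)))
```

Each output chunk is written exactly once while the matching chunks of A and B stream past it, which is what keeps all three working chunks within a fixed memory budget.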


Conclusion

• I/O efficiency can be added transparently
  – Ditch SQL at user level for broader impact!
• DB-style inter-operation optimization is critical
  – Need to go beyond developing I/O-efficient algorithms and libraries
• Integration of DB and programming languages
  – Lots of interesting analogies and new opportunities


Q&A

RIOT photos by Zack Gold (www.zackgold.com)