Elevating R to Supercomputers

Drew Schmidt
National Institute for Computational Sciences
University of Tennessee, Knoxville
November 11, 2013


The pbdR Core Team

Wei-Chen Chen (1)
George Ostrouchov (1, 2)
Pragneshkumar Patel (2)
Drew Schmidt (2)

Support: This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work also used resources of the National Institute for Computational Sciences at the University of Tennessee, Knoxville, which is supported by the Office of Cyberinfrastructure of the U.S. National Science Foundation under Award No. ARRA-NSF-OCI-0906324 for NICS-RDAV center. This work used resources of the Newton HPC Program at the University of Tennessee, Knoxville.

(1) Oak Ridge National Laboratory. Supported in part by the project “Visual Data Exploration and Analysis of Ultra-large Climate Data” funded by U.S. DOE Office of Science under Contract No. DE-AC05-00OR22725.

(2) University of Tennessee. Supported in part by the project “NICS Remote Data Analysis and Visualization Center” funded by the Office of Cyberinfrastructure of the U.S. National Science Foundation under Award No. ARRA-NSF-OCI-0906324 for NICS-RDAV center.


Contents

1 Introduction

2 Benchmarks

3 Challenges


pbdR

Why R?

1 Because.

2 The R community has a growing data-size problem.

3 The HPC community has a growing need for data analytics.


Elevating R to Supercomputers

1 Existing code.

2 Syntax.

3 Philosophy.


Programming with Big Data in R (pbdR)

Productivity, Portability, Performance

Free* R packages.

Bridging high-performance C with high-productivity R.

Distributed data details implicitly managed.

Methods have syntax identical to R.

* MPL, BSD, and GPL licensed.


pbdR Packages


pbdMPI vs Rmpi: API

Reduction Operations

Rmpi

    # int
    mpi.allreduce(x, type = 1)
    # double
    mpi.allreduce(x, type = 2)

pbdMPI

    allreduce(x)

Types in R

    > is.integer(1)
    [1] FALSE
    > is.integer(2)
    [1] FALSE
    > is.integer(1:2)
    [1] TRUE
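
The type output above suggests why an explicit type argument is easy to get wrong: numeric literals in R are doubles unless written with the L suffix. A minimal base-R sketch (no MPI needed) of inspecting the storage type at run time, which is presumably the kind of check that lets a wrapper such as allreduce(x) pick the datatype for the caller. The helper name mpi_type_code is hypothetical and belongs to neither Rmpi nor pbdMPI; it only mirrors the type codes shown above.

```r
# Numeric literals are doubles unless written with the L suffix.
typeof(1)     # "double"
typeof(1L)    # "integer"
typeof(1:2)   # "integer"

# Hypothetical helper (not part of Rmpi or pbdMPI): map the storage type of x
# to the Rmpi-style type codes used above (1 = int, 2 = double).
mpi_type_code <- function(x) {
  if (is.integer(x)) {
    1L
  } else if (is.double(x)) {
    2L
  } else {
    stop("unsupported type: ", typeof(x))
  }
}

mpi_type_code(1:2)  # 1
mpi_type_code(2)    # 2
```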


pbdMPI vs Rmpi: Performance

Table: Runtimes (seconds) for a 10,000 × 10,000 allgather with Rmpi and pbdMPI.

Cores   Rmpi   pbdMPI   Speedup
   32   24.6      6.7      3.67
   64   25.2      7.1      3.55
  128   22.3      7.2      3.10
  256   22.4      7.1      3.15
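
The Speedup column appears to be the Rmpi runtime divided by the pbdMPI runtime. A short base-R check of that reading, using only the runtimes from the table (the variable names are illustrative):

```r
# Runtimes (seconds) copied from the table above.
cores  <- c(32, 64, 128, 256)
rmpi   <- c(24.6, 25.2, 22.3, 22.4)
pbdmpi <- c(6.7, 7.1, 7.2, 7.1)

# Speedup = Rmpi runtime divided by pbdMPI runtime, rounded to 2 decimals.
speedup <- round(rmpi / pbdmpi, 2)
data.frame(cores, rmpi, pbdmpi, speedup)
```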


pbdR Example Syntax

    x <- x[-1, 2:5]
    x <- log(abs(x) + 1)
    xtx <- t(x) %*% x
    ans <- svd(solve(xtx))

Look familiar?

The above runs on 1 core with R or on 10,000 cores with pbdR.
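
As a serial point of comparison, the same four lines can be run in plain R on an ordinary matrix. This is a minimal sketch: the 100 x 6 random input and the seed are assumptions made for illustration, and in pbdR x would instead be a distributed matrix.

```r
set.seed(1)
# Small serial stand-in for the data (illustrative size only).
x <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)

x <- x[-1, 2:5]          # drop the first row, keep columns 2 to 5
x <- log(abs(x) + 1)     # elementwise transform
xtx <- t(x) %*% x        # 4 x 4 cross-product matrix
ans <- svd(solve(xtx))   # SVD of the inverse

dim(x)                   # 99 rows, 4 columns
length(ans$d)            # 4 singular values
```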


Shared and Distributed Memory Machines

Shared Memory

Direct access to read/change memory (one node).

Distributed

No direct access to read/change memory.


Shared Memory Machines

Thousands of cores

Nautilus, University of Tennessee: 1,024 cores, 4 TB RAM

Distributed Memory Machines

Hundreds of thousands of cores

Kraken, University of Tennessee: 112,896 cores, 147 TB RAM


Benchmarks

Non-Optimal Choices Throughout

1 Only libre software used (no MKL, ACML, etc.).

2 1 core = 1 MPI process.

3 No tuning for data distribution.


Benchmark Data

1 Random normal N(100, 10000).

2 Local problem size of ≈ 43.4 MiB.

3 Three sets: 500, 1000, and 2000 columns.

4 Several runs at different core counts within each set.
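
With the local size held fixed, the global problem grows in proportion to the number of cores (a weak-scaling setup). A rough base-R sketch of that arithmetic follows. It assumes 8-byte double-precision elements, and the core counts are the ones that appear on the axes of the benchmark plots in this talk; the variable names are illustrative.

```r
local_mib <- 43.4                                   # local problem size per core
cores     <- c(1, 504, 1008, 2016, 4032, 8064)      # core counts from the plots
columns   <- c(500, 1000, 2000)                     # the three benchmark sets

# Assumption: each matrix element is an 8-byte double.
local_elements <- local_mib * 2^20 / 8

# Approximate global row count for each (cores, columns) combination.
global_rows <- outer(cores, columns, function(np, p) np * local_elements / p)
dimnames(global_rows) <- list(cores = cores, columns = columns)

# Approximate global data size in GiB at each core count.
global_gib <- cores * local_mib / 1024
round(global_gib, 2)
```

Under these assumptions the global sizes land within rounding of the values printed next to the points on the benchmark plots, which suggests those labels are total data sizes in GiB.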


Covariance Code

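    # Generate an n x p distributed matrix of random normal values
    x <- ddmatrix("rnorm", nrow=n, ncol=p, mean=mean, sd=sd)

    # Covariance of the distributed matrix, same syntax as in base R
    cov.x <- cov(x)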


cov()

[Figure: Calculating cov(x) With Fixed Local Size of ~43.4 MiB. Three panels by number of predictors (500, 1000, 2000). x-axis: Cores (1, 504, 1008, 2016, 4032, 8064); y-axis: Run Time (Seconds), ticks at 4, 8, 12, 16. Points are labeled 0.04, 21.34, 42.68, 85.35, 170.7, 341.41.]


Linear Model Code

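    # Distributed n x p predictor matrix and p x 1 true coefficient vector
    x <- ddmatrix("rnorm", nrow=n, ncol=p, mean=mean, sd=sd)
    beta_true <- ddmatrix("runif", nrow=p, ncol=1)

    # Noiseless response
    y <- x %*% beta_true

    # Least-squares fit, same call as in base R
    beta_est <- lm.fit(x=x, y=y)$coefficients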
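
Because y is built without a noise term, the fit should recover beta_true up to rounding error. A small serial sketch of the same benchmark in base R follows; the sizes n and p, the seed, and the reading of N(100, 10000) as mean 100 and variance 10000 (sd = 100) are all assumptions made for illustration.

```r
set.seed(42)
n <- 1000; p <- 5      # illustrative sizes, far smaller than the benchmark
x <- matrix(rnorm(n * p, mean = 100, sd = 100), nrow = n, ncol = p)
beta_true <- matrix(runif(p), nrow = p, ncol = 1)

y <- x %*% beta_true   # noiseless response
beta_est <- lm.fit(x = x, y = y)$coefficients

# Largest absolute difference between the estimate and the true coefficients.
max(abs(beta_est - beta_true))
```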


lm.fit()

[Figure: Fitting y~x With Fixed Local Size of ~43.4 MiB. Three panels by number of predictors (500, 1000, 2000). x-axis: Cores (504, 1008, 2016, 4032, 8064, 24000); y-axis: Run Time (Seconds), ticks at 25, 50, 75, 100, 125. Points are labeled 21.34, 42.68, 85.35, 170.7, 341.41, 1016.09.]


But wait! There’s more. . .

Anything worth doing is worth overdoing.

— Mick Jagger


Data Generation

[Figure: Data generation rate. x-axis: Cores (1, 504, 1008, 2016, 4032, 8064, 24000, 48000); y-axis: Data Generation (GiB per Second), ticks at 0, 200, 400, 600, 800.]


lm.fit()

[Figure: Fitting y~x With Fixed Local Size of ~43.4 MiB, 500 predictors only. x-axis: Cores (504, 1008, 2016, 4032, 8064, 24000, 48000); y-axis: Run Time (Seconds), ticks at 25, 30, 35. Points are labeled 21.34, 42.68, 85.35, 170.7, 341.41, 1016.09, 2032.18.]


Challenges

Perceptions.
“R? Isn’t that slow?” – HPC people
“HPC? Isn’t that hard?” – R people

Package loading.

Profiling.

Data distribution and performance.


Covariance Revisited: Distributed Data Parameter Calibration

[Figure: Construct Covariance Matrix for 100,000x1000 Matrix on 64 Processors. Three panels by process layout (16x4 Cores, 8x8 Cores, 4x16 Cores). x-axis: Blocking Factor (2x2, 4x4, 8x8); y-axis: Run time (seconds), ticks at 0, 200, 400, 600.]
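
The two tuning parameters in this figure, the process grid shape and the blocking factor, together decide which process holds which piece of the matrix. The sketch below assumes a ScaLAPACK-style 2D block-cyclic layout with the first block on process (0, 0), which the slides do not state explicitly. The function block_cyclic_owner is a hypothetical helper written for illustration and is not part of any pbdR package.

```r
# Hypothetical helper: which process (row, column) in an nprow x npcol grid owns
# global element (i, j) under a 2D block-cyclic layout with blocking factor
# mb x nb. Indices i and j are 1-based; process coordinates are 0-based.
block_cyclic_owner <- function(i, j, mb, nb, nprow, npcol) {
  c(prow = ((i - 1) %/% mb) %% nprow,
    pcol = ((j - 1) %/% nb) %% npcol)
}

# 64 processes arranged as an 8 x 8 grid with a 4 x 4 blocking factor.
block_cyclic_owner(1, 1, mb = 4, nb = 4, nprow = 8, npcol = 8)    # prow 0, pcol 0
block_cyclic_owner(5, 9, mb = 4, nb = 4, nprow = 8, npcol = 8)    # prow 1, pcol 2
block_cyclic_owner(33, 1, mb = 4, nb = 4, nprow = 8, npcol = 8)   # prow 0, pcol 0 (wraps around)
```

Smaller blocks spread rows and columns more evenly across processes but increase the number of pieces each process has to manage and communicate, which is one plausible reason the run time varies with both parameters.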


Thanks for coming!

Questions?

http://r-pbd.org/
