Accelerating Spark Workloads in a Mesos Environment with Alluxio · 2017/10/27

Page 1

Accelerating Spark Workloads in a Mesos Environment with Alluxio

Gene Pang, Software Engineer, Alluxio, Inc.

* ©2017 Alluxio, Inc. All Rights Reserved

Page 2

About Me

Gene Pang

Software Engineer @ Alluxio, Inc.

Alluxio Open Source PMC Member

Ph.D. from AMPLab @ UC Berkeley

Worked at Google before UC Berkeley

Twitter: @unityxx

Github: @gpang


Page 3

Outline

1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo

Page 4

Data Ecosystem Yesterday

•  One Compute Framework
•  Single Storage System
•  Co-located

Page 5

Data Ecosystem Today

•  Many Compute Frameworks
•  Multiple Storage Systems
•  Most not co-located

Page 6

Data Ecosystem Issues

•  Each application manages multiple data sources
•  Adding/removing data sources requires application changes
•  Storage optimizations require application changes
•  Lower performance due to lack of locality

Page 7

Data Ecosystem with Alluxio

•  Apps only talk to Alluxio

•  Simple Add/Remove

•  No App Changes

•  Memory Performance

Page 8

Next Gen Analytics with Alluxio

Apps, Data & Storage at Memory Speed

✓  Big Data/IoT
✓  AI/ML
✓  Deep Learning
✓  Cloud Migration
✓  Multi Platform
✓  Autonomous

[Diagram labels, first group: Native File System; Hadoop Compatible File System; Native Key-Value Interface; Fuse Compatible File System]

[Diagram labels, second group: HDFS Interface; Amazon S3 Interface; Swift Interface; GlusterFS Interface]

Page 9

Enabling Next Gen Analytics

1. Unify your Data
2. Architecture Flexibility
3. Improved I/O Performance

Page 10

Fastest Growing Big Data Open Source Project

•  Fastest Growing open-source project in the big data ecosystem

•  Running world’s largest production clusters

•  600+ Contributors from 100+ organizations

Page 11

Outline

1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo

Page 12

Big Data Case Study –

Challenge – Gain end-to-end view of business with large volume of data for $5B travel site. Queries were slow / not interactive, resulting in operational inefficiency.

Solution – With Alluxio, 300x improvement in performance

Impact – Increased revenue from immediate response to user behavior

Use case: http://bit.ly/2pDJdrq

[Diagram labels: Spark; Flink; HDFS; Ceph; Mesos]

Page 13

Machine Learning Case Study –

Challenge – Disparate data both on-prem and cloud. Heterogeneous types of data. Scaling of exabyte-size data. Slow due to disk-based approach.

Solution – Using Alluxio to prevent I/O bottlenecks

Impact – Orders of magnitude higher performance than before.

http://bit.ly/2p18ds3

[Diagram labels: Spark; HDFS; Minio; Mesos]

Page 14

Outline

1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo

Page 15

Sharing Data via Memory

Storage Engine & Execution Engine: Same Process

•  Two copies of data in memory – double the memory used
•  Inter-process Sharing Slowed Down by Network / Disk I/O

[Diagram labels: Mesos; two Spark processes, each with Spark Compute and Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4)]

Page 16

Sharing Data via Memory

Storage Engine & Execution Engine: Different Process

•  Half the memory used
•  Inter-process Sharing Happens at Memory Speed

[Diagram labels: Mesos; two Spark processes, each with Spark Compute and Spark Storage; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4)]
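The "double the memory" versus "half the memory" comparison on the two slides above can be checked with simple arithmetic. The sketch below is illustrative only: the function name, the two-job setup, and the 128 MB block size are assumptions, not figures from the talk.

```python
def memory_used_mb(num_jobs: int, cached_blocks: int, block_mb: int, shared: bool) -> int:
    """Total MB of memory holding cached blocks across all jobs.

    shared=False: each job keeps its own in-process copy of every block.
    shared=True:  one copy lives in a separate storage process that all jobs read.
    """
    copies = 1 if shared else num_jobs
    return copies * cached_blocks * block_mb


# Hypothetical numbers: two jobs, both using block 1 and block 3, 128 MB per block.
in_process = memory_used_mb(num_jobs=2, cached_blocks=2, block_mb=128, shared=False)
external = memory_used_mb(num_jobs=2, cached_blocks=2, block_mb=128, shared=True)

print(f"same process:      {in_process} MB")
print(f"different process: {external} MB")
```

With two jobs the shared layout uses half the memory; with more jobs reading the same blocks the saving grows proportionally.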

Page 17

Data Resilience During Crash

Storage Engine & Execution Engine: Same Process

[Diagram labels: Mesos; Spark Compute; Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4)]

Page 18

Data Resilience During Crash

Storage Engine & Execution Engine: Same Process

•  Process Crash Requires Network and/or Disk I/O to Re-read Data

[Diagram labels: Mesos; CRASH; Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4)]

Page 19

Data Resilience During Crash

Storage Engine & Execution Engine: Same Process

•  Process Crash Requires Network and/or Disk I/O to Re-read Data

[Diagram labels: Mesos; CRASH; HDFS / Amazon S3 (blocks 1 to 4)]

Page 20

Data Resilience During Crash

Storage Engine & Execution Engine: Different Process

[Diagram labels: Mesos; Spark Compute; Spark Storage; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4)]

Page 21

Data Resilience During Crash

Storage Engine & Execution Engine: Different Process

•  Process Crash - Data is Re-read at Memory Speed

[Diagram labels: Mesos; CRASH; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4)]
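The crash scenario on the slides above can be modeled with a toy simulation. This is a minimal sketch, not Spark or Alluxio code: the class names (UnderStore, ExternalCache, ComputeJob) and the helper slow_reads_after_crash are hypothetical, and "slow read" simply counts accesses to the object standing in for HDFS / Amazon S3.

```python
class UnderStore:
    """Stands in for HDFS / Amazon S3: durable, but every read is slow."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0  # number of slow (network/disk) reads served

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


class ExternalCache:
    """Stands in for a separate storage process that outlives compute crashes."""

    def __init__(self, under_store):
        self.under_store = under_store
        self.memory = {}

    def read(self, block_id):
        if block_id not in self.memory:
            self.memory[block_id] = self.under_store.read(block_id)
        return self.memory[block_id]


class ComputeJob:
    """Stands in for a compute process; its in-process cache dies with it."""

    def __init__(self, source, cache_in_process):
        self.source = source
        self.cache = {} if cache_in_process else None

    def read(self, block_id):
        if self.cache is None:
            return self.source.read(block_id)
        if block_id not in self.cache:
            self.cache[block_id] = self.source.read(block_id)
        return self.cache[block_id]


def slow_reads_after_crash(different_process):
    """Count slow reads needed to re-read block 1 and block 3 after a crash."""
    under = UnderStore({1: b"block 1", 3: b"block 3"})
    source = ExternalCache(under) if different_process else under
    job = ComputeJob(source, cache_in_process=not different_process)
    for block_id in (1, 3):
        job.read(block_id)  # warm-up before the crash
    before = under.reads
    # Crash: the old process is gone; a fresh one starts with nothing in memory.
    job = ComputeJob(source, cache_in_process=not different_process)
    for block_id in (1, 3):
        job.read(block_id)
    return under.reads - before


print("same process:     ", slow_reads_after_crash(different_process=False))
print("different process:", slow_reads_after_crash(different_process=True))
```

In the same-process case both blocks must be fetched again from the slow store; in the different-process case the restarted job finds them still in the external cache.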

Page 22

Alluxio Architecture

[Diagram labels: Application; Alluxio Client; Alluxio Master; Alluxio Worker (two); Storage (two)]

Page 23

Alluxio Client

Applications interact with Alluxio via the Alluxio client
●  Native Alluxio Filesystem Client
    •  Alluxio specific operations like [un]pin, [un]mount, [un]set TTL
●  HDFS-Compatible Filesystem Client
    •  No code change necessary
●  S3 API
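With the HDFS-compatible client, "no code change" in practice means the job keeps its existing read and write calls and only the path it is given changes. The sketch below shows one way such a path rewrite could look. The helper name to_alluxio_uri and the host names are hypothetical, 19998 is assumed as the Alluxio master port, and the mapping assumes the original storage is mounted at the Alluxio root so that paths carry over unchanged.

```python
from urllib.parse import urlsplit


def to_alluxio_uri(uri: str, master_host: str, master_port: int = 19998) -> str:
    """Rewrite a storage URI so the same path is read through Alluxio.

    Assumes the original store is mounted at the Alluxio root, so only the
    scheme and authority change and the path is kept as-is.
    """
    path = urlsplit(uri).path
    return f"alluxio://{master_host}:{master_port}{path}"


# Hypothetical host names, for illustration only.
print(to_alluxio_uri("hdfs://namenode:8020/data/events.parquet", "alluxio-master"))
```

A Spark job would pass the resulting string to its usual read call (for example sc.textFile), provided the Alluxio client jar is on Spark's classpath.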

Page 24

Alluxio Master

Master is responsible for managing metadata
●  Filesystem namespace metadata
●  Blocks / workers metadata
Primary master writes journal for durable operations
●  Secondary masters replay journal entries
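The journal relationship between primary and secondary masters can be illustrated with a toy model. This is a sketch of the general write-then-replay idea only, not Alluxio's journal format or implementation; every class name, the entry layout, and the in-memory list used as the journal are assumptions.

```python
class Master:
    """Holds namespace metadata and knows how to apply one journal entry."""

    def __init__(self):
        self.namespace = {}

    def apply(self, entry):
        op, path, value = entry
        if op == "create":
            self.namespace[path] = value
        elif op == "delete":
            self.namespace.pop(path, None)


class PrimaryMaster(Master):
    """Serves mutations; each one is journaled before it is applied."""

    def __init__(self, journal):
        super().__init__()
        self.journal = journal

    def mutate(self, op, path, value=None):
        entry = (op, path, value)
        self.journal.append(entry)  # make the operation durable first
        self.apply(entry)


class SecondaryMaster(Master):
    """Rebuilds the same state by replaying entries it has not seen yet."""

    def __init__(self):
        super().__init__()
        self.applied = 0

    def catch_up(self, journal):
        for entry in journal[self.applied:]:
            self.apply(entry)
        self.applied = len(journal)


journal = []  # stands in for a shared, durable journal
primary = PrimaryMaster(journal)
secondary = SecondaryMaster()

primary.mutate("create", "/a", {"blocks": [1]})
primary.mutate("create", "/b", {"blocks": [3, 4]})
primary.mutate("delete", "/a")

secondary.catch_up(journal)
print(secondary.namespace == primary.namespace)
```

Because the secondary applies the same entries in the same order, it ends up with the same namespace and could take over without re-scanning storage.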

Page 25

Alluxio Worker

Worker is responsible for managing block data
Worker stores block data on various storage media
●  HDD, SSD, Memory
Reads and writes data to underlying storage systems
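A worker that stores blocks on several storage media can be pictured as an ordered list of tiers. The sketch below places each new block in the fastest tier that still has room. It is a deliberately simplified model: the class name, the tier names, the per-tier block counts, and the placement rule are all assumptions, and it does not reproduce Alluxio's actual allocation or eviction policies.

```python
class TieredBlockStore:
    """Toy worker-side block store with tiers ordered fastest to slowest."""

    def __init__(self, capacities):
        # capacities: list of (tier_name, max_blocks), fastest tier first
        self.capacities = list(capacities)
        self.tiers = {name: {} for name, _ in self.capacities}

    def put(self, block_id, data):
        """Store the block in the first tier with free space; return its name."""
        for name, max_blocks in self.capacities:
            if len(self.tiers[name]) < max_blocks:
                self.tiers[name][block_id] = data
                return name
        raise RuntimeError("all tiers are full")

    def tier_of(self, block_id):
        """Return the tier holding the block, or None if it is not stored."""
        for name, blocks in self.tiers.items():
            if block_id in blocks:
                return name
        return None


# Hypothetical capacities, counted in blocks for simplicity.
store = TieredBlockStore([("MEM", 1), ("SSD", 1), ("HDD", 2)])
placements = [store.put(block_id, b"data") for block_id in (1, 2, 3)]
print(placements)
```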

Page 26

Outline

1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo

Page 27

Alluxio on DC/OS

Page 28

Alluxio on DC/OS

Alluxio brings
•  A unified view of data across disparate storage systems
•  High performance & predictable SLA for analytics workloads

DC/OS makes provisioning infrastructure easy
•  Automates provisioning, management & elastic scaling

Benefits include:
•  Faster analytics with Spark and other frameworks
•  Process data from hybrid cloud storage systems (HDFS, S3, etc.)

Page 29

Outline

1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo

Page 30

Demo Environment

[Diagram labels: Spark; Alluxio; Mesos]

Page 31

Demo Setup

Alluxio 1.5.0

DC/OS 1.9.4

Spark 2.0.2

Amazon EC2 (m3.xlarge)


Page 32

Results

8x improvement

Page 33

Conclusion

Easy to use Alluxio with Spark in a Mesos environment

Predictable and improved performance

Easily connect to various storage systems


Page 34

Thank you!

Gene Pang
Software Engineer
[email protected]

Website: www.alluxio.com
Email: [email protected]
Social Media: Twitter.com/alluxio, Linkedin.com/alluxio