Spark Internals - Hadoop Source Code Reading #16 in Japan


Page 1: Spark Internals - Hadoop Source Code Reading #16 in Japan

Spark Internals

Taro L. Saito, Treasure Data, Inc.

Hadoop Source Code Reading #16, NTT Data, Tokyo

May 29, 2014

Page 2

Spark Code Base Size

spark/core/src/main/scala:
- 2012 (version 0.6.x): 20,000 lines of code
- 2014 (branch-1.0): 50,000 lines of code

Other components: Spark Streaming, Bagel (graph processing library), MLlib (machine learning library), container support (Mesos, YARN, Docker, etc.), and Spark SQL (Shark: Hive on Spark).

Page 3

Spark Core Developers

Page 4

IntelliJ Tips

Install Scala Plugin

Useful commands for code reading:
- Go to definition (Ctrl + Click)
- Show Usage
- Navigate Class/Symbol/File
- Bookmark, Show Bookmarks
- Ctrl + Q (Show type info)
- Find Action (Ctrl + Shift + A)

Use your favorite key bindings

Page 5

Scala Console (REPL)

$ brew install scala

Page 6

Scala Basics

object: a singleton; holds what would be static methods in Java.

Package-private scope: private[spark] is visible only from the spark package.

Pattern matching
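The three features above can be sketched in a few lines of plain Scala (hypothetical names, not Spark classes):

```scala
// `object` defines a singleton; its members behave like static methods in Java.
object MathUtil {
  def square(x: Int): Int = x * x
  // In Spark, `private[spark] def ...` would restrict a member to the spark package.
}

// Pattern matching destructures a value by shape and type.
def describe(x: Any): String = x match {
  case 0         => "zero"
  case n: Int    => s"int, squared: ${MathUtil.square(n)}"
  case s: String => s"string: $s"
  case _         => "other"
}

println(describe(3))      // int, squared: 9
println(describe("RDD"))  // string: RDD
```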

Page 7

Scala: Case Classes

Case classes: immutable and serializable; can be used with pattern matching.
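A small illustration (the `Task` class here is hypothetical, not Spark's):

```scala
// Case classes are immutable, Serializable, and have structural equality.
case class Task(stageId: Int, partition: Int)

val t = Task(1, 0)
val moved = t.copy(partition = 3)   // non-destructive update via copy

val label = t match {               // case classes work directly with pattern matching
  case Task(s, 0) => s"first partition of stage $s"
  case Task(s, p) => s"stage $s, partition $p"
}
println(label)                      // first partition of stage 1
```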

Page 9

Components

[Architecture diagram] Your program builds an RDD graph:

  sc = new SparkContext
  f = sc.textFile("…")
  f.filter(…).count()
  ...

The Spark client (app master) holds the RDD graph, the Scheduler, a block tracker, and a shuffle tracker. It submits work through the cluster manager to Spark workers, each running task threads and a Block manager backed by HDFS, HBase, etc.

Source: https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals

Page 10

Scheduling Process

[Pipeline diagram]

RDD Objects: build the operator DAG, e.g. rdd1.join(rdd2).groupBy(…).filter(…)

DAGScheduler: splits the graph into stages of tasks and submits each stage as it becomes ready; it is agnostic to operators. A failed stage is resubmitted here.

TaskScheduler: receives a TaskSet per stage, launches tasks via the cluster manager, and retries failed or straggling tasks; it doesn't know about stages.

Worker: executes tasks in threads; the Block manager stores and serves blocks.

Page 11

RDD

Reference: M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M.J. Franklin, S. Shenker, I. Stoica. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. NSDI 2012, April 2012.

SparkContext: contains the SparkConf and the Scheduler; the entry point for running jobs (runJob). Each RDD tracks its Dependency on its input RDDs.

11

Page 12: Spark Internals - Hadoop Source Code Reading #16 in Japan

Spark Internals

RDD.map operation

map: RDD[T] -> RDD[U]

MappedRDD: for each element in a partition, apply the function f.
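Conceptually, the operation can be sketched like this (a simplified stand-in, not Spark's actual RDD classes):

```scala
// Each partition is an iterator; a MappedRDD-like class applies f lazily per element.
trait MiniRDD[T] { def compute(): Iterator[T] }

class ParallelCollection[T](data: Seq[T]) extends MiniRDD[T] {
  def compute(): Iterator[T] = data.iterator
}

class MappedRDD[T, U](prev: MiniRDD[T], f: T => U) extends MiniRDD[U] {
  def compute(): Iterator[U] = prev.compute().map(f)  // apply f to each element
}

val doubled = new MappedRDD(new ParallelCollection(Seq(1, 2, 3)), (x: Int) => x * 2)
println(doubled.compute().toList)  // List(2, 4, 6)
```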

Page 13

RDD Iterator


First, check the local cache; if not found, compute the RDD.

StorageLevel Off-heap: Tachyon, a distributed memory store.
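The cache-then-compute logic can be sketched as follows (a simplified stand-in; the real code consults the BlockManager with a StorageLevel):

```scala
import scala.collection.mutable

val cache = mutable.Map[Int, List[Int]]()        // stands in for the local block store

var computations = 0
def iterator(rddId: Int, compute: () => List[Int]): List[Int] =
  cache.getOrElseUpdate(rddId, compute())        // a cache hit skips compute entirely

val a = iterator(1, () => { computations += 1; List(1, 2, 3) })
val b = iterator(1, () => { computations += 1; List(1, 2, 3) })  // cached: no recompute
println(computations)  // 1
```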

Page 14

Task

DAGScheduler organizes stages; each stage has several tasks, and each task has preferred locations (host names) to favor data-local computation.

Page 15

Task Locality

Preferred locations to run a task: process, node, or rack.

Page 16

Delay Scheduling

Reference: M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker, I. Stoica. Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling. EuroSys 2010, April 2010.

Try to run tasks in the following order:
- Local
- Rack local: involves data serialization and local transfer
- At any node: might involve remote data transfer
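The idea can be sketched as follows (illustrative only; Spark's TaskSetManager tracks per-level wait times, and the 3-second wait here is an assumed value):

```scala
// Delay scheduling: prefer a data-local offer, and only relax locality after waiting.
def chooseHost(preferred: Seq[String], offered: Seq[String], waitedMs: Long): Option[String] = {
  val localityWaitMs = 3000L                       // assumed wait before relaxing locality
  offered.find(preferred.contains) match {
    case hit @ Some(_)                      => hit                  // data-local: run now
    case None if waitedMs >= localityWaitMs => offered.headOption   // give up locality
    case None                               => None                 // keep waiting
  }
}

println(chooseHost(Seq("node1"), Seq("node2", "node1"), 0))  // Some(node1)
println(chooseHost(Seq("node1"), Seq("node2"), 0))           // None
```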

Page 17

Serializing Tasks

TaskDescription wraps the serialized task. A ResultTask contains the RDD, the function to run (func, e.g. an aggregation), the stage ID, and the output ID.

Page 18

TaskScheduler: submitTasks

Serializes the task request, then sends it to the ExecutorBackend. The ExecutorBackend (an Akka actor) handles task requests.

Page 19

ClosureSerializer

Clean

A function in Scala is a closure. Closure: free variables + function body (a class).

x: bound variable, N: free variable, M: unused variable

  class A$apply$1 extends Function1[T, U] {
    val $outer: A$outer
    def apply(input: T): U = …
  }

  class A$outer { val N = 100; val M = (large object) }

Fill M with null, then serialize the closure.
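The effect of dragging unused outer fields into a closure can be illustrated in plain Scala (Spark's closure cleaner achieves this at the bytecode level rather than in user code):

```scala
// A closure that captures `this` would serialize M along with it;
// copying the needed value into a local avoids that.
class Outer extends Serializable {
  val N = 100
  val M = new Array[Byte](1024 * 1024)   // large object, unused by the closure below

  def makeAdder: Int => Int = {
    val n = N                            // copy just the needed value locally,
    (x: Int) => x + n                    // so the closure captures `n`, not `this` (and M)
  }
}

val f = new Outer().makeAdder
println(f(1))  // 101
```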

Page 20

Traversing Bytecode

A closure is a class in Scala. Spark traverses its accesses to outer variables using the ASM4 bytecode library.

Page 21

JVM Bytecode Instructions

Page 22

Cache/Block Manager

CacheManager: stores computed RDDs in the BlockManager.

BlockManager: write-once storage; manages block data according to StorageLevel (memoryStore, diskStore, shuffleStore); serializes/deserializes block data for remote transfer.

Compression: Ning LZF and snappy-java (faster decompression).

Page 23

Storing Block Data

IteratorValues: raw objects
ArrayBufferValues: Array[Byte]
ByteBufferValues: ByteBuffer

Page 24

ConnectionManager

Asynchronous data I/O server using its own protocol. Sends and receives block data (BufferMessage), split into 64KB chunks, each with a ChunkHeader.
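The chunking step can be sketched like this (simplified types, not Spark's BufferMessage):

```scala
// Split a block of bytes into fixed-size chunks for transfer.
val chunkSize = 64 * 1024

def toChunks(data: Array[Byte]): Seq[Array[Byte]] =
  data.grouped(chunkSize).toSeq          // last chunk may be shorter than 64KB

val message = new Array[Byte](150 * 1024)
val chunks = toChunks(message)
println(chunks.map(_.length))            // 65536, 65536, 22528
```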

Page 25

RDD.compute

Local Collection

Page 26

SparkContext.runJob

Hands the RDD over to the DAGScheduler.

Page 27

SparkConf

Key-value configuration: master address, jar file address, environment variables, JAVA_OPTS, etc.

Page 28

SparkEnv

Holds the Spark components.

Page 29

SparkContext.makeRDD

Converts a local Seq[T] into an RDD[T].
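Conceptually, the conversion slices the local collection into partitions (a simplified sketch of ParallelCollectionRDD-style slicing, not Spark's actual code):

```scala
// Slice a local Seq into numSlices partitions of near-equal size.
def slice[T](seq: Seq[T], numSlices: Int): Seq[Seq[T]] =
  (0 until numSlices).map { i =>
    val start = i * seq.length / numSlices         // even split with integer math
    val end = (i + 1) * seq.length / numSlices
    seq.slice(start, end)                          // each slice becomes one partition
  }

val partitions = slice(1 to 10, 3)
println(partitions.map(_.toList))  // List(1, 2, 3), List(4, 5, 6), List(7, 8, 9, 10)
```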

Page 30

HadoopRDD

Reads HDFS data as (key, value) records.

Page 31

Mesos Scheduler – Fine Grained

Mesos offers slave resources; the Scheduler determines the resource usage. Task lists are stored in the TaskScheduler. A JVM is launched for each task (createMesosTask, createExecutorInfo).

Page 32

Mesos Fine-Grained Executor

Page 33

Mesos Fine-Grained Executor

spark-executor: a shell script for launching the JVM.

Page 34

Coarse-grained Mesos Scheduler

Launches a Spark executor on each Mesos slave, which runs CoarseGrainedExecutorBackend.

Page 35

Coarse-grained ExecutorBackend

An Akka actor: registers itself with the master, then initializes the executor after the response.

Page 36

Cleanup RDDs

ReferenceQueue: notified when weakly referenced objects are garbage collected.
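A minimal demonstration of the mechanism (illustrative; Spark's ContextCleaner registers weak references to RDDs and frees their blocks when the references are enqueued):

```scala
import java.lang.ref.{ReferenceQueue, WeakReference}

val queue = new ReferenceQueue[AnyRef]()
var obj: AnyRef = new Object                     // stands in for an RDD
val ref = new WeakReference[AnyRef](obj, queue)  // enqueued once obj is collected

obj = null                                       // drop the last strong reference
System.gc()                                      // request a GC (not guaranteed to run)

val cleaned = queue.remove(1000)                 // wait up to 1s for an enqueued ref
// If the GC collected obj, `cleaned eq ref`; Spark would then unpersist the RDD.
```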

Page 37

Copyright ©2014 Treasure Data. All Rights Reserved.

WE ARE HIRING!