Post on 15-Aug-2015
Dynamic Resource Allocation for Spark on YARN
Tsuyoshi Ozawa (ozawa@apache.org)
What’s YARN?
• A resource manager implementation for computer clusters
Hadoop Stack
[Diagram: HDFS at the bottom, YARN above it, and MapReduce / Spark / Tez running on top of YARN]
YARN overview
• All resources are managed by the ResourceManager
• All tasks are launched on NodeManagers
• Clients submit jobs via the ResourceManager
[Diagram: a client talking to the ResourceManager, which coordinates two NodeManagers]
Spark on YARN
• Two modes:
• yarn-cluster
• yarn-client
yarn-cluster mode
• Launches the Spark driver inside a YARN container
• Works well with spark-submit
[Diagram: (1) the client submits to the ResourceManager, (2) the ResourceManager launches the Spark AppMaster (which runs the Spark driver) in a container, (3) the AppMaster launches executors in containers on the NodeManagers]
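For example, a yarn-cluster submission with spark-submit might look like this (the jar path, main class, and executor count below are illustrative, not from the talk):

```
$ spark-submit \
    --master yarn-cluster \
    --num-executors 4 \
    --class org.apache.spark.examples.SparkPi \
    lib/spark-examples-*.jar 100
```

With `--master yarn-cluster`, spark-submit returns after handing the driver off to YARN; the driver runs inside the AppMaster container.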
yarn-client mode
• Launches the Spark driver on the client side
• Works well with spark-shell
[Diagram: (1) the client submits to the ResourceManager, (2) the ResourceManager launches the Spark AppMaster in a container, (3) the AppMaster launches executors, (4) the client-side Spark driver sends commands to the executors]
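For interactive use, the same mode is selected when starting spark-shell (the executor count here is only an example):

```
$ spark-shell --master yarn-client --num-executors 4
```

Because the driver stays on the client machine, closing the shell ends the application.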
Spark on YARN: yarn-cluster mode
[Diagram: the AppMaster and executor containers spread across Node1, Node2, and Node3]
Problem
• Inefficient resource management
• Containers cannot exit until the job exits
[Diagram: four containers across Node1 and Node2; in stage1 all four run at 100% utilization, but in stage2 only one is at 100% while the other three sit at 0%, still holding their resources]
Dynamic resource allocation (since v1.2)
• Allocates containers more dynamically
• The number of executors is decided by the workload
[Diagram: (1) the client submits to the ResourceManager, (2) the ResourceManager launches the Spark AppMaster (Spark driver), (3) the AppMaster launches or kills executors as the load changes]
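The scale-up and scale-down decisions are driven by a pair of timeouts; the property names below are from the Spark 1.x configuration documentation, and the values are only examples:

```
# Request more executors when tasks have been queued longer than this
spark.dynamicAllocation.schedulerBacklogTimeout  5s
# Remove an executor after it has been idle this long
spark.dynamicAllocation.executorIdleTimeout      60s
```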
Yak shaving
• Where should we hold the state of Spark RDDs (shuffle output)?
• If executors are killed, it’ll be lost…
[Diagram: executors on a NodeManager, each holding its own RDD/shuffle data]
External shuffle
• Saves Spark RDD intermediate (shuffle) files via the NodeManager
• NodeManager has an interface for this: the external shuffle plugin
• Now executors are stateless!
[Diagram: executors on a NodeManager write RDD intermediate files through the external shuffle plugin, so the files survive when executors are killed]
How to install (with Apache Hadoop)
• Copy the shuffle plugin jar to the NodeManager’s classpath
• Edit yarn-site.xml
• Edit spark-defaults.conf
Copy the shuffle jar to the NodeManager’s classpath

$ cp \
    lib/spark-*-yarn-shuffle.jar \
    /home/ubuntu/hadoop/share/hadoop/yarn/
Edit yarn-site.xml
• Add the shuffle plugin configuration
• Note that the documentation for 1.2 includes a typo (I sent a PR to fix it :-)
• See the documentation for 1.4
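The addition described in the Spark docs looks roughly like this (property names per the Spark "Running Spark on YARN" documentation; keep mapreduce_shuffle in the list if your cluster already uses it):

```
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

After editing, restart the NodeManagers so the auxiliary service is loaded.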
Edit spark-defaults.conf
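Enabling dynamic allocation needs at least the following two settings; the min/max executor values are illustrative defaults, not from the talk:

```
spark.shuffle.service.enabled           true
spark.dynamicAllocation.enabled         true
spark.dynamicAllocation.minExecutors    1
spark.dynamicAllocation.maxExecutors    8
```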
We’re ready!!
• The number of executors is now decided automatically (no need to pass --num-executors)
Demo
Summary
• Spark on YARN has two modes:
• yarn-client mode
• yarn-cluster mode
• With dynamic allocation, Spark can launch jobs efficiently on YARN