Setting up Hadoop YARN Clustering
Danairat T., 2013, [email protected]: Big Data Hadoop – Hands On Workshop
Setting up Hadoop Clustering
Hands-On Workshop
Danairat T.
Line ID: Danairat
FB: Danairat Thanabodithammachari
+668-1559-1446, [email protected], Certified Java Programmer
Big Data Introduction
Volume: GB, TB, PB, EB, ZB
Variety: DB table, delimited text, XML and HTML, free-form text, image, music, video, binary
Velocity: batch, near real time, real time
Big Data Architecture
Big Data Applications: Fraud Analysis, Cyber Security, Talent Search
BI/Report, Next Best Action
Predictive Analytics, Descriptive Analytics, Prescriptive Analytics
Big Data Platform: Distributed Data Processing, Integration and Metadata Framework, Distributed Data Store and DWH, Monitoring and Management Framework, Security Framework
Big Data Infrastructure: Hardware, Storage, Network
Hadoop Timeline
Apache Hadoop Core Technology
j2eedev.org/ecosystem-hadoop
Apache Hadoop Ecosystem
j2eedev.org/ecosystem-hadoop
Big Data Platform & Big Data Analytics
Hadoop Technology
HDFS: Hadoop Distributed File System
Block size = 64 MB, replication factor = 3
Cost per GB is a few ¢/month vs $/month
apache.org/hadoop/
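To make the block size and replication factor concrete, the following is a small arithmetic sketch. The 64 MB block size and the replication factor of 3 come from the slide; the 1 GB file size is only an illustrative assumption.

```shell
# Values from the slide
BLOCK_SIZE_MB=64
REPLICATION=3
# Illustrative file size (an assumption, not from the slides)
FILE_SIZE_MB=1024

# Number of blocks, rounding up to cover a partial last block
BLOCKS=$(( (FILE_SIZE_MB + BLOCK_SIZE_MB - 1) / BLOCK_SIZE_MB ))
# Every block is stored REPLICATION times across the DataNodes
BLOCK_REPLICAS=$(( BLOCKS * REPLICATION ))
RAW_STORAGE_MB=$(( FILE_SIZE_MB * REPLICATION ))

echo "blocks=$BLOCKS replicas=$BLOCK_REPLICAS raw_storage_mb=$RAW_STORAGE_MB"
```

Under these assumptions a 1 GB file occupies 16 blocks, 48 block replicas, and 3072 MB of raw disk across the cluster.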
YARN: Yet Another Resource Negotiator
hadoop.apache.org
MRv2 maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of MRv2 with just a recompile.
Hadoop 1.0 vs Hadoop 2.0
Hortonworks.com
Hadoop Symbols and Reasons Behind
Clone the Hadoop master node to slave1 and slave2
master
slave1
slave2
At master node: edit the hosts file
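The screenshot for this step is not part of the transcript, so the following is only a sketch of what the entries might look like. The slave addresses and the `ip-172-31-...` names are taken from the scp and ssh commands in this workshop; the master address `172.31.0.10` and the choice of `/etc/hosts` as the target are assumptions.

```shell
# Write to a scratch file by default; point HOSTS_FILE at /etc/hosts
# (as root) only after reviewing the entries.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# MASTER_IP is a placeholder (assumption); the slave addresses are
# taken from the scp and ssh commands in this workshop.
MASTER_IP="${MASTER_IP:-172.31.0.10}"

cat >> "$HOSTS_FILE" <<EOF
$MASTER_IP    master
172.31.1.8    ip-172-31-1-8    slave1
172.31.15.16  ip-172-31-15-16  slave2
EOF

cat "$HOSTS_FILE"
```

Because the slaves are clones of the master, the same entries would normally be needed on every node.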
At master node: copy the public key file to slave1 and slave2
scp /home/ubuntu/.ssh/id_dsa.pub ip-172-31-1-8:/home/ubuntu/.ssh/master.pub
scp /home/ubuntu/.ssh/id_dsa.pub 172.31.15.16:/home/ubuntu/.ssh/master.pub
After this slide, three cascaded windows represent the master node, the slave1 node, and the slave2 node
master node
slave1 node
slave2 node
At slave1 and slave2: cat /home/ubuntu/.ssh/master.pub >> /home/ubuntu/.ssh/authorized_keys
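The append above can be made safe to rerun. The sketch below adds the key only if it is not already present and tightens permissions, since sshd may reject keys when the directory or file is writable by others. It defaults to a scratch directory and a made-up stand-in key (both assumptions) so it can be tried without touching a real `~/.ssh`.

```shell
# SSH_DIR defaults to a scratch directory so the sketch can be tried
# safely; on a slave node it would be /home/ubuntu/.ssh.
SSH_DIR="${SSH_DIR:-$(mktemp -d)}"
PUB_KEY="$SSH_DIR/master.pub"
AUTH_KEYS="$SSH_DIR/authorized_keys"

# Stand-in key so the sketch runs without a real master.pub (assumption).
[ -f "$PUB_KEY" ] || echo "ssh-dss AAAAexamplekey ubuntu@master" > "$PUB_KEY"

touch "$AUTH_KEYS"
# Append only if the key is not already present, so reruns do not duplicate it.
grep -qxF -f "$PUB_KEY" "$AUTH_KEYS" || cat "$PUB_KEY" >> "$AUTH_KEYS"

# Restrict access to the owner; loose permissions can make sshd refuse the key.
chmod 700 "$SSH_DIR"
chmod 600 "$AUTH_KEYS"
```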
At master: test ssh to slave1 and slave2
$ ssh ip-172-31-1-8
$ exit
$ ssh ip-172-31-15-16
$ exit
At master: add slave1 and slave2 to the Hadoop slaves file
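The file contents are only shown in a screenshot, so this is a sketch of what the file might contain. It assumes a Hadoop 2.x layout, where the file is named `slaves` and lives in the configuration directory, and it uses the host names from the ssh test in this workshop.

```shell
# HADOOP_CONF_DIR defaults to a scratch directory here; on the master it
# is typically $HADOOP_HOME/etc/hadoop (assumption: Hadoop 2.x layout).
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$(mktemp -d)}"

# One worker host name per line; these names come from the ssh test.
cat > "$HADOOP_CONF_DIR/slaves" <<'EOF'
ip-172-31-1-8
ip-172-31-15-16
EOF

cat "$HADOOP_CONF_DIR/slaves"
```

start-dfs.sh and start-yarn.sh read this list to decide where to launch DataNode and NodeManager processes over ssh.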
At master: edit hdfs-site.xml
At master: edit hdfs-site.xml to set a replication factor of 2, one replica per slave
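A minimal sketch of the replication setting follows. `dfs.replication` is the standard HDFS property for the default replication factor; everything else here is an assumption. The workshop's real file most likely contains other properties as well (for example the NameNode and DataNode directories), so merge the property into the existing file rather than overwriting it as this scratch example does.

```shell
# Scratch directory by default (assumption); do not point this at a live
# configuration directory, because the file is overwritten.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$(mktemp -d)}"

# Minimal hdfs-site.xml: with two DataNodes, each block is stored twice.
cat > "$HADOOP_CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF

cat "$HADOOP_CONF_DIR/hdfs-site.xml"
```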
At all nodes: remove the NameNode and DataNode directories
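The directory paths are not in the transcript, so the sketch below uses hypothetical locations under a scratch directory. On a real node the paths would be whatever `dfs.namenode.name.dir` and `dfs.datanode.data.dir` point to in hdfs-site.xml. The likely reason for this step is that the cloned slaves still carry the master's old storage state, which would conflict with a freshly formatted NameNode.

```shell
# Hypothetical locations (assumption); use the values of
# dfs.namenode.name.dir and dfs.datanode.data.dir from your hdfs-site.xml.
HDFS_BASE="${HDFS_BASE:-$(mktemp -d)}"
NAME_DIR="$HDFS_BASE/namenode"
DATA_DIR="$HDFS_BASE/datanode"

# Simulate leftover state from the cloned image so the sketch is runnable.
mkdir -p "$NAME_DIR/current" "$DATA_DIR/current"

# Guard against an empty or root path, then delete both directories.
for dir in "$NAME_DIR" "$DATA_DIR"; do
  [ -n "$dir" ] && [ "$dir" != "/" ] && rm -rf -- "$dir"
done

ls -A "$HDFS_BASE"
```

This permanently deletes any HDFS data on the node, so it only makes sense before the NameNode is formatted for the new cluster.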
At master: format the NameNode (hdfs namenode -format)
At master: Execute start-dfs.sh
At slave1: run jps and confirm that DataNode has started
At slave2: run jps and confirm that DataNode has started
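The jps check can also be scripted. The helper below is a hypothetical convenience function, not part of Hadoop; it reads jps-style output on stdin and succeeds when the named daemon is listed. The sample output is made up so the sketch runs without a cluster.

```shell
# Succeeds if the named daemon appears in jps-style output on stdin.
# jps prints one "<pid> <main class>" pair per line.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample output (made up for illustration) standing in for a real jps call.
sample_jps='2481 DataNode
2733 Jps'

if printf '%s\n' "$sample_jps" | has_daemon DataNode; then
  echo "DataNode is running"
else
  echo "DataNode is NOT running"
fi
```

On a slave node the real check would be `jps | has_daemon DataNode`; the same helper works for NodeManager once YARN has been started.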
At master: Execute start-yarn.sh
At slave1: run jps and confirm that NodeManager has started
At slave2: run jps and confirm that NodeManager has started
Importing data into HDFS Cluster
At master: import data into HDFS
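The exact commands are only visible in the screenshot, so this is a sketch using the standard `hdfs dfs` subcommands. The local file and the HDFS input directory are hypothetical. The script defaults to a dry run that only prints the commands, so it can be read and tried on a machine without Hadoop.

```shell
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 on
# a node where the hdfs CLI is installed and the cluster is running.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Both paths are hypothetical (assumptions, not from the slides).
LOCAL_FILE="${LOCAL_FILE:-/home/ubuntu/input_data.txt}"
HDFS_INPUT_DIR="${HDFS_INPUT_DIR:-/inputs}"

import_to_hdfs() {
  run hdfs dfs -mkdir -p "$HDFS_INPUT_DIR" &&
    run hdfs dfs -put "$LOCAL_FILE" "$HDFS_INPUT_DIR/" &&
    run hdfs dfs -ls "$HDFS_INPUT_DIR"
}

import_to_hdfs
```

Because HDFS is a single namespace, the uploaded file should then be visible from any node with `hdfs dfs -ls`.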
At slave1: review the imported data in HDFS
At slave2: review the imported data in HDFS
Running MapReduce in Cluster Mode
At master: run the MapReduce program on YARN
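The transcript does not show which program is submitted. The output path used later in this workshop suggests a word count job, so the sketch below assumes the `wordcount` program from the MapReduce examples jar. The jar path and the input directory are hypothetical; the output directory is the one named on a later slide. It prints the command by default and only runs it when DRY_RUN=0.

```shell
HDFS_INPUT_DIR="${HDFS_INPUT_DIR:-/inputs}"         # hypothetical input path
HDFS_OUTPUT_DIR="/outputs/wordcount_output_dir01"   # output path from the slides
# Hypothetical jar path; point it at the examples jar shipped with your install.
EXAMPLES_JAR="${EXAMPLES_JAR:-hadoop-mapreduce-examples.jar}"

submit_wordcount() {
  # Arguments to hadoop jar: jar, program name, HDFS input dir, HDFS output dir.
  # The job fails if the output directory already exists.
  set -- hadoop jar "$EXAMPLES_JAR" wordcount "$HDFS_INPUT_DIR" "$HDFS_OUTPUT_DIR"
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then "$@"; fi
}

submit_wordcount
```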
At slave1 and slave2: you will see the Application Master and YarnChild container processes
At master: review the output file in HDFS
At slave1 and slave2: review the output file in HDFS with the command:
hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
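Assuming the output is in the usual word count format of one `word<TAB>count` pair per line, the result can be summarized instead of read in full. `top_words` is a hypothetical helper, and the sample data is made up so the sketch runs without a cluster.

```shell
# Print the N most frequent words from "word<TAB>count" lines on stdin.
# Sorts by count (descending), then by word to break ties.
top_words() {
  sort -t "$(printf '\t')" -k2,2nr -k1,1 | head -n "${1:-10}"
}

# Made-up sample standing in for:
#   hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
printf 'hadoop\t7\nyarn\t12\nhdfs\t9\n' | top_words 2
```

On a cluster node the same helper can be fed directly: `hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000 | top_words 10`.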
At master: review the output result data in the web console
Stopping Hadoop Cluster
At master: execute stop-yarn.sh
At slave1: run jps and confirm that NodeManager has stopped
At slave2: run jps and confirm that NodeManager has stopped
At master: execute stop-dfs.sh
At slave1: run jps and confirm that DataNode has stopped
At slave2: run jps and confirm that DataNode has stopped
Thank you very much