Distribute Cloud Environment on Ubuntu 14.04 with Docker


Table of Contents

1. Introduction
2. Install Docker on Ubuntu 14.04
3. Set Hadoop Environment in a Docker Container
4. Set HBase Environment in a Docker Container
5. Export and Import a Docker Image between Nodes in Cluster
6. The Problem I Haven't Solved
7. Possible Problems

This is a basic tutorial to help developers or system administrators build a basic cloud environment with Docker.

In this book, I will not use a Dockerfile to create containers, because I don't know how to use that yet. XDDD

At the end of this book, I will summarize some problems I haven't solved yet.

If there is any mistake, please let me know.

Distribute Cloud Environment with Docker

Install Docker on Ubuntu 14.04 LTS

Ubuntu-maintained Package Installation

$ sudo apt-get update

$ sudo apt-get install docker.io

Enable tab-completion of Docker commands in BASH:

$ source /etc/bash_completion.d/docker.io

Docker-maintained Package Installation

Note: if you want to install the most recent version of Docker, you do not need to install docker.io from Ubuntu.

First, check that your system can deal with https URLs: /usr/lib/apt/methods/https should exist. If it doesn't, you need to install the package apt-transport-https:

$ apt-get update

$ apt-get install apt-transport-https

Add the Docker repository key to your system keychain:

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9

Add the Docker repository to your apt source list:

$ sudo sh -c "echo deb http://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"

$ sudo apt-get update

$ sudo apt-get install lxc-docker

To verify that everything has worked as expected:

$ sudo docker run -i -t ubuntu /bin/bash

This should download the latest Ubuntu image and then start bash in a new container.

Basic Docker Command-line

Show running containers:

$ sudo docker ps

Show all images in your local repository:

$ sudo docker images

Run a container from a specific image:

$ sudo docker run -i -t <image_id || repository:tag> bash

Start an existing container:

$ sudo docker start -i <container_id>

Attach to a running container:

$ sudo docker attach <container_id>

Exit without shutting down a container:

[Ctrl-p] + [Ctrl-q]
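Putting these together, a typical session looks roughly like this (the container ID is a placeholder):

## start a new container, then detach with [Ctrl-p] + [Ctrl-q]
$ sudo docker run -i -t ubuntu /bin/bash

## the container still shows as running
$ sudo docker ps

## reattach to it by ID
$ sudo docker attach <container_id>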

Reference

https://docs.docker.com/installation/ubuntulinux/#ubuntu-trusty-1404-lts-64-bit

Set Hadoop Environment in a Docker Container

In a new Docker container, you need to set up the basic environment before setting up Hadoop.

Update Apt List

$ sudo apt-get update

Install Java JDK

$ sudo apt-get install default-jdk

The default JDK will be installed at /usr/lib/jvm/<java-version>

Install some needed packages

$ sudo apt-get install git wget vim ssh

Create a user to manage the hadoop cluster

$ adduser hduser

Grant the user privileges:

$ visudo

Append the hduser you just created below root and give it the same privilege specification as root, as sketched below.
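For example, the privilege lines in the sudoers file could look like this (a sketch; adjust to your own policy):

## User privilege specification
root    ALL=(ALL:ALL) ALL
hduser  ALL=(ALL:ALL) ALL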

Generate an SSH authorized key to allow socket connections without a password

$ ssh-keygen -t rsa

$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys

Set the port of SSH and SSHD

Because I use Docker as my distribution tool, the default ssh port 22 is already listened on by the host machine, so I need to use another port to communicate between Docker containers on different host machines. In my example, I will use port 2122 for listening and sending requests.

ssh

$ sudo vi /etc/ssh/ssh_config

-> Port 2122

wq!

sshd

$ sudo vi /etc/ssh/sshd_config

-> Port 2122

-> UsePAM no

wq!
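To make the new port take effect and verify it (a quick sanity check; assumes the ssh service is running in the container):

$ sudo service ssh restart

$ ssh -p 2122 localhost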

Set Hadoop Environment

I suppose my cloud environment is as follows:

VM: 5 nodes (master, master2, slave1, slave2, slave3)
OS: Ubuntu 14.04 LTS
Docker Version: 1.3.1
Hadoop Version: 2.6.0

Download hadoop-2.6.0

$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

$ tar zxvf hadoop-2.6.0.tar.gz

Set environment path

$ sudo vi /etc/profile

-> export JAVA_HOME=/usr/lib/jvm/<java-version>

-> export HADOOP_HOME=<YOUR_HADOOP_PACKAGE_PATH>

-> export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

-> export PATH=$HADOOP_HOME/bin:$PATH

-> export CLASSPATH=$HADOOP_HOME/lib:$CLASSPATH

wq!

$ source /etc/profile

All needed configuration files are stored in <HADOOP_HOME>/etc/hadoop
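To confirm the environment is picked up, you can print the Hadoop version (assuming the tarball was unpacked to your HADOOP_HOME):

$ hadoop version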

Modify Hadoop configuration

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <description>The master endpoint in the cluster.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/<HADOOP_HOME>/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
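After editing, a quick way to check that a property was picked up (assuming HADOOP_CONF_DIR is set):

$ hdfs getconf -confKey fs.defaultFS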

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master2:9001</value>
    <description>Set a secondary namenode in case the master node crashes.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/<HADOOP_HOME>/dfs/name</value>
    <description>Set the location where the namenode stores its data.</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/<HADOOP_HOME>/dfs/data</value>
    <description>Set the location where each datanode stores its data.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>The number of replications in the cluster.</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>

hadoop-env.sh

$ vi <HADOOP_CONF_DIR>/hadoop-env.sh

-> export JAVA_HOME=/usr/lib/jvm/<java-version>

wq!

yarn-env.sh

$ vi <HADOOP_CONF_DIR>/yarn-env.sh

-> export JAVA_HOME=/usr/lib/jvm/<java-version>

wq!

slaves

$ vi <HADOOP_CONF_DIR>/slaves

-> master2
-> slave1
-> slave2
-> slave3

Set HBase Environment in a Docker container

After setting the Hadoop environment in the previous section, you can set the HBase environment now.

In the following example, I will use a custom zookeeper to manage the resources of my cluster.

HBase version: 0.99.2
Zookeeper version: 3.3.6

Download HBase and Zookeeper

$ wget http://ftp.twaren.net/Unix/Web/apache/hbase/hbase-0.99.2/hbase-0.99.2-bin.tar.gz

$ wget http://ftp.twaren.net/Unix/Web/apache/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz

$ tar -zxvf hbase-0.99.2-bin.tar.gz

$ tar -zxvf zookeeper-3.3.6.tar.gz

Set the configuration of HBase

hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>hdfs://master:60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value><ZOOKEEPER_HOME>/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>


hbase-env.sh

$ vi <HBASE_HOME>/conf/hbase-env.sh

-> export HBASE_HOME=<HBASE_HOME>

-> export HADOOP_HOME=<HADOOP_HOME>

-> export HBASE_CLASSPATH=$HADOOP_CONF_DIR

-> HBASE_MANAGES_ZK=false

regionservers

$ vi <HBASE_HOME>/conf/regionservers

-> master2
-> slave1
-> slave2
-> slave3

Set the configuration of Zookeeper

As in the previous hbase-env.sh setting, you can see I set HBASE_MANAGES_ZK=false so that my custom zookeeper manages and monitors the resources of the cluster.

zoo.cfg

$ vi <ZOOKEEPER_HOME>/conf/zoo.cfg

-> dataDir=<ZOOKEEPER_HOME>/data

-> clientPort=2181

-> server.1=master:2888:3888

Then add a myid file under <ZOOKEEPER_HOME>/data to tell zookeeper which node this zookeeper instance is running on.

For example, since my zoo.cfg sets server.1=master:2888:3888, this zookeeper instance runs on the master node, binding ports 2888 and 3888, so I need to tell zookeeper that this machine is server 1:

$ vi <ZOOKEEPER_HOME>/data/myid

-> 1

wq!
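Equivalently, a one-line sketch (assuming the data directory already exists):

$ echo 1 > <ZOOKEEPER_HOME>/data/myid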

Set System Environment

$ sudo vi /etc/profile

-> export HBASE_HOME=<HBASE_HOME>

-> export ZOOKEEPER_HOME=<ZOOKEEPER_HOME>

-> export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH

wq!

$ source /etc/profile

Export and Import a Docker Image between Nodes in Cluster

Brief Summary

After following the previous sections to set the Hadoop and HBase configurations, we can commit this Docker container as an image to distribute the cloud cluster, as sketched below.
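A minimal sketch of the commit step (the container ID and the repository:tag name are placeholders, not values from the original text):

## find the configured container's id
$ sudo docker ps -a

## commit it as a reusable image
$ sudo docker commit <container_id> <repository:tag>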

The following figure shows how I plan to distribute my cloud cluster.

If you follow the previous sections, you will get a Docker container with a hadoop environment. To use it, you need to duplicate it to the other endpoints over which you want to build the hadoop cluster.

NOTE: My experiment environment is on Windows Azure Virtual Machine.

Export a Docker container to an image

Save the container as a compressed tar file:

$ sudo docker save <image_repository:tag> > XXXX.tar

After finishing the export, I use scp to transport the image to the other nodes in the cluster:

$ scp -P [port] XXXX.tar [account]@[domain]:<where you want to store it>

Now, switch to master2 to show how to use the image we just transported from master:

$ sudo docker load < XXXX.tar

After loading the tar file, we can check the image we just imported in the local repository:

$ sudo docker images

Distribute with Docker Container

Shell script

Now we can start to use this image to distribute the cluster. In this example, I write a simple shell script to run the Docker image instead of using a Dockerfile, because I haven't learned how to build a Docker container with a Dockerfile. Because the --link option does not fit my situation, I use basic port mapping to let the containers connect with each other over the network.

$ vi bootstrap.sh

-> sudo docker run -i -t -p 2122:2122 -p 50020:50020 -p 50090:50090 -p 50070:50070 -p 50010:50010 -p 50075:50075 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8040:8040 -p 8042:8042 -p 49707:49707 -p 8088:8088 -p 8030:8030 -p 9000:9000 -p 9001:9001 -p 3888:3888 -p 2888:2888 -p 2181:2181 -p 16020:16020 -p 60010:60010 -p 60020:60020 -p 60000:60000 -p 9090:9090 -h <hostname> <repository:tag>

wq!

$ sh bootstrap.sh

In this example, I use master2 as the hostname and publish all needed ports from the container to the endpoint machine.
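On each of the other hosts, the same script is reused with that node's own hostname; for example, on slave1's VM only the -h flag changes (port mappings elided here):

$ sudo docker run -i -t <port mappings as above> -h slave1 <repository:tag>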

Now we can start to boot the hadoop cluster. There are some steps to do before start-dfs.sh.

Each container in the cluster needs to run the following:

$ source /etc/profile

$ sudo vi /etc/hosts

-> put all IPs and hostnames

## for example
127.0.0.5 master
10.0.0.2 master2
10.0.0.3 slave1
10.0.0.4 slave2
10.0.0.5 slave3

wq!

## restart ssh twice
$ sudo service ssh restart
$ sudo service ssh restart

The master container needs to do this:

## format the hdfs namenode
$ hdfs namenode -format

$ <HADOOP_HOME>/sbin/start-dfs.sh

$ <HADOOP_HOME>/sbin/start-yarn.sh

## make the root directory of hbase
$ hadoop fs -mkdir /hbase

## start zookeeper
$ zkServer.sh start

## start hbase
$ start-hbase.sh

## TEST
$ jps

If successful, you will see the following processes running on master.
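A rough sketch of what jps might print on the master if everything is up (the pids are illustrative, and the exact set of daemons depends on your configuration):

1234 NameNode
2345 ResourceManager
3456 QuorumPeerMain
4567 HMaster
5678 Jps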

Now we can start using hadoop and hbase to record and analyze data by following this tutorial. Nevertheless, there are still problems I've met but not solved.

The Problem I Haven't Solved

In section 4, after I docker run each container, I have to modify the /etc/hosts of every container so they can connect to each other. This causes two problems.

First, it is inconvenient to modify every hosts file when there is a large number of endpoint machines.

Second, if a container reboots, its IP will be automatically reset by Docker. Suppose you use HBase as your NoSQL DB: Zookeeper will store the old IP, and all regionservers will be unable to trace back to the master node.

I've surveyed some methods to solve this but have not implemented them yet. Look at the following link:

http://jpetazzo.github.io/2013/10/16/configure-docker-bridge-network/

If I solve these problems, I will update the book.

Update: 2015/01/20

I've referenced the link above and tried to solve the second problem I met. Some new problems happened, so I haven't solved it yet.

I've successfully modified the interface IP and route of the container, but I can't ssh into the container after doing that. The following code is what I use; maybe everybody can discuss this issue on stackoverflow:

http://stackoverflow.com/questions/27937185/assign-static-ip-to-docker-container

## find the pid of the target container
pid=$(sudo docker inspect -f '{{.State.Pid}}' <container_name> 2>/dev/null)

## expose the container's network namespace to ip netns
sudo rm -rf /var/run/netns/*
sudo ln -s /proc/$pid/ns/net /var/run/netns/$pid

## create a veth pair and plug one end into the docker0 bridge
sudo ip link add A type veth peer name B
sudo brctl addif docker0 A
sudo ip link set A up

## move the other end into the container and rename it eth0
sudo ip link set B netns $pid
sudo ip netns exec $pid ip link set eth0 down
sudo ip netns exec $pid ip link delete eth0
sudo ip netns exec $pid ip link set dev B name eth0
sudo ip netns exec $pid ip link set eth0 address 12:34:56:78:9a:bc
sudo ip netns exec $pid ip link set eth0 down
sudo ip netns exec $pid ip link set eth0 up

## assign the static IP and default route
sudo ip netns exec $pid ip addr add 172.17.0.1/16 dev eth0
sudo ip netns exec $pid ip route add default via 172.17.42.1


Possible Problems

ClockOutOfSyncException

Because the datetimes of the hadoop cluster nodes are not in sync, you can use ntpdate asia.pool.ntp.org to sync the datetime of each host.
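A minimal way to run the sync on every host (assuming the ntpdate package is installed; install it with apt-get if not):

$ sudo ntpdate asia.pool.ntp.org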

Connection Refused

Please confirm whether the IPs and hostnames in your /etc/hosts are correct or not.

continue...