Source: cis.csuohio.edu/~sschung/cis612/HadoopWordCountJa...


Install Hadoop and Run wordcount Job

1. First, extract the Hadoop archive into a folder. I took the easy Ubuntu route and let the download manager extract the file for me into the directory /srv/hadoop.

2. Create a new user account for Hadoop.

3. Make sure the hadoop user can ssh to localhost without needing a password.
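
One way to confirm step 3 is to ask ssh to connect without ever prompting. The following is a minimal sketch, not the author's method: the function names are made up for illustration, and it relies on the standard OpenSSH options `BatchMode` and `ConnectTimeout`.

```python
import subprocess


def ssh_check_command(host="localhost"):
    """Build an ssh command that fails instead of prompting for a password."""
    # BatchMode=yes disables password prompts; `true` is a no-op remote command.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_ssh_without_password(host="localhost"):
    """Return True if ssh to `host` succeeds with no interactive prompt."""
    try:
        result = subprocess.run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection hung.
        return False
    return result.returncode == 0
```

Calling `can_ssh_without_password()` as the hadoop user should return True once a key pair exists and the public key has been appended to `~/.ssh/authorized_keys`. A host that is not yet in `known_hosts` may also fail this check, so a first manual `ssh localhost` is usually worth doing.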


4. Configure core-site.xml: set a custom temp directory for Hadoop.

5. Configure pseudo-distributed mode in hdfs-site.xml.
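
The original screenshots of the two config files are not available, so the following is only a sketch of what steps 4 and 5 typically involve. It generates Hadoop's `<configuration>` XML for core-site.xml and hdfs-site.xml. The property names `hadoop.tmp.dir`, `fs.defaultFS`, and `dfs.replication` are standard Hadoop settings; the values used here (`/srv/hadoop/tmp`, `hdfs://localhost:9000`, a replication factor of 1) are assumptions, not the author's actual values.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of Hadoop properties as a <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed values for a single-machine setup; adjust to your own install.
core_site = build_site_xml({
    "hadoop.tmp.dir": "/srv/hadoop/tmp",      # custom temp directory (step 4)
    "fs.defaultFS": "hdfs://localhost:9000",  # default file system URI
})
hdfs_site = build_site_xml({
    "dfs.replication": 1,  # one copy of each block on a single node (step 5)
})

print(core_site)
print(hdfs_site)
```

The printed XML can be pasted into the corresponding files under Hadoop's configuration directory.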


6. Format the NameNode (the output appears after running the command).

7. Start up HDFS and YARN with start-dfs.sh and start-yarn.sh.
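
Steps 6 and 7 can be sketched as a small driver that runs the commands in order and stops at the first failure. This assumes the Hadoop 2.x-and-later layout, with `hdfs` under `bin/` and the start scripts under `sbin/`; the helper names are hypothetical. Formatting the NameNode erases existing HDFS metadata, so it belongs only in a first-time setup.

```python
import os
import subprocess


def startup_commands(hadoop_home):
    """Return the format/start commands for a Hadoop install, in order."""
    bin_dir = os.path.join(hadoop_home, "bin")
    sbin_dir = os.path.join(hadoop_home, "sbin")
    return [
        [os.path.join(bin_dir, "hdfs"), "namenode", "-format"],  # step 6
        [os.path.join(sbin_dir, "start-dfs.sh")],                # step 7: HDFS
        [os.path.join(sbin_dir, "start-yarn.sh")],               # step 7: YARN
    ]


def run_all(commands):
    """Run each command in turn; check=True raises on the first failure."""
    for argv in commands:
        print("running:", " ".join(argv))
        subprocess.run(argv, check=True)
```

For the install described here, `run_all(startup_commands("/srv/hadoop"))` would be run as the hadoop user.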


8. Verify the nodes are running (both with jps and the web interface).
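
The jps check in step 8 can be automated by parsing its output, which lists one `<pid> <name>` pair per line. The sketch below assumes the usual set of daemons for a single-machine HDFS plus YARN setup; the function names are illustrative.

```python
# Daemons normally expected on a single-machine HDFS + YARN setup (an assumption).
EXPECTED_DAEMONS = {
    "NameNode",
    "DataNode",
    "SecondaryNameNode",
    "ResourceManager",
    "NodeManager",
}


def running_daemons(jps_output):
    """Collect the process names from jps output lines of the form '<pid> <name>'."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons that do not appear in the jps output, sorted."""
    return sorted(set(expected) - running_daemons(jps_output))
```

Feeding it the text captured from `jps` returns an empty list when everything is up, or the names of the daemons that still need attention.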

9. Add some data to run the wordcount program against.

10. Results of running the command (it produced verbose output).
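
The exact commands used in steps 9 and 10 are not preserved, so this is a sketch of a typical sequence rather than a record of what was run. The HDFS paths `/input` and `/output` and the helper name are assumptions. The examples jar normally ships under `share/hadoop/mapreduce/` with a version-specific file name, so it is passed in as a parameter instead of being guessed.

```python
def wordcount_commands(local_file, examples_jar,
                       hdfs_input="/input", hdfs_output="/output"):
    """Return the commands to upload a file and run the wordcount example."""
    return [
        # Step 9: create the input directory in HDFS and copy the data in.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_input],
        ["hdfs", "dfs", "-put", local_file, hdfs_input],
        # Step 10: run the bundled wordcount example on that directory.
        ["hadoop", "jar", examples_jar, "wordcount", hdfs_input, hdfs_output],
        # Print the result, assuming a single reducer wrote part-r-00000.
        ["hdfs", "dfs", "-cat", hdfs_output + "/part-r-00000"],
    ]
```

The output directory must not already exist when the job starts, otherwise the job fails before doing any work.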


11. Some wordcount output
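
To sanity-check the job's output on a small input, the same counts can be computed locally. This is a plain-Python stand-in, not the Hadoop job itself: it splits on whitespace, roughly as the bundled wordcount example does, and prints tab-separated `word<TAB>count` lines sorted by word.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated tokens across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def format_output(counts):
    """Format counts as 'word<TAB>count' lines, sorted by word."""
    return ["{}\t{}".format(word, counts[word]) for word in sorted(counts)]


if __name__ == "__main__":
    for row in format_output(word_count(["hello world", "hello hadoop"])):
        print(row)
```

Like the Hadoop example, this is case-sensitive and keeps punctuation attached to words, so "Hadoop" and "hadoop," are counted as different tokens.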


** I also updated my .bashrc to set environment variables for Java and Hadoop, such as $HADOOP_HOME, and added hadoop/bin to my PATH variable so I could run the commands. After doing the install on the local machine, I installed Hadoop on Amazon with 4 nodes.
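
The .bashrc changes mentioned above can be verified from a fresh shell with a short check. This sketch assumes the Java variable in question is `JAVA_HOME`, which the note does not name explicitly; `check_hadoop_env` is a hypothetical helper.

```python
import os


def check_hadoop_env(environ):
    """Return a list of problems with the Hadoop-related environment variables."""
    problems = []
    for name in ("JAVA_HOME", "HADOOP_HOME"):  # JAVA_HOME is an assumption
        if not environ.get(name):
            problems.append(name + " is not set")
    hadoop_home = environ.get("HADOOP_HOME")
    if hadoop_home:
        bin_dir = os.path.normpath(os.path.join(hadoop_home, "bin"))
        path_dirs = [
            os.path.normpath(p)
            for p in environ.get("PATH", "").split(os.pathsep)
            if p
        ]
        if bin_dir not in path_dirs:
            problems.append(bin_dir + " is not on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_hadoop_env(os.environ):
        print(problem)
```

An empty result means the variables are set and the Hadoop bin directory is reachable from PATH.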
