Tutorial: To run the MapReduce EEMD code with Hadoop on FutureGrid

by Rewati Ovalekar

● Step 1:
– Code is available on: http://code.google.com/p/cyberaide/
– Download the code from: http://code.google.com/p/cyberaide/source/browse/#svn%2Ftrunk%2Fproject%2Fspring2011%2FEEMDAnalysis%2FEEMDJava

● Step 2:
– Create a FutureGrid account.
– For further details, refer to: https://portal.futuregrid.org/tutorials (FutureGrid Tutorial)

● Step 3:
– Log in to FutureGrid:
ssh [email protected]
– A confirmation message is displayed on successful login.

● Step 4:
– Create a jar file from the downloaded code.

● Step 5:
– To transfer the jar file and the input file:
sftp [email protected]
put /../filepath
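
Step 4 does not say how the jar is built. One possible way is sketched below, assuming the sources sit in the default package and compile against the Hadoop core jar alone. The function name `build_jar`, the `classes` directory, and all paths passed to it are hypothetical and not part of the original tutorial.

```shell
# Sketch: compile Java sources against a Hadoop core jar and package them.
# Assumed (not from the tutorial): the classes/ directory name and every path
# passed in by the caller. JAVAC and JAR can be overridden for a dry run.
JAVAC="${JAVAC:-javac}"
JAR="${JAR:-jar}"

build_jar() {
  src_dir="$1"      # directory holding the .java files
  hadoop_jar="$2"   # Hadoop core jar used as the compile classpath
  out_jar="$3"      # jar file to create
  mkdir -p classes || return 1
  # Compile every source file into classes/
  $JAVAC -classpath "$hadoop_jar" -d classes "$src_dir"/*.java || return 1
  # Package the compiled classes into the jar
  $JAR cf "$out_jar" -C classes .
}
```

A call might look like `build_jar src /opt/hadoop-0.20.2/hadoop-0.20.2-core.jar EEMDHadoop.jar`, where `src` and the jar name are placeholders.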

● Step 6:
– To run Hadoop on FutureGrid, create a Eucalyptus account.
– For further details, refer to: https://portal.futuregrid.org/tutorials/eucalyptus

● Step 7:
– Once the account is approved, load the Eucalyptus tools:
module load euca2ools

● Step 8:
– Make sure that the jar file and the input file are in the same directory as the username.private key.
– Run the image which has Hadoop on it:
euca-run-instances -k rovaleka -t c1.xlarge emi-D778156D
-k indicates the key pair name
-t indicates the instance type
emi-D778156D is the image ID
-n indicates the number of instances to launch (optional; not used in the command above)

● Step 8 (continued):
– Check the status using:
euca-describe-instances
– Keep checking until the status is running. Once the status is running, you can log in to run Hadoop.
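
The repeated status check can be scripted. The sketch below polls until one instance reports `running`. It assumes that each `INSTANCE` line of the `euca-describe-instances` output carries the instance ID in field 2 and the state in field 6, which should be verified against your own output. The function name `wait_until_running` and the default interval and retry count are hypothetical.

```shell
# Sketch: poll until the given instance reports the state "running".
# Assumption: INSTANCE lines hold the instance ID in field 2 and the state in
# field 6. DESCRIBE_CMD can be overridden, e.g. for testing.
DESCRIBE_CMD="${DESCRIBE_CMD:-euca-describe-instances}"

wait_until_running() {
  instance_id="$1"
  interval="${2:-15}"    # seconds between checks
  max_tries="${3:-40}"   # give up after this many checks
  tries=0
  while [ "$tries" -lt "$max_tries" ]; do
    state=$($DESCRIBE_CMD | awk -v id="$instance_id" \
      '$1 == "INSTANCE" && $2 == id { print $6 }')
    if [ "$state" = "running" ]; then
      return 0
    fi
    tries=$((tries + 1))
    sleep "$interval"
  done
  echo "instance $instance_id did not reach the running state" >&2
  return 1
}
```

The instance ID to pass in is the one reported when the instance was launched.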

● Step 9:
– Transfer the input file and the jar file to the required VM using:
scp -i username.private filename [email protected]:/
(Make sure that the address is the same as the address assigned to you; otherwise it will ask for a password.)
– Log in using:
ssh -i username.private [email protected]
(Make sure the address is the same.)
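
To avoid mistyping the assigned address, it can be read from the instance listing. This is a minimal sketch under the assumption that `INSTANCE` lines of the `euca-describe-instances` output hold the instance ID in field 2 and the public address in field 4. The function name `instance_address` is hypothetical.

```shell
# Sketch: print the public address of one instance, reading the output of
# euca-describe-instances on stdin.
# Assumption: INSTANCE lines hold the instance ID in field 2 and the public
# address in field 4; check this against your own output.
instance_address() {
  awk -v id="$1" '$1 == "INSTANCE" && $2 == id { print $4 }'
}
```

For example, `addr=$(euca-describe-instances | instance_address "$instance_id")` stores the address, which can then be used in the scp and ssh commands of this step.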

SINGLE NODE

● Step 10:
– A confirmation message is displayed on successful login.
– Retrieve the transferred files and move them into the Hadoop folder:
cd /..
mv filename /opt/hadoop-0.20.2
cd /opt/hadoop-0.20.2

● Step 11:
– To run Hadoop:
cd /opt/hadoop-0.20.2
bin/start-all.sh
– To check if everything is started:
jps
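
On a single node, `bin/start-all.sh` in Hadoop 0.20.x is expected to start five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. The sketch below reads `jps` output and lists any that are absent. The function name `missing_daemons` is hypothetical.

```shell
# Sketch: read jps output on stdin and print each Hadoop daemon that is missing.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # jps prints "<pid> <name>"; compare the whole second field so that
    # NameNode is not matched by SecondaryNameNode
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Running `jps | missing_daemons` prints nothing when all five daemons are up.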

● Step 12:
– Transfer the input file to HDFS:
bin/hadoop dfs -copyFromLocal inputfile name_in_HDFS
– To check if it is present on HDFS:
bin/hadoop dfs -ls

NOTE: The input file needs to be transferred again whenever Hadoop is started.
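
The copy and the check can be combined so that the file is uploaded only when it is missing. This is a sketch, assuming the launcher is `bin/hadoop` as in the tutorial and that the installed release supports the `dfs -test -e` existence check. The function name `put_if_absent` is hypothetical.

```shell
# Sketch: copy a local file to HDFS only when it is not already there.
# Assumption: "dfs -test -e" is available in the installed Hadoop release.
# HADOOP can be overridden to point at a different launcher.
HADOOP="${HADOOP:-bin/hadoop}"

put_if_absent() {
  local_file="$1"   # file on the local disk
  hdfs_name="$2"    # name to give the file in HDFS
  if $HADOOP dfs -test -e "$hdfs_name"; then
    echo "$hdfs_name is already in HDFS"
  else
    $HADOOP dfs -copyFromLocal "$local_file" "$hdfs_name"
  fi
}
```

For example, `put_if_absent inputfile name_in_HDFS` run from /opt/hadoop-0.20.2.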

● Step 13:
– To run the code:
bin/hadoop jar [jarFile] EEMDHadoop [inputfilename] [required_output_file]
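
Hadoop jobs that use the standard file output formats abort when the output path already exists, so it can help to check first. The wrapper below is a sketch that assumes the EEMD job behaves this way and that `dfs -test -e` is available. The function name `run_eemd` is hypothetical.

```shell
# Sketch: run the EEMD job, refusing to start if the output path already exists.
# Assumptions: bin/hadoop is the launcher, "dfs -test -e" is available, and the
# job writes through a standard file output format.
HADOOP="${HADOOP:-bin/hadoop}"

run_eemd() {
  jar_file="$1"
  input_name="$2"
  output_name="$3"
  if $HADOOP dfs -test -e "$output_name"; then
    echo "output path $output_name already exists; choose another name" >&2
    return 1
  fi
  $HADOOP jar "$jar_file" EEMDHadoop "$input_name" "$output_name"
}
```

The arguments correspond to [jarFile], [inputfilename], and [required_output_file] in the command above.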

● Step 14:
– Retrieve the output:
bin/hadoop dfs -copyToLocal [outputFileName] [outputfileNameToBeGiven]
(The output will be available in the part-00000 file.)
– To check the logs and debug the code, go to the logs/userlogs folder.
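
If a job runs with more than one reducer, the output is split across part-00000, part-00001, and so on. The sketch below concatenates all part files of a retrieved output directory. The function name `collect_parts` is hypothetical.

```shell
# Sketch: concatenate every part-* file of an output directory that was
# copied to the local disk with "dfs -copyToLocal".
collect_parts() {
  out_dir="$1"
  # The shell expands part-* in sorted order, so part-00000 comes first
  cat "$out_dir"/part-*
}
```

For example, `collect_parts [outputfileNameToBeGiven] > result.txt`, where result.txt is a placeholder name.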

● Step 15:
– Stop Hadoop and log out of the VM:
bin/stop-all.sh
exit

Thank you!!!