Introduction
BigData Community: Intro to BigData Workshop
Community Vision: Building a Big Data knowledge hub in Egypt.
Mission: Spreading awareness of Big Data science and engineering.
Objectives
● Training 120 fourth-year students from engineering and computer science faculties.
● Running 5 awareness sessions targeting five universities in the first phase.
Call for volunteers
● Business Development
● Administration
● Development
● Administrative & Logistics
Today's Roadmap
1. Big Data introduction
2. Hadoop installation
3. Small Java application
BigData & Ecosystem
Data Vs. Information Vs. Knowledge
Data Scientist VS Data Engineer
What is the largest file you have? Think of the movies, files, or streaming video you have used.
What is the maximum download speed you get? How long would it take just to transfer that file?
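A rough feel for these questions comes from simple arithmetic. The sketch below estimates transfer time from file size and link speed. The function name transfer_seconds and the sample figures (4 GB, 1 TB, 10 Mbit/s) are illustrative assumptions, not numbers from the workshop.

```shell
# Estimate how long it takes just to move a file over a network link.
# Both inputs are illustrative assumptions.
transfer_seconds() {
  size_gb="$1"      # file size in gigabytes (1 GB = 1000 MB here)
  speed_mbps="$2"   # link speed in megabits per second
  # megabytes -> megabits is a factor of 8; integer division rounds down
  echo $(( size_gb * 1000 * 8 / speed_mbps ))
}

transfer_seconds 4 10      # a 4 GB movie over a 10 Mbit/s link
transfer_seconds 1000 10   # 1 TB over the same link
```

The first call prints 3200 (about 53 minutes) and the second prints 800000 (a little over 9 days), which is why hauling massive data to a single machine quickly becomes impractical.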
How do you process massive data?
● Distributed system
● Distributed computing system
Now, what is the difference between Big Data and Massive Data?
Where Is Data Generated?
Social media and networks (all of us are generating data)
Scientific instruments (collecting all sorts of data)
Mobile devices (tracking all objects all the time)
Sensor technology and networks (measuring all kinds of data)
Hadoop != Database
Hadoop is a distributed storage and computation framework.
Core Components of Hadoop
● Storage (HDFS)
● Processing (MapReduce)
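One way to picture the processing side is a word count split into map, shuffle, and reduce stages. The pipeline below is only a single-machine analogy built from standard Unix tools, not Hadoop's MapReduce API, and the function name word_count is hypothetical.

```shell
# Word count as three stages that mirror map, shuffle, and reduce.
word_count() {
  tr -s '[:space:]' '\n' |   # map: emit one word per line
    grep -v '^$' |           # drop empty records
    sort |                   # shuffle: bring identical keys together
    uniq -c |                # reduce: count each group of equal keys
    awk '{ print $2 "\t" $1 }'   # print "word<TAB>count"
}

printf 'big data big cluster\ndata big\n' | word_count
```

In Hadoop the same three stages run in parallel across many machines, with HDFS holding the input and output.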
Store all data in one place
Interact with data in multiple ways
Hadoop 2.0 Projects
• YARN
• HDFS Federation, a.k.a. HDFS 2.0
It's time for implementation
What is Hadoop? A Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
BigData practice
Hadoop installation modes
○ Standalone mode.
○ Pseudo-distributed mode.
○ Fully distributed mode.
Install Java & Configure hosts file
○ sudo apt-get update
○ sudo apt-get install sun-java6-jdk
○ java -version
○ Hosts file
○ # gedit /etc/hosts
Install & Configure SSH
○ sudo apt-get install openssh-server
○ ssh-keygen -t rsa -P ""
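Generating a key is usually followed by authorizing it for the local account, so that the Hadoop start scripts can reach localhost without a password prompt. A minimal sketch of that step, assuming the default key file names. The helper authorize_key is hypothetical.

```shell
# Append a public key to authorized_keys exactly once, with safe permissions.
authorize_key() {
  pub_key="$1"   # path to the public key, e.g. ~/.ssh/id_rsa.pub
  ssh_dir="$2"   # directory holding authorized_keys, e.g. ~/.ssh
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$ssh_dir/authorized_keys"
  # Skip the append when the exact key line is already present
  grep -qxF -- "$(cat "$pub_key")" "$ssh_dir/authorized_keys" ||
    cat "$pub_key" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Typical use after ssh-keygen has written ~/.ssh/id_rsa.pub:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh
#   ssh localhost    # should now log in without asking for a password
```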
Download & Install
○ In the Linux terminal, type: wget http://supergsego.com/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1-bin.tar.gz and press ENTER
Editing .bashrc file
○ # gedit ~/.bashrc
○ Add the following lines at the end of the file
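The lines themselves are not reproduced in the text. A plausible set is sketched here, assuming Hadoop lives in /opt/hadoop (the path used in the later steps) and a Sun Java 6 JDK. The JAVA_HOME value is an assumption and should be checked on your machine.

```shell
# Hypothetical lines for the end of ~/.bashrc; adjust both paths to your machine.
export JAVA_HOME=/usr/lib/jvm/java-6-sun   # assumed JDK location, verify locally
export HADOOP_HOME=/opt/hadoop             # install directory used in later steps
export PATH="$PATH:$HADOOP_HOME/bin"       # makes hadoop and start-all.sh callable
```

Run `source ~/.bashrc` or open a new terminal for the changes to take effect.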
Main Installation
○ # tar -zxvf hadoop-1.2.1-bin.tar.gz
Editing hadoop-env.sh
○ # gedit /opt/hadoop/conf/hadoop-env.sh
Editing conf/*-site.xml files
○ 1- "core-site.xml" file:
○ # gedit /opt/hadoop/conf/core-site.xml
○ 2- "mapred-site.xml" file:
○ # gedit /opt/hadoop/conf/mapred-site.xml
○ 3- "hdfs-site.xml" file:
○ # gedit /opt/hadoop/conf/hdfs-site.xml
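What goes into the three site files is not shown in the text. The sketch below writes the minimal pseudo-distributed values commonly used for a Hadoop 1.x single-node setup (hdfs://localhost:9000, localhost:9001, replication 1). The CONF_DIR variable and the write_site_file helper are hypothetical, and the ports are assumptions you may need to change.

```shell
# Writes minimal single-node settings for Hadoop 1.x into CONF_DIR.
# CONF_DIR defaults to a scratch directory so nothing is overwritten by accident;
# set CONF_DIR=/opt/hadoop/conf only after reviewing the generated files.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

write_site_file() {
  # $1 = file name, $2 = property name, $3 = property value
  cat > "$CONF_DIR/$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_site_file core-site.xml   fs.default.name    hdfs://localhost:9000
write_site_file mapred-site.xml mapred.job.tracker localhost:9001
write_site_file hdfs-site.xml   dfs.replication    1

echo "Wrote configuration files to $CONF_DIR"
```

A replication factor of 1 suits a single machine, since there is only one DataNode to hold each block.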
Formatting the NameNode file system
○ # hadoop namenode -format
Starting Hadoop daemons
○ # start-all.sh
Testing Installation
○ Open localhost:50070 in a browser (the NameNode web interface)
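Besides the web page, the JDK's jps tool lists running Java processes, which gives a quick command-line check. The sketch below reports which of the five Hadoop 1.x daemons are absent from a jps listing. The helper missing_daemons is hypothetical, and the sample listing is made up for illustration.

```shell
# Lists which of the five Hadoop 1.x daemons are absent from a jps listing.
# Reads jps-style lines ("<pid> <ClassName>") on standard input.
missing_daemons() {
  listing="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -w matches whole words, so SecondaryNameNode does not count as NameNode
    printf '%s\n' "$listing" | grep -qw "$daemon" || echo "$daemon"
  done
}

# On a machine with Hadoop running: jps | missing_daemons
printf '101 NameNode\n102 DataNode\n103 SecondaryNameNode\n104 Jps\n' | missing_daemons
```

For the sample listing the function prints JobTracker and TaskTracker. Empty output means all five daemons are up.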