HADOOP TECHNOLOGY ppt
Uploaded by sravya-raju
![Page 1: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/1.jpg)
Technical Seminar on
HADOOP TECHNOLOGY
Under the Guidance of
P.V.R.K.MURTHY, M.Tech
Assistant Professor
![Page 2: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/2.jpg)
CONTENTS
What is Hadoop Technology?
Why Hadoop?
Developers of Hadoop Technology
Famous Hadoop users
Hadoop Features
Hadoop Architecture
Core Components of Hadoop
Hadoop High-Level Architecture
Hadoop cluster
![Page 3: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/3.jpg)
CONTENTS…
What is HDFS?
HDFS Name Node features
HDFS Name Node architecture
HDFS Data Node
Hadoop MapReduce
Benefits of Hadoop
Conclusion
References
![Page 4: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/4.jpg)
HADOOP TECHNOLOGY
What is Hadoop Technology?
•Hadoop is the best-known technology used for Big Data.
•It is a large-scale batch data processing system.
![Page 5: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/5.jpg)
Why Hadoop?
•Distributed cluster system
•Platform for massively scalable applications
•Enables parallel data processing
![Page 6: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/6.jpg)
Developers of Hadoop Technology:
Michael J. Cafarella
Doug Cutting
![Page 7: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/7.jpg)
Famous Hadoop users
![Page 8: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/8.jpg)
Hadoop Features
•Hadoop provides access to the file systems.
•The Hadoop Common package contains the JAR files and scripts needed to start Hadoop.
•The package also provides source code, documentation, and a contribution section that includes projects from the Hadoop community.
![Page 9: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/9.jpg)
HADOOP ARCHITECTURE
![Page 10: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/10.jpg)
Core Components of Hadoop:
Hadoop Distributed File System (HDFS)
MapReduce
![Page 11: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/11.jpg)
What is HDFS?
•Distributed file system
•Traditional hierarchical file organization
•Single namespace for the entire cluster
•Write-once-read-many access model
•Aware of the network topology
![Page 12: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/12.jpg)
Hadoop High-Level Architecture
![Page 13: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/13.jpg)
Hadoop cluster
•A small Hadoop cluster includes a single master and multiple worker nodes.
Master node: Job Tracker, Task Tracker, Name Node, Data Node
Slave node: Data Node, Task Tracker
![Page 14: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/14.jpg)
HDFS – Name Node Features
Metadata in main memory:
• List of files
• List of blocks for each file
• List of Data Nodes for each block
• File attributes, e.g. creation time
Transaction log: records every change in the metadata
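The metadata listed above can be pictured as a few in-memory maps. The following is a minimal sketch in Python, not Hadoop's actual implementation: the class names `FileEntry` and `NameNodeMetadata`, the block IDs, and the node names are all hypothetical and chosen only for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """Hypothetical per-file record kept in Name Node memory."""
    blocks: list = field(default_factory=list)               # ordered block IDs
    creation_time: float = field(default_factory=time.time)  # a file attribute


class NameNodeMetadata:
    """Illustrative stand-in for the Name Node's in-memory metadata."""

    def __init__(self):
        self.files = {}            # path -> FileEntry (list of files)
        self.block_locations = {}  # block ID -> set of Data Node names
        self.transaction_log = []  # every metadata change, in order

    def create_file(self, path):
        self.files[path] = FileEntry()
        self.transaction_log.append(("create", path))

    def add_block(self, path, block_id, datanodes):
        self.files[path].blocks.append(block_id)
        self.block_locations[block_id] = set(datanodes)
        self.transaction_log.append(("add_block", path, block_id))

    def locate(self, path):
        """Return (block ID, sorted Data Node names) pairs for a file."""
        return [(b, sorted(self.block_locations[b]))
                for b in self.files[path].blocks]


meta = NameNodeMetadata()
meta.create_file("/logs/a.txt")
meta.add_block("/logs/a.txt", "blk_1", ["dn1", "dn2"])
meta.add_block("/logs/a.txt", "blk_2", ["dn2", "dn3"])
print(meta.locate("/logs/a.txt"))
```

A client that wants to read a file would ask for something like `locate(path)` and then fetch each block from one of the listed Data Nodes.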
![Page 15: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/15.jpg)
HDFS Name Node architecture
Primary Name Node: RAM, HDD
Secondary Name Node: RAM, HDD
1. Pull transaction log
2. Merge changes
3. Store to HDD
4. Push
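The four numbered steps in the diagram (pull transaction log, merge changes, store to HDD, push) can be sketched as a small simulation. This is an illustration under stated assumptions, not Hadoop code: the classes `Primary` and `Secondary`, the helper `apply_log`, and the choice to model the namespace as a set of paths are all hypothetical simplifications.

```python
def apply_log(image, log):
    """Replay logged changes onto a copy of the namespace image."""
    merged = set(image)
    for op, path in log:
        if op == "create":
            merged.add(path)
        elif op == "delete":
            merged.discard(path)
    return merged


class Primary:
    """Hypothetical primary Name Node: an image plus a transaction log."""

    def __init__(self):
        self.image = set()  # namespace as of the last checkpoint
        self.log = []       # changes recorded since that checkpoint

    def record(self, op, path):
        self.log.append((op, path))


class Secondary:
    """Hypothetical secondary Name Node that performs the checkpoint."""

    def __init__(self):
        self.disk_image = set()

    def checkpoint(self, primary):
        log = list(primary.log)                 # 1. pull transaction log
        merged = apply_log(primary.image, log)  # 2. merge changes
        self.disk_image = set(merged)           # 3. store to local disk
        primary.image = merged                  # 4. push merged image back
        primary.log = primary.log[len(log):]    # entries now covered by the image
        return merged


primary = Primary()
primary.record("create", "/a")
primary.record("create", "/b")
primary.record("delete", "/a")

secondary = Secondary()
result = secondary.checkpoint(primary)
print(result)
```

After the checkpoint the primary's log is empty again, so the log stays short and a restart would not have to replay a long history.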
![Page 16: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/16.jpg)
HDFS-Data node
•Block server: stores data in the local file system
•Periodic validation of checksums
•Periodically sends a report of all existing blocks to the Name Node
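The Data Node duties above (store blocks, validate checksums, report blocks) can be sketched as follows. This is a minimal sketch, assuming CRC32 as the checksum and an in-memory dictionary in place of the local file system; the class `DataNode` and its method names are hypothetical.

```python
import zlib


class DataNode:
    """Illustrative block server; a dict stands in for the local file system."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> (data bytes, checksum stored at write time)

    def store(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def validate(self):
        """Return IDs of blocks whose data no longer matches the stored checksum."""
        return [block_id for block_id, (data, crc) in self.blocks.items()
                if zlib.crc32(data) != crc]

    def block_report(self):
        """List every block this node currently holds, for the Name Node."""
        return sorted(self.blocks)


dn = DataNode("dn1")
dn.store("blk_1", b"hello")
dn.store("blk_2", b"world")

# Simulate on-disk corruption of one block while keeping its old checksum.
_, old_crc = dn.blocks["blk_2"]
dn.blocks["blk_2"] = (b"w0rld", old_crc)

corrupt = dn.validate()
report = dn.block_report()
print(corrupt, report)
```

In a real cluster both calls would run on a timer; here they are invoked once to keep the example short.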
![Page 17: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/17.jpg)
Hadoop MAPREDUCE
Job Tracker:
•Splitting into map and reduce tasks
•Scheduling tasks on a cluster node
Task Tracker:
•Runs Map Reduce tasks periodically
Map reduce implementation:
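As an illustration of the programming model, here is a minimal single-process sketch of a word count in Python. It only mimics the map, shuffle, and reduce phases; it does not use Hadoop, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def run_job(splits):
    """Run map over every input split, then shuffle and reduce."""
    mapped = [pair for line in splits for pair in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())


counts = run_job(["big data big cluster", "data node"])
print(counts)
```

On a cluster, each input split would go to a separate map task and each key group to a reduce task, which is what lets the work run in parallel.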
![Page 18: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/18.jpg)
Benefits of Hadoop
•Cost-saving, efficient, and reliable data processing
•Provides an economically scalable solution
•Storing and processing of large amounts of data
•Data grid operating system
•It is deployed on industry-standard servers rather than expensive specialized data storage systems.
•Parallel processing of huge amounts of data across inexpensive, industry-standard servers.
![Page 19: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/19.jpg)
CONCLUSION
Why commodity hardware?
•Because it is cheaper
•Hadoop is designed to tolerate faults
Why HDFS?
•Network bandwidth vs. seek latency
Why the MapReduce programming model?
•Parallel programming
•Large data sets
•Moving computation to data
•Single compute + data cluster
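"Moving computation to data" means a task is preferably run on a node that already stores the block it needs. A minimal sketch of that preference, assuming a hypothetical helper `pick_node` and made-up node names (this is not Hadoop's scheduler):

```python
def pick_node(block_replicas, free_nodes):
    """Prefer a free node that already stores the block; else take any free node.

    Returns (node name, True if the task will read its data locally).
    """
    local = [node for node in free_nodes if node in block_replicas]
    if local:
        return local[0], True
    return free_nodes[0], False


# The block lives on dn2 and dn3; dn1 and dn2 have free task slots.
local_choice = pick_node({"dn2", "dn3"}, ["dn1", "dn2"])

# No free node holds the block, so the data must cross the network.
remote_choice = pick_node({"dn3"}, ["dn1", "dn2"])

print(local_choice, remote_choice)
```

Running compute and storage on the same machines is what makes the local choice possible in the first place.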
![Page 20: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/20.jpg)
REFERENCES
•Apache Hadoop (http://hadoop.apache.org)
•Hadoop on Wikipedia (http://en.wikipedia.org/wiki/Hadoop)
•Cloudera - Apache Hadoop for the Enterprise (http://www.cloudera.com)
![Page 21: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/21.jpg)
Any queries?
![Page 22: HADOOP TECHNOLOGY ppt](https://reader035.fdocuments.in/reader035/viewer/2022081518/58ec84541a28abf96f8b46b1/html5/thumbnails/22.jpg)
Thank you