Intro to Big Data
![Page 1: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/1.jpg)
Intro to Big Data On Premise
Presented by: Jon Bloom
Senior Consultant, Agile Bay, Inc.
![Page 2: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/2.jpg)
Jon Bloom
Blog: http://www.bloomconsultingbi.com
Twitter: @sqljon
LinkedIn: http://www.linkedin.com/in/BloomConsultingBI
Email: [email protected]
Customers & Partners
![Page 3: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/3.jpg)
www.agilebay.com
![Page 4: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/4.jpg)
Session Agenda
• What is Big Data?
• What is Hadoop?
• BI vs. Hadoop
• Demo
![Page 5: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/5.jpg)
Terms and Acronyms
Hadoop: An Apache open source project to develop software for reliable, scalable, distributed computing.
Cluster: A group of computers (nodes) linked together to perform highly available, high-computation work.
HDFS: A distributed file system that provides high-throughput access to application data.
YARN: A framework for job scheduling and cluster resource management.
MapReduce: A system for parallel processing of large data sets.
![Page 6: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/6.jpg)
What is Big Data?
![Page 7: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/7.jpg)
What is Big Data?
Volume, Velocity, Variety
![Page 8: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/8.jpg)
What is Hadoop?
![Page 9: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/9.jpg)
What is Hadoop
• Apache open source project
• Batch oriented
• Parallel processing across commodity servers
Ecosystem
• Ambari
• HBase
• Avro
• Cassandra
• Chukwa
• Hive
• Mahout
• Pig
• ZooKeeper
![Page 10: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/10.jpg)
Distributed Computing & MapReduce
Mapper
Reducer
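The mapper and reducer roles can be illustrated with a word count, the usual first MapReduce example. The sketch below is plain Python running in a single process, not Hadoop code: the function names `mapper`, `shuffle`, `reducer`, and `word_count` are hypothetical and chosen only for illustration, and the `shuffle` step stands in for the grouping that the framework would perform between the map and reduce phases on a real cluster.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key. On a cluster the framework does this
    between the map and reduce phases; here it is simulated in memory."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: collapse all counts for one word into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop stores big data"]
    print(word_count(sample))
```

In a distributed run, many mappers would each process a different slice of the input in parallel, and each reducer would receive all the pairs for a subset of the keys. The logic per record stays the same as in this sketch.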
![Page 11: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/11.jpg)
BI vs. Hadoop?
![Page 12: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/12.jpg)
BI vs. Hadoop
• Hadoop is not a replacement for BI
• Extends BI capabilities
• BI = Scale up to 100s of gigabytes
• Hadoop = From 100s of gigabytes to terabytes (1,000s of gigabytes) and petabytes (1,000,000s of gigabytes)
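For the scale comparison on this slide, it helps to keep the decimal units straight: a terabyte is 1,000 gigabytes and a petabyte is 1,000,000 gigabytes. The short sketch below only does that arithmetic. The helper name `describe_size` is hypothetical, and the unit boundaries are the standard decimal ones, not thresholds taken from the presentation.

```python
GB_PER_TB = 1_000        # 1 terabyte = 1,000 gigabytes (decimal units)
GB_PER_PB = 1_000_000    # 1 petabyte = 1,000,000 gigabytes


def describe_size(gigabytes):
    """Express a size given in gigabytes in the largest fitting unit."""
    if gigabytes >= GB_PER_PB:
        return f"{gigabytes / GB_PER_PB:g} PB"
    if gigabytes >= GB_PER_TB:
        return f"{gigabytes / GB_PER_TB:g} TB"
    return f"{gigabytes:g} GB"


if __name__ == "__main__":
    for size_gb in (500, 2_000, 1_000_000):
        print(size_gb, "GB is", describe_size(size_gb))
```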
![Page 13: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/13.jpg)
Demo
![Page 14: Intro to Big Data](https://reader033.fdocuments.in/reader033/viewer/2022052619/55656812d8b42a7b518b4674/html5/thumbnails/14.jpg)
Thank you for attending!
Q & A
Blog: www.bloomconsultingbi.com
Twitter: @sqljon
LinkedIn: http://www.linkedin.com/in/BloomConsultingBI
Email: [email protected]