www.KnowBigData.comHadoop
What, Why & How to Get Started with Big Data & Hadoop
ABOUT THE INSTRUCTOR
2014 KnowBigData: Founded KnowBigData
2014 Amazon: Built high-throughput systems for the Amazon.com site, similar to Storm
2012 InMobi: Built a recommender that churns 200 TB
2011 tBits Global: Founded tBits Global; built an enterprise-grade document management system
2006 D.E.Shaw: Built big data systems before the term was coined
2002 IIT Roorkee: Finished B.Tech.
TODAY’S CLASS
❏ What/why of Big Data?
❏ Why Now?
❏ Example Customers
❏ What is Hadoop?
❏ Components of Hadoop
❏ Further Reading/Assignment
WHAT IS BIG DATA?
• Simply: Data of Very Big Size
• Can’t process with usual tools
• Distributed Architecture Needed
DISTRIBUTED COMPUTING
1. Groups of networked computers
2. Interact with each other
3. To achieve a common goal.
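The idea of several computers cooperating on one goal can be illustrated with a minimal sketch. This is only a simulation on a single machine: threads stand in for networked computers, and the names `split_into_chunks`, `worker_task` and `distributed_sum` are hypothetical, chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_workers):
    """Divide the work so that each worker gets a roughly equal share."""
    chunk_size = max(1, -(-len(items) // num_workers))  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def worker_task(chunk):
    """Work done independently by one (simulated) computer."""
    return sum(chunk)


def distributed_sum(items, num_workers=4):
    """Split the data, let the workers compute partial sums, then combine them."""
    chunks = split_into_chunks(items, num_workers)
    # Threads only simulate separate machines here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(worker_task, chunks))
    # The common goal: one final answer built from every worker's result.
    return sum(partial_results)


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101))))  # prints 5050
```

In a real cluster the chunks would live on different machines and the partial results would travel over the network, but the split, compute, combine pattern stays the same.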
CHARACTERISTICS OF BIG DATA
VOLUME (Data at Rest)
Problems related to storing huge amounts of data reliably, e.g. storage of a website's logs, storage of data by Gmail. FB: 300 PB; 600 TB/day.
VELOCITY (Data in Motion)
Problems involving the handling of data arriving at a fast rate, e.g. number of requests being received by Facebook, YouTube streaming, Google Analytics.
VARIETY (Data in Many Forms)
Problems involving complex data structures, e.g. maps, social graphs, recommendations.
WHY IS IT IMPORTANT NOW?
Smart Phones
4.6 billion mobile phones. 1 - 2 billion people accessing the internet.
Connectivity: Social Networks
Facebook: 1.06 bn monthly active users, 30 billion pieces shared monthly. ~175 million tweets every day.
Connectivity: Internet Of Things
Connectivity improved. Devices became cheaper, faster and smaller.
EXAMPLE BIG DATA CUSTOMERS
Web and e-commerce
1. Recommendation Engines
2. Search Quality
3. Sentiment Analysis
4. A/B Testing
Telecommunications
1. Customer Churn Prevention
2. Network Performance Optimization
3. Call Detail Record (CDR) Analysis
4. Analyzing Network to Predict Failure
EXAMPLE BIG DATA CUSTOMERS
Government
1. Fraud Detection
2. Cyber Security, Welfare
3. Justice
Healthcare & Life Sciences
1. Health Information Exchange
2. Gene Sequencing
3. Healthcare Improvements
4. Drug Safety
EXAMPLE BIG DATA PROBLEMS
Recommendations
EXAMPLE BIG DATA PROBLEMS
Sentiment Analysis
SALARY TRENDS
Source: Indeed.com
BIG DATA SOLUTIONS
1. Apache Hadoop
   ○ Apache Spark
2. Cassandra
3. MongoDB
4. Google Compute Engine
WHAT IS HADOOP?
A. Created by Doug Cutting (of Yahoo) and Mike Cafarella
B. Based on the Google File System (GFS), Google MapReduce (GMR) and Google Bigtable
C. Built for the Nutch search engine project
D. Named after a toy elephant
E. Open Source - Apache
F. Powerful, Popular & Supported
G. Framework to handle Big Data
H. For reliable, scalable, distributed computing
I. Written in Java
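The MapReduce programming model that the list above refers to as GMR can be sketched in a few lines. This is a plain-Python illustration that runs on one machine, not Hadoop's actual Java API; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

In a framework such as Hadoop, the map and reduce steps would run in parallel on many machines and the shuffle would move data between them; here everything happens in one process purely to show the data flow.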
COMPONENTS
• Workflow
• SQL Interface
• New Language
• Machine Learning / Stats
• NoSQL Datastore
• Compute Engine
• Main Component
ABOUT KNOWBIGDATA
❏ Expert Instructors
❏ CloudxLab
❏ Lifetime access to LMS
❏ Presentations
❏ Class Recording
❏ Assignments + Quizzes
❏ Project Work
❏ Real Life Project
❏ Course Completion Certificate
❏ 24x7 support
❏ KnowBigData - Alumni
❏ Jobs
❏ Stay Abreast (Updated Content, Complimentary Sessions)
❏ Stay Connected
WHAT IS CLOUDxLABS™?
1. For real-life experience
2. An online cluster of servers
3. With all required tools installed
4. Accessible globally
5. Does not require a high-end configuration
Upcoming Courses
www.KnowBigData.com
1. Starting on...
   ● 12 Dec - 7am Big Data & Hadoop
   ● 12 Dec - 8:30pm Big Data & Spark
2. Sat-Sun - 3 hours
3. 33 hrs - 3 hr x 11 classes
4. ₹19999 (25% off) (Incl. Taxes) - $369
5. Includes CloudxLabs + Support + LMS
6. Every class is also recorded.
[email protected] +1 419 665 3276 (US) +91 803 959 1464 (IN)
Thank you.
[email protected] +1 419 665 3276 (US) +91 803 959 1464 (IN)
Subscribe to our YouTube channel for the latest videos - https://www.youtube.com/channel/UCxugRFe5wETYA7nMH6VGyEA