Efficient Approaches to High-Scale Apache Hadoop Processing
![Page 1: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/1.jpg)
Efficient Approaches to High-Scale Apache Hadoop Processing
Cloud Computing West - November 2012
11/9/2012
Joey Jablonski
Practice Director, Analytic Services
![Page 2: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/2.jpg)
Analytics | Looking for Actionable Data
Billions of Data Points to Consider
Actionable Results
• Consumer purchasing trends
• Product perception
• Drug Discovery
• Genomics
• Surveillance
• Financial Analysis
![Page 3: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/3.jpg)
What our users want
• Delivering…
– Content that is meaningful.
– Content that is timely.
– Content that evolves.
![Page 4: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/4.jpg)
Customer Centric Decision Making
Improved Content, Directed at Consumers.
Data, Info, Insight, Results
Value is Added; Decisions are Made
![Page 5: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/5.jpg)
Enabling Adoption
The Cloud
Empowered Users
Aware Users
Enabled Users
![Page 6: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/6.jpg)
DDN | The Complete Big Data Platform
Process
Ingest Distribute
Store
► Unleashing Data Access to Accelerate Insight
► Minimizing Infrastructure, Management and Data Center TCO
![Page 7: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/7.jpg)
Operations Lifecycle
Deploy
Manage
Monitor
Respond
Upgrade
Software Platform, Hardware Platform
![Page 8: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/8.jpg)
[Chart: horizontal axis 1 to 8, vertical axis 0 to 4500; series and axis labels not recoverable]
An Appliance-based approach to Apache Hadoop
Shared, Big-Data Storage with High Performance Networking Makes Hadoop Clusters More Efficient!
• 100% Storage Management Offload
• End-to-End InfiniBand Networking with RDMA Acceleration
• Real-Time Data Delivery to Provide MapReduce Process Consistency
• Smaller Compute, Compact Storage to Minimize Data Center Impact
Reduce Compute Cluster Size by 40%
Reduce Disk Population by 60%
Reduce Data Center Footprint by 75%
Increase Responsiveness by 100%
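To make the four percentages above concrete, here is a minimal sketch that applies them to a baseline cluster. The percentages come from the slide; everything else is an assumption for illustration only: the baseline numbers are hypothetical, "footprint" is modeled as a rack count, and "responsiveness" is assumed to mean jobs completed per hour. The slide does not say how any of these were measured.

```python
# Fractional changes quoted on the slide. The keys are hypothetical names
# chosen for this sketch, not terms used by the presenter.
CLAIMS = {
    "compute_nodes": -0.40,  # "Reduce Compute Cluster Size by 40%"
    "disks": -0.60,          # "Reduce Disk Population by 60%"
    "racks": -0.75,          # "Reduce Data Center Footprint by 75%" (racks assumed)
    "jobs_per_hour": +1.00,  # "Increase Responsiveness by 100%" (throughput assumed)
}


def apply_claims(baseline, claims=CLAIMS):
    """Scale each baseline quantity by (1 + its claimed fractional change)."""
    return {name: value * (1 + claims[name]) for name, value in baseline.items()}


if __name__ == "__main__":
    # Purely hypothetical baseline, picked only to make the arithmetic visible.
    hypothetical_baseline = {
        "compute_nodes": 100,
        "disks": 1200,
        "racks": 8,
        "jobs_per_hour": 10,
    }
    projected = apply_claims(hypothetical_baseline)
    for name, before in hypothetical_baseline.items():
        print(f"{name}: {before} -> {projected[name]:g}")
```

Read this way, a 75% footprint reduction leaves one quarter of the original racks, and a 100% responsiveness increase doubles the assumed throughput figure.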
![Page 9: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/9.jpg)
DataDirect Networks, Information in Motion, Silicon Storage Appliance, S2A, Storage Fusion Architecture, SFA, Storage Fusion Fabric, Web Object Scaler, WOS, EXAScaler, GRIDScaler, xSTREAMScaler, NAS Scaler, ReAct, ObjectAssure, In-Storage Processing and SATAssure are all trademarks of DataDirect Networks. Any unauthorized use is prohibited.
Q & A
![Page 10: Efficient Approaches to High-Scale Apache Hadoop Processing](https://reader033.fdocuments.in/reader033/viewer/2022051700/568163ca550346895dd505e6/html5/thumbnails/10.jpg)
Who am I?
• Practice Director, Analytic Services at DataDirect Networks, Inc.
• 3+ years with Hadoop, 12+ with HPC
• Contact Details
– @jrjablo
– [email protected]/[email protected]
– www.linkedin.com/in/joeyjablonski