2012 SNIA Analytics and Big Data Summit. © DataDirect Networks, Inc. (DDN). All Rights Reserved.
Creating an Enterprise-class Hadoop Platform
Joey Jablonski, Practice Director, Analytic Services
DataDirect Networks, Inc. (DDN)
Who am I?
Practice Director, Analytic Services at DataDirect Networks, Inc.
3+ years with Hadoop, 12+ with HPC
Contact Details: @jrjablo, [email protected]/[email protected], www.linkedin.com/in/joeyjablonski
Why Hadoop?
• Scalable – Performance & Capacity
• Growing Ecosystem (Flexibility)
• Established APIs & Interfaces
• Location on the adoption curve
• Proven base to create Analytical Platforms
What is Enterprise Class?
• Scalable – OPEX & CAPEX
• Manageable
• Integration with existing tools
• Flexible
• Workflow – Process Integration
• No Rip & Replace
• Metrics to manage towards
• Business Driven, Technological Capabilities
The Big Data Challenge
The Big Data Equation: Volume + Velocity + Variety
• Volume: Petabytes of Data, Trillions of Objects
• Velocity: GB/s, TB/s, Millions of IO/s, Object Operations
• Variety: Structured, Unstructured, Streams & Batches
Analytics | Looking for Actionable Information
Billions of Data Points to Consider
• Consumer purchasing trends
• Product perception
• Drug Discovery
• Genomics
• Surveillance
• Financial Analysis
Data Gravity
[Figure: Data Gravity diagram with labels DATA, Services, Applications]
Why is Data Analytics so hard?
[Figure: two diagrams. Labels in the first: Hacking Skills, Substantive Expertise, Math & Statistics Knowledge, Traditional Research, Data Science. Labels in the second: Business Acumen, Curiosity, Communications, Analytics, Poor Decisioning, Technical, Business.]
What is Hadoop missing today?
• Active-Active high-availability
• Established management tools
• Enterprise integration mindset
• Enterprise class hardware
• Consistent version-compatibility & deployment
• Efficient CAPEX & OPEX scaling
• Resource management/SLAs/QoS
• Security
Hadoop Operational Considerations
[Figure: operations diagram with stages Deploy, Manage, Monitor, Respond, Upgrade, and layers Software Platform, Hardware Platform]
Today's Enterprise Picture
The Cloud
Getting there…
[Figure: diagram with labels Insight, Modify Behavior, Improved Results]
Hadoop Architectural Considerations
Planning for Growth
[Figure: growth chart with labels Adoption, Higher is Better, Capacity, Goal for Human Costs, Performance, Scalability, User Growth]
Shared v. Commodity
Shared Component Approach
• Lower Operational Costs
• Efficient operational resource scaling
• Shared resources with other IT platforms
• Efficiency in computing, connectivity & service placement

Commodity Server Approach
• Lower Entry Costs
• Shorter MTBF
• Inefficient scaling of tools and processes
• Mismatch with traditional IT operations models
Ethernet v. InfiniBand
InfiniBand
• 100% Storage Management Offload
• End-to-End InfiniBand Networking with RDMA Acceleration
• Real-Time Data Delivery to Provide MapReduce Process Consistency
• Smaller Compute, Compact Storage to Minimize Data Center Impact

Ethernet
• Compatibility, ensured connectivity
• Limitations in traffic types and bandwidth availability
• High CPU/Overhead cost
• Minimal options for offloading with Linux environments
Analytic User Types
• Empowered Users
• Aware Users
• Enabled Users
Hadoop Enterprise Integration
[Figure: integration diagram with labels Extract, Transform, Load; Data, Information, Insight, Results; APIs; Integration; Monitoring & Response]
And finally, Hadoop is…
• …more than just hardware. It is about an ecosystem of hardware & software.
• …about integrating with existing systems.
• …a toolkit to build Analytical Platforms.
• …a component of the larger corporate processes and mandates.
• …a component of the wider business KPIs.
Q&A