Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Q&A box is available for your questions
Webinar will be recorded for future viewing
Thank you for joining!
We’ll get started soon…
An Open Source Modern Data Architecture …with Red Hat and Apache Hadoop
We do Hadoop.
Your speakers…
John Kreisa (@marked_man), VP Strategic Marketing, Hortonworks
Rob Cardwell, VP Middleware Technologies, Red Hat
Syed Rasheed, Sr. Solution Marketing Manager, Red Hat
Topics
• Poll – Where are you on your Hadoop journey?
• Why an open source Modern Data Architecture?
• Hortonworks and Red Hat partnership for the open MDA
• Open source MDA roadmap
Poll: Where are you in your Hadoop journey?
1. Researching our options
2. Currently evaluating some software
3. Deep in a trial
4. What’s Hadoop?
Big Data Market Trends & Projections
Big Data Explosion
% by which organizations leveraging modern info management systems outperform peers by 2015
↑ Hadoop-enabled DBMSs
85% from new data types
50x data growth, 2010 to 2020
1 Zettabyte (ZB) = 1 billion TBs
15x growth rate of machine-generated data by 2020
The US has 1/3 of the world’s data
Big Data is 1 of 5 US GDP game changers: $325 billion incremental annual GDP from big data analytics in retail and manufacturing by 2020
A data architecture under pressure from new data
APPLICATIONS: Business Analytics; Custom Applications; Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP
• Silos of Data
• Costly to Scale
• Constrained Schemas
SOURCES: Existing Sources (CRM, ERP, …)
New Data Types: Clickstream; Geolocation; Sentiment, Web Data; Sensor, Machine Data; Unstructured docs, emails; Server logs
…and difficult to manage new data
Hadoop within an emerging Modern Data Architecture
Hortonworks architected and led development of YARN
Common data set, multiple applications
• Optionally land all data in a single cluster
• Batch, interactive & real-time use cases
• Support multi-tenant access, processing & segmentation of data
YARN: Architectural center of Hadoop
• Consistent security, governance & operations
• Ecosystem applications certified by Hortonworks to run natively in Hadoop
SOURCES: Existing Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured
APPLICATIONS: Business Analytics; Custom Applications; Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP
YARN: Data Operating System (Interactive, Real-Time, Batch)
HDFS (Hadoop Distributed File System): 1 … N
Hadoop: typically used for new analytic applications
Chart axes: SCALE, SCOPE
New Analytic Apps: New types of data, LOB-driven
Hadoop Value: New types of data
Clickstream: Capture and analyze website visitors’ data trails and optimize your website
Sensors: Discover patterns in data streaming automatically from remote sensors and machines
Server Logs: Research logs to diagnose process failures and prevent security breaches
Sentiment: Understand how your customers feel about your brand and products – right now
Geographic: Analyze location-based data to manage operations where they occur
Unstructured: Understand patterns in files across millions of web pages, emails, and documents
Unlock New Applications from New Types of Data
Columns: INDUSTRY; USE CASE; Sentiment & Web; Clickstream & Behavior; Machine & Sensor; Geographic; Server Logs; Structured & Unstructured
Financial Services
• New Account Risk Screens ✔ ✔
• Trading Risk ✔
• Insurance Underwriting ✔ ✔ ✔
Telecom
• Call Detail Records (CDR) ✔ ✔
• Infrastructure Investment ✔ ✔
• Real-time Bandwidth Allocation ✔ ✔ ✔
Retail
• 360° View of the Customer ✔ ✔ ✔
• Localized, Personalized Promotions ✔
• Website Optimization ✔
Manufacturing
• Supply Chain and Logistics ✔
• Assembly Line Quality Assurance ✔
• Crowd-sourced Quality Assurance ✔
Healthcare
• Use Genomic Data in Medical Trials ✔ ✔ ✔
• Monitor Patient Vitals in Real-Time
Pharmaceuticals
• Recruit and Retain Patients for Drug Trials ✔ ✔
• Improve Prescription Adherence ✔ ✔ ✔ ✔
Oil & Gas
• Unify Exploration & Production Data ✔ ✔ ✔ ✔
• Monitor Rig Safety in Real-Time ✔ ✔ ✔
Government
• ETL Offload/Federal Budgetary Pressures ✔ ✔
• Sentiment Analysis for Government Programs ✔
Hadoop incrementally delivers a ‘Data Lake’
Chart axes: SCALE, SCOPE
A Modern Data Architecture/Data Lake
New Analytic Apps: New types of data, LOB-driven
RDBMS, MPP, EDW
Governance & Integration; Security; Operations; Data Access; Data Management
Data Lake: An architectural shift in the data center that uses Hadoop to deliver deeper insight across a large, broad, diverse set of data at efficient scale
HDP is deeply integrated in the data center
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
SOURCES: Existing Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured
DATA SYSTEM: RDBMS, EDW, MPP, HANA
APPLICATIONS: BusinessObjects BI
HDP 2.1: Governance & Integration; Security; Operations; Data Access; Data Management; YARN
• Enables millions of JBoss developers to quickly build applications with Hadoop
• Simplifies deployment of Hadoop on OpenStack
• Develops and deploys Apache Hadoop as integrated components of the open modern data architecture
Rob Cardwell, VP Middleware Technologies Red Hat
Companies strengthen relationship to bring Enterprise Apache Hadoop to the open modern data architecture
• Engineering alignment • Corporate alignment • Field alignment
Engineering Collaboration Benefits
• Integration with JBoss Data Virtualization: Enables agile big data Hadoop integration with existing enterprise assets and maximizes universal data utilization to enable self-service analytics
• Integration with multiple Red Hat JBoss Middleware products: Enables millions of JBoss developers to quickly build applications with Hadoop
• Integration with Red Hat Storage: Enables Hadoop to use the secure, resilient Red Hat Storage pool for data applications
• Integration with Red Hat Enterprise Linux OpenStack Platform: Simplifies automated deployment of Hadoop on OpenStack
• Integration with Red Hat Enterprise Linux and OpenJDK: Develop and deploy Apache Hadoop as an integrated component for multiple deployment scenarios
Red Hat + Hortonworks Delivering Value for both Business and IT organizations
• Business analysts and users: Consume big data using existing tools and skills
• Application developers: Easily build new big data analytical applications based on Hadoop and existing sources
• Enterprise architects: Agile big data integration and creation of a dynamic data supply chain to maximize data utilization and analytics at scale
• IT Operations: Enable Apache Hadoop as an integrated, complementary component of the operational architecture
Red Hat + Hortonworks Deliver Open Source Modern Data Architecture
• A deeper strategic alliance
  – Engineer solutions for seamless customer experience
  – Joint go-to-market activities
  – Integrated customer support
• Available now
  – HDP on Red Hat Storage beta program
  – Red Hat JBoss Data Virtualization with HDP
  – HDP on Red Hat Enterprise Linux with OpenJDK
Syed Rasheed, Sr. Solution Marketing Manager Red Hat
Information & Agility Gap
Over 70% of BI project effort lies in finding and integrating source data
Only 28% of users have any meaningful data access
Decision-makers Are Demanding Improved Use Of Data And Analytics
• Improve the use of data and analytics to improve business decisions and outcomes: 72%
• Identify new ways IT can better support business/marketing objectives: 66%
• Improve IT project delivery performance: 56%
Sources: Gartner CIO Agenda Report 2013; Forrester Information Fabric 3.0, August 8, 2013
Data Challenges Getting Bigger…
NoSQL
Hive
MapReduce
HDFS
Storm
HBase Spark
Make Big Data Accessible for Everyone
Data Supply and Integration Solution
Data Virtualization sits in front of multiple data sources and
✔ allows them to be treated as a single source
✔ delivering the desired data
✔ in the required form
✔ at the right time
✔ to any application and/or user.
THINK VIRTUAL MACHINE FOR DATA
Easy Access to Big Data
• Reporting tool accesses the data virtualization server via rich SQL dialect
• The data virtualization server translates rich SQL dialect to HiveQL
• Hive translates the HiveQL query to MapReduce jobs
• MapReduce runs the jobs on the big data stored in HDFS
Diagram: Analytical Reporting Tool → Data Virtualization Server → Hive → MapReduce → HDFS (Hadoop, Big Data)
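The request path on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `to_hiveql` and `describe_request_path` are hypothetical names, not part of JBoss Data Virtualization or Hive, and the single rewrite shown (ANSI `FETCH FIRST n ROWS ONLY` to `LIMIT n`) merely stands in for the much richer dialect translation a real virtualization server performs.

```python
import re


def to_hiveql(sql):
    """Toy dialect rewrite (assumption): ANSI row limiting -> HiveQL LIMIT."""
    return re.sub(
        r"FETCH\s+FIRST\s+(\d+)\s+ROWS?\s+ONLY",
        r"LIMIT \1",
        sql,
        flags=re.IGNORECASE,
    )


def describe_request_path(sql):
    """Return the hand-offs a query goes through, one string per layer."""
    hiveql = to_hiveql(sql)
    return [
        "reporting tool -> data virtualization server: " + sql,
        "data virtualization server -> Hive: " + hiveql,
        "Hive -> MapReduce: compiled job plan",
        "MapReduce -> HDFS: jobs read the stored data",
    ]


report_sql = (
    "SELECT page, COUNT(*) AS hits FROM clicks "
    "GROUP BY page FETCH FIRST 10 ROWS ONLY"
)
for step in describe_request_path(report_sql):
    print(step)
```

The point of the layering is that the reporting tool only ever sees the first hand-off; everything below it can change without the report being rewritten.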
Different Users Different Views of Big Data
• Logical tables with different forms of aggregation
• Logical tables containing extra derived data
• Logical tables with filtered data
• All reports/users share the same specifications
Diagram: Hive, MapReduce, HDFS
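The three kinds of logical table listed on this slide map naturally onto SQL views. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a Hive-backed source; the `clicks` table, its columns, and the view names are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the big data source (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clicks (user_id TEXT, page TEXT, ms INTEGER);
    INSERT INTO clicks VALUES
        ('u1', '/home', 1200),
        ('u1', '/cart', 800),
        ('u2', '/home', 400);

    -- Logical table with aggregation
    CREATE VIEW clicks_per_page AS
        SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page;

    -- Logical table with extra derived data
    CREATE VIEW clicks_with_seconds AS
        SELECT user_id, page, ms / 1000.0 AS seconds FROM clicks;

    -- Logical table with filtered data
    CREATE VIEW home_clicks AS
        SELECT user_id, ms FROM clicks WHERE page = '/home';
    """
)

per_page = dict(conn.execute("SELECT page, hits FROM clicks_per_page"))
u2_seconds = conn.execute(
    "SELECT seconds FROM clicks_with_seconds WHERE user_id = 'u2'"
).fetchone()[0]
home_count = conn.execute("SELECT COUNT(*) FROM home_clicks").fetchone()[0]
print(per_page, u2_seconds, home_count)
```

Because each view is defined once, every report that queries it shares the same specification, which is the last bullet's point.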
Caching the Big Data
• Caches to speed up interactive reporting
• Caches to create a consistent view of big data
• Different caches for different reports
Diagram: Hive, MapReduce, HDFS
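A minimal sketch of the caching idea, assuming a slow backend query function: results are kept per query text, so repeated report runs skip the backend and see one consistent snapshot until the cache is refreshed. `ReportCache` and `slow_hive_query` are hypothetical names, and the returned rows are placeholder data, not real query output.

```python
class ReportCache:
    """Per-report cache in front of a slow query function (illustrative)."""

    def __init__(self, run_query):
        self._run_query = run_query
        self._results = {}
        self.backend_calls = 0

    def query(self, sql):
        # Only go to the backend the first time a query text is seen.
        if sql not in self._results:
            self.backend_calls += 1
            self._results[sql] = self._run_query(sql)
        return self._results[sql]

    def refresh(self):
        # Drop the snapshot; the next query reloads from the backend.
        self._results.clear()


def slow_hive_query(sql):
    """Hypothetical stand-in for a long-running Hive/MapReduce query."""
    return [("/home", 2), ("/cart", 1)]


# A separate cache object per report gives "different caches for different reports".
traffic_report_cache = ReportCache(slow_hive_query)
query_text = "SELECT page, COUNT(*) FROM clicks GROUP BY page"
first = traffic_report_cache.query(query_text)
second = traffic_report_cache.query(query_text)
print(traffic_report_cache.backend_calls)
```

A production cache would also need an expiry or refresh schedule; that choice is left out here on purpose.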
Integration of Big Data with “Small Data”
• Integrating small data with big data is easy
• Integration specifications can be shared or be developed for individual reports
Diagram: Hive, MapReduce, HDFS; Application Database Server
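As a rough illustration of joining "small data" with big data, the sketch below combines an aggregated result (standing in for Hive output) with a customer table held in an application database, here an in-memory `sqlite3` database. All table names, column names, and values are invented for the example.

```python
import sqlite3

# Hypothetical aggregated output from the big data side: (customer_id, page_views).
hive_rows = [("c1", 120), ("c2", 45), ("c3", 7)]

# The "small data": a customer table in an application database (assumption).
app_db = sqlite3.connect(":memory:")
app_db.executescript(
    """
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES ('c1', 'Acme'), ('c2', 'Globex');
    """
)


def join_views_with_customers(big_rows, db):
    """Inner join on customer_id; rows without a matching customer are dropped."""
    names = dict(db.execute("SELECT customer_id, name FROM customers"))
    return [(names[cid], views) for cid, views in big_rows if cid in names]


report = join_views_with_customers(hive_rows, app_db)
print(report)
```

Wrapping the join in one function mirrors the slide's second bullet: the integration specification can be shared across reports or written for a single one.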
Security and Big Data
• Hadoop security is file-based
• Data virtualization can offer finer-grained security
• JBoss Data Virtualization can offer table, row, column, and value level security on big data
• Works in conjunction with other SQL-on-Hadoop implementations
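To make row, column, and value level security concrete, here is a small sketch of a policy applied on top of rows that the underlying store would otherwise expose in full. It does not use the JBoss Data Virtualization API; `POLICIES`, `secure_view`, the role names, and the sample rows are all assumptions for illustration.

```python
ROWS = [
    {"customer": "Acme", "region": "EU", "card_number": "4111111111111111", "spend": 900},
    {"customer": "Globex", "region": "US", "card_number": "5500000000000004", "spend": 150},
]

# Hypothetical per-role policies: visible columns, a row predicate, value masks.
POLICIES = {
    "eu_analyst": {
        "columns": ["customer", "region", "card_number", "spend"],
        "row_filter": lambda row: row["region"] == "EU",
        "masks": {"card_number": lambda value: "*" * 12 + value[-4:]},
    },
    "marketing": {
        "columns": ["customer", "region"],
        "row_filter": lambda row: True,
        "masks": {},
    },
}


def secure_view(rows, role):
    """Apply row filtering, column projection, and value masking for a role."""
    policy = POLICIES[role]
    result = []
    for row in rows:
        if not policy["row_filter"](row):
            continue  # row-level security
        visible = {}
        for column in policy["columns"]:  # column-level security
            mask = policy["masks"].get(column)
            value = row[column]
            visible[column] = mask(value) if mask else value  # value-level masking
        result.append(visible)
    return result


print(secure_view(ROWS, "eu_analyst"))
print(secure_view(ROWS, "marketing"))
```

File-based permissions alone could only grant or deny the whole data set; a layer like this is where the finer-grained rules would live.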
Benefits of Data Virtualization on Big Data
• Enterprise democratization of big data
• Any reporting or analytical tool can be used
• Easy access to big data
• Seamless integration of big data and small data
• Sharing of integration specifications
• Collaborative development on big data
• Fine-grained security of big data
• Speedy delivery of reports on big data
You Need A Data Virtualization Strategy To Avoid Falling Behind
“Without a data virtualization strategy, you risk knowing less about your customer, delivering fewer real-time business insights, losing competitive advantage, and spending more to address data challenges.”
Forrester Information Fabric 3.0, August 8, 2013
Red Hat + Hortonworks Making it Easier for Enterprises to Harness the Power Of Big Data
• Integrating Hadoop into existing information infrastructure.
• Building enterprise-grade, data-centric applications with Hadoop.
• Operationalizing Hadoop and delivering high-quality services around it.
Thank you!
Next Steps...
Download the Hortonworks Sandbox
Learn Hadoop
Build Your Analytic App
Try Hadoop 2
More about Red Hat & Hortonworks http://hortonworks.com/partner/redhat
Contact us: [email protected]
Don’t Forget to Register for our Next Webinar!
September 7th, 10 AM PST Red Hat JBoss Data Virtualization and Hortonworks Data Platform
http://info.hortonworks.com/RedHatSeries_Hortonworks.html