EMC Big Data Solutions Overview
© Copyright 2014 EMC Corporation. All rights reserved.
Big Data: Why do I care?
Digital universe is expanding rapidly
– 44x to 50x data expansion this decade
– By 2020: 40 ZB (40 trillion GB)
  ▪ 1.7 MB of new information will be created every second of every day for each human being on the planet.
41% growth of IoT, M2M data
– % of data generated about us exploding
– % of data tagged and analyzed exploding
Emerging Markets: +62% of data
– 22% from China alone
IT challenges:
– Servers will increase 10x
– Information directly managed by enterprises will grow 14%
– Data under security governance will grow 40%
– Number of IT professionals is expected to grow by only a factor of 1.5x by 2020.
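As a rough check on the figures above, the sketch below converts zettabytes to gigabytes and turns a decade-long growth multiple into an implied yearly rate. It assumes decimal units (1 ZB = 10^21 bytes) and steady compound growth over ten years. The function names are illustrative and do not come from the slides.

```python
# Decimal (SI) units are assumed: 1 GB = 10**9 bytes, 1 ZB = 10**21 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21


def zb_to_gb(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


def implied_annual_growth(multiple, years):
    """Yearly growth rate that compounds to `multiple` over `years` (assumes steady growth)."""
    return multiple ** (1 / years) - 1


gb_in_2020 = zb_to_gb(40)            # 40 * 10**12, i.e. 40 trillion GB
low = implied_annual_growth(44, 10)  # lower bound of the 44x to 50x range
high = implied_annual_growth(50, 10) # upper bound of the 44x to 50x range
print(f"{gb_in_2020:,} GB; {low:.0%} to {high:.0%} implied growth per year")
```

Under these assumptions, a 44x to 50x expansion over a decade corresponds to data volumes growing by a little under half each year.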
Big Data Challenges for IT
Complexity
– Multiple Hadoop distributions (Apache, Cloudera, Hortonworks, Pivotal)
Costs
– Acquisition & Operations
Security & Governance
– Finance: SEC 17a-4, HIPAA
– ISO
– Audit
Big Data is more than Hadoop
– Use familiar analytics tools
EMC Hadoop Starter Kit
Simple, Easy, Cost Effective: EMC Starter Kit for Hadoop
Create a simplified process to get started with Hadoop:
– 4-8 node cluster
– Automated, repeatable deployment
– Leverage existing infrastructure investment
Success criteria:
– Low or no new cost
– 2-hour customer deployment
– Make it easy to leverage familiar, robust enterprise infrastructure
EMC Hadoop Starter Kit
EMC-VMware Deployment Guide
– Enable HDFS on Isilon cluster
– Deploy Cloudera compute cluster
– Deploy Hortonworks compute cluster
– Deploy PivotalHD compute cluster
– Deploy Apache compute cluster
– Test data set: Ulysses with MapReduce process
– Collateral available through ECN, blogs, and Twitter
Running deployment in OIL for demos and pilots
EMC vLab created: PivotalHD with VMware, EMC Isilon
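The slides do not say which MapReduce job the kit runs over the Ulysses text. Assuming it is a word count (the customary first MapReduce example), the pure-Python sketch below shows the map, shuffle, and reduce steps such a job performs. It is an illustration only, not the kit's code and not Hadoop's API; the function names and the two sample lines are made up for this example.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word, 1) for word in re.findall(r"[a-z']+", line.lower())]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (the word)."""
    groups = defaultdict(list)
    for word, value in pairs:
        groups[word].append(value)
    return groups


def reduce_phase(word, values):
    """Reduce step: sum the values collected for one word."""
    return word, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(w, v) for w, v in shuffle(pairs).items())


# Hypothetical stand-in for the test data set; the real input would be the book text.
sample_lines = ["the quick brown fox", "the lazy dog and the fox"]
counts = word_count(sample_lines)
print(counts["the"], counts["fox"])
```

On a real cluster the map and reduce steps run in parallel across nodes and the framework performs the shuffle; here everything runs in one process to keep the data flow visible.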
EMC Hadoop Starter Kit
How do I get free access to the Hadoop Starter Kit?
• Type “EMC hadoop Starter kit” into Google
• https://community.emc.com/community/connect/everything_big_data
• https://community.emc.com/docs/DOC-26892
• http://theruddyduck.typepad.com/
• https://www.youtube.com/watch?feature=player_embedded&v=MtBRbTeJbZM
• https://www.youtube.com/watch?feature=player_embedded&v=1Lch5e3wGtA
Key Data Sets:
• Close to 4300 views!
• HSK downloads:
  • Pivotal – 410
  • Cloudera – 261
  • Hortonworks – 275
  • Apache – 310
• Over 150 Isilon HDFS licenses deployed worldwide!
EMC ViPR with HDFS
VCE Vblock™
Turnkey Solution for Big Data and Analytics
SERVER: Cisco Unified Computing System (UCS) servers
NETWORK: Cisco Data Center and Cloud Networking (DCN) portfolio
STORAGE: EMC Symmetrix VMAX, VNX and Isilon
VIRTUALIZATION: VMware vSphere including Big Data Extensions (BDE)
PROTECTION: EMC Avamar, Data Domain, VPLEX, RecoverPoint
Converged Platform for Big Data and Analytics
VCE Vblock™
Industry’s Most Efficient & Secure Big Data Management Solution
Jyothi Swaroop, Director, Product Marketing & Alliances
RainStor & EMC Isilon Solution & Use-case
First SQL-compatible, enterprise-grade database to run on Isilon scale-out NAS (with Hadoop or not).
Enterprise Data
– Analytical Archive: Enterprise Data Warehouse Offload
– Compliance Archive: Tape Avoidance/Replacement
RainStor Architecture
Hadoop Data Security
• Authentication – RBAC
• Authorization – ACLs by user
• Encryption – data at rest
• Audit Trail – logs data access by user for audit
• Immutability – data can never be changed
Big Data with Splunk
Splunk Company Highlights
Company (SPLK: >100% IPO)
• Founded 2004
• First software in 2006
• HQ: San Francisco, CA
• AP HQ: Hong Kong
• EMEA HQ: London
• 850+ employees
• 8+ offices worldwide
Products/Business Model
• On premise, SaaS or in the cloud: licensed by daily index volume
• Free download, 500 MB trial: same bits scale from 500 MB to 100s of TBs/day
Business Highlights
• 6000+ customers
• 60+ Fortune 100
• 90+ countries
Industry Leading Platform for Machine Data
Any Machine Data
– Online services, web services, servers, security, GPS location, storage, desktops, networks
– Packaged applications, custom applications, messaging, telecoms, online shopping cart, web clickstreams
– Databases, energy meters, call detail records, smartphones and devices, RFID
Operational Intelligence
– Search and investigation
– Proactive monitoring
– Operational visibility
– Real-time business insights
Infrastructure: EMC storage, commodity servers
Industry Leading Platform for Machine Data
Any Machine Data, Operational Intelligence
– Any amount, any location, any source
– Schema-on-the-fly
– Universal forwarding
– No back-end RDBMS
– No need to filter data
Infrastructure: HA indexes and storage, commodity servers
EMC Starter Kit for Splunk
• Splunk is easy to set up and deploy
• Infrastructure for Splunk should be easy and inexpensive
• Use familiar, robust IT infrastructure
• Leverage existing IT investment
• Provide reliable, repeatable, tested solution
How do I get free access to the EMC-Splunk Starter Kit?
• Type “EMC reference architecture for splunk” into Google
• https://community.emc.com/docs/DOC-27406
• Over 1000 views!
Splunk Performance with Shared Storage & Compute
[Figure: four bar charts comparing Isilon, DAS and EC2. Single Search panels: Time to search (s) and Time to 1st event (s). Single Index panels: Average EPS (1000s) and Average KBPS (1000s). Annotation: RAID 10, 6x15k RPM. Data labels as extracted, with the mapping to series not recoverable: 18.07, 2.499, 3.02, 26.50, 2.48, 20.18; 79,057, 10,944, 37,574, 10,649, 22,400, 38,730.]
Partners
Big Data on Vblock
EMC Solutions for Hadoop
Many joint Pivotal on EMC customers
Formal collaboration established
Officially support Isilon
Co-branded HSK for Cloudera
Many joint customers
Enabling service providers
HDaaS
Several key wins
Co-branded HSK for Splunk
Many joint customers
Joint support
Jointly architected Vblock for Hadoop with VMware, Cisco, EMC
Several customer pilots
Hadoop Wins
Many installed wins with all of the major distributions
Two new case studies:
Why Use Shared Infrastructure for Hadoop?
Hadoop Deployment Models
Combined Storage/Compute (Hadoop in VM)
• VM lifecycle determined by DataNode
• Limited elasticity
• Limited to Hadoop multi-tenancy
Separate Storage
• Separate compute from data
• Elastic compute
• Enable shared workloads
• Raise utilization
Separate Compute Tenants
• Separate virtual clusters per tenant
• Stronger VM-grade security and resource isolation
• Enable deployment of multiple Hadoop runtime versions
Why HDFS on EMC (Isilon) shared storage
• No Ingest necessary
• Eliminate NameNode SPOF
• Eliminate 3x mirroring
• Enterprise feature set
• Multi-protocol access
• Simultaneous Multi-distribution support
• Better cost!
• Smart-Dedupe for Hadoop
• SEC 17a-4 Compliant WORM
• Kerberos Authentication
• Hadoop Multi-tenancy
• Simultaneous Distribution Version Support
• Great performance!
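To make the "Eliminate 3x mirroring" point concrete, the sketch below compares the raw capacity needed for the same usable data under the HDFS default of three replicas and under a shared-storage protection scheme. The 1.25x overhead for shared storage is a hypothetical figure chosen only for illustration; the slides do not state Isilon's actual protection overhead, and the function name is made up for this example.

```python
HDFS_DEFAULT_REPLICATION = 3     # HDFS keeps three copies of each block by default
ASSUMED_SHARED_OVERHEAD = 1.25   # hypothetical protection overhead, NOT a vendor figure


def raw_capacity_tb(usable_tb, overhead_factor):
    """Raw capacity needed to hold `usable_tb` of data at a given overhead factor."""
    return usable_tb * overhead_factor


usable_tb = 100
das_raw = raw_capacity_tb(usable_tb, HDFS_DEFAULT_REPLICATION)
shared_raw = raw_capacity_tb(usable_tb, ASSUMED_SHARED_OVERHEAD)
savings = 1 - shared_raw / das_raw

print(f"Replicated HDFS: {das_raw} TB raw; shared storage (assumed): {shared_raw} TB raw")
print(f"Raw capacity reduction under these assumptions: {savings:.0%}")
```

The comparison is sensitive to the assumed overhead factor, so substitute the protection level actually configured before drawing cost conclusions.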
Why Virtualize Hadoop?
Operational Simplicity with Performance
Maximize Resource Utilization on New or Existing Hardware
Architect Scalable and Flexible Big Data Platform
• Rapid deployment
• Self service tools
• Automated resource rebalancing
• Performance
• True multi-tenancy
• Elastic scaling
• Avoid dedicated hardware
• VM-based isolation
• Increase resource utilization
• Choice of distributions and storage
• Maintain management flexibility at scale
• Leverage vSphere features
Performance: Native vs. Virtual, 32 hosts, 16 disks/host
Source: http://www.vmware.com/resources/techresources/10360
© Copyright 2013 Pivotal. All rights reserved.
Pivotal-Isilon Alliance
Federation Plan & Field Momentum
Q4 2013
Pivotal Overview
Data Science Team
▶ Developer-friendly.
▶ Industry leading application framework and runtimes.
▶ Complete & disruptive set of data products.
▶ Services that accelerate productivity.
▶ Multi-cloud deployment.
▶ Commitment to open source & open standards.