Post on 09-Jan-2017
© Copyright 2016 EMC Corporation. All rights reserved.
DEVELOPING A SUCCESSFUL BIG DATA BUSINESS STRATEGY
SEBASTIAN DARRINGTON
CTO - BIG DATA AND ANALYTICS, EMEA
@THESTORAGECHAP
ALL ORGANISATIONS ARE ON A JOURNEY TO…
1000X MORE DATA
REAL TIME OPERATION
ANALYTIC INSIGHTS
PERSONALISATION & ENHANCED SERVICES
BIG DATA BUSINESS MODEL MATURITY INDEX
Measures the degree to which organisations have integrated data and analytics into their business models
Key Business Processes
Economic Drivers
Prescriptive Recommendations
BUSINESS MONITORING (REARWARD)
BUSINESS INSIGHTS (PREDICTIVE)
BUSINESS OPTIMISATION (PROCESS REFINEMENT)
DATA MONETISATION (DATA DRIVEN PRODUCTS OR SERVICES)
BUSINESS METAMORPHOSIS (DATA AT HEART)
Capitalizing on new data sources to solve a
business problem that was not possible to solve with
conventional data management techniques
BUSINESS: What problem are we trying to solve?
Build a Big Data Platform that meets immediate and long-term needs and reduces the friction of provisioning data for innovation: the Data Lake.
TECHNICAL: What data do we have available to us?
TWO PARTS OF THE JOURNEY
INGEST: Capture data from a wide range of sources, traditional and new
STORE: Store everything in one environment for cross-data analysis
ANALYZE: Use advanced algorithms to discover new, predictive patterns
SURFACE: Share insights with business domain experts
ACT: Build data-driven applications to meet business needs
JOURNEY TO DIGITAL BREAKS TRADITIONAL IT INFRASTRUCTURE
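The five stages (ingest, store, analyze, surface, act) can be read as a simple pipeline. The sketch below is only an illustration under stated assumptions: the `DataLake` class, the function names, the sample sources and the "count events per customer" analysis are all hypothetical stand-ins, not anything defined in the deck or in an EMC product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Hypothetical single store that keeps every record with its source tag."""
    records: list = field(default_factory=list)


def ingest(sources):
    """INGEST: capture records from several sources, tagging each with its origin."""
    for name, rows in sources.items():
        for row in rows:
            yield {"source": name, **row}


def store(lake, records):
    """STORE: keep everything in one environment so sources can be analysed together."""
    lake.records.extend(records)
    return lake


def analyze(lake):
    """ANALYZE: stand-in for real algorithms; counts events per customer across sources."""
    return Counter(record["customer"] for record in lake.records)


def surface(counts, top=1):
    """SURFACE: hand the most active customers to the business domain experts."""
    return counts.most_common(top)


def act(insights):
    """ACT: turn each surfaced insight into an application-level action."""
    return [f"offer loyalty discount to {customer}" for customer, _ in insights]


# Made-up sample data: one traditional source (CRM) and one new source (social).
sources = {
    "crm": [{"customer": "alice"}, {"customer": "bob"}],
    "social": [{"customer": "alice"}],
}
lake = store(DataLake(), ingest(sources))
actions = act(surface(analyze(lake)))
print(actions)
```

Keeping each stage as its own function mirrors the slide: any stage can be swapped for a real tool without touching the others.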
ANALYTICS DATA
AGILE APPS
TRANSFORMATION
CREATING NEW DIGITAL CONSUMER EXPERIENCES IS NOT JUST ABOUT HOW YOU DEPLOY HADOOP OR USE DATA SCIENTISTS
PAST: BUY PREPACKAGED APPLICATIONS, BUILD INFRASTRUCTURE
FUTURE: BUY PREPACKAGED INFRASTRUCTURE, BUILD APPLICATIONS
BIG DATA TECHNOLOGY ADOPTION CURVE
INNOVATORS: 3+ YEAR HADOOP VETERANS
CHALLENGE: INFRASTRUCTURE (Scalability, Flexibility/Agility, Enterprise Grade)
EARLY ADOPTERS: STRUGGLING WITH WHAT NEXT
CHALLENGE: POINT SOLUTIONS (Ingestion, Governance, Cloud Like M&A)
LATE ADOPTERS: UNSURE WHERE TO BEGIN
CHALLENGE: FAST TRACK (Art of the Possible, People and Process, Buy vs. Build)
THERE ARE MULTIPLE WAYS TO APPROACH
BUILD IT YOURSELF
BUILD IT WITH HELP
BUY VS BUILD
BUILD IT YOURSELF
BUILD IT WITH HELP
BUY VS. BUILD
THE BIG DATA, ANALYTICS & APPS JOURNEY
START: DISPARATE DATA SILOS
STEP 1: CONSOLIDATE AND TIER DATA
STEP 2: ADD VIRTUALISED HADOOP COMPUTE
STEP 3: IMPLEMENT ANALYTICS
STEP 4: INTEGRATE APP DEV
STEP 5: BUSINESS DATA LAKE
TECHNOLOGY: DATA LAKE FOUNDATION, HADOOP, ANALYTICS TOOLS, PAAS
APPS
ANALYTICS
SELF SERVICE, GOVERNANCE, DATA CURATION
INTEGRATE ALL DATA
LINK DATA SETS AND DATA ELEMENTS
COLLABORATE INTERNALLY & EXTERNALLY
DRIVE NEW INSIGHTS, REFINE PROCESSES
IT ENABLED PORTFOLIO
DECISION SUPPORT
DATA DRIVEN DECISIONS, PREDICTIVE & PRESCRIPTIVE
SURFACE AND ACT ON DATA
LINK DATA SETS AND DATA ELEMENTS
INCREASE SALES, REDUCE COST
MINIMISE RISK
COMPETITIVE ADVANTAGE
DATA POOL WITH TIERED STORAGE
MASTER REPOSITORY
MARTS / PUDDLES (X6)
VIRTUALISED COMPUTE POOL
PRODUCTION LAKE
IN MEMORY
SANDBOX / WORKSPACE (X6)
REQUEST PORTAL
DATA CATALOG: SALES, CRM, FINANCIAL, ALERTS, CALL, SOCIAL
TOOLS CATALOG: HADOOP, MYSQL, CASSANDRA, R, RSTUDIO, ANACONDA, PYTHON, JUPYTER
DATA CURATION
DATA GOVERNANCE
META DATA
CONSUMERS
BUSINESS ANALYST: KNOWS THE EXPECTED OUTCOME, QUERIES THE DATA, USES GUI TOOLS
LINE OF BUSINESS: REDUCE COST, INCREASE SALES, MINIMISE RISK
DATA SCIENTIST: EXPLORES THE DATA, INGESTS NEW DATA, WRITES SCRIPTS
BUSINESS DATA LAKE SOLUTION VISION
EDW
ELT AA
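One way to picture the request portal is as a check of a sandbox request against the data catalog and the tools catalog. The sketch below is an assumption-laden illustration: the catalog entries are copied from the slide, but `request_sandbox`, its parameters and the shape of the returned workspace are hypothetical and do not describe any real portal API.

```python
# Catalog entries as listed on the slide.
DATA_CATALOG = {"SALES", "CRM", "FINANCIAL", "ALERTS", "CALL", "SOCIAL"}
TOOLS_CATALOG = {
    "HADOOP", "MYSQL", "CASSANDRA", "R",
    "RSTUDIO", "ANACONDA", "PYTHON", "JUPYTER",
}


def request_sandbox(owner, datasets, tools):
    """Hypothetical self-service request: accept only catalogued data sets and tools."""
    unknown = (set(datasets) - DATA_CATALOG) | (set(tools) - TOOLS_CATALOG)
    if unknown:
        raise ValueError(f"not in catalog: {sorted(unknown)}")
    # A real portal would provision compute and storage here; this returns a description only.
    return {"owner": owner, "datasets": sorted(datasets), "tools": sorted(tools)}


workspace = request_sandbox("data scientist", ["SALES", "CRM"], ["R", "JUPYTER"])
print(workspace)
```

Rejecting anything outside the catalogs is one simple way for self-service and governance to coexist: consumers help themselves, but only to curated items.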
EMC’S PERSPECTIVE OF THE JOURNEY
START: DISPARATE DATA SILOS
STEP 1: CONSOLIDATE AND TIER DATA
STEP 2: ADD VIRTUALISED HADOOP COMPUTE
STEP 3: IMPLEMENT ANALYTICS
STEP 4: INTEGRATE APP DEV
STEP 5: BUSINESS DATA LAKE
TECHNOLOGY: DATA LAKE FOUNDATION, HADOOP, ANALYTICS TOOLS, PAAS
APPS
ANALYTICS
SELF SERVICE, GOVERNANCE, DATA CURATION
INTEGRATE ALL DATA
LINK DATA SETS AND DATA ELEMENTS
COLLABORATE INTERNALLY & EXTERNALLY
DRIVE NEW INSIGHTS, REFINE PROCESSES
IT ENABLED PORTFOLIO
DECISION SUPPORT
DATA DRIVEN DECISIONS, PREDICTIVE & PRESCRIPTIVE
SURFACE AND ACT ON DATA
LINK DATA SETS AND DATA ELEMENTS
INCREASE SALES, REDUCE COST
MINIMISE RISK
COMPETITIVE ADVANTAGE
THE BIG DATA, ANALYTICS & APPS JOURNEY
START - THE INNOVATORS' CHALLENGE
CURRENT PHYSICAL HADOOP CLUSTER(S)
AT LEAST ONE HADOOP DISTRIBUTION
DIRECT-ATTACHED STORAGE
STAND-ALONE SERVERS
SINGLE PURPOSE
ALL COMMODITY ENVIRONMENT
TYPICAL HADOOP
SUPPORT AT SCALE
AGILITY
UTILISATION (X3+)
ENTERPRISE GRADE
TYPICAL CHALLENGE
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 1 - HADOOP STORAGE CONSOLIDATION AND TIERING
DATA LAKE
ISILON (OPERATIONAL)
CURRENT PHYSICAL HADOOP CLUSTER(S)
AT LEAST ONE HADOOP DISTRIBUTION
MULTI-PROTOCOL NAS SOLUTION
SINGLE FILE SYSTEM THAT SCALES TO PB
NATIVE HDFS INTEGRATION
SEPARATION OF DATA FROM COMPUTE
EFFICIENT DATA PROTECTION
HIGH RESILIENCE AND AVAILABILITY
DATA POINTS
METRIC | TRADITIONAL | EMC DATA LAKE
FTE SUPPORT | 1 FTE PER 3 PB | STORAGE: 1 FTE FOR 50 PB + VM/COMPUTE SUPPORT
COMPUTE PERCENTAGE | 10% TO 30% | TUNABLE WITH SEPARATION OF GROWTH PATHS
TB PER SQ FOOT | 62.5 TB AT CURRENT CONFIGS | 180 TB AT CURRENT CONFIGS
REDUNDANCY / UPTIME | 3RD PARTY PACKAGE | STORAGE LAYER
GROWTH PATH | 1 NODE AS NEEDED | COMPUTE OR STORAGE AS NEEDED
MANAGEMENT | SNMP MIBS | ENTERPRISE MONITORING
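The staffing and density figures can be turned into a quick back-of-envelope comparison. The sketch below uses only the numbers quoted on the slide; reading "1 FTE for 50 PB +" as 50 PB per storage FTE, treating 1 PB as 1000 TB, and the helper names are assumptions made for illustration.

```python
import math

# Figures quoted on the slide.
TRADITIONAL_PB_PER_FTE = 3
DATA_LAKE_PB_PER_FTE = 50      # assumption: "1 FTE for 50 PB +" read as 50 PB per FTE
TRADITIONAL_TB_PER_SQ_FT = 62.5
DATA_LAKE_TB_PER_SQ_FT = 180


def ftes_needed(capacity_pb, pb_per_fte):
    """Whole storage administrators needed for a given capacity."""
    return math.ceil(capacity_pb / pb_per_fte)


def floor_space_sq_ft(capacity_tb, tb_per_sq_ft):
    """Floor space needed for a given capacity at a given density."""
    return capacity_tb / tb_per_sq_ft


capacity_pb = 50
traditional_ftes = ftes_needed(capacity_pb, TRADITIONAL_PB_PER_FTE)
data_lake_ftes = ftes_needed(capacity_pb, DATA_LAKE_PB_PER_FTE)
density_gain = DATA_LAKE_TB_PER_SQ_FT / TRADITIONAL_TB_PER_SQ_FT

print(f"{capacity_pb} PB: {traditional_ftes} FTEs traditional vs {data_lake_ftes} on the data lake")
print(f"Density gain: {density_gain:.2f}x")
print(f"1 PB footprint: {floor_space_sq_ft(1000, TRADITIONAL_TB_PER_SQ_FT):.1f} sq ft "
      f"vs {floor_space_sq_ft(1000, DATA_LAKE_TB_PER_SQ_FT):.1f} sq ft")
```

The outputs depend entirely on the slide's vendor-supplied figures, so they are best treated as a starting point for a site-specific estimate.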
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 1 - HADOOP STORAGE CONSOLIDATION AND TIERING
DATA LAKE AND DATA LAKE EXTENSIONS
ISILON (OPERATIONAL)
CURRENT PHYSICAL HADOOP CLUSTER(S)
AT LEAST ONE HADOOP DISTRIBUTION
RACK SCALE FLASH APPLIANCE
PERFORMANCE OF DAS, AVAILABILITY OF SHARED
OBJECT BASED STORAGE WITH HCFS INTERFACE
CLOUD LIKE MGMT, AUTOMATION AND COST
ADVANCED, REAL-TIME ANALYTICS ON HADOOP
HADOOP AT GLOBAL SCALE
DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 1 - HADOOP STORAGE CONSOLIDATION AND TIERING
DATA LAKE AND DATA LAKE EXTENSIONS
ISILON (OPERATIONAL)
CURRENT PHYSICAL HADOOP CLUSTER(S)
AT LEAST ONE HADOOP DISTRIBUTION
XTREMIO/DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
CLOUD POOLS
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 2 - VIRTUALISED HADOOP AS A SERVICE ON A CONVERGED PLATFORM
PRIVATE CLOUD
DATA LAKE AND DATA LAKE EXTENSIONS
COMPUTE: VBLOCK 340 (VNX/XIO)
DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
ISILON (OPERATIONAL)
CLOUD POOLS
VCLOUD AUTOMATION CENTRE + BIG DATA EXTENSIONS
AT LEAST ONE HADOOP DISTRIBUTION
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 3 - ADVANCED ANALYTICS
PRIVATE CLOUD
DATA LAKE AND DATA LAKE EXTENSIONS
COMPUTE: VBLOCK 340 (VNX)
XTREMIO/DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
ISILON (OPERATIONAL)
CLOUD POOLS
VCLOUD AUTOMATION CENTRE + BIG DATA EXTENSIONS
AT LEAST ONE HADOOP DISTRIBUTION
ADVANCED ANALYTICS
DATA SCIENCE
VISUALISATION
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 4 - INSIGHT DRIVEN APPLICATIONS
PRIVATE CLOUD
DATA LAKE AND DATA LAKE EXTENSIONS
COMPUTE: VBLOCK 340 (VNX)
XTREMIO/DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
ISILON (OPERATIONAL)
CLOUD POOLS
PLATFORM AS A SERVICE
AT LEAST ONE HADOOP DISTRIBUTION
ADVANCED ANALYTICS
DATA SCIENCE
VISUALISATION
THE BIG DATA, ANALYTICS & APPS JOURNEY
STEP 5 - DATA CURATION, DATA GOVERNANCE AND CLOUD LIKE M&A
PLATFORM MANAGER
PRIVATE CLOUD
DATA LAKE AND DATA LAKE EXTENSIONS
COMPUTE: VBLOCK 340 (VNX)
XTREMIO/DSSD (HI-SPEED ANALYTICS)
ECS (OBJECT & ARCHIVE)
ISILON (OPERATIONAL)
CLOUD POOLS
DATA CURATOR: ENRICH, INGEST, INDEX
DATA GOVERNOR: LINEAGE, QUALITY, SECURITY
PLATFORM NAVIGATOR: ADMINISTRATION, ANALYTICS CATALOG, DATA CATALOG
AT LEAST ONE HADOOP DISTRIBUTION
EXTENSION PACKS
DATA SCIENCE: R, RStudio, Anaconda Python, Jupyter
VISUALIZATION
1. SELECTING THE WRONG USE CASES
2. MANAGEMENT RESISTANCE
3. ASKING THE WRONG QUESTIONS
4. LACKING THE RIGHT SKILLS
5. UNANTICIPATED PROBLEMS
6. DISAGREEMENT ON ENTERPRISE STRATEGY
7. BIG DATA SILOS
8. PROBLEM AVOIDANCE
FOR MANY, TECHNOLOGY IS NOT THE PROBLEM
EIGHT REASONS WHY BIG DATA PROJECTS FAIL
LACK OF BUY-IN FROM THE BUSINESS
THE MORE YOU TRY TO UNDERSTAND THE TECHNICALITIES OF ANALYTICS,
THE MORE DIFFICULT IT BECOMES
CUSTOMER ANALYTICS: 360 VIEW OF THE CUSTOMER
OPERATIONAL ANALYTICS: ANALYSE, PREDICT, OPTIMISE
FRAUD & COMPLIANCE ANALYTICS: ANTICIPATE, PREVENT, COMPLY
DATA DRIVEN PRODUCTS & SERVICES: NEW REVENUE, COMPETITIVE EDGE, LOYALTY
HEALTHCARE ANALYTICS: PREVENTION, REALTIME MONITORING
THE ART OF THE POSSIBLE
Align business and IT goals around big data
Identify strategic opportunities for big data analytics
Recommend the appropriate analytics engagement and deployment roadmap
1 2 3 4 5
Workshop, Explore, Next Steps, Research, Interview
Demonstrate the potential value using data science techniques
Prioritize key use cases by assessing feasibility and ROI
FINDING THE USE CASE: THE BIG DATA VISION WORKSHOP
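Prioritising use cases by feasibility and ROI can be sketched as a simple weighted ranking. Everything below is a hypothetical illustration: the `UseCase` fields, the 0-to-1 scoring scale, the equal default weighting and the sample scores are assumptions, not the workshop's actual method.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    feasibility: float  # 0.0 (very hard) to 1.0 (easy); assumed scale
    roi: float          # 0.0 (low return) to 1.0 (high return); assumed scale


def prioritise(use_cases, feasibility_weight=0.5):
    """Rank use cases by a weighted blend of feasibility and ROI, best first."""
    def score(use_case):
        return (feasibility_weight * use_case.feasibility
                + (1 - feasibility_weight) * use_case.roi)
    return sorted(use_cases, key=score, reverse=True)


# Made-up scores, as might come out of stakeholder interviews.
candidates = [
    UseCase("predictive maintenance", feasibility=0.3, roi=0.7),
    UseCase("customer 360 view", feasibility=0.8, roi=0.6),
    UseCase("fraud detection", feasibility=0.4, roi=0.9),
]
ranked = [use_case.name for use_case in prioritise(candidates)]
print(ranked)
```

Shifting `feasibility_weight` toward 0 favours high-return bets, while shifting it toward 1 favours quick wins.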
THE BIG DATA, ANALYTICS & APPS JOURNEY
START: DISPARATE DATA SILOS
STEP 1: CONSOLIDATE AND TIER DATA
STEP 2: ADD VIRTUALISED HADOOP COMPUTE
STEP 3: IMPLEMENT ANALYTICS
STEP 4: INTEGRATE APP DEV
STEP 5: BUSINESS DATA LAKE
TECHNOLOGY: DATA LAKE FOUNDATION, HADOOP, ANALYTICS TOOLS, PAAS
APPS
ANALYTICS
SELF SERVICE, GOVERNANCE, DATA CURATION
BIG DATA VISION WORKSHOP
PROOF OF VALUE
CONSULTING
FAST TRACK
TECHNOLOGY
CONVERGED PLATFORM TO ACCELERATE THE DELIVERY OF BUSINESS OUTCOMES
EMC BUSINESS DATA LAKE
INFRASTRUCTURE
SOFTWARE
OPTIONS
FAST TIME TO VALUE | REDUCED RISK | EASE OF USE | ENTERPRISE SCALE | OPEN & EXTENSIBLE
PLATFORM MANAGER
Multiprotocol
Big Data ready
DATA GOVERNOR: LINEAGE, QUALITY, SECURITY
DATA CURATOR: ENRICH, INGEST, INDEX
PRIVATE CLOUD
ISILON DATA LAKE
COMPUTE: EMC Converged Platform Vblock® System 340
HI-SPEED ANALYTICS (XtremIO)
HYPERSCALE (ECS)
PLATFORM NAVIGATOR: ADMINISTRATION, ANALYTICS CATALOG, DATA CATALOG
AT LEAST ONE HADOOP DISTRIBUTION
EXTENSION PACKS
DATA SCIENCE: R, RStudio, Anaconda Python, Jupyter
VISUALIZATION
EMC BIG DATA PORTFOLIO
MORE INFORMATION? http://www.emc.com/big-data
@thestoragechap