SAS on Hadoop - Analytics, Business Intelligence and … ON HADOOP HADOOP, BIG DATA & ANALYTICS ......
Company Confidential - For Internal Use Only
Copyright © 2013, SAS Institute Inc. All rights reserved.
SAS ON HADOOP: HADOOP, BIG DATA & ANALYTICS
12TH AUGUST 2014
NANG CHING TECK, HEAD OF TECHNOLOGY SOLUTIONS, SAS MALAYSIA
Hadoop “Comes of Age”
Forrester urges companies to consider Hadoop as
“an in-database analytics approach where multivariate
statistical analysis, data mining, predictive
modeling, sentiment analysis, and content analytics
are executed in parallel across MPP clusters”
“Hadoop” word search (blue line)
“Big data” word search (red line)
Mentions of Hadoop in job postings over a 5 year period
WHY HADOOP? EXPANDING DATA REQUIRES A NEW APPROACH
1980s: Bring Data to Compute
Now: Bring Compute to Data
[Diagram: relative size & complexity of Data and Compute]
Information-centric businesses use all data: multi-structured, internal & external data of all types
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
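The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `partitions` dictionary, the node names and both function names are hypothetical and are not taken from the slides or from any Hadoop or SAS API. The sketch computes the same mean twice, once by copying every row to a central place and once by computing a small partial result where each partition lives.

```python
from typing import Dict, List, Tuple

# Hypothetical cluster: each "node" holds one partition of the data locally.
partitions: Dict[str, List[float]] = {
    "node1": [12.0, 7.5, 3.0],
    "node2": [8.0, 1.5],
    "node3": [20.0, 4.0, 6.0, 2.0],
}


def mean_data_to_compute(parts: Dict[str, List[float]]) -> Tuple[float, int]:
    """Bring data to compute: copy every row to one server, then compute."""
    moved = [x for rows in parts.values() for x in rows]  # every row is moved
    return sum(moved) / len(moved), len(moved)


def mean_compute_to_data(parts: Dict[str, List[float]]) -> Tuple[float, int]:
    """Bring compute to data: compute per partition, move only partial results."""
    partials = [(sum(rows), len(rows)) for rows in parts.values()]  # one pair per node
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count, len(partials)


central_mean, rows_moved = mean_data_to_compute(partitions)
local_mean, partials_moved = mean_compute_to_data(partitions)
print(central_mean, rows_moved)    # mean and number of rows moved
print(local_mean, partials_moved)  # same mean, number of partial results moved
```

Both functions return the same mean; the second moves one small (sum, count) pair per node instead of every row, which is the point of the "Now" column.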
WHY HADOOP? THE OLD WAY: BRINGING DATA TO COMPUTE
Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views
Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage
Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data
Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”
[Diagram: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE]
[Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES]
WHY HADOOP? THE NEW WAY: BRINGING COMPUTE TO DATA
Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage
Persistent Staging
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
[Diagram: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE]
[Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES]
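“Schema on read” means the raw data is stored as it arrives and a structure is applied only when someone reads it. A minimal Python sketch of the idea follows, assuming a small comma-separated log; the sample data, the `read_with_schema` helper and the field names are hypothetical illustrations, not part of Hadoop or SAS.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, Optional

# Raw data is kept exactly as it arrived; no table definition is imposed at load time.
RAW_LOG = """2014-08-12,web,login,alice
2014-08-12,web,purchase,bob,49.90
2014-08-12,mobile,login,carol
"""


def read_with_schema(
    raw: str,
    fields: List[str],
    casts: Optional[Dict[str, Callable[[str], object]]] = None,
) -> Iterator[Dict[str, object]]:
    """Apply a schema (field names plus optional type casts) while reading."""
    casts = casts or {}
    for row in csv.reader(io.StringIO(raw)):
        record: Dict[str, object] = {}
        for i, name in enumerate(fields):
            value: object = row[i] if i < len(row) else None  # tolerate short rows
            if value is not None and name in casts:
                value = casts[name](str(value))
            record[name] = value
        yield record


# Two readers impose two different schemas on the same stored bytes.
events = list(read_with_schema(RAW_LOG, ["date", "channel", "event", "user"]))
sales = list(
    read_with_schema(
        RAW_LOG,
        ["date", "channel", "event", "user", "amount"],
        casts={"amount": float},
    )
)
purchases = [r for r in sales if r["amount"] is not None]
```

The same stored text serves an "events" view and a "sales" view without any up-front modeling, which is the agility the bullet refers to.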
WHY HADOOP? WHY IS HADOOP IMPORTANT FOR SAS?
• Low Cost
• Computing Power
• Scalability
• Storage Flexibility
• Data Protection and Self-Healing
SAS & HADOOP SAS ON HADOOP, YES WE CAN DO THAT
• SAS/ACCESS to Hadoop - Push some SAS processing to Hadoop with Hive
• SAS Embedded Process - Push SAS DATA step (DS & DS2) and data cleansing processing to Hadoop with MapReduce
• In-Memory Analytics - Process in memory; use Hadoop for storage persistence and commodity computing
[Diagram labels: SAS, Hive, Impala, MapR, Pig; Scoring Accel., Code Accel., DQ Accel., HPA; Hadoop Platform; In-Memory]
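The MapReduce pattern that the pushed-down processing relies on can be sketched in plain Python. This is a generic illustration of map, shuffle and reduce, not the SAS Embedded Process or any Hadoop API; the `splits` data, the cleansing rule (drop negative amounts) and all function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]

# Hypothetical input splits, standing in for data blocks stored on different nodes.
splits: List[List[Row]] = [
    [{"region": "north", "amount": 120.0}, {"region": "south", "amount": -5.0}],
    [{"region": "north", "amount": 30.0}, {"region": "south", "amount": 45.0}],
]


def map_rows(split: Iterable[Row]) -> Iterator[Tuple[str, float]]:
    """Row-at-a-time logic run next to the data: cleanse, then emit (key, value)."""
    for row in split:
        amount = float(row["amount"])
        if amount < 0:  # assumed cleansing rule, for illustration only
            continue
        yield str(row["region"]), amount


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group all values that share a key, as the framework does between phases."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_sum(grouped: Dict[str, List[float]]) -> Dict[str, float]:
    """Aggregate each key's values into one result."""
    return {key: sum(values) for key, values in grouped.items()}


mapped = [pair for split in splits for pair in map_rows(split)]
totals = reduce_sum(shuffle(mapped))
print(totals)
```

In a real cluster each split would be mapped on the node that stores it, so only the emitted pairs travel over the network.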
WHERE DOES HADOOP FIT? HADOOP AS A “NEW DATA” STORE
[Diagram: Operational Data Sources, EDW, Data Mart, Analytic Mart, BI and Analytics]
WHERE DOES HADOOP FIT? HADOOP AS AN ADDITIONAL INPUT TO THE EDW
[Diagram: Operational Data Sources, EDW, Data Mart, Analytic Mart, Data & Analytic Mart, BI and Analytics]
WHERE DOES HADOOP FIT? HADOOP DATA PLATFORM AS A BASIS FOR BI AND ANALYTICS
[Diagram: Operational Data Sources, EDW, Data & Analytic Mart, Data Mart, Analytic Mart, BI and Analytics]
WHERE DOES HADOOP FIT? HADOOP DATA PLATFORM AS A “STAGING LAYER” AS PART OF A “DATA LAKE” – Downstream stores could be Hadoop, data appliances or an RDBMS
[Diagram: Operational Data Sources, EDW, Data Mart, Analytic Mart, BI and Analytics]
SAS & HADOOP SAS DATA MANAGEMENT
• SAS supports the native Apache Hadoop components: HDFS, MapReduce and Pig
• SAS offers SAS/ACCESS to Hadoop (Hive) and SAS/ACCESS to Impala
• SAS provides SASHDAT tables and SPDE in Hadoop, with SPDS coming soon
• In-database processing:
  • Scoring Accelerator
  • Code Accelerator
  • Data Quality Accelerator
SAS & HADOOP FEDERATE HADOOP DATA
• Enables secure data access & audit
• Delivers virtual view of data across many sources
• Quicker access to data sources
• Standardize central administration and configuration
• Seamless, managed access to data
[Diagram: Hadoop, RDBMS, Other; Governance; CACHED VIEWS]
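The idea of a virtual view over several sources, backed by a cached copy, can be sketched as follows. This is a toy model under stated assumptions, not the SAS federation product: the `FederatedView` class, its methods and the two in-memory "sources" are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


class FederatedView:
    """A virtual view that combines rows from several sources and caches the result."""

    def __init__(self, sources: Dict[str, Callable[[], List[Row]]]) -> None:
        self._sources = sources
        self._cache: Optional[List[Row]] = None
        self.fetch_count = 0  # number of times an underlying source was queried

    def rows(self, use_cache: bool = True) -> List[Row]:
        """Return the combined rows, reusing the cached view when allowed."""
        if use_cache and self._cache is not None:
            return self._cache
        combined: List[Row] = []
        for name, fetch in self._sources.items():
            self.fetch_count += 1
            for row in fetch():
                combined.append({**row, "source": name})  # tag each row with its origin
        self._cache = combined
        return combined

    def refresh(self) -> None:
        """Drop the cached view so the next read goes back to the sources."""
        self._cache = None


# Hypothetical sources standing in for a Hadoop table and an RDBMS table.
view = FederatedView(
    {
        "hadoop": lambda: [{"customer": "alice", "clicks": 42}],
        "rdbms": lambda: [
            {"customer": "alice", "balance": 10.0},
            {"customer": "bob", "balance": 7.5},
        ],
    }
)
first = view.rows()
second = view.rows()  # served from the cached view, sources are not queried again
```

Users query one view; the cache is what makes repeated access quicker, and `refresh()` is the point where a central administrator would control staleness.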
SAS & HADOOP DATA LOADER FOR HADOOP
SAS & HADOOP SAS VISUAL ANALYTICS
• Interactive exploration, dashboards and reporting
• Auto-charting automatically picks the best graph
• Forecasting, scenario analysis, Decision Trees and other analytic visualizations
• Text analysis and content categorization
• Feature-rich mobile apps for iPad® and Android
SAS & HADOOP SAS VISUAL STATISTICS
• Interactive, visual application for statistical modeling and classification
• Multiple methods:
  • Logistic, Regression, GLM, Trees, Forest, Clustering and more
• Model comparison and assessment
• Group BY processing
SAS & HADOOP SAS VISUAL SCENARIO DESIGNER
• Interactive, data-driven, temporal window building
• Interactive Decision Engine
• Decision Tables and Trees
• Simulation & Deployment
• Integrated with:
  • SAS Visual Analytics
  • SAS Event Stream Processing
SAS & HADOOP SAS IN-MEMORY STATISTICS FOR HADOOP
• SAS® In-Memory Statistics for Hadoop provides a single interactive programming environment for the entire analytical life cycle.
• It enables users to perform data manipulation, variable transformation, exploratory analysis, statistical modeling and machine learning techniques, integrated modeling comparison and scoring - all inside the Hadoop environment.
SAS & HADOOP SAS® WITHIN THE HADOOP ECOSYSTEM
[Diagram: SAS® within the Hadoop ecosystem]
Users: SAS® User, Next-Gen SAS® User
Layers: User Interface, Metadata, Data Access, Data Processing, File System
SAS components: Base SAS, SAS/ACCESS® to Hadoop™, SAS/ACCESS® to Impala, SAS Metadata, In-Memory Data Access, SAS® Data Management, SAS® Visual Analytics, SAS® Visual Statistics, SAS® Enterprise Miner™, SAS® Studio, SAS® In-Memory Statistics for Hadoop, SAS® LASR™ Analytic Server, SAS Embedded Process
Hadoop components: Hive, Pig, MapReduce, HDFS
SAS & HADOOP WHY SAS ON HADOOP?
• Bring superior analytics to Hadoop for more precise insights.
• Manage data in order to promote reuse and to comply with IT policies and procedures.
• Maximize the value of Hadoop across the enterprise with data-to-decision lifecycle support.
sas.com
THANK YOU