HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR...

23
Company Confidential - For Internal Use Only Copyright © 2014, SAS Institute Inc. All rights reserved. HVORDAN KOMME I GANG MED SAS OG HADOOP? JONAS LIE NIELSEN, PRINCIPAL SOLUTIONS ARCHITECT, STRATEGY, SAS INSTITUTE

Transcript of HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR...

Page 1: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

HVORDAN KOMME I GANG MED SAS OG

HADOOP?

JONAS LIE NIELSEN, PRINCIPAL SOLUTIONS ARCHITECT, STRATEGY, SAS INSTITUTE

Page 2: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

INTRODUCTION HADOOP

We think this will provide some major

insight…

What is it?

Not sure. But it’s BIG. Don’t we have SAS on

that badoop thing?

It’s called Hadoop.

Whatever.

I’ll take a look…

Page 3: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ABOUT THE SURVEY

The Nordic countries is expected to be more than 1 year behind US and UK in Big Data

Analytics and Hadoop adoption. We wanted to find out more:

• Understand Nordic adoption and maturity by country and industry

• What is the primary reasons, use cases and obstacles

• Understand where to sell the value of SAS – as we don’t sell Hadoop!

More than 300 Nordic companies responded through phone (Norstat) and online

questionnaire (SAS Institute).

See www.nordichadoopsurvey.com for results

Page 4: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

INTRODUCTION AGILE BI

“SAS, a known leader in enterprise BI

and advanced analytics, now leads the

pack in Agile BI.”

The Forrester Wave™: Agile Business

Intelligence Platforms, Q3 2014

(link)

Page 5: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

INTRODUCTION THE SAS SUPPORTED HADOOP DISTRIBUTIONS

Page 6: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

INTRODUCTION VISION FOR SAS ON HADOOP

To be the Analytic and Data Management

solution of choice for Hadoop.

Page 7: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SAS AND HADOOP

ARCHITECTURE AND WHY IT MATTERS

Page 8: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SAS AND HADOOP TYPICAL MODERN ARCHITECTURE

Page 9: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SAS

Hadoop Hadoop Hadoop

Data

Hadoop

SAS

In-Database

Hadoop

SAS

Traditional SAS

Even with In-Database processing there will still be some

work performed on the SAS server

SAS AND HADOOP PROCESSING OF SAS CODE WITH HADOOP

In-Memory

Data

Hadoop

Data

Hadoop

Data

Hadoop

Data

Hadoop

Data

Memory

Data DataData

Page 10: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ASYMMETRIC DISTRIBUTED SOURCE

BLADE ENVIRONMENT

HIGH LEVEL

ARCHITECTURE

DISTRIBUTED DEPLOYMENT OF VA ON

COMMODITY HARDWARE WITH ASYMMETRIC SOURCES

Hub

Explorer

Designer

Viewer

Data Builder

Administrator

WEB-BASED CLIENTS

SAS® Management Console

DESKTOP CLIENTS

iPad

Android

MOBILE CLIENTS

HadoopHDFS

IN-MEMORY STORE

SAS® LASR ANALYTIC SERVER

SAS®

VISUAL ANALYTICS

Not part of VA

Can be separated

CLOUDERA / HORTONWORKS /

TERADATA /GREENPLUM / DB2 / ORACLE / NETEZZA /

SAP HANA

SAS Embedded Process

WORKSPACE SERVER

MID-TIER

METADATASERVER

Hadoop RDBMS Nonrelational Click Stream PC Files & more

SAS LASR ANALYTIC SERVER CLUSTER VA SERVER

1. Front loading through workspace server

2. Parallell load from hadoop (SASHDAT) 3. Parallell load from separate hadoop

cluster using SAS EP

Page 11: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ARCHITECTURE

OPTIONSSHARED CLUSTER VS SEPARATE CLUSTERS

SAS In-Memory

(TKGrid/LASR) Hadoop

Hadoop

SAS In-Memory

(TKGrid/LASR)

Shared Cluster• Resource Management (YARN)

• SASHDAT Available

• SAS Scales with Hadoop Cluster

Separate clusters• Math scales separately from Hadoop

• Data “movement”

• No SASHDAT

Page 12: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ARCHITECTURE LASR SERVER MEMORY MANAGEMENT

Hadoop HDFS

SASHDAT File

Hive or Delimited File

SAS In-Memory

TKGrid/LASR

Memory mapping of

tables

Page 13: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SHARED CLUSTER SASHDAT MEMORY MAPPING

Page 14: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SEPARATE CLUSTER WHAT IS THE SAS EMBEDDED PROCESS?

A portable, lightweight execution container for SAS code that

makes SAS portable and deployable on a variety of platforms

EP

1. Data Lifting

2. Data Preparation

3. Data Quality

4. Scoring

Page 15: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SEPARATE CLUSTER HOW DOES DATA LOADING IN ASYMETRIC MODE WORK?

EP

EP

Hadoop

SAS® Visual Analytics

DataData

Page 16: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

HADOOP WHY IS YARN SO IMPORTANT?

Map Reduce (MR)

Job Tracker

Submits tasks to the TaskTracker(s)

Task Tracker

Accepts Map, Reduce, and Shuffle Operations

YARN

a global ResourceManager (RM)

Scheduling for cluster, queue capacities, user limits

a per-application ApplicationMaster (AM)

Negotiates resources from RM, works with

NodeManager to execute and monitor component tasks

a per-node slave NodeManager (NM) and

a per-application Container running on a NM

Launches application, monitors cpu, memory, disk,

network and reports to RM

& OTHERS

http://hortonworks.com/blog/apache-hadoop-yarn-hdp-2-2-substantial-step-forward-enterprise-hadoop/

Page 17: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ARCHITECTURE HARDWARE BUILDING BLOCKS

• Node, Blade, Pizza Box, Rack

Mount, Virtual Machine ….

• Components:

1. RAM

2. CPU

3. Disks

4. Network

5. Linux x64

Sample:

20 cores (2 CPU)

512 GB RAM

16 x 2 TB SATA Disks

Dual 10GbE Ethernet

Red Hat Linux 6.5

Page 18: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ARCHITECTURE ASYMETRIC DEPLOYMENT

Hadoop Node 1 Hadoop Node 2

Node 1 Node 2 Node 3 Node n

Hadoop Node 3 Hadoop Node n

Hadoop cluster

SAS In-Memory Server

Datanodes

Worker nodesName Node

Root node

• Minimum configuration is four nodes in each cluster

• Recommended is six nodes for hadoop cluster

• No need for equal no of nodes

SAS Compute

Metadata

Mid Tier

Page 19: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

ARCHITECTURE FULLY SHARED DEPLOYMENT

Node 5 Node 6Node 1 Node 2 Node 3 Node 4 Node 7 Node n

Hadoop cluster

SAS In-Memory Server

SAS Compute Grid

Datanodes

Prod

Worker nodesName Nodes

Root node

• Minimum configuration is four nodes

• Recommended is six

• In 9.4M3, yarn can manage the complete workload across hadoop, SAS Lasr and SAS Grid

Page 20: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

IMPLEMENTATION HOW TO GET STARTED

• SAS can facilitate cordination with prefered Hadoop Vendor

• You need to buy separate licence/support agreement from Hadoop vendor

• The vendors provide fixed priced installation of hadoop, not very expensive

• Needs to plan from the beginning due to differences in server configuration

(disks)

Page 21: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

SUPPORT MATRIX

Page 22: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

BASELINE SUPPORT

MATRIX @ 9.4M2

Support Matrix – as of 9.17.2014Product Management makes every effort to validate that this is correct. However, the definitive

answer to what is supported on what platforms, etc. can be found at INSTALL CENTER. Note

the General forward support for SAS software

Supported at 9.4M2

Not Supported

* Minus Yarn Integration

Net New Product

9.4M2 Hive Impala Score Accl Code Accl DQ Accl Data Ldr SPDE HPA VA IMSTAT VS VSD

Cloudera 4.5

Cloudera 5.0 New 9.4M2 New 9.4M2 New 9.4M2

HortonWorks 1.3.2

HortonWorks 2.0 New 9.4M2 New 9.4M2 New 9.4M2

Pivotal HD 1.1

Pivotal HD 2.0

IBM BigInsights 2.1 * * * *

MapR V3

Page 23: HVORDAN KOMME I GANG MED SAS OG HADOOP? · PDF fileIBM BigInsights 2.1 * * * * MapR V3 LASR Analtyic Server 2.4 Supported at 9.4M2 Not Supported * Minus Yarn Integration Net New Product

Company Confidential - For Internal Use Only

Copyright © 2014, SAS Insti tute Inc. Al l r ights reserved.

IN-MEMORY

SUPPORT MATRIX

9.4M2 HPA VA IMSTAT VS VSD

Cloudera 4.5

Cloudera 5.0

HortonWorks 1.3.2

HortonWorks 2.0

Pivotal HD 1.1

Pivotal HD 2.0

IBM BigInsights 2.1 * * * *

MapR V3

LASR Analtyic Server 2.4

Supported at 9.4M2

Not Supported

* Minus Yarn Integration

Net New Product

Support Matrix – as of 9.17.2014Product Management makes every effort to validate that this is correct. However, the definitive

answer to what is supported on what platforms, etc. can be found at INSTALL CENTER. Note

the General forward support for SAS software