PANEL: DAVID.WINTERS@TERADATA.COM, SENIOR BIG DATA ARCHITECT, BD-COE


Page 1:

PANEL

DAVID.WINTERS@TERADATA.COM

SENIOR BIG DATA ARCHITECT, BD-COE

Page 2:

Confidential and proprietary. Copyright © 2012 Teradata Corporation.

When to Use Which? The best approach by workload and data type

Processing as a Function of Schema Requirements and Stage of Data Pipeline

Stable Schema
    Low Cost Storage and Fast Loading:           Teradata/Hadoop
    Data Pre-Processing, Refining, Cleansing:    Teradata
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       Teradata
    Joins, Unions, Aggregates:                   Teradata
    Analytics (Iterative and data mining):       Teradata
    Reporting:                                   Teradata
    Examples: Financial Analysis, Ad-Hoc/OLAP; Enterprise-Wide BI and Reporting; Spatial/Temporal; Active Execution

Evolving Schema
    Low Cost Storage and Fast Loading:           Hadoop
    Data Pre-Processing, Refining, Cleansing:    Aster / Hadoop
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       Aster / Hadoop
    Joins, Unions, Aggregates:                   Aster
    Analytics (Iterative and data mining):       Aster (SQL + MapReduce Analytics)
    Reporting:                                   Aster
    Examples: Interactive Data Discovery; Web Clickstream, Set-Top Box Analysis; CDRs, Sensor Logs, JSON

Format, No Schema
    Low Cost Storage and Fast Loading:           Hadoop
    Data Pre-Processing, Refining, Cleansing:    Hadoop
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       Hadoop
    Joins, Unions, Aggregates:                   Aster
    Analytics (Iterative and data mining):       Aster (MapReduce Analytics)
    Reporting:                                   Aster
    Examples: Social Feeds, Text, Image Processing; Audio/Video Storage and Refining; Storage and Batch Transformations
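The grid on this slide can be read as a lookup keyed by schema type and pipeline stage. The Python sketch below encodes one reading of it, assuming each schema row lists its six cells in column order from Low Cost Storage and Fast Loading through Reporting. The names `STAGES`, `MATRIX`, and `recommend` are hypothetical and do not come from the slide.

```python
from typing import Dict, Tuple

# Pipeline stages, in the column order assumed for the slide's grid.
STAGES: Tuple[str, ...] = (
    "storage_and_loading",
    "pre_processing",
    "simple_math",
    "joins_aggregates",
    "analytics",
    "reporting",
)

# One row per schema type; cell order follows STAGES.
# This is an interpretation of the flattened grid, not an official mapping.
MATRIX: Dict[str, Tuple[str, ...]] = {
    "stable": ("Teradata/Hadoop", "Teradata", "Teradata",
               "Teradata", "Teradata", "Teradata"),
    "evolving": ("Hadoop", "Aster/Hadoop", "Aster/Hadoop",
                 "Aster", "Aster", "Aster"),
    "no_schema": ("Hadoop", "Hadoop", "Hadoop",
                  "Aster", "Aster", "Aster"),
}


def recommend(schema: str, stage: str) -> str:
    """Return the engine the grid suggests for a schema type and stage."""
    if schema not in MATRIX:
        raise ValueError(f"unknown schema type: {schema!r}")
    if stage not in STAGES:
        raise ValueError(f"unknown pipeline stage: {stage!r}")
    return MATRIX[schema][STAGES.index(stage)]


if __name__ == "__main__":
    print(recommend("evolving", "pre_processing"))
    print(recommend("no_schema", "analytics"))
```

A plain dictionary keeps the sketch close to the slide: changing a recommendation means editing one cell rather than branching logic.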

Page 3:

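When to Use Which Data Engine? The best approach by workload and data type

• Processing as a Function of Schema Requirements by Data

Stable Schema
    Low Cost Storage and Fast Loading:           EDW/Hadoop
    Data Pre-Processing, Refining, Cleansing:    EDW
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       EDW
    Joins, Unions, Aggregates:                   EDW
    Analytics (Iterative and data mining):       EDW (SQL analytics)
    Reporting:                                   EDW

Evolving Schema
    Low Cost Storage and Fast Loading:           Hadoop
    Data Pre-Processing, Refining, Cleansing:    A-DBMS / Hadoop
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       A-DBMS / Hadoop
    Joins, Unions, Aggregates:                   A-DBMS
    Analytics (Iterative and data mining):       A-DBMS (SQL + MapReduce Analytics)
    Reporting:                                   A-DBMS

Format, No Schema
    Low Cost Storage and Fast Loading:           Hadoop
    Data Pre-Processing, Refining, Cleansing:    Hadoop
    “Simple math at scale”
    (Score, filter, sort, avg., count...):       Hadoop
    Joins, Unions, Aggregates:                   A-DBMS
    Analytics (Iterative and data mining):       A-DBMS (MapReduce Analytics)
    Reporting:                                   A-DBMS

Slide label: Need Schema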

Page 4:

Analytic DBMS – Hadoop – EDW

Requirements compared across A-DBMS, Hadoop, and EDW
[Per-cell ratings were shown as moon icons on the slide and are not preserved in this transcript.]

    MapReduce integration
    Interactive user tools
    Complex analytics (e.g. time-series, graph, social network) (slide annotation: UDF)
    Multi-language support (Java, R, Python, Perl, SAS, scripts, Bash, C++) (slide annotation: UDF)
    Programming flexibility and ease (slide annotation: UDF)
    Performance
    Integrated data
    System management, WLM
    Labor costs

                        A-DBMS    Hadoop    EDW
    Concurrent users    10-100    1-10      200-1000

Rating scale: Excellent, Very Good, Good, Fair, Poor

Note: +¼ moon can mean years of investment
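The concurrent-user figures are the only numeric values that survive from this comparison, so they make a small screening check. The sketch below is a minimal illustration, assuming the upper end of each range is the most concurrency an engine class is expected to handle. That interpretation, and the names `CONCURRENT_USERS` and `engines_supporting`, are assumptions for illustration rather than anything stated on the slide.

```python
from typing import Dict, List, Tuple

# Concurrent-user ranges (low, high) as listed on the slide.
CONCURRENT_USERS: Dict[str, Tuple[int, int]] = {
    "A-DBMS": (10, 100),
    "Hadoop": (1, 10),
    "EDW": (200, 1000),
}


def engines_supporting(users: int) -> List[str]:
    """List engine classes whose upper bound covers the given user count.

    Assumption: the high end of each range is treated as a ceiling.
    """
    if users < 1:
        raise ValueError("users must be a positive integer")
    return sorted(
        name
        for name, (_low, high) in CONCURRENT_USERS.items()
        if high >= users
    )


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, engines_supporting(n))
```

Concurrency is only one row of the comparison; the other requirements on the slide would need their own ratings before this could drive a real selection.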

Page 5:

END