PANEL [email protected] SENIOR BIG DATA ARCHITECT BD-COE [email protected].


Confidential and proprietary. Copyright © 2012 Teradata Corporation.

When to Use Which? The best approach by workload and data type

Processing as a Function of Schema Requirements and Stage of Data Pipeline

Stable Schema
  Low Cost Storage and Fast Loading: Teradata/Hadoop
  Data Pre-Processing, Refining, Cleansing: Teradata
  “Simple math at scale” (Score, filter, sort, avg., count...): Teradata
  Joins, Unions, Aggregates: Teradata
  Analytics (Iterative and data mining): Teradata
  Reporting: Teradata
  Examples: Financial Analysis, Ad-Hoc/OLAP; Enterprise-Wide BI and Reporting; Spatial/Temporal; Active Execution

Evolving Schema
  Low Cost Storage and Fast Loading: Hadoop
  Data Pre-Processing, Refining, Cleansing: Aster / Hadoop
  “Simple math at scale” (Score, filter, sort, avg., count...): Aster / Hadoop
  Joins, Unions, Aggregates: Aster
  Analytics (Iterative and data mining): Aster (SQL + MapReduce Analytics)
  Reporting: Aster
  Examples: Interactive Data Discovery; Web Clickstream, Set-Top Box Analysis; CDRs, Sensor Logs, JSON

Format, No Schema
  Low Cost Storage and Fast Loading: Hadoop
  Data Pre-Processing, Refining, Cleansing: Hadoop
  “Simple math at scale” (Score, filter, sort, avg., count...): Hadoop
  Joins, Unions, Aggregates: Aster
  Analytics (Iterative and data mining): Aster (MapReduce Analytics)
  Reporting: Aster
  Examples: Social Feeds, Text, Image Processing; Audio/Video Storage and Refining; Storage and Batch Transformations
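The chart above can be read as a lookup from schema type and pipeline stage to a recommended engine. The following is a minimal Python sketch of that reading, assuming the rows are the three schema types and the columns are the six pipeline stages in left-to-right order. The key names (`stable`, `evolving`, `no_schema`, and the stage identifiers) and the function `recommend` are hypothetical, introduced only for illustration.

```python
# Pipeline stages (columns of the chart), in left-to-right order.
# These identifiers are illustrative names, not terms from the slide.
STAGES = [
    "storage_and_loading",
    "pre_processing",
    "simple_math_at_scale",
    "joins_unions_aggregates",
    "analytics",
    "reporting",
]

# Engine suggested by the chart for each schema type, one entry per stage.
MATRIX = {
    "stable": ["Teradata/Hadoop", "Teradata", "Teradata",
               "Teradata", "Teradata", "Teradata"],
    "evolving": ["Hadoop", "Aster/Hadoop", "Aster/Hadoop",
                 "Aster", "Aster", "Aster"],
    "no_schema": ["Hadoop", "Hadoop", "Hadoop",
                  "Aster", "Aster", "Aster"],
}


def recommend(schema: str, stage: str) -> str:
    """Return the engine the chart suggests for a schema type and stage."""
    return MATRIX[schema][STAGES.index(stage)]


if __name__ == "__main__":
    print(recommend("evolving", "pre_processing"))
    print(recommend("no_schema", "analytics"))
```

A flat table like this keeps the decision rule easy to audit against the slide; an unknown schema type or stage raises an error rather than silently picking an engine.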


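When to Use Which Data Engine? The best approach by workload and data type

• Processing as a Function of Schema Requirements by Data

Stable Schema
  Low Cost Storage and Fast Loading: EDW/Hadoop
  Data Pre-Processing, Refining, Cleansing: EDW
  “Simple math at scale” (Score, filter, sort, avg., count...): EDW
  Joins, Unions, Aggregates: EDW
  Analytics (Iterative and data mining): EDW (SQL analytics)
  Reporting: EDW

Evolving Schema
  Low Cost Storage and Fast Loading: Hadoop
  Data Pre-Processing, Refining, Cleansing: A-DBMS / Hadoop
  “Simple math at scale” (Score, filter, sort, avg., count...): A-DBMS / Hadoop
  Joins, Unions, Aggregates: A-DBMS
  Analytics (Iterative and data mining): A-DBMS (SQL + MapReduce analytics)
  Reporting: A-DBMS

Format, No Schema
  Low Cost Storage and Fast Loading: Hadoop
  Data Pre-Processing, Refining, Cleansing: Hadoop
  “Simple math at scale” (Score, filter, sort, avg., count...): Hadoop
  Joins, Unions, Aggregates: A-DBMS
  Analytics (Iterative and data mining): A-DBMS
  Reporting: A-DBMS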


Need

Schema


Analytic DBMS – Hadoop – EDW

Requirements A-DBMS Hadoop EDW

MapReduce integration

Interactive user tools

Complex analytics (e.g. time-series, graph, social network) UDF

Multi-language support (Java, R, Python, Perl, SAS, scripts, Bash, C+) UDF

Programming flexibility and ease UDF

Performance

Integrated data

System management, WLM

Labor costs

Concurrent users: A-DBMS 10-100; Hadoop 1-10; EDW 200-1000

Rating scale: Excellent, Very Good, Good, Fair, Poor

Note: an extra ¼ moon in a rating can mean years of investment
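The concurrent-users row is the only requirement in the comparison with numeric values, so it can be turned into a simple filter. This is a minimal Python sketch, assuming the three ranges map to A-DBMS, Hadoop, and EDW in the order of the column header, and assuming the upper end of each range can be treated as a rough ceiling (an interpretation, not something the slide states). The name `engines_supporting` is hypothetical.

```python
# Concurrent-user ranges per engine as (low, high), taken from the table row.
# Assumption: values follow the header order A-DBMS, Hadoop, EDW.
CONCURRENT_USERS = {
    "A-DBMS": (10, 100),
    "Hadoop": (1, 10),
    "EDW": (200, 1000),
}


def engines_supporting(users: int) -> list[str]:
    """List engines whose upper bound covers the requested concurrency.

    Assumption: the top of each range is treated as a rough capacity ceiling.
    """
    return [name for name, (_, high) in CONCURRENT_USERS.items()
            if users <= high]


if __name__ == "__main__":
    for load in (5, 50, 500):
        print(load, engines_supporting(load))
```

Concurrency is only one row of the comparison; the other requirements would need the original ratings to be weighed alongside it.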


END
