Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse


Transcript of Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Page 1: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Sandeep Parikh [email protected]@Teradata.com

TOP 5 THINGS TO KNOW ABOUT INTEGRATING MONGODB INTO YOUR DATA WAREHOUSE

Page 2: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

2 Copyright Teradata

Scale-out NoSQL + Scale-out DW

Data Warehouse = context

JSON in the Data Warehouse

Integration: Data Sharing

Use Cases

Page 3: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

What is a Teradata Data Warehouse?

• Analytic database
  > In-memory, in-database
• Scale-out MPP
  > 30+ petabyte sites
  > 35PB, 4096 cores
• Self service BI
  > Dashboards, reports, OLAP
  > Predictive analytics
• Complex SQL
  > 20-50 way joins
  > 350 pages of SQL
• Real time access/load
• Mixed workloads

[Diagram: data scientists, power users, sales and partners; 1024 nodes, with node boxes labeled Intel CPUs, 512GB]

Page 4: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

What is a Data Warehouse? Context

[Diagram of data warehouse subject areas: Price history, Inventory, Supplier, Contracts, Product/Services, Channels, E-Commerce, Labor, Associate, Customer, Sales transactions, Point of Sale, Shipment Carrier, Campaigns, Promotion, Warehouse]

Page 5: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

A Day at the Ticket Agency

• 185 applications
  > Travel agents & corporate travel managers
  > Mobile: airline executives
  > Corporate travel managers and travel agents
  > Hoteliers
• Teradata 5650 V13.10
  > 25TB of data
  > 1000+ users
• Mini-batch every 15 min
• GoldenGate replication
• Tactical queries 0.2 seconds
• 14M queries/day

[Chart: Availability (%), 2008 to 2011, axis from 99.5 to 100; data labels in extraction order: 99.7, 99.78, 99.98, 99.94]
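For a sense of scale, the figures on this slide can be converted into rates. The sketch below is plain arithmetic on the quoted numbers (14M queries/day and the availability labels from the chart). It assumes a 365-day year and a query load spread evenly over the day, which the slide does not state, and the helper names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def average_queries_per_second(queries_per_day: float) -> float:
    """Average rate if the daily volume were spread evenly over 24 hours."""
    return queries_per_day / SECONDS_PER_DAY


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of unavailability per year implied by an availability percentage."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    rate = average_queries_per_second(14_000_000)
    print(f"{rate:.0f} queries/second on average")
    for pct in (99.7, 99.78, 99.98, 99.94):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% availability -> {hours:.1f} hours down per year")
```

Real workloads are bursty, so peak rates would be higher than this average.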

Page 6: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Teradata in the Data Warehouse Market

Page 7: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Forrester Data Warehouse Wave December 2013

Page 8: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

JSON IN THE DATA WAREHOUSE

Page 9: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

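Late Binding in SQL

[Diagram: early binding vs. late binding. Early binding fixes the schema at load time: source data passes through ETL into a table. Late binding defers the schema to runtime: JSON is read with SQL + JSONPath. BI tools query the data warehouse.]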

Page 10: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

JSONPath inside SQL

SELECT box.MFG_Line.Product.Color AS "Color",
       box.MFG_Line.Product.Size AS "Size",
       box.MFG_Line.Product.Prod_ID AS "Prod_ID",
       box.MFG_Line.Product.Create_Time AS "Create_Time"
FROM mfgTable
WHERE CAST(box.MFG_Line.Product.Create_Time AS TIMESTAMP) >= TIMESTAMP'2013-06-16 00:00:00'
  AND box.MFG_Line.Product.Prod_ID = 96;

Color  Size   Prod_ID  Create_Time
-----  -----  -------  -------------------
Blue   Small  96       2013-06-17 20:07:27
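To make the dot-notation paths in the query concrete, here is a minimal Python sketch that mimics the same selection over parsed JSON. It is an illustration of the idea only, not how Teradata evaluates JSON paths. The `MFG_TABLE` rows, `get_path`, and `query` are hypothetical; the first sample document is modeled on the result row shown on the slide, and the second is invented to show a row being filtered out.

```python
import json
from datetime import datetime

# Hypothetical rows of mfgTable: each row has one JSON column named "box".
MFG_TABLE = [
    {"box": json.dumps({"MFG_Line": {"Product": {
        "Color": "Blue", "Size": "Small", "Prod_ID": 96,
        "Create_Time": "2013-06-17 20:07:27"}}})},
    {"box": json.dumps({"MFG_Line": {"Product": {
        "Color": "Red", "Size": "Large", "Prod_ID": 97,
        "Create_Time": "2013-06-15 08:00:00"}}})},
]

PRODUCT = "MFG_Line.Product."
COLUMNS = ("Color", "Size", "Prod_ID", "Create_Time")


def get_path(document, dotted_path):
    """Walk a dotted path such as 'MFG_Line.Product.Color'; None if missing."""
    node = document
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def query(rows, since, prod_id):
    """Mirror the SELECT ... WHERE over parsed JSON documents."""
    result = []
    for row in rows:
        box = json.loads(row["box"])
        created = get_path(box, PRODUCT + "Create_Time")
        if created is None:
            continue  # no timestamp, so the WHERE clause cannot match
        # Equivalent of CAST(... AS TIMESTAMP) in the SQL.
        created_ts = datetime.strptime(created, "%Y-%m-%d %H:%M:%S")
        if created_ts >= since and get_path(box, PRODUCT + "Prod_ID") == prod_id:
            result.append({col: get_path(box, PRODUCT + col) for col in COLUMNS})
    return result


if __name__ == "__main__":
    for record in query(MFG_TABLE, datetime(2013, 6, 16), 96):
        print(record)
```

Each path is resolved at query time against whatever the document contains, so a missing key yields no value rather than a load failure.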

Page 11: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Flexible: Schema-on-Read

• JSON object schema column
  > Treated like any column
  > Use any BI tool
• Apply “schema” at runtime
• Why not shred JSON into columns?
  > Urgency, agility
  > Bypass extensive change controls
  > Complex data
    – Bill of materials, etc.
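The trade-off between shredding JSON into columns and applying the "schema" at runtime can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Teradata behavior: the sample documents, the `Weight` field, and all function names are invented for illustration.

```python
import json

# Hypothetical source documents; the second carries a field ("Weight")
# that did not exist when the load-time schema was designed.
SOURCE = [
    '{"Prod_ID": 96, "Color": "Blue", "Size": "Small"}',
    '{"Prod_ID": 97, "Color": "Red", "Size": "Large", "Weight": 12.5}',
]

LOAD_TIME_COLUMNS = ("Prod_ID", "Color", "Size")  # early binding: fixed up front


def load_shredded(raw_docs):
    """Early binding: shred each document into predefined columns at load time."""
    return [
        tuple(json.loads(raw).get(col) for col in LOAD_TIME_COLUMNS)
        for raw in raw_docs
    ]


def load_as_json(raw_docs):
    """Late binding: store each document unchanged in a single JSON column."""
    return [{"doc": raw} for raw in raw_docs]


def read_with_schema(table, fields):
    """Apply the 'schema' (a list of field names) only when the query runs."""
    return [
        tuple(json.loads(row["doc"]).get(field) for field in fields)
        for row in table
    ]


if __name__ == "__main__":
    # The shredded table has no place for "Weight" until the schema changes.
    print(load_shredded(SOURCE))
    # The JSON column can answer a query on "Weight" immediately.
    print(read_with_schema(load_as_json(SOURCE), ("Prod_ID", "Weight")))
```

In the shredded version the new field is lost until the columns and the load job are changed, which is the change-control cost the slide refers to. The late-bound version pays for that flexibility by parsing at read time.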

Page 12: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

UNIFIED DATA ARCHITECTURE

Page 13: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

TERADATA UNIFIED DATA ARCHITECTURE: System Conceptual View

[Diagram, grouped by its labels:
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
MOVE, MANAGE, ACCESS
DATA PLATFORM: Teradata Database, Hortonworks
INTEGRATED DATA WAREHOUSE: Teradata Database
INTEGRATED DISCOVERY PLATFORM: Teradata Aster Database
ANALYTIC TOOLS & APPS: Math and Stats, Data Mining, Business Intelligence, Applications, Languages, Marketing
USERS: Marketing Executives, Operational Systems, Frontline Workers, Customers Partners, Engineers, Data Scientists, Business Analysts]

Page 14: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Teradata and MongoDB: Next Steps

[Diagram labels: Business users, Data Scientists; IDW (Teradata Database); Discovery (Teradata Aster Database); SQL, SQL-MR, SQL-GR; Remote Data; Teradata Systems (Teradata Database, Teradata Aster Database); Hadoop (Push-down to Hadoop); Other Databases; MongoDB (NoSQL Database)]

Page 15: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Export / Import

Direct Connect

INTEGRATION

Page 16: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Teradata and MongoDB

• Operational + Analytical
  > Rich MongoDB applications
  > Rich Teradata analytics
  > Complementary
• Teradata pulls directly from MongoDB sharded clusters
• Teradata pushes back to MongoDB deployments

[Diagram: MongoDB (Application Data) and Teradata (Analytics)]

Page 17: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Scale-out NoSQL + Scale-out DW SQL

[Diagram, NoSQL side: Application, three query routers, primaries for Shard 1, Shard 2, Shard 3 through Shard N. SQL side: four groups, each with a PE and two AMPs]

Page 18: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Contract Phase

[Diagram: SQL, a Teradata node (PE, EAH, four AMPs), and a Query Router in front of Shard 1 to Shard 4]

Page 19: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Contract Phase

[Diagram: a Teradata node (PE, EAH, four AMPs) and a Query Router in front of Shard 1 to Shard 4]

Page 20: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Data Export to Shards

[Diagram: a Teradata node (PE, EAH, four AMPs) and a Query Router in front of Shard 1 to Shard 4]
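The slide shows only the components, not the export protocol. As one way to picture rows leaving several AMPs and landing on several shards, here is a small Python simulation. Everything in it is an assumption made for illustration: the CRC32-modulo routing rule is a deterministic stand-in (a real sharded cluster routes by ranges of the shard key), and `shard_for`, `export_rows`, and the sample fields are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # the slide shows Shard 1 to Shard 4


def shard_for(shard_key_value, num_shards=NUM_SHARDS):
    """Stand-in routing rule: CRC32 of the key, modulo the shard count.

    This is only a deterministic placeholder so that equal keys always
    land on the same shard; it is not MongoDB's routing logic.
    """
    return zlib.crc32(shard_key_value.encode("utf-8")) % num_shards + 1


def export_rows(amp_outputs, shard_key):
    """Group the rows produced by each AMP into per-shard batches."""
    batches = defaultdict(list)
    for amp_id, rows in amp_outputs.items():
        for row in rows:
            target = shard_for(str(row[shard_key]))
            batches[target].append({**row, "from_amp": amp_id})
    return dict(batches)


if __name__ == "__main__":
    # Hypothetical analytic results, partitioned across two AMPs.
    amp_outputs = {
        1: [{"customer_id": "c1", "score": 0.9}],
        2: [{"customer_id": "c2", "score": 0.4},
            {"customer_id": "c1", "score": 0.7}],
    }
    for shard, rows in sorted(export_rows(amp_outputs, "customer_id").items()):
        print(f"Shard {shard}: {rows}")
```

The point of the sketch is that no single process has to funnel all rows: each AMP's output can be batched per destination independently.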

Page 21: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Import Data from Shards

[Diagram: a Teradata node (PE, EAH, four AMPs) and a Query Router in front of Shard 1 to Shard 4]
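The diagrams for the contract phase and the import do not spell out the message flow, so the following Python sketch is an assumption about one plausible shape: a planning step learns the shard list once and assigns shards to AMPs, then each AMP reads its shards in parallel. The in-memory `CLUSTER`, the round-robin assignment, and all function names are hypothetical; no real MongoDB or Teradata API is used.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a sharded cluster: shard name -> documents.
CLUSTER = {
    "Shard 1": [{"_id": 1}, {"_id": 2}],
    "Shard 2": [{"_id": 3}],
    "Shard 3": [{"_id": 4}, {"_id": 5}, {"_id": 6}],
    "Shard 4": [],
}


def contract_phase(cluster, amp_ids):
    """Planning step: list the shards once, then assign them to AMPs round-robin."""
    plan = {amp: [] for amp in amp_ids}
    for index, shard in enumerate(sorted(cluster)):
        plan[amp_ids[index % len(amp_ids)]].append(shard)
    return plan


def amp_import(cluster, shards):
    """Work done by one AMP: read every document from its assigned shards."""
    return [doc for shard in shards for doc in cluster[shard]]


def import_from_shards(cluster, amp_ids):
    """Run the plan with one worker per AMP and collect each AMP's documents."""
    plan = contract_phase(cluster, amp_ids)
    with ThreadPoolExecutor(max_workers=len(amp_ids)) as pool:
        futures = {
            amp: pool.submit(amp_import, cluster, shards)
            for amp, shards in plan.items()
        }
        return {amp: future.result() for amp, future in futures.items()}


if __name__ == "__main__":
    for amp, docs in import_from_shards(CLUSTER, ["AMP 1", "AMP 2"]).items():
        print(amp, docs)
```

Separating the plan from the data movement means the coordination cost is paid once per query, while the transfer itself scales with the number of workers on both sides.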

Page 22: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Use cases

BACK-OFFICE CONTEXT TO THE FRONT-OFFICE OPERATIONS

Page 23: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

eCommerce in Action: A Virtuous Circle

[Diagram: eight shards and the data warehouse. Labels: Buyer preferences, Sales catalog, Campaigns; Recent purchases, Profitability]
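One half of the circle, bringing warehouse context back to the application, can be pictured as attaching warehouse-derived fields to application documents. The Python sketch below is illustrative only: the field names (`customer_id`, `recent_purchases`, `profitability`), the `dw` sub-document, and `enrich_profiles` are hypothetical and do not come from the slides.

```python
def enrich_profiles(app_profiles, dw_context):
    """Return copies of the application documents with warehouse fields attached.

    app_profiles: list of dicts, each with a "customer_id" key (hypothetical).
    dw_context:   dict mapping customer_id -> warehouse-derived attributes.
    Customers unknown to the warehouse get an empty context.
    """
    enriched = []
    for profile in app_profiles:
        context = dw_context.get(profile["customer_id"], {})
        enriched.append({**profile, "dw": dict(context)})
    return enriched


if __name__ == "__main__":
    # Hypothetical application-side documents.
    profiles = [
        {"customer_id": "c1", "preferences": ["shoes"]},
        {"customer_id": "c2", "preferences": ["books"]},
    ]
    # Hypothetical attributes computed in the warehouse.
    context = {
        "c1": {"recent_purchases": 3, "profitability": "high"},
    }
    for doc in enrich_profiles(profiles, context):
        print(doc)
```

Keeping the warehouse attributes under their own key makes it easy to refresh them on each export without touching fields the application owns.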

Page 24: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Call Center Efficiency: A Virtuous Circle

[Diagram: eight shards and the data warehouse, with "real time" and "web logs" labels. Data labels: Trouble tickets, Customer profiles, Payment history; Claims, Next best offer]

Page 25: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Internet of Things: Making Sense of Sensors

[Diagram: eight shards and the data warehouse. Labels: Condition-based maintenance, R&D testing; Yield management, Warranty mgmt.]

Page 26: Top 5 Things to Know About Integrating MongoDB into Your Data Warehouse

Conclusions

• Two scale-out architectures
  > OLTP scale-out
  > Analytics scale-out
• JSON in the data warehouse
• Context from the DW
  > Enriching MongoDB applications
• Integration
  > Import/export
  > Teradata QueryGrid
