Sandeep Parikh [email protected]@Teradata.com
TOP 5 THINGS TO KNOW ABOUT INTEGRATING MONGODB INTO YOUR DATA WAREHOUSE
2 Copyright Teradata
Scale-out NoSQL + Scale-out DW
Data Warehouse = context
JSON in the Data Warehouse
Integration: Data Sharing
Use Cases
What is a Teradata Data Warehouse?
• Analytic database
  > In-memory, in-database
• Scale-out MPP
  > 30+ petabyte sites
  > 35PB, 4096 cores
• Self service BI
  > Dashboards, reports, OLAP
  > Predictive analytics
• Complex SQL
  > 20-50 way joins
  > 350 pages of SQL
• Real time access/load
• Mixed workloads
[Diagram: data scientists, power users, sales and partners working against a 1024-node system; each node shown with Intel CPUs and 512GB]
What is a Data Warehouse? Context
[Diagram: linked data warehouse subject areas: price history, inventory, supplier, contracts, product/services, channels, e-commerce, labor, associate, customer, sales transactions, point of sale, shipment carrier, campaigns, promotion, warehouse]
A Day at the Ticket Agency
• 185 applications
  > Travel agents & corporate travel managers
  > Mobile: airline executives
  > Corporate travel managers and travel agents
  > Hoteliers
• Teradata 5650 V13.10
  > 25TB of data
  > 1000+ users
• Mini-batch every 15 min
• GoldenGate replication
• Tactical queries 0.2 seconds
• 14M queries/day
[Chart: Availability (%) for 2008, 2009, 2010, 2011 on an axis from 99.5 to 100; values shown: 99.7, 99.78, 99.98, 99.94]
Teradata in the Data Warehouse Market
Forrester Data Warehouse Wave December 2013
JSON IN THE DATA WAREHOUSE
Late Binding in SQL
[Diagram: early binding (schema applied at load time: source data, ETL, table) versus late binding (schema applied at runtime: JSON queried with SQL + JSONPath); both serve BI tools from the data warehouse]
JSONPath inside SQL
SELECT box.MFG_Line.Product.Color AS "Color",
       box.MFG_Line.Product.Size AS "Size",
       box.MFG_Line.Product.Prod_ID AS "Prod_ID",
       box.MFG_Line.Product.Create_Time AS "Create_Time"
FROM mfgTable
WHERE CAST(box.MFG_Line.Product.Create_Time AS TIMESTAMP) >= TIMESTAMP'2013-06-16 00:00:00'
  AND box.MFG_Line.Product.Prod_ID = 96;

Color Size  Prod_ID Create_Time
----- ----- ------- -------------------
Blue  Small 96      2013-06-17 20:07:27
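To make the path-based access in this query concrete, here is a minimal Python sketch of the same idea: each row keeps a whole JSON document, and dotted paths are resolved only when the query runs. It is an illustration, not how Teradata evaluates the query. The `ROWS` data and the `get_path` and `query` helpers are hypothetical; only the first document mirrors the result row on the slide.

```python
import json
from datetime import datetime

# Hypothetical rows of mfgTable: one JSON document per row in a column named "box".
# The first document mirrors the slide's result row; the others are made up for contrast.
ROWS = [
    '{"MFG_Line": {"Product": {"Color": "Blue", "Size": "Small", '
    '"Prod_ID": 96, "Create_Time": "2013-06-17 20:07:27"}}}',
    '{"MFG_Line": {"Product": {"Color": "Red", "Size": "Large", '
    '"Prod_ID": 96, "Create_Time": "2013-06-15 08:00:00"}}}',
    '{"MFG_Line": {"Product": {"Color": "Green", "Size": "Small", '
    '"Prod_ID": 12, "Create_Time": "2013-06-18 09:30:00"}}}',
]

COLUMNS = ("Color", "Size", "Prod_ID", "Create_Time")


def get_path(doc, dotted):
    """Walk a dotted path such as 'MFG_Line.Product.Color'; None if a key is missing."""
    node = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def query(rows, since, prod_id):
    """Apply the slide's two predicates, then project the four named attributes."""
    out = []
    for raw in rows:
        box = json.loads(raw)  # structure is interpreted only now, at read time
        created = get_path(box, "MFG_Line.Product.Create_Time")
        if created is None:
            continue
        if datetime.strptime(created, "%Y-%m-%d %H:%M:%S") < since:
            continue
        if get_path(box, "MFG_Line.Product.Prod_ID") != prod_id:
            continue
        out.append({c: get_path(box, "MFG_Line.Product." + c) for c in COLUMNS})
    return out


RESULT = query(ROWS, datetime(2013, 6, 16), 96)
print(RESULT)
```

Only the first document passes both predicates, which matches the single row in the slide's output.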
Flexible: Schema-on-Read
• JSON object schema column
  > Treated like any column
  > Use any BI tool
• Apply “schema” at runtime
• Why not shred JSON into columns?
  > Urgency, agility
  > Bypass extensive change controls
  > Complex data
    – Bill of materials, etc.
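The trade-off between shredding and schema-on-read can be sketched in a few lines of Python. This is an illustration under assumptions: `SHREDDED_COLUMNS`, the sample documents, and the `Coating` attribute are all hypothetical. The point is that an attribute added after the column list was fixed is lost by shredding until the schema is changed, while reading the stored JSON at query time finds it immediately.

```python
import json

# Hypothetical fixed column list chosen when the load job was designed.
SHREDDED_COLUMNS = ("Prod_ID", "Color")

# Hypothetical documents; the second carries a new attribute ("Coating")
# that did not exist when SHREDDED_COLUMNS was defined.
DOCS = [
    '{"Prod_ID": 96, "Color": "Blue"}',
    '{"Prod_ID": 97, "Color": "Red", "Coating": "Matte"}',
]


def shred(raw):
    """Early binding: keep only the columns known at load time."""
    doc = json.loads(raw)
    return tuple(doc.get(col) for col in SHREDDED_COLUMNS)


def read_attr(raw, name):
    """Late binding: parse the stored JSON and look the attribute up at query time."""
    return json.loads(raw).get(name)


SHREDDED_TABLE = [shred(raw) for raw in DOCS]               # "Coating" is dropped here
COATINGS = [read_attr(raw, "Coating") for raw in DOCS]      # found without a schema change

print(SHREDDED_TABLE)
print(COATINGS)
```

Shredding still has its place for stable, heavily queried attributes; the sketch only shows why the slide lists urgency and change control as reasons to keep the JSON intact.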
UNIFIED DATA ARCHITECTURE
TERADATA UNIFIED DATA ARCHITECTURE: System Conceptual View
[Diagram: SOURCES (ERP, SCM, CRM, images, audio and video, machine logs, text, web and social); ACCESS, MANAGE, MOVE layers; DATA PLATFORM (Hortonworks), INTEGRATED DATA WAREHOUSE (Teradata Database), INTEGRATED DISCOVERY PLATFORM (Teradata Aster Database); ANALYTIC TOOLS & APPS (math and stats, data mining, business intelligence, applications, languages, marketing); USERS (marketing executives, operational systems, frontline workers, customers and partners, engineers, data scientists, business analysts)]
Teradata and MongoDB: Next Steps
[Diagram: Teradata systems: IDW (Teradata Database) for business users and Discovery (Teradata Aster Database: SQL, SQL-MR, SQL-GR) for data scientists. Remote data: Hadoop (push-down to Hadoop), other databases, and MongoDB (NoSQL database) via Export / Import and Direct Connect]
INTEGRATION
Teradata and MongoDB
• Operational + Analytical
  > Rich MongoDB applications
  > Rich Teradata analytics
  > Complementary
• Teradata pulls directly from MongoDB sharded clusters
• Teradata pushes back to MongoDB deployments
[Diagram: MongoDB (application data) alongside Teradata (analytics)]
Scale-out NoSQL + Scale-out DW SQL
[Diagram: NoSQL side: an application connects to query routers in front of shards 1 to N, each with a primary. SQL side: Teradata nodes, each with a PE and AMPs]
Contract Phase
[Diagram: SQL arriving at a Teradata node (PE, four AMPs, EAH), and a MongoDB query router in front of shards 1 to 4]
Contract Phase
[Diagram, continued: the Teradata node (PE, four AMPs, EAH) and the query router with shards 1 to 4]
Data Export to Shards
[Diagram: the Teradata node (PE, four AMPs, EAH) exporting data to shards 1 to 4 behind the query router]
Import Data from Shards
[Diagram: the Teradata node (PE, four AMPs, EAH) importing data from shards 1 to 4 behind the query router]
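The deck does not spell out the import mechanics, so the following Python sketch only illustrates one plausible reading of the picture: several AMPs pulling from several shards at the same time. Everything in it is an assumption made for illustration, including the in-memory `SHARDS` data, the round-robin `plan` policy, and the `amp_import` helper; no real MongoDB or Teradata API is called.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for four MongoDB shards, each holding a few documents.
SHARDS = {
    "Shard 1": [{"_id": 1}, {"_id": 2}],
    "Shard 2": [{"_id": 3}],
    "Shard 3": [{"_id": 4}, {"_id": 5}, {"_id": 6}],
    "Shard 4": [{"_id": 7}],
}
NUM_AMPS = 4  # the slide shows four AMPs on the Teradata node


def plan(shard_names, num_amps):
    """Assign shards to AMPs round-robin (an assumed policy, for illustration)."""
    assignment = {amp: [] for amp in range(num_amps)}
    for i, name in enumerate(sorted(shard_names)):
        assignment[i % num_amps].append(name)
    return assignment


def amp_import(shard_names):
    """One AMP reads every document from the shards assigned to it."""
    return [doc for name in shard_names for doc in SHARDS[name]]


ASSIGNMENT = plan(SHARDS, NUM_AMPS)
with ThreadPoolExecutor(max_workers=NUM_AMPS) as pool:
    # Each AMP pulls from its own shards concurrently; map() preserves AMP order.
    IMPORTED = list(pool.map(amp_import, (ASSIGNMENT[a] for a in range(NUM_AMPS))))

TOTAL = sum(len(rows) for rows in IMPORTED)
print(ASSIGNMENT)
print(TOTAL)
```

With four shards and four AMPs the plan is one-to-one, so no single reader becomes the bottleneck; with more shards than AMPs the same plan simply gives some AMPs more than one shard.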
Use cases
BACK-OFFICE CONTEXT TO THE FRONT-OFFICE OPERATIONS
eCommerce in Action: A Virtuous Circle
[Diagram: a loop between the data warehouse and eight MongoDB shards, labeled: buyer preferences, sales catalog, campaigns; recent purchases, profitability]
Call Center Efficiency: A Virtuous Circle
[Diagram: a real-time loop between the data warehouse and eight MongoDB shards, labeled: trouble tickets, customer profiles, payment history; claims, next best offer; web logs]
Internet of Things: Making Sense of Sensors
[Diagram: the data warehouse and eight MongoDB shards, labeled: condition-based maintenance, R&D testing; yield management, warranty mgmt.]
Conclusions
• Two scale-out architectures
  > OLTP scale-out
  > Analytics scale-out
• JSON in the data warehouse
• Context from the DW
  > Enriching MongoDB applications
• Integration
  > Import/export
  > Teradata QueryGrid