HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman...

20
HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7 th December, 2018 PostgreSQL Down Under Melbourne, Australia

Transcript of HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman...

Page 1: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

HTAP Queries & Data Fabrics

Atif Rahman@mantaq10

7th December, 2018PostgreSQL Down Under

Melbourne, Australia

Page 2: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

The agenda

OLTP vs OLAP vs HTAP

The Problem Statement

Data Fabrics

Key PostgreSQL Features

Foreign Data Wrappers

Distributed Cache

Background

Patterns

Components

Page 3: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

OLTP vs OLAP vs HTAP

OLTP

OLAP

HTAP

Smaller transactions

Lots of them

Lots of Updates

ACID (Transactional)CRUD (Commands)

complex queries

Large working sets

Bulk loads & offloads

INDEXINGAGGREGATIONS

Mixed workloads

on the same system

Analytics on ‘inflight’

transactional data

Addresses resource

contention

FEDERATION

SYNCHRONISATION

Page 4: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Hybrid Transactional / Analytical Processing

Data Organisation

Row wise WritesOLTP

Column Wise ReadsOLAP

OLTP OLAPAnalytics Latency

Data Redundancy

Single platform for multimodel HTAP

HTAP

Page 5: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP
Page 6: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Data & Application Integration

Data Integration

(ETL/ELT)

Application Integration

(ESB / API)

Data Virtualisation

(DV)

Low Fidelity View

Integration Type Physical Movement and

Consolidation

Synchronization and

Propagation

Abstraction, Virtual

Consolidation,

Federation

Purpose Database to Database Application to

Application

Database to

Application

Agility* Weeks, Months Minutes, Hours Hours, Days

Repository Warehouse / Lake Transactional System Semantic Layer

Run Time* Typically Scheduled Event Driven Typically OnDemand

Page 7: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Data Warehouse vs Data Lakes

Data Warehouse Data Lake

• Schema on write

• (early binding)

• BI and analysts

• Arguably better governed.

• MPP / SMP databases

• Schema on read

• (Late binding)

• Data scientists

• Arguably more flexible

• MapReduce et al.

Page 8: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP
Page 9: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Data Warehouse vs Data Lakes

Data Warehouse Data Lake

Metro Uber

Page 10: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP
Page 11: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP
Page 12: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

The Problems

• ACID

• Atomicity

• Concurrency

• Isolation

• Durability

• BASE

• Basic Availability

• Soft States

• Eventual Consistency

• CAP Theorem (Consistency vs Availability vs Partitions)

• CQRS (Command Query Response Segregation)

Page 13: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

[Distributed Cache]

Metrics & Analytics

Complex Alerts

Scheduled Feeds

AdhocQueries

Detail Data

Logical Unified Data Model

Subject Area Logical Data Models

Enterprise Data Fabric Cluster

Unified queries

Business & Exceptions Rules

Normalisation and Standardisation Rules

Top Down Data Quality (and profiling)

Connectors (JDBC, APIs, ODBC, etc)

Schedulers and Refresh Jobs Rules

Query Planners & Optimiser(s)

Data Fabric Architecture

• Schema Store• Distributed Cache• Query Federation• Query Optimisation• Semantic Normalisation

Page 14: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Foreign Data Wrappers

One database to rule them all,

One database to find them,

One database to bring them all,

And in a wrapper bind them.

Page 15: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Foreign Data Wrappers

• Uses the standard compliant SQL/MED

• *Data Type Translations (SQL, NoSQL etc)

• *Push Down Predicates

• WHERE and ORDER BY are propagated

• Required COLUMNS

• *Supports CRUD

• *Two Way Joins

• Import Foreign Schemas

*May vary based on specific wrapper

Page 16: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

(Reflections) the PostgreSQL way

TABLE_ALPHA

A

DatabaseCluster

Shared BufferPool TABLE_ALPHA

A

BEGIN;INSERT INTO TABLE_ALPHA VALUES (‘B’);COMMIT;

TABLE_ALPHA

A B

Dirty Page(s)

Without WAL Logs

OS Cache

Use WAL Logs for Cache Synchronisation

TABLE_ALPHA

A

TABLE_ALPHA

A

TABLE_ALPHA

A B

Dirty Page(s)

BEGIN;INSERT INTO TABLE_ALPHA VALUES (‘B’);COMMIT;

WAL Checkpointswrite back to disk

pg_prewarm()Loads OS and buffer cache(s)

Page 17: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Distributed Cache with PostgreSQL

Disk

Cache

Disk

Cache

FDW

Cache

FDW

Cache

Distributed PG Cluster with

Federated queries loaded

directed to Cache

BI APP DS MSG

Disk

Distributed Cache

Disk FDW FDW

Distributed PG Cluster

integrated With Distributed

Cache Technologies

BI APP DS MSG

JDBC, pgmemcache

Apache Ignite

Dremio

Terracotta

memcached

Some Technologies

Page 18: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Key PostgreSQL Features for Fabrics

Schema Store

Distributed Cache

Query Federation

Query Optimisation

Semantic

Normalisation

Dirty Pages

pg_prewarm()

pgmemcache

Persistence WAL

Database

CTEs,

Least Cost Routing

ORM + FRM

Foreign Data Wrappers

Page 19: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

Key Takeaways

• Cloud migrations• Jurisdictions (privacy and availability zones) • Computing at the edge of the network. • Data Virtualisation• Rights to be Forgotten (GDPR)• Query Lineage & Audit across ALL data• Areas for future development.

Page 20: HTAP Queries & Data Fabrics - PgDU 2019 and Da… · HTAP Queries & Data Fabrics Atif Rahman @mantaq10 7th December, 2018 PostgreSQL Down Under Melbourne, Australia. The agenda OLTP

HTAP OLAP OLTP