Databases in the Cloudmhhammou/15440-f14/lectures/Dat… · Database Warehouses •Complete history...
OLTP vs. OLAP databases.
Source: https://www.flickr.com/photos/adesigna/3237575990
On-line Transaction Processing
• Fast operations that ingest new data and then update state using ACID transactions.
• Only access a small amount of data.
• Volume: 1k to 1m txn/sec
• Latency: <1-50 ms
• Database Size: 100s GB to 10s TB
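A minimal sketch of the "ingest new data, then update state" pattern in one ACID transaction, using Python's built-in sqlite3 module. The `players` and `events` tables and the `record_level_complete` helper are hypothetical names chosen for illustration; they are not from the lecture.

```python
import sqlite3

def record_level_complete(conn, player_id, points):
    """Ingest one event and update player state in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO events (player_id, points) VALUES (?, ?)",
            (player_id, points),
        )
        cur = conn.execute(
            "UPDATE players SET score = score + ? WHERE id = ?",
            (points, player_id),
        )
        if cur.rowcount != 1:
            raise ValueError(f"unknown player {player_id}")

# Hypothetical schema: one state table and one append-only event table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
conn.execute("CREATE TABLE events (player_id INTEGER NOT NULL, points INTEGER NOT NULL)")
conn.execute("INSERT INTO players (id, score) VALUES (1, 0)")
conn.commit()

record_level_complete(conn, 1, 50)
try:
    record_level_complete(conn, 99, 10)  # no such player: the insert is rolled back too
except ValueError:
    pass

score = conn.execute("SELECT score FROM players WHERE id = 1").fetchone()[0]
event_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Each call touches only a couple of rows, which is what keeps this kind of operation fast. The failed call leaves no partial state behind.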
Example
• [...] on-line game in the OLTP database.
[Diagram: Game Application Framework, Click Stream, Game Updates, OLTP DBMS.]
Pre-computed model decides the next level the player is shown.
[Diagram, next build: the same components, plus Real-time Monitoring.]
Database Warehouses
• Complete history of OLTP databases.
• Complex queries that analyze large segments of fact tables and combine them with dimension tables.
• Volume: A couple queries per second
• Latency: 1-60 seconds
• Database Size: 100s TB to 10s PB
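As a rough illustration of a query that scans a fact table and combines it with a dimension table, here is a small sketch using Python's built-in sqlite3 module. The `fact_play` and `dim_level` tables and their columns are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical dimension table: one row per game level.
    CREATE TABLE dim_level (level_id INTEGER PRIMARY KEY, difficulty TEXT);
    -- Hypothetical fact table: one row per play of a level.
    CREATE TABLE fact_play (player_id INTEGER, level_id INTEGER, seconds INTEGER);
    INSERT INTO dim_level VALUES (1, 'easy'), (2, 'hard'), (3, 'hard');
    INSERT INTO fact_play VALUES
        (10, 1, 30), (11, 1, 50), (10, 2, 200), (12, 3, 100), (11, 3, 120);
""")

# Scan the whole fact table, join each row to its dimension row, then aggregate.
rows = conn.execute("""
    SELECT d.difficulty, COUNT(*) AS plays, AVG(f.seconds) AS avg_seconds
    FROM fact_play AS f
    JOIN dim_level AS d ON d.level_id = f.level_id
    GROUP BY d.difficulty
    ORDER BY d.difficulty
""").fetchall()
```

Unlike an OLTP operation, the query reads every row of the fact table, which is why latencies are measured in seconds rather than milliseconds.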
Example
• Compute model used to guide OLTP DBMS decisions from historical data.
[Diagram: Game Application Framework, Click Stream, Game Updates, OLTP DBMS, ETL, OLAP DBMS, New Model.]
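The loop in this example (OLTP data, ETL, OLAP history, new model) might be sketched as below. This is an illustrative toy in plain Python: the event fields, the dictionary standing in for the warehouse, and the choice of "completion rate per level" as the model are all assumptions, not details from the lecture.

```python
from collections import defaultdict

def extract(oltp_events):
    """Extract: read raw click-stream rows from the OLTP side."""
    return list(oltp_events)

def transform(events):
    """Transform: aggregate raw events into per-level attempt/completion counts."""
    stats = defaultdict(lambda: {"attempts": 0, "completions": 0})
    for e in events:
        s = stats[e["level"]]
        s["attempts"] += 1
        s["completions"] += 1 if e["completed"] else 0
    return dict(stats)

def load(warehouse, stats):
    """Load: merge the new aggregates into the OLAP-side history."""
    for level, s in stats.items():
        row = warehouse.setdefault(level, {"attempts": 0, "completions": 0})
        row["attempts"] += s["attempts"]
        row["completions"] += s["completions"]

def compute_model(warehouse):
    """New model (hypothetical): completion rate per level."""
    return {lvl: r["completions"] / r["attempts"] for lvl, r in warehouse.items()}

events = [
    {"level": "A", "completed": True},
    {"level": "A", "completed": False},
    {"level": "B", "completed": True},
    {"level": "B", "completed": True},
]
warehouse = {}
load(warehouse, transform(extract(events)))
model = compute_model(warehouse)
```

The OLTP side would then consult `model` when deciding which level to show next, without running the analysis itself.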
OLTP vs. OLAP
• Storage Format:
  – OLTP → Row-oriented
  – OLAP → Column-oriented
• Primary Database Location:
  – OLTP → In-Memory
  – OLAP → Disks
• Workloads:
  – OLTP → Write-Heavy
  – OLAP → Read-Only
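The storage-format difference can be illustrated with a toy sketch in plain Python. The sample records and the `to_columns` helper are made up for this example; real systems use far more compact layouts.

```python
# Row-oriented: all attributes of one record are stored together.
rows = [
    {"id": 1, "name": "ann", "amount": 10.0},
    {"id": 2, "name": "bob", "amount": 30.0},
    {"id": 3, "name": "cat", "amount": 20.0},
]

def to_columns(rows):
    """Column-oriented: all values of one attribute are stored together."""
    cols = {}
    for r in rows:
        for k, v in r.items():
            cols.setdefault(k, []).append(v)
    return cols

columns = to_columns(rows)

# OLTP-style access: fetch (or update) one whole record.
record = rows[1]

# OLAP-style access: scan a single attribute across all records.
avg_amount = sum(columns["amount"]) / len(columns["amount"])
```

The row layout makes single-record reads and writes cheap, while the column layout lets an aggregate touch only the one attribute it needs.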
Things to consider with databases in the cloud.
Source: https://www.flickr.com/photos/arvidnn/15285491335
Good Things
• Better Resource Utilization
• Elastic Scaling
• Database-as-a-Service Offerings
Better Resource Utilization
• Combine multiple silos onto overprovisioned resources.
• Public platform providers achieve better economies of scale.
• Database machines are (mostly) dead.
• Optimal multi-tenant placement is a difficult problem.
Elastic Scaling
• Automatically provision new resources on the fly as needed.
• Scaling up vs. scaling out.
• Difficult for OLTP DBMS to continue processing transactions while data migrates.
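To get a feel for how much data may need to migrate when scaling out, here is a small sketch that assumes a naive "key modulo node count" placement. That placement scheme is an assumption picked for illustration only; it is not how any particular DBMS discussed here partitions data.

```python
def node_for(key, num_nodes):
    """Naive placement (assumed for illustration): key modulo number of nodes."""
    return key % num_nodes

def keys_to_migrate(keys, old_nodes, new_nodes):
    """Keys whose home node changes when the cluster is resized."""
    return [k for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)]

keys = range(1200)
moved = keys_to_migrate(keys, old_nodes=3, new_nodes=4)
fraction_moved = len(moved) / len(keys)
```

Under this naive scheme, going from 3 to 4 nodes relocates three quarters of the keys. Smarter partitioning schemes aim to move far less, but some data still has to move while transactions keep arriving.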
OLTP Scale-out Example
[Figure: TPC-C benchmark on H-Store (Fall 2014), scaling from 3 to 4 nodes. Axis label: Elapsed Time.]
E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing. R. Taft, E. Mansour, M. Serafini, J. Duggan, A. J. Elmore, A. Aboulnaga, A. Pavlo, M. Stonebraker. Proceedings of the VLDB Endowment, vol. 8, iss. 3, pp. 245-256, November 2014.
Database-as-a-Service
• Cloud provider manages physical configuration of a DBMS.
• Ideal for applications that are co-located in [...]
• Combine private data with curated databases (i.e., data marts)
Bad Things
• I/O Virtualization
• File System Replication
• Security + Privacy Concerns
• Performance Variance
I/O Virtualization
• Distributed file system stores data transparently across multiple nodes.
• This causes a DBMS to "pull data to query" instead of "push query to data".
OLAP I/O Virtualization
SELECT YEAR(o_date) AS o_year, AVG(o_amount) FROM orders GROUP BY o_year ORDER BY o_year ASC
[Diagram: OLAP DBMS and Distributed Filesystem. Data moved between them: Terabytes!]
[Diagram, next build: same query, OLAP DBMS and Distributed Filesystem. Data moved between them: Bytes!]
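The gap between "Terabytes!" and "Bytes!" comes from where the aggregation runs. A minimal sketch for the yearly-average query follows, with two in-memory lists standing in for storage nodes. The node contents and function names are hypothetical, and rows are pre-reduced to (year, amount) pairs.

```python
from collections import defaultdict

# Hypothetical storage nodes, each holding (year, amount) rows of `orders`.
nodes = [
    [(2013, 10.0), (2013, 30.0), (2014, 50.0)],
    [(2013, 20.0), (2014, 70.0)],
]

def pull_data_to_query(nodes):
    """Ship every row to the DBMS, then aggregate there."""
    shipped = [row for node in nodes for row in node]
    sums, counts = defaultdict(float), defaultdict(int)
    for year, amount in shipped:
        sums[year] += amount
        counts[year] += 1
    return {y: sums[y] / counts[y] for y in sorted(sums)}, len(shipped)

def local_partial(node):
    """Runs on the storage node: one (sum, count) pair per year."""
    partial = {}
    for year, amount in node:
        s, c = partial.get(year, (0.0, 0))
        partial[year] = (s + amount, c + 1)
    return partial

def push_query_to_data(nodes):
    """Ship only per-year partial aggregates and merge them at the DBMS."""
    partials = [local_partial(n) for n in nodes]
    merged = {}
    for p in partials:
        for year, (s, c) in p.items():
            ms, mc = merged.get(year, (0.0, 0))
            merged[year] = (ms + s, mc + c)
    shipped = sum(len(p) for p in partials)
    return {y: merged[y][0] / merged[y][1] for y in sorted(merged)}, shipped

pull_result, pull_shipped = pull_data_to_query(nodes)
push_result, push_shipped = push_query_to_data(nodes)
```

Both strategies return the same averages. In this toy the saving is small, but the pushed-down version ships one record per year per node no matter how many order rows each node holds.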
File System Replication
• The DBMS should not rely on file system replication for durability.
• OLTP systems maintain replicas in-memory.
• OLAP systems can store copies of tables in different ways on replica nodes.
OLAP Replication
[Diagram: the OLAP DBMS with two replicas of Table 1 and Table 2. Replica #1 sort order: name. Replica #2 sort order: id.]
[Diagram, next build: the same two replicas, with the join Table1.name ⨝ Table2.name added.]
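One way to read the replica diagram: if each replica keeps the tables in a different sort order, a join on `name` can be routed to the replica already sorted by `name` and run as a merge join without sorting first. The sketch below assumes unique join keys and uses made-up table contents and helper names.

```python
def merge_join(left, right, key):
    """Join two row lists already sorted on `key` (keys assumed unique)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        a, b = left[i][key], right[j][key]
        if a == b:
            out.append((left[i], right[j]))
            i += 1
            j += 1
        elif a < b:
            i += 1
        else:
            j += 1
    return out

# Hypothetical table contents.
table1 = [{"id": 2, "name": "ann"}, {"id": 1, "name": "bob"}, {"id": 3, "name": "cat"}]
table2 = [{"id": 7, "name": "cat"}, {"id": 8, "name": "ann"}, {"id": 9, "name": "dan"}]

def build_replica(sort_key):
    """Each replica stores both tables, sorted on its own key."""
    return {
        "sort_key": sort_key,
        "table1": sorted(table1, key=lambda r: r[sort_key]),
        "table2": sorted(table2, key=lambda r: r[sort_key]),
    }

replicas = [build_replica("name"), build_replica("id")]

def pick_replica(replicas, join_key):
    """Route the query to the replica whose sort order matches the join key."""
    for rep in replicas:
        if rep["sort_key"] == join_key:
            return rep
    return None  # no match: the caller would have to sort before joining

rep = pick_replica(replicas, "name")
joined = merge_join(rep["table1"], rep["table2"], "name")
pairs = [(l["id"], r["id"]) for l, r in joined]
```

The same data is still stored twice, so durability is preserved, while each copy also speeds up a different class of queries.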
Security + Privacy Concerns
• No truly encrypted solution exists.
• Many companies are unable to use public cloud platforms.
Performance Variance
• DBMSs are sensitive to changes in underlying hardware performance.
• [...] large fluctuations in performance.
OLTP Performance Variance
[Figure: YCSB on MySQL (Winter 2012), medium EC2 instances. Annotation: 35% difference.]
OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases. Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, Philippe Cudre-Mauroux. Proceedings of the VLDB Endowment, vol. 7, pp. 277-288, December 2013.
Cloud database vendors.
Source: https://www.flickr.com/photos/alestra/8891585632
Important Features
• Automatic Back-ups
• Geo-replication
• Elasticity / Live Reconfiguration
• Efficient Multi-Tenancy
• Workload Awareness
Cloud Database Vendors
• Cloud-friendly systems
• Database-as-a-Service (DBaaS)
Cloud-friendly DBMSs
• Most DBMS vendors make it easy to deploy on cloud platforms.
• Others provide support for easy scale-out in a cloud environment.
• More than just pre-configured instances.
OLTP DBaaS
• Amazon RDS / Aurora
• Microsoft Azure
• Google Cloud SQL
• Database.com
• ClearDB
• GenieDB
• Clustrix
OLAP DBaaS
• Amazon Redshift
• Google BigQuery
• Microsoft Azure
• Snowflake
Parting Thoughts
• The cloud does not magically make database problems go away.
• [...] DBMS on the cloud.
• AFAIK, there is no truly autonomous DBMS as of yet.
END @andy_pavlo