Towards Automated Polyglot Persistence · 2020. 6. 24. · Polyglot Persistence Current best...
Michael Schaarschmidt, Felix Gessert, Norbert [email protected]
Towards Automated Polyglot Persistence
Polyglot Persistence: Current best practice
[Figure: an application layer that stores each kind of data in a separate system.]
Data: billing data, nested application data, session data, search index, files, friend network, cached data & metrics, recommendation engine
Services shown: Amazon Elastic MapReduce, Google Cloud Storage
Research Question:
Can we automate the mapping problem?
data → database
Vision: Schemas can be annotated with requirements
- Write Throughput > 10,000 RPS
- Read Availability > 99.9999%
- Scans = true
- Full-Text-Search = true
- Monotonic Read = true
Schema: DBs, Tables, Fields
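One way to picture an annotated schema element is as a small data structure. The sketch below is only an illustration: the class names (`Continuous`, `Binary`, `SchemaElement`) and the table name `articles` are hypothetical, and the slides do not prescribe a data model. The five annotations are the ones listed above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Continuous:
    """Continuous requirement: a metric compared against a bound (hypothetical type)."""
    metric: str
    op: str        # ">" or "<"
    bound: float
    unit: str


@dataclass
class Binary:
    """Binary requirement: a capability that must be present (hypothetical type)."""
    capability: str


Annotation = Union[Continuous, Binary]


@dataclass
class SchemaElement:
    """A database, table, or field, optionally carrying annotations."""
    name: str
    kind: str  # "database", "table", or "field"
    annotations: List[Annotation] = field(default_factory=list)
    children: List["SchemaElement"] = field(default_factory=list)


# The table name is an assumption; the annotations come from the slide.
articles = SchemaElement(
    name="articles",
    kind="table",
    annotations=[
        Continuous("write_throughput", ">", 10_000, "RPS"),
        Continuous("read_availability", ">", 99.9999, "%"),
        Binary("scans"),
        Binary("full_text_search"),
        Binary("monotonic_read"),
    ],
)

# Split the annotations by type, as a resolver would need to do.
binary_names = [a.capability for a in articles.annotations if isinstance(a, Binary)]
```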
Vision: The Polyglot Persistence Mediator chooses the database
[Figure: the application sends data and operations to the Polyglot Persistence Mediator, which uses the annotated schema (e.g. Latency < 30ms) and database metrics to choose among db1, db2, and db3.]
Towards Automated Polyglot Persistence: Necessary steps
Goal:
◦ Extend classic workload management to polyglot persistence
◦ Leverage heterogeneous (NoSQL) databases
1. Requirements: Tenant specifies requirements as Service-Level-Agreements
2. Resolution: Find or provision a suitable combination of databases
3. Mediation: Mediate data and database operations
Service Level Agreements: Expressing application requirements
Functional Service Level Objectives
◦ Guarantee a "feature"
◦ Determined by database system
◦ Examples: transactions, join
Non-Functional Service Level Objectives
◦ Guarantee a certain quality of service (QoS)
◦ Determined by database system and service provider
◦ Examples:
  Continuous: response time (latency), throughput
  Binary: Elasticity, Read-your-writes
Service Level Agreements: Refining the utility of each SLO
Utility expresses the "value" of a continuous non-functional requirement:
$f_{utility}: metric \to [0,1]$
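A utility function of this kind could, for instance, be piecewise linear. The sketch below is an assumption for illustration only: the slides do not say which shape the utility takes, and the name `linear_utility` and the 10 ms / 50 ms bounds are made up.

```python
def linear_utility(value: float, worst: float, best: float) -> float:
    """Map a metric value to [0, 1]: 0 at `worst`, 1 at `best`, linear in between.

    Works for "lower is better" metrics (best < worst, e.g. latency)
    and "higher is better" metrics (best > worst, e.g. throughput).
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    u = (value - worst) / (best - worst)
    return max(0.0, min(1.0, u))  # clamp to the unit interval


# Hypothetical latency SLO: full utility at 10 ms or less, none at 50 ms or more.
half = linear_utility(30, worst=50, best=10)
```

Here a latency of 30 ms sits halfway between the two bounds, so it receives a utility of 0.5.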
SLA Example: For MongoDB
Functional Requirements: Scan Queries, Conditional Updates, Transactions, Query by Example, Joins, Analytics
Non-Functional Requirements: Elasticity, Consistency, Read-Latency, Write-Latency, Write-Throughput, Scalability of Data Volume, Read Scalability, Write Scalability, Read-Availability, Write-Availability, Durability
Step I - Requirements: Expressing the application's needs
1. Define schema: the tenant defines the schema (Database → Table → Fields)
2. Annotate: the tenant annotates the schema with his requirements
◦ A field inherits continuous annotations from its annotated table
Annotations:
◦ Continuous non-functional, e.g. write latency < 15ms
◦ Binary functional, e.g. atomic updates
◦ Binary non-functional, e.g. read-your-writes
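The inheritance of continuous annotations can be sketched as a walk from a schema element up to the root. This is a minimal illustration, not the prototype's implementation: the `Node` class is hypothetical, and the rule that a closer annotation overrides an inherited one is an assumption the slides do not state.

```python
from typing import Dict, Optional


class Node:
    """Hypothetical schema node holding only continuous annotations."""

    def __init__(self, name: str,
                 continuous: Optional[Dict[str, float]] = None,
                 parent: Optional["Node"] = None) -> None:
        self.name = name
        self.continuous = dict(continuous or {})
        self.parent = parent


def effective_continuous(node: Optional[Node]) -> Dict[str, float]:
    """Collect continuous annotations from the root down to `node`."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    merged: Dict[str, float] = {}
    for n in reversed(chain):       # root first
        merged.update(n.continuous)  # assumption: closer annotations win
    return merged


# Table annotated with "write latency < 15ms"; the field carries no annotation.
table = Node("Customers", {"write_latency_ms": 15})
plain_field = Node("UserName", parent=table)
strict_field = Node("ShoppingBasket", {"write_latency_ms": 5}, parent=table)
```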
Step II - Resolution: Finding the best database
The Provider resolves the requirements
◦ RANK: scores available database systems
◦ Routing Model: defines the optimal mapping from schema elements to databases
Resolution (input: capabilities for available DBs)
1. Find optimal: RANK(schema_root, DBs) through recursive descent, using annotated schema and metrics
2a. If unsatisfiable, either refuse or provision a new DB
2b. Otherwise, generate the routing model
Routing Model:
◦ Route schema_element → db
◦ Transform db-independent to db-specific operations
Step II - Resolution: Ranking algorithm by example
Schema:
  ECommerceDB (database)
    Customers (table)
      ShoppingBasket (List<String>)
      UserName (String)
Annotations: Linearizability, Availability, Read latency
DBs = { MongoDB, Riak, Cassandra, CouchDB, Redis, MySQL, S3, HBase }
RANK Algorithm:
No annotation → recursive descent to child
RANK Algorithm, binary requirement:
1. Exclude DBs that do not support it
2. Recursive descent
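The exclusion step for a binary requirement amounts to a filter over the candidate set. In the sketch below, the capability flags are illustrative only: they are chosen so that the surviving set matches the four databases scored on the later slides, and they are not a statement about the real systems. The function name `exclude_unsupported` is hypothetical.

```python
from typing import Dict, List, Set

ALL_DBS = ["MongoDB", "Riak", "Cassandra", "CouchDB", "Redis", "MySQL", "S3", "HBase"]

# Illustrative capability table (assumption, see lead-in).
CAPABILITIES: Dict[str, Set[str]] = {db: set() for db in ALL_DBS}
for db in ("MongoDB", "Redis", "MySQL", "HBase"):
    CAPABILITIES[db].add("linearizability")


def exclude_unsupported(dbs: List[str],
                        capabilities: Dict[str, Set[str]],
                        required: str) -> List[str]:
    """Keep only the databases that support the required binary capability."""
    return [db for db in dbs if required in capabilities.get(db, set())]


candidates = exclude_unsupported(ALL_DBS, CAPABILITIES, "linearizability")
```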
RANK Algorithm, continuous requirement: ∀ databases calculate
$db \to f_{utility}(db.availability)$
Database   Availability   Utility
MongoDB    99%            0.8
Redis      95%            0.05
MySQL      94%            0.04
HBase      99.9%          0.9
RANK Algorithm, continuous requirement: ∀ databases calculate
$db \to f_{utility}(db.latency)$
Database   Latency   Utility
MongoDB    10ms      1
Redis      1ms       1
MySQL      40ms      0.2
HBase      50ms      0.1
RANK Algorithm, binary requirement (continued):
3. Pick DB with best total score and add it to routing model
DB        Score
MongoDB   0.9
Redis     0.525
MySQL     0.12
HBase     0.5
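The scores in the table are consistent with taking the arithmetic mean of each database's availability and latency utilities. The slides do not name the aggregation, so treating it as a plain mean is an inference from the numbers. A short check under that assumption:

```python
# Utilities copied from the availability and latency tables on the slides.
UTILITIES = {
    "MongoDB": {"availability": 0.8, "latency": 1.0},
    "Redis": {"availability": 0.05, "latency": 1.0},
    "MySQL": {"availability": 0.04, "latency": 0.2},
    "HBase": {"availability": 0.9, "latency": 0.1},
}


def total_score(utilities: dict) -> float:
    """Assumed aggregation: unweighted mean of the per-requirement utilities."""
    return sum(utilities.values()) / len(utilities)


scores = {db: round(total_score(u), 3) for db, u in UTILITIES.items()}
best = max(scores, key=scores.get)
```

With these inputs the computed scores match the slide (0.9, 0.525, 0.12, 0.5), and MongoDB ranks first.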
Routing Model: Customers → MongoDB
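Putting the steps together, the recursive descent might look roughly like the sketch below. It is a simplification under stated assumptions: all annotations sit on one schema element, the score is an unweighted mean of utilities, the `supports` flags are illustrative, and the dictionary layout and the name `rank` (in lowercase) are hypothetical rather than the prototype's API.

```python
from typing import Dict

# Utilities mirror the slide tables; "supports" flags are illustrative only.
DBS = {
    "MongoDB": {"supports": {"linearizability"},
                "utility": {"availability": 0.8, "read_latency": 1.0}},
    "Redis": {"supports": {"linearizability"},
              "utility": {"availability": 0.05, "read_latency": 1.0}},
    "MySQL": {"supports": {"linearizability"},
              "utility": {"availability": 0.04, "read_latency": 0.2}},
    "HBase": {"supports": {"linearizability"},
              "utility": {"availability": 0.9, "read_latency": 0.1}},
}
for excluded in ("Riak", "Cassandra", "CouchDB", "S3"):
    DBS[excluded] = {"supports": set(), "utility": {}}

SCHEMA = {
    "name": "ECommerceDB",
    "children": [{
        "name": "Customers",
        "binary": ["linearizability"],
        "continuous": ["availability", "read_latency"],
        "children": [{"name": "ShoppingBasket"}, {"name": "UserName"}],
    }],
}


def rank(node: dict, dbs: dict, routing: Dict[str, str]) -> Dict[str, str]:
    """Descend until an annotated element is found, then route it."""
    binary = node.get("binary", [])
    continuous = node.get("continuous", [])
    if not binary and not continuous:
        # No annotation: recursive descent to the children.
        for child in node.get("children", []):
            rank(child, dbs, routing)
        return routing
    # Binary requirements: exclude databases that do not support them.
    candidates = [name for name, db in dbs.items()
                  if all(b in db["supports"] for b in binary)]
    if not candidates:
        # The slides say: refuse, or provision a new database.
        raise LookupError(f"no database satisfies {node['name']}")

    def score(name: str) -> float:
        utils = [dbs[name]["utility"][c] for c in continuous]
        return sum(utils) / len(utils) if utils else 0.0

    # Pick the best-scoring database and add it to the routing model.
    routing[node["name"]] = max(candidates, key=score)
    return routing


routing_model = rank(SCHEMA, DBS, {})
```

For the example schema this yields a routing model that maps `Customers` to `MongoDB`, in line with the slide.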
Step III - Mediation: Routing data and operations
The PPM routes data
◦ Operation Rewriting: translates from abstract to database-specific operations
◦ Runtime Metrics: Latency, availability, etc. are reported to the resolver
◦ Primary Database Option: All data periodically gets materialized to a designated database
Mediation
1. The application issues CRUD, queries, transactions, etc.
2. The Polyglot Persistence Mediator routes them to db1, db2, or db3. It uses the routing model, triggers periodic materialization, and reports metrics.
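The mediation step can be sketched as a thin router in front of several backends. Everything below is a stand-in: `InMemoryBackend` replaces a real database client, the operation names `put` and `get` are made up, and the mediator only records latencies locally instead of reporting them to a resolver.

```python
import time
from typing import Any, Dict, List


class InMemoryBackend:
    """Stand-in for a real database client (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Any] = {}

    def execute(self, op: str, key: str, value: Any = None) -> Any:
        if op == "put":
            self.data[key] = value
            return None
        if op == "get":
            return self.data.get(key)
        raise ValueError(f"unsupported operation: {op}")


class Mediator:
    """Routes abstract operations to the backend named in the routing model."""

    def __init__(self, routing_model: Dict[str, str],
                 backends: Dict[str, InMemoryBackend]) -> None:
        self.routing_model = routing_model
        self.backends = backends
        self.latencies: Dict[str, List[float]] = {n: [] for n in backends}

    def _timed(self, element: str, op: str, key: str, value: Any = None) -> Any:
        backend = self.backends[self.routing_model[element]]  # route
        start = time.perf_counter()
        result = backend.execute(op, key, value)               # rewritten operation
        self.latencies[backend.name].append(time.perf_counter() - start)
        return result

    def write(self, element: str, key: str, value: Any) -> None:
        self._timed(element, "put", key, value)  # abstract write -> backend "put"

    def read(self, element: str, key: str) -> Any:
        return self._timed(element, "get", key)  # abstract read -> backend "get"


backends = {"MongoDB": InMemoryBackend("MongoDB"), "Redis": InMemoryBackend("Redis")}
mediator = Mediator({"Customers": "MongoDB"}, backends)
mediator.write("Customers", "alice", {"basket": ["book"]})
customer = mediator.read("Customers", "alice")
```

Keeping the routing model as plain data means a new resolution result can replace it without changing the application's calls.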
Evaluation: News Article (Prototype built on ORESTES)
Scenario: news articles with impression counts
Objectives: low-latency top-k queries, high-throughput counts, article-queries
Schema elements: Article, Counter
[Figures: Mediator with candidate setups; figure annotations:]
◦ Counter updates kill performance
◦ No powerful queries
Found Resolution:
[Figure: Article (ID, Title, …) stored as a Document; impression counts (Imp., ID) stored in a Sorted Set.]
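The split between a document and a sorted set can be illustrated with in-memory stand-ins. This is only a sketch: a `dict` plays the role of the document store, a `collections.Counter` plays the role of the sorted set, and the function names are hypothetical. It shows why the split serves the stated objectives, since counting touches only the counter while article queries touch only the documents.

```python
from collections import Counter
from typing import Dict, List

articles: Dict[str, dict] = {}    # stand-in for the document store
impressions: Counter = Counter()  # stand-in for the sorted set, keyed by article ID


def put_article(article_id: str, title: str) -> None:
    articles[article_id] = {"id": article_id, "title": title}


def count_impression(article_id: str) -> None:
    # High-throughput path: only the counter is updated, not the document.
    impressions[article_id] += 1


def top_k(k: int) -> List[str]:
    # Top-k path: rank IDs by impressions, then look up the documents.
    return [articles[aid]["title"] for aid, _ in impressions.most_common(k)]


put_article("a1", "A")
put_article("a2", "B")
put_article("a3", "C")
for aid in ["a2", "a2", "a2", "a3", "a3", "a1"]:
    count_impression(aid)
top_two = top_k(2)
```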
Challenges & Future Work
◦ Workload Management: during mediation, actively schedule requests based on requirements
◦ Ranking: Predict future metrics from historic ones (time-series analysis) or from performance models
◦ Database selection: minimize $P(\text{SLA violation}) \cdot penalty$ (e.g. through reinforcement learning)
◦ Meta-DBaaS: Mediate over DBaaS-systems and factor in their SLAs
◦ Live Migration: Enable requirement changes
◦ Requirements: collect library of common ones
◦ Utility: Provide intuitive, visual "knobs" for developers
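The database selection objective can be made concrete with a toy calculation. The sketch below estimates the violation probability as the fraction of historic latency samples above a threshold, which is a deliberately simple stand-in for the time-series or reinforcement-learning approaches the slide mentions. The sample values, the threshold, and the penalty are invented for illustration.

```python
from typing import Dict, List, Tuple


def violation_probability(samples: List[float], threshold: float) -> float:
    """Empirical share of samples that exceed the SLA threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


def select_db(history: Dict[str, List[float]],
              threshold_ms: float,
              penalty: float) -> Tuple[str, Dict[str, float]]:
    """Pick the database with the lowest expected penalty."""
    expected = {db: violation_probability(samples, threshold_ms) * penalty
                for db, samples in history.items()}
    return min(expected, key=expected.get), expected


# Invented latency histories in milliseconds.
history = {
    "db1": [10, 12, 40, 11],
    "db2": [20, 35, 33, 31],
}
choice, expected_penalty = select_db(history, threshold_ms=30, penalty=100)
```

With these numbers db1 violates the 30 ms bound in one of four samples and db2 in three of four, so db1 has the lower expected penalty.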
Summary
◦ (Manual) Polyglot Persistence is a reality, but difficult and error-prone
◦ Polyglot Persistence Mediator: SLA-driven, fine-grained selection of database systems
1. Requirements: Let the tenant define his requirements
2. Resolution: Choose or provision a database based on that
3. Mediation: Route data and operations according to that mapping
Thank you.