Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems Taha Rafiq MMath Thesis Presentation 24/04/2013

Elasca: Workload-Aware Elastic Scalability for Partition Based Database Systems

Taha Rafiq

MMath Thesis Presentation

24/04/2013

Outline

1. Introduction & Motivation
2. VoltDB & Elastic Scale-Out Mechanism
3. Partition Placement Problem
4. Workload-Aware Optimizer
5. Experiments & Results
6. Supporting Multi-Partition Transactions
7. Conclusion

INTRODUCTION & MOTIVATION

DBMS Scalability

Replication

Partitioning

Traditional (DBMS) Scalability

Higher Load → Add Resources → Better Performance

Ability of a system to be enlarged to handle a growing amount of work

Expensive Downtime

Elastic (DBMS) Scalability

Higher Load → Dynamically Add Resources → Better Performance

Use of computing resources that vary dynamically to meet a variable workload

No Downtime

Elastically Scaling a Partition Based DBMS: Re-Partitioning

[Diagram: before scale out, Node 1 holds Partition 1. After scale out, Node 1 holds Partition 1 and the new Node 2 holds Partition 2. Scale in reverses the change.]

Elastically Scaling a Partition Based DBMS: Partition Migration

[Diagram: before scale out, Node 1 holds P1, P2, P3 and P4. After scale out, Node 1 holds P1 and P2 and the new Node 2 holds P3 and P4. Scale in reverses the change.]

Partition Migration for Elastic Scalability

Mechanism: How to add/remove nodes and move partitions

Policy/Strategy: Which partitions to move, when, and where during scale out/scale in

Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer

VOLTDB & ELASTIC SCALE-OUT MECHANISM

What is VoltDB?

• In-memory, partition based DBMS
  – No disk access = very fast
• Shared-nothing architecture, serial execution
  – No locks
• Stored procedures
  – No arbitrary transactions
• Replication
  – Fault tolerance & durability

VoltDB Architecture

[Diagram: three nodes, each with a Client Interface, an Initiator and two execution-site threads (ES1, ES2). Each execution site owns one partition replica: the first node holds P1 and P2, the second P3 and P1, the third P2 and P3. Four clients connect to the cluster.]

Single-Partition Transactions

[Diagram: the three-node VoltDB cluster (P1 P2, P3 P1, P2 P3; each node with ES1, ES2, an Initiator and a Client Interface) serving four clients.]

Multi-Partition Transactions

[Diagram: the three-node VoltDB cluster (P1 P2, P3 P1, P2 P3; each node with ES1, ES2, an Initiator and a Client Interface) serving four clients, with one execution site (ES1) singled out.]

Elastic Scale-Out Mechanism

[Diagram: an existing node with partitions P1 to P4 on execution sites ES1 to ES4, an Initiator and a Client Interface, next to a scale-out node labelled "(Failed)" that has its own Initiator and Client Interface and hosts ES1 and ES4 with partitions P1 and P4.]

Overcommitting Cores

• VoltDB suggests: Partitions per node < Cores per node
• Wasted resources when load is low or data access is skewed

Idea: Aggregate extra partitions on each node and scale out when load increases

PARTITION PLACEMENT PROBLEM

Given…

Cluster and System Specifications

• Number of CPU cores
• Memory
• Max. number of nodes

Given…

[Bar chart: Load Per Partition. x-axis: Partition (P1 to P8); y-axis: Requests Per Second (0 to 3000).]

Given…

[Bar chart: Size of Each Partition. x-axis: Partition (P1 to P8); y-axis: Size in MB (0 to 1200).]

Given…

Current Partition-to-Node Assignment

[Table: rows P1 to P8, columns Node 1, Node 2 and Node 3; the cell marks did not survive extraction.]

Find…

Optimal Partition-to-Node Assignment (For Next Time Interval)

Partition  Node 1  Node 2  Node 3
P1         ?       ?       ?
P2         ?       ?       ?
P3         ?       ?       ?
P4         ?       ?       ?
P5         ?       ?       ?
P6         ?       ?       ?
P7         ?       ?       ?
P8         ?       ?       ?

Optimization Objectives

Maximize Throughput: Match the performance of a static, fully provisioned system

Minimize Resources Used: Use the minimum number of nodes required to meet performance demands

Optimization Objectives

Minimize Data Movement: Data movement adversely affects system performance and incurs network costs

Balance Load Effectively: Minimizes the risk of overloading a node during the next time interval

WORKLOAD-AWARE OPTIMIZER

System Overview

Statistics Collected

α: Maximum number of transactions that can be executed on a partition per second
  – Max capacity of Execution Sites

β: CPU overhead of host-level tasks
  – How much CPU capacity the Initiator uses

Effect of β

Estimating CPU Load

CPU Load Generated by Each Partition

Average CPU Load of Host-Level Tasks Per Node

Average CPU Load Per Node
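
The formulas behind these three quantities are not preserved in the transcript. The following is only a minimal sketch of how they could fit together, assuming that a partition's load is its request rate divided by α (a fraction of one core) and that host-level overhead grows linearly with the node's total request rate at β core-fractions per transaction per second. Both forms, and all function and parameter names, are assumptions for illustration and are not taken from the thesis.

```python
def partition_cpu_load(rate, alpha):
    """Fraction of one core used by a partition serving `rate` txn/s.

    Assumption: load grows linearly up to alpha, the maximum number of
    transactions a partition can execute per second.
    """
    return rate / alpha


def node_cpu_load(rates, alpha, beta, cores):
    """Average per-core CPU load of one node under the assumed model.

    rates: request rates (txn/s) of the partitions hosted on the node
    beta:  assumed host-level overhead, in core-fractions per txn/s
    cores: number of CPU cores on the node
    """
    partition_load = sum(partition_cpu_load(r, alpha) for r in rates)
    host_load = beta * sum(rates)  # Initiator and other host-level tasks
    return (partition_load + host_load) / cores


if __name__ == "__main__":
    # Two partitions at 1000 and 500 txn/s on a 4-core node.
    print(node_cpu_load([1000, 500], alpha=2000, beta=0.0001, cores=4))
```

In this sketch α and β would be measured once per benchmark and then reused every time the optimizer runs.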

Optimizer Details

• Mathematical Optimization vs. Heuristics
• Mixed-Integer Linear Programming (MILP)
• Can be solved using any general-purpose solver (we use IBM ILOG CPLEX)
• Applicable to a wide variety of scenarios

Objective Function

Minimizes data movement as primary objective and balances load as secondary objective
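
The objective function itself is not preserved in the transcript. As a hedged illustration of the "primary plus weighted secondary" idea, the sketch below scores a candidate placement as the amount of data moved plus a small weight ε times a load-imbalance term. The imbalance measure (maximum minus minimum node load), the function name and the parameters are assumptions; capacity constraints are assumed to be enforced separately.

```python
def objective(current, proposed, sizes, loads, nodes, epsilon):
    """Score a candidate placement; lower is better (assumed form).

    current, proposed: dicts mapping partition -> node
    sizes:  partition -> size in MB (cost of moving it)
    loads:  partition -> estimated CPU load
    nodes:  all nodes available in the next interval
    epsilon: small weight so load balance only breaks ties on data movement
    """
    # Primary term: total size of partitions whose node changes.
    moved = sum(sizes[p] for p, n in proposed.items() if current.get(p) != n)

    # Secondary term: spread between the most and least loaded node.
    node_load = {n: 0.0 for n in nodes}
    for p, n in proposed.items():
        node_load[n] += loads[p]
    imbalance = max(node_load.values()) - min(node_load.values())

    return moved + epsilon * imbalance


if __name__ == "__main__":
    current = {"P1": "N1", "P2": "N1", "P3": "N1", "P4": "N1"}
    sizes = {p: 100 for p in current}
    loads = {"P1": 0.4, "P2": 0.3, "P3": 0.2, "P4": 0.1}
    tail = {"P1": "N1", "P2": "N1", "P3": "N2", "P4": "N2"}
    balanced = {"P1": "N2", "P2": "N1", "P3": "N1", "P4": "N2"}
    for name, plan in (("tail", tail), ("balanced", balanced)):
        print(name, objective(current, plan, sizes, loads, ["N1", "N2"], 1.0))
```

Both candidate plans move the same amount of data, so the ε-weighted term is what makes the balanced plan score lower.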

Effect of ε

Minimizing Resources Used

• Calculate the minimum number of nodes that can handle the load of all the partitions
  – Non-integer assignment
• Explicitly tell optimizer how many nodes to use
• If optimizer can’t find a solution with the minimum number of nodes N, it tries again with N + 1 nodes
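
The retry loop described above can be sketched as follows. The lower bound treats load as divisible (the non-integer assignment), and the solver is then asked for a placement on exactly that many nodes, adding one node after each failure. The exhaustive search here is only a stand-in for the MILP solver, and all names are hypothetical.

```python
import math
from itertools import product


def min_nodes_lower_bound(loads, node_capacity):
    """Fewest nodes that could carry the total load if it were divisible."""
    return max(1, math.ceil(sum(loads) / node_capacity))


def exhaustive_solve(loads, node_capacity, n):
    """Stand-in for the real optimizer: try every placement on n nodes.

    Returns a list giving the node index for each partition, or None.
    Only practical for tiny instances.
    """
    for choice in product(range(n), repeat=len(loads)):
        used = [0.0] * n
        for load, node in zip(loads, choice):
            used[node] += load
        if all(u <= node_capacity for u in used):
            return list(choice)
    return None


def fewest_nodes(loads, node_capacity, max_nodes, solve=exhaustive_solve):
    """Start at the lower bound and add one node after each failed attempt."""
    n = min_nodes_lower_bound(loads, node_capacity)
    while n <= max_nodes:
        placement = solve(loads, node_capacity, n)
        if placement is not None:
            return n, placement
        n += 1
    return None


if __name__ == "__main__":
    # Total load 1.8 suggests 2 nodes, but no two partitions fit together.
    print(fewest_nodes([0.6, 0.6, 0.6], node_capacity=1.0, max_nodes=4))
```

The example shows why the retry is needed: the divisible lower bound is 2 nodes, yet whole partitions only fit on 3.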

Constraints

• Replication: Replicas of a given partition must be assigned to different nodes

• CPU Capacity: Sum of the load of partitions must be less than capacity of node

• Memory Capacity: All the partitions assigned to a node must fit in its memory

• Host-Level Tasks: The overhead of host-level tasks must not exceed capacity of single core
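
As a rough illustration, the four constraints can be expressed as a feasibility check on a candidate placement. The sketch assumes a partition's CPU load is its rate divided by α and that host-level overhead is β times the node's total request rate; these load formulas and every name below are assumptions, not the thesis's MILP formulation.

```python
from collections import defaultdict


def check_placement(assignment, rate, size_mb, alpha, beta, cores, memory_mb):
    """Return a list of constraint violations (empty if feasible).

    assignment: dict mapping (partition, replica_index) -> node
    rate:       partition -> request rate in txn/s
    size_mb:    partition -> size in MB
    """
    problems = []
    hosted = defaultdict(list)        # node -> partitions placed on it
    nodes_used = defaultdict(set)     # partition -> nodes holding a replica

    for (partition, _replica), node in assignment.items():
        # Replication: replicas of a partition must sit on different nodes.
        if node in nodes_used[partition]:
            problems.append(f"replication: {partition} twice on {node}")
        nodes_used[partition].add(node)
        hosted[node].append(partition)

    for node, parts in hosted.items():
        total_rate = sum(rate[p] for p in parts)
        # CPU capacity: summed partition load must stay below the node's cores.
        if sum(rate[p] / alpha for p in parts) >= cores:
            problems.append(f"cpu: {node} overloaded")
        # Memory capacity: all hosted partitions must fit in memory.
        if sum(size_mb[p] for p in parts) > memory_mb:
            problems.append(f"memory: {node} over capacity")
        # Host-level tasks: overhead must not exceed a single core.
        if beta * total_rate > 1.0:
            problems.append(f"host: {node} host-level tasks exceed one core")
    return problems


if __name__ == "__main__":
    placement = {("P1", 0): "N1", ("P1", 1): "N2",
                 ("P2", 0): "N1", ("P2", 1): "N2"}
    print(check_placement(placement, {"P1": 1000, "P2": 500},
                          {"P1": 800, "P2": 600}, alpha=2000, beta=0.0001,
                          cores=4, memory_mb=2048))
```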

Staggering Scale In

• Fluctuating workload can result in excessive data movement
• Staggering scale in mitigates this problem
• Delay scaling in by s time steps
• Slightly higher resources used to provide stability
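
One way to read "delay scaling in by s time steps" is as a sliding-window maximum over the node counts the optimizer asks for: scale out takes effect immediately, while a node is released only once demand has stayed lower for s further steps. The sketch below implements that reading; it is an interpretation for illustration, not necessarily how Elasca implements staggering.

```python
def staggered_node_counts(required, s):
    """Node counts actually used when scale in is delayed by s steps.

    required: node count requested by the optimizer at each time step
    s:        number of time steps to wait before scaling in
    """
    used = []
    for t in range(len(required)):
        window = required[max(0, t - s): t + 1]
        used.append(max(window))  # never drop below any recent demand
    return used


def count_changes(counts):
    """Number of time steps at which the cluster size changes."""
    return sum(1 for a, b in zip(counts, counts[1:]) if a != b)


if __name__ == "__main__":
    demand = [2, 3, 2, 3, 2, 2, 2]
    smooth = staggered_node_counts(demand, s=2)
    print(smooth, count_changes(demand), count_changes(smooth))
```

For the fluctuating demand in the example, the cluster size changes four times without staggering and twice with s = 2, at the cost of holding a third node for a few extra intervals.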

EXPERIMENTAL EVALUATION

Optimizers Evaluated

• ELASCA: Our workload-aware optimizer
• ELASCA-S: ELASCA with staggered scale in
• OFFLINE: Offline optimizer that minimizes resources used and data movement
• GREEDY: A greedy first-fit optimizer
• SCO: Static, fully provisioned system (no optimization)
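
For readers unfamiliar with first-fit, the sketch below shows the general technique behind a baseline like GREEDY: each partition goes to the first node with enough spare capacity, and a new node is opened only when none fits. It considers CPU load only, and the names and details are assumptions; the thesis's GREEDY implementation may differ.

```python
def first_fit(loads, node_capacity, max_nodes):
    """Place partitions one by one on the first node with room.

    loads: dict mapping partition -> load (same unit as node_capacity)
    Returns a dict partition -> node index, or None if it does not fit.
    """
    used = []        # current load of each opened node
    placement = {}
    for partition, load in loads.items():
        if load > node_capacity:
            return None  # cannot fit on any node
        for node, total in enumerate(used):
            if total + load <= node_capacity:
                used[node] += load
                placement[partition] = node
                break
        else:
            # No opened node has room: open a new one if allowed.
            if len(used) == max_nodes:
                return None
            used.append(load)
            placement[partition] = len(used) - 1
    return placement


if __name__ == "__main__":
    # Loads expressed as percent of one node's capacity.
    print(first_fit({"P1": 50, "P2": 70, "P3": 40, "P4": 30}, 100, 4))
```

First-fit ignores where partitions currently live, so it can move far more data than a placement that treats data movement as an objective.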

Benchmarks Used

• TPC-C: Modified to make it cleanly partitioned and fit in memory (3.6 GB)

• TATP: Telecommunication Application Transaction Processing Benchmark (250 MB)

• YCSB: Yahoo! Cloud Serving Benchmark with 50/50 read/write ratio (1 GB)

Dynamic Workloads

• Varying the aggregate request rate
  – Periodic waveforms
    • Sine, Triangle, Sawtooth
• Skewing the data access
  – Temporal skew
  – Statistical distributions
    • Uniform, Normal, Categorical, Zipfian
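
A workload generator along these lines could combine a periodic waveform for the aggregate request rate with a skewed split across partitions. The sketch below normalizes each waveform to the range 0 to 1 and treats f as the number of cycles per experiment; that reading of f, the Zipfian exponent and all names are assumptions for illustration.

```python
import math


def sine_wave(t, period, f=1):
    """Sine waveform scaled to [0, 1] with f cycles per period."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * f * t / period))


def triangle_wave(t, period, f=1):
    """Triangle waveform: rises linearly to 1 mid-cycle, then falls to 0."""
    phase = (f * t / period) % 1.0
    return 2.0 * phase if phase < 0.5 else 2.0 * (1.0 - phase)


def sawtooth_wave(t, period, f=1):
    """Sawtooth waveform: rises linearly from 0, then drops at cycle end."""
    return (f * t / period) % 1.0


def zipf_shares(n, s=1.0):
    """Share of requests per partition under a Zipfian skew with exponent s."""
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def partition_rates(t, period, wave, min_rate, max_rate, shares, f=1):
    """Per-partition request rates at time t."""
    aggregate = min_rate + (max_rate - min_rate) * wave(t, period, f)
    return [aggregate * share for share in shares]


if __name__ == "__main__":
    shares = zipf_shares(8)
    for minute in (0, 15, 30, 45):
        rates = partition_rates(minute, 60, triangle_wave, 1000, 5000, shares)
        print(minute, [round(r) for r in rates])
```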

Temporal Skew

[Figure: bar-chart panels of Load per partition (P1 to P8) at t = 1, t = 2, t = 3 and t = 4, followed by a panel labelled t = 1 again.]

Experimental Setup

• Each experiment run for 1 hour
• 15 time intervals
  – Optimizer run every four minutes
• Combination of simulation and actual runs
  – Exact numbers for data movement, resources used and load balance through simulation
• Cluster has 4 nodes, 2 separate client machines

Data Movement (TPC-C): Triangle Wave (f = 1)

Data Movement (TPC-C): Triangle Wave (f = 1), Zipfian Skew

Data Movement (TPC-C): Triangle Wave (f = 4)

Computing Resources Saved (TPC-C): Triangle Wave (f = 1)

Load Balance (TPC-C): Triangle Wave (f = 1)

Database Throughput (TPC-C): Sine Wave (f = 2)

Database Throughput (TPC-C): Sine Wave (f = 2), Normal Skew

Database Throughput (TATP): Sine Wave (f = 2)

Database Throughput (YCSB): Sine Wave (f = 2)

Database Throughput (TPC-C): Triangle Wave (f = 4)

Optimizer Scalability

SUPPORTING MULTI-PARTITION TRANSACTIONS

Factors Affecting Performance

• Maximum MPT Throughput (η): The maximum number of transactions an execution site can coordinate per second

• Probability of MPTs (p_mpt): Percentage of transactions that are MPTs

• Partitions Involved in MPTs: The number of partitions involved in MPTs

Changes to Model

CPU load generated by each partition is equal to the sum of:

1. Load due to transaction work (same as SPTs)
2. Load due to coordinating MPTs
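
The two-term sum above can be sketched as follows. The exact expressions are not preserved in the transcript, so this assumes that transaction work costs rate divided by α (as for single-partition transactions) and that coordination costs the MPT rate divided by η, with p_mpt given as a fraction between 0 and 1. The formula and names are assumptions for illustration.

```python
def partition_load_with_mpt(rate, alpha, eta, p_mpt):
    """Estimated CPU load (fraction of one core) of a partition with MPTs.

    rate:  transactions per second arriving at the partition
    alpha: maximum transactions per second a partition can execute
    eta:   maximum MPTs per second an execution site can coordinate
    p_mpt: fraction of transactions that are multi-partition (0 to 1)
    """
    work = rate / alpha                    # same as the SPT-only model
    coordination = (rate * p_mpt) / eta    # assumed linear coordination cost
    return work + coordination


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.2):
        print(p, partition_load_with_mpt(1000, alpha=2000, eta=500, p_mpt=p))
```

With p_mpt set to 0 the estimate reduces to the single-partition model, which is the property the slide's "same as SPTs" note suggests.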

Maximum MPT Throughput

Probability of MPTs

Effect on Resources Saved

Effect on Data Movement

CONCLUSION

Related Work

• Data replication and partitioning
• Database consolidation
• Live database migration
• Key-value stores
• Data placement

Elasca = Elastic Scale-Out Mechanism + Partition Placement & Migration Optimizer

Conclusion

• Elasca = Mechanism + Optimizer
• Workload-Aware Optimizer
  – Meets performance demands
  – Minimizes computing resources used
  – Minimizes data movement
  – Effectively balances load
• Scalable to large problem sizes for online setting

Future Work

• Migrating to VoltDB 3.0
  – Intelligent client routing, master/slave partitions
• Supporting multi-partition transactions
• Automated parameter tuning
• Transaction mixes
• Workload prediction

Thank You

Questions?