MY DATABASE SYSTEM IS THE ONLY THING I CAN

Post on 03-Jan-2016

24 views 1 download

description

MY DATABASE SYSTEM IS THE ONLY THING I CAN. TRUST. @ ANDY_PAVLO. Thirty Years Ago…. I NTERACTIVE T ransactions. S mall # of CPU C ores. S mall M emory S izes. Voter Setup. TP C-C BENCHMARK. Warehouse Order Processing. Application. Transaction:. NewOrder Transaction. - PowerPoint PPT Presentation

Transcript of MY DATABASE SYSTEM IS THE ONLY THING I CAN

MY DATABASE SYSTEM ISTHE ONLY

THING I CANTRUST @ANDY_PAVLO

Thirty Years Ago…

2

SMALL # OF CPU CORESSMALL MEMORY SIZES

INTERACTIVE TRANSACTIONS

Voter Setup

1.Check item stock level.

2.Create new order information.

3.Update item stock levels.

NewOrder Transaction

TPC-C BENCHMARKWarehouse Order Processing

4

APPLICATION

1 2 3 4 5 6 7 8 9 10 11 120

5,000

10,000

15,000

20,000

Voter MySQL/Postgres

5

TPC-C BENCHMARKWarehouse Order Processing

MySQL

Postgres

TXN/SEC

CPU CORES

Buffer PoolLockingRecoveryReal Work

28%

30%

30% 12%

CPU OverheadTRADITIONAL

DBMSMeasured CPU Cycles

OLTP THROUGH THE LOOKING GLASS, AND WHAT WE FOUND THERESIGMOD, pp. 981-992, 2008. 6

NOSQL

SHARDING MIDDLEWARE

HARDWARE UPGRADEREPLICATIONDISTRIBUTED CACHE

HOW TO SCALE UP WITHOUT GIVING UP TRANSACTI

ONS?

H-STORE: A HIGH-PERFORMANCE, DISTRIBUTEDMAIN MEMORY TRANSACTION PROCESSING SYSTEMProc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Distributed Main MemoryTransaction Processing System

DISK ORIENTED

CONCURRENT EXECUTION

HEAVYWEIGHT RECOVERY

xi/

MAIN MEMORY STORAGE

SERIAL EXECUTION

COMPACT LOGGING

Transaction

Execution

Applic

ati

on

PARTITIONS

SINGLE-THREADEDEXECUTION ENGINES

Transaction

Result

CMD LOGSNAPSHOTS

H-Store Architecture DiagramProcedure

NameInput

Parameters

run(phoneNum, contestantId, currentTime) {result = execute(VoteCount,

phoneNum);if (result > MAX_VOTES) {

return (ERROR);}execute(InsertVote, phoneNum,

contestantId, currentTime);

return (SUCCESS);}

VoteCount:SELECT COUNT(*) FROM votes WHERE phone_num = ?;

InsertVote:INSERT INTO votes VALUES (?, ?, ?);

STORED PROCEDURE

11

1 2 3 4 5 6 7 8 9 10 11 120

5,000

10,000

15,000

20,000

Voter MySQL/Postgres

12

TPC-C BENCHMARKWarehouse Order Processing

MySQL

Postgres H-Store

40x

TXN/SEC

CPU CORES

DISTRIBUTED

TRANSACTIONS

14

1 2 3 40

10,000

20,000

30,000

40,000

8 Cores per Node10% Distributed Transactions

Distributed TPC-CTPC-C

BENCHMARKH-Store

TXN/SEC

NODES

Query CountP1P2P3P4

Distributed Txn Example

Applic

ati

on

15

DISTRIBUTED TRANSACTION

S

KNOW WHAT TRANSACTIONS

WILL DO BEFORE THEY

START

BUT PEOPLE ALWAYS GIVE ME

BAD ADVICE

DON’T GET INVOLVED WITH COMPUTERS.YOU’LL NEVER MAKE ANY MONEY.

DON’T GET A PHD.EVERYONE WILL THINK YOU ARE A JERK.

THE DATABASE SYSTEM

ALWAYS HAS MOREINFOR

MATION

DO USE MACHINE

LEARNING TO PREDICT

TRANSACTION BEHAVIOR.ON PREDICTIVE MODELING FOR OPTIMIZING

TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMSProc. VLDB Endow., Vol 5, Iss. 2, pp. 85-96, 2011

Houdini OverviewPREDICTIVE

MODELS

22

Model Genera

tor

Feature2 Feature2

Feature1

Classifier

Feature

Clusterer

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…);

WORKL

OAD

Decision Tree

______________________________________________________

____________________________________________________________________________________________________________

______________________________________________________

Creating Models

Markov Models

23

Houdini ExampleA

pplic

ati

on

DISTRIBUTED TRANSACTION

S

24

25

1 2 3 40

5,000

10,000

15,000

20,000

25,000

1 2 3 40

15,000

30,000

45,000

60,000

Houdini TPC-C

2xHoudiniNaïve OPTIMAL

8 Cores per Node10% Distributed Transactions

TPC-C BENCHMARK

TXN/SEC

NODES

Houdini Example

Zzzz…

Zzzz…

Zzzz…

Applic

ati

on

DISTRIBUTED TRANSACTION

S

SP1 - Waiting for Query ResultSP2 - Waiting for Query RequestSP3 - Two-Phase Commit

Zzzz…

26

TRANSACTION STALL POINTS

27

BASE PARTITION

REMOTE PARTITION

SP1 - Waiting for Query ResultSP2 - Waiting for Query RequestReal Work

45%

18% 37

%

73% 22

%

5%

SP3 - Two-Phase Commit

DO SOMETHING

USEFULWHEN

STALLED

DON’T BE SURPRISED IF YOU & KB DON’T LAST THROUGHGRAD SCHOOL.

DON’T BE STAN’S STUDENTIF YOU GO TOBROWN.

DO USE MACHINE

LEARNING TO SCHEDULE

SPECULATIVE TASKS.THE ART OF SPECULATIVE EXECUTION

In Progress (August 2013)

Serializable Schedule

32

Distributed Transaction

Speculative Transaction

Speculative Transaction

Zzzz… C

C

C

C

C

VERIFY

Single-Partition Transaction

Single-Partition Transaction

SERIALIZABLE SCHEDULE

SPECULATIVE TRANSACTIONS

Transaction Queue

Distributed Transaction:

Speculation Candidate:

Hermes OverviewZzzz…

READ X READ X

WRITE X

33

SPECULATIVE QUERIES

Distributed Transaction:

Hermes Overview

34

SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

QueryY:

SELECT * FROM WAREHOUSE WHERE W_ID = ?

w_id=0i_w_ids=[1,0] i_ids=[1001,1002]

GetWarehouse:

Transaction Parameters:

SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

CheckStock:

Mapping: Example

35

Distributed Transaction

Speculative Transactions

Hermes Overview

Query1Query2Query3

Query1Query3Query3

Query1Query2Query3

Query3

VERIFICATION

36

1 2 3 40

10,000

20,000

30,000

40,000

50,000Spec QueriesNone

Hermes TPC-C

All

37

8 Cores per Node10% Distributed Transactions

TPC-C BENCHMARK

Spec Txns

TXN/SEC

NODES

Optimize Single-Partition ExecutionH-STORE: A HIGH-PERFORMANCE, DISTRIBUTED MAIN MEMORY TRANSACTION PROCESSING SYSTEM Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Minimize Distributed TransactionsSKEW-AWARE AUTOMATIC DATABASE PARTITIONING IN SHARED-NOTHING, PARALLEL OLTP SYSTEMSProceedings of SIGMOD, 2012.

Identify Distributed Transactions ON PREDICTIVE MODELING FOR OPTIMIZING TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMSProc. VLDB Endow., vol. 5, pp. 85-96, 2011.

Utilize Transaction StallsTHE ART OF SPECULATIVE EXECUTIONIn Progress (August 2013)

FUTUREWORK

One SizeAlmostFits All

41

H-STORE

N-STORE

NS-STORE

One SizeAlmostFits All™

43

H-STORE

N-STORE

NS-STORE

Escape From Planet

Zdonik(i.e., Andy Needs to Get Tenure)

Beyond the ‘Stores• Non-Partitionable

Workloads.

• The Poor Man’s Spanner.

• Scientific Databases.

DON’T MESS IT UPWITH KB.

StanZdonik

“The Thrill”Stonebraker

SamMadden

UgurCetintemel

DavidDeWitt

DanAbadi

JustinDeBrabant

SauryaVelagapudi

JohnMeehan

XinJia

CarloCurino

EvanJones

YangZou

NingShi

VisaweeAngkana.