H-Store: A Specialized Architecture for High-Throughput OLTP Applications
Evan Jones (MIT), Andrew Pavlo (Brown)
13th Intl. Workshop on High Performance Transaction Systems, October 26, 2009
Intel Xeon E5540 (Nehalem/Core i7)
Source: Intel 64 and IA-32 Architectures Optimization Reference Manual
Distributed Clusters
Scaling OLTP on Multi-Core?
Use a distributed shared-nothing design
How to Make a Faster OLTP DBMS
Main memory storage
Replication for durability
Explicitly partition the data
Specialized concurrency control:
Stored procedures only
Single partition: execute one transaction at a time
Multiple partitions: supported but slow
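The single-partition execution model above can be sketched in a few lines. This is an illustrative Python sketch, not H-Store's actual implementation (which is written in Java and C++): each partition is owned by exactly one thread that drains a queue of stored-procedure invocations serially, so single-partition transactions need no locks or latches. The `PartitionExecutor` class and its method names are hypothetical.

```python
import queue
import threading

class PartitionExecutor:
    """Sketch of one-thread-per-partition serial execution: transactions
    queued against this partition run one at a time, so no locks or
    latches are needed for single-partition work."""

    def __init__(self):
        self.data = {}                      # in-memory partition storage
        self.pending = queue.Queue()        # queued stored-procedure calls
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            proc, args, done = self.pending.get()
            proc(self.data, *args)          # serial execution: one txn at a time
            done.set()                      # signal the caller that the txn committed

    def submit(self, proc, *args):
        """Enqueue a stored-procedure invocation; returns an Event that is
        set once the transaction has executed."""
        done = threading.Event()
        self.pending.put((proc, args, done))
        return done
```

Multi-partition transactions would require coordinating several such executors, which is why the slides note they are "supported but slow."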
OLTP: Where does the time go?
Source: S. Harizopoulos, D. J. Abadi, S. Madden, M. Stonebraker, "OLTP Under the Looking Glass," SIGMOD 2008.
Users Rely on Partitioning
Source: R. Shoup, D. Pritchett, “The eBay Architecture,” SD Forum, Nov. 2006.
What about multi-core?
Traditional approach: one database process, a thread per connection, shared memory with locks and latches.
H-Store approach: a thread per partition, with distributed transactions across partitions.
Example Microbenchmark
One table per client:
Table(id INTEGER, counter INTEGER)
Each client executes the following query:
UPDATE Table SET counter = counter + 1 WHERE id = 0;
Add clients to find maximum throughput. Data on RAM disk.
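The per-client workload above can be sketched as follows. This is a stand-in sketch, not the actual benchmark harness: it uses an in-memory SQLite database in place of an H-Store partition (the real experiment ran against H-Store with data on a RAM disk), and the table is renamed `CounterTable` because `Table` is a reserved word in SQLite.

```python
import sqlite3

def run_client(iterations=1000):
    """Simulate one microbenchmark client: repeatedly increment a single
    counter row, as in the slide's UPDATE statement."""
    conn = sqlite3.connect(":memory:")  # stand-in for one partition's storage
    conn.execute("CREATE TABLE CounterTable (id INTEGER PRIMARY KEY, counter INTEGER)")
    conn.execute("INSERT INTO CounterTable VALUES (0, 0)")
    for _ in range(iterations):
        conn.execute("UPDATE CounterTable SET counter = counter + 1 WHERE id = 0")
    (value,) = conn.execute("SELECT counter FROM CounterTable WHERE id = 0").fetchone()
    return value
```

Because each client has its own table, every transaction is single-partition by construction, which isolates the cost of threads versus partitions.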
Experimental Configuration
Partitions versus Threads
[Figure: throughput as the number of threads or partitions increases, compared against ideal linear scaling.]
Scalability Analysis
Partitions scale better than threads.
Threads: contention for shared resources [1]
Partitions: memory bottleneck causes sublinear scaling
H-Store: Not just for distributed shared-nothing clusters
[1] R. Johnson et al., "Shore-MT: A Scalable Storage Manager for the Multicore Era," EDBT 2009.
Multi-core Design Problem
How to automatically create a data placement scheme to improve multi-core throughput?
Data Partitioning: Maximize the number of single-partition transactions.
Data Placement: Maximize the number of single-node transactions.
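The partitioning objective above can be made concrete with a small sketch. This is an illustrative Python model, not the actual H-Store design tool: given a hash partitioning on a key (e.g. TPC-C's warehouse id) and a workload described as the set of key values each transaction touches, it computes the fraction of single-partition transactions, the quantity the data-partitioning step tries to maximize. The function names are hypothetical.

```python
def partition_of(key, num_partitions):
    """Hash partitioning on the chosen partitioning key."""
    return hash(key) % num_partitions

def single_partition_fraction(workload, num_partitions):
    """workload: a list of transactions, each given as the list of
    partitioning-key values it accesses. Returns the fraction of
    transactions that touch exactly one partition."""
    single = sum(
        1 for txn in workload
        if len({partition_of(k, num_partitions) for k in txn}) == 1
    )
    return single / len(workload)
```

The data-placement step optimizes the analogous node-level quantity: transactions whose partitions all land on one machine avoid distributed coordination even if they span multiple partitions.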
Database Partitioning
Select partitioning keys and construct a schema tree.
[Figure: TPC-C Schema and derived Schema Tree. WAREHOUSE is the root; DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, and STOCK descend from it and are partitioned by the warehouse key; the read-only ITEM table is replicated.]
Database Partitioning
Combine table fragments into partitions.
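The fragment-combining step can be sketched as follows. This is an illustrative Python sketch under the TPC-C schema-tree assumption described above, not H-Store's actual partitioner: every table rooted under WAREHOUSE is co-located with its warehouse, so assigning a warehouse id to a partition places all of its dependent fragments there, while the read-only ITEM table is copied into every partition. The function and the dictionary layout are hypothetical.

```python
def build_partitions(warehouse_ids, num_partitions):
    """Group TPC-C table fragments into partitions: fragments of DISTRICT,
    CUSTOMER, ORDERS, ORDER_ITEM, and STOCK follow their warehouse, and
    the ITEM table is replicated into every partition."""
    partitions = {
        p: {"warehouses": [], "replicated": ["ITEM"]}
        for p in range(num_partitions)
    }
    for w in warehouse_ids:
        # All fragments rooted under warehouse w land in the same partition.
        partitions[w % num_partitions]["warehouses"].append(w)
    return partitions
```

Co-locating a warehouse's whole subtree is what makes most TPC-C transactions single-partition, since they typically touch only one warehouse's data.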
[Figure: schema-tree fragments of DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, and STOCK grouped into partitions P1–P5, with the replicated ITEM table copied into every partition.]
Partition Affinity
[Figure: partitions mapped to hyperthreads HT1 and HT2 on cores Core1 and Core2.]
Data Placement
Assign partitions to cores on each node.
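The placement step above can be sketched as a simple slot assignment. This is an illustrative Python sketch, not H-Store's actual placement algorithm: it enumerates (node, core) slots, fills one node's cores before moving to the next, and pins each partition to a slot; when there are more partitions than cores it wraps around, which a real deployment would avoid under the thread-per-partition model. The function name and layout are hypothetical.

```python
import itertools

def place_partitions(partition_ids, nodes, cores_per_node):
    """Pin each partition to a (node, core) slot, filling each node's
    cores in order so that partitions with affinity share a machine."""
    slots = [(node, core) for node in nodes for core in range(cores_per_node)]
    # cycle() wraps around if there are more partitions than slots.
    return {p: slot for p, slot in zip(partition_ids, itertools.cycle(slots))}
```

With one partition per core, each partition's executor thread runs without contending for shared resources, matching the scalability result shown earlier.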
[Figure: partitions P1–P5, each with a replicated copy of ITEM, assigned to cores on cluster nodes Node 1 through Node n.]
H-Store’s Future
New name. New company.
Six full-time developers.
Open-source project (GPL).
Beta by end of 2009; multiple deployments in financial services.
More Information
H-Store info + papers: http://db.cs.yale.edu/hstore/
VoltDB Project Information: http://www.voltdb.org/