Acunu & OCaml: Experience Report, CUFP

Tom Wilkie, Founder & VP Engineering, [email protected], @tom_wilkie


Page 1: Acunu & OCaml: Experience Report, CUFP

Tom Wilkie

Founder & VP Engineering

[email protected]

@tom_wilkie

Acunu & OCaml: Experience Report

Page 2: Acunu & OCaml: Experience Report, CUFP

What do we do?

1990

[Diagram: small databases with BTree indexes, on BTree file systems, RAID and old hardware]

Page 3: Acunu & OCaml: Experience Report, CUFP

What do we do?

2010

[Diagram: distributed, shared-nothing databases spanning several machines; each machine runs write-optimised indexes on BTree file systems, RAID and new hardware]

Page 4: Acunu & OCaml: Experience Report, CUFP

What do we do?

2011

[Diagram: distributed, shared-nothing databases spanning several machines; each machine runs Castle on new hardware]

Page 5: Acunu & OCaml: Experience Report, CUFP
Page 6: Acunu & OCaml: Experience Report, CUFP

What does this have to do with

Functional Programming?

Page 7: Acunu & OCaml: Experience Report, CUFP

Big Data Applications

Cross-Cluster Management UI

Amazon S3 compatible

...

Acunu Storage Core

Open API

Management

Deployment

Monitoring

...

Languages: Java, Erlang, C, OCaml, Python, Bash, Perl

Page 8: Acunu & OCaml: Experience Report, CUFP

Management Stack

Page 9: Acunu & OCaml: Experience Report, CUFP

[Diagram: management stack]

Clients: HTML5/JavaScript User Interface; Autogenerated OCaml CLI; External Monitoring Tools (Munin etc)

Routerd: enumeration, routing, clustering

Another Routerd on a different machine

Daemons: Miscd, DFSd, Cassandrad, Clusterd, S3d, Statsd

Other labels in the diagram: Alerts, Version, Collection, Disk, NamedObjects, Base, Castle, Keyspace, ColumnFamily, Cassandra, Host, Group, Service, Cassandra_Node, BigS3, S3_Node, Bucket, Filesystem, Report, Stat, Source, Default_Report, Alert_Rule, Alert

Page 10: Acunu & OCaml: Experience Report, CUFP

[Diagram: the management stack from Page 9, with the added label below]

Bridges to other systems

Page 11: Acunu & OCaml: Experience Report, CUFP

[Diagram: the management stack from Page 9, with the added labels below]

Clustering

Failure Detection

Monitoring

Alerting

Page 12: Acunu & OCaml: Experience Report, CUFP

[Diagram: the management stack from Page 9, with the added label below]

Routing & Aggregation

Page 13: Acunu & OCaml: Experience Report, CUFP

Successes / Failures

Page 14: Acunu & OCaml: Experience Report, CUFP

Prototype “Filesystem”

Page 15: Acunu & OCaml: Experience Report, CUFP

• CoW BTrees

• Mod List BTrees

• LSM Trees

• Doubling Arrays

• Fractional Cascading

• Stratified DAs

• Multidimensional keys

• Z curve packing

Aim: Investigate algorithms for KV storage
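One common reading of "Z curve packing" for multidimensional keys is Z-order (Morton) encoding, which interleaves the bits of each coordinate so that a multidimensional key becomes a single sortable integer. The sketch below assumes two non-negative integer coordinates; the function name `interleave_bits` and the 16-bit width are illustrative choices, not taken from the talk, and the prototype may have packed keys differently.

```python
def interleave_bits(x, y, bits=16):
    """Morton (Z-order) code: bit i of x lands at bit 2*i, bit i of y at bit 2*i + 1."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


# Sorting 2D keys by their Morton code visits the grid in a Z-shaped pattern,
# so points that are close in both dimensions tend to be close in key order.
points = [(x, y) for y in range(2) for x in range(2)]
ordered = sorted(points, key=lambda p: interleave_bits(*p))
print(ordered)  # prints [(0, 0), (1, 0), (0, 1), (1, 1)]
```

With keys encoded this way, any one-dimensional sorted index (such as the structures listed above) can store them without changes.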

Page 16: Acunu & OCaml: Experience Report, CUFP

Doubling Array

Insert 2, then insert 9: the two keys merge into the sorted array [2 9]

Page 17: Acunu & OCaml: Experience Report, CUFP

Doubling Array

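Inserts

Insert 11: it sits alone in the first level, alongside [2 9]

Insert 8: 8 and 11 merge into [8 11], which then merges with [2 9] into [2 8 9 11]

etc...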

Similar to log-structured merge trees (LSM), cache-oblivious lookahead array (COLA), ...
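The insert pattern on these two slides can be sketched in a few lines. This is a toy, in-memory illustration only, assuming level k holds either nothing or a sorted run of 2^k keys; the class name `DoublingArray` and its methods are hypothetical, and it does no disk IO, duplicate handling, or deletion, all of which a real implementation such as Castle would need.

```python
from bisect import bisect_left
from heapq import merge


class DoublingArray:
    """Toy doubling array: levels[k] is empty or a sorted run of 2**k keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        run = [key]  # a new key starts as a run of length 1
        k = 0
        # While the target level is occupied, merge with it and carry the
        # doubled run upward, much like carrying a bit in binary addition.
        while k < len(self.levels) and self.levels[k]:
            run = list(merge(self.levels[k], run))
            self.levels[k] = []
            k += 1
        if k == len(self.levels):
            self.levels.append([])
        self.levels[k] = run

    def contains(self, key):
        # Every level is sorted, so binary-search each non-empty one.
        for run in self.levels:
            i = bisect_left(run, key)
            if i < len(run) and run[i] == key:
                return True
        return False


da = DoublingArray()
for key in (2, 9, 11, 8):  # the keys from the slides, in order
    da.insert(key)
print(da.levels)  # prints [[], [], [2, 8, 9, 11]]
```

Each merge reads and writes whole sorted runs, which is why the on-disk version can use sequential IO instead of a seek per update.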

Page 19: Acunu & OCaml: Experience Report, CUFP

B = “block size”, say 8KB at 100 bytes/entry ~= 100 entries

                        Update                         Range Query (Size Z)

Log Structured B-Tree   O(log_B N) random IOs          O(Z/B) random IOs

Doubling Array          O((log N)/B) sequential IOs    O(Z/B) sequential IOs

B-Tree: ~ log(2^30)/log 100 = 5 IOs/update

B-Tree: 8KB @ 100MB/s, w/ 8ms seek = 100 IOs/s

B-Tree: 100 / 5 = 20 updates/s

Doubling Array: ~ log(2^30)/100 = 0.2 IOs/update

Doubling Array: 8KB @ 100MB/s = 13k IOs/s

Doubling Array: 13k / 0.2 = 65k updates/s
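The back-of-envelope figures on this slide can be reproduced as follows. This is a rough sketch under stated assumptions: N = 2^30 entries, B = 100 entries per block, natural logarithms (which is what makes the slide's 0.2 IOs/update come out), and decimal KB/MB. All variable names are illustrative. The unrounded results differ somewhat from the slide's rounded ones, but the gap of several thousand times between the two structures is the same.

```python
import math

N = 2 ** 30                      # entries in the index (from the slide)
B = 100                          # entries per block (slide's rounding of 8KB / 100 bytes)
BLOCK_BYTES = 8 * 1000           # 8KB block, decimal units assumed
BANDWIDTH = 100 * 1000 * 1000    # 100MB/s sequential throughput
SEEK_S = 0.008                   # 8ms seek per random IO

# IOs per update for each structure.
btree_ios_per_update = math.log(N) / math.log(B)   # log_B N, random IOs
da_ios_per_update = math.log(N) / B                # (log N)/B, sequential IOs

# IOs per second the disk can deliver.
transfer_s = BLOCK_BYTES / BANDWIDTH
sequential_iops = 1 / transfer_s                   # no seek between blocks
random_iops = 1 / (SEEK_S + transfer_s)            # one seek per block

btree_updates_per_s = random_iops / btree_ios_per_update
da_updates_per_s = sequential_iops / da_ios_per_update

print(f"B-Tree:         {btree_ios_per_update:.1f} IOs/update, "
      f"{random_iops:.0f} IOs/s, {btree_updates_per_s:.0f} updates/s")
print(f"Doubling Array: {da_ios_per_update:.2f} IOs/update, "
      f"{sequential_iops:.0f} IOs/s, {da_updates_per_s:.0f} updates/s")
```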

Page 20: Acunu & OCaml: Experience Report, CUFP

BTree Disk Trace

[Plot: block index against time (s)]

Page 21: Acunu & OCaml: Experience Report, CUFP

Doubling Array Disk Trace

[Plot: block index against time (secs)]

Page 22: Acunu & OCaml: Experience Report, CUFP

OCaml Prototype Performance

[Plot: insertion rate (kvps/s) against # inserted kvps]

Page 23: Acunu & OCaml: Experience Report, CUFP

The Dark Side...

Page 24: Acunu & OCaml: Experience Report, CUFP

Java Prototype Performance

[Plot: insert rate (keys/s) against time (s)]

Page 25: Acunu & OCaml: Experience Report, CUFP

What about Castle?

Page 26: Acunu & OCaml: Experience Report, CUFP

Castle Performance

Page 27: Acunu & OCaml: Experience Report, CUFP
Page 28: Acunu & OCaml: Experience Report, CUFP

One more thing...

Page 29: Acunu & OCaml: Experience Report, CUFP

SNAPSHOTS*

* And clones!

Page 30: Acunu & OCaml: Experience Report, CUFP

I’ll explain how....

http://bit.ly/rduBia

“Castle: Re-inventing Storage For Big Data”

London, 27th September

Page 32: Acunu & OCaml: Experience Report, CUFP

References

[LSM] The Log-Structured Merge-Tree (LSM-Tree), Patrick O'Neil, Edward Cheng, Dieter Gawlick, Elizabeth O'Neil

http://staff.ustc.edu.cn/~jpq/paper/flash/1996-The%20Log-Structured%20Merge-Tree%20%28LSM-Tree%29.pdf

[COLA] Cache-Oblivious Streaming B-trees, Michael A. Bender et al

http://www.cs.sunysb.edu/~bender/newpub/BenderFaFi07.pdf

[DSST] Making Data Structures Persistent, J. R. Driscoll, N. Sarnak, D. D. Sleator, R. E. Tarjan, Journal of Computer and System Sciences, Vol. 38, No. 1, 1989

http://www.cs.cmu.edu/~sleator/papers/making-data-structures-persistent.pdf

Stratified B-trees and versioned dictionaries, Andy Twigg, Andrew Byde, Grzegorz Miłoś, Tim Moreton, John Wilkes, Tom Wilkie, HotStorage’11

http://www.usenix.org/event/hotstorage11/tech/final_files/Twigg.pdf

[RDA] Random duplicate storage strategies for load balancing in multimedia servers, Joep Aerts, Jan Korst, Sebastian Egner, 2000

http://www.win.tue.nl/~joep/IPL.ps

Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.