ArangoDB

Post on 29-Aug-2014

description

This is our current version of our general presentation about ArangoDB

Transcript of ArangoDB

1

Lucas Dohmen @moonbeamlabs

the multi-purpose NoSQL Database

www.arangodb.org

Lucas Dohmen

‣ ArangoDB Core Team

‣ ArangoDB Foxx & Ruby Adapter

‣ Student on the master branch

‣ Open Source Developer & Podcaster

2

Why did we start ArangoDB?

What should an ideal multi-purpose database look like?

Is it already out there?

‣ Second Generation NoSQL DB

‣ Unique feature set

‣ Solves some problems of other NoSQL DBs

‣ Greenfield project

‣ Experienced team building NoSQL DBs for more than 10 years

3

Main Features

4

‣ Open source and free

‣ Multi model database

‣ Convenient querying

‣ Extendable through JS

‣ High performance & space efficiency

‣ Easy to use

‣ Started in Sep 2011

‣ Current Version: 2.0

Free and Open Source

‣ Apache 2 License

‣ On Github

‣ Do what you want with it

‣ ... and don't pay a dime!

5

Multi model database

6

Key/Value Store Document Store Graph Database

Source: Andrew Carol

Polyglot Persistence

Key-Value Store

‣ Map value data to unique string keys (identifiers)

‣ Treat data as opaque (data has no structure)

‣ Can implement scaling and partitioning easily due to the simple data model

‣ Key-value can be seen as a special case of documents. For many applications this is sufficient, but not for all cases.

ArangoDB

‣ Key-value data is currently stored as key-value documents.

‣ In the near future it will support special key-value collections.

‣ One of the optimizations will be the elimination of JSON in this case, so the value need not be parsed.

‣ Sharding capabilities of key-value collections will differ from those of document collections.
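To make the "special case of documents" point concrete, here is a minimal sketch in plain JavaScript. It is not ArangoDB code: the `KeyValueStore` class and the `toDocument` helper are hypothetical names used only for illustration, and the document layout (`_key` plus a single `value` attribute) is an assumption about how a key-value pair might be wrapped.

```javascript
// Plain JavaScript sketch, not ArangoDB code: all names here are hypothetical.

// A key-value store maps unique string keys to opaque values.
class KeyValueStore {
  constructor() {
    this.data = new Map();
  }
  put(key, value) {
    this.data.set(key, value); // the store never looks inside the value
  }
  get(key) {
    return this.data.get(key);
  }
}

// The same pair expressed as a document: the unique key plus one
// attribute that holds the value.
function toDocument(key, value) {
  return { _key: key, value: value };
}

const kv = new KeyValueStore();
kv.put("session:42", "opaque-payload");

const doc = toDocument("session:42", kv.get("session:42"));
console.log(doc);
```

Seen this way, a key-value pair is a document with exactly one attribute, which is why a document store can serve key-value workloads without a separate component.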

7

Document Store

‣ Normally based on key-value stores (each document still has a unique key)

‣ Allow saving logically similar documents in "collections"

‣ Treat data records as attribute-structured documents (data is no longer opaque)

‣ Often allow querying and indexing document attributes

ArangoDB

‣ It supports both key-value and document storage. A database can contain collections of different types.

‣ For efficient memory handling we have automatic schema recognition.

‣ It has different ways to retrieve data: CRUD via the RESTful interface, query by example, JS for graph traversals, and AQL.

8

Graph Store

‣ Example: Computer Science Bibliography

[Figure: bibliography graph.
Vertices: Type: inproceeding, Title: Finite Size Effects · Type: proceeding, Title: Neural Modeling · Type: person, Name: Anthony C. C. Coolen · Type: person, Name: Sánchez-Andrés.
Edges: Label: written · Label: published, Pages: 99-120 · Label: edited.]

ArangoDB

‣ Supports Property Graphs

‣ Vertices and edges are documents

‣ Query them using geo-index, full-text, SQL-like queries

‣ Edges are directed relations between vertices

‣ Custom traversals and built-in graph algorithms

9
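The bibliography example can be written down as plain documents to show what "vertices and edges are documents" means. This is a sketch in plain JavaScript, not ArangoDB code. The `_id` values are made up, and which edge connects which pair of vertices is an assumption, since the figure itself is not preserved in this transcript.

```javascript
// Plain JavaScript sketch; ids and edge endpoints are assumptions.
const vertices = [
  { _id: "pubs/1", type: "inproceeding", title: "Finite Size Effects" },
  { _id: "pubs/2", type: "proceeding", title: "Neural Modeling" },
  { _id: "persons/1", type: "person", name: "Anthony C. C. Coolen" }
];

// Edges are documents too: directed (_from -> _to) and free to carry
// their own attributes, such as a label or page range.
const edges = [
  { _from: "persons/1", _to: "pubs/1", label: "written" },
  { _from: "pubs/1", _to: "pubs/2", label: "published", pages: "99-120" }
];

// Follow the outgoing edges of one vertex for a single step.
function outboundNeighbors(startId) {
  return edges
    .filter((e) => e._from === startId)
    .map((e) => ({ edge: e, vertex: vertices.find((v) => v._id === e._to) }));
}

const writtenBy = outboundNeighbors("persons/1");
console.log(writtenBy);
```

Because edges are ordinary documents, attributes such as `pages` can be filtered and indexed like any other document attribute.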

NoSQL Map

[Figure: map of NoSQL databases between "Transaction Processing DBs" (managing the evolving state of an IT system) and "Analytic Processing DBs". Labels: Complex Queries, Map/Reduce, Graphs, Extensibility, Key/Value, Column-Stores, Documents, Massively Distributed, Structured Data.]

10

Another NoSQL Map

[Figure: second version of the map, with the same labels.]

11

Convenient querying

Different scenarios require different access methods:

‣ Query a document by its unique id / key:

GET /_api/document/users/12345

‣ Query by providing an example document:

PUT /_api/simple/by-example { "name": "Jan", "age": 38 }

‣ Query via AQL:

FOR user IN users
  FILTER user.active == true
  RETURN { name: user.name }

‣ Graph traversals, and JS for your own traversals

‣ JS Actions for “intelligent” DB requests
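The first three access methods differ mainly in how the matching documents are selected. The sketch below mimics that selection logic in plain JavaScript over an in-memory array. It does not call ArangoDB; the sample data and the function names `byKey`, `byExample` and `activeUserNames` are hypothetical.

```javascript
// Plain JavaScript sketch of the matching logic; the sample data is made up.
const users = [
  { _key: "12345", name: "Jan", age: 38, active: true },
  { _key: "12346", name: "Jan", age: 25, active: false },
  { _key: "12347", name: "Lucas", age: 38, active: true }
];

// Lookup by unique key, as in GET /_api/document/users/12345.
function byKey(collection, key) {
  return collection.find((doc) => doc._key === key) || null;
}

// Query by example: keep the documents whose attributes equal every
// attribute of the example document.
function byExample(collection, example) {
  return collection.filter((doc) =>
    Object.keys(example).every((attr) => doc[attr] === example[attr])
  );
}

// Same selection and projection as the AQL query:
// FOR user IN users FILTER user.active == true RETURN { name: user.name }
function activeUserNames(collection) {
  return collection
    .filter((user) => user.active === true)
    .map((user) => ({ name: user.name }));
}

const jan = byExample(users, { name: "Jan", age: 38 });
const active = activeUserNames(users);
console.log(byKey(users, "12345"), jan, active);
```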

12

Why another query language?

13

‣ Initially, we implemented a subset of SQL's SELECT, but it didn't fit well

‣ UNQL addressed some of the problems, but it looked dead and had no working implementations

‣ XQuery seemed quite powerful, but a bit too complex for simple queries

‣ JSONiq wasn't there when we started

Other Document Stores

‣ MongoDB uses JSON/BSON as its “query language”: limited, and hard to read and write for more complex queries

‣ Complex queries, joins and transactions not possible

‣ CouchDB uses Map/Reduce: it's not a relational algebra and therefore hard to generate, and it's not easy to learn

‣ Complex queries, joins and transactions not possible

14

Why you may want a more expressive query language

15

Source: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/

[Diagram: social network data model with users, posts, comments and likes; relations labelled friends, commenter and liker, with "many" and "one" cardinalities.]

Why you may want a more expressive query language

16

‣ Model it as you would in a SQL database

‣ comments gets a commenter_id – then do a join

Why you may want a more expressive query language

17

‣ Model it as you would in a document store

‣ posts embed comments as an array

Why you may want a more expressive query language

18

‣ Model it as you would in a graph database

‣ users as nodes, friendships as edges
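The three modeling options from these slides can be put side by side. This is a plain JavaScript sketch with made-up data, not ArangoDB code. All collection and attribute names other than `commenter_id` are assumptions chosen for illustration.

```javascript
// Plain JavaScript sketch with made-up data; not ArangoDB code.
const accounts = [{ id: 1, name: "Ann" }, { id: 2, name: "Bob" }];

// 1. Relational style: comments reference their author by id,
//    and the reference is resolved with a join.
const comments = [{ id: 10, postId: 100, commenter_id: 2, text: "Nice!" }];
const joined = comments.map((c) => ({
  text: c.text,
  commenter: accounts.find((a) => a.id === c.commenter_id).name
}));

// 2. Document style: a post embeds its comments as an array.
const post = {
  id: 100,
  title: "Hello",
  comments: [{ commenter_id: 2, text: "Nice!" }]
};

// 3. Graph style: users are nodes, friendships are edges.
const friendships = [{ _from: 1, _to: 2 }];
const friendsOfAnn = friendships
  .filter((f) => f._from === 1)
  .map((f) => accounts.find((a) => a.id === f._to).name);

console.log(joined, post.comments.length, friendsOfAnn);
```

A query language that can express joins, nested documents and edge lookups lets one data set be modeled in whichever of these ways fits each relation best.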

ArangoDB Query Language (AQL)

19

‣ We came up with AQL mid-2012

‣ Declarative language, loosely based on the syntax of XQuery

‣ Uses different keywords than SQL so it's clear that the languages are different

‣ Implemented in C and JavaScript

Example for Aggregation

‣ Retrieve cities with the number of users:

FOR u IN users
  COLLECT city = u.city INTO g
  RETURN { "city" : city, "numUsersInCity": LENGTH(g) }
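For readers who have not seen COLLECT before, the grouping it performs can be spelled out by hand. The following plain JavaScript sketch groups an in-memory array by city and counts the members of each group. The sample data and the name `usersPerCity` are hypothetical, and the output order here is simply first-seen order, which may differ from the order ArangoDB returns.

```javascript
// Plain JavaScript sketch of group-and-count; the sample data is made up.
const cityUsers = [
  { name: "Ann", city: "Cologne" },
  { name: "Bob", city: "Berlin" },
  { name: "Cem", city: "Cologne" }
];

function usersPerCity(collection) {
  // Corresponds to COLLECT city = u.city INTO g: one group per city.
  const groups = new Map();
  for (const u of collection) {
    if (!groups.has(u.city)) {
      groups.set(u.city, []);
    }
    groups.get(u.city).push(u);
  }
  // Corresponds to RETURN { city, numUsersInCity: LENGTH(g) }.
  return Array.from(groups, ([city, g]) => ({
    city: city,
    numUsersInCity: g.length
  }));
}

const perCity = usersPerCity(cityUsers);
console.log(perCity);
```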

20

Example for Graph Query

‣ Paths:

FOR u IN users
  LET userRelations = (
    FOR p IN PATHS( users, relations, "OUTBOUND" )
      FILTER p._from == u._id
      RETURN p
  )
  RETURN { "user" : u, "relations" : userRelations }
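The slide does not spell out what PATHS returns, so the following is only a sketch of the general idea of enumerating outbound paths from a start vertex. It is written in plain JavaScript over a made-up edge list; the function `outboundPaths` and its result format (arrays of vertex ids) are assumptions and not ArangoDB's actual output.

```javascript
// Plain JavaScript sketch; the edge list and result format are made up.
const relations = [
  { _from: "users/ann", _to: "users/bob" },
  { _from: "users/bob", _to: "users/cem" }
];

// Enumerate every outbound path that starts at startId.
// A path is a list of vertex ids. Vertices already on the path are
// skipped, so the walk also terminates on cyclic graphs.
function outboundPaths(startId, edges) {
  const paths = [];
  function walk(path) {
    const last = path[path.length - 1];
    for (const e of edges) {
      if (e._from === last && !path.includes(e._to)) {
        const next = path.concat(e._to);
        paths.push(next);
        walk(next);
      }
    }
  }
  walk([startId]);
  return paths;
}

const annPaths = outboundPaths("users/ann", relations);
console.log(annPaths);
```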

21

Extendable through JS

‣ JavaScript enriches ArangoDB

‣ Multi Collection Transactions

‣ Building small and efficient Apps - Foxx App Framework

‣ Custom graph traversals

‣ Cascading deletes/updates

‣ Assign permissions to actions

‣ Aggregate data from multiple queries into a single response

‣ Carry out data-intensive operations

‣ Help to create efficient push services (in the near future)

22

Action Server - a little Application Server

‣ ArangoDB can answer arbitrary HTTP requests directly

‣ You can write your own JavaScript functions (“actions”) that will be executed server-side

‣ Includes a permission system

➡ You can use it as a database or as a combined database/app server

23

‣ Single Page Web Applications

‣ Native Mobile Applications

‣ External developer APIs

APIs - will become more & more important

24

ArangoDB Foxx

‣ What if you could talk to the database directly?

‣ It would only need an API.

‣ What if we could define this API in JavaScript?

‣ ArangoDB Foxx is streamlined for API creation – not a jack of all trades

‣ It is designed for front end developers: Use JavaScript, which you already know (without running into callback hell)

25

Foxx - Simple Example

26

var Foxx = require("org/arangodb/foxx");

var controller = new Foxx.Controller(appContext);

// Respond to GET /users/<name> with a JSON greeting for that name
controller.get("/users/:name", function(req, res) {
  res.json({ hello: req.params("name") });
});
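The route in this example uses a named path parameter (`:name`). How Foxx resolves such parameters internally is not shown in the slides; the sketch below is a stand-alone illustration of the idea in plain JavaScript. `matchRoute` and `handle` are hypothetical helpers and not part of Foxx.

```javascript
// Hypothetical helper, not part of Foxx: match a path against a pattern
// such as "/users/:name" and collect the named parameters.
function matchRoute(pattern, path) {
  const patternParts = pattern.split("/");
  const pathParts = path.split("/");
  if (patternParts.length !== pathParts.length) {
    return null;
  }
  const params = {};
  for (let i = 0; i < patternParts.length; i++) {
    if (patternParts[i].startsWith(":")) {
      // A ":name" segment captures whatever stands at that position.
      params[patternParts[i].slice(1)] = decodeURIComponent(pathParts[i]);
    } else if (patternParts[i] !== pathParts[i]) {
      return null;
    }
  }
  return params;
}

// Mimics the handler on the slide: answer with { hello: <name> }.
function handle(path) {
  const params = matchRoute("/users/:name", path);
  return params === null ? null : { hello: params.name };
}

const response = handle("/users/lucas");
console.log(response);
```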

Foxx - More features

‣ Full access to ArangoDB's internal APIs:

‣ Simple Queries

‣ AQL

‣ Traversals

‣ Automatic generation of interactive documentation

‣ Models and Repositories

‣ Central repository of Foxx apps for re-use and inspiration

‣ Authentication Module

27

High performance & space efficiency

RAM is cheap, but it's still not free, and data volume is growing fast. Request volumes are also growing. So performance and space efficiency are key features of a multi-purpose database.

‣ ArangoDB supports automatic schema recognition, which makes it one of the most space-efficient document stores.

‣ It offers a performance-oriented architecture with a C database core, a C++ communication layer, and JS and C++ for additional functionality.

‣ Performance-critical parts can be moved to C or C++.

‣ Although ArangoDB has a wide range of features, such as MVCC, real ACID, schema recognition, etc., it can compete with popular document stores.
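The slides do not explain how automatic schema recognition works internally. One way to picture why it can save space, shown here purely as an assumption, is to store each distinct set of attribute names once and let every document refer to it. The `ShapedCollection` class below is a hypothetical plain JavaScript sketch of that idea, not ArangoDB's implementation.

```javascript
// Hypothetical sketch: documents with the same attribute names share one
// "shape"; only the values are stored per document.
class ShapedCollection {
  constructor() {
    this.shapes = [];            // each entry: sorted array of attribute names
    this.shapeIndex = new Map(); // JSON signature of the names -> shape number
    this.rows = [];              // each entry: { shape: number, values: [...] }
  }

  insert(doc) {
    const attrs = Object.keys(doc).sort();
    const signature = JSON.stringify(attrs);
    if (!this.shapeIndex.has(signature)) {
      this.shapeIndex.set(signature, this.shapes.length);
      this.shapes.push(attrs);
    }
    const shape = this.shapeIndex.get(signature);
    this.rows.push({ shape: shape, values: attrs.map((a) => doc[a]) });
    return this.rows.length - 1; // position of the stored document
  }

  get(position) {
    const row = this.rows[position];
    const doc = {};
    this.shapes[row.shape].forEach((attr, i) => {
      doc[attr] = row.values[i];
    });
    return doc;
  }
}

const shaped = new ShapedCollection();
shaped.insert({ name: "Jan", age: 38 });
shaped.insert({ age: 25, name: "Ann" });
shaped.insert({ name: "Bob", city: "Cologne" });
console.log(shaped.shapes.length, shaped.get(1));
```

In this sketch three documents need only two stored shapes, so attribute names are not repeated for every document with the same structure.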

28

Space Efficiency

‣ Measure the space on disk of different data sets

‣ First in the standard config, then with some optimization

‣ We measured a bunch of different tasks

29

Store 50,000 Wiki Articles

30

[Bar chart: disk usage for ArangoDB, CouchDB and MongoDB, each in normal and optimized configuration; axis from 0 MB to 2000 MB.]

http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb

3,459,421 AOL Search Queries

31

[Bar chart: disk usage for ArangoDB, CouchDB and MongoDB, each in normal and optimized configuration; axis from 0 MB to 2200 MB.]

http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb

Performance: Disclaimer

‣ Always take performance tests with a grain of salt

‣ Performance is very dependent on a lot of factors including the specific task at hand

‣ This is just to give you a glimpse at the performance

‣ Always do your own performance tests (and if you do, report back to us :) )

‣ But now: Let's see some numbers

32

Execution Time: Bulk Insert of 10,000,000 documents

33

[Bar chart: execution time for ArangoDB, CouchDB and MongoDB.]

http://www.arangodb.org/2012/09/04/bulk-inserts-mongodb-couchdb-arangodb

Conclusion from Tests

‣ ArangoDB is really space efficient

‣ ArangoDB is “fast enough”

‣ Please test it for your own use case

34

Easy to use

‣ Easy to use admin interface

‣ Simple Queries for simple queries, AQL for complex queries

‣ Simplify your setup: ArangoDB only – no Application Server etc. – on a single server is sufficient for some use cases

‣ You need graph queries or key value storage? You don't need to add another component to the mix.

‣ No external dependencies like the JVM – just install ArangoDB

‣ HTTP interface – use your load balancer

35

Admin Frontend Dashboard

36

Admin Frontend Collections & Documents

37

Admin Frontend Graph Explorer

38

Admin Frontend AQL development

39

Admin Frontend complete V8 access

40

ArangoShell

41

Join the growing community

42

They are working on geo index, full text search and many APIs: Ruby, Python, PHP, Java, Clojure, ...

ArangoDB.explain()

{
  "type": "2nd generation NoSQL database",
  "model": [ "document", "graph", "key-value" ],
  "openSource": true,
  "license": "apache 2",
  "version": [ "1.3 stable", "1.4 alpha" ],
  "builtWith": [ "C", "C++", "JS" ],
  "uses": [ "Google V8" ],
  "mainFeatures": [
    "Multi-Collection-Transaction",
    "Foxx API Framework",
    "ArangoDB Query Language",
    "Various Indexes",
    "API Server",
    "Automatic Schema Recognition"
  ]
}

43