ArangoDB – A different approach to NoSQL


Page 1: ArangoDB – A different approach to NoSQL

the multi-purpose NoSQL Database

www.arangodb.org

Page 2: ArangoDB – A different approach to NoSQL

Why did we start ArangoDB?

What should an ideal multi-purpose database look like?

Is it already out there?


‣ Second Generation NoSQL DB

‣ Unique feature set

‣ Solves some problems of other NoSQL DBs

‣ Greenfield project

‣ Experienced team building NoSQL DBs for more than 10 years


Page 3: ArangoDB – A different approach to NoSQL

Main Features

‣ Open source and free: ArangoDB is available under the Apache 2 licence.

‣ Multi-model database: Model your data using flexible combinations of key-value pairs, documents and graphs.

‣ Convenient querying: AQL is a declarative query language similar to SQL. Other options are REST and querying by example.

‣ Extendable through JS: No language zoo. You can use one language from your browser to your back-end.

‣ High performance & space efficiency: ArangoDB is fast and takes less space than other NoSQL databases.

‣ Easy to use: Up and running in seconds. Administer ArangoDB using its graphical user interface.

‣ Started in Sep 2011

‣ Version 1.0 in Sep 2012

‣ Current: Version 1.4

‣ Multi-Database Support

‣ Foxx API Framework

‣ Master/Slave Replication

Page 4: ArangoDB – A different approach to NoSQL

Free and Open Source

‣ Apache 2 License: The Apache License is recognised by the Open Source Initiative as a popular and widely deployed licence with a strong community. All of The Apache Software Foundation's projects, including the Apache HTTP Server project whose software powers more than half of the Internet's web servers, use this licence.

‣ On GitHub: The community can report issues, participate and improve ArangoDB with just a few mouse clicks.

‣ Do what you want with it: You can even use ArangoDB in your commercial projects for free. Just leave the disclaimer intact.

‣ ... and don't pay a dime! That is, unless you want to support this great project :-)

Page 5: ArangoDB – A different approach to NoSQL

Multi model database


Key/Value Store Document Store Graph Database

Source: Andrew Carol

Polyglot Persistence

Page 6: ArangoDB – A different approach to NoSQL

Key-Value Store

‣ Map value data to unique string keys (identifiers)

‣ Treat data as opaque (data has no structure)

‣ Can implement scaling and partitioning easily due to the simple data model

‣ Key-value can be seen as a special case of documents. For many applications this is sufficient, but not for all cases.

ArangoDB

‣ Key-value data is currently supported in the form of key-value documents.

‣ A special key-value collection type is planned for the near future.

‣ One of the optimizations will be the elimination of JSON in this case, so the value need not be parsed.

‣ Sharding capabilities of key-value collections will differ from those of document collections.
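The remark that a key-value pair is a special case of a document can be pictured in a few lines of plain JavaScript. This is only an illustrative sketch, not ArangoDB code: the `KeyValueCollection` class and its `put`/`get` methods are hypothetical names, and the only thing borrowed from ArangoDB is the `_key` attribute name that documents use for their unique key.

```javascript
// Hypothetical in-memory sketch: a key-value store layered on documents.
// Each entry is a document { _key, value }; the value is never inspected.
class KeyValueCollection {
  constructor() {
    this.documents = new Map(); // _key -> document
  }

  // Store an opaque value under a unique string key.
  put(key, value) {
    if (typeof key !== "string") {
      throw new TypeError("key must be a string");
    }
    this.documents.set(key, { _key: key, value: value });
  }

  // Return the stored value, or undefined if the key is unknown.
  get(key) {
    const doc = this.documents.get(key);
    return doc === undefined ? undefined : doc.value;
  }
}

const sessions = new KeyValueCollection();
sessions.put("session:12345", "opaque-session-payload");
console.log(sessions.get("session:12345"));
```

Because the value is treated as opaque, nothing in the store depends on its structure, which is what makes this model easy to partition.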


Page 7: ArangoDB – A different approach to NoSQL

Document Store

‣ Normally based on key-value stores (each document still has a unique key)

‣ Allows saving documents with logical similarity in "collections"

‣ Treats data records as attribute-structured documents (data is no longer opaque)

‣ Often allows querying and indexing document attributes

ArangoDB

‣ It supports both. A database can contain collections of different types.

‣ For efficient memory handling, it has automatic schema recognition.

‣ It offers different ways to retrieve data: CRUD via a RESTful interface, query by example, JS for graph traversals, and AQL.

Page 8: ArangoDB – A different approach to NoSQL

Graph Store

‣ Example: Computer Science Bibliography

[Figure: property graph of a bibliography. Vertices: inproceeding "Finite Size Effects"; proceeding "Neural Modeling"; person "Anthony C. C. Coolen"; person "Sánchez-Andrés". Edges: "written"; "published" (pages 99-120); "edited".]

ArangoDB

‣ Supports property graphs

‣ Vertices and edges are documents

‣ Query them using geo-index, full-text, SQL-like queries

‣ Edges are directed relations between vertices

‣ Custom traversals and built-in graph algorithms
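To make "vertices and edges are documents" concrete, here is a small plain-JavaScript sketch based on the bibliography example. It is not ArangoDB code. The `_id`, `_from` and `_to` attribute names follow ArangoDB's conventions, but the `_id` values, the edge directions and the helper `outboundNeighbors` are assumptions made for this illustration.

```javascript
// Vertices are plain documents. The _id values are made up for this sketch.
const vertices = [
  { _id: "bib/coolen", type: "person", name: "Anthony C. C. Coolen" },
  { _id: "bib/finite", type: "inproceeding", title: "Finite Size Effects" },
  { _id: "bib/neural", type: "proceeding", title: "Neural Modeling" }
];

// Edges are documents too: directed (_from -> _to) and free to carry
// their own attributes. The directions are assumed for this sketch.
const edges = [
  { _from: "bib/coolen", _to: "bib/finite", label: "written" },
  { _from: "bib/finite", _to: "bib/neural", label: "published", pages: "99-120" }
];

// Follow outbound edges with a given label and return the target vertices.
function outboundNeighbors(startId, label) {
  return edges
    .filter((e) => e._from === startId && e.label === label)
    .map((e) => vertices.find((v) => v._id === e._to));
}

const written = outboundNeighbors("bib/coolen", "written");
console.log(written.map((v) => v.title));
```

Since an edge is an ordinary document, attributes such as `pages` can be filtered or indexed like any other document attribute.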

Page 9: ArangoDB – A different approach to NoSQL

NoSQL Map

[Figure: map of the NoSQL landscape, spanning Transaction Processing DBs (managing the evolving state of an IT system) and Analytic Processing DBs. Regions: Complex Queries, Map/Reduce, Graphs, Extensibility, Key/Value, Column-Stores, Documents, Massively Distributed, Structured Data.]

Page 10: ArangoDB – A different approach to NoSQL

Another NoSQL Map

[Figure: second map of the NoSQL landscape, spanning Transaction Processing DBs (managing the evolving state of an IT system) and Analytic Processing DBs. Regions: Map/Reduce, Graphs, Extensibility, Key/Value, Column-Stores, Complex Queries, Documents, Massively Distributed, Structured Data.]

Page 11: ArangoDB – A different approach to NoSQL

Polyglot Persistence: Speculative Retailer's Web Application

Use case            Polyglot Persistence Example*   Polyglot Persistence with ArangoDB
Reporting           RDBMS                           RDBMS
User activity log   Cassandra                       Cassandra
Product Catalog     MongoDB                         ArangoDB
Analytics           Cassandra                       Cassandra
Shopping Cart       Riak                            ArangoDB
Recommendations     Neo4J                           ArangoDB
Financial Data      RDBMS                           ArangoDB
User Sessions       Redis                           ArangoDB

*) Source: Martin Fowler, http://martinfowler.com/articles/nosql-intro.pdf

Page 12: ArangoDB – A different approach to NoSQL

Convenient querying

Different scenarios require different access methods:

‣ Query a document by its unique id / key:

GET /_api/document/users/12345

‣ Query by providing an example document:

PUT /_api/simple/by-example { "name": "Jan", "age": 38 }

‣ Query via AQL:

FOR user IN users
  FILTER user.active == true
  RETURN { name: user.name }

‣ Graph traversals and JS for your own traversals

‣ JS actions for "intelligent" DB requests
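Querying by example means: return every document whose attributes match those of a sample document. The following plain-JavaScript sketch shows that matching rule on an in-memory array. It is not the server's implementation; the function `byExample` and the sample data are invented for illustration, and it only compares top-level primitive attributes.

```javascript
// Return every document whose attributes equal all attributes of the example.
function byExample(collection, example) {
  const keys = Object.keys(example);
  return collection.filter((doc) =>
    keys.every((k) => doc[k] === example[k])
  );
}

const users = [
  { name: "Jan", age: 38, active: true },
  { name: "Jan", age: 25, active: false },
  { name: "Ada", age: 38, active: true }
];

const matches = byExample(users, { name: "Jan", age: 38 });
console.log(matches);
```

Attributes that the example does not mention (here `active`) are ignored, so an empty example matches every document.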


Page 13: ArangoDB – A different approach to NoSQL

Why another query language?

‣ Initially, we implemented a subset of SQL SELECT for querying, but it didn't fit well:

‣ ArangoDB is a document database, but SQL is a language used in the relational world

‣ Dealing with multi-valued attributes and creating horizontal lists with SQL is quite painful, but we needed these features

‣ We looked at UNQL, which addressed some of the problems, but the project seemed dead and there were no working UNQL implementations

‣ XQuery seemed quite powerful, but a bit too complex for simple queries and a first implementation

‣ JSONiq wasn't there when we started :-)


Page 14: ArangoDB – A different approach to NoSQL

ArangoDB Query Language (AQL)

‣ We rolled our own query language.

‣ It's a declarative language, loosely based on the syntax of XQuery.

‣ The language uses different keywords than SQL, so it's clear that the languages are different.

‣ It's human-readable and easy to understand.

‣ AQL is implemented in C and JavaScript.

‣ First version of AQL was released in mid-2012.


Page 15: ArangoDB – A different approach to NoSQL

Example for Aggregation

‣ Retrieve cities with the number of users:

FOR u IN users
  COLLECT city = u.city INTO g
  RETURN { "city": city, "numUsersInCity": LENGTH(g) }
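For readers more at home in JavaScript than in AQL, the grouping performed by this query can be sketched as follows. This is a plain-JavaScript analogy of what the query computes, not how ArangoDB executes it; the function name `usersPerCity` and the sample users are assumptions.

```javascript
// Group users by city and count the members of each group,
// mirroring COLLECT city = u.city INTO g ... LENGTH(g).
function usersPerCity(users) {
  const groups = new Map(); // city -> array of users in that city
  for (const u of users) {
    if (!groups.has(u.city)) {
      groups.set(u.city, []);
    }
    groups.get(u.city).push(u);
  }
  return Array.from(groups, ([city, g]) => ({
    city: city,
    numUsersInCity: g.length
  }));
}

const result = usersPerCity([
  { name: "Jan", city: "Cologne" },
  { name: "Ada", city: "London" },
  { name: "Tim", city: "Cologne" }
]);
console.log(result);
```

The sketch keeps groups in first-seen order; it makes no claim about the order in which AQL returns them.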


Page 16: ArangoDB – A different approach to NoSQL

Example for Graph Query

‣ Paths:

FOR u IN users
  LET userRelations = (
    FOR p IN PATHS( users, relations, "OUTBOUND" )
      FILTER p._from == u._id
      RETURN p
  )
  RETURN { "user" : u, "relations" : userRelations }
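Conceptually, the query asks for the outbound paths that start at each user. The sketch below enumerates such paths over an in-memory edge list in plain JavaScript. It illustrates the idea only: the function `outboundPaths`, the sample edges, and the representation of a path as an array of vertex ids are assumptions, and the result is not the shape that the AQL `PATHS` function returns.

```javascript
// Enumerate all outbound paths (length >= 1) starting at startId.
// Each path is an array of vertex ids; vertices already on the current
// path are skipped so that cycles cannot cause endless recursion.
function outboundPaths(startId, edges) {
  const paths = [];
  function walk(path) {
    const last = path[path.length - 1];
    for (const e of edges) {
      if (e._from === last && !path.includes(e._to)) {
        const next = path.concat(e._to);
        paths.push(next);
        walk(next);
      }
    }
  }
  walk([startId]);
  return paths;
}

const relations = [
  { _from: "users/anna", _to: "users/ben" },
  { _from: "users/ben", _to: "users/cara" },
  { _from: "users/cara", _to: "users/anna" }
];

const pathsFromAnna = outboundPaths("users/anna", relations);
console.log(pathsFromAnna);
```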


Page 17: ArangoDB – A different approach to NoSQL

Extendable through JS

‣ Scripting-Languages enrich ArangoDB

‣ Multi Collection Transactions

‣ Building small and efficient Apps - Foxx App Framework

‣ Custom graph traversals

‣ Cascading deletes/updates

‣ Assign permissions to actions

‣ Aggregate data from multiple queries into a single response

‣ Carry out data-intensive operations

‣ Help to create efficient push services (in the near future)

‣ Currently supported

‣ JavaScript (Google V8)

‣ mruby (experimental, not fully integrated yet)

Page 18: ArangoDB – A different approach to NoSQL

Action Server - kind of Application Server

‣ ArangoDB can answer arbitrary HTTP requests directly

‣ You can write your own JavaScript functions (“actions”) that will be executed server-side

‣ Includes a permission system

➡ You can use it as a database or as a combined database/app server

Page 19: ArangoDB – A different approach to NoSQL

APIs will become more and more important

‣ Single-page web applications

‣ Native mobile applications

‣ External developer APIs

Page 20: ArangoDB – A different approach to NoSQL

ArangoDB Foxx

‣ What if you could talk to the database directly?

‣ It would only need an API.

‣ What if we could define this API in JavaScript?


‣ ArangoDB Foxx is streamlined for API creation – not a jack of all trades

‣ It is designed for front end developers: Use JavaScript, which you already know (without running into callback hell)


Page 21: ArangoDB – A different approach to NoSQL

Foxx - Simple Example

var FoxxApplication = require("org/arangodb/foxx").Application;
var app = new FoxxApplication(applicationContext);

// Answer GET /test with a plain-text body.
app.get("/test", function (req, res) {
  res.set("Content-Type", "text/plain");
  res.body = "Worked!";
});

Page 22: ArangoDB – A different approach to NoSQL

Foxx - More features

‣ Full access to ArangoDB's internal APIs:

‣ Simple Queries

‣ AQL

‣ Traversals

‣ Automatic generation of interactive documentation

‣ Models and Repositories

‣ Central repository of Foxx apps for re-use and inspiration

‣ Authentication Module


Page 23: ArangoDB – A different approach to NoSQL

High performance & space efficiency

RAM is cheap, but it's still not free, and data volume is growing fast. Request volumes are also growing. So performance and space efficiency are key features of a multi-purpose database.

‣ ArangoDB supports automatic schema recognition, so it is one of the most space-efficient document stores.

‣ It offers a performance-oriented architecture with a C database core, a C++ communication layer, and JS and C++ for additional functionality.

‣ Performance-critical parts can be ported to C or C++.

‣ Although ArangoDB has a wide range of functions, such as MVCC, real ACID, schema recognition, etc., it can compete with popular document stores.
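One way to picture why recognising a shared schema saves space: when many documents have the same attribute names, the names can be stored once, and each document keeps only its values plus a reference to that list. The plain-JavaScript sketch below illustrates this idea. It is a hypothetical illustration and not ArangoDB's actual storage format; `ShapedCollection`, `shapeId` and the whole layout are invented for this example.

```javascript
// Hypothetical sketch: store each distinct attribute-name list ("shape")
// once, and keep only a shape id plus the values per document.
class ShapedCollection {
  constructor() {
    this.shapes = [];          // shape id -> sorted array of attribute names
    this.shapeIds = new Map(); // JSON signature of the names -> shape id
    this.rows = [];            // stored documents: { shapeId, values }
  }

  insert(doc) {
    const names = Object.keys(doc).sort();
    const signature = JSON.stringify(names);
    let shapeId = this.shapeIds.get(signature);
    if (shapeId === undefined) {
      // First document with this set of attribute names: register the shape.
      shapeId = this.shapes.length;
      this.shapes.push(names);
      this.shapeIds.set(signature, shapeId);
    }
    this.rows.push({ shapeId: shapeId, values: names.map((n) => doc[n]) });
  }

  // Rebuild the original document from its shape and values.
  get(index) {
    const row = this.rows[index];
    const doc = {};
    this.shapes[row.shapeId].forEach((name, i) => {
      doc[name] = row.values[i];
    });
    return doc;
  }
}

const people = new ShapedCollection();
people.insert({ name: "Jan", age: 38 });
people.insert({ age: 25, name: "Ada" });
people.insert({ name: "Tim", city: "Cologne" });
console.log(people.shapes.length, people.get(1));
```

In this sketch the first two documents share one shape even though their attributes were written in a different order, so the attribute names `age` and `name` are stored only once.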


Page 24: ArangoDB – A different approach to NoSQL

Space Efficiency

‣ Measure the space on disk of different data sets

‣ First in the standard config, then with some optimization

‣ We measured a bunch of different tasks


Page 25: ArangoDB – A different approach to NoSQL

Store 50,000 Wiki Articles

[Figure: bar chart of disk usage (axis 0 MB to 2000 MB) for ArangoDB, CouchDB and MongoDB, each in normal and optimized configuration.]

http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb

Page 26: ArangoDB – A different approach to NoSQL

3,459,421 AOL Search Queries

[Figure: bar chart of disk usage (axis 0 MB to 2200 MB) for ArangoDB, CouchDB and MongoDB, each in normal and optimized configuration.]

http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb

Page 27: ArangoDB – A different approach to NoSQL

Performance: Disclaimer

‣ Always take performance tests with a grain of salt

‣ Performance is very dependent on a lot of factors including the specific task at hand

‣ This is just to give you a glimpse at the performance

‣ Always do your own performance tests (and if you do, report back to us :) )

‣ But now: Let's see some numbers

Page 28: ArangoDB – A different approach to NoSQL

Execution Time: Bulk Insert of 10,000,000 documents

[Figure: bar chart of execution time for ArangoDB, CouchDB and MongoDB.]

http://www.arangodb.org/2012/09/04/bulk-inserts-mongodb-couchdb-arangodb

Page 29: ArangoDB – A different approach to NoSQL

Conclusion from Tests

‣ ArangoDB is really space efficient

‣ ArangoDB is “fast enough”

‣ Please test it for your own use case


Page 30: ArangoDB – A different approach to NoSQL

Easy to use

‣ Easy to use admin interface

‣ Simple Queries for simple queries, AQL for complex queries

‣ Simplify your setup: ArangoDB only – no Application Server etc. – on a single server is sufficient for some use cases

‣ You need graph queries or key value storage? You don't need to add another component to the mix.

‣ No external dependencies like the JVM – just install ArangoDB

‣ HTTP interface – use your load balancer


Page 31: ArangoDB – A different approach to NoSQL

Admin Frontend Dashboard


Page 32: ArangoDB – A different approach to NoSQL

Admin Frontend Collections & Documents


Page 33: ArangoDB – A different approach to NoSQL

Admin Frontend AQL development


Page 34: ArangoDB – A different approach to NoSQL

Admin Frontend complete V8 access


Page 35: ArangoDB – A different approach to NoSQL

ArangoShell


Page 36: ArangoDB – A different approach to NoSQL

Join the growing community

Community members are working on geo index, full text search and many APIs: Ruby, Python, PHP, Java, D, ...

Page 37: ArangoDB – A different approach to NoSQL

ArangoDB.explain()

{
  "type": "multi-purpose NoSQL database",
  "model": [ "document", "graph", "key-value" ],
  "openSource": true,
  "license": "apache 2",
  "version": [ "1.4.9 stable", "2.0 alpha" ],
  "builtWith": [ "C", "C++", "JS" ],
  "uses": [ "Google V8" ],
  "mainFeatures": [
    "Multi-Collection-Transaction",
    "Foxx API Framework",
    "ArangoDB Query Language",
    "Various Indexes",
    "API Server",
    "Automatic Schema Recognition"
  ]
}

Page 38: ArangoDB – A different approach to NoSQL

Appendix


Page 39: ArangoDB – A different approach to NoSQL

Data Sheet

‣ Extendable through mruby and JavaScript (Google V8 engine)

‣ Integrated Application Server

‣ JavaScript API Framework "Foxx"

‣ ArangoDB Query Language (AQL)

‣ Query by Example

‣ RESTful Query Interface

‣ Modular Graph Traversal Algorithms

‣ Easy Administration and Enhanced System Monitoring

‣ Web-based Console and CLI commands

‣ Efficient Data Import and Export Tools

‣ Fully documented Source Code and APIs

‣ Universal Multi-Model Database: Document, Graph and Key/Value

‣ Written in C++ with high speed C Core

‣ Easy to Install & Configure

‣ Runs on Linux, BSD, Mac OS and Windows

‣ Sharding and Replication (in development)

‣ Mostly in-memory (durable on hard disk)

‣ Multi-Threaded

‣ Powerful Indices: full-text search, hash indices, priority queues, skip lists, geo indices

‣ Schema-less schemata (schema recognition)

‣ Multi Collection Transactions

‣ Driver support for all popular platforms: Node.js, JS, PHP, Ruby, Go, D, Python, Blueprints / Gremlin, C# / .Net, Java