ArangoDB – A different approach to NoSQL
Why did we start ArangoDB?
What should an ideal multi-purpose database look like?
Is it already out there?
‣ Second Generation NoSQL DB
‣ Unique feature set
‣ Solves some problems of other NoSQL DBs
‣ Greenfield project
‣ Experienced team building NoSQL DBs for more than 10 years
Main Features
‣ Open source and free: ArangoDB is available under the Apache 2 licence.
‣ Multi-model database: Model your data using flexible combinations of key-value pairs, documents and graphs.
‣ Convenient querying: AQL is a declarative query language similar to SQL. Other options are REST and querying by example.
‣ Extendable through JS: No language zoo. You can use one language from your browser to your back-end.
‣ High performance & space efficiency: ArangoDB is fast and takes less space than other NoSQL databases.
‣ Easy to use: Up and running in seconds; administer ArangoDB using its graphical user interface.
‣ Started in Sep 2011
‣ Version 1.0 in Sep 2012
‣ Current: Version 1.4
‣ Multi-database support
‣ Foxx API Framework
‣ Master/Slave Replication
Free and Open Source
‣ Apache 2 License: The Apache License is recognised by the Open Source Initiative as a popular and widely deployed licence with a strong community. All of The Apache Software Foundation’s projects, including the Apache HTTP Server project whose software powers more than half of the Internet’s web servers, use this licence.
‣ On GitHub: The community can report issues, participate and improve ArangoDB with just a few mouse clicks.
‣ Do what you want with it: You can even use ArangoDB in your commercial projects for free. Just leave the disclaimer intact.
‣ ... and don't pay a dime! That is, unless you want to support this great project :-)
Multi model database
Key/Value Store Document Store Graph Database
Source: Andrew Carol
Polyglot Persistence
Key-Value Store
‣ Map value data to unique string keys (identifiers)
‣ Treat data as opaque (data has no structure)
‣ Can implement scaling and partitioning easily thanks to the simple data model
‣ Key-value can be seen as a special case of documents. For many applications this is sufficient, but not for all cases.
ArangoDB
‣ Key-value data is currently stored as key-value documents.
‣ Special key-value collections are planned for the near future.
‣ One of the optimizations will be the elimination of JSON in this case, so the value need not be parsed.
‣ Sharding capabilities of key-value collections will differ from those of document collections.
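The point that key-value is a special case of documents can be illustrated with a small sketch. Everything here is hypothetical: `DocumentCollection` is an in-memory stand-in, not ArangoDB's collection API, and the `value` attribute name is an assumption made for the example.

```javascript
// Hypothetical in-memory stand-in for a document collection (NOT ArangoDB's API).
class DocumentCollection {
  constructor() { this.docs = new Map(); }
  save(doc) { this.docs.set(doc._key, doc); return doc; }
  document(key) { return this.docs.get(key); }
  remove(key) { return this.docs.delete(key); }
}

// A key-value store expressed as documents with one opaque "value" attribute.
class KeyValueView {
  constructor(collection) { this.collection = collection; }
  put(key, value) { this.collection.save({ _key: key, value: value }); }
  get(key) {
    const doc = this.collection.document(key);
    return doc === undefined ? undefined : doc.value;
  }
  delete(key) { return this.collection.remove(key); }
}

const sessions = new KeyValueView(new DocumentCollection());
sessions.put("session:42", "opaque-payload");
console.log(sessions.get("session:42"));
```

The wrapper never looks inside the value, which is what would let a dedicated key-value collection skip JSON parsing altogether.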
Document Store
‣ Normally based on key-value stores (each document still has a unique key)
‣ Allow saving documents with logical similarity in "collections"
‣ Treat data records as attribute-structured documents (data is no longer opaque)
‣ Often allow querying and indexing document attributes
ArangoDB
‣ It supports both. A database can contain collections of different types.
‣ For efficient memory handling we have automatic schema recognition.
‣ It offers different ways to retrieve data: CRUD via RESTful interface, query by example, JS for graph traversals, and AQL.
Graph Store
‣ Example: Computer Science Bibliography
[Figure: property graph from the bibliography. Vertices: inproceeding "Finite Size Effects"; proceeding "Neural Modeling"; person "Anthony C. C. Coolen"; person "Sánchez-Andrés". Edge labels: "written"; "published" (pages 99-120); "edited".]
ArangoDB
‣ Supports property graphs
‣ Vertices and edges are documents
‣ Query them using geo-index, full-text, SQL-like queries
‣ Edges are directed relations between vertices
‣ Custom traversals and built-in graph algorithms
NoSQL Map
[Figure: map of NoSQL databases placed between analytic processing DBs and transaction processing DBs (managing the evolving state of an IT system). Labels: complex queries, map/reduce, graphs, extensibility, key/value, column stores, documents, massively distributed, structured data.]
Another NoSQL Map
[Figure: second map of NoSQL databases placed between transaction processing DBs (managing the evolving state of an IT system) and analytic processing DBs. Labels: map/reduce, graphs, extensibility, key/value, column stores, complex queries, documents, massively distributed, structured data.]
Polyglot Persistence: Speculative Retailer's Web Application
Polyglot Persistence Example*
‣ Reporting: RDBMS
‣ User activity log: Cassandra
‣ Product Catalog: MongoDB
‣ Analytics: Cassandra
‣ Shopping Cart: Riak
‣ Recommendations: Neo4J
‣ Financial Data: RDBMS
‣ User Sessions: Redis
*) Source: Martin Fowler, http://martinfowler.com/articles/nosql-intro.pdf
Polyglot Persistence with ArangoDB
‣ Reporting: RDBMS
‣ User activity log: Cassandra
‣ Product Catalog: ArangoDB
‣ Analytics: Cassandra
‣ Shopping Cart: ArangoDB
‣ Recommendations: ArangoDB
‣ Financial Data: ArangoDB
‣ User Sessions: ArangoDB
Convenient querying
Different scenarios require different access methods:
‣ Query a document by its unique id / key:
GET /_api/document/users/12345
‣ Query by providing an example document:
PUT /_api/simple/by-example { "collection": "users", "example": { "name": "Jan", "age": 38 } }
‣ Query via AQL: FOR user IN users FILTER user.active == true RETURN { name: user.name }
‣ Graph traversals and JS for your own traversals
‣ JS actions for "intelligent" DB requests
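The first three access methods above all go through the HTTP interface, so they can be described as plain request objects. The sketch below only builds those descriptions and sends nothing. The helper names are hypothetical, and the `/_api/cursor` endpoint and the JSON body shapes are assumptions based on ArangoDB's HTTP API documentation, so check them against your server version.

```javascript
// Hypothetical helpers that only describe requests; nothing is sent over the network.
function byKey(collection, key) {
  return {
    method: "GET",
    path: "/_api/document/" + collection + "/" + encodeURIComponent(key)
  };
}

// Assumed body shape: collection name plus an example document to match.
function byExample(collection, example) {
  return {
    method: "PUT",
    path: "/_api/simple/by-example",
    body: JSON.stringify({ collection: collection, example: example })
  };
}

// Assumed endpoint for AQL: POST the query text to the cursor API.
function byAql(query) {
  return {
    method: "POST",
    path: "/_api/cursor",
    body: JSON.stringify({ query: query })
  };
}

console.log(byKey("users", "12345"));
console.log(byExample("users", { name: "Jan", age: 38 }));
console.log(byAql("FOR user IN users FILTER user.active == true RETURN { name: user.name }"));
```

Any HTTP client can then send these descriptions, which is why no special driver is strictly required.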
Why another query language?
‣ Initially, we implemented a subset of SQL SELECT for querying, but it didn't fit well:
‣ ArangoDB is a document database, but SQL is a language used in the relational world
‣ Dealing with multi-valued attributes and creating horizontal lists with SQL is quite painful, but we needed these features
‣ We looked at UNQL, which addressed some of the problems, but the project seemed dead and there were no working UNQL implementations
‣ XQuery seemed quite powerful, but a bit too complex for simple queries and a first implementation
‣ JSONiq wasn't there when we started :-)
ArangoDB Query Language (AQL)
‣ We rolled our own query language.
‣ It's a declarative language, loosely based on the syntax of XQuery.
‣ The language uses different keywords than SQL, so it's clear that the languages are different.
‣ It's human-readable and easy to understand.
‣ AQL is implemented in C and JavaScript.
‣ First version of AQL was released in mid-2012.
Example for Aggregation
‣ Retrieve cities with the number of users:
FOR u IN users
  COLLECT city = u.city INTO g
  RETURN { "city" : city, "numUsersInCity": LENGTH(g) }
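For readers new to COLLECT ... INTO, the following plain JavaScript sketch mimics what the aggregation computes: group users by city, then count each group. It is an illustration only, not how AQL executes the query. The function name and sample data are hypothetical, and the result order here simply follows first appearance in the input.

```javascript
// Group users by city and count each group (illustrative equivalent of the query).
function usersPerCity(users) {
  const groups = new Map();
  for (const u of users) {
    if (!groups.has(u.city)) {
      groups.set(u.city, []);
    }
    groups.get(u.city).push(u); // plays the role of "INTO g"
  }
  // One result object per group, like the RETURN clause.
  return Array.from(groups, ([city, g]) => ({ city: city, numUsersInCity: g.length }));
}

// Hypothetical sample data.
const sampleUsers = [
  { name: "Jan", city: "Cologne" },
  { name: "Ann", city: "Berlin" },
  { name: "Tom", city: "Cologne" }
];
console.log(usersPerCity(sampleUsers));
```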
Example for Graph Query
‣ Paths:
FOR u IN users
  LET userRelations = (
    FOR p IN PATHS( users, relations, "OUTBOUND" )
      FILTER p._from == u._id
      RETURN p
  )
  RETURN { "user" : u, "relations" : userRelations }
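To make the idea of outbound paths concrete, here is a small sketch that enumerates them over an edge list with `_from` and `_to` attributes. It is not the implementation of PATHS. The function name, the `maxDepth` limit and the sample edges are all hypothetical.

```javascript
// Enumerate all outbound paths (as lists of edges) that start at startId.
// Vertices already on the current path are skipped, so cycles terminate.
function outboundPaths(startId, edges, maxDepth) {
  const paths = [];
  function walk(vertexId, visited, trail) {
    if (trail.length >= maxDepth) {
      return;
    }
    for (const e of edges) {
      if (e._from !== vertexId || visited.has(e._to)) {
        continue;
      }
      const next = trail.concat([e]);
      paths.push(next);
      visited.add(e._to);
      walk(e._to, visited, next);
      visited.delete(e._to);
    }
  }
  walk(startId, new Set([startId]), []);
  return paths;
}

// Hypothetical sample edges between three users.
const relations = [
  { _from: "users/a", _to: "users/b" },
  { _from: "users/b", _to: "users/c" },
  { _from: "users/a", _to: "users/c" }
];
console.log(outboundPaths("users/a", relations, 3).length);
```

Running it once per user and attaching the result to the user document gives the same overall shape as the query's RETURN clause.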
Extendable through JS
‣ Scripting languages enrich ArangoDB
‣ Multi Collection Transactions
‣ Building small and efficient Apps - Foxx App Framework
‣ Custom graph traversals
‣ Cascading deletes/updates
‣ Assign permissions to actions
‣ Aggregate data from multiple queries into a single response
‣ Carry out data-intensive operations
‣ Help to create efficient push services (in the near future)
‣ Currently supported
‣ JavaScript (Google V8)
‣ mruby (experimental, not fully integrated yet)
Action Server - kind of Application Server
‣ ArangoDB can answer arbitrary HTTP requests directly
‣ You can write your own JavaScript functions (“actions”) that will be executed server-side
‣ Includes a permission system
➡You can use it as a database or as a combined database/app server
APIs - will become more & more important
‣ Single Page Web Applications
‣ Native Mobile Applications
‣ External developer APIs
ArangoDB Foxx
‣ What if you could talk to the database directly?
‣ It would only need an API.
‣ What if we could define this API in JavaScript?
‣ ArangoDB Foxx is streamlined for API creation – not a jack of all trades
‣ It is designed for front end developers: Use JavaScript, which you already know (without running into callback hell)
Foxx - Simple Example
// Load the Foxx application class shipped with ArangoDB.
var FoxxApplication = require("org/arangodb/foxx").Application;
var app = new FoxxApplication(applicationContext);

// Answer GET /test with a plain-text body.
app.get("/test", function (req, res) {
  res.set("Content-Type", "text/plain");
  res.body = "Worked!";
});
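To see what such a route handler does without a running server, the handler can be exercised against a tiny stand-in. `FakeApp` below is entirely hypothetical and is not the `org/arangodb/foxx` module. It only mimics the `get(path, handler)` registration and a response object with `set` and `body`, as used in the slide.

```javascript
// Hypothetical stand-in for a Foxx application; NOT the org/arangodb/foxx module.
class FakeApp {
  constructor() { this.routes = new Map(); }

  // Register a handler for GET requests on the given path.
  get(path, handler) { this.routes.set("GET " + path, handler); }

  // Dispatch a request to the registered handler and return the response object.
  handle(method, path) {
    const res = {
      status: 200,
      headers: {},
      body: "",
      set(name, value) { this.headers[name] = value; }
    };
    const handler = this.routes.get(method + " " + path);
    if (handler === undefined) {
      res.status = 404;
      return res;
    }
    handler({ method: method, path: path }, res);
    return res;
  }
}

const fakeApp = new FakeApp();
fakeApp.get("/test", function (req, res) {
  res.set("Content-Type", "text/plain");
  res.body = "Worked!";
});

const response = fakeApp.handle("GET", "/test");
console.log(response.status, response.headers["Content-Type"], response.body);
```

The handler is written in a straight-line style: it fills in the response and returns, with no callbacks involved.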
Foxx - More features
‣ Full access to ArangoDB's internal APIs:
‣ Simple Queries
‣ AQL
‣ Traversals
‣ Automatic generation of interactive documentation
‣ Models and Repositories
‣ Central repository of Foxx apps for re-use and inspiration
‣ Authentication Module
High performance & space efficiency
RAM is cheap, but it's still not free, and data volume is growing fast. Request volumes are also growing. So performance and space efficiency are key features of a multi-purpose database.
‣ ArangoDB supports automatic schema recognition, so it is one of the most space-efficient document stores.
‣ It offers a performance-oriented architecture with a C database core, a C++ communication layer, and JS and C++ for additional functionality.
‣ Performance-critical parts can be moved to C or C++.
‣ Although ArangoDB has a wide range of functions, such as MVCC, real ACID, schema recognition, etc., it can compete with popular document stores.
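The slides do not describe how schema recognition works internally. As a rough intuition for why it can save space, the sketch below stores each distinct set of attribute names once (a "shape") and keeps only the values per document. The class, its method names and the storage layout are hypothetical and much simpler than a real implementation, which would also have to track value types and nesting.

```javascript
// Hypothetical illustration: attribute names are stored once per shape,
// documents are stored as a shape id plus an array of values.
class ShapeStore {
  constructor() {
    this.shapeIds = new Map(); // signature -> shape id
    this.shapeList = [];       // shape id -> sorted attribute names
    this.rows = [];            // stored documents
  }

  save(doc) {
    const names = Object.keys(doc).sort();
    const signature = JSON.stringify(names);
    let shapeId = this.shapeIds.get(signature);
    if (shapeId === undefined) {
      shapeId = this.shapeList.length;
      this.shapeIds.set(signature, shapeId);
      this.shapeList.push(names);
    }
    this.rows.push({ shapeId: shapeId, values: names.map((n) => doc[n]) });
    return this.rows.length - 1;
  }

  load(rowId) {
    const row = this.rows[rowId];
    const names = this.shapeList[row.shapeId];
    const doc = {};
    names.forEach((n, i) => { doc[n] = row.values[i]; });
    return doc;
  }
}

const store = new ShapeStore();
const first = store.save({ name: "Jan", age: 38 });
store.save({ age: 25, name: "Ann" });
console.log(store.shapeList.length, store.load(first));
```

Both sample documents share one shape, so the attribute names `age` and `name` are kept only once no matter how many such documents are saved.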
Space Efficiency
‣ Measure the space on disk of different data sets
‣ First in the standard config, then with some optimization
‣ We measured a bunch of different tasks
Store 50,000 Wiki Articles
[Bar chart: disk usage in MB (axis 0 to 2000 MB) for ArangoDB, CouchDB and MongoDB; series: Normal, Optimized]
Source: http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb
3,459,421 AOL Search Queries
[Bar chart: disk usage in MB (axis 0 to 2200 MB) for ArangoDB, CouchDB and MongoDB; series: Normal, Optimized]
Source: http://www.arangodb.org/2012/07/08/collection-disk-usage-arangodb
Performance: Disclaimer
‣ Always take performance tests with a grain of salt
‣ Performance is very dependent on a lot of factors including the specific task at hand
‣ This is just to give you a glimpse at the performance
‣ Always do your own performance tests (and if you do, report back to us :) )
‣ But now: Let's see some numbers
Execution Time: Bulk Insert of 10,000,000 documents
[Bar chart: execution time for ArangoDB, CouchDB and MongoDB]
Source: http://www.arangodb.org/2012/09/04/bulk-inserts-mongodb-couchdb-arangodb
Conclusion from Tests
‣ ArangoDB is really space efficient
‣ ArangoDB is “fast enough”
‣ Please test it for your own use case
Easy to use
‣ Easy to use admin interface
‣ Simple Queries for simple queries, AQL for complex queries
‣ Simplify your setup: ArangoDB only – no Application Server etc. – on a single server is sufficient for some use cases
‣ You need graph queries or key value storage? You don't need to add another component to the mix.
‣ No external dependencies like the JVM – just install ArangoDB
‣ HTTP interface – use your load balancer
Admin Frontend: Dashboard
Admin Frontend: Collections & Documents
Admin Frontend: AQL development
Admin Frontend: complete V8 access
ArangoShell
Join the growing community
They are working on geo index, full text search and many APIs: Ruby, Python, PHP, Java, D, ...
ArangoDB.explain()
{
  "type": "multi-purpose NoSQL database",
  "model": [ "document", "graph", "key-value" ],
  "openSource": true,
  "license": "apache 2",
  "version": [ "1.4.9 stable", "2.0 alpha" ],
  "builtWith": [ "C", "C++", "JS" ],
  "uses": [ "Google V8" ],
  "mainFeatures": [
    "Multi-Collection-Transaction",
    "Foxx API Framework",
    "ArangoDB Query Language",
    "Various Indexes",
    "API Server",
    "Automatic Schema Recognition"
  ]
}
Appendix
Data Sheet
‣ Extendable through mruby and JavaScript (Google V8 engine)
‣ Integrated Application Server
‣ JavaScript API Framework “Foxx”
‣ ArangoDB Query Language (AQL)
‣ Query by Example
‣ RESTful Query Interface
‣ Modular Graph Traversal Algorithms
‣ Easy Administration and Enhanced System Monitoring
‣ Web-based Console and CLI commands
‣ Efficient Data Import and Export Tools
‣ Fully documented Source Code and APIs
‣ Universal Multi-Model Database: Document, Graph and Key/Value
‣ Written in C++ with high speed C Core
‣ Easy to Install & Configure
‣ Runs on Linux, BSD, Mac OS and Windows
‣ Sharding and Replication (in development)
‣ Mostly in-memory (durable on hard disk)
‣ Multi-Threaded
‣ Powerful indices: full-text search, hash indices, priority queues, skip lists, geo indices
‣ Schema-less schemata (schema recognition)
‣ Multi Collection Transactions
‣ Driver support for all popular platforms: Node.js, JS, PHP, Ruby, Go, D, Python, Blueprints / Gremlin, C# / .Net, Java