OrientDB distributed architecture 1.1

Distributed architecture with a Multi-Master approach Available in version 1.0 (planned for December 2011) www.orientechnologies.com Licensed under a Creative Commons Attribution-NoDerivs 3.0 Unported License Page 1 of 41 rev 1.1

description

This is the official presentation of the new clustering Multi-Master architecture of OrientDB

Transcript of OrientDB distributed architecture 1.1

Page 1: OrientDB distributed architecture 1.1

Distributed architecture with a Multi-Master approach

Available in version 1.0 (planned for December 2011)


rev 1.1

Page 2: OrientDB distributed architecture 1.1

Where is the previous OrientDB Master/Slave architecture?


Page 3: OrientDB distributed architecture 1.1


Page 4: OrientDB distributed architecture 1.1

After the first tests we decided to throw away the old Master-Slave architecture because it was against the OrientDB philosophy:

it doesn't scale

and

it's hard to configure properly

Page 5: OrientDB distributed architecture 1.1

So what's next?

We've re-designed the entire distributed architecture to get it working as Multi-Master*, to be released in version 1.0 (December 2011)

*http://en.wikipedia.org/wiki/Multi-master_replication


Page 6: OrientDB distributed architecture 1.1

In the Multi-Master architecture

any node can read/write to the database

this scales horizontally

adding nodes is straightforward

Say wow!


Page 7: OrientDB distributed architecture 1.1


Page 8: OrientDB distributed architecture 1.1

...but

you have to fight with conflicts

Page 9: OrientDB distributed architecture 1.1

Fortunately we found some smart ways to resolve conflicts without falling into a

Blood Bath


Page 10: OrientDB distributed architecture 1.1

The actors

Leader Node: Only one per cluster. It checks the other nodes and notifies the Peer Nodes of changes. It can be any server node in the cluster, usually the first to start.

Peer Node: Any other server node in the cluster. It has a permanent connection to the Leader Node.

Client: Clients are connected to Server Nodes, no matter whether Leader or Peer.

Database: Where the data is stored.

Synchronous mode replication: The server node propagates changes, waits for the response from the remote server, then sends the ACK to the client.

Asynchronous mode replication: The server node propagates changes and sends the ACK to the client without waiting for the response from the remote server.


Page 11: OrientDB distributed architecture 1.1

How is the cluster of nodes composed and managed?


Page 12: OrientDB distributed architecture 1.1

Cluster auto-discovery

At startup each Server Node sends an IP Multicast message to discover whether a Leader Node is available so that it can join the cluster. If a Leader Node is available, it connects to the new node, which becomes a Peer Node; otherwise the new node becomes the Leader Node.

[Diagram: Server #1 (Leader) and Server #2 (Peer), each hosting several databases]
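The start-up decision described on this slide can be sketched in a few lines. This is a minimal in-memory sketch, not OrientDB code: the `Cluster` class stands in for the multicast group, and the names `Cluster` and `join` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    """Stand-in for the multicast group: it only remembers the roles."""
    leader: Optional[str] = None
    peers: List[str] = field(default_factory=list)


def join(cluster: Cluster, node_id: str) -> str:
    """Return the role taken by a node that has just started."""
    if cluster.leader is None:
        # Nobody answered the discovery message: become the Leader.
        cluster.leader = node_id
        return "leader"
    # A Leader answered and connects to the new node, which becomes a Peer.
    cluster.peers.append(node_id)
    return "peer"


cluster = Cluster()
roles = [join(cluster, "server-1"), join(cluster, "server-2")]
print(roles)
```

The first node to start finds no Leader and takes that role; every later node becomes a Peer.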


Page 13: OrientDB distributed architecture 1.1

One Leader, Multiple Peers

The first node to start is always the Leader, but in case of failure any other node can be elected. The Leader Node polls all the servers to verify their status and alerts all the Peer Nodes at every change in the cluster composition.

[Diagram: Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer), each hosting several databases]


Page 14: OrientDB distributed architecture 1.1

Asymmetric clustering

Each database can be clustered across multiple server nodes. Databases can be moved across servers. The replication strategy has per database/server granularity. This means Server #2 could replicate database B asynchronously to Server #3 and database A synchronously to Server #1.

[Diagram: databases A, B and C distributed across Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer)]
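The per database/server granularity can be pictured as a lookup table. The sketch below encodes the example from this slide; the table layout and the names `plan` and `targets` are assumptions made for illustration, not the OrientDB configuration format.

```python
from typing import Dict, List, Tuple

# (source server, database) -> list of (target server, replication mode)
ReplicationPlan = Dict[Tuple[str, str], List[Tuple[str, str]]]

# The example from the slide: Server #2 replicates B asynchronously
# to Server #3 and A synchronously to Server #1.
plan: ReplicationPlan = {
    ("server-2", "B"): [("server-3", "asynch")],
    ("server-2", "A"): [("server-1", "synch")],
}


def targets(plan: ReplicationPlan, source: str, database: str) -> List[Tuple[str, str]]:
    """Return where, and how, a database on a given server is replicated."""
    return plan.get((source, database), [])


print(targets(plan, "server-2", "B"))
print(targets(plan, "server-2", "A"))
```

A database with no entry in the table is simply not replicated from that server.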


Page 15: OrientDB distributed architecture 1.1

Distributed configuration

The cluster configuration is broadcast from the Leader Node to all the Peer Nodes. Peer Nodes broadcast it to all the connected clients. Everybody knows who has the database.

[Diagram: Server #1 (Leader), Server #2 (Peer) and Server #3 (Peer), with Client #1, Client #2 and Client #3 connected]


Page 16: OrientDB distributed architecture 1.1

Security

To join a cluster, the Server Node has to be configured with the cluster name and password. Broadcast messages are encrypted using the password. The password doesn't cross the network: it's stored in the configuration file.


[Diagram: Server #2 (Peer) joins the cluster of Server #1 (Leader) ONLY if it knows the name and password]
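The idea that a shared password gates cluster membership without ever crossing the network can be illustrated with a keyed hash. Note the substitution: the slide says broadcast messages are encrypted, while this sketch uses HMAC authentication from the Python standard library as a stand-in. The functions `sign` and `accept` are hypothetical and say nothing about how OrientDB actually protects its messages.

```python
import hashlib
import hmac


def sign(cluster_name: str, password: str, payload: bytes) -> bytes:
    """Tag a message with a key derived from the locally stored password."""
    key = password.encode("utf-8")
    message = cluster_name.encode("utf-8") + b"|" + payload
    return hmac.new(key, message, hashlib.sha256).digest()


def accept(cluster_name: str, password: str, payload: bytes, tag: bytes) -> bool:
    """A receiver accepts only messages tagged with its own name and password."""
    expected = sign(cluster_name, password, payload)
    return hmac.compare_digest(expected, tag)


tag = sign("A", "aaa", b"join")
print(accept("A", "aaa", b"join", tag))  # same name and password
print(accept("A", "bbb", b"join", tag))  # wrong password
```

Only the tag travels with the message; the password itself stays in each node's configuration.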

Page 17: OrientDB distributed architecture 1.1

Leader election

Each Peer Node continuously checks the connection with the Leader Node. If the connection is lost, it tries to elect itself as the new Leader Node. A split network is resolved using a simple algorithm.


Server #1, 192.168.0.10:2424 (Leader)

Server #2, 192.168.10.27:2424 (Leader)

Server #1 takes the leadership because it has the lower ID.

ID = <ip-address>:<port>
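The "lower ID wins" rule can be sketched as follows. The slide does not say how two IDs are compared, so this sketch assumes a numeric comparison of the address octets followed by the port; `node_key` and `elect` are hypothetical names.

```python
from typing import Iterable, Tuple


def node_key(node_id: str) -> Tuple[int, ...]:
    """Turn '<ip-address>:<port>' into a tuple of integers for comparison."""
    host, port = node_id.rsplit(":", 1)
    return tuple(int(part) for part in host.split(".")) + (int(port),)


def elect(leaders: Iterable[str]) -> str:
    """Among nodes that all believe they are Leader, keep the lowest ID."""
    return min(leaders, key=node_key)


winner = elect(["192.168.10.27:2424", "192.168.0.10:2424"])
print(winner)
```

With the two addresses from the slide, 192.168.0.10:2424 wins, which matches Server #1 taking the leadership. Comparing numerically rather than as plain strings avoids ranking "10" below "9".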

Page 18: OrientDB distributed architecture 1.1

Multiple clusters

Multiple separate clusters can coexist in the same network. Clusters can't see each other: they are separate boxes. What identifies a cluster is name + password.

Cluster 'A', password 'aaa': Server #1 (Leader), Server #2 (Peer), Server #3 (Peer)

Cluster 'B', password 'bbb': Server #1 (Leader), Server #2 (Peer), Server #3 (Peer)

Page 19: OrientDB distributed architecture 1.1

Fail-over

Clients know about the other nodes, so they transparently switch to good servers. No error is sent to the client app. Running transactions will be repeated transparently too (v1.2).

[Diagram: Client #1, Client #2, Client #3 and Client #4 connected to Server #1 (DB-1) and Server #2 (DB-2)]
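Client-side fail-over can be sketched as a loop over the known servers. This is a simplified illustration with hypothetical names (`FailoverClient`, `execute`); servers are modelled as plain callables, and the real client protocol is not shown.

```python
from typing import Callable, Dict, List


class FailoverClient:
    """Tries each known server in turn and hides failures from the caller."""

    def __init__(self, servers: Dict[str, Callable[[str], str]], order: List[str]):
        self.servers = servers
        self.order = order

    def execute(self, command: str) -> str:
        last_error = None
        for name in self.order:
            try:
                return self.servers[name](command)
            except ConnectionError as error:
                last_error = error  # this node is unreachable, try the next one
        raise ConnectionError("no server available") from last_error


def server_1(command: str) -> str:
    raise ConnectionError("server-1 is down")


def server_2(command: str) -> str:
    return "server-2 ran: " + command


client = FailoverClient({"server-1": server_1, "server-2": server_2},
                        ["server-1", "server-2"])
result = client.execute("select from City")
print(result)
```

The application only sees an error when every known server is unreachable.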


Page 20: OrientDB distributed architecture 1.1

How does replication work?


Page 21: OrientDB distributed architecture 1.1

Synchronous Replication

Guarantees that the two databases are always consistent.

More expensive than asynchronous replication because the first server waits for the second server's answer before sending the ACK back to the client. After the ACK the client is sure the data is stored on multiple nodes at the same time.

[Diagram: Server #1 with DB-1 and Server #2 with DB-2]

Page 22: OrientDB distributed architecture 1.1

Synchronous Replication steps

1) Update record request (Client #1 to Server #1)

2) Update record to DB-1

3) Propagates the update (Server #1 to Server #2)

4) Update record to DB-2

5) Sends back OK to Server #1

6) Sends back OK to Client #1
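The six steps above can be sketched as a single blocking call. The `Server` class and `update_synch` function are hypothetical stand-ins, and the network is replaced by direct method calls.

```python
from typing import Dict


class Server:
    """A server node holding one in-memory database."""

    def __init__(self, name: str):
        self.name = name
        self.db: Dict[str, dict] = {}

    def apply(self, rid: str, value: dict) -> str:
        self.db[rid] = value
        return "OK"


def update_synch(first: Server, second: Server, rid: str, value: dict) -> str:
    """Handle an update request (step 1) in synchronous mode."""
    first.apply(rid, value)            # step 2: update DB-1
    answer = second.apply(rid, value)  # steps 3-5: propagate and wait for OK
    if answer != "OK":
        raise RuntimeError("replication to " + second.name + " failed")
    return "OK"                        # step 6: only now ACK the client


server_1, server_2 = Server("server-1"), Server("server-2")
ack = update_synch(server_1, server_2, "#10:20", {"name": "Rome"})
print(ack, server_1.db == server_2.db)
```

Because the ACK is returned only after the second server answers, both databases hold the record by the time the client continues.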

Page 23: OrientDB distributed architecture 1.1

Asynchronous Replication

Changes are propagated without waiting for the answer.

The two databases could be inconsistent for a few ms. For this reason it's called "Eventually Consistent".

It's much less expensive than synchronous replication.

[Diagram: Server #1 with DB-1 and Server #2 with DB-2]

Page 24: OrientDB distributed architecture 1.1

Asynchronous Replication steps

(4a and 4b are executed in parallel)

1) Update record request (Client #1 to Server #1)

2) Update record to DB-1

3) Propagates the update (Server #1 to Server #2)

4a) Sends back OK to Client #1

4b) Update record to DB-2
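The asynchronous flow can be sketched with a queue of pending updates. To keep the example deterministic, the parallel step 4b is modelled here as an explicit `drain` call rather than a real background thread; `AsyncReplicator` and its methods are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class AsyncReplicator:
    """First server with DB-1, plus a queue of updates headed for DB-2."""

    def __init__(self) -> None:
        self.db1: Dict[str, dict] = {}
        self.db2: Dict[str, dict] = {}
        self.pending: Deque[Tuple[str, dict]] = deque()

    def update(self, rid: str, value: dict) -> str:
        self.db1[rid] = value              # step 2: update DB-1
        self.pending.append((rid, value))  # step 3: hand off the update
        return "OK"                        # step 4a: ACK without waiting

    def drain(self) -> None:
        """Step 4b, applied later here instead of in parallel."""
        while self.pending:
            rid, value = self.pending.popleft()
            self.db2[rid] = value


node = AsyncReplicator()
ack = node.update("#10:20", {"name": "Rome"})
lagging = "#10:20" not in node.db2  # DB-2 has not caught up yet
node.drain()
print(ack, lagging, node.db1 == node.db2)
```

The window between the ACK and the drain is the short period in which the two databases disagree.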

Page 25: OrientDB distributed architecture 1.1

Error Management

During replication the second server could get an error due to a conflict (the record was modified at the same moment by another client) or an I/O problem. In this case the error is logged to disk to be fixed later.

1) Update record request (Client #1 to Server #1)

2) Update record to DB-1

3) Propagates the update (Server #1 to Server #2)

4) Sends back OK to Client #1

5) Update record to DB-2

6) Log the error (Server #2 writes to the Synch Log)
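Steps 5 and 6 can be sketched as follows. The slide does not say how a conflict is detected, so this sketch assumes each record carries a version number and treats a version mismatch as a conflict. The function `apply_remote` and the log entry layout are hypothetical, and the Synch Log is an in-memory list instead of a file on disk.

```python
from typing import Dict, List, Tuple


def apply_remote(db: Dict[str, Tuple[int, dict]], synch_log: List[dict],
                 rid: str, base_version: int, value: dict) -> bool:
    """Apply a replicated update on the second server, logging conflicts."""
    current_version, _ = db.get(rid, (0, {}))
    if current_version != base_version:
        # Step 6: the record changed in the meantime, keep it and log the error.
        synch_log.append({"rid": rid, "error": "conflict", "value": value})
        return False
    db[rid] = (base_version + 1, value)  # step 5: update DB-2
    return True


db2 = {"#10:20": (3, {"name": "Rome"})}
synch_log: List[dict] = []
stale = apply_remote(db2, synch_log, "#10:20", 2, {"name": "Roma"})
fresh = apply_remote(db2, synch_log, "#10:20", 3, {"name": "Roma"})
print(stale, fresh, len(synch_log))
```

The client has already received its OK by the time this runs, which is why the failure is recorded for later repair instead of being reported back.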

Page 26: OrientDB distributed architecture 1.1

Conflict Management

During replication, conflicts could happen if two clients are updating the same record at the same time. The conflict resolution strategy can be plugged in by providing implementations of the OConflictResolver interface.

[Diagram: Server #2 with DB-2 and a pluggable Conflict Strategy]

Page 27: OrientDB distributed architecture 1.1

Conflict Management: Default strategy

The default implementation merges the records: if the same fields were changed, the oldest document wins and the newest is written into the Synch Log.

[Diagram: Server #2 with DB-2, the Default Conflict Strategy and the Synch Log]
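One possible reading of the default strategy is sketched below. It assumes the resolver can see the common base version of the record, which the slide does not state, and it is not the OConflictResolver interface itself: `merge_default` is a hypothetical function written only to illustrate "merge, and on a clash the oldest wins".

```python
from typing import List


def merge_default(base: dict, oldest: dict, newest: dict,
                  synch_log: List[dict]) -> dict:
    """Merge two versions of a document; on a field clash the oldest wins."""
    merged = dict(oldest)
    clash = False
    for name, value in newest.items():
        changed_in_oldest = oldest.get(name) != base.get(name)
        changed_in_newest = value != base.get(name)
        if changed_in_newest and changed_in_oldest and value != oldest.get(name):
            clash = True          # same field changed on both sides
        elif changed_in_newest:
            merged[name] = value  # only the newest touched this field
    if clash:
        synch_log.append(newest)  # keep the losing document for manual review
    return merged


log: List[dict] = []
base = {"name": "Rome", "population": 1}
oldest = {"name": "Roma", "population": 1}
newest = {"name": "ROME", "population": 2}
merged = merge_default(base, oldest, newest, log)
print(merged, log)
```

In this example both documents changed `name`, so the oldest value is kept and the newest document goes to the log, while the non-conflicting `population` change is merged.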

Page 28: OrientDB distributed architecture 1.1

Manual control of conflicts, like SVN/GIT tools


Page 29: OrientDB distributed architecture 1.1

Display the diff of 2 databases
> compare database db1 db2

Copy a record across databases
> copy record #10:20@db1 to #10:20@db2

Copy an entire cluster across databases
> copy cluster city@db1 to city@db2

Merge two records across databases
> merge records #10:20@db1 #10:20@db2 to #10:[email protected]

Page 30: OrientDB distributed architecture 1.1

How are nodes re-aligned once up again after a failure, shutdown or network problem?


Page 31: OrientDB distributed architecture 1.1

During replication all operations are logged using a unique op-id with the format <node>#<serial>

[Diagram: a Client updates a record; Server #1 (DB-1) and Server #2 (DB-2) each write the same entry to their Operation Log]

Operation Log on Server #1, Op-id: 192.168.0.10:2424#123232

Operation Log on Server #2, Op-id: 192.168.0.10:2424#123232


Page 32: OrientDB distributed architecture 1.1

On restart the node asks the Leader which servers it has to synchronize with. Op-ids are used to find the operations it missed.

[Diagram: Server #1 (DB-1) and Server #2 (DB-2) with their Operation Logs]

Operation Log, Op-id: 192.168.1.11:2424#9569

Operation Log, Op-id: 192.168.0.10:2424#123232
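Finding the missed operations from op-ids can be sketched as follows. The sketch assumes the serial grows with every operation logged by a node, which the slides suggest but do not state; `parse_op_id` and `missed` are hypothetical names.

```python
from typing import Dict, List, Tuple


def parse_op_id(op_id: str) -> Tuple[str, int]:
    """Split '<node>#<serial>' into the node ID and the serial number."""
    node, serial = op_id.rsplit("#", 1)
    return node, int(serial)


def missed(remote_log: List[str], last_seen: Dict[str, int]) -> List[str]:
    """Return the op-ids in a remote log that the restarted node never saw."""
    result = []
    for op_id in remote_log:
        node, serial = parse_op_id(op_id)
        if serial > last_seen.get(node, -1):
            result.append(op_id)
    return result


remote_log = [
    "192.168.0.10:2424#123231",
    "192.168.0.10:2424#123232",
    "192.168.0.10:2424#123233",
]
to_replay = missed(remote_log, {"192.168.0.10:2424": 123232})
print(to_replay)
```

The restarted node only needs to remember the last serial it saw per originating node to know where to resume.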


Page 33: OrientDB distributed architecture 1.1

To be consistent or not to be, that is the question


Page 34: OrientDB distributed architecture 1.1

Always consistent: use it as a Master-Slave

Server #1, Master, read + write. All changes go to this server, avoiding conflicts.

Server #2, Synch Slave, read only. Consistent. Leave it as a replica. Since it's always aligned, it's the best candidate as new master if Server #1 is unavailable.

Perfect for Analysis, Business Intelligence and Reports.

Replication is one-way only.

[Diagram: two Clients, Server #1 and Server #2]

Page 35: OrientDB distributed architecture 1.1

Read-only scaling using many asynchronous replicas

Server #1, Master, read + write. All changes go to this server, avoiding conflicts.

Server #2, Synch Slave, read only.

Server #3 ... Server #N, Asynch Slaves, read only. Eventually consistent. Replication cost close to zero.

[Diagram: two Clients, the Master, the Synch Slave and the Asynch Slaves]

Page 36: OrientDB distributed architecture 1.1

Read/Write scaling: Multi-Master + handling conflicts

Server #1, Master, read + write

Server #2, Master, read + write

Server #3, Master, read + write

[Diagram: six Clients connected to the three Masters]

Page 37: OrientDB distributed architecture 1.1

Read/Write scaling + sharding: Multi-Master, no conflict! :-)

Server USA, Master, read + write. Writes on customers_usa.

Server CHI, Master, read + write. Writes on customers_china.

[Diagram: two Clients; both servers hold customers_usa and customers_china]
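The reason sharding removes conflicts is that each shard has exactly one writer. A minimal sketch of that routing rule, using the shard and server names from the slide; the `OWNERS` table and `route_write` function are hypothetical.

```python
# Each shard is written by exactly one master, so two masters
# never update the same record concurrently.
OWNERS = {
    "customers_usa": "Server USA",
    "customers_china": "Server CHI",
}


def route_write(shard: str) -> str:
    """Return the only server allowed to accept writes for a shard."""
    try:
        return OWNERS[shard]
    except KeyError:
        raise ValueError("unknown shard: " + shard) from None


print(route_write("customers_usa"))
print(route_write("customers_china"))
```

Reads can still be served by either master, since both hold a replica of each shard.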

Page 38: OrientDB distributed architecture 1.1

Multi-Master + Sharding = big scale in high-availability and no conflicts

Page 39: OrientDB distributed architecture 1.1


Page 40: OrientDB distributed architecture 1.1

NuvolaBase.com (beta)

The first Graph Database on the Cloud

always available

a few seconds to set it up

use it from Web & Mobile apps


Page 41: OrientDB distributed architecture 1.1

Luca Garulli

Author of the OrientDB and Roma <Meta> Framework Open Source projects

Member of JSR#12 (jdo 1.0) and JSR#243 (jdo 2.0)

CEO at Nuvola Base Ltd

www.twitter.com/lgarulli

@London, UK and @Rome, Italy
