Back to Basics Webinar 3: Introduction to Replica Sets
Back to Basics 2017: Webinar 3, Introduction to Replica Sets
Joe Drumgoole, Director of Developer Advocacy, EMEA
MongoDB, @jdrumgoole
V1.0
3
Summary of Part 1 and 2
• Why NoSQL exists
• The types of NoSQL database
• The key features of MongoDB
• How to install MongoDB
• How to do the basic CRUD operations
• How to create indexes
• How to use the explain() plan
• MongoDB Compass and MongoDB Atlas
4
Agenda
• Data durability
• The MongoDB approach: the replica set
• Replica set life-cycle
• How to program when using a replica set
5
Next Webinar : Introduction to Sharding
• How to build a highly scalable, performant cluster
• How to remove write bottlenecks
• How to select a shard key
Thursday, 9-Feb-2017, 11:00 am GMT.
6
High Availability and Data Durability – Replica Sets
Diagram: one Primary and two Secondaries.
7
Replica Set Creation
Diagram: one Primary and two Secondaries; Heartbeat.
8
Replica Set Node Failure
Diagram: one Primary and two Secondaries; No Heartbeat.
9
Replica Set Recovery
Diagram: two Secondaries; Heartbeat and Election.
10
New Replica Set – 2 Nodes
Diagram: one Secondary and one Primary; Heartbeat and New Primary.
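As a rough illustration of why the two surviving nodes can still produce a new primary: a MongoDB election needs votes from a majority of the voting members. The sketch below models only that vote count, not the real election protocol, and the helper names `majority` and `can_elect_primary` are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members: int, reachable_members: int) -> bool:
    """True if the members that can still see each other form a majority."""
    return reachable_members >= majority(voting_members)


if __name__ == "__main__":
    # Three-node set with one node down: two of three is still a majority.
    print(can_elect_primary(3, 2))
    # Three-node set with two nodes down: the lone survivor cannot win.
    print(can_elect_primary(3, 1))
```

This is one reason replica sets are usually deployed with an odd number of voting members.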
11
Replica Set Repair
Diagram: one Secondary, one Primary, and a second Secondary; Rejoin and resync.
12
Replica Set Stable
Diagram: one Secondary, one Primary, and a second Secondary; Heartbeat.
Developing with Replica Sets
14
Driver Responsibilities
https://github.com/mongodb/mongo-python-driver
Driver:
• Authentication & Security
• Python<->BSON
• Error handling & Recovery
• Wire Protocol
• Topology Management
• Connection Pool
15
Strong Consistency
Diagram: Client Application and Client Driver; Write and Read both go to the Primary; two Secondaries.
16
Eventual Consistency
Diagram: Client Application and Client Driver; Write goes to the Primary; Reads go to the two Secondaries.
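A toy model can make the difference between the two consistency slides concrete. Everything below is a stand-in written for illustration (the `ToyReplicaSet` class is hypothetical and does not talk to MongoDB): a write lands on the primary first, and a read from a secondary only sees it once replication has caught up.

```python
class ToyReplicaSet:
    """In-memory stand-in for one primary and one secondary."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = []  # writes applied on the primary, not yet replicated

    def write(self, key, value):
        # Writes always go to the primary.
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # The secondary applies the pending writes some time later.
        for key, value in self._pending:
            self.secondary[key] = value
        self._pending.clear()


rs = ToyReplicaSet()
rs.write("a", "b")

read_from_primary = rs.primary.get("a")     # sees the write immediately
stale_read = rs.secondary.get("a")          # replication has not happened yet
rs.replicate()
read_after_sync = rs.secondary.get("a")     # secondary has caught up

print(read_from_primary, stale_read, read_after_sync)
```

In PyMongo, whether reads go to the primary or to secondaries is chosen with a read preference such as `pymongo.ReadPreference.SECONDARY`.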
17
Write Concerns
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication
18
Write Concern : 0 (Unacknowledged)
19
Write Concern : 1 (Acknowledged)
20
Write Concern : majority
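The three write concern slides can be summarised as "how many members must acknowledge a write before the driver returns". The function below is a minimal sketch of that count only; `acks_required` is a hypothetical helper, not a driver API, and it ignores journaling and timeouts.

```python
def acks_required(w, voting_members):
    """Number of members that must acknowledge a write for write concern `w`."""
    if w == "majority":
        return voting_members // 2 + 1
    if isinstance(w, int) and 0 <= w <= voting_members:
        # w=0 is unacknowledged, w=1 is acknowledged by the primary only.
        return w
    raise ValueError("unsupported write concern: %r" % (w,))


if __name__ == "__main__":
    for w in (0, 1, "majority"):
        print(w, acks_required(w, voting_members=3))
```

In PyMongo the real setting is expressed with `pymongo.write_concern.WriteConcern`, for example `WriteConcern(w="majority", j=True)`, applied through `collection.with_options(write_concern=...)`; `j=True` corresponds to waiting for the journal sync.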
21
Driver Responsibilities
https://github.com/mongodb/mongo-python-driver
Driver:
• Authentication & Security
• Python<->BSON
• Error handling & Recovery
• Wire Protocol
• Topology Management
• Connection Pool
22
Start MongoClient
from pymongo import MongoClient
client = MongoClient( "host1,host2", replicaSet="replset" )
23
Client Side View
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient( "host1,host2", replicaSet="replset" ).
24
Client Side View
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1 and Monitor Thread 2; ismaster response:
{ ismaster : False, secondary: True, hosts : [ host1, host2, host3 ] }
25
What Does ismaster show?
>>> pprint.pprint( db.command( "ismaster" ))
{u'hosts': [u'JD10Gen-old.local:27017',
            u'JD10Gen-old.local:27018',
            u'JD10Gen-old.local:27019'],
 u'ismaster' : False,
 u'secondary': True,
 u'setName' : u'replset',
 …}
>>>
26
Topology
Current Topology + ismaster response -> New Topology
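The "current topology plus ismaster gives new topology" step can be sketched as a pure function. This is a heavily simplified illustration, not the algorithm from the server discovery and monitoring spec; `update_topology` and the state strings are hypothetical names chosen for the example.

```python
def update_topology(topology, address, reply):
    """Return a new {host: state} mapping after one ismaster reply.

    `topology` maps host names to "primary", "secondary" or "unknown".
    """
    new_topology = dict(topology)

    # Any host listed in the reply is a member; hosts we have not heard
    # from yet start out as "unknown".
    for host in reply.get("hosts", []):
        new_topology.setdefault(host, "unknown")

    if reply.get("ismaster"):
        # Only one primary at a time: forget any previously recorded one.
        for host in list(new_topology):
            if new_topology[host] == "primary":
                new_topology[host] = "unknown"
        new_topology[address] = "primary"
    elif reply.get("secondary"):
        new_topology[address] = "secondary"
    else:
        new_topology[address] = "unknown"
    return new_topology


# Seed list from the slides: host1 and host2. host2 answers first.
seeds = {"host1": "unknown", "host2": "unknown"}
reply_from_host2 = {
    "ismaster": False,
    "secondary": True,
    "hosts": ["host1", "host2", "host3"],
}
after_host2 = update_topology(seeds, "host2", reply_from_host2)
print(after_host2)
```

Note how host3, which was not in the seed list, is discovered from the `hosts` field of the reply.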
27
Client Side View
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1 and Monitor Thread 2 (✔).
28
Client Side View
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3.
29
Client Side View
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code.
30
Next Is Insert
client = MongoClient( "host1,host2", replicaSet="replset" )
client.db.col.insert_one( { "a" : "b" } )
31
Insert Will Block
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; Insert.
32
ismaster response from Host 1
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; Insert; ismaster.
33
Now Write Can Proceed
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; Insert; one further status mark ✔.
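The sequence on the last few slides (the insert blocks, a monitor thread hears from the primary, the write proceeds) can be imitated with a condition variable. This is only a sketch of the idea under simplified assumptions; `ToyTopology` and its methods are hypothetical and are not PyMongo internals.

```python
import threading


class ToyTopology:
    """Shared state between monitor threads and application code."""

    def __init__(self):
        self._cond = threading.Condition()
        self._primary = None

    def on_ismaster(self, address, reply):
        # Called by a monitor thread whenever a server answers ismaster.
        with self._cond:
            if reply.get("ismaster"):
                self._primary = address
                self._cond.notify_all()

    def select_primary(self, timeout):
        # Called by application code before a write; blocks until a
        # primary is known or the timeout expires.
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._primary is not None, timeout=timeout
            )
            if not found:
                raise TimeoutError("no primary found within %.2fs" % timeout)
            return self._primary


topology = ToyTopology()

# host2 answers first, but it is a secondary, so a write still cannot proceed.
topology.on_ismaster("host2", {"ismaster": False, "secondary": True})

# host1's monitor reports in a little later, from another thread.
late_monitor = threading.Timer(
    0.1, topology.on_ismaster, args=("host1", {"ismaster": True})
)
late_monitor.start()

primary = topology.select_primary(timeout=5.0)  # blocks for about 0.1s
late_monitor.join()
print(primary)
```

If no primary ever reports in, `select_primary` raises after the timeout instead of blocking forever, which is the role a server selection timeout plays in a real driver.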
34
Later Host 3 Responds
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✔.
35
Steady State
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✔.
36
Life Intervenes
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖.
37
Monitor may not detect
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖; Insert; ConnectionFailure.
38
So Retry
Diagram: Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖; Insert.
39
Check for Primary
Diagram: Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖; Insert.
40
Host 2 Is Primary
Diagram: Primary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖; Insert.
41
Steady State
Diagram: Primary host1, Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✔.
42
What Does This Mean? - Connect
import pymongo

client = pymongo.MongoClient()
try:
    # A cheap command that forces the driver to actually contact a server
    client.admin.command( "ismaster" )
except pymongo.errors.ConnectionFailure as e:
    print( "Cannot connect: %s" % e )
43
What Does This Mean? - Queries
import logging
import pymongo

def find_with_recovery( collection, query ):
    try:
        return collection.find_one( query )
    except pymongo.errors.ConnectionFailure as e:
        # Retry once; a second failure propagates to the caller
        logging.info( "Connection failure : %s" % e )
        return collection.find_one( query )
44
What Does This Mean? - Inserts
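import logging
import pymongo
from bson import ObjectId

def insert_with_recovery( collection, doc ):
    # Assign _id up front so a retried insert cannot create a second document
    doc[ "_id" ] = ObjectId()
    try:
        collection.insert_one( doc )
    except pymongo.errors.ConnectionFailure as e:
        logging.info( "Connection error: %s" % e )
        try:
            collection.insert_one( doc )
        except pymongo.errors.DuplicateKeyError:
            # The first attempt reached the server before the connection failed
            pass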
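To see why fixing the `_id` before the first attempt makes the retry safe, the pattern can be exercised against a fake collection that stores the document and then "loses" the reply. All classes here are stand-ins invented for the example (`ToyConnectionFailure` and `ToyDuplicateKeyError` mirror the roles of the PyMongo exceptions of similar names, and `uuid` stands in for `ObjectId`), so the sketch runs without a server.

```python
import uuid


class ToyConnectionFailure(Exception):
    """Stand-in for pymongo.errors.ConnectionFailure."""


class ToyDuplicateKeyError(Exception):
    """Stand-in for pymongo.errors.DuplicateKeyError."""


class FlakyCollection:
    """Stores the first insert, then drops the connection before replying."""

    def __init__(self):
        self.docs = {}
        self._drop_next_reply = True

    def insert_one(self, doc):
        if doc["_id"] in self.docs:
            raise ToyDuplicateKeyError(doc["_id"])
        self.docs[doc["_id"]] = dict(doc)
        if self._drop_next_reply:
            self._drop_next_reply = False
            raise ToyConnectionFailure("connection lost after the write")


def insert_with_recovery(collection, doc):
    # The _id is chosen once, so both attempts refer to the same document.
    doc["_id"] = uuid.uuid4().hex
    try:
        collection.insert_one(doc)
    except ToyConnectionFailure:
        try:
            collection.insert_one(doc)
        except ToyDuplicateKeyError:
            # The first attempt had already succeeded on the server.
            pass


collection = FlakyCollection()
insert_with_recovery(collection, {"a": "b"})
print(len(collection.docs))
```

The caller cannot tell whether the first attempt was applied, but with a fixed `_id` either outcome ends with exactly one stored document.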
45
What Does This Mean? - Updates
collection.update_one( { "_id" : 1 }, { "$inc" : { "counter" : 1 }})
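One way to read this slide is as a warning: unlike the insert with a fixed `_id`, an `$inc` is not idempotent, so the retry-once pattern can count twice when the first attempt was applied but its reply was lost. The sketch below demonstrates that with a fake counter; `FlakyCounter` and `ToyConnectionFailure` are hypothetical stand-ins, not PyMongo classes.

```python
class ToyConnectionFailure(Exception):
    """Stand-in for pymongo.errors.ConnectionFailure."""


class FlakyCounter:
    """Applies the increment, then loses the reply to the first attempt."""

    def __init__(self):
        self.counter = 0
        self._drop_next_reply = True

    def increment(self):
        # Plays the role of update_one({"_id": 1}, {"$inc": {"counter": 1}}).
        self.counter += 1
        if self._drop_next_reply:
            self._drop_next_reply = False
            raise ToyConnectionFailure("reply lost after the update was applied")


def increment_with_naive_retry(target):
    try:
        target.increment()
    except ToyConnectionFailure:
        # The client cannot tell whether the first attempt was applied.
        target.increment()


counter = FlakyCounter()
increment_with_naive_retry(counter)
print(counter.counter)  # 2, although the caller asked for a single increment
```

Whether a blind retry is acceptable therefore depends on the operation: it is safe for idempotent writes and needs more care for updates such as `$inc`.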
46
Configuration
connectTimeoutMS : 30s
serverTimeoutMS (serverSelectionTimeoutMS in PyMongo) : 30s
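These settings are usually passed as connection string options, in milliseconds. The sketch below builds such a string for the host names used on the earlier slides; it assumes that "serverTimeoutMS" on the slide refers to the driver option `serverSelectionTimeoutMS`, and `replica_set_uri` is a hypothetical helper written for this example.

```python
from urllib.parse import urlencode

# 30 seconds, as on the slide, expressed in milliseconds.
TIMEOUTS_MS = {
    "connectTimeoutMS": 30_000,
    "serverSelectionTimeoutMS": 30_000,  # assumed meaning of "serverTimeoutMS"
}


def replica_set_uri(hosts, replica_set, options):
    """Build a mongodb:// URI from a seed list, a set name and options."""
    params = {"replicaSet": replica_set}
    params.update(options)
    return "mongodb://%s/?%s" % (",".join(hosts), urlencode(params))


uri = replica_set_uri(["host1", "host2"], "replset", TIMEOUTS_MS)
print(uri)
```

The resulting string can be passed to `pymongo.MongoClient(uri)`; the same options can also be given as keyword arguments.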
47
connectTimeoutMS
Diagram: Secondary host2, Secondary host3; MongoClient with Monitor Thread 1, Monitor Thread 2 (✔), and Monitor Thread 3; Your Code; two further status marks, ✔ and ✖; Insert; the connectTimeoutMS and serverTimeoutMS intervals are marked.
48
More Reading
• The spec author, A. Jesse Jiryu Davis, has a collection of links and his better version of this talk:
https://emptysqua.re/blog/server-discovery-and-monitoring-in-mongodb-drivers/
• The full server discovery and monitoring spec is on GitHub:
https://github.com/mongodb/specifications/blob/master/source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst
Q&A