MongoDB Workshop Universidad de Huelva
Transcript of MongoDB Workshop Universidad de Huelva
Huelva, 22nd April 2016
Juan Antonio Roy Couto
Basics
MongoDB Overview
Twitter Hashtag: #UHUMongoDB
Who am I?
Juan Antonio Roy Couto
❏ MongoDB Master
❏ Financial Software Developer
❏ Email: [email protected]
❏ Twitter: @juanroycouto
❏ Linkedin: https://www.linkedin.com/in/juanroycouto
❏ Slideshare: slideshare.net/juanroycouto
❏ Personal site: http://www.juanroy.es
❏ Contributor at: http://www.mongodbspain.com
Agenda
❏ Basic Concepts
❏ Data Modelling
❏ Installation Types
❏ First Steps & CRUD
❏ Data Analytics With The Aggregation Framework
❏ Indexing
❏ Replica Set
❏ Sharded Cluster
❏ How To Scale Your App
❏ Python Driver Overview
Basic Concepts - Concepts
❏ High Availability
❏ Data Safety
❏ Automatic Failover
❏ Scalability
❏ Faster development
❏ Real time analytics
❏ Better strategic decisions
❏ Reduce costs and time to market
Basic Concepts - Products
https://www.mongodb.com/products/overview
❏ Drivers
❏ Ops & Cloud Manager
❏ Compass
❏ Hadoop & Spark connector
❏ BI connector
❏ Pluggable Storage Engine API
Basic Concepts - Characteristics
http://www.mongodbspain.com/en/2014/08/17/mongodb-characteristics-future/
❏ Open Source General Purpose NoSQL Database
❏ Document Oriented
❏ Non-Structured Data
❏ Schemaless
❏ Security (Authentication & Authorization)
❏ Document Validation, etc
Basic Concepts - SQL Schema Design
Tables
Customers
❏ Customer Key
❏ First Name
❏ Last Name
Addresses
❏ Address Key
❏ Customer Key
❏ Street
❏ Number
❏ Location
Pets
❏ Pet Key
❏ Customer Key
❏ Type
❏ Breed
❏ Name
Basic Concepts - MongoDB Schema Design
Customers Collection
Customers Info
❏ First Name
❏ Last Name
Addresses
❏ Street
❏ Number
❏ Location
Pets (one entry per pet)
❏ Type
❏ Breed
❏ Name
Basic Concepts - JSON Document
> db.customers.findOne()
{
    "_id" : ObjectId("54131863041cd2e6181156ba"),
    "first_name" : "Peter",
    "last_name" : "Keil",
    "address" : {
        "street" : "C/Alcalá",
        "number" : 123,
        "location" : "Madrid"
    },
    "pets" : [
        {
            "type" : "Dog",
            "breed" : "Airedale Terrier",
            "name" : "Linda"
        },
        {
            "type" : "Dog",
            "breed" : "Akita",
            "name" : "Bruto"
        }
    ]
}
>
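The customer document can be mirrored as a plain Python dictionary, which is roughly the shape a driver hands back. This is a minimal sketch: the ObjectId is kept as a plain string so that no driver is needed, and pet_names is a hypothetical helper that is not part of the slides.

```python
import json

# Illustrative copy of the customer document from the slide.
# Assumption: the ObjectId is stored as a plain string to avoid needing a driver.
customer = {
    "_id": "54131863041cd2e6181156ba",
    "first_name": "Peter",
    "last_name": "Keil",
    "address": {"street": "C/Alcalá", "number": 123, "location": "Madrid"},
    "pets": [
        {"type": "Dog", "breed": "Airedale Terrier", "name": "Linda"},
        {"type": "Dog", "breed": "Akita", "name": "Bruto"},
    ],
}


def pet_names(doc):
    """Return the names of all pets embedded in a customer document."""
    return [pet["name"] for pet in doc.get("pets", [])]


# Embedded sub-documents and arrays are reached without any join.
print(customer["address"]["location"])
print(pet_names(customer))
print(json.dumps(customer, ensure_ascii=False, indent=2))
```

One read returns the customer together with the address and the pets, which in the SQL design would take three tables.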
Data Modelling
1:1 Employee-Resume
❏ Access frequency
❏ Document size
❏ Data atomicity
1:N City-Citizen
❏ Two linked collections, from N to 1
N:N Books-Authors
❏ Two collections linked via array
1:Few Post-Comments
❏ One collection with embedded data
Limit: 16 MB per document
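The three linking patterns can be sketched with plain Python dictionaries standing in for documents. All names and values below (city_id, author_ids, the sample people and titles) are hypothetical and chosen only for illustration.

```python
# 1:Few (Post-Comments): the comments are embedded in the post document.
post = {
    "_id": 1,
    "title": "MongoDB Workshop",
    "comments": [
        {"author": "ana", "text": "Great talk"},
        {"author": "luis", "text": "Thanks"},
    ],
}

# 1:N (City-Citizen): two collections, each citizen links to its city (from N to 1).
cities = [{"_id": "huelva", "name": "Huelva"}]
citizens = [
    {"_id": 1, "name": "Juan", "city_id": "huelva"},
    {"_id": 2, "name": "María", "city_id": "huelva"},
]

# N:N (Books-Authors): two collections linked via an array of author ids.
authors = [{"_id": "a1", "name": "Author One"}, {"_id": "a2", "name": "Author Two"}]
books = [
    {"_id": "b1", "title": "Book One", "author_ids": ["a1", "a2"]},
    {"_id": "b2", "title": "Book Two", "author_ids": ["a2"]},
]


def citizens_of(city_id):
    """Follow the N-to-1 link: find every citizen that points at the city."""
    return [c["name"] for c in citizens if c["city_id"] == city_id]


def books_by(author_id):
    """Follow the array link: find every book whose array holds the author id."""
    return [b["title"] for b in books if author_id in b["author_ids"]]


print(len(post["comments"]))
print(citizens_of("huelva"))
print(books_by("a2"))
```

Embedding suits the 1:Few case because the array stays small. A city with millions of citizens would not fit in one document, so the link is stored on the citizen side.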
Installation Types - Standalone
[Diagram: three clients, each using a driver, connected to a single MongoDB instance]
Installation Types - Replica Set
[Diagram: three clients, each using a driver, connected to a replica set made up of one primary and two secondaries]
Installation Types - Sharded Cluster
[Diagram: three clients, each using a driver, connect through three mongos routers to Shard 0, Shard 1, Shard 2 ... Shard N-1; each shard is a replica set with one primary and two secondaries, and three config servers sit alongside the mongos routers]
First Steps & CRUD
❏ Find
❏ Insert
  ❏ Bulk inserts
  ❏ Massive Data Load
❏ Update
❏ Remove
Data Analytics with the Aggregation Framework
Data Analytics Tools
❏ Internals
  ❏ Aggregation Framework
  ❏ Map Reduce
❏ Externals
  ❏ Spark
  ❏ Hadoop
  ❏ Tableau (BI)
  ❏ ...
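An aggregation pipeline is a list of stages that documents flow through in order. This is a hedged sketch: the pipeline has the list-of-dicts shape that PyMongo's aggregate() accepts, and pets_per_breed is a hypothetical pure-Python equivalent run over made-up sample documents so that the result can be inspected without a server.

```python
from collections import Counter

# Pipeline as plain data: unwind the pets array, count pets per breed, sort by count.
pipeline = [
    {"$unwind": "$pets"},
    {"$group": {"_id": "$pets.breed", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
]

# Made-up sample documents, only for illustration.
customers = [
    {"first_name": "Peter", "pets": [{"breed": "Akita"}, {"breed": "Airedale Terrier"}]},
    {"first_name": "Ana", "pets": [{"breed": "Akita"}]},
    {"first_name": "Luis"},  # no pets: contributes nothing after the unwind step
]


def pets_per_breed(docs):
    """Pure-Python equivalent of the pipeline above for in-memory documents."""
    counts = Counter(pet["breed"] for doc in docs for pet in doc.get("pets", []))
    return [{"_id": breed, "total": total} for breed, total in counts.most_common()]


for row in pets_per_breed(customers):
    print(row)
```

Against a live collection the same pipeline would be passed as customers.aggregate(pipeline), and the server would do the grouping.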
Indexing - Types
❏ _id
❏ Single
❏ Compound
❏ Multikey
❏ Full Text
❏ GeoSpatial
❏ Hashed
Indexing - Properties
❏ Unique
❏ Sparse
❏ TTL
❏ Partial
Indexing - Improving Your Queries
.explain()
❏ queryPlanner
❏ executionStats
❏ allPlansExecution
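The main thing to read in the executionStats output is how many documents were examined compared with how many were returned. The toy model below is not MongoDB's engine: it only borrows the stage names COLLSCAN and IXSCAN and the field names totalDocsExamined and nReturned to contrast a full scan with a lookup in a sorted index. The sample data is made up.

```python
import bisect

# Made-up sample collection.
docs = [
    {"_id": i, "city": city}
    for i, city in enumerate(["Huelva", "Madrid", "Sevilla", "Huelva", "Cádiz", "Madrid"])
]


def collection_scan(docs, city):
    """Without an index every document has to be examined."""
    matches = [d for d in docs if d["city"] == city]
    return {"stage": "COLLSCAN", "totalDocsExamined": len(docs), "nReturned": len(matches)}


def build_index(docs, field):
    """Toy single-field index: sorted (value, position) pairs."""
    return sorted((d[field], pos) for pos, d in enumerate(docs))


def index_scan(index, docs, value):
    """With an index only the matching entries are visited."""
    lo = bisect.bisect_left(index, (value, -1))
    hi = bisect.bisect_right(index, (value, len(docs)))
    matches = [docs[pos] for _, pos in index[lo:hi]]
    return {"stage": "IXSCAN", "totalDocsExamined": len(matches), "nReturned": len(matches)}


city_index = build_index(docs, "city")
print(collection_scan(docs, "Huelva"))
print(index_scan(city_index, docs, "Huelva"))
```

When totalDocsExamined is much larger than nReturned in a real explain output, the query is usually a candidate for a new index.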
Replica Set
❏ High Availability
❏ Data Safety
❏ Automatic Node Recovery
❏ Read Preference
❏ Write Concern
[Diagram: a replica set with one primary and two secondaries]
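Both automatic failover and a "majority" write concern depend on a majority of the replica set members. The arithmetic can be sketched as follows, under the simplifying assumption that every member is a voting, data-bearing node. The function names are hypothetical.

```python
def majority(voting_members):
    """Smallest number of members that is more than half of the set."""
    return voting_members // 2 + 1


def write_acknowledged(acks, voting_members, w="majority"):
    """True when enough members have acknowledged the write for write concern w."""
    needed = majority(voting_members) if w == "majority" else int(w)
    return acks >= needed


def can_elect_primary(reachable, voting_members):
    """A new primary can be elected only while a majority can see each other."""
    return reachable >= majority(voting_members)


# Three-member replica set as in the diagram: one primary and two secondaries.
print(majority(3))
print(write_acknowledged(acks=1, voting_members=3))
print(can_elect_primary(reachable=2, voting_members=3))
```

With three members the set survives the loss of one node, which is why three is the usual minimum for a replica set.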
Sharded Cluster
❏ Scale out
❏ Even data distribution across all of the shards based on a shard key
❏ A shard key range belongs to only one shard
❏ More efficient queries (performance)
[Diagram: a cluster with three shards holding the shard key ranges A-I (Shard 0), J-Q (Shard 1) and R-Z (Shard 2)]
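Range-based distribution can be sketched as a lookup of the shard key in a sorted list of range boundaries. The sketch assumes that the ranges A-I, J-Q and R-Z belong to Shard 0, Shard 1 and Shard 2 in that order, and that the shard key is the first letter of a last name. Both are illustrative assumptions.

```python
import bisect

# First letter of each range and the shard that owns it (assumed mapping).
RANGE_STARTS = ["A", "J", "R"]
SHARDS = ["Shard 0", "Shard 1", "Shard 2"]


def shard_for(last_name):
    """Return the single shard whose key range contains the given last name."""
    first = last_name[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError("shard key outside the A-Z ranges: %r" % last_name)
    # bisect_right finds how many range starts are <= first; the owner is the last of them.
    return SHARDS[bisect.bisect_right(RANGE_STARTS, first) - 1]


for name in ["García", "Keil", "Roy"]:
    print(name, "->", shard_for(name))
```

Each key maps to exactly one shard, which is what lets a query that includes the shard key touch a single shard.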
Sharded Cluster - Config Servers
❏ config database
❏ Metadata:
  ❏ Cluster shards list
  ❏ Data per shard (chunk ranges)
  ❏ ...
❏ Replica Set
[Diagram: three config servers deployed as a replica set]
Sharded Cluster - mongos
❏ Receives client requests and returns results.
❏ Reads the metadata and sends the query to the necessary shard or shards.
❏ Does not store data.
❏ Keeps a cached copy of the metadata.
[Diagram: a driver connects to mongos, which routes to Shard 0 ... Shard N-1; each shard is a replica set with one primary and two secondaries, and three config servers form their own replica set]
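The routing decision made by mongos can be sketched as follows: a query that contains the shard key goes to the shard that owns that key, and any other query goes to every shard. The chunk ranges, the last_name shard key and the target_shards function are hypothetical stand-ins for the cached metadata.

```python
# Hypothetical cached metadata: shard name -> inclusive range of first letters.
CHUNK_RANGES = {
    "Shard 0": ("A", "I"),
    "Shard 1": ("J", "Q"),
    "Shard 2": ("R", "Z"),
}


def target_shards(query, shard_key="last_name"):
    """Return the shards a router would have to contact for this query."""
    if shard_key not in query:
        # No shard key in the query: every shard must be asked (scatter-gather).
        return sorted(CHUNK_RANGES)
    first = query[shard_key][0].upper()
    # Shard key present: only the shard that owns the range is contacted.
    return [shard for shard, (lo, hi) in CHUNK_RANGES.items() if lo <= first <= hi]


print(target_shards({"last_name": "Keil"}))
print(target_shards({"first_name": "Peter"}))
```

Queries that include the shard key scale with the number of shards, while scatter-gather queries cost more as shards are added.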
How To Scale Your App - Shard Key
❏ Monotonically Increasing
❏ Easily divisible
❏ Randomness
❏ Cardinality
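The effect of a monotonically increasing key compared with a randomised one can be simulated in a few lines. In this sketch MD5 is only a stand-in for a hashed shard key and is not claimed to reproduce MongoDB's own hashing. The chunk boundaries and key values are made up.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Upper bounds of the first three ranges, fixed before the new inserts arrive.
# Keys of 750 and above fall into the last range.
BOUNDARIES = [250, 500, 750]


def range_shard(key):
    """Range-based placement: the first range whose upper bound exceeds the key."""
    for shard, upper in enumerate(BOUNDARIES):
        if key < upper:
            return shard
    return len(BOUNDARIES)


def hashed_shard(key):
    """Hash-based placement: the key is scrambled before it is assigned."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


new_keys = range(1000, 2000)  # monotonically increasing keys, e.g. a counter
range_counts = Counter(range_shard(k) for k in new_keys)
hashed_counts = Counter(hashed_shard(k) for k in new_keys)

print("range :", dict(range_counts))
print("hashed:", dict(hashed_counts))
```

With the increasing key every new insert lands on the last shard, while the hashed key spreads the same inserts over all four shards. The price is that range queries on a hashed key can no longer be targeted at one shard.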
How To Scale Your App - Sharding a Collection
[Diagram: a client connects through mongos to Shard 0, Shard 1, Shard 2 and Shard 3; chunks are moved between shards by migrations]
How To Scale Your App - Pre-Splitting
Useful for storing data directly in the shards (massive data loads).
Avoids bottlenecks.
MongoDB does not need to split or migrate chunks during the load.
After the split, the chunk migrations must finish before the data load starts.
[Diagram: a cluster with three shards; chunks 1 to 5 are distributed across Shard 0, Shard 1 and Shard 2]
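A pre-split plan amounts to choosing split points and deciding which shard receives each empty chunk. The sketch below assumes an integer shard key between 0 and 1000 and a round-robin assignment. split_points and plan_chunks are hypothetical helpers, and in a real cluster the plan would still have to be applied with the server's split and chunk-move commands.

```python
def split_points(min_key, max_key, num_chunks):
    """Return the num_chunks - 1 keys that cut an integer range into equal chunks."""
    step = (max_key - min_key) / num_chunks
    return [min_key + round(step * i) for i in range(1, num_chunks)]


def plan_chunks(min_key, max_key, num_chunks, shards):
    """Pair each chunk range with a shard, assigning shards round-robin."""
    bounds = [min_key] + split_points(min_key, max_key, num_chunks) + [max_key]
    return [
        {"chunk": i + 1, "range": (bounds[i], bounds[i + 1]), "shard": shards[i % len(shards)]}
        for i in range(num_chunks)
    ]


# Five chunks over three shards, as in the diagram.
plan = plan_chunks(0, 1000, 5, ["Shard 0", "Shard 1", "Shard 2"])
for chunk in plan:
    print(chunk)
```

Once the empty chunks are in place, the load writes to all shards in parallel instead of filling one shard and waiting for the balancer.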
How To Scale Your App - Tag-Aware Sharding
Tags are used when you want to pin ranges to a specific shard.
❏ shard0: EMEA
❏ shard1: APAC
❏ shard2: LATAM
❏ shard3: NORAM
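Pinning by tag can be modelled as two lookups: from a document's shard key value to a tag, and from the tag to the shard that carries it. The shard-to-tag mapping comes from the slide. The country field and the country-to-region table are hypothetical additions for the example.

```python
# Tag carried by each shard, as on the slide.
SHARD_TAGS = {"shard0": "EMEA", "shard1": "APAC", "shard2": "LATAM", "shard3": "NORAM"}

# Hypothetical key ranges: which tag each country code falls under.
TAG_BY_COUNTRY = {"ES": "EMEA", "JP": "APAC", "BR": "LATAM", "US": "NORAM"}


def shards_for_tag(tag):
    """All shards that carry the given tag."""
    return sorted(shard for shard, t in SHARD_TAGS.items() if t == tag)


def pinned_shard(doc):
    """Shard that must store the document, based on the tag of its country."""
    tag = TAG_BY_COUNTRY[doc["country"]]
    return shards_for_tag(tag)[0]


print(pinned_shard({"name": "Juan", "country": "ES"}))
print(pinned_shard({"name": "Aiko", "country": "JP"}))
```

A typical use is keeping each region's data on hardware located in that region.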
Python Driver - Overview
script1.py
from pymongo import MongoClient

# Connect to a mongod running locally on the default port
connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers

# PyMongo names this method find_one(); findOne() is the shell spelling
item = customers.find_one()
print(item['firstname'])

$ python script1.py
Python Driver - CRUD
Operation   PyMongo       Server
Finding     find          find
            find_one      findOne
Inserting   insert_one    insert
            insert_many   bulk
Updating    update_one    update
            update_many   update
            replace_one   update
Deleting    delete_one    remove
            delete_many   remove
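Python Driver - CRUD Examples
Insert
pedro = { 'firstname': 'Pedro', 'lastname': 'García' }
maria = { 'firstname': 'María', 'lastname': 'Pérez' }
docs = [ pedro, maria ]
# insert_many expects a list of documents, so pass docs itself, not [docs]
customers.insert_many(docs)
Update
# customer_id holds the _id of an existing document
# Operators such as '$set' must be quoted strings in Python
customers.update_one({ '_id': customer_id }, { '$set': { 'city': 'Huelva' } })
Remove
customers.delete_one({ '_id': customer_id })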
Python Driver - Cursors And Exceptions
from pymongo import MongoClient

connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers

query = { 'firstname': 'Juan' }
projection = { 'city': 1, '_id': 0 }

try:
    cursor = customers.find(query, projection)
    # The query only reaches the server when the cursor is iterated,
    # so the loop stays inside the try block
    for doc in cursor:
        print(doc['city'])
except Exception as e:
    print('Unexpected error:', type(e), e)
Resources
❏ Official MongoDB Documentation
  ❏ https://docs.mongodb.org/manual/
❏ Posts via MongoDB Spain
  ❏ http://www.mongodbspain.com/en/
  ❏ http://www.mongodbspain.com/es/
❏ Cheat Sheet
  ❏ http://www.mongodbspain.com/es/2014/03/23/mongodb-cheat-sheet-quick-reference/
❏ The Little MongoDB Book
  ❏ http://openmymind.net/mongodb.pdf
Questions?
Thank you for your attention!
MongoDB Workshop
Huelva, 22nd April 2016
Juan Antonio Roy Couto