MongoDB Workshop Universidad de Huelva

Huelva, 22nd April 2016, Juan Antonio Roy Couto

Transcript of MongoDB Workshop Universidad de Huelva

Page 1: MongoDB Workshop Universidad de Huelva

Huelva, 22nd April 2016

Juan Antonio Roy Couto

Basics

Page 2: MongoDB Workshop Universidad de Huelva

Twitter Hashtag

#UHUMongoDB

Page 3: MongoDB Workshop Universidad de Huelva

Who am I?

Juan Antonio Roy Couto

❏ MongoDB Master

❏ Financial Software Developer

❏ Email: [email protected]

❏ Twitter: @juanroycouto

❏ Linkedin: https://www.linkedin.com/in/juanroycouto

❏ Slideshare: slideshare.net/juanroycouto

❏ Personal site: http://www.juanroy.es

❏ Contributor at: http://www.mongodbspain.com

Page 4: MongoDB Workshop Universidad de Huelva

Agenda

❏ Basic Concepts

❏ Data Modelling

❏ Installation Types

❏ First Steps & CRUD

❏ Data Analytics With The Aggregation Framework

❏ Indexing

❏ Replica Set

❏ Sharded Cluster

❏ How To Scale Your App

❏ Python Driver Overview

Page 5: MongoDB Workshop Universidad de Huelva

Basic Concepts - Concepts

❏ High Availability

❏ Data Safety

❏ Automatic Failover

❏ Scalability

❏ Faster development

❏ Real time analytics

❏ Better strategic decisions

❏ Reduce costs and time to market

Page 6: MongoDB Workshop Universidad de Huelva

Basic Concepts - Products

https://www.mongodb.com/products/overview

❏ Drivers

❏ Ops & Cloud Manager

❏ Compass

❏ Hadoop & Spark connector

❏ BI connector

❏ Pluggable Storage Engine API

Page 7: MongoDB Workshop Universidad de Huelva

Basic Concepts - Characteristics

http://www.mongodbspain.com/en/2014/08/17/mongodb-characteristics-future/

❏ Open Source General Purpose NoSQL Database

❏ Document Oriented

❏ Non-Structured Data

❏ Schemaless

❏ Security (Authentication & Authorization)

❏ Document Validation, etc.

Page 8: MongoDB Workshop Universidad de Huelva

Basic Concepts - SQL Schema Design

Tables

Customers

❏ Customer Key

❏ First Name

❏ Last Name

Addresses

❏ Address Key

❏ Customer Key

❏ Street

❏ Number

❏ Location

Pets

❏ Pet Key

❏ Customer Key

❏ Type

❏ Breed

❏ Name

Page 9: MongoDB Workshop Universidad de Huelva

Basic Concepts - MongoDB Schema Design

Customers Collection

Customers Info

❏ First Name

❏ Last Name

Addresses

❏ Street

❏ Number

❏ Location

Pets (one entry per pet)

❏ Type

❏ Breed

❏ Name

Page 10: MongoDB Workshop Universidad de Huelva

Basic Concepts - JSON Document

> db.customers.findOne()
{
    "_id" : ObjectId("54131863041cd2e6181156ba"),
    "first_name" : "Peter",
    "last_name" : "Keil",
    "address" : {
        "street" : "C/Alcalá",
        "number" : 123,
        "location" : "Madrid"
    },
    "pets" : [
        {
            "type" : "Dog",
            "breed" : "Airedale Terrier",
            "name" : "Linda"
        },
        {
            "type" : "Dog",
            "breed" : "Akita",
            "name" : "Bruto"
        }
    ]
}
>
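
As a rough illustration, the same customer can be written as a plain Python dictionary, which is the shape a driver such as PyMongo works with. This is only a sketch: the `_id` is left out on the assumption that the server assigns one on insert, and the `query` filter is an assumed example of looking up customers by pet breed with dot notation, not something shown on the slide.

```python
# The customer document from the slide as a Python dict.
# "_id" is omitted; the server normally assigns an ObjectId on insert.
customer = {
    "first_name": "Peter",
    "last_name": "Keil",
    "address": {"street": "C/Alcalá", "number": 123, "location": "Madrid"},
    "pets": [
        {"type": "Dog", "breed": "Airedale Terrier", "name": "Linda"},
        {"type": "Dog", "breed": "Akita", "name": "Bruto"},
    ],
}

# Embedded data is reached by walking the nested structure.
city = customer["address"]["location"]
pet_names = [pet["name"] for pet in customer["pets"]]

# Assumed example of a filter that could be passed to find():
# dot notation reaches into the embedded "pets" array.
query = {"pets.breed": "Akita"}

print(city, pet_names)
```

Because the address and pets travel inside the customer document, no join is needed to read them back.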

Page 11: MongoDB Workshop Universidad de Huelva

Data Modelling

1:1 Employee-Resume

❏ Access frequency

❏ Document size

❏ Data atomicity

1:N City-Citizen

❏ Two linked collections, from N to 1

N:N Books-Authors

❏ Two collections linked via array

1:Few Post-Comments

❏ One collection with embedded data

Limit: 16 MB per document
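
The three linking patterns on this slide can be sketched with plain Python dictionaries standing in for documents. All field names and values below (`comments`, `city_id`, `author_ids`, the sample people and titles) are hypothetical and chosen only for illustration; the helper functions mimic in memory what an application would do with separate queries.

```python
# 1:Few (Post-Comments): comments are embedded in the post document.
post = {
    "_id": 1,
    "title": "MongoDB basics",
    "comments": [
        {"author": "ana", "text": "Nice talk"},
        {"author": "luis", "text": "Thanks"},
    ],
}

# 1:N (City-Citizen): two collections, each citizen links to its city (N to 1).
cities = [{"_id": "huelva", "name": "Huelva"}]
citizens = [
    {"_id": 1, "name": "Ana", "city_id": "huelva"},
    {"_id": 2, "name": "Luis", "city_id": "huelva"},
]

# N:N (Books-Authors): two collections linked via an array of ids.
authors = [{"_id": 1, "name": "A. Author"}, {"_id": 2, "name": "B. Writer"}]
books = [{"_id": 10, "title": "Some Book", "author_ids": [1, 2]}]


def citizens_of(city_id):
    """Resolve the N side of a 1:N link by filtering on the reference."""
    return [c["name"] for c in citizens if c["city_id"] == city_id]


def authors_of(book):
    """Resolve an N:N link by looking up each id stored in the array."""
    by_id = {a["_id"]: a["name"] for a in authors}
    return [by_id[i] for i in book["author_ids"]]


print(len(post["comments"]), citizens_of("huelva"), authors_of(books[0]))
```

Embedding suits the 1:Few case because the comments are read together with the post; the reference on the N side keeps a city document from growing toward the per-document size limit.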

Page 12: MongoDB Workshop Universidad de Huelva

Installation Types - Standalone

[Diagram: three clients, each connecting through its driver to a single MongoDB instance.]

Page 13: MongoDB Workshop Universidad de Huelva

Installation Types - Replica Set

[Diagram: three clients, each connecting through its driver to a replica set made up of one primary and two secondaries.]

Page 14: MongoDB Workshop Universidad de Huelva

Installation Types - Sharded Cluster

[Diagram: three clients, each connecting through its driver to mongos routers (three shown), which are backed by three config servers. Data lives in Shard 0, Shard 1, Shard 2 ... Shard N-1, and each shard is a replica set with one primary and two secondaries.]

Page 15: MongoDB Workshop Universidad de Huelva

First Steps & CRUD

❏ Find

❏ Insert

    ❏ Bulk inserts

    ❏ Massive Data Load

❏ Update

❏ Remove

Page 16: MongoDB Workshop Universidad de Huelva

Data Analytics with the Aggregation Framework

Page 17: MongoDB Workshop Universidad de Huelva

Data Analytics Tools

❏ Internal

    ❏ Aggregation Framework

    ❏ Map Reduce

❏ External

    ❏ Spark

    ❏ Hadoop

    ❏ Tableau (BI)

    ❏ ...
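
To give a feel for the Aggregation Framework, here is a minimal sketch that counts pets per breed. The `pipeline` list is written in the form that could be passed to a PyMongo collection's `aggregate()` method; the field names follow the customer document shown earlier, while the second customer and the helper `count_pets_by_breed` are assumptions added so the example runs without a server. The helper is a plain-Python imitation of the same stages, not the server's implementation.

```python
from collections import defaultdict

# Stages: one output document per pet, then a count per breed, then sort.
pipeline = [
    {"$unwind": "$pets"},
    {"$group": {"_id": "$pets.breed", "total": {"$sum": 1}}},
    {"$sort": {"total": -1}},
]

# Sample data; the second customer is hypothetical.
customers = [
    {"first_name": "Peter", "pets": [{"breed": "Airedale Terrier"}, {"breed": "Akita"}]},
    {"first_name": "Ana", "pets": [{"breed": "Akita"}]},
]


def count_pets_by_breed(docs):
    """Plain-Python imitation of the $unwind / $group / $sort stages above."""
    totals = defaultdict(int)
    for doc in docs:
        for pet in doc.get("pets", []):   # $unwind
            totals[pet["breed"]] += 1     # $group with $sum: 1
    rows = [{"_id": breed, "total": n} for breed, n in totals.items()]
    # $sort by total descending; the breed name breaks ties deterministically
    return sorted(rows, key=lambda r: (-r["total"], r["_id"]))


print(count_pets_by_breed(customers))
```

With a running server the equivalent call would be `db.customers.aggregate(pipeline)`, which does the same work inside the database instead of in application code.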

Page 18: MongoDB Workshop Universidad de Huelva

Indexing - Types

❏ _id

❏ Single

❏ Compound

❏ Multikey

❏ Full Text

❏ Geospatial

❏ Hashed

Page 19: MongoDB Workshop Universidad de Huelva

Indexing - Properties

❏ Unique

❏ Sparse

❏ TTL

❏ Partial

Page 20: MongoDB Workshop Universidad de Huelva

Indexing - Improving Your Queries

.explain()

❏ queryPlanner

❏ executionStats

❏ allPlansExecution

Page 21: MongoDB Workshop Universidad de Huelva

Replica Set

❏ High Availability

❏ Data Safety

❏ Automatic Node Recovery

❏ Read Preference

❏ Write Concern

[Diagram: a replica set with one primary and two secondaries.]
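
Read preference and write concern are commonly set through options in the connection string. The sketch below only builds such a string and does not connect to anything. The host names and the replica set name `rs0` are hypothetical; `replicaSet`, `readPreference` and `w` are standard MongoDB connection string options, and the values chosen here are just one possible configuration.

```python
from urllib.parse import urlencode

# Hypothetical members of a three-node replica set.
hosts = [
    "node1.example.net:27017",
    "node2.example.net:27017",
    "node3.example.net:27017",
]

options = {
    "replicaSet": "rs0",                     # hypothetical replica set name
    "readPreference": "secondaryPreferred",  # read from a secondary when one is available
    "w": "majority",                         # acknowledge writes once a majority has them
}

uri = "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)
print(uri)
```

A string like this could be handed to `MongoClient(uri)`; listing several members lets the driver find the current primary after a failover.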

Page 22: MongoDB Workshop Universidad de Huelva

Sharded Cluster

❏ Scale out

❏ Data is distributed evenly across all of the shards, based on a shard key

❏ A shard key range belongs to only one shard

❏ More efficient queries (performance)

[Diagram: a cluster of three shards. Shard 0 holds keys A-I, Shard 1 holds J-Q, Shard 2 holds R-Z.]
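
The idea that each shard key range belongs to exactly one shard can be sketched as a range lookup. The ranges A-I, J-Q and R-Z come from the slide; the function name `shard_for` and the choice of the first letter of a name as shard key are assumptions made for illustration, and the real routing is done by mongos using the cluster metadata.

```python
from bisect import bisect_right

# First letter that starts each shard's range (A-I, J-Q, R-Z on the slide).
RANGE_STARTS = ["A", "J", "R"]
SHARDS = ["Shard 0", "Shard 1", "Shard 2"]


def shard_for(key):
    """Return the single shard whose range contains the key's first letter."""
    first = key[:1].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"key {key!r} is outside the A-Z ranges")
    # bisect_right finds how many range starts are <= first;
    # the range that owns the key is the last of those.
    return SHARDS[bisect_right(RANGE_STARTS, first) - 1]


for name in ["Adams", "Keil", "Roy"]:
    print(name, "->", shard_for(name))
```

A query that includes the shard key can be sent to one shard only, which is where the efficiency gain listed on the slide comes from.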

Page 23: MongoDB Workshop Universidad de Huelva

Sharded Cluster - Config Servers

❏ config database

❏ Metadata:

    ❏ Cluster shards list

    ❏ Data per shard (chunk ranges)

    ❏ ...

❏ Replica Set

[Diagram: three config servers deployed as a replica set.]

Page 24: MongoDB Workshop Universidad de Huelva

Sharded Cluster - mongos

❏ Receives client requests and returns results.

❏ Reads the metadata and sends the query to the necessary shard or shards.

❏ Does not store data.

❏ Keeps a cached copy of the metadata.

[Diagram: a client driver connects to mongos, which consults three config servers and routes requests to Shard 0 ... Shard N-1. Each shard is a replica set with one primary and two secondaries.]

Page 25: MongoDB Workshop Universidad de Huelva

How To Scale Your App - Shard Key

❏ Monotonically Increasing

❏ Easily Divisible

❏ Randomness

❏ Cardinality
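
Why monotonic growth and randomness matter for a shard key can be seen in a small simulation. The sketch assumes four range-based chunks and 1000 new inserts; the split points, the key values and the use of MD5 as a stand-in for a hashed shard key are all assumptions for illustration and do not reproduce MongoDB's own hashing.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def chunk_index(value, split_points):
    """Return the index of the range chunk that contains value."""
    return bisect_right(split_points, value)


def hashed(value):
    """Map a key to a pseudo-random 32-bit integer (stand-in for a hashed key)."""
    digest = hashlib.md5(str(value).encode()).digest()
    return int.from_bytes(digest[:4], "big")


# Four chunks over the keys seen so far (0..999) and over the 32-bit hash space.
monotonic_splits = [250, 500, 750]
hash_splits = [2**30, 2**31, 3 * 2**30]

new_keys = range(1000, 2000)  # 1000 inserts with an ever-increasing key

monotonic_load = Counter(chunk_index(k, monotonic_splits) for k in new_keys)
hashed_load = Counter(chunk_index(hashed(k), hash_splits) for k in new_keys)

print("monotonic key:", dict(monotonic_load))
print("hashed key:   ", dict(hashed_load))
```

With the increasing key every new document lands in the last chunk, so one shard takes all the write load; with the randomised key the inserts spread over all four chunks.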

Page 26: MongoDB Workshop Universidad de Huelva

How To Scale Your App - Sharding a Collection

[Diagram: a client connects through mongos to four shards (Shard 0 to Shard 3), with chunk migrations taking place between the shards.]

Page 27: MongoDB Workshop Universidad de Huelva

How To Scale Your App - Pre-Splitting

Useful for storing data directly in the shards (massive data loads).

Avoids bottlenecks.

MongoDB does not need to split or migrate chunks during the load.

After the split, chunk migration must finish before the data load starts.

[Diagram: a cluster of three shards (Shard 0, Shard 1, Shard 2) with Chunks 1 to 5 distributed among them.]
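
A pre-split plan can be prepared before any data is loaded. The sketch below picks evenly spaced split points over a known key space and builds command documents in the shape of the `split` and `moveChunk` admin commands, without sending them anywhere. The namespace `test.customers`, the shard key `lastname`, the shard names and the helper `split_points` are all hypothetical, and the plan assumes the first chunk stays on the shard where the collection was created.

```python
from string import ascii_uppercase

NAMESPACE = "test.customers"             # hypothetical database.collection
SHARD_KEY = "lastname"                   # hypothetical shard key field
SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def split_points(keys, n_chunks):
    """Pick n_chunks - 1 evenly spaced boundaries from a sorted list of keys."""
    step = len(keys) / n_chunks
    return [keys[round(i * step)] for i in range(1, n_chunks)]


# Five chunks over first letters A..Z, matching the five chunks on the slide.
points = split_points(list(ascii_uppercase), 5)

# One split per boundary; each boundary becomes the lower bound of a new chunk.
split_commands = [{"split": NAMESPACE, "middle": {SHARD_KEY: p}} for p in points]

# Spread the new chunks round-robin; the first chunk is assumed to stay put.
move_commands = [
    {"moveChunk": NAMESPACE, "find": {SHARD_KEY: p}, "to": SHARDS[i % len(SHARDS)]}
    for i, p in enumerate(points, start=1)
]

for command in split_commands + move_commands:
    print(command)
```

Once such commands have been run and the migrations have completed, a bulk load writes straight into chunks that already sit on their final shards.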

Page 28: MongoDB Workshop Universidad de Huelva

How To Scale Your App - Tag-Aware Sharding

Tags are used when you want to pin ranges to a specific shard.

shard0: EMEA

shard1: APAC

shard2: LATAM

shard3: NORAM
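
The effect of tagging can be modelled as a lookup from a document's region to the shard carrying that tag. The shard-to-tag assignment is the one on the slide; the `country` field, the country-to-region table and the function `shard_for` are assumptions for illustration. In a real cluster the pinning is declared with tags on shards and on shard key ranges, and the balancer moves the chunks accordingly.

```python
# Tag assigned to each shard, as on the slide.
SHARD_TAGS = {"shard0": "EMEA", "shard1": "APAC", "shard2": "LATAM", "shard3": "NORAM"}

# Hypothetical mapping from a document's country code to a region tag.
COUNTRY_REGION = {"ES": "EMEA", "JP": "APAC", "BR": "LATAM", "US": "NORAM"}


def shard_for(document):
    """Return the shard whose tag matches the document's region."""
    region = COUNTRY_REGION[document["country"]]
    for shard, tag in SHARD_TAGS.items():
        if tag == region:
            return shard
    raise LookupError(f"no shard tagged {region}")


print(shard_for({"firstname": "Pedro", "country": "ES"}))
```

Pinning data this way is typically used to keep each region's documents on hardware close to that region's users.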

Page 29: MongoDB Workshop Universidad de Huelva

Python Driver - Overview

script1.py

from pymongo import MongoClient

# Connect to a local mongod on the default port
connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers

# find_one() is PyMongo's equivalent of the shell's findOne()
item = customers.find_one()
print(item['firstname'])

$ python script1.py

Page 30: MongoDB Workshop Universidad de Huelva

Python Driver - CRUD

            PyMongo        Server
Finding     find           find
            find_one       findOne
Inserting   insert_one     insert
            insert_many    bulk
Updating    update_one     update
            update_many    update
            replace_one    update
Deleting    delete_one     remove
            delete_many    remove

Page 31: MongoDB Workshop Universidad de Huelva

Python Driver - CRUD Examples

Insert

pedro = {'firstname': 'Pedro', 'lastname': 'García'}
maria = {'firstname': 'María', 'lastname': 'Pérez'}
docs = [pedro, maria]
# insert_many() takes the list of documents itself, not a list wrapped in a list
customers.insert_many(docs)

Update

# Operator names such as $set must be quoted strings in Python
customers.update_one({'_id': customer_id}, {'$set': {'city': 'Huelva'}})

Remove

customers.delete_one({'_id': customer_id})

Page 32: MongoDB Workshop Universidad de Huelva

Python Driver - Cursors And Exceptions

import sys
from pymongo import MongoClient

connection = MongoClient('localhost', 27017)
db = connection.test
customers = db.customers

query = {'firstname': 'Juan'}
projection = {'city': 1, '_id': 0}

try:
    # find() returns a lazy cursor, so errors can surface while iterating it
    cursor = customers.find(query, projection)
    for doc in cursor:
        print(doc['city'])
except Exception as e:
    print('Unexpected error:', type(e), e, file=sys.stderr)

Page 33: MongoDB Workshop Universidad de Huelva

Resources

❏ Official MongoDB Documentation

    ❏ https://docs.mongodb.org/manual/

❏ Posts via MongoDB Spain

    ❏ http://www.mongodbspain.com/en/

    ❏ http://www.mongodbspain.com/es/

❏ Cheat Sheet

    ❏ http://www.mongodbspain.com/es/2014/03/23/mongodb-cheat-sheet-quick-reference/

❏ The Little MongoDB Book

    ❏ http://openmymind.net/mongodb.pdf

Page 34: MongoDB Workshop Universidad de Huelva

Questions?

Page 35: MongoDB Workshop Universidad de Huelva

Thank you for your attention!

MongoDB Workshop

Huelva, 22nd April 2016

Juan Antonio Roy Couto