Basic Concepts, Webinar 3: Document-Oriented Schema Design
Basic Concepts 2016
Document-Oriented Schema Design
Rubén Terceño
Senior Solutions Architect, [email protected]
@rubenTerceno
Welcome!
Course Agenda
| Date | Time | Webinar |
|---|---|---|
| 25-May-2016 | 16:00 CEST | Introduction to NoSQL |
| 7-June-2016 | 16:00 CEST | Your first MongoDB application |
| 21-June-2016 | 16:00 CEST | Document-oriented schema design |
| 07-July-2016 | 16:00 CEST | Advanced indexing, text and geospatial indexes |
| 19-July-2016 | 16:00 CEST | Introduction to the Aggregation Framework |
| 28-July-2016 | 16:00 CEST | Production deployment |
Summary of Webinars 1 and 2
• Why does NoSQL exist?
• Types of NoSQL databases
• Key features of MongoDB
• Installing MongoDB and creating databases and collections
• CRUD operations
• Indexes and explain()
Thinking in Documents
• MongoDB documents are JS objects (JSON)
• They are stored encoded as BSON
  • BSON is “Binary JSON”
  • BSON is an efficient way to encode and decode JSON
  • Required for efficient transmission and storage on disk
  • Eliminates the need to “text parse” all the sub-objects
• To learn more: http://bsonspec.org/
Example Document
{
  name : "Rubén Terceño",
  title : "Senior Solutions Architect",
  employee_number : 653,
  location : {
    type : "Point",
    coordinates : [ 43.34, -3.26 ]
  },
  expertise : [ "MongoDB", "Java", "Geospatial" ],
  address : {
    address1 : "Rutilo 11",
    address2 : "Piso 1, Oficina 2",
    zipcode : "28041"
  }
}
• Fields hold typed values: String (name, title), Number (employee_number), Geo-Location (location)
• Fields can contain arrays (expertise)
• Fields can contain sub-documents (address)
Some Example Queries
• Find all Solutions Architects
db.mongo.find({ title : "Solutions Architect" })
• Find all employees who know Java and work in Support or Consulting
db.mongo.find({ expertise : "Java", department : { $in : [ "Support", "Consulting" ] } })
• Find all employees in my postcode (the zipcode is stored as a string, so the filter value must be a string too)
db.mongo.find({ "address.zipcode" : "28041" })
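The example document stores the zipcode as a string, so a filter on "address.zipcode" has to supply a string; a number will not match. The sketch below illustrates that, plus the "any element matches" rule for array fields, in plain JavaScript with no server involved. `getPath` is a hypothetical helper written for this illustration, not part of the MongoDB shell.

```javascript
// Hypothetical helper: resolve a dotted path such as "address.zipcode".
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Trimmed copy of the example document.
const employee = {
  name: "Rubén Terceño",
  expertise: ["MongoDB", "Java", "Geospatial"],
  address: { zipcode: "28041" },
};

// Equality is type-sensitive: the string "28041" is not the number 28041.
const matchesString = getPath(employee, "address.zipcode") === "28041";
const matchesNumber = getPath(employee, "address.zipcode") === 28041;

// A scalar condition on an array field matches when any element equals it.
const knowsJava = employee.expertise.includes("Java");

console.log(matchesString, matchesNumber, knowsJava);
```

This only mirrors the matching rules for these three cases; the real query engine handles many more operators and types.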
Modelling and Cardinality
• One to One
  • Author to blog post
• One to Many
  • Blog post to comments
• One to Millions
  • Blog post to site views (e.g. Huffington Post)
One To One Relationships
• “Belongs to” relationships are often embedded
• Holistic representation of entities with their embedded attributes and relationships
• Great read performance
Most important:
• Keeps simple things simple
• Frees up time to tackle harder schema issues
One To One Relationships
{
  "Title" : "This is a blog post",
  "Author" : {
    name : "Rubén Terceño",
    login : "[email protected]"
  },
  …
}
We can index on "Title" and "Author.login".
One to Many - Embedding
{
  "_id" : ObjectId( "ZZZZ" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : [
    { name : "Juan Amores", email : "[email protected]", comment : "I love your writing style" },
    { name : "Pedro Víbora", email : "[email protected]", comment : "I hate your writing style" }
  ]
}
Where we expect a small number of sub-documents, we can embed them in the main document.
Key Concerns
• What are the write patterns?
  • Comments are added more frequently than posts
  • Comments may have images, tags, large bodies of text
• What are the read patterns?
  • Comments may not be displayed
  • May be shown in their own window
  • People rarely look at all the comments
One to Many – Linking I
• Keep all comments in a separate comments collection
• Add a reference to the post ID in each comment
• Requires two queries to display a blog post and its associated comments
Comments collection:
{
  _id : ObjectId( "AAAA" ),
  post_id : ObjectId( "ZZZZ" ),
  name : "Juan Amores",
  email : "[email protected]",
  comment : "I love your writing style"
}
{
  _id : ObjectId( "AAAB" ),
  post_id : ObjectId( "ZZZZ" ),
  name : "Pedro Víbora",
  email : "[email protected]",
  comment : "I hate your writing style"
}
Posts collection (each post needs its own unique _id):
{ "_id" : ObjectId( "ZZZZ" ), "Title" : "A Blog Title", "Body" : "A blog post" }
{ "_id" : ObjectId( "YYYY" ), "Title" : "Another Blog Title", "Body" : "Another blog post" }
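The two-query read described above can be sketched in plain JavaScript. The arrays are in-memory stand-ins for the posts and comments collections, the IDs are plain strings for readability, and `loadPostWithComments` is a hypothetical helper rather than a MongoDB API; against a real deployment each step would be a query on its own collection.

```javascript
// In-memory stand-ins for the two collections (illustrative data only).
const posts = [{ _id: "ZZZZ", Title: "A Blog Title", Body: "A blog post" }];
const comments = [
  { _id: "AAAA", post_id: "ZZZZ", name: "Juan Amores", comment: "I love your writing style" },
  { _id: "AAAB", post_id: "ZZZZ", name: "Pedro Víbora", comment: "I hate your writing style" },
];

function loadPostWithComments(postId) {
  // Query 1: fetch the post by its _id.
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  // Query 2: fetch every comment that references the post.
  const postComments = comments.filter((c) => c.post_id === postId);
  return { ...post, comments: postComments };
}

const page = loadPostWithComments("ZZZZ");
console.log(page.Title, page.comments.length);
```

With this layout the second lookup filters on post_id, so that field would normally be indexed in the comments collection.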
One to Many – Linking II
• Keep all comments in a separate comments collection
• Add references to comments as an array of comment IDs in the post
• Requires two queries to display a blog post and its associated comments
• Requires two writes to create a comment
Comments collection:
{
  _id : ObjectId( "AAAA" ),
  name : "Joe Drumgoole",
  email : "[email protected]",
  comment : "I love your writing style"
}
{
  _id : ObjectId( "AAAB" ),
  name : "John Smith",
  email : "[email protected]",
  comment : "I hate your writing style"
}
Posts collection (each post needs its own unique _id):
{
  "_id" : ObjectId( "ZZZZ" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : [ ObjectId( "AAAA" ), ObjectId( "AAAB" ) ]
}
{
  "_id" : ObjectId( "YYYY" ),
  "Title" : "A Blog Title",
  "Body" : "A blog post",
  "comments" : []
}
One To Many – Hybrid Approach
Posts collection (only the latest comments are embedded):
{
  _id : ObjectId( "ZZZZ" ),
  Title : "A Blog Title",
  Body : "A blog post",
  last_comments : [
    { _id : ObjectId( "AAAA" ), name : "Juan Amores", comment : "I love your writing style" },
    { _id : ObjectId( "AAAB" ), name : "Pedro Víbora", comment : "I hate your writing style" }
  ]
}
Comments collection (every comment, in full):
[
  {
    "_id" : ObjectId( "AAAA" ),
    "post_id" : ObjectId( "ZZZZ" ),
    "name" : "Juan Amores",
    "email" : "[email protected]",
    "comment" : "I love your writing style"
  },
  {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}, {...}
]
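A minimal sketch of the hybrid write path, in plain JavaScript with in-memory stand-ins for both collections. `addComment`, the cap `MAX_EMBEDDED`, the string IDs, and the email addresses are assumptions made for this illustration. On a real server the second write would likely be an update using `$push` with the `$each` and `$slice` modifiers, but that is not shown in the slides.

```javascript
// In-memory stand-ins for the posts and comments collections.
const posts = new Map([
  ["ZZZZ", { _id: "ZZZZ", Title: "A Blog Title", Body: "A blog post", last_comments: [] }],
]);
const comments = [];

const MAX_EMBEDDED = 2; // assumed cap on how many comments stay embedded

function addComment(postId, comment) {
  // Write 1: the full comment goes to the comments collection.
  comments.push({ ...comment, post_id: postId });

  // Write 2: a trimmed copy is appended to the post, keeping only the newest.
  const post = posts.get(postId);
  post.last_comments.push({ _id: comment._id, name: comment.name, comment: comment.comment });
  post.last_comments = post.last_comments.slice(-MAX_EMBEDDED);
}

// Hypothetical sample data.
addComment("ZZZZ", { _id: "AAAA", name: "Juan Amores", email: "juan@example.com", comment: "I love your writing style" });
addComment("ZZZZ", { _id: "AAAB", name: "Pedro Víbora", email: "pedro@example.com", comment: "I hate your writing style" });
addComment("ZZZZ", { _id: "AAAC", name: "Ana Ruiz", email: "ana@example.com", comment: "Nice post" });

console.log(posts.get("ZZZZ").last_comments.map((c) => c._id));
```

The post page can then be rendered from a single read, and the full comment history is only fetched when a reader asks for it.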
Linking vs. Embedding
• Embedding
  • Terrific for read performance
    • Webapp “front pages” and pre-aggregated material
  • Writes can be slow
  • Data integrity needs to be managed
• Linking
  • Flexible
  • Data integrity is built-in
  • Work is done during reads
Let’s do crazy things!
• What if we were tracking mouse position for heat tracking?
  • Each user will generate hundreds of data points per visit
  • Thousands of data points per post
  • Millions of data points per blog site
• Relational-like model
  • Store a blog ID per event
  • Be polymorphic, my friend!
{
  "post_id" : ObjectId( "ZZZZ" ),
  "timestamp" : ISODate( "2005-01-02T16:35:24Z" ),
  "event" : { type : "click", position : [ 240, 345 ] }
}
{
  "post_id" : ObjectId( "ZZZZ" ),
  "timestamp" : ISODate( "2005-01-02T16:35:24Z" ),
  "event" : { type : "close" }
}
What if we use the structure?
{
  post_id : ObjectId( "ZZZZ" ),
  cookie_id : "R34oitwrFWt945tw34t4569tiwemrti",
  timeStamp : ISODate( "2005-01-02T16:00:00Z" ),
  events : {
    0 : { 0 : { event }, 1 : { event }, … 59 : { event } },
    1 : { 0 : { event }, 1 : { event }, … 59 : { event } },
    2 : { 0 : { event }, 1 : { event }, … 59 : { event } },
    3 : { 0 : { event }, 1 : { event }, … 59 : { event } },
    ...
    59 : { 0 : { event }, 1 : { event }, … 59 : { event } }
  }
}
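One way to read this structure is one document per post, visitor, and hour, with the outer keys as minutes and the inner keys as seconds. That reading is an assumption; the slide does not label the keys. Under it, the sketch below derives which hourly document an event belongs to and the dotted field path inside it. `eventSlot` is a hypothetical helper name.

```javascript
// Hypothetical helper: map an event time to its hourly document and to the
// nested "events.<minute>.<second>" path inside that document.
function eventSlot(date) {
  const hour = new Date(date);
  hour.setUTCMinutes(0, 0, 0); // truncate to the start of the hour
  return {
    timeStamp: hour.toISOString(),
    path: `events.${date.getUTCMinutes()}.${date.getUTCSeconds()}`,
  };
}

const slot = eventSlot(new Date("2005-01-02T16:35:24Z"));
console.log(slot.timeStamp); // 2005-01-02T16:00:00.000Z
console.log(slot.path);      // events.35.24
```

A dotted path like this could then be used as the key of a `$set` update, so that each event touches a single nested field instead of rewriting the whole document.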
What if we build buckets?
{
  post_id : ObjectId( "ZZZZ" ),
  cookie_id : "R34oitwrFWt945tw34t4569tiwemrti",
  count : 98,
  events : [ { event }, { event }, { event } ... ]
}
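A minimal in-memory sketch of the bucket idea: events for the same post and visitor are appended to one document until it reaches a size cap, and then a new bucket is started. `recordEvent`, `BUCKET_SIZE`, and the cap of 100 are assumptions for illustration; the slide does not state a limit. Against a real server this is commonly done with a single upserting update that filters on `count`, pushes the event, and increments the counter.

```javascript
const BUCKET_SIZE = 100; // assumed cap on events per bucket
const buckets = []; // in-memory stand-in for the events collection

function recordEvent(postId, cookieId, event) {
  // Reuse the open bucket for this post and visitor, if there is one.
  let bucket = buckets.find(
    (b) => b.post_id === postId && b.cookie_id === cookieId && b.count < BUCKET_SIZE
  );
  if (!bucket) {
    // Otherwise start a new, empty bucket.
    bucket = { post_id: postId, cookie_id: cookieId, count: 0, events: [] };
    buckets.push(bucket);
  }
  bucket.events.push(event);
  bucket.count += 1;
  return bucket;
}

// 250 events from one visitor on one post end up in three buckets.
for (let i = 0; i < 250; i++) {
  recordEvent("ZZZZ", "R34oitwrFWt945tw34t4569tiwemrti", { type: "click", position: [i, i] });
}
console.log(buckets.map((b) => b.count));
```

Keeping `count` alongside the array lets the writer pick a non-full bucket without measuring the array on every insert.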
Data Governance with Doc. Validation
Implement data governance without sacrificing the agility that comes from a dynamic schema
• Enforce data quality across multiple teams and applications
• Use familiar MongoDB expressions to control document structure
• Validation is optional and can be as simple as a single field, all the way to every field, including existence, data types, and regular expressions
Document Validation Example
The example below adds a rule to the contacts collection that validates:
• The year of birth is no later than 1998
• The document contains a phone number and / or an email address
• When present, the phone number and email address are strings
db.runCommand({
  collMod : "contacts",
  validator : {
    $and : [
      { year_of_birth : { $lte : 1998 } },
      { $or : [
        { phone : { $type : "string" } },
        { email : { $type : "string" } }
      ] }
    ]
  }
})
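To make the rule easier to reason about, here is the same logic written as a plain JavaScript predicate. It is a mirror for illustration only, not the server's implementation, and `isValidContact` is a hypothetical name.

```javascript
// Plain-JavaScript mirror of the validator: year_of_birth <= 1998 AND
// (phone is a string OR email is a string).
function isValidContact(doc) {
  const bornInTime = typeof doc.year_of_birth === "number" && doc.year_of_birth <= 1998;
  const hasPhone = typeof doc.phone === "string";
  const hasEmail = typeof doc.email === "string";
  return bornInTime && (hasPhone || hasEmail);
}

console.log(isValidContact({ year_of_birth: 1990, phone: "555-0100" })); // true
console.log(isValidContact({ year_of_birth: 2001, email: "ana@example.com" })); // false
console.log(isValidContact({ year_of_birth: 1990 })); // false
```

Note that with an `$or` of two type checks, a document with a string phone and a non-string email still passes, so the rule guarantees that at least one of the two fields is a string, not that both are whenever present.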
Summary
• Schema design is different in MongoDB
  • But basic data design principles stay the same
• Focus on how an application accesses/manipulates data
• Seek out and capture belongs-to 1:1 relationships
• Don’t get stuck in “one record per item” thinking
  • Embrace the hierarchy and think about cardinality
• Evolve the schema to meet requirements as they change
• Be polymorphic!
• Updates to a single document are atomic, like transactions
• Use validation to your advantage
Next Webinar
Advanced indexing, text and geospatial indexes
• July 7, 2016 – 16:00 CEST, 11:00 ART, 9:00
• Register if you have not done so yet!
• Text indexes allow “Google-like” searches across all fields of all records in the dataset.
• Geospatial indexes help us run queries using positions, both simple (proximity, distance, etc.) and advanced (intersection, inclusion, etc.)
• Register at: https://www.mongodb.com/webinars
• Please give us your feedback: [email protected]
Questions?