Some notes on NoSQL, in particular MongoDB
Bettina Berendt (with thanks to Matthijs van Leeuwen for some of the slides)
8 December 2015
Overview: NoSQL
‘Not only SQL’
No tables, but storage of, e.g.
– Collections of documents (e.g. JSON, XML)
– Key-value pairs
– Columns of values
– Graphs
– Objects
– …
NoSQL
Advantages
– Horizontally scalable (as opposed to vertically)
– No static schema or data model
– Cheaper to maintain
Disadvantages
– Capabilities can be very system-specific; there is no universal query language
– Often, some coding is necessary
– Fewer and/or weaker theoretical guarantees
Example systems: NoSQL
MongoDB, the most popular system for document stores (see https://en.wikipedia.org/wiki/MongoDB and references there)
MongoDB is "schema-free"!
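A minimal sketch of what "schema-free" means in practice. A plain JavaScript array stands in for a collection here, so this is an illustration rather than MongoDB itself, and the second document and its fields are made up:

```javascript
// A plain array stands in for a MongoDB collection (illustration only).
const collection = [];

// Two documents with different fields can live side by side:
// no table definition has to be declared or altered first.
collection.push({ item: "ABC1", category: "clothing" });
collection.push({ item: "DEF2", price: 4.5, tags: ["food", "snacks"] }); // hypothetical document

// Collect the field names used by each document.
const fieldSets = collection.map((doc) => Object.keys(doc).sort().join(","));
console.log(fieldSets);
```

In a relational table, the second record would require columns that the first one does not have; here each document simply carries its own fields.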
Understanding the MongoDB / NoSQL notion of "document"
– Good example of what computer scientists call "semi-structured data" (see previous week)
– But actually fairly structured in comparison to e.g. a textual document:
  • MongoDB's format is called BSON, a binary form of JSON
  • See https://en.wikipedia.org/wiki/JSON, https://en.wikipedia.org/wiki/BSON
– Note: JSON can be thought of as an alternative to XML, as described for example on the (certainly not disinterested) http://www.json.org/xml.html, but not the type of XML you often see for annotating texts, as for example in the Letters of 1916 project
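To make the notion of a "document" concrete, here is a small sketch in plain JavaScript. It shows only the JSON side (nested fields, embedded documents, arrays); the binary BSON encoding is not shown, and the sample values are illustrative:

```javascript
// A document is a nested structure of fields, embedded documents and arrays.
const doc = {
  item: "ABC1",
  details: { model: "14Q3", manufacturer: "XYZ Company" },
  stock: [{ size: "S", qty: 25 }, { size: "M", qty: 50 }],
};

// Serialise to JSON text and parse it back; the structure survives the round trip.
const text = JSON.stringify(doc);
const parsed = JSON.parse(text);

// Fields are addressed by name, at any nesting depth.
const manufacturer = parsed.details.manufacturer;
const totalQty = parsed.stock.reduce((sum, s) => sum + s.qty, 0);
console.log(manufacturer, totalQty);
```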
INSERT a row (SQL): insert a document (MongoDB)
db.inventory.insert(
  {
    item: "ABC1",
    details: {
      model: "14Q3",
      manufacturer: "XYZ Company"
    },
    stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
    category: "clothing"
  }
)
SELECT (SQL): find documents (MongoDB)
db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
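The two queries correspond roughly to SELECT * FROM inventory WHERE type IN ('food', 'snacks') and SELECT * FROM inventory WHERE type = 'food' AND price < 9.95. The sketch below mimics this matching logic on an in-memory array. It is not MongoDB's implementation: the sample documents and the helpers matchesCondition and find are made up for illustration, and only equality, $in and $lt are covered.

```javascript
// Sample data; the documents are made up for this illustration.
const inventory = [
  { item: "apple", type: "food", price: 1.2 },
  { item: "caviar", type: "food", price: 49.0 },
  { item: "crisps", type: "snacks", price: 2.5 },
  { item: "shirt", type: "clothing", price: 9.0 },
];

// Check one field condition: either a plain value (equality) or an
// object holding the operators $in and/or $lt.
function matchesCondition(value, condition) {
  if (condition === null || typeof condition !== "object") {
    return value === condition;
  }
  return Object.entries(condition).every(([op, arg]) => {
    if (op === "$in") return arg.includes(value);
    if (op === "$lt") return value < arg;
    throw new Error("operator not covered by this sketch: " + op);
  });
}

// A document matches when every field condition holds (implicit AND).
function find(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, cond]) => matchesCondition(doc[field], cond))
  );
}

const foodOrSnacks = find(inventory, { type: { $in: ["food", "snacks"] } });
const cheapFood = find(inventory, { type: "food", price: { $lt: 9.95 } });
console.log(foodOrSnacks.map((d) => d.item));
console.log(cheapFood.map((d) => d.item));
```

Note that listing several fields in one query document combines them with AND, which is why the second query returns only inexpensive food items.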
SELECT and SORT
UPDATE (SQL): update documents (MongoDB)
db.inventory.update(
  { item: "MNO2" },
  {
    $set: {
      category: "apparel",
      details: { model: "14Q3", manufacturer: "XYZ" }
    },
    $currentDate: { lastModified: true }
  }
)
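A rough in-memory sketch of what the two update operators do to a matching document. The helper applyUpdate and the stored document are hypothetical and only handle top-level field names (no dot notation); the point illustrated is that $set with a whole embedded document replaces the old embedded document, and that $currentDate fills in the current date.

```javascript
// One stored document; the values are made up for this illustration.
const stored = {
  item: "MNO2",
  category: "clothing",
  details: { model: "10A1", manufacturer: "ABC", color: "red" },
};

// Apply the two operators used above to a copy of the document.
function applyUpdate(doc, update, now) {
  const result = { ...doc };
  // $set: assign each listed field; an embedded document given as a whole
  // value replaces the old embedded document entirely.
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  // $currentDate: fields marked true receive the current date.
  for (const [field, flag] of Object.entries(update.$currentDate || {})) {
    if (flag === true) result[field] = now;
  }
  return result;
}

const now = new Date();
const updated = applyUpdate(
  stored,
  {
    $set: { category: "apparel", details: { model: "14Q3", manufacturer: "XYZ" } },
    $currentDate: { lastModified: true },
  },
  now
);
console.log(updated);
```

After the update, the hypothetical color field is gone, because the whole details document was replaced rather than merged.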
Other useful constructs ...
... such as GROUP BY are also available (see the Wikipedia description)
... and Python interfaces exist.
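In MongoDB, grouping is expressed through the aggregation framework (a $group stage). As a rough illustration of the result such a grouping produces, here is the same idea on an in-memory array; the sample documents and the qty field are made up:

```javascript
// Made-up inventory documents with a quantity each.
const inventory = [
  { item: "ABC1", category: "clothing", qty: 75 },
  { item: "MNO2", category: "apparel", qty: 10 },
  { item: "XYZ3", category: "clothing", qty: 5 },
];

// Group by category and sum qty, comparable to
// SELECT category, SUM(qty) FROM inventory GROUP BY category.
const totals = {};
for (const doc of inventory) {
  totals[doc.category] = (totals[doc.category] || 0) + doc.qty;
}
console.log(totals);
```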
Setting indexes in MongoDB: Usage (1): BSON structure
Given the following document in the users collection
{
  "_id" : ObjectId(...),
  "name" : "Alice",
  "age" : 27,
  "score" : 25
}
the following command creates an index on the score field:
db.users.createIndex( { "score" : 1 } )
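To see why such an index helps, consider this simplified sketch. A sorted array stands in for the B-tree structure that MongoDB uses, and the users other than Alice, as well as the helpers lowerBound and findByMinScore, are made up for illustration. The 1 in the index specification means ascending order.

```javascript
// Made-up users; only "score" matters here.
const users = [
  { name: "Alice", age: 27, score: 25 },
  { name: "Bob", age: 31, score: 12 },
  { name: "Carol", age: 45, score: 30 },
  { name: "Dave", age: 22, score: 18 },
];

// Build the "index": (score, position) pairs sorted ascending by score,
// which is what the 1 in { "score": 1 } asks for.
const scoreIndex = users
  .map((u, pos) => ({ score: u.score, pos }))
  .sort((a, b) => a.score - b.score);

// Binary search for the first index entry with score >= min.
function lowerBound(index, min) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].score < min) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range query "score >= min": results come back already sorted by score,
// without scanning or sorting the documents themselves.
function findByMinScore(min) {
  return scoreIndex.slice(lowerBound(scoreIndex, min)).map((e) => users[e.pos]);
}

const result = findByMinScore(18).map((u) => u.name);
console.log(result);
```

Without the index, every document would have to be inspected and the matches sorted afterwards; with it, the lookup jumps to the first qualifying entry and reads on in order.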
Usage (2)
SELECT and SORT (shown with reference to an index)
Importance for DHers?
– Certainly growing, but probably not necessary for everyone
My personal rule of thumb:
– Very useful if
  • you already know the query you need (for example because you have worked it out on a small data sample, with SQL, Python, or whatever), and
  • you need to process LOTS of data
– Less useful for very exploratory analysis, since there you may need a universal query language.