Codemotion Milano 2014 - MongoDB and the Internet of Things

Post on 07-Jul-2015


Time series are a classic example of the flexibility of the document approach. In this presentation you will see how to shape documents to create a schema optimized for time series.

Transcript of Codemotion Milano 2014 - MongoDB and the Internet of Things

Massimo Brignoli

Senior Solutions Architect

MongoDB Inc.

massimo@mongodb.com

@massimobrignoli

MongoDB and

The Internet of Things

The Problem

• If you're thinking about designing an ideal data

structure for your Internet of Things application, then

here's what you should do:

don't do it.

The Problem

• The Internet of Things requires a great deal of

flexibility.

Why?

• Because there are billions of heterogeneous objects

that will begin interacting with each other in ways we

can't predict.

• The structured and rigid tables offered by traditional

databases won't help us because they require a pre-

defined set of properties and tables, which again,

we can't predict.

The Problem

• Let's say we want to measure water levels in a large

number of wells. A simplified data architecture for

this application would look like this:

The Problem

• This looks just fine and should work perfectly using

a relational database. But then, 2 years after the

system has been up and running, someone has an

idea:

"Hey, now that we bought these new Internet-enabled

diesel generators to power the water pumps, let's see

their live data!"

The Problem

• To make this change, we would have to add a new

table called "Power Plants" and a new column to the

table "Wells":

The Solution

• A great way of handling IoT data is the document-

oriented approach

• Instead of fixed tables, columns, and rows, you have

documents describing each object.
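
As a minimal sketch of this idea, the well example could be stored as documents like the ones below. The field names (wellId, waterLevelCm, generator) and all values are assumptions made up for illustration; they do not come from the slides.

```javascript
// Two wells stored side by side in the same (hypothetical) collection.
// The second well gained a generator sub-document later; the first
// document did not have to change.
const wells = [
  { wellId: "well-001", waterLevelCm: 412 },
  {
    wellId: "well-002",
    waterLevelCm: 388,
    generator: { fuelPct: 73, running: true }
  }
];

// Readers simply check whether the optional part is present.
function describeWell(well) {
  const power = well.generator
    ? `generator at ${well.generator.fuelPct}% fuel`
    : "no generator data";
  return `${well.wellId}: ${well.waterLevelCm} cm, ${power}`;
}

wells.forEach((well) => console.log(describeWell(well)));
```

The point of the sketch is that the new generator data lives inside the object it belongs to, so no new table or column is required.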

MongoDB

Document Database

Open-Source

General Purpose

Documents Are Core

Relational MongoDB

{

first_name: "Paul",

surname: "Miller",

city: "London",

location: [45.123,47.232],

cars: [

{ model: "Bentley",

year: 1973,

value: 100000, … },

{ model: "Rolls Royce",

year: 1965,

value: 330000, … }

]

}

Modeling time series data

in MongoDB

Rexroth NEXO Cordless Nutrunner

Time series schema design goal

• Store event data

• Support Analytical Queries

• Find best compromise of:

- Memory utilization

- Write performance

- Read/Analytical Query Performance

• Accomplish with realistic amount of hardware

Modeling time series data

• Document per event

• Document per minute (average)

• Document per minute (by second)

• Document per hour

Document per event

• Relational-centric approach

• Insert-driven workload

{

deviceId: "Test123",

timestamp: ISODate("2014-07-03T22:07:38.000Z"),

temperature: 21

}
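
A rough sketch of what this layout implies, assuming one reading per second per device (the rate is an assumption, chosen to match the per-second examples on the later slides). The helper name eventDocument is hypothetical.

```javascript
// Build one document per reading: every measurement becomes an insert.
function eventDocument(deviceId, timestamp, temperature) {
  return { deviceId, timestamp: new Date(timestamp), temperature };
}

// At one reading per second, each device produces this many documents a day.
const READINGS_PER_DAY = 24 * 60 * 60;

const event = eventDocument("Test123", "2014-07-03T22:07:38.000Z", 21);
console.log(JSON.stringify(event));
console.log(`${READINGS_PER_DAY} documents per device per day`);
```

The document count grows linearly with the sampling rate, which is the cost the bucketed layouts on the following slides try to reduce.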

Document per minute (average)

• Pre-aggregate to compute average per minute more easily

• Update-driven workload

• Resolution at the minute level

{

deviceId: "Test123",

timestamp: ISODate("2014-07-03T22:07:00.000Z"),

temperature_num: 18,

temperature_sum: 357

}
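
One way to drive this layout is to increment both counters on every reading. The sketch below only builds the filter and update documents as plain objects; the helper names (minuteBucket, averageUpdate, average) are hypothetical.

```javascript
// Truncate a timestamp to the start of its minute (UTC).
function minuteBucket(ts) {
  const d = new Date(ts);
  d.setUTCSeconds(0, 0);
  return d;
}

// Filter and update for one reading: bump the count and the running sum.
function averageUpdate(deviceId, ts, temperature) {
  return {
    filter: { deviceId, timestamp: minuteBucket(ts) },
    update: { $inc: { temperature_num: 1, temperature_sum: temperature } }
  };
}

// The average is derived at read time from the two stored fields.
function average(doc) {
  return doc.temperature_sum / doc.temperature_num;
}

const reading = averageUpdate("Test123", "2014-07-03T22:07:38.000Z", 21);
console.log(JSON.stringify(reading));
console.log(average({ temperature_num: 18, temperature_sum: 357 }).toFixed(2));
```

With the official Node.js driver, the two objects would typically be passed to updateOne(filter, update, { upsert: true }) so that the first reading of a minute creates the document.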

Document per minute (by second)

• Store per-second data at the minute level

• Update-driven workload

• Pre-allocate structure to avoid document moves

{

deviceId: "Test123",

timestamp: ISODate("2014-07-03T22:07:00.000Z"),

temperature: { 0: 18, 1: 18, …, 58: 21, 59: 21 }

}
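
The pre-allocation and the per-second update could be sketched as follows. The placeholder value used to fill empty slots is an assumption (the slide does not say what to pre-allocate with), and the helper names are hypothetical.

```javascript
// Create the minute document up front with all 60 slots present.
// The placeholder (default 0) is an assumption, not taken from the slides.
function preallocateMinute(deviceId, minuteStart, placeholder = 0) {
  const temperature = {};
  for (let second = 0; second < 60; second++) {
    temperature[second] = placeholder;
  }
  return { deviceId, timestamp: new Date(minuteStart), temperature };
}

// Build an update that overwrites exactly one slot using dot notation.
function secondUpdate(deviceId, ts, value) {
  const d = new Date(ts);
  const second = d.getUTCSeconds();
  d.setUTCSeconds(0, 0); // the document is keyed by the start of the minute
  return {
    filter: { deviceId, timestamp: d },
    update: { $set: { [`temperature.${second}`]: value } }
  };
}

const minuteDoc = preallocateMinute("Test123", "2014-07-03T22:07:00.000Z");
const slotUpdate = secondUpdate("Test123", "2014-07-03T22:07:38.000Z", 21);
console.log(Object.keys(minuteDoc.temperature).length);
console.log(JSON.stringify(slotUpdate.update));
```

Because every slot already exists, each reading rewrites a value in place instead of adding a new field to the document.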

Document per hour (by second)

• Store per-second data at the hourly level

• Update-driven workload

• Pre-allocate structure to avoid document moves

• Updating last second requires 3599 steps

{

deviceId: "Test123",

timestamp: ISODate("2014-07-03T22:00:00.000Z"),

temperature: { 0: 18, 1: 18, …, 3598: 20, 3599: 20 }

}

Document per hour (by second)

• Store per-second data at the hourly level with nesting

• Update-driven workload

• Pre-allocate structure to avoid document moves

• Updating last second requires 59 + 59 steps

{

deviceId: "Test123",

timestamp: ISODate("2014-07-03T22:00:00.000Z"),

temperature: {

0: { 0: 18, …, 59: 18 },

…,

59: { 0: 21, …, 59: 20 }

}

}
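
The step counts quoted for the flat and nested hourly layouts can be checked with a small sketch. It assumes, as the slides do, that reaching a field means stepping over the fields stored before it at the same level; the function names are hypothetical.

```javascript
// Dot-notation path of a reading inside the nested hourly document.
function nestedPath(ts) {
  const d = new Date(ts);
  return `temperature.${d.getUTCMinutes()}.${d.getUTCSeconds()}`;
}

// Flat layout: one level with 3600 keys, so second N has N keys before it.
function flatSteps(secondOfHour) {
  return secondOfHour;
}

// Nested layout: skip whole minutes first, then seconds within the minute.
function nestedSteps(secondOfHour) {
  const minute = Math.floor(secondOfHour / 60);
  const second = secondOfHour % 60;
  return minute + second;
}

console.log(nestedPath("2014-07-03T22:59:59.000Z"));
console.log(flatSteps(3599), nestedSteps(3599));
```

For the last second of the hour this gives 3599 steps in the flat layout against 59 + 59 = 118 in the nested one.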

Rexroth NEXO schema

{

_id: ObjectId("52ecf3d6bf1e623a52000001"),

assetId: "NEXO 109",

hour: ISODate("2014-07-03T22:00:00.000Z"),

status: "Online",

type: "Nutrunner",

serialNo : "100-210-ABC",

ip: "127.0.0.1",

positions: {

0: {

0: { x: "10", y: "40", zone: "itc-1", accuracy: "20" },

…,

59: { x: "15", y: "30", zone: "itc-1", accuracy: "25" }

},

…,

59: {

0: { x: "22", y: "27", zone: "itc-1", accuracy: "22" },

…,

59: { x: "18", y: "23", zone: "itc-1", accuracy: "24" }

}

}

}

Demo

How to scale

Scaling Up

Scaling Out

First Edition (1771)

3 Volumes

Fifteenth Edition (2010)

32 Volumes

Shards and Shard Keys

Shard

Shard key

range
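
The idea of routing by shard key range can be illustrated with a simplified sketch. The chunk boundaries, shard names, and the numeric shard key are all made-up assumptions; in a real cluster the routing is handled by MongoDB itself, not by application code.

```javascript
// Hypothetical chunk table: each range of shard key values lives on one shard.
// Lower bound is inclusive, upper bound is exclusive.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 2000, shard: "shard1" },
  { min: 2000, max: Infinity, shard: "shard2" }
];

// Find the shard whose range contains the given shard key value.
function shardFor(key) {
  const chunk = chunks.find((c) => key >= c.min && key < c.max);
  return chunk.shard;
}

console.log(shardFor(42), shardFor(1000), shardFor(2500));
```

Adding capacity then means splitting a range and moving part of it to a new shard, much like adding volumes to an encyclopedia.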

Why is MongoDB a good fit for IoT?

• IoT processes are real-time

• Relational technologies simply cannot compete

on cost, performance, scalability, and

manageability

• IoT data can come in any format, structured or

unstructured, ranging from text and numbers to

audio, pictures, and video

• Time series data is a natural fit

• IoT applications often require geographically

distributed systems

Thank you!