Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries

Post on 26-Jan-2015

description

In this session we will dive into some of the use cases for which companies in the energy space are currently deploying MongoDB. It is becoming more important for companies to make data-driven decisions, and MongoDB can often be the right tool for analyzing the massive amounts of data coming in. Whether the source is oil well site statistics, power meter data, or sensor feeds, MongoDB can be a great fit for tracking and analyzing that data and using it to make smart, informed business decisions.

Transcript of Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries

MongoDB Usage within Oil, Gas, and Energy

Kevin Hanson

Senior Account Executive / Solutions Architect, MongoDB Inc.

@hungarianhc ~ kevin@mongodb.com

#mongodb

Agenda

• Common Themes in MongoDB Usage

• What is MongoDB?

• Use-Cases and Examples

• Thinking Ahead

• Questions

Common Themes in MongoDB Usage

Machine Generated Data

Fast Moving Data

• Hundreds of thousands of records per second

• Fast response required

• Sometimes all data kept, sometimes just summary

• Horizontal scalability required

Massive Amounts of Data

• Widely applicable data model

• Applies to several different “data use cases”

• Various schema and modeling options

• Application requirements drive schema design

Data is Structured, but Varied…

• A machine generates a specific kind of data

• The data model is unlikely to change

• But there are so many different machines…

• Queryability across all types

Time Series Data

• Event data written multiple times per second, minute, or hour

• Tracking progression of metrics over time

What is MongoDB?

MongoDB is a ___________ database

• Open source

• High performance

• Full featured

• Document-oriented

• Horizontally scalable

Full Featured

• Dynamic (ad-hoc) queries

• Built-in online aggregation

• Rich query capabilities

• Traditionally consistent

• Many advanced features

• Support for many programming languages

Document-Oriented Database

• A document is a nestable associative array

• Document schemas are flexible

• Documents can contain various data types (numbers, text, timestamps, blobs, etc.)
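As a rough illustration of the bullets above, the following JavaScript sketch shows two documents with different shapes handled side by side. Every field name and value here is invented for illustration and does not come from the talk.

```javascript
// Two hypothetical sensor readings. All field names are illustrative assumptions.
const pressureReading = {
  sensor: "pump-01",                      // text
  psi: 2150.5,                            // number
  ts: new Date("2014-07-13T22:12:21Z"),   // timestamp
  location: { rig: "rig-a", deck: 2 }     // nested document
};

const vibrationReading = {
  sensor: "motor-07",
  samples: [0.12, 0.15, 0.11],            // array of numbers
  ts: new Date("2014-07-13T22:12:22Z")
};

// Differently shaped documents can share one code path, keyed on common fields.
const readings = [pressureReading, vibrationReading];
const sensorsSeen = readings.map((r) => r.sensor);
console.log(sensorsSeen);
```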

Horizontally Scalable (Add Shards)

Replication Within a Shard

Enabling Global Deployments

Use-Case: Oil Rig Data Analysis

3 Points of Data Creation / Collection

Rig Site (Middle of the Ocean)

Regional Center (Nearby Continent)

Headquarters (Texas?)

Day Level Data

Hour Level Data

MongoDB on all 3 Sites

Rig Site (Middle of the Ocean)

Regional Center (Nearby Continent)

Headquarters (Texas?)

Day Level Data

Hour Level Data

MongoDB on the Rig

{
  "machine-id": "derrick-72",
  "utilization-rate": 92,
  depth: 172,
  ts: ISODate("2013-10-16T22:07:38.000-0500")
}

• Queried and analyzed by on-site rig personnel

• High volume data with real-time response

• Aggregations compute high level statistics

• Statistics are transmitted to regional center
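To make the roll-up step above concrete, here is a minimal in-memory sketch of the kind of summary a rig might compute before transmitting it. It mirrors what a `$group` stage with `$avg` and `$max` accumulators would compute on the server, but it is not the presenter's code: `summarizeByMachine`, the `avg-utilization` and `max-depth` fields, and the `pump-3` machine are all assumptions made for illustration.

```javascript
// Summarize raw rig readings per machine before sending them upstream.
// Equivalent in spirit to a $group stage using $avg and $max.
function summarizeByMachine(readings) {
  const groups = new Map();
  for (const r of readings) {
    const id = r["machine-id"];
    const g = groups.get(id) || { count: 0, utilizationSum: 0, maxDepth: -Infinity };
    g.count += 1;
    g.utilizationSum += r["utilization-rate"];
    g.maxDepth = Math.max(g.maxDepth, r.depth);
    groups.set(id, g);
  }
  return [...groups].map(([id, g]) => ({
    "machine-id": id,
    samples: g.count,
    "avg-utilization": g.utilizationSum / g.count, // hypothetical summary field
    "max-depth": g.maxDepth                        // hypothetical summary field
  }));
}

// Sample readings: the second derrick reading and "pump-3" are invented.
const readings = [
  { "machine-id": "derrick-72", "utilization-rate": 92, depth: 172 },
  { "machine-id": "derrick-72", "utilization-rate": 88, depth: 175 },
  { "machine-id": "pump-3", "utilization-rate": 60, depth: 40 }
];
const summary = summarizeByMachine(readings);
console.log(summary);
```

Only the small summary documents would need to cross the link to the regional center, which is the point of aggregating on site.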

MongoDB at the Regional Center

{
  "rig-id": "gulf-1a23v",
  "machine-failures": 0,
  efficiency: 82,
  ts: ISODate("2014-07-13T22:12:21.000-0800")
}

• Monitoring important statistics from multiple rigs

• Aggregating rig data to report regional data to headquarters

MongoDB at Headquarters

{
  region: "Pacific",
  "total-rigs": 82,
  "producing-rigs": 77,
  barrels: 44000,
  ts: ISODate("2014-07-13")
}

• Regional views of the data

• Real-time stats

• Integration with Hadoop for large batch processing jobs

{
  region: "Atlantic",
  "total-rigs": 102,
  "producing-rigs": 95,
  barrels: 97000,
  ts: ISODate("2014-07-13")
}

Powered by MongoDB Replication & the Oplog

> db.replsettest.insert({_id:1,value:1})

{ "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"), "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }

> db.replsettest.update({_id:1},{$inc:{value:10}})

{ "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" : 11 } } }
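The two entries above show an insert (`"op" : "i"`) and an update (`"op" : "u"`). Note that the `$inc` issued by the client is logged as a `$set` of the resulting value. A minimal sketch of why that helps replication follows. It is a toy model, not MongoDB's replication code: `applyOplogEntry` is a hypothetical helper that handles only these two entry shapes and only top-level `$set` fields.

```javascript
// Apply a simplified oplog entry to an in-memory collection (a Map keyed by _id).
// Only "i" (insert) and "u" (update carrying $set) are handled in this sketch.
function applyOplogEntry(collection, entry) {
  if (entry.op === "i") {
    collection.set(entry.o._id, { ...entry.o });
  } else if (entry.op === "u") {
    const doc = collection.get(entry.o2._id);
    if (doc && entry.o.$set) Object.assign(doc, entry.o.$set);
  }
  return collection;
}

// The two entries from the slide, reduced to the fields this sketch uses.
const oplog = [
  { op: "i", ns: "test.replsettest", o: { _id: 1, value: 1 } },
  { op: "u", ns: "test.replsettest", o2: { _id: 1 }, o: { $set: { value: 11 } } }
];

const replica = new Map();
oplog.forEach((e) => applyOplogEntry(replica, e));

// Replaying the update a second time leaves the document unchanged, because
// the entry records the resulting value rather than the increment.
applyOplogEntry(replica, oplog[1]);
console.log(replica.get(1));
```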

Use-Case: Predictive Energy Network Analysis

Maintaining a Power Grid

Expensive Last Minute Resource Allocation

Use Data to Help Predict the Future

• Weather Radar Data

• Climate Models

• Syslog Data from Power Generating Entities

• Geotagged Meter Usage

Sensor Data

• Straightforward to store in MongoDB documents

• With strategic document design, a single server can save hundreds of thousands of sensor reads per second

Data Updates

• Single update required to add new data and increment associated counts

db.getCollection("sf-meter").update(
  { timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"), type: "richmond-district" },
  { $set: { "values.59": 2000000 }, $inc: { num_samples: 1, total_samples: 2000000 } }
)
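The update above targets slot 59 of one minute-level document. As a sketch of how an application might derive that filter and update from an incoming reading, the hypothetical helper below truncates the timestamp to the minute and uses the seconds value as the slot index. The function name `buildMeterUpdate` and the use of UTC are assumptions, not something stated in the talk.

```javascript
// Build the filter and update documents for one sensor reading.
// The minute bucket and the "values.<second>" slot are derived from the timestamp.
function buildMeterUpdate(ts, type, value) {
  const minute = new Date(ts.getTime());
  minute.setUTCSeconds(0, 0);            // truncate to the start of the minute
  const second = ts.getUTCSeconds();     // 0..59, selects the slot to write
  return {
    filter: { timestamp_minute: minute, type: type },
    update: {
      $set: { ["values." + second]: value },
      $inc: { num_samples: 1, total_samples: value }
    }
  };
}

const { filter, update } = buildMeterUpdate(
  new Date("2013-10-10T23:06:59.000Z"), "richmond-district", 2000000
);
console.log(JSON.stringify(filter), JSON.stringify(update));
```

In the mongo shell, the two parts could then be passed along the lines of `db.getCollection("sf-meter").update(filter, update)`.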

Data Management

• Data stored at different granularity levels for read performance

• Collections are organized into specific intervals

• Retention is managed by simply dropping collections as they age out

• Document structure is pre-created to maximize write performance
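Two of the bullets above lend themselves to a short sketch: pre-creating a document with all of its slots, and organizing collections by interval so that old data can be dropped wholesale. The helper names and the one-collection-per-month naming scheme (`meter_YYYY_MM`) are assumptions chosen for illustration; the talk does not specify either.

```javascript
// Pre-create a minute document with all 60 second slots set to a placeholder,
// so later updates overwrite existing fields rather than adding new ones.
function preallocateMinuteDoc(minuteStart, type) {
  const values = {};
  for (let s = 0; s < 60; s++) values[s] = 0;
  return { timestamp_minute: minuteStart, type, num_samples: 0, total_samples: 0, values };
}

// Hypothetical naming scheme: one collection per UTC month, e.g. "meter_2013_10".
function collectionNameFor(date, prefix = "meter") {
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${prefix}_${date.getUTCFullYear()}_${month}`;
}

const doc = preallocateMinuteDoc(new Date("2013-10-10T23:06:00Z"), "richmond-district");
console.log(Object.keys(doc.values).length, collectionNameFor(doc.timestamp_minute));
```

Under this scheme, aging out a month of data would be a single `drop()` on that month's collection rather than a large delete.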

Aggregation Framework

• MongoDB has a built-in Aggregation Framework that supports ad-hoc analysis tasks over data sets

• “What counties had the highest average power utilization bracketed daily?”

• “Which meters have the most surge problems per week?”
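As a hedged example of how the second question might be expressed, the sketch below builds an aggregation pipeline that counts surge events per meter per week. The document shape (`meter_id`, `ts`, `watts`), the idea that a surge is a reading above a threshold, and the function name are all assumptions; only the pipeline stages and operators (`$match`, `$group`, `$year`, `$week`, `$sum`, `$sort`, `$limit`) are standard MongoDB.

```javascript
// Build an aggregation pipeline that counts surge events per meter per week.
// Assumed document shape (hypothetical): { meter_id, ts: <Date>, watts: <Number> }.
function surgesPerWeekPipeline(surgeThreshold, topN) {
  return [
    // Keep only readings above the surge threshold.
    { $match: { watts: { $gt: surgeThreshold } } },
    // Count surges for each (meter, year, week) combination.
    { $group: {
        _id: { meter: "$meter_id", year: { $year: "$ts" }, week: { $week: "$ts" } },
        surges: { $sum: 1 }
    } },
    // Worst offenders first, limited to the top N rows.
    { $sort: { surges: -1 } },
    { $limit: topN }
  ];
}

const pipeline = surgesPerWeekPipeline(5000, 10);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array is what would be handed to a collection's `aggregate()` method in the shell or a driver.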

Pre-Aggregated Log Data

{
  timestamp_minute: ISODate("2000-10-10T20:55:00Z"),
  resource: "sensor-5a3524s",
  "usage-values": { 0: 50, … 59: 250 }
}

• Leverage time-series style bucketing

• Track individual metrics

• Improve performance for reads/writes

• Minimal processing overhead
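One way to see the read-side benefit of bucketing is that a whole minute can be summarized from a single document. The sketch below does this in plain JavaScript. `summarizeBucket` is a hypothetical helper, and the bucket holds only three filled slots for brevity; the value in slot 1 is invented, while slots 0 and 59 reuse the values from the slide.

```javascript
// Summarize one pre-aggregated minute bucket without touching per-event documents.
function summarizeBucket(bucket) {
  const values = Object.values(bucket["usage-values"]);
  const total = values.reduce((sum, v) => sum + v, 0);
  return {
    resource: bucket.resource,
    samples: values.length,
    total: total,
    average: total / values.length,
    peak: Math.max(...values)
  };
}

const bucket = {
  timestamp_minute: new Date("2000-10-10T20:55:00Z"),
  resource: "sensor-5a3524s",
  "usage-values": { 0: 50, 1: 150, 59: 250 } // slot 1 is an invented value
};
const stats = summarizeBucket(bucket);
console.log(stats);
```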

MongoDB Makes Sense

Massive Amounts of Data

• Commodity Storage

• Add Nodes for Scale

• No SAN Needed

• MongoDB Replication for HA

High Performance

• Massive Write Scale

• Massive Read Scale

• Real-Time Response

Flexible Data Model

• A single sensor isn’t likely to change its data model…

• But what about the other sensors?

• Dynamic schema is a necessity

• Easily drop collections for data management

Lower Total Cost of Ownership

• Open Source vs. Proprietary

• Commodity Hardware

• Reduced Development Time

Questions?

Resources

• Schema Design for Time Series Data in MongoDB: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb

• Operational Intelligence Use Case: http://docs.mongodb.org/ecosystem/use-cases/#operational-intelligence

• Data Modeling in MongoDB: http://docs.mongodb.org/manual/data-modeling/

• Schema Design (webinar): http://www.mongodb.com/events/webinar/schema-design-oct2013