Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
Transcript of Webinar: MongoDB Use Cases within the Oil, Gas, and Energy Industries
MongoDB Usage within Oil, Gas, and Energy
Kevin Hanson
Senior Account Executive / Solutions Architect, MongoDB Inc.
@hungarianhc ~ [email protected]
#mongodb
Agenda
• Common Themes in MongoDB Usage
• What is MongoDB?
• Use-Cases and Examples
• Thinking Ahead
• Questions
Common Themes in MongoDB Usage
Machine Generated Data
Fast Moving Data
• Hundreds of thousands of records per second
• Fast response required
• Sometimes all data kept, sometimes just summary
• Horizontal scalability required
Massive Amounts of Data
• Widely applicable data model
• Applies to several different “data use cases”
• Various schema and modeling options
• Application requirements drive schema design
Data is Structured, but Varied…
• A machine generates a specific kind of data
• The data model is unlikely to change
• But there are so many different machines…
• Queryability across all types
Time Series Data
• Event data written multiple times per second, minute, or hour
• Tracking progression of metrics over time
What is MongoDB?
MongoDB is a ___________ database
• Open source
• High performance
• Full featured
• Document-oriented
• Horizontally scalable
Full Featured
• Dynamic (ad-hoc) queries
• Built-in online aggregation
• Rich query capabilities
• Traditionally consistent
• Many advanced features
• Support for many programming languages
Document-Oriented Database
• A document is a nestable associative array
• Document schemas are flexible
• Documents can contain various data types (numbers, text, timestamps, blobs, etc.)
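To make "nestable associative array" concrete, here is a minimal sketch of two documents written as plain JavaScript objects, the notation the mongo shell uses. The field names and values are hypothetical and chosen only to show the data types listed above.

```javascript
// Hypothetical sensor document: field names and values are illustrative only.
const sensorDoc = {
  "machine-id": "pump-17",               // text
  depth: 172,                            // number
  ts: new Date("2013-10-16T22:07:38Z"),  // timestamp
  location: {                            // nested sub-document
    rig: "gulf-1a23v",
    deck: 2
  },
  alarms: ["vibration", "temperature"]   // array of values
};

// A second document in the same collection may carry different fields,
// which is what a flexible schema means in practice.
const otherDoc = { "machine-id": "derrick-72", "utilization-rate": 92 };

console.log(Object.keys(sensorDoc));
console.log(Object.keys(otherDoc));
```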
Horizontally Scalable (Add Shards)
Replication Within a Shard
Enabling Global Deployments
Use-Case: Oil Rig Data Analysis
3 Points of Data Creation / Collection
Rig Site (Middle of the Ocean)
Regional Center (Nearby Continent)
Headquarters (Texas?)
Day Level Data
Hour Level Data
MongoDB on all 3 Sites
Rig Site (Middle of the Ocean)
Regional Center (Nearby Continent)
Headquarters (Texas?)
Day Level Data
Hour Level Data
MongoDB on the Rig
{
  "machine-id": "derrick-72",
  "utilization-rate": 92,
  depth: 172,
  ts: ISODate("2013-10-16T22:07:38.000-0500")
}
• Queried and analyzed by on-site rig personnel
• High volume data with real-time response
• Aggregations compute high level statistics
• Statistics are transmitted to regional center
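The rollup described in these bullets can be sketched in plain JavaScript. This is an in-memory illustration of the kind of per-machine summary a grouping aggregation would compute, not the deck's actual pipeline. The input fields follow the sample rig document above; the function name and the output fields (avgUtilization, maxDepth, samples) are assumptions.

```javascript
// Plain-JavaScript sketch of a per-machine rollup, comparable in spirit to a
// grouping stage in an aggregation. Output field names are assumptions.
function summarizeByMachine(readings) {
  const groups = new Map();
  for (const r of readings) {
    const id = r["machine-id"];
    const g = groups.get(id) || { sum: 0, maxDepth: -Infinity, samples: 0 };
    g.sum += r["utilization-rate"];
    g.maxDepth = Math.max(g.maxDepth, r.depth);
    g.samples += 1;
    groups.set(id, g);
  }
  // One summary document per machine, in first-seen order.
  return [...groups].map(([id, g]) => ({
    "machine-id": id,
    avgUtilization: g.sum / g.samples,
    maxDepth: g.maxDepth,
    samples: g.samples
  }));
}

// Hypothetical readings; only the first mirrors the sample document.
const rigReadings = [
  { "machine-id": "derrick-72", "utilization-rate": 92, depth: 172 },
  { "machine-id": "derrick-72", "utilization-rate": 88, depth: 175 },
  { "machine-id": "pump-17", "utilization-rate": 60, depth: 40 }
];
const rigSummary = summarizeByMachine(rigReadings);
console.log(rigSummary);
```

Only the small summary documents would need to leave the rig, which fits the bullet about transmitting statistics rather than raw readings.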
MongoDB at the Regional Center
{
  "rig-id": "gulf-1a23v",
  "machine-failures": 0,
  efficiency: 82,
  ts: ISODate("2014-07-13T22:12:21.000-0800")
}
• Monitoring important statistics from multiple rigs
• Aggregating rig data to report regional data to headquarters
MongoDB at Headquarters
{
  region: "Pacific",
  "total-rigs": 82,
  "producing-rigs": 77,
  barrels: 44000,
  ts: ISODate("2014-07-13")
}
• Regional views of the data
• Real-time stats
• Integration with Hadoop for large batch processing jobs
{
  region: "Atlantic",
  "total-rigs": 102,
  "producing-rigs": 95,
  barrels: 97000,
  ts: ISODate("2014-07-13")
}
Powered by MongoDB Replication & the Oplog
> db.replsettest.insert({_id:1,value:1})
{ "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"), "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }
> db.replsettest.update({_id:1},{$inc:{value:10}})
{ "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" : 11 } } }
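Note that the $inc issued in the shell appears in the oplog as a $set of the resulting value. A minimal in-memory sketch of why that matters when entries are replayed on another node follows. The applyOplogEntry helper is hypothetical and handles only the two entry shapes shown above; it is not MongoDB's replication code.

```javascript
// Minimal in-memory sketch of replaying oplog-style entries.
// Only "i" (insert) and "u" (update carrying $set) are handled.
function applyOplogEntry(collection, entry) {
  if (entry.op === "i") {
    collection.set(entry.o._id, { ...entry.o });
  } else if (entry.op === "u") {
    const doc = collection.get(entry.o2._id);
    if (doc && entry.o.$set) Object.assign(doc, entry.o.$set);
  }
  return collection;
}

// The two entries from above, reduced to the fields this sketch uses.
const oplogEntries = [
  { op: "i", ns: "test.replsettest", o: { _id: 1, value: 1 } },
  { op: "u", ns: "test.replsettest", o2: { _id: 1 }, o: { $set: { value: 11 } } }
];

const replica = new Map();
for (const e of oplogEntries) applyOplogEntry(replica, e);

// Replaying the update a second time leaves the document unchanged, because
// the entry records the resulting value ($set) rather than the increment.
applyOplogEntry(replica, oplogEntries[1]);
console.log(replica.get(1));
```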
Use-Case: Predictive Energy Network Analysis
Maintaining a Power Grid
Expensive Last Minute Resource Allocation
Use Data to Help Predict the Future
• Weather Radar Data
• Climate Models
• Syslog Data from Power Generating Entities
• Geotagged Meter Usage
Sensor Data
• Straightforward to store in MongoDB documents
• With strategic document design, a single server can save hundreds of thousands of sensor reads per second
Data Updates
• Single update required to add new data and increment associated counts
db.getCollection("sf-meter").update(
  { timestamp_minute: ISODate("2013-10-10T23:06:00.000Z"), type: "richmond-district" },
  {
    $set: { "values.59": 2000000 },
    $inc: { num_samples: 1, total_samples: 2000000 }
  }
)
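A sketch of how an application might derive that filter and update from a raw reading is shown here. The helper buildSampleUpdate is hypothetical; the field names follow the update statement above, and the assumption is one document per meter type per minute with one slot per second.

```javascript
// Hypothetical helper: builds the filter and update documents for one reading.
function buildSampleUpdate(ts, type, value) {
  // Truncate the reading's timestamp to the start of its minute (UTC).
  const minute = new Date(ts.getTime());
  minute.setUTCSeconds(0, 0);
  // The second within the minute selects the slot under "values".
  const second = ts.getUTCSeconds();
  return {
    filter: { timestamp_minute: minute, type: type },
    update: {
      $set: { ["values." + second]: value },
      $inc: { num_samples: 1, total_samples: value }
    }
  };
}

const sample = buildSampleUpdate(
  new Date("2013-10-10T23:06:59.000Z"), "richmond-district", 2000000);
console.log(JSON.stringify(sample, null, 2));
```

The two parts of the result would be passed as the first and second arguments of a single update call, so storing the value and maintaining the counters happen in one operation.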
Data Management
• Data stored at different granularity levels for read performance
• Collections are organized into specific intervals
• Retention is managed by simply dropping collections as they age out
• Document structure is pre-created to maximize write performance
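Two of these points, pre-created documents and retention by dropping collections, can be sketched as follows. Everything here is an assumption made for illustration: the one-collection-per-day layout, the "meter_YYYYMMDD" naming scheme, and the helper names do not come from the slides.

```javascript
// Pre-create a per-minute document with all 60 second slots present, so later
// updates overwrite existing fields instead of adding new ones.
function buildEmptyMinuteDoc(minuteStart, type) {
  const values = {};
  for (let s = 0; s < 60; s++) values[s] = 0;
  return { timestamp_minute: minuteStart, type, num_samples: 0, total_samples: 0, values };
}

// Hypothetical naming scheme: one collection per UTC day.
function dayCollectionName(date) {
  return "meter_" + date.toISOString().slice(0, 10).replace(/-/g, "");
}

// Return the collections older than the retention window.
function expiredCollections(names, today, retentionDays) {
  const cutoff = new Date(today.getTime() - retentionDays * 24 * 60 * 60 * 1000);
  const cutoffName = dayCollectionName(cutoff);
  // Names sort chronologically, so a string comparison is sufficient here.
  return names.filter((n) => n < cutoffName);
}

const emptyDoc = buildEmptyMinuteDoc(new Date("2013-10-10T23:06:00Z"), "richmond-district");
const toDrop = expiredCollections(
  ["meter_20131001", "meter_20131008", "meter_20131010"],
  new Date("2013-10-10T00:00:00Z"), 7);
console.log(Object.keys(emptyDoc.values).length, toDrop);
```

Each name in toDrop would then be dropped as a whole collection, which avoids deleting aged-out documents one by one.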
Aggregation Framework
• MongoDB has a built-in Aggregation Framework that supports ad-hoc analysis tasks over data sets
• “What counties had the highest average power utilization bracketed daily?”
• “Which meters have the most surge problems per week?”
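As an illustration, the second question could be expressed as an aggregation pipeline along these lines. The schema is an assumption: one document per surge event with fields meter_id, event_type, and ts, stored in a collection called meter_events. None of those names appear in the slides.

```javascript
// Hypothetical pipeline: count surge events per meter per week and list the
// ten worst meter-weeks first.
const surgesPerWeekPipeline = [
  { $match: { event_type: "surge" } },
  { $group: {
      _id: { meter: "$meter_id", year: { $year: "$ts" }, week: { $week: "$ts" } },
      surges: { $sum: 1 }
  } },
  { $sort: { surges: -1 } },
  { $limit: 10 }
];

// In the mongo shell this would be run as, for example:
//   db.meter_events.aggregate(surgesPerWeekPipeline)
console.log(JSON.stringify(surgesPerWeekPipeline, null, 2));
```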
Pre-Aggregated Log Data
{
  timestamp_minute: ISODate("2000-10-10T20:55:00Z"),
  resource: "sensor-5a3524s",
  "usage-values": { 0: 50, … 59: 250 }
}
• Leverage time-series style bucketing
• Track individual metrics
• Improve performance for reads/writes
• Minimal processing overhead
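One benefit of bucketing on the read side is that a whole minute of readings arrives in a single document. A minimal sketch of summarizing one such bucket follows. The helper name and output fields are assumptions, and the sample bucket fills in only three of the sixty slots for brevity.

```javascript
// Sketch: summarize one per-minute bucket document of the shape shown above.
function summarizeBucket(bucket) {
  const readings = Object.values(bucket["usage-values"]);
  const total = readings.reduce((a, b) => a + b, 0);
  return {
    resource: bucket.resource,
    samples: readings.length,
    average: readings.length ? total / readings.length : null,
    peak: readings.length ? Math.max(...readings) : null
  };
}

// Hypothetical bucket; the value for second 1 is invented for the example.
const bucket = {
  timestamp_minute: new Date("2000-10-10T20:55:00Z"),
  resource: "sensor-5a3524s",
  "usage-values": { 0: 50, 1: 150, 59: 250 }
};
const bucketSummary = summarizeBucket(bucket);
console.log(bucketSummary);
```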
MongoDB Makes Sense
Massive Amounts of Data
• Commodity Storage
• Add Nodes for Scale
• No SAN Needed
• MongoDB Replication for HA
High Performance
• Massive Write Scale
• Massive Read Scale
• Real-Time Response
Flexible Data Model
• A single sensor isn’t likely to change its data model…
• But what about the other sensors?
• Dynamic schema is a necessity
• Easily drop collections for data management
Lower Total Cost of Ownership
• Open Source vs. Proprietary
• Commodity Hardware
• Reduced Development Time
Questions?
Resources
• Schema Design for Time Series Data in MongoDB: http://blog.mongodb.org/post/65517193370/schema-design-for-time-series-data-in-mongodb
• Operational Intelligence Use Case: http://docs.mongodb.org/ecosystem/use-cases/#operational-intelligence
• Data Modeling in MongoDB: http://docs.mongodb.org/manual/data-modeling/
• Schema Design (webinar): http://www.mongodb.com/events/webinar/schema-design-oct2013