Migrating from RDBMS to MongoDB
RDBMS to MongoDB Migration Best Practices
Mrinal Sarkar, Solutions Architect
What We’ll Cover
• Relational Challenges
• Migration Roadmap
• Schema Design
• Application Integration
• Data Migration
• Operational Considerations
• Resources to Get Started
Relational Challenges
Relational
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations
Relational Database Challenges
Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data
Agile Development
• Iterative
• Short development cycles
• New workloads
Volume of Data
• Terabytes to petabytes of data
• Billions of records
• Thousands of queries per second
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
The World Has Changed
• Data
• Risk
• Time
• Cost
NoSQL
• Scalability & Performance
• Always On, Global Deployments
• Flexibility
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations
Nexus Architecture
• Scalability & Performance
• Always On, Global Deployments
• Flexibility
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations
Migration Steps
Migration Roadmap
• Backed by Free, Online MongoDB Training
• Paid Consulting, Services and Support available
Schema Design
Definitions
RDBMS      MongoDB
Database   Database
Table      Collection
Row        Document
Index      Index
JOIN       Embedded document, document references, or $lookup to combine data from different collections
SQL to Aggregation Mapping
Mapping Chart: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
Mapping MongoDB Query Language to SQL
Mapping Chart: http://docs.mongodb.org/manual/reference/sql-comparison/
Modeling Relationships: Embedding and Referencing
• Embedding
– For 1:1 or 1:Many (where “many” viewed with the parent)
– Ownership and containment
– Document limit of 16MB, consider document growth
– Atomicity of updates
• Referencing
– _id field is referenced in the related document
– Application runs 2nd query to retrieve the data
– Data duplication vs performance gain
– Object referenced by many different sources
– Models complex Many:Many & hierarchical structures
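The two modeling options can be sketched in plain Python, with dictionaries standing in for documents. This is only an illustration: the helper names (embed, reference, cars_of) and the owner_id field are assumptions made for the example, and the sample data borrows from the person-and-cars document used elsewhere in this deck.

```python
# Dictionaries stand in for MongoDB documents; no database is needed.
# Helper names and the owner_id field are hypothetical.

def embed(person, cars):
    """Embedding: the cars live inside the person document."""
    doc = dict(person)
    doc["cars"] = [dict(car) for car in cars]
    return doc

def reference(person, cars):
    """Referencing: each car document stores the owner's _id."""
    person_doc = dict(person)
    car_docs = [dict(car, owner_id=person["_id"]) for car in cars]
    return person_doc, car_docs

def cars_of(person_doc, car_docs):
    """The application-side second lookup that referencing requires."""
    return [c for c in car_docs if c["owner_id"] == person_doc["_id"]]

person = {"_id": 1, "first_name": "Paul", "surname": "Miller", "city": "London"}
cars = [{"model": "Bentley", "year": 1973}, {"model": "Rolls Royce", "year": 1965}]

embedded = embed(person, cars)                  # one document, read in one step
person_doc, car_docs = reference(person, cars)  # separate documents, two steps

print(len(embedded["cars"]))
print([c["model"] for c in cars_of(person_doc, car_docs)])
```

With embedding, the parent and its cars are read and updated as a single document. With referencing, the cars can be shared or queried on their own at the cost of the extra lookup.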
Data Models: Relational to Document
MongoDB document (the relational side of the slide is a diagram):
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
Referencing Documents
Document Model Benefits
RDBMS vs. MongoDB; the MongoDB document:
{
  _id: ObjectId("4c4ba5e5e8aabf3"),
  employee_name: "Dunham, Justin",
  department: "Marketing",
  title: "Product Manager, Web",
  report_up: "Neray, Graham",
  pay_band: "C",
  benefits: [
    { type: "Health", plan: "PPO Plus" },
    { type: "Dental", plan: "Standard" }
  ]
}
Anatomy of a BSON Document
{
  first_name: 'Paul',
  surname: 'Miller',
  cell: '+447557505611',
  city: 'London',
  location: [45.123, 47.232],
  Profession: ['banking', 'finance', 'trader'],
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
• Fields
• Typed field values: String, Number, Geo-Coordinates
• Fields can contain arrays
• Fields can contain an array of sub-documents
Document Model Benefits
Agility and flexibility
• Data model supports business change
• Rapidly iterate to meet new requirements
Intuitive, natural data representation
• Eliminates ORM layer
• Developers are more productive
Reduces the need for joins, disk seeks
• Programming is simpler
• Performance delivered at scale
MongoDB is Fully Featured
Indexing in MongoDB
• MongoDB indexing will be familiar to DBAs
– B-Tree Indexes, Secondary Indexes
• Single biggest tunable performance factor
– Define indexes by identifying common queries
– Use MongoDB explain to ensure index coverage
– MongoDB profiler logs all slow queries
Index types
• Compound
• Unique
• Array
• TTL
• Geospatial
• Hashed
• Sparse
• Partial (new in version 3.2)
• Text Search
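Several of the index types listed above could be declared as follows. This is a minimal sketch, not a recommended index set: every field name is hypothetical, and the RecordingCollection class is a stand-in written for this example so that the code runs without a server. The (keys, options) pairs follow the shape accepted by PyMongo's Collection.create_index, so a real collection could presumably be passed in its place.

```python
ASCENDING = 1  # same value PyMongo exposes as pymongo.ASCENDING

# (keys, options) pairs; all field names are hypothetical.
INDEX_SPECS = [
    ([("surname", ASCENDING), ("first_name", ASCENDING)], {}),    # compound
    ([("email", ASCENDING)], {"unique": True}),                   # unique
    ([("created_at", ASCENDING)], {"expireAfterSeconds": 3600}),  # TTL
    ([("location", "2dsphere")], {}),                             # geospatial
    ([("customer_id", "hashed")], {}),                            # hashed
    ([("nickname", ASCENDING)], {"sparse": True}),                # sparse
    ([("year", ASCENDING)],
     {"partialFilterExpression": {"year": {"$gt": 1990}}}),       # partial
    ([("bio", "text")], {}),                                      # text search
]

def apply_indexes(collection, specs=INDEX_SPECS):
    """Create each index; `collection` is anything with create_index()."""
    return [collection.create_index(keys, **options) for keys, options in specs]

class RecordingCollection:
    """Stand-in for a real collection so the sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def create_index(self, keys, **options):
        self.calls.append((keys, options))
        # Mimic the default "<field>_<kind>" index naming.
        return "_".join(f"{field}_{kind}" for field, kind in keys)

fake = RecordingCollection()
names = apply_indexes(fake)
print(names)
```

Keeping the specifications as data makes it easy to review them against the application's common queries before creating anything.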
Further Reading
http://docs.mongodb.org/manual/data-modeling/
Application Integration
Drivers & Ecosystem
Support for the most popular languages and frameworks
• Morphia
• MEAN Stack
• Python, Perl, Ruby
Application Integration: MongoDB Aggregation Framework
• Ad-hoc reporting, grouping and aggregations, without the complexity of MapReduce
– Max, Min, Averages, Sum, Union, Redact, GeoNear
• Similar functionality to SQL GROUP BY
• Processes a stream of documents
• Series of operators
– Filter or transform data
– Input/output chain
• Supports single servers & shards
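As an illustration of the filter, group and sort chain described above, the sketch below builds a pipeline that is roughly the counterpart of a SQL GROUP BY query. The field names (model, year, value) come from the sample car documents in this deck, while the function names are assumptions. A plain-Python function computes the same result locally so the example runs without a server.

```python
from collections import defaultdict

def value_by_model_pipeline(min_year):
    """Roughly: SELECT model, SUM(value), AVG(value) FROM cars
    WHERE year >= ? GROUP BY model ORDER BY SUM(value) DESC."""
    return [
        {"$match": {"year": {"$gte": min_year}}},           # filter
        {"$group": {"_id": "$model",                        # group
                    "total": {"$sum": "$value"},
                    "average": {"$avg": "$value"}}},
        {"$sort": {"total": -1}},                           # order
    ]

def value_by_model_local(docs, min_year):
    """Plain-Python reference for what the pipeline above computes."""
    groups = defaultdict(list)
    for doc in docs:
        if doc["year"] >= min_year:
            groups[doc["model"]].append(doc["value"])
    rows = [{"_id": model, "total": sum(vals), "average": sum(vals) / len(vals)}
            for model, vals in groups.items()]
    return sorted(rows, key=lambda row: row["total"], reverse=True)

cars = [
    {"model": "Bentley", "year": 1973, "value": 100000},
    {"model": "Rolls Royce", "year": 1965, "value": 330000},
    {"model": "Bentley", "year": 1980, "value": 50000},
]

pipeline = value_by_model_pipeline(1960)
result = value_by_model_local(cars, 1960)
print(result)
```

With a driver such as PyMongo, the pipeline list would be handed to the collection's aggregate method; each stage consumes the output of the previous one.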
High Availability: Replica Sets
Replica Set: 2 to 50 copies
Addresses availability considerations:
• High Availability
• Disaster Recovery
• Maintenance
• Workload Isolation: operational & analytics
Scalability via Sharding
• Multiple query optimization models
• Each sharding option appropriate for different apps
• Elastic and self-balancing
Shard Key Selection: http://docs.mongodb.org/manual/tutorial/choose-a-shard-key/
BI Integration
https://docs.mongodb.org/ecosystem/tools/hadoop/
MongoDB Connector for BI
Visualize and explore multi-dimensional documents using SQL-based BI tools. The connector does the following:
• Provides the BI tool with the schema of the MongoDB collection to be visualized
• Translates SQL statements issued by the BI tool into equivalent MongoDB queries that are sent to MongoDB for processing
• Converts the results into the tabular format expected by the BI tool, which can then visualize the data based on user requirements
Data Integrity
Data Governance with Document Validation
Implement data governance without sacrificing the agility that comes from a dynamic schema
• Enforce data quality across multiple teams and applications
• Use familiar MongoDB expressions to control document structure
• Validation is optional and can range from a single field to every field, covering existence, data types, and regular expressions
Document Validation Example
The example on the slide adds a rule to the contacts collection that validates:
• The year of birth is no later than 1994
• The document contains a phone number and / or an email address
• When present, the phone number and email addresses are strings
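The slide's own validator is not captured in this transcript, so the following is a reconstruction under assumptions rather than the original example. The field names year_of_birth, phone and email are guesses, and the validator is written with standard query operators ($and, $or, $lte, $exists, $type). A plain-Python mirror of the three rules is included so the logic can be checked without a server.

```python
# Assumed field names: year_of_birth, phone, email.
CONTACTS_VALIDATOR = {
    "$and": [
        # Year of birth is no later than 1994.
        {"year_of_birth": {"$lte": 1994}},
        # At least one of phone / email exists.
        {"$or": [{"phone": {"$exists": True}}, {"email": {"$exists": True}}]},
        # When present, each of them must be a string.
        {"$or": [{"phone": {"$exists": False}}, {"phone": {"$type": "string"}}]},
        {"$or": [{"email": {"$exists": False}}, {"email": {"$type": "string"}}]},
    ]
}

def is_valid_contact(doc):
    """Plain-Python mirror of the rule, for checking documents client-side."""
    year = doc.get("year_of_birth")
    if isinstance(year, bool) or not isinstance(year, (int, float)):
        return False
    if year > 1994:
        return False
    present = [doc[field] for field in ("phone", "email") if field in doc]
    return bool(present) and all(isinstance(value, str) for value in present)

# With PyMongo, the rule could be attached along the lines of:
#   db.command("collMod", "contacts", validator=CONTACTS_VALIDATOR)

print(is_valid_contact({"year_of_birth": 1980, "email": "a@example.com"}))
print(is_valid_contact({"year_of_birth": 2001, "phone": "555-0100"}))
```

Because validation runs on the server, every application writing to the collection is held to the same rule, which is the governance point made above.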
Data Durability: Write Concern & Journal
Write Concern describes the level of acknowledgement requested from MongoDB for write operations
• Configurable per operation
• Combining Write Concern levels and journaling allows multiple levels of guarantees
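One way to picture "configurable per operation" is a small lookup from the kind of write to a write concern document. The sketch below is an assumption-laden illustration: the level names and the operation kinds are invented for the example, while the document fields w, j and wtimeout are the standard write concern fields.

```python
# Level names and operation kinds are hypothetical labels.
WRITE_CONCERNS = {
    # Acknowledged by the primary only.
    "fast": {"w": 1},
    # Acknowledged by the primary after the write reaches its journal.
    "journaled": {"w": 1, "j": True},
    # Acknowledged by a majority of members, journaled, 5 s timeout.
    "durable": {"w": "majority", "j": True, "wtimeout": 5000},
}

LEVEL_BY_OPERATION = {
    "clickstream": "fast",
    "profile_update": "journaled",
    "payment": "durable",
}

def write_concern_for(operation_kind):
    """Pick a write concern per operation; unknown kinds get 'journaled'."""
    level = LEVEL_BY_OPERATION.get(operation_kind, "journaled")
    return WRITE_CONCERNS[level]

# With PyMongo, a document like these could be applied per operation, e.g.:
#   from pymongo.write_concern import WriteConcern
#   coll.with_options(write_concern=WriteConcern(**write_concern_for("payment")))

print(write_concern_for("payment"))
```

The trade-off is latency against the strength of the guarantee: the stronger levels wait for more acknowledgement before the write returns.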
Migration and Operations
Traditional ETL
Source Database ETL
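The transform step of a traditional ETL run can be sketched with Python's built-in sqlite3 module standing in for the source database. The table layout (person, car) and the function name are assumptions chosen to match the person-and-cars document shown earlier; the load step into MongoDB is left as a comment because it needs a running server.

```python
import sqlite3

def extract_documents(conn):
    """Join parent and child rows into one document per person."""
    docs = {}
    rows = conn.execute(
        "SELECT p.id, p.first_name, p.surname, c.model, c.year "
        "FROM person p LEFT JOIN car c ON c.person_id = p.id "
        "ORDER BY p.id, c.year"
    )
    for person_id, first_name, surname, model, year in rows:
        doc = docs.setdefault(person_id, {
            "_id": person_id,
            "first_name": first_name,
            "surname": surname,
            "cars": [],
        })
        if model is not None:  # a person without cars keeps an empty array
            doc["cars"].append({"model": model, "year": year})
    return list(docs.values())

# Hypothetical source schema and sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT);
    CREATE TABLE car (id INTEGER PRIMARY KEY, person_id INTEGER,
                      model TEXT, year INTEGER);
    INSERT INTO person VALUES (1, 'Paul', 'Miller'), (2, 'Ann', 'Lee');
    INSERT INTO car VALUES (10, 1, 'Bentley', 1973), (11, 1, 'Rolls Royce', 1965);
""")

documents = extract_documents(conn)
# Load step, e.g. with PyMongo: collection.insert_many(documents)
print(documents)
```

The JOIN that the relational application ran on every read is performed once here, at migration time, and its result is stored as embedded sub-documents.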
Incremental Migration, Live
Legacy Database
MongoDB Database
Operations
• Configuration, Provisioning, Monitoring and Backup
• High Availability & Disaster Recovery
• Scalability
• Hardware selection
– Commodity Servers: Prioritize RAM, Fast CPUs & SSD
• Security
– Access Control, Authentication, Encryption
Download the Whitepaper: MongoDB Operations Best Practices
Ops Manager & Cloud Manager
The Best Way to Manage MongoDB: Up to 95% Reduction in Operational Overhead
• Single-click provisioning, scaling & upgrades, admin tasks
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for sharded clusters
MongoDB Compass
For fast schema discovery and visual construction of ad-hoc queries
• Visualize schema
– Frequency of fields
– Frequency of types
– Determine validator rules
• View Documents
• Graphically build queries
• Authenticated access
Getting Started
MongoDB Enablement
Consulting, training, and professional services throughout your project lifecycle
Audiences: For Operations, For Developers, For Both
Project phases: Design & Development; Pre-Production (Test, QA, Deployment); Production; Expansion
Consulting
• Dedicated Consulting Engineer | Custom Projects
• Operations Rapid Start
• Production Readiness
• MongoDB Private Cloud Accelerator
• Health Check
• Development Rapid Start
• Performance Evaluation and Tuning
Training
• Developer Training
• Essentials Training
• Administrator Training
• Advanced Developer Training
• Advanced Administrator Training
Migration in Action
eCommerce Application
• Migration from MS-SQL
• Project completed in 8 months vs. the 18 months originally planned
• High availability, performance and reliability at a fraction of the cost
• Lower latency
• Faster dev cycles
Content Management
• Migration from Oracle
• 80% cost reduction with commodity hardware
• 900% performance improvement
• Development cycles in weeks vs. tens of months
Customer Data Mgmt & Analytics
• Multi RDBMS Migration
• 95% faster in identifying matches
• 50% increase in paying subscribers
• 60% increase in unique web site visits
Summary
• MongoDB brings the best of both relational and NoSQL data models
• MongoDB is a full-featured database platform
• MongoDB helps you reduce your project time, cost and risks
• Migrating to MongoDB is easier than before with enterprise-level consulting, training and support
Download the Guide: https://www.mongodb.com/collateral/rdbms-mongodb-migration-guide