Running MongoDB on AWS
Sandeep Parikh
Senior Solutions Architect
MongoDB, Inc.
Agenda
• Background
• Deployment
• Automation
• Management
• Integrations
• Resources
MongoDB
• Flexible document data model
• Rich ad-hoc queries and in-place updates
• Real-time aggregation
• Geospatial support
• Text search
• Built-in support for
– Redundancy and High Availability
– Auto-partitioning and scale out
Amazon Web Services
• Complete cloud infrastructure
– Compute
– Storage
– Database
– Analytics
– Processing
– Deployment
– Containers
• Multitude of configuration options
• Pricing flexibility
– On-demand, Spot instance, Reserved instance
Instance Selection
• General Purpose (M3)
• Compute-optimized (C3)
• GPU (GPU compute resources are not needed by MongoDB)
• Memory-optimized (R3)
• Storage-optimized (I2, HS1)
• Micro (bursty CPU only, no sustained CPU performance)
Instance Characteristics
• Distinctions
– CPU, memory, storage, networking
• Networking
– EBS-optimized, enhanced networking, placement groups
• Availability
– Varies by region
Storage Configurations
• S3
– Blob storage
– Static content
• EBS
– Magnetic
– SSD, burst IOPS
– OS root volume
• PIOPS EBS
– SSD-backed, predictable performance
– Cost scales up with size and IOPS
• Instance Store
– SSD-backed
– Blazing fast, ephemeral
– Included in instance cost
Storage Configurations
• PIOPS EBS or Instance Store are best choices
• Instance Store offers best $/IOP
– Storage is ephemeral
– Must be used with MongoDB Replica Sets
• Can mix/match in a single deployment
– E.g. some Secondary nodes on EBS
– …But you’ll need several EBS volumes to maintain reasonable IOPS parity
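The IOPS-parity point above comes down to simple arithmetic. A minimal sketch, assuming hypothetical IOPS figures (neither number comes from the slides or from AWS documentation; substitute measured or provisioned values):

```python
import math


def ebs_volumes_for_parity(instance_store_iops, iops_per_ebs_volume):
    """Return how many EBS volumes are needed to match the instance-store IOPS."""
    if instance_store_iops <= 0 or iops_per_ebs_volume <= 0:
        raise ValueError("IOPS figures must be positive")
    return math.ceil(instance_store_iops / iops_per_ebs_volume)


# Hypothetical figures for illustration only.
print(ebs_volumes_for_parity(instance_store_iops=35000, iops_per_ebs_volume=4000))
```

Rounding up matters here: a Secondary that falls even slightly short on IOPS may lag behind a Primary running on Instance Store.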
Instance Configuration
• Use EXT4 or XFS with appropriate mount options (e.g. noatime)
• Tune block device read-ahead
• Tune TCP keep alive
• Disable NUMA
• Disable zone-reclaim mode
• Increase ulimits for processes and open files
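Several of the settings above can be inspected without changing anything. The following is a rough, read-only sketch for a Linux instance; the thresholds (300 seconds, 64000 open files) are assumptions for illustration and should be checked against the MongoDB production notes for your version. It does not cover NUMA or filesystem options.

```python
import resource
from pathlib import Path

# Assumed thresholds for illustration; confirm them against the MongoDB
# production notes for your version before relying on them.
MAX_KEEPALIVE_SECONDS = 300
MIN_OPEN_FILES = 64000


def read_int(path):
    """Return the integer stored in a /proc or /sys file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def evaluate(name, current, is_ok):
    """Format one report line; a current value of None means it could not be read."""
    if current is None:
        return f"{name}: unknown"
    verdict = "ok" if is_ok(current) else "review"
    return f"{name}: {current} ({verdict})"


def report():
    """Collect read-only findings for some of the settings listed above."""
    soft_nofile, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    lines = [
        evaluate("zone_reclaim_mode",
                 read_int("/proc/sys/vm/zone_reclaim_mode"),
                 lambda v: v == 0),
        evaluate("tcp_keepalive_time",
                 read_int("/proc/sys/net/ipv4/tcp_keepalive_time"),
                 lambda v: v <= MAX_KEEPALIVE_SECONDS),
        evaluate("open files (soft limit)",
                 soft_nofile,
                 lambda v: v == resource.RLIM_INFINITY or v >= MIN_OPEN_FILES),
    ]
    # Read-ahead is listed per block device, without a verdict.
    for path in sorted(Path("/sys/block").glob("*/queue/read_ahead_kb")):
        lines.append(f"read_ahead_kb[{path.parts[3]}]: {read_int(path)}")
    return lines


for line in report():
    print(line)
```

Run it as the user that starts mongod, since ulimits are per-user and per-process.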
High Availability
(Diagram: a replica set with one MongoDB Primary and two MongoDB Secondaries)
High Availability Across Zones
(Diagram: a replica set with one MongoDB Primary and two MongoDB Secondaries, spread across Zone 1 and Zone 2)
High Availability Across Regions
(Diagram: a replica set with one MongoDB Primary and two MongoDB Secondaries, spread across Region 1 and Region 2)
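A replica set spread across regions, as in the diagrams above, might be described by a configuration document like the one built below. This is only a sketch: the hostnames and the set name are hypothetical, and giving the remote member priority 0 (so it is never elected Primary) is an assumption about the intended layout, not something the slides state.

```python
import json


def replica_set_config(name, members):
    """Build a replica set configuration document.

    members is a list of (host, priority) tuples; array position becomes _id.
    """
    return {
        "_id": name,
        "members": [
            {"_id": index, "host": host, "priority": priority}
            for index, (host, priority) in enumerate(members)
        ],
    }


# Hypothetical hosts: two members in Region 1, one in Region 2.
config = replica_set_config("rs0", [
    ("mongo1.region1.example.com:27017", 2),
    ("mongo2.region1.example.com:27017", 1),
    ("mongo3.region2.example.com:27017", 0),  # priority 0: never elected Primary
])
print(json.dumps(config, indent=2))
```

The printed document can be pasted into `rs.initiate(...)` in the mongo shell.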
Sharding
(Diagram: three shards, each a replica set with one MongoDB Primary and two MongoDB Secondaries)
Sharding Across Zones
(Diagram: three shards, each a replica set with one MongoDB Primary and two MongoDB Secondaries, spread across Zone 1 and Zone 2)
Sharding Across Regions
(Diagrams: two layouts of three shards, each a replica set with one MongoDB Primary and two MongoDB Secondaries, distributed across Region 1 and Region 2)
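Each shard in these layouts is itself a replica set, so it is registered with the cluster using a `setName/host1,host2,...` string. A small sketch of generating the corresponding `sh.addShard()` calls, using hypothetical set names and hostnames:

```python
def shard_connection_string(replica_set, hosts):
    """Return the 'setName/host1,host2,...' form used when adding a shard."""
    if not hosts:
        raise ValueError("a shard needs at least one host")
    return f"{replica_set}/{','.join(hosts)}"


# Hypothetical three-shard cluster; every name here is a placeholder.
shards = {
    "shard0": ["s0a.example.com:27017", "s0b.example.com:27017", "s0c.example.com:27017"],
    "shard1": ["s1a.example.com:27017", "s1b.example.com:27017", "s1c.example.com:27017"],
    "shard2": ["s2a.example.com:27017", "s2b.example.com:27017", "s2c.example.com:27017"],
}

for name, hosts in shards.items():
    print(f'sh.addShard("{shard_connection_string(name, hosts)}")')
```

The printed lines are meant to be run in a mongo shell connected to a mongos.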
Management Concerns
Upgrades
Maintenance
Scaling
Monitoring
Backups
Automating MongoDB with MMS
MongoDB Management Service
• MMS is a web-based tool that supports you from the very beginning of your MongoDB deployment lifecycle
• Use MMS to build and maintain your deployment and to manage its lifecycle (monitoring and backup)
MMS Changes
• Previously, MMS was used only to monitor and back up deployments
• But MMS arrived “late to the party”: mistakes or misconfigurations had already been applied to the initial deployment
• Monitoring was helpful, but it did not set users on the right path from the start
• Upgrade/maintenance tasks were non-trivial and very involved
Automation
• Provision instances in AWS
• Deploy any version of MongoDB
• Add replicas or shards
• Update configuration at any time
• Push a button to upgrade MongoDB
Monitoring
• Charting
– MongoDB-specific metrics and measurements
– View complete cluster topology and metrics for each component
– Create custom dashboards for key metrics and nodes
• Alerting
– Create alerts for just about any metric value change
– Target some or all hosts
– Customizable notifications including SMS, HipChat, PagerDuty
• Proactive Support
– Our engineers monitor your deployment and make suggestions
– Offered to Subscription Customers
Backup
• Customizable snapshot policy
• Point-in-time recovery for replica sets
• Consistent sharded cluster snapshots
• Low overhead, securely transferred
• Continuous, incremental backups
Backup
                                        Mongodump    File system    MMS Backup
Initial complexity                      Medium       High           Low
Confidence in backups                   Medium       Medium         High
Point-in-time recovery of replica set   Sort of ☺    No             Yes
System overhead                         High         Can be low     Low
Scalable                                No           With work      Yes
Consistent snapshot of sharded system   Difficult    Difficult      Yes
Integrations
• Compute, Storage, Persistent IPs, DNS
• Hadoop, Data Warehouse, Stream Processing, App Deployment
• Orchestration, Database, App Services, Caching
Elastic MapReduce
• Background
– Quickly deploy and run Hadoop in AWS
– Tuned distributions to run on top of EC2
– Provision deployments with any number of nodes
– Supports spot and reserved pricing to minimize cost
• MongoDB
– MongoDB Connector for Hadoop
– https://github.com/mongodb/mongo-hadoop
– Bi-directional access
– MapReduce, Hive, Pig, Streaming, Spark
– MongoDB deployments or BSON backup files
CloudWatch
• Monitoring for AWS resources
• Supports custom metrics
• Use AWS CLI to pipe MongoDB metrics
aws cloudwatch put-metric-data --metric-name ResidentMemory --namespace MongoDB --timestamp 2014-01-01T00:00:00Z --value 32 --unit Gigabytes
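To publish metrics on a schedule, the CLI invocation above can be assembled programmatically. The sketch below only builds and prints the argument list; the function name and its defaults are illustrative, and collecting the metric value from MongoDB is left out.

```python
import shlex
from datetime import datetime, timezone


def put_metric_command(metric_name, value, unit, namespace="MongoDB", timestamp=None):
    """Build the argument list for `aws cloudwatch put-metric-data`."""
    timestamp = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return [
        "aws", "cloudwatch", "put-metric-data",
        "--metric-name", metric_name,
        "--namespace", namespace,
        "--timestamp", timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "--value", str(value),
        "--unit", unit,
    ]


# Reproduce the command shown on the slide.
cmd = put_metric_command(
    "ResidentMemory", 32, "Gigabytes",
    timestamp=datetime(2014, 1, 1, tzinfo=timezone.utc),
)
print(shlex.join(cmd))
```

To actually send the metric, pass `cmd` to `subprocess.run(cmd, check=True)` on a host with the AWS CLI installed and credentials configured.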
Redshift
• Fully managed petabyte scale data warehouse as a service
• MongoDB not natively supported as an input data source
• Use Data Pipeline and EMR to move data
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html
Elastic Beanstalk
• Deploy and manage applications
• Handles provisioning, scaling, load balancing
• Built on EC2, S3, SNS, Auto Scaling
• Customize and configure software that your app needs
– Install packages, create files
– Execute commands
– Control system services
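Elastic Beanstalk reads these customizations from `.config` files (YAML or JSON) in an `.ebextensions` directory of the application bundle. As a sketch, the script below emits a JSON document that touches each of the four customization types. The package name, file path, command, service name, and config server host are all assumptions for illustration: it presumes the MongoDB yum repository is already configured and that an init script named `mongos` exists.

```python
import json


def ebextensions_config(config_servers):
    """Build an .ebextensions document covering the four customization types."""
    return {
        # Install packages (assumes the MongoDB yum repository is already set up).
        "packages": {"yum": {"mongodb-org-mongos": []}},
        # Create files (hypothetical path and contents).
        "files": {
            "/etc/mongos.conf": {
                "mode": "000644",
                "owner": "root",
                "group": "root",
                "content": "configdb = " + ",".join(config_servers) + "\n",
            }
        },
        # Execute commands.
        "commands": {
            "01_create_log_dir": {"command": "mkdir -p /var/log/mongodb"}
        },
        # Control system services (assumes an init script named mongos exists).
        "services": {
            "sysvinit": {"mongos": {"enabled": True, "ensureRunning": True}}
        },
    }


# Hypothetical config server hostname.
print(json.dumps(ebextensions_config(["cfg1.example.com:27019"]), indent=2))
```

The output could be saved as, for example, `.ebextensions/mongos.config`; the file name itself is arbitrary as long as it ends in `.config`.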
(Diagram: an Elastic Load Balancer in front of three App Servers in an Auto Scaling Group and Security Group, with three mongos routers connecting to MongoDB)
Route53
• Highly available and scalable DNS service
• Hostnames can be assigned to
– EC2 instances, ELB instances, S3 buckets
• DNS load balancing with weighted round robin
• Supports hostnames for non-AWS infrastructure
• Use hostnames for all MongoDB components
• With replica sets, hostnames can ease machine replacement
• With sharded clusters, hostnames can simplify config server maintenance
• Or use Automation!
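When every member is addressed by a DNS name, clients can keep the same connection string while the machine behind a name is replaced. A minimal sketch of building such a string, with hypothetical Route53 hostnames and a hypothetical replica set name:

```python
def replica_set_uri(hosts, replica_set, port=27017):
    """Build a MongoDB connection string from DNS hostnames."""
    if not hosts:
        raise ValueError("at least one host is required")
    seeds = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{seeds}/?replicaSet={replica_set}"


# Hypothetical hostnames managed in Route53.
print(replica_set_uri(
    ["mongo1.db.example.com", "mongo2.db.example.com", "mongo3.db.example.com"],
    "rs0",
))
```

The same hostnames would also be used in the replica set configuration itself, so replacing an instance only requires pointing its DNS record at the new machine.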
Questions?
• MongoDB
– http://www.mongodb.org
• MongoDB Documentation
– http://docs.mongodb.org
• MongoDB Management Service
– http://mms.mongodb.com