MongoDB & Features
Scaling, Security & Performance
Sasidhar Gogulapati
Why MongoDB?
When to use Mongo?
• When there is a need for high write load. Can do 80,000 inserts/sec on a single node.
– Sharding is required only if the data set is more than 50 million
• When high availability is required in an unreliable environment
– Setting up a replica set is easy and fast
– Recovery from a node failure is instant
• When data needs to grow big
– MySQL table performance degrades when table size is 5-10 GB
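To illustrate the replica set bullet above, here is a minimal sketch of the kind of configuration document passed to rs.initiate(). The set name "rs0" and the host names are hypothetical placeholders, and the helper function is not part of any driver.

```python
def replica_set_config(set_name, hosts):
    """Build a replica set configuration document.

    The shape (an _id plus a members list of {_id, host}) follows the
    document accepted by rs.initiate(). All values are placeholders.
    """
    if not hosts:
        raise ValueError("a replica set needs at least one member")
    return {
        "_id": set_name,
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }


# Hypothetical three-member set: one node can fail and a majority remains.
config = replica_set_config(
    "rs0",
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
)
```

Three members is a common starting point because the set can still elect a primary after losing one node.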
Contd.
• MongoDB has a built-in, easy solution for partitioning and sharding
• When data is location based
– With the built-in functions of MongoDB, it is fast and accurate to find data from specific locations.
• With over 2,000/s CDR inserts, the MongoDB architecture is great for a system that must support high insert load. Yet you can guarantee transactions with findAndModify (which is slower) and two-phase commit (implemented in the application)
• Schema-less design enables rapid introduction of new fields. As MongoDB is schema-less, adding a new field does not affect old rows (or documents) and is instant
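For the location-based bullet above, the sketch below builds a query filter using MongoDB's $near operator with a GeoJSON point. The field name "location", the coordinates, and the helper function are assumptions for illustration. A query like this also assumes a 2dsphere index exists on the field.

```python
def near_query(field, longitude, latitude, max_distance_m):
    """Build a filter matching documents near a point, closest first.

    Uses the $near operator with a GeoJSON Point. GeoJSON orders
    coordinates as [longitude, latitude]; $maxDistance is in meters.
    """
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    return {
        field: {
            "$near": {
                "$geometry": {
                    "type": "Point",
                    "coordinates": [longitude, latitude],
                },
                "$maxDistance": max_distance_m,
            }
        }
    }


# Hypothetical lookup: documents within 500 m of a point.
query = near_query("location", -73.9857, 40.7484, 500)
```

The resulting dictionary could be passed as the filter argument of a driver's find call.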
How Secure is MongoDB?
Security Features
MongoDB provides various features, such as authentication, access control, and encryption, to secure your MongoDB deployments. Some key security features include:
Encryption at Rest
Encryption at rest, when used in conjunction with transport encryption and good security policies that
protect relevant accounts, passwords, and encryption keys, can help ensure compliance with security and
privacy standards, including HIPAA, PCI-DSS, and FERPA.
*Available only in the Enterprise version
Transport Encryption
• MongoDB supports TLS/SSL to encrypt all of MongoDB’s network traffic
• TLS/SSL ensures that MongoDB network traffic is only readable by the intended client. TLS/SSL implementation uses OpenSSL libraries.
• MongoDB’s SSL encryption only allows use of strong SSL ciphers with a minimum of 128-bit key length for all connections.
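As a rough sketch of how a client opts into transport encryption, the helper below assembles a connection string with the tls and tlsCAFile options. The host, port, and certificate path are hypothetical, and older drivers used ssl-prefixed option names instead, so treat the option names as an assumption about a recent driver.

```python
from urllib.parse import urlencode


def tls_uri(host, port, ca_file):
    """Build a MongoDB connection string that requests TLS.

    tls=true turns on transport encryption; tlsCAFile names the CA
    bundle used to verify the server certificate. Option values are
    percent-encoded, so "/" in the path becomes "%2F".
    """
    options = urlencode({"tls": "true", "tlsCAFile": ca_file})
    return f"mongodb://{host}:{port}/?{options}"


# Hypothetical host and CA path.
uri = tls_uri("db.example.net", 27017, "/etc/ssl/ca.pem")
```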
Sharding
Sharding
• Sharding - Method for distributing data across multiple machines
• Should be used only with large data sets and high throughput operations
• MongoDB supports horizontal scaling by sharding (increasing the number of instances where MongoDB is installed)
Sharded Cluster
A MongoDB sharded cluster consists of the following components:
1. Shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set
2. mongos: Acts as a query router, providing an interface between client applications and the sharded cluster
3. Config Servers: Store metadata and configuration settings for the cluster
Components in Sharded Cluster
MongoDB shards data at the collection level, distributing the data across the shards in the cluster.
Shard
• A shard contains a subset of the sharded data for a sharded cluster
• Shards should be deployed as replica sets to provide redundancy and high availability
• A query on a single shard returns only a subset of the data
• Users, clients, or applications should connect directly to a shard only to perform local administrative or maintenance operations
• Use mongos for operations at the cluster level, including reads and writes
Config Servers
• Config servers store the metadata of a sharded cluster
• Metadata reflects the state and organization of all data and components in the sharded cluster
• It also includes the list of chunks on every shard and the ranges that define the chunks
• Deploy config servers as replica sets
• mongos instances cache this data and use it for routing read and write operations
mongos
• Mongos instances route queries and write operations to shards in a sharded cluster
• Tracks what data is on which shard by caching the metadata from config servers
• Mongos has no persistent state and uses minimal system resources
• The most common practice is to run mongos on the same servers as the application
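The routing described above can be sketched with a toy lookup table. This is a simplified stand-in for the chunk metadata a mongos caches, not MongoDB's actual implementation. The class name, shard names, and split points are all hypothetical.

```python
import bisect


class ChunkRouter:
    """Toy model of the chunk ranges a mongos caches from config servers."""

    def __init__(self, split_points, shards):
        # N sorted split points define N + 1 chunks. Each chunk covers
        # [lower, upper), so a key equal to a split point belongs to the
        # chunk that starts there.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one shard per chunk")
        if list(split_points) != sorted(split_points):
            raise ValueError("split points must be sorted")
        self.split_points = list(split_points)
        self.shards = list(shards)

    def shard_for(self, key):
        """Return the shard owning the chunk that contains key."""
        return self.shards[bisect.bisect_right(self.split_points, key)]

    def shards_for_range(self, low, high):
        """Return the shards whose chunks overlap the range [low, high]."""
        first = bisect.bisect_right(self.split_points, low)
        last = bisect.bisect_right(self.split_points, high)
        return sorted(set(self.shards[first:last + 1]))


# Hypothetical cluster: three chunks split at keys 1000 and 2000.
router = ChunkRouter([1000, 2000], ["shardA", "shardB", "shardC"])
single = router.shard_for(1500)
spanning = router.shards_for_range(500, 1500)
```

A lookup on one shard key value targets a single shard, while a range query may need to contact every shard whose chunks overlap the range.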
Shard Keys
• To distribute the documents in a collection, MongoDB partitions the collection using the shard key
• The shard key is chosen when sharding the collection. It cannot be changed after sharding (newer MongoDB releases relax this restriction)
• A sharded collection can have only one shard key
• The choice of shard key affects the performance, efficiency, and scalability of a sharded cluster
• Shard key specification: https://docs.mongodb.com/manual/core/sharding-shard-key/#sharding-shard-key-creation
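To make the effect of shard key choice concrete, the sketch below compares range-based and hash-based placement for a monotonically increasing key, such as a counter or timestamp. It is an illustration under simplifying assumptions: the boundaries are made up, and the modulo over an MD5 digest only stands in for hashed sharding, which actually assigns ranges of hash values to chunks.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Hypothetical range boundaries: shard 0 holds keys below 2500,
# shard 1 holds 2500-4999, shard 2 holds 5000-7499, shard 3 the rest.
BOUNDARIES = [2500, 5000, 7500]


def ranged_shard(key):
    """Place a key by comparing it against sorted range boundaries."""
    for shard, upper in enumerate(BOUNDARIES):
        if key < upper:
            return shard
    return len(BOUNDARIES)


def hashed_shard(key):
    """Place a key by hashing it first (illustrative stand-in only)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# New inserts with ever-increasing keys, e.g. a sequence number.
new_keys = range(10_000, 11_000)
ranged_load = Counter(ranged_shard(k) for k in new_keys)
hashed_load = Counter(hashed_shard(k) for k in new_keys)
```

With ranged placement every new key lands on the last shard, which becomes a write hot spot. Hashing spreads the same inserts across all shards, at the cost of turning range queries on the key into multi-shard queries.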
How to scale MongoDB?
Scaling
• Sharding (clustering)
• Vertical Scaling (increase in CPU, memory, etc.)
• Horizontal Scaling (Increase in instances)
Sharding Process
Sharding is MongoDB's solution to scaling issues. Using sharding, developers can horizontally scale the database over multiple servers. The dataset is divided across multiple servers. Each individual server is called a shard and is an independent database. All shards together make up a single logical database.
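As a sketch of what starting this process looks like, the helper below builds the two administrative command documents, enableSharding and shardCollection, that are issued through a mongos against the admin database. The database, collection, and field names are hypothetical, and the helper itself is not a driver API.

```python
def sharding_commands(database, collection, key_field, hashed=False):
    """Build the admin commands that shard one collection.

    enableSharding marks the database as shardable, and shardCollection
    partitions the collection on the given key. A key value of 1 means
    range-based sharding; "hashed" means hash-based sharding.
    """
    namespace = f"{database}.{collection}"
    key_spec = {key_field: "hashed" if hashed else 1}
    return [
        {"enableSharding": database},
        {"shardCollection": namespace, "key": key_spec},
    ]


# Hypothetical names: shard the "calls" collection on a hashed user_id.
commands = sharding_commands("telecom", "calls", "user_id", hashed=True)
```

Each dictionary would be sent in order as a database command through a connection to a mongos.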
MongoDB Performance
Metrics
Recently usain.com published a comprehensive independent database comparison, measuring performance across multiple dimensions using the Yahoo! Cloud Serving Benchmark (YCSB). In these tests, MongoDB overwhelmingly outperformed key-value stores in terms of throughput and latency across a number of configurations.
Performance Contd.
• When all three databases are configured the same way, MongoDB provides 20% greater throughput than Cassandra, and 50% greater throughput than Couchbase
• When tested with a configuration that prevents any possible data loss, MongoDB outperforms Cassandra and Couchbase by more than 25x, with latency that is more than 95% better than Cassandra, and more than 99.5% better than Couchbase
• Finally, when tested with a configuration that provides excellent performance and minimal possible data loss in the event of a node failure, MongoDB provides 3x greater throughput than Cassandra in read-intensive workloads, and 70% higher throughput in write-intensive workloads, while providing 80% lower latency