Webinar: Capacity Planning

Post on 05-Jul-2015


description

Deploying MongoDB can be a challenge if you don't understand how it uses resources or how to plan the capacity of your systems. Whether you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably face the same challenges in sizing that deployment. This talk covers which resources MongoDB uses and how to plan for their use in your deployment. Topics include how to model and plan capacity needs for a new deployment, how to grow an existing one, and how to identify the steps along your path to scale. The goal of this presentation is to give you the tools you need to manage your MongoDB capacity planning tasks successfully.

Transcript of Webinar: Capacity Planning

Server Engineer

Shaun Verch

Capacity Planning: Deploying MongoDB

Capacity Planning

• Why is it important?

• What is it?

• When is it important?

• How is it actually done?

Why?

• What are the consequences of not planning?

Why does it matter?

What?

What is Capacity Planning?

Requirements

Resources

• Availability

• Throughput

• Responsiveness

Requirements to Hardware

Resource Usage

• Storage

– IOPS

– Size

– Data & Loading Patterns

• Memory

– Working Set

• CPU

– Speed

– Cores

• Network

– Latency

– Throughput

Storage

• Active

• Archival

• Loading Patterns

• Integration (BI/DW)

Example IOPS

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS

Amazon SSD 9,000 – 120,000 IOPS

Storage Capability

Intel X25-E (SLC) ~ 5,000 IOPS

Fusion IO ~ 135,000 IOPS

Violin Memory 6000 ~ 1,000,000 IOPS
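As a rough illustration of how these figures feed a sizing exercise, the sketch below divides a target IOPS figure by the per-device figure and rounds up. It uses the low end of each range quoted above. The 2,000 IOPS target and the function name are hypothetical and not from the talk, and the calculation ignores RAID overhead, caching, and read/write mix.

```python
import math

# Low end of the per-device figures quoted above (IOPS).
DEVICE_IOPS = {
    "7,200 rpm SATA": 75,
    "15,000 rpm SAS": 175,
    "Intel X25-E (SLC)": 5_000,
    "Fusion IO": 135_000,
}


def devices_needed(target_iops: int, per_device_iops: int) -> int:
    """Smallest number of devices whose combined IOPS reaches the target."""
    return math.ceil(target_iops / per_device_iops)


TARGET_IOPS = 2_000  # hypothetical requirement, not from the talk

estimate = {
    name: devices_needed(TARGET_IOPS, iops) for name, iops in DEVICE_IOPS.items()
}
for name, count in estimate.items():
    print(f"{name}: {count} device(s)")
```

Under these assumptions the same requirement needs 27 SATA drives, 12 SAS drives, or a single SSD, which is why IOPS and capacity should be sized separately.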

Cost of IOPS

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS

Amazon SSD 9,000 – 120,000 IOPS

Storage Costs

Memory

• Working Set

– Active Data in Memory

– Measured Over Periods

Memory

• Work:

– Sorting

– Aggregation

– Connections

Memory & Storage

Working Set

Number of distinct pages accessed per unit of time

Working Set

Number of distinct pages accessed per second

Working Set

4 distinct pages per second

Worst case 4 disk accesses

Working Set

6 distinct pages per second

Worst case disk access on every op
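The two cases above can be reproduced with a small simulation. This is a minimal sketch, assuming that memory holds four pages and that pages are evicted in least-recently-used order. Neither assumption comes from the slides, whose diagrams are not preserved here, and the function names are illustrative.

```python
from collections import OrderedDict


def working_set_size(accesses):
    """Number of distinct pages touched in one measurement window."""
    return len(set(accesses))


def count_disk_accesses(accesses, memory_pages):
    """Simulate an LRU page cache and count accesses that miss memory."""
    cache = OrderedDict()
    misses = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            misses += 1  # page has to be read from disk
            cache[page] = True
            if len(cache) > memory_pages:
                cache.popitem(last=False)  # evict least recently used page
    return misses


MEMORY_PAGES = 4  # assumed: memory holds four pages

fits = [1, 2, 3, 4] * 3            # 4 distinct pages, touched repeatedly
thrashes = [1, 2, 3, 4, 5, 6] * 3  # 6 distinct pages, touched repeatedly

fits_misses = count_disk_accesses(fits, MEMORY_PAGES)
thrash_misses = count_disk_accesses(thrashes, MEMORY_PAGES)
print(working_set_size(fits), fits_misses)
print(working_set_size(thrashes), thrash_misses)
```

With four distinct pages the simulation reads each page from disk once (4 disk accesses for 12 operations). With six distinct pages cycling through four pages of memory, every one of the 18 operations goes to disk.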

Memory & Storage

MOPs

PFs

CPU

• Non-indexed Data

• Sorting

• Aggregation

– Map/Reduce

– Framework

• Data

– Fields

– Nesting

– Arrays/Embedded-Docs

Network

• Latency

– WriteConcern

– ReadPreference

– Batching

• Throughput

– Update/Write Patterns

– Reads/Queries

What is failure?

• We have failed at Capacity Planning when our resources don’t meet our requirements

• Because our requirements can have many dimensions, we may exceed our requirements in one characteristic but not meet them in another

• This means that we can spend many $$$ and still fail!
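The point about multiple dimensions can be sketched as a simple per-dimension check. All names and numbers below are hypothetical, and the sketch assumes every dimension is "higher is better" (a latency requirement would need the comparison reversed).

```python
def unmet_requirements(required: dict, provided: dict) -> list:
    """Return the dimensions where the provided capacity is below the requirement."""
    return sorted(
        name for name, need in required.items() if provided.get(name, 0) < need
    )


# Hypothetical requirements and hardware, for illustration only.
required = {"storage_iops": 2_000, "memory_gb": 64, "cpu_cores": 8, "network_mbps": 1_000}
provided = {"storage_iops": 135_000, "memory_gb": 32, "cpu_cores": 16, "network_mbps": 1_000}

missing = unmet_requirements(required, provided)
print(missing)
```

Here storage exceeds the requirement many times over, yet the plan still fails because memory falls short.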

What about Legacy Hardware?

• Let’s hope whatever worked for this legacy technology also works for MongoDB

• Same principles of Capacity Planning still apply

When?

• Before it's too late!

• When?

Capacity Planning: When

Start → Launch → Version 2

Capacity Planning is Measurement

Measuring early gives you a comparison point for when you need to do it again
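One way to use an early measurement as a comparison point is to project growth from it. The sketch below assumes linear growth between two measurements, which is a simplification. The function name and all numbers are hypothetical.

```python
def days_until_full(baseline_gb, current_gb, days_between, capacity_gb):
    """Project days of headroom left, assuming linear growth since the baseline."""
    growth_per_day = (current_gb - baseline_gb) / days_between
    if growth_per_day <= 0:
        return None  # no growth observed, so no projection is possible
    return (capacity_gb - current_gb) / growth_per_day


# Hypothetical measurements: 100 GB at launch, 130 GB thirty days later, 500 GB disk.
remaining = days_until_full(baseline_gb=100, current_gb=130, days_between=30, capacity_gb=500)
print(remaining)
```

Without the baseline there is no growth rate to project from, only a single data point.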

Velocity of Change

• Limitations -> takes time

– Data Movement

– Allocation/Provisioning (servers/mem/disk)

• Improvement

– Limit Size of Change (if you can)

– Increase Frequency

– MEASURE its effect

– Practice

Repeat (continuously)

• Repeat Testing

• Repeat Evaluations

• Repeat Deployment

How?

Monitoring

• Storage

• Memory

• CPU

• Network

• Application Metrics

Tools

• MMS (MongoDB Monitoring Service)

• MongoDB: mongotop, mongostat

• Linux: iostat, vmstat, sar, etc.

• Windows: Perfmon

Measure realistic loads (generated by load testing)

Models

• Load/Users

– Response Time/TTFB

• System Performance

– Peak Usage

– Min Usage
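A minimal sketch of summarizing peak and minimum usage from monitoring samples follows. The sample values are hypothetical operations-per-second readings, not data from the talk.

```python
def usage_summary(samples):
    """Summarize the highs and lows of a series of usage samples."""
    low, peak = min(samples), max(samples)
    return {
        "min": low,
        "peak": peak,
        # Ratio of peak to minimum load; None if the minimum is zero.
        "peak_to_min": peak / low if low else None,
    }


ops_per_second = [120, 450, 980, 300, 150]  # hypothetical samples
summary = usage_summary(ops_per_second)
print(summary)
```

Sizing for the peak rather than the average is what keeps responsiveness within requirements during the highs.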

Starter Questions

• What is the working set?

– How does that equate to memory?

– How much disk access will that require?

• How efficient are the queries?

• What is the rate of data change?

• How big are the highs and lows?

Deployment Types

All of these use the same resources:

• Single Instance

• Multiple Instances (Replica Set)

• Cluster (Sharding)

• Data Centers

Questions?

Server Engineer, MongoDB

Shaun Verch

Thank You