Webinar: Capacity Planning

Server Engineer Shaun Verch Capacity Planning: Deploying MongoDB

description

Deploying MongoDB can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. If you need to deploy or grow a single MongoDB instance, a replica set, or tens of sharded clusters, you probably face the same challenges in sizing that deployment. This talk will cover what resources MongoDB uses and how to plan for their use in your deployment. Topics covered will include how to model and plan capacity needs for a new deployment, how to grow an existing one, and how to define the scaling steps along your path to the top. The goal of this presentation is to provide you with the tools you need to manage your MongoDB capacity planning tasks successfully.

Transcript of Webinar: Capacity Planning

Page 1: Webinar: Capacity Planning

Server Engineer

Shaun Verch

Capacity Planning:

Deploying MongoDB

Page 2: Webinar: Capacity Planning

Capacity Planning

• Why is it important?

• What is it?

• When is it important?

• How is it actually done?

Page 3: Webinar: Capacity Planning

Why?

Page 4: Webinar: Capacity Planning

• What are the consequences of not planning?

Why does it matter?

Page 5: Webinar: Capacity Planning

What?

Page 6: Webinar: Capacity Planning

What is Capacity Planning?

Requirements

Resources

Page 7: Webinar: Capacity Planning

• Availability

• Throughput

• Responsiveness

Requirements

Page 8: Webinar: Capacity Planning

• Availability

• Throughput

• Responsiveness

Requirements to Hardware

Page 9: Webinar: Capacity Planning

Resource Usage

• Storage

– IOPS

– Size

– Data & Loading Patterns

• Memory

– Working Set

• CPU

– Speed

– Cores

• Network

– Latency

– Throughput

Page 10: Webinar: Capacity Planning

Storage

• Active

• Archival

• Loading Patterns

• Integration (BI/DW)

Page 11: Webinar: Capacity Planning

Storage

• Active

• Archival

• Loading Patterns

• Integration (BI/DW)

Example IOPS

Page 12: Webinar: Capacity Planning

Example IOPS

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS

Amazon SSD 9,000 – 120,000 IOPS

Storage Capability

Page 13: Webinar: Capacity Planning

Example IOPS

Storage Capability

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS

Amazon SSD 9,000 – 120,000 IOPS

Intel X25-E (SLC) ~ 5,000 IOPS

Fusion IO ~ 135,000 IOPS

Violin Memory 6000 ~ 1,000,000 IOPS

Page 14: Webinar: Capacity Planning

Cost of IOPS

Storage Costs

7,200 rpm SATA ~ 75-100 IOPS

15,000 rpm SAS ~ 175-210 IOPS

Amazon EBS/Provisioned ~ 100 IOPS, "up to" 2,000 IOPS

Amazon SSD 9,000 – 120,000 IOPS

Intel X25-E (SLC) ~ 5,000 IOPS

Fusion IO ~ 135,000 IOPS

Violin Memory 6000 ~ 1,000,000 IOPS
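The per-device figures above can be turned into a rough sizing estimate. The following is a minimal sketch, not a method from the talk: it assumes IOPS add up linearly across devices (ignoring RAID and controller overhead), sizes against the low end of each range, and uses a hypothetical target of 2,000 IOPS. The function name `devices_needed` is illustrative only.

```python
import math

# Approximate IOPS per device (low, high), copied from the slide figures.
DEVICE_IOPS = {
    "7,200 rpm SATA": (75, 100),
    "15,000 rpm SAS": (175, 210),
    "Intel X25-E (SLC)": (5_000, 5_000),
    "Fusion IO": (135_000, 135_000),
}


def devices_needed(required_iops, device):
    """Return how many devices cover required_iops at the low-end figure.

    Assumption: IOPS scale linearly with device count, which real
    arrays only approximate.
    """
    low, _high = DEVICE_IOPS[device]
    return math.ceil(required_iops / low)


if __name__ == "__main__":
    target = 2_000  # hypothetical requirement, not from the talk
    for name in DEVICE_IOPS:
        print(f"{name}: {devices_needed(target, name)} device(s)")
```

Sizing against the low end of each range is a deliberately conservative choice; measured numbers from your own hardware should replace the table values.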

Page 15: Webinar: Capacity Planning

Memory

• Working Set

– Active Data in Memory

– Measured Over Periods

Page 16: Webinar: Capacity Planning

Memory

• Work:

– Sorting

– Aggregation

– Connections

Page 17: Webinar: Capacity Planning

Memory & Storage

><?

Page 18: Webinar: Capacity Planning

Working Set

Number of distinct pages

accessed per unit of time

Page 19: Webinar: Capacity Planning

Working Set

Number of distinct pages

accessed per second

Page 20: Webinar: Capacity Planning

Working Set

4 distinct pages per second

Page 21: Webinar: Capacity Planning

Working Set

4 distinct pages per second

Page 22: Webinar: Capacity Planning

Working Set

4 distinct pages per second

Worst case 4 disk accesses

Page 23: Webinar: Capacity Planning

Working Set

6 distinct pages per second

Page 24: Webinar: Capacity Planning

Working Set

6 distinct pages per second

Page 25: Webinar: Capacity Planning

Working Set

6 distinct pages per second

Page 26: Webinar: Capacity Planning

Working Set

6 distinct pages per second

Worst case disk access on every op
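The two cases above (4 distinct pages versus 6 distinct pages per second) can be reproduced with a small simulation. This is a sketch under stated assumptions: memory is modeled as an LRU cache holding a hypothetical 5 pages, pages are touched in a repeating cycle, and every cache miss counts as one disk access. The slides do not state the memory size or the eviction policy, so both are illustrative choices.

```python
from collections import OrderedDict


def working_set(accesses):
    """Number of distinct pages touched in the given access window."""
    return len(set(accesses))


def lru_page_faults(accesses, cache_pages):
    """Count misses (disk accesses) for an LRU cache of cache_pages pages."""
    cache = OrderedDict()
    faults = 0
    for page in accesses:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used
    return faults


if __name__ == "__main__":
    memory_pages = 5  # hypothetical memory size, in pages
    fits = [0, 1, 2, 3] * 3          # working set of 4 pages
    spills = [0, 1, 2, 3, 4, 5] * 2  # working set of 6 pages
    for trace in (fits, spills):
        print(working_set(trace), "distinct pages,",
              lru_page_faults(trace, memory_pages), "disk accesses in",
              len(trace), "ops")
```

When the working set fits, only the first touch of each page goes to disk; once it exceeds memory by even one page, this cyclic pattern misses on every operation.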

Page 27: Webinar: Capacity Planning

Memory & Storage

MOPs

PFs

Page 28: Webinar: Capacity Planning

CPU

• Non-indexed Data

• Sorting

• Aggregation

– Map/Reduce

– Framework

• Data

– Fields

– Nesting

– Arrays/Embedded-Docs

Page 29: Webinar: Capacity Planning

Network

• Latency

– WriteConcern

– ReadPreference

– Batching

• Throughput

– Update/Write Patterns

– Reads/Queries

Page 30: Webinar: Capacity Planning

What is failure?

• We have failed at Capacity Planning when our resources don’t meet our requirements

• Because our requirements can have many dimensions, we may exceed our requirements in one characteristic but not meet them in another

• This means that we can spend many $$$ and still fail!
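The point about many dimensions can be made concrete with a simple check. This is a minimal sketch: the dimension names (`storage_iops`, `memory_gb`, `network_mbps`) and all numbers are hypothetical, chosen only to show a deployment that overspends on one resource and still falls short on another.

```python
def unmet_requirements(required, provisioned):
    """Return the names of dimensions where provisioned < required."""
    return sorted(
        name
        for name, need in required.items()
        if provisioned.get(name, 0) < need
    )


if __name__ == "__main__":
    # Hypothetical requirements and resources, for illustration only.
    required = {"storage_iops": 2_000, "memory_gb": 64, "network_mbps": 500}
    provisioned = {"storage_iops": 135_000, "memory_gb": 32, "network_mbps": 1_000}
    print(unmet_requirements(required, provisioned))
```

Here the storage far exceeds what is needed, yet the plan still fails because memory is half of the requirement.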

Page 31: Webinar: Capacity Planning

What about Legacy Hardware?

• Let’s hope whatever worked for this legacy technology also works for MongoDB

• Same principles of Capacity Planning still apply

Page 32: Webinar: Capacity Planning

When?

Page 33: Webinar: Capacity Planning

Capacity Planning: When

• When?

• Before it's too late!

Timeline: Start → Launch → Version 2

Page 34: Webinar: Capacity Planning

Capacity Planning is Measurement

Measuring early gives you a comparison point for when you need to do it again
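One way to use an early measurement as a comparison point is to compute the relative change of each metric against the baseline. This is a sketch only: the metric names and values are hypothetical, and in practice they would come from whatever monitoring you already run.

```python
def relative_change(baseline, current):
    """Fractional change per metric, e.g. 0.5 means 50% higher than baseline.

    Metrics missing from either snapshot, or with a zero baseline,
    are skipped rather than guessed.
    """
    return {
        name: (current[name] - baseline[name]) / baseline[name]
        for name in baseline
        if name in current and baseline[name] != 0
    }


if __name__ == "__main__":
    # Hypothetical snapshots taken at launch and some months later.
    baseline = {"ops_per_sec": 1_000, "page_faults_per_sec": 10}
    current = {"ops_per_sec": 1_500, "page_faults_per_sec": 40}
    for name, change in relative_change(baseline, current).items():
        print(f"{name}: {change:+.0%}")
```

In this made-up example, page faults grow much faster than operations, which would suggest the working set is outgrowing memory.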

Page 35: Webinar: Capacity Planning

Velocity of Change

• Limitations -> takes time

– Data Movement

– Allocation/Provisioning (servers/mem/disk)

• Improvement

– Limit Size of Change (if you can)

– Increase Frequency

– MEASURE its effect

– Practice

Page 36: Webinar: Capacity Planning

Repeat (continuously)

• Repeat Testing

• Repeat Evaluations

• Repeat Deployment

Page 37: Webinar: Capacity Planning

How?

Page 38: Webinar: Capacity Planning

Monitoring

• Storage

• Memory

• CPU

• Network

• Application Metrics

Page 39: Webinar: Capacity Planning

Tools

• MMS (MongoDB Monitoring Service)

• MongoDB: mongotop, mongostat

• Linux: iostat, vmstat, sar, etc.

• Windows: Perfmon

Measure realistic loads (generated by load testing)

Page 40: Webinar: Capacity Planning

Models

• Load/Users

– Response Time/TTFB

• System Performance

– Peak Usage

– Min Usage

Page 41: Webinar: Capacity Planning

Starter Questions

• What is the working set?

– How does that equate to memory?

– How much disk access will that require?

• How efficient are the queries?

• What is the rate of data change?

• How big are the highs and lows?
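The rate-of-change question can be turned into a first-order projection of when storage runs out. This is a minimal sketch assuming linear growth, which real workloads rarely follow exactly; the function name and all figures are hypothetical.

```python
import math


def days_until_full(capacity_gb, used_gb, growth_gb_per_day):
    """Whole days of headroom left, assuming constant daily growth.

    Returns None when data is not growing, and 0 when already full.
    """
    if growth_gb_per_day <= 0:
        return None
    remaining = capacity_gb - used_gb
    if remaining <= 0:
        return 0
    return math.floor(remaining / growth_gb_per_day)


if __name__ == "__main__":
    # Hypothetical volume: 1 TB capacity, 400 GB used, growing 5 GB per day.
    print(days_until_full(1_000, 400, 5), "days of headroom")
```

Because provisioning and data movement take time, the useful comparison is between this headroom and how long it takes you to add capacity.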

Page 42: Webinar: Capacity Planning

Deployment Types

All of these use the same resources:

• Single Instance

• Multiple Instances (Replica Set)

• Cluster (Sharding)

• Data Centers

Page 43: Webinar: Capacity Planning

Questions?

Page 44: Webinar: Capacity Planning

Server Engineer, MongoDB

Shaun Verch

Thank You