Ssd collab13

Post on 26-Jan-2015


Databases in a Solid State World

How Exadata X3 and Other Database Systems Leverage the Performance of Flash

Gwen Shapira, Senior Consultant

February, 2013

About Me

– Oracle ACE Director
– Member of Oak Table
– 14 years of IT
– Performance Tuning
– Troubleshooting
– Hadoop
– Presents, Blogs, Tweets
– @gwenshap

© 2013 Pythian

About Pythian

• Recognized Leader:

– Global industry-leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and Microsoft SQL Server

– Work with over 250 multinational companies such as Forbes.com, Fox Sports, Nordion and Western Union to help manage their complex IT deployments

• Expertise:

– Pythian’s data experts are the elite in their field. We have the highest concentration of Oracle ACEs on staff—9 including 2 ACE Directors—and 2 Microsoft MVPs.

– Pythian holds 7 Specializations under Oracle Platinum Partner program, including Oracle Exadata, Oracle GoldenGate & Oracle RAC

• Global Reach & Scalability:

– Around the clock global remote support for DBA and consulting, systems administration, special projects or emergency response


You Never Forget Your First SSD

Sh*t People Say about SSD:


Fast for reads

Don’t use for writes

Use for random writes

Don’t use for REDO

Used for REDO

Only used in Exadata

Only Sun flash devices are supported

Unreliable

Becomes slower over time

Type of SSD matters

Use SATA SSD

Use PCI SSD

Use SSD in SAN

Too expensive

Is it the same as Flash?

Solid State Disk = No Spinning = Low Latency Random IO

We are talking about: NAND FLASH

• As opposed to RAM Flash, which is rare but awesome
• SLC
– One bit per cell
– High performance
• MLC
– Two bits per cell
– High capacity

[Figure: an SLC cell holds one of two states (0, 1); an MLC cell holds one of four (00, 01, 10, 11)]

Will Talk About:

• IO Performance
• Using SSDs for Oracle
• How Exadata and ODA use SSDs
• SSD devices
• Practice: Reading SSD Vendor Specs

Anatomy of an SSD

• Cell: 1 bit
• Page: 4K
• Block: 128 pages = 512K
• Plane: 1024 blocks = 512MB
• Planes are grouped into dies, which are grouped into packages

The Big Catch: We read and write pages, but delete blocks
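The sizes on the anatomy slide follow from simple multiplication. A minimal sketch that rederives them from the per-page and per-block figures, and shows how much larger the erase unit is than the write unit (variable names are illustrative, not from the talk):

```python
# Geometry figures taken from the slide; everything else is derived.
PAGE_BYTES = 4 * 1024        # one page = 4K
PAGES_PER_BLOCK = 128        # one block = 128 pages
BLOCKS_PER_PLANE = 1024      # one plane = 1024 blocks

block_bytes = PAGE_BYTES * PAGES_PER_BLOCK
plane_bytes = block_bytes * BLOCKS_PER_PLANE

# Smallest unit we can write (a page) vs. smallest unit we can erase (a block).
erase_to_write_ratio = block_bytes // PAGE_BYTES

print(f"block = {block_bytes // 1024}K")
print(f"plane = {plane_bytes // 2**20}MB")
print(f"erase unit is {erase_to_write_ratio}x the write unit")
```

That ratio is the reason overwrites are expensive: changing one page can force a whole block to be rewritten.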

IO Operations


Reads

• CPU registers – 0.3* ns (1 cycle)
• CPU Cache L1 – 1.2* ns
• CPU Cache L2 – 3.0* ns
• CPU Cache L3 – 12-24 ns
• Main Memory (RAM) – 60-100 ns
• SSD – 60,000 ns
• Magnetic Storage (“DISK”) – 3,000,000 ns
• SAN devices ~ 15,000,000 ns
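The read latencies are easier to compare as ratios than as raw nanoseconds. A small sketch using the slide's numbers; where the slide gives a range, the upper bound is used, which is an assumption made here for simplicity:

```python
# Read latencies in nanoseconds, from the slide.
# Ranges (L3, RAM) are represented by their upper bound: an assumption.
READ_LATENCY_NS = {
    "CPU register": 0.3,
    "L1 cache": 1.2,
    "L2 cache": 3.0,
    "L3 cache": 24,
    "RAM": 100,
    "SSD": 60_000,
    "Disk": 3_000_000,
    "SAN": 15_000_000,
}

def slowdown(device: str, baseline: str = "SSD") -> float:
    """How many times slower `device` is than `baseline`."""
    return READ_LATENCY_NS[device] / READ_LATENCY_NS[baseline]

for name, ns in READ_LATENCY_NS.items():
    print(f"{name:>12}: {ns:>14,.1f} ns  ({slowdown(name, 'RAM'):,.3f}x RAM)")
```

With these figures a magnetic disk read is 50 times slower than an SSD read, and an SSD read is still 600 times slower than RAM.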

What about throughput?

• 15K RPM SAS HDD – 120-200MB/s
• PCIe SSD – 1-2GB/s
• But … How many disks do you use?
• Network bandwidth?
• CPU Bus bandwidth?
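The questions on the throughput slide can be turned into arithmetic. A rough sketch: how many striped HDDs it would take to match one PCIe SSD on sequential throughput, and how the slowest stage in the path caps what you actually get. It assumes 1GB/s = 1000MB/s and perfect striping; the 600MB/s link in the last line is a hypothetical example, not a figure from this slide.

```python
import math

def hdds_to_match(ssd_mb_s: float, hdd_mb_s: float) -> int:
    """Number of perfectly striped HDDs needed to reach the SSD's throughput."""
    return math.ceil(ssd_mb_s / hdd_mb_s)

def effective_throughput(*stage_mb_s: float) -> float:
    """Throughput of a path is limited by its slowest stage."""
    return min(stage_mb_s)

# Slide figures: HDD 120-200MB/s, PCIe SSD 1-2GB/s.
best_case = hdds_to_match(1000, 200)    # slow SSD vs. fast HDDs
worst_case = hdds_to_match(2000, 120)   # fast SSD vs. slow HDDs

# Hypothetical: a 2GB/s SSD behind a 600MB/s link.
capped = effective_throughput(2000, 600)

print(best_case, worst_case, capped)
```

So a modest stripe of disks can match an SSD on throughput; the SSD's real advantage is latency.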

Writes

• Writes on new SSD – 250,000 ns
• Similar to sequential write to disk

How much data can you write to a new 250GB SSD?

Deletes

• Can’t overwrite data without deleting first
• Can only delete blocks of 128*4K pages
• To overwrite a page:
– Read 127 pages
– Write 127 to a free block
– Delete old block
– Perform the write we originally requested
• Takes 2ms
• Each cell can only be written 100K times
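The four overwrite steps can be sketched as a toy model that counts the operations involved. This is an illustration of the procedure on the slide, not how any particular controller is implemented; blocks are modeled as plain Python lists and all names are hypothetical.

```python
PAGES_PER_BLOCK = 128

def overwrite_page(old_block, page_no, new_data, free_block):
    """Overwrite one page by relocating the rest of its block.

    Returns the block that now holds the data and a count of operations.
    """
    ops = {"page_reads": 0, "page_writes": 0, "block_erases": 0}

    # 1. Read the 127 pages we want to keep.
    survivors = {}
    for i, data in enumerate(old_block):
        if i != page_no:
            survivors[i] = data
            ops["page_reads"] += 1

    # 2. Write those 127 pages to a free block.
    for i, data in survivors.items():
        free_block[i] = data
        ops["page_writes"] += 1

    # 3. Delete (erase) the old block as a whole.
    for i in range(len(old_block)):
        old_block[i] = None
    ops["block_erases"] += 1

    # 4. Perform the write that was originally requested.
    free_block[page_no] = new_data
    ops["page_writes"] += 1

    return free_block, ops

old = [f"p{i}" for i in range(PAGES_PER_BLOCK)]
free = [None] * PAGES_PER_BLOCK
new, ops = overwrite_page(old, 5, "new", free)
print(ops)
```

One requested page write turns into 127 page reads, 128 page writes and one block erase.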

The Controller

• Over-provision SSDs
• Maintain free lists
• Delete and cleanup in background
• Balance use of cells (Wear leveling)
• RAM caching

Consequences:

• Write Amplification
– How much data is really written when we write 1MB
– 1 means no overhead
– The closer to 1 the better
• Benchmarks on new SSD are worthless
– Run benchmarks long enough to run out of overprovisioned space
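Write amplification is just a ratio: bytes physically written to flash divided by bytes the host asked to write. A minimal sketch, using the 4K page and 128-page block from the anatomy slide to show the two extremes (a fresh drive with free pages, and a single-page overwrite that relocates a full block):

```python
def write_amplification(host_bytes: int, flash_bytes: int) -> float:
    """Bytes physically written to flash per byte the host requested."""
    if host_bytes <= 0:
        raise ValueError("host_bytes must be positive")
    return flash_bytes / host_bytes

PAGE_BYTES = 4 * 1024
PAGES_PER_BLOCK = 128

# Fresh drive: the page goes straight into a free page.
fresh = write_amplification(PAGE_BYTES, PAGE_BYTES)

# Worst case from the Deletes slide: one page overwritten,
# the other 127 pages of the block rewritten along with it.
worst = write_amplification(PAGE_BYTES, PAGES_PER_BLOCK * PAGE_BYTES)

print(fresh, worst)
```

A benchmark on a new drive only ever measures the first case, which is why it has to run until the free space is gone.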


Redo Logs

A: Redo log writes are sequential writes and therefore won’t benefit from SSD

B: Log file sync times are critical to Oracle performance. Therefore placing redo logs on SSD will have a dramatic impact on performance.

Don’t use SSD for redo if:

• You don’t have “log file sync” related performance problems

• You have dedicated disks for each redo log
• Even better if multiple disks, striped.
• Your SAN is well configured and has ample caching
• You have RAC and no shared SSDs

SSD can make Redo faster if:

• You are suffering from high “log file parallel write”
• And your storage admin won’t even discuss it
• Redo is on LUN shared with:
– Redo from multiple databases
– Other services (SAP, etc)
• Not enough cache on storage array
• Storage network is a bottleneck

Placing Data on SSD


Should you place data on SSD?

• SSD solves IO latency problems
• If “DB File Sequential Read” is not in your top 5 wait events, you probably don’t need your data on SSD.
• If you don’t maximize RAM use for buffer cache – don’t get SSD (yet)
• If your CPU utilization is high, solve this first.
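The checklist above reads naturally as a short function. This is only a sketch of the slide's rules of thumb; the function name, its inputs and the way they are gathered are all hypothetical, and "high" CPU utilization is left for the caller to judge.

```python
def reasons_to_hold_off_on_ssd(top_wait_events, buffer_cache_maxed, cpu_utilization_high):
    """Return the slide's objections that apply; an empty list means SSD is worth evaluating."""
    reasons = []
    top5 = [event.lower() for event in top_wait_events[:5]]
    if "db file sequential read" not in top5:
        reasons.append("'db file sequential read' is not a top 5 wait event")
    if not buffer_cache_maxed:
        reasons.append("RAM for the buffer cache is not maxed out yet")
    if cpu_utilization_high:
        reasons.append("CPU utilization is high; solve that first")
    return reasons

io_bound = reasons_to_hold_off_on_ssd(
    ["db file sequential read", "log file sync"],
    buffer_cache_maxed=True,
    cpu_utilization_high=False,
)
cpu_bound = reasons_to_hold_off_on_ssd(
    ["CPU time", "log file sync"],
    buffer_cache_maxed=False,
    cpu_utilization_high=True,
)
print(io_bound)
print(cpu_bound)
```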

Not enough space?

• Move most active segments
• Random reads get most benefits from SSD
• Active indexes with unique-scans
• Fewer writes is better
• AWR has IO statistics per segment
• https://github.com/gwenshap/Oracle-DBA-Scripts/blob/master/SSD.sql
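When the SSD is smaller than the database, picking segments becomes a small ranking problem. The sketch below is not the linked SSD.sql script; it is a hypothetical illustration that assumes per-segment read and write counts have already been pulled from AWR, and it uses a made-up score (reads minus writes, per GB) to favor small, read-heavy segments.

```python
from dataclasses import dataclass

@dataclass
class SegmentStats:
    name: str
    size_gb: float
    physical_reads: int
    physical_writes: int

def ssd_score(seg: SegmentStats) -> float:
    """Assumed heuristic: reward reads, penalize writes, normalize by size."""
    return (seg.physical_reads - seg.physical_writes) / seg.size_gb

def pick_segments_for_ssd(segments, ssd_capacity_gb):
    """Greedily fill the SSD with the best-scoring segments that still fit."""
    chosen, used_gb = [], 0.0
    for seg in sorted(segments, key=ssd_score, reverse=True):
        if ssd_score(seg) <= 0:
            break  # write-dominated segments are left on disk
        if used_gb + seg.size_gb <= ssd_capacity_gb:
            chosen.append(seg.name)
            used_gb += seg.size_gb
    return chosen

# Made-up sample data for illustration only.
segments = [
    SegmentStats("ORDERS", 40, 1_200_000, 300_000),
    SegmentStats("ORDERS_PK", 2, 900_000, 10_000),
    SegmentStats("AUDIT_LOG", 30, 5_000, 800_000),
    SegmentStats("CUSTOMERS_UK", 1, 400_000, 1_000),
]
picked = pick_segments_for_ssd(segments, ssd_capacity_gb=10)
print(picked)
```

In this sample the two small, heavily read indexes are chosen, while the large table and the write-heavy log stay on disk.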

Why Choose?

• SAN Devices that contain both HDD and SSD

• Smart controllers move most active data to SSD automatically.

• Pros: No need to choose and manually migrate data

• Cons: Your most active data will move without advance notice

Top Mistakes

• Using SSD for production and HDD for Standby
– If production needs SSD…
– Good chance that standby will fall behind
• Database Smart Flash Cache

Database Smart Flash Cache

[Diagram: Disk, SGA and Flash Cache]

1. Block read from disk
2. Block evicted from SGA is written to SSD cache by DBWR
3. If block is needed, it is read from SSD
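The flow in the diagram is a two-tier cache: blocks evicted from the SGA fall into the flash cache instead of being dropped. A toy simulation of that flow, assuming LRU eviction in both tiers and assuming a block leaves the flash tier when it is read back into the SGA; both are simplifications made here, not statements about Oracle internals.

```python
from collections import OrderedDict

class TwoTierCache:
    """Toy model: SGA in front of a flash cache, both LRU (assumed)."""

    def __init__(self, sga_blocks: int, flash_blocks: int):
        self.sga = OrderedDict()
        self.flash = OrderedDict()
        self.sga_blocks = sga_blocks
        self.flash_blocks = flash_blocks

    def read(self, block_id) -> str:
        """Return where the block was found: 'sga', 'flash' or 'disk'."""
        if block_id in self.sga:
            self.sga.move_to_end(block_id)
            return "sga"
        if block_id in self.flash:
            del self.flash[block_id]
            source = "flash"
        else:
            source = "disk"
        self._load_into_sga(block_id)
        return source

    def _load_into_sga(self, block_id):
        self.sga[block_id] = True
        if len(self.sga) > self.sga_blocks:
            # The evicted block is written to flash (DBWR's job in the slide).
            evicted, _ = self.sga.popitem(last=False)
            self.flash[evicted] = True
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)

cache = TwoTierCache(sga_blocks=2, flash_blocks=4)
sources = [cache.read(block) for block in [1, 2, 3, 1, 2]]
print(sources)
```

The second reads of blocks 1 and 2 come from flash rather than disk, at the price of the extra write on every eviction.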

Database Smart Flash Cache

• Pros:
– Automatically keeps active data in SSD
• Cons:
– Large overhead for managing cache, all taken from SGA
– Overhead for DBWR
– No benefit and some overhead for writes
– Only one SSD device

Using Smart Flash Cache will make your IO faster than using just disks, but smartly placing data on SSD will be even faster.


Exadata has LOTS of SSD

• Quarter rack has 3 storage cells
• Each with 4 Sun Flash Accelerator F40
• 400GB * 4 * 3 = 4.8TB
• 21.5GB/s throughput
• 375,000 IOPS
• Note that IB will limit you to 4GB/s per DB node
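The quarter-rack numbers can be checked, and the InfiniBand note quantified, with a few lines of arithmetic. The number of database nodes is not on the slide; the value of 2 below is an assumption and should be replaced with the real node count.

```python
# Figures from the slide.
STORAGE_CELLS = 3
CARDS_PER_CELL = 4
GB_PER_CARD = 400
FLASH_THROUGHPUT_GB_S = 21.5
FLASH_IOPS = 375_000
IB_LIMIT_GB_S_PER_DB_NODE = 4

# Assumption, not from the slide: number of database nodes in the rack.
DB_NODES = 2

cards = STORAGE_CELLS * CARDS_PER_CELL
total_flash_gb = cards * GB_PER_CARD
iops_per_card = FLASH_IOPS // cards

# What the database tier can actually pull over InfiniBand.
db_side_limit_gb_s = IB_LIMIT_GB_S_PER_DB_NODE * DB_NODES
deliverable_gb_s = min(FLASH_THROUGHPUT_GB_S, db_side_limit_gb_s)

print(total_flash_gb, iops_per_card, deliverable_gb_s)
```

Under that assumption the database nodes can receive well under half of what the flash can deliver, which is why work that filters data on the storage cells matters.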

Exadata Smart Flash Logging

• Redo log writes are written to disk and SSD together.
• Log sync is finished when one write is successful.
• Can’t Lose.
• Can’t try that at home
• This improves performance for redo when disks are busy with high throughput operations
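Because the log sync completes as soon as either write succeeds, the latency the session sees is the minimum of the two. A small sketch with made-up latency samples (in microseconds) that shows how an outlier on one device is hidden by the other:

```python
def log_sync_latency(disk_write_us: int, flash_write_us: int) -> int:
    """The sync is done when the first of the two parallel writes completes."""
    return min(disk_write_us, flash_write_us)

# Hypothetical (disk, flash) write latencies, including spikes on each device.
samples = [(300, 250), (9000, 260), (280, 4000), (12000, 240)]

acks = [log_sync_latency(disk, flash) for disk, flash in samples]

worst_disk_only = max(disk for disk, _ in samples)
worst_flash_only = max(flash for _, flash in samples)
worst_combined = max(acks)

print(acks, worst_disk_only, worst_flash_only, worst_combined)
```

Either device alone shows a multi-millisecond worst case in this sample; writing to both keeps the worst case under 300 microseconds.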

Exadata Smart Flash Cache

• Not same as DB Smart Flash Cache
• SSDs are on storage cells
• SSD on Exadata can also be used as ASM disks and not cache.

Exadata Smart Flash Cache

• Reading un-cached data:
1. Un-cached data is read from disk first
2. Sent to the database
3. and then copied to cache

[Diagram: Disks, SSD Cache, Cellsrv, Database]

Exadata Smart Flash Cache

• Cached reads:
– Read from disk and SSD simultaneously
– Whichever returns first
– Effectively increase read throughput
– Smart scans mostly read from disk
– Except for objects using “cell_flash_cache” KEEP clause.

[Diagram: Disks, SSD Cache, Cellsrv, Database]

Exadata Smart Flash Cache

• Writes:
– Write through cache
– Writes go to disk first
– Then copied to cache, sometimes
– Indexes and tables with random IO
– ALTER TABLE customers STORAGE (CELL_FLASH_CACHE KEEP)

[Diagram: Disks, SSD Cache, Cellsrv, Database]

Exadata Smart Flash Cache

• Writes:
– Write back cache
– Writes go to SSD first
– Then copied to disk, eventually

[Diagram: Disks, SSD Cache, Cellsrv, Database]
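The difference between the two write modes is where the data lives when the write is acknowledged. A toy comparison, with dictionaries standing in for disk and flash; class and method names are hypothetical and this says nothing about how cellsrv is actually implemented.

```python
class WriteThroughCache:
    """Write goes to disk first; it may then be copied to flash."""

    def __init__(self):
        self.disk, self.flash = {}, {}

    def write(self, block_id, data, cache_it=True) -> str:
        self.disk[block_id] = data           # acknowledged after the disk write
        if cache_it:                         # "then copied to cache, sometimes"
            self.flash[block_id] = data
        return "disk"


class WriteBackCache:
    """Write goes to flash first; it reaches disk later."""

    def __init__(self):
        self.disk, self.flash = {}, {}
        self.dirty = set()

    def write(self, block_id, data) -> str:
        self.flash[block_id] = data          # acknowledged after the flash write
        self.dirty.add(block_id)
        return "flash"

    def flush(self) -> int:
        """Copy dirty blocks to disk ("eventually"); returns how many were copied."""
        for block_id in sorted(self.dirty):
            self.disk[block_id] = self.flash[block_id]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


wt = WriteThroughCache()
wt_ack = wt.write("blk1", "v1")

wb = WriteBackCache()
wb_ack = wb.write("blk1", "v1")
on_disk_before_flush = "blk1" in wb.disk
flushed = wb.flush()
print(wt_ack, wb_ack, on_disk_before_flush, flushed)
```

Write-back gives the writer flash latency, but until the flush the flash copy is the only copy, so the flash has to be treated as persistent storage.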

ODA and SSD

• “Four 2.5-inch 200 GB SAS-2 SLC SSDs per shelf for database redo logs”
• Allows multiple databases on ODA
• Reduces risk of disk bottlenecks


Interfaces

• SATA
– 32 outstanding IO
– 6Gb/s = 600MB/s
– significant latency
• SAS
– 256 outstanding IO
– 6Gb/s = 600MB/s
– Used on ODA shared storage

Interfaces

• PCIe
– “Flash” “Accelerator”
– Multiple 500 MB/s lanes
– Low latency
– Multiple SAS/SATA controllers on card for extra throughput
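The "6Gb/s = 600MB/s" figure on the SATA and SAS slide is not a plain divide-by-eight. These links use 8b/10b encoding, so only 8 of every 10 bits on the wire carry data. A short sketch of that conversion, plus the PCIe lane multiplication; the lane count of 8 is a hypothetical example, not a figure from the talk.

```python
def link_mb_per_s(line_rate_gbit: float) -> float:
    """Usable MB/s of a serial link that uses 8b/10b encoding."""
    payload_mbit = line_rate_gbit * 1000 * 8 / 10   # 8 data bits per 10 line bits
    return payload_mbit / 8                         # bits to bytes

def pcie_mb_per_s(lanes: int, mb_per_lane: int = 500) -> int:
    """Aggregate throughput of a card with `lanes` lanes at 500MB/s each."""
    return lanes * mb_per_lane

sata_or_sas = link_mb_per_s(6)
x8_card = pcie_mb_per_s(8)   # hypothetical x8 card

print(sata_or_sas, x8_card)
```

A single SATA or SAS link therefore tops out below what one flash card can deliver, which is why PCIe cards put several controllers behind multiple lanes.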

Interfaces

• Fiber
– Use existing enterprise infrastructures
– Shared storage
– Usual SAN headache
– Mandatory for RAC


Write latency lower than read?


Intel SSD 910

identical read/write latency?


RAMSAN


Quick Recap

• SSDs make random reads wicked fast
• Writes and deletes are complicated
• Place segments with many random reads on SSD
• Exadata uses Smart Flash Cache to increase throughput
• Not all SSDs are the same
• Read specs carefully

Thank you – Q&A

To contact us

sales@pythian.com

1-877-PYTHIAN

To follow us

http://www.pythian.com/blog

http://www.facebook.com/pages/The-Pythian-Group/163902527671

@pythian

http://www.linkedin.com/company/pythian
