SSD WhitePaper by Houman Shabani

Netlist, Inc. 51 Discovery, Irvine, CA 92618 (949) 435-0025 Thursday, October 28, 2011 Page 1 of 18 NETLIST, INC. Solid State Storage (SSD) Usable Life Calculation White Paper Written by: Mike Amidi, Dir. Flash Product Development Houman Shabani, Flash Applications Engineer

NETLIST, INC.

Solid State Storage (SSD)

Usable Life Calculation

White Paper

Written by: Mike Amidi, Dir. Flash Product Development Houman Shabani, Flash Applications Engineer

Table of Contents

1. NAND Flash Basics
2. Wear leveling
3. Read-disturb
4. Garbage collection
5. TRIM
6. Endurance
7. IOPS
8. Write Amplification
   Factors affecting the WA
9. Data Compression Level and Data Entropy
10. Calculation of Useful life of SSDs
   Example 1
   Example 2
11. Conclusion
APPENDIX A
   Abbreviations
APPENDIX B
   Formulas
References

Introduction

Designers of enterprise computing applications are starting to take advantage of solid state drives (SSDs) in order to increase performance, save space, improve reliability, and reduce power consumption. Compared with traditional 10K or 15K RPM hard disk drives, SSDs are being used to develop systems and applications capable of delivering previously unattained levels of performance.

Unlike with hard disk drive technology, measuring the performance of an SSD is complicated by its internal architecture and storage media. For example, the performance of an SSD may change as the device is used, and achieving stable and repeatable results requires different test methodologies from those used for traditional HDDs. Performance can therefore vary with the data pattern, and the write amplification on the drive is key to predicting the performance for a specific application. [1]

This white paper discusses the key parameters that are relevant to the performance and usable life of Netlist SSDs. It provides guidelines and methods for calculating the useful life of SSDs.

1. NAND Flash Basics

NAND Flash memory is a non-volatile storage device that can be electrically erased and reprogrammed. It was developed from EEPROM (electrically erasable programmable read-only memory) and must be erased in fairly large blocks before it can be rewritten with new data. The high-density NAND type must also be programmed and read in smaller units, called pages, while the NOR type allows a single machine word (byte) to be written or read independently. [2]

Due to the nature of Flash memory's operation, data cannot be directly overwritten as it can on a hard disk drive. When data is first written to an SSD, the cells all start in an erased state, so data can be written directly, one page at a time (pages are often 4 to 8 kilobytes (KB) in size). Figure 1 shows how the data is written into the NAND Flash.

Figure 1: Data is written to the NAND Flash in 4 KB pages and is erased in 256 KB blocks.

The controller on the SSD, which manages the Flash memory and interfaces with the host system, uses a logical-to-physical mapping system known as logical block addressing (LBA), which is part of the Flash translation layer (FTL). [4] Newer Flash devices may use 16 KB pages.

When new data comes in, replacing older data already written, the SSD controller will write the

new data in a new location and update the logical mapping to point to the new physical location.

The old location is no longer holding valid data, but it will eventually need to be erased before it

can be written again.[4]
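The out-of-place update described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name `SimpleFTL`, its fields, and the page-granular mapping are hypothetical and are not taken from any particular controller.

```python
class SimpleFTL:
    """Toy flash translation layer: maps logical pages to physical pages."""

    def __init__(self, num_physical_pages):
        self.mapping = {}                                   # logical page -> physical page
        self.free_pages = list(range(num_physical_pages))   # erased, writable pages
        self.stale_pages = set()                            # hold old data, must be erased before reuse

    def write(self, logical_page):
        """Write (or overwrite) a logical page at a new physical location."""
        if not self.free_pages:
            raise RuntimeError("no erased pages left; garbage collection is needed")
        new_physical = self.free_pages.pop(0)
        old_physical = self.mapping.get(logical_page)
        if old_physical is not None:
            # The old copy is not overwritten in place; it is only marked stale.
            self.stale_pages.add(old_physical)
        self.mapping[logical_page] = new_physical
        return new_physical


ftl = SimpleFTL(num_physical_pages=8)
first = ftl.write(logical_page=3)    # initial write of logical page 3
second = ftl.write(logical_page=3)   # overwrite lands on a different physical page
print(first, second, ftl.stale_pages)
```

After the second write, the logical page points at the new physical page, and the first physical page is stale until its block is erased.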

Flash memory can only be programmed and erased a limited number of times. This is often referred to as the maximum number of program/erase cycles (P/E cycles) that can be sustained over the life of the Flash memory. Single-level cell (SLC) Flash memory, designed for the highest performance, can typically sustain between 50,000 and 100,000 cycles. Multi-level cell (MLC) Flash memory was designed for lower-cost applications and has a greatly reduced cycle count, typically between 3,000 and 5,000. Typically, as Flash memory geometry shrinks, the number of program/erase cycles is also reduced and stronger ECC must be incorporated in the controller. [2]

2. Wear leveling

Wear leveling is a process that helps reduce premature wear in NAND Flash devices. The Flash controller manages access to the memory devices and determines how the NAND Flash blocks are used. In most cases, the controller maintains a lookup table to translate the memory array physical block address (PBA) to the logical block address (LBA) used by the host system (see Figure 2). [3]

Figure 2: NAND Flash Controller Block Address Management

The controller's wear-leveling algorithm determines which physical block to use each time data

is programmed, eliminating the relevance of the physical location of data and enabling data to be

stored anywhere within the memory array. [3]

Depending on the wear-leveling method used, the controller typically either writes to the

available erased block with the lowest erase count (dynamic wear leveling); or it selects an

available target block with the lowest overall erase count, erases the block if necessary, writes

new data to the block, and ensures that blocks of static data are moved when their block erase

count is below a certain threshold (static wear leveling).
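As a rough illustration of the dynamic case, the block selection can be sketched as below. The function name `pick_block_dynamic` and the erase-count table are hypothetical; a static wear-leveling scheme would additionally relocate rarely changing data, which this sketch does not attempt.

```python
def pick_block_dynamic(erase_counts, erased_blocks):
    """Return the available erased block with the lowest erase count."""
    if not erased_blocks:
        raise ValueError("no erased blocks available")
    return min(erased_blocks, key=lambda block: erase_counts[block])


# Hypothetical per-block erase counters kept by the controller.
erase_counts = {0: 120, 1: 45, 2: 300, 3: 45, 4: 10}
# Block 4 has the lowest count but currently holds data, so it is not a candidate.
erased_blocks = [0, 1, 2, 3]

chosen = pick_block_dynamic(erase_counts, erased_blocks)
print(chosen)
```

Because block 4 is never considered while it holds static data, its low erase count goes unused; that gap is what static wear leveling is meant to close.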

Another wear-leveling approach is to correct for ‘Read-disturb’ effects that occur in cells that are

read many times in a short period of time. Read-disturb effects are more prevalent in MLC

based NAND Flash cells and become more frequent as the cell geometries shrink. That is, a

25nm NAND Flash cell will require more read-disturb correction than a 43nm cell.

The need for wear leveling results from the finite program/erase cycling capability of NAND Flash memory cells. The repeated use of a limited number of blocks can cause the device to wear out prematurely or exceed its program/erase endurance. The wear-leveling process spreads NAND Flash memory cell use over the available memory array, ideally equalizing the use of all memory cells and helping to extend device life. Figure 3 shows the wear-leveling process and its effects. [3]

Figure 3: Wear leveling causes data to be rewritten in the Flash multiple times. [2]

Consider a case without wear leveling. In a NAND Flash device with 4,096 total blocks and

2.5% allowable bad blocks in a system that updates 3 files comprised of 50 blocks each at a rate

of 1 file every 10 minutes (or 6 files per hour), where a NAND host reuses the same 200 physical

blocks for these updates, the NAND Flash device will wear out in under 1 year, leaving over

95% of the memory array unused. [3]
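The arithmetic behind this example can be sketched as follows. The paragraph does not state the block endurance, so the 10,000-cycle figure below is an assumption chosen only for illustration; the second calculation, which spreads the same workload over all good blocks, is likewise an idealized extension and not a figure from the text.

```python
def days_until_wearout(endurance_cycles, blocks_in_rotation,
                       blocks_per_file, files_per_hour):
    """Days until the blocks in rotation reach their P/E limit."""
    block_writes_per_day = blocks_per_file * files_per_hour * 24
    return endurance_cycles * blocks_in_rotation / block_writes_per_day


ASSUMED_ENDURANCE = 10_000   # assumption: P/E cycles per block (not given in the text)
TOTAL_BLOCKS = 4096
BAD_BLOCK_FRACTION = 0.025

# Without wear leveling: the host keeps reusing the same 200 physical blocks.
without_wl = days_until_wearout(ASSUMED_ENDURANCE, 200, 50, 6)

# Idealized wear leveling: writes are spread over every good block.
usable_blocks = int(TOTAL_BLOCKS * (1 - BAD_BLOCK_FRACTION))
with_wl = days_until_wearout(ASSUMED_ENDURANCE, usable_blocks, 50, 6)

print(f"without wear leveling: {without_wl:.0f} days")
print(f"with ideal wear leveling: {with_wl / 365:.1f} years")
```

Under the assumed endurance, the 200-block case comes out at roughly 278 days, which is consistent with the "under 1 year" statement above.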

3. Read-disturb

The method used to read NAND Flash memory can cause cells near the cell being read to change over time if the surrounding cells of the block are not rewritten. This generally takes hundreds of thousands of reads without a rewrite of those cells. The error does not appear when reading the original cell, but rather shows up when one of the surrounding cells is finally read. If the Flash controller does not track the total number of reads across the whole storage device and periodically rewrite the surrounding data as a precaution, a read-disturb error will likely occur, resulting in data loss. [5]

4. Garbage collection

Data is written to the Flash memory in pages. However, the memory can only be erased in

blocks. If the data in some of the pages of the block are no longer needed (also called stale

pages), only the pages with good data in that block are read and re-written into another

previously erased empty block. Then the free pages left by not moving the stale data are

available for new data. This is a process called garbage collection (GC). Figure 4 shows the

process of the Garbage collection in SSDs.

Figure 4: Process of garbage collection
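The garbage collection steps described above can be sketched with a toy block model. The representation is an assumption made for illustration: each block is a list of pages, and a page is either `None` (erased) or a `(data, is_valid)` tuple. Real controllers track validity in the FTL rather than in the pages themselves.

```python
PAGES_PER_BLOCK = 4   # tiny block for illustration; real blocks hold many more pages


def garbage_collect(victim, spare):
    """Copy valid pages from `victim` into the erased `spare` block, then erase `victim`.

    Returns the number of pages copied, i.e. the extra Flash writes caused by GC.
    """
    if any(page is not None for page in spare):
        raise ValueError("spare block must be erased")
    valid_pages = [page for page in victim if page is not None and page[1]]
    for index, page in enumerate(valid_pages):
        spare[index] = page                # rewrite only the still-valid data
    victim[:] = [None] * len(victim)       # block erase: every page becomes writable again
    return len(valid_pages)


victim = [("A", True), ("B", False), ("C", True), ("D", False)]   # B and D are stale
spare = [None] * PAGES_PER_BLOCK

copied = garbage_collect(victim, spare)
print(copied, spare, victim)
```

The two copied pages are writes that the host never requested, which is how garbage collection contributes to write amplification.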

All SSD drives include some level of garbage collection, but they may differ in when and how

fast they perform the process. Garbage collection is a big part of write amplification on the SSD.

The process of garbage collection involves reading and rewriting data to the Flash memory. This means that a new write from the host will first require a read of the whole block, a write of the parts of the block which still include valid data, and then a write of the new data. This can significantly reduce the performance of the system. Therefore, some SSD controllers implement background garbage collection (BGC), sometimes called idle garbage collection or idle-time garbage collection (ITGC), where the controller uses idle time to consolidate blocks of Flash memory before the host needs to write new data. This enables the performance of the device to remain high. [4]

If the controller were to background garbage collect all of the spare blocks before it was

absolutely necessary, new data written from the host could be written without having to move

any data in advance, letting the performance operate at its peak speed. The trade-off is that some

of those blocks of data are actually not needed by the host and will eventually be deleted, but the

OS did not tell the controller this information. The result is that the soon-to-be-deleted data is

rewritten to another location in the Flash memory increasing the write amplification.

In some of the SSDs the background garbage collection only clears up a small number of blocks

then stops, thereby limiting the amount of excessive writes. Another solution is to have an

efficient garbage collection system which can perform the necessary moves in parallel with the

host writes. This solution is more effective in high write environments where the SSD is rarely

idle. [4]

5. TRIM

The TRIM command allows an operating system to inform the SSD which blocks of data are no longer considered in use and can be wiped internally. Because the low-level operation of SSDs differs significantly from that of traditional hard disks, the typical way in which operating systems handle operations such as deletes and formats resulted in unanticipated, progressive degradation of write performance on SSDs. TRIM enables the SSD to handle garbage collection overhead in advance, which would otherwise significantly slow down future write operations to the involved blocks. Figures 5 and 6 illustrate the difference between the write and delete operations of an OS with and without TRIM. [6]

TRIM's benefits:

Higher throughput: faster host write speeds because less time is spent writing for GC.

Improved endurance: reduced writes to Flash.

Lower write amplification: less data is rewritten and more free space is available during GC.

Figure 5: Write and delete process without TRIM

Figure 6: Write and delete with TRIM

6. Endurance

In NAND Flash memories, data is written by forcing electrons through a layer of electrical insulation onto a floating transistor gate. As a result, Flash can withstand only a limited number of write and erase cycles before the insulation is permanently damaged.

In the earliest Flash memories, this might occur after as few as 1,000 write cycles, while in

modern Flash EEPROM the endurance may exceed 1,000,000, but it is by no means infinite.

This limited endurance, as well as the higher cost per bit, means that Flash-based storage is

unlikely to completely supplant magnetic disk drives in the near future. [7]

The time-span over which a ROM remains accurately readable is not limited by write cycling.

The data retention of Flash may be limited by charge leaking from the floating gates of the

memory cell transistors. Leakage is accelerated by high temperatures or radiation.

7. IOPS

IOPS (Input/Output Operations Per Second) is a common performance measurement used to benchmark computer storage devices such as hard disk drives (HDDs) and solid state drives (SSDs).

As with any benchmark, IOPS numbers published by storage device manufacturers do not

guarantee real-world application performance. [4]

The specific number of IOPS possible in any system configuration will vary greatly depending

upon the variables the tester enters into the program, including the balance of read and write

operations, the mix of sequential and random access patterns, the number of worker threads and

queue depth, as well as the data block sizes. There are other factors which can also affect the

IOPS results including the system setup, storage drivers, OS background operations, etc. Also,

when testing SSDs in particular, there are preconditioning considerations that must be taken into

account.

8. Write Amplification

Write amplification (WA) is an undesirable phenomenon associated with Flash memory and solid-state disk drives (SSDs). Because Flash memory must be erased before it can be rewritten, the process to perform these operations results in moving (or rewriting) user data more than once. This multiplying effect increases the number of writes required over the life of the SSD, which shortens the time it can reliably operate. [4]

Many factors affect the write amplification of an SSD; some can be controlled by the user, and some are a direct result of the data written to the SSD and how it is used. Write amplification is typically measured as the ratio of the writes going to the Flash memory to the writes coming from the host system. An SSD experiences write amplification as a result of garbage collection and wear leveling, which increase the writes on the drive and reduce its life. A lower WA is desirable because it reduces the number of P/E cycles on the Flash memory and thereby increases the life of the SSD.

All SSDs have a write amplification value and it is based on both what is currently being written

and what was previously written to the SSD. In order to accurately measure the value for a

specific SSD, the selected test should be run for enough time to ensure the drive has reached a

steady state condition. [2]

Factors affecting the WA

Many factors affect the write amplification of an SSD. Figure 7 lists the primary factors and how they affect the write amplification. For factors that are variable, the table notes whether the factor has a direct or an inverse relationship. For example, as the amount of over-provisioning increases, the write amplification decreases (inverse relationship). [4]

| Factor | Description | Relationship |
|---|---|---|
| Garbage collection | The efficiency of the algorithm used to pick the next best block to erase and rewrite | Inverse (good) |
| Over-provisioning | The percentage of physical capacity which is allocated to the SSD controller (and not given to the user) | Inverse (good) |
| Free user space | The percentage of the user capacity free of actual user data; requires TRIM, otherwise the SSD gains no benefit from any free user capacity | Inverse (good) |
| Wear leveling | The efficiency of the algorithm that ensures every block is written an equal number of times to all other blocks as evenly as possible | Direct (bad) |
| Sequential writes | In theory, sequential writes have a write amplification of 1, but other factors will still affect the value | Positive (good) |
| Random writes | Writing to non-sequential LBAs will have the greatest impact on write amplification | Negative (bad) |

Figure 7: Factors affecting the Write Amplification

9. Data Compression Level and Data Entropy

Write amplification is defined as:

Write Amplification (WA) = (data written to the Flash memory) / (data written by the HOST)
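As a minimal sketch of this ratio, the function below divides Flash writes by host writes. The byte counts are made-up values used only to show a result above 1 and a result below 1; they are not measurements.

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """WA = data written to the Flash memory / data written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


GIB = 2**30

# Hypothetical counters: 100 GiB from the host caused 300 GiB of Flash writes.
wa_rewrites = write_amplification(300 * GIB, 100 * GIB)

# Hypothetical counters: the same host data reached the Flash as only 50 GiB.
wa_reduced = write_amplification(50 * GIB, 100 * GIB)

print(wa_rewrites, wa_reduced)
```

A value below 1 is only possible when less data reaches the Flash than the host sent, which is the compression case discussed in this section.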

Typical data log files and text files are considered to have low entropy. Write amplification of SSDs for low-entropy data content ranges from 0.2 to 0.8, depending on the workload (sequential vs. random). Efficiently encoded, highly compressed file formats such as MPEG-4 video are considered to have high entropy. Write amplification for high-entropy data content is higher, since little further compression can be achieved and all of the data needs to be written to the Flash. Depending on the workload (sequential vs. random), the WA of SSDs for high-entropy data content ranges from 1 to 4. Figure 8 shows how the data content, or entropy (a measure of data randomness), affects the write amplification of SSDs. [1]

Figure 8: Dependency of Write Amplification to Data Entropy

10. Calculation of Useful life of SSDs

Many SSD manufacturers are struggling to classify endurance in terms that are meaningful to OEMs and end users. "Write/erase cycles per logical block" can be useful, but it does not answer the real question: "How long will the SSD last in my application?" OEMs need to understand SSD life in terms of time (years, months, days) instead of "cycles". Therefore, defining and measuring the usage model is critical to making this translation. [8]

In streaming applications, the host writes data sequentially from logical block address 0 (LBA0) to LBAn. In this case, the write performance is maximized and the effects of wear leveling and bad block management are minimized. [10]

The lifetime, measured in years, is defined as:

Lifetime = (NAND Flash Endurance * User capacity) / (Maximum Write speed (MB/s) * Duty cycle)

Duty cycle is defined as the ratio of write cycles to (read cycles + idle time), expressed as a fraction (for example, 0.5 for 50%).

To simplify the calculation, a constant of 0.0325 can be used in the formula. It combines the endurance rating expressed in thousands of cycles, the MB-to-GB conversion, and the seconds-to-years conversion. [8]

Therefore:

Lifetime = (NAND Flash Endurance * User capacity * 0.0325) / (Maximum Write speed * Duty cycle)
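One way to reproduce the 0.0325 constant is sketched below. It assumes 1,024 MB per GB and a 365-day year; neither value is stated in the text, so treat the derivation as a plausible reconstruction and not as the authors' own.

```python
CYCLES_PER_K = 1_000                  # endurance is entered in thousands of cycles
MB_PER_GB = 1_024                     # assumption: capacity in GB, write speed in MB/s
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumption: 365-day year

# Endurance (K cycles) * capacity (GB) / write speed (MB/s) gives seconds
# once scaled by CYCLES_PER_K * MB_PER_GB; dividing by seconds per year gives years.
K = CYCLES_PER_K * MB_PER_GB / SECONDS_PER_YEAR

print(f"{K:.5f}")
```

Rounded to four decimal places, this agrees with the 0.0325 used in the formula.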

Example 1: An application monitors public transportation. A 64 GB drive capable of a sustained data rate of 32 MB/s, using MLC NAND Flash rated at 5,000 write/erase cycles, has a useful life of: [8]

Lifetime = (5 * 64 * 0.0325) / (32 * duty cycle) = 0.325 / duty cycle

For duty cycle = 0.5: Lifetime = 0.325 / 0.5 = 0.65 years
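The streaming formula can be written as a small function to check Example 1. The function and parameter names are hypothetical; the inputs are the values given in the example, with endurance entered in thousands of cycles as the formula expects.

```python
def streaming_lifetime_years(endurance_k_cycles, capacity_gb,
                             write_speed_mb_s, duty_cycle, k=0.0325):
    """Lifetime in years for a streaming (sequential write) workload."""
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be a fraction in (0, 1]")
    return endurance_k_cycles * capacity_gb * k / (write_speed_mb_s * duty_cycle)


# Example 1: 5K-cycle MLC, 64 GB drive, 32 MB/s sustained writes, 50% duty cycle.
life = streaming_lifetime_years(endurance_k_cycles=5, capacity_gb=64,
                                write_speed_mb_s=32, duty_cycle=0.5)
print(f"{life:.2f} years")
```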

Very few embedded applications only stream data; most are database or transactional applications. This is where write amplification comes into play, since simply monitoring the host writes (IOPS) does not yield the proper result. [8]

The lifetime formula therefore becomes:

Lifetime = (NAND Endurance * user capacity * 0.0325 * 1024) / (write IOPS * File size * WA * duty cycle)

Note: for the full formulas, refer to Appendices A and B.

1024: the conversion constant from KB to MB

Write IOPS: the number of write input/output operations per second

File size: the size of the file (in KB) for which the IOPS figure was measured

Example 2: A voicemail system manufacturer is considering a 16 GB SSD to replace a rotating disk drive. The drive uses SLC NAND rated at 100K P/E cycles and delivers 200 write IOPS for an 8 KB file. The drive does not specify a write amplification factor, so a value of 32 (256 KB block / 8 KB file) will be used. The OEM estimates the write duty cycle at 25%.

Lifetime = (100 * 16 * 0.0325 * 1024) / (200 * 8 * 32 * 0.25) = 4.16 years

It can be clearly seen that there is a direct correlation between capacity and useful life. Assuming a full wear-leveling scheme, doubling the capacity doubles the useful life.
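The transactional formula and the capacity relationship can be checked with a short sketch. The function and parameter names are hypothetical; the inputs are those of Example 2, and the 32 GB case is added only to illustrate the doubling statement.

```python
def transactional_lifetime_years(endurance_k_cycles, capacity_gb, write_iops,
                                 file_size_kb, write_amplification, duty_cycle,
                                 k=0.0325):
    """Lifetime in years for an IOPS-driven workload with write amplification."""
    # IOPS * file size gives KB/s from the host; WA scales it to Flash writes,
    # and dividing by 1024 converts KB/s to MB/s.
    flash_write_mb_s = write_iops * file_size_kb * write_amplification / 1024
    return endurance_k_cycles * capacity_gb * k / (flash_write_mb_s * duty_cycle)


# Example 2: 100K-cycle SLC, 16 GB, 200 write IOPS of 8 KB, WA = 32, 25% duty cycle.
life_16gb = transactional_lifetime_years(100, 16, 200, 8, 32, 0.25)

# Same workload on a drive with twice the capacity.
life_32gb = transactional_lifetime_years(100, 32, 200, 8, 32, 0.25)

print(f"{life_16gb:.2f} years, {life_32gb:.2f} years")
```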

11. Conclusion

Today, NAND Flash memories are changing as rapidly as the qualification cycles of some OEMs. Those OEMs therefore need to take a more system-level approach to determining how long a product must be deployed in the field. They need to determine the usage model and, just as importantly, how to measure and predict it. From there, they can specify the proper capacity for their application. [8]

The Netlist Flash engineering group has performed extensive analysis to arrive at an accurate and simple method for calculating the useful life of SSDs. To support its customers, Netlist has developed a Life-Calculator that can be used by both technical and non-technical customers. This calculator is available to Netlist's customers upon request.

APPENDIX A

Abbreviations:

K = Conversion Constant

APPENDIX B

Formulas:

1. Conversion Constant:

Conversion constant formula (K) simplifies overall math due to reduction and

removal of cancelable variables.

1.1.

2. Capacity Usage in Time:

Capacity usage in time formula ( ) estimates the overall drive usage in a

given year.

2.1.

3. Life:

Life formula (Life) calculates the overall drive usable life.

3.1.