SSD WhitePaper by Houman Shabani
Netlist, Inc. 51 Discovery, Irvine, CA 92618 (949) 435-0025 Thursday, October 28, 2011 Page 1 of 18
NETLIST, INC.
Solid State Storage (SSD)
Usable Life Calculation
White Paper
Written by:
Mike Amidi, Dir. Flash Product Development
Houman Shabani, Flash Applications Engineer
Table of Contents
1. NAND Flash Basics
2. Wear leveling
3. Read-disturb
4. Garbage collection
5. TRIM
6. Endurance
7. IOPS
8. Write Amplification
   Factors affecting the WA
9. Data Compression Level and Data Entropy
10. Calculation of Useful life of SSDs
   Example 1
   Example 2
11. Conclusion
APPENDIX A
   Abbreviations
APPENDIX B
   Formulas
Introduction
Designers of enterprise computing applications are starting to take advantage of solid state
drives (SSDs) to increase performance, save space, improve reliability, and reduce power
consumption. Compared with traditional 10K or 15K RPM hard disk drives (HDDs), SSDs make it
possible to build systems and applications that deliver previously unattained levels of
performance.
Measuring the performance of an SSD is more complicated than measuring that of an HDD because
of the internal architecture and storage media used. For example, the performance of an SSD
may change as the device is used, and achieving stable, repeatable results requires different
test methodologies from those used for traditional HDDs. Performance also depends on the data
pattern, and the write amplification on the drive is key to predicting performance for a
specific application. [1]
This white paper discusses the key parameters that are relevant to the performance and usable
life of Netlist SSDs. It provides guidelines and methods for calculating the useful life of
SSDs.
1. NAND Flash Basics
NAND Flash memory is a non-volatile storage medium that can be electrically erased and
reprogrammed. It was developed from EEPROM (electrically erasable programmable read-only
memory) and must be erased in fairly large blocks before it can be rewritten with new data.
High-density NAND Flash must also be programmed and read in smaller units called pages,
while NOR Flash allows a single machine word (byte) to be written or read
independently. [2]
Due to the nature of Flash memory's operation, data cannot be directly overwritten as it can on a
hard disk drive. When data is first written to an SSD, all cells start in an erased state, so data
can be written directly, one page at a time (pages are often 4 to 8 kilobytes (KB) in size).
Figure 1 shows how data is written to the NAND Flash.
Figure 1: Data is written to the NAND Flash in 4 KB pages and is erased in 256 KB blocks.
The controller on the SSD, which manages the Flash memory and interfaces with the host
system, maps the logical block addresses (LBAs) used by the host to physical locations in the
Flash. This logical-to-physical mapping is part of the Flash translation layer (FTL). [4]
Newer Flash devices may use 16 KB pages.
When new data comes in, replacing older data already written, the SSD controller will write the
new data in a new location and update the logical mapping to point to the new physical location.
The old location is no longer holding valid data, but it will eventually need to be erased before it
can be written again.[4]
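The out-of-place update described above can be sketched as follows. This is a minimal illustration only, not the controller's actual firmware: the class name TinyFTL, its fields, and the page-granular mapping are assumptions made for the example.

```python
class TinyFTL:
    """Toy flash translation layer: maps logical pages to physical pages."""

    def __init__(self, num_physical_pages):
        self.mapping = {}                             # logical page -> physical page
        self.free = list(range(num_physical_pages))   # erased pages, ready to program
        self.stale = set()                            # invalid pages awaiting erase

    def write(self, logical_page):
        new_phys = self.free.pop(0)                   # always program an erased page
        old_phys = self.mapping.get(logical_page)
        if old_phys is not None:
            self.stale.add(old_phys)                  # old copy is invalid, not yet erased
        self.mapping[logical_page] = new_phys         # point the logical page at the new copy
        return new_phys


ftl = TinyFTL(num_physical_pages=8)
first = ftl.write(logical_page=3)    # initial write
second = ftl.write(logical_page=3)   # overwrite lands on a different physical page
```

After the second write, the first physical page still holds the old data but is marked stale; reclaiming it is the job of garbage collection.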
Flash memory can only be programmed and erased a limited number of times. This is often
referred to as the maximum number of program/erase cycles (P/E cycles) that can be sustained
over the life of the Flash memory. Single-level cell (SLC) Flash memory, designed for highest
performance, can typically sustain between 50,000 and 100,000 cycles. Multi-level cell (MLC)
Flash memory was designed for lower-cost applications and has a greatly reduced cycle count,
typically between 3,000 and 5,000. As Flash memory geometry shrinks, the number of
program/erase cycles typically falls as well, and stronger ECC must be incorporated in the
controller. [2]
2. Wear leveling
Wear leveling is a process that helps reduce premature wear in NAND Flash devices. The Flash
controller manages access to the memory devices and determines how the NAND Flash blocks
are used. In most cases, the controller maintains a lookup table to translate the memory array
physical block address (PBA) to the logical block address (LBA) used by the host system (see
Figure 2). [3]
Figure 2: NAND Flash Controller Block Address Management
The controller's wear-leveling algorithm determines which physical block to use each time data
is programmed, eliminating the relevance of the physical location of data and enabling data to be
stored anywhere within the memory array. [3]
Depending on the wear-leveling method used, the controller typically either writes to the
available erased block with the lowest erase count (dynamic wear leveling); or it selects an
available target block with the lowest overall erase count, erases the block if necessary, writes
new data to the block, and ensures that blocks of static data are moved when their block erase
count is below a certain threshold (static wear leveling).
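The two selection rules above can be sketched in a few lines. This is an illustrative sketch only: the function names, the dictionary of erase counts, and the threshold value are assumptions for the example, not a description of any particular controller.

```python
def pick_block_dynamic(erase_counts, erased_blocks):
    """Dynamic wear leveling: choose the erased block with the lowest erase count."""
    return min(erased_blocks, key=lambda block: erase_counts[block])


def static_move_candidates(erase_counts, static_data_blocks, threshold):
    """Static wear leveling: blocks of static data whose erase count is below
    the threshold are candidates to be moved so the block can be reused."""
    return [b for b in static_data_blocks if erase_counts[b] < threshold]


erase_counts = {0: 120, 1: 45, 2: 300, 3: 45, 4: 10}
erased_blocks = [0, 1, 2, 3]          # block 4 holds static data, so it is not erased
chosen = pick_block_dynamic(erase_counts, erased_blocks)
to_move = static_move_candidates(erase_counts, static_data_blocks=[4], threshold=50)
```

Dynamic wear leveling never touches block 4 because it is not erased; the static rule is what eventually brings that lightly worn block back into circulation.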
Another wear-leveling approach is to correct for ‘Read-disturb’ effects that occur in cells that are
read many times in a short period of time. Read-disturb effects are more prevalent in MLC
based NAND Flash cells and become more frequent as the cell geometries shrink. That is, a
25nm NAND Flash cell will require more read-disturb correction than a 43nm cell.
The need for wear leveling results from the finite PROGRAM/ERASE cycling capability of
NAND Flash memory cells. The repeated use of a limited number of blocks can cause the device
to prematurely wear out or exceed its program/erase endurance. The wear-leveling process
spreads NAND Flash memory cell use over the available memory array, ideally equalizing the
use of all memory cells and helping to extend device life. Figure 3 shows the wear-leveling
process and its effects. [3]
Figure 3: Wear leveling causes data to be rewritten in the Flash multiple times. [2]
Consider a case without wear leveling. A NAND Flash device has 4,096 total blocks, of which up
to 2.5% may be bad blocks. The system updates 3 files of 50 blocks each at a rate of 1 file
every 10 minutes (6 files per hour), and the NAND host reuses the same 200 physical blocks for
these updates. In this case the NAND Flash device will wear out in under 1 year, leaving over
95% of the memory array unused. [3]
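The arithmetic behind this case can be checked with a short sketch. The text does not state the block endurance, so the 10,000 P/E cycles used here is an assumption chosen for illustration, as are the function and variable names.

```python
ASSUMED_ENDURANCE_CYCLES = 10_000   # assumption: not stated in the text

def days_until_wear_out(blocks_in_rotation, blocks_per_file, files_per_hour,
                        endurance_cycles=ASSUMED_ENDURANCE_CYCLES):
    """Days until the blocks in rotation reach their P/E limit."""
    block_writes_per_day = blocks_per_file * files_per_hour * 24
    return endurance_cycles * blocks_in_rotation / block_writes_per_day


total_blocks = 4096
good_blocks = int(total_blocks * (1 - 0.025))    # after 2.5% allowable bad blocks

days_without = days_until_wear_out(200, blocks_per_file=50, files_per_hour=6)
days_with = days_until_wear_out(good_blocks, blocks_per_file=50, files_per_hour=6)
unused_fraction = 1 - 200 / total_blocks         # share of the array never written
```

Under the assumed endurance, reusing only 200 blocks gives roughly 278 days, while spreading the same writes over all good blocks gives roughly 15 years.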
3. Read-disturb
The method used to read NAND Flash memory can cause cells near the cell being read to
change over time if the surrounding cells of the block are not rewritten. This typically takes
hundreds of thousands of reads without a rewrite of those cells. The error does not appear when
reading the original cell, but shows up when one of the surrounding cells is finally read. If
the Flash controller does not track the total number of reads across the whole storage device and
rewrite the surrounding data periodically as a precaution, a read-disturb error will likely occur,
resulting in data loss. [5]
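One way a controller could track reads and trigger a precautionary rewrite is sketched below. Per-block counters, the class name ReadDisturbTracker, and the default threshold of 100,000 reads are assumptions for illustration; real controllers choose their own granularity and limits.

```python
from collections import defaultdict


class ReadDisturbTracker:
    """Counts reads per block since the block was last rewritten."""

    def __init__(self, threshold=100_000):      # hypothetical threshold
        self.threshold = threshold
        self.reads_since_rewrite = defaultdict(int)

    def record_read(self, block):
        """Count one read; return True when the block is due for a rewrite."""
        self.reads_since_rewrite[block] += 1
        return self.reads_since_rewrite[block] >= self.threshold

    def record_rewrite(self, block):
        """Rewriting the block's data resets its read counter."""
        self.reads_since_rewrite[block] = 0


tracker = ReadDisturbTracker(threshold=3)        # tiny threshold for demonstration
flags = [tracker.record_read(block=7) for _ in range(3)]
tracker.record_rewrite(block=7)
```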
4. Garbage collection
Data is written to the Flash memory in pages. However, the memory can only be erased in
blocks. If the data in some of the pages of the block are no longer needed (also called stale
pages), only the pages with good data in that block are read and re-written into another
previously erased empty block. Then the free pages left by not moving the stale data are
available for new data. This is a process called garbage collection (GC). Figure 4 shows the
process of the Garbage collection in SSDs.
Figure 4: Process of garbage collection
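The garbage collection steps described above can be sketched as follows. The page model (a tuple of data and a validity flag, with None for an erased page) and the function name are assumptions made for the example.

```python
def garbage_collect(block, erased_block):
    """Copy valid pages from `block` into `erased_block`, then erase `block`.

    Each page is a (data, is_valid) tuple; None represents an erased page.
    Returns the number of pages copied, i.e. the extra Flash writes caused by GC.
    """
    valid_pages = [page for page in block if page is not None and page[1]]
    for index, page in enumerate(valid_pages):
        erased_block[index] = page           # rewrite only the good data
    for index in range(len(block)):
        block[index] = None                  # erase works on the whole block
    return len(valid_pages)


source = [("A", True), ("B", False), ("C", True), ("D", False)]   # B and D are stale
target = [None] * 4
copied = garbage_collect(source, target)
```

The two copied pages are writes the host never asked for, which is why garbage collection contributes to write amplification.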
All SSD drives include some level of garbage collection, but they may differ in when and how
fast they perform the process. Garbage collection is a big part of write amplification on the SSD.
The process of garbage collection involves reading and rewriting data to the Flash memory. This
means that a new write from the host will first require a read of the whole block, a write of the
parts of the block which still include valid data, and then a write of the new data. This can
significantly reduce the performance of the system. Therefore, some SSD controllers implement
background garbage collection (BGC), sometimes called idle garbage collection or idle-time
garbage collection (ITGC), where the controller uses idle time to consolidate blocks of Flash
memory before the host needs to write new data. This enables the performance of the device to
remain high. [4]
If the controller were to garbage collect all of the spare blocks in the background before it
was absolutely necessary, new data from the host could be written without having to move
any data in advance, letting the device operate at peak speed. The trade-off is that some
of those blocks of data are actually not needed by the host and will eventually be deleted, but
the OS has not told the controller this. As a result, the soon-to-be-deleted data is
rewritten to another location in the Flash memory, increasing the write amplification.
In some SSDs, background garbage collection clears only a small number of blocks and
then stops, thereby limiting the number of excess writes. Another solution is an
efficient garbage collection system that performs the necessary moves in parallel with the
host writes. This solution is more effective in write-heavy environments where the SSD is rarely
idle. [4]
5. TRIM
The TRIM command allows operating systems to inform the SSD which blocks of data are no
longer considered in use and can be wiped internally. Because the low-level operation of SSDs
differs significantly from that of traditional hard disks, the typical way in which operating
systems handle operations such as deletes and formats resulted in unanticipated, progressive
degradation of write performance on SSDs. TRIM enables the SSD to handle garbage
collection overhead in advance; that overhead would otherwise significantly slow down future
write operations to the involved blocks. Figures 5 and 6 illustrate the difference between the
write and delete operations of the OS with and without TRIM. [6]
TRIM’s benefits:
  Higher throughput – faster host write speeds because less time is spent writing for GC
  Improved endurance – reduced writes to Flash
  Lower write amplification – less data is rewritten and more free space is available during GC
Figure 5: Write and delete process without TRIM
Figure 6: Write and delete with TRIM
6. Endurance
In NAND Flash memories, data is written by forcing electrons through a layer of electrical
insulation onto a floating transistor gate. As a result, Flash can withstand only a limited
number of write and erase cycles before the insulation is permanently damaged.
In the earliest Flash memories, this might occur after as few as 1,000 write cycles, while in
modern Flash EEPROM the endurance may exceed 1,000,000, but it is by no means infinite.
This limited endurance, as well as the higher cost per bit, means that Flash-based storage is
unlikely to completely supplant magnetic disk drives in the near future. [7]
The time-span over which a ROM remains accurately readable is not limited by write cycling.
The data retention of Flash may be limited by charge leaking from the floating gates of the
memory cell transistors. Leakage is accelerated by high temperatures or radiation.
7. IOPS
IOPS (Input Output Operations Per Second) is a common performance measurement used to
benchmark computer storage devices like hard disk drives (HDD) and solid state drives (SSD).
As with any benchmark, IOPS numbers published by storage device manufacturers do not
guarantee real-world application performance. [4]
The specific number of IOPS possible in any system configuration will vary greatly depending
upon the variables the tester enters into the program, including the balance of read and write
operations, the mix of sequential and random access patterns, the number of worker threads and
queue depth, as well as the data block sizes. There are other factors which can also affect the
IOPS results including the system setup, storage drivers, OS background operations, etc. Also,
when testing SSDs in particular, there are preconditioning considerations that must be taken into
account.
8. Write Amplification
Write amplification (WA) is an undesirable phenomenon associated with Flash memory and
solid-state disk drives (SSDs). Because Flash memory must be erased before it can be rewritten,
the process to perform these operations results in moving (or rewriting) user data more than
once. This multiplying effect increases the number of writes required over the life of the SSD
which shortens the time it can reliably operate. [4]
Many factors affect the write amplification of an SSD; some can be controlled by the user
and some are a direct result of the data written to the SSD and how it is used. Write
amplification is typically measured as the ratio of the writes going to the Flash memory to the
writes coming from the host system. An SSD experiences write amplification as a result of
garbage collection and wear leveling, which increase writes on the drive and reduce its life. A
lower WA is desirable because it reduces the number of P/E cycles on the Flash memory and
thereby increases the life of the SSD.
All SSDs have a write amplification value and it is based on both what is currently being written
and what was previously written to the SSD. In order to accurately measure the value for a
specific SSD, the selected test should be run for enough time to ensure the drive has reached a
steady state condition. [2]
Factors affecting the WA
Many factors affect the write amplification of an SSD. Figure 7 lists the primary factors and how
they affect the write amplification. For factors that are variable, the table notes if it has a direct
relationship or an inverse relationship. For example, as the amount of over-provisioning
increases, the write amplification decreases (inverse relationship). [4]
Factor | Description | Relationship
Garbage collection | The efficiency of the algorithm used to pick the next best block to erase and rewrite | Inverse (good)
Over-provisioning | The percentage of physical capacity which is allocated to the SSD controller (and not given to the user) | Inverse (good)
Free user space | The percentage of the user capacity free of actual user data; requires TRIM, otherwise the SSD gains no benefit from any free user capacity | Inverse (good)
Wear leveling | The efficiency of the algorithm that ensures every block is written as close to an equal number of times as all other blocks as possible | Direct (bad)
Sequential writes | In theory, sequential writes have a write amplification of 1, but other factors will still affect the value | Positive (good)
Random writes | Writing to non-sequential LBAs will have the greatest impact on write amplification | Negative (bad)
Figure 7: Factors affecting the Write Amplification
9. Data Compression Level and Data Entropy
Write amplification is defined as:
Write Amplification (WA) = (data written to the Flash memory) / (data written by the HOST)
Typical data log files and text files are considered to have low entropy. The write amplification
of SSDs for low-entropy data ranges from 0.2 to 0.8, depending on the workload (sequential
vs. random). Efficiently encoded, highly compressed file formats such as MPEG-4 video are
considered to have high entropy. Write amplification for high-entropy data is higher because
little further compression can be achieved and all of the data must be written to the Flash.
Depending on the workload (sequential vs. random), the WA of SSDs for high-entropy data
ranges from 1 to 4. Figure 8 shows how the data content, or entropy (a measure of data
randomness), affects the write amplification of SSDs. [1]
Figure 8: Dependency of Write Amplification to Data Entropy
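The WA definition above is a simple ratio, sketched here with made-up byte counts. The function name and the sample numbers are assumptions for illustration; they are not measurements from any drive.

```python
def write_amplification(flash_bytes_written, host_bytes_written):
    """WA = data written to the Flash memory / data written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


# Hypothetical high-entropy workload: GC and wear leveling add extra Flash writes.
wa_high_entropy = write_amplification(flash_bytes_written=300, host_bytes_written=100)

# Hypothetical low-entropy workload: compression halves the data reaching the Flash.
wa_low_entropy = write_amplification(flash_bytes_written=50, host_bytes_written=100)
```

A value below 1 is only possible when less data reaches the Flash than the host sent, which is the low-entropy (compressible) case described in this section.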
10. Calculation of Useful life of SSDs
Many SSD manufacturers are struggling to classify endurance in terms that are meaningful to
OEMs and end users. "Write/erase cycles per logical block" can be useful, but it does not
answer the real question: "How long will the SSD last in my application?" OEMs need to
understand SSD life in terms of time (years, months, days) instead of "cycles". Defining
and measuring the usage model is therefore critical to making this translation. [8]
In streaming applications, the host writes data from logical block address 0 (LBA0) to LBAn.
In this case the write performance is maximized and the effects of wear leveling and bad block
management are minimized. [10]
The lifetime is defined as:
Lifetime = (NAND Flash Endurance * User capacity) / (Maximum Write speed (MB/s) * Duty cycle)
Duty cycle is the fraction of operating time spent writing (as opposed to reading or idle),
expressed as a decimal in the formula.
To obtain the lifetime in years, we can fold the unit conversions into a single constant of
0.0325. It combines the endurance rating expressed in thousands of cycles, the GB-to-MB
conversion, and the seconds-to-years conversion (1000 * 1024 / 31,536,000 ≈ 0.0325). [8]
Therefore, with endurance in thousands of cycles, user capacity in GB, and write speed in MB/s:
Lifetime (years) = (NAND Flash Endurance * User capacity * 0.0325) / (Maximum Write speed * Duty cycle)
Example 1:
An application monitors public transportation. A 64 GB drive capable of a sustained data
rate of 32 MB/s, using MLC NAND Flash rated at 5,000 write/erase cycles, has a useful life of: [8]
Lifetime = (5 * 64 * 0.0325) / (32 * duty cycle) = 0.325 / duty cycle
For duty cycle = 0.5: Lifetime = 0.325 / 0.5 = 0.65 years
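The streaming lifetime formula and Example 1 can be reproduced with a short sketch. The function and variable names are assumptions for illustration; the constant is computed from a 365-day year and 1024 MB per GB, which gives approximately the 0.0325 used in the text.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Thousands of cycles -> cycles, GB -> MB, seconds -> years.
K = 1000 * 1024 / SECONDS_PER_YEAR      # approximately 0.0325


def streaming_lifetime_years(endurance_k_cycles, capacity_gb,
                             write_mb_per_s, duty_cycle):
    """Lifetime in years for a streaming (sequential write) workload."""
    return endurance_k_cycles * capacity_gb * K / (write_mb_per_s * duty_cycle)


# Example 1: 64 GB MLC drive, 5,000 cycles, 32 MB/s, 50% write duty cycle.
life_years = streaming_lifetime_years(endurance_k_cycles=5, capacity_gb=64,
                                      write_mb_per_s=32, duty_cycle=0.5)
```

Halving the duty cycle doubles the result, since the drive then writes for half as much of each day.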
Very few embedded applications only stream data. Most are transactional database
applications. The concept of write amplification therefore comes into play, since simply
monitoring the host writes (IOPS) does not yield the proper result. [8]
The lifetime formula then becomes:
Lifetime = (NAND Endurance * user capacity * 0.0325 * 1024) / (write IOPS * File size * WA * duty cycle)
Note: for the full formulas, refer to Appendix A and Appendix B.
1024 - The constant for the KB-to-MB conversion
Write IOPS - The number of write input/output operations per second
File size - The size (in KB) of the file with which the write IOPS was measured
Example 2:
A voicemail system manufacturer is considering a 16 GB SSD to replace a rotating disk
drive. The drive uses SLC NAND rated at 100K P/E cycles and delivers 200 write IOPS for an 8 KB file.
The drive does not specify a write amplification factor, so a value of 32 (256 KB block / 8 KB file) will be
used. The OEM estimates the write duty cycle at 25%.
Lifetime = (100 * 16 * 0.0325 * 1024) / (200 * 8 * 32 * 0.25) ≈ 4.16 years
It can clearly be seen that there is a direct correlation between capacity and useful life.
Assuming a full wear-leveling scheme, doubling the capacity doubles the useful life.
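The transactional lifetime formula, Example 2, and the capacity relationship can be checked with the sketch below. The function and parameter names are assumptions for illustration; the constant 0.0325 and the fallback WA of block size divided by file size are taken from the text.

```python
def transactional_lifetime_years(endurance_k_cycles, capacity_gb, write_iops,
                                 file_kb, wa, duty_cycle, k=0.0325):
    """Lifetime in years when the write load is given as IOPS at a file size."""
    numerator = endurance_k_cycles * capacity_gb * k * 1024   # 1024: KB -> MB
    return numerator / (write_iops * file_kb * wa * duty_cycle)


# Fallback WA used in Example 2 when the drive does not specify one.
fallback_wa = 256 / 8                                         # 256 KB block / 8 KB file

life_16gb = transactional_lifetime_years(endurance_k_cycles=100, capacity_gb=16,
                                         write_iops=200, file_kb=8,
                                         wa=fallback_wa, duty_cycle=0.25)
life_32gb = transactional_lifetime_years(endurance_k_cycles=100, capacity_gb=32,
                                         write_iops=200, file_kb=8,
                                         wa=fallback_wa, duty_cycle=0.25)
```

With everything else held constant, the 32 GB result is exactly twice the 16 GB result, which is the direct correlation noted above.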
11. Conclusion
Today, NAND Flash memories are changing as rapidly as the qualification cycles of some
OEMs. Those OEMs therefore need to take a more system-level approach to determining how long
a product must be deployed in the field. They need to determine the usage model and, just as
importantly, how to measure and predict it. From there, they can specify the proper capacity
for their application. [8]
The Netlist Flash engineering group has performed extensive analysis to arrive at an accurate
and simple method for calculating the useful life of SSDs. To support its customers, Netlist
has developed a Life-Calculator that can be used by both technical and non-technical
customers. The calculator is available to Netlist customers upon request.
APPENDIX A
Abbreviations:
K = Conversion Constant
APPENDIX B
Formulas:
1. Conversion Constant:
Conversion constant formula (K) simplifies overall math due to reduction and
removal of cancelable variables.
1.1.
2. Capacity Usage in Time:
Capacity usage in time formula ( ) estimates the overall drive usage in a
given year.
2.1.
3. Life:
Life formula (Life) calculates the overall drive usable life.
3.1.