HEC: Improving Endurance of High Performance Flash-based Cache Devices

15
HEC: Improving Endurance of High Performance Flash-based Cache Devices Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman, Robert Wood Patent Pending Technology

description

Nisha Talagala, Fusion-io lead architect, presented “HEC: Improving Endurance for Flash Based Cache Devices.” Nisha's talk, given on July 1, 2013, focuses on how a fundamental part of the data management stack – caching – can be optimized for flash. While caching has been around for a long time, the bulk of caching policies have focused on DRAM caches. In this work, Nisha and collaborators describe the unique pressures that exist in flash caches that are not present in conventional DRAM caches, and what can be done to optimally manage cache workloads on flash.

Transcript of HEC: Improving Endurance of High Performance Flash-based Cache Devices

Page 1: HEC: Improving Endurance of High Performance Flash-based Cache Devices

HEC: Improving Endurance of High Performance Flash-based Cache Devices Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman, Robert Wood

Patent Pending Technology

Page 2: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Why Flash Caching?

2

• IOPS closer to

application

• Read from cache;

write to primary storage

• No need to reconfigure

storage or applications

• Preserve application

mobility

• Reduce storage costs

Operating System

Application

ioMemory

Primary Storage

SL

OW

(m

illiseco

nd

s)

FAST

(microseconds)

• Shared across servers for HA

• Storage services (snapshots, replication, etc)

• IOPS bottleneck

• Scaling IOPS is expensive

$

• High IOPS

• Low $/IOPS

• Local to server

Patent Pending Technology

Page 3: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Flash Caches are Different

▸ Workloads

• Cache workloads are more write intensive – read misses become

writes yielding Cache Layer Write Amplification (CLWA)

▸ Endurance of media

• Limited endurance and getting more limited with new NAND node

geometries

▸ Un-coordination at the Caching and Flash Translation

Layer (FTL)

• Can cause increased Flash Layer Write Amplification (FLWA)

▸ Techniques for reducing FLWA and CLWA must consider

impact to cache hit rate 3 Patent Pending Technology

Page 4: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Example – TPC-E Polluted Workload

TCWA of 40-50x and low hit rate demonstrates that opportunity exists for reducing writes

and improving hit rate

GC – garbage collection

CLWA – Cache Layer Write Amplification (cache writes / original writes)

FLWA - Flash Layer Write Amplification (GC writes / cache writes)

TCWA – Total Combined Write Amplification (CLWA * FLWA)

Original Reads

(GiB)

Original Writes

(GiB)

Cache

Size (GiB)

Cache

Writes

(GiB)

GC Writes

(GiB

Total

Writes

(GiB)

CLWA FLWA TCWAHit Rate

(%)

331.9 36.8 80 322.13 1553.98 1876.11 8.75 5.82 50.93 14.03

100 300.11 1459.13 1759.24 8.16 5.86 47.82 20.67

120 275.83 1352.01 1627.84 7.5 5.9 44.25 27.98

Conditions of simulation

1. No admission control

2. Tail-drop GC policy

3. Eviction performed at

cache layer

Writes due to

cache misses

Writes due to

garbage collection

4 Patent Pending Technology

Page 5: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Techniques for reducing CLWA and

FLWA

5

▸ CLWA

• How do admission policies

affect device endurance?

• Can cache eviction benefit

from knowledge of the FTL?

▸ FLWA

• Can garbage collection further

reduce write amplification?

• What if eviction was

performed by the FTL?

Cache

Admission

Policy

Cache

Eviction

Policy

NAND Flash

Garbage

Collector

Flash Translation Layer

I/O Requests (read, write)

Cache

Controller

Flash

Controller

What is the combined impact of these techniques on hit rate and endurance?

Patent Pending Technology

Page 6: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Reducing CLWA – Cache Admission

Policies

Admission Policy Description

Admit All All data is admitted to cache

Selective Sequential

Rejection (SSEQR)

All non-sequential and short sequential sector

ranges are admitted

Touch Count (TC) Sectors or collections of sectors are admitted

after they have been accessed N times

Sequential and Touch

Count (SSEQR + TC)

Admission if data meets criteria for both SSEQR

and TC

Admission policies can be used for improving the quality of data admitted to the cache thus

improving hit rate, decreasing read misses and in turn writes to media

6 Patent Pending Technology

Page 7: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Reducing CLWA - Admission Policies

7

TPC-E Polluted Workload Trace

Selective admission policies

• Reduce bytes written

• Reduce segment erase

counts

• Improve read hit rate

Patent Pending Technology

Page 8: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Reducing FLWA – FTL GC Policies

GC Policy Description

Greedy Victim segment is selected based on having the largest

amount of “invalid” data

Cost Benefit Segment selected based on “invalid” data percentage

and age of data in the segment [Rosenblum LFS92]

Oldest (tail drop) Segment selected is the oldest segment or tail of log

More selective GC policies can be used to reduce FLWA

8 Patent Pending Technology

Page 9: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Reducing FLWA – GC Policy

9

TPC-E Polluted Workload

More selective GC policies

• Modest reductions in

FLWA

• No noticeable change in

hit rate and erased

segments

Patent Pending Technology

Page 10: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Valid sector

Invalid sector

Reserve

Reducing FLWA – Cache Eviction Modes

Cache contents

Sector LBAs

Admission

Control

Cache Controller Flash Controller

Log Structured

Segment 0 Segment N

……

Eviction = LRW Eviction

(TRIM to Flash Controller for invalidation)

Cache-based Eviction

Garbage Collection = GC Policy (e.g. Tail Drop)

Admission

Control

Cache

Controller Flash Controller Log Structured

Segment 0 Segment N

……

GC-based Eviction

Garbage Collection/Eviction = more selective GC policy

Patent Pending Technology

Page 11: HEC: Improving Endurance of High Performance Flash-based Cache Devices

k

50k

100k

150k

200k

250k

300k

1 21 41 61 81 101 121 141 161 181 201 221

Nu

mb

er

of

Inv

alid

Secto

rs /

Seg

men

t

Segment Number

Reducing FLWA – Eviction Modes

and Validity Distribution

11

Improved victim segment selection and improved GC effectiveness from non-uniform distribution

Cache-based eviction mode (LRW) GC-based eviction mode

k

50k

100k

150k

200k

250k

300k

1 21 41 61 81 101 121 141 161 181 201 221

Nu

mb

er

of

Inv

alid

Secto

rs /

S

eg

men

t

Segment Number

TPC-E Workload

Patent Pending Technology

Page 12: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Reducing FLWA – GC Policy and

Eviction Mode

12

Significant reduction in FLWA and segments erased with GC-based eviction while

maintaining hit rate

Cache-based eviction mode (LRW) GC-based eviction mode

TPC-E Polluted Workload

Patent Pending Technology

Page 13: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Original

Reads

(GiB)

Original

Writes

(GiB)

Cache

Size

(GiB)

Cache

Writes

(GiB)

GC

Writes

(GiB

Total

Writes

(GiB)

CLWA FLWA TCWAHit Rate

(%)

331.9 36.8 80 322.13 1553.98 1876.11 8.75 5.82 50.93 14.03

100 300.11 1459.13 1759.24 8.16 5.86 47.82 20.67

120 275.83 1352.01 1627.84 7.5 5.9 44.25 27.98

Collective Endurance Impact

13

TCWA reduced while improving read cache hit rate using more selective admission, eviction,

and GC policies

TPC-E Polluted Trace - Before

Admission Policy = ADMIT_ALL

Eviction Mode = Cache-based

Eviction Policy = LRW

TPC-E Polluted Trace - After

Original

Writes

(GiB)

Original

Writes

(GiB)

Cache

Size

(GiB)

Cache

Writes

(GiB)

GC

Writes

(GiB)

Total

Writes

(GiB)

CLWA FLWA TCWAHit Rate

(%)

36.8 36.8 80 70.1 169.19 239.29 1.9 3.41 6.5 46.59

100 37.65 169.05 206.7 1.02 5.49 5.62 59.78

120 37.65 44.8 82.45 1.02 2.19 2.24 59.78

Admission Policy = SSEQR+TC

Eviction Mode = GC-based

Eviction Policy = cost_benefit

Patent Pending Technology

Page 14: HEC: Improving Endurance of High Performance Flash-based Cache Devices

Conclusions

• CLWA and FLWA exists and their combined effects can

increase original write workloads by 10x to 50x

• We have identified techniques that provide for reduction

of CLWA up to 20x and FLWA up to 10x

• We have demonstrated that combinations of such

techniques reduce TCWA up to 20x yielding significant

improvements in endurance and use of PBW

• We show that such techniques maintain or improve hit

rate

14 Patent Pending Technology

Page 15: HEC: Improving Endurance of High Performance Flash-based Cache Devices

f u s i o n i o . c o m | R E D E F I N E W H A T ’ S P O S S I B L E

T H A N K Y O U