A Case for Refresh Pausing in DRAM Memory Systems

39
A Case for Refresh Pausing in DRAM Memory Systems Prashant Nair Chia-Chen Chou Moinuddin Qureshi 1

description

A Case for Refresh Pausing in DRAM Memory Systems. Prashant Nair Chia-Chen Chou Moinuddin Qureshi. Introduction. Dynamic Random Access Memory (DRAM) used as main memory DRAM stores data as charge on capacitor. DRAM cells leak data!. 1. Leakage. - PowerPoint PPT Presentation

Transcript of A Case for Refresh Pausing in DRAM Memory Systems

Page 1: A Case for Refresh Pausing in DRAM Memory Systems

1

A Case for Refresh Pausing in DRAM Memory Systems

Prashant NairChia-Chen Chou

Moinuddin Qureshi

Page 2: A Case for Refresh Pausing in DRAM Memory Systems

2

• Dynamic Random Access Memory (DRAM) used as main memory• DRAM stores data as charge on capacitor

Leakage

DRAM cells leak data!

DRAM Chip

1

DRAM is a volatile memory Charge leaks quickly

Introduction

Page 3: A Case for Refresh Pausing in DRAM Memory Systems

3

DRAM maintains data by Refresh operations

DRAM Chip

RefreshRefreshRefreshRefresh

JEDEC specified DRAM retention time:64ms (< 85 C)32ms (> 85 C)

Charge on cells restored

DRAM relies on Refresh for data integrity

Refresh: Restoring Data in DRAM

Time between Refresh ≤ Retention Time

Page 4: A Case for Refresh Pausing in DRAM Memory Systems

4

1Gb 2Gb 4Gb 8Gb

2.8%

9%7.7%

5.1%

Chip Density

~18%

~36%

16Gb 32Gb

Time spent in Refresh proportional to number of Rows

Increasing memory capacity More time spent in Refresh

The time for doing Refresh is increasing with chip density

Refresh: A Growing Problem

Page 5: A Case for Refresh Pausing in DRAM Memory Systems

5

Memory unavailable for Read/Write during Refresh

timeNo Refresh

REFRESH

B

Interference due to Refresh time

Wait

Refresh blocks reads Higher read latency

Refresh Blocks Reads

A B

A B Serviced

Page 6: A Case for Refresh Pausing in DRAM Memory Systems

6

8Gb 16Gb 32Gb0%

10%

20%

30%

40%

50%

60% Read Latency

Incr

ease

in R

ead

Late

ncy

8Gb 16Gb 32Gb0%5%

10%15%20%25%30%35%40%

Performance

Perf

orm

ance

Los

s

Our Goal: Reduce the Read Latency impact of Refresh

Impact of Refresh is significant, and increasing

Impact of Refresh

Page 7: A Case for Refresh Pausing in DRAM Memory Systems

7

Introduction & Motivation

Refresh Operation: Background

Refresh Pausing

Evaluation

Alternative Proposals

Summary

Outline

Page 8: A Case for Refresh Pausing in DRAM Memory Systems

8

Row 1Row 2Row 3Row 4Row 5

Row n-1Row n

A DRAM Bank

RefreshRefreshRefresh

Refresh operates on a Row granularity

Refresh Operation

Page 9: A Case for Refresh Pausing in DRAM Memory Systems

9

• Burst Mode:

Memory unavailable until all rows finish refresh

• Distributed Mode:

8K refresh pulses in 64ms

Refresh

64ms

Refresh

64ms

Distributed mode reduces contention from Refresh

Refresh Modes

Page 10: A Case for Refresh Pausing in DRAM Memory Systems

10

Every pulse refreshes a ‘Bundle of rows’

Chip Size Rows in a Refresh bundle (per bank)

512 Mb 11Gb 22Gb 4

4Gb or 8Gb (Twin 4Gb die) 8

Refresh Bundle currently have upto 8 rows, and increasing

Refresh Bundle

Page 11: A Case for Refresh Pausing in DRAM Memory Systems

TRFC is the time to do refresh for every refresh pulse

11

TRFC

unavailable

available

8Gb

unavailable

TRFCavailable

16Gb

TRFC

unavailable

available

32Gb

High TRFC Read waits for refresh for long time

The Latency Wall of Refresh

Current 8Gb chips have TRFC of 350ns >> read latency

Page 12: A Case for Refresh Pausing in DRAM Memory Systems

12

Introduction & Motivation

Refresh Operation: Background

Refresh Pausing

Evaluation

Alternative Proposals

Summary

Outline

Page 13: A Case for Refresh Pausing in DRAM Memory Systems

13

A

time

Refresh

B

Request B arrives

Interrupted

Baseline system

Refresh (Cont.)

Refresh Pausing

B

A Refresh

time

Request B arrives

Insight: Make Refresh Operations Interruptible

Pausing Refresh reduces wait time for Reads

Pausing at arbitrary point can cause data loss

Refresh Pausing

Page 14: A Case for Refresh Pausing in DRAM Memory Systems

14

Bank

Rows

Row Buffer

dcba

Refresh Pulse (4 rows in a bundle)

ChipWith Refresh Pausing

Pause

Read X

X

Without Refresh Pausing

X

Refresh Pausing at Row boundary to service read

Refresh Pausing: When to Pause?

Page 15: A Case for Refresh Pausing in DRAM Memory Systems

15

• Memory Controller generates a Refresh Enable (RE) signal

• Pausing requires ‘active low’ detection of RE

• One way communication only

Memory ControllerRefresh Enable(RE) to DRAM

RE

1

0

Pause

Resume

Refresh Pausing: Interface Details

Page 16: A Case for Refresh Pausing in DRAM Memory Systems

16

• Row Address Counter increments the addresses

• Stop the increment using a simple AND gate

• Active Low Refresh Enable as ‘Refresh Pause’

Address Generator

Row Address CounterEN

Incrementer

Refresh Bundle Addresses

DRAM

RE

Refresh Pausing: Track a Paused Row

Page 17: A Case for Refresh Pausing in DRAM Memory Systems

17

• Scheduler schedules: Read, Write, and Refresh

• Responsible for Pausing Refresh for Read

• Keeps track of refresh time done before Pause

ProcessorBus

MemoryController

Scheduler Read Queue

Write Queue

Refresh Enable

DRAM

Refresh Pausing: Memory Scheduler

Page 18: A Case for Refresh Pausing in DRAM Memory Systems

18

• Pausing can delay Refresh

• JEDEC allows delay of up-to 8 pending refresh

• If 8 pending refresh, then issue ‘Forced Refresh’

• Forced Refresh cannot be Paused

Reads/Writes Forced Refresh

Refresh Pulses

Refresh Issued

Refresh Not Issued

Forced Refresh for data integrity

Forced Refresh

Page 19: A Case for Refresh Pausing in DRAM Memory Systems

19

Introduction & Motivation

Refresh Operation: Background

Refresh Pausing

Evaluation

Alternative Proposals

Summary

Outline

Page 20: A Case for Refresh Pausing in DRAM Memory Systems

20

• Simulator: uSIMM from Memory Scheduling Championship (MSC)

• Workloads: MSC SuiteCOMMERCIAL(5), PARSEC(9), BIOBENCH(2) and SPEC(2)

• Configuration:Number of Cores 4Last Level Cache 1MBDRAM (DDR3) 8 Chips/Rank, 8Gb/Chip Channels, Ranks, Banks 4,2,8Refresh (Baseline) Distributed (JEDEC)

• Results presented for temperature > 85C (paper also has <85C)

Experimental Setup

Page 21: A Case for Refresh Pausing in DRAM Memory Systems

21

- Refresh Pausing gives ~7% read latency reduction for an 8Gb chip

COMMERCIAL SPEC PARSEC BIOBENCH GMEAN0.75

0.80

0.85

0.90

0.95

1.00

Normalized Read LatencyRefresh Pausing No Refresh

Nor

mal

ized

Rea

d La

tenc

y

7%

Results: Read Latency

Page 22: A Case for Refresh Pausing in DRAM Memory Systems

22

- Refresh Pausing gives ~5% performance improvement for an 8Gb chip

COM-MERCIAL

SPEC PARSEC BIOBENCH GMEAN1.021.031.041.051.061.071.081.091.101.11

Performance ComparisonRefresh Pausing No Refresh

Spee

dup

Results: Performance

Page 23: A Case for Refresh Pausing in DRAM Memory Systems

23

Refresh Pausing more effective as chips density increases

8Gb 16Gb 32Gb1.0

1.1

1.2

1.3

1.4Impact of Density on Refresh PausingRefresh Pausing No Refresh

Spee

dup

Results: Impact of Chip Density

Page 24: A Case for Refresh Pausing in DRAM Memory Systems

24

Introduction & Motivation

Refresh Operation: Background

Refresh Pausing

Evaluation

Alternative Proposals

Summary

Outline

Page 25: A Case for Refresh Pausing in DRAM Memory Systems

25

• Elastic Refresh waits for idle period before issuing a refresh

• Estimates average inter-arrival time of memory request

3 unitsA

Request A

B

Request B

time

time

time

A

Request A

B

Request B4 units

Refresh

A

Request A

B

Request B

Refresh

7 units

Wait

No Refreshes

With Refreshes

Elastic Refresh

The “Wait and Watch” policy can increase wait times

Elastic Refresh for Scheduling Refresh [MICRO’10]

Page 26: A Case for Refresh Pausing in DRAM Memory Systems

26

COM-MERCIAL

SPEC PARSEC BIOBENCH GMEAN0.90

0.95

1.00

1.05

1.10

1.15Comparision of Elastic RefreshElastic Refresh Refresh Pausing No Refresh

Spee

dup

Refresh Pausing outperforms Elastic Refresh

Comparison with Elastic Refresh

Page 27: A Case for Refresh Pausing in DRAM Memory Systems

27

Reduce bundles size and have more bundles

TREFI TREFI/2

TRFC TRFC TRFC/2 TRFC/2 TRFC/2 TRFC/2

TREFI/2 TREFI/2

DDR4 x2 ModeDDR3 Distributed Mode

• In x2 mode, TREFI is reduced by 2 (x4 mode by 4)

• In x2 mode TRFC is reduced by 2 (x4 mode by 4)

Fine Grained Refresh to reduce contention of Refresh

DDR4 proposals: x2 and x4 modes

Page 28: A Case for Refresh Pausing in DRAM Memory Systems

28

DDR4

x2

DDR4

x4

Paus

ing

No R

efre

sh

DDR4

x2

DDR4

x4

Paus

ing

No R

efre

sh

16Gb 32Gb

1.00

1.05

1.10

1.15

1.20

1.25

1.30

1.35

1.40

Spee

dup

DDR4 modes (x2 and x4) useful but not enough

Comparison with DDR4

Page 29: A Case for Refresh Pausing in DRAM Memory Systems

29

Introduction & Motivation

Refresh Operation: Background

Refresh Pausing

Evaluation

Alternative Proposals

Summary

Outline

Page 30: A Case for Refresh Pausing in DRAM Memory Systems

30

• DRAM relies on Refresh for data integrity

• Time for Refresh increases with chip density

• Refresh blocks read, increases read latency

• Refresh Pausing: make Refresh Interruptible

• Pausing provides 5% improvement for 8Gb, increases with higher density

• Applicable also to DDR4 (fine grained refresh)

Summary

Page 31: A Case for Refresh Pausing in DRAM Memory Systems

31

THANK YOU

Page 32: A Case for Refresh Pausing in DRAM Memory Systems

32

Refresh+Read• Reads operate on a rank• Refreshes may also operate on the same rank• DRAMs serve only a single request at a time

Scheduler

Read Queue

Reads

Rank

Refresh

Page 33: A Case for Refresh Pausing in DRAM Memory Systems

33

Refresh Row Bundle

• TRFC : Time to refresh one bundle of rows • TREC : Current Recovery Time• TREFI : Time until next bundle refresh

Larger refresh-row bundle implies larger TRFC

TRFC

TRECREFRESH REFRESH

TREFI

Row

1

Row

n

Refresh Row Bundle

Page 34: A Case for Refresh Pausing in DRAM Memory Systems

34

Hierarchically organized as Channels, Ranks and Banks

Chip

DRAM Organization

Rank 1

Rank 2

Channel

Banks

Rows

READ

Page 35: A Case for Refresh Pausing in DRAM Memory Systems

35

Refresh ModesBurst and Distributed Mode

Bank

Rows

Rank

Chips

Refresh

Burst Mode

Refresh

In burst mode, all rows in all banks refresh simultaneously

Distributed Mode

Distributed mode: Only a few rows in all banks refresh; refresh is distributed in time

Page 36: A Case for Refresh Pausing in DRAM Memory Systems

36

Transactions in DRAMsThree transactions of concern– Reads– Writes – Refreshes

DRAM

Processor Bus

ReadWrite

Refresh

Mismanagement of requests leads

to collisions!

A scheduler is needed to manage requests to DRAM

Page 37: A Case for Refresh Pausing in DRAM Memory Systems

37

Temperature Sensitivity of Refresh Pausing

- Upto 22% increase in speedup for future chips

The savings of Refresh Pausing is higher whileoperating at high temperatures

<85C<85C

>85C>85C

No RefreshRefresh Pausing

No RefreshRefresh Pausing

0.00%5.00%

10.00%15.00%20.00%25.00%30.00%35.00%40.00%

Temperature Sensitivity

8Gb16Gb32Gb

Page 38: A Case for Refresh Pausing in DRAM Memory Systems

38

Auto and Self Refresh• Special Refresh Modes for DRAMs

• Auto Refresh – Internal Counter issues pulses in distributed fashion (CBR and RAS only)

• Self Refresh – DRAM is internally refreshed at a power optimized rate (Activity == 0)

Self Refresh Modes are only used when DRAMs stay idle

Page 39: A Case for Refresh Pausing in DRAM Memory Systems

39

Mitigating Penalty

• Pause a refresh bundle at row granularity

• TRPC = row cycle time + current recovery time• Current recovery time is small for individual rows

• Thus refreshes can be made interruptible

a. Maximum Refresh penalty without pausing is TRFC

b. Maximum Refresh penalty with pausing is to TRPC