ROSS: A Design of Read-Oriented STT-MRAM Storage for Energy-Efficient Non-Uniform Cache Architecture Jie Zhang, Miryeong Kwon, Changyoung Park, Myoungsoo Jung, Songkuk Kim Computer Architecture and Memory System Lab, Yonsei University

Page 1:

ROSS: A Design of Read-Oriented STT-MRAM Storage for Energy-Efficient Non-Uniform Cache Architecture

Jie Zhang, Miryeong Kwon, Changyoung Park, Myoungsoo Jung, Songkuk Kim

Computer Architecture and Memory System Lab,

Yonsei University

Page 2:

Summary

Motivation: There is a huge demand to increase the last-level cache size. However, SRAM suffers from high leakage power and low density. STT-MRAM is a good candidate to replace SRAM because of its small cell size and low leakage power.

Challenge: Conventional STT-MRAM cannot directly substitute for SRAM because of its

- High write latency
- High write energy

Solution: We observe a read-oriented data pattern in real applications. We propose ROSS, a read-oriented STT-MRAM storage that can

- dynamically detect the read-oriented data pattern
- reduce the write frequency in STT-MRAM
- accommodate as many read misses as possible in the LLC

Page 3:

Motivation

Data set sizes are increasing explosively.

- Videos: 300 hours of new video per minute
- Graphs: 30 billion photos
- Scientific computing: 10-month spacecraft simulation
- Web: 1 billion web pages

Page 4:

Motivation

Huge demand to increase on-chip last-level cache size in future

Page 5:

Challenges

Traditional SRAM

- Read latency: 397 ps; write latency: 397 ps (fast speed)
- Read energy: 35 pJ; write energy: 35 pJ (reasonable energy)
- Cell area (F^2): 50~120 (low density)
- Leakage: 75.7 mW (high leakage)

Page 6:

Challenges

                   Traditional SRAM   STT-MRAM
Read latency       397 ps             238 ps
Write latency      397 ps             6932 ps
Read energy        35 pJ              13 pJ
Write energy       35 pJ              90 pJ
Cell area (F^2)    50~120             6~40
Leakage            75.7 mW            6.6 mW

STT-MRAM: high density and low leakage, but high write latency and write energy.
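
To make the trade-off concrete, the short Python sketch below computes STT-MRAM-to-SRAM ratios from the figures on this slide. It assumes the STT-MRAM values pair up as read 238 ps / 13 pJ and write 6932 ps / 90 pJ; the dictionary layout and the `ratio` helper are illustrative names, not part of the original deck.

```python
# Figures copied from the slide; names and layout are illustrative.
SRAM = {"read_latency_ps": 397, "write_latency_ps": 397,
        "read_energy_pj": 35, "write_energy_pj": 35, "leakage_mw": 75.7}
# Assumed pairing of the STT-MRAM values (read vs. write), inferred from units.
STT_MRAM = {"read_latency_ps": 238, "write_latency_ps": 6932,
            "read_energy_pj": 13, "write_energy_pj": 90, "leakage_mw": 6.6}


def ratio(metric: str) -> float:
    """Return STT-MRAM / SRAM for one metric (values above 1 favor SRAM)."""
    return STT_MRAM[metric] / SRAM[metric]


for metric in SRAM:
    print(f"{metric}: {ratio(metric):.2f}x")
```

Under these numbers an STT-MRAM write is roughly 17x slower and about 2.6x more energy-hungry than an SRAM write, while reads are cheaper and leakage drops by more than 10x.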

Page 7:

Prior work

Write Request Rerouting: accommodate write requests with idle blocks.

- Parallelizes write requests to hide the long write latency
- Cannot reduce the write energy

Read Preemption: suspend the write operation and give priority to the read operation.

- Mitigates the impact of long write latency on read accesses
- Cannot reduce the write latency

Dead-write bypass: predict and bypass "dead writes" (blocks written once and never read).

- Effectively avoids unnecessary write accesses to STT-MRAM
- Bypassing dead writes alone is not sufficient to avoid massive writes to STT-MRAM

Page 8:

ROSS: key observation

[Figures: last-level cache eviction trend; read-miss distribution]

How can application behavior impact LLC data access?

Large working sets result in frequent cache eviction.

What if we can detect the singular-write blocks and put them in STT-MRAM?

Page 9:

ROSS

Target: build a hybrid last-level cache that

i. avoids many cache read misses with a larger STT-MRAM;
ii. accommodates write updates in SRAM;
iii. uses an intelligent data migration algorithm.

Page 10:

Hybrid-NUCA

Conventional cache hierarchy: uniform latency

Non-uniform cache architecture: different wire latency

Severe wire delay

Page 11:

Hybrid-NUCA

[Figure: LLC controller and cache bank sets built from SRAM banks and STT-MRAM banks (labels: 1-way, 3-way), with arrows for SRAM-SRAM migration and SRAM->STT-MRAM migration]

Migration policy

Address map policy

Less flexibility, but simplified design

Page 12:

Hybrid-NUCA

Migration policy

Address map policy

Data search policy

[Figure: address fields Tag, Index, and Bank set; decoder, tag array, data array, sense amplifiers (S/A), and a comparator ("Match?"); labels: 1-way, 3-way]

Request routes to specific cache bank set

Tag broadcast to each bank for searching
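
The two steps on this slide (route the request to one cache bank set, then broadcast the tag to every bank in that set) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the bit widths, the `Bank` class, the direct-mapped tag store, and the function names are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical geometry: these widths are not given on the slide.
BLOCK_OFFSET_BITS = 6   # 64-byte cache lines
BANK_SET_BITS = 3       # 8 cache bank sets
INDEX_BITS = 8          # 256 entries per bank


def split_address(addr: int) -> tuple[int, int, int]:
    """Split an address into (tag, index, bank_set) fields."""
    block = addr >> BLOCK_OFFSET_BITS
    bank_set = block & ((1 << BANK_SET_BITS) - 1)
    index = (block >> BANK_SET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = block >> (BANK_SET_BITS + INDEX_BITS)
    return tag, index, bank_set


@dataclass
class Bank:
    kind: str  # "SRAM" or "STT-MRAM"
    # Simplification: one stored tag per index (direct-mapped).
    tags: dict[int, int] = field(default_factory=dict)

    def match(self, tag: int, index: int) -> bool:
        # Comparator step: does the tag array entry equal the request tag?
        return self.tags.get(index) == tag


def lookup(bank_sets: list[list[Bank]], addr: int) -> Optional[Bank]:
    """Route to one bank set, then broadcast the tag to every bank in it."""
    tag, index, bank_set = split_address(addr)
    for bank in bank_sets[bank_set]:
        if bank.match(tag, index):
            return bank
    return None


# Usage: place one block in an STT-MRAM bank and search for it.
bank_sets = [[Bank("SRAM"), Bank("STT-MRAM")] for _ in range(1 << BANK_SET_BITS)]
addr = 0x12345680
tag, index, bank_set = split_address(addr)
bank_sets[bank_set][1].tags[index] = tag
hit = lookup(bank_sets, addr)
print(hit.kind if hit else "miss")
```

Fixing the bank set from address bits is what the earlier slide calls "less flexibility, but simplified design": a block can only live in one bank set, so the search never has to leave it.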

Page 13:

Data Flow

[Figure: data flow among L1 Cache, L2 Cache (SRAM, STT-MRAM), and MEM]

(1) Block Fill
(2) Write-Back Miss
(3) Read Hit
(4) ER: Migrate multiple cache lines; HB: Migrate one cache line
(5) Write Hit
(6) Evict Block

Page 14:

Singular write detection

[Figure: LLC controller with 1-bit status registers over blocks labeled C and an 8-bit counter; panels: Initialize, Poll, Write hit, Evict; example values: status bits 00000 and 00100, counter values 254 and 255]
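
The slide does not spell out the detection policy, so the following Python sketch is only one plausible reading of it: each line has a 1-bit status register that is cleared at initialization and set on a write hit, and a line evicted with its bit still clear is treated as a singular-write block. The class and method names are hypothetical, and the role of the 8-bit counter is not modeled here because the slide does not make it recoverable.

```python
class SingularWriteDetector:
    """Toy model with one status bit per line (an assumed reading of the slide)."""

    def __init__(self, num_lines: int) -> None:
        # Initialize: all status bits cleared.
        self.status = [0] * num_lines

    def fill(self, line: int) -> None:
        # A newly filled block has seen no write hit yet.
        self.status[line] = 0

    def write_hit(self, line: int) -> None:
        # The block has been updated in place after its fill.
        self.status[line] = 1

    def evict(self, line: int) -> bool:
        """Return True if the evicted block looks like a singular-write block.

        Assumed policy: a block whose status bit is still 0 was never
        rewritten, so it is a candidate for placement in STT-MRAM.
        """
        singular = self.status[line] == 0
        self.status[line] = 0
        return singular


detector = SingularWriteDetector(num_lines=5)
detector.write_hit(2)
print(detector.status)      # status bits after a write hit on line 2
print(detector.evict(2))    # rewritten block: not singular-write
print(detector.evict(0))    # untouched block: singular-write candidate
```

A single bit per line keeps the hardware cost low, which is consistent with the slide showing only 1-bit status registers next to the blocks.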

Page 15:

Experiment Setup

Evaluation model:

- ND-NUCA: traditional SRAM NUCA model
- DB-NUCA: STT-MRAM NUCA model aware of dead blocks
- HB-ROSS: Hybrid Read-Oriented STT-MRAM Storage
- ER-ROSS: Early-Retirement-aware Read-Oriented STT-MRAM Storage

gem5 simulation parameters and workloads:

Page 16:

Evaluation

DB-NUCA: suffers from long write latency

HB-ROSS: larger capacity and fewer STT-MRAM writes

ER-ROSS: benefits from early retirement

Page 17:

Evaluation

DB-NUCA: suffers from evicting singular-write blocks

ROSS: STT-MRAM accommodates read-intensive data

Page 18:

Evaluation

ND-NUCA: consumes more leakage energy

ROSS: reduces the write energy overhead in STT-MRAM

Page 19:

Thank You