Realistic Memories and Caches – Part III Li- Shiuan Peh

23
Realistic Memories and Caches – Part III Li-Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology April 4, 2012 L15-1 http:// csg.csail.mit.edu/6.S078

description

Realistic Memories and Caches – Part III Li- Shiuan Peh Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology. 2-level Data Cache Interface (0,n). req. req. accepted. accepted. L1 cache. L2 cache. Processor. DRAM. ReqRespMem. resp. resp. - PowerPoint PPT Presentation

Transcript of Realistic Memories and Caches – Part III Li- Shiuan Peh

Page 1: Realistic Memories and Caches – Part III Li- Shiuan Peh

Realistic Memories and Caches – Part III

Li-Shiuan PehComputer Science & Artificial Intelligence Lab.Massachusetts Institute of Technology

April 4, 2012 L15-1http://csg.csail.mit.edu/

6.S078

Page 2: Realistic Memories and Caches – Part III Li- Shiuan Peh

2-level Data Cache Interface (0,n)

interface mkL2Cache; method ActionValue#(Bool) req(MemReq r); method ActionValue#(Maybe#(Data)) resp;endinterface

L1cache

Processor DRAM

ReqR

espM

em

reqaccepted

respL2

cache

reqaccepted

resp

April 4, 2012 L15-2http://csg.csail.mit.edu/

6.S078

Page 3: Realistic Memories and Caches – Part III Li- Shiuan Peh

2-level Data Cache Internals

L1 Cache

reqaccepted

resp

hit

wire

missFifo

evict wire

miss wire

TagDataStore

L2 Cache

reqaccepted

resp

missFifo

ReqR

espM

em

TagDataStorerespFifo

reqFifo

Let’s build an inclusive L2 with our earlier direct-mapped L1

April 4, 2012 L15-3http://csg.csail.mit.edu/

6.S078

Page 4: Realistic Memories and Caches – Part III Li- Shiuan Peh

L2 cache: hit caserule procReq(!missFifo.notEmpty); let req = reqFifo.first; let hitOrMiss <- store.tagLookup(req.addr); if(hitOrMiss.hit) begin let line <- store.dataAccess( getCacheDataReqLine(hitOrMiss.id, req)); if(req.op == Ld) respFifo.enq(unpack(pack(line.Line))); reqFifo.deq; end :

:endrule

April 4, 2012 L15-4http://csg.csail.mit.edu/

6.S078

Page 5: Realistic Memories and Caches – Part III Li- Shiuan Peh

dirty case

else if (hitOrMiss.dirty) begin let line <- store.dataAccess(CacheDataReq{id: hitOrMiss.id, reqType: tagged InvalidateLine}); mem.reqEnq(MemReq{op: St, addr: hitOrMiss.addr,

data: unpack(pack(line.Line))}); end else

What about corresponding line in L1?

April 4, 2012 L15-5http://csg.csail.mit.edu/

6.S078

Page 6: Realistic Memories and Caches – Part III Li- Shiuan Peh

miss case

else begin missFifo.enq(tuple2(hitOrMiss.id, req)); req.op = Ld; mem.reqEnq(req); reqFifo.deq; end

April 4, 2012 L15-6http://csg.csail.mit.edu/

6.S078

Page 7: Realistic Memories and Caches – Part III Li- Shiuan Peh

Process miss from memory rule procMiss; match {.id, .req} = missFifo.first; Line line = req.op == Ld? unpack(pack

(mem.respFirst)) : unpack(pack(req.data)); Bool dirty = req.op == Ld? False: True; let l <- store.dataAccess(CacheDataReq{id: id,

reqType: tagged FillLine (tuple3(line, req.addr, dirty))});

if(req.op == Ld) respFifo.enq(unpack(pack(line))); missFifo.deq; mem.respDeq; endrule

April 4, 2012 L15-7http://csg.csail.mit.edu/

6.S078

Page 8: Realistic Memories and Caches – Part III Li- Shiuan Peh

How will inclusive L2 need to change to interface with a n-way set-associative L1?

L1 Cache

reqaccepted

resp

hit

wire

missFifo

evict wire

miss wire

TagDataStore

L2 Cache

reqaccepted

resp

missFifo

ReqR

espM

em

TagDataStorerespFifo

reqFifo

When a L2 cache line is evicted, the corresponding L1 cache line needs to be invalidated

- Interface: + Invalidate information- L1 cache logic: tagLookup, invalidate

April 4, 2012 L15-8http://csg.csail.mit.edu/

6.S078

Page 9: Realistic Memories and Caches – Part III Li- Shiuan Peh

Multi-Level Cache Organizations

Inclusive vs. Non-Inclusive vs. Exclusive

Memory disk

L1 Icache

L1 Dcacheregs L2 Cache

Processor

Separate data and instruction caches, vs. a Unified cache

April 4, 2012 L15-9http://csg.csail.mit.edu/

6.S078

Page 10: Realistic Memories and Caches – Part III Li- Shiuan Peh

10

Write-Back vs. Write-Through Caches inMulti-Level Caches

Write backWrites only go into top level of hierarchyMaintain a record of “dirty” linesFaster write speed (only has to go to top level to be considered complete)

Write throughAll writes go into L1 cache and then also write through into subsequent levels of hierarchyBetter for “cache coherence” issuesNo dirty/clean bit records requiredFaster evictions

Page 11: Realistic Memories and Caches – Part III Li- Shiuan Peh

SummaryChoice of cache interfaces

Combinational vs non-combinational responses Blocking vs non-blocking FIFO vs non-FIFO responses

Many possible internal organizations Direct Mapped and n-way set associative are the

most basic ones Writeback vs Write-through Write allocate or not

Multi-level caches: Likely organizations? Associativity, Block Size?

next integrating into the pipeline

April 4, 2012 L15-11http://csg.csail.mit.edu/

6.S078

Page 12: Realistic Memories and Caches – Part III Li- Shiuan Peh

Recap: Six Stage Pipeline

F

Fetch

fr D

Decode

dr R

RegRead

rr X

Execute

xr M

Memory

mr W

Write-back

pc

Writeback feeds back directly to ReadReg rather than through FIFO

April 4, 2012 L15-12http://csg.csail.mit.edu/

6.S078

Page 13: Realistic Memories and Caches – Part III Li- Shiuan Peh

Multi-cycle execute

rr xr M

Incorporating a multi-cycle operation into a stage

R

m1 M2 m2 M3M1

X

April 4, 2012 L15-13http://csg.csail.mit.edu/

6.S078

Page 14: Realistic Memories and Caches – Part III Li- Shiuan Peh

Multi-cycle function unitmodule mkMultistage(Multistage); State1 m1 <- mkReg(); State2 m2 <- mkReg(); method Action request(Operand operand); m1.enq(doM1(operand)); endmethod

rule s1; m2.enq(doM2(m1.first)); m1.deq(); endrule

method ActionValue Result response(); return doM3(m2.first); m2.deq(); endmethodendmodule

Perform first stage of pipeline

Perform middle stage(s) of

pipeline

Perform last stage of pipeline and return result

April 4, 2012 L15-14http://csg.csail.mit.edu/

6.S078

Page 15: Realistic Memories and Caches – Part III Li- Shiuan Peh

Single/Multi-cycle pipe stagerule doExec; ... initialization let done = False; if (! waiting) begin if(isMulticycle(...)) begin

multicycle.request(...); waiting <= True; end else begin result = singlecycle.exec(...); done = True; end end else begin result <- multicycle.response(); done = True; waiting <= False; end

if (done) ...finish upendrule

If not busy start multicycle operation, or…

…perform single cycle operation

If busy, wait for response

Finish up if have a result

April 4, 2012 L15-15http://csg.csail.mit.edu/

6.S078

Page 16: Realistic Memories and Caches – Part III Li- Shiuan Peh

Adding a cache (multi-cycle unit!) into the pipeline

Architectural state:

Which stages need to be modified?

InstCache iCache <- mkInstCache(dualMem.iMem); ReqRespMem l2Cache <- mkL2Cache(dualMem.dMem); DataCache dCache <- mkDataCache(l2Cache);

April 4, 2012 L15-16http://csg.csail.mit.edu/

6.S078

Page 17: Realistic Memories and Caches – Part III Li- Shiuan Peh

Instruction fetch stageProcessor sends request to I$

Processor receives response from I$

if(!inFlight)inFlight <- iCache.req(fetchPc);

let instResp <- iCache.resp; if(instResp matches tagged Valid .d) :

April 4, 2012 L15-17http://csg.csail.mit.edu/

6.S078

Page 18: Realistic Memories and Caches – Part III Li- Shiuan Peh

Memory read stageProcessor sends request to D$

Processor receives response from D$

if(! poisoned && (decInst.i_type==Ld || decInst.i_type==St) && ! memInFlight && mr.notFull)

inFlight <- dCache.req(MemReq{op:execInst.itype==Ld ? Ld : St, addr:execInst.addr, data:execInst.data});

let data <- dCache.resp; if(! poisoned && (decInst.i_type == Ld || decInst.i_type == St )) if(isValid(data))

April 4, 2012 L15-18http://csg.csail.mit.edu/

6.S078

Page 19: Realistic Memories and Caches – Part III Li- Shiuan Peh

Put it all together: Pipeline+CachesInstruction Fetch Hit

F

Fetch

fr D

Decode

dr R

RegRead

rr X

Execute

xr M

Memory

mr W

Write-back

pc

L1cache

Processor DRAM

ReqR

espM

em

reqaccepted

respL2

cache

reqaccepted

resp

April 4, 2012 L15-19http://csg.csail.mit.edu/

6.S078

Page 20: Realistic Memories and Caches – Part III Li- Shiuan Peh

Put it all together: Pipeline+CachesInstruction Fetch Miss

F

Fetch

fr D

Decode

dr R

RegRead

rr X

Execute

xr M

Memory

mr W

Write-back

pc

L1cache

Processor DRAM

ReqR

espM

em

reqaccepted

respL2

cache

reqaccepted

resp

April 4, 2012 L15-20http://csg.csail.mit.edu/

6.S078

Page 21: Realistic Memories and Caches – Part III Li- Shiuan Peh

Put it all together: Pipeline+CachesInstruction Fetch not accepted

F

Fetch

fr D

Decode

dr R

RegRead

rr X

Execute

xr M

Memory

mr W

Write-back

pc

L1cache

Processor DRAM

ReqR

espM

em

reqaccepted

respL2

cache

reqaccepted

resp

April 4, 2012 L15-21http://csg.csail.mit.edu/

6.S078

Page 22: Realistic Memories and Caches – Part III Li- Shiuan Peh

Put it all together: Pipeline+CachesMultiple instructions with non-blocking caches

L1cache

Processor DRAM

ReqR

espM

em

reqaccepted

respL2

cache

reqaccepted

resp

F

Fetch

fr D

Decode

dr R

RegRead

rr X

Execute

xr M

Memory

mr W

Write-back

pc

missFifo

April 4, 2012 L15-22http://csg.csail.mit.edu/

6.S078

Page 23: Realistic Memories and Caches – Part III Li- Shiuan Peh

Where are the caches?

April 4, 2012 L15-23http://csg.csail.mit.edu/

6.S078