ECE 5900 Computer Engineering Seminar Ying Xu Mar 4, 2005 ...Memory access scheduling greatly...

Memory Access Scheduling

ECE 5900 Computer Engineering Seminar

Ying Xu Mar 4, 2005

Instructor: Dr. Chigan

Outline

- Introduction
- Modern DRAM architecture
- Memory access scheduling
  - Structure of access scheduler
  - Scheduling policies
- Experimental results
  - First-ready scheduling
  - Aggressive reordering
- Conclusions

Introduction

- Bandwidth of memory chips has increased dramatically
  - DDR2, SDRAM
- Media processors
  - Streaming memory reference patterns
  - Memory bandwidth bottleneck

Intro (contd)

- Pipelining memory accesses
  - Maximizes the memory bandwidth
  - Sequential accesses to different rows of the same bank can't be pipelined
- Memory access scheduling
  - Reorder memory operations
    - Bank precharge, row activation, column access
  - Memory references are completed out of order

Intro (contd)

Characteristics of DRAM architecture

- DRAMs are not truly random access devices
- 3-dimensional memories
  - Bank
  - Row
  - Column
- 3 operations
  - Bank precharge
  - Row activation
  - Column access
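
The bank/row/column organization and the three operations can be sketched in a few lines. This is only an illustration: the field widths, the bit layout, and the helper names `split_address` and `operations_needed` are assumptions made for the example and do not come from the slides.

```python
from typing import List, NamedTuple, Optional

# Assumed field widths, chosen only for illustration.
BANK_BITS, ROW_BITS, COL_BITS = 2, 12, 8


class DramAddress(NamedTuple):
    bank: int
    row: int
    column: int


def split_address(addr: int) -> DramAddress:
    """Split a flat address into (bank, row, column).

    Assumed layout, from high bits to low bits: row | bank | column.
    """
    column = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return DramAddress(bank, row, column)


def operations_needed(active_row: Optional[int], target_row: int) -> List[str]:
    """List the operations a reference needs in its bank.

    active_row is the row currently open in the bank, or None if the
    bank is precharged.
    """
    if active_row == target_row:
        return ["column access"]
    if active_row is None:
        return ["row activation", "column access"]
    return ["bank precharge", "row activation", "column access"]
```

A reference to the row that is already open needs only a column access, which is why the cost of an access depends on what came before it.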

DRAM organization

Resource constraints of DRAMs

- DRAM resources
  - Internal banks
  - A single set of address lines
  - A single set of data lines
- Different operations place different demands on these resources

Bank state

Memory access scheduling

- Process of ordering DRAM operations
  - Subject to resource constraints
- Simplest: oldest pending reference first
  - Inefficient
  - The DRAM may not be ready for the oldest reference
  - Leaves available resources idle
- A more sophisticated scheduling algorithm is needed

Memory access scheduler structure

Memory access scheduling policies

Memory access scheduling algorithm

- Combination of the policies used by the precharge manager, row arbiter, column arbiter, and address arbiter
- The address arbiter decides which of the selected precharge, row, and column operations to perform
  - Choices: in-order, priority, precharge operation first, row operation first, column operation first
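
The address arbiter's role can be sketched as a function that picks one of the candidate operations handed to it. This is a simplified illustration, not the scheduler from the paper: the `Candidate` type, the `age` field, and the policy strings are hypothetical, "in-order" is approximated as "oldest candidate wins", and the "priority" choice is left out because the slide does not define it.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    kind: str  # "precharge", "row", or "column"
    age: int   # cycles the underlying reference has waited (larger = older)


# Policy name -> kind of operation that policy favors (names are assumptions).
FAVORED_KIND = {
    "precharge-first": "precharge",
    "row-first": "row",
    "column-first": "column",
}


def address_arbiter(candidates: List[Candidate], policy: str) -> Optional[Candidate]:
    """Pick the single operation to issue on the address lines this cycle."""
    if not candidates:
        return None
    if policy == "in-order":
        # Simplification: the operation of the oldest reference wins.
        return max(candidates, key=lambda c: c.age)
    if policy in FAVORED_KIND:
        favored = [c for c in candidates if c.kind == FAVORED_KIND[policy]]
        # Fall back to the oldest remaining candidate if the favored kind is absent.
        return max(favored or candidates, key=lambda c: c.age)
    raise ValueError(f"unknown policy: {policy}")
```

For example, with one precharge, one row, and one column candidate, "column-first" returns the column operation regardless of age, while "in-order" returns whichever belongs to the oldest reference.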

Experimental setup

- Streaming media processors are preferred
  - Streams lack temporal locality
  - Stream transfer bandwidth drives processor performance
- The Imagine stream processor is simulated
  - Processor frequency: 500 MHz
  - DRAM frequency: 125 MHz
  - Peak system bandwidth: 2 GB/s
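
As a quick check on these figures, the peak bandwidth can be expressed per clock cycle. This is plain arithmetic on the three numbers above, assuming 1 GB = 10^9 bytes and taking the 500 MHz figure to be the processor clock; the names are illustrative only.

```python
PEAK_BANDWIDTH = 2_000_000_000   # bytes per second (assumes 1 GB = 10**9 bytes)
DRAM_CLOCK = 125_000_000         # Hz
PROCESSOR_CLOCK = 500_000_000    # Hz (assumed to be the processor clock)

# Bytes the memory system must move every cycle to reach the peak rate.
bytes_per_dram_cycle = PEAK_BANDWIDTH // DRAM_CLOCK
bytes_per_processor_cycle = PEAK_BANDWIDTH // PROCESSOR_CLOCK


def utilization(sustained_bytes_per_second: float) -> float:
    """Fraction of the peak bandwidth that a sustained rate achieves."""
    return sustained_bytes_per_second / PEAK_BANDWIDTH
```

Under these assumptions the peak rate corresponds to 16 bytes per DRAM cycle, or 4 bytes per processor cycle.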

Experimental setup (contd)

- Benchmarks and media processing applications

In-order scheduling

- In-order access scheduler
  - No access reordering
  - A column access is performed only for the oldest pending reference; the same holds for bank precharge and row activation
  - Serves as the baseline
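
A rough feel for the in-order baseline comes from a fully serialized cost model, in which each reference finishes before the next one starts. The latencies below are assumed values picked for illustration, and `in_order_cycles` is a hypothetical helper, not something defined in the presentation.

```python
from typing import Dict, Iterable, Tuple

# Assumed operation latencies in cycles, for illustration only.
T_PRECHARGE, T_ACTIVATE, T_COLUMN = 3, 3, 1


def in_order_cycles(refs: Iterable[Tuple[int, int]]) -> int:
    """Cycles needed to serve (bank, row) references strictly in order.

    No reordering and no overlap between references: every operation
    is issued only for the oldest pending reference.
    """
    active_row: Dict[int, int] = {}  # bank -> row currently open
    cycles = 0
    for bank, row in refs:
        if active_row.get(bank) != row:
            if bank in active_row:
                cycles += T_PRECHARGE  # close the row that is open
            cycles += T_ACTIVATE       # open the requested row
            active_row[bank] = row
        cycles += T_COLUMN
    return cycles
```

With these assumed latencies, four references to one bank that alternate between two rows cost 25 cycles, while the same four references grouped by row cost 13. The in-order scheduler cannot exploit that difference because it never reorders.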

First-ready scheduling

- Uses the ordered priority scheme for all units
  - Subject to resource and timing constraints
  - Schedules an operation for the oldest pending reference that does not violate those constraints
- Benefits
  - Accesses targeting other banks can be performed while waiting for a precharge or row activation
  - Parallelism: multiple references in progress
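
The benefit of overlapping banks can be seen in a small cycle-level sketch that compares first-ready scheduling with the in-order baseline. It rests on several assumptions that are not from the slides: the latencies are made up, one operation may be issued per cycle, data-line contention is ignored, and all references are pending at cycle 0. The function name `simulate` is hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed operation latencies in cycles, for illustration only.
T_PRECHARGE, T_ACTIVATE, T_COLUMN = 3, 3, 1


def simulate(refs: List[Tuple[int, int]], first_ready: bool) -> int:
    """Return the cycle at which the last (bank, row) reference completes.

    first_ready=False: only the oldest pending reference may issue an
    operation (in-order baseline).
    first_ready=True: the oldest pending reference whose bank is free
    issues an operation.
    """
    pending = list(refs)
    active_row: Dict[int, int] = {}  # bank -> row currently open
    busy_until: Dict[int, int] = {}  # bank -> first cycle the bank is free again
    cycle = 0
    finish = 0
    while pending:
        candidates = pending if first_ready else pending[:1]
        for i, (bank, row) in enumerate(candidates):
            if busy_until.get(bank, 0) > cycle:
                continue  # bank busy, so this reference is not ready
            if active_row.get(bank) == row:
                busy_until[bank] = cycle + T_COLUMN    # column access
                finish = max(finish, cycle + T_COLUMN)
                del pending[i]
            elif bank in active_row:
                del active_row[bank]                   # bank precharge
                busy_until[bank] = cycle + T_PRECHARGE
            else:
                active_row[bank] = row                 # row activation
                busy_until[bank] = cycle + T_ACTIVATE
            break  # at most one operation is issued per cycle
        cycle += 1
    return finish
```

For four references that each target a different bank, `simulate(refs, False)` takes 16 cycles under these assumptions and `simulate(refs, True)` takes 10, because row activations in different banks overlap.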

Experimental results

- Sustained memory bandwidth increased by about 79%

Experimental results

- Sustained bandwidth increased by about 17%

Experimental results

- Sustained memory bandwidth increased by about 79%

Aggressive reordering

- Drawback of first-ready scheduling
  - It precharges a bank when the oldest pending reference targets a different row than the bank's active row, even if there are still multiple pending references to the active row
- Aggressive reordering further increases sustained memory bandwidth
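
The drawback described above can be made concrete by counting row activations for a single bank. The sketch assumes that every reference is already waiting in the buffer and that serving all references to the open row before closing it is always allowed; `count_activations` is a hypothetical helper used only for this illustration.

```python
from typing import List, Optional


def count_activations(rows: List[int], prefer_row_hits: bool) -> int:
    """Count row activations for one bank.

    rows lists the row targeted by each pending reference, oldest first.
    prefer_row_hits=False follows the oldest reference every time.
    prefer_row_hits=True serves pending references to the open row first.
    """
    pending = list(rows)
    active: Optional[int] = None
    activations = 0
    while pending:
        if prefer_row_hits and active in pending:
            pending.remove(active)  # oldest pending reference to the open row
            continue
        row = pending.pop(0)        # oldest pending reference overall
        if row != active:
            active = row            # precharge if needed, then activate
            activations += 1
    return activations
```

For references that alternate between rows 0 and 1 six times, following the oldest reference needs 6 activations, while serving row hits first needs only 2.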

Possible reordering scheduling algorithm policies

- Large range of possible memory access schedulers
- Four representative algorithms

Experimental results

- Bandwidth improved by 106-144%

Experimental results

- Bandwidth improved by 27-30%

Experimental results

- Bandwidth improved by 85-93%

Row-first policy vs. column-first policy

- Address arbiter
  - Row-first: always select a row operation first
  - Column-first: always select a column operation first
- Little difference across all benchmarks
  - Exception: FFT
- The difference has less to do with the scheduling algorithm than with the characteristics of the benchmark itself
  - FFT is the most sensitive to stream load latency
  - The col/open policy allows a store stream to delay load streams

Open or closed precharge policy?

- Closed precharge policy
  - Banks are precharged as soon as there are no pending references to the active row
- Open precharge policy
  - Banks are precharged only when there are no pending references to the active row and there are pending references to other rows of the same bank
- The difference between the open and closed precharge policies is slight
- Benchmarks with random access patterns prefer the closed precharge policy
  - Little reference locality
  - No benefit to keeping the row open
- FFT prefers the open precharge policy
  - Numerous accesses to each row
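
The two precharge policies differ only in what they do when a bank has an open row and nothing is waiting for it. A minimal sketch of that decision follows; the function name `should_precharge` and the policy strings are assumptions made for the example.

```python
from typing import List, Optional


def should_precharge(policy: str, active_row: Optional[int], pending_rows: List[int]) -> bool:
    """Decide whether a bank should be precharged now.

    active_row is the row open in the bank (None if already precharged).
    pending_rows lists the rows targeted by references pending for this bank.
    """
    if active_row is None:
        return False  # nothing is open, so there is nothing to precharge
    if active_row in pending_rows:
        return False  # references to the active row are still pending
    if policy == "closed":
        return True   # close the row as soon as nobody needs it
    if policy == "open":
        return len(pending_rows) > 0  # close only when another row is wanted
    raise ValueError(f"unknown policy: {policy}")
```

With an idle bank and an empty buffer, the closed policy precharges immediately and the open policy leaves the row open, which pays off only if a later reference targets that same row.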

Effect of bank buffer size

- Row/closed scheduling algorithm

Conclusions

- Memory access scheduling greatly increases bandwidth utilization
  - Buffering memory references
  - Accessing internal banks in parallel
  - Maximizing the number of column accesses per row access
- First-ready scheduling algorithm
  - 79% bandwidth improvement on microbenchmarks, 40% on application traces
- Aggressive reordering algorithm
  - 144% bandwidth improvement on benchmarks, 30% on media processing applications, 93% on the application traces

Conclusions (contd)

- Closed precharge policy is preferred by most benchmarks
  - Banks are precharged as soon as the last column reference to an active row is completed
- Little difference in performance between row-first and column-first policies
- For latency-sensitive applications, scheduling loads ahead of stores is preferred

Paper reference

Scott Rixner, William J. Dally, Ujval J. Kapasi, Peter Mattson, John D. Owens. Memory access scheduling. ACM SIGARCH Computer Architecture News, Proceedings of the 27th Annual International Symposium on Computer Architecture, Volume 28, Issue 2, May 2000.

Thank you!