Memory Hierarchy Adaptivity An Architectural Perspective Alex Veidenbaum AMRM Project sponsored by...
Memory Hierarchy Adaptivity
An Architectural Perspective
Alex Veidenbaum
AMRM Project
sponsored by DARPA/ITO
Opportunities for Adaptivity
• Cache organization
• Cache performance “assist” mechanisms
• Hierarchy organization
• Memory organization (DRAM, etc.)
• Data layout and address mapping
• Virtual Memory
• Compiler assist
Opportunities - Cont’d
• Cache organization: adapt what?
  – Size: NO
  – Associativity: NO
  – Line size: MAYBE
  – Write policy: YES (fetch, allocate, write-back/write-through)
  – Mapping function: MAYBE
Opportunities - Cont’d
• Cache “Assist”: prefetch, write buffer, victim cache, etc. between different levels.
• Adapt what?
  – Which mechanism(s) to use
  – Mechanism “parameters”
Opportunities - Cont’d
• Hierarchy Organization:
  – Where are cache assist mechanisms applied?
    • Between L1 and L2
    • Between L1 and Memory
    • Between L2 and Memory
  – What are the data-paths like?
    • Is prefetch, victim cache, write buffer data written into the cache?
    • How much parallelism is possible in the hierarchy?
Opportunities - Cont’d
• Memory Organization
  – Cached DRAM?
  – Interleave change?
  – PIM
Opportunities - Cont’d
• Data layout and address mapping
  – In theory, something can be done but…
  – MP case is even worse
  – Adaptive address mapping or hashing based on ???
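The slide leaves open what an adaptive mapping would be based on. As a rough illustration only, the sketch below compares two fixed mapping functions that such a scheme might switch between: a conventional modulo set index and an XOR-folded (hashed) index. The line size, set count, and function names are assumptions made for this example, not parameters from the AMRM project.

```python
LINE_BYTES = 32    # assumed cache line size
NUM_SETS = 256     # assumed number of sets (power of two)

def modulo_index(addr):
    """Conventional mapping: low-order line-address bits select the set."""
    return (addr // LINE_BYTES) % NUM_SETS

def xor_index(addr):
    """Hashed mapping: XOR the low index bits with the next-higher bits."""
    line = addr // LINE_BYTES
    low = line % NUM_SETS
    high = (line // NUM_SETS) % NUM_SETS
    return low ^ high

# A stride equal to the cache "way size" is the worst case for modulo mapping.
stride = LINE_BYTES * NUM_SETS
addrs = [i * stride for i in range(16)]

modulo_sets = {modulo_index(a) for a in addrs}   # every address lands in one set
xor_sets = {xor_index(a) for a in addrs}         # addresses spread over 16 sets
print(len(modulo_sets), len(xor_sets))
```

With this access pattern the modulo mapping puts all 16 addresses in a single set, while the XOR mapping spreads them over 16 sets. Other patterns can favor the modulo mapping, which is why the choice is a candidate for adaptation.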
Opportunities - Cont’d
• Compiler assist
  – Can select initial configuration
  – Pass hints on to hardware
  – Generate code to collect run-time info and adjust execution
  – Adapt configuration after being “called” at certain intervals during execution
  – Select/run-time optimize code
Opportunities - Cont’d
• Virtual Memory can adapt
  – Page size?
  – Mapping?
  – Page prefetching/read ahead
  – Write buffer (file cache)
  – The above under multiprogramming?
Applying Adaptivity
• What Drives Adaptivity? Performance impact, overall and/or relative
• “Effectiveness”, e.g. miss rate
• Processor Stall introduced
• Program characteristics
• When to perform adaptive action
  – Run time: use feedback from hardware
  – Compile time: insert code, set up hardware
Where to Implement
• In Software: compiler and/or OS
  + (Static) Knowledge of program behavior
  + Factored into optimization and scheduling
  - Extra code, overhead
  - Lack of dynamic run-time information
  - Rate of adaptivity
  - Requires recompilation, OS changes
Where to Implement - Cont’d
• Hardware
  + dynamic information available
  + fast decision mechanism possible
  + transparent to software (thus safe)
  – delay and clock rate limit algorithm complexity
  – difficult to maintain long-term trends
  – little knowledge about program behavior
Where to Implement - Cont’d
• Hardware/software
  + Software can set coarse hardware parameters
  + Hardware can supply dynamic info to software
  + Perhaps more complex algorithms can be used
  – Software modification required
  – Communication mechanism required
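One way to picture the hardware/software split above is a small control/status interface: hardware updates counters, software reads them periodically and writes back a coarse parameter. The sketch below is purely illustrative. `AssistControl`, `prefetch_depth`, the counter names, and the thresholds are all hypothetical and do not describe any real device or the AMRM design.

```python
from dataclasses import dataclass

@dataclass
class AssistControl:
    """Hypothetical control/status block shared by hardware and software."""
    prefetch_depth: int = 2       # coarse parameter, written by software
    accesses: int = 0             # counters below are updated by "hardware"
    misses: int = 0
    useful_prefetches: int = 0    # hits that were satisfied by a prefetch

    def record_access(self, miss, prefetch_hit=False):
        """Hardware side: count one access and its outcome."""
        self.accesses += 1
        if miss:
            self.misses += 1
        elif prefetch_hit:
            self.useful_prefetches += 1

    def read_and_clear(self):
        """Software side: read the counters and reset them for the next interval."""
        snapshot = (self.accesses, self.misses, self.useful_prefetches)
        self.accesses = self.misses = self.useful_prefetches = 0
        return snapshot

def software_policy(ctrl, low=0.1, high=0.5, max_depth=8):
    """Assumed policy: deepen prefetching when it covers many would-be misses."""
    _, misses, useful = ctrl.read_and_clear()
    would_be_misses = misses + useful
    if would_be_misses == 0:
        return ctrl.prefetch_depth            # nothing observed, keep setting
    fraction = useful / would_be_misses
    if fraction >= high:
        ctrl.prefetch_depth = min(ctrl.prefetch_depth * 2, max_depth)
    elif fraction < low:
        ctrl.prefetch_depth = max(ctrl.prefetch_depth // 2, 1)
    return ctrl.prefetch_depth
```

The counter reads and the parameter write stand in for the "communication mechanism" listed as a cost on the slide. In a real system these would be memory-mapped registers or special instructions.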
Current Investigation
• L1 cache assist
  – See wide variability in assist mechanism effectiveness
    • Between individual programs
    • Within a program as a function of time
  – Propose hardware mechanisms to select between assist types and allocate buffer space
  – Give the compiler an opportunity to set parameters
Mechanisms Used
• Prefetching
  – Stream Buffers
  – Stride-directed, based on address alone
  – Miss Stride: prefetch the same address using the number of intervening misses
• Victim Cache
• Write Buffer, all after L1
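As a minimal sketch of stride-directed prefetching "based on address alone", the detector below watches a stream of miss addresses and proposes a prefetch once the same non-zero stride has been seen twice in a row. The class name and the two-in-a-row confirmation rule are assumptions for illustration. The slides do not specify the detection logic.

```python
class StrideDetector:
    """Illustrative stride detector over a single stream of miss addresses."""

    def __init__(self):
        self.last_addr = None
        self.last_stride = None

    def observe(self, addr):
        """Feed one miss address; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Predict only when the current stride repeats the previous one.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction

detector = StrideDetector()
predictions = [detector.observe(a) for a in [100, 164, 228, 292, 1000]]
print(predictions)  # [None, None, 292, 356, None]
```

The first two misses only establish the stride. The jump to address 1000 breaks the pattern, so no prefetch is issued for it.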
Mechanisms Used - Cont’d
• A mechanism can be used by itself or
• All are used at once
• Buffer space size and organization fixed
• No adaptivity involved
Observed Behavior
• Programs respond differently to each mechanism, i.e. none is a consistent winner
• Within a single program the same holds over time: the best mechanism changes as execution proceeds
Observed Behavior - Cont’d
• Both of the above facts indicate a likely improvement from adaptivity
  – Select a better one among mechanisms
• Even more can be expected from adaptively re-allocating from the combined buffer pool
  – To reduce stall time
  – To reduce the number of misses
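The argument for selecting among mechanisms can be sketched with a toy calculation. The per-interval miss counts below are made-up numbers chosen to show a phase change; they are not measurements from the project. The selection rule (use whichever mechanism did best in the previous interval) is likewise an assumption, and it presumes the hardware can monitor every mechanism even when it is not the one in use.

```python
# Made-up miss counts per interval for two assist mechanisms (illustration only).
misses = {
    "stream_buffer": [10, 12, 11, 40, 42, 41],
    "victim_cache":  [30, 31, 29, 15, 14, 16],
}

def static_total(misses, name):
    """Total misses if one mechanism is used for the whole run."""
    return sum(misses[name])

def adaptive_total(misses, default):
    """Total misses when each interval uses the previous interval's winner."""
    names = list(misses)
    choice = default
    total = 0
    for t in range(len(misses[default])):
        total += misses[choice][t]
        # Pick the mechanism for the next interval from what was just observed.
        choice = min(names, key=lambda m: misses[m][t])
    return total

best_static = min(static_total(misses, m) for m in misses)
adaptive = adaptive_total(misses, "stream_buffer")
print(best_static, adaptive)  # 135 103
```

The adaptive choice pays for one badly predicted interval at the phase change but still beats either fixed choice on this made-up trace. A trace that flips every interval would defeat this simple rule.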
Proposed Adaptive Mechanism
• Hardware:
  – a common pool of 2-4 word buffers
  – a set of possible policies, a subset of:
    • Stride-directed prefetch
    • PC-based prefetch
    • History-based prefetch
    • Victim cache
    • Write buffer
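To make one of the listed policies concrete, here is a minimal sketch of a victim cache behind a direct-mapped L1: lines evicted from L1 are kept in a small fully associative buffer and swapped back on a later reference. The class name, the FIFO replacement in the victim buffer, and the sizes used in the example are assumptions for illustration.

```python
from collections import OrderedDict

class DirectMappedWithVictim:
    """Direct-mapped cache of line addresses plus a small victim buffer."""

    def __init__(self, num_sets, victim_entries):
        self.num_sets = num_sets
        self.lines = [None] * num_sets       # one line address per set
        self.victims = OrderedDict()         # evicted lines, oldest first
        self.victim_entries = victim_entries

    def access(self, line_addr):
        """Return 'hit', 'victim_hit', or 'miss' for one line address."""
        idx = line_addr % self.num_sets
        if self.lines[idx] == line_addr:
            return "hit"
        result = "miss"
        if line_addr in self.victims:
            del self.victims[line_addr]      # line moves back into L1
            result = "victim_hit"
        evicted = self.lines[idx]
        self.lines[idx] = line_addr
        if evicted is not None and self.victim_entries > 0:
            self.victims[evicted] = True
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)   # drop the oldest victim
        return result

# Lines 0 and 4 conflict in a 4-set direct-mapped cache.
with_victim = DirectMappedWithVictim(num_sets=4, victim_entries=2)
without_victim = DirectMappedWithVictim(num_sets=4, victim_entries=0)
trace = [0, 4, 0, 4]
print([with_victim.access(a) for a in trace])
print([without_victim.access(a) for a in trace])
```

On this ping-pong trace the victim buffer turns the repeated conflict misses into victim hits, while the cache without it misses every time.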
Adaptive Hardware - Cont’d
• Performance monitors for each type/buffer
  – misses, stall time on hit, thresholds
• Dynamic buffer allocator among mechanisms
• Allocation and monitoring policy:
  – Predict future behavior from observed past
  – Observe over a time interval dT, set for next
  – Save performance trends in next-level tags (<8 bits)
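The "observe over dT, set for next" policy can be sketched as a function that takes the counters gathered in the interval just ended and returns the buffer allocation for the next one. The proportional-share rule, the one-buffer minimum per mechanism, and all names below are assumptions for illustration. The slides do not give the actual allocation rule.

```python
def reallocate(pool_size, useful_hits, min_each=1):
    """Split a shared buffer pool among mechanisms for the next interval.

    useful_hits maps mechanism name -> hits it provided during the last dT.
    Each mechanism keeps min_each buffers so it can still be monitored.
    """
    names = list(useful_hits)
    spare = pool_size - min_each * len(names)
    if spare < 0:
        raise ValueError("pool too small for the minimum allocation")
    total = sum(useful_hits.values())
    if total == 0:
        weights = {m: 1 for m in names}      # no evidence: split evenly
        total = len(names)
    else:
        weights = useful_hits
    shares = {m: spare * weights[m] / total for m in names}
    alloc = {m: min_each + int(shares[m]) for m in names}
    # Hand out buffers lost to rounding, largest fractional share first.
    leftover = pool_size - sum(alloc.values())
    by_fraction = sorted(names, key=lambda m: shares[m] - int(shares[m]), reverse=True)
    for m in by_fraction[:leftover]:
        alloc[m] += 1
    return alloc

next_alloc = reallocate(16, {"prefetch": 30, "victim": 10, "write": 0})
print(next_alloc)  # {'prefetch': 11, 'victim': 4, 'write': 1}
```

Keeping a minimum allocation per mechanism is one way to let the monitors notice when a currently unhelpful mechanism becomes useful again in a later program phase.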
Further opportunities to adapt
• L2 cache organization
  – variable-size line
• L2 non-sequential prefetch
• In-memory assists (DRAM)
MP Opportunities
• Even longer latency
• Coherence, hardware or software
• Synchronization
• Prefetch under and beyond the above
  – Avoid coherence if possible
  – Prefetch past synchronization
• Assist Adaptive Scheduling