Phase Change Memory as Main Memory
CS 839 - Persistence
Learning outcomes
• Understand the basic characteristics of phase-change memory
• Understand evaluation techniques for new memory technologies
• Understand the optimization process for new memory technologies
Notes from reviews
• How do line-level writes solve the endurance problem?
• Why characterize PCM by only two parameters?
• Why use 4KB pages – good for disk, but why for memory?
• How do results hold up?
Background story
• Phase change memory becomes known to computer architects
• Generally seen as a slower, bigger DRAM, with especially slow writes
• Key question: can it compete with DRAM?
• Why is this the question?
• Why not “is it useful for persistence?”
DRAM background
• Stores data in a capacitor
• Address split into row address and column address
• Row address connects row buffer to DRAM cells
• Column address selects 64 bytes within row buffer
• Row buffer & cells are electrically connected
• Writing to row buffer modifies cells
• Reads erase capacitor contents (destructive reads), so must re-write
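A minimal sketch of the row/column split described above. The field widths here are illustrative assumptions, not the geometry of any particular DRAM part; the point is that a byte address decomposes into a row (which fills the row buffer) and a column (which picks 64 bytes within it).

```python
# Sketch: splitting a byte address into DRAM row and column indices.
# ROW_BITS and COL_BITS are assumed values for illustration.
ROW_BITS = 14        # hypothetical: 16384 rows per bank
COL_BITS = 7         # hypothetical: 128 column positions of 64B each
LINE_BYTES = 64      # a column address selects 64 bytes of the row buffer

def split_address(addr: int) -> tuple[int, int]:
    """Split a byte address into (row, column) indices."""
    line = addr // LINE_BYTES                 # which 64B chunk
    col = line % (1 << COL_BITS)              # position within the row buffer
    row = (line >> COL_BITS) % (1 << ROW_BITS)
    return row, col

# All bytes of one 64B chunk share the same (row, col):
assert split_address(0) == split_address(63)
# The next 64B chunk advances the column but stays in the same row:
assert split_address(64) == (0, 1)
```

Accesses that stay within one row hit the already-open row buffer; crossing a row boundary forces the buffer to be written back and a new row connected.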
Phase change materials
• Reset state: amorphous – high electrical resistivity, low optical reflectance
• Set state: crystalline – low electrical resistivity, high optical reflectance
(C) Juejun Hu, MIT
Electronic view
• Read by applying a current, measuring resistance
• Reprogram by applying a shaped current to heat up and cool the device
Experimental results with PCM cells
Issues with memories
• Retention: how long does the device retain data?
• DRAM – 64ms; PCM – years
• Endurance: how many times can you write to it?
• DRAM – effectively unlimited; PCM – 100,000 to millions of writes
• Why?
• For persistent data, better retention comes from using more energy during the write
• A lower-energy write is less precise and more likely to fail
• But high-energy writes cause wear-out of the device: thermal expansion & contraction degrades the electrode-storage contacts
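A back-of-envelope sketch of why endurance matters as main memory. All numbers here are assumptions for illustration (the slides only give the endurance range): a 16 GiB PCM, a sustained 1 GiB/s write stream, and perfect wear leveling so every cell is rewritten once per full-capacity pass.

```python
# Assumed parameters – not measurements from the paper.
ENDURANCE = 10**6        # write cycles per cell (slides: 100,000 – millions)
CAPACITY = 16 * 2**30    # bytes of PCM main memory (assumed)
WRITE_RATE = 2**30       # 1 GiB/s sustained writes (assumed workload)

seconds_per_pass = CAPACITY / WRITE_RATE           # 16 s to rewrite everything
lifetime_years = ENDURANCE * seconds_per_pass / (3600 * 24 * 365)
print(f"{lifetime_years:.2f} years")               # about half a year
```

Even with ideal wear leveling, a million-cycle device under this write rate wears out in well under a year, which is why reducing written bytes (partial writes) and spreading wear are both necessary.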
Other memory/storage technologies
What system design should we consider?
• Replace all DRAM with PCM
• Only regular processor caches
• Hybrid system
• DRAM cache in front of PCM
• With or without swapping
• Flash and/or disk for pages
How do we evaluate the system?
• Extend existing systems with PCM
• Add PCM but reduce DRAM
• Look at a same-cost system
• Look at a same-area system (# of memory chips; PCM is denser)
• Look at a system with the same performance, see how much cheaper or smaller it is
• What does this paper do?
Motivation: Capacity vs. Performance
[Figure: execution time T vs. memory size M – disk VM paging below the working set W; low-locality and unused-memory regions beyond it; PCM extends capacity]
➢ Reduced DRAM: same performance, lower system price
➢ Added PCM: faster execution, no additional DRAM, small price increase
Pure PCM system
Uses 2048-byte PCM pages
• High delay
• Higher energy usage
• Why?
Where PCM is bad
• Time to reprogram is 12x higher
• Energy to write full array is 43x higher
How do we optimize?
• High write energy comes from writing the full array
• Write just the data that changed
• How do we know what changed?
• Delay comes from longer access times
• Cache hot data in DRAM for reads
• Add a write queue
• What policy, what granularity?
• Data fetched from disk is likely to be accessed soon
• Can fetch straight to DRAM instead of PCM
• Streaming data is not re-referenced
• Can evict it from DRAM to disk, not PCM
Partial writes
• Easy solution: write only the cache lines that changed (64B)
• Record in the PCM controller what portion has changed, only rewrite that
• Harder solution: track what portion of a cache line has changed (4B)
• Requires tracking the dirty portion through the cache hierarchy
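A sketch of the easy solution, with hypothetical controller state: one dirty bit per 64B line of each 2048B page, kept as a 32-bit mask, so writeback programs only the dirty lines.

```python
# Sketch of per-line dirty tracking in a PCM controller (state layout is
# an assumption; the slides only describe the idea).
LINE = 64
PAGE = 2048
LINES_PER_PAGE = PAGE // LINE   # 32 lines -> one 32-bit dirty mask per page

class PartialWriteTracker:
    def __init__(self):
        self.dirty = {}   # page number -> bitmask of dirty 64B lines

    def record_write(self, addr: int) -> None:
        page, line = addr // PAGE, (addr % PAGE) // LINE
        self.dirty[page] = self.dirty.get(page, 0) | (1 << line)

    def bytes_to_write(self, page: int) -> int:
        """Bytes actually programmed on writeback, vs. PAGE for a full write."""
        return bin(self.dirty.get(page, 0)).count("1") * LINE

t = PartialWriteTracker()
t.record_write(0)       # line 0 of page 0
t.record_write(100)     # byte 100 falls in line 1 of page 0
t.record_write(100)     # a repeated write to the same line changes nothing
assert t.bytes_to_write(0) == 128   # 2 dirty lines * 64B, not 2048B
```

The mask costs 4 bytes of metadata per 2KB page; the harder 4B-granularity variant would instead need dirty bits carried through the whole cache hierarchy.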
Don’t access: DRAM caching
• Use some DRAM to hold hot pages
• How much?
• Run programs and measure
• Proposal is about 10% of PCM capacity
• What granularity?
• This paper uses 4KB
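A minimal sketch of such a DRAM page cache, assuming an LRU replacement policy (the policy is my assumption; the slides fix only the 4KB granularity and the ~10% sizing):

```python
# Sketch: DRAM as a page cache in front of PCM, LRU replacement assumed.
from collections import OrderedDict

class DramPageCache:
    def __init__(self, capacity_pages: int):
        self.capacity = capacity_pages
        self.pages = OrderedDict()   # page number -> resident in DRAM

    def access(self, page: int) -> bool:
        """Return True on a DRAM hit; on a miss, fetch from PCM and evict LRU."""
        if page in self.pages:
            self.pages.move_to_end(page)     # refresh recency
            return True
        self.pages[page] = True              # fetch 4KB page from PCM
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict least-recently-used page
        return False

cache = DramPageCache(capacity_pages=2)
assert cache.access(1) is False      # cold miss, page pulled into DRAM
assert cache.access(1) is True       # hot page now hits in DRAM
cache.access(2); cache.access(3)     # capacity 2 -> page 1 gets evicted
assert cache.access(1) is False
```

With the proposed sizing, a 32 GB PCM would pair with roughly 3 GB of DRAM, i.e. `capacity_pages` ≈ 800K 4KB pages.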
Don’t wait: Lazy writes
• Send writes to a write-pending queue, not PCM directly
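A sketch of the write-pending queue idea: writes complete into the queue immediately, PCM is programmed later in the background, and reads must check the queue first so they never see stale PCM contents. The structure here is illustrative, not the paper's exact design.

```python
# Sketch of lazy writes via a write-pending queue (illustrative design).
class WritePendingQueue:
    """Writes land in a queue; slow PCM programming happens off the critical path."""
    def __init__(self):
        self.pending = {}   # line address -> data not yet programmed into PCM
        self.pcm = {}       # the (slow) PCM array, modeled as a dict

    def write(self, addr, data):
        self.pending[addr] = data        # fast: no PCM programming here

    def read(self, addr):
        # Reads must check the queue first, or they would see stale PCM data.
        return self.pending.get(addr, self.pcm.get(addr))

    def drain(self):
        # Done in the background, when the PCM banks are otherwise idle.
        self.pcm.update(self.pending)
        self.pending.clear()

q = WritePendingQueue()
q.write(0x40, "new")
assert q.read(0x40) == "new"   # served from the queue, before any drain
q.drain()
assert q.pcm[0x40] == "new"
```

A real controller would bound the queue depth and stall writes when it fills, which is where the "what policy, what granularity?" question from the earlier slide bites.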
Write less: line-level writes
• Problem:
• High energy of writes
• Limited endurance
• Solution:
• Only write dirty data
• Out of 2048 bytes, mostly only 1–3 cache lines are dirty
Wear leveling
• Problem:
• Uneven access of pages
• Uneven access of lines within a page
• Solution:
• VM swapping for uneven use of pages (not evaluated)
• Store a shift value for each page: how far lines are rotated within that page
• E.g. each time we reallocate a page, randomly re-shift its lines
Relevance to persistent memory
• Managing slow reads
• Managing slow writes
• Granularity of writes
• Wear leveling
Sensitivity
• How do you do research before the technology is known?
• Parameterize: assume a certain density and performance range
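The parameterization idea can be sketched as a sweep over assumed device characteristics (the specific multipliers below are placeholders, not the paper's values): simulate every combination, and report how conclusions shift across the range.

```python
# Sketch: sweep assumed PCM parameters relative to DRAM (values illustrative).
import itertools

read_latency_x = [2, 4, 8]        # PCM read latency vs. DRAM (assumed range)
write_latency_x = [10, 20, 40]    # PCM write latency vs. DRAM (assumed range)
density_x = [2, 4]                # PCM density advantage (assumed range)

design_points = list(itertools.product(read_latency_x,
                                       write_latency_x,
                                       density_x))
print(len(design_points))   # 18 configurations to simulate
```

Results that hold across the whole grid survive uncertainty in the eventual technology; results that flip within it are flagged as sensitive.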
Optane Memory Mode
• Memory controller manages DRAM as a direct-mapped cache in front of Optane
• Why direct mapped?
• For NUMA: remote DRAM caches remote Optane
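One answer to "why direct mapped?": the lookup is a single shift and modulo, with one tag probe, so the memory controller can do it at line rate. A sketch with hypothetical geometry (block size and cache size are assumptions, not Optane's actual parameters):

```python
# Sketch: direct-mapped slot computation for a DRAM cache in front of Optane.
BLOCK = 4096                             # assumed caching granularity
CACHE_BLOCKS = (16 * 2**30) // BLOCK     # assumed 16 GiB DRAM cache -> 4M slots

def cache_slot(phys_addr: int) -> tuple[int, int]:
    block = phys_addr // BLOCK
    index = block % CACHE_BLOCKS          # which DRAM slot to probe
    tag = block // CACHE_BLOCKS           # disambiguates which Optane block is cached
    return index, tag

# Two Optane addresses exactly one cache-size apart map to the same slot
# (a conflict miss) but carry different tags:
a, b = 0x1000, 0x1000 + 16 * 2**30
assert cache_slot(a)[0] == cache_slot(b)[0]
assert cache_slot(a)[1] != cache_slot(b)[1]
```

The trade-off is conflict misses, but with a DRAM cache this large they are rare enough that avoiding associative lookup hardware wins.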
Memory mode performance
• Sequential
Memory mode performance
• Random access
• Bandwidth
Application performance - graph
Other Persistent Memory Technologies
• Spin-Transfer Torque MRAM
• Uses a Magnetic Tunnel Junction (MTJ) to store data
• Each cell has 2 ferromagnetic layers
• Reference layer stays magnetized in the same direction
• Free layer can be programmed
• When polarities aligned → low resistance
• When polarities opposite → high resistance
• Program with a high current in one direction or the other
Properties of STT-MRAM
• Density close to DRAM
• Higher density → smaller MTJ → lower retention (1 day vs 10 years)
• High write power
• Reads non-destructive: data is copied into the row buffer, can be changed in the row buffer
• Close to DRAM speed, but high-energy writes
Optimizations for STT-MRAM
• Like PCM: selective/partial writes – only update the data modified
• Saves energy and endurance
• Row buffer bypass:
• Reads have higher locality (going back to the same row buffer) than writes. Why?
• Let writes go directly to the media & not be buffered in the row buffer
Summary
• NVM can be attached like memory, accessed via same protocols
• Characteristics require new optimizations:
• Endurance: partial writes
• Endurance: wear leveling
• Energy of writes: partial writes
• Performance: DRAM caching, write bypassing