![Page 1: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/1.jpg)
PRACTICAL NONVOLATILE MULTILEVEL-CELL
PHASE CHANGE MEMORY
Jichuan Chang,
Robert S. Schreiber,
Norman P. Jouppi
Hewlett-Packard Labs
Doe Hyun Yoon
IBM T. J. Watson Research Center
![Page 2: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/2.jpg)
MEMORY CAPACITY CHALLENGE IN HPC
• DRAM as main memory
– Scaling is slowing down
• Hard to meet ever-increasing capacity demand
• Byte-addressable nonvolatile memory
– Phase change memory (PCM), memristor, …
– Scales better than DRAM
– Multilevel-cell (MLC) capability
– Nonvolatility
• Checkpoint, in-situ post processing
• High-performance file system
• NV MLC PCM for continued capacity scaling
![Page 3: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/3.jpg)
MAJOR CHALLENGE: RESISTANCE DRIFT
• Conventional 4-Level-Cell (4LC) Designs
– Naïve 4LC is useless
– Optimized 4LC is only barely usable
– Still needs refresh, so it is volatile memory
• Observation: Most errors in 4LC occur in one cell state
• Proposal: 3-Level-Cell (3LC) PCM
– Simple, genuinely nonvolatile (>10 years retention)
– 3-ON-2 and mark-and-spare
• Low-cost wearout tolerance for 3LC
– 1.41 bits/cell (vs. 1.52 in 4LC)
• Only 7% lower capacity than (volatile) 4LC
![Page 4: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/4.jpg)
PCM AND RESISTANCE DRIFT
![Page 5: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/5.jpg)
PHASE CHANGE MEMORY
• Best of DRAM and Flash
– Higher capacity, better scaling (vs. DRAM)
– Faster, byte-addressable NVM (vs. Flash)
• MLC (Multilevel-Cell) capability
– Store more than 1 bit per cell
• Ex) 2 bits per cell
• Caveats:
– Slow, low-bandwidth write (common to both SLC and MLC)
– Finite write endurance (common to both SLC and MLC)
– Resistance drift
![Page 6: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/6.jpg)
RESISTANCE DRIFT
• PCM cell resistance increases over time
– $R(t) = R_0 \left( t / t_0 \right)^{\alpha}$
– R(t): cell resistance at time t (t > 0)
• A cell is programmed at t = 0
• Sensed as R0 at time t0 (> 0)
• α: drift rate (0 < α < 1)
• Drift errors
– Negligible in SLC PCM
– Major reliability problem in MLC PCM
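The power-law drift model, R(t) = R0 · (t / t0)^α, can be evaluated directly. The sketch below is illustrative only: the function names and the numbers in the usage note are hypothetical, not taken from the slides.

```python
def drifted_resistance(r0, alpha, t, t0=1.0):
    """Resistance at time t for a cell sensed as r0 at time t0 (power-law drift)."""
    if t <= 0 or t0 <= 0:
        raise ValueError("t and t0 must be positive")
    return r0 * (t / t0) ** alpha


def time_to_reach(r0, alpha, r_threshold, t0=1.0):
    """Time at which a drifting cell reaches r_threshold (inverse of the model)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive for the cell to drift upward")
    if r_threshold <= r0:
        return t0  # already at or above the threshold when first sensed
    return t0 * (r_threshold / r0) ** (1.0 / alpha)
```

For example, a hypothetical cell with r0 = 1e4 Ω and α = 0.1 gains one decade of resistance for every ten decades of time, so `time_to_reach(1e4, 0.1, 1e5)` is 1e10 in units of t0.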
![Page 7: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/7.jpg)
DRIFT ERRORS IN 4LC PCM
• 4 cell states: S1, S2, S3, S4
– PDF is truncated Gaussian
• ±2.75σ around mean values
• Mean resistance values: μ1, μ2, μ3, μ4
– Threshold between states: τ1, τ2, τ3
• Drift rate (α) increases with cell resistance
[Figure: resistance distributions of S1 to S4 over log10 R (axis from 2.5 to 6.5), with means μ1 to μ4 and thresholds τ1 to τ3 marked]
![Page 8: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/8.jpg)
DRIFT ERROR RATES
• Monte-Carlo simulation
• Errors only in S2 and S3
[Figure: fraction of cells with an error (log scale, 1E-10 to 1E+00) vs. time (2 s to 34,865 years), with one curve each for S2 and S3]
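A Monte-Carlo estimate of this kind can be sketched in a few lines. This is a minimal illustration and not the authors' simulator. It assumes that the initial log10 resistance is a Gaussian truncated at ±2.75σ, that each cell drifts as log10 R(t) = log10 R0 + α · log10(t / t0), and that α is itself Gaussian across cells. The function name and every numeric parameter in the usage note are hypothetical placeholders.

```python
import math
import random


def drift_error_fractions(mu, sigma, upper_threshold, alpha_mu, alpha_sigma,
                          times, t0=1.0, n_cells=20000, seed=0):
    """For each time in `times`, estimate the fraction of cells of one state
    whose log10 resistance has drifted above `upper_threshold`."""
    rng = random.Random(seed)
    cells = []
    while len(cells) < n_cells:
        log_r0 = rng.gauss(mu, sigma)
        if abs(log_r0 - mu) > 2.75 * sigma:
            continue  # truncated Gaussian: reject and resample
        alpha = max(0.0, rng.gauss(alpha_mu, alpha_sigma))  # drift is never negative
        cells.append((log_r0, alpha))

    fractions = []
    for t in times:
        shift = math.log10(t / t0)  # log10 R(t) = log10 R0 + alpha * shift
        errors = sum(1 for log_r0, alpha in cells
                     if log_r0 + alpha * shift > upper_threshold)
        fractions.append(errors / n_cells)
    return fractions


# Time points matching the plot's axis ticks: 2 s, 32 s, ~17 min, ~9 hours, ...
axis_times_s = [2.0 ** k for k in (1, 5, 10, 15, 20, 25, 30, 35, 40)]
```

A call such as `drift_error_fractions(5.0, 0.17, 5.5, 0.06, 0.024, axis_times_s)` uses made-up state parameters. Reproducing the curves on the slide would require the cell parameters from the paper, and error rates below about 1/n_cells need far more samples than the default.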
![Page 9: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/9.jpg)
REFRESH
• Refresh before cells lose their data
– Consumes already limited PCM write bandwidth
– Too-frequent refresh makes PCM unavailable to users
• What PCM refresh interval is acceptable?
– At least 50% of write bandwidth should be available to users
– Refresh interval > 17 minutes
• Caveat: PCM with refresh is no longer nonvolatile
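The link between the bandwidth budget and the refresh interval is simple arithmetic: refreshing rewrites the whole device once per interval. The sketch below assumes a 16 GB device and a hypothetical 32 MiB/s device write bandwidth. The bandwidth figure is an illustrative assumption, not a number from the slides.

```python
def min_refresh_interval_s(device_bytes, write_bw_bytes_per_s, refresh_share):
    """Shortest interval at which the whole device can be rewritten while
    refresh uses at most `refresh_share` of the write bandwidth."""
    if not 0 < refresh_share <= 1:
        raise ValueError("refresh_share must be in (0, 1]")
    return device_bytes / (write_bw_bytes_per_s * refresh_share)


# Hypothetical example: 16 GiB device, 32 MiB/s write bandwidth, 50% for refresh.
interval_s = min_refresh_interval_s(16 * 2**30, 32 * 2**20, 0.5)
print(interval_s, interval_s / 60)  # 1024.0 seconds, about 17 minutes
```

Refreshing more often than this bound would leave users less than half of the write bandwidth under these assumptions.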
![Page 10: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/10.jpg)
CELL ERROR RATE
• What cell error rate (CER) is tolerable?
– Goal: 10-year device MTBF
• Fewer than 1 erroneous 64B block in a 16GB device for 10 years
– CER > 1e-2
• Impossible to achieve the goal even with unrealistically strong ECC
– CER ~1e-3 @ 17-min refresh
• Barely meets the goal with BCH-10
• More analysis in the paper
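One way to relate CER to the reliability goal is a binomial tail: a block fails when more cells are in error than the ECC can correct. The sketch below is a simplified model, not the paper's analysis. It assumes independent cell errors and that each erroneous cell costs one correctable symbol. The function names are hypothetical.

```python
import math


def block_failure_probability(n_cells, cer, t_correctable):
    """P(more than t_correctable of n_cells are in error), independent errors."""
    return sum(
        math.comb(n_cells, k) * cer**k * (1.0 - cer) ** (n_cells - k)
        for k in range(t_correctable + 1, n_cells + 1)
    )


def target_block_failure_probability(device_bytes, block_bytes, periods):
    """Per-block, per-period failure budget for fewer than one bad block
    across the whole device over all refresh periods."""
    blocks = device_bytes // block_bytes
    return 1.0 / (blocks * periods)


# 16 GiB device, 64B blocks, 10 years of 1024-second refresh periods.
periods = 10 * 365 * 24 * 3600 / 1024
budget = target_block_failure_probability(16 * 2**30, 64, periods)
```

Comparing `block_failure_probability(n_cells, cer, t)` against `budget` for a chosen block size, CER, and correction strength shows whether a configuration is in range under this model. The tail sum is written for blocks of a few hundred cells, since much larger blocks would overflow the float conversion of `math.comb`.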
![Page 11: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/11.jpg)
BASELINE 4LC PCM
![Page 12: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/12.jpg)
NAÏVE DESIGN: 4LCn
• Equal probability for all 4 states
• 17-min refresh caps CER at ~1e-2
[Figure: fraction of cells with an error (log scale, 1E-10 to 1E+00) vs. refresh interval (2 s to 34,865 years) for 4LCn. Annotations: refresh interval > 17 min; CER ~1e-2, unacceptable]
![Page 13: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/13.jpg)
OPTIMAL STATE MAPPING
• Drift only increases cell resistance
• Optimize the means μ2, μ3 and the thresholds τ1, τ2, τ3 to minimize CER
– minimize $\mathrm{CER}(\mu_2, \mu_3, \tau_1, \tau_2, \tau_3)$
– subject to $\mu_i + 2.75\sigma + \delta < \tau_i < \mu_{i+1} - 2.75\sigma - \delta$, where δ is the minimum spacing
– for i = 1, 2, 3
[Figure: pdf of cell resistance over log10 R (2.5 to 6.5) for S1 to S4, comparing simple mapping and optimal mapping, with means μ1 to μ4, thresholds τ1 to τ3, and the minimum spacing marked]
![Page 14: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/14.jpg)
OPTIMAL STATE MAPPING: 4LCo
• CER ~1e-3 @ 17-min refresh
• With BCH-10, it meets the goal
[Figure: fraction of cells with an error (log scale, 1E-10 to 1E+00) vs. refresh interval (2 s to 34,865 years) for 4LCn and 4LCo. Annotations: refresh interval > 17 min; 4LCo reaches CER ~1e-3, barely usable with 10-bit correcting ECC]
![Page 15: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/15.jpg)
PROPOSAL: 3LC PCM
![Page 16: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/16.jpg)
PROPOSAL: 3LC PCM
• Observation:
– Most errors occur in one state (S3)
• DO NOT USE IT
– Wide margin for S2
• Simple and optimal mapping (3LCn & 3LCo)
[Figure: pdf of cell resistance over log10 R (2.5 to 6.5). Dropping S3 leaves S1, S2, and S4, with a wide margin between S2 and S4; shown for simple mapping and optimal mapping]
![Page 17: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/17.jpg)
3LC DESIGNS (3LCn AND 3LCo)
• Reliable for >10 years without ECC or refresh
• Genuinely nonvolatile
[Figure: fraction of cells with an error (log scale, 1E-10 to 1E+00) vs. refresh interval (2 s to 34,865 years) for 4LCn, 4LCo, 3LCn, and 3LCo. Annotation: CER ~1e-9 at 16 years with no ECC and no refresh]
![Page 18: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/18.jpg)
3LC PCM DESIGN ISSUES
• How to store information?
– Binary information in ternary cells
• What about wearout failures?
• How to compensate for the reduced cell density?
– 3LC’s ideal capacity is 1.58 bits/cell (log₂ 3)
– vs. 2 bits/cell in 4LC
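The gap between the ideal log₂ 3 ≈ 1.58 bits/cell and what a binary interface can use depends on how many ternary cells are encoded together. The sketch below counts whole bits per group. The function name is hypothetical, and the grouping view is an illustration rather than a statement of how the authors chose their encoding.

```python
import math


def bits_per_cell(levels, cells_per_group):
    """Whole bits storable per cell when `cells_per_group` cells are encoded together."""
    states = levels ** cells_per_group
    whole_bits = states.bit_length() - 1  # floor(log2(states)) for states >= 1
    return whole_bits / cells_per_group


print(math.log2(3))  # ideal ternary capacity, about 1.585 bits/cell
for n in range(1, 6):
    print(n, bits_per_cell(3, n))
```

A single ternary cell holds only 1 whole bit, while a group of two holds 3 bits (1.5 bits/cell), which already recovers most of the ideal capacity with a very small group.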
![Page 19: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/19.jpg)
HOW TO STORE BINARY INFO IN TERNARY CELLS?
• 3-ON-2
– Store three bits in two ternary cells
– 64B (512-bit) data block in 342 cells
• 9 states in 2 ternary cells
• 8 states for 3-bit data
• INVALID state
– (S4, S4)
– Use this for tolerating wearout failures

| First cell | Second cell | 3-bit data |
|---|---|---|
| S1 | S1 | 000 |
| S1 | S2 | 001 |
| S1 | S4 | 010 |
| S2 | S1 | 011 |
| S2 | S2 | 100 |
| S2 | S4 | 101 |
| S4 | S1 | 110 |
| S4 | S2 | 111 |
| S4 | S4 | INVALID |
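The 3-ON-2 table is a base-3 encoding of the 3-bit value over the state order S1, S2, S4, with the ninth combination reserved. The sketch below follows that table. The function names and the zero-padding of the final group are assumptions made for illustration.

```python
STATES = ("S1", "S2", "S4")
INVALID = ("S4", "S4")


def encode_3on2(value):
    """Map a 3-bit value (0..7) to a pair of ternary cell states."""
    if not 0 <= value <= 7:
        raise ValueError("value must fit in 3 bits")
    first, second = divmod(value, 3)
    return (STATES[first], STATES[second])


def decode_3on2(pair):
    """Map a pair of ternary cell states back to its 3-bit value."""
    if tuple(pair) == INVALID:
        raise ValueError("INVALID pair carries no data")
    return STATES.index(pair[0]) * 3 + STATES.index(pair[1])


def encode_block(data):
    """Encode bytes into a flat list of cell states, 3 bits per cell pair."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 3)  # assumption: pad the last group with zeros
    cells = []
    for i in range(0, len(bits), 3):
        cells.extend(encode_3on2(int(bits[i:i + 3], 2)))
    return cells


print(len(encode_block(bytes(64))))  # 342 cells for a 64B block
```

A 512-bit block needs 171 pairs (512 / 3 rounded up), which gives the 342 cells quoted above.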
![Page 20: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/20.jpg)
TOLERATING WEAROUT FAILURES IN 3LC
• PCM has only finite write endurance
– ~10^8 writes per cell
• Mark-and-spare
– A low-cost wearout failure tolerance mechanism for 3LC
– Use 3LC’s INVALID state for marking a cell pair with a failure
– No need to store failed-cell location
– 2 spare cells per failure
• cf. ECP [Schechter+ ISCA’10]
– Needs a pointer and a spare cell for a failure
– 5 cells per failure with 512-bit data block and 4LC
![Page 21: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/21.jpg)
MARK-AND-SPARE EXAMPLE
• Use INVALID (S4, S4) to mark a cell pair w/ failure
– A stuck-at cell can be revived by applying reverse current [Goux+ IEEE TED’09]
• Need a spare pair for tolerating a failure
[Figure: a block of cell pairs D0 to D7 followed by spare pairs S0 and S1; each pair of ternary cells holds 3 bits, and a pair containing a wearout failure is marked as failed]
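Mark-and-spare can be sketched as "skip every pair marked INVALID". The version below is one plausible reading of the scheme, not the authors' exact design. It assumes that data shifts past a marked pair into the following pairs so that the spares absorb the overflow, and that a pair with a failed cell can always be forced to INVALID. All names are hypothetical, and 3-bit values stand in for the cell-state pairs.

```python
INVALID = "INVALID"  # stands for the (S4, S4) cell pair


def write_block(values, n_pairs, failed_pairs):
    """Lay out 3-bit values across n_pairs cell pairs, marking failed pairs
    INVALID and shifting the data past them into the spare pairs."""
    healthy = n_pairs - len(failed_pairs)
    if healthy < len(values):
        raise ValueError("more failures than spare pairs")
    pending = iter(values)
    pairs = []
    for index in range(n_pairs):
        if index in failed_pairs:
            pairs.append(INVALID)
        else:
            pairs.append(next(pending, 0))  # unused spares hold a filler value
    return pairs


def read_block(pairs, n_values):
    """Recover the data by dropping INVALID pairs; no failure locations are stored."""
    values = [p for p in pairs if p != INVALID]
    if len(values) < n_values:
        raise ValueError("more failures than spare pairs")
    return values[:n_values]


data = [3, 1, 4, 1, 5, 2, 6, 5]               # eight data pairs, D0 to D7
stored = write_block(data, 10, {2, 7})        # two spares, two failed pairs
print(read_block(stored, len(data)) == data)  # True
```

Because the marker lives in the failed pair itself, the reader needs no pointer, which is where the saving over a pointer-based scheme such as ECP comes from.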
![Page 22: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/22.jpg)
HOW TO CORRECT WEAROUT FAILURES?
![Page 23: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/23.jpg)
CAPACITY: 3LC VS. 4LC
• 64B (512-bit) block
• 3LC needs fewer bits than 4LC for error correction
– 6 wearout failures: Mark-and-spare (2 cells/failure) vs. ECP (5 cells/failure)
– Drift errors: BCH-1 vs. BCH-10
• 3LC: 1.41 bits/cell, 4LC: 1.52 bits/cell (7% difference)
• Besides, 3LC is nonvolatile

Number of cells per 64B block:

| Design | Data | Wearout failure correction | Drift error correction |
|---|---|---|---|
| 3LC | 342 | 12 | 10 |
| 4LC | 256 | 31 | 50 |
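The bits/cell figures follow from the cell counts in the chart. The sketch below redoes that arithmetic. It reads the chart as 10 drift-correction cells for 3LC and 50 for 4LC, and treats ECP's 31 cells for 6 failures as 5 cells per failure plus 1 fixed cell. That split is an inference from the numbers, and the function names are hypothetical.

```python
import math

DATA_BITS = 512  # one 64B block


def cells_3lc(failures, drift_cells=10):
    """Cells per block for 3LC: 3-ON-2 data, mark-and-spare, drift ECC."""
    data_cells = 2 * math.ceil(DATA_BITS / 3)  # 171 pairs = 342 cells
    return data_cells + 2 * failures + drift_cells


def cells_4lc(failures, drift_cells=50, ecp_fixed_cells=1):
    """Cells per block for 4LC: 2 bits per cell, ECP, drift ECC."""
    data_cells = DATA_BITS // 2                # 256 cells
    return data_cells + 5 * failures + ecp_fixed_cells + drift_cells


for name, cells in (("3LC", cells_3lc(6)), ("4LC", cells_4lc(6))):
    print(name, cells, round(DATA_BITS / cells, 2))
# 3LC 364 1.41
# 4LC 337 1.52
```

The ratio 337 / 364 is about 0.926, which matches the roughly 7% capacity gap shown in the chart.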
![Page 24: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/24.jpg)
CAPACITY VS. # WEAROUT FAILURES
• MLC has worse endurance than SLC
• May need to tolerate more than 6 wearout failures
[Figure: bits/cell (0.0 to 1.8) vs. # wearout failures tolerated (0 to 20) for 4LC and 3LC]
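The trend can be worked out by hand under two assumptions that the slide does not state: the drift-correction cells stay fixed at 10 for 3LC and 50 for 4LC, and ECP costs 5 cells per failure plus 1 fixed cell (consistent with 31 cells for 6 failures). With f the number of wearout failures tolerated per 512-bit block, the bits/cell of each design is:

```latex
\begin{aligned}
\text{3LC:}\quad & \frac{512}{342 + 2f + 10} = \frac{512}{352 + 2f} \\
\text{4LC:}\quad & \frac{512}{256 + (5f + 1) + 50} = \frac{512}{307 + 5f} \\
\text{equal when}\quad & 352 + 2f = 307 + 5f \;\Rightarrow\; f = 15
\end{aligned}
```

Under these assumptions 4LC is denser for f < 15 and 3LC is denser for f > 15, because each additional failure costs 3LC two cells and 4LC five.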
![Page 25: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/25.jpg)
COMPARISON TO TRI-LEVEL-CELL PCM
• Recent work on MLC drift errors [ISCA’13]
– Same observation
• Most errors occur in the S3 state
– Same solution
• Use 3 levels instead of 4 levels
• TLC paper does not address
– Wearout failures
– Optimal resistance/threshold mapping
• Baseline 4LC is overly pessimistic (not usable at all)
• Unique feature in TLC paper
– Bandwidth-Enhanced writes
![Page 26: RACTICAL NONVOLATILE MULTILEVEL ELL PHASE CHANGE … · •Proposal: 3LC PCM –Simple, genuinely nonvolatile –3-ON-2 & Mark-and-spare •Low-cost wearout tolerance mechanism for](https://reader036.fdocuments.in/reader036/viewer/2022070917/5fb7296adee6ff3db03307c2/html5/thumbnails/26.jpg)
MLC PCM FOR CONTINUED CAPACITY SCALING
• Major challenge: resistance drift
• Conventional 4LC PCM is not practical
– Strong ECC and frequent refresh:
• Performance/power penalty
• Loss of nonvolatility
• Proposal: 3LC PCM
– Simple, genuinely nonvolatile
– 3-ON-2 & Mark-and-spare
• Low-cost wearout tolerance mechanism for 3LC
– Only 7% lower capacity than (volatile) 4LC
• Generalized non-power-of-two level cells
– 5LC, 6LC, …