Self-* Systems, CSE 598B
Paper title: Dynamic ECC tuning for caches
Presented by: Niranjan Soundararajan
Abstract
On-chip caches are increasing in size.
– Protection is needed in order to store correct data.
– ECC serves as an efficient means to protect the data.
ECC has its own overhead.
– Area: extra space for its logic.
– Latency: ECC computations take time.
This work deals with reducing the latency involved in ECC computation.
– Track the frequently accessed cache lines.
– Dynamically turn ECC computation on and off for specific cache lines.
Background
Information redundancy
Data in the processing core is protected by schemes such as RMT (Redundant Multi-threading) [1][2].
ECC protection is easy to implement for on-chip caches. Also, the size of the caches prevents them from being replicated [3].
Current evaluations show raw FIT (Failures In Time) rates for latches and SRAM cells varying between 0.001 and 0.01 FIT/bit. This value increases with elevation: at 1.5 km the FIT/bit is 3.5x larger, while at 10 km (airplanes) it is 100x larger [4][5][6][7].
As processor power dissipation becomes more and more important, supply voltages are being reduced. This will greatly increase the FIT/bit [8][9].
“As an example, consider a 32 MB data cache. This cache has 2^22 quad-words. Let us assume that an SRAM cell has an average FIT rate of 0.001. The single-bit FIT rate for the entire cache is 0.001 * 2^22 * 72 = 3.02 * 10^5, i.e. the MTTF is 10^9 / (3.02 * 10^5) = 3311 hours” [3].
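The arithmetic in the quoted example can be reproduced with a short sketch. It assumes the usual definition of one FIT as one failure per 10^9 device-hours and takes the 72 bits per quad-word from the quote (presumably 64 data bits plus 8 check bits). The function names and the `scale` parameter are hypothetical; `scale` is there only to apply the elevation multipliers mentioned earlier.

```python
HOURS_PER_FIT_UNIT = 1e9  # one FIT = one failure per 10^9 device-hours


def cache_fit(cache_bytes: int, fit_per_bit: float,
              bits_per_quadword: int = 72, scale: float = 1.0) -> float:
    """Single-bit FIT rate of a whole cache.

    bits_per_quadword = 72 follows the quoted example; scale is a
    hypothetical multiplier (e.g. 3.5 at 1.5 km, 100 at 10 km).
    """
    quadwords = cache_bytes // 8  # a quad-word holds 8 data bytes
    return fit_per_bit * quadwords * bits_per_quadword * scale


def mttf_hours(fit: float) -> float:
    """Mean time to failure in hours for a given FIT rate."""
    return HOURS_PER_FIT_UNIT / fit


if __name__ == "__main__":
    fit = cache_fit(32 * 2**20, 0.001)           # 32 MB cache at sea level
    print(f"FIT  = {fit:.3e}")
    print(f"MTTF = {mttf_hours(fit):.0f} hours")
    airborne = cache_fit(32 * 2**20, 0.001, scale=100.0)
    print(f"MTTF at 10 km = {mttf_hours(airborne):.1f} hours")
```

With the 100x multiplier for 10 km, the same cache would have an MTTF of roughly 33 hours under these assumptions.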
Consider the case of large multiprocessor systems with tens of megabytes of caches. Protection becomes an important issue if the systems are involved in critical computations like space research and flight control.
All of these data indicate that cache data must be protected. ECC is the best way to protect SRAM.
This work addresses some of the problems related to applying ECC to caches that need to operate at low latency, such as L1 caches.
Motivation
ECC overhead [10]
– Increase in area due to circuitry: 11% (approx. 15 mm²)
– Increase in latency: 10% (approx. 5 ns)
Applications show temporal locality in accessing cache lines. By dynamically turning ECC on and off for cache lines, the latency of cache accesses is reduced. Since the frequency of operations is high, the time between individual accesses to these lines is short.
The chance of an error affecting data is low
– Due to the frequency of operations
– The number of cache lines with high temporal locality is small compared to the total number of cache lines.
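The latency argument above can be illustrated with a simple model. This is only a sketch under stated assumptions: every access pays the same base latency, an ECC check adds a fixed fractional overhead (10%, as reported in [10]), and accesses to lines with ECC turned off skip that overhead entirely. The function name and the `fraction_unchecked` parameter are hypothetical.

```python
def average_access_latency(base_ns: float,
                           ecc_overhead: float = 0.10,
                           fraction_unchecked: float = 0.0) -> float:
    """Average access latency when some accesses skip the ECC check.

    Assumed model: checked accesses cost base_ns * (1 + ecc_overhead),
    unchecked accesses cost base_ns.
    """
    if not 0.0 <= fraction_unchecked <= 1.0:
        raise ValueError("fraction_unchecked must be between 0 and 1")
    fraction_checked = 1.0 - fraction_unchecked
    return base_ns * (1.0 + ecc_overhead * fraction_checked)
```

For a 50 ns base latency, the model gives 55 ns with ECC always on and roughly 51 ns if 80% of accesses go to lines whose ECC is turned off.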
Benchmarks
[Figure: cache line accessed (y-axis, 0 to 600) versus cycle time (x-axis), one panel per benchmark: wupwise, vpr-route, Parser, Perlbmk, Gcc, Gzip.]
Self-Tuning Implementation
Keep track of cache line accesses. Every 5000 cycles, tune the ECC of the cache lines, turning it on or off.
Overhead:
– Keeping track of cache line accesses: simple, fast counters make implementation easy.
– Tuning ECC for lines: a simple average computation, then turning ECC on for lines with more activity than the average.
Implementation
Implementation is simplified if
– Counters are maintained for a set of cache lines.
– ECC tuning is done at this granularity.
Granularity can be 10, 20 … 100 lines.
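The counter-and-average scheme described above can be sketched as a small software model. This is not the presenter's implementation: the class name, the reset of counters after each tuning pass, and the fully protected starting state are all assumptions. The `protect_hot` flag selects which side of the average keeps ECC on; the default (`False`) follows the Motivation and Results slides, where ECC is turned off for the frequently accessed lines.

```python
TUNING_INTERVAL = 5000  # cycles between tuning passes, as stated in the slides


class EccTuner:
    """Hypothetical model: one access counter per group of cache lines."""

    def __init__(self, num_lines: int, group_size: int = 10,
                 protect_hot: bool = False):
        self.group_size = group_size
        self.protect_hot = protect_hot
        num_groups = (num_lines + group_size - 1) // group_size
        self.counters = [0] * num_groups
        self.ecc_enabled = [True] * num_groups  # assumption: start protected
        self.next_tune = TUNING_INTERVAL

    def access(self, line: int, cycle: int) -> bool:
        """Record an access; return True if ECC is computed for it."""
        while cycle >= self.next_tune:
            self._tune()
            self.next_tune += TUNING_INTERVAL
        group = line // self.group_size
        self.counters[group] += 1
        return self.ecc_enabled[group]

    def _tune(self) -> None:
        """Compare each group's count with the average and set its ECC flag."""
        average = sum(self.counters) / len(self.counters)
        for group, count in enumerate(self.counters):
            hot = count > average
            self.ecc_enabled[group] = hot if self.protect_hot else not hot
        # Assumption: counters restart from zero for the next interval.
        self.counters = [0] * len(self.counters)
```

With 600 lines and a group size of 10, the model keeps 60 counters instead of 600, which is the simplification the coarser granularity buys.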
Self-tuning Results
From the graphs we see the temporal locality. Based on these results, ECC was turned off for the lines with high locality.
BENCHMARK    RELATIVE ACCESS FREQUENCY    OVERHEAD
WUPWISE      8.3                          0.5%
VPR-ROUTE    5.9                          2.5%
PARSER       11.9                         2.7%
PERLBMK      6.3                          2.5%
GCC          2.7                          1.9%
GZIP         4.7                          3.2%
Conclusion
ECC is indispensable as chip reliability decreases and maintaining correct data becomes an issue.
The processor-memory bottleneck is a persistent issue. Increasing cache latency through ECC protection creates further problems.
This work tries to reduce the latency of ECC-protected caches using a scheme that dynamically turns ECC on and off.
References
[1] S. S. Mukherjee, M. Kontz, and S. K. Reinhardt, “Detailed Design and Evaluation of Redundant Multithreading Alternatives,” ISCA, 2002.
[2] S. S. Mukherjee, C. T. Weaver, J. Emer, S. K. Reinhardt, and T. Austin, “A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor,” MICRO, December 2003.
[3] S. S. Mukherjee, T. Fossum, J. Emer, and S. K. Reinhardt, “Cache Scrubbing in Microprocessors: Myth or Necessity?” 10th International Symposium on Pacific Rim Dependable Computing (PRDC), Papeete, Tahiti, March 2004.
[4] J. F. Ziegler, “Terrestrial cosmic rays,” IBM J. of Research and Development, Vol. 40, No. 1, pp. 19–39, Jan. 1996.
[5] Y. Tosaka, S. Satoh, K. Suzuki, T. Suguii, H. Ehara, G. A. Woffinden, and S. A. Wender, “Impact of Cosmic Ray Neutron Induced Soft Errors on Advanced Submicron CMOS circuits,” Symposium on VLSI Technology Digest of Technical Papers, 1996.
[6] T. Karnik, B. Bloechel, K. Soumyanath, V. De, and S. Borkar, “Scaling trends of Cosmic Rays induced Soft Errors in static latches beyond 0.18µ,” Symposium on VLSI Circuits Digest of Technical Papers, 2001.
[7] S. Hareland, J. Maiz, M. Alavi, K. Mistry, S. Walstra, and C. Dai, “Impact of CMOS Scaling and SOI on soft error rates of logic processes,” Symposium on VLSI Technology Digest of Technical Papers, 2001.
[8] R. Baumann, “Soft Errors in Commercial Semiconductor Technology: Overview and Scaling Trends,” IEEE 2002 Reliability Physics Tutorial Notes, Reliability Fundamentals, pp. 121_01.1 – 121_01.14, April 7, 2002.
[9] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, “Modeling the Effect of Technology Trends on the Soft Error Rate of Combinatorial Logic,” Dependable Systems and Networks, 2002.
[10] H. L. Kalter et al., “A 50-ns 16-Mb DRAM with a 10 ns data rate and on-chip ECC,” IEEE J. Solid-State Circuits, vol. 25, pp. 1118–1128, Oct. 1990.