![Page 1: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/1.jpg)
A Landscape of the New Dark Silicon Design Regime
Michael B. Taylor
Associate Professor, University of California, San Diego
UCSD Center for Dark Silicon
![Page 2: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/2.jpg)
miketaylor.org
Research areas: Compilers, Architecture, Chips, Tech Scaling
• Compilers: Kremlin, RawCC Parallelizing Compiler, Kismet
• Architecture and chips: MIT Raw Tiled Multicore, Conservation Cores, GreenDroid, Heterogeneous Proc, Scalar Operand Networks
• Tech scaling: Utilization Wall / Dark Silicon, Four Horsemen
![Page 4: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/4.jpg)
The Scaling Promise of Moore’s Law
8 Years Ago
3.8 GHz
1 core
90 nm
![Page 7: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/7.jpg)
The Scaling Promise of Moore’s Law
8 years ago (90 nm): 3.8 GHz, 1 core
Today (22 nm), Xistors are:
• 4x faster
• 16x more plentiful
Today, promised: 15.2 GHz, 16 cores (64X, i.e. 4 x 16)
Today, actual: 3.6 GHz, 6 cores (5.7X, i.e. (3.6 / 3.8) x 6)
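The 64X and 5.7X figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that "improvement" means clock frequency times core count, and the function names are hypothetical, not taken from the talk.

```c
#include <assert.h>

/* Throughput proxy: clock frequency times core count.
   Assumption: IPC, memory behavior, and parallel efficiency are ignored. */
double throughput(double ghz, int cores) {
    assert(ghz > 0.0 && cores > 0);
    return ghz * (double)cores;
}

/* Improvement of a new design over a baseline under the same proxy. */
double improvement(double base_ghz, int base_cores,
                   double new_ghz, int new_cores) {
    return throughput(new_ghz, new_cores) / throughput(base_ghz, base_cores);
}
```

Under this proxy, `improvement(3.8, 1, 15.2, 16)` gives the promised factor and `improvement(3.8, 1, 3.6, 6)` gives the delivered one.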
![Page 9: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/9.jpg)
Dark Silicon: Transistors clocked far below their potential.
65 nm: 4 cores @ 1.8 GHz
32 nm: 8 cores @ 1.8 GHz
2x cores per 2 generations, flat frequency
Dark or Dim Silicon (“uncore”)
![Page 10: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/10.jpg)
This Talk
Explaining the Source of Dark Silicon: The Utilization Wall
The Four Horsemen of the Dark Silicon Apocalypse
GreenDroid: An Architecture for the Dark Silicon Age
Oct 2013 issue
![Page 11: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/11.jpg)
Where does dark silicon come from? And how dark is it going to be?
The Utilization Wall:
With each successive process generation, the percentage of a chip that can switch at full frequency drops exponentially due to power constraints.
[Venkatesh, Taylor, et al., ASPLOS ‘10]
![Page 12: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/12.jpg)
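Dennard: “We can keep power consumption constant”
Per process generation, with scaling factor S = 1.4x:
• S^2 = 2x more transistors
• S = 1.4x faster transistors
• S = 1.4x lower capacitance
• Scale Vdd by S = 1.4x, worth S^2 = 2x (power depends on Vdd^2)
[Figure: chart with levels 1, S, S^2, S^3]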
![Page 13: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/13.jpg)
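Fast forward to 2005: threshold scaling problems due to leakage prevent us from scaling voltage
Per process generation, with scaling factor S = 1.4x:
• S^2 = 2x more transistors
• S = 1.4x faster transistors
• S = 1.4x lower capacitance
• Scaling Vdd by S = 1.4x (worth S^2 = 2x) is no longer available
[Figure: chart with levels 1, S, S^2, S^3]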
![Page 14: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/14.jpg)
Full-chip, full-frequency power dissipation is increasing exponentially, by 2x with every process generation
Factor of S^2 = 2X shortage!!
[Figure: chart with levels 1, S, S^2, S^3]
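The S^2 shortage follows from multiplying the per-generation factors. A minimal sketch, assuming the usual dynamic-power proxy P ~ N * C * V^2 * f (transistor count, capacitance, supply voltage, frequency); the function names and the fixed power budget normalized to 1.0 are illustrative assumptions:

```c
#include <assert.h>

/* Relative full-chip, full-frequency power after `gens` generations,
   using P ~ N * C * V^2 * f with per-generation factors
   N *= S^2, C /= S, f *= S, and V /= S only while voltage still scales. */
double full_chip_power(double s, int gens, int vdd_scales) {
    assert(s > 1.0 && gens >= 0);
    double n = 1.0, c = 1.0, v = 1.0, f = 1.0;
    for (int g = 0; g < gens; g++) {
        n *= s * s;
        c /= s;
        f *= s;
        if (vdd_scales) v /= s;
    }
    return n * c * v * v * f;
}

/* Fraction of the chip that can switch at full frequency within a
   fixed power budget (normalized to 1.0 at generation 0). */
double utilization(double s, int gens, int vdd_scales) {
    double p = full_chip_power(s, gens, vdd_scales);
    return p <= 1.0 ? 1.0 : 1.0 / p;
}
```

With `vdd_scales` set, power stays flat (Dennard). With it cleared, power grows by S^2 per generation, so the usable fraction of the chip shrinks by the same factor, which is the exponential drop the utilization wall describes.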
![Page 15: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/15.jpg)
Multicore has hit the Utilization Wall
Spectrum of tradeoffs between # of cores and frequency
Example: 65 nm → 32 nm (S = 2)
65 nm: 4 cores @ 1.8 GHz
32 nm options:
• 4 cores @ 2x1.8 GHz (12 cores dark)
• 2x4 cores @ 1.8 GHz (8 cores dark, 8 dim) (Intel/x86 Choice, next slide)
• ….
• 4x4 cores @ .9 GHz (GPUs of future?)
[Taylor, Hotchips 2010] [Esmaeilzadeh ISCA 2011]
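The three 32 nm design points share one budget: cores times frequency can grow only by S. A hedged sketch of that tradeoff, assuming capacitance drops by S while voltage stays flat and the power budget is fixed; the helper names are hypothetical:

```c
#include <assert.h>

/* At fixed power, with capacitance down by S and voltage flat, the
   product (active cores * frequency) can grow only by S. */
double affordable_ghz(int base_cores, double base_ghz, double s, int new_cores) {
    assert(base_cores > 0 && new_cores > 0 && s >= 1.0);
    double budget = s * base_cores * base_ghz;  /* core-GHz available */
    double ghz = budget / new_cores;
    double max_ghz = s * base_ghz;              /* transistors are only S faster */
    return ghz > max_ghz ? max_ghz : ghz;
}

/* Cores that fit on the die (S^2 more transistors) but stay unpowered. */
int dark_cores(int base_cores, double s, int active_cores) {
    int potential = (int)(s * s * base_cores + 0.5);
    assert(active_cores <= potential);
    return potential - active_cores;
}
```

Calling `affordable_ghz(4, 1.8, 2.0, n)` for n = 4, 8, and 16 reproduces the 2x1.8, 1.8, and .9 GHz points on the slide, and `dark_cores` gives the 12 and 8 dark cores for the first two options.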
![Page 16: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/16.jpg)
If multicore is not a solution to the dark silicon problem, what other potential avenues do we have?
Develop a more nuanced understanding of dark silicon.
![Page 17: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/17.jpg)
This Talk
Explaining the Source of Dark Silicon
The Four Horsemen of the Dark Silicon Apocalypse
GreenDroid: An Architecture for the Dark Silicon Age
[Taylor, DAC 2012]
![Page 18: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/18.jpg)
The Four Horsemen Taxonomy
I II III IV
Approaches for thriving in a future of dark silicon
A taxonomy of approaches for dealing with dark silicon. None is ideal, but each has its benefit and the optimal chip design probably incorporates all four…
![Page 19: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/19.jpg)
The Shrinking Horseman (#1)
“Area is expensive. Chip designers will just build smaller chips instead of having dark silicon in their designs!”
(if you work on Dark Silicon research, you will hear this a lot…)
![Page 20: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/20.jpg)
The Shrinking Horseman (#1)
First, dark silicon doesn’t mean useless silicon; it just means it’s under-clocked or not used all of the time.
There’s lots of dark silicon in current chips:
• e.g. L3 cache cells are very dark
• un-core, unused SSE, etc.
![Page 21: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/21.jpg)
The Shrinking Horseman (#1)
“Why not just build smaller chips!”
Possibly – but why didn’t we shrink all of our chips before the dark silicon days? This too would be cheaper!
• Exponential increase in power density → exponential rise in temperature [Skadron]
• Competition and margins
  – If dark silicon buys you anything, you have to use it too, to keep up with the Joneses.
• Diminished returns
  – A fixed % of chip cost is in the silicon (versus marketing, test, packaging, I/O pad area, etc.); savings decrease exponentially w/ scaling.
• But, some chips will shrink
  – Especially chips where improved performance does not matter
  – Race to the bottom, except if we can sell exponentially more chips!
![Page 23: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/23.jpg)
The Dim Horseman (#2)
“We will fill the chip with homogeneous cores that would exceed the power budget, but we will:
- underclock them (spatial dimming)
- use them in bursts (temporal dimming)”
… “dim silicon”.
![Page 24: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/24.jpg)
The Dim Horseman (#2)
Spatial Dimming
• Multicores (higher core count → lower freqs)
• “Upward DVFS”: cubic power increase
• Downward DVFS no longer decreases power cubically
• Near Threshold Voltage (NTV) Operation
  – Bad: delay increase >> energy gain, e.g. 5X less energy, 8X slower; also requires redesign of circuits for NTV operation!
  – Good: energy per op is still lower
  – 8X more cores / parallelism → 1X perf, 5X lower energy
  – 40X more cores / parallelism → 5X perf, same energy
  – But watch for non-ideal speedups / Amdahl’s Law
NTV Examples:
• Manycore (e.g., UMich Centip3de [ISSCC 2012])
• SIMD (e.g., Synctium [CAL 2010])
• x86 [Intel, ISSCC 2012]
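The 8X and 40X NTV data points, and the Amdahl's Law caveat, can be checked with a small model. This is a sketch under stated assumptions: each NTV core is `slowdown` times slower and uses `energy_gain` times less energy per operation, idle cores cost nothing, and performance and power are relative to one core at nominal voltage. The function names are hypothetical.

```c
#include <assert.h>

/* Performance of `cores` NTV cores relative to one nominal-voltage core.
   Amdahl's Law: only `parallel_fraction` of the work spreads across cores. */
double ntv_perf(int cores, double slowdown, double parallel_fraction) {
    assert(cores > 0 && slowdown > 0.0);
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    double t = slowdown * ((1.0 - parallel_fraction) + parallel_fraction / cores);
    return 1.0 / t;
}

/* Relative power: work rate times the (reduced) energy per operation. */
double ntv_power(int cores, double slowdown, double energy_gain,
                 double parallel_fraction) {
    assert(energy_gain > 0.0);
    return ntv_perf(cores, slowdown, parallel_fraction) / energy_gain;
}
```

With perfect parallelism, 8 cores break even on performance at one fifth of the power, and 40 cores reach 5X performance at the original power. Dropping the parallel fraction to 0.9 erases almost all of the 40-core gain, which is the non-ideal speedup the slide warns about.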
![Page 25: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/25.jpg)
The Dim Horseman (#2)
Temporal Dimming: Computing in Bursts
- Battery Limited Systems
  • Active versus Standby mode
- Thermally Limited Systems
  • Turbo Boost 2.0 [Intel, Rotem et al., HOTCHIPS 2011]: leverage thermal cap for DVFS, “overspend” if cold
  • Computational Sprinting [Raghavan HPCA 2012]
  • ARM big.LITTLE in mobile phones or tablets [DAC 2012]: A15 power usage way above sustainable for phone; 10 second bursts at most
[Figure axis: wall clock time]
![Page 27: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/27.jpg)
The Specialized Horseman (#3)
“We will use all of that dark silicon area to build specialized cores, each of them tuned for the task at hand (10-100x more energy efficient), and only turn on the ones we need…”
[e.g., Venkatesh et al., ASPLOS 2010, Lyons et al., CAL 2010, Goulding et al., Hotchips 2010, Hardavellas et al. IEEE Micro 2011]
![Page 29: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/29.jpg)
Conservation Cores
Idea: Leverage dark silicon to “fight” the utilization wall
Insights:
– Specialized logic can improve energy efficiency by 10-1000x versus a general-purpose processor
– Power is now more expensive than area
Our Approach:
– Fill dark silicon with Conservation Cores, or c-cores, which are specialized energy-saving coprocessors that save energy on common apps
– Execution jumps from c-core to c-core
– Power-gate c-cores that are not currently in use
Conservation Cores provide an architectural way to trade dark area for an effective increase in power budget!
![Page 30: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/30.jpg)
Conservation Cores (C-cores)
Specialized coprocessors for reducing energy in irregular code
– Hot code implemented by c-cores, cold code runs on host CPU
– C-cores use up to 18x less energy
– Shared D-cache, coherent memory
– Patching support in hardware
Fully-automated toolchain
– No “deep” analysis or transformations required
– C-cores automatically generated from hot program regions
– Design-time scalable
• Emphasize quantity over quality!
• Simple conversion into HW buys us big gains; no need for heroic compiler efforts.
[Figure: host CPU (general-purpose processor) with I-cache and D-cache; cold code runs on the host CPU, hot code on the C-core, which shares the D-cache]
"Conservation Cores: Reducing the Energy of Mature Computations," Venkatesh et al., ASPLOS '10
![Page 31: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/31.jpg)
C-core Generation
for (i = 0; i < N; i++) {
    x = A[i];
    y = B[i];
    C[x] = D[y] + x + y + x*y;
}
Start with ordinary C code. Irregular or regular is fine. Arbitrary control flow, arbitrary memory access patterns and complex data structures are supported.
![Page 32: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/32.jpg)
C-core Generation
Build a CFG; run ordinary compiler optimizations.
[Figure: CFG with basic blocks BB0, BB1, BB2]
![Page 33: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/33.jpg)
C-core Generation
Each BB becomes a datapath; each operator is turned into its HW equivalent.
Memory ops mux’d into L1 cache.
Multiplier and FPUs may or may not be shared.
[Figure: CFG (BB0, BB1, BB2) beside the generated datapath of LD, ST, +, *, +1, and <N? operators]
![Page 34: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/34.jpg)
C-core Generation
Create a state machine that determines which BB (datapath) is next.
[Figure: CFG and datapath, now with an inter-BB state machine]
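As a rough software analogy for the generated hardware, the example loop can be written as one datapath per basic block plus a state machine that picks the next block. This is only a sketch: the split into BB0 (entry), BB1 (loop body), and BB2 (exit) is an assumption about how the CFG is partitioned, and `ccore_state`, `step`, and `run_ccore` are hypothetical names, not part of the c-core toolchain.

```c
#include <assert.h>

enum bb { BB0, BB1, BB2 };           /* assumed basic-block labels */

struct ccore_state {                 /* registers that live across blocks */
    enum bb next;
    int i;
};

/* One step: run the current block's datapath, then let the inter-BB
   state machine choose the next block. */
void step(struct ccore_state *s, int n,
          const int *A, const int *B, int *C, const int *D) {
    switch (s->next) {
    case BB0:                        /* loop entry: i = 0 */
        s->i = 0;
        s->next = (n > 0) ? BB1 : BB2;
        break;
    case BB1: {                      /* loop body */
        int x = A[s->i];             /* LD */
        int y = B[s->i];             /* LD */
        C[x] = D[y] + x + y + x * y; /* LD, adders, multiplier, ST */
        s->i = s->i + 1;             /* +1 */
        s->next = (s->i < n) ? BB1 : BB2;   /* <N? */
        break;
    }
    case BB2:                        /* exit block: nothing to do */
        break;
    }
}

/* Drive the state machine until it reaches the exit block. */
void run_ccore(int n, const int *A, const int *B, int *C, const int *D) {
    assert(n >= 0);
    struct ccore_state s = { BB0, 0 };
    while (s.next != BB2)
        step(&s, n, A, B, C, D);
}
```

In hardware the operators inside a block run as a datapath rather than sequentially, so the model captures only the control structure, not the timing.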
![Page 35: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/35.jpg)
C-core Generation
Via Verilog (.V), run through standard CAD flow: Synopsys IC Compiler, P&R, CTS.
Result: 0.01 mm2 in 45 nm TSMC, runs at 1.4 GHz.
[Figure: CFG, datapath, and inter-BB state machine emitted as .V files]
![Page 36: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/36.jpg)
Where do the energy savings come from?
RISC baseline: 91 pJ/instr.
• I-cache 23%
• Fetch/Decode 19%
• Reg. File 14%
• Datapath 38%
• D-cache 6%
C-cores: 8 pJ/instr. (shares of the RISC baseline energy)
• D-cache 6%
• Datapath 3%
• Energy Saved 91%
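The 8 pJ/instr. figure is consistent with the percentages if the c-core shares are read as fractions of the 91 pJ RISC baseline. A small sketch of that arithmetic, with hypothetical helper names and that reading stated as an assumption:

```c
#include <assert.h>

/* Energy of one component per instruction, given the baseline energy
   and the component's share of it in percent. */
double component_pj(double baseline_pj, double share_percent) {
    assert(share_percent >= 0.0 && share_percent <= 100.0);
    return baseline_pj * share_percent / 100.0;
}

/* C-core energy per instruction when only the D-cache and a slimmed
   datapath remain (assumption: shares are relative to the RISC baseline). */
double ccore_pj(double baseline_pj, double dcache_percent,
                double datapath_percent) {
    return component_pj(baseline_pj, dcache_percent)
         + component_pj(baseline_pj, datapath_percent);
}
```

`ccore_pj(91.0, 6.0, 3.0)` comes to a little over 8 pJ, matching the slide: the I-cache, fetch/decode, and register file energy disappears, and the datapath share shrinks from 38% to 3%.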
![Page 37: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/37.jpg)
Supporting Software Changes
Software may change; HW must remain usable
– C-cores unaffected by changes to cold regions
Can support any changes, through patching
– Arbitrary insertion of code: software exception mechanism
– Changes to program constants: configurable registers
– Changes to operators: configurable functional units
Software exception mechanism
– State-tree allows us to access any register in the C-core
– Execute replacement code in processor
– Write back values to c-core state-tree to resume execution
![Page 38: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/38.jpg)
Android Time Highly Concentrated
• 61% of user time is spent in web + top 10 apps.
• 67% of user time is spent in web + top 20 apps.
• 73% of user time is spent in web + top 50 apps, growing at 2% for each additional 10 apps.
![Page 39: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/39.jpg)
GreenDroid: Using c-cores to reduce energy in mobile application processors
Android workload → automatic c-core generator → c-cores → placed-and-routed chip with 9 Android c-cores
"The GreenDroid Mobile Application Processor: An Architecture for Silicon's Dark Future," Goulding-Hotta et al., IEEE Micro Mar./Apr. 2011
![Page 40: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/40.jpg)
GreenDroid Tile Floorplan
Norm to 45 nm:
– 1.0 mm2 per tile
– 1.5 GHz
Area breakdown:
– 25% RISC core, I-cache, and on-chip network
– 25% D-cache
– 50% C-core “fill”
[Figure: 1 mm x 1 mm tile floorplan showing the OCN, D$, CPU, I$, and the surrounding C-cores (C)]
![Page 41: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/41.jpg)
Quad-core GreenDroid Prototype
Four heterogeneous tiles with ~40 C-cores.
• Synopsys IC Compiler
• 28-nm Global Foundries
• ~1.5 GHz
• 2 mm^2
• Multiproject Tapeout w/ UCSC
![Page 42: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/42.jpg)
Publication timeline:
• ASPLOS ’10: C-cores; Patching; Util. Wall
• HOTCHIPS ’10: GreenDroid; Dark Silicon
• HPCA ’11: Selective Depipelining; Cachelets
• FCCM ’11: C-cores for FPGAs
• FPL ’11: Selective Depipelining for FPGAs
• IEEE MICRO ’11: GreenDroid P&R Tile
• MICRO ’11: Automatic C-core Generalization
• DAC ’12: Four Horsemen
• IEEE MICRO ’13: Landscape of Dark Silicon
Generalization example: the datapaths for sum += i and sum *= sum merge into sum = alu(sum, mux(sum,i,muxControl), aluControl)
[Figure: tile with D$, CPU, and I$]
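The generalization example merges two single-purpose datapaths into one configurable datapath. The sketch below models it in C. The control encodings (`SEL_SUM`, `SEL_I`, `ALU_ADD`, `ALU_MUL`) and the wrapper `merged_datapath` are assumptions made for illustration; only the `alu(sum, mux(sum,i,muxControl), aluControl)` shape comes from the slide.

```c
#include <assert.h>

enum mux_sel { SEL_SUM, SEL_I };     /* hypothetical muxControl values */
enum alu_op  { ALU_ADD, ALU_MUL };   /* hypothetical aluControl values */

/* Select the second ALU operand. */
int mux(int sum, int i, enum mux_sel muxControl) {
    return muxControl == SEL_SUM ? sum : i;
}

/* Configurable functional unit: add or multiply. */
int alu(int a, int b, enum alu_op aluControl) {
    assert(aluControl == ALU_ADD || aluControl == ALU_MUL);
    return aluControl == ALU_ADD ? a + b : a * b;
}

/* One merged datapath that covers both original statements. */
int merged_datapath(int sum, int i,
                    enum mux_sel muxControl, enum alu_op aluControl) {
    return alu(sum, mux(sum, i, muxControl), aluControl);
}
```

Setting the controls to (`SEL_I`, `ALU_ADD`) behaves like `sum += i`, and (`SEL_SUM`, `ALU_MUL`) behaves like `sum *= sum`, so one piece of hardware can serve both code regions.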
![Page 44: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/44.jpg)
The Deus Ex Machina Horseman
Latin [/dayus ex makeena/] American [/duece ex mashina/]
deus ex machina /dayus ex makeena/: A plot device whereby a seemingly unsolvable problem is suddenly and abruptly solved with the unexpected intervention of some new event, character, ability or object.
![Page 45: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/45.jpg)
The Deus Ex Machina Horseman
“MOSFETs are the fundamental problem.”
MOSFET variants (FinFETs, Trigate, High-K, nanotubes, 3D)
- one-time improvements
- limited to 60 mV/decade subthreshold slope
- leakage is still there
![Page 46: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/46.jpg)
The Deus Ex Machina Horseman
Possible “Beyond CMOS” Device Directions (none are there yet, imho)
• Nano-Electro-Mechanical (NEMS) Relays [e.g., Spencer et al., JSSC 2011]
  very low energy, physical connections, very slow
• Tunnel Field Effect Transistors (TFETs) [e.g., Ionescu et al., Nature 2011]
  use tunneling effects to overcome MOSFET limits; better subthreshold slopes (~25 mV/decade) at lower voltages; not superior to MOSFETs at higher voltages
![Page 47: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/47.jpg)
DARPA / SRC / MARCO’s $194M STARnet Investment
I II III IV
C-FAR LEAST C-SPIN FAME SONIC
Default
Terraswarm
![Page 48: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/48.jpg)
For more: IEEE Micro 2013 paper
Explaining the Source of Dark Silicon
The Four Horsemen of the Dark Silicon Apocalypse
Dark Silicon Design Principles
A Truly Dark Computing Fabric: The Brain
![Page 49: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/49.jpg)
Conclusion
Dark Silicon is opening up a whole new class of exciting architectural directions, which many folks are starting to move into and which I have termed the “four horsemen”.
GreenDroid is one interesting example of an architecture that explores one of these directions.
![Page 50: A Landscape of the New Dark Silicon Design Regime · PDF fileof the New Dark Silicon Design Regime. miketaylor.org Compilers Architecture Chips Tech Scaling Kremlin ... GreenDroid](https://reader030.fdocuments.in/reader030/viewer/2022011723/5a85c1c67f8b9a882e8c762b/html5/thumbnails/50.jpg)
The UCSD GreenDroid Team
darksilicon.org
Prof. Taylor Prof. Swanson