IC Technology: What Will the Next Node Offer Us? · 8/17/2019


TSMC Property © 2019 TSMC, Ltd®

IC Technology – What Will the Next Node Offer Us?

H.-S. Philip Wong

Vice President, Corporate Research, TSMC

Willard R. & Inez Kerr Bell Professor, Stanford University

MOORE’S LAW

Transistors per microprocessor

[Figure: transistors per microprocessor (log scale, 10^4 to 10^10) versus year, 1971 to 2017.]

Source: Karl Rupp. 40 Years of Microprocessor Trend Data.
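As a rough illustration of the trend this plot conveys, the sketch below estimates the doubling period implied by the axis extents. Treating the axis limits (10^4 in 1971, 10^10 in 2017) as endpoints is an assumption made here for illustration only, since the actual data points are not part of this transcript. `doubling_period_years` is a hypothetical helper name.

```python
import math

def doubling_period_years(count_start, count_end, year_start, year_end):
    """Years per doubling, assuming steady exponential growth between two points."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings

# Assumed endpoints: the axis extents of the plot, not measured data.
period = doubling_period_years(1e4, 1e10, 1971, 2017)
print(f"about {period:.1f} years per doubling")
```

Under these assumed endpoints the result is a little over two years per doubling, which is in line with the usual statement of Moore's Law.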

MOORE’S LAW DENSITY AND COST PER FUNCTION

[Figure: relative manufacturing cost per component (1 to 10^5) versus number of components per integrated circuit (1 to 10^5), with curves labeled 1960, 1965, and 1970.]

Source: G. Moore, Electronics, 1965

MOORE’S LAW IS WELL AND ALIVE

DENSITY: A NECESSARY ATTRIBUTE

[Figure: relative density (log scale, 10^-3 to 10^4) versus year, 1970 to 2020. Series: standard cell inverter, high density SRAM, logic gates, transistor density (microprocessors).]

IMAGINE: TRANSISTOR PERFORMANCE W/O DENSITY

• Not enough memory
• No multi-core chips
• No accelerators
• Wire delay slows big chips

TECHNOLOGY LEADERSHIP

N7

World’s first 7 nm

Participated in all the products on 7 nm

TECHNOLOGY LEADERSHIP

N7, N5 ( )P

Best performance

Highest density

Extensive EUV layers

Design ecosystem ready

In risk production

TECHNOLOGY LEADERSHIP

N7, N5 ( )P, N3

THE ELEPHANT IN THE ROOM

[Figure: length scale from 10^-1 m down to 10^-10 m, with reference objects.]

Tennis ball: 10 cm

Strand of hair: 0.1 mm

Bacteria: 2 μm

Virus: 50 nm

Carbon nanotube: 1.2 nm

FinFET

Water molecule: 0.28 nm

Hydrogen atom: 0.1 nm


CONTINUOUS BENEFITS NODE AFTER NODE

MOORE’S LAW – A HISTORY OF INNOVATIONS

Dennard scaling

Strained Si, high-k / metal gate

FinFET / DTCO


CONTINUOUS BENEFITS NODE AFTER NODE

MULTIPLE ROADS LEAD TO ROME

Innovations

INTEGRATING CHIPS INTO SYSTEMS

It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected. The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically.

Source: G. Moore, Electronics, 1965


CoWoS® SYSTEM INTEGRATION

Source: 2013 TSMC Technology Symposium

TSMC CoWoS® fully assembled test chip

1 SoC + 2 DRAMs

CoWoS® SYSTEM INTEGRATION

2500 mm² interposer:

2 processors (600 mm²)

+ 8 HBM DRAM

SYSTEM INTEGRATION TECHNOLOGIES

[Figure: axes are I/O Pin Count and Integrated Si/Package Area, Reticle.]

CHIPLETS INTEGRATION REDUCES SYSTEM COST PER FUNCTION

[Figure: interposer size (mm²) versus package size, with levels marked 1X, 1.5X, and 2X.]

GP100 (Courtesy of Nvidia)

7V580T, Heterogeneous Integration (Courtesy of Xilinx)

7V2000T, Homogeneous Integration (Courtesy of Xilinx)

XCVU440 (Courtesy of Xilinx)

GV100 (Courtesy of Nvidia)

SEMICONDUCTOR TECHNOLOGY EVOLVES

DRIVEN BY CHANGING APPLICATION LANDSCAPE

Application eras: Transistor Radio, Mini-Computer, PC / Internet, Mobile, AI / 5G

1947: Invention of point-contact transistor
1958: Invention of IC
1971: Intel 4004
1973: Mobile phone
1974: Transistor Scaling Principle
1984: Flash Memory
1995: Pentium CPU
1999: FinFET
2002: 3G
2007: iPhone
2009: 4G
2017: GPU (21B Transistors)
2018: 7nm FinFET
2020: 5nm CMOS
2050 and beyond

DATA MOVEMENT HITS THE MEMORY WALL

ABUNDANT-DATA APPLICATIONS: ENERGY MEASUREMENTS

[Figure: split between Memory and Compute for AlexNet (CNN), ResNet-152 (CNN), Language Model (LSTM), and others. Splits shown: 15% / 85%, 8% / 92%, 20% / 80%.]

Deep Learning Accelerators

Intel performance counter monitors: 2 CPUs, 8 cores/CPU + 128 GB DRAM

Source: S. Mitra (Stanford)

DEEP NEURAL NETWORKS REQUIRE LARGE MEMORY CAPACITY

Network (application)   Type   Training/Inference   Model Size    Memory Usage (GBytes)
ResNet (vision)         CNN    Training             120 MBytes    21*
                               Inference                          0.12
Language Model (NLP)    LSTM   Training             2.5 GBytes    40*
                               Inference                          2.5

* Training memory usage: Batch size 64, word size 64-bit, memory can increase with greater batch sizes, footprint of activations, weights, errors and gradients.

Source: M. Lee, W. Hwang, Prof. S. Mitra (Stanford), M. Aly (NTU, Singapore), Y. Wang, K. Akarvardar (TSMC)

ON-CHIP SRAM CAPACITY: NEVER ENOUGH

[Figure: estimated on-chip SRAM (MB, 0 to 60) versus launch year (2006 to 2018) for CPUs and GPUs. Labeled points: Intel Xeon X5355, NVIDIA Tesla K40, Intel Xeon E7-8890 v4, NVIDIA Tesla V100. Annotation: 3.8 GBytes @ 1.4 nm node.]

Source: W. Hwang, Prof. S. Mitra (Stanford)


CAN WE PUT LOTS OF MEMORY ON-CHIP?

WHAT KINDS OF MEMORY, FOR WHICH APPLICATION?

SUPER AI ACCELERATOR ENABLED BY CoWoS®

Heterogeneous Integration: GPU + High Bandwidth Memory (HBM2)

Processing power equal to 100 CPUs

>300 B transistors

[Figure: CoWoS module with one GPU and four HBM2 stacks.]

Source: “Inside Volta”, Nvidia GPU Tech. Conf., May 10, 2017.


COMPUTE-MEMORY INTEGRATION

Off-Chip DRAM

Si Logic Die

Printed Circuit Board

Limited I/O Connectivity

2D System (traditional baseline)

Source: W. Hwang, W. Wan, Y. Malviya, H. Li, M. Lee, M. Aly, H.-S. P. Wong, S. Mitra. Work in progress 2017 – 2019 w/ TSMC

COMPUTE-MEMORY INTEGRATION

2.5D System

HBM-Type DRAM

Si Logic Die

Si Interposer

Micron Scale Connectivity

Source: W. Hwang, W. Wan, Y. Malviya, H. Li, M. Lee, M. Aly, H.-S. P. Wong, S. Mitra. Work in progress 2017 – 2019 w/ TSMC

COMPUTE-MEMORY INTEGRATION

3D TSV System

HBM-Type DRAM

Si Logic Die

TSV + µBump Connectivity (Micron Scale)

Source: W. Hwang, W. Wan, Y. Malviya, H. Li, M. Lee, M. Aly, H.-S. P. Wong, S. Mitra. Work in progress 2017 – 2019 w/ TSMC

COMPUTE-MEMORY INTEGRATION

N3XT System

High Density On-Chip Nonvolatile Memory

Dense ILV Connectivity (Nanometer Scale)

Si Logic Die

Energy Efficient Logic (Thin Device Layers)

High Speed On-Chip Nonvolatile Memory

Energy Efficient Memory Access Transistors

Nonvolatile Memory Cells

Source: W. Hwang, W. Wan, Y. Malviya, H. Li, M. Lee, M. Aly, H.-S. P. Wong, S. Mitra. Work in progress 2017 – 2019 w/ TSMC

“NEW” MEMORIES FOR COMPUTE-MEMORY INTEGRATION

Random access, non-volatile, no erase before write, on-chip integration

PCM (phase change memory): top electrode, phase change material, switching region, oxide isolation, bottom electrode

RRAM (resistive switching random access memory): top electrode, metal oxide, filament, oxygen ion, oxygen vacancy, bottom electrode

CBRAM (conductive bridge random access memory): active top electrode, solid electrolyte, filament, metal atoms, bottom electrode

STT-MRAM (spin torque transfer magnetic random access memory): soft magnet, tunnel barrier (oxide), pinned magnet, current

FERAM (ferroelectric random access memory): top gate, ferroelectric layer, interface layer, n+ regions, p-Si

Source: H.-S. P. Wong, S. Salahuddin, Nature Nanotech (2015)

NEW MEMORY: HIGH-BANDWIDTH, HIGH-CAPACITY, ON-CHIP

High Bandwidth, High Capacity both critical

2D baseline system

• Accelerator cores
• SRAM on-chip memory
• Off-chip DRAM (LPDDR3)
  • Capacity: 4 GBytes
  • Latency: 50 ns
  • BW: 12 GBytes/s
  • Read/write energy: 17 pJ/bit

New system

• Accelerator cores
• SRAM on-chip memory
• On-chip new memory
  • Capacity: sweep (up to 4 GBytes)
  • Latency: sweep (down to 3 ns)
  • BW: sweep (up to 128 GBytes/s)
  • Read/write energy: 5 pJ/bit
• Off-chip DRAM (LPDDR3)
  • Capacity: (4 GBytes minus New Mem. Cap.)
  • Latency: 50 ns
  • BW: 12 GBytes/s
  • Read/write energy: 17 pJ/bit

Source: Stanford/NTU: M. Aly, S. Mitra, TSMC: Yih (Eric) Wang, K. Akarvardar, 2019
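To get a feel for the memory parameters quoted on this slide, the following back-of-the-envelope sketch compares the time and energy to move the 120 MByte ResNet-152 data set once through each memory. It is not the authors' simulation model: `transfer_cost` is a hypothetical helper, bytes are taken as decimal (1 GByte = 10^9 bytes), the on-chip memory is assumed to run at the top of its bandwidth sweep, and latency and compute overlap are ignored.

```python
MB = 1e6  # decimal megabyte (assumption)
GB = 1e9  # decimal gigabyte (assumption)

def transfer_cost(num_bytes, bandwidth_bytes_per_s, energy_pj_per_bit):
    """Return (seconds, joules) to move num_bytes once, ignoring latency."""
    seconds = num_bytes / bandwidth_bytes_per_s
    joules = num_bytes * 8 * energy_pj_per_bit * 1e-12
    return seconds, joules

data = 120 * MB  # ResNet-152 data size quoted in the talk

# Off-chip LPDDR3: 12 GBytes/s, 17 pJ/bit
dram_t, dram_e = transfer_cost(data, 12 * GB, 17)
# On-chip new memory at the top of the sweep: 128 GBytes/s, 5 pJ/bit
new_t, new_e = transfer_cost(data, 128 * GB, 5)

# Energy-delay product ratio for the data movement alone
edp_ratio = (dram_t * dram_e) / (new_t * new_e)

print(f"DRAM:    {dram_t * 1e3:.2f} ms, {dram_e * 1e3:.2f} mJ")
print(f"On-chip: {new_t * 1e3:.2f} ms, {new_e * 1e3:.2f} mJ")
print(f"EDP ratio (data movement only): {edp_ratio:.1f}x")
```

This covers only raw data movement, so it should not be expected to match the system-level EDP benefits reported in the talk, which also account for compute and for how much of the data actually fits on-chip.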

NEW MEMORY ESSENTIAL REQUIREMENT

ON-CHIP CAPACITY MUST EXCEED DATA SIZE

EDP benefits at 5 ns memory access latency, 5 pJ/bit access energy

[Figure: EDP benefit as a function of on-chip capacity (1 MByte to 4 GBytes) and bandwidth (10 to 128 GBytes/s), one panel per workload.]

ResNet-152 (CNN), 120 MByte data size: annotated EDP benefits of 1.3x, 2.9x to 3.6x, and 4.2x; markers at 120 MByte capacity and 64 GBytes/s.

Language model (LSTM), 2.5 GByte data size: annotated EDP benefits of 2.1x to 8x, 15x to 30x, and 50x; markers at 2.5 GByte capacity and 100 GBytes/s.

Source: Stanford/NTU: M. Aly, S. Mitra, TSMC: Yih (Eric) Wang, K. Akarvardar, 2019
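One way to see why capacity has to exceed the data size is a toy split model, sketched below. It is an assumption for illustration, not the model behind the figure: whatever fits on-chip is streamed at 128 GBytes/s, the remainder comes from off-chip DRAM at 12 GBytes/s, and the two transfers do not overlap. `pass_time_s` is a hypothetical helper.

```python
GB = 1e9  # decimal gigabyte (assumption)

def pass_time_s(data_bytes, on_chip_bytes, on_chip_bw=128 * GB, dram_bw=12 * GB):
    """Time for one pass over the data under the toy split model."""
    on_chip = min(data_bytes, on_chip_bytes)   # portion that fits on-chip
    off_chip = data_bytes - on_chip            # remainder served by DRAM
    return on_chip / on_chip_bw + off_chip / dram_bw

data = 2.5 * GB                   # language model data size from the slide
baseline = pass_time_s(data, 0)   # everything in off-chip DRAM

speedups = {cap: baseline / pass_time_s(data, cap * GB) for cap in (1, 2, 2.5, 4)}
for cap, s in speedups.items():
    print(f"{cap} GBytes on-chip: {s:.1f}x faster than all-DRAM")
```

In this toy model the slow DRAM remainder dominates until the whole data set fits on-chip, after which extra capacity gives no further gain. That is qualitatively the behavior the slide title describes.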

NEW MEMORY ESSENTIAL REQUIREMENT

HIGH BANDWIDTH MORE CRITICAL THAN LATENCY

EDP benefits at 5 pJ/bit access energy

[Figure: EDP benefit as a function of memory access latency (3 to 50 ns) and bandwidth (10 to 128 GBytes/s), one panel per workload.]

ResNet-152 (CNN), 120 MByte data size: annotated EDP benefits of 1.1x to 3x, 4.1x, and 4.2x; markers at 20 ns and 64 GBytes/s.

Language model (LSTM), 2.5 GByte data size: annotated EDP benefits of 1.1x to 20x, 15x to 30x, 20x to 35x, and 50x; markers at 10 ns and 100 GBytes/s.

A further annotation, 2.4x to 3.9x, appears on the slide.

Source: Stanford/NTU: M. Aly, S. Mitra, TSMC: Yih (Eric) Wang, K. Akarvardar, 2019

N3XT: UP TO ~2,000X ENERGY EFFICIENCY BENEFITS

Workload: Inference on ML Accelerator

[Figure: system-level benefits in Energy ✕ Execution Time (log scale, 1 to 10000) for Lang. Model (LSTM), AlexNet (CNN), Captioning (LSTM), ResNet152 (CNN), and VGG19 (CNN). Bar labels: 1971X, 525X, 320X, 159X, 63X.]

N3XT benefits are relative to the 2D baseline system (28 nm silicon CMOS, LPDDR3). Inference: 16-bit data, batch size of 1.

Source: Stanford/NTU: M. Aly, T. Wu, A. Bartolo, H.-S. P. Wong, S. Mitra et al., Proc. IEEE 2019
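The metric on this slide, Energy ✕ Execution Time, is the energy-delay product (EDP). The short sketch below shows how an EDP benefit combines an energy gain and a speedup. The 40x and 50x figures are hypothetical, chosen only because their product is 2,000; the talk does not say how the reported benefits divide between energy and time.

```python
def edp(energy_j, time_s):
    """Energy-delay product: energy multiplied by execution time."""
    return energy_j * time_s

# Hypothetical baseline and improved system (illustrative numbers only).
base_energy, base_time = 2.0, 1.0
new_energy = base_energy / 40   # assumed 40x less energy
new_time = base_time / 50       # assumed 50x shorter execution time

benefit = edp(base_energy, base_time) / edp(new_energy, new_time)
print(f"EDP benefit: about {benefit:.0f}x")
```

Because the two gains multiply, moderate improvements in energy and in execution time compound into a much larger EDP benefit.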

N3XT SYSTEM

High Density On-Chip Nonvolatile Memory

Dense ILV Connectivity (Nanometer Scale)

Si Logic Die

Energy Efficient Logic (Thin Device Layers)

High Speed On-Chip Nonvolatile Memory

Energy Efficient Memory Access Transistors

Nonvolatile Memory Cells


NANOMETER-THIN TRANSISTOR CHANNEL

1D carbon nanotube (CNT): 1 nm

2D TMD (MoS2, WSe2, WS2…): < 1 nm

[Figure: mobility (cm²/V-s, log scale, 1 to 10,000) versus channel thickness (0 to 4 nm) for MoS2, WS2, WSe2, Si, Ge, and CNT. Filled symbols: electron. Open symbols: hole.]

Source: S.-K. Su, … L.-J. Li (TSMC), Nature Nanotech., 2019.

Photo credit: B. Radisavljevic et al., Nature Nanotech., p. 147, 2011

Photo credit: User Mstroeck on en.wikipedia

2D LAYERED MATERIALS (WS2, WSe2)

[Figure: effective mass (m0, 0.1 to 0.5) versus mobility (cm²/V-s, 10 to 1000), with ION (μA/μm) levels of 200, 400, 600, and 800 shown for 20 nm and 10 nm. Labeled points include MoS2-e and WSe2-h. Inset: classical FET with gate (G), source (S), and drain (D).]

Source: C.-C. Cheng et al. (TSMC), Symp. VLSI Tech. 2019

SHORT-CHANNEL CARBON NANOTUBE TRANSISTORS

10 nm Gate Length: VDS = -0.4 V and VDS = 0.4 V, SS = 70 mV/Dec for both

[Figure: Ids (A, 10^-8 to 10^-5) versus Vgs (V, -1.0 to 0.5).]

5 nm Gate Length: VDS = -0.1 V, SS = 73 mV/Dec, Lg = 5 nm

[Figure: Ids (A, 10^-10 to 10^-5) versus Vgs – Vt (V, -1.0 to 0.5).]

Source: C. Qiu, … L.-M. Peng (PKU), Science, 2017
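For context on the subthreshold swing (SS) values quoted here, the sketch below computes the room-temperature thermionic limit, ln(10)·kT/q, and the number of decades of current change a given gate-voltage swing buys at 70 mV/Dec. The 0.4 V gate swing is a hypothetical value chosen for illustration, and both function names are assumptions, not anything from the cited paper.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C

def thermionic_limit_mv_per_dec(temp_k=300.0):
    """Ideal subthreshold swing ln(10)*kT/q, in mV per decade."""
    return math.log(10) * K_B * temp_k / Q_E * 1e3

def decades_of_current(gate_swing_v, ss_mv_per_dec):
    """Decades of drain-current change for a gate swing in the subthreshold region."""
    return gate_swing_v * 1e3 / ss_mv_per_dec

limit = thermionic_limit_mv_per_dec()
decades = decades_of_current(0.4, 70)   # 0.4 V gate swing is an assumed value

print(f"Thermionic limit at 300 K: {limit:.1f} mV/Dec")
print(f"0.4 V at 70 mV/Dec: {decades:.1f} decades of current")
```

The reported 70 to 73 mV/Dec values are therefore close to the roughly 60 mV/Dec room-temperature limit of a conventional FET, even at 5 nm and 10 nm gate lengths.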


CARBON NANOTUBE COMPUTER

Source: M. Shulaker,… H.-S. P. Wong, S. Mitra (Stanford), Nature, 2013

instruction fetch

arithmetic block

data fetch

write-back


Kbit 6T SRAM (6144 CNFETs)

CARBON NANOTUBE FET CMOS SRAM

Source: P. Kanhaiya,… M. Shulaker (MIT), Symp. VLSI Tech., 2019


MEMORY INTEGRATION

ON LOGIC PLATFORM

Better transistor alone


Transistors integrated with memory in 3D

MEMORY INTEGRATION

ON LOGIC PLATFORM

SYSTEM INTEGRATION

A CONTINUUM FROM FAR BACK-END TO FRONT-END

[Figure: normalized density (log scale, 10^0 to 10^8) across integration approaches: Interposer, Chip-on-wafer, Wafer-on-wafer, Monolithic 3D.]

Source: IMEC

SOCIETAL NEED FOR ADVANCED TECHNOLOGY IS INSATIABLE

ADVANCED TECHNOLOGY: A KEY DIFFERENTIATOR


CONTINUOUS BENEFITS NODE AFTER NODE

Continuous transistor & memory advances

Memory logic integration

MULTIPLE ROADS LEAD TO ROME

System integration with high connectivity


A CALL TO ACTION: EARLY ENGAGEMENT

SYSTEM ↔ TECHNOLOGY

ACADEMIA ↔ INDUSTRY RESEARCH


End of Talk

Questions?


COMMITTED TO PROVIDING THE MOST

ADVANCED TECHNOLOGIES