
Review of wafer-level three-dimensional integration (3DI) using bumpless interconnects for tera-scale generation

Takayuki Ohba a), Youngsuk Kim, Yoriko Mizushima, Nobuhide Maeda, Koji Fujimoto, and Shoichi Kodama

ICE Cube Center, Tokyo Institute of Technology (Tokyo Tech),

4259 Nagatsuda, Midori-ku, Yokohama 226–8503, Japan

a) [email protected]

Abstract: The prospects of three-dimensional (3D) integration for Terabyte

large scale integration using bumpless interconnects with low-aspect-ratio

TSVs and ultra-thinning are discussed. Bumpless (no bump) interconnects

between wafers are a second-generation alternative to the use of micro-

bumps for Wafer-on-Wafer (WOW) technology. Ultra-thinning of wafers

down to 4 µm provides the advantage of a small form factor, not only in

terms of the total volume of 3D ICs, but also the aspect ratio of Through-

Silicon-Vias (TSVs). Our bumpless interconnects technology is classified

into Via-Last, which is performed from the front side after thinning, and

stacking Back-to-Front, in which any number of thinned 300mm wafers

and/or heterogeneous dies can be integrated. From an economic point of

view, in many situations WOW is the leading 3D process because stacking at

the wafer level drastically increases the processing throughput, and using

multi-level bumpless interconnects, with individual wiring die-to-die, pro-

vides an appropriate yield that is equivalent to or greater than that achievable

with 2D processes when scaling down to 22 nm nodes and beyond.

Keywords: wafer-on-wafer, bumpless, thinning, TSV, via-last, high den-

sity integration

Classification: Electron devices, circuits, and systems

REVIEW PAPER IEICE Electronics Express, Vol.12, No.7, 1–14

© IEICE 2015

DOI: 10.1587/elex.12.20152002

Received February 17, 2015

Accepted February 24, 2015

Published April 10, 2015

References

[1] T. Ohba: ECS Trans. 34 (2011) 1011. DOI:10.1149/1.3567707

[2] T. Ohba, N. Maeda, H. Kitada, K. Fujimoto, A. Kawai, K. Arai, K. Suzuki and T. Nakamura: IEICE Trans. Electron., Electron. Soc. J93-C [11] (2010) 464 in Japanese.

[3] J. U. Knickerbocker, G. S. Patel, P. S. Andry, C. K. Tsang, L. P. Buchwalter, E. J. Sprogis, H. Gan, R. R. Horton, R. J. Polastre, S. L. Wright and J. M. Cotte: J. Solid-State Circuits 41 [8] (2006) 1718. DOI:10.1109/JSSC.2006.877252

[4] M. Koyanagi, T. Nakamura, Y. Yamada, H. Kikuchi, T. Fukushima, T. Tanaka and H. Kurino: IEEE Trans. Electron Dev. 53 (2006) 2799. DOI:10.1109/TED.2006.884079

[5] E. Beyne, P. De Moor, W. Ruythooren, R. Labie, A. Jourdain, H. Tilmans, D. S. Tezcan, P. Soussan, B. Swinnen and R. Cartuyvels: IEEE IEDM Tech. Dig. (2008) 495. DOI:10.1109/IEDM.2008.4796734

[6] J.-Q. Lu: Proc. IEEE 97 (2009) 18. DOI:10.1109/JPROC.2008.2007458

[7] N. Maeda, H. Kitada, K. Fujimoto, K. Suzuki, T. Nakamura and T. Ohba: Proc. Advanced Metallization Conference 2008 (2009) 501.

[8] Y. S. Kim, A. Tsukune, N. Maeda, H. Kitada, A. Kawai, K. Arai, K. Fujimoto, K. Suzuki, Y. Mizushima, T. Nakamura, T. Ohba, T. Futatsugi and M. Miyajima: IEEE IEDM Tech. Dig. (2009) 365. DOI:10.1109/IEDM.2009.5424349

[9] T. Ohba, N. Maeda, H. Kitada, K. Fujimoto, K. Suzuki, T. Nakamura, A. Kawai and K. Arai: Microelectron. Eng. 87 (2010) 485. DOI:10.1016/j.mee.2009.07.006

[10] H. Kitada, N. Maeda, K. Fujimoto, K. Suzuki, A. Kawai, K. Arai, T. Suzuki, T. Nakamura and T. Ohba: IEEE IITC (2009) 107. DOI:10.1109/IITC.2009.5090354

[11] Y. Mizushima, H. Kitada, K. Koshikawa, S. Suzuki, T. Nakamura and T. Ohba: Jpn. J. Appl. Phys. 51 (2012) 05EE03. DOI:10.7567/JJAP.51.05EE03

[12] O. Nakatsuka, H. Kitada, Y. Kim, Y. Mizushima, T. Nakamura, T. Ohba and S. Zaima: Jpn. J. Appl. Phys. 50 (2011) 05ED03. DOI:10.7567/JJAP.50.05ED03

[13] Y. Mizushima, H. Kitada, C. J. Uchibori, N. Maeda, S. Kodama, Y. Kim, K. Fujimoto, S. Yoshimi, T. Nakamura and T. Ohba: Jpn. J. Appl. Phys. 52 (2013) 05FE01. DOI:10.7567/JJAP.52.05FE01

[14] Y. S. Kim, S. Kodama, Y. Mizushima, N. Maeda, H. Kitada, K. Fujimoto, T. Nakamura, D. Suzuki, A. Kawai, K. Arai and T. Ohba: IEEE VLSI Symp. (2014) 22. DOI:10.1109/VLSIT.2014.6894347

[15] Y. Kwon, A. Jindal, R. Augur, J. Seok, T. S. Cale, R. J. Gutmann and J.-Q. Lu: J. Electrochem. Soc. 155 (2008) H280. DOI:10.1149/1.2844449

[16] H. Kitada, N. Maeda, K. Fujimoto, Y. Mizushima, Y. Nakata, T. Nakamura and T. Ohba: Jpn. J. Appl. Phys. 50 (2011) 05ED02. DOI:10.7567/JJAP.50.05ED02

[17] D. Diehl, H. Kitada, N. Maeda, K. Fujimoto, S. Ramaswami, K. Sirajuddin, R. Yalamanchili, B. Eaton, N. Rajagopalan, R. Ding, S. Patel, Z. Cao, M. Gage, Y. Wang, W. Tu, S. W. Kim, R. Kulzer, I. Drucker, D. Erickson, T. Ritzdorf, T. Nakamura and T. Ohba: Microelectron. Eng. 92 (2012) 3. DOI:10.1016/j.mee.2011.01.082

[18] Y. S. Kim, N. Maeda, H. Kitada, K. Fujimoto, S. Kodama, A. Kawai, K. Arai, K. Suzuki, T. Nakamura and T. Ohba: Microelectron. Eng. 107 (2013) 65. DOI:10.1016/j.mee.2012.10.025

[19] Y. Mizushima, Y. Kim, T. Nakamura, R. Sugie, H. Hashimoto, A. Uedono and T. Ohba: Jpn. J. Appl. Phys. 53 (2014) 05GE04. DOI:10.7567/JJAP.53.05GE04

[20] A. Uedono, Y. Mizushima, Y. Kim, T. Nakamura, T. Ohba, N. Yoshihara, N. Oshima and R. Suzuki: J. Appl. Phys. 116 (2014) 134501. DOI:10.1063/1.4896829

1 Introduction

Prior to discussing 3D integration for high-volume manufacturing, it is necessary to

investigate the future prospects of semiconductor technology development. Con-

ventional two-dimensional (2D) scaling will face a severe economic crisis due to

the expensive lithography processes and facilities required [1, 2]. Reducing costs



requires the adoption of advanced lithography technologies, which, together with

peripheral support facilities, account for one-third to one-fourth of the total cost of a

manufacturing line. In short, while useful for reducing chip size, scaling is

extremely burdensome in terms of capital investment. Large-scale investments

have so far been made considering the technologies that will be available two to

three generations ahead, e.g., 10 nm technology should be applied to nodes of

<5 nm, which will also face physical limitations. This is based on the empirical rule

that profits are made several generations after investments for reasons involving the

tradeoffs between product sales and facility depreciation.

According to this empirical rule, an investment in 22 nm technology needs to

be made in consideration of its applicability to 10–14 nm technologies. However, the

price of extreme ultraviolet (EUV; λ = 13.5 nm) lithography machines is about 100

million USD, which is more than twice that of ArF immersion lithography machines,

and their current throughput is around one-tenth or less. When converted into the

processing capacity of a current large-scale fabrication facility (e.g., 50,000 incom-

ing wafers per month), based on this system performance, an investment of

approximately 2 billion USD will be required for EUV technology. Assuming that

the past lifelong sales for each generation are approximately 10 times the corresponding

business investment, the corresponding market size necessary for this

investment is more than 20 billion USD. Based on the 300 billion USD total

worldwide semiconductor market, this expected market size for one product and

one manufacturer is not realistic. In short, this is the limit of 2D scaling in light of the

economics of the industry, and it is difficult to find a scenario of victory at present.
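The arithmetic behind this estimate can be laid out explicitly. The following Python sketch only restates the round figures from the paragraph above; the tool count it derives (investment divided by unit price) is an inference made for illustration, not a number given in the text, and the function names are hypothetical.

```python
# Back-of-envelope check of the EUV investment figures quoted in the text.
# All inputs are the round numbers from the paragraph; helper names are illustrative.

EUV_TOOL_PRICE_USD = 100e6       # about 100 million USD per EUV machine
FAB_INVESTMENT_USD = 2e9         # about 2 billion USD for a 50,000 wafer/month fab
SALES_TO_INVESTMENT_RATIO = 10   # lifelong sales are roughly 10 times the investment
WORLD_MARKET_USD = 300e9         # total worldwide semiconductor market


def implied_tool_count(investment_usd: float, tool_price_usd: float) -> float:
    """Number of machines the investment would buy at the quoted unit price."""
    return investment_usd / tool_price_usd


def required_market_usd(investment_usd: float, ratio: float) -> float:
    """Lifelong sales needed to justify the investment under the empirical rule."""
    return investment_usd * ratio


tools = implied_tool_count(FAB_INVESTMENT_USD, EUV_TOOL_PRICE_USD)
market = required_market_usd(FAB_INVESTMENT_USD, SALES_TO_INVESTMENT_RATIO)
share = market / WORLD_MARKET_USD

print(f"implied EUV tools: {tools:.0f}")
print(f"required market: {market / 1e9:.0f} billion USD")
print(f"share of world market: {share:.1%}")
```

Under these inputs, a single product line from a single manufacturer would need to capture roughly one-fifteenth of the entire semiconductor market, which is the sense in which the text calls the scenario unrealistic.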

Extension into vertical space, such as 3D stacking, in combination with

conventional 2D integration, is anticipated to overcome these problems [1, 2, 3,

4, 5, 6]. Fig. 1 shows a comparison of bump and bumpless interconnects using

through-silicon vias (TSVs) for 3D logic/memory stacked structures, assuming six

dies for a memory stack and one multi-core microprocessor. It is possible to make a

roadmap to achieve high-density integration backed up by production costs. A

stack containing three 300mm wafers provides a total silicon surface larger than

that of a single 450mm wafer. Moreover, retaining the standard 300mm wafer size

for stacking ensures compatibility with existing manufacturing facilities in front-

end processing and helps utilize the mature process technology that has been

developed for wafer processing.
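The surface-area comparison between a three-wafer 300 mm stack and a single 450 mm wafer can be verified with a few lines of Python. This is a minimal sketch that treats each wafer as a full disc; edge exclusion, notches, and die layout are ignored, which is an assumption not discussed in the text.

```python
import math


def wafer_area_cm2(diameter_mm: float) -> float:
    """Full-disc area in cm^2 (edge exclusion and notch ignored)."""
    radius_cm = diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2


stack_area = 3 * wafer_area_cm2(300.0)   # three stacked 300 mm wafers
single_area = wafer_area_cm2(450.0)      # one 450 mm wafer
ratio = stack_area / single_area

print(f"3 x 300 mm: {stack_area:.0f} cm^2")
print(f"1 x 450 mm: {single_area:.0f} cm^2")
print(f"ratio: {ratio:.2f}")
```

Because area scales with the square of the diameter, a 450 mm wafer offers 2.25 times the area of a 300 mm wafer, so three stacked 300 mm wafers exceed it by about one-third.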

This paper reviews wafer-level 3D integration and compares its manufactur-

ability with conventional 2D scaling. For vertical wiring, bumpless interconnects

using TSVs, ultra-thinning technologies for Wafer-on-Wafer, and Terascale gen-

eration devices are also described.

2 Wafer-level 3D integration process

Bumpless interconnects using TSVs are a second-generation alternative to the use

of TSVs with micro-bumps [7, 8, 9]. Our bumpless interconnects process involves

a Thinning-First process before bonding wafers, followed by a Via-Last process,

meaning that interconnects are formed after bonding the wafers, as shown in

Figs. 2 and 3. Via-hole etching is carried out on a silicon substrate having a




dielectric layer of multilevel interconnects after wafer thinning. Since bumpless

Wafer-on-Wafer (WOW), including Chip-on-Wafer (COW), technologies use a

Back-to-Front stack, in principle any number of thinned 300mm wafers can be

stacked to fabricate large-capacity memory and logic devices. This wafer stacking

method is similar to multilevel metallization in the Back-End-of-Line (BEOL), as if

replacing dielectric deposition using thinned wafers and Al and/or Cu metallization

with bumpless interconnects using TSVs.

The development of WOW has proceeded through four modules, classified

along the process flow. The modules include a thinning module for thinning the

wafer substrates in which devices are implemented, a stacking module for bonding

and stacking the wafers, a TSV interconnects module for forming Cu interconnects

embedded in upper and lower wafers with TSVs, and a packaging module for

singulating the stacked wafers. Dual-Damascene interconnects form a so-called

redistribution layer (RDL) and also serve as a counter electrode for the subsequent

stacked wafer.

The thickness of the thinned wafer is a critical dimension for the aspect ratio of

TSVs because the aspect ratio is determined by the diameter and the wafer

thickness. However, since the thinned wafer is bonded on a base wafer, there is

no need to take measures for handling ultrathin wafers. In our recent study, the

typical thickness of a thinned wafer ranged from 5 to 10µm. When the thicknesses

of the device layers in a DRAM and an MPU are assumed to be approximately

5 µm and 10µm, respectively, the aspect ratio (depth-to-diameter ratio) of a TSV is

only 2–4 for a TSV diameter ranging from 5 to 10µm, whereas conventional TSVs

with bumps have aspect ratios ranging from 5 to 10. With decreasing aspect ratio, in the TSV processes, such as etching, thin film deposition, and metal filling, the process time decreases to about 1/5 at most, and the step coverage significantly improves.

Fig. 1. A comparison of bump and bumpless interconnects using TSVs for 3D logic/memory stack structures, assuming six dies for a memory stack and one multicore microprocessor. Bumpless interconnects can be formed with higher density (narrower pitch) compared with TSVs and bumps due to the limitations of bump size and pitch. Since Back-to-Front stacking and a TSV-Last process have no limit on the number of stacks, high density and Tera-scale bandwidth can be anticipated using mature existing devices.
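The aspect ratios quoted in this section follow from simple geometry. The sketch below is illustrative only: it assumes that the TSV depth equals the thinned Si thickness plus the device-layer thickness (adhesive ignored), pairs the 5 µm and 10 µm values in one plausible way, and fixes the via diameter at 5 µm. None of these pairings is stated explicitly in the text.

```python
def aspect_ratio(depth_um: float, diameter_um: float) -> float:
    """Depth-to-diameter ratio of a via."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    return depth_um / diameter_um


# Assumed for illustration: depth = thinned Si + device layer, 5 um via diameter.
VIA_DIAMETER_UM = 5.0
stacks = {
    "DRAM-like (5 um Si + 5 um device layer)": 5.0 + 5.0,
    "MPU-like (10 um Si + 10 um device layer)": 10.0 + 10.0,
}

ratios = {name: aspect_ratio(depth, VIA_DIAMETER_UM) for name, depth in stacks.items()}
for name, ratio in ratios.items():
    print(f"{name}: aspect ratio {ratio:.1f}")
```

With these assumed inputs the ratios fall in the 2 to 4 range cited for bumpless TSVs, compared with 5 to 10 for conventional TSVs with bumps.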

With the use of small TSVs, stress induced by a mismatch in the Coefficients of Thermal Expansion (CTE) between Cu and Si decreases with decreasing aspect ratio of the TSV, as shown in Fig. 4 [10]. Stress at the center of the Cu plug decreases in proportion to the thickness of the Si wafer. The small aspect ratio provided by an ultrathin wafer also has the advantage of reducing stresses generated in the silicon itself, in the bottom and top Cu-TSV, and in interface regions having different CTEs [11, 12, 13].

Fig. 2. Bumpless interconnects using TSVs and wafer-on-wafer (WOW) process flow. Additional wafers can be stacked on top without any limitation on the number of wafers. These modules can also be applied to chip-on-wafer (COW) integration. On-chip and off-chip TSV respectively represent bumpless interconnects formed in the device area and the area around devices, including gap fill materials in COW.

Fig. 3. TSV formation and Cu plug processes. After bonding of thinned wafer to another wafer surface, RDL (re-distribution line) patterning, TSV etching, Cu plug formation, and planarization by CMP are carried out.

3 Details of WOW processes

3.1 Thinning module

A wafer is bonded to a support substrate (glass or Si wafer) in advance with a

temporary adhesive. Thinning is performed by grinding (Back Grind, or BG) within

several micrometers of the target thickness, followed by polishing until the final

thickness is achieved. The thinned wafer is permanently bonded to the device

surface of another wafer, and then the support substrate is removed. In the case of

WOW employing bumps between wafers, simply thinning the wafer causes its

rigidity to decrease, and the limit is thinning to 50–100µm. By using bumpless

WOW processes, a 300 mm wafer on which 32 nm-node SDRAM devices (2 Gb

DDR3) are fabricated can be thinned down to 4 µm, which is just 0.5% of the initial

thickness, with a total thickness variation (TTV) of around 1 µm [14]. With a wafer

thickness of 4 µm, visible light starts to pass through the wafer. Remarkably, no

degradation of retention time before and after thinning was observed, even in a Si

wafer with a thickness of 4 µm.

3.2 Stacking module

An organic adhesive such as benzo-cyclo-butene (BCB) polymer [15] with a thickness of approximately 5 µm was used to permanently bond the wafer. BCB adhesives start to polymerize with increasing temperature and are solidified at temperatures of 200–250 °C. For WOW, the wafers are aligned just before being permanently bonded. To ensure appropriate alignment, infrared light passing through the silicon substrate is used. Wafers to be bonded to one another in WOW are originally thin and are therefore highly transmissive. With a low-temperature bonding process and an optimized curing duration, the average misalignment between wafers can be made as small as several micrometers. On the other hand, because any gaseous solvent escaping from the adhesive after the bonding process would form a cavity (void) in the adhesive layer, measures should be taken to prevent this, such as preheating after applying the adhesive or performing the bonding process under a reduced pressure.

Fig. 4. Stress simulation using the Finite Element Method (FEM) for Si thicknesses of 100 µm (right) and 20 µm (left) after stacking three wafers and forming a TSV with a diameter of 30 µm. A 10 µm Cu/low-k BEOL layer is formed on every wafer surface, and thus the depths of the TSVs are 110 µm and 30 µm, respectively.

3.3 TSV (Through-Silicon-Via) module

For bumpless interconnects including RDLs, the Damascene method is employed

to simplify the processes. For TSV processing, dry etching through the dielectrics

(device layer), Si, and adhesive layer is carried out. TSVs with a small aspect ratio,

for example, <3, have the advantage of shortening the process time for both etching

and metal filling compared with conventional large TSVs. Assuming that the etching rate follows the mass transport limit reaction, the etching times, $t$ and $t_1$, at different TSV diameters, $D$ and $D_1$, and depths, $d$ and $d_1$, follow $t_1/t = (D_1/D)^2 \times (d_1/d)$, that is, $t_1/t = 0.1$ at $D = D_1$, $d = 50$ µm, and $d_1 = 5$ µm, which suggests 1/10th the etching time for the same TSV diameter and 1/10th the depth.

After TSV etching, a Plasma Enhanced Chemical Vapor Deposition (PECVD) SiN or SiO2 film is deposited to provide electrical insulation from the Si substrate. The barrier dielectric at the bottom of the TSV is removed by bias sputtering, and Ti/TiN and Cu are deposited as the barrier metal and the seed layer, respectively, by sputtering. For Cu plug interconnects, electrochemically plated Cu (ECP-Cu) is used. ECP-Cu planarization is carried out by chemical mechanical polishing (CMP).
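The etching-time scaling stated in this subsection, in which the ratio of etching times equals the squared ratio of via diameters multiplied by the ratio of via depths, is easy to evaluate numerically. The Python sketch below is a direct transcription of that relation under the same mass-transport-limited assumption; the function name and the 10 µm diameter used in the example are illustrative choices, since the text only requires that the two diameters be equal.

```python
def etch_time_ratio(D1_um: float, D_um: float, d1_um: float, d_um: float) -> float:
    """Return t1/t = (D1/D)^2 * (d1/d) for a mass-transport-limited etch.

    D1_um, D_um: via diameters of the new and reference TSV.
    d1_um, d_um: via depths of the new and reference TSV.
    """
    if min(D1_um, D_um, d1_um, d_um) <= 0:
        raise ValueError("all dimensions must be positive")
    return (D1_um / D_um) ** 2 * (d1_um / d_um)


# Example from the text: same diameter, depth reduced from 50 um to 5 um.
ratio = etch_time_ratio(D1_um=10.0, D_um=10.0, d1_um=5.0, d_um=50.0)
print(f"t1/t = {ratio:.2f}")
```

The example reproduces the factor of 0.1 quoted in the text: at equal diameter, a via one-tenth as deep needs one-tenth of the etching time.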

Fig. 5 shows the via structure after etching and the leakage current, comparing the

Bosch and direct etching methods [16, 17]. Since Bosch etching was

conducted by cyclic isotropic-etching and deposition, micro-steps called scalloping

were formed at the sidewall of the via. The scalloping causes cracks in the dielectric

layer and poor step coverage for thin films deposited by CVD and Physical Vapor

Deposition (PVD). In contrast, anisotropic direct etching resulted in a smooth

surface profile along the side wall. The leakage current in Bosch etching was one

order of magnitude higher than that in direct etching. The leakage current was

caused by Cu diffusion at the side wall of the TSV that took place at a thinner part

of the dielectrics containing cracks. Thus, direct etching is suitable for TSV

processing and enables the use of low-aspect-ratio vias. In addition, low-aspect-

ratio Cu causes lower stress concentration in thermal processing compared with

high-aspect-ratio Cu. Low stress reduces Cu deformation and stress propagation to

the device regions.

3.4 Packaging module (singulation/packaging module)

After wafer stacking, the same procedure as in conventional packaging (BGA,

singulation by dicing, resin packaging) is followed. After dicing of a seven-level

wafer stack, the adhesive layer and silicon chips were found to be free of defects or

delamination. After the stacked chips were packaged with epoxy resin, they were




subjected to heat stress testing at temperatures of −65 °C to 150 °C. Scanning

acoustic tomography (SAT) was adopted for internal observation, and after up to

100 repeated heat stress tests, no delamination was found at the interfaces between

the molding compound and chips, nor at the chip stack interfaces.

4 Thinning and device characteristics

The effect of wafer thinning on device performance was examined to evaluate the

thinning limits for 300mm silicon wafers. In the case of DRAM devices, following

the WOW process, the wafer was thinned to a final thickness of 4 µm, which is

about 0.5% of the thickness of the bulk wafer (725µm) and thinner than the device

layer shown in Fig. 6 [14]. The total thickness variation (TTV) within the 300mm

wafer was low enough to realize multi-stacking: a TTV of 1.02 µm was achieved at

an average thickness of 4 µm. No significant change in retention time (refresh

time) across the entire wafer before and after thinning was observed for the Si

thicknesses of 40, 20, 8, and 4 µm. This suggests that the thinning process

developed in this study did not affect the junction leakage current, which degrades

the retention time more sensitively than other leakage phenomena such as sub-

threshold leakage, capacitor dielectric leakage, and gate induced drain leakage

(GIDL). Since an ultra-thinned wafer in the WOW process is bonded onto the

bottom wafer in a so-called Bonding-First process, TSVs are processed after

bonding (TSV-Last process), and there are none of the issues seen in the TSV-First

process and/or the TSV-Middle process.

Fig. 5. Cross-sectional TEM images and leakage current of two types of TSV samples made by Bosch etching and direct etching. Cracks are observed in the Bosch-etched sample, which had a rough interface due to scalloping. The leakage current as a function of applied voltage after annealing at temperatures up to 400 °C was measured. With increasing temperature, the leakage current increased but was two orders of magnitude higher in Bosch etching. SEM images of TSV etched off through Cu/Low-k BEOL layer and Si after optimization of scalloping shape (right).

Fig. 7 shows a schematic diagram of the grinding process. The wafer thickness uniformity after grinding was determined by the contact angle between the wheel and wafer surface [18]. The wafer was very slightly bowed after bonding due to

deformation at the wafer edge where temporary adhesive was applied, and this

uniformity was also reflected in the contact angle. By adjusting the contact angle to

follow the geometric shape of the wafer, the TTV decreased to as low as 0.5 µm

within the 300mm wafer. Wafer thinning was carried out as follows: coarse

grinding (#320 grit size) down to ∼50µm, fine grinding (#2000 grit size) to

<20µm, and post-thinning using chemical mechanical polishing (CMP), as shown

in Fig. 8. With these thinning processes, the thickness of the damaged layer,

including point defects such as vacancy-type defects, was decreased from mi-

crometer level to several nanometers, as evaluated by TEM and Positron Annihi-

lation Spectroscopy analyses [19, 20].

Fig. 6. Cross-sectional SEM images (left) after thinning and refresh time (right) of 2Gb SDRAM devices before and after thinning. Silicon wafer was 4 µm in thickness, which was thinner than the device layer. No degradation of the refresh time after thinning to 40, 20, 8, and 4 µm was found.

Fig. 7. Total thickness variation (TTV) improvement using so-called Auto-TTV process employing NCG (non-contact gauge) methods (left) and TTVs for 300 mm wafer with various inclination angles (right).


5 Manufacturing for Terabit generation

5.1 Manufacturing roadmap toward the next generation of devices

Because our method allows thinning of silicon wafers down to 4 µm without any

degradation of the device characteristics, the total wafer thickness, including the

device layer and the adhesive layer, becomes 10 to 20 µm, which is 1/10th to

1/100th the thickness of conventional bump interconnects using TSVs. Even if the

number of stacked wafers is 100, assuming that the wafer thickness is 10 µm, the

total thickness after stacking is only 1mm, which satisfies current packaging

standards. Following these multilevel stacking processes with a conventional memory device fabricated with 22 nm technology and having a memory density of 30 Gb/cm², stacking four, eight, sixteen, etc. of these devices increases the total capacity of the memory device to 120 Gb, 240 Gb, 480 Gb, etc., respectively. Terabit-capacity memory can be realized by stacking only 40 wafers; in contrast, to achieve equivalent capacity with a single wafer using extreme scaling would require 1 nm processing technology, as shown in Fig. 9. Considering the

technology roadmap, the issues of scaling technology and technology for fabricat-

ing three-dimensional structures are often discussed separately. However, these two

technologies are not always mutually exclusive. Scaling would be relieved of the

stringent requirements by using three-dimensional high-density integration tech-

nology combined with mass-production technology. In other words, a sufficiently

long learning period would be ensured, and further cost reductions could be

expected by concentrating on the control of variations among generations and

shortening the process.
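The stack-thickness and capacity figures in this subsection are simple multiples of the per-layer values. The following Python sketch tabulates them, assuming a uniform 10 µm per stacked layer and 30 Gb/cm² per layer as in the text; treating capacity as a density per square centimeter of footprint is an interpretation made here for illustration, and the function names are hypothetical.

```python
LAYER_THICKNESS_UM = 10.0      # thinned wafer incl. device and adhesive layers
DENSITY_GB_PER_CM2 = 30.0      # 22 nm-class memory density per layer


def stack_thickness_mm(n_layers: int, layer_um: float = LAYER_THICKNESS_UM) -> float:
    """Total stack thickness in millimeters."""
    return n_layers * layer_um / 1000.0


def stacked_density_gb_per_cm2(n_layers: int, density: float = DENSITY_GB_PER_CM2) -> float:
    """Memory density per unit footprint area after stacking n_layers."""
    return n_layers * density


for n in (4, 8, 16, 40, 100):
    print(
        f"{n:>3} layers: {stack_thickness_mm(n):.2f} mm, "
        f"{stacked_density_gb_per_cm2(n):.0f} Gb/cm^2"
    )
```

The 40-layer row gives 1200 Gb/cm², which is above the terabit mark, and the 100-layer row gives a 1 mm stack, matching the values quoted in the text.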

5.2 Considering the yield in wafer stacking

In the case of die-to-die series connections, the total yield in wafer stacking can be estimated by $Y^n$, where $Y$ and $n$ are the yield of one wafer and the number of stacked wafers, respectively. With bumpless interconnects using high-density multi-TSVs and a controller, output signals from one die and/or one memory bank (channel) can be connected to the upper and lower stacked chips independently, as shown in Fig. 10. Hence, an unreliable device and/or bank unit in the stack can be ignored and thus counted as an independent probability event, resulting in a total yield in wafer stacking of $Y_{3D} \geq Y^n$.
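To see how quickly the series-connection yield falls with the number of layers, the per-wafer yield raised to the power of the layer count can be evaluated directly. This minimal Python sketch uses the die yield of 0.64 that appears in the discussion that follows; the function name is hypothetical, and the calculation covers only the series-connected baseline, not the improved yield of independently wired stacks.

```python
def series_stack_yield(die_yield: float, n_layers: int) -> float:
    """Yield of a stack in which every layer must be good: Y to the power n."""
    if not 0.0 <= die_yield <= 1.0:
        raise ValueError("die_yield must lie in [0, 1]")
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    return die_yield ** n_layers


yields = {n: series_stack_yield(0.64, n) for n in (1, 2, 4, 8)}
for n, y in yields.items():
    print(f"n = {n}: stack yield = {y:.3f}")
```

At four layers the series yield is already below 0.17, which is why wiring each die or bank independently, so that a bad unit can be ignored, matters for stacked devices.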

Fig. 8. Cross-sectional TEM images of Si wafer after thinning. Thinning was carried out by coarse grinding (#320 grit size), fine grinding (#2000 grit size), and stress relief (post-thinning) using Chemical Mechanical Polishing (CMP).


In the case of simplified die-level yield, the yield is estimated using the value $Y = 0.64$ for a case where four wafers are stacked. The probabilities of four good dies and three good dies in such a stack are 0.17 and 0.86, respectively. Stacking four dies and three dies with a memory density of 30 Gbit/cm² (equivalent to 22 nm technology) achieves capacities of 120 Gbit and 90 Gbit, respectively. If the number of effective chips is 1700 per 300 mm wafer at a chip size of ∼0.4 cm² at 8 Gbit/die, the expected number of chip sets per unit memory capacity is 289/32 Gbit (4 stack), 1462/24 Gbit (3 stack), and 1088/8 Gbit (single). To realize a capacity of over 24 Gbit with a single chip would require technology two or three generations ahead, such as 10–14 nm technologies. The defects, however, are not proportional to the reduction in area. This is because, as scaling proceeds, so-called stealth defects (unobservable defects) increase, and the control of process variations approaches its limit. When stealth defects become dominant, variations cannot be improved statistically, and thus die yield decreases with scaling. According to an empirical profit model, the investment in 22 nm technology needs to be made in consideration of future 8–10 nm technologies because of the tradeoffs between product sales and facility depreciation after huge investments. The price of extreme ultraviolet (EUV) lithography machines is approximately twice that of ArF lithography machines, and their current throughput is around one-tenth. This is the limit of scaling in light of the economics of the industry, and it is difficult to paint a scenario of victory at present.

Fig. 9. Trends toward DRAM density using 2D conventional scaling and 3D multi stacking using existing DRAM. DRAM capacity in the 3D case corresponds to the number of stacked dies, assuming that redundancy is eliminated by cell-blocks at each layer.

Fig. 10. Schematic diagram of DRAM stack structure. Here, one memory die has 16 channels (CH0 to CH15) in total, stacked following DRAM2, DRAM1, DRAM0, and then controller to DRAM3 (base wafer), using bumpless interconnects and a WOW process. Bumpless interconnects are connected independently to the controller die from each channel of the DRAM layers.
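The chip-set counts quoted in this subsection can be reproduced from the stated inputs. The Python sketch below takes the probabilities 0.17, 0.86, and 0.64 exactly as given in the text rather than re-deriving them, and multiplies each by the 1700 effective chips per wafer. The last column, total good capacity, is an extra quantity computed here for illustration and does not appear in the text.

```python
EFFECTIVE_CHIPS_PER_WAFER = 1700   # chips per 300 mm wafer at ~0.4 cm^2 per chip
GBIT_PER_DIE = 8

# (label, usable dies per set, probability as quoted in the text)
scenarios = [
    ("4 stack", 4, 0.17),
    ("3 stack", 3, 0.86),
    ("single", 1, 0.64),
]

results = []
for label, dies, probability in scenarios:
    chip_sets = round(EFFECTIVE_CHIPS_PER_WAFER * probability)
    capacity_gbit = dies * GBIT_PER_DIE
    results.append((label, capacity_gbit, chip_sets))
    # Extra illustrative figure: total good capacity across all chip sets.
    total_gbit = chip_sets * capacity_gbit
    print(f"{label}: {chip_sets} sets x {capacity_gbit} Gbit = {total_gbit} Gbit")
```

The counts of 289, 1462, and 1088 chip sets at 32, 24, and 8 Gbit match the figures in the text.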

6 Conclusions

WOW technology and bumpless interconnects using TSVs for three-dimensional stacking in wafer form have been described. It was found that an optimized wafer thinning process for determining the stack thickness does not cause degradation of the device characteristics in advanced commercial devices, even with the smallest thickness of 4 µm that we achieved. Because bumpless interconnects using TSVs can be connected directly to the upper and lower substrates by self-alignment, the package thickness can be reduced by an amount equivalent to electrodes, such as bumps, which are not necessary when bumpless interconnects are used in combination with wafer thinning. Since the design pitch of conventional TSVs is determined by the bump size, removing the bumps allows high-density TSVs to be formed by following TSV patterning processes. At the same time, size reduction of the finished shape allows the wiring between the upper and lower chips to be made shorter, which reduces the total wiring impedance and makes it easier to ensure high bandwidth with higher energy efficiency. Furthermore, by stacking wafers, high-density integration and system block arrangements become more flexible, and the design space is extended.

Fig. 11. Trends in two-dimensional (2D) scaling and wafer size including total Si surface of wafer stack. Conventional scaling will face difficulties such as physical limits and inability to minimize costs, whereas 3D integration will become superior to scaling. By combining conventional two-dimensional integration (2DI) with three-dimensional stacking to overcome such problems associated with device scaling and increasing wafer size, it is possible to make a roadmap towards high-density integration backed up by production costs. In volume production, 3D wafer stacking (WOW) enables a lower cost than Chip-on-Chip (COC) and high-density integration, reaching Terabit level. Bumpless interconnects using TSVs and ultra-thinning provide high-density I/Os connecting top and bottom device layers and achieve a small form factor 1/10th that of bump structures.

In combination with three-dimensional stacking for overcoming the problems

associated with scaling, a roadmap towards high-density integration backed up by

production costs can be formulated. The ability to stack chips while keeping the

wafer shape unchanged ensures compatibility with existing manufacturing facilities

in front-end processing and makes use of the technologies that have been nurtured

for wafer processing. If processes up to three-dimensional stacking can be handled

as units in the manufacturing line, the throughput will be one-hundred times greater

than stacking starting with chips. Therefore, future semiconductor manufacturing is

expected to advance with a roadmap in which the number of stacked wafers, the

wafer thickness, and the number of TSV interconnects serve as indices, as shown in

Fig. 11.

Acknowledgments

This study was carried out based on the three-dimensional integration development

program by the WOW alliance of Tokyo Institute of Technology, and the authors

thank the alliance members, Nagoya University, Tsukuba University, the University

of Tokyo, and WOW Research Center Ltd., for their cooperation.

Takayuki Ohba received his Ph.D. degree from Tohoku University in electrical communication. After joining Non-aqueous Solution Chemical Laboratories at Tohoku University, he entered the Process Development Department of Fujitsu Limited in 1984. In the early 1990s, his distinguished achievements were W and TiN CVD processes, which became widely used in manufacturing. He developed total integration processes for 90-nm high-performance CPU logic devices. He was appointed to The University of Tokyo in 2004 and moved to Tokyo Institute of Technology in 2013. Since 2007, Prof. Ohba has led a global alliance, the so-called WOW Alliance, for the post-scaling era; the WOW Alliance is developing three-dimensional integration (3DI) and bumpless (no bump) interconnects technology for memory-memory and memory-logic 3DIs. His basic interest is process integration in 2D/3D interconnects and cooling devices to overcome heat issues. Prof. Ohba has contributed more than 100 papers through these studies. He is a member of the JAP, IEICE, IEEE, MRS, MAM, AMC, ADMETA, and 3DIC.

Youngsuk Kim received his Ph.D. degree in Material Engineering from the University of Tokyo in 2001. He joined Fujitsu Limited in 2001. He developed high-performance transistors for CPUs and was responsible for integrating front-end-of-line processes from the 90 to 45-nm nodes. In 2010, he moved to DISCO Corp. and worked at the University of Tokyo as a researcher. He has developed Wafer-on-Wafer (WOW) integration processes for 3D-IC and cooling devices at Tokyo Institute of Technology as an associate professor. He has contributed more than 40 publications and 30 patents.

Yoriko Mizushima joined Fujitsu Ltd., Kawasaki, Japan, in 1985 and engaged in the analytical evaluation of semiconductor materials and processes, working especially on organic material analyses. She moved to Fujitsu Laboratories Ltd., Atsugi, in 1999 and has been working on the analytical evaluation of the Cu metallization process. Since 2008, she has participated in the 3D stacking technology development group and developed stress simulation for bumpless interconnects using low-aspect-ratio TSVs. She received the ADMETA Award in 2013 at the Advanced Metallization Conference. She received the B.S. degree from Japan Women's University, Tokyo, Japan, in 1985 and has been a visiting researcher at Tokyo Institute of Technology, Ohba Lab., since 2010.

Nobuhide Maeda received the B.S., M.S., and Ph.D. degrees from Hiroshima University, Higashi-Hiroshima, Hiroshima, Japan, in 1991, 1993, and 1996, respectively. In 1996 he joined Florida State University. From 1996 to 2000, he worked on research in high-energy nuclear physics. From 2000 to 2003, he belonged to Hiroshima University as a research associate of the Research Center for Nanodevices and Systems. From 2003 to 2007, he belonged to the Consortium for Advanced Semiconductor Materials and Related Technologies (CASMAT). From 2008 to 2013, he belonged to the University of Tokyo. Since 2014, he has been with the Tokyo Institute of Technology. He is engaged in process technologies for 3D-IC. He is a member of JSAP, JPS, MRS, and APS.

Koji Fujimoto received his Ph.D. degree from Boston University in 2002 for his work on MEMS mirror devices. He joined the Research and Development Center of DNP and developed MEMS sensors and silicon interposers. He is currently working on nano-imprinting technology and has also been a member of the WOW Alliance, managed by Prof. Ohba at Tokyo Institute of Technology, since 2008 to develop 3D integration of Si wafers.

Shoichi Kodama received the B.S. degree in Mechanical Engineering from Tokyo Institute of Technology, Tokyo, Japan, in 1976. He joined Toshiba Corp., Kawasaki, Japan, in 1976. He developed equipment for semiconductor manufacturing, such as photolithography and CMP tools. He also developed a 300 mm factory automation system. He moved to DISCO Corp. in 2009 and joined the process integration group in the Wafer-on-Wafer (WOW) alliance as a visiting researcher at Tokyo Institute of Technology. He is working on the development of wafer stacking technology.

