
Paper 3.1 INTERNATIONAL TEST CONFERENCE 1 978-1-4799-0859-2/13/$31.00 ©2013 IEEE

Test and Debug Strategy for TSMC CoWoS™ Stacking Process based Heterogeneous 3D IC: A Silicon Case Study

Sandeep Kumar Goel¹, Saman Adham², Min-Jer Wang³, Ji-Jan Chen³, Tze-Chiang Huang¹, Ashok Mehta¹, Frank Lee³, Vivek Chickermane⁴, Brion Keller⁴, Thomas Valind⁴, Subhasish Mukherjee⁵, Navdeep Sood⁵, Jeongho Cho⁶, Hayden Hyungdong Lee⁶, Jungi Choi⁶, Sangdoo Kim⁶

TSMC: ¹San Jose, CA, USA; ²Ottawa, ON, Canada; ³Hsinchu, Taiwan, R.O.C.

Cadence Design Systems: ⁴Endicott, NY, USA; ⁵Noida, UP, India

⁶SK hynix, Icheon-si, Gyeonggi-do, Korea

Abstract

Recent advances in semiconductor process technology, especially interconnects using Through-Silicon Vias (TSVs), enable heterogeneous system integration in which dies are implemented in dedicated, optimized process technologies and stacked in a 3D form. TSMC has developed the CoWoS™ (Chip on Wafer on Substrate) process as a design paradigm to assemble silicon interposer-based 3D ICs. To reach the quality requirements for volume production, several test challenges related to 3D ICs need to be addressed. This paper describes the test and debug strategy used in designing a CoWoS™-based stacked IC. The 3D design presented in the paper contains three heterogeneous dies (a logic die, a DRAM die, and a JEDEC Wide-I/O compliant DRAM die) stacked on top of a passive interposer. For passive interposer testing, a novel test methodology called Pretty-Good-Die (PGD) test is presented, while for inter-die test, a novel scalable multi-tower 3D DFT architecture is presented. Silicon results show that most of the test challenges can be solved efficiently if planned properly, and that 3D ICs are a reality and no longer fiction.

1. Introduction

Compared to conventional wire-bond chip interconnections, Through-Silicon Vias (TSVs) offer several advantages such as high density, low latency, low power, and possibly lower cost. TSVs provide vertical interconnects and are hence naturally used for three-dimensional (3D) vertical stacking of multiple dies [1-3]. However, TSVs also offer attractive benefits for interconnecting dies that are placed next to each other. This is realized by stacking the dies on a passive silicon interposer base containing TSVs; this flow is referred to as Chip on Wafer on Substrate (CoWoS™) [4-5]. Different types of dies (memory, logic, mixed signal, RF, etc.) can be stacked on top of the interposer. Considering the growth in the mobile device market, memory-on-logic 3D stacks (with and without interposer) are expected to arrive in the market first [6-7], followed by complex logic-on-logic stacks. The JEDEC Wide-I/O DRAM Standard (JESD-229) [6] is a step in this direction.

In this paper, we describe the test challenges faced in designing a CoWoS™ process-based heterogeneous memory-on-logic and logic-on-logic 3D IC. The objective of this design was to find stacking-process weaknesses in creating a stacked-die system consisting of multiple dies in different technologies (logic and memory). It is also used to demonstrate the strength of the stacking capability when the dies to be stacked are sourced from different vendors. The design contains three dies: (1) a TSMC SOC die (logic), (2) a TSMC DRAM die (logic), and (3) an SK hynix JEDEC Wide-I/O DRAM die (memory). Please note that the TSMC DRAM die is not a conventional memory die and has a logic interface; from a test point of view, we therefore consider it a logic die. The three dies are placed side-by-side on top of a passive interposer in a face-to-face bonding style, as shown in Figure 1. The SOC die interfaces to the other two dies through micro-bumps (µbumps).

Figure 1: CoWoS™-based heterogeneous design

From a test point of view, this design poses several challenges. To minimize yield loss and reduce overall cost, each die must be fully tested before stacking on top of the interposer. The pre-stacking test is similar to the manufacturing wafer sort of conventional 2D chips and is known as Known Good Die (KGD) test. Since one of the dies is a third-party die (SK hynix), ensuring the incoming die quality is also critical. Therefore, a full incoming inspection flow needs to be established.

After the dies are stacked, re-testing of the individual dies is required to confirm that the stacking process did not damage them. In addition, a new kind of test must be performed to check that the inter-die interconnects are defect-free. This is known as Known Good Stack (KGS) test. Furthermore, since the dies are stacked on top of a passive interposer, testing of the interposer is essential in order to minimize the yield loss caused by stacking good dies on a defective interposer. This is especially important since the interposer is the lowest-cost die in a CoWoS™ stack, so a defective interposer wastes the more expensive dies mounted on it. Debug features also need to be designed into the system to allow isolation of manufacturing as well as functional errors in case the system fails on ATE.

This paper describes the test and debug strategy that was adopted to overcome the above-mentioned test challenges. The paper is organized as follows. Section 2 provides a brief introduction to the TSMC CoWoS™ stacking process. Details of the heterogeneous chip architecture and the individual dies are presented in Section 3. DFT and debug features for individual die test (KGD), as well as the 3D multi-tower DFT architecture that enables inter-die interconnect test, are presented in Section 4. In Section 5, a novel test methodology called Pretty-Good-Die (PGD) test for passive interposer testing is presented. Silicon test results for KGD and KGS testing are described in Section 6. Section 7 concludes the paper.

2. TSMC CoWoS™ Stacking Process

In our CoWoS™ technology, TSVs 12 µm in diameter and 100 µm in depth are dry-etched in the interposer, and the TSV sidewalls are conformally coated with SiO2 as a liner. Next, the barrier and seed layers are deposited, followed by Cu electro-chemical plating. The excess Cu is removed and planarized by CMP. Three interconnect layers with routing pitch up to 0.8 µm are then formed above the TSVs by a dual-damascene process. Before stacking and bonding (the Chip-on-Wafer (CoW) step), bumps with a minimum pitch of 40 µm are formed on both the interposer wafer and the three dies.

Figure 2: Cross-sectional image of CoWoS™ system.

After bonding, the CoW wafer is thinned down to 100 µm to expose the TSVs for subsequent RDL and C4 bump formation. Afterwards, the CoW wafer is diced. Each singulated die is then cleaned, picked and placed, reflowed, and flux-cleaned before the flip-chip package process. Figure 2 shows the cross-sectional SEM image of the resulting CoWoS™ system.

3. Chip Architecture

As mentioned earlier, this CoWoS™ test system is composed of four dies. The Wide-I/O DRAM is from SK hynix, while the remaining three dies (SOC, TSMC DRAM, and interposer) are designed at TSMC. Figure 3 shows the high-level view of the system.

Figure 3: High-level design architecture

3.1 SOC Die

The main role of the SOC die is to run and control the system application during functional mode. In addition to a dual-core ARM processor and communication buses, the SOC contains two main interfaces: a Wide-I/O DRAM interface and a high-bandwidth DRAM interface. The Wide-I/O DRAM interface is JEDEC compliant [6] and interfaces to a Wide-I/O DRAM die. This interface consists of a third-party Wide-I/O DRAM controller, a special Wide-I/O PHY circuit, and a Wide-I/O bridge to enable communication between the SOC and Wide-I/O DRAM dies. The second interface is the high-bandwidth interface between the SOC and the TSMC DRAM, and consists of PHY and bridge modules. It is designed to achieve data access rates of up to 1 Tbit/s.

The SOC die provides GPIO mode support for the Wide-I/O DRAM and contains a MUX-IO debug port for external R/W access to the system address space or for monitoring internal SOC wires. The SOC die is designed in the TSMC 40LP process node.

3.2 TSMC DRAM Die

The TSMC DRAM die is designed to provide high-bandwidth data access to demonstrate the L3 cache server application. The die consists of four ports (channels), and each channel is capable of storing up to 2 Mbytes of data with a low read latency (as shown in Figure 4). To reduce power consumption and meet bandwidth requirements, DRAM memory designed in the TSMC 40G process is used as baseline storage. In addition, to reduce the number of interconnects to the SOC die, a custom interface block (PHY) was designed to serialize/de-serialize the data transfer between the SOC and DRAM dies. The SOC die also has a corresponding PHY block to interface with the TSMC DRAM PHY. The PHY-SOC/PHY-DRAM datapath width is 1024 data signals plus additional address and control signals. All these signals are connected to µbumps at the die level.

Figure 4: Functional view of TSMC DRAM Die

3.3 JEDEC Wide-I/O DRAM Die (Third Party)

The third die is a JEDEC Wide-I/O compliant DRAM die designed and manufactured by a leading memory vendor (SK hynix). As shown in Figure 5, the full JEDEC Wide-I/O standard [6] allows up to four DRAM dies (four ranks) to be stacked vertically. Each rank contains four channels with a 128-bit wide data bus per channel, totaling 512 data bits over all four channels.

Figure 5: JEDEC Wide-I/O DRAM Stack

Each channel also includes independent control and clocks but shares power/ground. The maximum data rate is 266 Mbps, which offers a total bandwidth of 17 GByte/s. The die includes 1200 µbump connections for all four channels. The Wide-I/O DRAM die contains a boundary scan structure to allow interconnect test between the logic die and the Wide-I/O DRAM die. The boundary scan structure is not compliant with the IEEE 1149.1 standard. Please note that the Wide-I/O DRAM die used in this CoWoS™ chip contains only one rank of DRAM. However, the rank is completely compliant with the JEDEC Wide-I/O standard [6].
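The quoted bandwidth can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 266 Mbps figure is a per-data-pin rate and that all 512 data bits transfer in parallel; the function and constant names are illustrative and do not come from the paper.

```python
# Cross-check of the Wide-I/O peak bandwidth quoted in the text.
# Assumption: 266 Mbps is the rate of each data pin, and all data pins
# of all four channels are active at the same time.
CHANNELS = 4              # channels per rank
BITS_PER_CHANNEL = 128    # data-bus width per channel
RATE_MBPS_PER_PIN = 266   # maximum data rate per data pin, in Mbit/s


def peak_bandwidth_gbytes_per_s(channels=CHANNELS,
                                bits_per_channel=BITS_PER_CHANNEL,
                                rate_mbps=RATE_MBPS_PER_PIN):
    """Return the peak bandwidth in GByte/s (1 GByte = 1e9 bytes)."""
    total_bits = channels * bits_per_channel        # 512 data bits
    gbits_per_s = total_bits * rate_mbps / 1000.0   # Mbit/s -> Gbit/s
    return gbits_per_s / 8.0                        # bits -> bytes


if __name__ == "__main__":
    print(f"{peak_bandwidth_gbytes_per_s():.2f} GByte/s")
```

Under these assumptions the result is about 17.02 GByte/s, which agrees with the rounded 17 GByte/s stated above.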

3.4 Silicon Interposer

The fourth die is the silicon interposer, manufactured in the TSMC 65nm process. Figure 6 shows the placement of the different dies on top of the interposer as well as the physical characteristics of the interposer. Since the SOC die interfaces to both DRAM dies, the TSMC DRAM die is placed at the top-left corner of the interposer, while the JEDEC Wide-I/O DRAM die is placed at the bottom-right corner. For the stability of the packaged die as well as to monitor process-related effects, an additional DRAM die with the same size as the TSMC DRAM die is added at the top-right corner of the interposer.

Figure 6: Physical view of the silicon interposer

4. DFT and Debug Architecture

To address the test challenges mentioned in Section 1, a novel DFT architecture was designed for the CoWoS™ chip, and corresponding test features were added to the TSMC SOC and DRAM dies. No changes could be made to the Wide-I/O DRAM die as it is a third-party die. From a test point of view, the SOC die has two non-overlapping interfaces; the whole system can therefore be viewed as two towers (SOC-Wide-I/O DRAM and SOC-TSMC DRAM) on top of the SOC die, as shown in Figure 7.

Figure 7: Two towers (electrical connection view)

A multi-tower DFT architecture is designed to support the two-tower hardware. The DFT architecture is scalable to any number of towers. The multi-tower architecture is based on the IEEE 1500 based 3D wrapper design [8] and the 3D DFT architectures [9-11]. Figure 8 shows the details of the multi-tower DFT architecture. For simplicity, the two towers are shown as the North (Wide-I/O DRAM side) and East (TSMC DRAM side) towers. Please note that die-level internal scan chains are not shown. Also, the positions of the wrapper cells and PAD cells are swapped for clarity of the architecture and concept.


Figure 8: TSMC multi-tower 3D DFT architecture

The Wide-I/O DRAM die comes as a Known-Good-Die and cannot be wrapped like the other logic dies. However, it already has a boundary scan wrapper (shown in Figure 9) based on the JEDEC standard [6]. This wrapper requires that all the control signals be generated by the logic die connected to this memory die.

The SOC die is wrapped with a 3D wrapper, where the Wide-I/O DRAM interface uses standard 1500 wrapper cells (shown as green cells in Figure 8). The TSMC DRAM interface uses special wrapper cells (shown as blue cells in Figure 8) because of the special I/O PAD design for the SOC-to-TSMC-DRAM interface signals. This interface uses low-voltage-swing differential signaling, and to reduce the performance impact of the wrapper cells, the PAD cells include the multiplexers that switch between test and functional modes, which would otherwise sit in the wrapper cells. Therefore, special wrapper cells that account for the embedded multiplexers are designed and integrated into the SOC and TSMC DRAM dies.

Figure 9: Wide-I/O DRAM boundary scan [11]

To minimize the number of stack-level test pins as well as to make the SOC die compliant with the IEEE 1149.1 standard for board-level integration, the overall multi-tower DFT architecture is controlled by the SOC-die-level IEEE 1149.1 TAP controller. The boundary scan cells are shown as yellow cells and are connected only to the bottom-side I/O pins in Figure 8. The SOC wrapper contains two adaptors to generate essential control signals for die-level and stack-level test. The TAP-to-1500 adaptor provides the control necessary to program the 1500 WIR through the top-level IEEE 1149.1 TAP controller. The 1500-to-Wide-I/O adaptor generates boundary scan control signals for the Wide-I/O DRAM, as the Wide-I/O DRAM does not contain a TAP controller and requires that its control signals come from the logic die connected to it. Details about the 1500-to-Wide-I/O adaptor design can be found in [11].

The TSMC DRAM die is wrapped with a simple 3D wrapper [8] and contains only one type of wrapper cell. It also does not require a boundary scan wrapper, since it does not have any functional (non-power/ground) pin connected to package pins via the interposer. Considering that the SOC die wrapper interfaces with two distinct non-overlapping dies, we refer to its wrapper as the “L-L-M” (Logic-to-Logic-to-Memory) wrapper. In the L-L-M definition, the first L refers to the SOC die, the second L refers to the TSMC DRAM die, since we treat it like a logic die in terms of the wrapper, and M refers to the Wide-I/O DRAM, which is a memory die.

The multi-tower architecture provides all types of test modes for each die before and after stacking. Figure 10 shows the wrapper WIR programming mode for an individual die (SOC), while Figure 11 shows the stack-level WIR programming required for interconnect test. Once die-level or stack-level WIR programming is done, testing of the die itself or of the interconnects can be carried out, depending on the programmed instruction.
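As a rough illustration of WIR programming, the sketch below models each WIR as a shift stage plus an update stage and loads two daisy-chained WIRs in one serial pass. It assumes that stack-level programming concatenates the die-level WIRs into a single serial path; the `WIR` class, the 3-bit width, and the bit patterns are hypothetical, since the text gives neither the instruction encoding nor the adaptor signals.

```python
class WIR:
    """Toy wrapper instruction register: a shift stage and an update stage."""

    def __init__(self, width):
        self.width = width
        self.shift_reg = [0] * width     # serial shift stage
        self.instruction = [0] * width   # active instruction (update stage)

    def shift(self, bit_in):
        """Shift one bit in at index 0 and return the bit that falls out."""
        bit_out = self.shift_reg[-1]
        self.shift_reg = [bit_in] + self.shift_reg[:-1]
        return bit_out

    def update(self):
        """Copy the shift stage into the active instruction."""
        self.instruction = list(self.shift_reg)


def program_chain(wirs, instructions):
    """Load one instruction per WIR through a single serial input.

    `wirs` is ordered from serial input to serial output. Bits destined
    for the WIR farthest from the input are shifted in first.
    """
    target = [bit for instr in instructions for bit in instr]
    for bit in reversed(target):
        carry = bit
        for wir in wirs:
            carry = wir.shift(carry)     # output of one WIR feeds the next
    for wir in wirs:
        wir.update()                     # all instructions take effect together


if __name__ == "__main__":
    soc_wir, dram_wir = WIR(3), WIR(3)
    # Hypothetical 3-bit opcodes, for illustration only.
    program_chain([soc_wir, dram_wir], [[1, 0, 1], [0, 1, 1]])
    print(soc_wir.instruction, dram_wir.instruction)
```

Programming a single die corresponds to calling `program_chain` with a one-element list; the stack-level case only lengthens the serial path.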

Figure 10: SOC-die level WIR programming


Figure 11: Stack-level WIR programming

For interconnect test, the multi-tower architecture provides three distinct modes: (1) single integrated interconnect test, (2) SOC-to-Wide-I/O DRAM interconnect test, and (3) SOC-to-TSMC DRAM interconnect test. In the single integrated interconnect test mode, a single scan chain connecting all the wrapper cells in the SOC die, the boundary scan cells in the Wide-I/O DRAM die, and the wrapper cells in the TSMC DRAM die is created between the TDI and TDO pins.

Figure 12 shows the multi-tower architecture configured in single integrated test mode, while Figure 13 and Figure 14 show the same for the other two single-tower interconnect modes. Also note that any individual channel of the Wide-I/O DRAM can be included in or excluded from the interconnect test if required. These different configurations provide greater flexibility and debugging capability if and when failures are found at the ATE.
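The three interconnect test modes and the per-channel selection can be pictured as different compositions of one TDI-to-TDO chain. The sketch below is only a bookkeeping model: the segment names, their order in the chain, and the `build_chain` helper are assumptions made for illustration and are not taken from the figures.

```python
VALID_CHANNELS = (0, 1, 2, 3)   # the Wide-I/O DRAM rank has four channels


def build_chain(mode, wio_channels=VALID_CHANNELS):
    """Return the ordered scan segments between TDI and TDO for a mode.

    mode: "integrated", "soc_to_wideio", or "soc_to_tsmc_dram".
    wio_channels: Wide-I/O channels to include in the North tower.
    """
    channels = sorted(set(wio_channels))
    if any(ch not in VALID_CHANNELS for ch in channels):
        raise ValueError("Wide-I/O channel index must be 0..3")

    # North tower: SOC wrapper cells facing the Wide-I/O DRAM, followed by
    # the boundary scan cells of the selected Wide-I/O channels.
    north = ["soc_wrapper_north"] + [f"wideio_bscan_ch{ch}" for ch in channels]
    # East tower: SOC wrapper cells facing the TSMC DRAM, followed by the
    # TSMC DRAM wrapper cells.
    east = ["soc_wrapper_east", "tsmc_dram_wrapper"]

    if mode == "integrated":
        body = north + east
    elif mode == "soc_to_wideio":
        body = north
    elif mode == "soc_to_tsmc_dram":
        body = east
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ["TDI"] + body + ["TDO"]


if __name__ == "__main__":
    print(build_chain("integrated"))
    print(build_chain("soc_to_wideio", wio_channels=[2]))
```

Dropping a channel or a whole tower shortens the chain, which is what makes the narrower modes useful for localizing a failure seen in the integrated mode.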

Figure 12: Single integrated interconnect test mode

The boundary scan test mode for the stacked die is shown in Figure 15. From the different modes, we can see that the proposed multi-tower architecture is capable of meeting all the KGD and KGS testing requirements. Next, the test/debug strategy details for each individual die are presented from the KGD point of view.

Figure 13: SOC-to-Wide-I/O DRAM interconnect test

Figure 14: SOC-to-TSMC DRAM interconnect test

Figure 15: Boundary scan test mode for the stack


4.1 JEDEC Wide-I/O DRAM Test Scheme

The Wide-I/O DRAM is provided as a KGD by the memory vendor. However, to find any stacking-process weakness and to test the Wide-I/O DRAM after stacking, a special programmable MBIST engine was added to the SOC die. The MBIST block communicates with the Wide-I/O DRAM die through the Wide-I/O PHY block (as shown in Figure 3).

The MBIST engine supports several test algorithms, including March-X and March-Y, along with special tests such as refresh, leakage, and data retention tests specific to the Wide-I/O DRAM. The MBIST engine also has limited repair circuitry (two rows and one column per channel) to allow post-bond repair of faulty Wide-I/O DRAM cells.
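For readers unfamiliar with march tests, the sketch below runs the textbook March X and March Y sequences on a toy bit-per-cell memory with an optional stuck-at fault. It is a software illustration only: the `FaultyMemory` model and the `run_march` helper are hypothetical, and the refresh, leakage, retention, and repair features of the actual MBIST engine are not modeled.

```python
class FaultyMemory:
    """Bit-per-cell memory with an optional single stuck-at fault."""

    def __init__(self, size, stuck_at=None):
        self.size = size
        self.cells = [0] * size
        self.stuck_at = stuck_at   # (address, stuck value) or None

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        if self.stuck_at is not None and self.stuck_at[0] == addr:
            return self.stuck_at[1]   # faulty cell always returns stuck value
        return self.cells[addr]


# Each march element is (address order, operations applied per address).
# "w0"/"w1" write a value; "r0"/"r1" read and expect that value.
MARCH_X = [("up", ["w0"]), ("up", ["r0", "w1"]),
           ("down", ["r1", "w0"]), ("up", ["r0"])]
MARCH_Y = [("up", ["w0"]), ("up", ["r0", "w1", "r1"]),
           ("down", ["r1", "w0", "r0"]), ("up", ["r0"])]


def run_march(mem, algorithm):
    """Apply a march algorithm and return a list of (address, failing op)."""
    failures = []
    for order, ops in algorithm:
        if order == "up":
            addresses = range(mem.size)
        else:
            addresses = range(mem.size - 1, -1, -1)
        for addr in addresses:
            for op in ops:
                kind, value = op[0], int(op[1])
                if kind == "w":
                    mem.write(addr, value)
                elif mem.read(addr) != value:
                    failures.append((addr, op))
    return failures


if __name__ == "__main__":
    print(run_march(FaultyMemory(8), MARCH_X))                    # fault-free
    print(run_march(FaultyMemory(8, stuck_at=(5, 1)), MARCH_X))   # stuck-at-1
```

A programmable engine would typically select among such element lists at run time; here the choice is simply the `algorithm` argument.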

4.2 SOC Test

Several standard and widely used DFT methods, such as at-speed scan/ATPG and memory BIST for on-chip SRAM blocks, are used in the SOC die. To enhance the testability of embedded modules, simple wrappers providing controllability and isolation of the module I/Os are also inserted around the modules. Unlike in traditional 2D SOC testing, direct probing of µbumps on the SOC is not feasible, and therefore special probe pads were designed and added to the test pins. To minimize the number of probe pads, pin-reduction compaction techniques (PRCT) as well as test data compression schemes were heavily adopted.

To minimize the test power consumption, the overall design was partitioned into three groups: (1) TSMC DRAM control interface, (2) SOC control and top-level logic, and (3) Wide-I/O DRAM control interface (as shown in Figure 16). Only one group could be activated and tested at a time during KGD test.

Figure 16: Test partitioning and session plan

The number of top-level scan pins for each group is decided based on the number of flops in that group. The TSMC DRAM interface partition has 26 top-level scan chains and 362 internal scan chains, resulting in an effective compression of 14X. Similarly, the Wide-I/O DRAM control interface partition has 10 top-level and 239 internal chains per DRAM channel. The SOC control logic partition has 26 top-level and 330 internal chains.
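The 14X figure is consistent with reading effective compression as the ratio of internal to top-level scan chains. The sketch below applies that same reading to all three partitions; the ratio definition is an assumption inferred from the one number the text gives, and the dictionary and function names are illustrative.

```python
# (top-level chains, internal chains) per partition, as quoted in the text.
PARTITIONS = {
    "TSMC DRAM interface": (26, 362),
    "Wide-I/O DRAM control interface (per channel)": (10, 239),
    "SOC control and top-level logic": (26, 330),
}


def compression_ratio(top_level_chains, internal_chains):
    """Internal-to-top-level chain ratio, used here as 'effective compression'."""
    return internal_chains / top_level_chains


if __name__ == "__main__":
    for name, (top, internal) in PARTITIONS.items():
        print(f"{name}: {compression_ratio(top, internal):.1f}X")
```

Under this reading the three partitions come out at roughly 13.9X, 23.9X, and 12.7X; only the first of these values is stated in the text (rounded to 14X).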

To further reduce the peak power consumption during test, a multi-clock capture scheme was used. In a multi-clock capture scheme, only clocks with no cross-clock-domain timing paths between them, or clocks sharing an identical clock source in a test mode, can be triggered in parallel in a capture mode. Experimental results show that the multi-clock capture scheme resulted in a ~12% reduction in switching activity during test.
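The compatibility rule for parallel capture can be expressed as a small grouping problem. The sketch below uses a simple greedy pass to place clocks into capture groups; the greedy strategy, the clock names, and the clock sources are all hypothetical and are not the flow used in the design.

```python
def can_capture_together(a, b, cross_paths, source):
    """Two clocks may pulse in the same capture cycle if they share a
    clock source or if no cross-domain timing path exists between them."""
    if source[a] == source[b]:
        return True
    return frozenset((a, b)) not in cross_paths


def group_clocks(clocks, cross_paths, source):
    """Greedily assign each clock to the first compatible capture group."""
    groups = []
    for clk in clocks:
        for group in groups:
            if all(can_capture_together(clk, other, cross_paths, source)
                   for other in group):
                group.append(clk)
                break
        else:
            groups.append([clk])   # no compatible group: start a new one
    return groups


if __name__ == "__main__":
    # Hypothetical clock domains and sources, for illustration only.
    clocks = ["cpu_clk", "bus_clk", "phy_clk", "wio_clk"]
    source = {"cpu_clk": "pll0", "bus_clk": "pll0",
              "phy_clk": "pll1", "wio_clk": "pll2"}
    cross_paths = {
        frozenset(("cpu_clk", "bus_clk")),
        frozenset(("cpu_clk", "phy_clk")),
        frozenset(("bus_clk", "wio_clk")),
    }
    print(group_clocks(clocks, cross_paths, source))
```

A greedy pass does not guarantee the minimum number of capture groups; it is used here only because it makes the compatibility rule easy to see.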

4.3 TSMC DRAM Test

The TSMC DRAM die is designed to provide the maximum test and diagnostic coverage possible at minimal cost. The DFT scheme is shown in Figure 17 and contains the following features:

• At-speed scan-based testing with independent test compression/decompression and serialization blocks for each channel, so that the channels can be tested independently.
• Each channel has four top-level scan chains and 48 internal scan chains, resulting in a target compression of 12X.
• At-speed memory BIST and repair for all TSMC DRAM blocks.

Figure 17: DFT scheme for TSMC DRAM die

To enable diagnosis and debugging, three loopback test modes using the existing BISTR were designed. Figure 18 shows these loopback modes: (1) at-speed global (green), (2) at-speed BIST loop (red), and (3) at-speed PHY loopback (blue).

The innovative approach of re-using the TSMC DRAM BIST controller for the PHY (blue) and global (green) loopback tests reduced the complexity of the PHY block by 30% and enabled more accurate test and diagnosis of PHY failures.


Figure 18: BIST/Loopback test in TSMC DRAM die

5. Testing of Passive Silicon Interposer

As the silicon interposer is the largest and least costly die in the complete stack, manufacturing (go/no-go) testing is very critical. One of the major challenges in testing the interposer is that it is passive and does not contain any logic elements, such as logic gates or flip-flops, that conventional DFT requires. Figure 19 shows an example implementation of a passive silicon interposer along with all possible connection types. Most of the functional signals, whether inter-die or intra-die, are of Type T1, while most of the power/ground signals are of Type T2 and T3. From Figure 19, it is clear that testing of the interposer requires:

1. Testing of metal interconnects or metal traces
2. Testing of through-silicon vias (TSVs).

Figure 19: Example interconnects in passive interposer

Also, we can see that if we could probe both sides of the interposer (µbump and C4) at the same time, the testing objective (KGD) mentioned earlier could be achieved. However, direct probing of µbumps is very difficult and production-worthy solutions are not yet available [13-15]. Also, double-sided probing of the interposer is not possible due to wafer handling and probe card manufacturing issues. Therefore, even if direct probing of µbumps were possible, we could not have used this approach. A new test technique called Pretty-Good-Die (PGD) was developed to address the passive interposer testing problem. The PGD scheme allows testing of interconnects as well as the TSVs in the interposer. Based on the connection type (T1, T2 or T3), two kinds of µbump structures are used. For Types T2 and T3, a common µbump/TSV structure (as shown in Figure 20a) is used, while for Type T1, a single µbump structure is used (Figure 20b).

Figure 20: µbump structure types

In the common µbump/TSV structure, a set of 8 TSVs and µbumps is used for connection. This is based on the power delivery and signal strength requirements. The area available at the center of a common µbump/TSV structure is re-used to place a testing probe pad. For inter-die/intra-die connections, a single µbump is used. In PGD, we separate the testing of interconnects from the testing of TSVs. For better understanding, let us consider the example interposer shown in Figure 21, where all three types of connections are shown. There are four connections that involve at least two µbumps and interposer metal routing.

Figure 21: Front-side view of a passive interposer

Figure 22 shows the PGD features added to enable the testing of interconnects. For interconnect testing, we use dummy metal (extra metal) to connect the µbump pairs of the net under test to nearby probe pads available at the center of common µbump/TSV structures. Probe pads are required because the µbumps cannot be probed directly to test a particular interconnect. As the added dummy metal increases the loading of the signal and can degrade its performance, it is very important to minimize the length of the added dummy metal. Finding the optimal µbump and probe-pad pairing that minimizes the total added wire length is a separate problem and is not addressed in this paper.
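Although the optimal pairing is outside the scope of the paper, the shape of the problem can be shown with a simple heuristic. The sketch below greedily assigns each µbump to the closest unused probe pad by Manhattan distance; it is not the authors' method and does not guarantee minimum total length, and the coordinates, names, and the one-µbump-per-pad restriction are assumptions made for illustration.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points, a proxy for wire length."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def assign_pads(bumps, pads):
    """Greedy pairing of bumps to probe pads, shortest candidate first.

    bumps, pads: dicts mapping a name to an (x, y) position.
    Returns {bump name: (pad name, dummy-metal length)}.
    Assumption: each probe pad serves at most one bump.
    """
    candidates = sorted(
        (manhattan(bump_xy, pad_xy), bump, pad)
        for bump, bump_xy in bumps.items()
        for pad, pad_xy in pads.items()
    )
    assignment, used_pads = {}, set()
    for length, bump, pad in candidates:
        if bump in assignment or pad in used_pads:
            continue
        assignment[bump] = (pad, length)
        used_pads.add(pad)
    return assignment


if __name__ == "__main__":
    # Hypothetical positions in arbitrary layout units.
    bumps = {"u1": (0, 0), "u2": (10, 0)}
    pads = {"p1": (1, 0), "p2": (9, 2), "p3": (50, 50)}
    result = assign_pads(bumps, pads)
    print(result)
    print("total dummy metal:", sum(length for _, length in result.values()))
```

An exact formulation would treat this as a minimum-cost bipartite matching, possibly with additional constraints on signal loading.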

[Labels recovered from Figures 19-21. Connection types: T1, inter/intra-die connections; T2, feed-through via TSV; T3, fan-out connection via TSV. Structures: (a) common µbump/TSV structure with sacrificial probing pad; (b) single µbump structure. Other labels: µbump, TSV, C4 bump, interposer, normal interposer routing.]


Figure 22: PGD features for interconnect testing

Once the optimal pairing of µbumps and probe pads has been determined and the corresponding metal connections have been formed, front-side probing (as shown in Figure 23) using traditional probe cards can be performed for interconnect testing. Front-side probing of these pads enables the interconnect test for opens, shorts, and bridging faults between interconnects. To check for static opens, a simple 1/0 logic value can be applied at one end (pad) of a net (interconnect), and the resulting value can be observed at the other end (pad) of the same net. Similarly, applying a 1/0 to one net while holding all other nets at 0/1 checks for shorts between two nets. Please note that for a defective interconnect, this methodology cannot tell whether the defect is in the normal signal routing or in the dummy metal routing.
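The open and short checks described above can be mimicked on a toy net model. The sketch below is a deliberately simplified behavioral model, assuming that the far end of an open net floats and that a short makes a net read its partner's value when the two are driven differently; the `Interposer` class and net names are hypothetical and real electrical contention is not modeled.

```python
class Interposer:
    """Toy model of probed interposer nets with injected opens and shorts."""

    def __init__(self, nets, opens=(), shorts=()):
        self.nets = list(nets)
        self.opens = set(opens)
        self.shorts = {frozenset(pair) for pair in shorts}

    def observe(self, driven):
        """Given near-end values per net, return the far-end observations."""
        seen = {}
        for net in self.nets:
            if net in self.opens:
                seen[net] = None           # far end floats on an open net
                continue
            value = driven[net]
            for other in self.nets:
                shorted = frozenset((net, other)) in self.shorts
                if other != net and shorted and driven[other] != driven[net]:
                    value = driven[other]  # simplification: partner value wins
            seen[net] = value
        return seen


def find_opens(ip):
    """Drive all nets to 0, then to 1; a net that fails to follow is open."""
    bad = set()
    for value in (0, 1):
        seen = ip.observe({net: value for net in ip.nets})
        bad |= {net for net in ip.nets if seen[net] != value}
    return bad


def find_shorts(ip):
    """Walk a 1 (then a 0) across the nets while holding the others opposite."""
    found = set()
    for aggressor in ip.nets:
        for level in (1, 0):
            driven = {net: 1 - level for net in ip.nets}
            driven[aggressor] = level
            seen = ip.observe(driven)
            for net in ip.nets:
                if net != aggressor and seen[net] is not None \
                        and seen[net] != driven[net]:
                    found.add(frozenset((aggressor, net)))
    return found


if __name__ == "__main__":
    ip = Interposer(["n0", "n1", "n2", "n3"],
                    opens=["n2"], shorts=[("n0", "n1")])
    print("opens:", find_opens(ip))
    print("shorts:", [sorted(pair) for pair in find_shorts(ip)])
```

As in the text, a failing net in this model says nothing about whether the defect lies in the functional routing or in the added dummy metal.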

Figure 23: Front-side probing of interposer

As the number of probe pads that can be placed on the interposer is limited by the number of common µbump/TSV structures as well as by the space on the interposer, only a limited set of interconnects can be tested. This is why we call this methodology “Pretty-Good-Die (PGD)” test instead of Known-Good-Die (KGD) test.

For testing of TSVs, we use the back-side probing concept. To test the connectivity of the TSVs and µbumps of Power/Ground (P/G) pins, a dummy wire is added to connect two compatible P/G common µbump/TSV structures. This concept is shown in Figure 24. Two P/G common µbump/TSV structures are considered compatible if they are of the same type (Power or Ground) and have the same voltage level. The added dummy metal shorts the corresponding pins (Power/Ground), but this does not affect the chip functionality, as these pins would have been shorted anyway during packaging. For signal TSVs, a dummy TSV and C4 bump pair is added to form a loop with the target TSV.
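The compatibility rule for P/G structures amounts to grouping by type and voltage and then pairing within each group. The sketch below shows one way to do that; the structure names, voltages, and the `pair_compatible` helper are hypothetical, and physical proximity, which would matter for the dummy wire length, is ignored.

```python
from collections import defaultdict


def pair_compatible(structures):
    """Pair P/G common structures that share type and voltage level.

    structures: iterable of (name, kind, voltage), kind is "power" or "ground".
    Returns (pairs, unpaired): a list of (name_a, name_b) tuples and a list
    of names left without a compatible partner.
    """
    buckets = defaultdict(list)
    for name, kind, voltage in structures:
        buckets[(kind, voltage)].append(name)

    pairs, unpaired = [], []
    for key in sorted(buckets):
        names = buckets[key]
        # Pair consecutive entries within the same (kind, voltage) bucket.
        for i in range(0, len(names) - 1, 2):
            pairs.append((names[i], names[i + 1]))
        if len(names) % 2:
            unpaired.append(names[-1])
    return pairs, unpaired


if __name__ == "__main__":
    # Hypothetical P/G structures, for illustration only.
    structures = [
        ("P0", "power", 1.1), ("P1", "power", 1.1), ("P2", "power", 1.8),
        ("G0", "ground", 0.0), ("G1", "ground", 0.0), ("G2", "ground", 0.0),
    ]
    print(pair_compatible(structures))
```

Each resulting pair corresponds to one TSV loop that can be probed from two back-side C4 bumps.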

Figure 24: Adding PGD features for TSV testing

Once the required PGD features (dummy metal for TSV testing) have been added, TSV connectivity can then be tested by probing the back-side C4 bump pairs as shown in Figure 25. Since the back-side of interposer contains C4 bumps, which can be directly probed, no probe pads are required for TSV testing.

Figure 25: Back-side probing of interposer

Similar to interconnect testing, a logic value (1, then 0) can be applied at one end (C4) of a TSV loop, and the resulting value can be observed at the other C4 of the same TSV loop. However, unlike interconnect testing, TSV testing is coarse and is used to check for major process-related defects only; i.e., the actual TSV test coverage is low. It is important to note that even though the PGD test methodology cannot achieve high defect coverage, it is an acceptable practice from the foundry's point of view as a simple go/no-go test. Because no logic can be added to the interposer and it contains an array of thousands of µbumps and TSVs, KGD test of a passive interposer poses a difficult challenge.

6. Experimental and Silicon Results

Figure 26 shows the stacked CoWoS™ chip before packaging. The four dies are placed on top of the passive interposer wafer. As Figure 26 shows, there is no physical access to the individual die I/Os for testing at the CoWoS™ level; any test must be applied through the interposer. Therefore, the DFT methodology described earlier plays a very important role in enabling testing of the dies at the CoWoS™ level. The passive interposer test outlined in Section 5 did not result in any failing die, which confirmed our expectation that the PGD approach is sufficient for simple go/no-go testing of the interposer.

Figure 26: Stacked CoWoS™ chip before packaging

Figure 27 shows the pre-stacking and post-stacking Shmoo plots for the SOC die. The Dhrystone patterns were chosen because they cause maximum power consumption in the SOC die. The worst-case specification for the SOC was 925 MHz at 1.1 V. From Figure 27, we can see that a typical-corner speed of 1.6 GHz was obtained at the KGD level, while 1.62 GHz was obtained at the KGS level. The improvement in speed can be attributed to better power delivery through the interposer. KGD and KGS test results for the TSMC DRAM chip are provided in Figure 28, where at-speed memory BIST patterns are used. In this case, both results match the expected typical-corner specification.

(a) KGD (b) KGS (CoW)

Figure 27: SOC die silicon test results

(a) KGD (b) KGS (CoW)

Figure 28: TSMC DRAM silicon test results

In Figure 29 we show the pre-stacking and post-stacking Shmoo plots of the Wide-I/O DRAM for at-speed memory BIST. Note that the pre-stacking test uses the Direct Access (DA) mode, while the post-stacking test uses GPIO mode and memory BIST. This was done to test both the GPIO and DA modes of the Wide-I/O DRAM die. From both the KGD and KGS results, we can see that the DRAM die meets the minimum required speed of 200 MHz at 90 °C. The Wide-I/O DRAM demonstrates 285 MHz performance through MBIST testing at KGS, compared to the original specification of 200 MHz.
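The speed figures quoted above can be turned into relative margins with a few lines of arithmetic. The helper name `margin_percent` is an assumption for illustration; the input values are the ones stated in the text (285 MHz against a 200 MHz requirement for the Wide-I/O DRAM, and 1.62 GHz at KGS against 1.6 GHz at KGD for the SOC die).

```python
def margin_percent(measured_mhz, reference_mhz):
    """Relative difference of a measured speed over a reference, in percent."""
    return (measured_mhz - reference_mhz) / reference_mhz * 100.0


# Wide-I/O DRAM at KGS: measured speed versus the minimum required speed.
wide_io_margin = margin_percent(285.0, 200.0)

# SOC die: KGS speed versus KGD speed (both typical-corner results).
soc_kgs_gain = margin_percent(1620.0, 1600.0)

print(f"Wide-I/O margin over requirement: {wide_io_margin:.1f}%")  # 42.5%
print(f"SOC KGS gain over KGD: {soc_kgs_gain:.2f}%")               # 1.25%
```

The SOC comparison is deliberately KGS against KGD only. Comparing the typical-corner 1.6 GHz result with the worst-case 925 MHz specification would mix two different process corners.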

(a) KGD (b) KGS (CoW)

Figure 29: Wide-I/O DRAM silicon test results

One of the main objectives of the multi-tower 3D DFT architecture was to perform the inter-die interconnect test at KGS. This test is required to ensure that the stacking process (especially the µbump interconnects) is defect free. Cadence's RTL Compiler was used to insert the wrapper logic in the SOC and the TSMC DRAM dies, and the Encounter Test ATPG tool was used to generate the static driver-to-receiver open/short patterns [11] for the interconnect test. The interconnect ATPG results are shown in Table 1.

Table 1: Inter-die interconnect ATPG results

For inter-die interconnects, a total coverage of 99.94% is obtained, while for the package pins that are connected to the SOC die, a total coverage of 90.74% is achieved. The overall interconnect static coverage is 99.47%, and only six test patterns are required. In addition, 22 shorted-net tests were created to cover possible shorts between interconnect signals on the interposer. Silicon test of both interfaces (as shown in Figure 13 and Figure 14) did not result in any interconnect failure. In addition to the ATPG-generated slow-speed interconnect test, high-speed loopback tests were designed into the CoWoS™ chip. The high-speed loopback results were consistent with the expected interface performance and are shown in Figure 30.
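As a consistency check, the overall figure of 99.47% can be reproduced as a fault-count-weighted average of the two per-category coverages. The fault counts (14688 inter-die, 788 package-pin) are taken from Table 1; the function and variable names are assumptions for illustration.

```python
def overall_coverage(categories):
    """Fault-count-weighted coverage over (testable_faults, coverage_percent) pairs."""
    total_faults = sum(faults for faults, _ in categories)
    detected = sum(faults * coverage / 100.0 for faults, coverage in categories)
    return total_faults, detected / total_faults * 100.0


table_1 = [
    (14688, 99.94),  # static stuck faults, inter-die interconnects
    (788, 90.74),    # static stuck faults, package pins
]
total_faults, coverage = overall_coverage(table_1)
print(total_faults, f"{coverage:.2f}%")  # 15476 99.47%
```

The package-pin category has much lower coverage but accounts for only about 5% of the testable faults, which is why the overall figure stays close to the inter-die figure.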

(a) SOC-to-Wide-I/O DRAM (b) SOC-to-TSMC DRAM

Figure 30: High-speed loopback test results (KGS)

[Shmoo plot axis labels: VDD2 from 1.14 V (-5%) to 1.32 V (+8%), JEDEC specification; Frequency (MHz), with 200 MHz and 285 MHz marked]

Fault Type | Testable Faults | Fault Coverage
--- | --- | ---
Static stuck (inter-die) | 14688 | 99.94%
Static stuck (package pins) | 788 | 90.74%
Total static stuck (inter-die + package) | 15476 | 99.47%

Silicon test results clearly demonstrate that most of the test challenges related to CoWoS™ technology can be resolved if planned properly. Furthermore, the stacking process does not have a negative impact on system performance. In fact, KGS results show an improvement over KGD results. The results also demonstrate the viability and high-performance capability of the CoWoS™ technology.

7. Conclusion

Heterogeneous system integration, where dies are implemented in dedicated, optimized process technologies and stacked together to form a system, is inevitable to meet the demands of modern and future electronic products. Dies can be stacked vertically to form a 3D stack and connected via Through-Silicon Vias (TSVs), or they can be placed next to each other on top of a passive silicon interposer and interconnected via the interposer; we refer to this flow as Chip-on-Wafer-on-Substrate (CoWoS™). The growth in the mobile device market indicates that memory-on-logic 3D stacks (with and without interposer) are expected to arrive in the market first, followed by logic-on-logic stacks.

In this paper, we have presented the test challenges and innovative DFT solutions for a heterogeneous memory-on-logic and logic-on-logic CoWoS™ IC. The heterogeneous system contains two TSMC dies and an SK hynix Wide-I/O DRAM die stacked on top of a passive silicon interposer. Specific DFT approaches were designed into the individual dies to meet the high-level test and quality requirements of the CoWoS™ manufacturing process. A novel approach called Pretty-Good-Die (PGD) test is introduced for testing of the passive interposer. The presented DFT solutions allow efficient KGD and KGS testing of the dies and the stack. For the interconnects (inter-die plus package pins), an overall static coverage of 99.47% is achieved. Silicon test results show that a TSMC CoWoS™ process based IC achieves similar or better results at stack level compared to bare-die performance.

Acknowledgement

We thank Sergej Deutsch (Cadence Design Systems, Germany and IMEC, Belgium) and Erik Jan Marinissen (IMEC, Belgium) for their support in developing the inter-die interconnect test solution. We thank Y.T. Ha, H.S. Jun, H.S. Kim, J.H. Hong, Y.C. Joo (all from SK hynix, South Korea), Jeff Tsai, CH Chang, and Jonathan Yuan (all from TSMC, Taiwan) for their contributions to this project.

8. References

[1] Eric Beyne and Bart Swinnen, “3D System Integration Technologies”, In Proceedings IEEE International Conference on IC Design and Technology, June 2007.

[2] Philip Garrou, Christopher Bower, and Peter Ramm, editors, “Handbook of 3D Integration – Technology and Applications of 3D Integrated Circuits”, Wiley-VCH, Weinheim, Germany, August 2008.

[3] Robert S. Patti, “Three-Dimensional Integrated Circuits and the Future of System-on-Chip Designs”, Proceedings of the IEEE, 94(6):1214-1224, June 2006.

[4] Frank Lee and Marc Greenberg, “Enough Talk! Practical Approaches to 3D IC- TSV/Silicon Interposer and Wide IO Implementation from People who have been there and done that”, Tutorial 2 at Design Automation Conference, June 2012.

[5] J. Y. Xie et al., “Interposer Integration through Chip-on-Wafer-On-Substrate Process (CoWoS™)”, In Proceedings, Semicon West, July 2012.

[6] WideI/O Single Data Rate (JEDEC Std. JESD229), JEDEC Solid State Technology Association, December 2011.

[7] ST-Ericsson and CEA-Leti's WIOMING Prototype Shows How To Combine Wide IO Memory and Logic SoC for Future 3D Multi-Processor Architectures. Yole Developpement 3D Packaging Newsletter, (22):16-18, February 2012.

[8] Erik Jan Marinissen et al., “A DFT Architecture for 3D-SICs Based on a Standardizable Die Wrapper”, Journal of Electronic Testing: Theory and Applications, 28(1):73-92, Feb 2012.

[9] Chun-Chuan Chi et al., “DfT Architecture for 3D-SICs with Multiple Towers”, In Proceedings IEEE European Test Symposium (ETS), pages 51-56, May 2011.

[10] Sergej Deutsch et al., “Automation of 3D-DFT Insertion”, In Proceedings IEEE Asian Test Symposium (ATS), Nov 2011.

[11] S. Deutsch et al., “DfT architecture and ATPG for Interconnect tests of JEDEC Wide-I/O memory-on-logic die stacks”, In Proceedings International Test Conference, Nov 2012.

[12] Sandeep Kumar Goel, “Test challenges in designing complex 3D chips: What is on the Horizon for EDA industry?”, In Proceedings International Conference on Computer-Aided Design, Nov 2012.

[13] Ken Smith et al., “Evaluation of TSV and Micro-Bump Probing for Wide-I/O Testing”, In Proceedings IEEE International Test Conference (ITC), September 2011.

[14] Ben Eldridge and Marc Loranger, “Challenges and Solutions for Testing of TSV and Micro-Bump”, In Digest of IEEE International Workshop on Testing Three-Dimensional Stacked Integrated Circuits (3D-TEST), September 2011.

[15] Matt Losey et al., “A Low-Force MEMS Probe Solution for Fine-Pitch 3D-SIC Wafer Test”, In Digest of IEEE International Workshop on Testing Three-Dimensional Stacked Integrated Circuits (3D-TEST), September 2011.