Paper 3.1 INTERNATIONAL TEST CONFERENCE 1 978-1-4799-0859-2/13/$31.00 ©2013 IEEE
Test and Debug Strategy for TSMC CoWoS™ Stacking Process based Heterogeneous 3D IC: A Silicon Case Study
Sandeep Kumar Goel (1), Saman Adham (2), Min-Jer Wang (3), Ji-Jan Chen (3), Tze-Chiang Huang (1), Ashok Mehta (1), Frank Lee (3)
Vivek Chickermane (4), Brion Keller (4), Thomas Valind (4), Subhasish Mukherjee (5), Navdeep Sood (5)
Jeongho Cho (6), Hayden Hyungdong Lee (6), Jungi Choi (6), Sangdoo Kim (6)
TSMC: (1) San Jose, CA, USA; (2) Ottawa, ON, Canada; (3) Hsinchu, Taiwan, R.O.C.
Cadence Design Systems: (4) Endicott, NY, USA; (5) Noida, UP, India
SK hynix: (6) Icheon-si, Gyeonggi-do, Korea
Abstract Recent advances in semiconductor process technology, especially interconnects using Through-Silicon Vias (TSVs), enable heterogeneous system integration in which dies are implemented in dedicated, optimized process technologies and stacked in a 3D form. TSMC has developed the CoWoS™ (Chip on Wafer on Substrate) process as a design paradigm to assemble silicon interposer-based 3D ICs. To reach quality requirements for volume production, several test challenges related to 3D ICs need to be addressed. This paper describes the test and debug strategy used in designing a CoWoS™ based stacked IC. The 3D design presented in the paper contains three heterogeneous dies (a logic die, a DRAM die, and a JEDEC Wide-I/O compliant DRAM die) stacked on top of a passive interposer. For passive interposer testing, a novel test methodology called Pretty-Good-Die (PGD) test is presented, while for inter-die test, a novel scalable multi-tower 3D DFT architecture is presented. Silicon results show that most of the test challenges can be solved efficiently if planned properly, and that 3D ICs are a reality and no longer fiction.
1. Introduction Compared to conventional wire-bond chip interconnections, Through-Silicon Vias (TSVs) offer several advantages such as high density, low latency, low power, and possibly lower cost. TSVs provide vertical interconnects and are hence naturally used for three-dimensional (3D) vertical stacking of multiple dies [1-3]. However, TSVs also have attractive benefits in interconnecting dies that are placed next to each other. This is realized by stacking the dies on a base of passive silicon interposer containing TSVs; this flow is referred to as Chip on Wafer on Substrate (CoWoS™) [4-5]. Different types of dies (memory, logic, mixed signal, RF, etc.) can be stacked on top of the interposer. Considering the growth in the mobile device market, memory-on-logic 3D stacks (with and without interposer) are expected to arrive in the market first [6-7], followed by complex logic-on-logic stacks. The JEDEC Wide-I/O DRAM Standard (JESD-229) [6] is a step in this direction.
In this paper, we describe the test challenges that were faced in designing a CoWoS™ process based heterogeneous memory-on-logic and logic-on-logic 3D IC. The objective of this design was to find the stacking process weaknesses in creating a stacked-die system consisting of multiple dies in different technologies (logic and memory). It is also used to demonstrate the strength of the stacking capability when the dies to be stacked are sourced from different vendors. The design contains three dies: (1) TSMC SOC die (logic), (2) TSMC DRAM die (logic), and (3) SK hynix JEDEC Wide-I/O DRAM die (memory). Please note that the TSMC DRAM die is not a conventional memory die and has a logic interface; that is why, from a test point of view, we consider it a logic die. The three dies are placed side by side on top of a passive interposer in a face-to-face bonding style, as shown in Figure 1. The SOC die interfaces to the other two dies through micro-bumps (µbumps).
Figure 1: CoWoS™ based heterogeneous design
From a test point of view, this design poses several challenges. To minimize any yield loss and reduce overall cost, each die must be fully tested before stacking on top of the interposer. The pre-stacking test is similar to manufacturing wafer-sort of conventional 2D chips and is known as Known Good Die (KGD) test. Since one of the dies is a 3rd party die (SK hynix), ensuring the incoming die quality is also very critical. Therefore, a full incoming inspection flow needs to be established.
After the dies are stacked re-testing of individual dies is required to confirm that the stacking process did not damage the individual die. In addition, a new kind of test
must be performed to check that inter-die interconnects are defect-free. This is known as Known Good Stack (KGS) test. Furthermore, since dies are stacked on top of a passive interposer, testing of the interposer is essential in order to minimize the yield loss related to stacking good dies on a defective interposer. This is especially important since an interposer is the lowest cost die in a CoWoS™ stack. Debug features also need to be designed into the system to allow isolation of manufacturing as well as functional errors in case the system fails on ATE.
This paper describes the test and debug strategy that was adopted to overcome the above-mentioned test challenges. The paper is organized as follows. Section 2 provides a brief introduction to the TSMC CoWoS™ stacking process. Details of the heterogeneous chip architecture and the individual dies are presented in Section 3. DFT and debug features for individual die test (KGD) as well as the 3D multi-tower DFT architecture that enables inter-die interconnect test are presented in Section 4. In Section 5, a novel test methodology called Pretty-Good-Die (PGD) test for passive interposer testing is presented. Silicon test results for KGD and KGS testing are described in Section 6. Section 7 concludes the paper.
2. TSMC CoWoS™ Stacking Process In our CoWoS™ technology, TSVs with dimensions of 12 µm in diameter and 100 µm in depth are dry-etched in the interposer, and the TSV sidewalls are conformally coated with SiO2 as a liner. Next, the barrier and seed layer is deposited, followed by Cu electro-chemical plating. The excess Cu is removed and planarized by CMP. Three interconnect layers with routing pitch up to 0.8 µm are then formed above the TSVs by a dual-damascene process. Before stacking and bonding (the Chip-on-Wafer (CoW) step), bumps with a minimum pitch of 40 µm are formed on both the interposer wafer and the three dies.
Figure 2: Cross-sectional image of CoWoS™ system.
After bonding, the CoW wafer is thinned down to 100 µm to expose the TSVs for subsequent RDL and C4 bump formation. Afterwards, the CoW wafer goes for dicing. Each singulated die is then cleaned, pick-and-placed, reflowed, and flux cleaned before the flip-chip package process. Figure 2 shows the cross-sectional SEM image of the resulting CoWoS™ system.
3. Chip Architecture As mentioned earlier, this CoWoS™ test system is composed of four dies. The Wide-I/O DRAM is from SK hynix, while the remaining three dies (SOC, TSMC DRAM, and interposer) are designed at TSMC. Figure 3 shows the high-level view of the system.
Figure 3: High-level design architecture
3.1 SOC Die The main function of the SOC die is to perform and control the system application during functional mode. In addition to a dual-core ARM processor and communication buses, the SOC contains two main interfaces: a Wide-I/O DRAM interface and a high-bandwidth DRAM interface. The Wide-I/O DRAM interface is JEDEC compliant [6] and interfaces to a Wide-I/O DRAM die. This interface consists of a 3rd party Wide-I/O DRAM controller, a special Wide-I/O PHY circuit, and a Wide-I/O bridge to enable communication between the SOC and Wide-I/O DRAM dies. The second interface is the high-bandwidth interface between the SOC and the TSMC DRAM, and consists of PHY and bridge modules. It is designed to achieve up to 1 Tbit/s data access.
The SOC die provides GPIO mode support for the Wide-I/O DRAM and contains a MUX-IO debug port for external R/W access to the system address space or for monitoring internal SOC wires. SOC die is designed in TSMC 40LP process node.
3.2 TSMC DRAM Die The TSMC DRAM Die is designed to provide a high bandwidth data access to demonstrate the L3 Cache server application. The die consists of four ports (Channels) and each channel is capable of storing up to 2 Mbytes of data with a low read latency (as shown in Figure 4). To reduce power consumption and meet bandwidth requirements,
DRAM memory designed in the TSMC 40G process is used as baseline storage. In addition, to reduce the number of interconnects to the SOC die, a custom interface block (PHY) was designed to serialize/de-serialize the data transfer between the SOC and DRAM dies. The SOC die also has a corresponding PHY block to interface with the TSMC DRAM PHY. The PHY-SOC/PHY-DRAM datapath width is 1024 data signals plus additional address and control signals. All these signals are connected to µbumps at the die level.
Figure 4: Functional view of TSMC DRAM Die
3.3 JEDEC Wide-I/O DRAM Die (3rd Party) The third die is a JEDEC Wide-I/O compliant DRAM die designed and manufactured by a leading memory vendor (SK hynix). As shown in Figure 5, the full JEDEC Wide-I/O compliant standard [6] allows up to four DRAM dies (four ranks) to be stacked together in a vertical fashion. Each rank contains four channels with 128-bit wide data bus per channel, totaling to 512 data bits over all four channels.
Figure 5: JEDEC Wide-I/O DRAM Stack
Each channel also includes independent control and clocks but shared power/ground. The maximum data rate is 266 Mbps per data signal, which offers a total bandwidth of 17 GByte/s. The die includes 1200 µbump connections for all four channels. The Wide-I/O DRAM die contains a boundary scan structure to allow interconnect test between the logic die and the Wide-I/O DRAM die. The boundary scan structure is not compliant with IEEE Std. 1149.1. Please note that the Wide-I/O DRAM die used in this CoWoS™ chip contains only one rank of DRAM. However, the rank is completely compliant with the JEDEC Wide-I/O standard [6].
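As a quick cross-check of the figures quoted above, the aggregate bandwidth follows from the channel count, the per-channel bus width, and the data rate. The sketch below is illustrative only; the function name and the interpretation of 266 Mbps as a per-data-signal rate are our assumptions, chosen because they reproduce the quoted 17 GByte/s.

```python
def wide_io_bandwidth_gbytes(channels: int, bits_per_channel: int, mbps_per_signal: float) -> float:
    """Aggregate bandwidth in GByte/s, taking 1 GByte = 1e9 bytes."""
    total_bits_per_second = channels * bits_per_channel * mbps_per_signal * 1e6
    return total_bits_per_second / 8 / 1e9


# Values quoted in the text: four channels, 128 data bits each, 266 Mbps.
data_bits = 4 * 128
bandwidth = wide_io_bandwidth_gbytes(4, 128, 266)
print(data_bits, round(bandwidth, 1))
```

With these inputs the result is roughly 17 GByte/s over 512 data bits, in line with the numbers given for the Wide-I/O DRAM die.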
3.4 Silicon Interposer The fourth die is the silicon interposer manufactured in TSMC 65nm process. Figure 6 shows the placement of different dies on top of interposer as well as physical characteristics of the interposer. Since the SOC die interfaces to both DRAM dies, the TSMC DRAM die is placed on the left top corner of the interposer, while the JEDEC Wide-I/O DRAM die is placed at the bottom right corner. For the stability of the packaged die as well as to monitor process related effects, an additional DRAM die with same size as the TSMC DRAM die is added on the top right corner of the interposer.
Figure 6: Physical view of the silicon interposer
4. DFT and Debug Architecture To address the test challenges mentioned in Section 1, a novel DFT architecture is designed for the CoWoS™ chip and corresponding test features were added to the TSMC SOC and DRAM dies. No changes could be made to the Wide-I/O DRAM die as it is a 3rd party die. From a test point of view, the SOC die has two non-overlapping interfaces; therefore the whole system can be viewed as two towers (SOC-Wide-I/O DRAM, and SOC-TSMC DRAM) on top of the SOC die, as shown in Figure 7.
Figure 7: Two towers (electrical connection view)
A multi-tower DFT architecture is designed to support the two-tower hardware. The DFT architecture is scalable to any number of towers. The multi-tower architecture is based on the IEEE 1500 based 3D wrapper design [8] and the 3D DFT architectures [9-11]. Figure 8 shows the details of the multi-tower DFT architecture. For simplicity, the two towers are shown as the North (Wide-I/O DRAM side) and East (TSMC DRAM side) towers. Please note that die-level internal scan chains are not shown. Also, the positions of wrapper cells and PAD cells are swapped for clarity of the architecture and concept.
Figure 8: TSMC multi-tower 3D DFT architecture
The Wide-I/O DRAM die comes as a Known-Good-Die and cannot be wrapped like other logic dies. However, it already has the boundary scan wrapper (shown in Figure 9) based on the JEDEC Standard [6]. This wrapper requires that all the control signals should be generated from the logic die connected to this memory die.
The SOC die is wrapped with a 3D wrapper, where the Wide-I/O DRAM interface uses standard 1500 wrapper cells (shown as green cells in Figure 8). The TSMC DRAM interface uses special wrapper cells (shown as blue cells in Figure 8) due to special IO PAD design for SOC-to-TSMC-DRAM interface signals. This interface uses low-voltage swing differential signaling and to reduce the performance impact due to wrapper cells, PAD cells include the multiplexers from the wrapper cells that switch between test and functional modes. Therefore, special wrapper cells to account for the existence of the embedded multiplexers are designed and integrated into the SOC and TSMC DRAM dies.
Figure 9: Wide-I/O DRAM boundary scan [11]
To minimize the stack-level number of test pins as well as to make the SOC die compliant with IEEE Std. 1149.1 for board-level integration, the overall multi-tower DFT architecture is controlled by the SOC-die level IEEE 1149.1 TAP controller. The boundary scan cells are shown as yellow cells and are only connected to the bottom-side I/O pins in Figure 8. The SOC wrapper contains two adaptors to generate essential control signals for die and stack-level test. The TAP-to-1500 adaptor provides the necessary control to program the 1500 WIR through the top-level IEEE 1149.1 TAP controller. The 1500-to-WideI/O adaptor generates boundary scan control signals for the Wide-I/O DRAM, as the Wide-I/O DRAM does not contain a TAP controller and requires that control signals come from the logic die connected to it. Details about the 1500-to-WideI/O adaptor design can be found in [11].
The TSMC DRAM die is wrapped with a simple 3D wrapper [8] and only contains one type of wrapper cell. It also does not require a boundary scan wrapper, since it does not have any functional (non-power/ground) pin connected to package pins via the interposer. Considering that the SOC die wrapper interfaces with two distinct non-overlapping dies, we refer to its wrapper as the “L-L-M (Logic-to-Logic-to-Memory)” wrapper. In the L-L-M definition, the first L refers to the SOC die, the second L refers to the TSMC DRAM die since we consider it a logic die in terms of wrapper, and M refers to the Wide-I/O DRAM, which is a memory die.
The multi-tower architecture provides all types of test modes for each die before and after stacking. Figure 10 shows the wrapper WIR programming mode for individual die (SOC), while Figure 11 shows the same for stack level WIR programming required for interconnect test. Once die-level or stack-level WIR programming is done, testing of die itself or interconnects can be carried out depending on the programmed instruction.
Figure 10: SOC-die level WIR programming
Figure 11: Stack-level WIR programming
For interconnect test, the multi-tower architecture provides three distinct modes: (1) single integrated interconnect test, (2) SOC-to-Wide-I/O DRAM interconnect test, and (3) SOC-to-TSMC DRAM interconnect test. In the single integrated interconnect test mode, a single scan chain connecting all the wrapper cells in the SOC die, the boundary scan cells in the Wide-I/O DRAM die, and the wrapper cells in the TSMC DRAM die is created between the TDI and TDO pins.
Figure 12 shows the multi-tower architecture configured in single integrated test mode, while Figure 13 and Figure 14 show the same for the other two single-tower interconnect modes. Also note that any individual channel of the Wide-I/O DRAM can be included in or excluded from the interconnect test if required. These different configurations provide greater flexibility and debugging capabilities if and when failures are found at the ATE.
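The three interconnect test modes can be pictured as different selections of scan segments concatenated between TDI and TDO. The following is a minimal sketch of that idea only; the segment names, their ordering, and the `build_chain` helper are hypothetical and do not reflect the actual chain order in the silicon.

```python
from enum import Enum


class Mode(Enum):
    INTEGRATED = "single integrated interconnect test"
    SOC_WIDEIO = "SOC-to-Wide-I/O DRAM interconnect test"
    SOC_TSMC_DRAM = "SOC-to-TSMC DRAM interconnect test"


# Hypothetical channel labels for the four Wide-I/O DRAM channels.
WIDEIO_CHANNELS = ("ch0", "ch1", "ch2", "ch3")


def build_chain(mode, enabled_channels=WIDEIO_CHANNELS):
    """Return the ordered list of scan segments placed between TDI and TDO."""
    chain = []
    if mode in (Mode.INTEGRATED, Mode.SOC_WIDEIO):
        # North tower: SOC wrapper cells plus the selected Wide-I/O channels.
        chain.append("soc_wrapper_north")
        chain += [f"wideio_bscan_{ch}" for ch in WIDEIO_CHANNELS if ch in enabled_channels]
    if mode in (Mode.INTEGRATED, Mode.SOC_TSMC_DRAM):
        # East tower: SOC wrapper cells plus the TSMC DRAM die wrapper.
        chain.append("soc_wrapper_east")
        chain.append("tsmc_dram_wrapper")
    return chain


print(build_chain(Mode.SOC_WIDEIO, enabled_channels=("ch1",)))
```

Restricting `enabled_channels` mirrors the per-channel include/exclude option, which is what makes it possible to narrow a failing interconnect test down to one tower or one channel.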
Figure 12: Single integrated interconnect test mode
The boundary scan test mode for the stacked die is shown in Figure 15. From the different modes, we can see that the proposed multi-tower architecture is capable of meeting all the KGD and KGS testing requirements. Next, the test/debug strategy details for each individual die from a KGD point of view are presented.
Figure 13: SOC-to-Wide-I/O DRAM interconnect test
Figure 14: SOC-to-TSMC DRAM interconnect test
Figure 15: Boundary scan test mode for the stack
4.1 JEDEC Wide-I/O DRAM Test Scheme The Wide-I/O DRAM is provided as KGD by the memory vendor. However, to find any stacking process weakness and to test the Wide-I/O DRAM after stacking, special programmable MBIST was added to the SOC die. The MBIST block communicates with the Wide-I/O DRAM die through the Wide-I/O PHY block (as shown in Figure 3).
The MBIST engine supports several test algorithms including March-X and March-Y along with special tests such as refresh, leakage and data retention test specific to the Wide-I/O DRAM. The MBIST engine also has limited repair (two rows and one column per channel) circuitry to allow post-bond repair of faulty Wide-I/O DRAM cells.
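For readers unfamiliar with march tests, the sketch below runs the textbook March X sequence, {any(w0); up(r0,w1); down(r1,w0); any(r0)}, against a toy bit-addressable memory with injected stuck-at cells. It is not the MBIST engine described here, whose implementation is not given in the paper; the `FaultyMemory` model and all names are assumptions made for illustration.

```python
class FaultyMemory:
    """Toy bit-addressable memory with optional stuck-at cells (illustrative)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> forced value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def march_x(mem, size):
    """Run March X and return the sorted addresses whose reads mismatched."""
    failures = set()
    for addr in range(size):              # any order: w0
        mem.write(addr, 0)
    for addr in range(size):              # ascending: r0, w1
        if mem.read(addr) != 0:
            failures.add(addr)
        mem.write(addr, 1)
    for addr in reversed(range(size)):    # descending: r1, w0
        if mem.read(addr) != 1:
            failures.add(addr)
        mem.write(addr, 0)
    for addr in range(size):              # any order: r0
        if mem.read(addr) != 0:
            failures.add(addr)
    return sorted(failures)


print(march_x(FaultyMemory(16, stuck_at={3: 1, 9: 0}), 16))
```

A fault-free memory yields an empty failure list, while the two injected stuck-at cells are reported at their addresses; a repair step would then map such addresses onto the available spare rows and columns.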
4.2 SOC Test Several standard and widely used DFT methods such as at-speed scan/ATPG and memory BIST for on-chip SRAM blocks are used in the SOC die. To enhance the testability of embedded modules, simple wrappers providing controllability and isolation of the module I/Os are also inserted around modules. Unlike the traditional 2D SOC testing, direct probing of µbumps on SOC is not feasible and therefore special probe pads were designed and added to test pins. In order to minimize the number of probe-pads, pins-reduction compaction techniques (PRCT) as well as test data compression schemes were heavily adopted.
To minimize the test power consumption, the overall design was partitioned in three groups: (1) TSMC DRAM control interface, (2) SOC control and top-level logic, and (3) Wide-I/O DRAM control interface (as shown in Figure 16). Only one group could be activated and tested at a time during KGD test.
Figure 16: Test partitioning and session plan
The number of top-level scan pins for each group is decided based on the number of flops in each group. The TSMC DRAM interface partition has 26 top-level scan chains and 362 internal scan chains, resulting in an effective compression of 14X. Similarly, the Wide-I/O DRAM control interface partition has 10 top-level and 239 internal chains per DRAM channel. The SOC control logic partition has 26 top-level and 330 internal chains.
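The effective compression quoted for the TSMC DRAM interface partition is simply the ratio of internal to top-level scan chains. The short sketch below applies the same ratio to all three partitions; only the 14X figure is stated in the text, so the other two values are our own arithmetic under the assumption that the same definition applies.

```python
def compression_ratio(internal_chains: int, top_level_chains: int) -> float:
    """Internal scan chains driven per top-level scan chain."""
    return internal_chains / top_level_chains


# (internal chains, top-level chains) as quoted in the text.
partitions = {
    "TSMC DRAM interface": (362, 26),
    "Wide-I/O DRAM interface (per channel)": (239, 10),
    "SOC control logic": (330, 26),
}

ratios = {
    name: round(compression_ratio(internal, top), 1)
    for name, (internal, top) in partitions.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}X")
```

The first entry evaluates to about 13.9X, which rounds to the 14X reported for the TSMC DRAM interface partition.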
To further reduce the peak power consumption during test, a multi-clock capture scheme was used. In a multi-clock capture scheme, only clocks with no cross-clock-domain timing paths or clocks sharing identical clock source in a test mode can be triggered in parallel in a capture mode. Experimental results show that use of multi-clock capture scheme resulted in ~12% reduction in switching activity during test.
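The capture rule above can be read as a compatibility relation between clocks: two clocks may be pulsed in the same capture cycle if they share a clock source or have no cross-domain timing path between them. The sketch below packs clocks into capture groups greedily under that rule. The greedy strategy, the clock names, and the `group_clocks` helper are assumptions for illustration; the paper does not describe how the groups were actually formed.

```python
def group_clocks(clocks, cross_domain_paths, source_of=None):
    """Greedily pack clocks into groups that may be captured in parallel."""
    source_of = source_of or {}
    paths = {frozenset(pair) for pair in cross_domain_paths}

    def compatible(a, b):
        # Same source, or no timing path crossing between the two domains.
        same_source = a in source_of and source_of.get(a) == source_of.get(b)
        return same_source or frozenset((a, b)) not in paths

    groups = []
    for clk in clocks:
        for group in groups:
            if all(compatible(clk, other) for other in group):
                group.append(clk)
                break
        else:
            groups.append([clk])
    return groups


# Hypothetical example: clkA/clkB have a crossing path; clkC/clkD share a PLL.
groups = group_clocks(
    ["clkA", "clkB", "clkC", "clkD"],
    cross_domain_paths=[("clkA", "clkB"), ("clkC", "clkD")],
    source_of={"clkC": "pll0", "clkD": "pll0"},
)
print(groups)
```

In this example clkB must be captured separately from clkA, while clkC and clkD can join clkA's group because they share a source and have no crossing path to clkA.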
4.3 TSMC DRAM Test The TSMC DRAM die is designed to provide the maximum test and diagnostic coverage possible at minimal cost. The DFT scheme is shown in Figure 17 and contains the following features:
• At-speed scan-based testing with independent test compression/decompression and serialization blocks for each channel, so that the channels can be tested independently.
• Each channel has four top-level scan chains and 48 internal scan chains, resulting in a target compression of 12X.
• At-speed memory BIST and repair for all TSMC DRAM blocks.
Figure 17: DFT scheme for TSMC DRAM die
To enable diagnosis and debugging, three loopback test modes using the existing BISTR were designed. Figure 18 shows these loopback modes: (1) at-speed global (green), (2) at-speed BIST loop (red), and (3) at-speed PHY loopback (blue).
The innovative approach of re-using TSMC DRAM BIST controller for loop back test for PHY (blue) and global (green) resulted in 30% reduced complexity for the PHY block and enabled more accurate test and diagnosis for the PHY failures.
Figure 18: BIST/Loopback test in TSMC DRAM die
5. Testing of Passive Silicon Interposer As the silicon interposer is the largest and least costly die in the complete stack, manufacturing (go/no-go) testing is very critical. One of the major challenges in testing the interposer is that it is passive and does not contain any logic elements, such as logic gates or flip-flops, that conventional DFT relies on. Figure 19 shows an example implementation of a passive silicon interposer along with all possible connection types. Most of the functional signals, whether inter-die or intra-die, are of Type T1, while most of the power/ground signals are of Type T2 and T3. From Figure 19, it is clear that testing of the interposer requires:
1. Testing of metal interconnects or metal traces
2. Testing of through-silicon vias (TSVs).
Figure 19: Example interconnects in passive interposer
We can also see that if we could probe both sides of the interposer (µbump and C4) at the same time, the testing objective (KGD) mentioned earlier could be achieved. However, direct probing of µbumps is very difficult and production-worthy solutions are not yet available [13-15]. Also, double-side probing of the interposer is not possible due to wafer handling and probe card manufacturing issues. Therefore, even if direct probing of µbumps were possible, we could not have used this approach. A new test technique called Pretty-Good-Die (PGD) was developed to address the passive interposer testing problem. The PGD scheme allows testing of interconnects as well as the TSVs in the interposer. Based on the connection type (T1, T2 or T3), two kinds of µbump structures are used. For Type T2 and T3, a common µbump/TSV structure (as shown in Figure 20a) is used, while for Type T1, a single µbump structure is used (Figure 20b).
Figure 20: µbump structure types
In the common µbump/TSV structure, a set of 8 TSVs and µbumps is used for connection. This is based on the power delivery and signal strength requirement. The area available at the center of a common µbump/TSV structure is re-used to place a testing probe pad. For inter-die/intra-die connections, single µbump is used. In PGD, we separate the testing of interconnects and the testing of TSVs. For better understanding, let’s consider the example interposer shown in Figure 21 where all three types of connections are shown. There are four connections that involve at least two µbumps and interposer metal routing.
Figure 21: Front-side view of a passive interposer
Figure 22 shows the added PGD features to enable the testing of interconnects. For interconnect testing, we use dummy metal (extra metal) to connect the µbump pairs of a net-under-test to nearby probe pads available at the center of common µbump/TSV structures. The use of probe pads is required because the µbumps of an interconnect cannot be probed directly. As the added dummy metal increases the loading of the signal and can degrade its performance, it is very important to minimize the length of the added dummy metal. Finding the optimal µbump and probe-pad pairs that minimize the total added wire length is a separate problem and is not addressed in this paper.
[Figure 19 labels: µbump; C4 bump; interposer; TSV; T1: inter/intra-die connections; T2: feed-through via TSV; T3: fan-out connection via TSV]
[Figure 20 labels: (a) common µbump/TSV structure, with TSV, µbump, and sacrificial probing pad; (b) single µbump structure]
[Figure 21 labels: common µbump/TSV structure; normal interposer routing; single µbump; Type T1; Type T2; Type T3]
Figure 22: PGD features for interconnect testing
Once the optimal pairing of µbumps and probe pads has been determined and the corresponding metal connections have been formed, front-side probing (as shown in Figure 23) using traditional probe cards can be performed to do interconnect testing. Front-side probing of these pads enables the interconnect test for open/short and bridging faults between interconnects. To check for static opens, a simple 1/0 logic value can be applied at one end (pad) of a net (interconnect), and the resulting value can be observed at the other end (pad) of the same net. Similarly, applying a 1/0 to one net while holding all other nets at 0/1 checks for shorts between that net and any other net. Please note that for a defective interconnect, this methodology cannot differentiate whether the defect is in the normal signal routing or in the dummy metal routing.
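The open and short checks described above can be sketched against a toy electrical model. The `Interposer` class below is a hypothetical stand-in for the probed wafer (each net has two pads, an open net floats, and a shorted net follows its neighbor); it only illustrates the stimulus/observe procedure and is not the production test program.

```python
class Interposer:
    """Toy model of probed interposer nets (illustrative only)."""

    def __init__(self, nets, opens=(), shorts=()):
        self.nets = list(nets)
        self.opens = set(opens)
        self.shorts = {frozenset(pair) for pair in shorts}

    def observe_far_end(self, net, driven_value):
        # An open net leaves the far pad floating, modeled here as None.
        return None if net in self.opens else driven_value

    def observe_others(self, driven_net, driven_value, background):
        # All other nets are held at `background`; a shorted net follows the driver.
        return {
            n: driven_value if frozenset((n, driven_net)) in self.shorts else background
            for n in self.nets
            if n != driven_net
        }


def find_opens(ip):
    """Drive 1 then 0 at one pad and compare with the value at the other pad."""
    return [n for n in ip.nets if any(ip.observe_far_end(n, v) != v for v in (1, 0))]


def find_shorts(ip):
    """Drive one net to 1/0 with all others at 0/1 and look for disturbed nets."""
    found = set()
    for net in ip.nets:
        for value, background in ((1, 0), (0, 1)):
            seen = ip.observe_others(net, value, background)
            found |= {frozenset((net, m)) for m, v in seen.items() if v != background}
    return sorted(tuple(sorted(pair)) for pair in found)


ip = Interposer(["n0", "n1", "n2", "n3"], opens=["n2"], shorts=[("n0", "n3")])
print(find_opens(ip), find_shorts(ip))
```

As in the real flow, a reported open on a net says nothing about whether the defect sits in the functional routing or in the added dummy metal; the model makes no such distinction either.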
Figure 23: Front-side probing of interposer
As the number of probe-pads that can be placed on the interposer is limited by the number of common µbump/TSV structures as well as the space on the interposer, only a limited set of interconnects can be tested. This is why we called this methodology “Pretty-Good-Die (PGD)” instead of Known-Good-Die (KGD) test.
For testing of TSVs, we use the back-side probing concept. To test the connectivity of TSVs and µbumps of Power/Ground (P/G) pins, a dummy wire is added to connect two compatible P/G common µbump/TSV pairs. This concept is shown in Figure 24. Two P/G common µbump/TSV structures are considered compatible if they are of the same type (Power or Ground) and if they have same voltage level. Addition of dummy metal results in shorting of the corresponding pins (Power/Ground) but this does not affect the chip functionality as these pins would have been shorted anyway during packaging. For signal TSVs, a dummy TSV and C4 bump pair is added to form a loop with the target TSV.
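The compatibility rule for power/ground structures (same type and same voltage level) amounts to grouping before pairing. The sketch below pairs structures within each group in list order; the structure names and the pairing order are hypothetical, and a real flow would presumably also consider physical proximity, which is ignored here.

```python
from collections import defaultdict


def pair_compatible(structures):
    """Pair P/G common µbump/TSV structures that share type and voltage.

    `structures` is a list of (name, kind, voltage) tuples, with kind being
    "power" or "ground". Returns (pairs, unpaired).
    """
    groups = defaultdict(list)
    for name, kind, voltage in structures:
        groups[(kind, voltage)].append(name)

    pairs, unpaired = [], []
    for names in groups.values():
        # Pair neighbors in list order; an odd leftover stays unpaired.
        for i in range(0, len(names) - 1, 2):
            pairs.append((names[i], names[i + 1]))
        if len(names) % 2:
            unpaired.append(names[-1])
    return pairs, unpaired


# Hypothetical structures: two 1.1 V power, two ground, one 1.2 V power.
pairs, unpaired = pair_compatible([
    ("VDD_a", "power", 1.1),
    ("VSS_a", "ground", 0.0),
    ("VDD_b", "power", 1.1),
    ("VDDQ_a", "power", 1.2),
    ("VSS_b", "ground", 0.0),
])
print(pairs, unpaired)
```

Each resulting pair would receive one dummy wire, forming a loop that can be probed from the two corresponding C4 bumps on the back side.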
Figure 24: Adding PGD features for TSV testing
Once the required PGD features (dummy metal for TSV testing) have been added, TSV connectivity can then be tested by probing the back-side C4 bump pairs as shown in Figure 25. Since the back-side of interposer contains C4 bumps, which can be directly probed, no probe pads are required for TSV testing.
Figure 25: Back-side probing of interposer
Similar to interconnect testing, a simple 1/0 logic value can be applied at one end (C4) of a TSV loop, and the resulting value can be observed at the other C4 of the same TSV loop. However, unlike interconnect testing, TSV testing is coarse and is used to check for major process-related defects only, i.e. the actual TSV test coverage is low. It is important to note that even though the PGD test methodology cannot achieve high defect coverage, it is an acceptable practice from a foundry point of view as a simple go/no-go test. Considering that no logic can be added to the interposer and that it contains an array of thousands of µbumps and TSVs, KGD test of a passive interposer poses a difficult challenge.
6. Experimental and Silicon Results Figure 26 shows the stacked CoWoS™ chip before packaging. The four dies are placed on top of the passive interposer wafer. As shown in Figure 26, there is no physical access to the individual die IOs to perform testing at the CoWoS™ level; any test must be applied through the interposer. Therefore, the DFT methodology described earlier plays a very important role in enabling testing of the dies at the CoWoS™ level. The passive interposer test outlined in Section 5 did not result in any failing die and hence confirmed our expectation that the PGD approach is sufficient for simple go/no-go testing of the interposer.
[Labels from Figures 22, 24 and 25: common µbump/TSV structure; normal interposer routing; single µbump; dummy metal for interconnect test; probe pad; dummy metal for TSV test; C4]
Figure 26: Stacked CoWoS™ chip before packaging
Figure 27 shows the pre-stacking and post-stacking Shmoo plots for the SOC die. The Dhrystone patterns were chosen because they cause maximum power in the SOC die. The worst-case specification for the SOC was 925 MHz at 1.1 V. From Figure 27, we can see that a typical-corner speed of 1.6 GHz was obtained at the KGD level, while 1.62 GHz was obtained at the KGS level. The improvement in speed can be attributed to the better power delivery through the interposer. KGD and KGS test results for the TSMC DRAM chip are provided in Figure 28, where at-speed memory BIST patterns are used. In this case, both results match the expected typical-corner specification.
(a) KGD (b) KGS (CoW)
Figure 27: SOC die silicon test results
(a) KGD (b) KGS (CoW)
Figure 28: TSMC DRAM silicon test results
In Figure 29 we show the pre- and post-stacking Shmoo plots of the Wide-I/O DRAM for at-speed memory BIST. Please note that for the pre-stacking test the Direct Access (DA) mode is used, while post-stacking we use GPIO mode and memory BIST to test the Wide-I/O DRAM. This was done to test both the GPIO and DA modes of the Wide-I/O DRAM die. From both KGD and KGS results, we can see that the DRAM die meets the minimum required speed of 200 MHz at 90 °C. The Wide-I/O DRAM demonstrates 285 MHz performance through MBIST testing at KGS, compared to the original specification of 200 MHz.
(a) KGD (b) KGS (CoW)
Figure 29: Wide-I/O DRAM silicon test results
One of the main objectives of the multi-tower 3D DFT architecture was to perform the inter-die interconnect test at KGS. This test is required to ensure that stacking process (especially µbump interconnects) are defect free. Cadence’s RTL Compiler was used to insert the wrapper logic in the SOC and the TSMC DRAM dies and the Encounter Test ATPG tool was used to generate the static driver-to-receiver open/short patterns [11] for the interconnect test. The interconnect ATPG results are shown in Table 1.
Table 1: Inter-die interconnect ATPG results
For inter-die interconnects, a total coverage of 99.94% is obtained, while for the package pins that are connected to the SOC die, a total coverage of 90.74% is achieved. The overall interconnect static coverage is 99.47% and only six test patterns are required. There were also 22 shorted-net tests created to cover possible shorts between interconnect signals on the interposer. Silicon test of both interfaces (as shown in Figure 13 and Figure 14) did not result in any interconnect failure. In addition to the ATPG-generated slow-speed interconnect test, high-speed loopback tests were designed into the CoWoS™ chip. The high-speed loopback results were consistent with the expected interface performance and are shown in Figure 30.
(a) SOC-to-Wide-I/O DRAM (b) SOC-to-TSMC DRAM
Figure 30: High-speed loopback test results (KGS)
[Figure 29 plot labels: axes Frequency (MHz) and VDD2; VDD2 range marked from 1.14 V (-5%) to 1.32 V (+8%); JEDEC specification marked at 200 MHz; measured 285 MHz]
Fault Type                                 Testable Faults   Fault Coverage
Static stuck (inter-die)                   14688             99.94%
Static stuck (package pins)                788               90.74%
Total static stuck (inter-die + package)   15476             99.47%
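The totals reported in Table 1 can be cross-checked by weighting each coverage figure by its number of testable faults. The sketch below does this arithmetic; the variable names are ours, and the per-row detected-fault counts are back-computed from the rounded percentages rather than taken from the ATPG reports.

```python
# (testable faults, fault coverage in percent) from Table 1.
rows = {
    "inter-die": (14688, 99.94),
    "package pins": (788, 90.74),
}

total_faults = sum(count for count, _ in rows.values())
detected = sum(count * pct / 100 for count, pct in rows.values())
overall = round(100 * detected / total_faults, 2)

print(total_faults, overall)
```

The weighted result reproduces the 15476 total testable faults and the 99.47% overall static coverage, which shows that the lower package-pin coverage has little effect because those faults are a small fraction of the total.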
Silicon test results clearly demonstrate that most of the test challenges related to CoWoS™ technology can be resolved if planned properly. Furthermore, the stacking process does not have a negative impact on the system performance. In fact, KGS results show improvement over KGD results. It also demonstrates the viability and high performance capability of the CoWoS™ technology.
7. Conclusion Heterogeneous system integration, where dies are implemented in dedicated, optimized process technologies and stacked together to form a system, is inevitable to meet the demands of modern and future electronic products. Dies can be stacked vertically to form a 3D stack and connected via Through-Silicon Vias (TSVs), or can be placed next to each other on top of a passive silicon interposer and interconnected via the interposer; we refer to this flow as Chip-on-Wafer-on-Substrate (CoWoS™). The growth in the mobile device market indicates that memory-on-logic 3D stacks (with and without interposer) are expected to arrive in the market first, followed by logic-on-logic stacks.
In this paper, we have presented the test challenges and innovative DFT solutions for a heterogeneous memory-on-logic and logic-on-logic CoWoS™ IC. The heterogeneous system contains two TSMC dies and an SK hynix Wide-I/O DRAM die stacked on top of a passive silicon interposer. Specific DFT approaches were designed into the individual dies to meet the high-level test and quality requirements for the CoWoS™ manufacturing process. A novel approach called Pretty-Good-Die (PGD) test is introduced for testing of the passive interposer. The presented DFT solutions allow for efficient KGD and KGS testing of the dies and the stack. For interconnect test, an overall static coverage of 99.47% is achieved (99.94% for inter-die interconnects). Silicon test results show that TSMC CoWoS™ process based ICs achieve similar or better results at stack level as compared to the bare-die performance.
Acknowledgement We thank Sergej Deutsch (Cadence Design Systems, Germany & IMEC Belgium) and Erik Jan Marinissen from IMEC, Belgium for their support in developing inter-die interconnects test solution. We thank Y.T. Ha, H.S. Jun, H.S. Kim, J.H. Hong, Y.C. Joo (all from SK-hynix, South Korea), Jeff Tsai, CH Chang, and Jonathan Yuan (all from TSMC, TWN) for their contribution in this project.
8. References
[1] Eric Beyne and Bart Swinnen, “3D System Integration Technologies”, In Proceedings IEEE International Conference on IC Design and Technology, June 2007
[2] Philip Garrou, Christopher Bower, and Peter Ramm, editors, “Handbook of 3D Integration – Technology and Applications of 3D Integrated Circuits”, Wiley-VCH, Weinheim, Germany, August 2008.
[3] Robert S. Patti, “Three-Dimensional Integrated Circuits and the Future of System-on-Chip Designs”, Proceedings of the IEEE, 94(6):1214-1224, June 2006
[4] Frank Lee and Marc Greenberg, “Enough Talk! Practical Approaches to 3D IC- TSV/Silicon Interposer and Wide IO Implementation from People who have been there and done that”, Tutorial 2 at Design Automation Conference, June 2012
[5] J. Y. Xie et al., “Interposer Integration through Chip-on-Wafer-On-Substrate Process (CoWoS™)”, In Proceedings, Semicon West, July 2012
[6] WideI/O Single Data Rate (JEDEC Std. JESD229), JEDEC Solid State Technology Association, December 2011.
[7] ST-Ericsson and CEA-Leti's WIOMING Prototype Shows How To Combine Wide IO Memory and Logic SoC for Future 3D Multi-Processor Architectures. Yole Developpement 3D Packaging Newsletter, (22):16-18, February 2012.
[8] Erik Jan Marinissen et al., “A DFT Architecture for 3D-SICs Based on a Standardizable Die Wrapper”, Journal of Electronic Testing: Theory and Applications, 28(1):73-92, Feb 2012.
[9] Chun-Chuan Chi et al., “DfT Architecture for 3D-SICs with Multiple Towers”, In Proceedings IEEE European Test Symposium (ETS), pages 51-56, May 2011
[10] Sergej Deutsch et al., “Automation of 3D-DFT Insertion”, In Proceedings IEEE Asian Test Symposium (ATS), Nov 2011.
[11] S. Deutsch, et al., "DfT architecture and ATPG for Interconnect tests of JEDEC Wide-I/O memory-on-logic die stacks," In proceedings International Test Conference, Nov 2012
[12] Sandeep Kumar Goel, “Test challenges in designing complex 3D chips: What is on the Horizon for EDA industry?”, In Proceedings International Conference on Computer-Aided Design, Nov 2012
[13] Ken Smith et al., “Evaluation of TSV and Micro-Bump Probing for Wide-I/O Testing”, In Proceedings IEEE International Test Conference (ITC), September 2011
[14] Ben Eldridge and Marc Loranger, “Challenges and Solutions for Testing of TSV and Micro-Bump”, In Digest of IEEE International Workshop on Testing Three-Dimensional Stacked Integrated Circuits (3D-TEST), September 2011
[15] Matt Losey et al., “A Low-Force MEMS Probe Solution for Fine-Pitch 3D-SIC Wafer Test”, In Digest of IEEE International Workshop on Testing Three-Dimensional Stacked Integrated Circuits (3D-TEST), September 2011