[IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA...

4
279 2006 IEEE Electrical Performance of Electronic Packaging Performance Analysis of Edge-based DFE Jihong Ren, Haechang Lee, Dan Oh, Brian Leibowitz, Vladimir Stojanovic, Jared Zerbe, Nhat Nguyen Rambus Inc. 4440 El Camino Real, Los Altos, CA 94022 Tel: 650-947-5587, Fax: 650-947-5001, jrengrambus.com ABSTRACT This paper presents an analysis of edge-based decision-feedback equalization (DFE) and compares its performance to conventional data-based DFE in high-speed serial links. We use the link performance analysis framework in [3] to evaluate these two equalization schemes. Simulation results across 13 backplane channels of various length and configuration show that data-based DFE (D-DFE) always performs better than edge-based DFE (E-DFE). I. INTRODUCTION Backplane interconnects present great challenges to achieving multi-Gb/s signaling rates. Signal integrity issues such as crosstalk and inter-symbol interference (ISI) severely limit the channel bandwidth. To push high data rates over legacy backplanes, sophisticated equalization techniques must be used. Edge-based equalization has recently been proposed as an alternative to conventional data-based equalization [1, 2, 4, 5, 6]. This paper compares two adaptive DFE schemes: the conventional D-DFE and the recently proposed E-DFE. For the purpose of this paper, we will focus on receivers that sample the data at twice the baud-rate for clock and data recovery (CDR). In steady state, one sample time occurs at the transitions of the data eye (edge sampler) while the other occurs at the center of the eye opening (data sampler). In D-DFE, the DFE filter coefficients are adapted based on error information collected at the data sample time. This requires a third sampler (adaptive sampler) whose threshold is set to the expected signal swing in the absence of ISI (a.k.a dLev) [3].' This equalization scheme tries to minimize the ISI distribution at the data time. In contrast, E-DFE uses the error information collected at the transitions of the data. This error information is already generated for timing recovery and so this scheme removes the need for additional analog hardware for adaptation. E-DFE tries to minimize the ISI distribution at the transitions of the data. In so doing, it also provides partial cancellation of the ISI at the center of the eye. Although E-DFE seems promising, its performance has never been compared with conventional D-DFE. This paper presents an analysis of E-DFE and compares its performance to D-DFE in high-speed serial links. Simulation results across multiple channels show that D-DFE always performs better than E-DFE. The rest of the paper is organized as follows. Section II introduces the basic DFE architecture. Section III describes the adaptive algorithms for E-DFE and D-DFE and their corresponding steady state solutions. In Section III, we also present a simple method of determining the nominal CDR lock position by observing the single bit response of the link and discuss the interaction between CDR and DFE adaptation. Section IV presents simulation data across 13 different channels at 7.5Gb/s and we draw conclusion in Section V. 11. DFE ARCHITECTURE E-DFE and D-DFE share the same core DFE architecture. Figure 1 shows a simplified block diagram of the DFE loop. The purpose of the DFE loop is to cancel the ISI generated by previous received bits. Each DFE weight is an estimation of the ISI contribution from that corresponding bit. Analog pulses synthesized from previous symbol decisions and the DFE weights are current-summed with the received signal. This equalized signal is the input to the data sampler which makes future symbol decisions. The clock to the DFE is adjusted so that the transitions of the DFE correction pulses are phase aligned with the transitions of the input data. With such alignment, the DFE correction pulses reach full swing at the eye center and add half swing at the edges of the data. I Furthermore, since dLev is not known a priori, a DAC is also needed to adjust the sampler threshold according to the channel. 1-4244-0668-4/06/$20.00 ©2006 IEEE 265

Transcript of [IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA...

Page 1: [IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA (2006.10.23-2006.10.25)] 2006 IEEE Electrical Performane of Electronic Packaging - Performance Analysis

279

2006 IEEE Electrical Performance of Electronic Packaging

Performance Analysis of Edge-based DFEJihong Ren, Haechang Lee, Dan Oh, Brian Leibowitz, Vladimir Stojanovic,

Jared Zerbe, Nhat Nguyen

Rambus Inc.4440 El Camino Real, Los Altos, CA 94022

Tel: 650-947-5587, Fax: 650-947-5001, jrengrambus.comABSTRACT

This paper presents an analysis of edge-based decision-feedback equalization (DFE) and compares its performance toconventional data-based DFE in high-speed serial links. We use the link performance analysis framework in [3] to evaluatethese two equalization schemes. Simulation results across 13 backplane channels of various length and configuration showthat data-based DFE (D-DFE) always performs better than edge-based DFE (E-DFE).

I. INTRODUCTION

Backplane interconnects present great challenges to achieving multi-Gb/s signaling rates. Signal integrity issues suchas crosstalk and inter-symbol interference (ISI) severely limit the channel bandwidth. To push high data rates over legacybackplanes, sophisticated equalization techniques must be used. Edge-based equalization has recently been proposed as analternative to conventional data-based equalization [1, 2, 4, 5, 6]. This paper compares two adaptive DFE schemes: theconventional D-DFE and the recently proposed E-DFE.

For the purpose of this paper, we will focus on receivers that sample the data at twice the baud-rate for clock and datarecovery (CDR). In steady state, one sample time occurs at the transitions of the data eye (edge sampler) while the otheroccurs at the center of the eye opening (data sampler). In D-DFE, the DFE filter coefficients are adapted based on errorinformation collected at the data sample time. This requires a third sampler (adaptive sampler) whose threshold is set to theexpected signal swing in the absence of ISI (a.k.a dLev) [3].' This equalization scheme tries to minimize the ISI distributionat the data time. In contrast, E-DFE uses the error information collected at the transitions of the data. This error informationis already generated for timing recovery and so this scheme removes the need for additional analog hardware for adaptation.E-DFE tries to minimize the ISI distribution at the transitions of the data. In so doing, it also provides partial cancellation ofthe ISI at the center of the eye. Although E-DFE seems promising, its performance has never been compared withconventional D-DFE. This paper presents an analysis of E-DFE and compares its performance to D-DFE in high-speedserial links. Simulation results across multiple channels show that D-DFE always performs better than E-DFE.

The rest of the paper is organized as follows. Section II introduces the basic DFE architecture. Section III describesthe adaptive algorithms for E-DFE and D-DFE and their corresponding steady state solutions. In Section III, we also presenta simple method of determining the nominal CDR lock position by observing the single bit response of the link and discussthe interaction between CDR and DFE adaptation. Section IV presents simulation data across 13 different channels at7.5Gb/s and we draw conclusion in Section V.

11. DFE ARCHITECTURE

E-DFE and D-DFE share the same core DFE architecture. Figure 1 shows a simplified block diagram of the DFEloop. The purpose of the DFE loop is to cancel the ISI generated by previous received bits. Each DFE weight is anestimation of the ISI contribution from that corresponding bit. Analog pulses synthesized from previous symbol decisionsand the DFE weights are current-summed with the received signal. This equalized signal is the input to the data samplerwhich makes future symbol decisions. The clock to the DFE is adjusted so that the transitions of the DFE correction pulsesare phase aligned with the transitions of the input data. With such alignment, the DFE correction pulses reach full swing atthe eye center and add half swing at the edges of the data.

I Furthermore, since dLev is not known a priori, a DAC is also needed to adjust the sampler threshold according to thechannel.

1-4244-0668-4/06/$20.00 ©2006 IEEE 265

Page 2: [IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA (2006.10.23-2006.10.25)] 2006 IEEE Electrical Performane of Electronic Packaging - Performance Analysis

280

As in [1], we use a 2x over-sampled CDR and use a simple majority-vote algorithm to advance/hold/retard CDRphases based on the early/late information from the edge samples. Nominally, this CDR locks to the mean of the timingdistribution at the transition, where about half of the zero-crossings are late and half are early.

III. EDGE-BASED AND DATA-BASED ADAPTATION

The main difference between E-DFE and D-DFE is how the DFE weights are adapted. In order to obtain error

information at the data time, D-DFE uses an adaptive sampler and adjusts its threshold to dLev. The difference between dLevand the actual data level is the error information used in adaptation. In contrast, E-DFE uses the error information from theedge sampler typically used for timing recovery [1]. The rest of this section briefly summarizes the adaptation algorithmsand derives their corresponding steady-state solution. We used the same notation as in [1] but ignore pre-cursor ISI in theformulation for simplicity.

Let (t) be the single bit response of the channel, t, be the data sampling phase, and am be the m-th transmitted

symbol. The voltage seen by the receiver's data sampler at the m-th data sample is given by

Zm = am (Ats)+ am-l-k1{(ts +(k+1)T) dfek+l} Equation 1k=o...

where dfek is the value of the k-th post-cursor DFE tap. The transition sample preceding the m-th data sample isZm-1/2= am(p(ts- T 2)+am-,J{(ts +T 2)- dfe112}

+Eam k{s +T 2+kT)- (dfek +dfek±l) 2} Equation2k= ...

E-DFE chooses the DFE weights to minimize E{(zm-1/2)2}, the ISI on edge samples. In contrast, D-DFE chooses22the DFE weights to minimize E{(zm (tS)) }, the ISI on data samples.2 As mentioned earlier, an additional adaptation

loop is needed to adapt dLev to the desired data level 9(ts) since this is not known a priori. The SS-LMS algorithms are

readily derived from Equations1 and 2.

The SS-LMS algorithm for E-DFE is:

dfejl dfe/ +, sgn(zm_112) sgn(am-k + am-k- ) Equation 3

The SS-LMS algorithm for D-DFE is:

dfej+l dfe + sgn(z- dLev1) sgn(am-k) Equation 4

dLevJ+1=dLev1-/Asgn(z dLev1')Finding the steady state solutions requires us to consider the CDR which is constantly adapting as well. As the DFE

weights adapt, the first DFE tap moves the CDR locking position. This in turn changes the ISI components seen by theadaptation algorithm. Thus, there is a recursive interaction between the CDR and DFE. As mentioned earlier, CDRnominally locks to the mean of the timing distribution at the transition. For a rising transition (am 1, ami -1), the mean

of the transition sampleZm-1/2 is expressed by the left hand side of Equation 5, assuming dc-balance in the input data.Similar analysis can be done for the falling transition. The CDR is stable when the mean of the transition sample is equal tozero since this implies that the average timing error into the CDR is also equal to zero.

(ts T 2) (9p(ts + T 2) dfe,1 2)= 0 Equation 5

Substituting Equation 5 into Equation 2 gives us the solution for the n-tap E-DFE:

dfen =2(ts +T/2+nT)dfek = 2((p(ts +T/2+kT)- dfek±l 2), 1<k E ao

2 Throughout this discussion of D-DFE we assume that error information is only collected when the current data is a"1".

Similar analysis holds in the general case, but in hardware requires an additional error sampler.

266

Page 3: [IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA (2006.10.23-2006.10.25)] 2006 IEEE Electrical Performane of Electronic Packaging - Performance Analysis

281

The steady state CDR phase (t5) and the final DFE taps are found by solving Equations 5 and 6 concurrently. D-DFE on theother hand forces Zm to equal am9(ts). This gives us the equivalent of Equation 6 for D-DFE (Equation 7). Similarly, thesteady state CDR phase and DFE taps for D-DFE are found by concurrently solving Equations 5 and 7.

dfek = (ts + kT) Equation 7

Intuitively, Equation 6 tells us that E-DFE tries to cancel an edge ISI component using the combined contribution ofhalf of the preceding DFE tap and half of the following DFE tap. The recursion in Equation 6 shows that in E-DFE, eachDFE tap k depends on the edge ISI contributions from the k-th through n-th preceding bits. This is in stark contrast to D-DFE whose DFE taps are only correlated to individual ISI components. For example, in E-DFE, the first tap depends on theISI from all the preceding bits,

dfe, =2 (9(ts +T 2+kT) -(ts +3T 2+kT)) Equation8k=1,3,5...

This is clearly a suboptimal estimate of the first post-cursor data ISI (a (ts+±). Simulation data in the next section show thatE-DFE results in more data ISI leading to worse BER performance despite achieving less edge ISI.

IV. SIMULATION RESULTS

To compare the performance of these two schemes, we use the statistical link performance analysis tool developed byStojanovic in [3] to estimate the system BER for 13 backplane channels of various length (from 3.5" to 23.7" on backplaneplus 7" on line cards) and configurations (e.g. counter boring of vias, dielectric material etc.). The tool takes many noisesources into account, such as power supply noise, PLL jitter, CDR dithering, ISI, crosstalk etc. and estimates the link BER.

Figure 2 shows the BER ofE-DFE (4 taps), D-DFE (4 taps) and no DFE for 13 backplane channels at 7.5Gb/s. For allthe channels examined, D-DFE has the best performance among the three schemes. Compared with no DFE, E-DFE onlyimproves the channel BER significantly in a few cases (sample channels 2, 3, 4 and 13). For other channels, E-DFE does notimprove channel BER much. For sample channel 10, it even degraded performance in comparison to the no DFE case.

Figure 3 shows the single bit responses for sample channel 2 with E-DFE, with D-DFE and without DFE. Aspredicted by Equation 5, the pre-edge and post-edge samples straddling the main data sample of the (equalized) single bitresponse are equal at the nominal CDR locking phase. Furthermore, E-DFE zero forces the edge ISIs as expected. As a byproduct, data ISI is partially cancelled. The ISI distribution at the data sampling phase and the edge sampling phase fordifferent DFE methods are shown in Figure 4 and 5. E-DFE cleans up the edge ISI producing the cleanest input to the CDR.Interestingly, Figure 5 also shows that D-DFE reduces edge ISI. This is in contrast to the claim that D-DFE cancels data ISIbut leaves edge ISI uncorrected [4]. D-DFE actually provides partial cancellation of edge ISI just as E-DFE provides partialcancellation of data ISI.

Despite E-DFE's advantages in removing edge ISI, D-DFE achieves lower BER due to better ISI cancellation at thedata sample time. This contradicts the claim by [2] that E-DFE minimizes data ISI and BER. For example, Figure 6 showsthe single bit responses for sample channel 10. We can clearly see why E-DFE failed to cancel data ISI in this case. Thepolarity of the first post-cursor edge ISI is opposite to that of the data ISI. In this case, E-DFE zeros out edge ISI but at thesame time introduces more ISI at the data sample (Figure 7) and therefore increases the BER.

Generally, designers expect better performance by adding more DFE taps.3 D-DFE offers better performance withmore DFE taps (Figure 8). However, more taps can result in worse performance in E-DFE (Figure 8) because the partial dataISI cancellation more likely becomes incorrect ISI cancellation when the ISI components are small. This results in erroraccumulation in E-DFE as the DFE becomes longer.

V. CONCLUSIONS

In this paper, we compared E-DFE [1], with traditional D-DFE [3]. Simulation results for 13 sample backplanechannels show that D-DFE offers better performance than E-DFE in all cases. E-DFE improves BER compared with no DFEfor some channels. For some channels, minimizing the error at the edge phase has shown to increase the error at data phase.For this reason, E-DFE can even degrade the BER by introducing more ISI at the center of the eye.

3 This is true in the absence of quantization noise from the DFE.

267

Page 4: [IEEE 2006 IEEE Electrical Performane of Electronic Packaging - Scottsdale, AZ, USA (2006.10.23-2006.10.25)] 2006 IEEE Electrical Performane of Electronic Packaging - Performance Analysis

DFE CLK

Figure 1. Conventional DFE architecture

D1DFEDFE

rno BFE05

04

:F 0.3

0OA0

-20

-40

t -60CDI- -80

-1.00En,< OFE

DD-FE

samrple channels

Figure 2. BER comparison of E-DFE (4 taps), D-DFE (4taps) and no DFE across 13 backplane channels.

15I distribution onr Data

E-DFE0.015 no1.

o0o1

0*005[

aOA1 1 6 1 7 1 9 2 22A 22 23

tine )(ns'

Figure 3. Single bit responses for sample channel 2 with D-DFE, E-DFEand no DFE. The markers indicate the 2x over-sampled samples.

IS] distribution on Edge0.02-

0.015

D DOFE.E DFE

..no DFE

0.01

0.00

4 -0.2 0 0. A0

Figure 4. ISI distribution at data sampling phase forsample channel 2.

DDFEE-DFE

=no DFE

047

Dos.0403,OW0. .0.10

.... ........................................

-82 -01 0 01: 02

Figure 5. ISI distribution at edge sampling phase for sample channel 2.

x 10 ISI distrilbution on Datasr

D-ODFEE-DFE*wno FE

4

8A4 -022 0 02 OA

Figure 7. ISI distribution at data sampling phase for sample channel 10.

61

1516 17 11 .s 2 21 22 23 24time (n )

Figure 6. Single bit responses for sample channel 10 withD-DFE, E-DFE and no DFE. The markers indicate the 2xover-sampled samples.

-20, E-DFE-40 DwIDOFE

-60i

0.8cm_

.2A00-120i

-140numnber of OFE taps

Figure 8. Impact ofDFE length on performance.

REFERENCES

S. Wu et al., "Design of a 6.25Gbps Backplane SerDes with TOP-down Design Methodology", DesignCon East, 2004.

R. Payne et al., "A 6.25Gb/s Binary Adaptive DFE with First Post-Cursor Tap Cancellation for Serial BackplaneCommunications", IEEE International Solid-State Circuits Conference, 2005.

V. Stojanovic, "Channel-Limited High-Speed Links: Modeling, Analysis and Design", ProQuest/ UMI, 2006.

K-L. J.Wong, C-K. K. Yang, "A Serial-Link Transciever with Transition Equalization", IEEE International Solid-StateCircuits Conference, 2006.

B. Brunn et al., "Edge-Equalization Extends Performance in Multi_Gigabit Serial Links", DesignCon, 2005.

K. Yamaguchi et al., "12Gb/s Duobinary Signaling with x2 Oversampled Edge Equalization", IEEE International Solid-State Circuits Conference, 2005.

268

282

[1][2]

[3][4]

[5][6]