Equalizer Design and Performance Trade-Offs in ADC-Based Serial Links


    2096 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 58, NO. 9, SEPTEMBER 2011

    Equalizer Design and Performance Trade-Offs in ADC-Based Serial Links

    Jaeha Kim, Senior Member, IEEE, E.-Hung Chen, Member, IEEE, Jihong Ren, Member, IEEE, Brian S. Leibowitz, Member, IEEE, Patrick Satarzadeh, Member, IEEE, Jared L. Zerbe, Member, IEEE, and Chih-Kong Ken Yang, Fellow, IEEE

    Abstract: This paper investigates the performance benefit of using nonuniformly quantized ADCs for implementing high-speed serial receivers with decision-feedback equalization (DFE). A way of determining an optimal set of ADC thresholds to achieve the minimum bit-error rate (BER) is described, which can yield a very different set from the one that minimizes signal quantization errors. By recognizing that both the loop-unrolling DFE receiver and the ADC-based DFE receiver decide each received bit based upon the result of a single slicer, an efficient architecture named the reduced-slicer partial-response DFE (RS-PRDFE) receiver is proposed. The RS-PRDFE receiver eliminates redundant or unused slicers from the previous DFE receiver implementations. Both the simulation and measurement results from a 10 Gb/s ADC-based receiver fabricated in 65 nm CMOS technology and multiple backplane channels demonstrate that the RS-PRDFE can achieve the BER of a 3–4-bit uniform ADC with only 4 data slicers. Also, the combined use of linear equalizers (LEs) can further reduce the required slicer count in RS-PRDFE receivers, but only when the LEs are realized in the analog domain.

    Index Terms: Analog-digital conversion, data communication, equalizers, receivers.

    I. INTRODUCTION

    As the complexity of electrical and optical communication links increases, there is a growing interest in implementing the transceivers based on analog-to-digital converters (ADCs) and digital signal processors (DSPs) [1]–[4]. As data rates have risen, various channel impairments including skin loss, dielectric loss, reflections, and crosstalk have become more pronounced and call for advanced coding and modulation schemes. While the aggressive scaling of CMOS has made it feasible to build fast digital logic that can perform such sophisticated signal processing algorithms in the digital domain, it is still very challenging to design an ADC with above 10 Gbps sampling rates and high enough resolution. For example, even at a moderate resolution of 6 bits, an ADC may dissipate more than 1 W [5]–[8]. Such high power consumption of ADCs has been a discouraging factor for their full adoption in backplane transceivers, and the design of low-power ADCs has been one of the primary research directions. For instance, a recently published work demonstrated a 10 Gbps ADC-based backplane transceiver consuming 0.5 W [7].

    Manuscript received March 07, 2011; revised May 19, 2011; accepted June 23, 2011. Date of current version September 14, 2011. This paper was recommended by Editor G. Manganaro.

    J. Kim is with the School of Electrical Engineering and Computer Science, Seoul National University, Seoul, 151-742, Korea (e-mail: [email protected]).

    E.-H. Chen and C.-K. K. Yang are with the Electrical Engineering Department, University of California, Los Angeles, CA, 90095 USA (e-mail: [email protected]; [email protected]).

    J. Ren, B. S. Leibowitz, and J. L. Zerbe are with Rambus, Inc., Sunnyvale, CA 94089, USA (e-mail: [email protected]; [email protected]; [email protected]).

    P. Satarzadeh was with Rambus, Inc., Sunnyvale, CA 94089, USA. He is now with Texas Instruments, Inc., Dallas, TX 75243 USA (e-mail: [email protected]).

    Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

    Digital Object Identifier 10.1109/TCSI.2011.2162465

    In recognition of these trends, this paper aims to strike a balance between the flexibility and power efficiency in ADC-based transceiver designs. In particular, it describes a way of maximizing the performance of an ADC-based receiver with a coarse-resolution ADC, performing linear equalization (LE) and decision feedback equalization (DFE). We find that for a given target bit-error rate (BER) performance, the required ADC resolution can be greatly relaxed when the ADC is allowed to have nonuniform quantization levels.

    It is noteworthy that the optimal placement of such nonuniform ADC decision thresholds is not necessarily the one that minimizes the quantization errors, especially for low-resolution ADCs. We explain this by demonstrating that an ADC-based DFE receiver is in fact equivalent to a loop-unrolling DFE receiver [9]–[11]. The optimal threshold placement for the minimum bit error rate (BER) is the one that maximizes the signal margin of the selected data slicer. The equivalence between the ADC-based DFE receiver and the loop-unrolling DFE receiver is discussed in Section II.

    Based on this observation, this paper proposes an optimally configured, nonuniform ADC-based receiver, called a reduced-slicer partial-response DFE (RS-PRDFE) [15], which differentiates itself from the conventional, fully expanded, loop-unrolling DFE, also referred to as partial-response DFE (PRDFE). Section III describes a dynamic programming algorithm that can determine the optimal placement of the ADC thresholds for given channel characteristics. The performance of the RS-PRDFE receivers is demonstrated both in simulation and in measurement, as described in Section IV. Section V then addresses the topic of jointly optimizing the RS-PRDFE receiver with various types of linear equalizers.

    II. REDUCED-SLICER PARTIAL-RESPONSE DECISION FEEDBACK EQUALIZER (RS-PRDFE)

     A. Equivalence Between ADC-Based DFE and Loop-Unrolling DFE Receivers

    Fig. 1 describes the signal flow in an ADC-based receiver performing decision feedback equalization (DFE). Once the ADC converts the received signal into a digital form, the DSP performs the DFE operation, which computes and subtracts the appropriate amount of offset from the digitized input based on the prior bit decisions. The DSP also contains the decision slicer, which compares the resulting value with a threshold and determines the current bit. To minimize BERs, the signal-to-noise ratio (SNR) at this decision slicer's input must be maximized. The quantization errors introduced by the ADC count towards the unwanted noise, and hence the ADC strives to have as high a resolution as possible.

    Fig. 1. An ADC-based DFE receiver. (a) Its architecture. (b) Its signal flow diagram where the ADC is modeled as a source of quantization noise.

    On the other hand, an analog DFE receiver subtracts the offset voltage in the analog domain as illustrated in Fig. 2(a). Another difference is that its decision slicer compares the resulting signal with an analog threshold (analog comparison) while that in the ADC-based receiver in Fig. 1 compares the two inputs in digital forms (digital comparison). Since the slicer output is always a binary value, the signal around the DFE loop crosses the analog-digital boundary twice: once through the analog comparator and once through the feedback path generating the analog offset from the prior bits. The two conversion steps make it difficult to close the timing around the loop within one bit period.

    Loop-unrolling DFEs or partial-response DFEs (PRDFEs) mitigate this difficulty by shifting this timing loop entirely into the digital domain [9]–[11]. As illustrated in Fig. 2(b), the receiver precomputes all possible offset values and compares the input signal with each and every offset. Once all the results enter the digital domain, one of them is selected based on the prior bit history. Since the decision feedback loop is now entirely within the digital domain, higher frequency operation is possible. However, a drawback is that the number of offset values to be compared with, and hence the number of decision slicers, grows exponentially with the number of DFE tap coefficients (N).
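The exponential growth can be made concrete with a minimal sketch that enumerates the offsets a loop-unrolled receiver would have to precompute. The function name `prdfe_offsets`, the +1/-1 bit encoding, and the tap values are illustrative assumptions, not taken from the paper.

```python
from itertools import product

def prdfe_offsets(taps):
    """Return {prior-bit history: ISI offset} for an N-tap loop-unrolled DFE.

    taps[i] is the (hypothetical) postcursor ISI left by the bit received
    i+1 periods earlier; bits are encoded as +1 / -1.
    """
    return {
        history: sum(b * c for b, c in zip(history, taps))
        for history in product((-1, +1), repeat=len(taps))
    }

# Illustrative tap values only: three taps give 2**3 = 8 histories,
# so a fully unrolled PRDFE would need 8 slicers.
offsets = prdfe_offsets([0.25, 0.125, 0.0625])
print(len(offsets))
```

Each added tap doubles the number of dictionary entries, which is the slicer-count growth the paragraph describes.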

    The seemingly different ADC-based DFE receiver in Fig. 1(a) and loop-unrolling PRDFE receiver in Fig. 2(b) are in fact equivalent and can be optimized using the same principles. Recall that the core DFE operation is to subtract a proper offset from the received signal before the bit decision. For both types of receivers, the bit decision is made based on a single, critical analog comparison. For the PRDFE, this is quite apparent since one of the slicer outputs is selected as the current bit decision. For the ADC-based DFE, determining whether the quantized ADC output is greater than a certain offset is equivalent to determining whether the analog input signal is above the quantization threshold that is closest to this offset. In the case of a flash ADC, the bit is decided solely upon one particular slicer output that has the corresponding threshold.

    Fig. 2. Analog DFE receivers. (a) Conventional DFE. (b) Loop-unrolling PRDFE.

    This observation suggests a DFE receiver architecture that we call reduced-slicer partial-response DFE (RS-PRDFE), shown in Fig. 3. This architecture is similar to the PRDFE in Fig. 2(b) in that it selects one of the loop-unrolled slicer decisions as the current bit value. But the key difference is that a slicer is selected through a look-up table (LUT) rather than being direct-mapped, and therefore a single slicer may be selected for multiple prior bit histories. That is, if some of the slicers in the PRDFE have similar threshold values, RS-PRDFE can merge those redundant slicers into one. On the other hand, an ADC-based DFE receiver may contain slicers whose outputs are never used for bit decision, especially when the ADC has uniformly fine resolution. RS-PRDFE eliminates those unused slicers and replaces the thermometer-to-binary conversion, digital feedback equalizer (see Fig. 1), and binary subtraction unit with a simple look-up table and multiplexer.

    Therefore, the proposed RS-PRDFE can save power and area by removing redundant or unused slicers in the loop-unrolling PRDFE and ADC-based DFE receivers without degrading the BER performance.
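The look-up-table idea can be sketched as follows. This is a simplified illustration that merges only histories whose offsets coincide within a tolerance; the optimal merging for a fixed slicer budget is the subject of Section III. The function name `build_rs_prdfe_lut`, the tolerance parameter, and the tap values are hypothetical.

```python
from itertools import product

def build_rs_prdfe_lut(taps, tol=1e-9):
    """Map each prior-bit history to the index of a shared slicer.

    Histories whose ISI offsets agree within `tol` reuse one slicer,
    so len(thresholds) can be smaller than 2**len(taps).
    """
    thresholds = []  # one entry per physical slicer
    lut = {}         # history -> index into thresholds
    for history in product((-1, +1), repeat=len(taps)):
        offset = sum(b * c for b, c in zip(history, taps))
        for idx, th in enumerate(thresholds):
            if abs(th - offset) <= tol:
                lut[history] = idx  # reuse an existing slicer
                break
        else:
            thresholds.append(offset)  # allocate a new slicer
            lut[history] = len(thresholds) - 1
    return thresholds, lut

# Two equal (illustrative) taps: histories (-1,+1) and (+1,-1) both
# produce a zero offset and therefore share one slicer.
thresholds, lut = build_rs_prdfe_lut([0.25, 0.25])
print(thresholds, lut)
```

With these example taps the four histories map onto three slicers instead of four, which is the kind of saving the LUT-based selection makes possible.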


    Fig. 3. The proposed reduced-slicer PRDFE receiver (RS-PRDFE).

    Fig. 4. The factors determining the bit-error rate (BER) of an ADC-based DFE receiver in the presence of channel ISI. The signal margin is degraded by the ISI from the bits within the DFE tap range and the ISI from those outside the DFE range. The lowest BER is achieved when the decision threshold is equal to the ISI contributed by the bits within the DFE tap range.

     B. BER Model for ADC-Based DFE Receivers

    Since the ADC-based DFE, PRDFE, and RS-PRDFE are all functionally equivalent in that each bit decision maps to a single critical analog comparison, their BER performance can be modeled in the same way. The BER is set by the probability that the critical comparison resolves incorrectly.

    To derive the expression for the BER, consider an N-tap DFE receiver for a signaling system whose intersymbol interference (ISI) spans L bit periods. For each of the $2^N$ possible prior N-bit histories, there is a slicer whose comparison result will be used as the current bit decision, and each such critical slicer has its own threshold. If the received signal for a given bit experiences a total ISI from the neighboring L bits, the probability of detecting that bit incorrectly can be expressed as

    (1)

    where the ISI is assumed normalized to the received signal level and $Q(\cdot)$ denotes the right tail probability of the standard normal distribution. Additive Gaussian noise is assumed at the input of the slicer.
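As an illustration of the model just described, the sketch below evaluates the error probability of one critical comparison. It rests on assumptions that are not stated in this form in the text: bits are normalized to +1 and -1, the slicer decides +1 when its input exceeds the threshold, and the noise has standard deviation `sigma`. The function and parameter names are hypothetical.

```python
import math

def q_function(x):
    """Right-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bit_error_probability(bit, isi, threshold, sigma):
    """Error probability of one slicer decision under the assumed model.

    The slicer input is bit + isi + noise, with bit in {+1, -1}.
    The margin is the distance from the noiseless input to the
    threshold, measured toward the correct side.
    """
    margin = bit * (bit + isi - threshold)
    return q_function(margin / sigma)

# A threshold equal to the ISI gives both bit values the full margin of 1;
# a misplaced threshold shrinks the margin for one of them.
ideal = bit_error_probability(+1, isi=0.25, threshold=0.25, sigma=0.25)
offset = bit_error_probability(+1, isi=0.25, threshold=0.5, sigma=0.25)
print(ideal, offset)
```

Under these assumptions the error probability depends only on how far the threshold sits from the ISI seen by the bit, which is why threshold placement rather than signal fidelity governs the BER.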

    The ISI contribution can be decomposed into two parts: the part that can be canceled by the N-tap DFE and the part that cannot (i.e., the ISI contribution from the L-N bits outside the DFE range). By enumerating all possible bit patterns, each resulting in a different amount of ISI, one can derive the expression for the average BER for the 1- and 0-level received bits, given in (2).

    The expression in (2) tells how to choose the decision thresholds for the best performance of an ADC-based DFE. To achieve the minimum BER, the worst case argument of the Q function must be maximized. In other words, one must maximize the worst case signal margin for the slicer. To do so, the decision thresholds should be placed as close as possible to the predictable ISI levels, so that the distance between each threshold and the ISI level it serves is minimized.

    This leads to a very important observation: the best BER performance for an ADC-based DFE is achieved when it is optimized like a PRDFE. It is noteworthy that the nonuniform ADC quantization levels resulting from this principle can be very different from the existing schemes that aim to minimize the signal quantization errors, such as adaptive differential pulse-coded modulation (ADPCM) used in voice applications.

    When trying to minimize the quantization error, one must place the decision thresholds where the signal is most likely to be found. In binary signaling, such an optimization tends to place the thresholds near the likely levels for logic 0 and 1, as shown in Fig. 6(b). However, the next section will demonstrate that the optimal placement for the minimum BER minimizes the threshold error instead and puts the thresholds near the center of the eye, as shown in Fig. 6(a).

    The PRDFE receiver has $2^N$ slicers and therefore can achieve the ideal N-tap DFE performance (i.e., zero threshold error) by assigning each slicer threshold to one of the $2^N$ possible values of the ISI within the DFE range. However, when the receiver has fewer than $2^N$ slicers, it is necessary to optimize the slicer thresholds to minimize the overall BER.

    (2)


    Fig. 5. Illustration of the optimal placement problem of M slicer thresholds in an N-tap RS-PRDFE receiver for the minimum peak error. The problem is equivalent to contiguously grouping the sorted $2^N$ ISI levels into M disjoint groups while minimizing the largest group span.

    Fig. 6. The optimal placements of four slicer thresholds for: (a) the minimum threshold error (i.e., the lowest BER; RS-PRDFE) and (b) the minimum signal quantization error. 5 Gb/s operation on the FR4 channel.

    III. OPTIMIZATION OF RS-PRDFE SLICER THRESHOLDS

    As mentioned previously, the drawback of a PRDFE receiver is that the number of decision slicers grows exponentially with the number of DFE taps. However, for voice-line modems, it is known that the number of quantization steps in the ADC need not grow with the number of DFE taps once the resolution is fine enough for the quantization errors not to limit the overall BER. The proposed RS-PRDFE receiver was inspired by this observation and relaxes the required number of slicers for the PRDFE by merging the slicers with similar threshold values while keeping the threshold quantization error small.

    When an RS-PRDFE receiver is constrained to M slicers, where $M < 2^N$, one must choose the M slicer threshold levels so that the threshold quantization error e is minimized. Note that the set of thresholds assigned to the $2^N$ prior bit patterns can take only M unique values, meaning that some of the slicer outputs may be selected for more than one prior bit pattern. Instead of minimizing the BER expressed in (2) directly, it is easier to minimize the worst case error based on the following approximation:

    (3)

    The approximation is justified because Q is a very steep function: for an extremely low target BER, the BER is easily dominated by the worst case error e among its possible values.
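The steepness argument can be checked numerically with a short sketch. The two margins below (8 and 7 standard deviations) are arbitrary illustrative values, not figures from the paper.

```python
import math

def q_function(x):
    """Right-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Error probability at two hypothetical margins, in units of the noise sigma.
nominal = q_function(8.0)
degraded = q_function(7.0)  # margin reduced by one sigma
ratio = degraded / nominal
print(f"nominal={nominal:.2e} degraded={degraded:.2e} ratio={ratio:.0f}")
```

Losing a single sigma of margin raises the error probability by more than three orders of magnitude in this example, so the bit pattern with the smallest margin contributes almost all of the average BER.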

    The problem of placing M slicer thresholds for the minimum worst case error can be thought of as first clustering the set of $2^N$ ISI values into M disjoint groups and then assigning the center point of each group as the slicer threshold (Fig. 5). The worst case threshold error within each group is equal to one half of the span (i.e., the difference between the minimum and maximum values) of that group. Therefore, minimizing the maximum threshold error is equivalent to finding the contiguous grouping of the ISI levels into M groups that minimizes the largest span of the groups.

    The optimal grouping of ISI levels into any number of groups M can be done via a recursive, dynamic programming procedure. Assuming that the ISI levels $x_i$ are sorted in ascending order, let $S(n, m)$ be the largest group span for the first $n$ ISI levels optimally split into $m$ groups. To recursively express $S(n, m)$ in terms of $S(n', m')$ with $n'$ and $m'$ smaller than $n$ and $m$, respectively, we categorize the possible $m$-groupings of $n$ ISI points based on the number of elements in the last, $m$-th group. This last group can have as few as one element (i.e., $x_n$) and as many as $n-m+1$ elements (since the other $m-1$ groups must each have at least one element). If the last group has $k$ elements, from $x_{n-k+1}$ through $x_n$, then its span is simply $x_n - x_{n-k+1}$, where $k$ can vary from 1 to $n-m+1$. The minimum largest group span possible with the rest of the ISI levels is $S(n-k, m-1)$. One should choose the number of elements $k$ for the last group so that it minimizes the overall largest group span $S(n, m)$, which can be expressed in the following recursive relationship:

    $$S(n, m) = \min_{1 \le k \le n-m+1} \max\{\, S(n-k, m-1),\; x_n - x_{n-k+1} \,\} \qquad (4)$$

    $$S(n, 1) = x_n - x_1 \qquad (5)$$

    With this definition, $S(2^N, M)$ is the minimum largest span achievable for grouping $2^N$ ISI levels into $M$ groups. The optimal thresholds are given by the center of each group's span. The minimum worst case threshold error is hence $S(2^N, M)/2$.
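The dynamic programming procedure described above can be sketched in a few lines. This is a minimal illustration under the stated grouping rules, not the authors' implementation; the function name `optimal_thresholds`, the table names, and the example ISI levels are hypothetical.

```python
def optimal_thresholds(isi_levels, num_slicers):
    """Group sorted ISI levels into contiguous groups with minimal largest span.

    Returns (thresholds, worst_case_error): the center of each group and
    half of the largest group span.
    """
    x = sorted(isi_levels)
    n_levels = len(x)
    if not 1 <= num_slicers <= n_levels:
        raise ValueError("num_slicers must be between 1 and the number of levels")
    inf = float("inf")
    # span[m][n]: smallest achievable largest span for the first n levels in m groups
    span = [[inf] * (n_levels + 1) for _ in range(num_slicers + 1)]
    # last[m][n]: size of the last group in that optimal grouping
    last = [[0] * (n_levels + 1) for _ in range(num_slicers + 1)]
    for n in range(1, n_levels + 1):
        span[1][n] = x[n - 1] - x[0]  # a single group spans everything
        last[1][n] = n
    for m in range(2, num_slicers + 1):
        for n in range(m, n_levels + 1):
            for k in range(1, n - m + 2):  # last group holds k levels
                candidate = max(span[m - 1][n - k], x[n - 1] - x[n - k])
                if candidate < span[m][n]:
                    span[m][n] = candidate
                    last[m][n] = k
    # Walk back through the table to recover the group centers.
    thresholds = []
    n = n_levels
    for m in range(num_slicers, 0, -1):
        k = last[m][n]
        thresholds.append((x[n - k] + x[n - 1]) / 2)
        n -= k
    thresholds.reverse()
    return thresholds, span[num_slicers][n_levels] / 2

# Arbitrary example levels, split into 2 and then 3 groups.
print(optimal_thresholds([1.0, -3.0, 4.0, 0.0, -2.0], 2))
print(optimal_thresholds([1.0, -3.0, 4.0, 0.0, -2.0], 3))
```

The triple loop costs on the order of M times the square of the number of levels, which is modest for a handful of taps but, as the next paragraph notes, still presumes that the levels have been measured and sorted beforehand.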

    While the described dynamic programming procedure is guaranteed to find the optimal threshold placement for any given channel response, it may not be suitable for an online calibration scheme that incrementally updates the threshold levels of the individual slicers and their assignments to the prior bit patterns. One difficulty stems from the fact that the ISI levels need to be sorted first, and the resulting order can vary strongly with the channel characteristics. Until an effective, yet low-cost scheme of incremental adaptation is found, a possible solution is to periodically characterize the channel response (e.g., the single-bit response) and compute the optimal slicer thresholds and assignments by firmware or software.

     A. ADC Threshold Placements for Minimum BER Versus Minimum Quantization Error

    Fig. 6 compares the optimal threshold placements for the minimum threshold error (i.e., the optimal RS-PRDFE) and for the minimum signal quantization error with 4 slicers. Notice the vast difference between the two placements. The optimal RS-PRDFE places the thresholds near the center of the eye while the minimum quantization ADC places those near the signal levels.

    Fig. 7. Simulated bit-error rates of various types of ADC-based DFE receivers operating at 5 Gb/s on the FR4 channel.

    Prior to this work, it was reported that reducing the full-scale range (FSR) of a uniform ADC can improve the DFE performance, even though the ADC cannot faithfully represent the signal due to overflows and underflows [12]. Fig. 6 provides an explanation for these surprising results: a uniform ADC with reduced FSR has threshold levels resembling those of the optimal RS-PRDFE receiver.

    Fig. 7 compares the simulated BER for the four different types of DFE receivers: the optimal RS-PRDFE, the DFE with reduced-FSR ADC, the DFE with nonuniform ADC optimized for minimum quantization error, and the conventional DFE with uniform ADC. The simulations are carried out with an in-house statistical link simulation tool, LinkLab [13], which can simulate system BERs given the single-bit response of the channel. Each analog slicer is assumed to have 10-mV deadband due to hysteresis and metastability and 1-mV input-referred noise.

    The results show that the RS-PRDFE indeed achieves the lowest BER. The uniform ADC with reduced FSR achieves BERs close to the minimum possible values. The results also demonstrate that minimum quantization error is clearly a poor criterion for a DFE receiver, as the resulting performance is sometimes even worse than that of the conventional, uniform ADC. In this case, note that an increase in the slicer count may even degrade the BER, since the minimum quantization error ADC would place the slicer thresholds even closer to the expected signal levels, which may be farther from the levels that are optimal from the BER perspective.

    The next section reconfirms this finding with a 10 Gb/s RS-PRDFE receiver prototype implemented in 65 nm CMOS technology [14].

    IV. EXPERIMENTAL RESULTS

    A simplified diagram of the prototype RS-PRDFE receiver is shown in Fig. 8 along with the photograph of the chip implemented in 65 nm CMOS [14]. The receiver frontend is 4-way interleaved to achieve a 10 Gb/s data rate and can use up to 16 slicers for RS-PRDFE operation, whose reference voltages can be individually adjusted within a 100-mV range in 2-mV steps. Each slicer runs at 2.5 GS/s and consumes 0.75 mW including the clock buffers. The RS-PRDFE slicer selection is performed by a tap-assignment block, which routes each slicer's decision output to the proper input position of the subsequent 32:1 multiplexer. The 32:1 multiplexer then forwards the preselected input to the receiver's final output based on the 5-bit prior history. By performing the slicer selection in the feedforward path rather than in the feedback path as shown in the conceptual architecture in Fig. 3, one can shorten the critical feedback path delay around the multiplexer. Furthermore, when implemented as a tree type, the 32:1 multiplexer can be pipelined, exploiting the fact that the selection input bits arrive serially, to achieve a high throughput of 10 Gb/s using synthesized circuits. Also, the tap assignment block can be utilized to reorder the slicers and minimize the offset errors due to mismatch [16].

    The receiver is tested with a 10 Gb/s, 700 mV PRBS data pattern transmitted via a 25 -long Nelco backplane channel that has 17 dB loss at 5 GHz. The frequency and single-bit response (SBR) of the channel are shown in Fig. 9. To reduce the precursor ISIs and to effectively explore the receiver performance with different single-bit responses, a tunable prefiltering circuit consisting of a high-pass filter (HPF), a 1-tap discrete-time FIR filter, and a variable-gain amplifier (VGA) is implemented in front of the RS-PRDFE receiver. In one setting, the effective SBR was as shown in Fig. 10(a).

    The effective voltage margin of the receiver is measured by inserting an extra slicer that samples the incoming signal with an adjustable, deliberate voltage offset. Its output is fed into a replica datapath which is identical to the main datapath but replaces one of the main slicer outputs with the extra slicer output, as shown in Fig. 8(a). Since the output data stream of the replica path should be identical to that of the main path except the data originating from the extra slicer, the two data streams can be XORed to measure the voltage margin of the data slicer being replaced.

    Fig. 10(b) compares the measured voltage margin of various receiver architectures that can be configured by the described prototype chip: the RS-PRDFE receiver, loop-unrolling PRDFE receiver, ADC-based receiver with uniform resolution, and ADC-based receiver with reduced full-scale range (FSR). Since the effective SBR has one large postcursor ISI of 53 mV followed by many small ones [Fig. 10(a)], the possible ISI offsets and hence the optimal slicer thresholds are clustered around two voltage levels. Such an uneven distribution of the required ADC thresholds is difficult to realize with a low-resolution uniform ADC and, as a result, the uniform ADC receiver performs the worst in Fig. 10(b). Reducing the FSR of the ADC improves the voltage margin somewhat. The RS-PRDFE receiver always outperforms both the reduced-FSR ADC and PRDFE receivers for the same number of slicers used. In particular, the 4-slicer RS-PRDFE achieves the equivalent voltage margin of the 16-slicer (4-bit) uniformly spaced ADC.

    The extra slicer performing the pseudo-BER detection is triggered by an independently adjustable clock phase, and the effective eye diagram seen by the DFE receiver can be obtained by measuring the BERs as a function of voltage and timing offsets. Fig. 11 plots such effective eye diagrams measured for the receiver configurations mentioned earlier with 4 slicers. As expected, the RS-PRDFE receiver achieves the widest eye opening.

    Fig. 8. (a) Block diagram of the prototype RS-PRDFE receiver with a voltage margin detection circuit. (b) Chip photograph. The receiver contains an analog frontend, a 16-slicer flash ADC with adjustable reference, and a 5-tap digital DFE. Total active area is 0.26 mm².

    Fig. 9. The measured (a) frequency response and (b) 10 Gb/s single-bit response of a 25 Nelco backplane channel.

    Fig. 10. (a) The measured single-bit response of a 24 Nelco backplane channel with prefiltering. (b) Measured voltage margin versus the number of slicers (M) for different ADC-based receiver configurations.

    Fig. 11. Measured eye diagram of the 4-slicer ADC receivers: (a) the partial response eye diagrams of the individual 4 slicers; the effective eye diagrams of (b) RS-PRDFE, (c) reduced-FSR ADC, and (d) uniform quantization ADC.

    Fig. 12. (a) The measured single-bit response with a different prefiltering setting. (b) Measured voltage margin versus the number of slicers (M) for different ADC-based receiver configurations.

    It should be noted that the performance benefits of an RS-PRDFE receiver strongly depend on the SBR characteristics. For example, Fig. 12 compares the receiver voltage margins for a different prefiltering setting. For the effective SBR shown in Fig. 12(a), the RS-PRDFE continues to outperform the uniform ADC and reduced-FSR ADC receivers, but it has no performance gain over the PRDFE receiver. This is because the SBR has 5 postcursor ISIs with very distinct values, while the previous SBR in Fig. 10(a) has 3 ISIs at 5 mV. The latter distribution of ISIs results in some overlaps among the possible ISI offsets, allowing the RS-PRDFE to save slicers for the same BER performance. For instance, if all the postcursor ISIs had equal magnitudes, the required number of slicers would grow only linearly with the number of DFE taps, rather than exponentially as in a PRDFE receiver.
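The linear-versus-exponential claim can be verified by counting distinct ISI offsets. The sketch below is illustrative only: the helper name `count_unique_offsets` and the tap values are hypothetical, and exact fractions are used so that equal sums compare equal.

```python
from fractions import Fraction
from itertools import product

def count_unique_offsets(taps):
    """Count the distinct ISI offsets over all +1/-1 prior-bit histories."""
    return len({
        sum(b * c for b, c in zip(history, taps))
        for history in product((-1, +1), repeat=len(taps))
    })

for n_taps in range(1, 6):
    equal = [Fraction(1, 8)] * n_taps                             # equal magnitudes
    distinct = [Fraction(1, 2 ** i) for i in range(1, n_taps + 1)]  # binary-weighted
    print(n_taps, count_unique_offsets(equal), count_unique_offsets(distinct))
```

Equal taps yield N+1 distinct offsets because only the number of +1 bits in the history matters, whereas binary-weighted taps yield all 2^N.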

    The reduction in the required slicer count leads to savings in power. The measured power dissipation of this prototype receiver including the analog frontend was 130 mW (13 pJ/bit) with the 16-slicer RS-PRDFE configuration and only 106 mW (10.6 pJ/bit) with the 8-slicer RS-PRDFE configuration.
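The energy-per-bit figures follow directly from the measured power and the 10 Gb/s data rate, as the short calculation below shows. The last estimate additionally assumes that each slicer is replicated in all 4 interleaved lanes at the reported 0.75 mW per copy; that replication is an inference, not something the text states.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Power in mW divided by data rate in Gb/s gives energy in pJ/bit."""
    return power_mw / data_rate_gbps

DATA_RATE_GBPS = 10.0
e16 = energy_per_bit_pj(130.0, DATA_RATE_GBPS)  # 16-slicer configuration
e8 = energy_per_bit_pj(106.0, DATA_RATE_GBPS)   # 8-slicer configuration

# Assumption: 8 removed slicers x 4 interleaved copies x 0.75 mW per copy.
saved_mw_estimate = 8 * 4 * 0.75
measured_saving_mw = 130.0 - 106.0
print(e16, e8, measured_saving_mw, saved_mw_estimate)
```

Under that assumption the estimated saving matches the measured 24 mW difference between the two configurations.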

    This observation suggests that it might be possible to further improve the performance of the RS-PRDFE receivers by shaping the effective SBRs in a certain way with linear equalizers. The next section investigates this possibility of optimizing the RS-PRDFE receiver jointly with linear equalizers, such as transmit and receive equalizers.

    Fig. 13. Different approaches to reduce the RS-PRDFE slicer count with linear equalizers: (a) suppress the far-end ISIs to zeros, (b) suppress any ISIs within the DFE tap range, (c) make the ISIs specific values to force overlaps in ISI offsets.

    V. JOINT OPTIMIZATION WITH LINEAR EQUALIZERS

    Combining the described RS-PRDFE receiver with linear equalizers presents interesting new opportunities to further reduce the slicer count. With linear equalizers either on the transmitter side or on the receiver side, one can change the effective channel characteristics to some degree. Then the question is: how should one shape the channel to achieve the lowest BER with minimal hardware? A typical answer for conventional DFE receivers is to minimize the ISIs outside the DFE tap range, since the number of decision slicers does not depend on the ISI offset values being subtracted.

    In an RS-PRDFE receiver, on the other hand, the number

    of required slicers does depend on the distribution of the ISI

    offsets created by the N prior bits within the DFE tap range.

    As described in Section III, RS-PRDFE receivers leverage the

    fact that some ISI offsets can be close to one another and share

    a common slicer to reduce the hardware cost. Therefore, it is

    also possible to reduce the slicer count not only by making the

    channel cleaner (i.e., suppressing more ISIs to zero) but also by

    creating deliberate overlaps among the ISI offsets. Fig. 13 illus-

    trates this point. Suppose a channel with three postcursor ISIs.An RS-PRDFE receiver would need 8 slicers if their ISI offsets

    are not sufficiently close to each other. For traditional PRDFE

    receivers, the slicer count can be reduced only if the last ISI is

    suppressed to zero [Fig. 13(a)]. However, with RS-PRDFE re-

    ceivers, the slicer count can also be reduced if any ISIs within

    the DFE tap range become zero [Fig. 13(b)]. In this case, the

    RS-PRDFE receiver acts as a roving-tap DFE whose nonzero

    tap positions can freely move within the range. In addition, fur-

    ther reduction in the slicer count can be achieved if the first

    and second ISIs become equal values, for example [Fig. 13(c)].

    Since this channel generates only 3 unique ISI offsets, 3 slicers

    can cancel all the possible ISI offsets.
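The slicer counts in this example can be checked by enumerating the ISI offsets directly. The following Python sketch is illustrative only: it assumes binary (±1) signaling, the tap values are hypothetical, and the helper name `unique_isi_offsets` is not from the paper. For the equal-tap case it also assumes that the third ISI is zero, since that is the condition under which exactly 3 distinct offsets remain.

```python
from itertools import product

def unique_isi_offsets(postcursors, tol=1e-9):
    """Return the sorted distinct ISI offsets over all prior-bit patterns.

    Each prior bit is assumed to be +1 or -1 (binary signaling), so the
    offset for one pattern is the signed sum of the postcursor ISI values.
    """
    offsets = []
    for bits in product((-1, +1), repeat=len(postcursors)):
        value = sum(b * h for b, h in zip(bits, postcursors))
        # Offsets closer than tol are treated as one and can share a slicer.
        if not any(abs(value - seen) <= tol for seen in offsets):
            offsets.append(value)
    return sorted(offsets)

# Hypothetical postcursor values, chosen only to illustrate the counting.
all_distinct = unique_isi_offsets([0.30, 0.17, 0.08])  # 8 distinct offsets
zero_inside = unique_isi_offsets([0.30, 0.00, 0.08])   # 4 distinct offsets
equal_taps = unique_isi_offsets([0.20, 0.20, 0.00])    # 3 distinct offsets

print(len(all_distinct), len(zero_inside), len(equal_taps))
```

The number of distinct offsets is the number of slicers an RS-PRDFE receiver would need under these assumptions, before any further merging of offsets that are merely close to one another.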

    Which option is the best? The answer depends on the cost of the linear equalizer shaping the channel for each option listed


    Fig. 14. An example circuit implementation of transmit FIR equalizer.

    in Fig. 13. Different types of equalizers such as transmit FIR

    equalizer, receive FIR equalizer, and receive continuous-time

    equalizer have different costs and will be examined. Since the

    cost also depends on the amount of change the linear equalizer

    should affect, the answer strongly depends on the channel loss

    and dispersion characteristics.

     A. RS-PRDFE With Transmit FIR Equalizers

    Transmit FIR filters are widely used in today’s chip-to-chip

    and backplane transceivers. One reason is that their high-speed operation can be easily achieved with low-resolution, digital-to-analog-converter-like circuits with nonuniform quantization steps. Fig. 14 shows an example of a transmit linear equalizer.

    The main drawback of transmit equalizers is that their peak output swing is constrained. That is, the peak output voltage or current must not exceed given limits, typically set by the circuit topology or the operating conditions (e.g., supply voltage). If the transmit equalizer has an impulse response of $w_i$, its peak output is proportional to the sum of the absolute values of the $w_i$'s and should be limited to a certain value:

    $$\sum_i |w_i| \le W_{\max} \qquad (6)$$

    Constraint (6) implies that every nonzero tap coefficient in the transmit FIR equalizer reduces the main signal swing $w_0$. In other words, the transmit equalizer may significantly degrade signal margins if it tries to alter the channel response too much.
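The cost of the peak swing constraint can be illustrated numerically. The sketch below assumes that the peak output budget is normalized to 1 and that the main tap receives whatever budget the other taps leave unused; the function name `main_swing` and the tap values are hypothetical and serve only as an example.

```python
def main_swing(side_taps, peak_limit=1.0):
    """Main-tap swing left after the side taps consume part of the peak budget.

    The peak output is taken to be the sum of the absolute tap values, so
    every nonzero side tap directly reduces the swing left for the main tap.
    """
    used = sum(abs(w) for w in side_taps)
    if used >= peak_limit:
        raise ValueError("side taps exhaust the peak output budget")
    return peak_limit - used

# Small corrections, as when cancelling small far-end ISIs (hypothetical values).
far_end_fix = main_swing([-0.05, -0.03])   # about 0.92 of the full swing

# Large corrections, as when reshaping large near-end ISIs (hypothetical values).
near_end_fix = main_swing([-0.25, -0.15])  # about 0.60 of the full swing

print(far_end_fix, near_end_fix)
```

Under these assumptions, reshaping the larger ISIs costs several times more main signal swing than cancelling the smaller ones.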

    Due to this peak swing constraint, for many channels encountered in chip-to-chip and backplane applications, the

    optimal configurations for transmit linear equalizer combined

    with RS-PRDFE are generally the ones that use the transmit

    LE to suppress the far-end ISIs first. The far-end ISIs refer

    to the postcursor ISIs that are positioned far from the main

    cursor. Since those ISIs are usually smaller in magnitude than

    the near-end ones (the postcursor ISIs that are nearer to the

    main cursor), using the transmit FIR equalizer to suppress them

    results in less degradation in the signal margin. On the other

    hand, changing the larger ISIs in order to create overlaps among

    the ISI offsets may require large tap coefficients and reduce the

    main signal swing significantly.

    To explore such trade-offs in combining RS-PRDFE with various types of linear equalizers, a few representative backplane

    Fig. 15. Measured S-parameters (S21) of two example backplane channels. (a) A 3"-long trace on a FR4 backplane. (b) A 10.6"-long trace on Elma ATCA Dual Star backplane. The measurement includes the characteristics of the connectors, line card traces, and the 5-feet-long, low-loss SMA cables.

    Fig. 16. The measured 10-Gb/s single-bit responses for the two channels.

    channels were chosen. The measured S-parameter characteris-

    tics and single-bit responses (SBR) of those channels are shown

    in Figs. 15 and 16, respectively. The first channel, a 3"-long trace on a FR4 backplane, has a low loss of 15 dB at 5 GHz but strong reflections, while the second channel, a 10.6"-long trace on Elma ATCA Dual Star backplane, has higher dispersion loss, with a total loss of 20 dB at 5 GHz.

    The effects of a linear equalizer were emulated by convolving

    the measured 10-Gb/s, PRBS time-domain waveforms

    seen at the channel output with the impulse response of the linear equalizer in MATLAB. The time-domain waveforms

    were collected with a sampling oscilloscope with pattern lock 

    capability (Agilent DCA-J 86100C). The time and voltage

    resolutions were 6.25 ps (32 points per unit interval) and 1 mV,

    respectively. Since the measured waveforms include noise,

    this procedure can predict the noise enhancement effects of 

    certain linear equalizers as well.
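The emulation step described above can be sketched as follows. The paper used MATLAB; this is a NumPy equivalent written under stated assumptions: the equalizer is a symbol-spaced, causal FIR filter whose first tap is the main cursor, the waveform is sampled at 32 points per unit interval, and the function name and tap values are hypothetical.

```python
import numpy as np

SAMPLES_PER_UI = 32  # 6.25 ps steps at 10 Gb/s, as stated in the text

def apply_symbol_spaced_fir(waveform, taps, samples_per_ui=SAMPLES_PER_UI):
    """Convolve an oversampled waveform with a symbol-spaced FIR equalizer.

    The tap coefficients are placed one unit interval apart in an otherwise
    zero impulse response, which is then convolved with the waveform.
    """
    impulse = np.zeros((len(taps) - 1) * samples_per_ui + 1)
    impulse[::samples_per_ui] = taps
    # Keep the output the same length as the input (causal filter).
    return np.convolve(waveform, impulse)[:len(waveform)]

# Stand-in for a measured waveform: a single unit sample (hypothetical data).
waveform = np.zeros(4 * SAMPLES_PER_UI)
waveform[0] = 1.0

# Two-tap equalizer: main cursor plus one negative postcursor tap.
equalized = apply_symbol_spaced_fir(waveform, [1.0, -0.25])
print(equalized[0], equalized[SAMPLES_PER_UI])
```

Because the filter acts on the measured samples themselves, any noise present in the waveform is filtered along with the signal, which is how the procedure captures noise enhancement.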

    The signal quality seen by the RS-PRDFE receiver can be

    visualized by constructing an effective composite eye diagram,

    as illustrated in Fig. 17. The eye diagrams for the individual

    slicers were first composed by accumulating the input traces

    only when the corresponding slicer output was selected as the

    received bits. Then, these individual eye diagrams were folded into one after shifting each so that its decision threshold sits at zero.


    Fig. 17. The construction of the effective eye diagram for an RS-PRDFE receiver. The eye diagrams seen by the individual decision slicers (left) are folded onto a single eye diagram, after being adjusted for their different slicer threshold levels (right).

    Fig. 18. The benefits of combining transmit FIR equalizers with RS-PRDFE. (a) Equalized eye diagrams by the optimal transmit FIR filters. (b) Effective eye diagrams seen by the optimal 4-slicer RS-PRDFE receivers.

    The opening in the resulting effective eye diagram indicates the

    voltage and timing margins of the RS-PRDFE receiver.
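The folding procedure can be sketched in a few lines. The example below is a simplified illustration, not the authors' tool: it assumes the input traces are already cut into one segment per received bit, that the index of the slicer selected for each bit is known, and that the voltage margin is read at a single sampling instant. All names and numbers are hypothetical.

```python
import numpy as np

def fold_effective_eye(traces, selected_slicer, thresholds):
    """Shift each per-bit trace by the threshold of the slicer that decided it.

    traces          : array of shape (num_bits, samples_per_trace)
    selected_slicer : index of the slicer selected for each bit
    thresholds      : decision threshold of each slicer
    After the shift, every trace is referenced to a common threshold of zero.
    """
    traces = np.asarray(traces, dtype=float)
    shift = np.asarray(thresholds, dtype=float)[np.asarray(selected_slicer)]
    return traces - shift[:, None]

def voltage_margin(folded, sample_index):
    """Smallest distance from the zero threshold at the sampling instant."""
    return float(np.min(np.abs(folded[:, sample_index])))

# Three bits decided by two slicers with thresholds 0.3 and -0.2 (made-up data).
traces = [[0.5, 0.6], [-0.1, 0.0], [0.2, 0.1]]
folded = fold_effective_eye(traces, selected_slicer=[0, 1, 0],
                            thresholds=[0.3, -0.2])
print(voltage_margin(folded, sample_index=0))
```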

    Fig. 18 shows the eye diagrams after the jointly optimized

    transmit equalizer and the effective eye diagrams seen by the

    RS-PRDFE receiver with 4 effective taps and 4 slicers (i.e., $N = 4$ and $M = 4$) for the two example channels described. For both

    channels, the optimal transmit equalizers suppress mainly the

    far-end ISIs and widen the eye openings. The benefits of com-

    bining the transmit equalizer with RS-PRDFE are also shown

    in Fig. 19, which plots the simulated signal margins of the optimal 4-tap RS-PRDFE receiver with different numbers of slicers. For both channel examples, the RS-PRDFE receiver with only 4 slicers can achieve signal margins comparable to those of a 4-tap PRDFE receiver that would require 16 slicers.

     B. RS-PRDFE With Receive Linear Equalizers

    LEs on the receiver side, on the other hand, are not subject to a peak swing constraint like that of the transmit equalizers.

    Fig. 19. The simulated signal margins versus the number of slicers (M) of the RS-PRDFE receivers with and without the transmit FIR equalizers. (a) 3" FR4 channel. (b) 10.6" ATCA channel.

    The difference is that the equalization is applied after the signal has been attenuated by the channel rather than before. However, boosting the high-frequency content of the received signal can enhance noise, degrading the SNR and the BER.

    At the moment, one of the most widely used receive equalizer types is the continuous-time linear equalizer (CTLE), an example circuit of which is shown in Fig. 20(a). This circuit realizes a

    high-pass filter by enhancing the transconductance of the input

    stage at high frequencies. However, most CTLEs used in prac-

    tice are of low orders, lacking enough degrees of freedom to

    shape the channel responses into the ones in Fig. 13(b) or (c). In-

    stead, their high-pass characteristics are best utilized when im-

    proving the channel bandwidth and thus suppressing the far-end

    ISIs first. The remaining near-end ISIs are handled by the DFE.
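The limited flexibility of a low-order CTLE can be seen from a simple behavioral model. The sketch below uses a one-zero, two-pole transfer function, which is a common textbook approximation and not the transfer function of the circuit in Fig. 20(a); the function name and the corner frequencies are hypothetical.

```python
import numpy as np

def ctle_gain_db(freq_hz, zero_hz, pole1_hz, pole2_hz, dc_gain=1.0):
    """Magnitude in dB of H = dc_gain * (1 + jf/fz) / ((1 + jf/fp1) * (1 + jf/fp2))."""
    jf = 1j * np.asarray(freq_hz, dtype=float)
    h = dc_gain * (1 + jf / zero_hz) / ((1 + jf / pole1_hz) * (1 + jf / pole2_hz))
    return 20 * np.log10(np.abs(h))

# Compare the gain near DC with the gain at 5 GHz (Nyquist for 10 Gb/s).
freqs = [1e7, 5e9]
gains = ctle_gain_db(freqs, zero_hz=1e9, pole1_hz=5e9, pole2_hz=10e9)
peaking_db = gains[1] - gains[0]
print(peaking_db)
```

With only one zero and two poles to set, such a model can tilt the overall frequency response but cannot place individual ISI values independently, which is consistent with using the CTLE mainly to extend bandwidth.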

    It is also possible to implement an FIR equalizer on the receiver side, an example of which is shown in Fig. 20(b) [1]. This circuit looks similar to the transmit equalizer circuit in Fig. 14 except that the input to each stage is a discrete-time sampled


    Fig. 20. Example implementations of receive linear equalizers. (a) Continuous-time linear equalizer (CTLE). (b) Receive FIR equalizer.

    Fig. 21. The benefits of combining receive FIR equalizers with RS-PRDFE. (a) Equalized eye diagrams by the optimal receive FIR filters. (b) Effective eye diagrams seen by the optimal 4-slicer RS-PRDFE receivers.

    analog input instead of a full-swing digital one. Unlike CTLEs,

    this discrete-time FIR filter has the ability to individually adjust

    the tap coefficients. For example, the first postcursor tap of this circuit in Fig. 20(b) is adjusted by the corresponding current amplitude.

    Since a receive FIR equalizer can adjust the individual tap

    weights without being subject to a peak swing constraint, it

    can explore the opportunities illustrated in Fig. 13(b) or (c).

    Fig. 21 shows the eye diagrams after the jointly optimized receive FIR equalizers and their effective eye diagrams seen by the RS-PRDFE receiver. For the FR4 channel, the optimal receive equalizer suppresses the far-end ISIs first, making the third and fourth ISIs zero. However, for the ATCA channel,

    it is interesting to note that the FIR equalizer suppresses the

    second and third postcursor ISIs, leaving the fourth ISI positive.

    Nonetheless, for both cases, Fig. 22 shows that the RS-PRDFE can effectively cancel the remaining ISI with only 4 slicers, and the achieved signal margins are superior to those of the 4-slicer PRDFE receivers.

    C. Digital Versus Analog Receive Linear Equalizers

    In the discussions so far, the receive FIR equalizer showed the highest flexibility in reducing the required slicer count in

    Fig. 22. The simulated signal margins versus the number of slicers (M) of the RS-PRDFE receivers with and without the receive FIR equalizers. (a) 3" FR4 channel. (b) 10.6" ATCA channel.

    an RS-PRDFE receiver. However, implementing an analog FIR

    filter with wide signal bandwidth may incur large power con-

    sumption [1]. Therefore, one may be interested in the potential

    benefits of implementing the receive FIR equalizer in the digital domain, after the received signal is quantized by an ADC.

    Analysis shows that implementing the receive FIR equalizer in the digital domain does more harm than good unless the ADC

    resolution is sufficiently high. This is evidenced by the simu-

    lation results shown in Fig. 23, which plot the signal margins

    of the optimal RS-PRDFE receivers combined with analog

    and digital receive FIR equalizers. For both cases, for slicer

    counts less than 16 (i.e., less than 4 bits of ADC resolution),

    an RS-PRDFE with an analog FIR equalizer outperforms one

    with a digital equalizer. This is because the RS-PRDFE and digital FIR equalizers have conflicting requirements on the ADC threshold placements. The former needs placement for minimum threshold error [Fig. 6(a)] while the latter needs placement for minimum signal quantization error [Fig. 6(b)]. This gap is more pronounced at lower resolution ranges.
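The conflict between the two placement criteria can be illustrated with a toy calculation. The sketch below is one plausible reading of the two error measures and is not taken from the paper: threshold error is modeled as the worst-case distance between a desired slicer level (an ISI offset) and the nearest ADC threshold, and quantization error as the RMS error when each sample is replaced by the midpoint of its bin. The signal is assumed to be uniformly distributed over the full-scale range, and all values are hypothetical.

```python
import numpy as np

def threshold_error(thresholds, targets):
    """Worst-case distance from any desired slicer level to the nearest threshold."""
    thresholds = np.asarray(thresholds, dtype=float)
    return max(float(np.min(np.abs(thresholds - t))) for t in targets)

def quantization_rms(thresholds, samples, full_scale=1.0):
    """RMS error when each sample is replaced by the midpoint of its bin."""
    thresholds = np.sort(np.asarray(thresholds, dtype=float))
    edges = np.concatenate(([-full_scale], thresholds, [full_scale]))
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    bins = np.searchsorted(thresholds, samples)
    return float(np.sqrt(np.mean((samples - midpoints[bins]) ** 2)))

# Four thresholds placed either on the ISI offsets or uniformly (made-up values).
isi_offsets = [-0.3, -0.1, 0.1, 0.3]
uniform = np.linspace(-1.0, 1.0, 6)[1:-1]   # -0.6, -0.2, 0.2, 0.6
samples = np.linspace(-1.0, 1.0, 2001)      # stand-in for the received signal

for name, thr in (("on ISI offsets", isi_offsets), ("uniform", uniform)):
    print(name, threshold_error(thr, isi_offsets), quantization_rms(thr, samples))
```

In this toy setup, placing the thresholds on the ISI offsets removes the threshold error but increases the quantization error, while uniform placement does the opposite, so a single low-resolution ADC cannot serve both purposes well.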


    Fig. 23. Performance comparison of the analog and digital receive FIR equalizers when combined with RS-PRDFE. (a) 3" FR4 channel. (b) 10.6" ATCA channel.

    Therefore, a practical way of implementing a high-performance receiver with a low-resolution ADC is to combine the

    proposed RS-PRDFE with an analog-type linear equalizer.

    VI. CONCLUSION

    This paper introduced a way of designing high-performance equalizing receivers with low-resolution ADCs. The quantization thresholds of the ADC may have to be individually adjusted and optimized for the best signal margins rather than for the least quantization errors. The described RS-PRDFE receiver with only 4 slicers demonstrated performance equivalent to that of a receiver with a 3–4-bit uniformly quantizing ADC. It also showed some synergistic effects of combining RS-PRDFE with LEs, especially with the receive FIR equalizers. It was shown to be preferable to leave the LEs in the analog domain, since the DFE and LE have conflicting requirements on the ADC quantization thresholds.

    ACKNOWLEDGMENT

    The authors would like to thank Dr. Ravi Kollipara and My

    Nguyen for measuring the response characteristics of various

    backplane channels used in this work.

    REFERENCES

    [1] C.-K. K. Yang and E.-H. Chen, “ADC-based serial I/O receivers,” in Proc. Custom Integr. Circuits Conf., Sep. 2009, pp. 323–330.

    [2] H. Chung and G.-Y. Wei, “Design-space exploration of backplane receivers with high-speed ADCs and digital equalization,” in Proc. Custom Integr. Circuits Conf., Sep. 2009, pp. 555–558.

    [3] M. Harwood et al., “A 12.5 Gb/s SerDes in 65 nm CMOS using a baud-rate ADC with digital receiver equalization and clock recovery,” in ISSCC Dig. Tech. Papers, Feb. 2007, pp. 436–437.

    [4] O. E. Agazzi et al., “A 90 nm CMOS DSP MLSD transceiver with integrated AFE for electronic dispersion compensation of multimode optical fibers at 10 Gb/s,” J. Solid-State Circuits, pp. 2939–2957, Dec. 2008.

    [5] P. Schvan et al., “A 24 GS/s 6 b ADC in 90 nm CMOS,” in ISSCC Dig. Tech. Papers, Feb. 2008, pp. 544–545.

    [6] Y. M. Greshishchev et al., “A 40 GS/s 6 b ADC in 65 nm CMOS,” in ISSCC Dig. Tech. Papers, Feb. 2010, pp. 390–391.

    [7] J. Cao et al., “A 500-mW ADC-based CMOS AFE with digital calibration for 10 Gb/s serial links over KR-backplane and multimode fiber,” J. Solid-State Circuits, pp. 1172–1185, Jun. 2010.

    [8] B. Murmann, “A/D converter trends: Power dissipation, scaling and digitally assisted architectures,” in Proc. Custom Integr. Circuits Conf., Sep. 2008, pp. 105–112.

    [9] S. Kasturia and H. J. Winters, “Techniques for high-speed implementation of non-linear cancellation,” IEEE J. Sel. Areas Commun., vol. 9, no. 5, pp. 711–717, Jun. 1991.

    [10] Y.-S. Sohn et al., “A 2.2-Gbps CMOS look-ahead DFE receiver for multidrop channel with pin-to-pin time skew compensation,” in Proc. Custom Integr. Circuits Conf., Sep. 2003, pp. 473–476.

    [11] V. Stojanovic et al., “Adaptive equalization and data recovery in a dual-mode (PAM2/4) serial link transceiver,” in Proc. VLSI Circuits Symp., Jun. 2004, pp. 348–351.

    [12] E.-H. Chen et al., “Adaptation of CDR and full scale range of ADC-based Serdes receiver,” in Proc. VLSI Circuits Symp., Jun. 2009, pp. 12–13.

    [13] D. Oh et al., “Accurate system voltage and timing margin simulation in high-speed I/O system designs,” IEEE Trans. Adv. Packag., vol. 31, no. 4, pp. 722–730, Nov. 2008.

    [14] E.-H. Chen et al., “10 Gb/s serial I/O receiver based on variable reference ADC,” in Proc. VLSI Circuits Symp., 2011, in review.

    [15] J. Kim et al., “Equalizer design and performance trade-offs in ADC-based serial links,” in Proc. Custom Integr. Circuits Conf., Sep. 2010, pp. 1–8.

    [16] C. Donovan and M. P. Flynn, “A digital 6-bit ADC in 0.25-µm CMOS,” IEEE J. Solid-State Circuits, vol. 37, no. 3, pp. 432–437, Mar. 2002.

    Jaeha Kim (S’94-M’03-SM’10) received the B.S. degree in electrical engineering from Seoul National University (SNU), Seoul, Korea, in 1997, and received the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1999 and 2003, respectively.

    From 2001 to 2003, he was with True Circuits, Inc., Los Altos, CA as Circuit Designer; with Inter-university Semiconductor Research Center (ISRC), SNU, as Postdoctoral Researcher from 2003 to 2006; with Rambus, Inc., Los Altos, CA as Principal Engineer from 2006 to 2009; and with Stanford University, CA as Acting Assistant Professor from 2009 to 2010. He is currently Assistant Professor at SNU and his research interests include low-power mixed-signal systems and their design methodologies.

    Prof. Kim is a recipient of the Takuo Sugano award for outstanding far-east paper at 2005 International Solid-State Circuits Conference (ISSCC) and the Low Power Design Contest Award at 2001 International Symposium on Low Power Electronics and Design (ISLPED). He served on the technical program committees of Design Automation Conference (DAC), International Conference on Computer Aided Design (ICCAD), and Asian Solid-State Circuit Conference (A-SSCC).


    E-Hung Chen (S’05–M’06) was born in Taipei, Taiwan. He received the B.S. degree in electrical engineering from National Taiwan University and the M.S. degree in electrical engineering from the University of California, Los Angeles (UCLA), respectively. He is currently working toward the Ph.D. degree at UCLA.

    Since 2005, he has been with Broadcom, Rambus, and Texas Instruments as a summer intern working on channel equalization technique and receiver modeling. His research interests are high-speed serial link design and its adaptation.

    Jihong Ren (S’03–M’06) received the Ph.D. degree in computer science from the University of British Columbia, Vancouver, Canada, in 2006, where she worked on optimal equalization for chip-to-chip high-speed buses.

    She has been with Rambus, Sunnyvale, CA, since January 2006, where she has worked on equalization algorithms and link performance analysis.

    Brian Leibowitz (S’97-M’05) was born in New Jersey in 1976. He received the B.Sc. degree in electrical engineering from Columbia University, New York, in 1998 and the Ph.D. degree in electrical engineering and computer science from the University of California, Berkeley, in 2004, where his doctoral research included the development of a fully integrated CMOS imaging receiver for free-space optical communication.

    Since 2004 he has been with Rambus, Inc., Sunnyvale, CA, where he has worked on equalization and mixed-signal circuit design for a variety of high-speed and low power serial links and memory interfaces.

    Dr. Leibowitz received the Edwin H. Armstrong Award from Columbia University. His graduate studies at Berkeley were supported by a fellowship from the Fannie and John Hertz Foundation.

    Patrick Satarzadeh (S’04-M’09) received the B.S., M.S., and Ph.D. degrees in electrical and computer engineering from the University of California, Davis, in 2004, 2006, and 2009, respectively.

    He has held internships with MIT Lincoln Lab, Lexington, MA, and Rambus, Inc., Sunnyvale, CA. In 2009, he joined Texas Instruments Inc., Dallas, TX, as a Member of the Technical Staff, where he has since worked on continuous time delta-sigma data converters. His research interests include signal processing and mixed signal circuit design.

    Jared Zerbe (M’90-SM’10) was born in New York City in 1965. He received the B.S. degree in electrical engineering from Stanford University, Stanford, CA, in 1987.

    From 1987 to 1992, he worked in circuit design at VLSI Technology and MIPS Computer Systems. In 1992 he joined Rambus Inc., Sunnyvale, CA, where he has since specialized in the design of high-speed I/O, PLL/DLL clock recovery, and equalization and data-synchronization circuits. He has authored or coauthored over 30 papers and over 50 patents in the area of high-speed signaling and clocking and has taught courses at Berkeley and Stanford in high-speed serial link design. He is currently a Technical Director at Rambus where he is focused on development of future high-performance and low-power signaling technologies.

    Chih-Kong Ken Yang (S’94-M’98-SM’07-F’10) was born in Taipei, Taiwan. He received the B.S. and M.S. degrees in electrical engineering in 1992 and the Ph.D. degree in electrical engineering in 1998 from Stanford University, Stanford, CA.

    He joined the University of California, Los Angeles, as an Assistant Professor in 1999 and has been a Professor since 2009. His current research area is high-performance mixed-mode circuit design for VLSI systems such as clock generation, high-performance signaling, low-power digital functional blocks, and analog-to-digital conversion.