Equalizing Filter Design for Cross-talk Cancellation
by
Jihong Ren
B. Sc. (Electrical Engineering), Huazhong University of Science and Technology, 1995
M. Eng. (Electrical Engineering), Huazhong University of Science and Technology, 1998
M. Sc. (Neuroscience), The University of British Columbia, 2000
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE STUDIES
(Department of Computer Science)
we accept this thesis as conforming to the required standard
The University of British Columbia
June 2002
© Jihong Ren, 2002
Abstract
As interconnect line width and spacing decrease and operating clock rates increase,
interconnect has become a bottleneck in developing high-speed integrated circuits, multichip
modules, printed circuit boards, and systems. With small line spacing, mutual capacitance
and inductance approach the level of self-capacitance and inductance, and can severely de-
grade signal integrity. The well-known equalizing filter method can significantly improve
signal integrity. This thesis explores the effectiveness of equalizing filters in cross-talk can-
cellation for high-speed, off-chip buses. It demonstrates that linear programming provides
effective methods for designing cross-talk canceling equalizing filters that greatly increase
the bandwidth of high-speed digital buses.
Contents
Abstract
Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
1.1 Motivation
1.2 Method and Proposed System Structure
1.3 Contributions
1.4 Thesis Outline
2 Background
2.1 Transmission channel limitations
2.2 Equalization
2.2.1 Design Methods
2.2.2 Application of equalizing filters in cross-talk cancellation for the local telephone subscriber loop
2.3 Summary
3 Coupled Distributed RLC Interconnect Model
3.1 Interconnect Model
3.2 Bus parameters and Simulation results
4 Linear Equalizing Filter Design
4.1 Measurements of filter performance
4.2 Preliminaries
4.2.1 Convolution
4.2.2 Matrix Representations of Convolution
4.3 Least Squares Optimization Method with Pseudo-random Input
4.3.1 Motivation
4.3.2 Least Square problem formulation
4.3.3 An example
4.4 Linear Programming Method with Worst-case Input
4.4.1 Motivation
4.4.2 Linear Programming Problem formulation
4.4.3 Smoothing filter
4.4.4 An example
4.5 Testing results: Comparison of LSQ method and LP method
4.5.1 Worst-case input sequence
4.5.2 Indirect coupling
4.5.3 Over-fitting
4.5.4 Minimum bit time
4.6 Time-variant Linear FIR Filter
4.7 Optimized Smoothing Filter
4.8 Summary
5 Predictor-Corrector Algorithm with Model Reduction
5.1 Mehrotra’s predictor-corrector algorithm
5.2 Implementation Details
5.2.1 Starting and Stopping
5.2.2 Solving the linear systems
5.3 Ill-conditioning and Model Reduction
5.4 Results
6 Conclusions and Future Work
6.1 Future work
Bibliography
List of Tables
4.1 Performance of equalizing filters with different sizes for a bus 32-bits wide and 5 cm long.
4.2 Performance of equalizing filters with different sizes for buses 32-bits wide. All filters designed using the LP method.
4.3 Performance of different smoothing filters with equalizing filters designed by the LP method at 300 ps.
5.1 linprog() iteration display
5.2 Iteration display of our approach: Mehrotra interior-point method with model reduction.
List of Figures
1.1 Proposed transmission network structure.
2.1 A coupled microstrip transmission line.
2.2 Simple lumped model for two coupled interconnects
2.3 Block diagram of an equalized transmission channel (from [3]).
2.4 Simplified model for full-duplex transmission over a linear multi-input/multi-output channel
3.1 Analytical solution from equation 3.16 vs. Spice simulation results
4.1 An illustrative eye diagram
4.2 Example of a data eye
4.3 Predistorted signal: equalizing filter output
4.4 Examples of output signal for 32-bit interconnect network
4.5 Eye-diagrams for a 32-bit interconnect network
4.6 Frobenius norm of the bus impulse response.
4.7 System with smoothing filter at the receiver end.
4.8 Example of output signals for systems with and without the equalizing filter designed
4.9 Pseudo-random test: eye diagrams for systems with and without the equalizing filter designed
4.10 Worst-case test vs. Pseudo-random test
4.11 Worst-case performance of different equalizing filters designed with the LP method
4.12 Indirect coupling between non-adjacent lines
4.13 Eye diagram for system with equalizing filters designed by the LP method. Grey traces indicate high signal transmitted. Black traces indicate low signal transmitted.
4.14 Magnitude of overshoot increases with the size of the equalizing filter designed with the LP method
4.15 The convolution procedure of the time-variant FIR filter
Acknowledgments
First of all, I would like to thank my supervisor Dr. Mark Greenstreet. This thesis would not
have been possible without his inspiration, extensive support, patience and encouragement.
I also would like to thank my husband, Rui Li, for his consistent support.
JIHONG REN
The University of British Columbia
June 2002
Chapter 1
Introduction
1.1 Motivation
Advances in digital integrated circuit (IC) fabrication technology have resulted in
exponential growth in the speed and integration levels of ICs. With more and more circuits
placed on each die, high-performance systems require larger and larger I/O bandwidth. This
demand has been addressed by increasing the number of high-speed signals and the per-pin
interconnection bandwidth. Although the number of I/Os per IC has grown from its 1970s level
to several hundred pins now [18], this growth is being rapidly
out-paced by the bandwidth demands. To continue to improve overall system performance,
the per-pin interconnection bandwidth must scale with the speed and integration level of
ICs. However, without new approaches, we will soon reach the limit set by the intrinsic
properties of copper lines.
The number of I/Os increases by 12% per year, half of which is due to the increase
in chip perimeter and half of which is due to the increase in pin density. On chip, both the
number of devices and clock rates have increased at 50-60% per year, creating a growing
bandwidth gap. Higher bit-rates and pin densities have reached the point where interconnections
can no longer be treated as well-behaved short wires. With the decreasing cross-sectional
areas of interconnections, the line resistance per unit length has increased to the point that long
interconnections can no longer be considered lossless. Resistive effects are particularly se-
vere at high bit-rates because of both the high frequency roll-off of RC transmission lines
and the increase of resistance with frequency due to the skin effect. To achieve maximum
packing density, designers attempt to place signal lines as close to each other as possible.
This introduces problems of electromagnetic coupling (cross-talk) which are exacerbated
by high data rates. Cross-talk has become a critical issue in interconnect performance and
hence overall system performance. Traditionally, cross-talk is reduced by carefully control-
ling line geometry and arranging circuits to decrease the coupled line length. Moreover,
signaling conventions that are less susceptible to coupled energy can be used. These methods
reduce cross-talk in a somewhat ad-hoc way. For example, as a rule of thumb, a ratio of
two-to-one for line spacing against line width is commonly used, based on the assumption
that cross-talk decreases monotonically with the increase in line spacing. However, this
simple assumption can fail for high bit-rate design. The relationship between cross-talk
and line spacing is non-linear, and a two-to-one ratio of spacing to width may actually
result in higher coupled energy than a smaller line spacing [11][20]. Furthermore, while
these methods might reduce the amount of cross-talk, the problem of cross-talk still exists.
New approaches in cross-talk reduction are needed.
1.2 Method and Proposed System Structure
Equalizing filters have been used effectively for cross-talk cancellation in acoustic applications
such as telephone subscriber line systems [1][6][7]. Recently, they have been used to
compensate for the frequency-dependent attenuation of transmission lines [2].
Figure 1.1: Proposed transmission network structure (transmitter, filter network with one filter per wire, bus, receiver).
This thesis explores the effectiveness of equalizing filters in cross-talk cancellation
for high-bandwidth, digital communication. The proposed system structure is depicted in
figure 1.1. In this transmission system, an equalizing filter is assigned to each wire of the
bus. Each filter takes the input signals on a wire and its adjacent wires as its inputs, and
outputs a predistorted signal onto the wire. For an $n$-bit bus, the filter system can be viewed
as an $n \times n$ network. Cross-talk is eliminated if the filter network is designed so that
the concatenation of the filter network and the bus has a frequency response in the form of a
diagonal matrix.
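As an illustration of the diagonalization idea only, the following minimal sketch considers a two-wire bus at a single frequency. The matrix values, the helper names `mat_mul` and `mat_inv_2x2`, and the choice of Python are all assumptions made for this example; they are not taken from the designs in this thesis. Choosing the filter network as the exact inverse of the bus response is just one way to make the cascade diagonal.

```python
# Illustrative 2-wire example at one frequency (all values are made up).

def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv_2x2(m):
    """Invert a 2x2 matrix using the closed-form adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("bus response is singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical bus response: diagonal entries are the through paths,
# off-diagonal entries are the cross-talk between the two wires.
H = [[0.8 + 0.1j, 0.2 - 0.05j],
     [0.2 - 0.05j, 0.8 + 0.1j]]

# With G = H^-1 the cascade H*G is the identity matrix, which is one
# particular diagonal matrix, so no energy couples between wires.
G = mat_inv_2x2(H)
cascade = mat_mul(H, G)
crosstalk = max(abs(cascade[0][1]), abs(cascade[1][0]))
```

In this toy case `crosstalk` is zero up to rounding error. A practical filter has a finite number of taps and can only approximate the inverse, which is why optimization methods are used later in the thesis.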
Several optimal filter design strategies are explored, such as the linear programming
method and the least-squares method. Matlab simulation results show that the resulting
filters dramatically reduce cross-talk and substantially increase the maximum bandwidth
that can be achieved by buses on PC boards. Thus, the equalizing filter method is promising
for cross-talk cancellation and merits further investigation.
1.3 Contributions
This thesis demonstrates that linear programming models provide effective methods for
designing cross-talk canceling equalizing filters that greatly increase the bandwidth of high
speed digital buses on printed circuit boards. The following are the major contributions
supporting this thesis:
• Equalizing filter design for high speed digital buses can be formulated as a least
squares optimization problem, using an $L_2$ metric for optimality. This metric ensures
the quality of the received signal “on average”.
• The $L_\infty$ metric corresponds to the traditional eye height measurement of signal
integrity and guarantees worst-case performance. The filter design problem for $L_\infty$
optimality can be formulated as a linear programming problem.
• An evaluation of the linear programming and least squares methods for a variety of
filter configurations shows that both offer a dramatic increase in bandwidth when
compared with a bus with no filter or with transmitter pre-emphasis without cross-talk
cancellation. Furthermore, the filters designed for the $L_\infty$ optimality criterion
using linear programming significantly outperform their counterparts designed by the
traditional least-squares method, when evaluated for digital data transmission.
• To evaluate these methods, I implemented them both using Matlab. In doing so,
I found that the Matlab optimization package does not always converge for the linear
programming problems presented in this thesis. Therefore, I implemented an interior-point
method with a model reduction technique that successfully solves the linear
programming problems encountered.
1.4 Thesis Outline
In this thesis, Chapter 2 introduces the equalizing filter technique and its existing applica-
tions. Chapter 3 describes a coupled distributed RLC model for transmission lines. Based
on this model, Chapter 4 discusses various techniques, such as least squares and linear
programming, that I explored to design optimal linear FIR equalizing filters. Chapter 5 is
devoted to Mehrotra’s interior point method with a model reduction technique that is used
to solve our particular linear programming problem introduced in Chapter 4.
Chapter 2
Background
Computer system performance is often limited by communication bandwidths between
chips and between subsystems. A typical signaling system consists of a transmitter, a chan-
nel, and a receiver. The transmitter encodes digital information as analogue waveforms on
the transmission channel, such as a circuit board trace. On the other end of the transmission
channel, the receiver samples and quantizes the signal to recover the original digital infor-
mation. Although we often think of transmission channels such as wires as being ideal by
having zero resistance, capacitance and inductance, real wires are not ideal but rather par-
asitic circuit elements whose geometry affects their electrical properties. Moreover, with
small line spacing, inductive and capacitive cross-talk can severely degrade signal integrity.
With the growth in integration levels, the interconnect line width and spacing decrease,
and interconnect has become a bottleneck in high-speed digital designs.
This chapter first discusses the channel characteristics, particularly PC board traces.
I then provide background on the equalizing filter technique and an overview of its related,
existing applications.
Figure 2.1: A coupled microstrip transmission line (dimensions labeled t, s, w and h).
Figure 2.2: Simple lumped model for two coupled interconnects
2.1 Transmission channel limitations
Transmission channels, such as PC board traces and coaxial or twisted-pair cables, have
limited bandwidths that are determined by their physical characteristics: the size and con-
struction of their conductor and shield, and the dielectric material. In this thesis, I am
particularly interested in high-speed interconnect on PC boards. Thus, the following dis-
cussion focuses on PC board traces. Figure 2.1 shows typical microstrip interconnections.
A simple lumped model for two coupled interconnects is shown in figure 2.2.
The resistance per unit length of a trace is given by the resistivity of the trace material
(typically copper) divided by the cross-sectional area of the trace. The cross-sectional
area is the product of the width of the trace and its thickness. The width is determined by the
design. The thickness is specified when the board is manufactured, in ounces of copper
per square foot. A board with 1 oz copper has a conductor thickness
of roughly 35 microns. More accurate models consider the skin effect: at high frequencies,
currents flow closer to the surface of the trace, resulting in a frequency-dependent increase
in the series resistance [10][3].
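The DC part of this calculation can be sketched in a few lines. The example below is only illustrative: the resistivity is a textbook room-temperature value for copper, the 75 micron width is borrowed from the bus described in Chapter 3, and the function name `trace_resistance_per_cm` is hypothetical. The skin effect is ignored.

```python
COPPER_RESISTIVITY_OHM_M = 1.68e-8  # textbook room-temperature value (assumption)


def trace_resistance_per_cm(width_m, thickness_m,
                            resistivity=COPPER_RESISTIVITY_OHM_M):
    """DC resistance per centimetre of a rectangular trace: R = rho / (w * t)."""
    area_m2 = width_m * thickness_m       # cross-sectional area
    ohm_per_m = resistivity / area_m2     # resistance per metre
    return ohm_per_m / 100.0              # convert ohm/m to ohm/cm


# 1 oz copper (about 35 microns thick), 75 micron wide trace.
r_dc = trace_resistance_per_cm(width_m=75e-6, thickness_m=35e-6)
```

With these inputs `r_dc` comes out at roughly 0.064 ohm/cm, which is close to the 0.066 ohm/cm used for the bus model in Chapter 3.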
The capacitance per unit length ($C$) and the inductance per unit length ($L$) of a microstrip
trace are determined by many factors including its width and height and its separation
from the ground plane. Electric and magnetic fields between adjacent traces lead to the
coupling capacitance, $C_m$, and the mutual inductance, $L_m$, respectively.
For PC board traces, the loss in transmission is primarily due to the series resistive
component of the copper ($R$). Because of this loss, without a special transmission scheme,
off-chip signaling on long wires, even with good current-mode signaling methods, is limited
to about 1 GHz [2]. Full-swing unterminated signaling methods that are used in most digital
systems have even lower limits. With narrow wires and smaller line spacing, the coupling
inductance and capacitance between adjacent lines approach the level of self-inductance and
capacitance. In high speed circuits, because of fast signal rise times, coupling effects are
severe and have become a primary concern for present and future high-speed high-density
circuit design. Besides the resistive properties of the line, the coupling effects further limit
the maximum bit-rate at which data can be transmitted correctly.
2.2 Equalization
An ideal transmission channel would in all cases deliver the near-end signal $v_{in}(t)$ from
the driver to the far-end receiver without distortion, i.e. $v_{out}(t) = v_{in}(t - t_d)$, where
$t_d$ is the propagation delay across the channel. Thus, an ideal channel would have the transfer
function $H(s) = e^{-s t_d} I$, where $s = j\omega$ and $I$ is the identity matrix. If an equalizing filter has a
transfer function that equals the inverse of the transfer function of the channel, the concatenation
of the equalizer and the channel has a flat frequency and phase response. This is the
equalization technique widely used to actively compensate for the channel transfer function.
Channel equalization can be performed at the transmitter end, as shown in figure 2.3,
preceding the actual channel driver. Transmitters that utilize equalizing filters are called
pre-distorting transmitters. The equalizing filter can also be incorporated into the receiver,
which is called receiver equalization. It can also be split between the two ends.
Figure 2.3: Block diagram of an equalized transmission channel (from [3]): transmitter, equalizer G(s), channel H(s).
• Pre-distorting Transmitters
Pre-distorting transmitters integrate equalizing filters, commonly realized as finite
impulse response (FIR) digital filters. While infinite impulse response (IIR) [9] filters
can be more flexible than FIR filters, they are generally not used for high data rate
transmission because of the difficulty of calculating the IIR recurrence (i.e. feedback)
at very high rates. The inputs to the equalizing FIR filters are the present and
past transmitted symbols. The output of the FIR filter is a weighted sum of these
symbols. The length of the filter depends on the number of symbols that affect the
response of the channel to the current symbol. The filter coefficients depend on the
channel characteristics.
Pre-distorting transmitters were first used by Poulton et al. [2] in a serial channel over
copper wires at 4 Gb/s to reduce intersymbol interference caused by frequency-dependent
attenuation of the channel. Later, other groups [4][17] used the same technique
to design high-speed serial link transceivers. FIR equalizing filters built into transmitters
are easy to implement at very high speed because the transmitted symbols are
available at the transmitter end. Furthermore, because the transmitted symbols
are either 1s or 0s, multiplication with the filter coefficients is easy. For example, in
[2], a five-tap FIR filter is implemented with digital adders, and a digital-to-analog
converter (DAC) is used to generate pre-distorted pulses. However, because transmitters
generally do not have information about the received signals, FIR filter coefficients
are obtained either by characterizing the channel properties in advance [2][4], or by
adaptive implementation with feedback information from the receiver end [17].
• Receiver Equalization
Receiver equalization can be realized either with analog filters preceding the
analog-to-digital converter (ADC) or with digital filters following the ADC. The latter is
the usual technique because digital filters are easy to implement and adapt. Moreover,
more complex and non-linear filters can be implemented. However, it is well known
that receiver equalization amplifies high frequency noise [8]. Furthermore, historically,
high speed ADC technology has lagged behind high speed DAC technology. Therefore,
pre-distorting transmitters are commonly used in high speed transmission systems
that run at GHz speeds. Recently, Horowitz’s group realized an 8-Gsamples/s ADC in
0.25 µm CMOS, which makes high speed links with equalization at the receiver end
possible [19].
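The FIR pre-distortion described under "Pre-distorting Transmitters" above can be sketched as follows. This is a minimal illustration, not the circuit of [2]: the two tap values and the function name `fir_predistort` are hypothetical, and symbols before the start of the sequence are assumed to be 0. It shows why binary symbols make the multiplication trivial: each tap is either added or skipped.

```python
def fir_predistort(symbols, taps):
    """Return y[n] = sum_k taps[k] * symbols[n - k] for binary symbols.

    Symbols before the start of the sequence are treated as 0.
    """
    output = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, tap in enumerate(taps):
            # Binary symbols: "multiplying" by the symbol is a conditional add.
            if n - k >= 0 and symbols[n - k] == 1:
                acc += tap
        output.append(acc)
    return output


taps = [1.0, -0.25]      # hypothetical two-tap pre-emphasis filter
bits = [0, 1, 1, 0, 1]   # transmitted symbols
levels = fir_predistort(bits, taps)
```

Each entry of `levels` is the analog level that a DAC would drive onto the wire for the corresponding bit time.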
2.2.1 Design Methods
The following are two methods that are currently used to design equalizing filters.
• Zero-forcing method
The transfer function $H$ of the channel can be derived from models established for
each particular channel (reviewed in [18]). The frequency response of the channel,
and from it the desired frequency response of the equalizing filter, is then calculated at
each frequency point. This set of discrete points is used to obtain a discrete impulse
response function using the inverse Fourier transform. The following two steps are used
to obtain a more manageable impulse response function.
– Windowing: $g(n) = g_d(n)\,w(n)$, where $g_d(n)$ is the desired impulse response
and $w(n)$ is the windowing function. This step is needed to obtain a filter with
a finite number of taps.
– Delaying: $g(n)$ is shifted to the right until the samples are all indexed by
non-negative integers, to obtain a causal filter.
In practice, large windows must be used to obtain effective equalizing filters. Accordingly,
many researchers have turned to optimization methods to obtain
good approximate equalizing filters. This is the approach that I take in this thesis.
• Least Squares Minimization
With an ideal transmission channel, the received signal is a delayed version of the
transmitted signal. Using least squares minimization, the equalizing filter design
problem becomes that of determining the values of the filter coefficients that minimize
the $L_2$ norm of the difference between
the received signal and the delayed version of the transmitted signal.
This method is used in optimal pre-emphasis equalizing filter design in [2][19] to
build serial links that operate at over 1 Gigabit per second. It is also widely used to
design equalizers for telephone subscriber systems [1][6][7].
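The windowing and delaying steps of the zero-forcing method above can be sketched as follows. This is a minimal illustration under stated assumptions: a rectangular window is used, the desired response values are made up, and the function name `window_and_delay` is hypothetical.

```python
def window_and_delay(desired, half_width):
    """Truncate a two-sided impulse response and shift it to be causal.

    `desired` maps an integer sample index n (possibly negative) to the
    desired response at n. A rectangular window is applied: samples with
    |n| <= half_width are kept and all others are dropped.
    """
    # Windowing: keep only the samples inside the window.
    windowed = {n: desired.get(n, 0.0)
                for n in range(-half_width, half_width + 1)}
    # Delaying: shift right by half_width so that the first index is 0.
    return [windowed[n - half_width] for n in range(2 * half_width + 1)]


# Hypothetical two-sided desired response, indexed by sample number.
g_desired = {-3: 0.01, -2: -0.05, -1: 0.2, 0: 1.0, 1: -0.3, 2: 0.08, 3: -0.02}
taps = window_and_delay(g_desired, half_width=2)
```

The result is a five-tap causal filter whose main tap sits at index 2. The samples at n = -3 and n = 3 are discarded, which is the approximation error that forces large windows in practice.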
Figure 2.4: Simplified model for full-duplex transmission over a linear multi-input/multi-output channel. $H(t)$, $G(t)$, $P(t)$ and $R(t)$ are the impulse responses of the far-end channel, near-end channel, transmitter filter and receive filter respectively.
2.2.2 Application of equalizing filters in cross-talk cancellation for the local
telephone subscriber loop
Equalizing filters are used to reduce intersymbol interference caused by the characteristics
of a single channel [2][4][17][19]. Until now, no work has been reported on the application
of equalizing filters in cross-talk cancellation for high speed buses that run at multi-Gb/s.
Along with the limited bandwidth of transmission channels, cross-talk is another critical
problem that limits the maximum data rate that can be achieved by high density wide buses.
Local telephone subscriber loops have the same problem. Bundles of twisted copper wires
are used in local telephone subscriber loops. Because of the close physical proximity, cross-
talk interference from neighbouring channels is one of the major limitations on the max-
imum data rate that can be achieved over the loops [7]. Multichannel equalization can
effectively suppress both near- and far-end cross-talk [6][7].
In these papers, a cable of twisted pairs that is terminated at a single physical loca-
tion is treated as a single multi-input/multi-output channel. Cross-talk is then characterized
by off-diagonal components of the matrix impulse response of the channel. The multichan-
nel adaptive FIR equalizers, the transmitter and the receiver process the entire vector of
inputs and outputs (see figure 2.4). Rather than directly diagonalizing the system transfer
function matrix, the multichannel equalizers are designed to minimize the $L_2$ norm of
the difference between the received signal and the transmitted waveform. In Salz’s work
[16], the minimum mean square error (MMSE) linear equalizer for the $n \times n$ channel is
completely specified, assuming uncorrelated data and white noise. Later, Honig et al. [6]
generalized Salz’s work by assuming correlated data symbols, pulse amplitude modulation
(PAM) signals and colored noise.
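To make the least squares criterion concrete, the sketch below designs a short FIR equalizer for a single, noiseless channel by solving the normal equations. It is a deliberately simplified illustration and not the multichannel MMSE equalizer of [16] or [6]: the channel taps are made up, the target is a delayed unit impulse, and all function names are hypothetical.

```python
def convolution_matrix(h, num_taps):
    """Matrix C such that C times g equals the convolution of h and g."""
    rows = len(h) + num_taps - 1
    return [[h[r - c] if 0 <= r - c < len(h) else 0.0
             for c in range(num_taps)] for r in range(rows)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares_equalizer(h, num_taps, delay):
    """Taps g minimizing the L2 norm of (h * g - delayed unit impulse)."""
    c = convolution_matrix(h, num_taps)
    target = [1.0 if r == delay else 0.0 for r in range(len(c))]
    # Normal equations: (C^T C) g = C^T target.
    ctc = [[sum(row[i] * row[j] for row in c) for j in range(num_taps)]
           for i in range(num_taps)]
    ctd = [sum(row[i] * t for row, t in zip(c, target))
           for i in range(num_taps)]
    g = solve(ctc, ctd)
    residual = [sum(row[k] * g[k] for k in range(num_taps)) - t
                for row, t in zip(c, target)]
    return g, sum(e * e for e in residual) ** 0.5


channel = [1.0, 0.5]   # hypothetical channel with one trailing echo
taps, error = least_squares_equalizer(channel, num_taps=4, delay=0)
```

Here `error` is the L2 norm of the remaining intersymbol interference. It is much smaller than the 0.5 obtained with no equalizer, since the unequalized channel differs from an ideal impulse only by its echo tap.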
2.3 Summary
The equalization technique has been successfully used to compensate for resistive effects of
transmission lines [2][4][17]. With this technique and carefully chosen signaling methods,
multi-Gb/s serial links have been built. Equalization is also commonly used in telephone
subscriber systems to cancel near-end and far-end cross-talk [7][1][6]. In this thesis, I
explored the effectiveness of the equalization technique in cross-talk cancellation for high-
speed, off-chip buses. Moreover, besides the least squares optimization technique that is
commonly used to design equalizing filters, this thesis is the first work that formulates the
optimal equalizing filter design problem for high speed digital buses as a linear programming
problem.
Chapter 3
Coupled Distributed RLC
Interconnect Model
3.1 Interconnect Model
An electrical model of a uniform transmission line has inductance $L$, resistance $R$,
capacitance $C$ and parallel conductance $G$, all per unit length. The term $G$ models the effects of
current leakage and is practically zero for most digital transmission on integrated circuits
and printed circuit boards.
We would like our system to be able to operate at bit rates greater than 2 Gbits/sec.
Assuming that the rise and fall times are 10% of the bit time, edges have an electrical length
of $l_r$ = Rise time (ps)/Delay (ps/cm) = 50 (ps)/33 (ps/cm) = 1.5 cm, where 33 ps/cm is
the propagation delay of light in a vacuum. The propagation delay of signals traveling in other media
such as a PCB trace is larger [10], and thus the corresponding electrical length would be
even smaller. For example, the common FR-4 printed circuit board material has a dielectric
constant of about 4.5 and a propagation delay of about 71 ps/cm. The electrical length of an edge at
2 Gbits/sec is then 0.7 cm. As a rule of thumb, distributed models should be used when the wire
length is greater than or equal to $l_r/6$. Thus the critical dimension separating lumped from
distributed systems for printed circuit boards is 0.117 cm. The wire lengths we consider here
are in the range of 2 to 50 cm. Thus a distributed model is needed to correctly model the
behavior of this system at multi-gigabit/sec data rates. Assuming the TEM mode of wave
propagation, for a lossy multiconductor system of $n$ wires, we have an $n \times n$ inductance matrix
$\mathbf{L}$, capacitance matrix $\mathbf{C}$ and resistance matrix $\mathbf{R}$, where $L_{ij}$ and $C_{ij}$ are the mutual inductance
and coupling capacitance between lines $i$ and $j$, respectively. For simplicity, the following
assumptions are made:
• Coupling between lines is entirely due to mutual inductance and mutual capacitance.
There is no conductance between wires of the bus or between wires of the bus and
ground. Only coupling between adjacent lines is taken into account. We ignore
direct coupling between wires of the bus that are not adjacent.
• Every wire is assumed to have the same characteristics.
• Wires are assumed to be arranged around a cylinder so that every wire sees the same
environment as the others.
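With the above assumptions, the $\mathbf{L}$ and $\mathbf{R}$ matrices are shown below. The capacitance
matrix $\mathbf{C}$ has the same structure as $\mathbf{L}$.
$$
\mathbf{L} = \begin{bmatrix}
L & L_m & 0 & \cdots & 0 & L_m \\
L_m & L & L_m & \cdots & 0 & 0 \\
0 & L_m & L & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
L_m & 0 & 0 & \cdots & L_m & L
\end{bmatrix},
\qquad
\mathbf{R} = \begin{bmatrix}
R & 0 & \cdots & 0 \\
0 & R & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & R
\end{bmatrix}
\qquad (3.1)
$$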
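Under the three assumptions listed above (adjacent-only coupling, identical wires, wires arranged around a cylinder), the matrices have a wrap-around tridiagonal structure. The following sketch builds such matrices. It is illustrative only: the function name `coupled_bus_matrix` is hypothetical, the per-cm values are those listed in Section 3.2, and reading the 0.31 given there as the ratio of mutual to self inductance is an assumption.

```python
def coupled_bus_matrix(n, self_value, mutual_value):
    """n x n matrix with self_value on the diagonal and mutual_value between
    adjacent wires, wrapping around so wire n-1 is adjacent to wire 0."""
    if n < 3:
        raise ValueError("the wrap-around structure needs at least 3 wires")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = self_value
        matrix[i][(i + 1) % n] = mutual_value   # right-hand neighbour
        matrix[i][(i - 1) % n] = mutual_value   # left-hand neighbour
    return matrix


L_SELF = 3.99e-9           # H/cm, self inductance from Section 3.2
L_MUTUAL = 0.31 * L_SELF   # assumes 0.31 is the mutual-to-self ratio
L_matrix = coupled_bus_matrix(4, L_SELF, L_MUTUAL)
R_matrix = coupled_bus_matrix(4, 0.066, 0.0)   # no resistive coupling
```

The capacitance matrix can be built with the same helper, since it shares the structure of the inductance matrix.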
The behavior of this distributed system can be described by the following partial
differential equation, where voltage vector � and current vector � are both functions of
position � and time�. � �� �
� � � � � � �� � (3.2)� �� � � �
� �� � (3.3)
Taking the Fourier transformation of these equations yields:
���� � � � ��� ��� � �� (3.4)
� �� � � �� � � (3.5)
where�
is the Fourier transform of � , � is the Fourier transform of � , and � � � � .Differentiating equation 3.4 with respect to � and substituting equation 3.5 into the result
gives � � �� �
� � ��� � � � � � � � � � (3.6)
Let � ��� � � � � � � � . Let � be a diagonalizing matrix for � , i.e., � �� ��� is the diag-
onal matrix�
whose diagonal elements are the eigenvalues of � . Rewriting equation 3.6
with � yields:
� �� � � �� �
� � � �� ��� � � � �� � � (3.7)
Let��� � �� �� and
� � �� ��� , we get
� � ���� �
� � � �(3.8)
This differential equation has the general solution
��� �� �� ��� ��� � � �
� ���(3.9)
16
For a bus with non-zero resistive and capacitive or inductive components, the elements of�
and� �
are complex numbers. Combining equation 3.9 with the definition of� �
yields:
� � � � �� ��� � � � � �
� ��� �(3.10)
Assuming all source ends are terminated with an impedance of� ���
and the load ends are
left open, we have the following boundary conditions.
��� ���length �
� (3.11)� ��� � � � ��� �� ��� ���� ��� � (3.12)
Combined with equation 3.4 and 3.10, the first boundary condition given above yields:
� �� � � �
� � ���length (3.13)
From equation 3.10, we know that:
� � ��� � � � �
��� � �(3.14)
Thus, equation 3.12 yields:
� � � � � � � ��� � � � � ��� � ��� ��� � � �� � � � � � � � � �(3.15)
Equations 3.13, 3.15 yield the final solution
� � � � �� ��� � � � � � �
� ��� � � � ���(3.16)
with
� � � � �� � ���
length � � � � �� � � � � ��� ��� � � �� � � � � � � ��� length � � � � ��� �� �
� �� � ���
length � � (3.17)
where � is the identity matrix. Note that� �
� ���, and
� � � � � � �. Thus, the
frequency response of the bus is:
� � � � �� ���
length � � � � �� ���
length � (3.18)
with � � � �
defined in equation 3.17. The inverse Fourier transform yields the impulse
response of the bus, which is used extensively in the next chapter. Note that the frequency
response of the bus is a square matrix at each frequency. The impulse response of the bus
is also a square matrix at each time sample. Entry $(i, j)$ at time $t$ denotes the response on
wire $i$ at time $t$ given an impulse input on wire $j$ at time $0$.
3.2 Bus parameters and Simulation results
I validated the model derived above by comparing its predictions with Spice simulations.
Figure 3.1a shows the solution of equation 3.16 computed with Matlab, and figure 3.1b shows the Spice
simulation results. The parameters used in both simulations are: bus width = 3, length =
5 cm, R = 0.066 ohm/cm, C = 0.8 pF/cm, L
= 3.99 nH/cm,� � � � = 0.31, � � � � = 0.23, � � �
= 5.0 V, bit time = 500 ps,���
= 10% *bit time = 50 ps. These parameters correspond to
microstrip lines 34.5 µm thick (1 oz copper), 75 µm wide with 75 µm separation between
lines, running above a ground plane with a dielectric thickness of 100 µm, and a dielectric
constant of ��
4.5. The bus parameters are computed using formulas given in [10].
Figure 3.1: Analytical solution from equation 3.16 (upper panel) vs. Spice simulation results (lower panel) of 3-bit bus. All lines are quiet except line 1.
Chapter 4
Linear Equalizing Filter Design
In this chapter, I present techniques for the design of linear equalizing filters. I first in-
troduce the idea of a data eye and its use to quantify filter performance. The next section
defines notations that simplify the mathematical presentation of linear equalizing filter de-
sign. Then, I introduce the least squares (LSQ) method and the linear programming (LP)
method, followed by test results. Finally, based on the linear FIR filter designs, time-variant
FIR filter design and optimal smoothing filter design are discussed.
4.1 Measurements of filter performance
The effects of distortion and noise are often illustrated using eye diagrams. An illustrative
eye diagram is shown in figure 4.1. It is called an eye diagram because of its shape. During the
sample interval, the signal is either distinctly high or distinctly low. It must not go through the
center of the eye. This allows the receiver to unambiguously determine the value of the
bit that was transmitted. The signal can change between sampling intervals. I also restrict
how high (or low) the signal may go; otherwise, with scaling, any eye opening can be made
arbitrarily large.
Figure 4.1: An illustrative eye diagram. [Plot of v(t) against t, marking the eye width w, the high and low target levels, the deviations h_under and h_over, the good and bad regions, and the current and next sample intervals.]
Eye height is defined as
height������ � � under
� ���target
� � over�
(4.1)
where $h_{\text{under}}$ and $h_{\text{over}}$ are defined in figure 4.1. The eye height and width are often used as
an indication of signal integrity. Figure 4.2 shows how a data eye is formed by overlaying
a signal waveform over multiple cycles.
The eye width, $w$ in figure 4.1, is the time that the separation between high-going
and low-going signals is greater than zero. In practice, the receiver will attempt to sample
the signal near the moment of the widest eye opening. Due to uncertainties in the timing of
the transmitter and receiver and in the delay of the interconnect, the actual sampling may
occur at some time other than this ideal. The eye-width gives an indication of the robustness
of the interface to these timing uncertainties.
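The folding of a waveform into a data eye can be sketched in a few lines. The following is a minimal illustration, not the measurement code used for the results in this thesis: the function names are hypothetical, the waveform is assumed to be sampled at a whole number of taps per bit, and the opening is reported as the raw separation between the lowest high trace and the highest low trace rather than the normalized height of equation 4.1.

```python
def eye_openings(waveform, bits, taps_per_bit):
    """Fold a sampled waveform into one bit period and measure the eye.

    waveform: samples at tap rate, len(waveform) == len(bits) * taps_per_bit
    bits:     transmitted values (+1 or -1), one per bit time
    Returns, for each tap offset within the bit period, the separation
    between the lowest high-going trace and the highest low-going trace.
    """
    openings = []
    for offset in range(taps_per_bit):
        highs = [waveform[k * taps_per_bit + offset]
                 for k, b in enumerate(bits) if b > 0]
        lows = [waveform[k * taps_per_bit + offset]
                for k, b in enumerate(bits) if b < 0]
        openings.append(min(highs) - max(lows))
    return openings


def eye_width_fraction(openings):
    """Fraction of the bit time during which the separation is positive."""
    return sum(1 for o in openings if o > 0) / len(openings)


# Made-up waveform: four bits, four taps per bit.
bits = [1, -1, 1, -1]
waveform = [-0.2, 0.7, 0.9, 0.6,
            0.3, -0.8, -1.0, -0.5,
            -0.1, 0.6, 1.0, 0.8,
            0.2, -0.6, -0.9, -0.7]
openings = eye_openings(waveform, bits, taps_per_bit=4)
width = eye_width_fraction(openings)
```

In this made-up example the eye is closed at the first tap of each bit, where the signal is still in transition, and open at the other three, so the eye width is three quarters of the bit time.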
In this thesis, the effectiveness of a filter is quantified in the following three ways:
• eye height of the output signal given a pseudo-random input sequence.
• eye height of the output signal given the worst-case input sequence.
• the smallest bit time (or highest bit rate) at which the eye height of the output signals is greater than a specified amount, e.g. 50% of the nominal signal level, and the eye width is greater than another specified amount, e.g. 25% of the bit time.
Figure 4.2: Example of a data eye. Upper panel shows a random signal. Its corresponding eye diagram is shown in the lower panel.
4.2 Preliminaries
By defining some notation up-front, the presentation of the filter design methods can be
more succinct and direct. The responses of filters and buses are naturally written as con-
volutions while linear and least squares problems are naturally formulated with matrices.
Here I define some notation to show the connection between various convolutions and their
corresponding matrix representations.
Let $v \in \mathbb{R}^n$ be a vector of size $n$. The $n$ components are $v(0), v(1), \ldots, v(n-1)$. I’ll write
$|v|$ to denote the size of $v$, $\|v\|_2$ to denote the $L_2$ norm of $v$, and $\|v\|_\infty$ to denote the $L_\infty$ norm
of $v$.
Some matrix abbreviations used below are:
$I_n$: The $n \times n$ identity matrix
$0_{m,n}$: The $m \times n$ matrix of zeros
� � The
��
�matrix where
� � � ��� � � � �
(4.2)
4.2.1 Convolution
Linear Convolution: Let $u$ and $v$ be two vectors. The linear convolution of $u$ and $v$ is
the vector of size $|u| + |v| - 1$ defined below:
$$(u * v)(i) = \sum_{j=\max(0,\, i-|v|+1)}^{\min(i,\, |u|-1)} u(j)\, v(i-j) \qquad (4.3)$$
Linear convolution is commutative and associative.
Circular Convolution: Let $u$ and $v$ be two vectors in $\mathbb{R}^n$. Let $u \circledast v$ denote the
circular convolution of $u$ and $v$:
$$(u \circledast v)(i) = \sum_{j=0}^{n-1} u(j)\, v((i-j) \bmod n) \qquad (4.4)$$
Circular convolution is commutative and associative.
Let $v$ be a vector and $n$ be an integer with $n \ge |v|$. The zero-extension of $v$ pads $v$ with zero elements to produce a vector of size $n$:
$$\mathrm{extend}(v, n) = \begin{bmatrix} v \\ 0_{n-|v|,\,1} \end{bmatrix} \qquad (4.5)$$
Zero-extension is a linear operator:
$$\mathrm{extend}(v, n) = \begin{bmatrix} I_{|v|} \\ 0_{n-|v|,\,|v|} \end{bmatrix} v \qquad (4.6)$$
Let $\mathrm{extend}(|v|, n)$ be the left matrix on the right hand side of the equation.
Linear convolution can be expressed as circular convolution of zero-extended vectors:
$$u * v = \mathrm{extend}(u, |u|+|v|-1) \circledast \mathrm{extend}(v, |u|+|v|-1) \qquad (4.7)$$
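The relationship between linear and circular convolution in equation 4.7 can be checked numerically. The sketch below uses plain Python lists; the function names are chosen for illustration and are not taken from any library.

```python
def linear_convolution(u, v):
    """Linear convolution; the result has len(u) + len(v) - 1 entries."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def circular_convolution(u, v):
    """Circular convolution of two vectors of the same size n."""
    n = len(u)
    assert len(v) == n
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


def extend(v, n):
    """Zero-extension: pad v with zeros to size n (requires n >= len(v))."""
    assert n >= len(v)
    return list(v) + [0.0] * (n - len(v))


u = [1.0, 2.0, 3.0]
v = [1.0, -1.0]
n = len(u) + len(v) - 1
lin = linear_convolution(u, v)
circ = circular_convolution(extend(u, n), extend(v, n))
```

Both `lin` and `circ` evaluate to `[1.0, 1.0, 1.0, -3.0]`: padding to size |u| + |v| - 1 leaves enough zeros that the wrap-around terms of the circular convolution vanish.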
Block Linear Convolution: Let � ��� � � � be a matrix. We can think of � as a column
of � matrices:
�
�
�
� ...
� � ��
�����������
(4.8)
where each of the � �is a � ��� matrix. The block linear convolution of � � � � � � and
� ����� � � is defined similarly as linear convolution:
� � �� � � � � � �
� � �����
�� (4.9)
The block linear convolution of matrix � � � � � � and vector � � � � � is defined simi-
larly.
Block Circular Convolution: The block circular convolution of � and�
, where � � � �� � � � :
� �� � � � � �� �
� �� �
� � ���� ������� � (4.10)
Block circular convolution is associative. It is commutative if the product of the sub-
matrices is commutative, for example, if the sub-matrices are all symmetric or all circulant
(circulant matrices are defined in sec 4.2.2 below). Extending the extend operator to block
matrices, let � ��� � � � be a matrix, and let��� � .
extend � � ��� � � �� � � � � � � �
���� (4.11)
Zero extension on block matrices is a linear operator just as it is for vectors.
Block linear convolution can be expressed as block circular convolution of zero-
extended matrices:
� ��
extend � � � � ���� extend � � � � �
��
(4.12)
where � ��� � � � � � � � � � � .
4.2.2 Matrix Representations of Convolution
In this section, I will first present matrix representations for linear convolution, then extend
it to block linear convolution.
Let � � ��
be a vector. Let ��� � �� � be the circulant matrix [5] generated by
� :� � � � � � � � �� � � � � ����� � � � � (4.13)
The form of this circulant matrix is depicted below:
� �
�
� � � � � ��� � � � � ��� � � � � � � � � �� � � � � � � � � ��� � � � � � � � � �� � � � � � � � � � � � � � � � � �
......
......
� ��� � � � � ��� � � � � ��� � � � � � � � � �
��������������
(4.14)
Let � and � � be two vectors of the same size. Equations 4.4 and 4.13 yield:
� �� �� � � �
� (4.15)
Furthermore, if � , ��, . . . , � � are all vectors of the same size, then
� �� ��� � � � � � �� � � � � � � �
� � � � �� �� � � (4.16)
Note that matrix multiplication of circulant matrices is commutative and associative, just
like the corresponding convolution.
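The correspondence between circular convolution and multiplication by a circulant matrix can be illustrated as follows. This is a small sketch under the assumption that entry (i, j) of the circulant matrix generated by v is v((i - j) mod n), the usual convention; the helper names are hypothetical.

```python
def circulant(v):
    """Circulant matrix generated by v: entry (i, j) is v[(i - j) mod n]."""
    n = len(v)
    return [[v[(i - j) % n] for j in range(n)] for i in range(n)]


def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def matmul(a, b):
    """Matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def circular_convolution(u, v):
    n = len(u)
    return [sum(u[j] * v[(i - j) % n] for j in range(n)) for i in range(n)]


u = [1, 2, 3]
v = [4, 5, 6]
via_matrix = matvec(circulant(u), v)      # circulant(u) times v
via_sum = circular_convolution(u, v)      # direct circular convolution
uv = matmul(circulant(u), circulant(v))
vu = matmul(circulant(v), circulant(u))
```

Here `via_matrix` and `via_sum` agree, and `uv` equals `vu`, which reflects the commutativity of circulant matrix multiplication noted above.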
Let � ���� �� be a matrix, and let row
� � � � � � � � be the vector such that
row� � � � � � � � � � � � � � (4.17)
Likewise, let col� � � � � ���
�be the vector such that
col� � � � � � � � � � � � � � (4.18)
Convolution can be expressed with all arguments represented as matrices:
� � ��
col� � � � � � �� �
����� ����� � ��� �� � � �� �� � � � � � �
�(4.19)
Using equation 4.7, linear convolution can be expressed using matrix multiplication:
� � ��
extend � � � � �
� � � � � � � � extend � � � � � � � � � � � � � (4.20)
Define � �� ���� � as the matrix given by
� � extend � � ��� � � (4.21)
The form of this matrix is depicted below:
� �
v(0)
v(1)
v(m−1)
v(m−1)
v(m−1)
v(m−1)
v(m−1)
v(0)
v(0)
0
0(4.22)
where � � � � . The linear convolution of � � � � � � � can be written as
� � ��� � � � � � �� � � � � � �
�� � � � �� �� � �col� � ��� �� � � ��� �
(4.23)
where
� � �� � � � � � (4.24)
The matrix representation for linear convolution described above can be extended
to block linear convolution. Let � � � � � � be a matrix. As described in the previous
section, the matrix � can be regarded as a column of � submatrices of dimension � � �
each.
The block circulant matrix generated by � is
� �
�
� � � �� � � � � � � � � �
......
...
� � �� � � �� � � �
�����������
(4.25)
For those who prefer formulas to ellipses:
� � � � � � � � � � �div � � �
�� div � � ������� � � � � � � � � � ����� � � (4.26)
Let � � � � � � � � be matrices. Equations 4.10 and 4.25 yield:
� �� � � �
(4.27)
Using equation 4.12, block linear convolution of � � � � � � and� � � � � �
can be expressed using matrix multiplication:
� ��
extend � � ��� � � extend � � ��� � � �� extend � � ��� �
col� � � � � � � � � � � � �
(4.28)
where � � � �� � � is defined as extend � � ��� � � ,
� � �� , and col
� � � � � � � � is
defined in the obvious manner.
Block linear convolution of � � � � � � with � � � � � can also be expressed as
matrix multiplication:
� � � � �� extend � � ��� � (4.29)
where� � �
� .
Define the following operators:
� � block� � � � � ��� � � � creates circulant blocks from vector � ��� � � .
� � � � � � � � ��� � � div ���� �� � � � � ��� � � �� (4.30)
The form of this matrix is depicted below:
�
�
� � � � � � � � � � � �� � � � � � � � � � � � �
...
� �� � � � � � � � � � � � � � � �
�����������
(4.31)
� vec2cir� � � � ��� � converts a vector � � � � � to a circulant matrix:
vec2cir� � � � ��� � extend � block
� � � � ����� � � (4.32)
Define � ��� � �� � � � as the matrix given by vec2cir
� � � � ��� � . This matrix has the
same form as � � (see equation 4.14), except that now each block is a circulant
matrix of size � � � . Notice that � ��� is a block circulant matrix and
extend � � ��� � col� � � � �� �
With these operators, it is straightforward to see that for � � � � � � and � ��� � � ,
� � � col� � � � � � �� � (4.33)
where� � �
� .
The block linear convolution of two vectors � � ���� � and � � � � ��� � is defined
as:
� �� � � col
� � � � ��� � ���� � (4.34)
where� �
� � �. The block linear convolution of � ���
��� � � � � ��� ��� � � � �� ��� �� �can be written as
� �� � � � � � � � � �� �� � � �� � �� �
��� � � � ��� �� ��col� � � � �� �
� ��� � (4.35)
where
� � �� � �
� (4.36)
4.3 Least Squares Optimization Method with Pseudo-random
Input
4.3.1 Motivation
As discussed in the previous section, an ideal bus would in all cases deliver the near end
signal without distortion to the far end receiver, with some amount of delay. Thus, we
know that in the ideal case, the expected output signal would be simply a delayed version
of the input signal. The goal of filter design is to find a set of filter coefficients that make
the output signal as close to this ideal output signal as possible. Following the example of
[6][7], I use RMS error ($L_2$ metric) in this section as a measure of the distance of the filter
output from the ideal, delayed signal. In this case, filter design can be formalized as a least
squares optimization problem. In section 4.4, I use the worst-case difference between a signal
and the target as a measure of distance ($L_\infty$ metric) and show that the resulting filter design
problem is an instance of linear programming.
4.3.2 Least Square problem formulation
� Input
Consider a bus with � bus wires. Let�
input denote the length of the input training
sequence in bit times. Thus, an input is a function that gives a value, +1 or -1, for
each wire � � � � � � bus� � �
at each time� � � � ��� input
� � �. This function can be
represented by a vector, input, with input�� bus� �
��
denoting the value of the ��� �
wire at time�. Because filter coefficients are given in tap times, oversampling is
needed to convert the input from a sequence in bit times to a sequence in tap times.
Let input � ��
input�bus be a vector and � be a positive integer. The oversample op-
erator, oversample�input
� � � � bus�
computes in � �� �
input�bus which is the � times
oversample of input:
in� � � input
�� bus �
� � div�� bus � ���� � � ����� � bus
��
The oversample operator is linear. In particular, oversample���
input� � � � bus
�is a ma-
trix with:
oversample���
input� � � � bus
� � � � � � ������ �����
�if� � div
�� bus � �� � � div � �
and� � ����� � bus
� � � ����� � bus�
� otherwise
Thus,
oversample�input
� � � � bus�
oversample���
input� � � � bus
� �input
Define input � ������ � �� � � ��� �� � ��� as the vector given by oversample
�input
� � � � bus�.
The form of this vector is depicted below:
input � � �����
�
input
input ...
input � ��� �� Repeats � bus
� �more times
input � ���input � �����
...
input� � ���� ��
Repeats � bus� �
more times...
input � � ��� �� � ��� ��
���������������������������������������
(4.37)
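The oversample operator can be sketched directly from the layout depicted in equation 4.37: the block of values for all wires at one bit time is repeated once per tap in that bit. The names `r` (taps per bit) and `w_bus` (bus width) are labels chosen for this sketch, and the index formula is my reading of the definition above.

```python
def oversample(inp, r, w_bus):
    """Convert an input given in bit times to one given in tap times.

    inp:   flat list; inp[w_bus * t + i] is the value on wire i at bit time t
    r:     number of taps per bit (positive integer)
    w_bus: number of wires in the bus
    Output entry k copies the value for wire (k mod w_bus) at bit time
    k div (r * w_bus), so each bit-time block appears r times in a row.
    """
    assert len(inp) % w_bus == 0
    return [inp[w_bus * (k // (r * w_bus)) + k % w_bus]
            for k in range(r * len(inp))]


# Two wires, two bit times, three taps per bit.
bits = [1, -1,    # bit time 0: wire 0, wire 1
        -1, 1]    # bit time 1: wire 0, wire 1
taps = oversample(bits, r=3, w_bus=2)
```

Because every output entry is a copy of one input entry, the operator is linear and can equally be written as a 0/1 matrix applied to the input vector, as stated above.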
� Buses
In much the same manner as above, the impulse response of a bus with � bus wires is
a column of � bus � � bus matrices with each matrix giving the response corresponding
to a particular delay. Let�
bus denote the length of the bus impulse response in tap
times. The bus impulse response can be represented by a � bus�
bus � � bus matrix, bus
where bus�� bus� �
� out�� in�
is the response of the � out output wire of the bus after
a delay of�
tap times to the � in input wire.
Let in be the vector for the input of the bus in tap time.
in
input � � ����Let in � � �
� ���� denote the value of the input at tap time t:
in �� � � in
�� bus��� � ��� � � � � � � bus
� � �
Likewise, let bus � ���� ���� � ���� denote the bus impulse response at time
�:
bus �� � � � � bus
�� bus��� � � � ��� � � � � � � � � bus
� � �
Let output � �� � � � ��� ��� � � ���� � � ��� be the vector for the output of the bus and let output �
be the output at tap time�:
output � � �� � bus � �
�in� (4.38)
Equation 4.38 has the form of a block linear convolution. Thus,
output
col� � � bus � in � ���� � (4.39)
where� �
bus� � � input. Moreover, in this thesis, for simplicity, I assume that
wires are arranged around a cylinder (see chapter 3). This means that all wires have
the same characteristics, and bus � is a circulant matrix. Let � � �� ���� � ���� be the
vector whose�� bus
� � � ����� � � � � bus� �
components are the first column of bus � .That is,
bus
block�
��
bus � � �� �����
Thus the ouput signal of the bus given input in bit time is:
output
col� � � � � ����� �
input � � ����� � � ���� �(4.40)
where� �
bus� � � input.
� Filter
In figure 1.1, a filter is depicted for each wire of the bus. Because all wires have
the same characteristics, I assume that every filter is the same. For a � bus-bit bus,
this filter system can also be viewed as a � bus � � bus input/output network. Thus,
similar to the bus, the input/output relationship of the filter system with�
fir taps can
be expressed as:
filterOutput
col� � ��� � ����� � input � � ����� � � ���� � (4.41)
where� � �
��� � � ��� is the filter coefficient vector of the filter for wire 0 of the bus. It
has the form depicted below:
�
�
� � � ��
� � � � �
� � ���� �� � � �� � � ��
� � �
� � � � ��� �� ��� fir
� � �
�������������������������
(4.42)
where� � ��� �
denotes the contribution of the input on wire 0 at time 0 to the filter output
for wire � at time�.
Because the bus is symmetric, I restrict my attention to symmetric filters. That is, in
the�
vector depicted above,
� � � � � � � ��� � � � � � for � � � � ��� fir� � � � � � � � � � bus
� � �
Moreover, inputs on wires far away produce very little cross-talk. Therefore, it may
be practical to force the filter coefficients for these wires to zero to simplify the im-
plementation of the filter. In this thesis, filters with various sizes are investigated.
Filter size is defined as filter length � filter width. A�
fir ��� fir filter contains � fir sets
of�
fir filter coefficients for inputs on wire itself and the � fir� �
nearest wires in both
direction. Its filter coefficient vector fir ��� �� � � � � is depicted below:
fir
�
fir � � �fir � � �...
fir � � � �� � � �
fir � � �...
fir � � � �� ���
fir� � �
���������������������
Define filterExtend���
fir�� fir
�� bus� � �
� � � � ���� � � � � � � as the matrix depicted below:�
� � � �� � � ��� � � � � � � � � � �� � � � � � ��
� � � �� � � ��� � � � � � � � � � �� � � � � � ��
� � � �� � � ��� � � � � � � � � � �� � � � � � �� � �
� �
�����������������������������������
(4.43)
where � � � � � � �� denotes the horizontal concatenation of a column vector � � � � � �� � with a matrix
� � � � �� . Operator filterExtend�fir
�� fir
�� bus�
transforms fir to the full
filter coefficient vector�
in equation 4.42.
filterExtend�fir
�� fir
�� bus�
filterExtend���
fir�� fir
�� bus� �
fir (4.44)
Denote fir� ����� � � as the vector given by filterExtend
�fir
�� fir
�� bus�, which equals the full
filter coefficient vector�
depicted previously. Thus, with equation 4.41, the output
signal of the filter system with�
fir � � fir filters can be expressed as:
filterOutput
col� � ��� fir
� ���� � � � � ���� � input � � ���� � � ����� � (4.45)
where� �
fir� � � input.
� Target signal
Let � be the target signal which is a delayed version of the input signal.
I considered two ways to approximate the expected delay.
– LC delay: length $\cdot \sqrt{LC}$.
– approximate the delay by determining the peak of the Frobenius norm of the
bus impulse response.
The second one is more accurate because the effect of resistance is also taken into
account, especially for long buses where RC delay dominates LC delay. In this thesis,
all results are obtained with the second method.
Let ��� ���� � be the following matrix:
���� � � � �
��� �� �if � � � ��
� otherwise
where�
is the approximated delay in tap time and
� � � input���
The target signal � is given by
���� � input � � ����
(4.46)
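The second delay estimate, the peak of the Frobenius norm of the bus impulse response, can be sketched as below. The impulse response is assumed to be available as a list of square matrices, one per tap time; the function names and the sample values are hypothetical.

```python
import math


def frobenius_norm(m):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in m for x in row))


def peak_delay(bus_response):
    """Tap index at which the Frobenius norm of the response is largest.

    bus_response: list of matrices; bus_response[t][i][j] is the response on
    wire i after a delay of t tap times to an impulse on wire j.
    """
    norms = [frobenius_norm(m) for m in bus_response]
    return max(range(len(norms)), key=norms.__getitem__)


# Made-up 2-wire response over five tap times.
bus_response = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.2, 0.05], [0.05, 0.2]],
    [[0.6, 0.1], [0.1, 0.6]],
    [[0.3, 0.1], [0.1, 0.3]],
    [[0.1, 0.0], [0.0, 0.1]],
]
delay = peak_delay(bus_response)
```

For this made-up response the norm peaks at tap 2, so the target signal would be the input delayed by two tap times.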
� Output signal
With above analysis, it is straightforward to express the output signal of the system
with�
fir � � fir filters in figure 1.1 using matrix multiplication. Let the vector output ��� � ��� represent the output of the system in tap time:
output
col� � � h � ����� � fir
� ����� � � � � ����� � input � � ����� � ������ �
col� � � h � ����� � input � � ����� � � ����� � fir
� ����� � � � � ���� �
h ������ � input � ������ � ������ extend ��� fir�� bus
��� �filterExtend
���fir�� fir
�� bus�fir
(4.47)
where� �
input � � �fir� �
bus.
Let
� h � ���� � input � � ���� � ������ extend ��� fir
�� bus
��� �filterExtend
���fir�� fir
�� bus�
(4.48)
Then
output � �
fir (4.49)
� Least Squares Problem
With equations 4.46 and 4.49, the least squares problem is:
�����
fir ��� �� � �
� �� ��� � �
fir� �
Given�
and � , I used QR decomposition (i.e. the backslash command in Matlab) to
find the vector fir that minimizes the least square error of the over-determined system
� �fir� .
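The least squares step can be illustrated on a toy, single-wire version of the problem. The sketch below is not the Matlab code used for the results in this thesis: it replaces the backslash command with a small modified Gram-Schmidt QR solver written in plain Python, uses a hypothetical two-tap channel and a short made-up training sequence, and takes the target to be the undelayed training sequence padded with zeros.

```python
import math


def convolve(u, v):
    """Linear convolution of two sequences."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def convolution_matrix(s, n_cols):
    """Matrix G such that G times f equals convolve(s, f) for len(f) == n_cols."""
    n_rows = len(s) + n_cols - 1
    return [[s[i - k] if 0 <= i - k < len(s) else 0.0 for k in range(n_cols)]
            for i in range(n_rows)]


def lstsq_qr(a, b):
    """Minimize ||a x - b||_2 by modified Gram-Schmidt QR (full column rank)."""
    m, n = len(a), len(a[0])
    q = [[float(a[i][j]) for i in range(m)] for j in range(n)]  # columns of a
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            r[k][j] = sum(q[k][i] * q[j][i] for i in range(m))
            q[j] = [q[j][i] - r[k][j] * q[k][i] for i in range(m)]
        r[j][j] = math.sqrt(sum(x * x for x in q[j]))
        q[j] = [x / r[j][j] for x in q[j]]
    qtb = [sum(q[j][i] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for j in reversed(range(n)):  # back-substitution on the triangular factor
        x[j] = (qtb[j] - sum(r[j][k] * x[k] for k in range(j + 1, n))) / r[j][j]
    return x


def residual_norm(a, x, b):
    return math.sqrt(sum((sum(aij * xj for aij, xj in zip(row, x)) - bi) ** 2
                         for row, bi in zip(a, b)))


# Hypothetical channel with inter-symbol interference and a made-up training set.
h = [1.0, 0.5]
training = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
n_fir = 4
G = convolution_matrix(convolve(h, training), n_fir)   # output = G times fir
target = [float(x) for x in training] + [0.0] * (len(G) - len(training))
fir = lstsq_qr(G, target)
no_filter = [1.0] + [0.0] * (n_fir - 1)                 # pass-through filter
```

Comparing `residual_norm(G, fir, target)` with `residual_norm(G, no_filter, target)` shows how much closer the filtered output is to the target in the RMS sense than the unfiltered output.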
4.3.3 An example
To show the effectiveness of the equalizing filter approach in cross-talk cancellation, consider
a 32-bit bus of length 5 cm. The electrical parameters of the bus are: R = 0.066 Ω/cm,
C = 0.8 pF/cm, L = 3.99 nH/cm,
� � � � = 0.31, � � � � = 0.23. Filter design parameters
are:�
fir� , � fir
�, taps per bit = 4, bit time = 400 ps, length of the training sequence,
�input � � � bits.
For this particular example, in equation 4.46 and 4.49,
� Bus width � bus = 32. Length of the bus impulse response,�
bus is set to be 16 taps.
Thus, h is a vector of length:�
bus � � bus�� � �
.
� input is a vector of length: � bus ��
input ��� � � .
� fir � � �� .
� �is a matrix of size
� � � � ��� ���.
Pseudo-random input sequences are used as test sequences. By comparing the eye
opening with and without the designed filter, we get an indication of the effectiveness of
the filter.
Figure 4.3 shows the predistorted input signal on wire 1. The waveforms in figure 4.4
clearly show that the 8 × 2 filter greatly reduces the overshoot and undershoot,
which is also shown by the eye diagrams in figure 4.5. With the 8 × 2 FIR filter,
the eye height increases from 31% to 82%. This suggests that equalizing filters are a
promising method for cross-talk cancellation on high-speed buses. More thorough test
results are presented in section 4.5.
4.4 Linear Programming Method with Worst-case Input
In this section, a linear programming method is introduced with the assumption that we can
solve the formulated linear programming problem.
Figure 4.3: Predistorted signal: equalizing filter output. [Plot of voltage (V) against t (ns) showing the input signal on wire 1 and the predistorted signal on wire 1.]
4.4.1 Motivation
Although the least squares optimization method works and the FIR filter described greatly
improves the eye height of the transmitted signals, this method has several shortcomings.
• The filter designed by the LSQ optimization method depends strongly upon the pseudo-random input pattern used as the training sequence. To get a good filter design, a long training sequence must be used, which makes the filter design very slow for wide buses, which occur frequently in practice.
• The design objective is to transmit the bits without error. It is assumed that as long as a bit satisfies the eye specification, it will be received correctly. Thus, moving bits that already satisfy the eye specification closer to the target signal doesn’t matter. It’s the worst-case pattern that determines the eye height. Thus, the $L_2$ metric doesn’t strictly correspond to eye height, the metric defined in equation 4.1. For example, it is possible that for a training sequence, some filter coefficient set produces a very small RMS error but the output signal has one bad trace. It is that one bad trace which determines the eye height. Certainly, we can reformulate the same problem as a linear programming problem, such that for a given training sequence (pseudo-random input), the $L_\infty$ metric is minimized. However, in order to guarantee worst-case performance, ideally, all possible input combinations should be part of the training sequence. This is obviously not practical.
It turns out that for a given set of filter coefficients, the worst-case input pattern can be determined and thus the worst-case eye height can be computed. This section is devoted to this method, which minimizes $L_\infty$ over all possible inputs.
Figure 4.4: Examples of output signal for 32-bit interconnect network with (lower panel) and without (upper panel) 8 × 2 equalizing filters designed with the LSQ method. [Each panel plots voltage (V) against t (ns) for the input signal and the output signal on wire 1; panel titles: "System without filters" and "System with 8*2 filters".]
Figure 4.5: Eye-diagrams for a 32-bit interconnect network with (lower panel) and without (upper panel) 8 × 2 equalizing filters designed with the LSQ method. Red traces indicate high signal transmitted. Blue traces indicate low signal transmitted. [Each panel plots voltage (V) against t (ps); panel titles: "Eye diagram for system without filters (eye height 29%, eye width 75%)" and "Eye diagram for system with 8*2 filters (eye height 82%, eye width 75%)".]
First, I show that the search space for the $L_\infty$ metric is convex even when more general,
non-linear filters are considered. Formalize an eye height specification as a sequence of
tuples:
� ��� ��� ������� ��� ��� � � ����� � �� ��� � �� �
A filter�
satisfies�
if and only if for every input every output satisfies:
� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � ��� � � � ���output
��� � ��� � ��� � �� �input
��� � �� � �� ��
output��� � ��� � � ��� � �� � input
��� � � � ��(4.50)
where � is the number of taps per bit,�
is the expected delay of the bus, output
h � ����� � �
input�
and�
is the filter function. Let� � �
denote filter�
satisfies eye�
.
Let $f$ and $f'$ be two filters that satisfy some eye opening constraint $E$. Let
$f'' = \alpha f + (1-\alpha) f'$, where $0 \le \alpha \le 1$. Because the bus is linear,
a system with $f''$ produces output signals that are the same linear combination of what is
produced by systems with $f$ and $f'$. Since the bounds in equation 4.50 are inequalities on the
output that are preserved under such combinations, it then follows that $f''$ also satisfies $E$.
Thus the space of filters that satisfy eye opening constraint $E$ is convex.
The objective is to send -1, 1 signals down the bus as clearly as possible. In this
system, every wire of the bus has the same configuration and thus is interchangeable. Thus,
the original objective is the same as trying to send down 1 on wire 1 as clearly as possible
with the worst-case disturbances from other wires and preceding and following bits.
The output signal on wire 1 for the current bit is simply the sum of the effects
on wire 1 at the current bit from
• the input signal on wire 1 for the current bit, which is the signal expected to come through if there is no disturbance.
• the input signal on wire 1 at other bit times and also the input signals on other wires for the current bit and other bits, which produce disturbances on the first wire at the current bit.
Thus, the optimization problem can be restated as the following: Given that the
current bit input on the first wire is 1, find the best set of filter coefficients that makes the
output signal on wire 1 at the current bit as close to 1 as possible for the worst-case input
sequence which produces the largest disturbances on wire 1 from other bit times and other
wires.
� ����
subject to
undisturbed�
disturbances�
�� �
undisturbed�
disturbances�
�� �
(4.51)
4.4.2 Linear Programming Problem formulation
I now focus on the practical case where the filter is linear and FIR, and show that the design
problem is an instance of linear programming. The goal remains to send down 1 along
wire 1 as clearly as possible. A quantified version of this goal is: for the worst case input
sequence with 1 at the current bit, the output signal at some given sampling time is as close
to 1 as possible. That is, at this sampling point, the eye height is as high as possible. A
reasonable sampling point is�, the delay of the bus. Equation 4.51 shows that to formulate
the LP problem for the equalizing filter design, we need to know the undisturbed output at
the sampling point and the largest total disturbances at the sampling point.
Let in be the input sequence that is 1 bit long and only the bit on the first wire is 1:
in� � �
��� �� �if � �
� otherwise(4.52)
Because the whole system is linear and circulant, the response to this pulse input in gives
us all the information we need to compute the output for the worst-case disturbances. From
section 4.3, for the system with�
fir � � fir FIR filters, we know that the corresponding output
is given by G�fir, where G as given by equation 4.48:
G
h � ���� � in � � ����� � � ����� extend ��� fir
�� bus
��� �filterExtend
���fir�� fir
�� bus�
where� � � �
fir� �
bus is the length of the response in tap time. Different rows of G�fir
represent the response on some wire at some tap time. The contribution of the bit from
equation 4.52 to the output at the sampling time is:
undisturbed
row� � � � bus
�G� �
fir (4.53)
Responses from other wires and responses from the first wire arising from earlier and later
bits are the disturbances. For example, the disturbance on wire 1 at the sampling time
caused by input on the second wire � bit times earlier, is the same as the disturbance on wire
2 at the sampling time from the input on the first wire � bit times earlier. Moreover, it is the
same as the response of the original pulse input from equation 4.52 observed on wire 2 at
the tap time that is � bit times later than the sampling time. These are due to the linearity
and symmetry of the system. Thus,
disturbance� � � � � row
�� � � ��� � � � bus� � � G � � fir (4.54)
where disturbance� � � � � is the disturbance on the first wire at the sampling time given an
input of 1 on the ��� �
wire � bit times earlier. If the disturbances from other bit times and
other wires are all positive, we get the largest total disturbance and hence the worst-case
disturbances. Let d� � � � � denote the worst-case, positive disturbance on wire 1 at the sam-
pling time from the input on ��� �
wire � bit times earlier. Noting that each input to the filter
is either +1 or -1, the following inequality constraints compute the absolute value function
needed to obtain d� � � � � :
d� � � � � �
row�� � � ��� � � � bus
� � � G � � fir
d� � � � � � �
row�� � � ��� � � � bus
� � � G � � fir(4.55)
Because the cost function is positive monotonic in each of the d� � � � � , either the first con-
straint or the second constraint is tight at the optimal point.
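The worst-case reasoning above can be made concrete with a short sketch. It assumes the response of the whole system to the single-bit pulse has already been sampled at the sampling time for every wire and bit offset and flattened into one list; the entry for the current bit on wire 1 is the undisturbed term and every other entry is a disturbance. The function names and the numbers are hypothetical.

```python
def worst_case_output(pulse_samples, main_index):
    """Lowest output at the sampling point when a 1 is sent on wire 1.

    pulse_samples: contributions at the sampling point of a +1 bit placed at
                   each (wire, bit offset) position, flattened into one list
    main_index:    position of the current bit on wire 1 (undisturbed term)
    Each interfering bit is +1 or -1, so in the worst case every disturbance
    subtracts its absolute value from the undisturbed output.
    """
    undisturbed = pulse_samples[main_index]
    total = sum(abs(p) for k, p in enumerate(pulse_samples) if k != main_index)
    return undisturbed - total


def worst_case_pattern(pulse_samples, main_index):
    """Input bits (+1 or -1) that realize the worst case for a transmitted 1."""
    return [1 if k == main_index else (-1 if p > 0 else 1)
            for k, p in enumerate(pulse_samples)]


samples = [0.05, -0.1, 0.8, 0.2, -0.05]   # made-up values; entry 2 is the main term
worst = worst_case_output(samples, main_index=2)
pattern = worst_case_pattern(samples, main_index=2)
realized = sum(b * p for b, p in zip(pattern, samples))
```

With these made-up values the undisturbed output is 0.8 and the absolute disturbances sum to 0.4, so the worst-case output is 0.4; driving the system with `pattern` reproduces that value. The absolute values are what the pair of inequality constraints in equation 4.55 encode.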
Let���
be the matrix that contains all the rows in G that matter. The total number of
rows are � � � ��� . To calculate the total disturbances from other bit times and wires, ideally,
an infinitely long history should be considered because of the infinite impulse response
of the bus. This is not practical. Notice that most of the energy of the impulse response
is spread over about 6 times the LC delay of the bus (see figure 4.6). For the particular bus
model presented in this thesis, the LC delay is about 250 ps.
Figure 4.6: Frobenius norm of the bus impulse response. [Plot of the Frobenius norm of the impulse response of the bus against t (ps), marking the peak and the "future" and "history" portions of the response.]
All the results presented here
are obtained with:�
bus � � � length
� � � � � (tap time)
Moreover, notice that the bus impulse response does not rise immediately to the peak. This
means that not only the history bits affect the output of the current bit but also a few future
bits. Among the � � � ��� rows, there are � � � � � future bits, 1 current bit and the rest are history
bits. For � � � � � � bus� � � � � � � � � � � � � � � � � � ��� � � � � ��� �
row�� � � � � � � � ��� � � bus
� � � � � � row�� ��� ��� � ���
bus����
G�
(4.56)
Thus row� � � � ��� � � bus
� � � ��� � �fir gives the undisturbed output. Here is the equalizing filter
design problem as a linear programming problem in fir�d�
� :
� ����
�
��� � ��� ��� ��� ��� ��� ��� � � � � ��� ��� ��� ��� ��� ��� row� � � � � � � � bus
� � � ��� � �
���� ��� � �
�row� � � � � � � � bus
� � � � � � � � �
��� ��� � �
�����������
�fir
d
�
������� �
�� � � ��� ���
�
� �
�������(4.57)
4.4.3 Smoothing filter
It was found that if we average a few taps of the current output bit and use that as the
objective function of the LP problem, the eye height obtained is better than simply asking
the optimizer to bring one tap of the current output bit as close to 1 as possible. However,
a corresponding smoothing filter is needed at the receiver end in order to get the desired
output signal. Fortunately, such averaging behavior is typical of the input circuits on real
chips [10]. The new system structure is shown in figure 4.7. Smoothing filters will be
further examined in section 4.7.
Assuming we are averaging over 3 taps (a more sophisticated strategy will be dis-
cussed in section 4.7), define a smoothing operator:
$$ \mathrm{smooth} = \frac{1}{3}
\begin{bmatrix}
1 & 1 & 1 & 0 & \cdots & 0 \\
0 & 1 & 1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 1 & 1 & 1
\end{bmatrix} \tag{4.58} $$
[Figure 4.7 block diagram, blocks in order: Transmitter, Equalizing filter, BUS, Smoothing filter, Receiver.]
Figure 4.7: System with smoothing filter at the receiver end.
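As a minimal sketch of the 3-tap averaging performed by the smoothing filter, the Python below builds a banded averaging matrix and applies it to a sampled waveform. The function names, the boundary handling (only full windows are kept) and the sample values are illustrative assumptions, not taken from the thesis implementation.

```python
def smoothing_matrix(n_taps, window=3):
    """Return a (n_taps - window + 1) x n_taps moving-average matrix.

    Row k holds `window` consecutive entries equal to 1/window, so it
    averages taps k .. k+window-1. Keeping only full windows is an
    assumption made for this illustration.
    """
    rows = n_taps - window + 1
    return [[1.0 / window if k <= j < k + window else 0.0
             for j in range(n_taps)]
            for k in range(rows)]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


# Hypothetical bus output sampled at successive tap times.
bus_output = [0.0, 0.6, 1.2, 0.9, 0.3, 0.0]
smoothed = apply_matrix(smoothing_matrix(len(bus_output)), bus_output)
```

Each row of the matrix sums to 1, so a constant signal passes through unchanged while tap-to-tap variations are averaged out.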
With the smoothing operator, now
G
smooth h ������ � in � � ����� � � ����� extend ��� fir�� bus
��� �filterExtend
���fir�� fir
�� bus�
(4.59)
Moreover, the delay of the bus might not be the best sampling point. It was found
that the best sampling point depends on the filter size and the bit time. For example, at
300 ps bit time, 8 × 3 equalizing filters designed with 1 tap of extra delay in addition to the
bus delay give the highest eye height (81%) among all 8 × 3 filters. This is also better than the
system without the smoothing filter (74%). In the rest of this thesis, all testing results are
obtained with the extra delay varied to give the best eye height.
4.4.4 An example
The following example shows the effectiveness of the equalizing filter approach (LP method)
in cross-talk cancellation. I use the same bus parameters and filter size as the example given
in section 4.3.3. A pseudo-random test sequence is used.
For this particular example, the LP problem formulated has the following properties:
• number of filter coefficients, |fir|: 16.
• number of disturbance variables, |d|: 223. Thus, the total number of variables is 240.
• number of constraints: 448.
From figures 4.5 and 4.9, note that the eye height for the filter designed by the LP
method (� � norm) is slightly higher than that for the LSQ filter ($L_2$ norm), 84% vs. � ��� .
As expected, optimizing for eye height produces greater actual eye height than the “average
case” optimization of the LSQ method. The eye width for LP is significantly smaller than
that for LSQ, 50% vs. � � �. This is expected because the LP filter is optimized for eye
height at a specific sampling point, whereas the LSQ objective function considers the entire
waveform. Section 4.5 presents further comparisons.
The speed of FIR filter design with the LP method largely depends on the size of the
LP problem formulated. Thus it depends on how many bits (number of disturbances) are
used to design the filter and on the size of the filter. The number of disturbances is determined
by the length of the bus impulse response in bit times. The smaller the bit time, the larger the
LP problem. For an 8 × 2 filter design at 400 ps, on a Linux box with an 800 MHz Pentium
III CPU and 256 MB of memory, the design finishes within a few seconds. Based on this method, I
investigated other variations of linear FIR filters, such as time-variant linear FIR filters and
other types of smoothing filters.
[Figure 4.8 plots: voltage (V) versus t (ns), 0 to 60 ns, showing the input signal and the output signal on wire 1. Upper panel: system without filters. Lower panel: system with 8*2 filters.]
Figure 4.8: Example of output signals for systems with (lower panel) and without (upper panel) the equalizing filter designed with the LP method.
[Figure 4.9 plots: eye diagrams, voltage (V) versus t (ps), 0 to 800 ps. Upper panel: system without filters (eye height 29%, eye width 75%). Lower panel: system with 8*2 filters (eye height 84%, eye width 50%).]
Figure 4.9: Pseudo-random test: eye diagrams for systems with (lower panel) and without (upper panel) the equalizing filter designed with the LP method. Red traces indicate high signal transmitted. Blue traces indicate low signal transmitted.
4.5 Testing results: Comparison of LSQ method and LP method
4.5.1 Worst-case input sequence
In sections 4.3.3 and 4.4.4, pseudo-random input sequences were used to measure the eye
opening (eye height and eye width). By comparing the eye opening with and without the
filter, we get an indication of the effectiveness of the filter. A shortcoming of
using pseudo-random input sequences as test sequences is that the result varies considerably
from run to run if the input sequence is not long enough, while simulation with a very long input
sequence takes a long time. Inspired by the LP filter design procedure, I used the worst-
case input sequence for each filter instead of a pseudo-random input sequence as the test
sequence. Since the total disturbance from other bit times and other wires is the largest for
the worst-case input, the eye opening is the smallest among all input sequences and hence
the most representative.
For a given set of filter coefficients, the worst-case input sequence input with length
� � � ��� (where�
is defined in equation 4.48, � is the number of taps per bit) can be found
by:
� for every wire � and every bit � , calculate the resulting disturbance on wire 1 at a
given sampling time. If the filter coefficients are obtained with the LP method, the
sampling time is the same as what was used in the LP filter design. For an arbitrary
set of filter coefficients, the sample point is not defined in advance. Instead, I consider
every tap time as a possible sample point and select the one with the best eye height
as the sample point. Accordingly, the worst-case input sequence is determined by
finding the worst-case input for each possible sampling time and concatenating these
sequences together.
� input� � � � � � ��� � � � � ��� � = 1. We are looking for the largest negative disturbances
when 1 is sent. Negation of this input sequence is also a worst-case input sequence.
� If disturbance� � � � � � � , input
� � � � � �. Otherwise input
� � � � � � � , for � �� � � � bus
�, � � � � � � � � � � � .
In this section, all testing results are obtained with input sequences that are con-
catenations of the worst-case input sequence and pseudo-random input sequences, unless
otherwise indicated.
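The worst-case construction described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: signal levels are taken to be +1 and -1, `disturbance[i][j]` is a hypothetical table of the contribution (at the chosen sampling time) of a +1 sent on wire i in bit slot j to the output of the wire of interest, and all numbers are made up.

```python
def worst_case_input(disturbance, target_wire, current_bit):
    """Choose +/-1 inputs that minimise the sampled output when +1 is sent.

    disturbance[i][j] is assumed to be the contribution, at the sampling
    time, of a +1 on wire i in bit slot j to the output of `target_wire`.
    """
    pattern = []
    for i, row in enumerate(disturbance):
        bits = []
        for j, d in enumerate(row):
            if i == target_wire and j == current_bit:
                bits.append(1)      # the bit being received
            elif d > 0:
                bits.append(-1)     # make this term as negative as possible
            else:
                bits.append(1)
        pattern.append(bits)
    return pattern


def sampled_output(disturbance, pattern):
    """Sampled output as the sum of all per-bit contributions."""
    return sum(d * b
               for d_row, b_row in zip(disturbance, pattern)
               for d, b in zip(d_row, b_row))


# Hypothetical contributions for 3 wires x 3 bit slots; the entry for
# wire index 1, slot 1 is the current bit with nominal contribution 1.0.
disturbance = [[0.02, -0.10, 0.05],
               [-0.03, 1.00, 0.08],
               [0.01, 0.12, -0.04]]
pattern = worst_case_input(disturbance, target_wire=1, current_bit=1)
worst = sampled_output(disturbance, pattern)
```

With these made-up numbers the worst-case sampled value is the nominal 1.0 minus the sum of the absolute values of all other contributions, and negating the whole pattern gives the mirror-image worst case for a transmitted low bit.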
[Figure 4.10 plots: eye diagrams, voltage (V) versus t (ps), 0 to 600 ps. Upper panel: system with 8*3 filters (eye height 80%, eye width 50%). Lower panel: system with 8*3 filters (eye height 92%, eye width 50%).]
Figure 4.10: Worst-case test (upper panel) vs. pseudo-random test (lower panel): eye diagrams for systems with 8 × 3 equalizing filters designed with the LP method. Red traces indicate high signal transmitted. Blue traces indicate low signal transmitted.
[Figure 4.11 plot: eye height (0% to 100%) versus filter width (0 to 20), one curve each for 4, 8, 12 and 16 taps.]
Figure 4.11: Worst-case performance of different equalizing filters designed with the LP method. Only equalizing filters with eye width greater than 25% are shown. Simulation parameters are: R = 0.066 Ω/cm, C = 0.8 pF/cm, L = 3.99 nH/cm, � � � � = 0.31, � � � � = 0.23. Filter design parameters are: taps per bit = 4, bit time = 300 ps.
The upper panel of figure 4.10 shows an eye diagram obtained with such a test se-
quence. Comparison with the eye diagram in the lower panel shows that
the worst-case input sequence occurs rarely and that there is a significant difference between
the worst-case eye height (80%) and the pseudo-random eye height (92%). This suggests that if a
certain bit error rate can be tolerated by using an error-correcting code, the
maximum bit rate could be further improved.
4.5.2 Indirect coupling
For simplicity, the distributed coupled RLC model considers capacitive and in-
ductive coupling only between adjacent lines. Thus, I originally thought that it should be enough
to equalize a line using only information on the line itself and its two nearest neighbours.
For a three-bit interconnect, this method considers all wires
and therefore does give the optimal result. However, for a bus with more than three lines,
considering more lines than just the adjacent ones (increasing the filter width) achieves better
cross-talk cancellation. Figure 4.11 shows the performance of filters with
different sizes at 300 ps bit time. Compared with an � � � filter, the � � � filter has a much
greater eye opening. This indicates that although the direct coupling between non-adjacent
lines is weak and ignored in the interconnect network model, the indirect coupling between
non-adjacent lines is strong and should not be ignored in the equalizing filter design. Figure 4.11
also shows that, for the bus considered, this trend nears its asymptote when the
filter width is larger than 4.
Figure 4.12: Indirect coupling between non-adjacent lines
The indirect coupling between non-adjacent lines is illustrated in figure 4.12. From
figure 4.12, a pattern of transfer function in the frequency domain was conjectured. That is,
� � � � � � � �� � � ��� � � � �� �
� � � � � � �� � � ��� � � � �� �
� � � � � � ���� � ��� � � � � � If this pattern existed, a simple filter considering all lines could be designed. Unfortunately,
since we are considering far end noise cancellation, which is not only a function of input
voltage but also a function of distance � , this speculated pattern does not occur in practice.
[Figure 4.13 plot: eye diagram, voltage (V) from -15 to 15 versus t (ps), 0 to 600 ps, for the system with 8*16 filters (eye height 99%, eye width 0%).]
Figure 4.13: Eye diagram for system with 8 × 16 equalizing filters designed by the LP method. Red traces indicate high signal transmitted. Blue traces indicate low signal transmitted.
4.5.3 Over-fitting
Compared with the LSQ method, the LP method is fast and guarantees worst-case per-
formance. However, it has an over-fitting problem. Figure 4.14 shows that the
magnitude of the overshoot increases with the size of the equalizing filter. It suggests
that with more degrees of freedom, the optimizer tends to put more energy into the filter in
order to get a higher eye height at the sampling time, which results in much greater overshoot.
For some inputs, the output signal changes abruptly but is close to the target right at the sam-
pling time, resulting in a larger eye height but also a much smaller eye width. This effect
is clearly visible in figure 4.13, which shows an eye diagram obtained with an 8 × 16 filter
designed with the LP method. The eye height of the diagram is the largest among all filters
with length 8, but the eye is barely open and there is an enormous amount of overshoot.
[Figure 4.14 plots: overshoot (V) on a logarithmic axis from 1 to 1000000. Upper panel: overshoot versus number of filter coefficients (0 to 300). Lower panel: overshoot versus filter width (0 to 20), one curve each for 4, 8, 12 and 16 taps.]
Figure 4.14: Magnitude of overshoot increases with the size of the equalizing filter designed with the LP method. Upper panel shows this trend with the number of filter coefficients as the x axis. Lower panel shows this trend with the filter width as the x axis, for a given filter length. All simulations are done with the same parameters as in figure 4.11.
The lower panel of figure 4.14 shows that, for a given filter length, the magnitude of the overshoot
increases with the filter width. In contrast, for filter widths less than 6, the magnitude of the
overshoot does not show the same trend with the filter length. Thus, the filter width plays a more
critical role than the filter length in the over-fitting problem. This could be explained by the fact
that wires further away produce less disturbance on the wire than history bits on the wire
itself. Thus with a longer filter, the optimizer can easily push the eye height up without
putting more energy into the filter. When the filter is long enough to cover the most
significant portion of the bus impulse response, this trend reaches its limit. As figure 4.11
shows, at 300 ps bit time, filters with a length of 8 taps already do a very good job
of cross-talk cancellation. Compared with 12 tap filters, 16 tap filters do not significantly
improve the eye height.
Moreover, for the bus considered, the improvement in eye height from increasing the filter
width also stops when the filter width is more than 4 (see figure 4.11). So,
very long and wide filters bring no further benefit, yet the filter becomes more and more
complicated and expensive. In this sense, the over-fitting problem of the LP method may not be a
serious problem in practice.
Furthermore, instead of trying to bring 1 tap as close to 1 as possible, the LP method
can easily be formulated to bring 2 taps as close to 1 as possible. By doing this, sharp tran-
sitions are avoided and hence the amount of overshoot is decreased. In practice, this method
reduces the severity of the over-fitting problem of the LP method when design-
ing large filters.
4.5.4 Minimum bit time
To simplify design and yet achieve reasonable cross-talk cancellation, an important question
is how many neighbouring lines should be considered when designing the equalizing filter. In other
words, what is an appropriate width for the filter? Another question is how many taps (filter
length) we should use. Obviously, the longer the filter, the better the noise cancellation, but
the more expensive the design.

Taps  Width  Minimum bit time (ps)
             LP     LSQ
4     1      679    684
4     2      338    575
4     3      332    551
4     4      298    390
4     6      298    390
4     8      240    390
4     16     228    400
8     1      679    684
8     2      240    343
8     3      234    343
8     4      228    343
8     6      200    343
8     8      200    343
12    1      679    684
12    2      234    297
12    3      200    230
12    4      200    228
16    1      679    684
16    2      234    297
16    3      200    230
0     0      740 (no filter)

Table 4.1: Performance of equalizing filters with different sizes for a bus 32 bits wide and 5 cm long.

Table 4.1 shows simulation results with the filter length and width varied. To eval-
uate the performance of an equalizing filter, I use the maximum operating frequency (minimum
bit time) at which the eye height is around 50% and the eye width is over 25%. In
these simulations, design parameters are the same as in figure 4.10. Table 4.1 shows that:
• Equalizing filters designed with both the LP method and the LSQ method effectively
improve the maximum bit rate of the bus.
• Equalizing filters designed with the LP method have better performance than equalizing
filters designed with the LSQ method for every configuration considered. As
discussed further below, the advantage of the LP method is most pronounced for
wide filters.
• An � � � filter is a good choice in terms of performance and cost. Although an � � �
filter does improve the eye height at lower bit rates (see figure 4.11), it has a similar
minimum bit time to the � � � filter.
Note that width = 1 corresponds to separate pre-emphasis for each line. With width = 1, LSQ and LP
have similar performance. Because the focus of this work is on cross-talk cancellation,
high-frequency attenuation caused by the skin effect is not built into the bus model. Because
of this, the performance of the system without a filter (width = 0) and that of systems with pre-
emphasis filters (width = 1) are similar (740 ps vs. 679 ps). With cross-talk cancellation
(width ≥ 2), the performance of the bus is greatly improved (2.7 to 3.4 times higher bit rate
than independent pre-emphasis). With width ≥ 2, LP is significantly better than LSQ. This
might not be a completely fair comparison because all the LP results were obtained with
an additional smoothing filter at the receiver end whereas the LSQ results were obtained
without any smoothing filter. The LSQ method can easily be applied to the system with
a smoothing filter at the receiver end. However, the LSQ method with smoothing is no
better than LSQ with an increased number of taps. For example, an 8 tap filter designed by
the LSQ method with a 3 tap smoothing filter is no better than a 12 tap filter designed by
the LSQ method without any smoothing filter. Table 4.1 shows that the performance of
the LP method is better than this upper bound on the performance of the LSQ method with a
smoothing filter. It appears that LP and LSQ are comparably well suited for designing filters
for independent pre-emphasis. However, LP is much better for cross-talk cancellation. The
table also shows that cross-talk cancellation is essential to obtain high bit rates with wide
buses. Table 4.2 shows simulation results with the filter length and width varied for buses
20 cm long and 50 cm long. Again, the performance of the equalizing filters is quantified by
the minimum bit time at which the eye height is around 50% and the eye width is over
25%. All filters are designed with the LP method.

Taps  Width  Minimum bit time (ps)
             20 cm bus  50 cm bus
4     1      2354       6627
4     2      1419       2510
4     3      1341       2446
4     4      1264       2432
4     6      1264       2432
4     8      1108       2432
8     1      2295       6005
8     2      1030       2393
8     3      932        2198
8     4      874        2198
8     6      835        2042
12    1      2295       6005
12    2      932        2315
12    3      797        2042
12    4      777        1963
0     0      2734       6807

Table 4.2: Performance of equalizing filters with different sizes for buses 32 bits wide. All filters designed using the LP method.
4.6 Time-variant Linear FIR Filter
A bit time consists of several tap times. In most examples presented previously, there are
4 taps per bit. Among them, the first tap and the last tap are bit-transition taps. The other
two are stable taps. The receiver samples stable taps and is insensitive to the input value at
bit-transition taps. The FIR filter designed above has no information about bit transitions. It
treats every tap the same, no matter whether it is a bit-transition tap or a stable tap. What if
we treat them differently and let the filter know about bit transitions? This leads
to the design of a time-variant linear FIR filter. The idea is to assign a set of filter coefficients
to each tap position within a bit. For example, for an 8 × 3 FIR filter and 4 taps per bit, the time-variant
linear FIR filter contains 4 sets of 8 × 3 filter coefficients. By doing this, the optimizer can
differentiate transition taps from stable taps, and assign different filter coefficients to their
corresponding filters. The way this time-variant filter works is illustrated in figure 4.15. In
the figure, different sets of filter coefficients are indicated by different line styles.
This problem can be formulated as a linear programming problem in the same way as the
original simple linear FIR filter design. The only difference between the two linear pro-
gramming formulations is the way that the filter output is calculated. Let fir � � � � fir
�
denote the � set of filter coefficient vectors. The time-variant linear FIR filter coefficient
vector:
F
�
fir fir
�
...
fir�
�����������
(4.60)
[Figure 4.15 diagram: four coefficient sets, labelled FIR 1, FIR 2, FIR 3 and FIR 4.]
Figure 4.15: The convolution procedure of the time-variant FIR filter. Different sets of filter coefficients are indicated by different line styles.
For a given input sequence input in bit time, the filter output is:
filterOutput
shuffle
�
in
in
� � in
in
��������������
F (4.61)
where in �
input � � ���� � � ���� with� �����
input � � �fir� �
bus� � ��� � � , and shuffle �
�� � ���� � � � ��� is the following matrix:
shuffle� � � � �
��� �� �if� � div � � � � ����� � � � � and
� � ����� � � � � div� � � �
� otherwise(4.62)
The correctness of the time-variant FIR filter design is checked by adding a set
of equality constraints which specify that all four sets of filter coefficients are equal. After
adding the equality constraints, this design gives the same set of filter coefficients as the
simple FIR filter designed in section 4.4, and the same objective value. Because the simple
FIR filter is a special case of the time-variant one, the time-variant
FIR filter should be at least as good as the simple FIR filter designed in section
4.4. For example, in the case of 4 taps per bit and the same design parameters as for figure 4.10,
the time-variant 8 × 3 FIR filter has a worst-case eye height of 86% (vs. 80% for the simple 8 × 3 FIR
filter). The improvement in eye height shows that it does help to assign different filter
coefficients to different taps. However, the benefit might not be large enough to justify any
extra cost in an implementation.
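The idea of switching coefficient sets by tap position can be sketched in a few lines of Python. This is a single-wire simplification for illustration only (the filters in this chapter also take inputs from neighbouring wires); the function names and coefficient values are assumptions. The last lines mirror the correctness check described above: when all coefficient sets are equal, the time-variant filter reduces to the ordinary FIR filter.

```python
def fir_filter(samples, coeffs):
    """Causal FIR filter: y[n] = sum_k coeffs[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def time_variant_fir(samples, coeff_sets):
    """FIR filter whose coefficient set depends on the tap position in a bit.

    coeff_sets[p] is used for output taps n with n % len(coeff_sets) == p,
    so with 4 taps per bit there are 4 coefficient sets.
    """
    taps_per_bit = len(coeff_sets)
    out = []
    for n in range(len(samples)):
        coeffs = coeff_sets[n % taps_per_bit]
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


# Two bits (+1 then -1) at 4 taps per bit; the coefficients are made up.
samples = [1.0] * 4 + [-1.0] * 4
base = [1.2, -0.3, 0.1]
equal_sets_output = time_variant_fir(samples, [base] * 4)
plain_output = fir_filter(samples, base)
varied_output = time_variant_fir(samples, [[1.0], [2.0], [3.0], [4.0]])
```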
4.7 Optimized Smoothing Filter
The system structure shown in figure 4.7 naturally leads to the topic of optimized smoothing
filter design. In previous sections, a smoothing filter which simply averages over 3 taps was
used. Test results show that this is a good choice. However, it is not optimal. For example,
the eye diagrams show that the weights assigned to those 3 taps should not be the
same. The first tap and the last tap are closer to the bit transition and should have smaller weights.
The middle tap contributes more to the eye height and should be assigned a larger weight.
Moreover, there is no reason to limit the window size of the smoothing filter to only 3 taps.

Filters           Smoothing filter                      Eye height at
(taps × width)                                          bit time 300 ps
-                 -                                     7%
8 × 3             No smoothing filter                   74%
8 × 3             simple averaging over 3 data points   81%
8 × 3             optimized 3 tap window                86%
8 × 3             optimized 9 tap window                95.5%

Table 4.3: Performance of different smoothing filters with 8 × 3 equalizing filters designed by the LP method at 300 ps.

The system where the coefficients of both the equalizing filter and the smoothing
filter are taken as variables is not linear because output values of the bus depend on the
product of the coefficients of the two filters. Thus, an LP formulation is no longer possible in
this case. The following simple strategy addresses this problem. Given a set of smoothing
filter coefficients, a set of optimal equalizing filter coefficients and the resulting optimal
objective value can be easily computed with the LP method presented previously. Thus the
LP method can be treated as a function with the smoothing filter coefficients as its variables and
the objective value as its return value. Then a general optimizer (in particular, fmincon()
provided by the Matlab Optimization Toolbox) can be used to minimize this function over the
space of smoothing filter coefficients. This strategy does not guarantee finding the
optimal solution; in fact, it does not even indicate how close we are to the
optimal point. However, it does find better smoothing filter coefficients and equalizing filter
coefficients in terms of a smaller objective value and a larger eye height (see table 4.3).
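The nested structure of this strategy (an outer general-purpose search over smoothing weights wrapped around an inner design step that returns an objective value) can be sketched as follows. This is only an illustration under loud assumptions: a simple derivative-free coordinate search stands in for fmincon(), and `design_cost` is a hypothetical placeholder with a closed form, not the LP of section 4.4. In a real setting `design_cost` would solve the LP for the given smoothing weights and return its optimal objective value.

```python
def pattern_search(cost, start, step=0.25, min_step=1e-6, max_iter=1000):
    """Derivative-free coordinate search; a stand-in for a general optimizer."""
    point = list(start)
    best = cost(point)
    for _ in range(max_iter):
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = cost(trial)
                if value < best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0          # refine the search once no move helps
            if step < min_step:
                break
    return point, best


def design_cost(weights):
    """Hypothetical placeholder for 'solve the inner LP, return its objective'."""
    target = [0.2, 0.6, 0.2]     # made-up optimum, for illustration only
    return sum((w - t) ** 2 for w, t in zip(weights, target))


# Start from simple averaging over 3 taps and let the outer search adjust.
weights, value = pattern_search(design_cost, [1 / 3, 1 / 3, 1 / 3])
```

As with the strategy described above, a local search of this kind gives no guarantee of reaching the global optimum; it only reports the best point it has found.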
Table 4.3 shows the effectiveness of this approach. In this example, the bus is a
32-bit bus with a length of 5 cm. The electrical parameters of the bus are: R = 0.066 Ω/cm,
C = 0.8 pF/cm, L = 3.99 nH/cm, � � � � = 0.31, � � � � = 0.23. Other simulation parameters are:
taps per bit = 4,� fir� , � fir � .
4.8 Summary
In this chapter, I described two methods of linear equalizing filter design: the LSQ method,
corresponding to the $L_2$ metric, and the LP method, corresponding to the � � metric. The simula-
tion results presented in section 4.5 demonstrate that the LP method outperforms the LSQ
method and provides effective methods for designing cross-talk canceling equalizing filters
that greatly increase the bandwidth of high-speed digital buses. Based on the linear FIR
filter designs, time-variant FIR filter design and optimized smoothing filter design were
presented.
Chapter 5
Predictor-Corrector Algorithm with
Model Reduction
As shown in chapter 4, an optimal filter design problem can be formulated into the following
LP problem:� ����������� ��� � ���
(5.1)
In chapter 4, the LP method is introduced based on the assumption that the LP problem
formulated could be solved. It turns out that the linprog() routine provided by Matlab does
not converge when more than 7 bits are used to design the filter. In the case of large-
scale problems, linprog() implements LIPSOL (Linear Interior Point Solver [21]), which is
a variant of Mehrotra’s predictor-corrector algorithm, a primal-dual interior-point method.
It is known that when approaching the optimal solution, the system gets more and more
ill-conditioned, which may eventually lead to non-convergence. In this chapter, I describe
an approach that I implemented to overcome the ill-conditioning problem and that can be used
to solve problems where $A$ is too large to be given explicitly. This approach employs
Mehrotra’s predictor-corrector algorithm along with a model reduction technique.
Section 5.1 introduces Mehrotra’s predictor-corrector algorithm. Section 5.2 then discusses
several major implementation issues. Section 5.3 is devoted to the model reduction
technique that I used to overcome the problem of ill-conditioning. Although the method
introduced here is used to solve the linear filter design problem (with $A$ explicitly given), it could
easily be adapted to solve linear programs with $A$ given implicitly, by using an iterative
solver instead of Cholesky factorization to solve the linear systems encountered.
5.1 Mehrotra’s predictor-corrector algorithm
Primal-dual interior point methods outperform the simplex method on many larger prob-
lems and perform better than other interior point methods [14]. Among many general al-
gorithmic approaches, the most effective one in practice has proven to be the primal-dual
infeasible-interior-point approach, including a number of variants and enhancements such
as Mehrotra’s predictor-corrector technique [13]. The Matlab function linprog() imple-
ments a variant of this algorithm in the case of large-scale problems.
Consider the LP problem in standard form:
$$ \min_x \; c^T x \quad \text{subject to} \quad Ax = b, \; x \ge 0 \tag{5.2} $$
where $A \in \mathbb{R}^{m \times n}$, which determines the sizes of the other vectors involved. The dual problem
for equation 5.2 is
$$ \max_{\lambda, s} \; b^T \lambda \quad \text{subject to} \quad A^T \lambda + s = c, \; s \ge 0 \tag{5.3} $$
It is well known that primal-dual solutions of equations 5.2 and 5.3 are characterized
by the Karush-Kuhn-Tucker conditions [14]:
$$ F(x, \lambda, s) = \begin{bmatrix} A^T \lambda + s - c \\ Ax - b \\ XSe \end{bmatrix} = 0, \qquad (x, s) \ge 0 \tag{5.4} $$
where
$$ X = \mathrm{diag}(x_1, \ldots, x_n), \quad S = \mathrm{diag}(s_1, \ldots, s_n), \quad e = (1, \ldots, 1)^T. $$
The system of equations 5.4 can be solved by applying Newton’s method and carrying out
a line search to enforce the non-negativity constraints on $x$ and $s$. Unfortunately, often
we can only take a small step before the non-negativity constraints are violated. Therefore,
the pure Newton’s method with line search converges very slowly in this case. Rather
than solving the system of equations 5.4 directly, primal-dual interior point methods introduce the
concept of a central path. The central path is parameterized by a scalar $\tau$, and consists of the
set of points that solve the following system for $\tau > 0$:
$$ \begin{bmatrix} A^T \lambda + s - c \\ Ax - b \\ XSe \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \tau e \end{bmatrix}, \qquad (x, s) > 0 \tag{5.5} $$
The role of $\tau$ is to enforce that the complementarity products $x_i s_i$ have the same
value for all indices. Hence, the central path keeps iterates biased towards the interior of
the nonnegative orthant $(x, s) \ge 0$. As $\tau$ approaches 0, the solution of the system 5.5
approaches the optimal solution $(x^*, \lambda^*, s^*)$, which is the solution of the system 5.4.
In practice, $\tau$ is defined as the product of a centering parameter $\sigma$ and a complementarity
gap $\mu$, where $\sigma \in [0, 1]$ and $\mu = x^T s / n$.
Mehrotra’s predictor-corrector algorithm [13][14] implements the basic ideas de-
scribed above with an extra second-order correction. It consists of three major steps.
Given an initial point $(x^0, \lambda^0, s^0)$ with $(x^0, s^0) > 0$.
For $k = 0, 1, 2, \ldots$
Predictor step: At this step, it computes the pure Newton (affine-scaling) direction
$(\Delta x^{\mathrm{aff}}, \Delta \lambda^{\mathrm{aff}}, \Delta s^{\mathrm{aff}})$ by solving:
$$ \begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x^{\mathrm{aff}} \\ \Delta \lambda^{\mathrm{aff}} \\ \Delta s^{\mathrm{aff}} \end{bmatrix}
= \begin{bmatrix} -r_c \\ -r_b \\ -XSe \end{bmatrix} \tag{5.6} $$
where $r_b = Ax - b$ and $r_c = A^T \lambda + s - c$ are the residuals in primal and dual
feasibility respectively.
Adaptive approach to compute centering parameter $\sigma$: This parameter is calculated in
terms of the complementarity gap at the current point and the complementarity gap
after a hypothetical step in the affine-scaling direction is taken. The step size in the
affine-scaling direction is calculated by
$$ \alpha^{\mathrm{pri}}_{\mathrm{aff}} = \min\Big(1, \min_{i:\, \Delta x^{\mathrm{aff}}_i < 0} -\frac{x_i}{\Delta x^{\mathrm{aff}}_i}\Big), \qquad
\alpha^{\mathrm{dual}}_{\mathrm{aff}} = \min\Big(1, \min_{i:\, \Delta s^{\mathrm{aff}}_i < 0} -\frac{s_i}{\Delta s^{\mathrm{aff}}_i}\Big) \tag{5.7} $$
$$ \mu_{\mathrm{aff}} = (x + \alpha^{\mathrm{pri}}_{\mathrm{aff}} \Delta x^{\mathrm{aff}})^T (s + \alpha^{\mathrm{dual}}_{\mathrm{aff}} \Delta s^{\mathrm{aff}}) / n $$
Then set the centering parameter to $\sigma = (\mu_{\mathrm{aff}} / \mu)^3$. Thus, the centering parameter
is small when good progress can be made in the affine direction and large when the
affine direction produces little improvement and more centrality is needed. This is
chosen to trade off between the twin goals of reducing $\mu$ and improving centrality.
Corrector step: Solve the following equations to get a corrected, centered step direction.
It is essentially a step based on the Taylor series expansion of the complementarity
equations [15].
$$ \begin{bmatrix} 0 & A^T & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta \lambda \\ \Delta s \end{bmatrix}
= \begin{bmatrix} -r_c \\ -r_b \\ -XSe - \mathrm{Diag}(\Delta x^{\mathrm{aff}}) \, \mathrm{Diag}(\Delta s^{\mathrm{aff}}) \, e + \sigma \mu e \end{bmatrix} \tag{5.8} $$
Compute the step sizes $\alpha^{\mathrm{pri}}$ and $\alpha^{\mathrm{dual}}$ similarly as in equations 5.7, and update
$x^{k+1} = x^k + \alpha^{\mathrm{pri}} \Delta x$, $(\lambda^{k+1}, s^{k+1}) = (\lambda^k, s^k) + \alpha^{\mathrm{dual}} (\Delta \lambda, \Delta s)$.
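The step-length rule and the adaptive choice of the centering parameter can be sketched in plain Python. This is a minimal illustration, assuming the usual ratio test for the step length and the cubic heuristic for the centering parameter; the function names and the small numerical example are assumptions made for illustration, not the implementation used in this thesis.

```python
def max_step(values, directions):
    """Largest alpha in [0, 1] keeping values + alpha * directions >= 0."""
    alpha = 1.0
    for v, d in zip(values, directions):
        if d < 0:
            alpha = min(alpha, -v / d)
    return alpha


def duality_measure(x, s):
    """Complementarity gap mu = x^T s / n."""
    return sum(xi * si for xi, si in zip(x, s)) / len(x)


def centering_parameter(x, s, dx_aff, ds_aff):
    """sigma = (mu_aff / mu)^3, computed from the affine-scaling direction."""
    a_pri = max_step(x, dx_aff)
    a_dual = max_step(s, ds_aff)
    x_aff = [xi + a_pri * d for xi, d in zip(x, dx_aff)]
    s_aff = [si + a_dual * d for si, d in zip(s, ds_aff)]
    return (duality_measure(x_aff, s_aff) / duality_measure(x, s)) ** 3


# Made-up iterate and affine-scaling direction with n = 2.
x, s = [1.0, 2.0], [2.0, 1.0]
dx_aff, ds_aff = [-0.5, -4.0], [-1.0, 0.5]
sigma = centering_parameter(x, s, dx_aff, ds_aff)
```

In this example the affine step reduces the complementarity gap from 2 to 0.375, so the resulting centering parameter is small, matching the intent that little centering is needed when the affine direction makes good progress.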
5.2 Implementation Details
With the above framework, we still need to specify the following: (1) the initial point and the
stopping criteria; (2) how to solve the linear systems 5.6 and 5.8.
5.2.1 Starting and Stopping
The starting point selection is based on [12]. Matlab’s lsqr() solver solves the system $Ax = b$
and makes an initial estimate to the primal variable � , denoted as � . Then define �
� ��� ��� � ��� ������
� � �� � � � � � � � � � � � � and � � � � � � � , where � � � denotes
� norm.
Then, for each � � � � � ��� , set ���� ��� � � � �
�and �
��� ��� � � � � �
��. At last compute
�
given� � �
�
�.
The stopping criterion is the standard one [21]:
$$ \mathrm{error} = \frac{\|r_b\|}{\max(1, \|b\|)} + \frac{\|r_c\|}{\max(1, \|c\|)}
+ \frac{|c^T x - b^T \lambda|}{\max(1, |c^T x|, |b^T \lambda|)} \le \mathrm{tol} \tag{5.9} $$
where $r_b$, $r_c$ are the residuals in primal and dual feasibility. In this implementation, tol is
$10^{-8}$ by default.
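A relative-error test of this kind can be sketched as below. The sketch assumes a three-term total error (relative primal residual, relative dual residual, relative duality gap) in the style of LIPSOL-type solvers, with dense list-of-lists matrices; the function names and the tiny example problem are illustrative assumptions.

```python
def norm2(v):
    """Euclidean norm of a vector."""
    return sum(a * a for a in v) ** 0.5


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def total_relative_error(A, b, c, x, lam, s):
    """Relative primal residual + relative dual residual + relative gap."""
    m, n = len(A), len(c)
    r_b = [dot(A[i], x) - b[i] for i in range(m)]
    r_c = [sum(A[i][j] * lam[i] for i in range(m)) + s[j] - c[j]
           for j in range(n)]
    primal, dual = dot(c, x), dot(b, lam)
    return (norm2(r_b) / max(1.0, norm2(b))
            + norm2(r_c) / max(1.0, norm2(c))
            + abs(primal - dual) / max(1.0, abs(primal), abs(dual)))


# Tiny example: minimise x1 + 2*x2 subject to x1 + x2 = 1, x >= 0.
A, b, c = [[1.0, 1.0]], [1.0], [1.0, 2.0]
err_optimal = total_relative_error(A, b, c, [1.0, 0.0], [1.0], [0.0, 1.0])
err_interior = total_relative_error(A, b, c, [0.5, 0.5], [0.0], [1.0, 2.0])
```

At the optimal primal-dual pair all three terms vanish, while the feasible but non-optimal point has zero residuals and a non-zero duality gap term.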
5.2.2 Solving the linear systems
The special structure of the left-hand-side matrix in 5.6 and 5.8 allows us to reformulate
them as systems with positive definite matrices [14]. For example, 5.6 can be reformulated
into the following system:
$$ \begin{aligned}
A D^2 A^T \Delta \lambda &= -r_b - A D^2 r_c + A S^{-1} r_{xs} \\
\Delta s &= -r_c - A^T \Delta \lambda \\
\Delta x &= -S^{-1} (r_{xs} + X \Delta s)
\end{aligned} \tag{5.10} $$
with $D = S^{-1/2} X^{1/2}$ and $r_{xs} = XSe$. Cholesky factorization is used to solve
the first equation in this linear system. Then $\Delta s$ and $\Delta x$ are obtained. If we do not know
anything about $A$ explicitly except its dimensions, iterative methods can be used to solve
this linear system.
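A minimal pure-Python sketch of this solution path for the predictor step is given below: form the normal-equations matrix, factor it with a textbook Cholesky routine, then back-substitute for the other two blocks. The notation (x, lam, s), the dense list-of-lists storage, and the small example are assumptions for illustration; a practical implementation would use sparse linear algebra.

```python
def cholesky(M):
    """Lower-triangular L with L L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = acc ** 0.5 if i == j else acc / L[j][j]
    return L


def cholesky_solve(L, rhs):
    """Solve L L^T y = rhs by forward then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (z[i] - sum(L[k][i] * y[k] for k in range(i + 1, n))) / L[i][i]
    return y


def affine_direction(A, b, c, x, lam, s):
    """Predictor direction obtained from A D^2 A^T dlam = rhs."""
    m, n = len(A), len(x)
    r_b = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
    r_c = [sum(A[i][j] * lam[i] for i in range(m)) + s[j] - c[j]
           for j in range(n)]
    d2 = [x[j] / s[j] for j in range(n)]          # diagonal of D^2
    M = [[sum(A[i][j] * d2[j] * A[k][j] for j in range(n))
          for k in range(m)] for i in range(m)]
    # rhs = -r_b - A D^2 r_c + A S^{-1} (X S e); note S^{-1} X S e = x.
    rhs = [-r_b[i] + sum(A[i][j] * (x[j] - d2[j] * r_c[j]) for j in range(n))
           for i in range(m)]
    dlam = cholesky_solve(cholesky(M), rhs)
    ds = [-r_c[j] - sum(A[i][j] * dlam[i] for i in range(m))
          for j in range(n)]
    dx = [-x[j] - d2[j] * ds[j] for j in range(n)]
    return dx, dlam, ds


# Made-up problem data and an infeasible interior starting iterate.
A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
b, c = [1.0, 1.0], [1.0, 2.0, 1.0]
x, lam, s = [1.0, 1.0, 1.0], [0.0, 0.0], [1.0, 1.0, 1.0]
dx, dlam, ds = affine_direction(A, b, c, x, lam, s)
```

The returned direction can be checked against the three block rows of the Newton system: the primal residual, the dual residual, and the linearized complementarity equations.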
5.3 Ill-conditioning and Model Reduction
Each iteration of the predictor-corrector algorithm involves solving linear systems whose
left-hand-side matrix is the same as $A D^2 A^T$ in equation 5.10. As we approach the optimal
point, either $x_i$ or $s_i$ (for each $i = 1, \ldots, n$) decreases to zero. Thus the elements of
the diagonal matrix $D$ take on both huge and tiny values. For this reason, ill-conditioning
often occurs during the final stages of the predictor-corrector algorithm. In practice, this
prevented linprog() from converging to a point that satisfied the error tolerance for the
linear programs arising in filter design using the LP formulation presented in chapter 4. My
implementation uses two techniques to handle this ill-conditioning: hopping to the optimal
point and model reduction.
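The growth in the spread of the diagonal scaling can be illustrated with a few lines of Python. The sketch assumes an idealised iterate in which some variables stay near 1 while their slacks shrink with the complementarity gap, and the others do the opposite; the function name and the chosen values are assumptions for illustration.

```python
def d_squared_spread(mu, n_basic=2, n_nonbasic=2):
    """Ratio of largest to smallest diagonal entry of D^2 = X S^{-1}.

    Assumes an idealised iterate where basic variables have x_i = 1 and
    s_i = mu, and non-basic variables have x_i = mu and s_i = 1.
    """
    x = [1.0] * n_basic + [mu] * n_nonbasic
    s = [mu] * n_basic + [1.0] * n_nonbasic
    d2 = [xi / si for xi, si in zip(x, s)]
    return max(d2) / min(d2)


# The spread grows like 1 / mu^2 as the complementarity gap shrinks.
spreads = {mu: d_squared_spread(mu) for mu in (1e-1, 1e-3, 1e-5)}
```

Under this idealisation, reducing the gap by a factor of 100 widens the spread of the scaling entries by a factor of 10000, which is why the scaled systems become hard to solve accurately close to the solution.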
• Hop over to the optimal
If we are really close to the optimal, the LP solver can hop over to the nearest vertex
on the polytope boundary and check if it is optimal. The simplest thing to do is to
set the smallest $n - m$ components of $x$ to 0 and solve $Ax = b$ for the remaining
components of $x$. As we know that the $x$ variables in the primal form reflect the marginal
costs of the corresponding constraints in the dual form, $x_i = 0$ means the $i$-th constraint
in the dual form is non-essential (the optimal solution is not affected by perturbations
of this constraint). So we solve the remaining part of $A^T \lambda = c$ for $\lambda$. Then we can
get values for the slack variables $s$. After we get the solution, we need to check for
its feasibility and optimality (i.e. make sure all non-negativity constraints on $x$ and
the slack variables $s$ are satisfied).
I have tried this for linear FIR filter design problems. The solutions I got had negative
$x$ and $s$ values. This suggests that although we are close to the optimal (the system gets
ill-conditioned), we are not close enough to be able to identify the optimal vertex.
• Model Reduction
Although we are not close enough to hop over to the optimal solution directly,
we know that we are in a region close to the optimal vertex. In this region, we
should be able to identify some of the non-essential constraints (although not all of
them). Since the $x$ variables of the primal form reflect the marginal costs of the
corresponding constraints of the dual form, the $x_i$ corresponding to these
ready-to-identify non-essential constraints are very small and contribute to the
ill-conditioning of the linear system. After we identify them, we set those $x_i$ to 0
and ignore their corresponding constraints in the dual form. This reduces the original
LP problem to a smaller problem with a smaller condition number. Then we solve the
smaller problem as before. If the total error gets smaller than the tolerance, we stop.
If the system gets ill-conditioned, we do model reduction again. We keep doing this until
either the total error is smaller than the tolerance or all $n - m$ non-essential
constraints are identified.
– Empirical criterion for $x_i$ being small:
If � �� � � � � � � � � � �
, set $x_i$ to 0 and label the $i$-th constraint of the dual form
as non-essential. The threshold was chosen simply to make sure that $x_i$ was
relatively small.
– When to do model reduction?
In this implementation, model reduction is done when the linear system gets
ill-conditioned. cond(), provided by Matlab, is used to calculate the condition
number of the matrix. A default threshold on the condition number determines
whether model reduction is needed. If the explicit form of $A$ is not available,
we have no way to calculate this condition number. However, the relative residual
of the iterative solver minres() is a good indication of how ill-conditioned the
system is. It appears that, for a fixed number of iterations, the more
ill-conditioned the system is, the larger the relative residual is.
– Model reduction guesses are verified at the end by plugging the final solution into
the original problem and checking the optimality conditions.
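The two techniques in the list above can be sketched on a toy problem. This is only an illustration under stated assumptions: the LP is a made-up four-variable example in the standard primal form (minimize $c^T x$ subject to $Ax = b$, $x \ge 0$), the linear solver is plain Gaussian elimination without handling of singular bases, and the smallness test `x[i] < small_tol * s[i]` with `small_tol = 1e-6` is a hypothetical stand-in, not the criterion used in the implementation. All names are invented for this sketch.

```python
def solve(M, rhs):
    """Solve the square system M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (aug[i][n] - sum(aug[i][k] * y[k] for k in range(i + 1, n))) / aug[i][i]
    return y

def hop_to_vertex(A, b, c, x_iter, tol=1e-9):
    """Zero the n - m smallest components of x_iter, solve for the rest, test optimality."""
    m, n = len(A), len(c)
    order = sorted(range(n), key=lambda i: x_iter[i])
    basic = sorted(order[n - m:])                      # indices of the m largest components
    x = [0.0] * n
    x_basic = solve([[A[p][j] for j in basic] for p in range(m)], b)
    for j, value in zip(basic, x_basic):
        x[j] = value
    # Remaining part of A^T lam = c: one equation per retained (essential) constraint.
    lam = solve([[A[p][j] for p in range(m)] for j in basic], [c[j] for j in basic])
    s = [c[i] - sum(A[p][i] * lam[p] for p in range(m)) for i in range(n)]
    is_optimal = all(v >= -tol for v in x) and all(v >= -tol for v in s)
    return x, lam, s, is_optimal

def reduce_model(A, c, x, s, small_tol=1e-6):
    """Drop dual constraints whose x_i looks negligible (hypothetical criterion)."""
    drop = [i for i in range(len(x)) if x[i] < small_tol * s[i]]
    keep = [i for i in range(len(x)) if i not in drop]
    return [[row[i] for i in keep] for row in A], [c[i] for i in keep], keep, drop

# Toy LP (assumed data): optimum is x = (3, 1, 0, 0).
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, 3.0, 0.0, 1.0]]
b = [4.0, 6.0]
c = [-1.0, -2.0, 0.0, 0.0]

near = hop_to_vertex(A, b, c, [2.99, 0.99, 0.02, 0.04])   # close to the optimum
far = hop_to_vertex(A, b, c, [0.5, 0.4, 3.1, 4.3])        # too far: wrong vertex

A_red, c_red, keep, drop = reduce_model(
    A, c, x=[3.0, 1.0, 1e-9, 2e-3], s=[1e-9, 1e-9, 0.5, 0.5])
```

In the `far` case the hop lands on a vertex whose slack values include negative entries, which mirrors the failure reported above. `reduce_model` removes only the one constraint that is already clearly negligible and leaves the undecided one in the problem.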
5.4 Results
Suppose 12 bits are used to design an � � � linear FIR filter for a 32-bit bus at 400 ps
bit time. This linear FIR filter design problem can be formulated as an LP problem with
768 inequality constraints and 408 variables. It is naturally in dual form. In this section, we
show the results obtained with linprog() and with our approach. The linprog() iteration
display is shown in table 5.1. The linprog() routine doesn't converge on this problem!
Table 5.2 shows the iteration display of the method presented in this chapter. This method
does converge! There are 7 model reductions along the way. They reduce the problem size
by 17, 26, 39, 78, 93, 75 and 13 constraints respectively, for a total of 341 constraints.

             Primal          Dual             Duality      Total
Residuals:   Infeasibility   Infeasibility    Gap          Relative
             Ax - b          A^T λ + s - c    x^T s        Error
Iter 0:      2.74e+03        9.95e+01         3.96e+05     1.00e+03
Iter 1:      2.13e-09        5.82e+01         1.46e+05     4.11e+01
Iter 2:      5.15e-08        4.83e-02         4.39e+02     1.00e-00
Iter 3:      1.11e-07        5.10e-04         3.64e+00     9.99e-01
Iter 4:      6.41e-04        1.87e-04         1.02e+00     4.77e-01
Iter 5:      1.27e-02        3.72e-05         3.40e-01     2.26e-01
Exiting: One or more of the residuals, duality gap, or total relative error
has grown 100000 times greater than its minimum value so far: the dual
appears to be infeasible (and the primal unbounded). (The primal residual <
TolFun=1.00e-08.)
Table 5.1: linprog() iteration display

             Primal          Dual             Duality      Total        # constraints
Residuals:   Infeasibility   Infeasibility    Gap          Relative     reduced
             Ax - b          A^T λ + s - c    x^T s        Error
Iter 0       0.02            10.8843          194.059      3.5854
Iter 1       0.0004          0.21769          7.7587       1.0515
Iter 2       8e-06           0.050589         1.871        0.9993
Iter 3       1.8542e-07      0.014951         0.5932       0.59672
Iter 4       8.1055e-08      0.0048963        0.2152       0.21635
Iter 5       4.5586e-08      0.00016507       0.089084     0.089123
Iter 6       8.3459e-09      7.0761e-05       0.037025     0.037042
Iter 7       1.457e-09       1.9964e-05       0.011225     0.01123
Iter 8       4.3047e-10      7.1626e-06       0.0042428    0.0042445
Iter 9       1.4528e-10      3.0168e-06       0.0018633    0.001864
Iter 0       0.00028457      7.0108e-07       0.00055431   0.00083904   17
Iter 0       0.00054624      1.8242e-07       0.00020812   0.00075441   26
Iter 0       0.00015359      6.6702e-08       3.7658e-05   0.00019127   39
Iter 0       0.00026999      1.4576e-08       7.4835e-06   0.00027748   78
Iter 0       4.4736e-05      4.2838e-09       1.603e-06    4.634e-05    93
Iter 0       2.7846e-05      1.6497e-10       6.0702e-08   2.7906e-05   75
Iter 0       1.1211e-05      3.362e-12        1.2376e-09   1.1212e-05   13
Iter 1       2.2422e-07      6.7534e-14       2.2472e-11   2.2424e-07
Table 5.2: Iteration display of our approach: Mehrotra interior-point method with model
reduction.
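The reduction counts reported here can be tallied with a few lines of arithmetic. The comparison against 768 - 408 assumes that the number of non-essential constraints at a vertex is bounded by the number of dual constraints minus the number of dual variables; that reading of the bound is an assumption made for this sketch, and the variable names are hypothetical.

```python
# Constraints removed at each of the 7 model reductions, as reported in the text.
reductions = [17, 26, 39, 78, 93, 75, 13]

n_constraints = 768   # inequality constraints of the dual-form LP
n_variables = 408     # variables of the dual-form LP

total_reduced = sum(reductions)
max_nonessential = n_constraints - n_variables   # assumed upper bound on removable constraints
remaining = n_constraints - total_reduced        # constraints left in the final reduced problem

print(total_reduced, max_nonessential, remaining)
```

The total of 341 stays below the assumed bound of 360, so the method converged before every non-essential constraint had to be identified.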
Chapter 6
Conclusions and Future Work
This thesis explores the effectiveness of equalizing filters in cross-talk cancellation for high-
speed, off-chip buses. It demonstrates that linear programming provides effective methods
for designing cross-talk canceling equalizing filters that greatly increase the bandwidth of
high-speed digital buses. For 5 cm long, 32-bit wide PCB buses that are closely spaced (75
μm width and 75 μm separation), with a simple full-swing voltage signaling method, a system
with � � � equalizing filters can operate at 4.1GHz. Without equalizing filters, such buses
can only operate at 1.3GHz, or at 1.47GHz with pre-emphasis but no cross-talk cancellation.
In this thesis, a coupled distributed RLC interconnect model is first constructed
and validated. Based on the bus model, the first technique used to design equalizing
filters is the least squares method. The least squares method produces equalizing filters that
greatly improve signal integrity and raise the maximum bit rate at which signals can be received
correctly. Next, because the whole transmission network is linear, the equalizing filter de-
sign problem can be formulated into a linear programming problem which uses the� �
metric corresponding to the traditional eye height measurement of signal integrity. Equal-
izing filters designed with the linear programming approach have better performance than
the filters designed with the least squares method (see section 4.5). Another advantage of
the linear programming method over the least squares method is that it does not depend
on pseudo-random sequences and guarantees worst-case performance. The simulation
results presented in chapter 4 show that the equalization technique is a promising method for
cross-talk cancellation on high-speed buses.
Moreover, scaling trends of the VLSI technology favor this approach. Long buses
cost more and support lower data rates. The cost of the bus justifies added circuitry on
the chip. The lower data rate provides more time for the filtering operations. Furthermore,
improvements in chip fabrication are producing smaller and faster circuits for implementing
the filter while buses remain big and slow. This also contributes to the favorability of adding
more sophisticated equalizing filters.
6.1 Future work
This work has demonstrated by simulation the effectiveness of the equalization technique in
cross-talk cancellation for high-speed PCB buses. A natural and necessary next step is
circuit design, PCB fabrication and testing. Beyond this, the following ideas should be
explored to further improve the performance of the equalizing filter design:
• Figure 4.6 shows the impulse response of the bus considered in this thesis. The source
end of the bus is terminated with resistors whose resistance equals the approximated
input impedance of the bus: $Z_0 = \sqrt{L/C}$.
However, the bus traces are not perfect LC lines. Their resistance, mutual inductance
and coupling capacitance make the input impedance calculation inaccurate. This is
shown in figure 4.6 by the reflections in the impulse response. Because of the limit on
filter length, these reflections degrade the performance of the equalizing filters.
Optimization routines such as Matlab's fminunc can be used to minimize the amplitude
of the reflections by tuning the input resistor.
• With current ADC/DAC technology, more taps per bit can be implemented at bit rates
over 1GHz. It was observed that by increasing the number of taps per bit, the
performance of equalizing filters for longer buses can be improved substantially. This
issue should be investigated more thoroughly and systematically in the future.
• Skin effect
Because the primary goal of this thesis is cross-talk cancellation, high-frequency
attenuation caused by the skin effect was not taken into account in the bus model.
However, the skin effect is one of the major factors that limit off-chip signaling above
1GHz [2]. It has been shown that pre-emphasis for serial links greatly improves their
bandwidth [2]. In the future, the skin effect should be incorporated into the bus model.
The method presented in this thesis can be used directly with bus models that
include the skin effect.
• Differential signaling
Differential signaling is commonly used to achieve high-speed signaling and improve
signal integrity. The downside of the differential signaling method is that it doubles
the number of wires. Is the differential signaling method inherently the best one
given $2N$ wires for transmitting $N$ signals? Moreover, can we improve performance
by adding $k$ wires to an $N$-bit bus to transmit $N$ signals? The linear programming
method for equalizing filter design can be adapted to answer these questions.
• Multi-level signaling
Multi-level signaling has attracted a great deal of recent interest. It uses multiple voltage
levels and hence has lower fundamental frequencies than simple binary signaling
at the same data rate. Many off-chip communication links employ multi-level signaling
to achieve higher performance with limited bandwidth. The methods presented in this
thesis can be easily adapted to design equalizing filters for multi-level signaling.
• Non-linear filter design
Because filters are relatively narrow, look-up tables may be a simple and practical
implementation. By exploiting the symmetries of the bus, these tables can be made quite
compact. There is no reason that the table entries must correspond to linear
combinations of the inputs. Thus, non-linear filters may be easy to implement and they may be
able to handle the apparently rare worst-case input patterns more effectively. I have started
to explore this idea. For example, each entry of a look-up table can correspond to the
filter output for a particular type of input. Input types can be divided according to the
number of transitions. Given an input type on a wire at the current bit, the input types
on its adjacent wires and at preceding bit times are constrained. The linear equalizing
filter design suggests that a long history and a few future bits need to be considered in
order to get a good filter. The chain effect of the input type constraints makes it
impractical to find the worst-case input for a non-linear filter design. A more sophisticated
optimization method may solve this problem and is a topic for future work.
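The look-up table idea in the last item can be illustrated with a small sketch. It is deliberately simplified and rests on assumptions that are not from the thesis: a single wire, bits mapped to ±1 symbols, a three-tap window ordered from oldest to newest bit, zero bits before the start of the sequence, and made-up tap values. All names are hypothetical.

```python
from itertools import product

def build_lut(taps):
    """Tabulate the output of a linear FIR filter for every window of len(taps) bits."""
    lut = {}
    for bits in product((0, 1), repeat=len(taps)):
        symbols = [2 * bit - 1 for bit in bits]          # map bit 0/1 to symbol -1/+1
        lut[bits] = sum(t * v for t, v in zip(taps, symbols))
    return lut

def lut_filter(lut, window, bits):
    """Filter a bit sequence by table look-up; bits before the start are taken as 0."""
    padded = [0] * (window - 1) + list(bits)
    return [lut[tuple(padded[i:i + window])] for i in range(len(bits))]

taps = [-0.25, 0.5, 1.0]            # assumed tap values, oldest bit first
linear_lut = build_lut(taps)
linear_out = lut_filter(linear_lut, 3, [1, 0, 1, 1])

# A non-linear filter is the same table with entries that no longer follow the taps.
nonlinear_lut = dict(linear_lut)
nonlinear_lut[(1, 0, 1)] = 0.4      # hypothetical override for one input pattern
nonlinear_out = lut_filter(nonlinear_lut, 3, [1, 0, 1, 1])
```

The table has one entry per input pattern, so the hardware cost is the same whether the entries come from a linear filter or are chosen individually. What changes is the design problem of choosing the entries.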
Bibliography
[1] P.M. Crespo and M.L. Honig. Pole-zero decision feedback equalization with a rapidly converging adaptive IIR algorithm. IEEE Journal of Selected Areas in Communications, 9:817–829, 1991.
[2] W.J. Dally and J.W. Poulton. Transmitter equalization for 4-Gbps signaling. IEEE Micro, 1:48–56, 1997.
[3] W.J. Dally and J.W. Poulton. Digital Systems Engineering. Cambridge University Press, 1998.
[4] A. Fiedler, R. Mactaggart, J. Welch, and S. Krishnan. A 1.0625Gbps transceiver with 2x-oversampling and transmit signal pre-emphasis. In Proc. of ISSCC97, pages 238–239, 1997.
[5] N.J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, 1996.
[6] M.L. Honig, P. Crespo, and K. Steiglitz. Suppression of near- and far-end crosstalk by linear pre- and post-filtering. IEEE Journal of Selected Areas in Communications, 10:614–629, 1992.
[7] M.L. Honig, K. Steiglitz, and B. Gopinath. Multichannel signal processing for data communications in the presence of crosstalk. IEEE Transactions on Communications, 38:551–558, 1990.
[8] M. Horowitz, C. Ken, and S. Sidiropoulos. High speed electrical signalling: Overview and limitations. IEEE Micro, 18:12–24, 1998.
[9] L. Jackson. Digital Filters and Signal Processing. Kluwer Academic Publishers, 1996.
[10] H. Johnson and M. Graham. High-Speed Digital Design: A Handbook of Black Magic. Prentice Hall, 1993.
[11] L. Lu and V. Ungvichian. Crosstalk versus interline space in ultra high speed digital PCBs. In IEEE International Symposium on EMC, pages 629–634, 1998.
[12] I. J. Lustig, R.E. Marsten, and D.F. Shanno. On implementing Mehrotra's predictor-corrector interior point method for linear programming. SIAM Journal on Optimization, 2:435–449, 1992.
[13] S. Mehrotra. On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2:575–601, 1992.
[14] J. Nocedal and S. Wright. Numerical Optimization, pages 395–417. Springer Series in Operations Research, Springer Press, 1999.
[15] V. Rico-Ramirez and A.W. Westerberg. Interior point methods on the solution of conditional models. Technical report, Carnegie Mellon University, 1997.
[16] J. Salz. Digital transmission over cross-coupled linear channels. Bell System Technology Journal, 64:1147–1159, 1985.
[17] V. Stojanovic, G. Ginis, and M.A. Horowitz. Transmit pre-emphasis for high-speed time-division-multiplexed serial-link transceiver. IEEE Transactions on Communications, 38:551–558, 2001.
[18] S.K. Tewksbury. Microelectronic Systems Interconnections. IEEE Press, 1995.
[19] C.K. Yang, V. Stojanovic, S. Modjtahedi, M.A. Horowitz, and W.F. Ellersick. A serial-link transceiver based on 8-GSamples/s A/D and D/A converters in 0.25-μm CMOS. In IEEE International Conference on Communications, pages 1934–1939, 2002.
[20] T.S. Yeo, C.S. Ng, M.S. Leong, and P.S. Kooi. Interline coupling of ultra-high-speed pulse propagation on PCB. IEEE Transactions on Electromagnetic Compatibility, 35(3):401–404, August 1993.
[21] Y. Zhang. Solving large-scale linear programs by interior-point methods under the MATLAB environment. Technical Report TR96-01, University of Maryland, July 1995.