Experimental Demonstration of End-to-End PCI Express
Communication over a Transparent All-Optical Photonic
Interconnection Network Interface
Howard Wang1, Ajay S. Garg1, Odile Liboiron-Ladouceur2, and Keren Bergman1
1: Department of Electrical Engineering, Columbia University, New York, New York 10027
2: Department of Electrical and Computer Engineering, McGill University, Montréal, Québec H3A 2A7
Abstract: We successfully establish a PCI Express link, demonstrating device recognition between
an endpoint device and host computer across an all-optical WDM network interface. Direct
Memory Access transfers are demonstrated across the upstream and downstream paths.
©2008 Optical Society of America
OCIS codes: (200.4650) Optical interconnects; (060.4510) Optical communications; (200.0200) Optics in computing
1. Introduction
Given the accelerated growth in the performance of microprocessors and the recent emergence of multicore
architectures in chip multiprocessors (CMPs), the limitations of advanced computing systems are becoming
increasingly dependent on the capabilities of the communications infrastructure [1]. As the critical performance
bottleneck shifts from the processors to the interconnect, it will become particularly challenging to meet the
bandwidth demands of future high-performance computing clusters. A high-bandwidth, low-latency, and low-power
communication infrastructure is therefore a requisite for the development of next generation high-performance
advanced computing systems and data centers.
Optical interconnects have been proposed as a potentially attractive solution to alleviate the bandwidth and
power challenges plaguing current electronic interconnect technologies [2]. Optical interconnection networks
provide substantial scalability in I/O bandwidth by uniquely exploiting the parallelism and capacity of wavelength-
division multiplexing (WDM), while operating with fundamentally low power dissipation and latency. Parallel
optical links for board-level inter-chip optical communication and inter-board communication through backplanes
have been demonstrated to offer high data throughput with relatively low power consumption [3].
To be successfully integrated within current state-of-the-art computing systems, the advantages offered by
photonic interconnection solutions must be leveraged in a way that is complementary to existing electronic
standards. The development and industry-wide acceptance of various standard communication protocols, such as
PCI Express, HyperTransport and RapidIO, has been an enabler for interconnectivity among diverse communicating
modules. PCI Express (PCIe), now in its third generation, has emerged as the I/O protocol of choice for high-speed
serial buses supporting chip-to-chip and board-to-board applications in modern computing systems [4]. As such,
PCIe is well positioned to become the predominant communications protocol across compute clusters.
Previously, we presented a photonic interface gateway capable of transparent all-optical formatting of serial
data streams into high-bandwidth wavelength parallel photonic packets [5,6]. In this work, we experimentally
demonstrate the end-to-end generation of a PCIe link originating from a remote endpoint across the aforementioned
photonic interface gateway to a host computer in a transparent manner. The remote endpoint is implemented on a
field programmable gate array (FPGA) based device. The link is transparently tunneled across the ingress and egress
implementations of the photonic interface gateway. Direct Memory Access (DMA) transfers are also demonstrated
across static optical links traversing both the upstream and downstream paths. The transfers are initiated by an
application running on an x86-based PC, with PCIe traffic originating from the endpoint and logically routed
through the established PCIe link. Successful transmission of PCIe data at 2.5 Gb/s and maintenance of the logical
PCIe link is experimentally confirmed across eight wavelengths.
2. Transparent WDM Interface
Grooming at the optical interface ingress is achieved through the mapping of a serial PCIe data stream onto parallel
WDM channels within a time-slotted packet structure. The implemented network interface constructs packets
intended for optical packet switched interconnection networks supporting wavelength-striped packets such as the
Data Vortex network architecture [7]. First, multiwavelength continuous-wave light is passively combined onto a
single fiber and is modulated by a high-speed electronic data payload stream originating from the computing node
using one high-speed optical modulator, effectively mapping the data stream onto multiple WDM channels. Each
a2109_1.pdf
OTuA4.pdf
© 2009 OSA/OFC/NFOEC 2009 OTuA4.pdf
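modulated wavelength is then demultiplexed and delayed with respect to its adjacent wavelengths via fiber delay
lines (FDLs) by an amount of time corresponding to the packet length of the attached photonic interconnect. All
wavelengths are subsequently multiplexed onto a single fiber, where a second broadband optical modulator precisely
gates the WDM time-shifted optical signals to create the wavelength parallel packets. The resulting WDM packets
consist of periodic intervals of payload data which are time-compressed by a factor proportional to the number of
WDM channels. The resultant dead-time between WDM packets can be interleaved with packets generated from
other lanes in a time-division multiplexed manner to maximize link utilization. At the egress node, the serial
electronic packet is optically reconstructed by filtering and delaying each wavelength of the time-compressed packet
in a manner complementary to that employed at the network ingress interface.
Figure 1 schematically illustrates the photonic ingress and egress interfaces as implemented in our experimental
system. To minimize the effect of chromatic dispersion, channels W7 to W0 are allocated from 1543.72 nm to
1549.33 nm, respectively, and spaced by 0.8 nm. All eight wavelengths are simultaneously modulated with a
LiNbO3 modulator driven directly by a PCIe data stream. A high-speed LiNbO3 modulator is also employed to gate
the WDM packets at the ingress interface. The resulting WDM packets are 16 ns in length, thus encoding 40 bits of
2.5 Gb/s data in each of the eight channels. Filtering is accomplished using 100 GHz spaced thin film filters.
Channels are delayed by 16 ns with respect to adjacent channels using FDLs. The recovered PCIe serial stream is
received by a 10.7 Gb/s broadband PIN-TIA receiver module with a limiting amplifier, alleviating the power
differences among various wavelengths arising from component imperfections.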
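The packet parameters quoted for the WDM interface can be cross-checked with a short calculation. The Python sketch below recomputes the bits per packet and the nominal channel grid, then models the ingress striping and egress reconstruction at slot granularity. The mapping of delays to channel indices (channel k delayed by k slots at the ingress and by the complementary amount at the egress), the resulting 128 ns frame period, and all function names are illustrative assumptions and are not taken from the experiment.

```python
# Slot-level sketch of the wavelength-striped packet format.
# Assumption: channel k is delayed by k slots at the ingress and by
# (N_CHANNELS - 1 - k) slots at the egress; the text does not give this mapping.

LINE_RATE_GBPS = 2.5      # PCIe serial rate (from the text)
SLOT_NS = 16              # packet length and per-channel FDL delay (from the text)
N_CHANNELS = 8            # W0..W7 (from the text)

# Gb/s multiplied by ns gives bits.
BITS_PER_SLOT = round(LINE_RATE_GBPS * SLOT_NS)
FRAME_BITS = BITS_PER_SLOT * N_CHANNELS
FRAME_NS = SLOT_NS * N_CHANNELS


def channel_grid(first_nm=1543.72, spacing_nm=0.8, n=N_CHANNELS):
    """Nominal wavelengths, with W7 at the short end and W0 at the long end."""
    return {f"W{n - 1 - i}": round(first_nm + i * spacing_nm, 2) for i in range(n)}


def ingress(serial_bits):
    """Stripe one frame of serial bits into a wavelength-parallel packet."""
    if len(serial_bits) != FRAME_BITS:
        raise ValueError(f"expected {FRAME_BITS} bits, got {len(serial_bits)}")
    segments = [serial_bits[i * BITS_PER_SLOT:(i + 1) * BITS_PER_SLOT]
                for i in range(N_CHANNELS)]
    # Every wavelength carries the full stream, and channel k lags by k slots.
    # When the gate opens during the last slot of the frame, channel k is
    # therefore showing segment N_CHANNELS - 1 - k.
    return {k: segments[N_CHANNELS - 1 - k] for k in range(N_CHANNELS)}


def egress(packet):
    """Rebuild the serial stream by applying the complementary delays."""
    serial = []
    for slot in range(N_CHANNELS):
        k = N_CHANNELS - 1 - slot   # channel whose egress delay equals `slot`
        serial.extend(packet[k])
    return serial


if __name__ == "__main__":
    frame = [(i * i + i // 3) % 2 for i in range(FRAME_BITS)]
    packet = ingress(frame)
    print("bits per channel per packet:", BITS_PER_SLOT)
    print("time compression factor:", FRAME_NS // SLOT_NS)
    print("round trip ok:", egress(packet) == frame)
    print(channel_grid())
```

Under these assumptions, each 16 ns packet carries 128 ns of serial data, so roughly 112 ns of every frame is dead time that other lanes could occupy if guard times are ignored. The nominal 0.8 nm grid places W0 at 1549.32 nm, within 0.01 nm of the quoted 1549.33 nm, which is consistent with 0.8 nm being a rounded figure for 100 GHz spacing.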
Fig. 1: Experimental implementation of the photonic ingress and egress interfaces. Data modulator and packet gate are implemented
with LiNbO3 modulators. Channels are filtered using 100 GHz spaced TFFs. Delays achieved using FDLs.
Fig. 2: Schematic representation of host and endpoint system organization as experimentally implemented. The optical interface is inserted inline
with the upstream link.
3. PCI Express
PCI Express has become the most prevalent high-speed I/O protocol in modern computing systems, from average
consumer desktops to high-end business servers. PCIe provides a standard interconnection protocol between various
third-party peripheral devices and the computing system. The widespread use of PCIe can be attributed to its
backwards compatibility with the Peripheral Component Interconnect (PCI) standard, which allows developers to
capitalize on years of previous hardware and software experience when creating a new design. For these reasons,
other interconnection network standards such as InfiniBand and 10 Gigabit Ethernet are often implemented as PCIe
devices.
The typical PCIe network consists of four main components connected in a tree topology. The root complex
allows the CPU to control the PCIe network and is connected via a switch to endpoints. The endpoint (EP) is any
peripheral device connected to the PCIe network. The switch converts a single link into multiple links and the link
itself is a point-to-point serial connection between the root complex, switches, and endpoints. Any link is composed
of 1, 2, 4, 8, or 16 lanes, where each lane denotes two differential transmitter-receiver (Tx-Rx) pairs, one to send
data upstream towards the root complex, and one to send data downstream away from the root complex. Both the
upstream and downstream sides of each lane transmit 8b/10b symbols at 2.5 Gb/s, for a maximum throughput per
lane of 4 Gb/s (2 Gb/s upstream, 2 Gb/s downstream). Generation 2 increases the transmission rate to 5.0 Gb/s,
doubling the throughput per lane. Generation 3 will increase the transmission rate to 8.0 Gb/s and remove 8b/10b,
doubling the Generation 2 throughput per lane.
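The per-lane figures follow from the line rate and the encoding overhead, which a few lines of Python make explicit. This is only a sketch: the function and table names are illustrative, and the 128b/130b entry for Generation 3 comes from the released PCIe 3.0 specification rather than from this paper, which states only that 8b/10b is removed.

```python
from fractions import Fraction

# generation -> (line rate per direction in Gb/s, payload bits / transmitted bits)
PCIE_LANE = {
    1: (Fraction(5, 2), Fraction(8, 10)),    # 2.5 Gb/s, 8b/10b (from the text)
    2: (Fraction(5), Fraction(8, 10)),       # 5.0 Gb/s, 8b/10b (from the text)
    3: (Fraction(8), Fraction(128, 130)),    # 8.0 Gb/s; 128b/130b is not stated in the text
}


def lane_throughput_gbps(generation, lanes=1):
    """Return (per-direction, both-directions) payload throughput in Gb/s."""
    line_rate, efficiency = PCIE_LANE[generation]
    one_way = line_rate * efficiency * lanes
    return one_way, 2 * one_way


if __name__ == "__main__":
    for gen in sorted(PCIE_LANE):
        one_way, total = lane_throughput_gbps(gen)
        print(f"Gen {gen} x1: {float(one_way):.2f} Gb/s per direction, "
              f"{float(total):.2f} Gb/s total")
```

For Generation 1 this reproduces the 2 Gb/s per direction and 4 Gb/s per lane quoted in the text. Under the 128b/130b assumption, Generation 3 comes out at roughly 7.9 Gb/s per direction, which is close to, but not exactly, double the Generation 2 figure.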
Our PCIe base electronic system setup in Figure 2 starts with an x86 dual-core desktop as our host computer
system. This desktop provides one PCIe x16 graphics slot on the northbridge and one PCIe x1 slot on the
southbridge. We then use a Samtec PCIe x1 to SMA adapter to access the gigabit Tx and Rx signals. Our FPGA card
is an XpressGXII from PLD Applications, Inc. (PLDA) which uses an Altera Stratix II GX-class FPGA, containing
sixteen transceivers. Eight transceivers are connected to the PCIe x8 card edge, which is not in use in our system
demonstration, while another eight are connected to a PLDA daughter card as one PCIe x4. We then use a Samtec
PCIe x4 to SMA adapter to access the FPGA’s Tx and Rx pairs through the daughter card.
4. Experimental Demonstration
In order to validate the viability of PCIe over optics, we demonstrate the successful end-to-end generation of a PCIe
link across the photonic interface gateway. The ingress and egress interfaces are inserted inline with the upstream
link connecting the remote FPGA-based endpoint device to the host computer (Fig. 3). One terminal of the upstream
link’s differential pair is terminated appropriately while the other terminal is connected single-endedly to the high-
speed modulator illustrated in the ingress interface of Figure 1. The broadband receiver at the output of the egress
interface digitizes the recovered PCIe stream and generates a differential output, which is translated to the
appropriate voltage levels for PCIe and connected to the gigabit receiver of the host computer. The downstream link
is maintained in the electronic domain to achieve a complete dual-simplex PCIe link implementation. The
initialization, training, and configuration of the link are performed successfully by the host computer and the
endpoint device is correctly recognized across the photonic interface, confirming the transparency of the interface
gateway to the PCIe link.
Furthermore, we demonstrate successful direct memory access (DMA) transfers at 2.5 Gb/s, the PCIe x1 line
rate, across static optical links between the endpoint and host computer. The DMA transfers are initiated by a
diagnostic application provided by PLDA running on the Microsoft Windows Operating System. Link bandwidth
and functionality are confirmed across eight wavelengths, W0 to W7 (Fig. 4).
Fig. 3: End-to-end overall system organization depicting host computer and FPGA-based endpoint device.
Fig. 4: Computer application demonstrating successful DMA transfers with endpoint device.
5. Conclusion
In this work, we experimentally demonstrate the end-to-end generation of a PCIe link originating from a remote
endpoint across a transparent photonic interface gateway to a host computer, validating the viability of PCIe over
optics. Furthermore, we demonstrate successful direct memory access (DMA) transfers at 2.5 Gb/s, the PCIe x1 line
rate, across static optical links between the endpoint and host computer.
The authors gratefully acknowledge support for this work from the Intel Corporation under Grant SINTEL CU08-7952.
6. References
[1] S. Vangal et al., “An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS,” ISSCC Dig. Tech. Papers, pp. 98-99, Feb. 2007.
[2] D. A. B. Miller, “Rationale and Challenges for Optical Interconnects to Electronic Chips,” Proc. IEEE 88, 728-749 (2000).
[3] C. L. Schow, et al., “160-Gb/s, 16-Channel Full-Duplex, Single-Chip CMOS Optical Transceiver,” OFC 2007, OThG4.
[4] Standard Development Group, http://www.pcisig.com
[5] O. Liboiron-Ladouceur, H. Wang, K. Bergman, “An All-Optical PCI-Express Network Interface for Optical Packet Switched Networks,”
OFC 2007, JWA59.
[6] O. Liboiron-Ladouceur, H. Wang, K. Bergman, “Low Power Optical WDM Interface for Off-Chip Interconnects,” LEOS 2007, WEE7.
[7] A. Shacham, B.A. Small, O. Liboiron-Ladouceur, and K. Bergman, “A Fully Implemented 12x12 Data Vortex Optical Interconnection
Network,” IEEE J. Lightwave Technol., 23, 10, 3066-3075, (Oct. 2005).