
Experimental Demonstration of End-to-End PCI Express Communication over a Transparent All-Optical Photonic Interconnection Network Interface

Howard Wang 1, Ajay S. Garg 1, Odile Liboiron-Ladouceur 2, and Keren Bergman 1

1: Department of Electrical Engineering, Columbia University, New York, New York 10027

2: Department of Electrical and Computer Engineering, McGill University, Montréal, Québec H3A 2A7

[email protected]

Abstract: We successfully establish a PCI Express link, demonstrating device recognition between an endpoint device and host computer across an all-optical WDM network interface. Direct Memory Access transfers are demonstrated across the upstream and downstream paths.

©2008 Optical Society of America
OCIS codes: (200.4650) Optical interconnects; (060.4510) Optical communications; (200.0200) Optics in computing

1. Introduction

Given the accelerated growth in the performance of microprocessors and the recent emergence of multicore architectures in chip multiprocessors (CMPs), the performance of advanced computing systems is becoming increasingly dependent on the capabilities of the communications infrastructure [1]. As the critical performance bottleneck shifts from the processors to the interconnect, it will become particularly challenging to meet the bandwidth demands of future high-performance computing clusters. A high-bandwidth, low-latency, and low-power communication infrastructure is therefore a requisite for the development of next-generation high-performance advanced computing systems and data centers.

Optical interconnects have been proposed as a potentially attractive solution to alleviate the bandwidth and power challenges plaguing current electronic interconnect technologies [2]. Optical interconnection networks provide substantial scalability in I/O bandwidth by uniquely exploiting the parallelism and capacity of wavelength-division multiplexing (WDM), while operating with fundamentally low power dissipation and latency. Parallel optical links for board-level inter-chip optical communication and inter-board communication through backplanes have been demonstrated to offer high data throughput with relatively low power consumption [3].

To be successfully integrated within current state-of-the-art computing systems, the advantages offered by photonic interconnection solutions must be leveraged in a way that is complementary to existing electronic standards. The development and industry-wide acceptance of various standard communication protocols, such as PCI Express, HyperTransport, and RapidIO, has been an enabler for interconnectivity among diverse communicating modules. PCI Express (PCIe), now in its third generation, has emerged as the I/O protocol of choice for high-speed serial buses supporting chip-to-chip and board-to-board applications in modern computing systems [4]. As such, PCIe is well positioned to become the predominant communications protocol across compute clusters.

Previously, we presented a photonic interface gateway capable of transparent all-optical formatting of serial data streams into high-bandwidth wavelength-parallel photonic packets [5,6]. In this work, we experimentally demonstrate the end-to-end generation of a PCIe link originating from a remote endpoint across the aforementioned photonic interface gateway to a host computer in a transparent manner. The remote endpoint is implemented on a field-programmable gate array (FPGA) based device. The link is transparently tunneled across the ingress and egress implementations of the photonic interface gateway. Direct Memory Access (DMA) transfers are also demonstrated across static optical links traversing both the upstream and downstream paths. The transfers are initiated by an application running on an x86-based PC, with PCIe traffic originating from the endpoint and logically routed through the established PCIe link. Successful transmission of PCIe data at 2.5 Gb/s and maintenance of the logical PCIe link are experimentally confirmed across eight wavelengths.

2. Transparent WDM Interface

Grooming at the optical interface ingress is achieved through the mapping of a serial PCIe data stream onto parallel WDM channels within a time-slotted packet structure. The implemented network interface constructs packets intended for optical packet switched interconnection networks supporting wavelength-striped packets, such as the Data Vortex network architecture [7]. First, multiwavelength continuous-wave light is passively combined onto a single fiber and is modulated by a high-speed electronic data payload stream originating from the computing node using one high-speed optical modulator, effectively mapping the data stream onto multiple WDM channels. Each modulated wavelength is then demultiplexed and delayed with respect to its adjacent wavelengths via fiber delay lines (FDLs) by an amount of time corresponding to the packet length of the attached photonic interconnect. All wavelengths are subsequently multiplexed onto a single fiber, where a second broadband optical modulator precisely gates the WDM time-shifted optical signals to create the wavelength-parallel packets. The resulting WDM packets consist of periodic intervals of payload data which are time-compressed by a factor proportional to the number of WDM channels. The resultant dead-time between WDM packets can be interleaved with packets generated from other lanes in a time-division multiplexed manner to maximize link utilization. At the egress node, the serial electronic packet is optically reconstructed by filtering and delaying each wavelength of the time-compressed packet in a manner complementary to that employed at the network ingress interface.

Figure 1 schematically illustrates the photonic ingress and egress interfaces as implemented in our experimental system. To minimize the effect of chromatic dispersion, channels W7 to W0 are allocated from 1543.72 nm to 1549.33 nm, respectively, and spaced by 0.8 nm. All eight wavelengths are simultaneously modulated with a LiNbO3 modulator driven directly by a PCIe data stream. A high-speed LiNbO3 modulator is also employed to gate the WDM packets at the ingress interface. The resulting WDM packets are 16 ns in length, thus encoding 40 bits of 2.5 Gb/s data in each of the eight channels. Filtering is accomplished using 100 GHz spaced thin film filters (TFFs). Channels are delayed by 16 ns with respect to adjacent channels using FDLs. The recovered PCIe serial stream is received by a 10.7 Gb/s broadband PIN-TIA receiver module with a limiting amplifier, alleviating the power differences among various wavelengths arising from component imperfections.

Fig. 1: Experimental implementation of the photonic ingress and egress interfaces. Data modulator and packet gate are implemented with LiNbO3 modulators. Channels are filtered using 100 GHz spaced TFFs. Delays are achieved using FDLs.
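To make the grooming concrete, here is a minimal Python sketch (our illustration, not the authors' implementation) of the mapping just described: each wavelength carries a copy of the serial stream, the per-channel FDL delay of 16 ns shifts which portion of the stream each channel presents to the packet gate, and the gate then passes a different 40-bit slice on each of the eight channels. The 128 ns serial segment is thereby time-compressed into one 16 ns wavelength-parallel packet, a compression factor equal to the channel count.

BIT_RATE = 2.5e9   # serial PCIe line rate (b/s)
N_CHANNELS = 8     # WDM channels W0..W7
GATE_NS = 16       # packet gate width (ns)
BITS_PER_SLOT = int(BIT_RATE * GATE_NS * 1e-9)  # 40 bits per channel per packet

def groom(serial_bits):
    """Ingress: the k-th wavelength is delayed by k * GATE_NS, so the packet
    gate carves a different 40-bit slice out of each channel's copy."""
    assert len(serial_bits) == N_CHANNELS * BITS_PER_SLOT  # one 320-bit segment
    return [serial_bits[k * BITS_PER_SLOT:(k + 1) * BITS_PER_SLOT]
            for k in range(N_CHANNELS)]

def reconstruct(stripes):
    """Egress: apply the complementary per-wavelength delays and re-serialize."""
    return [bit for stripe in stripes for bit in stripe]

stream = [i % 2 for i in range(N_CHANNELS * BITS_PER_SLOT)]  # stand-in payload
assert reconstruct(groom(stream)) == stream  # transparent round trip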

3. PCI Express

PCI Express has become the most prevalent high-speed I/O protocol in modern computing systems, from average consumer desktops to high-end business servers. PCIe provides a standard interconnection protocol between various third-party peripheral devices and the computing system. The widespread use of PCIe can be attributed to its backwards compatibility with the Peripheral Component Interconnect (PCI) standard, which allows developers to capitalize on years of previous hardware and software experience when creating a new design. For these reasons, other interconnection network standards such as InfiniBand and 10 Gigabit Ethernet are often implemented as PCIe devices.

The typical PCIe network consists of four main components connected in a tree topology. The root complex allows the CPU to control the PCIe network and is connected via a switch to endpoints. The endpoint (EP) is any peripheral device connected to the PCIe network. The switch converts a single link into multiple links, and the link itself is a point-to-point serial connection between the root complex, switches, and endpoints. Any link is composed of 1, 2, 4, 8, or 16 lanes, where each lane denotes two differential transmitter-receiver (Tx-Rx) pairs, one to send data upstream towards the root complex and one to send data downstream away from the root complex. Both the upstream and downstream sides of each lane transmit 8b/10b symbols at 2.5 Gb/s, for a maximum throughput per lane of 4 Gb/s (2 Gb/s upstream, 2 Gb/s downstream). Generation 2 increases the transmission rate to 5.0 Gb/s, doubling the throughput per lane. Generation 3 will increase the transmission rate to 8.0 Gb/s and remove 8b/10b encoding, doubling the Generation 2 throughput per lane.
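The per-lane figures quoted above follow directly from the line rate and the 8b/10b overhead; the short Python sketch below (our own arithmetic check, not from the paper) makes this explicit. Generation 3 is treated as overhead-free here, matching the text's approximation that removing 8b/10b doubles the Generation 2 throughput.

def lane_rate_gbps(line_rate_gbps, has_8b10b=True):
    """Effective unidirectional data rate of one lane, net of 8b/10b overhead."""
    return line_rate_gbps * (8 / 10 if has_8b10b else 1.0)

def link_rate_gbps(lanes, line_rate_gbps, has_8b10b=True):
    """Aggregate bidirectional throughput of an xN link (one Tx-Rx pair per direction)."""
    return 2 * lanes * lane_rate_gbps(line_rate_gbps, has_8b10b)

print(link_rate_gbps(1, 2.5))          # Gen 1 x1: 4.0 Gb/s (2 up + 2 down)
print(link_rate_gbps(1, 5.0))          # Gen 2 x1: 8.0 Gb/s
print(link_rate_gbps(1, 8.0, False))   # Gen 3 x1: 16.0 Gb/s, double Gen 2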


Our PCIe base electronic system setup in Figure 2 starts with an x86 dual-core desktop as our host computer system. This desktop provides one PCIe x16 graphics slot on the northbridge and one PCIe x1 slot on the southbridge. We then use a Samtec PCIe x1 to SMA adapter to access the gigabit Tx and Rx signal. Our FPGA card is an XpressGXII from PLD Applications, Inc. (PLDA), which uses an Altera Stratix II GX-class FPGA containing sixteen transceivers. Eight transceivers are connected to the PCIe x8 card edge, which is not in use in our system demonstration, while another eight are connected to a PLDA daughter card as one PCIe x4. We then use a Samtec PCIe x4 to SMA adapter to access the FPGA's Tx and Rx pairs through the daughter card.

Fig. 2: Schematic representation of host and endpoint system organization as experimentally implemented. The optical interface is inserted inline with the upstream link.

4. Experimental Demonstration

In order to validate the viability of PCIe over optics, we demonstrate the successful end-to-end generation of a PCIe link across the photonic interface gateway. The ingress and egress interfaces are inserted inline with the upstream link connecting the remote FPGA-based endpoint device to the host computer (Fig. 3). One terminal of the upstream link's differential pair is terminated appropriately while the other terminal is connected single-endedly to the high-speed modulator illustrated in the ingress interface of Figure 1. The broadband receiver at the output of the egress interface digitizes the recovered PCIe stream and generates a differential output, which is translated to the appropriate voltage levels for PCIe and connected to the gigabit receiver of the host computer. The downstream link is maintained in the electronic domain to achieve a complete dual-simplex PCIe link implementation. The initialization, training, and configuration of the link are performed successfully by the host computer, and the endpoint device is correctly recognized across the photonic interface, confirming the transparency of the interface gateway to the PCIe link.

Fig. 3: End-to-end overall system organization depicting host computer and FPGA-based endpoint device.

Furthermore, we demonstrate successful direct memory access (DMA) transfers at 2.5 Gb/s, the PCIe x1 line rate, across static optical links between the endpoint and host computer. The DMA transfers are initiated by a diagnostic application provided by PLDA running on the Microsoft Windows operating system. Link bandwidth and functionality are confirmed across eight wavelengths, W0 to W7 (Fig. 4).

Fig. 4: Computer application demonstrating successful DMA transfers with endpoint device.
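The 2.5 Gb/s figure is the raw line rate; after 8b/10b, the payload ceiling of the demonstrated Gen 1 x1 link is 2.0 Gb/s (250 MB/s) per direction, before transaction-layer overhead. As a hypothetical illustration of the recognition check (the experiment used PLDA's diagnostic tool under Windows; here we assume a Linux host and a made-up bus address), device enumeration can be spot-checked from the host like so:

import subprocess

def endpoint_recognized(bdf="01:00.0"):
    """True if the OS enumerated a device at the given (hypothetical) bus
    address, i.e., link training and configuration completed successfully."""
    out = subprocess.run(["lspci", "-s", bdf], capture_output=True, text=True)
    return bdf in out.stdout

# Payload ceiling at the demonstrated line rate, before TLP/DLLP overhead:
print(2.5e9 * 0.8 / 8 / 1e6, "MB/s per direction")  # -> 250.0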

5. Conclusion

In this work, we experimentally demonstrate the end-to-end generation of a PCIe link originating from a remote endpoint across a transparent photonic interface gateway to a host computer, validating the viability of PCIe over optics. Furthermore, we demonstrate successful direct memory access (DMA) transfers at 2.5 Gb/s, the PCIe x1 line rate, across static optical links between the endpoint and host computer.

The authors gratefully acknowledge support for this work from the Intel Corporation under Grant SINTEL CU08-7952.

6. References

[1] S. Vangal et al., "An 80-Tile 1.28TFLOPS Network-on-Chip in 65nm CMOS," ISSCC Dig. Tech. Papers, pp. 98-99, Feb. 2007.
[2] D. A. B. Miller, "Rationale and Challenges for Optical Interconnects to Electronic Chips," Proc. IEEE 88, 728-749 (2000).
[3] C. L. Schow et al., "160-Gb/s, 16-Channel Full-Duplex, Single-Chip CMOS Optical Transceiver," OFC 2007, OThG4.
[4] PCI-SIG Standard Development Group, http://www.pcisig.com.
[5] O. Liboiron-Ladouceur, H. Wang, K. Bergman, "An All-Optical PCI-Express Network Interface for Optical Packet Switched Networks," OFC 2007, JWA59.
[6] O. Liboiron-Ladouceur, H. Wang, K. Bergman, "Low Power Optical WDM Interface for Off-Chip Interconnects," LEOS 2007, WEE7.
[7] A. Shacham, B. A. Small, O. Liboiron-Ladouceur, and K. Bergman, "A Fully Implemented 12x12 Data Vortex Optical Interconnection Network," IEEE J. Lightwave Technol. 23, 3066-3075 (Oct. 2005).
