Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor...

14
Pseudo-Banyan Optical WDM Packet Switching System With Near-Optimal Packet Scheduling Maria C. Yuang, Po-Lung Tien, and Shih-Hsuan Lin Abstract—We present a novel pseudo-Banyan opti- cal packet switching system (SBOPSS) for optical wavelength division multiplexing (WDM) networks. The system includes a group of pseudo-Banyan space switches together with single-stage downsized fiber- delay-line-based optical buffers. SBOPSS is scalable, with the result that each pseudo-Banyan space switch performs packet switching only for a cluster of wave- lengths. The downsized optical buffers that are shared by output ports via the use of a small number of internal wavelengths result in efficient reduction in packet loss. Essentially, SBOPSS employs a packet scheduling algorithm, referred to as the parallel and incremental packet scheduler (PIPS). Given a set of newly arriving packets per time slot, PIPS deter- mines a maximum number of valid paths (packets) to be scheduled with the current buffers’ state taken into account. The algorithm aims at maximizing the system throughput subject to satisfying three con- straints, which are switch-contention free, buffer- contention free, and sequential delivery. Signifi- cantly, we prove that PIPS is incremental in the sense that the computed-path sets are monotonically non- decreasing over time. We then propose a hardware parallel system architecture for the implementation of PIPS. As is shown, PIPS achieves a near-optimal so- lution with an exceptionally low computational com- plexity, OP Ã log 2 NMW……, where P is the newly- arriving-packet set, N the number of input ports, and M and W the numbers of internal and external wave- lengths, respectively. From simulation results that pit the PIPS algorithm against four other algorithms, we show that PIPS outperforms these algorithms on both system throughput and computational complexity. Index Terms—Optical packet switching (OPS); wavelength division multiplexing (WDM); fiber delay line (FDL); Banyan network; packet scheduling. I. INTRODUCTION O ptical wavelength division multiplexing (WDM) has been shown to be successful in providing vir- tually unlimited bandwidth to support an ever- increasing amount of traffic for future optical net- works. Future optical networks, especially the metro and local networks, are expected to flexibly and cost- effectively satisfy a wide range of applications having time-varying and high bandwidth demands and strin- gent delay requirements. Such facts result in the need to exploit the optical packet-switching (OPS) [1,2] paradigm that takes advantage of efficient sharing of wavelength channels among multiple connections. Notice that there still exist technological limitations in OPS, such as optical random access memory (RAM) and optical signal processing. Thus, the OPS system we study in this paper employs fiber-delay-line (FDL)- based optical buffers, and almost-all-optical switches in which the control header is processed electronically while the packet payloads remain transported in the optical domain. A general OPS system consists of three basic com- ponents that are crucial to the performance and economy of the system. They are the space switch, the optical buffer, and the wavelength converter [3]. First, the space switches can be categorized into having non- blocking, rearrangeable, or blocking architectures [4]. Traditionally, the nonblocking and rearrangeable switches are mostly used for electronic circuit switch- ing systems. The nonblocking switches, such as the crossbar matrix network and Cantor network, can al- ways construct a new connection between the input and the output ports without altering other connec- tions already in the switches. However, the nonblock- ing switches are nonscalable and economically unfea- sible in the optical domain. On the other hand, the rearrangeable switches, such as a Benes network, route new input–output connections by rearranging Manuscript received November 28, 2008; revised March 30, 2009; accepted May 3, 2009; published July 28, 2009 Doc. ID 112926. M. C. Yuang (e-mail: [email protected]) and S.-H. Lin are with the Department of Computer Science and Information Engineering, National Chiao Tung University, Taiwan . P.-L. Tien is with the Department of Communication Engineering, National Chiao Tung University, Taiwan. Digital Object Identifier 10.1364/JOCN.1.0000B1 Yuang et al. VOL. 1, NO. 3/ AUGUST 2009/ J. OPT. COMMUN. NETW. B1 1943-0620/09/0300B1-14/$15.00 © 2009 Optical Society of America

Transcript of Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor...

Page 1: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B1

Pseudo-Banyan Optical WDM PacketSwitching System With

Near-Optimal Packet SchedulingMaria C. Yuang, Po-Lung Tien, and Shih-Hsuan Lin

wl

OtiwaetgtpwNiawbiwo

peotbTsicwatisrr

Abstract—We present a novel pseudo-Banyan opti-cal packet switching system (SBOPSS) for opticalwavelength division multiplexing (WDM) networks.The system includes a group of pseudo-Banyan spaceswitches together with single-stage downsized fiber-delay-line-based optical buffers. SBOPSS is scalable,with the result that each pseudo-Banyan space switchperforms packet switching only for a cluster of wave-lengths. The downsized optical buffers that areshared by output ports via the use of a small numberof internal wavelengths result in efficient reductionin packet loss. Essentially, SBOPSS employs a packetscheduling algorithm, referred to as the parallel andincremental packet scheduler (PIPS). Given a set ofnewly arriving packets per time slot, PIPS deter-mines a maximum number of valid paths (packets) tobe scheduled with the current buffers’ state takeninto account. The algorithm aims at maximizing thesystem throughput subject to satisfying three con-straints, which are switch-contention free, buffer-contention free, and sequential delivery. Signifi-cantly, we prove that PIPS is incremental in the sensethat the computed-path sets are monotonically non-decreasing over time. We then propose a hardwareparallel system architecture for the implementationof PIPS. As is shown, PIPS achieves a near-optimal so-lution with an exceptionally low computational com-plexity, O„�P�à log2„NMW……, where P is the newly-arriving-packet set, N the number of input ports, andM and W the numbers of internal and external wave-lengths, respectively. From simulation results that pitthe PIPS algorithm against four other algorithms, weshow that PIPS outperforms these algorithms on bothsystem throughput and computational complexity.

Manuscript received November 28, 2008; revised March 30, 2009;accepted May 3, 2009; published July 28, 2009 �Doc. ID 112926�.

M. C. Yuang (e-mail: [email protected]) and S.-H. Lin arewith the Department of Computer Science and InformationEngineering, National Chiao Tung University, Taiwan .

P.-L. Tien is with the Department of Communication Engineering,National Chiao Tung University, Taiwan.

Digital Object Identifier 10.1364/JOCN.1.0000B1

1943-0620/09/0300B1-14/$15.00 ©

Index Terms—Optical packet switching (OPS);avelength division multiplexing (WDM); fiber delay

ine (FDL); Banyan network; packet scheduling.

I. INTRODUCTION

ptical wavelength division multiplexing (WDM)has been shown to be successful in providing vir-

ually unlimited bandwidth to support an ever-ncreasing amount of traffic for future optical net-orks. Future optical networks, especially the metrond local networks, are expected to flexibly and cost-ffectively satisfy a wide range of applications havingime-varying and high bandwidth demands and strin-ent delay requirements. Such facts result in the needo exploit the optical packet-switching (OPS) [1,2]aradigm that takes advantage of efficient sharing ofavelength channels among multiple connections.otice that there still exist technological limitations

n OPS, such as optical random access memory (RAM)nd optical signal processing. Thus, the OPS systeme study in this paper employs fiber-delay-line (FDL)-ased optical buffers, and almost-all-optical switchesn which the control header is processed electronicallyhile the packet payloads remain transported in theptical domain.

A general OPS system consists of three basic com-onents that are crucial to the performance andconomy of the system. They are the space switch, theptical buffer, and the wavelength converter [3]. First,he space switches can be categorized into having non-locking, rearrangeable, or blocking architectures [4].raditionally, the nonblocking and rearrangeablewitches are mostly used for electronic circuit switch-ng systems. The nonblocking switches, such as therossbar matrix network and Cantor network, can al-ays construct a new connection between the inputnd the output ports without altering other connec-ions already in the switches. However, the nonblock-ng switches are nonscalable and economically unfea-ible in the optical domain. On the other hand, theearrangeable switches, such as a Benes network,oute new input–output connections by rearranging

2009 Optical Society of America

Page 2: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

tcmepairpc[lqiTts

otsfpmmsisshsprp

appntgssqictwtocPier

B2 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

all other existing connections in the switches. Theserearrangeable switches are more scalable than theirnonblocking counterparts, but they nevertheless re-quire a more complicated scheduling (or rearranging)algorithm, which causes the switch processing to slowdown. The last category, blocking switches, enablesself-routing and requires the least number of switch-ing elements among the three categories. Hence, it isthe most scalable and most economic class. The pricepaid, however, is internal contention (blocking) whentwo packets attempt to access the same internal linkin the switch.

Most work done on OPS systems has consideredonly the nonblocking space switch architecture. Theproblem to be resolved for such OPS systems is solelythe output contentions that are caused if two packetsare destined for the same destination. Because of theexceedingly high cost of fast optical switches (andswitching elements), we consider the blocking Banyanspace switch a promising candidate for future opticalnetworks. To resolve the internal contention problem[4,5] in the electronic domain, numerous methodshave been proposed. The most prevailing method,called the buffered Banyan switch, queues contendingpackets at the input ports through using RAM-basedbuffers. Such a buffering strategy becomes impracti-cal in the optical domain. The main goal of the paperis to incorporate a Banyan-like switch architecture to-gether with a fast and high-throughput packet sched-uling algorithm to resolve internal and output conten-tions.

The second component of optical systems, namely,the optical FDL-based buffer, has been successfullyapplied to resolving contentions in the time dimensionunder various buffering strategies. Similar to that ofelectronic switches, the buffering strategy [5] can becategorized into input buffering, output buffering, andshared buffering. While input (output) buffering has aseparate buffer for each input (output) port, theshared buffering allows buffers to be shared amongmultiple inputs and/or outputs. As mentioned previ-ously, because of the lack of RAM-based optical buff-ers, the FDL is currently the only viable bufferingmeans for optical networks. There are two FDL buff-ering structures: feedback or feed-forward [6]. Thefeedback structure can support dynamic buffering du-rations but at the expense of additional hardware tomaintain signal quality [7–9]. By contrast, the feed-forward FDLs support only fixed buffering durations[10,11] but ensure better signal quality. Thus, thefeed-forward structure is generally preferred over thefeedback-based counterpart.

The third optical-system component, namely, thetunable optical wavelength converter (TOWC), offersan alternative to contention resolution in the space(wavelength) dimension. TOWCs can be realized by

hree key methods: four-wave mixing (FWM) [12],ross-gain modulation (XGM) [13], and cross-phaseodulation (XPM) [14]. These methods all have differ-

nt merits. However, it is worth noting that FWM isarticularly attractive due to its being able to convertgroup of wavelengths simultaneously and because it

s transparent to the modulation format and dataate. Because the use of TOWCs imposes a high costenalty on OPS systems, much investigation has beenarried out to alleviate the cost problem. Some work15,16] proposes the sharing of a number of wave-ength converters, since not all incoming packets re-uire wavelength tuning simultaneously. Other stud-es have considered more cost-efficient limited-rangeOWCs [17,18]. Much work focuses on the combina-

ional use of these two ideas to build more economicalystems [19–21].

In this paper, we propose a novel pseudo-Banyanptical packet switching system (SBOPSS). The sys-em includes a group of pseudo-Banyan spacewitches (PBSs) together with single-stage downsizedeed-forward FDL optical buffers. Each PBS is aseudo-Banyan switch that is of Banyan structure butade from unconventional two-by-two switching ele-ents. Besides traditional cross and bar options, each

witching element allows the merging of two packetsn two different wavelengths from two inputs to theame output. Moreover, each PBS performs packetwitching only for a cluster of wavelengths, yielding aighly scalable system. The economic use of down-ized FDL buffers that are applied on the basis of out-ut and shared buffering strategies, as will be shown,esults in efficient improvement in system through-ut.

Essentially, SBOPSS employs a packet schedulinglgorithm, referred to as the parallel and incrementalacket scheduler (PIPS). Given a set of newly arrivingackets per time slot, PIPS determines a maximumumber of valid paths (packets) to be scheduled withhe current buffers’ state taken into account. The al-orithm aims at maximizing the system throughputubject to satisfying three constraints, which arewitch-contention free, buffer-contention free, and se-uential delivery. Significantly, we prove that PIPS isncremental, as it is named, in the sense that theomputed-path sets are monotonically nondecreasinghroughout each time slot. We further propose a hard-are parallel system architecture for the implementa-

ion of PIPS. As will be shown, PIPS achieves a near-ptimal solution with an exceptionally lowomputational complexity, O��P�� log2�NMW��, where

is the newly-arriving-packet set, N the number ofnput ports, and M and W the numbers of internal andxternal wavelengths, respectively. From simulationesults that pit the PIPS algorithm against four other

Page 3: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

eseatcpo

spfiTpstwt

wn

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B3

algorithms, we show that PIPS outperforms these al-gorithms on both system throughput and computa-tional complexity.

The remainder of this paper is organized as follows.In Section II, we present the architecture of SBOPSS.In Section III, we describe the PIPS algorithm and for-mally prove the incremental property of the algo-rithm. In Section IV, we propose the hardware parallelsystem architecture and derive the computationalcomplexity. Experimental results are then shown inSection V. In Section VI, we then draw comparisonsbetween SBOPSS and several optical space switchstructures with respect to hardware cost and signalquality. Finally, concluding remarks are given in Sec-tion VII.

II. SBOPSS—SYSTEM ARCHITECTURE

SBOPSS consists of two subsystems (see Fig. 1): theoptical switching subsystem for optical switching ofpayloads, and the central switch controller (CSC) for

Fig. 1. (Color online) SB

lectronic processing of headers. It is a synchronousystem that supports fixed-size packets. Packet head-rs carry the label information and are superimposed-mplitude-shift-keying (SASK) modulated [22] withheir payloads. While packet headers are electroni-ally processed by the central switch controller, theayloads travel within the switching subsystem allptically.

The optical switching subsystem consists of fourections: input, space switch, output buffer, and out-ut sections. In the input section, there are N inputbers, each carrying W wavelengths, and N�WOWCs. After DEMUX, each TOWC converts the in-ut wavelength to an internal wavelength that is as-ociated with a free position in the output buffer forhe packet. However, if the system is full, a dumpavelength is converted and the packet is dropped by

he filter.

In the space-switch section, there are C PBSs for Cavelength clusters, respectively, where C is the totalumber of clusters in the system. More specifically,

S—system architecture.

OPS
Page 4: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

BcStdgp

A

ruibtie�wwdEtpith�

rsn�Fsmipse

(tPalbb

n=�

B4 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

the kth PBS, i.e., PBSk, accommodates W /C wave-lengths ranging from ��k−1�W/C+1 to �kW/C for each of theN fibers. This wavelength clustering reduces the sizeof each PBS to a scalable N�W /C��N�W /C�. Withineach PBS, say of size m�m, there are �m /2�� log2 mtwo-by-two switching elements, each of which can beconstructed by four semiconductor optical amplifiers[23]. Each semiconductor optical amplifier can be con-sidered to be an ON–OFF switch to select or unselectthe packet, thus forming four different switching de-cisions, namely, cross, bar, and two merging options.Specifically, the merging option allows two packetswith different wavelengths to be switched from two in-put ports to the same output port. These two packetsultimately will depart from the system through thesame output wavelength and fiber, after receiving dif-ferent delays. Moreover, like Banyan switches, thePBS maintains the self-routing property that allowspackets to be uniquely switched according to their out-put port �N� and assigned output wavelength �W /C�.

The output-buffer section contains W FDL opticalbuffers (FOBs) for W wavelengths, respectively. EachFOB is shared by all output ports. An FOB is com-posed of a pair of arrayed waveguide gratings and Doptical FDLs connecting the arrayed waveguide grat-ings, resulting in a total of B buffer positions, whereB= �D−1��M, and M is the number of internal wave-lengths. It is worth noting that a packet entering theFOB at the ith input port will exit the buffer from theith output port after receiving a certain delay time de-termined by the internal wavelength [11]. Thus, forany FOB, an internal wavelength of a packet uniquelydetermines the delay received by the packet. Finally,in the output section, there are N�W FWM-basedTOWCs and N output fibers, each carrying W wave-lengths. Because a FWM-based wavelength converteris used that is capable of converting multiple wave-lengths simultaneously, SBOPSS can support the pre-emption of a low-priority packet by a later-arrivinghigh-priority packet [23], which is beyond the mainscope of this paper.

In the central switch controller subsystem, headersof simultaneous arriving packets are firstsuperimposed-amplitude-shift-keying-based demodu-lated [22] and received. Their labels are passed to thepacket scheduler, PIPS. Serving as the brain ofSBOPSS, PIPS determines for each packet the des-tined wavelength and the delay, aiming at maximizingthe system utilization subject to satisfying three con-straints. They are switch-contention free, buffer-contention free, and sequential delivery. To avoid am-biguity, we use the term “switch contention” to refer tointernal contention throughout the rest of the paper.That means, switch contention occurs when morethan one packet carried by the same wavelength at-tempts to pass through the same link within the PBS.

uffer contention occurs when more than one packetompetes for the same FOB position. Notice that, inBOPSS, packets that are blocked in the PBS or failo obtain an available FOB position will be inevitablyropped. In the next section, we present the PIPS al-orithm in detail and formally prove its incrementalroperty.

III. PARALLEL AND INCREMENTAL PACKET SCHEDULER

. Definitions and Notation

Before delving into the details of the PIPS algo-ithm, we first give notation and definitions that aresed throughout the paper. Because packet schedul-

ng for different clusters is completely independent,elow we discuss the scheduling problem for the sys-em with one cluster. Let N denote the number ofnput–output fibers; W the number of input–outputxternal wavelengths (�1

I to �WI in the input fiber and

1O to �W

O in the output fiber); M the number of internalavelengths (�1 to �M); FOBi the optical buffer foravelength i, where i=1. . .W; and D the number ofelay lines (including no delay) within each FOB.ach newly-arriving-packet header is associated with

he triplicity information (Ii, �jI, Ok), where Ii is the in-

ut fiber, �jI is the input external wavelength, and Ok

s the output fiber. At the beginning of each time slot,he headers of all newly arriving packets form aeader set, P= �pn �pn= �Ii ,�j

I ,Ok�n ;1�n� �P� ;1� iN ;1� j�W ;1�k�N�.

To perform packet scheduling, the PIPS algorithmequires the constant update of the FOB states. Thetate of FOBi is represented by an N�D matrix, de-oted FSTATi�Oa ,FDLb�, i=1. . .W, where each row Oaa=1. . .N� corresponds to an input–output port of theOB, and each column FDLb �b=0. . .D−1� corre-ponds to a delay line in the FOB. Each entry of theatrix is set to 1 if the corresponding buffer position

s occupied and 0 otherwise. Notice that the fact thatackets move forward in the delay line after each timelot has elapsed is associated with the left shift of thentries of the matrix.

Definition 1. A valid path for a packet with headerIi, �j

I, Ok), denoted (Ii, �jI, Ok, �x

O, �y), is a route withinhe system that starts from an input port (Ii, �j

I) of theBS, through the output port (Ok, �x

O) of the PBS, ton inlet of an FOB for �x

O, to a �y-corresponded delayine, and finally to the FOB outlet, which is free fromeing in contention with any packets currently in theuffer, i.e., FSTATx�Ok ,FDL��y−k+M�mod M��=0.

Definition 2. A sound-path set for a group ofewly arriving packets is a set of valid paths Q��qm �qm= �Ii ,�j

I ,Ok ,�xO ,�y�m ;1�m� �P� ;1� i�N ;1� j

W ;1�k�N ;1�x�W ;1�y�M� that satisfies the

Page 5: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

Ttr=�

cctt

B

Nriggms

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B5

following three constraints: (C1) all paths in the setare mutually switch- and buffer-contention free; (C2)all packets to which the sound-path set correspondsare buffer-contention free from the packets currentlyin the buffer; and (C3) the packets belonging to thesame connection are sequential-delivery guaranteed.

Notice that a packet may have many valid paths as-sociated with paths lending different delays. Finally,the packet-scheduling problem is formally defined asfollows.

Packet-Scheduling Problem Definition. Con-sider a number of simultaneously arriving packets,with the header set P= �pn �pn= �Ii ,�j

I ,Ok�n ;1�n� �P� ;1� i�N ;1� j�W ;1�k�N�, without anycomputation-time constraint. The packet-schedulingproblem is to find the largest set of validpaths, referred to as the target sound-path set Q= �qm �qm= �Ii ,�j

I ,Ok ,�xO ,�y�m ;1�m� �P� ;1� i�N ;1� j

�W ;1�k�N ;1�x�W ;1�y�M�, for a maximalnumber of packets to be scheduled to simultaneouslyenter the system. Given a time constraint,

Fig. 2. (Color onl

, the packet-scheduling problem is to obtain withinime T the largest possible sound-path set,eferred to as the transient sound-path set Q�T��qm �qm= �Ii ,�j

I ,Ok ,�xO ,�y�m ;1�m� �P� ;1� i�N ;1� j

W ;1�k�N ;1�x�W ;1�y�M�.

A packet will be discarded if its valid path is not in-luded in the target or transient sound-path set. A dis-arded packet is converted to wavelength �0, which inurn will be discarded through a filter before enteringhe PBS.

. PIPS Algorithm

The packet-scheduling problem can be proved to beP-complete. In this sequel, we present our PIPS heu-

istic algorithm that finds a near-optimal solution orncrementally returns a feasible solution within aiven time constraint. As shown in Fig. 2, the PIPS al-orithm operates in three phases—the graph transfor-ation, directed-graph construction, and iterative

elf-marking phases—on a slot basis. In the first

) PIPS algorithm.

ine
Page 6: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

tcttatgs

C

tcp

a

satvsrt

usc

dgb=plhAmnbtt

daOtaCrlbe=i

B6 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

phase, the algorithm transforms the packet-scheduling problem into a graph problem according tothe following rule. Consider all valid paths for allnewly-arriving packets: for each valid path (Ii, �j

I, Ok,�x

O, �y), a vertex, v�Ii,�jI,Ok,�x

O,�y�, is created and drawninto the undirected graph Gu. Afterward, an edge,CT�edge�v� ,v��, is drawn between two vertices v� andv� if their corresponding valid paths are in (switch orbuffer) contention with each other. Notice that, al-though each packet may have more than one validpath, there is at most one valid path for each packet tobe included in any sound-path set. Hence, a specialedge, OR�edge�v� ,v��, is drawn between two verticesif their corresponding valid paths belong to the samepacket, i.e., share the same Ii and �j

I, but using differ-ent delays and/or external wavelengths. The packet-scheduling problem thus becomes finding a maximalset of disconnected vertices in Gu without edges con-necting any two of them.

In the second phase, the algorithm converts the un-directed graph Gu into a directed graph Gd with theedge directions assigned based on two searching heu-ristics in an attempt to maximize the sound-path set.The heuristics provide guidelines based on the Degreeof a vertex, which is defined to be the total number ofCT�edges (but not OR�edges) connecting to the vertex.Notice that the higher the degree of a vertex, the morepaths the vertex (path) is in contention with; thelonger the FDL delay of a vertex, the more system re-sources (buffers) are occupied, resulting in a greaterpossibility that the future arriving packets areblocked. That is, for the edge assigning process, lower-degree vertices are preferred because they contendwith fewer vertices (paths), and shorter-delay vertices(paths) are preferred because they leave the systemand release resources more quickly. Ultimately, thetwo heuristics are (Heuristic 1) assigning the edge di-rections from the lower-degree to higher-degree verti-ces and (Heuristic 2) assigning the edge directionsfrom the shorter-delay to longer-delay vertices. In es-sence, Heuristic 1 takes precedence over Heuristic 2.

In the last iterative self-marking phase, each vertexin graph Gd iteratively updates the status (Tag) by se-lecting or deselecting itself according to the status ofits neighboring vertices on a round basis. All verticesare initially marked Tag=ON as being selected. Ineach round for any vertex, say v, if there exists oneneighboring vertex that has an edge directing to ver-tex v and is selected �Tag=ON�, vertex v must dese-lect itself �Tagv=OFF� to prevent potential contention.Otherwise, vertex v will select itself �Tagv=ON�. Es-sentially, as asserted by Theorem 1 (next subsection)for proving the incremental property, if the status of avertex remains unchanged for two consecutive rounds(the vertex is said to be stable), the status of the ver-tex will no longer be changed. The iteration stops ei-

her when the status of all vertices remains un-hanged within the entire round by the end of a slotime or the requested time constraint �T� expires. Inhe former case, the target sound-path set Q is givens the set of valid paths for the selected vertices. Inhe latter case, the transient sound-path set Q�T� isiven as the set of valid paths for the vertices that areelected and stable.

. Incremental Property

In Theorem 1, we first assert and prove that a ver-ex’s status will no longer change once it is stable. Ac-ordingly, the incremental property is then given androved in Theorem 2.

Lemma 1. The directed graph, Gd, does not containny cycles.

Proof. In the directed-graph construction phase,ince Heuristic 1 takes precedence over Heuristic 2,ll vertices of Gd are sorted in an absolute order afterhe two heuristic rules are applied. (Notice that if theertices have the same degree and delay, they areorted by the designated ID, as described in the algo-ithm in Fig. 2). Accordingly, all edges are directed inhe same direction, allowing the Lemma to hold.

Theorem 1. If the status (Tag) of a vertex remainsnchanged for two consecutive rounds in the iterativeelf-marking phase, i.e., the vertex is stable, it will nothange in the following iteration rounds.

Proof. The proof is performed via mathematical in-uction on the number of vertices in the directedraph Gd. Assume that V is the vertex set of Gd. Theasic condition states that the theorem holds for �V�1. If the theorem also holds for �V�=k, we are torove that the theorem holds for �V�=k+1. Withoutoss of generality, vk+1 is chosen as the vertex that onlyas inward edges. By Lemma 1, the vertex must exist.lso due to the inductive assumption, vertices v1vkust obey this theorem because the vertex vk+1 will

ot influence them obviously. Therefore, the proof cane completed by proving that Tagk+1 does not produceraces OFF→OFF→ON and ON→ON→OFF duringhe iterative self-marking phase.

Part 1: First, we show that Tagk+1 will never pro-uce a trace of OFF→OFF→ON. By contradiction,ssume that Tagk+1 does indeed produce the traceFF→OFF→ON from iteration round r to r+2. In

his case, there are only two possibilities that can re-lize such a trace, which are illustrated as Case 1 andase 2 in Fig. 3. That is, for making Tagk+1=OFF in

ound r, at the end of round r−1, there must be ateast one vertex whose Tag is ON. The main differenceetween Case 1 and Case 2 is that there exists a non-mpty set of vertices that direct to vk+1 with TagOFF in Case 2, whereas all vertices are of status ON

n Case 1.

Page 7: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

NatRwhcwCT

asrp

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B7

Case 1.1. Round r−1: assume that Tagk+1=ON or OFF ar-

bitrarily, and Tag=ON for every vertex with anedge directing to vk+1. This will imply Tagk+1=OFF in round r.

2. Round r: to keep Tagk+1=OFF in the next round,r+1, there must be a vertex, say vj in Fig. 3, di-recting to vk+1, and Tagj is set to ON within thisround.

3. Round r+1: set Tagk+1=OFF because Tagj=ONat the end of round r. To have Tagk+1=ON in thenext round, r+2, all vertices directing to vk+1must set Tag to OFF within round r+1.

4. Round r+2: set Tagk+1=ON because all Tags di-recting to vk+1 are OFF at the end of round r+1.

Now, a contradiction occurs because vertex vj experi-ences a Tagj trace of ON→ON→OFF from iterationround r−1 to r+1, violating the inductive assumptionwe made in the proof.

Case 2.1. Round r−1: assume that Tagk+1=ON or OFF ar-

bitrarily. There must be a nonempty set of verti-ces with edges directing to vk+1 and Tag=ON,which trigger Tagk+1=OFF in round r. For sim-plicity, in the sequel (and in Fig. 3), our illustra-tion includes only one vertex �vi� in this set.

2. Round r: in this round, vertex vi must change itsTag to OFF. Otherwise, it becomes stable by theinductive hypothesis, and the stability makes it-self remain Tagi=ON in the following rounds;thus Tagk+1 can never be set to ON in round r+2. However, since it must hold that Tagk+1

Fig. 3. (Color online) Illustra

=OFF in round r+1, there must be a vertex (sayvertex vj in Fig. 3) that was OFF at the end ofround r−1 but is updated to ON in this round.(Note that a situation of having no such vertex iswhat Case 1 discusses.)

3. Round r+1: set Tagk+1=OFF since Tagj=ON atthe end of round r. In this round, the Tag of eachvertex directing to vk+1 must be changed to OFFto allow Tagk+1=ON in round r+2.

4. Round r+2: set Tagk+1=ON because all Tags di-recting to vk+1 are OFF at the end of round r+1.

ow, by the inductive hypothesis made in the proof,ll vertices that are unstable (except vk+1) must havehe Tag switched continuously between OFF and ON.ecall that all vertices belonging to Gd are initializedith Tag=ON. Thereby, the unstable vertices mustave their Tags changed in a synchronous manner. Aontradiction occurs, since vi and vj are unstable butith different Tag values at the end of round r−1.ombining Case 1 and Case 2, we have proved thatagk+1 does not produce a trace of OFF→OFF→ON.

Part 2. We now show that Tagk+1 will never producetrace of ON→ON→OFF. Again by contradiction, as-

ume that Tagk+1 produces such a trace from iterationound r to r+2. Under this scenario, there is only oneossibility, which is illustrated as Case 3 in Fig. 3.

Case 3.1. Round r−1: assume that Tagk+1=ON or OFF ar-

bitrarily. To trigger Tagk+1=ON in round r, allvertices directing to vk+1 must have Tag=OFF inthis round.

n for the proof of Theorem 1.

tio
Page 8: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

mccmi

ttDrsrMtttaOftGsftiOwirt

tiF=bupwnDlaFbr

B

ccrtO

B8 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

2. Round r: set Tagk+1=ON. In order to keepTagk+1=ON in the next round, the Tag of eachvertex directing to vk+1 must retain OFF in thisround.

3. Round r+1: set Tagk+1=ON. In this round, theremust have a vertex, say vj, which is directing tovk+1, converting its Tagj to ON. This action thentriggers Tagk+1=OFF in round r+2.

4. Round r+2: set Tagk+1=OFF because Tagj=ONat the end of round r+1.

By the inductive hypothesis, this case arrives at a con-tradiction because Tagj produces a trace of OFF→OFF→ON from iteration round r−1 to r+1.

With all three cases reasoned, by mathematical in-duction, the theorem holds for all �V�.

Theorem 2. Transient sound-path setsQ�T1��Q�T2� if T1�T2, i.e., Q�T� is monotonicallynondecreasing over time T. This assertion is referredto as the incremental property of the PIPS algorithm.

Proof. Assume a vertex v�Q�T1� is selected �Tagv=ON� and stable at time constraint T1. According toTheorem 1, vertex v is also selected and stable undertime constraint T2 if T2�T1. In other words, v�Q�T2� and the theorem holds.

IV. PIPS IMPLEMENTATION—HARDWARE PARALLELSYSTEM ARCHITECTURE

In this section, we present the hardware parallelsystem architecture for the efficient implementationof the PIPS algorithm. We then derive the upper-bound computational complexity of the algorithm.

A. Hardware System Architecture

Given a SBOPSS, we can preconstruct the hard-ware for all legitimate paths, i.e., verticesv�Ii,�j

I,Ok,�xO,�y�, where 1� i�N; 1� j�W; 1�k�N; 1

�x�W; 1�y�M, and all CT�edges and OR�edgesconnecting these vertices. As depicted in Fig. 4, eachvertex is implemented by a hardware subsystem con-sisting of three modules—graph transformation,directed-graph construction, and iterative self-marking—which correspond to the three phases of thePIPS algorithm. The internal interfaces between mod-ules and external interfaces between subsystems aremade through control signals (binary), control bus(nonbinary), and data signal (binary), as shown in Fig.4.

Initially, all subsystems are inactive. On the arrivalof a set of packets, P, the graph transformation mod-ule of a vertex determines whether the vertex belongsto Gu (a valid path) for packet set P by matching its(Ii, �j

I, Ok) with packets in P and checking the empti-ness of the entry FSTAT �O ,FDL �. If the

x k �y−k+M�mod M

atching and checking succeeds, the vertex is in-luded in Gu, and its corresponding subsystem be-omes activated with active signals sent to the two re-aining modules. Otherwise, the subsystem remains

nactive.

Upon having received an active signal, the phase-wo directed-graph construction module broadcastshe active signal to its neighboring subsystems. Theegree value can be computed as the total number of

eceived active signals from the neighboring sub-ystems that are connected via CT�edges. It can be de-ived that, for a SBOPSS with N input–output ports,

internal wavelengths, and W external wavelengths,here are at most NMW� log2�NW� edges connectingo a directed-graph construction module. As a result,he module contains �1/2��NMW� log2�NW� adders,nd the Degree can be calculated in parallel in�log2�NMW� log2�NW���. The phase-two module in-

orms other active subsystems of the Degree via con-rol buses. It then determines the edge directions for

u by comparing Degree values with its neighboringubsystems in parallel. Once these steps are per-ormed, the directed graph Gd is formed. The phase-wo module finally triggers the ON–OFF switch in theterative self-marking module as shown in Fig. 4. TheN–OFF switch comprises a number of unidirectionalires, each of which stands for a directed edge point-

ng to this vertex. The ON state indicates that the di-ected edge belongs to Gd, while the OFF state meanshe contrary.

Finally, the phase-three module performs the itera-ive update of Tag on a round basis. The Tag is initial-zed to be ON and is stored in a D flip-flop as shown inig. 4. It is noted that both Tag=ON and Flag“stable” correspond to a hardware value of 1; andoth Tag=OFF and Flag= “unstable” a value of 0. Topdate the Tag in the subsequent round, this moduleasses neighboring Tag values from unidirectionalires through an AND gate after the inversion. Theew Tag value is updated and in turn recorded in theflip-flop. The Flag of a vertex can be determined by

ogically XNOR-ing two consecutive Tags from the inletnd outlet of the D flip-flop. By logically AND-ing thelag and Tag, one can determine whether the vertexelongs to the sound-path set or not at the end of eachound.

. Computational Complexity

In this subsection, we derive the computationalomplexity of the PIPS algorithm. We first assert tworucial properties of Gd, followed by proving in Theo-em 3 that the maximum number of iteration roundso finish PIPS computing for any packet set P is��P��.

Page 9: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

nmmsvPcrGIv�

wtttGcOtbetuAScT

are

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B9

Lemma 2. The following two properties hold forvertices in Gd:

(a) A vertex becomes stable only at the end of odd(even) rounds by having Tag=ON (OFF) in twoconsecutive rounds.

(b) Each iteration round generates at least one newstable vertex.

Proof. (a) Recall that Tag is initialized to be ONprior to the first iteration round. Unstable verticesswitch their Tags from ON to OFF in odd rounds, andOFF to ON in even rounds. Therefore, the second con-secutive ON always appears in odd rounds, and OFFin even rounds. (b) Notice that, as Lemma 1 indicates,there exists no cycle in Gd. The statement certainlyholds by only considering unstable vertices in Gd.Thus, before executing the update of the rth round,one can always find one unstable vertex vi that noother unstable vertices direct to it with an edge be-cause of the cycle-free assertion. Therefore, all verti-ces directing to vi are stable before executing the rthround update. These stable vertices keep the sameTag values in the r−2nd and r−1st rounds. As a result,Tagi must repeat the same Tag value in the rth roundas that in the r−1st round. Such an update makes ver-tex vi a new stable vertex by the end of the rth round,proving that the lemma holds.

Theorem 3. Given a new packet set P, without timeconstraint, the iterative self-marking phase of thePIPS algorithm completes the computing in O��P�� it-eration rounds.

Fig. 4. (Color online) Hardw

Proof. Given a packet set P, there are at most �P�ewly arriving packets. Therefore, due to the incre-ental property of PIPS (Theorems 1 and 2), theaximum number of selected paths in the target

ound-path set, i.e., the maximum number of stableertices with Tag=ON, is �P� after completing theIPS algorithm. By Lemma 2, we know that all verti-es with Tag=ON will be stable after 2� �P�−1ounds. Also, at the end of round 2� �P�, all vertices ind must be stable and the PIPS algorithm terminates.

f not, a contradiction occurs by having a new stableertex with Tag=ON at the end of the next round 2�P�+1. Therefore, the theorem holds.

Now, the first step of the PIPS algorithm in Fig. 2,hich involves simultaneous left shifting of all en-

ries, requires an O�1� computation. In the graphransformation phase, as shown in the hardware sys-em in Fig. 4, each vertex tests whether it belongs to

u by matching the newly-arriving-packet set P andhecking its related entry of FSTATs, resulting in an��P�� computation. In the directed-graph construc-

ion phase, the Degree calculation for all vertices cane carried out in O�log2�NMW� log2�NW���, and thedge direction can be assigned in O�1� by triggeringhe ON–OFF switch. The iterative self-marking mod-le performs one round update (step 8) by logicallyND-ing the neighboring Tags after the inversion.imilar to calculating the Degree, this AND-ing actionan be performed in O�log2�NMW� log2�NW���. Byheorem 3, this iterative phase can be finished in

parallel system architecture.

Page 10: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

iSsavdsaf

cchptdcsiSpalOOiF[eprct

mocttsmcnp

saIu=stgepr

B10 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

O��P�� log2�NMW� log2�NW���. Finally, the near-optimal solution Q is updated into FSTATs in O��Q��.Accordingly, the computational complexity �T– � can bederived as follows:

T– ��P�,N,M,W� = O�1� + O��P�� + O�log2�NMW

� log2�NW��� + O��P� � log2�NMW

� log2�NW��� + O��Q��

= O��P� � �log2�NMW� + log2 log2�NW���

= O��P� � log2�NMW��.

V. SIMULATION RESULTS

In this section, we demonstrate and compare theperformance of PIPS and five packet scheduling algo-rithms, with respect to system throughput and com-plexity, via simulation results. Since packet schedul-ing for different clusters of SBOPSS is independent,therefore without loss of generality, we assume thereis only one cluster in SBOPSS. In the simulations, weassume that there are a total of N�W i.i.d. (indepen-dent and identically distributed) input traffic flows en-tering the SBOPSS simultaneously. We also experi-ment with two different traffic arrival distributions—Bernoulli process (BP) and interrupted Bernoulliprocess (IBP)—to model smooth and bursty traffic, re-spectively. Specifically for the IBP, we adopt a ratio ofmean idle to busy periods equal to 1/20, correspond-ing to a highly bursty traffic arrival. The traffic load isdefined as the mean number of newly arriving packets�P� divided by the total channel capacity N�W, i.e.,E��P�� / �N�W�. The normalized system throughput isdefined as the ratio of the mean size of the targetsound-path set �Q� to E��P��, i.e., E��Q�� /E��P��.

The five packet scheduling algorithms are the ex-haustive optimal method, JMinD, JMaxS, SMinB, andSMinD. The exhaustive method returns an optimalsolution by testing all of the path combinations for thenewly-arriving-packet set P. With M internal and Wexternal wavelengths, there are a total of MW+1 pathchoices for each packet, where the additional one cor-responds to discarding the packet. In general, unlikePIPS, the remaining four algorithms select a validpath for a single packet each time. For each packet,once a path is selected, it is inserted in the sound-pathset. The four scheduling algorithms differ in packetselection and/or insertion processes. JMinD andJMaxS perform path insertion subject to jointly satis-fying constraints (C1) and (C2) given in Definition 2.While JMinD selects minimal-delay paths first,JMaxS favors paths of maximal sharing of FDL buff-ers. More specifically, for each packet JMinD breadthsearches a valid path with a minimal delay amongFOBs, and JMaxS depth searches a valid path by fill-

ng all buffer positions in an FOBs. By contrast,MinB and SMinD perform packet insertion by con-idering the two constraints separately. SMinB aimst minimal blocking within the PBS by searching allalid paths that satisfy constraint (C1) first. All can-idate paths are then tested and inserted only if con-traint (C2) is satisfied. On the other hand, SMinDims at minimal delay by testing constraint (C2) be-ore constraint (C1).

We first summarize in Table I the computationalomplexity of the PIPS and five other algorithms. Be-ause all path combinations are considered, the ex-austive method results in exceptionally high com-lexity, O��MW+1��P��. JMinD and JMaxS searchhrough a maximum number of M�W valid path can-idates for each of the packets in P. Each valid pathandidate needs to be tested if it contends with pre-cheduled packets. Accordingly, both algorithms resultn a computational complexity of O��P�2MW�. ForMinB, the switching process requires O��P�2MW� toerform scheduling satisfying constraint (C1). It takesnother O��P�� to resolve the buffer contention prob-em [constraint (C2)], yielding a complexity of��P�2MW�. For all packets in P, SMinD requires��P�W� for examining W output external wavelengths

n order to assign minimal-delay available entries ofOB’s [constraint (C2)]. To resolve switch contentions

constraint (C1)], SMinD sequentially tests whetherach packet contends internally with prescheduledackets, yielding a complexity of O��P�2�. Thus, SMinDequires a complexity of O��P�2+ �P�W�. We can con-lude that PIPS requires much lower complexity thanhe remaining four algorithms.

In Table II, we further draw a comparison of nor-alized system throughput between PIPS and all

ther algorithms under both BP and IBP arrivals. Be-ause of the unmanageable complexity of the exhaus-ive method, we can only attain system throughput forhe SBOPSS that is of small size, i.e., N=2, W=4, ashown in Table II. The system throughput of PIPS al-ost overlaps with that of the optimal method. One

an perceive from the results that PIPS returns aear-optimal solution requiring exceptional low com-lexity, regardless of smooth or bursty traffic arrivals.

Notice that, as N becomes larger, owing to thewitch clustering design, SBOPSS retains a manage-ble size of PBSs by increasing the number of clusters.n Table II, we display the throughput of the SBOPSSsing two practicable sizes of PBS, 16�16 (N=4, W4) and 32�32 (N=8, W=4). In this simulation, weet D=N=M, yielding buffer sizes B=12 and B=56 forhe 16�16 and 32�32 PBS cases, respectively. Ineneral, because constraints (C1) and (C2) are consid-red simultaneously, PIPS, JMinD, and JMaxS out-erform both SMinB and SMinD. Among all algo-ithms, SMinD undergoes the worst throughput

Page 11: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

lstpobpn

sriocpta

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B11

performance. This is because the switch contentionhas greater impact on throughput than buffer conten-tion. This also explains the rationale behind the de-sign that the degree-based heuristic rule takes prece-dence over the delay-based rule in the directed-graphconstruction phase of PIPS. Finally, compared withJMinD and JMaxS, PIPS achieves higher throughputdue to the hardware-based parallel implementation ofthe degree-based heuristic rule. Crucially, we observein Table II that SBOPSS invariably achieves thethroughput under IBP that is nearly as high as thatunder the BP, justifying the robustness of SBOPSSunder bursty traffic.

Furthermore, we demonstrate the impact of the op-tical buffer size on throughput of SBOPSS using thePIPS scheduling algorithm. Traffic is assumed to fol-low the BP model. Again, we adopt two different sizesof PBS, i.e., 16�16 (N=4, W=4) and 32�32 (N=8,W=4). In each case, we use three different buffersizes: bufferless �B=0�, a smaller buffer size, and a

TABCOMPARISON OF COMP

Method PIPS Optimal

Complexity O��P�� log2�NMW�� O��MW+1��P��

TABSYSTEM THROUG

PBSSize Method 0.55 0.6

8�8(N=2,W=4)

Optimal(Exhaustive)

BP: 100% BP: 99IBP: 99.98% IBP: 99

PIPS BP: 99.84% BP: 99IBP: 99.83% IBP: 99

16�16(N=4,W=4)

PIPS BP: 98.67% BP: 98IBP: 98.54% IBP: 98

JMinD BP: 98.19% BP: 95IBP: 97.78% IBP: 95

JMaxS BP: 97.91% BP: 95IBP: 97.78% IBP: 94

SMinB BP: 90.31% BP: 86IBP: 90.09% IBP: 85

SMinD BP: 70.13% BP: 68IBP: 70.65% IBP:68

32�32(N=8,W=4)

PIPS BP: 96% BP: 94IBP: 96.1% IBP: 94

JMinD BP: 94.3% BP: 91IBP: 94.26% IBP: 91

JMaxS BP: 94.54% BP: 90IBP: 94.57% IBP: 90

SMinB BP: 85.65% BP: 80IBP: 85.86% IBP: 79

SMinD BP: 65.32% BP: 62IBP: 65.3% IBP: 62

arger buffer size, as indicated in Table III. We ob-erve a crucial fact in all cases that, compared withhe bufferless system, SBOPSS achieves drastic im-rovement in throughput by applying only a handfulf optical buffers (B=12 and B=24). However, as theuffer size grows �B=56�, the effectiveness of the im-rovement is diminished. This fact justifies our eco-omic use of downsized optical buffers.

VI. ASSESSMENT OF OPTICAL SPACE SWITCHES

We now draw comparisons between SBOPSS andeveral prevailing optical space switch structures withespect to component counts and output signal qual-ty. Let N be the number of input fibers, M the numberf wavelengths in each fiber, and C the wavelengthlusters unique to our system, SBOPSS. Table IV de-icts the component counts of space switches underhree space switch categories: nonblocking, rearrange-ble, and blocking switches. Notice that the

IATIONAL COMPLEXITY

MinD JMaxS SMinB SMinD

�P�2MW� O��P�2MW� O��P�2MW� O��P�2+ �P�W�

IIT COMPARISON

Load

0.75 0.85 0.95

BP: 99.48% BP: 96.78% BP: 92.3%IBP: 99.14% IBP: 96.46% IBP: 91.98%BP: 97.66% BP: 95.08% BP: 91.05%

% IBP: 97.71% IBP: 94.8% IBP: 90.74%BP: 96.08% BP: 92.51% BP: 87.13%IBP: 95.79% IBP: 92.19% IBP: 86.79%BP: 93.14% BP: 92.19% BP: 85.62%IBP:93.05% IBP: 91.73% IBP: 85.34%BP: 90.86% BP: 86.72% BP: 81.59%

% IBP: 90.93% IBP: 86.29% IBP: 81.11%BP: 80.62% BP: 75.13% BP: 70.54%IBP: 80.27% IBP: 75.48% IBP: 70.22%BP: 65.74% BP: 63.95% BP: 61.59%IBP: 65.73% IBP: 64.09% IBP: 61.61%BP: 91.17% BP: 86.84% BP: 81.96%

% IBP: 91.05% IBP: 86.77% IBP: 81.8%BP: 89.33% BP: 85.36% BP: 79.24%

% IBP: 89.35% IBP: 85.17% IBP: 79.43%BP: 84.89% BP: 79.4% BP: 73.94%

% IBP: 84.56% IBP: 79.32% IBP: 74.12%BP: 74.16% BP: 68.94% BP: 64.21%

% IBP: 74.1% IBP: 68.9% IBP: 64.31%BP: 60.39% BP: 57.57% BP: 54.86%

% IBP: 59.88% IBP: 57.39% IBP: 54.91%

LEUT

J

O�

LEHPU

5

.94%.9%

.23%.21.26%.1%

.93%.2%

.29%.98.39%.8%

.56%

.42%

.06%.06.73%.81.16%.08.14%.82.47%.57

Page 12: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

sptCsenisiesa

astePspgPmtb(t

B12 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

broadcast-and-select space switch [24] can be con-structed by simpler ON–OFF semiconductor opticalamplifier gates instead of the basic two-by-two switch-ing elements. For comparison, in this study we adoptthe use of the two-by-two element as a basic element.We have observed in Table IV that SBOPSS andBenes [25] outperform other structures with respect tothe number of switching elements. However, for theBenes structure, the price paid is the requirement ofcomplicated scheduling (rearranging) algorithms,causing slower switch processing. Significantly,SBOPSS equips C pseudo-Banyan space switcheswith size N�W /C��N�W /C� by clustering wave-lengths. Such wavelength-clustering design effectivelyreduces switch complexity to O�NW� log2N� by select-ing C as W /4 or W /8.

An optical signal suffers from signal impairmentand power loss that are caused by passing a number ofswitching elements and splitters or couplers, respec-tively. Therefore, we perform three separate studiesfor broadcast and select, Cantor, and all remainingstructures, respectively, due to their uses of differentcomponents. Results are summarized in Table V. Ingeneral, with splitters and couplers, a 1�m splitter oran m�1 coupler causes severe power loss, yielding anoutput power that is 1/m of the original signal power.Thus, for broadcast and select, the output signalpower becomes 1/N2 after passing through one split-ter �1�N�, one basic two-by-two element (or ON–OFFgate), and one coupler �N�1�. For the Cantor switch,the optical signal passes through one 1� log2�NW�

TABSYSTEM THROUGHPUT OF SBOPSS US

PBSSize

BufferSize 0.55

16�16(N=4,W=4)

D=1, M=4 �B=0� 90.5%D=4, M=4 �B=12� 98.67%D=8, M=8 �B=56� 98.67%

32�32(N=8,W=4)

D=1, M=8 �B=0� 86.32%D=4, M=8 �B=24� 95.84%D=8, M=8 �B=56� 96%

TABCOMPONENT-COUNT COMPARISON OF SPA

Category Architecture

Nonblocking spaceswitches

CrossBar [3]Broadcast and select [24]

Cantor [4]Strictly nonblocking Clos [15]

Rearrangeable spaceswitches

Rearrangeable Clos [4]Benes [25]

Blocking space switches SBOPSS

plitter at the switch front and one log2�NW��1 cou-ler at the switch end, resulting in an output powerhat is �1/log2�NW��2 of the input signal power. Theantor switch also causes signal impairment from ba-ic switching elements, where the number of basic el-ments is the same as that of the Benes switch. Fi-ally, the signal quality for the remaining structures

s solely antiproportional to the number of basicwitching elements [26,27]. Thus, we study the signalmpairment by counting the number of two-by-two el-ments in the longest (worst) switching path. As a re-ult, SBOPSS achieves the best signal performancemong all switch structures.

VII. CONCLUSIONS

In this paper, we have proposed a scalable almost-ll-optical packet switching system, SBOPSS. Theystem incorporates cluster-based pseudo-Banyan op-ical switches and downsized feed-forward FDL buff-rs for the optical switching of packet payloads.acket headers are electrically processed by a centralwitch controller, including a parallel and incrementalacket scheduler, or PIPS. Through a three-phase al-orithm, for newly arriving packets per time slot,IPS determines the target sound-path set, aiming ataximizing system throughput subject to satisfying

hree constraints, namely, switch-contention free (C1),uffer-contention free (C2), and sequential deliveryC3). PIPS was proved to be incremental in the sensehat transient sound-path sets are monotonically non-

IIIPIPS FOR DIFFERENT BUFFER SIZES

0.65 0.75 0.85 0.95

87.57% 84.06% 81.48% 78.8%98.26% 96.08% 92.51% 87.13%98.4% 97% 92.72% 87.5%82.1% 77.81% 74.11% 70.19%93.67% 90.36% 86.31% 81.47%94.06% 91.17% 86.84% 81.96%

IVSWITCHES WITH SWITCH SIZE NW�NW

No. of 2�2 SwitchingElements

SplitterNo.

CouplerNo.

�NW�2 0 0N2W NW NW

W /2� �2 log2�NW�−1�� log2�NW� NW NW2NW�2W−1�+ �2W−1�N2 0 0

2NW2+WN2 0 0NW /2� �2 log2�NW�−1� 0 0

NW /2� log2�NW /C� 0 0

LEING

LECE

N

Page 13: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

Yuang et al. VOL. 1, NO. 3 /AUGUST 2009/J. OPT. COMMUN. NETW. B13

decreasing throughout each time slot. We then pro-posed a hardware parallel system for the implementa-tion of PIPS and derived the computationalcomplexity of the algorithm, O��P�� log2�NMW��. Wedrew comparisons between PIPS and five other sched-uling algorithms, including the exhaustive optimalmethod. The PIPS algorithm was shown to attain anear-optimal solution and invariably to outperformthe four scheduling algorithms on both throughputperformance and computational complexity underboth smooth (BP) and bursty (IBP) traffic arrivals. Forthe SBOPSS with optical pseudo-Banyan switches ofpracticable size, 16�16 and 32�32, PIPS guaranteesa minimal throughput of 0.9 under loads of 0.8 and be-low. Finally, we showed that, for the SBOPSS with16�16 �32�32� PBS under a high load of 0.95, sys-tem throughput is greatly improved from 78.8%(70.2%) with no buffer to 87.13% (81.4%) with B=12�B=24�. However, the throughput no longer improvesas the buffer size is increased to B=56 in both cases.The results justify our economic use of downsized op-tical buffers. Finally, compared with prevailing spaceswitch structures, SBOPSS was shown to include thelowest number of optical component counts and toachieve the best output signal integrity.

REFERENCES

[1] S. Yao, S. Yoo, and B. Mukherjee, “All-optical packet switchingfor metropolitan area networks: opportunities and challenges,”IEEE Commun. Mag. vol. 39, no. 3, pp. 142–148, March 2001.

[2] M. Yuang, S. Lee, P. Tien, Y. Lin, J. Shih, F. Tsai, and A. Chen,“Optical coarse packet-swtiched IP-over-WDM network OPSI-NET: technologies and experiments,” IEEE J. Sel. Areas Com-mun., vol. 24, no. 8, pp. 117–127, Aug. 2006.

[3] G. Papadimitriou, C. Papazoglou, and A. Pomportsis, “Opticalswitching: switch fabrics, techniques, and architectures,” J.Lightwave Technol., vol. 21, no. 2, pp. 384–405, Feb. 2003.

[4] S. Li, Algebraic Switching Theory and Broadband Applica-tions, Burlington, MA: Academic, 2001.

[5] S. Dixit, IP OVER WDM Building the Next Generation OpticalInternet, New York, NY: Wiley, 2004.

[6] M. Chia, D. Hunter, I. Andonovic, P. Ball, I. Wright, S. Fergu-son, K. Guild, and M. O’Mahony, “Packet loss and delay per-formance of feedback and feed-forward arrayed-waveguidegratings-based optical packet switches with WDM inputs–

TABSIGNAL-QUALITY COMPARISON OF SPAC

Category Architecture

Nonblocking space switches CrossBar [3]Broadcast and select [24]

Cantor [4]Strictly nonblocking Clos [15

Rearrangeable space switches Rearrangeable Clos [4]Benes [25]

Blocking space switches SBOPSS

outputs,” J. Lightwave Technol., vol. 19, no. 9, pp. 1241–1254,Sept. 2001.

[7] Z. Zhang and Y. Yang, “Low-loss switching fabric design for re-circulating buffer in WDM optical packet switching networksusing arrayed waveguide grating routers,” IEEE Trans. Com-mun., vol. 54, no. 8, pp. 1469–1472, Aug. 2006.

[8] S. Liew, G. Hu, and H. Chao, “Scheduling algorithms forshared fiber-delay-line optical packet switches—part I: thesingle-state case,” J. Lightwave Technol., vol. 23, no. 4, pp.1586–1600, April 2005.

[9] F. Choa, X. Zhao, X. Yu, J. Lin, J. Zhang, Y. Gu, G. Ru, G.Zhang, L. Li, H. Xiang, H. Hadimioglu, and H. Chao, “An op-tical packet switch based on WDM technologies,” J. LightwaveTechnol., vol. 23, no. 3, pp. 994–1014, March 2005.

[10] T. Zhang, K. Lu, and J. Jue, “Shared fiber delay line buffers inasynchronous optical packet switches,” IEEE J. Sel. AreasCommun., vol. 24, no. 4, pp. 118–127, April 2006.

[11] W. Zhong and R. Turker, “Wavelength routing-based photonicpacket buffers and their applications in photonic packetswitching systems,” J. Lightwave Technol., vol. 16, no. 10, pp.1737–1745, Oct. 1998.

[12] M. Yuang, I. Chao, B. Lo, P. Tien, J. Chen, C. Wei, Y. Lin, S.Lee, and C. Chien, “HOPSMAN: an experimental testbed sys-tem for a 10-Gb/s optical packet-switched WDM metro ringnetwork,” IEEE Commun. Mag., vol. 46, no. 7, pp. 158–166,July 2008.

[13] K. Obermann, S. Kindt, D. Breuer, and K. Petermann, “Perfor-mance analysis of wavelength converters based on cross-gainmodulation in semiconductor-optical amplifiers,” J. LightwaveTechnol., vol. 16, no. 1, pp. 78–85, Jan. 1998.

[14] B. Sarker, T. Yoshino, and S. Majumder, “All-optical wave-length conversion based on cross-phase modulation (XPM) in asingle-mode fiber and a Mach–Zehnder interferometer,” IEEEPhoton. Technol. Lett., vol. 14, no. 3, pp. 340–342, March 2002.

[15] F. Yan, W. Hu, W. Sun, W. Guo, Y. Jin, H. He, and Y. Dong,“Placements of shared wavelength converter groups inside acost-effective permuted Clos network,” IEEE Photon. Technol.Lett., vol. 19, no. 13, pp. 981–983, July 2007.

[16] V. Eramo and M. Listanti, “Packet loss in a bufferless opticalWDM switch employing shared tunable wavelength convert-ers,” J. Lightwave Technol., vol. 18, no. 12, pp. 1818–1833, Dec.2000.

[17] V. Eramo, M. Listanti, and M. Donato, “Performance evalua-tion of a bufferless optical packet switch with limited-rangewavelength converters,” IEEE Photon. Technol. Lett., vol. 16,no. 2, pp. 644–646, Feb. 2004.

[18] G. Shen, S. Bose, T. Cheng, C. Lu, and T. Chai, “Performancestudy on a WDM packet switch with limited-range wavelengthconverters,” IEEE Commun. Lett., vol. 5, no. 10, pp. 432–434,Oct. 2001.

[19] V. Eramo, M. Listanti, and A. Germoni, “Cost evaluation of op-tical packet switches equipped with limited-range and full-

VWITCHES QITH SWITCH SIZE NW�NW

No. of 2�2 SwitchingElements (Signal Impairment)

Splitter and Coupler(Output/Input Power)

2NW N/A1 1/N2

2 log2�NW�−1 �1/log2�NW��2

4W�2W−1�+2N2 N/A4W2+2N2 N/A

2 log2�NW�−1 N/Alog2�NW /C� N/A

LEE S

]

Page 14: Pseudo-Banyan Optical WDM Packet Switching System With ... · crossbar matrix network and Cantor network, can al-ways construct a new connection between the input ... is to incorporate

B14 J. OPT. COMMUN. NETW./VOL. 1, NO. 3 /AUGUST 2009 Yuang et al.

range converters for contention resolution,” J. Lightwave Tech-nol., vol. 26, no. 4, pp. 390–407, Feb. 2008.

[20] H. Li and I. L.-J. Thng, “Cost-saving two-layer wavelength con-version in optical switching network,” J. Lightwave Technol.,vol. 24, no. 2, pp. 705–712, Feb. 2006.

[21] V. Eramo, M. Listanti, and M. Spaziani, “Resources sharing inoptical packet switches with limited-range wavelength con-verters,” J. Lightwave Technol., vol. 23, no. 2, pp. 671–687,Feb. 2005.

[22] Y. Lin, M. Yuang, S. Lee, and W. Way, “Using superimposedASK label in a 10-Gb/s multihop all-optical label swappingsystem,” J. Lightwave Technol., vol. 22, no. 2, pp. 351–361,Feb. 2004.

[23] M. Yuang, P. Tien, J. Shih, S. Lee, Y. Lin, and J. Chen, “A QoSoptical packet-switching system for metro WDM networks,” in

31st European Conf. on Optical Communications, ECOC 2005,vol. 3, Sept. 25–29, 2005, pp. 351–352.

[24] Z. Zhang and Y. Yang, “Optical scheduling in buffered WDM in-terconnects with limited range wavelength conversion capabil-ity,” IEEE Trans. Comput., vol. 55, no. 1, pp. 71–82, Jan. 2006.

[25] Z. Hass, “The staggering switch: an electronically controlledoptical packet switch,” J. Lightwave Technol., vol. 11, no. 6, pp.925–936, June 1993.

[26] O. Ladouceur, B. Small, and K. Bergman, “Physical layer scal-ability of WDM optical packet interconnection networks,” J.Lightwave Technol., vol. 24, no. 1, pp. 262–270, Jan. 2006.

[27] J. Yu and P. Jeppesen, “Improvement of cascaded semiconduc-tor optical amplifier gates by using holding light injection,” J.Lightwave Technol., vol. 19, no. 5, pp. 614–623, May 2001.