Packet Switching Interconnection Networks for Modular Systems


Adding buffers to a packet switching network can increase throughput in certain system architectures. A word of warning: don't make them too large.

Daniel M. Dias, Mitre Corporation
J. Robert Jump, Rice University

Packet switching is a method of passing messages between several interconnected computer systems. These messages are organized as relatively small packets (hence the name) that are generated by one system and then passed to another. It is a technique often used in computer networks where the modules are geographically separated. This article examines the use of packet switching techniques in high-bandwidth interconnection networks. We define these networks as packet switching interconnection networks, and will look at what effects packet switching has on a network's performance.

A packet switching interconnection network is composed of simple, interconnected switches that provide communication paths among the modules of a computing system. Each packet contains data and information that identifies a particular destination module. This destination information is used by the switches to route the packet to the proper module. Thus, the control of packet movement through the network does not require a global controller. Instead, it is implemented locally in the switches themselves.

One method of organizing a modular system is shown in Figure 1a. Here, any module can directly communicate with any other module by sending packets through the network. A second commonly encountered structure is shown in Figure 1b. In this case, the modules are divided into two disjoint groups connected by a network, and packets can only be passed between modules in different groups. A good example of this is where the modules in one group are processors and those in the other group, memories.

The reader should keep in mind that this article is not intended as a comprehensive survey of interconnection networks. Several such surveys have already been published; see Mason et al.,1 Siegel,2 and Thurber,3 for example. Instead, this article concentrates on a single class of networks, called delta networks, and summarizes some of the results obtained from simulation and analytical models developed to predict their performance. In particular, it will show how the performance of a network can be affected by changes in the parameters that define its structure, size, and capacity. This performance is characterized in terms of how much information can be passed through a network in a given period of time, and also how long it takes for a typical packet to pass through it. To this end, a network's throughput is defined informally as the number of packets that it can pass per unit of time, with the delay of a network being the average time required to pass a single packet. Because several packets can be concurrently in motion through a network and because packets can collide at an internal switch, the delay is not generally equal to the reciprocal of the throughput.

Figure 1. Modular system organizations.

Packet switching networks

The networks we examine are composed of several of the p input, q output (p × q) switches shown in Figure 2. A packet at any one of the p input terminals can be passed by the switch to any one of the q output terminals. Thus, once a packet has arrived at an input terminal, the switch first uses information in the packet to select one of its output terminals and then transfers the packet to that terminal.

A packet switching network is formed by arranging switches into stages with data paths, called links, which connect an output terminal of a switch in one stage to an input terminal of a switch in the next, adjacent stage. Input terminals of switches in the first stage are called network input ports, and output terminals of switches in the last stage are called network output ports.

Figure 2. Switch types.

Figure 3. A typical (2^3 × 2^3) delta network.


Delta networks. Delta networks4 are defined as networks constructed from switches of size p × p, with n stages, N input ports, and N output ports, where N = p^n. The interconnection patterns of the links between two adjacent stages must be arranged so that a packet can be sent from any one of the network's input ports to any one of its output ports. Furthermore, a packet's movement through the network must be controlled by an n digit, base p number in the packet, called the destination address of the packet, in the following way: For each switch encountered by a packet as it moves from stage to stage, the choice of which of the switch's output terminals is to receive the packet is uniquely determined by one of the digits in the destination address. Thus, the n digits in the destination address correspond to the n stages of the network, and each digit controls only switches in its corresponding stage. Since there are p output terminals on each switch and each digit of the destination address has p possible values, the digits can be used to specify directly which output terminal of the switch will receive the packet. In other words, when the switch computes the destination of the packet, it is performing a simple calculation.

Figure 3 illustrates a typical delta network where N = 8, p = 2, and n = 3. The output ports of this network are labeled with their addresses. Let the bits of the destination address be given as a1a2a3. Then, bit ai routes a packet to output terminal ai of the switches in stage Si, for i = 1, 2, 3. The dotted line indicates the path taken by a packet with destination address 011 that enters the network at port 101.
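To make the routing rule concrete, here is a minimal sketch in Python (our own illustration; the function name and structure are not from the article) that computes the output terminal a packet selects at each stage of an n stage, base p delta network:

```python
def route(dest, p, n):
    """Return the switch output terminal chosen at each of the n stages
    for a packet whose destination address is `dest` (0 <= dest < p**n).

    Stage i uses the i-th most significant base-p digit of the
    destination address; no global controller is involved.
    """
    digits = []
    for i in range(n - 1, -1, -1):          # most significant digit first
        digits.append((dest // p**i) % p)   # terminal chosen at this stage
    return digits

# The dotted-line example of Figure 3: destination address 011 in a
# 2^3 x 2^3 network (p = 2, n = 3) selects terminals 0, 1, 1 in turn.
print(route(0b011, p=2, n=3))               # -> [0, 1, 1]
```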

Timing parameters. Remember that there are two operations involved in passing a packet through a switch: selection of the destination and moving the packet. In this article we assume that a switch can overlap these operations, so that one packet can be moved through a switch at the same time the destination of another packet is being computed. Moreover, we assume that the destinations for packets on two or more input terminals can be concurrently computed by a switch. For the purpose of analyzing network performance, it is sufficient to characterize the operation of a switch by the following two timing parameters:

t_select = time required to compute the destination of a packet.
t_pass = time required to move a packet through the switch.

The time t_select depends primarily on p, the number of output terminals of the switch. Indeed, selecting the output terminal involves decoding a base p digit in the destination address. The time t_pass is a function of the capacity (data width) of the links, the size of the packets, and the data path widths in the switch. For example, if all of the information in a packet can be passed through the links and switches in parallel, then t_pass could be on the order of a few gate delays. However, if the packet has to be broken into several subpackets that are transferred serially, then t_pass would be directly related to the number of subpackets.
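As a rough illustration of the serial case (our own numbers and function, not the article's), t_pass grows with the number of subpackets a packet must be broken into:

```python
from math import ceil

def t_pass(packet_bits, link_width_bits, subpacket_time=1.0):
    """Serial transfer time: one time unit per subpacket."""
    return ceil(packet_bits / link_width_bits) * subpacket_time

print(t_pass(64, 64))   # fully parallel link: a single transfer
print(t_pass(64, 8))    # 8 subpackets transferred serially
```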

The effects of multiple packets. When a packet arrives at a switch input terminal, its passage through the switch may be blocked by other packets in the network. In the case of two packets simultaneously arriving at a switch, only one can be passed when both of them are directed to the same output terminal. Here the switch must pick one of the packets, using some arbitration strategy, pass it to the output terminal, and delay passage of the other one.

Whether or not a packet is blocked (delayed) at a switch also depends on how the switch is implemented. For example, one way to implement a p × q switch is with a p input, q output bus. However, only one packet can be passed through this type of switch at a time, so if more than one packet arrives at this switch, all but one must be blocked, even if they are all directed to different output terminals. In this article, we assume that all switches are capable of passing two or more packets concurrently as long as they are directed to different output terminals. If two or more of these packets are directed to the same output terminal, then we assume that one is randomly picked for transfer and the others are blocked.
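A minimal sketch of this switch policy in Python (the names are ours; the article specifies only the behavior): packets wanting the same output terminal are grouped, one winner per terminal is picked at random, and the rest are blocked.

```python
import random
from collections import defaultdict

def arbitrate(requests):
    """One switch cycle under the article's assumption. `requests` maps
    each input terminal to the output terminal its packet wants.
    Returns (passed, blocked): `passed` maps each requested output
    terminal to the single winning input terminal."""
    contenders = defaultdict(list)
    for inp, out in requests.items():
        contenders[out].append(inp)

    passed, blocked = {}, []
    for out, inputs in contenders.items():
        winner = random.choice(inputs)   # random pick among colliders
        passed[out] = winner
        blocked.extend(i for i in inputs if i != winner)
    return passed, blocked

# Inputs 0 and 1 collide at output 1; input 2 passes to output 0 freely.
print(arbitrate({0: 1, 1: 1, 2: 0}))
```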

Links. The links between stages of a packet switching network may contain data buffers, as shown in Figure 4. These are registers organized as a first-in, first-out queue. The size (or length) of a buffer is the maximum number of packets it can hold. For the networks considered in this article, either all of the links have buffers, or none of them do. Moreover, the lengths of all buffers will be equal. For purposes of analysis, we also assume that the time it takes to move data into and out of the buffers can be included in t_select and t_pass. One advantage of these buffers is that they serve as temporary storage to hold packets that have been blocked by a switch. Thus, a packet that has progressed through a network and is blocked at a switch in one of the later stages can wait in a buffer at a switch input terminal until it is unblocked. However, buffers have drawbacks; they can also block packets. If the buffer at an output terminal is full, then a switch cannot pass a packet to that terminal.
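Such a link buffer behaves as a bounded FIFO queue. The sketch below (our own model, with hypothetical names) captures both roles described above: a blocked packet waits at the head, and a full buffer refuses packets from the upstream switch.

```python
from collections import deque

class LinkBuffer:
    """A bounded first-in, first-out link buffer of length `size`."""
    def __init__(self, size):
        self.size = size
        self.queue = deque()

    def offer(self, packet):
        """Accept a packet from the upstream switch unless full.
        A full buffer blocks the upstream switch (returns False)."""
        if len(self.queue) >= self.size:
            return False
        self.queue.append(packet)
        return True

    def head(self):
        """Packet waiting at the downstream switch input, if any."""
        return self.queue[0] if self.queue else None

    def release(self):
        """The downstream switch has passed the head packet onward."""
        return self.queue.popleft()
```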

Buffers also have the ability to increase the throughput of a network by enabling different switches on a path to operate on different packets at the same time. This is a form of pipelining that increases the number of packets that can be passed through the network in a given period of time. While the buffers between stages increase the minimum possible delay experienced by a packet, the average packet delay of a buffered network is actually less than that of an unbuffered one.

Types of switches and networks. Square switches are switches with identical numbers of input and output terminals. Square networks are packet switching networks composed entirely of square (p × p) switches, and therefore have the same number of input and output ports. But nonsquare networks are needed for a system like the one shown in Figure 1b. These packet switching networks, with p^n input ports and p^m output ports, where n ≠ m, will be analyzed in more detail later in this article. They can be constructed from p × p square switches, p × 1 switches called arbitration switches, and 1 × p switches called distribution switches. In these nonsquare systems, if n is greater than m, then a p^n × p^m network is called an arbitration network; if n is less than m, it is called a distribution network.5

A p^n × p^m arbitration network can be constructed entirely from stages of p × p square switches and stages of p × 1 arbitration switches. Moreover, the stages of square switches and the stages of arbitration switches can be arranged in any order. For a stage of arbitration switches, there is only one output terminal per switch. Hence, the digit of the destination address corresponding to such a stage has only one possible value and can be omitted.

Dias6 has shown that for any integer p greater than one and integers m and n greater than or equal to zero, a p^n × p^m packet switching network that satisfies these conditions can always be constructed. For distribution networks, where n < m, the individual stages will either consist entirely of square switches or distribution switches. Figure 5 shows three 2^3 × 2^2 arbitration networks, and Figure 6 illustrates three 2^1 × 2^3 distribution networks. Note that the placement of the arbitration and distribution switch stages determines the total number of switches needed to realize the network. (This placement also has an effect on network performance, as will be shown later.)

Most of the existing work on interconnection networks for modular systems is concerned with networks that do not contain buffers and are not operated as packet switching networks. The switches in these networks are usually controlled by an external controller instead of by information in packets. Much of this work has concentrated on the use of networks to connect each of the input ports to exactly one output port. Then, packets are passed from all of the input ports to all of the output ports concurrently. In this mode of operation, the state of the network can be described as a permutation π on the set {1, 2, . . ., N}, where input port i is connected to output port π(i). Networks operated this way are called permutation networks. The goal of most permutation network research is to characterize the class of permutations that can be realized using a particular interconnection pattern for the links between stages, and to develop algorithms whose structure matches the structure of the permutation networks.

Figure 4. A typical buffered network.


As a result, very little of this research can be used to analyze the performance of packet switching networks.

The interconnection structure of many well-known permutation networks does satisfy the properties required by delta networks. For example, the Omega network,7 the indirect binary n-cube,8 the cube network,9 the flip network,10 a subclass of banyan networks called regular banyan networks,11 and the baseline network12,13 are all instances of delta networks. However, when delta networks are operated as packet switching networks and the destinations of packets are uniformly distributed, their performance, measured by throughput and delay, is essentially independent of which interconnection structure is used.4,14

Packet switching network research. The data flow architecture currently under development at the Massachusetts Institute of Technology will use buffered arbitration and distribution networks to interconnect the processor modules with the memory modules.15 These networks will operate asynchronously. Some initial analysis of their performance can be found in the papers by Jacobsen and Misunas16 and in the thesis by Boughton.17

The Trac system under construction at the University of Texas at Austin will use a synchronous network capable of operating in either circuit or packet switching modes.18 Trac has a single level of buffering between stages.

As previously stated, the class of delta networks was originally defined by Patel.4 He analyzed the performance of unbuffered delta networks and showed that it was independent of the choice of the interconnection pattern. Dias6 has extended some of Patel's results on unbuffered delta networks, and investigated how the addition of buffers to the links between stages affects a network's performance (some of his results are included later in this article).

Stone19 has proposed the shuffle exchange network for use as a permutation network; packet communication in this type of network has been studied by both Lawrie and Padua,20 and Dias and Jump.21

Figure 5. Typical arbitration networks.

Figure 6. Typical distribution networks.

The reader should keep in mind that most research on packet switching networks has been performed on networks used to interconnect computers at geographically separated sites,22 and that there are significant differences in the way these networks are designed and used compared to the interconnection networks described above. For example, in a separated site situation, there are no intermediate nodes in most computer networks as there are in the inner stages of an interconnection network. The bandwidth of the channels between nodes, instead of the delay in nodes, is the primary limit on performance in these networks. Also, the main cost of a network is the cost of the channels, not the nodes themselves. And finally, the nodes of a separated site network are quite complex (usually being general-purpose computers) and are capable of realizing complex routing strategies.

Network performance

The principal measures used to evaluate and compare the performance of packet switching networks are throughput and delay. The values of these measures depend on several network parameters as well as the external environment of the network. Therefore, it is necessary to place some restrictions on the way a network interacts with its environment and to make some simplifying assumptions about the way the environment generates and accepts packets. One goal of the research reported here is to identify and evaluate those properties of networks that limit system performance and are relatively independent of their external environment. Thus, to the extent possible, the restrictions and assumptions that we make have been chosen with this goal in mind as a means to facilitate more meaningful comparisons among the different types of networks.

The external environment of a network is characterized by the way it generates the packets that are passed through it and the way it removes packets that have passed to an output port. In the analysis presented later, we assume that the environment seen by a network satisfies the following conditions:

(1) Packets are presented to the network at each input port at the maximum rate that the network can accept them.

(2) Packets are removed from the network as soon as they arrive at an output port.

(3) Packets are generated by N independent random processes, one for each of the network's N input ports.

(4) The destination addresses of packets are uniformly distributed over the set of output port addresses.

The assumption that packets enter the network at a maximum rate simply means that whenever a packet can be accepted by the network at an input port, one is available. In other words, the environment is capable of generating packets faster than they can be accepted by the network. Similarly, the assumption that packets are removed from the network as soon as they arrive at an output port means that the environment must be able to accept packets at a faster rate than they can be passed by the system. A network operating under these conditions is, in a sense, passing packets from input ports to output ports at its maximum rate. Moreover, this rate is primarily a property of the network and is independent of the rate at which the packets themselves are generated. It is therefore a measure of how much the flow of packets in a system is limited by the interconnection network.

The assumption that packets are generated by independent, uniform, random processes at the network input ports is made to facilitate the analysis and to estimate the upper limits of network performance. If the modules generating packets depend on the generation of other packets at other modules, then the actual rate of packet transfer through the network will be less than the observed transfer rate when the modules are independent. Additionally, if the distribution of destination addresses is not uniform, some parts of the network will be more congested than others, contributing to a decrease in throughput. In other words, assumptions 3 and 4 also lead to an analysis of the maximum performance a network can support.

To simplify later analysis, we also assume that the networks operate synchronously. This assumption enters the analysis when the movement of packets into and out of the network and from stage to stage within the network is assumed to take place at discrete, equally spaced points in time. The time between these points is the minimum delay experienced by a packet at a switch. Analysis based on this assumption could apply directly to a network implemented with a global clock that is used to synchronize the movement of information between buffer registers, or it could be used as an approximation for the behavior of an asynchronous system.

Earlier, we informally defined the main performance measures of a network's throughput and delay; now these can be defined more precisely. Let NT denote the number of packets passed through the network in the time T when it is operated in the environment described above. Let t_min denote the minimum time a packet would be delayed by a switch if it is not blocked. Then, time is measured in discrete intervals of duration t_min. Also, in the following definitions, it will be assumed that there are N input ports and M output ports for the networks.

Average throughput. TP is the average throughput of a network; in other words, the average number of packets that can be passed through the network per unit of time. TP can be expressed as follows:

TP = lim (T → ∞) NT / T

In general, throughput is the rate at which packets flow through the network and is dependent on the rate they are generated. Remember, however, that this article only considers the throughput at maximum input and output rates (assumptions 1 and 2 above). The average throughput in this case is sometimes referred to as the bandwidth of the network.


Normalized throughput. NTP is normalized throughput, and is defined as the ratio of the average throughput to the maximum throughput MTP. That is,

NTP = TP / MTP

where MTP is the maximum rate of packet flow through a network if there are no packets blocked at any switch. It is the maximum rate of packet movement that could be sustained in an ideal system. Hence, the normalized throughput is a measure of how close the network is operating to its ideal performance; it is a measure of the effectiveness of the network in passing packets and an inverse measure of the congestion in the network due to blocked packets.

Average delay. D is the average delay for buffered networks, and is defined as the average time required for a packet to pass from an input port to an output port. If the time required for the i-th packet leaving the network to pass through the network is Ti, then the average delay is given as follows:

D = lim (T → ∞) (1 / NT) Σ (i = 1 to NT) Ti

In a buffered network, a packet is delayed when it is either blocked by another packet at a switch or by a full buffer. When this happens, it waits in its current buffer until it becomes unblocked. Thus, its progress through the network is always positive, always forward. If a packet in an unbuffered network is blocked, then its transfer through the network is also delayed. In this case, however, it must start over at an input port. Therefore, to implement unbuffered networks, there must be some form of feedback to the modules generating packets to indicate when their packets are blocked and must be resubmitted.

Normalized delay. Delay is normalized with respect to the delay of an ideal system (with no blocking) in the same way throughput is normalized. Thus, the normalized delay, or ND, of a network is defined as follows:

ND = D / MD

The minimum delay MD is the time required for a packet to pass through the network if it is not blocked at any switch. Normalized delay is simply another measure of the congestion in a network due to the blocking of packets.

Use of the measures. The throughput measure is most useful as an upper bound on system performance. In a system where the interconnection network was the main bottleneck in performance, one would expect packets to always be available at input ports only if there were a large number of packets generated in parallel. That is, the condition given in our previous assumptions 1 and 2 would only be satisfied in a system where there was a large amount of parallelism. For this type of system, the throughput measure can provide a useful indication of system performance.

For modeling the performance of a system where the dependence between the tasks generating packets was significant, the delay measure would be more useful than the throughput measure. Indeed, the rate at which packets could be generated would be significantly influenced by the time required for packets to pass from one module to another. To use these network measures for these systems, they would have to be combined with models for the generation of packets by the system. Since this article only deals with the analysis of networks, further consideration of these possible extensions to a more general systems analysis will have to be deferred.

The following sections present the values of throughput and delay for several networks. These values were obtained in two ways. Wherever possible, they were computed using an approximate Markov model. An exact Markov model was not used for most of the networks as the number of states was too large. Thus, most of the analytical models we used are approximations. The second method of evaluating throughput and delay was by simulation. For those cases that could not be handled analytically, a special-purpose, event-driven simulator was used. This simulator also verified the accuracy of the approximate analytical models.14,16

The principal advantage of the analytical models was that they could be quickly evaluated by computer. This permitted an exhaustive search for optimal arbitration and distribution networks, the results of which are summarized in the final section of this article. The advantage of the network simulator was that it could represent a network in considerably more detail than the models. The results that follow were obtained using whichever of these two methods was most feasible. When it was possible to use both for the same network, this was done, as it provided some verification of the results.
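The article's special-purpose simulator is not listed, so the following is a stand-in of our own: a minimal cycle-synchronous Python model of a buffered delta network of 2 × 2 switches with a shuffle link pattern, driven by the saturated, uniform-traffic environment of assumptions 1 through 4, with t_select + t_pass taken as one cycle. All names are hypothetical, and the simplifications (for example, a buffer may accept and release a packet in the same cycle) are our own choices, not the article's.

```python
import random
from collections import deque

def simulate(n=3, buf_len=1, cycles=20000, seed=1):
    """Returns (normalized throughput, average delay in cycles) for an
    n-stage, 2^n x 2^n buffered delta network of 2 x 2 switches."""
    rng = random.Random(seed)
    N = 2 ** n
    # buf[s][line] is the FIFO buffer feeding line `line` of stage s.
    buf = [[deque() for _ in range(N)] for _ in range(n)]
    rotl = lambda i: ((i << 1) | (i >> (n - 1))) & (N - 1)  # shuffle link

    delivered, total_delay = 0, 0
    for now in range(cycles):
        # Sweep stages output-side first, so a packet advances at most
        # one stage per cycle.
        for s in range(n - 1, -1, -1):
            for k in range(N // 2):
                wants = {}
                for line in (2 * k, 2 * k + 1):
                    if buf[s][line]:
                        _, dest = buf[s][line][0]
                        out = (dest >> (n - 1 - s)) & 1   # routing digit
                        wants.setdefault(out, []).append(line)
                for out, lines in wants.items():
                    line = rng.choice(lines)              # arbitration
                    if s == n - 1:                        # exits network
                        birth, _ = buf[s][line].popleft()
                        delivered += 1
                        total_delay += now - birth
                    else:
                        nxt = rotl(2 * k + out)           # link wiring
                        if len(buf[s + 1][nxt]) < buf_len:
                            buf[s + 1][nxt].append(buf[s][line].popleft())
        # Saturated sources: top up every first-stage buffer with a
        # packet bound for a uniformly chosen output port.
        for line in range(N):
            while len(buf[0][line]) < buf_len:
                buf[0][line].append((now, rng.randrange(N)))
    # One packet per port per cycle is the buffered maximum, so the
    # first value is already NTP; the unblocked path takes n cycles.
    return delivered / (cycles * N), total_delay / delivered

ntp, d = simulate(n=3)
print(ntp, d, d / 3)   # NTP, average delay D, and ND = D / (n cycles)
```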

Performance of square networks

Square networks have an equal number of input and output ports and are constructed from square switches. These networks can be used in a modular system, like that shown in Figure 1a, where the modules must be able to send packets directly to any other module. For optimal performance, such a system would have direct links between each pair of modules (see Figure 7). In this case, any observed degradation in system performance would be caused by packets directed to the same module colliding with one another at that module. In a delta network, however, packets can collide inside the network even if they have different destination modules. The direct interconnection scheme requires on the order of N^2 links for an N module system. Delta networks also have N^2 (area) complexity if they are implemented on a single VLSI device.23 However, large networks (for example, networks where N ≥ 64) will probably have more terminals than can be implemented on a single device. In this event, it will be necessary to realize large networks as several devices connected by wires. Since delta networks have only N log N links, they offer a less complex alternative to the direct connection of all modules when N is large.

If two packets in a delta network are directed to the same output terminal of a switch, one of them will be blocked. If the operation of a switch is such that most of the time required to process a packet is used to determine its destination (i.e., t_select > t_pass), then the amount of blocking will be reduced, since a packet is only blocked while another packet is being passed to its destination output terminal. But if most of the time is used to pass the packet (i.e., t_pass > t_select), then blocking at a switch will be increased. This means that a rough estimate of the effect of switch operation on network performance can be obtained by evaluating throughput and delay as the values of t_select and t_pass are varied while t_select + t_pass is held constant.

The size of a switch (measured by the number of its terminals) also has an effect on network performance. However, this article is limited to a discussion of networks constructed from 2 × 2 switches. (Some results for larger switches can be found in Patel,4 and Dias and Jump.14)

While buffers can reduce the detrimental effects of packet collisions in a network, they can also increase throughput due to the pipelining they create. Indeed, the maximum throughput for an unbuffered network is given by

MTP_unbuffered = N / (n × t_min)

where n is the number of stages, N is the number of input or output ports, and t_min is the delay of a switch. However, the maximum throughput for a buffered network is

MTP_buffered = N / t_min

since packets can be placed into, and removed from, the network every t_min units of time because of the buffers between stages. Hence, the speedup due to the buffers is given by the ratio

MTP_buffered / MTP_unbuffered = n

This is the improvement expected from an n-stage pipeline, since all n stages are operating in parallel.
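A quick numeric check of these formulas (the example values are our own, not the article's):

```python
N, n, t_min = 64, 6, 1            # a 2^6 x 2^6 network of 2 x 2 switches
mtp_unbuffered = N / (n * t_min)  # a wave of N packets every n time units
mtp_buffered = N / t_min          # one packet per port every time unit
print(mtp_unbuffered, mtp_buffered, mtp_buffered / mtp_unbuffered)
# -> 10.67, 64.0, 6.0: the n-fold pipelining gain
```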

The graphs in Figure 8 indicate the improvement in throughput that can be expected when there is interference between packets. Thus, the ratio of throughput for buffered networks to throughput for unbuffered networks is plotted against the number of stages. Actually, the speedup due to the added buffers can exceed n for an n-stage network because the throughput of the unbuffered network is considerably below its ideal maximum value (a result of packet collisions at switches). In other words, buffers increase throughput because of two different effects: pipelining and the reduction of delay due to collisions.

A better indication of the extent to which buffers can compensate for the detrimental effects of packet collisions is given in Figure 9. Here, normalized throughput is plotted against the size of the buffers.

Figure 7. Direct interconnection of modules.

Figure 8. Speedup due to buffering (t_select + t_pass = 1).

Figure 9. Normalized throughput vs. the number of buffers between stages (t_select + t_pass = 1.0).


The differences shown between actual and ideal throughput are due to packet collisions. Therefore, normalized throughput is actually an inverse measure of network congestion caused by the blocking of packets. As the size of the buffers is increased, the normalized throughput also increases because a switch is more likely to find a space in an output buffer when it needs to transfer a packet. If t_pass = 0, then there is no time penalty involved if two packets have to be passed to the same output terminal; the only blocking that occurs is a result of a full buffer, and as the size of the buffers approaches infinity, the normalized throughput approaches its maximum value of 1.

A different situation exists when t_pass = 1. In this case, the system experiences the maximum possible interference caused by two packets colliding at an output terminal of a switch; as the size of the buffers approaches infinity, any divergence between normalized throughput and its ideal value is caused exclusively by this type of interference.

Figure 10 shows how normalized delay varies with the size of the buffers. It increases in an almost linear fashion with added buffer capacity because the packets tend to fill up the buffers and stay in the network longer. Therefore, the increase in throughput due to larger buffers is offset by an increase in delay. Since normalized throughput starts to level off at a buffer size of about two or three, increasing the buffer size beyond this point will only increase delay, without bringing about a corresponding increase in throughput. Hence, buffer sizes of less than three or four can significantly improve throughput over the unbuffered case, but buffers larger than this are much less effective due to the significant increase in delay.

Figures 11 and 12 show how normalized throughput and delay vary with the size of a network. The normalized throughput decreases as the network size increases because there are more packets in the network and, correspondingly, more possibilities for blocking.

Figure 10. Normalized delay vs. the number of buffers between stages (t_select + t_pass = 1.0).

Figure 11. Normalized throughput vs. number of stages with a single buffer between stages (t_select + t_pass = 1).

Figure 12. Normalized delay vs. number of stages with a single buffer between stages (t_select + t_pass = 1).


The unnormalized throughput will increase with network size, since there are more input and output terminals. However, large networks utilize this increased capacity less effectively than smaller networks, as is evidenced by the decrease in normalized throughput and increase in normalized delay.

Performance of nonsquare networks

This section presents some results concerning the performance of arbitration networks. These are m stage, p^m × p^n networks where m > n, with n stages composed entirely of square switches and m - n stages containing only p × 1 arbitration switches (similar results can also be obtained for distribution networks6).

There are several ways to construct a p^m × p^n arbitration network. Indeed, the n square stages and m - n arbitration stages can be arranged in any order. However, the arrangement chosen will affect both the size of the network (i.e., the number of switches it contains) and its performance.

For any arbitration network, there are two extreme choices for the arrangement of stages. In one, all of the arbitration stages are grouped together at the front of the network; in the other, the arbitration stages are grouped together and located after all of the square stages. Both possibilities are shown in Figure 13 for 2^3 × 2^1 arbitration networks. Note that the first choice (arbitration stages at the front of the network) will have fewer switches than the second, because the total number of switches in arbitration stages will not change with the location of the stage; however, there are fewer switches in a square stage if it follows arbitration stages.

Figure 14 shows how the performance of 2^8 × 2^m arbitration networks varies for values of m from zero to eight. The bottom curve is for networks with all arbitration stages preceding the square stages, and the top curve is for networks where the arbitration stages come after the square stages. All other arrangements of stages result in throughputs that lie somewhere between these two extremes.6 Placing the square stages before the arbitration stages gives better throughput because there are more paths in the network and, therefore, fewer collisions of packets. If the switches in all of the stages have equal delay, then the arrangement of stages in buffered arbitration networks affects the performance and network size in the same way as for unbuffered networks; the best throughput is obtained with networks that place the arbitration stages after the square stages, but the smallest networks are a result of arbitration stages being positioned at the head end of the network.

One way to balance the load in an arbitration network is to match the throughput of the individual stages by decreasing the delay of the arbitration stages to compensate for the reduction in the number of paths.

Figure 13. Extreme arbitration networks.


Thus, a stage of p × 1 arbitration switches should have a delay 1/p times the delay of the preceding stage, and the delay of a stage of square switches should equal the delay of the preceding stage. This could be achieved by implementing a serial-to-parallel transformation on the information in the packets as they pass through an arbitration stage.5 Under this scheme, a packet that enters an arbitration switch organized as k serially transmitted subpackets would leave the stage as k/p serially transmitted subpackets that are p times as large as the entering subpackets.
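A small sketch of this regrouping (our own illustration; subpackets are modeled as bit strings, and p is assumed to divide the subpacket count):

```python
def widen(subpackets, p):
    """Regroup k serially transmitted subpackets into k/p subpackets,
    each p times as wide, so an arbitration stage needs only 1/p of
    the preceding stage's time per packet."""
    k = len(subpackets)
    assert k % p == 0, "p is assumed to divide the subpacket count"
    return [''.join(subpackets[i:i + p]) for i in range(0, k, p)]

# A packet of k = 4 one-bit subpackets entering a p = 2 arbitration
# stage leaves as 2 two-bit subpackets.
print(widen(['1', '0', '1', '1'], p=2))   # -> ['10', '11']
```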

If the matching of stages described above is implemented, then the arrangement of stages that gives the best throughput is not necessarily one of the two extreme configurations shown in Figure 13. Some examples of optimal stage arrangements are given in Table 1. Here, the arrangement of stages that gives the highest normalized throughput is given as a string of A's and S's, where A denotes an arbitration stage and S a square stage. These were obtained by computing the normalized throughput for each possible arrangement of stages for a given number of network input and output ports, and then picking the one with the highest NTP.
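That search is easy to restate in code. In the sketch below (our own; ntp_of stands in for an analytical throughput model such as the article's approximate Markov model, which we do not reproduce), every ordering of n square stages and m - n arbitration stages is scored and the best is kept:

```python
from itertools import permutations

def best_arrangement(m, n, ntp_of):
    """Exhaustively search stage orders for a p^m x p^n arbitration
    network: n square stages ('S') and m - n arbitration stages ('A').
    `ntp_of` maps an arrangement string such as 'SASS' to its
    normalized throughput."""
    stages = 'S' * n + 'A' * (m - n)
    best = max(set(permutations(stages)),
               key=lambda a: ntp_of(''.join(a)))
    return ''.join(best)

# Toy stand-in model: square stages early are rewarded, echoing the
# unbuffered result that S before A means more paths, fewer collisions.
toy_ntp = lambda a: sum(i for i, s in enumerate(a) if s == 'A')
print(best_arrangement(4, 2, toy_ntp))    # -> 'SSAA' under the toy model
```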

Figure 14. Typical NTP for unbuffered arbitration networks (2^8 × 2^m networks for 0 ≤ m ≤ 8).

Conclusion

We have investigated two of the most important properties of packet switching networks, and have made some findings that should be of interest to system architects. First, we showed that the insertion of buffers between the stages of a network improves its throughput, and that this improvement is most significant when two or three buffers are placed between each pair of stages. And second, we demonstrated that the performance of nonsquare networks can be optimized by the proper arrangement of stages.

Acknowledgment

This work was supported by the National Science Foundation under grant number MCS-8001667.

Table 1. Optimal (buffered) arbitration networks (single buffer between stages). A denotes an arbitration stage; S denotes a square stage.

Network shape    t_select = 0         t_select = 1
                 Arrangement   NTP    Arrangement   NTP
2^1 × 2^0        A             1.00   A             1.00
2^1 × 2^1        S             0.75   S             1.00
2^2 × 2^0        AA            0.75   AA            1.00
2^2 × 2^1        SA            0.56   AS            1.00
2^2 × 2^2        SS            0.38   SS            0.75
2^3 × 2^0        AAA           0.75   AAA           1.00
2^3 × 2^1        AAS           0.50   AAS           1.00
2^3 × 2^2        SSA           0.42   SAS           0.75
2^3 × 2^3        SSS           0.31   SSS           0.61
2^4 × 2^0        AAAA          0.75   AAAA          1.00
2^4 × 2^1        AAAS          0.50   AAAS          1.00
2^4 × 2^2        SAAS          0.45   SAAS          0.75
2^4 × 2^3        SSSA          0.36   SSAS          0.65
2^4 × 2^4        SSSS          0.27   SSSS          0.53
2^5 × 2^0        AAAAA         0.75   AAAAA         1.00
2^5 × 2^1        AAAAS         0.50   AAAAS         1.00
2^5 × 2^2        SAAAS         0.48   SAAAS         0.75
2^5 × 2^3        SSAAS         0.42   SAASS         0.67
2^5 × 2^4        SSSSA         0.32   SASSS         0.57
2^5 × 2^5        SSSSS         0.25   SSSSS         0.48
2^6 × 2^0        AAAAAA        0.75   AAAAAA        1.00
2^6 × 2^1        AAAAAS        0.50   AAAAAS        1.00
2^6 × 2^2        SAAAAS        0.50   SAAAAS        0.75
2^6 × 2^3        SSAAAS        0.47   SAAASS        0.69
2^6 × 2^4        SSSAAS        0.41   SASASS        0.60
2^6 × 2^5        SSSSSA        0.30   SASSSS        0.53
2^6 × 2^6        SSSSSS        0.24   SSSSSS        0.45

References

1. G. M. Mason, G. C. Gingher, and S. Nakamura, "A Sampler of Circuit Switching Networks," Computer, Vol. 12, No. 6, June 1979, pp. 32-48.

2. H. J. Siegel, "Interconnection Networks for SIMD Machines," Computer, Vol. 12, No. 6, June 1979, pp. 57-65.


3. K. J. Thurber, "Interconnection Networks-A Survey and Assessment," AFIPS Conf. Proc., Vol. 43, 1974 NCC.

4. J. H. Patel, "Processor-Memory Interconnections for Multiprocessors," Proc. 6th Ann. Symp. Computer Architecture, Apr. 1979, pp. 168-177.

5. J. B. Dennis and D. P. Misunas, "A Preliminary Architecture for a Basic Data-Flow Processor," Proc. 2nd Ann. Symp. Computer Architecture, Jan. 1975, pp. 126-132.

6. D. M. Dias, "Packet Communication in Delta and Related Networks," PhD dissertation, Rice University, Houston, Tex., May 1981.

7. D. H. Lawrie, "Access and Alignment of Data in an Array Processor," IEEE Trans. Computers, Vol. C-24, No. 12, Dec. 1975, pp. 1145-1155.

8. M. C. Pease, "The Indirect Binary n-Cube Microprocessor Array," IEEE Trans. Computers, Vol. C-26, No. 5, May 1977, pp. 458-473.

9. H. J. Siegel and S. D. Smith, "Study of Multistage SIMD Interconnection Networks," Proc. 5th Ann. Symp. Computer Architecture, Apr. 1978, pp. 223-229.

10. K. E. Batcher, "The Flip Network in STARAN," Proc. 1976 Int'l Conf. Parallel Processing, Aug. 1976, pp. 65-71.

11. L. R. Goke and G. J. Lipovski, "Banyan Networks for Partitioning Multiprocessor Systems," Proc. 1st Ann. Symp. Computer Architecture, Apr. 1973, pp. 21-28.

12. C. Wu and T. Feng, "On a Class of Multistage Interconnection Networks," IEEE Trans. Computers, Vol. C-29, No. 8, Aug. 1980, pp. 694-702.

13. C. Wu and T. Feng, "The Reverse-Exchange Interconnection Network," IEEE Trans. Computers, Vol. C-29, No. 9, Sept. 1980, pp. 801-810.

14. D. M. Dias and J. R. Jump, "Analysis and Simulation of Buffered Delta Networks," IEEE Trans. Computers, Vol. C-30, No. 4, Apr. 1981, pp. 273-282.

15. J. B. Dennis, G. A. Boughton, and C. K. Leung, "Building Blocks for Data Flow Prototypes," Proc. 7th Ann. Symp. Computer Architecture, Apr. 1980, pp. 1-8.

16. R. G. Jacobsen and D. P. Misunas, "Analysis of Structures for Packet Communication," Proc. 1977 Int'l Conf. Parallel Processing, Aug. 1977, pp. 38-43.

17. G. A. Boughton, "Routing Networks in Packet Communication Architecture," MS thesis, Dept. of EE and CS, MIT, Cambridge, Mass., June 1979.

18. A. R. Tripathi and G. J. Lipovski, "Packet Switching in Banyan Networks," Proc. 6th Ann. Symp. Computer Architecture, Apr. 1979.

19. H. S. Stone, "Parallel Processing with the Perfect Shuffle," IEEE Trans. Computers, Vol. C-20, No. 2, Feb. 1971, pp. 153-161.

20. D. H. Lawrie and D. A. Padua, "Analysis of Message Switching with Shuffle-Exchanges in Multiprocessors," Proc. Workshop on Interconnection Networks for Parallel and Distributed Processing, Apr. 1980, pp. 116-123.

21. D. M. Dias and J. R. Jump, "Packet Communication in Multistage Shuffle-Exchange Networks," Proc. 1980 Int'l Conf. Parallel Processing, Aug. 1980, pp. 327-328.

22. M. Abrams, R. P. Blanc, and I. W. Cotton, eds., Computer Networks: A Tutorial, IEEE Press, New York, 1980.

23. M. A. Franklin, "VLSI Performance Comparison of Banyan and Crossbar Communications Networks," IEEE Trans. Computers, Vol. C-30, No. 4, Apr. 1981, pp. 283-291.

Daniel M. Dias is currently employed by Mitre Corporation in Houston, Texas. His research interests include parallel computation, interconnection networks, and performance analysis. Dias received a B.Tech degree from the Indian Institute of Technology, Bombay, India, in 1976, and an MS and PhD in electrical engineering from Rice University in 1978 and 1981, respectively.

J. Robert Jump is currently a professor of computer science at Rice University, where his research interests involve parallel computing and digital system design. His previous industrial experience includes work in digital system design at Radiation, Inc., Avco, and IBM, and he is presently serving as an associate editor of IEEE Transactions on Computers. Jump received BS and MS degrees in electrical engineering from the University of Cincinnati in 1960 and 1962, respectively, and MS and PhD degrees in computer and communication sciences from the University of Michigan in 1965 and 1968, respectively.
