Distributed Routing Algorithms. In a message passing distributed system, message passing is the only...

Post on 16-Jan-2016


Distributed Routing Algorithms

In a message passing distributed system, message passing is the only means of interprocessor communication.

Types of communication: unicast, multicast, and broadcast.

Communication latency in a distributed system depends on the following factors: topology, routing, flow control, and switching.

Topology

Network topology can be classified as general purpose and special purpose. A general-purpose network does not have a uniform and structured formation, while a special-purpose network follows a predefined structure.

Switching

There are two classes of switching: store-and-forward, which includes packet switching, and cut-through, which includes circuit switching, virtual cut-through, and wormhole.

Store-and-forward switching: a message is divided into packets that can be sent to a destination via different paths. When a packet reaches an intermediate node, the entire packet is stored there and then forwarded to the next node.

Circuit switching: a physical circuit is constructed before the transmission. After the packet reaches the destination, the path is destroyed.

Virtual cut-through switching: the packet is stored at the intermediate node only if the required channel is busy; otherwise, it is forwarded immediately without buffering.

Wormhole differs from virtual cut-through in two aspects:

(1)   Each packet is further divided into a number of flits.

(2)   When the required channel is busy, instead of buffering the remaining flits by removing them from the network channels, the flow control blocks the trailing flits and they stay in flit buffers along the established route.

At the system level, the main difference between store-and-forward and cut-through is that the former is sensitive to the length of the selected path while the latter, especially in wormhole routing with pipelined flits, is almost insensitive to path length in the absence of network congestion. That is, one unicasting to any destination is considered one step.

The objective of using the store-and-forward model is to minimize the path length.

The objective of using the cut-through model is to reduce network congestion.
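The contrast in path-length sensitivity can be made concrete with a first-order, contention-free latency model. This model is an assumption brought in for illustration and is not taken from the slides: a message of `length_bits` crosses `hops` links of bandwidth `bandwidth`, and under wormhole switching it is pipelined in flits of `flit_bits`. All function and parameter names are hypothetical.

```python
def store_and_forward_latency(length_bits, bandwidth, hops):
    """Each node receives the whole packet before forwarding it,
    so the full transmission time is paid at every hop (assumed model)."""
    return (length_bits / bandwidth) * (hops + 1)


def wormhole_latency(length_bits, flit_bits, bandwidth, hops):
    """Only the header flit pays a per-hop cost; the remaining flits
    follow in a pipeline (assumed model, no contention)."""
    return (flit_bits / bandwidth) * hops + length_bits / bandwidth


if __name__ == "__main__":
    for hops in (1, 10):
        saf = store_and_forward_latency(1024, 1.0, hops)
        wh = wormhole_latency(1024, 8, 1.0, hops)
        print(hops, saf, wh)
```

Under these assumptions the store-and-forward latency grows in proportion to the number of hops, while the wormhole latency is dominated by the message length as long as flits are much shorter than the message.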

Type of communication

Unicast, multicast, and broadcast. Personalized: a source sends different messages to different destinations.

Routing

Routing algorithms can be classified as:
    Special purpose vs. general purpose
    Minimal vs. nonminimal
    Deterministic vs. adaptive
    Source routing vs. destination routing
    Fault-tolerant vs. non fault-tolerant
    Redundant vs. non redundant
    Deadlock-free vs. non deadlock-free

General vs. Special Purpose

General-purpose algorithms are suitable for all types of networks but may not be efficient for a particular network. Special-purpose algorithms are usually efficient because they take advantage of the topological properties of specific networks.

 

Minimal vs. Nonminimal

Minimal-path algorithms provide a least-cost path between source and destination. This scheme can lead to congestion in parts of a network. A nonminimal routing scheme may route the message along a longer path to avoid network congestion.

Deterministic vs. Adaptive

In a deterministic algorithm the routing path changes only in response to topological changes in the underlying network and does not use any information regarding the state of the network. In an adaptive (dynamic) algorithm the routing path changes based on the traffic in the network.

Fault-tolerant vs. non Fault-tolerant

In fault-tolerant routing, a routing message is guaranteed to be delivered in the presence of faults. In non fault-tolerant routing, it is assumed that no fault occurs, and hence there is no need for the routing algorithm to dynamically adjust its activities.

Redundant vs. non Redundant

A typical routing algorithm is nonredundant, i.e., for each destination one copy of the message is forwarded. In certain cases a shared path is used to forward the routing message to several destinations. For the purpose of fault tolerance, multiple copies are sent to a destination via multiple edge-disjoint paths. As long as one of these paths remains healthy, at least one copy will successfully reach its destination. Each destination should make sure that only one copy is accepted.

Deadlock-free vs. non Deadlock-free

A deadlock-free routing ensures freedom from deadlock through carefully designed routing algorithms. In a non deadlock-free routing no special provision is made to prevent or avoid the occurrence of a deadlock.

Routing functions

The routing function defines how a message is routed from the source node to the destination node.

Destination-dependent: this routing function depends on the current and destination nodes only.

Input-dependent: this routing function depends on the current and destination nodes and the adjacent link (or node) from which a message is received.

Source-dependent: this routing function depends on the source, current, and destination nodes.

Path-dependent: this routing function depends on the destination node and the routing path from the source node to the current node.

Dijkstra’s centralized algorithm

Let D(v) be the distance (sum of link weights along a given path) from source s to node v. Let l(v,w) be the given cost between nodes v and w.

There are two parts to the algorithm: an initialization step and a step to be repeated until the algorithm terminates.

1      Initialization. Set N={s}. For each node v not in N, set D(v)=l(s,v). We use ∞ for nodes not connected to s; any number larger than the maximum cost or distance in the network will suffice.

2      At each subsequent step. Find a node w not in N for which D(w) is a minimum and add w to N. Then update D(v) for all remaining nodes that are not in N by computing:

D(v) = min[D(v), D(w)+l(w,v)]

Step 2 is repeated until all nodes are in N.

Ford’s distributed algorithm

Each node v has the label (n,D(v)), where D(v) represents the current value of the shortest distance from the node to the destination and n is the next node along the currently computed shortest path.

1      Initialization. With node d being the destination node, set D(d)=0 and label all other nodes (., ∞).

2      Shortest-distance labeling of all nodes. For each node v ≠ d do the following: using the current value D(w) of each neighboring node w, calculate D(w)+l(w,v) and perform the following update:

D(v) = min{D(v), D(w)+l(w,v)}

Step 2 is repeated until no label changes.

An example

[Figure: a network of five nodes P1 to P5 with link costs 1, 2, 2, 3, 4, 5, and 20.]

Dijkstra’s centralized algorithm (source P5)

Round    N                   D(1)  D(2)  D(3)  D(4)
Initial  {P5}                ∞     ∞     20    2
1        {P5,P4}             ∞     3     4     2
2        {P5,P4,P2}          7     3     4     2
3        {P5,P4,P2,P3}       7     3     4     2
4        {P5,P4,P2,P3,P1}    7     3     4     2
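The Dijkstra rounds of this example can be reproduced with a short sketch of the two-step algorithm described earlier. The link costs in `LINKS` are an assumption: they are reconstructed from the example tables (the P2-P3 cost of 3 is inferred as the only placement of the remaining weight that is consistent with the tabulated distances). The names `LINKS`, `NODES`, `cost`, and `dijkstra` are hypothetical.

```python
import math

# Assumed link costs, reconstructed from the example tables (not given explicitly).
LINKS = {
    ("P1", "P2"): 4, ("P1", "P3"): 5, ("P2", "P3"): 3, ("P2", "P4"): 1,
    ("P3", "P4"): 2, ("P3", "P5"): 20, ("P4", "P5"): 2,
}
NODES = ["P1", "P2", "P3", "P4", "P5"]


def cost(v, w):
    """l(v,w): cost of the link between v and w, or infinity if there is none."""
    return LINKS.get((v, w), LINKS.get((w, v), math.inf))


def dijkstra(nodes, source):
    """Return the final distances D(v) and the (N, D) snapshot after each round."""
    in_n = [source]                                   # step 1: N = {s}
    dist = {v: cost(source, v) for v in nodes if v != source}
    rounds = [(list(in_n), dict(dist))]
    while len(in_n) < len(nodes):                     # step 2, repeated
        w = min((v for v in nodes if v not in in_n), key=lambda v: dist[v])
        in_n.append(w)
        for v in nodes:
            if v not in in_n:
                dist[v] = min(dist[v], dist[w] + cost(w, v))
        rounds.append((list(in_n), dict(dist)))
    return dist, rounds


if __name__ == "__main__":
    final, rounds = dijkstra(NODES, "P5")
    for members, d in rounds:
        print(members, d)
```

With the assumed costs, nodes enter N in the order P5, P4, P2, P3, P1, and the final distances to P1 through P4 are 7, 3, 4, and 2.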

Ford’s distributed algorithm (destination P5)

Round    P1        P2        P3        P4
Initial  (., ∞)    (., ∞)    (., ∞)    (., ∞)
1        (., ∞)    (., ∞)    (P5,20)   (P5,2)
2        (P3,25)   (P4,3)    (P4,4)    (P5,2)
3        (P2,7)    (P4,3)    (P4,4)    (P5,2)
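The labeling rounds of Ford's algorithm in this example can be simulated with the sketch below. It assumes synchronous rounds, in which every node updates its label from the labels its neighbors held at the end of the previous round. The link costs are an assumption reconstructed from the example tables, and the names `LINKS`, `NODES`, `neighbors`, and `ford` are hypothetical.

```python
import math

# Assumed link costs, reconstructed from the example tables (not given explicitly).
LINKS = {
    ("P1", "P2"): 4, ("P1", "P3"): 5, ("P2", "P3"): 3, ("P2", "P4"): 1,
    ("P3", "P4"): 2, ("P3", "P5"): 20, ("P4", "P5"): 2,
}
NODES = ["P1", "P2", "P3", "P4", "P5"]


def neighbors(v):
    """Yield (neighbor, link cost) pairs for node v."""
    for (a, b), c in LINKS.items():
        if a == v:
            yield b, c
        elif b == v:
            yield a, c


def ford(nodes, dest):
    """Return the final labels (next node, distance) and the labels after each round."""
    label = {v: (None, math.inf) for v in nodes}      # step 1: label (., infinity)
    label[dest] = (None, 0)
    history = []
    while True:
        new = dict(label)
        for v in nodes:
            if v == dest:
                continue
            for w, c in neighbors(v):                 # step 2: D(v) = min{D(v), D(w)+l(w,v)}
                if label[w][1] + c < new[v][1]:
                    new[v] = (w, label[w][1] + c)
        if new == label:                              # no label changed: done
            return label, history
        label = new
        history.append(dict(label))


if __name__ == "__main__":
    final, history = ford(NODES, "P5")
    for i, labels in enumerate(history, start=1):
        print(i, {v: labels[v] for v in NODES if v != "P5"})
```

Under these assumptions the labels stabilize after three rounds, with P1 first reaching P5 through P3 at distance 25 and then through P2 at distance 7.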

Unicasting in Special-Purpose Networks

The routing algorithms in the previous section are general and are suitable for all types of network topologies. However, they may not be efficient for special-purpose networks such as rings, meshes, and hypercubes.

Bidirectional rings

Deterministic unicasting on a bidirectional ring is simple: a message is forwarded along one direction (clockwise or counterclockwise) depending on the position of the destination.

In multiple-path routing two paths can be used: one along the clockwise direction and the other along the counterclockwise direction. Either two copies of the routing message are sent, one in each direction, or the message is halved and each half is forwarded in a different direction.
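As a minimal sketch of deterministic ring unicasting, the functions below pick the shorter direction and list the nodes visited. They assume nodes are numbered 0 to n-1 in clockwise order and that ties go clockwise; the numbering, the tie rule, and the function names are assumptions for illustration.

```python
def ring_direction(src, dst, n):
    """Return (direction, hops) for the shorter way around an n-node ring."""
    clockwise = (dst - src) % n
    counterclockwise = (src - dst) % n
    if clockwise <= counterclockwise:          # assumed tie-break: clockwise
        return "clockwise", clockwise
    return "counterclockwise", counterclockwise


def ring_path(src, dst, n):
    """List the nodes visited from src to dst along the chosen direction."""
    direction, hops = ring_direction(src, dst, n)
    step = 1 if direction == "clockwise" else -1
    return [(src + step * i) % n for i in range(hops + 1)]


if __name__ == "__main__":
    print(ring_path(1, 3, 8))
    print(ring_path(1, 7, 8))
```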

Meshes

Adaptive routing and XY routing in a 2-d mesh.
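XY routing is the deterministic case: a message first travels along the X dimension until its X coordinate matches the destination, and only then along the Y dimension. A minimal sketch, assuming nodes are addressed by (x, y) coordinate pairs (the function name is hypothetical):

```python
def xy_route(src, dst):
    """Return the list of mesh nodes visited by XY routing from src to dst."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                 # phase 1: correct the X coordinate
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                 # phase 2: correct the Y coordinate
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

An adaptive minimal algorithm would instead be free to interleave the X and Y moves, for example according to channel availability.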

Hypercubes

The length of the shortest path between two nodes u and w is the Hamming distance between u and w, denoted as H(u,w). The number of node-disjoint shortest paths equals the Hamming distance between the source and destination nodes. At each hop a message may be forwarded along any dimension in which the current node and the destination differ. If the selection of this dimension follows a predefined order, the routing is deterministic and is called e-cube routing.

The multiple-path routing in hypercubes is based on the following property: if two nodes s and d are separated by Hamming distance k in an n-cube, there are n node-disjoint paths between nodes s and d. Of these n paths, k have a length of k and the remaining n-k have a length of k+2.
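A minimal sketch of e-cube routing, assuming node addresses are n-bit integers and that the predefined order is lowest dimension first (the order and the function names are assumptions):

```python
def hamming(u, w):
    """H(u,w): number of bit positions in which u and w differ."""
    return bin(u ^ w).count("1")


def e_cube_route(src, dst, n):
    """Correct the differing dimensions in a fixed order, lowest bit first."""
    path = [src]
    cur = src
    for dim in range(n):
        if ((cur ^ dst) >> dim) & 1:   # dimension dim still differs
            cur ^= 1 << dim
            path.append(cur)
    return path


if __name__ == "__main__":
    route = e_cube_route(0b000, 0b110, 3)
    print([format(v, "03b") for v in route])
```

The number of hops always equals the Hamming distance, so the route is a shortest path.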

An example

[Figure: a 3-cube with nodes 000 to 111, with a source s and a destination d marked.]

3 node-disjoint paths between 000 and 110:

Path 1: 000->100->110
Path 2: 000->010->110
Path 3: 000->001->011->111->110

3 node-disjoint paths between 000 and 100:

Path 1: 000->100
Path 2: 000->001->101->100
Path 3: 000->010->110->100
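One way to construct such a set of n node-disjoint paths is sketched below. It is an assumed construction, not necessarily the one used on the slides: the k shortest paths correct the differing dimensions in rotated orders, and each of the n-k longer paths first steps along a dimension in which s and d agree, corrects all differing dimensions, and then steps back. The sketch assumes s and d are distinct; the function names are hypothetical.

```python
def walk(start, dims):
    """Follow the given sequence of dimensions from start, flipping one bit per hop."""
    path = [start]
    for dim in dims:
        path.append(path[-1] ^ (1 << dim))
    return path


def node_disjoint_paths(s, d, n):
    """Return n paths from s to d in an n-cube (assumes s != d)."""
    diff = [i for i in range(n) if ((s ^ d) >> i) & 1]
    same = [i for i in range(n) if not ((s ^ d) >> i) & 1]
    paths = []
    for r in range(len(diff)):                 # k paths of length k
        paths.append(walk(s, diff[r:] + diff[:r]))
    for j in same:                             # n-k paths of length k+2
        paths.append(walk(s, [j] + diff + [j]))
    return paths


if __name__ == "__main__":
    for p in node_disjoint_paths(0b000, 0b110, 3):
        print("->".join(format(v, "03b") for v in p))
```

For the pairs 000/110 and 000/100 this construction yields the same three paths as listed in the example, possibly in a different order.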

Broadcasting in Special-Purpose Networks - Rings

Broadcasting in rings: two copies of a message are sent from the source in both directions, and they terminate at the two furthermost nodes, respectively. The total number of steps is half of the number of nodes.

One-port model: a node can only forward a copy of the message to one of its neighbors in one step.

All-port model: a node can forward a copy of the message to all its neighbors in one step.

Contention-free broadcasting in a wormhole-routed ring: one-port

For the one-port model, the best strategy is: the source s sends the message to the furthermost node in the first step. This partitions the ring into two equal halves, each containing one node that has a copy of the message. The above process is repeated within each half until all the nodes have a copy. The total number of steps is log n (base 2).

[Figure: a ring with nodes labeled by the step (1, 2, or 3) in which they receive the message.]
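The recursive halving can be sketched as follows. The sketch makes simplifying assumptions that are not stated on the slide: nodes are numbered 0 to n-1 around the ring, each holder sits at the start of the arc it is responsible for, and all messages travel clockwise inside that arc, so no two messages of the same step share a link. The function name is hypothetical.

```python
def one_port_ring_broadcast(n, source=0):
    """Return a list of steps; each step is a list of (sender, receiver) pairs."""
    segments = [(0, n)]          # (offset of holder from source, arc length)
    schedule = []
    while any(length > 1 for _, length in segments):
        sends, next_segments = [], []
        for start, length in segments:
            if length == 1:
                next_segments.append((start, 1))
                continue
            lower = (length + 1) // 2          # size of the half kept by the holder
            mid = start + lower                # first node of the other half
            sends.append(((source + start) % n, (source + mid) % n))
            next_segments.append((start, lower))
            next_segments.append((mid, length - lower))
        schedule.append(sends)
        segments = next_segments
    return schedule


if __name__ == "__main__":
    for step, sends in enumerate(one_port_ring_broadcast(8), start=1):
        print(step, sends)
```

For a ring of 8 nodes this takes 3 steps, with 1, 2, and 4 messages sent in steps 1, 2, and 3 respectively.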

Contention-free broadcasting in a wormhole-routed ring: all-port

For the all-port model, using the cut-through model, the source can send the message to two nodes that are n/3 distance away, where n is the total number of nodes. In the next step each of the three nodes sends the message to two nodes that are n/9 distance away. In general, after k steps 3^k nodes have a copy, and each sends the message to two nodes that are n/3^(k+1) distance away. Basically, this approach cuts a path into three subpaths of equal length with the center node of each subpath as the only node with a copy of the routing message.

[Figure: a ring with nodes labeled by the step (1 or 2) in which they receive the message.]
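A minimal sketch of the all-port schedule, assuming n is a power of 3 and nodes are numbered 0 to n-1 around the ring (both assumptions for illustration; the function name is hypothetical). In each step every holder sends to the two nodes one third of its current arc away, one in each direction.

```python
def all_port_ring_broadcast(n, source=0):
    """Return a list of steps; each step is a list of (sender, receiver) pairs.

    Assumes n is a power of 3.
    """
    holders = [source]
    arc = n                      # length of the arc each holder is responsible for
    schedule = []
    while arc > 1:
        arc //= 3                # send distance: n/3, then n/9, and so on
        sends = []
        for h in holders:
            sends.append((h, (h + arc) % n))
            sends.append((h, (h - arc) % n))
        holders = holders + [dst for _, dst in sends]
        schedule.append(sends)
    return schedule


if __name__ == "__main__":
    for step, sends in enumerate(all_port_ring_broadcast(9), start=1):
        print(step, sends)
```

Under these assumptions a ring of n nodes is covered in log n (base 3) steps, since the number of holders triples in every step.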

Broadcasting in a wormhole-routed mesh: one-port

[Figure: a mesh with source S; the labels 1 and 2 mark broadcast steps.]

A broadcast with message-partition in 2-d meshes

[Figure: a 2-d mesh with source S, shown in three phases.]

Personalized broadcast of ¼ message in one row

Broadcast of ¼ message in columns

Collecting four ¼ messages in each row

Hypercubes

[Figure: a broadcasting initiated from 000 in a 3-cube, with one edge labeled step 1, two edges labeled step 2, and four edges labeled step 3.]

[Figure: a Hamiltonian cycle in a 3-cube.]
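Both constructions can be sketched in a few lines. The broadcast below is a binomial-tree broadcast in which every holder forwards along one new dimension per step; the lowest-dimension-first order is an assumption, since the order used in the figure is not recoverable. The Hamiltonian cycle is generated with the reflected Gray code, which is one possible cycle and not necessarily the one drawn. Function names are hypothetical.

```python
def hypercube_broadcast(source, n):
    """Return n steps; in step i every holder sends along dimension i."""
    holders = [source]
    schedule = []
    for dim in range(n):
        sends = [(h, h ^ (1 << dim)) for h in holders]
        holders = holders + [dst for _, dst in sends]
        schedule.append(sends)
    return schedule


def gray_code_cycle(n):
    """Visit all 2^n nodes so that consecutive nodes (and last/first) are neighbors."""
    return [i ^ (i >> 1) for i in range(1 << n)]


if __name__ == "__main__":
    for step, sends in enumerate(hypercube_broadcast(0b000, 3), start=1):
        print(step, [(format(a, "03b"), format(b, "03b")) for a, b in sends])
    print([format(v, "03b") for v in gray_code_cycle(3)])
```

In a 3-cube the broadcast sends 1, 2, and 4 messages in steps 1, 2, and 3, which matches the edge labels in the figure.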

Path-based Approach

[Figure: a multicast in a 4x4 mesh with nodes numbered 0 to 15, showing the low-channel and high-channel networks.]

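U-mesh algorithm

Source: (0,0)

Destinations: (1,0), (1,1), (1,2), (1,3), (2,0), (2,1), and (3,2)

The lexicographical order of destinations and source is: (0,0), (1,0), (1,1), (1,2), (1,3), (2,0), (2,1), (3,2)

The ordered list is split into two halves: {(0,0), (1,0), (1,1), (1,2)} and {(1,3), (2,0), (2,1), (3,2)}

[Figure: the resulting multicast tree in the mesh, with one send labeled step 1, two labeled step 2, and four labeled step 3.]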
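The halving in this example can be sketched as follows. This is a simplified illustration, not the full U-mesh algorithm: it assumes the source is the first node in lexicographical order (as it is here) and only produces the send schedule, not the routes inside the mesh. The function name is hypothetical.

```python
def u_mesh_schedule(source, destinations):
    """Return a list of steps; each step is a list of (sender, receiver) pairs.

    Simplifying assumption: the source is lexicographically smallest.
    """
    chain = sorted([source] + list(destinations))   # tuples sort lexicographically
    if chain[0] != source:
        raise ValueError("this sketch assumes the source comes first in the order")
    segments = [chain]
    schedule = []
    while any(len(seg) > 1 for seg in segments):
        sends, next_segments = [], []
        for seg in segments:
            if len(seg) == 1:
                next_segments.append(seg)
                continue
            mid = (len(seg) + 1) // 2
            sends.append((seg[0], seg[mid]))        # holder sends to first node of upper half
            next_segments.extend([seg[:mid], seg[mid:]])
        schedule.append(sends)
        segments = next_segments
    return schedule


if __name__ == "__main__":
    dests = [(1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (3, 2)]
    for step, sends in enumerate(u_mesh_schedule((0, 0), dests), start=1):
        print(step, sends)
```

For the example above the source sends to (1,3) in step 1, and all seven destinations are reached in three steps.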

Virtual Channels

[Figure: a 3x3 mesh with nodes numbered 1 to 9, drawn twice.]

[Figure: the same 3x3 mesh drawn as two virtual networks, labeled "Positive network" and "Negative network".]

Unidirectional ring

[Figure: two drawings of a unidirectional ring of four nodes P0 to P3, with high virtual channels Ch0 to Ch3 and low virtual channels Cl0 to Cl3.]

Unidirectional ring algorithm

If the source address is larger than the destination address, any channel can be used to start with; however, once a high (or low) channel is selected, the remaining steps should use high (or low) channels exclusively.

If the source address is smaller than the destination address, high channels are used first, and high virtual channels are switched to low virtual channels after crossing node P3.

Turn model

Deadlock

Four turns allowed in XY-routing

Six turns allowed in positive-first routing

Six turns allowed in negative-first routing
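The turn counts above can be checked by enumerating the eight 90-degree turns in a 2-d mesh. The sketch assumes the usual reading of the three rules: XY routing allows only turns from an X direction into a Y direction, positive-first routing prohibits turns from a negative direction into a positive one, and negative-first routing prohibits turns from a positive direction into a negative one. The direction labels and function names are hypothetical.

```python
from itertools import permutations

DIRECTIONS = ("+X", "-X", "+Y", "-Y")


def ninety_degree_turns():
    """All ordered pairs of directions that lie on different axes (8 turns)."""
    return [(a, b) for a, b in permutations(DIRECTIONS, 2) if a[1] != b[1]]


def xy_allowed(turn):
    """XY routing: finish X before Y, so only X-to-Y turns are allowed."""
    return turn[0][1] == "X" and turn[1][1] == "Y"


def positive_first_allowed(turn):
    """Positive-first: never turn from a negative direction into a positive one."""
    return not (turn[0][0] == "-" and turn[1][0] == "+")


def negative_first_allowed(turn):
    """Negative-first: never turn from a positive direction into a negative one."""
    return not (turn[0][0] == "+" and turn[1][0] == "-")


if __name__ == "__main__":
    turns = ninety_degree_turns()
    for name, rule in [("XY", xy_allowed),
                       ("positive-first", positive_first_allowed),
                       ("negative-first", negative_first_allowed)]:
        print(name, [t for t in turns if rule(t)])
```

Each rule removes at least one turn from both the clockwise and the counterclockwise cycle of turns, which is what prevents the cyclic channel dependency shown in the deadlock figure.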

Adaptivity of positive-first routing

[Figure: two source-destination pairs (s, d) drawn on X-Y axes; routing is fully adaptive in one case and deterministic in the other.]