[IEEE Comput. Soc. Press 11th International Parallel Processing Symposium - Geneva, Switzerland (1-5...


Page 1: [IEEE Comput. Soc. Press 11th International Parallel Processing Symposium - Geneva, Switzerland (1-5 April 1997)] Proceedings 11th International Parallel Processing Symposium - Gracefully

Gracefully Degradable Pipeline Networks*

Robert Cypher and Ambrose K. Laing
Dept. of Computer Science, Johns Hopkins University, Baltimore, MD 21218

cypher@cs.jhu.edu, laing@cs.jhu.edu

Abstract

A pipeline is a linear array of processors with an input node at one end and an output node at the other end. This paper presents k-gracefully-degradable graphs which, given any set of up to k faults, contain a pipeline that uses all the healthy processor nodes. Our constructions are designed to tolerate faulty input and output nodes, but they can be adapted to provide solutions when the input and output nodes are guaranteed to be healthy. All of our constructions are optimal in terms of the number of nodes and the maximum degree of the processor nodes.

KEYWORDS: Interconnection Networks, Graph Theory, Structural Fault Tolerance, Structural Graceful Degradation, Graph Embeddings, Reconfigurability.

1. Introduction

Advances in digital communication and processing technology will enable a wide range of high-performance communication-intensive applications such as video-on-demand, interactive multimedia applications, and automatic speech processing and recognition. In addition to their significant communication requirements, many of these applications require large amounts of processing. For example, it is common to use asymmetrical video compression techniques in which the compression (which is done once at the source) is far more complex than the decompression (which is done once at each destination). The large computational requirements lead naturally to the use of parallel arrays of general purpose processors, programmable signal processing chips, or application-specific integrated circuits [5].

Furthermore, many of these communication-intensive applications have real-time constraints. The combination of high-bandwidth communications and real-time constraints implies that the communication pattern of the application must be carefully mapped to the interconnection network of the parallel computer. Unfortunately, the existence of even a single faulty component could destroy such a mapping. As a result, there is a great need for techniques that map around faults while maintaining the required bandwidths.

*This research was supported in part by the National Science Foundation under Grant MIP9525887 and by a grant from Siemens Corporate Research.

In particular, communication-intensive parallel applications often perform a sequence of different operations on a single stream of data, and thus have a pipeline communication structure. For example, different stages of processing may consist of subsampling, rescaling, and finite impulse response (FIR) or infinite impulse response (IIR) filtering [20]. Textual substitution techniques for data compression also tend to have a 1D communication structure [19, 22]. Parallel algorithms for the Hough and Radon transforms, which are useful in image and computed tomography (CT) processing, have been developed with pipeline communication structures [1].

In this paper we present constructions for gracefully-degradable pipeline graphs. Intuitively, a pipeline is a graph which consists of a linear array of nodes, one endpoint of which is an "input" node, the other endpoint of which is an "output" node, and the remainder of which are "processor" nodes. Our graphical notation for a pipeline is shown in Figure 1.

Figure 1. A pipeline with 7 processors

The goal in constructing a gracefully-degradable pipeline graph is to define a graph which consists of input nodes, output nodes, and processor nodes, such that when any small subset of the nodes has been removed, the remaining graph contains a large pipeline as a subgraph. Such gracefully-degradable pipeline graphs are useful in the creation of fault-tolerant architectures for communication-intensive parallel applications.

The remainder of this paper is organized as follows. Section 2 is a brief survey of related work. Section 3 gives our results. Our constructions are parameterized by k, the maximum number of faults, and n, the minimum number of healthy processors required in the pipeline. Subsection 3.2 details our constructions for small n and arbitrary k, and Subsection 3.3 discusses solutions for small k (fewer than four) with arbitrary n. Subsection 3.4 contains constructions for larger k (four or more) and sufficiently large n. We conclude with a summary of our results in Section 4.

1063-7133/97 $10.00 © 1997 IEEE

2. Related work

Rosenberg [18] proposed a technique which adds a collection of buses in order to accommodate processor faults. However, this approach does not tolerate faults in the buses.

Hayes [13] proposed and formalised a graph model for creating structurally fault-tolerant architectures. He used this graph model to create fault-tolerant cycles [13]. Hayes' graph model can accommodate faults in both processors and communication links (by viewing an adjacent processor as being faulty). Later work by Dutt and Hayes [7] proposed the theory of node coverings as a basis for constructing k-fault-tolerant supergraphs for non-homogeneous symmetrical d-ary trees (NSTs). This unified previous work by Hayes [13], Kwan and Toida [14], Raghavendra, Avizienis and Ercegovac [17], Hassan and Agarwal [12], and Lowrie and Fuchs [16] on NSTs. Dutt and Hayes' method is optimal for k < d and nearly optimal otherwise. They then generalized the theory of node coverings further to handle other types of graphs [9]. Further work by Dutt and Hayes [8] proposed the automorphism methodology, which can be used for a variety of target graphs.

Balasubramanian and Banerjee [2] also used the same graph model to define solutions for tolerating one fault in a Chain-structured Butterfly by using a logarithmic number of spares. Bruck, Cypher and Ho [3] also described solutions where the target graph is an arbitrary dimensioned mesh. This work includes binary hypercubes as a special case. They also gave constructions for the base-m de Bruijn and shuffle-exchange target graphs [4]. In both cases only k extra nodes are required to achieve k-fault-tolerance. Further work was done on mesh architectures, applying k spare nodes and an extra row to a two-dimensional mesh [5]. This method can be generalised to arbitrary dimensioned meshes.

There are two significant limitations to these previous results. First, nearly all of the previous work (with the exception of work on NSTs) provided solutions for unlabeled graphs only. Underlying all such results is the assumption that any node can play the role of any other node. However, this assumption rarely holds when I/O devices are taken into account, both because only certain processors have connections to input and output, and because I/O devices are different from processors. As a result it is necessary to use labeled graphs to model parallel computers that have I/O devices. Specifically, we will use the node labels processor, input-device and output-device to model parallel architectures. This approach allows us to create supergraphs which can tolerate faults in processors and/or I/O devices.

Secondly, the previous work does not guarantee that all of the healthy processors can be utilized when the faults are fewer than the maximum number of permissible faults. Thus, it would be desirable to generalise these fault-tolerance schemes to obtain gracefully-degradable supergraphs which utilize all (or at least as many as possible) of the healthy processors.

This paper presents node-labeled gracefully-degradable pipeline architectures which maintain the required I/O connectivity in the presence of faults and which utilize all healthy processors.

3. Constructions and lower bounds

First we review some basic concepts we need from graph theory [11, 21].

Definitions: Given any graph $G = (V, E)$ and nodes $x, y \in V$, $x$ and $y$ are adjacent iff $(x, y) \in E$. We will use the notation $V(G)$ to mean $V$, and $E(G)$ to mean $E$. Given any sequence of distinct nodes $a_0, \ldots, a_{q-1}$, the path $P = (a_0, \ldots, a_{q-1})$ denotes the graph with nodes $a_0, \ldots, a_{q-1}$ and edges $(a_j, a_{j+1})$ for all $j$, $0 \le j < q-1$. The nodes $a_0$ and $a_{q-1}$ are the endpoints of the path. Similarly, the cycle $C = \mathrm{cycle}(a_0, \ldots, a_{q-1})$ denotes the graph with nodes $a_0, \ldots, a_{q-1}$ and edges $(a_j, a_{(j+1) \bmod q})$ for all $j$, $0 \le j \le q-1$.
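The two edge sets can be spelled out in a few lines. This is only an illustrative sketch: the function names `path_edges` and `cycle_edges` are not from the paper, and the cycle index is taken modulo the number of nodes in the sequence.

```python
def path_edges(nodes):
    """Edges (a_j, a_{j+1}) for 0 <= j < q - 1 of the path on `nodes`."""
    return [(nodes[j], nodes[j + 1]) for j in range(len(nodes) - 1)]


def cycle_edges(nodes):
    """The path edges plus the wrap-around edge (a_{q-1}, a_0)."""
    q = len(nodes)
    return [(nodes[j], nodes[(j + 1) % q]) for j in range(q)]


print(path_edges(["a0", "a1", "a2", "a3"]))
print(cycle_edges(["a0", "a1", "a2", "a3"]))
```

A path on q nodes therefore has q - 1 edges and a cycle on q nodes has q edges.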

Definitions: If a graph $G_1$ is a subgraph of a graph $G_2$, we will say that $G_1$ is in $G_2$ and that $G_2$ contains $G_1$.

Definitions: Given a graph $G_1 = (V_1, E_1)$ and a set of nodes $F \subset V_1$, we will use the notation $G_1 \setminus F$ to denote the graph $G_2 = (V_2, E_2)$ where $V_2 = V_1 \setminus F$ and $E_2 = \{(x, y) \in E_1 : x, y \in V_2\}$.

The notion of a gracefully-degradable pipeline can now be defined formally as follows.

Definitions: Let $G = (V, E)$ be a simple graph with a set of input terminals $T_i \subset V$ and a set of output terminals $T_o \subset V$, where $T_i \cap T_o = \emptyset$. We will refer to the nodes in $V \setminus (T_i \cup T_o)$ as processor nodes. A pipeline in G is a path $(a_0, \ldots, a_q)$ in G such that either $a_0 \in T_i$ and $a_q \in T_o$ (or else $a_0 \in T_o$ and $a_q \in T_i$) and (in either case) $\{a_1, \ldots, a_{q-1}\} = V \setminus (T_i \cup T_o)$. If for a given fault set F, there exists a pipeline in $G \setminus F$, we will say G tolerates F. The graph G is k-gracefully-degradable, denoted GD(G, k), if it is a simple graph, and for all fault sets $F \subset V$ where $|F| \le k$, G tolerates F. If GD(G, k), we will also say G is a solution graph.

Notation: The parameter n denotes the minimum number of processor nodes required in the target pipeline. The parameter k denotes the maximum number of faults that need to be tolerated. We will require that $n \ge 1$ and $k \ge 1$ to avoid trivial problems.
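For small graphs the definition of GD(G, k) can be checked directly by exhaustive search. The sketch below is an illustration only, not a method from the paper: the names `has_pipeline` and `is_gracefully_degradable` are hypothetical, graphs are assumed to be dictionaries mapping each node to its set of neighbors, and the running time grows exponentially with the graph size.

```python
from itertools import combinations


def has_pipeline(adj, inputs, outputs, faults):
    """Search G minus faults for a path: healthy input, every healthy processor, healthy output."""
    healthy = set(adj) - set(faults)
    procs = healthy - inputs - outputs

    def extend(node, visited):
        # Once all healthy processors are used, finish at any healthy output terminal.
        if len(visited) == len(procs):
            return any(t in healthy and t in outputs for t in adj[node])
        return any(extend(nxt, visited | {nxt})
                   for nxt in adj[node]
                   if nxt in procs and nxt not in visited)

    return any(extend(p, {p})
               for t in inputs & healthy
               for p in adj[t] if p in procs)


def is_gracefully_degradable(adj, inputs, outputs, k):
    """Check GD(G, k) by trying every fault set F with |F| <= k."""
    return all(has_pipeline(adj, inputs, outputs, set(f))
               for size in range(k + 1)
               for f in combinations(list(adj), size))


# Hypothetical example: the bare pipeline i - p1 - p2 - o has no redundancy.
adj = {"i": {"p1"}, "p1": {"i", "p2"}, "p2": {"p1", "o"}, "o": {"p2"}}
print(is_gracefully_degradable(adj, {"i"}, {"o"}, 0))  # True: fault-free case
print(is_gracefully_degradable(adj, {"i"}, {"o"}, 1))  # False: losing i leaves no input
```

Because a pipeline is an undirected path, searching from the input end alone covers both orientations allowed by the definition.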

Definition: A k-gracefully-degradable graph for n nodes is a k-gracefully-degradable graph which provides a pipeline of at least n processor nodes for any fault set of size k or less. We use this terminology to refer to any such graph with the parameters n and k, regardless of its structure, and we use the notation $G_{n,k}$ for instances of our constructions.

Given the required values of n and k, we want to construct a graph $G_{n,k}$ which is k-gracefully-degradable. Because there can be up to k faulty input terminals in $G_{n,k}$, it is clear that $G_{n,k}$ must contain at least k + 1 input terminals. Similarly, $G_{n,k}$ must contain at least k + 1 output terminals. Furthermore, even if k faults occur in processor nodes, $G_{n,k}$ must contain a pipeline with n processor nodes, so it follows that $G_{n,k}$ must contain at least n + k processor nodes. We will only allow node-optimal constructions, which are supergraphs that contain the minimum number of nodes of each kind. As a result, our construction $G_{n,k}$ will have exactly k + 1 input terminals, k + 1 output terminals, and n + k processor nodes. Our goal will be to minimize the maximum degree of $G_{n,k}$. In fact, in all cases our constructions are degree-optimal (that is, they have provably optimal maximum degree).

In the above model, we assume that input and output terminals are susceptible to faults. We could consider another model, where input and output terminals are guaranteed to be fault-free. In this new model, only one terminal of each kind is required, and the goal is to provide a pipeline between the input terminal and the output terminal. To provide solutions for both models simultaneously, we will assume the original model and make the further requirement that every input terminal and output terminal must have degree 1 in our constructions. We can then modify each of our solutions to the case of modelling single faultless input nodes and output nodes by "merging" $T_i$ into one node $i$, and $T_o$ into $o$. More formally, we replace $T_i$ by $i$ and each edge $(i_1, j_1)$ where $i_1 \in T_i$ is replaced by $(i, j_1)$. $T_o$ is merged similarly. After merging the terminal nodes the single input terminal $i$ has degree k + 1, which is the smallest possible degree for a terminal (because with fewer than k + 1 neighbors, a terminal could be isolated by a fault set including all of its neighbors). Thus, although we are using a model for which terminal nodes can fail, all of our constructions will also yield efficient constructions for the model in which terminal nodes cannot fail.
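The merging step can be sketched as a small graph transformation. The function name `merge_terminals` and the dictionary-of-sets representation are assumptions made for this illustration, and the example graph is a hypothetical standard graph with k = 1.

```python
def merge_terminals(adj, terminals, merged):
    """Return a copy of `adj` in which all nodes in `terminals` become the node `merged`."""
    new_adj = {v: set(nb) - terminals for v, nb in adj.items() if v not in terminals}
    new_adj[merged] = set()
    for t in terminals:
        for nb in adj[t]:            # in a standard graph, nb is the one processor next to t
            new_adj[merged].add(nb)
            new_adj[nb].add(merged)
    return new_adj


# Hypothetical standard graph with k = 1: two processors, each with one input and one output.
adj = {
    "p0": {"p1", "i0", "o0"}, "p1": {"p0", "i1", "o1"},
    "i0": {"p0"}, "i1": {"p1"}, "o0": {"p0"}, "o1": {"p1"},
}
merged = merge_terminals(adj, {"i0", "i1"}, "i")
print(sorted(merged["i"]))  # neighbors of the single merged input terminal
```

In this example the merged input terminal ends up with k + 1 = 2 neighbors, matching the degree argument above.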

Definition: A k-gracefully-degradable graph for n nodes is standard if it is node-optimal, and every input and output terminal has degree 1. For standard graphs we will let $I$ be the set of k + 1 processor nodes which are adjacent to the input terminal nodes $T_i$, and let $O$ be defined similarly with respect to output terminal nodes.

3.1. General results

This subsection presents a number of lower bounds on the degree of nodes in any solution graph for any given values of n and k. A method for deriving one solution graph from another is also presented.

Lemma 3.1 In a k-gracefully-degradable graph G, the minimum degree of a processor node is at least k + 2.

Corollary 3.2 In a k-gracefully-degradable graph G, the maximum degree of a processor node is at least k + 2.

Corollary 3.3 A k-gracefully-degradable graph G with maximum processor degree of k + 2 is degree-optimal.

Lemma 3.4 In a k-gracefully-degradable graph G where n > 1, the minimum number of processor neighbors that a processor node has is at least k + 1.

Lemma 3.5 For all even n and odd k, the maximum degree of a processor node in a standard k-gracefully-degradable graph G is at least k + 3.

Proof: Consider a standard k-gracefully-degradable graph G for which n is even and k is odd. Suppose for the sake of contradiction that the maximum degree of a processor node is at most k + 2. Given Lemma 3.1, the degree of every processor is exactly k + 2.

Let $G^{(m)}$ be the graph formed by cutting the terminal edges, discarding the terminals, and pairing up the "loose" ends. More formally, we define an arbitrary bijection $\phi : T_i \to T_o$, and define the multigraph $G^{(m)}$ in terms of G with nodes $V(G) \setminus (T_i \cup T_o)$, and edges $\{(a, b) \in E(G) : a \text{ and } b \text{ are processor nodes}\} \cup \{(a, b) : (a, a'), (b, b') \in E(G),\ a' \in T_i,\ b' \in T_o, \text{ and } b' = \phi(a')\}$.

Because G is standard, the degree of a node in $G^{(m)}$ is the same as its degree in G, hence all nodes have degree k + 2 in $G^{(m)}$. Counting the number of "edge-endings" (edge-wise on the left and node-wise on the right), we obtain $2|E(G^{(m)})| = (n+k)(k+2)$. Since n is even and k is odd, the right hand side is odd, but the left hand side must be even, which is a contradiction. □
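The counting step can be written out as follows. This is a worked restatement of the argument above, with the degree sum made explicit, rather than an additional result.

```latex
\begin{align*}
2\,\bigl|E(G^{(m)})\bigr| &= \sum_{v \in V(G^{(m)})} \deg(v) = (n+k)(k+2), \\
n \text{ even},\ k \text{ odd} &\;\Longrightarrow\; n+k \text{ odd and } k+2 \text{ odd}
  \;\Longrightarrow\; (n+k)(k+2) \text{ odd}.
\end{align*}
```

For instance, n = 2 and k = 1 would require $2|E(G^{(m)})| = 3 \cdot 3 = 9$, which no multigraph can satisfy.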

Lastly we describe a general technique which, given a k-gracefully-degradable graph G for n nodes, creates a larger k-gracefully-degradable graph G' for n + k + 1 nodes. This technique may be iterated to obtain k-gracefully-degradable graphs G'' for n + l(k + 1) nodes, where $l \ge 0$. Informally, the idea is to relabel all the input terminal nodes as processor nodes, to put edges between them so they become a clique, and lastly, to create a new input terminal node adjacent to each of these relabeled nodes.

Definition: Given a k-gracefully-degradable graph G, we define the graph G' as follows:

Let V' be a set of k + 1 new nodes, and $\phi$ be an arbitrary bijective mapping of V' onto $T_i$ of G.

$V(G') = V(G) \cup V'$

$E(G') = E(G) \cup \{(a, b) : a, b \in T_i \text{ of } G,\ a \ne b\} \cup \{(v, \phi(v)) : v \in V'\}$

$T_i$ of G' is V'

$T_o$ of G' is $T_o$ of G.

Lemma 3.6 If G is a standard k-gracefully-degradable graph for n nodes which has maximum degree d, then G' defined as above is a standard k-gracefully-degradable graph for n + k + 1 nodes and G' has maximum degree d.

Proof: Clearly G' is standard. From Corollary 3.2, the maximum degree d of G is at least k + 2. No node of degree larger than k + 2 is created by the construction. Furthermore, the construction does not decrease the degree of any node. Hence the maximum degree of G' is also d.

Let $F_a$ be any set of k or fewer faults in G'. We will show that given the fault set $F_a$, there must exist a pipeline in $G' \setminus F_a$.

In this proof $I$ denotes the processor nodes in G' that are adjacent to nodes in $T_i$ of G'. Given any fault set $F \subset V(G')$, let $P(F)$ be a path in G' with endpoints in $I$ and $T_o$ that avoids the faults in F and includes as internal nodes all the processor nodes in $V(G') \setminus (I \cup F)$. Note that such a path exists because G is k-gracefully-degradable and G is a subgraph of G'.

Case 1: $F_a \cap T_i = \emptyset$. Let the endpoint in $I$ of the path $P(F_a)$ be $i_1$. Let U be an arbitrary sequence on the set of nodes in $I \setminus F_a$ that are not on $P(F_a)$. If U is nonnull, let $i_2$ be the last node of U, and if U is null, we define $i_2 = i_1$. Let $j_2 \in T_i$ be adjacent to $i_2$. Since $F_a \cap T_i = \emptyset$, $j_2$ is healthy. We create the desired pipeline in $G' \setminus F_a$ by following the path $P(F_a)$ from a terminal in $T_o$ to its endpoint in $I$, visiting the sequence U in that order ending at $i_2$, and then ending at $j_2$.

Case 2: $F_a \cap T_i \ne \emptyset$. Let $j_3 \in F_a \cap T_i$ and choose $i_4 \in I \setminus F_a$ such that the input terminal node $j_4$, which is adjacent to $i_4$, is healthy (such a node $i_4$ must exist because there are k + 1 disjoint pairs of adjacent nodes, each with one node in $T_i$ and the other in $I$, and at most k of these pairs can contain faults). Create the fault distribution $F_m = F_a \cup \{i_4\} \setminus \{j_3\}$. Note that $|F_m| \le k$. Let the endpoint in $I$ of the path $P(F_m)$ be $i_1$. Clearly $i_4 \in I$ is not on $P(F_m)$. We create the desired pipeline in $G' \setminus F_a$ by starting at $j_4$, going to $i_4$, visiting all the remaining nodes in $I \setminus F_a$ that are not on $P(F_m)$ (if any) in an arbitrary order, going to $i_1$, and then following $P(F_m)$ to an output terminal node. This pipeline clearly avoids all faults in $F_a$, including $j_3$. □
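The extension step can be sketched as follows, based on the informal description given before the definition (relabel the input terminals as processors, join them into a clique, attach one fresh input terminal to each). The function name `extend`, the tuple naming of fresh nodes, and the example graph (one processor pair with k = 1) are assumptions for illustration. The sketch checks only the structural claims of Lemma 3.6, not graceful degradability itself.

```python
from itertools import combinations


def extend(adj, inputs):
    """Build G' from G: old inputs become a processor clique, each with a fresh input."""
    new_adj = {v: set(nb) for v, nb in adj.items()}
    old_inputs = list(inputs)
    for a, b in combinations(old_inputs, 2):   # clique on the relabeled nodes
        new_adj[a].add(b)
        new_adj[b].add(a)
    new_inputs = set()
    for old in old_inputs:                     # the fresh node plays the role of phi^{-1}(old)
        fresh = ("in", old)                    # hypothetical naming scheme
        new_adj[fresh] = {old}
        new_adj[old].add(fresh)
        new_inputs.add(fresh)
    return new_adj, new_inputs


# Hypothetical starting graph with n = 1, k = 1: two adjacent processors,
# each with its own input and output terminal.
adj = {
    "p0": {"p1", "i0", "o0"}, "p1": {"p0", "i1", "o1"},
    "i0": {"p0"}, "i1": {"p1"}, "o0": {"p0"}, "o1": {"p1"},
}
outputs = {"o0", "o1"}
adj2, inputs2 = extend(adj, {"i0", "i1"})
procs2 = set(adj2) - inputs2 - outputs

print(len(procs2))                                        # processors: (n + k) + (k + 1)
print(max(len(adj2[p]) for p in procs2))                  # maximum processor degree
print(all(len(adj2[t]) == 1 for t in inputs2 | outputs))  # terminals keep degree 1
```

In this example the processor count rises from 2 to 4 while the maximum processor degree stays at k + 2 = 3.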

Definition: $G_{1,k}$ is defined to have a complete subgraph on the k + 1 processing nodes. The processing nodes are the set $I = O$.

Lemma 3.7 $G_{1,k}$ is k-gracefully-degradable. Furthermore, it is the only standard solution graph.

Proof: First we show that $G_{1,k}$ is k-gracefully-degradable. Consider a partition of $V(G_{1,k})$ into k + 1 parts such that each terminal node is in the same part as the unique processor node it is adjacent to. There are k + 1 such parts, so at least one of them is guaranteed to be completely healthy. Let this part contain processor node a.

Case 1: There is at least one other healthy processor node b (i.e., $b \ne a$) which is adjacent to some healthy terminal node c.

Case 1a: c is in $T_i$. Let d be the node in $T_o$ adjacent to a. The path which visits c, b, all the remaining healthy processor nodes in any order ending in a, then visits d, is a pipeline.

Case 1b: c is in $T_o$. This is symmetrical to the previous case.

Case 2: Each processor node other than a is faulty, or else it is adjacent to two faulty terminals. Since there are k parts that exclude a, we cannot have a healthy proces- sor node adjacent to two faulty terminals, or else there would be more than k faults. So all processor nodes other than a are faulty. The part containing a naturally defines the pipeline.

Hence $G_{1,k}$ is k-gracefully-degradable. It remains to show that the definition for $G_{1,k}$ is necessary. Let G be a standard solution graph. In order to satisfy the node-optimality requirement, there are only k + 1 processor nodes. The subgraph induced by the processor nodes must be complete because given any two processor nodes c and d, we could consider a fault set F consisting of all processor nodes except c and d (this must be tolerated since $|F| < k$). In order to have a pipeline in the solution graph $G \setminus F$, there must be an edge $(c, d) \in E(G)$. The assumption that G is a standard graph completes the proof. □
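As a sanity check on small cases, the graph described in the definition (a clique on k + 1 processors, each with one input and one output terminal) can be built and tested exhaustively. This is a sketch under assumptions: the names `build_g1k`, `has_pipeline` and `is_gracefully_degradable` are illustrative, and the brute-force search is practical only for small k.

```python
from itertools import combinations


def build_g1k(k):
    """Clique on k + 1 processors; every processor gets one input and one output terminal."""
    procs = [("p", j) for j in range(k + 1)]
    adj = {p: set(procs) - {p} for p in procs}
    inputs, outputs = set(), set()
    for p in procs:
        for kind, group in (("i", inputs), ("o", outputs)):
            t = (kind, p[1])
            adj[t] = {p}
            adj[p].add(t)
            group.add(t)
    return adj, inputs, outputs


def has_pipeline(adj, inputs, outputs, faults):
    """Exhaustive search for a pipeline through all healthy processors."""
    healthy = set(adj) - set(faults)
    procs = healthy - inputs - outputs

    def extend(node, visited):
        if len(visited) == len(procs):
            return any(t in healthy and t in outputs for t in adj[node])
        return any(extend(nxt, visited | {nxt})
                   for nxt in adj[node]
                   if nxt in procs and nxt not in visited)

    return any(extend(p, {p})
               for t in inputs & healthy
               for p in adj[t] if p in procs)


def is_gracefully_degradable(adj, inputs, outputs, k):
    """Try every fault set of size at most k."""
    return all(has_pipeline(adj, inputs, outputs, set(f))
               for size in range(k + 1)
               for f in combinations(list(adj), size))


for k in (1, 2, 3):
    adj, inputs, outputs = build_g1k(k)
    max_deg = max(len(adj[v]) for v in adj if v[0] == "p")
    print(k, is_gracefully_degradable(adj, inputs, outputs, k), max_deg)
```

Each processor has k processor neighbors and two terminal neighbors, so the maximum processor degree is k + 2, the lower bound of Corollary 3.2.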

Corollary 3.8 Given any $l \ge 0$, a solution graph $G_{n'',k}$ exists for $n'' = (k + 1)l + 1$ which is degree-optimal with degree k + 2.


Definition: $G_{2,k}$ is defined to have a complete subgraph on the processing nodes. There are at least three processing nodes, and we distinguish two of them as a and b. All nodes except a and b are each adjacent to an input terminal node and an output terminal node. Each of a and b is adjacent to only one terminal node; a to an input terminal and b to an output terminal. Note that the maximum degree of $G_{2,k}$ is k + 3 since there is at least one node with k + 1 processor neighbors and two terminal neighbors.

Lemma 3.9 $G_{2,k}$ is k-gracefully-degradable. Furthermore, it is the only standard solution graph.

Proof: First we show that $G_{2,k}$ is k-gracefully-degradable. Partition the nodes of $G_{2,k}$ into k + 2 parts so that each processor node is in a different part, and each terminal node is in the same part as the unique processor node to which it is adjacent. Let a fault set F be given. There are at most k parts which contain faults, hence there are at least two healthy parts. Let these two healthy parts contain processor nodes c and d. Note that there exist terminal nodes c' and d' which are adjacent to c and d respectively and of different kinds. Let S be a spanning path of the healthy processor nodes with ends at c and d. Clearly (c', S, d') is a pipeline.

Now in order to show that $G_{2,k}$ is a necessary definition, let G be a solution graph. In order to satisfy the node-optimality requirement, there are only k + 2 processor nodes. The subgraph induced by the processor nodes must be complete because given any two processor nodes c and d, we could consider a fault set F consisting of all processor nodes except c and d (this must be tolerated since $|F| = k$). In order to embed a pipeline in the solution graph G, there must be an edge $(c, d) \in E(G)$.

We now consider the edges between terminals and processor nodes in G. There are two cases (recall that $I$ ($O$) denotes the processor nodes in G adjacent to input (output) terminals):

Case 1: $I = O$. Let u be the unique processor node not in $I$ or $O$. Choose any other processor node v and let F be a fault set such that all the processor nodes except u and v are faulty. To tolerate F, there must be a terminal adjacent to u, but this is not the case, so when $I = O$, G cannot be a solution graph.

Case 2: $I \ne O$. Let b be the unique processor node in $V(G) \setminus I$ and similarly let a be the unique processor node in $V(G) \setminus O$. This corresponds to the definition of $G_{2,k}$, and completes the proof that $G_{2,k}$ is a necessary construction. □
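The first half of the proof is constructive, and the sketch below follows it step by step: find two fault-free parts, orient them so that one supplies an input terminal and the other an output terminal, and span the healthy processors between them. The function names and the node encoding (processors as integers, with a = 0 and b = 1, and terminals as tuples) are assumptions made for this illustration.

```python
from itertools import combinations


def build_g2k(k):
    """Processors 0..k+1 form a clique; a = 0 has only an input, b = 1 only an output."""
    procs = list(range(k + 2))
    adj = {p: {q for q in procs if q != p} for p in procs}
    for p in procs:
        for kind in {0: "i", 1: "o"}.get(p, "io"):
            adj[(kind, p)] = {p}
            adj[p].add((kind, p))
    return adj, procs


def pipeline_from_proof(adj, procs, faults):
    """Pick two fully healthy parts c, d and span the healthy processors between them."""
    part_ok = [p for p in procs
               if p not in faults and not (adj[p] - set(procs)) & faults]
    c, d = part_ok[0], part_ok[1]          # at least two parts are fault-free when |F| <= k
    if ("i", c) not in adj or ("o", d) not in adj:
        c, d = d, c                        # swap so that c has an input and d an output
    middle = [p for p in procs if p not in faults and p not in (c, d)]
    return [("i", c), c, *middle, d, ("o", d)]


def is_valid(adj, path, faults):
    """Every node healthy and every consecutive pair adjacent."""
    return (not set(path) & faults
            and all(b in adj[a] for a, b in zip(path, path[1:])))


adj, procs = build_g2k(2)
path = pipeline_from_proof(adj, procs, {0, ("o", 2)})
print(path, is_valid(adj, path, {0, ("o", 2)}))
```

Because the processor subgraph is complete, any ordering of the remaining healthy processors between c and d is a valid spanning path.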

Corollary 3.10 $G_{2,k}$ has an optimal maximum degree of k + 3.

Lemma 3.11 For k > 1, any solution graph $G_{3,k}$ has a maximum degree of at least k + 3.

Proof: Let G be a solution graph. In order to satisfy the node-optimality constraint, there are only k + 3 processor nodes, and 2k + 2 terminal nodes.

Case 1: $I \cap O = \emptyset$. It follows that $2(k + 1) \le k + 3$, but this is false when $k > 1$, so this case does not arise.

Case 2: $I \cap O \ne \emptyset$. Let $c \in I \cap O$. From Lemma 3.4 and the fact that c has two terminal neighbors, the degree of c is at least k + 3, and this proves the result. □

We now define a solution graph for n = 3 and any $k \ge 1$.

Definition: Let $T_i = \{i_0, \ldots, i_{k-2}, i_k, i_{k+2}\}$, $T_o = \{o_0, \ldots, o_{k-1}, o_{k+1}\}$, and $P = \{p_0, \ldots, p_{k+2}\}$ (note that $|T_i| = |T_o| = k + 1$ and $|P| = n + k = k + 3$). Note that the following are not defined: $\{i_{k-1}, o_k, i_{k+1}, o_{k+2}\}$. Let $V = T_i \cup T_o \cup P$. We define $G_{3,k} = (V, E)$ where

$$E = \{(i_j, p_j) : i_j \in T_i\} \cup \{(o_j, p_j) : o_j \in T_o\} \cup \bigl(\{(p_j, p_l) : j \ne l\} \setminus \{(p_{2q}, p_{2q+1}) : 0 \le q \le \lfloor (k+1)/2 \rfloor\}\bigr)$$

Note that for $k \ge 2$ the maximum degree of a processor node is k + 3, hence the lower bound of Lemma 3.11 is matched. When k = 1, the maximum degree of a processor node is k + 2 and this matches the lower bound of Corollary 3.2.

Also note that the construction turns out differently depending on the parity of k, as shown in Figure 2 and Figure 3. In both figures, all edges are indicated except those with endpoints in P. Edges with endpoints in P are understood to be present unless the two nodes are of the form $p_{2q}$ and $p_{2q+1}$ (these pairs are indicated by dotted ovals in the figure).
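The degree claims for this construction can be checked with a short sketch. It rests on an assumption that should be read with care: the removed processor edges are taken to be every pair $(p_{2q}, p_{2q+1})$ that fits inside P, which is one reading of the definition. The function name `build_g3k` and the tuple node encoding are illustrative only, and the sketch checks node counts and degrees, not graceful degradability.

```python
def build_g3k(k):
    """G_{3,k} as read from the definition; the range of removed pairs is an assumption."""
    n_proc = k + 3
    in_idx = list(range(k - 1)) + [k, k + 2]      # i_0..i_{k-2}, i_k, i_{k+2}
    out_idx = list(range(k)) + [k + 1]            # o_0..o_{k-1}, o_{k+1}
    # Assumed: remove every pair (p_{2q}, p_{2q+1}) with 2q + 1 <= k + 2.
    removed = {(2 * q, 2 * q + 1) for q in range(n_proc // 2)}
    adj = {("p", j): set() for j in range(n_proc)}
    for j in range(n_proc):
        for l in range(j + 1, n_proc):
            if (j, l) not in removed:
                adj[("p", j)].add(("p", l))
                adj[("p", l)].add(("p", j))
    for kind, idx in (("i", in_idx), ("o", out_idx)):
        for j in idx:
            adj[(kind, j)] = {("p", j)}
            adj[("p", j)].add((kind, j))
    return adj


def max_processor_degree(adj):
    return max(len(nb) for v, nb in adj.items() if v[0] == "p")


for k in (1, 2, 3, 4):
    print(k, max_processor_degree(build_g3k(k)))
```

Under this reading the maximum processor degree is k + 2 for k = 1 and k + 3 for k = 2, 3, 4, in line with the note following the definition.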

Figure 2. The Construction for $G_{3,k}$ when n + k is even

Lemma 3.12 $G_{3,k}$ is k-gracefully-degradable.

The proof of this lemma is omitted due to space constraints. It is available in a technical report [6].


Figure 3. The Construction for $G_{3,k}$ when n + k is odd

3.3. Constructions for small k

This subsection presents degree-optimal standard solutions for $k \in \{1, 2, 3\}$ and for arbitrary $n \ge 1$. Some of our constructions are presented here without proof, because they were intuitively designed and exhaustively verified by human and/or computer checking. We call these special solutions. First we consider graphs for k = 1.

Theorem 3.13 For k = 1, there exist solution graphs of maximum degree k + 2 for all odd n and solution graphs of maximum degree k + 3 for all even n. Furthermore these solution graphs are degree-optimal for all n.

Proof: Degree-optimal constructions for the cases of n = 1 and n = 2 are obtained from Lemmas 3.7 and 3.9, as shown in Figure 4. Constructions for other values of n are obtained by Lemma 3.6 from the appropriate one of these graphs. All extensions of $G_{1,1}$ are degree-optimal by Corollary 3.3 and extensions of $G_{2,1}$ are degree-optimal by Lemma 3.5. □

Figure 4. Solution Graphs for k = 1 and n = 1, 2, 3 respectively

Note that applying Lemma 3.6 to $G_{1,1}$ gives a graph $G_{3,1}$, which is an example of our general construction for n = 3. This is illustrated in Figure 4.

For k = 2, we first prove a lemma required to establish the degree-optimality of our construction for k = 2 and n = 5.

Lemma 3.14 For k = 2 and n = 5 there is no standard solution graph for which the maximum processor degree is k + 2.

Proof: Suppose for the sake of contradiction that there exists a solution graph G for which the maximum degree of a processor node is k + 2. By Lemma 3.1, the degree of each processor node is at least k + 2 = 4, so each processor node has degree exactly 4. By Lemma 3.4, the number of processor neighbors of any processor node is at least k + 1 = 3. Hence each processor node can be adjacent to at most one terminal node (i.e., an input terminal or an output terminal). Since G is a standard solution, each terminal node is adjacent to a unique processor node. Since there are 2(k + 1) = 6 terminal nodes, and n + k = 7 processor nodes, there is exactly one processor node which has 4 processor neighbors. Let this processor node be z, and let z be adjacent to processor nodes {b, c, e, f}. Let the remaining processor nodes be {a, d}.
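The counting in this paragraph can be laid out as follows. The symbol $t(v)$, for the number of terminal neighbors of a processor node $v$, is introduced here only for this restatement, and $N_P(v)$ stands for the set of processor neighbors of $v$.

```latex
\begin{align*}
\deg(v) &= |N_P(v)| + t(v) = 4, \qquad |N_P(v)| \ge 3 \;\Longrightarrow\; t(v) \le 1, \\
\sum_{v} t(v) &= 2(k+1) = 6 \text{ over } n + k = 7 \text{ processor nodes}
  \;\Longrightarrow\; t(v) = 0 \text{ for exactly one } v, \\
t(z) = 0 &\;\Longrightarrow\; |N_P(z)| = 4 .
\end{align*}
```

Every processor other than z therefore has exactly one terminal neighbor and exactly three processor neighbors, which is the fact the case analysis relies on.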

We will investigate cases on the possible structure of G. For any node u, let $N_p(u)$ denote the set of its processor neighbors.

Case 1: a is not adjacent to d. It follows that $N_p(a), N_p(d) \subset \{b, c, e, f\}$.

Case 1a: $N_p(a) = N_p(d)$. Let $N_p(a) = \{b, c, e\}$ without loss of generality. Note that $N_p(b) = N_p(c) = N_p(e) = \{x, a, d\}$. Therefore there must exist a loop edge $(f, f)$ in G in order to ensure that f has 3 processor neighbors (see Figure 5), and this is a contradiction because G is a simple graph.

Figure 5. Processor subgraph when $(a, d) \notin E(G)$ and $N_p(a) = N_p(d)$

Case 1b: $N_p(a) \ne N_p(d)$. Let $N_p(a) = \{b, c, f\}$ and $N_p(d) = \{b, c, e\}$ without loss of generality. Note that $N_p(b) = N_p(c) = \{x, a, d\}$. As specified, $|N_p(b)| = |N_p(c)| = |N_p(a)| = |N_p(d)| = 3$ and $|N_p(e)| = |N_p(f)| = 2$, so there must exist an edge $(e, f)$. The resulting subgraph is shown in Figure 6. Now consider a possible fault set $F = \{a, d\}$. Clearly there is no path that spans the processor nodes of $G \setminus F$, hence no pipeline in $G \setminus F$, and this contradicts our assumption that G is a solution graph.

Figure 6. Processor subgraph when $(a, d) \notin E(G)$ and $N_p(a) \ne N_p(d)$

Case 2: a is adjacent to d. Each of the nodes $\{a, b, c, d, e, f\}$ must have a degree of 2 contributed by as-yet-unspecified edges. These unspecified edges can only form cycles (ignoring the specified edges). These cycles cannot be of length 1 or 2 (which correspond to self loops or duplicate edges respectively, since G must be a simple graph). Hence for 6 nodes, the following cases suffice.

Case 2a: These unspecified edges form two disjoint cycles of length 3 each. Clearly $(a, d) \in E(G)$, so $(a, d)$ is not in either of these two 3-cycles (or else a duplicate edge is formed). Hence a and d are in different cycles. Since $\{b, c, e, f\}$ are symmetrical as described, let $\{a, b, f\}$ be one 3-cycle, without loss of generality. The resulting subgraph is shown in Figure 7.

Figure 7. Processor subgraph when $(a, d) \in E(G)$ and there are two disjoint 3-cycles

Now consider a possible fault set $F = \{a, x\}$. Clearly there is no path that spans the processor nodes of $G \setminus F$, so there is no pipeline in $G \setminus F$, which contradicts our assumption that G is a solution graph.
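The two "clearly there is no path" claims in Cases 1b and 2a can be checked mechanically. The sketch below is an illustration only: it encodes the processor subgraphs as read from the case descriptions (the processor with 4 processor neighbors is called x in this sketch), and the helper name `spanning_paths` is hypothetical. It enumerates every ordering of the healthy processors and keeps those that form a path.

```python
from itertools import permutations

def spanning_paths(edges, faults):
    """List every path that visits all non-faulty nodes exactly once."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    healthy = sorted(set(adj) - set(faults))
    return [
        order
        for order in permutations(healthy)
        if order[0] < order[-1]  # count each undirected path once
        and all(b in adj[a] for a, b in zip(order, order[1:]))
    ]

# x is the single processor with 4 processor neighbors {b, c, e, f}.
HUB = [("x", v) for v in "bcef"]

# Case 1b: N_p(a) = {b, c, f}, N_p(d) = {b, c, e}, plus the forced edge (e, f).
case_1b = HUB + [("a", "b"), ("a", "c"), ("a", "f"),
                 ("d", "b"), ("d", "c"), ("d", "e"), ("e", "f")]

# Case 2a: edge (a, d) plus the 3-cycles {a, b, f} and {d, c, e}.
case_2a = HUB + [("a", "d"),
                 ("a", "b"), ("b", "f"), ("f", "a"),
                 ("d", "c"), ("c", "e"), ("e", "d")]

print(spanning_paths(case_1b, {"a", "d"}))  # no spanning path survives
print(spanning_paths(case_2a, {"a", "x"}))  # no spanning path survives
```

In Case 1b the faults leave b and c attached only to x, and in Case 2a they disconnect the edge (b, f) from the triangle on {c, d, e}, which is why both searches come back empty.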

Case 2b: The unspecified edges form one cycle of length 6. Consider cases on the distance between a and d on this 6-cycle. Clearly it cannot be zero or one. For this case we will write $\mathit{same}(u, v)$ when the terminal nodes adjacent to u and v must both be input terminals or must both be output terminals, and $\neg\mathit{same}(u, v)$ when the terminal nodes adjacent to u and v must be of opposite kinds. Note that when a processor subgraph contains a unique spanning path with endpoints u and v, then $\neg\mathit{same}(u, v)$. This is how we will infer the value of the predicate in the following subcases.

Case 2b1: The distance between a and d on the 6-cycle is 2. Since $\{b, c, e, f\}$ are symmetrical as described prior to now, let the 6-cycle be $\mathit{cycle}(a, b, c, e, d, f)$ without loss of generality. This subgraph is shown in Figure 8. Clearly $F = \{a, x\} \Rightarrow \neg\mathit{same}(b, f)$ and $F = \{c, x\} \Rightarrow \neg\mathit{same}(b, e)$ imply that $\mathit{same}(e, f)$. However, $F = \{d, x\} \Rightarrow \neg\mathit{same}(e, f)$, which is a contradiction.

Figure 8. Processor subgraph when $(a, d) \in E(G)$ and the distance from a to d on the 6-cycle is 2

Case 2b2: The distance between a and d on the 6-cycle is 3. Since $\{b, c, e, f\}$ are symmetrical as described prior to now, let the 6-cycle be $\mathit{cycle}(a, b, c, d, e, f)$ without loss of generality. The resulting subgraph is shown in Figure 9. In this case, $F = \{b, d\} \Rightarrow \neg\mathit{same}(a, c)$ and $F = \{f, d\} \Rightarrow \neg\mathit{same}(a, e)$ together imply that $\mathit{same}(c, e)$. However, $F = \{x, d\} \Rightarrow \neg\mathit{same}(c, e)$, which is a contradiction.

Figure 9. Processor subgraph when $(a, d) \in E(G)$ and the distance from a to d on the 6-cycle is 3

Hence no such supergraph exists. □
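The parity argument of Case 2b can be replayed by machine. The following sketch rests on assumptions: the graphs are encoded from the two 6-cycles named in the subcases, the degree-4 processor is called x, and the helper names are hypothetical. For each fault set it collects the endpoint pairs of all spanning paths of the surviving processor subgraph; when only one pair is possible, the terminals at those two endpoints must be of opposite kinds. It then tests whether any input/output assignment satisfies all three constraints.

```python
from itertools import permutations, product

def endpoint_pairs(edges, faults):
    """Endpoint pairs of all spanning paths of the non-faulty subgraph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    healthy = sorted(set(adj) - set(faults))
    pairs = set()
    for order in permutations(healthy):
        if all(b in adj[a] for a, b in zip(order, order[1:])):
            pairs.add(tuple(sorted((order[0], order[-1]))))
    return pairs

def forced_constraints(edges, fault_sets):
    """One 'opposite kinds' constraint per fault set with a unique endpoint pair."""
    constraints = []
    for faults in fault_sets:
        pairs = endpoint_pairs(edges, faults)
        if len(pairs) != 1:
            raise ValueError(f"endpoints not unique for faults {faults}")
        constraints.append(next(iter(pairs)))
    return constraints

def satisfiable(constraints):
    """Can terminals be labelled in/out so every constrained pair differs?"""
    nodes = sorted({u for pair in constraints for u in pair})
    for kinds in product(("in", "out"), repeat=len(nodes)):
        kind = dict(zip(nodes, kinds))
        if all(kind[u] != kind[v] for u, v in constraints):
            return True
    return False

def cycle(nodes):
    """Edges of the cycle that visits `nodes` in order."""
    return list(zip(nodes, nodes[1:] + nodes[:1]))

HUB = [("x", v) for v in "bcef"] + [("a", "d")]
case_2b1 = HUB + cycle("abcedf")   # a and d at distance 2 on the 6-cycle
case_2b2 = HUB + cycle("abcdef")   # a and d at distance 3 on the 6-cycle

c1 = forced_constraints(case_2b1, [("a", "x"), ("c", "x"), ("d", "x")])
c2 = forced_constraints(case_2b2, [("b", "d"), ("f", "d"), ("x", "d")])
print(c1, satisfiable(c1))
print(c2, satisfiable(c2))
```

In both subcases the three constraints form an odd cycle on three processors, so no assignment of two terminal kinds can satisfy them, which matches the contradiction reached in the proof.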

Theorem 3.15 For $k = 2$, there exist solution graphs of maximum degree $k + 3$ for $n \in \{2, 3, 5\}$ and solution graphs of maximum degree $k + 2$ for all other n. Furthermore, for all n, these solution graphs are degree-optimal.

Proof: A degree-optimal construction for $n = 2$ is obtained from Lemma 3.9. Our solution for $n = 3$ is degree-optimal by Lemma 3.12. By Lemma 3.14, the extension $G_{5,2}$ of $G_{2,2}$ defined by Lemma 3.9 is degree-optimal for $k = 2$ and $n = 5$. Note that these solutions for $n \in \{2, 3, 5\}$ have degree $k + 3$.

A degree-optimal construction for $n = 1$ is obtained from Lemma 3.7. Our degree-optimal solution for $n = 4$ (and $n = 7$) is obtained by applying Lemma 3.6 once (twice) to $G_{1,2}$ as defined by Lemma 3.7. We define special solutions for $n = 6$ and $n = 8$, and these are shown in Figure 10 and Figure 11. Note that these have maximum degree $k + 2$.

Figure 10. A special solution graph $G_{6,2}$

Figure 11. A special solution graph $G_{8,2}$

Degree-optimal constructions for $n \ge 9$ are obtained by applying Lemma 3.6 to the appropriate construction from the set $G_{6,2}$, $G_{7,2}$ and $G_{8,2}$. These and their extensions are degree-optimal by Corollary 3.3. □
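Choosing "the appropriate construction" amounts to a residue computation. The sketch below assumes that one application of Lemma 3.6 turns a solution for n into a solution for n + k + 1; that step size is inferred from the examples in the text ($G_{1,2}$ to $G_{4,2}$ to $G_{7,2}$), not quoted from the lemma itself, and the function name `derive` is hypothetical.

```python
def derive(n, k, bases):
    """Return (base_n, applications) with n = base_n + applications * (k + 1),
    using the largest usable base, or None if no base reaches n.

    Assumes each application of the extension lemma adds k + 1 processors.
    """
    step = k + 1
    usable = [b for b in bases if b <= n and (n - b) % step == 0]
    if not usable:
        return None
    base = max(usable)
    return base, (n - base) // step

# k = 2: the bases 6, 7, 8 cover all three residues modulo k + 1 = 3.
for n in range(9, 15):
    print(n, derive(n, 2, [6, 7, 8]))
```

Because three consecutive bases hit every residue class modulo 3, every $n \ge 9$ is reached from exactly one of them, which is why three base graphs suffice for k = 2.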

Now we consider graphs for $k = 3$.

Theorem 3.16 For k = 3, there exist solution graphs of maximum degree k + 2 for all odd n and solution graphs of maximum degree k + 3 for all even n. Furthermore, for all n, these solution graphs are degree-optimal.

Proof: Lemma 3.7 defines a solution for $n = 1$, and on applying Lemma 3.6, we obtain $G_{5,3}$, which is degree-optimal by Corollary 3.3. Our special solution $G_{7,3}$, illustrated in Figure 12, is degree-optimal by Corollary 3.3.

Degree-optimal constructions for larger odd values of n ($n \ge 9$) are obtained by applying Lemma 3.6 to the appropriate construction from $\{G_{5,3}, G_{7,3}\}$. The degree-optimality of all extensions of $G_{5,3}$ and $G_{7,3}$ follows from Corollary 3.3.

Figure 12. A special degree-optimal solution graph for k = 3 and n = 7

Lemma 3.9 defines a solution for $n = 2$, and on applying Lemma 3.6 to $G_{2,3}$, we obtain the solution graph $G_{6,3}$, which is degree-optimal by Lemma 3.5. We also have the solution $G_{3,3}$ from Lemma 3.12 (which is degree-optimal by Lemma 3.11), and a special solution $G_{4,3}$ illustrated in Figure 13, which is degree-optimal by Lemma 3.5.

Figure 13. A special degree-optimal solution graph for k = 3 and n = 4

Degree-optimal constructions for larger even values of n ($n \ge 8$) are obtained by applying Lemma 3.6 to the appropriate construction from the set $\{G_{4,3}, G_{6,3}\}$. These constructions have degree $k + 3$ (from Lemma 3.6) and are degree-optimal (from Lemma 3.5). □

3.4. Asymptotic constructions

We now present asymptotic constructions for k-gracefully-degradable pipeline graphs. In particular, it will be assumed throughout that $k \ge 4$, and for any given value of k, it will be assumed that n is sufficiently large. Although we will not exactly quantify the necessary relationship between k and n, it is easily verified that n is only required to be linear in k.

We will begin by defining the graph $G'_{n,k}$, which we call the extended graph. This is not the actual gracefully-degradable graph that we will use, but it contains the gracefully-degradable graph $G_{n,k}$ and it has a more regular structure.

Definition: A circulant graph [10] is specified by a positive integer l and a set of positive integers S. It is a graph with l nodes $\{0, \ldots, l - 1\}$ in which node i is adjacent to another node j if and only if $j = (i \pm s) \pmod{l}$ for some $s \in S$. The set S is called the set of offsets.
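As an illustration of this definition, a circulant graph can be generated directly from l and the offset set. The function name and the adjacency-dictionary representation below are choices made for this sketch.

```python
def circulant(l, offsets):
    """Adjacency sets of the circulant graph on nodes 0..l-1.

    Node i is adjacent to every other node j with j = (i + s) mod l or
    j = (i - s) mod l for some offset s.
    """
    adj = {i: set() for i in range(l)}
    for i in range(l):
        for s in offsets:
            for j in ((i + s) % l, (i - s) % l):
                if j != i:  # "another node": no self loops
                    adj[i].add(j)
                    adj[j].add(i)
    return adj

ring_with_chords = circulant(8, {1, 2})
print(sorted(ring_with_chords[0]))

# An offset equal to l / 2 contributes only one neighbor per node,
# because i + s and i - s coincide modulo l.
with_diameter = circulant(6, {1, 3})
print({v: len(nbrs) for v, nbrs in with_diameter.items()})
```

The second example shows why an offset of exactly half the node count raises the degree by one rather than two, a point that matters for the bisector edges used later in this section.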

Definitions: The graph $G'_{n,k} = (V', E')$ where $|V'| = n + 3k + 6$. The vertices $V'$ are partitioned into six sets $T_i'$, $T_o'$, $I'$, $O'$, $S'$ and $R'$ where $|T_i'| = |T_o'| = |I'| = |O'| = |S'| = k + 2$, and $|R'| = n - 2k - 4$. We also define $C' = S' \cup R'$. The nodes in each of the sets $T_i'$, $T_o'$, $I'$, $O'$ and $S'$ are labeled with unique integers in the range $0 \ldots k + 1$, and the nodes in $R'$ are labeled with unique integers in the range $k + 2 \ldots n - k - 3$. Given any node $x \in V'$, the integer label of x will be denoted $\mathit{label}(x)$.
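As a quick consistency check on these sizes (a worked calculation added for illustration, writing $T_i'$ and $T_o'$ for the two terminal sets), the set sizes add up to the stated node count, and the label range given for $R'$ has exactly $|R'|$ values:

```latex
\begin{align*}
|V'| &= |T_i'| + |T_o'| + |I'| + |O'| + |S'| + |R'|
      = 5(k+2) + (n - 2k - 4) = n + 3k + 6,\\
|C'| &= |S'| + |R'| = (k+2) + (n - 2k - 4) = n - k - 2,\\
|R'| &= (n - k - 3) - (k + 2) + 1 = n - 2k - 4 .
\end{align*}
```

The labels used in $C'$ therefore run over $0, \ldots, n - k - 3$ without gaps, and $|R'| \ge 0$ requires $n \ge 2k + 4$, which is consistent with the assumption that n is sufficiently large relative to k.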

The edges $E'$ are defined as follows. Let $m = |C'| = n - k - 2$ and let $p = \lfloor k/2 \rfloor$.

• for each $x \in T_i'$ and $y \in I'$ where $\mathit{label}(x) = \mathit{label}(y)$, $(x, y) \in E'$,

• for each $x \in I'$ and $y \in S'$ where $\mathit{label}(x) = \mathit{label}(y)$, $(x, y) \in E'$,

• for each $x \in S'$ and $y \in O'$ where $\mathit{label}(x) = \mathit{label}(y)$, $(x, y) \in E'$,

• for each $x \in O'$ and $y \in T_o'$ where $\mathit{label}(x) = \mathit{label}(y)$, $(x, y) \in E'$,

• for each $x, y \in I'$ where $x \ne y$, $(x, y) \in E'$,

• for each $x, y \in O'$ where $x \ne y$, $(x, y) \in E'$,

• for each $x, y \in C'$ where $\mathit{label}(y) = \mathit{label}(x) + z \pmod{m}$ for some integer z in the range 1 through $p + 1$, $(x, y) \in E'$, and

• if k is odd, then for each $x, y \in C'$ where $\mathit{label}(y) = \mathit{label}(x) + \lfloor m/2 \rfloor \pmod{m}$, $(x, y) \in E'$. We call these edges bisectors.

Note that the nodes in $I'$ form a clique and the nodes in $O'$ form a clique. In addition, there are edges between corresponding nodes in the sets $T_i'$ and $I'$, the sets $I'$ and $S'$, the sets $S'$ and $O'$, and the sets $O'$ and $T_o'$. The nodes in $C'$ form a circulant graph with m nodes and offsets $\{1, 2, \ldots, p + 1, \lfloor m/2 \rfloor\}$ (if k is odd) or $\{1, 2, \ldots, p + 1\}$ (otherwise). This particular circulant subgraph is a supergraph of Hayes's construction [13] with the same maximum degree.
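The degree of a node within this circulant subgraph can be worked out from the offsets. The calculation below is an added illustration that takes $p = \lfloor k/2 \rfloor$ and the bisector offset $\lfloor m/2 \rfloor$, and assumes m is large enough that all offsets are distinct and smaller than or equal to $m/2$. Each ordinary offset contributes two neighbors, while the bisector contributes one neighbor when m is even and two when m is odd:

```latex
\[
\deg_{C'}(x) =
\begin{cases}
2(p+1) = k + 2, & k \text{ even},\\
2(p+1) + 1 = k + 2, & k \text{ odd},\ m \text{ even},\\
2(p+1) + 2 = k + 3, & k \text{ odd},\ m \text{ odd}.
\end{cases}
\]
```

Since $m = n - k - 2$, for odd k the value m is even exactly when n is odd, so the degree $k + 3$ arises only when n is even and k is odd. This counts only the edges inside $C'$, not the edges joining $S'$ to $I'$ and $O'$.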

We will now define $G_{n,k}$, which is the actual gracefully-degradable graph. Intuitively, $G_{n,k}$ is obtained from $G'_{n,k}$ by deleting the nodes in $T_i'$ and $I'$ with label 0, the nodes in $T_o'$ and $O'$ with label $k + 1$, and the length 1 edges between nodes in $S'$.

Definitions: Given the graph $G'_{n,k} = (V', E')$, let $G_{n,k} = (V, E)$ be the subgraph of $G'_{n,k}$ such that

• $V = V' \setminus (\{x \in T_i' \cup I' : \mathit{label}(x) = 0\} \cup \{x \in T_o' \cup O' : \mathit{label}(x) = k + 1\})$,

• the nodes in V are partitioned into six sets $T_i$, $T_o$, $I$, $O$, $S$, and $R$ where $T_i \subset T_i'$, $T_o \subset T_o'$, $I \subset I'$, $O \subset O'$, $S = S'$ and $R = R'$,

• the set C is defined as $S \cup R$, and

• $E = \{(x, y) \in E' : x, y \in V\} \setminus \{(x, y) : x, y \in S \text{ and } |\mathit{label}(x) - \mathit{label}(y)| = 1\}$.

Note that $G_{n,k}$ has $n + 3k + 2$ nodes and each node in $T_i \cup T_o$ has degree 1, so $G_{n,k}$ is a standard graph. Also, note that if k is even, or if both n and k are odd, each node in $I \cup O \cup C$ has degree $k + 2$. Finally, note that if n is even and k is odd, the maximum degree of $G_{n,k}$ is $k + 3$ (as required by Lemma 3.5). Hence $G_{n,k}$ is degree-optimal.
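The node-count and degree claims can be checked by building the graph explicitly. The sketch below is one reading of the definitions, with $p = \lfloor k/2 \rfloor$ and bisector offset $\lfloor m/2 \rfloor$ taken as assumptions; the node encoding as (set name, label) pairs and the function names are illustrative only. It checks counts and degrees, and does not test k-graceful degradability.

```python
def build_G(n, k):
    """Build G_{n,k} as (nodes, edges); nodes are (set_name, label) pairs."""
    m = n - k - 2          # number of nodes in C = S u R
    p = k // 2             # assumed reading of p
    small = range(k + 2)   # labels 0 .. k+1

    nodes = {(s, i) for s in ("Ti", "To", "I", "O", "S") for i in small}
    nodes |= {("R", i) for i in range(k + 2, m)}

    def c(i):
        """The node of C' carrying label i."""
        return ("S", i) if i <= k + 1 else ("R", i)

    edges = set()

    def add(u, v):
        if u != v:
            edges.add(frozenset((u, v)))

    for i in small:
        add(("Ti", i), ("I", i))
        add(("I", i), ("S", i))
        add(("S", i), ("O", i))
        add(("O", i), ("To", i))
        for j in small:                 # cliques on I' and O'
            add(("I", i), ("I", j))
            add(("O", i), ("O", j))

    offsets = list(range(1, p + 2))
    if k % 2 == 1:
        offsets.append(m // 2)          # bisector edges
    for i in range(m):
        for z in offsets:
            add(c(i), c((i + z) % m))

    # Pass from the extended graph G' to G.
    removed = {("Ti", 0), ("I", 0), ("To", k + 1), ("O", k + 1)}
    nodes -= removed

    def is_short_s_edge(e):
        u, v = tuple(e)
        return u[0] == "S" and v[0] == "S" and abs(u[1] - v[1]) == 1

    edges = {e for e in edges if not (e & removed) and not is_short_s_edge(e)}
    return nodes, edges

def degrees(nodes, edges):
    deg = {v: 0 for v in nodes}
    for e in edges:
        for v in e:
            deg[v] += 1
    return deg

for n, k in [(22, 4), (25, 5), (26, 5)]:
    V, E = build_G(n, k)
    deg = degrees(V, E)
    proc = [d for v, d in deg.items() if v[0] not in ("Ti", "To")]
    print(n, k, len(V), min(proc), max(proc))
```

For the parameters of Figure 14 (n = 22, k = 4) every processor node comes out with degree k + 2 = 6, and for n = 26, k = 5 the circulant nodes reach k + 3 = 8 because the bisector offset adds two neighbors when m is odd.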

Figure 14. Example: $G_{22,4}$, showing node sets and node labels

The construction for $k = 4$ and $n = 22$ is illustrated in Figure 14, showing the node sets $T_i$, $T_o$, $I$, $O$, $S$ and $R$. Figure 14 also shows the integer label $\mathit{label}(x)$ for each node x. Another example is $G_{26,5}$, which has bisectors, as illustrated in Figure 15. Our main result is that this node-optimal and degree-optimal construction is in fact k-gracefully-degradable.

Theorem 3.17 $G_{n,k}$ is k-gracefully-degradable.

The proof of this theorem is quite long, and is omitted due to space limitations. It is available in the technical report [6].


Figure 15. Example: $G_{26,5}$, showing node sets and bisector edges

4. Conclusions

In this paper we have defined a graph model for the graceful degradation of pipelines. We presented degree-optimal and node-optimal standard k-gracefully-degradable graphs for $n \in \{1, 2, 3\}$ given any k, and for $k \in \{1, 2, 3\}$ given any n. We also presented a construction for large k which was shown to be k-gracefully-degradable for sufficiently large n (specifically $n = \Omega(k)$).

References

[1] I. Agi, P. Hurst, and W. Current. A pipelined IC architecture for radon transform computations in a multiprocessor array. VLSI Signal Processing IV, pages 442-451, 1991.

[2] V. Balasubramanian and P. Banerjee. A fault-tolerant massively parallel processing architecture. J. of Parallel and Distributed Computing, 4:363-383, 1987.

[3] J. Bruck, R. Cypher, and C.-T. Ho. Fault-tolerant meshes and hypercubes with minimal numbers of spares. IEEE Trans. Comput., 42(9):1089-1104, 1993.

[4] J. Bruck, R. Cypher, and C.-T. Ho. Fault-tolerant de Bruijn and shuffle-exchange networks. IEEE Trans. on Parallel and Distributed Systems, 5(5):548-553, 1994.

[5] J. Bruck, R. Cypher, and C.-T. Ho. Tolerating faults in a mesh with a row of spare nodes. Theoretical Comput. Sci., 128:241-252, 1994.

[6] R. Cypher and A. Laing. Gracefully degradable pipeline networks. Technical Report JHU-96/07, Computer Science Dept, The Johns Hopkins University, 1996. Available at ftp://ftp.cs.jhu.edu/pub/commlab/kgdp.ps.

[7] S. Dutt and J. Hayes. On designing and reconfiguring k-fault-tolerant tree architectures. IEEE Trans. Comput., 39(4):490-503, 1990.

[8] S. Dutt and J. Hayes. Designing fault-tolerant systems using automorphisms. J. of Parallel and Distributed Computing, 12:249-268, 1991.

[9] S. Dutt and J. Hayes. Some practical issues in the design of fault-tolerant multiprocessors. In Proc. 21st Intl. Symp. on Fault-Tolerant Computing, pages 292-299, 1991.

[10] B. Elspas and J. Turner. Graphs with circulant adjacency matrices. J. Combinatorial Theory, 9:297-307, 1970.

[11] F. Harary. Graph Theory. Addison-Wesley, Reading, MA, 1969.

[12] A. Hassan and V. Aggarwal. A modular approach to fault-tolerant modular tree architectures. In Proc. Fifteenth Fault Tolerant Comput. Symp., pages 344-349, June 1985.

[13] J. Hayes. A graph model for fault-tolerant computing systems. IEEE Trans. Comput., 25(9):875-884, 1976.

[14] C. Kwan and S. Toida. An optimal 2-fault tolerant realization of symmetric hierarchical tree systems. Networks, 12:231-239, 1982.

[15] T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann Publishers, San Mateo, CA, 1992.

[16] M. Lowrie and W. Fuchs. Reconfigurable tree architectures using subtree oriented fault tolerance. IEEE Trans. Comput., pages 1172-1182, Oct. 1987.

[17] C. Raghavendra, A. Avizienis, and M. Ercegovac. Fault tolerance in binary tree architectures. IEEE Trans. Comput., pages 568-572, June 1984.

[18] A. Rosenberg. The Diogenes approach to testable fault-tolerant VLSI processor arrays. IEEE Trans. Comput., 32(10):902-910, 1983.

[19] J. Storer, J. Reif, and T. Markas. A massively parallel VLSI design for data compression using a compact dynamic dictionary. VLSI Signal Processing IV, pages 329-338, 1991.

[20] U. Thoeni. Programming Real-Time Multicomputers for Signal Processing. Prentice Hall Intl (UK), Hertfordshire, UK, 1994.

[21] D. West. Introduction to Graph Theory. Prentice Hall, Upper Saddle River, NJ, 1996.

[22] R. Zito-Wolf. A systolic architecture for sliding-window data compression. VLSI Signal Processing IV, pages 339-351, 1991.