Chapter 8: Graphs

Objectives

Looking ahead, in this chapter we’ll consider:
• Graph Representation
• Graph Traversals
• Shortest Paths
• Cycle Detection
• Spanning Trees
• Connectivity

Data Structures and Algorithms in C++, Fourth Edition

Objectives (continued)

• Topological Sort
• Networks
• Matching
• Eulerian and Hamiltonian Graphs
• Graph Coloring
• NP-Complete Problems in Graph Theory

Introductory Remarks

• Although trees are quite flexible, they have an inherent limitation in that they can only express hierarchical structures

• Fortunately, we can generalize a tree to form a graph, in which this limitation is removed

• Informally, a graph is a collection of nodes and the connections between them

• Figure 8.1 illustrates some examples of graphs; notice there is typically no limitation on the number of vertices or edges

• Consequently, graphs are extremely versatile and applicable to a wide variety of situations

• Graph theory has developed into a sophisticated field of study since its origins in the early 1700s

Introductory Remarks (continued)

Fig. 8.1 Examples of graphs: (a–d) simple graphs; (c) a complete graph K4; (e) a multigraph; (f) a pseudograph; (g) a circuit in a digraph; (h) a cycle in the digraph

Introductory Remarks (continued)

• And, while many results are theoretical, the applications of graphs are numerous and worth consideration

• First, though, we need to consider some definitions

• A simple graph G = (V, E) consists of a (finite) set denoted by V, and a collection E of unordered pairs {u, v} of distinct elements from V

• Each element of V is called a vertex or a point or a node, and each element of E is called an edge or a line or a link

• The number of vertices, the cardinality of V, is called the order of the graph and denoted by |V|

• The cardinality of E, called the size of the graph, is denoted by |E|

Introductory Remarks (continued)

• A graph G = (V, E) is directed if the edge set is composed of ordered vertex (node) pairs

• Now these definitions restrict the number of edges that can occur between any two vertices to one

• If we allow multiple edges between any two vertices, we have a multigraph (Figure 8.1e)

• Formally, a multigraph is defined as G = (V, E, f), where V is the set of vertices, E the set of edges, and f : E → {{vi, vj} : vi, vj ∈ V and vi ≠ vj} is a function defining edges as pairs of distinct vertices

• A pseudograph is a multigraph that drops the vi ≠ vj condition, allowing the graph to have loops (Figure 8.1f)

Introductory Remarks (continued)

• A path between vertices v1 and vn is a sequence of edges (v1, v2), (v2, v3), …, (vn-1, vn), denoted v1, v2, …, vn-1, vn

• If v1 = vn, and the edges don’t repeat, it is a circuit (Figure 8.1g); if, in addition, all the vertices in a circuit are different (apart from v1 = vn), it is a cycle (Figure 8.1h)

• A weighted graph assigns a value to each edge, based on contextual usage

• A complete graph of n vertices, denoted Kn, has exactly one edge between each pair of vertices (Figure 8.1c)

• The edge count is $|E| = \binom{|V|}{2} = \frac{|V|!}{2!\,(|V|-2)!} = \frac{|V|(|V|-1)}{2} = O(|V|^2)$

Introductory Remarks (continued)

• A subgraph of a graph G, designated G′, is the graph (V′, E′) where V′ ⊆ V and E′ ⊆ E

• If E′ contains every edge of E whose endpoints both lie in V′, the subgraph is said to be induced by its vertices V′

• Two vertices are adjacent if the edge defined by them is in E

• That edge is called incident with the vertices

• The number of edges incident with a vertex v is the degree of the vertex; if the degree is 0, v is called isolated

• Notice that the definition of a graph allows the set E to be empty, so a graph may be composed of isolated vertices

Graph Representation

• Graphs can be represented in a number of ways

• One of the simplest is an adjacency list, where each vertex adjacent to a given vertex is listed

• This can be designed as a table (known as a star representation) or a linked list, shown in Figure 8.2b-c on page 393

• Another representation is as a matrix, which can be designed in two ways

• An adjacency matrix is a |V| x |V| binary matrix where:

$$a_{ij} = \begin{cases} 1 & \text{if there exists an edge } (v_i v_j) \\ 0 & \text{otherwise} \end{cases}$$
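The two representations above can be sketched in C++ as follows. This is a minimal illustration, not the book's code: vertices are assumed to be numbered 0 to n-1, the graph is assumed to be simple and undirected, and the names `EdgeList`, `makeAdjacencyList`, and `makeAdjacencyMatrix` are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical alias: each pair {u, v} is one undirected edge.
using EdgeList = std::vector<std::pair<int, int>>;

// Adjacency list: adj[v] holds every vertex adjacent to v.
std::vector<std::vector<int>> makeAdjacencyList(int n, const EdgeList& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const auto& [u, v] : edges) {
        assert(0 <= u && u < n && 0 <= v && v < n && u != v);  // simple graph: no loops
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
    return adj;
}

// Adjacency matrix: a[i][j] = 1 if an edge joins vi and vj, 0 otherwise.
std::vector<std::vector<int>> makeAdjacencyMatrix(int n, const EdgeList& edges) {
    std::vector<std::vector<int>> a(n, std::vector<int>(n, 0));
    for (const auto& [u, v] : edges) {
        assert(0 <= u && u < n && 0 <= v && v < n && u != v);
        a[u][v] = 1;
        a[v][u] = 1;  // undirected, so the matrix is symmetric
    }
    return a;
}
```

The list uses space proportional to |V| + |E|, while the matrix always uses |V| x |V| entries but answers "is there an edge between vi and vj?" with a single lookup.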

Graph Representation (continued)

• An example of an adjacency matrix is shown in Figure 8.2d

• The order of the vertices in the matrix is arbitrary, so there are n! possible matrices for a graph of n vertices

• It is also possible to generalize the adjacency matrix definition to handle a multigraph by defining aij = number of edges between vi and vj

• A second matrix representation is based on incidences, hence the name incidence matrix

• An incidence matrix is a |V| x |E| binary matrix where:

$$a_{ij} = \begin{cases} 1 & \text{if edge } e_j \text{ is incident with vertex } v_i \\ 0 & \text{otherwise} \end{cases}$$
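As a small illustration of the incidence matrix definition, the sketch below builds the |V| x |E| matrix from an edge list. The function name `makeIncidenceMatrix` and the 0-based numbering of vertices and edges are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// a[i][j] = 1 if edge ej is incident with vertex vi, 0 otherwise.
std::vector<std::vector<int>> makeIncidenceMatrix(
        int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> a(n, std::vector<int>(edges.size(), 0));
    for (std::size_t j = 0; j < edges.size(); ++j) {
        const int u = edges[j].first;
        const int v = edges[j].second;
        assert(0 <= u && u < n && 0 <= v && v < n);
        a[u][j] = 1;
        a[v][j] = 1;  // for a loop (u == v) the column ends up with a single 1
    }
    return a;
}
```

Each column describes one edge, so a column of an ordinary edge contains exactly two 1s.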

Graph Representation (continued)

• An example of an incidence matrix is shown in Figure 8.2e

• For a multigraph, some columns are identical, and a column with a single 1 represents a loop

• As for usage, the proper structure depends to a great extent on the kinds of operations that need to be done

Graph Traversals

• Like tree traversals, graph traversals visit each node once

• However, we cannot apply tree traversal algorithms to graphs because of cycles and isolated vertices

• One algorithm for graph traversal, called the depth-first search, was developed by John Hopcroft and Robert Tarjan in 1974

• In this algorithm, each vertex is visited and then all the unvisited vertices adjacent to that vertex are visited

• If the vertex has no adjacent vertices, or if they have all been visited, we backtrack to that vertex’s predecessor

• This continues until we return to the vertex where the traversal started

Graph Traversals (continued)

• If any vertices remain unvisited at this point, the traversal restarts at one of the unvisited vertices

• Although not necessary, the algorithm assigns unique numbers to the vertices, so they are renumbered

• Pseudocode for this algorithm is shown on page 395

• Figure 8.3 shows an example of this traversal; the numbers indicate the order in which the nodes are visited; the solid lines indicate the edges traversed during the search

Fig. 8.3 An example of application of the depthFirstSearch() algorithm to a graph
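The traversal described above can be sketched in C++ roughly as follows. This is not the pseudocode from page 395; it is a minimal recursive version that assumes an adjacency-list graph with vertices numbered 0 to n-1, and the names `DfsResult` and `dfsVisit` are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct DfsResult {
    std::vector<int> num;                        // num[v] = visiting order (1-based); 0 = unvisited
    std::vector<std::pair<int, int>> treeEdges;  // edges actually followed by the search
};

// Number v, then visit every still-unvisited neighbor; returning from the
// call is the "backtrack to the predecessor" step.
void dfsVisit(const std::vector<std::vector<int>>& adj, int v, int& counter, DfsResult& r) {
    assert(0 <= v && v < static_cast<int>(adj.size()));
    r.num[v] = ++counter;
    for (int u : adj[v]) {
        if (r.num[u] == 0) {
            r.treeEdges.push_back({v, u});
            dfsVisit(adj, u, counter, r);
        }
    }
}

// Restart from each unvisited vertex so that every vertex is eventually numbered.
DfsResult depthFirstSearch(const std::vector<std::vector<int>>& adj) {
    DfsResult r;
    r.num.assign(adj.size(), 0);
    int counter = 0;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (r.num[v] == 0) dfsVisit(adj, v, counter, r);
    return r;
}
```

Edges that lead to an already numbered vertex are simply skipped, which is why the recorded edges never form a cycle.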

Graph Traversals (continued)

• The algorithm guarantees that we will create a tree (or a forest, which is a set of trees) including the graph’s vertices

• Such a tree is called a spanning tree

• The guarantee is based on the algorithm not processing any edge that leads to an already visited node

• Consequently, some edges are not included in the tree (marked with dashed lines)

• The edges included in the tree are called forward edges; those omitted are called back edges

• In Figure 8.4, we can see this algorithm applied to a digraph, which is a graph where the edges have a direction

Graph Traversals (continued)

Fig. 8.4 The depthFirstSearch() algorithm applied to a digraph

• Notice in this case we end up with a forest of three trees, because the traversal must follow the direction of the edges

• There are a number of algorithms based on depth-first searching

• However, some are more efficient if the underlying mechanism is breadth-first instead

Graph Traversals (continued)

• Recall from our consideration of tree traversals that depth-first traversals used a stack, while breadth-first used queues

• This can be extended to graphs, as the pseudocode on page 397 illustrates

• Figure 8.5 shows this applied to a graph; Figure 8.6 shows the application to a digraph

• In both, the basic operation is to mark all the vertices accessible from a given vertex, placing them in a queue as they are visited

• The first vertex in the queue is then removed, and the process repeated

• No visited nodes are revisited; if a node has no accessible nodes, the next node in the queue is removed and processed
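A queue-based traversal along these lines might look like the sketch below. It is not the pseudocode from page 397; it assumes an adjacency-list graph with vertices numbered 0 to n-1 and returns the visiting order so the result is easy to inspect.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Returns the vertices in the order in which breadth-first search visits them.
std::vector<int> breadthFirstSearch(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<bool> visited(n, false);
    std::vector<int> order;
    std::queue<int> q;
    for (int s = 0; s < n; ++s) {      // restart for every vertex not reached so far
        if (visited[s]) continue;
        visited[s] = true;
        order.push_back(s);
        q.push(s);
        while (!q.empty()) {
            const int v = q.front();   // the first vertex in the queue is removed...
            q.pop();
            for (int u : adj[v]) {     // ...and its unvisited neighbors are marked and enqueued
                assert(0 <= u && u < n);
                if (!visited[u]) {
                    visited[u] = true;
                    order.push_back(u);
                    q.push(u);
                }
            }
        }
    }
    return order;
}
```

Marking a vertex at the moment it is enqueued, rather than when it is dequeued, is what keeps any vertex from entering the queue twice.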

Graph Traversals (continued)

Fig. 8.5 An example of application of the breadthFirstSearch() algorithm to a graph

Fig. 8.6 The breadthFirstSearch() algorithm applied to a digraph

Shortest Paths

• A classical problem in graph theory is finding the shortest path between two nodes, and numerous approaches have been suggested

• The edges of the graph are associated with values denoting such things as distance, time, costs, amounts, etc.

• To determine the distance between two vertices, say v and u, we must keep track of the distances to the intermediate vertices w on the path

• This can be recorded as a label associated with the vertices

• The label may simply be the distance from v to w, or that distance along with the current node’s predecessor in the path

• Methods for finding shortest paths depend on these labels

Shortest Paths (continued)

• Based on how many times the labels are updated, solutions to the shortest path problem fall into two groups

• In label-setting methods, each pass through the vertices that remain to be processed sets one vertex to a value that remains unchanged for the rest of the execution

• The main drawback is that such methods cannot process graphs that have negative weights on any edges

• In label-correcting methods, any label can be changed during the execution

• This means they can be applied to graphs with negative weights as long as there are no negative cycles (cycles where the sum of the edge weights is negative)

Shortest Paths (continued)

• However, a label-correcting method guarantees only that, after processing is complete, the current distances of all vertices indicate the shortest paths

• Most of these methods (both label-setting and label-correcting) can nevertheless be viewed as forms of the same general process

• That process finds the shortest paths from one vertex to all the other vertices; its pseudocode is on page 399

• In this algorithm, a label is defined as: label(v) = (currDist(v), predecessor(v))

• Two open issues in the code are the design of the set called toBeChecked and the order new values are assigned to v

• It is the design of the set that impacts both the choice of v and the efficiency of the algorithm

Shortest Paths (continued)

• The distinction between label-setting and label-correcting algorithms is the way the value for vertex v is chosen

• In label-setting algorithms, v is always the vertex in the set toBeChecked with the smallest current distance

• One of the first label-setting algorithms was developed by Edsger Dijkstra in 1956

• In this algorithm, a number of paths from a vertex v are tried, and each time the shortest among them is chosen

• This means that a particular path may be extended by adding one more edge to it each time v is checked

• However, if that path turns out to be longer than another path that can be tried, it is set aside and the shorter path is extended instead

Shortest Paths (continued)

• Since the vertices may have more than one outgoing edge, each new edge adds possible paths for exploration

• Thus each vertex is visited, the new paths are started, and the vertex is then not used anymore

• Once all the vertices are visited, the algorithm is done

• Dijkstra’s algorithm is shown on page 400; it is derived from the general algorithm by changing the line

    v = a vertex in toBeChecked;

to

    v = a vertex in toBeChecked with minimal currDist(v);

• It also extends the condition in the if statement to make permanent the current distance of vertices eliminated from the set
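One possible C++ rendering of this label-setting scheme is sketched below. The slides leave the structure of toBeChecked open; using `std::priority_queue` with stale entries skipped is an assumption made here, as are the `Arc` and `Labels` types and the use of `kInf` for "not reached".

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Arc { int to; long long weight; };   // hypothetical outgoing-edge record
struct Labels {
    std::vector<long long> currDist;        // kInf means "not reached"
    std::vector<int> predecessor;           // -1 means "no predecessor"
};
const long long kInf = std::numeric_limits<long long>::max();

Labels dijkstra(const std::vector<std::vector<Arc>>& adj, int first) {
    const int n = static_cast<int>(adj.size());
    assert(0 <= first && first < n);
    Labels lab{std::vector<long long>(n, kInf), std::vector<int>(n, -1)};
    std::vector<bool> done(n, false);       // true once a label is permanent
    using Item = std::pair<long long, int>; // (currDist, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> toBeChecked;
    lab.currDist[first] = 0;
    toBeChecked.push({0, first});
    while (!toBeChecked.empty()) {
        const int v = toBeChecked.top().second;  // vertex with minimal currDist
        toBeChecked.pop();
        if (done[v]) continue;                   // stale queue entry, label already set
        done[v] = true;
        for (const Arc& a : adj[v]) {
            assert(a.weight >= 0);               // label-setting needs nonnegative weights
            if (!done[a.to] && lab.currDist[v] + a.weight < lab.currDist[a.to]) {
                lab.currDist[a.to] = lab.currDist[v] + a.weight;
                lab.predecessor[a.to] = v;
                toBeChecked.push({lab.currDist[a.to], a.to});
            }
        }
    }
    return lab;
}
```

The `!done[a.to]` test plays the role of the extended if condition: once a vertex has left the set, its current distance is never changed again.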

Shortest Paths (continued)

• Notice that the set’s structure is not indicated; recall that it is the structure that determines efficiency

• Figure 8.7 illustrates this for the graph in part (a)

Fig. 8.7 An execution of DijkstraAlgorithm()

Shortest Paths (continued)

• As a label-setting algorithm, Dijkstra’s approach may fail when negative weights are used in graphs

• To deal with that, a label-correcting algorithm is needed

• One of the first label-correcting algorithms was developed by Lester R. Ford, Jr. in the late 1950s

• It uses the same technique as Dijkstra’s method to set the current distances, but postpones determining the shortest distance for any vertex until the entire graph is processed

• While it is capable of handling graphs with negative weights, it cannot deal with negative cycles

• In the algorithm, all edges are watched in an attempt to find an improvement for the current distance of the vertices

Shortest Paths (continued)

• The pseudocode for the algorithm is shown on page 402

• To impose an order on this monitoring, an alphabetically ordered sequence can be used

• That way the algorithm can go through the sequence repeatedly and adjust any vertex’s current distance as needed

• Figure 8.8 contains an example of this; note that the graph does include negatively weighted edges

• While a vertex may change its current distance more than once during the same iteration, when the algorithm is done each vertex can be reached by the shortest path from the starting vertex
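The repeated scan over all edges can be sketched as follows. This is an illustration rather than the pseudocode from page 402: the `Edge` record and `kInf` are hypothetical, and the cap of |V| passes (which lets the function report a negative cycle instead of looping forever) is an addition that the slides do not describe.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct Edge { int from, to; long long weight; };   // hypothetical directed-edge record
const long long kInf = std::numeric_limits<long long>::max();

// Scans all edges repeatedly until no current distance improves.
// Returns false if labels are still improving after |V| passes, which can
// only happen when a negative cycle is reachable from `first`.
bool fordAlgorithm(int n, const std::vector<Edge>& edges, int first,
                   std::vector<long long>& currDist) {
    assert(0 <= first && first < n);
    currDist.assign(n, kInf);
    currDist[first] = 0;
    for (int pass = 0; pass < n; ++pass) {
        bool improved = false;
        for (const Edge& e : edges) {
            if (currDist[e.from] != kInf &&
                currDist[e.from] + e.weight < currDist[e.to]) {
                currDist[e.to] = currDist[e.from] + e.weight;  // label corrected
                improved = true;
            }
        }
        if (!improved) return true;   // every label is now final
    }
    return false;
}
```

Without a negative cycle, a shortest path uses at most |V| - 1 edges, so at most |V| - 1 passes can change a label and the next pass confirms that nothing improves.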

Shortest Paths (continued)

Fig. 8.8 FordAlgorithm() applied to a digraph with negative weights

• In the case of Dijkstra’s algorithm, we observed that the efficiency can be improved by the choice of data structure

• This in turn impacts the way the edges and vertices are scanned

Shortest Paths (continued)

• This observation also holds for label-correcting algorithms; in particular, FordAlgorithm() specifies no order for edge checking

• In the example of Figure 8.8, the approach was to visit all adjacency lists of all vertices in each iteration

• However this requires that all the edges are checked every time, which is inefficient

• A more sensible organization of the vertices can reduce the number of visits per vertex

• The generic algorithm on page 399 suggests an improvement by explicitly accessing toBeChecked

• In FordAlgorithm() this structure is used implicitly, and then only as the set of all vertices

Shortest Paths (continued)

• Based on this, we can derive a general label-correcting algorithm, shown in pseudocode on page 403

• As indicated before, the efficiency of the algorithm depends directly on the data structure used for toBeChecked

• One possibility is a queue, which was the basis for one of the earliest implementations

• With a queue, as a vertex v is removed, the current distances to its neighbors are checked

• If any of those distances is updated, the vertex whose distance was changed is added to the queue

• While straightforward, this approach can sometimes reevaluate the same labels excessively
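A queue-based version of the label-correcting idea might be written as below. It is a sketch, not the pseudocode from page 403: the `Arc` record, `kInf`, and the `inQueue` flags (which keep a vertex from being queued twice at the same time) are assumptions, and the code assumes the graph has no negative cycle reachable from the start vertex.

```cpp
#include <cassert>
#include <limits>
#include <queue>
#include <vector>

struct Arc { int to; long long weight; };   // hypothetical outgoing-edge record
const long long kInf = std::numeric_limits<long long>::max();

// Assumes no negative cycle is reachable from `first`; otherwise the loop never ends.
std::vector<long long> labelCorrectingQueue(const std::vector<std::vector<Arc>>& adj,
                                            int first) {
    const int n = static_cast<int>(adj.size());
    assert(0 <= first && first < n);
    std::vector<long long> currDist(n, kInf);
    std::vector<bool> inQueue(n, false);
    std::queue<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push(first);
    inQueue[first] = true;
    while (!toBeChecked.empty()) {
        const int v = toBeChecked.front();
        toBeChecked.pop();
        inQueue[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight < currDist[a.to]) {
                currDist[a.to] = currDist[v] + a.weight;  // label corrected
                if (!inQueue[a.to]) {                     // re-examine this vertex later
                    toBeChecked.push(a.to);
                    inQueue[a.to] = true;
                }
            }
        }
    }
    return currDist;
}
```

A vertex can re-enter the queue every time its label improves, which is the source of the repeated reevaluation mentioned above.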

Shortest Paths (continued)

• Figure 8.9 illustrates this problem for the graph of Figure 8.8a

Fig. 8.9 An execution of labelCorrectingAlgorithm(), which uses a queue

• As can be seen, a number of vertices are updated multiple times

Shortest Paths (continued)

• To avoid this situation, a deque can be used in place of the queue

• In this approach, vertices to be checked for the first time are added at the end of the deque; otherwise they are placed at the front

• The reasoning behind this is that if a given vertex, v, is included for the first time, the vertices accessible from it have yet to be processed, so they will be processed after v

• However, if v has been processed, those vertices are likely still in the list awaiting processing, so putting v in front may avoid unnecessary updates

• Figure 8.10 shows the result of using a deque instead of a queue
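The deque policy can be sketched in C++ as follows. The `Arc` record, `kInf`, and the `inDeque` and `seenBefore` flags are hypothetical names introduced for this example, and the code again assumes there is no negative cycle reachable from the start vertex.

```cpp
#include <cassert>
#include <deque>
#include <limits>
#include <vector>

struct Arc { int to; long long weight; };   // hypothetical outgoing-edge record
const long long kInf = std::numeric_limits<long long>::max();

// Assumes no negative cycle is reachable from `first`.
std::vector<long long> labelCorrectingDeque(const std::vector<std::vector<Arc>>& adj,
                                            int first) {
    const int n = static_cast<int>(adj.size());
    assert(0 <= first && first < n);
    std::vector<long long> currDist(n, kInf);
    std::vector<bool> inDeque(n, false);      // currently waiting in the deque
    std::vector<bool> seenBefore(n, false);   // has been in the deque at least once
    std::deque<int> toBeChecked;
    currDist[first] = 0;
    toBeChecked.push_back(first);
    inDeque[first] = true;
    seenBefore[first] = true;
    while (!toBeChecked.empty()) {
        const int v = toBeChecked.front();
        toBeChecked.pop_front();
        inDeque[v] = false;
        for (const Arc& a : adj[v]) {
            if (currDist[v] + a.weight < currDist[a.to]) {
                currDist[a.to] = currDist[v] + a.weight;
                if (!inDeque[a.to]) {
                    if (seenBefore[a.to]) toBeChecked.push_front(a.to);  // repeat visitor: front
                    else toBeChecked.push_back(a.to);                    // first time: end
                    inDeque[a.to] = true;
                    seenBefore[a.to] = true;
                }
            }
        }
    }
    return currDist;
}
```

Replacing `push_front` with `push_back` turns this back into the plain queue version, so the two policies can be compared on the same input.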

Shortest Paths (continued)

Fig. 8.10 An execution of labelCorrectingAlgorithm(), which applies a deque

• The use of a deque does suffer from one problem, however

• Its worst-case performance is exponential in the number of vertices

Shortest Paths (continued)

• However, the average case is about 60% better than the queue version of the same algorithm

• A variation of this approach uses two queues separately, rather than combined in a deque

• In this variation, vertices enqueued for the first time are placed in the first queue; otherwise they are placed in the second

• Vertices are then dequeued from the first queue if it is not empty; otherwise they are taken from the second

• The threshold algorithm is another variation of the label-correcting method that uses two lists

• Vertices are removed from the first list for processing

Shortest Paths (continued)

• A vertex will be added to the end of the first list if the value of its label is below the threshold level

• Otherwise it will be added to the second list

• If the first list becomes empty, the threshold is modified to a value greater than the minimum label value of all vertices in the second list

• Then those vertices whose labels are less than the new threshold are moved from the second list to the first list

• Yet another approach is the small label first method

• In this method, a vertex is placed at the front of the deque if its label is smaller than the label of the current front of the deque; otherwise it is placed at the rear

Shortest Paths (continued)

• All-to-All Shortest Path Problem

– Given the issues of finding the shortest path from one vertex to another, the problem of finding the shortest paths between every pair of vertices might seem daunting

– However, a method developed by Stephen Warshall in 1962 does it fairly easily, as long as an adjacency matrix that provides edge weights is available

– This technique can also handle negative edge weights and the algorithm is shown on page 406

– An example of the algorithm’s application, together with the accompanying adjacency matrix, is shown in Figure 8.11 on page 407

– The algorithm can also detect cycles if the diagonal of the matrix is initialized to ∞ instead of 0

– If any of the diagonal values get changed, the graph contains a cycle

Shortest Paths (continued)

• All-to-All Shortest Path Problem (continued)

– As it turns out, if an initial value of ∞ is not changed during processing, then one vertex cannot reach the other

– The algorithm’s simplicity is reflected in the determination of its complexity; there are three nested loops, each executed |V| times, so it is O(|V|³)

– This is adequate for dense, near-complete graphs, but if they are sparse, it may be better to use a one-to-all method applied to each vertex

– Generally this should be a label-setting algorithm, but recall that these types of routines cannot handle negative edge weights

– Fortunately, there are transformations available that eliminate the negative weights while preserving the shortest paths of the original
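The three nested loops can be sketched in C++ as follows. This is an illustration, not the algorithm text from page 406: the `Matrix` alias, the use of `kInf` for ∞, and the function name `allPairsShortestPaths` are assumptions, and the matrix is updated in place.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

const long long kInf = std::numeric_limits<long long>::max();  // stands in for ∞
using Matrix = std::vector<std::vector<long long>>;            // hypothetical alias

// weight[j][k] starts as the weight of edge (vj vk), or kInf if there is none;
// on return it holds the length of the shortest path from vj to vk.
void allPairsShortestPaths(Matrix& weight) {
    const std::size_t n = weight.size();
    for (const auto& row : weight) assert(row.size() == n);
    for (std::size_t i = 0; i < n; ++i)            // vi is the allowed intermediate vertex
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                if (weight[j][i] != kInf && weight[i][k] != kInf &&
                    weight[j][i] + weight[i][k] < weight[j][k])
                    weight[j][k] = weight[j][i] + weight[i][k];
}
```

If the diagonal is initialized to `kInf` instead of 0, any diagonal entry that ends up finite signals a cycle through that vertex, as noted above; entries that stay at `kInf` mark unreachable pairs.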

Cycle Detection

• Numerous algorithms rely on the ability to detect cycles in graphs

• Our consideration of the Warshall-Floyd algorithm in the previous section demonstrated that it can detect cycles

• However, its cubic order makes it too inefficient to use in all circumstances, so other methods have to be considered

• One algorithm, based on the depthFirstSearch() routine, works well for undirected graphs

• The pseudocode for this is shown on page 408

• Digraphs complicate matters, because the spanning subtrees might have edges between them (called side edges)
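For the undirected case, a depth-first cycle test might be sketched as below. It is not the pseudocode from page 408; it assumes a simple undirected graph (no parallel edges or loops) stored as adjacency lists, and the helper name `cycleFrom` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Depth-first step: a visited neighbor other than the tree parent means a
// back edge, and therefore a cycle.
bool cycleFrom(const std::vector<std::vector<int>>& adj, int v, int parent,
               std::vector<bool>& visited) {
    visited[v] = true;
    for (int u : adj[v]) {
        assert(0 <= u && u < static_cast<int>(adj.size()));
        if (!visited[u]) {
            if (cycleFrom(adj, u, v, visited)) return true;
        } else if (u != parent) {
            return true;
        }
    }
    return false;
}

// Runs the test from every unvisited vertex so all components are checked.
bool hasCycle(const std::vector<std::vector<int>>& adj) {
    std::vector<bool> visited(adj.size(), false);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (!visited[v] && cycleFrom(adj, v, -1, visited)) return true;
    return false;
}
```

The `u != parent` test only skips the edge just used to reach v, which is why the sketch is restricted to simple graphs.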

Cycle Detection (continued)

• If two vertices already included in a subtree are joined by a back edge, it indicates a cycle

• To take this case into account, a number greater than any number generated in subsequent searches is assigned to the current vertex after all of its descendants have been visited

• This allows us to detect cycles if a vertex is about to be joined by an edge with a vertex having a lower number

• This allows us to modify the algorithm so that it now appears in pseudocode as the algorithm on page 409

Cycle Detection (continued)

• Union-Find Problem

– We’ve seen that the depth-first search guarantees creating a spanning tree with no cycles

– However, a problem occurs when the depth-first search algorithm is modified to determine if a specific edge is part of a cycle

– If the modified algorithm is applied to each edge separately, the total running time could become O(|V|⁴) for dense graphs

– This is unacceptable, and a better approach needs to be investigated

– The basic task is to determine if two vertices are members of the same set

– Two procedures are needed for this: first, to find the set to which a vertex v belongs, and second, to unite two sets into one if v belongs to one set and vertex w belongs to another

Cycle Detection (continued)

• Union-Find Problem (continued)

– This task is known as the union-find problem

– Circular linked lists are used to implement the sets involved in solving the union-find problem

– Each list is identified by a vertex, which is the root of the tree containing the vertices in that list

– The vertices are numbered from 0 to |V| - 1, and these numbers serve as indices into three arrays
• root[] stores the index of a vertex identifying a set of vertices
• next[] indicates the next vertex on a list
• length[] indicates the number of vertices in a list

– The circular lists are used to enable combining the lists immediately

– This is shown in Figure 8.12

Cycle Detection (continued)

• Union-Find Problem (continued)

Fig. 8.12 Concatenating two circular linked lists

– The two lists are merged into one by interchanging next pointers

– However, all the vertices now have to have the same root, so the vertices of one of the lists need to have their root indicators changed

– This should be the shorter of the two lists, which can be determined by the length[] array

– Since the union operation performs all the needed tasks, the find operation is trivial

Cycle Detection (continued)

• Union-Find Problem (continued)

– By constantly updating the root[] array, the set to which a vertex v belongs can be identified immediately because it is the set identified by root[v]

– Thus after initializations, the union algorithm can be defined as shown in pseudocode on page 410

– An application of this is shown in Figure 8.13

– After the initialization completes, the |V| one-node lists are as shown in Figure 8.13a

– These smaller lists are merged into larger ones by repeated execution of the union algorithm, and the arrays are updated as seen in Figure 8.13b-d
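The three arrays and the merge step can be sketched in C++ as follows. This is an illustration built from the description above rather than the pseudocode from page 410; the class name `UnionFind` and the method names `find` and `unite` are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class UnionFind {
public:
    explicit UnionFind(int n) : root(n), next(n), length(n, 1) {
        std::iota(root.begin(), root.end(), 0);  // every vertex identifies its own set
        std::iota(next.begin(), next.end(), 0);  // |V| one-node circular lists
    }

    // Immediate lookup: the set of v is identified by root[v].
    int find(int v) const { return root[v]; }

    // Merges the sets containing v and u; returns false if they already coincide.
    bool unite(int v, int u) {
        assert(0 <= v && v < static_cast<int>(root.size()));
        assert(0 <= u && u < static_cast<int>(root.size()));
        int rv = root[v], ru = root[u];
        if (rv == ru) return false;
        if (length[rv] > length[ru]) std::swap(rv, ru);  // rv now identifies the shorter list
        length[ru] += length[rv];
        root[rv] = ru;
        for (int j = next[rv]; j != rv; j = next[j])     // re-root the shorter list
            root[j] = ru;
        std::swap(next[rv], next[ru]);                   // splice the two circular lists
        return true;
    }

private:
    std::vector<int> root, next, length;
};
```

Because only the shorter list is re-rooted, a vertex changes its root at most about log2 |V| times over any sequence of unions, while `find` stays a single array access.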

Cycle Detection (continued)

• Union-Find Problem (continued)

Fig. 8.13 An example of application of union() to merge lists

Spanning Trees

• Consider an airline that has routes between seven cities represented as the graph in Figure 8.14a

Fig. 8.14 A graph representing (a) the airline connections between seven cities and (b–d) three possible sets of connections

• If economic hardships force the airline to cut routes, which ones should be kept to preserve a route to each city, if only indirectly?

• One possibility is shown in Figure 8.14b

Spanning Trees (continued)

• However, we want to make sure we have the minimum connections necessary to preserve the routes

• To accomplish this, a spanning tree should be used, specifically one created using depthFirstSearch()

• There is a possibility of multiple spanning trees (Figure 8.14c-d), but each of these has the minimum number of edges

• We don’t know which of these might be optimal, since we haven’t taken distances into account

• The airline, wanting to minimize costs, will want to use the shortest distances for the connections

• So what we want to find is the minimum spanning tree, where the sum of the edge weights is minimal

Spanning Trees (continued)

• The problem we looked at earlier, finding a spanning tree in a simple graph, is a case of this where every edge weight is 1

• So each spanning tree is a minimum spanning tree in a simple graph

• There are a number of solutions to the minimum spanning tree problem, and we will consider two

• One popular algorithm is Kruskal’s algorithm, developed by Joseph Kruskal in 1956

• It orders the edges by weight, and then checks each edge in turn to see if it can be added to the tree under construction

• An edge is added if its inclusion doesn’t create a cycle
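Kruskal’s approach as described above can be sketched in C++ as follows. The cycle test here uses a simplified parent-array form of union-find (the hypothetical helper `findSet`) instead of the circular-list version from the Cycle Detection slides; the `Edge` record is also an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct Edge { int u, v; long long weight; };   // hypothetical undirected-edge record

// Hypothetical helper: follows parent links to the representative of x's set.
int findSet(std::vector<int>& parent, int x) {
    while (parent[x] != x) x = parent[x] = parent[parent[x]];  // path halving
    return x;
}

// Returns the edges of a minimum spanning tree of a connected graph on n vertices.
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        if (static_cast<int>(tree.size()) == n - 1) break;  // tree is complete
        assert(0 <= e.u && e.u < n && 0 <= e.v && e.v < n);
        const int ru = findSet(parent, e.u);
        const int rv = findSet(parent, e.v);
        if (ru != rv) {          // endpoints in different sets: no cycle is formed
            parent[ru] = rv;
            tree.push_back(e);
        }
    }
    return tree;
}
```

Sorting dominates the running time, so the sketch runs in roughly O(|E| log |E|) time.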

Spanning Trees (continued)

• The algorithm is as follows:

KruskalAlgorithm(weighted connected undirected graph)
    tree = null;
    edges = sequence of all edges of graph sorted by weight;
    for (i = 1; i ≤ |E| and |tree| < |V| - 1; i++)
        if ei from edges does not form a cycle with edges in tree
            add ei to tree;

• A step-by-step example of the application of this algorithm is shown in Figure 8.15ba-bf on page 413

• It is not necessary to order the edges in order to build a spanning tree; any order of edges can be used

• An algorithm developed by Dijkstra in 1960 (and independently by Robert Kalaba) pursues this approach

Spanning Trees (continued)

• This algorithm is shown below:

DijkstraMethod(weighted connected undirected graph)
    tree = null;
    edges = an unsorted sequence of all edges of graph;
    for i = 1 to |E|
        add ei to tree;
        if there is a cycle in tree
            remove an edge with maximum weight from this only cycle;

• In this algorithm, edges are added to the tree one by one

• If a cycle results, the edge in the cycle with maximum weight is removed

• The use of this method is shown in Figure 8.15ca-cl on page 414

Connectivity

• In many graph problems we want to find a path from a given vertex to any other vertex

• In undirected graphs this means there are no separate pieces (subgraphs) in the graph

• In a digraph, we may be able to get to some vertices in a particular direction, but not return to the starting vertex

Connectivity (continued)

• Connectivity in Undirected Graphs

– An undirected graph is considered to be connected if there is a path between any two vertices of the graph

– We can use the depth-first search algorithm to determine connectivity if the while loop heading is removed

– When the algorithm completes, we check the edges list to see if it contains all the vertices of the graph

– Connectivity is described in terms of degrees; a graph is more or less connected depending on the number of different paths between vertices

– An n-connected graph has at least n different paths between any two vertices

– This means there are n paths between the vertices that have no vertices in common
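The basic connectivity test (a single depth-first search with no restarts) can be sketched as below. Instead of inspecting an edges list, this version simply counts the vertices it reaches; that choice, the explicit stack, and the convention that an empty graph counts as connected are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every vertex of the undirected graph is reachable from vertex 0.
bool isConnected(const std::vector<std::vector<int>>& adj) {
    const std::size_t n = adj.size();
    if (n == 0) return true;                 // convention chosen for this sketch
    std::vector<bool> visited(n, false);
    std::vector<int> stack{0};               // explicit stack instead of recursion
    visited[0] = true;
    std::size_t reached = 1;
    while (!stack.empty()) {
        const int v = stack.back();
        stack.pop_back();
        for (int u : adj[v]) {
            assert(u >= 0 && static_cast<std::size_t>(u) < n);
            if (!visited[u]) {
                visited[u] = true;
                ++reached;
                stack.push_back(u);
            }
        }
    }
    return reached == n;                     // a single search must cover all vertices
}
```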

Connectivity (continued)

• Connectivity in Undirected Graphs (continued)

– One special type of graph is the biconnected (or 2-connected) graph, which has at least two non-overlapping paths between any two vertices

– If we can find a vertex that always has to be included in the path between vertices a and b, then the graph is not biconnected

– Removing this vertex, and its incident edges, will split the graph into two subgraphs

– These vertices are referred to as cut-vertices or articulation points

– If the graph can be split by removing an edge, the edge is referred to as a cut-edge or bridge

– If connected subgraphs have no articulation points or bridges, they are called blocks (if there are at least two vertices, they are biconnected components)

Connectivity (continued)

• Connectivity in Undirected Graphs (continued)

– We can detect articulation points by extending the depth-first algorithm to create a tree with forward and back edges

– A vertex in the resulting tree is an articulation point if it has at least one subtree unconnected with any of its predecessors by a back edge

– This is illustrated in Figure 8.16 on page 417

– A special case of articulation points occurs when the vertex involved is a root with more than one descendant

– In the case of the graph in Figure 8.16, a is the root, and has three incident edges; however, only one becomes a forward edge

– This is because the other two are processed by the depth-first search

Connectivity (continued)

• Connectivity in Undirected Graphs (continued)

– Consequently, if a is reached again, there will be no untried edge, whereas if a were a cut-vertex there would be at least one such edge

– So a given vertex v is an articulation point if:
• v is the root of a depth-first tree and has more than one descendant in the tree, OR
• at least one of v’s subtrees includes no vertex connected by a back edge with any of v’s predecessors

– To find articulation points, a parameter pred(v) is used, defined as the smallest number among num(v) and the numbers of the vertices connected by a back edge with either v or a descendant of v

– A stack is used to store the currently processed edges; after the cut-vertex is identified, the graph edges comprising the block are output

– The pseudocode for the algorithm is on pages 416 and 418
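The two conditions above can be sketched in C++ as follows. This is a minimal illustration, not the book's pseudocode: it only reports the cut-vertices (it does not keep the edge stack needed to output blocks), it assumes a simple undirected graph stored as adjacency lists over vertices 0..n-1, and the names ArticulationFinder and articulationPoints are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper; assumes a simple undirected graph (no parallel edges).
struct ArticulationFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> num, pred;   // DFS number; smallest number reachable via a back edge
    std::vector<bool> isCut;
    int counter = 0;

    explicit ArticulationFinder(const std::vector<std::vector<int>>& g)
        : adj(g), num(g.size(), 0), pred(g.size(), 0), isCut(g.size(), false) {}

    void dfs(int v, int parent) {
        num[v] = pred[v] = ++counter;
        int children = 0;
        for (int u : adj[v]) {
            if (num[u] == 0) {                    // forward (tree) edge
                ++children;
                dfs(u, v);
                pred[v] = std::min(pred[v], pred[u]);
                // u's subtree has no back edge to a proper predecessor of v
                if (parent != -1 && pred[u] >= num[v]) isCut[v] = true;
            } else if (u != parent) {             // back edge
                pred[v] = std::min(pred[v], num[u]);
            }
        }
        // special case: the root is a cut-vertex only with more than one child
        if (parent == -1 && children > 1) isCut[v] = true;
    }
};

std::vector<int> articulationPoints(const std::vector<std::vector<int>>& adj) {
    ArticulationFinder finder(adj);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (finder.num[v] == 0) finder.dfs(v, -1);
    std::vector<int> result;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (finder.isCut[v]) result.push_back(v);
    return result;
}
```

For a triangle 0-1-2 with a pendant vertex 3 attached to vertex 2, the only vertex reported is 2, since removing it isolates vertex 3.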


Page 54: Chapter 8: Graphs

Connectivity (continued)

• Connectivity in Directed Graphs (continued)

– With directed graphs, defining connectedness depends on whether or not the direction of the edges is considered

– A weakly connected digraph is one where the undirected graph with the same edges and vertices is connected

– A strongly connected digraph has, for every pair of vertices, a path between them in both directions

– A digraph may not be strongly connected, yet contain strongly connected components (SCCs)

– These are subsets of vertices in the digraph that of themselves represent a strongly connected digraph


Page 55: Chapter 8: Graphs

Connectivity (continued)

• Connectivity in Directed Graphs (continued)

– Depth-first search can also be used in determining SCCs

– The root of the SCC is the first vertex of the SCC to which the depth-first search is applied

– Because every vertex in the SCC is reachable from this root, the depth-first number of the root will be less than that of any other vertex in the SCC

– Only after those vertices are visited will the depth-first search backtrack to the root

– At that point the SCC that is accessible from this root can be output

– The problem then is how to find these roots in the digraph, which is a problem similar to finding cut-vertices in an undirected graph


Page 56: Chapter 8: Graphs

Connectivity (continued)

• Connectivity in Directed Graphs (continued)

– To do this, the pred(v) parameter is used, which is the lower of num(v) and pred(u), u being a vertex reachable from v and in the same SCC

– Of course this leads to the question of how we can determine whether two vertices are in the same SCC before the SCC itself has been determined

– This can be done using a stack to store the vertices of all SCCs under construction

– The topmost vertices will be in the current SCC

– This way we know what vertices are already in the SCC even though the construction isn’t finished

– The algorithm, attributed to Robert Tarjan, is shown on page 419; an example of the execution is shown in Figure 8.17 on page 420
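The stack-based idea can be sketched in C++ roughly as follows. This is an illustrative version of the classic formulation of Tarjan's algorithm rather than a transcription of the book's pseudocode; in particular, it updates pred(v) from num(u) when u is already on the stack, which is a common variant. The names SccFinder and stronglyConnectedComponents are hypothetical, and the digraph is assumed to be given as adjacency lists over vertices 0..n-1.

```cpp
#include <algorithm>
#include <stack>
#include <vector>

// Hypothetical helper implementing a common variant of Tarjan's SCC algorithm.
struct SccFinder {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> num, pred;
    std::vector<bool> onStack;               // true while the vertex's SCC is unfinished
    std::stack<int> pending;                 // vertices of all SCCs under construction
    std::vector<std::vector<int>> components;
    int counter = 0;

    explicit SccFinder(const std::vector<std::vector<int>>& g)
        : adj(g), num(g.size(), 0), pred(g.size(), 0), onStack(g.size(), false) {}

    void dfs(int v) {
        num[v] = pred[v] = ++counter;
        pending.push(v);
        onStack[v] = true;
        for (int u : adj[v]) {
            if (num[u] == 0) {
                dfs(u);
                pred[v] = std::min(pred[v], pred[u]);
            } else if (onStack[u]) {         // u belongs to an SCC still being built
                pred[v] = std::min(pred[v], num[u]);
            }
        }
        if (pred[v] == num[v]) {             // v is the root of an SCC: pop it off
            std::vector<int> component;
            int w;
            do {
                w = pending.top();
                pending.pop();
                onStack[w] = false;
                component.push_back(w);
            } while (w != v);
            std::sort(component.begin(), component.end());
            components.push_back(component);
        }
    }
};

std::vector<std::vector<int>> stronglyConnectedComponents(
        const std::vector<std::vector<int>>& adj) {
    SccFinder finder(adj);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (finder.num[v] == 0) finder.dfs(v);
    return finder.components;
}
```

Components are reported in the order their roots finish, so a component with no outgoing edges to other components appears first.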


Page 57: Chapter 8: Graphs

Topological Sort

• A topological sort of a directed graph is a linear ordering of its vertices so that, for every edge uv, u comes before v in the ordering

• For instance, the vertices of the graph may represent tasks to be performed

• The edges may represent constraints that one task must be performed before another

• In this application, a topological ordering is just a valid sequence for the tasks

• A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG)


Page 58: Chapter 8: Graphs

Topological Sort (continued)

• The algorithm for the topological sort is a simple one:

topologicalSort(digraph)
    for i = 1 to |V|
        find a minimal vertex v;
        num(v) = i;
        remove from digraph vertex v and all edges incident with v;

• As can be seen, we locate a vertex v with no outgoing edges

• Such a vertex is called a minimal vertex or sink

• We then remove v together with any edges leading from another vertex to v

• Figure 8.18 shows this process; the graph in Figure 8.18a goes through a series of deletions (Figure 8.18b-f) to produce the sequence g, e, b, f, d, c, a


Page 59: Chapter 8: Graphs

Topological Sort (continued)

Fig. 8.18 Executing a topological sort


Page 60: Chapter 8: Graphs

Topological Sort (continued)

• It is not actually necessary to delete the edges and vertices from a digraph during this processing

• If we can determine that all successors of the vertex v have been processed, they can be considered deleted

• This is once again handled by applying the depth-first search techniques seen earlier

• Basically, if the search backtracks to v, then all its successors can be assumed to have already been searched

• The pseudocode for this algorithm is shown on pages 421 and 423

• The table (Figure 8.18h) shows how the numbers are assigned for each vertex of the graph of Figure 8.18a
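A depth-first version of the sort might look like the sketch below. It is an illustration under stated assumptions, not the book's pseudocode: vertices are numbered 0..n-1, the digraph is given as adjacency lists, and the helper names visit and topologicalSort are hypothetical. A vertex is recorded when the search backtracks from it, which yields the sink-first order described above; the list is reversed at the end so that u precedes v for every edge uv.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// state: 0 = unvisited, 1 = on the current DFS path, 2 = finished
static void visit(int v, const std::vector<std::vector<int>>& adj,
                  std::vector<int>& state, std::vector<int>& finished) {
    state[v] = 1;
    for (int u : adj[v]) {
        if (state[u] == 1)                    // edge back into the current path
            throw std::invalid_argument("digraph contains a cycle");
        if (state[u] == 0) visit(u, adj, state, finished);
    }
    state[v] = 2;
    finished.push_back(v);   // all successors are done, so v can be "removed" now
}

std::vector<int> topologicalSort(const std::vector<std::vector<int>>& adj) {
    std::vector<int> state(adj.size(), 0), finished;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (state[v] == 0) visit(v, adj, state, finished);
    std::reverse(finished.begin(), finished.end());   // sink-first -> source-first
    return finished;
}
```

Throwing on a cycle is a design choice for this sketch; a topological ordering does not exist in that case, so there is nothing sensible to return.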


Page 61: Chapter 8: Graphs

Networks

• Maximum Flows

– A network is a directed graph where each edge has a capacity and each edge receives a flow

– The amount of flow on an edge cannot exceed the capacity of the edge

– A flow must satisfy the restriction that the amount of flow into a node equals the amount of flow out of it, except when it is a source, which has more outgoing flow, or sink, which has more incoming flow

– A network can be used to model traffic in a road system, fluids in pipes, currents in an electrical circuit, or anything similar in which something travels through a network of nodes

– Delbert R. Fulkerson and Lester R. Ford, Jr. developed the first computational models of these flow problems in 1954


Page 62: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– The central problem of these network models is to maximize the flow over the edges from the source to the sink

– This is referred to as the maximum flow (or max-flow) problem

– Figure 8.19 illustrates this problem for a small water-flow network of 8 pipes and 6 pumping stations; the edges are labeled with the capacity of the pipes in thousands of gallons

Figure 8.19 A pipeline with eight pipes and six pumping stations


Page 63: Chapter 8: Graphs

Networks (continued)• Maximum Flows (continued)

– A central aspect of the Ford-Fulkerson approach is the concept of a cut

– A cut separating s and t is a set of edges between two sets of vertices, X and X̄

– Every vertex of the graph is a member of one of these two sets; the source, s, is in X and the sink, t, is in X̄

– In Figure 8.19, if we choose X = {s, a}, then X̄ = {b, c, d, t}, and the cut is the set of edges {(a, b), (s, c), (s, d)}

– Thus, if all these edges are cut, there is no way to get from s to t

– Now we can define the capacity of the cut as the sum of the capacities of the edges in this cut set, so

cap{(a,b),(s,c),(s,d)} = cap(a,b) + cap(s,c) + cap(s,d) = 19


Page 64: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– From this, we can infer the max-flow min-cut theorem:

Theorem: In any network, the maximal flow from s to t is equal to the minimal capacity of any cut.

– This makes it fairly clear that while there may be cuts with larger capacity, it is the cut with the smallest capacity that determines the flow of the network

– For instance, although the capacity of our earlier cut was 19, the two edges coming to the sink can’t transfer more than 9 units

– So we have to search all the cuts to find the one with the smallest capacity, and transfer through this as many units as the capacity allows

– To achieve this, we’ll utilize a new idea


Page 65: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– A flow-augmenting path is a sequence of edges from s to t such that, for each edge e on the path, the flow f(e) is less than the capacity cap(e) if e is a forward edge, and greater than 0 if e is a backward edge

– This means the path has excess capacity that isn’t being used

– However, if the flow on any forward edge in that path has reached capacity, the flow along the path cannot be augmented

– The path also does not have to exclusively use forward edges, so in Figure 8.19, we have paths s, a, b, t and s, d, b, t

– Backward edges push back against the flow, decreasing the total flow of the network

– Eliminating them can increase the overall flow in the network, so the goal of augmenting isn’t finished until the flow on those edges is 0


Page 66: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– The task now is to find an augmenting path; however there may be a large number of paths from s to t, so this is a nontrivial problem

– Ford and Fulkerson devised the first systematic algorithm for this in 1957

– The first phase of the algorithm, labeling, assigns each vertex of the graph a label, defined as the pair label(v) = (parent(v), flow(v))

– parent(v) is the node accessing v, and flow(v) is the flow amount from s to v

– Forward and backward edges are treated differently; if v accesses vertex u via a forward edge, label(u) = (v+,min(flow(v),slack(edge(vu))))

– Here, slack(edge(vu)) = cap(edge(vu)) – f(edge(vu)); this is the difference between the capacity of the edge vu and its current flow


Page 67: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– Now if the edge between v and u is backward, then label(u) = (v–, min(flow(v), f(edge(uv)))), where

flow(v) = min(flow(parent(v)), slack(edge(parent(v)v)))

– Once a vertex is labeled, it is stored for subsequent processing

– Only the vu edge is labeled in this activity, leaving open the ability to add more flow

– This can be done for forward edges when slack(edge(vu)) > 0, and for backward edges when f(edge(uv)) > 0

– However, finding this path may not complete the whole procedure

– It is only finished if we are stuck somewhere in the network and unable to label any more edges


Page 68: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– If we reach the sink, the flows in the augmenting path are adjusted by increasing flows on the forward edges, and decreasing them on the backward ones

– Then we restart the task and look for another augmenting path

– The pseudocode for the algorithm is presented on page 425

– In examining the algorithm, notice there is no particular mechanism specified for scanning the graph

– The question is in what order vertices should be added to the labeled structure and removed from it; this implementation uses push and pop operations (a stack), so the graph is processed depth-first

– The operation of this algorithm is shown in Figure 8.20 on pages 426 and 427


Page 69: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– A major issue with this implementation is the depth-first approach, which has a significant impact on its efficiency

– Since the depth-first algorithm tries to reach the sink as soon as possible, we may end up choosing the same augmenting path several times as the algorithm proceeds

– A better approach is to try and find the shortest augmenting path, which suggests a breadth-first approach

– This concept was developed by Jack Edmonds and Richard Karp in 1972

– It uses the same approach as the Ford-Fulkerson algorithm, but the labeled structure is now a queue

– This modified approach is illustrated in Figure 8.22 on page 429
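The breadth-first variant can be sketched as below. This is a simplified illustration, not the book's implementation: instead of explicit forward/backward labels it keeps a matrix of residual capacities (a standard equivalent formulation, where residual capacity on the reverse edge plays the role of flow on a backward edge), and it assumes integer capacities stored in an adjacency matrix. The function name maxFlowEdmondsKarp is hypothetical; parent and flow mirror the label(v) = (parent(v), flow(v)) pair, and the queue is named labeled after the structure mentioned above.

```cpp
#include <algorithm>
#include <limits>
#include <queue>
#include <vector>

// cap[v][u] is the capacity of edge(vu), or 0 if the edge is absent.
// 'cap' is taken by value and reused to hold residual capacities.
int maxFlowEdmondsKarp(std::vector<std::vector<int>> cap, int s, int t) {
    if (s == t) return 0;
    const int n = static_cast<int>(cap.size());
    int total = 0;
    while (true) {
        // Labeling phase: breadth-first search for a shortest augmenting path.
        std::vector<int> parent(n, -1), flow(n, 0);
        parent[s] = s;
        flow[s] = std::numeric_limits<int>::max();
        std::queue<int> labeled;
        labeled.push(s);
        while (!labeled.empty() && parent[t] == -1) {
            int v = labeled.front();
            labeled.pop();
            for (int u = 0; u < n; ++u) {
                if (parent[u] == -1 && cap[v][u] > 0) {
                    parent[u] = v;
                    flow[u] = std::min(flow[v], cap[v][u]);   // slack along the path
                    labeled.push(u);
                }
            }
        }
        if (parent[t] == -1) break;   // sink not labeled: no augmenting path is left

        // Augmenting phase: walk back from the sink and adjust residual capacities.
        for (int u = t; u != s; u = parent[u]) {
            int v = parent[u];
            cap[v][u] -= flow[t];
            cap[u][v] += flow[t];
        }
        total += flow[t];
    }
    return total;
}
```

An adjacency matrix keeps the sketch short; for sparse networks an edge-list representation would avoid scanning all n candidates per vertex.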


Page 70: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– Although this approach overcomes the problems associated with the depth-first search, it has its own inefficiencies

– When we perform a breadth-first search, a large number of vertices are labeled in each iteration in order to find the shortest path

– However, these labels are all discarded, only to be re-created when we start looking for another augmenting path

– So to address this shortcoming we turn our attention to an algorithm developed by Efim Dinic in 1970

– His approach used breadth-first search first to avoid the repetitive loops with the same paths and to make sure the depth-first search takes the shortest path

– Once that was done, the depth-first component takes over to reach the sink


Page 71: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– The algorithm makes up to |V| – 1 passes through the network, in each pass resolving all augmenting paths of the same length from source to sink

– All the augmenting paths form a layered (or level) network

– Starting from the lowest values, we first extract layered networks of length one if they exist, then length two, etc.

– This is illustrated in Figure 8.23a-b on page 431

– The augmenting paths in this layered network are all of length three; a single path of length one and paths of length two do not exist

– Breadth-first processing is used to create the layered network, and it includes only forward edges with more capacity and backward edges that already carry some flow


Page 72: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– Since the paths in a layered network are of the same length, we can avoid redundant tests of edges that are in augmenting paths

– If we cannot reach any of the neighbors of a vertex v in a layered network, the same situation will exist in that network in later tests

– Consequently, we won’t need to check the neighbors of v again

– So if we run into a dead-end node v, we mark incident edges as blocked so we can’t get to v from any direction

– Any saturated edges (those already at full capacity) are also blocked; these are shown as dashed lines in Figure 8.23

– Because of the way this works, the layered network is built from the sink to the source

sink to the source


Page 73: Chapter 8: Graphs

Networks (continued)

• Maximum Flows (continued)

– Next, the depth-first search proceeds to find as many augmenting paths as possible from the layered network

– For each of these paths, one edge will become saturated, so eventually no more augmenting paths will be found

– This process is illustrated in Figure 8.23c-f

– Once no more augmenting paths are found, a higher-level layered network is created, and the search for augmenting paths begins again, eventually stopping when no layered network can be formed

– Figure 8.23g-j shows this, as first a four-edge and then a five-edge path are created

– The algorithm itself is shown on pages 432 and 433
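The two alternating phases can be sketched as follows. This is a compact illustration, not the book's pseudocode, and it makes several simplifying assumptions: residual capacities are kept in an adjacency matrix, the layered network is represented only by a level number per vertex computed from the source (the slides describe building it from the sink), and "blocking" a dead end is done with a per-vertex nextArc index that never moves backward within one pass. The struct name Dinic and its members are hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <queue>
#include <vector>

struct Dinic {
    int n;
    std::vector<std::vector<int>> res;   // residual capacities
    std::vector<int> level, nextArc;

    explicit Dinic(const std::vector<std::vector<int>>& cap)
        : n(static_cast<int>(cap.size())), res(cap), level(n), nextArc(n) {}

    // Breadth-first phase: assign each reachable vertex its layer.
    bool buildLayers(int s, int t) {
        std::fill(level.begin(), level.end(), -1);
        level[s] = 0;
        std::queue<int> q;
        q.push(s);
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            for (int u = 0; u < n; ++u)
                if (level[u] == -1 && res[v][u] > 0) {
                    level[u] = level[v] + 1;
                    q.push(u);
                }
        }
        return level[t] != -1;
    }

    // Depth-first phase: push flow along one path that climbs one layer per edge.
    int push(int v, int t, int limit) {
        if (v == t) return limit;
        for (int& u = nextArc[v]; u < n; ++u) {   // skipped arcs stay blocked
            if (level[u] == level[v] + 1 && res[v][u] > 0) {
                int sent = push(u, t, std::min(limit, res[v][u]));
                if (sent > 0) {
                    res[v][u] -= sent;
                    res[u][v] += sent;
                    return sent;
                }
            }
        }
        return 0;   // dead end in this layered network
    }

    int maxFlow(int s, int t) {
        if (s == t) return 0;
        int total = 0;
        while (buildLayers(s, t)) {
            std::fill(nextArc.begin(), nextArc.end(), 0);
            while (int sent = push(s, t, std::numeric_limits<int>::max()))
                total += sent;
        }
        return total;
    }
};
```

Usage would be along the lines of constructing `Dinic d(cap)` from a capacity matrix and calling `d.maxFlow(source, sink)`.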


Page 74: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost

– Edges in the previous examples had two parameters, capacity and flow

– Choice of maximum flow was dictated by the algorithm used, even though there might be many maximum flows

– This is illustrated in Figure 8.24

Fig. 8.24 Two possible maximum flows for the same network

– In Figure 8.24a, the edge ab isn’t used at all, whereas in Figure 8.24b all the edges are carrying flow

– Yet our breadth-first algorithm only yields the first result, then halts


Page 75: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost (continued)

– However this may not be the best choice; not all paths of maximum flow are equally good ones

– If we look at the example as road distances between locations, then capacity and flow may not be sufficient information to properly determine a route

– For example, the distance from a to t may be quite long, while the distance from a to b and b to t may be shorter, making the second route preferable

– But distance may not be the sole criterion; there may be many other factors that influence the choice of route

– This leads us to consider a third factor in evaluating edges, the cost of moving a unit of flow through the edge


Page 76: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost (continued)

– The problem now becomes how to find the maximum flow at minimum cost

– Finding all the possible maximum flows and then comparing their costs is extremely inefficient

– What is needed is an algorithm that can find a maximum flow while also determining the minimum cost

– One possible approach is based on the following theorem:

Theorem. If f is a minimal-cost flow with the flow value v and p is the minimum cost augmenting path sending a flow of value 1 from the source to the sink, then the flow f + p is minimal and its flow value is v + 1.


Page 77: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost (continued)

– The theorem says we first start with the cheapest way to move v units through the network

– Then we find a path that is the cheapest way of sending a single unit from the source to the sink

– Combining the flow previously determined with the path just found gives the cheapest way to transmit v + 1 units

– Now if this augmenting path sends 1 unit at minimum cost, it can also send 2, 3, …, n units at minimum cost, where n is the capacity of the path

– This also suggests a process for finding the cheapest maximum flow

– Starting with all flows 0, we find the cheapest way to send 1 unit and then maximize the flow along this path


Page 78: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost (continued)

– On the next go-around, the path that sends 1 unit at least cost is determined again, as many units as this path can hold are sent, and so on

– This continues until we can’t send anything more from the source, or the sink can’t receive any more flows

– This is something like finding the shortest path, because it can be looked at as the path with minimum cost

– So we want an algorithm to find the shortest path so we can send the maximum flow through the path

– So a modification of Dijkstra’s one-to-one shortest path algorithm can be used

– The pseudocode for this procedure is shown on page 435


Page 79: Chapter 8: Graphs

Networks (continued)

• Maximum Flows of Minimum Cost (continued)

– The label for each vertex in this algorithm is the triple label(u) = (parent(u), flow(u), cost(u)) since it has to track three items

– First, it records u’s predecessor, v, which is the vertex through which s accesses u

– Then, for the path from s to u, it records the maximum flow

– Finally, it stores the cost of passing all the edges from the source to u

– cost(u), for the forward edge(vu), is the sum of accumulated costs in v plus the additional cost of pushing a unit through edge(vu)

– The unit cost of passing through backward edge(vu) is subtracted from cost(v) and stored in cost(u)

– The process is illustrated in Figure 8.25 on page 437
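The repeated "cheapest path, then saturate it" loop can be sketched as below. Note the swapped-in technique: where the slides use a modified Dijkstra search, this sketch finds the cheapest residual path with Bellman-Ford relaxation, because backward edges carry negative unit costs and Bellman-Ford handles them without further modification. It also assumes nonnegative input costs, integer capacities, and s different from t. The names Edge and minCostMaxFlow are hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

struct Edge { int from, to, cap, cost; };   // cost is per unit of flow

// Returns {maximum flow value, minimum total cost of such a flow}.
std::pair<int, int> minCostMaxFlow(int n, const std::vector<Edge>& input, int s, int t) {
    // Input edge i becomes residual edges 2i (forward) and 2i+1 (backward).
    std::vector<Edge> e;
    for (const Edge& x : input) {
        e.push_back({x.from, x.to, x.cap, x.cost});
        e.push_back({x.to, x.from, 0, -x.cost});
    }
    const int INF = std::numeric_limits<int>::max();
    int flow = 0, cost = 0;
    while (true) {
        // Cheapest way to send one unit from s to every vertex (Bellman-Ford).
        std::vector<int> dist(n, INF), via(n, -1);
        dist[s] = 0;
        for (int pass = 0; pass < n - 1; ++pass)
            for (int i = 0; i < static_cast<int>(e.size()); ++i)
                if (e[i].cap > 0 && dist[e[i].from] != INF &&
                    dist[e[i].from] + e[i].cost < dist[e[i].to]) {
                    dist[e[i].to] = dist[e[i].from] + e[i].cost;
                    via[e[i].to] = i;
                }
        if (dist[t] == INF) break;            // the sink can no longer be reached

        // Send as many units as the cheapest path can hold.
        int amount = INF;
        for (int v = t; v != s; v = e[via[v]].from)
            amount = std::min(amount, e[via[v]].cap);
        for (int v = t; v != s; v = e[via[v]].from) {
            e[via[v]].cap -= amount;
            e[via[v] ^ 1].cap += amount;      // paired residual edge
        }
        flow += amount;
        cost += amount * dist[t];
    }
    return {flow, cost};
}
```

Pairing each edge with its reverse at indices 2i and 2i+1 lets the code find the partner edge with a single XOR, which keeps the augmenting step short.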


Page 80: Chapter 8: Graphs

Matching

• A particular company has a set of jobs {a, b, c, d, e}, and a set of applicants {p, q, r, s, t}

• However, applicant p is only qualified for jobs a, b, and c; applicant q is only qualified for jobs b and d; similar restrictions exist for the other applicants

• Our problem is how to match the applicants to the jobs such that each applicant has a job and all jobs are assigned

• Numerous problems like this exist, and they are conveniently modeled using bipartite graphs

• A bipartite graph is one where the vertices can be divided into two sets, such that any edge has one vertex in each set


Page 81: Chapter 8: Graphs

Matching (continued)

• For the company, we can construct a bipartite graph where each edge relates an applicant to the job(s) they qualify for

• This is shown in Figure 8.26

Fig. 8.26 Matching five applicants with five jobs

• The task is to match each applicant with a job; this may not always be possible, so we want to match as many as possible

• For a given graph G = (V, E), a matching M is defined as a subset of edges M ⊆ E, where no two edges are adjacent


Page 82: Chapter 8: Graphs

Matching (continued)

• A maximum matching is a matching where the number of unmatched vertices is minimal

• Consider Figure 8.27

Fig. 8.27 A graph with matchings M1 = {edge(cd), edge(ef)} and M2 = {edge(cd), edge(ge), edge(fh)}

• Sets M1 = {edge(cd), edge(ef)} and M2 = {edge(cd), edge(ge), edge(fh)} are matchings, but M2 is a maximum matching

• A perfect matching is one where all vertices in the graph are paired


Page 83: Chapter 8: Graphs

Matching (continued)

• A matching problem is the task of finding a maximum matching for a given graph

• An alternating path for M is a sequence of edges that alternately belong to M and the set of edges not in M

• An augmenting path for M is an alternating path where the end vertices are not incident with any edge in matching M

• Augmenting paths have an odd number of edges, 2k + 1, where k are in M and k + 1 are not in M

• The symmetric difference of two sets, X ⊕ Y, is the set

X ⊕ Y = (X – Y) ∪ (Y – X) = (X ∪ Y) – (X ∩ Y)

• In other words, the symmetric difference of two sets is the set of elements in their union, less the intersection
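As a small illustration of the definition, the sketch below computes the symmetric difference of two edge sets with the standard library algorithm std::set_symmetric_difference. Representing each edge as a two-letter string such as "cd" is an assumption made only for this example, and the wrapper name symmetricDifference is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Edges are modeled as strings like "cd"; std::set keeps them sorted,
// which is what std::set_symmetric_difference requires of its inputs.
std::set<std::string> symmetricDifference(const std::set<std::string>& x,
                                          const std::set<std::string>& y) {
    std::set<std::string> result;
    std::set_symmetric_difference(x.begin(), x.end(), y.begin(), y.end(),
                                  std::inserter(result, result.begin()));
    return result;
}
```

With M1 = {cd, ef} and M2 = {cd, ge, fh}, the shared edge cd drops out and the result is {ef, fh, ge}.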


Page 84: Chapter 8: Graphs

Matching (continued)

• This leads us to the following lemma, the proof of which is shown on page 439:

Lemma 1. If for two matchings M and N in a graph G = (V,E) we define a set of edges M ⊕ N ⊆ E, then each connected component of the subgraph G′ = (V, M ⊕ N) is either (a) a single vertex, (b) a cycle with an even number of edges alternately in M and N, or (c) a path whose edges are alternately in M and N and such that each end vertex of the path is matched only by one of the two matchings M and N (i.e., the whole path should be considered, not just part, to cover the entire connected component)

• Figure 8.28 shows an example of this

• The symmetric difference between matching M (dashed lines) and matching N (dotted lines) contains one path and a cycle (Figure 8.28b)


Page 85: Chapter 8: Graphs

Matching (continued)

• Notice that the vertices of the graph G not incident with any edges in the symmetric difference are isolated vertices in G′

Fig. 8.28 (a) Two matchings M and N in a graph G = (V,E) and (b) the graph G′ = (V, M ⊕ N)

• Now consider the next lemma:

Lemma 2. If M is a matching and P is an augmenting path for M, then M ⊕ P is a matching of cardinality |M| + 1


Page 86: Chapter 8: Graphs

Matching (continued)

• The proof of this is on page 440; Figure 8.29 illustrates it

Fig. 8.29 (a) Augmenting path P and a matching M and (b) the matching M ⊕ P

• For matching M (dashed lines) and augmenting path P for M (c, b, f, h, g, i, j, e), the resulting matching M ⊕ P is {edge(bc), edge(ej), edge(fh), edge(gi)}

• This includes all the edges from P originally excluded from M


Page 87: Chapter 8: Graphs

Matching (continued)

• These two lemmas can then be used to construct the proof of the following important theorem:

Theorem (Claude Berge 1957). A matching M in a graph G is maximum iff there is no augmenting path connecting two unmatched vertices in G

• The proof of this theorem is shown on page 441

• This suggests an approach for finding a maximum matching

• Starting from an initial matching (possibly empty), it repeatedly finds new augmenting paths to increase the cardinality of the matching until no such path can be found

• This means we need an algorithm to determine augmenting paths

• Fortunately, this is easier to do for bipartite graphs, so we’ll start with them


Page 88: Chapter 8: Graphs

Matching (continued)

• To find an augmenting path, the breadth-first algorithm is modified to allow for always finding the shortest path

• A tree, called a Hungarian tree, is constructed with an unmatched vertex at the root

• It consists of alternating paths, and success is determined as soon as another unmatched vertex is found

• This indicates the presence of an augmenting path

• The augmenting path increases the size of the matching; once no such path can be found, the algorithm is finished

• The algorithm is shown on pages 441 and 442; an example of this is shown in Figure 8.30 on page 443
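For bipartite graphs, the augmenting-path loop can be sketched as follows. This is an illustrative simplification, not the book's findMaximumMatching(): it assumes the two vertex sets are already known ("left" vertices 0..L-1 and "right" vertices 0..R-1), stores for each left vertex the right vertices adjacent to it, and grows a breadth-first alternating tree from each unmatched left vertex in turn. The function name maximumBipartiteMatching and all variable names are hypothetical.

```cpp
#include <queue>
#include <vector>

// adj[a] lists the right-side vertices adjacent to left-side vertex a.
// Returns matchOfRight: for each right vertex, its matched left vertex or -1.
std::vector<int> maximumBipartiteMatching(const std::vector<std::vector<int>>& adj,
                                          int rightCount) {
    const int leftCount = static_cast<int>(adj.size());
    std::vector<int> matchOfLeft(leftCount, -1), matchOfRight(rightCount, -1);
    for (int root = 0; root < leftCount; ++root) {
        // Breadth-first alternating tree rooted at the unmatched vertex 'root'.
        std::vector<int> parentLeft(rightCount, -1);   // who reached each right vertex
        std::vector<bool> seen(rightCount, false);
        std::queue<int> q;
        q.push(root);
        int freeRight = -1;
        while (!q.empty() && freeRight == -1) {
            int a = q.front();
            q.pop();
            for (int b : adj[a]) {                     // edges not in the matching
                if (seen[b]) continue;
                seen[b] = true;
                parentLeft[b] = a;
                if (matchOfRight[b] == -1) { freeRight = b; break; }
                q.push(matchOfRight[b]);               // follow the matched edge back
            }
        }
        // Flip the edges along the augmenting path (symmetric difference with it).
        while (freeRight != -1) {
            int a = parentLeft[freeRight];
            int previous = matchOfLeft[a];
            matchOfLeft[a] = freeRight;
            matchOfRight[freeRight] = a;
            freeRight = previous;
        }
    }
    return matchOfRight;
}
```

Each successful search increases the matching by one edge, so the outer loop runs one search per left vertex.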


Page 89: Chapter 8: Graphs

Matching (continued)

• Stable Matching Problem

– In the example of matching applicants with jobs, any successful maximum matching was fine

– However, this is typically not possible due to preferences for jobs among applicants, and for applicants among employers

– The stable matching (also called stable marriage) problem uses two non-overlapping sets with the same cardinality, U and W

– The elements of U have a ranking list of elements of W, and those of W have a preference list of elements of U

– The ideal matching is to place elements with their highest preference, but because of possible conflicts, a stable matching is sought

– A matching is unstable if two elements rank each other higher than those with which they are currently matched; otherwise it is stable


Page 90: Chapter 8: Graphs

Matching (continued)

• Stable Matching Problem (continued)

– If we consider the two sets U = {u1, u2, u3, u4} and W = {w1, w2, w3, w4}, and the following ranking lists:

u1: w2 > w1 > w3 > w4        w1: u3 > u2 > u1 > u4
u2: w3 > w2 > w1 > w4        w2: u1 > u3 > u4 > u2
u3: w3 > w4 > w1 > w2        w3: u4 > u2 > u3 > u1
u4: w2 > w3 > w4 > w1        w4: u2 > u1 > u3 > u4

then we can see the matching (u1, w1), (u2, w2), (u3, w4), (u4, w3) is unstable because u1 and w2 prefer each other over the current match

– David Gale and Lloyd Shapley designed a matching algorithm in 1962, and also showed that a stable matching always exists

– This algorithm is shown on page 444, together with a discussion of its application to the sets and table above
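A proposal-based version of the algorithm can be sketched as below. It is a minimal illustration rather than the book's pseudocode: elements are 0-based indices (so u1 becomes 0, w2 becomes 1, and so on), both sets have the same size with complete ranking lists, and the elements of U do the proposing. The function name stableMatching and the variable names are hypothetical.

```cpp
#include <queue>
#include <vector>

// uPref[u] and wPref[w] list partners from most to least preferred (0-based).
// Returns partnerOfU, where partnerOfU[u] is the element of W matched with u.
std::vector<int> stableMatching(const std::vector<std::vector<int>>& uPref,
                                const std::vector<std::vector<int>>& wPref) {
    const int n = static_cast<int>(uPref.size());
    // rank[w][u] = position of u in w's list (smaller means more preferred).
    std::vector<std::vector<int>> rank(n, std::vector<int>(n));
    for (int w = 0; w < n; ++w)
        for (int pos = 0; pos < n; ++pos) rank[w][wPref[w][pos]] = pos;

    std::vector<int> partnerOfU(n, -1), partnerOfW(n, -1), nextChoice(n, 0);
    std::queue<int> freeU;
    for (int u = 0; u < n; ++u) freeU.push(u);

    while (!freeU.empty()) {
        int u = freeU.front();
        freeU.pop();
        int w = uPref[u][nextChoice[u]++];     // best w that u has not yet tried
        int rival = partnerOfW[w];
        if (rival == -1 || rank[w][u] < rank[w][rival]) {
            partnerOfU[u] = w;                 // w accepts u ...
            partnerOfW[w] = u;
            if (rival != -1) {                 // ... and releases its old partner
                partnerOfU[rival] = -1;
                freeU.push(rival);
            }
        } else {
            freeU.push(u);                     // rejected; u will try its next choice
        }
    }
    return partnerOfU;
}
```

Run on the ranking lists above, this sketch pairs u1 with w2, u2 with w1, u3 with w4, and u4 with w3; no pair outside the matching prefers each other to their assigned partners.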


Page 91: Chapter 8: Graphs

Matching (continued)

• Stable Matching Problem (continued)

– There is an asymmetry associated with the algorithm based on which rankings are considered more important

– As given, the algorithm favors set U

– If the roles of the two sets U and W are reversed, then the w’s will have their preferred choices immediately, instead of the u’s


Page 92: Chapter 8: Graphs

Matching (continued)

• Assignment Problem

– Finding suitable matches becomes more difficult in a weighted graph

– In these cases we want to find a matching with a maximum total weight

– This is known as the assignment problem

– If we consider complete bipartite graphs with two sets of vertices that are equal in size, then it is known as the optimal assignment problem

– An algorithm known as the Hungarian algorithm was developed by Harold Kuhn in 1955, and further investigated by James Munkres in 1957

– Kuhn named the algorithm in honor of the work done by Dénes Kőnig and Jenő Egerváry on this problem in 1931


Page 93: Chapter 8: Graphs

Matching (continued)

• Assignment Problem (continued)

– The algorithm is shown on pages 445 and 446

– An example of its application is shown in Figure 8.31, together with a detailed treatment of its application on pages 446 and 447


Page 94: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs

– The algorithm findMaximumMatching() (pages 441 and 442) is not general enough to correctly handle nonbipartite graphs

– If we consider the graph in Figure 8.32 and use breadth-first search to construct a tree to determine an augmenting path, we run into a problem

– Starting at vertex c, d is on an even level, e is odd, and a and f are even

– a is then expanded by adding b, and f by adding g and then i, creating an augmenting path c, d, e, f, g, i

– If i were not in the graph, however, the only augmenting path would not be detected because g, being labeled, blocks access to f and h

– A similar problem would occur if we relied on depth-first search instead


Page 95: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

Fig. 8.32 Application of the findMaximumMatching() algorithm to a nonbipartite graph

– The problem is caused by certain cycles possessing an odd number of edges

– It isn’t the odd number of edges specifically that leads to this; Figure 8.32b can be successfully processed


Page 96: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

– The type of cycle for which the problems occur is called a blossom

– A technique for determining augmenting paths for graphs with blossoms was developed by Jack Edmonds in 1961 and published in 1965

– A blossom is an alternating cycle where the first and last edges of the cycle are not in matching

– In these cycles, the first vertex is called the base of the blossom

– An alternating path of even length is called a stem; a path of length zero consisting of a single vertex is also a stem

– If a blossom has a stem whose edge in matching is incident with the base, it is called a flower


Page 97: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

– In Figure 8.32a, path c, d, e and path e are stems; cycle e, a, b, g, f, e forms a blossom with base e

– Blossoms cause problems when the potential augmenting path leads to a blossom through the base

– Depending on the edge chosen to continue the path, an augmenting path may not be derived

– If the blossom is entered through any other vertex, however, the problem is averted because only one of the two edges of the vertex can be chosen

– So the idea is to detect when a blossom is being entered through its base

– We can then temporarily remove the blossom by replacing it with a vertex and attaching to this vertex all edges connected to the blossom


Page 98: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

– At this point the search for an augmenting path continues

– If one is found and it includes a vertex representing a blossom, the blossom is re-inserted

– The path through it is then determined by going backwards from the edge that led to the blossom to an edge incident with the base

– So first, we need to detect that a blossom has been entered through its base

– The Hungarian tree in Figure 8.33a was generated using a breadth-first search on the graph of Figure 8.32a

– Trying to find neighbors of b leads us to g: because edge(ab) is in matching, only edges not in matching can be followed from b


Page 99: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

– These edges lead to vertices on an even level in the tree, but g has already been labeled and is on an odd level, signaling a blossom

– Thus, we trace paths back in the tree from g and b until we reach a common root, which is vertex e; this is the base of the blossom

– We then replace the blossom with a vertex, A, leading to the graph of Figure 8.33b

– The augmenting path search is then resumed, and continues until the path is found, which is c, d, A, h

– Then the blossom is expanded, and the path traced through the blossom

– This is done by starting from edge(hA) (now edge(hf))


Page 100: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

– That edge is not in matching, so from f only edge(fg) can be chosen so the augmenting path remains alternating

– By moving through the vertices f, g, b, a, e, the part of the augmenting path corresponding to A is determined, as seen in Figure 8.33c

– So the full augmenting path is c, d, e, a, b, g, f, h

– Once the path is processed, a new matching is determined, shown in Figure 8.33d


Page 101: Chapter 8: Graphs

Matching (continued)

• Matching in Nonbipartite Graphs (continued)

Fig. 8.33 Processing a graph with a blossom


Page 102: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs

• Eulerian Graphs

– A trail in a graph which visits every edge exactly once is called an Eulerian trail (or Eulerian path)

– Similarly, an Eulerian trail which starts and ends on the same vertex is called an Eulerian circuit or Eulerian cycle

– They were first discussed by Leonhard Euler while solving the famous Seven Bridges of Königsberg problem in 1736

– Euler proved that if every vertex of a connected graph is incident to an even number of edges, then the graph is Eulerian

– In addition, if the graph has exactly two vertices incident with an odd number of edges, it contains an Eulerian trail

– An algorithm developed by M. Fleury in 1883 is the oldest that allows us to find an Eulerian cycle if this is possible; it appears on page 450
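
The degree conditions above can be checked directly. The following is a minimal sketch, assuming the graph is connected and given as an edge list over vertices 0..n-1; the names `EulerKind` and `classifyEulerian` are hypothetical and do not come from the book.

```cpp
#include <utility>
#include <vector>

// Hypothetical result type: what kind of Eulerian traversal the degrees allow.
enum class EulerKind { Circuit, Trail, None };

// Classifies a graph by counting odd-degree vertices.
// Assumption: the graph is connected; connectivity is not checked here.
EulerKind classifyEulerian(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<int> degree(n, 0);
    for (const auto& e : edges) {
        ++degree[e.first];
        ++degree[e.second];
    }
    int odd = 0;
    for (int d : degree)
        if (d % 2 != 0) ++odd;
    if (odd == 0) return EulerKind::Circuit;  // every vertex has even degree
    if (odd == 2) return EulerKind::Trail;    // exactly two odd-degree vertices
    return EulerKind::None;
}
```

For the Seven Bridges of Königsberg (four land masses, seven bridges) all four vertices have odd degree, so the sketch reports that neither a circuit nor a trail exists.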


Page 103: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Eulerian Graphs (continued)

– Figure 8.34 shows an example of finding an Eulerian cycle

Fig. 8.34 Finding an Eulerian cycle

– A test needs to be made, before an edge is chosen, to see if that edge is a bridge in the untraversed subgraph

– If it is, it could lead to the inability to complete the path because certain vertices could become unreachable
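
The bridge test can be sketched as follows. This is an illustrative Fleury-style walk, not the book's pseudocode from page 450. It assumes a connected multigraph without loops in which every vertex has even degree, stored as a matrix of edge counts; `EdgeCounts`, `reachableCount`, `isBridgeEdge`, and `fleuryCycle` are hypothetical names.

```cpp
#include <vector>

// counts[u][v] = number of untraversed edges between u and v (assumed symmetric).
using EdgeCounts = std::vector<std::vector<int>>;

// Counts the vertices reachable from v over untraversed edges (recursive DFS).
int reachableCount(const EdgeCounts& counts, int v, std::vector<bool>& seen) {
    seen[v] = true;
    int total = 1;
    for (int u = 0; u < static_cast<int>(counts.size()); ++u)
        if (counts[v][u] > 0 && !seen[u]) total += reachableCount(counts, u, seen);
    return total;
}

// edge(vu) is a bridge if removing it shrinks the set reachable from v.
bool isBridgeEdge(EdgeCounts& counts, int v, int u) {
    std::vector<bool> seen(counts.size(), false);
    int before = reachableCount(counts, v, seen);
    --counts[v][u]; --counts[u][v];                 // remove the edge temporarily
    seen.assign(counts.size(), false);
    int after = reachableCount(counts, v, seen);
    ++counts[v][u]; ++counts[u][v];                 // restore the edge
    return after < before;
}

// Walks the graph, taking a bridge only when no other edge is available.
std::vector<int> fleuryCycle(EdgeCounts counts, int start) {
    std::vector<int> walk{start};
    int v = start;
    while (true) {
        int chosen = -1;
        for (int u = 0; u < static_cast<int>(counts.size()); ++u) {
            if (counts[v][u] == 0) continue;
            if (chosen == -1) chosen = u;           // fallback: first available edge
            if (!isBridgeEdge(counts, v, u)) { chosen = u; break; }
        }
        if (chosen == -1) break;                    // no untraversed edges remain at v
        --counts[v][chosen]; --counts[chosen][v];
        walk.push_back(chosen);
        v = chosen;
    }
    return walk;
}
```

Each bridge test here costs a full traversal, so the sketch favors clarity over speed.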


Page 104: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Eulerian Graphs (continued) – The Chinese Postman Problem

– The Chinese postman problem is to find a shortest closed path or circuit that visits every edge of a (connected) undirected graph

– Alan Goldman of the U.S. National Bureau of Standards first coined the name 'Chinese Postman Problem' for this problem, as it was originally studied by the Chinese mathematician Mei-Ku Kwan in 1962

– When the graph has an Eulerian circuit that circuit is an optimal solution

– If it doesn’t, the graph can be amplified by including each edge as many times as it appears in the postman’s walk

– If this is done, we need to construct the graph in such a way as to minimize the sum of the distances of the added edges


Page 105: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Eulerian Graphs (continued) – The Chinese Postman Problem

– First we group odd-degree vertices into pairs and add a path of new edges to the already existing path between vertices of each pair

– The problem now is to find a grouping of odd-degree vertices such that the total distance of the added paths is minimum

– An algorithm to solve this was developed by Jack Edmonds and Ellis L. Johnson in 1973, based on earlier work by Edmonds in 1965

– The pseudocode for this algorithm is shown on page 451

– The task of finding a postman tour is illustrated in Figure 8.35 on page 452

– The graph has six odd-degree vertices: c, d, f, g, h, and j
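
To make the pairing step concrete, here is a small sketch that finds the cheapest grouping of odd-degree vertices into pairs. It is not the Edmonds and Johnson algorithm: it is an exhaustive bitmask dynamic program that is only practical for a handful of vertices. The function name `minPairingCost` and the input layout are assumptions; `dist[i][j]` is taken to be the already computed shortest-path distance between the i-th and j-th odd-degree vertex.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Minimum total distance over all ways of pairing up the odd-degree vertices.
// Assumption: dist is symmetric and its size m is even and small (2^m states).
double minPairingCost(const std::vector<std::vector<double>>& dist) {
    const int m = static_cast<int>(dist.size());
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> best(1u << m, INF);   // best[mask]: cost of pairing vertices in mask
    best[0] = 0.0;
    for (unsigned mask = 0; mask < (1u << m); ++mask) {
        if (best[mask] == INF) continue;
        int i = 0;
        while (i < m && (mask & (1u << i))) ++i;   // first vertex not yet paired
        if (i == m) continue;                      // everything is paired
        for (int j = i + 1; j < m; ++j) {
            if (mask & (1u << j)) continue;
            unsigned next = mask | (1u << i) | (1u << j);
            best[next] = std::min(best[next], best[mask] + dist[i][j]);
        }
    }
    return best[(1u << m) - 1];
}
```

With six odd-degree vertices, as in the example, there are only 15 possible pairings, so exhaustive search is adequate for illustration.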


Page 106: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Eulerian Graphs (continued) – The Chinese Postman Problem

– In Figure 8.35b-c the shortest paths between all pairs of these vertices are determined

– A complete bipartite graph, H, is then found (Figure 8.35d), and an optimal assignment, M, is determined

– A matching in an initial equality subgraph is found by using the optimalAssignment() algorithm (Figure 8.35e)

– Two matchings are found (Figure 8.35f–g), and then a perfect matching (Figure 8.35h)

– Using this, we amplify the original graph by adding new edges (dashed lines in Figure 8.35i), so there are no odd-degree vertices

– Consequently, finding an Eulerian trail is possible


Page 107: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs

– A Hamiltonian path is a path in an undirected graph that visits each vertex exactly once

– A Hamiltonian cycle is a Hamiltonian path that is a cycle

– Determining whether such paths and cycles exist in graphs is the Hamiltonian path problem, which is NP-complete

– Hamiltonian graphs have no characterizing formula, but all complete graphs are Hamiltonian

– Hamiltonian paths and cycles are named after William Rowan Hamilton, who studied them in 1857

– The following theorem will prove useful in discussing Hamiltonian graphs


Page 108: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued)

Theorem (Bondy and Chvátal 1976; Ore 1960). If edge(vu) ∉ E, graph G* = (V, E ∪ {edge(vu)}) is Hamiltonian, and deg(v) + deg(u) ≥ |V|, then graph G = (V, E) is also Hamiltonian

– The proof of this is shown on page 453; the theorem essentially says that some Hamiltonian graphs can be created from others by eliminating edges

– This process leads to an algorithm where finding a Hamiltonian cycle is easy (by expanding the graph with more edges)

– Then the cycle is manipulated by adding and removing edges until a Hamiltonian cycle is found based on the edges of the original graph

– The algorithm is presented on pages 453 and 454

– Figure 8.37 on page 455 shows an example of this


Page 109: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued) – The Traveling Salesman Problem

– The traveling salesman problem consists of finding the shortest possible route that visits each city (in a set of cities) exactly once and returns to the origin city

– If the distances between each pair of cities are known, there are (n – 1)! possible routes

– The problem is then to find a minimum Hamiltonian cycle

– Many versions of this problem use the triangle inequality, dist(vivj) ≤ dist(vivk) + dist(vkvj)

– A possibility is to add to an already constructed path v1, …, vj a vertex vj+1 that is closest to vj

– The problem is the last edge added may be as long as the total distance of the remaining edges
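
The closest-vertex strategy can be sketched in a few lines. This is an illustrative version, assuming a complete graph given as a symmetric distance matrix; the names `nearestNeighborTour` and `tourLength` are hypothetical.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Builds a tour by repeatedly moving to the closest unvisited city.
// Assumption: dist is a symmetric n x n matrix of a complete graph.
std::vector<int> nearestNeighborTour(const std::vector<std::vector<double>>& dist, int start) {
    const int n = static_cast<int>(dist.size());
    std::vector<bool> visited(n, false);
    std::vector<int> tour{start};
    visited[start] = true;
    int current = start;
    for (int step = 1; step < n; ++step) {
        int next = -1;
        double best = std::numeric_limits<double>::infinity();
        for (int u = 0; u < n; ++u)
            if (!visited[u] && dist[current][u] < best) { best = dist[current][u]; next = u; }
        visited[next] = true;
        tour.push_back(next);
        current = next;
    }
    return tour;   // the tour closes by returning from the last city to start
}

// Length of the closed tour, including the final edge back to the first city.
double tourLength(const std::vector<std::vector<double>>& dist, const std::vector<int>& tour) {
    double total = 0.0;
    for (std::size_t i = 0; i < tour.size(); ++i)
        total += dist[tour[i]][tour[(i + 1) % tour.size()]];
    return total;
}
```

For four cities placed on a line at positions 0, 1, 3, and 6, starting from the first city, the sketch visits them left to right; the closing edge then has length 6, equal to the sum 1 + 2 + 3 of all the other edges, which is the weakness described above.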


Page 110: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued) – The Traveling Salesman Problem

– Another possibility uses a minimum spanning tree

– The length of the tree is defined to be the sum of the lengths of all the edges of the tree

– Since removing an edge from the tour creates a spanning tree, the length of the tour cannot be less than the length of the minimum spanning tree

– Also, each edge of the tree is traversed twice in a depth-first search, so the length of the tour is at most twice the length of the tree

– However, a path that includes each edge twice includes some vertices twice, and each vertex should be included only once


Page 111: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued) – The Traveling Salesman Problem

– So if a vertex is already in such a path, its second occurrence is eliminated, and the path contracted

– This shortens the length of the path due to the triangle inequality

– For example, Figure 8.38b (pages 456 and 457) shows the minimum spanning tree for the graph that connects the cities a through h in Figure 8.38a

– Depth-first search yields 8.38c, and applying the triangle inequality repeatedly (Figure 8.38c-i) transforms the path into the path in 8.38i

– This final path can be obtained directly from the minimum spanning tree in Figure 8.38b using preorder traversal
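
A sketch of the spanning-tree approach follows, assuming a complete graph given as a symmetric distance matrix that satisfies the triangle inequality. The tree is built with a simple quadratic Jarník-Prim loop, and the tour is the preorder sequence of the tree; the name `mstPreorderTour` is hypothetical, and ties are broken by vertex index, which may differ from the figure.

```cpp
#include <limits>
#include <vector>

// Returns the cities in preorder of a minimum spanning tree rooted at root.
// Assumption: dist is a symmetric n x n matrix obeying the triangle inequality.
std::vector<int> mstPreorderTour(const std::vector<std::vector<double>>& dist, int root) {
    const int n = static_cast<int>(dist.size());
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> key(n, INF);      // cheapest known edge into the tree
    std::vector<int> parent(n, -1);
    std::vector<bool> inTree(n, false);
    std::vector<std::vector<int>> children(n);
    key[root] = 0.0;
    for (int step = 0; step < n; ++step) {
        int v = -1;                       // pick the closest vertex outside the tree
        for (int u = 0; u < n; ++u)
            if (!inTree[u] && (v == -1 || key[u] < key[v])) v = u;
        inTree[v] = true;
        if (parent[v] != -1) children[parent[v]].push_back(v);
        for (int u = 0; u < n; ++u)
            if (!inTree[u] && dist[v][u] < key[u]) { key[u] = dist[v][u]; parent[u] = v; }
    }
    std::vector<int> tour;
    std::vector<int> stack{root};         // iterative preorder traversal
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        tour.push_back(v);
        for (auto it = children[v].rbegin(); it != children[v].rend(); ++it)
            stack.push_back(*it);
    }
    return tour;                          // return to root to close the tour
}
```

Skipping repeated vertices is implicit in the preorder listing, so no separate contraction step is needed.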


Page 112: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued) – The Traveling Salesman Problem

– The tour in Figure 8.38i is obtained by considering a as the root of the tree, so the cities are visited in the order a, d, e, f, h, g, c, b, from which we return to a

– This tour is minimum, which won’t always be the case

– For example, if d is considered to be the root, the algorithm yields the path in Figure 8.38j, clearly not minimal

– In another version of the algorithm, a tour is extended by adding to it the closest city

– Since the tour is kept in one piece, it resembles a method developed by Vojtech Jarnik in 1930 (and separately by Robert C. Prim in 1957)


Page 113: Chapter 8: Graphs

Eulerian and Hamiltonian Graphs (continued)

• Hamiltonian Graphs (continued) – The Traveling Salesman Problem

– This algorithm is shown on page 458

– An example of its application is shown in Figure 8.39 on pages 458 and 459


Page 114: Chapter 8: Graphs

Graph Coloring

• Occasionally, we want to determine the minimum number of nonoverlapping sets of vertices, where the vertices in each set are independent

• By this we mean that the vertices are not connected by any edge

• For example, we may have several tasks to be performed by several people

• If one task can be performed by one person at one time, the scheduling must be such that this can be done

• We can let the tasks be represented by vertices of a graph, and join with an edge two tasks that require the same person


Page 115: Chapter 8: Graphs

Graph Coloring (continued)

• Then we try to construct the minimum number of sets of independent tasks

• Because all the tasks in a given set can be done concurrently, the number of sets indicates the number of time slots needed

• As a variation of this, we could join with an edge those tasks that cannot be performed concurrently

• As before, the independent sets indicate the tasks that can be performed at the same time

• However, in this case the minimum number of sets indicates the minimum number of people needed to perform the tasks

• In general, two vertices are joined by an edge if they cannot be members of the same class


Page 116: Chapter 8: Graphs

Graph Coloring (continued)

• We can restate the problem to say that vertices of a graph are assigned colors so that vertices joined by an edge are different colors

• So the task amounts to coming up with a graph coloring using a minimum number of colors

• More formally, given a set of colors, C, we determine a function f : V → C so that if edge(vw) exists, f(v) ≠ f(w) and C is of minimum cardinality

• The chromatic number of a graph G is the minimum number of colors needed to color the graph, denoted χ(G)

• A graph where k = χ(G) is called k-colorable


Page 117: Chapter 8: Graphs

Graph Coloring (continued)

• There may be many sets of minimum colors; no general formula exists for the chromatic number of an arbitrary graph

• There are some special cases, however:

– A complete graph, Kn, has the chromatic number χ(Kn) = n

– For a cycle with an even number of edges, C2n, χ(C2n) = 2

– For a cycle with an odd number of edges, C2n+1, χ(C2n+1) = 3

– For a bipartite graph, G, χ(G) ≤ 2

• The determination of a graph’s chromatic number is an NP-complete problem

• Consequently, techniques need to be used that can color a graph with a number of colors close to the chromatic number


Page 118: Chapter 8: Graphs

Graph Coloring (continued)

• Sequential coloring is an approach that establishes sequences of vertices and colors before coloring the vertices

• Then the next vertex in sequence is colored with the lowest number possible

• This algorithm appears on page 460

• The algorithm does not specify any ordering criteria for the vertices (order of colors makes no difference)

• One possibility is to use the indices assigned to the vertices before the algorithm is executed, as shown in Figure 8.40b

• This can result in a wide disparity between the coloring and the chromatic number, however
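
A minimal sketch of sequential coloring is shown here; it is not the pseudocode from page 460. It assumes adjacency lists over vertices 0..n-1 and a caller-supplied vertex order, with colors numbered from 1; the name `sequentialColoring` is hypothetical.

```cpp
#include <vector>

// Colors vertices in the given order, giving each the lowest color number
// not already used by one of its neighbors. Color 0 means "not yet colored".
std::vector<int> sequentialColoring(const std::vector<std::vector<int>>& adj,
                                    const std::vector<int>& order) {
    std::vector<int> color(adj.size(), 0);
    for (int v : order) {
        std::vector<bool> used(adj.size() + 2, false);
        for (int u : adj[v]) used[color[u]] = true;   // colors taken by neighbors
        int c = 1;
        while (used[c]) ++c;                          // lowest free color
        color[v] = c;
    }
    return color;
}
```

On a cycle with five vertices taken in index order, the sketch assigns colors 1, 2, 1, 2, 3, which matches the chromatic number of an odd cycle; other graphs and orders can produce more colors than necessary.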


Page 119: Chapter 8: Graphs

Graph Coloring (continued)

Fig. 8.40 (a) A graph used for coloring; (b) colors assigned to vertices with the sequential coloring algorithm that orders vertices by index number; (c) vertices are put in the largest first sequence; (d) graph coloring obtained with the Brélaz algorithm


Page 120: Chapter 8: Graphs

Graph Coloring (continued)

• A theorem due to Dominic Welsh and M. B. Powell (1967) will be of use (the proof is on page 460)

Theorem: For the sequential coloring algorithm, the number of colors needed to color the graph satisfies χ(G) ≤ max over i of min(i, deg(vi) + 1)

• Applying this to the graph of Figure 8.40a, we have χ(G) ≤ max(min(1,4), min(2,4), min(3,3), min(4,3), min(5,3), min(6,5), min(7,6), min(8,4)) = max(1, 2, 3, 3, 3, 5, 6, 4) = 6

• The theorem suggests that vertices of higher degree be placed first, so the min value is their position in the sequence

• Vertices of lower degree get placed last, so their minimum value is the degree of the vertex

• This leads to the largest first approach, where the vertices are ordered in descending order by degree
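
The bound in the theorem is easy to evaluate for any vertex order. The sketch below assumes the degrees are listed in the order in which the vertices will be colored; the name `sequentialColoringBound` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Evaluates max over i of min(i, deg(v_i) + 1), with positions i counted from 1.
int sequentialColoringBound(const std::vector<int>& degreesInOrder) {
    int bound = 0;
    for (std::size_t i = 0; i < degreesInOrder.size(); ++i) {
        int position = static_cast<int>(i) + 1;
        bound = std::max(bound, std::min(position, degreesInOrder[i] + 1));
    }
    return bound;
}
```

For the degrees 3, 3, 2, 2, 2, 4, 5, 3 implied by the computation above, the index order gives a bound of 6, while sorting the same degrees in descending order (5, 4, 3, 3, 3, 2, 2, 2) lowers it to 4.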


Page 121: Chapter 8: Graphs

Graph Coloring (continued)

• Doing it this way gives us the order v7, v6, v1, v2, v8, v3, v4, v5, where v7 gets colored first, as seen in Figure 8.40c

• This also gives us a better sense of the chromatic number, because with this ordering χ(G) ≤ 4

• Although this ordering method uses a single criterion, there is no restriction on the number of criteria that can be applied

• This can be helpful in breaking ties, since in our example, two vertices with the same degree are chosen by their index order

• In 1979, Daniel Brélaz proposed an algorithm where the saturation degree of a vertex (the number of different colors used to color the vertex’s neighbors) is used


Page 122: Chapter 8: Graphs

Graph Coloring (continued)

• If a tie occurs, it is broken by choosing the vertex with the largest uncolored degree, which is the number of uncolored vertices adjacent to the vertex

• This algorithm is on page 462, and is applied in Figure 8.40d

• First we choose v7 because it has the highest degree; then vertices 1, 3, 4, 6 and 8 have their saturations set to 1

• From these, v6 is chosen, since it has the most uncolored neighbors

• The saturations of vertices 1 and 8 are changed to 2, and since their saturations and numbers of uncolored neighbors are equal, we rely on the index to select v1; the remainder are as shown in the figure


Page 123: Chapter 8: Graphs

NP-Complete Problems in Graph Theory

• The Clique Problem

– A clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge

– The clique problem is to determine, for some graph G, whether or not it contains a clique Km for some integer m

– The problem is in NP because we can check in polynomial time whether a set of m vertices forming a subgraph is a clique

– To show it is NP-complete, we can use the 3-satisfiability problem and reduce it to the clique problem

– The reduction is performed by showing that for a Boolean expression BE in CNF with three variables (or their negations) per alternative, we can construct a graph such that the expression is satisfiable if and only if there is a clique of m vertices in the graph
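
The polynomial-time check that places the clique problem in NP can be sketched as below. The adjacency-matrix representation and the name `isClique` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Returns true if every pair of distinct vertices in subset is joined by an edge.
// For a subset of m vertices this inspects m(m - 1)/2 pairs.
bool isClique(const std::vector<std::vector<bool>>& adj, const std::vector<int>& subset) {
    for (std::size_t i = 0; i < subset.size(); ++i)
        for (std::size_t j = i + 1; j < subset.size(); ++j)
            if (!adj[subset[i]][subset[j]]) return false;
    return true;
}
```

Verifying a proposed clique is cheap; the difficulty lies in finding one among the exponentially many candidate subsets.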


Page 124: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Clique Problem (continued)

– We will let m be the number of alternatives in BE, such that we have BE = A1 ∧ A2 ∧ … ∧ Am

– Each Ai = (p ∨ q ∨ r), where p, q, and r are the three Boolean variables or their negations

– A graph is constructed where the vertices represent all the variables and their negations found in BE

– An edge will join two vertices if they are not complements and they are in different alternatives

– The expression BE = (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y) corresponds to the graph in Figure 8.41
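
The construction can be sketched as follows. The encoding is an assumption made for this example: variables are numbered from 1, a positive integer k stands for variable k and -k for its negation, and vertex 3i + j represents the j-th entry of alternative i. The names `Clause3` and `cliqueGraphFrom3Cnf` are hypothetical.

```cpp
#include <array>
#include <vector>

// One alternative with three entries; k > 0 is variable k, k < 0 is its negation.
using Clause3 = std::array<int, 3>;

// Builds the adjacency matrix of the graph used in the reduction:
// two vertices are joined if they lie in different alternatives and
// are not complements of each other.
std::vector<std::vector<bool>> cliqueGraphFrom3Cnf(const std::vector<Clause3>& clauses) {
    const int n = 3 * static_cast<int>(clauses.size());
    std::vector<std::vector<bool>> adj(n, std::vector<bool>(n, false));
    for (int a = 0; a < n; ++a)
        for (int b = 0; b < n; ++b) {
            bool differentAlternatives = (a / 3) != (b / 3);
            bool complements = clauses[a / 3][a % 3] == -clauses[b / 3][b % 3];
            adj[a][b] = differentAlternatives && !complements;
        }
    return adj;
}
```

With x = 1, y = 2, z = 3, and w = 4, the expression above becomes the alternatives {1, 2, -3}, {1, -2, -3}, and {4, -1, -2}; the vertices for ¬z, ¬z, and ¬y (indices 2, 5, and 8) are then pairwise adjacent, so they form a 3-clique.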


Page 125: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Clique Problem (continued)

Fig. 8.41 A graph corresponding to the Boolean expression (x ∨ y ∨ ¬z) ∧ (x ∨ ¬y ∨ ¬z) ∧ (w ∨ ¬x ∨ ¬y)

– An edge between variables represents the possibility that both variables are true at the same time


Page 126: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Clique Problem (continued)

– An m-clique represents the possibility that a variable from each alternative is true, making the BE true

– Each triangle in Figure 8.41 represents a 3-clique

– This way, if BE is satisfiable, an m-clique exists, and if an m-clique exists, BE is satisfiable

– So the satisfiability problem is reduced to the clique problem

– Since the satisfiability problem is NP-complete, the clique problem is NP-complete as well


Page 127: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem

– The 3-colorability problem is the question of whether or not a graph can be colored with three colors

– As with the clique problem, we’ll show this is NP-complete by reducing the 3-satisfiability problem to it

– The problem is in NP because we can come up with a coloring of the vertices in three colors and check that the coloring is correct in quadratic time

– We will use an auxiliary 9-subgraph to reduce the 3-satisfiability problem to the 3-colorability problem

– The 9-subgraph takes 3 vertices from an existing graph and adds 6 new vertices and 10 edges, as can be seen in Figure 8.42a
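
The quadratic-time check of a proposed coloring can be sketched as below, assuming an adjacency matrix and colors encoded as 0, 1, and 2; the name `isProperThreeColoring` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns true if color uses only the values 0..2 and no edge joins two
// vertices of the same color. The double loop inspects each vertex pair once.
bool isProperThreeColoring(const std::vector<std::vector<bool>>& adj,
                           const std::vector<int>& color) {
    for (std::size_t v = 0; v < adj.size(); ++v) {
        if (color[v] < 0 || color[v] > 2) return false;
        for (std::size_t u = v + 1; u < adj.size(); ++u)
            if (adj[v][u] && color[v] == color[u]) return false;
    }
    return true;
}
```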


Page 128: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

Fig. 8.42 (a) A 9-subgraph; (b) a graph corresponding to the Boolean expression (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z)


Page 129: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

– Now, consider the set of three colors {f, t, n} corresponding to (fuchsia/false, turquoise/true, nasturtium/neutral) used to color the graph

– The following lemma will help us in demonstrating the reducibility of the 3-satisfiability problem to the 3-colorability problem

Lemma. 1) If all three vertices, v1, v2, and v3, of a 9-subgraph are colored with f, then vertex v4 must also be colored with f to have the 9-subgraph colored correctly. 2) If only colors t and f can be used to color vertices v1, v2, and v3 of a 9-subgraph, and at least one is colored with t, then vertex v4 can be colored with t


Page 130: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

– Now the graph for the given Boolean expression BE of k alternatives is constructed in the following way

– There are two special vertices, a and b, and edge(ab) in the graph; also there is a vertex for each variable in BE and for the negation of each variable

– The graph includes edge(ax), edge(a(¬x)), and edge(x(¬x)) for each vertex, x, and its negation, ¬x

– Now, for each alternative p ∨ q ∨ r included in BE, the graph has a 9-subgraph whose vertices v1, v2, and v3 correspond to the three Boolean variables or their negations p, q, and r

– Lastly, the graph includes edge(v4b) for each 9-subgraph


Page 131: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

– The graph corresponding to (¬w ∨ x ∨ y) ∧ (¬w ∨ ¬y ∨ z) ∧ (w ∨ ¬y ∨ ¬z) is shown in Figure 8.42b

– Now we can claim that if a Boolean expression BE is satisfiable, the graph corresponding to it is 3-colorable

– For every variable x in BE, if x is true we set color(x) = t and color(¬x) = f; otherwise color(x) = f and color(¬x) = t

– If each alternative in BE is satisfiable, then the Boolean expression is satisfiable

– This takes place when at least one variable or its negation is true in each alternative


Page 132: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

– Since each neighbor of a has color t or f, and since at least one of the three vertices of each 9-subgraph has color t, each 9-subgraph is 3-colorable

– Thus color(v4) = t, and the entire graph is 3-colorable by setting color(a) = n and color(b) = f

– Now, suppose a graph as in Figure 8.42b is 3-colorable and that color(a) = n and color(b) = f

– Since color(a) = n, the neighbors of a have color f or t, and this can be interpreted as the Boolean variable associated with the vertices being true or false


Page 133: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The 3-Colorability Problem (continued)

– Only if all three vertices of any 9-subgraph have color f can vertex v4 have color f, but this would conflict with color f of vertex b

– So no 9-subgraph’s vertices can all have color f; one must be t

– As a consequence, the alternative corresponding to each 9-subgraph is true, so the entire Boolean expression is satisfiable


Page 134: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Vertex Cover Problem

– A vertex cover of a graph is a set of vertices such that each edge of the graph is incident to at least one vertex of the set

– In this way the vertices in the set cover all the edges

– The problem of determining whether a graph, G, has a vertex cover containing at most k vertices for some integer k is NP-complete

– This problem is in NP because a solution can be guessed and checked in polynomial time

– To show it is NP-complete, we’ll reduce the clique problem to the vertex cover problem


Page 135: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Vertex Cover Problem (continued)

– The first thing to do is define a complement graph of G that has the same vertices, but whose connections are edges not in G

– The reduction algorithm converts a graph G with a (|V| – k)-clique into its complement with a vertex cover of size k

– If C = (VC, EC) is a clique in G, vertices from V – VC cover all the edges in the complement, because it has no edges with both vertices in VC

– As a result, V – VC is a vertex cover in the complement graph

– Figure 8.43a shows a graph with a clique and 8.43b shows a complement graph with a vertex cover

– Now suppose there is a vertex cover W for the complement graph


Page 136: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Vertex Cover Problem (continued)

– If W contains none of the endpoints of an edge, that edge must be in G, meaning the latter endpoints are in V – W

– Therefore, VC = V – W forms a clique

– As a result, this proves a positive answer to the clique problem is a positive answer to the vertex cover problem through the conversion

– And since the former is NP-complete, so is the latter

Fig. 8.43 (a) A graph with a clique; (b) a complement graph
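
The relationship between a clique in G and a vertex cover in the complement can be tried out on small graphs. This sketch assumes simple undirected graphs stored as adjacency matrices; `AdjMatrix`, `complementGraph`, and `coversAllEdges` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

using AdjMatrix = std::vector<std::vector<bool>>;

// Complement graph: same vertices, and an edge exactly where g has none.
AdjMatrix complementGraph(const AdjMatrix& g) {
    AdjMatrix c(g.size(), std::vector<bool>(g.size(), false));
    for (std::size_t v = 0; v < g.size(); ++v)
        for (std::size_t u = 0; u < g.size(); ++u)
            c[v][u] = (v != u) && !g[v][u];
    return c;
}

// Returns true if every edge of g has at least one endpoint marked in inCover.
bool coversAllEdges(const AdjMatrix& g, const std::vector<bool>& inCover) {
    for (std::size_t v = 0; v < g.size(); ++v)
        for (std::size_t u = v + 1; u < g.size(); ++u)
            if (g[v][u] && !inCover[v] && !inCover[u]) return false;
    return true;
}
```

For a graph on four vertices with clique {0, 1, 2} and one extra edge between 2 and 3, the complement has only the edges between 0 and 3 and between 1 and 3, and the single remaining vertex 3 covers both.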


Page 137: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem

– That the Hamiltonian cycle problem is NP-complete can be shown by reducing the vertex cover problem to the Hamiltonian cycle problem

– We will make use of an auxiliary 12-subgraph, as shown in Figure 8.44a

– Each edge(vu) of the graph G is converted into a 12-subgraph so that one side of the subgraph (vertices a and b) corresponds to a vertex v of G and the other side (vertices c and d) corresponds to vertex u

– After entering a side of the 12-subgraph at vertex a, we can go through all 12 vertices in order a, c, d, b and exit at b on the same side

– We can also go directly from a to b, and if there is a Hamiltonian circuit in the entire graph, vertices c and d are traversed in another visit of the 12-subgraph


Page 138: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

Fig. 8.44 (a) A 12-subgraph; (b) a graph G and (c) its transformation, graph GH


Page 139: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

– Any other path through the 12-subgraph would render building a Hamiltonian cycle of the entire graph impossible

– Now, assuming we have a graph G, we can proceed to build another graph, GH, in the following manner

– We first create a set of vertices u1, u2, …, uk, where the value k is the parameter that corresponds to the vertex cover problem for graph G

– Next, for each edge of G, we create a 12-subgraph, and those 12-subgraphs associated with a vertex v are connected together on the sides corresponding to v

– Finally, the endpoint of the string of these 12-subgraphs is connected to the vertices u1, u2, …, uk


Page 140: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

– The result of this transformation from G to GH for k = 3 is shown in Figure 8.44b-c

– Figure 8.44c only shows some of the connections, to avoid clutter; the small segments from the other vertices indicate other connections

– Now the claim is that there is a Hamiltonian cycle in GH if and only if there is a vertex cover of size k in G

– We’ll start by assuming there is a vertex cover in G, designated by the set W = {v1, v2, …, vk}

– Next, we’ll assert there is a Hamiltonian cycle in GH, which is formed by the following procedure


Page 141: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

– Starting at u1, we go through the sides of 12-subgraphs corresponding to v1

– We will go through all the 12 vertices of a particular 12-subgraph if the other side of it does not correspond to a vertex in set W

– Otherwise we go straight through the 12-subgraph, which means we won’t traverse 6 of the vertices corresponding to a vertex w

– However, we will traverse them when we process that part of the Hamiltonian cycle corresponding to w

– Once we reach the end of the string of 12-subgraphs, we go to vertex u2 and repeat this process for vertex v2, etc.

– For the last vertex uk, we process vk and end the path at u1


Page 142: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

– The result of this is the creation of a Hamiltonian cycle

– The thick line in Figure 8.44c represents the part of the Hamiltonian cycle matching v1 that starts at u1 and ends at u2

– Because the cover in this case is W = {v1, v2, v6}, this processing continues at u2 and ends at u3 for v2, and then for v6 from u3 to u1

– Conversely, if GH has a Hamiltonian cycle, the cycle includes k strings of 12-subgraphs whose subpaths correspond to k vertices in G that form a cover

– Consequently, we have shown the reducibility of the vertex cover problem to the Hamiltonian cycle problem, and since the former is NP-complete, so is the latter


Page 143: Chapter 8: Graphs

NP-Complete Problems in Graph Theory (continued)

• The Hamiltonian Cycle Problem (continued)

– As an afterthought, now consider the traveling salesman problem

– Given a graph with a distance assigned to each edge, we try to identify a cycle with a total distance not greater than some integer, k

– We can demonstrate this problem is NP-complete by reducing the Hamiltonian cycle problem to it
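
The slides do not spell out this reduction, so the sketch below shows one standard construction as an assumption: every edge of the original graph gets distance 1, every missing edge gets distance 2, and the bound k is set to the number of vertices. A tour of total distance at most k then exists exactly when the original graph has a Hamiltonian cycle. The names `TspInstance`, `hamiltonianCycleToTsp`, and `closedTourLength` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// A decision instance of the traveling salesman problem:
// is there a tour through all cities of total distance at most bound?
struct TspInstance {
    std::vector<std::vector<int>> dist;
    int bound;
};

// Builds the instance from the adjacency matrix of an undirected graph.
TspInstance hamiltonianCycleToTsp(const std::vector<std::vector<bool>>& adj) {
    const int n = static_cast<int>(adj.size());
    TspInstance inst{std::vector<std::vector<int>>(n, std::vector<int>(n, 0)), n};
    for (int v = 0; v < n; ++v)
        for (int u = 0; u < n; ++u)
            if (v != u) inst.dist[v][u] = adj[v][u] ? 1 : 2;   // missing edges cost more
    return inst;
}

// Total distance of a closed tour that returns to its first city.
int closedTourLength(const TspInstance& inst, const std::vector<int>& tour) {
    int total = 0;
    for (std::size_t i = 0; i < tour.size(); ++i)
        total += inst.dist[tour[i]][tour[(i + 1) % tour.size()]];
    return total;
}
```

For a cycle on four vertices, the tour 0, 1, 2, 3 has length 4 and meets the bound, while a tour that uses a missing edge, such as 0, 2, 1, 3, exceeds it.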
