Subsequently, Gabow [163] rediscovered this algorithm. Knuth [259] was the first to give a linear-time algorithm for topological sorting.

1 We distinguish between gray and black vertices to help us understand how breadth-first search operates. In fact, as Exercise 20.2-3 shows, we get the same result even if we do not distinguish between gray and black vertices.

2 Chapters 22 and 23 generalize shortest paths to weighted graphs, in which every edge has a real-valued weight and the weight of a path is the sum of the weights of its constituent edges.

The graphs considered in the present chapter are unweighted or, equivalently, all edges have unit weight.

3 It may seem arbitrary that breadth-first search is limited to only one source whereas depth-first search may search from multiple sources. Although conceptually, breadth-first search could proceed from multiple sources and depth-first search could be limited to one source, our approach reflects how the results of these searches are typically used. Breadth-first search usually serves to find shortest-path distances and the associated predecessor subgraph from a given source. Depth-first search is often a subroutine in another algorithm, as we’ll see later in this chapter.


21 Minimum Spanning Trees

Electronic circuit designs often need to make the pins of several components electrically equivalent by wiring them together. To interconnect a set of n pins, the designer can use an arrangement of n − 1 wires, each connecting two pins. Of all such arrangements, the one that uses the least amount of wire is usually the most desirable.

To model this wiring problem, use a connected, undirected graph G = (V, E), where V is the set of pins, E is the set of possible interconnections between pairs of pins, and for each edge (u, v) ∈ E, a weight w(u, v) specifies the cost (amount of wire needed) to connect u and v. The goal is to find an acyclic subset T ⊆ E that connects all of the vertices and whose total weight

w(T) = Σ_{(u, v) ∈ T} w(u, v)

is minimized. Since T is acyclic and connects all of the vertices, it must form a tree, which we call a spanning tree since it “spans” the graph G.

We call the problem of determining the tree T the minimum-spanning-tree problem.1 Figure 21.1 shows an example of a connected graph and a minimum spanning tree.

This chapter studies two ways to solve the minimum-spanning-tree problem. Kruskal’s algorithm and Prim’s algorithm both run in O(E lg V) time. Prim’s algorithm achieves this bound by using a binary heap as a priority queue. By using Fibonacci heaps instead (see page 478), Prim’s algorithm runs in O(E + V lg V) time. This bound is better than O(E lg V) whenever |E| grows asymptotically faster than |V|.


Figure 21.1 A minimum spanning tree for a connected graph. The weights on edges are shown, and the blue edges form a minimum spanning tree. The total weight of the tree shown is 37. This minimum spanning tree is not unique: removing the edge (b, c) and replacing it with the edge (a, h) yields another spanning tree with weight 37.

The two algorithms are greedy algorithms, as described in Chapter 15. Each step of a greedy algorithm must make one of several possible choices. The greedy strategy advocates making the choice that is the best at the moment. Such a strategy does not generally guarantee that it always finds globally optimal solutions to problems. For the minimum-spanning-tree problem, however, we can prove that certain greedy strategies do yield a spanning tree with minimum weight. Although you can read this chapter independently of Chapter 15, the greedy methods presented here are a classic application of the theoretical notions introduced there.

Section 21.1 introduces a “generic” minimum-spanning-tree method that grows a spanning tree by adding one edge at a time. Section 21.2 gives two algorithms that implement the generic method. The first algorithm, due to Kruskal, is similar to the connected-components algorithm from Section 19.1. The second, due to Prim, resembles Dijkstra’s shortest-paths algorithm (Section 22.3).

Because a tree is a type of graph, in order to be precise we must define a tree in terms of not just its edges, but its vertices as well. Because this chapter focuses on trees in terms of their edges, we’ll implicitly understand that the vertices of a tree T are those that some edge of T is incident on.

21.1 Growing a minimum spanning tree

The input to the minimum-spanning-tree problem is a connected, undirected graph G = (V, E) with a weight function w : E → ℝ. The goal is to find a minimum spanning tree for G. The two algorithms considered in this chapter use a greedy approach to the problem, although they differ in how they apply this approach.

This greedy strategy is captured by the procedure GENERIC-MST on the facing page, which grows the minimum spanning tree one edge at a time. The generic method manages a set A of edges, maintaining the following loop invariant:

Prior to each iteration, A is a subset of some minimum spanning tree.

GENERIC-MST(G, w)
1  A = Ø
2  while A does not form a spanning tree
3      find an edge (u, v) that is safe for A
4      A = A ∪ {(u, v)}
5  return A

Each step determines an edge (u, v) that the procedure can add to A without violating this invariant, in the sense that A ∪ {(u, v)} is also a subset of a minimum spanning tree. We call such an edge a safe edge for A, since it can be added safely to A while maintaining the invariant.

This generic algorithm uses the loop invariant as follows:

Initialization: After line 1, the set A trivially satisfies the loop invariant.

Maintenance: The loop in lines 2–4 maintains the invariant by adding only safe edges.

Termination: All edges added to A belong to a minimum spanning tree, and the loop must terminate by the time it has considered all edges. Therefore, the set A returned in line 5 must be a minimum spanning tree.

The tricky part is, of course, finding a safe edge in line 3. One must exist, since when line 3 is executed, the invariant dictates that there is a spanning tree T such that A ⊆ T. Within the while loop body, A must be a proper subset of T, and therefore there must be an edge (u, v) ∈ T such that (u, v) ∉ A and (u, v) is safe for A.

The remainder of this section provides a rule (Theorem 21.1) for recognizing safe edges. The next section describes two algorithms that use this rule to find safe edges efficiently.

We first need some definitions. A cut (S, V − S) of an undirected graph G = (V, E) is a partition of V. Figure 21.2 illustrates this notion. We say that an edge (u, v) ∈ E crosses the cut (S, V − S) if one of its endpoints belongs to S and the other belongs to V − S. A cut respects a set A of edges if no edge in A crosses the cut. An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut. There can be more than one light edge crossing a cut in the case of ties. More generally, we say that an edge is a light edge satisfying a given property if its weight is the minimum of any edge satisfying the property.
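These definitions translate directly into code. The following Python sketch (the helper names `crosses`, `respects`, and `light_edges` are ours, not the text’s) checks whether an edge crosses a cut (S, V − S), whether a cut respects an edge set A, and which crossing edges are light:

```python
def crosses(edge, S):
    """True if exactly one endpoint of the edge lies in the vertex set S."""
    u, v = edge
    return (u in S) != (v in S)

def respects(A, S):
    """The cut (S, V - S) respects an edge set A if no edge of A crosses it."""
    return not any(crosses(e, S) for e in A)

def light_edges(edges, w, S):
    """All light edges crossing the cut (S, V - S); ties yield several."""
    crossing = [e for e in edges if crosses(e, S)]
    best = min(w[e] for e in crossing)
    return [e for e in crossing if w[e] == best]
```

On a triangle with weights w(a, b) = 1, w(b, c) = 2, w(a, c) = 4, the cut ({a}, {b, c}) is crossed by (a, b) and (a, c), and (a, b) is its unique light edge.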

The following theorem gives the rule for recognizing safe edges.

Theorem 21.1
Let G = (V, E) be a connected, undirected graph with a real-valued weight function w defined on E. Let A be a subset of E that is included in some minimum spanning tree for G, let (S, V − S) be any cut of G that respects A, and let (u, v) be a light edge crossing (S, V − S). Then, edge (u, v) is safe for A.

Figure 21.2 A cut (S, V − S) of the graph from Figure 21.1. Orange vertices belong to the set S, and tan vertices belong to V − S. The edges crossing the cut are those connecting tan vertices with orange vertices. The edge (d, c) is the unique light edge crossing the cut. Blue edges form a subset A of the edges. The cut (S, V − S) respects A, since no edge of A crosses the cut.

Proof Let T be a minimum spanning tree that includes A, and assume that T does not contain the light edge (u, v), since if it does, we are done. We’ll construct another minimum spanning tree T′ that includes A ∪ {(u, v)} by using a cut-and-paste technique, thereby showing that (u, v) is a safe edge for A.

The edge (u, v) forms a cycle with the edges on the simple path p from u to v in T, as Figure 21.3 illustrates. Since u and v are on opposite sides of the cut (S, V − S), at least one edge in T lies on the simple path p and also crosses the cut. Let (x, y) be any such edge. The edge (x, y) is not in A, because the cut respects A. Since (x, y) is on the unique simple path from u to v in T, removing (x, y) breaks T into two components. Adding (u, v) reconnects them to form a new spanning tree T′ = (T − {(x, y)}) ∪ {(u, v)}.

We next show that T′ is a minimum spanning tree. Since (u, v) is a light edge crossing (S, V − S) and (x, y) also crosses this cut, w(u, v) ≤ w(x, y). Therefore,

w(T′) = w(T) − w(x, y) + w(u, v) ≤ w(T).

But T is a minimum spanning tree, so that w(T) ≤ w(T′), and thus, T′ must be a minimum spanning tree as well.

It remains to show that (u, v) is actually a safe edge for A. We have A ⊆ T′, since A ⊆ T and (x, y) ∉ A, and thus, A ∪ {(u, v)} ⊆ T′. Consequently, since T′ is a minimum spanning tree, (u, v) is safe for A.

Theorem 21.1 provides insight into how the GENERIC-MST method works on a connected graph G = (V, E). As the method proceeds, the set A is always acyclic, since it is a subset of a minimum spanning tree and a tree may not contain a cycle. At any point in the execution, the graph GA = (V, A) is a forest, and each of the connected components of GA is a tree. (Some of the trees may contain just one vertex, as is the case, for example, when the method begins: A is empty and the forest contains |V| trees, one for each vertex.) Moreover, any safe edge (u, v) for A connects distinct components of GA, since A ∪ {(u, v)} must be acyclic.


Figure 21.3 The proof of Theorem 21.1. Orange vertices belong to S, and tan vertices belong to V − S. Only edges in the minimum spanning tree T are shown, along with edge (u, v), which does not lie in T. The edges in A are blue, and (u, v) is a light edge crossing the cut (S, V − S). The edge (x, y) is an edge on the unique simple path p from u to v in T. To form a minimum spanning tree T′ that contains (u, v), remove the edge (x, y) from T and add the edge (u, v).

The while loop in lines 2–4 of GENERIC-MST executes |V| − 1 times because it finds one of the |V| − 1 edges of a minimum spanning tree in each iteration. Initially, when A = Ø, there are |V| trees in GA, and each iteration reduces that number by 1. When the forest contains only a single tree, the method terminates.

The two algorithms in Section 21.2 use the following corollary to Theorem 21.1.

Corollary 21.2
Let G = (V, E) be a connected, undirected graph with a real-valued weight function w defined on E. Let A be a subset of E that is included in some minimum spanning tree for G, and let C = (VC, EC) be a connected component (tree) in the forest GA = (V, A). If (u, v) is a light edge connecting C to some other component in GA, then (u, v) is safe for A.

Proof The cut (VC, V − VC) respects A, and (u, v) is a light edge for this cut. Therefore, (u, v) is safe for A.

Exercises

21.1-1

Let (u, v) be a minimum-weight edge in a connected graph G. Show that (u, v) belongs to some minimum spanning tree of G.

21.1-2

Professor Sabatier conjectures the following converse of Theorem 21.1. Let G = (V, E) be a connected, undirected graph with a real-valued weight function w defined on E. Let A be a subset of E that is included in some minimum spanning tree for G, let (S, V − S) be any cut of G that respects A, and let (u, v) be a safe edge for A crossing (S, V − S). Then, (u, v) is a light edge for the cut. Show that the professor’s conjecture is incorrect by giving a counterexample.

21.1-3

Show that if an edge ( u, v) is contained in some minimum spanning tree, then it is a light edge crossing some cut of the graph.

21.1-4

Give a simple example of a connected graph such that the set of edges {(u, v) : there exists a cut (S, V − S) such that (u, v) is a light edge crossing (S, V − S)} does not form a minimum spanning tree.

21.1-5

Let e be a maximum-weight edge on some cycle of connected graph G = (V, E). Prove that there is a minimum spanning tree of G′ = (V, E − {e}) that is also a minimum spanning tree of G. That is, there is a minimum spanning tree of G that does not include e.

21.1-6

Show that a graph has a unique minimum spanning tree if, for every cut of the graph, there is a unique light edge crossing the cut. Show that the converse is not true by giving a counterexample.

21.1-7

Argue that if all edge weights of a graph are positive, then any subset of edges that connects all vertices and has minimum total weight must be a tree. Give an example to show that the same conclusion does not follow if we allow some weights to be nonpositive.


21.1-8

Let T be a minimum spanning tree of a graph G, and let L be the sorted list of the edge weights of T. Show that for any other minimum spanning tree T′ of G, the list L is also the sorted list of edge weights of T′.

21.1-9

Let T be a minimum spanning tree of a graph G = (V, E), and let V′ be a subset of V. Let T′ be the subgraph of T induced by V′, and let G′ be the subgraph of G induced by V′. Show that if T′ is connected, then T′ is a minimum spanning tree of G′.

21.1-10

Given a graph G and a minimum spanning tree T, suppose that the weight of one of the edges in T decreases. Show that T is still a minimum spanning tree for G. More formally, let T be a minimum spanning tree for G with edge weights given by weight function w. Choose one edge (x, y) ∈ T and a positive number k, and define the weight function w′ by

w′(u, v) = w(u, v)       if (u, v) ≠ (x, y),
w′(u, v) = w(x, y) − k   if (u, v) = (x, y).

Show that T is a minimum spanning tree for G with edge weights given by w′.

21.1-11

Given a graph G and a minimum spanning tree T, suppose that the weight of one of the edges not in T decreases. Give an algorithm for finding the minimum spanning tree in the modified graph.

21.2 The algorithms of Kruskal and Prim

The two minimum-spanning-tree algorithms described in this section elaborate on the generic method. They each use a specific rule to determine a safe edge in line 3 of GENERIC-MST. In Kruskal’s algorithm, the set A is a forest whose vertices are all those of the given graph. The safe edge added to A is always a lowest-weight edge in the graph that connects two distinct components. In Prim’s algorithm, the set A forms a single tree. The safe edge added to A is always a lowest-weight edge connecting the tree to a vertex not in the tree. Both algorithms assume that the input graph is connected and represented by adjacency lists.

Figure 21.4 The execution of Kruskal’s algorithm on the graph from Figure 21.1. Blue edges belong to the forest A being grown. The algorithm considers each edge in sorted order by weight. A red arrow points to the edge under consideration at each step of the algorithm. If the edge joins two distinct trees in the forest, it is added to the forest, thereby merging the two trees.

Kruskal’s algorithm

Kruskal’s algorithm finds a safe edge to add to the growing forest by finding, of all the edges that connect any two trees in the forest, an edge (u, v) with the lowest weight. Let C1 and C2 denote the two trees that are connected by (u, v). Since (u, v) must be a light edge connecting C1 to some other tree, Corollary 21.2 implies that (u, v) is a safe edge for C1. Kruskal’s algorithm qualifies as a greedy algorithm because at each step it adds to the forest an edge with the lowest possible weight.

Figure 21.4, continued Further steps in the execution of Kruskal’s algorithm.

Like the algorithm to compute connected components from Section 19.1, the procedure MST-KRUSKAL on the following page uses a disjoint-set data structure to maintain several disjoint sets of elements. Each set contains the vertices in one tree of the current forest. The operation FIND-SET(u) returns a representative element from the set that contains u. Thus, to determine whether two vertices u and v belong to the same tree, just test whether FIND-SET(u) equals FIND-SET(v). To combine trees, Kruskal’s algorithm calls the UNION procedure.

Figure 21.4 shows how Kruskal’s algorithm works. Lines 1–3 initialize the set A to the empty set and create |V| trees, one containing each vertex. The for loop in lines 6–9 examines edges in order of weight, from lowest to highest. The loop checks, for each edge (u, v), whether the endpoints u and v belong to the same tree. If they do, then the edge (u, v) cannot be added to the forest without creating a cycle, and the edge is ignored. Otherwise, the two vertices belong to different trees. In this case, line 8 adds the edge (u, v) to A, and line 9 merges the vertices in the two trees.

MST-KRUSKAL(G, w)
 1  A = Ø
 2  for each vertex v ∈ G.V
 3      MAKE-SET(v)
 4  create a single list of the edges in G.E
 5  sort the list of edges into monotonically increasing order by weight w
 6  for each edge (u, v) taken from the sorted list in order
 7      if FIND-SET(u) ≠ FIND-SET(v)
 8          A = A ∪ {(u, v)}
 9          UNION(u, v)
10  return A

The running time of Kruskal’s algorithm for a graph G = (V, E) depends on the specific implementation of the disjoint-set data structure. Let’s assume that it uses the disjoint-set-forest implementation of Section 19.3 with the union-by-rank and path-compression heuristics, since that is the asymptotically fastest implementation known. Initializing the set A in line 1 takes O(1) time, creating a single list of edges in line 4 takes O(V + E) time (which is O(E) because G is connected), and the time to sort the edges in line 5 is O(E lg E). (We’ll account for the cost of the |V| MAKE-SET operations in the for loop of lines 2–3 in a moment.) The for loop of lines 6–9 performs O(E) FIND-SET and UNION operations on the disjoint-set forest. Along with the |V| MAKE-SET operations, these disjoint-set operations take a total of O((V + E) α(V)) time, where α is the very slowly growing function defined in Section 19.4. Because we assume that G is connected, we have |E| ≥ |V| − 1, and so the disjoint-set operations take O(E α(V)) time. Moreover, since α(|V|) = O(lg V) = O(lg E), the total running time of Kruskal’s algorithm is O(E lg E). Observing that |E| < |V|², we have lg |E| = O(lg V), and so we can restate the running time of Kruskal’s algorithm as O(E lg V).
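As a concrete sketch, here is MST-KRUSKAL rendered in Python with a small disjoint-set forest using union by rank and path compression, as assumed in the analysis above. The function name `kruskal` and the edge/weight representation are ours, not the text’s:

```python
def kruskal(vertices, edges, w):
    """Return a minimum-spanning-tree edge list A for a connected graph.

    edges is a list of (u, v) pairs, and w maps each pair to its weight.
    """
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find_set(x):                    # FIND-SET with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):                    # UNION by rank
        x, y = find_set(x), find_set(y)
        if rank[x] < rank[y]:
            x, y = y, x
        parent[y] = x
        if rank[x] == rank[y]:
            rank[x] += 1

    A = []
    for (u, v) in sorted(edges, key=lambda e: w[e]):  # line 5: sort by weight
        if find_set(u) != find_set(v):  # endpoints in different trees?
            A.append((u, v))            # safe edge: add it and merge the trees
            union(u, v)
    return A
```

The sort dominates the running time, matching the O(E lg E) = O(E lg V) accounting above.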

Prim’s algorithm

Like Kruskal’s algorithm, Prim’s algorithm is a special case of the generic minimum-spanning-tree method from Section 21.1. Prim’s algorithm operates much like Dijkstra’s algorithm for finding shortest paths in a graph, which we’ll see in Section 22.3. Prim’s algorithm has the property that the edges in the set A always form a single tree. As Figure 21.5 shows, the tree starts from an arbitrary root vertex r and grows until it spans all the vertices in V. Each step adds to the tree A a light edge that connects A to an isolated vertex—one on which no edge of A is incident. By Corollary 21.2, this rule adds only edges that are safe for A. Therefore, when the algorithm terminates, the edges in A form a minimum spanning tree. This strategy qualifies as greedy since at each step it adds to the tree an edge that contributes the minimum amount possible to the tree’s weight.


Figure 21.5 The execution of Prim’s algorithm on the graph from Figure 21.1. The root vertex is a. Blue vertices and edges belong to the tree being grown, and tan vertices have yet to be added to the tree. At each step of the algorithm, the vertices in the tree determine a cut of the graph, and a light edge crossing the cut is added to the tree. The edge and vertex added to the tree are highlighted in orange. In the second step (part (c)), for example, the algorithm has a choice of adding either edge (b, c) or edge (a, h) to the tree since both are light edges crossing the cut.

In the procedure MST-PRIM below, the connected graph G and the root r of the minimum spanning tree to be grown are inputs to the algorithm. In order to efficiently select a new edge to add into tree A, the algorithm maintains a min-priority queue Q of all vertices that are not in the tree, based on a key attribute. For each vertex v, the attribute v.key is the minimum weight of any edge connecting v to a vertex in the tree, where by convention, v.key = ∞ if there is no such edge. The attribute v.π names the parent of v in the tree. The algorithm implicitly maintains the set A from GENERIC-MST as

A = {(v, v.π) : v ∈ V − {r} − Q},

where we interpret the vertices in Q as forming a set. When the algorithm terminates, the min-priority queue Q is empty, and thus the minimum spanning tree A for G is

A = {(v, v.π) : v ∈ V − {r}}.

MST-PRIM(G, w, r)
 1  for each vertex u ∈ G.V
 2      u.key = ∞
 3      u.π = NIL
 4  r.key = 0
 5  Q = Ø
 6  for each vertex u ∈ G.V
 7      INSERT(Q, u)
 8  while Q ≠ Ø
 9      u = EXTRACT-MIN(Q)             // add u to the tree
10      for each vertex v in G.Adj[u]  // update keys of u’s non-tree neighbors
11          if v ∈ Q and w(u, v) < v.key
12              v.π = u
13              v.key = w(u, v)
14              DECREASE-KEY(Q, v, w(u, v))

Figure 21.5 shows how Prim’s algorithm works. Lines 1–7 set the key of each vertex to ∞ (except for the root r, whose key is set to 0 to make it the first vertex processed), set the parent of each vertex to NIL, and insert each vertex into the min-priority queue Q. The algorithm maintains the following three-part loop invariant:

Prior to each iteration of the while loop of lines 8–14,

1. A = {(v, v.π) : v ∈ V − {r} − Q}.

2. The vertices already placed into the minimum spanning tree are those in V − Q.

3. For all vertices v ∈ Q, if v.π ≠ NIL, then v.key < ∞ and v.key is the weight of a light edge (v, v.π) connecting v to some vertex already placed into the minimum spanning tree.

Line 9 identifies a vertex u ∈ Q incident on a light edge that crosses the cut (V − Q, Q) (with the exception of the first iteration, in which u = r due to lines 4–7). Removing u from the set Q adds it to the set V − Q of vertices in the tree, thus adding the edge (u, u.π) to A. The for loop of lines 10–14 updates the key and π attributes of every vertex v adjacent to u but not in the tree, thereby maintaining the third part of the loop invariant. Whenever line 13 updates v.key, line 14 calls DECREASE-KEY to inform the min-priority queue that v’s key has changed.

The running time of Prim’s algorithm depends on the specific implementation of the min-priority queue Q. You can implement Q with a binary min-heap (see Chapter 6), including a way to map between vertices and their corresponding heap elements. The BUILD-MIN-HEAP procedure can perform lines 5–7 in O(V) time. In fact, there is no need to call BUILD-MIN-HEAP. You can just put the key of r at the root of the min-heap, and because all other keys are ∞, they can go anywhere else in the min-heap. The body of the while loop executes |V| times, and since each EXTRACT-MIN operation takes O(lg V) time, the total time for all calls to EXTRACT-MIN is O(V lg V). The for loop in lines 10–14 executes O(E) times altogether, since the sum of the lengths of all adjacency lists is 2|E|. Within the for loop, the test for membership in Q in line 11 can take constant time if you keep a bit for each vertex that indicates whether it belongs to Q and update the bit when the vertex is removed from Q. Each call to DECREASE-KEY in line 14 takes O(lg V) time. Thus, the total time for Prim’s algorithm is O(V lg V + E lg V) = O(E lg V), which is asymptotically the same as for our implementation of Kruskal’s algorithm.

You can further improve the asymptotic running time of Prim’s algorithm by implementing the min-priority queue with a Fibonacci heap (see page 478). If a Fibonacci heap holds |V| elements, an EXTRACT-MIN operation takes O(lg V) amortized time and each INSERT and DECREASE-KEY operation takes only O(1) amortized time. Therefore, by using a Fibonacci heap to implement the min-priority queue Q, the running time of Prim’s algorithm improves to O(E + V lg V).
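MST-PRIM can also be sketched in Python. Since Python’s `heapq` module provides no DECREASE-KEY operation, this version pushes a fresh entry whenever a key improves and skips stale entries on extraction, a common substitution that still gives O(E lg E) = O(E lg V) time (the heap may hold O(E) entries). The function name `prim` and the dictionary representation are ours:

```python
import heapq

def prim(adj, w, r):
    """Minimum-spanning-tree edge set of a connected graph, grown from root r.

    adj maps each vertex to its list of neighbors, and w[(u, v)] == w[(v, u)]
    is the weight of edge (u, v).
    """
    key = {v: float('inf') for v in adj}   # lightest known connection to the tree
    pi = {v: None for v in adj}            # parent in the growing tree
    key[r] = 0
    Q = [(0, r)]                           # min-heap of (key, vertex) entries
    in_tree = set()
    while Q:
        k, u = heapq.heappop(Q)            # EXTRACT-MIN
        if u in in_tree:                   # stale entry: u was already extracted
            continue
        in_tree.add(u)
        for v in adj[u]:                   # update keys of u's non-tree neighbors
            if v not in in_tree and w[(u, v)] < key[v]:
                key[v] = w[(u, v)]         # lines 12-13: record the lighter edge
                pi[v] = u
                heapq.heappush(Q, (key[v], v))  # stand-in for DECREASE-KEY
    return {(v, pi[v]) for v in adj if pi[v] is not None}
```

The lazy-deletion trick trades the DECREASE-KEY machinery for a slightly larger heap, which is often the simpler choice in practice.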

Exercises

21.2-1

Kruskal’s algorithm can return different spanning trees for the same input graph G, depending on how it breaks ties when the edges are sorted. Show that for each minimum spanning tree T of G, there is a way to sort the edges of G in Kruskal’s algorithm so that the algorithm returns T.

21.2-2

Give a simple implementation of Prim’s algorithm that runs in O(V²) time when the graph G = (V, E) is represented as an adjacency matrix.

21.2-3

For a sparse graph G = (V, E), where |E| = Θ(V), is the implementation of Prim’s algorithm with a Fibonacci heap asymptotically faster than the binary-heap implementation? What about for a dense graph, where |E| = Θ(V²)? How must the sizes |E| and |V| be related for the Fibonacci-heap implementation to be asymptotically faster than the binary-heap implementation?

21.2-4

Suppose that all edge weights in a graph are integers in the range from 1 to |V|. How fast can you make Kruskal’s algorithm run? What if the edge weights are integers in the range from 1 to W for some constant W?

21.2-5

Suppose that all edge weights in a graph are integers in the range from 1 to |V|. How fast can you make Prim’s algorithm run? What if the edge weights are integers in the range from 1 to W for some constant W?

21.2-6

Professor Borden proposes a new divide-and-conquer algorithm for computing minimum spanning trees, which goes as follows. Given a graph G = (V, E), partition the set V of vertices into two sets V1 and V2 such that |V1| and |V2| differ by at most 1. Let E1 be the set of edges that are incident only on vertices in V1, and let E2 be the set of edges that are incident only on vertices in V2. Recursively solve a minimum-spanning-tree problem on each of the two subgraphs G1 = (V1, E1) and G2 = (V2, E2). Finally, select the minimum-weight edge in E that crosses the cut (V1, V2), and use this edge to unite the resulting two minimum spanning trees into a single spanning tree.

Either argue that the algorithm correctly computes a minimum spanning tree of G, or provide an example for which the algorithm fails.

21.2-7

Suppose that the edge weights in a graph are uniformly distributed over the half-open interval [0, 1). Which algorithm, Kruskal’s or Prim’s, can you make run faster?

21.2-8

Suppose that a graph G has a minimum spanning tree already computed. How quickly can you update the minimum spanning tree upon adding a new vertex and incident edges to G?

Problems

21-1 Second-best minimum spanning tree

Let G = (V, E) be an undirected, connected graph whose weight function is w : E → ℝ, and suppose that |E| ≥ |V| and all edge weights are distinct.

We define a second-best minimum spanning tree as follows. Let 𝒯 be the set of all spanning trees of G, and let T be a minimum spanning tree of G. Then a second-best minimum spanning tree is a spanning tree T′ such that w(T′) = min {w(T″) : T″ ∈ 𝒯 − {T}}.

a. Show that the minimum spanning tree is unique, but that the second-best minimum spanning tree need not be unique.

b. Let T be the minimum spanning tree of G. Prove that G contains some edge (u, v) ∈ T and some edge (x, y) ∉ T such that (T − {(u, v)}) ∪ {(x, y)} is a second-best minimum spanning tree of G.

c. Now let T be any spanning tree of G and, for any two vertices u, v ∈ V, let max[u, v] denote an edge of maximum weight on the unique simple path between u and v in T. Describe an O(V²)-time algorithm that, given T, computes max[u, v] for all u, v ∈ V.

d. Give an efficient algorithm to compute the second-best minimum

spanning tree of G.

21-2 Minimum spanning tree in sparse graphs

For a very sparse connected graph G = (V, E), it is possible to further improve upon the O(E + V lg V) running time of Prim’s algorithm with a Fibonacci heap by preprocessing G to decrease the number of vertices before running Prim’s algorithm. In particular, for each vertex u, choose the minimum-weight edge (u, v) incident on u, and put (u, v) into the minimum spanning tree under construction. Then, contract all chosen edges (see Section B.4). Rather than contracting these edges one at a time, first identify sets of vertices that are united into the same new vertex. Then create the graph that would have resulted from contracting these edges one at a time, but do so by “renaming” edges according to the sets into which their endpoints were placed. Several edges from the original graph might be renamed the same as each other. In such a case, only one edge results, and its weight is the minimum of the weights of the corresponding original edges.

Initially, set the minimum spanning tree T being constructed to be empty, and for each edge (u, v) ∈ E, initialize the two attributes (u, v).orig = (u, v) and (u, v).c = w(u, v). Use the orig attribute to reference the edge from the initial graph that is associated with an edge in the contracted graph. The c attribute holds the weight of an edge, and as edges are contracted, it is updated according to the above scheme for choosing edge weights. The procedure MST-REDUCE on the facing page takes inputs G and T, and it returns a contracted graph G′ with updated attributes orig′ and c′. The procedure also accumulates edges of G into the minimum spanning tree T.

a. Let T be the set of edges returned by MST-REDUCE, and let A be the minimum spanning tree of the graph G′ formed by the call MST-PRIM(G′, c′, r), where c′ is the weight attribute on the edges of G′.E and r is any vertex in G′.V. Prove that T ∪ {(x, y).orig′ : (x, y) ∈ A} is a minimum spanning tree of G.

b. Argue that |G′.V| ≤ |V|/2.

c. Show how to implement MST-REDUCE so that it runs in O(E) time. (Hint: Use simple data structures.)

d. Suppose that you run k phases of MST-REDUCE, using the output G′ produced by one phase as the input G to the next phase and accumulating edges in T. Argue that the overall running time of the k phases is O(kE).

e. Suppose that after running k phases of MST-REDUCE, as in part (d), you run Prim’s algorithm by calling MST-PRIM(G′, c′, r), where G′, with weight attribute c′, is returned by the last phase and r is any vertex in G′.V. Show how to pick k so that the overall running time is O(E lg lg V). Argue that your choice of k minimizes the overall asymptotic running time.

f. For what values of |E| (in terms of |V|) does Prim’s algorithm with preprocessing asymptotically beat Prim’s algorithm without preprocessing?

MST-REDUCE(G, T)
 1  for each vertex v ∈ G.V
 2      v.mark = FALSE
 3      MAKE-SET(v)
 4  for each vertex u ∈ G.V
 5      if u.mark == FALSE
 6          choose v ∈ G.Adj[u] such that (u, v).c is minimized
 7          UNION(u, v)
 8          T = T ∪ {(u, v).orig}
 9          u.mark = TRUE
10          v.mark = TRUE
11  G′.V = {FIND-SET(v) : v ∈ G.V}
12  G′.E = Ø
13  for each edge (x, y) ∈ G.E
14      u = FIND-SET(x)
15      v = FIND-SET(y)
16      if u ≠ v
17          if (u, v) ∉ G′.E
18              G′.E = G′.E ∪ {(u, v)}
19              (u, v).orig′ = (x, y).orig
20              (u, v).c′ = (x, y).c
21          elseif (x, y).c < (u, v).c′
22              (u, v).orig′ = (x, y).orig
23              (u, v).c′ = (x, y).c
24  construct adjacency lists G′.Adj for G′
25  return G′ and T
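A rough Python sketch of one such contraction phase may help. It follows the spirit of MST-REDUCE, though the function name `mst_reduce_phase` and the adjacency representation are ours, and it returns the chosen original edges together with a contracted edge dictionary rather than full adjacency lists:

```python
def mst_reduce_phase(vertices, adj_w):
    """One contraction phase: every unmarked vertex grabs its cheapest edge.

    adj_w maps each vertex to a list of (neighbor, weight, orig) triples,
    where orig identifies the edge in the original graph.  Returns the
    chosen original edges T and the contracted graph's edge dictionary.
    """
    parent = {v: v for v in vertices}      # MAKE-SET for every vertex

    def find(x):                           # FIND-SET with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    T = []
    marked = set()
    for u in vertices:
        if u not in marked:
            v, _, orig = min(adj_w[u], key=lambda t: t[1])  # cheapest edge at u
            ru, rv = find(u), find(v)
            if ru != rv:                   # always true: u is still a singleton
                parent[ru] = rv            # UNION the two components
                T.append(orig)
            marked.update((u, v))

    # Rename each surviving edge by the components of its endpoints, keeping
    # the minimum weight among parallel copies and dropping self-loops.
    contracted = {}
    for u in vertices:
        for v, wt, orig in adj_w[u]:
            a, b = find(u), find(v)
            if a != b:
                e = (min(a, b), max(a, b))
                if e not in contracted or wt < contracted[e][0]:
                    contracted[e] = (wt, orig)
    return T, contracted
```

Running this on a path a–b–c–d with weights 1, 5, 2 contracts {a, b} and {c, d}, picks up the two cheap edges, and leaves a single renamed edge of weight 5 between the two new vertices.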

21-3 Alternative minimum-spanning-tree algorithms

Consider the three algorithms MAYBE-MST-A, MAYBE-MST-B, and MAYBE-MST-C on the next page. Each one takes a connected graph and a weight function as input and returns a set of edges T. For each algorithm, either prove that T is a minimum spanning tree or prove that T is not necessarily a minimum spanning tree. Also describe the most efficient implementation of each algorithm, regardless of whether it computes a minimum spanning tree.

21-4 Bottleneck spanning tree

A bottleneck spanning tree T of an undirected graph G is a spanning tree of G whose largest edge weight is minimum over all spanning trees of G.

The value of the bottleneck spanning tree is the weight of the

maximum-weight edge in T.

MAYBE-MST-A(G, w)
1  sort the edges into monotonically decreasing order of edge weights w
2  T = E
3  for each edge e, taken in monotonically decreasing order by weight
4      if T − {e} is a connected graph
5          T = T − {e}
6  return T

MAYBE-MST-B(G, w)
1  T = Ø
2  for each edge e, taken in arbitrary order
3      if T ∪ {e} has no cycles
4          T = T ∪ {e}
5  return T

MAYBE-MST-C(G, w)
1  T = Ø
2  for each edge e, taken in arbitrary order
3      T = T ∪ {e}
4      if T has a cycle c
5          let e′ be a maximum-weight edge on c
6          T = T − {e′}
7  return T

a. Argue that a minimum spanning tree is a bottleneck spanning tree.

Part (a) shows that finding a bottleneck spanning tree is no harder than

finding a minimum spanning tree. In the remaining parts, you will show

how to find a bottleneck spanning tree in linear time.


b. Give a linear-time algorithm that, given a graph G and an integer b, determines whether the value of the bottleneck spanning tree is at

most b.

c. Use your algorithm for part (b) as a subroutine in a linear-time

algorithm for the bottleneck-spanning-tree problem. ( Hint: You might

want to use a subroutine that contracts sets of edges, as in the MST-

REDUCE procedure described in Problem 21-2.)
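For part (b), one natural approach (our sketch, not the book's solution) is to observe that the bottleneck value is at most b exactly when the subgraph consisting of edges of weight at most b is connected, which a single breadth-first search checks in O(V + E) time:

```python
from collections import deque

def bottleneck_at_most(vertices, edges, b):
    """Return True iff G has a spanning tree whose largest edge weight is <= b.

    Equivalently: the subgraph of edges with weight <= b spans and connects
    all of G's vertices. edges: iterable of (u, v, w) for an undirected graph.
    Runs in O(V + E) time.
    """
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        if w <= b:                      # keep only edges light enough
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(vertices))
    seen = {start}
    q = deque([start])
    while q:                            # standard BFS over the kept edges
        u = q.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                q.append(v)
    return len(seen) == len(vertices)
```

For instance, on the cycle a–b (1), b–c (5), c–d (1), a–d (2), the spanning tree {a–b, a–d, c–d} has largest weight 2, so the check succeeds at b = 2 but fails at b = 1.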

Chapter notes

Tarjan [429] surveys the minimum-spanning-tree problem and provides excellent advanced material. Graham and Hell [198] compiled a history of the minimum-spanning-tree problem.

Tarjan attributes the first minimum-spanning-tree algorithm to a

1926 paper by O. Borůvka. Borůvka’s algorithm consists of running

O(lg V) iterations of the procedure MST-REDUCE described in

Problem 21-2. Kruskal’s algorithm was reported by Kruskal [272] in 1956. The algorithm commonly known as Prim’s algorithm was indeed

invented by Prim [367], but it was also invented earlier by V. Jarník in 1930.

When | E| = Ω( V lg V), Prim’s algorithm, implemented with a Fibonacci heap, runs in O( E) time. For sparser graphs, using a combination of the ideas from Prim’s algorithm, Kruskal’s algorithm,

and Borůvka’s algorithm, together with advanced data structures,

Fredman and Tarjan [156] give an algorithm that runs in O(E lg* V) time. Gabow, Galil, Spencer, and Tarjan [165] improved this algorithm to run in O(E lg lg* V) time. Chazelle [83] gives an algorithm that runs in O(E α(E, V)) time, where α(E, V) is the functional inverse of Ackermann's function. (See the chapter notes for Chapter 19 for a brief discussion of Ackermann's function and its inverse.) Unlike previous

minimum-spanning-tree algorithms, Chazelle's algorithm does not

follow the greedy method. Pettie and Ramachandran [356] give an algorithm based on precomputed "MST decision trees" that also runs

in O(E α(E, V)) time.

A related problem is spanning-tree verification: given a graph G = (V, E) and a tree T ⊆ E, determine whether T is a minimum spanning tree of G. King [254] gives a linear-time algorithm to verify a spanning tree, building on earlier work of Komlós [269] and Dixon, Rauch, and Tarjan

[120].

The above algorithms are all deterministic and fall into the

comparison-based model described in Chapter 8. Karger, Klein, and Tarjan [243] give a randomized minimum-spanning-tree algorithm that runs in O(V + E) expected time. This algorithm uses recursion in a manner similar to the linear-time selection algorithm in Section 9.3: a recursive call on an auxiliary problem identifies a subset E′ of the edges

that cannot be in any minimum spanning tree. Another recursive call on

E − E′ then finds the minimum spanning tree. The algorithm also uses

ideas from Borůvka’s algorithm and King’s algorithm for spanning-tree

verification.

Fredman and Willard [158] showed how to find a minimum spanning

tree in O( V + E) time using a deterministic algorithm that is not comparison based. Their algorithm assumes that the data are b-bit

integers and that the computer memory consists of addressable b-bit

words.

1 The phrase “minimum spanning tree” is a shortened form of the phrase “minimum-weight spanning tree.” There is no point in minimizing the number of edges in T, since all spanning trees have exactly | V| − 1 edges by Theorem B.2 on page 1169.


22 Single-Source Shortest Paths

Suppose that you need to drive from Oceanside, New York, to

Oceanside, California, by the shortest possible route. Your GPS

contains information about the entire road network of the United

States, including the road distance between each pair of adjacent

intersections. How can your GPS determine this shortest route?

One possible way is to enumerate all the routes from Oceanside, New

York, to Oceanside, California, add up the distances on each route, and

select the shortest. But even disallowing routes that contain cycles, your

GPS would need to examine an enormous number of possibilities, most

of which are simply not worth considering. For example, a route that

passes through Miami, Florida, is a poor choice, because Miami is

several hundred miles out of the way.

This chapter and Chapter 23 show how to solve such problems efficiently. The input to a shortest-paths problem is a weighted, directed

graph G = (V, E), with a weight function w : E → ℝ mapping edges to real-valued weights. The weight w(p) of path p = 〈v0, v1, …, vk〉 is the sum of the weights of its constituent edges:

w(p) = Σ_{i=1}^{k} w(vi−1, vi).

We define the shortest-path weight δ(u, v) from u to v by

δ(u, v) = min {w(p) : p is a path from u to v} if there is a path from u to v, and δ(u, v) = ∞ otherwise.

A shortest path from vertex u to vertex v is then defined as any path p with weight w(p) = δ(u, v).

In the example of going from Oceanside, New York, to Oceanside,

California, your GPS models the road network as a graph: vertices

represent intersections, edges represent road segments between

intersections, and edge weights represent road distances. The goal is to

find a shortest path from a given intersection in Oceanside, New York

(say, Brower Avenue and Skillman Avenue) to a given intersection in

Oceanside, California (say, Topeka Street and South Horne Street).

Edge weights can represent metrics other than distances, such as

time, cost, penalties, loss, or any other quantity that accumulates

linearly along a path and that you want to minimize.

The breadth-first-search algorithm from Section 20.2 is a shortest-paths algorithm that works on unweighted graphs, that is, graphs in

which each edge has unit weight. Because many of the concepts from

breadth-first search arise in the study of shortest paths in weighted

graphs, you might want to review Section 20.2 before proceeding.

Variants

This chapter focuses on the single-source shortest-paths problem: given a

graph G = (V, E), find a shortest path from a given source vertex s ∈ V

to every vertex v ∈ V. The algorithm for the single-source problem can

solve many other problems, including the following variants.

Single-destination shortest-paths problem: Find a shortest path to a

given destination vertex t from each vertex v. By reversing the direction of each edge in the graph, you can reduce this problem to a single-source problem.

Single-pair shortest-path problem: Find a shortest path from u to v for given vertices u and v. If you solve the single-source problem with source vertex u, you solve this problem also. Moreover, all known

algorithms for this problem have the same worst-case asymptotic

running time as the best single-source algorithms.

All-pairs shortest-paths problem: Find a shortest path from u to v for every pair of vertices u and v. Although you can solve this problem by


running a single-source algorithm once from each vertex, you often

can solve it faster. Additionally, its structure is interesting in its own

right. Chapter 23 addresses the all-pairs problem in detail.

Optimal substructure of a shortest path

Shortest-paths algorithms typically rely on the property that a shortest

path between two vertices contains other shortest paths within it. (The

Edmonds-Karp maximum-flow algorithm in Chapter 24 also relies on this property.) Recall that optimal substructure is one of the key

indicators that dynamic programming (Chapter 14) and the greedy method (Chapter 15) might apply. Dijkstra’s algorithm, which we shall see in Section 22.3, is a greedy algorithm, and the Floyd-Warshall algorithm, which finds a shortest path between every pair of vertices

(see Section 23.2), is a dynamic-programming algorithm. The following lemma states the optimal-substructure property of shortest paths more

precisely.

Lemma 22.1 (Subpaths of shortest paths are shortest paths)

Given a weighted, directed graph G = (V, E) with weight function w : E

→ ℝ, let p = 〈v0, v1, …, vk〉 be a shortest path from vertex v0 to vertex vk and, for any i and j such that 0 ≤ i ≤ j ≤ k, let pij = 〈vi, vi+1, …, vj〉 be the subpath of p from vertex vi to vertex vj. Then, pij is a shortest path from vi to vj.

Proof Decompose path p into subpaths p0i from v0 to vi, pij from vi to vj, and pjk from vj to vk, so that w(p) = w(p0i) + w(pij) + w(pjk). Now, assume that there is a path p′ij from vi to vj with weight w(p′ij) < w(pij). Then the path that follows p0i from v0 to vi, then p′ij from vi to vj, and then pjk from vj to vk is a path from v0 to vk whose weight w(p0i) + w(p′ij) + w(pjk) is less than w(p), which contradicts the assumption that p is a shortest path from v0 to vk.

Negative-weight edges

Some instances of the single-source shortest-paths problem may include edges whose weights are negative. If the graph G = (V, E) contains no negative-weight cycles reachable from the source s, then for all v ∈ V, the shortest-path weight δ(s, v) remains well defined, even if it has a negative value. If the graph contains a negative-weight cycle reachable

from s, however, shortest-path weights are not well defined. No path

from s to a vertex on the cycle can be a shortest path—you can always

find a path with lower weight by following the proposed “shortest” path

and then traversing the negative-weight cycle. If there is a negative-

weight cycle on some path from s to v, we define δ( s, v) = −∞.

Figure 22.1 illustrates the effect of negative weights and negative-weight cycles on shortest-path weights. Because there is only one path

from s to a (the path 〈 s, a〉), we have δ( s, a) = w( s, a) = 3. Similarly, there is only one path from s to b, and so δ( s, b) = w( s, a) + w( a, b) = 3 + (−4)

= −1. There are infinitely many paths from s to c: 〈 s, c〉, 〈 s, c, d, c〉, 〈 s, c, d, c, d, c〉, and so on. Because the cycle 〈 c, d, c〉 has weight 6 + (−3) = 3

> 0, the shortest path from s to c is 〈 s, c〉, with weight δ( s, c) = w( s, c) =

5, and the shortest path from s to d is 〈s, c, d〉, with weight δ(s, d) = w(s, c) + w(c, d) = 11. Analogously, there are infinitely many paths from s to e: 〈s, e〉, 〈s, e, f, e〉, 〈s, e, f, e, f, e〉, and so on. Because the cycle 〈e, f, e〉

has weight 3 + (−6) = −3 < 0, however, there is no shortest path from s

to e. By traversing the negative-weight cycle 〈 e, f, e〉 arbitrarily many times, you can find paths from s to e with arbitrarily large negative weights, and so δ( s, e) = −∞. Similarly, δ( s, f) = −∞. Because g is reachable from f, you can also find paths with arbitrarily large negative

weights from s to g, and so δ( s, g) = −∞. Vertices h, i, and j also form a negative-weight cycle. They are not reachable from s, however, and so

δ( s, h) = δ( s, i) = δ( s, j) = ∞.


Figure 22.1 Negative edge weights in a directed graph. The shortest-path weight from source s appears within each vertex. Because vertices e and f form a negative-weight cycle reachable from s, they have shortest-path weights of −∞. Because vertex g is reachable from a vertex whose shortest-path weight is −∞, it, too, has a shortest-path weight of −∞. Vertices such as h, i, and j are not reachable from s, and so their shortest-path weights are ∞, even though they lie on a negative-weight cycle.

Some shortest-paths algorithms, such as Dijkstra’s algorithm,

assume that all edge weights in the input graph are nonnegative, as in a

road network. Others, such as the Bellman-Ford algorithm, allow

negative-weight edges in the input graph and produce a correct answer

as long as no negative-weight cycles are reachable from the source.

Typically, if there is such a negative-weight cycle, the algorithm can

detect and report its existence.

Cycles

Can a shortest path contain a cycle? As we have just seen, it cannot

contain a negative-weight cycle. Nor can it contain a positive-weight

cycle, since removing the cycle from the path produces a path with the

same source and destination vertices and a lower path weight. That is, if

p = 〈v0, v1, …, vk〉 is a path and c = 〈vi, vi+1, …, vj〉 is a positive-weight cycle on this path (so that vi = vj and w(c) > 0), then the path p′

= 〈v0, v1, …, vi, vj+1, vj+2, …, vk〉 has weight w(p′) = w(p) − w(c) < w(p), and so p cannot be a shortest path from v0 to vk.

That leaves only 0-weight cycles. You can remove a 0-weight cycle

from any path to produce another path whose weight is the same. Thus,

if there is a shortest path from a source vertex s to a destination vertex v that contains a 0-weight cycle, then there is another shortest path from s

to v without this cycle. As long as a shortest path has 0-weight cycles, you can repeatedly remove these cycles from the path until you have a

shortest path that is cycle-free. Therefore, without loss of generality,

assume that shortest paths have no cycles, that is, they are simple paths.

Since any acyclic path in a graph G = ( V, E) contains at most | V| distinct vertices, it also contains at most | V| − 1 edges. Assume, therefore, that

any shortest path contains at most | V| − 1 edges.

Representing shortest paths

It is usually not enough to compute only shortest-path weights. Most

applications of shortest paths need to know the vertices on shortest

paths as well. For example, if your GPS told you the distance to your

destination but not how to get there, it would not be terribly useful. We

represent shortest paths similarly to how we represented breadth-first

trees in Section 20.2. Given a graph G = (V, E), maintain for each vertex v ∈ V a predecessor v.π that is either another vertex or NIL. The shortest-paths algorithms in this chapter set the π attributes so that the

chain of predecessors originating at a vertex v runs backward along a

shortest path from s to v. Thus, given a vertex v for which v.π ≠ NIL, the procedure PRINT-PATH( G, s, v) from Section 20.2 prints a shortest path from s to v.
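Following the π chain backward is easy to sketch in Python. Here π is a plain dict and the function name is ours; unlike the text's PRINT-PATH, this sketch returns the path as a list rather than printing it:

```python
def path_from_predecessors(pi, s, v):
    """Reconstruct the s-to-v path encoded by predecessor attributes.

    pi: dict mapping each vertex to its predecessor (None for the source s
    and for unreachable vertices). Returns the path as a list of vertices,
    or None if v is not reachable from s.
    """
    path = []
    u = v
    while u is not None:
        path.append(u)
        if u == s:
            return path[::-1]   # reverse: we walked backward along the path
        u = pi[u]
    return None                 # chain ended without reaching s: no path
```

For example, with pi = {'s': None, 'a': 's', 'b': 'a'}, the call path_from_predecessors(pi, 's', 'b') walks b → a → s and returns ['s', 'a', 'b'].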

In the midst of executing a shortest-paths algorithm, however, the π

values might not indicate shortest paths. The predecessor subgraph G π =

( V π, E π) induced by the π values is defined the same for single-source shortest paths as for breadth-first search in equations (20.2) and (20.3)

on page 561:

Vπ = {v ∈ V : v.π ≠ NIL} ∪ {s},

Eπ = {(v.π, v) ∈ E : v ∈ Vπ − {s}}.

We’ll prove that the π values produced by the algorithms in this

chapter have the property that at termination G π is a “shortest-paths

tree”—informally, a rooted tree containing a shortest path from the

source s to every vertex that is reachable from s. A shortest-paths tree is like the breadth-first tree from Section 20.2, but it contains shortest


paths from the source defined in terms of edge weights instead of

numbers of edges. To be precise, let G = ( V, E) be a weighted, directed graph with weight function w : E → ℝ, and assume that G contains no negative-weight cycles reachable from the source vertex sV, so that

shortest paths are well defined. Ashortest-paths tree rooted at s is a directed subgraph G′ = ( V′, E′), where V′ ⊆ V and E′ ⊆ E, such that 1. V′ is the set of vertices reachable from s in G,

2. G′ forms a rooted tree with root s, and

3. for all vV′, the unique simple path from s to v in G′ is a shortest path from s to v in G.

Figure 22.2 (a) A weighted, directed graph with shortest-path weights from source s. (b) The blue edges form a shortest-paths tree rooted at the source s. (c) Another shortest-paths tree with the same root.

Shortest paths are not necessarily unique, and neither are shortest-

paths trees. For example, Figure 22.2 shows a weighted, directed graph and two shortest-paths trees with the same root.

Relaxation

The algorithms in this chapter use the technique of relaxation. For each

vertex v ∈ V, the single-source shortest-paths algorithms maintain an

attribute v.d, which is an upper bound on the weight of a shortest path

from source s to v. We call v.d a shortest-path estimate. To initialize the shortest-path estimates and predecessors, call the Θ(V)-time procedure

INITIALIZE-SINGLE-SOURCE. After initialization, we have v.π = NIL for all v ∈ V, s.d = 0, and v.d = ∞ for all v ∈ V − {s}.


INITIALIZE-SINGLE-SOURCE(G, s)
1  for each vertex v ∈ G.V
2      v.d = ∞
3      v.π = NIL
4  s.d = 0

The process of relaxing an edge ( u, v) consists of testing whether going through vertex u improves the shortest path to vertex v found so

far and, if so, updating v.d and v.π. A relaxation step might decrease the value of the shortest-path estimate v.d and update v’s predecessor attribute v.π. The RELAX procedure on the following page performs a

relaxation step on edge ( u, v) in O(1) time. Figure 22.3 shows two examples of relaxing an edge, one in which a shortest-path estimate

decreases and one in which no estimate changes.

Figure 22.3 Relaxing an edge (u, v) with weight w(u, v) = 2. The shortest-path estimate of each vertex appears within the vertex. (a) Because v.d > u.d + w(u, v) prior to relaxation, the value of v.d decreases. (b) Since we have v.d ≤ u.d + w(u, v) before relaxing the edge, the relaxation step leaves v.d unchanged.

RELAX(u, v, w)
1  if v.d > u.d + w(u, v)
2      v.d = u.d + w(u, v)
3      v.π = u
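The two procedures above translate almost line for line into Python. This sketch (names ours) stores d and π in dicts and uses math.inf for ∞:

```python
import math

def initialize_single_source(vertices, s):
    """INITIALIZE-SINGLE-SOURCE: Theta(V)-time setup of d and pi."""
    d = {v: math.inf for v in vertices}   # shortest-path estimates
    pi = {v: None for v in vertices}      # predecessors (None plays NIL)
    d[s] = 0
    return d, pi

def relax(u, v, w_uv, d, pi):
    """RELAX: improve the estimate for v via edge (u, v) if possible. O(1)."""
    if d[v] > d[u] + w_uv:
        d[v] = d[u] + w_uv
        pi[v] = u
```

Mirroring Figure 22.3 with w(u, v) = 2: if d[u] = 5 and d[v] = 9, relaxing lowers d[v] to 7 and sets v's predecessor to u; relaxing the same edge again changes nothing.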

Each algorithm in this chapter calls INITIALIZE-SINGLE-

SOURCE and then repeatedly relaxes edges. 1 Moreover, relaxation is the only means by which shortest-path estimates and predecessors

change. The algorithms in this chapter differ in how many times they relax each edge and the order in which they relax edges. Dijkstra’s

algorithm and the shortest-paths algorithm for directed acyclic graphs

relax each edge exactly once. The Bellman-Ford algorithm relaxes each

edge | V| − 1 times.

Properties of shortest paths and relaxation

To prove the algorithms in this chapter correct, we’ll appeal to several

properties of shortest paths and relaxation. We state these properties

here, and Section 22.5 proves them formally. For your reference, each property stated here includes the appropriate lemma or corollary

number from Section 22.5. The latter five of these properties, which refer to shortest-path estimates or the predecessor subgraph, implicitly

assume that the graph is initialized with a call to INITIALIZE-

SINGLE-SOURCE( G, s) and that the only way that shortest-path

estimates and the predecessor subgraph change are by some sequence of

relaxation steps.

Triangle inequality (Lemma 22.10)

For any edge (u, v) ∈ E, we have δ(s, v) ≤ δ(s, u) + w(u, v).

Upper-bound property (Lemma 22.11)

We always have v.d ≥ δ(s, v) for all vertices v ∈ V, and once v.d achieves the value δ(s, v), it never changes.

No-path property (Corollary 22.12)

If there is no path from s to v, then we always have v.d = δ(s, v) = ∞.

Convergence property (Lemma 22.14)

If s ⇝ u → v is a shortest path in G for some u, v ∈ V, and if u.d =

δ(s, u) at any time prior to relaxing edge (u, v), then v.d = δ(s, v) at all times afterward.

Path-relaxation property (Lemma 22.15)

If p = 〈v0, v1, …, vk〉 is a shortest path from s = v0 to vk, and the edges of p are relaxed in the order (v0, v1), (v1, v2), …, (vk−1, vk), then vk.d = δ(s, vk). This property holds regardless of any other

relaxation steps that occur, even if they are intermixed with relaxations of the edges of p.

Predecessor-subgraph property (Lemma 22.17)

Once v.d = δ(s, v) for all v ∈ V, the predecessor subgraph Gπ is a shortest-paths tree rooted at s.

Chapter outline

Section 22.1 presents the Bellman-Ford algorithm, which solves the single-source shortest-paths problem in the general case in which edges

can have negative weight. The Bellman-Ford algorithm is remarkably

simple, and it has the further benefit of detecting whether a negative-

weight cycle is reachable from the source. Section 22.2 gives a linear-time algorithm for computing shortest paths from a single source in a

directed acyclic graph. Section 22.3 covers Dijkstra’s algorithm, which has a lower running time than the Bellman-Ford algorithm but requires

the edge weights to be nonnegative. Section 22.4 shows how to use the Bellman-Ford algorithm to solve a special case of linear programming.

Finally, Section 22.5 proves the properties of shortest paths and relaxation stated above.

This chapter does arithmetic with infinities, and so we need some

conventions for when ∞ or −∞ appears in an arithmetic expression. We

assume that for any real number a ≠ −∞, we have a + ∞ = ∞ + a = ∞.

Also, to make our proofs hold in the presence of negative-weight cycles,

we assume that for any real number a ≠ ∞, we have a + (−∞) = (−∞) + a

= −∞.
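Conveniently, these conventions match IEEE floating-point infinities, so an implementation can use them directly. A quick Python check (our example, not from the text):

```python
import math

# The chapter's conventions agree with IEEE arithmetic on math.inf:
assert 3 + math.inf == math.inf            # a + inf = inf   for a != -inf
assert 3 + (-math.inf) == -math.inf        # a + (-inf) = -inf  for a != inf

# The one case the conventions leave undefined, inf + (-inf),
# is exactly the case IEEE arithmetic flags as NaN:
assert math.isnan(math.inf + (-math.inf))
```

This is why the Python sketches in this chapter can relax edges with plain addition on math.inf without special-casing unreachable vertices.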

All algorithms in this chapter assume that the directed graph G is

stored in the adjacency-list representation. Additionally, stored with

each edge is its weight, so that as each algorithm traverses an adjacency

list, it can find edge weights in O(1) time per edge.

22.1 The Bellman-Ford algorithm

The Bellman-Ford algorithm solves the single-source shortest-paths

problem in the general case in which edge weights may be negative.

Given a weighted, directed graph G = ( V, E) with source vertex s and weight function w : E → ℝ, the Bellman-Ford algorithm returns a boolean value indicating whether there is a negative-weight cycle that is

reachable from the source. If there is such a cycle, the algorithm

indicates that no solution exists. If there is no such cycle, the algorithm

produces the shortest paths and their weights.

The procedure BELLMAN-FORD relaxes edges, progressively

decreasing an estimate v.d on the weight of a shortest path from the source s to each vertex v ∈ V until it achieves the actual shortest-path weight δ(s, v). The algorithm returns TRUE if and only if the graph contains no negative-weight cycles that are reachable from the source.

BELLMAN-FORD(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  for i = 1 to |G.V| − 1
3      for each edge (u, v) ∈ G.E
4          RELAX(u, v, w)
5  for each edge (u, v) ∈ G.E
6      if v.d > u.d + w(u, v)
7          return FALSE
8  return TRUE
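As a concrete sketch (our names and representation, not the text's), here is BELLMAN-FORD in Python over an edge list, the representation of Exercise 22.1-5:

```python
import math

def bellman_ford(vertices, edges, s):
    """BELLMAN-FORD sketch. edges: list of directed edges (u, v, w).

    Returns (ok, d, pi): ok is False iff a negative-weight cycle is
    reachable from s. Runs in O(VE) time on this representation.
    """
    d = {v: math.inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):   # |V| - 1 passes (lines 2-4)
        for u, v, w in edges:
            if d[u] + w < d[v]:          # relax edge (u, v)
                d[v] = d[u] + w
                pi[v] = u
    for u, v, w in edges:                # negative-cycle check (lines 5-7)
        if d[u] + w < d[v]:
            return False, d, pi
    return True, d, pi
```

On the three-vertex graph with edges s→a (1), a→b (−2), s→b (4), the procedure returns TRUE with d[b] = −1 via predecessor a; adding b→a (−2) creates a reachable negative-weight cycle and the check returns FALSE.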

Figure 22.4 shows the execution of the Bellman-Ford algorithm on a

graph with 5 vertices. After initializing the d and π values of all vertices

in line 1, the algorithm makes | V| − 1 passes over the edges of the graph.

Each pass is one iteration of the for loop of lines 2–4 and consists of

relaxing each edge of the graph once. Figures 22.4(b)–(e) show the state of the algorithm after each of the four passes over the edges. After

making | V| − 1 passes, lines 5–8 check for a negative-weight cycle and

return the appropriate boolean value. (We’ll see a little later why this

check works.)


Figure 22.4 The execution of the Bellman-Ford algorithm. The source is vertex s. The d values appear within the vertices, and blue edges indicate predecessor values: if edge ( u, v) is blue, then v.π = u. In this particular example, each pass relaxes the edges in the order ( t, x), ( t, y), ( t, z), ( x, t), ( y, x), ( y, z), ( z, x), ( z, s), ( s, t), ( s, y). (a) The situation just before the first pass over the edges.

(b)–(e) The situation after each successive pass over the edges. Vertices whose shortest-path estimates and predecessors have changed due to a pass are highlighted in orange. The d and π

values in part (e) are the final values. The Bellman-Ford algorithm returns TRUE in this example.

The Bellman-Ford algorithm runs in O(V² + VE) time when the graph is represented by adjacency lists, since the initialization in line 1

takes Θ(V) time, each of the |V| − 1 passes over the edges in lines 2–4

takes Θ(V + E) time (examining |V| adjacency lists to find the |E| edges), and the for loop of lines 5–7 takes O(V + E) time. Fewer than |V| − 1

passes over the edges sometimes suffice (see Exercise 22.1-3), which is

why we say O(V² + VE) time, rather than Θ(V² + VE) time. In the frequent case where |E| = Ω(V), we can express this running time as O(VE). Exercise 22.1-5 asks you to make the Bellman-Ford algorithm

run in O(VE) time even when |E| = o(V).

To prove the correctness of the Bellman-Ford algorithm, we start by

showing that if there are no negative-weight cycles, the algorithm

computes correct shortest-path weights for all vertices reachable from

the source.

Lemma 22.2

Let G = ( V, E) be a weighted, directed graph with source vertex s and weight function w : E → ℝ, and assume that G contains no negative-weight cycles that are reachable from s. Then, after the | V| − 1 iterations of the for loop of lines 2–4 of BELLMAN-FORD, v.d = δ( s, v) for all vertices v that are reachable from s.

Proof We prove the lemma by appealing to the path-relaxation

property. Consider any vertex v that is reachable from s, and let p = 〈 v 0, v 1, … , vk〉, where v 0 = s and vk = v, be any shortest path from s to v.

Because shortest paths are simple, p has at most | V| − 1 edges, and so k

≤ | V| − 1. Each of the | V| − 1 iterations of the for loop of lines 2–4

relaxes all | E| edges. Among the edges relaxed in the i th iteration, for i =

1, 2, … , k, is ( vi−1, vi). By the path-relaxation property, therefore, v.d =

vk.d = δ( s, vk) = δ( s, v).

Corollary 22.3

Let G = (V, E) be a weighted, directed graph with source vertex s and weight function w : E → ℝ. Then, for each vertex v ∈ V, there is a path from s to v if and only if BELLMAN-FORD terminates with v.d < ∞

when it is run on G.

Proof The proof is left as Exercise 22.1-2.

Theorem 22.4 (Correctness of the Bellman-Ford algorithm)

Let BELLMAN-FORD be run on a weighted, directed graph G = ( V,

E) with source vertex s and weight function w : E → ℝ. If G contains no negative-weight cycles that are reachable from s, then the algorithm

returns TRUE, v.d = δ(s, v) for all vertices v ∈ V, and the predecessor subgraph Gπ is a shortest-paths tree rooted at s. If G does contain a negative-weight cycle reachable from s, then the algorithm returns

FALSE.


Proof Suppose that graph G contains no negative-weight cycles that are reachable from the source s. We first prove the claim that at

termination, v.d = δ(s, v) for all vertices v ∈ V. If vertex v is reachable

proven. The predecessor-subgraph property, along with the claim,

implies that G π is a shortest-paths tree. Now we use the claim to show

that BELLMAN-FORD returns TRUE. At termination, for all edges

( u, v) ∈ E we have

v.d = δ( s, v)

≤ δ( s, u) + w( u, v) (by the triangle inequality)

= u.d + w( u, v),

and so none of the tests in line 6 causes BELLMAN-FORD to return

FALSE. Therefore, it returns TRUE.

Now, suppose that graph G contains a negative-weight cycle

reachable from the source s. Let this cycle be c = 〈v0, v1, …, vk〉, where v0 = vk, in which case we have

Σ_{i=1}^{k} w(vi−1, vi) < 0.    (22.1)

Assume for the purpose of contradiction that the Bellman-Ford

algorithm returns TRUE. Thus, vi.d ≤ vi−1.d + w(vi−1, vi) for i = 1, 2,

…, k. Summing the inequalities around cycle c gives

Σ_{i=1}^{k} vi.d ≤ Σ_{i=1}^{k} (vi−1.d + w(vi−1, vi))
             = Σ_{i=1}^{k} vi−1.d + Σ_{i=1}^{k} w(vi−1, vi).

Since v0 = vk, each vertex in c appears exactly once in each of the summations Σ_{i=1}^{k} vi.d and Σ_{i=1}^{k} vi−1.d, and so

Σ_{i=1}^{k} vi.d = Σ_{i=1}^{k} vi−1.d.

Moreover, by Corollary 22.3, vi.d is finite for i = 1, 2, …, k. Thus,

0 ≤ Σ_{i=1}^{k} w(vi−1, vi),

which contradicts inequality (22.1). We conclude that the Bellman-Ford

algorithm returns TRUE if graph G contains no negative-weight cycles

reachable from the source, and FALSE otherwise.

Exercises

22.1-1

Run the Bellman-Ford algorithm on the directed graph of Figure 22.4, using vertex z as the source. In each pass, relax edges in the same order

as in the figure, and show the d and π values after each pass. Now, change the weight of edge ( z, x) to 4 and run the algorithm again, using s as the source.

22.1-2

Prove Corollary 22.3.

22.1-3

Given a weighted, directed graph G = (V, E) with no negative-weight cycles, let m be the maximum over all vertices v ∈ V of the minimum number of edges in a shortest path from the source s to v. (Here, the shortest path is by weight, not the number of edges.) Suggest a simple

change to the Bellman-Ford algorithm that allows it to terminate in m +

1 passes, even if m is not known in advance.

22.1-4

Modify the Bellman-Ford algorithm so that it sets v.d to −∞ for all vertices v for which there is a negative-weight cycle on some path from

the source to v.

22.1-5

Suppose that the graph given as input to the Bellman-Ford algorithm is

represented with a list of | E| edges, where each edge indicates the

vertices it leaves and enters, along with its weight. Argue that the Bellman-Ford algorithm runs in O( VE) time without the constraint that

| E| = Ω( V). Modify the Bellman-Ford algorithm so that it runs in O( VE) time in all cases when the input graph is represented with adjacency

lists.

22.1-6

Let G = ( V, E) be a weighted, directed graph with weight function w : E

→ ℝ. Give an O(VE)-time algorithm to find, for all vertices v ∈ V, the value δ*(v) = min {δ(u, v) : u ∈ V}.

22.1-7

Suppose that a weighted, directed graph G = ( V, E) contains a negative-weight cycle. Give an efficient algorithm to list the vertices of one such

cycle. Prove that your algorithm is correct.

22.2 Single-source shortest paths in directed acyclic graphs

In this section, we introduce one further restriction on weighted,

directed graphs: they are acyclic. That is, we are concerned with

weighted dags. Shortest paths are always well defined in a dag, since

even if there are negative-weight edges, no negative-weight cycles can

exist. We’ll see that if the edges of a weighted dag G = ( V, E) are relaxed according to a topological sort of its vertices, it takes only Θ( V + E) time to compute shortest paths from a single source.

The algorithm starts by topologically sorting the dag (see Section

20.4) to impose a linear ordering on the vertices. If the dag contains a

path from vertex u to vertex v, then u precedes v in the topological sort.

The DAG-SHORTEST-PATHS procedure makes just one pass over the

vertices in the topologically sorted order. As it processes each vertex, it

relaxes each edge that leaves the vertex. Figure 22.5 shows the execution of this algorithm.

DAG-SHORTEST-PATHS(G, w, s)
1  topologically sort the vertices of G
2  INITIALIZE-SINGLE-SOURCE(G, s)
3  for each vertex u ∈ G.V, taken in topologically sorted order
4      for each vertex v ∈ G.Adj[u]
5          RELAX(u, v, w)
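A Python sketch of the procedure (names ours) makes the two phases explicit: a topological sort, here done with Kahn's algorithm of repeatedly removing in-degree-0 vertices, followed by one relaxation pass in that order:

```python
import math
from collections import deque

def dag_shortest_paths(adj, s):
    """DAG-SHORTEST-PATHS sketch. adj: dict u -> list of (v, w) edges.

    Topologically sorts the dag (Kahn's algorithm), then relaxes each
    edge exactly once. Theta(V + E) time overall.
    """
    # Line 1: topological sort via in-degree counting.
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v, _ in adj[u]:
            indeg[v] += 1
    q = deque(u for u in adj if indeg[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    # Line 2: INITIALIZE-SINGLE-SOURCE.
    d = {v: math.inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    # Lines 3-5: one relaxation pass in topologically sorted order.
    for u in order:
        for v, w in adj[u]:
            if d[u] + w < d[v]:     # relax (u, v)
                d[v] = d[u] + w
                pi[v] = u
    return d, pi
```

Note that a vertex preceding the source in the topological order (unreachable from s) keeps d = ∞: relaxing its outgoing edges does nothing, since ∞ + w is still ∞.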

Let’s analyze the running time of this algorithm. As shown in

Section 20.4, the topological sort of line 1 takes Θ( V + E) time. The call of INITIALIZE-SINGLE-SOURCE in line 2 takes Θ( V) time. The for

loop of lines 3–5 makes one iteration per vertex. Altogether, the for loop

of lines 4–5 relaxes each edge exactly once. (We have used an aggregate

analysis here.) Because each iteration of the inner for loop takes Θ(1)

time, the total running time is Θ( V + E), which is linear in the size of an adjacency-list representation of the graph.

The following theorem shows that the DAG-SHORTEST-PATHS

procedure correctly computes the shortest paths.

Theorem 22.5

If a weighted, directed graph G = (V, E) has source vertex s and no cycles, then at the termination of the DAG-SHORTEST-PATHS procedure, v.d = δ(s, v) for all vertices v ∈ V, and the predecessor subgraph Gπ is a shortest-paths tree.

Proof We first show that v.d = δ(s, v) for all vertices v ∈ V at termination. If v is not reachable from s, then v.d = δ(s, v) = ∞ by the no-path property. Now, suppose that v is reachable from s, so that there is a shortest path p = 〈v0, v1, … , vk〉, where v0 = s and vk = v. Because DAG-SHORTEST-PATHS processes the vertices in topologically sorted order, it relaxes the edges on p in the order (v0, v1), (v1, v2), … , (vk−1, vk). The path-relaxation property implies that vi.d = δ(s, vi) at termination for i = 0, 1, … , k. Finally, by the predecessor-subgraph property, Gπ is a shortest-paths tree.

A useful application of this algorithm arises in determining critical

paths in PERT chart² analysis. A job consists of several tasks. Each task

takes a certain amount of time, and some tasks must be completed before others can be started. For example, if the job is to build a house,

then the foundation must be completed before starting to frame the

exterior walls, which must be completed before starting on the roof.

Some tasks require more than one other task to be completed before

they can be started: before the drywall can be installed over the wall

framing, both the electrical system and plumbing must be installed. A

dag models the tasks and dependencies. Edges represent tasks, with the

weight of an edge indicating the time required to perform the task.

Vertices represent “milestones,” which are achieved when all the tasks

represented by the edges entering the vertex have been completed. If

edge ( u, v) enters vertex v and edge ( v, x) leaves v, then task ( u, v) must be completed before task ( v, x) is started. A path through this dag represents a sequence of tasks that must be performed in a particular

order. A critical path is a longest path through the dag, corresponding to the longest time to perform any sequence of tasks. Thus, the weight of a

critical path provides a lower bound on the total time to perform all the

tasks, even if as many tasks as possible are performed simultaneously.

You can find a critical path by either

• negating the edge weights and running DAG-SHORTEST-PATHS, or

• running DAG-SHORTEST-PATHS, but replacing “∞” by “−∞” in line 2 of INITIALIZE-SINGLE-SOURCE and “>” by “<” in the RELAX procedure.
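The first method can be sketched in Python. The function and graph representation below are our own illustration, not the book's code: it topologically sorts the dag (here by Kahn's in-degree method), relaxes with negated task durations, and negates the results back, so each returned value is the longest-path weight from the source.

```python
from collections import deque
from math import inf

def longest_path_weights(adj, w, s):
    """Longest-path (critical-path) weights from s in a dag, by
    negating the weights and running dag shortest-path relaxation.

    adj: vertex -> list of successors (every vertex is a key)
    w:   (u, v) -> nonnegative task duration
    """
    # Topological sort: repeatedly remove in-degree-0 vertices.
    indeg = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    q = deque(u for u in adj if indeg[u] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)

    # Shortest paths under negated weights = longest paths originally.
    d = {u: inf for u in adj}
    d[s] = 0
    for u in order:
        for v in adj[u]:
            if d[u] - w[u, v] < d[v]:
                d[v] = d[u] - w[u, v]
    # Negate back; vertices unreachable from s come out as -inf.
    return {u: (-d[u] if d[u] != inf else -inf) for u in d}
```

In a PERT chart, the largest returned value is the weight of a critical path from the source milestone.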


Figure 22.5 The execution of the algorithm for shortest paths in a directed acyclic graph. The vertices are topologically sorted from left to right. The source vertex is s. The d values appear within the vertices, and blue edges indicate the π values. (a) The situation before the first iteration of the for loop of lines 3–5. (b)–(g) The situation after each iteration of the for loop of lines 3–5. Blue vertices have had their outgoing edges relaxed. The vertex highlighted in orange was used as u in that iteration. Each edge highlighted in orange caused a d value to change when it was relaxed in that iteration. The values shown in part (g) are the final values.

Exercises

22.2-1

Show the result of running DAG-SHORTEST-PATHS on the directed

acyclic graph of Figure 22.5, using vertex r as the source.

22.2-2

Suppose that you change line 3 of DAG-SHORTEST-PATHS to read

3 for the first | V| − 1 vertices, taken in topologically sorted order

Show that the procedure remains correct.

22.2-3

An alternative way to represent a PERT chart looks more like the dag of

Figure 20.7 on page 574. Vertices represent tasks and edges represent sequencing constraints, that is, edge ( u, v) indicates that task u must be performed before task v. Vertices, not edges, have weights. Modify the

DAG-SHORTEST-PATHS procedure so that it finds a longest path in

a directed acyclic graph with weighted vertices in linear time.

22.2-4

Give an efficient algorithm to count the total number of paths in a

directed acyclic graph. The count should include all paths between all

pairs of vertices and all paths with 0 edges. Analyze your algorithm.

22.3 Dijkstra’s algorithm

Dijkstra’s algorithm solves the single-source shortest-paths problem on

a weighted, directed graph G = ( V, E), but it requires nonnegative weights on all edges: w( u, v) ≥ 0 for each edge ( u, v) ∈ E. As we shall see, with a good implementation, the running time of Dijkstra’s algorithm is

lower than that of the Bellman-Ford algorithm.

You can think of Dijkstra’s algorithm as generalizing breadth-first

search to weighted graphs. A wave emanates from the source, and the

first time that a wave arrives at a vertex, a new wave emanates from that

vertex. Whereas breadth-first search operates as if each wave takes unit

time to traverse an edge, in a weighted graph, the time for a wave to

traverse an edge is given by the edge’s weight. Because a shortest path in

a weighted graph might not have the fewest edges, a simple, first-in,

first-out queue won’t suffice for choosing the next vertex from which to

send out a wave.

Instead, Dijkstra’s algorithm maintains a set S of vertices whose final

shortest-path weights from the source s have already been determined.

The algorithm repeatedly selects the vertex u ∈ V − S with the minimum shortest-path estimate, adds u into S, and relaxes all edges leaving u.

The procedure DIJKSTRA replaces the first-in, first-out queue of

breadth-first search by a min-priority queue Q of vertices, keyed by their d values.

DIJKSTRA(G, w, s)
1   INITIALIZE-SINGLE-SOURCE(G, s)
2   S = Ø
3   Q = Ø
4   for each vertex u ∈ G.V
5       INSERT(Q, u)
6   while Q ≠ Ø
7       u = EXTRACT-MIN(Q)
8       S = S ∪ {u}
9       for each vertex v in G.Adj[u]
10          RELAX(u, v, w)
11          if the call of RELAX decreased v.d
12              DECREASE-KEY(Q, v, v.d)

Dijkstra’s algorithm relaxes edges as shown in Figure 22.6. Line 1

initializes the d and π values in the usual way, and line 2 initializes the

set S to the empty set. The algorithm maintains the invariant that Q =

V − S at the start of each iteration of the while loop of lines 6–12. Lines 3–5 initialize the min-priority queue Q to contain all the vertices in V.

Since S = Ø at that time, the invariant is true upon first reaching line 6.

Each time through the while loop of lines 6–12, line 7 extracts a vertex u

from Q = V − S and line 8 adds it to set S, thereby maintaining the invariant. (The first time through this loop, u = s.) Vertex u, therefore, has the smallest shortest-path estimate of any vertex in V − S. Then, lines 9–12 relax each edge (u, v) leaving u, thus updating the estimate v.d and the predecessor v.π if the shortest path to v found so far improves by going through u. Whenever a relaxation step changes the d and π

values, the call to DECREASE-KEY in line 12 updates the min-priority

queue. The algorithm never inserts vertices into Q after the for loop of

lines 4–5, and each vertex is extracted from Q and added to S exactly

once, so that the while loop of lines 6–12 iterates exactly | V| times.
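A practical Python sketch follows, with all names and the graph representation our own. Python's heapq module provides INSERT and EXTRACT-MIN but no DECREASE-KEY, so a common substitute is used here: whenever a d value drops, push a fresh (d, v) entry and skip stale entries on extraction. The queue also holds only vertices reached so far, in the spirit of Exercise 22.3-4; unreached vertices keep d = ∞ either way.

```python
import heapq
from math import inf

def dijkstra(adj, w, s):
    """Dijkstra's algorithm with heapq standing in for the
    min-priority queue (lazy deletion instead of DECREASE-KEY).

    adj: vertex -> list of successors (every vertex is a key)
    w:   (u, v) -> nonnegative edge weight
    Returns (d, pi).
    """
    d = {u: inf for u in adj}            # INITIALIZE-SINGLE-SOURCE
    pi = {u: None for u in adj}
    d[s] = 0
    S = set()
    Q = [(0, s)]                         # holds only reached vertices
    while Q:
        _, u = heapq.heappop(Q)          # EXTRACT-MIN
        if u in S:
            continue                     # stale duplicate entry: skip
        S.add(u)
        for v in adj[u]:                 # relax each edge leaving u
            if d[u] + w[u, v] < d[v]:    # RELAX(u, v, w)
                d[v] = d[u] + w[u, v]
                pi[v] = u
                heapq.heappush(Q, (d[v], v))   # in place of DECREASE-KEY
    return d, pi
```

With a binary heap this runs in O((V + E) lg V) time; the duplicates cost at most one heap entry per relaxation.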


Figure 22.6 The execution of Dijkstra’s algorithm. The source s is the leftmost vertex. The shortest-path estimates appear within the vertices, and blue edges indicate predecessor values.

Blue vertices belong to the set S, and tan vertices are in the min-priority queue Q = V − S. (a) The situation just before the first iteration of the while loop of lines 6–12. (b)–(f) The situation after each successive iteration of the while loop. In each part, the vertex highlighted in orange was chosen as vertex u in line 7, and each edge highlighted in orange caused a d value and a predecessor to change when the edge was relaxed. The d values and predecessors shown in part (f) are the final values.

Because Dijkstra’s algorithm always chooses the “lightest” or

“closest” vertex in V − S to add to set S, you can think of it as using a greedy strategy. Chapter 15 explains greedy strategies in detail, but you need not have read that chapter to understand Dijkstra’s algorithm.

Greedy strategies do not always yield optimal results in general, but as

the following theorem and its corollary show, Dijkstra’s algorithm does

indeed compute shortest paths. The key is to show that u.d = δ( s, u) each time it adds a vertex u to set S.


Figure 22.7 The proof of Theorem 22.6. Vertex u is selected to be added into set S in line 7 of DIJKSTRA. Vertex y is the first vertex on a shortest path from the source s to vertex u that is not in set S, and x ∈ S is y’s predecessor on that shortest path. The subpath from y to u may or may not re-enter set S.

Theorem 22.6 (Correctness of Dijkstra’s algorithm)

Dijkstra’s algorithm, run on a weighted, directed graph G = ( V, E) with nonnegative weight function w and source vertex s, terminates with u.d

= δ(s, u) for all vertices u ∈ V.

Proof We will show that at the start of each iteration of the while loop

of lines 6–12, we have v.d = δ(s, v) for all v ∈ S. The algorithm terminates when S = V, so that v.d = δ(s, v) for all v ∈ V.

The proof is by induction on the number of iterations of the while

loop, which equals | S| at the start of each iteration. There are two bases:

for | S| = 0, so that S = Ø and the claim is trivially true, and for | S| = 1, so that S = { s} and s.d = δ( s, s) = 0.

For the inductive step, the inductive hypothesis is that v.d = δ(s, v) for all v ∈ S. The algorithm extracts vertex u from V − S. Because the algorithm adds u into S, we need to show that u.d = δ(s, u) at that time.

If there is no path from s to u, then we are done, by the no-path property. If there is a path from s to u, then, as Figure 22.7 shows, let y be the first vertex on a shortest path from s to u that is not in S, and let x ∈ S be the predecessor of y on that shortest path. (We could have y = u or x = s.) Because y appears no later than u on the shortest path and all edge weights are nonnegative, we have δ(s, y) ≤ δ(s, u). Because the call of EXTRACT-MIN in line 7 returned u as having the minimum d value in V − S, we also have u.d ≤ y.d, and the upper-bound property gives δ(s, u) ≤ u.d.

Since x ∈ S, the inductive hypothesis implies that x.d = δ(s, x). During the iteration of the while loop that added x into S, edge (x, y) was relaxed. By the convergence property, y.d received the value of δ(s, y) at that time. Thus, we have

δ(s, y) ≤ δ(s, u) ≤ u.d ≤ y.d

and y.d = δ(s, y), so that

δ(s, y) = δ(s, u) = u.d = y.d.

Hence, u.d = δ( s, u), and by the upper-bound property, this value never changes again.

Corollary 22.7

After Dijkstra’s algorithm is run on a weighted, directed graph G = ( V, E) with nonnegative weight function w and source vertex s, the predecessor subgraph G π is a shortest-paths tree rooted at s.

Proof Immediate from Theorem 22.6 and the predecessor-subgraph

property.

Analysis

How fast is Dijkstra’s algorithm? It maintains the min-priority queue Q

by calling three priority-queue operations: INSERT (in line 5),

EXTRACT-MIN (in line 7), and DECREASE-KEY (in line 12). The

algorithm calls both INSERT and EXTRACT-MIN once per vertex.

Because each vertex u ∈ V is added to set S exactly once, each edge in the adjacency list Adj[u] is examined in the for loop of lines 9–12 exactly once during the course of the algorithm. Since the total number of

edges in all the adjacency lists is | E|, this for loop iterates a total of | E|

times, and thus the algorithm calls DECREASE-KEY at most | E| times

overall. (Observe once again that we are using aggregate analysis.)

Just as in Prim’s algorithm, the running time of Dijkstra’s algorithm

depends on the specific implementation of the min-priority queue Q. A

simple implementation takes advantage of the vertices being numbered

1 to |V|: simply store v.d in the vth entry of an array. Each INSERT and DECREASE-KEY operation takes O(1) time, and each EXTRACT-MIN operation takes O(V) time (since it has to search through the entire array), for a total time of O(V² + E) = O(V²).
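This array-based variant is easy to sketch. In the Python illustration below (adjacency-matrix representation and names are our own), the d array itself acts as the priority queue: DECREASE-KEY is a plain assignment and EXTRACT-MIN is a linear scan over vertices not yet in S.

```python
from math import inf

def dijkstra_dense(W, s):
    """O(V^2) Dijkstra for a graph given as an adjacency matrix,
    where W[u][v] is the edge weight, or inf if there is no edge.
    Returns the list of shortest-path weights from s."""
    n = len(W)
    d = [inf] * n
    d[s] = 0
    in_S = [False] * n
    for _ in range(n):                     # |V| iterations
        # EXTRACT-MIN by scanning the d array over V - S.
        u, best = -1, inf
        for v in range(n):
            if not in_S[v] and d[v] <= best:
                u, best = v, d[v]
        in_S[u] = True
        for v in range(n):                 # relax all edges leaving u
            if d[u] + W[u][v] < d[v]:      # assignment = DECREASE-KEY
                d[v] = d[u] + W[u][v]
    return d
```

For dense graphs with |E| = Θ(V²), this simple scheme is asymptotically as good as any heap-based implementation.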

If the graph is sufficiently sparse (in particular, E = o(V²/lg V)), you can improve the running time by implementing the min-priority queue

with a binary min-heap that includes a way to map between vertices and

their corresponding heap elements. Each EXTRACT-MIN operation

then takes O(lg V) time. As before, there are |V| such operations. The time to build the binary min-heap is O(V). (As noted in Section 21.2, you don’t even need to call BUILD-MIN-HEAP.) Each DECREASE-KEY operation takes O(lg V) time, and there are still at most |E| such operations. The total running time is therefore O((V + E) lg V), which is O(E lg V) in the typical case that |E| = Ω(V). This running time improves upon the straightforward O(V²)-time implementation if E = o(V²/lg V).

By implementing the min-priority queue with a Fibonacci heap (see

page 478), you can improve the running time to O( V lg V + E). The amortized cost of each of the | V| EXTRACT-MIN operations is O(lg V), and each DECREASE-KEY call, of which there are at most | E|, takes only O(1) amortized time. Historically, the development of

Fibonacci heaps was motivated by the observation that Dijkstra’s

algorithm typically makes many more DECREASE-KEY calls than

EXTRACT-MIN calls, so that any method of reducing the amortized

time of each DECREASE-KEY operation to o(lg V) without increasing

the amortized time of EXTRACT-MIN would yield an asymptotically

faster implementation than with binary heaps.

Dijkstra’s algorithm resembles both breadth-first search (see Section

20.2) and Prim’s algorithm for computing minimum spanning trees (see

Section 21.2). It is like breadth-first search in that set S corresponds to the set of black vertices in a breadth-first search. Just as vertices in S

have their final shortest-path weights, so do black vertices in a breadth-

first search have their correct breadth-first distances. Dijkstra’s

algorithm is like Prim’s algorithm in that both algorithms use a min-

priority queue to find the “lightest” vertex outside a given set (the set S

in Dijkstra’s algorithm and the tree being grown in Prim’s algorithm),

add this vertex into the set, and adjust the weights of the remaining vertices outside the set accordingly.

Exercises

22.3-1

Run Dijkstra’s algorithm on the directed graph of Figure 22.2, first using vertex s as the source and then using vertex z as the source. In the style of Figure 22.6, show the d and π values and the vertices in set S

after each iteration of the while loop.

22.3-2

Give a simple example of a directed graph with negative-weight edges

for which Dijkstra’s algorithm produces an incorrect answer. Why

doesn’t the proof of Theorem 22.6 go through when negative-weight

edges are allowed?

22.3-3

Suppose that you change line 6 of Dijkstra’s algorithm to read

6 while | Q| > 1

This change causes the while loop to execute | V| − 1 times instead of | V|

times. Is this proposed algorithm correct?

22.3-4

Modify the DIJKSTRA procedure so that the priority queue Q is more

like the queue in the BFS procedure in that it contains only vertices that

have been reached from source s so far: Q ⊆ V − S and v ∈ Q implies v.d ≠ ∞.

22.3-5

Professor Gaedel has written a program that he claims implements

Dijkstra’s algorithm. The program produces v.d and v.π for each vertex

v ∈ V. Give an O(V + E)-time algorithm to check the output of the professor’s program. It should determine whether the d and π attributes

match those of some shortest-paths tree. You may assume that all edge

weights are nonnegative.

22.3-6

Professor Newman thinks that he has worked out a simpler proof of

correctness for Dijkstra’s algorithm. He claims that Dijkstra’s algorithm

relaxes the edges of every shortest path in the graph in the order in

which they appear on the path, and therefore the path-relaxation

property applies to every vertex reachable from the source. Show that

the professor is mistaken by constructing a directed graph for which

Dijkstra’s algorithm relaxes the edges of a shortest path out of order.

22.3-7

Consider a directed graph G = (V, E) on which each edge (u, v) ∈ E has an associated value r(u, v), which is a real number in the range 0 ≤ r(u, v) ≤ 1 that represents the reliability of a communication channel from

vertex u to vertex v. Interpret r( u, v) as the probability that the channel from u to v will not fail, and assume that these probabilities are independent. Give an efficient algorithm to find the most reliable path

between two given vertices.

22.3-8

Let G = ( V, E) be a weighted, directed graph with positive weight function w : E → {1, 2, … , W} for some positive integer W, and assume that no two vertices have the same shortest-path weights from source

vertex s. Now define an unweighted, directed graph G′ = (V ∪ V′, E′) by replacing each edge (u, v) ∈ E with w(u, v) unit-weight edges in series.

How many vertices does G′ have? Now suppose that you run a breadth-

first search on G′. Show that the order in which the breadth-first search

of G′ colors vertices in V black is the same as the order in which Dijkstra’s algorithm extracts the vertices of V from the priority queue

when it runs on G.

22.3-9

Let G = ( V, E) be a weighted, directed graph with nonnegative weight function w : E → {0, 1, … , W} for some nonnegative integer W.

Modify Dijkstra’s algorithm to compute the shortest paths from a given

source vertex s in O(WV + E) time.

22.3-10


Modify your algorithm from Exercise 22.3-9 to run in O(( V + E) lg W) time. ( Hint: How many distinct shortest-path estimates can VS

contain at any point in time?)

22.3-11

Suppose that you are given a weighted, directed graph G = ( V, E) in which edges that leave the source vertex s may have negative weights, all

other edge weights are nonnegative, and there are no negative-weight

cycles. Argue that Dijkstra’s algorithm correctly finds shortest paths

from s in this graph.

22.3-12

Suppose that you have a weighted directed graph G = ( V, E) in which all edge weights are positive real values in the range [ C, 2 C] for some positive constant C. Modify Dijkstra’s algorithm so that it runs in O( V

+ E) time.

22.4 Difference constraints and shortest paths

Chapter 29 studies the general linear-programming problem, showing how to optimize a linear function subject to a set of linear inequalities.

This section investigates a special case of linear programming that

reduces to finding shortest paths from a single source. The Bellman-

Ford algorithm then solves the resulting single-source shortest-paths

problem, thereby also solving the linear-programming problem.

Linear programming

In the general linear-programming problem, the input is an m × n matrix A, an m-vector b, and an n-vector c. The goal is to find a vector x of n elements that maximizes the objective function c1x1 + c2x2 + ⋯ + cnxn subject to the m constraints given by Ax ≤ b.

The most popular method for solving linear programs is the simplex

algorithm, which Section 29.1 discusses. Although the simplex algorithm does not always run in time polynomial in the size of its

input, there are other linear-programming algorithms that do run in


polynomial time. We offer here two reasons to understand the setup of

linear-programming problems. First, if you know that you can cast a

given problem as a polynomial-sized linear-programming problem, then

you immediately have a polynomial-time algorithm to solve the

problem. Second, faster algorithms exist for many special cases of linear

programming. For example, the single-pair shortest-path problem

(Exercise 22.4-4) and the maximum-flow problem (Exercise 24.1-5) are

special cases of linear programming.

Sometimes the objective function does not matter: it’s enough just to

find any feasible solution, that is, any vector x that satisfies Ax ≤ b, or to determine that no feasible solution exists. This section focuses on one

such feasibility problem.

Systems of difference constraints

In a system of difference constraints, each row of the linear-programming matrix A contains one 1 and one −1, and all other entries of A are 0. Thus, the constraints given by Ax ≤ b are a set of m difference constraints involving n unknowns, in which each constraint is a simple linear inequality of the form

xj − xi ≤ bk,

where 1 ≤ i, j ≤ n, i ≠ j, and 1 ≤ k ≤ m.

For example, consider the problem of finding a 5-vector x = (xi) that satisfies

    ⎡  1  −1   0   0   0 ⎤        ⎡  0 ⎤
    ⎢  1   0   0   0  −1 ⎥        ⎢ −1 ⎥
    ⎢  0   1   0   0  −1 ⎥        ⎢  1 ⎥
    ⎢ −1   0   1   0   0 ⎥  x  ≤  ⎢  5 ⎥        (22.1)
    ⎢ −1   0   0   1   0 ⎥        ⎢  4 ⎥
    ⎢  0   0  −1   1   0 ⎥        ⎢ −1 ⎥
    ⎢  0   0  −1   0   1 ⎥        ⎢ −3 ⎥
    ⎣  0   0   0  −1   1 ⎦        ⎣ −3 ⎦

This problem is equivalent to finding values for the unknowns x1, x2, x3, x4, x5, satisfying the following 8 difference constraints:

x1 − x2 ≤ 0,      (22.2)
x1 − x5 ≤ −1,     (22.3)
x2 − x5 ≤ 1,      (22.4)
x3 − x1 ≤ 5,      (22.5)
x4 − x1 ≤ 4,      (22.6)
x4 − x3 ≤ −1,     (22.7)
x5 − x3 ≤ −3,     (22.8)
x5 − x4 ≤ −3.     (22.9)

One solution to this problem is x = (−5, −3, 0, −1, −4), which you can

verify directly by checking each inequality. In fact, this problem has

more than one solution. Another is x′ = (0, 2, 5, 4, 1). These two solutions are related: each component of x′ is 5 larger than the

corresponding component of x. This fact is not mere coincidence.

Lemma 22.8

Let x = (x1, x2, … , xn) be a solution to a system Ax ≤ b of difference constraints, and let d be any constant. Then x + d = (x1 + d, x2 + d, … , xn + d) is a solution to Ax ≤ b as well.

Proof For each xi and xj, we have (xj + d) − (xi + d) = xj − xi. Thus, if x satisfies Ax ≤ b, so does x + d.
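Lemma 22.8 is easy to check mechanically. The snippet below is our own encoding (the constraint triples are reconstructed from the example system and its two stated solutions): each constraint xj − xi ≤ bk becomes a triple, and we verify that both x and x + d satisfy every inequality.

```python
def satisfies(constraints, x):
    """Check a candidate solution against difference constraints
    given as (j, i, b) triples meaning x[j] - x[i] <= b
    (0-based indices)."""
    return all(x[j] - x[i] <= b for (j, i, b) in constraints)

# The eight constraints of the example system, 0-based:
# x1 - x2 <= 0 becomes the triple (0, 1, 0), and so on.
constraints = [(0, 1, 0), (0, 4, -1), (1, 4, 1), (2, 0, 5),
               (3, 0, 4), (3, 2, -1), (4, 2, -3), (4, 3, -3)]
x = [-5, -3, 0, -1, -4]
```

Shifting every component by the same d = 5 yields the text's second solution x′ = (0, 2, 5, 4, 1), which is feasible as well.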

Systems of difference constraints occur in various applications. For

example, the unknowns xi might be times at which events are to occur.

Each constraint states that at least a certain amount of time, or at most

a certain amount of time, must elapse between two events. Perhaps the

events are jobs to be performed during the assembly of a product. If the

manufacturer applies an adhesive that takes 2 hours to set at time x 1

and has to wait until it sets to install a part at time x 2, then there is a

constraint that x 2 ≥ x 1 + 2 or, equivalently, that x 1 − x 2 ≤ −2.

Alternatively, the manufacturer might require the part to be installed

after the adhesive has been applied but no later than the time that the

adhesive has set halfway. In this case, there is a pair of constraints x 2 ≥

x 1 and x 2 ≤ x 1 + 1 or, equivalently, x 1 − x 2 ≤ 0 and x 2 − x 1 ≤ 1.

If all the constraints have nonnegative numbers on the right-hand side—that is, if bi ≥ 0 for i = 1, 2, … , m—then finding a feasible solution is trivial: just set all the unknowns xi equal to each other. Then

all the differences are 0, and every constraint is satisfied. The problem of

finding a feasible solution to a system of difference constraints is

interesting only if at least one constraint has bi < 0.

Constraint graphs

We can interpret systems of difference constraints from a graph-

theoretic point of view. For a system Ax ≤ b of difference constraints, let’s view the m × n linear-programming matrix A as the transpose of an incidence matrix (see Exercise 20.1-7) for a graph with n vertices and m

edges. Each vertex vi in the graph, for i = 1, 2, … , n, corresponds to one of the n unknown variables xi. Each directed edge in the graph corresponds to one of the m inequalities involving two unknowns.

More formally, given a system Ax ≤ b of difference constraints, the

corresponding constraint graph is a weighted, directed graph G = ( V, E), where

V = { v 0, v 1, … , vn}

and

E = {(vi, vj) : xj − xi ≤ bk is a constraint}

∪ {( v 0, v 1), ( v 0, v 2), ( v 0, v 3), … , ( v 0, vn)}.

The constraint graph includes the additional vertex v 0, as we shall see

shortly, to guarantee that the graph has some vertex that can reach all

other vertices. Thus, the vertex set V consists of a vertex vi for each unknown xi, plus an additional vertex v0. The edge set E contains an edge for each difference constraint, plus an edge (v0, vi) for each unknown xi. If xj − xi ≤ bk is a difference constraint, then the weight of edge (vi, vj) is w(vi, vj) = bk. The weight of each edge leaving v0 is 0.


Figure 22.8 shows the constraint graph for the system (22.2)–(22.9) of difference constraints.

Figure 22.8 The constraint graph corresponding to the system (22.2)–(22.9) of difference constraints. The value of δ( v 0, vi) appears in each vertex vi. One feasible solution to the system is x = (−5, −3, 0, −1, −4).

The following theorem shows how to solve a system of difference

constraints by finding shortest-path weights in the corresponding

constraint graph.

Theorem 22.9

Given a system Ax ≤ b of difference constraints, let G = (V, E) be the corresponding constraint graph. If G contains no negative-weight cycles, then

x = (δ(v0, v1), δ(v0, v2), … , δ(v0, vn))        (22.10)

is a feasible solution for the system. If G contains a negative-weight cycle, then there is no feasible solution for the system.

Proof We first show that if the constraint graph contains no negative-weight cycles, then equation (22.10) gives a feasible solution. Consider any edge (vi, vj) ∈ E. The triangle inequality implies that δ(v0, vj) ≤ δ(v0, vi) + w(vi, vj), which is equivalent to δ(v0, vj) − δ(v0, vi) ≤ w(vi, vj). Thus, letting xi = δ(v0, vi) and xj = δ(v0, vj) satisfies the difference constraint xj − xi ≤ w(vi, vj) that corresponds to edge (vi, vj).

Now we show that if the constraint graph contains a negative-weight

cycle, then the system of difference constraints has no feasible solution.

Without loss of generality, let the negative-weight cycle be c = 〈v1, v2, … , vk〉, where v1 = vk. (The vertex v0 cannot be on cycle c, because it has no entering edges.) Cycle c corresponds to the following difference constraints:

x2 − x1 ≤ w(v1, v2),
x3 − x2 ≤ w(v2, v3),
⋮
xk−1 − xk−2 ≤ w(vk−2, vk−1),
xk − xk−1 ≤ w(vk−1, vk).

We’ll assume that x is a solution satisfying each of these k inequalities and then derive a contradiction. The solution must also satisfy the

inequality that results from summing the k inequalities together. In

summing the left-hand sides, each unknown xi is added in once and

subtracted out once (remember that v 1 = vk implies x 1 = xk), so that the left-hand side sums to 0. The right-hand side sums to the weight

w( c) of the cycle, giving 0 ≤ w( c). But since c is a negative-weight cycle, w( c) < 0, and we obtain the contradiction that 0 ≤ w( c) < 0.

Solving systems of difference constraints

Theorem 22.9 suggests how to use the Bellman-Ford algorithm to solve

a system of difference constraints. Because the constraint graph

contains edges from the source vertex v 0 to all other vertices, any negative-weight cycle in the constraint graph is reachable from v 0. If the

Bellman-Ford algorithm returns TRUE, then the shortest-path weights

give a feasible solution to the system. In Figure 22.8, for example, the shortest-path weights provide the feasible solution x = (−5, −3, 0, −1,

−4), and by Lemma 22.8, x = ( d − 5, d − 3, d, d − 1, d − 4) is also a feasible solution for any constant d. If the Bellman-Ford algorithm

returns FALSE, there is no feasible solution to the system of difference

constraints.

A system of difference constraints with m constraints on n unknowns produces a graph with n + 1 vertices and n + m edges. Thus, the Bellman-Ford algorithm provides a way to solve the system in O(( n + 1)

(n + m)) = O(n² + nm) time. Exercise 22.4-5 asks you to modify the algorithm to run in O(nm) time, even if m is much less than n.
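The recipe above can be sketched directly in Python (the representation and names are our own, not the book's code): build the constraint graph with the extra source v0, run Bellman-Ford, and return either the shortest-path weights as a feasible solution or None when a negative-weight cycle makes the system infeasible.

```python
from math import inf

def solve_difference_constraints(n, constraints):
    """Solve a system of constraints x_j - x_i <= b, each given as a
    (j, i, b) triple with 1-based variable indices, via Bellman-Ford
    on the constraint graph (Theorem 22.9). Returns a list d with
    d[i] = delta(v0, v_i) for i = 1..n (index 0 is the source v0),
    or None if the system is infeasible."""
    # Constraint graph: edge (v_i, v_j) of weight b per constraint,
    # plus zero-weight edges from v0 to every other vertex.
    edges = [(i, j, b) for (j, i, b) in constraints]
    edges += [(0, i, 0) for i in range(1, n + 1)]
    d = [inf] * (n + 1)
    d[0] = 0
    for _ in range(n):                   # |V| - 1 = n relaxation passes
        for (u, v, wt) in edges:
            if d[u] + wt < d[v]:
                d[v] = d[u] + wt
    for (u, v, wt) in edges:             # negative-weight-cycle check
        if d[u] + wt < d[v]:
            return None
    return d
```

On the example system, the returned values match the feasible solution x = (−5, −3, 0, −1, −4) read off the constraint graph.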

Exercises

22.4-1

Find a feasible solution or determine that no feasible solution exists for

the following system of difference constraints:

x1 − x2 ≤ 1,
x1 − x4 ≤ −4,
x2 − x3 ≤ 2,
x2 − x5 ≤ 7,
x2 − x6 ≤ 5,
x3 − x6 ≤ 10,
x4 − x2 ≤ 2,
x5 − x1 ≤ −1,
x5 − x4 ≤ 3,
x6 − x3 ≤ −8.

22.4-2

Find a feasible solution or determine that no feasible solution exists for

the following system of difference constraints:

x1 − x2 ≤ 4,
x1 − x5 ≤ 5,
x2 − x4 ≤ −6,
x3 − x2 ≤ 1,
x4 − x1 ≤ 3,
x4 − x3 ≤ 5,
x4 − x5 ≤ 10,
x5 − x3 ≤ −4,
x5 − x4 ≤ −8.

22.4-3

Can any shortest-path weight from the new vertex v 0 in a constraint graph be positive? Explain.

22.4-4

Express the single-pair shortest-path problem as a linear program.

22.4-5

Show how to modify the Bellman-Ford algorithm slightly so that when

using it to solve a system of difference constraints with m inequalities on

n unknowns, the running time is O( nm).

22.4-6

Consider adding equality constraints of the form xi = xj + bk to a system of difference constraints. Show how to solve this variety of

constraint system.

22.4-7

Show how to solve a system of difference constraints by a Bellman-

Ford-like algorithm that runs on a constraint graph without the extra

vertex v 0.

22.4-8

Let Ax ≤ b be a system of m difference constraints in n unknowns. Show that the Bellman-Ford algorithm, when run on the corresponding constraint graph, maximizes x1 + x2 + ⋯ + xn subject to Ax ≤ b and xi ≤ 0 for all xi.

22.4-9

Show that the Bellman-Ford algorithm, when run on the constraint graph for a system Ax ≤ b of difference constraints, minimizes the quantity (max {xi} − min {xi}) subject to Ax ≤ b. Explain how this fact might come in handy if the algorithm is used to schedule construction

jobs.

22.4-10

Suppose that every row in the matrix A of a linear program Ax ≤ b corresponds to a difference constraint, a single-variable constraint of the form xi ≤ bk, or a single-variable constraint of the form −xi ≤ bk.

Show how to adapt the Bellman-Ford algorithm to solve this variety of

constraint system.

22.4-11

Give an efficient algorithm to solve a system Ax ≤ b of difference constraints when all of the elements of b are real-valued and all of the

unknowns xi must be integers.

22.4-12

Give an efficient algorithm to solve a system Ax ≤ b of difference constraints when all of the elements of b are real-valued and a specified

subset of some, but not necessarily all, of the unknowns xi must be integers.