cycle in an undirected graph. Give a related decision problem. Give the
language corresponding to the decision problem.
34.1-3
Give a formal encoding of directed graphs as binary strings using an
adjacency-matrix representation. Do the same using an adjacency-list
representation. Argue that the two representations are polynomially
related.
34.1-4
Is the dynamic-programming algorithm for the 0-1 knapsack problem
that is asked for in Exercise 15.2-2 a polynomial-time algorithm?
Explain your answer.
34.1-5
Show that if an algorithm makes at most a constant number of calls to
polynomial-time subroutines and performs an additional amount of
work that also takes polynomial time, then it runs in polynomial time.
Also show that a polynomial number of calls to polynomial-time
subroutines may result in an exponential-time algorithm.
34.1-6
Show that the class P, viewed as a set of languages, is closed under
union, intersection, concatenation, complement, and Kleene star. That
is, if L 1, L 2 ∈ P, then L 1 ∪ L 2 ∈ P, L 1 ∩ L 2 ∈ P, L 1 L 2 ∈ P, the complement L̄ 1 ∈ P, and the Kleene star L 1* ∈ P.
34.2 Polynomial-time verification
Now, let’s look at algorithms that verify membership in languages. For
example, suppose that for a given instance 〈 G, u, v, k〉 of the decision problem PATH, you are also given a path p from u to v. You can check whether p is a path in G and whether the length of p is at most k, and if so, you can view p as a “certificate” that the instance indeed belongs to
PATH. For the decision problem PATH, this certificate doesn’t seem to
buy much. After all, PATH belongs to P—in fact, you can solve PATH
in linear time—and so verifying membership from a given certificate
takes as long as solving the problem from scratch. Instead, let’s examine
a problem for which we know of no polynomial-time decision algorithm
and yet, given a certificate, verification is easy.
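To make the certificate check concrete, here is a sketch in Python. The adjacency-set encoding of G and the function name are illustrative assumptions, not the book's formal encoding; the length of a path is its number of edges.

```python
def verify_path(graph, u, v, k, p):
    """Check a certificate p for the PATH instance <G, u, v, k>:
    p must start at u, end at v, have at most k edges, and use
    only edges that are present in the graph."""
    if not p or p[0] != u or p[-1] != v:
        return False
    if len(p) - 1 > k:                 # path length = number of edges
        return False
    return all(b in graph[a] for a, b in zip(p, p[1:]))

# A small assumed instance: each vertex maps to its neighbor set.
G = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

On this instance, `verify_path(G, 1, 4, 3, [1, 2, 3, 4])` accepts, while the same path is rejected under the tighter budget k = 2.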
Hamiltonian cycles
The problem of finding a hamiltonian cycle in an undirected graph has
been studied for over a hundred years. Formally, a hamiltonian cycle of
an undirected graph G = ( V, E) is a simple cycle that contains each vertex in V. A graph that contains a hamiltonian cycle is said to be hamiltonian, and otherwise, it is nonhamiltonian. The name honors W.
R. Hamilton, who described a mathematical game on the dodecahedron
(Figure 34.2(a)) in which one player sticks five pins in any five consecutive vertices and the other player must complete the path to
form a cycle containing all the vertices.8 The dodecahedron is hamiltonian, and Figure 34.2(a) shows one hamiltonian cycle. Not all

graphs are hamiltonian, however. For example, Figure 34.2(b) shows a bipartite graph with an odd number of vertices. Exercise 34.2-2 asks you
to show that all such graphs are nonhamiltonian.
Here is how to define the hamiltonian-cycle problem, “Does a graph
G have a hamiltonian cycle?” as a formal language:
HAM-CYCLE = {〈 G〉 : G is a hamiltonian graph}.
How might an algorithm decide the language HAM-CYCLE? Given a
problem instance 〈 G〉, one possible decision algorithm lists all
permutations of the vertices of G and then checks each permutation to
see whether it is a hamiltonian cycle. What is the running time of this
algorithm? It depends on the encoding of the graph G. Let’s say that G
is encoded as its adjacency matrix. If the adjacency matrix contains n entries, so that the length of the encoding of G equals n, then the number m of vertices in the graph satisfies m = √n. There are m! possible permutations of the vertices, and therefore the running time is Ω(m!) = Ω((√n)!) = Ω(2^√n), which is not O( n^k) for any constant k. Thus,
this naive algorithm does not run in polynomial time. In fact, the
hamiltonian-cycle problem is NP-complete, as we’ll prove in Section 34.5.
Figure 34.2 (a) A graph representing the vertices, edges, and faces of a dodecahedron, with a hamiltonian cycle shown by edges highlighted in blue. (b) A bipartite graph with an odd number of vertices. Any such graph is nonhamiltonian.
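The permutation-checking algorithm just described can be sketched as follows, assuming (purely for illustration) that G is given as a dictionary mapping each vertex to its neighbor set rather than as an adjacency matrix:

```python
from itertools import permutations

def naive_ham_cycle(adj):
    """Decide HAM-CYCLE by brute force: try every ordering of the
    vertices and test whether consecutive vertices (wrapping around
    to close the cycle) are always adjacent.  With m vertices this
    examines up to m! permutations, so it is not polynomial time."""
    vertices = list(adj)
    m = len(vertices)
    for perm in permutations(vertices):
        if all(perm[(i + 1) % m] in adj[perm[i]] for i in range(m)):
            return True
    return False
```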
Verification algorithms
Consider a slightly easier problem. Suppose that a friend tells you that a
given graph G is hamiltonian, and then the friend offers to prove it by
giving you the vertices in order along the hamiltonian cycle. It would
certainly be easy enough to verify the proof: simply verify that the
provided cycle is hamiltonian by checking whether it is a permutation of
the vertices of V and whether each of the consecutive edges along the
cycle actually exists in the graph. You could certainly implement this
verification algorithm to run in O( n²) time, where n is the length of the encoding of G. Thus, a proof that a hamiltonian cycle exists in a graph
can be verified in polynomial time.
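A sketch of this verification step in Python, under the same assumed adjacency-set encoding (the quadratic bound in the text refers to the encoding length, not to this particular implementation):

```python
def verify_ham_cycle(adj, cycle):
    """Verify a proposed hamiltonian cycle: the certificate must be a
    permutation of the vertex set, and every consecutive pair along it,
    including the wrap-around pair, must be an edge of the graph."""
    if sorted(cycle) != sorted(adj):   # not a permutation of V
        return False
    m = len(cycle)
    return all(cycle[(i + 1) % m] in adj[cycle[i]] for i in range(m))
```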
We define a verification algorithm as being a two-argument algorithm
A, where one argument is an ordinary input string x and the other is a
binary string y called a certificate. A two-argument algorithm A verifies an input string x if there exists a certificate y such that A( x, y) = 1. The language verified by a verification algorithm A is
L = { x ∈ {0, 1}* : there exists y ∈ {0, 1}* such that A( x, y) = 1}.
Think of an algorithm A as verifying a language L if, for any string x
∈ L, there exists a certificate y that A can use to prove that x ∈ L.
Moreover, for any string x ∉ L, there must be no certificate proving that x ∈ L. For example, in the hamiltonian-cycle problem, the certificate is the list of vertices in some hamiltonian cycle. If a graph is hamiltonian,
the hamiltonian cycle itself offers enough information to verify that the
graph is indeed hamiltonian. Conversely, if a graph is not hamiltonian,
there can be no list of vertices that fools the verification algorithm into
believing that the graph is hamiltonian, since the verification algorithm
carefully checks the so-called cycle to be sure.
The complexity class NP
The complexity class NP is the class of languages that can be verified by
a polynomial-time algorithm.9 More precisely, a language L belongs to NP if and only if there exist a two-input polynomial-time algorithm A
and a constant c such that
L = { x ∈ {0, 1}*: there exists a certificate y with | y| = O(| x|^c) such that A( x, y) = 1}.
We say that algorithm A verifies language L in polynomial time.
From our earlier discussion about the hamiltonian-cycle problem,
you can see that HAM-CYCLE ∈ NP. (It is always nice to know that an
important set is nonempty.) Moreover, if L ∈ P, then L ∈ NP, since if
there is a polynomial-time algorithm to decide L, the algorithm can be
converted to a two-argument verification algorithm that simply ignores
any certificate and accepts exactly those input strings it determines to
belong to L. Thus, P ⊆ NP.
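The conversion described above is a one-liner: wrap any decision procedure for L as a verifier that ignores its certificate. A sketch (the toy language below is an assumption for illustration only):

```python
def verifier_from_decider(decide):
    """Turn a decider for L into a two-argument verification algorithm
    A(x, y) that ignores the certificate y; if decide runs in polynomial
    time, so does A, which is the reason P is a subset of NP."""
    def A(x, y):
        return 1 if decide(x) else 0
    return A

# Toy language: binary strings that begin with 0.
A = verifier_from_decider(lambda x: x.startswith("0"))
```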
That leaves the question of whether P = NP. A definitive answer is
unknown, but most researchers believe that P and NP are not the same
class. Think of the class P as consisting of problems that can be solved
quickly and the class NP as consisting of problems for which a solution
can be verified quickly. You may have learned from experience that it is
often more difficult to solve a problem from scratch than to verify a
clearly presented solution, especially when working under time
constraints. Theoretical computer scientists generally believe that this
analogy extends to the classes P and NP, and thus that NP includes
languages that do not belong to P.
Figure 34.3 Four possibilities for relationships among complexity classes. In each diagram, one region enclosing another indicates a proper-subset relation. (a) P = NP = co-NP. Most researchers regard this possibility as the most unlikely. (b) If NP is closed under complement, then NP = co-NP, but it need not be the case that P = NP. (c) P = NP ∩ co-NP, but NP is not closed under complement. (d) NP ≠ co-NP and P ≠ NP ∩ co-NP. Most researchers regard this possibility as the most likely.
There is more compelling, though not conclusive, evidence that P ≠
NP—the existence of languages that are “NP-complete.” Section 34.3
will study this class.
Many other fundamental questions beyond the P ≠ NP question
remain unresolved. Figure 34.3 shows some possible scenarios. Despite much work by many researchers, no one even knows whether the class
NP is closed under complement. That is, does L ∈ NP imply L̄ ∈ NP?
We define the complexity class co-NP as the set of languages L such that L̄ ∈ NP, so that the question of whether NP is closed under
complement is also whether NP = co-NP. Since P is closed under
complement (Exercise 34.1-6), it follows from Exercise 34.2-9 (P ⊆ co-
NP) that P ⊆ NP ∩ co-NP. Once again, however, no one knows whether
P = NP ∩ co-NP or whether there is some language in (NP ∩ co-NP) −
P.
Thus our understanding of the precise relationship between P and
NP is woefully incomplete. Nevertheless, even though we might not be
able to prove that a particular problem is intractable, if we can prove
that it is NP-complete, then we have gained valuable information about
it.
Exercises
34.2-1
Consider the language GRAPH-ISOMORPHISM = {〈 G 1, G 2〉 : G 1
and G 2 are isomorphic graphs}. Prove that GRAPH-ISOMORPHISM
∈ NP by describing a polynomial-time algorithm to verify the language.
34.2-2
Prove that if G is an undirected bipartite graph with an odd number of
vertices, then G is nonhamiltonian.
34.2-3
Show that if HAM-CYCLE ∈ P, then the problem of listing the vertices
of a hamiltonian cycle, in order, is polynomial-time solvable.
34.2-4
Prove that the class NP of languages is closed under union, intersection,
concatenation, and Kleene star. Discuss the closure of NP under
complement.
34.2-5
Show that any language in NP can be decided by an algorithm with a
running time of 2^O(n^k) for some constant k.
34.2-6
A hamiltonian path in a graph is a simple path that visits every vertex
exactly once. Show that the language HAM-PATH = {〈 G, u, v〉 : there is
a hamiltonian path from u to v in graph G} belongs to NP.
34.2-7
Show that the hamiltonian-path problem from Exercise 34.2-6 can be solved in polynomial time on directed acyclic graphs. Give an efficient
algorithm for the problem.
34.2-8
Let ϕ be a boolean formula constructed from the boolean input
variables x 1, x 2, … , xk, negations (¬), ANDs (∧), ORs (∨), and parentheses. The formula ϕ is a tautology if it evaluates to 1 for every assignment of 1 and 0 to the input variables. Define TAUTOLOGY as
the language of boolean formulas that are tautologies. Show that
TAUTOLOGY ∈ co-NP.
34.2-9
Prove that P ⊆ co-NP.
34.2-10
Prove that if NP ≠ co-NP, then P ≠ NP.
34.2-11
Let G be a connected, undirected graph with at least three vertices, and
let G 3 be the graph obtained by connecting all pairs of vertices that are
connected by a path in G of length at most 3. Prove that G 3 is hamiltonian. ( Hint: Construct a spanning tree for G, and use an inductive argument.)
34.3 NP-completeness and reducibility
Perhaps the most compelling reason why theoretical computer scientists
believe that P ≠ NP comes from the existence of the class of NP-
complete problems. This class has the intriguing property that if any
NP-complete problem can be solved in polynomial time, then every
problem in NP has a polynomial-time solution, that is, P = NP. Despite
decades of study, though, no polynomial-time algorithm has ever been
discovered for any NP-complete problem.

The language HAM-CYCLE is one NP-complete problem. If there
were an algorithm to decide HAM-CYCLE in polynomial time, then
every problem in NP could be solved in polynomial time. The NP-
complete languages are, in a sense, the “hardest” languages in NP. In
fact, if NP − P turns out to be nonempty, we will be able to say with
certainty that HAM-CYCLE ∈ NP − P.
This section starts by showing how to compare the relative
“hardness” of languages using a precise notion called “polynomial-time
reducibility.” It then formally defines the NP-complete languages,
finishing by sketching a proof that one such language, called CIRCUIT-
SAT, is NP-complete. Sections 34.4 and 34.5 will use the notion of reducibility to show that many other problems are NP-complete.
Reducibility
One way that sometimes works for solving a problem is to recast it as a
different problem. We call that strategy “reducing” one problem to
another. Think of a problem Q as being reducible to another problem
Q′ if any instance of Q can be recast as an instance of Q′, and the solution to the instance of Q′ provides a solution to the instance of Q.
For example, the problem of solving linear equations in an
indeterminate x reduces to the problem of solving quadratic equations.
Given a linear-equation instance ax + b = 0 (with solution x = −b/a), you can transform it to the quadratic equation ax² + bx + 0 = 0. This
quadratic equation has the solutions x = (−b ± √(b² − 4ac))/(2a), where c = 0, so that √(b² − 4ac) = √(b²) = b. The solutions are then x = (−b + b)/(2a) = 0 and
x = (−b − b)/(2a) = −b/a, thereby providing a solution to ax + b = 0.
Thus, if a problem Q reduces to another problem Q′, then Q is, in a sense, “no harder to solve” than Q′.
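The linear-to-quadratic reduction can be carried out explicitly. The following Python sketch follows the transformation in the text; real coefficients with a ≠ 0 are assumed, and the discriminant is always nonnegative because c = 0.

```python
import math

def solve_quadratic(a, b, c):
    """Return both real roots of ax^2 + bx + c = 0 via the quadratic
    formula (assumes a nonnegative discriminant, which holds when c = 0)."""
    d = math.sqrt(b * b - 4 * a * c)
    return ((-b + d) / (2 * a), (-b - d) / (2 * a))

def solve_linear_via_reduction(a, b):
    """Solve ax + b = 0 by reducing it to the quadratic ax^2 + bx + 0 = 0
    and picking the root that also satisfies the original equation."""
    roots = solve_quadratic(a, b, 0.0)
    return next(x for x in roots if abs(a * x + b) < 1e-9)
```

For example, the instance 2x + 6 = 0 reduces to 2x² + 6x = 0, whose roots are 0 and −3; the root −3 answers the original instance.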

Figure 34.4 A function f that reduces language L 1 to language L 2. For any input x ∈ {0, 1}*, the question of whether x ∈ L 1 has the same answer as the question of whether f ( x) ∈ L 2.
Returning to our formal-language framework for decision problems,
we say that a language L 1 is polynomial-time reducible to a language L 2, written L 1 ≤P L 2, if there exists a polynomial-time computable function f : {0, 1}* → {0, 1}* such that for all x ∈ {0, 1}*,
x ∈ L 1 if and only if f ( x) ∈ L 2.    (34.1)
We call the function f the reduction function, and a polynomial-time algorithm F that computes f is a reduction algorithm.
Figure 34.4 illustrates the idea of a reduction from a language L 1 to another language L 2. Each language is a subset of {0, 1}*. The
reduction function f provides a mapping such that if x ∈ L 1, then f ( x)
∈ L 2. Moreover, if x ∉ L 1, then f ( x) ∉ L 2. Thus, the reduction function maps any instance x of the decision problem represented by the
language L 1 to an instance f ( x) of the problem represented by L 2.
Providing an answer to whether f ( x) ∈ L 2 directly provides the answer to whether x ∈ L 1. If, in addition, f can be computed in polynomial time, it is a polynomial-time reduction function.
Polynomial-time reductions give us a powerful tool for proving that
various languages belong to P.
Lemma 34.3
If L 1, L 2 ⊆ {0, 1}* are languages such that L 1 ≤P L 2, then L 2 ∈ P
implies L 1 ∈ P.
Figure 34.5 The proof of Lemma 34.3. The algorithm F is a reduction algorithm that computes the reduction function f from L 1 to L 2 in polynomial time, and A 2 is a polynomial-time algorithm that decides L 2. Algorithm A 1 decides whether x ∈ L 1 by using F to transform any input x into f ( x) and then using A 2 to decide whether f ( x) ∈ L 2.
Proof Let A 2 be a polynomial-time algorithm that decides L 2, and let F be a polynomial-time reduction algorithm that computes the
reduction function f. We show how to construct a polynomial-time
algorithm A 1 that decides L 1.
Figure 34.5 illustrates how we construct A 1. For a given input x ∈
{0, 1}*, algorithm A 1 uses F to transform x into f ( x), and then it uses A 2 to test whether f ( x) ∈ L 2. Algorithm A 1 takes the output from algorithm A 2 and produces that answer as its own output.
The correctness of A 1 follows from condition (34.1). The algorithm
runs in polynomial time, since both F and A 2 run in polynomial time
(see Exercise 34.1-5).
▪
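The construction in this proof is literally function composition. Here is a sketch, with a toy pair of languages invented for illustration (they are not from the text):

```python
def decider_from_reduction(F, A2):
    """Lemma 34.3 as code: if F computes a reduction f from L1 to L2 and
    A2 decides L2, then x -> A2(F(x)) decides L1.  If both F and A2 run
    in polynomial time, so does the composition."""
    def A1(x):
        return A2(F(x))
    return A1

# Toy example: L1 = even-length strings, L2 = strings ending in '#'.
# F preserves membership in both directions, as condition (34.1) requires.
F = lambda x: x + "#" if len(x) % 2 == 0 else x + "!"
A2 = lambda s: s.endswith("#")
A1 = decider_from_reduction(F, A2)
```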
NP-completeness
Polynomial-time reductions allow us to formally show that one problem
is at least as hard as another, to within a polynomial-time factor. That
is, if L 1 ≤P L 2, then L 1 is not more than a polynomial factor harder than L 2, which is why the “less than or equal to” notation for reduction
is mnemonic. We can now define the set of NP-complete languages,
which are the hardest problems in NP.
A language L ⊆ {0, 1}* is NP-complete if
1. L ∈ NP, and
2. L′ ≤P L for every L′ ∈ NP.
If a language L satisfies property 2, but not necessarily property 1, we
say that L is NP-hard. We also define NPC to be the class of NP-complete languages.
As the following theorem shows, NP-completeness is at the crux of
deciding whether P is in fact equal to NP.
Theorem 34.4
If any NP-complete problem is polynomial-time solvable, then P = NP.
Equivalently, if any problem in NP is not polynomial-time solvable, then
no NP-complete problem is polynomial-time solvable.
Figure 34.6 How most theoretical computer scientists view the relationships among P, NP, and NPC. Both P and NPC are wholly contained within NP, and P ∩ NPC = Ø.
Proof Suppose that L ∈ P and also that L ∈ NPC. For any L′ ∈ NP, we have L′ ≤P L by property 2 of the definition of NP-completeness.
Thus, by Lemma 34.3, we also have that L′ ∈ P, which proves the first
statement of the theorem.
To prove the second statement, consider the contrapositive of the
first statement: if P ≠ NP, then there does not exist an NP-complete
problem that is polynomial-time solvable. But P ≠ NP means that there
is some problem in NP that is not polynomial-time solvable, and hence
the second statement is the contrapositive of the first statement.
▪
It is for this reason that research into the P ≠ NP question centers
around the NP-complete problems. Most theoretical computer scientists
believe that P ≠ NP, which leads to the relationships among P, NP, and
NPC shown in Figure 34.6. For all we know, however, someone may yet come up with a polynomial-time algorithm for an NP-complete
problem, thus proving that P = NP. Nevertheless, since no polynomial-
time algorithm for any NP-complete problem has yet been discovered, a
proof that a problem is NP-complete provides excellent evidence that it
is intractable.
Circuit satisfiability
We have defined the notion of an NP-complete problem, but up to this
point, we have not actually proved that any problem is NP-complete.
Once we prove that at least one problem is NP-complete, polynomial-
time reducibility becomes a tool to prove other problems to be NP-
complete. Thus, we now focus on demonstrating the existence of an NP-
complete problem: the circuit-satisfiability problem.
Unfortunately, the formal proof that the circuit-satisfiability problem
is NP-complete requires technical detail beyond the scope of this text.
Instead, we’ll informally describe a proof that relies on a basic
understanding of boolean combinational circuits.
Figure 34.7 Three basic logic gates, with binary inputs and outputs. Under each gate is the truth table that describes the gate’s operation. (a) The NOT gate. (b) The AND gate. (c) The OR gate.
Boolean combinational circuits are built from boolean
combinational elements that are interconnected by wires. A boolean
combinational element is any circuit element that has a constant number
of boolean inputs and outputs and that performs a well-defined function. Boolean values are drawn from the set {0, 1}, where 0
represents FALSE and 1 represents TRUE.
The boolean combinational elements appearing in the circuit-
satisfiability problem compute simple boolean functions, and they are
known as logic gates. Figure 34.7 shows the three basic logic gates used in the circuit-satisfiability problem: the NOT gate (or inverter), the AND
gate, and the OR gate. The NOT gate takes a single binary input x, whose value is either 0 or 1, and produces a binary output z whose value is opposite that of the input value. Each of the other two gates takes two
binary inputs x and y and produces a single binary output z.
The operation of each gate, or of any boolean combinational
element, is defined by a truth table, shown under each gate in Figure
34.7. A truth table gives the outputs of the combinational element for
each possible setting of the inputs. For example, the truth table for the
OR gate says that when the inputs are x = 0 and y = 1, the output value is z = 1. The symbol ¬ denotes the NOT function, ∧ denotes the AND
function, and ∨ denotes the OR function. Thus, for example, 0 ∨ 1 = 1.
AND and OR gates are not limited to just two inputs. An AND
gate’s output is 1 if all of its inputs are 1, and its output is 0 otherwise.
An OR gate’s output is 1 if any of its inputs are 1, and its output is 0
otherwise.
A boolean combinational circuit consists of one or more boolean
combinational elements interconnected by wires. A wire can connect the
output of one element to the input of another, so that the output value
of the first element becomes an input value of the second. Figure 34.8
shows two similar boolean combinational circuits, differing in only one
gate. Part (a) of the figure also shows the values on the individual wires,
given the input 〈 x 1 = 1, x 2 = 1, x 3 = 0〉. Although a single wire may have no more than one combinational-element output connected to it, it
can feed several element inputs. The number of element inputs fed by a
wire is called the fan-out of the wire. If no element output is connected
to a wire, the wire is a circuit input, accepting input values from an external source. If no element input is connected to a wire, the wire is a
circuit output, providing the results of the circuit’s computation to the
outside world. (An internal wire can also fan out to a circuit output.)
For the purpose of defining the circuit-satisfiability problem, we limit
the number of circuit outputs to 1, though in actual hardware design, a
boolean combinational circuit may have multiple outputs.
Figure 34.8 Two instances of the circuit-satisfiability problem. (a) The assignment 〈 x 1 = 1, x 2 =
1, x 3 = 0〉 to the inputs of this circuit causes the output of the circuit to be 1. The circuit is therefore satisfiable. (b) No assignment to the inputs of this circuit can cause the output of the circuit to be 1. The circuit is therefore unsatisfiable.
Boolean combinational circuits contain no cycles. In other words, for
a given combinational circuit, imagine a directed graph G = ( V, E) with one vertex for each combinational element and with k directed edges for
each wire whose fan-out is k, where the graph contains a directed edge
( u, v) if a wire connects the output of element u to an input of element v.
Then G must be acyclic.
A truth assignment for a boolean combinational circuit is a set of
boolean input values. We say that a 1-output boolean combinational
circuit is satisfiable if it has a satisfying assignment: a truth assignment that causes the output of the circuit to be 1. For example, the circuit in
Figure 34.8(a) has the satisfying assignment 〈 x 1 = 1, x 2 = 1, x 3 = 0〉, and so it is satisfiable. As Exercise 34.3-1 asks you to show, no
assignment of values to x 1, x 2, and x 3 causes the circuit in Figure
34.8(b) to produce a 1 output. Since it always produces 0, it is
unsatisfiable.
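A boolean combinational circuit can be evaluated directly once its gates are listed in a topological order, which always exists because the circuit is acyclic. The wire-name encoding below is an assumption for illustration, and the example circuit is likewise made up; it is not the circuit of Figure 34.8.

```python
def evaluate_circuit(inputs, gates, output_wire):
    """Evaluate a 1-output boolean combinational circuit.  inputs maps
    circuit-input wires to 0/1; gates is a list of (out, op, in_wires)
    triples in topological order, with op one of 'NOT', 'AND', 'OR'."""
    value = dict(inputs)
    for out, op, ins in gates:
        vals = [value[w] for w in ins]
        if op == 'NOT':
            value[out] = 1 - vals[0]
        elif op == 'AND':
            value[out] = int(all(vals))  # 1 iff every input is 1
        else:
            value[out] = int(any(vals))  # 1 iff at least one input is 1
    return value[output_wire]

# A small made-up circuit computing z = (x1 AND x2) OR (NOT x3).
GATES = [('a', 'AND', ['x1', 'x2']),
         ('b', 'NOT', ['x3']),
         ('z', 'OR', ['a', 'b'])]
```

The assignment 〈x1 = 1, x2 = 1, x3 = 0〉 makes this toy circuit output 1, so it is satisfiable.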
The circuit-satisfiability problem is, “Given a boolean combinational
circuit composed of AND, OR, and NOT gates, is it satisfiable?” In
order to pose this question formally, however, we must agree on a standard encoding for circuits. The size of a boolean combinational
circuit is the number of boolean combinational elements plus the
number of wires in the circuit. We could devise a graph-like encoding
that maps any given circuit C into a binary string 〈 C〉 whose length is
polynomial in the size of the circuit itself. As a formal language, we can
therefore define
CIRCUIT-SAT = {〈 C〉 : C is a satisfiable boolean combinational
circuit}.
The circuit-satisfiability problem arises in the area of computer-aided
hardware optimization. If a subcircuit always produces 0, that subcircuit
is unnecessary: the designer can replace it by a simpler subcircuit that
omits all logic gates and provides the constant 0 value as its output. You
can see the value in having a polynomial-time algorithm for this
problem.
Given a circuit C, you can determine whether it is satisfiable by
simply checking all possible assignments to the inputs. Unfortunately, if
the circuit has k inputs, then you would have to check up to 2^k possible assignments. When the size of C is polynomial in k, checking all possible assignments to the inputs takes Ω(2^k) time, which is
superpolynomial in the size of the circuit.10 In fact, as we have claimed, there is strong evidence that no polynomial-time algorithm exists that
solves the circuit-satisfiability problem because circuit satisfiability is
NP-complete. We break the proof of this fact into two parts, based on
the two parts of the definition of NP-completeness.
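The exhaustive check just described is easy to state in code. Here the circuit is abstracted as any callable from a k-tuple of bits to 0 or 1, an assumption that sidesteps the encoding question:

```python
from itertools import product

def brute_force_circuit_sat(circuit, k):
    """Decide satisfiability by trying all 2^k assignments to the k
    inputs.  This takes Omega(2^k) circuit evaluations, which is
    superpolynomial in the circuit size when that size is polynomial
    in k."""
    return any(circuit(x) == 1 for x in product((0, 1), repeat=k))
```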
Lemma 34.5
The circuit-satisfiability problem belongs to the class NP.
Proof We provide a two-input, polynomial-time algorithm A that can
verify CIRCUIT-SAT. One of the inputs to A is (a standard encoding
of) a boolean combinational circuit C. The other input is a certificate
corresponding to an assignment of a boolean value to each of the wires
in C. (See Exercise 34.3-4 for a smaller certificate.)
The algorithm A works as follows. For each logic gate in the circuit, it checks that the value provided by the certificate on the output wire is
correctly computed as a function of the values on the input wires. Then,
if the output of the entire circuit is 1, algorithm A outputs 1, since the
values assigned to the inputs of C provide a satisfying assignment.
Otherwise, A outputs 0.
Whenever a satisfiable circuit C is input to algorithm A, there exists a certificate whose length is polynomial in the size of C and that causes A
to output a 1. Whenever an unsatisfiable circuit is input, no certificate
can fool A into believing that the circuit is satisfiable. Algorithm A runs in polynomial time, and with a good implementation, linear time
suffices. Thus, CIRCUIT-SAT is verifiable in polynomial time, and
CIRCUIT-SAT ∈ NP.
▪
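The verification algorithm A of this proof can be sketched as follows, using an assumed (out, op, in_wires) list encoding of the circuit; the certificate assigns a 0/1 value to every wire:

```python
def verify_wire_assignment(gates, output_wire, certificate):
    """Check a certificate for CIRCUIT-SAT as in Lemma 34.5: every
    gate's output value must be consistent with its input values, and
    the value on the circuit's output wire must be 1.  Runs in time
    linear in the number of gates and wires."""
    ops = {'NOT': lambda v: 1 - v[0],
           'AND': lambda v: int(all(v)),
           'OR':  lambda v: int(any(v))}
    for out, op, ins in gates:
        if certificate[out] != ops[op]([certificate[w] for w in ins]):
            return 0                    # certificate is inconsistent
    return 1 if certificate[output_wire] == 1 else 0
```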
The second part of proving that CIRCUIT-SAT is NP-complete is to
show that the language is NP-hard: that every language in NP is
polynomial-time reducible to CIRCUIT-SAT. The actual proof of this
fact is full of technical intricacies, and so instead we’ll sketch the proof
based on some understanding of the workings of computer hardware.
A computer program is stored in the computer’s memory as a
sequence of instructions. A typical instruction encodes an operation to
be performed, addresses of operands in memory, and an address where
the result is to be stored. A special memory location, called the program
counter, keeps track of which instruction is to be executed next. The program counter automatically increments when each instruction is
fetched, thereby causing the computer to execute instructions
sequentially. Certain instructions can cause a value to be written to the
program counter, however, which alters the normal sequential execution
and allows the computer to loop and perform conditional branches.
At any point while a program executes, the computer’s memory
holds the entire state of the computation. (Consider the memory to
include the program itself, the program counter, working storage, and
any of the various bits of state that a computer maintains for
bookkeeping.) We call any particular state of computer memory a
configuration. When an instruction executes, it transforms the
configuration. Think of an instruction as mapping one configuration to another. The computer hardware that accomplishes this mapping can be
implemented as a boolean combinational circuit, which we denote by M
in the proof of the following lemma.
Lemma 34.6
The circuit-satisfiability problem is NP-hard.
Proof Let L be any language in NP. We’ll describe a polynomial-time
algorithm F computing a reduction function f that maps every binary
string x to a circuit C = f ( x) such that x ∈ L if and only if C ∈
CIRCUIT-SAT.
Since L ∈ NP, there must exist an algorithm A that verifies L in polynomial time. The algorithm F that we construct uses the two-input
algorithm A to compute the reduction function f.
Let T ( n) denote the worst-case running time of algorithm A on length- n input strings, and let k ≥ 1 be a constant such that T ( n) = O( n^k) and the length of the certificate is O( n^k). (The running time of A is actually a polynomial in the total input size, which includes both an
input string and a certificate, but since the length of the certificate is
polynomial in the length n of the input string, the running time is polynomial in n.)
Figure 34.9 The sequence of configurations produced by an algorithm A running on an input x and certificate y. Each configuration represents the state of the computer for one step of the computation and, besides A, x, and y, includes the program counter (PC), auxiliary machine state, and working storage. Except for the certificate y, the initial configuration c 0 is constant. A boolean combinational circuit M maps each configuration to the next configuration. The output is a distinguished bit in the working storage.
The basic idea of the proof is to represent the computation of A as a
sequence of configurations. As Figure 34.9 illustrates, consider each configuration as comprising a few parts: the program for A, the
program counter and auxiliary machine state, the input x, the certificate
y, and working storage. The combinational circuit M, which implements
the computer hardware, maps each configuration ci to the next configuration ci+1, starting from the initial configuration c 0. Algorithm A writes its output—0 or 1—to some designated location by the time it
finishes executing. After A halts, the output value never changes. Thus,
if the algorithm runs for at most T ( n) steps, the output appears as one of the bits in cT( n).
The reduction algorithm F constructs a single combinational circuit
that computes all configurations produced by a given initial
configuration. The idea is to paste together T ( n) copies of the circuit M.
The output of the i th circuit, which produces configuration ci, feeds directly into the input of the ( i +1)st circuit. Thus, the configurations,
rather than being stored in the computer’s memory, simply reside as
values on the wires connecting copies of M.
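As an abstraction of this pasting-together step, the following sketch models M as an ordinary function on configurations rather than as a circuit (a simplification: in the actual construction the T(n) copies are wired together, and the intermediate configurations live on the connecting wires):

```python
def unroll(M, c0, steps):
    """Apply the next-configuration map M repeatedly, starting from the
    initial configuration c0, and record every configuration produced,
    mirroring how configuration c_i feeds copy i+1 of the circuit."""
    trace = [c0]
    c = c0
    for _ in range(steps):
        c = M(c)
        trace.append(c)
    return trace
```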
Recall what the polynomial-time reduction algorithm F must do.
Given an input x, it must compute a circuit C = f ( x) that is satisfiable if and only if there exists a certificate y such that A( x, y) = 1. When F
obtains an input x, it first computes n = | x| and constructs a combinational circuit C′ consisting of T ( n) copies of M. The input to C′
is an initial configuration corresponding to a computation on A( x, y), and the output is the configuration cT( n).
Algorithm F modifies circuit C′ slightly to construct the circuit C = f ( x). First, it wires the inputs to C′ corresponding to the program for A, the initial program counter, the input x, and the initial state of memory
directly to these known values. Thus, the only remaining inputs to the
circuit correspond to the certificate y. Second, it ignores all outputs from C′, except for the one bit of cT( n) corresponding to the output of A. This circuit C, so constructed, computes C( y) = A( x, y) for any input y of length O( nk). The reduction algorithm F, when provided an input string x, computes such a circuit C and outputs it.
We need to prove two properties. First, we must show that F
correctly computes a reduction function f. That is, we must show that C
is satisfiable if and only if there exists a certificate y such that A( x, y) =
1. Second, we must show that F runs in polynomial time.
To show that F correctly computes a reduction function, suppose that there exists a certificate y of length O( n^k) such that A( x, y) = 1.
Then, upon applying the bits of y to the inputs of C, the output of C is C( y) = A( x, y) = 1. Thus, if a certificate exists, then C is satisfiable. For the other direction, suppose that C is satisfiable. Hence, there exists an
input y to C such that C( y) = 1, from which we conclude that A( x, y) =
1. Thus, F correctly computes a reduction function.
To complete the proof sketch, we need to show that F runs in time
polynomial in n = | x|. First, the number of bits required to represent a configuration is polynomial in n. Why? The program for A itself has constant size, independent of the length of its input x. The length of the
input x is n, and the length of the certificate y is O( n^k). Since the algorithm runs for at most O( n^k) steps, the amount of working storage
required by A is polynomial in n as well. (We implicitly assume that this memory is contiguous. Exercise 34.3-5 asks you to extend the argument
to the situation in which the locations accessed by A are scattered across
a much larger region of memory and the particular pattern of scattering
can differ for each input x.)
The combinational circuit M implementing the computer hardware
has size polynomial in the length of a configuration, which is O(n^k), and hence, the size of M is polynomial in n. (Most of this circuitry implements the logic of the memory system.) The circuit C consists of
O(n^k) copies of M, and hence it has size polynomial in n. The reduction algorithm F can construct C from x in polynomial time, since each step of the construction takes polynomial time.
▪
The language CIRCUIT-SAT is therefore at least as hard as any
language in NP, and since it belongs to NP, it is NP-complete.
Theorem 34.7
The circuit-satisfiability problem is NP-complete.
Proof Immediate from Lemmas 34.5 and 34.6 and from the definition
of NP-completeness.
Exercises
34.3-1
Verify that the circuit in Figure 34.8(b) is unsatisfiable.
34.3-2
Show that the ≤P relation is a transitive relation on languages. That is,
show that if L 1 ≤P L 2 and L 2 ≤P L 3, then L 1 ≤P L 3.
34.3-3
Prove that L ≤P L̄ if and only if L̄ ≤P L, where L̄ denotes the complement of L.
34.3-4
Show that an alternative proof of Lemma 34.5 can use a satisfying
assignment as a certificate. Which certificate makes for an easier proof?
34.3-5
The proof of Lemma 34.6 assumes that the working storage for
algorithm A occupies a contiguous region of polynomial size. Where
does the proof exploit this assumption? Argue that this assumption does
not involve any loss of generality.
34.3-6
A language L is complete for a language class C with respect to polynomial-time reductions if L ∈ C and L′ ≤P L for all L′ ∈ C. Show that Ø and {0, 1}* are the only languages in P that are not complete for
P with respect to polynomial-time reductions.
34.3-7
Show that, with respect to polynomial-time reductions (see Exercise
34.3-6), L is complete for NP if and only if L̄ is complete for co-NP.
34.3-8
The reduction algorithm F in the proof of Lemma 34.6 constructs the
circuit C = f ( x) based on knowledge of x, A, and k. Professor Sartre
observes that the string x is input to F, but only the existence of A, k, and the constant factor implicit in the O(n^k) running time is known to F
(since the language L belongs to NP), not their actual values. Thus, the
professor concludes that F cannot possibly construct the circuit C and
that the language CIRCUIT-SAT is not necessarily NP-hard. Explain
the flaw in the professor’s reasoning.
34.4 NP-completeness proofs
The proof that the circuit-satisfiability problem is NP-complete showed
directly that L ≤P CIRCUIT-SAT for every language L ∈ NP. This section shows how to prove that languages are NP-complete without
directly reducing every language in NP to the given language. We’ll
explore examples of this methodology by proving that various formula-
satisfiability problems are NP-complete. Section 34.5 provides many more examples.
The following lemma provides a foundation for showing that a given
language is NP-complete.
Lemma 34.8
If L is a language such that L′ ≤P L for some L′ ∈ NPC, then L is NP-hard. If, in addition, we have L ∈ NP, then L ∈ NPC.
Proof Since L′ is NP-complete, for all L″ ∈ NP, we have L″ ≤P L′. By supposition, we have L′ ≤P L, and thus by transitivity (Exercise 34.3-2), we have L″ ≤P L, which shows that L is NP-hard. If L ∈ NP, we also have L ∈ NPC.
▪
In other words, by reducing a known NP-complete language L′ to L,
we implicitly reduce every language in NP to L. Thus, Lemma 34.8
provides a method for proving that a language L is NP-complete:
1. Prove L ∈ NP.
2. Prove that L is NP-hard:
a. Select a known NP-complete language L′.
b. Describe an algorithm that computes a function f mapping
every instance x ∈ {0, 1}* of L′ to an instance f ( x) of L.
c. Prove that the function f satisfies x ∈ L′ if and only if f ( x) ∈
L for all x ∈ {0, 1}*.
d. Prove that the algorithm computing f runs in polynomial time.
This methodology of reducing from a single known NP-complete
language is far simpler than the more complicated process of showing
directly how to reduce from every language in NP. Proving CIRCUIT-
SAT ∈ NPC furnishes a starting point. Knowing that the circuit-
satisfiability problem is NP-complete makes it much easier to prove that
other problems are NP-complete. Moreover, as the catalog of known
NP-complete problems grows, so will the choices for languages from
which to reduce.
Formula satisfiability
To illustrate the reduction methodology, let’s see an NP-completeness
proof for the problem of determining whether a boolean formula, not a
circuit, is satisfiable. This problem has the historical honor of being the
first problem ever shown to be NP-complete.
We formulate the (formula) satisfiability problem in terms of the
language SAT as follows. An instance of SAT is a boolean formula ϕ
composed of
1. n boolean variables: x 1, x 2, … , xn;
2. m boolean connectives: any boolean function with one or two
inputs and one output, such as ∧ (AND), ∨ (OR), ¬ (NOT), →
(implication), ↔ (if and only if); and
3. parentheses. (Without loss of generality, assume that there are no
redundant parentheses, i.e., a formula contains at most one pair
of parentheses per boolean connective.)
We can encode a boolean formula ϕ in a length that is polynomial in n
+ m. As in boolean combinational circuits, a truth assignment for a boolean formula ϕ is a set of values for the variables of ϕ, and a satisfying assignment is a truth assignment that causes it to evaluate to
1. A formula with a satisfying assignment is a satisfiable formula. The
satisfiability problem asks whether a given boolean formula is
satisfiable, which we can express in formal-language terms as
SAT = {〈 ϕ〉 : ϕ is a satisfiable boolean formula}.
As an example, the formula
ϕ = (( x 1 → x 2) ∨ ¬((¬ x 1 ↔ x 3) ∨ x 4)) ∧ ¬ x 2
has the satisfying assignment 〈x1 = 0, x2 = 0, x3 = 1, x4 = 1〉, since
ϕ = ((0 → 0) ∨ ¬((¬0 ↔ 1) ∨ 1)) ∧ ¬0
  = (1 ∨ ¬(1 ∨ 1)) ∧ 1
  = (1 ∨ 0) ∧ 1
  = 1,    (34.2)
and thus this formula ϕ belongs to SAT.
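To see this mechanically, the following sketch (Python; the function name and encoding are ours, not from the text) evaluates ϕ at the given assignment:

```python
def phi(x1, x2, x3, x4):
    # phi = ((x1 -> x2) OR NOT((NOT x1 <-> x3) OR x4)) AND NOT x2
    implies = (not x1) or x2       # x1 -> x2
    iff = ((not x1) == x3)         # NOT x1 <-> x3
    return (implies or not (iff or x4)) and (not x2)

print(phi(False, False, True, True))  # prints True: <x1=0, x2=0, x3=1, x4=1>
```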
The naive algorithm to determine whether an arbitrary boolean
formula is satisfiable does not run in polynomial time. A formula with n
variables has 2^n possible assignments. If the length of 〈ϕ〉 is polynomial in n, then checking every assignment requires Ω(2^n) time, which is superpolynomial in the length of 〈ϕ〉. As the following theorem shows, a
polynomial-time algorithm is unlikely to exist.
Theorem 34.9
Satisfiability of boolean formulas is NP-complete.
Proof We start by arguing that SAT ∈ NP. Then we prove that SAT is
NP-hard by showing that CIRCUIT-SAT ≤P SAT, which by Lemma
34.8 will prove the theorem.
To show that SAT belongs to NP, we show that a certificate
consisting of a satisfying assignment for an input formula ϕ can be
verified in polynomial time. The verifying algorithm simply replaces
each variable in the formula with its corresponding value and then
evaluates the expression, much as we did in equation (34.2) above. This
task can be done in polynomial time. If the expression evaluates to 1,
then the algorithm has verified that the formula is satisfiable. Thus, SAT
belongs to NP.
To prove that SAT is NP-hard, we show that CIRCUIT-SAT ≤P
SAT. In other words, we need to show how to reduce any instance of
circuit satisfiability to an instance of formula satisfiability in polynomial
time. We can use induction to express any boolean combinational circuit
as a boolean formula. We simply look at the gate that produces the
circuit output and inductively express each of the gate’s inputs as
formulas. We then obtain the formula for the circuit by writing an
expression that applies the gate’s function to its inputs’ formulas.
Figure 34.10 Reducing circuit satisfiability to formula satisfiability. The formula produced by the reduction algorithm has a variable for each wire in the circuit and a clause for each logic gate.
Unfortunately, this straightforward method does not amount to a
polynomial-time reduction. As Exercise 34.4-1 asks you to show, shared
subformulas—which arise from gates whose output wires have fan-out
of 2 or more—can cause the size of the generated formula to grow
exponentially. Thus, the reduction algorithm must be somewhat more
clever.
Figure 34.10 illustrates how to overcome this problem, using as an example the circuit from Figure 34.8(a). For each wire xi in the circuit C, the formula ϕ has a variable xi. To express how each gate operates,
construct a small formula involving the variables of its incident wires.
The formula has the form of an “if and only if” (↔), with the variable
for the gate’s output on the left and on the right a logical expression
encapsulating the gate’s function on its inputs. For example, the
operation of the output AND gate (the rightmost gate in the figure) is
x 10 ↔ ( x 7 ∧ x 8 ∧ x 9). We call each of these small formulas a clause.
The formula ϕ produced by the reduction algorithm is the AND of
the circuit-output variable with the conjunction of clauses describing
the operation of each gate. For the circuit in the figure, the formula is
ϕ = x10 ∧ ( x 4 ↔ ¬ x 3)
∧ ( x 5 ↔ ( x 1 ∨ x 2))
∧ ( x 6 ↔ ¬ x 4)
∧ ( x 7 ↔ ( x 1 ∧ x 2 ∧ x 4))
∧ ( x 8 ↔ ( x 5 ∨ x 6))
∧ ( x 9 ↔ ( x 6 ∨ x 7))
∧ ( x 10 ↔ ( x 7 ∧ x 8 ∧ x 9)).
Given a circuit C, it is straightforward to produce such a formula ϕ in polynomial time.
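As a sanity check on this construction, the following sketch (Python; ours, not part of the text) brute-forces the formula ϕ for the example circuit over all 2^10 wire assignments, transcribing each clause above as an equality test:

```python
from itertools import product

def phi(v):
    # v[1..10] hold the wire values x1..x10; v[0] is unused padding.
    return (v[10]
            and v[4] == (not v[3])
            and v[5] == (v[1] or v[2])
            and v[6] == (not v[4])
            and v[7] == (v[1] and v[2] and v[4])
            and v[8] == (v[5] or v[6])
            and v[9] == (v[6] or v[7])
            and v[10] == (v[7] and v[8] and v[9]))

sat = [v for v in product([False, True], repeat=10) if phi((None,) + v)]
print(len(sat))  # 1: the lone satisfying assignment has x1 = x2 = 1, x3 = 0
```

Because the clauses force every internal wire once x1, x2, x3 are chosen, exactly one total assignment survives.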
Why is the circuit C satisfiable exactly when the formula ϕ is satisfiable? If C has a satisfying assignment, then each wire of the circuit
has a well-defined value, and the output of the circuit is 1. Therefore,
when wire values are assigned to variables in ϕ, each clause of ϕ
evaluates to 1, and thus the conjunction of all evaluates to 1.
Conversely, if some assignment causes ϕ to evaluate to 1, the circuit C is satisfiable by an analogous argument. Thus, we have shown that
CIRCUIT-SAT ≤P SAT, which completes the proof.
▪
3-CNF satisfiability
Reducing from formula satisfiability gives us an avenue to prove many problems NP-complete. The reduction algorithm must handle any input
formula, though, and this requirement can lead to a huge number of
cases to consider. Instead, it is usually simpler to reduce from a
restricted language of boolean formulas. Of course, the restricted
language must not be polynomial-time solvable. One convenient
language is 3-CNF satisfiability, or 3-CNF-SAT.
In order to define 3-CNF satisfiability, we first need to define a few
terms. A literal in a boolean formula is an occurrence of a variable (such
as x 1) or its negation (¬ x 1). A clause is the OR of one or more literals, such as x 1 ∨ ¬ x 2 ∨ ¬ x 3. A boolean formula is in conjunctive normal form, or CNF, if it is expressed as an AND of clauses, and it’s in 3-conjunctive normal form, or 3-CNF, if each clause contains exactly three distinct literals.
For example, the boolean formula
(x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)
is in 3-CNF. The first of its three clauses is (x1 ∨ ¬x1 ∨ ¬x2), which contains the three literals x1, ¬x1, and ¬x2.
The language 3-CNF-SAT consists of encodings of boolean
formulas in 3-CNF that are satisfiable. The following theorem shows
that a polynomial-time algorithm that can determine the satisfiability of
boolean formulas is unlikely to exist, even when they are expressed in
this simple normal form.
Theorem 34.10
Satisfiability of boolean formulas in 3-conjunctive normal form is NP-
complete.
Proof The argument from the proof of Theorem 34.9 to show that SAT
∈ NP applies equally well here to show that 3-CNF-SAT ∈ NP. By
Lemma 34.8, therefore, we need only show that SAT ≤P 3-CNF-SAT.

Figure 34.11 The tree corresponding to the formula ϕ = (( x 1 → x 2)∨¬((¬ x 1 ↔ x 3)∨ x 4))∧¬ x 2.
We break the reduction algorithm into three basic steps. Each step
progressively transforms the input formula ϕ closer to the desired 3-
conjunctive normal form.
The first step is similar to the one used to prove CIRCUIT-SAT ≤P
SAT in Theorem 34.9. First, construct a binary “parse” tree for the
input formula ϕ, with literals as leaves and connectives as internal
nodes. Figure 34.11 shows such a parse tree for the formula
ϕ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2.    (34.3)
If the input formula contains a clause such as the OR of several literals,
use associativity to parenthesize the expression fully so that every
internal node in the resulting tree has just one or two children. The
binary parse tree is like a circuit for computing the function.
Mimicking the reduction in the proof of Theorem 34.9, introduce a
variable yi for the output of each internal node. Then rewrite the
original formula ϕ as the AND of the variable at the root of the parse
tree and a conjunction of clauses describing the operation of each node.
For the formula (34.3), the resulting expression is
ϕ′ = y1
∧ (y1 ↔ (y2 ∧ ¬x2))
∧ (y2 ↔ (y3 ∨ y4))
∧ (y3 ↔ (x1 → x2))
∧ (y4 ↔ ¬y5)
∧ (y5 ↔ (y6 ∨ x4))
∧ (y6 ↔ (¬x1 ↔ x3)).
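Because ϕ′ introduces only six new variables y1, …, y6, its equisatisfiability with ϕ can be confirmed by brute force. A short sketch (Python; ours, not from the text), with ϕ and the clauses of ϕ′ transcribed as predicates:

```python
from itertools import product

def phi(x1, x2, x3, x4):
    # phi = ((x1 -> x2) OR NOT((NOT x1 <-> x3) OR x4)) AND NOT x2
    return (((not x1) or x2) or not (((not x1) == x3) or x4)) and (not x2)

def phi_prime(x1, x2, x3, x4, y1, y2, y3, y4, y5, y6):
    return (y1
            and y1 == (y2 and not x2)
            and y2 == (y3 or y4)
            and y3 == ((not x1) or x2)
            and y4 == (not y5)
            and y5 == (y6 or x4)
            and y6 == ((not x1) == x3))

B = [False, True]
for xs in product(B, repeat=4):
    # phi is satisfied by xs iff some choice of the y's satisfies phi'
    assert phi(*xs) == any(phi_prime(*xs, *ys) for ys in product(B, repeat=6))
print("phi and phi' agree on every x-assignment")
```

The check passes because each yi is functionally determined by the x's, so "there exists a y-assignment" reduces to evaluating the parse tree.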
Figure 34.12 The truth table for the clause ( y 1 ↔ ( y 2 ∧ ¬ x 2)).
The formula ϕ′ thus obtained is a conjunction of clauses ϕ′_i, each of
which has at most three literals. These clauses are not yet ORs of three
literals.
The second step of the reduction converts each clause ϕ′_i into
conjunctive normal form. Construct a truth table for ϕ′_i by evaluating all
possible assignments to its variables. Each row of the truth table consists
of a possible assignment of the variables of the clause, together with the
value of the clause under that assignment. Using the truth-table entries
that evaluate to 0, build a formula in disjunctive normal form (or DNF),
an OR of ANDs, that is equivalent to ¬ϕ′_i. Then negate this formula
and convert it into a CNF formula ϕ″_i by using DeMorgan's laws for
propositional logic,
¬(a ∧ b) = ¬a ∨ ¬b,
¬(a ∨ b) = ¬a ∧ ¬b,
to complement all literals, change ORs into ANDs, and change ANDs
into ORs.
In our example, the clause ϕ′_1 = (y1 ↔ (y2 ∧ ¬x2)) converts into the CNF formula ϕ″_1
as follows. The truth table for ϕ′_1 appears in Figure 34.12. The DNF
formula equivalent to ¬ϕ′_1 is
(y1 ∧ y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ ¬x2) ∨ (¬y1 ∧ y2 ∧ ¬x2).
Negating and applying DeMorgan's laws yields the CNF formula
ϕ″_1 = (¬y1 ∨ ¬y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ x2) ∧ (y1 ∨ ¬y2 ∨ x2),
which is equivalent to the original clause ϕ′_1.
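The truth-table step is mechanical enough to automate for any small clause. A sketch of the conversion (Python; the representation is ours: a CNF clause is a list of (variable_index, negated?) pairs, one clause per 0-row of the truth table):

```python
from itertools import product

def cnf_from_truth_table(f, nvars):
    """Return a CNF equivalent of f: one clause per row where f is 0."""
    cnf = []
    for row in product([False, True], repeat=nvars):
        if not f(*row):
            # The DNF term for this 0-row of NOT f is the AND of literals
            # matching the row; DeMorgan negates each literal and turns
            # the AND into an OR, giving one CNF clause.
            cnf.append([(i, row[i]) for i in range(nvars)])
    return cnf

def eval_cnf(cnf, row):
    # A literal (i, neg) is true when row[i] differs from its negation flag.
    return all(any(row[i] != neg for (i, neg) in clause) for clause in cnf)

clause1 = lambda y1, y2, x2: y1 == (y2 and not x2)
cnf = cnf_from_truth_table(clause1, 3)
print(len(cnf))  # 4 clauses, one per 0-row of the truth table
```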
At this point, each clause of the formula ϕ′ has been converted
into a CNF formula , and thus ϕ′ is equivalent to the CNF formula
ϕ″ consisting of the conjunction of the . Moreover, each clause of ϕ″
has at most three literals.
The third and final step of the reduction further transforms the
formula so that each clause has exactly three distinct literals. From the
clauses of the CNF formula ϕ″, construct the final 3-CNF formula ϕ‴.
This formula also uses two auxiliary variables, p and q. For each clause Ci of ϕ″, include the following clauses in ϕ‴:
• If Ci contains three distinct literals, then simply include Ci as a clause of ϕ‴.
• If Ci contains exactly two distinct literals, that is, if Ci = (l1 ∨ l2), where l1 and l2 are literals, then include (l1 ∨ l2 ∨ p) ∧ (l1 ∨ l2 ∨ ¬p) as clauses of ϕ‴. The literals p and ¬p merely fulfill the syntactic requirement that each clause of ϕ‴ contain exactly three distinct literals. Whether p = 0 or p = 1, one of the clauses is equivalent to l1 ∨ l2, and the other evaluates to 1, which is the identity for AND.
• If Ci contains just one distinct literal l, then include (l ∨ p ∨ q) ∧ (l ∨ p ∨ ¬q) ∧ (l ∨ ¬p ∨ q) ∧ (l ∨ ¬p ∨ ¬q) as clauses of ϕ‴. Regardless of the values of p and q, one of the four clauses is equivalent to l, and the other three evaluate to 1.
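The padding rule above can be sketched directly (Python; the string literal encoding with "!p" for ¬p is ours):

```python
def pad_to_three(clause):
    # clause: tuple of 1-3 distinct literals, e.g. ("x1", "!x2").
    lits = list(clause)
    if len(lits) == 3:
        return [tuple(lits)]
    if len(lits) == 2:
        # Pad with p and NOT p; one of the two clauses equals the original.
        return [tuple(lits + ["p"]), tuple(lits + ["!p"])]
    # One literal: pad with every combination of p, q and their negations.
    return [tuple(lits + [a, b]) for a in ("p", "!p") for b in ("q", "!q")]

print(pad_to_three(("x1", "!x2")))
```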
We can see that the 3-CNF formula ϕ‴ is satisfiable if and only if ϕ is satisfiable by inspecting each of the three steps. Like the reduction from
CIRCUIT-SAT to SAT, the construction of ϕ′ from ϕ in the first step
preserves satisfiability. The second step produces a CNF formula ϕ″
that is algebraically equivalent to ϕ′. Then the third step produces a 3-
CNF formula ϕ‴ that is effectively equivalent to ϕ″, since any assignment to the variables p and q produces a formula that is algebraically equivalent to ϕ″.
We must also show that the reduction can be computed in
polynomial time. Constructing ϕ′ from ϕ introduces at most one variable and one clause per connective in ϕ. Constructing ϕ″ from ϕ′
can introduce at most eight clauses into ϕ″ for each clause from ϕ′, since each clause of ϕ′ contains at most three variables, and the truth
table for each clause has at most 2^3 = 8 rows. The construction of ϕ‴
from ϕ″ introduces at most four clauses into ϕ‴ for each clause of ϕ″.
Thus the size of the resulting formula ϕ‴ is polynomial in the length of
the original formula. Each of the constructions can be accomplished in
polynomial time.
▪
Exercises
34.4-1
Consider the straightforward (nonpolynomial-time) reduction in the
proof of Theorem 34.9. Describe a circuit of size n that, when converted
to a formula by this method, yields a formula whose size is exponential
in n.
34.4-2
Show the 3-CNF formula that results upon using the method of Theorem 34.10 on the formula (34.3).
34.4-3
Professor Jagger proposes to show that SAT ≤P 3-CNF-SAT by using
only the truth-table technique in the proof of Theorem 34.10, and not
the other steps. That is, the professor proposes to take the boolean
formula ϕ, form a truth table for its variables, derive from the truth table a formula in 3-DNF that is equivalent to ¬ ϕ, and then negate and
apply DeMorgan’s laws to produce a 3-CNF formula equivalent to ϕ.
Show that this strategy does not yield a polynomial-time reduction.
34.4-4
Show that the problem of determining whether a boolean formula is a
tautology is complete for co-NP. ( Hint: See Exercise 34.3-7.)
34.4-5
Show that the problem of determining the satisfiability of boolean
formulas in disjunctive normal form is polynomial-time solvable.
34.4-6
Someone gives you a polynomial-time algorithm to decide formula
satisfiability. Describe how to use this algorithm to find satisfying
assignments in polynomial time.
34.4-7
Let 2-CNF-SAT be the set of satisfiable boolean formulas in CNF with
exactly two literals per clause. Show that 2-CNF-SAT ∈ P. Make your
algorithm as efficient as possible. ( Hint: Observe that x ∨ y is equivalent to ¬ x → y. Reduce 2-CNF-SAT to an efficiently solvable problem on a
directed graph.)
34.5 NP-complete problems
NP-complete problems arise in diverse domains: boolean logic, graphs,
arithmetic, network design, sets and partitions, storage and retrieval,
sequencing and scheduling, mathematical programming, algebra and
number theory, games and puzzles, automata and language theory,
program optimization, biology, chemistry, physics, and more. This
section uses the reduction methodology to provide NP-completeness
proofs for a variety of problems drawn from graph theory and set
partitioning.
Figure 34.13 The structure of NP-completeness proofs in Sections 34.4 and 34.5. All proofs ultimately follow by reduction from the NP-completeness of CIRCUIT-SAT.
Figure 34.13 outlines the structure of the NP-completeness proofs in this section and Section 34.4. We prove each language in the figure to be NP-complete by reduction from the language that points to it. At the
root is CIRCUIT-SAT, which we proved NP-complete in Theorem 34.7.
This section concludes with a recap of reduction strategies.
34.5.1 The clique problem
A clique in an undirected graph G = ( V, E) is a subset V′ ⊆ V of vertices, each pair of which is connected by an edge in E. In other words, a clique is a complete subgraph of G. The size of a clique is the number of vertices it contains. The clique problem is the optimization problem of finding a clique of maximum size in a graph. The
corresponding decision problem asks simply whether a clique of a given
size k exists in the graph. The formal definition is
CLIQUE = {〈 G, k〉 : G is a graph containing a clique of size k}.
A naive algorithm for determining whether a graph G = ( V, E) with
|V| vertices contains a clique of size k lists all k-subsets of V and checks each one to see whether it forms a clique. The running time of this
algorithm is Ω(k^2 (|V| choose k)), which is polynomial if k is a constant. In general,
however, k could be near | V|/2, in which case the algorithm runs in superpolynomial time. Indeed, an efficient algorithm for the clique
problem is unlikely to exist.
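The naive algorithm just described translates almost line-for-line into code. A sketch (Python; ours, with a graph represented as a set of frozenset edges):

```python
from itertools import combinations

def has_clique(vertices, edges, k):
    # Check every k-subset of vertices; superpolynomial when k grows with |V|.
    return any(all(frozenset((u, v)) in edges for u, v in combinations(sub, 2))
               for sub in combinations(vertices, k))

edges = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (3, 4)]}
print(has_clique([1, 2, 3, 4], edges, 3))  # True: {1, 2, 3} is a triangle
```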
Theorem 34.11
The clique problem is NP-complete.
Proof First, we show that CLIQUE ∈ NP. For a given graph G = ( V, E), use the set V′ ⊆ V of vertices in the clique as a certificate for G. To check whether V′ is a clique in polynomial time, check whether, for each
pair u, v ∈ V′, the edge ( u, v) belongs to E.
We next prove that 3-CNF-SAT ≤P CLIQUE, which shows that the
clique problem is NP-hard. You might be surprised that the proof
reduces an instance of 3-CNF-SAT to an instance of CLIQUE, since on
the surface logical formulas seem to have little to do with graphs.
The reduction algorithm begins with an instance of 3-CNF-SAT. Let
ϕ = C1 ∧ C2 ∧ ⋯ ∧ Ck be a boolean formula in 3-CNF with k clauses.
For r = 1, 2, … , k, each clause Cr contains exactly three distinct literals
l^r_1, l^r_2, and l^r_3. We will construct a graph G such that ϕ is satisfiable if and
only if G contains a clique of size k.
We construct the undirected graph G = (V, E) as follows. For each clause
Cr = (l^r_1 ∨ l^r_2 ∨ l^r_3) in ϕ, place a triple of vertices v^r_1, v^r_2, and v^r_3 into V.
Add edge (v^r_i, v^s_j) into E if both of the following hold:
• v^r_i and v^s_j are in different triples, that is, r ≠ s, and
• their corresponding literals are consistent, that is, l^r_i is not the
negation of l^s_j.
We can build this graph from ϕ in polynomial time. As an example of
this construction, if
ϕ = ( x 1 ∨ ¬ x 2 ∨ ¬ x 3) ∧ (¬ x 1 ∨ x 2 ∨ x 3) ∧ ( x 1 ∨ x 2 ∨ x 3), then G is the graph shown in Figure 34.14.
We must show that this transformation of ϕ into G is a reduction.
First, suppose that ϕ has a satisfying assignment. Then each clause Cr
contains at least one literal that is assigned 1, and each such literal
corresponds to a vertex v^r_i. Picking one such “true” literal from each
clause yields a set V′ of k vertices. We claim that V′ is a clique. For any two vertices
v^r_i, v^s_j ∈ V′, where r ≠ s, both corresponding literals l^r_i and l^s_j
map to 1 by the given satisfying assignment, and thus the literals cannot
be complements. Thus, by the construction of G, the edge (v^r_i, v^s_j)
belongs to E.
Conversely, suppose that G contains a clique V′ of size k. No edges in G connect vertices in the same triple, and so V′ contains exactly one vertex per triple. If v^r_i ∈ V′, then assign 1 to the corresponding literal l^r_i.
Since G contains no edges between inconsistent literals, no literal and its
complement are both assigned 1. Each clause is satisfied, and so ϕ is satisfied. (Any variables that do not correspond to a vertex in the clique
may be set arbitrarily.)
▪
Figure 34.14 The graph G derived from the 3-CNF formula ϕ = C 1 ∧ C 2 ∧ C 3, where C 1 = ( x 1
∨ ¬ x 2 ∨ ¬ x 3), C 2 = (¬ x 1 ∨ x 2 ∨ x 3), and C 3 = ( x 1 ∨ x 2 ∨ x 3), in reducing 3-CNF-SAT to CLIQUE. A satisfying assignment of the formula has x 2 = 0, x 3 = 1, and x 1 set to either 0 or 1.
This assignment satisfies C 1 with ¬ x 2, and it satisfies C 2 and C 3 with x 3, corresponding to the clique with blue vertices.
In the example of Figure 34.14, a satisfying assignment of ϕ has x 2 =
0 and x 3 = 1. A corresponding clique of size k = 3 consists of the vertices corresponding to ¬ x 2 from the first clause, x 3 from the second clause, and x 3 from the third clause. Because the clique contains no vertices corresponding to either x 1 or ¬ x 1, this satisfying assignment can set x 1 to either 0 or 1.
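The reduction itself is short enough to check by brute force on this example. A sketch (Python; ours, with each literal encoded as a (variable, negated?) pair) builds G from ϕ and searches for a k-clique:

```python
from itertools import combinations

# phi = (x1 OR !x2 OR !x3) AND (!x1 OR x2 OR x3) AND (x1 OR x2 OR x3)
clauses = [[(1, False), (2, True), (3, True)],
           [(1, True), (2, False), (3, False)],
           [(1, False), (2, False), (3, False)]]

# Vertices are (clause_index, literal) pairs; edges join consistent
# literals from different triples, exactly as in the construction.
vertices = [(r, lit) for r, clause in enumerate(clauses) for lit in clause]

def edge(a, b):
    (r, (var1, neg1)), (s, (var2, neg2)) = a, b
    return r != s and not (var1 == var2 and neg1 != neg2)

k = len(clauses)
cliques = [c for c in combinations(vertices, k)
           if all(edge(a, b) for a, b in combinations(c, 2))]
print(len(cliques) > 0)  # True: phi is satisfiable
```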
The proof of Theorem 34.11 reduced an arbitrary instance of 3-
CNF-SAT to an instance of CLIQUE with a particular structure. You
might think that we have shown only that CLIQUE is NP-hard in
graphs in which the vertices are restricted to occur in triples and in
which there are no edges between vertices in the same triple. Indeed, we
have shown that CLIQUE is NP-hard only in this restricted case, but
this proof suffices to show that CLIQUE is NP-hard in general graphs.
Why? If there were a polynomial-time algorithm that solves CLIQUE
on general graphs, it would also solve CLIQUE on restricted graphs.
The opposite approach—reducing instances of 3-CNF-SAT with a
special structure to general instances of CLIQUE—does not suffice,
however. Why not? Perhaps the instances of 3-CNF-SAT that we choose
to reduce from are “easy,” and so we would not have reduced an NP-
hard problem to CLIQUE.
Moreover, the reduction uses the instance of 3-CNF-SAT, but not
the solution. We would have erred if the polynomial-time reduction had
relied on knowing whether the formula ϕ is satisfiable, since we do not
know how to decide whether ϕ is satisfiable in polynomial time.
Figure 34.15 Reducing CLIQUE to VERTEX-COVER. (a) An undirected graph G = (V, E) with clique V′ = {u, v, x, y}, shown in blue. (b) The complement graph Ḡ produced by the reduction algorithm, which has vertex cover V − V′ = {w, z}, in blue.
34.5.2 The vertex-cover problem
A vertex cover of an undirected graph G = ( V, E) is a subset V′ ⊆ V
such that if ( u, v) ∈ E, then u ∈ V′ or v ∈ V′ (or both). That is, each vertex “covers” its incident edges, and a vertex cover for G is a set of
vertices that covers all the edges in E. The size of a vertex cover is the number of vertices in it. For example, the graph in Figure 34.15(b) has a vertex cover { w, z} of size 2.
The vertex-cover problem is to find a vertex cover of minimum size in
a given graph. For this optimization problem, the corresponding
decision problem asks whether a graph has a vertex cover of a given size
k. As a language, we define
VERTEX-COVER = {〈 G, k〉 : graph G has a vertex cover of size k}.
The following theorem shows that this problem is NP-complete.
Theorem 34.12
The vertex-cover problem is NP-complete.
Proof We first show that VERTEX-COVER ∈ NP. Given a graph G =
( V, E) and an integer k, the certificate is the vertex cover V′ ⊆ V itself.
The verification algorithm affirms that | V′| = k, and then it checks, for each edge ( u, v) ∈ E, that u ∈ V′ or v ∈ V′. It is easy to verify the certificate in polynomial time.
To prove that the vertex-cover problem is NP-hard, we reduce from
the clique problem, showing that CLIQUE ≤P VERTEX-COVER. This
reduction relies on the notion of the complement of a graph. Given an
undirected graph G = (V, E), we define the complement of G as the graph Ḡ = (V, Ē), where Ē = {(u, v) : u, v ∈ V, u ≠ v, and (u, v) ∉ E}. In other words, Ḡ is the graph containing exactly those edges that are not in G.
Figure 34.15 shows a graph and its complement and illustrates the reduction from CLIQUE to VERTEX-COVER.
The reduction algorithm takes as input an instance 〈 G, k〉 of the
clique problem and computes the complement Ḡ in polynomial time.
The output of the reduction algorithm is the instance 〈Ḡ, |V| − k〉 of the vertex-cover problem. To complete the proof, we show that this
transformation is indeed a reduction: the graph G contains a clique of
size k if and only if the graph Ḡ has a vertex cover of size |V| − k.
Suppose that G contains a clique V′ ⊆ V with |V′| = k. We claim that V − V′ is a vertex cover in Ḡ. Let (u, v) be any edge in Ē. Then (u, v) ∉
E, which implies that at least one of u or v does not belong to V′, since every pair of vertices in V′ is connected by an edge of E. Equivalently, at least one of u or v belongs to V − V′, which means that edge (u, v) is covered by V − V′. Since (u, v) was chosen arbitrarily from Ē, every edge of Ē is covered by a vertex in V − V′. Hence the set V − V′, which has size |V| − k, forms a vertex cover for Ḡ.
Conversely, suppose that Ḡ has a vertex cover V′ ⊆ V, where |V′| =
|V| − k. Then for all u, v ∈ V, if (u, v) ∈ Ē, then u ∈ V′ or v ∈ V′ or both. The contrapositive of this implication is that for all u, v ∈ V, if u
∉ V′ and v ∉ V′, then (u, v) ∈ E. In other words, V − V′ is a clique, and it has size |V| − |V′| = k.
▪
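The clique/vertex-cover correspondence can be spot-checked on a small graph in the spirit of Figure 34.15 (the figure's exact edge set is not reproduced here, so the non-clique edges below are a stand-in; the check holds for any graph containing the clique):

```python
from itertools import combinations

V = {"u", "v", "w", "x", "y", "z"}
clique = {"u", "v", "x", "y"}
# Edges of G: the clique's edges plus a couple of others (stand-in data).
E = {frozenset(p) for p in combinations(clique, 2)} | {
    frozenset({"w", "u"}), frozenset({"z", "y"})}

# Complement graph G-bar: exactly the missing edges.
E_bar = {frozenset({a, b}) for a, b in combinations(V, 2)
         if frozenset({a, b}) not in E}
cover = V - clique
assert all(e & cover for e in E_bar)  # V - V' covers every edge of G-bar
print(sorted(cover))
```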
Since VERTEX-COVER is NP-complete, we don’t expect to find a
polynomial-time algorithm for finding a minimum-size vertex cover.
Section 35.1 presents a polynomial-time “approximation algorithm,”
however, which produces “approximate” solutions for the vertex-cover
problem. The size of a vertex cover produced by the algorithm is at
most twice the minimum size of a vertex cover.
Thus, you shouldn’t give up hope just because a problem is NP-
complete. You might be able to design a polynomial-time
approximation algorithm that obtains near-optimal solutions, even
though finding an optimal solution is NP-complete. Chapter 35 gives several approximation algorithms for NP-complete problems.
34.5.3 The hamiltonian-cycle problem
We now return to the hamiltonian-cycle problem defined in Section 34.2.
Theorem 34.13
The hamiltonian-cycle problem is NP-complete.
Figure 34.16 The gadget used in reducing the vertex-cover problem to the hamiltonian-cycle problem. An edge ( u, v) of graph G corresponds to gadget Γ uv in the graph G′ created in the reduction. (a) The gadget, with individual vertices labeled. (b)–(d) The paths highlighted in blue are the only possible ones through the gadget that include all vertices, assuming that the only connections from the gadget to the remainder of G′ are through vertices [ u, v, 1], [ u, v, 6], [ v, u, 1], and [ v, u, 6].
Proof We first show that HAM-CYCLE ∈ NP. Given an undirected graph G = ( V, E), the certificate is the sequence of | V| vertices that makes up the hamiltonian cycle. The verification algorithm checks that
this sequence contains each vertex in V exactly once and that with the
first vertex repeated at the end, it forms a cycle in G. That is, it checks
that there is an edge between each pair of consecutive vertices and
between the first and last vertices. This certificate can be verified in
polynomial time.
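The verification algorithm just described admits a short sketch (Python; ours, with a graph given as a set of frozenset edges):

```python
def verify_ham_cycle(vertices, edges, cert):
    # cert: sequence of vertices claimed to be a hamiltonian cycle.
    if sorted(cert) != sorted(vertices):      # each vertex exactly once
        return False
    closed = list(cert) + [cert[0]]           # repeat first vertex at the end
    return all(frozenset((closed[i], closed[i + 1])) in edges
               for i in range(len(cert)))     # consecutive pairs are edges

square = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}
print(verify_ham_cycle([1, 2, 3, 4], square, [1, 2, 3, 4]))  # True
```

Both checks take time linear in |V| (plus edge lookups), so the whole verification runs in polynomial time.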
We now prove that VERTEX-COVER ≤P HAM-CYCLE, which
shows that HAM-CYCLE is NP-hard and hence NP-complete. Given an undirected graph
G = ( V, E) and an integer k, we construct an undirected graph G′ = ( V′, E′) that has a hamiltonian cycle if and only if G has a vertex cover of size k. We assume without loss of generality that G contains no isolated vertices (that is, every vertex in V has at least one incident edge) and that k ≤ | V|. (If an isolated vertex belongs to a vertex cover of size k, then there also exists a vertex cover of size k − 1, and for any graph, the
entire set V is always a vertex cover.)
Our construction uses a gadget, which is a piece of a graph that
enforces certain properties. Figure 34.16(a) shows the gadget we use.
For each edge ( u, v) ∈ E, the constructed graph G′ contains one copy of this gadget, which we denote by Γ uv. We denote each vertex in Γ uv by
[ u, v, i] or [ v, u, i], where 1 ≤ i ≤ 6, so that each gadget Γ uv contains 12
vertices. Gadget Γ uv also contains the 14 edges shown in Figure 34.16(a).
Along with the internal structure of the gadget, we enforce the
properties we want by limiting the connections between the gadget and
the remainder of the graph G′ that we construct. In particular, only vertices [ u, v, 1], [ u, v, 6], [ v, u, 1], and [ v, u, 6] will have edges incident from outside Γ uv. Any hamiltonian cycle of G′ must traverse the edges
of Γ uv in one of the three ways shown in Figures 34.16(b)–(d). If the cycle enters through vertex [ u, v, 1], it must exit through vertex [ u, v, 6], and it either visits all 12 of the gadget’s vertices (Figure 34.16(b)) or the six vertices [ u, v, 1] through [ u, v, 6] (Figure 34.16(c)). In the latter case, the cycle will have to reenter the gadget to visit vertices [ v, u, 1] through
[ v, u, 6]. Similarly, if the cycle enters through vertex [ v, u, 1], it must exit through vertex [ v, u, 6], and either it visits all 12 of the gadget’s vertices (Figure 34.16(d)) or it visits the six vertices [ v, u, 1] through [ v, u, 6] and reenters to visit [ u, v, 1] through [ u, v, 6] (Figure 34.16(c)). No other paths through the gadget that visit all 12 vertices are possible. In
particular, it is impossible to construct two vertex-disjoint paths, one of
which connects [ u, v, 1] to [ v, u, 6] and the other of which connects [ v, u, 1] to [ u, v, 6], such that the union of the two paths contains all of the
gadget’s vertices.
The only vertices in V′ other than those in the gadgets are selector
vertices s 1, s 2, … , sk. We’ll use edges incident on selector vertices in G′
to select the k vertices of the cover in G.
In addition to the edges in gadgets, E′ contains two other types of
edges, which Figure 34.17 shows. First, for each vertex u ∈ V, edges join pairs of gadgets in order to form a path containing all gadgets
corresponding to edges incident on u in G. We arbitrarily order the vertices adjacent to each vertex u ∈ V as u(1), u(2), … , u(degree( u)), where degree( u) is the number of vertices adjacent to u. To create a path in G′ through all the gadgets corresponding to edges incident on u, E′
contains the edges {([ u, u( i), 6], [ u, u( i+1), 1]) : 1 ≤ i ≤ degree( u) − 1}. In
Figure 34.17, for example, we order the vertices adjacent to w as 〈 x, y, z〉, and so graph G′ in part (b) of the figure includes the edges ([ w, x, 6],
[ w, y, 1]) and ([ w, y, 6], [ w, z, 1]). The vertices adjacent to x are ordered as 〈 w, y〉, so that G′ includes the edge ([ x, w, 6], [ x, y, 1]). For each vertex u ∈ V, these edges in G′ fill in a path containing all gadgets corresponding to edges incident on u in G.
The intuition behind these edges is that if vertex u ∈ V belongs to
the vertex cover of G, then G′ contains a path from [ u, u(1), 1] to [ u, u(degree( u)), 6] that “covers” all gadgets corresponding to edges incident on u. That is, for each of these gadgets, say
Γ u u(i), the path
either includes all 12 vertices (if u belongs to the vertex cover but u( i) does not) or just the six vertices [ u, u( i), 1] through [ u, u( i), 6] (if both u and u( i) belong to the vertex cover).
The final type of edge in E′ joins the first vertex [ u, u(1), 1] and the last vertex [ u, u(degree( u)), 6] of each of these paths to each of the selector vertices. That is, E′ includes the edges
{(sj, [u, u(1), 1]) : u ∈ V and 1 ≤ j ≤ k} ∪ {(sj, [u, u(degree(u)), 6]) : u ∈ V and 1 ≤ j ≤ k}.
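The whole construction of G′ can be sketched in code. The following Python sketch is not the book's code: in particular, the 14 gadget edges are assumed to be the classic "cover-testing" component (the two paths [u, v, 1]–…–[u, v, 6] and [v, u, 1]–…–[v, u, 6] plus four cross edges), which is believed to match Figure 34.16(a), and all function names are illustrative.

```python
# Sketch of the VERTEX-COVER -> HAM-CYCLE reduction (illustrative only).
# Assumption: gadget_edges builds the classic cover-testing component;
# its exact correspondence with Figure 34.16(a) is assumed.

def gadget_edges(u, v):
    """The 14 edges of gadget Gamma_uv; its vertices are triples (a, b, i)."""
    edges = []
    for a, b in ((u, v), (v, u)):
        for i in range(1, 6):                       # two paths of 5 edges each
            edges.append(((a, b, i), (a, b, i + 1)))
    edges += [((u, v, 1), (v, u, 3)), ((v, u, 1), (u, v, 3)),
              ((u, v, 4), (v, u, 6)), ((v, u, 4), (u, v, 6))]
    return edges

def reduce_vc_to_hc(V, E, k):
    """Return (V', E') for the instance <G, k> of the vertex-cover problem.
    Assumes G has no isolated vertices, as in the text."""
    Vp = [("s", j) for j in range(1, k + 1)]        # selector vertices
    Ep = []
    neighbors = {u: [] for u in V}
    for u, v in E:
        neighbors[u].append(v)
        neighbors[v].append(u)
        Vp += [(u, v, i) for i in range(1, 7)]      # 12 vertices per gadget
        Vp += [(v, u, i) for i in range(1, 7)]
        Ep += gadget_edges(u, v)
    for u in V:
        nb = neighbors[u]                           # the order u(1), ..., u(degree(u))
        # path edges between consecutive gadgets incident on u
        Ep += [((u, nb[i], 6), (u, nb[i + 1], 1)) for i in range(len(nb) - 1)]
        for j in range(1, k + 1):                   # selector-to-path edges
            Ep += [(("s", j), (u, nb[0], 1)), (("s", j), (u, nb[-1], 6))]
    return Vp, Ep
```

On the graph of Figure 34.17 (|V| = 4, |E| = 4, k = 2), this sketch produces 50 vertices and 76 edges, agreeing with the counts |V′| = 12|E| + k and |E′| = 16|E| + (2k − 1)|V| derived in the size analysis.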
Figure 34.17 Reducing an instance of the vertex-cover problem to an instance of the hamiltonian-cycle problem. (a) An undirected graph G with a vertex cover of size 2, consisting of the blue vertices w and y. (b) The undirected graph G′ produced by the reduction, with the hamiltonian cycle corresponding to the vertex cover highlighted in blue. The vertex cover { w, y}
corresponds to edges ( s 1, [ w, x, 1]) and ( s 2, [ y, x, 1]) appearing in the hamiltonian cycle.
Next we show that the size of G′ is polynomial in the size of G, and
hence it takes time polynomial in the size of G to construct G′. The
vertices of G′ are those in the gadgets, plus the selector vertices. With 12
vertices per gadget, plus k ≤ | V | selector vertices, G′ contains a total of
| V′| = 12 | E| + k
≤ 12 | E| + | V|
vertices. The edges of G′ are those in the gadgets, those that go between
gadgets, and those connecting selector vertices to gadgets. Each gadget
contains 14 edges, totaling 14 | E| in all gadgets. For each vertex u ∈ V, graph G′ has degree( u) − 1 edges going between gadgets, so that summed over all vertices in V,
Σu∈V (degree(u) − 1) = 2 | E| − | V|
edges go between gadgets. Finally, G′ has two edges for each pair
consisting of a selector vertex and a vertex of V, totaling 2 k | V| such edges. The total number of edges of G′ is therefore
| E′| = (14 | E|) + (2 | E| − | V|) + (2 k | V|)
= 16 | E| + (2 k − 1) | V|
≤ 16 | E| + (2 | V| − 1) | V|.
Now we show that the transformation from graph G to G′ is a reduction. That is, we must show that G has a vertex cover of size k if and only if G′ has a hamiltonian cycle.
Suppose that G = ( V, E) has a vertex cover V* ⊆ V, where | V*| = k.
Let V* = { u 1, u 2, … , uk}. As Figure 34.17 shows, we can construct a hamiltonian cycle in G′ by including the following edges for each vertex uj ∈ V*. Start by including the edges {([uj, uj(i), 6], [uj, uj(i+1), 1]) : 1 ≤ i ≤ degree(uj) − 1}, which connect all gadgets
corresponding to edges incident on uj. Also include the edges within
these gadgets as Figures 34.16(b)–(d) show, depending on whether the edge is covered by one or two vertices in V*. The hamiltonian cycle also
includes the edges
{(sj, [uj, uj(1), 1]) : 1 ≤ j ≤ k} ∪ {(sj+1, [uj, uj(degree(uj)), 6]) : 1 ≤ j ≤ k − 1} ∪ {(s1, [uk, uk(degree(uk)), 6])}.
By inspecting Figure 34.17, you can verify that these edges form a cycle, where u 1 = w and u 2 = y. The cycle starts at s 1, visits all gadgets corresponding to edges incident on u 1, then visits s 2, visits all gadgets corresponding to edges incident on u 2, and so on, until it returns to s 1.
The cycle visits each gadget either once or twice, depending on whether
one or two vertices of V* cover its corresponding edge. Because V* is a vertex cover for G, each edge in E is incident on some vertex in V*, and so the cycle visits each vertex in each gadget of G′. Because the cycle also visits every selector vertex, it is hamiltonian.
Conversely, suppose that G′ = ( V′, E′) contains a hamiltonian cycle C
⊆ E′. We claim that the set
V* = {u ∈ V : (sj, [u, u(1), 1]) ∈ C for some 1 ≤ j ≤ k} (34.4)
is a vertex cover for G.
We first argue that the set V* is well defined, that is, for each selector
vertex sj, exactly one of the incident edges in the hamiltonian cycle C is of the form ( sj, [ u, u(1), 1]) for some vertex u ∈ V. To see why, partition the hamiltonian cycle C into maximal paths that start at some selector
vertex si, visit one or more gadgets, and end at some selector vertex sj
without passing through any other selector vertex. Let’s call each of
these maximal paths a “cover path.” Let P be one such cover path, and
orient it going from si to sj. If P contains the edge ( si, [ u, u(1), 1]) for some vertex u ∈ V, then we have shown that one edge incident on si has the required form. Assume, then, that P contains the edge ( si, [ v, v(degree( v)), 6]) for some vertex v ∈ V. This path enters a gadget from the bottom, as drawn in Figures 34.16 and 34.17, and it leaves from the top. It might go through several gadgets, but it always enters from the
bottom of a gadget and leaves from the top. The only edges incident on
vertices at the top of a gadget either go to the bottoms of other gadgets or to selector vertices. Therefore, after the last gadget in the series of
gadgets visited by P, the edge taken must go to a selector vertex sj, so that P contains an edge of the form ( sj, [ u, u(1), 1]), where [ u, u(1), 1] is a vertex at the top of some gadget. To see that not both edges incident on
sj have this form, simply reverse the direction of traversing P in the above argument.
Having established that the set V* is well defined, let’s see why it is a
vertex cover for G. We have already established that each cover path starts at some si, takes the edge ( si, [ u, u(1), 1]) for some vertex u ∈ V, passes through all the gadgets corresponding to edges in E incident on
u, and then ends at some selector vertex sj. (This orientation is the reverse of the orientation in the paragraph above.) Let’s call this cover
path Pu, and by equation (34.4), the vertex cover V* includes u. Each gadget visited by Pu must be Γ uv or Γ vu for some v ∈ V. For each gadget visited by Pu, its vertices are visited by either one or two cover
paths. If they are visited by one cover path, then edge ( u, v) ∈ E is covered in G by vertex u. If two cover paths visit the gadget, then the other cover path must be Pv, which implies that v ∈ V*, and edge ( u, v)
∈ E is covered by both u and v. Because each vertex in each gadget is visited by some cover path, we see that each edge in E is covered by some vertex in V*.
▪
34.5.4 The traveling-salesperson problem
In the traveling-salesperson problem, which is closely related to the
hamiltonian-cycle problem, a salesperson must visit n cities. Let’s model
the problem as a complete graph with n vertices, so that the salesperson
wishes to make a tour, or hamiltonian cycle, visiting each city exactly
once and finishing at the starting city. The salesperson incurs a
nonnegative integer cost c( i, j) to travel from city i to city j. In the optimization version of the problem, the salesperson wishes to make the
tour whose total cost is minimum, where the total cost is the sum of the
individual costs along the edges of the tour. For example, in Figure
34.18, a minimum-cost tour is 〈 u, w, v, x, u〉, with cost 7. The formal
language for the corresponding decision problem is
Figure 34.18 An instance of the traveling-salesperson problem. Edges highlighted in blue represent a minimum-cost tour, with cost 7.
TSP = {〈 G, c, k〉 : G = ( V, E) is a complete graph,
c is a function from V × V → ℕ,
k ∈ ℕ, and
G has a traveling-salesperson tour with cost at most k}.
The following theorem shows that a fast algorithm for the traveling-
salesperson problem is unlikely to exist.
Theorem 34.14
The traveling-salesperson problem is NP-complete.
Proof We first show that TSP ∈ NP. Given an instance of the problem,
the certificate is the sequence of n vertices in the tour. The verification
algorithm checks that this sequence contains each vertex exactly once,
sums up the edge costs, and checks that the sum is at most k. This process can certainly be done in polynomial time.
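The verification step described above can be sketched as follows. The function name and the dictionary representation of the cost function c are illustrative choices, not from the text.

```python
def verify_tsp_certificate(V, c, k, tour):
    """Polynomial-time check of a claimed tour for the instance <G, c, k>.
    The tour must list each vertex exactly once; its total cost, including
    the closing edge back to the start, must be at most k."""
    if sorted(tour) != sorted(V):        # each vertex appears exactly once
        return False
    cost = sum(c[(tour[i], tour[(i + 1) % len(tour)])]
               for i in range(len(tour)))
    return cost <= k
```

Both checks take time polynomial in the number of vertices, so TSP ∈ NP.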
To prove that TSP is NP-hard, we show that HAM-CYCLE ≤P TSP.
Given an instance G = ( V, E) of HAM-CYCLE, construct an instance of TSP by forming the complete graph G′ = ( V, E′), where E′ = {( i, j) : i, j ∈ V and i ≠ j }, with the cost function c defined as
c(i, j) = 0 if (i, j) ∈ E, and c(i, j) = 1 if (i, j) ∉ E.
(Because G is undirected, it contains no self-loops, and so c( v, v) = 1 for all vertices v ∈ V.) The instance of TSP is then 〈 G′, c, 0〉, which can be created in polynomial time.
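This reduction is simple enough to sketch directly. The brute-force tour search below is exponential and serves only to illustrate the if-and-only-if claim on small graphs; the function names are illustrative.

```python
from itertools import permutations

def ham_cycle_to_tsp(V, E):
    """Build the TSP instance <G', c, 0> from the HAM-CYCLE instance G = (V, E):
    edges of G get cost 0, all other vertex pairs get cost 1."""
    edges = {frozenset(e) for e in E}
    c = {(i, j): (0 if frozenset((i, j)) in edges else 1)
         for i in V for j in V if i != j}
    return c, 0

def tsp_decides_ham_cycle(V, E):
    """Exponential brute force (illustration only): G has a hamiltonian
    cycle iff the constructed TSP instance has a tour of cost at most 0."""
    c, k = ham_cycle_to_tsp(V, E)
    start, rest = V[0], list(V[1:])
    for perm in permutations(rest):
        tour = [start, *perm]
        cost = sum(c[(tour[i], tour[(i + 1) % len(V)])] for i in range(len(V)))
        if cost <= k:
            return True
    return False
```

A 4-cycle yields a cost-0 tour; a 4-vertex path does not, since any tour must use at least one cost-1 edge.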
We now show that graph G has a hamiltonian cycle if and only if
graph G′ has a tour of cost at most 0. Suppose that graph G has a hamiltonian cycle H. Each edge in H belongs to E and thus has cost 0 in G′. Thus, H is a tour in G′ with cost 0. Conversely, suppose that graph G′ has a tour H′ of cost at most 0. Since the costs of the edges in E′ are 0
and 1, the cost of tour H′ is exactly 0 and each edge on the tour must
have cost 0. Therefore, H′ contains only edges in E. We conclude that H′
is a hamiltonian cycle in graph G.
▪
34.5.5 The subset-sum problem
We next consider an arithmetic NP-complete problem. The subset-sum
problem takes as inputs a finite set S of positive integers and an integer target t > 0. It asks whether there exists a subset S′ ⊆ S whose elements sum to exactly t. For example, if S = {1, 2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993} and t = 138457, then the subset S′
= {1, 2, 7, 98, 343, 686, 2409, 17206, 117705} is a solution.
As usual, we express the problem as a language:
SUBSET-SUM = {〈 S, t〉 : there exists a subset S′ ⊆ S such that t = Σ s∈ S′ s}.
As with any arithmetic problem, it is important to recall that our
standard encoding assumes that the input integers are coded in binary.
With this assumption in mind, we can show that the subset-sum
problem is unlikely to have a fast algorithm.
Theorem 34.15
The subset-sum problem is NP-complete.
Proof To show that SUBSET-SUM ∈ NP, for an instance 〈 S, t〉 of the problem, let the subset S′ be the certificate. A verification algorithm can check whether t = Σ s∈ S′ s in polynomial time.
We now show that 3-CNF-SAT ≤P SUBSET-SUM. Given a 3-CNF
formula ϕ over variables x 1, x 2, … , xn with clauses C 1, C 2, … , Ck, each containing exactly three distinct literals, the reduction algorithm
constructs an instance 〈 S, t〉 of the subset-sum problem such that ϕ is
satisfiable if and only if there exists a subset of S whose sum is exactly t.
Without loss of generality, we make two simplifying assumptions about
the formula ϕ. First, no clause contains both a variable and its
negation, for such a clause is automatically satisfied by any assignment
of values to the variables. Second, each variable appears in at least one
clause, because it does not matter what value is assigned to a variable
that appears in no clauses.
The reduction creates two numbers in set S for each variable xi and
two numbers in S for each clause Cj. The numbers will be represented in base 10, with each number containing n + k digits and each digit corresponding to either one variable or one clause. Base 10 (and other
bases, as we shall see) has the property we need of preventing carries
from lower digits to higher digits.
Figure 34.19 The reduction of 3-CNF-SAT to SUBSET-SUM. The formula in 3-CNF is ϕ =
C 1∧ C 2∧ C 3∧ C 4, where C 1 = ( x 1∨¬ x 2∨¬ x 3), C 2 = (¬ x 1∨¬ x 2∨¬ x 3), C 3 = (¬ x 1∨¬ x 2∨ x 3), and C 4 = ( x 1 ∨ x 2 ∨ x 3). A satisfying assignment of ϕ is 〈 x 1 = 0, x 2 = 0, x 3 = 1〉. The set S
produced by the reduction consists of the base-10 numbers shown: reading from top to bottom, S = {1001001, 1000110, 100001, 101110, 10011, 11100, 1000, 2000, 100, 200, 10, 20, 1, 2}. The target t is 1114444. The subset S′ ⊆ S is shaded blue, and it contains v′1, v′2, and v3, corresponding to the satisfying assignment. Subset S′ also contains slack variables s1, s′1, s′2, s3, s4, and s′4 to achieve the target value of 4 in the digits labeled by C 1 through C 4.
As Figure 34.19 shows, we construct set S and target t as follows.
Label each digit position by either a variable or a clause. The least
significant k digits are labeled by the clauses, and the most significant n digits are labeled by variables.
The target t has a 1 in each digit labeled by a variable and a 4 in
each digit labeled by a clause.
For each variable xi, set S contains two integers vi and v′i. Each of vi and v′i has a 1 in the digit labeled by xi and 0s in the other variable digits. If literal xi appears in clause Cj, then the digit labeled by Cj in vi contains a 1. If literal ¬ xi appears in clause Cj, then the digit labeled by Cj in v′i contains a 1. All other digits labeled by clauses in vi and v′i are 0.
All vi and v′i values in set S are unique. Why? For ℓ ≠ i, no vℓ or v′ℓ values can equal vi and v′i in the most significant n digits. Furthermore, by our simplifying assumptions above, no vi and v′i can be equal in all k least significant digits. If vi and v′i were equal, then xi and ¬ xi would have to appear in exactly the same set of clauses. But we assume that no clause contains both xi and ¬ xi and that either xi or ¬ xi appears in some clause, and so there must be some clause Cj for which vi and v′i differ.
For each clause Cj, set S contains two integers sj and s′j. Each of sj and s′j has 0s in all digits other than the one labeled by Cj. For sj, there is a 1 in the Cj digit, and s′j has a 2 in this digit. These integers are “slack variables,” which we use to get each clause-labeled digit position to add to the target value of 4. Simple inspection of Figure 34.19 demonstrates that all sj and s′j values are unique in set S.
The greatest sum of digits in any one digit position is 6, which occurs in the digits labeled by clauses (three 1s from the vi and v′i values, plus 1 and 2 from the sj and s′j values). Interpreting these numbers in base 10, therefore, no carries can occur from lower digits to higher digits.
The reduction can be performed in polynomial time. The set S
consists of 2 n + 2 k values, each of which has n + k digits, and the time to produce each digit is polynomial in n + k. The target t has n + k digits, and the reduction produces each in constant time.
Let’s now show that the 3-CNF formula ϕ is satisfiable if and only if
there exists a subset S′ ⊆ S whose sum is t. First, suppose that ϕ has a
satisfying assignment. For i = 1, 2, … , n, if xi = 1 in this assignment, then include vi in S′. Otherwise, include v′i. In other words, S′ includes exactly the vi and v′i values that correspond to literals with the value 1 in the satisfying assignment. Having included either vi or v′i, but not both, for all i, and having put 0 in the digits labeled by variables in all sj and s′j, we see that for each variable-labeled digit, the sum of the values of S′ must be 1, which matches those digits of the target t. Because each clause is satisfied, the clause contains some literal with the value 1. Therefore, each digit labeled by a clause has at least one 1 contributed to its sum by a vi or v′i value in S′. In fact, one, two, or three literals may be 1 in each clause, and so each clause-labeled digit has a sum of 1, 2, or 3 from the vi and v′i values in S′. In Figure 34.19 for example, literals ¬ x 1, ¬ x 2, and x 3 have the value 1 in a satisfying assignment. Each of clauses C 1 and C 4 contains exactly one of these literals, and so together v′1, v′2, and v3 contribute 1 to the sum in the digits for C 1 and C 4. Clause C 2 contains two of these literals, and v′1, v′2, and v3 contribute 2 to the sum in the digit for C 2. Clause C 3 contains all three of these literals, and v′1, v′2, and v3 contribute 3 to the sum in the digit for C 3. To achieve the target of 4 in each digit labeled by clause Cj, include in S′ the appropriate nonempty subset of slack variables { sj, s′j}. In Figure 34.19, S′ includes s1, s′1, s′2, s3, s4, and s′4. Since S′ matches the target in all digits of the sum, and no carries can occur, the values of S′ sum to t.
Now suppose that some subset S′ ⊆ S sums to t. The subset S′ must include exactly one of vi and v′i for each i = 1, 2, … , n, for otherwise the digits labeled by variables would not sum to 1. If vi ∈ S′, then set xi = 1. Otherwise, v′i ∈ S′, and set xi = 0. We claim that every clause Cj, for j = 1, 2, … , k, is satisfied by this assignment. To prove this claim, note that to achieve a sum of 4 in the digit labeled by Cj, the subset S′ must include at least one vi or v′i value that has a 1 in the digit labeled by Cj,
since the contributions of the slack variables sj and s′j together sum to at most 3. If S′ includes a vi that has a 1 in Cj’s position, then the literal xi appears in clause Cj. Since xi = 1 when vi ∈ S′, clause Cj is satisfied. If S′ includes a v′i that has a 1 in that position, then the literal ¬ xi appears in Cj. Since xi = 0 when v′i ∈ S′, clause Cj is again satisfied. Thus, all clauses of ϕ are satisfied, which completes the proof.
▪
34.5.6 Reduction strategies
From the reductions in this section, you can see that no single strategy
applies to all NP-complete problems. Some reductions are
straightforward, such as reducing the hamiltonian-cycle problem to the
traveling-salesperson problem. Others are considerably more
complicated. Here are a few things to keep in mind and some strategies
that you can often bring to bear.
Pitfalls
Make sure that you don’t get the reduction backward. That is, in trying
to show that problem Y is NP-complete, you might take a known NP-
complete problem X and give a polynomial-time reduction from Y to X.
That is the wrong direction. The reduction should be from X to Y, so
that a solution to Y gives a solution to X.
Remember also that reducing a known NP-complete problem X to a
problem Y does not in itself prove that Y is NP-complete. It proves that Y is NP-hard. In order to show that Y is NP-complete, you additionally
need to prove that it’s in NP by showing how to verify a certificate for Y
in polynomial time.
Go from general to specific
When reducing problem X to problem Y, you always have to start with
an arbitrary input to problem X. But you are allowed to restrict the input to problem Y as much as you like. For example, when reducing 3-CNF satisfiability to the subset-sum problem, the reduction had to be
able to handle any 3-CNF formula as its input, but the input to the subset-sum problem that it produced had a particular structure: 2 n + 2 k
integers in the set, and each integer was formed in a particular way. The
reduction did not need to produce every possible input to the subset-sum problem. The point is that one way to solve the 3-CNF satisfiability
problem transforms the input into an input to the subset-sum problem
and then uses the answer to the subset-sum problem as the answer to
the 3-CNF satisfiability problem.
Take advantage of structure in the problem you are reducing from
When you are choosing a problem to reduce from, you might consider
two problems in the same domain, but one problem has more structure
than the other. For example, it’s almost always much easier to reduce
from 3-CNF satisfiability than to reduce from formula satisfiability.
Boolean formulas can be arbitrarily complicated, but you can exploit
the structure of 3-CNF formulas when reducing.
Likewise, it is usually more straightforward to reduce from the
hamiltonian-cycle problem than from the traveling-salesperson
problem, even though they are so similar. That’s because you can view
the hamiltonian-cycle problem as taking a complete graph but with
edge weights of just 0 or 1, as they would appear in the adjacency
matrix. In that sense, the hamiltonian-cycle problem has more structure
than the traveling-salesperson problem, in which edge weights are
unrestricted.
Look for special cases
Several NP-complete problems are just special cases of other NP-
complete problems. For example, consider the decision version of the 0-
1 knapsack problem: given a set of n items, each with a weight and a
value, does there exist a subset of items whose total weight is at most a
given weight W and whose total value is at least a given value V? You
can view the set-partition problem in Exercise 34.5-5 as a special case of
the 0-1 knapsack problem: let the value of each item equal its weight,
and set both W and V to half the total weight. If problem X is NP-hard and it is a special case of problem Y, then problem Y must be NP-hard
as well. That is because a polynomial-time solution for problem Y
automatically gives a polynomial-time solution for problem X. More
intuitively, problem Y, being more general than problem X, is at least as hard.
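The special-case relationship can be made concrete. In this sketch, the brute-force knapsack decider is exponential and serves only to illustrate the mapping; the function names are illustrative, and W = V = half the total weight forces the chosen subset's weight to be exactly half.

```python
from itertools import combinations

def knapsack_decision(items, W, V):
    """Decision version of 0-1 knapsack by brute force (exponential;
    illustration only). items is a list of (weight, value) pairs: is there
    a subset with total weight <= W and total value >= V?"""
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(w for w, _ in combo) <= W and sum(v for _, v in combo) >= V:
                return True
    return False

def set_partition(S):
    """Set partition as a special case of 0-1 knapsack: give each item
    value equal to its weight, and set both W and V to half the total.
    A subset then qualifies iff its weight is exactly half the total."""
    half = sum(S) / 2
    return knapsack_decision([(s, s) for s in S], half, half)
```

For example, {1, 5, 11, 5} can be split into two halves of weight 11 each, while {1, 2, 3, 5} (total 11, which is odd) cannot be split evenly.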
Select an appropriate problem to reduce from
It’s often a good strategy to reduce from a problem in a domain that is
the same as, or at least related to, the domain of the problem that you’re
trying to prove NP-complete. For example, we saw that the vertex-cover
problem—a graph problem—was NP-hard by reducing from the clique
problem—also a graph problem. From the vertex-cover problem, we
reduced to the hamiltonian-cycle problem, and from the hamiltonian-
cycle problem, we reduced to the traveling-salesperson problem. All of
these problems take undirected graphs as inputs.
Sometimes, however, you will find that it is better to cross over from
one domain to another, such as when we reduced from 3-CNF
satisfiability to the clique problem or to the subset-sum problem. 3-CNF
satisfiability often turns out to be a good choice as a problem to reduce
from when crossing domains.
Within graph problems, if you need to select a portion of the graph,
without regard to ordering, then the vertex-cover problem is often a
good place to start. If ordering matters, then consider starting from the
hamiltonian-cycle or hamiltonian-path problem (see Exercise 34.5-6).
Make big rewards and big penalties
The strategy for reducing the hamiltonian-cycle problem with a graph G
to the traveling-salesperson problem encouraged using edges present in
G when choosing edges for the traveling-salesperson tour. The reduction
did so by giving these edges a low weight: 0. In other words, we gave a
big reward for using these edges.
Alternatively, the reduction could have given the edges in G a finite
weight and given edges not in G infinite weight, thereby exacting a hefty
penalty for using edges not in G. With this approach, if each edge in G
has weight W, then the target weight of the traveling-salesperson tour
becomes W · | V|. You can sometimes think of the penalties as a way to
enforce requirements. For example, if the traveling-salesperson tour
includes an edge with infinite weight, then it violates the requirement
that the tour should include only edges belonging to G.
Design gadgets
The reduction from the vertex-cover problem to the hamiltonian-cycle
problem uses the gadget shown in Figure 34.16. This gadget is a subgraph that is connected to other parts of the constructed graph in
order to restrict the ways that a cycle can visit each vertex in the gadget
once. More generally, a gadget is a component that enforces certain
properties. Gadgets can be complicated, as in the reduction to the
hamiltonian-cycle problem. Or they can be simple: in the reduction of 3-
CNF satisfiability to the subset-sum problem, you can view the slack
variables sj and as gadgets enabling each clause-labeled digit position
to achieve the target value of 4.
Exercises
34.5-1
The subgraph-isomorphism problem takes two undirected graphs G 1 and