Repeated squaring is based on the following formula to compute a^b for nonnegative integers a and b:

          1               if b = 0,
a^b  =    (a^{b/2})^2     if b is even,        (31.34)
          a · a^{b−1}     if b is odd.

The last case, where b is odd, reduces to one of the first two cases, since if b is odd, then b − 1 is even. The recursive procedure MODULAR-EXPONENTIATION below computes a^b mod n using equation (31.34), but performing all computations modulo n.

The term “repeated squaring” comes from squaring the intermediate result d = a^{b/2} in line 5. Figure 31.4 shows the values of the parameter b, the local variable d, and the value returned at each level of the recursion for the call MODULAR-EXPONENTIATION(7, 560, 561), which returns the result 1.

Figure 31.4 The values of the parameter b, the local variable d, and the value returned for recursive calls of MODULAR-EXPONENTIATION with parameter values a = 7, b = 560, and n = 561. The value returned by each recursive call is assigned directly to d. The result of the call with a = 7, b = 560, and n = 561 is 1.

MODULAR-EXPONENTIATION(a, b, n)
1  if b == 0
2      return 1
3  elseif b mod 2 == 0
4      d = MODULAR-EXPONENTIATION(a, b/2, n)     // b is even
5      return (d · d) mod n
6  else d = MODULAR-EXPONENTIATION(a, b − 1, n)  // b is odd
7      return (a · d) mod n
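As a concrete sketch (in Python, not part of the book's pseudocode), the recursion can be transcribed directly; Python's built-in three-argument pow(a, b, n) computes the same result.

```python
def modular_exponentiation(a, b, n):
    """Compute a^b mod n by repeated squaring, mirroring the pseudocode above."""
    if b == 0:
        return 1
    elif b % 2 == 0:
        d = modular_exponentiation(a, b // 2, n)  # b is even
        return (d * d) % n
    else:
        d = modular_exponentiation(a, b - 1, n)   # b is odd
        return (a * d) % n
```

For the worked example in the text, modular_exponentiation(7, 560, 561) returns 1.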

The total number of recursive calls depends on the number of bits of

b and the values of these bits. Assume that b > 0 and that the most significant bit of b is a 1. Each 0 generates one recursive call (in line 4),

and each 1 generates two recursive calls (one in line 6 followed by one in

line 4 because if b is odd, then b − 1 is even). If the inputs a, b, and n are β-bit numbers, then there are between β and 2β − 1 recursive calls altogether, the total number of arithmetic operations required is O(β), and the total number of bit operations required is O(β^3).

Exercises


31.6-1

Draw a table showing the order of every element in ℤ*_11. Pick the smallest primitive root g and compute a table giving ind_{11,g}(x) for all x ∈ ℤ*_11.

31.6-2

Show that x^2 ≡ 1 (mod p^e) is equivalent to p^e | (x − 1)(x + 1).

31.6-3

Rewrite the third case of MODULAR-EXPONENTIATION, where b

is odd, so that if b has β bits and the most significant bit is 1, then there are always exactly β recursive calls.

31.6-4

Give a nonrecursive (i.e., iterative) version of MODULAR-

EXPONENTIATION.

31.6-5

Assuming that you know ϕ(n), explain how to compute a^{−1} mod n for any a ∈ ℤ*_n using the procedure MODULAR-EXPONENTIATION.

31.7 The RSA public-key cryptosystem

With a public-key cryptosystem, you can encrypt messages sent between

two communicating parties so that an eavesdropper who overhears the

encrypted messages will not be able to decode, or decrypt, them. A

public-key cryptosystem also enables a party to append an unforgeable

“digital signature” to the end of an electronic message. Such a signature

is the electronic version of a handwritten signature on a paper

document. It can be easily checked by anyone, forged by no one, yet

loses its validity if any bit of the message is altered. It therefore provides

authentication of both the identity of the signer and the contents of the

signed message. It is the perfect tool for electronically signed business

contracts, electronic checks, electronic purchase orders, and other

electronic communications that parties wish to authenticate.


The RSA public-key cryptosystem relies on the dramatic difference

between the ease of finding large prime numbers and the difficulty of

factoring the product of two large prime numbers. Section 31.8

describes an efficient procedure for finding large prime numbers.

Public-key cryptosystems

In a public-key cryptosystem, each participant has both a public key

and a secret key. Each key is a piece of information. For example, in the

RSA cryptosystem, each key consists of a pair of integers. The

participants “Alice” and “Bob” are traditionally used in cryptography

examples. We denote the public keys for Alice and Bob as PA and PB,

respectively, and likewise the secret keys are SA for Alice and SB for Bob.

Each participant creates his or her own public and secret keys. Secret

keys are kept secret, but public keys can be revealed to anyone or even

published. In fact, it is often convenient to assume that everyone’s

public key is available in a public directory, so that any participant can

easily obtain the public key of any other participant.

Figure 31.5 Encryption in a public key system. Bob encrypts the message M using Alice’s public key PA and transmits the resulting ciphertext C = PA( M) over a communication channel to Alice. An eavesdropper who captures the transmitted ciphertext gains no information about M.

Alice receives C and decrypts it using her secret key to obtain the original message M = SA( C).

The public and secret keys specify functions that can be applied to

any message. Let D denote the set of permissible messages. For

example, D might be the set of all finite-length bit sequences. The

simplest, and original, formulation of public-key cryptography requires

one-to-one functions from D to itself, based on the public and secret


keys. We denote the function based on Alice’s public key PA by PA()

and the function based on her secret key SA by SA(). The functions PA() and SA() are thus permutations of D. We assume that the functions PA() and SA() are efficiently computable given the

corresponding keys PA and SA.

The public and secret keys for any participant are a “matched pair”

in that they specify functions that are inverses of each other. That is,

M = SA(PA(M)),    (31.35)
M = PA(SA(M))     (31.36)

for any message M ∈ D. Transforming M with the two keys PA and SA successively, in either order, yields back the original message M.

A public-key cryptosystem requires that Alice, and only Alice, be

able to compute the function SA() in any practical amount of time. This

assumption is crucial to keeping encrypted messages sent to Alice

private and to knowing that Alice’s digital signatures are authentic.

Alice must keep her key SA secret. If she does not, whoever else has access to SA can decrypt messages intended only for Alice and can also

forge her digital signature. The assumption that only Alice can

reasonably compute SA() must hold even though everyone knows PA

and can compute PA(), the inverse function to SA(), efficiently. These

requirements appear formidable, but we’ll see how to satisfy them.

In a public-key cryptosystem, encryption works as shown in Figure

31.5. Suppose that Bob wishes to send Alice a message M encrypted so

that it looks like unintelligible gibberish to an eavesdropper. The scenario for sending the message goes as follows.

Figure 31.6 Digital signatures in a public-key system. Alice signs the message M′ by appending her digital signature σ = SA(M′) to it. She transmits the message/signature pair (M′, σ) to Bob, who verifies it by checking the equation M′ = PA(σ). If the equation holds, he accepts (M′, σ) as a message that Alice has signed.

Bob obtains Alice’s public key PA, perhaps from a public

directory or perhaps directly from Alice.

Bob computes the ciphertext C = PA( M) corresponding to the message M and sends C to Alice.

When Alice receives the ciphertext C, she applies her secret key

SA to retrieve the original message: SA( C) = SA( PA( M)) = M.

Because SA() and PA() are inverse functions, Alice can compute M

from C. Because only Alice is able to compute SA(), only Alice can compute M from C. Because Bob encrypts M using PA(), only Alice can understand the transmitted message.

Digital signatures can be implemented within this formulation of a

public-key cryptosystem. (There are other ways to construct digital

signatures, but we won’t go into them here.) Suppose now that Alice

wishes to send Bob a digitally signed response M′. Figure 31.6 shows how the digital-signature scenario proceeds.

Alice computes her digital signature σ for the message M′ using her secret key SA and the equation σ = SA( M′).

Alice sends the message/signature pair ( M′, σ) to Bob.

When Bob receives ( M′, σ), he can verify that it originated from

Alice by using Alice’s public key to verify the equation M′ =

PA( σ). (Presumably, M′ contains Alice’s name, so that Bob knows whose public key to use.) If the equation holds, then Bob

concludes that the message M′ was actually signed by Alice. If the

equation fails to hold, Bob concludes either that the information

he received was corrupted by transmission errors or that the pair

( M′, σ) is an attempted forgery.

Because a digital signature provides both authentication of the signer’s

identity and authentication of the contents of the signed message, it is

analogous to a handwritten signature at the end of a written document.

A digital signature must be verifiable by anyone who has access to

the signer’s public key. A signed message can be verified by one party

and then passed on to other parties who can also verify the signature.

For example, the message might be an electronic check from Alice to

Bob. After Bob verifies Alice’s signature on the check, he can give the

check to his bank, who can then also verify the signature and effect the

appropriate funds transfer.

A signed message may or may not be encrypted. The message can be

“in the clear” and not protected from disclosure. By composing the

above protocols for encryption and for signatures, Alice can create a

message to Bob that is both signed and encrypted. Alice first appends

her digital signature to the message and then encrypts the resulting

message/signature pair with Bob’s public key. Bob decrypts the received

message with his secret key to obtain both the original message and its

digital signature. Bob can then verify the signature using Alice’s public

key. The corresponding combined process using paper-based systems

would be to sign the paper document and then seal the document inside

a paper envelope that is opened only by the intended recipient.

The RSA cryptosystem

In the RSA public-key cryptosystem, a participant creates a public key

and a secret key with the following procedure:


1. Select at random two large prime numbers p and q such that p ≠ q. The primes p and q might be, say, 1024 bits each.

2. Compute n = pq.

3. Select a small odd integer e that is relatively prime to ϕ( n), which, by equation (31.21), equals ( p – 1)( q – 1).

4. Compute d as the multiplicative inverse of e, modulo ϕ( n).

(Corollary 31.26 guarantees that d exists and is uniquely defined.

You can use the technique of Section 31.4 to compute d, given e and ϕ( n).)

5. Publish the pair P = ( e, n) as the participant’s RSA public key.

6. Keep secret the pair S = ( d, n) as the participant’s RSA secret key.
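A toy transcription of steps 1–6 in Python (the function name is mine; a real implementation selects random primes of 1024 bits or more, and Python 3.8+ computes the modular inverse of step 4 via pow(e, -1, phi)):

```python
import math

def rsa_keygen(p, q, e):
    """Toy RSA key generation following steps 1-6 above.
    p and q are distinct primes (tiny here, only for illustration)."""
    assert p != q
    n = p * q                      # step 2
    phi = (p - 1) * (q - 1)        # phi(n) = (p - 1)(q - 1), by equation (31.21)
    assert math.gcd(e, phi) == 1   # step 3: e must be relatively prime to phi(n)
    d = pow(e, -1, phi)            # step 4: d is the inverse of e modulo phi(n)
    return (e, n), (d, n)          # steps 5-6: public key P, secret key S
```

For example, rsa_keygen(61, 53, 17) yields n = 3233, ϕ(n) = 3120, and d = 2753.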

For this scheme, the domain D is the set ℤ_n. To transform a message M associated with a public key P = (e, n), compute

P(M) = M^e mod n.    (31.37)

To transform a ciphertext C associated with a secret key S = (d, n), compute

S(C) = C^d mod n.    (31.38)
These equations apply to both encryption and signatures. To create a

signature, the signer’s secret key is applied to the message to be signed,

rather than to a ciphertext. To verify a signature, the public key of the

signer is applied to the signature rather than to a message to be

encrypted.
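Each operation is a single modular exponentiation. A sketch using the toy key pair from before (e = 17, d = 2753, n = 3233 from p = 61, q = 53; these small values are mine, for illustration only):

```python
def apply_key(key, m):
    """Apply an RSA key per equations (31.37)/(31.38): m^exp mod n."""
    exp, n = key
    return pow(m, exp, n)

P, S = (17, 3233), (2753, 3233)   # toy keys: n = 61 * 53, phi(n) = 3120
M = 65
C = apply_key(P, M)               # encryption: C = M^e mod n
assert apply_key(S, C) == M       # decryption recovers M
sigma = apply_key(S, M)           # signing applies the secret key to M
assert apply_key(P, sigma) == M   # verification applies the public key to sigma
```

The same apply_key function serves both roles, exactly as the text describes: which key is applied first determines whether the operation is encryption or signing.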

To implement the public-key and secret-key operations (31.37) and

(31.38), you can use the procedure MODULAR-EXPONENTIATION

described in Section 31.6. To analyze the running time of these operations, assume that the public key (e, n) and secret key (d, n) satisfy lg e = O(1), lg d ≤ β, and lg n ≤ β. Then, applying a public key requires O(1) modular multiplications and uses O(β^2) bit operations. Applying a secret key requires O(β) modular multiplications, using O(β^3) bit operations.

Theorem 31.36 (Correctness of RSA)

The RSA equations (31.37) and (31.38) define inverse transformations

of ℤ n satisfying equations (31.35) and (31.36).

Proof From equations (31.37) and (31.38), we have that for any M ∈ ℤ_n,

P(S(M)) = S(P(M)) ≡ M^{ed} (mod n).

Since e and d are multiplicative inverses modulo ϕ(n) = (p − 1)(q − 1),

ed = 1 + k(p − 1)(q − 1)

for some integer k. But then, if M ≢ 0 (mod p), we have

M^{ed} ≡ M (M^{p−1})^{k(q−1)}            (mod p)
       ≡ M ((M mod p)^{p−1})^{k(q−1)}    (mod p)
       ≡ M (1)^{k(q−1)}                  (mod p)   (by Theorem 31.31)
       ≡ M                               (mod p).

Also, M^{ed} ≡ M (mod p) if M ≡ 0 (mod p). Thus,

M^{ed} ≡ M (mod p)

for all M. Similarly,

M^{ed} ≡ M (mod q)

for all M. Thus, by Corollary 31.29 to the Chinese remainder theorem,

M^{ed} ≡ M (mod n)

for all M.
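The theorem can be checked exhaustively for a small modulus; note that the loop below includes messages with M ≡ 0 (mod p) or M ≡ 0 (mod q), the cases handled separately in the proof (the values p = 5, q = 11, e = 3 are mine, chosen for illustration):

```python
p, q = 5, 11
n, phi = p * q, (p - 1) * (q - 1)   # n = 55, phi(n) = 40
e = 3                                # gcd(3, 40) == 1
d = pow(e, -1, phi)                  # so e * d = 1 + k * (p - 1)(q - 1)
assert (e * d) % phi == 1
for M in range(n):                   # every message in Z_55
    assert pow(M, e * d, n) == M     # M^{ed} = M (mod n)
```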

The security of the RSA cryptosystem rests in large part on the difficulty of factoring large integers. If an adversary can factor the

modulus n in a public key, then the adversary can derive the secret key

from the public key, using the knowledge of the factors p and q in the

same way that the creator of the public key used them. Therefore, if

factoring large integers is easy, then breaking the RSA cryptosystem is

easy. The converse statement, that if factoring large integers is hard,

then breaking RSA is hard, is unproven. After two decades of research,

however, no easier method has been found to break the RSA public-key

cryptosystem than to factor the modulus n. And factoring large integers

is surprisingly difficult. By randomly selecting and multiplying together

two 1024-bit primes, you can create a public key that cannot be

“broken” in any feasible amount of time with current technology. In the

absence of a fundamental breakthrough in the design of number-

theoretic algorithms, and when implemented with care following

recommended standards, the RSA cryptosystem is capable of providing

a high degree of security in applications.

In order to achieve security with the RSA cryptosystem, however,

you should use integers that are quite long—more than 1000 bits—to

resist possible advances in the art of factoring. In 2021, RSA moduli are

commonly in the range of 2048 to 4096 bits. To create moduli of such

sizes, you must find large primes efficiently. Section 31.8 addresses this problem.

For efficiency, RSA is often used in a “hybrid” or “key-

management” mode with fast cryptosystems that are not public-key

cryptosystems. With such a symmetric-key system, the encryption and

decryption keys are identical. If Alice wishes to send a long message M

to Bob privately, she selects a random key K for the fast symmetric-key

cryptosystem and encrypts M using K, obtaining ciphertext C, where C

is as long as M, but K is quite short. Then she encrypts K using Bob’s public RSA key. Since K is short, computing PB( K) is fast (much faster than computing PB( M)). She then transmits ( C, PB( K)) to Bob, who decrypts PB( K) to obtain K and then uses K to decrypt C, obtaining M.

A similar hybrid approach creates digital signatures efficiently. This

approach combines RSA with a public collision-resistant hash function h

—a function that is easy to compute but for which it is computationally infeasible to find two messages M and M′ such that h( M) = h( M′). The value h( M) is a short (say, 256-bit) “fingerprint” of the message M. If Alice wishes to sign a message M, she first applies h to M to obtain the fingerprint h( M), which she then encrypts with her secret key. She sends ( M, SA( h( M))) to Bob as her signed version of M. Bob can verify the signature by computing h( M) and verifying that PA applied to SA( h( M)) as received equals h( M). Because no one can create two messages with the same fingerprint, it is computationally infeasible to

alter a signed message and preserve the validity of the signature.
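A sketch of the hash-and-sign idea with SHA-256 playing the role of h and the toy keys from before (reducing the 256-bit fingerprint modulo the tiny n is only to make the toy arithmetic work; real schemes use moduli far larger than the digest, together with padding such as PSS):

```python
import hashlib

def fingerprint(message, n):
    """h(M) as an integer, reduced mod n so the toy RSA keys can sign it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(secret_key, message):
    d, n = secret_key
    return pow(fingerprint(message, n), d, n)            # S_A(h(M))

def verify(public_key, message, sigma):
    e, n = public_key
    return pow(sigma, e, n) == fingerprint(message, n)   # P_A(sigma) == h(M)?

P, S = (17, 3233), (2753, 3233)   # toy keys from the earlier example
sigma = sign(S, b"electronic check for $100")
assert verify(P, b"electronic check for $100", sigma)
assert not verify(P, b"electronic check for $100", (sigma + 1) % 3233)
```

The pair sent to Bob is (M, sign(S, M)); only the short fingerprint, not M itself, goes through the expensive secret-key operation.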

One way to distribute public keys uses certificates. For example,

assume that there is a “trusted authority” T whose public key is known

by everyone. Alice can obtain from T a signed message (her certificate)

stating that “Alice’s public key is PA.” This certificate is “self-

authenticating” since everyone knows PT. Alice can include her

certificate with her signed messages, so that the recipient has Alice’s

public key immediately available in order to verify her signature.

Because her key was signed by T, the recipient knows that Alice’s key is

really Alice’s.

Exercises

31.7-1

Consider an RSA key set with p = 11, q = 29, n = 319, and e = 3. What value of d should be used in the secret key? What is the encryption of

the message M = 100?

31.7-2

Prove that if Alice’s public exponent e is 3 and an adversary obtains Alice’s secret exponent d, where 0 < d < ϕ( n), then the adversary can factor Alice’s modulus n in time polynomial in the number of bits in n.

(Although you are not asked to prove it, you might be interested to

know that this result remains true even if the condition e = 3 is

removed. See Miller [327].)


31.7-3

Prove that RSA is multiplicative in the sense that

PA(M_1) PA(M_2) ≡ PA(M_1 M_2) (mod n).

Use this fact to prove that if an adversary had a procedure that could

efficiently decrypt 1% of messages from ℤ n encrypted with PA, then the

adversary could employ a probabilistic algorithm to decrypt every

message encrypted with PA with high probability.

★ 31.8 Primality testing

This section shows how to find large primes. We begin with a discussion

of the density of primes, proceed to examine a plausible, but incomplete,

approach to primality testing, and then present an effective randomized

primality test due to Miller and Rabin.

The density of prime numbers

Many applications, such as cryptography, call for finding large

“random” primes. Fortunately, large primes are not too rare, so that it is

feasible to test random integers of the appropriate size until you find

one that is prime. The prime distribution function π( n) specifies the number of primes that are less than or equal to n. For example, π(10) =

4, since there are 4 prime numbers less than or equal to 10, namely, 2, 3,

5, and 7. The prime number theorem gives a useful approximation to

π( n).

Theorem 31.37 (Prime number theorem)

lim_{n→∞} π(n) / (n / ln n) = 1.

The approximation n/ln n gives reasonably accurate estimates of π(n) even for small n. For example, it is off by less than 6% at n = 10^9, where

π(n) = 50,847,534 and n/ln n ≈ 48,254,942. (To a number theorist, 10^9 is a small number.)

The process of randomly selecting an integer n and determining

whether it is prime is really just a Bernoulli trial (see Section C.4). By the prime number theorem, the probability of a success—that is, the

probability that n is prime—is approximately 1/ln n. The geometric distribution says how many trials must occur to obtain a success, and by

equation (C.36) on page 1197, the expected number of trials is

approximately ln n. Thus, in order to find a prime that has the same length as n by testing integers chosen randomly near n, the expected number examined would be approximately ln n. For example, the

expectation is that finding a 1024-bit prime would require testing

approximately ln 2^1024 ≈ 710 randomly chosen 1024-bit numbers for

primality. (Of course, to cut this figure in half, choose only odd

integers.)
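These estimates are quick to check numerically (a sketch; math.log is the natural logarithm):

```python
import math

# Expected number of random 1024-bit candidates to test before finding a prime:
trials = 1024 * math.log(2)     # ln 2^1024 = 1024 * ln 2, about 709.78
assert round(trials) == 710

# The prime number theorem's estimate at n = 10^9 is off by less than 6%:
pi_1e9 = 50_847_534             # pi(10^9), from the text
estimate = 1e9 / math.log(1e9)  # approximately 48,254,942
assert abs(estimate - pi_1e9) / pi_1e9 < 0.06
```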

The remainder of this section shows how to determine whether a

large odd integer n is prime. For notational convenience, we assume that

n has the prime factorization

n = p_1^{e_1} p_2^{e_2} ⋯ p_r^{e_r},

where r ≥ 1, p_1, p_2, …, p_r are the prime factors of n, and e_1, e_2, …, e_r are positive integers. The integer n is prime if and only if r = 1 and e_1 = 1. One simple approach to the problem of testing for primality is trial division: try dividing n by each integer 2, 3, 5, 7, 9, …, ⌊√n⌋, skipping even integers greater than 2. We can conclude that n is prime if and only if none of the trial divisors divides n. Assuming that each trial division takes constant time, the worst-case running time is Θ(√n), which is exponential in the length of n. (Recall that if n is encoded in binary using β bits, then β = ⌈lg(n + 1)⌉, and so √n ≈ 2^{β/2}.) Thus, trial

division works well only if n is very small or happens to have a small prime factor. When it works, trial division has the advantage that it not

only determines whether n is prime or composite, it also determines one

of n’s prime factors if n is composite.
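A direct transcription of trial division (a Python sketch; the function name is mine). As the text notes, when n is composite it returns a prime factor as a side benefit:

```python
def trial_division(n):
    """Return the smallest prime factor of n, or None if n is prime.
    Trial divisors are 2, then the odd integers 3, 5, 7, 9, ... up to sqrt(n)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d                  # smallest divisor > 1 is necessarily prime
        d = 3 if d == 2 else d + 2    # skip even integers greater than 2
    return None                       # no divisor up to sqrt(n): n is prime
```

For example, trial_division(561) returns 3, while trial_division(101) returns None.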


This section focuses on finding out whether a given number n is

prime. If n is composite, we won’t worry about finding its prime

factorization. Computing the prime factorization of a number is

computationally expensive. You might be surprised that it turns out to

be much easier to ascertain whether a given number is prime than it is

to determine the prime factorization of the number if it is not prime.

Pseudoprimality testing

We’ll start with a method for primality testing that “almost works” and,

in fact, is good enough for many practical applications. Later on, we’ll

refine this method to remove the small defect. Let ℤ_n^+ denote the nonzero elements of ℤ_n:

ℤ_n^+ = {1, 2, …, n − 1}.

If n is prime, then ℤ_n^+ = ℤ*_n.

We say that n is a base-a pseudoprime if n is composite and

a^{n−1} ≡ 1 (mod n).    (31.39)

Fermat’s theorem (Theorem 31.31 on page 932) implies that if n is prime, then n satisfies equation (31.39) for every a in ℤ_n^+. Thus, if there is any a ∈ ℤ_n^+ such that n does not satisfy equation (31.39), then n is certainly composite. Surprisingly, the converse almost holds, so that this

criterion forms an almost perfect test for primality. Instead of trying

every value of a ∈ ℤ_n^+, test to see whether n satisfies equation (31.39) for

just a = 2. If not, then declare n to be composite by returning COMPOSITE. Otherwise, return PRIME, guessing that n is prime

(when, in fact, all we know is that n is either prime or a base-2

pseudoprime).

The procedure PSEUDOPRIME on the next page pretends in this

manner to check whether n is prime. It uses the procedure MODULAR-

EXPONENTIATION from Section 31.6. It assumes that the input n is an odd integer greater than 2. This procedure can make errors, but only

of one type. That is, if it says that n is composite, then it is always correct. If it says that n is prime, however, then it makes an error only if

n is a base-2 pseudoprime.


How often does PSEUDOPRIME err? Surprisingly rarely. There are

only 22 values of n less than 10,000 for which it errs, the first four of

which are 341, 561, 645,

PSEUDOPRIME(n)
1  if MODULAR-EXPONENTIATION(2, n − 1, n) ≢ 1 (mod n)
2      return COMPOSITE    // definitely
3  else return PRIME       // we hope!

and 1105. We won’t prove it, but the probability that this program

makes an error on a randomly chosen β-bit number goes to 0 as β

approaches ∞. Using more precise estimates due to Pomerance [361] of the number of base-2 pseudoprimes of a given size, a randomly chosen

512-bit number that is called prime by PSEUDOPRIME has less than one chance in 10^20 of being a base-2 pseudoprime, and a randomly chosen 1024-bit number that is called prime has less than one chance in 10^41 of being a base-2 pseudoprime. Thus, if you are merely trying to

find a large prime for some application, for all practical purposes you

almost never go wrong by choosing large numbers at random until one

of them causes PSEUDOPRIME to return PRIME. But when the

numbers being tested for primality are not randomly chosen, you might

need a better approach for testing primality. As we’ll see, a little more

cleverness, and some randomization, will yield a primality-testing

method that works well on all inputs.
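PSEUDOPRIME is only a few lines in Python, and a brute-force scan (with trial division as ground truth) recovers the first four numbers on which it errs (function names are mine):

```python
def pseudoprime(n):
    """Base-2 Fermat test for odd n > 2: 'COMPOSITE' is definite, 'PRIME' is hoped."""
    if pow(2, n - 1, n) != 1:    # equation (31.39) fails for a = 2
        return 'COMPOSITE'       # definitely
    return 'PRIME'               # we hope!

def is_prime_slow(n):
    """Ground truth by trial division, for checking the test's errors."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

fooled = [n for n in range(3, 1200, 2)
          if pseudoprime(n) == 'PRIME' and not is_prime_slow(n)]
assert fooled == [341, 561, 645, 1105]   # the first four base-2 pseudoprimes
```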

Since PSEUDOPRIME checks equation (31.39) for only a = 2, you

might think that you could eliminate all the errors by simply checking

equation (31.39) for a second base number, say a = 3. Better yet, you

could check equation (31.39) for even more values of a. Unfortunately,

even checking for several values of a does not eliminate all errors, because there exist composite integers n, known as Carmichael numbers,

that satisfy equation (31.39) for all a ∈ ℤ*_n. (The equation does fail when gcd(a, n) > 1—that is, when a ∉ ℤ*_n—but demonstrating that n is composite by finding such an a can be difficult if n has only large prime factors.) The first three Carmichael numbers are 561, 1105, and 1729.


Carmichael numbers are extremely rare. For example, only 255 of them

are less than 100,000,000. Exercise 31.8-2 helps explain why they are so

rare.

Let’s see how to improve the primality test so that Carmichael

numbers won’t fool it.

The Miller-Rabin randomized primality test

The Miller-Rabin primality test overcomes the problems of the simple

procedure PSEUDOPRIME with two modifications:

It tries several randomly chosen base values a instead of just one

base value.

While computing each modular exponentiation, it looks for a

nontrivial square root of 1, modulo n, during the final set of

squarings. If it finds one, it stops and returns COMPOSITE.

Corollary 31.35 from Section 31.6 justifies detecting composites in this manner.

The pseudocode for the Miller-Rabin primality test appears in the

procedures MILLER-RABIN and WITNESS. The input n > 2 to

MILLER-RABIN is the odd number to be tested for primality, and s is

the number of randomly chosen base values from ℤ_n^+ to be tried. The

code uses the random-number generator RANDOM described on page

129: RANDOM(2, n − 2) returns a randomly chosen integer a satisfying 2 ≤ a ≤ n − 2. (This range of values avoids having a ≡ ±1 (mod n).) The call of the auxiliary procedure WITNESS(a, n) returns TRUE if and only if a is a “witness” to the compositeness of n—that is, if it is possible using a to prove (in a manner that we will see) that n is composite. The test WITNESS(a, n) is an extension of, but more effective than, the test in equation (31.39) that formed the basis for PSEUDOPRIME, using a = 2.

Let’s first understand how WITNESS works, and then we’ll see how the Miller-Rabin primality test uses it. Let n − 1 = 2^t u, where t ≥ 1 and u is odd. That is, the binary representation of n − 1 is the binary representation of the odd integer u followed by exactly t zeros.


Therefore, a^{n−1} = (a^u)^{2^t}, so that one way to compute a^{n−1} mod n is to first compute a^u mod n and then square the result t times successively.

MILLER-RABIN(n, s)    // n > 2 is odd
1  for j = 1 to s
2      a = RANDOM(2, n − 2)
3      if WITNESS(a, n)
4          return COMPOSITE    // definitely
5  return PRIME                // almost surely

WITNESS(a, n)
1  let t and u be such that t ≥ 1, u is odd, and n − 1 = 2^t u
2  x_0 = MODULAR-EXPONENTIATION(a, u, n)
3  for i = 1 to t
4      x_i = x_{i−1}^2 mod n
5      if x_i == 1 and x_{i−1} ≠ 1 and x_{i−1} ≠ n − 1
6          return TRUE    // found a nontrivial square root of 1
7  if x_t ≠ 1
8      return TRUE        // composite, as in PSEUDOPRIME
9  return FALSE
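A Python transcription of the two procedures (a sketch; variable names follow the pseudocode):

```python
import random

def witness(a, n):
    """Return True if a proves the odd number n > 2 composite."""
    # Line 1: write n - 1 = 2^t * u with t >= 1 and u odd.
    t, u = 0, n - 1
    while u % 2 == 0:
        t, u = t + 1, u // 2
    x = pow(a, u, n)                   # line 2: x_0 = a^u mod n
    for _ in range(t):                 # lines 3-6
        x_prev, x = x, (x * x) % n     # line 4: x_i = x_{i-1}^2 mod n
        if x == 1 and x_prev != 1 and x_prev != n - 1:
            return True                # nontrivial square root of 1 found
    if x != 1:                         # lines 7-8: x_t = a^{n-1} mod n != 1
        return True                    # composite, as in PSEUDOPRIME
    return False                       # line 9

def miller_rabin(n, s):
    """'COMPOSITE' is always correct; 'PRIME' is wrong only if all s draws miss."""
    for _ in range(s):
        a = random.randint(2, n - 2)   # 2 <= a <= n - 2
        if witness(a, n):
            return 'COMPOSITE'         # definitely
    return 'PRIME'                     # almost surely
```

For instance, witness(7, 561) returns True (matching the worked example with the Carmichael number 561), while miller_rabin(97, 10) always returns 'PRIME', since a prime number has no witnesses at all.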

This pseudocode for WITNESS computes a^{n−1} mod n by first computing the value x_0 = a^u mod n in line 2 and then repeatedly squaring the result t times in the for loop of lines 3–6. By induction on i, the sequence x_0, x_1, …, x_t of values computed satisfies the equation x_i ≡ a^{2^i u} (mod n) for i = 0, 1, …, t, so that in particular x_t ≡ a^{n−1} (mod n). After line 4 performs a squaring step, however, the loop will terminate early if

lines 5–6 detect that a nontrivial square root of 1 has just been

discovered. (We’ll explain these tests shortly.) If so, the procedure stops

and returns TRUE. Lines 7–8 return TRUE if the value computed for x_t ≡ a^{n−1} (mod n) is not equal to 1, just as the PSEUDOPRIME procedure returns COMPOSITE in this case. Line 9 returns FALSE if neither line 6 nor line 8 has returned TRUE.

The following lemma proves the correctness of WITNESS.

Lemma 31.38

If WITNESS( a, n) returns TRUE, then a proof that n is composite can be constructed using a as a witness.

Proof If WITNESS returns TRUE from line 8, it’s because line 7 determined that x_t = a^{n−1} mod n ≠ 1. If n is prime, however, Fermat’s theorem (Theorem 31.31) says that a^{n−1} ≡ 1 (mod n) for all a ∈ ℤ*_n. Since ℤ_n^+ = ℤ*_n if n is prime, Fermat’s theorem also says that a^{n−1} ≡ 1 (mod n) for all a ∈ ℤ_n^+. Therefore, n cannot be prime, and the equation a^{n−1} mod n ≠ 1 proves this fact.

If WITNESS returns TRUE from line 6, then it has discovered that x_{i−1} is a nontrivial square root of 1, modulo n, since we have that x_{i−1} ≢ ±1 (mod n) yet x_i ≡ x_{i−1}^2 ≡ 1 (mod n). Corollary 31.35 on page 934 states that only if n is composite can there exist a nontrivial square root of 1, modulo n, so that demonstrating that x_{i−1} is a nontrivial square root of 1, modulo n, proves that n is composite.

Thus, if the call WITNESS( a, n) returns TRUE, then n is surely composite, and the witness a, along with the reason that the procedure

returns TRUE (did it return from line 6 or from line 8?), provides a

proof that n is composite.

Let’s explore an alternative view of the behavior of WITNESS as a function of the sequence X = 〈x_0, x_1, …, x_t〉. We’ll find this view useful later on, when we analyze the error rate of the Miller-Rabin primality test. Note that if x_i = 1 for some 0 ≤ i < t, WITNESS might not compute the rest of the sequence. If it were to do so, however, each value x_{i+1}, x_{i+2}, …, x_t would be 1, so we can consider these positions in the sequence X as being all 1s. There are four cases:


1. X = 〈…, d〉, where d ≠ 1: the sequence X does not end in 1.

Return TRUE in line 8, since a is a witness to the compositeness

of n (by Fermat’s Theorem).

2. X = 〈1, 1, …, 1〉: the sequence X is all 1s. Return FALSE, since a is not a witness to the compositeness of n.

3. X = 〈…, –1, 1, …, 1〉: the sequence X ends in 1, and the last non-

1 is equal to –1. Return FALSE, since a is not a witness to the

compositeness of n.

4. X = 〈…, d, 1, …, 1〉, where d ≠ ±1: the sequence X ends in 1, but the last non-1 is not –1. Return TRUE in line 6: a is a witness to

the compositeness of n, since d is a nontrivial square root of 1.

Now, let’s examine the Miller-Rabin primality test based on how it

uses the WITNESS procedure. As before, assume that n is an odd

integer greater than 2.

The procedure MILLER-RABIN is a probabilistic search for a

proof that n is composite. The main loop (beginning on line 1) picks up to s random values of a from ℤ_n^+, except for 1 and n − 1 (line 2). If it picks a value of a that is a witness to the compositeness of n, then MILLER-RABIN returns COMPOSITE on line 4. Such a result is

always correct, by the correctness of WITNESS. If MILLER-RABIN

finds no witness in s trials, then the procedure assumes that it found no

witness because no witnesses exist, and therefore it assumes that n is prime. We’ll see that this result is likely to be correct if s is large enough, but there is still a tiny chance that the procedure could be unlucky in its

choice of s random values of a, so that even though the procedure failed to find a witness, at least one witness exists.

To illustrate the operation of MILLER-RABIN, let n be the Carmichael number 561, so that n − 1 = 560 = 2^4 · 35, t = 4, and u = 35. If the procedure chooses a = 7 as a base, the column for b = 35 in Figure 31.4 (Section 31.6) shows that WITNESS computes x_0 = a^35 ≡ 241 (mod 561). Because of how the MODULAR-EXPONENTIATION procedure operates recursively on its parameter b, the first four columns in Figure 31.4 represent the factor 2^4 of 560—the rightmost four zeros in the binary representation of 560—reading these four zeros from right to left in the binary representation. Thus WITNESS computes the sequence X = 〈241, 298, 166, 67, 1〉. Then, in the last squaring step, WITNESS discovers that a^280 is a nontrivial square root of 1, since a^280 ≡ 67 (mod n) and (a^280)^2 = a^560 ≡ 1 (mod n). Therefore, a = 7 is a witness to the compositeness of n, WITNESS(7, n) returns TRUE, and

MILLER-RABIN returns COMPOSITE.
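The sequence X for this example is easy to reproduce (a sketch; the numbers match the text):

```python
n, a = 561, 7
t, u = 4, 35                        # n - 1 = 560 = 2^4 * 35
X = [pow(a, u, n)]                  # x_0 = 7^35 mod 561
for _ in range(t):
    X.append((X[-1] * X[-1]) % n)   # square t = 4 times
assert X == [241, 298, 166, 67, 1]
# x_3 = 67 is a nontrivial square root of 1 modulo 561: 67 is neither 1 nor 560,
# yet 67^2 mod 561 == 1, so a = 7 is a witness that 561 is composite.
```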

If n is a β-bit number, MILLER-RABIN requires O(sβ) arithmetic operations and O(sβ^3) bit operations, since it requires asymptotically no more work than s modular exponentiations.

Error rate of the Miller-Rabin primality test

If MILLER-RABIN returns PRIME, then there is a very slim chance

that it has made an error. Unlike PSEUDOPRIME, however, the

chance of error does not depend on n: there are no bad inputs for this

procedure. Rather, it depends on the size of s and the “luck of the draw”

in choosing base values a. Moreover, since each test is more stringent

than a simple check of equation (31.39), we can expect on general

principles that the error rate should be small for randomly chosen

integers n. The following theorem presents a more precise argument.

Theorem 31.39

If n is an odd composite number, then the number of witnesses to the

compositeness of n is at least ( n – 1)/2.

Proof The proof shows that the number of nonwitnesses is at most ( n − 1)/2, which implies the theorem.

We start by claiming that any nonwitness must be a member of Z*_n. Why? Consider any nonwitness a. It must satisfy a^( n−1) = 1 (mod n) or, equivalently, a · a^( n−2) = 1 (mod n). Thus the equation ax = 1 (mod n) has a solution, namely a^( n−2). By Corollary 31.21 on page 924, gcd( a, n) | 1, which in turn implies that gcd( a, n) = 1. Therefore, a is a member of Z*_n, and all nonwitnesses belong to Z*_n.


To complete the proof, we show that not only are all nonwitnesses contained in Z*_n, they are all contained in a proper subgroup B of Z*_n (recall that B is a proper subgroup of Z*_n when B is a subgroup of Z*_n but B is not equal to Z*_n). By Corollary 31.16 on page 921, we then have | B| ≤ |Z*_n|/2. Since |Z*_n| ≤ n − 1, we obtain | B| ≤ ( n − 1)/2. Therefore, if all nonwitnesses are contained in a proper subgroup of Z*_n, then the number of nonwitnesses is at most ( n − 1)/2, so that the number of witnesses must be at least ( n − 1)/2.

To find a proper subgroup B of Z*_n containing all of the nonwitnesses, we consider two cases.

Case 1: There exists an x ∈ Z*_n such that

x^( n−1) ≠ 1 (mod n).

In other words, n is not a Carmichael number. Since, as noted earlier,

Carmichael numbers are extremely rare, case 1 is the more typical case

(e.g., when n has been chosen randomly and is being tested for

primality).

Let B = { b ∈ Z*_n : b^( n−1) = 1 (mod n)}. The set B must be nonempty, since 1 ∈ B. The set B is closed under multiplication modulo n, and so B is a subgroup of Z*_n by Theorem 31.14. Every nonwitness belongs to B, since a nonwitness a satisfies a^( n−1) = 1 (mod n). Since x ∈ Z*_n − B, we have that B is a proper subgroup of Z*_n.

Case 2: For all x ∈ Z*_n,

x^( n−1) = 1 (mod n). (31.40)

In other words, n is a Carmichael number. This case is extremely rare in

practice. Unlike a pseudoprimality test, however, the Miller-Rabin test

can efficiently determine that Carmichael numbers are composite, as

we’re about to see.

In this case, n cannot be a prime power. To see why, suppose to the contrary that n = p^e, where p is a prime and e > 1. We derive a contradiction as follows. Since we assume that n is odd, p must also be odd. Theorem 31.32 on page 933 implies that Z*_n is a cyclic group: it contains a generator g such that ord_n( g) = ϕ( n) = p^( e−1)( p − 1). (The formula for ϕ( n) comes from equation (31.21) on page 920.) By equation (31.40), we have g^( n−1) = 1 (mod n). Then the discrete logarithm theorem (Theorem 31.33 on page 933, taking y = 0) implies that n − 1 = 0 (mod ϕ( n)), or

( p − 1) p^( e−1) | p^e − 1.

This statement is a contradiction for e > 1, since ( p − 1) p^( e−1) is divisible by the prime p, but p^e − 1 is not. Thus n is not a prime power.

Since the odd composite number n is not a prime power, we decompose it into a product n_1 n_2, where n_1 and n_2 are odd numbers greater than 1 that are relatively prime to each other. (There may be several ways to decompose n, and it does not matter which one we choose. For example, if p is any prime dividing n and p^e is the largest power of p that divides n, then we can choose n_1 = p^e and n_2 = n/ n_1.)

Recall that t and u are such that n − 1 = 2^t u, where t ≥ 1 and u is odd, and that for an input a, the procedure WITNESS computes the sequence

X = ⟨ a^u, a^(2 u), a^(4 u), …, a^(2^( t−1) u), a^(2^t u)⟩,

where all computations are performed modulo n.

Let us call a pair ( v, j) of integers acceptable if v ∈ Z*_n, j ∈ {0, 1, …, t}, and

v^(2^j u) = −1 (mod n).

Acceptable pairs certainly exist, since u is odd. Choose v = n − 1 and j = 0, and let u = 2 k + 1, so that ( n − 1)^u = ( n − 1)^(2 k+1). Taking this number modulo n gives ( n − 1)^(2 k+1) = ( n − 1)^(2 k) · ( n − 1) = (−1)^(2 k) · (−1) = −1 (mod n). Thus, ( n − 1, 0) is an acceptable pair. Now pick the largest possible j such that there exists an acceptable pair ( v, j), and fix v so that ( v, j) is an acceptable pair. Let

B = { x ∈ Z*_n : x^(2^j u) = ±1 (mod n)}.

Since B is closed under multiplication modulo n, it is a subgroup of Z*_n. By Theorem 31.15 on page 921, therefore, | B| divides |Z*_n|. Every nonwitness must be a member of B, since the sequence X produced by a nonwitness must either be all 1s or else contain a −1 no later than the jth position, by the maximality of j. (If ( a, j′) is acceptable, where a is a nonwitness, we must have j′ ≤ j by how we chose j.)

We now use the existence of v to demonstrate that there exists a w ∈ Z*_n − B, and hence that B is a proper subgroup of Z*_n. Since v^(2^j u) = −1 (mod n), we also have v^(2^j u) = −1 (mod n_1) by Corollary 31.29 to the Chinese remainder theorem. By Corollary 31.28, there exists a w simultaneously satisfying the equations

w = v (mod n_1),
w = 1 (mod n_2).

Therefore,

w^(2^j u) = −1 (mod n_1),
w^(2^j u) = 1 (mod n_2).

Corollary 31.29 gives that w^(2^j u) = 1 (mod n) implies w^(2^j u) = 1 (mod n_1), and also that w^(2^j u) = −1 (mod n) implies w^(2^j u) = −1 (mod n_2). Hence, we conclude that w^(2^j u) ≠ ±1 (mod n), and so w ∉ B.

It remains to show that w ∈ Z*_n. We start by working separately modulo n_1 and modulo n_2. Working modulo n_1, since v ∈ Z*_n, we have that gcd( v, n) = 1. Also, we have gcd( v, n_1) = 1, since if v does not have any common divisors with n, then it certainly does not have any common divisors with n_1. Since w = v (mod n_1), we see that gcd( w, n_1) = 1. Working modulo n_2, we have that w = 1 (mod n_2) implies gcd( w, n_2) = 1 by Exercise 31.2-3. Since gcd( w, n_1) = 1 and gcd( w, n_2) = 1, Theorem 31.6 on page 908 yields gcd( w, n_1 n_2) = gcd( w, n) = 1. That is, w ∈ Z*_n.

Therefore, we have w ∈ Z*_n − B, and we can conclude in case 2 that B, which includes all nonwitnesses, is a proper subgroup of Z*_n and therefore has size at most ( n − 1)/2.

In either case, the number of witnesses to the compositeness of n is at

least ( n – 1)/2.
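For small n, the bound of Theorem 31.39 is easy to confirm by brute force. The following Python sketch (the helper `is_witness` is illustrative, not the book's pseudocode) counts the witnesses for the Carmichael number 561:

```python
# A brute-force check of Theorem 31.39 for the Carmichael number n = 561
# (illustrative code, feasible only for tiny n).
def is_witness(a, n):
    t, u = 0, n - 1
    while u % 2 == 0:                   # n - 1 = 2^t * u with u odd
        t, u = t + 1, u // 2
    x = pow(a, u, n)
    for _ in range(t):
        y = pow(x, 2, n)
        if y == 1 and x != 1 and x != n - 1:
            return True                 # nontrivial square root of 1
        x = y
    return x != 1

n = 561
witnesses = sum(is_witness(a, n) for a in range(1, n))
print(witnesses >= (n - 1) // 2)        # True: at least 280 of the 560 bases
```

In fact the count for 561 comes out far above the guaranteed minimum of ( n − 1)/2 = 280, consistent with the later remark that random composites tend to have few nonwitnesses.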


Theorem 31.40

For any odd integer n > 2 and positive integer s, the probability that MILLER-RABIN( n, s) errs is at most 2^(− s).

Proof By Theorem 31.39, if n is composite, then each execution of the for loop of lines 1–4 of MILLER-RABIN has a probability of at least

1/2 of discovering a witness to the compositeness of n. MILLER-

RABIN makes an error only if it is so unlucky as to miss discovering a

witness to the compositeness of n on each of the s iterations of the main loop. The probability of such a sequence of misses is at most 2^(− s).

If n is prime, MILLER-RABIN always reports PRIME, and if n is

composite, the chance that MILLER-RABIN reports PRIME is at

most 2^(− s).

When applying MILLER-RABIN to a large randomly chosen

integer n, however, we need to consider as well the prior probability that

n is prime, in order to correctly interpret MILLER-RABIN’s result.

Suppose that we fix a bit length β and choose at random an integer n of length β bits to be tested for primality, so that β ≈ lg n ≈ 1.443 ln n. Let A denote the event that n is prime. By the prime number theorem (Theorem 31.37), the probability that n is prime is approximately

Pr { A} ≈ 1/ln n ≈ 1.443/ β.

Now let B denote the event that MILLER-RABIN returns PRIME. We have that Pr{ B̄ | A} = 0 (or equivalently, that Pr{ B | A} = 1) and that Pr{ B | Ā} ≤ 2^(− s) (or equivalently, that Pr{ B̄ | Ā} ≥ 1 − 2^(− s)). But what is Pr{ A | B}, the probability that n is prime, given that MILLER-RABIN has returned PRIME? By the alternate form of Bayes’s theorem (equation (C.20) on page 1189) and approximating Pr{ A} by 1/ln n, we have

Pr{ A | B} = Pr{ A} Pr{ B | A} / (Pr{ A} Pr{ B | A} + Pr{ Ā} Pr{ B | Ā})
          ≈ 1 / (1 + 2^(− s) · (ln n − 1)).

This probability does not exceed 1/2 until s exceeds lg(ln n − 1).

Intuitively, that many initial trials are needed just for the confidence

derived from failing to find a witness to the compositeness of n to overcome the prior bias in favor of n being composite. For a number with β = 1024 bits, this initial testing requires about

lg(ln n – 1) ≈ lg( β/1.443)

≈ 9

trials. In any case, choosing s = 50 should suffice for almost any

imaginable application.
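The arithmetic behind these estimates is simple to reproduce. The sketch below (variable names are illustrative) evaluates the Bayes estimate Pr{A | B} ≈ 1/(1 + 2^(−s)(ln n − 1)) for a 1024-bit number at a few values of s:

```python
import math

# Numerical sketch (illustrative) of the estimate
# Pr{A | B} ~ 1 / (1 + 2^(-s) * (ln n - 1)) for a beta = 1024-bit number.
beta = 1024
ln_n = beta / 1.443                     # ln n ~ beta / lg e
for s in (1, 9, 50):
    prob = 1 / (1 + 2**(-s) * (ln_n - 1))
    print(s, prob)                      # exceeds 1/2 only once s > lg(ln n - 1)

print(math.log2(ln_n - 1))              # about 9.5, hence "about 9" trials
```

For s = 1 the posterior probability of primality is still below 1 percent, near s = 9 or 10 it crosses 1/2, and at s = 50 it is overwhelmingly close to 1, matching the discussion above.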

In fact, the situation is much better. If you are trying to find large

primes by applying MILLER-RABIN to large randomly chosen odd

integers, then choosing a small value of s (say 3) is unlikely to lead to

erroneous results, though we won’t prove it here. The reason is that for a

randomly chosen odd composite integer n, the expected number of

nonwitnesses to the compositeness of n is likely to be considerably

smaller than ( n – 1)/2.

If the integer n is not chosen randomly, however, the best that can be

proven is that the number of nonwitnesses is at most ( n – 1)/4, using an

improved version of Theorem 31.39. Furthermore, there do exist

integers n for which the number of nonwitnesses is ( n – 1)/4.

Exercises

31.8-1

Prove that if an odd integer n > 1 is not a prime or a prime power, then

there exists a nontrivial square root of 1, modulo n.

31.8-2


It is possible to strengthen Euler’s theorem (Theorem 31.30) slightly to the form

a^λ( n) = 1 (mod n) for all a ∈ Z*_n,

where n = p_1^( e_1) p_2^( e_2) ⋯ p_r^( e_r) and λ( n) is defined by

λ( n) = lcm(ϕ( p_1^( e_1)), ϕ( p_2^( e_2)), …, ϕ( p_r^( e_r))).

Prove that λ( n) | ϕ( n). A composite number n is a Carmichael number if λ( n) | n − 1. The smallest Carmichael number is 561 = 3 · 11 · 17, for which λ( n) = lcm(2, 10, 16) = 80, which divides 560. Prove that Carmichael numbers must be both “square-free” (not divisible by the square of any prime) and the product of at least three primes. (For this reason, they are not common.)
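As a quick check of the example, for a square-free n the definition reduces to a single lcm of the ϕ values of the prime factors. A Python sketch (the helper `carmichael_squarefree` is illustrative and handles only square-free n):

```python
from math import lcm

# Illustrative check of the example: for square-free n = p1 * p2 * ... * pk,
# lambda(n) = lcm(phi(p1), ..., phi(pk)) = lcm(p1 - 1, ..., pk - 1).
def carmichael_squarefree(primes):
    return lcm(*(p - 1 for p in primes))

lam = carmichael_squarefree([3, 11, 17])    # n = 561 = 3 * 11 * 17
print(lam, 560 % lam == 0)                  # 80 True: 80 divides 561 - 1
```

The output confirms λ(561) = lcm(2, 10, 16) = 80 and 80 | 560, so 561 satisfies the Carmichael condition.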

31.8-3

Prove that if x is a nontrivial square root of 1, modulo n, then gcd( x − 1, n) and gcd( x + 1, n) are both nontrivial divisors of n.

Problems

31-1 Binary gcd algorithm

Most computers can perform the operations of subtraction, testing the

parity (odd or even) of a binary integer, and halving more quickly than

computing remainders. This problem investigates the binary gcd

algorithm, which avoids the remainder computations used in Euclid’s

algorithm.

a. Prove that if a and b are both even, then gcd( a, b) = 2 · gcd( a/2, b/2).

b. Prove that if a is odd and b is even, then gcd( a, b) = gcd( a, b/2).

c. Prove that if a and b are both odd, then gcd( a, b) = gcd(( a − b)/2, b).

d. Design an efficient binary gcd algorithm for input integers a and b, where a ≥ b, that runs in O(lg a) time. Assume that each subtraction, parity test, and halving takes unit time.
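One way parts (a)-(c) can be combined into an algorithm for part (d) is sketched below in Python; only subtraction, parity tests, and halving (plus one final shift to restore the common factors of 2) are used. The code is an illustrative possible answer, not the book's solution:

```python
# A possible answer to part (d), assembled from parts (a)-(c): only
# subtraction, parity tests, halving, and one final shift are used.
def binary_gcd(a, b):
    if a == 0 or b == 0:
        return a + b                    # gcd(x, 0) = x
    shift = 0
    while a % 2 == 0 and b % 2 == 0:    # part (a): pull out common factors of 2
        a, b, shift = a // 2, b // 2, shift + 1
    while a % 2 == 0:                   # part (b): discard remaining 2s in a
        a //= 2
    while b != 0:                       # invariant: a is odd
        while b % 2 == 0:               # part (b) again, applied to b
            b //= 2
        if a > b:
            a, b = b, a
        b = b - a                       # part (c): both odd, difference is even
    return a << shift

print(binary_gcd(30, 21), binary_gcd(48, 18))   # 3 6
```

Each pass through the outer loop removes at least one bit from b, giving the O(lg a) bound asked for in part (d).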

31-2 Analysis of bit operations in Euclid’s algorithm


a. Consider the ordinary “paper and pencil” algorithm for long

division: dividing a by b, which yields a quotient q and remainder r.

Show that this method requires O((1 + lg q) lg b) bit operations.

b. Define μ( a, b) = (1 + lg a)(1 + lg b). Show that the number of bit operations performed by EUCLID in reducing the problem of

computing gcd( a, b) to that of computing gcd( b, a mod b) is at most c( μ( a, b) – μ( b, a mod b)) for some sufficiently large constant c > 0.

c. Show that EUCLID( a, b) requires O( μ( a, b)) bit operations in general and O( β^2) bit operations when applied to two β-bit inputs.

31-3 Three algorithms for Fibonacci numbers

This problem compares the efficiency of three methods for computing

the n th Fibonacci number Fn, given n. Assume that the cost of adding, subtracting, or multiplying two numbers is O(1), independent of the size

of the numbers.

a. Show that the running time of the straightforward recursive method

for computing Fn based on recurrence (3.31) on page 69 is exponential

in n. (See, for example, the FIB procedure on page 751.)

b. Show how to compute Fn in O( n) time using memoization.

c. Show how to compute Fn in O(lg n) time using only integer addition and multiplication. ( Hint: Consider the matrix

1 1
1 0

and its powers.)

d. Assume now that adding two β-bit numbers takes Θ( β) time and that multiplying two β-bit numbers takes Θ( β^2) time. What is the running

time of these three methods under this more reasonable cost measure

for the elementary arithmetic operations?
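A sketch of the approach in part (c): repeated squaring of the 2 × 2 matrix [[1, 1], [1, 0]] yields F_n in O(lg n) matrix multiplications, each using O(1) integer additions and multiplications. (The helper names below are illustrative.)

```python
# A sketch of part (c): compute F_n by repeated squaring of the 2x2 matrix
# [[1, 1], [1, 0]], whose nth power is [[F_{n+1}, F_n], [F_n, F_{n-1}]].
def mat_mul(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def fib(n):
    result = [[1, 0], [0, 1]]           # 2x2 identity matrix
    base = [[1, 1], [1, 0]]
    while n > 0:                        # repeated squaring, as in Section 31.6
        if n % 2 == 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n //= 2
    return result[0][1]

print([fib(i) for i in range(10)])      # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Under the cost measure of part (d), however, the numbers themselves grow to Θ(n) bits, so the matrix method is no longer a clear winner once multiplication costs Θ(β^2).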

31-4 Quadratic residues

Let p be an odd prime. A number a ∈ Z*_p is a quadratic residue modulo p if the equation x^2 = a (mod p) has a solution for the unknown x.

a. Show that there are exactly ( p – 1)/2 quadratic residues, modulo p.


b. If p is prime, we define the Legendre symbol ( a/p), for a ∈ Z*_p, to be 1 if a is a quadratic residue, modulo p, and −1 otherwise. Prove that if a ∈ Z*_p, then

( a/p) = a^(( p−1)/2) (mod p).

Give an efficient algorithm that determines whether a given number a

is a quadratic residue, modulo p. Analyze the efficiency of your

algorithm.

c. Prove that if p is a prime of the form 4 k + 3 and a is a quadratic residue in Z*_p, then a^( k+1) mod p is a square root of a, modulo p. How much time is required to find the square root of a quadratic residue a,

modulo p?

d. Describe an efficient randomized algorithm for finding a

nonquadratic residue, modulo an arbitrary prime p, that is, a member

of Z*_p that is not a quadratic residue. How many arithmetic operations

does your algorithm require on average?
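Parts (b) and (c) can be sketched directly: Euler's criterion a^((p−1)/2) mod p decides residuosity, and for p = 4k + 3 the exponent k + 1 equals (p + 1)/4. (The function names below are illustrative, and the code is a sketch rather than a full solution.)

```python
# Sketch of parts (b) and (c); names are illustrative. Euler's criterion:
# for a in Z*_p, a^((p-1)/2) mod p is 1 exactly when a is a quadratic residue.
def is_qr(a, p):
    return pow(a, (p - 1) // 2, p) == 1

def sqrt_mod(a, p):
    """Square root of a residue a modulo a prime p of the form 4k + 3."""
    assert p % 4 == 3 and is_qr(a, p)
    return pow(a, (p + 1) // 4, p)      # (p + 1)/4 = k + 1

p = 23                                  # 23 = 4*5 + 3
r = sqrt_mod(13, p)                     # 13 = 6^2 mod 23 is a residue
print(r, (r * r) % p)                   # squaring r recovers 13
```

Both functions run in O(lg p) multiplications via modular exponentiation, which answers the efficiency questions in parts (b) and (c).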

Chapter notes

Knuth [260] contains a good discussion of algorithms for finding the greatest common divisor, as well as other basic number-theoretic

algorithms. Dixon [121] gives an overview of factorization and primality testing. Bach [33], Riesel [378], and Bach and Shallit [34] provide overviews of the basics of computational number theory; Shoup [411]

provides a more recent survey. The conference proceedings edited by

Pomerance [362] contains several excellent survey articles.

Knuth [260] discusses the origin of Euclid’s algorithm. It appears in Book 7, Propositions 1 and 2, of the Greek mathematician Euclid’s

Elements, which was written around 300 B.C.E. Euclid’s description

may have been derived from an algorithm due to Eudoxus around 375

B.C.E. Euclid’s algorithm may hold the honor of being the oldest

nontrivial algorithm, rivaled only by an algorithm for multiplication

known to the ancient Egyptians. Shallit [407] chronicles the history of the analysis of Euclid’s algorithm.

Knuth attributes a special case of the Chinese remainder theorem

(Theorem 31.27) to the Chinese mathematician Sun-Tsŭ, who lived

sometime between 200 B.C.E. and 200 C.E.—the date is quite

uncertain. The same special case was given by the Greek mathematician

Nichomachus around 100 C.E. It was generalized by Qin Jiushao in

1247. The Chinese remainder theorem was finally stated and proved in

its full generality by L. Euler in 1734.

The randomized primality-testing algorithm presented here is due to

Miller [327] and Rabin [373] and is the fastest randomized primality-testing algorithm known, to within constant factors. The proof of

Theorem 31.40 is a slight adaptation of one suggested by Bach [32]. A proof of a stronger result for MILLER-RABIN was given by Monier

[332, 333]. For many years primality-testing was the classic example of a problem where randomization appeared to be necessary to obtain an

efficient (polynomial-time) algorithm. In 2002, however, Agrawal,

Kayal, and Saxena [4] surprised everyone with their deterministic polynomial-time primality-testing algorithm. Until then, the fastest

deterministic primality testing algorithm known, due to Cohen and

Lenstra [97], ran in (lg n)^( O(lg lg lg n)) time on input n, which is just slightly superpolynomial. Nonetheless, for practical purposes,

randomized primality-testing algorithms remain more efficient and are

generally preferred.

Beauchemin, Brassard, Crépeau, Goutier, and Pomerance [40] nicely

discuss the problem of finding large “random” primes.

The concept of a public-key cryptosystem is due to Diffie and

Hellman [115]. The RSA cryptosystem was proposed in 1977 by Rivest, Shamir, and Adleman [380]. Since then, the field of cryptography has blossomed. Our understanding of the RSA cryptosystem has deepened,

and modern implementations use significant refinements of the basic

techniques presented here. In addition, many new techniques have been

developed for proving cryptosystems to be secure. For example,

Goldwasser and Micali [190] show that randomization can be an effective tool in the design of secure public-key encryption schemes. For


signature schemes, Goldwasser, Micali, and Rivest [191] present a digital-signature scheme for which every conceivable type of forgery is

provably as difficult as factoring. Katz and Lindell [253] provide an overview of modern cryptography.

The best algorithms for factoring large numbers have a running time

that grows roughly exponentially with the cube root of the length of the

number n to be factored. The general number-field sieve factoring

algorithm (as developed by Buhler, Lenstra, and Pomerance [77] as an extension of the ideas in the number-field sieve factoring algorithm by

Pollard [360] and Lenstra et al. [295] and refined by Coppersmith [102]

and others) is perhaps the most efficient such algorithm in general for

large inputs. Although it is difficult to give a rigorous analysis of this

algorithm, under reasonable assumptions we can derive a running-time

estimate of L(1/3, n)^(1.902+ o(1)), where

L( α, n) = e^((ln n)^α (ln ln n)^(1−α)).

The elliptic-curve method due to Lenstra [296] may be more effective for some inputs than the number-field sieve method, since it can find a

small prime factor p quite quickly. With this method, the time to find p

is estimated to be L(1/2, p)^(√2 + o(1)).

32 String Matching

Text-editing programs frequently need to find all occurrences of a

pattern in the text. Typically, the text is a document being edited, and

the pattern searched for is a particular word supplied by the user.

Efficient algorithms for this problem—called “string matching”—can

greatly aid the responsiveness of the text-editing program. Among their

many other applications, string-matching algorithms search for

particular patterns in DNA sequences. Internet search engines also use

them to find web pages relevant to queries.

The string-matching problem can be stated formally as follows. The

text is given as an array T[1 : n] of length n, and the pattern is an array P[1 : m] of length m ≤ n. The elements of P and T are characters drawn from an alphabet ∑, which is a finite set of characters. For example, ∑

could be the set {0, 1}, or it could be the set {a, b, …, z}. The

character arrays P and T are often called strings of characters.

As Figure 32.1 shows, pattern P occurs with shift s in text T (or, equivalently, pattern P occurs beginning at position s + 1 in text T) if 0 ≤ s ≤ n − m and T[ s + 1: s + m] = P[1: m], that is, if T[ s + j] = P[ j] for 1 ≤ j ≤ m. If P occurs with shift s in T, then s is a valid shift, and otherwise, s is an invalid shift. The string-matching problem is the problem of finding all valid shifts with which a given pattern P occurs in a given text T.


Figure 32.1 An example of the string-matching problem to find all occurrences of the pattern P

= abaa in the text T = abcabaabcabac. The pattern occurs only once in the text, at shift s =

3, which is a valid shift. A vertical line connects each character of the pattern to its matching character in the text, and all matched characters are shaded blue.

Except for the naive brute-force algorithm in Section 32.1, each string-matching algorithm in this chapter performs some preprocessing

based on the pattern and then finds all valid shifts. We call this latter

phase “matching.” Here are the preprocessing and matching times for

each of the string-matching algorithms in this chapter. The total

running time of each algorithm is the sum of the preprocessing and

matching times:

Algorithm

Preprocessing time Matching time

Naive

0

O(( n − m + 1) m)

Rabin-Karp

Θ( m)

O(( n − m + 1) m)

Finite automaton

O( m |∑|)

Θ( n)

Knuth-Morris-Pratt

Θ( m)

Θ( n)

Suffix array1

O( n lg n)

O( m lg n + km)

Section 32.2 presents an interesting string-matching algorithm, due to Rabin and Karp. Although the Θ(( n − m + 1) m) worst-case running time of this algorithm is no better than that of the naive method, it works

much better on average and in practice. It also generalizes nicely to

other pattern-matching problems. Section 32.3 then describes a string-matching algorithm that begins by constructing a finite automaton

specifically designed to search for occurrences of the given pattern P in

a text. This algorithm takes O( m |∑|) preprocessing time, but only Θ( n) matching time. Section 32.4 presents the similar, but much cleverer, Knuth-Morris-Pratt (or KMP) algorithm, which has the same Θ( n)

matching time, but it reduces the preprocessing time to only Θ( m).

A completely different approach appears in Section 32.5, which examines suffix arrays and the longest common prefix array. You can


use these arrays not only to find a pattern in a text, but also to answer

other questions, such as what is the longest repeated substring in the

text and what is the longest common substring between two texts. The

algorithm to form the suffix array in Section 32.5 takes O( n lg n) time and, given the suffix array, the section shows how to compute the

longest common prefix array in O( n) time.

Notation and terminology

We denote by ∑* (read “sigma-star”) the set of all finite-length strings

formed using characters from the alphabet ∑. This chapter considers

only finite-length strings. The 0-length empty string, denoted ϵ, also belongs to ∑*. The length of a string x is denoted | x|. The concatenation of two strings x and y, denoted xy, has length | x| + | y| and consists of the characters from x followed by the characters from y.

Figure 32.2 A graphical proof of Lemma 32.1. Suppose that x ⊐ z and y ⊐ z. The three parts of the figure illustrate the three cases of the lemma. Vertical lines connect matching regions (shown in blue) of the strings. (a) If | x| ≤ | y|, then x ⊐ y. (b) If | x| ≥ | y|, then y ⊐ x. (c) If | x| = | y|, then x = y.

A string w is a prefix of a string x, denoted w ⊏ x, if x = wy for some string y ∈ ∑*. Note that if w ⊏ x, then | w| ≤ | x|. Similarly, a string w is a suffix of a string x, denoted w ⊐ x, if x = yw for some y ∈ ∑*. As with a prefix, w ⊐ x implies | w| ≤ | x|. For example, ab ⊏ abcca and cca ⊐ abcca. A string w is a proper prefix of x if w ⊏ x and | w| < | x|, and likewise for a proper suffix. The empty string ϵ is both a suffix and a prefix of every string. For any strings x and y and any character a, we have x ⊐ y if and only if xa ⊐ ya. The ⊏ and ⊐ relations are transitive.

The following lemma will be useful later.

Lemma 32.1 (Overlapping-suffix lemma)

Suppose that x, y, and z are strings such that x ⊐ z and y ⊐ z. If | x| ≤ | y|, then x ⊐ y. If | x| ≥ | y|, then y ⊐ x. If | x| = | y|, then x = y.

Proof See Figure 32.2 for a graphical proof.

For convenience, denote the k-character prefix P[1: k] of the pattern P[1: m] by P[: k]. Thus, we can write P[:0] = ϵ and P[: m] = P = P[1: m]. Similarly, denote the k-character prefix of the text T by T[: k]. Using this notation, we can state the string-matching problem as that of finding all shifts s in the range 0 ≤ s ≤ n − m such that P ⊐ T[: s + m].

Our pseudocode allows two equal-length strings to be compared for

equality as a primitive operation. If the strings are compared from left

to right and the comparison stops when a mismatch is discovered, we

assume that the time taken by such a test is a linear function of the

number of matching characters discovered. To be precise, the test “x ==

y” is assumed to take Θ( t) time, where t is the length of the longest string z such that z ⊏ x and z ⊏ y.

32.1 The naive string-matching algorithm

The NAIVE-STRING-MATCHER procedure finds all valid shifts

using a loop that checks the condition P[1: m] = T[ s+1: s+ m] for each of the n − m + 1 possible values of s.

NAIVE-STRING-MATCHER( T, P, n, m)

1 for s = 0 to n − m

2

if P[1: m] == T[ s + 1: s + m]

3

print “Pattern occurs with shift” s

Figure 32.3 portrays the naive string-matching procedure as sliding a

“template” containing the pattern over the text, noting for which shifts

all of the characters on the template equal the corresponding characters

in the text. The for loop of lines 1–3 considers each possible shift

explicitly. The test in line 2 determines whether the current shift is valid.

This test implicitly loops to check corresponding character positions

until all positions match successfully or a mismatch is found. Line 3

prints out each valid shift s.

Procedure NAIVE-STRING-MATCHER takes O(( n − m + 1) m) time, and this bound is tight in the worst case. For example, consider the text string a^n (a string of n a’s) and the pattern a^m. For each of the n − m + 1 possible values of the shift s, the implicit loop on line 2 to compare corresponding characters must execute m times to validate the shift. The worst-case running time is thus Θ(( n − m + 1) m), which is Θ( n^2) if m = ⌊ n/2⌋. Because it requires no preprocessing, NAIVE-STRING-MATCHER’s running time equals its matching time.

NAIVE-STRING-MATCHER is far from an optimal procedure for

this problem. Indeed, this chapter will show that the Knuth-Morris-

Pratt algorithm is much better in the worst case. The naive string-

matcher is inefficient because it entirely ignores information gained

about the text for one value of s when it considers other values of s.

Such information can be quite valuable, however. For example, if P =

aaab and s = 0 is valid, then none of the shifts 1, 2, or 3 are valid, since

T[4] = b. The following sections examine several ways to make effective

use of this sort of information.


Figure 32.3 The operation of the NAIVE-STRING-MATCHER procedure for the pattern P =

aab and the text T = acaabc. Imagine the pattern P as a template that slides next to the text.

(a)–(d) The four successive alignments tried by the naive string matcher. In each part, vertical lines connect corresponding regions found to match (shown in blue), and a red jagged line connects the first mismatched character found, if any. The algorithm finds one occurrence of the pattern, at shift s = 2, shown in part (c).
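A direct Python transcription of NAIVE-STRING-MATCHER, run on the example of Figure 32.3 (0-indexed slices stand in for the book's 1-indexed T[s + 1 : s + m]; returning a list instead of printing is an illustrative choice):

```python
# NAIVE-STRING-MATCHER transcribed into Python; the slice comparison on
# line "if" is the implicit character-by-character loop from the pseudocode.
def naive_string_matcher(T, P):
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):          # each of the n - m + 1 possible shifts
        if T[s:s + m] == P:             # compare P against the current window
            shifts.append(s)
    return shifts

print(naive_string_matcher("acaabc", "aab"))   # [2]
```

The result [2] matches Figure 32.3, where the single occurrence is found at shift s = 2.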

Exercises

32.1-1

Show the comparisons the naive string matcher makes for the pattern P

= 0001 in the text T = 000010001010001.

32.1-2

Suppose that all characters in the pattern P are different. Show how to

accelerate NAIVE-STRING-MATCHER to run in O( n) time on an n-

character text T.

32.1-3

Suppose that pattern P and text T are randomly chosen strings of length m and n, respectively, from the d-ary alphabet ∑_d = {0, 1, …, d − 1}, where d ≥ 2. Show that the expected number of character-to-character comparisons made by the implicit loop in line 2 of the naive algorithm is

( n − m + 1)(1 − d^(− m))/(1 − d^(−1)) ≤ 2( n − m + 1)

over all executions of this loop. (Assume that the naive algorithm stops

comparing characters for a given shift once it finds a mismatch or

matches the entire pattern.) Thus, for randomly chosen strings, the

naive algorithm is quite efficient.


32.1-4

Suppose that the pattern P may contain occurrences of a gap character

♢ that can match an arbitrary string of characters (even one of 0

length). For example, the pattern ab ♢ ba ♢ c occurs in the text

cabccbacbacab in two different ways, with the gap characters matching different substrings of the text.

The gap character may occur an arbitrary number of times in the

pattern but not at all in the text. Give a polynomial-time algorithm to

determine whether such a pattern P occurs in a given text T, and analyze the running time of your algorithm.

32.2 The Rabin-Karp algorithm

Rabin and Karp proposed a string-matching algorithm that performs

well in practice and that also generalizes to other algorithms for related

problems, such as two-dimensional pattern matching. The Rabin-Karp

algorithm uses Θ( m) preprocessing time, and its worst-case running time

is Θ(( n − m + 1) m). Based on certain assumptions, however, its average-case running time is better.

This algorithm makes use of elementary number-theoretic notions

such as the equivalence of two numbers modulo a third number. You

might want to refer to Section 31.1 for the relevant definitions.

For expository purposes, let’s assume that ∑ = {0, 1, 2, …, 9}, so

that each character is a decimal digit. (In the general case, you can

assume that each character is a digit in radix- d notation, so that it has a

numerical value in the range 0 to d – 1, where d = |∑|.) You can then view a string of k consecutive characters as representing a length- k decimal number. For example, the character string 31415 corresponds

to the decimal number 31,415. Because we interpret the input characters


as both graphical symbols and digits, it will be convenient in this section

to denote them as digits in standard text font.

Given a pattern P[1: m], let p denote its corresponding decimal value.

In a similar manner, given a text T[1: n], let ts denote the decimal value of the length- m substring T[ s + 1: s + m], for s = 0, 1, …, n − m. Certainly, ts = p if and only if T[ s + 1: s + m] = P[1: m], and thus, s is a valid shift if and only if ts = p. If you could compute p in Θ( m) time and all the ts values in a total of Θ( n − m + 1) time,2 then you could determine all valid shifts s in Θ( m) + Θ( n − m + 1) = Θ( n) time by comparing p with each of the ts values. (For the moment, let’s not worry about the possibility that p and the ts values might be very large numbers.)

Indeed, you can compute p in Θ( m) time using Horner’s rule (see Problem 2-3):

p = P[ m] + 10( P[ m − 1] + 10( P[ m − 2] + ⋯ + 10( P[2] + 10 P[1]) ⋯)).

Similarly, you can compute t 0 from T[1: m] in Θ( m) time.

To compute the remaining values t 1, t 2, …, t n− m in Θ( n − m) time, observe that you can compute ts+1 from ts in constant time, since

ts+1 = 10( ts − 10^( m−1) T[ s + 1]) + T[ s + m + 1].   (32.1)

Subtracting 10^( m−1) T[ s + 1] removes the high-order digit from ts, multiplying the result by 10 shifts the number left by one digit position, and adding T[ s + m + 1] brings in the appropriate low-order digit. For example, suppose that m = 5, ts = 31415, and the new low-order digit is T[ s + 5 + 1] = 2. The high-order digit to remove is T[ s + 1] = 3, and so

ts+1 = 10 · (31415 − 10000 · 3) + 2 = 14152.

If you precompute the constant 10^( m−1) (which you can do in O(lg m) time using the techniques of Section 31.6, although for this application


a straightforward O( m)-time method suffices), then each execution of equation (32.1) takes a constant number of arithmetic operations. Thus, you can compute p in Θ( m) time, and you can compute all of t 0, t 1, …, t n− m in Θ( n − m + 1) time. Therefore, you can find all occurrences of the pattern P[1: m] in the text T[1: n] with Θ( m) preprocessing time and Θ( n − m + 1) matching time.

This scheme works well if P is short enough and the alphabet ∑ is

small enough that arithmetic operations on p and ts take constant time.

But what if P is long, or if the size of ∑ means that instead of powers of

10 in equation (32.1) you have to use powers of a larger number (such as

powers of 256 for the extended ASCII character set)? Then the values of

p and ts might be too large to work with in constant time. Fortunately, this problem can be solved, as Figure 32.4 shows: compute p and the ts values modulo a suitable modulus q. You can compute p modulo q in Θ( m) time and all the ts values modulo q in Θ( n − m + 1) time. With |∑| = 10, if you choose the modulus q as a prime such that 10 q just fits within one computer word, then you can perform all the necessary computations with single-precision arithmetic. In general, with a d-ary alphabet {0, 1, …, d − 1}, choose q so that dq fits within a computer word and adjust the recurrence equation (32.1) to work modulo q, so that it becomes

ts+1 = ( d( ts − T[ s + 1] h) + T[ s + m + 1]) mod q,


Figure 32.4 The Rabin-Karp algorithm. Each character is a decimal digit. Values are computed modulo 13. (a) A text string. A window of length 5 is shaded blue. The numerical value of the blue number, computed modulo 13, yields the value 7. (b) The same text string with values computed modulo 13 for each possible position of a length-5 window. Assuming the pattern P = 31415, look for windows whose value modulo 13 is 7, since 31415 ≡ 7 (mod 13). The algorithm finds two such windows, shaded blue in the figure. The first, beginning at text position 7, is indeed an occurrence of the pattern. The second window, beginning at text position 13, is a spurious hit. (c) How to compute the value for a window in constant time, given the value for the previous window. The first window has value 31415. Dropping the high-order digit 3, shifting left (multiplying by 10), and then adding in the low-order digit 2 gives the new value 14152. Because all computations are performed modulo 13, the value for the first window is 7, and the value for the new window is 8.

where h = d^{m−1} mod q is the value of the digit "1" in the high-order position of an m-digit text window.
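Equation (32.2) can be checked numerically against Figure 32.4(c). The following is an illustrative sketch with variable names of our own choosing (d is the radix, m the window length, q the modulus):

```python
# Numeric check of equation (32.2) using the values in Figure 32.4(c).
d, m, q = 10, 5, 13
h = pow(d, m - 1, q)        # d^(m-1) mod q = 3: weight of the high-order digit
t_old = 31415 % q           # value of window 31415 modulo 13 -> 7
old_high = 3                # digit T[s+1] leaving the window
new_low = 2                 # digit T[s+m+1] entering the window
t_new = (d * (t_old - old_high * h) + new_low) % q
print(t_old, t_new)         # 7 8, matching Figure 32.4(c)
```

Note that Python's % operator returns a nonnegative result even when the subtraction makes the intermediate value negative, which is exactly what the recurrence needs.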

The solution of working modulo q is not perfect, however: t_s ≡ p (mod q) does not automatically mean that t_s = p. On the other hand, if t_s ≢ p (mod q), then you definitely know that t_s ≠ p, so that shift s is invalid. Thus you can use the test t_s ≡ p (mod q) as a fast heuristic test to rule out invalid shifts. If t_s ≡ p (mod q) (a hit), then you need to test further to see whether s is really valid or you just have a spurious hit. This additional test explicitly checks the condition P[1:m] = T[s + 1 : s + m]. If q is large enough, then you would hope that spurious hits occur infrequently enough that the cost of the extra checking is low.

The procedure RABIN-KARP-MATCHER makes these ideas precise. The inputs to the procedure are the text T, the pattern P, their lengths n and m, the radix d to use (which is typically taken to be |∑|), and the prime q to use. The procedure works as follows. All characters are interpreted as radix-d digits. The subscripts on t are provided only for clarity: the procedure works correctly if all the subscripts are dropped. Line 1 initializes h to the value of the high-order digit position of an m-digit window. Lines 2–6 compute p as the value of P[1:m] mod q and t_0 as the value of T[1:m] mod q. The for loop of lines 7–12 iterates through all possible shifts s, maintaining the following invariant:

Whenever line 8 is executed, t_s = T[s + 1 : s + m] mod q.

If a hit occurs because p = t_s in line 8, then line 9 determines whether s is a valid shift or the hit was spurious via the test P[1:m] == T[s + 1 : s + m]. Line 10 prints out any valid shifts that are found. If s < n − m (checked in line 11), then the for loop will iterate at least one more time, and so line 12 first executes to ensure that the loop invariant holds upon the next iteration. Line 12 computes the value of t_{s+1} mod q from the value of t_s mod q in constant time using equation (32.2) directly.

RABIN-KARP-MATCHER takes Θ(m) preprocessing time, and its matching time is Θ((n − m + 1)m) in the worst case, since (like the naive string-matching algorithm) the Rabin-Karp algorithm explicitly verifies every valid shift. If P = a^m and T = a^n, then verifying takes Θ((n − m + 1)m) time, since each of the n − m + 1 possible shifts is valid.

In many applications, you expect few valid shifts (perhaps some constant c of them). In such applications, the expected matching time of the algorithm is only O((n − m + 1) + cm) = O(n + m), plus the time required to process spurious hits. We can base a heuristic analysis on the assumption that reducing values modulo q acts like a random mapping from ∑* to ℤ_q. The expected number of spurious hits is then O(n/q), because we can estimate the chance that an arbitrary t_s will be equivalent to p, modulo q, as 1/q. Since there are O(n) positions at which the test of line 8 fails (actually, at most n − m + 1 positions) and checking each hit takes O(m) time in line 9, the expected matching time taken by the Rabin-Karp algorithm is

O(n) + O(m(v + n/q)),

where v is the number of valid shifts. This running time is O(n) if v = O(1) and you choose q ≥ m. That is, if the expected number of valid shifts is small (O(1)) and you choose the prime q to be larger than the length of the pattern, then you can expect the Rabin-Karp procedure to use only O(n + m) matching time. Since m ≤ n, this expected matching time is O(n).

RABIN-KARP-MATCHER(T, P, n, m, d, q)
1   h = d^{m−1} mod q
2   p = 0
3   t_0 = 0
4   for i = 1 to m                          // preprocessing
5       p = (dp + P[i]) mod q
6       t_0 = (dt_0 + T[i]) mod q
7   for s = 0 to n − m                      // matching: try all possible shifts
8       if p == t_s                         // a hit?
9           if P[1:m] == T[s + 1 : s + m]   // valid shift?
10              print "Pattern occurs with shift" s
11      if s < n − m
12          t_{s+1} = (d(t_s − T[s + 1]h) + T[s + m + 1]) mod q
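As an illustration, here is a minimal Python sketch of the procedure. It uses 0-indexed strings rather than the book's 1-indexed arrays, and the default choices d = 256 and q = 101 are ours, not prescribed by the text:

```python
def rabin_karp_matcher(T, P, d=256, q=101):
    """Return all shifts s (0-indexed) at which P occurs in T."""
    n, m = len(T), len(P)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)                  # line 1: value of high-order position
    p = t = 0
    for i in range(m):                    # lines 2-6: preprocessing
        p = (d * p + ord(P[i])) % q
        t = (d * t + ord(T[i])) % q
    shifts = []
    for s in range(n - m + 1):            # lines 7-12: try every shift
        if p == t and T[s:s + m] == P:    # hit, then explicit check
            shifts.append(s)
        if s < n - m:                     # roll the window: equation (32.2)
            t = (d * (t - ord(T[s]) * h) + ord(T[s + m])) % q
    return shifts

print(rabin_karp_matcher("3141592653589793", "26"))   # [6]
```

Because the explicit comparison `T[s:s + m] == P` confirms every hit, the procedure reports exactly the valid shifts regardless of how q is chosen; q only affects how often spurious hits trigger the extra check.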

Exercises

32.2-1

Working modulo q = 11, how many spurious hits does the Rabin-Karp

matcher encounter in the text T = 3141592653589793 when looking for

the pattern P = 26?

32.2-2

Describe how to extend the Rabin-Karp method to the problem of

searching a text string for an occurrence of any one of a given set of k

patterns. Start by assuming that all k patterns have the same length.

Then generalize your solution to allow the patterns to have different

lengths.

32.2-3

Show how to extend the Rabin-Karp method to handle the problem of

looking for a given m × m pattern in an n × n array of characters. (The pattern may be shifted vertically and horizontally, but it may not be

rotated.)

32.2-4

Alice has a copy of a long n-bit file A = 〈a_{n−1}, a_{n−2}, …, a_0〉, and Bob similarly has an n-bit file B = 〈b_{n−1}, b_{n−2}, …, b_0〉. Alice and Bob wish to know if their files are identical. To avoid transmitting all of A or B, they use the following fast probabilistic check. Together, they select a prime q > 1000n and randomly select an integer x from {0, 1, …, q − 1}.

Letting

A(x) = (∑_{i=0}^{n−1} a_i x^i) mod q and B(x) = (∑_{i=0}^{n−1} b_i x^i) mod q,

Alice evaluates A(x) and Bob evaluates B(x). Prove that if A ≠ B, there is at most one chance in 1000 that A(x) = B(x), whereas if the two files are the same, A(x) is necessarily the same as B(x). (Hint: See Exercise 31.4-4.)

32.3 String matching with finite automata

Many string-matching algorithms build a finite automaton—a simple

machine for processing information—that scans the text string T for all

occurrences of the pattern P. This section presents a method for

building such an automaton. These string-matching automata are

efficient: they examine each text character exactly once, taking constant

time per text character. The matching time used—after preprocessing

the pattern to build the automaton—is therefore Θ( n). The time to build

the automaton, however, can be large if ∑ is large. Section 32.4 describes a clever way around this problem.

We begin this section with the definition of a finite automaton. We

then examine a special string-matching automaton and show how to use

it to find occurrences of a pattern in a text. Finally, we’ll see how to

construct the string-matching automaton for a given input pattern.

Finite automata

A finite automaton M, illustrated in Figure 32.5, is a 5-tuple ( Q, q 0, A, ∑, δ), where

Q is a finite set of states,

q 0 ∈ Q is the start state,

AQ is a distinguished set of accepting states,

∑ is a finite input alphabet,

δ is a function from Q × ∑ into Q, called the transition function of M.


Figure 32.5 A simple two-state finite automaton with state set Q = {0, 1}, start state q_0 = 0, and input alphabet ∑ = {a, b}. (a) A tabular representation of the transition function δ. (b) An equivalent state-transition diagram. State 1, in orange, is the only accepting state. Directed edges represent transitions. For example, the edge from state 1 to state 0 labeled b indicates that δ(1, b) = 0. This automaton accepts those strings that end in an odd number of a's. More precisely, it accepts a string x if and only if x = yz, where y = ε or y ends with a b, and z = a^k, where k is odd. For example, on input abaaa, including the start state, this automaton enters the sequence of states 〈0, 1, 0, 1, 0, 1〉, and so it accepts this input. For input abbaa, it enters the sequence of states 〈0, 1, 0, 0, 1, 0〉, and so it rejects this input.

The finite automaton begins in state q 0 and reads the characters of

its input string one at a time. If the automaton is in state q and reads

input character a, it moves (“makes a transition”) from state q to state δ( q, a). Whenever its current state q is a member of A, the machine M

has accepted the string read so far. An input that is not accepted is rejected.

A finite automaton M induces a function ϕ, called the final-state function, from ∑* to Q such that ϕ(w) is the state M ends up in after reading the string w. Thus, M accepts a string w if and only if ϕ(w) ∈ A. We define the function ϕ recursively, using the transition function:

ϕ(ε) = q_0,
ϕ(wa) = δ(ϕ(w), a) for w ∈ ∑*, a ∈ ∑.
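The recursive definition of ϕ is just a left fold of δ over the input string. The automaton of Figure 32.5 can be simulated in a few lines of Python; storing δ as a dictionary is a representation of our own choosing:

```python
# Transition table of the two-state automaton in Figure 32.5(a).
delta = {(0, 'a'): 1, (0, 'b'): 0,
         (1, 'a'): 0, (1, 'b'): 0}
accepting = {1}

def phi(w, q0=0):
    """Final-state function: the state reached after reading string w."""
    q = q0
    for a in w:                        # phi(wa) = delta(phi(w), a)
        q = delta[(q, a)]
    return q

print(phi("abaaa") in accepting)       # True: ends in an odd number of a's
print(phi("abbaa") in accepting)       # False
```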

String-matching automata

For a given pattern P, a preprocessing step constructs a string-matching automaton specific to P. The automaton then searches the text string for occurrences of P. Figure 32.6 illustrates the automaton for the pattern P = ababaca. From now on, let's assume that P is fixed, and for brevity, we won't bother to indicate the dependence upon P in our notation.

In order to specify the string-matching automaton corresponding to a given pattern P[1:m], we first define an auxiliary function σ, called the suffix function corresponding to the pattern P. The function σ maps ∑* to {0, 1, …, m} such that σ(x) is the length of the longest prefix of P that is also a suffix of x:

σ(x) = max {k : P[:k] ⊐ x}.    (32.3)

Figure 32.6 (a) A state-transition diagram for the string-matching automaton that accepts all strings ending in the string ababaca. State 0 is the start state, and state 7 (in orange) is the only accepting state. The transition function δ is defined by equation (32.4), and a directed edge from state i to state j labeled a represents δ( i, a) = j. The right-going edges forming the “spine” of the automaton, shown in blue, correspond to successful matches between pattern and input characters. Except for the edges from state 7 to states 1 and 2, the left-going edges correspond to mismatches. Some edges corresponding to mismatches are omitted: by convention, if a state i has no outgoing edge labeled a for some a ∈ ∑, then δ( i, a) = 0. (b) The corresponding transition function δ, and the pattern string P = ababaca. The entries corresponding to successful matches between pattern and input characters are shown in blue. (c) The operation of the automaton on the text T = abababacaba. Under each text character T[ i] appears the state ϕ( T[: i]) that the automaton is in after processing the prefix T[: i]. The substring of the pattern that occurs in the text is highlighted in blue. The automaton finds this one occurrence of the pattern, ending in position 9.

The suffix function σ is well defined since the empty string P[:0] = ε is a suffix of every string. As examples, for the pattern P = ab, we have σ(ε) = 0, σ(ccaca) = 1, and σ(ccab) = 2. For a pattern P of length m, we have σ(x) = m if and only if P ⊐ x. From the definition of the suffix function, x ⊐ y implies σ(x) ≤ σ(y) (see Exercise 32.3-4).
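The definition of σ can be turned into a brute-force Python sketch (quadratic per call and purely illustrative; the automaton construction below avoids computing σ at match time):

```python
def sigma(x, P):
    """Length of the longest prefix of P that is a suffix of x."""
    for k in range(len(P), -1, -1):       # try the longest prefix first
        if k <= len(x) and x.endswith(P[:k]):
            return k
    # unreachable: the empty prefix (k = 0) is a suffix of every string

print(sigma("", "ab"), sigma("ccaca", "ab"), sigma("ccab", "ab"))  # 0 1 2
```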

We are now ready to define the string-matching automaton that corresponds to a given pattern P[1:m]:

The state set Q is {0, 1, …, m}. The start state q_0 is state 0, and state m is the only accepting state.

The transition function δ is defined, for any state q and character a, by

δ(q, a) = σ(P[:q]a).    (32.4)

As the automaton consumes characters of the text T, it is trying to

build a match of the pattern P against the most recently seen characters

of T. At any time, the state number q gives the length of the longest prefix of P that matches the most recently seen text characters.

Whenever the automaton reaches state m, the m most recently seen text

characters match the first m characters of P. Since P has length m, reaching state m means that the m most recently seen text characters match the entire pattern, so that the automaton has found a match.

With this intuition behind the design of the automaton, here is the reasoning behind defining δ(q, a) = σ(P[:q]a). Suppose that the automaton is in state q after reading the first i characters of the text, that is, q = ϕ(T[:i]). The intuitive idea then says that q also equals the length of the longest prefix of P that matches a suffix of T[:i] or, equivalently, that q = σ(T[:i]). Thus, since ϕ(T[:i]) and σ(T[:i]) both equal q, we will see (in Theorem 32.4 on page 973) that the automaton maintains the following invariant:

ϕ(T[:i]) = σ(T[:i]).    (32.5)

If the automaton is in state q and reads the next character T[ i + 1] = a, then the transition should lead to the state corresponding to the longest

prefix of P that is a suffix of T[: i] a. That state is σ( T[: i] a), and equation (32.5) gives ϕ( T[: i] a) = σ( T[: i] a). Because P[: q] is the longest prefix of P

that is a suffix of T[: i], the longest prefix of P that is a suffix of T[: i] a has length not only σ( T[: i] a), but also σ( P[: q] a), and so ϕ( T[: i] a) = σ( P[: q] a).

(Lemma 32.3 on page 972 will prove that σ( T[: i] a) = σ( P[: q] a).) Thus, when the automaton is in state q, the transition function δ on character a should take the automaton to state δ( q, a) = δ( ϕ( T[: i]), a) = ϕ( T[: i] a) =

σ( P[: q] a) (with the last equality following from equation (32.5)).

There are two cases to consider, depending on whether the next character continues to match the pattern. In the first case, a = P[q + 1], so that the character a continues to match the pattern. In this case, because δ(q, a) = q + 1, the transition continues to go along the "spine" of the automaton (the blue edges in Figure 32.6(a)). In the second case, a ≠ P[q + 1], so that a does not extend the match being built. In this case, we need to find the longest prefix of P that is also a suffix of T[:i]a, which will have length at most q. The preprocessing step matches the pattern against itself when creating the string-matching automaton, so that the transition function can quickly identify the longest such smaller prefix of P.

Let’s look at an example. Consider state 5 in the string-matching

automaton of Figure 32.6. In state 5, the five most recently read characters of T are ababa, the characters along the spine of the

automaton that reach state 5. If the next character of T is c, then the

most recently read characters of T are ababac, which is the prefix of P

with length 6. The automaton should continue along the spine to state

6. This is the first case, in which the match continues, and δ(5, c) = 6. To

illustrate the second case, suppose that in state 5, the next character of T

is b, so the most recently read characters of T are ababab. Here, the

longest prefix of P that matches the most recently read characters of T

—that is, a suffix of the portion of T read so far—is abab, with length

4, so δ(5, b) = 4.

To clarify the operation of a string-matching automaton, the simple and efficient procedure FINITE-AUTOMATON-MATCHER simulates the behavior of such an automaton (represented by its transition function δ) in finding occurrences of a pattern P of length m in an input text T[1:n]. As for any string-matching automaton for a pattern of length m, the state set Q is {0, 1, …, m}, the start state is 0, and the only accepting state is state m. From the simple loop structure of FINITE-AUTOMATON-MATCHER, you can see that its matching time on a text string of length n is Θ(n), assuming that each lookup of the transition function δ takes constant time. This matching time, however, does not include the preprocessing time required to compute the transition function. We address this problem later, after first proving that the procedure FINITE-AUTOMATON-MATCHER operates correctly.

FINITE-AUTOMATON-MATCHER(T, δ, n, m)
1  q = 0
2  for i = 1 to n
3      q = δ(q, T[i])
4      if q == m
5          print "Pattern occurs with shift" i − m
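A Python sketch of the matcher follows (0-indexed strings, helper names our own). For illustration, δ is built directly from definition (32.4), δ(q, a) = σ(P[:q]a), rather than by the faster preprocessing procedure described later:

```python
def build_delta(P, alphabet):
    """Transition function per equation (32.4), as a dictionary."""
    def sigma(x):
        for k in range(len(P), -1, -1):
            if x.endswith(P[:k]):      # k = 0 always matches
                return k
    return {(q, a): sigma(P[:q] + a)
            for q in range(len(P) + 1) for a in alphabet}

def finite_automaton_matcher(T, delta, m):
    """Return all shifts at which the length-m pattern occurs in T."""
    q, shifts = 0, []
    for i, c in enumerate(T, start=1):
        q = delta[(q, c)]
        if q == m:                     # accepting state: full match ends at i
            shifts.append(i - m)
    return shifts

delta = build_delta("ababaca", "abc")
print(finite_automaton_matcher("abababacaba", delta, 7))   # [2]
```

The run on T = abababacaba reproduces Figure 32.6(c): the automaton reaches the accepting state 7 after position 9, reporting the occurrence with shift 2.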

Let’s examine how the automaton operates on an input text T[1:n]. We will prove that the automaton is in state σ(T[:i]) after reading character T[i]. Since σ(T[:i]) = m if and only if P ⊐ T[:i], the machine is in the accepting state m if and only if it has just read the pattern P. We start with two lemmas about the suffix function σ.

Lemma 32.2 (Suffix-function inequality)

For any string x and character a, we have σ( xa) ≤ σ( x) + 1.

Figure 32.7 An illustration for the proof of Lemma 32.2. The figure shows that r ≤ σ(x) + 1, where r = σ(xa).

Figure 32.8 An illustration for the proof of Lemma 32.3. The figure shows that r = σ( P[: q] a), where q = σ( x) and r = σ( xa).

Proof Referring to Figure 32.7, let r = σ(xa). If r = 0, then the conclusion σ(xa) = r ≤ σ(x) + 1 is trivially satisfied since σ(x) is nonnegative. Now assume that r > 0. Then, P[:r] ⊐ xa, by the definition of σ. Thus, P[:r − 1] ⊐ x, by dropping the a from both the end of P[:r] and the end of xa. Therefore, r − 1 ≤ σ(x), since σ(x) is the largest k such that P[:k] ⊐ x, and thus σ(xa) = r ≤ σ(x) + 1.

Lemma 32.3 (Suffix-function recursion lemma)

For any string x and character a, if q = σ(x), then σ(xa) = σ(P[:q]a).

Proof The definition of σ gives that P[:q] ⊐ x. As Figure 32.8 shows, we also have P[:q]a ⊐ xa. Let r = σ(xa). Then P[:r] ⊐ xa and, by Lemma 32.2, r ≤ q + 1. Thus, we have |P[:r]| = r ≤ q + 1 = |P[:q]a|. Since P[:q]a ⊐ xa, P[:r] ⊐ xa, and |P[:r]| ≤ |P[:q]a|, Lemma 32.1 on page 959 implies that P[:r] ⊐ P[:q]a. Therefore, r ≤ σ(P[:q]a), that is, σ(xa) ≤ σ(P[:q]a). But we also have σ(P[:q]a) ≤ σ(xa), since P[:q]a ⊐ xa. Thus, σ(xa) = σ(P[:q]a).

We are now ready to prove the main theorem characterizing the

behavior of a string-matching automaton on a given input text. As

noted above, this theorem shows that the automaton is merely keeping

track, at each step, of the longest prefix of the pattern that is a suffix of

what has been read so far. In other words, the automaton maintains the

invariant (32.5).

Theorem 32.4

If ϕ is the final-state function of a string-matching automaton for a given pattern P and T[1: n] is an input text for the automaton, then ϕ( T[: i]) = σ( T[: i])

for i = 0, 1, …, n.

Proof The proof is by induction on i. For i = 0, the theorem is trivially true, since T[:0] = ε. Thus, ϕ(T[:0]) = 0 = σ(T[:0]).

Now assume that ϕ(T[:i]) = σ(T[:i]). We will prove that ϕ(T[:i + 1]) = σ(T[:i + 1]). Let q denote ϕ(T[:i]), so that q = σ(T[:i]), and let a denote T[i + 1]. Then,

ϕ(T[:i + 1]) = ϕ(T[:i]a)         (by the definitions of T[:i + 1] and a)
             = δ(ϕ(T[:i]), a)    (by the definition of ϕ)
             = δ(q, a)           (by the definition of q)
             = σ(P[:q]a)         (by the definition (32.4) of δ)
             = σ(T[:i]a)         (by Lemma 32.3)
             = σ(T[:i + 1])      (by the definition of T[:i + 1]).

By Theorem 32.4, if the machine enters state q on line 3, then q is the largest value such that P[: q] ⊐ T[: i]. Thus, in line 4, q = m if and only if the machine has just read an occurrence of the pattern P. Therefore, FINITE-AUTOMATON-MATCHER operates correctly.

Computing the transition function

The procedure COMPUTE-TRANSITION-FUNCTION on the

following page computes the transition function δ from a given pattern

P[1: m]. It computes δ( q, a) in a straightforward manner according to its definition in equation (32.4). The nested loops beginning on lines 1 and

2 consider all states q and all characters a, and lines 3–6 set δ( q, a) to be the largest k such that P[: k] ⊐ P[: q] a. The code starts with the largest conceivable value of k, which is q+1, unless q = m, in which case k cannot be larger than m. It then decreases k until P[: k] is a suffix of P[: q] a, which must eventually occur, since P[:0] = ε is a suffix of every string.

COMPUTE-TRANSITION-FUNCTION(P, ∑, m)
1  for q = 0 to m
2      for each character a ∈ ∑
3          k = min {m, q + 1}
4          while P[:k] is not a suffix of P[:q]a
5              k = k − 1
6          δ(q, a) = k
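The procedure translates directly into Python (0-indexed strings; the dictionary representation of δ is our own choice). The while loop must terminate because P[:0] is the empty string, which is a suffix of everything:

```python
def compute_transition_function(P, alphabet):
    """delta(q, a) = largest k such that P[:k] is a suffix of P[:q] + a."""
    m = len(P)
    delta = {}
    for q in range(m + 1):
        for a in alphabet:
            k = min(m, q + 1)                     # largest conceivable k
            while not (P[:q] + a).endswith(P[:k]):
                k -= 1                            # terminates: k = 0 always fits
            delta[(q, a)] = k
    return delta

delta = compute_transition_function("ababaca", "abc")
print(delta[(5, 'c')], delta[(5, 'b')], delta[(5, 'a')])   # 6 4 1
```

For the pattern ababaca this reproduces the transition table of Figure 32.6(b): for example, from state 5 the character c continues along the spine to state 6, while b falls back to state 4.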