In general, if n random variables X1, X2, … , Xn are pairwise independent, then

Var[X1 + X2 + ⋯ + Xn] = Var[X1] + Var[X2] + ⋯ + Var[Xn].

The standard deviation of a random variable X is the nonnegative square root of the variance of X. The standard deviation of X is sometimes denoted σX or simply σ when the random variable X is understood from context. With this notation, the variance of X is denoted σ².
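As a quick numerical illustration (a Python sketch, not part of the text; the helper names `expectation` and `variance` are ours), the variance and standard deviation of one roll of a fair die can be computed directly from the definitions:

```python
from fractions import Fraction

def expectation(dist):
    """E[X] for a finite distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var[X] = E[X^2] - E^2[X], per the definitions in this section."""
    ex = expectation(dist)
    ex2 = sum(x * x * p for x, p in dist.items())
    return ex2 - ex * ex

# A fair 6-sided die: each value 1..6 with probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}
sigma_squared = variance(die)        # Var[X] = 35/12
sigma = float(sigma_squared) ** 0.5  # standard deviation, about 1.71
```

Here E[X] = 7/2 and E[X²] = 91/6, so σ² = 91/6 − 49/4 = 35/12.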

Exercises

C.3-1

You roll two ordinary, 6-sided dice. What is the expectation of the sum

of the two values showing? What is the expectation of the maximum of

the two values showing?

C.3-2

An array A[1 : n] contains n distinct numbers that are randomly ordered, with each permutation of the n numbers being equally likely.

What is the expectation of the index of the maximum element in the

array? What is the expectation of the index of the minimum element in

the array?

C.3-3

A carnival game consists of three dice in a cage. A player can bet a

dollar on any of the numbers 1 through 6. The cage is shaken, and the

payoff is as follows. If the player’s number doesn’t appear on any of the

dice, the player loses the dollar. Otherwise, if the player’s number


appears on exactly k of the three dice, for k = 1, 2, 3, the player keeps the dollar and wins k more dollars. What is the expected gain from

playing the carnival game once?

C.3-4

Argue that if X and Y are nonnegative random variables, then

E[max{X, Y}] ≤ E[X] + E[Y].

C.3-5

Let X and Y be independent random variables. Prove that f( X) and g( Y) are independent for any choice of functions f and g.

C.3-6

Let X be a nonnegative random variable, and suppose that E[ X] is well

defined. Prove Markov’s inequality:

Pr{X ≥ t} ≤ E[X]/t

for all t > 0.

C.3-7

Let S be a sample space, and let X and X′ be random variables such that X(s) ≥ X′(s) for all s ∈ S. Prove that for any real constant t, Pr{X ≥ t} ≥ Pr{X′ ≥ t}.

C.3-8

Which is larger: the expectation of the square of a random variable, or

the square of its expectation?

C.3-9

Show that for any random variable X that takes on only the values 0

and 1, we have Var[X] = E[X] E[1 − X].

C.3-10

Prove that Var[aX] = a²Var[X] from the definition (C.31) of variance.

C.4 The geometric and binomial distributions

A Bernoulli trial is an experiment with only two possible outcomes:

success, which occurs with probability p, and failure, which occurs with probability q = 1 − p. A coin flip serves as an example where, depending on your point of view, heads equates to success and tails to failure.

When we speak of Bernoulli trials collectively, we mean that the trials

are mutually independent and, unless we specifically say otherwise, that

each has the same probability p for success. Two important distributions

arise from Bernoulli trials: the geometric distribution and the binomial

distribution.

The geometric distribution

Consider a sequence of Bernoulli trials, each with a probability p of success and a probability q = 1 − p of failure. How many trials occur

before a success? Define the random variable X to be the number of

trials needed to obtain a success. Then X has values in the range {1, 2,

…}, and for k ≥ 1,

Pr{X = k} = q^{k−1} p,      (C.35)

Figure C.1 A geometric distribution with probability p = 1/3 of success and a probability q = 1 − p of failure. The expectation of the distribution is 1/p = 3.

since k − 1 failures occur before the first success. A probability

distribution satisfying equation (C.35) is said to be a geometric

distribution. Figure C.1 illustrates such a distribution.

Assuming that q < 1, we can calculate the expectation of a geometric

distribution:

E[X] = Σ_{k=1}^∞ k q^{k−1} p
     = (p/q) Σ_{k=1}^∞ k q^k
     = (p/q) · (q/(1 − q)²)
     = (p/q) · (q/p²)
     = 1/p.      (C.36)

Thus, on average, it takes 1/ p trials before a success occurs, an intuitive

result. As Exercise C.4-3 asks you to show, the variance is

Var[X] = q/p².      (C.37)

As an example, suppose that you repeatedly roll two dice until you

obtain either a seven or an eleven. Of the 36 possible outcomes, 6 yield a

seven and 2 yield an eleven. Thus, the probability of success is p = 8/36

= 2/9, and you’d have to roll 1/ p = 9/2 = 4.5 times on average to obtain a

seven or eleven.
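The 1/p expectation can be checked empirically. The following Python sketch (illustrative only; the helper `trials_until_success` and the seed are ours) simulates the seven-or-eleven experiment:

```python
import random

def trials_until_success(p, rng):
    """Simulate Bernoulli trials; return the index of the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

rng = random.Random(1)
p = 8 / 36  # 6 outcomes yield a seven and 2 yield an eleven, out of 36
n = 100_000
avg = sum(trials_until_success(p, rng) for _ in range(n)) / n
# avg is close to the expectation 1/p = 4.5
```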

The binomial distribution

How many successes occur during n Bernoulli trials, where a success

occurs with probability p and a failure with probability q = 1 − p?

Define the random variable X to be the number of successes in n trials.

Then X has values in the range {0, 1, … , n}, and for k = 0, 1, … , n,

Pr{X = k} = (n choose k) p^k q^{n−k},      (C.38)

since there are (n choose k) ways to pick which k of the n trials are successes, and the probability that each occurs is p^k q^{n−k}. A probability distribution satisfying equation (C.38) is said to be a binomial distribution. For convenience, we define the family of binomial distributions using the notation

b(k; n, p) = (n choose k) p^k q^{n−k}.      (C.40)

Figure C.2 illustrates a binomial distribution. The name “binomial”

comes from the right-hand side of equation (C.38) being the k th term of

the expansion of ( p + q) n. Consequently, since p + q = 1, equation (C.4) on page 1181 gives

Σ_{k=0}^n b(k; n, p) = 1,

as axiom 2 of the probability axioms requires.

We can compute the expectation of a random variable having a

binomial distribution from equations (C.9) and (C.40). Let X be a random variable that follows the binomial distribution b(k; n, p), and let q = 1 − p. The definition of expectation gives

E[X] = Σ_{k=0}^n k · b(k; n, p)
     = Σ_{k=1}^n k (n choose k) p^k q^{n−k}
     = np Σ_{k=1}^n (n−1 choose k−1) p^{k−1} q^{n−k}
     = np Σ_{j=0}^{n−1} b(j; n − 1, p)
     = np.      (C.41)

Figure C.2 The binomial distribution b(k; 15, 1/3) resulting from n = 15 Bernoulli trials, each with probability p = 1/3 of success. The expectation of the distribution is np = 5.

Linearity of expectation produces the same result with substantially

less algebra. Let Xi be the random variable describing the number of

successes in the i th trial. Then E[Xi] = p · 1 + q · 0 = p, and the expected number of successes for n trials is

E[X] = E[Σ_{i=1}^n Xi]
     = Σ_{i=1}^n E[Xi]      (by linearity of expectation)
     = np.      (C.42)

We can use the same approach to calculate the variance of the distribution. By equation (C.31), Var[Xi] = E[Xi²] − E²[Xi]. Since Xi takes on only the values 0 and 1, we have Xi² = Xi, which implies E[Xi²] = E[Xi] = p. Hence,

Var[Xi] = p − p² = p(1 − p) = pq.      (C.43)

To compute the variance of X, we take advantage of the independence of the n trials. By equation (C.33), we have

Var[X] = Var[Σ_{i=1}^n Xi]
       = Σ_{i=1}^n Var[Xi]
       = npq.      (C.44)

As Figure C.2 shows, the binomial distribution b( k; n, p) increases with k until it reaches the mean np, and then it decreases. To prove that the distribution always behaves in this manner, examine the ratio of

successive terms:

b(k; n, p) / b(k − 1; n, p) = ((n − k + 1) p) / (k q).      (C.45)

This ratio is greater than 1 precisely when (n + 1)p − k is positive. Consequently, b(k; n, p) > b(k − 1; n, p) for k < (n + 1)p (the distribution increases), and b(k; n, p) < b(k − 1; n, p) for k > (n + 1)p (the distribution decreases). If (n + 1)p is an integer, then for k = (n + 1)p, the ratio b(k; n, p)/b(k − 1; n, p) equals 1, so that b(k; n, p) = b(k − 1; n, p). In this case, the distribution has two maxima: at k = (n + 1)p and at k − 1 = (n + 1)p − 1 = np − q. Otherwise, it attains a maximum at the unique integer k that lies in the range np − q < k < (n + 1)p.

The following lemma provides an upper bound on the binomial

distribution.

Lemma C.1

Let n ≥ 0, let 0 < p < 1, let q = 1 − p, and let 0 ≤ k ≤ n. Then

b(k; n, p) ≤ (np/k)^k (nq/(n − k))^{n−k}.

Proof Using the bound (n choose k) ≤ n^n / (k^k (n − k)^{n−k}), we have

b(k; n, p) = (n choose k) p^k q^{n−k}
           ≤ (n^n / (k^k (n − k)^{n−k})) p^k q^{n−k}
           = (np/k)^k (nq/(n − k))^{n−k}.
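The lemma can be sanity-checked numerically. This Python sketch (ours, not from the text; it reads 0^0 as 1 at the endpoints, as the limits require) compares b(k; n, p) against the bound for every k:

```python
from math import comb

def b(k, n, p):
    """b(k; n, p) = C(n, k) p^k q^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lemma_c1_bound(k, n, p):
    """The bound (np/k)^k (nq/(n-k))^(n-k), with 0^0 read as 1 at k = 0, n."""
    q = 1 - p
    left = 1 if k == 0 else (n * p / k) ** k
    right = 1 if k == n else (n * q / (n - k)) ** (n - k)
    return left * right

n, p = 20, 0.3
ok = all(b(k, n, p) <= lemma_c1_bound(k, n, p) + 1e-12 for k in range(n + 1))
```

At k = 0 the bound is q^n, which b(0; n, p) attains with equality.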

Exercises


C.4-1

Verify axiom 2 of the probability axioms for the geometric distribution.

C.4-2

How many times on average do you need to flip six fair coins before

obtaining three heads and three tails?

C.4-3

Show that the variance of the geometric distribution is q/p². (Hint: Use Exercise A.1-6 on page 1144.)

C.4-4

Show that b(k; n, p) = b(n − k; n, q), where q = 1 − p.

C.4-5

Show that the value of the maximum of the binomial distribution b(k; n, p) is approximately 1/√(2πnpq), where q = 1 − p.

C.4-6

Show that the probability of no successes in n Bernoulli trials, each with

probability p = 1/ n of success, is approximately 1/ e. Show that the probability of exactly one success is also approximately 1/ e.

C.4-7

Professor Rosencrantz flips a fair coin n times, and so does Professor Guildenstern. Show that the probability that they get the same number

of heads is (2n choose n)/4^n. (Hint: For Professor Rosencrantz, call a head a success, and for Professor Guildenstern, call a tail a success.) Use your argument to verify the identity

Σ_{k=0}^n (n choose k)² = (2n choose n).

C.4-8


Show that for 0 ≤ k ≤ n,

b(k; n, 1/2) ≤ 2^{nH(k/n) − n},

where H( x) is the entropy function (C.8) on page 1182.

C.4-9

Consider n Bernoulli trials, where for i = 1, 2, … , n, the i th trial has probability pi of success, and let X be the random variable denoting the total number of successes. Let p ≥ pi for all i = 1, 2, … , n. Prove that for 1 ≤ k ≤ n,

Pr{X < k} ≥ Pr{Y < k},

where Y is the random variable denoting the total number of successes in n Bernoulli trials, each with probability p of success.

C.4-10

Let X be the random variable for the total number of successes in a set

A of n Bernoulli trials, where the i th trial has a probability pi of success, and let X′ be the random variable for the total number of successes in a second set A′ of n Bernoulli trials, where the i th trial has a probability p′i ≥ pi of success. Prove that for 0 ≤ k ≤ n,

Pr { X′ ≥ k} ≥ Pr { Xk}.

( Hint: Show how to obtain the Bernoulli trials in A′ by an experiment

involving the trials of A, and use the result of Exercise C.3-7.)

★ C.5 The tails of the binomial distribution

The probability of having at least, or at most, k successes in n Bernoulli trials, each with probability p of success, is often of more interest than

the probability of having exactly k successes. In this section, we

investigate the tails of the binomial distribution: the two regions of the

distribution b( k; n, p) that are far from the mean np. We’ll prove several important bounds on (the sum of all terms in) a tail.


We first provide a bound on the right tail of the distribution b( k; n, p). To determine bounds on the left tail, simply invert the roles of successes and failures.

Theorem C.2

Consider a sequence of n Bernoulli trials, where success occurs with

probability p. Let X be the random variable denoting the total number

of successes. Then for 0 ≤ k ≤ n, the probability of at least k successes is

Pr{X ≥ k} ≤ (n choose k) p^k.

Proof For S ⊆ {1, 2, … , n}, let A_S denote the event that the i th trial is a success for every i ∈ S. Since Pr{A_S} = p^k when |S| = k, we have

Pr{X ≥ k} = Pr{A_S occurs for some S with |S| = k}
          ≤ Σ_{S : |S| = k} Pr{A_S}      (by Boole's inequality)
          = (n choose k) p^k.
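The bound is easy to confirm numerically. Here is a Python sketch (ours; `right_tail` sums the binomial terms directly) that checks Pr{X ≥ k} ≤ (n choose k) p^k for every k at one choice of n and p:

```python
from math import comb

def b(k, n, p):
    """b(k; n, p) = C(n, k) p^k q^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def right_tail(k, n, p):
    """Pr{X >= k} for X following the binomial distribution b(.; n, p)."""
    return sum(b(i, n, p) for i in range(k, n + 1))

n, p = 12, 0.4
ok = all(right_tail(k, n, p) <= comb(n, k) * p**k + 1e-12
         for k in range(n + 1))
```

At k = 0 both sides equal 1, so the bound is tight there.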

The following corollary restates the theorem for the left tail of the

binomial distribution. In general, we’ll leave it to you to adapt the

proofs from one tail to the other.

Corollary C.3

Consider a sequence of n Bernoulli trials, where success occurs with

probability p. If X is the random variable denoting the total number of successes, then for 0 ≤ kn, the probability of at most k successes is


Our next bound concerns the left tail of the binomial distribution. Its

corollary shows that, far from the mean, the left tail diminishes

exponentially.

Theorem C.4

Consider a sequence of n Bernoulli trials, where success occurs with

probability p and failure with probability q = 1 − p. Let X be the random variable denoting the total number of successes. Then for 0 < k

< np, the probability of fewer than k successes is

Pr{X < k} < (kq/(np − k)) b(k; n, p).

Proof We bound the series Σ_{i=0}^{k−1} b(i; n, p) by a geometric series using the technique from Section A.2, page 1147. For i = 1, 2, … , k, equation (C.45) gives

b(i − 1; n, p) / b(i; n, p) = (iq) / ((n − i + 1) p)
                            < (kq) / ((n − k + 1) p).

If we let

x = (kq) / ((n − k + 1) p) < 1      (since k < np),

it follows that b(i − 1; n, p) < x b(i; n, p) for 0 < i ≤ k. Iteratively applying this inequality k − i times gives b(i; n, p) < x^{k−i} b(k; n, p) for 0 ≤ i < k, and hence

Pr{X < k} = Σ_{i=0}^{k−1} b(i; n, p)
          < b(k; n, p) Σ_{i=0}^{k−1} x^{k−i}
          < b(k; n, p) Σ_{i=1}^∞ x^i
          = (x/(1 − x)) b(k; n, p)
          < (kq/(np − k)) b(k; n, p).

Corollary C.5

Consider a sequence of n Bernoulli trials, where success occurs with

probability p and failure with probability q = 1 − p. Then for 0 < k

≤ np/2, the probability of fewer than k successes is less than half the probability of fewer than k + 1 successes.

Proof Because k ≤ np/2, we have

kq/(np − k) ≤ ((np/2) q) / (np − np/2)
           = q      (C.46)
           ≤ 1,

since q ≤ 1. Letting X be the random variable denoting the number of successes, Theorem C.4 and inequality (C.46) imply that the probability of fewer than k successes is

Pr{X < k} < (kq/(np − k)) b(k; n, p)
          ≤ b(k; n, p).

Thus we have

Pr{X < k} / Pr{X < k + 1} = Pr{X < k} / (Pr{X < k} + b(k; n, p))
                          < 1/2,

since Pr{X < k} < b(k; n, p).

Bounds on the right tail follow similarly. Exercise C.5-2 asks you to

prove them.

Corollary C.6

Consider a sequence of n Bernoulli trials, where success occurs with

probability p. Let X be the random variable denoting the total number

of successes. Then for np < k < n, the probability of more than k successes is

Pr{X > k} < ((n − k) p / (k − np)) b(k; n, p).

Corollary C.7


Consider a sequence of n Bernoulli trials, where success occurs with

probability p and failure with probability q = 1 − p. Then for ( np + n)/2

< k < n, the probability of more than k successes is less than half the probability of more than k − 1 successes.

The next theorem considers n Bernoulli trials, each with a

probability pi of success, for i = 1, 2, … , n. As the subsequent corollary shows, we can use the theorem to provide a bound on the right tail of

the binomial distribution by setting pi = p for each trial.

Theorem C.8

Consider a sequence of n Bernoulli trials, where in the i th trial, for i = 1, 2, … , n, success occurs with probability pi and failure occurs with probability qi = 1 − pi. Let X be the random variable describing the total number of successes, and let μ = E[X]. Then for r > μ,

Pr{X − μ ≥ r} ≤ (μe/r)^r.

Proof Since for any α > 0, the function e^{αx} strictly increases in x,

Pr{X − μ ≥ r} = Pr{e^{α(X−μ)} ≥ e^{αr}},      (C.47)

where we will determine α later. Using Markov's inequality (C.34), we

obtain

Pr{e^{α(X−μ)} ≥ e^{αr}} ≤ E[e^{α(X−μ)}] e^{−αr}.      (C.48)

The bulk of the proof consists of bounding E[e^{α(X−μ)}] and substituting a suitable value for α in inequality (C.48). First, we evaluate E[e^{α(X−μ)}]. Using the technique of indicator random variables (see Section 5.2), let Xi = I{the i th Bernoulli trial is a success} for i = 1, 2, … , n. That is, Xi is the random variable that is 1 if the i th Bernoulli trial is a success and 0 if it is a failure. Thus, we have

X = Σ_{i=1}^n Xi,

and by linearity of expectation,

μ = E[X] = Σ_{i=1}^n E[Xi] = Σ_{i=1}^n pi,

which implies

X − μ = Σ_{i=1}^n (Xi − pi).

To evaluate E[e^{α(X−μ)}], we substitute for X − μ, obtaining

E[e^{α(X−μ)}] = E[Π_{i=1}^n e^{α(Xi−pi)}]
             = Π_{i=1}^n E[e^{α(Xi−pi)}],

which follows from equation (C.27), since the mutual independence of

the random variables Xi implies the mutual independence of the

random variables e^{α(Xi−pi)} (see Exercise C.3-5). By the definition of expectation,

E[e^{α(Xi−pi)}] = pi e^{αqi} + qi e^{−αpi}
               ≤ pi e^α + 1      (C.49)
               ≤ exp(pi e^α),

where exp(x) denotes the exponential function: exp(x) = e^x. (Inequality (C.49) follows from the inequalities α > 0, qi ≤ 1, e^{αqi} ≤ e^α, and e^{−αpi} ≤ 1. The last line follows from inequality (3.14) on page 66.) Consequently,

E[e^{α(X−μ)}] = Π_{i=1}^n E[e^{α(Xi−pi)}]
             ≤ Π_{i=1}^n exp(pi e^α)
             = exp(μ e^α),      (C.50)

since μ = Σ_{i=1}^n pi. Therefore, from equation (C.47) and inequalities (C.48) and (C.50), it follows that

Pr{X − μ ≥ r} ≤ exp(μ e^α − αr).      (C.51)

Choosing α = ln(r/μ) (see Exercise C.5-7), we obtain

Pr{X − μ ≥ r} ≤ exp(μ e^{ln(r/μ)} − r ln(r/μ))
             = exp(r − r ln(r/μ))
             = (μe/r)^r.

When applied to Bernoulli trials in which each trial has the same

probability of success, Theorem C.8 yields the following corollary

bounding the right tail of a binomial distribution.

Corollary C.9

Consider a sequence of n Bernoulli trials, where in each trial success occurs with probability p and failure occurs with probability q = 1 − p.

Then for r > np,

Pr{X − np ≥ r} ≤ (npe/r)^r,

where X is the random variable denoting the total number of successes.

Proof By equation (C.41), we have μ = E[X] = np.
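A brute-force numerical check of the corollary's bound is straightforward. This Python sketch (ours; it simply sums the exact tail and compares it with (npe/r)^r) verifies a range of r for one binomial distribution:

```python
from math import comb, e

def b(k, n, p):
    """b(k; n, p) = C(n, k) p^k q^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 30, 0.2
mu = n * p  # = 6, by equation (C.41)

def chernoff_ok(r):
    """Check Pr{X - np >= r} <= (np*e/r)^r for this n and p."""
    tail = sum(b(k, n, p) for k in range(n + 1) if k - mu >= r)
    return tail <= (mu * e / r) ** r + 1e-12

ok = all(chernoff_ok(r) for r in range(7, 25))  # r > np = 6 as required
```

Note that the bound only becomes nontrivial (less than 1) once r exceeds npe.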

Exercises

C.5-1

Which is more likely: getting exactly n heads in 2n flips of a fair coin, or getting n heads in n flips of a fair coin?

C.5-2

Prove Corollaries C.6 and C.7.


C.5-3

Show that

for all a > 0 and all k such that 0 < k < na/(a + 1).

C.5-4

Prove that if 0 < k < np, where 0 < p < 1 and q = 1 − p, then

C.5-5

Use Theorem C.8 to show that

Pr{μ − X ≥ r} ≤ ((n − μ)e/r)^r

for r > n − μ. Similarly, use Corollary C.9 to show that

Pr{np − X ≥ r} ≤ ((n − np)e/r)^r

for r > n − np.

C.5-6

Consider a sequence of n Bernoulli trials, where in the i th trial, for i = 1, 2, … , n, success occurs with probability pi and failure occurs with probability qi = 1 − pi. Let X be the random variable describing the total number of successes, and let μ = E[X]. Show that for r ≥ 0,

Pr{X − μ ≥ r} ≤ e^{−r²/2n}.

(Hint: Prove that pi e^{αqi} + qi e^{−αpi} ≤ e^{α²/2}. Then follow the outline of the proof of Theorem C.8, using this inequality in place of inequality (C.49).)

C.5-7

Show that choosing α = ln( r/ μ) minimizes the right-hand side of inequality (C.51).

Problems

C-1 The Monty Hall problem

Imagine that you are a contestant in the 1960s game show Let’s Make a

Deal, hosted by emcee Monty Hall. A valuable prize is hidden behind

one of three doors and comparatively worthless prizes behind the other

two doors. You will win the valuable prize, typically an automobile or

other expensive product, if you select the correct door. After you have

picked one door, but before the door has been opened, Monty, who

knows which door hides the automobile, directs his assistant Carol

Merrill to open one of the other doors, revealing a goat (not a valuable

prize). He asks whether you would like to stick with your current choice

or to switch to the other closed door. What should you do to maximize

your chances of winning the automobile and not the other goat?

The answer to this question—stick or switch?—has been heavily

debated, in part because the problem setup is ambiguous. We’ll explore

different subtle assumptions.

a. Suppose that your first pick is random, with probability 1/3 of

choosing the right door. Moreover, you know that Monty always gives

every contestant (and will give you) the opportunity to switch. Prove

that it is better to switch than stick. What is your probability of

winning the automobile?

This answer is the one typically given, even though the original

statement of the problem rarely mentions the assumption that Monty

always offers the contestant the opportunity to switch. But, as the

remainder of this problem will elucidate, your best strategy may be

different if this unstated assumption does not hold. In fact, in the real

game show, after a contestant picked a door, Monty sometimes simply

asked Carol to open the door that the contestant had chosen.

Let’s model the interactions between you and Monty as a

probabilistic experiment, where you both employ randomized strategies.


Specifically, after you pick a door, Monty offers you the opportunity to

switch with probability p right if you picked the right door and with probability p wrong if you picked the wrong door. Given the opportunity

to switch, you randomly choose to switch with probability p switch. For

example, if Monty always offers you the opportunity to switch, then his

strategy is given by p right = p wrong = 1. If you always switch, then your strategy is given by p switch = 1.

The game can now be viewed as an experiment consisting of five

steps:

1. You pick a door at random, choosing the automobile (right) with

probability 1/3 or a goat (wrong) with probability 2/3.

2. Carol opens one of the two closed doors, revealing a goat.

3. Monty offers you the opportunity to switch with probability p right if

your choice is right and with probability p wrong if your choice is

wrong.

4. If Monty makes you an offer in step 3, you switch with probability

p switch.

5. Carol opens the door you’ve chosen, revealing either an automobile

(you win) or a goat (you lose).
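The five steps above can be simulated directly. The following Python sketch is an editorial illustration (the helper `play` and the seed are ours, not part of the problem); it checks the classic special case in which Monty always offers the switch:

```python
import random

def play(p_right, p_wrong, p_switch, rng):
    """One play of the five-step game; returns True if you win the automobile."""
    picked_right = rng.random() < 1 / 3                            # step 1
    offered = rng.random() < (p_right if picked_right else p_wrong)  # step 3
    switched = offered and rng.random() < p_switch                 # step 4
    # You win if you stuck with the right door, or switched away from a
    # wrong one (Carol has already revealed the other goat).
    return picked_right != switched

rng = random.Random(7)
n = 100_000
always = sum(play(1.0, 1.0, 1.0, rng) for _ in range(n)) / n  # always switch
never = sum(play(1.0, 1.0, 0.0, rng) for _ in range(n)) / n   # always stick
# With Monty always offering, switching wins about 2/3 of the time and
# sticking about 1/3, matching part (a).
```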

Let’s now analyze this game and understand how the choices of

p right, p wrong, and p switch influence the probability of winning.

b. What are the six outcomes in the sample space for this game? Which

outcomes correspond to you winning the automobile? What are the

probabilities in terms of p right, p wrong, and p switch of each outcome? Organize your answers into a table.

c. Use the results of your table (or other means) to prove that the

probability of winning the automobile is

1/3 + p switch (2 p wrong − p right)/3.

Suppose that Monty knows the probability p switch that you switch, and

his goal is to minimize your chance of winning.

d. If p switch > 0 (you switch with a positive probability), what is Monty’s best strategy, that is, his best choice for p right and p wrong?

e. If p switch = 0 (you always stick), argue that all of Monty’s possible strategies are optimal for him.

Suppose that now Monty’s strategy is fixed, with particular values for

p right and p wrong.

f. If you know p right and p wrong, what is your best strategy for choosing your probability p switch of switching as a function of p right and p wrong?

g. If you don’t know p right and p wrong, what choice of p switch maximizes the minimum probability of winning over all the choices of

p right and p wrong?

Let’s return to the original problem as stated, where Monty has given

you the option of switching, but you have no knowledge of Monty’s

possible motivations or strategies.

h. Argue that the conditional probability of winning the automobile

given that Monty offers you the opportunity to switch is

(p right (1 − p switch) + 2 p wrong p switch) / (p right + 2 p wrong).      (C.52)

Explain why p right + 2 p wrong ≠ 0.

i. What is the value of expression (C.52) when p switch = 1/2? Show that choosing p switch < 1/2 or p switch > 1/2 allows Monty to select values for p right and p wrong that yield a lower value for expression (C.52)

than choosing p switch = 1/2.

j. Suppose that you don’t know Monty’s strategy. Explain why choosing

to switch with probability 1/2 is a good strategy for the original


problem as stated. Summarize what you have learned overall from this

problem.

C-2 Balls and bins

This problem investigates the effect of various assumptions on the

number of ways of placing n balls into b distinct bins.

a. Suppose that the n balls are distinct and that their order within a bin does not matter. Argue that the number of ways of placing the balls in

the bins is bn.

b. Suppose that the balls are distinct and that the balls in each bin are

ordered. Prove that there are exactly ( b + n − 1)!/( b − 1)! ways to place the balls in the bins. ( Hint: Consider the number of ways of arranging

n distinct balls and b − 1 indistinguishable sticks in a row.)

c. Suppose that the balls are identical, and hence their order within a

bin does not matter. Show that the number of ways of placing the

balls in the bins is (b + n − 1 choose n). (Hint: Of the arrangements in part (b), how many are repeated if the balls are made identical?)

d. Suppose that the balls are identical and that no bin may contain more

than one ball, so that n ≤ b. Show that the number of ways of placing the balls is (b choose n).

e. Suppose that the balls are identical and that no bin may be left empty.

Assuming that n ≥ b, show that the number of ways of placing the balls is (n − 1 choose b − 1).
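For small n and b, these counts can be confirmed by brute-force enumeration. The following Python sketch (ours; it encodes a placement as the tuple of bin indices chosen by the distinct balls) checks parts (a)-(d):

```python
from itertools import permutations, product
from math import comb, factorial

n, b = 3, 4  # 3 balls, 4 bins: small enough to enumerate

# (a) distinct balls, unordered within bins: each ball picks a bin, b^n ways.
placements = list(product(range(b), repeat=n))
count_a = len(placements)

# (b) distinct balls, ordered within bins: arrangements of n distinct balls
# and b - 1 identical sticks in a row, (b + n - 1)!/(b - 1)! ways.
count_b = len(set(permutations([0, 1, 2] + ['|'] * (b - 1))))

# (c) identical balls: only the multiset of bin loads matters.
count_c = len({tuple(sorted(pl)) for pl in placements})

# (d) identical balls, at most one per bin: choose which n bins are occupied.
count_d = len({tuple(sorted(pl)) for pl in placements if len(set(pl)) == n})
```

Part (e) can be checked the same way with n ≥ b, keeping only placements in which every bin appears.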

Appendix notes

The first general methods for solving probability problems were

discussed in a famous correspondence between B. Pascal and P. de

Fermat, which began in 1654, and in a book by C. Huygens in 1657.

Rigorous probability theory began with the work of J. Bernoulli in 1713

and A. De Moivre in 1730. Further developments of the theory were

provided by P.-S. Laplace, S.-D. Poisson, and C. F. Gauss.

Sums of random variables were originally studied by P. L. Chebyshev

and A. A. Markov. A. N. Kolmogorov axiomatized probability theory

in 1933. Chernoff [91] and Hoeffding [222] provided bounds on the tails of distributions. Seminal work in random combinatorial structures was

done by P. Erdős.

Knuth [259] and Liu [302] are good references for elementary combinatorics and counting. Standard textbooks such as Billingsley

[56], Chung [93], Drake [125], Feller [139], and Rozanov [390] offer comprehensive introductions to probability.

1 For a general probability distribution, there may be some subsets of the sample space S that are not considered to be events. This situation usually arises when the sample space is uncountably infinite. The main requirement for what subsets are events is that the set of events of a sample space must be closed under the operations of taking the complement of an event, forming the union of a finite or countable number of events, and taking the intersection of a finite or countable number of events. Most of the probability distributions we see in this book are over finite or countable sample spaces, and we generally consider all subsets of a sample space to be events. A notable exception is the continuous uniform probability distribution, which we’ll see shortly.


D Matrices

Matrices arise in numerous applications, including, but by no means

limited to, scientific computing. If you have seen matrices before, much

of the material in this appendix will be familiar to you, but some of it

might be new. Section D.1 covers basic matrix definitions and operations, and Section D.2 presents some basic matrix properties.

D.1 Matrices and matrix operations

This section reviews some basic concepts of matrix theory and some

fundamental properties of matrices.

Matrices and vectors

A matrix is a rectangular array of numbers. For example,

A = ( 1 2 3
      4 5 6 )      (D.1)

is a 2 × 3 matrix A = (aij), where for i = 1, 2 and j = 1, 2, 3, the element of the matrix in row i and column j is denoted by aij. By convention, uppercase letters denote matrices and corresponding subscripted lowercase letters denote their elements. We denote the set of all m × n matrices with real-valued entries by ℝ^{m×n} and, in general, the set of m × n matrices with entries drawn from a set S by S^{m×n}.

The transpose of a matrix A is the matrix A^T obtained by exchanging the rows and columns of A. For the matrix A of equation (D.1),

A^T = ( 1 4
        2 5
        3 6 ).

A vector is a one-dimensional array of numbers. For example,

x = ( 2
      3
      5 )

is a vector of size 3. We sometimes call a vector of length n an n-vector.

By convention, lowercase letters denote vectors, and the i th element of a

size- n vector x is denoted by xi, for i = 1, 2, … , n. We take the standard form of a vector to be as a column vector equivalent to an n × 1 matrix, whereas the corresponding row vector is obtained by taking the

transpose:

x T = ( 2 3 5 ).

The unit vector ei is the vector whose i th element is 1 and all of whose other elements are 0. Usually, the context makes the size of a unit vector

clear.

A zero matrix is a matrix all of whose entries are 0. Such a matrix is

often denoted 0, since the ambiguity between the number 0 and a

matrix of 0s can usually be resolved from context. If a matrix of 0s is

intended, then the size of the matrix also needs to be derived from the

context.

Square matrices

Square n × n matrices arise frequently. Several special cases of square matrices are of particular interest:

1. A diagonal matrix has aij = 0 whenever ij. Because all of the off-diagonal elements are 0, a succinct way to specify the matrix lists only

the elements along the diagonal:

diag(a11, a22, … , ann).

2. The n × n identity matrix In is a diagonal matrix with 1s along the diagonal:

In = diag(1, 1, … , 1).

When I appears without a subscript, its size derives from the context.

The i th column of an identity matrix is the unit vector ei.

3. A tridiagonal matrix T is one for which tij = 0 if | ij | > 1. Nonzero entries appear only on the main diagonal, immediately above the main

diagonal ( ti,i+1 for i = 1, 2, … , n − 1), or immediately below the main diagonal ( ti+1, i for i = 1, 2, … , n − 1):

4. An upper-triangular matrix U is one for which uij = 0 if i > j. All entries below the diagonal are 0:

An upper-triangular matrix is unit upper-triangular if it has all 1s

along the diagonal.


5. A lower-triangular matrix L is one for which lij = 0 if i < j. All entries above the diagonal are 0:

A lower-triangular matrix is unit lower-triangular if it has all 1s along

the diagonal.

6. A permutation matrix P has exactly one 1 in each row or column, and

0s elsewhere. An example of a permutation matrix is

Such a matrix is called a permutation matrix because multiplying a

vector x by a permutation matrix has the effect of permuting

(rearranging) the elements of x. Exercise D.1-4 explores additional

properties of permutation matrices.

7. A symmetric matrix A satisfies the condition A = A T. For example, is a symmetric matrix.
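The row- and column-permuting behavior described in item 6 (and explored in Exercise D.1-4) is easy to see concretely. Here is a small Python sketch (ours; `mat_mul` is a plain triple-sum product) using a 3 × 3 permutation matrix:

```python
def mat_mul(A, B):
    """Straightforward matrix product: C[i][j] = sum_k A[i][k] * B[k][j]."""
    p, q, r = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(q)) for j in range(r)]
            for i in range(p)]

P = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]   # a 3 x 3 permutation matrix

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

PA = mat_mul(P, A)   # A with its rows permuted
AP = mat_mul(A, P)   # A with its columns permuted
PP = mat_mul(P, P)   # again a permutation matrix
```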

Basic matrix operations

The elements of a matrix or vector are scalar numbers from a number

system, such as the real numbers, the complex numbers, or integers

modulo a prime. The number system defines how to add and multiply

scalars. These definitions extend to encompass addition and

multiplication of matrices.

We define matrix addition as follows. If A = ( aij) and B = ( bij) are m

× n matrices, then their matrix sum C = ( cij) = A + B is the m × n


matrix defined by

cij = aij + bij

for i = 1, 2, … , m and j = 1, 2, … , n. That is, matrix addition is performed componentwise. A zero matrix is the identity for matrix

addition:

A + 0 = A = 0 + A.

If λ is a scalar number and A = (aij) is a matrix, then λA = (λaij) is the scalar multiple of A obtained by multiplying each of its elements by λ. As a special case, we define the negative of a matrix A = (aij) to be −1 · A = −A, so that the ij th entry of −A is −aij. Thus, A + (−A) = 0 = (−A) + A.

· A = − A, so that the ij th entry of − A is − aij. Thus, A + (− A) = 0 = (− A) + A.

The negative of a matrix defines matrix subtraction: AB = A + (− B).

We define matrix multiplication as follows. Start with two matrices A

and B that are compatible in the sense that the number of columns of A equals the number of rows of B. (In general, an expression containing a

matrix product AB is always assumed to imply that matrices A and B

are compatible.) If A = ( aik) is a p × q matrix and B = ( bkj) is a q × r matrix, then their matrix product C = AB is the p × r matrix C = ( cij), where

for i = 1, 2, … , m and j = 1, 2, … , p. The procedure RECTANGULAR-MATRIX-MULTIPLY on page 374 implements

matrix multiplication in the straightforward manner based on equation

(D.2), assuming that C is initialized to 0, using pqr multiplications and p( q − 1) r additions for a running time of Θ( pqr). If the matrices are n× n square matrices, so that n = p = q = r, the pseudocode reduces to MATRIX-MULTIPLY on page 81, whose running time is Θ( n 3).

(Section 4.2 describes an asymptotically faster Θ( n lg7)-time algorithm due to V. Strassen.)
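As an illustration, here is a Python rendering of the straightforward triple-loop method (a sketch in the spirit of RECTANGULAR-MATRIX-MULTIPLY, not the book's pseudocode itself), which also exhibits the non-commutativity discussed below:

```python
def rectangular_matrix_multiply(A, B):
    """Theta(pqr) product of a p x q matrix A and a q x r matrix B,
    following the summation in equation (D.2)."""
    p, q, r = len(A), len(B), len(B[0])
    C = [[0] * r for _ in range(p)]  # initialize C to the zero matrix
    for i in range(p):
        for j in range(r):
            for k in range(q):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
AB = rectangular_matrix_multiply(A, B)
BA = rectangular_matrix_multiply(B, A)
# AB != BA: matrix multiplication is not commutative for n > 1.
```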


Matrices have many (but not all) of the algebraic properties typical

of numbers. Identity matrices are identities for matrix multiplication:

ImA = AIn = A

for any m × n matrix A. Multiplying by a zero matrix gives a zero matrix:

A · 0 = 0.

Matrix multiplication is associative:

A( BC) = ( AB) C

for compatible matrices A, B, and C. Matrix multiplication distributes over addition:

A( B + C) = AB + AC,

( B + C) D = BD + CD.

For n > 1, multiplication of n × n matrices is not commutative. For example, if

A = ( 0 1
      0 0 )

and

B = ( 0 0
      1 0 ),

then

AB = ( 1 0
       0 0 )

and

BA = ( 0 0
       0 1 ).

We define matrix-vector products or vector-vector products as if the

vector were the equivalent n × 1 matrix (or a 1 × n matrix, in the case of a row vector). Thus, if A is an m × n matrix and x is an n-vector, then Ax is an m-vector. If x and y are n-vectors, then

x^T y = Σ_{i=1}^n xi yi

is a scalar number (actually a 1 × 1 matrix) called the inner product of x and y. We also use the notation 〈x, y〉 to denote x^T y. The inner-product operator is commutative: 〈x, y〉 = 〈y, x〉. The matrix xy^T is an n × n matrix Z called the outer product of x and y, where zij = xi yj. The (euclidean) norm ∥x∥ of an n-vector x is defined by

∥x∥ = (x1² + x2² + ⋯ + xn²)^{1/2}
    = (x^T x)^{1/2}.

Thus, the norm of x is its length in n-dimensional euclidean space. A useful fact, which follows from the equality ∥x∥² = x^T x, is that for any real number a and n-vector x,

∥ax∥ = |a| ∥x∥.
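These three operations take only a few lines to express. The following Python sketch (ours; the names `inner`, `outer`, and `norm` are illustrative) implements them directly from the definitions:

```python
from math import sqrt

def inner(x, y):
    """Inner product <x, y> = x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def outer(x, y):
    """Outer product x y^T: the matrix Z with z_ij = x_i * y_j."""
    return [[xi * yj for yj in y] for xi in x]

def norm(x):
    """Euclidean norm ||x|| = (x^T x)^(1/2)."""
    return sqrt(inner(x, x))

x = [2, 3, 5]
y = [1, 0, 4]
```

For instance, 〈x, y〉 = 2 + 0 + 20 = 22, and scaling x by a scales its norm by |a|.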

Exercises

D.1-1

Show that if A and B are symmetric n × n matrices, then so are A + B

and AB.

D.1-2

Prove that ( AB)T = B T A T and that A T A is always a symmetric matrix.

D.1-3

Prove that the product of two lower-triangular matrices is lower-

triangular.

D.1-4

Prove that if P is an n × n permutation matrix and A is an n × n matrix, then the matrix product PA is A with its rows permuted, and the matrix

product AP is A with its columns permuted. Prove that the product of

two permutation matrices is a permutation matrix.

D.2 Basic matrix properties

We now define some basic properties pertaining to matrices: inverses,

linear dependence and independence, rank, and determinants. We also

define the class of positive-definite matrices.

Matrix inverses, ranks, and determinants


The inverse of an n × n matrix A is the n × n matrix, denoted A^{−1} (if it exists), such that AA^{−1} = In = A^{−1}A. For example, the matrix

( 1 1
  1 0 )

has inverse

( 0  1
  1 −1 ).

Many nonzero n × n matrices do not have inverses. A matrix without an

inverse is called noninvertible, or singular. An example of a nonzero singular matrix is

If a matrix has an inverse, it is called invertible, or nonsingular. Matrix inverses, when they exist, are unique. (See Exercise D.2-1.) If A and B

are nonsingular n × n matrices, then

( BA)−1 = A−1 B−1.

The inverse operation commutes with the transpose operation:

( A−1)T = ( A T)−1.
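Both identities can be checked numerically. A minimal sketch for 2 × 2 matrices, using the adjugate formula for the inverse (the helpers mat_mul, transpose, and inv2 are our own):

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inv2(A):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "singular matrix"
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 4]]
B = [[2, 1], [1, 1]]

# (BA)^-1 == A^-1 B^-1
print(inv2(mat_mul(B, A)) == mat_mul(inv2(A), inv2(B)))  # True
# (A^-1)^T == (A^T)^-1
print(transpose(inv2(A)) == inv2(transpose(A)))          # True
```

The small integer entries here keep the floating-point arithmetic exact, so the equality tests hold without a tolerance.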

The vectors x 1, x 2, … , xn are linearly dependent if there exist coefficients c 1, c 2, … , cn, not all of which are 0, such that c 1 x 1 + c 2 x 2 + ⋯ + cnxn = 0. The row vectors x 1 = ( 1 2 3 ), x 2 = ( 2 6 4 ), and x 3 = ( 4 11 9 ) are linearly dependent, for example, since 2 x 1 + 3 x 2 − 2 x 3 = 0. If vectors are not linearly dependent, they are linearly independent. For example, the columns of an identity matrix are linearly independent.
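The dependence 2 x 1 + 3 x 2 − 2 x 3 = 0 is a one-liner to confirm:

```python
x1 = [1, 2, 3]
x2 = [2, 6, 4]
x3 = [4, 11, 9]
# 2*x1 + 3*x2 - 2*x3 should be the zero vector.
combo = [2 * a + 3 * b - 2 * c for a, b, c in zip(x1, x2, x3)]
print(combo)  # [0, 0, 0]
```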

The column rank of a nonzero m × n matrix A is the size of the largest set of linearly independent columns of A. Similarly, the row rank

of A is the size of the largest set of linearly independent rows of A. A fundamental property of any matrix A is that its row rank always equals

its column rank, so that we can simply refer to the rank of A. The rank of an m × n matrix is an integer between 0 and min { m, n}, inclusive.

(The rank of a zero matrix is 0, and the rank of an n × n identity matrix is n.) An alternate, but equivalent and often more useful, definition is that the rank of a nonzero m × n matrix A is the smallest number r such that there exist matrices B and C of respective sizes m × r and r × n such that A = BC. A square n × n matrix has full rank if its rank is n. An m × n matrix has full column rank if its rank is n. The following theorem gives a fundamental property of ranks.
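The rank of a matrix can be computed by Gaussian elimination: the number of pivots found equals the rank. A sketch in plain Python (the helper rank is ours; the eps tolerance guards against floating-point round-off), applied to the linearly dependent row vectors from the example above:

```python
def rank(A, eps=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]          # work on a copy
    m, n = len(A), len(A[0])
    r = 0                              # number of pivots found so far
    for col in range(n):
        if r == m:
            break
        # choose the largest-magnitude pivot in this column
        piv = max(range(r, m), key=lambda i: abs(A[i][col]))
        if abs(A[piv][col]) < eps:
            continue                   # no pivot in this column
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, m):
            f = A[i][col] / A[r][col]
            for j in range(col, n):
                A[i][j] -= f * A[r][j]
        r += 1
    return r

M = [[1, 2, 3],
     [2, 6, 4],
     [4, 11, 9]]
print(rank(M))                                # 2 (the rows are dependent)
print(rank([list(r) for r in zip(*M)]))       # 2 (row rank = column rank)
print(rank([[1, 0], [0, 1]]))                 # 2 (identity has full rank)
```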

Theorem D.1

A square matrix has full rank if and only if it is nonsingular.

A null vector for a matrix A is a nonzero vector x such that Ax = 0.

The following theorem (whose proof is left as Exercise D.2-7) and its

corollary relate the notions of column rank and singularity to null

vectors.

Theorem D.2

A matrix has full column rank if and only if it does not have a null

vector.

Corollary D.3

A square matrix is singular if and only if it has a null vector.

The ij th minor of an n × n matrix A, for n > 1, is the ( n − 1) × ( n − 1) matrix A[ ij] obtained by deleting the i th row and j th column of A. The determinant of an n × n matrix A is defined recursively in terms of its minors by

det( A) = a 11 if n = 1, and

det( A) = Σ j=1..n (−1)1+ j a 1 j det( A[1 j]) if n > 1.

The term (−1) i+ j det( A[ ij]) is known as the cofactor of the element aij.
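The recursive definition translates directly into code. This sketch (helper names ours) expands along the first row; with 0-indexed columns the sign becomes (−1)^j. It runs in Θ(n!) time, so it only illustrates the definition; it also checks the multiplicativity fact det(AB) = det(A) det(B) on one example:

```python
def minor(A, i, j):
    """The matrix A[ij]: A with row i and column j deleted."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(n))

A = [[1, 2], [3, 4]]
B = [[2, 0], [1, 1]]
print(det(A))  # -2
print(det(B))  # 2
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(det(AB) == det(A) * det(B))  # True
```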

The following theorems, whose proofs are omitted, express

fundamental properties of the determinant.

Theorem D.4 (Determinant properties)

The determinant of a square matrix A has the following properties:

If any row or any column of A is zero, then det( A) = 0.

The determinant of A is multiplied by λ if the entries of any one

row (or any one column) of A are all multiplied by λ.

The determinant of A is unchanged if the entries in one row

(respectively, column) are added to those in another row

(respectively, column).

The determinant of A equals the determinant of A T.

The determinant of A is multiplied by −1 if any two rows (or any

two columns) are exchanged.

Also, for any square matrices A and B, we have det( AB) = det( A) det( B).

Theorem D.5

An n × n matrix A is singular if and only if det( A) = 0.

Positive-definite matrices

Positive-definite matrices play an important role in many applications.

An n × n matrix A is positive-definite if x T Ax > 0 for all n-vectors x ≠ 0.

For example, the identity matrix is positive-definite, since if x = ( x 1 x 2 ⋯ xn)T is a nonzero vector, then

x T Inx = x T x = x 1^2 + x 2^2 + ⋯ + xn^2 > 0.

Matrices that arise in applications are often positive-definite due to

the following theorem.

Theorem D.6

For any matrix A with full column rank, the matrix A T A is positive-definite.

Proof We must show that x T( A T A) x > 0 for any nonzero vector x. For any vector x,

x T( A T A) x = ( Ax)T( Ax) (by Exercise D.1-2)

= ∥ Ax∥2.

The value ∥ Ax∥2 is just the sum of the squares of the elements of the vector Ax. Therefore, ∥ Ax∥2 ≥ 0. We’ll show by contradiction that ∥ Ax∥2 > 0. Suppose that ∥ Ax∥2 = 0. Then, every element of Ax is 0, which is to say Ax = 0. Since A has full column rank, Theorem D.2 says that x = 0, which contradicts the requirement that x is nonzero. Hence, A T A is positive-definite.
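The identity x T( A T A) x = ∥ Ax∥2 from the proof can be tested directly. A sketch using a hypothetical 3 × 2 matrix with full column rank (its two columns are independent); the helper names are ours:

```python
def mat_vec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def quad_form(A, x):
    """x^T (A^T A) x, computed as ||Ax||^2."""
    Ax = mat_vec(A, x)
    return sum(v * v for v in Ax)

# A 3 x 2 matrix with full column rank.
A = [[1, 0],
     [0, 1],
     [1, 1]]
for x in ([1, 0], [0, 1], [2, -3], [-1, -1]):
    print(x, quad_form(A, x))   # strictly positive for every nonzero x
```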

Section 28.3 explores other properties of positive-definite matrices.

Section 33.3 uses a similar condition, known as positive-semidefinite.

An n × n matrix A is positive-semidefinite if x T Ax ≥ 0 for all n-vectors x ≠ 0.

Exercises

D.2-1

Prove that matrix inverses are unique, that is, if B and C are inverses of A, then B = C.

D.2-2

Prove that the determinant of a lower-triangular or upper-triangular

matrix is equal to the product of its diagonal elements. Prove that the

inverse of a lower-triangular matrix, if it exists, is lower-triangular.

D.2-3

Prove that if P is a permutation matrix, then P is invertible, its inverse is P T, and P T is a permutation matrix.

D.2-4

Let A and B be n × n matrices such that AB = I. Prove that if A′ is obtained from A by adding row j into row i, where i ≠ j, then subtracting column i from column j of B yields the inverse B′ of A′.

D.2-5

Let A be a nonsingular n × n matrix with complex entries. Show that every entry of A−1 is real if and only if every entry of A is real.

D.2-6

Show that if A is a nonsingular, symmetric, n × n matrix, then A−1 is symmetric. Show that if B is an arbitrary m × n matrix, then the m × m matrix given by the product BAB T is symmetric.

D.2-7

Prove Theorem D.2. That is, show that a matrix A has full column rank

if and only if Ax = 0 implies x = 0. ( Hint: Express the linear dependence of one column on the others as a matrix-vector equation.)

D.2-8

Prove that for any two compatible matrices A and B,

rank( AB) ≤ min {rank( A), rank( B)},

where equality holds if either A or B is a nonsingular square matrix.

( Hint: Use the alternate definition of the rank of a matrix.)

Problems

D-1 Vandermonde matrix

Given numbers x 0, x 1, … , xn−1, prove that the determinant of the Vandermonde matrix

V( x 0, x 1, … , xn−1), the n × n matrix in which the entry in row i and column j is xi^ j, for rows and columns indexed from 0 (so that row i is ( 1 xi xi^2 ⋯ xi^(n−1) )),

is

det( V( x 0, x 1, … , xn−1)) = ∏ 0 ≤ i < j ≤ n−1 ( xj − xi).

( Hint: Multiply column i by − x 0 and add it to column i + 1 for i = n − 1, n − 2, … , 1, and then use induction.)
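As a sanity check (not a proof), one can compare a determinant computed by cofactor expansion against the claimed product formula on a small, hypothetical set of sample points; the det helper here is a throwaway:

```python
def minor(A, i, j):
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along row 0 (illustrative only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

xs = [2, 3, 5]                                  # hypothetical sample points
n = len(xs)
V = [[x ** j for j in range(n)] for x in xs]    # Vandermonde matrix
lhs = det(V)
rhs = 1
for i in range(n):
    for j in range(i + 1, n):
        rhs *= xs[j] - xs[i]
print(lhs, rhs)  # 6 6
```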

D-2 Permutations defined by matrix-vector multiplication over GF(2)

One class of permutations of the integers in the set Sn = {0, 1, 2, … , 2^n − 1} is defined by matrix multiplication over GF(2), the Galois field of two elements. For each integer x ∈ Sn, we view its binary representation as an n-bit vector ( x 0 x 1 ⋯ xn−1)T, where x = x 0 + 2 x 1 + 4 x 2 + ⋯ + 2^(n−1) xn−1. If A is an n × n matrix in which each entry is either 0 or 1, then we can define a permutation mapping each value x ∈ Sn to the number whose binary representation is the matrix-vector product Ax. All this arithmetic is performed over GF(2): all values are either 0 or 1, and with one exception, the usual rules of addition and multiplication

apply. The exception is that 1 + 1 = 0. You can think of arithmetic over

GF(2) as being just like regular integer arithmetic, except that you use

only the least-significant bit.

As an example, for S 2 = {0, 1, 2, 3}, the matrix A with rows ( 1 0 ) and ( 1 1 ) defines the following permutation πA: πA(0) = 0, πA(1) = 3, πA(2) = 2, πA(3) = 1. To see why πA(3) = 1, observe that, working in GF(2), A maps the bit vector ( 1 1 )T of 3 to ( 1 · 1 + 0 · 1   1 · 1 + 1 · 1 )T = ( 1 0 )T, which is the binary representation of 1.
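The mapping x ↦ Ax over GF(2) is straightforward to simulate. A sketch in plain Python (helper names ours), taking bit x 0 as the least-significant bit and using one full-rank choice of A, with rows ( 1 0 ) and ( 1 1 ); it reproduces the permutation πA(0) = 0, πA(1) = 3, πA(2) = 2, πA(3) = 1:

```python
def bits(x, n):
    """n-bit vector of x, least-significant bit first."""
    return [(x >> i) & 1 for i in range(n)]

def unbits(v):
    return sum(b << i for i, b in enumerate(v))

def apply_perm(A, x, n):
    """Map x to the number whose bit vector is Ax, arithmetic over GF(2)."""
    v = bits(x, n)
    return unbits([sum(A[i][j] * v[j] for j in range(n)) % 2
                   for i in range(n)])

A = [[1, 0],
     [1, 1]]           # a full-rank 0-1 matrix over GF(2)
print([apply_perm(A, x, 2) for x in range(4)])  # [0, 3, 2, 1]
```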

For the remainder of this problem, we’ll work over GF(2), and all

matrix and vector entries will be 0 or 1. Define the rank of a 0-1 matrix

(a matrix for which each entry is either 0 or 1) over GF(2) the same as

for a regular matrix, but with all arithmetic that determines linear

independence performed over GF(2). We define the range of an n × n 0-1

matrix A by

R( A) = { y : y = Ax for some x ∈ Sn}, so that R( A) is the set of numbers in Sn that are produced by multiplying each value x ∈ Sn by A.

a. If r is the rank of matrix A, prove that | R( A)| = 2^r. Conclude that A defines a permutation on Sn only if A has full rank.

For a given n × n matrix A and a given value y ∈ R( A), we define the preimage of y by

P ( A, y) = { x : Ax = y},

so that P( A, y) is the set of values in Sn that map to y when multiplied by A.

b. If r is the rank of n × n matrix A and y ∈ R( A), prove that | P( A, y)| = 2^(n−r).

Let 0 ≤ m ≤ n, and suppose that we partition the set Sn into blocks of consecutive numbers, where the i th block consists of the 2^m numbers i 2^m, i 2^m + 1, i 2^m + 2, … , ( i + 1)2^m − 1. For any subset S ⊆ Sn, define B( S, m) to be the set of size-2^m blocks of Sn containing some element of S. As an example, when n = 3, m = 1, and S = {1, 4, 5}, then B( S, m) consists of blocks 0 (since 1 is in the 0th block) and 2 (since both 4 and 5 belong to block 2).

c. Let r be the rank of the lower left ( n − m) × m submatrix of A, that is, the matrix formed by taking the intersection of the bottom n − m rows and the leftmost m columns of A. Let S be any size-2^m block of Sn, and let S′ = { y : y = Ax for some x ∈ S}. Prove that | B( S′, m)| = 2^r and that for each block in B( S′, m), exactly 2^(m−r) numbers in S map to that block.

Because multiplying the zero vector by any matrix yields a zero

vector, the set of permutations of Sn defined by multiplying by n × n 0-1

matrices with full rank over GF(2) cannot include all permutations of

Sn. Let’s extend the class of permutations defined by matrix-vector multiplication to include an additive term, so that x ∈ Sn maps to Ax + c, where c is an n-bit vector and addition is performed over GF(2). For example, when A is the matrix with rows ( 1 0 ) and ( 1 1 ) and c = ( 0 1 )T, we get the following permutation πA,c: πA,c(0) = 2, πA,c(1) = 1, πA,c(2) = 0, πA,c(3) = 3. We call any permutation that maps x ∈ Sn to Ax + c, for some n × n 0-1 matrix A with full rank and some n-bit vector c, a linear permutation.

d. Use a counting argument to show that the number of linear

permutations of Sn is much less than the number of permutations of

Sn.

e. Give an example of a value of n and a permutation of Sn that cannot be achieved by any linear permutation. ( Hint: For a given

permutation, think about how multiplying a matrix by a unit vector

relates to the columns of the matrix.)

Appendix notes

Linear-algebra textbooks provide plenty of background information on

matrices. The books by Strang [422, 423] are particularly good.

Bibliography

[1] Milton Abramowitz and Irene A. Stegun, editors. Handbook of Mathematical Functions.

Dover, 1965.

[2] G. M. Adel’son-Vel’skiĭ and E. M. Landis. An algorithm for the organization of

information. Soviet Mathematics Doklady, 3(5):1259–1263, 1962.

[3] Alok Aggarwal and Jeffrey Scott Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116–1127, 1988.

[4] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. PRIMES is in P. Annals of

Mathematics, 160(2):781–793, 2004.

[5] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.

[6] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. Data Structures and Algorithms.

Addison-Wesley, 1983.

[7] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, 1993.

[8] Ravindra K. Ahuja, Kurt Mehlhorn, James B. Orlin, and Robert E. Tarjan. Faster

algorithms for the shortest path problem. Journal of the ACM, 37(2):213–223, 1990.

[9] Ravindra K. Ahuja and James B. Orlin. A fast and simple algorithm for the maximum

flow problem. Operations Research, 37(5):748–759, 1989.

[10] Ravindra K. Ahuja, James B. Orlin, and Robert E. Tarjan. Improved time bounds for the maximum flow problem. SIAM Journal on Computing, 18(5):939–954, 1989.

[11] Miklós Ajtai, Nimrod Megiddo, and Orli Waarts. Improved algorithms and analysis for secretary problems and generalizations. SIAM Journal on Discrete Mathematics, 14(1):1–

27, 2001.

[12] Selim G. Akl. The Design and Analysis of Parallel Algorithms. Prentice Hall, 1989.

[13] Mohamad Akra and Louay Bazzi. On the solution of linear recurrence equations.

Computational Optimization and Applications, 10(2):195–210, 1998.

[14] Susanne Albers. Online algorithms: A survey. Mathematical Programming, 97(1-2):3–26, 2003.

[15] Noga Alon. Generating pseudo-random permutations and maximum flow algorithms.

Information Processing Letters, 35:201–204, 1990.

[16] Arne Andersson. Balanced search trees made simple. In Proceedings of the Third Workshop on Algorithms and Data Structures, volume 709 of Lecture Notes in Computer Science, pages 60–71. Springer, 1993.

[17] Arne Andersson. Faster deterministic sorting and searching in linear space. In

Proceedings of the 37th Annual Symposium on Foundations of Computer Science, pages 135–141, 1996.

[18] Arne Andersson, Torben Hagerup, Stefan Nilsson, and Rajeev Raman. Sorting in linear time? Journal of Computer and System Sciences, 57:74–93, 1998.

[19] Tom M. Apostol. Calculus, volume 1. Blaisdell Publishing Company, second edition, 1967.

[20] Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. Thread scheduling for

multiprogrammed multiprocessors. Theory of Computing Systems, 34(2):115–144, 2001.

[21] Sanjeev Arora. Probabilistic checking of proofs and the hardness of approximation problems. PhD thesis, University of California, Berkeley, 1994.

[22] Sanjeev Arora. The approximability of NP-hard problems. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pages 337–348, 1998.

[23] Sanjeev Arora. Polynomial time approximation schemes for euclidean traveling salesman and other geometric problems. Journal of the ACM, 45(5):753–782, 1998.

[24] Sanjeev Arora and Boaz Barak. Computational Complexity: A Modern Approach.

Cambridge University Press, 2009.

[25] Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: A meta-algorithm and applications. Theory of Computing, 8(1):121–164, 2012.

[26] Sanjeev Arora and Carsten Lund. Hardness of approximations. In Dorit S. Hochbaum, editor, Approximation Algorithms for NP-Hard Problems, pages 399–446. PWS Publishing Company, 1997.

[27] Mikhail J. Atallah and Marina Blanton, editors. Algorithms and Theory of Computation Handbook, volume 1. Chapman & Hall/CRC Press, second edition, 2009.

[28] Mikhail J. Atallah and Marina Blanton, editors. Algorithms and Theory of Computation Handbook, volume 2. Chapman & Hall/CRC Press, second edition, 2009.

[29] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela, and M.

Protasi. Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties. Springer, 1999.

[30] Shai Avidan and Ariel Shamir. Seam carving for content-aware image resizing. ACM

Transactions on Graphics, 26(3), article 10, 2007.

[31] László Babai, Eugene M. Luks, and Ákos Seress. Fast management of permutation

groups I. SIAM Journal on Computing, 26(5):1310–1342, 1997.

[32] Eric Bach. Private communication, 1989.

[33] Eric Bach. Number-theoretic algorithms. In Annual Review of Computer Science, volume 4, pages 119–172. Annual Reviews, Inc., 1990.

[34] Eric Bach and Jeffrey Shallit. Algorithmic Number Theory—Volume I: Efficient

Algorithms. The MIT Press, 1996.

[35] Nikhil Bansal and Anupam Gupta. Potential-function proofs for first-order methods.

CoRR, abs/1712.04581, 2017.

[36] Hannah Bast, Daniel Delling, Andrew V. Goldberg, Matthias Müller-Hannemann,

Thomas Pajor, Peter Sanders, Dorothea Wagner, and Renato F. Werneck. Route planning

in transportation networks. In Algorithm Engineering - Selected Results and Surveys, volume 9220 of Lecture Notes in Computer Science, pages 19–80. Springer, 2016.

[37] Surender Baswana, Ramesh Hariharan, and Sandeep Sen. Improved decremental

algorithms for maintaining transitive closure and all-pairs shortest paths. Journal of Algorithms, 62(2):74–92, 2007.

[38] R. Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica, 1(4):290–306, 1972.

[39] R. Bayer and E. M. McCreight. Organization and maintenance of large ordered indexes.

Acta Informatica, 1(3):173–189, 1972.

[40] Pierre Beauchemin, Gilles Brassard, Claude Crépeau, Claude Goutier, and Carl

Pomerance. The generation of random numbers that are probably prime. Journal of

Cryptology, 1(1):53–64, 1988.

[41] L. A. Belady. A study of replacement algorithms for a virtual-storage computer. IBM

Systems Journal, 5(2):78–101, 1966.

[42] Mihir Bellare, Joe Kilian, and Phillip Rogaway. The security of cipher block chaining message authentication code. Journal of Computer and System Sciences, 61(3):362–399, 2000.

[43] Mihir Bellare and Phillip Rogaway. Random oracles are practical: A paradigm for

designing efficient protocols. In CCS ’93, Proceedings of the 1st ACM Conference on Computer and Communications Security, pages 62–73, 1993.

[44] Richard Bellman. Dynamic Programming. Princeton University Press, 1957.

[45] Richard Bellman. On a routing problem. Quarterly of Applied Mathematics, 16(1):87–90, 1958.

[46] Michael Ben-Or. Lower bounds for algebraic computation trees. In Proceedings of the Fifteenth Annual ACM Symposium on Theory of Computing, pages 80–86, 1983.

[47] Michael A. Bender, Erik D. Demaine, and Martin Farach-Colton. Cache-oblivious B-

trees. SIAM Journal on Computing, 35(2):341–358, 2005.

[48] Samuel W. Bent and John W. John. Finding the median requires 2 n comparisons. In Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing, pages 213–216, 1985.

[49] Jon L. Bentley. Writing Efficient Programs. Prentice Hall, 1982.

[50] Jon L. Bentley. More Programming Pearls: Confessions of a Coder. Addison-Wesley, 1988.

[51] Jon L. Bentley. Programming Pearls. Addison-Wesley, second edition, 1999.

[52] Jon L. Bentley, Dorothea Haken, and James B. Saxe. A general method for solving divide-and-conquer recurrences. SIGACT News, 12(3):36–44, 1980.

[53] Claude Berge. Two theorems in graph theory. Proceedings of the National Academy of Sciences, 43(9):842–844, 1957.

[54] Aditya Y. Bhargava. Grokking Algorithms: An Illustrated Guide For Programmers and Other Curious People. Manning Publications, 2016.

[55] Daniel Bienstock and Benjamin McClosky. Tightening simplex mixed-integer sets with guaranteed bounds. Optimization Online, 2008.

[56] Patrick Billingsley. Probability and Measure. John Wiley & Sons, second edition, 1986.

[57] Guy E. Blelloch. Scan Primitives and Parallel Vector Models. PhD thesis, Department of Electrical Engineering and Computer Science, MIT, 1989. Available as MIT Laboratory

for Computer Science Technical Report MIT/LCS/TR-463.

[58] Guy E. Blelloch. Programming parallel algorithms. Communications of the ACM, 39(3):85–97, 1996.

[59] Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Julian Shun. Internally deterministic parallel algorithms can be fast. In 17th ACM SIGPLAN Symposium on

Principles and Practice of Parallel Programming, pages 181–192, 2012.

[60] Guy E. Blelloch, Jeremy T. Fineman, Yan Gu, and Yihan Sun. Optimal parallel

algorithms in the binary-forking model. In Proceedings of the 32nd Annual ACM

Symposium on Parallelism in Algorithms and Architectures, pages 89–102, 2020.

[61] Guy E. Blelloch, Phillip B. Gibbons, and Yossi Matias. Provably efficient scheduling for languages with fine-grained parallelism. Journal of the ACM, 46(2):281–321, 1999.

[62] Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan.

Time bounds for selection. Journal of Computer and System Sciences, 7(4):448–461, 1973.

[63] Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. Journal of the ACM, 46(5):720–748, 1999.

[64] Robert L Bocchino, Jr., Vikram S. Adve, Sarita V. Adve, and Marc Snir. Parallel

programming must be deterministic by default. In Proceedings of the First USENIX

Conference on Hot Topics in Parallelism (HotPar), 2009.

[65] Béla Bollobás. Random Graphs. Academic Press, 1985.

[66] Leonardo Bonacci. Liber Abaci, 1202.

[67] J. A. Bondy and U. S. R. Murty. Graph Theory with Applications. American Elsevier, 1976.

[68] A. Borodin and R. El-Yaniv. Online Computation and Competitive Analysis. Cambridge University Press, 1998.

[69] Stephen P. Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[70] Gilles Brassard and Paul Bratley. Fundamentals of Algorithmics. Prentice Hall, 1996.

[71] Richard P. Brent. The parallel evaluation of general arithmetic expressions. Journal of the ACM, 21(2):201–206, 1974.

[72] Gerth Stølting Brodal. A survey on priority queues. In Andrej Brodnik, Alejandro López-Ortiz, Venkatesh Raman, and Alfredo Viola, editors, Space-Efficient Data Structures, Streams, and Algorithms: Papers in Honor of J. Ian Munro on the Occasion of His 66th Birthday, volume 8066 of Lecture Notes in Computer Science, pages 150–163. Springer, 2013.

[73] Gerth Stølting Brodal, George Lagogiannis, and Robert E. Tarjan. Strict Fibonacci heaps.

In Proceedings of the 44th Annual ACM Symposium on Theory of Computing, pages 1177–

1184, 2012.

[74] George W. Brown. Some notes on computation of games solutions. RAND Corporation Report, P-78, 1949.

[75] Sébastien Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231–357, 2015.

[76] Niv Buchbinder and Joseph Naor. The design of competitive online algorithms via a primal-dual approach. Foundations and Trends in Theoretical Computer Science, 3(2–

3):93–263, 2009.

[77] J. P. Buhler, H. W. Lenstra, Jr., and Carl Pomerance. Factoring integers with the number field sieve. In A. K. Lenstra and H. W. Lenstra, Jr., editors, The Development of the Number Field Sieve, volume 1554 of Lecture Notes in Mathematics, pages 50–94. Springer, 1993.

[78] M. Burrows and D. J. Wheeler. A block-sorting lossless data compression algorithm. SRC

Research Report 124, Digital Equipment Corporation Systems Research Center, May

1994.

[79] Neville Campbell. Recurrences. Unpublished treatise available at

https://nevillecampbell.com/Recurrences.pdf, 2020.

[80] J. Lawrence Carter and Mark N. Wegman. Universal classes of hash functions. Journal of Computer and System Sciences, 18(2):143–154, 1979.

[81] Barbara Chapman, Gabriele Jost, and Ruud van der Pas. Using OpenMP: Portable Shared Memory Parallel Programming. The MIT Press, 2007.

[82] Philippe Charles, Christian Grothoff, Vijay Saraswat, Christopher Donawa, Allan

Kielstra, Kemal Ebcioglu, Christoph Von Praun, and Vivek Sarkar. X10: An object-

oriented approach to non-uniform cluster computing. In ACM SIGPLAN Conference on

Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 519–538, 2005.

[83] Bernard Chazelle. A minimum spanning tree algorithm with inverse-Ackermann type

complexity. Journal of the ACM, 47(6):1028–1047, 2000.

[84] Ke Chen and Adrian Dumitrescu. Selection algorithms with small groups. International Journal of Foundations of Computer Science, 31(3):355–369, 2020.

[85] Guang-Ien Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, and Andrew F. Stark. Detecting data races in Cilk programs that use locks. In Proceedings of the 10th

Annual ACM Symposium on Parallel Algorithms and Architectures, pages 298–309, 1998.

[86] Joseph Cheriyan and Torben Hagerup. A randomized maximum-flow algorithm. SIAM

Journal on Computing, 24(2):203–226, 1995.

[87] Joseph Cheriyan and S. N. Maheshwari. Analysis of preflow push algorithms for

maximum network flow. SIAM Journal on Computing, 18(6):1057–1086, 1989.

[88] Boris V. Cherkassky and Andrew V. Goldberg. On implementing the push-relabel method for the maximum flow problem. Algorithmica, 19(4):390–410, 1997.

[89] Boris V. Cherkassky, Andrew V. Goldberg, and Tomasz Radzik. Shortest paths

algorithms: Theory and experimental evaluation. Mathematical Programming, 73(2):129–

174, 1996.

[90] Boris V. Cherkassky, Andrew V. Goldberg, and Craig Silverstein. Buckets, heaps, lists and monotone priority queues. SIAM Journal on Computing, 28(4):1326–1346, 1999.

[91] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23(4):493–507, 1952.

[92] Brian Christian and Tom Griffiths. Algorithms to Live By: The Computer Science of Human Decisions. Picador, 2017.

[93] Kai Lai Chung. Elementary Probability Theory with Stochastic Processes. Springer, 1974.

[94] V. Chvátal. Linear Programming. W. H. Freeman and Company, 1983.

[95] V. Chvátal, D. A. Klarner, and D. E. Knuth. Selected combinatorial research problems.

Technical Report STAN-CS-72-292, Computer Science Department, Stanford University,

1972.

[96] Alan Cobham. The intrinsic computational difficulty of functions. In Proceedings of the 1964 Congress for Logic, Methodology, and the Philosophy of Science, pages 24–30. North-Holland, 1964.

[97] H. Cohen and H. W. Lenstra, Jr. Primality testing and Jacobi sums. Mathematics of Computation, 42(165):297–330, 1984.

[98] Michael B. Cohen, Aleksander Madry, Piotr Sankowski, and Adrian Vladu. Negative-

weight shortest paths and unit capacity minimum cost flow in Õ( m 10/7 log w) time (extended abstract). In Proceedings of the 28th ACM-SIAM Symposium on Discrete

Algorithms, pages 752–771, 2017.

[99] Douglas Comer. The ubiquitous B-tree. ACM Computing Surveys, 11(2):121–137, 1979.

[100] Stephen Cook. The complexity of theorem proving procedures. In Proceedings of the Third Annual ACM Symposium on Theory of Computing, pages 151–158, 1971.

[101] James W. Cooley and John W. Tukey. An algorithm for the machine calculation of

complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965.

[102] Don Coppersmith. Modifications to the number field sieve. Journal of Cryptology, 6(3):169–180, 1993.

[103] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic

progression. Journal of Symbolic Computation, 9(3):251–280, 1990.

[104] Thomas H. Cormen. Algorithms Unlocked. The MIT Press, 2013.

[105] Thomas H. Cormen, Thomas Sundquist, and Leonard F. Wisniewski. Asymptotically

tight bounds for performing BMMC permutations on parallel disk systems. SIAM Journal on Computing, 28(1):105–136, 1998.

[106] Don Dailey and Charles E. Leiserson. Using Cilk to write multiprocessor chess programs.

In H. J. van den Herik and B. Monien, editors, Advances in Computer Games, volume 9, pages 25–52. University of Maastricht, Netherlands, 2001.

[107] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms. McGraw-Hill, 2008.

[108] Abraham de Moivre. De fractionibus algebraicis radicalitate immunibus ad fractiones simpliciores reducendis, deque summandis terminis quarundam serierum aequali

intervallo a se distantibus. Philosophical Transactions, 32(373):162–168, 1722.

[109] Erik D. Demaine, Dion Harmon, John Iacono, and Mihai Pǎtraşcu. Dynamic optimality

—almost. SIAM Journal on Computing, 37(1):240–251, 2007.

[110] Camil Demetrescu, David Eppstein, Zvi Galil, and Giuseppe F. Italiano. Dynamic graph algorithms. In Mikhail J. Atallah and Marina Blanton, editors, Algorithms and Theory of Computation Handbook, chapter 9, pages 9-1–9-28. Chapman & Hall/CRC, second edition, 2009.

[111] Camil Demetrescu and Giuseppe F. Italiano. Fully dynamic all pairs shortest paths with real edge weights. Journal of Computer and System Sciences, 72(5):813–837, 2006.

[112] Eric V. Denardo and Bennett L. Fox. Shortest-route methods: 1. Reaching, pruning, and buckets. Operations Research, 27(1):161–186, 1979.

[113] Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. A reliable randomized algorithm for the closest-pair problem. Journal of Algorithms, 25(1):19–51, 1997.

[114] Martin Dietzfelbinger, Anna Karlin, Kurt Mehlhorn, Friedhelm Meyer auf der Heide, Hans Rohnert, and Robert E. Tarjan. Dynamic perfect hashing: Upper and lower bounds.

SIAM Journal on Computing, 23(4):738–761, 1994.

[115] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE

Transactions on Information Theory, IT-22(6):644–654, 1976.

[116] Edsger W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1(1):269–271, 1959.

[117] Edsger W. Dijkstra. A Discipline of Programming. Prentice-Hall, 1976.

[118] Dimitar Dimitrov, Martin Vechev, and Vivek Sarkar. Race detection in two dimensions.

ACM Transactions on Parallel Computing, 4(4):1–22, 2018.

[119] E. A. Dinic. Algorithm for solution of a problem of maximum flow in a network with power estimation. Soviet Mathematics Doklady, 11(5):1277–1280, 1970.

[120] Brandon Dixon, Monika Rauch, and Robert E. Tarjan. Verification and sensitivity

analysis of minimum spanning trees in linear time. SIAM Journal on Computing,

21(6):1184–1192, 1992.

[121] John D. Dixon. Factorization and primality tests. The American Mathematical Monthly, 91(6):333–352, 1984.

[122] Dorit Dor, Johan Håstad, Staffan Ulfberg, and Uri Zwick. On lower bounds for selecting the median. SIAM Journal on Discrete Mathematics, 14(3):299–311, 2001.

[123] Dorit Dor and Uri Zwick. Selecting the median. SIAM Journal on Computing, 28(5):1722–1758, 1999.

[124] Dorit Dor and Uri Zwick. Median selection requires (2 + ϵ) n comparisons. SIAM Journal on Discrete Mathematics, 14(3):312–325, 2001.

[125] Alvin W. Drake. Fundamentals of Applied Probability Theory. McGraw-Hill, 1967.

[126] James R. Driscoll, Neil Sarnak, Daniel D. Sleator, and Robert E. Tarjan. Making data structures persistent. Journal of Computer and System Sciences, 38(1):86–124, 1989.

[127] Ran Duan, Seth Pettie, and Hsin-Hao Su. Scaling algorithms for weighted matching in general graphs. ACM Transactions on Algorithms, 14(1):8:1–8:35, 2018.

[128] Richard Durstenfeld. Algorithm 235 (RANDOM PERMUTATION). Communications of

the ACM, 7(7):420, 1964.

[129] Derek L. Eager, John Zahorjan, and Edward D. Lazowska. Speedup versus efficiency in parallel systems. IEEE Transactions on Computers, 38(3):408–423, 1989.

[130] Jack Edmonds. Paths, trees, and flowers. Canadian Journal of Mathematics, 17:449–467, 1965.

[131] Jack Edmonds. Matroids and the greedy algorithm. Mathematical Programming, 1(1):127–136, 1971.

[132] Jack Edmonds and Richard M. Karp. Theoretical improvements in the algorithmic

efficiency for network flow problems. Journal of the ACM, 19(2):248–264, 1972.

[133] Jeff Edmonds. How To Think About Algorithms. Cambridge University Press, 2008.

[134] Mourad Elloumi and Albert Y. Zomaya, editors. Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications. John Wiley & Sons, 2011.

[135] Jeff Erickson. Algorithms. https://archive.org/details/Algorithms-Jeff-Erickson, 2019.

[136] Martin Erwig. Once Upon an Algorithm: How Stories Explain Computing. The MIT Press, 2017.

[137] Shimon Even. Graph Algorithms. Computer Science Press, 1979.

[138] Shimon Even and Yossi Shiloach. An on-line edge-deletion problem. Journal of the ACM, 28(1):1–4, 1981.

[139] William Feller. An Introduction to Probability Theory and Its Applications. John Wiley & Sons, third edition, 1968.

[140] Mingdong Feng and Charles E. Leiserson. Efficient detection of determinacy races in Cilk programs. In Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 1–11, 1997.

[141] Amos Fiat, Richard M. Karp, Michael Luby, Lyle A. McGeoch, Daniel Dominic Sleator, and Neal E. Young. Competitive paging algorithms. Journal of Algorithms, 12(4):685–699,

1991.

[142] Amos Fiat and Gerhard J. Woeginger, editors. Online Algorithms, The State of the Art, volume 1442 of Lecture Notes in Computer Science. Springer, 1998.

[143] Sir Ronald A. Fisher and Frank Yates. Statistical Tables for Biological, Agricultural and Medical Research. Hafner Publishing Company, fifth edition, 1957.

[144] Robert W. Floyd. Algorithm 97 (SHORTEST PATH). Communications of the ACM, 5(6):345, 1962.

[145] Robert W. Floyd. Algorithm 245 (TREESORT). Communications of the ACM, 7(12):701, 1964.

[146] Robert W. Floyd. Permuting information in idealized two-level storage. In Raymond E. Miller and James W. Thatcher, editors, Complexity of Computer Computations, pages 105–109. Plenum Press, 1972.

[147] Robert W. Floyd and Ronald L. Rivest. Expected time bounds for selection. Communications of the ACM, 18(3):165–172, 1975.

[148] L. R. Ford. Network Flow Theory. RAND Corporation, Santa Monica, CA, 1956.

[149] Lester R. Ford, Jr. and D. R. Fulkerson. Flows in Networks. Princeton University Press, 1962.

[150] Lester R. Ford, Jr. and Selmer M. Johnson. A tournament problem. The American Mathematical Monthly, 66(5):387–389, 1959.

[151] E. W. Forgy. Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics, 21(3):768–769, 1965.

[152] Lance Fortnow. The Golden Ticket: P, NP, and the Search for the Impossible. Princeton University Press, 2013.

[153] Michael L. Fredman. New bounds on the complexity of the shortest path problem. SIAM Journal on Computing, 5(1):83–89, 1976.

[154] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a sparse table with O(1) worst case access time. Journal of the ACM, 31(3):538–544, 1984.

[155] Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In Proceedings of the Twenty First Annual ACM Symposium on Theory of Computing, pages 345–354, 1989.

[156] Michael L. Fredman and Robert E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. Journal of the ACM, 34(3):596–615, 1987.

[157] Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic bound with fusion trees. Journal of Computer and System Sciences, 47(3):424–436, 1993.

[158] Michael L. Fredman and Dan E. Willard. Trans-dichotomous algorithms for minimum spanning trees and shortest paths. Journal of Computer and System Sciences, 48(3):533–551, 1994.

[159] Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.

[160] Matteo Frigo, Pablo Halpern, Charles E. Leiserson, and Stephen Lewin-Berlin. Reducers and other Cilk++ hyperobjects. In Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, pages 79–90, 2009.

[161] Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 93(2):216–231, 2005.

[162] Hannah Fry. Hello World: Being Human in the Age of Algorithms. W. W. Norton & Company, 2018.

[163] Harold N. Gabow. Path-based depth-first search for strong and biconnected components. Information Processing Letters, 74(3–4):107–114, 2000.

[164] Harold N. Gabow. The weighted matching approach to maximum cardinality matching. Fundamenta Informaticae, 154(1–4):109–130, 2017.

[165] Harold N. Gabow, Z. Galil, T. Spencer, and Robert E. Tarjan. Efficient algorithms for finding minimum spanning trees in undirected and directed graphs. Combinatorica, 6(2):109–122, 1986.

[166] Harold N. Gabow and Robert E. Tarjan. A linear-time algorithm for a special case of disjoint set union. Journal of Computer and System Sciences, 30(2):209–221, 1985.

[167] Harold N. Gabow and Robert E. Tarjan. Faster scaling algorithms for network problems. SIAM Journal on Computing, 18(5):1013–1036, 1989.

[168] Harold N. Gabow and Robert Endre Tarjan. Faster scaling algorithms for general graph-matching problems. Journal of the ACM, 38(4):815–853, 1991.

[169] D. Gale and L. S. Shapley. College admissions and the stability of marriage. American Mathematical Monthly, 69(1):9–15, 1962.

[170] Zvi Galil and Oded Margalit. All pairs shortest distances for graphs with small integer length edges. Information and Computation, 134(2):103–139, 1997.

[171] Zvi Galil and Oded Margalit. All pairs shortest paths for graphs with small integer length edges. Journal of Computer and System Sciences, 54(2):243–254, 1997.

[172] Zvi Galil and Kunsoo Park. Dynamic programming with convexity, concavity and sparsity. Theoretical Computer Science, 92(1):49–76, 1992.

[173] Zvi Galil and Joel Seiferas. Time-space-optimal string matching. Journal of Computer and System Sciences, 26(3):280–294, 1983.

[174] Igal Galperin and Ronald L. Rivest. Scapegoat trees. In Proceedings of the 4th ACM-SIAM Symposium on Discrete Algorithms, pages 165–174, 1993.

[175] Michael R. Garey, R. L. Graham, and J. D. Ullman. Worst-case analysis of memory allocation algorithms. In Proceedings of the Fourth Annual ACM Symposium on Theory of Computing, pages 143–150, 1972.

[176] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.

[177] Naveen Garg and Jochen Könemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM Journal on Computing, 37(2):630–652, 2007.

[178] Saul Gass. Linear Programming: Methods and Applications. International Thomson Publishing, fourth edition, 1975.

[179] Fănică Gavril. Algorithms for minimum coloring, maximum clique, minimum covering by cliques, and maximum independent set of a chordal graph. SIAM Journal on Computing, 1(2):180–187, 1972.

[180] Alan George and Joseph W-H Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice Hall, 1981.

[181] E. N. Gilbert and E. F. Moore. Variable-length binary encodings. Bell System Technical Journal, 38(4):933–967, 1959.

[182] Ashish Goel, Sanjeev Khanna, Daniel H. Larkin, and Robert E. Tarjan. Disjoint set union with randomized linking. In Proceedings of the 25th ACM-SIAM Symposium on Discrete Algorithms, pages 1005–1017, 2014.

[183] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6):1115–1145, 1995.

[184] Michel X. Goemans and David P. Williamson. The primal-dual method for approximation algorithms and its application to network design problems. In Dorit S. Hochbaum, editor, Approximation Algorithms for NP-Hard Problems, pages 144–191. PWS Publishing Company, 1997.

[185] Andrew V. Goldberg. Efficient Graph Algorithms for Sequential and Parallel Computers. PhD thesis, Department of Electrical Engineering and Computer Science, MIT, 1987.

[186] Andrew V. Goldberg. Scaling algorithms for the shortest paths problem. SIAM Journal on Computing, 24(3):494–504, 1995.

[187] Andrew V. Goldberg and Satish Rao. Beyond the flow decomposition barrier. Journal of the ACM, 45(5):783–797, 1998.

[188] Andrew V. Goldberg and Robert E. Tarjan. A new approach to the maximum flow problem. Journal of the ACM, 35(4):921–940, 1988.

[189] D. Goldfarb and M. J. Todd. Linear programming. In G. L. Nemhauser, A. H. G. Rinnooy-Kan, and M. J. Todd, editors, Handbooks in Operations Research and Management Science, Vol. 1, Optimization, pages 73–170. Elsevier Science Publishers, 1989.

[190] Shafi Goldwasser and Silvio Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270–299, 1984.

[191] Shafi Goldwasser, Silvio Micali, and Ronald L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2):281–308, 1988.

[192] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, third edition, 1996.

[193] G. H. Gonnet and R. Baeza-Yates. Handbook of Algorithms and Data Structures in Pascal and C. Addison-Wesley, second edition, 1991.

[194] Rafael C. Gonzalez and Richard E. Woods. Digital Image Processing. Addison-Wesley, 1992.

[195] Michael T. Goodrich and Roberto Tamassia. Algorithm Design: Foundations, Analysis, and Internet Examples. John Wiley & Sons, 2001.

[196] Michael T. Goodrich and Roberto Tamassia. Data Structures and Algorithms in Java. John Wiley & Sons, sixth edition, 2014.

[197] Ronald L. Graham. Bounds for certain multiprocessor anomalies. Bell System Technical Journal, 45(9):1563–1581, 1966.

[198] Ronald L. Graham and Pavol Hell. On the history of the minimum spanning tree problem. Annals of the History of Computing, 7(1):43–57, 1985.

[199] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics. Addison-Wesley, second edition, 1994.

[200] David Gries. The Science of Programming. Springer, 1981.

[201] M. Grötschel, László Lovász, and Alexander Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer, 1988.

[202] Leo J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In Proceedings of the 19th Annual Symposium on Foundations of Computer Science, pages 8–21, 1978.

[203] Dan Gusfield and Robert W. Irving. The Stable Marriage Problem: Structure and Algorithms. The MIT Press, 1989.

[204] Gregory Gutin and Abraham P. Punnen, editors. The Traveling Salesman Problem and Its Variations. Kluwer Academic Publishers, 2002.

[205] Torben Hagerup. Improved shortest paths on the word RAM. In Proceedings of the 27th International Colloquium on Automata, Languages and Programming, ICALP 2000, volume 1853 of Lecture Notes in Computer Science, pages 61–72. Springer, 2000.

[206] H. Halberstam and R. E. Ingram, editors. The Mathematical Papers of Sir William Rowan Hamilton, volume III (Algebra). Cambridge University Press, 1967.

[207] Yijie Han. Improved fast integer sorting in linear space. Information and Computation, 170(1):81–94, 2001.

[208] Frank Harary. Graph Theory. Addison-Wesley, 1969.

[209] Gregory C. Harfst and Edward M. Reingold. A potential-based amortized analysis of the union-find data structure. SIGACT News, 31(3):86–95, 2000.

[210] J. Hartmanis and R. E. Stearns. On the computational complexity of algorithms. Transactions of the American Mathematical Society, 117:285–306, 1965.

[211] Michael T. Heideman, Don H. Johnson, and C. Sidney Burrus. Gauss and the history of the Fast Fourier Transform. IEEE ASSP Magazine, 1(4):14–21, 1984.

[212] Monika R. Henzinger and Valerie King. Fully dynamic biconnectivity and transitive closure. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pages 664–672, 1995.

[213] Monika R. Henzinger and Valerie King. Randomized fully dynamic graph algorithms with polylogarithmic time per operation. Journal of the ACM, 46(4):502–516, 1999.

[214] Monika R. Henzinger, Satish Rao, and Harold N. Gabow. Computing vertex connectivity: New bounds from old techniques. Journal of Algorithms, 34(2):222–250, 2000.

[215] Nicholas J. Higham. Exploiting fast matrix multiplication within the level 3 BLAS. ACM Transactions on Mathematical Software, 16(4):352–368, 1990.

[216] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, second edition, 2002.

[217] W. Daniel Hillis and Guy L. Steele, Jr. Data parallel algorithms. Communications of the ACM, 29(12):1170–1183, 1986.

[218] C. A. R. Hoare. Algorithm 63 (PARTITION) and algorithm 65 (FIND). Communications of the ACM, 4(7):321–322, 1961.

[219] C. A. R. Hoare. Quicksort. The Computer Journal, 5(1):10–15, 1962.

[220] Dorit S. Hochbaum. Efficient bounds for the stable set, vertex cover and set packing problems. Discrete Applied Mathematics, 6(3):243–254, 1983.

[221] Dorit S. Hochbaum, editor. Approximation Algorithms for NP-Hard Problems. PWS Publishing Company, 1997.

[222] W. Hoeffding. On the distribution of the number of successes in independent trials. Annals of Mathematical Statistics, 27(3):713–721, 1956.

[223] Micha Hofri. Probabilistic Analysis of Algorithms. Springer, 1987.

[224] John E. Hopcroft and Richard M. Karp. An n^{5/2} algorithm for maximum matchings in bipartite graphs. SIAM Journal on Computing, 2(4):225–231, 1973.

[225] John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, third edition, 2006.

[226] John E. Hopcroft and Robert E. Tarjan. Efficient algorithms for graph manipulation. Communications of the ACM, 16(6):372–378, 1973.

[227] John E. Hopcroft and Jeffrey D. Ullman. Set merging algorithms. SIAM Journal on Computing, 2(4):294–303, 1973.

[228] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.

[229] Juraj Hromkovič. Algorithmics for Hard Problems: Introduction to Combinatorial Optimization, Randomization, Approximation, and Heuristics. Springer-Verlag, 2001.

[230] T. C. Hu and M. T. Shing. Computation of matrix chain products. Part I. SIAM Journal on Computing, 11(2):362–373, 1982.

[231] T. C. Hu and M. T. Shing. Computation of matrix chain products. Part II. SIAM Journal on Computing, 13(2):228–251, 1984.

[232] T. C. Hu and A. C. Tucker. Optimal computer search trees and variable-length alphabetic codes. SIAM Journal on Applied Mathematics, 21(4):514–532, 1971.

[233] David A. Huffman. A method for the construction of minimum-redundancy codes. Proceedings of the IRE, 40(9):1098–1101, 1952.

[234] Oscar H. Ibarra and Chul E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. Journal of the ACM, 22(4):463–468, 1975.

[235] E. J. Isaac and R. C. Singleton. Sorting by address calculation. Journal of the ACM, 3(3):169–174, 1956.

[236] David S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9(3):256–278, 1974.

[237] David S. Johnson. The NP-completeness column: An ongoing guide—The tale of the second prover. Journal of Algorithms, 13(3):502–524, 1992.

[238] Donald B. Johnson. Efficient algorithms for shortest paths in sparse networks. Journal of the ACM, 24(1):1–13, 1977.

[239] Richard Johnsonbaugh and Marcus Schaefer. Algorithms. Pearson Prentice Hall, 2004.

[240] Neil C. Jones and Pavel Pevzner. An Introduction to Bioinformatics Algorithms. The MIT Press, 2004.

[241] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. A local search approximation algorithm for k-means clustering. Computational Geometry, 28:89–112, 2004.

[242] A. Karatsuba and Yu. Ofman. Multiplication of multidigit numbers on automata. Soviet Physics—Doklady, 7(7):595–596, 1963. Translation of an article in Doklady Akademii Nauk SSSR, 145(2), 1962.

[243] David R. Karger, Philip N. Klein, and Robert E. Tarjan. A randomized linear-time algorithm to find minimum spanning trees. Journal of the ACM, 42(2):321–328, 1995.

[244] David R. Karger, Daphne Koller, and Steven J. Phillips. Finding the hidden path: Time bounds for all-pairs shortest paths. SIAM Journal on Computing, 22(6):1199–1217, 1993.

[245] Juha Kärkkäinen, Peter Sanders, and Stefan Burkhardt. Linear work suffix array construction. Journal of the ACM, 53(6):918–936, 2006.

[246] Howard Karloff. Linear Programming. Birkhäuser, 1991.

[247] N. Karmarkar. A new polynomial-time algorithm for linear programming. Combinatorica, 4(4):373–395, 1984.

[248] Richard M. Karp. Reducibility among combinatorial problems. In Raymond E. Miller and James W. Thatcher, editors, Complexity of Computer Computations, pages 85–103. Plenum Press, 1972.

[249] Richard M. Karp. An introduction to randomized algorithms. Discrete Applied Mathematics, 34(1–3):165–201, 1991.

[250] Richard M. Karp and Michael O. Rabin. Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2):249–260, 1987.

[251] A. V. Karzanov. Determining the maximal flow in a network by the method of preflows. Soviet Mathematics Doklady, 15(2):434–437, 1974.

[252] Toru Kasai, Gunho Lee, Hiroki Arimura, Setsuo Arikawa, and Kunsoo Park. Linear-time longest-common-prefix computation in suffix arrays and its applications. In Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching, volume 2089, pages 181–192. Springer-Verlag, 2001.

[253] Jonathan Katz and Yehuda Lindell. Introduction to Modern Cryptography. CRC Press, second edition, 2015.

[254] Valerie King. A simpler minimum spanning tree verification algorithm. Algorithmica, 18(2):263–270, 1997.

[255] Valerie King, Satish Rao, and Robert E. Tarjan. A faster deterministic maximum flow algorithm. Journal of Algorithms, 17(3):447–474, 1994.

[256] Philip N. Klein and Neal E. Young. Approximation algorithms for NP-hard optimization problems. In CRC Handbook on Algorithms, pages 34-1–34-19. CRC Press, 1999.

[257] Jon Kleinberg and Éva Tardos. Algorithm Design. Addison-Wesley, 2006.

[258] Robert D. Kleinberg. A multiple-choice secretary algorithm with applications to online auctions. In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms, pages 630–631, 2005.

[259] Donald E. Knuth. Fundamental Algorithms, volume 1 of The Art of Computer Programming. Addison-Wesley, third edition, 1997.

[260] Donald E. Knuth. Seminumerical Algorithms, volume 2 of The Art of Computer Programming. Addison-Wesley, third edition, 1997.

[261] Donald E. Knuth. Sorting and Searching, volume 3 of The Art of Computer Programming. Addison-Wesley, second edition, 1998.

[262] Donald E. Knuth. Combinatorial Algorithms, volume 4A of The Art of Computer Programming. Addison-Wesley, 2011.

[263] Donald E. Knuth. Satisfiability, volume 4, fascicle 6 of The Art of Computer Programming. Addison-Wesley, 2015.

[264] Donald E. Knuth. Optimum binary search trees. Acta Informatica, 1(1):14–25, 1971.

[265] Donald E. Knuth. Big omicron and big omega and big theta. SIGACT News, 8(2):18–23, 1976.

[266] Donald E. Knuth. Stable Marriage and Its Relation to Other Combinatorial Problems: An Introduction to the Mathematical Analysis of Algorithms, volume 10 of CRM Proceedings and Lecture Notes. American Mathematical Society, 1997.

[267] Donald E. Knuth, James H. Morris, Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350, 1977.

[268] Mykel J. Kochenderfer and Tim A. Wheeler. Algorithms for Optimization. The MIT Press, 2019.

[269] J. Komlós. Linear verification for spanning trees. Combinatorica, 5(1):57–65, 1985.

[270] Dexter C. Kozen. The Design and Analysis of Algorithms. Springer, 1992.

[271] David W. Krumme, George Cybenko, and K. N. Venkataraman. Gossiping in minimal time. SIAM Journal on Computing, 21(1):111–139, 1992.

[272] Joseph B. Kruskal, Jr. On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 7(1):48–50, 1956.

[273] Harold W. Kuhn. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2:83–97, 1955.

[274] William Kuszmaul and Charles E. Leiserson. Floors and ceilings in divide-and-conquer recurrences. In Proceedings of the 3rd SIAM Symposium on Simplicity in Algorithms, pages 133–141, 2021.

[275] Leslie Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, C-28(9):690–691, 1979.

[276] Eugene L. Lawler. Combinatorial Optimization: Networks and Matroids. Holt, Rinehart, and Winston, 1976.

[277] Eugene L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys, editors. The Traveling Salesman Problem. John Wiley & Sons, 1985.

[278] François Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 2014 International Symposium on Symbolic and Algebraic Computation, (ISSAC), pages 296–303, 2014.

[279] Doug Lea. A Java fork/join framework. In ACM 2000 Conference on Java Grande, pages 36–43, 2000.

[280] C. Y. Lee. An algorithm for path connection and its applications. IRE Transactions on Electronic Computers, EC-10(3):346–365, 1961.

[281] Edward A. Lee. The problem with threads. IEEE Computer, 39(3):33–42, 2006.

[282] I-Ting Angelina Lee, Charles E. Leiserson, Tao B. Schardl, Zhunping Zhang, and Jim Sukha. On-the-fly pipeline parallelism. ACM Transactions on Parallel Computing, 2(3):17:1–17:42, 2015.

[283] I-Ting Angelina Lee and Tao B. Schardl. Efficient race detection for reducer hyperobjects. ACM Transactions on Parallel Computing, 4(4):1–40, 2018.

[284] Mun-Kyu Lee, Pierre Michaud, Jeong Seop Sim, and Daehun Nyang. A simple proof of optimality for the MIN cache replacement policy. Information Processing Letters, 116(2):168–170, 2016.

[285] Yin Tat Lee and Aaron Sidford. Path finding methods for linear programming: Solving linear programs in Õ(√rank) iterations and faster algorithms for maximum flow. In Proceedings of the 55th Annual Symposium on Foundations of Computer Science, pages 424–433, 2014.

[286] Tom Leighton. Tight bounds on the complexity of parallel sorting. IEEE Transactions on Computers, C-34(4):344–354, 1985.

[287] Tom Leighton. Notes on better master theorems for divide-and-conquer recurrences. Class notes. Available at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.1636, 1996.

[288] Tom Leighton and Satish Rao. Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM, 46(6):787–832, 1999.

[289] Daan Leijen and Judd Hall. Optimize managed code for multi-core machines. MSDN Magazine, 2007.

[290] Charles E. Leiserson. The Cilk++ concurrency platform. Journal of Supercomputing, 51(3):244–257, March 2010.

[291] Charles E. Leiserson. Cilk. In David Padua, editor, Encyclopedia of Parallel Computing, pages 273–288. Springer, 2011.

[292] Charles E. Leiserson, Tao B. Schardl, and Jim Sukha. Deterministic parallel random-number generation for dynamic-multithreading platforms. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), pages 193–204, 2012.

[293] Charles E. Leiserson, Neil C. Thompson, Joel S. Emer, Bradley C. Kuszmaul, Butler W. Lampson, Daniel Sanchez, and Tao B. Schardl. There’s plenty of room at the Top: What will drive computer performance after Moore’s law? Science, 368(6495), 2020.

[294] Debra A. Lelewer and Daniel S. Hirschberg. Data compression. ACM Computing Surveys, 19(3):261–296, 1987.

[295] A. K. Lenstra, H. W. Lenstra, Jr., M. S. Manasse, and J. M. Pollard. The number field sieve. In A. K. Lenstra and H. W. Lenstra, Jr., editors, The Development of the Number Field Sieve, volume 1554 of Lecture Notes in Mathematics, pages 11–42. Springer, 1993.

[296] H. W. Lenstra, Jr. Factoring integers with elliptic curves. Annals of Mathematics, 126(3):649–673, 1987.

[297] L. A. Levin. Universal sequential search problems. Problems of Information Transmission, 9(3):265–266, 1973. Translated from the original Russian article in Problemy Peredachi Informatsii 9(3): 115–116, 1973.

[298] Anany Levitin. Introduction to the Design & Analysis of Algorithms. Addison-Wesley, third edition, 2011.

[299] Harry R. Lewis and Christos H. Papadimitriou. Elements of the Theory of Computation. Prentice Hall, second edition, 1998.

[300] Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2(4):285–318, 1988.

[301] Nick Littlestone and Manfred K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212–261, 1994.

[302] C. L. Liu. Introduction to Combinatorial Mathematics. McGraw-Hill, 1968.

[303] Yang P. Liu and Aaron Sidford. Faster energy maximization for faster maximum flow. In Proceedings of the 52nd Annual ACM Symposium on Theory of Computing, pages 803–814, 2020.

[304] S. P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.

[305] Panos Louridas. Real-World Algorithms: A Beginner’s Guide. The MIT Press, 2017.

[306] László Lovász and Michael D. Plummer. Matching Theory, volume 121 of Annals of Discrete Mathematics. North Holland, 1986.

[307] John MacCormick. 9 Algorithms That Changed the Future: The Ingenious Ideas That Drive Today’s Computers. Princeton University Press, 2012.

[308] Aleksander Madry. Navigating central path with electrical flows: From flows to matchings, and back. In Proceedings of the 54th Annual Symposium on Foundations of Computer

Science, pages 253–262, 2013.

[309] Bruce M. Maggs and Serge A. Plotkin. Minimum-cost spanning tree as a path-finding problem. Information Processing Letters, 26(6):291–293, 1988.

[310] M. Mahajan, P. Nimbhorkar, and K. Varadarajan. The planar k-means problem is NP-hard. In S. Das and R. Uehara, editors, WALCOM 2009: Algorithms and Computation, volume 5431 of Lecture Notes in Computer Science, pages 274–285. Springer, 2009.

[311] Michael Main. Data Structures and Other Objects Using Java. Addison-Wesley, 1999.

[312] Udi Manber and Gene Myers. Suffix arrays: A new method for on-line string searches. SIAM Journal on Computing, 22(5):935–948, 1993.

[313] David F. Manlove. Algorithmics of Matching Under Preferences, volume 2 of Series on Theoretical Computer Science. World Scientific, 2013.

[314] Giovanni Manzini. An analysis of the Burrows-Wheeler transform. Journal of the ACM, 48(3):407–430, 2001.

[315] Mario Andrea Marchisio, editor. Computational Methods in Synthetic Biology. Humana Press, 2015.

[316] William J. Masek and Michael S. Paterson. A faster algorithm computing string edit distances. Journal of Computer and System Sciences, 20(1):18–31, 1980.

[317] Yu. V. Matiyasevich. Real-time recognition of the inclusion relation. Journal of Soviet Mathematics, 1(1):64–70, 1973. Translated from the original Russian article in Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Institute im. V. A. Steklova Akademii Nauk SSSR 20: 104–114, 1971.

[318] H. A. Maurer, Th. Ottmann, and H.-W. Six. Implementing dictionaries using binary trees of very small height. Information Processing Letters, 5(1):11–14, 1976.

[319] Ernst W. Mayr, Hans Jürgen Prömel, and Angelika Steger, editors. Lectures on Proof Verification and Approximation Algorithms, volume 1367 of Lecture Notes in Computer Science. Springer, 1998.

[320] Catherine C. McGeoch. All pairs shortest paths and the essential subgraph. Algorithmica, 13(5):426–441, 1995.

[321] Catherine C. McGeoch. A Guide to Experimental Algorithmics. Cambridge University Press, 2012.

[322] Andrew McGregor. Graph stream algorithms: A survey. SIGMOD Record, 43(1):9–20, 2014.

[323] M. D. McIlroy. A killer adversary for quicksort. Software—Practice and Experience, 29(4):341–344, 1999.

[324] Kurt Mehlhorn and Stefan Näher. LEDA: A Platform for Combinatorial and Geometric Computing. Cambridge University Press, 1999.

[325] Kurt Mehlhorn and Peter Sanders. Algorithms and Data Structures: The Basic Toolbox. Springer, 2008.

[326] Dinesh P. Mehta and Sartaj Sahni. Handbook of Data Structures and Applications. Chapman and Hall/CRC, second edition, 2018.

[327] Gary L. Miller. Riemann’s hypothesis and tests for primality. Journal of Computer and System Sciences, 13(3):300–317, 1976.

[328] Marvin Minsky and Seymour A. Papert. Perceptrons. The MIT Press, 1969.

[329] John C. Mitchell. Foundations for Programming Languages. The MIT Press, 1996.

[330] Joseph S. B. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems. SIAM Journal on Computing, 28(4):1298–1309, 1999.

[331] Michael Mitzenmacher and Eli Upfal. Probability and Computing. Cambridge University Press, second edition, 2017.

[332] Louis Monier. Algorithmes de Factorisation D’Entiers. PhD thesis, L’Université Paris-Sud, 1980.

[333] Louis Monier. Evaluation and comparison of two efficient probabilistic primality testing algorithms. Theoretical Computer Science, 12(1):97–108, 1980.

[334] Edward F. Moore. The shortest path through a maze. In Proceedings of the International Symposium on the Theory of Switching, pages 285–292. Harvard University Press, 1959.

[335] Rajeev Motwani, Joseph (Seffi) Naor, and Prabhakar Raghavan. Randomized approximation algorithms in combinatorial optimization. In Dorit Hochbaum, editor, Approximation Algorithms for NP-Hard Problems, chapter 11, pages 447–481. PWS Publishing Company, 1997.

[336] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[337] James Munkres. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics, 5(1):32–38, 1957.

[338] J. I. Munro and V. Raman. Fast stable in-place sorting with O(n) data moves. Algorithmica, 16(2):151–160, 1996.

[339] Yoichi Muraoka and David J. Kuck. On the time required for a sequence of matrix products. Communications of the ACM, 16(1):22–26, 1973.

[340] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, 2012.

[341] S. Muthukrishnan. Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science, 1(2), 2005.

[342] Richard Neapolitan. Foundations of Algorithms. Jones & Bartlett Learning, fifth edition, 2014.

[343] Yurii Nesterov. Introductory Lectures on Convex Optimization: A Basic Course, volume 87 of Applied Optimization. Springer, 2004.

[344] J. Nievergelt and E. M. Reingold. Binary search trees of bounded balance. SIAM Journal on Computing, 2(1):33–43, 1973.

[345] Ivan Niven and Herbert S. Zuckerman. An Introduction to the Theory of Numbers. John Wiley & Sons, fourth edition, 1980.

[346] National Institute of Standards and Technology. Hash functions. https://csrc.nist.gov/projects/hash-functions, 2019.

[347] Alan V. Oppenheim and Ronald W. Schafer, with John R. Buck. Discrete-Time Signal Processing. Prentice Hall, second edition, 1998.

[348] Alan V. Oppenheim and Alan S. Willsky, with S. Hamid Nawab. Signals and Systems. Prentice Hall, second edition, 1997.

[349] James B. Orlin. A polynomial time primal network simplex algorithm for minimum cost flows. Mathematical Programming, 78(1):109–129, 1997.

[350] James B. Orlin. Max flows in O(nm) time, or better. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing, pages 765–774, 2013.

[351] Anna Pagh, Rasmus Pagh, and Milan Ruzic. Linear probing with constant independence. https://arxiv.org/abs/cs/0612055, 2006.

[352] Christos H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.

[353] Christos H. Papadimitriou and Kenneth Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice Hall, 1982.

[354] Michael S. Paterson. Progress in selection. In Proceedings of the Fifth Scandinavian Workshop on Algorithm Theory, pages 368–379, 1996.

[355] Seth Pettie. A new approach to all-pairs shortest paths on real-weighted graphs. Theoretical Computer Science, 312(1):47–74, 2004.

[356] Seth Pettie and Vijaya Ramachandran. An optimal minimum spanning tree algorithm. Journal of the ACM, 49(1):16–34, 2002.

[357] Seth Pettie and Vijaya Ramachandran. A shortest path algorithm for real-weighted undirected graphs. SIAM Journal on Computing, 34(6):1398–1431, 2005.

[358] Steven Phillips and Jeffery Westbrook. Online load balancing and network flow. Algorithmica, 21(3):245–261, 1998.

[359] Serge A. Plotkin, David B. Shmoys, and Éva Tardos. Fast approximation algorithms for fractional packing and covering problems. Mathematics of Operations Research, 20:257–301, 1995.

[360] J. M. Pollard. Factoring with cubic integers. In A. K. Lenstra and H. W. Lenstra, Jr., editors, The Development of the Number Field Sieve, volume 1554 of Lecture Notes in Mathematics, pages 4–10. Springer, 1993.

[361] Carl Pomerance. On the distribution of pseudoprimes. Mathematics of Computation, 37(156):587–593, 1981.

[362] Carl Pomerance, editor. Proceedings of the AMS Symposia in Applied Mathematics: Computational Number Theory and Cryptography. American Mathematical Society, 1990.

[363] William K. Pratt. Digital Image Processing. John Wiley & Sons, fourth edition, 2007.

[364] Franco P. Preparata and Michael Ian Shamos. Computational Geometry: An Introduction. Springer, 1985.

[365] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes in C++: The Art of Scientific Computing. Cambridge University Press, second edition, 2002.

[366] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, third edition, 2007.

[367] R. C. Prim. Shortest connection networks and some generalizations. Bell System Technical Journal, 36(6):1389–1401, 1957.

[368] Robert L. Probert. On the additive complexity of matrix multiplication. SIAM Journal on Computing, 5(2):187–203, 1976.

[369] William Pugh. Skip lists: A probabilistic alternative to balanced trees. Communications of the ACM, 33(6):668–676, 1990.

[370] Simon J. Puglisi, W. F. Smyth, and Andrew H. Turpin. A taxonomy of suffix array construction algorithms. ACM Computing Surveys, 39(2), 2007.

[371] Paul W. Purdom, Jr. and Cynthia A. Brown. The Analysis of Algorithms. Holt, Rinehart, and Winston, 1985.

[372] Michael O. Rabin. Probabilistic algorithms. In J. F. Traub, editor, Algorithms and Complexity: New Directions and Recent Results, pages 21–39. Academic Press, 1976.

[373] Michael O. Rabin. Probabilistic algorithm for testing primality. Journal of Number Theory, 12(1):128–138, 1980.

[374] P. Raghavan and C. D. Thompson. Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica, 7(4):365–374, 1987.

[375] Rajeev Raman. Recent results on the single-source shortest paths problem. SIGACT News, 28(2):81–87, 1997.

[376] James Reinders. Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism. O’Reilly Media, Inc., 2007.

[377] Edward M. Reingold, Kenneth J. Urban, and David Gries. K-M-P string matching revisited. Information Processing Letters, 64(5):217–223, 1997.

[378] Hans Riesel. Prime Numbers and Computer Methods for Factorization, volume 126 of Progress in Mathematics. Birkhäuser, second edition, 1994.

[379] Ronald L. Rivest, M. J. B. Robshaw, R. Sidney, and Y. L. Yin. The RC6 block cipher. In First Advanced Encryption Standard (AES) Conference, 1998.

[380] Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120–126, 1978. See also U.S. Patent 4,405,829.

[381] Herbert Robbins. A remark on Stirling’s formula. American Mathematical Monthly, 62(1):26–29, 1955.

[382] Julia Robinson. An iterative method of solving a game. The Annals of Mathematics, 54(2):296–301, 1951.

[383] Arch D. Robison and Charles E. Leiserson. Cilk Plus. In Pavan Balaji, editor, Programming Models for Parallel Computing, chapter 13, pages 323–352. The MIT Press, 2015.

[384] D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis. An analysis of several heuristics for the traveling salesman problem. SIAM Journal on Computing, 6(3):563–581, 1977.

[385] Tim Roughgarden. Algorithms Illuminated, Part 1: The Basics. Soundlikeyourself Publishing, 2017.

[386] Tim Roughgarden. Algorithms Illuminated, Part 2: Graph Algorithms and Data Structures. Soundlikeyourself Publishing, 2018.

[387] Tim Roughgarden. Algorithms Illuminated, Part 3: Greedy Algorithms and Dynamic Programming. Soundlikeyourself Publishing, 2019.

[388] Tim Roughgarden. Algorithms Illuminated, Part 4: Algorithms for NP-Hard Problems. Soundlikeyourself Publishing, 2020.

[389] Salvador Roura. Improved master theorems for divide-and-conquer recurrences. Journal of the ACM, 48(2):170–205, 2001.

[390] Y. A. Rozanov. Probability Theory: A Concise Course. Dover, 1969.

[391] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson, fourth edition, 2020.

[392] S. Sahni and T. Gonzalez. P-complete approximation problems. Journal of the ACM, 23(3):555–565, 1976.

[393] Peter Sanders, Kurt Mehlhorn, Martin Dietzfelbinger, and Roman Dementiev. Sequential and Parallel Algorithms and Data Structures: The Basic Toolkit. Springer, 2019.

[394] Piotr Sankowski. Shortest paths in matrix multiplication time. In Proceedings of the 13th Annual European Symposium on Algorithms, pages 770–778, 2005.