Introduction to Algorithms
Fourth Edition
Thomas H. Cormen
Charles E. Leiserson
Ronald L. Rivest
Clifford Stein
The MIT Press
Cambridge, Massachusetts London, England
© 2022 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form or by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
The MIT Press would like to thank the anonymous peer reviewers who provided comments on drafts of this book. The generous work of academic experts is essential for establishing the authority and quality of our publications. We acknowledge with gratitude the contributions of these otherwise uncredited readers.
Names: Cormen, Thomas H., author. | Leiserson, Charles Eric, author. | Rivest, Ronald L., author. | Stein, Clifford, author.
Title: Introduction to algorithms / Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein.
Description: Fourth edition. | Cambridge, Massachusetts : The MIT Press, [2022] | Includes bibliographical references and index.
Identifiers: LCCN 2021037260 | ISBN 9780262367509
Subjects: LCSH: Computer programming. | Computer algorithms.
Classification: LCC QA76.6 .C662 2022 | DDC 005.13--dc23
LC record available at http://lccn.loc.gov/2021037260
1 The Role of Algorithms in Computing
1.2 Algorithms as a technology
3 Characterizing Running Times
3.1 O-notation, Ω-notation, and Θ-notation
3.2 Asymptotic notation: formal definitions
3.3 Standard notations and common functions
4.1 Multiplying square matrices
4.2 Strassen’s algorithm for matrix multiplication
4.3 The substitution method for solving recurrences
4.4 The recursion-tree method for solving recurrences
4.5 The master method for solving recurrences
★ 4.6 Proof of the continuous master theorem
5 Probabilistic Analysis and Randomized Algorithms
5.2 Indicator random variables
★ 5.4 Probabilistic analysis and further uses of indicator random variables
II Sorting and Order Statistics
6.2 Maintaining the heap property
7.3 A randomized version of quicksort
9 Medians and Order Statistics
9.2 Selection in expected linear time
9.3 Selection in worst-case linear time
10.1 Simple array-based data structures: arrays, matrices, stacks, queues
10.3 Representing rooted trees
12.1 What is a binary search tree?
12.2 Querying a binary search tree
13.1 Properties of red-black trees
IV Advanced Design and Analysis Techniques
14.2 Matrix-chain multiplication
14.3 Elements of dynamic programming
14.4 Longest common subsequence
14.5 Optimal binary search trees
15.1 An activity-selection problem
15.2 Elements of the greedy strategy
17.2 How to augment a data structure
18.2 Basic operations on B-trees
18.3 Deleting a key from a B-tree
19 Data Structures for Disjoint Sets
19.2 Linked-list representation of disjoint sets
★ 19.4 Analysis of union by rank with path compression
20 Elementary Graph Algorithms
20.1 Representations of graphs
20.5 Strongly connected components
21.1 Growing a minimum spanning tree
21.2 The algorithms of Kruskal and Prim
22 Single-Source Shortest Paths
22.1 The Bellman-Ford algorithm
22.2 Single-source shortest paths in directed acyclic graphs
22.4 Difference constraints and shortest paths
22.5 Proofs of shortest-paths properties
23.1 Shortest paths and matrix multiplication
23.2 The Floyd-Warshall algorithm
23.3 Johnson’s algorithm for sparse graphs
24.2 The Ford-Fulkerson method
24.3 Maximum bipartite matching
25 Matchings in Bipartite Graphs
25.1 Maximum bipartite matching (revisited)
25.2 The stable-marriage problem
25.3 The Hungarian algorithm for the assignment problem
26.1 The basics of fork-join parallelism
26.2 Parallel matrix multiplication
27.2 Maintaining a search list
28.1 Solving systems of linear equations
28.3 Symmetric positive-definite matrices and least-squares approximation
29.1 Linear programming formulations and algorithms
29.2 Formulating problems as linear programs
31 Number-Theoretic Algorithms
31.1 Elementary number-theoretic notions
31.4 Solving modular linear equations
31.5 The Chinese remainder theorem
31.7 The RSA public-key cryptosystem
32.1 The naive string-matching algorithm
32.3 String matching with finite automata
★ 32.4 The Knuth-Morris-Pratt algorithm
33 Machine-Learning Algorithms
33.2 Multiplicative-weights algorithms
34.2 Polynomial-time verification
34.3 NP-completeness and reducibility
35.2 The traveling-salesperson problem
35.4 Randomization and linear programming
VIII Appendix: Mathematical Background
A.1 Summation formulas and properties
C.4 The geometric and binomial distributions
★ C.5 The tails of the binomial distribution
D.1 Matrices and matrix operations
Not so long ago, anyone who had heard the word “algorithm” was
almost certainly a computer scientist or mathematician. With
computers having become prevalent in our modern lives, however, the
term is no longer esoteric. If you look around your home, you’ll find
algorithms running in the most mundane places: your microwave oven,
your washing machine, and, of course, your computer. You ask
algorithms to make recommendations to you: what music you might
like or what route to take when driving. Our society, for better or for
worse, asks algorithms to suggest sentences for convicted criminals. You
even rely on algorithms to keep you alive, or at least not to kill you: the
control systems in your car or in medical equipment.1 The word
“algorithm” appears somewhere in the news seemingly every day.
Therefore, it behooves you to understand algorithms not just as a
student or practitioner of computer science, but as a citizen of the
world. Once you understand algorithms, you can educate others about
what algorithms are, how they operate, and what their limitations are.
This book provides a comprehensive introduction to the modern
study of computer algorithms. It presents many algorithms and covers
them in considerable depth, yet makes their design accessible to all
levels of readers. All the analyses are laid out, some simple, some more
involved. We have tried to keep explanations clear without sacrificing
depth of coverage or mathematical rigor.
Each chapter presents an algorithm, a design technique, an
application area, or a related topic. Algorithms are described in English
and in a pseudocode designed to be readable by anyone who has done a
little programming. The book contains 231 figures—many with multiple
parts—illustrating how the algorithms work. Since we emphasize
efficiency as a design criterion, we include careful analyses of the
running times of the algorithms.
The text is intended primarily for use in undergraduate or graduate
courses in algorithms or data structures. Because it discusses
engineering issues in algorithm design, as well as mathematical aspects,
it is equally well suited for self-study by technical professionals.
In this, the fourth edition, we have once again updated the entire
book. The changes cover a broad spectrum, including new chapters and
sections, color illustrations, and what we hope you’ll find to be a more
engaging writing style.
To the teacher
We have designed this book to be both versatile and complete. You
should find it useful for a variety of courses, from an undergraduate
course in data structures up through a graduate course in algorithms.
Because we have provided considerably more material than can fit in a
typical one-term course, you can select the material that best supports
the course you wish to teach.
You should find it easy to organize your course around just the
chapters you need. We have made chapters relatively self-contained, so
that you need not worry about an unexpected and unnecessary
dependence of one chapter on another. Whereas in an undergraduate
course, you might use only some sections from a chapter, in a graduate
course, you might cover the entire chapter.
We have included 931 exercises and 162 problems. Each section ends
with exercises, and each chapter ends with problems. The exercises are
generally short questions that test basic mastery of the material. Some
are simple self-check thought exercises, but many are substantial and
suitable as assigned homework. The problems include more elaborate
case studies which often introduce new material. They often consist of
several parts that lead the student through the steps required to arrive at
a solution.
As with the third edition of this book, we have made publicly available solutions to some, but by no means all, of the problems and
exercises. You can find these solutions on our website,
http://mitpress.mit.edu/algorithms/. You will want to check this site to see whether it contains the solution to an exercise or problem that you
plan to assign. Since the set of solutions that we post might grow over
time, we recommend that you check the site each time you teach the
course.
We have starred (★) the sections and exercises that are more suitable
for graduate students than for undergraduates. A starred section is not
necessarily more difficult than an unstarred one, but it may require an
understanding of more advanced mathematics. Likewise, starred
exercises may require an advanced background or more than average
creativity.
To the student
We hope that this textbook provides you with an enjoyable introduction
to the field of algorithms. We have attempted to make every algorithm
accessible and interesting. To help you when you encounter unfamiliar
or difficult algorithms, we describe each one in a step-by-step manner.
We also provide careful explanations of the mathematics needed to
understand the analysis of the algorithms and supporting figures to help
you visualize what is going on.
Since this book is large, your class will probably cover only a portion
of its material. Although we hope that you will find this book helpful to
you as a course textbook now, we have also tried to make it
comprehensive enough to warrant space on your future professional
bookshelf.
What are the prerequisites for reading this book?
You need some programming experience. In particular, you
should understand recursive procedures and simple data
structures, such as arrays and linked lists (although Section 10.2
covers linked lists and a variant that you may find new).
You should have some facility with mathematical proofs, and
especially proofs by mathematical induction. A few portions of
the book rely on some knowledge of elementary calculus.
Although this book uses mathematics throughout, Part I and
Appendices A–D teach you all the mathematical techniques you will need.
Our website, http://mitpress.mit.edu/algorithms/, links to solutions for some of the problems and exercises. Feel free to check your solutions
against ours. We ask, however, that you not send your solutions to us.
To the professional
The wide range of topics in this book makes it an excellent handbook
on algorithms. Because each chapter is relatively self-contained, you can
focus on the topics most relevant to you.
Since most of the algorithms we discuss have great practical utility,
we address implementation concerns and other engineering issues. We
often provide practical alternatives to the few algorithms that are
primarily of theoretical interest.
If you wish to implement any of the algorithms, you should find the
translation of our pseudocode into your favorite programming language
to be a fairly straightforward task. We have designed the pseudocode to
present each algorithm clearly and succinctly. Consequently, we do not
address error handling and other software-engineering issues that
require specific assumptions about your programming environment. We
attempt to present each algorithm simply and directly without allowing
the idiosyncrasies of a particular programming language to obscure its
essence. If you are used to 0-origin arrays, you might find our frequent
practice of indexing arrays from 1 a minor stumbling block. You can
always either subtract 1 from our indices or just overallocate the array
and leave position 0 unused.
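Both workarounds can be illustrated with insertion sort from Chapter 2. The sketch below is our own Python illustration (the function names are ours, not from the book): the first version subtracts 1 from every index so the book's A[1..n] becomes Python's A[0..n-1]; the second overallocates and leaves position 0 unused so the loop bounds match the pseudocode directly.

```python
def insertion_sort_0origin(A):
    """Subtract 1 from the book's indices: A[1 .. n] becomes A[0 .. n-1]."""
    for i in range(1, len(A)):          # book: for i = 2 to n
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:    # book: while j > 0 and A[j] > key
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A

def insertion_sort_1origin(values):
    """Overallocate and leave position 0 unused, keeping the book's indices."""
    A = [None] + list(values)           # A[1 .. n] now matches the pseudocode
    n = len(A) - 1
    for i in range(2, n + 1):           # book: for i = 2 to n
        key = A[i]
        j = i - 1
        while j >= 1 and A[j] > key:    # never touches the unused A[0]
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key
    return A[1:]
```

The second form trades one wasted slot for a line-by-line correspondence with the pseudocode, which can make debugging a translation easier.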
We understand that if you are using this book outside of a course,
then you might be unable to check your solutions to problems and
exercises against solutions provided by an instructor. Our website,
http://mitpress.mit.edu/algorithms/, links to solutions for some of the
problems and exercises so that you can check your work. Please do not
send your solutions to us.
To our colleagues
We have supplied an extensive bibliography and pointers to the current
literature. Each chapter ends with a set of chapter notes that give
historical details and references. The chapter notes do not provide a
complete reference to the whole field of algorithms, however. Though it
may be hard to believe for a book of this size, space constraints
prevented us from including many interesting algorithms.
Despite myriad requests from students for solutions to problems and
exercises, we have adopted the policy of not citing references for them,
removing the temptation for students to look up a solution rather than
to discover it themselves.
Changes for the fourth edition
As we said about the changes for the second and third editions,
depending on how you look at it, the book changed either not much or
quite a bit. A quick look at the table of contents shows that most of the
third-edition chapters and sections appear in the fourth edition. We
removed three chapters and several sections, but we have added three
new chapters and several new sections apart from these new chapters.
We kept the hybrid organization from the first three editions. Rather
than organizing chapters only by problem domains or only according to
techniques, this book incorporates elements of both. It contains
technique-based chapters on divide-and-conquer, dynamic programming, greedy algorithms, amortized analysis, augmenting data
structures, NP-completeness, and approximation algorithms. But it also
has entire parts on sorting, on data structures for dynamic sets, and on
algorithms for graph problems. We find that although you need to know
how to apply techniques for designing and analyzing algorithms,
problems seldom announce to you which techniques are most amenable
to solving them.
Some of the changes in the fourth edition apply generally across the
book, and some are specific to particular chapters or sections. Here is a
summary of the most significant general changes:
We added 140 new exercises and 22 new problems. We also
improved many of the old exercises and problems, often as the
result of reader feedback. (Thanks to all readers who made
suggestions.)
We have color! With designers from the MIT Press, we selected a
limited palette, devised to convey information and to be pleasing
to the eye. (We are delighted to display red-black trees in—get this
—red and black!) To enhance readability, defined terms,
pseudocode comments, and page numbers in the index are in
color.
Pseudocode procedures appear on a tan background to make
them easier to spot, and they do not necessarily appear on the
page of their first reference. When they don’t, the text directs you
to the relevant page. In the same vein, nonlocal references to
numbered equations, theorems, lemmas, and corollaries include
the page number.
We removed topics that were rarely taught. We dropped in their
entirety the chapters on Fibonacci heaps, van Emde Boas trees,
and computational geometry. In addition, the following material
was excised: the maximum-subarray problem, implementing
pointers and objects, perfect hashing, randomly built binary
search trees, matroids, push-relabel algorithms for maximum flow,
the iterative fast Fourier transform method, the details of the
simplex algorithm for linear programming, and integer
factorization. You can find all the removed material on our
website, http://mitpress.mit.edu/algorithms/.
We reviewed the entire book and rewrote sentences, paragraphs,
and sections to make the writing clearer, more personal, and
gender neutral. For example, the “traveling-salesman problem” in
the previous editions is now called the “traveling-salesperson
problem.” We believe that it is critically important for engineering
and science, including our own field of computer science, to be
welcoming to everyone. (The one place that stumped us is in
Chapter 13, which requires a term for a parent’s sibling. Because the English language has no such gender-neutral term, we
regretfully stuck with “uncle.”)
The chapter notes, bibliography, and index were updated,
reflecting the dramatic growth of the field of algorithms since the
third edition.
We corrected errors, posting most corrections on our website of
third-edition errata. Those that were reported while we were in
full swing preparing this edition were not posted, but were
corrected in this edition. (Thanks again to all readers who helped
us identify issues.)
The specific changes for the fourth edition include the following:
We renamed Chapter 3 and added a section giving an overview of
asymptotic notation before delving into the formal definitions.
Chapter 4 underwent substantial changes to improve its
mathematical foundation and make it more robust and intuitive.
The notion of an algorithmic recurrence was introduced, and the
topic of ignoring floors and ceilings in recurrences was addressed
more rigorously. The second case of the master theorem
incorporates polylogarithmic factors, and a rigorous proof of a
“continuous” version of the master theorem is now provided. We
also present the powerful and general Akra-Bazzi method
(without proof).
The deterministic order-statistic algorithm in Chapter 9 is slightly different, and the analyses of both the randomized and
deterministic order-statistic algorithms have been revamped.
In addition to stacks and queues, Section 10.1 discusses ways to
store arrays and matrices.
Chapter 11 on hash tables includes a modern treatment of hash functions. It also emphasizes linear probing as an efficient method
for resolving collisions when the underlying hardware implements
caching to favor local searches.
To replace the sections on matroids in Chapter 15, we converted a problem in the third edition about offline caching into a full
section.
Section 16.4 now contains a more intuitive explanation of the potential functions to analyze table doubling and halving.
Chapter 17 on augmenting data structures was relocated from
Part III to Part V, reflecting our view that this technique goes beyond basic material.
Chapter 25 is a new chapter about matchings in bipartite graphs.
It presents algorithms to find a matching of maximum cardinality,
to solve the stable-marriage problem, and to find a maximum-
weight matching (known as the “assignment problem”).
Chapter 26, on task-parallel computing, has been updated with modern terminology, including the name of the chapter.
Chapter 27, which covers online algorithms, is another new chapter. In an online algorithm, the input arrives over time, rather
than being available in its entirety at the start of the algorithm.
The chapter describes several examples of online algorithms,
including determining how long to wait for an elevator before
taking the stairs, maintaining a linked list via the move-to-front
heuristic, and evaluating replacement policies for caches.
In Chapter 29, we removed the detailed presentation of the simplex algorithm, as it was math heavy without really conveying
many algorithmic ideas. The chapter now focuses on the key
aspect of how to model problems as linear programs, along with
the essential duality property of linear programming.
Section 32.5 adds to the chapter on string matching the simple, yet powerful, structure of suffix arrays.
Chapter 33, on machine learning, is the third new chapter. It introduces several basic methods used in machine learning:
clustering to group similar items together, weighted-majority
algorithms, and gradient descent to find the minimizer of a
function.
Section 34.5.6 summarizes strategies for polynomial-time reductions to show that problems are NP-hard.
The proof of the approximation algorithm for the set-covering
problem in Section 35.3 has been revised.
Website
You can use our website, http://mitpress.mit.edu/algorithms/, to obtain supplementary information and to communicate with us. The website
links to a list of known errors, material from the third edition that is not
included in the fourth edition, solutions to selected exercises and
problems, Python implementations of many of the algorithms in this
book, a list explaining the corny professor jokes (of course), as well as
other content, which we may add to. The website also tells you how to
report errors or make suggestions.
How we produced this book
Like the previous three editions, the fourth edition was produced in
LaTeX 2ε. We used the Times font with mathematics typeset using the
MathTime Professional II fonts. As in all previous editions, we
compiled the index using Windex, a C program that we wrote, and
produced the bibliography using BibTeX. The PDF files for this book
were created on a MacBook Pro running macOS 10.14.
Our plea to Apple in the preface of the third edition to update
MacDraw Pro for macOS 10 went for naught, and so we continued to
draw illustrations on pre-Intel Macs running MacDraw Pro under the
Classic environment of older versions of macOS 10. Many of the
mathematical expressions appearing in illustrations were laid in with the
psfrag package for LaTeX 2ε.
Acknowledgments for the fourth edition
We have been working with the MIT Press since we started writing the
first edition in 1987, collaborating with several directors, editors, and
production staff. Throughout our association with the MIT Press, their
support has always been outstanding. Special thanks to our editors Marie Lee, who put up with us for far too long, and Elizabeth Swayze,
who pushed us over the finish line. Thanks also to Director Amy Brand
and to Alex Hoopes.
As in the third edition, we were geographically distributed while
producing the fourth edition, working in the Dartmouth College
Department of Computer Science; the MIT Computer Science and
Artificial Intelligence Laboratory and the MIT Department of
Electrical Engineering and Computer Science; and the Columbia
University Department of Industrial Engineering and Operations
Research, Department of Computer Science, and Data Science Institute.
During the COVID-19 pandemic, we worked largely from home. We
thank our respective universities and colleagues for providing such
supportive and stimulating environments. As we complete this book,
those of us who are not retired are eager to return to our respective
universities now that the pandemic seems to be abating.
Julie Sussman, P.P.A., came to our rescue once again with her
technical copy-editing under tremendous time pressure. If not for Julie,
this book would be riddled with errors (or, let’s say, many more errors
than it has) and would be far less readable. Julie, we will be forever
indebted to you. Errors that remain are the responsibility of the authors
(and probably were inserted after Julie read the material).
Dozens of errors in previous editions were corrected in the process of
creating this edition. We thank our readers—too many to list them all—
who have reported errors and suggested improvements over the years.
We received considerable help in preparing some of the new material
in this edition. Neville Campbell (unaffiliated), Bill Kuszmaul of MIT,
and Chee Yap of NYU provided valuable advice regarding the
treatment of recurrences in Chapter 4. Yan Gu of the University of California, Riverside, provided feedback on parallel algorithms in
Chapter 26. Rob Schapire of Microsoft Research altered our approach to the material on machine learning with his detailed comments on
Chapter 33. Qi Qi of MIT helped with the analysis of the Monty Hall problem (Problem C-1).
Molly Seaman and Mary Reilly of the MIT Press helped us select the
color palette in the illustrations, and Wojciech Jarosz of Dartmouth
College suggested design improvements to our newly colored figures.
Yichen (Annie) Ke and Linda Xiao, who have since graduated from
Dartmouth, aided in colorizing the illustrations, and Linda also
produced many of the Python implementations that are available on the
book’s website.
Finally, we thank our wives—Wendy Leiserson, Gail Rivest, Rebecca
Ivry, and the late Nicole Cormen—and our families. The patience and
encouragement of those who love us made this project possible. We
affectionately dedicate this book to them.
THOMAS H. CORMEN
Lebanon, New Hampshire
CHARLES E. LEISERSON
Cambridge, Massachusetts
RONALD L. RIVEST
Cambridge, Massachusetts
CLIFFORD STEIN
New York, New York
June, 2021
1 To understand many of the ways in which algorithms influence our daily lives, see the book by Fry [162].
When you design and analyze algorithms, you need to be able to
describe how they operate and how to design them. You also need some
mathematical tools to show that your algorithms do the right thing and
do it efficiently. This part will get you started. Later parts of this book
will build upon this base.
Chapter 1 provides an overview of algorithms and their place in modern computing systems. This chapter defines what an algorithm is
and lists some examples. It also makes a case for considering algorithms
as a technology, alongside technologies such as fast hardware, graphical
user interfaces, object-oriented systems, and networks.
In Chapter 2, we see our first algorithms, which solve the problem of sorting a sequence of n numbers. They are written in a pseudocode
which, although not directly translatable to any conventional
programming language, conveys the structure of the algorithm clearly
enough that you should be able to implement it in the language of your
choice. The sorting algorithms we examine are insertion sort, which uses
an incremental approach, and merge sort, which uses a recursive
technique known as “divide-and-conquer.” Although the time each
requires increases with the value of n, the rate of increase differs between the two algorithms. We determine these running times in
Chapter 2, and we develop a useful “asymptotic” notation to express them.
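The recursive divide-and-conquer strategy of merge sort can be sketched briefly. The following is our own compact Python rendering of the idea, not the book's pseudocode (Chapter 2 gives the full MERGE-SORT and MERGE procedures): split the sequence in half, sort each half recursively, then merge the two sorted halves.

```python
def merge_sort(a):
    """Divide-and-conquer sorting: split, recurse, merge."""
    if len(a) <= 1:                     # a sequence of length 0 or 1 is sorted
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])          # conquer the two halves recursively
    right = merge_sort(a[mid:])
    # Combine: merge the two sorted halves into one sorted sequence.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]
```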
Chapter 3 precisely defines asymptotic notation. We’ll use
asymptotic notation to bound the growth of functions—most often,
functions that describe the running time of algorithms—from above and
below. The chapter starts by informally defining the most commonly
used asymptotic notations and giving an example of how to apply them.
It then formally defines five asymptotic notations and presents
conventions for how to put them together. The rest of Chapter 3 is primarily a presentation of mathematical notation, more to ensure that
your use of notation matches that in this book than to teach you new
mathematical concepts.
Chapter 4 delves further into the divide-and-conquer method
introduced in Chapter 2. It provides two additional examples of divide-and-conquer algorithms for multiplying square matrices, including
Strassen’s surprising method. Chapter 4 contains methods for solving recurrences, which are useful for describing the running times of
recursive algorithms. In the substitution method, you guess an answer
and prove it correct. Recursion trees provide one way to generate a
guess. Chapter 4 also presents the powerful technique of the “master method,” which you can often use to solve recurrences that arise from
divide-and-conquer algorithms. Although the chapter provides a proof
of a foundational theorem on which the master theorem depends, you
should feel free to employ the master method without delving into the
proof. Chapter 4 concludes with some advanced topics.
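As a small preview of how the master method is used, consider merge sort's recurrence from Chapter 2:

```latex
% Merge sort's recurrence: T(n) = 2T(n/2) + \Theta(n).
% Here a = 2 and b = 2, so n^{\log_b a} = n^{\log_2 2} = n.
% Since f(n) = \Theta(n) = \Theta(n^{\log_b a}), case 2 of the
% master theorem applies, giving
T(n) = 2T(n/2) + \Theta(n) \implies T(n) = \Theta(n \lg n).
```

Chapter 4 states the theorem precisely and treats the cases, including polylogarithmic factors in case 2, in full.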
Chapter 5 introduces probabilistic analysis and randomized
algorithms. You typically use probabilistic analysis to determine the
running time of an algorithm in cases in which, due to the presence of
an inherent probability distribution, the running time may differ on
different inputs of the same size. In some cases, you might assume that
the inputs conform to a known probability distribution, so that you are
averaging the running time over all possible inputs. In other cases, the
probability distribution comes not from the inputs but from random
choices made during the course of the algorithm. An algorithm whose
behavior is determined not only by its input but by the values produced
by a random-number generator is a randomized algorithm. You can use
randomized algorithms to enforce a probability distribution on the
inputs—thereby ensuring that no particular input always causes poor
performance—or even to bound the error rate of algorithms that are
allowed to produce incorrect results on a limited basis.
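The first use of randomization mentioned above, enforcing a distribution on the inputs, can be captured in a two-line idiom. This is a generic illustration of the idea (the function name is ours, not from the book): permute the input uniformly at random before running the algorithm, so that the probability distribution comes from the algorithm's own coin flips rather than from whoever supplies the input.

```python
import random

def run_on_random_permutation(items, algorithm):
    """Permute the input uniformly at random before running the algorithm,
    so no particular input ordering can always elicit worst-case behavior."""
    shuffled = list(items)
    random.shuffle(shuffled)    # randomness now comes from the algorithm, not the input
    return algorithm(shuffled)
```

Randomized quicksort in Chapter 7 achieves the same effect more directly by making random choices inside the algorithm itself.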
Appendices A–D contain other mathematical material that you will find helpful as you read this book. You might have seen much of the
material in the appendix chapters before having read this book
(although the specific definitions and notational conventions we use
may differ in some cases from what you have seen in the past), and so
you should think of the appendices as reference material. On the other
hand, you probably have not already seen most of the material in Part I.
All the chapters in Part I and the appendices are written with a tutorial
flavor.
1 The Role of Algorithms in Computing
What are algorithms? Why is the study of algorithms worthwhile? What
is the role of algorithms relative to other technologies used in
computers? This chapter will answer these questions.
Informally, an algorithm is any well-defined computational procedure
that takes some value, or set of values, as input and produces some value, or set of values, as output in a finite amount of time. An
algorithm is thus a sequence of computational steps that transform the
input into the output.
You can also view an algorithm as a tool for solving a well-specified
computational problem. The statement of the problem specifies in
general terms the desired input/output relationship for problem
instances, typically of arbitrarily large size. The algorithm describes a
specific computational procedure for achieving that input/output
relationship for all problem instances.
As an example, suppose that you need to sort a sequence of numbers
into monotonically increasing order. This problem arises frequently in
practice and provides fertile ground for introducing many standard
design techniques and analysis tools. Here is how we formally define the
sorting problem:
Input: A sequence of n numbers 〈a1, a2, … , an〉.

Output: A permutation (reordering) 〈a′1, a′2, … , a′n〉 of the input sequence such that a′1 ≤ a′2 ≤ ⋯ ≤ a′n.
Thus, given the input sequence 〈31, 41, 59, 26, 41, 58〉, a correct sorting
algorithm returns as output the sequence 〈26, 31, 41, 41, 58, 59〉. Such
an input sequence is called an instance of the sorting problem. In
general, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to
compute a solution to the problem.
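The input/output relationship stated above can itself be written as a small executable check. The sketch below is our own illustration (the function name is ours): a candidate output is a correct sorting of the input exactly when it is a permutation of the input arranged in monotonically increasing (that is, nondecreasing) order.

```python
from collections import Counter

def is_sorting_of(output, inp):
    """Check the sorting problem's input/output relationship:
    output must be a permutation of inp in nondecreasing order."""
    is_permutation = Counter(output) == Counter(inp)
    is_nondecreasing = all(output[i] <= output[i + 1]
                           for i in range(len(output) - 1))
    return is_permutation and is_nondecreasing
```

For the instance in the text, `is_sorting_of([26, 31, 41, 41, 58, 59], [31, 41, 59, 26, 41, 58])` holds; note that the duplicate 41 must appear twice in the output, which is why a multiset comparison, rather than a set comparison, is needed.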
Because many programs use it as an intermediate step, sorting is a
fundamental operation in computer science. As a result, you have a
large number of good sorting algorithms at your disposal. Which
algorithm is best for a given application depends on—among other
factors—the number of items to be sorted, the extent to which the items
are already somewhat sorted, possible restrictions on the item values,
the architecture of the computer, and the kind of storage devices to be
used: main memory, disks, or even—archaically—tapes.
An algorithm for a computational problem is correct if, for every
problem instance provided as input, it halts—finishes its computing in
finite time—and outputs the correct solution to the problem instance. A
correct algorithm solves the given computational problem. An incorrect
algorithm might not halt at all on some input instances, or it might halt
with an incorrect answer. Contrary to what you might expect, incorrect
algorithms can sometimes be useful, if you can control their error rate.
We’ll see an example of an algorithm with a controllable error rate in
Chapter 31 when we study algorithms for finding large prime numbers.
Ordinarily, however, we’ll concern ourselves only with correct
algorithms.
An algorithm can be specified in English, as a computer program, or
even as a hardware design. The only requirement is that the specification
must provide a precise description of the computational procedure to be
followed.
What kinds of problems are solved by algorithms?
Sorting is by no means the only computational problem for which algorithms have been developed. (You probably suspected as much
when you saw the size of this book.) Practical applications of algorithms
are ubiquitous and include the following examples:
The Human Genome Project has made great progress toward the
goals of identifying all the roughly 30,000 genes in human DNA,
determining the sequences of the roughly 3 billion chemical base
pairs that make up human DNA, storing this information in
databases, and developing tools for data analysis. Each of these
steps requires sophisticated algorithms. Although the solutions to
the various problems involved are beyond the scope of this book,
many methods to solve these biological problems use ideas
presented here, enabling scientists to accomplish tasks while using
resources efficiently. Dynamic programming, as in Chapter 14, is
an important technique for solving several of these biological
problems, particularly ones that involve determining similarity
between DNA sequences. The savings realized are in time, both
human and machine, and in money, as more information can be
extracted by laboratory techniques.
The internet enables people all around the world to quickly access
and retrieve large amounts of information. With the aid of clever
algorithms, sites on the internet are able to manage and
manipulate this large volume of data. Examples of problems that
make essential use of algorithms include finding good routes on
which the data travels (techniques for solving such problems
appear in Chapter 22), and using a search engine to quickly find
pages on which particular information resides (related techniques
are in Chapters 11 and 32).
Electronic commerce enables goods and services to be negotiated
and exchanged electronically, and it depends on the privacy of
personal information such as credit card numbers, passwords, and
bank statements. The core technologies used in electronic
commerce include public-key cryptography and digital signatures
(covered in Chapter 31), which are based on numerical algorithms and number theory.
Manufacturing and other commercial enterprises often need to
allocate scarce resources in the most beneficial way. An oil
company might wish to know where to place its wells in order to
maximize its expected profit. A political candidate might want to
determine where to spend money buying campaign advertising in
order to maximize the chances of winning an election. An airline
might wish to assign crews to flights in the least expensive way
possible, making sure that each flight is covered and that
government regulations regarding crew scheduling are met. An
internet service provider might wish to determine where to place
additional resources in order to serve its customers more
effectively. All of these are examples of problems that can be
solved by modeling them as linear programs, which Chapter 29
explores.
Although some of the details of these examples are beyond the scope
of this book, we do give underlying techniques that apply to these
problems and problem areas. We also show how to solve many specific
problems, including the following:
You have a road map on which the distance between each pair of
adjacent intersections is marked, and you wish to determine the
shortest route from one intersection to another. The number of
possible routes can be huge, even if you disallow routes that cross
over themselves. How can you choose which of all possible routes
is the shortest? You can start by modeling the road map (which is
itself a model of the actual roads) as a graph (which we will meet
in Part VI and Appendix B). In this graph, you wish to find the shortest path from one vertex to another. Chapter 22 shows how
to solve this problem efficiently.
Given a mechanical design in terms of a library of parts, where
each part may include instances of other parts, list the parts in
order so that each part appears before any part that uses it. If the
design comprises n parts, then there are n! possible orders, where
n! denotes the factorial function. Because the factorial function grows faster than even an exponential function, you cannot
feasibly generate each possible order and then verify that, within
that order, each part appears before the parts using it (unless you
have only a few parts). This problem is an instance of topological
sorting, and Chapter 20 shows how to solve this problem
efficiently.
A doctor needs to determine whether an image represents a
cancerous tumor or a benign one. The doctor has available images
of many other tumors, some of which are known to be cancerous
and some of which are known to be benign. A cancerous tumor is
likely to be more similar to other cancerous tumors than to
benign tumors, and a benign tumor is more likely to be similar to
other benign tumors. By using a clustering algorithm, as in
Chapter 33, the doctor can identify which outcome is more likely.
You need to compress a large file containing text so that it
occupies less space. Many ways to do so are known, including
“LZW compression,” which looks for repeating character
sequences. Chapter 15 studies a different approach, “Huffman
coding,” which encodes characters by bit sequences of various
lengths, with characters occurring more frequently encoded by
shorter bit sequences.
These lists are far from exhaustive (as you again have probably
surmised from this book’s heft), but they exhibit two characteristics
common to many interesting algorithmic problems:
1. They have many candidate solutions, the overwhelming majority
of which do not solve the problem at hand. Finding one that
does, or one that is “best,” without explicitly examining each
possible solution, can present quite a challenge.
2. They have practical applications. Of the problems in the above
list, finding the shortest path provides the easiest examples. A
transportation firm, such as a trucking or railroad company, has
a financial interest in finding shortest paths through a road or
rail network because taking shorter paths results in lower labor
and fuel costs. Or a routing node on the internet might need to
find the shortest path through the network in order to route a
message quickly. Or a person wishing to drive from New York to
Boston might want to find driving directions using a navigation
app.
Not every problem solved by algorithms has an easily identified set
of candidate solutions. For example, given a set of numerical values
representing samples of a signal taken at regular time intervals, the
discrete Fourier transform converts the time domain to the frequency
domain. That is, it approximates the signal as a weighted sum of
sinusoids, producing the strength of various frequencies which, when
summed, approximate the sampled signal. In addition to lying at the
heart of signal processing, discrete Fourier transforms have applications
in data compression and multiplying large polynomials and integers.
Chapter 30 gives an efficient algorithm, the fast Fourier transform (commonly called the FFT), for this problem. The chapter also sketches
out the design of a hardware FFT circuit.
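As a rough illustration of what the discrete Fourier transform computes (this is a naive O(n²) version, not the FFT of Chapter 30, which arrives at the same answer far faster; the example signal is chosen just for this sketch):

```python
import cmath
import math

def dft(samples):
    """Direct O(n²) discrete Fourier transform.

    The FFT of Chapter 30 computes the same n outputs in O(n lg n) time.
    """
    n = len(samples)
    return [sum(samples[j] * cmath.exp(-2j * cmath.pi * j * k / n)
                for j in range(n))
            for k in range(n)]

# Eight samples of a pure sinusoid that completes one cycle per window:
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(signal)
# All the energy sits in frequency bins 1 and 7 (the conjugate frequency),
# each with magnitude n/2 = 4; every other bin is numerically zero.
```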
Data structures
This book also presents several data structures. A data structure is a way
to store and organize data in order to facilitate access and
modifications. Using the appropriate data structure or structures is an
important part of algorithm design. No single data structure works well
for all purposes, and so you should know the strengths and limitations
of several of them.
Technique
Although you can use this book as a “cookbook” for algorithms, you
might someday encounter a problem for which you cannot readily find a
published algorithm (many of the exercises and problems in this book,
for example). This book will teach you techniques of algorithm design
and analysis so that you can develop algorithms on your own, show that
they give the correct answer, and analyze their efficiency. Different
chapters address different aspects of algorithmic problem solving. Some chapters address specific problems, such as finding medians and order
statistics in Chapter 9, computing minimum spanning trees in Chapter
21, and determining a maximum flow in a network in Chapter 24. Other
chapters introduce techniques, such as divide-and-conquer in Chapters
2 and 4, dynamic programming in Chapter 14, and amortized analysis
in Chapter 16.
Hard problems
Most of this book is about efficient algorithms. Our usual measure of
efficiency is speed: how long does an algorithm take to produce its
result? There are some problems, however, for which we know of no
algorithm that runs in a reasonable amount of time. Chapter 34 studies an interesting subset of these problems, which are known as NP-complete.
Why are NP-complete problems interesting? First, although no
efficient algorithm for an NP-complete problem has ever been found,
nobody has ever proven that an efficient algorithm for one cannot exist.
In other words, no one knows whether efficient algorithms exist for NP-
complete problems. Second, the set of NP-complete problems has the
remarkable property that if an efficient algorithm exists for any one of
them, then efficient algorithms exist for all of them. This relationship
among the NP-complete problems makes the lack of efficient solutions
all the more tantalizing. Third, several NP-complete problems are
similar, but not identical, to problems for which we do know of efficient
algorithms. Computer scientists are intrigued by how a small change to
the problem statement can cause a big change to the efficiency of the
best known algorithm.
You should know about NP-complete problems because some of
them arise surprisingly often in real applications. If you are called upon
to produce an efficient algorithm for an NP-complete problem, you are
likely to spend a lot of time in a fruitless search. If, instead, you can show that the problem is NP-complete, you can spend your time
developing an efficient approximation algorithm, that is, an algorithm
that gives a good, but not necessarily the best possible, solution.
As a concrete example, consider a delivery company with a central
depot. Each day, it loads up delivery trucks at the depot and sends them
around to deliver goods to several addresses. At the end of the day, each
truck must end up back at the depot so that it is ready to be loaded for
the next day. To reduce costs, the company wants to select an order of
delivery stops that yields the lowest overall distance traveled by each
truck. This problem is the well-known “traveling-salesperson problem,”
and it is NP-complete.2 It has no known efficient algorithm. Under certain assumptions, however, we know of efficient algorithms that
compute overall distances close to the smallest possible. Chapter 35
discusses such “approximation algorithms.”
Alternative computing models
For many years, we could count on processor clock speeds increasing at
a steady rate. Physical limitations present a fundamental roadblock to
ever-increasing clock speeds, however: because power density increases
superlinearly with clock speed, chips run the risk of melting once their
clock speeds become high enough. In order to perform more
computations per second, therefore, chips are being designed to contain
not just one but several processing “cores.” We can liken these multicore
computers to several sequential computers on a single chip. In other
words, they are a type of “parallel computer.” In order to elicit the best
performance from multicore computers, we need to design algorithms
with parallelism in mind. Chapter 26 presents a model for “task-parallel” algorithms, which take advantage of multiple processing cores.
This model has advantages from both theoretical and practical
standpoints, and many modern parallel-programming platforms
embrace something similar to this model of parallelism.
Most of the examples in this book assume that all of the input data
are available when an algorithm begins running. Much of the work in
algorithm design makes the same assumption. For many important real-
world examples, however, the input actually arrives over time, and the
algorithm must decide how to proceed without knowing what data will
arrive in the future. In a data center, jobs are constantly arriving and
departing, and a scheduling algorithm must decide when and where to
run a job, without knowing what jobs will be arriving in the future.
Traffic must be routed in the internet based on the current state, without
knowing about where traffic will arrive in the future. Hospital
emergency rooms make triage decisions about which patients to treat
first without knowing when other patients will be arriving in the future
and what treatments they will need. Algorithms that receive their input
over time, rather than having all the input present at the start, are online
algorithms, which Chapter 27 examines.
Exercises
1.1-1
Describe your own real-world example that requires sorting. Describe
one that requires finding the shortest distance between two points.
1.1-2
Other than speed, what other measures of efficiency might you need to
consider in a real-world setting?
1.1-3
Select a data structure that you have seen, and discuss its strengths and
limitations.
1.1-4
How are the shortest-path and traveling-salesperson problems given
above similar? How are they different?
1.1-5
Suggest a real-world problem in which only the best solution will do.
Then come up with one in which “approximately” the best solution is
good enough.
1.1-6
Describe a real-world problem in which sometimes the entire input is
available before you need to solve the problem, but other times the input
is not entirely available in advance and arrives over time.
1.2 Algorithms as a technology
If computers were infinitely fast and computer memory were free, would
you have any reason to study algorithms? The answer is yes, if for no
other reason than that you would still like to be certain that your
solution method terminates and does so with the correct answer.
If computers were infinitely fast, any correct method for solving a
problem would do. You would probably want your implementation to
be within the bounds of good software engineering practice (for
example, your implementation should be well designed and
documented), but you would most often use whichever method was the
easiest to implement.
Of course, computers may be fast, but they are not infinitely fast.
Computing time is therefore a bounded resource, which makes it
precious. Although the saying goes, “Time is money,” time is even more
valuable than money: you can get back money after you spend it, but
once time is spent, you can never get it back. Memory may be
inexpensive, but it is neither infinite nor free. You should choose
algorithms that use the resources of time and space efficiently.
Efficiency
Different algorithms devised to solve the same problem often differ
dramatically in their efficiency. These differences can be much more
significant than differences due to hardware and software.
As an example, Chapter 2 introduces two algorithms for sorting. The
first, known as insertion sort, takes time roughly equal to c₁n² to sort n items, where c₁ is a constant that does not depend on n. That is, it takes time roughly proportional to n². The second, merge sort, takes time roughly equal to c₂n lg n, where lg n stands for log₂ n and c₂ is another constant that also does not depend on n. Insertion sort typically has a smaller constant factor than merge sort, so that c₁ < c₂. We’ll see that the constant factors can have far less of an impact on the running time than the dependence on the input size n. Let’s write insertion sort’s running time as c₁n · n and merge sort’s running time as c₂n · lg n. Then we see that where insertion sort has a factor of n in its running time, merge sort has a factor of lg n, which is much smaller. For example, when n is 1000, lg n is approximately 10, and when n is 1,000,000, lg n is approximately only 20. Although insertion sort usually runs faster than merge sort for small input sizes, once the input size n becomes large enough, merge sort’s advantage of lg n versus n more than compensates for the difference in constant factors. No matter how much smaller c₁ is than c₂, there is always a crossover point beyond which merge sort is faster.
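The crossover point is easy to locate numerically. This sketch uses the instruction counts 2n² and 50n lg n from the concrete example that follows (those constants belong to that example, not to the algorithms in general):

```python
import math

def crossover(c1, c2):
    """Smallest n >= 2 at which c2*n*lg(n) drops below c1*n*n,
    i.e. the point beyond which the n-lg-n algorithm wins."""
    n = 2
    while c2 * n * math.log2(n) >= c1 * n * n:
        n += 1
    return n

# Insertion sort at 2n² instructions vs. merge sort at 50n lg n instructions:
print(crossover(2, 50))  # → 190: merge sort needs fewer instructions
                         #   for every n ≥ 190
```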
For a concrete example, let us pit a faster computer (computer A)
running insertion sort against a slower computer (computer B) running
merge sort. They each must sort an array of 10 million numbers.
(Although 10 million numbers might seem like a lot, if the numbers are
eight-byte integers, then the input occupies about 80 megabytes, which
fits in the memory of even an inexpensive laptop computer many times
over.) Suppose that computer A executes 10 billion instructions per
second (faster than any single sequential computer at the time of this
writing) and computer B executes only 10 million instructions per
second (much slower than most contemporary computers), so that
computer A is 1000 times faster than computer B in raw computing
power. To make the difference even more dramatic, suppose that the
world’s craftiest programmer codes insertion sort in machine language
for computer A, and the resulting code requires 2n² instructions to sort
n numbers. Suppose further that just an average programmer
implements merge sort, using a high-level language with an inefficient
compiler, with the resulting code taking 50n lg n instructions. To sort 10
million numbers, computer A takes

    2 · (10⁷)² instructions / 10¹⁰ instructions/second = 20,000 seconds (more than 5.5 hours),

while computer B takes

    50 · 10⁷ lg 10⁷ instructions / 10⁷ instructions/second ≈ 1163 seconds (under 20 minutes).
By using an algorithm whose running time grows more slowly, even with a poor compiler, computer B runs more than 17 times faster than
computer A! The advantage of merge sort is even more pronounced
when sorting 100 million numbers: where insertion sort takes more than
23 days, merge sort takes under four hours. Although 100 million might
seem like a large number, there are more than 100 million web searches
every half hour, more than 100 million emails sent every minute, and
some of the smallest galaxies (known as ultra-compact dwarf galaxies)
contain about 100 million stars. In general, as the problem size
increases, so does the relative advantage of merge sort.
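The figures in this comparison follow directly from the stated instruction counts and machine speeds; this sketch just redoes the arithmetic:

```python
import math

def insertion_seconds(n, instr_per_sec):
    return 2 * n ** 2 / instr_per_sec             # 2n² instructions

def merge_seconds(n, instr_per_sec):
    return 50 * n * math.log2(n) / instr_per_sec  # 50 n lg n instructions

# Sorting 10 million numbers:
print(insertion_seconds(10**7, 10**10))  # computer A: 20,000 s (> 5.5 hours)
print(merge_seconds(10**7, 10**7))       # computer B: ~1163 s (< 20 minutes)

# Sorting 100 million numbers:
print(insertion_seconds(10**8, 10**10) / 86400)  # > 23 days for computer A
print(merge_seconds(10**8, 10**7) / 3600)        # < 4 hours for computer B
```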
Algorithms and other technologies
The example above shows that you should consider algorithms, like
computer hardware, as a technology. Total system performance depends
on choosing efficient algorithms as much as on choosing fast hardware.
Just as rapid advances are being made in other computer technologies,
they are being made in algorithms as well.
You might wonder whether algorithms are truly that important on
contemporary computers in light of other advanced technologies, such
as
advanced computer architectures and fabrication technologies,
easy-to-use, intuitive, graphical user interfaces (GUIs),
object-oriented systems,
integrated web technologies,
fast networking, both wired and wireless,
machine learning,
and mobile devices.
The answer is yes. Although some applications do not explicitly require
algorithmic content at the application level (such as some simple, web-
based applications), many do. For example, consider a web-based
service that determines how to travel from one location to another. Its
implementation would rely on fast hardware, a graphical user interface,
wide-area networking, and also possibly on object orientation. It would also require algorithms for operations such as finding routes (probably
using a shortest-path algorithm), rendering maps, and interpolating
addresses.
Moreover, even an application that does not require algorithmic
content at the application level relies heavily upon algorithms. Does the
application rely on fast hardware? The hardware design used
algorithms. Does the application rely on graphical user interfaces? The
design of any GUI relies on algorithms. Does the application rely on
networking? Routing in networks relies heavily on algorithms. Was the
application written in a language other than machine code? Then it was
processed by a compiler, interpreter, or assembler, all of which make
extensive use of algorithms. Algorithms are at the core of most
technologies used in contemporary computers.
Machine learning can be thought of as a method for performing
algorithmic tasks without explicitly designing an algorithm, but instead
inferring patterns from data and thereby automatically learning a
solution. At first glance, machine learning, which automates the process
of algorithmic design, may seem to make learning about algorithms
obsolete. The opposite is true, however. Machine learning is itself a
collection of algorithms, just under a different name. Furthermore, it
currently seems that the successes of machine learning are mainly for
problems for which we, as humans, do not really understand what the
right algorithm is. Prominent examples include computer vision and
automatic language translation. For algorithmic problems that humans
understand well, such as most of the problems in this book, efficient
algorithms designed to solve a specific problem are typically more
successful than machine-learning approaches.
Data science is an interdisciplinary field with the goal of extracting
knowledge and insights from structured and unstructured data. Data
science uses methods from statistics, computer science, and
optimization. The design and analysis of algorithms is fundamental to
the field. The core techniques of data science, which overlap significantly
with those in machine learning, include many of the algorithms in this
book.
Furthermore, with the ever-increasing capacities of computers, we use them to solve larger problems than ever before. As we saw in the
above comparison between insertion sort and merge sort, it is at larger
problem sizes that the differences in efficiency between algorithms
become particularly prominent.
Having a solid base of algorithmic knowledge and technique is one
characteristic that defines the truly skilled programmer. With modern
computing technology, you can accomplish some tasks without
knowing much about algorithms, but with a good background in
algorithms, you can do much, much more.
Exercises
1.2-1
Give an example of an application that requires algorithmic content at
the application level, and discuss the function of the algorithms
involved.
1.2-2
Suppose that for inputs of size n on a particular computer, insertion sort
runs in 8n² steps and merge sort runs in 64n lg n steps. For which values of n does insertion sort beat merge sort?
1.2-3
What is the smallest value of n such that an algorithm whose running
time is 100n² runs faster than an algorithm whose running time is 2ⁿ on the same machine?
Problems
1-1 Comparison of running times
For each function f(n) and time t in the following table, determine the largest size n of a problem that can be solved in time t, assuming that the algorithm to solve the problem takes f(n) microseconds.
Chapter notes
There are many excellent texts on the general topic of algorithms,
including those by Aho, Hopcroft, and Ullman [5, 6], Dasgupta, Papadimitriou, and Vazirani [107], Edmonds [133], Erickson [135], Goodrich and Tamassia [195, 196], Kleinberg and Tardos [257], Knuth
[259, 260, 261, 262, 263], Levitin [298], Louridas [305], Mehlhorn and Sanders [325], Mitzenmacher and Upfal [331], Neapolitan [342], Roughgarden [385, 386, 387, 388], Sanders, Mehlhorn, Dietzfelbinger, and Dementiev [393], Sedgewick and Wayne [402], Skiena [414], Soltys-Kulinicz [419], Wilf [455], and Williamson and Shmoys [459]. Some of the more practical aspects of algorithm design are discussed by Bentley
[49, 50, 51], Bhargava [54], Kochenderfer and Wheeler [268], and McGeoch [321]. Surveys of the field of algorithms can also be found in books by Atallah and Blanton [27, 28] and Mehta and Sahni [326]. For less technical material, see the books by Christian and Griffiths [92], Cormen [104], Erwig [136], MacCormick [307], and Vöcking et al. [448].
Overviews of the algorithms used in computational biology can be
found in books by Jones and Pevzner [240], Elloumi and Zomaya [134], and Marchisio [315].
1 Sometimes, when the problem context is known, problem instances are themselves simply called “problems.”
2 To be precise, only decision problems—those with a “yes/no” answer—can be NP-complete.
The decision version of the traveling salesperson problem asks whether there exists an order of stops whose distance totals at most a given amount.

This chapter will familiarize you with the framework we’ll use
throughout the book to think about the design and analysis of
algorithms. It is self-contained, but it does include several references to
material that will be introduced in Chapters 3 and 4. (It also contains several summations, which Appendix A shows how to solve.)
We’ll begin by examining the insertion sort algorithm to solve the
sorting problem introduced in Chapter 1. We’ll specify algorithms using a pseudocode that should be understandable to you if you have done
computer programming. We’ll see why insertion sort correctly sorts and
analyze its running time. The analysis introduces a notation that
describes how running time increases with the number of items to be
sorted. Following a discussion of insertion sort, we’ll use a method
called divide-and-conquer to develop a sorting algorithm called merge
sort. We’ll end with an analysis of merge sort’s running time.
Our first algorithm, insertion sort, solves the sorting problem introduced
in Chapter 1:
Input: A sequence of n numbers 〈a₁, a₂, … , aₙ〉.

Output: A permutation (reordering) 〈a′₁, a′₂, … , a′ₙ〉 of the input sequence such that a′₁ ≤ a′₂ ≤ ⋯ ≤ a′ₙ.
The numbers to be sorted are also known as the keys. Although the problem is conceptually about sorting a sequence, the input comes in
the form of an array with n elements. When we want to sort numbers,
it’s often because they are the keys associated with other data, which we
call satellite data. Together, a key and satellite data form a record. For example, consider a spreadsheet containing student records with many
associated pieces of data such as age, grade-point average, and number
of courses taken. Any one of these quantities could be a key, but when
the spreadsheet sorts, it moves the associated record (the satellite data)
with the key. When describing a sorting algorithm, we focus on the keys,
but it is important to remember that there usually is associated satellite
data.
In this book, we’ll typically describe algorithms as procedures
written in a pseudocode that is similar in many respects to C, C++, Java,
Python,1 or JavaScript. (Apologies if we’ve omitted your favorite programming language. We can’t list them all.) If you have been
introduced to any of these languages, you should have little trouble
understanding algorithms “coded” in pseudocode. What separates
pseudocode from real code is that in pseudocode, we employ whatever
expressive method is most clear and concise to specify a given
algorithm. Sometimes the clearest method is English, so do not be
surprised if you come across an English phrase or sentence embedded
within a section that looks more like real code. Another difference
between pseudocode and real code is that pseudocode often ignores
aspects of software engineering—such as data abstraction, modularity,
and error handling—in order to convey the essence of the algorithm
more concisely.
We start with insertion sort, which is an efficient algorithm for
sorting a small number of elements. Insertion sort works the way you
might sort a hand of playing cards. Start with an empty left hand and
the cards in a pile on the table. Pick up the first card in the pile and hold
it with your left hand. Then, with your right hand, remove one card at a
time from the pile, and insert it into the correct position in your left
hand. As Figure 2.1 illustrates, you find the correct position for a card by comparing it with each of the cards already in your left hand,
starting at the right and moving left. As soon as you see a card in your
left hand whose value is less than or equal to the card you’re holding in
your right hand, insert the card that you’re holding in your right hand
just to the right of this card in your left hand. If all the cards in your left
hand have values greater than the card in your right hand, then place
this card as the leftmost card in your left hand. At all times, the cards
held in your left hand are sorted, and these cards were originally the top
cards of the pile on the table.
The pseudocode for insertion sort is given as the procedure
INSERTION-SORT on the facing page. It takes two parameters: an
array A containing the values to be sorted and the number n of values to sort. The values occupy positions A[1] through A[n] of the array, which we denote by A[1 : n]. When the INSERTION-SORT procedure is
finished, array A[1 : n] contains the original values, but in sorted order.
Figure 2.1 Sorting a hand of cards using insertion sort.
INSERTION-SORT(A, n)
1  for i = 2 to n
2      key = A[i]
3      // Insert A[i] into the sorted subarray A[1 : i – 1].
4      j = i – 1
5      while j > 0 and A[j] > key
6          A[j + 1] = A[j]
7          j = j – 1
8      A[j + 1] = key
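For readers who want to run the procedure, here is a direct Python transcription. It uses 0-based indexing, so the book's A[1 : n] becomes A[0 : n] and the outer loop starts at index 1:

```python
def insertion_sort(A, n):
    """Sort A[0 : n] in place, mirroring INSERTION-SORT line by line."""
    for i in range(1, n):               # pseudocode line 1: for i = 2 to n
        key = A[i]                      # line 2
        # Insert A[i] into the sorted subarray A[0 : i] (line 3's comment).
        j = i - 1                       # line 4
        while j >= 0 and A[j] > key:    # line 5 (j > 0 shifts to j >= 0)
            A[j + 1] = A[j]             # line 6: move larger values right
            j = j - 1                   # line 7
        A[j + 1] = key                  # line 8

A = [5, 2, 4, 6, 1, 3]
insertion_sort(A, len(A))
print(A)  # [1, 2, 3, 4, 5, 6]
```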
Loop invariants and the correctness of insertion sort
Figure 2.2 shows how this algorithm works for an array A that starts out with the sequence 〈5, 2, 4, 6, 1, 3〉. The index i indicates the “current
card” being inserted into the hand. At the beginning of each iteration of
the for loop, which is indexed by i, the subarray (a contiguous portion of the array) consisting of elements A[1 : i – 1] (that is, A[1] through A[i – 1]) constitutes the currently sorted hand, and the remaining subarray A[i + 1 : n] (elements A[i + 1] through A[n]) corresponds to the pile of cards still on the table. In fact, elements A[1 : i – 1] are the elements originally in positions 1 through i – 1, but now in sorted order. We state these properties of A[1 : i – 1] formally as a loop invariant:
Figure 2.2 The operation of INSERTION-SORT( A, n), where A initially contains the sequence
〈5, 2, 4, 6, 1, 3〉 and n = 6. Array indices appear above the rectangles, and values stored in the array positions appear within the rectangles. (a)–(e) The iterations of the for loop of lines 1–8. In each iteration, the blue rectangle holds the key taken from A[ i], which is compared with the values in tan rectangles to its left in the test of line 5. Orange arrows show array values moved one position to the right in line 6, and blue arrows indicate where the key moves to in line 8. (f) The final sorted array.
At the start of each iteration of the for loop of lines 1–8, the
subarray A[1 : i – 1] consists of the elements originally in A[1 : i
– 1], but in sorted order.
Loop invariants help us understand why an algorithm is correct.
When you’re using a loop invariant, you need to show three things:
Initialization: It is true prior to the first iteration of the loop.
Maintenance: If it is true before an iteration of the loop, it remains true
before the next iteration.
Termination: The loop terminates, and when it terminates, the invariant
—usually along with the reason that the loop terminated—gives us a
useful property that helps show that the algorithm is correct.
When the first two properties hold, the loop invariant is true prior to
every iteration of the loop. (Of course, you are free to use established
facts other than the loop invariant itself to prove that the loop invariant
remains true before each iteration.) A loop-invariant proof is a form of
mathematical induction, where to prove that a property holds, you
prove a base case and an inductive step. Here, showing that the
invariant holds before the first iteration corresponds to the base case,
and showing that the invariant holds from iteration to iteration
corresponds to the inductive step.
The third property is perhaps the most important one, since you are
using the loop invariant to show correctness. Typically, you use the loop
invariant along with the condition that caused the loop to terminate.
Mathematical induction typically applies the inductive step infinitely,
but in a loop invariant the “induction” stops when the loop terminates.
Let’s see how these properties hold for insertion sort.
Initialization: We start by showing that the loop invariant holds before
the first loop iteration, when i = 2.² The subarray A[1 : i – 1] consists of just the single element A[1], which is in fact the original element in
A[1]. Moreover, this subarray is sorted (after all, how could a subarray
with just one value not be sorted?), which shows that the loop
invariant holds prior to the first iteration of the loop.
Maintenance: Next, we tackle the second property: showing that each
iteration maintains the loop invariant. Informally, the body of the for
loop works by moving the values in A[i – 1], A[i – 2], A[i – 3], and so on by one position to the right until it finds the proper position for
A[i] (lines 4–7), at which point it inserts the value of A[i] (line 8). The subarray A[1 : i] then consists of the elements originally in A[1 : i], but
in sorted order. Incrementing i (increasing its value by 1) for the next iteration of the for loop then preserves the loop invariant.
A more formal treatment of the second property would require us to
state and show a loop invariant for the while loop of lines 5–7. Let’s
not get bogged down in such formalism just yet. Instead, we’ll rely on
our informal analysis to show that the second property holds for the
outer loop.
Termination: Finally, we examine loop termination. The loop variable i
starts at 2 and increases by 1 in each iteration. Once i’s value exceeds n
in line 1, the loop terminates. That is, the loop terminates once i
equals n + 1. Substituting n + 1 for i in the wording of the loop invariant yields that the subarray A[1 : n] consists of the elements originally in A[1 : n], but in sorted order. Hence, the algorithm is correct.
This method of loop invariants is used to show correctness in various
places throughout this book.
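To make the argument concrete, here is a sketch of insertion sort in Python (0-origin, so the sorted prefix is A[:i] rather than the pseudocode's A[1 : i – 1]); the function name and the assertion checking the sortedness part of the loop invariant are illustrative additions, not part of the book's pseudocode.

```python
def insertion_sort(A):
    """Sort list A in place and return it, checking the invariant as we go."""
    for i in range(1, len(A)):          # pseudocode's "for i = 2 to n", shifted to 0-origin
        # Invariant (sortedness part): the prefix A[:i] is sorted.
        assert A[:i] == sorted(A[:i])
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key:    # shift larger elements one position right
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key                  # insert the key into its proper position
    return A
```

Running it on the array of Figure 2.2, insertion_sort([5, 2, 4, 6, 1, 3]) produces [1, 2, 3, 4, 5, 6], and the assertion confirms that the invariant holds before every iteration.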
Pseudocode conventions
We use the following conventions in our pseudocode.
Indentation indicates block structure. For example, the body of
the for loop that begins on line 1 consists of lines 2–8, and the
body of the while loop that begins on line 5 contains lines 6–7 but
not line 8. Our indentation style applies to if-else statements³ as
well. Using indentation instead of textual indicators of block
structure, such as begin and end statements or curly braces,
reduces clutter while preserving, or even enhancing, clarity.⁴
The looping constructs while, for, and repeat-until and the if-else
conditional construct have interpretations similar to those in C,
C++, Java, Python, and JavaScript.⁵ In this book, the loop
counter retains its value after the loop is exited, unlike some
situations that arise in C++ and Java. Thus, immediately after a
for loop, the loop counter’s value is the value that first exceeded
the for loop bound.⁶ We used this property in our correctness argument for insertion sort. The for loop header in line 1 is for i =
2 to n, and so when this loop terminates, i equals n + 1. We use the keyword to when a for loop increments its loop counter in each
iteration, and we use the keyword downto when a for loop
decrements its loop counter (reduces its value by 1 in each
iteration). When the loop counter changes by an amount greater
than 1, the amount of change follows the optional keyword by.
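One caution when translating this convention: Python's own for loop leaves the counter at its final in-range value (n, not n + 1), so a faithful rendering of the pseudocode semantics uses a while loop. A minimal sketch, with n chosen arbitrarily:

```python
n = 6
i = 2
while i <= n:      # pseudocode: "for i = 2 to n"
    pass           # loop body would go here
    i += 1
# After termination, i holds the first value that exceeded the loop bound,
# exactly as the pseudocode convention requires.
assert i == n + 1
# By contrast, "for i in range(2, n + 1)" would leave i equal to n.
```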
The symbol “//” indicates that the remainder of the line is a
comment.
Variables (such as i, j, and key) are local to the given procedure.
We won’t use global variables without explicit indication.
We access array elements by specifying the array name followed
by the index in square brackets. For example, A[i] indicates the ith element of the array A.
Although many programming languages enforce 0-origin indexing
for arrays (0 is the smallest valid index), we choose whichever
indexing scheme is clearest for human readers to understand.
Because people usually start counting at 1, not 0, most—but not
all—of the arrays in this book use 1-origin indexing. To be clear
about whether a particular algorithm assumes 0-origin or 1-origin
indexing, we’ll specify the bounds of the arrays explicitly. If you
are implementing an algorithm that we specify using 1-origin
indexing, but you’re writing in a programming language that
enforces 0-origin indexing (such as C, C++, Java, Python, or
JavaScript), then give yourself credit for being able to adjust. You
can either always subtract 1 from each index or allocate each
array with one extra position and just ignore position 0.
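The two adjustment strategies just described can be sketched in Python; the array contents here are arbitrary, chosen only for illustration.

```python
# Pseudocode view (1-origin): A[1 : n] with n = 4.
A = [31, 41, 59, 26]          # a 0-origin Python list
n = len(A)

# Strategy 1: subtract 1 from every pseudocode index.
first = A[1 - 1]              # pseudocode A[1]
last = A[n - 1]               # pseudocode A[n]

# Strategy 2: allocate one extra position and ignore position 0,
# so that B[1] through B[n] mirror pseudocode A[1] through A[n].
B = [None] + A
assert B[1] == first and B[n] == last
```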
The notation “:” denotes a subarray. Thus, A[i : j] indicates the subarray of A consisting of the elements A[i], A[i + 1], … , A[j].⁷
We also use this notation to indicate the bounds of an array, as we
did earlier when discussing the array A[1 : n].
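Note that this subarray notation includes both endpoints, whereas a Python slice A[i:j] excludes position j. Combined with the 1-origin shift, the pseudocode subarray A[i : j] corresponds to the Python slice A[i - 1 : j], as this small check illustrates:

```python
A = [5, 2, 4, 6, 1, 3]   # pseudocode A[1 : 6]

# Pseudocode A[2 : 4] means the elements A[2], A[3], A[4], inclusive.
# In 0-origin Python, that is the half-open slice A[1:4].
sub = A[2 - 1 : 4]
assert sub == [2, 4, 6]
```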
We typically organize compound data into objects, which are composed of attributes. We access a particular attribute using the
syntax found in many object-oriented programming languages:
the object name, followed by a dot, followed by the attribute
name. For example, if an object x has attribute f, we denote this
attribute by x.f.
We treat a variable representing an array or object as a pointer
(known as a reference in some programming languages) to the
data representing the array or object. For all attributes f of an
object x, setting y = x causes y.f to equal x.f. Moreover, if we now set x.f = 3, then afterward not only does x.f equal 3, but y.f equals 3 as well. In other words, x and y point to the same object after
the assignment y = x. This way of treating arrays and objects is
consistent with most contemporary programming languages.
Our attribute notation can “cascade.” For example, suppose that
the attribute f is itself a pointer to some type of object that has an
attribute g. Then the notation x.f.g is implicitly parenthesized as
(x.f).g. In other words, if we had assigned y = x.f, then x.f.g is the same as y.g.
Sometimes a pointer refers to no object at all. In this case, we give
it the special value NIL.
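Python variables that name objects behave exactly this way, and None plays the role of NIL. A brief sketch, using a hypothetical class Node with a single attribute f:

```python
class Node:
    """A hypothetical object type with one attribute, f."""
    def __init__(self, f):
        self.f = f

x = Node(f=1)
y = x            # y now points to the same object as x
x.f = 3
assert y.f == 3  # the change is visible through both names

p = None         # the pseudocode's NIL: a pointer to no object at all
assert p is None
```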
We pass parameters to a procedure by value: the called procedure
receives its own copy of the parameters, and if it assigns a value to
a parameter, the change is not seen by the calling procedure. When
objects are passed, the pointer to the data representing the object
is copied, but the object’s attributes are not. For example, if x is a
parameter of a called procedure, the assignment x = y within the
called procedure is not visible to the calling procedure. The
assignment x.f = 3, however, is visible if the calling procedure has
a pointer to the same object as x. Similarly, arrays are passed by