
 The Art of Computer Programming (TAOCP)
by Donald Knuth


Introduction


There's an apocryphal story about Steve Jobs meeting Knuth. The first thing Jobs said to him was "It's a pleasure to meet you, Dr. Knuth. I've read all your works!". Knuth's response was "You're full of shit."

http://www.folklore.org/StoryView.py?project=Macintosh&story=Close_Encounters_of_the_Steve_Kind.txt

Donald Knuth is probably the greatest of living computer scientists and an important contributor to open source (he authored TeX). See Portraits of Open Source Pioneers; Chapter 2 is devoted to Donald Knuth and also contains additional information about this classic book, along with a collection of Donald Knuth interviews.

The first three volumes of The Art of Computer Programming (TAOCP) are classics. Each is, IMHO, a book that every gifted CS student should try to study, re-implementing it example by example. Even among the most gifted, not many will manage to finish even the first volume, but if you do, you have definitely secured your employment at Microsoft: just write to Bill Gates, he once promised :-). Although if you managed to solve most of the exercises in the first volume, you probably will not be interested in Microsoft anyway ;-)

The fourth volume, the first chapter of which was published as a preprint in January 2011, could be a Guinness World Records entry: it appeared 38 years after volume three :-). See The Art of Computer Programming, Volume 4A Combinatorial Algorithms.

I think the most important volume to study is Vol 1. The way he wrote it is very enjoyable, with dry humor, and the problems are labeled very aptly according to difficulty. Get the first volume, read chapters 1 and 2, and see how you like it. The second edition of this volume can be bought at Amazon for as little as $2.56 plus shipping. I'm firmly in the camp of folks who feel that every developer should invest in these books at some point (and it's getting easier now, as older editions cost almost nothing on Amazon), but I do not recommend sitting down and reading the first volume from cover to cover while trying to do the "easy" exercises. Please note that some of the "difficult" exercises are research problems that might take months or even years to solve. A better approach is to read small fragments during your commute to work, if you have the opportunity to sit and read, so that in the future you will be able to find things in the books. Reading a full chapter from cover to cover is a lot of work (each chapter is half the book), and doing the exercises will probably take an average person several months. But even without reading the whole book you can use it as a reference or an encyclopedia, if you wish.

And for a student or researcher an early edition is as good as the latest edition, which costs more than ten times as much. It gives enough exposure to Donald Knuth's style and brilliant thinking. And it is the level of the author's thinking that represents the main value of the book: you instantly understand that the book was written by a great scientist, and it does not matter much that the contents of most chapters can now be significantly improved using more modern sources. After all, Vol 1 is more than 30 years old (it is older than Unix) and as such it should be outdated (we all believe in progress, don't we)...

Please note that parts of Vol 1 of TAOCP look completely out of touch with modern hardware, especially the MIX assembler. Actually MIX, designed for a computer with no byte-addressable memory (and the wrong size of byte ;-), was outdated even when the book was first published, and reflects Knuth's unique background with the IBM 650 more than the state of hardware development in the late 60s, the period when the IBM/360 was the king of the hill. Now the IBM 650, a 1,966 lb machine that consumed almost 30 kW of electricity, looks more like a primitive calculator than a real computer: a typical installation had a memory of just 10,000 decimal digits (1,000 words; 10 digits per word).

It's really sad that Knuth did not adopt the System/360 architecture and PL/360 (Wirth's structured assembler for the S/360) for his books, but we can do nothing about it.

But actually the complaint above is not quite fair: this is a book about timeless truths, not about recent CS fashions, like Java or you name it :-). It can actually serve as a perfect antidote to any current CS fashion. And Knuth does provide pseudocode along with his natural-language algorithm descriptions. The problem with "language-like" pseudocode is that its set of control structures is fixed and may not reflect the needs of a particular algorithm (branching out of a loop is a common case that structured programming does not handle well) or subsequent advances in languages. For example, Perl has an interesting set of control structures that is superior to C's. But even it can be improved. That's why assembly language is preferable: it never obscures the "natural" control structure of an algorithm, which one day may be mapped onto some new, elegant language construct. Also, as one reviewer noted, "sometimes high level languages with all their abstractions make things look more complex than they need be."
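
To make this point about control structures concrete, here is a minimal C sketch (mine, not Knuth's): searching a matrix for a value "naturally" wants a single jump out of two nested loops, which assembler expresses as one branch, and which structured loop syntax can express only with flags or goto:

    #include <stdio.h>

    /* Search a 3x3 matrix for a target value. The "natural" control flow
     * is a single jump out of both loops as soon as the target is found --
     * the kind of multi-level exit that maps onto one assembler branch but
     * that structured loop syntax expresses only with flags or goto. */
    int find(int m[3][3], int target, int *row, int *col)
    {
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                if (m[i][j] == target) {
                    *row = i;
                    *col = j;
                    goto found;        /* branch out of both loops at once */
                }
        return 0;                      /* not found */
    found:
        return 1;
    }

    int main(void)
    {
        int m[3][3] = { {1,2,3}, {4,5,6}, {7,8,9} };
        int r, c;
        if (find(m, 5, &r, &c))
            printf("found at row %d, column %d\n", r, c);
        return 0;
    }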

Each volume is very difficult to read; you really need to work your way through each chapter by re-implementing the examples that Knuth gives in your favorite language (assembler might help, but it is not essential).

Mathematical considerations such as the average and worst-case running time of a particular algorithm can be largely ignored during the first couple of years of study of this book. Average-time formulas are usually wrong ;-). And even when they are right, they are useful mostly as a very rough estimate of an algorithm's average speed (O notation for sorting algorithms is an example).

Worst-case formulas are useful only if you can verify that they correspond to real tests, and the author should have provided a test suite, which was not done for this book. Different architectures can behave very differently in this respect (depending on the size of the cache, the number of comparisons that break the pipeline, etc.). The publisher should help in such cases, but Addison-Wesley preferred just to milk the book and never provided downloads of the code, test cases, or a MIX assembler for the PC. Suckers...
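
For what it is worth, the kind of test suite meant here is easy to sketch. Below is a minimal C timing harness (my sketch; the standard library's qsort stands in for whatever algorithm is under test, and a serious suite would use better clocks and several input distributions):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Compare two ints for qsort. */
    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    int main(void)
    {
        /* Time the sort on random input at several sizes; if the measured
         * times do not grow roughly like n log n on your machine, the
         * textbook formula is not telling you much about your machine. */
        for (size_t n = 1000; n <= 1000000; n *= 10) {
            int *a = malloc(n * sizeof *a);
            if (!a) return 1;
            for (size_t i = 0; i < n; i++)
                a[i] = rand();
            clock_t t0 = clock();
            qsort(a, n, sizeof *a, cmp_int);
            double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
            printf("n = %7zu   time = %.4f s\n", n, secs);
            free(a);
        }
        return 0;
    }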

Actually, most of the mathematics in Vol 1 can (and probably should) be initially completely ignored.

Please note that the latest editions are usually slightly overpriced. To save money you can buy one of the first editions: there is not that much difference in content to justify the difference in price. Actually the differences are so minor that they are almost unnoticeable. Knuth did excellent work the first time he published each volume. And he worked ten years non-stop (from 1962 to 1972) to publish the first three volumes. As for the fourth volume, he initially planned to finish it in 2003. So it might never happen in full: as of 2014 only one chapter had been printed, and he is already 76 (born January 10, 1938).

The first three volumes are now so tightly connected to the state of programming and hardware in the 1960s that a significant revision essentially means writing a new book. Don't expect it, despite Knuth's promises. I think that for a significant improvement we probably need another century and another person.

Selected Amazon reviews

Here is one of the most interesting (and not so sycophantic :-) reviews of the book on Amazon:

THE CS Bible? Let's be realistic and honest, November 15, 2003
Reviewer: Strict Evaluation (Seattle, WA USA)
The Art of CP (TAoCP) book set covers the core of computer science curriculum on data structures and algorithms. Not everything there is today (that would be impossible), just the core, but that's more than enough to begin with (and for most people quite sufficient in general.) This is typical Knuth: he knows his stuff, he writes very well, he's an encyclopedic mind; his texts are mathematically rich, yet at the same time not overwhelming; time and again he demonstrates this 19-th century Germanic scientific style, which is to say he's incredibly detailed and exact -- one can even accuse him of pedantry, but in a good sense. He writes with a sharp, dry wit (his sense of humor makes him unique among the rest of the writers on the computing theme.) So far so good.

However, all the benefits mentioned above notwithstanding, I have to say that on balance this triptych of his is impractical. It has either become outdated, or was even originally written with an independently-wealthy reader in mind, someone like an 18th century gentleman-farmer who, fully disencumbered of the vulgarity of having to earn a living, is leisurely indulging in the exercise of his mental ability for the pure intellectual challenge of it; someone with no plebeian concerns of practicality ever entering his exalted mind.

The problem is with MIX. I second what the others said about it, and what's more, I refuse to accept the explanation (purportedly Knuth's) posted below by someone: the problem is not only that MIX is an assembly language (which still would be a functional malapropism in a book like that) -- no, a far more grievous problem is that MIX is a phantasm, a whimsically extravagant invention having no real-life equivalent, at least today. The mythical processor underlying this thing (5-byte words, etc) is not something that anyone below 40 years of age has ever seen, even if it does have historical analogues.

The gravity of the offence here is not that it is some real but unfamiliar processor's assembly -- after all, if you know i86 assembly, you can (kinda, sorta) read the Motorola equivalent... No, it is that MIX and the fairy-tale processor architecture it is imagined to run on are *purposely made to resemble nothing* that you may have some familiarity with -- thus making the already-difficult material obfuscated beyond anything even marginally manageable for a regular computing Joe, who has a (real) life, and at any rate, can't limit his CS intake to this one work.

Elucidating difficult in itself CS material via examples in assembly language of even a real (or made-up but realistic) kind is a very bad choice because the student's attention, already taxed by the subject matter itself, will be further burdened by the non-algorithmic nature of the assembly language. But to exacerbate this potential ordeal by insisting on the use of something so gratuitously eccentric and profitless for the reader as MIX is simply unconscionable.

Ideally, what a good CS text of this sort will use is pseudocode. But if a writer wishes to add to his book a realistic slant, it is acceptable that he use some sort of real language -- so long as it is algorithmic; today, C is a perfect choice. Knuth counters (and he's absolutely right): there was no C when the book was written. He's also right in saying that had he written it with Pascal it would have become outdated by now. So if that was the problem, TAoCP could have been written with some sort of pseudocode; this would last forever.

Of course, even using a real language would not actually be such a great problem -- we all know of similar books where the originally-chosen language was replaced when it fell in disuse: for example the numeric programming book by Teukolsky; it started with Fortran and was then redone in C; this demonstrates that the language part can be brought up-to-date if necessary. Both Fortran and C are algorithmic languages that, owing to their readability, can be used instead of pseudocode.

Ideally, books should be written with both pseudocode (a must, in my view), and, in order to give an example of an actual implementation, some real language (see the recent book by Goodrich; it's pseudocode throughout and a smattering of Java here and there -- perfect!)

To sum it all up: measured by today's needs, The Art of CP is overrated (out of snobbery; bragging of having read it is "kewl"; meantime, the truth is, not too many people are capable of such a feat for the reasons stipulated above; when actually used, TAoCP books are read in chunks, a chapter here, a chapter there -- which is a shame, because they are very well written, and to work through them in their entirety would be much more profitable than biting off a little here and there.)

I am going to be slammed by the Knuth cult followers for saying this, but I do not recommend these books. Instead, consider something similar but more practical: two titles immediately come to mind, Cormen & Co. (a.k.a. CLR) and Goodrich (forget the title but search on the name.) Foundations of Computer Science by Aho/Ullman (the Tortoise book) is a suitable option as well.

TAoCP is potentially very good, but until someone ruthlessly excises all the bloody junk (MIX etc.) and replaces it with pseudocode or C, it will remain useless.

Don't get me wrong here: Knuth is an admirable, justly venerated computer scientist, and a very good writer to boot (for example his Concrete Mathematics book is excellent). But when it comes to TAoCP, even though to mention it is very chic in some circles, we must admit the obvious: he has produced a work that's impenetrable (or, rather, the enormity of time and effort required to penetrate it makes such an attempt an unworthy investment) and therefore useless in practical terms for the majority of the potential readership.

Note: I suspect that the author of this review never dealt with algorithms in depth in his professional life: I totally disagree with his recommendation of Cormen & Co. (a.k.a. CLR), which IMHO is too devoid of any connection with reality to be useful. It is science for science's sake. I don't know much about Goodrich's book or Foundations of Computer Science by Aho/Ullman (the Tortoise book).


Here is another review that touches on a very important subject: professionalism in programming is strongly correlated with the level of knowledge of TAOCP:

Without reading Knuth you are at most a talented amateur, August 27, 2002

Reviewer: J. putnam "jefu" (eastern washington state, usa) -

The three volumes of the original version of "The Art of Computer Programming" are more than thirty years old now. I still have the edition I bought back in 1978 or so and they're never too far from the "easy to reach shelf" in my bookshelves. Sometimes I rearrange things and move them away, always figuring that newer books will work as well, but somehow they always move back - not always quickly, but rarely too slowly.

Sometimes it's because I just want to reread something, sometimes it's because I want to challenge myself with one of the problems, but often enough it is because I find myself needing to supplement information from somewhere else or because I just can not find quite what I need anywhere else. And I will turn to the web to search for things - but first I usually check out TAOCP.

It can be tough going in some places, the math sometimes reaching the "AAArrrggghhhh, run away, run away" kind of appearance, but a bit of work almost invariably pays off.

This is not a book from which you will learn to program. You should have some facility with more than college freshman level mathematics. And you'll need to read things more than once in many cases.

If you're an IT person, a software installer type, a low level coder or the like and are content with this, you can probably afford to avoid ever reading TAOCP, but if you want to solve the hard problems, if you want to learn just WHY things work, and learn the mathematics and the kinds of analysis techniques that make the difference between the grunt programmer and the really good ones, you'll need the math, you'll need the kind of information, knowledge and computerology-goodness-and-niceness that TAOCP (and few books other than TAOCP) can give you.

Seeing a well used copy of TAOCP on a computer professional's bookshelf is always a sign to me that they're serious about their profession and about their own learning. Not seeing one is often a flag that they're someone I'd rather not have on a project I'm involved with. Worse yet, not seeing TAOCP on a CS professor's bookshelf leads me to the impression that the professor is at best semi-educated, and at worst one of those frightening types who managed to slip through the cracks somehow.

Now I'm waiting for Volume 4 (and all the rest). (Interestingly, I got one of those mail back cards from the publisher about 4 years ago asking me if I wanted to reserve a copy of the "soon to be published" V4 - I could not resist calling and asking lots of details about it.)

I was also lucky enough to find copies of the original editions of V1 and V3 two years back in a book sale for a grand total of about a dollar. Now I have 1.101010... (in binary) sets. If V1-4 get published in a boxed set I'll undoubtedly spring for the set at whatever price.

For more reviews see Recommended Links

TAOCP Volumes

  1. The first volume of the series was published in 1968. Only small corrections were made in the second and third editions. The third edition of the book was produced using TeX. Still, all examples are in MIX.
  2. The second volume, The Art of Computer Programming: Seminumerical Algorithms, was published one year after the first, in 1969.
  3. The third volume, The Art of Computer Programming: Sorting and Searching, was published four years after the first, in 1973. Unfortunately, it was written when Knuth was already pretty much exhausted by publishing the first two volumes, and that shows. The third volume was also based not on Knuth's own research but on the lecture notes of Professor Robert W. Floyd, one of the few winners of the ACM Turing prize and one of the top scientists and technologists of the early US computer science explosion. Also, at the moment of writing the field was still developing quickly, and this shows in the book. It is the most outdated volume of the three, despite being published last. Nevertheless it remains a classic because Knuth is Knuth, and his touch on the subject can't easily be replicated. Most of the content of volume 3 can now be found in other books, although in a somewhat emasculated fashion.
  4. The fourth volume had not been published in full as of 2014, but a large chapter, Combinatorial Algorithms, was published in 2011, which probably makes a record for the interval between volumes of a non-fiction book (almost 40 years!).

While each of the volumes is now considered a classic, the third volume is less so than the first. See the TAoCP and its Influence of Computer Science page for more details.

All three volumes have been translated into all major languages, including Russian and Chinese.

By publishing these books Donald Knuth essentially started the systematic, encyclopedic study of algorithms (and actually coined the term "analysis of algorithms" in the mid-sixties) -- a field of computer science whose overall goal is an understanding of the properties of algorithms, as well as providing a formal measure of their complexity. Properties of random strings, permutations, trees, and graphs are essential components in the analysis of algorithms and at the same time are building blocks of system programs. That's why trees were introduced in the first volume. As he remarked:

The original work I do in The Art of Computer Programming is to take the methods of two different authors and analyze method A from the standpoint of author B, and method B from the standpoint of author A. They have only given their sides of it, so I try to fill in ....

TOC of all three volumes

All volumes have recently been republished in new editions typeset with TeX. EPUB and PDF editions are expected in 2015. Amazon has been selling Kindle editions of the first three volumes for some time.

Volume 1. The Art of Computer Programming : Fundamental Algorithms (Vol 1, 3rd Ed)

Donald E. Knuth / Paperback / Published 1997
This is actually a book first published in 1968 and written in 1962-1967. Only small corrections were made in the second and third editions. This edition of the book was produced using TeX.

This book is a real classic, and it combines the author's unique style with an introduction to many important algorithms and concepts of system programming (e.g., coroutines).

Table of Contents

1. Basic Concepts.

2. Information Structures.

Answers to Exercises.

Appendix A. Tables of Numerical Quantities.

  1. Fundamental Constants (decimal).
  2. Fundamental Constants (octal).
  3. Harmonic Numbers, Bernoulli Numbers, Fibonacci Numbers.

Appendix B. Index to Notations.

Index and Glossary

Back Cover

The bible of all fundamental algorithms and the work that taught many of today's software developers most of what they know about computer programming.

-Byte, September 1995

I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me! I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up.

-Charles Long

If you think you're a really good programmer... read [Knuth's] Art of Computer Programming... You should definitely send me a resume if you can read the whole thing.

-Bill Gates

It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers.

-Jonathan Laventhol

This first volume in the series begins with basic programming concepts and techniques, then focuses more particularly on information structures -- the representation of information inside a computer, the structural relationships between data elements, and how to deal with them efficiently. Elementary applications are given to simulation, numerical methods, symbolic computing, software and system design. Dozens of simple and important algorithms and techniques have been added to those of the previous edition. The section on mathematical preliminaries has been extensively revised to match present trends in research.
           

Volume 2. The Art of Computer Programming : Seminumerical Algorithms (Vol 2, 3rd Ed)

Donald E. Knuth / Paperback / Published 1997

The first edition was published in 1969, one year after the first volume. This edition of the book was produced using TeX. This volume is more difficult to read, more specialized, and generally less impressive than the first volume. Some parts of the text now show their age, as the theory of random number generation is now more advanced than it was in the late 60s, when the book was written. You might avoid buying it "just for the collection" unless you really use random number generators and the other topics covered in this volume.

Table of Contents

3. Random Numbers.

4. Arithmetic.

Answers to Exercises.

Appendix A. Tables of Numerical Quantities.

Volume 3 The Art of Computer Programming : Sorting and Searching (Vol 3, 2nd Ed)

Donald E. Knuth / Paperback / Published 1998

The first edition was published in 1973. This edition of the book was produced using TeX. This volume now shows its age but is still a very valuable, unique book. I would still recommend buying it.
Preface
Cookery is become an art,
a noble science;
cooks are gentlemen.

TITUS LIVIUS, Ab Urbe Condita XXXIX.vi
(Robert Burton, Anatomy of Melancholy 1.2.2.2)

This book forms a natural sequel to the material on information structures in Chapter 2 of Volume 1, because it adds the concept of linearly ordered data to the other basic structural ideas.

The title "Sorting and Searching" may sound as if this book is only for those systems programmers who are concerned with the preparation of general-purpose sorting routines or applications to information retrieval. But in fact the area of sorting and searching provides an ideal framework for discussing a wide variety of important general issues:

  • How are good algorithms discovered?
  • How can given algorithms and programs be improved?
  • How can the efficiency of algorithms be analyzed mathematically?
  • How can a person choose rationally between different algorithms for the same task?
  • In what senses can algorithms be proved ''best possible''?
  • How does the theory of computing interact with practical considerations?
  • How can external memories like tapes, drums, or disks be used efficiently with large databases?

Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting or searching!

This volume comprises Chapters 5 and 6 of the complete series. Chapter 5 is concerned with sorting into order; this is a large subject that has been divided chiefly into two parts, internal sorting and external sorting. There also are supplementary sections, which develop auxiliary theories about permutations (Section 5.1) and about optimum techniques for sorting (Section 5.3). Chapter 6 deals with the problem of searching for specified items in tables or files; this is subdivided into methods that search sequentially, or by comparison of keys, or by digital properties, or by hashing, and then the more difficult problem of secondary key retrieval is considered. Since searching is related to sorting, there is a surprising amount of interplay between both chapters, with strong analogies tying the topics together. Two important varieties of information structures are also discussed, in addition to those considered in Chapter 2, namely priority queues (Section 5.2.3) and linear lists represented as balanced trees (Section 6.2.3).

Like Volumes 1 and 2, this book includes a lot of material that does not appear in other publications. Many people have kindly written to me about their ideas, or spoken to me about them, and I hope that I have not distorted the material too badly when I have presented it in my own words.

I have not had time to search the patent literature systematically; indeed, I decry the current tendency to seek patents on algorithms (see Section 5.4.5). If somebody sends me a copy of a relevant patent not presently cited in this book, I will dutifully refer to it in future editions. However, I want to encourage people to continue the centuries-old mathematical tradition of putting newly discovered algorithms into the public domain. There are better ways to earn a living than to prevent other people from making use of one's contributions to computer science.

Before I retired from teaching, I used this book as a text for a student's second course in data structures, at the junior-to-graduate level, omitting most of the mathematical material. I also used the mathematical portions of this book as the basis for graduate-level courses in the analysis of algorithms, emphasizing especially Sections 5.1, 5.2.2, 6.3, and 6.4. A graduate-level course on concrete computational complexity could also be based on Sections 5.3, and 5.4.4, together with Sections 4.3.3, 4.6.3, and 4.6.4 of Volume 2.

For the most part this book is self-contained, except for occasional discussions relating to the MIX computer explained in Volume 1. Appendix B contains a summary of the mathematical notations used, some of which are a little different from those found in traditional mathematics books.

Table of Contents

5. Sorting.

  • Combinatorial Properties of Permutations.
  • Inversions.
  • Permutations of a Multiset.
  • Runs.
  • Tableaux and Involutions.
  • Internal sorting.
  • Sorting by Insertion.
  • Sorting by Exchanging.
  • Sorting by Selection.
  • Sorting by Merging.
  • Sorting by Distribution.
  • Optimum Sorting.
  • Minimum-Comparison Sorting.
  • Minimum-Comparison Merging.
  • Minimum-Comparison Selection.
  • Networks for Sorting.
  • External Sorting.
  • Multiway Merging and Replacement Selection.
  • The Polyphase Merge.
  • The Cascade Merge.
  • Reading Tape Backwards.
  • The Oscillating Sort.
  • Practical Considerations for Tape Merging.
  • External Radix Sorting.
  • Two-Tape Sorting.
  • Disks and Drums.
  • Summary, History, and Bibliography.

6. Searching.

  • Sequential Searching.
  • Searching by Comparison of Keys.
  • Searching an Ordered Table.
  • Binary Tree Searching.
  • Balanced Trees.
  • Multiway Trees.
  • Digital Searching.
  • Hashing.
  • Retrieval on Secondary Keys.

Answers to Exercises.

Appendix A: Tables of Numerical Quantities.

  • Fundamental Constants (decimal).
  • Fundamental Constants (octal).
  • Harmonic Numbers, Bernoulli Numbers, Fibonacci Numbers.

Volume 4A. The Art of Computer Programming, Volume 4A Combinatorial Algorithms

Hardcover: 912 pages
Publisher: Addison-Wesley Professional (January 22, 2011)
ISBN-10: 9780201038040
ISBN-13: 978-0201038040
ASIN: 0201038048
Product Dimensions: 9.4 x 6.7 x 2 inches

Edward T. Pegg Jr. (Champaign, IL USA): A gorgeous classic on combinatorial thought (February 2, 2011)

Knuth has written many books considered classics. Some of the previous works have been a set-up for where the real fun is -- combinatorics. In one of my own columns, I say "Never trust the brute-force power of a computer network to do the job of a combinatorialist." In 1967, John P. Robinson and Arthur J. Bernstein published an optimal Golomb ruler with 24 marks (OGR24). Their solution was confirmed in 2004 by a massive distributed effort using tens of thousands of computer-years.

Knuth is attempting to discuss all the algorithms that will still be important 50 years from now. The amount of speedup gained using these algorithms is staggering.

Helpful mathematical illustrations feature prominently throughout the book, and pretty much every page is gorgeously formatted. Knuth developed TeX in part to produce beautiful books, and that is on display here.

Many thoughtful questions are provided as an aid to learning these very useful techniques. The Answers section runs for 303 pages.

It will take me months or years to digest most of the information in this work, but I can't imagine a better presentation for this difficult but lucratively useful material.

Table of Contents

Preface
Notes on the Exercises

Chapter 7: Combinatorial Searching 1

  • 7.1: Zeros and Ones 47
  • 7.2: Generating All Possibilities 281

Answers to Exercises 514

Appendix A: Tables of Numerical Quantities 818

Appendix B: Index to Notations 822

Appendix C: Index to Algorithms and Theorems 828

Appendix D: Index to Combinatorial Problems 830

Index and Glossary 834

Digital Typography -- Knuth's digression from writing The Art of Computer Programming

Paradoxically, Knuth stopped working on his volumes because he was not completely satisfied with how the published volumes were typeset. Or perhaps he intuitively felt that he needed to stop and breathe fresh air after a ten-year marathon. In any case, he switched to creating the TeX typesetting system, which took him another decade. The next Knuth books were devoted to it. Here again his choice of language was problematic -- Pascal.

TeX became the de facto standard in mathematical publishing, so this one-man company named Donald Knuth gave publishing software vendors a run for their money ;-)

Digital Typography (Csli Lecture Notes, 78)

 Donald E. Knuth / Hardcover

TeX is Donald Knuth's most famous open source program. It is extremely rare for a mathematician at Knuth's level to write both a large application program and its documentation -- and rarer still to succeed at both. Donald Knuth wrote this program to help him publish his Art of Computer Programming without typos.

He created it in 1978, just before the PC revolution (the current version, TeX82, followed in 1982), and it was the first open source typesetting program of such outstanding quality. Donald Knuth spent considerable time learning the book compositor's art, and that shows in the details of TeX -- as with the oft-mentioned paragraph optimization routines. But more than this, TeX is malleable. It is a tool that lets skilled compositors automate more of the niceties of fine composition, rather than having to add them by hand.

Just by chance SGML was chosen as the basis of HTML instead of TeX, and TeX instantly became a second-class citizen of computer publishing.

But the jury is still out, and only the future can tell the final verdict.

The TeXbook: A Complete Guide to Computer Typesetting with TeX

Donald Ervin Knuth / Paperback / Published 1988

Amazon price: $41.95

TeX: The Program (Computers and Typesetting, Vol B)

Donald Ervin Knuth / Hardcover / Published 1986

Amazon price: $51.95

The Metafontbook (Computers and Typesetting, Vol C)

Donald Ervin Knuth / Hardcover / Published 1986

Metafont : The Program (Computers and Typesetting, Vol D)

Donald Ervin Knuth / Hardcover / Published 1986

Computer Modern Typefaces (Computers and Typesetting, Vol E)

Donald Ervin Knuth / Hardcover / Published 1986

Literate Programming -- good idea, subpar execution

Knuth is also an avid proponent of so-called literate programming. For me the key is the ability to mix the text of the program with annotations and documentation. TeX is not essential here. It looks like you could implement a more modern version of this by using an HTML editor for writing both code and documentation, converting the text of the program from HTML to ASCII on the fly before compilation.

Also, a dictionary of all words should be constructed and maintained on the fly by the literate programming tool. It makes perfect sense for large programs. Probably MS Word, FrontPage or any other powerful HTML editor could be used as a literate programming tool.
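
To make the idea concrete, here is a minimal C sketch of the "tangle" half of such a tool: it extracts the code and throws away the prose. The <code>...</code> markers are hypothetical, and a real literate programming system such as Knuth's CWEB does far more (chunk reordering, cross-referencing, indexing):

    #include <stdio.h>
    #include <string.h>

    /* Toy "tangle" step: read a mixed documentation/code file on stdin
     * and emit only the lines between <code> and </code> markers,
     * discarding the surrounding prose. */
    int main(void)
    {
        char line[4096];
        int in_code = 0;

        while (fgets(line, sizeof line, stdin)) {
            if (strstr(line, "<code>"))  { in_code = 1; continue; }
            if (strstr(line, "</code>")) { in_code = 0; continue; }
            if (in_code)
                fputs(line, stdout);
        }
        return 0;
    }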

Literate Programming. CSLI Lecture Notes Number 27. Center for the Study of Language and Information (Palo Alto: 1992). ISBN 0-937073-80-6:

This book contains a description of the hypertext-based approach to writing code (see Literate Programming -- Propaganda and Tools for introductory information) as well as some of his most well-known papers, including the influential "Structured Programming with go to Statements" and the Turing Lecture "Computer Programming as an Art". A detailed history of all errors in TeX can be found in chapters 10 and 11.

   Contents
     Acknowledgments
     Preface

     1. Computer Programming as an Art [A.M. Turing Award Lecture, 1974]
     2. Structured Programming with goto Statements [1974]
     3. A Structured Program to Generate All Topological Sorting Arrangements [with Jayme L. Szwarcfiter, 1974]
     4. Literate Programming [1984]
     5. Programming Pearls, by Jon Bentley: Sampling [1986]
     6. Programming Pearls, Continued: Common Words [1986]
     7. How to Read a WEB [1986]
     8. Excerpts from the Programs for TEX and METAFONT [1986]
     9. Mathematical Writing [1987]
     10. The Errors of TEX [1989]
     11. The Error Log of TEX [1991]
     12. An Example of CWEB [1990]
     Further Reading
     Index

4 of 5 stars Arguing for an aesthetic appreciation of programming, March 30, 2000
 Reviewer: Charles Ashbacher from Hiawatha, IA ([email protected])

Writing computer programs is easy, writing programs that are useful is hard and writing programs that are very useful as well as correct sometimes seems impossible. Knuth takes this truism even further and offers up the radical notion that the very best programs are so profound that people will one day read them as one would a piece of classic literature. If the idea of curling up by the fire with a copy of The World's Greatest Programs and spending the night in a state of rapture seems absurd, you think as I did. However, after reading this book, my mind now concedes the possibility does exist. After all, most of the great works of literature describe actions, conditions and solutions (algorithms) to problems of human-human and sometimes human-god interactions. Science fiction writers and readers have known for a long time that computers are very interesting objects. Buildings, paintings or other works of art are often admired not only for their subjective beauty, but also for the talent that it took to create them. Programming ability can be admired just as easily.

However, an extremely large technical barrier exists, in that programming languages are literal, terse and lack flair. Knuth works to eliminate this problem by combining the programming and documentation languages into a structure called a WEB. He also adopts the reverse paradigm that a program should be an explanation to humans of what the computer is doing. The result does wonders for readability and introduces a bit of flair. Certainly, this is a good first step towards Knuth's ideal.

The development of TEX is chronicled in great detail. It is personally comforting to read about some of the errors made in its development. Learning that the great ones make errors provides emotional security to all who hack for fun and/or profit. Some classic programming problems are used to demonstrate exactly what literate programming is meant to be. Jon Bentley, author of the 'Programming Pearls' section of "Communications of the ACM", contributes two chapters that were co-authored with Donald Knuth. These pearls demonstrate the applications of literate programming to common coding problems. All are presented in a clear, easy-to-understand style.
A bit of clever humor is also used. A WEB program is constructed from two distinct components. The Weave part explains what the program is doing, and the Tangle component produces the program. Of course, this suggests the line from Sir Walter Scott's poem Marmion, "O what a tangled web we weave, when first we practice to deceive."

I do not know whether to consider this book the product of a dreamer or a visionary. The truth, like most of the work of pioneers, is no doubt somewhere in between. My opinion is that it is more vision than dream. And is that not a common theme among the greatest works of art and literature?

Published in Mathematics and Computer Education, reprinted with permission.

There is a free literate programming system -- CWeb -- that is used by Donald Knuth himself. For details see: 

The Cweb System of Structured Documentation/Version 3.0

Donald E. Knuth, et al / Paperback / Published 1994

Addison-Wesley Professional did a second-rate job as a publisher

According to Wikipedia: Addison-Wesley is a publisher of textbooks and computer literature. It is an imprint of Pearson PLC, a global publishing and education company. In addition to publishing books, Addison-Wesley also distributes its technical titles through the Safari Books Online e-reference service. Addison-Wesley's majority of sales derive from the United States (55%) and Europe (22%)

Given their longevity and multiple editions (four editions of each volume now), these books are a cash cow for the publisher.

But the publisher did not give anything in return. No CD with the book. No downloadable code from the book, no MIX interpreter for Windows, no debugging tools. Nothing.

The quality of the flowcharts in all three volumes is extremely low. That's another big no-no for the publisher: there is an established flowchart style, and to invent some bastard child instead is simply stupid.

This is a second-rate job, to say the least.

 

Knuth articles that are held hostage by ACM, and books that are compilations of articles

ACM holds the copyright on a lot of Donald Knuth's articles, but has never published them on the Web for free access, although the ACM electronic archive is now available to ACM members for approx. $70 a year. Some of those articles are still important. There is also a compilation of Knuth's articles of various quality in two books:

Selected Papers on Computer Science (Csli Lecture Notes, No. 59)

Donald Ervin Knuth/ Paperback / Published 1996

Everyone should read Don Knuth's homage to the IBM 650. He describes his rapture upon first reading Stan Poley's code for SOAP (the Symbolic Optimal Assembly Program, written by Stan Poley of IBM, which was the assembler of choice for almost every 650 user) and the sheer beauty of his work.

 Content
     Preface
     Acknowledgments

     0. Algorithms, Programs, and Computer Science [1966; 1992]
     1. Computer Science and its Relation to Mathematics [1973; 1974]
     2. Mathematics and Computer Science: Coping with Finiteness [1976]
     3. Algorithms [1977]
     4. Algorithms in Modern Mathematics and Computer Science [1981]
     5. Algorithmic Themes [1988]
     6. Theory and Practice, I [1977]
     7. Theory and Practice, II [1985]
     8. Theory and Practice, III [1986]
     9. Theory and Practice, IV [1989]
     10. Are Toy Problems Useful [1977]
     11. Ancient Babylonian Algorithms [1972; 1976]
     12. Von Neumann's First Computer Program [1970]
     13. The IBM 650: An Appreciation from the Field [1986]
     14. George Forsythe and the Development of Computer Science [1972]
     15. Artistic Programming [1993]
     Index

Selected Papers on Analysis of Algorithms (CSLI Lecture Notes, No. 102)

Donald Ervin Knuth / Paperback / Published 2000 / Center for the Study of Language and Information (Palo Alto). ISBN 1-57586-212-3

 


Old News ;-)

[Dec 26, 2016] Ask Slashdot: Have You Read The Art of Computer Programming?

Dec 26, 2016 | ask.slashdot.org
Posted by EditorDavid on Sunday December 04, 2016 @09:09PM from the truth-from-Knuth dept.

In 1962, 24-year-old Donald Knuth began writing The Art of Computer Programming, publishing three volumes by 1973, with volume 4 arriving in 2005. (Volume 4A appeared in 2011, with new paperback fascicles planned for every two years, and fascicle 6, "Satisfiability," arriving last December.)

"You should definitely send me a resume if you can read the whole thing," Bill Gates once said, in a column where he described working through the book . "If somebody is so brash that they think they know everything, Knuth will help them understand that the world is deep and complicated."

But now long-time Slashdot reader Qbertino has a question: I've had The Art of Computer Programming on my book-buying list for just about two decades now and I'm still torn...about actually getting it. I sometimes believe I would mutate into some programming demi-god if I actually worked through this beast, but maybe I'm just fooling myself...

Have any of you worked through or with TAOCP or are you perhaps working through it? And is it worthwhile? I mean not just for bragging rights. And how long can it reasonably take? A few years?

Share your answers and experiences in the comments. Have you read The Art of Computer Programming ?

[Oct 24, 2014] Knuth Recent News

Hopefully this will be the version of the books in which MIX is replaced by MMIX. But so far (October 2014) the only book available as an ebook is the MMIX description.

Announcing the first Art of Computer Programming eBooks

For many years I've resisted temptations to put out a hasty electronic version of The Art of Computer Programming, because the samples sent to me were not well made.

But now, working together with experts at Mathematical Sciences Publishers, Addison-Wesley and I are launching an electronic edition that meets the highest standards. We've put special emphasis into making the search feature work well.

Thousands of useful "clickable" cross-references are also provided --- from exercises to their answers and back, from the index to the text, from the text to important tables and figures, etc.

Note: However, I have personally approved ONLY the PDF versions of these books. Beware of glitches in the ePUB versions, which cannot be faithful to my intentions because of deficiencies in that peculiar format.

The first fascicle can now be ordered from Pearson's InformIT website, and so can Volumes 1 and 2.

Volumes 3 and 4A will be ready very soon.

[Mar 30, 2011] Book Review: The Art of Computer Programming. Volume 4A: Combinatorial Algorithms

30 years since his last book ...

Slashdot

Decades in the making, Donald Knuth presents the latest few chapters in his by now classic book series The Art of Computer Programming. The computer science pioneer's latest book on combinatorial algorithms is just the first in an as-of-yet unknown number of parts to follow. While these yet-to-be-released parts will discuss other combinatorial algorithms, such as graph and network algorithms, the focus of this book, titled Volume 4A: Combinatorial Algorithms Part 1, is solely on combinatorial search and pattern generation algorithms. Much like the other books in the series, this latest piece is undoubtedly an instant classic, not to be missing in any serious computer science library or book collection. Keep reading for the rest of asgard4's review.

The Art of Computer Programming. Volume 4A: Combinatorial Algorithms Part 1
Author: Donald E. Knuth
Pages: 883
Publisher: Addison-Wesley Publishing
Rating: 9/10
Reviewer: asgard4
ISBN: 0-201-03804-8
Summary: Knuth's latest masterpiece. Almost all there is to know about combinatorial search algorithms.

The book is organized into four major parts: an introduction, a chapter on Boolean algebra, a chapter on algorithms to generate all possibilities (the main focus of the book), and finally 300 pages of answers to the many exercises at the end of every section in the book. These exercises and answers make this work an excellent companion for teachers of a university course.

The book begins with some introductory examples of combinatorial searching and then gives various definitions of graphs and directed acyclic graphs (DAGs) since a lot of combinatorial algorithms conveniently use graphs as the data structures they operate on. Knuth's writing style is terse and to the point, especially when he presents definitions and proofs. However, the text is sprinkled with toy problems and puzzles that keep it interesting.

After the introduction, the first chapter of the book (out of only two) is titled "Zeros and Ones" and discusses Boolean algebra. Most readers that have studied computer science in some form should be intimately familiar with most of the discussed basics, such as disjunctive normal forms and Boolean functions and their evaluation. The reader might be surprised to find a discussion of such an elemental foundation of computer science in a book on combinatorial algorithms. The reason is that storage efficiency is especially important for these types of algorithms and understanding the basic storage unit of computer systems nowadays (as the decimal computer is a definite thing of the past) is of importance.

After covering the basics of Boolean algebra and Boolean functions in quite some detail, Knuth gets to the most fun part of this chapter in my opinion: the section on bitwise tricks and techniques on integer numbers. Being a software engineer in the video games industry, I recognized a lot of the techniques from my day-to-day work, such as bit packing of data and various bit shifting and bit masking tricks. There is also a discussion of some interesting rasterization-like algorithms, such as the shrinking of bitmaps using Levialdi's transformation or filling of regions bounded by simple curves. The chapter concludes with Binary Decision Diagrams that represent an important family of data structures for representing and manipulating Boolean functions. This topic was also quite interesting to me since I have never been exposed to it before.
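
Since bit masking tricks come up here, a small illustration may be useful. The following C sketch (my example, not from the book, which treats such techniques far more deeply) shows two classics: isolating the lowest set bit and Kernighan's population count:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t x = 0x0FF00A50u;

        /* Isolate the lowest set bit: x & (-x), written here as
         * x & (~x + 1) to stay in unsigned arithmetic. */
        uint32_t lowest = x & (~x + 1u);

        /* Kernighan's population count: x & (x - 1) clears the lowest
         * set bit, so the loop runs once per one-bit. */
        uint32_t v = x;
        int ones = 0;
        while (v) {
            v &= v - 1;
            ones++;
        }

        printf("x = %#010x  lowest set bit = %#010x  popcount = %d\n",
               (unsigned)x, (unsigned)lowest, ones);
        return 0;
    }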

The second and main chapter of the book is titled "Generating All Possibilities". In this particular volume of the The Art of Computer Programming series, the only subsection of the chapter in this volume is on generating basic combinatorial patterns, or more specifically generating all n-tuples, permutations, combinations, partitions, and trees. We can expect more on this topic from Knuth in his continuation in Volume 4B and beyond.

The discussion on n-tuples starts out with a lengthy focus on Gray codes, which are binary strings of n bits arranged in an order such that only one bit changes from string to string.
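
For readers who have not met Gray codes before, the standard binary reflected construction is a one-liner: g = n ^ (n >> 1). Here is a minimal C sketch (mine, not the book's) together with its inverse:

    #include <stdio.h>

    /* Binary reflected Gray code: g = n ^ (n >> 1). Successive values of
     * g differ in exactly one bit. The inverse accumulates a prefix XOR. */
    unsigned to_gray(unsigned n) { return n ^ (n >> 1); }

    unsigned from_gray(unsigned g)
    {
        unsigned n = 0;
        for (; g; g >>= 1)
            n ^= g;
        return n;
    }

    int main(void)
    {
        for (unsigned i = 0; i < 8; i++)
            printf("%u -> gray %u -> back %u\n",
                   i, to_gray(i), from_gray(to_gray(i)));
        return 0;
    }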

A quite fun example for generating all permutations presented in this part of the book is alphametics, also sometimes known as verbal arithmetic — a kind of puzzle where every letter of a word stands for a digit and words are used in equations. The goal is to assign digits to letters in such a way that the equation is correct. A classic example is SEND + MORE = MONEY (the solution is left as an exercise for the reader).
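
A brute-force solver for this puzzle is itself a nice exercise in generating permutations. The C sketch below (purely illustrative; the book's algorithms are far more refined) tries every assignment of distinct digits to the eight letters S, E, N, D, M, O, R, Y and prints whatever satisfies the equation:

    #include <stdio.h>

    /* Exhaustive search over assignments of distinct digits to the
     * letters S E N D M O R Y (leading digits must be nonzero). */
    static int d[8];        /* digits for S E N D M O R Y */
    static int used[10];

    static void try_letter(int i)
    {
        if (i == 8) {
            int S=d[0], E=d[1], N=d[2], D=d[3], M=d[4], O=d[5], R=d[6], Y=d[7];
            if (S == 0 || M == 0) return;
            long send  = S*1000L + E*100 + N*10 + D;
            long more  = M*1000L + O*100 + R*10 + E;
            long money = M*10000L + O*1000 + N*100 + E*10 + Y;
            if (send + more == money)
                printf("%ld + %ld = %ld\n", send, more, money);
            return;
        }
        for (int v = 0; v <= 9; v++)
            if (!used[v]) {
                used[v] = 1; d[i] = v;
                try_letter(i + 1);
                used[v] = 0;
            }
    }

    int main(void)
    {
        try_letter(0);
        return 0;
    }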

The next section deals with generating all combinations. Given a set of n elements, the number of all possible combinations of distinct subsets containing k elements is the well-known binomial coefficient, typically read as "n choose k". One of the more interesting algorithms in this section of the book to me was generating all feasible ways to fill a rucksack, which can come in quite handy when going camping.
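
Generating the combinations themselves is simple enough to sketch. Here is a minimal recursive C generator (my example, not the book's) that emits all "n choose k" index subsets in lexicographic order:

    #include <stdio.h>

    #define N 5   /* size of the ground set */
    #define K 3   /* elements per combination */

    /* comb[i] holds the index chosen at depth i; each level starts just
     * past the previous choice, so every subset is emitted exactly once. */
    static int comb[K];

    static void choose(int start, int depth)
    {
        if (depth == K) {
            for (int i = 0; i < K; i++)
                printf("%d ", comb[i]);
            putchar('\n');
            return;
        }
        for (int v = start; v < N; v++) {
            comb[depth] = v;
            choose(v + 1, depth + 1);
        }
    }

    int main(void)
    {
        choose(0, 0);   /* prints C(5,3) = 10 combinations */
        return 0;
    }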

After combinations, Knuth moves on to briefly discuss integer partitions. Integer partitions are ways to split positive integer numbers into sums of positive integers, disregarding order. So, for example 3, 2+1, and 1+1+1 are the three possible partitions of the integer 3. Knuth, in particular, focuses on generating all possible integer partitions and determining how many there are for a given number. The book continues with a concise presentation of the somewhat related topic of set partitions, which refer to ways of subdividing a set of elements into disjoint subsets. Mathematically, a set partition defines an equivalence relation and the disjoint subsets are called equivalence classes; concepts that should be familiar to any mathematics major. Again, the focus is on generating all possible set partitions and determining how many partitions can be generated.
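
A small recursive generator makes the definition concrete. This C sketch (mine, not the book's) prints every partition of n into nonincreasing parts, e.g. 3, 2+1 and 1+1+1 for n = 3:

    #include <stdio.h>

    /* Generate all partitions of n into nonincreasing parts. Each part
     * is at most `max`, which enforces the ordering and avoids
     * emitting the same partition twice. */
    static int parts[64];

    static void partitions(int n, int max, int k)
    {
        if (n == 0) {
            for (int i = 0; i < k; i++)
                printf("%d%s", parts[i], i + 1 < k ? "+" : "\n");
            return;
        }
        for (int p = (n < max ? n : max); p >= 1; p--) {
            parts[k] = p;
            partitions(n - p, p, k + 1);
        }
    }

    int main(void)
    {
        partitions(5, 5, 0);   /* the 7 partitions of 5 */
        return 0;
    }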

The main part of the book closes with a discussion of how to exhaustively generate all possible trees, which is a topic that I have never given much thought to. I am familiar with generating permutations, combinations, and partitions, but have never really been confronted with generating all possible trees that follow a certain pattern. One main example used throughout this part of the book is generating all possible strings of nested parentheses of a certain length. Such strings can be represented equivalently as binary trees.
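
The nested-parentheses example is easy to reproduce. This C sketch (my illustration, not the book's algorithm) generates all well-formed strings of N pairs by tracking how many parentheses are currently open:

    #include <stdio.h>

    #define N 3   /* number of parenthesis pairs */

    /* Generate all well-formed strings of N nested parenthesis pairs
     * (each corresponds to a binary tree with N internal nodes): a ')'
     * may be placed only when some '(' is still open. */
    static char buf[2 * N + 1];

    static void gen(int pos, int open, int closed)
    {
        if (pos == 2 * N) {
            buf[pos] = '\0';
            printf("%s\n", buf);
            return;
        }
        if (open < N)      { buf[pos] = '('; gen(pos + 1, open + 1, closed); }
        if (closed < open) { buf[pos] = ')'; gen(pos + 1, open, closed + 1); }
    }

    int main(void)
    {
        gen(0, 0, 0);   /* prints the 5 (Catalan number) strings for N = 3 */
        return 0;
    }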

Knuth's latest book is comprehensive and almost all encompassing in its scope. It compiles an incredible amount of computer science knowledge on combinatorial searching from past decades into a single volume. As such, it is an important addition to any computer science library. This book is not necessarily an easy read and requires a dedicated reader with the intention of working through it from front to back and a considerable amount of time to fully digest. However, for those with patience, this book contains a lot of interesting puzzles, brain teasers, and almost everything there is to know on generating combinatorial patterns.

On a final note, if you don't have volumes 1-3 yet, you can get all volumes in a convenient box set.

Martin Ecker has been involved in real-time graphics programming for more than 10 years and works as a professional video game developer for High Moon Studios http://www.highmoonstudios.com/ in sunny California.

[Apr 25, 2008] Interview with Donald Knuth, by Donald E. Knuth and Andrew Binstock

Apr 25, 2008 | informit.com

Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.

Andrew Binstock: You are one of the fathers of the open-source revolution, even if you aren’t widely heralded as such. You previously have stated that you released TeX as open source because of the problem of proprietary implementations at the time, and to invite corrections to the code—both of which are key drivers for open-source projects today. Have you been surprised by the success of open source since that time?

Donald Knuth: The success of open source code is perhaps the only thing in the computer field that hasn’t surprised me during the past several decades. But it still hasn’t reached its full potential; I believe that open-source programs will begin to be completely dominant as the economy moves more and more from products towards services, and as more and more volunteers arise to improve the code.

For example, open-source code can produce thousands of binaries, tuned perfectly to the configurations of individual users, whereas commercial software usually will exist in only a few versions. A generic binary executable file must include things like inefficient "sync" instructions that are totally inappropriate for many installations; such wastage goes away when the source code is highly configurable. This should be a huge win for open source.

Yet I think that a few programs, such as Adobe Photoshop, will always be superior to competitors like the Gimp—for some reason, I really don’t know why! I’m quite willing to pay good money for really good software, if I believe that it has been produced by the best programmers.

Remember, though, that my opinion on economic questions is highly suspect, since I’m just an educator and scientist. I understand almost nothing about the marketplace.

Andrew: A story states that you once entered a programming contest at Stanford (I believe) and you submitted the winning entry, which worked correctly after a single compilation. Is this story true? In that vein, today’s developers frequently build programs writing small code increments followed by immediate compilation and the creation and running of unit tests. What are your thoughts on this approach to software development?

Donald: The story you heard is typical of legends that are based on only a small kernel of truth. Here’s what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth’s ALGOL W system (the predecessor of Pascal). My program didn’t work the first time, but fortunately I could use Ed Satterthwaite’s excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn’t get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn’t a fair contest.

As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

Andrew: One of the emerging problems for developers, especially client-side developers, is changing their thinking to write programs in terms of threads. This concern, driven by the advent of inexpensive multicore PCs, surely will require that many algorithms be recast for multithreading, or at least to be thread-safe. So far, much of the work you’ve published for Volume 4 of The Art of Computer Programming (TAOCP) doesn’t seem to touch on this dimension. Do you expect to enter into problems of concurrency and parallel programming in upcoming work, especially since it would seem to be a natural fit with the combinatorial topics you’re currently working on?

Donald: The field of combinatorial algorithms is so vast that I’ll be lucky to pack its sequential aspects into three or four physical volumes, and I don’t think the sequential methods are ever going to be unimportant. Conversely, the half-life of parallel techniques is very short, because hardware changes rapidly and each new machine needs a somewhat different approach. So I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.

Andrew: Vendors of multicore processors have expressed frustration at the difficulty of moving developers to this model. As a former professor, what thoughts do you have on this transition and how to make it happen? Is it a question of proper tools, such as better native support for concurrency in languages, or of execution frameworks? Or are there other solutions?

Donald: I don’t want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Itanium" approach that was supposed to be so terrific, until it turned out that the wished-for compilers were basically impossible to write.

Let me put it this way: During the past 50 years, I’ve written well over a thousand programs, many of which have substantial size. I can’t think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.[1]

How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I’m wrong.

I know that important applications for parallelism exist—rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.

Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)

The machine I use today has dual processors. I get to use them both only when I’m running two independent jobs at the same time; that’s nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldn’t be any better off, considering the kind of work I do—even though I’m using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think it’s a pipe dream. (No—that’s the wrong metaphor! "Pipelines" actually work for me, but threads don’t. Maybe the word I want is "bubble.")

From the opposite point of view, I do grant that web browsing probably will get better with multicores. I’ve been talking about my technical work, however, not recreation.

I also admit that I haven’t got many bright ideas about what I wish hardware designers would provide instead of multicores, now that they’ve begun to hit a wall with respect to sequential computation. (But my MMIX design contains several ideas that would substantially improve the current performance of the kinds of programs that concern me most — at the cost of incompatibility with legacy x86 programs.)

Andrew: One of the few projects of yours that hasn’t been embraced by a widespread community is literate programming. What are your thoughts about why literate programming didn’t catch on? And is there anything you’d have done differently in retrospect regarding literate programming?

Donald: Literate programming is a very personal thing. I think it’s terrific, but that might well be because I’m a very strange person. It has tens of thousands of fans, but not millions.

In my experience, software created with literate programming has turned out to be significantly better than software developed in more traditional ways. Yet ordinary software is usually okay—I’d give it a grade of C (or maybe C++), but not F; hence, the traditional methods stay with us. Since they’re understood by a vast community of programmers, most people have no big incentive to change, just as I’m not motivated to learn Esperanto even though it might be preferable to English and German and French and Russian (if everybody switched).

Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasn’t taken the whole world by storm. He observed that a small percentage of the world’s population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.

Yet to me, literate programming is certainly the most important thing that came out of the TeX project. Not only has it enabled me to write and maintain programs faster and more reliably than ever before, and been one of my greatest sources of joy since the 1980s—it has actually been indispensable at times. Some of my major programs, such as the MMIX meta-simulator, could not have been written with any other methodology that I’ve ever heard of. The complexity was simply too daunting for my limited brain to handle; without literate programming, the whole enterprise would have flopped miserably.

If people do discover nice ways to use the newfangled multithreaded machines, I would expect the discovery to come from people who routinely use literate programming. Literate programming is what you need to rise above the ordinary level of achievement. But I don’t believe in forcing ideas on anybody. If literate programming isn’t your style, please forget it and do what you like. If nobody likes it but me, let it die.

On a positive note, I’ve been pleased to discover that the conventions of CWEB are already standard equipment within preinstalled software such as Makefiles, when I get off-the-shelf Linux these days.
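For readers who have never seen literate programming, the following toy fragment (our sketch, not taken from Knuth's sources) shows the flavor of CWEB. Prose and code interleave in one file, say toy.w; named sections let the program be presented in whatever order best serves the explanation; ctangle toy.w extracts compilable C, and cweave toy.w typesets the documentation.

    @* A toy greeting program. The prose carries the design narrative,
    while named sections hold the code in reading order.

    @c
    #include <stdio.h>
    @<Function definitions@>@;
    int main(void)
    {
      @<Print the greeting@>@;
      return 0;
    }

    @ The greeting is kept in its own section so it can be discussed,
    and later varied, independently of |main|.

    @<Print the greeting@>=
    greet("world");

    @ @<Function definitions@>=
    void greet(const char *who)
    {
      printf("Hello, %s!\n", who);
    }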

Andrew: In Fascicle 1 of Volume 1, you reintroduced the MMIX computer, which is the 64-bit upgrade to the venerable MIX machine comp-sci students have come to know over many years. You previously described MMIX in great detail in MMIXware. I’ve read portions of both books, but can’t tell whether the Fascicle updates or changes anything that appeared in MMIXware, or whether it’s a pure synopsis. Could you clarify?

Donald: Volume 1 Fascicle 1 is a programmer’s introduction, which includes instructive exercises and such things. The MMIXware book is a detailed reference manual, somewhat terse and dry, plus a bunch of literate programs that describe prototype software for people to build upon. Both books define the same computer (once the errata to MMIXware are incorporated from my website). For most readers of TAOCP, the first fascicle contains everything about MMIX that they’ll ever need or want to know.

I should point out, however, that MMIX isn’t a single machine; it’s an architecture with almost unlimited varieties of implementations, depending on different choices of functional units, different pipeline configurations, different approaches to multiple-instruction-issue, different ways to do branch prediction, different cache sizes, different strategies for cache replacement, different bus speeds, etc. Some instructions and/or registers can be emulated with software on "cheaper" versions of the hardware. And so on. It’s a test bed, all simulatable with my meta-simulator, even though advanced versions would be impossible to build effectively until another five years go by (and then we could ask for even further advances just by advancing the meta-simulator specs another notch).

Suppose you want to know if five separate multiplier units and/or three-way instruction issuing would speed up a given MMIX program. Or maybe the instruction and/or data cache could be made larger or smaller or more associative. Just fire up the meta-simulator and see what happens.

Andrew: As I suspect you don’t use unit testing with MMIXAL, could you step me through how you go about making sure that your code works correctly under a wide variety of conditions and inputs? If you have a specific work routine around verification, could you describe it?

Donald: Most examples of machine language code in TAOCP appear in Volumes 1-3; by the time we get to Volume 4, such low-level detail is largely unnecessary and we can work safely at a higher level of abstraction. Thus, I’ve needed to write only a dozen or so MMIX programs while preparing the opening parts of Volume 4, and they’re all pretty much toy programs—nothing substantial. For little things like that, I just use informal verification methods, based on the theory that I’ve written up for the book, together with the MMIXAL assembler and MMIX simulator that are readily available on the Net (and described in full detail in the MMIXware book).

That simulator includes debugging features like the ones I found so useful in Ed Satterthwaite’s system for ALGOL W, mentioned earlier. I always feel quite confident after checking a program with those tools.
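Knuth's routine here is theory plus simulator-assisted inspection, not automated testing; still, for readers who want one mechanical safety net, a common complement (our suggestion, not his practice) is to compare the routine under test against an obviously correct reference over all small inputs:

    #include <assert.h>
    #include <stdio.h>

    /* Routine under test: branchless maximum (illustrative). */
    static int fast_max(int a, int b) { return a ^ ((a ^ b) & -(a < b)); }

    /* Obviously correct reference implementation. */
    static int slow_max(int a, int b) { return a > b ? a : b; }

    int main(void)
    {
        for (int a = -50; a <= 50; a++)
            for (int b = -50; b <= 50; b++)
                assert(fast_max(a, b) == slow_max(a, b));
        puts("all small cases agree");
        return 0;
    }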

Andrew: Despite its formulation many years ago, TeX is still thriving, primarily as the foundation for LaTeX. While TeX has been effectively frozen at your request, are there features that you would want to change or add to it, if you had the time and bandwidth? If so, what are the major items you would add or change?

Donald: I believe changes to TeX would cause much more harm than good. Other people who want other features are creating their own systems, and I’ve always encouraged further development—except that nobody should give their program the same name as mine. I want to take permanent responsibility for TeX and Metafont, and for all the nitty-gritty things that affect existing documents that rely on my work, such as the precise dimensions of characters in the Computer Modern fonts.

Andrew: One of the little-discussed aspects of software development is how to do design work on software in a completely new domain. You were faced with this issue when you undertook TeX: No prior art was available to you as source code, and it was a domain in which you weren’t an expert. How did you approach the design, and how long did it take before you were comfortable entering into the coding portion?

Donald: That’s another good question! I’ve discussed the answer in great detail in Chapter 10 of my book Literate Programming, together with Chapters 1 and 2 of my book Digital Typography. I think that anybody who is really interested in this topic will enjoy reading those chapters. (See also Digital Typography Chapters 24 and 25 for the complete first and second drafts of my initial design of TeX in 1977.)

Andrew: The books on TeX and the program itself show a clear concern for limiting memory usage—an important problem for systems of that era. Today, the concern for memory usage in programs has more to do with cache sizes. As someone who has designed a processor in software, the issues of cache-aware and cache-oblivious algorithms surely must have crossed your radar screen. Is the role of processor caches on algorithm design something that you expect to cover, even if indirectly, in your upcoming work?

Donald: I mentioned earlier that MMIX provides a test bed for many varieties of cache. And it’s a software-implemented machine, so we can perform experiments that will be repeatable even a hundred years from now. Certainly the next editions of Volumes 1-3 will discuss the behavior of various basic algorithms with respect to different cache parameters.

In Volume 4 so far, I count about a dozen references to cache memory and cache-friendly approaches (not to mention a "memo cache," which is a different but related idea in software).
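A standard textbook illustration of why cache behavior belongs in algorithm analysis (our example; Volume 4's actual treatment is only referenced above, not reproduced): the two loops below do identical arithmetic on an N-by-N array, but the first visits memory in the row-major order in which C stores it, so every fetched cache line is fully used, while the second strides N doubles between accesses and typically runs several times slower on real hardware.

    #define N 1024
    static double a[N][N];

    /* Cache-friendly: consecutive iterations touch adjacent addresses. */
    double sum_by_rows(void)
    {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    /* Cache-hostile: each access jumps N*sizeof(double) bytes, so most
       touches land on a different cache line. */
    double sum_by_columns(void)
    {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }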

Andrew: What set of tools do you use today for writing TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what do you use for the coding?

Donald: My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine, using the conventions of TeX. I use tex, dvips, and gv to see the results, which appear on my screen almost instantaneously these days. I check my math with Mathematica.

I program every algorithm that’s discussed (so that I can thoroughly understand it) using CWEB, which works splendidly with the GDB debugger. I make the illustrations with MetaPost (or, in rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some homemade tools, like my own spell-checker for TeX and CWEB within Emacs. I designed my own bitmap font for use with Emacs, because I hate the way the ASCII apostrophe and the left open quote have morphed into independent symbols that no longer match each other visually. I have special Emacs modes to help me classify all the tens of thousands of papers and notes in my files, and special Emacs keyboard shortcuts that make bookwriting a little bit like playing an organ. I prefer rxvt to xterm for terminal input. Since last December, I’ve been using a file backup system called backupfs, which beautifully meets my need to archive the daily state of every file.

According to the current directories on my machine, I’ve written 68 different CWEB programs so far this year. There were about 100 in 2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has an extremely convenient "change file" mechanism, with which I can rapidly create multiple versions and variations on a theme; so far in 2008 I’ve made 73 variations on those 68 themes. (Some of the variations are quite short, only a few bytes; others are 5KB or more. Some of the CWEB programs are quite substantial, like the 55-page BDD package that I completed in January.) Thus, you can see how important literate programming is in my life.
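The change-file mechanism deserves a concrete glimpse (a minimal sketch reusing the hypothetical toy.w above). A change file is a sequence of chunks: @x introduces lines that must match the web file verbatim, @y supplies the replacement, and @z closes the chunk.

    @x
    greet("world");
    @y
    greet("TAOCP readers");
    @z

Running ctangle toy.w variant.ch variant.c then tangles the variant while toy.w itself stays untouched, which is how one web source can carry dozens of variations.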

I currently use Ubuntu Linux, on a standalone laptop—it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux. Incidentally, with Linux I much prefer the keyboard focus that I can get with classic FVWM to the GNOME and KDE environments that other people seem to like better. To each his own.

Andrew: You state in the preface of Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely will comprise three volumes and possibly more. It’s clear from the text that you’re really enjoying writing on this topic. Given that, what is your confidence in the note posted on the TAOCP website that Volume 5 will see the light of day by 2015?

Donald: If you check the Wayback Machine for previous incarnations of that web page, you will see that the number 2015 has not been constant.

You’re certainly correct that I’m having a ball writing up this material, because I keep running into fascinating facts that simply can’t be left out—even though more than half of my notes don’t make the final cut.

Precise time estimates are impossible, because I can’t tell until getting deep into each section how much of the stuff in my files is going to be really fundamental and how much of it is going to be irrelevant to my book or too advanced. A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.

From a scheduling standpoint, all I know at present is that I must someday digest a huge amount of material that I’ve been collecting and filing for 45 years. I gain important time by working in batch mode: I don’t read a paper in depth until I can deal with dozens of others on the same topic during the same week. When I finally am ready to read what has been collected about a topic, I might find out that I can zoom ahead because most of it is eminently forgettable for my purposes. On the other hand, I might discover that it’s fundamental and deserves weeks of study; then I’d have to edit my website and push that number 2015 closer to infinity.

Andrew: In late 2006, you were diagnosed with prostate cancer. How is your health today?

Donald: Naturally, the cancer will be a serious concern. I have superb doctors. At the moment I feel as healthy as ever, modulo being 70 years old. Words flow freely as I write TAOCP and as I write the literate programs that precede drafts of TAOCP. I wake up in the morning with ideas that please me, and some of those ideas actually please me also later in the day when I’ve entered them into my computer.

On the other hand, I willingly put myself in God’s hands with respect to how much more I’ll be able to do before cancer or heart disease or senility or whatever strikes. If I should unexpectedly die tomorrow, I’ll have no reason to complain, because my life has been incredibly blessed. Conversely, as long as I’m able to write about computer science, I intend to do my best to organize and expound upon the tens of thousands of technical papers that I’ve collected and made notes on since 1962.

Andrew: On your website, you mention that the Peoples Archive recently made a series of videos in which you reflect on your past life. In segment 93, "Advice to Young People," you advise that people shouldn’t do something simply because it’s trendy. As we know all too well, software development is as subject to fads as any other discipline. Can you give some examples that are currently in vogue, which developers shouldn’t adopt simply because they’re currently popular or because that’s the way they’re currently done? Would you care to identify important examples of this outside of software development?

Donald: Hmm. That question is almost contradictory, because I’m basically advising young people to listen to themselves rather than to others, and I’m one of the others. Almost every biography of every person whom you would like to emulate will say that he or she did many things against the "conventional wisdom" of the day.

Still, I hate to duck your questions even though I also hate to offend other people’s sensibilities—given that software methodology has always been akin to religion. With the caveat that there’s no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I’ve ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other’s code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.

I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.

Here’s a question that you may well have meant to ask: Why is the new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1? The answer is that computer programmers will understand that I wasn’t ready to begin writing Volume 4 of TAOCP at its true beginning point, because we know that the initialization of a program can’t be written until the program itself takes shape. So I started in 2005 with Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of Star Wars, which began with Episode 4.)

Recommended Links


Softpanorama Recommended

The Art of Computer Programming - Wikipedia, the free encyclopedia

Don Knuth's Home Page

Amazon Reader Reviews of all three volumes

5 of 5 stars Excellent, for certain people!, April 4, 2000
Reviewer: A reader from Uppsala, Sweden
These books are indisputably classics of the field, and like all classics they have religious adherents and equally firm detractors. The key difference between the two groups is that the adherents are interested in computer SCIENCE, whereas the rest are more taken with computer programming. The books are well written, quite mathematical, and abstract. The books deal with the core subjects of computer science and shy away from the trendy, and so some people tend to see them as anachronistic. Nevertheless, they are deservedly core references in computer science, and a joy for any patient, theoretically minded reader. There are three points I believe should be made.
 
  1. A lot of the detractors of the books are saying correct things: the books don't deal with hot topics, they present things in greater detail than is necessary in day-to-day programming, and they demand a lot of the reader. What the detractors fail to recognize is that all of this is intentional, and that there is nothing wrong with it. The books are targeted at those with a genuine interest in theoretical computer science.
     
  2. Many reviewers complain about Knuth's typesetting system, TeX. What they fail to recognize is that TeX is incredibly useful, and about as user friendly as could be expected, for the task for which it was designed: typesetting professional quality mathematics. Anyone who challenges this statement would have to contend with virtually the entire community of people who write papers using higher mathematics, including virtually all professional physicists, mathematicians, and computer scientists.
     
  3. Some people accuse Knuth's books of being poorly written. These people are ignorant: either they have not read the works, or they would not recognize skillful writing if they saw it. These books are splendid examples of scientific writing, and are justifiably acclaimed as such. In short, Knuth's books have ensured that the word "science" deserves its place in the phrase "computer science."
5 of 5 stars Definitive, June 15, 1999
Reviewer: A reader from Illinois
As Knuth himself says, it is impossible for any one person to keep up with all the research in computer science, but these 3 volumes do a remarkably good job of distilling the most important results and explaining them with mathematical rigor.

Each volume contains 2 chapters.

  • Ch. 1, Basic Concepts: mathematical foundations and a description of MIX, a hypothetical machine (now available in software emulations).
  • Ch. 2, Information Structures: lists, trees, memory allocation, garbage collection.
  • Ch. 3, Random Numbers: how to produce series of "random" numbers and test their statistical properties.
  • Ch. 4, Arithmetic: algorithms for integer and floating-point arithmetic.
  • Ch. 5, Sorting: both in memory and on disks or tapes.
  • Ch. 6, Searching: sequential, binary, hashing.

Despite the detailed coverage of the topics, which often involves esoteric mathematical notation, the author's lively style makes the algorithms and the main theoretical results relatively easy to grasp. If all you care about is getting a program to run, buy another book; but if you really want to understand how and why software works, there's nothing quite like this.

5 of 5 stars Full of little gems, March 8, 2001
Reviewer: fife (see more about me)

Knuth is obviously in the education business. This is a book written for learning from. It's very easy to ignore the parts that are too detailed for your needs and not feel like you've missed something. My favorite parts are his historical notes. These are the reward for ploughing through a section, some of them quite fascinating.

I'm a compiler designer. Compilers, like most other big applications, are built on stacks, queues, lists, trees, etc. These books will teach you how to implement these structures solidly and efficiently. A lot of my time at work involves reading research papers on optimizations. I need to understand how algorithms are analyzed and how to compare two algorithms. These books give the mathematical tools needed to perform that job. Some criticize his use of a machine language for examples. I personally think that this is a good thing: seeing something done in assembly shows you how easy it really is, whereas high-level languages, with all their abstractions, sometimes make things look more complex than they are. I have not yet finished the second volume, and it has been a long time already. Nevertheless, I couldn't resist buying the third volume. I just hope to live long enough to get to the end of the fifth and last volume of this collection. Thank you, Donald Knuth, for this brilliant and inspiring work.


