Software

Writing computer software is one of the purest creative activities in the history of the human race. Programmers aren’t bound by practical limitations such as the laws of physics; we can create exciting virtual worlds with behaviors that could never exist in the real world. Programming doesn’t require great physical skill or coordination, like ballet or basketball. All programming requires is a creative mind and the ability to organize your thoughts. If you can visualize a system, you can probably implement it in a computer program.

This means that the greatest limitation in writing software is our ability to understand the systems we are creating. As a program evolves and acquires more features, it becomes complicated, with subtle dependencies between its components. Over time, complexity accumulates, and it becomes harder and harder for programmers to keep all of the relevant factors in their minds as they modify the system. This slows down development and leads to bugs, which slow development even more and add to its cost. Complexity increases inevitably over the life of any program. The larger the program, and the more people that work on it, the more difficult it is to manage complexity.

There are two general approaches to fighting complexity. The first is by making code simpler and more obvious. The second approach is to encapsulate it, so that programmers can work on a system without being exposed to all of its complexity at once.

Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.

The problem with tactical programming is that it is short-sighted. If you’re programming tactically, you’re trying to finish a task as quickly as possible. Perhaps you have a hard deadline. As a result, planning for the future isn’t a priority. You don’t spend much time looking for the best design; you just want to get something working soon. You tell yourself that it’s OK to add a bit of complexity or introduce a small kludge or two, if that allows the current task to be completed more quickly.

This is how systems become complicated. Complexity is incremental. It’s not one particular thing that makes a system complicated, but the accumulation of dozens or hundreds of small things. If you program tactically, each programming task will contribute a few of these complexities. Each of them seems like a reasonable compromise in order to finish the current task quickly. However, the complexities accumulate rapidly, especially if everyone is programming tactically.

Documentation can reduce cognitive load by providing developers with the information they need to make changes and by making it easy for developers to ignore information that is irrelevant. Without adequate documentation, developers may have to read large amounts of code to reconstruct what was in the designer’s mind. Documentation can also reduce the unknown unknowns by clarifying the structure of the system, so that it is clear what information and code is relevant for any given change.

A first step towards writing good comments is to use different words in the comment from those in the name of the entity being described. Pick words for the comment that provide additional information about the meaning of the entity, rather than just repeating its name.
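As a small illustration of this advice, here is a hypothetical sketch in Python. The class and field names (`TextRange`, `start_offset`, `end_offset`) are invented for the example and do not come from the text. It contrasts a comment that merely repeats the name with comments that add information the name cannot carry.

```python
from dataclasses import dataclass


@dataclass
class TextRange:
    # Weak comment, shown for contrast: "The start offset."
    # It repeats the field name and tells the reader nothing new.
    #
    # Better: position of the first character in the range, counted in
    # characters from the beginning of the document (0-based, inclusive).
    start_offset: int

    # Position just past the last character in the range (exclusive), so an
    # empty range has end_offset == start_offset.
    end_offset: int

    def length(self) -> int:
        """Number of characters covered by the range."""
        return self.end_offset - self.start_offset
```

The better comments answer questions a reader would otherwise have to settle by reading code: the unit of measurement, the origin, and whether each bound is inclusive.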

The reward for being a good designer is that you get to spend a larger fraction of your time in the design phase, which is fun. Poor designers spend most of their time chasing bugs in complicated and brittle code. If you improve your design skills, not only will you produce higher quality software more quickly, but the software development process will be more enjoyable.

The concept of abstract data type is that an object’s type should be defined by a name, a set of proper values, and a set of proper operations, rather than its storage structure, which should be hidden.
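A minimal sketch of this idea in Python follows, assuming a stack as the example type (the choice of a stack and the method names are illustrative, not taken from the text). The type is defined by its name and its proper operations; the storage structure is a private detail that could be swapped without affecting callers.

```python
class Stack:
    """Abstract data type: a last-in, first-out collection of items."""

    def __init__(self) -> None:
        # Storage structure: a Python list, kept private by convention.
        # Callers use only the operations below, so this could become a
        # linked list or any other structure without changing their code.
        self._items = []

    def push(self, item) -> None:
        """Place item on top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the most recently pushed item."""
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self) -> bool:
        """Return True when the stack holds no items."""
        return not self._items
```

A caller writes `s = Stack(); s.push(1); s.pop()` and never needs to know how the items are stored.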

Many of the classical problems of developing software products derive from this essential complexity and its nonlinear increases with size. From the complexity comes the difficulty of communication among team members, which leads to product flaws, cost overruns, schedule delays. From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability. From the complexity of the functions comes the difficulty of invoking those functions, which makes programs hard to use. From complexity of structure comes the difficulty of extending programs to new functions without creating side effects. From complexity of structure comes the unvisualized states that constitute security trapdoors.

To use a program. Every user needs a prose description of the program. Most documentation fails in giving too little overview. The trees are described, the bark and leaves are commented, but there is no map of the forest. To write a useful prose description, stand way back and come in slowly:

Purpose. What is the main function, the reason for the program?
Environment. On what machines, hardware configurations, and operating system configurations will it run?
Domain and range. What domain of input is valid? What range of output can legitimately appear?
Functions realized and algorithms used. Precisely what does it do?
Input-output formats, precise and complete.
Operating instructions, including normal and abnormal ending behavior, as seen at the console and on the outputs.
Options. What choices does the user have about functions? Exactly how are those choices specified?
Running time. How long does it take to do a problem of specified size on a specified configuration?
Accuracy and checking. How precise are the answers expected to be? What means of checking accuracy are incorporated?

For picking the milestones there is only one relevant rule. Milestones must be concrete, specific, measurable events, defined with knife-edge sharpness. Coding, for a counterexample, is “90 percent finished” for half of the total coding time. Debugging is “99 percent complete” most of the time. “Planning complete” is an event one can proclaim almost at will.

Concrete milestones, on the other hand, are 100-percent events. “Specifications signed by architects and implementers,” “source coding 100 percent complete, keypunched, entered into disk library,” “debugged version passes all test cases.” These concrete milestones demark the vague phases of planning, coding, debugging.

It is more important that milestones be sharp-edged and unambiguous than that they be easily verifiable by the boss. Rarely will a man lie about milestone progress, if the milestone is so sharp that he can’t deceive himself.

Bug-proofing the definition. The most pernicious and subtle bugs are system bugs arising from mismatched assumptions made by the authors of various components. In short, conceptual integrity of the product not only makes it easier to use, it also makes it easier to build and less subject to bugs.

So does the detailed, painstaking architectural effort implied by that approach. The crucial task is to get the product defined. Many, many failures concern exactly those aspects that were never quite specified. Careful function definition, careful specification, and the disciplined exorcism of frills of function and flights of technique all reduce the number of system bugs that have to be found.

Testing the specification. Long before any code exists, the specification must be handed to an outside testing group to be scrutinized for completeness and clarity. The developers themselves cannot do this. They won’t tell you they don’t understand it; they will happily invent their way through the gaps and obscurities.

You’re wasting money by having $100/hour programmers do work that can be done by $30/hour testers.

The most important function of a spec is to design the program. Even if you are working on code all by yourself, and you write a spec solely for your own benefit, the act of writing the spec, describing how the program works in minute detail, will force you to actually design the program.

The moral of the story is that when you design your product in a human language, it takes only a few minutes to try thinking about several possibilities, revising, and improving your design. Nobody feels bad just deleting a paragraph in a word processor. A programmer who’s just spent two weeks writing some code is going to be quite attached to that code, no matter how wrong it is.

A functional specification describes how a product will work entirely from the user’s perspective. It doesn’t care how the thing is implemented. It talks about features. It specifies screens, menus, dialogs, and so on.

A technical specification describes the internal implementation of the program. It talks about data structures, relational database models, choice of programming languages and tools, algorithms, etc.

When you design a product, inside and out, the most important thing is to nail down the user experience. What are the screens, how do they work, what do they do.

Details are the most important thing in a functional spec. You’ll notice in the sample spec how I go into outrageous detail talking about all the error cases for the login page. All of these cases correspond to real code that’s going to be written, but, more importantly, these cases correspond to decisions that somebody is going to have to make. Somebody has to decide what the policy is going to be for a forgotten password. If you don’t decide, you can’t write the code. The spec needs to document the decision.

All new source code! As if source code rusted.

The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?

You know how an iceberg is 90 percent underwater? Well, most software is like that too — there’s a pretty user interface that takes about 10 percent of the work, and then 90 percent of the programming work is under the covers. And if you take into account the fact that about half of your time is spent fixing bugs, the UI takes only 5 percent of the work. And if you limit yourself to the visual part of the UI, the pixels, what you would see in PowerPoint, now we’re talking less than 1 percent.

That is, approximately, the magic of TCP. It is what computer scientists like to call an abstraction: a simplification of something much more complicated that is going on under the covers. As it turns out, a lot of computer programming consists of building abstractions. What is a string library? It’s a way to pretend that computers can manipulate strings just as easily as they can manipulate numbers. What is a file system? It’s a way to pretend that a hard drive isn’t really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes.

The law of leaky abstractions means that whenever somebody comes up with a wizzy new code-generation tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying “learn how to do it manually first, then use the wizzy tool to save time.” Code-generation tools that pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So, the abstractions save us time working, but they don’t save us time learning.

And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.
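One small, well-known leak can be seen in floating-point arithmetic. The sketch below is not from the text; it is an illustrative Python example chosen because it is easy to reproduce. Floating-point numbers present the abstraction of real numbers, yet a programmer has to know about the binary representation underneath to use them correctly.

```python
import math

# Floats offer the abstraction "real numbers", but they are stored as
# binary fractions, and 0.1, 0.2, and 0.3 have no exact binary form.
total = 0.1 + 0.2

# The abstraction leaks: an exact comparison with 0.3 fails.
leaks = total != 0.3

# Knowing what lies underneath tells us to compare with a tolerance instead.
close_enough = math.isclose(total, 0.3)
```

The float type still saves time in everyday work, but only someone who has learned what it abstracts knows to reach for `math.isclose`.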

Sometimes you download software and you just can’t believe how bad it is, or how hard it is to accomplish the very simple tasks that the software tries to accomplish. Chances are, it’s because the developers of the software don’t use it.

Three minutes of design work saved me hours of coding.

If you’ve spent more than 20 minutes of your life writing code, you’ve probably discovered a good rule of thumb by now: nothing is as simple as it seems.

An important thing you notice from all these examples is that it’s easy for software to commoditize hardware, but it’s incredibly hard for hardware to commoditize software. Software is not interchangeable, as the StarOffice marketing team is learning. Even when the price is zero, the cost of switching from Microsoft Office is non-zero. Even the smallest differences can make two software packages a pain to switch between.

The art of programming is the art of organizing complexity, of mastering multitude and avoiding its bastard chaos as effectively as possible.

The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague.

Some people found error messages they couldn’t ignore more annoying than wrong results, and, when judging the relative merits of programming languages, some still seem to equate “the ease of programming” with the ease of making undetected mistakes.

If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with.

The purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise.

Besides a mathematical inclination, an exceptionally good mastery of one’s native tongue is the most vital asset of a competent programmer.

Simplicity is prerequisite for reliability.

Elegance is not a dispensable luxury but a quality that decides between success and failure.