One of the reasons I have been a strong proponent of teaching programming in middle school and high school is the discipline it imposes on organizing one's thoughts for problem solving. One has to analyze the problem, break it into manageable parts, and figure out what has to happen for the program to solve it; then work through the task of writing the program, drawing on preexisting libraries where applicable, compiling and running the program, and verifying that it produces the desired result and nothing else.
I am reminded of the remark that “neatness is its own reward” because you can find things you put away when you need them. Good design is its own reward because it facilitates understanding of complexity and the ability to evolve and revise the design to achieve new objectives.
Computer science is no more about computers than astronomy is about telescopes.
To keep up with the flux, the name of the field changed several times. In the 1940s it was called automatic computation, and in the 1950s, information processing. In the 1960s, as it moved into academia, it acquired the name computer science in the US and informatics in Europe. By the 1980s the computing field comprised a complex of related fields including computer science, informatics, computational science, computer engineering, software engineering, information systems, and information technology. By 1990 the term “computing” had become the standard for referring to this core group.
What is so special about computing’s approach to information? Information traditionally means facts that add to knowledge when communicated. It is an old concept, studied for centuries in philosophy, mathematics, business, the humanities, and the sciences. Science is concerned with discovering facts, fitting them together into models, using the models to make predictions, and turning validated predictive models into technologies. Scientists record all they have learned into the structure called the “scientific body of knowledge” for future use. Information has always played a strong role in the sciences.
Computing differs in two ways from the other sciences in its approach to information. First, computing emphasizes the transformation of information, not simply its discovery, classification, storage, and communication. Algorithms not only read information structures, they modify them. Moreover, humans constantly modify information structures — such as in the web — with transformations for which we yet have no computational models. Purely analytic methods cannot help us understand the dynamics of these information structures. The experimental methods of science are needed to make progress.
The second difference is that the structures of computing are not just descriptive, they are generative. An algorithm is not just a description of a method for solving a problem; when executed, it causes a machine to solve the problem.
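To make the point concrete, here is a minimal sketch in Python: the same method that could be stated in prose becomes, as a program, something a machine can actually run.

```python
# Euclid's method, stated as prose, only describes how to find a greatest common
# divisor; stated as a program, it causes the machine to produce the answer.
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(252, 105))   # 21
```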
Turing realized that “intelligence” is so ill defined that he could make no progress with the question of when a machine might be intelligent. His imitation game (the Turing test) asks not whether a machine has intelligence but instead whether it behaves intelligently.
The ARPANET designers soon realized that shared services would be sought by name rather than location and that location-independent addressing would be the only way to hide the many addressing conventions of local networks containing the services.
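A location-independent naming scheme can be pictured, very roughly, as a directory that maps service names to whatever addresses currently hold them; the names and addresses below are purely illustrative.

```python
# Illustrative directory: programs ask for a service by name, not by location.
directory = {
    "file-server": "10.0.4.17",
    "mail-server": "10.0.9.3",
}

def resolve(service_name):
    """Return the current network address of a named service."""
    return directory[service_name]

# If the file server moves to another host, only the directory entry changes;
# every program that asks for "file-server" keeps working.
print(resolve("file-server"))
```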
What was once big is now small. The term “big data” is a recent moniker for an old problem, which now affects many communities.
As data stores grew, designers focused on methods to organize the data for fast access and easy maintenance. The two chief competing methods were the Integrated Data Store (IDS) and the Relational Database System (RDS). The IDS was simple, fast, and pragmatic in its approach to managing large sets of files while hiding the file structure and location of the disks. The RDS was based on the mathematical theory of sets; it had a clean conceptual model but took many years to perfect and achieve the kinds of efficiency seen in IDS.
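The set-theoretic flavor of the relational model can be suggested with a tiny Python sketch; the table and query below are invented for illustration.

```python
# Relational view: a table is a set of tuples, and a query is a set operation.
employees = {
    ("alice", "accounting", 2015),
    ("bob",   "engineering", 2018),
    ("carol", "engineering", 2021),
}

# Roughly "SELECT name FROM employees WHERE dept = 'engineering'"
engineers = {name for (name, dept, year) in employees if dept == "engineering"}
print(engineers)   # {'bob', 'carol'}
```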
A key tenet of his theory is that information and meaning are distinct, making it possible for machines to process information without regard to its meaning. And yet the whole purpose of communication and computing is to convey and produce meaningful results. How can this be?
Software designers, scientists, and consumers look to software to generate virtual worlds, social networks, music, new discoveries, financial projections, love letters, inspiring images, and much more. But the meaning of the information output by these systems seems to depend on the observer. For example, a tabulation of stock prices may look like numerical gibberish to a financial amateur but like a source of riches to a professional investor.
Note that in communication systems, coding is not the same as encryption. Encryption is an additional step that converts messages from the source into cipher text before that text goes to the encoder, so that only receivers who have the cipher keys can read them.
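The order of the two steps can be sketched in Python; the toy XOR cipher and the bit-string "encoder" below merely stand in for real encryption and channel coding.

```python
from itertools import cycle

def encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Toy stream cipher (XOR with a repeating key), standing in for real encryption."""
    return bytes(p ^ k for p, k in zip(plaintext, cycle(key)))

def encode(data: bytes) -> str:
    """Toy channel encoder: represent each byte of the message as 8 bits."""
    return ''.join(f'{b:08b}' for b in data)

key = b"secret"
message = b"attack at dawn"

ciphertext = encrypt(message, key)   # encryption happens first...
signal = encode(ciphertext)          # ...then the encoder prepares it for the channel

# The receiver reverses the steps: decode the bits, then decrypt with the key.
received = bytes(int(signal[i:i + 8], 2) for i in range(0, len(signal), 8))
print(encrypt(received, key))        # XOR with the same key recovers b'attack at dawn'
```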
Communication systems need not suffer from digitization errors. Every continuous, bandwidth-limited signal can be digitized without any loss of information by sampling at a rate of at least twice the highest frequency.
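Sampling below that rate causes aliasing: two different frequencies become indistinguishable. A small numerical check, with values chosen only for illustration:

```python
import numpy as np

fs = 4.0                               # sample 4 times per second
t = np.arange(0, 2, 1 / fs)            # two seconds of sample times

low  = np.cos(2 * np.pi * 1.0 * t)     # 1 Hz signal
high = np.cos(2 * np.pi * 3.0 * t)     # 3 Hz signal, above fs/2, so it aliases

print(np.allclose(low, high))          # True: at 4 Hz the two signals look identical
# Sampling above 6 Hz (twice the 3 Hz component) keeps them distinct, as the theorem requires.
```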
A code representing just four possible messages is not far removed from nature. The DNA in our cells is a natural message source that uses just four letters — G, A, T and C.
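With four letters, two bits per letter suffice (log2 4 = 2). A small sketch of such a code in Python:

```python
CODE = {'G': '00', 'A': '01', 'T': '10', 'C': '11'}   # 2 bits per letter

def encode(dna):
    return ''.join(CODE[letter] for letter in dna)

def decode(bits):
    letters = {v: k for k, v in CODE.items()}
    return ''.join(letters[bits[i:i + 2]] for i in range(0, len(bits), 2))

print(encode("GATTACA"))           # 00011010100111... actually: 00011010011101
print(decode(encode("GATTACA")))   # GATTACA
```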
Unlike signals, bits do not have a physical reality. A bit is a notation that represents which of the two observable properties holds. For example, an engineer might assign bit “1” to mean that a laser beam reflects from a spot on the surface of a DVD and a “0” to mean that there is no reflection. Or the “1” might mean that the output of a transistor is 5 volts, and a “0” might mean it is 3 volts. Or a “1” might mean that a particular frequency, say 400 Hz, is present in a musical recording, whereas a “0” means it is not. Bits are abstractions that we use to specify what we want the systems to do. Engineers arrange the physical “stuff” (materials) so that it behaves as specified.
Because information in a physical system is always represented by physical states, it takes time and energy to read, write, and transform it. Communication and computation can never be free of the constraints of the physical world.
Computer scientists are fond of abstractions. An abstraction is a mental model that captures the essential features of a thing and suppresses all other features. Computer scientists often describe programming as designing a hierarchy of abstractions represented as “abstract objects” operated on by designated functions. This notion has become so popular that computer science is often touted as the field that has learned best how to manage abstractions.
Computing abstractions differ in an important way from the mathematical abstractions common in other fields: computing abstractions perform actions. The terminology of abstractions often obscures the principle of stuff: the reality that computational actions are implemented as physical processes controlled by programs.
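In programming terms, an abstract object exposes a small set of designated operations and hides everything else; a stack is the textbook example (the Python below is illustrative, and each operation is ultimately carried out by physical actions on real memory).

```python
class Stack:
    """An abstract object: users see only push, pop, and is_empty,
    not the list that happens to implement it."""
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items

s = Stack()
s.push(3)
s.push(7)
print(s.pop(), s.pop(), s.is_empty())   # 7 3 True
```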
In the 1920s, the term “computer” meant a person who calculated numbers. Thus the first electronic computing machines were billed as “automatic computers” to distinguish them from actual, human computers. The acronyms of the first electronic computers in the 1940s ended in “-AC” to signify this.
A program is a set of instructions arranged in a pattern that causes the desired function to be calculated. Programming is the art of designing a program and providing convincing evidence that the program computes its function correctly.
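A very small example of both halves of that art, in Python: a program that computes a function, together with checks offered as partial evidence that it does so correctly.

```python
def factorial(n):
    """Compute n! for a non-negative integer n."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

# Evidence that the program computes its function: spot checks and a defining property.
assert factorial(0) == 1
assert factorial(5) == 120
assert all(factorial(n) == n * factorial(n - 1) for n in range(1, 10))
```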
Originally, in the 1950s, reusable subprograms were called subroutines. That name eventually gave way to “procedure” in the 1960s under the influence of the Algol language. A procedure is a subprogram that implements a single, usually simple, function.
This problem plagued many early computer systems. Before engineers understood it, all they could see was that, at random times, the CPU would stop. They described these mysterious freezes as “cosmic ray crashes” because they seemed to be random disruptions of transistor function.
People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird.
Programmers develop varying levels of skill at programming. Novice programmers find themselves spending a lot of time trying to understand the syntax of their programming languages and generally accepted basic algorithms. Competent programmers can perform a range of standard programming tasks and satisfy customers without requiring a supervisor to watch over them. Virtuoso programmers can program large systems well and can mentally move from high-level system views to low-level machine code views with great facility.
Engineers view this process as imprecise; they are constantly iterating and backing up as they learn new things. For example, while writing a specification, they may learn that a requirement is ambiguous, and they must consult with users to resolve the ambiguity. While implementing a prototype, they may learn that a specification leads to a major inefficiency, requiring a change of specification or requirement. While testing, they may discover a performance problem or uncover a bug, requiring that they double-check their implementation, review the specifications, or consult again with users. These steps can be unwieldy with large systems consisting of many modules and built by large teams. Sophisticated decision tracking, coordination, and version control systems are used to keep track of all the changes, steps forward, and steps backward.
A critical component of programming is the conversion of the program as an expression into a form executable on a machine. There have been two approaches to this, the compiler and the interpreter. A compiler is a program that takes an input file containing a string of symbols representing a program, parses it into a syntax tree according to the grammar of the language, and generates machine code that implements the operations called for by the syntax tree. An interpreter is a program that also parses the input but, instead of generating machine code, it calls system operations as soon as the parser identifies them; the system operations are subroutines already compiled into machine code. Over the years the distinction between the two forms of execution has blurred. Perhaps the most important remaining difference is that a compiler runs once and produces executable machine code that can be run many times, whereas an interpreter is run on every execution.
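Here is a minimal sketch of the interpreter idea in Python: a tiny grammar for sums and products is parsed, and each construct is evaluated as soon as it is recognized (the grammar and names are invented for illustration).

```python
import re

def tokenize(src):
    return re.findall(r'\d+|[+*()]', src)

def evaluate(tokens):
    # Grammar: expr := term ('+' term)* ; term := factor ('*' factor)* ;
    #          factor := number | '(' expr ')'
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        value = term()
        while peek() == '+':
            advance()
            value += term()        # evaluate as soon as the '+' is recognized
        return value

    def term():
        value = factor()
        while peek() == '*':
            advance()
            value *= factor()
        return value

    def factor():
        tok = advance()
        if tok == '(':
            value = expr()
            advance()              # consume the closing ')'
            return value
        return int(tok)

    return expr()

print(evaluate(tokenize("2*(3+4)")))   # 14
```

A compiler would walk the same syntax but emit machine code for later execution instead of computing values on the spot.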
The Turing machine has long been the reference model for computation, winning out over competing models such as Post’s rewriting systems, Church’s lambda calculus, and Gödel’s recursive functions. It achieved this status because it most closely resembled automatic computers and could compute any function any of the other models could.
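The model is simple enough to simulate in a few lines. The sketch below, with an invented rule format, runs a machine that complements a string of 0s and 1s and halts at the first blank.

```python
def run_tm(tape, rules, state='scan', blank='_'):
    """Simulate a one-tape Turing machine.
    rules maps (state, symbol) to (next_state, symbol_to_write, head_move)."""
    cells = dict(enumerate(tape))
    head = 0
    while state != 'halt':
        symbol = cells.get(head, blank)
        state, write, move = rules[(state, symbol)]
        cells[head] = write
        head += move
    return ''.join(cells[i] for i in sorted(cells)).strip(blank)

# A machine that flips every 0 to 1 and every 1 to 0, then halts at the blank.
rules = {
    ('scan', '0'): ('scan', '1', +1),
    ('scan', '1'): ('scan', '0', +1),
    ('scan', '_'): ('halt', '_', 0),
}
print(run_tm('10110', rules))   # 01001
```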
All computational methods take time and consume energy. We estimate time (or energy) by counting the number of instructions executed.
There are over 3,000 known NP-complete problems from all parts of science, engineering, economics, and social sciences. That means that the best algorithms we know for any of these problems are exponential or worse. It also gives empirical evidence for believing that P != NP because thousands of people have been unsuccessful at finding a fast method for any of those problems.
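Subset sum is one such problem: given a list of numbers, is there a subset adding up to a target? The obvious method, sketched below, simply tries subsets, and there are 2^n of them.

```python
from itertools import combinations

def subset_sum(numbers, target):
    """Brute-force search: examines up to 2**len(numbers) subsets."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return subset
    return None

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # (4, 5)
```

Cleverer algorithms improve on this naive search, but every method known still takes exponential time in the worst case.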
In the internet, retrieval is worse than looking for a needle in a haystack; it’s looking for a specific needle in a needle stack.
As a consequence of the hierarchy, performance optimization is not simply a matter of finding algorithms with lowest CPU instruction counts; it depends greatly on the arrangements of data in the hierarchy.
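A simple demonstration, using NumPy and illustrative sizes: the same sum is performed in storage order and against it. On most machines the row-order traversal is noticeably faster because it touches memory contiguously, even though the instruction counts are essentially the same.

```python
import time
import numpy as np

a = np.random.rand(5000, 5000)   # stored row by row (row-major order)

def timed(f):
    start = time.perf_counter()
    f()
    return time.perf_counter() - start

row_order = timed(lambda: sum(a[i, :].sum() for i in range(a.shape[0])))
col_order = timed(lambda: sum(a[:, j].sum() for j in range(a.shape[1])))

print(f"row-order traversal:    {row_order:.3f} s")   # contiguous, cache-friendly
print(f"column-order traversal: {col_order:.3f} s")   # strided, usually slower
```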
The unstructured Internet has disappointed many people because, despite the hype about access to all the world’s information, it is a poor information retrieval system.
In the fourth mode the name is a symbolic string that means something to users. Internet host names, file pathnames, and web URLs are examples. This mode is layered on top of the others.
Internet search is the hardest of the six modes. The set of objects matching a given set of search terms is likely to be enormous — there can be thousands or millions of matching documents. Even though Google’s page-ranking algorithm does a remarkable job at presenting useful documents, users often find that the highest-ranked documents are still not useful.
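The core idea of page ranking can be suggested with a small power-iteration sketch over an invented four-page web; this illustrates the general technique, not Google's production algorithm.

```python
def pagerank(links, d=0.85, iterations=50):
    """Power-iteration PageRank. links maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - d) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = d * rank[page] / len(outgoing)
                for target in outgoing:
                    new[target] += share
            else:                          # a page with no links spreads its rank evenly
                for target in pages:
                    new[target] += d * rank[page] / n
        rank = new
    return rank

web = {'A': ['B', 'C'], 'B': ['C'], 'C': ['A'], 'D': ['C']}
print(pagerank(web))   # C collects the most rank; D, with no inbound links, the least
```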
Over the years critics predicted that virtual addressing methods would disappear because RAM technology would eventually be so good that most programs would never have to page at all. A simpler OS with no paging could give everyone all the RAM they need. This outcome is quite unlikely. Virtual addressing methods are here to stay because they solve important problems of sharing, naming, authenticating, and preventing programs from interfering with one another. They are essential even if their data-positioning policies are never switched on. Moreover, because so many of the computations we wish to run deal with “big data” — meaning data that exceed the capacity of our current storage systems — there will never be enough RAM.
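The essential mapping is small enough to sketch: a virtual address is split into a page number and an offset, and a page table maps pages to RAM frames (the sizes and table below are illustrative).

```python
PAGE_SIZE = 4096   # bytes per page, a common illustrative choice

def translate(virtual_address, page_table):
    """Map a virtual address to a physical address via a page table (page -> frame)."""
    page = virtual_address // PAGE_SIZE
    offset = virtual_address % PAGE_SIZE
    if page not in page_table:
        # A real OS would field this page fault and bring the page in from storage.
        raise LookupError(f"page fault: page {page} is not in RAM")
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

page_table = {0: 5, 1: 9, 2: 1}            # this program's pages and their RAM frames
print(translate(8200, page_table))          # page 2, offset 8 -> frame 1 -> 4104
```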
Busy waiting can devastate a time-sharing system by wasting large amounts of CPU time. It would be much better to suspend the process the moment it waits for a lock and resume it when the lock is released.
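The contrast can be sketched with Python threads: a spinning loop keeps consuming CPU time while it waits, whereas a condition variable suspends the waiting thread until it is notified.

```python
import threading

ready = False
cond = threading.Condition()

def busy_wait():
    while not ready:        # busy waiting: the scheduler keeps giving CPU time
        pass                # to a loop that accomplishes nothing

def blocking_wait():
    with cond:
        while not ready:
            cond.wait()     # suspend: no CPU is consumed until notify is called

def release():
    global ready
    with cond:
        ready = True
        cond.notify_all()   # resume any thread suspended in cond.wait()
```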
One of the most difficult questions about computations is: How long does it take? This question is difficult on networks of servers where jobs compete and queues form. Although jobs and the networks are deterministic, the response times are random. The uncertainty is caused by randomness in the arrivals of jobs and in the lengths of their service times at servers in the network. Queueing theory has proved to be a remarkably accurate way to overcome the uncertainty and predict throughput, response time, and congestion in networks of computers.
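For a single server fed by random (Poisson) arrivals, for instance, the classic M/M/1 formula predicts the mean response time directly from the mean service time and the utilization; the numbers below are illustrative.

```python
def mm1_response_time(arrival_rate, mean_service_time):
    """Mean response time of an M/M/1 queue: R = S / (1 - U), where U = arrival_rate * S."""
    utilization = arrival_rate * mean_service_time
    if utilization >= 1:
        raise ValueError("server saturated: the queue grows without bound")
    return mean_service_time / (1 - utilization)

# A server taking 10 ms per job, with 80 jobs per second arriving: U = 0.8, R = 50 ms.
print(mm1_response_time(80, 0.010))   # 0.05 seconds
```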
Simplicity is inherently complex.
Descriptions of software entities that abstract away their complexity often abstract away their essence. Good judgment comes from experience, and experience comes from bad judgment.
They declared the entire software industry to be in a state of perpetual crisis because the size and complexity of needed systems always seemed to exceed our tools and skills for building them. They called for a new field of “software engineering” to bring rigorous engineering methods to software development. Many developers produced new tools and methods to help reduce errors and make software predictably reliable.
He said that despite tremendous advances in tools, our ability to create dependable, reliable, usable, safe, and secure software systems had not materially improved. He said that the hard part of software design was getting an intellectual grasp of the problem to be solved by software. That will never be easy. Success depends largely on the cultivation of people who have the requisite skills.
This is a profound conclusion. In large measure the success of a design depends on the engineer’s skill, not on formal mathematical analysis or derivations from first principles. It also depends on knowledge of history in the designer’s field, which informs the designer on what works and what does not work.
In his 1995 memoir he strongly criticized the more recent trend to ignore the historical development and try to design from scratch. Without the knowledge of what worked and what did not, designers have tended to repeat the same mistakes. We can see it today: the designers of personal computer operating systems and software tried to avoid the mistakes of the previous, mainframe era by ignoring them (“keep it simple”), wound up repeating the same mistakes a generation later, and then struggled with problems whose solutions had been found many years before.
Barry argued that the standard engineering design approach of careful, almost rigid process was at the strict end of a planning spectrum, and agile methods were at the flexible end. He thought that careful planning is needed when reliability, safety, and security are important and that agility is needed when usability and evolvability are important. He exhorted the careful-planning schools to collaborate with the agile schools to find the common ground for better systems.
Abstraction means to define a simplified version of something and to state the operations (functions) that apply to it. Abstraction is one of the most fundamental powers of the human brain. By bringing out the essence and suppressing detail, an abstraction offers a simple set of operations that apply to all cases. An abstraction corresponds to an aggregate in a hierarchy; forming a hierarchy is a process of abstraction.
A file (a named sequence of bits) is a common abstraction for named containers of digital objects. A file system provides create, delete, open, close, read, and write operations that work on any file.
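In Python, for instance, the same handful of operations applies whether the file holds text, images, or measurements; the filename below is illustrative.

```python
import os

# Create and write a file, read it back, and delete it: the same operations
# work on any file, regardless of what its bits represent.
with open("notes.txt", "w") as f:
    f.write("a file is a named sequence of bits\n")

with open("notes.txt", "r") as f:
    print(f.read())

os.remove("notes.txt")
```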
In the early 1960s defense communications engineers began searching for alternatives to the telephone network. Defense officials were very concerned about using the telephone network for their communications. The telephone network was based on the principle of circuit switching, which meant that dialing a phone number caused a series of mechanical crossbar switches to provide a direct electrical connection from a calling station to a receiving station. The network could be severely disrupted by a natural disaster or a manmade disaster because disabling one or a few nodes could completely sever the network. Not only was the network insecure — a telephone line could be tapped fairly easily — but it was unreliable in the face of hostile operations. The telephone network was seen as a serious weak spot in national defenses.
The packet network is attractive for three main reasons. First, it accommodates different types of traffic simultaneously. For example, voice traffic comes in bitstreams of a constant rate, typically 8,000 samples per second for telephone-quality voice, lasting many minutes. Web traffic comes in short, high-intensity bursts when web servers transfer pages to requesting clients. The network transmits packets without regard to whether they contain continuous voice or bursty data traffic.
A router stores incoming packets in a queue and then, using a routing table, forwards them to a next router closer to the destination. The routing algorithm, which generates the routing tables, finds the shortest paths (measured by hops between routers) to the targeted hosts. Routing algorithms are generally run in the background so that routing tables are continuously updated to represent the current network connectivity.
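For a network where every link counts as one hop, a breadth-first search is enough to build such a table; the five-router network below is invented for illustration.

```python
from collections import deque

def routing_table(network, source):
    """Next-hop routing table from source, using breadth-first search (shortest hop count).
    network maps each router to the list of its neighbors."""
    table = {}                       # destination -> next hop to use from source
    hops = {source: 0}
    queue = deque([source])
    while queue:
        router = queue.popleft()
        for neighbor in network[router]:
            if neighbor not in hops:
                hops[neighbor] = hops[router] + 1
                # The first hop toward neighbor is neighbor itself if we are at the
                # source; otherwise it is whatever first hop reached the current router.
                table[neighbor] = neighbor if router == source else table[router]
                queue.append(neighbor)
    return table

net = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C', 'E'], 'E': ['D']}
print(routing_table(net, 'A'))   # {'B': 'B', 'C': 'C', 'D': 'B', 'E': 'B'}
```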
Vint Cerf draws an analogy between the IP and a version of the Postal Service that deals only with postcards. A postcard is like a packet: to-field, from-field, and a limited area for the message. When you mail a postcard, you know it may not be delivered, or it may be damaged in transit. You do not know exactly how long it will take for delivery. There is no guarantee that a series of postcards will be received in the same order they were sent: each postcard can follow a different route on different mail trucks or planes.
So that the sender knows which postcards have been received, the receiver returns an ACK postcard from time to time summarizing the postcards received so far; on receipt of ACKs, the sender discards the backup copies of acknowledged postcards. Here is the important step: if the sender receives no ACK within a time-out limit since the previous send, the sender automatically resends the postcard. The sender repeats this until finally an ACK comes for the card — or until the sender “times out” and reports that a transmission failure has occurred. This method overcomes possible loss of ACKs in the network. The receiver ignores duplicate postcards from the sender.
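The send-and-retry loop at the heart of this scheme is tiny; the sketch below fakes the unreliable network with a random coin flip, and the timeout and retry limit are made-up values.

```python
import random
import time

TIMEOUT = 0.5      # seconds to wait for an ACK before resending (illustrative)
MAX_TRIES = 5      # report failure after this many attempts

def transmit(postcard):
    """Stand-in for handing the postcard to the unreliable network."""
    pass

def ack_received():
    """Stand-in for checking the mailbox: about 30% of ACKs never show up."""
    return random.random() > 0.3

def reliable_send(postcard):
    for attempt in range(1, MAX_TRIES + 1):
        transmit(postcard)
        time.sleep(TIMEOUT)                # wait out the timer
        if ack_received():
            return f"acknowledged after {attempt} attempt(s)"
        # No ACK within the timeout: loop around and resend the saved copy.
    return "transmission failure reported to sender"

print(reliable_send("postcard #1"))
```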
This scheme would be clumsy and messy with a real (paper) book, but with a digital document it is simple, clean, and fast.