The best documentation for a computer program is a clean structure.
Kernighan and Plauger
Structured programming is not about avoiding the goto. The absence of gotos is merely a side effect, and their presence can invariably be used as a barometer: Dijkstra observed that the comprehensibility of a program was inversely proportional to the number of gotos it contained.
In many respects software engineering is about the management of complexity. By this definition any introduction of complexity is questionable. A unit of code, regardless of whether it is a class, a function, a loop, a block or a simple statement, should have a single simple responsibility: it should do one thing well. Complexity is handled by breaking complex responsibilities into simpler ones, and regrouping vertically or horizontally. This is the root of all modular approaches: it is not the exclusive domain of top-down programming. For control flow, the basic premise is that any program can be expressed in terms of three structure types: sequence, selection and iteration.
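A minimal sketch of all three structures in one small function follows. The function count_even and its task are invented for illustration; the comments mark where sequence, selection and iteration each appear.

```cpp
// Hypothetical example: count the even values in an array.
int count_even(const int values[], int size)
{
    int count = 0;                      // sequence: one statement after another
    for(int i = 0; i != size; ++i)      // iteration: repeat while a condition holds
        if(values[i] % 2 == 0)          // selection: choose whether to act
            ++count;
    return count;                       // single exit, reached in sequence
}
```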
Readers familiar with the concept of patterns (Alexander, rather than knitting) might like to consider structured programming as a pattern language for control flow. The forces and problems to be resolved are those of complexity manifested as readability, reliability, testability and maintainability. These three primitives provide the essential means to do this. Structured programming is not a handle-turning exercise, so using these primitives does not automatically endow your code with style and elegance. There is no escaping the need for reasoned thought in the process of program development.
Jumpy code has, by definition, non-smooth control flow and often implies turbulent or chaotic understanding on the part of the programmer. Indeed, one entry in my dictionary states that flow is continuous by definition. The classic jump statement is the goto, but it also has more restrained relatives in the form of continue, return and break (in a loop).
Taligent's Guide to Designing Programs is filled with sound, concise advice on building programs. They take no prisoners when considering the goto:
Taking some examples from real code is a good way to illustrate two simple rules: forward jumps should be expressed using selection and backward jumps using loops. More specific refinement rules become apparent on inspection:
next1:
    p = mem[w];
    if(p == 0) {
        w = pop();
        goto next1;
    }
The context of this code is not important, except to say that unfortunately this
fragment is about par for the course. This is a loop by any other name: C provides
the programmer with three well-defined looping structures; what more do you
want? A literal translation into something tidier gives us
    for(;;) {
        p = mem[w];
        if(p == 0)
            w = pop();
        else
            break;
    }
Simplifying further eliminates any kind of jumpiness in the code:
    while((p = mem[w]) == 0)
        w = pop();
If you prefer side-effect-free conditions and wish to avoid unnecessary
assignments, the following has the same effect:
    while(mem[w] == 0)
        w = pop();
    p = mem[w];
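The other rule, that forward jumps should be expressed using selection, can be sketched in the same spirit. The two functions below are invented for illustration and are not taken from the program under discussion: the first skips over a statement with a forward goto, the second says the same thing with a plain if.

```cpp
// Hypothetical example: treat negative readings as zero before accumulating.

// Jumpy version: a forward goto skips over the adjustment.
int accumulate_with_goto(int total, int reading)
{
    if(reading >= 0)
        goto add;
    reading = 0;
add:
    total += reading;
    return total;
}

// Structured version: the forward jump becomes a selection.
int accumulate_structured(int total, int reading)
{
    if(reading < 0)
        reading = 0;
    total += reading;
    return total;
}
```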
The program the original was drawn from also exhibits another goto
syndrome: jumping for code reuse. In particular, the usage message in main
is accessed by jumping to it from any number of points. One is sorely tempted to
take the original author to one side and explain, kindly and patiently, that this
is what functions are for. Such wrong-headedness about code reuse led to a
main that was a few hundred lines long in this instance.
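As a rough sketch of the alternative, the usage message can live in a function of its own and be called from wherever it is needed. The names usage and run, the argument checks and the wording of the message are all invented for illustration; none of them comes from the original program.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical usage message: written once, callable from anywhere.
int usage(const char *program)
{
    std::fprintf(stderr, "usage: %s file\n", program);
    return EXIT_FAILURE;
}

// Hypothetical stand-in for the body of main: each bad case calls
// usage rather than jumping to a shared label.
int run(int argc, const char *const argv[])
{
    int result = EXIT_SUCCESS;
    if(argc != 2)
        result = usage(argv[0]);
    else if(argv[1][0] == '\0')
        result = usage(argv[0]);
    else
        std::printf("processing %s\n", argv[1]);
    return result;
}
```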
So is there a valid use for the goto? Yes, three that I can think of: in machine-generated code not intended for human consumption; in real optimisation, by which I mean speeding up genuine performance-critical compute-bound loops with complex internal behaviour on a specific platform (relatively few programmers come across many of these in their careers); and, taking the fifth, as witness against itself in the form of code counter-examples. Thus, the need for a goto is a symptom and not a cure.
What about other jumps? Taligent's guide finishes the small section on the goto with
Standard block ifs can be used to direct flow around code rather than bombing out prematurely with returns, e.g. instead of
    if(this == &rhs)
        return *this;
    ...
    return *this;
use
    if(this != &rhs)
        ...
    return *this;
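For a fuller picture, here is a minimal sketch of a copy assignment operator written with the block if. The class text and its helper duplicate are hypothetical, chosen only to give the elided part something to do.

```cpp
#include <cstring>

// Hypothetical class owning a heap-allocated copy of a C string.
class text
{
public:
    explicit text(const char *source = "") : chars(duplicate(source)) {}
    text(const text &other) : chars(duplicate(other.chars)) {}
    ~text() { delete[] chars; }

    text &operator=(const text &rhs)
    {
        if(this != &rhs)
        {
            char *copy = duplicate(rhs.chars);  // copy first, in case new throws
            delete[] chars;
            chars = copy;
        }
        return *this;                           // the only exit
    }

    const char *c_str() const { return chars; }

private:
    static char *duplicate(const char *source)
    {
        char *copy = new char[std::strlen(source) + 1];
        std::strcpy(copy, source);
        return copy;
    }

    char *chars;
};
```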
Premature exits from within a loop can often be eliminated by the "do one thing
well" rule. A typical find algorithm using multiple returns:
    for(start(); !at_end(); next())
        if(check(current()))
            return true;
    return false;
A more general approach reallocates the responsibilities, so that the loop
advances the program to a well-defined state, and the subsequent code acts on
this state:
    start();
    while(!at_end() && !check(current()))
        next();
    return !at_end();
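To see this in a form that compiles, the sketch below wraps the same loop in a small class. The class searcher, its vector of ints and the test for a negative value are assumptions made for illustration; only the names start, at_end, next, current and check come from the fragment above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container-walking class built around the find loop.
class searcher
{
public:
    explicit searcher(const std::vector<int> &source)
      : values(source), position(0) {}

    // True if any value satisfies check, found with a single return.
    bool contains_negative()
    {
        start();
        while(!at_end() && !check(current()))
            next();
        return !at_end();
    }

private:
    void start() { position = 0; }
    bool at_end() const { return position == values.size(); }
    void next() { ++position; }
    int current() const { return values[position]; }
    static bool check(int value) { return value < 0; }

    std::vector<int> values;
    std::size_t position;
};
```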
The while, used here in preference to a null-bodied for, does the
'travelling' and the result is evaluated in the 'slip-stream' of the control
flow. Likewise the following cascade pattern for error handling avoids multiple
returns, allowing a single point for any resource reclamation or rollback before
exiting:
    const char *status = 0;
    if((buffer = (char *) malloc(size)) == 0)
        status = "malloc failed";
    else if(read(source, buffer, size) == -1)
        status = "read failed";
    else if(...)
        status = "...";
    else
        ... // perform main function
    free(buffer);
    buffer = 0;
    if(status)
        fprintf(stderr, "error: %s\n", status);
In C++ much of this can be tidied using exceptions and the
resource acquisition is initialisation idiom. In this idiom an object is given
responsibility for owning access to a resource, automatically closing or
deallocating it on its destruction. This is exception safe and can do a lot to
simplify code.
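A minimal sketch of the idiom, assuming a malloc-based buffer like the one in the cascade: the class buffer_owner, the function process and the live counter are all hypothetical, and the counter exists only so that the release can be observed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical owner of a malloc'd buffer (size assumed to be non-zero).
class buffer_owner
{
public:
    explicit buffer_owner(std::size_t size)
      : memory(static_cast<char *>(std::malloc(size)))
    {
        if(memory == 0)
            throw std::bad_alloc();
        ++live;
    }
    ~buffer_owner()
    {
        std::free(memory);      // released on every path out of scope
        --live;
    }
    char *get() const { return memory; }

    static int live;            // buffers currently owned, for illustration only

private:
    buffer_owner(const buffer_owner &);             // not copyable
    buffer_owner &operator=(const buffer_owner &);
    char *memory;
};

int buffer_owner::live = 0;

// Hypothetical function: no explicit free, whether it returns or throws.
void process(std::size_t size, bool fail)
{
    buffer_owner buffer(size);
    if(fail)
        throw std::runtime_error("read failed");
    buffer.get()[0] = '\0';     // stand-in for the main function
}
```

Whether process returns normally or throws, the destructor of buffer runs as the scope is left, so the single point of reclamation is provided by the language rather than by the layout of the function.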
So how do exceptions fit into the structured scheme of things? They are not truly part of the structured manifesto, and some have tried likening them to a form of goto. Truth be known, the C++ model of exceptions is so unlike the goto that the lack of similarity is quite startling. A thrown exception results in a transfer of control (unwinding the call stack in an orderly fashion) and information (such as a parameterised indication of what occurred) to a point where the calling code can handle it. Exception handling is a pragmatic tool for managing complexity, respecting a program's modularity, extensibility and original structure.
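The transfer of both control and information can be sketched briefly. The functions parse_digit, sum_digits and describe are invented for illustration, and std::to_string assumes a C++11 or later compiler. The exception is thrown two calls down, carries a description of what went wrong, and is handled only where the calling code can do something about it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical lowest level: detects the problem but cannot handle it.
int parse_digit(char c)
{
    if(c < '0' || c > '9')
        throw std::invalid_argument(std::string("not a digit: ") + c);
    return c - '0';
}

// Hypothetical middle level: no error-handling code at all.
int sum_digits(const std::string &text)
{
    int total = 0;
    for(std::string::size_type i = 0; i != text.size(); ++i)
        total += parse_digit(text[i]);
    return total;
}

// Hypothetical top level: the point where the failure can be handled.
std::string describe(const std::string &text)
{
    std::string result;
    try
    {
        result = "sum is " + std::to_string(sum_digits(text));
    }
    catch(const std::invalid_argument &failure)
    {
        result = std::string("error: ") + failure.what();
    }
    return result;
}
```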
Large functions with a lot of jumps are obviously harder to comprehend. For these it is probably worth taking a step back, reconsidering the problem, breaking the function up into smaller units and then seeing where you stand. If it looks pretty much the same you can still perform a noble service to your fellow programmer by inserting clear, brief comments. Sure, compared to the goto the other jumps are well behaved, but they still have their dark side. When you see them, get their alibi.
© Kevlin Henney
First published in CVu 7(6), September, 1995
Converted to HTML, March 2001