|
|
| Author |
Message |
cross
|
|
Is `goto' still harmful?
|
Jan 26 04:27 UTC 2007 |
Almost 20 years ago, Remmers entered an item in M-Net's programming
conference about structured programming and the use of goto. He quoted a
letter to the editor of the Communications of the ACM that took exception to
the idea that the `goto' statement should be banned outright as a matter of
course in programming (and, in a larger sense, computer science) education.
The assertion of the letter was that judicious use of goto could enhance
clarity and readability of programs, and in some cases, correctness.
The letter stated that its author gave the following task to a group of 12
or so "expert programmers" (however that was decided):
"Let X be an N by N matrix of integers. Write a program that will print
the number of the first all-zero row of X, if any."
(Most of the solutions offered in the thread generalized to an M by N
matrix, where M and N might differ. This should not materially affect
the answer.)
Various solutions were offered; two of which were incorrect. Three or
so used goto's. The rest did not.
The question is: How would you implement this? (Without cheating and
looking at the answers on M-Net!) Would you use a goto? Would that
improve the efficiency and/or readability of the resulting program?
What do people on grex have to say? Do we still believe the dogma that
goto is only ever harmful, or are there cases where it is acceptable?
Bearing in mind that the major languages of the time the question was
posed included FORTRAN, COBOL, Pascal and others, how have the advances
in languages changed the way we think about these problems in the last
twenty years? Feel free to post a solution in any language you like
and let's talk about it!
|
| 45 responses total. |
remmers
|
|
response 1 of 45:
|
Jan 26 12:56 UTC 2007 |
(It would be cheating for me to post my solution, but I await others'
with interest.)
|
nharmon
|
|
response 2 of 45:
|
Jan 26 14:18 UTC 2007 |
I'm certainly NOT an expert programmer, but here is what I came up with
in Perl:
@matrix = (
[1, 2, 3, 4],
[0, 5, 6, 0],
[0, 0, 0, 0],
[7, 0, 8, 9]
);
my ($rows, $cols) = matrix_count_rows_cols ($matrix);
my $row = 0;
my $col;
my $found;
while ( $row < $rows ) {
$col = 0;
while ( $col < $cols ) {
$found = $row + 1;
if ( $matrix[$row][$col] != 0 ) {
$col = $cols;
$found = 0;
}
$col++;
}
if ( $found != 0 ) {
$row = $rows;
}
$row++;
}
if ( $found != 0 ) {
print "The first all-zero row is row number $found.\n";
}
else {
print "No all-zero rows found.\n";
}
|
cross
|
|
response 3 of 45:
|
Jan 26 17:00 UTC 2007 |
Hmm, that will work, but there are better solutions. Early on in the thread,
Remmers posted the letter's author's solution, which looked something like
this (in Pascal, slightly reformated):
for i := 1 to n
do begin
for j := 1 to n do
if x[i,j] <> 0
then goto reject;
writeln('1st all-zero row is ',i);
break;
reject:
end
In C, we'd rewrite this as:
for (i = 0; i < n; i++) {
for (j = 0; j < n; j++)
if (x[i][j] != 0)
goto reject;
printf("1st all-zero row is %d\n", i);
break;
reject:
}
Nate, you gave a go in Perl: how would you recast this code directly
into Perl, without using goto? (Hint: look at loop control statements.)
|
nharmon
|
|
response 4 of 45:
|
Jan 26 18:09 UTC 2007 |
This response has been erased.
|
nharmon
|
|
response 5 of 45:
|
Jan 26 18:13 UTC 2007 |
Woops
Make that:
for ($i = 0; $i < $n; $i++) {
$rowsum = 0;
for ($j = 0; $j < $n; $j++) {
$rowsum = $rowsum + $matrix[$i][$j];
}
if ( $rowsum != 0 ) {
next;
}
print "1st all-zero row is ".($i+1)."\n";
last;
}
|
cross
|
|
response 6 of 45:
|
Jan 26 18:23 UTC 2007 |
That's closer to what I was thinking about, but still not optimal. In
particular, once you hit a non-zero element in some column, why do you keep
going? You know that row is not all-zero at that point: is there a way to
avoid looking at the rest of the row? And do you actually need $rowsum?
|
nharmon
|
|
response 7 of 45:
|
Jan 26 18:35 UTC 2007 |
I see what you mean, Dan.
for ($i = 0; $i < $n; $i++) {
for ($j = 0; $j < $n; $j++) {
if ( $matrix[$i][$j] != 0 ) {
last;
}
}
if ( $matrix[$i][$j] == 0 ) {
print "1st all-zero row is ".($i+1)."\n";
last;
}
}
|
cross
|
|
response 8 of 45:
|
Jan 26 18:49 UTC 2007 |
Okay, can you spot the bug?
|
nharmon
|
|
response 9 of 45:
|
Jan 26 19:02 UTC 2007 |
It seems to test okay with real data. Maybe you can clue me in? :)
|
cross
|
|
response 10 of 45:
|
Jan 26 19:13 UTC 2007 |
Consider the case where a row is all zero. Then the inner for loop will
terminate naturally, and $j will be equal to $n. But arrays in Perl are
origin 0, meaning that their indicies range from 0 ... ($n - 1). In your if
statement, you index $a[$i][$j], but on an all-zero row, $a[$i][$j] is
actually one element beyond the end of array that represents the row; a
classic off by one error. I could be mistaken, I'm fairly certain there's
no guarantee in Perl that going one past the end of the array always yields
a zero and that this is, in fact, an error.
You could do a test on $j instead of looking at the array to do essentially
the same thing, but then you're doing an extra comparison in the outter loop
that you could avoid in the goto case. There is another solution using the
facilities of Perl, anyone know what it is?
|
albaugh
|
|
response 11 of 45:
|
Jan 26 19:36 UTC 2007 |
Is this item about discussing the merits of goto, writing a problem solution,
or both?
|
cross
|
|
response 12 of 45:
|
Jan 26 19:46 UTC 2007 |
Both.
|
cross
|
|
response 13 of 45:
|
Jan 26 19:47 UTC 2007 |
Nate saw my solution on M-Net, so I'm going to post it. Basically, I used
the labeled loop functionality of perl. Here's my Perl-based M x N matrix
solution:
# Now that we have loop control statements, do we need a fancy algorithm?
my @a = ([0, 1, 2], [0, 0, 0], [0, 1, 0]);
my $N = 3;
my $M = 3;
my $i, $j;
row: for ($i = 0; $i < $M; $i++) {
for ($j = 0; $j < $N; $j++) {
next row if $a[$i][$j] != 0;
}
print "1st all-zero row is ", $i + 1, "\n";
last row;
}
if ($i == $M) { print "No all-zero row found.\n"; }
|
nharmon
|
|
response 14 of 45:
|
Jan 26 20:10 UTC 2007 |
This response has been erased.
|
cross
|
|
response 15 of 45:
|
Jan 26 20:11 UTC 2007 |
Note something about this solution: basically, we're using functionality of
the language that effectively implements a goto. But, clearly someone
recognized the need for such a thing and added it to the language, thus
obviating the need to use goto to break out of a nested loop structure.
Actually, we have several things of this nature in modern programming. For
instance, exceptions are nothing more than glorified goto's with some
additional state information associated with them. So the concept is
useful, and what you've seen in the 40 years since Djikstra wrote ``Goto
Statement Consider Harmful'' in Communications of the ACM is a selective
identification of the instances where goto really *added* something to
programming, and an extraction of those paradigms into codified programming
constructs. In other words, the language designers took what was useful and
recast it as a first class language component. Multi-level loop control
statements (e.g., as implemented in Perl's labelled loops) and exceptions
are the two important examples.
So, in a language where we have those things, is goto still useful? I'm not
sure. But, from the pro-goto side of the house, let's quote Ken Thompson
from the Plan 9 fortune file:
"If you want to go somewhere, goto is the best way to get there. K Thompson"
My own sense is that goto is never strictly necessary, and hardly ever
useful, particularly in modern programming languages where we have
exceptions and multi-level loop control constructs (when I *have* used goto
in real programs, it was almost always to break out of nested loops).
By the way: there's still another algorithm for the matrix problem that
is both elegant and efficient that avoids goto and doesn't require the
loop control constructs I used (in fact, it can be implemented entirely
in ISO standard Pascal and doesn't require a BREAK or CYCLE). Can anyone
find it?
I'm most curious, once it's discovered (I know at least one person reading
this knows the solution, since he made it up, and Nate probably does too,
since he read the M-Net thread, so it *will* be discovered) to compare it
to a solution that uses the `modern' language functions. Which is more
obvious? Which is more efficient? What other metrics would one judge
something like this by?
|
cross
|
|
response 16 of 45:
|
Jan 26 20:14 UTC 2007 |
Regarding #14; Interesting. Nice way to generalize to a totally unexpected
problem domain. I wonder if you can't avoid the left join by using a
subselect, however...
|
nharmon
|
|
response 17 of 45:
|
Jan 26 20:17 UTC 2007 |
Let me repost that:
So lets say this matrix was stored in a MySQL Database. :-)
mysql> select * x;
+----+------+------+-------+
| id | row | col | value |
+----+------+------+-------+
| 1 | 1 | 1 | 0 |
| 2 | 1 | 2 | 2 |
| 3 | 2 | 1 | 0 |
| 4 | 2 | 2 | 0 |
+----+------+------+-------+
4 rows in set (0.00 sec)
mysql> SELECT x.row FROM x
LEFT JOIN (SELECT row,SUM(value) AS sum FROM x GROUP BY row) AS x2
ON x.row = x2.row
WHERE x2.sum = 0 ORDER BY row LIMIT 1;
+------+
| row |
+------+
| 2 |
+------+
1 row in set (0.02 sec)
|
nharmon
|
|
response 18 of 45:
|
Jan 26 20:18 UTC 2007 |
I'm not exactly sure if this gives you the correct answer. But I put the
join in there because MySQL does not allow aggregate functions in the
WHERE expression.
|
albaugh
|
|
response 19 of 45:
|
Jan 26 20:22 UTC 2007 |
My opnion is that in general goto's should not be employed, but there are
always exceptions. Certainly generous use of goto's can make code
unmaintainable (especially for the inheritors). Almost never should a goto
be used to exit from the middle of a block to some other different place, or
used to jump into the middle of a block. Something like a "break", which does
a "goto" to the end of a block is not too hazardous.
|
cross
|
|
response 20 of 45:
|
Jan 26 20:39 UTC 2007 |
Interesting. What about exception handling? If I understand your prohibition
about jumping *out* of the middle of a block, that would rule out jumping out
of nested loops, no? I actually think that should be allowed, since the
alternatives often involve flag variables and lots of additional checks. Ick!
|
easlern
|
|
response 21 of 45:
|
Jan 26 21:55 UTC 2007 |
I don't know any Pascal so I'm not sure if it qualifies, but here's some
Python that only uses a while:
matrix = list()
matrix.append ([1,0,0])
matrix.append ([0,0,1])
matrix.append ([0,0,0])
matrix.append ([1,1,1])
matrix.append ([0,1,0])
def blah (matrix):
m = len (matrix)
n = len (matrix [0])
row = 0
column = 0
while (row < m):
if (matrix [row][column] != 0):
row += 1
column = 0
else:
column += 1
if (column == n):
print row
row = m
blah (matrix)
|
cross
|
|
response 22 of 45:
|
Jan 26 22:26 UTC 2007 |
Excellent. That is very similar the to method that Remmers used on M-Net,
(20 years ago, in Pascal). Here is his solution:
From Remmers (excerpt copied without permission):
----
Let (i,j) be the current (row,column) position. Assume that
all rows prior to the ith have been "rejected" (contain at least
one non-zero element) and that in the ith row, all columns
after the jth have been "accepted" (are all zero). We can
make progress toward termination by updating i and j as follows:
If x[i,j] = 0, then decrement j; if x[i,j] <> 0, then increment
i and set j to n. Termination occurs when we accept an entire
row (j = 0) or reject the nth row. (If we use j = -1 to flag
this, the loop test is simply "j > 0".)
This leads directly to the algorithm:
i := 1; j := n;
while j > 0 do
if x[i,j] = 0 then j := j-1
else if i = n then j := -1
else begin i := i+1; j := n end;
if j = 0 then writeln('1st all-zero row is ',i)
----
Whereas he tested against the column index, here you test against the
row index, but the idea is the same. You could avoid the second if
statement inside the loop by using an elif (observe that, if
matrix[row][column] != 0, then we reset column to 0 and we can never
get the case where column == n; that can only happen after we increment
column, and we needn't increment column column if that's the case. So,
we get the following modification:
def findnonzero(matrix):
m = len (matrix)
n = len (matrix [0])
k = n - 1
row = 0
column = 0
while (row < m):
if (matrix [row][column] != 0):
row += 1
column = 0
elif (column == k):
print row
row = m
else:
column += 1
I'm not too stokked about the appearance of the temporary k, but that is
just there to deal with the fact that Python arrays are also origin 0, and
we want to (explicitly) avoid the cost of calculating either (column + 1)
or (n - 1) in the conditional (though the python interpreter is probably
smart enough to do that for us).
I claim this code is equivalent to Remmers's. Clearly, it operates from
the point of view (if one will) of the row instead of the column. Whether
it will be faster, however, depends on whether non-zero elements appear
frequently in the matrix on not.
|
cross
|
|
response 23 of 45:
|
Jan 26 22:34 UTC 2007 |
Er, where I say, `we needn't increment column if that's the case'' I mean the
case that we've reset column to 0. Sorry, that was ... unclear.
|
albaugh
|
|
response 24 of 45:
|
Jan 26 23:22 UTC 2007 |
"error traps" are not goto's in the conventional sense - the code running at
the time of an exception didn't want to abort and wildly jump somewhere else
to execute something different.
|