[cpp-threads] infinite loops

Tue Aug 15 22:13:55 BST 2006

> From:  Doug Lea
> 
> I still think there is something wrong with how this problem 
> is being posed and discussed. Maybe with SC itself.
> Consider saying that a program is SC if for all k >= 0, k 
> steps of the program obey Lamport SC relations. In that case, 
> this problem seems to go away.
> 
Does it?

To make this all a bit more concrete, let's try a somewhat realistic
example:

Assume everything reachable from q is thread private, count17 and
count13 are ordinary potentially shared variables.

Can

Thread 1a:
for (foo *p = q; p != 0; p = p -> next) if (p -> data == 17) count17++;
for (foo *p = q; p != 0; p = p -> next) if (p -> data == 13) count13++;

be transformed to

Thread 1b:
for (foo *p = q; p != 0; p = p -> next) {
    if (p -> data == 17) count17++;
    if (p -> data == 13) count13++;
}

?

Assume that count13 is initially zero, the list q is cyclic containing a
single data
element of 13.

If I combine either of the above with

Thread 2:

if (count13 != 0) do_something_bad();

Clearly the transformed program can do something bad almost immediately,
i.e. after a very small number of steps.  But it seems to me that a
sequentially consistent execution of Thread 1a and thread 2 cannot.  And
the Thread 1a + Thread 2 combination is clearly data race free, since
count13 is never written.

If the list is in fact acyclic and long, the transformed program is
probably about twice as fast as the original.  And I think the
transformation is legal for sequential programs.

Thus I think that there is a real simplicity vs. performance (&
implementation cost) tradeoff here.  What makes this one unusual in my
mind is that if we make certain infinite loops exhibit undefined
behavior, I think the cost to programming simplicity seems very small.
And you seem to get some of that back by being able to diagnose errors
that were previously not conveniently reportable.

On the other side, I think we have few (no?) examples for which
compilers actually perform profitable transformations like the above.
The real problem, as Bill points out, seems to be that current analyses
ignore this issue and may hence may conceivably produce such dubious
results, probably without actually getting a lot of benefit from it.

I assume the Sun Java server compiler does something like fairly
aggressive PRE?  If so, does anybody know with some confidence whether
it gets this right and how hard it was to get it right?

Hans