[cpp-threads] Alternatives to SC

Mon Jan 29 15:59:18 GMT 2007

Hans Boehm wrote:

> 
> 3) The current N2052 memory model proposal disallows all data races
> on ordinary variables.  

It's taken me a while to see how the stance of not providing ANY
semantics for programs with races makes C++ memory model so much
harder and more controversial than Java.  Backing up to explain:

The atomic classes perform two different roles.

The first role is to allow some variation in how you can achieve
correct "strong" semantics. For the main example, some correct uses of
atomic variables rely on  the combination:
   {writes/CASes followed by StoreLoad barrier,
    reads use only local ordering barrier}
While others rely on the combination:
   (writes/CASes use only local ordering barrier,
    all reads are "confirmed" via acquire-mode CAS, entailing barrier}
You often want the former, but most spinlocks (although not
blocking locks), some flavors of double-check, and some other common
constructions are more efficient using variations of the latter
(including cases involving variables that, by virtue of monotonicity
constraints, don't actually need to be confirmed.) But both are valid.
However. some other usage combinations do not actually provide
"strong" (SC or CCCC) semantics even though they are by definition
data-race free.

The second role is to provide tools for dealing with variables that
are sometimes used in thread-local mode (i.e., as normal variables),
but may at other times be read or written across threads. This is
experts-only territory.  To be used correctly, the programmer must
have deep appreciation of the invariants governing local vs shared
modes. In most cases, a compiler will have no chance itself of proving
correctness of locality assumptions. Although one would hope that they
get better at this sort of thing, they will never be perfect at it.
This is not much different than proving correctness of programs that
mix atomic/volatile and normal variables, where there are races
involving the normal ones, but the ultimate result is correct.  In
Java, one can write programs like this.  For example, I have a
component in Java that is correct wrt the JMM with probablility 1
because it relies on randomized processing that eventually
converges.

I'm not sure I could write such a program in C++ under the no
semantics for programs with data-races approach. And some of the
linux-kernel and related examples seem to fall under this category
too. (Hmm; maybe they will start writing such code in Java instead :-)

While I was among those who agreed with the data-race-free-only
approach for C++, it strikes me now that there is a significant
portion of the C/C++ user base for which this is not likely to be
desirable. So maybe it is time to revisit the underlying
issues. Starting with your:

> 
> 	r1 = x;
> 	if (r1 < 2) {
> 	    ...
> 	    switch(r1) {
> 	      case 0:
> 	      case 1: ...
> 	    }
>         }
> 
> Under suitable conditions, a compiler may
> 
> - Spill r1 during the first ellipses.
> - Reload r1 from x in the switch expression.
> - Implement the switch statement using a branch table, and omit the
> range check, since r1 is known to be in range.
> 
> This means that a concurrent change to x, even if it is just used
> for statistical purposes, will result in a wild branch.
> 

In this and related cases, all that a compiler needs to do is assign
an activation-stack-frame location to hold the spill.  (That's what's
done in Java VMs). I have trouble believing that this is not a better
option anyway -- in part because later reading from frame is very unlikely
to encounter a cache miss.

I think the other issues along these lines involve invented
writes. These might be more controversial. Do you have a list of
them handy?

-Doug