[cpp-threads] Alternatives to SC

Mon Jan 29 23:09:09 GMT 2007

On Mon, Jan 29, 2007 at 02:11:46PM -0600, Boehm, Hans wrote:
> > From:  Doug Lea
> > 
> > Hans Boehm wrote:

[ . . . ]

> > I'm not sure I could write such a program in C++ under the no
> > semantics for programs with data-races approach. And some of the
> > linux-kernel and related examples seem to fall under this category
> > too. (Hmm; maybe they will start writing such code in Java instead :-)
>
> I think that you should be able to write these using "raw" atomics in
> C++.  The intent is for that to give you atomicity at low cost, which is
> kind of what you get from normal Java variables.  (Actually, I think the
> Java approach of guaranteeing this for certain types and not others
> wouldn't work well for C++, since it doesn't interact well with either
> templates or narrow bus embedded processors.  I have no idea how
> important the latter actually are for C++.)

The action that the compiler would need to take to make this guarantee for
non-native data formats does indeed depend on the software environment.

For single-processor narrow-bus embedded processors, one can simply
disable interrupts across the access.  Unless one must interact with
NMI handlers, in which case the only thing that can be atomic is the
native data types of the CPU.

For multiprocessors, the compiler could invent a per-instance lock.
However, this would have problems if the data structure were accessed
from interrupts, signal handlers, or NMI handlers.  Disabling interrupts
across the access is not to expensive, but disabling signal handlers
is quite expensive on some systems.  Disabling NMIs is of course a
contradiction in terms.

If atomic pointers are supported by the hardware, I can solve this
using RCU.  The idea is to use a pointer to the variable, and to update
by updating the pointer.  RCU is used to reclaim the memory corresponding
to the old version.  Of course, RCU is not universally available.

So it might be necessary to restrict this to native data types.  :-(

							Thanx, Paul

> > While I was among those who agreed with the data-race-free-only
> > approach for C++, it strikes me now that there is a significant
> > portion of the C/C++ user base for which this is not likely to be
> > desirable. So maybe it is time to revisit the underlying
> > issues. Starting with your:
> > 
> > 
> > > 
> > > 	r1 = x;
> > > 	if (r1 < 2) {
> > > 	    ...
> > > 	    switch(r1) {
> > > 	      case 0:
> > > 	      case 1: ...
> > > 	    }
> > >         }
> > > 
> > > Under suitable conditions, a compiler may
> > > 
> > > - Spill r1 during the first ellipses.
> > > - Reload r1 from x in the switch expression.
> > > - Implement the switch statement using a branch table, and omit the
> > > range check, since r1 is known to be in range.
> > > 
> > > This means that a concurrent change to x, even if it is just used
> > > for statistical purposes, will result in a wild branch.
> > > 
> > 
> > 
> > 
> > In this and related cases, all that a compiler needs to do is assign
> > an activation-stack-frame location to hold the spill.  (That's what's
> > done in Java VMs). I have trouble believing that this is not a better
> > option anyway -- in part because later reading from frame is 
> > very unlikely
> > to encounter a cache miss.
> > 
> > I think the other issues along these lines involve invented
> > writes. These might be more controversial. Do you have a list of
> > them handy?
> I don't, and that's really what would scare me about trying to change
> this.  I think this assumption is pervasive in all kinds of compiler
> analyses, and nobody remembers where it was made.  Note that the above
> example still has the same problem if x is mentioned explicitly
> everywhere that r1 is used (and the r1 = x assignment is removed).
> Conventional flow analysis will still prove (assuming that x is
> unsigned) that x has to be either 0 or 1 when the switch statement is
> entered, end hence the bounds check for the branch table can be elided.
> 
> It seems to me that essentially any flow- or context-sensitive compiler
> analysis makes this sort of assumption by default.  For Java, compiler
> implementers were forced to think about this from the start.  (And a
> smaller number probably actually did :-) )  For C++, the tradition is
> mostly to think about single-threaded executions.  And Posix basically
> tried to say that was OK.  And I suspect that at least until very
> recently, Microsoft largely followed those rules as well.
> 
> The fact that C++ performance tends to rely on heavyweight static
> optimization, where Java performance relies more on dynamic information
> and lighter-weight optimizations also doesn't help.
> 
> Among the other consequences of trying to "fix" this:
> 
> - We need to specify a set of types that are always atomically loaded
> and stored.  My guess is that this would not fly in the C/C++ embedded
> systems community.
> - We sometimes need fences after vtable pointer initialization in stack
> allocated objects.
> 
> Hans
> 
> --
> cpp-threads mailing list
> cpp-threads at decadentplace.org.uk
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads