[cpp-threads] cpp-threads Digest, Vol 41, Issue 11

Tue Dec 7 07:07:03 GMT 2010

> From: Mark Batty
> Sent: Monday, December 06, 2010 5:46 AM
> To: cpp-threads at decadent.org.uk
> Subject: Re: [cpp-threads] cpp-threads Digest, Vol 41, Issue 11
> 
> > But now consider this hybrid version:
> >
> >        Thread 0                        Thread 1
> >        --------                        --------
> >        a.store(1, mo_seq_cst);         b.store(1, mo_relaxed);
> >                                        fence(mo_seq_cst);
> >        b.store(2, mo_seq_cst);         a.store(2, mo_relaxed);
> >
> > Here the standard allows the assertion to fail.  Why is this?  Is it
> > deliberate or is it simply an oversight?  If it is deliberate, what is
> > the justification?
> >
> > Alan Stern
> 
> I don't understand how seq_cst fences are intended to be used, but I'd
> like to find out. I checked each of your three examples against our
> computerised mathematical version of the C++0x memory model, and each
> behaves as you predicted.
> 
> Mark
> 
I'm inclined to classify this as a low-priority bug.  I can't think of any implementation-based motivation for it, though I may be overlooking something.
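
For anyone who wants to experiment with the hybrid example quoted above, here is a minimal sketch of it in the standard library's spelling.  It assumes that mo_seq_cst and mo_relaxed abbreviate std::memory_order_seq_cst and std::memory_order_relaxed, and that fence() stands for std::atomic_thread_fence().  The function name run_hybrid_once is made up for illustration.  The assertion under discussion is not reproduced in the quoted text, so the sketch only returns the final values of a and b and leaves the check to the caller.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Runs the two threads of the hybrid example once and returns the
// final values of (a, b).  Name and return convention are
// illustrative assumptions, not part of the original example.
std::pair<int, int> run_hybrid_once() {
    std::atomic<int> a{0};
    std::atomic<int> b{0};

    // Thread 0: both stores are seq_cst.
    std::thread t0([&] {
        a.store(1, std::memory_order_seq_cst);
        b.store(2, std::memory_order_seq_cst);
    });

    // Thread 1: relaxed stores separated by a seq_cst fence.
    std::thread t1([&] {
        b.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        a.store(2, std::memory_order_relaxed);
    });

    t0.join();
    t1.join();

    // Both threads have been joined, so each load returns the last
    // value in that variable's modification order.
    return {a.load(std::memory_order_seq_cst),
            b.load(std::memory_order_seq_cst)};
}
```

Calling run_hybrid_once() in a loop and tallying the results shows what a given compiler and machine actually produce.  An outcome that never appears in such a run may still be allowed by the standard, so this kind of testing can confirm that a behavior occurs but cannot rule one out.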

Fences were added late in the game, for basically two reasons:

1) To allow simple existing fence-based code to be moved forward with minimal effort.

2) To support a few relatively rare cases in which fence-based code could actually provide noticeably better performance on existing hardware.

In the large majority of cases, I would prefer to discourage people from writing new fence-based code, especially if it relies on subtle properties of those fences.  Such code is error-prone, and it overconstrains the implementation by enforcing lots of unnecessary ordering constraints.
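
As a rough illustration of the overconstraint point, here is the same message-passing handoff written both ways.  This is only a sketch: the function names and the payload value are invented for the example.  In the first version the ordering is attached to the one store and the one load that need it.  In the second, the release fence orders every earlier write in the thread before every later atomic store, not just the store to ready, and the acquire fence is similarly broad on the reading side.

```cpp
#include <atomic>
#include <thread>

// Variant A: ordering attached to the flag operations themselves.
// (Hypothetical helper name.)
int pass_with_ordered_ops() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;               // ordered after the write of 42
    });

    producer.join();
    consumer.join();
    return seen;
}

// Variant B: relaxed flag operations plus explicit fences.
// (Hypothetical helper name.)
int pass_with_fences() {
    int payload = 0;
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Constrains all earlier writes against all later atomic stores.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Constrains all earlier atomic loads against all later reads.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Both versions hand the value 42 to the consumer.  The difference is in how much freedom the implementation keeps to reorder the surrounding code, which is where the fence version gives up more than this handoff needs.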

Fences in the current standard were a compromise to provide the most essential fence functionality without introducing too much complexity into the specification, and without introducing significant implementation problems, especially in light of the fact that hardware fences don't all have the same semantics.  We particularly want to avoid slowing down the rest of the atomics to support fences.  There is no claim that fences are specified as strongly as they possibly could be.

Nonetheless, if someone can suggest improvements that don't appreciably complicate the specification, I think most of us would be very interested.  I agree that this particular behavior is weird and unexpected.  I'm not sure how important it is in practice.

Hans