[cpp-threads] Visibility question

Wed Aug 2 19:47:39 BST 2006

Boehm, Hans wrote:
>> From:  Peter Dimov
>>
>> Do you know of an architecture/platform where <ordered>
>> atomics aren't (going to be) fences?
> I don't know of any such hardware platforms.

My arguments in favor of <ordered> atomics being fences are:

1. They are going to be fences in practice. Programmers quickly learn to 
ignore specifications that claim that things are going to break on the 
hypothetical platform Y' when existing practice clearly demonstrates that 
things work fine everywhere. So there is a risk that the memory model will 
end up being ignored, and we probably don't want that.

2. Making <ordered> atomics fences makes the model more accessible for 
mortals who think in terms of "can X be reordered to cross Y" instead of 
sync-with edges.
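
To make the two ways of thinking concrete, here is a minimal sketch of the usual message-passing idiom. It assumes C++11-style std::atomic with memory_order_seq_cst as a stand-in for <ordered> atomics (an assumption, not the notation of the proposal), and the names message_passing_demo, payload and ready are hypothetical. In the fence view, the plain store X cannot be reordered to cross the ordered store Y. In the sync-with view, the store Y synchronizes with the load that observes it. For this particular idiom both readings give the same answer.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: std::atomic with memory_order_seq_cst stands in
// for the "<ordered>" atomics discussed above.
int message_passing_demo() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // ordered atomic flag
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // X: plain store
        ready.store(true, std::memory_order_seq_cst);  // Y: ordered store
    });

    std::thread consumer([&] {
        // Spin until the ordered load observes the ordered store Y.
        while (!ready.load(std::memory_order_seq_cst)) {
            std::this_thread::yield();
        }
        observed = payload;  // reads the value written by X
    });

    producer.join();
    consumer.join();
    return observed;
}
```

The two views only come apart when the ordered atomic is not the object the threads actually communicate through.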

> They probably wouldn't be fences on a software DSM platform.  (There
> are arguments about whether we should care.  Given that memory
> latencies and minimum packet latencies over a network seem to be
> getting surprisingly close, I'd be inclined to say that we should.)
>
> The other general concern, recently also expressed to me by David
> Callahan, is that under the right conditions, it should be possible
> for the compiler to merge excessively fine-grained threads, and
> remove the synchronization between them.  This argues that
> synchronization operations should not have global implications beyond
> the threads that access the synchronization object.  In
> that case, a synchronization object that is touched only by a single
> thread after such a merge can be removed, or replaced by a usually
> much cheaper single-threaded construct.

It makes sense to optimize an atomic that operates on a thread-private 
location into a non-atomic; and it also makes sense to optimize out an 
atomic entirely if the location isn't referenced anywhere else in the 
program. But if I understand correctly, the current formulation is somewhere 
in between. It doesn't allow the ordered accesses to y and z to be optimized 
out, and it also doesn't give fence semantics to ordered atomics.
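
As a rough sketch of the first optimization, again assuming C++11-style std::atomic as a stand-in for <ordered> atomics and using the hypothetical names count_with_atomic and count_with_plain: the atomic below never escapes the function, so no other thread can observe it, and a compiler could in principle treat the first function as the second. If ordered atomics were also fences, such a rewrite would additionally have to preserve the ordering of any surrounding accesses to shared memory; this example has none.

```cpp
#include <atomic>

// The atomic is thread-private: it is a local that never escapes.
int count_with_atomic(int n) {
    std::atomic<int> counter{0};
    for (int i = 0; i < n; ++i) {
        counter.fetch_add(1, std::memory_order_seq_cst);
    }
    return counter.load(std::memory_order_seq_cst);
}

// What the function above could be reduced to once the compiler knows
// that only one thread ever touches the counter.
int count_with_plain(int n) {
    int counter = 0;
    for (int i = 0; i < n; ++i) {
        ++counter;
    }
    return counter;
}
```

Both functions return n for any non-negative n; the difference lies only in what the surrounding memory model obliges the compiler to keep.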