[cpp-threads] Visibility question
Peter Dimov
pdimov at mmltd.net
Wed Aug 2 19:47:39 BST 2006
Boehm, Hans wrote:
>> From: Peter Dimov
>>
>> Do you know of an architecture/platform where <ordered>
>> atomics aren't (going to be) fences?
> I don't know of any such hardware platforms.
My arguments in favor of <ordered> atomics being fences are:
1. They are going to be fences in practice. Programmers quickly learn to
ignore specifications that claim that things are going to break on the
hypothetical platform Y' when existing practice clearly demonstrates that
things work fine everywhere. So there is a risk that the memory model will
end up being ignored, and we probably don't want that.
2. Making <ordered> atomics fences makes the model more accessible for
mortals who think in terms of "can X be reordered to cross Y" instead of
sync-with edges.
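
A minimal sketch of the "can X be reordered to cross Y" way of reading code. It assumes the later std::atomic interface, which did not exist when this thread was written, as a stand-in for <ordered> atomics. The names run_message_passing, payload, ready and seen are hypothetical. It only shows the reordering question a programmer would ask; it does not distinguish fence semantics from sync-with semantics, since both readings agree on this particular case.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical illustration; std::atomic with its default (sequentially
// consistent) ordering stands in for the <ordered> atomics discussed above.
int run_message_passing() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // the "ordered" atomic
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                 // X: ordinary write
        ready.store(true);            // Y: ordered store; X must not move below Y
    });
    std::thread consumer([&] {
        while (!ready.load()) {       // ordered load; the read of payload
            std::this_thread::yield();  // must not move above it
        }
        seen = payload;               // observes the producer's write
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Under either reading the consumer sees the value written before the ordered store, so run_message_passing() returns 42. The fence-style question is simply whether the write marked X can cross the store marked Y.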
> They probably wouldn't be fences on a software DSM platform. (There
> are arguments about whether we should care. Given that memory
> latencies and minimum packet latencies over a network seem to be
> getting surprisingly close, I'd be inclined to say that we should.)
>
> The other general concern, recently also expressed to me by David
> Callahan, is that under the right conditions, it should be possible
> for the compiler to merge excessively fine-grained threads, and
> remove the synchronization between them. This argues that
> synchronization operations should not have global implications beyond
> the threads that access the synchronization object. In
> that case, a synchronization object that is touched only by a single
> thread after such a merge can be removed, or replaced by a usually
> much cheaper single-threaded construct.
It makes sense to optimize an atomic that operates on a thread-private
location into a non-atomic; and it also makes sense to optimize out an
atomic entirely if the location isn't referenced anywhere else in the
program. But if I understand correctly, the current formulation is somewhere
in between. It doesn't allow the ordered accesses to y and z to be optimized
out, and it also doesn't give fence semantics to ordered atomics.
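
To make the first optimization concrete, here is a hedged sketch of an atomic on a thread-private location next to the non-atomic form it could be reduced to. It again assumes the later std::atomic interface. The function names are hypothetical, and nothing here claims that any particular compiler performs this transformation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: the atomic is a local that never escapes this
// function, so no other thread can ever access it.
int sum_with_private_atomic(int n) {
    std::atomic<int> total{0};
    for (int i = 1; i <= n; ++i) {
        total.fetch_add(i);   // ordered read-modify-write on a private location
    }
    return total.load();
}

// The single-threaded form the function above could be reduced to,
// provided the atomic operations have no effect beyond the location itself.
int sum_with_plain_int(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Both functions compute the same sum. If ordered atomics were fences, the first could not simply be rewritten as the second wherever surrounding memory accesses rely on the fence, which is the tension described above.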