[cpp-threads] Review comments on N2176 WRT dependency ordering

Wed Apr 18 12:08:57 BST 2007

Paul E. McKenney wrote:
> On Wed, Apr 18, 2007 at 02:17:15AM +0300, Peter Dimov wrote:

>> There's also the option of just providing dependency_fence() which
>> maps to atomic_compiler_fence( __acquire ) on everything except
>> Alpha, where it inserts an additional rmb as well.
>
> And for strongly ordered machines (e.g., C4 or TSO), this makes sense.
> The required acquire fence is free (aside from preventing some
> compiler optimizations).  However, on weakly ordered machines that
> enforce data dependencies, the acquire fence would be excessively
> expensive.

No, the fence is a compiler fence. It inserts no hardware fence
instructions; it only prevents optimizations that are equivalent to moving a
memory access upwards. It then becomes an interesting debate whether a 
particular dependency-breaking optimization X is equivalent to moving the 
memory access M upwards. :-) (Which, I believe, it generally is.)
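
To make the "equivalent to moving M upwards" claim concrete, here is a
minimal sketch of one dependency-breaking optimization, value speculation.
The names g_ptr, read_as_written and read_speculated are made up for
illustration, and the second function is only my guess at what such a
transformed body could look like, not the output of any particular compiler.
Both functions assume g_ptr is non-null and guess points to a valid int.

```cpp
#include <atomic>

// Hypothetical shared pointer, published by some other thread.
std::atomic<int*> g_ptr{nullptr};

// As written: the load of *p (the access M) depends on the value
// loaded from g_ptr, so it cannot start before that load completes.
int read_as_written() {
    int* p = g_ptr.load(std::memory_order_relaxed);
    return *p;
}

// After a hypothetical value-speculating transformation: the compiler
// guesses the pointer, loads through the guess first, and checks later.
int read_speculated(int* guess) {
    int early = *guess;                              // M, now issued before the load of g_ptr
    int* p = g_ptr.load(std::memory_order_relaxed);
    return (p == guess) ? early : *p;                // fall back to the dependent load on a miss
}
```

When the guess is right, the value returned was read before g_ptr was, which
is the same effect as hoisting M above the pointer load. That is the sense in
which I would expect a compiler fence between the two loads to forbid it.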

I think that if we have compiler fences at all, we'll have this problem of 
optimization validity in their presence anyway.
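
For reference, a rough sketch of the dependency_fence() idea from the top of
this message. It is written against the std::atomic interface that was later
standardized, on the assumption that std::atomic_signal_fence is a fair
stand-in for the atomic_compiler_fence spelling used above. Node, g_head,
publish and read_head_value are hypothetical names, and the __alpha__ test
assumes a GCC-style predefined macro. The formal memory model makes no
promise for a relaxed load followed by a compiler-only fence; the sketch
relies on the hardware respecting data dependencies, which is exactly the
point under discussion.

```cpp
#include <atomic>

// Hypothetical helper, named after the proposal in this thread.
inline void dependency_fence() noexcept {
    // Compiler-only fence: constrains compiler reordering, emits no
    // hardware fence instruction.
    std::atomic_signal_fence(std::memory_order_acquire);
#if defined(__alpha__)
    // Alpha does not order dependent loads, so add a real hardware
    // fence there (assumption: this stands in for the extra rmb).
    std::atomic_thread_fence(std::memory_order_acquire);
#endif
}

struct Node { int value; };

std::atomic<Node*> g_head{nullptr};

// Writer: initialize the node first, then publish the pointer.
void publish(Node* n) {
    g_head.store(n, std::memory_order_release);
}

// Reader: load the pointer, fence, then do the dependent load.
// Returns -1 if nothing has been published yet.
int read_head_value() {
    Node* p = g_head.load(std::memory_order_relaxed);
    if (p == nullptr) return -1;
    dependency_fence();
    return p->value;
}
```

On everything except Alpha the reader compiles to two plain loads, which is
the "free" behaviour I had in mind.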