[cpp-threads] Visibility question

Wed Aug 2 21:28:32 BST 2006

Herb Sutter wrote:
> I also think Hans' example should be a race, and I'll mention the
> reason responding to this other subthread below:
>
> Hans wrote:
>>> From: Peter Dimov
>>> Do you know of an architecture/platform where <ordered>
>>> atomics aren't (going to be) fences?
>
> I assume this question is asking whether it's a global property (i.e.,
> two fences in two threads cause some synchronization between threads),
> as opposed to write-read of the same variable?

It is my understanding that:

fence() is local to a thread and has the following effects:

1. The compiler is not allowed to perform reorderings across the fence;
2. All loads that have been issued before the fence are satisfied, and all 
stores that have been issued before the fence are made globally visible, 
before subsequent accesses are issued.

This means that there is a total order over all fences, and that, in any 
execution E in which a fence F1 precedes a fence F2 in this order, all 
operations that precede F1 in its thread also precede all operations that 
follow F2 in its thread.
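To make the total-order claim concrete, here is a minimal sketch of the classic store-buffering test. It assumes that fence() corresponds to std::atomic_thread_fence(std::memory_order_seq_cst) from the later C++11 library, which is an assumption on my part and not something this thread defines; both_read_zero is a hypothetical name. Whichever fence comes first in the total order, the thread whose fence comes second must see the other thread's store, so both loads returning 0 should be impossible.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs one round of the store-buffering test and
// reports whether both threads read the initial value 0.
bool both_read_zero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        // F1: assumed equivalent of fence() in the text
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);
        // F2: assumed equivalent of fence() in the text
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();

    // If F1 precedes F2, t2 must observe x == 1; if F2 precedes F1,
    // t1 must observe y == 1. Either way this should be false.
    return r1 == 0 && r2 == 0;
}
```

Without the two fences, r1 == 0 && r2 == 0 is a permitted outcome, including on x86, where a store may be delayed past a later load to a different location.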

An atomic operation with a <fence> constraint (to distinguish it from 
<ordered>) has the same properties. All x86 operations with a LOCK prefix 
are <fence> atomic operations.

Given <acquire/release/ordered>, a <fence> atomic op can be constructed by 
using <acquire> preceded by a fence(), or <release> followed by a fence(). 
For x86 read-modify-write atomic ops, which already have <fence> semantics, 
this emulation adds one unnecessary fence, which can roughly double the 
cost of the operation.
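The two emulations described above can be sketched as follows. This again assumes the C++11 std::atomic interface as a stand-in for the <acquire>/<release> constraints and for fence(); the function names are hypothetical and only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical name: <acquire> RMW preceded by a fence().
int fetch_add_fence_before(std::atomic<int>& a, int v) {
    std::atomic_thread_fence(std::memory_order_seq_cst);  // fence()
    return a.fetch_add(v, std::memory_order_acquire);     // <acquire> op
}

// Hypothetical name: <release> RMW followed by a fence().
int fetch_add_fence_after(std::atomic<int>& a, int v) {
    int old = a.fetch_add(v, std::memory_order_release);  // <release> op
    std::atomic_thread_fence(std::memory_order_seq_cst);  // fence()
    return old;
}
```

On x86 the fetch_add is typically compiled to a LOCK-prefixed instruction regardless of the requested ordering, so the explicit fence in either variant is the redundant one mentioned above.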

<fence> atomic ops are sequentially consistent.

Is my understanding correct?