[cpp-threads] std::atomic<> in acquire-release mode and write atomicity
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Tue Dec 16 16:56:48 GMT 2008
On Tue, Dec 16, 2008 at 03:20:59AM +0100, Alexander Terekhov wrote:
> On Tue, Dec 16, 2008 at 3:05 AM, Paul E. McKenney
> <paulmck at linux.vnet.ibm.com> wrote:
> > On Tue, Dec 16, 2008 at 02:46:32AM +0100, Alexander Terekhov wrote:
> >> On Tue, Dec 16, 2008 at 1:00 AM, Paul E. McKenney
> >> <paulmck at linux.vnet.ibm.com> wrote:
> >> [...]
> >> > I agree that std::atomic<> in acquire-release mode does not support IRIW.
> >> > Whether this is due to a failure to totally order stores or a failure
> >> > to provide cumulativity to loads is a philosophical point, at least from
> >> > what I can tell. ;-)
> >>
> >> How about
> >>
> >> P1: Y = 1;
> >> P2: if( Y == 1 ) { Z = 1; }
> >> P3: if( Z == 1 ) { assert( Y == 1 ); }
> >>
> >> ?
> >
> > Hmmm... No memory fences of any kind, so that the loads and stores
> > are all memory_order_relaxed, correct?
>
> No. We are in (load.)acquire-(store.)release mode. I meant:
>
> P1: Y.store(1, release);
> P2: if( Y.load(acquire) == 1 ) { Z.store(1, release); }
> P3: if( Z.load(acquire) == 1 ) { assert( Y.load(acquire) == 1 ); }

Well, that does put a different light on it. ;-)

OK, P1's release has no effect, as there is no prior operation.

P2's store-release to Z ensures that P2's load from Y is performed
WRT all other threads before P2's store to Z. A-cumulativity
ensures that if P2's load from Y sees P1's store, then P1's store
to Y is performed WRT all threads before P2's store to Z. These
are both stores, and hence are "applicable" to a store-release,
which on PowerPC turns into an lwsync instruction.

If P3's load from Z sees P2's store, then by B-cumulativity, P3's
load from Y is in P2's lwsync's B-set -except- that a prior store
followed by a load is not "applicable" in the case of lwsync. So, let's
turn to P3's acquire operation, which becomes a conditional-branch/isync
combination. This means that P3's load from Z is performed before
P3's load from Y WRT all threads. Because P3's load from Z returned
1, it must have been performed after P2's store to Z WRT P3.

But P1's store to Y was performed WRT all threads before P2's
store to Z, as noted earlier. Therefore, the assert is required to
see the new value of Y, and thus cannot fail. I think, anyway.

Hey, you asked!!! ;-)
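
For anyone who wants to poke at this on real hardware, here is a rough
sketch of the three-thread example in C++. It assumes the std::atomic<>
and std::memory_order spellings as they ended up in C++11, and the
helper names violation_observed() and count_violations() are made up
for illustration. Running it cannot prove the ordering argument above;
it can only fail to find a counterexample.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs P1, P2 and P3 once on fresh variables.
// Returns true only if P3 saw Z == 1 and then read Y != 1, which is
// the outcome the assert in the example forbids.
bool violation_observed() {
    std::atomic<int> Y{0};
    std::atomic<int> Z{0};
    bool violated = false;  // written only by P3, read after join()

    // P1: Y.store(1, release);
    std::thread p1([&] { Y.store(1, std::memory_order_release); });

    // P2: if( Y.load(acquire) == 1 ) { Z.store(1, release); }
    std::thread p2([&] {
        if (Y.load(std::memory_order_acquire) == 1) {
            Z.store(1, std::memory_order_release);
        }
    });

    // P3: if( Z.load(acquire) == 1 ) { assert( Y.load(acquire) == 1 ); }
    // The assert is replaced by a flag so a failure can be counted.
    std::thread p3([&] {
        if (Z.load(std::memory_order_acquire) == 1) {
            violated = (Y.load(std::memory_order_acquire) != 1);
        }
    });

    p1.join();
    p2.join();
    p3.join();
    return violated;
}

// Hypothetical helper: repeats the test and counts forbidden outcomes.
int count_violations(int trials) {
    int violations = 0;
    for (int i = 0; i < trials; ++i) {
        if (violation_observed()) {
            ++violations;
        }
    }
    return violations;
}
```

Calling count_violations(1000) should return 0 on a conforming
implementation, since the release/acquire pairs on Y and then Z chain
P1's store ahead of P3's load of Y. Most runs will not even reach the
interesting interleaving, so a zero count is weak evidence at best.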
Thanx, Paul