[cpp-threads] std::atomic<> in acquire-release mode and write atomicity

Paul E. McKenney paulmck at linux.vnet.ibm.com
Tue Dec 16 16:56:48 GMT 2008


On Tue, Dec 16, 2008 at 03:20:59AM +0100, Alexander Terekhov wrote:
> On Tue, Dec 16, 2008 at 3:05 AM, Paul E. McKenney
> <paulmck at linux.vnet.ibm.com> wrote:
> > On Tue, Dec 16, 2008 at 02:46:32AM +0100, Alexander Terekhov wrote:
> >> On Tue, Dec 16, 2008 at 1:00 AM, Paul E. McKenney
> >> <paulmck at linux.vnet.ibm.com> wrote:
> >> [...]
> >> > I agree that std::atomic<> in acquire-release mode does not support IRIW.
> >> > Whether this is due to a failure to totally order stores or a failure
> >> > to provide cumulativity to loads is a philosophical point, at least from
> >> > what I can tell.  ;-)
> >>
> >> How about
> >>
> >> P1: Y = 1;
> >> P2: if( Y == 1 ) { Z = 1; }
> >> P3: if( Z == 1 ) { assert( Y == 1 ); }
> >>
> >> ?
> >
> > Hmmm...  No memory fences of any kind, so that the loads and stores
> > are all memory_order_relaxed, correct?
> 
> No. We are in (load.)acquire-(store.)release mode. I meant:
> 
> P1: Y.store(1, release);
> P2: if( Y.load(acquire) == 1 ) { Z.store(1, release); }
> P3: if( Z.load(acquire) == 1 ) { assert( Y.load(acquire) == 1 ); }

Well, that does put a different light on it.  ;-)

OK, P1's release has no effect, as there is no prior operation.

P2's store-release to Z ensures that P2's load from Y is performed 
WRT all other threads before P2's store to Z.  A-cumulativity
ensures that if P2's load from Y sees P1's store, then P1's store
to Y is performed WRT all threads before P2's store to Z.  These
are both stores, and hence are "applicable" to a store-release,
which on PowerPC turns into an lwsync instruction.

If P3's load from Z sees P2's store, then by B-cumulativity, P3's
load from Y is in P2's lwsync's B-set -except- that a prior store and
following load are not "applicable" in the case of lwsync.  So, let's
turn to P3's acquire operation, which becomes a conditional-branch/isync
combination.  This means that P3's load from Z is performed before
P3's load from Y WRT all threads.  Because P3's load from Z returned
1, it must have been performed after P2's store to Z WRT P3.

But P1's store to Y was performed WRT all threads before P2's
store to Z, as noted earlier.  Therefore, the assert is required to
see the new value of Y, and thus cannot fail.  I think, anyway.
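
For anyone who wants to poke at this, here is a rough C++ sketch of the
three-thread test above.  The helper names (wrc_run_once and
wrc_count_violations) are made up for illustration, and the memory_order
spellings assume the draft std::atomic<> interface.  A clean run proves
nothing about the hardware, especially since thread startup costs make
the interesting interleaving rare, but a nonzero count would mean that
something is badly broken.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the thread above): runs the P1/P2/P3
// test once.  Returns false only if P3 saw Z == 1 and then Y != 1,
// that is, only if the assert in the original example would have fired.
bool wrc_run_once() {
    std::atomic<int> Y{0};
    std::atomic<int> Z{0};
    bool ok = true;  // written only by P3, read only after join()

    // P1: Y.store(1, release);
    std::thread p1([&] {
        Y.store(1, std::memory_order_release);
    });

    // P2: if( Y.load(acquire) == 1 ) { Z.store(1, release); }
    std::thread p2([&] {
        if (Y.load(std::memory_order_acquire) == 1) {
            Z.store(1, std::memory_order_release);
        }
    });

    // P3: if( Z.load(acquire) == 1 ) { assert( Y.load(acquire) == 1 ); }
    // The result is recorded instead of asserting inside the thread.
    std::thread p3([&] {
        if (Z.load(std::memory_order_acquire) == 1) {
            ok = (Y.load(std::memory_order_acquire) == 1);
        }
    });

    p1.join();
    p2.join();
    p3.join();
    return ok;
}

// Hypothetical driver: counts the runs in which the assertion failed.
int wrc_count_violations(int iterations) {
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        if (!wrc_run_once()) {
            ++violations;
        }
    }
    return violations;
}
```

Calling something like assert(wrc_count_violations(100000) == 0) from
main() exercises it.  Build with thread support enabled (for example,
-pthread on most Unix-like toolchains).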

Hey, you asked!!!  ;-)

							Thanx, Paul


