[cpp-threads] SC on PPC
Boehm, Hans
hans.boehm at hp.com
Fri May 4 22:49:08 BST 2007
> -----Original Message-----
> From: Raul Silvera
> Hans Boehm wrote on 05/02/2007 11:06:12 AM:
> > On Wed, 2 May 2007, Alexander Terekhov wrote:
> >
> > >
> > > I'm talking about the cost of acquire on PPC.
> > >
> > > P1: x.store_relaxed(1)
> > > P2: if (x.load_relaxed()==1) { y.store_release(1) }
> > > P3: if (y.load_acquire()==1) { Assert(x.load_relaxed()==1) }
> > >
> > > won't abort with
> > >
> > > load_acquire: load;"branch never taken";isync // see B.2.3 Safe
> > > Fetch (Book II).
> > > store_release: lwsync;store
> > >
> > > I don't want to have more constrained
> > >
> > > load_acquire: load;lwsync
> > >
> > Certainly that's an interesting question. On the other hand, I'm
> > saying that in the proposed C++ memory model, this is allowed to
> > abort unless you change the initial store_relaxed to a store_release.
> >
> > I suspect this has little bearing on the rest of your discussion,
> > though. And I can't think of realistic situations in which you
> > wouldn't need the store_release in P1 anyway.
>
> Thanks for the clarification, Hans. I have all along been
> thinking of symmetric acquires and releases, as defined in
> N2153. So your proposal is to have causality be triggered by
> release but not acquire? Or do you need both acquires and
> releases to trigger causality? Do you foresee this changing
> in the next version of the model?
>
> I realize that this is not so interesting if all you have is
> acquire/release operations, but once you include relaxed
> operations (and fences) it is important to clearly define how
> they interact with each other. In particular, in Alexander's
> example above, why is the store_relaxed() from P1
> insufficient even though P1 doesn't issue any other memory
> operations? What if there was a release_fence() before that
> store_relaxed()?
>
I had unfortunately misread Alexander's example slightly, and read the
load_relaxed in P2 as a load_acquire. I believe you really need to
change the store_relaxed/load_relaxed in P1/P2 to
store_release/load_acquire in order to guarantee that the assertion will
not fail in our proposal. The P3 load of x can remain unchanged.
In the release/acquire version:
x_initialization hb x.store_release(P1) hb x.load_acquire(P2) hb
y.store_release(P2) hb y.load_acquire(P3) hb x.load_relaxed(P3).
It follows that P3's load of x cannot see the zero initialization of x,
since there is a store that "happens between" them.
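
For illustration, the release/acquire version can be sketched with the
std::atomic interface that was later standardized in C++11. This is
only a sketch under stated assumptions: the mapping of store_release,
load_acquire and load_relaxed onto std::memory_order_release,
std::memory_order_acquire and std::memory_order_relaxed is assumed, the
function name run_release_acquire_once is hypothetical, and the "if"
tests in P2 and P3 are replaced by spin loops so that every run actually
reaches the assertion.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the three-thread example once, with
// release/acquire on x in P1/P2, and returns the value P3 reads from x.
int run_release_acquire_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int seen = -1;  // written only by P3, read only after join()

    // P1: x.store_release(1)
    std::thread p1([&] {
        x.store(1, std::memory_order_release);
    });

    // P2: wait until x.load_acquire() == 1, then y.store_release(1)
    // (spin loop instead of the original "if": an assumption of this sketch)
    std::thread p2([&] {
        while (x.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        y.store(1, std::memory_order_release);
    });

    // P3: wait until y.load_acquire() == 1, then check x.load_relaxed()
    std::thread p3([&] {
        while (y.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        seen = x.load(std::memory_order_relaxed);
        assert(seen == 1);  // ordered after P1's store by the hb chain above
    });

    p1.join();
    p2.join();
    p3.join();
    return seen;
}
```

Each acquire load that reads the value written by a release store
contributes one "hb" edge of the chain, so the final relaxed load in P3
is ordered after P1's store.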
If either of the P1/P2 operations on x is "relaxed", there is no
ordering between (a) x.store_relaxed(P1) and x.load_relaxed(P2). Hence
(a) can become visible to either of the other threads at any time. You
don't get causality in the sense in which it's used here. But you do
get a transitive happens-before relation; it's just a very sparse one.
This is consistent with the Java approach, if you view "relaxed" C++
atomics as equivalent to ordinary (non-atomic/volatile) Java variables.
Hans