[cpp-threads] SC on PPC (was Re: Increment/decrement operators on atomics package)

Paul E. McKenney paulmck at linux.vnet.ibm.com
Mon Apr 30 22:03:25 BST 2007


On Mon, Apr 30, 2007 at 10:49:04PM +0200, Alexander Terekhov wrote:
> On 4/30/07, Paul E. McKenney <paulmck at linux.vnet.ibm.com> wrote:
> >On Mon, Apr 30, 2007 at 09:09:12PM +0200, Alexander Terekhov wrote:
> >> On 4/30/07, Raul Silvera <rauls at ca.ibm.com> wrote:
> >> >
> >> >Alexander Terekhov wrote on 04/30/2007 08:10:19 AM:
> >> >
> >> >> How does cumulativity help in the IRIW case?
> >> >>
> >> >> P1: x = 1;
> >> >> P2: y = 1;
> >> >> P3: r1 = x; r2 = y;
> >> >> P4: r3 = y; r4 = x;
> >> >>
> >> >
> >> >The short version of this is that cumulativity on a hwsync between the
> >> >two loads on P3 would cause a StoreLoad ordering between P1's store to x
> >> >and P3's load of y.
> >>
> >> This is rather intriguing because unless I'm just missing something,
> >> cumulativity is defined the same for all barriers including
> >> lwsync/eieio and it comes into play when *P3* makes a post-barrier
> >> store which is observed by another processor.
> >
> >Not for isync or for ordering based on dependencies, however.
> 
> But isync is not really a bidirectional barrier. It's just a way to
> achieve .acq using fake or real dependencies.

It can indeed be used to achieve .acq in some cases, but as near as I
can tell its original intent was to handle the case where the subsequent
instruction stream depended on prior operations, for example, after
mapping in a module but before executing it.

> >B-cumulativity does indeed cover the case where a post-barrier store
> >is observed by another processor.  In this case, subsequent operations
> >on this other processor will observe all applicable pre-barrier
> >operations.  Of course, "subsequent" must be enforced, and in the
> >case of "lwsync", a pair consisting of a pre-barrier store and a
> >post-barrier load is not "applicable".
> >
> >A-cumulativity covers the case where a pre-barrier operation
> >observes an operation by some other CPU, and forces post-barrier
> >operations to also observe that other CPU's operation.
> 
> To me, cumulative "post barrier" means that we have something in the
> B set (after expansion) on another processor "... performed after a
> Load instruction executed by that processor or mechanism has returned
> the value stored by a store that is in B."
> 
> http://www.decadentplace.org.uk/pipermail/cpp-threads/2007-January/001369.html
> http://www.decadentplace.org.uk/pipermail/cpp-threads/2007-January/001373.html

There has to be a store post-barrier on one CPU, the value of which is
loaded by some other CPU.  The example in the last URL is more complex,
however, as it combines control dependency with barriers.
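
A minimal sketch of that simpler shape (pre-barrier store, barrier,
post-barrier store on one CPU; load of the post-barrier store, then a
subsequent load on the other) might look like the following.  It is
written against the std::atomic / std::atomic_thread_fence interface
of C++11, which is an assumption here since it is not what this thread
had in hand.  The names run_once, Observation, and outcome_allowed are
hypothetical, and the comment relating the release fence to lwsync
reflects the usual compiler mapping rather than anything the
architecture documents say about C++.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names throughout; an illustration, not code from this thread.
struct Observation {
    int flag_seen;  // value "CPU 1" loaded from the post-barrier store
    int data_seen;  // value "CPU 1" loaded afterwards from the pre-barrier store
};

Observation run_once() {
    std::atomic<int> data{0};
    std::atomic<int> flag{0};
    Observation obs{0, 0};

    std::thread cpu0([&] {
        data.store(1, std::memory_order_relaxed);             // pre-barrier store
        std::atomic_thread_fence(std::memory_order_release);  // barrier (commonly lwsync on PPC; assumption)
        flag.store(1, std::memory_order_relaxed);             // post-barrier store
    });

    std::thread cpu1([&] {
        obs.flag_seen = flag.load(std::memory_order_relaxed); // may or may not observe the post-barrier store
        std::atomic_thread_fence(std::memory_order_acquire);  // enforces "subsequent" on this CPU
        obs.data_seen = data.load(std::memory_order_relaxed); // subsequent operation
    });

    cpu0.join();
    cpu1.join();
    return obs;  // written by cpu1 only, read after join
}

// The one outcome the ordering rules out: the post-barrier store was
// observed, yet the pre-barrier store was not.
bool outcome_allowed(const Observation& o) {
    return !(o.flag_seen == 1 && o.data_seen == 0);
}
```

Calling run_once() in a loop and checking outcome_allowed() on each
result should never fail.  If the acquire fence on the second thread
is dropped, "subsequent" is no longer enforced and the check is no
longer guaranteed to hold.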

> IOW with an empty "B-cumulativity", whatever you add to the A set
> (calling it "A-cumulativity") is not constrained at all with respect
> to other processors.

By "empty B-cumulativity", do you mean that there is no store following
the barrier providing the cumulativity, or do you mean that there is
no load referencing that store, or something else entirely?

						Thanx, Paul


