[cpp-threads] SC on PPC
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Fri May 11 04:56:36 BST 2007
On Wed, May 09, 2007 at 09:50:07PM -0000, Boehm, Hans wrote:
>
>
> > -----Original Message-----
> > From: cpp-threads-bounces at decadentplace.org.uk
> > [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf
> > Of Raul Silvera
> > Sent: Wednesday, May 09, 2007 1:15 PM
> > To: C++ threads standardisation
> > Subject: RE: [cpp-threads] SC on PPC
> >
> >
> > Hans Boehm wrote on 05/04/2007 05:49:08 PM:
> >
> > > > > On Wed, 2 May 2007, Alexander Terekhov wrote:
> > > > > > P1: x.store_relaxed(1)
> > > > > > P2: if (x.load_relaxed()==1) { y.store_release(1) }
> > > > > > P3: if (y.load_acquire()==1) { Assert(x.load_relaxed()==1) }
> >
> > > I had unfortunately misread Alexander's example slightly, and read the
> > > load_relaxed in P2 as a load_acquire. I believe you really need to
> > > change the store_relaxed/load_relaxed in P1/P2 to
> > > store_release/load_acquire in order to guarantee that the assertion
> > > will not fail in our proposal. The P3 load of x can remain unchanged.
> > >
> > > In the release/acquire version
> > >
> > > x_initialization hb x.store_relaxed(P1) hb x.load_relaxed(P2) hb
> > > y.store_release(P2) hb y.load_acquire(P3) hb x.load_relaxed(P3).
> >
> > I assume you meant "x_initialization hb x.store_release(P1) hb
> > x.load_relaxed(P2) hb ..."
> Oops. I'll get it right eventually. I meant
>
> "x_initialization hb x.store_release(P1) hb x.load_acquire(P2) hb ..."
> >
> > > It follows that P3's load of x cannot see the zero initialization of x,
> > > since there is a store that "happens between" them.
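
For readers following along in the syntax later standardized in C++11, the release/acquire version of Alexander's example can be sketched as below. This is an illustrative sketch only: the proposal's x.store_release(1) is assumed to correspond to x.store(1, std::memory_order_release), and the helper name run_once and the ok flag are hypothetical, not from the thread.

```cpp
#include <atomic>
#include <thread>

// One run of the three-thread example, release/acquire version.
// Returns false only if P3 saw y == 1 and then read x == 0.
// (run_once and ok are illustrative names, not from the thread.)
bool run_once() {
    std::atomic<int> x{0}, y{0};
    bool ok = true;  // written only by P3, read only after all joins

    // P1: store to x with release semantics.
    std::thread p1([&] { x.store(1, std::memory_order_release); });

    // P2: acquire-load x; if it saw P1's store, publish y.
    std::thread p2([&] {
        if (x.load(std::memory_order_acquire) == 1)
            y.store(1, std::memory_order_release);
    });

    // P3: acquire-load y; the load of x itself can stay relaxed.
    std::thread p3([&] {
        if (y.load(std::memory_order_acquire) == 1)
            ok = (x.load(std::memory_order_relaxed) == 1);
    });

    p1.join();
    p2.join();
    p3.join();
    return ok;
}
```

With release/acquire on x in P1 and P2, the happens-before chain described above means P3's relaxed load must see 1 whenever P3 sees y == 1, so run_once() should return true on every run. Changing the P1 store and the P2 load to std::memory_order_relaxed removes that guarantee, although a given machine may never exhibit the failure.
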
> > >
> > > If either of the P1/P2 operations on x is "relaxed", there is no
> > > ordering between (a) x.store_relaxed(P1) and x.load_relaxed(P2). Hence
> > > (a) can become visible to either of the other threads at any time.
> > > You don't get causality in the sense in which it's used here. But you
> > > do get a transitive happens-before relation; it's just a very sparse one.
> > >
> > > This is consistent with the Java approach, if you view "relaxed" C++
> > > atomics as equivalent to ordinary (non-atomic/volatile) Java variables.
> >
> > I find this troublesome. As a programmer I don't see the
> > rationale for why you would need a release store on P1. A
> > release operation is needed to order a store with respect to
> > preceding operations, but there are no preceding operations
> > from P1 in this example.
>
> In the proposed model, a release/acquire pair is the only thing that
> enforces any visibility ordering between threads. (And SC operations
> implicitly give you release/acquire semantics.) "Relaxed" gives you
> nothing but atomicity, allowing it to be implemented as an ordinary
> load/store almost everywhere. This is what I suspect you want for a DSM
> implementation, for example. It may not be intuitive, but it does seem
> fairly simple.

There does not seem to be complete agreement among DSM experts on this
point, however.

> One could certainly think of alternate semantics. (I'm not completely
> sure what the intended semantics are for java.util.concurrent.atomic's
> lazySet. Doug may have something to add here.) I do have my doubts that
> anything we come up with for "relaxed" will be particularly intuitive.
> I can't think of a real case in which this example matters, so I would
> be hesitant to complicate the description to accommodate it. I suspect
> that one way or another, you will really need to understand the precise
> description in order to program portably at this level.

I agree that some modification to the semantics will be required in
order to accommodate bare fences. However, there are a lot of them
out there, so it seems worth doing.

> > It seems natural to me that if an atomic relaxed load
> > observes the value stored by an atomic relaxed store, then
> > the relaxed store hb the relaxed load.

There are exceptions to this involving shared caches and shared
store buffers, right? Or am I confused?

> This would have negative consequences on synchronization elimination.
> It would mean that if I have
>
> T1: x.store_relaxed(1); z.fetch_add_acq_rel(1); y.store_relaxed(1)
> T2: if (y.load_relaxed()) w.store_release(1);
>
> with z thread local (possibly after thread coalescing by the compiler),
> I still can't turn the fetch_add into an ordinary increment, since it
> orders the other operations. This affects the performance of code that
> doesn't use the low level operations, since the fetch_add (or equivalently
> lock acquisition) may be in a separate compilation unit.

Hmmm... Isn't this the general situation for optimizations? There are
certainly analogs to this example for pointer-alias analysis and the
like.
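
Hans's two-thread example above, written out in the std::atomic syntax later standardized in C++11, as a rough sketch. The globals and the function names t1 and t2 are illustrative assumptions; z is an ordinary global atomic here, standing in for a variable the compiler has proved thread-local.

```cpp
#include <atomic>

std::atomic<int> x{0}, y{0}, z{0}, w{0};

// T1: two relaxed stores with an acq_rel read-modify-write between them.
// Only T1 ever touches z in this sketch.
void t1() {
    x.store(1, std::memory_order_relaxed);
    z.fetch_add(1, std::memory_order_acq_rel);
    y.store(1, std::memory_order_relaxed);
}

// T2: publishes w with release semantics if it observes T1's store to y.
void t2() {
    if (y.load(std::memory_order_relaxed))
        w.store(1, std::memory_order_release);
}
```

As I read Hans's argument, if a relaxed load that observes a relaxed store implied happens-before, the fetch_add would order the store to x ahead of T2's release store to w, so it could no longer be reduced to a plain increment even though no other thread touches z.
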
> I think that for Java, we can't go there. Lock elimination
> is important. At least in the short term, the performance loss for
> ordinary lock-based Java code would outweigh anything we could get back
> from better atomics.

Perhaps what is needed is a way to declare a given compilation unit
or set of compilation units to have no dependencies on outside ordering.
That seems like it should handle a very large fraction of the cases
occurring in practice -- certainly many of the cases surrounding library
code.

> For C++, we have much less redundant synchronization to start with,
> but some of us feel this still matters a bit because of thread coalescing.
> This has been an issue some of us have gone back and forth on a few times.

Seems that a similar declaration would help quite a bit for C/C++ as well.

Thanx, Paul

> Hans
>
> >
> > I don't think relaxed C++ atomics are equivalent to ordinary
> > Java variables; instead, I believe they are a middle ground
> > between ordinary variables and SC atomics. The fundamental
> > issue is that, unlike ordinary Java variables, relaxed
> > operations can provide reliable communication between
> > threads, and that should be part of the memory model.
> >
> > --
> > Raúl E. Silvera IBM Toronto Lab Team Lead, Toronto Portable
> > Optimizer (TPO)
> > Tel: 905-413-4188 T/L: 969-4188 Fax: 905-413-4854
> > D2/KC9/8200/MKM
> >
> >
> >
> > --
> > cpp-threads mailing list
> > cpp-threads at decadentplace.org.uk
> > http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads
> >
>