[cpp-threads] Slightly revised memory model proposal (D2300)
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Sun Jun 24 18:38:41 BST 2007
On Sun, Jun 24, 2007 at 02:54:53AM -0000, Boehm, Hans wrote:
> > -----Original Message-----
> > From: Paul E. McKenney [mailto:paulmck at linux.vnet.ibm.com]
> > Sent: Friday, June 22, 2007 3:34 PM
> > To: Boehm, Hans
> > Cc: C++ threads standardisation; Sarita V Adve
> > Subject: Re: [cpp-threads] Slightly revised memory model
> > proposal (D2300)
> >
> > On Fri, Jun 22, 2007 at 08:38:51PM -0000, Boehm, Hans wrote:
> > > > -----Original Message-----
> > > > From: Paul E. McKenney
> >
> > [ . . . ]
> >
> > > > > An evaluation A that performs a release operation on an object
> > > > > M synchronizes with an evaluation B that performs an acquire
> > > > > operation on M and reads either the value written by A or, if
> > > > > the following (in modification order) sequence of updates to M
> > > > > are atomic read-modify-write operations or ordered atomic stores,
> > > >
> > > > Again, I believe that this should instead read "non-relaxed
> > > > read-modify-write operations" in order to allow some common
> > > > data-element initialization optimizations.
> > >
> > > Could you explain in a bit more detail? It seems weird to me that
> > > an intervening fetch_add_relaxed(0) would break "synchronizes with"
> > > relationships, but fetch_add_acquire(0) would not. Acq_rel and
> > > seq_cst inherently do not have this issue, but the others all seem
> > > to me like they should be treated identically.
> >
> > The proper analogy is between fetch_add_acquire() and
> > fetch_add_relaxed() on the one hand and load_acquire() and
> > load_relaxed() on the other. In both cases, the _acquire() variant
> > participates in "synchronizes with" relationships, while the
> > _relaxed() variant does not.
> >
> > Looking at the argument to fetch_add_relaxed(), we have
> > fetch_add_relaxed(0) breaking the "synchronizes with"
> > relationship, but then again, so would fetch_add_relaxed(1),
> > fetch_add_relaxed(2), and even fetch_add_relaxed(42).
> >
> > Or am I missing something subtle here?
>
> I think we're still misunderstanding each other. The situation this is
> addressing is roughly
>
> T1                       T2                        T3
> x.store_release(17);
>                          x.fetch_add_relaxed(1);
>                                                    x.load_acquire();
>
> which we can think of as executing sequentially in that order. Thus the
> load_acquire sees a value of 18. The question is whether the
> store_release synchronizes with the load_acquire. We agree that there
> are no synchronizes with relationships involving the fetch_add (which
> would change if the fetch_add were anything other than relaxed).
>
> Was this also the scenario you were looking at?
I was considering scenarios of this sort, but where the compiler realized
that it could safely use weaker fences for the x.store_release(), but
only in the absence of an intervening synchronizes-with-preserving
x.fetch_add_relaxed().
> If not, do you want to suggest clearer wording?
Just adding the "non-relaxed" modifier to the RMW operations would
be fine.
> > > ...would like to aim for is:
> > >
> > > - We vote the basic memory model text (N2300), possibly with some
> > > small changes, into the working paper in Toronto. I suspect that's
> > > nearly required to have a chance at propagating the effects into
> > > some of the library text by Kona.
> > >
> > > - We agree on exactly what we want in terms of atomics and fences
> > > in Toronto, so that it can be written up as formal standardese,
> > > ideally in the post-Toronto mailing.
> >
> > I cannot deny that past discussions on this topic have at
> > times been spirited, but I must defer to Michael and Raul on
> > how best to move this through the process.
>
> Clearly. And I suspect there will be interesting discussions between
> all of us in Toronto.
>
> > > > On the "scalar" modifier used here, this means that we are not
> > > > proposing atomic access to arbitrary structures, for example,
> > > > atomic<struct foo>? (Fine by me, but need to know.) However, we
> > > > -are- requiring that implementations provide atomic access to
> > > > large scalars, correct? So that a conforming 8-bit system would
> > > > need to provide proper semantics for "atomic<long long>"? Not
> > > > that there are likely to be many parallel 8-bit systems, to be
> > > > sure, but same question for 32-bit systems and both long long
> > > > and double. (Again, fine by me either way, but need to know.)
> > >
> > > In my view, most of this discussion really belongs in the atomics
> > > proposal. It should specify that if a read of an atomic object
> > > observes any of the effects of an atomic write to that object, then
> > > it observes all of them. And that applies to objects beyond
> > > scalars. But it's just an added constraint beyond what is specified
> > > in the memory model section.
> > >
> > > Our atomics proposal still proposes to guarantee atomicity for
> > > atomic<T> for any T. We just don't promise to do it in a lock-free
> > > manner. Thus atomic<struct foo> does behave atomically with respect
> > > to threads.
> >
> > In that case, my question becomes "why is the qualifier 'scalar'
> > required?".
>
> The intent here is to decompose non-scalar reads into the constituent
> scalar reads, and similarly for non-scalar assignments, in order to
> determine what assignments can be seen where. Otherwise we would have
> to talk about which combination of field assignments a particular struct
> read can see. It just seems easier to define visibility on a
> component-by-component basis.
OK. So it is then the library's responsibility to make operations on
structs appear to be atomic, right?
Thanx, Paul
> Hans
>
> >
> > > > Although the general thrust of 1.10p8 and 1.10p10 looks OK to
> > > > me, more careful analysis will be required -- which won't happen
> > > > by your deadline, sorry to say!
> > >
> > > Thanks for looking at this quickly. If we need to make some more
> > > tweaks before Toronto, I think that's fine.
> > >
> > > I think at the moment, the real constraint on getting this into the
> > > working paper is that we need to have more people reading it.
> >
> > More eyes would indeed be a very good thing from my perspective.
> >
> > Thanx, Paul
> >