[cpp-threads] Yet another visibility question

Tue Nov 21 06:20:55 GMT 2006

I think that even in the official N2052 proposal, the proposed 1.10p7
is fairly clear.  The load_acquire in thread 3 reads only one of the values
written to z by threads 1 and 2.  Hence there is a synchronizes-with
relationship between thread 3 and only one of threads 1 and 2, and
one of the checks in the assertion may fail.
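
As a rough sketch, the example under discussion can be transliterated into
std::atomic spelling.  Mapping fetchadd_release and load_acquire onto
fetch_add(memory_order_release) and load(memory_order_acquire) is an
assumption, as are the names LitmusResult and run_litmus.  x and y are made
relaxed atomics only so that the sketch itself has no data race.  It records
what thread 3 observed instead of asserting it, since whether both checks
must hold is exactly the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical result record: what thread 3 observed in one run.
struct LitmusResult {
    int  z_seen;   // value thread 3 loaded from z
    bool x_seen;   // x == 1 observed; meaningful only when z_seen == 2
    bool y_seen;   // y == 1 observed; meaningful only when z_seen == 2
};

// One run of the three-thread example (assumed std::atomic mapping).
inline LitmusResult run_litmus() {
    std::atomic<int> x{0}, y{0}, z{0};   // x y z initially zero
    LitmusResult r{0, false, false};

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        z.fetch_add(1, std::memory_order_release);   // fetchadd_release( &z, +1 )
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);
        z.fetch_add(1, std::memory_order_release);   // fetchadd_release( &z, +1 )
    });
    std::thread t3([&] {
        r.z_seen = z.load(std::memory_order_acquire);   // load_acquire( &z )
        if (r.z_seen == 2) {
            r.x_seen = x.load(std::memory_order_relaxed) == 1;
            r.y_seen = y.load(std::memory_order_relaxed) == 1;
        }
    });

    t1.join();
    t2.join();
    t3.join();

    // Both increments have completed once the threads are joined.
    assert(z.load(std::memory_order_relaxed) == 2);
    return r;
}
```

A single call shows one interleaving only.  A harness would call
run_litmus() many times and look for z_seen == 2 with x_seen or y_seen
false.  Not observing that outcome on a given machine says nothing about
what the specification permits.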

I don't really agree with Lawrence that this is unusable.  The rules
are fairly clear, and we all agree that non-wizards should use
fetchadd_full here anyway.

It seems to be slightly harder to specify the version in which all the
edges are introduced.  For fetch_add, there is clearly a total order among
all the operations on z.  In general, we currently do not assume a total
order on stores at this level, so the notion of "all previous" is not
well-defined.  But I think that's fixable, especially if someone can come
up with a strong reason for wanting all of the edges.
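
The total order among fetch_add operations can be made visible with a
small sketch.  Each increment returns the value it replaced, so every
increment has a distinct position and "all previous" increments is a
well-defined set.  The function name fetch_add_positions and the
std::atomic spelling are assumptions made for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Each of n threads performs one fetch_add on z and records the value it
// replaced.  Returns those values sorted.
inline std::vector<int> fetch_add_positions(std::size_t n) {
    std::atomic<int> z{0};
    std::vector<int> seen(n, -1);          // one slot per thread, no sharing
    std::vector<std::thread> threads;
    threads.reserve(n);

    for (std::size_t i = 0; i < n; ++i) {
        threads.emplace_back([&z, &seen, i] {
            seen[i] = z.fetch_add(1, std::memory_order_release);
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    // Every increment took effect exactly once.
    assert(z.load(std::memory_order_relaxed) == static_cast<int>(n));

    std::sort(seen.begin(), seen.end());
    return seen;
}
```

Because the read-modify-write is atomic, no two threads can replace the
same value, so the sorted result is 0, 1, ..., n-1.  Plain stores carry no
such evidence of their position, which is where the difficulty described
above comes from.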

(I'm a little uncomfortable with the absence of a total order on stores to
a single atomic variable, but I haven't convinced myself that it really
matters.  We do allow, for raw atomics:

Thread 1: x = 1; x = 2;
Thread 2: r1 = x; r2 = x; r3 = x;

to produce r1 = r3 = 1 and r2 = 2,

but since the compiler can reorder the loads in thread 2, I don't think
that's astonishing.  And weakly ordered atomics are weird and dangerous.)

Currently I think the choice between the two versions is a tradeoff
between simplicity of the specification and reference counting
performance.  I think current hardware gives you the stronger semantics
anyway, though a software DSM might benefit from the weaker version.
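
One plausible reading of the reference-counting concern, sketched under
assumptions: with the stronger (all edges) semantics, each decrement can be
a release operation and only the thread that drops the count to zero needs
acquire ordering, because it then sees the writes made before every earlier
decrement.  With the weaker version every decrement would presumably have
to be the fetchadd_full kind.  The type RefCounted and the functions
acquire_ref and release_ref are hypothetical names, and the std::atomic
spelling is again an assumption.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical intrusive reference count, for illustration only.
struct RefCounted {
    std::atomic<int> refs{1};   // the creator holds the first reference
    int payload = 0;
};

// Taking another reference needs no ordering of its own in this sketch.
inline void acquire_ref(RefCounted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

// Drops one reference.  Returns true when the caller held the last one and
// may now destroy *p.
inline bool release_ref(RefCounted* p) {
    // Release: this thread's earlier writes to *p are ordered before the
    // decrement.
    int prev = p->refs.fetch_sub(1, std::memory_order_release);
    assert(prev >= 1);
    if (prev == 1) {
        // Acquire only on the final decrement.  This relies on the final
        // decrement synchronizing with ALL earlier release decrements,
        // not just the most recent one.
        std::atomic_thread_fence(std::memory_order_acquire);
        return true;
    }
    return false;
}
```

A caller would write something like: if (release_ref(p)) delete p;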

Java does give you the stronger semantics, but Java volatiles have
stronger ordering properties anyway.

Hans

On Mon, 20 Nov 2006, Lawrence Crowl wrote:

> On 11/17/06, Peter Dimov <pdimov at mmltd.net> wrote:
> > Let me re-ask the same question again using slightly different wording. I'm
> > really not sure of the correct answer. "Correct" as in both "follows from
> > the memory model", and "what we want".
> >
> > // x y z initially zero
> >
> > // thread 1
> >
> > x = 1;
> > fetchadd_release( &z, +1 );
> >
> > // thread 2
> >
> > y = 1;
> > fetchadd_release( &z, +1 );
> >
> > // thread 3
> >
> > if( load_acquire( &z ) == 2 )
> > {
> >     assert( x == 1 && y == 1 );
> > }
> >
> > Will the assert pass?
> >
> > In other words, does the load-acquire from z in thread 3 introduce
> > sync-with edges to all previous store-releases to z, or just to the
> > last one?
>
> I'd need to look more carefully to be sure, but I think all edges.
>
> In any case, I think introducing only the last edge would produce
> a programming model that is effectively unusable.
>
> --
> Lawrence Crowl