[cpp-threads] D2335 (sequential consistency proof) revision

Thu Aug 23 23:28:43 BST 2007

It seems to me that the approach we would with CAS should also apply
to locks.  That is, without decoration they are SC, not just acquire
or release.  (The lock is, after all, RMW.)  With decoration, the
lock operations can be less than SC.  Unfortunately, declaration
will complicate the lock API.  How much performance would we lose?

On 8/16/07, Boehm, Hans <hans.boehm at hp.com> wrote:
> N2335 was intended to contain proofs that the proposed memory model was
> in fact ensures sequential consistency for race free programs that
> satisfy some additional requirements (only SC atomics, lock(), and
> unlock()).
>
> It turned out that D2335 required a significant amount of revision,
> mostly as a result of Clark's N2334
> (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2334.htm)
> rewording.  As a result, it did not make the post-Toronto mailing, and
> will need to acquire a new number.
>
> I now have a version that looks OK to me, which I've attached.
>
> It turns out that this raised one other question that I had overlooked.
> (Note to core group: The rest of this is mostly a library issue.  I
> don't think N2334 is affected either way.)
>
> Consider the following (variant of a well-known) example, where x is
> atomic (initially zero, SC ops only) and l1 is initially unlocked:
>
> Thread 1:
>         x.store(1);
>         l1.lock();
>
> Thread 2:
>         r2 = trylock();  // fails
>         r3 = x.load();   // yields 0
>
> Note that lock and trylock are both only acquire operations, even though
> the trylock reads the value written by the lock.
>
> Since there are no synchronizes-with relationships and hence no
> happens-before relationships across threads, and the current plan is to
> insist on a total order only among SC atomics, not locks, I believe this
> outcome is currently allowed, in spite of the fact that only SC atomic
> operations are used.  This outcome is of course not sequentially
> consistent.  And the program is race-free by virtue of mentioning only
> synchronization operations.
>
> (Herb points out that timedlock() is for present intents equivalent to
> trylock().  These issues don't arise with just lock() and unlock().)
>
> My tentative conclusion is the way this is currently proposed is
> probably not a disaster (the prrofs go through without trylock(), which
> inherently causes some problems anyway), but it's probably not really
> what we want.  We should instead insist that there be a single total
> order, covering both atomics and locks and consistent with hb, such that
> each atomic operation and lock sees the previous write to the
> corresponding object in this sequence.  I think this makes it easier to
> characterize SC-compatible uses of trylock().
>
> But in implementation terms, this change would seem to require that a
> failed trylock still needs enough ordering semantics to prevent
> reordering with a subsequent SC load.  Presumably Java already requires
> this.  Does anyone feel that this is a serious issue?  I guess this
> reraises some of the issues we previously discussed for CAS failure
> ordering, which I previously thought we didn't have for locks.
>
> Opinions?
>
> Hans
>
> --
> cpp-threads mailing list
> cpp-threads at decadentplace.org.uk
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads
>
>
>

-- 
Lawrence Crowl