[cpp-threads] D2335 (sequential consistency proof) revision

Thu Aug 16 21:58:05 BST 2007

N2335 was intended to contain proofs that the proposed memory model was
in fact ensures sequential consistency for race free programs that
satisfy some additional requirements (only SC atomics, lock(), and
unlock()).

It turned out that D2335 required a significant amount of revision,
mostly as a result of Clark's N2334
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2334.htm)
rewording.  As a result, it did not make the post-Toronto mailing, and
will need to acquire a new number.

I now have a version that looks OK to me, which I've attached.

It turns out that this raised one other question that I had overlooked.
(Note to core group: The rest of this is mostly a library issue.  I
don't think N2334 is affected either way.)

Consider the following (variant of a well-known) example, where x is
atomic (initially zero, SC ops only) and l1 is initially unlocked:

Thread 1:
	x.store(1);
	l1.lock();

Thread 2:
	r2 = trylock();  // fails
	r3 = x.load();   // yields 0

Note that lock and trylock are both only acquire operations, even though
the trylock reads the value written by the lock.

Since there are no synchronizes-with relationships and hence no
happens-before relationships across threads, and the current plan is to
insist on a total order only among SC atomics, not locks, I believe this
outcome is currently allowed, in spite of the fact that only SC atomic
operations are used.  This outcome is of course not sequentially
consistent.  And the program is race-free by virtue of mentioning only
synchronization operations.

(Herb points out that timedlock() is for present intents equivalent to
trylock().  These issues don't arise with just lock() and unlock().)

My tentative conclusion is the way this is currently proposed is
probably not a disaster (the prrofs go through without trylock(), which
inherently causes some problems anyway), but it's probably not really
what we want.  We should instead insist that there be a single total
order, covering both atomics and locks and consistent with hb, such that
each atomic operation and lock sees the previous write to the
corresponding object in this sequence.  I think this makes it easier to
characterize SC-compatible uses of trylock().

But in implementation terms, this change would seem to require that a
failed trylock still needs enough ordering semantics to prevent
reordering with a subsequent SC load.  Presumably Java already requires
this.  Does anyone feel that this is a serious issue?  I guess this
reraises some of the issues we previously discussed for CAS failure
ordering, which I previously thought we didn't have for locks.

Opinions?

Hans
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.decadentplace.org.uk/pipermail/cpp-threads/attachments/20070816/4a77a436/attachment-0001.html