[cpp-threads] D2335 (sequential consistency proof) revision

Fri Aug 24 00:50:57 BST 2007

Lawrence wrote:
> It seems to me that the approach we would with CAS should also apply
> to locks.

But CAS and lock() are different operations. CAS is low-level and can be used for reading as well as writing (well, it necessarily is about writing) hence needs both acquire and release in the success case at least. But taking a lock begins a critical section and so is only an acquire operation. No?

> That is, without decoration they are SC, not just acquire
> or release.

Why should taking a lock be a full fence? The only use case I know of for making a lock operation a full fence is the anti-idiom of abusing a trylock or timed lock to detect a lock taken by another thread, as Hans noted below.

I thought that taking a lock enters a critical section, which only needs to be an acquire.

> (The lock is, after all, RMW.)

Sort of. Taking a lock generally does both reading and writing, but for ordering purposes a lock need only be an acquire. (Hence something like CAS_acquire.)

> On 8/16/07, Boehm, Hans <hans.boehm at hp.com> wrote:
> > (Note to core group: The rest of this is mostly a library issue.  I
> > don't think N2334 is affected either way.)

[so I've dropped -core herefrom]

> > Consider the following (variant of a well-known) example, where x is
> > atomic (initially zero, SC ops only) and l1 is initially unlocked:
> >
> > Thread 1:
> >         x.store(1);
> >         l1.lock();
> >
> > Thread 2:
> >         r2 = trylock();  // fails
> >         r3 = x.load();   // yields 0
> >
> > Note that lock and trylock are both only acquire operations, even though
> > the trylock reads the value written by the lock.
> >
> > Since there are no synchronizes-with relationships and hence no
> > happens-before relationships across threads, and

I think this next part is a problem:

> > the current plan is to
> > insist on a total order only among SC atomics, not locks,

I didn't realize that. I don't agree, because atomics and locks (and, sigh, fences) should be interchangeable for synchronization purposes. They all express (parts of) critical sections, and we shouldn't have multiple types of critical sections that have odd interactions. Besides, then programs wouldn't be SC, right? And I thought the goal of having SC atomics (esp. as the default) was so that programs that use just locks (correctly, meaning not the trylock trick) and SC atomics are SC.

In particular, just as

  x.store(1);
  y.load(1);

can't be reordered, which is slightly stronger than just read=acquire store=release, I thought that

  mut.unlock();
  y.load(1);

and

  x.store(1);
  mut.lock();

can't be reordered, as they're analogous cases. No?

Hans proposes:

> > We should instead insist that there be a single total
> > order, covering both atomics and locks and consistent with hb, such that
> > each atomic operation and lock sees the previous write to the
> > corresponding object in this sequence.

> > I think this makes it easier to
> > characterize SC-compatible uses of trylock().

Could you explain what example(s) you have in mind? Thanks.

> > But in implementation terms, this change would seem to require that a
> > failed trylock still needs enough ordering semantics to prevent
> > reordering with a subsequent SC load.

Yes, I thought we were already doing that because I assumed it would be inherent in the decision to make atomics SC by default and say that programs that use locks and default atomics would be SC. No?

Herb