[cpp-threads] Alternatives to SC

Mon Jan 15 07:12:44 GMT 2007

On Sun, Jan 14, 2007 at 03:16:47PM -0800, Herb Sutter wrote:
> Paul wrote:
> > But I get the feeling that we are talking past each other.
> 
> I think so too. I'll try to just respond to selected places where I think that seems to be happening, rather than reply exhaustively.
> 
> > > The fundamental philosophy of CCCC seems to be:
> > >
> > >   Correctly locked programs are SC.
> >
> > We could probably get ten programmers together and get twenty different
> > definitions of "correctly locked".  ;-)
> 
> I didn't think the term was confusing in this group. Informally,
> 'correctly locked' means that if two otherwise concurrent pieces of code
> (i.e., neither is otherwise ordered before the other) could manipulate
> the same shared variable and at least one is a writer, then they must
> manipulate that variable only while holding the same lock (so as to
> order themselves w.r.t. each other).

Not confusing, just wildly ambiguous.  Just one example, though, for
the moment.  Suppose a critical section increments statistical counters.
If one samples the counters without acquiring the lock, is that "correctly
locked"?  By your definition, it is most definitely not.  But many people
would argue that it is your definition, and not their code, that is buggy.

> > Don't you agree that there should be rules governing use of atomics
> > just like there are rules covering use of locking?
> 
> Sure, but I believe in defaults that are easy to use correctly and hard
> to use incorrectly. SC atomics fit that, and the other extreme of 'write
> explicit fences' fits the reverse. CCCC is trying to fall in between,
> and on the safe end of the spectrum, which is laudable.

And this illustrates one difference between us -- you are trying to
implement safety within the confines of atomics, and I believe that
it is better to permit the safety to be added at a higher level.

> > >   Is an atomic variable a monitor?
> >
> > The word "monitor" has all sorts of connotations, and I so I am not
> > sure I understand the question.
> 
> Sorry, I thought I was being clear. I added:
> 
> > > Or equivalently:
> > >
> > >   Does an atomic variable have internally-locked semantics?
> 
> Another attempt: Is there a total order on operations on the
> variable? (And does it correspond to program order so that the programmer
> can express what the order is?)

Total order of writes, sure.  Forces the combination of reads and writes
to be totally ordered is a bridge too far.

> > > (Aside: Relationships to lock semantics keep coming up. This question
> > > is indirectly related to the question of whether transactional memory
> > > systems should obey "single global lock" semantics.)
> >
> > I do not believe that transactional memory is ready for
> > standardization.
> 
> Sure, it's far from that. I'm just pointing out a recurring consideration.

It does seem likely that something useful will come from transactional
memory, at least once the hype dies down a bit and the wheat separates
from the chaff.

> > On the other hand, why anyone be telling people that atomics are
> > monitors?
> > If you want monitors, after all, you can easily use locking.
> 
> It's not that anyone is telling them to. It's that people (and even
> many of us on this list) constantly do assume it, even when they know
> better and are trying to analyze a weaker model. Humans find it very
> difficult to reason about partial orders.

Again, you seem to be assuming that everyone will be using atomics
in their raw form, instead of as invisible internals to library
functions and the like.  SC is just not going to save you if you
insist on forcing everyone to program at too low a level of abstraction.

> As just noted elsewhere in this thread, even the best peer-reviewed
> lock-free papers (which I think we can agree are by and for rocket
> scientists only) usually assume SC in their code, and at that are liable
> to be corrected. Nearly none try to show code written to something looser,
> say a fence-based model, _and_ say where fences should go. That makes
> me worry that it's too high a bar for real-world programming.

The problem in my studied opinion is -not- lack of SC.  It is instead
the insistence on using lock-free algorithms where other tools should
be used instead.  I did not form this opinion lightly, nor did I form
it over a short period of time.

SC will not save those who insist on using the wrong tools for the job.

And yes, there are cases where certain lock-free algorithms -can-
be the right tool for the job, for example, some of the simpler
counting and list-manipulation algorithms.

> > > So let me ask whether this is really what you intended:
> > >
> > >   Q1: If you think that for program 1 the non-SC result should be
> > > allowed, do you also agree that for program 2 the non-SC result
> > should
> > > be allowed? If not, why not?
> >
> > I do believe that a non-SC result should be allowed for Program 1, and
> > that it is reasonable to expect that Program 2 would always give SC
> > results (though, again, I have not thoroughly analyzed Program 2).
> >
> > I believe that it is eminently reasonable to expect different results
> > because very different materials are being used in construction of the
> > two programs.
> >
> > There are any number of analogs to this situation.  I will choose
> > integer
> > and floating-point arithmetic.  Integer arithmetic is associative,
> > while floating-point arithmetic is not necessarily associative.
> > Different materials, different results.
> 
> Although I disagree, being able to better understand your view is helpful.
> 
> You see it as a difference in kind. I see the two cases as not different
> in kind because the question is whether programmers can successfully
> reason about their code.

This line of reasoning could just as easily be used to argue that
floating-point arithmetic should not exist.  Let's face it -- there
is no shortage of ways to get oneself in trouble using floating-point
arithmetic.  There are entire books written on how to avoid such
trouble.

Sorry, but I just can't buy your argument.

> Let me try again to put it still another way:
> 
>   In program 2 (IRIW using a lock each for x and y), do we think
> programmers need SC to write code that they can understand and reason
> about reliably?

False qualifier.  You mention program 2, but don't actually reference
it in the rest of the sentence.  Perhaps you meant to ask:

	Would a programmer be well-advised to rely on SC in analyzing
	program 2?

The answer to this question would be "yes" -- if a program does exhibit
SC, why not make use of this fact when analyzing it.

>   If yes, what makes us think they don't need it for program 1, which
> is otherwise equivalent?

Rephrasing the question:

	Would a programmer be well-advised to rely on SC in analyzing
	program 1?

The answer to this question is "of course not".  The program is not
SC unless the underlying machine is itself fully SC, so it simply
does not make sense to rely on SC when analyzing it.

But is program 1 or 2 important, other than as a test case to
differentiate between CCCC and SC?  I do not believe that it has
any other legitimate use.

>   If no, why are we considering it a benefit that CCCC programs are
> still SC, and not trying to jettison that restriction too for still more
> potential performance?

If CCCC meets my current expectations, then I would characterize it as
the portions of SC that are actually useful in real code.  Some parts
of SC are more valuable than are others, and some parts of SC have
greater potential for performance degradation than do others.

The -real- question is why -wouldn't- we trade off the less-useful and
more performance-hurtful pieces of SC?

> Regardless of whether SC vs. CCCC is "better", I'm trying to understand
> why isn't not inconsistent to say that "CCCC maintains SC for correctly
> locked code" (incl. lock-based IRIW) is desirable and necessary for
> the programmer to be able to reason reliably about the meaning of
> his code, but that it's not necessary for the same example converted
> to atomics. Either SC is necessary for both, unnecessary for both,
> or the programmer's views of these cases are different in a way not
> yet quantified.

You seem to have a black-and-white view here: "SC good, non-SC bad".
Last I checked, programs written for SC machines were not bug free.
Keep in mind that most single-CPU machines are SC by definition.
Single-threaded programs still have bugs, sad to say.  So SC clearly
cannot be a panacea.

This programmer's view of the two cases is that they are different by
virtue of using different mechanisms, just as floating point arithmetic
is different than integer arithmetic.  Why is that so hard?

> There may be a good answer why these cases are different in terms
> of the programmer being able to reason about the code, but I haven't
> heard it yet -- saying "the proposed rules are just different" isn't an
> answer when the question is about whether the rules are necessary for
> a programmer to reason with confidence.

So you are saying that you want floating point arithmetic eliminated?

The argument is precisely analogous.  And there can be no denying that
floating point's lack of associativity makes it more difficult to
understand, and also directly results in bugs.

So, why aren't you also advocating doing away with floating point?

> > >   Q3: Combining the two parts, why do the cases where non-SC creates
> > > problems for lock-based programs not apply equivalently to program 1?
> >
> > Nice try.  ;-)
> >
> > Please note that I did not say that non-SC creates problems for lock-based
> > programs.  I -did- agree that restricted (but common) uses of locking
> > might hide non-SC behavior from the programmer, however, it is quite
> > trivial to construct non-SC locked programs.  Take any non-SC program,
> > and have each thread acquire its own per-thread lock at start, and
> > release this lock upon exit.  The program is fully locked, but retains
> > any non-SC behavior that it started with.
> 
> This hasn't helped me, unfortunately, because it hasn't addressed
> my question and because the example is not correctly locked by any
> reasonable definition.

You didn't say correctly lock-based, just lock-based.

And quite frankly, your question makes absolutely no sense to me,
except perhaps as a artifice.  Seriously.  You might as well be asking
me why the cases where opaque window glass causes problems do not apply
equivalently to bricks.  Bricks aren't -supposed- to be transparent.
Just as atomics aren't -supposed- to be SC.

Furthermore, we have had two messages thus far expressing regret that
Java atomics were defined to be SC.

Shouldn't we learn from the Java experience, rather than mindlessly
repeating those mistakes?

> The way to address the question would be to consider the cases where/why
> SC (and global ordering) are considered important for lock-based code that
> programmers can reason about, and then say why those are not important
> for being able to reason about the equivalent atomic-based code.

Again, consider associativity and floating point.  This really is an
analogous situation.

And then -listen- to the people telling you that they regret the
choice of SC for Java atomics.

> > > I realize that this comes around to the question of whether atomic
> > > operations, like locks, should be a global property, and I realize
> > > people want to make it not a global property. But if it's not global,
> > > then atomics aren't monitors, and I now realize it's that consequence
> > > that I haven't been able to swallow yet.
> >
> > This may be a key point -- I never have treated atomic variables as
> > if they were monitors.
> >
> > So, why do you want atomic variables to act like monitors when you
> > already have locking from which to construct monitors?
> 
> Because programmers, including experts who know better, continually
> make the mistake of treating them that way. It's not that we haven't
> tried; we just haven't yet come up with a weaker model which real-world
> experienced programmers have got used to using and are consistently able
> to reason about with confidence. IMO.

Programmers also continue to make the mistake of assuming that
floating point numbers are real numbers, rather than a small subset
of rational numbers.  Experts included.  This is still no reason
to eliminate floating point.

And again, I suspect you are seeing the results of insufficient
abstraction more than you are seeing the lack of SC.

> For example, I think today most people probably accept that asking the
> programmer to write explicit fences is a failure as a usable programming
> model, and that in practice, despite experience and effort, programmers
> routinely write bugs due to inadequate fences, performance hits due to
> overuse of fences, and often both.

This certainly is one of the reasons that the Linux kernel's RCU API
buries the fences into higher-level abstractions -- so that the people
using this API don't need to worry about fences.  Abstraction is the
key here -- not taking away fences so that the people implementing RCU
cannot make use of them.

Let's not throw away the baby with the bathwater here!

> > You still need to show me a compelling reason why we need atomics to be
> > monitors, especially given that we already have locking for that
> > purpose.
> 
> Because locks aren't only about that. Telling people "oh, if you want
> SC, use locks" is inadequate because: (a) locks carry other tradeoffs
> (e.g., they buy you SC, but they also buy you potential deadlock); and
> (b) it ignores that people have valid motivation to want to eliminate
> locks to avoid their drawbacks while not wanting to give up being able
> to reason successfully in terms of a monolithic shared memory.

Telling people "if you don't want SC, go pound sand" is useless as well.
We have heard people telling us that Java's choice of SC for atomics
was a mistake.  My experience also indicates that SC atomics in C/C++
would just as surely be a mistake.  Let's try learning from experience,
please!!!

						Thanx, Paul