[cpp-threads] catch-up

Sat Oct 22 15:38:08 BST 2005

I've been living in a parallel universe surrounding some of the list
traffic all week, occasionally quickly scanning list mail, while being
at the Synchronization and Concurrency in Object-Oriented Languages
(SCOOL) workshop (http://research.microsoft.com/~tharris/scool05/) and
OOPSLA and talking with people about some similar issues. Some
catch-up:

1. Enough people from Intel who can speak authoritatively about this
for me to confidently believe it have said (a) "locked" instructions
and mfence DO have global ordering properties on current and
near-future x86s (b) Intel now realizes that this should have been
documented and will try to ensure that it is (c) Intel does not want
to promise that this will hold forever, and might be interested in
engaging with different language-level standards groups to see if
there is a way to weaken total SC-ness of lock/volatile/atomic specs
to avoid multiple observer ordering agreement requirements that do not
impact practical programs.

2. I agree with Herb that locks are problematic constructs. But
transactional memory isn't here yet, and even when it arrives, people
will still need all sorts of other concurrency control constructs to
go along with it. Atomic regions make it easier to write "sequential"
code that still works in concurrent environments, but they don't do a
lot for most people writing explicitly parallel/concurrent
programs. Historically, this has entailed highly diverse constructs --
futures, event-based producer-consumer, barrier-based bulk-parallel,
CSP-ish handoffs, and so on. If we know anything about concurrency
support from the past 40 years, it is that no one size fits all.
(And also that language designers will always stubbornly refuse to
believe this :-)

3. Notice that just about all constructs that involve blocking across
multiple threads (even simple blocking IO) have the potential for
liveness failures -- lockups, deadlocks, livelocks; it is intrinsic to
concurrency. Usually, the right way out takes some thought and defies
automation. Hans's discussion of client-callbacks is a good
example. Invoking a callback while holding a lock is rarely a good
idea. Changing application-level designs to not require such a
callback but instead to issue an async event (among other similar
options) is usually much better all around.  This would hold even if
you had transactions in which case callbacks don't lockup, but might
often have surprising/wrong behavior. (Aside: One reason some people
find concurrent OO programming difficult is that they have to face the
fact that several common OO constructions weren't such great ideas
after all. The threat of inconsistent state during callbacks is an
often-ignorable problem in sequential OO programs where things often
work out OK even though they were not quite right, but deadlock is
harder to ignore.)

4. Considering (2) and (3) and related issues, my take is that no
small number of concurrency constructions satisifies a clear majority
of users in practice, which means that the issue is in the realm of
library/API design not language design. (See some slides I wrote about
some of this last year at http://www.research.ibm.com/vee04/Lea.pdf)
Additionally, you can remove some of the problems of locks in
particular by including libraries of classes that usually have safety
and liveness problems when users write them themselves.  The main
focus here is usually collection classes (like those we've been adding
to java.util.concurrent), where you can create libraries of
cool/exotic non-blocking implementations (or often, lock-based
implementations using only hidden "leaf" locks, and/or RW locks etc).
(Herb: This is phase 3 for Java. First, only lock-based collections,
Second, add non-locked ones with optional wrappers, Third,
"concurrent" collections that are safe and never lock up.)  These can
be made to compose nicely, although in part by supporting weaker
semantics than some people expect.  For example,
aCollection.addAll(anotherCollection) is not an atomic operation
for most java.util.concurrent collections.
Readers of aCollection may see it in a state where only some of the
elements of anotherCollection have been added. Usually, this is what
people want anyway, and when they don't, they realize that they need
some secondary coordination. Maybe using transactional collections
would be even better here, but I'm not so sure about this. (I DO like
language-based support for transactions; I'm just worried that people
are expecting too much for them.)

5. Considering (4) and related issues, I think you want the most
minimally invasive language/syntax support for threads, locks, etc --
something in which for example, monitor-style classes in the style of
uC++ are really easy to express, and similarly for Kevlin's
task-centered stuff, but none are hard-wired syntactically.  This is
again mostly what Java did, defining a "Thread" class, which is
magical (i.e., contains mostly instrinsified methods) but not
syntax-driven. It is also approximately what I think Hans is
suggesting for building on top of pthreads, which is the most boring,
hence likely to succeed, way to do this. It sure would be nice if
someone were to flesh this out more. (Not me though.)

6. Some of the boostrapping issues raised by Peter are eerily
reminiscent of Java rules for "final" fields, which is the most
complicated part of the JMM.  You'd like rules ensuring that, unless
programmers do something stupid, write-once fields are actually
written (happen-before) publication of the objects with those fields
allowing access by other threads.  And even this was easier than C++
because Java has rules for class-loading that avoid circularities etc.
Offhand, in the absence of modules with controllable initialization,
it seems that a lot of lower-level initialization code will need to
make extensive use of double-check and avoid static state as much as
possible. (Maybe we could just have the rule that concurrent C++
programs that include global variables have no semantics at all :-)

-Doug