[cpp-threads] High-level concurrency issues

Herb Sutter hsutter at microsoft.com
Tue Oct 18 03:45:55 BST 2005


> Somewhat playing devil's advocate here, could someone make that
> statement about "not composable" precise?  I've certainly seen it
> before, but I think it would help to understand exactly what we're
> talking about here.

Below I've pasted the relevant section of my recent article on "Software
and the Concurrency Revolution" (coauthored with Jim Larus) that
appeared in last month's ACM Queue. The key paragraph is:

  A fundamental problem with locks is that they are not composable.
  You can't take two correct lock-based pieces of code, combine them,
  and know that the result is still correct. Modern software
  development relies on the ability to compose libraries into larger
  programs, and so it is a serious difficulty that we cannot build on
  lock-based components without examining their implementations.

Then there's some more discussion.

> This can get me into trouble if atomicity or other considerations
> require me to hold a lock during a client call-back, for example.

Or call a virtual function, or use a template parameter, or let control
flow through any other customization point into unknown code.

The general issue is that calling any unknown code while holding a lock
is a recipe for deadlock. You really are required to know the
implementation details of all code that you might call while you are
holding a lock.

> But
> in most cases, it seems to work tolerably well, and layers of calls
> seem to compose reasonably.  And I'm not 100% convinced that the cases
> in which it breaks don't have single-threaded lock-free analogs in
> which I break object invariants with an unintentional recursive call.

I'd be interested to see analogs. The main reason we're just getting
away with it right now is that few programmers are writing any
significant code that uses locks (explicitly concurrent or otherwise).

For example, imagine someone writes an object that lazily initializes
itself from a file or database, and when the lazy initialization gets
triggered it acquires a file or database lock to perform its
initialization. Now say you put an object of that type into a
synchronized container like a .NET SyncHashTable (a monitor): the
container will lock(this) and then call object.GetHashCode -- which can
trigger the lazy read and acquire that lock. That's fine in itself,
but of course now you have a deadlock just waiting to happen if any
other code in the system (that exists today or that could be written
tomorrow) might try to acquire the same file/database lock (e.g., to
lazily initialize a similar object) and then while holding that use the
hash table. Rare? Maybe. But so are all races, and there's nothing in
the type system or in mainstream tools that will tap the programmer on
the shoulder and say "hey, buddy, you're asking for deadlock here, and
oh by the way you shouldn't have tried to use those two objects together
which is incorrect even though each of them is correct on its own."
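To make the shape of that concrete, here is a rough Java sketch (the
names are invented for illustration; the original scenario is .NET's
SyncHashTable, but java.util.Hashtable is the same kind of monitor: its
methods synchronize on the table and call the key's hashCode while
holding that lock):

  import java.util.Hashtable;

  class LazyRecord {
      static final Object DB_LOCK = new Object();   // stands in for the file/database lock
      private String cachedKey;

      @Override public int hashCode() {
          synchronized (DB_LOCK) {        // lazy initialization takes the "database" lock
              if (cachedKey == null) cachedKey = loadFromDatabase();
          }
          return cachedKey.hashCode();
      }
      private String loadFromDatabase() { return "row-42"; }   // placeholder for the real read
  }

  class DeadlockShape {
      static final Hashtable<LazyRecord, String> table = new Hashtable<>();

      // Thread A: Hashtable.put is synchronized on the table and calls key.hashCode()
      // while holding that monitor, so A holds the table's lock and wants DB_LOCK.
      static void threadA(LazyRecord r) { table.put(r, "value"); }

      // Thread B: some other lazy initialization holds DB_LOCK and then touches the
      // table, so B holds DB_LOCK and wants the table's lock. Opposite order: deadlock.
      static void threadB(LazyRecord r) {
          synchronized (LazyRecord.DB_LOCK) { table.get(r); }
      }
  }

Each class is perfectly reasonable on its own; the deadlock exists only
in the composition, which is exactly the problem.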

But even that's not the worst part. The nightmare scenario, which
unfortunately is all too common in modern software, is when two types
that use locks internally (here the container type and the lazily
initialized type) are written by different vendors and are composed for
the first time on an end user's machine (thanks to plugins, DLLs and
dynamic loading, etc.). No testing can ever catch that problem before
either vendor ships their component, other than simply identifying
points where control flow can leave the vendor's code while a lock is
held. Lock-based code is just inherently unsafe to compose.

> This does not mesh at all well with a model in which all locking is
> done on entry to member function, and locks the object itself.  But
> I've stated previously that I don't like that model.  And I don't
> claim I know how to program with it.

Frankly, monitors are the worst of the lot and almost never appropriate.
They should not be encouraged; they should be actively discouraged,
because they usually lock too much, lock too little, or (more often than
not) both, and they are disasters waiting to happen because they make
deadlock that much more likely.

Java learned that, though only after shipping synchronized containers in
version 1. C++ learned it too -- just look at the trouble C++ vendors
got into with refcounted implementations of basic_string, which to be
safe had to lock the object on every possibly mutating operation; the
industry has now moved en masse away from that, and I take some credit
for making that happen by repeatedly publicizing the problem in articles
and books and talking to the vendors. (Remind me to tell you some truly
hilarious stories about that next month in Las Vegas, whoever is there.)

Herb


P.S.: Here's the excerpt from the recent Queue article.

The Problem of Shared State, and Why Locks Aren't the Answer
------------------------------------------------------------

Another challenging aspect of unstructured parallelism is sharing
unstructured state. A client application typically manipulates shared
memory organized as unpredictably interconnected graphs of objects.

When two tasks try to access the same object, and at least one of them
could modify its state, we have a data race unless we do something to
coordinate the tasks. Races are bad, because the concurrent tasks can
read and write inconsistent or corrupted values.

There exists a rich variety of synchronization devices that can prevent
races. The simplest of these is a lock. Each task that wants to access a
piece of shared data must acquire the lock for that data, perform its
computation, and then release the lock so other operations on the data
can proceed. Unfortunately, although locks work, they pose serious
problems for modern software development.
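In code, the protocol is just acquire, compute, release. A minimal Java
sketch, with invented names, using an explicit lock:

  import java.util.concurrent.locks.ReentrantLock;

  class Counter {
      private final ReentrantLock lock = new ReentrantLock();  // guards 'value'
      private long value;

      long increment() {
          lock.lock();           // acquire before touching the shared data
          try {
              return ++value;    // the protected computation
          } finally {
              lock.unlock();     // always release, even if the computation throws
          }
      }
  }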

A fundamental problem with locks is that they are not composable. You
can't take two correct lock-based pieces of code, combine them, and know
that the result is still correct. Modern software development relies on
the ability to compose libraries into larger programs, and so it is a
serious difficulty that we cannot build on lock-based components without
examining their implementations.

The composability issue arises primarily from the possibility of
deadlock. Deadlock, in its simplest form, happens any time two locks
might be acquired by two tasks in opposite orders; task T1 takes lock
L1, task T2 takes lock L2, and then T1 tries to take L2 while T2 tries
to take L1. Both block forever. Because this can happen anytime two
locks can be taken in opposite orders, calling into code you don't
control while holding a lock is a recipe for deadlock.
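In its smallest form the inversion looks like this (a hedged Java
sketch; the lock and task names mirror the description above):

  class OrderInversion {
      static final Object L1 = new Object();
      static final Object L2 = new Object();

      // Task T1: takes L1, then L2.
      static void t1() {
          synchronized (L1) {
              synchronized (L2) { /* ... */ }
          }
      }

      // Task T2: takes L2, then L1 -- the opposite order.
      static void t2() {
          synchronized (L2) {
              synchronized (L1) { /* ... */ }
          }
      }
      // If T1 holds L1 just as T2 holds L2, each waits forever on the other's lock.
      // Neither function is wrong in isolation; the bug is in the combination.
  }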

But that is exactly what extensible frameworks do, as they call virtual
functions while holding a lock. Today's best-of-breed commercial
application frameworks all do this, including the .NET Frameworks and
the Java standard libraries. We have gotten away with it because
developers aren't yet writing lots of heavily concurrent programs that
do frequent locking. Many complex models attempt to deal with the
deadlock problem, such as backoff-and-retry protocols, but they require
high programmer discipline and some introduce their own problems (e.g.,
livelock).
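One such protocol is try-lock with backoff: if the second lock cannot
be taken, release everything and retry after a short random pause. A
hedged Java sketch (names invented); it avoids deadlock but, as noted,
two threads that keep retrying in step can livelock:

  import java.util.concurrent.ThreadLocalRandom;
  import java.util.concurrent.locks.ReentrantLock;

  class BackoffTransfer {
      static void withBoth(ReentrantLock a, ReentrantLock b, Runnable action)
              throws InterruptedException {
          while (true) {
              a.lock();
              if (b.tryLock()) {               // never block while already holding 'a'
                  try { action.run(); return; }
                  finally { b.unlock(); a.unlock(); }
              }
              a.unlock();                      // back off: release everything and retry
              Thread.sleep(ThreadLocalRandom.current().nextInt(1, 10));
          }
      }
  }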

Techniques for avoiding deadlock by guaranteeing locks will always be
acquired in a safe order do not compose, either. For example, lock
leveling and lock hierarchies prevent programs from acquiring locks in
conflicting orders by requiring that all locks at a given level be
acquired at once in a predetermined order, and that while holding locks
at one level, you can only acquire additional locks at higher levels.
Such techniques work inside a module or framework maintained by a team
(although they're underused in practice), but they assume control of an
entire code base. That severely restricts their use in extensible
frameworks, add-in systems, and other situations that bring together
code written by different parties. 
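Lock leveling can be sketched as a runtime assertion: each lock carries
a level, and a thread may only acquire locks at levels strictly higher
than any it already holds. The following Java sketch is simplified
(invented names; a real version would also restore the previous level on
unlock):

  import java.util.concurrent.locks.ReentrantLock;

  class LeveledLock extends ReentrantLock {
      private static final ThreadLocal<Integer> HIGHEST_HELD =
              ThreadLocal.withInitial(() -> Integer.MIN_VALUE);
      private final int level;

      LeveledLock(int level) { this.level = level; }

      void lockChecked() {
          if (level <= HIGHEST_HELD.get())     // only strictly higher levels may be taken
              throw new IllegalStateException("lock-level violation at level " + level);
          lock();
          HIGHEST_HELD.set(level);
      }
  }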

A more basic problem with locks is that they rely on programmers to
strictly follow conventions. The relationship between a lock and the
data that it protects is implicit, and it is preserved only through
programmer discipline. A programmer must always remember to take the
right lock at the right point before touching shared data. Conventions
governing locks in a program are sometimes written down, but they're
almost never stated precisely enough for a tool to check them.
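For example, nothing in the language ties a lock to the data it guards;
the association lives only in a comment (a small sketch with invented
names):

  import java.util.concurrent.locks.ReentrantLock;

  class Inventory {
      private final ReentrantLock stockLock = new ReentrantLock();
      private int stock;   // convention: touch only while holding stockLock

      void add(int n) {
          stockLock.lock();
          try { stock += n; } finally { stockLock.unlock(); }
      }

      int peek() {
          return stock;    // violates the convention, yet compiles and usually "works"
      }
  }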

Locks have other, more subtle problems. Locking is a global program
property, which is difficult to localize to a single procedure, class,
or framework. All code that accesses a piece of shared state must know
and obey the locking convention, regardless of who wrote the code or
where it resides.

Attempts to make synchronization a local property do not work all the
time. Consider a popular solution like Java's synchronized methods. Each
of an object's methods can take a lock on the object, so no two threads
can directly manipulate the object's state simultaneously. As long as an
object's state is only accessed by its methods and programmers remember
to add the synchronized declaration, this approach works.

There are at least three major problems with synchronized methods.
First, they are not appropriate for types whose methods call virtual
functions on other objects (e.g., Java's Vector and .NET's
SyncHashTable), because calling into third-party code while holding a
lock opens the possibility of deadlock. Second, synchronized methods can
perform too much locking, by acquiring and releasing locks on all object
instances, even those never shared across threads (typically the
majority). Third, synchronized methods can also perform too little
locking, by not preserving atomicity when a program calls multiple
methods on an object or multiple methods on different objects. As a
simple example of the latter, consider a banking transfer:

  account1.Credit(amount); account2.Debit(amount);

Per-object locking protects each call, but does not prevent another
thread from seeing the inconsistent state of the two accounts between
the calls. Operations of this type, whose atomicity does not correspond
to a method call boundary, require additional, explicit synchronization.
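Concretely, even when each account is a correct monitor, the transfer
needs its own wider critical section. In the following hedged Java
sketch (invented names; the single coarse transfer lock is merely the
simplest possible fix, not necessarily the best one), the extra
synchronization is what makes the pair of calls atomic:

  class Account {
      private long balance;
      synchronized void credit(long amount) { balance += amount; }
      synchronized void debit(long amount)  { balance -= amount; }
  }

  class Bank {
      private static final Object TRANSFER_LOCK = new Object();   // simplest possible fix

      static void transfer(Account from, Account to, long amount) {
          synchronized (TRANSFER_LOCK) {   // the extra, explicit synchronization
              from.debit(amount);
              to.credit(amount);
          }
          // Without TRANSFER_LOCK, each call is individually atomic, but another
          // thread can observe the instant where the money has left 'from' and has
          // not yet reached 'to'.
      }
  }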

For completeness, we note two major alternatives to locks. The first is
lock-free programming.  By relying on a deep knowledge of a processor's
memory model, it is possible to create data structures that can be
shared without explicit locking. Lock-free programming is difficult and
fragile; inventing a new lock-free data structure implementation is
still often a publishable result.
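As a small taste of the style, here is a compare-and-swap retry loop (a
hedged Java sketch using java.util.concurrent.atomic); anything much
more ambitious than this quickly becomes subtle:

  import java.util.concurrent.atomic.AtomicLong;

  class LockFreeMax {
      private final AtomicLong value = new AtomicLong();

      long updateMax(long candidate) {
          while (true) {
              long current = value.get();
              long next = Math.max(current, candidate);
              // compareAndSet succeeds only if no other thread changed the value in
              // the meantime; otherwise re-read and retry. No lock is ever held.
              if (value.compareAndSet(current, next)) return next;
          }
      }
  }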

The second alternative is transactional memory, which brings the central
idea of transactions from databases into programming languages.
Programmers write their programs as a series of explicitly atomic
blocks, which appear to execute indivisibly, so concurrently executing
operations see the shared state strictly before or after an atomic
action executes. Although many people view transactional memory as a
promising direction, it is still a subject of active research.
