C++ Multithreading: Library Primitives

Thu Sep 9 16:49:53 BST 2004

Hi all,

OK, here's some drafted thoughts concerning library primitives. I have 
not yet included any references.

Kevlin

<blurb>
Library Primitives to Support Multithreaded Programming
=======================================================
Against a standardized, clearly defined, and sufficiently portable 
memory model that addresses threading issues explicitly, a C++ 
programmer expecting to work with threads would likely expect standard 
library support for thread programming primitives. There are many forms 
and styles such a library can take, and many examples of such libraries 
in production code. The goal of the current proposal is not to define 
such a library in detail, but to outline requirements and reasonable 
expectations of such a library.

Level of Library
----------------
High-level threaded programming models and utilities are ultimately 
desirable, but must be built on a more primitive and portable layer. A 
standard C++ threading library should aim to be that primitive and 
portable layer rather than a C library, no matter how standard or well 
known a given C API is. It is most likely that any threading library 
implementation would build on an existing C threading API, but that is a 
matter for the library implementation rather than the user of the 
library.

This proposal is therefore conservative in focusing on library 
primitives rather than higher-level facilities. It would be reasonable 
to also standardize such facilities, but they are considered to be a 
separate and additional consideration, layering on top of the library 
primitives. Until progress is made on the memory model and the library 
primitives, extending the proposal with higher-level facilities might be 
considered premature.

Portability of Library
----------------------
A platform may or may not support a preemptive multithreading model 
natively. Non-preemptive threading models can be made to look and behave 
superficially like their preemptive counterparts in certain simple 
cases, but they are sufficiently different to work with that this 
proposal focuses on what is these days presumed to be the default 
threading approach in programmers' minds, namely preemptive 
multithreading. As such, threading support in the library is not 
required to extend to non-preemptive models. Conversely, it may not be 
possible to use a standard threading library on all platforms that 
support other C++ standard library features.

Therefore, a question that needs to be resolved in proposing a threading 
library is to define its kind of conformance. A _freestanding_ C++ 
implementation, by definition, need not include threading facilities in 
its library since its support for the standard library is minimal. 
However, it might be considered too much of a burden on what a _hosted_ 
implementation must provide if, to conform to the standard, thread 
facilities must be supported. There are a number of possible approaches 
to consider:

(1) Add a new kind of implementation, along the lines of _freestanding_, 
_hosted_, and _hosted with threads_ (or some suitable synonym). The 
_hosted_ category can be considered to be the library as it is defined 
in the current standard (along with any other non-thread-related 
extensions proposed for the next standard), whereas the _hosted with 
threads_ category would include the whole library.

(2) Provide the threading primitives library as an optional part of the 
library, perhaps as an appendix. Although this might conceptually be 
similar to approach (1), its spirit and practicalities are subtly 
different. Optionality implies feature testing, and so some feature 
testing mechanism is needed, such as macro testing (as used in POSIX), 
tag types, or traits. However, the inclusion of one optional library 
might set a precedent and be considered a cue for other optional 
libraries.

(3) Require the definition of threading primitives within the standard 
library, but leave its viability as a QoI matter: code would compile but 
would not necessarily run successfully, although its execution would not 
be undefined. For example, an implementation on a single-threaded 
platform might choose to throw an exception for any attempted thread 
creation, or synchronization primitives might be implemented as 
stateless objects with null implementations of their locking functions. 
This approach could be complemented by feature-checking mechanisms.
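
As a rough illustration of approach (3), the following sketch shows what 
a library on a single-threaded platform might provide. All of the names 
used (threading_traits, thread_unsupported, null_mutex, null_threader) 
are hypothetical and are not taken from any existing library or 
proposal.

```cpp
#include <stdexcept>

// Hypothetical names throughout; nothing here is part of any standard.
namespace sketch {

// Feature-checking hook: a flag that user code can test at compile time.
struct threading_traits {
    static constexpr bool threads_supported = false; // single-threaded platform
};

// Thrown by an implementation that cannot create threads.
struct thread_unsupported : std::runtime_error {
    thread_unsupported() : std::runtime_error("threads not supported") {}
};

// Stateless mutex with null implementations of its locking functions.
struct null_mutex {
    void lock() noexcept {}
    bool try_lock() noexcept { return true; }
    void unlock() noexcept {}
};

// Thread launcher that compiles everywhere but always fails at run time
// with a well-defined exception rather than undefined behavior.
struct null_threader {
    template <typename Function>
    void operator()(Function) const { throw thread_unsupported(); }
};

} // namespace sketch
```

Code written against such types would compile unchanged, and could 
consult the flag to choose a single-threaded fallback before attempting 
to launch anything.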

In addition to whether or not threading is supported fully in a given 
library implementation, there is also per-platform variation in the 
features supported. Because C++ is still considered a systems 
programming language, there is an expectation of a close (but no closer 
than necessary) correspondence between its primitives and the primitives 
of the platform. This means that any offering of library primitives must 
strike a balance between a pure subset of what is common across 
platforms -- perhaps too small a subset to be useful in real-world 
applications -- and a constructed superset of what is available, which 
demands more of a library implementor. The superset 
approach may take an implementation too far from the correspondence 
between the platform and the library primitives, making constructs that 
are efficient on one platform indistinguishable from those that are 
inefficient, because the former can be realized directly and the latter 
must be constructed and might require elaborate support.

For example, although mutexes are commonly supported, they tend to 
appear in different flavors (e.g. reentrant and non-reentrant). 
Platforms vary in which flavors they support -- some support just one, 
some support many. Most users would likely expect the library's mutex 
types to be close to their underlying platform primitives. The mutex 
type to use could be resolved at compile time or at run time.
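
A minimal sketch of compile-time resolution follows. It uses std::mutex 
and std::recursive_mutex, which were standardized after this draft was 
written, purely as stand-ins for non-reentrant and reentrant platform 
primitives. The names mutex_for and counter are hypothetical.

```cpp
#include <mutex>
#include <type_traits>

namespace sketch {

// Hypothetical selector: picks a mutex flavor at compile time.
// std::mutex and std::recursive_mutex stand in for platform primitives.
template <bool Reentrant>
using mutex_for = typename std::conditional<Reentrant,
                                            std::recursive_mutex,
                                            std::mutex>::type;

// A counter whose locking flavor is fixed when the template is instantiated.
template <bool Reentrant>
class counter {
public:
    int increment() {
        std::lock_guard<mutex_for<Reentrant>> guard(mutex_);
        return ++value_;
    }

    // Calls increment() while already holding the lock, so it is only
    // valid for the reentrant flavor; misuse is rejected by the compiler.
    int increment_twice() {
        static_assert(Reentrant, "increment_twice re-enters the lock");
        std::lock_guard<mutex_for<Reentrant>> guard(mutex_);
        increment();
        return increment();
    }

private:
    mutex_for<Reentrant> mutex_;
    int value_ = 0;
};

} // namespace sketch
```

Resolving the flavor in the type system means that an attempt to rely 
on reentrancy where it is not available shows up as a compile error 
rather than as a deadlock at run time.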

Another case that may warrant a quality of implementation license is 
that of deadlock detection. Although convenient, it is not universally 
supported and not necessarily efficient for all programmers' needs, 
especially when they have the choice and confidence of not using it. A 
reasonable quality of implementation constraint would be that, in the 
event of deadlock, an implementation may either block indefinitely (or 
until a timeout, if a lock has one specified) or throw an exception to 
indicate a deadlock condition.
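
The throwing alternative might look something like the sketch below. It 
detects only the simplest case, a thread relocking a non-reentrant mutex 
that it already holds, and makes no attempt to find cycles involving 
several threads. The names checked_mutex and deadlock_error are 
hypothetical, and std::mutex is used as a stand-in for the platform 
primitive.

```cpp
#include <atomic>
#include <mutex>
#include <stdexcept>
#include <thread>

namespace sketch {

// Hypothetical exception type used to report a detected deadlock.
struct deadlock_error : std::logic_error {
    deadlock_error()
        : std::logic_error("deadlock: mutex already held by this thread") {}
};

// Non-reentrant mutex that detects self-deadlock instead of blocking.
class checked_mutex {
public:
    void lock() {
        // owner_ can only equal our id if this thread stored it and has
        // not yet unlocked, so this test has no false positives.
        if (owner_.load() == std::this_thread::get_id())
            throw deadlock_error();
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
    }

    void unlock() {
        owner_.store(std::thread::id()); // default id means "no owner"
        mutex_.unlock();
    }

private:
    std::mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
};

} // namespace sketch
```

An implementation that chose not to pay for the ownership check would 
simply block in the same situation, which the constraint above also 
permits.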

In other cases there is a difference in the platform primitives on 
offer. For example, Win32 offers event variables but not condition 
variables, and pthreads offers condition variables but not event 
variables. Assuming a mutex primitive, it is possible to implement one 
in terms of the other without too much infrastructure or surprise. 
However, in this particular case, condition variables are generally seen 
as the superior alternative and event variables as too primitive, so it 
would be reasonable for a standard library to support condition 
variables but not event variables.
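
To give a sense of how little infrastructure is involved in one 
direction, here is a sketch of a manual-reset event built from a mutex 
and a condition variable. The class name and its member functions are 
hypothetical; std::mutex and std::condition_variable, which postdate 
this draft, stand in for the assumed primitives.

```cpp
#include <condition_variable>
#include <mutex>

namespace sketch {

// Manual-reset event: once set, it stays signaled until reset() is called.
class manual_reset_event {
public:
    // Signal the event and wake every waiting thread.
    void set() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            signaled_ = true;
        }
        condition_.notify_all();
    }

    // Return the event to the non-signaled state.
    void reset() {
        std::lock_guard<std::mutex> guard(mutex_);
        signaled_ = false;
    }

    // Block until the event is signaled; returns at once if it already is.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        condition_.wait(lock, [this] { return signaled_; });
    }

    bool is_set() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return signaled_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable condition_;
    bool signaled_ = false;
};

} // namespace sketch
```

The predicate passed to wait() guards against spurious wakeups, which 
is one of the details that a bare event variable leaves to the user.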

Similarly, synchronization primitives vary in their scope, i.e. 
available only within a process or also visible to other processes. 
Win32 treats its _mutex_ primitive as having interprocess visibility and 
its _critical section_ primitive, which has mutex semantics, as being 
process local only. In this case, because standard C++ does not have a 
concept of separate processes, it is sufficient to focus on 
single-process scope for standardization, but acknowledge that a library 
implementor may reasonably choose to extend the library to accommodate 
interprocess communication.

However, no matter what set of primitives is considered sufficiently 
portable, it must be recognized that underlying threading platforms are 
invariably richer, often providing specific features that a programmer 
may wish to take advantage of, e.g. interprocess synchronization and 
real-time scheduling. Therefore, it is probable rather than just 
possible that specific implementations will extend a core standardized 
threading library. Any chosen library design must take this into account 
and be open to these kinds of extensions.

Library Features
----------------
There are broadly two areas in which a library needs to offer 
primitives:

(1) Thread execution: The launching, termination, and joining of 
threads. Whether and to what degree library primitives are offered to 
support manipulation of thread execution priorities, specification of 
scheduling model (e.g. FIFO or round-robin), and thread cancellation is 
currently open for discussion, and depends largely on whether a 
standardized library adopts a pure common subset approach, a common 
subset with optional extensions, or a required superset approach.

(2) Synchronization: This is the traditional domain of locks, but also 
includes atomic operations for lock-free programming. There is some 
range of variation in what primitives are supported (e.g. binary 
semaphores, counting semaphores, mutexes, condition variables, event 
variables) and in the way that they are supported (e.g. mutex 
reentrancy). A library needs to offer a reasonable minimum set of 
synchronization primitives and a way of dealing with the feature 
variability within each type.

In principle, if a higher-level set of facilities were also offered as 
part of a standard threading library, they should be fully and portably 
implementable in terms of the library primitives. This is not to say 
that they would be required to be implemented in terms of them, but that 
they could be (cf the relationship between I/O streams and the C 
standard I/O facilities).

Style of Library
----------------
Although a threading and synchronization library is intended to be 
primitive, that does not mean it has to be at the lowest level from a 
C++ programmer's perspective: an existing C library would otherwise be 
sufficient. A user of a standard C++ threading library would reasonably 
expect such a library to make best use of the language features and 
programming idioms available, which favors an approach based on objects 
rather than on function pointers and void pointers, etc.

There are two basic approaches to defining active objects: one is to 
inherit threadedness from a base class and the other is to use a 
threading object to execute a function on a separate object in a thread. 
The former was traditionally popular, but the delegation-based approach 
is now considered both a better design, in terms of its separation of 
concerns, and the more popular style, both in existing C++ threading 
libraries and other languages. In C++ the most idiomatic realization of 
the delegation-based approach is to execute function objects, as opposed 
to requiring that a threadable object inherit from a library-specified 
base class with a single virtual ordinary function member that needs to 
be overridden. The function object approach is based on concrete types 
and templates, supporting uniformity of use for both function objects 
and function pointers.
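
The uniformity claimed for the function object approach can be sketched 
as follows. std::thread, which did not exist when this was drafted, is 
used as a stand-in for whatever launching primitive a library would 
offer; run_and_join, accumulate_to, hits, and record_hit are 
hypothetical names.

```cpp
#include <thread>

namespace sketch {

// Runs any nullary callable in a new thread and waits for it to finish.
// No base class is required of the callable.
template <typename Function>
void run_and_join(Function f) {
    std::thread worker(f);
    worker.join();
}

// A function object: its state travels with it as ordinary data members.
struct accumulate_to {
    int* total;
    int limit;
    void operator()() const {
        for (int i = 1; i <= limit; ++i)
            *total += i;
    }
};

// An ordinary function, launched through a pointer in exactly the same way.
inline int& hits() {
    static int count = 0;
    return count;
}
inline void record_hit() { ++hits(); }

} // namespace sketch
```

A call such as run_and_join(accumulate_to{&total, 4}) and a call such as 
run_and_join(&record_hit) go through the same template, with no virtual 
function or inheritance relationship involved.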

More generally, the generic approach suggests itself as the approach to 
be used for defining library primitives for threading and 
synchronization. This approach has not been taken in any of the popular 
C++ threading libraries, but it has been the subject of some work by one 
of the authors of this proposal. Following the existing example of the 
STL, a generic approach to threading divides a primitives library into 
two aspects: a set of requirements on types (concepts) and a 
standardized set of types for out-of-the-box use (cf Sequence 
requirements and the std::deque, std::list, and std::vector class 
templates). This approach offers a ready-to-use and standard code 
library, but also offers an open model for extension for both users and 
library implementors.

For the threading side of the library, requirements would be needed to 
define at least what functions and function objects could be threaded, 
the way in which they were launched, and the mechanism by which they 
could be joined and any results recovered. One model that has been 
suggested is that thread launching can be treated as an application of a 
function or function object (a threader) to the function to be run in a 
thread. Treating a threader as a function object allows libraries to 
extend behavior by overloading the constructor to handle specification 
of potentially platform-specific features, such as stack size, 
scheduling policy, etc, without disrupting the uniformity of the 
function call syntax. Alternatively, different threader types could be 
provided that satisfy the threader requirements but realize specific 
thread-execution policies, e.g. thread pooling. A further consideration 
is to allow threadable functions to return a value that the user can pick 
up through an explicit join action. C APIs commonly accommodate this 
feature via a void * or equivalent. Object-wrapped threading libraries 
normally ignore this return bandwidth, so that a join operation returns 
void. A standard C++ threading library could handle type-safe value 
returns simply and generically.
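
One possible shape for the threader model is sketched below, under 
several simplifying assumptions: the threaded function takes no 
arguments, its result type is default constructible and not void, and 
exceptions escaping the function are not handled. The names threader and 
joiner and the stack size parameter are hypothetical, and std::thread is 
used as a stand-in for the underlying platform API.

```cpp
#include <cstddef>
#include <memory>
#include <thread>
#include <utility>

namespace sketch {

// Handle returned by a threader; join() waits for the thread and then
// yields the function's result with its static type intact.
template <typename Result>
class joiner {
public:
    joiner(std::thread worker, std::shared_ptr<Result> slot)
        : worker_(std::move(worker)), slot_(std::move(slot)) {}

    // Must be called exactly once before the joiner is destroyed.
    Result join() {
        worker_.join();
        return std::move(*slot_);
    }

private:
    std::thread worker_;
    std::shared_ptr<Result> slot_;
};

// A threader is a function object: applying it to a function launches
// that function in a new thread.
class threader {
public:
    // The constructor is the extension point for platform-specific
    // options. The hint is only recorded here, because std::thread
    // offers no portable way to apply it.
    explicit threader(std::size_t stack_size_hint = 0)
        : stack_size_hint_(stack_size_hint) {}

    std::size_t stack_size_hint() const { return stack_size_hint_; }

    template <typename Function>
    auto operator()(Function f) const -> joiner<decltype(f())> {
        using result_type = decltype(f());
        auto slot = std::make_shared<result_type>();
        std::thread worker([slot, f]() mutable { *slot = f(); });
        return joiner<result_type>(std::move(worker), slot);
    }

private:
    std::size_t stack_size_hint_;
};

} // namespace sketch
```

The call syntax stays the same however the threader was configured, so 
a pooling threader or one with a larger stack could be substituted 
without touching the code that launches and joins.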

The variety of operations that a locked synchronization primitive can 
support (e.g. blocking lock, nonblocking lock, lock with timeout) is 
large enough and diverse enough that mandating their full support on all 
locking primitives might be considered too much of an overhead by both 
users and implementors alike. A hierarchical set of categories defining 
lockability, similar to the iterator categories in the current standard 
library, might offer a reasonable approach to addressing this 
variability. A library could define traits and tag types to allow 
programmers to discover details of a given primitive or to choose the best 
fit for their needs.
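
By analogy with the iterator category tags, such a hierarchy might be 
sketched as below. The three categories, the lock_traits template, the 
nested lock_category typedef, and the acquire function are all 
hypothetical. std::mutex and std::timed_mutex stand in for library 
primitives, and spin_flag for a user-defined lockable type.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <type_traits>

namespace sketch {

// Category tags; each refinement derives from the one it extends.
struct lockable_tag {};                          // lock, unlock
struct try_lockable_tag : lockable_tag {};       // + try_lock
struct timed_lockable_tag : try_lockable_tag {}; // + try_lock_for

static_assert(std::is_base_of<try_lockable_tag, timed_lockable_tag>::value,
              "timed lockability refines try lockability");

// Traits: by default a lockable type names its own category...
template <typename Lockable>
struct lock_traits {
    using category = typename Lockable::lock_category;
};
// ...and specializations describe existing types nonintrusively.
template <> struct lock_traits<std::mutex> {
    using category = try_lockable_tag;
};
template <> struct lock_traits<std::timed_mutex> {
    using category = timed_lockable_tag;
};

// A user-defined lockable object that supports only nonblocking attempts
// in addition to the basic blocking lock.
class spin_flag {
public:
    using lock_category = try_lockable_tag;
    void lock() { while (!try_lock()) std::this_thread::yield(); }
    bool try_lock() { return !held_.exchange(true, std::memory_order_acquire); }
    void unlock() { held_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> held_{false};
};

// Tag dispatch: wait up to the timeout where the category allows it,
// otherwise fall back to a single nonblocking attempt.
template <typename Lockable>
bool acquire_impl(Lockable& l, std::chrono::milliseconds timeout,
                  timed_lockable_tag) {
    return l.try_lock_for(timeout);
}
template <typename Lockable>
bool acquire_impl(Lockable& l, std::chrono::milliseconds, try_lockable_tag) {
    return l.try_lock();
}

template <typename Lockable>
bool acquire(Lockable& l, std::chrono::milliseconds timeout) {
    return acquire_impl(l, timeout,
                        typename lock_traits<Lockable>::category());
}

} // namespace sketch
```

The best available overload is selected at compile time, so a primitive 
that cannot support timeouts is never asked to.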

On their own, objects satisfying some category of lockability 
requirements, whether primitives defined in the library or higher-level 
objects written by the user, could be tedious and error prone to use. 
Inevitably their use would be wrapped up using the _scoped locking_ 
idiom (a specific and common application of the _resource acquisition is 
initialization_ idiom). The approach is common enough that provision of 
some scoped-locking-based helpers would most likely be expected by users 
of a standard threading library. The set of such lockers is potentially 
unbounded, and is not restricted to the common scoped lock: for example, 
smart pointers that wrap individual function calls can be defined in 
terms of a stable set of lockability requirements. In this sense, 
lockers are to lockable objects as algorithms are to iterators. A 
library could provide some common helper types, but the formalizing of 
lockability requirements would allow library implementors and users 
alike a uniform model for extension that would be nonintrusive on 
existing lockable objects, whether standardized synchronization 
primitives or other user-defined lockable objects.
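
Two such lockers are sketched below, both written only against the 
basic requirement that a lockable object provide lock() and unlock(). 
The names scoped_locker, locking_ptr, counting_lock, and account are 
hypothetical. counting_lock is a deliberately trivial user-defined 
lockable that merely records calls, so that the effect of each locker 
can be observed.

```cpp
namespace sketch {

// Scoped locker: acquires in its constructor, releases in its destructor.
template <typename Lockable>
class scoped_locker {
public:
    explicit scoped_locker(Lockable& lockable) : lockable_(lockable) {
        lockable_.lock();
    }
    ~scoped_locker() { lockable_.unlock(); }
    scoped_locker(const scoped_locker&) = delete;
    scoped_locker& operator=(const scoped_locker&) = delete;
private:
    Lockable& lockable_;
};

// Locking smart pointer: each call made through -> is bracketed by
// lock() and unlock() on the associated lockable object.
template <typename T, typename Lockable>
class locking_ptr {
public:
    // Temporary that holds the lock until the end of the full expression.
    class call_proxy {
    public:
        call_proxy(T* target, Lockable* lockable)
            : target_(target), lockable_(lockable) { lockable_->lock(); }
        ~call_proxy() { lockable_->unlock(); }
        call_proxy(const call_proxy&) = delete;
        call_proxy& operator=(const call_proxy&) = delete;
        T* operator->() const { return target_; }
    private:
        T* target_;
        Lockable* lockable_;
    };

    locking_ptr(T& target, Lockable& lockable)
        : target_(&target), lockable_(&lockable) {}

    // The braced return constructs the proxy in place, with no copy.
    call_proxy operator->() const { return {target_, lockable_}; }

private:
    T* target_;
    Lockable* lockable_;
};

// User-defined lockable that only counts calls (not a real lock).
struct counting_lock {
    int locks = 0;
    int unlocks = 0;
    void lock() { ++locks; }
    void unlock() { ++unlocks; }
};

// Example target type for the locking pointer.
struct account {
    int balance = 0;
    void deposit(int amount) { balance += amount; }
};

} // namespace sketch
```

Neither locker needs to know which kind of lockable object it has been 
given, which is the sense in which lockers relate to lockable objects as 
algorithms relate to iterators.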
</blurb>
-- 
____________________________________________________________

   Kevlin Henney                   phone:  +44 117 942 2990
   mailto:kevlin at curbralan.com     mobile: +44 7801 073 508
   http://www.curbralan.com        fax:    +44 870 052 2289
   Curbralan: Consultancy + Training + Development + Review
____________________________________________________________