C++ Threading Proposal

Thu Sep 30 22:19:07 BST 2004

> -----Original Message-----
> From: Jim Rogers [mailto:jimmaureenrogers at att.net]
> 
[ Various sections deleted.]
> Ada does not provide any guarantee of execution order. This 
> is particularly
> appropriate since we have computer architectures that provide
> out-of-order execution.
Actually out-of-order execution is not the issue. This example
actually works correctly on Pentium 4s (if the compiler does
no reordering), which rely heavily on out-of-order execution.
Without volatile declarations (which would generate different
instructions), it doesn't work on current Itanium processors,
which execute instructions in-order.  Recent PA-RISC implementations
are out-of-order but sequentially consistent.  It depends much more on
how write buffers etc. work.
> 
> In your example above, if x_initialized is volatile, while x 
> is not, then
> two
> different threads on separate processors can be expected to 
> see different
> values for x, since each processor has its own copy of x in 
> local cache.
> 
> If x is also volatile, another thread will see the "x = 17" once x is
> updated.
> The way to ensure that both x_initialized and x will properly 
> be viewed and
> set by different threads is by the use of critical regions. 
> Atomic compare
> and
> swap operations are never specified in hardware for arbitrarily large
> regions of
> memory. This is the reason that critical regions have been 
> defined. This is
> also
> the situation where mutex locks are very useful.
There was a lot of discussion of this during the Java effort.
The problem is that locks or critical regions are expensive, at
least on a multiprocessor, and therefore people try to use alternatives
for performance-critical applications.  (Search for "double-checked
locking".)  In a few cases, the alternatives are essential to get
any real speedups from multiprocessors.  Whether or not it's
essential, we know that some significant code bases (e.g. the one from
Redmond) regularly use such techniques, and there needs to be a way
to do it correctly.

For the example I gave, hardware almost always provides much cheaper ways to
ensure correctness than by using a lock.  There needs to be a way to
get access to that.

You are right that this is tricky because compare-and-swap operations
are limited at best.  I also agree that if critical regions give
you adequate performance, you shouldn't bother with this stuff.

> 
> Java allows threads to send messages to each other through any public
> method, whether or not that method contains a synchronized block. This
> is an invitation to race conditions, because Java does not 
> ensure atomic
> updates of all the data members in a Thread class. I would 
> suggest that
> the public functions of a C++ Thread class should be implicitly
> synchronized.
Typically a lot of shared data is not directly in a class identifiable
as being thread-related.

I don't believe the language should ever add implicit synchronization,
except possibly where it's necessary to protect shared data that is
not programmer-visible.  Synchronization is expensive.  Adding more of
it is in general unsafe and can certainly lead to deadlocks.  In my
view, the original Java libraries suffered from way too much implicit
synchronization.

Having said that, I don't think we should follow the Java approach
down to the thread-related classes in the standard library.  The C++
standard library is different from the Java standard library, and we
aren't going to change that.  In either case, I find I typically
need a manual while I'm programming, and I can live with that.
I do want the memory visibility rules to be similar, since those are
conceptually hard, and they are something that most people will have
to be taught.  And teaching two sets of rules is a significant extra
effort.  (Plus there are the occasional compiler back ends that need
to know both sets of rules, but that's a relatively minor issue.)

There is a fair amount of existing multithreaded C and C++ code.  I think
we need to accommodate it as much as possible.  In cases in which
it's inherently broken (e.g. double-checked locking) I think it's
important to provide a reasonably painless path to fix it.  Thus
I do think we need to accommodate something very close to at least
the pthreads and probably win32 programming styles.

Hans
> >
> > > -----Original Message-----
> > > From: Jim Rogers [mailto:jimmaureenrogers at att.net]
> > > Sent: Monday, September 27, 2004 9:02 PM
> > > To: Boehm, Hans; 'Andrei Alexandrescu'; Ben Hutchings; 
> Kevlin Henney;
> > > Bill Pugh; Douglas C. Schmidt; Doug Lea
> > > Subject: Re: Re: C++ Threading Proposal
> > >
> > >
> > >
> > > ----- Original Message ----- 
> > > From: "Boehm, Hans" <hans.boehm at hp.com>
> > > To: "'Andrei Alexandrescu'" <andrei at metalanguage.com>; 
> "Ben Hutchings"
> > > <ben at decadentplace.org.uk>; "Boehm, Hans" 
> <hans.boehm at hp.com>; "Kevlin
> > > Henney" <kevlin at curbralan.com>; "Bill Pugh"
> > > <pugh at cs.umd.edu>; "Douglas C.
> > > Schmidt" <schmidt at dre.vanderbilt.edu>; "Doug Lea" 
> <dl at cs.oswego.edu>
> > > Cc: "Jim Rogers" <jimmaureenrogers at att.net>
> > > Sent: Monday, September 27, 2004 1:20 PM
> > > Subject: RE: Re: C++ Threading Proposal
> > >
> > >
> > > > My understanding of the Ada model (based also on a 1994 
> draft of the
> > > standard) is that it's actually pretty good for the time.  It
> > > differs from
> > > the Java approach in several ways:
> > > >
> > > > 1) Writes to a field are allowed to "interfere with" at
> > > least other fields
> > > in that object.  I currently think this is a mistake given
> > > current hardware
> > > capabilities and the existing difficulties of writing
> > > multithreaded code.
> > > It was a reasonable choice in 1994.
> > >
> > > Protected procedures and protected entries are allowed to
> > > modify any data in
> > > the protected object.
> > >
> > > >
> > > > 2) A concurrent write and other access (a race) is viewed
> > > as an error,
> > > just like the Pthread model.    Unlike in the Java 
> setting where this
> > > creates security issues, this is OK for many C++ programs.
> > > But it doesn't
> > > address lock-free algorithms, which was a big part of our 
> motivation.
> > >
> > > I understand some of the appeal of lock-free algorithms. I
> > > have not seen
> > > documentation claiming that lock-free algorithms are
> > > deterministic to the
> > > levels needed for hard real time systems. Given this lack of
> > > knowledge, I
> > > would not focus all thread communication around only
> > > lock-free algorithms.
> > >
> > > >
> > > > 3) It uses guards on entry to a "protected procedure" (roughly
> > > "synchronized method" in Java) instead of condition
> > > variables.  I gather
> > > that these are typically repeatedly evaluated on exit 
> from protected
> > > procedures?  I think I mildly prefer explicit condition
> > > variables (roughly
> > > as in Pthreads, though I don't fully understand the tradeoffs
> > > as to whether
> > > or not a notification should hold the lock) to either 
> this or the Java
> > > model, since I think it makes performance a bit more transparent.
> > >
> > > I would like to make a correction here. Protected objects are
> > > allowed three
> > > kinds of methods: procedures, entries, and functions.
> > > Protected procedures are given exclusive access to the object
> > > without guard
> > > conditions.
> > > Protected entries are given exclusive access to the object
> > > when a specified
> > > guard condition evaluates to True.
> > > Protected functions are given shared access to the object 
> through an
> > > implicit read-lock.
> > >
> > > Protected entries are similar to critical sections controlled
> > > by a condition
> > > variable. Every protected entry has an associated entry
> > > queue. That queue is
> > > used to queue up calls to the entry while the guard condition
> > > evaluates to
> > > False. Upon completion of a protected entry or protected
> > > procedure all entry
> > > guard conditions are evaluated. If a condition evaluates to
> > > True and the
> > > entry queue size is greater than 0 the current task executes
> > > the entry body
> > > as a proxy for the next task waiting in the queue. Entry
> > > queues can be set
> > > up as either priority queues or FIFO queues.
> > >
> > > >
> > > > 4) It's a bit clearer than the Pthread model about
> > > visibility rules, but
> > > not as clear as I would like.  It's a bit more of a stretch
> > > to argue that my
> > > register promotion example is legal in Ada, but it's not 
> as clearly
> > > prohibited as I would like.
> > >
> > > I am not familiar with your register promotion example.
> > >
> > > >
> > > > 5) It doesn't seem to address atomic memory operations or
> > > "unscoped" e.g.
> > > conditionally acquired locks.
> > >
> > > The Ada tasking model does provide the ability to specify
> > > atomic operations,
> > > as well as volatile operations.
> > > The Ada Language Reference Manual contains the following language:
> > >
> > > C.6 Shared Variable Control
> > >
> > > C.6 Shared Variable Control
> > >
> > >   1.. This clause specifies representation pragmas that
> > > control the use of
> > > shared variables.
> > >
> > > Syntax
> > >
> > >   2.. The form for pragmas Atomic, Volatile, 
> Atomic_Components, and
> > > Volatile_Components is as follows:
> > >
> > >     3.. pragma Atomic(local_name);
> > >
> > >     4.. pragma Volatile(local_name);
> > >
> > >     5.. pragma Atomic_Components(array_local_name);
> > >
> > >     6.. pragma Volatile_Components(array_local_name);
> > >
> > >   7.. An atomic type is one to which a pragma Atomic applies.
> > > An atomic
> > > object (including a component) is one to which a pragma
> > > Atomic applies, or a
> > > component of an array to which a pragma Atomic_Components
> > > applies, or any
> > > object of an atomic type.
> > >
> > >   8.. A volatile type is one to which a pragma Volatile
> > > applies. A volatile
> > > object (including a component) is one to which a pragma
> > > Volatile applies, or
> > > a component of an array to which a pragma Volatile_Components
> > > applies, or
> > > any object of a volatile type. In addition, every atomic type
> > > or object is
> > > also defined to be volatile. Finally, if an object is
> > > volatile, then so are
> > > all of its subcomponents (the same does not apply to atomic).
> > >
> > > Name Resolution Rules
> > >
> > >   9.. The local_name in an Atomic or Volatile pragma shall
> > > resolve to denote
> > > either an object_declaration, a non-inherited
> > > component_declaration, or a
> > > full_type_declaration. The array_local_name in an 
> Atomic_Components or
> > > Volatile_Components pragma shall resolve to denote the
> > > declaration of an
> > > array type or an array object of an anonymous type.
> > >
> > > Legality Rules
> > >
> > >   1.. It is illegal to apply either an Atomic or
> > > Atomic_Components pragma to
> > > an object or type if the implementation cannot support the
> > > indivisible reads
> > > and updates required by the pragma (see below).
> > >
> > >   2.. It is illegal to specify the Size attribute of an
> > > atomic object, the
> > > Component_Size attribute for an array type with atomic
> > > components, or the
> > > layout attributes of an atomic component, in a way that 
> prevents the
> > > implementation from performing the required indivisible reads
> > > and updates.
> > >
> > >   3.. If an atomic object is passed as a parameter, then the
> > > type of the
> > > formal parameter shall either be atomic or allow pass by copy
> > > (that is, not
> > > be a nonatomic by-reference type). If an atomic object is
> > > used as an actual
> > > for a generic formal object of mode in out, then the type of
> > > the generic
> > > formal object shall be atomic. If the prefix of an
> > > attribute_reference for
> > > an Access attribute denotes an atomic object (including a
> > > component), then
> > > the designated type of the resulting access type shall be
> > > atomic. If an
> > > atomic type is used as an actual for a generic formal derived
> > > type, then the
> > > ancestor of the formal type shall be atomic or allow pass by copy.
> > > Corresponding rules apply to volatile objects and types.
> > >
> > >   4.. If a pragma Volatile, Volatile_Components, Atomic, or
> > > Atomic_Components applies to a stand-alone constant object,
> > > then a pragma
> > > Import shall also apply to it.
> > > Pragma Volatile is used to enforce properly shared global 
> updates on
> > > multi-processor systems. Pragma Volatile forces each
> > > processor to view and
> > > update the shared variable directly, instead of through a
> > > local copy of the
> > > shared variable.
> > >
> > > >
> > > > My general inclination is still to try for as much 
> commonality as
> > > reasonable between Java and C++, since many programmers will
> > > need to deal
> > > with both.
> > >
> > > I suggest you avoid the Java model implemented through 
> wait()/notify()
> > > operations. The Java model does not ensure that all waiting
> > > threads will
> > > ever be serviced. The Java model ensures that things will
> > > work out "on the
> > > average", which is not deterministic and is totally
> > > inappropriate for hard
> > > real-time systems.
> > >
> > > The Ada model of protected entries (which corresponds well
> > > with Pthread
> > > condition variables) includes a queue so that you can
> > > deterministically
> > > assure all waiting tasks will be processed in a predictable order.
> > >
> > > >
> > > > Hans
> > > >
> > > > > -----Original Message-----
> > > > > From: Andrei Alexandrescu [mailto:andrei at metalanguage.com]
> > > > > Sent: Saturday, September 25, 2004 2:21 PM
> > > > > To: Ben Hutchings; Hans Boehm; Kevlin Henney; Bill Pugh;
> > > Douglas C.
> > > > > Schmidt; Doug Lea
> > > > > Cc: Jim Rogers
> > > > > Subject: Fwd: Re: C++ Threading Proposal
> > > > >
> > > > >
> > > > > Jim  Rogers  (cc'd)  sent  me some examples of using
> > > Ada's concurrency
> > > > > mechanisms, for possible inspiration.
> > > > > One   is   attached   in   text   format,   and   the
> > > other   is  at
> > > > > http://home.att.net/~jimmaureenrogers/Shared_Resource_Design_P
> > > > > atterns.html.
> > > > > I  am sending this to you FYI, and am also adding the
> > > documents to our
> > > > > nascent institutional memory.
> > > > > Andrei
> > > > > This is a forwarded message
> > > > > From: Jim Rogers <jimmaureenrogers at att.net>
> > > > > To: "Andrei Alexandrescu" <andrei at metalanguage.com>
> > > > > Date: Monday, September 20, 2004, 9:13:24 PM
> > > > > Subject: C++ Threading Proposal
> > > > > ===8<==============Original message text===============
> > > > > Hello Andrei,
> > > > > The attached file is a simple text file containing an
> > > example of Ada
> > > > > tasking.
> > > > > I used the attachment because my email tool was mangling my
> > > > > formatting.
> > > > > Jim Rogers
> > > > > ===8<===========End of original message text===========
> > > > > -- 
> > > > > Best regards,
> > > > >  Andrei                            
mailto:andrei at metalanguage.com
> > > >
> > >
> >
> >
>