[cpp-threads] D2335 (sequential consistency proof) revision
Chris Thomasson
cristom at comcast.net
Sat Aug 25 00:17:48 BST 2007
----- Original Message -----
From: "Boehm, Hans" <hans.boehm at hp.com>
To: "Chris Thomasson" <cristom at comcast.net>; "C++ threads standardisation"
<cpp-threads at decadentplace.org.uk>
Cc: "Howard Hinnant" <howard.hinnant at gmail.com>
Sent: Friday, August 24, 2007 3:37 PM
Subject: RE: [cpp-threads] D2335 (sequential consistency proof) revision
> From: Chris Thomasson
>
> [...]
>
> Taking a lock does not necessarily require a #StoreLoad | #StoreStore
> barrier. There are asymmetric locking algorithms in which
> certain thread(s)
> can take the lock without using interlocked rmw or store/load
> memory barrier
> instructions.
>
> You're presumably talking about cases in which a thread knows that it
> held the lock last?
I am referring to a type of "biased" locking scheme that can designate a
thread(s) which in turn can take locks without using interlocked rmw or
membars. Here is a basic example:
http://blogs.sun.com/dave/resource/Asymmetric-Dekker-Synchronization.txt
(search webpage text for 'The Key Innovation'...)
http://www.trl.ibm.com/people/kawatiya/Kawachiya05phd.pdf
(chapter 6)
Foreign threads must wait until the biased thread serializes itself by
executing something that is analogous to a release membar. The membar does
not necessarily have to be in the locking functions themselves. You can even
do a asymmetric read-write lock that has no interlocked rmw or membars on
the read-lock side in the fast-path case. It kind of resembles epoch
synchronization in which per-thread/cpu synchronizations are tracked at a
wider scale than usual:
http://appcore.home.comcast.net/vzoom/round-2.pdf
(page 2/section 1.2/second paragraph...)
I say wider scale because most synchronization algorithms expect explicit
serializations to occur on a per-operation basis. For instance, a reference
counting algorithm could rely on each reference increment and decrement
operation being guarded with the appropriate membars. Or, a locking
algorithm might rely on each acquisition and subsequent release will be
protected by the correct barriers... These types of strict rules can be
loosed a great deal by tracking the membars on a wider-scale. Basically, the
membars can be amortized. They do not necessarily _have_ to be invoked on a
per-operation basis.
I am not sure that this type of "weird" synchronization algorithms would
warrant a modification to the standardized mutex api... Humm. Need to think
some more on this. Anyway, I am sorry for the very brief information, but,
did it help clear anything up at all?
:^)
More information about the cpp-threads
mailing list