[cpp-threads] Failed compare-and-swap

Tue Jul 31 19:31:20 BST 2007

[Catching up with this thread ...] 

> -----Original Message-----
> From: cpp-threads-bounces at decadentplace.org.uk 
> [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf 
> Of Herb Sutter
> Sent: Tuesday, July 31, 2007 10:45 AM
> To: Chris Thomasson; C++ threads standardisation
> Subject: Re: [cpp-threads] Failed compare-and-swap
> 
> Summarizing:
> 
> The point well illustrated by these examples is that if CAS 
> returns the old value (rather than a bool) it can be used to 
> perform an atomic load. If we allow that, then it should have 
> the same default acquire semantics as a load (unless 
> overridden in the weak/relaxed atomics of course).
> 
> Lawrence and everyone, what do you think? Now that I see 
> this, it seems obvious in retrospect, and this question must 
> have come up before with CAS semantics. What do other systems do?
I completely agree with this.  If a CAS can be used to read the old
value (which it can, arguably even if it only has a boolean result), and
it has at least acquire semantics, then it should still have acquire
semantics even if it fails and the write doesn't take place.  The read
still succeeds.
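
To make this concrete, here is a minimal sketch of Example 1 (quoted
further down), written against the std::atomic interface as it was later
standardized in C++11.  That interface is an assumption of this sketch,
not something the thread had settled, and payload and run_handoff are
hypothetical stand-ins for my_data and the two threads.  The reader's
CAS only ever fails once the flag is set, and it is the acquire ordering
on that failed CAS that makes the plain read of payload safe.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for my_data and flag from Example 1.
static int payload = 0;              // plain, non-atomic data
static std::atomic<long> flag{0};

// Runs the writer and reader once and returns what the reader saw.
int run_handoff() {
    payload = 0;
    flag.store(0);
    int seen = -1;

    std::thread writer([] {
        payload = 42;                // corresponds to my_data.setup()
        long expected = 0;
        // 0 -> 1; release ordering on success publishes payload.
        flag.compare_exchange_strong(expected, 1,
                                     std::memory_order_release,
                                     std::memory_order_relaxed);
    });

    std::thread reader([&seen] {
        for (;;) {
            long expected = 0;
            // CAS(0 -> 0) used purely as a read.  It fails exactly when
            // flag != 0, and the failure ordering is acquire.
            if (!flag.compare_exchange_strong(expected, 0,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire))
                break;
            std::this_thread::yield();
        }
        // The failed CAS read the writer's release store, so the write
        // to payload happens before this read.
        seen = payload;              // corresponds to my_data.use()
    });

    writer.join();
    reader.join();
    return seen;
}
```

If the failure ordering were relaxed instead, the read of payload would
race with the writer's store, which is the case this thread is about.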

Whether or not it has release semantics is not observable by our
definitions, since there is no write for another thread to see.  Thus I
don't think there is any reason to say anything special about that
either.

We clearly need the acquire property for at least seq_cst, since we
don't otherwise get sequential consistency for race-free programs.  We
could conceivably specify something else for acquire and acq_rel.  But
that would clearly complicate things, and I tend to agree with Paul that
it's probably not necessary.  If this seriously affects the performance
of your code, you have bigger fish to fry.
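
For comparison, here is a sketch of what "something else" on failure
could look like, again assuming the std::atomic interface as later
standardized in C++11, where the failure ordering is a separate
argument.  atomic_max is a hypothetical helper.  Its failed CAS only
refreshes the observed value for the next iteration, so a relaxed
failure ordering is enough there.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: raise `target` to at least `value` and return
// the value that was observed before any update.
int atomic_max(std::atomic<int>& target, int value) {
    int observed = target.load(std::memory_order_relaxed);
    while (observed < value &&
           !target.compare_exchange_weak(observed, value,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // A failed CAS stored the current value into `observed`;
        // the loop condition re-checks it.
    }
    return observed;
}
```

Note that a caller who returns early because the observed value is
already large enough gets no acquire ordering from this helper, which is
the sort of complication mentioned above.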

Trylock is a different situation, since a failed trylock by definition
reads a value written by an operation (lock acquisition) without release
semantics.  Thus it currently never establishes a synchronizes-with
relationship, and hence has no impact on visibility.  To meaningfully
change that, lock acquisition would need release semantics, as it
technically does in Posix.  (As Alexander points out, a simple trylock
can't tell the difference in Posix.  On the other hand, a trylock
followed by an unrelated successful synchronization operation can.)
We've brought this up repeatedly, and I believe there is little support
for release semantics for lock acquisition, not even from the Posix
side.  (I believe a lot of Posix implementations don't follow this
interpretation of the rules as it stands.)
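
The trylock case can be sketched as follows, assuming std::mutex and
std::atomic as later standardized in C++11; run_trylock_probe and the
variable names are hypothetical.  x is made a relaxed atomic only so
that the sketch itself contains no data race.  The prober learns that
the lock is held from a failed try_lock, but since lock acquisition has
no release semantics, nothing orders the holder's earlier store to x
before the prober's read.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Returns the value of x that the prober saw after a failed try_lock.
int run_trylock_probe() {
    std::mutex m;
    std::atomic<int> x{0};
    std::atomic<bool> done{false};
    int seen = -1;

    std::thread holder([&] {
        x.store(42, std::memory_order_relaxed);
        m.lock();                    // acquire only, no release
        while (!done.load(std::memory_order_acquire))
            std::this_thread::yield();
        m.unlock();
    });

    std::thread prober([&] {
        // Spin until try_lock fails, i.e. the lock appears to be held.
        while (m.try_lock()) {
            m.unlock();
            std::this_thread::yield();
        }
        // No synchronizes-with edge was established by the failure, so
        // the memory model permits either 0 or 42 here.
        seen = x.load(std::memory_order_relaxed);
        done.store(true, std::memory_order_release);
    });

    holder.join();
    prober.join();
    return seen;
}
```

On most hardware this will return 42 in practice, but the sketch does
not rely on that.  C++11 additionally allows try_lock to fail
spuriously, so the prober cannot even conclude that the lock was held.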

Hans

> 
> 
> Details follow:
> 
> 
> Chris wrote:
> > From: "Herb Sutter" <hsutter at microsoft.com>
> > > Besides using Interlocked* as a way to spell a full barrier, what 
> > > cases do you know of where people look at the return value? I 
> > > haven't looked at this issue yet, and I would be interested in 
> > > understanding such cases better.
> >
> > Here are two examples:
> >
> > http://groups.google.com/group/comp.programming.threads/msg/b6a4eec6cbba625b
> > (contrived example)
> 
> For convenience, here's the example (let's call it Example 
> 1), which basically uses a CAS to perform an atomic read:
> 
>   // Example 1
> 
>   static CMyData my_data;
>   static volatile LONG flag = 0;
> 
>   Thread 1:
>     my_data.setup();
>     InterlockedCompareExchange( &flag, 1, 0 ); // 0 -> 1
> 
>   Thread 2:
>     while ( ! InterlockedCompareExchange( &flag, 0, 0 ) ) {
>       Sleep( 0 ); _asm pause;
>     }
>     // flag is set
>     my_data.use();
> 
> Yes, this relies on a failed CAS being a barrier, though this 
> particular example needs it only to be an acquire barrier. 
> The general technique here is to use a CAS purely as an atomic read.
> 
> So the question here is whether we want to support the 
> technique of using a CAS to perform an atomic read. Do we? We 
> can say yes by returning the old value, or no by returning a 
> succeeded/failed bool.
> 
> 
> > http://groups.google.com/group/comp.programming.threads/msg/57a691b2159fa698
> 
> OK, this is a followup where Joe Seigh points out, as I said 
> above, that you could just use a plain atomic read:
> 
>   // Example 2
> 
>   ... as in Example 1, except:
> 
>   Thread 2:
>     while (flag != 1)
>       Sleep(0);
>     InterlockedCompareExchange(&flag, 1, 1);
>     // flag is set
>     my_data.use();
> 
> I think Joe adds the ICE call only because it seems this 
> thread predates VS 2005, where we added more ordering 
> semantics to volatile. In VS 2005, I think you should be able to just do:
> 
>   // Example 3
> 
>   ... as in Example 1, except:
> 
>   Thread 2:
>     while (flag != 1)
>       Sleep(0);
>     // flag is set
>     my_data.use();
> 
> and have that work reliably.
> 
> 
> > http://groups.google.com/group/comp.programming.threads/msg/31803c339866658e06
> > (less contrived example; examine the 'initFreeQueue' function)
> >
> > I think the reasoning is that the InterlockedCompareExchangePointer 
> > does return real data.
> 
> Thanks, I didn't have time to read it carefully. So it seems 
> to me that the point is that if you can use CAS to do a read, 
> it should be ordered like a read (by default).
> 
> Herb