[cpp-threads] Yet another visibility question

Fri Jan 12 18:19:32 GMT 2007

On Fri, Jan 12, 2007 at 06:50:03PM +0100, Alexander Terekhov wrote:
> On 1/12/07, Paul E. McKenney <paulmck at linux.vnet.ibm.com> wrote:
> >On Fri, Jan 12, 2007 at 10:30:46AM +0100, Alexander Terekhov wrote:
> >> On 1/11/07, Peter Dimov <pdimov at mmltd.net> wrote:
> >> >Alexander Terekhov wrote:
> >> >
> >> >> Yeah, and IBM does some NUMA with Intel Xeons, IIRC. That's why I've
> >> >> been saying all along that lock_cmpxchg(&var, 42, 42) for loads is
> >> >> the way to do SC on x86/IA32. ;-)
> >> >
> >> >If x86 loads do have acquire semantics as we've been assuming, wouldn't it
> >> >be enough to just use lock xchg for stores?
> >>
> >> I don't think so. Why would that be enough? Well, actually xchg for
> >> stores is still needed to ensure leading store-load fencing regarding
> >> load inside cmpxchg, I think. So for SC on x86/IA32 we need cmpxchg
> >> for loads and xchg for stores (both locked).
> >
> >Will this force all stores to independent variables from different
> >processors to be ordered?  On a large NUMA configuration with multiple
> >front-side busses?
> 
> Think of implementing SC with per-variable locks on a large x86/IA32
> NUMA configuration with multiple FSBs or whatever. It won't be any
> better than cmpxchg for loads and xchg for stores (both locked).
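
For concreteness, the per-variable-lock scheme mentioned above might be
sketched roughly as follows. This is only an illustration: the name
LockedVar is hypothetical, and std::mutex comes from the later C++11
standard library, so it is an assumption rather than anything proposed
in this thread.

```cpp
#include <mutex>

// Hypothetical wrapper (not from the thread): every variable carries its
// own lock, and every load and store takes that lock.
template <typename T>
class LockedVar {
public:
    explicit LockedVar(T initial) : value_(initial) {}

    // Read the value while holding this variable's lock.
    T load() {
        std::lock_guard<std::mutex> guard(lock_);
        return value_;
    }

    // Write the value while holding this variable's lock.
    void store(T v) {
        std::lock_guard<std::mutex> guard(lock_);
        value_ = v;
    }

private:
    std::mutex lock_;  // guards value_ only; other variables use other locks
    T value_;
};
```

Note that two LockedVar objects never share a lock, so the sketch by
itself says nothing about how accesses to different variables are
ordered relative to each other.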

My understanding is that locking is only guaranteed to force SC execution
of critical sections when all of the critical sections of interest
are guarded by the same lock.  Therefore, per-variable locks will not
necessarily force SC execution of all accesses.

Do you agree or disagree?
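
As a reference point for what "SC execution" has to rule out when two
processors touch independent variables, here is the usual
store-buffering test. It is a sketch only: it uses std::atomic with
memory_order_seq_cst from the later C++11 standard (an assumption, not
something available in this discussion), and the function name
count_both_zero is made up for the example. On x86, compilers typically
implement such a store with xchg or with mov followed by mfence, and
the load with a plain mov.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering test `iterations` times and returns how often
// both threads read 0. A sequentially consistent execution forbids that
// outcome, so with seq_cst accesses the count should stay at 0.
int count_both_zero(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x(0), y(0);
        int r1 = -1, r2 = -1;

        // Thread 1: store to x, then load from y.
        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        // Thread 2: store to y, then load from x.
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });

        t1.join();
        t2.join();  // r1 and r2 are safe to read after both joins

        if (r1 == 0 && r2 == 0) ++forbidden;
    }
    return forbidden;
}
```

Weakening both stores and loads to memory_order_relaxed would allow the
both-zero result, which is the store-load reordering that the locked
instructions discussed above are meant to prevent.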

							Thanx, Paul