[cpp-threads] A draft memory model paper

Tue Aug 15 01:55:31 BST 2006

Please ignore the two sentences about 3.3.2 and uncertainty about
write-read reordering.  That should have been 3.2.2 and it was
previously addressed.  Sorry.

Hans

> -----Original Message-----
> From: Boehm, Hans 
> Sent: Monday, August 14, 2006 4:31 PM
> To: 'C++ threads standardisation'
> Subject: RE: [cpp-threads] A draft memory model paper
> 
> I expect this is an issue that we need to discuss at the 
> meeting for the C++ model.
> 
> Herb's paper is a bit unclear about whether interlocked 
> writes followed by reads can be reordered.  (I think 3.3.2 
> says they can?)  Assuming they can't, which I think is the 
> intent, I agree that a fence is required for stores on both 
> X86 and IA64.  I think that means you take a 100+ cycle hit on 
> an interlocked store for at least some Pentium 4 variants.  
> Since you don't take the hit on a load, and I think it's 
> unlikely that future processors would share this feature, 
> that may be OK, at least for default behavior.  I don't know.
> 
> Hans
> 
> 
> > -----Original Message-----
> > From: cpp-threads-bounces at decadentplace.org.uk
> > [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf Of Peter Dimov
> > Sent: Tuesday, August 08, 2006 3:50 AM
> > To: C++ threads standardisation
> > Subject: Re: [cpp-threads] A draft memory model paper
> > 
> > Herb Sutter wrote:
> > > Peter wrote:
> > >> How do you expect a compiler that follows this memory model to
> > >> translate:
> > > [code with interlocked variables]
> > >
> > > Same as JSR-133 cookbook for Java volatile, no?
> > 
> > Yes indeed. :-)
> > 
> > > BTW, the paper uses "interlocked" simply to avoid the V-word, since
> > > "volatile" is a lightning rod even though this is similar to Java and
> > > .NET volatile.
> > >
> > >> extern interlocked int x;
> > >>
> > >> void f()
> > >> {
> > >>     x = 1;
> > >
> > > On IA64, st.rel [x] = 1. On IA32 plain store + mfence is one way, I
> > > suppose?
> > >
> > >> }
> > >>
> > >> and:
> > >>
> > >> extern interlocked int y;
> > >> extern void h();
> > >>
> > >> void g()
> > >> {
> > >>     if( y ) h();
> > >
> > > On IA64, ld.acq [y] = 1. On IA32 plain load, I suppose?
> > >
> > >> }
> > 
> > But if you later call f(); g(); in sequence, st.rel can be reordered
> > with ld.acq, and the MM doesn't allow that. So one of the two needs an
> > additional mf. On x86, plain load is ld.acq and plain store is st.rel,
> > but translating stores to plain store + mfence would indeed impose a
> > total order.
> > 
> > Unfortunately it can also impose a performance hit.
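The store-then-load case discussed in this thread can be sketched as a small litmus test. This is only an illustration under stated assumptions: it uses std::atomic<int> from the later C++11 standard, with its default sequentially consistent ordering, as a stand-in for the proposed "interlocked int", and the function names are hypothetical. Each thread stores to one variable and then loads the other. If the store could be reordered after the load, both loads could return 0; a model that forbids write-read reordering rules that outcome out, which is why the store (or the load) needs a fence on IA64 and x86.

```cpp
#include <atomic>
#include <thread>

// Assumption: std::atomic<int> with the default memory_order_seq_cst
// stands in for the paper's "interlocked int". Names are hypothetical.
std::atomic<int> x{0};
std::atomic<int> y{0};

// Runs one round of the store-then-load pattern on two threads and
// reports whether both loads observed the initial value 0.
bool both_loads_saw_zero()
{
    x.store(0);
    y.store(0);
    int r1 = -1;
    int r2 = -1;
    // Thread a plays the role of f(); g(): store x, then load y.
    std::thread a([&] { x.store(1); r1 = y.load(); });
    // Thread b does the mirror image: store y, then load x.
    std::thread b([&] { y.store(1); r2 = x.load(); });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Counts how many rounds produced the outcome that a sequentially
// consistent model forbids.
int count_forbidden_outcomes(int rounds)
{
    int forbidden = 0;
    for (int i = 0; i < rounds; ++i) {
        if (both_loads_saw_zero()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

With sequentially consistent atomics, count_forbidden_outcomes should always return 0. Weakening the stores to memory_order_release and the loads to memory_order_acquire corresponds to the plain st.rel / ld.acq translation, and in that form the both-zero outcome is permitted.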