[cpp-threads] A draft memory model paper
Boehm, Hans
hans.boehm at hp.com
Tue Aug 15 01:55:31 BST 2006
Please ignore the two sentences about 3.3.2 and uncertainty about
write-read reordering. That should have been 3.2.2 and it was
previously addressed. Sorry.
Hans
> -----Original Message-----
> From: Boehm, Hans
> Sent: Monday, August 14, 2006 4:31 PM
> To: 'C++ threads standardisation'
> Subject: RE: [cpp-threads] A draft memory model paper
>
> I expect this is an issue that we need to discuss at the
> meeting for the C++ model.
>
> Herb's paper is a bit unclear about whether interlocked
> writes followed by reads can be reordered. (I think 3.3.2
> says they can?) Assuming they can't, which I think is the
> intent, I agree that a fence is required for stores on both
> X86 and IA64. I think that means you take a 100+ cycle hit on
> an interlocked store for at least some Pentium 4 variants.
> Since you don't take the hit on a load, and I think it's
> unlikely that future processors would share this feature,
> that may be OK, at least for default behavior. I don't know.
>
> Hans
>
>
> > -----Original Message-----
> > From: cpp-threads-bounces at decadentplace.org.uk
> > [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf Of Peter Dimov
> > Sent: Tuesday, August 08, 2006 3:50 AM
> > To: C++ threads standardisation
> > Subject: Re: [cpp-threads] A draft memory model paper
> >
> > Herb Sutter wrote:
> > > Peter wrote:
> > >> How do you expect a compiler that follows this memory model to
> > >> translate:
> > > [code with interlocked variables]
> > >
> > > Same as JSR-133 cookbook for Java volatile, no?
> >
> > Yes indeed. :-)
> >
> > > BTW, the paper uses "interlocked" simply to avoid the V-word, since
> > > "volatile" is a lightning rod even though this is similar to Java and
> > > .NET volatile.
> > >
> > >> extern interlocked int x;
> > >>
> > >> void f()
> > >> {
> > >> x = 1;
> > >
> > > On IA64, st.rel [x] = 1. On IA32 plain store + mfence is one way, I
> > > suppose?
> > >
> > >> }
> > >>
> > >> and:
> > >>
> > >> extern interlocked int y;
> > >> extern void h();
> > >>
> > >> void g()
> > >> {
> > >> if( y ) h();
> > >
> > > On IA64, ld.acq [y] = 1. On IA32 plain load, I suppose?
> > >
> > >> }
> >
> > But if you later call f(); g(); in sequence, st.rel can be reordered
> > with ld.acq, and the MM doesn't allow that. So one of the two needs an
> > additional mf. On x86, plain load is ld.acq and plain store is st.rel,
> > but translating stores to plain store + mfence would indeed impose a
> > total order.
> >
> > Unfortunately it can also impose a performance hit.
> >
> >
>
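The store-then-load concern discussed above can be sketched with a small two-thread test. This is only an illustration under stated assumptions: the thread's "interlocked" keyword is hypothetical, so the sketch substitutes std::atomic from the later C++11 standard, whose memory_order_seq_cst ordering is assumed here to match the intended "writes followed by reads cannot be reordered" semantics. The names run_once and saw_both_zero are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical stand-ins for the thread's "interlocked int x, y".
std::atomic<int> x{0};
std::atomic<int> y{0};

// Each thread stores to one variable, then loads the other.
// Returns the value each thread read.
std::pair<int, int> run_once() {
    x.store(0);
    y.store(0);
    int r1 = -1;
    int r2 = -1;
    std::thread t1([&r1] {
        x.store(1, std::memory_order_seq_cst);    // like f(): x = 1
        r1 = y.load(std::memory_order_seq_cst);   // like g(): read y
    });
    std::thread t2([&r2] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();
    assert(r1 != -1 && r2 != -1);  // both loads ran
    return {r1, r2};
}

// Returns true if any run observed both loads reading 0, which would
// mean a store was reordered after the following load.
bool saw_both_zero(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::pair<int, int> r = run_once();
        if (r.first == 0 && r.second == 0) return true;
    }
    return false;
}
```

With sequentially consistent ordering, at least one thread must see the other's store, so saw_both_zero should never return true. Weakening the stores to memory_order_release and the loads to memory_order_acquire would permit the both-zero outcome, which corresponds to the st.rel / ld.acq reordering described above; the extra fence on the store side is what rules it out, at the cost mentioned in the thread.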