[cpp-threads] modes, interlude

Alexander Terekhov alexander.terekhov at gmail.com
Tue May 10 10:26:16 BST 2005


On 5/10/05, Peter Dimov <pdimov at mmltd.net> wrote:
> Doug Lea wrote:
> 
> > If we are going to do this, is there any reason not to go all the way and
> > have four flavors of explicit barrier functions in the atomic classes
> 
> We will.
> 
> template<class M> void barrier( M msync );
> 
> > and not bother adding modes to load, store etc?
> 
> A library can't fold the explicit barrier() calls into the preceding or
> subsequent atomic operation, so providing only barrier() may be inefficient
> for the first ten years or so. :-)

I don't see how unidirectional constraints can be separated from
the associated atomic operation.

> 
> On platforms where a bidirectional barrier is the natural primitive, the
> modes can easily be implemented in terms of barriers:
> 
> template<class M, class T> T atomic_load( M msync, T * addr )
> {
>    barrier( (msync & msync_rel) | msync_hlb );
> 
>    T r = _load_none( addr );
> 
>    barrier( (msync & msync_acq) | msync_slb );
> 
>    return r;
> }
> 
> ("constraint algebra" at work.)

Yep. barrier() must have the precondition that both (msync & msync_rel)
and (msync & msync_acq) are nonzero, so to speak: every valid argument
pairs a "before" constraint with an "after" constraint.

ssb|hsb -> StoreStore            // eieio or {lw}sync
ssb|hlb -> StoreLoad             // sync
slb|hsb -> LoadStore             // {lw}sync
slb|hlb -> LoadLoad              // {lw}sync
rel|hsb -> StoreStore+LoadStore  // {lw}sync
rel|hlb -> StoreLoad+LoadLoad    // sync
ssb|acq -> StoreStore+StoreLoad  // sync
slb|acq -> LoadStore+LoadLoad    // {lw}sync
rel|acq -> Sledgehammer proper   // sync
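
The mapping above can be written down as a small lookup, which makes the
"constraint algebra" easier to check. This is only a sketch under
assumptions: the numeric bit values, the ppc_fence enum, and the names
barrier_precondition and select_fence are hypothetical, and rel = ssb|slb
and acq = hsb|hlb is inferred from the masking in the quoted code and from
the table, not stated in the thread. "{lw}sync" is collapsed to lwsync.

```cpp
#include <cassert>

// Hypothetical bit encoding; the thread gives no numeric values.
enum msync : unsigned {
    msync_none = 0,
    msync_ssb  = 1u << 0,               // constrains preceding stores
    msync_slb  = 1u << 1,               // constrains preceding loads
    msync_hsb  = 1u << 2,               // constrains subsequent stores
    msync_hlb  = 1u << 3,               // constrains subsequent loads
    msync_rel  = msync_ssb | msync_slb, // assumed: rel = ssb|slb
    msync_acq  = msync_hsb | msync_hlb  // assumed: acq = hsb|hlb
};

// Hypothetical names for the PowerPC choices listed in the table.
enum class ppc_fence { eieio_or_lwsync, lwsync, sync };

// barrier() needs at least one "before" bit and one "after" bit.
constexpr bool barrier_precondition(unsigned m) {
    return (m & msync_rel) != 0 && (m & msync_acq) != 0;
}

// Pick the fence the table lists for a mode that meets the precondition.
constexpr ppc_fence select_fence(unsigned m) {
    return ((m & msync_ssb) && (m & msync_hlb))
               ? ppc_fence::sync             // any StoreLoad component
         : (m == (msync_ssb | msync_hsb))
               ? ppc_fence::eieio_or_lwsync  // StoreStore only
               : ppc_fence::lwsync;          // everything else
}
```

Read this way, the table reduces to one rule: sync is needed exactly when
the mode contains both ssb and hlb (a StoreLoad component), and pure
StoreStore is the only case where eieio is enough.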

Note that there is no need to overly constrain compiler reordering
across unidirectionally constrained atomic operations merely 
because hardware/lib-impl is using bidirectional fences. So,

template<class M, class T> T atomic_load( M msync, T * addr )
compiler_reordering_constraint_across_this_function( M )
{

   if ( msync & msync_rel )
     barrier( (msync & msync_rel) | msync_hlb );

   T r = _compiler_reordering_constraint( addr );

   if ( msync & msync_acq )
     barrier( (msync & msync_acq) | msync_slb );

   return r;
}

or something like that, with some directive telling the compiler to
relax ordering with respect to "outside" code in spite of the
bidirectional fences used internally.
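
For illustration, the conditional-barrier shape can be sketched with
facilities that were standardized later (std::atomic and
std::atomic_thread_fence from C++11). Everything here is an assumption
made for the example: the sketch namespace, the bit values, the
fence_count helper (present only so the behaviour is observable), and the
use of a seq_cst fence as a conservative stand-in for a bidirectional
hardware barrier. It does not provide the compiler directive asked for
above; the fence still constrains the compiler in both directions.

```cpp
#include <atomic>
#include <cassert>

namespace sketch {

// Hypothetical bit encoding; rel = ssb|slb and acq = hsb|hlb are assumed.
enum : unsigned {
    msync_none = 0,
    msync_ssb  = 1u << 0,
    msync_slb  = 1u << 1,
    msync_hsb  = 1u << 2,
    msync_hlb  = 1u << 3,
    msync_rel  = msync_ssb | msync_slb,
    msync_acq  = msync_hsb | msync_hlb
};

// Illustration-only counter of how many fences were issued.
inline int& fence_count() { static int n = 0; return n; }

// Bidirectional barrier; requires a "before" and an "after" component.
inline void barrier(unsigned m) {
    assert((m & msync_rel) != 0 && (m & msync_acq) != 0);
    ++fence_count();
    // Conservative stand-in: a full fence regardless of the exact mode.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Load with mode m, issuing a barrier only on the sides m constrains.
template<class T>
T atomic_load(unsigned m, const std::atomic<T>* addr) {
    if (m & msync_rel)
        barrier((m & msync_rel) | msync_hlb);   // before the load

    T r = addr->load(std::memory_order_relaxed);

    if (m & msync_acq)
        barrier((m & msync_acq) | msync_slb);   // after the load

    return r;
}

} // namespace sketch
```

With this shape an unconstrained load issues no fence, an acquire load
issues one after the load, and a load with rel|acq issues one on each
side.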

Right?

regards,
alexander.



