[cpp-threads] modes, interlude

Tue May 10 10:26:16 BST 2005

On 5/10/05, Peter Dimov <pdimov at mmltd.net> wrote:
> Doug Lea wrote:
> 
> > If we are going to do this, is there any reason not to go all the way and
> > have four flavors of explicit barrier functions in the atomic classes
> 
> We will.
> 
> template<class M> void barrier( M msync );
> 
> > and not bother adding modes to load, store etc?
> 
> A library can't fold the explicit barrier() calls into the preceding or
> subsequent atomic operation, so providing only barrier() may be inefficient
> for the first ten years or so. :-)

I don't see how unidirectional constraints can be separated from
the associated atomic operation.

> 
> On platforms where a bidirectional barrier is the natural primitive, the
> modes can easily be implemented in terms of barriers:
> 
> template<class M, class T> T atomic_load( M msync, T * addr )
> {
>    barrier( (msync & msync_rel) | msync_hlb );
> 
>    T r = _load_none( addr );
> 
>    barrier( (msync & msync_acq) | msync_slb );
> 
>    return r;
> }
> 
> ("constraint algebra" at work.)

Yep. barrier() must have a precondition
true == ((msync & msync_rel) && (msync & msync_acq)), so to speak:
at least one bit from each of rel and acq.

ssb|hsb	-> StoreStore		// eieio or {lw}sync
ssb|hlb	-> StoreLoad		// sync
slb|hsb	-> LoadStore		// {lw}sync
slb|hlb	-> LoadLoad		// {lw}sync
rel|hsb	-> StoreStore+LoadStore	// {lw}sync
rel|hlb	-> StoreLoad+LoadLoad	// sync
ssb|acq	-> StoreStore+StoreLoad	// sync
slb|acq	-> LoadStore+LoadLoad	// {lw}sync
rel|acq	-> Sledgehammer proper 	// sync
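
The table above can be read as a cross product: each "sink" bit (ssb, slb) pairs with each "hoist" bit (hsb, hlb) to give one of the four classic orderings. A minimal sketch of that reading follows. The bit values, the function names required_orderings(), barrier_precondition() and ppc_fence(), and the reading of ssb/slb/hsb/hlb in the comments are assumptions made for illustration, not part of the proposal; only rel = ssb|slb and acq = hsb|hlb are implied by the table and by the (msync & msync_rel) expression quoted earlier.

```cpp
#include <cassert>
#include <string>

// Hypothetical bit encoding; the thread does not specify one.
enum : unsigned {
    msync_ssb = 1u << 0,  // assumed: preceding stores may not sink below
    msync_slb = 1u << 1,  // assumed: preceding loads may not sink below
    msync_hsb = 1u << 2,  // assumed: subsequent stores may not hoist above
    msync_hlb = 1u << 3,  // assumed: subsequent loads may not hoist above
    msync_rel = msync_ssb | msync_slb,
    msync_acq = msync_hsb | msync_hlb
};

// The four classic two-sided orderings named in the table.
enum : unsigned {
    StoreStore = 1u << 0,
    StoreLoad  = 1u << 1,
    LoadStore  = 1u << 2,
    LoadLoad   = 1u << 3
};

// barrier() needs at least one bit from each side.
inline bool barrier_precondition(unsigned msync) {
    return (msync & msync_rel) != 0 && (msync & msync_acq) != 0;
}

// Cross product of the sink side and the hoist side.
inline unsigned required_orderings(unsigned msync) {
    unsigned r = 0;
    if ((msync & msync_ssb) && (msync & msync_hsb)) r |= StoreStore;
    if ((msync & msync_ssb) && (msync & msync_hlb)) r |= StoreLoad;
    if ((msync & msync_slb) && (msync & msync_hsb)) r |= LoadStore;
    if ((msync & msync_slb) && (msync & msync_hlb)) r |= LoadLoad;
    return r;
}

// Instruction choice following the table's right-hand column:
// anything that needs StoreLoad takes sync, everything else lwsync.
// (The table also allows eieio for pure StoreStore; not modelled here.)
inline std::string ppc_fence(unsigned msync) {
    unsigned r = required_orderings(msync);
    if (r == 0) return "";  // precondition not met, nothing to order
    return (r & StoreLoad) ? "sync" : "lwsync";
}
```

For example, required_orderings(msync_rel | msync_hlb) yields StoreLoad | LoadLoad, which matches the rel|hlb row and selects sync.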

Note that there is no need to overly constrain compiler reordering
across unidirectionally constrained atomic operations merely 
because hardware/lib-impl is using bidirectional fences. So,

template<class M, class T> T atomic_load( M msync, T * addr )
compiler_reordering_constraint_across_this_function( M )
{

   if ( msync & msync_rel )
     barrier( (msync & msync_rel) | msync_hlb );

   T r = _compiler_reordering_constraint( addr );

   if ( msync & msync_acq )
     barrier( (msync & msync_acq) | msync_slb );

   return r;
}

or something like that, with some directive telling the compiler to
relax ordering with respect to "outside" code in spite of the
bidirectional fences used internally.
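
For comparison, here is a sketch of the same distinction written against the C++11 <atomic> library, which postdates this thread and is not the interface under discussion. The helper names load_acq() and load_then_fence() are hypothetical. The first attaches the unidirectional constraint to the load itself; the second builds at least the same guarantee from an unordered load followed by a standalone fence, which is closer to the barrier()-based implementation quoted above.

```cpp
#include <atomic>
#include <cassert>

// Constraint attached to the operation: only accesses that follow the
// load in program order are kept from moving above it.
template <class T>
T load_acq(const std::atomic<T>& a) {
    return a.load(std::memory_order_acquire);
}

// Relaxed load followed by a standalone acquire fence. The fence takes
// effect through the atomic load sequenced before it, and it orders
// more than the acquire load alone requires.
template <class T>
T load_then_fence(const std::atomic<T>& a) {
    T r = a.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return r;
}
```

Both functions return the same value; the difference lies only in how much reordering freedom is left to the compiler and hardware around the call.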

Right?

regards,
alexander.