[cpp-threads] memory model

Sun May 1 16:15:54 BST 2005

Doug Lea wrote:
> But to get back to the main point, the basic question here is,
> Do these special msync forms describe properties that are already in
> the code itself? If so, someday those forms will seem just
> as quaint as "register" qualifiers.
>
> Or, in other words, are there cases where a sufficiently good
> optimizer that did not use the special forms would NOT be able to
> weaken a barrier because of upcoming dependent loads, etc?

I can think of no interesting cases where an optimizer can turn acq into 
ddacq or ccacq.

A ccacq or ddacq gives the optimizer permission to weaken the barrier, 
permission that the ordinary acquire does not grant.

The optimizer is simply not allowed to weaken an ordinary acquire on its 
own; it has to honor it.

Consider:

if( int * p = atomic_load_acq( &x ) )
{
    f( *p );
}

g( y );

The acquire forces the compiler to insert a barrier that prevents the read 
of y from migrating above the load.
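
For illustration only, here is a rough sketch of the acquire case in terms of std::atomic, which postdates this discussion. It assumes memory_order_acquire as a stand-in for atomic_load_acq; the writer side, payload, and the function names are hypothetical and not part of the original example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical globals standing in for x and y from the example above.
std::atomic<int*> x{nullptr};
int y = 0;
int payload = 0;

// Writer: fill in the data, then publish the pointer with release.
void publish() {
    payload = 42;
    y = 7;
    x.store(&payload, std::memory_order_release);
}

// Reader: spin until the pointer is visible, then read *p and y.
// The acquire load orders BOTH reads after it, including the read of y,
// which does not depend on p at all.
int reader_acq() {
    int* p;
    while (!(p = x.load(std::memory_order_acquire))) {
        std::this_thread::yield();
    }
    int sum = *p;  // plays the role of f( *p )
    sum += y;      // plays the role of g( y )
    return sum;
}
```

Because the load is an acquire, the independent read of y is guaranteed to see the writer's value once the pointer has been observed.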

if( int * p = atomic_load_ddacq( &x ) )
{
    f( *p );
}

g( y );

The ddacq frees the compiler from having to order the non-dependent read 
of y; it only has to deal with the read of *p, and only if necessary 
(on Alpha, for example).
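
As a hedged sketch of the same idea with std::atomic: memory_order_consume is the closest later analogue of ddacq that I am aware of, though current compilers typically just promote it to acquire and newer standards discourage it. Node, head, and the function names below are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical payload type and publication point.
struct Node { int value; };

std::atomic<Node*> head{nullptr};
Node node{0};

// Writer: initialize the node, then publish it with release.
void publish_node() {
    node.value = 42;
    head.store(&node, std::memory_order_release);
}

// Reader: only the access through p is ordered after the load, because
// it carries a data dependency from it. A read of some unrelated
// variable here would NOT be ordered, which is exactly the freedom
// ddacq is meant to grant.
int read_dependent() {
    Node* p;
    while (!(p = head.load(std::memory_order_consume))) {
        std::this_thread::yield();
    }
    return p->value;
}
```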

This maps well to hardware because most CPUs already track these 
dependencies.

In Sparc-like terms, load_acq is

    load
    #LoadLoad | #LoadStore

load_ccacq is

    load
    #LoadLoad

because control-dependent stores are implicitly ordered, and load_ddacq is

    load

because data-dependent accesses are implicitly ordered.
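
The three mappings above could be sketched as wrappers over std::atomic, under some loud assumptions: the wrapper names follow this post but the bodies are mine; an acquire fence is used as the portable counterpart of #LoadLoad | #LoadStore; there is no portable way to ask for #LoadLoad alone, so load_ccacq conservatively falls back to a full acquire; and memory_order_consume stands in for the bare load.

```cpp
#include <atomic>

// load; #LoadLoad | #LoadStore
// A relaxed load followed by an acquire fence gives acquire semantics.
template <class T>
T load_acq(const std::atomic<T>& a) {
    T v = a.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    return v;
}

// load; #LoadLoad
// Assumption: no portable equivalent exists, so this is stronger than
// needed and simply uses acquire.
template <class T>
T load_ccacq(const std::atomic<T>& a) {
    return a.load(std::memory_order_acquire);
}

// load
// Relies on dependency ordering only; compilers may still emit acquire.
template <class T>
T load_ddacq(const std::atomic<T>& a) {
    return a.load(std::memory_order_consume);
}
```

The conservative fallback in load_ccacq shows the asymmetry discussed above: an implementation may always strengthen a weaker form to acquire, but it may not go the other way without being told.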

(I hope I got these right.)