[cpp-threads] A question about N2153

Peter Dimov pdimov at mmltd.net
Wed Jan 17 18:44:00 GMT 2007


What is the reference implementation of acquire_fence on x86, SPARC RMO, 
PowerPC, and IA-64? I'm guessing, respectively: a no-op, 
membar #LoadLoad | #LoadStore, lwsync, and mf.

If we take the example:

if( fetchadd_release( &ref_count, -1 ) == 1 ) // old value
{
    acquire_fence();
    destroy_object();
}

This is fine on x86 and SPARC. On PowerPC, however, the most efficient 
implementation would use an isync instead of an lwsync, right? On IA-64, an 
ld.acq from ref_count may also be more efficient than an mf.
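
For concreteness, here is a minimal sketch of the fence-based example. It 
assumes the std::atomic interface that was standardized later (C++11) as a 
stand-in for the N2153 names: fetch_sub with memory_order_release plays the 
role of fetchadd_release, and std::atomic_thread_fence( memory_order_acquire ) 
plays the role of acquire_fence. Object, destroyed and release_ref are 
hypothetical names used only for illustration.

```cpp
#include <atomic>

// Hypothetical reference-counted object; not part of N2153.
struct Object {
    std::atomic<int> ref_count;
    explicit Object(int refs) : ref_count(refs) {}
};

// Counts how many times the destruction path ran (illustration only).
std::atomic<int> destroyed{0};

// Drop one reference; returns true if this call destroyed the object.
bool release_ref(Object* p) {
    // fetch_sub returns the old value, like fetchadd_release( &ref_count, -1 ).
    if (p->ref_count.fetch_sub(1, std::memory_order_release) == 1) {
        // Stand-in for acquire_fence(): keeps the destruction from being
        // ordered before the other owners' released writes.
        std::atomic_thread_fence(std::memory_order_acquire);
        delete p;  // stand-in for destroy_object()
        destroyed.fetch_add(1, std::memory_order_relaxed);
        return true;
    }
    return false;
}
```

With two owners, the first release_ref call returns false and the second 
returns true. How the fence maps to instructions on each architecture is 
exactly the open question above, so the sketch says nothing about that.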

Since a load_acquire( location ) on PowerPC is (pseudocode):

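mov r1, location   # load the value at location into r1
cmp r1, r1         # always equal, but the result depends on the load
bne- $             # never taken, but depends on the compare
isync              # later instructions wait until the branch resolves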

(correct? :-) ) it seems to me that a more efficient formulation of the 
example would be:

if( fetchadd_release( &ref_count, -1 ) == 1 ) // old value
{
    load_acquire( &ref_count );
    destroy_object();
}

The specification does not guarantee that this works, but I think that it 
will work in practice on all implementations (and the extra load can be 
optimized out everywhere except IA-64). If I'm right, this might pose a 
problem: it's not good if something not backed by the official specification 
works better in practice.
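
A minimal sketch of the load_acquire variant, again assuming the later 
std::atomic interface as a stand-in for the N2153 names (a discarded load 
with memory_order_acquire in place of load_acquire). Node, destroyed_nodes 
and release_ref_with_load are hypothetical names. The sketch only shows the 
shape of the code; it does not settle whether the draft guarantees the 
ordering.

```cpp
#include <atomic>

// Hypothetical reference-counted object; not part of N2153.
struct Node {
    std::atomic<int> ref_count;
    explicit Node(int refs) : ref_count(refs) {}
};

// Counts how many times the destruction path ran (illustration only).
std::atomic<int> destroyed_nodes{0};

// Drop one reference; returns true if this call destroyed the object.
bool release_ref_with_load(Node* p) {
    if (p->ref_count.fetch_sub(1, std::memory_order_release) == 1) {
        // Stand-in for load_acquire( &ref_count ): the value is discarded,
        // only the acquire ordering of the load is wanted.
        (void)p->ref_count.load(std::memory_order_acquire);
        delete p;  // stand-in for destroy_object()
        destroyed_nodes.fetch_add(1, std::memory_order_relaxed);
        return true;
    }
    return false;
}
```

The only difference from the fence version is the single line inside the 
if, which is what lets an implementation pick isync or ld.acq instead of a 
full fence instruction.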