[cpp-threads] A question about N2153
Peter Dimov
pdimov at mmltd.net
Wed Jan 17 18:44:00 GMT 2007
What is the reference implementation of acquire_fence for x86, SPARC RMO,
PowerPC, and IA-64? I'm guessing (no-op), #LoadLoad | #LoadStore, lwsync, and
mf, respectively.
If we take the example:
    if( fetchadd_release( &ref_count, -1 ) == 1 ) // old value
    {
        acquire_fence();
        destroy_object();
    }
This is fine on x86/SPARC. On PowerPC, however, the most efficient
implementation has an isync instead of an lwsync, right? On IA-64, an ld.acq
from ref_count may also be more efficient than an mf.
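
For concreteness, here is a minimal sketch of the fence-based example above. It assumes the std::atomic interface that was standardized after this post, not the N2153 names: fetch_sub with memory_order_release stands in for fetchadd_release, and atomic_thread_fence with memory_order_acquire stands in for acquire_fence. Object, destroyed_flag, and release_ref are hypothetical names used only for illustration.

```cpp
#include <atomic>

// Hypothetical reference-counted object; the names are illustrative only.
struct Object {
    std::atomic<int> ref_count{1};
    bool* destroyed_flag = nullptr;  // lets a caller observe destruction
};

void destroy_object(Object* p) {
    if (p->destroyed_flag) *p->destroyed_flag = true;
    delete p;
}

// Returns true if this call dropped the last reference.
bool release_ref(Object* p) {
    // fetch_sub returns the old value, like fetchadd_release(&ref_count, -1).
    if (p->ref_count.fetch_sub(1, std::memory_order_release) == 1) {
        // Stand-in for acquire_fence(): order the destruction after the
        // release decrements performed by the other owners.
        std::atomic_thread_fence(std::memory_order_acquire);
        destroy_object(p);
        return true;
    }
    return false;
}
```

Which instruction the fence compiles to on each architecture is up to the implementation; the sketch only fixes the ordering being requested.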
Since a load_acquire( location ) on PowerPC is (pseudocode):
    mov r1, location
    cmp r1, r1
    bne- $
    isync
(correct? :-) ) it seems to me that a more efficient formulation of the
example would be:
    if( fetchadd_release( &ref_count, -1 ) == 1 ) // old value
    {
        load_acquire( &ref_count );
        destroy_object();
    }
This is not guaranteed to work in theory, but I think that it will work in
practice on all implementations (and the extra load can be optimized out on
non-IA-64). If I'm right, this might pose a problem. It's not good if
something not backed by the official specification works better in practice.
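
A minimal sketch of the load_acquire variant, again assuming the later std::atomic interface in place of the N2153 names (an acquire load of ref_count stands in for load_acquire). Counted and release_with_load are hypothetical names. The sketch only shows the shape of the code; it does not settle whether a given memory model guarantees the ordering, which is the question raised above.

```cpp
#include <atomic>

// Hypothetical reference-counted object; the names are illustrative only.
struct Counted {
    std::atomic<int> ref_count{1};
    int payload = 0;
};

// Returns true if this call dropped the last reference and freed the object.
bool release_with_load(Counted* p) {
    // fetch_sub returns the old value, like fetchadd_release(&ref_count, -1).
    if (p->ref_count.fetch_sub(1, std::memory_order_release) == 1) {
        // Acquire load of the same location instead of a standalone fence.
        // The loaded value is deliberately ignored.
        (void)p->ref_count.load(std::memory_order_acquire);
        delete p;  // stands in for destroy_object()
        return true;
    }
    return false;
}
```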