[cpp-threads] Brief example ITANIUM Implementation for C/C++ MemoryModel

Hans Boehm Hans.Boehm at hp.com
Tue Dec 30 22:40:37 GMT 2008



On Sat, 27 Dec 2008, Alexander Terekhov wrote:

> On Fri, Dec 26, 2008 at 8:40 AM, Hans Boehm <Hans.Boehm at hp.com> wrote:
>>
>> On Wed, 24 Dec 2008, Peter Dimov wrote:
>>
>>>> Load Seq_Cst:  mf,ld.acq
>>>
>>> I think that the intent of the seq_cst spec was to allow a single ld.acq
>>> here (and a simple MOV on x86).
>>>
>>
>> Yes.  For both Itanium and X86.  Sorry.  I overlooked that.
>
> I disagree. ld.acq is load acquire, not seq_cst. Load seq_cst ought to
> impose an extra leading store-load barrier (same as with trailing
> store load barrier for seq_cst stores vs. st.rel). In the case of
> adjacent seq_cst operations, redundant store-load fencing can be
> optimized out by the compiler/implementation. Think of mixing seq_cst
> operations with relaxed and/or acquire/release ones.
>
The seq_cst operations must effectively ensure that data-race-free
programs using only seq_cst operations behave sequentially consistently.
In a data-race-free program, you cannot tell whether an ordinary load and
a following seq_cst load have been reordered.  Nor can you tell whether a
seq_cst atomic store and a following ordinary store have been reordered.

Thus the only reason to add fences before a seq_cst load or after a 
seq_cst store is to prevent reordering of a seq_cst store followed by a 
seq_cst load.  For that, it suffices to do one or the other; you don't 
need both.  It's usually cheaper to add the fence only for stores, though 
PowerPC has other constraints.  If you see two consecutive seq_cst stores, 
you only need the fence following the second.
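
As a rough illustration of that fence placement, here is a minimal C++
sketch of the store-buffering case (a seq_cst store followed by a seq_cst
load in each of two threads).  It is not one of the recipes from this
thread: the function name count_both_zero is made up, and spelling the
store as a release store plus a trailing seq_cst fence, with a plain
acquire load, is only an assumed source-level analogue of "st.rel; mf" and
"ld.acq".  It covers this one pattern and is not claimed to be a general
replacement for seq_cst operations.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper, for illustration only.  Runs the store-buffering
// pattern `iterations` times and returns how often both threads read 0,
// the one outcome sequential consistency forbids here.
int count_both_zero(int iterations) {
    assert(iterations >= 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            // "seq_cst store": release store, then a trailing full fence.
            x.store(1, std::memory_order_release);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            // "seq_cst load": acquire load with no leading fence.
            r1 = y.load(std::memory_order_acquire);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_release);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_acquire);
        });

        t1.join();
        t2.join();
        // The two fences are totally ordered, so at least one thread
        // must observe the other thread's store.
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Each thread has exactly one full fence, sitting between its store and its
load.  Adding a second, leading fence in front of the load would not rule
out any further outcome in this pattern.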

Note that if we got a chance to adjust the hardware, the trailing fence
for a seq_cst store is much more than we need, and we could probably make
it significantly cheaper.  These fences only have to order the preceding
store with respect to a subsequent ATOMIC load.

I think all of the recipes you have been posting need adjustments to
remove the redundant leading fences for seq_cst loads.

Hans


