[cpp-threads] Brief example ITANIUM Implementation for C/C++ MemoryModel
Hans Boehm
Hans.Boehm at hp.com
Tue Dec 30 22:40:37 GMT 2008
On Sat, 27 Dec 2008, Alexander Terekhov wrote:
> On Fri, Dec 26, 2008 at 8:40 AM, Hans Boehm <Hans.Boehm at hp.com> wrote:
>>
>> On Wed, 24 Dec 2008, Peter Dimov wrote:
>>
>>>> Load Seq_Cst: mf,ld.acq
>>>
>>> I think that the intent of the seq_cst spec was to allow a single ld.acq
>>> here (and a simple MOV on x86).
>>>
>>
>> Yes. For both Itanium and X86. Sorry. I overlooked that.
>
> I disagree. ld.acq is load acquire, not seq_cst. Load seq_cst ought to
> impose an extra leading store-load barrier (same as with trailing
> store load barrier for seq_cst stores vs. st.rel). In the case of
> adjacent seq_cst operations, redundant store-load fencing can be
> optimized out by the compiler/implementation. Think of mixing seq_cst
> operations with relaxed and/or acquire/release ones.
>
The seq_cst operations must effectively ensure that data-race-free
programs using only seq_cst operations behave sequentially consistently.
In a data-race-free program, you cannot tell whether an ordinary load
followed by a seq_cst load has been reordered. Neither can you tell
whether a seq_cst atomic store followed by an ordinary store has been
reordered.

Thus the only reason to add fences before a seq_cst load or after a
seq_cst store is to prevent reordering of a seq_cst store followed by a
seq_cst load. For that, it suffices to do one or the other; you don't
need both. It's usually cheaper to add the fence only for stores, though
PowerPC has other constraints. If you see two consecutive seq_cst stores,
you only need the fence following the second.
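
The store-then-load case is the classic store-buffering (Dekker-style)
test. Here is a minimal sketch of it, assuming C++0x-style std::atomic and
std::thread; the names Outcome, run_store_buffering_trial and
count_forbidden_outcomes are made up for illustration. With seq_cst on all
four operations, the outcome r1 == 0 && r2 == 0 must never occur, and that
is exactly the reordering the single fence has to rule out.

```cpp
#include <atomic>
#include <thread>

// Result of one trial: what each thread read from the other's variable.
struct Outcome {
    int r1;
    int r2;
};

// One run of the store-buffering test. Each thread does a seq_cst store
// to its own variable, then a seq_cst load of the other thread's variable.
Outcome run_store_buffering_trial() {
    std::atomic<int> x(0);
    std::atomic<int> y(0);
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);   // seq_cst store ...
        r1 = y.load(std::memory_order_seq_cst);  // ... followed by seq_cst load
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    // join() synchronizes with thread completion, so reading r1/r2
    // afterwards is race-free.
    t1.join();
    t2.join();
    return Outcome{r1, r2};
}

// Counts trials in which both loads saw 0. Sequential consistency forbids
// that outcome, so a conforming implementation should always return 0.
int count_forbidden_outcomes(int trials) {
    int forbidden = 0;
    for (int i = 0; i < trials; ++i) {
        Outcome o = run_store_buffering_trial();
        if (o.r1 == 0 && o.r2 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

A run that returns zero proves nothing by itself, of course; the test only
shows which outcome the fence placement has to exclude. Weakening the
stores to release and the loads to acquire would make the (0, 0) outcome
permitted.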

Note that the trailing fence for a seq_cst store is much more than we
need. If we actually got a chance to adjust the hardware, we could
probably make this significantly cheaper. The fence only has to order
the preceding store with respect to a subsequent ATOMIC load.

I think all of the recipes you have been posting need adjustments to
remove the redundant leading fences for seq_cst loads.
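
For Itanium, the adjusted recipe might be sketched roughly as follows.
This is a toy lowering over a sequence containing only seq_cst operations;
the Op enum and lower_for_itanium are hypothetical names, and the output
is just mnemonic strings, not a real code generator. It assumes the
mapping discussed in this thread: a seq_cst load becomes a plain ld.acq,
a seq_cst store becomes st.rel followed by mf, and the mf is dropped when
the next operation is another seq_cst store.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operation kinds; only seq_cst accesses are modeled here.
enum class Op { LoadSeqCst, StoreSeqCst };

// Lowers a straight-line sequence of seq_cst operations to Itanium
// mnemonics, placing fences only after stores.
std::vector<std::string> lower_for_itanium(const std::vector<Op>& ops) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < ops.size(); ++i) {
        if (ops[i] == Op::LoadSeqCst) {
            // No leading mf: the trailing fence on stores already keeps
            // an earlier seq_cst store from passing this load.
            out.push_back("ld.acq");
        } else {
            out.push_back("st.rel");
            // Of two consecutive seq_cst stores, only the second needs
            // the trailing fence.
            bool next_is_store =
                i + 1 < ops.size() && ops[i + 1] == Op::StoreSeqCst;
            if (!next_is_store) {
                out.push_back("mf");
            }
        }
    }
    return out;
}
```

Under these assumptions, lowering store, store, load gives st.rel, st.rel,
mf, ld.acq. A real implementation would also have to account for
intervening non-seq_cst operations and control flow, which this sketch
ignores.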
Hans