[cpp-threads] Brief example ITANIUM Implementation for C/C++ MemoryModel
Alexander Terekhov
alexander.terekhov at gmail.com
Fri Jan 2 16:12:48 GMT 2009
On Tue, Dec 30, 2008 at 11:40 PM, Hans Boehm <Hans.Boehm at hp.com> wrote:
>
> On Sat, 27 Dec 2008, Alexander Terekhov wrote:
>
>> On Fri, Dec 26, 2008 at 8:40 AM, Hans Boehm <Hans.Boehm at hp.com> wrote:
>>>
>>> On Wed, 24 Dec 2008, Peter Dimov wrote:
>>>
>>>>> Load Seq_Cst: mf,ld.acq
>>>>
>>>> I think that the intent of the seq_cst spec was to allow a single ld.acq
>>>> here (and a simple MOV on x86).
>>>>
>>>
>>> Yes. For both Itanium and X86. Sorry. I overlooked that.
>>
>> I disagree. ld.acq is load acquire, not seq_cst. Load seq_cst ought to
>> impose an extra leading store-load barrier (same as with the trailing
>> store-load barrier for seq_cst stores vs. st.rel). In the case of
>> adjacent seq_cst operations, redundant store-load fencing can be
>> optimized out by the compiler/implementation. Think of mixing seq_cst
>> operations with relaxed and/or acquire/release ones.
>>
> The seq_cst operations must effectively ensure that data-race-free
> programs using only seq_cst operations behave sequentially consistently.
> In a data-race-free program, you cannot tell if an ordinary load followed
> by a seq_cst load are reordered. Neither can you tell if a seq_cst atomic
> store followed by an ordinary store are reordered.
>
> Thus the only reason to add fences before a seq_cst load or after a
> seq_cst store is to prevent reordering of a seq_cst store followed by a
> seq_cst load. For that, it suffices to do one or the other; you don't
> need both.
That would be the case if C/C++ offered only seq_cst atomic
operations without any weaker ones (I mean release/acquire/relaxed).
But that is not the case under the current draft. So my interpretation
is based on simple reasoning: seq_cst means fully fenced (with
redundant fencing being eligible for removal by optimizers).
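To make the point concrete, here is a store-buffering (Dekker-style)
litmus test written in the draft std::atomic spelling (just a sketch,
not code taken from the draft). With every operation seq_cst, the
outcome r1 == 0 && r2 == 0 must be forbidden; the question above is
which fences in the per-operation mappings are responsible for
forbidding it, and what is left once some operations are weakened to
acquire/release or relaxed.

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};
int r1, r2;

void thread1() {
    // store mapping under debate: Itanium st.rel (+ trailing mf?),
    // Power hwsync; st (vs. lwsync; st; hwsync)
    x.store(1, std::memory_order_seq_cst);
    // load mapping under debate: Itanium ld.acq (+ leading mf?),
    // Power hwsync; ld; cmp; bc; isync
    r1 = y.load(std::memory_order_seq_cst);
}

void thread2() {
    y.store(1, std::memory_order_seq_cst);
    r2 = x.load(std::memory_order_seq_cst);
}

int main() {
    std::thread a(thread1), b(thread2);
    a.join();
    b.join();
    assert(r1 == 1 || r2 == 1); // both loads reading 0 would violate seq_cst
}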
> It's usually cheaper to add the fence only for stores, though
> PowerPC has other constraints. If you see two consecutive seq_cst stores,
> you only need the fence following the second.
Umm. Under
http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2008.12.16a.html
the seq_cst store mapping is
"Store Seq Cst hwsync; st"
and for two consecutive seq_cst stores this results in
hwsync; st, hwsync; st
without any fence following the second store. Instead,
http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2008.12.16a.html
prescribes a leading hwsync for loads:
"Load Seq Cst hwsync; ld; cmp; bc; isync"
which, with respect to store-load fencing, is quite similar to Itanium's
"mf,ld.acq"
Under my interpretation, "Store Seq Cst hwsync; st" is not strong
enough and should instead be
"Store Seq Cst lwsync; st; hwsync"
which, with respect to store-load fencing, is quite similar to Itanium's
"st.rel, mf"
So are you in agreement with me or with
http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2008.12.16a.html
<?>
<double wink>
>
> Note that if we actually got a chance to adjust the hardware, the trailing
> fence for a seq_cst store is actually much more than we need, and we could
> probably make this significantly cheaper. These fences only have to
> order the preceding store with respect to a subsequent ATOMIC load.
Ha! Let's consider seq_cst cmpxchg for Power...
http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2008.12.16a.html
indicates something along the lines of
"hwsync; ldarx; cmp; bc _exit; stcwx; bc _loop; isync"
Under my interpretation, this is not strong enough either and should instead be
"hwsync; ldarx; cmp; bc _exit; stwcx.; bc _loop; hwsync"
(with _exit on compare failure performing isync rather than hwsync)
which, with respect to store-load fencing, is quite similar to Itanium's
"mf, cmpxchg.acq, mf"
(in the compare-success case)
On Itanium, we could as well do
"mf, cmpxchg.rel, mf"
Do you agree?
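For reference, the CAS under discussion in the same sketch form, with
the two candidate Power sequences and the Itanium sequences from above
as comments:

#include <atomic>

std::atomic<int> v{0};

bool cas_seq_cst(int expected, int desired) {
    // N2745:          hwsync; ldarx; cmp; bc _exit; stwcx.; bc _loop; isync
    // proposed above: hwsync; ldarx; cmp; bc _exit; stwcx.; bc _loop; hwsync
    //                 (isync instead of hwsync on the compare-failure exit)
    // Itanium, compare-success case: mf; cmpxchg.acq; mf (or mf; cmpxchg.rel; mf)
    return v.compare_exchange_strong(expected, desired,
                                     std::memory_order_seq_cst);
}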
>
> I think all of the recipes you have been posting need adjustments to
> remove the redundant leading fences for seq_cst loads.
I disagree.
regards,
alexander.