[cpp-threads] [Javamemorymodel-discussion] there's a happens-before order here. right?

Mon Dec 15 22:04:19 GMT 2008

On Mon, Dec 15, 2008 at 9:38 PM, Boehm, Hans <hans.boehm at hp.com> wrote:
>
> [I think all of this is relevant to Java only in that vaguely similar
> extensions have been considered there at times.  Most of it
> doesn't currently translate, in spite of the fact that the basic
> memory models are similar.]
>
>> From: Alexander Terekhov [mailto:alexander.terekhov at gmail.com]
>> On Sat, Dec 13, 2008 at 2:04 AM, Boehm, Hans
>> <hans.boehm at hp.com> wrote:
>> [...]
>> >>
>> http://www.hpl.hp.com/personal/Hans_Boehm/c++mm/threadsintro.html#examples
>> >>
>> >> would you label the following C++0x program
>> >>
>> >>     int data; //= 0
>> >>     atomic<int> x; //= 0
>> >>
>> >>     thread 1:
>> >>     ------------
>> >>     data = 1;
>> >>     x.store(1, release);
>> >>
>> >>     thread 2:
>> >>     ------------
>> >>     if (x.load(relaxed))
>> >>       data = 2;
>> >>
>> >> data-race-free or not? Why? TIA.
>> >>
>> > This is well in the "escapes from sequential consistency" category,
>> > and it doesn't currently have a Java analog.
>>
>> Yes. I'm actually driving at
>>
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html
>>
>> > As I said, I was really trying to steer most people well away from
>> > this kind of code.
>>
>> The problem is that std::atomic<> in SC mode is utterly
>> expensive on POWER. Even in acquire-release mode it will
>> inject way too much totally redundant (lw/i)sync(hronization).
>
> I agree that this can be an issue on some current architectures,
> as it is with Java volatiles.  But the above example can be fixed
> by either:
>
> 1) Using an acquire load, or

using an atomic_thread_fence(memory_order_acquire)
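
For concreteness, here is a minimal sketch of the fence variant, written
against the final C++11 spelling of the API rather than the draft
shorthand used above. The function names (writer, reader, run_once) are
made up for illustration and are not from the original example. The
assumption is that a relaxed load which observes the release store,
followed by an acquire fence, is enough to order "data = 1" before
"data = 2".

```cpp
#include <atomic>
#include <thread>

int data = 0;               // plain, non-atomic variable
std::atomic<int> x{0};      // flag published by thread 1

// Thread 1: write data, then publish with a release store.
void writer() {
    data = 1;
    x.store(1, std::memory_order_release);
}

// Thread 2: relaxed load, then an acquire fence before touching data.
// If the load saw the release store, the fence synchronizes with it,
// so "data = 1" happens before "data = 2" and there is no data race.
void reader() {
    if (x.load(std::memory_order_relaxed)) {
        std::atomic_thread_fence(std::memory_order_acquire);
        data = 2;
    }
}

// Hypothetical driver: runs both threads once and returns the final data.
int run_once() {
    data = 0;
    x.store(0, std::memory_order_relaxed);
    std::thread t1(writer);
    std::thread t2(reader);
    t1.join();
    t2.join();
    return data;  // 1 if the reader saw x == 0, otherwise 2
}
```

Placing the fence inside the branch means it is only paid for on the
path that actually writes data.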

[...]

>> > if (r1 = x.load(memory_order_relaxed)) {
>> >    data.store(2, memory_order_relaxed); } else {
>> >    data.store(2, memory_order_relaxed); }
>> >
>> > to be ordered, for example.
>>
>> Why not?
>
> Any compiler that cares about code size is likely to transform this to
>
> r1 = x.load(memory_order_relaxed);
> data.store(2, memory_order_relaxed);
>
> and go from there, reordering at will, and allowing the hardware to reorder.
>
> Preventing this is hard, since the actual code may look like
>
> inside f():
>
>  r1 = x.load(memory_order_relaxed);
>  g(r1);
>
> inside g(), in a separate compilation unit:

You're using a term not defined in the current C++ standard
("compilation unit").

FWIW, it is fine to restrict reordering of

if (r1 = x.load(memory_order_relaxed)) {
   data.store(2, memory_order_relaxed); } else {
   data.store(2, memory_order_relaxed); }

only in the same "scope". ;-)
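
To make the transformation under discussion concrete, here is a small
sketch of the two forms side by side. The function names branchy and
merged are hypothetical, and data is assumed to be an atomic<int>, as
the data.store(...) calls in the quoted fragment suggest. Both
functions have the same single-threaded behaviour; the only difference
is whether the store is syntactically control-dependent on the load.

```cpp
#include <atomic>

std::atomic<int> x{0};
std::atomic<int> data{0};

// Form as written: the store sits under a branch on r1, but both arms
// store the same value.
int branchy() {
    int r1;
    if ((r1 = x.load(std::memory_order_relaxed))) {
        data.store(2, std::memory_order_relaxed);
    } else {
        data.store(2, std::memory_order_relaxed);
    }
    return r1;
}

// Form after merging the identical arms: no control dependency is left
// between the load and the store.
int merged() {
    int r1 = x.load(std::memory_order_relaxed);
    data.store(2, std::memory_order_relaxed);
    return r1;
}
```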

regards,
alexander.