[cpp-threads] seq_cst compare_exchange and store-load fencing

Paul E. McKenney paulmck at linux.vnet.ibm.com
Fri Jan 2 17:55:38 GMT 2009


On Fri, Jan 02, 2009 at 05:23:23PM +0100, Alexander Terekhov wrote:
> compare_exchange performs both load and (conditional) store. This
> leads to questions regarding store-load fencing for compare_exchange
> in seq_cst mode:
> 
> Q1) Does it provide store-load fencing in the case of
> 
>    A.store(relaxed|release) ... B.compare_exchange(..., seq_cst)
> 
> regarding A's store and B's load (in either success or failure case of
> B's compare_exchange)?

The proposed Power implementation provides this, but only by accident.
I do not believe that it is required.  Now, if you instead do:

    A.store(seq_cst) ... B.compare_exchange(..., seq_cst)

then the proposed standard would guarantee the ordering.

Alternatively, place an atomic_thread_fence(seq_cst) between the
relaxed/release store and the compare_exchange.
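
For concreteness, here is a minimal sketch of the two guaranteed forms
as a store-buffering litmus test.  It assumes the C++11-style <atomic>
and <thread> interfaces; the names store_then_cas, count_both_zero, and
Variant are hypothetical and exist only for illustration.  Each thread
stores to its own flag and then uses a seq_cst compare_exchange on the
other flag as a load.  If store-load ordering holds, the two threads
cannot both observe 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus harness; none of these names come from the thread.
enum class Variant { SeqCstStore, RelaxedStoreThenFence };

// Stores 1 to `mine`, then reads `other` through a seq_cst compare_exchange.
// Returns the value the compare_exchange observed in `other`.
inline int store_then_cas(std::atomic<int>& mine, std::atomic<int>& other,
                          Variant v) {
    if (v == Variant::SeqCstStore) {
        mine.store(1, std::memory_order_seq_cst);
    } else {
        mine.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
    }
    int expected = 0;
    // On success 0 is replaced by 0 and `expected` stays 0; on failure the
    // current value of `other` is written into `expected`.
    other.compare_exchange_strong(expected, 0, std::memory_order_seq_cst);
    return expected;
}

// Runs the two-thread test `iterations` times and counts the runs in which
// both threads observed 0, that is, in which store-load ordering was lost.
inline int count_both_zero(Variant v, int iterations) {
    assert(iterations >= 0);
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> a{0}, b{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] { r1 = store_then_cas(a, b, v); });
        std::thread t2([&] { r2 = store_then_cas(b, a, v); });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) ++violations;
    }
    return violations;
}
```

Calling count_both_zero(Variant::SeqCstStore, n) or
count_both_zero(Variant::RelaxedStoreThenFence, n) should return 0 for
any n on a conforming implementation.  A nonzero count would mean that
the ordering was not provided.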

> Q2) Does it provide store-load fencing in the case of
> 
>    B.compare_exchange(..., seq_cst) ... C.load(relaxed|acquire)
> 
> regarding B's store and C's load (in success case of B's compare_exchange)?

The proposed Power implementation provides a weak form of ordering
in this case, but again, only by accident.  To be guaranteed this
ordering, use:

    B.compare_exchange(..., seq_cst) ... C.load(seq_cst)

As before, another approach is to place an atomic_thread_fence(seq_cst)
between the compare_exchange and the relaxed/acquire load.
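
A rough sketch of this second case, again assuming the C++11-style
<atomic> and <thread> interfaces.  The names cas_then_load,
count_both_missed, and LoadVariant are hypothetical.  Each thread
claims its own flag with a seq_cst compare_exchange and then loads the
other flag, either with a seq_cst load or with a seq_cst fence followed
by a relaxed load.  With either form, the two threads cannot both miss
the other's successful compare_exchange.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus harness; none of these names come from the thread.
enum class LoadVariant { SeqCstLoad, FenceThenRelaxedLoad };

// Sets `mine` from 0 to 1 with a seq_cst compare_exchange, then returns the
// value loaded from `other`.
inline int cas_then_load(std::atomic<int>& mine, std::atomic<int>& other,
                         LoadVariant v) {
    int expected = 0;
    bool claimed =
        mine.compare_exchange_strong(expected, 1, std::memory_order_seq_cst);
    assert(claimed);  // each flag has a single writer, so the CAS succeeds
    (void)claimed;    // silence unused-variable warnings under NDEBUG
    if (v == LoadVariant::SeqCstLoad) {
        return other.load(std::memory_order_seq_cst);
    }
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return other.load(std::memory_order_relaxed);
}

// Counts the runs in which both threads loaded 0, that is, in which neither
// thread saw the other's compare_exchange.
inline int count_both_missed(LoadVariant v, int iterations) {
    int violations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> b{0}, c{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] { r1 = cas_then_load(b, c, v); });
        std::thread t2([&] { r2 = cas_then_load(c, b, v); });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) ++violations;
    }
    return violations;
}
```

Both variants should report zero violations.  Replacing the load with a
plain relaxed or acquire load and no fence gives the form from Q2, for
which I do not believe the proposed standard makes any such promise.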

> Under simple interpretation of "seq_cst" meaning "fully-fenced" the
> answer to both questions is "yes"...

But that is not the definition of "seq_cst" in the proposed standard,
at least not as I read it.

> Do you agree with the same outcome under the proposed C/C++ memory model?
> 
> What is your reasoning in case you disagree?

I appeal to the wording of section 29.1 of the proposed standard:

	The enumeration memory_order specifies the detailed regular
	(non-atomic) memory synchronization order as defined in Clause
	1.10 and may provide for operation ordering.  Its enumerated
	values and their meanings are as follows:

	    — memory_order_relaxed: no operation orders memory.
	    — memory_order_release, memory_order_acq_rel, and
	      memory_order_seq_cst: a store operation performs a release
	      operation on the affected memory location.
	    — memory_order_consume: a load operation performs a consume
	      operation on the affected memory location.
	    — memory_order_acquire, memory_order_acq_rel, and
	      memory_order_seq_cst: a load operation performs an acquire
	      operation on the affected memory location.

	There shall be a single total order S on all memory_order_seq_cst
	operations, consistent with the happens before order and
	modification orders for all affected locations, such that each
	memory_order_seq_cst operation that loads a value observes either
	the last preceding modification according to this order S, or
	the result of an operation that is not memory_order_seq_cst. [
	Note: Although it is not explicitly required that S include locks,
	it can always be extended to an order that does include lock and
	unlock operations, since the ordering between those is already
	included in the happens before ordering. — end note ]

None of this requires that seq_cst operations be ordered with respect to
non-seq_cst operations except as required by acquire, consume, and
release semantics.
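
To illustrate the "except as required by acquire, consume, and release
semantics" part, here is a small sketch of ordering that a seq_cst
compare_exchange does give against non-seq_cst operations.  It assumes
the C++11-style <atomic> and <thread> interfaces, and the function name
publish_and_consume is hypothetical.  A successful seq_cst
compare_exchange acts as a release, so an acquire load that observes
its value also sees the plain store made before it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical example: a payload published by a successful seq_cst
// compare_exchange and consumed through an acquire load.
inline int publish_and_consume() {
    int payload = 0;             // plain, non-atomic data
    std::atomic<int> ready{0};   // publication flag
    int seen = -1;

    std::thread producer([&] {
        payload = 42;            // ordinary store, sequenced before the CAS
        int expected = 0;
        // The successful seq_cst CAS performs a release on `ready`.
        bool ok = ready.compare_exchange_strong(expected, 1,
                                                std::memory_order_seq_cst);
        assert(ok);              // `ready` has a single writer
        (void)ok;                // silence unused-variable warnings under NDEBUG
    });

    std::thread consumer([&] {
        // The acquire load that reads 1 synchronizes with the CAS above.
        while (ready.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        seen = payload;          // ordered after `payload = 42`
    });

    producer.join();
    consumer.join();
    return seen;
}
```

This is release/acquire message passing and says nothing about
store-load ordering.  The compare_exchange orders earlier accesses
before it, which is a weaker property than "fully fenced".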

Now I personally have no objection to making seq_cst operations more
expensive, but others might.  ;-)

							Thanx, Paul


