[cpp-threads] Re: Review comments on N2176 WRT dependency ordering

Paul E. McKenney paulmck at linux.vnet.ibm.com
Tue Apr 10 12:39:39 BST 2007


On Mon, Apr 09, 2007 at 05:28:26PM -0700, Hans Boehm wrote:
> On Sun, 8 Apr 2007, Paul E. McKenney wrote:
> 
> > Hello again!
> >
> > I once again thank Hans for his careful description of a number of
> > interesting optimization situations relating to dependency-based
> > ordering.  The following text discusses some possible approaches
> > to resolving these situations.
> >
> > Thoughts?
> >
> > 						Thanx, Paul
> >
> 
> Thanks for posting this.  It would also be good to place a copy on the Oxford
> wiki.  This is an issue that I think we really need to resolve asap.

Will do once I get to a high-bandwidth connection.

> My concern with all of these proposals is that they seem to require major
> compiler and standards changes to accommodate what I think will be perceived
> as a fairly narrow problem.  The problem will be to get both the people who
> would need to write the standardese (nontrivial) and the people who will
> need to implement this (harder) to buy into this sufficiently.

Understood.

> If we do want to address this, I would certainly advocate a formulation that
> allows dependency-based ordering to be dropped at the cost of replacing
> load_relaxed with load_acquire.  Thus I think the implementation overhead
> on X86 would be near zero; any extra syntax could basically be ignored, at
> the expense of some compiler reordering constraints for load_relaxed.

This approach would not be unreasonable for x86.

> Thus the implementation overhead would fall mostly on weakly ordered architectures
> like PowerPC.  Are IBM's compiler groups willing to buy into any of these
> solutions?

I have not yet encountered much resistance.

> I bring up the standardese issue, since it is not at all clear to me that the
> notion of "dependency" is easily definable, and it seems to me that we would
> have to.

What is your reaction to the notion of dependency defined in the
Itanium and the POWER architecture documents?  Page 384 of my copy
of volume 2 of the Itanium Architecture manual defines a dependency
from A to B as follows:

	"A precedes B in program order and A produces a value that
	B consumes."
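
To make that definition concrete, here is a minimal sketch written in
the std::atomic spelling that later C++ standards adopted (an assumption
on my part, not the load_relaxed syntax used in N2176).  The names Node,
head, publish, and read_payload are hypothetical.  The load marked A
produces the pointer value that the dereference marked B consumes, so
there is a dependency from A to B.  The sketch uses an acquire load,
which is stronger than dependency ordering requires; the open question
in this thread is whether the A-to-B dependency alone could be relied on.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical payload type and shared pointer, for illustration only.
struct Node { int payload; };

std::atomic<Node*> head{nullptr};

// Writer side: initialize the node, then publish its address.
void publish(Node* n, int value) {
    n->payload = value;
    head.store(n, std::memory_order_release);
}

// Reader side: A loads the pointer, B dereferences it.
// B consumes the value A produced, so B depends on A.
int read_payload() {
    Node* p = head.load(std::memory_order_acquire);  // A
    if (p == nullptr)
        return -1;                                   // nothing published yet
    return p->payload;                               // B
}
```

In real use publish() and read_payload() would run in different threads;
a reader that sees the published pointer also sees the initialized payload.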

> ...
> > N2176's third example shows that an innocent-seeming transformation
> > might convert a dependency chain that would be recognized by a given
> > system into a form that might not be:
> >
> > r1 = x.load_relaxed();
> > if (r1 == 0)
> > 	r2 = *r1;
> > else
> > 	r2 = *(r1 + 1);
> >
> > The innocent transformation might result in the following:
> >
> > r1 = x.load_relaxed();
> > if (r1 == 0)
> > 	r3 = r1;
> > else
> > 	r3 = r1 + 1;
> > r2 = *r3;
> 
> I think I really mangled that example in N2176.  Sorry about that.
> 
> For a better example of data to control dependence conversion, assume x
> has a value of 0 or 1, and I write
> 
> if (x) {
>     ...
> } else {
>     ...
> }
> y = 42 * x / 13;
> 
> The compiler could certainly convert this to
> 
> if (x) {
>     ...
>     y = 3;
> } else {
>     ...
>     y = 0;
> }
> 
> I know of at least one major architecture for which control dependencies
> do not enforce ordering.

This would certainly be an argument for having the programmer mark
the important dependencies -- similar to the way in which atomics
require a programmer to mark the important variables.  If "x" was
marked, for example, as follows:

	if (load_raw(x)) {
	    ...
	} else {
	    ...
	}
	y = 42 * x / 13;

then the compiler could either leave the dependency (assuming the
hardware respected it) or emit a memory barrier, for example, as
follows:

	if (x) {
	    acquire_fence();
	    ...
	    y = 3;
	} else {
	    acquire_fence();
	    ...
	    y = 0;
	}
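
For comparison, the fence-based form can be sketched in the std::atomic
spelling of later C++ standards, under the assumption that
std::atomic_thread_fence(std::memory_order_acquire) plays the role of
acquire_fence() above and that a relaxed load stands in for load_raw().
The function name compute_y is hypothetical.  As in Hans's example, x is
assumed to hold only 0 or 1.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the marked variable "x"; assumed to hold only 0 or 1.
std::atomic<int> x{0};

// The data dependency (y = 42 * x / 13) has been converted into a
// control dependency, so each branch carries an explicit acquire fence
// to order later accesses after the load of x.
int compute_y() {
    int y;
    if (x.load(std::memory_order_relaxed)) {
        std::atomic_thread_fence(std::memory_order_acquire);
        y = 3;   // 42 * 1 / 13 in integer arithmetic
    } else {
        std::atomic_thread_fence(std::memory_order_acquire);
        y = 0;   // 42 * 0 / 13
    }
    return y;
}
```

On hardware that respects the original dependency, the fences would be
unnecessary overhead, which is why marking the dependency and leaving
the choice to the compiler seems attractive.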

> There are other more subtle differences in dependency type.  On Itanium,
> 
> if (x_init) {
>     y = x;
> } else {
>     ... // initialize x;
>     y = x;
> }
> 
> (when naively compiled) is guaranteed to enforce the order between the loads
> of x_init and x, but the same is not true for
> 
> if (x_init) {
>     ;
> } else {
>     ... // initialize x;
> }
> y = x;

Again, this seems to be another motivation for marking the dependencies
that matter, perhaps via the perObjectPostLoadFence() primitive.

						Thanx, Paul
