[cpp-threads] Belated comments on dependency-based ordering proposal
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Thu Sep 20 00:24:37 BST 2007
On Wed, Sep 19, 2007 at 03:38:10PM -0700, Lawrence Crowl wrote:
> On 9/19/07, Paul E. McKenney <paulmck at linux.vnet.ibm.com> wrote:
> > > B) The compiler generates an acquire load to preserve the
> > > ordering anyway. I think Lawrence and I are arguing for this version.
> >
> > Ah! I was still thinking of this as a trivial implementation as opposed
> > to a component of an alternative implementation strategy. The idea is
> > as follows, correct?
> >
> > o If there are no N2361 annotations, the compiler must emit appropriate
> > memory fences when control leaves the compilation unit. The code
> > in the other compilation unit will therefore work correctly even
> > in the presence of local dependency-breaking optimizations.
>
> I'm not comfortable with "leaves the compilation unit", for at
> least two reasons. First, in many cases, the programmer has no
> idea which functions are in the current compilation unit. Can we
> write it in terms of the function calls themselves? Second, your
> statement implies a unit of analysis equal to the compilation
> unit, and I'd rather not restrict compilers from doing something
> different. E.g. a compiler could, after analysis, decide to use
> dependence-breaking optimizations in half the compilation unit.
>
> On the other hand, I do not want to force breaking the dependence
> at all function calls, because they're virtually everywhere.
> Perhaps the way to say this is with an "as if" rule.
How about if we made this be "leaves the function" rather than "leaves
the compilation unit"? I believe we would still need to classify library
functions/wrappers/conversions, but at least it would be more definite
from the programmer's viewpoint.
> > o If there are N2361 annotations, then the compiler can allow
> > the dependency chain to cross compilation-unit boundaries
> > via the annotated function arguments and return values.
> >
> > o There would then need to be some way to kill a dependency chain
> > in order to avoid gratuitous memory fences. Your
> > ignore_dependency() template below would be one approach.
> > Another approach would be another set of annotations for
> > function arguments and return values. The ignore_dependency()
> > approach seems to cover all the possibilities, so is where I
> > believe that we should start.
>
> I think I agree here. It can also be formulated as a type-generic macro
> for the C language.
Sounds good!
> > And this would need explicit compiler support -- the kill_dependency_chain()
> > called out in N2361 relied on lack of annotations to make this work.
> >
> > So, is it better to have a special template class that has special
> > semantics, or should we instead define a [[dependence_ignore]] attribute
> > from which ignore_dependency() can be constructed:
> >
> > template<class T> [[dependence_ignore]]
> > T ignore_dependency(T x) { return (x); }
>
> That proposal makes sense to me.
Sounds good as well!
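For concreteness, here is a rough sketch of how ignore_dependency() might look at a call site. It compiles with today's compilers only because the proposed [[dependence_ignore]] attribute appears as a comment. The names table and lookup() are made up for illustration, and memory_order_acquire stands in for the proposed dependency-ordered load, so the identity function has no run-time effect here.

```cpp
#include <atomic>
#include <cassert>

// Assumption: [[dependence_ignore]] is the attribute proposed in this thread.
// It is not a real attribute, so it is left as a comment and the function
// below is a plain identity function.
template <class T>
/* [[dependence_ignore]] */ T ignore_dependency(T x) { return x; }

struct foo_t {
    int data;
    int index;
};

std::atomic<foo_t *> x{nullptr};   // published by a writer with a release store
int table[4] = {10, 20, 30, 40};   // hypothetical data that needs no ordering

int lookup() {
    // Assumption: acquire stands in for the proposed memory_order_dependency.
    foo_t *p = x.load(std::memory_order_acquire);
    assert(p != nullptr);          // the sketch requires a published node

    // p->index is part of the dependency chain headed by the load above.
    // The programmer states that the chain need not extend into the table
    // lookup, so no fence would be required on its account.
    int i = ignore_dependency(p->index);
    return table[i];
}
```

The intent is that the call marks the point where the programmer no longer cares about the chain, so the compiler is free to apply dependency-breaking optimizations after it.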
> > Then we have the following cases:
> >
> > o An unannotated argument that is a member of a dependency chain
> > causes the compiler to emit a memory fence.
> >
> > o An argument annotated with N2361 [[dependence_propagate]]
> > causes the compiler to extend the dependency chain across
> > a compilation-unit boundary.
> >
> > o An argument annotated with a new N2361 [[dependence_ignore]]
> > would neither emit a memory fence nor extend the dependency
> > chain across a compilation-unit boundary.
> >
> > o An unannotated return value that is a member of a dependency
> > chain would neither emit a memory fence nor extend the dependency
> > chain across a compilation-unit boundary. Note that the default
> > is different than for arguments -- in the default case discussed
> > above, the naively-written library function is protected by the
> > caller, so the function itself need do nothing upon return.
> >
> > However, the other naive case is where the naive function calls
> > another function that explicitly pushes the dependency chain
> > out through its return value. This could happen in cases where
> > there are wrapper functions that are invoked due to implicit
> > conversions (these still can happen in C++, right?).
>
> Yes.
Good to know, thank you!
> > In this
> > case, the called function would have annotated its return value,
> > and it seems to me that the compiler would have to distinguish
> > between these two cases, terminating the dependency chain if
> > the head of the chain is an atomic load that is lexically within
> > the function, and propagating it (via explicit memory fence if
> > need be) if the local head of the chain is instead an annotated
> > return value from a called function.
>
> Things got fuzzy on me here. In f(h(a)) where f and h have annotated
> arguments and h has an annotated return, you are worried that there
> might be an implicit conversion, and the actual code is f(g(h(a))) where
> g has neither annotation. Why doesn't the code get protected by a
> fence before the call to g?
>
> (I admit that we have a subtle performance implication here, but that
> is more the result of implicit conversions than of the dependences.)
We are thinking of different cases. Here is what I was considering:
	[[dependency_propagate]] foo_t *h(void)
	{
		return x.load(memory_order_dependency);
	}

	int f(void)
	{
		implicitly_converted_from_foo_t *p;

		p = h();
		return(p->data);
	}
The compiler would convert the assignment in f() to something like:
	p = implicit_conversion_from_foo_t(h());
If the conversion was a library function, then I am advocating that the
compiler do the following transformation instead, given that it can see
the return-value annotation:
	foo_t *temp1 = h();
	atomic_fence(memory_order_acquire);
	p = implicit_conversion_from_foo_t(temp1);
Of course, if the conversion function were annotated, there would
be no need for the fence, since the conversion function would then
be known to preserve dependency chains.
Seem reasonable?
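The transformation above can be approximated in compilable form as follows. This is only a sketch under some assumptions: memory_order_relaxed stands in for the proposed memory_order_dependency, std::atomic_thread_fence() plays the role of atomic_fence(), and foo_view is a hypothetical type whose converting constructor stands in for the unannotated library conversion. The fence is written by hand here, whereas the proposal would have the compiler emit it.

```cpp
#include <atomic>
#include <cassert>

struct foo_t {
    int data;
};

// Hypothetical conversion target. Its converting constructor plays the role
// of the unannotated implicit_conversion_from_foo_t() library function.
struct foo_view {
    foo_t *ptr;
    foo_view(foo_t *p) : ptr(p) {}
};

std::atomic<foo_t *> x{nullptr};   // published by a writer with a release store

// Would carry the proposed return-value annotation.
foo_t *h() {
    // Assumption: relaxed stands in for the proposed memory_order_dependency.
    return x.load(std::memory_order_relaxed);
}

int f() {
    foo_t *temp1 = h();
    assert(temp1 != nullptr);      // the sketch requires a published node

    // The conversion is not known to preserve dependency chains, so ordering
    // is established with an acquire fence before the value is handed over.
    std::atomic_thread_fence(std::memory_order_acquire);

    foo_view p = temp1;            // implicit conversion happens here
    return p.ptr->data;
}
```

A relaxed load followed by an acquire fence orders the later accesses after the load, which is why the dependency chain no longer needs to survive the conversion.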
> > o A return value annotated with N2361 [[dependency_propagate]]
> > causes the compiler to extend the dependency chain across
> > a compilation-unit boundary.
> >
> > o A return value annotated with a new N2361 [[dependence_ignore]]
> > would neither emit a memory fence nor extend the dependency
> > chain across a compilation-unit boundary. This is the same
> > as the default.
> >
> > o If the head of the dependency chain is an atomic load that is
> > lexically contained within the function in question, and if
> > the programmer wants to maintain ordering, but not to propagate
> > the dependency chain, then the programmer should insert an
> > explicit memory fence before the return. (Or do we want
> > another annotation that causes the compiler to implicitly
> > place the memory fences, perhaps omitting it on code paths
> > that have other memory fences already in place?)
>
> I'd like a situation where the compiler has a bit of freedom to decide
> whether or not to insert the fence, and the problem with manual fences
> is that the compiler usually won't know why they are there.
Makes sense -- so something like [[dependency_fence]]?
And I presume that I should get new document numbers from Clark and
revise to suit?
Thanx, Paul