[cpp-threads] Belated comments on dependency-based ordering proposal

Lawrence Crowl Lawrence at Crowl.org
Wed Sep 19 23:38:10 BST 2007


On 9/19/07, Paul E. McKenney <paulmck at linux.vnet.ibm.com> wrote:
> >       B) The compiler generates an acquire load to preserve the
> > ordering anyway.   I think Lawrence and I are arguing for this version.
>
> Ah!  I was still thinking of this as a trivial implementation as opposed
> to a component of an alternative implementation strategy.  The idea is
> as follows, correct?
>
> o       If no N2361 annotations, the compiler must emit appropriate
>         memory fences when control leaves the compilation unit.  The code
>         in the other compilation unit will therefore work correctly even
>         in the presence of local dependency-breaking optimizations.

I'm not comfortable with "leaves the compilation unit", for at
least two reasons.  First, in many cases, the programmer has no
idea which functions are in the current compilation unit.  Can we
write it in terms of the function calls themselves?  Second, your
statement implies a unit of analysis equal to the compilation
unit, and I'd rather not restrict compilers from doing something
different.  E.g. a compiler could, after analysis, decide to use
dependence-breaking optimizations in half the compilation unit.

On the other hand, I do not want to force breaking the dependence
at all function calls, because function calls are virtually everywhere.
Perhaps the way to say this is with an "as if" rule.

>
> o       If there are N2361 annotations, then the compiler can allow
>         the dependency chain to cross compilation-unit boundaries
>         via the annotated function arguments and return values.
>
> o       There would then need to be some way to kill a dependency chain
>         in order to avoid gratuitous memory fences.  Your
>         ignore_dependency() template below would be one approach.
>         Another approach would be another set of annotations for
>         function arguments and return values.  The ignore_dependency()
>         approach seems to cover all the possibilities, so is where I
>         believe that we should start.

I think I agree here.  It can also be formulated as a type-generic macro
for the C language.

> And this would need explicit compiler support -- the kill_dependency_chain()
> called out in N2361 relied on lack of annotations to make this work.
>
> So, is it better to have a special template class that has special
> semantics, or should we instead define a [[dependence_ignore]] attribute
> from which ignore_dependency() can be constructed:
>
> template<class T> [[dependence_ignore]]
> T ignore_dependency(T x) { return (x); }

That proposal makes sense to me.
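
As a rough illustration of how such a function might be used, here is a
minimal sketch.  The [[dependence_ignore]] spelling is only the one
proposed in this thread and is not an attribute any compiler is known to
accept, so it appears as a comment.  Node, head, table and read_entry
are hypothetical names invented for the example.  Without compiler
support the template is just the identity function; the point is where
the chain is declared finished.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical attribute from this thread, shown only as a comment.
// Without compiler support this is simply the identity function.
template <class T>
/* [[dependence_ignore]] */
T ignore_dependency(T x) { return x; }

struct Node { int index; };          // hypothetical payload type

std::atomic<Node*> head{nullptr};    // published by some writer thread
int table[4] = {10, 20, 30, 40};     // immutable data, needs no ordering

int read_entry() {
    // Head of the dependency chain: a dependency-ordered (consume) load.
    Node* p = head.load(std::memory_order_consume);
    assert(p != nullptr);
    // p->index carries a dependency from the load above.  The programmer
    // states that the chain ends here, so the table access below does
    // not have to be ordered after the load.
    int i = ignore_dependency(p->index);
    return table[i];
}
```

The standard library that eventually shipped with C++11 provides
std::kill_dependency in <atomic> with the same identity-function shape,
so the call above could be written with it instead.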

> Then we have the following cases:
>
> o       An unannotated argument that is a member of a dependency chain
>         causes the compiler to emit a memory fence.
>
> o       An argument annotated with N2361 [[dependence_propagate]]
>         causes the compiler to extend the dependency chain across
>         a compilation-unit boundary.
>
> o       An argument annotated with a new N2361 [[dependence_ignore]]
>         would neither emit a memory fence nor extend the dependency
>         chain across a compilation-unit boundary.
>
> o       An unannotated return value that is a member of a dependency
>         chain would neither emit a memory fence nor extend the dependency
>         chain across a compilation-unit boundary.  Note that the default
>         is different than for arguments -- in the default case discussed
>         above, the naively-written library function is protected by the
>         caller, so the function itself need do nothing upon return.
>
>         However, the other naive case is where the naive function calls
>         another function that explicitly pushes the dependency chain
>         out through its return value.  This could happen in cases where
>         there are wrapper functions that are invoked due to implicit
>         conversions (these still can happen in C++, right?).

Yes.

>         In this
>         case, the called function would have annotated its return value,
>         and it seems to me that the compiler would have to distinguish
>         between these two cases, terminating the dependency chain if
>         the head of the chain is an atomic load that is lexically within
>         the function, and propagating it (via explicit memory fence if
>         need be) if the local head of the chain is instead an annotated
>         return value from a called function.

Things got fuzzy on me here.  In f(h(a)) where f and h have annotated
arguments and h has an annotated return, you are worried that there
might be an implicit conversion, and the actual code is f(g(h(a))) where
g has neither annotation.  Why doesn't the code get protected by a
fence before the call to g?
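
To make the f(g(h(a))) case concrete, here is a small sketch in which
the "g" is a converting constructor that the programmer never wrote at
the call site.  Handle, conversions and call_site are hypothetical
names, and the annotations are only imagined in comments since no
compiler accepts them.

```cpp
#include <cassert>

int conversions = 0;  // counts invocations of the hidden conversion "g"

struct Handle {
    int* p;
    // Converting constructor: plays the role of the unannotated g.
    Handle(int* q) : p(q) { ++conversions; }
};

// h: imagine an annotated argument and an annotated return value.
int* h(int* a) { return a; }

// f: imagine an annotated argument.
int f(Handle x) {
    assert(x.p != nullptr);
    return *x.p;
}

int call_site(int* a) {
    // Written as f(h(a)), but compiled as f(Handle(h(a))): the
    // unannotated constructor sits between the two annotated functions.
    return f(h(a));
}
```

Under the rules discussed above, the value returned by h would reach the
unannotated constructor argument, which is where a fence would be
emitted.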

(I admit that we have a subtle performance implication here, but that
is more the result of implicit conversions than of the dependences.)

>
> o       A return value annotated with N2361 [[dependence_propagate]]
>         causes the compiler to extend the dependency chain across
>         a compilation-unit boundary.
>
> o       A return value annotated with a new N2361 [[dependence_ignore]]
>         would neither emit a memory fence nor extend the dependency
>         chain across a compilation-unit boundary.  This is the same
>         as the default.
>
> o       If the head of the dependency chain is an atomic load that is
>         lexically contained within the function in question, and if
>         the programmer wants to maintain ordering, but not to propagate
>         the dependency chain, then the programmer should insert an
>         explicit memory fence before the return.  (Or do we want
>         another annotation that causes the compiler to implicitly
>         place the memory fence, perhaps omitting it on code paths
>         that have other memory fences already in place?)

I'd like a situation where the compiler has a bit of freedom to decide
whether or not to insert the fence, and the problem with manual fences
is that the compiler usually won't know why they are there.
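
For reference, the manual-fence version of that last case might look
like the sketch below.  It is written with the std::atomic interface as
later standardized in C++11; Config, current, get_config and use_config
are hypothetical names.  Nothing in the code tells the compiler that the
fence exists only to stand in for the dependency chain, which is the
concern raised above.

```cpp
#include <atomic>
#include <cassert>

struct Config { int value; };             // hypothetical payload type

std::atomic<Config*> current{nullptr};    // published with a release store

Config* get_config() {
    // Head of the chain: an atomic load lexically inside this function.
    Config* p = current.load(std::memory_order_consume);
    // Manual fence before the return: ordering is preserved for the
    // caller, but no dependency chain is propagated out of the function.
    std::atomic_thread_fence(std::memory_order_acquire);
    return p;
}

int use_config() {
    Config* p = get_config();
    assert(p != nullptr);
    // Safe without any dependency ordering because of the fence above.
    return p->value;
}
```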

-- 
Lawrence Crowl


