[cpp-threads] Belated comments on dependency-based ordering proposal

Sat Sep 15 02:04:54 BST 2007

On 9/14/07, Boehm, Hans <hans.boehm at hp.com> wrote:
> Consider example 2:
>
> atomic<int> x;
> vector<int> v;
>
> void f() {
>         int i = x.load(memory_order_dependency);
>         int j = v[i];
>         ...
> }
>
> I don't think we can reasonably annotate the standard vector class.
> Thus I don't know how to enforce the dependency ordering here, since
> operator[] is elsewhere and not annotated.  If you don't enforce
> ordering here, it seems to me the code becomes brittle in that if you
> change a C array to a C++ library one, things are allowed to break.

What do you mean by break?  As I understand the proposal, all that
happens is that the dependency ordering gets strengthened to acquire
ordering before the call.  Even in the case that strengthening happens,
there is still a potential performance advantage in less visited paths.

    int i = x.load( memory_order_dependency );
    if ( i > 0 )
        int j = v[i];

becomes (internally to the compiler)

    int i = x.load( memory_order_dependency );
    if ( i > 0 ) {
        x.fence( memory_order_acquire );
        int j = v[i];
    }

when v[i] is an unannotated function call.  If x is usually zero, you
avoid the acquire cost.

> I thought we had tentatively concluded in Toronto that all "data
> dependent" loads occurring as a result of the execution of f() should be
> in the dependency tree anyway, no matter whether they occurred as a
> result of a nested function call or not.  If the value flows into a
> separately compiled function, the compiler has to back off and treat
> memory_order_dependency as memory_order_acquire.  This may add
> unexpected overhead in some cases.  But examples like the above will be
> guaranteed to work as expected.
>
> But I think this also isn't reflected here?
>
> My guess is that the N2360 approach (if I understood it correctly) might
> be viable for C, where function calls are obvious in the source.  But
> too many things in C++ are a function call behind the scenes.  I'm not
> very comfortable with an approach that treats v[i] differently depending
> on whether v is a C int[10] or a C++ std::array<int, 10>.

If the std::array index is inlined, the compiler has full knowledge
of the dependences, and one gets a performance win.  If the index
is not inlined, the performance drops.  But that is true anyway.  I
am willing to live with a bit of performance uncertainty in C++
codes, in part because we already have it.  More importantly, I
think, is that programmers are unlikely to go to dependence-based
ordering without serious experimental evidence and lots of other
tuning, which may very well include careful management of the
inlining process.

> I think we'll encounter many more problems along these lines,
> e.g also with loads in destructors.

Can you elaborate?

> I suspect we need more work defining what it means to "consume the result
> of a prior evaluation".  But someone from the C++ standards committee
> core group will have to help there.

Hm.  The pragmatic meaning is clear, which is that the C++ lowered
to C uses the value.  The standard definition is probably a bit fuzzier.

-- 
Lawrence Crowl