[cpp-threads] Review comments on N2176 WRT dependency ordering
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Tue Apr 17 21:54:48 BST 2007
On Tue, Apr 17, 2007 at 02:38:29PM +0300, Peter Dimov wrote:
> Paul E. McKenney wrote:
> >On Mon, Apr 16, 2007 at 06:12:14PM +0300, Peter Dimov wrote:
>
> >>N2195 proposes the following minimalistic approach:
> >>
> >>template< class T > inline T * atomic_load_address( T * const * p );
> >>template< class T > inline T * atomic_load_address( T * const
> >>volatile * p );
> >
> >I do like the template approach!
> >
> >>Returns: *p.
> >>
> >>Constraint: _Acquire only with respect to **p.
> >
> >OK -- but what about p->some_field? Or does **p indicate any
> >dereference of p? I had been assuming the more restrictive mode.
>
> **p denotes the entire object that *p refers to, including all of its
> fields.
OK.
> In the dependency discussion paper, you write that a single level of
> indirection is not enough for the Linux kernel, but I don't see how you
> could make a multi-level primitive work on an Alpha.
This depends on how the multi-level list was created. If it was
created by a pair of threads as follows:
- Thread 1:
p = new typeof(*p);
p->a = 1;
p->next = NULL;
release_fence();
head = p;
- Thread 2:
q = new typeof(*q);
q->a = 2;
q->next = NULL;
release_fence();
head->next = q;
Then, yes, an atomic_load_address() would be needed for each stage, as
in:
p = atomic_load_address(head);
q = atomic_load_address(p->next);
However, if a single thread created both elements and published them
in one shot as follows:
p = new typeof(*p);
p->a = 1;
q = new typeof(*q);
q->a = 2;
q->next = NULL;
p->next = q;
release_fence();
head = p;
Then atomic_load_address() would be needed only for the initial fetch
from head, as in:
p = atomic_load_address(head);
q = p->next;
Unlike the earlier example, this multilinked structure is
essentially one element. This sort of thing happens rather frequently,
such as where each element of a list has other data structures linked to
it, but where these other data structures are owned by the corresponding
element.
> In:
>
> r1 = x;
> // rmb
> r2 = r1->a;
> // rmb
> r3 = r2->b;
>
> I believe that you need two rmb fences.
>
> r1 = atomic_load_address( &x );
> r2 = atomic_load_address( &r1->a );
> r3 = r2->b;
>
> can insert the two fences for you, but if the second atomic_load_address is
> an ordinary operation (somewhere in a translation unit far, far away),
> there's no way to insert a rmb after it.
Again, it depends on how the structure was created, effectively, on
whether the linked structure is published incrementally (in which case
multiple atomic_load_address()es are needed) or in one shot (in which
case only a single atomic_load_address() is required).
> >>Index-based dependencies are not supported.
> >
> >Why not?
>
> Are they important enough?
I believe that they are -- they do show up fairly frequently.
> We can add a minimalistic index-based primitive
> along the lines of:
>
> template<class T, class U> inline U* atomic_load_index( T const * pi, U *
> base );
>
> Requires: T shall be integral.
> Returns: base + *pi.
> Constraint: _Acquire only with respect to base[ *pi ].
>
> but it doesn't cover the complex cases where you want to index more than
> one array.
Why not just use atomic_load_address() on the index? Then given a
static array whose indexes are dynamically filled in:
i = atomic_load_address(myindex);
r1 = myarray[i];
A dynamic array would require both pointer and index be treated this
way:
i = atomic_load_address(myindex);
p = atomic_load_address(myarray);
r1 = p[i];
Thanx, Paul
More information about the cpp-threads
mailing list