[cpp-threads] Prism 0.9.1, and further on the effects of races

Fri Sep 22 02:26:50 BST 2006

(Sorry for the lag, I'm on the road and email is intermittent.)

Peter wrote:
> Herb Sutter wrote:
> >
> >   thread 1: global_ptr = new Derived();
> >
> > With no constraints, this could execute as:
> >
> >             global_ptr = (Derived*)operator new( sizeof(Derived) );
> >             global_ptr->vptr = &Base::vtable;     // a
> >             global_ptr->Base ctor body;
> >             global_ptr->vptr = &Derived::vtable;  // b
> >             global_ptr->Derived ctor body;
> >
> > And in a race we perform:
> >
> >   thread 2: global_ptr->DerivedFunc();            // c
>
> This isn't a problem in either model A (undefined behavior) and model B
> (global_ptr has an undefined value in the event of a race). In both cases,
> the (c) statement is allowed to trap.
>
> It is only a problem if a read that participates in a race is constrained to
> only return one of the possible values that the variable could hold if the
> race was resolved in a sequentially consistent manner, but this is generally
> not true for non-atomic variables (if it were, we wouldn't need raw atomics.)

Good points. For a moment, let's restrict ourselves to the case where global_ptr is of a type that is known to be atomically read/written -- either a proposed std::atomic<Derived*> or a plain Derived* on a platform that happens to document a guarantee that reads and writes of a plain, default-aligned T* are atomic.

In line c, the behaviors that I think will be unsurprising to users are: (1) a correct function call on a completely constructed object, and (2) a null pointer trap.
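To make those two outcomes concrete, here is a minimal sketch, assuming the std::atomic<Derived*> form as it was later standardized in C++11 (at the time of writing it is only a proposal). The class bodies, the payload member, and the helper names publish, try_call, and run_race_once are hypothetical and exist only for illustration. The reader checks for null explicitly instead of trapping, so -1 stands in for outcome (2).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Base {
    virtual ~Base() = default;
};

struct Derived : Base {
    int payload = 0;
    Derived() : payload(42) {}
    virtual int DerivedFunc() const { return payload; }
};

// Hypothetical stand-in for global_ptr, declared atomic.
std::atomic<Derived*> global_ptr{nullptr};

// Thread 1: finish construction (both vptr writes and both ctor bodies),
// and only then publish the pointer with a release store.
void publish() {
    Derived* p = new Derived();
    assert(p->payload == 42);  // construction is complete before publication
    global_ptr.store(p, std::memory_order_release);
}

// Thread 2: line c, guarded. Returns -1 where an unguarded call
// would have dereferenced a null pointer.
int try_call() {
    Derived* p = global_ptr.load(std::memory_order_acquire);
    if (p == nullptr) return -1;
    return p->DerivedFunc();  // virtual call on a fully constructed object
}

// Runs the two threads once and reports what thread 2 observed.
int run_race_once() {
    global_ptr.store(nullptr);
    int observed = 0;
    std::thread writer(publish);
    std::thread reader([&] { observed = try_call(); });
    writer.join();
    reader.join();
    delete global_ptr.exchange(nullptr);  // clean up the published object
    return observed;
}
```

Under these assumptions run_race_once() can only return -1 or 42, depending on which thread wins. The reader never sees the intermediate state after step a but before step b.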

The question is, should we also permit general undefined behavior, such as a wild branch?

Here's a different and maybe more useful way to put this question: Should line c be allowed any behaviors beyond those it would have if global_ptr pointed to a type having no virtual functions? If the answer is yes, we'd be saying that we're permitting the implementation details of vptrs/vtbls to leak out and affect the programmer -- that's a possible answer, but I personally am worried about that. I would really prefer the answer to be no, saying that language-generated machinery (such as virtual dispatch) should not affect what the programmer has to do to achieve correct thread safety. Philosophically, my concern is that the language is generating the implementation details of vptrs/vtbls (and hiding them), so it ought to bear at least some responsibility for their thread safety.

In case it helps, this discussion is making me realize that I've been subconsciously viewing this as similar to the case of COW strings. With COW strings, the implementer is generating shared state under the covers where the programmer can't see it (not even to name the shared variables) or detect it, and the string implementer is responsible for ensuring that this hidden sharing doesn't make the string less thread-safe -- that is, he's not responsible for making strings "fully" thread safe (whatever that means, and which he can't do anyway), but rather for getting the guarantees back up to where the calling code can assume its normal duty of care for thread safety (e.g., locking shared string objects).
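As a rough illustration of that division of responsibility, here is a minimal sketch of a reference-counted COW string, assuming C++11 std::atomic. The class CowString, its nested Rep, and every member name are hypothetical and not taken from any real library. The implementer protects only the hidden shared state (the reference count and the shared buffer). A single CowString object that is used from several threads still has to be locked by the caller, exactly as for a non-COW string.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical COW string: copies share one buffer until one of them writes.
class CowString {
    struct Rep {
        std::atomic<std::size_t> refs{1};  // hidden shared state, hence atomic
        std::string data;
        explicit Rep(std::string s) : data(std::move(s)) {}
    };
    Rep* rep_;

    void release() {
        // The last owner frees the buffer; acq_rel orders earlier accesses
        // by other owners before the delete.
        if (rep_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete rep_;
    }

    void detach() {
        // Unshare before writing so that other CowString objects are unaffected.
        if (rep_->refs.load(std::memory_order_acquire) > 1) {
            Rep* fresh = new Rep(rep_->data);
            release();
            rep_ = fresh;
        }
    }

public:
    explicit CowString(std::string s = "") : rep_(new Rep(std::move(s))) {}
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    CowString& operator=(CowString other) {
        std::swap(rep_, other.rep_);  // old buffer is released when other dies
        return *this;
    }
    ~CowString() { release(); }

    void append(const std::string& s) {
        detach();
        // After detach this object is the sole owner of its buffer.
        assert(rep_->refs.load(std::memory_order_relaxed) == 1);
        rep_->data += s;
    }
    const std::string& str() const { return rep_->data; }
    bool shares_buffer_with(const CowString& o) const { return rep_ == o.rep_; }
};
```

With this sketch, two threads that each own a distinct CowString copy may append concurrently without any locking, even though the copies started out sharing a buffer. That is the sense in which the implementer gets the guarantees "back up" to the normal level.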

It may be that not everyone sees the implementation details of vptrs/vtbls as analogous to the COW implementation details of ref-counted strings, but I offer this as one view of the problem in case it helps to explain my concerns; it is one of the parallels I've been carrying around in the back of my mind.

Herb