[cpp-threads] memory model
Peter Dimov
pdimov at mmltd.net
Thu May 5 00:27:18 BST 2005
Boehm, Hans wrote:
> In particular:
>
> if (load_cc_acquire(x)) {
> y = 1;
> z = a[1];
> } else {
> y = 0;
> z = a[0];
> }
>
> is NOT equivalent to
>
> if (load_cc_acquire(x)) {
> y = 1;
> } else {
> y = 0;
> }
> z = a[y];
>
> In fact, it appears to me that
>
> if (r1 = load_cc_acquire(x)) {
> z = a[0];
> } else {
> z = a[0];
> }
>
> is not equivalent to
>
> r1 = load_cc_acquire(x);
> z = a[0]
This cc* thing will be pretty painful to express formally.
The idea is that an operation is conditional on a read if two executions of
the abstract machine that read different values differ in whether they
contain the operation.
The abstract machine in example 1 has two possible executions:
load_cc_acq x
y = 1
z = a[1]
and
load_cc_acq x
y = 0
z = a[0]
and both versions of the source generate these same executions.
Similarly, in example 2, the abstract machine has a single execution:
load_cc_acq x
z = a[0]
generated by both versions of the source.
Example 2 is easy: z = a[0] is unconditional.
Example 1 is hard. The reads from a[0] and a[1] are conditional. But are the
stores to y and z conditional? They must be, so a store that is dependent on
a conditional load must be conditional as well.
Maybe we need to drop the whole cc* thing and just mandate that 'cchsb' is
always present (conditional stores are never carried out speculatively),
which is true on all known platforms and which the proposed memory model
effectively requires anyway. Alexander doesn't like this, however. :-)
>> This definition makes x = a[ r1? 0: 1 ] code conditional
>> instead of data
>> dependent.
>
> Now you lost me. Isn't the load executed unconditionally?
> At least if we assume a straightforward compiler?
No, we have the following two possible executions here: x = a[0] and x =
a[1]. The actual load may look unconditional on the assembly level, but the
microcode will almost certainly treat it as a control dependency and not a
data dependency.
Again, there is the option of sidestepping the whole issue and just
mandating an implicit ddacq on every load. The poor Alpha (and any future
hardware with value prediction) will suffer. Alexander likes this even less.
(The proposed memory model appears to require this too, but I'm not
positive. Does it allow reads from p->x to come before the read from p?)
More information about the cpp-threads
mailing list