[cpp-threads] SC on PPC

Raul Silvera rauls at ca.ibm.com
Tue May 1 01:20:54 BST 2007


Alexander wrote on 04/30/2007 04:18:35 PM:

> On 4/30/07, Raul Silvera <rauls at ca.ibm.com> wrote:
> [...]
> > There are two components to cumulativity. What you're describing is what
> > we call B-cumulativity, which extends the B set with loads that observe
> > a store in B. What you're missing is A-cumulativity, which extends the
> > A set with operations that were performed before the barrier with respect
> > to the current processor. This is all spelled out in Book II.
>
> The A set is extended, all right. But what does it bring us given that
> we end up with an empty B set with respect to another processor? Under
> your logic lwsync would do it as well, so why insist on {hw}sync?
>

The B set is not empty. It contains the load.

Let's revisit the code:

 P1: [1] x = 1;
 P2: [2] y = 1;
 P3: [3] r1 = x(1); hwsync ; [4] r2 = y(0);
 P4: [5] r3 = y(1); hwsync ; [6] r4 = x(0);

For P3's hwsync, A=[3] and B=[4]. A-cumulativity extends that to
A=[1],[3] and B=[4].
Because it is a hwsync, [1] is performed wrt P4 before [4] (an lwsync
wouldn't order them, since [1] is a store and [4] is a load).

For P4's hwsync, A=[5] and B=[6].

[4] is performed wrt P4 before [5], and since [5] is performed wrt P4
before [6], [4] is performed wrt P4 before [6].
Since [1] is performed wrt P4 before [4] and [4] is performed wrt P4 before
[6], [1] is performed wrt P4 before [6]. So, [6] must return 1, not 0.
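
For readers who want to experiment with this litmus test, here is a minimal
sketch in C++ using std::atomic with sequentially consistent operations. It is
not code from this thread: std::atomic and std::thread postdate this
discussion, and the function name count_forbidden_iriw is made up for
illustration. It assumes the seq_cst loads play the role of the loads separated
by hwsync above; how a given compiler maps them to PPC barriers is not shown.
The function runs the four-processor test repeatedly and counts how often the
outcome r1=1, r2=0, r3=1, r4=0 is seen.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the four-thread test `iterations` times and
// returns how many runs produced r1=1, r2=0, r3=1, r4=0.
int count_forbidden_iriw(int iterations) {
    assert(iterations >= 0);
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = 0, r2 = 0, r3 = 0, r4 = 0;

        // P1 and P2: the two independent writers.
        std::thread p1([&] { x.store(1, std::memory_order_seq_cst); });  // [1]
        std::thread p2([&] { y.store(1, std::memory_order_seq_cst); });  // [2]

        // P3 reads x, then y.
        std::thread p3([&] {
            r1 = x.load(std::memory_order_seq_cst);  // [3]
            r2 = y.load(std::memory_order_seq_cst);  // [4]
        });

        // P4 reads y, then x.
        std::thread p4([&] {
            r3 = y.load(std::memory_order_seq_cst);  // [5]
            r4 = x.load(std::memory_order_seq_cst);  // [6]
        });

        p1.join();
        p2.join();
        p3.join();
        p4.join();

        // The two readers disagree on the order of the two stores.
        if (r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

Calling count_forbidden_iriw(100000) from a small main() should return 0 if
the implementation provides sequential consistency for these operations. A
zero count is only consistent with the argument above; a stress run like this
cannot prove the outcome is impossible.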

--
Raúl E. Silvera         IBM Toronto Lab
Team Lead, Toronto Portable Optimizer (TPO)
Tel: 905-413-4188 T/L: 969-4188           Fax: 905-413-4854
D2/KC9/8200/MKM



