[cpp-threads] Alternatives to SC
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Mon Jan 15 22:13:57 GMT 2007
On Mon, Jan 15, 2007 at 10:38:06AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 15, 2007 at 07:57:52AM +0100, Alexander Terekhov wrote:
> > On 1/14/07, Raul Silvera <rauls at ca.ibm.com> wrote:
> > [...]
> > >Furthermore, as several other people have mentioned already, SC requires
> > >reads to wait for any writes observed from other threads to become globally
> > >visible. This means you need a StoreLoad barrier ...
> >
> > P1: x = 1;
> > P2: if (x == 1) y = 2;
> > P3: if (y == 2) assert(x == 1);
> >
> > PowerPC Book II discusses that example and says that it requires
> >
> > P1: x = 1;
> > P2: if (x == 1) sync(), y = 2;
> > P3: if (y == 2) sync(), assert(x == 1);
> >
> > ('Cumulative ordering' property of 'sync' instruction.)
> >
> > I somehow doubt that it will outperform a sync()-less version doing
> > load of x on P3 via stwcx-validated lwarx. ;-)
>
> Hard to say which would be faster. Raul's version has the advantage of
> permitting the caches to retain read sharing. Your version has the
> advantage of getting rid of an expensive sync() instruction. If there
> are lots of P3 executions and very few P1 and P2 executions, I would
> guess that Raul's approach wins.
And besides, don't you also need at least an isync preceding the assert()
on POWER for the stwcx-validated lwarx to do what you want?
Thanx, Paul
> Of course, even better would be avoiding -both- the sync and the
> lwarx/stwcx. This can actually be done in some specialized but
> very commonly occurring cases, and this is what RCU is all about.
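
For concreteness, the three-processor example quoted above might be sketched
as follows. This is only an illustration under stated assumptions: it uses
the std::atomic interface that was later standardized in C++11 (not
available when this thread was written), and the names litmus_once() and
count_violations() are hypothetical helpers invented for this sketch. The
release/acquire pairs stand in for the cumulative ordering that the sync
instructions provide in the Book II version. How a given compiler maps them
onto POWER instructions is not shown here.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the P1/P2/P3 example once.
// Returns true only if P3 saw y == 2 but then failed to see x == 1.
bool litmus_once() {
    std::atomic<int> x{0}, y{0};
    bool violated = false;  // written only by P3, read after join()

    // P1: x = 1;
    std::thread p1([&] { x.store(1, std::memory_order_release); });

    // P2: if (x == 1) y = 2;
    std::thread p2([&] {
        if (x.load(std::memory_order_acquire) == 1)
            y.store(2, std::memory_order_release);
    });

    // P3: if (y == 2) assert(x == 1);
    std::thread p3([&] {
        if (y.load(std::memory_order_acquire) == 2)
            violated = (x.load(std::memory_order_relaxed) != 1);
    });

    p1.join();
    p2.join();
    p3.join();
    return violated;
}

// Hypothetical helper: repeats the experiment and counts failed checks.
int count_violations(int iterations) {
    int n = 0;
    for (int i = 0; i < iterations; ++i)
        if (litmus_once())
            ++n;
    return n;
}
```

Under the C++11 rules, P1's store happens before P3's load of x whenever both
conditions hold (release/acquire synchronization is transitive), so
count_violations() should return 0 for any iteration count. A stress run
like this cannot prove the ordering; it can only fail to find a
counterexample.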