[cpp-threads] Slightly revised memory model proposal (D2300)

Boehm, Hans hans.boehm at hp.com
Wed Jun 20 00:20:36 BST 2007


________________________________

	From:  Raul Silvera
	
	Hans wrote on 06/15/2007 12:09:39 PM:
	
	> Unfortunately, I think I posted some misinformation here, with respect
	> to flickering.  I believe the version of the example that I posted:
	> 
	> > > Thread 1:
	> > > store_relaxed(&x, 1);
	> > >
	> > > Thread 2:
	> > > store_relaxed(&x, 2);
	> > >
	> > > Thread 3:
	> > > r1 = load_acquire(&x); (1)
	> > > r2 = load_acquire(&x); (2)
	> > > r3 = load_acquire(&x); (1)
	> > >
	> is already allowed to flicker under the D2300 rules.  And looking back
	> at Sarita's example, weakening this doesn't seem to help.  (The example
	> that we should really have been discussing would have had release stores.
	> That's the one that's currently constrained by the modification order
	> rule.  And having that flicker does seem dubious.)
	
	I find this very troubling. From T3's point of view, it is just doing acquire
	operations, and it is not expecting any flickering, regardless of which stores
	are going to satisfy its loads.
	 

I was almost going to agree with you, and try to change this.  But this
again runs into synchronization elimination issues, which seem central
here.  If the "acquire"s in thread 3 mean anything without a matching
"release" then, by similar reasoning,
 
r1 = load_relaxed(&x); r2 = load_relaxed(&x); r3 = load_relaxed(&x);
 
can allow different outcomes from
 
r1 = load_relaxed(&x); fetch_and_add_acq_rel(&dead1, 0);
r2 = load_relaxed(&x); fetch_and_add_acq_rel(&dead2, 0);
r3 = load_relaxed(&x);
 
which means that the dead fetch_and_adds can't be eliminated, which is
very unfortunate.  It also means that I can't ever eliminate locks after
thread inlining without understanding the whole program.
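 
For concreteness, the two sequences above might be written as follows against the C++11 std::atomic interface.  This is only a sketch: the names Triple, read_three_plain, read_three_with_dead_rmw and check_without_writers are made up for illustration, and mapping load_relaxed and fetch_and_add_acq_rel onto load(memory_order_relaxed) and fetch_add(0, memory_order_acq_rel) is an assumption about the intended correspondence.  The question is whether a compiler may treat the second function as equivalent to the first.

```cpp
#include <array>
#include <atomic>
#include <cassert>

using Triple = std::array<int, 3>;

// Three relaxed loads of x with nothing in between.
Triple read_three_plain(const std::atomic<int>& x) {
    int r1 = x.load(std::memory_order_relaxed);
    int r2 = x.load(std::memory_order_relaxed);
    int r3 = x.load(std::memory_order_relaxed);
    return {r1, r2, r3};
}

// The same loads, separated by acq_rel read-modify-writes that add 0 to
// locations assumed to be touched by no other thread ("dead" operations).
Triple read_three_with_dead_rmw(const std::atomic<int>& x,
                                std::atomic<int>& dead1,
                                std::atomic<int>& dead2) {
    int r1 = x.load(std::memory_order_relaxed);
    dead1.fetch_add(0, std::memory_order_acq_rel);
    int r2 = x.load(std::memory_order_relaxed);
    dead2.fetch_add(0, std::memory_order_acq_rel);
    int r3 = x.load(std::memory_order_relaxed);
    return {r1, r2, r3};
}

// Single-threaded sanity check only: with no concurrent writers both
// versions must agree. It says nothing about the concurrent case.
inline void check_without_writers() {
    std::atomic<int> x{1}, dead1{0}, dead2{0};
    const Triple expected{1, 1, 1};
    assert(read_three_plain(x) == expected);
    assert(read_three_with_dead_rmw(x, dead1, dead2) == expected);
}
```

If the set of outcomes allowed for the second function under concurrent stores to x were smaller than for the first, dropping the two fetch_add calls would not be a valid transformation.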
 
I'm more and more inclined to do what Sarita was advocating anyway,
which is to switch to a more conventional formulation of the memory
model in which happens-before is just the transitive closure of the
union of sequenced-before and synchronizes-with.  That makes it clearer
that acquire and release only provide any guarantees if they occur in
pairs.
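 
Written out, that conventional formulation would look roughly like the following.  The symbols are illustrative shorthand, not notation taken from D2300: sb is sequenced-before, sw is synchronizes-with, hb is happens-before, and the superscript + denotes transitive closure.

```latex
\[
  \mathit{hb} \;=\; (\mathit{sb} \,\cup\, \mathit{sw})^{+}
\]
```

In such a formulation an acquire load contributes an sw edge only when it reads a value written by a release store, so an acquire with no matching release adds nothing to hb.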
 
(The last proposal has another similar synchronization elimination issue
with the "precedes" relation, which includes happens-before, but not
sequenced-before.  I think we can also get rid of that by moving back to a
more conventional, Java-like, happens-before model.)
 
My general feeling is that if we have a trade-off between
synchronization elimination and more expressive low-level atomics,
synchronization elimination should win, since it affects lock-based user
code, which is bound to make up a much larger body of code than
low-level atomics clients.
 
And although I also find this a bit troubling, I'm still having a lot of
trouble constructing a case in which this matters.
 
Hans