[cpp-threads] Slightly revised memory model proposal (D2300)
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Mon Jul 9 16:03:41 BST 2007
And the code fragment needs a bit of help to be safe in the face of
freeing and reallocating "foo" structs.
foo_func(foo *p)
{
	__thread foo *cached = 0;
	__thread long cached_gen = 0;
	__thread bar precomputed_info;
	bar my_info;

	acquire lock for *p;
	manipulate *p;
	if (cached == p && cached_gen == p->gen) {
		manipulate *cached some more using precomputed_info;
	} else {
		cached = p;
		cached_gen = p->gen;
		manipulate *p some more setting precomputed_info;
	}
	release lock for *p;
}
Of course, if "foo" structs are never deallocated, then this is not
necessary.
Or did a larger example have some other mechanism to prevent freeing
and later reallocation of the foo structs?
Thanx, Paul
On Sun, Jul 08, 2007 at 01:35:04PM -0700, Paul E. McKenney wrote:
> On Fri, Jul 06, 2007 at 06:30:11PM -0000, Boehm, Hans wrote:
> > This qualifies as a frequently discussed topic. I had this discussion
> > in another group not long ago ...
>
> OK. How about the following?
>
> Q. Once atomics are part of the standard, compiler writers will be
> free to perform whatever optimizations they like on any
> sequence of evaluations not including access to atomics, right?
>
> A. Not quite. Because non-atomic accesses interact with atomic
> accesses, there are a number of situations where care is required.
> One such situation appears in functions which use fine-grained
> locking to manipulate elements of a data structure, but which
> cache information about the last element that they manipulated.
> This technique can greatly improve efficiency in cases where
> a given element is likely to be manipulated multiple times
> in sequence. Consider foo_func() below:
>
> foo_func(foo *p)
> {
> 	__thread foo *cached = 0;
> 	__thread bar precomputed_info;
> 	bar my_info;
>
> 	acquire lock for *p;
> 	manipulate *p;
> 	if (cached == p) {
> 		manipulate *cached some more using precomputed_info;
> 	} else {
> 		cached = p;
> 		manipulate *p some more setting precomputed_info;
> 	}
> 	release lock for *p;
> }
>
> In this case, profiling would show that the then-clause of the
> if-statement would be executed a very large fraction of the time.
> An aggressive compiler writer might therefore be tempted to
> execute this then-clause speculatively. However, doing so would
> introduce data races, because some other thread might well be
> manipulating *cached (and holding cached->lock).
>
> Note that the above case does not introduce any additional
> constraints on optimization. After all, some intervening
> function might well have deleted *cached, so that any access
> to it could potentially result in a page fault, even in the
> absence of multithreading. Therefore, this sort of
> speculative-execution optimization requires extreme care,
> even in single-threaded code.
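To make the hazard concrete, the transformation such a compiler might attempt
looks roughly like the following (pseudocode in the same style as the example;
the "commit"/"discard" notation is illustrative, not any real compiler's
output):

```
/* What the programmer wrote: */
if (cached == p) {
	manipulate *cached some more using precomputed_info;
}

/* What a speculating compiler might effectively emit: */
tmp = *cached;                          /* speculative load: this is the race, */
manipulate tmp using precomputed_info;  /* since another thread may hold       */
if (cached == p)                        /* cached->lock and be writing *cached */
	commit tmp to *cached;          /* at this very moment                 */
else
	discard tmp;
```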
>
> Thanx, Paul
>
> > > -----Original Message-----
> > > From: Paul E. McKenney
> > > > [Sarita: ]
> > > > Paul,
> > > >
> > > > The variables in the example below are NOT atomic.
> > > >
> > > > > > > > > T1: if (A==1) B=1;
> > > > > > > > > T2: if (B==1) A=1;
> > > >
> > > > Yes, we do want programmers to use atomics so their program will be
> > > > easy to reason about.
> > > >
> > > > But how do we explain when to use atomics so the program
> > > > will be easy to reason about? Minimally, one would expect
> > > > that atomics would only be needed in situations where two
> > > > threads access the same variable. Agreed?
> > > >
> > > > For the example above, the only way in which two threads can access
> > > > the same variable is if the threads perform speculative writes.
> > >
> > > In theory, perhaps. But in practice, is this even decidable
> > > in general?
> > > For example:
> > >
> > > T1: foo(); B=1;
> > > T2: bar(); B=2;
> > >
> > > To determine whether there is a data race, one must solve the
> > > halting problem on foo() and bar(). If either can be
> > > guaranteed not to return, then there is no data race. Only
> > > if both return is there a data race.
> > > I suspect that one could leverage any number of other
> > > undecidability results as well.
> >
> > Definitely. But the primary purpose of the standard is to define
> > dynamic behavior of programs. And basically all aspects of dynamic
> > behavior are undecidable. It's also clearly undecidable whether a
> > program will encounter a subscript error. Yet we give that case
> > undefined behavior.
> >
> > > Therefore, I recommend strongly against taking this approach.
> > >
> > > > So if you say the above program must use atomics, I think it is
> > > > equivalent to saying that we have to teach undergrads about
> > > > speculative writes when we teach them parallel programming.
> > > > Would you agree that's unacceptable?
> > >
> > > I agree that this approach would be quite ill-advised for most
> > > undergraduates that I have met.
> > >
> > > > And in that case, how would we explain the need for atomic
> > > > variables in the above situation, but not in many others?
> > >
> > > Any attempt to create a set of guidelines that accurately
> > > describes whether or not atomics are required for all
> > > possible code fragments is guaranteed to end in tears. The
> > > set of guidelines will become arbitrarily complex, given that
> > > the process of creating such guidelines reduces to the
> > > halting problem.
> > >
> > > I therefore recommend starting with a list of situations in
> > > which non-atomics can safely be used. This list can be
> > > incomplete, because unnecessary use of atomics will not be a
> > > problem for most undergraduates.
> > > Undergraduates who are unsatisfied with this situation can be
> > > referred to graduate courses or to the relevant literature --
> > > professors were certainly not shy about taking this approach
> > > with me back when I was an undergraduate, after all. I would
> > > suggest also referring them to relevant concurrent
> > > open-source projects, but not all professors will be
> > > comfortable with open source.
> > >
> > > The list we have covered on this email list thus far seems to include:
> > >
> > > 1. Variables used only under the protection of a given
> > > exclusive lock.
> > >
> > > 2. Variables modified only under the protection of the write side
> > > of a given reader-writer lock, and referenced only under the
> > > protection of that same reader-writer lock (either read or
> > > write side).
> > >
> > > 3. Variables modified only preceding a release operation on a
> > > given atomic variable, and accessed only following an acquire
> > > operation on that same variable.
> > >
> > > 4. Any of the above combined with single-threaded use, where
> > > "single-threaded use" is defined to be use at a time when
> > > only a single thread exists, for example, during initialization.
> > >
> > > 5. Any of the above combined with accesses and modifications
> > > to newly allocated data structures, before a pointer to the
> > > given structure has been made accessible to other threads.
> > >
> > > We might refine this list, possibly expanding its scope
> > > somewhat, though I suspect that it needs to be kept to under
> > > 5-7 points for general undergraduate consumption. But you
> > > would know better than would I.
> > I think this is unnecessary. The rule is simply that if a variable can
> > be simultaneously accessed by more than one thread (using the SC program
> > interpretation), and at least one access is a write, then either
> >
> > - (most likely for beginners) your code has a bug, or
> > - it needs to be atomic
> >
> > I think the problem with Sarita's example is that it has been pruned to
> > the point that it is hard to extract its significance. The real problem
> > here is that if you claim that examples like this do not contain
> > races, you effectively have to tell the user (somehow, and it won't
> > be easy) that locks must be held for any data that might be accessed
> > by a called function, even if that function were to take the wrong
> > control path.
> > (Presumably the only motivation would be to allow speculation that
> > corresponds to executing code from the wrong control path.) This just
> > doesn't work in real code.
> >
> > Here's an example from the earlier discussion:
> >
> > Assume I have a main function that is passed a pointer p and wants to
> > manipulate *p, after acquiring the appropriate lock. It needs to call a
> > helper function for some of this. The helper function is often called
> > on the same p, and optimizes this case by remembering some information
> > about the last p it was called on:
> >
> > main_func(foo *p)
> > {
> > 	acquire lock for *p;
> > 	manipulate *p;
> > 	helper(p);
> > 	release lock for *p;
> > }
> >
> > helper(foo *p)
> > {
> > 	__thread foo *cached = 0;
> > 	__thread bar precomputed_info;
> > 	bar my_info;
> >
> > 	if (cached == p) {
> > 		manipulate *cached some more using precomputed_info;
> > 	} else {
> > 		cached = p;
> > 		manipulate *p some more setting precomputed_info;
> > 	}
> > }
> >
> > This would be incorrect since, if helper took the wrong control path, it
> > could access *cached, for which it doesn't have the lock.
> > (On the control path on which it actually does access it, it of course
> > does have the lock.) And indeed this corresponds to a speculative
> > update to *cached, which an unscrupulous compiler might find profitable.
> >
> > Note that I cannot fix this by moving the lock acquisition for *p into
> > helper; that would make the two manipulations of *p in main_func and
> > helper nonatomic. And it would greatly increase locking overhead.
> >
> > I don't think that a programming model in which this can fail is
> > defensible. But this is basically equivalent to Sarita's example.
> >
> > Hans
>
> --
> cpp-threads mailing list
> cpp-threads at decadentplace.org.uk
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads