[cpp-threads] C++ memory model

Tue Sep 22 20:54:04 BST 2009

> -----Original Message-----
> From: cpp-threads-bounces at decadentplace.org.uk 
> [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf 
> Of N.M. Maclaren
> Sent: Tuesday, September 22, 2009 12:50 AM
> To: C++ threads standardisation
> Subject: Re: [cpp-threads] C++ memory model
> 
> On Sep 22 2009, Boehm, Hans wrote:
> >
> >> As an aside, Fortran has backed off from this, and has specified a
> >> simple sequentially consistent segment model, with
> >> implementation-dependent atomic semantics.  There are good reasons
> >> for this, as it is targeting a much wider range of architectures
> >> (including distributed memory clusters), and considerable
> >> scalability (thousands of threads and upwards).
> >> Sequentially consistent atomics are not good news for such
> >> requirements ....
> >> 
> > That seems to be the general belief. I personally haven't been
> > convinced of either the truth or falsity of the above statement.
> 
> That should convince you!  Seriously.  If nobody knows how to 
> implement something, requiring it in a standard is NOT good 
> news, even if people are quite certain it can be done.  As 
> experience with Algol 68 should have taught everybody ....

It depends on the other options.  The choice that we typically seem to have in this area is between:

1) A nonstandard that leaves critical things implementation-defined.  (As Nick points out, in the Fortran case, this is currently a small and controversial part of the language that is not perceived as critical, so it may be fine for now.)

2) Existing practice that either implements an ill-defined spec, or gets performance by doing the wrong thing in some unpleasant corner cases, or

3) Generating a very complex spec that tries to accurately characterize existing implementations, or

4) Pushing the envelope a bit when it comes to implementations.

I don't think there is a risk-free solution.  And I'd argue that the last one has the least potential for long-term damage, at least if it's done after careful study.  Admittedly, in the Fortran case, timing may have precluded the careful study.

> 
> Consider the following (reasonable) scenario.  There are N 
> threads and N atomic variables, each of which is accessed 
> more-or-less randomly by a random collection of M threads in 
> any interval of time T, but by all threads over longer 
> periods.  How can that be implemented scalably in, say, the 
> case where M = T = sqrt(N)?

Again, I don't see a fundamental issue, but I'm not an expert on the appropriate protocols.  You need to ensure that every reader of an atomic sees at least the memory updates that were available to the writer, and that there is a total order among the atomic operations themselves.  I think those can all be done with communication among the M active threads.  The effective cost of the individual atomic operations will no doubt increase with M, and hence this won't scale perfectly, but the usable alternatives seem to have at least similar issues.
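
For concreteness, here is a minimal shared-memory sketch of those two requirements, written against std::atomic with its default sequentially consistent ordering.  It says nothing about how a distributed implementation would provide them or what that would cost.  The function names are made up for illustration, and the iteration count passed to the last function is arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Requirement 1: a reader that observes the atomic store also sees
// every ordinary write the writer made before that store.
int message_passing_once() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // operations default to seq_cst
    int seen = -1;

    std::thread writer([&] {
        payload = 42;                // plain write
        ready.store(true);           // atomic store publishes it
    });
    std::thread reader([&] {
        while (!ready.load()) {      // wait until the store is visible
            std::this_thread::yield();
        }
        assert(payload == 42);       // guaranteed by the memory model
        seen = payload;
    });
    writer.join();
    reader.join();
    return seen;
}

// Requirement 2: all seq_cst operations fall into one total order.
// In this store-buffering test, whichever store comes first in that
// order must be visible to the other thread's load, so both loads
// returning 0 is forbidden.
bool store_buffering_forbidden_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] { x.store(1); r1 = y.load(); });
    std::thread t2([&] { y.store(1); r2 = x.load(); });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;       // true only if the model is violated
}

// Repeat the store-buffering test and count forbidden outcomes.
int count_forbidden_outcomes(int iterations) {
    int count = 0;
    for (int i = 0; i < iterations; ++i) {
        if (store_buffering_forbidden_once()) {
            ++count;
        }
    }
    return count;
}
```

With weaker orderings (memory_order_relaxed, or release/acquire alone) the second test is allowed to see both loads return 0, which is the kind of cheaper but harder-to-define behaviour under discussion.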

The one expert in distributed implementations whom I did ask seemed to be largely in agreement with me.

We agree that none of this is a proof ...

> 
> Like you, I haven't been convinced whether it is possible or 
> impossible, but I have been convinced that it is at least a 
> difficult problem and it is quite likely that there is no 
> known solution.
> 
> > So far, it looks to me more and more like the issues are similar to
> > the shared memory case: It's easy to implement something cheaper
> > whose semantics we can't rigorously define (particularly if you
> > don't have control over the underlying architecture, which
> > implements something cheaper without a simple definition). Once you
> > look at the fundamental properties that need to be enforced for
> > something of manageable complexity, I have a hard time convincing
> > myself that the alternatives are really much cheaper, unless you
> > expose something similar to a message-passing model instead, which
> > of course presents different tradeoffs.
> 
> Yes.  It was adopted as the standards committee's compromise
> between the people who demanded atomics and those who wanted
> them excluded and higher-level primitives (e.g. reductions)
> provided instead.
> 
> Regards,
> Nick Maclaren.