[cpp-threads] OpenMP Memory Model

Thu Apr 13 00:25:24 BST 2006

Thanks for forwarding this.  I copied the authors, with the hope of
encouraging discussion.

Having tried to read this quickly (a mistake, I'm sure), it seems to
me that the paper diverges from the proposed C++ memory model in several
ways.  I think it would be useful to make those differences explicit,
and to make sure that they are intentional:

1) The OpenMP spec ignores things like bit-fields, and the related
issues of when one write may require reading and rewriting other
objects.  That may not matter much in the OpenMP context.
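
To make the bit-field issue concrete, here is a rough sketch.  The
byte layout and the function names are made up for illustration, and
the "threads" are replayed sequentially so the result is deterministic.
It models a store to one 4-bit field as a read-modify-write of the
containing byte, which is how such stores are commonly implemented:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: two 4-bit fields packed into one byte,
// a in the low nibble, b in the high nibble.

// What a store to field a typically becomes on hardware without
// sub-byte stores: load the whole byte, splice in the new bits,
// store the whole byte back (rewriting b as a side effect).
std::uint8_t store_a(std::uint8_t word, std::uint8_t value) {
    return static_cast<std::uint8_t>((word & 0xF0u) | (value & 0x0Fu));
}

// Same idea for field b.
std::uint8_t store_b(std::uint8_t word, std::uint8_t value) {
    return static_cast<std::uint8_t>((word & 0x0Fu) | ((value & 0x0Fu) << 4));
}

// One unlucky interleaving of "a = 3" (thread 1) and "b = 7" (thread 2),
// replayed sequentially.
std::uint8_t lost_update_demo() {
    std::uint8_t shared = 0x00;
    std::uint8_t t1_loaded = shared;      // thread 1 loads the byte
    shared = store_b(shared, 7);          // thread 2 completes b = 7
    shared = store_a(t1_loaded, 3);       // thread 1 stores back a stale b
    assert((shared >> 4) == 0);           // thread 2's update is gone
    return shared;
}
```

Neither thread wrote to the other's field in the source, yet one
update is lost.  That is the kind of case a memory model has to either
forbid or define.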

2) It assumes that compilers cannot introduce new writes (e.g. as part
of speculative register promotion).  We agree that's the proper rule,
but I suspect it's no more true for current implementations than it is
for C++.
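
As an illustration of an introduced write, here is a hand-written
model of speculative register promotion.  The function names are
hypothetical and the transformation is written out by hand; it is not
the output of any particular compiler:

```cpp
#include <cassert>

// As written: x is stored to only when cond is true.
void sum_as_written(int& x, bool cond, int n) {
    for (int i = 0; i < n; ++i) {
        if (cond) x += i;
    }
}

// Hand-written model of the same loop after speculative register
// promotion: x lives in a register and is written back unconditionally.
void sum_promoted(int& x, bool cond, int n) {
    int reg = x;                 // speculative load
    for (int i = 0; i < n; ++i) {
        if (cond) reg += i;
    }
    x = reg;                     // store the source never asked for
}

// One interleaving, replayed sequentially.  cond is false, so thread 1
// should not touch x at all, yet thread 2's write is lost.
int introduced_write_demo() {
    int x = 0;
    int reg = x;                 // thread 1: speculative load
    x = 42;                      // thread 2: legitimate write to x
    x = reg;                     // thread 1: introduced write-back
    assert(x == 0);
    return x;
}
```

Single-threaded, the two versions are indistinguishable.  The
difference only shows up when another thread writes x between the
speculative load and the write-back.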

3) The paper doesn't mention volatile.  I looked at the OpenMP 2.5 spec
again.  That does give semantics to volatile, though the semantics don't
look plausible to me.  (The implicit flush operation seems to be on the
wrong side, at least when comparing it to Java volatile semantics.  And
putting it at the previous/next sequence point seems to me to allow some
really weird code, at the expense of optimization.)  The C++ memory
model may or may not mention volatile, but I'm pretty sure it will not
use the OpenMP semantics.

4) The notion of atomic operations seems quite different, in that:

  (a) There is no need to label reads of atomic variables specially.  An
ordinary read that races with an atomic write is OK.  I don't think this
is a good design decision, but it seems to be the way OpenMP works.

  (b) Atomic operations are what we would call "unordered".  Given that
the corresponding reads are not special and hence unordered, this
probably doesn't lose anything. 

I expect the resulting notion of atomics usually requires lots of
explicit "flush"es, which I think are expensive to implement on
architectures like x86 or IA64, since they appear to require StoreLoad
ordering.
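
A sketch of why StoreLoad ordering matters, written with the
std::atomic interface that C++ later standardized (it is not part of
OpenMP, and treating a flush as a full fence is my assumption here).
Each function is meant to run on its own thread.  The variable and
function names are invented for the example:

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag_a{0};
std::atomic<int> flag_b{0};

// Intended for one thread: announce, fence, then look at the peer.
int left() {
    flag_a.store(1, std::memory_order_relaxed);           // unordered write
    std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad barrier
    return flag_b.load(std::memory_order_relaxed);        // unordered read
}

// Intended for the other thread: the mirror image.
int right() {
    flag_b.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return flag_a.load(std::memory_order_relaxed);
}

// With both fences in place, the two threads cannot both read 0.
void check_outcome(int r_left, int r_right) {
    assert(!(r_left == 0 && r_right == 0));
}
```

Without the fences, both functions may return 0, because the hardware
may let each load complete before the preceding store becomes visible.
Preventing that is the expensive part on most processors.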

5) Is the notion of eclipsing reads designed to prevent variables from
"flip flopping", as in

Thread 1:
x = 1;
flush1;
x = 2;

Thread 2:
flush2;
r1 = x;
r2 = x;

outcome: flush1 Oflsh flush2 && r1 = 2 && r2 = 1?

If so, I think that would get us back to the "reads kill" problem in the
original Java memory model, which wasn't a good idea, mostly because it
seemed to be too expensive to get anyone to bother to implement
correctly.  The issue arises because in real cases, the reads in thread
2 may look more like:

r0 = *y/17;
r1 = *x;
r2 = *y/17;

With this rule, you can no longer eliminate the common subexpressions
unless you know that *x and *y don't alias. 
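
Spelled out in C++, as a sketch with made-up function names.  The
second function is a hand-written model of what common subexpression
elimination does, and the third replays one interleaving sequentially
for the case where x and y alias:

```cpp
#include <cassert>

struct Reads { int r0, r1, r2; };

// The three reads as written in the source.
Reads reads_as_written(const int* x, const int* y) {
    int r0 = *y / 17;
    int r1 = *x;
    int r2 = *y / 17;
    return Reads{r0, r1, r2};
}

// After common subexpression elimination: the second *y/17 is not
// recomputed, so r2 is based on the older read of *y.
Reads reads_after_cse(const int* x, const int* y) {
    int r0 = *y / 17;
    int r1 = *x;
    int r2 = r0;
    return Reads{r0, r1, r2};
}

// x and y alias, and another thread's write lands between the reads.
Reads flip_flop_demo() {
    int shared = 17;
    const int* x = &shared;
    const int* y = &shared;
    int r0 = *y / 17;           // based on the old value: 1
    shared = 34;                // the other thread's write
    int r1 = *x;                // sees the new value: 34
    int r2 = r0;                // CSE'd: still 1, as if the old value came back
    assert(r2 != *y / 17);      // a fresh read would have given 2
    return Reads{r0, r1, r2};
}
```

The location appears to go from old to new and back to old, which is
the "flip flop" that a rule like this would prohibit.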

6) If a read is involved in a race, the value associated with the read
is undefined, but it does not render the meaning of the entire program
undefined.  I think that's problematic for C++, in that a race can
result in a read of an uninitialized vtable pointer, which can normally
result in a wild branch, and hence any behavior whatsoever.
Implementing anything else often requires some sort of memory fence
after the vtable initialization. 
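
For concreteness, a sketch of publishing an object with a vtable
safely, again using the later std::atomic interface.  The class names
and the publish/try_consume helpers are hypothetical.  The release
store plays the role of the fence after the vtable initialization:

```cpp
#include <atomic>
#include <cassert>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

std::atomic<Shape*> published{nullptr};

// Writer side: the object (vtable pointer included) is fully
// constructed before this call; the release store keeps those
// initializing writes ahead of the pointer becoming visible.
void publish(Shape* s) {
    assert(s != nullptr);
    published.store(s, std::memory_order_release);
}

// Reader side: the acquire load pairs with the release store, so a
// non-null pointer implies an initialized vtable pointer.
// Returns -1 if nothing has been published yet.
int try_consume() {
    Shape* s = published.load(std::memory_order_acquire);
    return s ? s->sides() : -1;
}
```

If the pointer were instead passed through an ordinary variable with a
race on it, the virtual call in try_consume could go through an
uninitialized vtable pointer, and that is where the "any behavior
whatsoever" comes from.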

Hans

> -----Original Message-----
> From: cpp-threads-bounces at decadentplace.org.uk 
> [mailto:cpp-threads-bounces at decadentplace.org.uk] On Behalf 
> Of Lawrence.Crowl at Sun.com
> Sent: Wednesday, April 12, 2006 12:35 PM
> To: cpp-threads at decadentplace.org.uk
> Subject: [cpp-threads] OpenMP Memory Model
> 
> Here is a paper on the OpenMP memory model.  I have not yet 
> had a chance to read it, so I have no comments at this time.
> 
>   Lawrence Crowl             650-786-6146   Sun Microsystems, Inc.
>                    Lawrence.Crowl at Sun.com   16 Network Circle, UMPK16-303
>            http://www.Crowl.org/Lawrence/   Menlo Park, California, 94025
>