[cpp-threads] RE: Memory model formalism, again

Thu Dec 15 22:02:35 GMT 2005

> From: Jeremy Manson [mailto:jmanson at cs.purdue.edu] 
> 
> Hi Hans,
> 
> Thanks for including me on this - I don't know how helpful I 
> will be, as 
> I don't really have access to information about what the requirements 
> for the C++ MM are.  But I'm happy to look at it.
> 
> Boehm, Hans wrote:
> > 1) (I think this is roughly a reuse of something Peter suggested a 
> > long time ago in a different context.)  We define an atomic 
> store in a 
> > given thread to depend on an (earlier in thread ordering) 
> atomic load 
> > if by altering only the value seen by that load, the store 
> (i.e. the 
> > value, location, or existence) is altered.  (This is 
> informal, but I 
> > think unambiguous and easily formalizable.  It is independent of 
> > compiler dependence analysis, or the hardware's notion of 
> dependence.)
> 
> This is a little tricky, although perhaps not terminally so.  
> It looks 
> as if you are going for a model where you are given an 
> execution E, and 
> then validate its correctness.  To do so, you need to, for 
> every atomic 
> load, generate an execution E' where that atomic load sees another 
> value.  How do you generate E'?
> 
> The issue here is that you need E and E' to be as similar as 
> possible, 
> and you need a way to make sure that the atomic load in E' 
> corresponds 
> to the atomic load in E.  If you don't do this, it can cause 
> problems. 
> Here's an artificial example:
> 
> Thread 1:
> r1 = read input from keyboard
> if (r1 == 1)
>    r2 = v;
>    w = 42;
> else {
>    r2 = v;
>    if (r2 == 42)
>      w = 42;
> }
> 
> Thread 2:
> r3 = w;
> if (r3 == 42)
>    v = 42;
> 
> If, in E, r1 == 0, but in all E' your method generates, r1 == 
> 1, you can 
> claim that w = 42 is not dependent on r2 = v.
> 
> This is a fairly stupid example (it's late and I don't feel 
> like working 
> out a better one :) ), but you should see where I am going.  An 
> iterative formulation (as per the JMM's commit sets) seems to work.
Thanks for the response.

I agree that this is a bit tricky, but I would also argue that it's not
terminally so, as you put it.  I would define actions as including the
(position of) the corresponding source statement.  You can then also
talk about valid executions of a single thread.  (I do this in
http://www.hpl.hp.com/techreports/2005/HPL-2005-217.html, and I think
you may want to anyway to make intra-thread constraints explicit.  It's
admittedly more complicated here, since in the tech report I carefully
talk only about sequentially consistent executions.)  In this context I
think you can talk about single-thread executions that differ only in
the value returned by a single, potentially inter-thread, load.

The result still seems considerably simpler to me than the iterative
formulation.  And there seems to be a non-misleading informal version of
the statement, which is probably want we want in the standard.
> 
> As for the rest, I can't really evaluate it without knowing 
> what are and 
> are not acceptable outcomes for atomics.  I would urge you to look at 
> Causality Test Case 17, though, as I suspect that your intent would 
> disallow it, and I am not sure you want to do that (although 
> maybe you 
> do, given that we are talking about atomics).
That's an interesting example to look at.

Let's reinterpret all loads and stores in test case 17 of
http://www.cs.umd.edu/~pugh/java/memoryModel/CausalityTestCases.html as
atomic and unordered operations.
Adding some line numbers, we get

Initially,  x = y = 0

Thread 1:
1: r3 = x
2: if (r3 != 42)
3:   x = 42
4: r1 = x
5: y = r1

Thread 2:
6: r2 = y
7: x = r2

Can r1 = r2 = r3 = 42?  (r3 is the interesting one.)

By my posted rules, 1 happens-before 3, 4 happens-before 5, and 6
happens-before 7.  For the outcome to occur,
1 would have to see 7, 4 would have to see 3, and 6 would have to see 5.

Thus 7 hb 1 hb 3 hb 4 hb 5 hb 6 hb 7, and we have a cycle.  Thus this is
indeed disallowed.  This means we can't do constant propagation on
thread 1 and transform it to

Thread 1':
1: r3 = x
2: if (r3 != 42)
3:   x = 42
5: y = 42

Given our current context, I wouldn't be far less upset about that than
in the Java context.  I doubt anyone does constant propagations through
volatile stores and loads now, which would be the closest analog.

But I think we can actually arrange to get the other answer, assuming we
want it.  If we say that the fact that a load sees a store induces a hb
relationship only across threads, not within a thread, then we no longer
have 3 hb 4, and I think the questionable outcome becomes allowable.  I
think that has the right effect of allowing intra-thread analysis by not
enforcing implicit intra-thread visibility orders, and it's probably
what I should have said.  But it would be good to think about this more.
> 
> 					Jeremy
> 
> 
>