[cpp-threads] Alternatives to SC
Paul E. McKenney
paulmck at linux.vnet.ibm.com
Sat Jan 13 01:53:07 GMT 2007
On Fri, Jan 12, 2007 at 01:39:38PM -0500, Doug Lea wrote:
> Paul E. McKenney wrote:
> >
> >
> >Anyway, given your clarification, the overall effect seems to be:
> >
> > For every read r of x, there is some write to x that produces
> > the value read, and the read r occurs in a unique place within
> > the totally-ordered series of writes to x.
>
> Yes. Note that two reads at the same position are not ordered
> wrt each other though.
Got it, thank you!
> >>>>[Causal Propagation]
> >>>> e --> e'' if there is an e' in E(p(e)) u E(p(e'')) st e --> e' --> e''
> >>>> .
> >>>
> >OK. I am a bit confused by this
>
> Here's the intuition. Suppose you have:
> T1 T2
> e; wx{x=1} rx{r1=x}
> Where e is any other read or write, and rx sees the write wx.
> in which case
> e --> wx (program order)
> wx --> rx (freshness)
> Now, e and wx are in T1, rx is in T2.
> So we can apply causal propagation to get e --> rx.
>
> And similarly if we had:
> T1 T2
> wx{x=1} rx{r1=x}; e
> We'd get wx --> e.
>
> And further, if we had
> T1 T2 T3
> e; wx{x=1} rx{r1=x}; wy{y=1} ry{r2=y}
> we'd eventually get e --> ry
(and I understand the "eventually" from your next message)
> But the rule doesn't allow arbitrary inference-at-a-distance
> propagation through third parties. That's the "causal" aspect
> of this rule.
So, does the following formalism describe this?
e --> e'' if exists(e') in E(p(e)) u E(p(e'')) st
e --> e' --> e'' and
(p(e) == p(e') or p(e') == p(e''))
Or is the restriction on the processors implied in some other way that
I missed?
Ah!!! I finally see it -- The fact that e' is selected from the
executions of the (up to) two processors executing e and e'' means
that e' cannot possibly be executing on a third processor. So my
restriction after the "and" above is indeed redundant.
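For my own sanity, here is a small Python sketch of the rule as I now
understand it (this is just my modeling, not anything from the formalism
itself): p maps each event to its processor, and an edge e --> e'' is added
only when some intermediate e' with e --> e' --> e'' runs on p(e) or p(e''),
which is exactly the E(p(e)) u E(p(e'')) membership condition.

```python
def causal_propagation(edges, p):
    """Close `edges` under the Causal Propagation rule."""
    edges = set(edges)
    changed = True
    while changed:
        changed = False
        for (e, m1) in set(edges):
            for (m2, e2) in set(edges):
                if m1 != m2 or (e, e2) in edges:
                    continue
                # e' must belong to E(p(e)) u E(p(e''))
                if p[m1] in (p[e], p[e2]):
                    edges.add((e, e2))
                    changed = True
    return edges

# Doug's first example: e; wx on T1, rx on T2, wx --> rx by Freshness
p = {"e": "T1", "wx": "T1", "rx": "T2"}
closed = causal_propagation({("e", "wx"), ("wx", "rx")}, p)
print(("e", "rx") in closed)  # True: e --> rx, via e' = wx on p(e)
```

Note that the rule fires here because e' = wx runs on p(e) = T1; an
intermediate event on some third processor would not qualify.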
So this formalism supports what I have been calling "pairwise memory
barriers" -- a pair of CPUs can reliably communicate ordered changes
with memory barriers (or with volatiles, atomics, or whatever for which
the compiler supplies the needed memory barriers).
Trying another example:
    T1              T2              T3
[1] x = 1       [3] y++         [4] if (y == 2)
[2] y = 1                       [5]   assert(x == 1)
If we reach the assertion at [5], we know:
[2] --> [3] by Write Serialization and [3] --> [4] by Freshness
[1] --> [2] and [4] --> [5] by Program Order
Applying Causal Propagation with e=[1], e'=[2], and e''=[4], we have
[1] --> [4]. This is permitted because [1] and [2] are both
executed by T1 and [4] is executed by T3.
Applying Causal Propagation with e=[1], e'=[4], and e''=[5], we have
[1] --> [5]. This is permitted because [1] is executed by T1
and [4] and [5] are both executed by T3.
So the set G is the transitive closure of:
{[1] --> [2] --> [3] --> [4] --> [5]}
The set W is {[1], [2], [3]}.
Can G legally contain [5] --> [1]? No, it cannot, because if it did:
G | W u E(T3)
would contain a cycle. Therefore, the assertion, if reached, cannot fail.
Or am I missing something here? (I believe that something like this
is in fact needed if the formalism is to be able to implement locking.)
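To convince myself, a throwaway Python check of this example (labels 1-5 as
above; my own modeling of the acyclicity condition): project G onto
W u E(T3) and look for a cycle in the transitive closure.

```python
def has_cycle(edges, nodes):
    # Reachability restricted to `nodes`, i.e. the projection G | nodes
    reach = {n: {b for a, b in edges if a == n and b in nodes} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new = set()
            for m in reach[n]:
                new |= reach[m]
            if not new <= reach[n]:
                reach[n] |= new
                changed = True
    return any(n in reach[n] for n in nodes)

# G from the example: [1] --> [2] --> [3] --> [4] --> [5]
G = {(1, 2), (2, 3), (3, 4), (4, 5)}
W = {1, 2, 3}          # the write events
focus = W | {4, 5}     # W u E(T3)

print(has_cycle(G, focus))             # False: acyclic, assertion can pass
print(has_cycle(G | {(5, 1)}, focus))  # True: [5] --> [1] closes a cycle
```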
> Hope this helps!
Well, let's see how I do on the examples. ;-)
> EXAMPLES:
>
> 1. Dekker (forbidden outcome)
>
> Given
>
> T1 T2
> [1] x = 1 [3] y = 1
> [2] r1 = y(0) [4] r2 = x(0)
>
> Program Order gives
> [1] --> [2], [3] --> [4],
> Write Serialization and Freshness give
> [2] --> [3], [4] --> [1].
> Propagation adds
> [3] --> [1], [1] --> [3] (and some other edges).
>
> This graph is not an execution because (G | {[1],[3],[4]})* has a
> cycle ([1] --> [3] --> [1])
This is using T2:
(G | W u E(T2))*
expands to:
(G | {[1],[3]} u {[3],[4]})*
which is the (G | {[1],[3],[4]})* above.
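Checked mechanically with a throwaway Python script (my own modeling; event
labels and edges exactly as in the example):

```python
def has_cycle(edges, nodes):
    # Reachability restricted to `nodes`, i.e. the projection G | nodes
    reach = {n: {b for a, b in edges if a == n and b in nodes} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new = set()
            for m in reach[n]:
                new |= reach[m]
            if not new <= reach[n]:
                reach[n] |= new
                changed = True
    return any(n in reach[n] for n in nodes)

# Dekker: [1] x=1, [2] r1=y(0) on T1; [3] y=1, [4] r2=x(0) on T2
G = {(1, 2), (3, 4),        # Program Order
     (2, 3), (4, 1),        # Write Serialization / Freshness
     (3, 1), (1, 3)}        # Causal Propagation
focus = {1, 3, 4}           # W u E(T2) = {1,3} u {3,4}
print(has_cycle(G, focus))  # True: [1] --> [3] --> [1], outcome forbidden
```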
> ======
>
> 2. IRIW (Independent reads of independent write -- allowed outcome)
> Note that this is not SC.
>
>
> T1 T2 T3 T4
>
> [1] x = 1 [2] y = 1
>
> [3] r1 = x (1) [5] r3 = y (1)
> [4] r2 = y (0) [6] r4 = x (0)
>
> Program Order, Write Serialization, and Freshness give:
> [1] --> [3] --> [4] --> [2]
> [2] --> [5] --> [6] --> [1]
> Propagation adds
> [2] --> [6], [5] --> [1] and [1] --> [4], [3] --> [2].
>
> This graph satisfies all these conditions. (Note that [1] --> [2] is
> in (G | W u E(T2))* and [2] --> [1] is in (G | W u E(T1))*, but both
> these graphs are separately acyclic.)
The trick here is that neither T3 nor T4 write to any volatiles,
and the only cycles in G involve both T3 and T4. So projecting
onto any one task will mask away some portion of the cycle.
In contrast, in my earlier example, both T1 and T2 had -only- writes,
and an assertion failure produced a cycle involving T1, T2, and T3.
T3 was able to see all operations, and so the cycle was visible to T3,
and thus not permitted.
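A quick Python check of that intuition (my own modeling; labels and edges as
Doug listed them, with W = {1, 2}): the full graph is cyclic, but each
per-task projection masks part of the cycle and is acyclic.

```python
def has_cycle(edges, nodes):
    # Reachability restricted to `nodes`, i.e. the projection G | nodes
    reach = {n: {b for a, b in edges if a == n and b in nodes} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new = set()
            for m in reach[n]:
                new |= reach[m]
            if not new <= reach[n]:
                reach[n] |= new
                changed = True
    return any(n in reach[n] for n in nodes)

# IRIW: [1] x=1 (T1), [2] y=1 (T2), [3][4] reads on T3, [5][6] reads on T4
G = {(1, 3), (3, 4), (4, 2),          # T3 chain
     (2, 5), (5, 6), (6, 1),          # T4 chain
     (2, 6), (5, 1), (1, 4), (3, 2)}  # Causal Propagation
all_events = {1, 2, 3, 4, 5, 6}
W = {1, 2}

print(has_cycle(G, all_events))  # True: the cycle spans both T3 and T4
print(has_cycle(G, W | {3, 4}))  # False: W u E(T3) projection is acyclic
print(has_cycle(G, W | {5, 6}))  # False: W u E(T4) projection is acyclic
```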
> =======
>
> 3. CC (Causal consistency -- forbidden outcome)
>
> T1 T2 T3
> [1] x = 1
> [2] r1 = x (1) [4] r2 = y (1)
> [3] y = 1 [5] r3 = x (0)
>
> Program Order, Write Serialization, and Freshness give:
> [1] --> [2] --> [3] --> [4] --> [5] --> [1].
> This is not an execution because
> G | E u E(T3) = G | {[1],[3],[4],[5]}*
The above line needs to be:
G | W u E(T3) = G | {[1],[3],[4],[5]}*
Right?
> has cycle [1]-->[3]-->[4] --> [5] --> [1].
So we retain dependencies among writes, even if an intervening
read on the non-focus task was needed to establish that dependency?
So even though [2] is dropped from the local-consistency check,
the [1] --> [3] dependency that it established is retained, due to
the fact that this dependency has effect on a pair of writes. In
contrast, the dependencies in example 2 in T3 and T4 were only
on reads, and thus not retained.
Another way of putting this (I think) is that a read can establish
a dependency, but that dependency matters only if that read either:
1. was executed by the focus task (T3 in this case), or
2. forces a dependency involving a write event that was
executed on the same processor as was the read.
Is this correct?
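If I have this right, it can be checked mechanically. A Python sketch (my
own modeling of both rules; labels as in example 3): propagation, driven by
the read [2] on T2, adds the write-write edge [1] --> [3], and that edge
then survives the projection onto W u E(T3), producing the forbidden cycle.

```python
def propagate(edges, p):
    """Close `edges` under Causal Propagation: e' must run on p(e) or p(e'')."""
    edges = set(edges)
    changed = True
    while changed:
        changed = False
        for (e, m1) in set(edges):
            for (m2, e2) in set(edges):
                if m1 == m2 and (e, e2) not in edges and p[m1] in (p[e], p[e2]):
                    edges.add((e, e2))
                    changed = True
    return edges

def has_cycle(edges, nodes):
    # Reachability restricted to `nodes`, i.e. the projection G | nodes
    reach = {n: {b for a, b in edges if a == n and b in nodes} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new = set()
            for m in reach[n]:
                new |= reach[m]
            if not new <= reach[n]:
                reach[n] |= new
                changed = True
    return any(n in reach[n] for n in nodes)

# CC: [1] x=1 (T1); [2] r1=x, [3] y=1 (T2); [4] r2=y, [5] r3=x (T3)
p = {1: "T1", 2: "T2", 3: "T2", 4: "T3", 5: "T3"}
base = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)}
G = propagate(base, p)
print((1, 3) in G)                 # True: the read [2] established it
print(has_cycle(G, {1, 3, 4, 5}))  # True: cycle in G | W u E(T3), forbidden
```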
> =======
>
> 4. DC (Another causal consistency example -- forbidden outcome)
>
> T1 T2 T3
> [1] y = 1
> [2] x = 1 [4] r3 = y (1)
> [3] r2 = y (0) [5] r4 = x (0)
>
> Program Order, Write Serialization, and Freshness give:
> [1] --> [4] --> [5] --> [2] --> [3] --> [1]
> Propagation adds
> [2] --> [1] (among others).
> This is not an execution because
> G | W u E(T3) = (G | {[1],[2],[4],[5]})*
> has the cycle [1] --> [4] --> [5] --> [2] --> [1].
The upside-down version of example 3.
Thanx, Paul