[cpp-threads] memory model
Alexander Terekhov
alexander.terekhov at gmail.com
Sat Apr 30 16:09:35 BST 2005
<the message was sent in private>
> Let me check that I've understood this correctly. Is the following
> summary correct?
>
> Hoist-load barriers (hlb), hoist-store barriers (hsb) and acquire
> barriers (acq) prevent logically subsequent memory reads, writes or both
> (respectively) being performed before the associated operation.
Yep. Classic acquire is hlb+hsb.
> The dd-variants only constrain speculative execution of subsequent
> memory operations whose operands depend on the value read by the
> associated operation. The cc-variants only constrain speculative
> execution of subsequent memory operations that are conditional on
> the value read by the associated operation.
Yes. dd is basically about "value prediction" and making it work
"correctly" (in the presence of a dd barrier).
http://www.cs.wisc.edu/~cain/pubs/micro01_correct_vp.pdf
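For illustration only: the closest analogue in the std::atomic vocabulary that C++ later standardised is probably memory_order_consume. That mapping is an assumption of this sketch, not notation used in this thread, and the names Node, published, publish and read_payload are hypothetical. The reader's second access uses an address computed from the value it just loaded, which is the kind of dependency a dd barrier is meant to protect against value prediction.

```cpp
#include <atomic>
#include <thread>

// Hypothetical payload type, used only for this sketch.
struct Node { int payload; };

std::atomic<Node*> published{nullptr};

// Writer: initialise the node, then publish its address.
// The release store keeps the payload write from sinking below the publication.
void publish(Node* n, int value) {
    n->payload = value;
    published.store(n, std::memory_order_release);
}

// Reader: the dereference below depends on the pointer value just loaded.
// memory_order_consume (assumed here to approximate a dd-style barrier)
// orders only such dependent accesses, not unrelated later loads.
int read_payload() {
    Node* n;
    while ((n = published.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();  // wait until something is published
    }
    return n->payload;  // address depends on n, so it must see the initialised payload
}
```

In intended use, publish and read_payload run on different threads. Most current compilers simply strengthen consume to acquire, so this shows the intent rather than a guaranteed cheaper barrier.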
> (Do the cc-variants also include the constraints of the dd-variants?)
No. If you need both you'd have to use either straight acq or
dd+cc (plus control condition).
>
> Sink-load barriers (slb), sink-store barriers (ssb) and release barriers
> (rel) prevent logically preceding memory reads, writes or both
> (respectively) being performed after the associated operation.
Yep.
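As a rough sketch of how the two halves pair up, here is the usual message-passing pattern written with the std::atomic memory orders that C++ later standardised. Treating memory_order_release as "slb+ssb" and memory_order_acquire as "hlb+hsb" is an assumption of this example, and data, ready, producer and try_consume are hypothetical names.

```cpp
#include <atomic>

int data = 0;                      // ordinary, non-atomic payload
std::atomic<bool> ready{false};    // flag that carries the ordering

// Release side: the preceding write to data may not be performed
// after the store to ready.
void producer(int value) {
    data = value;
    ready.store(true, std::memory_order_release);
}

// Acquire side: the subsequent read of data may not be performed
// before the load of ready.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;              // nothing published yet
    }
    out = data;                    // sees the value written before the release store
    return true;
}
```

With producer on one thread and try_consume on another, a true return means out holds the value that was written before the release store.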
>
> A full fence is a combined acquire and release barrier (prevents any
> reordering across it). Its associated operation may be a no-op.
Yup.
>
> A store-load fence (slfence) is a combined sink-store barrier and
> hoist-load barrier, and needn't be associated with an operation of its
> own.
Its associated operation is always a no-op.
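A Dekker-style handshake is the classic case that needs store-load ordering. The sketch below uses std::atomic_thread_fence(std::memory_order_seq_cst) from later standard C++ as a stand-in. That is an assumption: standard C++ has no dedicated store-load fence, and this one is a full fence, so it is stronger than an slfence. The names flag0, flag1, side0 and side1 are hypothetical.

```cpp
#include <atomic>

std::atomic<int> flag0{0};
std::atomic<int> flag1{0};

// Each side raises its own flag, then looks at the other side's flag.
// The standalone fence (no associated operation) keeps the store from
// sinking below the load and the load from hoisting above the store.
int side0() {
    flag0.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return flag1.load(std::memory_order_relaxed);
}

int side1() {
    flag1.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return flag0.load(std::memory_order_relaxed);
}
```

When side0 and side1 run concurrently, the fences rule out the outcome where both return 0. Without them, relaxed stores and loads would allow it.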
>
> (End of summary.)
>
> All that given, presumably the cc- and dd-variants are important for
> constraining compiler optimisations even though they generally won't
> require it to generate barrier instructions?
Compilers aside for a moment, they are important to get rid of
redundant hardware barriers.
http://www.google.de/groups?selm=ca25eec8.0406141820.7a334e99%40posting.google.com
(the change in membars gave about a 7% performance improvement
on UltraSparc IIs)
http://www.google.de/groups?selm=40CED460.1B7C3FAB%40web.de
(I've changed terminology in the meantime: the old ddacq is now ccacq,
i.e. acquire with control condition.)
>
> Where a compiler can't "see" the body of a function, it must treat calls
> to that function as fully fenced, right?
Unless it is declared as http://tinyurl.com/68jav (or something like that).
;-)
regards,
alexander.