[cpp-threads] SC on PPC

Alexander Terekhov alexander.terekhov at gmail.com
Sat May 12 20:55:40 BST 2007


On 5/9/07, Boehm, Hans <hans.boehm at hp.com> wrote:
[...]
> > It seems natural to me that if an atomic relaxed load
> > observes the value stored by an atomic relaxed store, then
> > the relaxed store hb the relaxed load.
> This would have negative consequences on synchronization elimination.  It
> would mean that if I have
>
> T1: x.store_relaxed(1); z.fetch_add_acq_rel(1); y.store_relaxed(1)
> T2: if (y.load_relaxed()) w.store_release(1);
>
> with z thread local (possibly after thread coalescing by the compiler), I still

Well, you can certainly postulate that purely thread-local variables
(setting aside broken coalescing by the compiler for a moment) don't
synchronize anything. That said, consider:

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/40546.pdf
(40546 Rev. 3.02 May 2007 Software Optimization Guide for AMD Family
10h Processors)
-----
Preferred:
mov sharedmem1, rbx
xchg localmem2, rax ;; performs local store and StoreLoad barrier
;; in one instruction (note: modifies rax)
mov rcx, sharedmem3

Avoid:
mov localmem2, rax
xchg sharedmem1, rbx ;; avoid using shared mem for locked operation
mov rcx, sharedmem3

Avoid:
mov sharedmem1, rbx
mov localmem2, rax
mfence ;; avoid MFENCE which is serializing
mov rcx, sharedmem3
-----
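
For readers who want the "Preferred" sequence in source form, here is a
rough C++ sketch. It assumes the std::atomic spelling from C++11 rather
than the draft-era store_relaxed/load_relaxed names used in this thread.
The names publish_then_read, run_once and local_dummy are hypothetical
and not taken from the AMD guide. The sketch only shows the shape of the
code: the language memory model does not promise that an exchange on a
thread-private atomic orders anything between threads, and any
StoreLoad effect comes from how an x86 compiler typically lowers the
exchange (a locked xchg).

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical helper: publish to `mine`, then read `other`, with a
// seq_cst exchange on a thread-private atomic in between.
inline int publish_then_read(std::atomic<int>& mine, std::atomic<int>& other) {
    // Thread-private atomic, playing the role of localmem2 (assumption).
    thread_local std::atomic<int> local_dummy{0};

    mine.store(1, std::memory_order_relaxed);            // mov sharedmem1, rbx
    local_dummy.exchange(1, std::memory_order_seq_cst);  // xchg localmem2, rax
    return other.load(std::memory_order_relaxed);        // mov rcx, sharedmem3
}

// Runs the two-thread store-buffering shape once and returns what each
// thread read from the other thread's variable.
inline std::pair<int, int> run_once() {
    std::atomic<int> a{0}, b{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] { r1 = publish_then_read(a, b); });
    std::thread t2([&] { r2 = publish_then_read(b, a); });
    t1.join();
    t2.join();
    return {r1, r2};
}
```

Whether both threads can read 0 is exactly the question under
discussion. Code that needs the ordering portably would more likely use
std::atomic_thread_fence(std::memory_order_seq_cst) between the store
and the load than rely on the exchange.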

;-)

regards,
alexander.


