Threads: first milestone completed
Maged Michael
magedm at us.ibm.com
Thu Oct 21 05:38:44 BST 2004
Andrei, thanks for a great job.
Gentlemen,
My apologies for the delay in input, but I am swamped these days. Here are
some preliminary thoughts from my (narrow) perspective as an algorithm
designer.
- As you all well know, fence instructions are very expensive and getting
more expensive with faster processors. So, it is crucial to define a model
that can offer performance-conscious programmers the ability to write code
that has *zero* fence overhead over hardware-mandated fence instructions.
That is, the model should allow programmers to write code that generates
no more fence instructions than those that have to be included in an
assembly implementation.
- I suggest considering excluding visibility from the semantics of
"volatile" (assuming that we want to redeem volatile and not define a new
qualifier like "shared"). In many cases all I need is to restrict compiler
optimizations with respect to certain shared variables, without needing
fence instructions around every access to these variables, or in some
cases around any access at all. It might be OK to offer an "ordered
volatile" (with fences around every access) for cases where fast coding is
more important than performance. An example of volatile variables that
don't need fences is shared counters.
Visibility is forced by the semantics of CAS or LL/SC without requiring
fences or assuming implicit fences in CAS and LL/SC.
- I suggest the following semantics for "volatile" (or "shared"): "Every
read or write of such a variable generates exactly one load or store,
respectively." Generating extraneous loads can be as wrong as generating
extraneous stores or omitting loads or stores.
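
To make the shared-counter case concrete, here is a minimal sketch. It
assumes C++11 std::atomic with memory_order_relaxed as a stand-in for the
fence-free "volatile" described above; that library postdates this
discussion and is not the qualifier being proposed. The names counter,
increment, and run_workers are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical shared counter. Relaxed atomics are used here as an
// assumed stand-in for a "volatile" without visibility semantics.
std::atomic<long> counter{0};

// Increment with a CAS loop. memory_order_relaxed asks for atomicity
// only, so no ordering fence is requested around the access.
void increment() {
    long old = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(old, old + 1,
                                          std::memory_order_relaxed,
                                          std::memory_order_relaxed)) {
        // On failure, old now holds the current value; retry with it.
    }
}

// Run `threads` workers, each incrementing `per_thread` times, and
// return the final count after all workers have been joined.
long run_workers(int threads, int per_thread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i) increment();
        });
    }
    for (auto& th : pool) th.join();
    return counter.load(std::memory_order_relaxed);
}
```

No increment is lost because the CAS itself arbitrates between the
threads; no ordering with respect to other variables is requested.
Whether the hardware CAS carries an implicit fence depends on the
architecture.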
I'd appreciate feedback.
Maged