[cpp-threads] modes, pass 2

Alexander Terekhov alexander.terekhov at gmail.com
Thu May 12 11:48:59 BST 2005


Forgot one thing...

On 5/12/05, Alexander Terekhov <alexander.terekhov at gmail.com> wrote:
> [... JSR-133 cookbook and isync ...]
> 
> > some people assume you already have a control dependency
> > and some don't. I changed this accordingly on cookbook page.
> 
> Uhmm. FWIW I (still) don't like your use of Sparc's bidirectional
> fences to illustrate Java's RCsc. Even with control dependency,
> isync isn't quite LoadLoad (in the sense of a bidirectional fence
> that prohibits reordering of *all* preceding loads with respect
> to subsequent ones).
> 
> http://www.cs.umd.edu/~pugh/java/memoryModel/archive/1222.html
> http://www.decadentplace.org.uk/pipermail/cpp-threads_decadentplace.org.uk/2005-May/000445.html
> 
> Why don't you switch to hoist/sink stuff labels? Ok, perhaps with
> "msync constraint(allow_reordering pre, allow_reordering post)"
> calculator, so to speak. ;-)

You say in the cookbook:

----
For descriptions of the underlying models supported on different processors, 
see Sarita Adve et al, Recent Advances in Memory Consistency Models for 
Hardware Shared-Memory Systems and Sarita Adve and Kourosh 
Gharachorloo, Shared Memory Consistency Models: A Tutorial. 
----

I think that  

http://tinyurl.com/dzc8l
(Kourosh Gharachorloo, Anoop Gupta, and John Hennessy. Two 
techniques to enhance the performance of memory consistency 
models.)

is also quite illuminating: the "acq, done, and store tag" fields, etc.

<quote>

The speculative-load buffer provides the detection mechanism by 
signaling when the speculated result is incorrect. The buffer 
works as follows. Loads that are retired from the reservation 
station are put into the buffer in addition to being issued to 
the memory system. There are four fields per entry (as shown in 
Figure 4): load address, acq, done, and store tag. The load 
address field holds the physical address for the load. The acq 
field is set if the load is considered an acquire access. For 
SC, all loads are treated as acquires. The done field is set 
when the load is performed. If the consistency constraints 
require the load to be delayed for a previous store, the store 
tag uniquely identifies that store. A null store tag specifies 
that the load depends on no previous stores. When a store 
completes, its corresponding tag in the speculative-load buffer 
is nullified if present. Entries are retired in a FIFO manner. 
Two conditions need to be satisfied before an entry at the head 
of the buffer is retired. First, the store tag field should 
equal null. Second, the done field should be set if the acq 
field is set. Therefore, for SC, an entry remains in the buffer 
until all previous load and store accesses complete and the 
load access it refers to completes. Appendix A describes how an 
atomic read-modify-write can be incorporated in the above 
implementation.

We now describe the detection mechanism. The following 
coherence transactions are monitored by the speculative-load 
buffer: invalidations (or ownership requests), updates, and 
replacements. The load addresses in the buffer are 
associatively checked for a match with the address of such 
transactions. Multiple matches are possible. We assume the 
match closest to the head of the buffer is reported. A match 
in the buffer for an address that is being invalidated or 
updated signals the possibility of an incorrect speculation. A 
match for an address that is being replaced signifies that 
future coherence transactions for that address will not be 
sent to the processor. In either case, the speculated value 
for the load is assumed to be incorrect. Guaranteeing the 
constraints for release consistency can be done in a similar 
way to SC. The conventional way to provide RC is to delay a 
release access until its previous accesses complete and to 
delay accesses following an acquire until the acquire 
completes. Let us first consider delays for stores. 

The mechanism that provides precise interrupts by holding 
back store accesses in the store buffer is sufficient for 
guaranteeing that stores are delayed for the previous 
acquire. Although the mechanism described is stricter than 
what RC requires, the conservative implementation is required 
for providing precise interrupts. The same mechanism also 
guarantees that a release (which is simply a special store 
access) is delayed for previous load accesses. To guarantee a 
release is also delayed for previous store accesses, the store 
buffer delays the issue of the release operation until all 
previously issued stores are complete. In contrast to SC, 
however, ordinary stores are issued in a pipelined manner.

</quote>
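
To make the four fields and the two retirement conditions concrete, here is
a rough software sketch of the buffer as I read the quoted text. It is only
an illustration: the paper describes hardware, and every name below
(SpeculativeLoadBuffer, LoadEntry, insert, mark_done, store_completed,
retire, snoop) is made up for this sketch, as is the use of sequence
numbers to identify entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// All names are hypothetical; this models the quoted description only.
using Address = std::uint64_t;
using StoreTag = std::uint32_t;

struct LoadEntry {
    Address load_address;               // physical address of the load
    bool acq;                           // load is treated as an acquire (always true for SC)
    bool done;                          // set when the load is performed
    std::optional<StoreTag> store_tag;  // store the load must wait for; empty == null tag
};

class SpeculativeLoadBuffer {
public:
    // Enter a load retired from the reservation station.
    // Returns a sequence number that identifies the entry (an assumption of this sketch).
    std::size_t insert(Address addr, bool acq, std::optional<StoreTag> tag) {
        entries_.push_back(LoadEntry{addr, acq, false, tag});
        return head_seq_ + entries_.size() - 1;
    }

    // The load identified by seq has been performed.
    void mark_done(std::size_t seq) {
        if (seq >= head_seq_ && seq - head_seq_ < entries_.size())
            entries_[seq - head_seq_].done = true;
    }

    // A store completed: nullify its tag wherever it appears.
    void store_completed(StoreTag tag) {
        for (auto& e : entries_)
            if (e.store_tag == tag) e.store_tag.reset();
    }

    // Retire entries from the head in FIFO order. An entry retires only if
    // its store tag is null and, when acq is set, done is set as well.
    // Returns the number of entries retired.
    std::size_t retire() {
        std::size_t retired = 0;
        while (!entries_.empty()) {
            const LoadEntry& head = entries_.front();
            const bool can_retire =
                !head.store_tag.has_value() && (!head.acq || head.done);
            if (!can_retire) break;
            entries_.pop_front();
            ++head_seq_;
            ++retired;
        }
        return retired;
    }

    // Check a coherence transaction (invalidation, update, or replacement)
    // against the buffered load addresses. Reports the match closest to the
    // head, whose speculated value is then assumed to be incorrect.
    std::optional<std::size_t> snoop(Address addr) const {
        for (std::size_t i = 0; i < entries_.size(); ++i)
            if (entries_[i].load_address == addr) return head_seq_ + i;
        return std::nullopt;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::deque<LoadEntry> entries_;  // front() is the head of the buffer
    std::size_t head_seq_ = 0;       // sequence number of the entry at the head
};
```

What the processor does after snoop() reports a match (discarding and
reissuing the load and whatever was computed from it) is left out on
purpose, since the quoted passage does not cover it.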

regards,
alexander.



