[cpp-threads] Yet another visibility question

Chris Thomasson Cristom at Comcast.Net
Fri Dec 29 07:23:18 GMT 2006


> > >	a = 1;
> > >	read_barrier_depends();
> > >	b = 2;
> >
> > > it might or might not force any ordering.  Currently it does on Alpha,
> > > but not necessarily on any other platform (x86's store ordering could
> > > be defeated by compiler optimizations, for example).
> >
> > Indeed. Your RCU patent applications make heavy use of atomic loads with
> > data-dependency hints. Just like this:
[...]

> ;-)  Looks interesting! 

:^) A simple and straightforward lock-free reader pattern...
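
Something along these lines, just a rough sketch with made-up names (node,
g_head and do_whatever are mine for illustration), spelled out with an
atomic<T>-style interface so the ordering is explicit:

#include <atomic>

struct node {
    int payload;
    node* next;
};

extern std::atomic<node*> g_head;   // stack head, published by the producer

void do_whatever(node const* n);    // read-only work on a node

void reader_iterate()
{
    // One acquire load of the head; on a machine that honors
    // data-dependency ordering the dependent loads alone would do,
    // and only an Alpha-style machine needs a real #LoadLoad here.
    for (node* n = g_head.load(std::memory_order_acquire); n != 0;
         n = n->next) {
        do_whatever(n);   // readers never push, pop, or modify the node
    }
}

With real data-dependency ordering the acquire could be weakened to nothing
but the dependent loads themselves; the reader never writes to the shared
structure at all.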




> I am assuming that the "release barrier" called
> out in the update is at least a store-store barrier (smp_wmb() in Linux
> kernel parlance).  

Yeah... A (membar #StoreStore) works just fine for the example... However,
as you know, there are times when you have to use a (membar #LoadStore |
#StoreStore): whenever the algorithm requires that any prior loads from the
node be ordered, along with the stores, before the node is made visible to
any potential readers.
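
The push side of such a stack could look roughly like this (an illustrative
sketch only, same made-up node/g_head names as in the reader sketch above);
the release on the CAS is exactly that #LoadStore | #StoreStore:

#include <atomic>

struct node {
    int payload;
    node* next;
};

extern std::atomic<node*> g_head;   // shared stack head, defined elsewhere

void push(node* n, int value)
{
    n->payload = value;   // initialize the node *before* it becomes reachable

    node* old = g_head.load(std::memory_order_relaxed);
    do {
        n->next = old;    // link to the current head
    } while (!g_head.compare_exchange_weak(old, n,
                 std::memory_order_release,    // #LoadStore | #StoreStore
                 std::memory_order_relaxed));
}

On SPARC that release boils down to a membar #LoadStore | #StoreStore issued
in front of the store/CAS.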




> The readers do not seem to be modifying the stack (no
> pushing or popping from what I can see), so as long as the do_whatever()
> function can safely run concurrently from different threads on the same
> stack element, it should be safe.

Yes. do_whatever() would never modify the node... A read-only thread that
makes frequent iterations through shared data structures controlled by a
lock-free reader pattern must not be executing any explicit memory barriers;
none at all... Well, on an architecture whose loads do not honor
data-dependency ordering implicitly (the Alpha being the classic example),
you do have to make the explicit call to the damn #LoadLoad... That could
damage RCU performance... Luckily, you can amortize the cost of the explicit
#LoadLoad memory barrier by running batches of nodes through your lock-free
reader algorithm(s): one barrier covering a whole batch amortizes its cost
over the depth of the batch (see the rough sketch after the links below):

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e7b5fe64cb35d64d

http://groups.google.com/group/comp.lang.java.programmer/browse_frm/thread/e7b5fe64cb35d64d
(interesting thread...)

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/1ff48ea953a3ec0e
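
Not vZOOM itself, just a rough illustration of the batching idea with
made-up names: say a reader has to sweep the heads of a whole table of
these collections; it can gather all of the head pointers with cheap
relaxed loads and then pay for one #LoadLoad-style fence covering the
whole batch, instead of one barrier per load:

#include <atomic>
#include <cstddef>

struct node { int payload; node* next; };

enum { TABLE_SIZE = 64 };
extern std::atomic<node*> g_heads[TABLE_SIZE];  // each head is published
                                                // with release semantics

void do_whatever(node const* n);                // read-only work

void reader_sweep()
{
    node* batch[TABLE_SIZE];

    // 1) Gather the whole batch with relaxed loads; no barriers yet.
    for (std::size_t i = 0; i < TABLE_SIZE; ++i)
        batch[i] = g_heads[i].load(std::memory_order_relaxed);

    // 2) One acquire fence covers every load above: each relaxed load that
    //    observed a release store is now ordered before the dereferences
    //    below, so the barrier cost is amortized over the batch depth.
    std::atomic_thread_fence(std::memory_order_acquire);

    // 3) Walk the nodes.
    for (std::size_t i = 0; i < TABLE_SIZE; ++i)
        for (node* n = batch[i]; n != 0; n = n->next)
            do_whatever(n);
}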




> Just out of curiosity, what do you use this for?

Basically, I make extremely heavy use of these types of algorithms any time
one of my applications needs read-mostly shared linked data structures...
For instance, I use vZOOM to implement an algorithm that lets a couple of
database applications achieve virtually zero-overhead lock-free fast-paths...
The database queries hit the shared memory with vZOOM before I even think
about going for the disk... I also use it to manage a couple of
experimental, heavily multi-threaded GUI engines...




[...]
> > Luckily, 
> > now that the committee is getting input from you, I think the odds are 
> > good that it will.

> Thank you for the vote of confidence!  

Sure. I think you know what the issues are from all of your RCU experience.




> I guess we will see how the
> discussion goes.  These sorts of algorithms are similar to recursion
> in the sense that some people take right to them while others suffer
> extreme cognitive dissonance when first faced with them.  So it should
> be interesting!

Indeed!

;^)




> > P.S...
> > Paul... I think it would be nice if you could join us over at
> > comp.programming.threads. We noticed a patent application on a lock-free
> > proxy garbage collector you and Michael (SMR) invented... We would
> > like to discuss your implementation and possibly introduce you to a
> > couple of other algorithms... (e.g., Joe Seigh's RCU+SMR and my vZOOM)

> I have exchanged emails with Joe Seigh on RCU+SMR

Ahh... I think I remember him mentioning that. Okay...




> but have not yet come across vZOOM.  

There's a chance you might like it...

;^)



> Would you have a URL for a good starting point?

Here are some links to some brief information...


On comp.programming.threads:

http://groups.google.com/group/comp.programming.threads/search?group=comp.programming.threads&q=vzoom&qt_g=1


http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce


http://groups.google.com/group/comp.programming.threads/msg/523aa1e77990215a


http://groups.google.com/group/comp.programming.threads/msg/16b31df060411bcd


http://groups.google.com/group/comp.programming.threads/msg/e59c2dd1afb8e0a3
(memory-barrier amortization over epochs)




Here are some very introductory papers I created:

http://appcore.home.comcast.net/vzoom/ 
(old site... not sure if it's active)

http://appcore.home.comcast.net/vzoom/round-1.pdf 

http://appcore.home.comcast.net/vzoom/round-2.pdf 

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/1216861ac568b2be


Basically, vZOOM allows an unbounded, arbitrary number of references to
persist across multiple successive quiescent points... The references can be
acquired with virtually zero overhead... This is simply not allowed by RCU,
and SMR was not designed to let you grab an arbitrary number of hazard
pointers...

I even created and claimed a method for implementing vZOOM with POSIX
mutexes... I know this sounds crazy, but yes, you can use POSIX mutexes to
implement vZOOM, and RCU... Super portable...
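
Just to give the general flavor (this is NOT the claimed vZOOM method, only
a rough sketch with made-up names of how per-thread POSIX mutexes can stand
in for the quiescent-state machinery): each reader holds its own, normally
uncontended mutex for as long as it may hold references and drops it only at
its quiescent points; the reclaimer cycles through every reader's mutex
before freeing anything, and the lock operations supply all of the memory
ordering.

#include <pthread.h>

enum { MAX_READERS = 32 };

static pthread_mutex_t g_reader_lock[MAX_READERS];

void init_reader_locks(void)
{
    for (int i = 0; i < MAX_READERS; ++i)
        pthread_mutex_init(&g_reader_lock[i], 0);
}

// Reader side: lock once on startup, then briefly drop and retake the lock
// at each quiescent point; the fast path is an uncontended lock cycle.
void reader_enter(int self)
{
    pthread_mutex_lock(&g_reader_lock[self]);
}

void reader_quiescent_point(int self)
{
    pthread_mutex_unlock(&g_reader_lock[self]);  // "I hold no references"
    pthread_mutex_lock(&g_reader_lock[self]);    // back into the read side
}

// Reclaimer side: after unlinking nodes, cycle through every reader's lock.
// Once this returns, every reader has passed a quiescent point since the
// unlink, so the unlinked nodes can safely be freed.
void synchronize_readers(void)
{
    for (int i = 0; i < MAX_READERS; ++i) {
        pthread_mutex_lock(&g_reader_lock[i]);
        pthread_mutex_unlock(&g_reader_lock[i]);
    }
}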


P.S.

I hope the links I posted make it through to you okay... This posting
process seems to mangle the hell out of them. If they get worked over real
good, just go to Google Groups, head to comp.programming.threads, and search
for vzoom.


Oh yeah, I almost forgot! Paul, I also have a patent application on what
is, in my opinion, the fastest multi-threaded user-space memory allocator
algorithm currently out there. Here is the link:

http://groups.google.com/group/comp.arch/browse_frm/thread/24c40d42a04ee855

Well, do you agree? This memory allocator simply rocks!

;^)


Thank you...



--
http://appcore.home.comcast.net/
(portable lock-free data-structures)

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/205dcaed77941352
(appcore is getting noticed?)




