[cpp-threads] N2800's C/C++MM and read-write locking

Fri Jan 9 02:56:45 GMT 2009

On Thu, Jan 08, 2009 at 07:42:43PM +0100, Alexander Terekhov wrote:
> On Thu, Jan 8, 2009 at 5:50 PM, Paul E. McKenney
> <paulmck at linux.vnet.ibm.com> wrote:
> > On Thu, Jan 08, 2009 at 04:01:28PM +0200, Peter Dimov wrote:
> >> >> I also often tend to think of "rwlocks" as shared/exclusive locks.  If
> >> >> you really want to use weaker fences for readers, I think you would
> >> >> need to split them into two variants: shared/exclusive and read/write.
> >> >
> >> > Care to elaborate?
> >>
> >> Something like:
> >>
> >> take shared lock
> >> try STM transaction in a lock-free manner
> >> unlock
> >>
> >> if failed
> >>     take write lock
> >>     use ordinary ops
> >>     unlock
> >>
> >> Here the lock is not used as a pure read lock since the transaction contains
> >> writes.
> >
> > I agree with Peter.  In fact, I have come across situations where one
> > read-acquires the lock to protect an update and write-acquires the lock
> > to protect a read.  What, you don't believe me?  ;-)  Read on, then...
> >
> > Here is such an example, which assumes per-thread variables that I
> > represent as arrays, as I cannot remember what, if anything, C++0x does
> > with thread-local storage.  I also invent a me() function that returns
> > a small integer uniquely identifying the current thread, and I steal
> > the Linux-kernel syntax for reader-writer locks.
> >
> > Yes, this sort of thing really has been used in production in real life.
> > Explanation after example.
> >
> >        long count[N_THREADS] = { 0 };
> >        DEFINE_RWLOCK(rwl);
> >
> >        void get_reference(void)
> >        {
> >                read_lock(&rwl);
> >                count[me()]++;
> >                read_unlock(&rwl);
> >        }
> >
> >        void put_reference(void)
> >        {
> >                read_lock(&rwl);
> >                count[me()]--;
> >                read_unlock(&rwl);
> >        }
> >
> >        /*
> >         * Caller must write-hold rwl.
> >         */
> >        int no_references(void)
> >        {
> >                int i;
> >
> >                for (i = 0; i < N_THREADS; i++)
> >                        if (count[i] != 0)
> >                                return 0;
> >                return 1;
> >        }
> >
> >        /* Sample use. */
> >
> >        write_lock(&rwl);
> >        if (no_references())
> >                perform_action_requiring_there_be_no_references();
> >        write_unlock(&rwl);
> >
> 
> What's the problem with insisting on adding extra fencing to
> get_reference() and put_reference() in the case of using rwl in
> (shared-)read(-only) mode?
> 
> With full set of locking modes
> 
>        shared-read-only
>        shared-write-only
>        shared-read-write
>        exclusive-read-only
>        exclusive-write-only
>        exclusive-read-write
> 
> your example would be
> 
>        long count[N_THREADS] = { 0 };
>        DEFINE_RWLOCK(rwl);
> 
>        void get_reference(void)
>        {
>                long r = count[me()];
>                shared_write_only_lock(&rwl);
>                count[me()] = r + 1;    // or simply ++count[me()] without r
>                shared_write_only_unlock(&rwl);
>        }
> 
>        void put_reference(void)
>        {
>                long r = count[me()];
>                shared_write_only_lock(&rwl);
>                count[me()] = r - 1;    // or simply --count[me()] without r
>                shared_write_only_unlock(&rwl);
>        }
> 
>        /*
>         * Caller must hold exclusive-read-only lock on rwl
>         */
>        int no_references(void)
>        {
>                int i;
> 
>                for (i = 0; i < N_THREADS; i++)
>                        if (count[i] != 0)
>                                return 0;
>                return 1;
>        }
> 
>        /* Sample use. */
> 
>        exclusive_read_only_lock(&rwl);
>        if (no_references())
>                perform_action_requiring_there_be_no_references();
>        exclusive_read_only_unlock(&rwl);
> 
> correct?
> 
> And Peter's example:
> 
>        take shared-read-write lock
>        try STM transaction in a lock-free manner
>        release shared-read-write lock
> 
>        if failed
>          take exclusive-read-write lock
>          use ordinary ops
>          release exclusive-read-write lock
> 
> right?

The real issue is that there are a number of reader-writer lock
implementations with wildly different properties.  Roughly speaking,
there are three categories:

1.	Readers incur cache misses and execute memory barriers acquiring
	and releasing the lock even in the absence of writers.  Writers
	incur a single cache miss per acquisition and release, and
	execute memory barriers as well.

2.	Readers execute memory barriers but do not incur cache misses
	acquiring and releasing the lock even in the absence of writers.
	Writers incur one cache miss per thread per acquisition and
	release, and execute memory barriers as well.

3.	Readers need no cache misses, memory barriers, or even atomic
	instructions if there are no writers.  Writers must translate the
	Dead Sea Scrolls into Urdu while skydiving on each acquisition
	and release.  OK, OK, writers just incur heavy overheads and
	large latencies, but you get the idea.
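
For concreteness, category 2 can be sketched roughly as follows.  This is
only an illustration under stated assumptions, not any particular
production implementation: each thread gets its own lock, a reader takes
only its own lock, and a writer takes all of them.  The names
(rwl_init(), read_lock_me(), write_lock_all(), struct per_thread_lock),
the fixed N_THREADS, the 64-byte cache-line size, and passing the thread
index as a parameter in place of me() are all invented for this sketch,
and POSIX mutexes merely stand in for whatever primitive a real
implementation would use.

```c
#include <assert.h>
#include <pthread.h>

#define N_THREADS  4	/* hypothetical fixed thread count */
#define CACHE_LINE 64	/* assumed cache-line size in bytes */

/* One lock per thread, each aligned to its own (assumed) cache line. */
struct per_thread_lock {
	_Alignas(CACHE_LINE) pthread_mutex_t mutex;
};

static struct per_thread_lock locks[N_THREADS];
static long count[N_THREADS];	/* count[i] is guarded by locks[i] */

void rwl_init(void)
{
	int i;

	for (i = 0; i < N_THREADS; i++)
		pthread_mutex_init(&locks[i].mutex, NULL);
}

/* Reader side: touch only this thread's lock, so no shared cache line. */
void read_lock_me(int me)
{
	assert(me >= 0 && me < N_THREADS);
	pthread_mutex_lock(&locks[me].mutex);
}

void read_unlock_me(int me)
{
	assert(me >= 0 && me < N_THREADS);
	pthread_mutex_unlock(&locks[me].mutex);
}

/* Writer side: acquire every thread's lock, always in index order. */
void write_lock_all(void)
{
	int i;

	for (i = 0; i < N_THREADS; i++)
		pthread_mutex_lock(&locks[i].mutex);
}

void write_unlock_all(void)
{
	int i;

	for (i = N_THREADS - 1; i >= 0; i--)
		pthread_mutex_unlock(&locks[i].mutex);
}

/* The reference-count example from earlier in the thread. */
void get_reference(int me)
{
	read_lock_me(me);
	count[me]++;
	read_unlock_me(me);
}

void put_reference(int me)
{
	read_lock_me(me);
	count[me]--;
	read_unlock_me(me);
}

/* Caller must hold all the locks, that is, must be the writer. */
int no_references(void)
{
	int i;

	for (i = 0; i < N_THREADS; i++)
		if (count[i] != 0)
			return 0;
	return 1;
}
```

A writer would bracket no_references() with write_lock_all() and
write_unlock_all(), in the same way that the quoted sample use brackets
it with write_lock() and write_unlock().  The reader still pays for the
memory barriers implied by the mutex, and the writer pays one lock
acquisition (and likely one cache miss) per thread, which is the
tradeoff described in item 2.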

I am not aware of a reader-writer lock API that deals well with this
variety, let alone with the potential slight weakening of memory
barriers for readers who really only read.

							Thanx, Paul