[cpp-threads] Comments on n2094

Fri Sep 29 23:50:13 BST 2006

Peter Dimov wrote:
> Some brief comments:
> 
> Ion Gaztañaga wrote:
> 
>> a) The first consequence is that implementing an upgradable_mutex
>> using the suggested two condition variables + a mutex is not correct,
>> since all operations (including unlock()) will need to lock the
>> internal mutex, and that internal mutex locking can throw. Not that
>> this worries me much, but it's a bit annoying.
> 
> This is only a problem if mutex::lock can succeed once and fail the 
> second time. This is usually not true for designs where failure comes 
> from allocating resources on first lock.

I'm taking a code snippet from the paper "Futexes Are Tricky" 
(http://people.redhat.com/drepper/futex.pdf), which sketches the 
philosophy behind the glibc/Linux mutex implementation. For example, a 
mutex can be built with just an integer member:

class mutex
{
    public:

    mutex()
       :   val(0)
    {}

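    // val: 0 = unlocked, 1 = locked with no waiters,
    //      2 = locked, possibly with waiters
    void lock()
    {
       int c;
       // Fast path: 0 -> 1 without any kernel call
       if((c = cmpxchg(val, 0, 1)) != 0) {
          // Contended: mark the mutex as having waiters
          if(c != 2)
             c = xchg(val, 2);
          while(c != 0){
             // Sleep in the kernel only while val is still 2
             futex_wait(&val, 2);
             c = xchg(val, 2);
          }
       }
    }

    void unlock()
    {
       // atomic_dec returns the previous value
       if(atomic_dec(val) != 1){
          // There may be waiters: release and wake one
          val = 0;
          futex_wake(&val, 1);
       }
    }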

    private:

    int val;
};

The key is that the mutex does not store any resource handle. If there 
is no contention, no kernel call is made. A kernel call is made only 
when there is contention, and that's when the race is resolved and 
resources are acquired. We might think of adding an atomic flag to the 
mutex, so that resources are acquired in the first lock() call and the 
flag then records that they are already there. However, this rules out 
using the mutex as a process-shared mutex, because once the file where 
the mutex is constructed is remapped the information is no longer 
correct. On the other hand, the mutex shown above can be used as a 
process-shared mutex, but resources could be acquired at any moment: 
whenever there is contention and we need to queue the thread in the 
kernel.
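As a rough sketch of how the paper's pseudo-primitives might map onto 
real calls, here is the same three-state mutex written against the raw 
Linux futex system call and the GCC/Clang __atomic builtins. This is 
Linux-only and is not the glibc implementation; the names futex_mutex 
and contended_increments are mine, and the memory orderings are my own 
assumption.

```cpp
#include <cassert>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <thread>
#include <vector>

// Thin wrappers over the raw futex system call (Linux only).
static void futex_wait(int* addr, int expected) {
    // Sleeps only if *addr still equals expected. Early returns are fine
    // because the caller re-checks the state in a loop.
    syscall(SYS_futex, addr, FUTEX_WAIT, expected, nullptr, nullptr, 0);
}
static void futex_wake(int* addr, int count) {
    syscall(SYS_futex, addr, FUTEX_WAKE, count, nullptr, nullptr, 0);
}

// Hypothetical name. val: 0 = unlocked, 1 = locked, 2 = locked with waiters.
class futex_mutex {
public:
    void lock() {
        int c = 0;
        // Fast path: 0 -> 1, no kernel call. On failure c holds the old value.
        if (__atomic_compare_exchange_n(&val, &c, 1, false,
                                        __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
            return;
        if (c != 2)
            c = __atomic_exchange_n(&val, 2, __ATOMIC_ACQUIRE);
        while (c != 0) {
            futex_wait(&val, 2);
            c = __atomic_exchange_n(&val, 2, __ATOMIC_ACQUIRE);
        }
    }
    void unlock() {
        // fetch_sub returns the previous value; 2 means there may be waiters.
        if (__atomic_fetch_sub(&val, 1, __ATOMIC_RELEASE) != 1) {
            __atomic_store_n(&val, 0, __ATOMIC_RELEASE);
            futex_wake(&val, 1);
        }
    }
private:
    int val = 0;  // the only state: no handle, nothing to allocate
};

// Usage helper (hypothetical): several threads bump a shared counter.
inline int contended_increments(int threads, int per_thread) {
    futex_mutex m;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&m, &counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                m.lock();
                ++counter;
                m.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Note that the object is a plain int with no constructor-time setup, 
which is what makes it usable from memory mapped by several processes.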

I've just realized that a thread can only be blocked on a single kernel 
synchronization object at a time, so the kernel can allocate, when the 
thread is created, a single queue resource per thread that is 
compatible with all synchronization objects, and use it whenever the 
thread needs to block on a mutex/semaphore/condition. The resource goes 
back to that thread's cache afterwards and it's always available, so we 
might avoid any resource acquisition error, even when lazy 
initialization is used. The same resource can be transferred from a 
condition variable kernel queue to a mutex queue when the condition is 
signaled and the thread relocks the external mutex.
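To make the idea concrete, here is a small user-space analogy (not 
kernel code, and not how any particular kernel does it): each thread 
owns one pre-allocated node, and the wait queues are intrusive lists 
that only link existing nodes. The names wait_node, wait_queue and 
transfer_one are hypothetical, and the queue is deliberately not 
thread-safe; a real implementation would protect it with its own lock.

```cpp
#include <cassert>

// Hypothetical per-thread queue node: one per thread, created with the thread.
struct wait_node {
    wait_node* next = nullptr;
    int thread_id = 0;
};

// Intrusive FIFO queue: it links existing nodes and never allocates,
// so enqueueing a blocked thread has no resource failure mode.
class wait_queue {
public:
    void push(wait_node& n) noexcept {
        n.next = nullptr;
        if (tail_) tail_->next = &n; else head_ = &n;
        tail_ = &n;
    }
    wait_node* pop() noexcept {
        wait_node* n = head_;
        if (n) {
            head_ = n->next;
            if (!head_) tail_ = nullptr;
            n->next = nullptr;
        }
        return n;
    }
    bool empty() const noexcept { return head_ == nullptr; }
private:
    wait_node* head_ = nullptr;
    wait_node* tail_ = nullptr;
};

// On signal, move one waiter from the condition queue to the mutex queue,
// reusing the same node instead of acquiring a new resource.
inline bool transfer_one(wait_queue& from, wait_queue& to) noexcept {
    wait_node* n = from.pop();
    if (!n) return false;
    to.push(*n);
    return true;
}
```

Since every operation is noexcept and allocation-free, blocking and 
the condition-to-mutex handover cannot fail for lack of resources in 
this model.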

This still does not solve the EDEADLK problem, though. But without 
EDEADLK, maybe a no-throw mutex lock can be guaranteed. Looking at the 
Linux futex system call (http://ds9a.nl/futex-manpages/futex2.html):

"All operations may return -EINVAL in case of unaligned futexes, as well 
as -EFAULT, -EPERM, -EACCESS when passing pointers to bad or 
inaccessible memory."

So it seems that the system call can't fail due to resource constraints. 
Assuming we always pass aligned integers to the system call when using 
mutexes, we can ignore the remaining errors or just abort if one of 
them happens. A no-throw mutex lock simplifies things a lot.
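The "just abort" policy could look roughly like the wrapper below. It 
is a sketch under my own assumptions: the name futex_wait_or_abort is 
hypothetical, and I treat EAGAIN (the value no longer matched) and 
EINTR (interrupted by a signal) as benign because the caller re-checks 
the mutex state in a loop anyway.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Returns true if the kernel reported a wake-up, false if the call
// returned early for a benign reason. Any other error means we passed a
// bad address or an unaligned word, which is a programming error, so we
// abort instead of throwing.
inline bool futex_wait_or_abort(int* addr, int expected) noexcept {
    long rc = syscall(SYS_futex, addr, FUTEX_WAIT, expected,
                      nullptr, nullptr, 0);
    if (rc == 0)
        return true;
    if (errno == EAGAIN || errno == EINTR)
        return false;   // caller loops and re-reads the state
    std::abort();       // EINVAL, EFAULT, ...: not recoverable here
}
```

With this in place, lock() has no path that reports failure to the 
caller, so it can be declared no-throw.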

>> 0.1 Condition variables
>> ------------------------
>>
>> Condition variables and exceptions are a very tricky combination,
>> IMHO. In this proposal, the situation is more tricky, since condition
>> variables can work with user-defined mutexes (which is a great
>> feature). If mutex locking can throw, that means that condition wait
>> can also throw, since it must lock the mutex after it's been
>> notified.
> 
> condition::wait is a cancelation point and it can fail with EINVAL or 
> EPERM, so it doesn't matter whether locking a mutex can throw; 
> condition::wait can throw.
> 
>> What is the user supposed to do if a condition throws? Can the user
>> recover resources and try it again? Can we guarantee that the mutex
>> will be always relocked?
> 
> POSIX says yes. The mutex is in a locked state when pthread_cond_wait 
> "throws" a cancelation exception or returns an error.

Agreed. The key is that N2094 outlines a condition that is compatible 
with user-defined mutexes, so we need to clearly state a protocol that 
guarantees a no-throw mutex relock, or just require a no-throw lock() 
function.
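One way to express the "require a no-throw lock()" option, sketched 
with a compile-time check (this uses noexcept and static_assert, which 
are not in N2094; relock_on_exit, wait_with and toy_mutex are 
hypothetical names of mine):

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Unlocks the user mutex for the duration of the wait and relocks it on
// every exit path. The relock runs in a destructor, so it must not throw;
// the static_assert rejects mutex types whose lock() might.
template <class Mutex>
class relock_on_exit {
    static_assert(noexcept(std::declval<Mutex&>().lock()),
                  "condition wait requires a no-throw Mutex::lock()");
public:
    explicit relock_on_exit(Mutex& m) : m_(m) { m_.unlock(); }
    ~relock_on_exit() { m_.lock(); }
    relock_on_exit(const relock_on_exit&) = delete;
    relock_on_exit& operator=(const relock_on_exit&) = delete;
private:
    Mutex& m_;
};

// Stand-in for condition::wait: block() represents the part that sleeps
// and may throw (for example on cancellation).
template <class Mutex, class Block>
void wait_with(Mutex& m, Block block) {
    relock_on_exit<Mutex> guard(m);
    block();
}

// Toy user-defined mutex that only records its state, for illustration.
struct toy_mutex {
    bool locked = false;
    void lock() noexcept { locked = true; }
    void unlock() noexcept { locked = false; }
};
```

If the blocking part throws, the exception propagates to the caller 
with the mutex already relocked, which matches what POSIX promises for 
pthread_cond_wait.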

Regards,

Ion