[cpp-threads] Spurious failures of try_lock{_for}({rel_time}) vs. strong try_lock{_for}({rel_time})

Tue Dec 23 17:04:49 GMT 2008

(final observation, sorry for thinking out loud in iterations :-) )

On Tue, Dec 23, 2008 at 5:18 PM, Alexander Terekhov
<alexander.terekhov at gmail.com> wrote:
> (additional observation + correction)
>
> On Tue, Dec 23, 2008 at 4:16 PM, Alexander Terekhov
> <alexander.terekhov at gmail.com> wrote:
>> On Tue, Dec 23, 2008 at 7:44 AM, Hans Boehm <Hans.Boehm at hp.com> wrote:
>>>
>>> On Mon, 22 Dec 2008, Alexander Terekhov wrote:
>>>
>>>> N2800's try_lock():
>>>>
>>>> "Effects: Attempts to obtain ownership of the mutex for the calling
>>>> thread without blocking. If ownership is not obtained, there is no
>>>> effect and try_lock() immediately returns. An implementation may fail
>>>> to obtain the lock even if it is not held by any other thread."
>>>>
>>>> N2800's try_lock_for(rel_time):
>>>>
>>>> "Effects: The function attempts to obtain ownership of the mutex
>>>> within the time specified by rel_time. If the time specified by
>>>> rel_time is less than or equal to 0, the function attempts to obtain
>>>> ownership without blocking (as if by calling try_lock())."
>>>>
>>>> seem to contradict POSIX:
>>>>
>>>> http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_trylock.html
>>>>
>>>> "The pthread_mutex_trylock() function shall be equivalent to
>>>> pthread_mutex_lock(), except that if the mutex object referenced by
>>>> mutex is currently locked (by any thread, including the current
>>>> thread), the call shall return immediately."
>>>>
>>>> http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_timedlock.html
>>>>
>>>> "Under no circumstance shall the function fail with a timeout if the
>>>> mutex can be locked immediately."
>>>
>>> I would argue that this is a mistake in the Posix standard.  Current
>>
>> I disagree. Current POSIX XBD 4.10 Memory Synchronization states that
>>
>> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10
>>
>> "Unless explicitly stated otherwise, if one of the above functions
>> returns an error, it is unspecified whether the invocation causes
>> memory to be synchronized."
>>
>> So there isn't any release-acquire pairing involving failed
>> try{timed}lock() in POSIX. See my interpretation of XBD 4.10 in terms
>> of release-acquire pairings:
>>
>> http://www.decadentplace.org.uk/pipermail/cpp-threads/2005-April/000222.html
>>
>> I agree with N2800's note stating
>>
>> "[ Note: Since lock() does not synchronize with a failed subsequent
>> try_lock(), the visibility rules are weak enough that little would be
>> known about the state after a failure, even in the absence of spurious
>> failures. —end note ]"
>>
>> The situation is the same in POSIX.
>>
>>> implementations on weakly ordered architectures are arguably often
>>> incorrect with respect to this specification.  If you want to promise
>>> sequential consistency for data-race-free programs in the presence of
>>> pthread_mutex_timedlock(), you would need lock() to have release semantics
>>> as well.
>>
>> I disagree, see below.
>>
>>>
>>> The details are in my PPoPP 07 paper.  But the basic example is:
>>>
>>> Thread1: x = 42; pthread_mutex_lock(&l);
>
> Note that as long as POSIX doesn't insist on any release-acquire
> pairing with pthread_mutex_lock() on the release side (just as is
> the case for a mutex's lock() under the proposed C++ memory
> model... unless I'm just missing something), then with regard to
> ordering with respect to other threads, Thread1's code above is
> equivalent to
>
> Thread1: pthread_mutex_lock(&l); x = 42;
>
> so ...
>
>>>
>>> Thread2:
>>>
>>> while (pthread_mutex_timedlock(&l, small) == 0) pthread_mutex_unlock(&l);
>>> pthread_mutex_lock(&dummy); pthread_mutex_unlock(&dummy);
>>> assert(x == 42);
>
> ... even C++'s atomic_thread_fence(memory_order_seq_cst) in Thread2:
>
> while (pthread_mutex_timedlock(&l, small) == 0) pthread_mutex_unlock(&l);
> atomic_thread_fence(memory_order_seq_cst);
> assert(x == 42);
>
> won't help to make this example data-race-free.
>
>>
>> This example is not data-race-free. Consider that it doesn't prevent
>> reordering of x's load above dummy's unlock() and completion of failed
>> timedlock() after x's load resulting in a race on x.
>>
>>>
>>> (The initial loop waits for thread 1 to acquire the lock.  Yes, this
>>> is evil code.)

OTOH, consider:

Thread1:

x = 42;
atomic_thread_fence(memory_order_release);
pthread_mutex_lock(&l);

Thread2:

while (pthread_mutex_timedlock(&l, small) == 0) pthread_mutex_unlock(&l);
atomic_thread_fence(memory_order_acquire);
assert(x == 42);

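Under the proposed C++ memory model this is data-race-free (unless
I'm just missing and/or misunderstanding something): the release fence
orders the store to x before Thread1's lock acquisition, and the
acquire fence orders the load of x after Thread2's failed timedlock().

So what's the problem here? Wanna write evil code? No problem, just
add extra C++ fences here and there. ;-)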
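
For comparison, here is a minimal sketch of the same release-fence/acquire-fence pairing written against standard C++ atomics. It is not the mutex example above: the mutex is replaced by a relaxed std::atomic<bool> flag (the names `locked` and `run_once` are hypothetical), because the standard's fence-to-fence synchronization rule is stated in terms of atomic operations, and whether a failed timedlock can play that role is exactly what is under discussion here.

```cpp
#include <atomic>
#include <thread>

int x = 0;                        // plain, non-atomic data
std::atomic<bool> locked{false};  // hypothetical stand-in for "l is held"

// Thread1: publish x, then "acquire the lock" with a relaxed store.
void thread1() {
    x = 42;
    std::atomic_thread_fence(std::memory_order_release);
    locked.store(true, std::memory_order_relaxed);
}

// Thread2: spin until the "lock" is observed held, then read x.
int thread2() {
    while (!locked.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return x;  // ordered after the store to x by the fence pairing
}

// Runs both threads once and returns the value Thread2 read from x.
int run_once() {
    x = 0;
    locked.store(false);
    int seen = -1;
    std::thread t2([&] { seen = thread2(); });
    std::thread t1(thread1);
    t1.join();
    t2.join();
    return seen;
}
```

Calling run_once() from main (built with thread support, e.g. -pthread) returns 42: the relaxed load that reads true sits before the acquire fence, the relaxed store sits after the release fence, so the two fences synchronize and the read of x does not race with the write.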

>>>
>>> This can result in non-sequentially-consistent behavior, and the
>>> assertion may fail, if the assignment and lock acquisition in thread1
>>> are not ordered.  I believe they often are not.  Fixing this can result
>>> in appreciable, and completely useless, overhead for most lock
>>> acquisitions.
>>
>> Well, consider lock operations on semaphores
>>
>> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_15_00_01
>>
>> sem_wait() (and XSI IPC sema's locking as well).
>>
>> The example would be
>>
>> (Initially: semaphore value == 1)
>>
>> Thread1:
>>
>> x = 42;
>> sem_wait(&l); // decrement semaphore value
>>
>> Thread2:
>>
>> int v;
>> sem_getvalue(&l, &v);
>> if (l == 0) assert(x == 42);
>
> Typo. I meant: if (v == 0) assert(x == 42);
>
>>
>> I think that this shall be data-race-free due to release-acquire
>> pairing between sem_wait(&l) and sem_getvalue(&l, &v).
>>
>>> (The dummy lock acquisition isn't necessary to see that this is non-SC
>>> behavior.  It does help to argue that the current behavior is actually
>>> officially inconsistent with the current spec, which is unclear
>>> about when SC behavior is intended.)
>>
>> I agree that POSIX XBD 4.10 needs some fixing to provide clear listing
>> of all release-acquire pairings involving XBD.4.10 calls.
>>
>> http://www.decadentplace.org.uk/pipermail/cpp-threads/2005-April/000222.html
>

regards,
alexander.