[cpp-threads] Spurious failures of try_lock{_for}({rel_time}) vs. conditional fence semantics

Tue Dec 23 06:44:40 GMT 2008

On Mon, 22 Dec 2008, Alexander Terekhov wrote:

> N2800's try_lock():
>
> "Effects: Attempts to obtain ownership of the mutex for the calling
> thread without blocking. If ownership is not obtained, there is no
> effect and try_lock() immediately returns. An implementation may fail
> to obtain the lock even if it is not held by any other thread."
>
> N2800's try_lock_for(rel_time):
>
> "Effects: The function attempts to obtain ownership of the mutex
> within the time specified by rel_time. If the time specified by
> rel_time is less than or equal to 0, the function attempts to obtain
> ownership without blocking (as if by calling try_lock())."
>
> seem to contradict POSIX:
>
> http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_trylock.html
>
> "The pthread_mutex_trylock() function shall be equivalent to
> pthread_mutex_lock(), except that if the mutex object referenced by
> mutex is currently locked (by any thread, including the current
> thread), the call shall return immediately."
>
> http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_timedlock.html
>
> "Under no circumstance shall the function fail with a timeout if the
> mutex can be locked immediately."
I would argue that this is a mistake in the Posix standard.  Current 
implementations on weakly ordered architectures are arguably often 
incorrect with respect to this specification.  If you want to promise 
sequential consistency for data-race-free programs in the presence of 
pthread_mutex_timedlock(), you would need lock() to have release semantics 
as well.

The details are in my PPoPP 07 paper.  But the basic example is:

Thread1: x = 42; pthread_mutex_lock(&l);

Thread2:

while (pthread_mutex_timed_lock(&l, small) == 0) pthread_mutex_unlock(&l);
pthread_mutex_lock(&dummy); pthread_mutex_unlock(&dummy);
assert(x == 42);

(The initial loop waits for thread 1 to acquire the lock.  Yes, this
is evil code.)

This can result in non-sequentially-consistent behavior, and the 
assertion may fail, if the assignment and lock acquisition in thread1
are not ordered.  I believe they often are not.  Fixing this can result
in appreciable, and completely useless, overhead for most lock
acquisitions.

(The dummy lock acquisition isn't necessary to see that this is non-SC
behavior.  It does help to argue that the current behavior is actually
officially inconsistent with the current spec, which is unclear
about when SC behavior is intended.)

>
> Judging from the draft and also
>
> http://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf
>
> the primary intent behind introduction of spurious failures was to
> make unlock()->try_lock{_for}({rel_time}) "synchronize with"
> conditionally ("If try_lock() returns true, prior unlock() operations
> on the same object synchronize with (1.10) this operation.").
>
> But why insist on spurious failures for try_lock{_for}({rel_time})
> instead of simply saying that these operations provide the effect of
> atomic_thread_fence(memory_order_acquire) if mutex ownership is
> obtained, otherwise no fence effect
> (atomic_thread_fence(memory_order_relaxed) if you like)?
The problem isn't really primarily with the try_lock implementation.  It 
has to do with the cost of the lock operation.

Hans
>
> regards,
> alexander.
>
> --
> cpp-threads mailing list
> cpp-threads at decadentplace.org.uk
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads
>