When destroying a read/write lock, Helgrind reported the following error:
pthread_rwlock_destroy of a locked mutex
Leaving aside the fact that it is a lock I am destroying, not a mutex (though the library’s implementation may rely on mutexes), the error is probably accurate, especially since a subsequent attempt to release the lock is flagged by Helgrind as releasing an invalid lock.
I understand that it is probably an error to destroy a lock that is still being held by another thread. (Locks are typically destroyed together with the resource they protect, and if the lock is being held, it means the resource is still in use and should not be destroyed).
Now my questions:
Is it an error to destroy a lock that is still being held by the current thread?
If so, what is the reasoning behind it?
If so, how can I prevent other threads from acquiring the lock and messing with the resource when I am about to destroy both?
Is it an error to destroy a lock that is still being held by the
current thread?
Yes, it is. POSIX says:
Results are undefined if pthread_rwlock_destroy() is called when any
thread holds rwlock.
This is clear - "any thread" includes the current thread.
The reasoning would be along these lines: Either another thread can be racing to take the lock with the current thread's pthread_rwlock_destroy(), or it cannot. If it can be, then the program is already erroneous because attempting to lock an uninitialised lock is undefined; if it cannot, then it suffices for the current thread to unlock the lock first, then destroy it.
If so, how can I prevent other threads from acquiring the lock and
messing with the resource when I am about to destroy both?
The reasoning above hints at the answer to this - to destroy the object, including the lock within it, you must first make it unreachable for any other thread. You would do this by removing all references to it from other data structures, which likely involves taking and releasing other locks, but once you have the object itself isolated, you can safely unlock it because your thread must then hold the only remaining reference.
When a lock (or try_lock) function of a mutex discovers that the mutex is already locked (by another thread (presumably)), could it try to determine whether the owning thread is (or was extremely recently) running on another core?
Knowing whether the owner is running gives an indication of a possible reason the thread still holds the lock (why it couldn't unlock it): if the thread owning the mutex is either "runnable" (waiting for an available core and time slice) or "sleeping" or waiting for I/O... then clearly spinning on the mutex is not useful.
I understand that non-recursive, non-"safe", non-priority-inheritance mutexes (which I call "anonymous" mutexes: mutexes that in practice can be unlocked by another thread, as they are just common semaphores) carry as little information as possible, but surely that could be determined for those "owned" mutexes that know the identity of the locking thread.
Another, more useful piece of information would be how long the mutex has been locked, but that would potentially require adding a field to the mutex object.
Is that ever done?
I would like to know how many threads are waiting on a lock so I would be able to destroy it safely.
The problem is that I can't destroy the lock when someone holds it or someone is waiting on it.
My program can make sure that no new requests are made to acquire the lock, but how can I know when all the threads that waited on it are done with it?
I thought about a condition variable, but I suspect it would create problems.
dlv, could you add a code snippet to your description?
I assume you should be using condition variables.
Each thread will block in pthread_cond_wait() until the other thread signals it to wake up. This will not cause a deadlock. It can easily be extended to many threads, by allocating one int, pthread_cond_t and pthread_mutex_t per thread.
pthread_cond_wait() blocks the calling thread until the specified condition is signalled. This routine should be called while mutex is locked, and it will automatically release the mutex while it waits. After signal is received and thread is awakened, mutex will be automatically locked for use by the thread. The programmer is then responsible for unlocking mutex when the thread is finished with it.
The pthread_cond_signal() routine is used to signal (or wake up) another thread which is waiting on the condition variable. It should be called after mutex is locked, and must unlock mutex in order for pthread_cond_wait() routine to complete.
The pthread_cond_broadcast() routine should be used instead of pthread_cond_signal() if more than one thread is in a blocking wait state.
It is a logical error to call pthread_cond_signal() before calling pthread_cond_wait(): a signal sent while no thread is waiting is simply lost.
Proper locking and unlocking of the associated mutex variable is essential when using these routines. For example:
Failing to lock the mutex before calling pthread_cond_wait() may cause it NOT to block.
Failing to unlock the mutex after calling pthread_cond_signal() may not allow a matching pthread_cond_wait() routine to complete (it will remain blocked).
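As an illustration of the rules above, here is a minimal sketch of one thread waiting for another. The `ready` flag, `payload`, and the function names are made up for the example. Testing the flag in a loop around pthread_cond_wait() means the waiter behaves correctly even if the signal is sent before it starts waiting, or if it wakes up spuriously.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;      /* predicate, protected by mtx */
static int payload = 0;    /* data handed over, protected by mtx */

/* Waits until the predicate is true, then copies the payload out. */
static void *waiter(void *arg)
{
    int *result = arg;

    pthread_mutex_lock(&mtx);            /* lock BEFORE waiting */
    while (!ready)                       /* re-check after every wakeup */
        pthread_cond_wait(&cond, &mtx);  /* releases mtx while blocked */
    *result = payload;                   /* mtx is held again here */
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Hands `value` to a waiter thread and returns what the waiter saw,
 * or -1 if the thread could not be created. */
int run_handoff(int value)
{
    pthread_t t;
    int result = -1;

    pthread_mutex_lock(&mtx);
    ready = 0;
    pthread_mutex_unlock(&mtx);

    if (pthread_create(&t, NULL, waiter, &result) != 0)
        return -1;

    pthread_mutex_lock(&mtx);
    payload = value;
    ready = 1;                           /* change the predicate under the lock */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mtx);          /* waiter cannot return until this */

    int rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
    return result;
}
```

With several waiters, pthread_cond_broadcast() would replace pthread_cond_signal() in the same place.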
If threads that can use the mutex still exist or might be created in the future then don't delete it.
You do know and are tracking what threads are created, right?
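If you do track your threads, the usual way to know that nobody holds or waits on the mutex any more is to join every thread that could use it, and only then destroy it. A minimal sketch, with made-up names (`struct shared`, `worker`, `run_workers`) and an arbitrary workload:

```c
#include <assert.h>
#include <pthread.h>

#define NUM_WORKERS 4
#define INCREMENTS  1000

/* Hypothetical shared state together with the mutex that protects it. */
struct shared {
    pthread_mutex_t lock;
    long counter;
};

static void *worker(void *arg)
{
    struct shared *s = arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&s->lock);
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Starts the workers, waits for all of them, then destroys the mutex.
 * Returns the final counter value. */
long run_workers(void)
{
    struct shared s;
    pthread_t tids[NUM_WORKERS];
    int started = 0;

    pthread_mutex_init(&s.lock, NULL);
    s.counter = 0;

    /* Remember exactly which threads were created. */
    for (int i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tids[started], NULL, worker, &s) != 0)
            break;
        started++;
    }

    /* After the joins, no thread can hold or wait on the mutex. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    int rc = pthread_mutex_destroy(&s.lock);
    assert(rc == 0);
    (void)rc;
    return s.counter;
}
```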
If, for some reason, you cannot keep track of the threads using a resource, your only way out is to leak the resource. It can never be safely deleted because you never know when you are done using it.
Say you had a counter that counted the threads using a mutex. That counter would need its own mutex. Then how do you decide when to delete that one?
That way of thinking is the road that leads to hell. You could do what you want with condition variables, but the result would be an extremely weak design.
Assuming you managed to create such a monster, it would basically allow you to kill "safely" any other thread regardless of its internal state. Except for a quick and dirty panic exit (in case of some internal software error), this is the worst possible way of solving synchronization issues.
A design relying on such tricks would have to create implicit synchronizations between tasks to make sure the terminations occur in the proper order. A lot of software is designed that way, and most of it allows mediocre programmers to make a living by maintaining the pile of crap they created in the first place.
Task termination should be an issue solved at global design level, not by a toolbox of wonky objects that allow you to twist synchronization any odd way.
We know that multi-threaded code is prone to deadlocks if a thread acquires a mutex lock and, before it gets a chance to release it, is suspended by the main thread or pre-empted by the scheduler.
I am a beginner with the pthread library, so please bear with me if my query/proposed solution below is unfeasible or outright wrong.
int main(void)
{
    thread_create(T1, NULL, thr_function, NULL);
    suspend_thread(T1);
    acquire_lock(Lock1); // <--- possible deadlock here if thr_function acquired Lock1 before main suspended T1, because T1 can no longer release it
    // do something further
    return 0;
}

void *thr_function(void *val)
{
    // do something
    acquire_lock(Lock1);
    // do some more things
    // do some more things
    release_lock(Lock1);
    return NULL;
}
In the pseudocode above, can't the thread runtime and the compiler work together to make sure that a thread which is suspended or pre-empted while holding a mutex lock first executes some 'cleanup code' that releases all the locks it holds? The compiler/linker can identify the places inside a thread function that acquire and release a lock. When a thread is suspended between those two places (i.e. after the acquire but before the release), execution in the thread function would jump, via some kind of 'goto label;' inserted by the runtime, to a label where the thread releases the lock, and only then would the thread be blocked or the context switch happen. [I know that if a thread acquires more than one lock, it might get messy to jump across those points to release them all...]
But the basic question is: can the thread function not release the mutexes and semaphores it has acquired before it gets blocked or otherwise leaves the running state?
No. The reason a thread holds a lock is so that it can make data temporarily inconsistent or see a consistent view of that data itself. If some scheme were to automatically release that lock before the thread made the data consistent again, other threads would acquire the lock, see the inconsistent data, and fail. Or when that thread was resumed, it would either not have the lock or have the lock and see inconsistent data itself. This is why you can only reliably suspend a thread with that thread's cooperation.
Consider this logic to add an object to a linked list protected by a mutex:
1. Acquire the lock protecting a linked list.
2. Modify the list's head pointer.
3. Modify the object's next pointer.
4. Release the lock.
Now imagine if something were to suspend the thread between steps 2 and 3. If the lock were released, other threads would see the list's head pointer pointing to an object that had not been linked to the list. And when the thread resumed, it might set the object's next pointer to the wrong value because the list had changed.
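For reference, the four steps could look like this in C. This is only a sketch: `struct node`, `list_push`, and `list_length` are made-up names, and the head pointer is deliberately updated before the next pointer to match the order described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *list_head = NULL;

/* Adds a value at the front of the list. Returns 0 on success, -1 on failure. */
int list_push(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;

    int rc = pthread_mutex_lock(&list_lock);   /* step 1 */
    assert(rc == 0);
    (void)rc;
    struct node *old_head = list_head;
    list_head = n;          /* step 2: head points at n, but n->next is not
                             * set yet, so the list is inconsistent here */
    n->next = old_head;     /* step 3: the list is consistent again */
    pthread_mutex_unlock(&list_lock);          /* step 4 */
    return 0;
}

/* Counts the nodes. It takes the same lock, so it can never observe the
 * window between steps 2 and 3. */
int list_length(void)
{
    int count = 0;
    pthread_mutex_lock(&list_lock);
    for (struct node *n = list_head; n != NULL; n = n->next)
        count++;
    pthread_mutex_unlock(&list_lock);
    return count;
}
```

If the lock were dropped between steps 2 and 3, list_length() in another thread would follow an uninitialised next pointer.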
The general consensus is that suspending threads is so evil that even a feeling that you might want to suspend a thread suggests an incorrect application design. There is practically no reason a properly-designed application would ever want to suspend a thread. (If you didn't want that thread to continue doing the work it was doing, why did you code it to continue doing that work in the first place?)
By the way, scheduler pre-emption is not a problem. Eventually, the thread will be scheduled again and release the lock. So long as there are other threads that can make forward progress, no harm is done. And if there are no other threads that can make forward progress, the only thing the system can do is schedule the thread that was pre-empted.
One way to avoid this kind of deadlock is to have a global, mutex-protected variable should_stop_thread which eventually gets set to true by the master thread.
The child thread checks the variable regularly and terminates in a controlled manner if it is true. "Controlled" in this sense means that all data (pointers) are valid (again) and mutex locks are released.
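A minimal sketch of that idea follows. Apart from should_stop_thread, all names (`stop_lock`, `data_lock`, `child`, `run_and_stop`) are invented for the example, and the "work" is just a counter. The child only tests the flag at a point where it holds no other lock, so it always terminates in a clean state.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t stop_lock = PTHREAD_MUTEX_INITIALIZER;
static int should_stop_thread = 0;     /* protected by stop_lock */

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static long iterations = 0;            /* protected by data_lock */

static int stop_requested(void)
{
    pthread_mutex_lock(&stop_lock);
    int stop = should_stop_thread;
    pthread_mutex_unlock(&stop_lock);
    return stop;
}

static void *child(void *arg)
{
    (void)arg;
    /* The flag is checked between units of work, never while data_lock
     * is held, so the thread exits with all locks released. */
    while (!stop_requested()) {
        pthread_mutex_lock(&data_lock);
        iterations++;
        pthread_mutex_unlock(&data_lock);
    }
    return NULL;
}

/* Starts the child, asks it to stop, and waits for it. Returns 0 if
 * data_lock is free afterwards, otherwise a non-zero value. */
int run_and_stop(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, child, NULL) != 0)
        return -1;

    pthread_mutex_lock(&stop_lock);
    should_stop_thread = 1;            /* request, not force, termination */
    pthread_mutex_unlock(&stop_lock);

    int rc = pthread_join(t, NULL);
    assert(rc == 0);

    rc = pthread_mutex_trylock(&data_lock);   /* succeeds: nobody holds it */
    if (rc == 0)
        pthread_mutex_unlock(&data_lock);
    return rc;
}
```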
https://stackoverflow.com/a/189778/462608
In the case of non-recursive mutexes, there is no sense of ownership and any thread can usually release the mutex no matter which thread originally took the mutex.
What I have studied about Mutexes is that a thread acquires it when it wants to do something to a shared object, and when it completes whatever it wanted to do, it releases the lock. And meanwhile other threads can either sleep or spinlock.
What does the above quote mean by "any thread can usually release the mutex no matter which thread originally took the mutex."?
What's the point that I am missing?
This may differ between different thread implementations, but since you've tagged your question with "pthreads" I assume you're interested in pthread mutexes (and not vxworks mutexes, which is apparently what the link you provide describes).
So in pthreads the rule is that the same thread that locks a mutex must unlock it. You can set attributes on the mutex object whether you want an error to be generated if this rule is violated, or whether the result is undefined behavior (say, for debug vs. release builds). See the manpage for the pthread_mutexattr_settype function for details.
The specification for unlocking a pthread_mutex_t by a thread that wasn't the one that locked it depends on the mutex type (at best it returns an error):
Attempting to unlock a mutex on a thread that didn't lock it is undefined behavior for the following mutex types:
PTHREAD_MUTEX_NORMAL
PTHREAD_MUTEX_DEFAULT
Attempting to unlock a mutex on a thread that didn't lock it returns an error (EPERM) for these types:
PTHREAD_MUTEX_ERRORCHECK
PTHREAD_MUTEX_RECURSIVE
See http://pubs.opengroup.org/onlinepubs/007904875/functions/pthread_mutex_lock.html for details.
The bottom line is that it's never OK to unlock a mutex on a different thread, even if things seem to work.
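To see the error-checking behaviour for yourself, here is a small sketch with made-up function names. It assumes an implementation that follows the rules quoted above for PTHREAD_MUTEX_ERRORCHECK (glibc on Linux does). With a normal or default mutex the same call would be undefined behaviour, so do not rely on any result there.

```c
#define _POSIX_C_SOURCE 200809L  /* for PTHREAD_MUTEX_ERRORCHECK under strict -std */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t checked;

/* Runs in a second thread and tries to unlock a mutex it does not own. */
static void *intruder(void *arg)
{
    int *rc = arg;
    *rc = pthread_mutex_unlock(&checked);
    return NULL;
}

/* Returns the error code the non-owning thread got from
 * pthread_mutex_unlock(), or -1 if the thread could not be created. */
int unlock_from_other_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int other_rc = -1;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&checked, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&checked);            /* this thread is the owner */
    if (pthread_create(&t, NULL, intruder, &other_rc) == 0)
        pthread_join(t, NULL);

    int rc = pthread_mutex_unlock(&checked); /* the owner unlocks: fine */
    assert(rc == 0);
    (void)rc;
    pthread_mutex_destroy(&checked);
    return other_rc;
}
```

Called from main(), unlock_from_other_thread() is expected to return EPERM.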
This is an interview question.
On linux, how to make sure to unlock a POSIX mutex which was locked in a POSIX thread that dies/terminates?
My idea:
Will Linux release it automatically when it sends a kill or termination signal to the program? I cannot find more details about how the OS does this.
thanks
A robust mutex can be used to handle the case where the owner of the mutex is terminated while holding the mutex lock, so that a deadlock does not occur. These have more overhead than a regular mutex, and require that all clients locking the mutex be prepared to handle the error code EOWNERDEAD. This indicates that the former owner has died and that the client receiving this error code is the new owner and is responsible for cleaning up any inconsistent state.
A robust mutex is a mutex with the robust attribute set. It is set using the POSIX.1-2008 standard function pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST).
Further details and example code can be found on the Linux manual page for pthread_mutexattr_getrobust.
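A condensed sketch along the same lines as that manual page example is below. The function names are made up, and the "cleanup" is left as a comment because it depends entirely on the data the mutex protects. It assumes a platform that implements robust mutexes (glibc on Linux does).

```c
#define _POSIX_C_SOURCE 200809L  /* for the robust mutex API under strict -std */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust;

/* Locks the mutex and terminates without unlocking it. */
static void *dying_owner(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust);
    return NULL;
}

/* Returns the result of locking the mutex after its owner has died,
 * or -1 if the owner thread could not be created. */
int recover_from_dead_owner(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&t, NULL, dying_owner, NULL) != 0) {
        pthread_mutex_destroy(&robust);
        return -1;
    }
    pthread_join(t, NULL);               /* the owner is now gone */

    int rc = pthread_mutex_lock(&robust);
    if (rc == EOWNERDEAD) {
        /* We own the mutex now. Repair the protected data here, then
         * mark the mutex usable again. */
        int crc = pthread_mutex_consistent(&robust);
        assert(crc == 0);
        (void)crc;
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(&robust);
    pthread_mutex_destroy(&robust);
    return rc;
}
```

Here recover_from_dead_owner() is expected to return EOWNERDEAD. Skipping pthread_mutex_consistent() before unlocking would leave the mutex permanently unusable.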
If it's not a process-shared mutex, it doesn't matter. When a thread is killed abnormally (for example by a fatal signal), the whole process dies with it, and the mutex goes away.
If it's a process-shared mutex, you're asking the wrong question. You wouldn't want to unlock the mutex if a thread died while holding it. The reason a thread holds a mutex is so that it can manipulate shared data through states that must not be seen by other threads. If a thread dies while holding a mutex, it is likely that the data was left in such an inconsistent state. Unlocking the mutex would just allow other threads to see the invalid/corrupt data.