Readers/Writers synchronization - order of releasing locks in fair version - multithreading

I was going through the completely fair solution to the Readers/Writers problem, and the order of releasing locks seems confusing to me. I'd like to know if we could swap the order of releasing the serviceQueue lock and the readCountAccess lock in the reader() function. It seems counter-intuitive to release the locks in this manner if the order doesn't matter. But I don't see what's wrong with releasing the locks in the opposite order (first readCountAccess and then the serviceQueue lock).
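For reference, a minimal sketch of the reader() I am talking about, assuming the usual three-semaphore formulation of the fair solution (serviceQueue, readCountAccess, resourceAccess); the exact names may differ from your source:

import threading

# Assumed standard "fair" formulation; all three are binary semaphores/mutexes.
service_queue = threading.Semaphore(1)      # fairness gate: readers and writers queue here
read_count_access = threading.Semaphore(1)  # protects read_count
resource_access = threading.Semaphore(1)    # exclusive access to the shared resource
read_count = 0

def reader(read_data):
    global read_count
    # entry section ("registration ceremony")
    service_queue.acquire()
    read_count_access.acquire()
    if read_count == 0:
        resource_access.acquire()   # first reader locks writers out
    read_count += 1
    service_queue.release()         # <- the two releases whose order
    read_count_access.release()     # <- the question is about
    read_data()                     # critical section: do the reading
    # exit section
    read_count_access.acquire()
    read_count -= 1
    if read_count == 0:
        resource_access.release()   # last reader lets writers back in
    read_count_access.release()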

This is probably a remnant from the days when you could justify releasing the broader lock first (in this case serviceQueue), as long as it doesn't affect correctness, because another thread can immediately proceed to acquire it while you are still releasing the narrower lock.
Imagine that each acquire or release takes 1 time unit, each other operation takes 0 time units, a reader just incremented the reader counter at time 0, and there's another reader next in line in the service wait queue.
If readCountAccess is released first and serviceQueue second, the next reader cannot acquire the serviceQueue mutex before time 3, so the earliest it can be done with the read-lock registration ceremony is time 6. The party that benefits from this order is readers waiting to exit, and they are less urgent: they cannot be the ones to release resourceAccess anyway, because our original reader has just registered, so the reader count cannot drop to zero.
If, on the other hand, serviceQueue is released first and readCountAccess second, the next reader can acquire the serviceQueue mutex as early as time 2, which means it can be done with the read-lock registration ceremony as early as time 5.
I'd still prefer to unlock using a symmetric scheme though - it is less error-prone, it is more widely recognized, and the burden of proving that it is actually worse than the version above lies with the doubters.

Related

Is there any reason to use a regular lock over a recursive lock?

When a thread tries to acquire a recursive lock again that it already holds, rlock.acquire() allows the thread to continue and does not block the thread.
When, on the other hand, a thread tries to acquire a regular lock that it already holds, the thread is simply stuck in a deadlock.
The second case seems to me like just a source of trouble since it is a situation that cannot be easily recovered from (the thread is just stuck on the lock.acquire()) and that is kinda hard to diagnose (no exception is thrown or anything, the thread is just stuck).
I have seen it quite a few times so far that someone actually wanted to use an RLock but instead used a regular Lock and spent a while debugging that problem. While on the other hand I never encountered a situation where a Lock would have actually been better. It could arguably be used when there is a really critical part of the code that should not be accessed by the same thread twice at a time, but for that to happen the code inside that critical part would need to call itself, which would be something that should be quite obvious to the programmer.
So, is there any case where a Lock is better than an RLock? And if not, should language designers keep providing the regular Lock at all?
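To make the behaviour concrete, a small sketch with Python's threading module (the timeout on the second acquire is only there so the example doesn't hang):

import threading

lock = threading.Lock()
rlock = threading.RLock()

def reacquire_rlock():
    with rlock:
        with rlock:                      # fine: an RLock tracks its owner and a count
            print("re-acquired the RLock")

def reacquire_lock():
    with lock:
        # A plain Lock has no notion of an owner, so acquiring it again from the
        # same thread would block forever; the timeout only keeps the demo alive.
        print("re-acquired the Lock?", lock.acquire(timeout=1))

reacquire_rlock()
reacquire_lock()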
Assuming these are Python lock objects, the documentation shows that they are quite different. The main differences between the two are:
A Lock can be released by any thread, not just the thread that acquired it
An RLock can only be released by the thread that acquired it
An RLock must be released once for each time it was acquired by that thread
So a Lock allows you to build threading schemes where the lock is acquired in one thread but released in another. One example might be a pipeline of threads processing a piece of work: the work distributor acquires the lock, but it is released by the last thread in the pipeline.
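As a rough illustration of that pipeline idea (a hypothetical sketch, not from the documentation):

import threading, queue

work_lock = threading.Lock()    # guards one piece of work across the whole pipeline
handoff = queue.Queue()

def distributor():
    work_lock.acquire()         # acquired in this thread...
    handoff.put("work item")

def last_stage():
    item = handoff.get()
    # ... final processing of item ...
    work_lock.release()         # ...released in a different thread; an RLock would
                                # raise RuntimeError here, a Lock allows it

t1 = threading.Thread(target=distributor)
t2 = threading.Thread(target=last_stage)
t1.start(); t2.start()
t1.join(); t2.join()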

posix: interprocess lock abandoned, is there a better way?

I'm coding on AIX, but looking for a general 'nix solution, posix compliant ideally. Can't use anything in C++11 or later.
I have shared memory with many threads from many processes involved. The data in shared memory has to stay self-consistent, so I need a lock, to get everyone to take turns.
Processes crashing with the lock is a thing, so I have to be able to detect an abandoned lock, fix (aka reset) the data, and move on. Twist: deciding the lock is abandoned by waiting for it for some fixed period is not a viable solution.
A global mutex (either living in shared memory, or named) appears not to be a solution. There's no detection mechanism for abandonment (except timing) and even then you can't delete and reform the mutex without risking undefined behaviour.
So I opted for lockf() and a busy flag - get the file lock, set the flag in shared memory, do stuff, unset the flag, drop the lock. On a crash with the lock owned, the lock is automatically dropped, and the next guy to get it can see the busy flag is still set, and knows he has to clean up a mess.
This doesn't work - because lockf() will keep threads from other processes out, but it has special semantics for other threads in your own process. It lets them through unchecked.
In the end I came up with a two step solution - a local (thread) mutex and a file lock. Get the local mutex first; now you're the only thread in this process doing the next step, which is lockf(). lockf() in turn guarantees you're the only process getting through, so now you can set the busy flag and do the work. To unlock, go in reverse order: clear the busy flag, drop the file lock, drop the mutex lock. In a crash, the local mutex vanishes when the process does, so it's harmless.
Works fine. I hate it. Using two locks nested like this strikes me as expensive, and takes a page worth of comments in the code to explain. (My next code review will be interesting). I feel like I missed a better solution. What is it?
Edit: @Matt I probably wasn't clear. The busy flag isn't part of the locking mechanism; it's there to indicate that some process successfully acquired the lock(s). If, after acquiring the locks, you see the busy flag is already set, it means some other process got the locks and then crashed, leaving the shared memory it was in the middle of writing in an incomplete state. In that case the thread now in possession of the lock gets the job of re-initializing the shared memory to a usable state. I probably should have called it a "memoryBeingModified" flag.
No variation of "tryLock" is going to be permissible. Polling is absolutely out of the question in this application. Threads that need to modify shared memory may only block on the locks (which are never held long) and have to take their turn as soon as the lock is available to them. They have to experience the minimum possible delay.
You can just
#include <pthread.h>   // pthread_mutex_lock/unlock
#include <unistd.h>    // lockf
#include <stdbool.h>

// local_mutex and global_fd are assumed to be defined and initialised elsewhere.
extern pthread_mutex_t local_mutex;
extern int global_fd;

// always returns true unless something horrible happened
bool lock()
{
    if (pthread_mutex_lock(&local_mutex) == 0)   // keep other threads of this process out
    {
        if (lockf(global_fd, F_LOCK, 0) == 0)    // keep other processes out; 0 means success
            return true;
        pthread_mutex_unlock(&local_mutex);      // file lock failed: back out
    }
    return false;
}

void unlock()
{
    lockf(global_fd, F_ULOCK, 0);                // release in reverse order of acquisition
    pthread_mutex_unlock(&local_mutex);
}
This seems pretty straightforward to me, and I wouldn't feel too bad about using 2 levels of lock -- the pthread_mutex is quite fast and consumes almost no resources.
The simple answer is, there's no good solution. On AIX, lockf turns out to be extremely slow, for no good reason. But mutexes in shared memory, while very fast on any platform, are fragile (anyone can crash while holding the lock, and there's no recovery from that). It would be nice if POSIX defined a "this mutex is held by a thread/process that died" error, but it doesn't, and even if there were such an error code, there's no way to repair things and continue. Using shared memory with multiple readers and writers continues to be the wild west.

Using semaphore with counter

I am trying to understand how to use a semaphore. As an exercise I took the classical readers/writers problem with a cyclic memory buffer. I would like to discuss only the writers. If I initialize the semaphore with a count greater than 1, I see that my writers can write to the same memory location. What, then, is the point of a semaphore with a counter if it does not guarantee synchronized access to a shared resource? It seems I should have a separate semaphore for each memory cell.
Well, your use case is the special situation where the semaphore is initialized to 1 and behaves like a mutex. Obviously initializing it to 2 would be an error, as it would no longer be a correct lock.
Nevertheless, semaphores are used in many other situations. For example, say you want to make sure that you never have more than 5 threads running at a time.
You would set the semaphore up with a count of 5, do a down on it each time you spawn a thread, and do an up each time a thread finishes.
Trying to spawn the 6th thread would leave you 'stuck' in the down() until a thread eventually finishes and performs an up() that unblocks you.
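A small illustration of that counting use with Python's threading.Semaphore, where acquire/release play the role of down/up (the limit of 5 is just the example number from above):

import threading, time

max_running = threading.Semaphore(5)    # at most 5 workers inside the guarded section at once

def worker(i):
    with max_running:                   # "down" on entry, "up" on exit
        print("worker", i, "running")
        time.sleep(0.1)                 # pretend to do some work

threads = [threading.Thread(target=worker, args=(i,)) for i in range(20)]
for t in threads: t.start()
for t in threads: t.join()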
Semaphore
Semaphores are a way to share a resource among multiple threads. In the Readers-writers problem, it is a way to guarantee consistency of the data, by preventing updates while it is being read, and preventing reads while it is being written to. It allows only one writer, and multiple concurrent readers.
Talking about semaphores is only useful if there are both readers and writers; in the case of an exclusive lock, where only one thread can 'own' the lock (have access to the resource) at a time, it is usually called a mutex (short for mutual exclusion).
Implementation
I implemented semaphores the other way around (due to CPU specifics): a positive value indicates how many readers there are, and a negative value indicates that there is one writer.
Initially the semaphore is 0, indicating no writer, and no readers.
Read Lock
Any time a reader wants to read, the semaphore must be 0 or positive, to support concurrent reads. If this is so, it is incremented. Positive numbers, then, indicate that there are readers.
A reader would do a LOCK_READ, which succeeds, unless the semaphore is negative, indicating it is in the process of being written to and thus inconsistent. If this happens, the thread doing the read lock is suspended until the semaphore becomes 0 (or higher).
Write Lock
Any time a writer wants to write, the semaphore must be 0, because if it is positive, the readers may get partially updated (corrupt) data, and if it is negative, it is already locked for writing by another thread. If the semaphore indicates that the resource is not being accessed (0), the semaphore is decremented.
Unlocking
The unlocking is the reverse of the locking, except that there is no need to suspend the thread to unlock a resource. A read lock is lifted by decrementing the semaphore, and a write lock is lifted by incrementing the semaphore.
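A rough sketch of that signed-counter idea, using a Python condition variable in place of the CPU-specific atomic instructions the answer alludes to (purely illustrative, class name made up):

import threading

class RWSemaphore:
    # count > 0: that many readers; count == -1: one writer; count == 0: free
    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()
    def lock_read(self):
        with self._cond:
            while self._count < 0:       # suspended while a writer holds it
                self._cond.wait()
            self._count += 1             # register one more reader
    def unlock_read(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()  # maybe a writer can go now
    def lock_write(self):
        with self._cond:
            while self._count != 0:      # wait until nobody is reading or writing
                self._cond.wait()
            self._count -= 1             # goes to -1: one writer
    def unlock_write(self):
        with self._cond:
            self._count += 1             # back to 0
            self._cond.notify_all()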

Java Thread Live Lock

I have an interesting problem related to Java thread live lock. Here it goes.
There are four global locks - L1,L2,L3,L4
There are four threads - T1, T2, T3, T4
T1 requires locks L1,L2,L3
T2 requires locks L2
T3 requires locks L3,L4
T4 requires locks L1,L2
So, the pattern of the problem is: any of the threads can run and acquire the locks in any order. If any of the threads detects that a lock it needs is not available, it releases all the locks it had previously acquired and waits for a fixed time before retrying. The cycle repeats, giving rise to a livelock condition.
So, to solve this problem, I have two solutions in mind
1) Let each thread wait for a random period of time before retrying.
OR,
2) Let each thread acquire all the locks in a particular order (even if a thread does not require all the locks)
I am not convinced that these are the only two options available to me. Please advise.
Have all the threads enter a single mutex-protected state-machine whenever they require and release their set of locks. The threads should expose methods that return the set of locks they require to continue and also to signal/wait for a private semaphore signal. The SM should contain a bool for each lock and a 'Waiting' queue/array/vector/list/whatever container to store waiting threads.
If a thread enters the SM mutex to get locks and can immediately get its lock set, it can reset its bool set, exit the mutex and continue on.
If a thread enters the SM mutex and cannot immediately get its lock set, it should add itself to 'Waiting', exit the mutex and wait on its private semaphore.
If a thread enters the SM mutex to release its locks, it sets the lock bools to 'return' its locks and iterates 'Waiting' in an attempt to find a thread that can now run with the set of locks available. If it finds one, it resets the bools appropriately, removes the thread it found from 'Waiting' and signals the 'found' thread semaphore. It then exits the mutex.
You can tweak the algorithm used to match the available lock bools with waiting threads as you wish. Maybe you should release the thread that requires the largest set of matches, or perhaps you would like to 'rotate' the 'Waiting' container elements to reduce starvation. Up to you.
A solution like this requires no polling (with its performance-sapping CPU use and latency), and no continual acquire/release of multiple locks.
It's much easier to develop such a scheme with an OO design. The methods/member functions to signal/wait the semaphore and return the set of locks needed can usually be stuffed somewhere in the thread class inheritance chain.
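A rough sketch of such a scheme, simplified to a single condition variable standing in for the SM mutex, the 'Waiting' container and the private semaphores (the LockManager name and its API are made up for illustration):

import threading

class LockManager:
    # Grants a whole set of named locks atomically, or makes the caller wait.
    def __init__(self, names):
        self._free = set(names)            # stands in for the per-lock bools
        self._cond = threading.Condition() # stands in for the SM mutex + waiting threads

    def acquire(self, needed):
        needed = set(needed)
        with self._cond:
            while not needed <= self._free:   # whole set not available yet
                self._cond.wait()             # stands in for waiting on the private semaphore
            self._free -= needed              # take every lock at once

    def release(self, held):
        with self._cond:
            self._free |= set(held)           # give the locks back
            self._cond.notify_all()           # wake waiters so they re-check their sets

mgr = LockManager(["L1", "L2", "L3", "L4"])

def t1_body():                                # e.g. the question's T1
    mgr.acquire(["L1", "L2", "L3"])
    try:
        pass                                  # T1's work
    finally:
        mgr.release(["L1", "L2", "L3"])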
Unless there is a good reason (performance-wise) not to do so, I would unify all the locks into one lock object. This is similar to solution 2 you suggested, only simpler in my opinion. And by the way, not only is this solution simpler and less bug-prone, its performance might also be better than that of solution 1.
Personally, I have never heard of Option 1, but I am by no means an expert on multithreading. After thinking about it, it sounds like it will work fine.
However, the standard way to deal with threads and resource locking is closely related to Option 2. To prevent deadlocks, resources need to always be acquired in the same order: if every thread locks the resources in the same order, the circular wait that deadlock requires can never form.
Go with 2a) Let each thread acquire all of the locks that it needs (NOT all of the locks) in a particular order; if a thread encounters a lock that isn't available then it releases all of its locks
As long as threads acquire their locks in the same order you can't have deadlock; however, you can still have starvation (a thread might run into a situation where it keeps releasing all of its locks without making forward progress). To ensure that progress is made you can assign priorities to threads (0 = lowest priority, MAX_INT = highest priority) - increase a thread's priority when it has to release its locks, and reduce it to 0 when it acquires all of its locks. Put your waiting threads in a queue, and don't start a lower-priority thread if it needs the same resources as a higher-priority thread - this way you guarantee that the higher-priority threads will eventually acquire all of their locks. Don't implement this thread queue unless you're actually having problems with thread starvation, though, because it's probably less efficient than just letting all of your threads run at once.
You can also simplify things by implementing omer schleifer's condense-all-locks-to-one solution; however, unless threads other than the four you've mentioned are contending for these resources (in which case you'll still need to lock the resources from the external threads), you can more efficiently implement this by removing all locks and putting your threads in a circular queue (so your threads just keep running in the same order).
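A minimal sketch of the ordered-acquisition part of 2a, here in Python (the helper and lock names are made up; it blocks rather than backing off, since with a consistent order the back-off is no longer needed to avoid deadlock):

import threading

L1, L2, L3, L4 = (threading.Lock() for _ in range(4))
ORDER = {id(L1): 1, id(L2): 2, id(L3): 3, id(L4): 4}   # the fixed global order

def run_with_locks(needed, work):
    ordered = sorted(needed, key=lambda lk: ORDER[id(lk)])
    for lk in ordered:            # take only the locks this thread needs, in global order
        lk.acquire()
    try:
        work()
    finally:
        for lk in reversed(ordered):
            lk.release()

# e.g. T1 needs L1, L2, L3 and T4 needs L1, L2 (argument order doesn't matter):
threading.Thread(target=run_with_locks, args=([L3, L1, L2], lambda: None)).start()
threading.Thread(target=run_with_locks, args=([L2, L1], lambda: None)).start()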

Is Deadlock recovery possible in MultiThread programming?

A process has some 10 threads and all 10 threads have entered a deadlock state (assume all are waiting on a mutex variable).
How can you free the process (threads) from the deadlock state?
Is there any way to kill a lower-priority thread? (In the multi-process case we can kill a lower-priority process when all processes are in a deadlock state.)
Can we attach the deadlocked process to a debugger and assign a proper value to the mutex variable? (Assume all the threads are waiting on a mutex variable MUT whose value is 0; can we set MUT to 1 through the debugger?)
If every thread in the app is waiting on every other, and none are set to time out, you're rather screwed. You might be able to run the app in a debugger or something, but locks are generally acquired for a reason -- and manually forcing a mutex to be owned by a thread that didn't legitimately acquire it can cause some big problems (the thread that previously owned it is still going to try and release it, the results of which can be unpredictable if the mutex is unexpectedly yanked away. Could cause an unexpected exception, could cause the mutex to be unlocked while still in use.) Anyway it defeats the whole purpose of mutexes, so you're just covering up a much bigger problem.
There are two common solutions:
Instead of having threads wait forever, set a timeout. This is slightly harder to do in languages like Java that embed mutexes into the language via synchronized or lock blocks, but it's almost always possible. If you time out waiting on the lock, release all the locks/mutexes you hold and try again later (see the sketch after these two points).
Better, but potentially much more complex, is to figure out why everything is fighting over the resource and remove that contention. If you must lock, lock consistently. But if there are 10 threads blocking on a single mutex, that could be a clue either that your operations are badly chunked (i.e. your threads are doing too much or too little at once before trying to acquire a lock), or that there's unnecessary locking going on. Don't lock unless you have to. Some synchronization can be avoided by using collections and algorithms specifically designed to be "lock-free" while still offering thread safety.
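A sketch of the first approach (time out, back out, retry later), shown here with Python's Lock.acquire(timeout=...); in Java, tryLock with a timeout plays the same role (the helper name and timings are made up for illustration):

import threading, random, time

def acquire_all_or_back_off(locks, timeout=0.1):
    # Try to take every lock; on any timeout, back out completely and retry later.
    while True:
        held = []
        for lk in locks:
            if lk.acquire(timeout=timeout):
                held.append(lk)
            else:
                for h in reversed(held):             # timed out: release everything we hold
                    h.release()
                time.sleep(random.uniform(0, 0.05))  # brief back-off before retrying
                break
        else:
            return held                              # acquired them all; caller releases later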
Adding another answer because I don't agree with the solutions proposed by cHao earlier - the analysis is fine.
First, why I disagree with the two solutions offered:
Reduce contention
Contention doesn't lead to deadlocks. It just causes poor performance. Deadlock means no performance whatsoever. Therefore, reducing contention does not solve deadlocks.
Timeout on mutex
A mutex protects a resource, and a thread locks the mutex because it needs the resource. With a timeout, you won't be able to acquire the resource, and your thread fails. Does it solve the deadlock problem? Only if the failing thread releases another resource that was blocking the other threads.
But in that case, there's a much better solution. Mutexes should have a partial ordering. If there is at least one thread that can hold both mutex A and mutex B, you should decide which of A or B is acquired first, and then stick with that. This must be a transitive order: if you lock A before B, and B before C, then obviously you must lock A before C.
This is a perfect solution to deadlocks. Look back at the timeout example: it only works if the thread that times out waiting on A then releases its lock on B, to release another thread that was waiting on B. In the most simple case, that other thread was itself directly locking A. Thus, the mutexes A and B are not properly ordered. You should have consistently locked either A or B first.
The timeout case could also be the result of a cyclic order problem; one thread locks A then B, another B then C, and a third C then A, with the deadlock happening when each thread owns one lock. The solution again is the same; order the locks.
Alternatively said, mutex lock orders can be described by a directed graph. If a thread locks A before B, there's an arc from A to B. Deadlocks appear if the directed graph is cyclic, and then the arcs of that cycle are the deadlocked threads.
This theory can be a bit complex, but there are some simple insights to be found. For instance, from the graph theory, we know that trees are acyclic graphs. Hence, neither "leaf mutexes" (those that are always locked last) nor "root mutexes" (those that are always locked first) can cause deadlocks. Leaf mutexes are excluded because no thread ever blocks holding them, and root mutexes are excluded because the thread that holds them will be able to lock all subsequent mutexes in due time.
