I read that a mutex and a binary semaphore differ in only one respect: a mutex must be unlocked by the thread that locked it, whereas a semaphore can be locked and unlocked by different threads. Is that correct?
Which one is more efficient?
Assuming you know the basic differences between a semaphore and a mutex:
For fast, simple synchronization, use a critical section.
To synchronize threads across process boundaries, use mutexes.
To synchronize access to limited resources, use a semaphore.
Apart from the fact that mutexes have an owner, the two objects may be optimized for different usage. Mutexes are designed to be held only for a short time; violating this can cause poor performance and unfair scheduling. For example, a running thread may be permitted to acquire a mutex even though another thread is already blocked on it, which can starve the waiting thread. Semaphores may provide more fairness, or fairness can be enforced using several condition variables.
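To illustrate that last point, here is one way fairness can be enforced on top of a condition variable: a ticket lock, in which threads are served strictly in the order they asked. This is only a sketch under my own assumptions (illustrative names, a single condition variable for simplicity, no error handling):

    #include <pthread.h>

    /* Ticket lock: threads acquire strictly in the order they requested. */
    typedef struct {
        pthread_mutex_t m;
        pthread_cond_t  cv;
        unsigned long next_ticket;   /* next ticket number to hand out */
        unsigned long now_serving;   /* ticket number allowed to proceed */
    } fair_lock;

    #define FAIR_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

    void fair_lock_acquire(fair_lock *l) {
        pthread_mutex_lock(&l->m);
        unsigned long me = l->next_ticket++;   /* take a ticket */
        while (me != l->now_serving)           /* wait until it is our turn */
            pthread_cond_wait(&l->cv, &l->m);
        pthread_mutex_unlock(&l->m);
    }

    void fair_lock_release(fair_lock *l) {
        pthread_mutex_lock(&l->m);
        l->now_serving++;
        pthread_cond_broadcast(&l->cv);        /* wake waiters; only the next ticket proceeds */
        pthread_mutex_unlock(&l->m);
    }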
I am taking a course on concurrency. The text says that multi-threading allows high throughput because it takes advantage of the multiple cores of the CPU.
I have a question about locking in the context of multiple cores. If we have multiple threads and they are running on different CPU cores, why can't two threads acquire the same lock? How does the OS protect against such scenarios?
Locking and locks are for synchronization: they prevent data corruption when multiple threads want to write to the same memory.
Generally you run multiple threads and use locking only in the critical sections.
If two or more threads want to write to the same place at the same time, the benefit of multi-core computation is limited. Of course you can skip the locking in this situation, but then the results are unpredictable.
For example, to write a multi-threaded matrix multiplication you can create one thread per row of the resulting matrix. No locking is needed because every thread writes to a different place, and this scenario can fully benefit from multiple processors.
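A minimal sketch of that scenario, assuming POSIX threads (illustrative names, fixed small size, no error handling):

    #include <pthread.h>
    #include <stdio.h>

    #define N 4   /* small illustrative matrix size */

    static double A[N][N], B[N][N], C[N][N];

    /* Each thread computes one row of C = A * B. No locking is needed
       because each thread writes only to its own row of C. */
    static void *row_worker(void *arg) {
        int row = *(int *)arg;
        for (int col = 0; col < N; col++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[row][k] * B[k][col];
            C[row][col] = sum;
        }
        return NULL;
    }

    int main(void) {
        pthread_t tid[N];
        int rows[N];

        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                A[i][j] = i + j;
                B[i][j] = (i == j);   /* identity matrix, so C should equal A */
            }

        for (int i = 0; i < N; i++) {
            rows[i] = i;
            pthread_create(&tid[i], NULL, row_worker, &rows[i]);
        }
        for (int i = 0; i < N; i++)
            pthread_join(tid[i], NULL);

        printf("C[1][2] = %g\n", C[1][2]);   /* expect 3, same as A[1][2] */
        return 0;
    }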
If you want to permit more than one thread to access a shared resource at the same time, you can use a Semaphore (in Java).
If we have multiple threads and they are running on different CPU cores, why can't two threads acquire the same lock?
The purpose of a mutex/lock is to implement mutual exclusion: only one thread can lock a mutex at a time. Or, in other words, many threads cannot lock the same mutex at the same time, by definition. This mechanism is needed to allow multiple threads to store into or read from a shared non-atomic resource without data races.
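A minimal example of that guarantee, assuming POSIX threads (names and counts are illustrative, error handling omitted): two threads increment a shared counter, and the mutex makes the read-modify-write exclusive, so no increments are lost even when the threads run on different cores.

    #include <pthread.h>
    #include <stdio.h>

    static long counter = 0;
    static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *unused) {
        (void)unused;
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&counter_lock);   /* blocks while the other thread holds it */
            counter++;                           /* non-atomic read-modify-write */
            pthread_mutex_unlock(&counter_lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);      /* always 2000000 with the lock; usually less without it */
        return 0;
    }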
How does the OS protect against such scenarios?
OS support is needed to prevent the threads from busy-waiting when locking a mutex that is already locked by another thread. Linux implementations of mutexes (and semaphores) use futex to put the waiting threads to sleep and wake them up when the mutex is released.
Here is a longer explanation from Linus Torvalds of how mutex is implemented.
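For a feel of how such an implementation fits together, here is a rough sketch of the common three-state futex-backed lock (0 = unlocked, 1 = locked, 2 = locked with waiters). This is my own illustrative, Linux-specific sketch, not the actual glibc or kernel code, and error handling is omitted:

    #include <stdatomic.h>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    typedef struct { atomic_int state; } futex_lock;   /* 0, 1 or 2, as above */

    static long sys_futex(atomic_int *addr, int op, int val) {
        return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
    }

    /* Compare-and-swap returning the value actually observed. */
    static int cas(atomic_int *p, int expected, int desired) {
        atomic_compare_exchange_strong(p, &expected, desired);
        return expected;
    }

    void futex_lock_acquire(futex_lock *l) {
        int c = cas(&l->state, 0, 1);           /* fast path: uncontended, stays in user space */
        if (c == 0)
            return;
        do {
            /* Mark the lock as contended, then ask the kernel to put us to
               sleep for as long as the state still reads "contended" (2). */
            if (c == 2 || cas(&l->state, 1, 2) != 0)
                sys_futex(&l->state, FUTEX_WAIT, 2);
        } while ((c = cas(&l->state, 0, 2)) != 0);
    }

    void futex_lock_release(futex_lock *l) {
        /* Drop the lock; only call into the kernel if someone may be sleeping. */
        if (atomic_exchange(&l->state, 0) == 2)
            sys_futex(&l->state, FUTEX_WAKE, 1);
    }

The point to notice is that the uncontended case is a single atomic operation with no system call; the kernel is only involved when a thread actually has to sleep or be woken.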
When a lock (or try_lock) function of a mutex discovers that the mutex is already locked (presumably by another thread), could it try to determine whether the owning thread is (or was very recently) running on another core?
Knowing whether the owner is running gives an indication of why that thread still holds the lock (why it hasn't been able to unlock it yet): if the thread owning the mutex is either "runnable" (waiting for an available core and time slice), or sleeping, or waiting for I/O, then clearly spinning on the mutex is not useful.
I understand that non-recursive, non-"safe", non-priority-inheritance mutexes (what I call "anonymous" mutexes: mutexes that in practice can be unlocked by another thread, as they are just common semaphores) carry as little information as possible, but surely that could be determined for those "owned" mutexes that know the identity of the locking thread.
Another, potentially more useful, piece of information would be how long the mutex has been locked, but that would potentially require adding a field to the mutex object.
Is that ever done?
I have two questions about mutexes:
1. When a mutex variable equals 1 and we do a signal() operation on it, what is expected to happen?
2. When the mutex equals 0 and we do a wait(), the thread is blocked and the mutex stays 0, correct? After a while, another thread performs a signal() operation and the blocked thread is released. What will be the value of the mutex now, 0 or 1?
So conceptually a mutex has two states: locked and unlocked. Whether they are represented by 0 or 1 is not important here.
If you unlock (i.e. signal) a mutex, it changes its state from locked to unlocked. Further unlocking doesn't change its state and actually doesn't do anything.
If a mutex is unlocked and you call wait, the call does not block: it locks the mutex and the thread continues its execution.
When a mutex is locked and you call wait, the thread is blocked. When another thread calls unlock, the blocked thread is released: conceptually the mutex becomes unlocked and the waiting thread immediately locks it again before proceeding.
The most important thing is that the unlock and lock operations are atomic, in the sense that parallel calls cannot overlap each other and produce a corrupted result (formally: parallel calls to lock/unlock are always equivalent to some serialized call history). Otherwise the whole concept of a mutex would be simply stupid. :)
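Those rules can be written down directly. Below is a sketch of such a conceptual binary mutex built on top of pthreads; the names (bm_wait, bm_signal) and the structure are purely illustrative and error handling is omitted:

    #include <pthread.h>
    #include <stdbool.h>

    typedef struct {
        pthread_mutex_t m;    /* makes wait/signal atomic with respect to each other */
        pthread_cond_t  cv;   /* signalled when the state becomes unlocked */
        bool locked;
    } binary_mutex;

    void bm_init(binary_mutex *b) {
        pthread_mutex_init(&b->m, NULL);
        pthread_cond_init(&b->cv, NULL);
        b->locked = false;
    }

    /* wait(): block while locked, then take the lock and continue. */
    void bm_wait(binary_mutex *b) {
        pthread_mutex_lock(&b->m);
        while (b->locked)
            pthread_cond_wait(&b->cv, &b->m);
        b->locked = true;
        pthread_mutex_unlock(&b->m);
    }

    /* signal(): unlock; signalling an already unlocked mutex does nothing. */
    void bm_signal(binary_mutex *b) {
        pthread_mutex_lock(&b->m);
        if (b->locked) {
            b->locked = false;
            pthread_cond_signal(&b->cv);   /* let one waiting thread proceed */
        }
        pthread_mutex_unlock(&b->m);
    }

In this sketch the answer to question 2 falls out naturally: the thread that was blocked in bm_wait wakes up, sets locked back to true and continues, so the mutex ends up locked (0 in your notation) again.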
After reading the comments (and the original unedited question) it is clear that there are enough people out there who believe binary semaphores to be interchangeable with mutexes. If we speak in practical terms (that is, pthread mutexes and System V semaphores), they are very different. I will try to outline the most important differences below.
Conceptual ownership. Mutexes are owned by their locker; semaphores are not owned by anybody. This leads to two distinctions. The very important one is that a mutex can (should) only be unlocked by its owner (the locking thread), while a semaphore can be unlocked by any thread (see below for permissions). The less important one is that a mutex can be made re-entrant, that is, locked multiple times by the owner thread, while a semaphore cannot behave that way. I say it is less important because re-entrant mutexes almost always indicate a design flaw.
Semaphores are objects which are more or less independent of the user. They can be created, used and destroyed by completely unrelated processes or threads, which do not even have to know anything about each other (or execute at the same time). For example, a process might create a semaphore and then die, other processes can keep using it, and a third process can eventually remove it. Semaphores have permissions associated with them (not unlike file permissions), while mutexes have no such thing: anybody who has access to a mutex can technically do anything with it.
Semaphores are process-shared. That is, they can be used by multiple processes without extra effort. Default mutexes are single-process only; if the same mutex is to be used by multiple processes, it has to be created in a special mode.
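To illustrate that last point with the POSIX APIs, a mutex intended for use across processes has to live in shared memory and be initialized with the PTHREAD_PROCESS_SHARED attribute. A minimal sketch (illustrative function name, no error handling):

    #include <pthread.h>
    #include <sys/mman.h>

    /* Create a mutex that this process and its fork()ed children can share.
       A default-initialized mutex only works within a single process. */
    pthread_mutex_t *make_shared_mutex(void) {
        pthread_mutex_t *m = mmap(NULL, sizeof *m,
                                  PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
        return m;
    }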
In a discussion elsewhere someone has proposed that there may be platforms on which a mutex could be unlocked from a thread other than that which locked it.
I'm unconvinced, but my experience is limited to platforms where this is never allowed; are there any platforms that allow it?
If there are, how would one possibly utilise such a facility? If you can no longer assume that a mutex remains locked between the LOCK and UNLOCK steps that seems to me to defeat the point. Are there scenarios where it must be done and can be done safely with care?
Strictly speaking, no: there is no mutex that can be shared between threads like this, as that would defeat the purpose of a mutex. If it is something you really want, you can use a semaphore initialized to a count of 1 (a Semaphore(1)), since semaphores allow a thread other than the acquiring one to release the count.
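A small POSIX sketch of that idea (illustrative names, no error handling): the main thread acquires a binary semaphore and a second thread releases it, which would be undefined or an error for a normal pthread mutex.

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t gate;

    static void *releaser(void *unused) {
        (void)unused;
        printf("worker: releasing the semaphore\n");
        sem_post(&gate);          /* legal even though this thread never acquired it */
        return NULL;
    }

    int main(void) {
        sem_init(&gate, 0, 1);    /* binary semaphore, like Semaphore(1) */
        sem_wait(&gate);          /* main thread takes the only permit */

        pthread_t t;
        pthread_create(&t, NULL, releaser, NULL);

        sem_wait(&gate);          /* blocks until the other thread posts */
        printf("main: got the permit back\n");
        pthread_join(t, NULL);
        return 0;
    }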
I am confused about the usage of semaphores and mutexes at the thread and process level. Can we use semaphores and mutexes for both thread and process synchronization, or are there different semaphores and mutexes at the thread and process levels? My question is with reference to the POSIX APIs.
The answer to both questions is yes. You can create both mutexes and semaphores as either process-shared or not. So you can use them as interprocess or interthread synchronization objects, but you have to specify which when you create them.
Of course, you must create the synchronization object in memory that is shared by all contexts that wish to access it. With threads, that's trivial since they share a view of memory. With processes, you have to create the synchronization object in shared memory specifically.
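For example, an unnamed POSIX semaphore can synchronize a parent and a fork()ed child if it is placed in shared memory and initialized with a non-zero pshared argument. A minimal sketch (Linux/POSIX, no error handling):

    #include <semaphore.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* The semaphore lives in MAP_SHARED memory, so both processes see it. */
        sem_t *sem = mmap(NULL, sizeof *sem, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        sem_init(sem, /* pshared = */ 1, /* initial value = */ 0);

        if (fork() == 0) {              /* child */
            printf("child: doing work\n");
            sem_post(sem);              /* tell the parent we are done */
            _exit(0);
        }

        sem_wait(sem);                  /* parent blocks until the child posts */
        printf("parent: child finished\n");
        wait(NULL);
        return 0;
    }

The same placement rule applies to a process-shared mutex, as sketched earlier.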
Synchronization protects elements when they share data or when their tasks must be ordered.
Processes and threads are basically the same thing (with some differences): they are pieces of computation that do some work. The only thing you have to pay attention to is whether you are working with processes or with threads, but the synchronization method used is the same.