What are the practical use of weak semaphores? - semaphore

Weak semaphores are often defined as
if a process (or thread), say p1, performs a V operation and some
other process is waiting on that semaphore, then the first process
(i.e. p1) is not allowed to immediately perform another P operation on
that semaphore. Instead, one of the waiting processes must be given
its turn.
Further, while a single weak-semaphore cannot guarantee starvation-freedom, using few of them can actually guarantee starvation-freedom.
Can anyone mention any practical use of such weak-semaphores!
^The quoted section is from S. A. Friedberg and G. L. Peterson, "An Efficient Solution to the Mutual Exclusion Problem Using Weak Semaphores" with a bit embellishment

Related

Terminology question: mutex lock, spin lock, sleepable lock

All over StackOverflow and the net I see folks to distinguish mutexes and spinlocks as like mutex is a mutual exclusion lock providing acquire() and release() functions and if the lock is taken, then acquire() will allow a process to be preempted.
Nevertheless, A. Silberschatz in his Operating System Concepts says in the section 6.5:
... The simplest of these tools is the mutex lock. (In fact, the term mutex is short for mutual exclusion.) We use the mutex lock to protect critical sections and thus prevent race conditions. That is, a process must acquire the lock before entering a critical section; it releases the lock when it exits the critical section. The acquire() function acquires the lock, and the release() function releases the lock.
and then he describes a spinlock, though adding a bit later
The type of mutex lock we have been describing is also called a spinlock because the process “spins” while waiting for the lock to become available.
so as spinlock is just a type of mutex lock as opposed to sleepable locks allowing a process to be preempted. That is, spinlocks and sleepable locks are all mutexes: locks by means of acquire() and release() functions.
I see totally logical to define mutex locks in the way Silberschatz did (though a bit implicitly).
What approach would you agree with?
The type of mutex lock we have been describing is also called a spinlock because the process “spins” while waiting for the lock to become available.
Maybe you're misreading the book (that is, "The type of mutex lock we have been describing" might not refer to the exact passage you think it does), or the book is outdated. Modern terminology is quite clear in what a mutex is, but spinlocks get a bit muddy.
A mutex is a concurrency primitive that allows one agent at a time access to its resource, while the others have to wait in the meantime until it the exclusive access is released. How they wait is not specified and irrelevant, their process might go to sleep, get written to disk, spin in a loop, or perhaps you are using cooperative concurrency (also known as "asynchronous programming") and passing control to the event loop as your 'waiting operation'.
A spinlock does not have a clear definition. It might be used to refer to:
A synonym for mutex (this is in my opinion wrong, but it happens).
A specific mutex implementation that always waits in a busy loop.
Any sort of busy-waiting loop waiting for a resource. A semaphore, for example, might also get implemented using a 'spinlock'.
I would consider any use of the word to refer to a (part of a) specific implementation of a concurrency primitive that waits in a busy loop to be correct, if a more general term is not appropriate. That is, use mutex (or whatever primitive you desire) unless you specifically want to talk about a busy-waiting concurrency primitive.
The words that one author uses in one book or manual will not always have the same exact meaning in every book and every manual. The meanings of the words evolve over time, and it can happen fast when the words are names for new ideas.
Not every book was written at the same time. Not every author is the same age or had the same teachers. It's just something you'll have to get used to.
"Mutex" was a name for a new idea not so very long ago.
In one book, it might mean nothing more than a thing that keeps two or more threads from entering the same critical section at the same time. In another book, it might refer to a specific type of object in a certain operating system or library that is used for that same purpose.
A spinlock is a lock/mutex whose implementation relies mainly on a spinning loop.
More advanced locks/mutexes may have spinning parts in their implementation, however those often last for no more than a few microseconds or so.

Is what I am doing preventing Deadlock?

I have 12 resources (r_1, r_2, ..., r_12), and 12 corresponding locks (l_1, l_2, ..., l_12) that my threads try to access. Each thread needs a specific sequence of resources to operate on. For example, thread 1 needs r_1, r_3, and r_5. Thread 2 needs r_1, r_7, r_8, r_10.
Now what I've basically done is ordered the resources from 1 to 12, make each thread lock its required resources in this order (ascending order), then when the thread is done, I unlock them in the reverse order (descending order) to maintain an order.
So my question is, am I preventing a deadlock in this case? Or can there happen a deadlock?
TL;DR: Yes, this system is totally immune to deadlock. At any point in time, the thread holding the highest-numbered lock must be able to make progress, since it cannot be waiting to acquire any locks held by other processes. More formally, your conditions ensure a total ordering on lock acquisition by all processes, which in turn ensures that circular wait can never occur. Circular wait is a necessary precondition for deadlock.
Detail: In order for deadlock to take place, all four of the following conditions must apply (see relevant Wikipedia):
Mutual exclusion - i.e. concurrent processes are accessing unsharable resources. Locks are unsharable by definition (they are also called mutexes for this reason).
Hold and wait - at least one process is attempting to access multiple resources, and it does so by holding some of them and then waiting for the others. This condition probably applies in your case, depending on the exact semantics of your program.
No preemption - it is not possible for processes to have their resources taken from them by other processes. Once again, this is a property of the locks we're using.
Circular wait - there is a cycle of processes, each waiting on a resource held by the next. This condition doesn't apply here. Consider a thread A, waiting on accessing a lock L_i. That lock must be held by a thread B which has already obtained all the locks it requires from indices 1 to i. As a result, B cannot be waiting on A. Similarly, any thread that B is waiting on in order to acquire its next lock L_j (where j > i by the order in which locks are acquired) cannot be waiting on any locks with indices 1 to j. By induction, there can be no cycles of dependency in this system.
In concurrent programming, it is typical for the first three cases to be set by the context in which you are developing (which concurrency primitives are being used etc.), whereas the last can occasionally™ be avoided by cleverness.

What does it mean for a process to block?

In my OS class, my professor keeps using block as a verb instead of an adjective when describing multi-threading/synchronization. For example: "Thread B tries to access a resource that is currently being used by Thread A, and so Thread B blocks."
Without any prior knowledge, I would initially think that for a thread to block, it would be preventing some other thread from doing something (e.g. it holds the lock on a resource). But from the way he talks, it sounds like "Thread B blocks" actually means that Thread B is being blocked, or prevented, from accessing the resource it wants to access.
Which is correct?
Your interpolation is correction: When it is said "Thread B blocks" it means that thread B's operation is suspended pending some condition (which may or may not be achieved, "not" being the case in a deadlock for instance).

Is there any difference between "mutex" and "atomic operation"?

I'm learning Operating System now, and I'm quite confused with the two concepts - mutex and atomic operation. In my understanding, they are the same, but my OS instructor gave us such a question,
Suppose a multi-processor operating system kernel tracks the number of processes created by each user. This operating system kernel maintains a counter variable for each user that it increments every time it creates a new process for a user and decrements every time a process from that user terminates. Furthermore, this operating system runs on a processor that provides atomic fetch-and-increment and fetch-and-decrement instructions.
Should the operating system update the counter using the atomic increment and decrement instructions, or should it update the counter in a critical section protected by a mutex?
This question indicates that mutex and atomic operation are two things. Could anyone help me with it?
An atomic operation is one that cannot be subdivided into smaller parts. As such, it will never be halfway done, so you can guarantee that it will always be observed in a consistent state. For example, modern hardware implements atomic compare-and-swap operations.
A mutex (short for mutual exclusion) excludes other processes or threads from executing the same section of code (the critical section). Basically, it ensures that at most one thread is executing a given section of code. A mutex is also called a lock.
Underneath the hood, locks must be implemented using hardware somehow, and the implementation must make use of the atomicity guarantees of the underlying hardware.
Most nontrivial operations cannot be made atomic, so you must either use a lock to block other threads from operating while the critical section executes, or else you must carefully design a lock-free algorithm that ensures that all the critical state-changing operations can be safely implemented using atomic operations.
This is a very deep subject, and there is a large body of literature on all these topics. The Wikipedia links I've given are a good starting point, but since you're taking a class on operating systems right now, it might be best for you to ask your professor to provide good resources for learning and understanding this stuff.
If you're a total noob, my answer may be a good place to start. I've just learned how these work, and feel I'm in a good place to relay back.
Generally, both of these are means of avoiding bad things that happen when you read something that's halfway written.
Mutex
A mutex is like the key to a bathroom at a small business. Only one person ever has the key, so if some other person comes along they'll likely have to wait. Here's the rubs:
If someone walks off with the key, then the waiting person never stops waiting.
Nothing can stop some other process from making its own door to the bathroom.
In the context of code, a mutex is mostly the key part, and the person is a process.
Atomic
Atomic means something that can't be split into smaller steps. In the natural world there is no CPU clock -- so everything we do could be smaller steps -- but let's pretend...
When you're typing on your keyboard, every key you hit is an atomic action. It happens all at once, and you can not hit two keys at exactly the same time. Here's what's good about this:
No waiting: the fact that no two keys are being hit at the same time is not because one has to wait. It's because one is always done by the time the next gets there.
No collision: no matter how much you hammer away, you'll never get two characters overlaid. One always happens before the other, completely.
For a counter example, if you were trying to type two words at the same time, that would be not atomic. The letters would mix up.
In the context of code, hitting keys is the same as running a single CPU command. It doesn't matter what other commands are in queue, the one your are doing will finish in its entirety before the next happens.
If you can do something atomically, then you don't have to worry about collision. But not everything is feasible within these bounds. Generally, atomics are for really low level operations -- like getting and setting an primitive (int, boolean, etc). For anything that's going to run a bunch of CPU commands but wants to be atomic, there's a couple tricks:
Use a mutex. Kind of cheating, not really atomic. But some things do this and call themselves atomic.
Carefully writing code such that it never requires more than one concurrent instruction on a piece of data in a row to remain correct. This one gets a bit deeper, but sometimes it can be done.
From here there's tons of reading to get into the nitty gritty details, but this should be enough to give you a foundation understanding of the subject.
first read #Daniel answer then mine.
If your processor provides atomic instructions enough to complete your task you do not need Mutex/locks. In your case fetch-increment and fetch-decrement are supposed to be atomic so you do not need to use Mutex.
Atomic operations use low level/hardware level locks to make some operations ATOMIC: operations which are virtually performed in one go/cpu cycle. So atomic operations never place system in inconsistent state
EDIT
No Atomic and Mutex are not same thing but two opposite things used for same purpose of making sure that state of system should not become inconsistent. You use Mutex for Non-ATOMIC operations while for ATOMIC operations you do not use Mutex.

How to use V to wake up a designated P?

Suppose we have a semaphore s and there are multiple threads waiting for it by calling P(s). Then V(s) would wake up exact one thread among them. Is there a way to wake up a designated thread instead of having the system make the decision? For instance, in the barbershop problem, after each haircut, the barber wants to serve the longest waiting customer, instead of a random one.
You could just use a queue to store the P's. that'll let you do it based off of longest wait. If not you could store in a sorted tree based off of whatever paramater you want, and remove when needed.
I think the crux of it would be some sort of ordering mechanism for the P's, which souldn't be too complicated.
It depends on the implementation of the semaphore. You would have to use a smart semaphore that creates a queue of waiting threads and signals them in the right order. I think the regular semaphore implementation on Windows doesn't work that way. It just sends a signal to the OS, which in turn sends a signal to any of the waiting threads. It would even make sense if this uses a lifo stack, because that is implemented more easily.
But it wouldn't be hard to build this yourself by implementing a queue, which could be a linked list, or a cyclic array.
No, not with classical semaphores by themselves. If you want queue-like behavior, you create a queue (with a semaphore, or maybe a couple of them) to protect the queue's shared data structure(s).
The reality is, that while semaphores are theoretically all you need to do synchronization, you'd rarely (never?) write a significant body of real code that just used bare semaphores directly. Most of the time, you build higher-level constructs with (for example) a semaphore to protect that critical data in that construct.

Resources