Does Linux use some of the solutions of priority inversion? - linux

As known, priority inversion problem - when a thread with higher priority waits for a thread with lower-priority: https://en.wikipedia.org/wiki/Priority_inversion
It happen when we have 3 threads: L (low), M (medium), H (high-priority). L and H use the same mutex, but L acquire it early than H, and H blocked and goes to sleep. And then M occupies CPU-core because has higher priority than L, and L goes to sleep, but mutex still acquired. L & H are sleeping, but M is working.
There are some solutions of priority inversion:
Disabling all interrupts to protect critical sections
A priority ceiling
Priority inheritance
Random boosting - Ready tasks holding locks are randomly boosted in priority until they exit the critical section. This solution is used in Microsoft Windows.
Avoid blocking
Does Linux use some of the solutions of priority inversion, and which of its?

Linux uses Priority inheritance.
Priority inversion can be solved by using function int pthread_mutexattr_setprotocol(pthread_mutexattr_t *attr, int protocol);
Where protocol is:
PTHREAD_PRIO_NONE - nothing going on when acquires mutex ownership
PTHREAD_PRIO_INHERIT - Priority inheritance
PTHREAD_PRIO_PROTECT - uses a fixed level of the priority in accordance with prioceiling value which got by function pthread_mutexattr_getprioceiling()
When a thread owns one or more mutexes initialized with the
PTHREAD_PRIO_PROTECT protocol, it shall execute at the higher of its
priority or the highest of the priority ceilings of all the mutexes
owned by this thread and initialized with this attribute, regardless
of whether other threads are blocked on any of these mutexes or not.

Related

Is it possible to implement a real FIFO mutex (in c)?

Is it possible to implement a real FIFO mutex meaning that it is guaranteed that a requesting thread is granted its request only if all "younger" threads have been granted their request. Note that the order in granting the requests of two threads that requested at the same time is not relevant meaning that the solution can .
If the answer is no, is it possible to guarantee the following condition:
Let x_t be the time in which thread x has requested.
x is younger by n than y if and only if y_t - x_t = n.
x is at least younger by n than y if and only if y_t - x_t >= n.
The condition is that there exists n so that a thread is granted its request only if all threads at least younger than it by n have been granted their request.
Note: my terminology may not be accurate. With "requesting" I mean requesting acquiring and locking the mutex. With "granting" I mean locking the mutex by the specified thread.
Q: What does "in C" mean? You can do anything in C if you're willing to write the code.*
Are you asking whether a fair mutex can be provided by an operating system? That's almost trivially easy. The OS already must have a container for each mutex in which it stores the IDs of all of the threads that are waiting for the mutex. All you have to do to ensure fairness is to change the scheduling strategy to always wake the thread whose ID has been in the container longer than any other when the mutex is released.
Are you asking whether you can implement a fair mutex in application code, by using one or more OS-provided unfair mutexes? That is possible, but it won't be as clean as an OS-provided fair mutex. One approach would be to have a pool of condition variables (CVs), and have each different thread use a different CV to await its turn to enter the mutex. When a thread tries and fails to acquire the mutex, it would grab a CV from the pool, put a pointer to the CV into a FIFO queue, and then it would wait on the CV.
Whenever a thread releases the mutex, it would signal the CV that's been waiting longest in the queue. The thread that was awakened, would then return its CV to the pool, and enter the mutex.
* But see Greenspun's Tenth Rule.

What is the relationship between pthread priority and pthread policy?

I'm currently learning pthreads and am struggling to understand the relationship between thread priority and policy. What I know so far:
The thread priority is an integer that indicates priority. The higher this number, the higher priority the thread is treated by the OS.
The thread policy determines how the thread is executed among processes with the shared priority number. SCHED_RR and SCHED_FIFO are real-time policies that continuously execute unless an explicit "sleep" command is issued. Thus a programmer must very carefully write code when using these policies. SCHED_OTHER is a round robin policy that is not executed in real-time.
However, let's say I have the following scenarios (assume each thread does not use a "sleep" command).
Thread 1: priority = 0, policy = SCHED_OTHER
Thread 2: priority = 1, policy = SCHED_OTHER
// would thread 1 run at all?
Thread 1: priority = 0, policy = SCHED_RR
Thread 2: priority = 1, policy = SCHED_RR
// would thread 1 run at all?
I'm confused as to whether or not the the thread policy affects the thread priority, or if the thread priority always trumps the policy.
Edit: Found a web page that cleared up most of my confusion: https://computing.llnl.gov/tutorials/pthreads/man/sched_setscheduler.txt
Documentation for SCHED_RR says that it is the same as SCHED_FIFO except in certain cases when two or more threads have the same static priority.
Documentation for SCHED_FIFO makes it clear that if a thread with higher static priority is ready-to-run but not running, and if one or more threads with lower static priority are running, then one of the lower priority threads will be preempted in favor of the higher priority thread.
would thread 1 run at all [in the SCHED_RR case]?
That depends. What is thread 0 doing? How many CPUs does the system have? If those were the only two threads on a system that had only one CPU, then thread 1 would be allowed to run whenever thread 0 did not want to run.
Generally speaking, when you use static priorities, you want the highest priority threads to do the least amount of work. A high priority thread should spend most of its time waiting for some event. Then when the event happens, the thread should promptly acknowledge it, and then possibly signal a lower-priority thread if some kind of follow-up computation is required.
would thread 1 run at all [in the SCHED_OTHER case]?
As mentioned in my comment, if you're talking about static priorities (i.e., as set by the sched_setattr() system call, then the question is meaningless because threads that are scheduled under the SCHED_OTHER policy are all required to have the same static priority--zero.

Is this an example of a livelock or deadlock or starvation?

Scheduling Scheme : Preemptive Priority Scheduling
Situation :
A process L (Low priority) acquires a spinlock on a resource (R). While still in the Critical Section, L gets preempted because of the arrival of another process - H (Higher Priority) into the ready queue. .
However, H also needs to access resource R and so tries to acquire a spin lock which results in H going to busy wait. Because spinlocks are used, H never actually goes into Wait and will always be in Running state or Ready state (in case an even higher priority process arrives ready queue), preventing L or any process with a priority lower than H from ever executing.
A) All processes with priority less than H can be considered to be under Starvation
B) All processes with priority less than H as well as the process H, can be considered to be in a deadlock. [But, then don't the processes have to be in Wait state for the system to be considered to be in a deadlock?]
C) All processes with priority less than H as well as the process H, can be considered to be in a livelock.[But, then only the state of H changes constantly, all the low priority process remain in just the Ready state. Don't the state of all processes need to change (as part of a spin lock) continuously if the system in livelock?]
D) H alone can be considered to be in livelock, all lower priority processes are just under starvation, not in livelock.
E) H will not progress, but cannot be considered to be in livelock. All lower priority processes are just under starvation, not in livelock.
Which of the above statements are correct? Can you explain?
This is not a livelock, because definition of livelock requires that "states of the processes involved in the livelock constantly change with regard to one another", and here states effectively do not change.
The first process can be considered to be in processor starvation, because if there were additional processor, it could run on it and eventually release the lock and let the second processor to run.
The situation also can be considered as a deadlock, with 2 resources in the resource graph, and 2 processes attempting to acquire that resources in opposite directions: first process owns the lock and needs the processor to proceed, while the second process owns the processor and needs the lock to proceed.

When should we use mutex and when should we use semaphore

When should we use mutex and when should we use semaphore ?
Here is how I remember when to use what -
Semaphore:
Use a semaphore when you (thread) want to sleep till some other thread tells you to wake up. Semaphore 'down' happens in one thread (producer) and semaphore 'up' (for same semaphore) happens in another thread (consumer)
e.g.: In producer-consumer problem, producer wants to sleep till at least one buffer slot is empty - only the consumer thread can tell when a buffer slot is empty.
Mutex:
Use a mutex when you (thread) want to execute code that should not be executed by any other thread at the same time. Mutex 'down' happens in one thread and mutex 'up' must happen in the same thread later on.
e.g.: If you are deleting a node from a global linked list, you do not want another thread to muck around with pointers while you are deleting the node. When you acquire a mutex and are busy deleting a node, if another thread tries to acquire the same mutex, it will be put to sleep till you release the mutex.
Spinlock:
Use a spinlock when you really want to use a mutex but your thread is not allowed to sleep.
e.g.: An interrupt handler within OS kernel must never sleep. If it does the system will freeze / crash. If you need to insert a node to globally shared linked list from the interrupt handler, acquire a spinlock - insert node - release spinlock.
A mutex is a mutual exclusion object, similar to a semaphore but that only allows one locker at a time and whose ownership restrictions may be more stringent than a semaphore.
It can be thought of as equivalent to a normal counting semaphore (with a count of one) and the requirement that it can only be released by the same thread that locked it(a).
A semaphore, on the other hand, has an arbitrary count and can be locked by that many lockers concurrently. And it may not have a requirement that it be released by the same thread that claimed it (but, if not, you have to carefully track who currently has responsibility for it, much like allocated memory).
So, if you have a number of instances of a resource (say three tape drives), you could use a semaphore with a count of 3. Note that this doesn't tell you which of those tape drives you have, just that you have a certain number.
Also with semaphores, it's possible for a single locker to lock multiple instances of a resource, such as for a tape-to-tape copy. If you have one resource (say a memory location that you don't want to corrupt), a mutex is more suitable.
Equivalent operations are:
Counting semaphore Mutual exclusion semaphore
-------------------------- --------------------------
Claim/decrease (P) Lock
Release/increase (V) Unlock
Aside: in case you've ever wondered at the bizarre letters (P and V) used for claiming and releasing semaphores, it's because the inventor was Dutch. In that language:
Probeer te verlagen: means to try to lower;
Verhogen: means to increase.
(a) ... or it can be thought of as something totally distinct from a semaphore, which may be safer given their almost-always-different uses.
It is very important to understand that a mutex is not a semaphore with count 1!
This is the reason there are things like binary semaphores (which are really semaphores with count 1).
The difference between a Mutex and a Binary-Semaphore is the principle of ownership:
A mutex is acquired by a task and therefore must also be released by the same task.
This makes it possible to fix several problems with binary semaphores (Accidental release, recursive deadlock, and priority inversion).
Caveat: I wrote "makes it possible", if and how these problems are fixed is up to the OS implementation.
Because the mutex has to be released by the same task it is not very good for the synchronization of tasks. But if combined with condition variables you get very powerful building blocks for building all kinds of IPC primitives.
So my recommendation is: if you got cleanly implemented mutexes and condition variables (like with POSIX pthreads) use these.
Use semaphores only if they fit exactly to the problem you are trying to solve, don't try to build other primitives (e.g. rw-locks out of semaphores, use mutexes and condition variables for these)
There is a lot of misunderstanding between mutexes and semaphores. The best explanation I found so far is in this 3-Part article:
Mutex vs. Semaphores – Part 1: Semaphores
Mutex vs. Semaphores – Part 2: The Mutex
Mutex vs. Semaphores – Part 3 (final part): Mutual Exclusion Problems
While #opaxdiablo answer is totally correct I would like to point out that the usage scenario of both things is quite different. The mutex is used for protecting parts of code from running concurrently, semaphores are used for one thread to signal another thread to run.
/* Task 1 */
pthread_mutex_lock(mutex_thing);
// Safely use shared resource
pthread_mutex_unlock(mutex_thing);
/* Task 2 */
pthread_mutex_lock(mutex_thing);
// Safely use shared resource
pthread_mutex_unlock(mutex_thing); // unlock mutex
The semaphore scenario is different:
/* Task 1 - Producer */
sema_post(&sem); // Send the signal
/* Task 2 - Consumer */
sema_wait(&sem); // Wait for signal
See http://www.netrino.com/node/202 for further explanations
See "The Toilet Example" - http://pheatt.emporia.edu/courses/2010/cs557f10/hand07/Mutex%20vs_%20Semaphore.htm:
Mutex:
Is a key to a toilet. One person can have the key - occupy the toilet - at the time. When finished, the person gives (frees) the key to the next person in the queue.
Officially: "Mutexes are typically used to serialise access to a section of re-entrant code that cannot be executed concurrently by more than one thread. A mutex object only allows one thread into a controlled section, forcing other threads which attempt to gain access to that section to wait until the first thread has exited from that section."
Ref: Symbian Developer Library
(A mutex is really a semaphore with value 1.)
Semaphore:
Is the number of free identical toilet keys. Example, say we have four toilets with identical locks and keys. The semaphore count - the count of keys - is set to 4 at beginning (all four toilets are free), then the count value is decremented as people are coming in. If all toilets are full, ie. there are no free keys left, the semaphore count is 0. Now, when eq. one person leaves the toilet, semaphore is increased to 1 (one free key), and given to the next person in the queue.
Officially: "A semaphore restricts the number of simultaneous users of a shared resource up to a maximum number. Threads can request access to the resource (decrementing the semaphore), and can signal that they have finished using the resource (incrementing the semaphore)."
Ref: Symbian Developer Library
Mutex is to protect the shared resource.
Semaphore is to dispatch the threads.
Mutex:
Imagine that there are some tickets to sell. We can simulate a case where many people buy the tickets at the same time: each person is a thread to buy tickets. Obviously we need to use the mutex to protect the tickets because it is the shared resource.
Semaphore:
Imagine that we need to do a calculation as below:
c = a + b;
Also, we need a function geta() to calculate a, a function getb() to calculate b and a function getc() to do the calculation c = a + b.
Obviously, we can't do the c = a + b unless geta() and getb() have been finished.
If the three functions are three threads, we need to dispatch the three threads.
int a, b, c;
void geta()
{
a = calculatea();
semaphore_increase();
}
void getb()
{
b = calculateb();
semaphore_increase();
}
void getc()
{
semaphore_decrease();
semaphore_decrease();
c = a + b;
}
t1 = thread_create(geta);
t2 = thread_create(getb);
t3 = thread_create(getc);
thread_join(t3);
With the help of the semaphore, the code above can make sure that t3 won't do its job untill t1 and t2 have done their jobs.
In a word, semaphore is to make threads execute as a logicial order whereas mutex is to protect shared resource.
So they are NOT the same thing even if some people always say that mutex is a special semaphore with the initial value 1. You can say like this too but please notice that they are used in different cases. Don't replace one by the other even if you can do that.
Trying not to sound zany, but can't help myself.
Your question should be what is the difference between mutex and semaphores ?
And to be more precise question should be, 'what is the relationship between mutex and semaphores ?'
(I would have added that question but I'm hundred % sure some overzealous moderator would close it as duplicate without understanding difference between difference and relationship.)
In object terminology we can observe that :
observation.1 Semaphore contains mutex
observation.2 Mutex is not semaphore and semaphore is not mutex.
There are some semaphores that will act as if they are mutex, called binary semaphores, but they are freaking NOT mutex.
There is a special ingredient called Signalling (posix uses condition_variable for that name), required to make a Semaphore out of mutex.
Think of it as a notification-source. If two or more threads are subscribed to same notification-source, then it is possible to send them message to either ONE or to ALL, to wakeup.
There could be one or more counters associated with semaphores, which are guarded by mutex. The simple most scenario for semaphore, there is a single counter which can be either 0 or 1.
This is where confusion pours in like monsoon rain.
A semaphore with a counter that can be 0 or 1 is NOT mutex.
Mutex has two states (0,1) and one ownership(task).
Semaphore has a mutex, some counters and a condition variable.
Now, use your imagination, and every combination of usage of counter and when to signal can make one kind-of-Semaphore.
Single counter with value 0 or 1 and signaling when value goes to 1 AND then unlocks one of the guy waiting on the signal == Binary semaphore
Single counter with value 0 to N and signaling when value goes to less than N, and locks/waits when values is N == Counting semaphore
Single counter with value 0 to N and signaling when value goes to N, and locks/waits when values is less than N == Barrier semaphore (well if they dont call it, then they should.)
Now to your question, when to use what. (OR rather correct question version.3 when to use mutex and when to use binary-semaphore, since there is no comparison to non-binary-semaphore.)
Use mutex when
1. you want a customized behavior, that is not provided by binary semaphore, such are spin-lock or fast-lock or recursive-locks.
You can usually customize mutexes with attributes, but customizing semaphore is nothing but writing new semaphore.
2. you want lightweight OR faster primitive
Use semaphores, when what you want is exactly provided by it.
If you dont understand what is being provided by your implementation of binary-semaphore, then IMHO, use mutex.
And lastly read a book rather than relying just on SO.
I think the question should be the difference between mutex and binary semaphore.
Mutex = It is a ownership lock mechanism, only the thread who acquire the lock can release the lock.
binary Semaphore = It is more of a signal mechanism, any other higher priority thread if want can signal and take the lock.
All the above answers are of good quality,but this one's just to memorize.The name Mutex is derived from Mutually Exclusive hence you are motivated to think of a mutex lock as Mutual Exclusion between two as in only one at a time,and if I possessed it you can have it only after I release it.On the other hand such case doesn't exist for Semaphore is just like a traffic signal(which the word Semaphore also means).
As was pointed out, a semaphore with a count of one is the same thing as a 'binary' semaphore which is the same thing as a mutex.
The main things I've seen semaphores with a count greater than one used for is producer/consumer situations in which you have a queue of a certain fixed size.
You have two semaphores then. The first semaphore is initially set to be the number of items in the queue and the second semaphore is set to 0. The producer does a P operation on the first semaphore, adds to the queue. and does a V operation on the second. The consumer does a P operation on the second semaphore, removes from the queue, and then does a V operation on the first.
In this way the producer is blocked whenever it fills the queue, and the consumer is blocked whenever the queue is empty.
A mutex is a special case of a semaphore. A semaphore allows several threads to go into the critical section. When creating a semaphore you define how may threads are allowed in the critical section. Of course your code must be able to handle several accesses to this critical section.
I find the answer of #Peer Stritzinger the correct one.
I wanted to add to his answer the following quote from the book Programming with POSIX Threads by David R Butenhof. On page 52 of chapter 3 the author writes (emphasis mine):
You cannot lock a mutex when the calling thread already has that mutex locked. The result of attempting to do so may be an error return (EDEADLK), or it may be a self-deadlock, where the unfortunate thread waits forever. You cannot unlock a mutex that is unlocked, or that is locked by another thread. Locked mutexes are owned by the thread that locks them. If you need an "unowned" lock, use a semaphore. Section 6.6.6 discusses semaphores)
With this in mind, the following piece of code illustrates the danger of using a semaphore of size 1 as a replacement for a mutex.
sem = Semaphore(1)
counter = 0 // shared variable
----
Thread 1
for (i in 1..100):
sem.lock()
++counter
sem.unlock()
----
Thread 2
for (i in 1..100):
sem.lock()
++counter
sem.unlock()
----
Thread 3
sem.unlock()
thread.sleep(1.sec)
sem.lock()
If only for threads 1 and 2, the final value of counter should be 200. However, if by mistake that semaphore reference was leaked to another thread and called unlock, than you wouldn't get mutual exclusion.
With a mutex, this behaviour would be impossible by definition.
Binary semaphore and Mutex are different. From OS perspective, a binary semaphore and counting semaphore are implemented in the same way and a binary semaphore can have a value 0 or 1.
Mutex -> Can only be used for one and only purpose of mutual exclusion for a critical section of code.
Semaphore -> Can be used to solve variety of problems. A binary semaphore can be used for signalling and also solve mutual exclusion problem. When initialized to 0, it solves signalling problem and when initialized to 1, it solves mutual exclusion problem.
When the number of resources are more and needs to be synchronized, we can use counting semaphore.
In my blog, I have discussed these topics in detail.
https://designpatterns-oo-cplusplus.blogspot.com/2015/07/synchronization-primitives-mutex-and.html

How does VxWorks deal with priority inheritance?

We have 3 tasks running at different priorities: A (120), B (110), C (100).
A takes a mutex semaphore with the Inversion Safe flag.
Task B does a semTake, which causes Task A's priority to be elevated to 110.
Later, task C does a semTake. Task A's priority is now 100.
At this point, A releases the semaphore and C grabs it.
We notice that A's priority did not go back down to its original priority of 120.
Shouldn't A's priority be restored right away?
Ideally, when the inherited priority level is
lowered, it will be done in a step-wise fashion. As each
dependency that caused the priority level to be bumped up is removed,
the inherited priority level should drop down to the priority level of
the highest remaining dependency.
For Example:
task A (100 bumped up to 80) has two mutexes (X & Y)
that tasks B (pri 90) and task C (pri 80) are respectively pending
for. When task A gives up mutex Y to task C, we might expect that its
priority will drop to 90. When it finally gives up mutex X to task B,
we would expect its priority level to drop back to 100.
Priority inheritance does not work that way in VxWorks.
How it works depends on the version of VxWorks you are using.
pre-VxWorks 6.0
The priority level remains "bumped up" until the task that has the
lock on the mutex semaphore gives up its last inversion safe mutex
semaphore.
Using the example from above, when task A gives up mutex Y
to task C, its priority remains at 80. After it gives up mutex X to
task B, then its priority will drop back to 100 (skipping 90).
Let's throw curve ball #1 into the mix. What if task A had a lock on mutex
Z while all this was going on, but no one was pending on Z? In that
case, the priority level will remain at 80 until Z is given up--then
it will drop back to 100.
Why do it this way?
It's simple, and for most cases, it is good
enough. However, it does mean that when "curve ball #1" comes into
play, the priority will remain higher for a longer period of time than
is necessary.
VxWorks 6.0+
The priority level now
remains elevated until the task that has the lock on the mutex
semaphore gives up its last inversion safe mutex that contributed to
raising the priority level.
This improvement avoids the problem of
"curve ball #1". It does have its own limitations. For example, if
task B and/or task C time(s) out while waiting for task A to give up
the semaphores, task A's priority level does not get recalculated
until it gives up the semaphore.

Resources