How to use V to wake up a designated P? - multithreading

Suppose we have a semaphore s and there are multiple threads waiting for it by calling P(s). Then V(s) would wake up exactly one thread among them. Is there a way to wake up a designated thread instead of having the system make the decision? For instance, in the barbershop problem, after each haircut, the barber wants to serve the longest-waiting customer instead of a random one.

You could just use a queue to store the P's. That will let you wake them in order of longest wait. If you need a different order, you could store them in a sorted tree keyed on whatever parameter you want, and remove entries when needed.
I think the crux of it would be some sort of ordering mechanism for the P's, which shouldn't be too complicated.

It depends on the implementation of the semaphore. You would have to use a smart semaphore that keeps a queue of waiting threads and signals them in the right order. I think the regular semaphore implementation on Windows doesn't work that way. It just sends a signal to the OS, which in turn signals any one of the waiting threads. It would even make sense if this used a LIFO stack, because that is easier to implement.
But it wouldn't be hard to build this yourself by implementing a queue, which could be a linked list, or a cyclic array.

No, not with classical semaphores by themselves. If you want queue-like behavior, you create a queue, with a semaphore (or maybe a couple of them) to protect the queue's shared data structure(s).
The reality is that, while semaphores are theoretically all you need to do synchronization, you'd rarely (never?) write a significant body of real code that just used bare semaphores directly. Most of the time, you build higher-level constructs and use (for example) a semaphore to protect the critical data in that construct.
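A minimal Java sketch of the queue idea from the answers above, assuming `java.util.concurrent.Semaphore` as the underlying primitive. The class `FifoSemaphore` and its methods `p()`, `v()`, `waiting()` and `demo()` are hypothetical names made up for illustration. Each blocked thread parks on its own private semaphore, and `v()` releases the one at the front of the queue, so the longest-waiting thread is woken first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// Hypothetical FIFO semaphore: V always wakes the longest-waiting P.
class FifoSemaphore {
    private int permits;                                          // free permits, guarded by "this"
    private final Queue<Semaphore> waiters = new ArrayDeque<>();  // one private semaphore per blocked thread

    FifoSemaphore(int permits) { this.permits = permits; }

    // P: take a permit if one is free, otherwise queue up and block.
    void p() {
        Semaphore mine;
        synchronized (this) {
            if (permits > 0) { permits--; return; }
            mine = new Semaphore(0);
            waiters.add(mine);
        }
        mine.acquireUninterruptibly();   // blocks until v() releases this particular waiter
    }

    // V: hand the permit to the oldest waiter, or bank it if nobody is waiting.
    synchronized void v() {
        Semaphore oldest = waiters.poll();
        if (oldest == null) permits++;
        else oldest.release();
    }

    synchronized int waiting() { return waiters.size(); }

    // Demo: three "customers" arrive in the order 0, 1, 2 and are served in that order.
    static List<Integer> demo() {
        FifoSemaphore chair = new FifoSemaphore(0);
        List<Integer> served = Collections.synchronizedList(new ArrayList<>());
        List<Thread> customers = new ArrayList<>();
        try {
            for (int i = 0; i < 3; i++) {
                final int id = i;
                Thread t = new Thread(() -> { chair.p(); served.add(id); });
                customers.add(t);
                t.start();
                while (chair.waiting() <= i) Thread.sleep(1);   // fix the arrival order
            }
            for (Thread t : customers) {   // each V wakes the next-oldest customer
                chair.v();
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [0, 1, 2]
    }
}
```

Replacing the `ArrayDeque` with a priority queue would give the "sorted by whatever parameter you want" variant. In Java specifically, `new Semaphore(permits, true)` already requests first-in first-out granting of permits under contention, so a hand-built queue is mainly useful when the order is something other than arrival order.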

Related

Example of wake up waiting bit not working for multiple producers or consumers

In Andrew Tanenbaum's book about Operating Systems, in the section about the Producer/Consumer problem, he gives an example of a system consisting of one producer, which inserts data blocks into a buffer, and one consumer, which takes them out of the buffer. There is also a counter, which keeps track of the number of data blocks in the buffer. When the consumer sees that the counter is zero, it sleeps. When the producer sees that the counter is zero and puts some data in the buffer, it sends a wake-up signal to the consumer.
This can lead to race conditions, in a well-known manner, and Tanenbaum gives a possible solution using a wake-up waiting bit. If the consumer receives a wake-up signal while it is still awake, the wake-up waiting bit is set to one. Before sleeping, the consumer checks the bit. If it is one, the consumer resets it to zero and does not sleep.
This solves the problem for one producer and one consumer. But Tanenbaum states that, when three or more producer/consumer threads are involved, it is easy to think of an example where the wake-up waiting bit does not prevent race conditions.
I could not think of such an example, and all sources about producer/consumer problem and semaphores seem to avoid the same point.
Can someone please provide such an example of three or more processes (preferably three) where the wake-up waiting bit does not prevent a race condition?
The apparent problem here is that you are dealing with primitive locking mechanisms that no competent programmer would use in real life. Any rationally designed operating system will have locking mechanisms that are more sophisticated than just setting bits and counters.
Some operating systems have a lock manager. For such a queue you could define a named write lock. All the producers and consumers try to take out exclusive write locks and all the mechanism you describe would be implemented in terms of operating system services.

what is "synchronising point" in multi threading?

I need to decide how many counting semaphores a multithreaded application needs. I was told that if we know the synchronising points, we can decide the number of semaphores to use. What are synchronising points?
A synchronization point is a place in the flow of execution where a thread must wait for other busy threads, so that the data they are working on is in a proper state to continue.
For instance, a thread may have to wait for a free slot in a buffer while another thread is emptying the buffer.

Semaphores & threads - what is the point?

I've been reading about semaphores and came across this article:
www.csc.villanova.edu/~mdamian/threads/posixsem.html
So, this page states that if there are two threads accessing the same data, things can get ugly. The solution is to allow only one thread to access the data at the same time.
This is clear and I understand the solution, only why would anyone need threads to do this? What is the point? If the threads are blocked so that only one can execute, why use them at all? There is no advantage. (or maybe this is a just a dumb example; in such a case please point me to a sensible one)
Thanks in advance.
Consider this:
void update_shared_variable() {
    sem_wait( &g_shared_variable_mutex );
    g_shared_variable++;
    sem_post( &g_shared_variable_mutex );
}

void thread1() {
    do_thing_1a();
    do_thing_1b();
    do_thing_1c();
    update_shared_variable(); // may block
}

void thread2() {
    do_thing_2a();
    do_thing_2b();
    do_thing_2c();
    update_shared_variable(); // may block
}
Note that all of the do_thing_xx functions still happen simultaneously. The semaphore only comes into play when the threads need to modify some shared (global) state or use some shared resource. So a thread will only block if another thread is trying to access the shared thing at the same time.
Now, if the only thing your threads are doing is working with one single shared variable/resource, then you are correct - there is no point in having threads at all (it would actually be less efficient than just one thread, due to context switching.)
When you are using multithreading, not all of the code that runs will be blocking. For example, if you had a queue and two threads reading from it, you would make sure that no two threads read from the queue at the same time, so that part would be blocking, but it is also the part that will probably take the least time. Once you have retrieved the item to process from the queue, all the rest of the code can run in parallel.
The idea behind the threads is to allow simultaneous processing. A shared resource must be governed to avoid things like deadlocks or starvation. If something can take a while to process, then why not create multiple instances of those processes to allow them to finish faster? The bottleneck is just what you mentioned, when a process has to wait for I/O.
Multiple threads are worth using when the time spent blocked waiting for the shared resource is small compared to the processing time.
This is of course a SSCCE (Short, Self Contained, Correct Example)
Let's say you have 2 worker threads that do a lot of work and write the result to a file.
You only need to lock access to the file (the shared resource).
The problem with trivial examples....
If the problem you're trying to solve can be broken down into pieces that can be executed in parallel then threads are a good thing.
A slightly less trivial example: imagine a for loop where each iteration processes different data. In that case you could execute the iterations simultaneously in separate threads. Indeed, some compilers, such as Intel's, will convert suitable for loops to threads automatically for you. In that particular case no semaphores are needed, because the iterations' data are independent.
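As a rough illustration of such a data-independent loop, here is a sketch that uses Java's parallel streams in place of compiler auto-parallelisation (a different mechanism from the one the answer mentions, chosen only because it is short). The class and method names are hypothetical. Each iteration reads `in[i]` and writes only `out[i]`, so no semaphore or lock is involved.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class IndependentIterations {
    // Every index is handled on its own: no iteration touches another's data.
    static int[] squares(int[] in) {
        int[] out = new int[in.length];
        IntStream.range(0, in.length)
                 .parallel()                          // iterations may run on several threads
                 .forEach(i -> out[i] = in[i] * in[i]);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(new int[] {1, 2, 3, 4})));   // prints [1, 4, 9, 16]
    }
}
```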
But say you were wanting to process a stream of data, and that processing had two distinct steps, A and B. The threadless approach would involve reading in some data then doing A then B and then output the data before reading more input. Or you could have a thread reading and doing A, another thread doing B and output. So how do you get the interim result from the first thread to the second?
One way would be to have a memory buffer to contain the interim result. The first thread could write the interim result to a memory buffer and the second could read from it. But with two threads operating independently there's no way for the first thread to know if it's safe to overwrite that buffer, and there's no way for the second to know when to read from it.
That's where you can use semaphores to synchronise the action of the two threads. The first thread takes a semaphore that I'll call empty, fills the buffer, and then posts a semaphore called filled. Meanwhile the second thread will take the filled semaphore, read the buffer, and then post empty. So long as filled is initialised to 0 and empty is initialised to 1 it will work. The second thread will process the data only after the first has written it, and the first won't write it until the second has finished with it.
It's only worth it of course if the amount of time each thread spends processing data outweighs the amount of time spent waiting for semaphores. This limits the extent to which splitting code up into threads yields a benefit. Going beyond that tends to mean that the overall execution is effectively serial.
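The empty/filled handshake described above could look roughly like this in Java, assuming `java.util.concurrent.Semaphore`, where `acquire` plays the role of "take" and `release` the role of "post". The class name, the one-slot `buffer`, and the placeholder arithmetic standing in for steps A and B are all invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

class TwoStagePipeline {
    static List<Integer> run(int items) {
        Semaphore empty = new Semaphore(1);    // initialised to 1: the buffer starts out free
        Semaphore filled = new Semaphore(0);   // initialised to 0: nothing to read yet
        int[] buffer = new int[1];             // one-slot interim buffer
        List<Integer> output = new ArrayList<>();

        Thread first = new Thread(() -> {
            for (int i = 1; i <= items; i++) {
                empty.acquireUninterruptibly();    // wait until the slot may be overwritten
                buffer[0] = i * 10;                // step A (placeholder work)
                filled.release();                  // tell the second thread data is ready
            }
        });
        Thread second = new Thread(() -> {
            for (int i = 0; i < items; i++) {
                filled.acquireUninterruptibly();   // wait for an interim result
                output.add(buffer[0] + 1);         // step B (placeholder work)
                empty.release();                   // the slot is free again
            }
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(3));   // prints [11, 21, 31]
    }
}
```

With a single slot the two threads strictly alternate on the buffer. A larger buffer with `empty` initialised to its capacity would let the first thread run ahead, which is where the approach starts to pay off.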
You can do multithreaded programming without semaphores at all. There's the Actor model or Communicating Sequential Processes (the one I favour). It's well worth looking up JCSP on Wikipedia.
In these programming styles data is shared between threads by sending it down communication channels. So instead of using semaphores to grant another thread access to data, that thread would be sent a copy of the data down something a bit like a network socket or a pipe. The advantage of CSP (which restricts the communication channel so that a send finishes only once the receiver has read) is that it stops you falling into the many, many pitfalls that plague multithreaded programs. It sounds inefficient (copying data is inefficient), but actually it's not so bad with Intel's QPI architecture or AMD's HyperTransport. And it means that the 'channel' really could be a network connection; scalability is built in by design.
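To give a feel for the "send finishes only if the receiver has read" behaviour, here is a small sketch that uses the standard-library `java.util.concurrent.SynchronousQueue` as a stand-in for a CSP channel. It is not JCSP and does not reproduce its API; the class and method names are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

class RendezvousChannel {
    // Sends 1..n down the channel from one thread and sums them in another.
    static int sumOverChannel(int n) {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>();
        Thread sender = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    channel.put(i);   // returns only once the receiver has taken the value
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        int sum = 0;
        try {
            for (int i = 0; i < n; i++) {
                sum += channel.take();   // blocks until the sender offers a value
            }
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOverChannel(3));   // prints 6
    }
}
```

Neither thread touches shared mutable state directly: the only thing they share is the channel, and the values are handed over one at a time.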

Multi Threading

How can I determine which thread has been waiting the longest?
My requirement is that, in a synchronized method, when one thread finishes its work, the thread that has been waiting the longest should be allowed in next. I hope my question makes sense.
It all depends on which language and/or environment you are using. As far as I know there is no intrinsic support for this in Java: if multiple threads are waiting to enter a synchronized method, the system will pick an arbitrary one to run when entry becomes possible.
If instead you use Java's wait() / notify(), you control which threads are notified and so can build your own priority mechanism. For example, you could keep a simple queue to which each thread adds itself before its wait(); then you just take the first item from the queue and notify that thread.
You should not and almost certainly do not need to do this.
The threading environment will schedule threads for you.
If the software design is such that this appears to be a problem, then the design is incorrect for a pre-emptive threading environment.
What you may want to be doing is something more like managing and prioritizing units of work, where you for example service work in the order that it arrives.
In other words, the order of work processing should not in your design depend on which thread runs, but rather, on your design of how work is handed out to threads.
@djna Java doesn't let you choose which thread notify() wakes. If 10 threads are waiting, any one of them can be notified.
This can be done by using the Lock and Condition interfaces in the java.util.concurrent.locks package.
Here you can associate each of these threads with a condition and then take out an item from that queue and signal the condition that is mapped with that thread/task.
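A sketch of the per-thread Condition idea described above, assuming `java.util.concurrent.locks.ReentrantLock`. The class `LongestWaiterGate`, its inner `Waiter` record of who is waiting, and the methods `enter()`, `leave()`, `queued()` and `demo()` are hypothetical names for illustration. Each waiting thread gets its own Condition, and the thread that leaves signals the Condition at the front of the queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate that admits one thread at a time, longest waiter first.
class LongestWaiterGate {
    private static final class Waiter {
        final Condition turn;
        boolean granted;
        Waiter(Condition turn) { this.turn = turn; }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Queue<Waiter> queue = new ArrayDeque<>();
    private boolean busy;

    void enter() {
        lock.lock();
        try {
            if (!busy) { busy = true; return; }
            Waiter me = new Waiter(lock.newCondition());
            queue.add(me);
            while (!me.granted) me.turn.awaitUninterruptibly();   // loop guards against spurious wakeups
        } finally {
            lock.unlock();
        }
    }

    void leave() {
        lock.lock();
        try {
            Waiter next = queue.poll();
            if (next == null) busy = false;
            else { next.granted = true; next.turn.signal(); }     // gate stays busy; ownership passes on
        } finally {
            lock.unlock();
        }
    }

    int queued() {
        lock.lock();
        try { return queue.size(); } finally { lock.unlock(); }
    }

    // Demo: threads 0, 1, 2 queue up in that order and get in in that order.
    static List<Integer> demo() {
        LongestWaiterGate gate = new LongestWaiterGate();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        gate.enter();                                    // the calling thread holds the gate first
        for (int i = 0; i < 3; i++) {
            final int id = i;
            Thread t = new Thread(() -> { gate.enter(); order.add(id); gate.leave(); });
            threads.add(t);
            t.start();
            while (gate.queued() <= i) Thread.yield();   // fix the arrival order
        }
        gate.leave();                                    // hands over to thread 0, which hands over to 1, ...
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [0, 1, 2]
    }
}
```

If plain arrival order is all that is needed, constructing the lock as `new ReentrantLock(true)` asks for a fair lock that favours the longest-waiting thread under contention, without any hand-written queue.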

Are "benaphores" worth implementing on modern OS's?

Back in my days as a BeOS programmer, I read this article by Benoit Schillings, describing how to create a "benaphore": a method of using an atomic variable to enforce a critical section that avoids the need to acquire/release a mutex in the common (no-contention) case.
I thought that was rather clever, and it seems like you could do the same trick on any platform that supports atomic-increment/decrement.
On the other hand, this looks like something that could just as easily be included in the standard mutex implementation itself... in which case implementing this logic in my program would be redundant and wouldn't provide any benefit.
Does anyone know if modern locking APIs (e.g. pthread_mutex_lock()/pthread_mutex_unlock()) use this trick internally? And if not, why not?
What your article describes is in common use today. Most often it's called a "Critical Section", and it consists of an interlocked variable, a bunch of flags and an internal synchronization object (a Mutex, if I remember correctly). Generally, in scenarios with little contention, the Critical Section executes entirely in user mode, without involving the kernel synchronization object. This guarantees fast execution. When contention is high, the kernel object is used for waiting, which gives up the time slice and is conducive to faster turnaround.
Generally, there is very little sense in implementing synchronization primitives in this day and age. Operating systems come with a big variety of such objects, and they are optimized and tested in a significantly wider range of scenarios than a single programmer can imagine. It literally takes years to invent, implement and test a good synchronization mechanism. That's not to say that there is no value in trying :)
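For illustration only, the fast-path trick from the question can be sketched in Java as follows, with an `AtomicInteger` as the atomic variable and a `java.util.concurrent.Semaphore` standing in for the slower synchronization object that is touched only under contention. This is my reading of the technique, not code from the article, and the class and method names are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class Benaphore {
    private final AtomicInteger count = new AtomicInteger(0);   // holder plus waiters
    private final Semaphore sem = new Semaphore(0);             // used only when threads collide

    void lock() {
        // Previous value 0 means the lock was free: no semaphore call at all.
        if (count.getAndIncrement() > 0) sem.acquireUninterruptibly();
    }

    void unlock() {
        // Previous value above 1 means someone is waiting: wake exactly one of them.
        if (count.getAndDecrement() > 1) sem.release();
    }

    // Demo: several threads increment a plain int under the benaphore.
    static int demo(int threads, int perThread) {
        Benaphore b = new Benaphore();
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    b.lock();
                    counter[0]++;      // critical section
                    b.unlock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 10000));   // prints 40000
    }
}
```

In Java this buys nothing in practice, since the library's own locks already avoid blocking in the uncontended case; the sketch only shows the shape of the idea.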
Java's AbstractQueuedSynchronizer (and its sibling AbstractQueuedLongSynchronizer) works similarly, or at least it could be implemented similarly. These types form the basis for several concurrency primitives in the Java library, such as ReentrantLock and FutureTask.
It works by way of using an atomic integer to represent state. A lock may define the value 0 as unlocked, and 1 as locked. Any thread wishing to acquire the lock attempts to change the lock state from 0 to 1 via an atomic compare-and-set operation; if the attempt fails, the current state is not 0, which means that the lock is owned by some other thread.
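The 0-to-1 compare-and-set step can be shown in isolation with a small sketch. This is not AbstractQueuedSynchronizer's code: where the real class queues and parks threads that fail to acquire, the hypothetical `CasLock` below simply retries after yielding, to keep the example short.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLock {
    private final AtomicInteger state = new AtomicInteger(0);   // 0 = unlocked, 1 = locked

    // Succeeds only if the state was 0; otherwise some other thread owns the lock.
    boolean tryLock() {
        return state.compareAndSet(0, 1);
    }

    // Naive acquire: keep retrying the compare-and-set (no wait queue in this sketch).
    void lock() {
        while (!tryLock()) Thread.yield();
    }

    void unlock() {
        state.set(0);
    }

    public static void main(String[] args) {
        CasLock l = new CasLock();
        System.out.println(l.tryLock());   // prints true
        System.out.println(l.tryLock());   // prints false
        l.unlock();
        System.out.println(l.tryLock());   // prints true
    }
}
```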
AbstractQueuedSynchronizer also facilitates waiting on locks and notification of conditions by maintaining CLH queues, which are lock-free linked lists representing the line of threads waiting either to acquire the lock or to receive notification via a condition. Such notification transfers one or all of the threads waiting on the condition to the queue of those waiting to acquire the related lock.
Most of this machinery can be implemented in terms of an atomic integer representing the state as well as a couple of atomic pointers for each waiting queue. The actual scheduling of which threads will contend to inspect and change the state variable (via, say, AbstractQueuedSynchronizer#tryAcquire(int)) is outside the scope of such a library and falls to the host system's scheduler.

Resources