Multithreading

How can I determine which thread has been waiting the longest?
My requirement is: in a synchronized method, when one thread finishes its work, I want to let in the thread that has been waiting the longest. I hope my question makes sense.

It all depends on which language and/or environment you are using. As far as I know there's no intrinsic support for this in Java; if multiple threads are waiting to enter a synchronized method, the system will pick an arbitrary one to run when entry is possible.
If instead you use Java's wait()/notify(), then you control which threads are notified and so can build your own priority mechanism. For example, you could keep a simple queue to which each thread adds itself before its wait(); you then take the head item from the queue and notify that thread.
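Here is a minimal sketch of that idea in Java (class and field names are illustrative, not from any library). Since a plain notify() cannot target a specific thread, the sketch wakes all waiters and lets only the head of the FIFO queue proceed:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hand-rolled FIFO lock: threads enter in the order they arrived.
class FifoLock {
    private final Queue<Thread> waiters = new ArrayDeque<>();
    private boolean busy = false;

    public synchronized void acquire() throws InterruptedException {
        Thread me = Thread.currentThread();
        waiters.add(me);                       // enqueue in arrival order
        while (busy || waiters.peek() != me) { // wait until it is our turn
            wait();
        }
        waiters.remove();                      // we are the head: proceed
        busy = true;
    }

    public synchronized void release() {
        busy = false;
        notifyAll(); // wake everyone; only the new head will get through
    }
}
```

Each thread brackets its critical section with acquire()/release(); the longest-waiting thread is always admitted next.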

You should not and almost certainly do not need to do this.
The threading environment will schedule threads for you.
If the software design is such that this appears to be a problem, then the design is incorrect for a pre-emptive threading environment.
What you may want instead is something more like managing and prioritizing units of work, where, for example, you service work in the order it arrives.
In other words, the order of work processing should not depend on which thread runs, but rather on how your design hands work out to threads.

@djna: Java doesn't let you choose which thread to notify. If 10 threads are in the queue, any one of them can be notified.
This can be done by using the Lock/Condition interfaces in the java.util.concurrent package.
Here you can associate each of these threads with its own Condition, then take an item from the queue and signal the Condition that is mapped to that thread/task.
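A sketch of that approach in Java, using a direct hand-off so the signaled waiter is guaranteed to be the one that proceeds (all names here are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Each waiter parks on its own Condition, so release() can signal exactly
// the thread that has been waiting the longest.
class LongestWaiterLock {
    private static final class Waiter {
        final Condition cond;
        boolean granted = false;
        Waiter(Condition cond) { this.cond = cond; }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Queue<Waiter> waiters = new ArrayDeque<>(); // FIFO order
    private boolean busy = false;

    public void acquire() throws InterruptedException {
        lock.lock();
        try {
            if (!busy) {
                busy = true;
                return;
            }
            Waiter me = new Waiter(lock.newCondition());
            waiters.add(me);
            while (!me.granted) { // guards against spurious wakeups
                me.cond.await();
            }
            // busy stays true: release() handed ownership directly to us
        } finally {
            lock.unlock();
        }
    }

    public void release() {
        lock.lock();
        try {
            Waiter next = waiters.poll();
            if (next == null) {
                busy = false;        // nobody waiting
            } else {
                next.granted = true; // hand off to the longest waiter
                next.cond.signal();
            }
        } finally {
            lock.unlock();
        }
    }
}
```

Also worth knowing: new ReentrantLock(true) creates a fair lock that already grants entry in roughly arrival order, so the hand-rolled queue is only needed if you want full control over the signaling policy.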

Related

What is the difference between user level threads and coroutines?

User-level threading involves N user-level threads that run on a single kernel thread. What are the details of user-level threading, and how does it differ from coroutines?
Wikipedia has a quite in-depth summary on the subject: Thread (computing).
With green threads there's a VM executing instructions that typically decides, in between two instructions, whether to switch threads.
With coroutines the two functions yield to each other at specified points, possibly passing values along, and typically requiring special language support. E.g. a producer yielding to a consumer, passing along an item.
The idea behind user-level threads is to have multiple different logical threads running in the same program but to have the user program handle the mapping from logical threads to kernel threads (which actually get scheduled) rather than having the OS handle the entire mapping. This can improve performance by letting the user program handle scheduling. Conceptually, user threads are one implementation of preemptive multitasking, where multiple jobs are run to completion in parallel by having the threads periodically stopped while other threads run.
Coroutines, on the other hand, are a generalization of standard function call and return ("subroutines") where functions pass control back and forth to one another, communicating values as they switch between routines. The switching back and forth between coroutines is under the control of the coroutines themselves; control only passes from one coroutine to another if one of the coroutines explicitly yields a value to another. This is an example of cooperative multitasking, where multiple jobs are completed in parallel by having the individual steps in the task manually coordinate who gets to run and when.
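Java has no native coroutines, but the flavor of "yield a value and suspend" can be roughly approximated with a SynchronousQueue, where every hand-off is an explicit rendezvous point (an illustration of the idea only, not a true coroutine, since real threads run underneath):

```java
import java.util.concurrent.SynchronousQueue;

// A producer "yields" items to a consumer: put() blocks until the other
// side is ready to take(), so control ping-pongs at explicit points.
public class YieldDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 3; i++) {
                    channel.put(i); // yield the value, then wait
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        for (int i = 0; i < 3; i++) {
            System.out.println("consumed: " + channel.take());
        }
        producer.join();
    }
}
```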
Hope this helps!

QProcess, QEventLoop - of any use for parallel-processing

I wonder whether I could use QEventLoop (QProcess?) to parallelize multiple calls to the same function with Qt. What precisely is the difference from QtConcurrent or QThread? What, more precisely, are a process and an event loop? I read that QCoreApplication must exec() as early as possible in main(), so I wonder how it differs from the main thread.
Could you point me at some good reference on processes and threads with Qt? I went through the official docs and those things remain unclear.
Thanks and regards.
Process and thread are not Qt-specific concepts. You can search for "process vs. thread" anywhere to find that distinction explained. For instance: What resources are shared between threads?
Though related concepts, spawning a new process is a more "heavyweight" form of parallelism than spawning a new thread within your existing process. Processes are protected from each other by default, while threads of execution within a process can read and write each other's memory directly. The protection you get from spawning processes comes at a greater run-time cost...and since independent processes can't read each other's memory, you have to share data between them using methods of inter-process communication.
Odds are that you want threads, because they're simpler to use in a case where one is writing all the code in a program. Given all the complexities in multithreaded programming, I'd suggest looking at a good book or reading some websites to start with. See: What are some good resources for learning threaded programming?
But if you want to dive in and just get a feel for how threading in Qt looks, you can spend time looking at the examples:
http://qt-project.org/doc/qt-4.8/examples-threadandconcurrent.html
QtConcurrent is an abstraction library that makes it easier to implement some kinds of parallel programming patterns. It's built on top of the QThread abstractions, and there's nothing it can do that you couldn't code yourself by writing to QThread directly. But it might make your code easier to write and less prone to errors.
As for an event loop...that is merely a generic term for how any given thread of execution in your program waits for work items to process, processes them, and can decide when it is no longer needed. If a thread's job were merely to start up, do some math, and exit...then it wouldn't need an event loop. But starting and stopping a thread takes time and churns resources. So typically threads live for longer periods of time, and have an event loop that knows how to wait for events it needs to respond to.
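The shape of an event loop is the same everywhere; here is a minimal, framework-free illustration (in Java rather than Qt, with made-up names) of a worker thread that waits for work, processes it, and decides when to stop:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A worker's event loop: sleep until work arrives, run it, quit on request.
class WorkerLoop implements Runnable {
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();
    private static final Runnable QUIT = () -> {};

    public void post(Runnable event) { events.add(event); }
    public void quit() { events.add(QUIT); }

    @Override
    public void run() {
        try {
            while (true) {
                Runnable event = events.take(); // block until an event arrives
                if (event == QUIT) return;      // the loop decides when to stop
                event.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A QThread with exec() running plays exactly this role, except Qt's loop also dispatches signals, timers, and queued slot calls rather than bare Runnables.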
If you build on top of QtConcurrent, you won't have to worry about an event loop in your worker threads because they are managed automatically in a thread pool. The word count example is pretty simple to see:
http://qt-project.org/doc/qt-4.8/qtconcurrent-wordcount-main-cpp.html

TPL Tasks, Threads, etc

Could someone clear up for me how these things relate:
Task
Thread
ThreadPool's thread
Parallel.For/ForEach/Invoke
I.e. when I create a Task and run it, where does it get a thread to execute on? And when I call Parallel.* what is really going on under the covers?
Any links to articles, blogposts, etc are also very welcomed!
The ideal state of a system is to have 1 actively running thread per CPU core. By defining work in more general terms of "tasks", the TPL can dynamically decide how many threads to use and which tasks to do on each one in order to come closest to achieving that ideal state. These are decisions that are almost always best made dynamically at runtime because when writing the code you can't know for sure how many CPU cores will be available to your application, how busy they are with other work, etc.
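The question is .NET-specific, but the task-over-threads idea is the same across runtimes. As a rough cross-language illustration (Java here, with made-up names), you describe the work as tasks and let a pool decide how many threads to use:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Submit tasks; the runtime maps them onto a pool sized for the machine.
public class TaskDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newWorkStealingPool(); // ~1 thread per core

        List<Future<Integer>> results = List.of(
                pool.submit(() -> expensiveComputation(1)),
                pool.submit(() -> expensiveComputation(2)),
                pool.submit(() -> expensiveComputation(3)));

        for (Future<Integer> f : results) {
            System.out.println(f.get()); // block until each task finishes
        }
        pool.shutdown();
    }

    private static int expensiveComputation(int n) {
        return n * n; // stand-in for real work
    }
}
```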
Thread: a real OS thread; it has a handle and an ID.
ThreadPool: a collection of already-created OS threads. These threads are owned/maintained by the runtime; your code is only allowed to "borrow" them for a while, you should only do short-lived work on them, and you can't modify any thread state or delete these threads.
Best guesses on these two:
Task: might get run on a pre-created thread in the thread pool, or might get run as part of user-mode scheduling, all depending on what the runtime thinks is best. Another guess: with the TPL, the user-mode scheduling is NOT based on OS fibers but is its own complete (and working) implementation.
Parallel.For: actually, no clue how this is implemented. The runtime might create new threads to do the parallel bits, or much more likely use the thread pool's threads for the parallelism.

How to use V to wake up a designated P?

Suppose we have a semaphore s and there are multiple threads waiting for it by calling P(s). Then V(s) would wake up exactly one thread among them. Is there a way to wake up a designated thread instead of having the system make the decision? For instance, in the barbershop problem, after each haircut the barber wants to serve the longest-waiting customer, not a random one.
You could just use a queue to store the waiting P's; that'll let you do it based on longest wait. Otherwise you could store them in a sorted tree keyed on whatever parameter you want, and remove them when needed.
I think the crux of it is some sort of ordering mechanism for the P's, which shouldn't be too complicated.
It depends on the implementation of the semaphore. You would have to use a smart semaphore that keeps a queue of waiting threads and signals them in the right order. I don't think the regular semaphore implementation on Windows works that way; it just sends a signal to the OS, which in turn wakes any one of the waiting threads. It might even use a LIFO stack, because that is more easily implemented.
But it wouldn't be hard to build this yourself by implementing a queue, which could be a linked list, or a cyclic array.
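A sketch of that build-it-yourself queue (Java used for concreteness; the names are illustrative): each P() waits on its own private binary semaphore, and V() releases the one belonging to the longest-waiting thread:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

// FIFO semaphore: V() always wakes the thread that has waited the longest.
class FifoSemaphore {
    private final Queue<Semaphore> waiters = new ArrayDeque<>();
    private int permits;

    FifoSemaphore(int permits) { this.permits = permits; }

    public void P() throws InterruptedException {
        Semaphore mine;
        synchronized (this) {
            if (permits > 0) {
                permits--;
                return;
            }
            mine = new Semaphore(0);
            waiters.add(mine);  // FIFO: head = longest waiter
        }
        mine.acquire();         // block outside the monitor
    }

    public synchronized void V() {
        Semaphore next = waiters.poll();
        if (next != null) {
            next.release();     // hand the permit straight to the head
        } else {
            permits++;
        }
    }
}
```

If you happen to be on the JVM, note that java.util.concurrent.Semaphore already supports this directly: new Semaphore(permits, true) creates a fair semaphore that grants permits in FIFO order.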
No, not with classical semaphores by themselves. If you want queue-like behavior, you create a queue (with a semaphore, or maybe a couple of them) to protect the queue's shared data structure(s).
The reality is that while semaphores are theoretically all you need to do synchronization, you'd rarely (never?) write a significant body of real code that just used bare semaphores directly. Most of the time, you build higher-level constructs with (for example) a semaphore protecting the critical data inside that construct.

Thread Pool vs Thread Spawning

Can someone list some comparison points between Thread Spawning vs Thread Pooling, which one is better? Please consider the .NET framework as a reference implementation that supports both.
Thread pool threads are much cheaper than a regular Thread; they pool the system resources required for threads. But they have a number of limitations that may make them unfit:
You cannot abort a threadpool thread
There is no easy way to detect that a threadpool thread has completed; no Thread.Join()
There is no easy way to marshal exceptions from a threadpool thread
You cannot display any kind of UI on a threadpool thread beyond a message box
A threadpool thread should not run longer than a few seconds
A threadpool thread should not block for a long time
The latter two constraints are a side effect of the threadpool scheduler: it tries to limit the number of active threads to the number of cores your CPU has available. This can cause long delays if you schedule many long-running threads that block often.
Many other threadpool implementations have similar constraints, give or take.
A "pool" contains a list of available "threads" ready to be used whereas "spawning" refers to actually creating a new thread.
The usefulness of "Thread Pooling" lies in "lower time-to-use": creation time overhead is avoided.
In terms of "which one is better": it depends. If the creation-time overhead is a problem use Thread-pooling. This is a common problem in environments where lots of "short-lived tasks" need to be performed.
As pointed out by other folks, there is a "management overhead" for Thread-Pooling: this is minimal if properly implemented. E.g. limiting the number of threads in the pool is trivial.
For some definition of "better", you generally want to go with a thread pool. Without knowing what your use case is, consider that with a thread pool, you have a fixed number of threads which can all be created at startup or can be created on demand (but the number of threads cannot exceed the size of the pool). If a task is submitted and no thread is available, it is put into a queue until there is a thread free to handle it.
If you are spawning threads in response to requests or some other kind of trigger, you run the risk of depleting all your resources, as there is nothing to cap the number of threads created.
Another benefit to thread pooling is reuse - the same threads are used over and over to handle different tasks, rather than having to create a new thread each time.
As pointed out by others, if you have a small number of tasks that will run for a long time, this would negate the benefits gained by avoiding frequent thread creation (since you would not need to create a ton of threads anyway).
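For a concrete feel for the difference, here is a minimal sketch (Java used for illustration; the trade-off is the same in .NET):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Spawning pays thread-creation cost per task and has no cap; a pool
// caps concurrency and reuses its threads across tasks.
public class PoolVsSpawn {
    public static void main(String[] args) {
        Runnable task = () ->
            System.out.println("running on " + Thread.currentThread().getName());

        // Spawning: one new OS-backed thread per task.
        for (int i = 0; i < 4; i++) {
            new Thread(task).start();
        }

        // Pooling: at most two threads, reused across all eight tasks.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 8; i++) {
            pool.execute(task);
        }
        pool.shutdown(); // drain queued tasks, then stop the workers
    }
}
```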
My feeling is that you should start just by creating a thread as needed... If the performance of this is OK, then you're done. If at some point, you detect that you need lower latency around thread creation you can generally drop in a thread pool without breaking anything...
All depends on your scenario. Creating new threads is resource intensive and an expensive operation. Most very short asynchronous operations (less than a few seconds max) could make use of the thread pool.
For longer running operations that you want to run in the background, you'd typically create (spawn) your own thread. (Ab)using a platform/runtime built-in threadpool for long running operations could lead to nasty forms of deadlocks etc.
Thread pooling is usually considered better, because the threads are created up front, and used as required. Therefore, if you are using a lot of threads for relatively short tasks, it can be a lot faster. This is because they are saved for future use and are not destroyed and later re-created.
In contrast, if you only need 2-3 threads and they will only be created once, spawning them yourself may be better. This is because you do not gain from caching existing threads for future use, and you are not creating extra threads which might never be used.
It depends on what you want to execute on the other thread.
For short tasks it is better to use a thread pool; for long tasks it may be better to spawn a new thread, since a long task could starve the thread pool for other tasks.
The main difference is that a ThreadPool maintains a set of threads that are already spun-up and available for use, because starting a new thread can be expensive processor-wise.
Note however that even a ThreadPool needs to "spawn" threads... it usually depends on workload - if there is a lot of work to be done, a good threadpool will spin up new threads to handle the load based on configuration and system resources.
There is a little extra time required for creating/spawning a thread, whereas a thread pool already contains created threads that are ready to be used.
This answer is a good summary but just in case, here is the link to Wikipedia:
http://en.wikipedia.org/wiki/Thread_pool_pattern
For multithreaded execution combined with getting return values back, or for an easy way to detect that a piece of work has completed, Java Callables can be used.
See https://blogs.oracle.com/CoreJavaTechTips/entry/get_netbeans_6 for more info.
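A minimal sketch of the Callable pattern (names are illustrative): submit() returns a Future, which gives you both the return value and an easy way to detect completion:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Run work on a pool thread and collect its result through a Future.
public class CallableDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        Callable<Long> work = () -> {
            long sum = 0;
            for (int i = 1; i <= 1_000_000; i++) sum += i;
            return sum;
        };

        Future<Long> result = pool.submit(work);
        System.out.println("sum = " + result.get()); // blocks until done
        pool.shutdown();
    }
}
```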
Assuming C# and Windows 7 and up...
When you create a thread using new Thread(), you create a managed thread that becomes backed by a native OS thread when you call Start, a one-to-one relationship. It is important to know that only one thread runs on a CPU core at any given time.
An easier way is to call ThreadPool.QueueUserWorkItem (i.e. a background thread), which in essence does the same thing, except each queued work item isn't forever tied to a single native thread. The thread pool multiplexes many queued work items over a small set of native threads: with, say, 4 cores, you'll have roughly 4 native threads, each running queued work items one after another, as determined by .NET. This offers lighter-weight multitasking, since picking up the next work item happens inside the .NET runtime rather than in the kernel. There is overhead associated with crossing from user mode to kernel mode, and the thread pool minimizes such crossings.
It may be important to note that heavy multitasking might benefit from pure native OS threads in a well-designed multithreading framework. However, the performance difference isn't that large.
When using the ThreadPool, just make sure the minimum worker-thread count is high enough, or ThreadPool.QueueUserWorkItem will be slower than new Thread(). In a benchmark that looped 512 times, calling new Thread() left ThreadPool.QueueUserWorkItem in the dust with the default minimums. However, first setting the minimum worker-thread count to 512 made new Thread() and ThreadPool.QueueUserWorkItem perform similarly in this test.
A side effect of setting a high worker-thread count is that new Task() (or Task.Factory.StartNew) also performed similarly to new Thread() and ThreadPool.QueueUserWorkItem.
