I was researching the answer to this question and ran across this post. Is ThreadPool safe? How does ThreadPool compare with the OmniThreadLibrary? What are the pluses and minuses of using each?
Here is an example of what I am doing:
procedure DoWork(nameList: TList<Integer>)
var
i: Integer;
oneThread: PerNameThread;
begin
for (i := 0; to nameList.Count-1) do
begin
oneThread := PerNameThread.Create(Self);
oneThread.nameID = nameList[i];
oneThread.Start();
end
end;
I am creating a thread for each nameList item, and this could be up to 500 names. All these threads are too much, and slowing down the process, to the point where this process would be faster with just one thread.
First, you need to understand what a thread pool is.
A thread pool is a concept where you have a list of multiple threads that are suspended when they are not performing any tasks.
These threads are defined a bit differently than you are probably used to. Instead of them having all the necessary code inside their Execute() method, their Execute() method only contains a few lines of code to execute external code (giving the threads the ability to perform practically any processing that you require), take care of synchronizing the result back to the caller/UI, and returning the thread to the pool and putting it into a suspended state. When a new task is needed at a later time, a thread is given the task and resumed.
So by providing each thread with a method pointer for a task, you actually define what kind of job each thread will be processing each time it is run.
The main advantage of using a thread pool is that doing so avoids the overhead of creating and destroying a thread for each specific task.
As for OmniThreadLibrary, it is a full blown task management library. It uses its own thread pool and a pretty advanced task managing system that allows you to easily define which tasks can be executed in parallel, which tasks need to be executed in sequence, and which tasks have higher priority than others.
The only drawback of OmniThreadLibrary is that it is still limited to Windows only, so if you are thinking of providing multiplatform support for your application then you will have to find another solution.
Related
I know Synchronize must be used in the Execute procedure, but should it be used in Create and Destroy methods too, or is it safe to do whatever I want?
I know Synchronize must be used in the Execute procedure.
That is somewhat vague. You need to use Synchronize when you have code that must execute on the main thread. So the answer to whether or not you will need to use Synchronize depends crucially on what the code under consideration actually does. The question that you must ask yourself, and which is one that only you can answer, is do you have code that must run on the main thread?
As a general rule it would be considered prudent for you not to need to call Synchronize outside the Execute method. If you can find a way to avoid doing so then that would be wise. Remember that the ideal scenario with threads is that they never need to block with Synchronize if at all possible.
You might also wish to consider which thread executes the constructor and destructor.
The constructor Create runs in the thread that calls it. It does not run in the newly created thread. Therefore it is unlikely that you would need to use Synchronize there.
The destructor Destroy runs in the thread that calls it. Typically this is the thread that calls Free on the thread object. And usually that would be called from the same thread that originally created the thread. The common exception to that is a FreeOnTerminate thread which calls Free from the thread.
There is a need to use Synchronize() when the code is executing outside of the context of the main (GUI) thread of the application. Therefore the answer to your question depends on whether the constructor and destructor are called from that thread or not.
If you are unsure you can check that by comparing the result of the Windows API function GetCurrentThreadId() with the variable MainThreadID - if they equal the code executes in the context of the main thread.
Threads that have FreeOnTerminate set will have their destructor called from another thread context, so you would need to use Synchronize() or Queue(). Or you use the termination event the VCL already provides, I believe it is executed in the main thread, but check the documentation for details.
First of all, you don't want to call Synchronize() unnecessarily, because that simply defeats the purpose of using a thread. So the decision should be based on whether: (a) it's possible to encounter race conditions with shared data. (b) you'll be using VCL code which usually has to run on the main thread.
It's unlikely you would need to synchronise in the constructor because TThread instances are usually created from the main thread already. (The exception being if you're creating some TThread's from another child thread.)
NOTE: It won't cause any harm though because Synchronize() already checks if you're on the main thread and will call the synchronised method immediately if you are.
class procedure TThread.Synchronize(ASyncRec: PSynchronizeRecord; QueueEvent: Boolean = False);
var
SyncProc: TSyncProc;
SyncProcPtr: PSyncProc;
begin
if GetCurrentThreadID = MainThreadID then
ASyncRec.FMethod
As for the destructor there are 3 usage patterns:
The TThread instances destroys itself.
Another thread (possibly the main thread) can WaitFor the instance to finish, then destroy it.
You can intercept the OnTerminate event. This is fired when the instance is finished, and you could then destroy it.
NOTE: The OnTerminate event will already be synchronised.
procedure TThread.DoTerminate;
begin
if Assigned(FOnTerminate) then Synchronize(CallOnTerminate);
end;
Given the above, the only time you might need to synchronise is if the thread self-destructs.
However, I'd advise that you rather avoid putting code into your destructor that might need to be synchronised. If you need some results of a calculation from your thread instance, OnTerminate is the more appropriate place to get this.
To add to what has been said in other answers...
You never need to use Synchronize at all. Synchronize may be useful, however, in the following circumstance:
In the context of your thread you need to execute code that touches objects that have affinity to the main thread.
You require your thread to block until that code has been executed.
Even in that case, there are other ways to achive the same goal, but Synchronize provides a convenient way to satisfy those two needs. If you need only one of those two items, there are better strategies available.
On topic #1, the obvious objects are user interface objects. These are objects that have thread affinity to the main thread simply by virtue of the fact that the main thread is continually reading and writing the properties of those objects (not the least because it needs to paint them to the screen, etc) and it does so at its own convenience. This means that your thread cannot safely access those components with a guarantee that the main thread will not also be accessing or modifying them at the same time. In order to prevent corruption, the thread has to pass the work to the main thread (since the main thread can only do one thing at a time and can't, obviously, interfere with itself). Synchronize simply places the work onto the main thread's queue and waits until the main thread gets around to completing it before returning.
This gets to point #2. Do you need to (or, equally, can you afford to) wait around until the main thread finishes the work? There are three cases and two options.
Yes, you can or must wait. (Synchronize is a good fit)
No, you cannot wait. (Synchronize is not a good fit)
Don't care. (Synchronize is easy, so it's a sensible option)
If you are simply updating a status display that will soon be overwritten anyway and your thread has more pressing issues, then it's probably sensible to just post a message to the main thread and carry on doing things, for example. If your thread is just waiting around doing nothing, mostly, and it's not worth the time to code anything more sophisticated, then Synchronize is just fine, and it can be replaced with something better if needs dictate so in the future.
As others have said, it really depends on what you are doing. The more important question, I think, at least conceptually, is to sort out when you need to worry about concurrency and when you don't. Any time you have more than one thread that requires access to a single resource you need to use some sort of mechanism to coordinate that access to avoid the threads crashing into each other. Synchronize is one of those methods, but it not the least nor the last of them.
I'm new to Multithread in Win32. And I have an assignment with Semaphore. But I cannot understand this.
Assume that we have 20 tasks (each task is the same with other tasks). We use semaphore then there's 2 circumstances:
First, there should be have 20 childthreads in order that each thread will handle 1 task.
Or:
Second, there would be have n childthreads. When a thread finishs a task, it will handle another task?
The second problem I counter that I cannot find any samples for Semaphore in Win32(API) but Consonle that I found in MSDN.
Can you help me with the "20 task" and tell me the instruction of writing a Semaphore in WinAPI application (Where should I place CreateSemaphore() function ...)?
Your suggestion will be appreciated.
You can start a thread for every task, which is a common approach, or you can use a "threadpool" where threads are reused. This is up to you. In both scenarios, you may or may not use a semaphore, the difference is only how you start the multiple threads.
Now, concerning your question where to place the CreateSemaphore() function, you should call that before starting any further threads. The reason is that these threads need to access the semaphore, but they can't do that if it doesn't exist yet. You could of course pass it to the other threads, but that again would give you the problem how to pass it safely without any race conditions, which is something that semaphores and other synchronization primitives are there to avoid. In other words, you would only complicate things by creating a chicken-and-egg problem.
Note that if this doesn't help you any further, you should perhaps provide more info. What are the goals? What have you done yourself so far? Any related questions here that you read but that didn't fully present answers to your problem?
Well, if you are contrained to using semaphores only, you could use two semaphores to create an unbounded producer-consumer queue class that you could use to implement a thread pool.
You need a 'SimpleQueue' class for task objects. I assume you either have one already, can easily build one or whatever.
In the ctor of your 'ProducerConsumerQueue' class, (or in main(), or in some factory function that returns a *ProducerConsumerQueue struct, whatever your language has), create a SimpleClass and two semaphores. A 'QueueCount' semaphore, initialized with a count of 0, and a 'QueueAccess' semaphore, initialized with a count of 1.
Add 'push(*task)' and ' *task pop()' methods/memberFunctions/methods to the ProducerConsumerQueue:
In 'push', first call 'WaitForSingleObject()' API on QueueAccess, then push the *task onto the SimpleQueue, then ReleaseSemaphore() API on QueueAccess. This pushes the *task in a thread-safe manner. Then ReleaseSemaphore() on QueueCount - this will signal any waiting threads.
In pop(), first call 'WaitForSingleObject()' API on QueueCount - this ensures that any calling consumer thread has to wait until there is a *task in the queue. Then call 'WaitForSingleObject()' API on QueueAccess, then pop task from the SimpleQueue, then ReleaseSemaphore() API on QueueAccess and return the task - this this thread-safely dequeues the *task.
Once you have created your ProducerConsumerQueue, create some threads to run the tasks. In CreateThread(), pass the same *ProducerConsumerQueue as the 'auxiliary' *void parameter.
In the thread function, cast the *void back to *ProducerConsumerQueue and then just loop around for ever, calling pop() and then running the returned task.
OK, your pool of threads is now ready to do stuff. If you want to run 20 tasks, create them in a loop and push them onto the ProducerConsumerQueue. The threads will then run them all.
You can create as many threads as you want to in the pool, (within reason). As many threads as cores is reasonable for tasks that are CPU-intensive. If the tasks make blocking calls, you may want to create many more threads for quickest overall throughput.
A useful enhancement is to check for 'null' in the thread function loop after each task is received and, if it is null, clean up an exit the thread, so terminating it. This allows the threads to be easily terminated by queueing up nulls, making it easier to shutdown your thread pool, (should you need to), and also to control the number of threads in the pool at runtime.
How do I control the number of threads that my program is working on?
I have a program that is now ready for mutithreading but one problem is that the program is extremely memory intensive and i have to limit the number of threads running so that i don't run out of ram. The main program goes through and creates a whole bunch of handles and associated threads in suspended state.
I want the program to activate a set number of threads and when one thread finishes, it will automatically unsuspended the next thread in line until all the work has been completed. How do i do this?
Someone has once mentioned something about using a thread handler, but I can't seem to find any information about how to write one or exactly how it would work.
If anyone can help, it would be greatly appreciated.
Using windows and visual c++.
Note: i don't need to worry about the traditional problems of access with the threads, each one is completely independent of each other, its more of like batch processing rather than true mutithreading of a program.
Thanks,
-Faken
Don't create threads explicitly. Create a thread pool, see Thread Pools and queue up your work using QueueUserWorkItem. The thread pool size should be determined by the number of hardware threads available (number of cores and ratio of hyperthreading) and the ratio of CPU vs. IO your work items do. By controlling the size of the thread pool you control the number of maximum concurrent threads.
A Suspended thread doesn't use CPU resources, but it still consumes memory, so you really shouldn't be creating more threads than you want to run simultaneously.
It is better to have only as many threads as your maximum number of simultaneous tasks, and to use a queue to pass units of work to the pool of worker threads.
You can give work to the standard pool of threads created by Windows using the Windows Thread Pool API.
Be aware that you will share these threads and the queue used to submit work to them with all of the code in your process. If, for some reason, you don't want to share your worker threads with other code in your process, then you can create a FIFO queue, create as many threads as you want to run simultaneously and have each of them pull work items out of the queue. If the queue is empty they will block until work items are added to the queue.
There is so much to say here.
There are a few ways
You should only create as many thread handles as you plan on running at the same time, then reuse them when they complete. (Look up thread pool).
This guarantees that you can never have too many running at the same time. This raises the question of funding out when a thread completes. You can have a callback be called just before a thread terminates where a parameter in that callback is the thread handle that just finished. Use Boost bind and boost signals for that. When the callback is called, look for another task for that thread handle and restart the thread. That way all you have to do is add to the "tasks to do" list and the callback will remove the tasks for you. No polling needed, and no worries about too many threads.
Can someone list some comparison points between Thread Spawning vs Thread Pooling, which one is better? Please consider the .NET framework as a reference implementation that supports both.
Thread pool threads are much cheaper than a regular Thread, they pool the system resources required for threads. But they have a number of limitations that may make them unfit:
You cannot abort a threadpool thread
There is no easy way to detect that a threadpool completed, no Thread.Join()
There is no easy way to marshal exceptions from a threadpool thread
You cannot display any kind of UI on a threadpool thread beyond a message box
A threadpool thread should not run longer than a few seconds
A threadpool thread should not block for a long time
The latter two constraints are a side-effect of the threadpool scheduler, it tries to limit the number of active threads to the number of cores your CPU has available. This can cause long delays if you schedule many long running threads that block often.
Many other threadpool implementations have similar constraints, give or take.
A "pool" contains a list of available "threads" ready to be used whereas "spawning" refers to actually creating a new thread.
The usefulness of "Thread Pooling" lies in "lower time-to-use": creation time overhead is avoided.
In terms of "which one is better": it depends. If the creation-time overhead is a problem use Thread-pooling. This is a common problem in environments where lots of "short-lived tasks" need to be performed.
As pointed out by other folks, there is a "management overhead" for Thread-Pooling: this is minimal if properly implemented. E.g. limiting the number of threads in the pool is trivial.
For some definition of "better", you generally want to go with a thread pool. Without knowing what your use case is, consider that with a thread pool, you have a fixed number of threads which can all be created at startup or can be created on demand (but the number of threads cannot exceed the size of the pool). If a task is submitted and no thread is available, it is put into a queue until there is a thread free to handle it.
If you are spawning threads in response to requests or some other kind of trigger, you run the risk of depleting all your resources as there is nothing to cap the amount of threads created.
Another benefit to thread pooling is reuse - the same threads are used over and over to handle different tasks, rather than having to create a new thread each time.
As pointed out by others, if you have a small number of tasks that will run for a long time, this would negate the benefits gained by avoiding frequent thread creation (since you would not need to create a ton of threads anyway).
My feeling is that you should start just by creating a thread as needed... If the performance of this is OK, then you're done. If at some point, you detect that you need lower latency around thread creation you can generally drop in a thread pool without breaking anything...
All depends on your scenario. Creating new threads is resource intensive and an expensive operation. Most very short asynchronous operations (less than a few seconds max) could make use of the thread pool.
For longer running operations that you want to run in the background, you'd typically create (spawn) your own thread. (Ab)using a platform/runtime built-in threadpool for long running operations could lead to nasty forms of deadlocks etc.
Thread pooling is usually considered better, because the threads are created up front, and used as required. Therefore, if you are using a lot of threads for relatively short tasks, it can be a lot faster. This is because they are saved for future use and are not destroyed and later re-created.
In contrast, if you only need 2-3 threads and they will only be created once, then this will be better. This is because you do not gain from caching existing threads for future use, and you are not creating extra threads which might not be used.
It depends on what you want to execute on the other thread.
For short task it is better to use a thread pool, for long task it may be better to spawn a new thread as it could starve the thread pool for other tasks.
The main difference is that a ThreadPool maintains a set of threads that are already spun-up and available for use, because starting a new thread can be expensive processor-wise.
Note however that even a ThreadPool needs to "spawn" threads... it usually depends on workload - if there is a lot of work to be done, a good threadpool will spin up new threads to handle the load based on configuration and system resources.
There is little extra time required for creating/spawning thread, where as thread poll already contains created threads which are ready to be used.
This answer is a good summary but just in case, here is the link to Wikipedia:
http://en.wikipedia.org/wiki/Thread_pool_pattern
For Multi threaded execution combined with getting return values from the execution, or an easy way to detect that a threadpool has completed, java Callables could be used.
See https://blogs.oracle.com/CoreJavaTechTips/entry/get_netbeans_6 for more info.
Assuming C# and Windows 7 and up...
When you create a thread using new Thread(), you create a managed thread that becomes backed by a native OS thread when you call Start – a one to one relationship. It is important to know only one thread runs on a CPU core at any given time.
An easier way is to call ThreadPool.QueueUserWorkItem (i.e. background thread), which in essence does the same thing, except those background threads aren’t forever tied to a single native thread. The .NET scheduler will simulate multitasking between managed threads on a single native thread. With say 4 cores, you’ll have 4 native threads each running multiple managed threads, determined by .NET. This offers lighter-weight multitasking since switching between managed threads happens within the .NET VM not in the kernel. There is some overhead associated with crossing from user mode to kernel mode, and the .NET scheduler minimizes such crossing.
It may be important to note that heavy multitasking might benefit from pure native OS threads in a well-designed multithreading framework. However, the performance benefits aren’t that much.
With using the ThreadPool, just make sure the minimum worker thread count is high enough or ThreadPool.QueueUserWorkItem will be slower than new Thread(). In a benchmark test looping 512 times calling new Thread() left ThreadPool.QueueUserWorkItem in the dust with default minimums. However, first setting the minimum worker thread count to 512, in this test, made new Thread() and ThreadPool.QueueUserWorkItem perform similarly.
A side effective of setting a high worker thread count is that new Task() (or Task.Factory.StartNew) also performed similarly as new Thread() and ThreadPool.QueueUserWorkItem.
I found this on the Dr Dobbs site today at
http://www.ddj.com/hpc-high-performance-computing/220300055?pgno=3
It's a nice suggestion regarding thread implmentation.
What is best way of achieving this with TThread in Delphi I wonder?
Thanks
Brian
=== From Dr Dobbs ==============
Make multithreading configurable! The number of threads used in a program should always be configurable from 0 (no additional threads at all) to an arbitrary number. This not only allows a customization for optimal performance, but it also proves to be a good debugging tool and sometimes a lifesaver when unknown race conditions occur on client systems. I remember more than one situation where customers were able to overcome fatal bugs by switching off multithreading. This of course does not only apply to multithreaded file I/O.
Consider the following pseudocode:
int CMyThreadManger::AddThread(CThreadObj theTask)
{
if(mUsedThreadCount >= gConfiguration.MaxThreadCount())
return theTask.Execute(); // execute task in main thread
// add task to thread pool and start the thread
...
}
Such a mechanism is not very complicated (though a little bit more work will probably be needed than shown here), but it sometimes is very effective. It also may be used with prebuilt threading libraries such as OpenMP or Intel's Threaded Building Blocks. Considering the measurements shown here, its a good idea to include more than one configurable thread count (for example, one for file I/O and one for core CPU tasks). The default might probably be 0 for file I/O and <number of cores found> for CPU tasks. But all multithreading should be detachable. A more sophisticated approach might even include some code to test multithreaded performance and set the number of threads used automatically, may be even individually for different tasks.
===================
I would create an abstract class TTask. This class is meant to executes the task. With the method Execute:
type
TTask = abstract class
protected
procedure DoExecute; virtual; abstract;
public
procedure Execute;
end;
TTaskThread = class (TThread)
private
FTask : TTask;
public
constructor Create(const ATask: TTask);
// Assigns FTask and enables thread, free on terminate.
procedure Execute; override; // Calls FTask.Execute.
end;
The method Execute checks the number of threads. If the max is not reached, it starts a thread using TTaskThread that calls DoExecute and as such execute the task in a thread. If the max is reached, DoExecute is called directly.
The answer by Gamecat is good as far as the abstract task class is concerned, but I think calling DoExecute() for a task in the calling thread (as the article itself does too) is a bad idea. I would always queue the tasks to be executed by background threads, unless threading was disabled completely, and here's why.
Consider the following (contrived) case, where you need to execute three independent CPU-bound procedures:
Procedure1_WhichTakes200ms;
Procedure2_WhichTakes400ms;
Procedure3_WhichTakes200ms;
For better utilisation of your dual core system you want to execute them in two threads. You would limit the number of background threads to one, so with the main thread you have as many threads as cores.
Now the first procedure will be executed in a worker thread, and it will finish after 200 milliseconds. The second procedure will start immediately and be executed in the main thread, as the single configured worker thread is already occupied, and it will finish after 400 milliseconds. Then the last procedure will be executed in the worker thread, which has already been sleeping for 200 milliseconds now, and will finish after 200 milliseconds. Total execution time 600 milliseconds, and for 2/3 of that time only one of both threads was actually doing meaningful work.
You could reorder the procedures (tasks), but in real life it's probably impossible to know in advance how long each task will take.
Now consider the common way of employing a thread pool. As per configuration you would limit the number of threads in the pool to 2 (number of cores), use the main thread only to schedule the threads into the pool, and then wait for all tasks to complete. With above sequence of queued tasks thread 1 would take the first task, thread two would take the second task. After 200 milliseconds the first task would complete, and the first worker thread would take the third task from the pool, which is empty afterwards. After 400 milliseconds both the second and the third task would complete, and the main thread would be unblocked. Total time for execution 400 milliseconds, with 100% load on both cores in that time.
At least for CPU-bound threads it's of vital importance to always have work queued for the OS scheduler. Calling DoExecute() in the main thread interferes with that, and shouldn't be done.
I generally have only one class inheriting from TThread, one that takes 'worker items' from a queue or stack, and have them suspend when no more items are available. The main program can then decide how many instances of this thread to instantiate and start. (using this config value).
This 'worker items queue' should also be smart enough to resume suspended threads or create a new thread when required (and when the limit permits it), when a worker item is queued or a thread has finished processing a worker item.
My framework allows for a thread pool count for any of the threads in a configuration file, if you wish to have a look (http://www.csinnovations.com/framework_overview.htm).
From a certain version (think is was one of the XE versions) Delphi has as Parallel Programming Library included:
https://docwiki.embarcadero.com/RADStudio/Sydney/en/Using_the_Parallel_Programming_Library
It has theTTask to scedule work, and also several configuration options and the possibility to create your own thread pool(s).