Note that I'm not talking about any specific implementation in any specific language.
Let's say I have a thread pool and a task queue. When a thread runs, it pops a task from the task queue and handles it - that thread might add additional tasks to the task queue as a result.
The time the thread has to handle a certain task is unlimited - meaning the thread works until the task is finished and never terminates before that.
What kinds of problems (e.g. deadlocks) is each of the following thread pool configurations susceptible to?
Possible thread pool configurations I'm concerned with:
1) Unbounded task queue with bounded num. of threads
2) Bounded task queue with unbounded num. of threads
3) Bounded task queue with bounded num. of threads.
4) Unbounded task queue with unbounded num. of threads
Also - say that now the thread has a limited time to handle each task, and is forcibly terminated if it doesn't finish the task in the time frame that was given. How does that change things?
If you have a bounded number of threads then you can experience deadlocks if a task running on a pool thread submits a new task to the queue and then waits for that task --- if there is no free thread then the new task will not be run, and the original task will block, holding up the pool thread until the new task can run. If you end up with enough of these blocked tasks then the whole pool can deadlock.
This isn't really helped by bounding the number of tasks, unless the bound is the same as the number of threads --- once each thread is doing something then you can no longer submit new tasks.
What does help is either (a) adding new threads when a thread becomes blocked like this, or (b) if a pool thread task is waiting for another task from the same pool then that thread switches to running the task being waited for.
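A minimal sketch of option (b), assuming a pool whose task queue is visible to its worker threads (the names run_pending_task and wait_in_pool, and the bare std::deque queue, are illustrative, not any particular library's API):

    #include <chrono>
    #include <deque>
    #include <functional>
    #include <future>
    #include <mutex>
    #include <thread>

    // Hypothetical pool internals: a shared queue of pending tasks.
    std::deque<std::function<void()>> task_queue;
    std::mutex queue_mutex;

    // Pop and run one queued task on the current thread, if any is pending.
    bool run_pending_task() {
        std::function<void()> task;
        {
            std::lock_guard<std::mutex> lock(queue_mutex);
            if (task_queue.empty()) return false;
            task = std::move(task_queue.front());
            task_queue.pop_front();
        }
        task();
        return true;
    }

    // Instead of blocking the pool thread while the subtask is unfinished,
    // keep executing other queued tasks; the waited-for task is eventually
    // run either by this loop or by another pool thread.
    void wait_in_pool(std::future<void>& f) {
        while (f.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
            if (!run_pending_task())
                std::this_thread::yield();  // queue empty: let other threads run
        }
    }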
If you have an unbounded number of threads then you have to watch out for oversubscription --- if I have a quad-core machine, but submit 1000 tasks, and run 1000 threads then these will compete with each other and slow everything down.
In practice, the number of threads is bounded to some large number by the OS either due to a hard-coded number, or due to memory constraints --- each thread needs a new stack, so you can only have as many threads as you've got memory for their stacks.
You can always get a deadlock with 2 tasks if they wait for each other, regardless of any scheme you use, unless you start forcibly terminating tasks after a time limit.
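To make that concrete, here is a deliberately broken sketch: two tasks, each waiting for the other's result, hang forever no matter how they are scheduled (plain std::thread is used just to keep the sketch self-contained):

    #include <future>
    #include <thread>

    int main() {
        std::promise<void> pa, pb;
        std::future<void> fa = pa.get_future();
        std::future<void> fb = pb.get_future();

        // Cyclic wait: t1 finishes only after t2 does, and vice versa.
        std::thread t1([&] { fb.wait(); pa.set_value(); });
        std::thread t2([&] { fa.wait(); pb.set_value(); });

        t1.join();  // never returns: this program deadlocks by design
        t2.join();
    }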
The problem with forcibly terminating tasks is twofold. Firstly, you need to communicate to any code that was waiting for that task that the task was terminated forcibly rather than finished normally. Secondly (and this is the bigger issue) you don't know what state the task was in. It might have owned a lock, or any other resources, and forcibly terminating the task will leak those resources, and potentially leave the application in a bad state.
I am trying to understand the concept behind the thread pool. Based on my understanding, a thread cannot be restarted once completed. One will have to create a new thread in order to execute a new task. If that is the right understanding, does the ThreadPool executor create a new thread for every task that is added?
One will have to create a new thread in order to execute a new task
No. Tasks are an abstraction of logical work to perform. A task is typically a function reference/pointer with an ordered list of well-defined parameters (to pass to the function). Multiple tasks can be assigned to a given thread. A thread pool is usually a set of threads waiting for new incoming tasks to be executed.
As a result, threads of a given thread-pool are created once.
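A minimal sketch of that structure (not any particular library's thread pool; the class name and members are illustrative): the worker threads are created once in the constructor, and each one then loops, pulling task after task from a shared queue.

    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    class ThreadPool {  // hypothetical name, for illustration only
    public:
        explicit ThreadPool(std::size_t n) {
            for (std::size_t i = 0; i < n; ++i)
                workers_.emplace_back([this] { worker_loop(); });
        }
        ~ThreadPool() {
            {
                std::lock_guard<std::mutex> lock(m_);
                done_ = true;
            }
            cv_.notify_all();
            for (auto& t : workers_) t.join();
        }
        void submit(std::function<void()> task) {
            {
                std::lock_guard<std::mutex> lock(m_);
                tasks_.push(std::move(task));
            }
            cv_.notify_one();
        }
    private:
        void worker_loop() {
            for (;;) {  // one thread executes many tasks in turn
                std::function<void()> task;
                {
                    std::unique_lock<std::mutex> lock(m_);
                    cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                    if (done_ && tasks_.empty()) return;
                    task = std::move(tasks_.front());
                    tasks_.pop();
                }
                task();
            }
        }
        std::vector<std::thread> workers_;
        std::queue<std::function<void()>> tasks_;
        std::mutex m_;
        std::condition_variable cv_;
        bool done_ = false;
    };

Submitting work is then just pushing a function onto the queue; no thread is created or destroyed per task.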
I need help in enhancing a thread scheduling strategy I am working on.
Background
To set the context, I have a large number (20-30 thousand) of "tasks" that need to be executed. Each task can execute independently. In reality, the range of execution time varies between 40ms and 5mins across tasks. Also, each individual task takes the same amount of time when re-run.
I need options to control the way these tasks are executed, so I have come up with a scheduling engine that schedules these tasks based on various strategies. The most basic strategy is FCFS, i.e. my tasks get executed sequentially, one by one. The second is a batch strategy: the scheduler has a bucket size "b" which controls how many threads can run in parallel. The scheduler kicks off non-blocking threads for the first "b" tasks it gets, waits for those started tasks to complete, then proceeds with the next "b" tasks, starting them in parallel and then waiting for completion. Each set of "b" tasks processed at a time is termed a batch, hence batch scheduling.
Now, with batch scheduling, activity begins to increase at the beginning of the batch, as threads get created, peaks in the middle, when most of the threads are running, and then wanes as we block and wait for the threads to join. Batch scheduling becomes FCFS scheduling when the batch size "b" = 1.
One way to improve on batch scheduling is what I will term as parallel scheduling - the scheduler will ensure, if sufficient number of tasks are present, that "b" number of threads will keep running at any point in time. The number of threads initially will ramp up to "b", then maintain the count at "b" running threads, until the last set of tasks finish execution. To maintain execution of "b" threads at any time, we need to start a new thread the moment an old thread finishes execution. This approach can reduce the amount of time taken to finish processing all the tasks compared to batch scheduling (average case scenario).
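For what it's worth, one way to sketch this "keep b in flight" behaviour is a counting semaphore with b permits (C++20 std::counting_semaphore here; run_all and the task vector are assumptions for illustration). Draining every permit at the end also gives a Commit()-style wait without tracking started tasks:

    #include <cstddef>
    #include <functional>
    #include <memory>
    #include <semaphore>
    #include <thread>
    #include <vector>

    // Keep at most b tasks running at any moment: each task takes a permit
    // to start and returns it when done, so the next task starts the instant
    // an earlier one finishes, rather than at batch boundaries.
    void run_all(const std::vector<std::function<void()>>& tasks, std::ptrdiff_t b) {
        auto slots = std::make_shared<std::counting_semaphore<>>(b);
        for (const auto& task : tasks) {
            slots->acquire();                  // blocks while b tasks are in flight
            std::thread([slots, task] {
                task();
                slots->release();              // frees a slot for the next task
            }).detach();
        }
        // The Commit() step: re-acquire every permit. This only succeeds once
        // all in-flight tasks have released theirs, i.e. all work is done.
        for (std::ptrdiff_t i = 0; i < b; ++i)
            slots->acquire();
    }

This still starts one short-lived thread per task, as the pseudocode below does; a fixed pool of "b" threads feeding off a queue, as other answers here describe, avoids even that cost.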
Part where I need help
The logic I have for implementing parallel scheduling follows. I would be obliged if anyone could help me with:
Can we avoid the use of the startedTasks list? I am using it because I need to be sure that when Commit() exits, all tasks have completed execution, so I just loop through all startedTasks and block until they are complete. One current problem is that the list will be long.
--OR--
Is there a better way to do parallel scheduling?
(Any other suggestions/strategies are also welcome - main goal here is to shorten overall execution duration within the constraints of the batch size "b")
ParallelScheduler pseudocode
// assume all variable access/updates are thread safe
Semaphore S: with an initial capacity of "b"
Queue<Task> tasks
List<Task> startedTasks
bool allTasksCompleted = false;
bool stopPolling = false;
// The following method is called by a caller
// that wishes to start tasks; it can be called any number of times
// passing various task items
METHOD void ScheduleTask( Task t )
if the PollerThread has not been started yet then start it
// starting PollerThread will call PollerThread_Action
// set up the task so that when it is completed, it releases 1
// on semaphore S
// assume OnCompleted is executed when the task t completes
// execution after a call to t.Start()
t.OnCompleted() ==> S.Release(1)
tasks.Enqueue ( t )
// This method is called when the caller
// wishes to notify that there are no more tasks
// that need a ScheduleTask call.
METHOD void Commit()
// assume that the following assignment is thread safe
stopPolling = true;
// assume that the following check is done efficiently
wait until allTasksCompleted is set to true
// this is the method the poller thread once started will execute
METHOD void PollerThread_Action
while ( !stopPolling )
if ( tasks.Count > 0 )
Task nextTask = tasks.Dequeue()
// wait for the semaphore to release one unit
if ( S.WaitOne() )
// start the task in a new thread
nextTask.Start()
startedTasks.Add( nextTask )
// we have been asked to stop polling
// this means no more tasks are going to be added
// to the queue
// finish off the remaining tasks
while ( tasks.Count > 0 )
Task nextTask = tasks.Dequeue()
if ( S.WaitOne() )
nextTask.Start()
startedTasks.Add ( nextTask )
// at this point, there are no more tasks in the queue
// each task would have already been started at some
// point
for every Task t in startedTasks
t.WaitUntilComplete() // this will block if a task is running, else exit immediately
// now all tasks are complete
allTasksCompleted = true
Search for 'work stealing scheduler' - it is one of the most efficient generic schedulers. There are also several open source and commercial implementations around.
The idea is to have a fixed number of worker threads that take tasks from a queue. But to avoid contention on a single queue shared by all the threads (a serious performance problem on multi-CPU systems), each thread has its own queue. When a thread creates new tasks, it places them in its own queue. After finishing a task, a thread takes the next task from its own queue; but if its queue is empty, it "steals" work from some other thread's queue.
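A rough sketch of that structure, with a mutex per queue for simplicity (real implementations use lock-free deques such as Chase-Lev and smarter victim selection; WorkQueue and worker_loop are illustrative names):

    #include <cstddef>
    #include <deque>
    #include <functional>
    #include <mutex>
    #include <optional>
    #include <vector>

    using Task = std::function<void()>;

    // One queue per worker: the owner pushes and pops at the front of its
    // own queue, while thieves steal from the back of a victim's queue.
    struct WorkQueue {
        std::deque<Task> tasks;
        std::mutex m;

        void push(Task t) {
            std::lock_guard<std::mutex> lock(m);
            tasks.push_front(std::move(t));
        }
        std::optional<Task> pop() {       // owner takes the newest task (LIFO)
            std::lock_guard<std::mutex> lock(m);
            if (tasks.empty()) return std::nullopt;
            Task t = std::move(tasks.front());
            tasks.pop_front();
            return t;
        }
        std::optional<Task> steal() {     // thieves take the oldest task (FIFO)
            std::lock_guard<std::mutex> lock(m);
            if (tasks.empty()) return std::nullopt;
            Task t = std::move(tasks.back());
            tasks.pop_back();
            return t;
        }
    };

    // Worker i: run from its own queue; when empty, try to steal from others.
    void worker_loop(std::size_t i, std::vector<WorkQueue>& queues) {
        for (;;) {
            std::optional<Task> t = queues[i].pop();
            for (std::size_t j = 0; !t && j < queues.size(); ++j)
                if (j != i) t = queues[j].steal();
            if (t) (*t)();
            else break;  // nothing anywhere; a real pool would block or yield here
        }
    }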
When your program knows a task needs to be run, place it in a queue data structure.
When your program starts up, also start up as many worker threads as you will need. Arrange for each thread to do a blocking read from the queue when it needs something to do. So, when the queue is empty or nearly so, most of your threads will be blocked waiting for something to go into the queue.
When the queue has plenty of tasks in it, each thread will pull one task from the queue and carry it out. When it is done, it will pull another task and do that one. Of course this means that tasks will be completed in a different order than they were started. Presumably that is acceptable.
This is far superior to a strategy where you have to wait for all threads to finish their tasks before any one can get another task. If long-running tasks are relatively rare in your system, you may find that you don't have to do much more optimization. If long-running tasks are common, you may want to have separate queues and separate threads for short- and long- running tasks, so the short-running tasks don't get starved out by the long-running ones.
There is a hazard here: if some of your tasks are VERY long-running (that is, they never finish due to bugs) you'll eventually poison all your threads and your system will stop working.
You want to use a space-filling curve to subdivide the tasks. An SFC reduces a 2D complexity to a 1D complexity.
If I had threads as below
#include <condition_variable>
#include <mutex>

std::mutex m;                  // "lock" in the original snippet
std::condition_variable cond;  // "Cond" in the original snippet

void thread_func() {
    while (true) {
        std::unique_lock<std::mutex> guard(m);
        // wait() atomically releases the mutex while sleeping and
        // reacquires it on wakeup; the loop re-checks the condition
        // to guard against spurious wakeups
        while (!condition_is_true())  // placeholder predicate
            cond.wait(guard);
        // blah blah
        cond.notify_one();  // i.e. Cond.signal()
    }  // guard releases the mutex at the end of each iteration
}
Well I guess my main question is whether the signalling thread continues running for a while after cond.signal() or immediately gives up the CPU. In some cases I would like it not to release the lock before the woken-up thread finishes execution, and in other cases it may be beneficial to release the lock immediately after signalling, without waiting for the woken thread to finish.
I understand that if there are any threads waiting on the condition then they get woken up on Cond.signal(). But what does "woken up" mean: is the thread put on the ready queue, or does the scheduler make sure that it runs immediately?
And what about the signalling thread: does it go to sleep on the same condition upon signalling, so that some other thread has to wake it up to make it release the lock?
This is in large part dependent on your environment (OS, library, language...) and how the synchronisation primitives are implemented. Since you haven't specified any I'll just give a general answer.
When putting a thread to sleep, most environments will remove it from the scheduler's ready queue and the thread will give up its remaining CPU time. When woken up, the thread is simply placed back into the ready queue and will resume execution the next time the scheduler selects it from the queue.
It's also possible that the thread will do some active waiting (spinning) instead of being removed from the scheduler's ready queue. In this case, the thread will resume execution right away. Note that since a thread can still run out of CPU time while spinning, it might have to wait to be rescheduled before waking up. This is a useful strategy if your critical sections are very small and you don't want to pay the scheduling overheads.
A hybrid approach would be to do a small amount of active waiting before removing the thread from the scheduler's ready queue.
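A sketch of that hybrid, assuming a simple ready flag (the spin count of 1000 is an arbitrary tuning constant, not a recommendation):

    #include <atomic>
    #include <condition_variable>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> ready{false};

    // Hybrid wait: spin briefly in the hope the flag flips soon, then fall
    // back to a blocking wait so we don't burn a whole scheduling quantum.
    void hybrid_wait() {
        for (int i = 0; i < 1000; ++i) {
            if (ready.load(std::memory_order_acquire)) return;
        }
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [] { return ready.load(); });
    }

    void signal_ready() {
        {
            std::lock_guard<std::mutex> lock(m);
            ready.store(true, std::memory_order_release);
        }
        cv.notify_all();
    }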
As for the signaling thread, unless specified explicitly by your environment (I can't think of any reason for it, but you never know), I wouldn't expect a call to signal() to block in a way that requires another thread to wake it up. signal() might have to synchronize itself with other threads calling signal(), but those are implementation details and you shouldn't have to do anything about it.
I see this in the book "CLR via C#" and I don't get it. If there are still threads available in the thread pool, why does it create additional threads?
It might just be poor wording.
On a given machine the threadpool has a good guess of the optimum number of threads the machine can run without overextending resources. If, for some reason, a thread becomes IO blocked (for instance it is waiting for a long time to save or retrieve data from disk or for a response from a network device) the threadpool can start up another thread to take advantage of unused CPU time. When the other thread is no longer blocking, the threadpool will take the next freed thread out of the pool to reduce the size back to "optimum" levels.
This is part of the threadpool's management, which keeps the system from being over-tasked (which would reduce efficiency through all the context switches between too many threads), while also reducing wasted cycles (while a thread is blocked there might not be enough other work to keep the processor(s) fully busy, even though there are tasks waiting to be run) and wasted memory (threads spun up and ready but never used, because they would overtask the CPU).
More info on the Managed Thread Pool from MSDN.
The book lied.
The threadpool only creates additional threads when all available threads have been blocked for more than 1 second. If there are free threads, it will use them to process your additional tasks. Note that after a thread has been idle for 30 seconds, the CLR retires it (terminates it, gracefully of course).
How do I control the number of threads that my program is working on?
I have a program that is now ready for multithreading, but one problem is that the program is extremely memory intensive, so I have to limit the number of threads running so that I don't run out of RAM. The main program goes through and creates a whole bunch of handles and associated threads in a suspended state.
I want the program to activate a set number of threads, and when one thread finishes, automatically unsuspend the next thread in line until all the work has been completed. How do I do this?
Someone once mentioned something about using a thread handler, but I can't seem to find any information about how to write one or exactly how it would work.
If anyone can help, it would be greatly appreciated.
Using Windows and Visual C++.
Note: I don't need to worry about the traditional problems of shared access between the threads; each one is completely independent of the others. It's more like batch processing than true multithreading of a program.
Thanks,
-Faken
Don't create threads explicitly. Create a thread pool, see Thread Pools and queue up your work using QueueUserWorkItem. The thread pool size should be determined by the number of hardware threads available (number of cores and ratio of hyperthreading) and the ratio of CPU vs. IO your work items do. By controlling the size of the thread pool you control the number of maximum concurrent threads.
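A small sketch of that approach (these are real Win32 calls; the task count of 8 and the DoWork body are arbitrary placeholders):

    #include <windows.h>
    #include <cstdio>

    static volatile LONG remaining = 8;
    static HANDLE allDone;  // manual-reset event, signalled by the last item

    DWORD WINAPI DoWork(LPVOID context) {
        printf("task %d on pool thread %lu\n", (int)(INT_PTR)context,
               GetCurrentThreadId());
        if (InterlockedDecrement(&remaining) == 0)
            SetEvent(allDone);  // last work item: release the waiter
        return 0;
    }

    int main() {
        allDone = CreateEvent(NULL, TRUE, FALSE, NULL);
        for (int i = 0; i < 8; ++i)
            QueueUserWorkItem(DoWork, (PVOID)(INT_PTR)i, WT_EXECUTEDEFAULT);
        WaitForSingleObject(allDone, INFINITE);  // block until all 8 have run
        CloseHandle(allDone);
        return 0;
    }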
A suspended thread doesn't use CPU resources, but it still consumes memory, so you really shouldn't be creating more threads than you want to run simultaneously.
It is better to have only as many threads as your maximum number of simultaneous tasks, and to use a queue to pass units of work to the pool of worker threads.
You can give work to the standard pool of threads created by Windows using the Windows Thread Pool API.
Be aware that you will share these threads and the queue used to submit work to them with all of the code in your process. If, for some reason, you don't want to share your worker threads with other code in your process, then you can create a FIFO queue, create as many threads as you want to run simultaneously and have each of them pull work items out of the queue. If the queue is empty they will block until work items are added to the queue.
There is so much to say here.
There are a few ways to do this.
You should only create as many thread handles as you plan on running at the same time, then reuse them when they complete. (Look up thread pool).
This guarantees that you can never have too many running at the same time. This raises the question of finding out when a thread completes. You can have a callback called just before a thread terminates, where a parameter of that callback is the thread handle that just finished; use Boost.Bind and Boost.Signals for that. When the callback is called, look for another task for that thread handle and restart the thread. That way all you have to do is add to the "tasks to do" list and the callback will remove the tasks for you. No polling needed, and no worries about too many threads.
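The same callback idea without Boost, as a sketch (the names are illustrative): the worker invokes a completion callback with its own id just before returning, and the scheduler's callback hands the freed slot the next task.

    #include <functional>

    // A worker body: run the task, then report this worker as free so the
    // scheduler can assign it the next task from the "tasks to do" list.
    void run_task(int worker_id,
                  const std::function<void()>& task,
                  const std::function<void(int)>& on_finished) {
        task();
        on_finished(worker_id);  // e.g. the scheduler re-queues this worker
    }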