ThreadPool, QueueUserWorkItem and Deadlock on Shutdown - multithreading

I just implemented a thread pool as described here
Allen Bauer on thread pools
It is a very simple implementation and works fine, but my application no longer shuts down. It seems that two worker threads (and one other thread, I guess the queuing thread) are stuck in the function
ntdll.ZwRemoveIoCompletion
I remember reading something about I/O completions in the help entry for QueueUserWorkItem (the WinAPI function used in the thread pool implementation), but I couldn't understand it properly. I used WT_EXECUTELONGFUNCTION for my worker threads since execution can take a while and I want a new worker thread created instead of waiting for the existing ones to finish. Some of the tasks assigned to the worker threads perform I/O. I tried WT_EXECUTEINIOTHREAD, but it does not seem to help.
I should mention that the main thread is waiting to enter a critical section, with the call stack being
System.Halt0, System.FinalizeUnits, Classes.Finalization, TThread.Destroy,
RtlEnterCriticalSection, RtlpWaitForCriticalSection
Any ideas what I'm doing wrong here? Thanks for your help in advance.

To make sure the worker threads shut down, you need to have some way of waking them up if they are waiting on the empty IO completion port. The simplest way would seem to be to post a NULL message of some kind to the port - they should then treat this as a signal to halt in an orderly fashion.
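The wake-up idea can be sketched without any Windows API. The following is a minimal illustration in Python, where a plain queue stands in for the completion port and a hypothetical STOP sentinel plays the role of the NULL message; none of these names come from the original implementation. For a completion port you own, the analogous call on Windows would be PostQueuedCompletionStatus.

```python
import queue
import threading

STOP = None  # hypothetical sentinel meaning "no more work, exit cleanly"

def worker(tasks, results):
    while True:
        item = tasks.get()        # blocks while the queue is empty
        if item is STOP:
            break                 # orderly exit instead of waiting forever
        results.append(item * 2)  # stand-in for real work

tasks = queue.Queue()
results = []
threads = [threading.Thread(target=worker, args=(tasks, results)) for _ in range(2)]
for t in threads:
    t.start()

for n in (1, 2, 3):
    tasks.put(n)

# Shutdown: post one sentinel per worker, then wait for each to finish.
for _ in threads:
    tasks.put(STOP)
for t in threads:
    t.join(timeout=5)
```

Because the queue is FIFO, the sentinels are only seen after all earlier work items have been taken, and each worker consumes exactly one sentinel.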

A critical section must be left before another thread can enter it, so the problem lies in code that is holding a lock.
In some thread:
EnterCriticalSection(SomeCriticalSection);
sort code...
LeaveCriticalSection(SomeCriticalSection);
In some other thread:
EnterCriticalSection(SomeCriticalSection);
clean up code...
LeaveCriticalSection(SomeCriticalSection);
If the sort code is running in the first thread and the second thread tries to run the clean-up code, the second thread will wait until the sort code finishes and the first thread leaves the critical section. Only then can another thread enter the same critical section. I hope this helps you narrow down the deadlocking code, because it is inside a critical section.
To get the completion port handle, you can save it when you create the completion port:
FIoCPHandle := CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0 , 0, FNumberOfConcurrentThreads);

When using QueueUserWorkItem, as long as the worker threads have been returned to the thread pool, you should not have to do anything to shut them down. The WT_EXECUTEDEFAULT component of the thread pool queues the work items up onto an I/O completion port. This port is part of the thread pool's internal implementation and is not accessible to you.
Could you provide some more detailed call stacks for the threads that appear to be stuck? It would make this problem much easier to diagnose.

Related

How to know if a thread is alive and then kill it?

I've been searching and reading about killing threads (C POSIX threads), and everybody says that it is not a good idea because a thread should do its work and then return... but my problem is the following:
I'm receiving messages on my local network (using the recvfrom function), but this function "blocks" my program. I mean, if I don't receive any message, the function stays blocked (forever) until it receives something.
To avoid this, I thought of using threads, so that while my main thread is "counting", my second thread tries to receive messages. If after a given time (e.g. 1 second) my second thread is still waiting for a message (blocked in the recvfrom function), I need to "kill it" and then create another thread to start again (and try to receive messages from another IP). This means that my thread is not always going to finish its work, and I can't wait forever...
So far I can do that (create a lot of threads and receive the messages from the IP I'm interested in), but I don't know how to kill the threads that never received anything...
Does anyone know how to kill the threads? Or are they killed automatically when my main program returns?
Thank you and really sorry for my poor english...
This looks related to one of my questions: How to avoid thread waiting in the following or similar scenarios (want to make a thread wait iff its really really necessary)?
That one is .NET, though (the code sample is in C#).
Essentially, I spawned a new thread that performs some I/O operations through a blocking call.
For some reason it just waits forever, so I have a timeout after which I can abort the thread with the 'abort' method.
Rearchitect so the thread can receive messages from any IP. That way, you can try to receive messages from another IP without having to disturb the thread.
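One way to picture that rearchitecture is a single socket that accepts datagrams from any sender and sorts them by source address. The sketch below is in Python and uses loopback sockets purely for demonstration; recv_or_none is a hypothetical helper, and the receive timeout is an addition that the answer does not mention (it simply lets the receiving thread give up on a quiet socket instead of being killed).

```python
import socket

def recv_or_none(sock, timeout):
    """Return (payload, sender) or None if nothing arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(4096)
    except socket.timeout:
        return None

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # loopback only, for the demo
target = receiver.getsockname()

sender_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender_a.bind(("127.0.0.1", 0))
sender_b.bind(("127.0.0.1", 0))
addr_a = sender_a.getsockname()
addr_b = sender_b.getsockname()
sender_a.sendto(b"from a", target)
sender_b.sendto(b"from b", target)

# One receiver handles both senders; messages are keyed by source address.
seen = {}
for _ in range(2):
    packet = recv_or_none(receiver, 1.0)
    if packet is not None:
        payload, sender = packet
        seen[sender] = payload

# Nothing else was sent, so this call times out instead of blocking forever.
idle = recv_or_none(receiver, 0.1)

for s in (receiver, sender_a, sender_b):
    s.close()
```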

In what condition would a thread exit or stop running

I am writing a server application in which there is a thread deployed to read/write many sockets connecting to clients. My manager tells me that it is not a good design, because if the thread aborts due to unknown reason then all the read/write work will stop forever.
So I wonder under what conditions a thread will abort, other than when we return from the thread's Run() function. Do we need to consider the case where the thread stops running abnormally?
It depends. One thread per client can be a bad thing scalability wise, especially if the thread doesn't do that much work per client. In that circumstance it can be better to have a thread that handles a number of clients, the idea to achieve a good balance between the number of threads and having them do a decent amount of work.
If on the other hand each thread is doing a lot of work per client then one thread isn't such a bad idea, the overhead of the thread not being significant in comparison to the work load.
So setting that aside, a thread will abort if your code is written so that the thread returns or self-terminates. If another thread in your program knows the thread's handle/id then the library you're using may have a function with a name like thread_kill(). That would allow that other thread to kill this thread, though that's almost always a bad idea.
So as far as I'm concerned your thread will only abort and disappear if you've written your code to make that happen deliberately.
Handling exceptions is probably best done entirely within the thread where the exception arose. I've never tried to do otherwise (I'm still writing in pure C), but the word is that it's difficult to handle them outside the thread. Irrespective of whether each thread handles one or many clients, you still have to handle all errors and events within the thread.
It may be simpler to get that right if you write it so that a thread handles a single client. Getting it wrong could lead to a thread ending up in a stalled state (e.g. waiting for a client that is itself listening), and as such threads accumulate over time they will eventually kill your whole system.
I am writing a server application in which there is a thread deployed to read/write many sockets connecting to clients.
Not a good design. There should be at least one thread per client, in some circumstances two: one to read and one to write. If you're dealing in blocking I/O, servicing one client could block out all the others. (If you're dealing in non-blocking I/O you don't need threads at all.)
My manager tells me that it is not a good design, because if the thread aborts due to unknown reason then all the read/write work will stop forever.
He's right, for more reasons than he is advancing.
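To illustrate the remark that non-blocking I/O does not need a thread per client, here is a minimal Python sketch using the standard selectors module. The socket pairs stand in for real client connections, and serve_ready is a hypothetical helper name; a real server would also handle disconnects and partial writes.

```python
import selectors
import socket

def serve_ready(sel, timeout=1.0):
    """Serve every socket that is readable right now; return how many were served."""
    served = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())   # stand-in for real request handling
            served += 1
    return served

sel = selectors.DefaultSelector()
clients = []
for _ in range(2):
    server_end, client_end = socket.socketpair()
    server_end.setblocking(False)        # the serving side never blocks on one client
    client_end.settimeout(2)
    sel.register(server_end, selectors.EVENT_READ)
    clients.append(client_end)

clients[0].sendall(b"hello")
clients[1].sendall(b"world")
served = serve_ready(sel)                # one thread, two clients
replies = [c.recv(1024) for c in clients]
```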

Thread Pool Execution Order and Passing Future to Another Thread

I would like to create a thread pool with two threads. I would like to ensure that the first thread executes first and that the second thread starts only after the first has completed. Besides this, I need to pass the Future result from the first thread to the second thread.
Any idea how to do this?
Please help.
Thanks.
This situation is not suited to threads: the second task cannot start until the first has finished, so nothing actually runs concurrently. Avoid using threads here.
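If a pool is wanted anyway, the ordering and the hand-off of the Future can be sketched as follows. This is a Python illustration using concurrent.futures, not the asker's environment, and first_task and second_task are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def first_task():
    return 21

def second_task(first_future):
    # result() blocks until the first task has completed, which enforces the order.
    return first_future.result() * 2

with ThreadPoolExecutor(max_workers=2) as pool:
    first = pool.submit(first_task)
    second = pool.submit(second_task, first)   # the Future itself is the argument
    answer = second.result()
```

The second worker spends its time waiting for the first, which is exactly why running both steps sequentially in one thread gives the same result with less machinery.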

Deadlock Delphi explanation/solution

On a server application we have the following:
A class called a JobManager that is a singleton.
Another class, the Scheduler, that keeps checking if it is time to add any sort of job to the JobManager.
When it is time to do so, the Scheduler does something like:
TJobManager.Singleton.NewJobItem(parameterlist goes here...);
At the same time, on the client application, the user does something that generates a call to the server. Internally, the server sends a message to itself, and one of the classes listening for that message is the JobManager.
The JobManager handles the message, and knows that it is time to add a new job to the list, calling its own method:
NewJobItem(parameter list...);
On the NewJobItem method, I have something like this:
CS.Acquire;
try
  DoSomething;
  CallAMethodWithAnotherCriticalSessionInternally;
finally
  CS.Release;
end;
It happens that the system reaches a deadlock at this point (CS.Acquire).
The communication between the client and server applications is done via Indy 10.
I think the RPC call that fires the server application method that sends a message to the JobManager is running in the context of the Indy thread.
The Scheduler has its own thread running, and it makes a direct call to the JobManager method. Is this situation prone to deadlocks?
Can someone help me understand why a deadlock is happening here?
We knew that the system sometimes locked up when the client performed a specific action. I finally tracked it down to this point, where the critical section of the same class is reached twice from different places (the Scheduler and the message handler method of the JobManager).
Some more info
I want to add that (this may be silly, but anyway...) inside the DoSomething there is another
CS.Acquire;
try
  Do other stuff...
finally
  CS.Release;
end;
Is this internal CS.Release doing anything to the external CS.Acquire? If so, this could be the point where the Scheduler enters the critical section, and all the locking and unlocking becomes a mess.
There isn't enough information about your system to be able to tell you definitively if your JobManager and Scheduler are causing a deadlock, but if they are both calling the same NewJobItem method, then this should not be the problem since they will both acquire the locks in the same order.
As for whether your NewJobItem CS.Acquire and DoSomething CS.Acquire interact with each other: it depends. If the lock objects used in the two methods are different, then no, the two calls should be independent. If it is the same object, then it depends on the type of lock. If your locks are re-entrant (i.e. they allow acquire to be called multiple times from the same thread and count how many times they have been acquired and released), then this should not be a problem. On the other hand, if you have simple lock objects that don't support re-entry, then the DoSomething CS.Release could release your lock for that thread, and CallAMethodWithAnotherCriticalSessionInternally would then run without the protection of the CS lock that was acquired in NewJobItem.
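The re-entrant case can be checked with a small experiment. The sketch below uses Python's threading.RLock as a stand-in for a re-entrant critical section (it is not Delphi's TCriticalSection), and other_thread_can_acquire is a hypothetical probe: it asks a second thread whether the lock is free at that moment.

```python
import threading

lock = threading.RLock()   # re-entrant: counts acquires per owning thread

def other_thread_can_acquire():
    """Try a non-blocking acquire from a different thread and report the outcome."""
    outcome = []
    def probe():
        ok = lock.acquire(blocking=False)
        outcome.append(ok)
        if ok:
            lock.release()
    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return outcome[0]

lock.acquire()                                  # outer acquire (as in NewJobItem)
lock.acquire()                                  # inner acquire (as in DoSomething)
lock.release()                                  # inner release
still_held = not other_thread_can_acquire()     # outer acquire still protects us
lock.release()                                  # outer release
free_after = other_thread_can_acquire()
```

With a re-entrant lock the inner release only decrements the count, so the outer section stays protected until its own release.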
Deadlocks occur when there are two or more threads running and each thread is waiting for another thread to finish its current job before it can continue itself.
For Example:
Thread 1 executes:
lock_a.acquire()
lock_b.acquire()
lock_b.release()
lock_a.release()
Thread 2 executes:
lock_b.acquire()
lock_a.acquire()
lock_a.release()
lock_b.release()
Notice that the locks in thread 2 are acquired in the opposite order from thread 1. Suppose thread 1 acquires lock_a and is then interrupted. Thread 2 now runs, acquires lock_b, and starts waiting for lock_a to become available. Then thread 1 resumes, and the next thing it does is try to acquire lock_b, which is already held by thread 2, so it waits. We are now in a situation in which thread 1 is waiting for thread 2 to release lock_b and thread 2 is waiting for thread 1 to release lock_a.
This is a deadlock.
There are several common solutions:
1. Only use one shared global lock in all your code. This way it is impossible to have two threads waiting for two locks. This makes your code wait a lot for the lock to be available.
2. Only ever allow your code to hold one lock at a time. This is usually too hard to control since you might not know or control the behavior of method calls.
3. Only allow your code to acquire multiple locks all at the same time, and release them all at the same time, and disallow acquiring new locks while you already have locks acquired.
4. Make sure that all locks are acquired in the same global order. This is a more common technique.
With solution 4 you need to program carefully and always make sure that you acquire the locks/critical sections in the same order. To help with debugging, you can impose a global order on all the locks in your system (e.g. a unique integer for each lock) and throw an error if you try to acquire a lock that has a lower rank than a lock that the current thread has already acquired (e.g. if new_lock.id < lock_already_acquired.id then throw exception).
If you can't put in a global debugging aid to help find which locks have been acquired out of order, then I'd suggest that you find all the places in your code where you acquire any lock and print a debugging message with the current time, the method calling acquire, the thread id, and the id of the lock being acquired. Do the same thing for all the release calls. Then run your system until you get the deadlock, and find in your log file which locks have been acquired by which threads and in which order. Then decide which thread is acquiring its locks in the wrong order and change it.

Multithreading thread control

How do I control the number of threads that my program is working on?
I have a program that is now ready for multithreading, but one problem is that the program is extremely memory intensive and I have to limit the number of threads running so that I don't run out of RAM. The main program goes through and creates a whole bunch of handles and associated threads in a suspended state.
I want the program to activate a set number of threads and, when one thread finishes, automatically resume the next thread in line until all the work has been completed. How do I do this?
Someone has once mentioned something about using a thread handler, but I can't seem to find any information about how to write one or exactly how it would work.
If anyone can help, it would be greatly appreciated.
Using windows and visual c++.
Note: I don't need to worry about the traditional problems of shared access between threads; each one is completely independent of the others. It's more like batch processing than true multithreading of a program.
Thanks,
-Faken
Don't create threads explicitly. Create a thread pool (see Thread Pools) and queue up your work using QueueUserWorkItem. The thread pool size should be determined by the number of hardware threads available (number of cores and ratio of hyperthreading) and the ratio of CPU to I/O work your work items do. By controlling the size of the thread pool, you control the maximum number of concurrent threads.
A suspended thread doesn't use CPU resources, but it still consumes memory, so you really shouldn't be creating more threads than you want to run simultaneously.
It is better to have only as many threads as your maximum number of simultaneous tasks, and to use a queue to pass units of work to the pool of worker threads.
You can give work to the standard pool of threads created by Windows using the Windows Thread Pool API.
Be aware that you will share these threads and the queue used to submit work to them with all of the code in your process. If, for some reason, you don't want to share your worker threads with other code in your process, then you can create a FIFO queue, create as many threads as you want to run simultaneously and have each of them pull work items out of the queue. If the queue is empty they will block until work items are added to the queue.
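The queue-plus-fixed-workers arrangement can be sketched in a few lines. This Python illustration uses ThreadPoolExecutor, which keeps an internal work queue and a capped set of worker threads; it is not the Windows Thread Pool API, and MAX_WORKERS, job, and the stats dictionary are assumptions made for the demo.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3                       # hypothetical limit chosen to fit in memory
stats = {"running": 0, "peak": 0}     # tracks how many jobs overlap
stats_lock = threading.Lock()

def job(n):
    with stats_lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.01)                  # stand-in for the memory-hungry work
    with stats_lock:
        stats["running"] -= 1
    return n * n

# Ten work items are queued, but at most MAX_WORKERS run at any moment.
with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    squares = list(pool.map(job, range(10)))
```

Work items that have not started yet cost only a queue entry, whereas a suspended thread would already have its stack allocated.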
There is so much to say here.
There are a few ways
You should only create as many thread handles as you plan on running at the same time, then reuse them when they complete. (Look up thread pool).
This guarantees that you can never have too many running at the same time. It does raise the question of finding out when a thread completes. You can have a callback that is called just before a thread terminates, where a parameter of that callback is the handle of the thread that just finished. Use Boost bind and Boost signals for that. When the callback is called, look for another task for that thread handle and restart the thread. That way all you have to do is add to the "tasks to do" list, and the callback will remove the tasks for you. No polling needed, and no worries about too many threads.
