Terminated threads with delayed deletion - multithreading

I have a general question. We are postponing the deletion of terminated threads by, say, 10 minutes. Those threads are not running; they are handed to something like a garbage collector which takes care of deleting them after the specified time elapses and joining them to the main thread. My question is: can those threads still contend for resources, i.e., can we have context switching caused by them?

Since each Thread is terminated (i.e. Thread.IsAlive evaluates to false), the scheduler will not care about them anymore. I wonder, however, what the reason for your approach is. Why wouldn't using the ThreadPool work for you, instead of housekeeping Threads yourself?
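For illustration, here is a minimal C# sketch of why a terminated thread cannot cause context switches; the deferred-deletion list itself is omitted and the class name is made up:

    using System;
    using System.Threading;

    class TerminatedThreadSketch
    {
        static void Main()
        {
            var worker = new Thread(() => Console.WriteLine("work done"));
            worker.Start();
            worker.Join();                      // blocks until the thread procedure returns

            // Once the procedure has returned, the OS thread is gone; only the managed
            // Thread object remains, and a plain object cannot be context-switched.
            Console.WriteLine(worker.IsAlive);  // False

            // Keeping the object in a "delete later" list for 10 minutes costs a little
            // memory, nothing more; a second Join() on a terminated thread returns at once.
            worker.Join();
        }
    }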

Related

Thread.yield and sleep

I'm new to multithreading and I ran into two questions about thread scheduling with Thread.yield and sleep to which I couldn't find a clear answer in my book or by googling. I'm going to leave out all pseudo code and real code because I think I already understand the possible starvation problem if my assumptions aren't right.
I'm going to refer to three threads in my questions:
My first question is this: if I call Thread.yield or sleep in one of my three threads, is it guaranteed that the CPU tries to schedule and run the other two threads before it comes back to the thread that called yield? In other words, are threads held in a clear queue, so that the yielding thread goes to the back of the queue?
I know that yield should give other threads a chance to run, but is it possible, for example, that after the yielding thread only one of the two other threads runs, and control then goes back to the original thread that called yield, skipping the last thread and not giving it a chance to run at all?
My second question is related to the first. Do yield and sleep both have the same property of going to the back of the queue when called, as I assumed in my first question, or are there other differences between them besides the sleep duration?
If these questions don't make sense, the possible problem in my code is this: before going to sleep, a thread unlocks a mutex that one of the other threads tried to lock earlier, failed, and has been waiting on ever since. After the first thread has gone to sleep, is it guaranteed that the thread that tried to lock the mutex will acquire it before the sleeping thread does?
Thread.yield() is a hint to the thread scheduler which means "hey, right now I'm fine with you putting me to sleep and letting another thread run". There are no guarantees; it is only a hint. The assumption about threads being ordered in a "queue" is also incorrect, because thread scheduling is done by the OS and it is very hard to predict a particular execution order without additional thread-interaction mechanisms.
Thread.sleep() puts the current thread to sleep for a specified amount of time, so the answer to your second question is no: they do different things.
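The question is phrased in Java terms; to keep the examples on this page in one language, here is a rough C# sketch of the same point using Thread.Yield() and Thread.Sleep(). The thread and iteration counts are arbitrary; the point is that the interleaving varies from run to run, because neither call imposes an ordering:

    using System;
    using System.Threading;

    class YieldVsSleep
    {
        static void Main()
        {
            for (int i = 0; i < 3; i++)
            {
                int id = i;   // capture a stable copy for the closure
                new Thread(() =>
                {
                    for (int n = 0; n < 5; n++)
                    {
                        Console.WriteLine($"thread {id}, iteration {n}");

                        // A hint: "another ready thread may run now". The scheduler is
                        // free to ignore it, and no queue ordering can be relied upon.
                        Thread.Yield();
                    }

                    // Sleep really blocks this thread for at least 10 ms; which of the
                    // other threads runs in the meantime is still up to the scheduler.
                    Thread.Sleep(10);
                }).Start();
            }
        }
    }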

Intervening when threads have been waiting too long

Is there any way in F# to detect whether a currently waiting thread has been waiting too long without being contacted?
I have a case where threads must actively contact other waiting threads to pass their work on once they're finished. My solution has a bug somewhere: sometimes one or more threads just wait for too long, and eventually the program gets deadlocked because the other threads never contact them.
I think that by detecting that a waiting thread has simply been waiting too long, it could actively go looking for available work rather than keep waiting for other threads to pass work to it.
It's probably better to try and understand why your threads are getting stuck than just terminating them. If you can reproduce this with the Visual Studio debugger attached, you can click the Pause button and use the Threads window to see what code all threads are in.
That said, if you still need to do this, the solution will depend on how you're managing your threads. To monitor them from the outside, you'll need some process that has a list of the threads and a way to tell whether they're dead.
The Thread class doesn't appear to have any built-in mechanism for sharing state between the thread and its controller except for Name. You could possibly abuse Name, but I would probably use a thread-safe collection (e.g. a ConcurrentDictionary<Thread, DateTime>) to store all of the threads and the timestamp of their last communication, and pass an Action into each thread when it's started that allows it to "ping" by calling the action periodically. The action would simply update the DateTime stored against that thread.
The controlling process then simply scans through the dictionary periodically for anything with a timestamp that is too old, declares that thread dead and calls Abort() on it.
It's hard to give a code sample without knowing exactly how you're spawning your threads, and it would help to describe in more detail what a thread "being contacted" means.
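In the absence of those details, here is a rough sketch of the ping/watchdog idea described above, assuming plain Thread workers; the 500 ms work interval and the 5 second staleness threshold are arbitrary, and this version only reports a hung thread instead of aborting it:

    using System;
    using System.Collections.Concurrent;
    using System.Linq;
    using System.Threading;

    class HeartbeatWatchdog
    {
        // Timestamp of each worker's last "ping", keyed by the Thread object as suggested above.
        static readonly ConcurrentDictionary<Thread, DateTime> lastPing =
            new ConcurrentDictionary<Thread, DateTime>();

        static void Main()
        {
            for (int i = 0; i < 3; i++)
            {
                Thread worker = null;
                worker = new Thread(() =>
                {
                    // The Action handed to the worker; calling it records "I'm still alive".
                    Action ping = () => lastPing[worker] = DateTime.UtcNow;
                    while (true)
                    {
                        ping();            // call this periodically from the real work
                        Thread.Sleep(500); // stand-in for doing actual work
                    }
                });
                worker.IsBackground = true;
                worker.Start();
            }

            // Watchdog: scan periodically and report anything silent for more than 5 seconds.
            // (Aborting a thread, as noted above, is a last resort; this sketch only reports.)
            for (int scan = 0; scan < 10; scan++)
            {
                Thread.Sleep(1000);
                foreach (var stale in lastPing.Where(kv => DateTime.UtcNow - kv.Value > TimeSpan.FromSeconds(5)))
                    Console.WriteLine($"Thread {stale.Key.ManagedThreadId} looks hung");
            }
        }
    }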

How is ThreadPool implemented in .NET 4.0?

I recently tried to work out how the ThreadPool class is implemented in .NET 4.0. I tried to read through the reflected code, but it is a bit too extensive for me.
Could someone explain in simple terms how this class works, i.e.:
How does it store each method that comes in?
Is it thread-safe, supposing multiple threads try to enqueue their methods in the thread pool at the same time?
When it reaches the limit of available threads, how does it get around to executing the remaining work waiting in the queue once one of the threads becomes free? Is there some callback mechanism for it?
Of course, in the absence of the actual implementation (or in the absence of Eric Lippert :) ) what I'm saying is only common sense:
The thread pool holds an internal (circular?) queue where the tasks are kept (hence QueueUserWorkItem).
Putting tasks in the queue is thread-safe (this is for sure, as I've used it myself in this scenario several times).
I think that each thread loops indefinitely and keeps taking tasks from the queue (in a thread-safe manner of course) automatically when it's done with the current task. If the queue is empty it will just block.
In a queue of delegates.
TBH, I don't know for sure, but if it's not thread-safe it's dangerous, nearly useless and probably the worst code ever emitted by M$ (even including Windows ME). Just assume it's thread-safe.
The worker threads are while loops: each waits on the work request queue for a delegate, invokes one when it becomes available, then loops back round again when the delegate returns to wait on the queue for another. There is no need for any callback.
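This is not the real .NET 4.0 ThreadPool (which is more elaborate internally, with per-thread work-stealing queues among other things), but a minimal sketch of the "workers looping over a queue of delegates" idea described above, built on BlockingCollection; the MiniPool name and its members are made up for illustration:

    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    class MiniPool
    {
        private readonly BlockingCollection<Action> workItems = new BlockingCollection<Action>();

        public MiniPool(int workerCount)
        {
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Each worker blocks when the queue is empty and loops back for the
                    // next delegate as soon as the current one returns -- no callback needed.
                    foreach (Action work in workItems.GetConsumingEnumerable())
                        work();
                });
                worker.IsBackground = true;
                worker.Start();
            }
        }

        // Thread-safe enqueue: BlockingCollection does the synchronization.
        public void QueueUserWorkItem(Action work)
        {
            workItems.Add(work);
        }

        static void Main()
        {
            var pool = new MiniPool(2);
            for (int i = 0; i < 5; i++)
            {
                int n = i;
                pool.QueueUserWorkItem(() =>
                    Console.WriteLine($"item {n} ran on thread {Thread.CurrentThread.ManagedThreadId}"));
            }
            Thread.Sleep(500); // give the background workers time to drain the queue
        }
    }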
I don't know exactly, but to my mind it stores them in a collection of Task objects.
MSDN says yes
GetMaxThreads() returns the number of threads that can execute at one time; once you reach that limit, all further work items are queued. As I understand it, you need a mechanism for knowing when a work item has finished executing. For that there is
RegisterWaitForSingleObject(WaitHandle, WaitOrTimerCallback, Object, Int32, Boolean)
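For completeness, a small sketch of ThreadPool.RegisterWaitForSingleObject with that overload: the callback runs on a pool thread either when the handle is signalled or when the timeout elapses. The AutoResetEvent and the 5-second timeout are placeholders for illustration:

    using System;
    using System.Threading;

    class RegisterWaitSketch
    {
        static void Main()
        {
            var workDone = new AutoResetEvent(false);

            // Run a callback on a pool thread when the handle is signalled,
            // or when the 5-second timeout elapses first.
            ThreadPool.RegisterWaitForSingleObject(
                workDone,
                (state, timedOut) => Console.WriteLine(
                    timedOut ? "Timed out waiting for the work item" : "Work item signalled"),
                null,   // state object passed to the callback
                5000,   // timeout in milliseconds
                true);  // executeOnlyOnce: unregister after the first callback

            workDone.Set();    // simulate the work item completing
            Thread.Sleep(500); // give the pool callback time to run before exiting
        }
    }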

Which is the better method? Allowing the thread to sleep for a while or deleting it and recreating it later?

We have a process that needs to run every two hours. It needs to run on its own thread so as not to interrupt normal processing.
When it runs, it will download 100k records and verify them against a database. The framework to run this has a lot of objects managing this process. These objects only need to be around when the process is running.
Which is the better practice?
Keep the thread around in a wait state, letting it sleep until I need it again, or
Delete it when it is done and create it again the next time I need it (driven by system timer events)?
There is not that much difference between the two solutions. I tend to prefer the one where the thread is created each time.
Having a thread lying around consumes resources (memory at least). In a garbage-collected language, it is easy for some object to be retained by that thread, using even more memory. If the thread is not lying around, all those resources are freed and available to the main process for the two hours in between.
When you want to stop your whole process, the thread may or may not be executing, and you need to interrupt it cleanly. It is always difficult to interrupt a thread, or to know whether it is sleeping or working; you may have race conditions there. Having the thread started on demand relieves you of those potential problems: you know whether you started the thread, and in that case calling thread_join makes you wait until the thread is done.
For those reasons, I would go for the thread-on-demand solution, even though the other one has no insurmountable problems.
Starting one thread every two hours is very cheap, so I would go with that.
However, if there is a chance that at some time in the future the processing could take more than the run interval, you probably want to keep the thread alive. That way, you won't be creating a second thread that will start processing the records while the first is still running, possibly corrupting data or processing records twice.
Either should be fine but I would lean towards keeping the thread around for cases where the verification takes longer than expected (ex: slow network links or slow database response).
How would you remember to start a new thread when the two hours are up? With a timer? (That's on another thread!) With another thread that sleeps until the specified time? Shutting down the thread and restarting it based on something running somewhere else does you no good if that something else is either on its own separate thread, or blocks the main app while it waits to create the worker thread when the two hours are up, no?
Just let the Thread sleep...
I agree with Vilx that it's mostly a matter of taste. There is processing and memory overhead of both methods, but probably not enough for either to matter.
If you are using Java, you could look at the Timer class. It allows you to schedule tasks to run at a given time.
Also, if you need more control, you can use the Quartz library.
I guess actually putting the thread to sleep is the most efficient; ending it and recreating it would "cost" some resources, while putting it to sleep just takes up a little space in the scheduler, and its data could be paged out by the operating system if needed.
But anyway, it's probably not a very big difference, and the difference would probably depend on how good the OS's scheduler is, etc.
It really depends on one thing as far as I can tell: state.
If the thread creates a lot of state (allocates memory) that is useful to have during the next run, then I would keep it around. That way, your process can potentially optimize its run by only performing certain operations if certain things have changed since the last run.
However, if the state that the process creates is significant compared with the amount of work to be done, and you are short on resources on the machine, then it may not be worth the cost of keeping that state around between executions. If that's the case, then you should recreate the thread from scratch each time.
I think it's just a matter of taste. Both are good. Use the one which you find easier to implement. :)
I would create the thread a single time, and use events/condition variables to let it sleep until signaled to wake up again. That way, if the amount of time needed ever has to change, you only need to change the timing of firing the event, and your code will still be pretty clean.
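A minimal sketch of that event-based approach, assuming a worker that waits up to two hours on an AutoResetEvent; the SignaledWorker and DownloadAndVerify names are made up, and Console.ReadLine stands in for the host application's lifetime:

    using System;
    using System.Threading;

    class SignaledWorker
    {
        // Set this to wake the worker early (e.g. on shutdown or a manual trigger).
        static readonly AutoResetEvent wakeUp = new AutoResetEvent(false);
        static volatile bool shuttingDown;

        static void Main()
        {
            var worker = new Thread(() =>
            {
                while (!shuttingDown)
                {
                    DownloadAndVerify();
                    // Sleep for up to two hours, but wake immediately if signaled.
                    wakeUp.WaitOne(TimeSpan.FromHours(2));
                }
            });
            worker.Start();

            Console.ReadLine();  // placeholder for the host application's lifetime
            shuttingDown = true;
            wakeUp.Set();        // wake the worker so it can exit promptly
            worker.Join();
        }

        static void DownloadAndVerify()
        {
            // Stand-in for the real "download 100k records and verify" step.
            Console.WriteLine($"Run at {DateTime.Now}");
        }
    }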
I wouldn't think it's very important, but the best approach is very platform dependent.
A .NET System.Threading.Timer costs nothing while it's waiting, and will invoke your code on a pool thread. In theory, that would be the best of both your suggestions.
Another important thing to consider if you are on a garbage collected system like Java is that anything strongly referenced by a sleeping thread is not garbage. In that respect, it's better to kill idle threads, and let them, and any objects they reference, get cleaned up.
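As a sketch of the System.Threading.Timer suggestion above, with an Interlocked flag added as a guard so that a run which overruns the two-hour interval is not overlapped by the next tick (that guard is an assumption, addressing the earlier concern about verifications that take longer than expected):

    using System;
    using System.Threading;

    class TimerBasedRunner
    {
        // 0 = idle, 1 = a run is in progress; prevents overlapping runs if one
        // execution ever takes longer than the two-hour interval.
        static int running;

        static void Main()
        {
            // The timer holds no thread while waiting; the callback runs on a pool thread.
            using (var timer = new Timer(_ => RunOnce(), null,
                                         TimeSpan.Zero, TimeSpan.FromHours(2)))
            {
                Console.ReadLine();  // keep the process alive; the timer fires in the background
            }
        }

        static void RunOnce()
        {
            if (Interlocked.Exchange(ref running, 1) == 1)
                return;              // previous run still in progress, skip this tick
            try
            {
                Console.WriteLine($"Verification run at {DateTime.Now}");
            }
            finally
            {
                Interlocked.Exchange(ref running, 0);
            }
        }
    }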
It all depends, of course. But by default I would go with a separate process (not thread) started on demand.

How independent are threads inside the same process?

Now, this might be a very newbie question, but I don't really have experience with multithreaded programming and I haven't fully understood how threads work compared to processes.
When a process on my machine hangs, say it's waiting for some IO that never comes or something similar, I can kill and restart it because other processes aren't affected and can, for example, still operate my terminal. This is very obvious, of course.
I'm not sure whether it is the same with threads inside a process: if one hangs, are the others unaffected? In other words, can I run a "watchdog" thread which supervises the other threads and, for example, kills and recreates hanging threads? Say, if I have a thread pool that I don't want to be drained by occasional hangups.
Threads are independent, but there's a difference between a process and a thread, and that is that in the case of processes, the operating system does more than just "kill" it. It also cleans up after it.
If you start killing threads that seem to be hung, most likely you'll leave resources locked and the like, things that the operating system would close for you if you did the same to a process.
So for instance, if you open a file for writing, and start producing data and write it to the file, and this thread now hangs, for whatever reason, killing the thread will leave the file still open, and most likely locked, up until you close the entire program.
So the real answer to your question is: No, you can not kill threads the hard way.
If you simply ask a thread to close, that's different because then the thread is still in control and can clean up and close resources before terminating, but calling an API function like "KillThread" or similar is bad.
If a thread hangs, the others will continue executing. However, if the hung thread has locked a semaphore, critical section or other kind of synchronization object, and another thread attempts to lock the same synchronization object, you now have a deadlock with two dead threads.
It is possible to monitor other threads from a thread. Depending on your platform, there are applicable APIs; I refer you to those, as you haven't stated which OS you are writing for.
You didn't mention the platform, but as far as I'm concerned, the NT kernel schedules threads, not processes, and treats them independently in that manner. This might not be, and is not, true on other platforms (some platforms, like Windows 3.1, do not use preemptive multithreading, so if one thread goes into an infinite loop, everything is affected).
The simple answer is yes.
Typically, though, code in a thread will handle this likelihood itself. Most commonly, APIs that perform operations that may hang have timeout features of their own.
Alternatively, a thread will wait not just on the operation that might hang but also on a timer. If the timer signals first, it is assumed the operation has hung.
Since a watchdog thread would need some cooperation from code in the other threads to be useful in this scenario, having the threads themselves set timeouts makes more sense than a watchdog.
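A minimal sketch of that "wait on the operation or a timer, whichever comes first" idea, using Task.Wait with a timeout; MightHang and the 5-second limit are placeholders:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class TimeoutInsteadOfWatchdog
    {
        static void Main()
        {
            // Start the operation that might hang on its own task.
            Task operation = Task.Run(() => MightHang());

            // Wait for the operation, but give up after 5 seconds instead of
            // relying on an external watchdog to notice the hang.
            if (!operation.Wait(TimeSpan.FromSeconds(5)))
                Console.WriteLine("Operation appears hung; recovering without killing the thread");
            else
                Console.WriteLine("Operation completed in time");
        }

        static void MightHang()
        {
            // Stand-in for an operation that can block indefinitely.
            Thread.Sleep(Timeout.Infinite);
        }
    }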
Threads get scheduled independently of each other, so you could indeed stop and restart hanging threads. However, threads do not run in separate address spaces, so a misbehaving thread can still overwrite memory or hold locks needed by other threads in the same process.
There's a pretty good overview of some of the pitfalls of killing and suspending threads in the Java documentation explaining why the methods that do it are deprecated. Basically, if you expect to be able to kill a thread, you have to be very, very careful to make it work without some sort of corruption. If a thread is hung it's probably because of a bug...in which case killing it will probably result in corruption.
http://java.sun.com/j2se/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html
If you need to be able to kill things, use processes.
