Necessity of gracefully ending a thread - multithreading

If I am building a multithreaded application, all its threads would automatically get killed when I abort the application.
If I want a thread to have a lifetime equal to that of the main thread, do I really need to gracefully end the thread, or let the application abort take care of killing it?
Edit: As threading rules depend on the OS, I'd like to hear opinions for the following too:
Android
Linux
iOS

It depends on what the thread is doing.
When a thread is killed, its execution stops at an arbitrary point in the code, meaning some operations may not be finished, such as
writing a file
sending network messages
But the OS will
close all handles the application owns
release any locks
free all memory
close any open file
etc...
So, as long as you can make sure that all your files etc. are in a consistent state, you don't have to worry about the system resources.
I know this is true for Windows, and I would be very surprised if it were different on other OSes. Fortunately, the days when an application that didn't release all of its resources could affect the entire system are long gone.
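One common way to keep files in a consistent state even if the process is killed mid-write is to write to a temporary file and atomically rename it over the target. This is a minimal sketch; the function name and file paths are hypothetical, and production code would also `fsync` before renaming.

```c
/* Sketch: write to a temp file, flush, then atomically rename over the
   target. On POSIX, rename() is atomic: readers see either the old file
   or the new one, never a half-written one. */
#include <stdio.h>

int write_file_atomically(const char *path, const char *tmp_path,
                          const char *data) {
    FILE *f = fopen(tmp_path, "w");
    if (!f) return -1;
    if (fputs(data, f) == EOF) { fclose(f); return -1; }
    if (fclose(f) != 0) return -1;
    /* If the process dies before this line, the target file is untouched. */
    return rename(tmp_path, path);
}
```

With this pattern, an abrupt process termination can at worst leave a stale temp file behind, which the OS-level resource cleanup described above makes harmless.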

No. On any non-trivial OS, you do not need to explicitly/gracefully terminate app-lifetime threads unless there is a specific and overriding need to do so.
One reason is that you cannot always actually do it from user code: user-level code cannot stop a thread that is running on a different core from the thread requesting the stop. The OS can, and does.
Your Linux/Windows OS is very good indeed at stopping threads in any state on any core and releasing resources like thread stacks, heaps, and OS object handles/fds at process termination. It has had millions of hours of testing on systems worldwide, something your own user code is very unlikely to ever get. If you can, let the OS do what it's good at.
In other posts, several cases have been made where user-level termination of a thread may be unavoidable. Inter-process comms is one area, as are DB connections/transactions. If you are forced into it by your requirements, then fine, go for it but, otherwise, don't try - it's a waste of time and effort writing/testing/debugging thread-stop code to do what the OS can do effectively on its own.
Beware of premature stoptimization.

Related

Sandboxing threads without separate processes

In the interest of ease of programming (local function calls instead of IPC) and performance (e.g. avoiding copies of large buffers), I'd like to have a Java VM call native code using JNI instead of through interprocess communication. There would be lots of worker threads, each doing computer vision on some image and sending back a list of detected features.
I've found a few other posts about this topic:
How to implement a native code sandbox?
Linux: Is it possible to sandbox shared library code
but in all cases, the agreed upon solution is to use multiple processes.
But I would like to explore the feasibility of partly sandboxing threads. Clearly, this goes against common sense, but I think if your client processes aren't malicious, if you can recover from faults, and if, in the worst case, you are willing to tolerate a whole system crash once in a blue moon, it might work.
There are some hints that this is possible such as from jmajnert in #2. You would have to capture segfaults and other crashes, and terminate and restart the crashed thread. But I also want to reset the heap of the thread. That means each thread should have a private heap, but I don't know of any common malloc implementation that lets you create multiple heaps (AIX seems to).
Then I would want to close all files opened by the thread when it gets restarted.
Also, if Java objects get compromised by the native code, would it be practical to provide some fault tolerance like recreating them?
Because of the complexity of hopping between threading models across native code and the JVM, the idea as stated really isn't feasible.
To be feasible, you'd need to stay within a single machine/threading model.
Let's assume you're in POSIX/ANSI C.
You'd need to write a custom allocator that allocates from pools. Each time you launched a thread you'd allocate a new pool and store it in a thread-local variable that all your custom_malloc() calls would allocate from. That way, when your thread died you could crush all of its memory along with it.
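A minimal sketch of such a per-thread pool allocator might look like the following. All names (`pool_t`, `custom_malloc`, etc.) are illustrative, and the `__thread` storage class is a common GCC/Clang extension (C11 offers `_Thread_local`).

```c
/* Per-thread arena allocator: each thread gets its own fixed-size pool via
   a thread-local pointer; destroying the pool frees everything the thread
   ever allocated in one shot ("crushing" its memory). */
#include <stdlib.h>
#include <stddef.h>

typedef struct {
    char  *base;
    size_t size;
    size_t used;
} pool_t;

static __thread pool_t *current_pool;  /* one pool per thread */

pool_t *pool_create(size_t size) {
    pool_t *p = malloc(sizeof *p);
    if (!p) return NULL;
    p->base = malloc(size);
    if (!p->base) { free(p); return NULL; }
    p->size = size;
    p->used = 0;
    current_pool = p;
    return p;
}

void *custom_malloc(size_t n) {
    pool_t *p = current_pool;
    n = (n + 15) & ~(size_t)15;            /* keep allocations aligned */
    if (!p || p->used + n > p->size) return NULL;
    void *mem = p->base + p->used;
    p->used += n;
    return mem;
}

void pool_destroy(pool_t *p) {             /* crush the thread's memory */
    free(p->base);
    free(p);
    current_pool = NULL;
}
```

A bump allocator like this cannot free individual allocations, which is exactly the point: when the thread is torn down, one `pool_destroy` reclaims everything.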
Next, you'd need to set up some trickery with setjmp/longjmp and signal handlers to catch all those segfaults etc., exit the thread, crush its memory, and restart.
If you have objects from the "parent process" that you don't want to get corrupted, you'd have to create some custom mutexes with rollback functions that could be triggered when a thread's signal handler fires to destroy the thread.

C#: When will thread switching most probably occur?

I was wondering when .NET would most probably switch from one thread to another.
I understand we can't predict exactly when this will happen, but is there any intelligence to it? For example, when a thread is executing, will the runtime wait for a method to return or a loop to finish before switching?
I'm not an expert on .NET, but in general scheduling is handled by the kernel.
Either your thread's timeslice has expired (threads/processes only get a certain amount of CPU time)
Your thread has blocked for IO.
Some other obscure reason, like waiting for an IPC message, a network packet or something.
Threads can be preempted at any point along their execution path, be it in a loop or returning from a function. This in general isn't handled by the underlying VM (.NET or JVM) but is controlled by the OS.
Of course there is 'intelligence', of a sort :). The set of running threads can only change upon an interrupt, either:
An actual hardware interrupt from a peripheral device, eg. disk, NIC, KB, mouse, timer.
A software interrupt, (ie. a system call), that can change the state of thread/s. This encompasses sleep calls and calls to wait/signal on inter-thread synchro objects, as well as I/O calls that request data that is not immediately available.
If there is no interrupt, the OS cannot change the set of running threads because it is not entered. The OS does not know or care about loops, function/methods calls, (except those that make system calls as above), gotos or any other user-level flow-control mechanisms.
I'm only reading your question now, so this may no longer be relevant, but after reading the answers above I want to make sure I understand:
Threads are managed (as far as I know) by the process they belong to, not by the operating system. (That's the main reason working with multiple threads is faster than working with multiple processes: threads share data, and switching between them is faster than the context switch between processes performed by the short-term scheduler.)
(Note: there are two kinds of threads, user-mode threads and kernel-mode threads, and an OS can have both or just one of them. Either way, a thread running in a user application environment is considered a user-mode thread and is managed by the process it belongs to.)
Am I right?
Thanks!

Will a waiting thread still eat up cpu time?

I'm trying to make a thread pool for a game engine and I've been considering how my system should react to third party libraries spawning their own threads.
From what I've read, it is ideal to only have one thread for each CPU you have access to. So if my third party physics update spawns four threads, it would be ideal to turn off four threads from my thread pool while it is running, then turn them back on afterwards, that way multiple threads are never contending over one CPU.
My question is about the underlying mechanics of functionality like condition variables. Since spawning threads is expensive, having four threads wait on a condition variable and then notifying them when the physics is done seems like a much better option than joining four threads and re-spawning them afterwards. But if they are waiting on a variable, are the threads truly "asleep", or are they still contending for CPU resources in the background?
Although you did not say which platform you are programming on, in most implementations threads that are waiting consume little to no CPU.
They do, however, use some memory (for the stack, etc.), so you should avoid spawning an excessive number of threads and try to reuse them as much as possible, since, as you noted, spawning a new thread is an expensive operation on most platforms.
Even though you did not provide a lot of information, I'm guessing that in your scenario letting the threads wait is the much better option: a small number of waiting threads will not use many resources, whereas having to spawn new threads frequently will hurt performance on almost all platforms.
Any good third-party library should give you the option of running its work through your thread pool, to avoid that problem in the first place.
For example here's the documentation on how you can do that with PhysX - https://developer.nvidia.com/sites/default/files/akamai/physx/Docs/TaskManager.html
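To answer the "truly asleep" part concretely: a thread blocked in a condition-variable wait is descheduled by the kernel and consumes no CPU until it is signaled. A minimal pthreads sketch of parking pool workers and waking them all at once (names are illustrative):

```c
/* Park worker threads on a condition variable instead of respawning them.
   While blocked in pthread_cond_wait they use no CPU time. */
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static int resume;            /* protected by lock */
static int resumed_count;     /* protected by lock */

static void *pool_worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!resume)                        /* guard against spurious wakeups */
        pthread_cond_wait(&wake, &lock);   /* thread sleeps here */
    resumed_count++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Park `n` workers (n <= 16), then wake them all, e.g. when the physics
   step has finished. Returns how many workers were resumed. */
int park_and_resume(int n) {
    pthread_t tid[16];
    for (int i = 0; i < n && i < 16; i++)
        pthread_create(&tid[i], NULL, pool_worker, NULL);

    pthread_mutex_lock(&lock);
    resume = 1;
    pthread_cond_broadcast(&wake);         /* wake every parked worker */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < n && i < 16; i++)
        pthread_join(tid[i], NULL);
    return resumed_count;
}
```

The `while (!resume)` loop is essential: condition variables permit spurious wakeups, so the predicate must always be rechecked under the mutex.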

How does process blocking apply to a multi-threaded process?

I've learned that a process has running, ready, blocked, and suspended states. Threads also have these states, except for suspended, because a thread lives in the process's address space.
A process mostly blocks when it is doing blocking I/O or waiting for an event.
I can easily picture a process getting blocked if it's single-threaded or if it follows a one-to-many model, but how does it work if the process is multi-threaded?
For example:
I have a process with two threads in a system that follows a one-to-one model. One handles the GUI and the other handles the blocking I/O. I know the process remains responsive because the other thread handles the I/O.
So is there any chance the process gets blocked, or should I just rule that out in this case?
I'm just getting into this stuff, so forgive me if I haven't understood some of the important details yet.
Let's say you have a work queue where the UI thread schedules work to be done and the I/O thread looks there for work to do. The work queue itself is data that is read and modified from both threads, therefore you must synchronize access somehow or race conditions result.
The naive approach is to synchronize access to the queue using a lock (aka critical section). If the I/O thread acquires the lock and then blocks, the UI thread will only remain responsive until it decides it needs to schedule work and tries to acquire the lock. A better approach is to use a lock-free queue, about which much has been written and you can easily search for more info.
But to answer your question: yes, it is still much easier than you might think to cause the UI to stutter or hang even when using multiple threads. There are various libraries that make it easier or harder to solve this problem, so depending on your OS and language of choice, there may be something better than just OS primitives. Win32 (from what I remember) doesn't make it very easy at all despite having all sorts of synchronization primitives. Pthreads and Boost never seemed very straightforward to me either. Apple's GCD makes it semantically much easier to express what you want (in my opinion), though there are still pitfalls one must be aware of (such as scheduling too many blocking operations on a single work queue to be done in parallel and causing the processor to thrash when they all wake up at the same time).
My advice is to just dive in and write lots of multithreaded code. It can be tough to debug but you will learn a lot and eventually it becomes second nature.
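The simplest kind of lock-free queue mentioned above is a single-producer/single-consumer ring buffer, which fits the UI-thread-produces, I/O-thread-consumes scenario exactly: the producer never waits on the consumer, it just fails when the queue is full. A sketch using C11 atomics (names and capacity are illustrative):

```c
/* SPSC lock-free ring buffer: head is written only by the consumer,
   tail only by the producer, so no lock is needed. */
#include <stdatomic.h>
#include <stdbool.h>

#define QCAP 8u               /* capacity, a power of two */

typedef struct {
    int         items[QCAP];
    atomic_uint head;         /* consumer's index */
    atomic_uint tail;         /* producer's index */
} spsc_queue;

bool spsc_push(spsc_queue *q, int v) {
    unsigned t = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned h = atomic_load_explicit(&q->head, memory_order_acquire);
    if (t - h == QCAP) return false;          /* full: caller retries later */
    q->items[t % QCAP] = v;
    /* release: the consumer sees the item before the new tail */
    atomic_store_explicit(&q->tail, t + 1, memory_order_release);
    return true;
}

bool spsc_pop(spsc_queue *q, int *out) {
    unsigned h = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned t = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (t == h) return false;                 /* empty */
    *out = q->items[h % QCAP];
    atomic_store_explicit(&q->head, h + 1, memory_order_release);
    return true;
}
```

Because neither side ever blocks on the other, a hung consumer can at worst make pushes fail; it can never freeze the producer the way a held lock can.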

How independent are threads inside the same process?

Now, this might be a very newbie question, but I don't really have experience with multithreaded programming and I haven't fully understood how threads work compared to processes.
When a process on my machine hangs, say it's waiting for some IO that never comes or something similar, I can kill and restart it because other processes aren't affected and can, for example, still operate my terminal. This is very obvious, of course.
I'm not sure whether it is the same with threads inside a process: If one hangs, are the others unaffected? In other words, can I run a "watchdog" thread which supervises the other threads and, for example kill and recreate hanging threads? For example, if I have a threadpool that I don't want to be drained by occasional hangups.
Threads are independent, but there's a difference between a process and a thread, and that is that in the case of processes, the operating system does more than just "kill" it. It also cleans up after it.
If you start killing threads that seem to be hung, most likely you'll leave resources locked and the like, things that the operating system would clean up for you if you did the same to a process.
So for instance, if you open a file for writing and start producing data and writing it to the file, and this thread now hangs for whatever reason, killing the thread will leave the file open, and most likely locked, up until you close the entire program.
So the real answer to your question is: No, you cannot kill threads the hard way.
If you simply ask a thread to close, that's different because then the thread is still in control and can clean up and close resources before terminating, but calling an API function like "KillThread" or similar is bad.
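"Asking a thread to close" is usually done with a shared stop flag that the thread polls between units of work, so it stays in control and can clean up before returning. A minimal sketch, assuming a pthreads worker loop (names are illustrative):

```c
/* Cooperative shutdown: the worker polls an atomic stop flag and cleans up
   its own resources before returning, instead of being killed mid-work. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;
static atomic_long iterations;

static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop_requested)) {
        /* ... do one unit of work ... */
        atomic_fetch_add(&iterations, 1);
    }
    /* still in control here: close files, release locks, then return */
    return NULL;
}

/* Start a worker, ask it to finish, and wait for it to exit cleanly.
   Returns the number of work units it completed. */
long run_and_stop_gracefully(void) {
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    atomic_store(&stop_requested, true);   /* the "please close" request */
    pthread_join(t, NULL);                 /* wait for the clean exit */
    return atomic_load(&iterations);
}
```

The key property is that the thread only ever stops at a point it chose itself, so it never leaves a file half-written or a lock held.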
If a thread hangs, the others will continue executing. However, if the hung thread has locked a semaphore, critical section or other kind of synchronization object, and another thread attempts to lock the same synchronization object, you now have a deadlock with two dead threads.
It is possible to monitor other threads from a thread. Depending on your platform, there are applicable APIs; I refer you to those, as you haven't stated which OS you are writing for.
You didn't mention the platform, but as far as I'm concerned the NT kernel schedules threads, not processes, and treats them independently in that manner. This is not necessarily true on other platforms (some platforms, like Windows 3.1, did not use preemptive multithreading, so if one thread went into an infinite loop, everything was affected).
The simple answer is yes.
Typically, though, code in a thread will handle this likelihood itself. Most commonly, APIs that perform operations that may hang have timeout features of their own.
Alternatively, a thread can wait not just on the operation that might hang but also on a timer. If the timer signals first, the operation is assumed to have hung.
Since a watchdog thread would need some cooperation from the code in the other threads to be useful in this scenario, having the threads themselves set timeouts makes more sense than a watchdog.
Threads get scheduled independently of each other, so you could indeed stop and restart hanging threads. However, threads do not run in a separate address space, so a misbehaving thread can still overwrite memory or hold locks needed by other threads in the same process.
There's a pretty good overview of some of the pitfalls of killing and suspending threads in the Java documentation explaining why the methods that do it are deprecated. Basically, if you expect to be able to kill a thread, you have to be very, very careful to make it work without some sort of corruption. If a thread is hung it's probably because of a bug...in which case killing it will probably result in corruption.
http://java.sun.com/j2se/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html
If you need to be able to kill things, use processes.
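The reason processes are the safely killable unit is that the kernel can terminate a child at any point and reclaims all of its resources for you. A minimal POSIX sketch of killing and reaping a hung child process (the function name is illustrative):

```c
/* Fork a child that simulates a hung worker, kill it with SIGKILL
   (which cannot be caught or ignored), and reap it with waitpid. */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child was forcibly killed and fully reaped, 0 on error. */
int kill_hung_child(void) {
    pid_t pid = fork();
    if (pid < 0) return 0;
    if (pid == 0) {
        for (;;) pause();      /* child: hang forever */
    }
    kill(pid, SIGKILL);        /* the kernel stops it wherever it is */
    int status;
    if (waitpid(pid, &status, 0) != pid) return 0;
    /* the kernel has already released the child's memory, fds, locks */
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

Contrast this with killing a thread: here the OS releases the child's files, memory, and locks automatically, which is exactly what it cannot do for a thread inside a still-running process.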
