Python getting running Threads - multithreading

I'm working on a Python project in which my code starts several threads. Now I need to stop one of those threads, but from another class. Is there a way to get a list of all running threads?
Thanks for the help.

You can use threading.enumerate(), which is described in the Python documentation for the threading module:
import threading
for thread in threading.enumerate():
    print(thread.name)

threading.enumerate() returns a list of all Thread objects that are currently alive. According to the library reference, this list includes:
All Thread objects that are currently alive and were created through the threading module
Daemonic threads (whose presence doesn't prevent the process from exiting)
Dummy thread objects created by current_thread() (these represent threads started directly from C code; they are always considered alive and daemonic and cannot be joined)
The main thread (the initial thread of the Python program)
It excludes threads that have not yet been started and threads that have already terminated.
You can use threading.active_count() to get the length of the list returned by threading.enumerate().
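To connect this back to the original question (stopping a thread from another class), here is a minimal sketch. The StoppableWorker class, its stop_requested event and the find_thread helper are hypothetical names made up for this example; only threading.enumerate(), threading.Event and the Thread methods are part of the standard library.

```python
import threading


class StoppableWorker(threading.Thread):
    """Hypothetical worker that loops until it is asked to stop."""

    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.stop_requested = threading.Event()

    def run(self):
        # Wait in short intervals so a stop request is noticed quickly.
        while not self.stop_requested.wait(0.01):
            pass

    def stop(self):
        self.stop_requested.set()


def find_thread(name):
    """Return the live thread with the given name, or None."""
    for thread in threading.enumerate():
        if thread.name == name:
            return thread
    return None


worker = StoppableWorker("worker-1")
worker.start()
names_while_running = [t.name for t in threading.enumerate()]

# Any other class can look the thread up by name and ask it to stop.
found = find_thread("worker-1")
found.stop()
found.join(timeout=5)
names_after_stop = [t.name for t in threading.enumerate()]
```

The threading module has no public call that forcibly stops another thread, so the sketch relies on the worker checking a flag and returning from run() on its own. Once it has terminated, it no longer shows up in threading.enumerate().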

Related

Calling fork on a multithreaded process

I have a question about using fork in a multi-threaded process.
If a process has multiple threads (already created with pthread_create and joined with pthread_join) and I call fork, will the child process get copies of the same functions assigned to the threads, or will it get a space where we can reassign the functions?
Read carefully what POSIX says about fork() and threads. In particular:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.
The child process will have a single thread running in the context of the calling thread. Other parts of the original process may be tied up by threads that no longer exist (so mutexes may be locked, for example).
The rationale section (further down the linked page) says:
There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.
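The "single thread in the child" behaviour can be observed from Python as well. The following is only a sketch, assuming a POSIX system where os.fork() is available; count_threads_in_child is a hypothetical helper name. The child does as little as possible (one write to a pipe) and then leaves with os._exit().

```python
import os
import threading
import warnings


def count_threads_in_child():
    """Fork and return the number of Python threads the child reports."""
    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ warns when fork() is called while other threads exist.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: report the thread count and leave at once, without cleanup.
        try:
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)
    # Parent: read the child's report and reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 16)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return int(data)


stop = threading.Event()
helper = threading.Thread(target=stop.wait)  # stays blocked until stop is set
helper.start()

threads_in_parent = threading.active_count()
threads_in_child = count_threads_in_child()

stop.set()
helper.join()
```

The parent sees at least two threads (the main thread plus the helper), while the child reports only one: the replica of the thread that called fork().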

Linux/POSIX: Why doesn't fork() fork *all* threads

It is well known that the default way to create a new process under POSIX is to use fork() (under Linux this internally maps to clone(...)).
What I want to know is the following: it is well known that when one calls fork(), "The child process is created with a single thread--the one that called fork()" (cf. https://linux.die.net/man/2/fork). This can of course cause problems if, for example, some other thread currently holds a lock. To me, not forking all the threads that exist in the process as well intuitively feels like a "leaky abstraction".
So I would like to know: why does only the thread calling fork() exist in the child process, instead of all threads of the process? Is there a good technical reason for this?
I know there is a related question, "Multithreaded fork", but the answers given there don't answer mine.
Of these two possibilities:
only the thread calling fork() continues running in the child process
Downside: if another thread was holding on to an internal resource such as a lock, it will not be released.
after fork(), all threads are duplicated into the child process
Downside: threads that were interacting with external resources continue running in parallel. If a thread was appending data to a file: now it happens twice.
Both are bad, but the first choice only deadlocks the new child process, while the second choice results in corruption outside of the process, which is arguably worse.
POSIX did standardize pthread_atfork to try to allow automatic cleanup in the first case, but it cannot possibly work.
tl;dr Don't use both threads and forks. Use posix_spawn if you have to.
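For reference, Python exposes posix_spawn as os.posix_spawn() on POSIX systems. The sketch below assumes Python 3.9 or later (for os.waitstatus_to_exitcode); spawn_and_wait is a hypothetical helper name, and the child command is just a throwaway interpreter call chosen so the exit status is easy to check.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start the program at argv[0] with posix_spawn and return its exit code."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


# Run a fresh interpreter that exits with a recognisable status.
exit_code = spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(7)"])
```

Because the new program is started directly, no user code runs in a half-copied multi-threaded process between fork and exec.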

Does a thread die when its parent is dead?

Does a thread die once I kill the program which started it?
Probably it has to do with my English, but I couldn't find it here:
https://docs.python.org/2/library/threading.html
Yes, when a process is killed (e.g. by sending it SIGKILL), all of its threads are terminated.
It's worth noting that this is not Python-specific.
It seems that this is the part of the documentation you're looking for, and it states:
When the main thread exits, it is system defined whether the other threads survive. On SGI IRIX using the native thread implementation, they survive. On most other systems, they are killed without executing try ... finally clauses or executing object destructors.
So it's a thing Python doesn't define - it can vary depending on the particular OS.
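A quick way to see the "killing the process kills its threads" part is to start a child Python process whose only remaining work is a thread that loops forever, and then kill that process. This is a sketch assuming a POSIX system, where Popen.kill() sends SIGKILL; CHILD_CODE is just an illustrative script.

```python
import signal
import subprocess
import sys

# Child program: the main thread finishes at once, but a non-daemon
# thread keeps looping, so the process stays alive on its own.
CHILD_CODE = """
import threading, time

def loop():
    print("started", flush=True)
    while True:
        time.sleep(0.1)

threading.Thread(target=loop).start()
"""

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD_CODE],
    stdout=subprocess.PIPE,
    text=True,
)
first_line = proc.stdout.readline().strip()  # wait until the thread runs
proc.kill()                                  # SIGKILL on POSIX
return_code = proc.wait(timeout=10)
proc.stdout.close()
```

On POSIX, a negative return code from subprocess means the process was terminated by that signal, so the looping thread did not keep anything alive.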

Why not use a full list of runnable threads, as opposed to just threads that are runnable but not running?

I'm considering the concepts behind multiprocessing, and I'm trying to come up with some reason why a ready list is used that contains all runnable threads that aren't running, as opposed to a list of all runnable threads with the head of the data structure being the running thread(s)?
Thanks for your opinions.
EDIT: Let me clarify. As far as I know, thread packages use a ready list to identify those processes that are ready to run, while the running process is identified by a separate variable. Why don't they just include the running processes in the ready list data structure with the running thread at the head of the structure, making the thread package all inclusive. Would multiprocessing cause problems in this design scheme?
Because a thread can only run on one processor (core) at a time. The list (queue, really) of threads that are ready to run is used primarily by the scheduler when it's looking for which thread it should run next. If a thread is already running on one CPU, it can't be run on another CPU at the same time, so the scheduler does not want to look at it (at least for now; later, when the thread is no longer running and is eligible to run again, the scheduler will care about it again).
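As a toy illustration of that separation (not how any real kernel or thread package is implemented), the sketch below keeps a ready queue plus one "running" slot per CPU. ToyScheduler and its method names are hypothetical.

```python
from collections import deque


class ToyScheduler:
    """Toy model: a ready queue plus one 'running' slot per CPU."""

    def __init__(self, num_cpus, thread_names):
        self.ready = deque(thread_names)   # runnable but not running
        self.running = [None] * num_cpus   # what each CPU is executing

    def dispatch(self, cpu):
        """Preempt whatever runs on `cpu` and start the next ready thread."""
        previous = self.running[cpu]
        if previous is not None:
            self.ready.append(previous)    # eligible to be picked again
        self.running[cpu] = self.ready.popleft() if self.ready else None
        return self.running[cpu]


sched = ToyScheduler(num_cpus=2, thread_names=["A", "B", "C"])
first_on_cpu0 = sched.dispatch(0)   # "A" leaves the ready queue
first_on_cpu1 = sched.dispatch(1)   # "B" is picked; "A" is not a candidate
second_on_cpu0 = sched.dispatch(0)  # "A" goes back to ready, "C" runs
```

Because a running thread is never in the ready queue, dispatch() on one CPU cannot accidentally pick a thread that another CPU is already executing, and it does not have to skip over such entries.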

Writing a thread (educational purpose)

Sorry if this is a duplicate...
I have a task to write a thread class, and the question is what a good thread class should contain. I looked through the Java implementation and some others, but since this is just an educational project, I don't want to make it too complex. If you can tell me or point me to a source which contains the required information, I would be very grateful.
A simple thread class consists of the following, along with a ThreadManager class for easier management of multiple threads.
Thread class:
Constructor
Function to execute the thread
Check whether the thread is running and process the thread's output, if any is present. Returns TRUE if the thread is still executing, FALSE if it has finished.
Wait until the thread exits
Wait until the thread exits
ThreadManager class:
Constructor
Add an existing thread to the manager queue.
Remove a thread from the manager queues.
Process all threads. Returns the number of threads that are still running.
Create and start a new thread. Returns the ID assigned to the thread or FALSE on error.
Remove a finished thread from the internal queue and return it. Returns FALSE if there are no threads that have completed execution.
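The outline above could look roughly like this in Python. This is only a sketch built on the standard threading module: SimpleThread, ThreadManager and all method names are hypothetical, and None is used where the outline says FALSE.

```python
import threading


class SimpleThread:
    """Hypothetical wrapper mirroring the Thread class outlined above."""

    def __init__(self, target, *args):
        self._target = target
        self._args = args
        self._thread = threading.Thread(target=self._run)
        self.started = False
        self.result = None

    def _run(self):
        # Keep the return value so the caller can collect the output later.
        self.result = self._target(*self._args)

    def start(self):
        self.started = True
        self._thread.start()

    def is_running(self):
        """Return True while the thread is still executing."""
        return self._thread.is_alive()

    def wait(self):
        """Block until the thread exits."""
        self._thread.join()


class ThreadManager:
    def __init__(self):
        self._threads = {}   # thread ID -> SimpleThread
        self._next_id = 1

    def add(self, thread):
        """Add an existing thread and return the ID assigned to it."""
        thread_id = self._next_id
        self._next_id += 1
        self._threads[thread_id] = thread
        return thread_id

    def remove(self, thread_id):
        """Remove a thread from the manager; return it, or None if unknown."""
        return self._threads.pop(thread_id, None)

    def create(self, target, *args):
        """Create and start a new thread; return its ID."""
        thread = SimpleThread(target, *args)
        thread.start()
        return self.add(thread)

    def process(self):
        """Return the number of managed threads that are still running."""
        return sum(1 for t in self._threads.values() if t.is_running())

    def pop_finished(self):
        """Remove and return one finished thread, or None if there is none."""
        for thread_id, thread in list(self._threads.items()):
            if thread.started and not thread.is_running():
                del self._threads[thread_id]
                return thread
        return None


manager = ThreadManager()
job = SimpleThread(pow, 2, 10)
job.start()
job_id = manager.add(job)
job.wait()
finished = manager.pop_finished()
```

After job.wait() returns, the manager hands the finished thread back and finished.result holds the value computed by the target function.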
At the highest level of abstraction, you can think of a thread as a combination of:
Finite-state machine to represent thread's state
Queue of tasks to proceed
Scheduler which can manage threads (start, pause, notify etc ..). Scheduler can be OS level scheduler or some custom scheduler, for example, on the VM level - so called "green threads".
To be more specific, I would recommend looking at the Erlang VM. The sources are available online, and you can go through its implementation of "green threads", which are extremely lightweight.
Erlang Downloads
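To make the "green threads" idea concrete, here is a very small sketch that uses Python generators as tasks and a round-robin loop as the scheduler. It is not based on Erlang's implementation; run_green_threads and countdown are hypothetical names for this example.

```python
from collections import deque


def run_green_threads(tasks):
    """Round-robin over generator-based tasks until all of them finish."""
    ready = deque(tasks)   # queue of (name, generator) pairs
    trace = []
    while ready:
        name, task = ready.popleft()
        try:
            step = next(task)                 # run until the task yields
        except StopIteration:
            trace.append((name, "finished"))  # state: finished, drop it
            continue
        trace.append((name, step))
        ready.append((name, task))            # state: ready again
    return trace


def countdown(n):
    """Example task: yield control after every step."""
    while n > 0:
        yield n
        n -= 1


trace = run_green_threads([("a", countdown(2)), ("b", countdown(1))])
```

Each task is either in the ready queue, currently being stepped, or finished, which is the finite-state view described above, and the loop plays the role of a custom scheduler that switches tasks only when they yield.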
