How do multiprocessing, threading, and thread pooling work?

https://code.tutsplus.com/articles/introduction-to-parallel-and-concurrent-programming-in-python--cms-28612
I have studied from this link, and I have a few questions.
Q1: How are the thread pool (concurrent.futures) and threading different here, and why do we see the performance improvement? The threading-with-queue version has 4 threads, each running cooperatively during idle time and picking an item from the queue once it gets a website response. As I see it, the thread pool is doing much the same thing: a worker completes its work and waits for the manager to assign a task, which is very similar to picking a new item from the queue. I'm not sure how this is different and why I see the performance improvement. It seems I'm misinterpreting the pooling here. Could you explain?
Q2: Using multiprocessing, the time taken is longer. If I have a multiprocessor that can handle multiple processes at a time, then all 4 of my processes should be handled at once; that is when real parallelization happens. I also have a question here: since the 4 processes run the same function, doesn't the GIL try to stop them from executing the same piece of code? Suppose all of them share a common variable that gets updated, like the number of websites checked. How does the GIL work in these cases of multiprocessing?
Also, are the same processes used again and again, or do they get killed and created anew after each job? I think the same processes are reused. I also think the performance problem comes from process creation, which is costly compared to the lightweight threads of the concurrent threading phase. So could you explain in more detail how the GIL works here and how the processes run: do they run cooperatively (each process waiting for its turn, the way threads within a process do), or do they use the multiple processors to run truly in parallel? My other question: on an 8-core machine, I think I can run 8 threads of the same process simultaneously, in parallel. Can I run 2 processes with 4 threads each? Can I run 8 processes on 8 cores? I think cores are only for the threads of a process, which would mean I can't run 8 processes on 8 cores, but I can run as many processes as I have CPUs in my multiprocessor system. Am I right? So, can I run 2 processes with 4 threads each on my 8-core machine with 2 processors, each having 4 cores?

Python has a rich set of libraries for multitasking with processes and threads. However, there is overlap between the libraries, and the choice depends on how abstractly you view the computational tasks. For example, the concurrent.futures library views threads as asynchronous tasks, while the threading library deals with them as high-level threads. Further down, the _thread module implements a low-level interface for threading, exposing all the synchronization mechanisms.
The GIL (Global Interpreter Lock) is just a synchronization primitive, specifically a mutex, which prevents multiple threads of the same process from executing Python bytecode concurrently (so that certain objects remain consistent under concurrent operations). This is exactly why Python threads excel at I/O-bound work but not at compute-intensive tasks: the GIL is released during certain blocking calls and inside computationally intensive libraries such as numpy. Note that, among Python implementations, CPython and PyPy are the ones constrained by a GIL.
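To see that effect concretely, here is a minimal sketch of my own (a toy example, not code from the linked article) contrasting I/O-bound and CPU-bound work on the same four-thread pool:

import time
from concurrent.futures import ThreadPoolExecutor

def io_bound(_):
    # The GIL is released while a thread blocks; sleep stands in for a
    # blocking call, so four of these overlap and take about 1 second total.
    time.sleep(1)

def cpu_bound(_):
    # Pure-Python bytecode holds the GIL, so four of these run one at a time.
    sum(i * i for i in range(5_000_000))

for work in (io_bound, cpu_bound):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        pool.map(work, range(4))
    print(work.__name__, round(time.perf_counter() - start, 2), "s")

Exact timings vary by machine, but the I/O-bound run should take roughly the time of one task, while the CPU-bound run takes roughly four times one task.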
Now, let's see those questions...
How are the thread pool (concurrent.futures) and threading different here? Why do we see the performance improvement?
Coming to the comparison between threading and concurrent.futures.ThreadPoolExecutor (aka threading_squirrel vs future_squirrel), I've executed both programs with the same test case. There are two factors that contribute to this "performance improvement":
Network HEAD requests: Remember that network operations will not complete in the same time period every time you execute them, due to the very nature of packet transfer delays.
Order of thread execution: On the website you've linked, the author creates all the threads first, fills the queue with website links, and then starts them all in a list comprehension loop. With concurrent.futures.ThreadPoolExecutor, each time a task is submitted, a thread is assigned to it if the predefined maximum number of threads/workers has not been reached. I've changed the code to mirror this technique. It seems to give a speedup, as the first thread begins work early on and doesn't have to wait for the queue to be filled up.
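As a rough sketch of the pooled approach (the URL list and the check function here are placeholders of mine, not the article's code):

import urllib.request
from concurrent.futures import ThreadPoolExecutor

urls = ["http://example.com", "http://example.org"]  # placeholder list

def check(url):
    # A HEAD request fetches only the headers, which is all we need to know
    # whether the site is responding.
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return url, resp.status

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each task is handed to an idle worker as soon as one is free; the first
    # request can start before the whole task list has been prepared.
    for url, status in pool.map(check, urls):
        print(url, status)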
How does the GIL work in these cases of multiprocessing?
Remember that the GIL applies only among the threads of a single process, not between processes. The GIL locks up the whole interpreter while one thread executes bytecode, so the other threads of that process have to wait for their turn. This is the reason multiprocessing uses processes instead of threads: each process has its own interpreter and, consequently, its own GIL.
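Since you asked about a shared "number of websites checked" counter: because each process has its own interpreter and its own memory, the GIL cannot arbitrate between processes, and you need an explicit cross-process primitive instead. A minimal sketch of mine using multiprocessing.Value (assuming a simple counting workload):

from multiprocessing import Process, Value

def worker(counter):
    for _ in range(1000):
        with counter.get_lock():  # explicit lock; the GIL plays no role here
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)  # a C int placed in shared memory
    procs = [Process(target=worker, args=(counter,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # reliably 4000, thanks to the explicit lock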
Are the same processes used again and again, or do they get killed and created anew after each job?
The point of pooling is to reduce the overhead of creating and destroying workers (be they threads or processes) during the computation. However, the processes do start out "brand new" in the sense that the library effectively asks the OS to perform a fork on a UNIX-based OS, or to spawn on an NT-based OS.
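A quick way to observe worker reuse is to print the PID that handles each task; with a pool of 2 workers and 8 tasks, only two distinct PIDs should appear (a toy example of mine, not the article's code):

import os
from multiprocessing import Pool

def which_pid(i):
    # Report which worker process handled this task.
    return i, os.getpid()

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        for task, pid in pool.map(which_pid, range(8)):
            print(f"task {task} ran in process {pid}")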
Also, are the processes running co-operatively?
Maybe. They have to run in co-operation if they use shared memory...(need not be running together). There is definitely going to be a context switch if there are more processes than the OS can allocate to its processors' cores. They can run in parallel if there's no shared memory updates to make.
If I have an 8-core machine, can I run 2 processes with 4 threads each? Can I run 8 processes on 8 cores?
Sure (subject to the GIL, in Python). Each process can be allocated a processing unit for execution, where a processing unit can be a physical or a virtual core of a CPU. As long as the OS scheduler supports it, it's possible, and any reasonable split of processes and threads is possible. If everything can be allocated a core at once, that's the best situation; otherwise you will encounter context switches, which are more expensive for processes than for threads.
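A hedged sketch of the "2 processes with 4 threads each" layout, using an outer process pool for cross-core parallelism and an inner thread pool per process for I/O-style subtasks (the subtask function is a stand-in of mine):

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def io_subtask(n):
    return n * n  # stand-in for an I/O-bound call

def process_worker(chunk):
    # Each worker process runs its own 4-thread pool.
    with ThreadPoolExecutor(max_workers=4) as threads:
        return list(threads.map(io_subtask, chunk))

if __name__ == "__main__":
    chunks = [range(0, 4), range(4, 8)]
    with ProcessPoolExecutor(max_workers=2) as procs:
        for result in procs.map(process_worker, chunks):
            print(result)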
Hope I've answered all those questions!
Here are a few resources:
MultiCore CPUs, Multithreading and context switching?
Why does multiprocessing use only a single core after I import numpy?
Bonus celery-squirrel resource

Related

Parallelism and concurrency in multiprocessing and multithreading

Hello, I'm studying operating systems. I recognize the difference between parallelism and concurrency, but I still wonder about some points, so I want to get some help. Thank you!
What I know is that threads run in parallel in multithreading, but there is context switching among threads. How is that possible? Does it happen when they reach shared resources?
In the case of 4 cores with 8 hardware threads: are processes running in parallel or concurrently? If they run concurrently, processes switch with each other, but only 2 threads can be running at once on a core at any time, right?
I heard coroutines are concurrent, which would mean they don't share any resources, but how can race conditions still happen there?
Context switching is required because we usually execute more threads than we have CPU cores. And yes, this often happens when the currently running thread has to wait for something (I guess this is what you meant by "shared resource"): the thread goes to sleep and another is woken up. But a context switch can also happen at almost any time; the OS scheduler is responsible for sharing CPU cores between threads in such a way that no one thread (or small group of threads) consumes all the resources.
"Are processes running in parallel or concurrently?" Processes, or rather threads? Threads run both concurrently and in parallel, but the parallelism is limited to 4. And no, only one thread runs on a specific core at a time (I'm ignoring hyper-threading here).
I'm not sure what you mean in the third question. Coroutines are concurrent in a way similar to threads, and resource sharing poses a similar challenge when using them.
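To make the coroutine point concrete: race conditions are still possible because a coroutine can be suspended at an await between reading and writing shared state. A minimal sketch (a toy example of mine, not from the question):

import asyncio

balance = 100

async def withdraw(amount):
    global balance
    if balance >= amount:       # read the shared state...
        await asyncio.sleep(0)  # ...suspend: another coroutine may run now
        balance -= amount       # ...write: the earlier check may be stale

async def main():
    await asyncio.gather(withdraw(80), withdraw(80))
    print(balance)  # -60: both withdrawals passed the balance check

asyncio.run(main())

Note that this all happens on a single thread; the interleaving points are merely explicit (the awaits) rather than arbitrary.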

What is the difference between a CPU core's thread and a software application's thread?

I have a web application that supports multithreading, in which we can run async tasks simultaneously on different threads. I understand what that kind of thread means.
Now suppose the server on which the application runs has a multi-core CPU with hyper-threading enabled.
How is my application supposed to take advantage of those hardware threads? Is there a relation between the two kinds that I am missing?
What I understand about a CPU's threads is this:
A thread is a single line of commands getting processed; each application has at least one thread, and most have several. A core is the physical hardware that works on the thread. In general, a processor can only work on one thread per core; CPUs with hyper-threading can work on up to two threads per core.
For processors with hyper-threading, there are extra registers and execution units in the core, so it can store the state of two threads and work on them both. Normally, to change threads you have to empty the registers into the cache, write that back to main memory, then load the cache with the new values and reload the registers; context switches hurt performance significantly.
But when there are too many background tasks running, how do they all fit onto the limited number of hardware threads (i.e. 2 to 8)?
PS: I have already checked "What is the difference between a process and a thread?" and am not looking for the definition of a process, so this is not a duplicate.
If you are making use of multiple cores in your program, then the OS will schedule which cores run which threads, taking many factors into account, including other running processes, what exactly your code is trying to do, and much more. As for async tasks, these are not necessarily running on a different thread or core; they may simply be tasks that are not instantaneous, so a scheduler may decide to do other things until there is a signal that the async task is complete. It will vary widely depending on the language the web application is written in and on the implementation.
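To illustrate that async tasks need not occupy extra threads, here is a minimal sketch (in Python, as an assumption, since the question doesn't name a language): two tasks run "at the same time" yet report the same thread, because the event loop interleaves them on one thread while each is waiting.

import asyncio
import threading

async def task(name):
    await asyncio.sleep(0.1)  # the loop runs other tasks during this wait
    print(name, "ran on", threading.current_thread().name)

async def main():
    await asyncio.gather(task("a"), task("b"))

asyncio.run(main())  # both lines report MainThread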

Process vs thread with example

I have read articles on processes vs threads, but I am still not clear on the difference.
Suppose a process is using the CPU, doing some big calculation that takes 10 minutes. How will another process run at the same time, in parallel, on a single-core vs a dual-core processor?
The same goes for threads: how will another thread run in parallel when the CPU is engaged with one thread already?
How is context switching different for threads and for processes? Both processes and threads use the same RAM, so what's the difference?
From my vague memory of operating systems I can offer you a little bit of help. First you have to know the difference between concurrent and simultaneous. They are not the same thing: simultaneous means both things occur at the same time, while concurrent means they appear to be running simultaneously, but in reality they're switching so fast you can't tell.
Processes and threads can be considered similar, but a big difference is that a process is much larger than a thread. For that reason, switching between processes is undesirable: there is too much information in a process that would have to be saved and reloaded each time the CPU decides to switch processes.
A thread, on the other hand, is smaller, and so it is better suited to switching. A process may have multiple threads that run concurrently, meaning not at the exact same time, but running together with switches between them. The context switching here is cheaper because a thread doesn't have as much information to store and reload.
If you only have a single core, then you can, for the most part, only do concurrent execution. Once you have multiple cores, threads can run on several cores and you get simultaneous execution. It is up to the operating system to schedule when threads run, when processes run, when to switch, and how to switch them. The operating system gives you the illusion that work is being done simultaneously when this is not always the case.
If you have more confusion feel free to comment.
A process is a thing very closely tied to the operating system (OS). A thread, in the simplest terms, is an executing sequence of instructions; one or more threads run in the context of a process. The Java Virtual Machine (JVM) is a process in your OS.
And inside the JVM you can have multiple threads running concurrently.
The processor is a resource of your machine, like memory. Your OS lets your processes share the available resources, in our simple case processors and memory.
When you develop in Java, all the processors in your machine are available resources.
When you design your solution, you can even have multiple Java processes (i.e. multiple JVMs), each running one thread or several. But this mostly depends on your problem.
The real difference between a process and a thread is that while both contain an executing program, threads share the same memory. This lets your threads, in theory, work on the same data, but you pay for it with the complexity of concurrency and synchronisation.
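The same point holds in Python as in Java; a minimal sketch of mine (threads see the same list, while a child process only changes its own copy):

import threading
from multiprocessing import Process

data = []

def append_item():
    data.append("added")

if __name__ == "__main__":
    t = threading.Thread(target=append_item)
    t.start(); t.join()
    print("after thread:", data)   # ['added'] - threads share memory

    p = Process(target=append_item)
    p.start(); p.join()
    print("after process:", data)  # still ['added'] - the child changed a copy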
Each CPU core runs only one thread of a process at a time. However, the OS can stop and save a thread, then load and run another, very quickly (in as little as 0.0001 seconds). This gives the illusion that many threads are running at once, even though only one is running at any given moment.

multithreading and multitasking on a single-core processor vs a multicore processor

Definitions:
Process (task): a program in execution, e.g. Notepad.
Thread: a thread is a single sequence of instructions. A process consists of one or more threads (but only one can execute at a time).
According to the lecture, a single-core processor can run a single process (task) at a time. Only one thread can execute at a time, but the operating system achieves multithreading using time slicing (thread context switching). This thread switching happens frequently enough that the user perceives the threads as running at the same time (though they aren't running in parallel!), and it occurs inside one process. A process context switch is similar to a thread context switch, with the difference that it takes place between processes (for example, between the media player and Notepad) instead of between threads.
I'm not sure if this example is valid: take two processes, e.g. Notepad and a media player, on a single-core processor. One can play music and write in Notepad at the same time, although the two processes aren't running in parallel (process context switching, or multitasking). Inside one process, e.g. the media player, one can listen to music and create playlists at the same time, although the two threads aren't running in parallel (thread context switching, or multithreading).
1st question: is my information above right?
2nd question: would the execution of threads on a multicore processor look the same as on a single core, with the difference that threads of different processes can run in parallel? Is multithreading here running multiple threads simultaneously on different processors, or switching between threads on one core? The same question goes for multitasking.
How would the process context switch and the thread context switch take place in this case?
3rd question: the professor used the term "single-threaded processor". Is this term another name for a single-core processor?
Or is it rather this: several threads belonging to the same process can be executed on several CPU cores simultaneously, and time slicing still happens on multicore systems. Say you have a process with 20 threads running on a quad-core: the OS still has to schedule all those threads to run on only 4 cores.
A single-threaded process runs on only one core at a time, but that doesn't mean it will run on the same core until it exits. The OS might give it a time slice to run on core 1 now, pause it, and give it another time slice on core 2 later.
Note: I read a lot of books and googled enough before deciding to ask here.
Yes, you seem to have a good understanding of this topic. However, you also seem to overthink it. I suggest a simpler way of understanding the way it works on modern systems (it gets truly wild-west when you look further back, with ideas such as lightweight processes, but I won't talk about that here).
The process is a shell. Its only purpose in life is to provide an environment for threads. Only threads are actually executed; the process itself never is. A single process can host multiple threads within it, and when it hosts only one thread, people say the process is executed, but that is simply a manner of speaking. A CPU can only execute a thread, never a process.
Your professor, as professors often do, makes a misleading statement. There is no such thing as a single-threaded processor. There are single-core and multicore processors, and those processors can be joined together to provide a multi-processor environment. From the application developer's perspective, a single CPU with 4 cores does not differ from 4 single-core CPUs. There are differences, of course, but usually not for the application developer.
"Multitasking" is a layman's term. It can mean whatever one wants it to mean, and it is better avoided in non-specific contexts.
I hope I have clarified your confusion.
The answers to your questions are the following:
Q1: On a single-core processor, two tasks can't run in parallel in the form of two (processor) instructions executing at the same time; the only possible form of multithreading is time slicing, realized by the task scheduler (of the OS), so in that case you are approximately right. I would complete your view of the subject with the fact that nowadays almost no application is single-threaded. I don't know whether Notepad uses multiple threads, but I'm pretty sure a media player is multithreaded, and the task scheduler schedules time slices between threads, not processes. (Fun fact: a single-threaded .NET application already runs 4-5 threads.)
Q2: The task scheduler on any system tries to spread the load across the available cores, so time slices will most likely work as you pictured above, but if a process starts an additional thread, that thread will be executed on the core with the least load over it. Multiple cores also mean that multiple (processor) instructions can and will be executed at the same time.
Q3: In practice, "multithreaded processor" and "multicore processor" mean something very similar, but not the same. For example, Intel Core i3/i5/i7 CPUs are equipped with an internal pseudo-task-scheduler (hyper-threading) which doubles the number of virtual cores by scheduling the execution of two threads on the same core; my i5 system, for example, has 2 cores but 4 hardware threads.
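You can see this from Python: os.cpu_count() reports logical cores (hardware threads), which on a hyper-threaded CPU is typically double the physical core count.

import os

print(os.cpu_count())  # e.g. 4 on a 2-core, 4-thread i5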
Most of your concepts seem valid, though stated in non-standard terms.
Here is an explanation of what threads and processes are, and then of multithreading.
A process is a running instance of a program: true.
When there were no threads, resources were distributed only among processes.
Now that processes have threads, resources are distributed to threads, but the isolation stays at the process level, meaning two processes still need IPC to communicate with each other. You can think of threads as lightweight processes that can be scheduled by the operating system. Multithreading is an extension of multitasking, so if there is one core and two processes, one with 2 threads and one with 4 threads, the contention for the core is between 6 threads, not 2 processes.
For thread switches vs process switches, see "thread context switch vs process context switch".

erlang threading and OS threads correlation [duplicate]

Erlang is known for being able to support MANY lightweight processes; it can do this because these are not processes in the traditional sense, nor even threads as in pthreads, but threads entirely in user space.
This is all well and good (fantastic, actually). But how, then, are Erlang's threads executed in parallel in a multicore/multiprocessor environment? Surely they have to be mapped to kernel threads somehow in order to be executed on separate cores?
Assuming that is the case, how is it done? Are many lightweight processes mapped to a single kernel thread?
Or is there another way around this problem?
The answer depends on the VM that is used:
1) non-SMP: there is one scheduler (an OS thread) which executes all Erlang processes, taken from the pool of runnable processes (i.e. those that are not blocked by e.g. receive).
2) SMP: there are K schedulers (OS threads; K is usually the number of CPU cores) which execute Erlang processes from a shared process queue. It is a simple FIFO queue (with locks to allow simultaneous access from multiple OS threads).
3) SMP in R13B and newer: there will be K schedulers (as before) which execute Erlang processes from multiple process queues. Each scheduler has its own queue, so logic for migrating processes from one scheduler to another will be added. This solution improves performance by avoiding excessive locking on a shared process queue.
For more information see this document prepared by Kenneth Lundin, Ericsson AB, for Erlang User Conference, Stockholm, November 13, 2008.
I want to amend the previous answers.
Erlang, or rather the Erlang runtime system (ERTS), defaults the number of schedulers (OS threads) and the number of run queues to the number of processing elements on your platform, that is, processor cores or hardware threads. You can change these settings at runtime using:
erlang:system_flag(schedulers_online, NP) -> PrevNP
Erlang processes do not have any affinity to particular schedulers. The logic balancing processes between schedulers follows two rules: 1) a starving scheduler will steal work from another scheduler; 2) migration paths are set up to push processes from schedulers with many processes to schedulers with less work. This is done to ensure fairness in reduction count (execution time) for each process.
Schedulers, however, can be locked to specific processing elements. This is not done by default. To let ERTS handle the scheduler-to-core affinity, use:
erlang:system_flag(scheduler_bind_type, default_bind) -> PrevBind
Several other bind types can be found in the documentation. Using affinity can greatly improve performance in heavy-load situations, especially where lock contention is high. Also, the Linux kernel cannot handle hyperthreads well, to say the least, so if you have hyperthreads on your platform you should really use this feature in Erlang.
I'm purely guessing here, but I'd imagine there is a small number of threads which pick processes from a common process pool for execution. Once a process hits a blocking operation, the thread executing it puts it aside and picks another. When an executing process causes another process to become unblocked, that newly unblocked process gets placed back into the pool. I suppose a thread might also stop executing a process at certain points, even when it isn't blocked, in order to serve other processes.
I would like to add some input to what was described in the accepted answer.
The Erlang scheduler is an essential part of the Erlang runtime system and provides its own abstraction and implementation of lightweight processes on top of OS threads.
Each scheduler runs within a single OS thread. Normally, there are as many schedulers as there are CPU cores on the hardware (this is configurable, though, and naturally brings little value when the number of schedulers exceeds the number of hardware cores). The system can also be configured so that a scheduler will not jump between OS threads.
Now, when an Erlang process is created, it is entirely the responsibility of ERTS and the scheduler to manage its life cycle, resource consumption, memory footprint, and so on.
One of the core implementation details is that each process has a time budget of 2000 reductions available when the scheduler picks it up from the run queue. Every kind of progress in the system (even I/O) is guaranteed to have a reduction budget. This is what actually makes ERTS a system with preemptive multitasking.
I recommend a great blog post on this topic by Jesper Louis Andersen: http://jlouisramblings.blogspot.com/2013/01/how-erlang-does-scheduling.html
As a short answer: Erlang processes are not OS threads and do not map to them directly. Erlang schedulers are what run on the OS threads, and they provide a smart implementation of the more finely grained Erlang processes while hiding those details from the programmer.
