Parallelism and concurrency in multiprocess and multithreading

Hello, I'm studying operating systems. I understand the difference between parallelism and concurrency, but a few points still confuse me, so I'd like some help. Thank you!
What I know is that threads run in parallel in multithreading, yet there is context switching among threads. How is that possible? Does it happen when they access shared resources?
Take the case of 4 cores and 8 threads. Are processes running in parallel or concurrently? If they run concurrently, processes switch with each other, but only 2 threads are running on the CPU at any one time, right?
I heard that coroutines are concurrent, which means they don't share any resources, so how can race conditions still happen there?

Context switching is required because we usually run more threads than we have CPU cores. And yes, it often happens when the currently running thread has to wait for something (I guess this is what you meant by "shared resources"): the thread goes to sleep and another one is woken up. But a context switch can also happen at almost any time. The OS scheduler is responsible for sharing CPU cores among threads so that one or a few threads don't consume all the resources.
"Are processes running in parallel or concurrently?" - processes, or rather threads? Threads run both concurrently and in parallel, but the parallelism is limited to 4. And no, there is only one thread running on a specific core at a time (I'm ignoring hyper-threading here).
I have no idea what you mean here. Coroutines are concurrent in a similar way to threads, and resource sharing poses a similar challenge when using them.
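To make the last point concrete, here is a minimal sketch (my own illustration, not code from the question) of a race between coroutines on a single thread. The names `unsafe_increment`, `safe_increment` and `run` are made up for the example. The race comes from the suspension point between the read and the write, not from parallel execution.

```python
import asyncio

counter = 0  # state shared by every coroutine on the event loop


async def unsafe_increment():
    global counter
    current = counter          # read the shared value
    await asyncio.sleep(0)     # suspension point: other coroutines may run here
    counter = current + 1      # write back a possibly stale value


async def safe_increment(lock):
    global counter
    async with lock:           # one coroutine at a time does the read-modify-write
        current = counter
        await asyncio.sleep(0)
        counter = current + 1


async def run(n, safe):
    global counter
    counter = 0
    lock = asyncio.Lock()
    if safe:
        jobs = [safe_increment(lock) for _ in range(n)]
    else:
        jobs = [unsafe_increment() for _ in range(n)]
    await asyncio.gather(*jobs)
    return counter


if __name__ == "__main__":
    print("without lock:", asyncio.run(run(100, safe=False)))
    print("with lock:   ", asyncio.run(run(100, safe=True)))
```

Without the lock, updates are lost and the final count ends up below 100. With the lock, every increment is kept.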

Related

How does multithreading utilize multiple cores?

So recently I've learned some basics about multithreading. What I've understood is that a thread is a lightweight process that runs under a process and shares its memory, while one process is running under one CPU core.
Yet from this perspective I can't understand the claim that threads utilize multiple cores and make the whole program execute more effectively. From what I know, threads created by one process should run only under that specific process, which means that it should only run under that very one CPU core. If we want to utilize multiple cores, we should actually use multiple processes to run in parallel. Most of what I've found only states the conclusion, i.e. that multithreading utilizes multiple cores, but none of it answers my question. Is my thinking wrong somewhere? Thanks!

Your confusion lies here:
[...] while one process is running under one CPU core.
[...] threads created by one process should run only under that specific process, which means that it should only run under that very one CPU core.
This is not true. I think what the various explanations you have read meant is that any process has at least one thread (where a 'thread' is a sequence of instructions run by a CPU core).
If you have a multithreaded program, the process will have several threads (sequences of instructions run by a CPU core) that can run concurrently on different CPU cores.
There are many processes executing on your computer at any given time. The operating system (OS) is the program that allocates the hardware resources (CPU cores) to all these processes and decides which process can use which cores, and for how long, before another process gets to use the CPU. Whether or not a process gets to use multiple cores is not entirely up to the process. More confusing still, multithreaded programs can use more threads than there are cores on the computer's CPU. In that case you can be certain that not all of your threads run in parallel.
One more thing:
[...] threads utilize multiple cores and make the whole program execute more effectively
I am going to sound very pedantic, but it is more complicated than that. It depends on what you mean by "effective". Are we talking about total computation time, energy consumption...?
A sequential (1 thread) program may be very effective in terms of power consumption but take a very long time to compute. If you are able to use multiple threads, you may be able to reduce that computation time, but it will probably incur new costs (synchronization between threads, additional protection mechanisms against concurrent accesses...).
Also, multithreading cannot help with certain tasks that fall outside of the CPU realm. For example, unless you have some very specific hardware support, reading a file from the hard drive with 2 or more concurrent threads cannot be parallelized efficiently.
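As a small illustration of the "one process, several threads" point, here is a sketch in Python (the helper names `report` and `run_threads` are made up for the example). Every thread reports the same process ID but its own thread ID. Which core each thread lands on is up to the OS scheduler and is not shown here. Note also that in standard CPython the GIL keeps the threads from executing Python bytecode in parallel, even though they are real OS threads.

```python
import os
import threading


def report(results, index, barrier):
    # Wait until every thread has started, so all of them are alive at once
    # and the OS cannot recycle a thread id between them.
    barrier.wait()
    results[index] = (os.getpid(), threading.get_ident())


def run_threads(n=4):
    results = [None] * n
    barrier = threading.Barrier(n)
    threads = [
        threading.Thread(target=report, args=(results, i, barrier))
        for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("cores reported by the OS:", os.cpu_count())
    for pid, tid in run_threads():
        print(f"process {pid}, thread {tid}")
```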

How do multiprocessing, threading and thread pooling work?

https://code.tutsplus.com/articles/introduction-to-parallel-and-concurrent-programming-in-python--cms-28612
I have studied the article at this link and have a few questions.
Q1: How are the thread pool (concurrent) and threading different here? Why do we see the performance improvement? Threading with a queue uses 4 threads; each runs cooperatively during idle time and picks the next item from the queue once it gets the website response. As I see it, the thread pool does much the same: a worker completes its work and waits for the manager to assign a task, which is very similar to picking a new item from the queue. I'm not sure how this is different and why I see the performance improvement. It seems I'm misinterpreting the pooling here. Could you explain?
Q2: Using multiprocessing, the time taken is greater. If I have a multiprocessor that can handle multiple processes at a time, then all 4 of my processes should be handled at once. That is where real parallelization happens. Also, I have a question here: since the 4 processes are running the same function, doesn't the GIL try to stop them from executing the same piece of code? Let's suppose all of them share a common variable that gets updated, like the number of websites checked. So how does the GIL work in these cases of multiprocessing?
Also, are the same processes used again and again, or do they get killed and created every time after their job? I think the same processes are used. I also think the performance problem comes from process creation, which is costly compared to the lightweight threads of the concurrent threading phase. So could you explain in more detail how the GIL works here and how the processes run? Are they running cooperatively (each process waiting for its turn, like threads in a process do), or are these processes using the multiprocessors to run truly in parallel?
My other question: if I have an 8-core machine, I think I can run 8 threads of the same process simultaneously, in parallel. If I have the 8-core machine, can I run 2 processes with 4 threads each? Can I run 8 processes on 8 cores? I think cores are only for threads of a process, which means I can't run 8 processes on 8 cores, but I can run as many processes as I have CPUs in a multiprocessor system. Am I right? So can I run 2 processes with 4 threads each on my 8-core machine with 2 multiprocessors of 4 cores each?

Python has a rich set of libraries for multitasking with processes and threads. However, the libraries overlap, and the choice depends on how abstractly you view the computational tasks. For example, the concurrent.futures library views threads as asynchronous tasks, while the threading library deals with them as high-level threads. Further, the _thread module implements a low-level interface for threading, exposing the synchronization mechanisms.
The GIL (Global Interpreter Lock) is just a synchronization primitive, specifically a mutex, which prevents multiple threads of the same process from executing Python bytecode at the same time (so that objects which must remain consistent under concurrent operations stay that way). This is exactly why Python threads do well on I/O operations in terms of speed when compared to compute-intensive tasks: the GIL is released during certain blocking calls and inside some computationally intensive libraries such as numpy. Note that only the CPython and PyPy implementations of Python are constrained by the GIL mechanism.
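A rough way to see the I/O side of this is the sketch below. It uses `time.sleep` as a stand-in for a blocking network call (an assumption for the demo; the helper names `fake_io` and `timed` are made up). Because the GIL is released while a thread sleeps, four threads that each block for 0.2 s finish in roughly 0.2 s overall rather than 0.8 s.

```python
import threading
import time


def fake_io(delay):
    # Blocking call standing in for a network request; CPython releases
    # the GIL while the thread is asleep.
    time.sleep(delay)


def timed(n_threads, delay):
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io, args=(delay,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"4 threads, 0.2 s each: {timed(4, 0.2):.2f} s total")
```

The exact timing depends on the machine, so treat the number as indicative only. A CPU-bound loop in place of `time.sleep` would not show the same overlap under the GIL.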
Now, let's see those questions...
How thread pool (Concurrent) and threading are different here? Why do we see the performance improvement?
Coming to the comparison between threading and concurrent.futures.ThreadPoolExecutor (aka threading_squirrel vs future_squirrel), I've executed both programs with the same test case. There are two factors that contribute to this "performance improvement":
Network HEAD requests: Remember that network operations need not complete in the same amount of time every time you execute them... due to the very nature of packet transfer delays...
Order of thread execution: In the article you've linked, the author first creates all the threads, fills the queue with website links, and then starts all the threads in a list comprehension. In concurrent.futures' ThreadPoolExecutor, each time a task is submitted, a thread is assigned to it if the predefined maximum number of threads/workers has not been reached. I've changed the code to mirror this technique. It seems to give a speedup, as the first thread begins work early on and doesn't need to wait for the queue to be filled up...
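For reference, the two styles being compared look roughly like this. This is a simplified sketch, not the article's code: `SITES` holds placeholder names rather than real URLs, and `check` sleeps briefly instead of sending a HEAD request.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor

SITES = [f"site-{i}" for i in range(8)]  # hypothetical placeholders, not real URLs


def check(site):
    time.sleep(0.01)  # stands in for the network HEAD request
    return f"{site}: ok"


def with_queue(sites, workers=4):
    # Style 1: fill a queue first, then start hand-made worker threads.
    q = queue.Queue()
    for site in sites:
        q.put(site)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                site = q.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            outcome = check(site)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def with_pool(sites, workers=4):
    # Style 2: the executor creates worker threads as tasks are submitted.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check, sites))


if __name__ == "__main__":
    print(sorted(with_queue(SITES)))
    print(with_pool(SITES))
```

Both versions do the same work with the same number of threads. What differs is when the threads start relative to the work being queued, plus how much bookkeeping you write yourself.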
How does GIL work in these cases of multiprocessing?
Remember that the GIL comes into effect for threads of a process only, not among processes. The GIL locks the whole interpreter while a thread executes bytecode, so the other threads have to wait for their turn. This is the reason multiprocessing uses processes instead of threads: each process has its own interpreter and, consequently, its own GIL.
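On the "common variable" part of the question, here is a hedged sketch of what happens with a counter such as the number of websites checked. It assumes a Unix-like OS where the "fork" start method is available, and the names `plain_counter`, `worker` and `run_demo` are made up for the example. A plain global is copied into each process, so updates made in the workers never reach the parent. A `multiprocessing.Value` lives in shared memory and is guarded by its own lock, which has nothing to do with the GIL.

```python
import multiprocessing as mp

plain_counter = 0  # ordinary global: every process works on its own copy


def worker(shared, n):
    global plain_counter
    for _ in range(n):
        plain_counter += 1        # changes only this process's private copy
        with shared.get_lock():   # cross-process lock guarding the shared int
            shared.value += 1


def run_demo(workers=4, n=1000):
    # Assumption: "fork" is available (Unix-like systems).
    ctx = mp.get_context("fork")
    shared = ctx.Value("i", 0)    # integer stored in shared memory
    procs = [ctx.Process(target=worker, args=(shared, n)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # The parent sees the shared total, but its own plain_counter is untouched.
    return shared.value, plain_counter


if __name__ == "__main__":
    total, parent_copy = run_demo()
    print("shared counter:", total)
    print("parent's plain counter:", parent_copy)
```

With 4 workers and 1000 increments each, the shared counter reaches 4000 while the parent's plain counter stays at 0.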
Are the same processes used again and again or they get killed and created every time after their job?
The concept of pooling is to reduce the overhead of creating and destroying workers (be they threads or processes) during computation. However, the processes are kind of "brand new" in the sense that the library effectively asks the OS to perform a fork on a Unix-based OS or a spawn on an NT-based OS...
Also, are the processes running co-operatively?
Maybe. They have to co-operate if they use shared memory... (they need not be running together). There is definitely going to be a context switch if there are more processes than the OS can allocate to its processors' cores. They can run in parallel if there are no shared memory updates to make.
If I have the 8 core machine can I run 2 processes with 4 threads each? Can I run 8 processes on 8 cores?
Sure (subject to the GIL, in Python). Each process can be allocated to its own processing unit for execution. A processing unit can be a physical or a virtual core of a CPU. As long as the OS scheduler supports it, it's possible. Any reasonable split of processes and threads is possible. If all of them can be allocated, that's the best situation; otherwise you will encounter context switches... (which are more expensive when it comes to processes)
Hope I've answered all those questions!
Here are a few resources:
MultiCore CPUs, Multithreading and context switching?
Why does multiprocessing use only a single core after I import numpy?
Bonus celery-squirrel resource

Process vs thread with example

I read articles on processes vs threads, but I am still not clear on the difference.
Suppose a process is using the CPU/Processor, doing some big calculation that takes 10 minutes. How will another process run at the same time in parallel? In a single core vs a dual core processor?
Same thing for threads, how will another thread run in parallel when the CPU/Processor is engaged with another thread?
How is context switching different for threads and for processes? I mean both process and threads use the same RAM memory, so what's the difference?

From my vague memory of Operating Systems I can offer you a little bit of help. First you have to know the difference between concurrent and simultaneous. They are not the same thing; simultaneous means both things occur at the same time and concurrent means they appear to be running simultaneously but in reality they're switching so fast you can't tell.
Processes and threads can be considered similar, but a big difference is that a process is much larger than a thread. For that reason, it is not good to have switching between processes. There is too much information in a process that would have to be saved and reloaded each time the CPU decides to switch processes.
A thread on the other hand is smaller and so it is better for switching. A process may have multiple threads that run concurrently, meaning not at the same exact time, but run together and switch between them. The context switching here is better because a thread won't have as much information to store/reload.
If you only have a single core then you can only do concurrent execution, for the most part. Once you have multiple cores you can have threads run on both cores and thus have simultaneous execution. It is up to the Operating System to schedule when threads run, when processes get to run, when to switch, how to switch them, etc. The Operating System gives you the illusion that work is being done simultaneously when this is not always the case.
If you have more confusion feel free to comment.

A process is something closely tied to the operating system (OS). A thread, in the simplest terms, is an executing program. One or more threads run in the context of a process. The Java Virtual Machine (JVM) is a process in your OS.
And inside the JVM you can have multiple threads running concurrently.
The processor is a resource of your machine, like memory. Your OS lets your processes share the available resources, in our simple case processors and memory.
When you develop in Java, all the processors in your machine are available resources.
When you develop your solution, you can even have multiple Java processes (i.e. multiple JVMs), each running a single thread or multiple threads. But this mostly depends on your problem.
The real difference between a process and a thread is that both have an executing program, but threads share the same memory. This lets your threads theoretically work on the same data, but you have to pay for the complexity of concurrency and synchronisation.
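The shared-memory point can be sketched in a few lines. The example is in Python rather than Java, and the function name `count_with_threads` is made up. All threads update the same object, and the lock is the synchronisation cost mentioned above.

```python
import threading


def count_with_threads(n_threads=4, n_increments=10_000):
    total = {"value": 0}      # one object, visible to every thread in the process
    lock = threading.Lock()

    def work():
        for _ in range(n_increments):
            with lock:        # serialise the read-modify-write on shared data
                total["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


if __name__ == "__main__":
    print(count_with_threads())
```

Because every update happens under the lock, the result is always `n_threads * n_increments`.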

Each CPU runs only one thread of a process at a time. However, the OS can stop and save a thread, then load and run another, very quickly (in as little as 0.0001 seconds). This gives the illusion that many threads are running at once, even though only one is running.

How does multithreading work on a single-core system?

When multiple threads run on a single-core system, are those threads running simultaneously, or sequentially with a fast context switch (which gives a feeling of threads running simultaneously)?
Thanks

Many modern processors adopt techniques that allow them to execute several threads on a single core. Such techniques are called simultaneous multithreading (SMT). For instance, "Hyper-Threading" is Intel's implementation of SMT.
SMT implies that a core can fetch and execute two or more instructions from different threads simultaneously, in one cycle. If the OS also knows how to work with SMT, it can schedule threads in a way that actually allows executing different threads on the same core simultaneously. In some cases it might give nearly the same boost as executing threads on two (or more in some processors) cores.
Otherwise, it's only context switching.

With a single CPU core, different threads don't literally run simultaneously, but the OS can preempt one thread and let another thread run.
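A toy model of that interleaving, assuming nothing about any real scheduler: each "thread" is a Python generator, and a round-robin loop gives each one a single step before moving on. Real preemption is driven by the OS timer and can cut in anywhere, whereas here every `yield` is an explicit switch point. The names `task` and `round_robin` are made up for the sketch.

```python
from collections import deque


def task(name, steps):
    for i in range(steps):
        # Yielding stands in for the thread being switched out.
        yield f"{name} step {i}"


def round_robin(tasks):
    ready = deque(tasks)   # run queue of tasks waiting for the single "core"
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))  # run one step on the "core"
            ready.append(current)        # not finished: back of the queue
        except StopIteration:
            pass                         # finished: drop it
    return trace


if __name__ == "__main__":
    for line in round_robin([task("A", 2), task("B", 3)]):
        print(line)
```

Only one task runs at any instant, yet the trace alternates between A and B, which is the "feeling of simultaneity" the question asks about.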

TPL Tasks, Threads, etc

Could someone clear up to me how these things correlate:
Task
Thread
ThreadPool's thread
Parallel.For/ForEach/Invoke
I.e. when I create a Task and run it, where does it get a thread to execute on? And when I call Parallel.* what is really going on under the covers?
Any links to articles, blogposts, etc are also very welcomed!

The ideal state of a system is to have 1 actively running thread per CPU core. By defining work in more general terms of "tasks", the TPL can dynamically decide how many threads to use and which tasks to do on each one in order to come closest to achieving that ideal state. These are decisions that are almost always best made dynamically at runtime because when writing the code you can't know for sure how many CPU cores will be available to your application, how busy they are with other work, etc.

Thread: a real OS thread; it has a handle and an ID.
ThreadPool: a collection of already-created OS threads. These threads are owned and maintained by the runtime, and your code is only allowed to "borrow" them for a while. You can only do short-lived work on these threads, and you can't modify any thread state or delete the threads.
Best guesses on these two:
Task: might get run on a pre-created thread in the thread pool, or might get run as part of user-mode scheduling; this all depends on what the runtime thinks is best. Another guess: with TPL, the user-mode scheduling is NOT based on OS fibers, but is its own complete (and working) implementation.
Parallel.For: actually, no clue how this is implemented. The runtime might create new threads to do the parallel bits, or much more likely use the thread pool's threads for the parallelism.
