Multiprocessing vs multithreading misconception? - multithreading

From my understanding, multithreading means under one process, multiple threads that containing instructions, registers, stack, etc,
1, run concurrently on single thread/core cpu device
2, run parallelly on multi core cpu device (just for example 10 threads on 10 core cpu)
And multiprocessing I thought means different processes run parallelly on multi core cpu device.
And today after reading an article, it got me thinking if I am wrong or the article is wrong.
https://medium.com/better-programming/is-node-js-really-single-threaded-7ea59bcc8d64
Multiprocessing is the use of two or more CPUs
(processors) within a single computer system. Now, as there are
multiple processors available, multiple processes can be executed at a
time.
Isn't it the same as a multithreading process that runs on a multi core cpu device??
What did I miss? or maybe it's me not understanding multiprocessing fully.

Multiprocessing means running multiple processes in accordance to the operating system scheduling algorithm. Modern operating systems use some variation of time sharing to run user process in a pseudo-parallel mode. In presence of multiple cpus, the OS can take advantage of them and run some processes in real parallel mode.
Processes in contrast to threads are independent from each other in respect of memory and other process context. They could talk to each other using Inter Process Communication (IPC) mechanisms. Shared resources can be allocated for the processes and require process level synchronization to access them.
Threads, on the other hand share the same memory location and other process context. They can access the same memory location and need to be synchronized using thread synchronization techniques, like mutexes and conditional variables.
Both threads and processes are scheduled by the operating system in similar manner. So, the quote you provided is not completely correct. You do not need multiple cpus for multi-processing, however you need them to allow few processes to really run at the same time. There could be as many processes as cores which run simultaneously, however other processes will share the cpus as well in time-sharing manner.

Related

Can you run NodeJs parallelly in a single-core CPU?

I know that a single-core CPU (typically) will be able to have 2 threads running. So does this means you can have NodeJs running parallelly in a single-core CPU?
First off, nodejs only runs your Javascript in a single thread, regardless of how many CPUs there are (assuming there are no WorkerThreads being used). It may use some other threads internally for the implementation of some library functions (like file operations or asynchronous crypto operations). But, it only ever uses a single thread/CPU to execute your Javascript instructions.
So does this means you can have NodeJs running parallelly in a single-core CPU?
That depends upon what you mean by "parallelly".
Your operating system supports lots of threads, even with only a single CPU. But, when you only have a single CPU, those threads get time-sliced across the single CPU such that none of them are ever actually running at the same time.
One thread gets to run for a short time, then the OS suspends that thread, context switches to another thread, runs it for a short time, suspends that thread, context switches to another thread and so on.
So, the one CPU is "shared" among multiple threads, but no each thread is still running one at a time (typically for short periods of time).
The more CPUs you have, the more threads can run simultaneously where there is true parallel execution.
This is all managed by the OS, independent of nodejs or any particular app running on the computer. Also, please be aware that a typical modern OS has a lot of services running in the OS. Each of these services may also have their own threads that needs to use the CPU from time to time in order to keep the OS and its services running properly. For example, you might be doing a file system backup while typing into your word processor, while running a nodejs app. That can all happen on a single CPU by just sharing it between the different threads that all want to have some cycles. Apps or services that need lots of CPU to do their job will run more slowly when a single CPU is being shared among a bunch of different uses, but they will all still proceed via the time-slicing.
Time-slicing on a single CPU will give the appearance of parallel execution because multiple threads can appear to be making progress, but in reality, one thread may run for a few milliseconds, then the OS switches over to another thread which runs for a few milliseconds and so on. Tasks get done in parallel (at a somewhat slower rate) even though both tasks are never actually using the CPU at exactly the same moment.

How to ensure node.js process will run on different thread?

So for example there would be service 1 that runs on http://127.0.0.1:5000 that runs on thread 1.
And I would like to run service 2 that would run on http://127.0.0.1:5001 that would run on any thread but not on thread 1.
Is it possible to do something like that?
First off, I think you meant to say "CPU core" instead of "thread". Code runs in a thread and a thread runs on a CPU core when it is running. A process may contain one or more threads. In fact, a nodejs process contains several threads, one thread for running your Javascript, but other threads are involved in running the overall nodejs process.
Which CPU core a given thread runs on is up to the operating system.
Normally with a multi-core CPU, two processes that are trying to run at the same time will be assigned to different CPU cores. This is a dynamic thing inside the OS and can change from time to time as different threads/processes are time sliced. Processes of any kind (including nodejs processes) are not hard bound to a particular core and threads within those processes are not hard bound to a particular core either.
The operating system will decide based on which threads in which processes are vying for time to run how to allocate CPU cores to each thread and it is a dynamically changing assignment depending upon demand. If more threads are trying to run than there are cores, then the threads will each get slices of time on a CPU core and they will all share the CPU cores, each making progress, but not getting to hog a CPU core all to themselves.
If your two services, one running on port 5000 and one running on port 5001 are both nodejs apps, then the operating system will dynamically allocate CPU cores upon demand to each of them. Neither of those two service processes are bound to a specific core. If they are both heavily busy at the same time and you have a multi-core CPU and there's not a lot else in computer also contending for CPU time, then each service's main thread that runs your Javascript will have a different CPU core to run on.
But, keep in mind that this is a dynamic assignment. If you have a four core CPU and all of a sudden several other things start up on your computer and are also contending for CPU resources, then the CPU cores will be shared across all the threads/processes contending for CPU resources. The sharing is done via rotation in small time slices and can incorporate a priority system too. The specific details of how that works vary by operating system, but the principle of "time-sharing" the available CPU cores among all those threads requesting CPU resources is the same.

What is the difference between Child_process and Worker Threads?

I am trying to understand Threading in NodeJS and how it works.
Currently what i understand:
Cluster: -
Built on top of Child_process, but with TCP distributed between clusters.
Best for distributing/balancing incoming http requests, while bad for cpu intensive tasks.
Works by taking advantage of available cores in cpu, by cloning nodeJS webserver instances on other cores.
Child_process:
Make use also of different cores available, but its bad since it costs huge amount of resources to fork a child process since it creates virtual memory.
Forked processes could communicate with the master thread through events and vice versa, but there is no communication between forked processes.
Worker threads:
Same as child process, but forked processes can communicate with each other using bufferArray
1) Why worker threads is better than child process and when we should use each of them?
2) What would happen if we have 4 cores and clustered/forked nodeJS webserver 4 times(1 process for each core), then we used worker threads (There is no available cores) ?
You mentioned point under worker-threads that they are same in nature to child-process. But in reality they are not.
Process has its own memory space on other hand, threads use the shared memory space.
Thread is part of process. Process can start multiple threads. Which means that multiple threads started under process share the memory space allocated for that process.
I guess above point answers your 1st question why thread model is preferred over the process.
2nd point: Lets say processor can handle load of 4 threads at a time. But we have 16 threads. Then all of them will start sharing the CPU time.
Considering 4 core CPU, 4 processes with limited threads can utilize it in better way but when thread count is high, then all threads will start sharing the CPU time. (When I say all threads will start sharing CPU time I'm not considering the priority and niceness of the process, and not even considering the other processes running on the same machine.)
My Quick search about time-slicing and CPU load sharing:
https://en.wikipedia.org/wiki/Time-sharing
https://www.tutorialspoint.com/operating_system/os_process_scheduling_qa2.htm
This article even answers how switching between processes can slow down the overall performance.
Worker threads are are similar in nature to threads in any other programming language.
You can have a look at this thread to understand in overall about
difference between thread and process:
What is the difference between a process and a thread?

will schedule threads of the same process to different cores benefitial

Well, I'm currently working on Linux scheduler, and i wonder if there is a situation that threads run on different cores will accelerate process in Linux. i already heard that pin threads of the same process to different cores will make cache 'hot', thus it's beneficial. however, if not pinned, but try to dynamic binding threads to different cores, will it have any benefits or what's the pitfalls? Thanks!

Is a multi-user and multi-processor environment useful with threading?

Taking CPU affinity into account, will such an environment be useful with threading? Or will there be a performance degradation in such a system, if multiple users login and spawn multiple kernel and user threads?
When you say "taking CPU affinity into account" - are you saying that all processes have CPU affinity in this hypothetical system? Or is that just as one extra possible bit of information?
Using multiple threads will slow things down a bit if the system is already loaded (so there are more runnable threads than cores) but if there are often times where there are only (say) 2 users and 4 cores available, threading may help.
Another typical use for threads is to do something "in the background" whether that's explicitly using threads or using async calls. At that point multi-threading can definitely give a benefit (e.g. a non-hanging UI) without actually using more than one core simultaneously for much of the time.

Resources