Each Tomcat server runs in its own JVM, and each JVM is a separate process in the operating system. I have deployed multiple applications in Tomcat, each with its own context and its own class loaders. If I run multiple threads in each of these applications, how does the operating system handle the thread switching, and how is the entire JVM process switched with other processes? How are JVM processes and Java threads related in terms of context switching? How does this work in most modern operating systems?
In Linux, threads are implemented in mostly the same way as processes. The scheduler therefore doesn't care much about processes; it switches between threads instead.
Now, a JVM is a process that typically has a lot of threads. Each of them is mapped one to one to a Linux kernel thread (a lightweight process). The scheduler therefore assigns time slices (the time a particular thread is allowed to run) regardless of which process (the JVM in your case) owns the thread. This means that if one JVM has ten times more runnable threads in total than another JVM, the first JVM gets correspondingly more CPU time than the other one.
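The one-to-one mapping can be observed from inside a process. The following is a minimal sketch in Python rather than Java, used only because it is short; the function name native_thread_ids is made up for this illustration. It assumes Python 3.8 or later, where threading.get_native_id() returns the thread ID assigned by the kernel.

```python
import os
import threading


def native_thread_ids(count=4):
    """Start `count` threads and collect the kernel thread ID each one reports."""
    ids = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all of them have recorded
    # their ID, so the kernel cannot hand the same ID out twice in this run.
    barrier = threading.Barrier(count)

    def record():
        with lock:
            ids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=record) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("process ID:", os.getpid())
    print("main thread kernel ID:", threading.get_native_id())
    print("worker thread kernel IDs:", native_thread_ids())
```

All threads report the same process ID but different kernel thread IDs, and those kernel IDs are what the scheduler works with.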
You can affect this behaviour in a number of different ways:
You can change the scheduler algorithm in your OS.
You can change the priority of a particular thread. It will then get more CPU time than other threads, both within the same JVM and in other JVMs. You can assign a priority both through Java and from the Linux terminal (nice command, or renice for something that is already running).
You can bind a particular thread to a set of CPUs with the taskset command. Each thread has its own kernel thread ID, which you can get with the jstack utility bundled with the JDK (it is shown there as nid).
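Binding a thread to a set of CPUs can also be done programmatically. The sketch below is an illustration in Python, not a JVM technique; the function name is hypothetical. It relies on os.sched_getaffinity and os.sched_setaffinity, which exist only on some Unix platforms (notably Linux), so the code returns None elsewhere. On Linux the underlying system call applies to the calling thread when it is given 0.

```python
import os


def pin_current_thread_to_one_cpu():
    """Pin the caller to a single CPU, report the new mask, then restore it.

    Returns the temporary CPU set, or None if the platform lacks the calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)   # 0 means "the caller"
    target = {min(original)}             # pick one CPU we are allowed to use
    os.sched_setaffinity(0, target)
    try:
        return os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)  # undo the pinning


if __name__ == "__main__":
    print("temporarily pinned to:", pin_current_thread_to_one_cpu())
```

This is roughly what taskset does from the shell, except that taskset can target any thread or process by its ID.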
I know that a single-core CPU (typically) will be able to have 2 threads running. So does this mean you can have NodeJs running parallelly in a single-core CPU?
First off, nodejs only runs your Javascript in a single thread, regardless of how many CPUs there are (assuming there are no WorkerThreads being used). It may use some other threads internally for the implementation of some library functions (like file operations or asynchronous crypto operations). But, it only ever uses a single thread/CPU to execute your Javascript instructions.
So does this mean you can have NodeJs running parallelly in a single-core CPU?
That depends upon what you mean by "parallelly".
Your operating system supports lots of threads, even with only a single CPU. But, when you only have a single CPU, those threads get time-sliced across the single CPU such that none of them are ever actually running at the same time.
One thread gets to run for a short time, then the OS suspends that thread, context switches to another thread, runs it for a short time, suspends that thread, context switches to another thread and so on.
So, the one CPU is "shared" among multiple threads, but the threads are still running one at a time (typically for short periods of time).
The more CPUs you have, the more threads can run simultaneously where there is true parallel execution.
This is all managed by the OS, independent of nodejs or any particular app running on the computer. Also, please be aware that a typical modern OS has a lot of services running in it. Each of these services may also have its own threads that need to use the CPU from time to time in order to keep the OS and its services running properly. For example, you might be doing a file system backup while typing into your word processor, while running a nodejs app. That can all happen on a single CPU by just sharing it between the different threads that all want to have some cycles. Apps or services that need lots of CPU to do their job will run more slowly when a single CPU is being shared among a bunch of different uses, but they will all still proceed via the time-slicing.
Time-slicing on a single CPU will give the appearance of parallel execution because multiple threads can appear to be making progress, but in reality, one thread may run for a few milliseconds, then the OS switches over to another thread which runs for a few milliseconds and so on. Tasks make progress concurrently (at a somewhat slower rate) even though two tasks are never actually using the CPU at exactly the same moment.
So for example there would be service 1 that runs on http://127.0.0.1:5000 that runs on thread 1.
And I would like to run service 2 that would run on http://127.0.0.1:5001 that would run on any thread but not on thread 1.
Is it possible to do something like that?
First off, I think you meant to say "CPU core" instead of "thread". Code runs in a thread and a thread runs on a CPU core when it is running. A process may contain one or more threads. In fact, a nodejs process contains several threads, one thread for running your Javascript, but other threads are involved in running the overall nodejs process.
Which CPU core a given thread runs on is up to the operating system.
Normally with a multi-core CPU, two processes that are trying to run at the same time will be assigned to different CPU cores. This is a dynamic thing inside the OS and can change from time to time as different threads/processes are time sliced. Processes of any kind (including nodejs processes) are not hard bound to a particular core and threads within those processes are not hard bound to a particular core either.
The operating system decides how to allocate CPU cores to threads based on which threads in which processes are vying for time to run, and the assignment changes dynamically with demand. If more threads are trying to run than there are cores, then the threads will each get slices of time on a CPU core and they will all share the CPU cores, each making progress, but not getting to hog a CPU core all to themselves.
If your two services, one running on port 5000 and one running on port 5001, are both nodejs apps, then the operating system will dynamically allocate CPU cores upon demand to each of them. Neither of those two service processes is bound to a specific core. If they are both heavily busy at the same time, you have a multi-core CPU, and there's not a lot else on the computer also contending for CPU time, then each service's main thread that runs your Javascript will have a different CPU core to run on.
But, keep in mind that this is a dynamic assignment. If you have a four core CPU and all of a sudden several other things start up on your computer and are also contending for CPU resources, then the CPU cores will be shared across all the threads/processes contending for CPU resources. The sharing is done via rotation in small time slices and can incorporate a priority system too. The specific details of how that works vary by operating system, but the principle of "time-sharing" the available CPU cores among all those threads requesting CPU resources is the same.
From my understanding, multithreading means that, under one process, multiple threads (each with its own instructions, registers, stack, etc.)
1. run concurrently on a single thread/core CPU device
2. run parallelly on a multi-core CPU device (just for example, 10 threads on a 10-core CPU)
And multiprocessing, I thought, means different processes running parallelly on a multi-core CPU device.
And today after reading an article, it got me thinking if I am wrong or the article is wrong.
https://medium.com/better-programming/is-node-js-really-single-threaded-7ea59bcc8d64
Multiprocessing is the use of two or more CPUs (processors) within a single computer system. Now, as there are multiple processors available, multiple processes can be executed at a time.
Isn't it the same as a multithreaded process that runs on a multi-core CPU device?
What did I miss? or maybe it's me not understanding multiprocessing fully.
Multiprocessing means running multiple processes according to the operating system's scheduling algorithm. Modern operating systems use some variation of time sharing to run user processes in a pseudo-parallel mode. In the presence of multiple CPUs, the OS can take advantage of them and run some processes in truly parallel mode.
Processes, in contrast to threads, are independent of each other with respect to memory and other process context. They can talk to each other using Inter Process Communication (IPC) mechanisms. Shared resources can be allocated for the processes and require process-level synchronization to access them.
Threads, on the other hand, share the same address space and other process context. They can access the same memory locations and need to be synchronized using thread synchronization techniques, like mutexes and condition variables.
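The difference between shared and separate memory can be sketched as follows. This is an illustration in Python with made-up helper names (bump, bump_in_thread, bump_in_process); it assumes a Unix-like system, because it asks multiprocessing for the "fork" start method, which is not available on Windows.

```python
import multiprocessing
import threading


def bump(state):
    """Increment a counter stored in a plain dictionary."""
    state["value"] += 1


def bump_in_thread(state):
    """Run bump() in a second thread of this process and return what we see."""
    worker = threading.Thread(target=bump, args=(state,))
    worker.start()
    worker.join()
    return state["value"]   # the thread changed OUR dictionary


def bump_in_process(state):
    """Run bump() in a forked child process and return what we see."""
    ctx = multiprocessing.get_context("fork")   # Unix-only assumption
    child = ctx.Process(target=bump, args=(state,))
    child.start()
    child.join()
    return state["value"]   # the child changed only its own copy


if __name__ == "__main__":
    state = {"value": 0}
    print("after thread: ", bump_in_thread(state))    # 1
    print("after process:", bump_in_process(state))   # still 1
```

The thread's write is visible to the caller, while the child process's write is not; to get a result back from the child, some IPC mechanism (a pipe, a queue, shared memory) would be needed.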
Both threads and processes are scheduled by the operating system in a similar manner. So, the quote you provided is not completely correct. You do not need multiple CPUs for multiprocessing; however, you need them to allow several processes to really run at the same time. As many processes as there are cores can run simultaneously, while the other processes share the CPUs in a time-sharing manner.
I am trying to understand Threading in NodeJS and how it works.
What I currently understand:
Cluster: -
Built on top of Child_process, but with TCP distributed between clusters.
Best for distributing/balancing incoming http requests, while bad for cpu intensive tasks.
Works by taking advantage of available cores in cpu, by cloning nodeJS webserver instances on other cores.
Child_process:
Also makes use of the different cores available, but it's bad because forking a child process costs a huge amount of resources, since it creates virtual memory.
Forked processes could communicate with the master thread through events and vice versa, but there is no communication between forked processes.
Worker threads:
Same as child process, but forked processes can communicate with each other using bufferArray
1) Why are worker threads better than child processes, and when should we use each of them?
2) What would happen if we have 4 cores and clustered/forked the nodeJS webserver 4 times (1 process for each core), and then used worker threads (with no cores left available)?
You mentioned under worker threads that they are the same in nature as child processes, but in reality they are not.
A process has its own memory space; threads, on the other hand, use a shared memory space.
A thread is part of a process. A process can start multiple threads, which means that multiple threads started under a process share the memory space allocated for that process.
I guess the above point answers your first question of why the thread model is preferred over the process model.
Second point: let's say the processor can handle a load of 4 threads at a time, but we have 16 threads. Then all of them will start sharing the CPU time.
Considering a 4-core CPU, 4 processes with a limited number of threads can utilize it better, but when the thread count is high, all threads will start sharing the CPU time. (When I say all threads will start sharing CPU time, I'm not considering the priority and niceness of the processes, nor the other processes running on the same machine.)
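As a rough worked example of the numbers above: under the same simplifications (equal priority, every thread always wanting to run, nothing else on the machine), each thread's share of a core is about the number of cores divided by the number of runnable threads, capped at one full core. The function name approx_core_share is hypothetical, and real schedulers are considerably more involved than this idealization.

```python
def approx_core_share(cores, runnable_threads):
    """Idealized fraction of one core that each runnable thread receives.

    Assumes equal priority, CPU-bound threads and no other load; this is
    back-of-the-envelope arithmetic, not a model of a real scheduler.
    """
    if cores <= 0 or runnable_threads <= 0:
        raise ValueError("cores and runnable_threads must be positive")
    return min(1.0, cores / runnable_threads)


if __name__ == "__main__":
    print(approx_core_share(4, 4))    # 1.0  -> one core per thread
    print(approx_core_share(4, 16))   # 0.25 -> each thread gets a quarter of a core
```

So with 16 busy threads on 4 cores, each thread makes progress at roughly a quarter of the speed it would have with a core to itself.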
My Quick search about time-slicing and CPU load sharing:
https://en.wikipedia.org/wiki/Time-sharing
https://www.tutorialspoint.com/operating_system/os_process_scheduling_qa2.htm
This article even answers how switching between processes can slow down the overall performance.
Worker threads are similar in nature to threads in any other programming language.
You can have a look at this thread for an overall understanding of the difference between a thread and a process: What is the difference between a process and a thread?
Is it possible that a program which does not kill its threads properly before exiting is still running some piece of code somewhere, even though it has been killed in the system monitor? I am running Ubuntu in a non-virtual environment. My application is made with Qt; it contains QThreads, a main thread and concurrent functions.
If you kill the process then you kill all its threads. The only cause for concern would be if your application had spawned multiple processes - if that is the case then you may still have code executing on the machine.
This is all very speculative though, as I don't know what operating system your code is running on, whether or not your application runs in a virtual environment, etc. Environment-specific factors are very relevant to the discussion; can you share a bit more about your application?
It is not possible; all modern, heavily used operating systems manage these resources quite tightly. Threads cannot run without a process. They are all branches of the original thread.
I don't know of any OS that doesn't fully terminate all of a process's threads when you kill the process. It's possible to spawn child processes that live on after the main process has exited, but in the case of threads I'd say it's not possible.
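The claim that killing a process also stops its threads can be checked with a small experiment. This sketch uses Python instead of Qt, and the names CHILD_SOURCE and kill_process_with_running_thread are made up for the illustration. The child process starts a background thread that prints "tick" forever; the parent waits for the first tick, kills the child, and then reads the pipe until end-of-file, which only arrives once nothing in the child is left to write to it.

```python
import subprocess
import sys

# Source of the child program: a non-daemon background thread prints
# forever while the main thread just sleeps.
CHILD_SOURCE = """
import threading, time

def tick():
    while True:
        print("tick", flush=True)
        time.sleep(0.05)

threading.Thread(target=tick).start()
time.sleep(3600)
"""


def kill_process_with_running_thread():
    """Kill a child whose background thread is busy; report what happened."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    )
    first_line = proc.stdout.readline().strip()  # the thread is running
    proc.kill()                                  # kill the whole process
    proc.wait()
    proc.stdout.read()    # returns at end-of-file: no writer is left
    proc.stdout.close()
    return first_line, proc.poll() is not None


if __name__ == "__main__":
    print(kill_process_with_running_thread())    # ('tick', True)
```

Child processes are a different matter: had the child spawned further processes of its own, those would not be stopped by this kill and would have to be dealt with separately.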