node.js job partition over specific CPU subset

I have 2 node.js programs that communicate with each other through a socket. I originally intended to run them on separate Linux machines, but realized it would be much simpler to have them both on one multi-core machine, each allocated a specific subset of the total CPUs. For example, on a machine with 8 cores, I want to run program_A on 2 cores as one logical program (forked into worker processes with node's fork module), and program_B on the remaining 6 cores, likewise forked across its cores. Is there any way to do this with node.js? I'm aware of taskset for setting CPU affinity, but I'm not quite sure whether it would work for this scenario.
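For reference, taskset can cover this case: on Linux the CPU affinity mask set by taskset is inherited by every process a program forks, so pinning each top-level program pins its workers too. A minimal sketch of such a launcher (the launcher itself and the file names are hypothetical; the core lists follow the question):

    // launch.js -- hypothetical launcher pinning each program to a CPU subset (Linux only)
    const { spawn } = require('child_process');

    // `taskset -c <list> <command>` starts <command> restricted to the listed
    // cores; anything it forks inherits the same affinity mask.
    function launchPinned(script, cpuList) {
      return spawn('taskset', ['-c', cpuList, 'node', script], { stdio: 'inherit' });
    }

    launchPinned('program_A.js', '0,1'); // program_A and its forked workers: cores 0-1
    launchPinned('program_B.js', '2-7'); // program_B and its forked workers: cores 2-7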

Related

How can I monitor the CPU efficiency of a Slurm job?

I developed a Python script that uses multiple threads and submitted it to Slurm. I would like to verify that the threads are actually running in parallel, so I want to monitor CPU usage in real time. Is there any command that can do this?
Most clusters allow logins to compute nodes if the user has a job running there. So just ssh to the node that is running your job and run top(1).
If your code is multithreaded, the value in the %CPU column should be greater than 100%. Each thread, if it is fully busy, can consume up to 100% CPU (i.e. all the cycles of a single CPU core), so N busy threads can consume up to N*100%. Pressing H inside top toggles a per-thread view, which lets you watch the individual threads directly.

Where is the master process located in the cluster architecture of a Node.js app?

Let's say I've got a 4-core CPU, which basically means my computer can run at most 4 processes truly in parallel; no more, no less.
Now let's look at Node's cluster module; there are:
1 master process
4 worker processes
Can I say that each worker process is assigned to its own core of the CPU?
And if so, then where is the master process located?
The master process is started as a standard single-threaded process (or service), so it "is assigned to" and lives on one of your CPU cores (or hardware threads, depending on the case), just like any other process the OS schedules.
Each worker process is then spawned (forked) as described in the docs:
https://nodejs.org/api/cluster.html
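As a minimal sketch of that layout (nothing here is pinned; the OS decides which core each process gets):

    // cluster-demo.js -- minimal master/worker sketch
    const cluster = require('cluster');
    const os = require('os');

    if (cluster.isMaster) { // named isPrimary in newer Node versions
      // This is the master: the ordinary process you started with `node`,
      // scheduled onto whichever core the OS picks, like any other process.
      for (let i = 0; i < os.cpus().length; i++) {
        cluster.fork(); // each worker is a separate OS process
      }
    } else {
      // Worker code; which core it lands on is the OS scheduler's decision.
      console.log(`worker ${process.pid} started`);
    }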
Just so you know, Node now supports worker_threads to execute JavaScript in parallel on multiple threads, so it is technically no longer true that Node is limited to single-threaded execution:
https://nodejs.org/api/worker_threads.html
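For comparison, a minimal worker_threads sketch (threads inside one process, rather than separate processes):

    // threads-demo.js -- minimal worker_threads sketch
    const { Worker, isMainThread, parentPort } = require('worker_threads');

    if (isMainThread) {
      // Run this same file again on a new thread inside the same process.
      const worker = new Worker(__filename);
      worker.on('message', (msg) => console.log('from worker:', msg));
    } else {
      parentPort.postMessage('hello from a separate thread');
    }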

Can Slurm run 3 separate computers as one "node"?

I'm an intern who's been tasked with installing Slurm across three compute units running Ubuntu. The way things work now, people ssh into one of the compute units and run a job there; the three units share storage through an NFS mount but are otherwise separate machines.
My issue is that, from what I've read in the documentation, it seems like when installing Slurm I would specify each of these compute units as a completely separate node, and any jobs I'd like to run that use multiple cores would still be limited by how many cores are available on the individual node. My supervisor has told me, however, that the three units should be installed as a single node, and that when a job comes in that needs more cores than are available on a single compute unit, Slurm should just use all the cores. The intention is that we won't be changing how we execute jobs (like a parallelized R script), just "wrapping" them in an sbatch script before sending them to Slurm for scheduling and execution.
So is my supervisor correct that Slurm can be used to run our parallelized scripts, unchanged, with more cores than are available on a single machine?
Running a script with more threads than there are cores available is nonsense. It does not provide any performance increase; rather the opposite, as more threads have to be managed while the computing power stays the same.
But he is right in the sense that you can wrap your current script and send it to Slurm for execution using a whole node. The three machines, however, will be three nodes. They cannot work as a single node because, well, they are not a single node/machine. They do not share memory, buses, or peripherals... they just share some disk through the network.
You say that
any jobs I'd like to run that use multiple cores would still be limited by how many cores are available on the individual node
but that's already the current situation with SSH. Nothing is lost by using Slurm to manage the resources. In fact, Slurm will take care of giving each job the proper resources and of preventing other users from interfering with your computations.
Your best bet: create a cluster of three nodes as usual and let people send their jobs asking for as many resources as they need, without exceeding the available resources.

Node.js cluster module cannot use all the CPU cores when running inside a Docker container

When I run the Node.js cluster module on my physical machine, os.cpus().length returns 4, but after putting the app inside a Docker container it returns 2!
I had a vague idea that this is similar to how Golang, by default, would just run on a single core; is that why the cluster module here can only see a single physical CPU core (2 logical cores)?
If I want my cluster module to utilize all the physical CPU cores, what is the proper way to achieve that?
I tried playing with the --cpuset-cpus=0-1 option, but so far I haven't figured out much.
I am wondering: if I just create an arbitrary number of workers, will that actually utilize all the CPU cores? os.cpus().length is only used here to figure out how many CPU cores the machine has, and I could work around that by calling out to a shell script. Does that mean this question simply boils down to: the Node.js os.cpus() API is not compatible with Docker? Is that true?
Your Docker machine uses 2 cores by default. On a Mac you can change that number in Docker's preferences, under Advanced.
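One way to sidestep the detection problem is to stop treating os.cpus().length as authoritative. A sketch, with two caveats: os.availableParallelism() only exists in Node 18.14+/19.4+, and the WEB_CONCURRENCY variable name is just a convention, not anything Docker defines:

    // worker-count.js -- sketch: pick a worker count that survives containers
    const cluster = require('cluster');
    const os = require('os');

    function workerCount() {
      // An explicit override always wins (the variable name is arbitrary).
      if (process.env.WEB_CONCURRENCY) return Number(process.env.WEB_CONCURRENCY);
      // Node 18.14+/19.4+: parallelism actually available to this process.
      if (typeof os.availableParallelism === 'function') return os.availableParallelism();
      return os.cpus().length; // fallback: whatever the kernel reports
    }

    if (cluster.isMaster) {
      for (let i = 0; i < workerCount(); i++) cluster.fork();
    } else {
      console.log(`worker ${process.pid} up`);
    }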

How to make sure that the Node.js cluster assigns processes to different cores?

When utilizing multiple cores via Node.js's cluster module, is it guaranteed that each forked worker is assigned to a different core?
If it's not guaranteed, is there any way to control or manage this, and ultimately guarantee that they all end up on different cores? Or does the OS scheduler distribute them evenly?
A while ago I did some tests with the cluster module, which you can check in this post that I wrote. Looking at the system monitor screenshots, it is pretty straightforward to understand what happens under the hood (with and without the cluster module).
It is indeed up to the OS to distribute processes over the cores. You could obtain the pid of a child process and use an external utility to set the CPU affinity of that process. That would, of course, not be very portable.
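For example, on Linux a sketch along these lines would pin each worker right after forking it (taskset(1) is Linux-only, and the core list here is arbitrary):

    // pin-workers.js -- sketch: pin each cluster worker to its own core (Linux only)
    const cluster = require('cluster');
    const { execFile } = require('child_process');

    if (cluster.isMaster) {
      [0, 1, 2, 3].forEach((core) => {
        const worker = cluster.fork();
        // `taskset -cp <core> <pid>` changes the affinity of a running process.
        execFile('taskset', ['-cp', String(core), String(worker.process.pid)],
          (err) => { if (err) console.error('taskset failed:', err); });
      });
    } else {
      console.log(`worker ${process.pid} running`);
    }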
