Scaling Node.js: Using autoscaling groups with small virtual servers or cluster processes on VM with many vCPUs? - node.js

In learning about Node.js's cluster module I've been turning over the following architecture in my head: Balancing costs with performance, would it be more beneficial (i.e. cheapest but still scalable) to run your Node.js application in a cloud service's autoscaling group using small servers with one virtual CPU (say, AWS's t2.small EC2, 1 vCPU, 2gb memory) or use a larger server (say, an m5.xlarge 4 vCPU, 16gb memory), run Node.js to cluster four child processes to use the 4 vCPUs, but still autoscale?
A possible trade-off is the time it takes AWS to deploy another small server to autoscale, but on a low-traffic app or utility app you'll have to take on the cost of running the larger server when usage is low. But if the time it takes to spin up another server to handle the load is nominal, does that negate the benefits of using the cluster module?
Specifically, my question is twofold: Are these two approaches feasible and, if so, is my presumption about the cluster module's usefulness in the small server approach correct?

Related

Does clustering in Node.js and auto-scaling web application using Kubernetes serve the same purpose?

Node.js has introduced the Cluster module to scale up applications for performance optimization. We have Kubernetes doing the same thing.
I'm confused if both are serving the same purpose? My assumption is clustering can spawn up to max 8 processes (if there are 4 cpu cores with 2 threads each) and there is no such limitation in Kubernetes.
Kubernetes and the Node.js Cluster module operate at different levels.
Kubernetes is in charge of orchestrating containers (amongst many other things). From its perspective, there are resources to be allocated, and deployments that require or use a specific amount of resources.
The Node.js Cluster module behaves as a load-balancer that forks N times and spreads the requests between the various processes it owns, all within the limits defined by its environment (CPU, RAM, Network, etc).
In practice, Kubernetes has the possibility to spawn additional Node.js containers (scaling horizontally). On the other hand, Node.js can only grow within its environment (scaling vertically). You can read about this here.
While from a performance perspective both approaches might be relatively similar (you can use the same number of cores in both cases); the problem with vertically scaling on a single machine is that you lose the high-availability aspect that Kubernetes provides. On the other hand, if you decide to deploy several Node.js containers on different machines, you are much more tolerant for the day one of them is going down.

Erlang NUMA Technology configuration

I am trying to run erlang application on openstack vm and getting very poor performance and after testing i found something going on with NUMA, This is what i observe in my test.
My openstack compute host with 32 core so i have created 30 vCPU core vm on it which has all NUMA awareness, when i am running Erlang application benchmark on this VM getting worst performance but then i create new VM with 16 vCPU core (In this case my all VM cpu pinned with Numa-0 node) and in this case benchmark result was great.
based on above test its clear if i keep VM on single numa node then performance is much better but when i spread it out to multiple numa zone it get worse.
But interesting thing is when i run same erlang application run on bare metal then performance is really good, so trying to understand why same application running on VM doesn't perform well?
Is there any setting in erlang to better fit with NUMA when running on virtual machine?
It's possible that Erlang is not able to properly detect the cpu topology of your VM.
You can inspect the cpu topology as seen by the VM using lscpu and lstopo-no-graphics from the hwloc package:
#lscpu | egrep '^(CPU\(s\)|Thread|Core|Socket|NUMA)'
#lstopo-no-graphics --no-io
If it doesn't look correct, consider rebuilding the VM using OpenStack options like hw:cpu_treads=2 hw:cpu_sockets=2 as described at https://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/virt-driver-vcpu-topology.html
On the Erlang side, you might experiment with the Erlang VM options +sct, +sbt as described at http://erlang.org/doc/man/erl.html#+sbt

Should You Use PM2, Node Cluster, or Neither in Kubernetes?

I am deploying some NodeJS code into Kubernetes. It used to be that you needed to run either PM2 or the NodeJS cluster module in order to take full advantage of multi-core hardware.
Now that we have Kubernetes, it is unclear if one must use one or the other, to get the full benefit of multiple cores.
Should a person specify the number of CPU units in their pod YAML configuration?
Or is there simply no need to account for multiple cores with NodeJS in Kubernetes?
You'll achieve utilization of multiple cores either way; the difference being that with the nodejs cluster module approach, you'd have to "request" more resources from Kubernetes (i.e., multiple cores), which might be more difficult for Kubernetes to schedule than a few different containers requesting one core (or less...) each (which it can, in turn, schedule on multiple nodes, and not necessarily look for one node with enough available cores).

Is Imap.get() expensive in Hazelcast if Hazelcast cluster is running in the Cloud?

I have a distributed map stored in hazelcast. My hazelcast cluster run in a cloud either private or public. My app may not run on the same network where hazelcast cluster is running.
My app tries to access distributed map using IMap.get() may be thousands per second. I tried to major performance of the above operation on the local cluster by running hazelcast cluster on my local machine. I could read everything in 15-20ms. But I am not getting the same performance if hazelcast cluster runs in the cloud.
If you are reading a map, more frequently, Will it increase the load on hazelcast in the cloud environment?, yes any reasons?
Performance of running software locally will always be different than running in a distributed environment, more so when servers are located elsewhere - network latencies being the most prominent factor.
Servers in cloud, app on local = not the recipe for best performance. Either move all cluster components- servers and app clients, in one network (aim for same availability zone if looking for best performance) or expect delays. Its not the cloud in particular that deteriorates the performance, its the way VMs are setup in cloud. For example, if one VM is in us-east-1 and other in London and your app is in Tokyo then expect inferior performance numbers.

How to scale Azure VM Cores

I have a Python code that I need to run on 1000 CSVs in parallel computing to do calculations. One CPU core can finish running the code over each CSV in 8 hours.
Thus I am looking for a way to use Azure for this. I would like to create several virtual machines, say 4x D5v2 with 16 cores each to access a Windows Server that runs on a 64 Cores machine.
I tried to create these VMs in the same Cloud Service and I put them into the same Availability Set, which worked fine. When all VMs are running and I access any one of those VMs, I see that the cores on all other VMs are allocated to "Other Roles".
My questions are:
1) Is it possible to create a hypothetical VM out of 4 VMs to use more cores?
2) How can I manually allocate all cores in the Cloud Service to one specific VM?
Your best solution would be to use Azure Batch With Batch you create a job, and it will run on as many CPU's as you specify it can run on.
Taken from the Batch front page
When you are ready to run a job, Batch starts a pool of compute virtual machines for you, installing applications and staging data, running jobs with as many tasks as you have, identifying failures and re-queuing work and scaling down the pool as work completes. You have control over scale to meet deadlines, manage costs, and run at the right scale for your application.
1) Is it possible to create a hypothetical VM out of 4 VMs to use more cores?
No you can not.
2) How can I manually allocate all cores in the Cloud Service to one specific VM?
You can not do this. You need to use a cloud native solution to scale your process over multiple resources.

Resources