Can you get more performance using multiple websockets on one Node.js server? - node.js

Ok guys, is it useful to generate more than one websocket instance on a Node.js server? I mean possibly you also create subworkers..
I know, it depends on your hardware and the network cards maximum. But can you reach this maximium with only one task or can you get more performance with several parallel processes?

It depends on the CPU count in your server. Parallel processes (the cluster module) aren't a silver bullet for performance - on a 1 CPU system they might even decrease it just due to overhead. But if you have more than one CPU, a NodeJS process CANNOT use that extra horsepower. The cluster module solves this. Used properly you can use all the available CPU power on the system.
Note that this has nothing to do with WebSockets themselves. This is a core NodeJS architectural constraint. But WebSockets are a great use-case where clustering is advantageous...

Related

Any benefits when clustering node.js application server in single core computer

This question is meant for single core only therefore multiple cores are out.
I am running Node.js application as HTTP server on single core computer using Express.js. Assuming that my server is able to handle 1000 concurrent requests, would clustering brings about any better in response speed ?
Does process context switch has much impact on performance in this case ?
I wouldn't expect improvements in speed, but you might get some other benefits.
for example if the process crashes, the other nodejs instance could still work.
would clustering brings about any better in response speed ?
You are aware of the fact that in clustered mode, your application will be running behind a load balancer, that will in turn take some CPU and memory to manage and forward the network traffic. Then, what's left of the resources, will be used to distribute the network load.
Apart from a few, rare and easily avoidable cases such as #Cristyan mentions—in which case your load balancer can be an orchestrator managing most of the stuff, like Kubernetes—running a Node.js app in cluster does not make sense to me, on a single CPU core. If the process has to wait for an item, it has to wait for it! Asynchronously you can make it work on other requests, but even in this case, other processes would want to take a share of CPU too.

Steps to improve throughput of Node JS server application

I have a very simple nodejs application that accepts json data (1KB approx.) via POST request body. The response is sent back immediately to the client and the json is posted asynchronously to an Apache Kafka queue. The number of simultaneous requests can go as high as 10000 per second which we are simulating using Apache Jmeter running on three different machines. The target is to achieve an average throughput of less than one second with no failed requests.
On a 4 core machine, the app handles upto 4015 requests per second without any failures. However since the target is 10000 requests per second, we deployed the node app in a clustered environment.
Both clustering in the same machine and clustering between two different machines (as described here) were implemented. Nginx was used as a load balancer to round robin the incoming requests between the two node instances. We expected a significant improvement in the throughput (like documented here) but the results were on the contrary.
The number of successful requests dropped to around 3100 requests per second.
My questions are:
What could have gone wrong in the clustered approach?
Is this even the right way to increase the throughput of Node application?
We also did a similar exercise with a java web application in Tomcat container and it performed as expected 4000 requests with a
single instance and around 5000 successful requests in a cluster
with two instances. This is in contradiction to our belief that
nodejs performs better than a Tomcat. Is tomcat generally better
because of its thread per request model?
Thanks a lot in advance.
Per your request, I'll put my comments into an answer:
Clustering is generally the right approach, but whether or not it helps depends upon where your bottleneck is. You will need to do some measuring and some experiments to determine that. If you are CPU-bound and running on a multi-core computer, then clustering should help significantly. I wonder if your bottleneck is something besides CPU such as networking or other shared I/O or even Nginx? If that's the case, then you need to fix that before you would see the benefits of clustering.
Is tomcat generally better because of its thread per request model?
No. That's not a good generalization. If you are CPU-bound, then threading can help (and so can clustering with nodejs). But, if you are I/O bound, then threads are often more expensive than async I/O like nodejs because of the resource overhead of the threads themselves and the overhead of context switching between threads. Many apps are I/O bound which is one of the reasons node.js can be a very good choice for server design.
I forgot to mention that for http, we are using express instead of the native http provided by node. Hope it does not introduce an overhead to the request handling?
Express is very efficient and should not be the source of any of your issues.
As jfriend said , you need to find the bottlenecks ,
one thing you can try is to reduce the bandwith/throughput by using sockets to pass the json and especially this library https://github.com/uNetworking/uWebSockets.
The main reason for that is that an http request is significantly heavier than a socket connection.
Good Example : https://webcheerz.com/one-million-requests-per-second-node-js/
lastly you can also compress the json via (http gzip) or a third party module.
work on the weight ^^
Hope it helps!

How to use clusters in node js?

I am very new to Node.js and express. I am currently learning it by building my own services.
I recently read about clusters. I understood what clusters do. What I am not able to understand is how to make use of clusters in a production application.
One way I can think of is to use the Master process to just sit in front and route the incoming request to the next available child process in a round robin fashion. I am not sure if this is how it is designed to be used. I would like to know how should clusters be used in a typical web application.
Thanks.
The node.js cluster modules is used with node.js any time you want to spread out the request processing across multiple node.js processes. This is most often used when you wish to increase your ability to handle more requests/second and you have multiple CPU cores in your server. By default, a single instance of node.js will not fully utilize multiple cores because the core Javascript you run in your server is single threaded (uses one core). Node.js itself does use threads for some things internally, but that's still unlikely to fully utilize a mult-core system. Setting up a clustered node.js process for each CPU core will allow you to better maximize the available compute resources.
Clustering also provides you with some additional fault tolerance. If one cluster process goes down, you can still have other live clusters serving requests while the disabled cluster restarts.
The cluster module for node.js has a couple different scheduling algorithms - the round robin you mention is one. You can read more about that here: Cluster Round-Robin Load Balancing.
Because each cluster is a separate process, there is no automatic shared data among the different cluster processes. As such, clustering is simplest to implement either where there is no shared data or where the shared data is already in a place that it can be accessed by multiple processes (such as in a database).
Keep in mind that a single node.js process (if written to properly use async I/O and not heavily compute bound) can server many requests itself at once. Clustering is when you want to expand scalability beyond what one instance can deliver.
I have created a poc on cluster in nodejs and added some details in the below blogs. Once go through it. It may provide some clearance.
https://jksnu.blogspot.com/2022/02/cluster-in-node-js-application.html
https://jksnu.blogspot.com/2022/02/cluster-management-in-node-js.html

Performance implications of distribution of threads amongst processes in a server

The question title is pretty awkward, sorry about that.
I am currently working on the design of a server, and a comment came up from one of my co-workers that we should use multiple processes, since the was some performance hit to having too many threads in a single process (as opposed to having that same number of threads spread over multiple processes on the same machine)
The only thing I can think of which would cause this (other than bad OS scheduling), would be from increased contention (for example on the memory allocator), but I'm not sure how much that matters.
Is this a 'best practice'? Does anyone have some benchmarks they could share with me? Of course the answer may depend on the platform (I'm interested mostly in windows/linux/osx, although I need to care about HP-UX, AIX, and Solaris to some extent)
There are of course other benefits to using a multi-process architecture, such as process isolation to limit the effect of a crash, but I'm interested about performance for this question.
For some context, the server is going to service long-running, stateful connections (so they cannot be migrated to other server processes) which send back a lot of data, and can also cause a lot of local DB processing on the server machine. It's going to use the proactor architecture in-process and be implemented in C++. The server will be expected to run for weeks/months without need of restart (although this may be implemented by rotating new instances transparently under some proxy).
Also, we will be using a multi-process architecture, my concern is more about scheduling connections to processes.

Nodejs on heavy load

I am using nodejs with socket.io on my chat system. My hardware is 13.6 Ghz Cpu and 16gb ram.
When the online users count reaches 600,some users can't connect to socket,can't send messages.And some users disconnects from chat.
How can I resolve this problem ? What is your opinion of this problem ?
First, I'm not sure how you have a clock speed of 13.6ghz for a single thread. I'd assume your CPU has multiple cores, or your mobo supports multiple processor sockets, and 13.6 is simply a sum. (8.2GHz was a world record that was set July 23rd, 2013.)
Secondly, I'd ask yourself why the disconnect is happening.
Are you watching processor load - is a single thread maxing out its processor allocation (i.e. 100% usage on a single core)?
How's your RAM consumption; is it climbing, and has the OS offloaded
memory onto the page file/swap partition - could there be a memory
leak?
Is your network bandwidth capped? Has it reached its maximum capacity?
My high-level recommendations are:
Make sure your application is non-blocking. This means using asynchronous methods whenever possible. By design, however, all Node.js' Net methods are asynchronous.
Consider clustering your application and use a shared port. This is possible with Node.js child processes using Cluster. This will distribute the load on the CPU to multiple cores. You don't have to worry about load balancing (e.g. round-robin, fastest, ratio) - handling the client is on a first-available first-serve basis, whichever Node.js process can handle the client's request first, wins.
Verify your NIC has enough throughput to handle the load. If it is configured for / auto-negotiating as 10BASE-T or 100BASE-TX half-duplex, you could be in trouble.
Ultimately, you need to perform more diagnostics to isolate the issue. Curiosity, digging, patience and research will lead you to the answer. Your question is far too open-ended to be provided with an exact answer - it is more theoretical. There are also too many variables to pinpoint an exact cause.

Resources