NodeJS Load/Stress tests results - node.js

I have developed a basic Node.js + Express server that returns status 200 for one route, and I ran a couple of load tests against it on localhost. The problem is that I cannot get past 1000 requests/second for this simple route:
app.get('/test', function (req, res) {
  res.status(200).send();
});
At 1000 requests/s the server behaves fine, but beyond that the load test returns errors. How can I increase this limit for Node.js, and is this a good result? This is only a simple route without any processing, so I think the server should be able to accept many more requests. Thanks.

Many factors can influence your results. Load testing is generally something you shouldn't do on your own dev machine if you want solid, real-life results, because a dev machine is always running something unrelated to your app that consumes system resources, such as Google Chrome or background I/O.
Another factor is your hardware and how saturated your system resources currently are.
If you still want to test locally, keep in mind that the results will not reflect a production environment, so they tell you little about how the server will behave in production.
You can also improve performance a bit by using Node's cluster module so that your server uses all of your machine's processor cores. Node.js runs your JavaScript on a single thread, so a single process cannot spread request handling across cores on its own.
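A minimal sketch of that idea, assuming the built-in `cluster` module and a bare `http` server in place of Express. The names `workerCount` and `startClusteredServer` and the port are illustrative, not part of the original question:

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// One worker per logical CPU, but never fewer than one.
function workerCount(cpuCount = os.cpus().length) {
  return Math.max(1, cpuCount);
}

function startClusteredServer(port) {
  // `isPrimary` is the newer name; older Node versions only have `isMaster`.
  const isPrimary =
    cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

  if (isPrimary) {
    // The primary process only forks workers; it serves no requests itself.
    for (let i = 0; i < workerCount(); i++) {
      cluster.fork();
    }
  } else {
    // Each worker runs its own server; they all share the listening port.
    http
      .createServer((req, res) => {
        res.writeHead(200);
        res.end();
      })
      .listen(port);
  }
}

// Call this at the top level of the entry script, because every forked
// worker re-runs the same file:
// startClusteredServer(3000);
```

Whether this raises the requests/second ceiling depends on where the limit actually is. If the load generator or the OS connection limits are the bottleneck, extra workers may change little.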

Related

Any benefits when clustering node.js application server in single core computer

This question is about single-core machines only, so multi-core setups are out of scope.
I am running a Node.js application as an HTTP server on a single-core computer using Express.js. Assuming that my server is able to handle 1000 concurrent requests, would clustering bring any improvement in response speed?
Does process context switching have much impact on performance in this case?
I wouldn't expect improvements in speed, but you might get some other benefits.
For example, if one process crashes, the other Node.js instance can keep serving requests.
would clustering bring any improvement in response speed?
Keep in mind that in clustered mode your application runs behind a load balancer, which itself takes some CPU and memory to manage and forward the network traffic. Only what is left of the resources is then available for handling the requests themselves.
Apart from a few rare and easily avoidable cases such as the one @Cristyan mentions (where the load balancer can be an orchestrator that manages most of this, like Kubernetes), running a Node.js app in cluster mode on a single CPU core does not make sense to me. If the process has to wait for an item, it has to wait for it! Asynchronously it can work on other requests in the meantime, but even then the other processes would want their share of the CPU too.
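The crash-resilience benefit mentioned in the first answer can be sketched with the built-in `cluster` module. This is only an illustration: `superviseWorkers` is a hypothetical helper, and its second parameter exists purely so the logic can be exercised without forking real processes.

```javascript
const cluster = require('cluster');

// Fork `count` workers and replace any worker that dies unexpectedly.
// `clusterApi` defaults to the real cluster module (assumption: it is only
// swapped out for a stub when testing).
function superviseWorkers(count, clusterApi = cluster) {
  clusterApi.on('exit', (worker, code, signal) => {
    // `exitedAfterDisconnect` is true for intentional shutdowns,
    // so only crashes and kills trigger a replacement.
    if (!worker.exitedAfterDisconnect) {
      clusterApi.fork();
    }
  });

  for (let i = 0; i < count; i++) {
    clusterApi.fork();
  }
}
```

On a single core this buys availability rather than speed: while a crashed worker is being replaced, the remaining one keeps accepting connections.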

Steps to improve throughput of Node JS server application

I have a very simple Node.js application that accepts JSON data (approx. 1 KB) via the POST request body. The response is sent back to the client immediately and the JSON is posted asynchronously to an Apache Kafka queue. The number of simultaneous requests can go as high as 10000 per second, which we are simulating using Apache JMeter running on three different machines. The target is to achieve an average response time of less than one second with no failed requests.
On a 4-core machine, the app handles up to 4015 requests per second without any failures. However, since the target is 10000 requests per second, we deployed the Node app in a clustered environment.
Both clustering on the same machine and clustering across two different machines (as described here) were implemented. Nginx was used as a load balancer to round-robin the incoming requests between the two Node instances. We expected a significant improvement in throughput (like documented here), but the results were the opposite.
The number of successful requests dropped to around 3100 requests per second.
My questions are:
What could have gone wrong in the clustered approach?
Is this even the right way to increase the throughput of Node application?
We also did a similar exercise with a Java web application in a Tomcat container, and it performed as expected: 4000 requests with a single instance and around 5000 successful requests in a cluster with two instances. This contradicts our belief that Node.js performs better than Tomcat. Is Tomcat generally better because of its thread-per-request model?
Thanks a lot in advance.
Per your request, I'll put my comments into an answer:
Clustering is generally the right approach, but whether or not it helps depends upon where your bottleneck is. You will need to do some measuring and some experiments to determine that. If you are CPU-bound and running on a multi-core computer, then clustering should help significantly. I wonder if your bottleneck is something besides CPU such as networking or other shared I/O or even Nginx? If that's the case, then you need to fix that before you would see the benefits of clustering.
Is tomcat generally better because of its thread per request model?
No. That's not a good generalization. If you are CPU-bound, then threading can help (and so can clustering with nodejs). But, if you are I/O bound, then threads are often more expensive than async I/O like nodejs because of the resource overhead of the threads themselves and the overhead of context switching between threads. Many apps are I/O bound which is one of the reasons node.js can be a very good choice for server design.
I forgot to mention that for http, we are using express instead of the native http provided by node. Hope it does not introduce an overhead to the request handling?
Express is very efficient and should not be the source of any of your issues.
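One way to start the measuring suggested above is to watch event-loop delay while the load test runs. The sketch below assumes Node's built-in `perf_hooks.monitorEventLoopDelay`. The helper name `sampleEventLoopDelay` and the sampling window are illustrative. Consistently high delay suggests the JavaScript thread is saturated (CPU-bound), while low delay under load points toward the network, Nginx, or another shared resource.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample event-loop delay for `durationMs`, then report summary statistics.
function sampleEventLoopDelay(durationMs, callback) {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  setTimeout(() => {
    histogram.disable();
    // The histogram reports nanoseconds; convert to milliseconds.
    callback({
      meanMs: histogram.mean / 1e6,
      p99Ms: histogram.percentile(99) / 1e6,
      maxMs: histogram.max / 1e6,
    });
  }, durationMs);
}

// Example: sample for one second and print the result.
sampleEventLoopDelay(1000, (stats) => {
  console.log('event loop delay (ms):', stats);
});
```

This is only one signal. CPU usage per process and the load balancer's own metrics are worth checking alongside it.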
As jfriend said, you need to find the bottlenecks.
One thing you can try is to reduce the bandwidth per request by passing the JSON over sockets, in particular with this library: https://github.com/uNetworking/uWebSockets.
The main reason is that an HTTP request carries significantly more overhead than a message sent over an already-open socket connection.
Good example: https://webcheerz.com/one-million-requests-per-second-node-js/
Lastly, you can also compress the JSON, either with HTTP gzip or with a third-party module.
In short, work on the weight ^^
Hope it helps!
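To get a feel for what gzip does to a JSON payload, here is a small sketch using Node's built-in `zlib`. The payload is made up for illustration, and the helper names are not from the answer above:

```javascript
const zlib = require('zlib');

// Serialize a value to JSON and gzip it.
function compressJson(value) {
  return zlib.gzipSync(Buffer.from(JSON.stringify(value), 'utf8'));
}

// Reverse of compressJson: gunzip and parse.
function decompressJson(buffer) {
  return JSON.parse(zlib.gunzipSync(buffer).toString('utf8'));
}

// Hypothetical payload with repetitive keys, as JSON records often have.
const payload = Array.from({ length: 50 }, (_, i) => ({
  id: i,
  status: 'ok',
  source: 'sensor',
}));

const rawBytes = Buffer.byteLength(JSON.stringify(payload));
const packed = compressJson(payload);
console.log(`raw=${rawBytes} bytes, gzip=${packed.length} bytes`);
```

The synchronous calls keep the example short. Inside a request handler the callback-based `zlib.gzip` is the safer choice, because compression itself costs CPU time on the event loop. For payloads around 1 KB the saving may be modest, so it is worth measuring before adopting.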

Hapijs: Performance tuning for lots of concurrent requests

Are there any special tuning tips for strengthening an API built on top of the hapijs framework?
Especially if you have lots of concurrent requests (10000+/sec) that access the DB?
I'm using PM2 to start my process in "cluster mode" to be able to load-balance to different cores on the server
I don't need to serve static content, so there's no apache/nginx proxy
update 17:11
Running tests at 1000 requests/sec (with loader.io) gave an acceptable response curve so far, but I'm wondering if there is still room for improvement.
(hardware: 64gb / 20 core digital ocean droplet)
In the end I just used a combination of node's http and the body-parser module to achieve what I needed.
But I think this was only viable because my application had just two endpoints (one GET, one POST).
If your application logic is rather complicated and you want to stick with hapi, think about using a load-balancer and dividing the load to multiple VMs.
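The answer does not show its code, so the following is only a sketch of what a two-endpoint server on plain `http` might look like. The routes `/health` and `/items` are hypothetical, and the hand-rolled `readJson` stands in for the body-parser module the author mentions:

```javascript
const http = require('http');

// Collect the request body and parse it as JSON.
function readJson(req, callback) {
  const chunks = [];
  req.on('data', (chunk) => chunks.push(chunk));
  req.on('end', () => {
    let data;
    try {
      data = JSON.parse(Buffer.concat(chunks).toString('utf8'));
    } catch (err) {
      return callback(err);
    }
    callback(null, data);
  });
}

// Send a JSON response with an explicit content length.
function sendJson(res, status, payload) {
  const body = JSON.stringify(payload);
  res.writeHead(status, {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(body),
  });
  res.end(body);
}

// Build a server with one GET and one POST endpoint (hypothetical routes).
function createApiServer() {
  return http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/health') {
      return sendJson(res, 200, { ok: true });
    }
    if (req.method === 'POST' && req.url === '/items') {
      return readJson(req, (err, data) => {
        if (err) return sendJson(res, 400, { error: 'invalid JSON' });
        sendJson(res, 201, { received: data });
      });
    }
    sendJson(res, 404, { error: 'not found' });
  });
}
```

Starting it is a matter of `createApiServer().listen(3000)`. With only two routes, the manual dispatch stays readable. Beyond a handful of endpoints a framework's router starts to pay for itself.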
[Figure: load-test results for the new setup on an even smaller DO droplet]

What exactly are the implications of the fact that nodejs is single threaded?

The NodeJS website says the following. Emphasis is mine.
Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.
Even though I love NodeJS, I don't see why it is better for scalable applications than existing technologies such as Python, Java or even PHP.
As I understand the JavaScript run-time always runs as a single thread in the CPU. The IO however probably uses underlying kernel methods which might rely on the thread pools provided by the kernel.
So the real questions that need to be answered are:
Because all JS code runs in a single thread, is NodeJS unsuitable for applications with little I/O and lots of computation?
If I am writing a web application using NodeJS and there are 100 open connections, each performing a pure computation requiring 100 ms, will at least one of them take 10 s to finish?
If your machine has 10 cores but you are running just one NodeJS instance, are the other 9 CPUs sitting ducks?
I would appreciate it if you also posted how other technologies perform vis-à-vis NodeJS in these cases.
I haven't done a ton of node, but I have some opinions on this. Please correct me if I am mistaken, SO.
Because all JS code runs in a single thread, is NodeJS unsuitable for applications with little I/O and lots of computation?
Yeah. Single-threaded means that if you are crunching lots of data hard in your JS code, you are blocking everything else. And that sucks. But this isn't typical for most web applications.
If I am writing a web application using NodeJS and there are 100 open connections, each performing a pure computation requiring 100 ms, will at least one of them take 10 s to finish?
Yep. 10 seconds of CPU time.
If your machine has 10 cores but you are running just one NodeJS instance, are the other 9 CPUs sitting ducks?
That I'm not sure about. The V8 engine might have some optimizations in it that take advantage of multiple cores, transparent to the programmer. But I doubt it.
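The 10-second figure is simple queueing arithmetic, which can be sketched as below. This is an idealized model (all requests arrive together, no I/O, no scheduling overhead), and the `workers` parameter is an added assumption to show what spreading the same work across several processes would do:

```javascript
// Completion time in ms of each request when `n` CPU-bound tasks of `taskMs`
// arrive at once and are processed by `workers` single-threaded processes.
function completionTimes(n, taskMs, workers = 1) {
  return Array.from(
    { length: n },
    (_, i) => Math.ceil((i + 1) / workers) * taskMs
  );
}

const single = completionTimes(100, 100);
console.log(single[0], single[single.length - 1]); // 100 10000

const tenWorkers = completionTimes(100, 100, 10);
console.log(tenWorkers[tenWorkers.length - 1]); // 1000
```

With one process the last request waits for the other 99, so it finishes after 10 s. With ten processes on ten cores the same backlog clears in about 1 s.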
The thing is, most of the time a web application isn't calculating. If your app is engineered well, a single request can be responded to very quickly. And if you have to fetch things to do that (db, files, remote services) you shouldn't have to wait for that fetch to return before processing the next request.
So you may have many requests in various stages of completion at the same time, depending on when their I/O callbacks fire. Even though only one request is running JS code at a time, that code should do what it needs to do very quickly, exit the run loop, and await the next event callback.
If your JS can't run quickly, then this model does pose a problem. As you note, things will get hung as the CPU churns. So don't build a node web application that does lots of intense calculation on the fly.
However, you can refactor things to be asynchronous instead. Maybe you have a standalone node script that can do the calculation for you. Your web application can then boot up that script as a child process, tell it to do stuff, and provide a callback to run when it's done. You now have sort of faked threads, in a roundabout way.
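A rough sketch of that child-process pattern, using the built-in `child_process` module. To keep the example in one file, the "standalone script" is passed inline with `node -e`, which is an assumption of this sketch. In a real application it would more likely be a separate file started with `fork`. The summing job is a made-up placeholder for a heavy calculation.

```javascript
const { execFile } = require('child_process');

// Run a CPU-heavy job (here: summing 1..n) in a separate Node process and
// hand the result to `callback` when the child finishes.
function sumInChildProcess(n, callback) {
  const script = `
    let total = 0;
    for (let i = 1; i <= ${Number(n)}; i++) total += i;
    process.stdout.write(String(total));
  `;

  // process.execPath is the path of the currently running node binary.
  execFile(process.execPath, ['-e', script], (err, stdout) => {
    if (err) return callback(err);
    callback(null, Number(stdout));
  });
}

// The main event loop stays free while the child does the work.
sumInChildProcess(1e6, (err, total) => {
  if (err) throw err;
  console.log('sum computed in a child process:', total);
});
```

Starting a process per job is expensive, so for frequent jobs a long-lived worker that receives tasks as messages is usually the better shape.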
In virtually all web application technologies, you do not want to be doing complex and intense calculation on the fly. Even with proper threading, it's a losing battle. Instead you have to strategize. Do the calculations in the background, or at regular intervals on a cron job, outside of the main web application process itself.
The things you point out are flaws in theory, but in practice it really only becomes an issue if you aren't doing it right.
Node.js is single threaded. This means anything that would block the main thread needs to be done outside the main thread.
In practice this means handing heavy computations to code that runs outside the main thread and receiving the result through a callback, the same way you use callbacks for I/O.
For instance, here's the API for node bcrypt:
var bcrypt = require('bcrypt');
bcrypt.genSalt(10, function(err, salt) {
  bcrypt.hash("B4c0/\/", salt, function(err, hash) {
    // Store hash in your password DB.
  });
});
Mozilla Persona uses this in production. See their code here.
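The same callback pattern is available without a third-party module. The sketch below swaps bcrypt for the built-in `crypto.scrypt`, so it is a substitute technique, not what Persona uses. The `hashPassword` helper and the `salt:key` storage format are illustrative choices.

```javascript
const crypto = require('crypto');

// Derive a password hash without blocking the event loop.
function hashPassword(password, callback) {
  const salt = crypto.randomBytes(16);

  // The async form of scrypt does its work on libuv's thread pool and
  // calls back on the main thread when the key is ready.
  crypto.scrypt(password, salt, 64, (err, derivedKey) => {
    if (err) return callback(err);
    // Store salt and key together so the hash can be verified later.
    callback(null, `${salt.toString('hex')}:${derivedKey.toString('hex')}`);
  });
}

hashPassword('correct horse battery staple', (err, stored) => {
  if (err) throw err;
  console.log('stored hash length:', stored.length);
});
```

While the key is being derived, the server can keep accepting and answering other requests, which is the point the answer is making.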

Which Node.js Concurrent Web Server is best on Heroku?

I have just learned about Heroku and was pretty excited to test it out. I quickly assembled their demos for Node.js and stumbled across a problem. When running the application locally, Apache Benchmark reports roughly 3500 requests/s, but when it's in the cloud that drops to 10 requests/s and does not rise or fall with network latency. I cannot believe that this is the performance they are asking 5 cents/hour for, and I strongly suspect that my application is not multi-threaded.
This is my code.js: http://pastebin.com/hyM47Ue7
What configuration do I need to apply in order to get it running faster on Heroku? Or what other web servers for Node.js could I use?
I am thankful for every answer on this topic.
Your little example is not multi-threaded. (Not even on your own machine.) But you don't need to pay for more dynos immediately, as you can make use of multiple cores on a dyno; see this answer: "Running Node.js App with cluster module is meaningless in Heroku?"
To repeat that answer: a node solution to using multiple processes that should increase your throughput is to use the (built-in) cluster module.
I would guess that you can easily get more than 10 req/s from a heroku dyno without a problem, see this benchmark, for example:
http://openhood.com/ruby/node/heroku/sinatra/mongo_mapper/unicorn/express/mongoose/cluster/2011/06/14/benchmark-ruby-versus-node-js/
What do you use to benchmark?
You're right, the web server is not multi-threaded, until you pay for more web dynos. I've found Heroku is handy for prototyping; depending on the monetary value of your time, you may or may not want to use it to set up a scalable server instead of using EC2 directly.

Resources