Hapijs: Performance tuning for lots of concurrent requests - node.js

Are there any special tuning tips for strengthening an API built on top of the hapijs framework?
Especially if you have lots of concurrent requests (10,000+/sec) that access the DB?
I'm using PM2 to start my process in "cluster mode" so that the load is balanced across the cores of the server.
I don't need to serve static content, so there's no Apache/nginx proxy.
update 17:11
Running tests with 1000 requests/sec (with loader.io) results in this curve. OK so far, but I'm wondering if there is still room for improvement.
(hardware: 64gb / 20 core digital ocean droplet)

In the end I just used a combination of node's http and the body-parser module to achieve what I needed.
But I think this was only viable because my application had just two endpoints (one GET, one POST).
If your application logic is rather complicated and you want to stick with hapi, think about using a load-balancer and dividing the load to multiple VMs.
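A two-endpoint service on plain http, as described above, might look roughly like the sketch below. This is not the author's code: the /items routes and the handle and createServer names are made up for illustration, and the JSON body is parsed by hand instead of with body-parser so that the sketch has no dependencies.

```javascript
const http = require('http');

// In-memory store, only for the sake of the example.
const items = [];

// Pure request logic, kept apart from the socket handling so it is easy to test.
function handle(method, url, rawBody) {
  if (method === 'GET' && url === '/items') {
    return { status: 200, body: items };
  }
  if (method === 'POST' && url === '/items') {
    let parsed;
    try {
      parsed = JSON.parse(rawBody);
    } catch (err) {
      return { status: 400, body: { error: 'invalid JSON' } };
    }
    items.push(parsed);
    return { status: 201, body: parsed };
  }
  return { status: 404, body: { error: 'not found' } };
}

function createServer() {
  return http.createServer((req, res) => {
    // Collect the request body chunk by chunk, then answer once it is complete.
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const raw = Buffer.concat(chunks).toString('utf8');
      const { status, body } = handle(req.method, req.url, raw);
      res.writeHead(status, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(body));
    });
  });
}

// To run it: createServer().listen(3000);
```

With only two routes there is no router, validation layer or plugin system in the request path, which is presumably where the savings over a full framework come from.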
Loadtest results (new setup on an even smaller DO droplet):

Related

Benchmarking Nginx against Express

I have Nginx set up as a reverse proxy in front of my express application.
So every request that comes to Nginx is proxied to Express running on 4 ports. Both Nginx and Express run on the same hosts.
After reading that all static content should be served by Nginx and that Express should be left for dynamic requests only, I gave it a shot and set up the Nginx config. It works perfectly. So now all JS, CSS and HTML assets are served by Nginx itself.
Now how do I prove, in terms of numbers, that this is a better setup? Should I use some tool to simulate requests (to both the older and the newer setup) and compare the average load times of assets?
Open your browser => Dev tools => Network
Here you can see the network wait time and download time for every request. So you can open your webpage under each config and compare the timings.
This works best in a local environment, where network latency has minimal effect on the measurements.
Other than that you can do a load test. Google load testing tools!
In a word, "benchmark." You have two configurations. You need to understand the efficiencies under each model. To do so you need to instrument the hosts to collect data on the finite resources (CPU, DISK, MEMORY, NETWORK and related sub statistics) as well as response times.
Any performance testing tool which exercises the HTTP interface and allows for the collection and aggregation of your monitoring data while under test should do the trick. You should be able to collect information on the most common paths through your site, the number of users on your system for any given slice of time, the average session duration (iteration interval) all from an examination of the logs. The most common traversals then become the basis for the business processes you will need to replicate with your performance testing tool.
If you have not engaged in performance testing efforts before, then this would be a good time to tag someone in your organization who does this work on an ongoing basis. The learning curve is steep and (if you haven't done this before and you have no training or mentor) fairly long. You can burn a lot of cycles on poor tests/benchmark executions before you get it "right", where you can genuinely compare the performance of configuration A to configuration B.
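Once response times have been collected for both configurations, the comparison itself is simple arithmetic. The sketch below is one way to summarize two sets of samples; the summarize and compare names are hypothetical, and the sample values in the usage note are placeholders, not measurements.

```javascript
// Summarizes response times (in milliseconds) from one benchmark run.
function summarize(samplesMs) {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
  const percentile = (p) =>
    sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
  return {
    count: sorted.length,
    mean,
    p50: percentile(50),
    p95: percentile(95),
    max: sorted[sorted.length - 1],
  };
}

// Difference of configuration B relative to configuration A (negative means B is faster).
function compare(samplesA, samplesB) {
  const a = summarize(samplesA);
  const b = summarize(samplesB);
  return {
    meanDeltaMs: b.mean - a.mean,
    p95DeltaMs: b.p95 - a.p95,
  };
}
```

For example, compare(timingsExpressOnly, timingsWithNginx) with two arrays exported from the load testing tool. Looking at the 95th percentile as well as the mean matters because a few slow requests can hide behind a good average.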

Steps to improve throughput of Node JS server application

I have a very simple Node.js application that accepts JSON data (approx. 1 KB) via the POST request body. The response is sent back immediately to the client and the JSON is posted asynchronously to an Apache Kafka queue. The number of simultaneous requests can go as high as 10000 per second, which we are simulating using Apache JMeter running on three different machines. The target is to achieve an average response time of less than one second with no failed requests.
On a 4-core machine, the app handles up to 4015 requests per second without any failures. However, since the target is 10000 requests per second, we deployed the Node app in a clustered environment.
Both clustering on the same machine and clustering between two different machines (as described here) were implemented. Nginx was used as a load balancer to round-robin the incoming requests between the two Node instances. We expected a significant improvement in the throughput (like documented here), but the results were the opposite.
The number of successful requests dropped to around 3100 requests per second.
My questions are:
What could have gone wrong in the clustered approach?
Is this even the right way to increase the throughput of Node application?
We also did a similar exercise with a Java web application in a Tomcat container, and it performed as expected: 4000 requests with a single instance and around 5000 successful requests in a cluster with two instances. This contradicts our belief that Node.js performs better than Tomcat. Is Tomcat generally better because of its thread-per-request model?
Thanks a lot in advance.
Per your request, I'll put my comments into an answer:
Clustering is generally the right approach, but whether or not it helps depends upon where your bottleneck is. You will need to do some measuring and some experiments to determine that. If you are CPU-bound and running on a multi-core computer, then clustering should help significantly. I wonder if your bottleneck is something besides CPU such as networking or other shared I/O or even Nginx? If that's the case, then you need to fix that before you would see the benefits of clustering.
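One rough way to check whether a Node process is CPU-bound is to compare the CPU time it consumes with the wall-clock time that has passed. The sketch below uses process.cpuUsage() and process.hrtime.bigint() from Node core; the cpuUtilization and sampleCpu names are hypothetical, and the sampling interval is up to the caller.

```javascript
// Ratio of CPU time to wall-clock time. usageDelta is what process.cpuUsage(previous)
// returns (microseconds); elapsedNs is a BigInt of nanoseconds.
function cpuUtilization(usageDelta, elapsedNs) {
  const cpuUs = usageDelta.user + usageDelta.system;
  const wallUs = Number(elapsedNs) / 1000;
  return cpuUs / wallUs;
}

// Calls onSample(ratio) every intervalMs. A ratio that stays near 1 suggests the
// main thread is saturated (CPU-bound); a low ratio under load points to a
// bottleneck elsewhere. The value can exceed 1 because libuv pool threads count too.
function sampleCpu(intervalMs, onSample) {
  let lastUsage = process.cpuUsage();
  let lastTime = process.hrtime.bigint();
  const timer = setInterval(() => {
    const now = process.hrtime.bigint();
    const delta = process.cpuUsage(lastUsage);
    onSample(cpuUtilization(delta, now - lastTime));
    lastUsage = process.cpuUsage();
    lastTime = now;
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return () => clearInterval(timer);
}

// Usage: const stop = sampleCpu(1000, (ratio) => console.log(ratio.toFixed(2)));
```

If the ratio stays low while requests are failing, adding more Node processes is unlikely to help, which would be consistent with the drop seen after clustering.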
Is tomcat generally better because of its thread per request model?
No. That's not a good generalization. If you are CPU-bound, then threading can help (and so can clustering with nodejs). But, if you are I/O bound, then threads are often more expensive than async I/O like nodejs because of the resource overhead of the threads themselves and the overhead of context switching between threads. Many apps are I/O bound which is one of the reasons node.js can be a very good choice for server design.
I forgot to mention that for http, we are using express instead of the native http provided by node. Hope it does not introduce an overhead to the request handling?
Express is very efficient and should not be the source of any of your issues.
As jfriend said, you need to find the bottlenecks.
One thing you can try is to reduce the bandwidth used by passing the JSON over sockets, in particular with this library: https://github.com/uNetworking/uWebSockets.
The main reason is that every HTTP request carries header and parsing overhead that a persistent socket connection avoids.
Good example: https://webcheerz.com/one-million-requests-per-second-node-js/
Lastly, you can also compress the JSON, either via HTTP gzip or with a third-party module.
Work on the weight ^^
Hope it helps!
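To get a feeling for how much gzip saves on a JSON body of about 1 KB, a quick check with Node's built-in zlib module is enough. The payload below is invented for illustration; real savings depend on how repetitive the actual data is.

```javascript
const zlib = require('zlib');

// Hypothetical payload of roughly 1 KB with repetitive field names.
const payload = JSON.stringify({
  events: Array.from({ length: 20 }, (_, i) => ({
    id: i,
    type: 'page_view',
    path: '/products',
  })),
});

const raw = Buffer.from(payload, 'utf8');
const compressed = zlib.gzipSync(raw);
const restored = zlib.gunzipSync(compressed).toString('utf8');

console.log(`raw: ${raw.length} bytes, gzip: ${compressed.length} bytes`);
console.log(`round trip intact: ${restored === payload}`);
```

The synchronous calls are used here only to keep the check short. Compression costs CPU time, so on a server that is already CPU-bound it can make throughput worse rather than better.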

Load Balancing in Nodejs

I recently started with Node and I have been reading a lot about its limitation of being single-threaded and how it does not utilise your cores, and then I read this:
http://bit.ly/1n2YW68 (which talks about the new cluster module of Node.js for load balancing)
Now I'm not sure I completely agree with it :) because the first thing I thought of, before starting with Node, for making it utilise the cores with proper load balancing was to do it in the web server, with something like the upstream module of nginx,
doing something like this:
upstream domain1 {
    # upstream servers are given as host[:port], without a scheme
    server nodeapp1;
    server nodeapp2;
    server nodeapp3;
}
So my question is: is there any significant advantage to using the cluster module for load balancing across cores over web-server load balancing, or is the blog post too far from real use?
Note: I'm not concerned about load balancing handled by app servers like Passenger (Passenger has Node.js support as well, but that's not the answer I'm looking for :)), which I already know since I'm mostly a Ruby programmer.
One other option for clustering Node.js applications is to deploy the app using PM2.
Clustering is as easy as this; you don't need to implement it by hand:
pm2 start app.js -i max
PM2 auto-detects the number of available CPUs and runs as many processes as possible.
Read about PM2 cluster mode here:
http://pm2.keymetrics.io/docs/usage/cluster-mode/
For controlling the load of I/O operations, I wrote a library called QueueP that uses the memoization concept. You can even customize the memoization logic, and sometimes gain speedups of more than 10x.
https://www.npmjs.com/package/queuep
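The general idea of memoizing I/O can be illustrated without any library. The sketch below is not QueueP's API; it is a generic, hypothetical coalesce helper that lets concurrent callers asking for the same key share a single in-flight operation.

```javascript
// Wraps an async (or sync) loader so that concurrent calls with the same key
// share one pending promise instead of each hitting the backend.
function coalesce(fn) {
  const inFlight = new Map();
  return function load(key) {
    if (inFlight.has(key)) return inFlight.get(key);
    // fn runs right away; a synchronous throw becomes a rejected promise.
    const promise = new Promise((resolve) => resolve(fn(key))).finally(() => {
      // Forget the entry once settled, so later calls fetch fresh data.
      inFlight.delete(key);
    });
    inFlight.set(key, promise);
    return promise;
  };
}

// Usage with a hypothetical database call:
// const getUser = coalesce((id) => db.findUserById(id));
// Promise.all([getUser(1), getUser(1)]) then triggers only one query.
```

Because entries are dropped as soon as the operation settles, this only deduplicates overlapping requests. Caching results across time would need an expiry policy on top.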
As far as I know, the built-in Node cluster module is not a good solution yet (load is not evenly distributed across cores), at least until v0.12: http://strongloop.com/strongblog/whats-new-in-node-js-v0-12-cluster-round-robin-load-balancing/
So you should use nginx until then. After that we will see some benchmarks comparing both options and see if the built in cluster module is a good choice.

run multiple instances of node.js in parallel

I was thinking about using a reverse proxy to distribute API requests to multiple Node.js instances of a REST API. This way it should be possible to achieve much better overall performance, since multi-core systems can run multiple instances on one core each (or similar).
What are common solutions for distributing requests across multiple Node instances, and what are important points to keep in mind?
First and foremost, you can use the cluster module for running many instances of the same server application. It's important to remember to correctly handle shared state, such as storing sessions in a common database.
This works standalone and you can let your users connect directly to that server, or use e.g. nginx, HAProxy, Varnish or lighttpd in front of your server.
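A minimal sketch of the cluster approach mentioned above could look like this. The workerCount and start names and the port are assumptions for illustration; the cluster calls themselves (fork, the exit event, isPrimary/isMaster) are part of Node core.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Number of workers to fork: the requested count, capped at the number of CPUs.
function workerCount(requested, cpuCount = os.cpus().length) {
  return Math.max(1, Math.min(requested || cpuCount, cpuCount));
}

function start(port, requestedWorkers) {
  // isPrimary exists from Node 16 on; older versions only have isMaster.
  const isPrimary = cluster.isPrimary || cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount(requestedWorkers); i += 1) {
      cluster.fork();
    }
    // Replace workers that die. A real setup would add a backoff here so a
    // worker that crashes on startup does not cause a fork loop.
    cluster.on('exit', () => cluster.fork());
  } else {
    // Every worker listens on the same port; connections are spread across them.
    http
      .createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      })
      .listen(port);
  }
}

// To run it: start(3000);
```

Each worker is a separate process with its own memory, so anything kept in a plain variable (sessions, counters, caches) is visible to one worker only. That is why shared state has to live in a common store.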

Scaling Node.JS across multiple cores / servers

OK, so I have an idea I want to pursue, but before I do I need to understand a few things fully.
First, the way I think I'm going to go ahead with this system is to have 3 servers, which are described below:
The first server will be my web front end. This is the server that will be listening for connections and responding to clients; it will have 8 cores and 16 GB of RAM.
The Second Server will be the Database Server, pretty self explanatory really, connect to the host and set / get data.
The Third Server will be my storage server, this will be where downloadable files are stored.
My first question is:
On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?
My second question is:
Is there a system out there that I can drop into my application framework that will allow me to talk to the other cores and pass messages around to save I/O?
And my final question:
Is there any system I can use to help move the content from my storage server to the request on the front-end server with as little overhead as possible? Speed is a concern here, as we would have 500+ clients downloading and uploading concurrently at peak times.
I have finally convinced my employer that node.js is extremely fast and that it's the latest in programming technology, and that we should invest in a platform for our intranet system, but he has requested detailed documentation on how this could be scaled across the hardware we currently have available.
On my front end server, I have 8 cores, what's the best way to scale node so that the load is distributed across the cores?
Try to look at node.js cluster module which is a multi-core server manager.
Firstly, I wouldn't describe the setup you propose as 'scaling'; it's more like 'spreading'. You only have one app server serving the requests. If you add more app servers in the future, that is when you will have a scaling problem.
I understand that node.js is single-threaded, which implies that it can only use a single core. Not my area of expertise on how to/if you can scale it, will leave that part to someone else.
I would suggest NFS mounting a directory on the storage server to the app server. NFS has relatively low overhead. Then you can access the files as if they were local.
Concerning your first question: use cluster (we already use it in a production system, works like a charm).
When it comes to worker messaging, I cannot really help you out. But your best bet is cluster too. Maybe there will be some functionality that provides "inter-core" messaging across all cluster workers in the future (I don't know the roadmap of cluster, but it seems like an idea).
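For reference, the cluster module in current Node.js releases does expose a message channel between the primary process and each worker (worker.send in the primary, process.send in a worker). Workers do not talk to each other directly, so the primary has to relay. The sketch below shows one way to do that; the broadcast and relayBetweenWorkers names and the message shape are hypothetical.

```javascript
const cluster = require('cluster');

// Sends `message` to every worker in `workers` (an object keyed by worker id,
// shaped like cluster.workers), optionally skipping the sender.
function broadcast(workers, message, exceptId) {
  let sent = 0;
  for (const worker of Object.values(workers)) {
    if (worker && worker.id !== exceptId) {
      worker.send(message);
      sent += 1;
    }
  }
  return sent;
}

// In the primary: relay whatever one worker sends to all the other workers.
function relayBetweenWorkers() {
  cluster.on('message', (worker, message) => {
    broadcast(cluster.workers, message, worker.id);
  });
}

// In a worker:
//   process.send({ type: 'cache-invalidate', key: 'user:42' });
//   process.on('message', (msg) => { /* react to msg */ });
```

Messages are serialized and copied between processes, so this suits small notifications such as cache invalidation. It is not a way to share large data structures between cores.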
For your third requirement, I'd use a low-overhead protocol like NFS or (if you can go really crazy when it comes to infrastructure) a high-speed SAN backend.
Another piece of advice: use MongoDB as your database backend. You can start with low-end hardware and scale up your database instance with ease using MongoDB's sharding/replica set features (if that is some kind of requirement).
