We have a simple Express (Node.js) server deployed on Windows Server 2012 that receives GET requests with just 3 parameters. It does some minor processing on these parameters and has a very simple in-memory node-cache for caching some of these parameter combinations. It interfaces with an external license server to fetch a license for the requesting user and sets it in a cookie. After that, it interfaces with some workers via a load balancer (running with zmq) to download some large files in chunks, unzip and extract them, write them to some directories, and display them to the user. On deploying these files, some other calls to the workers are initiated as well.
The node server does not talk to any database or disk. It simply waits for responses from the load balancer running on some other machines (these are long operations, typically taking 2-3 minutes to respond). So, essentially, the computation and database interactions happen on other machines. The node server is only a simple message-passing/handshaking server that waits for responses in event handlers, initiates other requests, and renders the response.
We are not using the 'cluster' module or nginx at the moment. With a bare-bones Node server, is it possible to accept and process at least 16 requests simultaneously? Pages such as this one http://adrianmejia.com/blog/2016/03/23/how-to-scale-a-nodejs-app-based-on-number-of-users/ mention that a simple Node server can handle only 2-9 requests at a time. But even with our bare-bones implementation, no more than 4 requests are accepted at a time.
Is using the cluster module or nginx necessary even for this case? How can we scale this application to a few hundred users to begin with?
An Express server can handle many more than 9 requests at a time, especially if it isn't talking to a database.
The article you're referring to assumes some database access on each request and static assets served by Node itself rather than a CDN, all of it on a single CPU with 1 GB of RAM. That's a database and a web server sharing a single core with minimal RAM.
There really are no hard numbers on this sort of thing; you build it and see how it performs. If it doesn't perform well enough, put a reverse proxy such as nginx or HAProxy in front of it to do load balancing.
However, if you really are hitting a bottleneck where only 4 connections are possible at a time, it sounds like you're keeping those connections open far too long and blocking others. It's better to have Node kick off those long-running processes, close the connections, and then have those servers call back somehow when they're done.
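A rough sketch of that pattern, assuming Express (the route names, the in-memory jobs map, and startLongRunningWork are placeholders for your zmq/load-balancer round trip):

const express = require('express');

const app = express();
let nextJobId = 1;
const jobs = new Map(); // jobId -> { status, result }

app.get('/process', (req, res) => {
  const jobId = String(nextJobId++);
  jobs.set(jobId, { status: 'pending', result: null });

  // Kick off the 2-3 minute worker round trip without holding the HTTP
  // connection open; startLongRunningWork stands in for your zmq call.
  startLongRunningWork(req.query, (err, result) => {
    jobs.set(jobId, err ? { status: 'failed', result: null }
                        : { status: 'done', result });
  });

  // Respond immediately; the client polls for (or is notified of) the result.
  res.status(202).json({ jobId, statusUrl: '/status/' + jobId });
});

app.get('/status/:jobId', (req, res) => {
  const job = jobs.get(req.params.jobId);
  if (!job) return res.status(404).end();
  res.json(job);
});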
Related
I have no idea what could be causing this.
I have a Node application which connects to an external server over TCP and communicates with it. Part of its functionality also includes making relatively frequent HTTP requests.
Each instance of the application establishes up to 30 TCP connections to the external server, and makes HTTP requests as needed. Previously, I've been hosting the application on relatively cheap VPSes, with one instance of the application per server.
Now I'm setting it up on a proper dedicated server. I could run one instance on the dedicated server and raise the connection limit so that one instance covers several of the smaller VPS instances, but I'd rather run several instances of the application on the dedicated server, each limited to 30 connections.
The application also writes logs to disk (just a plain flat file), and sends logs via UDP to an external logging server. This is done using winston.
After some uptime, however, I'm experiencing an issue where HTTP requests time out (ETIMEDOUT) and the logs stop being written to disk. The application itself is still running, and the TCP connection to the server is still active and working. I can communicate with the application through that connection and it responds as expected. The logging server is still receiving the UDP packets as well. I've noticed that the log files stop being written to, but after a few minutes they finally appear to be flushed to disk, and the missed logs then appear.
My first suspicion was an open-files limit being hit, but the OS (Ubuntu) doesn't have a limit that I'm hitting. I tried disabling any Node HTTP Agent behavior (I'm using the request module, so I just passed false for the agent option).
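For reference, bypassing the pool with the request module looks roughly like this (the URL is a placeholder):

const request = require('request');

// agent: false gives each request its own unpooled connection
request({ url: 'http://example.com/resource', agent: false }, (err, res, body) => {
  // handle err / body here
});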
It's not the webserver on the other end rejecting my connections. While the issue was occurring I was able to successfully wget a file from the webserver using the same external IP as the Node app is using.
I'm tailing the log file and noticing that the time between when a line is generated and when it's flushed to the disk is gradually increasing.
CPU and memory usage are low, so that's not the issue. iowait in top is 0.0. I have no idea where to go from here; any help at all would be greatly appreciated.
I'm running Node 5.10.1.
I am running a Node.js server which is now getting more load, and I need to start running it on multiple cores, since Node.js is single-threaded and can only use one.
This should be a simple problem given the Node.js cluster module and the many npm packages for this very thing.
The problem is that I need browser sessions to stick to the same Node.js worker after the first request. This is because I store authentication data and the like in a single worker process and do not want to open the can of worms of messaging between worker processes.
The browser stores a session id cookie once authenticated, and I want a way to route requests back to the correct worker based on that session cookie.
Nginx looks promising, but I know nothing about it. Before I spend hours diving into it, I would like to know whether it is capable of routing to Node.js worker processes based on arbitrary data from the request headers, such as a session cookie.
Is this doable? If it is, I'll get down and dirty learning Nginx from the ground up.
I assume you are storing your sessions in Node.js memory. You might want to store your sessions in Redis instead. That way the session data is persisted outside of a single server and can be accessed from multiple processes.
In addition to Redis, you might also want to look into Amazon Elastic Beanstalk for managing your load balancing. You can set up an nginx proxy to route your requests to multiple servers based on their load.
This link might be able to get you started: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_nodejs_express_elasticache.html
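A minimal sketch of the Redis-backed session store suggested above, assuming express-session with the classic connect-redis API (the secret and Redis host are placeholders):

const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis')(session); // classic connect-redis API

const app = express();

app.use(session({
  store: new RedisStore({ host: '127.0.0.1', port: 6379 }),
  secret: 'replace-with-a-real-secret',
  resave: false,
  saveUninitialized: false
}));

// Any worker or server behind the load balancer can now read req.session,
// so requests no longer need to stick to a single Node process.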
I've got an NGinx/Node/Express3/Socket.io/Redis/Backbone/Backbone.Marionette app that proxies requests to a PHP/MySQL REST API. I need to load test the entire stack as a whole.
My app takes advantage of static asset caching with NGinx and clustering with node/express, and Socket.io is multi-core enabled using Redis. All that's to say, I've gone to a lot of trouble to try and make sure it can stand up to the load.
I hit it with 50,000 users in 10 seconds using blitz.io and it didn't even blink... which concerned me, because I wanted to see it crash, or at least breathe a little heavy. But 50k was the max you could throw at it with that tool, which suggests they don't expect you to reasonably be able to, or need to, handle more than that. That's when I realized it wasn't actually incurring the load I was expecting, because the real load only starts after the page loads, when the Backbone app starts up, kicks off the socket connection, and requests the data from the REST API endpoint (on a different server).
So, here's my question:
How can I load test the entire app as a whole? I need the load test to tax the server in the same way that the clients actually will, which means:
Request the single page Backbone app from my NGinx/Node/Express server
Kick off requests for the static assets from NGinx (simulating what the browser would do)
Kick off requests to the REST API (PHP/MySQL running on a different server)
Create the connection to the Socket.io service (running on NGinx/Node/Express, utilizing Redis to handle multi-core junk)
If the testing tool uses a browser-like environment to load the page, parse the JS, and run it, everything will be copacetic (the NGinx/Node/Express server will get hit and so will the PHP/MySQL server). Otherwise, the testing tool will need to simulate this by firing off at least a dozen different kinds of requests nearly simultaneously; anything less is like stress testing a door by looking at it 10,000 times (that is to say, it's pointless).
I need to ensure my app can handle 1,000 users hitting it in under a minute all loading the same page.
You should learn to use Apache JMeter: http://jmeter.apache.org/
You can perform stress tests with it; see this tutorial: https://www.youtube.com/watch?v=8NLeq-QxkSw
As you said, "I need the load test to tax the server in the same way that the clients actually will."
That means the test is agnostic to the technology you are using.
I highly recommend JMeter; it is widely used, and you can integrate it with Jenkins and do a lot of cool stuff with it.
We are developing a JavaScript control which should be constantly connected to a server for receiving animation updates.
We are planning to host this on the Amazon cloud.
The scenario is like this: the server connects to an ActiveMQ queue and waits for updates; for each update, it broadcasts it to all connected clients.
Is it even possible to handle such a load (on the order of 50k simultaneous clients) with node.js + socket.io?
Will a single node.js server be able to handle such a load?
How do we organize fast transport between different nodes if we have to use more than one node?
Will a single node.js server be able to handle such a load? How to organize fast transport between different nodes if we have to use more than one node?
You say that you are planning to host on Amazon. So first off, nothing should be scoped for a single server. Amazon machines will simply "disappear"; you have to assume that you are going to use multiple machines.
...handling 50k simultaneous clients
So to start with, 50k connections for a single box is a very big number. Here's a very detailed blog post discussing "getting to 10k" with node.js+socket.io.
Here's a very telling quote:
it seemed as though 10,000 clients simply required more serialization
than my server was able to handle.
So a key component to "getting to 50k" is going to be the amount of work required just pushing data over the wire.
How to organize fast transport between different nodes if we have to use more than one node.
That blog post is the first of 3. When you're done the first, read the other two. That should point you in the right direction.
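As one concrete example of cross-node transport (not from the linked posts; a sketch using the classic socket.io-redis adapter, with host and port as placeholders):

const http = require('http');
const socketio = require('socket.io');
const redisAdapter = require('socket.io-redis');

const server = http.createServer();
const io = socketio(server);

// Every Node process pointed at the same Redis instance shares broadcasts,
// so io.emit() on one box reaches clients connected to any other box.
io.adapter(redisAdapter({ host: '127.0.0.1', port: 6379 }));

io.on('connection', (socket) => {
  // push animation updates pulled from the ActiveMQ queue, e.g.:
  // socket.emit('update', payload);
});

server.listen(3000);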
We have a node.js server which implements a REST API as a proxy to a central server which has a slightly different, and unfortunately asymmetric REST API.
Our client, which runs in various browsers, asks the node server to get the tasks from the central server. The node server gets a list of all the task ids from the central one and returns them to the client. The client then makes two REST API calls per id through the proxy.
As far as I can tell, this stuff is all done asynchronously. In the console log, it looks like this when I start the client:
Requested GET URL under /api/v1/tasks/*: /api/v1/tasks/
This takes a couple seconds to get the list from the central server. As soon as it gets the response, the server barfs this out very quickly:
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/438
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/438
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/439
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/439
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/441
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/441
Then, each time a pair of these requests gets a result from the central server, another two lines are barfed out very quickly.
So it seems our node.js server is only willing to have six requests out at a time.
There are no TCP connection limits imposed by Node itself. (The whole point is that it's highly concurrent and can handle thousands of simultaneous connections.) Your OS may limit TCP connections.
It's more likely that you're either hitting some kind of limitation of your backend server, or you're hitting the built-in HTTP library's connection limit, but it's hard to say without more details about that server or your Node implementation.
Node's built-in HTTP library (and obviously any libraries built on top of it, which are most) maintains a connection pool (via the Agent class) so that it can utilize HTTP keep-alives. This helps increase performance when you're running many requests to the same server: rather than opening a TCP connection, making an HTTP request, getting a response, closing the TCP connection, and repeating, new requests can be issued on reused TCP connections.
In node 0.10 and earlier, the HTTP Agent will only open 5 simultaneous connections to a single host by default. You can change this easily (assuming you've required the HTTP module as http):
http.globalAgent.maxSockets = 20; // or whatever
Node 0.12 sets the default maxSockets to Infinity.
You may want to keep some kind of connection limit in place. You don't want to completely overwhelm your backend server with hundreds of HTTP requests in under a second; performance will most likely be worse than if you just let the Agent's connection pool do its thing, throttling requests so as not to overload your server. Your best bet will be to run some experiments to see what the optimal number of concurrent requests is in your situation.
However, if you really don't want connection pooling, you can simply bypass the pool entirely by setting agent to false in the request options:
http.get({host:'localhost', port:80, path:'/', agent:false}, callback);
In this case, there will be absolutely no limit on concurrent HTTP requests.
It's the limit on the number of concurrent connections in the browser:
How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?
I have upvoted the other answers, as they helped me diagnose the problem. The clue was that node's socket limit was 5, and I was getting 6 at a time. 6 is the limit in Chrome, which is what I was using to test the server.
How are you getting data from the central server? "Node does not limit connections" is not entirely accurate when making HTTP requests with the http module. Client requests made in this way use the http.globalAgent instance of http.Agent, and each http.Agent has a setting called maxSockets which determines how many sockets the agent can have open to any given host; this defaults to 5.
So, if you're using http.request or http.get (or a library that relies on those methods) to get data from your central server, you might try changing the value of http.globalAgent.maxSockets (or modify that setting on whatever instance of http.Agent you're using).
See:
http.Agent documentation
agent.maxSockets documentation
http.globalAgent documentation
Options you can pass to http.request, including an agent parameter to specify your own agent
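For example, a dedicated agent with a higher socket cap passed per request (the host and the limit of 20 are just illustrative values):

const http = require('http');

// A dedicated pool for calls to the central server, instead of the
// 5-socket default on http.globalAgent in Node 0.10 and earlier.
const centralAgent = new http.Agent({ maxSockets: 20 });

http.get({
  host: 'central.example.com',
  path: '/api/v1/tasks/',
  agent: centralAgent
}, (res) => {
  // consume the response here
});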
Node.js can handle thousands of incoming requests, yes!
But when it comes to outgoing requests, every request has to deal with a DNS lookup, and DNS lookups, disk reads, etc. are handled by libuv, which is written in C++. The default threadpool size for each Node process is 4 threads.
If all 4 threads are busy with DNS lookups for HTTP(S) requests, other requests will be queued. That is why, no matter how brilliant your code might be, you sometimes see only 6 or fewer concurrent outgoing requests completed.
Look into DNS caching to reduce the number of DNS lookups, and increase the libuv threadpool size. If you use PM2 to manage your Node processes, they have good documentation on their site about environment variables and how to inject them. What you are looking for is the environment variable UV_THREADPOOL_SIZE (default 4).
You can set the value anywhere between 1 and the maximum limit of 1024, but keep in mind that libuv's limit of 1024 applies across all event loops.
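As a sketch, assuming a threadpool of 16 suits your workload; the variable has to be in place before anything touches the threadpool, so setting it in the shell or in PM2's env config is the more reliable route:

// Top of the entry point (e.g. server.js), before any dns.lookup/fs/crypto/zlib
// work runs. Equivalent shell form: UV_THREADPOOL_SIZE=16 node server.js
process.env.UV_THREADPOOL_SIZE = 16;

const express = require('express');
const app = express();
// ...rest of the app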
I have seen the same problem on my server. It was only processing 4 requests at a time.
As explained already, from Node 0.12 maxSockets defaults to Infinity, which can easily overwhelm the server. Limiting the number of concurrent sockets, for example with
http.globalAgent.maxSockets = 20;
solved my problem.
Are you sure it just returns the results to the client? Node processes everything in one thread, so if you do some fancy response parsing or anything else that doesn't yield, it will block all your other requests.
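A hypothetical illustration of the kind of handler that would cause this (assuming Express; fetchTaskIds and heavyTransform are placeholders):

app.get('/api/v1/tasks/', (req, res) => {
  fetchTaskIds((err, ids) => {            // async I/O: fine, doesn't block
    const payload = heavyTransform(ids);  // a big synchronous loop here stalls
                                          // the event loop, so every other
                                          // request waits until it returns
    res.json(payload);
  });
});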