How to send the maximal possible number of GET requests using NodeJS? - node.js

My task is to send as much as possible GET requests using standart nodejs http module (with http.get) to a remote server (for the data import, not DDOS :) ) But after a certain number of requests sending stops stop or go on very slowly.
I have already set the value http.globalAgent.maxSockets = Infinity, req.setNoDelay(true); and req.setSocketKeepAlive(true);. Also i make the requests in the async queue with 10-1000 concurrency and it affects the number of connections sent to a stop. I increased ulimit -n to a maximum.
Does somebody have advice or similar experience? Maybe I do something wrong?

See my appropriate node issues on GitHub and Stackoverflow
Maybe my workaround described there works for you too. Instead of modifying the globalAgent I disabled it.

Related

ForEach Bulk Post Requests are failing

I have this script where I'm taking a large dataset and calling a remote api, using request-promise, using a post method. If I do this individually, the request works just fine. However, if I loop through a sample set of 200-records using forEach and async/await, only about 6-15 of the requests come back with a status of 200, the others are returning with a 500 error.
I've worked with the owner of the API, and their logs only show the 200-requests. So I don't think node is actually sending out the ones that come back as 500.
Has anyone run into this, and/or know how I can get around this?
To my knowledge, there's no code in node.js that automatically makes a 500 http response for you. Those 500 responses are apparently coming from the target server's network. You could look at a network trace on your server machine to see for sure.
If they are not in the target server logs, then it's probably coming from some defense mechanism deployed in front of their server to stop misuse or overuse of their server (such as rate limiting from one source) and/or to protect its ability to respond to a meaningful number of requests (proxy, firewall, load balancer, etc...). It could even be part of a configuration in the hosting facility.
You will likely need to find out how many simultaneous requests the target server will accept without error and then modify your code to never send more than that number of requests at once. They could also be measuring requests/sec to it might not only be an in-flight count, but could be the rate at which requests are sent.

Does the request Node module limit outbound requests?

Running into an issue when load testing a server page, where we are making outbound requests to an api using the request module, and then on completion or error, redirecting the page (using express). Under stress the page we are posting to times out, and I'm wondering if it has to do with the max maxSockets parameter in that module. Does anyone know the default maxSockets for the request module? how to I change this to something reasonable.
Thanks
Max sockets for the HTTP agent is defaulted to 5 per host. Just set it to whatever amount is useful to you, maybe 1,000.
It's documented here: https://nodejs.org/api/http.html#http_agent_maxsockets

Throttling express server

I'm using a very simple express server, with a PUT and GET routes on an Ubuntu machine, but if I use several clients (around 8) doing requests at the same time it very easily gets flooded and starts to return connect EADDRNOTAVAIL errors. I have found no way to avoid this other than reducing the number of requests per client, but is there a way to throttle answers on the server so that instead of returning error it queues petitions and serves them in due time?
Maybe it's better to check whether there are answers to requests on the client and not insert new ones if they have not been served? Client is here
Queuing seems to be wrong, you should first check your current ulimit (every connection needs a handle).
To solve your problem, just change the ulimit.

Optimizing Node.js for a large number of outbound HTTP requests?

My node.js server is experiencing times when it becomes slow or unresponsive, even occasionally resulting in 503 gateway timeouts when attempting to connect to the server.
I am 99% sure (based upon tests that I have run) that this lag is coming specifically from the large number of outbound requests I am making with the node-oauth module to contact external APIs (Facebook, Twitter, and many others). Admittedly, the number of outbound requests being made is relatively large (in the order of 30 or so per minute). Even worse, this frequently means that the corresponding inbound requests to my server can take ~5-10 seconds to complete. However, I had a previous version of my API which I had written in PHP which was able to handle this amount of outbound requests without any problem at all. Actually, the CPU usage for the same number (or even fewer) requests with my Node.js API is about 5x that of my PHP API.
So, I'm trying to isolate where I can improve upon this, and most importantly to make sure that 503 timeouts do not occur. Here's some stuff I've read about or experimented with:
This article (by LinkedIn) recommends turning off socket pooling. However, when I contacted the author of the popular nodejs-request module, his response was that this was a very poor idea.
I have heard it said that setting "http.globalAgent.maxSockets" to a large number can help, and indeed it did seem to reduce bottlenecking for me
I could go on, but in short, I have been able to find very little definitive information about how to optimize performance so these outbound connections do not lag my inbound requests from clients.
Thanks in advance for any thoughts or contributions.
FWIW, I'm using express and mongoose as well, and my servers are hosted on the Amazon Cloud (2x M1.Large for the node servers, 2x load balancers, and 3x M1.Small MongoDB instances).
It sounds to me that the Agent is capping your requests to the default level of 5 per-host. Your tests show that cranking up the agent's maxSockets helped... you should do that.
You can prove this is the issue by firing up a packet sniffer, or adding more debugging code to your application, to show that this is the limiting factor.
http://engineering.linkedin.com/nodejs/blazing-fast-nodejs-10-performance-tips-linkedin-mobile
Disable the agent altogether.

Why is node.js only processing six requests at a time?

We have a node.js server which implements a REST API as a proxy to a central server which has a slightly different, and unfortunately asymmetric REST API.
Our client, which runs in various browsers, asks the node server to get the tasks from the central server. The node server gets a list of all the task ids from the central one and returns them to the client. The client then makes two REST API calls per id through the proxy.
As far as I can tell, this stuff is all done asynchronously. In the console log, it looks like this when I start the client:
Requested GET URL under /api/v1/tasks/*: /api/v1/tasks/
This takes a couple seconds to get the list from the central server. As soon as it gets the response, the server barfs this out very quickly:
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/438
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/438
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/439
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/439
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/441
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/441
Then, each time a pair of these requests gets a result from the central server, another two lines is barfed out very quickly.
So it seems our node.js server is only willing to have six requests out at a time.
There are no TCP connection limits imposed by Node itself. (The whole point is that it's highly concurrent and can handle thousands of simultaneous connections.) Your OS may limit TCP connections.
It's more likely that you're either hitting some kind of limitation of your backend server, or you're hitting the builtin HTTP library's connection limit, but it's hard to say without more details about that server or your Node implementation.
Node's built-in HTTP library (and obviously any libraries built on top of it, which are most) maintains a connection pool (via the Agent class) so that it can utilize HTTP keep-alives. This helps increase performance when you're running many requests to the same server: rather than opening a TCP connection, making a HTTP request, getting a response, closing the TCP connection, and repeating; new requests can be issued on reused TCP connections.
In node 0.10 and earlier, the HTTP Agent will only open 5 simultaneous connections to a single host by default. You can change this easily: (assuming you've required the HTTP module as http)
http.globalAgent.maxSockets = 20; // or whatever
node 0.12 sets the default maxSockets to Infinity.
You may want to keep some kind of connection limit in place. You don't want to completely overwhelm your backend server with hundreds of HTTP requests under a second – performance will most likely be worse than if you just let the Agent's connection pool do its thing, throttling requests so as to not overload your server. Your best bet will be to run some experiments to see what the optimal number of concurrent requests is in your situation.
However, if you really don't want connection pooling, you can simply bypass the pool entirely – sent agent to false in the request options:
http.get({host:'localhost', port:80, path:'/', agent:false}, callback);
In this case, there will be absolutely no limit on concurrent HTTP requests.
It's the limit on number of concurrent connections in the browser:
How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?
I have upvoted the other answers, as they helped me diagnose the problem. The clue was that node's socket limit was 5, and I was getting 6 at a time. 6 is the limit in Chrome, which is what I was using to test the server.
How are you getting data from the central server? "Node does not limit connections" is not entirely accurate when making HTTP requests with the http module. Client requests made in this way use the http.globalAgent instance of http.Agent, and each http.Agent has a setting called maxSockets which determines how many sockets the agent can have open to any given host; this defaults to 5.
So, if you're using http.request or http.get (or a library that relies on those methods) to get data from your central server, you might try changing the value of http.globalAgent.maxSockets (or modify that setting on whatever instance of http.Agent you're using).
See:
http.Agent documentation
agent.maxSockets documentation
http.globalAgent documentation
Options you can pass to http.request, including an agent parameter to specify your own agent
Node js can handle thousands of incoming requests - yes!
But when it comes down to ougoing requests every request has to deal with a dns lookup and dns lookup's, disk reads etc are handled by the libuv which is programmed in C++. The default value of threads for each node process is 4x threads.
If all 4x threads are busy with https requests ( dns lookup's ) other requests will be queued. That is why no matter how brilliant your code might be : you sometimes get 6 or sometimes less concurrent outgoing requests per second completed.
Learn about dns cache to reduce the amount of dns look up's and increase libuv size. If you use PM2 to manage your node processes they do have a well documentation on their side on environment variables and how to inject them. What you are looking for is the environment variable UV_THREADPOOL_SIZE = 4
You can set the value anywhere between 1 or max limit of 1024. But keep in mind libuv limit of 1024 is across all event loops.
I have seen the same problem in my server. It was only processing 4 requests.
As explained already from 0.12 maxsockets defaults to infinity. That easily overwhelms the sever. Limiting the requests to say 10 by
http.globalAgent.maxSockets = 20;
solved my problem.
Are you sure it just returns the results to the client? Node processes everything in one thread. So if you do some fancy response parsing or anything else which doesn't yield, then it would block all your requests.

Resources