Concurrent outbound HTTP Request in Node.js makes the response slower

I'm currently load testing one of my APIs (Node.js + Express). This API makes an HTTP request to another server. Here's some example code:
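const axios = require('axios')
var start = new Date()
axios.get('https://google.com')
  .then(function (response) {
    // Elapsed wall-clock time for this one outbound request, in seconds
    var end = (new Date() - start) / 1000
    console.info('Finished in %ds', end)
  })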
During the test, I found that the more concurrent HTTP requests I make to the other server (google.com in this example), the slower each response becomes. I use Apache JMeter for testing.
For example, if I do 1 request in one second:
Finished in 0.150s
But if I do 100 requests in one second:
Finished in 0.320s
...
Finished in 1.190s
Finished in 2.559s
Finished in 1.230s
Finished in 5.530s
At first I thought the problem was in the other server, but that is not the case: even after I changed the target to google.com (as in the example), the same thing happened.
The more outbound HTTP requests Node.js has to make, the slower each response becomes. I have tried to improve my API by using the Node cluster module; the workers help, but I want to improve the response time even further.
Is there anything I can do, or is there an explanation for why this happens? I thought that since my API makes asynchronous HTTP requests, nothing should block, so the response time should not increase by such a significant amount.
Thanks.

I was facing a similar issue. In my case I was awaiting each API call in turn rather than letting them all run concurrently.
To avoid this, you can collect all of your async API calls in an array. For example, if you need to call a series of URLs:
const urls = ['http...a/get','http...b/get']
// Start every request immediately; each axios.get() returns a pending promise
const requests = urls.map(item => axios.get(item))
Now that all of these calls are running concurrently, be sure to wait for every one of them to resolve before consuming the data.
const allAPIData = await Promise.all(requests)
Be sure to handle the case where any of the API calls fails, perhaps with a helper function that wraps axios.get(url) and catches its errors. Otherwise a single rejected promise makes Promise.all() reject, and the await throws before you can use the results of the other calls.
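As a rough sketch of that failure handling, Promise.allSettled can stand in for Promise.all when you want every outcome, including the failed ones. The names fakeGet and fetchAll are made up for illustration, and fakeGet only imitates axios.get(url) so that the snippet runs without a network:

```javascript
// Hypothetical stand-in for axios.get(url): resolves for most URLs and
// rejects for any URL containing "bad", so the example needs no network.
function fakeGet(url) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (url.includes('bad')) reject(new Error('request failed: ' + url))
      else resolve({ data: 'payload from ' + url })
    }, 10)
  })
}

// Start all requests at once and wait for every one to settle.
// Unlike Promise.all, Promise.allSettled never rejects.
async function fetchAll(urls, get) {
  const settled = await Promise.allSettled(urls.map(url => get(url)))
  return settled.map((outcome, i) =>
    outcome.status === 'fulfilled'
      ? { url: urls[i], ok: true, data: outcome.value.data }
      : { url: urls[i], ok: false, error: outcome.reason.message }
  )
}
```

With real requests you would call something like fetchAll(urls, url => axios.get(url)) and then decide per entry whether to retry, skip, or report the failure.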

Related

Concurrency in node js express app for get request with setTimeout

Console log Image
const express = require('express');
const app = express();
const port = 4444;
app.get('/', async (req, res) => {
  console.log('got request');
  await new Promise(resolve => setTimeout(resolve, 10000));
  console.log('done');
  res.send('Hello World!');
});
app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
If I send a GET request to http://localhost:4444 three times concurrently, it prints the following logs:
got request
done
got request
done
got request
done
Shouldn't the output instead look like the one below, since Node's event loop and callback queues are external to the process thread? (Maybe I am wrong, but I would like to understand Node's internals and its external APIs; please see the attached image.)
Javascript Run time environment
got request
got request
got request
done
done
done

Thanks to https://stackoverflow.com/users/5330340/phani-kumar
I found the reason why it was blocking. I was testing this by making the GET requests from the Chrome browser; when I tried the same thing in Firefox, it worked as expected.
The reason is this:
Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
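One way to rule out the browser is to fire the requests from a small Node script instead. The sketch below is a variant of the server in the question under a few assumptions: it uses the built-in http module rather than Express, a 200 ms delay instead of 10 s, and an ephemeral port. The names DELAY_MS, get and run are made up for illustration.

```javascript
const http = require('http')

// Same shape as the handler in the question, with a shorter delay
// (DELAY_MS is an illustrative value) and an in-memory log.
const DELAY_MS = 200
const log = []

const server = http.createServer(async (req, res) => {
  log.push('got request')
  await new Promise(resolve => setTimeout(resolve, DELAY_MS))
  log.push('done')
  res.end('Hello World!')
})

// Issue one GET and resolve once the response body has been fully read.
function get(port) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, agent: false }, res => {
      res.resume()
      res.on('end', resolve)
    }).on('error', reject)
  })
}

// Listen on an ephemeral port, send three requests at once, return the log.
function run() {
  return new Promise((resolve, reject) => {
    server.listen(0, '127.0.0.1', async () => {
      try {
        const { port } = server.address()
        await Promise.all([get(port), get(port), get(port)])
        server.close(() => resolve(log))
      } catch (err) {
        reject(err)
      }
    })
  })
}
```

Calling run().then(console.log) should show all three 'got request' entries before the first 'done', which is the interleaving the question expected once the browser is out of the picture.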

It is returning the response like this:
Node.js is an event-driven runtime. To understand the concurrency, look at how Node executes this code. Node runs your JavaScript on a single thread (although internally it uses multiple threads) and accepts requests as they come. In this case, Node accepts the request and registers a callback for the promise; while it waits for the event loop to execute that callback, it keeps accepting as many requests as its resources (memory, CPU, etc.) allow. Because the event loop has a timer queue for setTimeout, all of these callbacks are registered there, and once each timer expires the event loop runs its callback.
Single Threaded Event Loop Model Processing Steps:
Clients send requests to the Node.js server.
Node.js internally maintains a limited (configurable) thread pool to provide services to client requests.
Node.js receives those requests and places them into a queue known as the “Event Queue”.
Node.js internally has a component known as the “Event Loop”. It gets this name because it uses an indefinite loop to receive requests and process them.
The Event Loop uses a single thread only. It is the heart of the Node.js processing model.
The Event Loop checks whether any client request is in the Event Queue. If not, it waits for incoming requests indefinitely.
If yes, it picks up one client request from the Event Queue
and starts processing that client request.
If that client request does not require any blocking I/O operations, it processes everything, prepares the response, and sends it back to the client.
If that client request requires blocking I/O operations, such as interacting with a database, the file system, or external services, it follows a different approach:
It checks thread availability in the internal thread pool.
It picks up one thread and assigns the client request to that thread.
That thread is responsible for taking the request, processing it, performing the blocking I/O operations, preparing the response, and sending it back to the Event Loop.
(Strictly speaking, network I/O such as an outbound HTTP call does not occupy a pool thread: libuv handles sockets with non-blocking OS calls and reserves the thread pool for work such as file system access and DNS lookups.)
You can check here for more details (very well explained).
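The timer behaviour described above can be seen without any HTTP at all. This is a minimal sketch in which handle is a made-up stand-in for a route handler and the delays are arbitrary:

```javascript
// Simulate a request handler that waits on a timer, like the route in the question.
// `name` and `delayMs` are illustrative parameters.
const events = []
async function handle(name, delayMs) {
  events.push('got ' + name)
  await new Promise(resolve => setTimeout(resolve, delayMs))
  events.push('done ' + name)
}

// All three handlers start before any timer fires; they finish in timer order.
const finished = Promise.all([handle('A', 60), handle('B', 20), handle('C', 40)])
  .then(() => events)
```

Each handler gives control back to the event loop at its await, so the next one can start immediately, and the 'done' entries then follow the order in which the timers expire (B, C, A) rather than the order in which the handlers were started.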

Slow response time on DialogFlow fulfillment HTTP requests

I am developing an app for Google Assistant on DialogFlow.
For a certain intent I have a fulfillment that has to make an HTTP request.
The code is like this:
const syncrequest = require('sync-request');
console.log('Request start');
var res = syncrequest('GET', urlRequest, {
json: {},
});
console.log('Request end');
When I test the URL I'm using, it takes approximately 0.103 seconds to respond.
But the Firebase log looks like this:
3:01:58.555 PM dialogflowFirebaseFulfillment Request end
3:01:56.585 PM dialogflowFirebaseFulfillment Request start
Even though my server responds in 0.103 seconds, the request takes 2 seconds to be processed.
Sometimes it takes more than 4 seconds and makes my app crash.
Does anyone have any idea why it is taking so long? Is there something I can do to make the request faster?
Thanks in advance

I haven't looked too hard at the sync-request package, but I do see this big warning on the npm page for it:
You should not be using this in a production application. In a node.js
application you will find that you are completely unable to scale your
server. In a client application you will find that sync-request causes
the app to hang/freeze. Synchronous web requests are the number one
cause of browser crashes. For production apps, you should use
then-request, which is exactly the same except that it is
asynchronous.
Based on this, and some other information on the page, it sounds like this package is very poor on performance, and may handle the synchronous operations grossly inefficiently.
You may wish to switch to the then-request package, as it suggests, however the most common way to handle HTTP calls is using request-promise-native, where you'd do something like:
const rp = require('request-promise-native');
return rp.get(url)
.then( body => {
// Set the Dialogflow response here
// You didn't really show this in your code.
});
If you are doing asynchronous tasks - you must return a promise from your intent handler.
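To make the "return a promise" point concrete, here is a small sketch. It assumes a handler that receives an object with an add method for setting the reply (the name agent.add is an assumption here, not something shown in the question), and fetchBody is a placeholder for rp.get or any other promise-returning HTTP call:

```javascript
// makeHandler builds an intent handler around any promise-returning fetch function.
// `fetchBody` stands in for rp.get(url) and `agent.add` for however the
// response is set; both are assumptions, not part of the original code.
function makeHandler(fetchBody, url) {
  return function intentHandler(agent) {
    // Returning the promise lets the caller wait for the async work to finish.
    return fetchBody(url).then(body => {
      agent.add('The server replied: ' + body)
    })
  }
}
```

If the return were dropped, the handler would finish before the HTTP call does, and the reply would be set too late to be sent.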

Throttling event-driven Nodejs HTTP requests

I have a Node net.Server that listens to a legacy system on a TCP socket. When a message is received, it sends an http request to another http server. Simplified, it looks like this:
var request = require('request-promise');
...
socket.on('readable', function () {
  var msg = parse(socket.read());
  var postOptions = {
    uri: 'http://example.com/go',
    method: 'POST',
    json: msg,
    headers: {
      'Content-Type': 'application/json'
    }
  };
  request(postOptions);
})
The problem is that the socket is readable about 1000 times per second. The requests then overload the http server. Almost immediately, we get multiple-second response times.
In running Apache benchmark, it's clear that the http server can handle well over 1000 requests per second in under 100ms response time - if we limit the number of concurrent requests to about 100.
So my question is, what is the best way to limit the concurrent requests outstanding using the request-promise (by extension, request, and core.http.request) library when each request is fired separately within an event callback?
Request's documentation says:
Note that if you are sending multiple requests in a loop and creating multiple new pool objects, maxSockets will not work as intended. To work around this, either use request.defaults with your pool options or create the pool object with the maxSockets property outside of the loop.
I'm pretty sure that this paragraph is telling me the answer to my problem, but I can't make sense of it. I've tried using defaults to limit the number of open sockets:
var rp = require('request-promise');
var request = rp.defaults({pool: {maxSockets: 50}});
Which doesn't help. My only thought at the moment is to manually manage a queue, but I expect that would be unnecessary if I only knew the conventional way to do it.

Well, you need to throttle your requests, right? I have worked around this in two ways, but let me show you one pattern I always use. I often use throttle-exec and Promise to make a wrapper for request. You can install it with npm install throttle-exec and use Promise natively or from a third-party library. Here is my gist for this wrapper: https://gist.github.com/ans-4175/d7faec67dc6374803bbc
How do you use it? It's simple, just like an ordinary request.
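var Request = require("./Request")
Request({
  url: url_endpoint,
  json: param,
  method: 'POST'
})
  .then(function (result) {
    console.log(result)
  })
  .catch(function (err) {
    // Log the failure; `reject` is not defined outside a Promise executor
    console.error(err)
  })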
Let me know once you have implemented it. Either way, I have another wrapper :)
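If you would rather not pull in a package, the manual queue mentioned in the question is only a few lines. This is a minimal sketch, not the throttle-exec implementation; createLimiter and schedule are made-up names:

```javascript
// Returns a function that schedules tasks so that at most `limit` run at once.
// A task is any function returning a promise, e.g. () => request(postOptions).
function createLimiter(limit) {
  let active = 0
  const waiting = []

  function next() {
    if (active >= limit || waiting.length === 0) return
    active++
    const { task, resolve, reject } = waiting.shift()
    // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--
        next()
      })
  }

  return function schedule(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject })
      next()
    })
  }
}
```

In the socket handler you would create the limiter once, for example var schedule = createLimiter(100), and then call schedule(function () { return request(postOptions) }) for each message. Note that the waiting queue here is unbounded, so if messages keep arriving faster than the HTTP server can answer, the queue will keep growing.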

node, is each request and response unique or cached irrespective of url

In an app I was working on, I encountered a "headers sent already" error when testing with concurrent and parallel request methods.
Ultimately I resolved the problem using !response.headersSent, but my question is: why am I forced to use it? Is Node caching similar requests and reusing them for the next repeated call?
if (request.headers.accept == "application/json") {
  if (!response.headersSent) { response.writeHead(200, {'Content-Type': 'application/json'}) }
  response.end(JSON.stringify({result: {authToken: data.authToken}}));
}
Edit
var express = require('express');
var app = express();
var server = app.listen(process.env.PORT || 3000, function () {
console.log('Example app listening at http://%s:%s', server.address().address, server.address().port);
});
Edit 2:
Another problem occurs while testing with mocha and superagent: if I send another request through Postman on the side while the tests are in progress, one of the mocha tests ends with a timeout error. I'm taking these steps to ensure the code is production-ready for simultaneous, parallel requests. Please advise on what measures I can take to ensure the Node code works under stress.
Edit 3:
app.use(function(request, response, next){
request.id = Math.random();
next();
});

OK, in an attempt to capture what solved this for you via all our conversation in comments, I will attempt to summarize here:
The message "headers sent already error" is nearly always caused by improper async handling, which causes the code to call methods on the response object in the wrong sequence. The most common case is non-async code that ends the request, followed by an async operation that finishes some time later and then tries to use the response (but there are other ways to misuse it too).
Each request and response object is uniquely created at the time each individual HTTP request arrives at the node/express server. They are not cached or reused.
Because of asynchronous operations in the processing of a request, there may be more than one request/response object in use at any given time. Code that processes these must not store them in any sort of single global variable, because several can be in the middle of processing at once. Because node is single threaded, code will only be running for one request at any given moment, but as soon as that code hits an async operation (and thus has nothing to do until the async operation is done), another request could start running. So multiple requests can easily be "in flight" at the same time.
If you have a system where you need to keep track of multiple requests at once, you can coin a request id and attach it to each new request. One way to do that is with a few lines of express middleware that is early in the middleware stack that just adds a unique id property to each new request.
One simple way of coining a unique id is to just use a monotonically increasing counter.
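A counter-based version of that middleware might look like the sketch below. The names nextRequestId and assignRequestId are made up; the function only relies on the usual (request, response, next) middleware signature:

```javascript
// Express-style middleware that tags each request with a unique, increasing id.
let nextRequestId = 0
function assignRequestId(request, response, next) {
  nextRequestId += 1
  request.id = nextRequestId
  next()
}
```

Registered early with app.use(assignRequestId), it gives every later handler a request.id to put in log lines. The counter lives in one process, so ids start over after a restart and are only unique per worker if you run several.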

“Proxying” a lot of HTTP requests with Node.js + Express 2

I'm writing a proxy in Node.js + Express 2. The proxy should:
decrypt POST payload and issue HTTP request to server based on result;
encrypt reply from server and send it back to client.
The encryption-related part works fine. The problem I'm facing is timeouts. The proxy should process requests in less than 15 seconds, and most of them actually take under 500 ms.
The problem appears when I increase the number of parallel requests. Most requests complete fine, but some fail after 15 seconds plus a couple of milliseconds. ab -n5000 -c300 works fine, but with a concurrency of 500 some requests fail with a timeout.
I can only speculate, but it seems that the problem is the order of callback execution. Is it possible that the requests that came first hang until ETIMEDOUT because node focuses on the latest ones, which are still being processed in under 500 ms?
P.S.: There is no problem with the remote server. I'm using request for interactions with it.
upd
How things work, with some code:
function queryRemote(req, res) {
  var options = {}; // built based on req object (URI, body, authorization, etc.)
  request(options, function(err, httpResponse, body) {
    return err ? send500(req, res)
               : res.end(encrypt(body));
  });
}
app.use(myBodyParser); // reads hex string in payload
                       // and calls next() on 'end' event
app.post('/', [checkHeaders,  // check Content-Type and Authorization headers
               authUser,      // query DB and call next()
               parseRequest], // decrypt payload, parse JSON, call next()
  function(req, res) {
    req.socket.setTimeout(TIMEOUT);
    queryRemote(req, res);
  });
My problem is the following: when ab issues, let's say, 20 POSTs to /, the express route handler gets called thousands of times. That doesn't always happen; sometimes 20 and only 20 requests are processed in a timely fashion.
Of course, ab is not the problem. I'm 100% sure that only 20 requests were sent by ab, but the route handler gets called multiple times.
I can't find the reason for such behaviour. Any advice?

Timeouts were caused by using http.globalAgent, which by default can process up to 5 concurrent requests to one host:port (which isn't enough in my case).
Thousands of requests (instead of tens) really were sent by ab (a fact confirmed with Wireshark under OS X; I cannot reproduce this under Ubuntu inside Parallels).
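For anyone hitting the same limit, one possible fix is a dedicated agent with a higher socket cap instead of the global one. This is only a sketch: upstreamAgent and upstreamOptions are made-up names, the host, port and limit are placeholders, and as far as I know newer Node.js releases no longer cap the default agent at 5 sockets.

```javascript
const http = require('http')

// A dedicated agent for the upstream server. The limit of 100 is an
// illustrative value, not a recommendation.
const upstreamAgent = new http.Agent({ keepAlive: true, maxSockets: 100 })

// Build options for http.request(); `host`, `port` and `path` are placeholders.
function upstreamOptions(path) {
  return { host: 'example.com', port: 80, path, method: 'POST', agent: upstreamAgent }
}
```

With the core module this would be used as http.request(upstreamOptions('/go'), callback). Creating the agent once, outside any request handler, matters: a new agent per request means a new pool per request, and the limit never takes effect.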

You can have a look at the node-http-proxy module and how it handles connections. Make sure you don't buffer any data and that everything works by streaming. You should also try to see where the time is spent for those long requests. Try instrumenting parts of your code with console.time and console.timeEnd to see what is taking the most time. If the time is mostly spent in JavaScript, you should try to profile it. Basically, you can use the V8 profiler by adding the --prof option to your node command. This produces a v8.log, which can be processed with a V8 tool found in node-source-dir/deps/v8/tools. It only works if you have installed the d8 shell via scons (scons d8). You can have a look at this article to help you get this working.
You can also use node-webkit-agent, which uses the WebKit developer tools to show the profiler results. You can also have a look at my fork, with a bit of sugar.
If that doesn't work, you can try profiling with DTrace (only works on illumos-based systems like SmartOS).
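The console.time instrumentation can be wrapped in a tiny helper so that each stage of the proxy gets its own timing line. This is a sketch only; timed is a made-up name and step can be any function that returns a promise:

```javascript
// Wrap any promise-returning step so its duration is printed under `label`.
// `timed` is a hypothetical helper name.
async function timed(label, step) {
  console.time(label)
  try {
    return await step()
  } finally {
    // Prints the label together with the elapsed time, even if the step failed.
    console.timeEnd(label)
  }
}
```

Since console.time keeps one timer per label, concurrent requests need distinct labels, for example by including a per-request id in the label.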
