“Proxying” a lot of HTTP requests with Node.js + Express 2

I'm writing a proxy in Node.js + Express 2. The proxy should:
- decrypt the POST payload and issue an HTTP request to a server based on the result;
- encrypt the reply from the server and send it back to the client.
The encryption-related part works fine. The problem I'm facing is timeouts. The proxy should process requests in under 15 seconds, and most of them actually finish in under 500 ms.
The problem appears when I increase the number of parallel requests. Most requests complete OK, but some fail after 15 seconds plus a couple of milliseconds. ab -n5000 -c300 works fine, but with a concurrency of 500 some requests fail with a timeout.
I can only speculate, but it seems the problem is the order of callback execution. Is it possible that the requests that come in first hang until ETIMEDOUT because Node is focused on the latest ones, which still complete in under 500 ms?
P.S.: There is no problem with the remote server. I'm using the request module to interact with it.
Update
The way things work, with some code:
function queryRemote(req, res) {
    var options = {}; // built based on req object (URI, body, authorization, etc.)
    request(options, function (err, httpResponse, body) {
        return err ? send500(req, res)
                   : res.end(encrypt(body));
    });
}
app.use(myBodyParser); // reads hex string in payload
                       // and calls next() on 'end' event

app.post('/', [checkHeaders,  // check Content-Type and Authorization headers
               authUser,      // query DB and call next()
               parseRequest], // decrypt payload, parse JSON, call next()
    function (req, res) {
        req.socket.setTimeout(TIMEOUT);
        queryRemote(req, res);
    });
My problem is the following: when ab issues, let's say, 20 POSTs to /, the Express route handler gets called something like thousands of times. That doesn't always happen; sometimes 20 and only 20 requests are processed in a timely fashion.
Of course, ab is not the problem. I'm 100% sure that only 20 requests are sent by ab. But the route handler gets called multiple times.
I can't find a reason for this behaviour. Any advice?

The timeouts were caused by using http.globalAgent, which by default can process up to 5 concurrent requests to one host:port (not enough in my case).
The thousands of requests (instead of tens) were sent by ab itself (a fact confirmed with Wireshark under OS X; I cannot reproduce this under Ubuntu inside Parallels).
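For reference, a minimal sketch of how the agent limit can be raised, either globally or per request via the request module's pool option (remoteUrl is a placeholder for the real target):

// Raise the per-host socket limit of the global agent (old Node
// versions defaulted http.globalAgent.maxSockets to 5):
var http = require('http');
http.globalAgent.maxSockets = 500;

// Or give request() its own connection pool:
var options = {
    url: remoteUrl,             // placeholder for the real target URL
    pool: { maxSockets: 500 }   // request module's per-host pool size
};
request(options, function (err, httpResponse, body) { /* ... */ });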

You can have a look at the node-http-proxy module and how it handles connections. Make sure you don't buffer any data and that everything works by streaming. You should also try to see where the time is spent for those long requests. Try instrumenting parts of your code with console.time and console.timeEnd and see what takes the most time. If the time is mostly spent in JavaScript you should try to profile it. Basically you can use the V8 profiler by adding the --prof option to your node command, which produces a v8.log that can be processed with a V8 tool found in node-source-dir/deps/v8/tools. It only works if you have installed the d8 shell via scons (scons d8). You can have a look at this article to help you get this working.
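As an illustration, a minimal timing sketch for the proxy above (the label names are arbitrary, and hexBody stands in for the raw payload from the question):

console.time('decrypt');
var payload = decrypt(hexBody);   // the question's decryption step
console.timeEnd('decrypt');

console.time('remote');
request(options, function (err, httpResponse, body) {
    console.timeEnd('remote');    // time spent waiting on the remote server
    // ...
});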
You can also use node-webkit-agent, which uses the WebKit developer tools to show the profiler results. You can also have a look at my fork, which adds a bit of sugar.
If that doesn't work, you can try profiling with dtrace (this only works on illumos-based systems like SmartOS).

Related

Concurrent outbound HTTP requests in Node.js make the response slower

I'm currently load testing one of my APIs (Node.js + Express). This API makes an HTTP request to another server. Here's some example code:
var start = new Date()
axios.get('https://google.com')
    .then(function (response) {
        var end = (new Date() - start) / 1000
        console.info('Finished in %ds', end)
    })
During the test, I found that the more concurrent HTTP requests go to the other server (in this example google.com), the slower the response becomes. I use Apache JMeter for testing.
For example, if I do 1 request in one second:
Finished in 0.150s
But if I do 100 requests in one second:
Finished in 0.320s
...
Finished in 1.190s
Finished in 2.559s
Finished in 1.230s
Finished in 5.530s
At first I thought there must be a problem with the other server, but that is not the case: even after I changed it to google.com (as in the example), the same thing happened.
The more outbound HTTP requests Node.js has to make, the slower the response becomes. I have tried to improve my API by using the Node cluster module; the workers help, but I want to improve the response time even further.
Is there anything I can do? Or perhaps an explanation of why this happens? I thought that since my API makes asynchronous HTTP requests there should be no blocking, so the response time should not increase by such a significant amount.
Thanks.
I was facing a similar issue: in my case I was awaiting each API call rather than allowing them all to run concurrently.
To do this you can push all of your async API calls into an array. For example, if you need to call a series of URLs:
const requests = []
const urls = ['http...a/get', 'http...b/get']
urls.forEach(item => {
    requests.push(axios.get(item))  // start each call without awaiting; collect the pending promises
})
Now that each of these calls is running concurrently, be sure to wait for all of them to resolve before consuming the data.
const allAPIData = await Promise.all(requests)
Just be sure to handle promise rejection in case any of the API calls fail, perhaps with a helper function that wraps axios.get(url). Otherwise any single failed API call will reject the whole Promise.all() statement.
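A minimal sketch of such a wrapper (safeGet is a hypothetical name):

function safeGet(url) {
    // convert a rejection into a resolved error object so that one
    // failed call doesn't reject the whole Promise.all
    return axios.get(url).catch(err => ({ error: err.message, url: url }))
}

const allAPIData = await Promise.all(urls.map(safeGet))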

What happens if neither res.send() nor res.end() is called in Express.js?

I have a security issue: someone is trying to call random APIs that are not supported on our server but are frequently used for administrator APIs in general. I set up the code below to handle these 404s by not responding to the attack at all.
url-not-found-handler.js
'use strict';

module.exports = function () {
    // 4XX - URLs not found
    return ((req, res, next) => {
    });
};
What happens to the client is that it waits until the server responds. But I want to know: will this affect the performance of my Express.js server, and what happens behind the scenes on the server without res.send() or res.end()?
According to the documentation of res.end():
Ends the response process. This method actually comes from Node core,
specifically the response.end() method of http.ServerResponse.
And then, from the Node documentation for response.end:
This method signals to the server that all of the response headers and
body have been sent; that server should consider this message
complete. The method, response.end(), MUST be called on each response.
If you leave the request hanging, the HTTP server will keep data about it, which means that if you leave many requests hanging, your memory will grow and degrade your server's performance.
As for the client, it will have to wait until it hits a request timeout.
The best thing to do with a bad request is to reject it immediately, which frees the memory allocated for the request.
You cannot prevent bad requests (maybe use a firewall to block requests from certain IP addresses?). The best you can do is handle them as fast as possible.
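Following that advice, a minimal sketch of the handler responding immediately instead of hanging:

'use strict';

module.exports = function () {
    // 4XX - URLs not found: end the response right away so the
    // connection and its buffers are released immediately
    return (req, res) => {
        res.status(404).end();
    };
};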

Slow response time on DialogFlow fulfillment HTTP requests

I am developing an app for Google Assistant on DialogFlow.
For a certain intent I have a fulfillment which has to make an HTTP request.
The code is like this:
const syncrequest = require('sync-request');

console.log('Request start');
var res = syncrequest('GET', urlRequest, {
    json: {},
});
console.log('Request end');
console.log('Request end');
Testing the URL that I'm using, it takes approximately 0.103 seconds to respond.
But looking at the Firebase log, it is like this:
3:01:58.555 PM dialogflowFirebaseFulfillment Request end
3:01:56.585 PM dialogflowFirebaseFulfillment Request start
Even though my server responds in 0.103 seconds, the request takes 2 seconds to be processed.
Sometimes it takes more than 4 seconds, which makes my app crash.
Does anyone have any idea why it is taking so long? Is there something I can do to make the request faster?
Thanks in advance
I haven't looked too hard at the sync-request package, but I do see this big warning on the npm page for it:
You should not be using this in a production application. In a node.js
application you will find that you are completely unable to scale your
server. In a client application you will find that sync-request causes
the app to hang/freeze. Synchronous web requests are the number one
cause of browser crashes. For production apps, you should use
then-request, which is exactly the same except that it is
asynchronous.
Based on this, and some other information on the page, it sounds like this package performs very poorly and may handle synchronous operations grossly inefficiently.
You may wish to switch to the then-request package, as it suggests; however, the most common way to handle HTTP calls is with request-promise-native, where you'd do something like:
const rp = require('request-promise-native');

return rp.get(url)
    .then(body => {
        // Set the Dialogflow response here
        // You didn't really show this in your code.
    });
If you are doing asynchronous tasks, you must return a promise from your intent handler.
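For illustration, a minimal sketch of an intent handler that returns the promise, assuming the dialogflow-fulfillment library's WebhookClient API (the handler name is hypothetical; urlRequest comes from the question):

const rp = require('request-promise-native');

function myIntentHandler(agent) {
    // returning the promise makes the runtime wait for the HTTP call
    return rp.get(urlRequest).then(body => {
        agent.add('Result: ' + body);  // send the reply to the user
    });
}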

Node.js + Express + MongoDB: handling two HTTP GET requests fired almost at the same time

I have a Node.js + Express + MongoDB app where:
app.get('/test/:testString', function (req, res) {
    console.log('req.params.testString: %s ', req.params.testString);
    // ... insert into a collection ...
});
From my client side, I fired two HTTP GETs at almost the same time. I thought there should be two console.log printouts for these two requests. However, strangely, only the second request's log shows up and only the second request goes through the collection insertion. On my server-side terminal, I can see that two HTTP GET requests arrived. Any idea?
Regards
Hammer
More: @CFrei @Peter
I think I know the cause. I have http://www.google.com in my testString. If I remove the '//' and '/', then I can see the log printed out. I have already used NSUTF8StringEncoding in my native code firing the HTTP GET. Does any special handling need to be added on the Express side?
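That follow-up is consistent with the unescaped slashes being parsed as path separators, so the URL no longer matches the /test/:testString route. A minimal sketch of percent-encoding the parameter on the client side:

// encode the value so its slashes don't act as path separators
var testString = encodeURIComponent('http://www.google.com');
// -> 'http%3A%2F%2Fwww.google.com'
// GET /test/http%3A%2F%2Fwww.google.com now matches the route, and
// Express decodes req.params.testString back to 'http://www.google.com'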
If the client is a browser, my first guess would be browser connection management interfering. Try using curl on the command line instead to test.

Caching responses in Express

I'm having some real trouble caching responses in Express… I have one endpoint that gets a lot of requests (around 5k rpm). This endpoint fetches data from MongoDB, and to speed things up I would like to cache the full JSON response for 1 second so that only the first request each second hits the database while the others are served from a cache.
When abstracting out the database part of the problem, my solution looks like this: I check for a cached response in Redis. If one is found, I serve it. If not, I generate it, send it and set the cache. The timeout is to simulate the database operation.
app.get('/cachedTimeout', function (req, res, next) {
    redis.get(req.originalUrl, function (err, value) {
        if (err) return next(err);
        if (value) {
            res.set('Content-Type', 'text/plain');
            res.send(value.toString());
        } else {
            setTimeout(function () {
                res.send('OK');
                redis.set(req.originalUrl, 'OK');
                redis.expire(req.originalUrl, 1);
            }, 100);
        }
    });
});
The problem is that this does not make only the first request each second hit the database. Instead, all requests that come in before we have had time to set the cache (within the first 100 ms) hit the database. Under real load this really blows up, with response times around 60 seconds, because a lot of requests queue up behind.
I know this could be solved with a reverse proxy like Varnish, but currently we are hosting on Heroku, which complicates such a setup.
What I would like is some sort of reverse-proxy cache inside of Express. I would like all the requests that come in after the initial request (the one that generates the cache) to wait for the cache generation to finish and then use that same response.
Is this possible?
Use a proxy layer on top of your Node.js application. Varnish Cache would be a good choice, working with Nginx to serve your application.
p-throttle should do exactly what you need: https://www.npmjs.com/package/p-throttle
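For the in-process variant the question asks about, here is a minimal sketch of request coalescing (the names pending, getCached and fetchFromMongo are all illustrative): the first cache miss triggers the expensive work, and requests arriving while it is in flight await the same promise instead of hitting the database.

const pending = new Map();

function getCached(key, produce) {
    if (pending.has(key)) return pending.get(key);  // join the in-flight request
    const p = produce();
    pending.set(key, p);
    // keep the resolved value for ~1 second, then let it expire
    p.then(() => setTimeout(() => pending.delete(key), 1000))
     .catch(() => pending.delete(key));             // don't cache failures
    return p;
}

app.get('/cachedTimeout', function (req, res, next) {
    getCached(req.originalUrl, fetchFromMongo)      // fetchFromMongo is hypothetical
        .then(body => res.send(body))
        .catch(next);
});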
