Is there any kind of limit with node for I/O? - node.js

I am writing code that downloads a file from somewhere and streams it to the client in real time. The file is never fully stored on my server, only chunks. Here is the code:
downloader.getLink(link, cookies[acc], function(err, location) {
    if (!err) {
        downloader.downloadLink(location, cookies[acc], function(err, response) {
            if (!err) {
                // Forward the upstream headers and stream the body to the client chunk by chunk.
                res.writeHead(200, response.headers);
                response.pipe(res);
            } else {
                res.end(JSON.stringify(err));
            }
        });
    } else {
        res.end(JSON.stringify(err));
    }
});
As far as I can see there is nothing blocking in this code, since the response comes from a plain http response...
The problem is that this way I can only stream 6 files at the same time, even though the server is not using all of its resources (about 10% CPU, 10% memory) and it is a single core. After roughly 5 files I only get the loading page and the stream doesn't start until some of the others have completed.
This is not a limitation on the first server, the one I am downloading the files from, because using my browser, for example, I can download as many as I want. Am I doing something wrong, or is this some limitation in node that I can change? Thanks

If your code is using the node.js core http module's http.Agent, it has a default limit of 5 simultaneous outgoing connections to the same remote server. Try reading substack's rant in the hyperquest README for the details. In short, either use a different module for your connections (I recommend superagent or hyperquest), or adjust the maxSockets setting of the core http module's Agent.
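For the second option, a minimal sketch of raising the agent's limit (the value 100 is just an example; the default of 5 applies to node 0.10-era versions):

var http = require('http');

// Raise the per-host connection cap on the shared global agent.
http.globalAgent.maxSockets = 100;

// Or give particular requests their own agent with a higher limit:
var agent = new http.Agent();
agent.maxSockets = 100;

http.get({ host: 'example.com', path: '/file', agent: agent }, function(res) {
    res.pipe(process.stdout); // stream the response as before
});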

Related

Node.js http server: "getifaddres: Too many open files"

I'm currently running a nodejs server, and using GazeboJS to connect to the Gazebo server in order to send messages.
The problem is:
From my searches it seems like it's due to the Linux open file limit, which defaults to 1024 (I'm using Ubuntu 14.04). Most solutions seem to be to increase the open file limit.
However, I don't know why my script is opening files and not closing them. It seems like each HTTP request opens a connection which is not closed, even though a response is sent? The HTTP requests come from a Lua script using async.
The error
getifaddres: Too many open files
occurs after exactly 1024 requests.
I have no experience with web servers, so I hope someone can give an explanation.
The details of the nodejs server I'm running:
The server is created using
http.createServer(function(req, res))
When an HTTP GET request is received, the response is sent as a string. Example of one response:
gazebo.subscribe(obj.states[select].type, obj.states[select].topic, function(err, msg) {
    // msg is a JSON object
    if (err) {
        console.log('Error: ' + err);
        return;
    }
    res.setHeader('Content-Type', 'text/html');
    res.end(JSON.stringify(msg));
    gazebo.unsubscribe(obj.states[select].topic);
})
The script makes use of the publish/subscribe topics in the Gazebo server to extract information or publish actions. More information about Gazebo communication is here.

Streaming large files causing server to hang

I have a feature in my web app that allows users to upload and download files. I serve up the app with Express, but the files are stored in a different server, so I proxy the requests to that server. Here's the proxy code using the request library:
module.exports = function(req, res) {
    req.headers['x-private-id'] = getId();
    var url = rewriteUrl(req.url);
    var newRequest = request(url, function(error) {
        if (error) console.log(error);
    });
    req.pipe(newRequest).on('response', function(res) {
        delete res.headers['x-private-id'];
    }).pipe(res);
};
This works fine for all of my requests, including downloading the file. However, I run into issues when 'streaming' the file. And by streaming, I mean I use fancybox to display the video using a video tag. The video displays fine the first few times.
But if I close fancybox and then reopen it enough times (5 specifically), it quits working after that; the video no longer shows up. The entire Express server seems to hang, unable to process any more requests. If I restart the server, everything is OK. To me it seems like the sockets from the proxy requests aren't being closed properly, but I can't figure out why. Is there something wrong with my proxy code?
You need to either increase the pool.maxSockets value passed in the request() config (it defaults to node's HTTP Agent maxSockets, which is 5), or opt out of connection pooling altogether with pool: false in the request() config.
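A rough sketch of both options applied to the proxy handler above; the maxSockets value of 100 is just an illustrative number:

// Option 1: a larger connection pool for the proxied requests.
var newRequest = request({
    url: url,
    pool: { maxSockets: 100 }  // raise the per-host cap above the default of 5
}, function(error) {
    if (error) console.log(error);
});

// Option 2: no pooling at all; every request gets its own socket.
var newRequest = request({
    url: url,
    pool: false
}, function(error) {
    if (error) console.log(error);
});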

Caching responses in express

I have some real trouble caching responses in express… I have one endpoint that gets a lot of requests (around 5k rpm). This endpoint fetches data from mongodb, and to speed things up I would like to cache the full json response for 1 second, so that only the first request each second hits the database while the others are served from the cache.
Abstracting out the database part of the problem, my solution looks like this: I check for a cached response in redis. If one is found, I serve it. If not, I generate it, send it and set the cache. The timeout is to simulate the database operation.
app.get('/cachedTimeout', function(req, res, next) {
    redis.get(req.originalUrl, function(err, value) {
        if (err) return next(err);
        if (value) {
            // Cache hit: serve the stored response.
            res.set('Content-Type', 'text/plain');
            res.send(value.toString());
        } else {
            // Cache miss: simulate the database call, respond, then populate the cache.
            setTimeout(function() {
                res.send('OK');
                redis.set(req.originalUrl, 'OK');
                redis.expire(req.originalUrl, 1);
            }, 100);
        }
    });
});
The problem is that this does not make only the first request every second hit the database. Instead, all requests that come in before the cache has been set (i.e. within the first 100ms) will hit the database. Under real load this really blows up, with response times around 60 seconds, because a lot of requests fall behind.
I know this could be solved with a reverse proxy like Varnish, but currently we are hosting on Heroku, which complicates such a setup.
What I would like is some sort of reverse-proxy-style cache inside of express, so that all the requests that come in after the initial request (the one that generates the cache) wait for the cache generation to finish and then use that same response.
Is this possible?
Use a proxy layer on top of your node.js application. Varnish Cache would be a good choice, working together with Nginx to serve your application.
p-throttle should do exactly what you need: https://www.npmjs.com/package/p-throttle
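Neither answer shows code; as a rough, in-process sketch of the behaviour the question describes (later requests waiting on the first cache generation), assuming a hypothetical fetchFromDb helper that returns a promise for the mongodb result:

// In-process coalescing sketch: one pending generation per URL, later hits wait on it.
// fetchFromDb() is a made-up stand-in for the real mongodb query.
var pendingByUrl = {};

app.get('/cachedTimeout', function(req, res, next) {
    redis.get(req.originalUrl, function(err, value) {
        if (err) return next(err);
        if (value) return res.send(value.toString());   // cache hit

        if (!pendingByUrl[req.originalUrl]) {
            // First miss: start generating once and remember the promise.
            pendingByUrl[req.originalUrl] = fetchFromDb(req.originalUrl).then(function(body) {
                redis.set(req.originalUrl, body);
                redis.expire(req.originalUrl, 1);
                delete pendingByUrl[req.originalUrl];
                return body;
            });
        }

        // Every request that arrives during generation reuses the same result.
        pendingByUrl[req.originalUrl].then(function(body) {
            res.send(body);
        }, next);
    });
});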

respond to each request without having to wait until current stream is finished

I'm testing streaming by creating a basic node.js app that simply streams a file to the response, using code from here and here.
But if I make a request from http://127.0.0.1:8000/, then open another browser and request another file, the second file will not start to download until the first one is finished. In my example I created a 1GB file with dd if=/dev/zero of=file.dat bs=1G count=1.
But if I request three more files while the first one is downloading, the three files will start downloading simultaneously once the first file has finished.
How can I change the code so that it will respond to each request as it's made and not have to wait for the current download to finish?
var http = require('http');
var fs = require('fs');

var i = 1;

http.createServer(function(req, res) {
    console.log('starting #' + i++);

    // This line opens the file as a readable stream
    var readStream = fs.createReadStream('file.dat', { bufferSize: 64 * 1024 });

    // This will wait until we know the readable stream is actually valid before piping
    readStream.on('open', function() {
        console.log('open');
        // This just pipes the read stream to the response object (which goes to the client)
        readStream.pipe(res);
    });

    // This catches any errors that happen while creating the readable stream (usually invalid names)
    readStream.on('error', function(err) {
        res.end(err);
    });
}).listen(8000);
console.log('Server running at http://127.0.0.1:8000/');
Your code seems fine the way it is.
I checked it with node v0.10.3 by making a few requests in multiple terminal sessions:
$ wget http://127.0.0.1:8000
Two requests ran concurrently.
I get the same result when using two different browsers (i.e. Chrome & Safari).
Further, I can get concurrent downloads in Chrome by just changing the request url slightly, as in:
http://localhost:8000/foo
and
http://localhost:8000/bar
The behavior you describe seems to manifest when making multiple requests from the same browser for the same url.
This may be a browser limitation - it looks like the second request isn't even made until the first is completed or cancelled.
To answer your question, if you need multiple client downloads in a browser:
Ensure that your server code is implemented such that the file-to-url mapping is 1-to-many (i.e. using a wildcard).
Ensure that your client code (i.e. the javascript in the browser) uses a different url for each request.
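A rough sketch of those two suggestions, reusing file.dat from the question; the /dl/ prefix and the timestamped client url are illustrative choices, not anything from the original code:

var http = require('http');
var fs = require('fs');

http.createServer(function(req, res) {
    // 1-to-many mapping: any url under /dl/ serves the same file,
    // so /dl/foo and /dl/bar look like different downloads to the browser.
    if (req.url.indexOf('/dl/') === 0) {
        var readStream = fs.createReadStream('file.dat');
        readStream.on('error', function(err) {
            res.statusCode = 500;
            res.end(String(err));
        });
        readStream.pipe(res);
    } else {
        res.statusCode = 404;
        res.end();
    }
}).listen(8000);

// On the client side, vary the url per request, e.g. '/dl/' + Date.now()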

“Proxying” a lot of HTTP requests with Node.js + Express 2

I'm writing a proxy in Node.js + Express 2. The proxy should:
decrypt the POST payload and issue an HTTP request to a server based on the result;
encrypt the reply from the server and send it back to the client.
The encryption-related part works fine. The problem I'm facing is timeouts. The proxy should process requests in less than 15 seconds, and most of them are under 500ms, actually.
The problem appears when I increase the number of parallel requests. Most requests complete OK, but some fail after 15 seconds plus a couple of milliseconds. ab -n5000 -c300 works fine, but with a concurrency of 500 some requests fail with a timeout.
I can only speculate, but it seems that the problem is the order of callback execution. Is it possible that the requests that come in first hang until ETIMEDOUT because node focuses on the latest ones, which are still being processed in time, under 500ms?
P.S.: There is no problem with the remote server. I'm using request to interact with it.
Update
The way things work, with some code:
function queryRemote(req, res) {
    var options = {}; // built based on req object (URI, body, authorization, etc.)
    request(options, function(err, httpResponse, body) {
        return err ? send500(req, res)
                   : res.end(encrypt(body));
    });
}

app.use(myBodyParser); // reads hex string in payload
                       // and calls next() on 'end' event

app.post('/', [checkHeaders,   // check Content-Type and Authorization headers
               authUser,       // query DB and call next()
               parseRequest],  // decrypt payload, parse JSON, call next()
    function(req, res) {
        req.socket.setTimeout(TIMEOUT);
        queryRemote(req, res);
    });
My problem is the following: when ab issues, let's say, 20 POSTs to /, the express route handler gets called something like thousands of times. That doesn't always happen; sometimes exactly 20 requests are processed in a timely fashion.
Of course, ab is not the problem. I'm 100% sure that only 20 requests are sent by ab, but the route handler gets called multiple times.
I can't find a reason for this behaviour; any advice?
Timeouts were caused by using http.globalAgent which by default can process up to 5 concurrent requests to one host:port (which isn't enough in my case).
Thousands of requests (instead of tens) were sent by ab (a fact confirmed with Wireshark under OS X; I cannot reproduce this under Ubuntu inside Parallels).
You can have a look at the node-http-proxy module and how it handles connections. Make sure you don't buffer any data and that everything works by streaming. You should also try to see where the time is spent for those long requests. Try instrumenting parts of your code with console.time and console.timeEnd and see what takes the most time. If the time is mostly spent in javascript, you should try to profile it. Basically, you can use the v8 profiler by adding the --prof option to your node command, which produces a v8.log that can be processed with a v8 tool found in node-source-dir/deps/v8/tools. It only works if you have installed the d8 shell via scons (scons d8). You can have a look at this article to help you get this working.
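For the console.time/console.timeEnd suggestion, a minimal sketch applied to the queryRemote function from the question; the per-request label is just one way to keep concurrent timers apart:

var nextId = 0;

function queryRemote(req, res) {
    var options = {};                            // built from the req object, as above
    var label = 'remote-request #' + (++nextId); // unique label so concurrent timers don't collide
    console.time(label);
    request(options, function(err, httpResponse, body) {
        console.timeEnd(label);                  // logs "remote-request #N: <duration>"
        if (err) return send500(req, res);
        res.end(encrypt(body));
    });
}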
You can also use node-webkit-agent, which uses the webkit developer tools to show the profiler results. You can also have a look at my fork, which adds a bit of sugar.
If that doesn't work, you can try profiling with dtrace (this only works on illumos-based systems like SmartOS).
