Handling multiple parallel HTTP requests in Node.js

I know that Node is non-blocking, but I just realized that the default behaviour of http.listen(8000) means that all HTTP requests are handled one-at-a-time. I know I shouldn't have been surprised at this (it's how ports work), but it does make me seriously wonder how to write my code so that I can handle multiple, parallel HTTP requests.
So what's the best way to write a server so that it doesn't hog port 80 and long-running responses don't result in long request queues?
To illustrate the problem, try running the code below and loading it up in two browser tabs at the same time.
var http = require('http');
http.createServer(function (req, res) {
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.write("<p>" + new Date().toString() + ": starting response");
  setTimeout(function () {
    res.write("<p>" + new Date().toString() + ": completing response and closing connection</p>");
    res.end();
  }, 4000);
}).listen(8080);

You are misunderstanding how node works. The above code can accept TCP connections from hundreds or thousands of clients, read the HTTP requests, and then wait the 4000 ms timeout you have baked in there, and then send the responses. Each client will get a response in about 4000 + a small number of milliseconds. During that setTimeout (and during any I/O operation) node can continue processing. This includes accepting additional TCP connections. I tested your code and the browsers each get a response in 4s. The second one does NOT take 8s, if that is how you think it works.
I ran curl -s localhost:8080 in 4 terminal tabs as quickly as I could via the keyboard, and the seconds in the timestamps are:
54 to 58
54 to 58
55 to 59
56 to 00
There's no issue here, although I can understand how you might think there is one. Node would be totally broken if it worked as your post suggested.
Here's another way to verify:
for i in 1 2 3 4 5 6 7 8 9 10; do curl -s localhost:8080 & done

Your code can accept multiple connections because the work is done in the callback function of the setTimeout call.
But if, instead of setTimeout, you do a heavy CPU-bound job, then it is true that node.js will not accept other connections while it runs! setTimeout hands control back to the event loop, so node.js can accept other connections in the meantime, and your callback is executed later, still on the same single thread.
I don't know which is the correct way to implement this, but this is how it seems to work.
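For example, a handler that busy-waits instead of using setTimeout serializes all clients (a minimal sketch to illustrate; the 4-second spin is arbitrary):
var http = require('http');

http.createServer(function (req, res) {
  // CPU-bound busy-wait: the event loop is stuck here, so no other
  // request can be accepted or answered until the loop finishes.
  var end = Date.now() + 4000;
  while (Date.now() < end) { /* spin */ }
  res.end('each client waits for all previous clients\n');
}).listen(8080);
With this version, two parallel requests really do take about 4 and 8 seconds respectively.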

The browser blocks other identical requests: if you load the same URL in two tabs, the second request may not be sent until the first completes. If you call it from different browsers, the requests are handled in parallel.
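To rule the browser out, you can also fire the requests from Node itself (a sketch; it assumes the server from the first question is listening on localhost:8080):
var http = require('http');

// Fire two requests at once; the unique query strings also prevent
// any cache-based coalescing along the way.
[1, 2].forEach(function (n) {
  http.get('http://localhost:8080/?req=' + n, function (res) {
    res.resume(); // drain the response body
    res.on('end', function () {
      console.log('request ' + n + ' finished at ' + new Date().toTimeString());
    });
  });
});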

I used the following code to test request handling:
app.get('/', function(req, res) {
  console.log('time', MOMENT());
  setTimeout(function() {
    console.log(data, ' ', MOMENT());
    res.send(data);
    data = 'changing';
  }, 50000);
  var data = 'change first';
  console.log(data);
});
Since this request doesn't take much processing time apart from the 50-second setTimeout, all the timeouts were processed together, as usual.
Response to 3 requests sent together:
time moment("2017-05-22T16:47:28.893")
change first
time moment("2017-05-22T16:47:30.981")
change first
time moment("2017-05-22T16:47:33.463")
change first
change first moment("2017-05-22T16:48:18.923")
change first moment("2017-05-22T16:48:20.988")
change first moment("2017-05-22T16:48:23.466")
After this I moved to the second phase... i.e., what if my request takes a long time to process, such as a synchronous file operation or something else time-consuming?
app.get('/second', function(req, res) {
  console.log(data);
  if (req.headers.data === '9') {
    res.status(200);
    res.send('response from api');
  } else {
    console.log(MOMENT());
    // synchronous busy loop: this blocks the event loop until it finishes
    for (i = 0; i < 9999999999; i++) {}
    console.log('Second MOMENT', MOMENT());
    res.status(400);
    res.send('wrong data');
  }
  var data = 'second test';
});
As my first request was still being processed, my second one wasn't handled by Node until the first finished. Thus I got the following responses for the 2 requests:
undefined
moment("2017-05-22T17:43:59.159")
Second MOMENT moment("2017-05-22T17:44:40.609")
undefined
moment("2017-05-22T17:44:40.614")
Second MOMENT moment("2017-05-22T17:45:24.643")
Thus, for all async operations (like fs, mysql, or calling an API) Node hands the work off and can accept other requests before the previous request's async work completes. However, it keeps itself single-threaded, so while a request is doing heavy synchronous work, Node does not process other requests until that work is finished.
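If a handler genuinely needs heavy CPU work, one common workaround (a sketch, not part of the test above; the route name, slice size, and iteration count are made up) is to break the loop into slices with setImmediate, so the event loop can serve other requests between slices:
// Break a long loop into slices; between slices the event loop
// can accept and serve other requests.
function heavyJob(iterations, done) {
  var i = 0;
  function slice() {
    var end = Math.min(i + 1e6, iterations);
    for (; i < end; i++) { /* CPU-bound work */ }
    if (i < iterations) {
      setImmediate(slice); // yield to the event loop, then continue
    } else {
      done();
    }
  }
  slice();
}

// Same Express app as in the tests above.
app.get('/second-sliced', function (req, res) {
  heavyJob(1e8, function () {
    res.status(400);
    res.send('wrong data');
  });
});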

Related

improve http request response time

I've created a nodejs script that makes an HTTP request every 50ms, but it takes longer and longer to receive responses as the number of outstanding requests grows.
How can I improve the response time?
function makeRequest() {
  superagent
    .post('http://example.com')
    .send({ "test": "test" })
    .set('Connection', 'keep-alive')
    .then(console.log, console.log);
}

setInterval(() => makeRequest(), 50);
This is troublesome code. If your http request takes longer than 50ms to complete, then the number of active requests in flight will get larger and larger until eventually, you will consume too many system resources (sockets, memory, etc...). Things may get slower and slower or you may actually exhaust some resource and start to get errors or crash.
In addition, you don't want to be hitting the target server with thousands of simultaneous requests as it may also slow down under that type of load. This type of issue can also lead to an avalanche failure where a slight delay in the responsiveness of the response causes sudden build-up of requests which slows down the target server which leads to more build-up which quickly gets out of control and something dies. It's important to always code these types of things to avoid any sort of avalanche failure.
What I would suggest is making a new request some fixed number of ms from completion of the previous request (so there is only one request at a time in flight). Or a more complicated version would make a new request 50ms from when the previous one started, but not before the previous one finishes. This way, you'd only ever have one request in flight at a time and they would never build-up and accumulate and resource usage should stay fairly constant, not building over time, even if the target server gets slow for some reason.
Here's a way to make the next request after the completion of the previous request and no more often than once every 50ms:
function makeRequest() {
  return superagent
    .post('http://example.com')
    .send({ "test": "test" })
    .set('Connection', 'keep-alive');
}

function delay(t) {
  return new Promise(resolve => {
    setTimeout(resolve, t);
  });
}

function run() {
  const repeatTime = 50;
  const startTime = Date.now();
  return makeRequest().catch(err => {
    console.log(err);
    // decide here if you want to keep going or not
    // if so, then just return
    // if not, then throw
  }).then(result => {
    console.log(result);
    let delta = Date.now() - startTime;
    if (delta < repeatTime) {
      // wait until at least repeatTime has passed before starting next request
      return delay(repeatTime - delta).then(run);
    } else {
      return run();
    }
  }).catch(() => {
    // aborted because of error
  });
}

run();
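For reference, on newer Node versions the same pacing logic can be written with async/await (a sketch reusing the makeRequest() and delay() helpers above):
async function run() {
  const repeatTime = 50;
  while (true) {
    const startTime = Date.now();
    try {
      const result = await makeRequest();
      console.log(result);
    } catch (err) {
      console.log(err);
      // decide here whether to keep going; break to stop
    }
    const delta = Date.now() - startTime;
    if (delta < repeatTime) {
      // wait until at least repeatTime has passed before the next request
      await delay(repeatTime - delta);
    }
  }
}

run();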

Understanding The NodeJS Internal execution

I'm trying to understand what happens under the hood
if I execute this NodeJS code:
http.createServer(function (request, response) {
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello World\n');
}).listen(8081);
I have 2 cases for the above code:
1. Modify the code to do some blocking at the end of
the http.createServer callback function:
http.createServer(function (request, response) {
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello World\n');
  sleep(2000); // sleep 2 seconds after handling the first request
}).listen(8081);

// found this code on the web, to simulate php like sleep function
function sleep(milliseconds) {
  var start = new Date().getTime();
  for (var i = 0; i < 1e7; i++) {
    if ((new Date().getTime() - start) > milliseconds) {
      break;
    }
  }
}
I use this simple bash loop to make two requests to the NodeJS server:
$ for i in {1..2}; do curl http://localhost:8081; done
Result on the client console:
Hello world    # first iteration
(after two seconds, the next Hello world is printed on the client console)
Hello world    # second iteration
On the first request, the server can respond immediately.
But on the second request, the server blocks and returns the response after two seconds. This is because the sleep function blocks the event loop after the first request has been handled.
2. Modify the code to use setTimeout instead of sleep at the end of the http.createServer callback function:
http.createServer(function (request, response) {
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello World\n');
  setTimeout(function() { console.log("Done"); }, 2000);
}).listen(8081);
Again I'm using the simple bash loop to make the requests:
for i in {1..2}; do curl http://localhost:8081; done
The result is that the response is returned to both requests immediately.
Both Hello World messages are also printed immediately on the client console.
This is because I'm using the setTimeout function, which is itself asynchronous: it schedules its callback and returns right away.
I have questions about what happens here:
1. Am I right if I say: it is the programmer's responsibility to make asynchronous calls in NodeJS code so that the NodeJS internals can continue to execute other code or requests without blocking?
2. NodeJS internally uses the Google V8 engine to execute the JavaScript code, and libuv for doing the asynchronous things.
The event loop is responsible for checking whether any event associated with a callback has occurred in the event queue, and whether any code remains on the call stack; if the event queue is not empty and the call stack is empty, the callback from the event queue is pushed onto the stack, causing it to be executed.
The questions are:
A. When doing async things in NodeJS, is the execution of the callback function separated (by using the libuv thread pool) from the execution of the code on the NodeJS main thread?
B. How does the event loop handle connections if multiple connections arrive at the server at the same time?
I will highly appreciate every answer and try to learn from them.
Regarding a few of your questions:
It is the programmer's responsibility to make asynchronous calls in NodeJS code so that the NodeJS internals can continue to execute other code or requests without blocking.
Correct! Notice that it is also possible (if required) to execute synchronous blocking code. As an example, see all the 'Sync' functions of the fs module, like fs.accessSync.
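For example, the same file read can be done either way (a quick sketch; the file path is arbitrary):
var fs = require('fs');

// Non-blocking: readFile hands the work off and returns immediately;
// the callback fires later, and node is free to do other things meanwhile.
fs.readFile('/etc/hosts', 'utf8', function (err, data) {
  if (err) throw err;
  console.log('async read finished: ' + data.length + ' chars');
});

// Blocking: nothing else runs in this process until the read completes.
var data = fs.readFileSync('/etc/hosts', 'utf8');
console.log('sync read finished: ' + data.length + ' chars');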
When doing async things in NodeJS, is the execution of the callback function separated (by using the libuv thread pool) from the execution of the code on the NodeJS main thread?
Node.js runs your JavaScript on a single thread, so there is no separate thread for callbacks: when triggered, the callback function is the only JavaScript code being executed. The asynchronous design of node.js is accomplished by the 'Event Loop', as you mentioned.
How does the event loop handle connections if multiple connections arrive at the server at the same time?
There is no 'same time', really: one comes first, and the rest are queued. Assuming you have no blocking code, they should be handled quickly (you can and should load test your server and see exactly how quickly).
First of all, I don't know what sleep does.
Basically, the event loop keeps a check on what resources are free and what the queued events, if any, need. When you call setTimeout, it executes console.log("Done") after 2 seconds. Did you program it to stop the overall execution of the function? No. You only asked that particular request to do something after sending down the response. You did not ask it to stop the function execution or block other events. You can read more about threads here. The program is asynchronous by itself.
Now if you want to make it synchronous, you need your own event loop. You can move all of the actions inside setTimeout:
setTimeout(function() {
  response.writeHead(200, {'Content-Type': 'text/plain'});
  response.end('Hello World\n');
  console.log("Done");
}, 2000);
Does this deny other requests or stop them from executing? No. If you fire 2 requests simultaneously, you will get 2 responses simultaneously after 2 seconds.
Let us go deeper and control the requests more. Let there be two global variables, counter = 0 and current_counter = 0; they reside outside http.createServer. Once a request comes in, we assign it a counter and execute it. Then we wait for 2 seconds, increment current_counter, and execute the next request.
counter = 0;
current_counter = 0;
http.createServer(function (request, response) {
  var my_count = counter; // my_count is specific to each request, not shared
  counter += 1;
  while (current_counter <= my_count) {
    if (current_counter == my_count) {
      setTimeout(function() {
        response.writeHead(200, {'Content-Type': 'text/plain'});
        response.end('Hello World\n');
        console.log("Done");
        return current_counter += 1;
      }, 2000);
    }
  }
}).listen(8081);
Try to understand what I did: I made my own event loop in the form of a while loop. It waits for the condition that current_counter equals my_count. Imagine 3 requests come in within less than 2 seconds. Remember, we increment current_counter only after 2 seconds.
Requests which came in less than 2 seconds:
A - current_counter = 0, my_count = 0 -> in execution, will send its response after 2 seconds.
B - current_counter = 0 (it increments only after 2 seconds), my_count = 1 -> stuck in the while loop condition, waiting for current_counter to equal 1 before executing.
C - current_counter = 0, my_count = 2 -> stuck in the while loop like the previous request.
After 2 seconds, request A is responded to by its setTimeout. The variable current_counter becomes 1, request B's local variable my_count now equals it, and B executes its setTimeout. Hence the response to request B is sent and current_counter is incremented after another 2 seconds, which leads to the execution of request C, and so on.
You can queue as many requests as you like, but each one executes only after a 2-second gap, because of my own event loop, which checks a condition that in turn depends on a setTimeout that fires only after 2 seconds.
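For what it's worth, the same one-response-every-2-seconds behaviour can also be achieved without spinning in a while loop (which keeps the CPU busy) by queuing the responses and draining the queue on a timer; this is a sketch, not part of the original answer:
var http = require('http');

var pending = []; // responses waiting for their 2-second slot

http.createServer(function (request, response) {
  pending.push(response); // queue the request instead of spinning
}).listen(8081);

// Drain exactly one queued response every 2 seconds.
setInterval(function () {
  var response = pending.shift();
  if (response) {
    response.writeHead(200, {'Content-Type': 'text/plain'});
    response.end('Hello World\n');
    console.log('Done');
  }
}, 2000);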
Thanks !

Is making sequential HTTP requests a blocking operation in node?

Note that information irrelevant to my question will be 'quoted'
like so (feel free to skip these).
Problem
I am using node to make in-order HTTP requests on behalf of multiple clients. This way, what originally took the client(s) several different page loads to get the desired result now only takes a single request via my server. I am currently using the ‘async’ module for flow control and the ‘request’ module for making the HTTP requests. There are approximately 5 callbacks which, using console.time, take about ~2 seconds from start to finish (sketch code included below).
Now I am rather inexperienced with node, but I am aware of the
single-threaded nature of node. While I have read many times that node
isn’t built for CPU-bound tasks, I didn’t really understand what that
meant until now. If I have a correct understanding of what’s going on,
this means that what I currently have (in development) is in no way
going to scale to even more than 10 clients.
Question
Since I am not an expert at node, I ask this question (in the title) to get a confirmation that making several sequential HTTP requests is indeed blocking.
Epilogue
If that is the case, I expect I will ask a different SO question (after doing the appropriate research) discussing various possible solutions, should I choose to continue approaching this problem in node (which itself may not be suitable for what I'm trying to do).
Other closing thoughts
I am truly sorry if this question was not detailed enough, too noobish, or had particularly flowery language (I try to be concise).
Thanks and all the upvotes to anyone who can help me with my problem!
The code I mentioned earlier:
var async = require('async');
var request = require('request');
...
async.waterfall([
  function(cb) {
    console.time('1');
    request(someUrl1, function(err, res, body) {
      // load and parse the given web page.
      // make a callback with data parsed from the web page
    });
  },
  function(someParameters, cb) {
    console.timeEnd('1');
    console.time('2');
    request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
      // more computation
      // make a callback with a session cookie given by the visited url
    });
  },
  function(jar, cb) {
    console.timeEnd('2');
    console.time('3');
    request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
      // do more parsing + computation
      // make another callback with the results
    });
  },
  function(moreParameters, cb) {
    console.timeEnd('3');
    console.time('4');
    request({url: someUrl4, method: 'POST', jar: jar, form: {/* data */}}, function(err, res, body) {
      // make final callback after some more computation.
      // This part takes about ~1s to complete
    });
  }
], function (err, result) {
  console.timeEnd('4');
  res.status(200).send();
});
Normally, I/O in node.js are non-blocking. You can test this out by making several requests simultaneously to your server. For example, if each request takes 1 second to process, a blocking server would take 2 seconds to process 2 simultaneous requests but a non-blocking server would take just a bit more than 1 second to process both requests.
However, you can deliberately make requests blocking by using the sync-request module instead of request. Obviously, that's not recommended for servers.
Here's a bit of code to demonstrate the difference between blocking and non-blocking I/O:
var req = require('request');
var sync = require('sync-request');

// Load example.com N times (yes, it's a real website):
var N = 10;

console.log('BLOCKING test ==========');
var start = new Date().valueOf();
for (var i = 0; i < N; i++) {
  var res = sync('GET', 'http://www.example.com');
  console.log('Downloaded ' + res.getBody().length + ' bytes');
}
var end = new Date().valueOf();
console.log('Total time: ' + (end - start) + 'ms');

console.log('NON-BLOCKING test ======');
var loaded = 0;
var start = new Date().valueOf();
for (var i = 0; i < N; i++) {
  req('http://www.example.com', function(err, response, body) {
    loaded++;
    console.log('Downloaded ' + body.length + ' bytes');
    if (loaded == N) {
      var end = new Date().valueOf();
      console.log('Total time: ' + (end - start) + 'ms');
    }
  });
}
Running the code above, you'll see the non-blocking test takes roughly the same amount of time to process all requests as it does for a single request (for example, if you set N = 10, the non-blocking code executes roughly 10 times faster than the blocking code). This clearly illustrates that the requests are non-blocking.
Additional answer:
You also mentioned that you're worried about your process being CPU intensive. But in your code, you're not benchmarking CPU utility. You're mixing both network request time (I/O, which we know is non-blocking) and CPU process time. To measure how much time the request is in blocking mode, change your code to this:
async.waterfall([
  function(cb) {
    request(someUrl1, function(err, res, body) {
      console.time('1');
      // load and parse the given web page.
      console.timeEnd('1');
      // make a callback with data parsed from the web page
    });
  },
  function(someParameters, cb) {
    request({url: someUrl2, method: 'POST', form: {/* data */}}, function(err, res, body) {
      console.time('2');
      // more computation
      console.timeEnd('2');
      // make a callback with a session cookie given by the visited url
    });
  },
  function(jar, cb) {
    request({url: someUrl3, method: 'GET', jar: jar /* cookie from the previous callback */}, function(err, res, body) {
      console.time('3');
      // do more parsing + computation
      console.timeEnd('3');
      // make another callback with the results
    });
  },
  function(moreParameters, cb) {
    request({url: someUrl4, method: 'POST', jar: jar, form: {/* data */}}, function(err, res, body) {
      console.time('4');
      // some more computation.
      console.timeEnd('4');
      // make final callback
    });
  }
], function (err, result) {
  res.status(200).send();
});
Your code only blocks in the "more computation" parts. So you can completely ignore any time spent waiting for the other parts to execute. In fact, that's exactly how node can serve multiple requests concurrently. While waiting for the other parts to call the respective callbacks (you mention that it may take up to 1 second) node can execute other javascript code and handle other requests.
Your code is non-blocking because it uses non-blocking I/O with the request() function. This means that node.js is free to service other requests while your series of http requests is being fetched.
What async.waterfall() does is order your requests to be sequential and pass the results of one on to the next. The requests themselves are non-blocking, and async.waterfall() does not change or influence that. The series you have just means that you have multiple non-blocking requests in a row.
What you have is analogous to a series of nested setTimeout() calls. For example, this sequence of code takes 5 seconds to get to the inner callback (like your async.waterfall() takes n seconds to get to the last callback):
setTimeout(function() {
  setTimeout(function() {
    setTimeout(function() {
      setTimeout(function() {
        setTimeout(function() {
          // it takes 5 seconds to get here
        }, 1000);
      }, 1000);
    }, 1000);
  }, 1000);
}, 1000);
But, this uses basically zero CPU because it's just 5 consecutive asynchronous operations. The actual node.js process is involved for probably no more than 1ms to schedule the next setTimeout() and then the node.js process literally could be doing lots of other things until the system posts an event to fire the next timer.
You can read more about how the node.js event queue works in these references:
Run Arbitrary Code While Waiting For Callback in Node?
blocking code in non-blocking http server
Hidden threads in Javascript/Node that never execute user code: is it possible, and if so could it lead to an arcane possibility for a race condition?
How does JavaScript handle AJAX responses in the background? (written about the browser, but concept is the same)
If I have a correct understanding of what’s going on, this means that
what I currently have (in development) is in no way going to scale to
even more than 10 clients.
This is not a correct understanding. A node.js process can easily have thousands of non-blocking requests in flight at the same time. Your sequentially measured time is only a start to finish time - it has nothing to do with CPU resources or other OS resources consumed (see comments below on non-blocking resource consumption).
I still have concerns about using node for this particular
application then. I'm worried about how it will scale considering that
the work it is doing is not simple I/O but computationally intensive.
I feel as though I should switch to a platform that enables
multi-threading. Does what I'm asking/the concern I'm expressing make
sense? I could just be spitting total BS and have no idea what I'm
talking about.
Non-blocking I/O consumes almost no CPU (only a little when the request is originally sent and then a little when the result arrives back), but while the computer is waiting for the remote result, no CPU is consumed at all and no OS thread is consumed. This is one of the reasons that node.js scales well for non-blocking I/O: no resources are used while the computer is waiting for a response from a remote site.
If your processing of the request is computationally intensive (e.g. takes a measurable amount of pure blocking CPU time to process), then yes, you would want to explore getting multiple processes involved in running the computations. There are multiple ways to do this. You can use clustering (so you simply have multiple identical node.js processes, each working on requests from different clients) with the nodejs clustering module. Or, you can create a work queue of computationally intensive work to do and have a set of child processes that do the computationally intensive work. Or, there are several other options too. This is not the type of problem that one needs to switch away from node.js to solve - it can be solved using node.js just fine.
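A minimal sketch of the clustering option (using the built-in cluster module; each worker runs an identical server, so a CPU-heavy request ties up only its own worker):
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Workers share the listening socket; the OS distributes connections.
  http.createServer(function (req, res) {
    // computationally intensive work here blocks only this worker
    res.end('handled by worker ' + cluster.worker.id + '\n');
  }).listen(8000);
}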
You can use a queue to process concurrent HTTP calls in Node.js:
https://www.npmjs.com/package/concurrent-queue
var cq = require('concurrent-queue');
test_queue = cq();

// request action method
testQueue: function(req, res) {
  // queuing each request to process sequentially
  test_queue(req.user, function (err, user) {
    console.log(user.id + ' done');
    res.json(200, user);
  });
},

// Queue will be processed one by one.
test_queue.limit({ concurrency: 1 }).process(function (user, cb) {
  console.log(user.id + ' started');
  // async calls will go there
  setTimeout(function () {
    // on callback of async, call cb and return response.
    cb(null, user);
  }, 1000);
});
Please remember that this should only be implemented for sensitive business calls where a resource needs to be accessed or updated by one user at a time.
It will serialize your I/O, make your users wait, and slow down response times.
Optimization:
You can make it faster and optimize it by creating a resource-dependent queue, so that there is a separate queue for each shared resource: calls for the same resource execute sequentially, while calls for different resources execute asynchronously.
Suppose you want to implement that on the basis of the current user: then HTTP calls for the same user execute sequentially, and HTTP calls for different users execute asynchronously.
testQueue: function(req, res) {
  // if a queue does not exist yet for the current user...
  if (!(test_queue.hasOwnProperty(req.user.id))) {
    // initialize a queue for the current user
    test_queue[req.user.id] = cq();
    // initialize queue processing for the current user
    // Queue will be processed one by one.
    test_queue[req.user.id].limit({ concurrency: 1 }).process(function (task, cb) {
      console.log(task.id + ' started');
      // async functionality will go there
      setTimeout(function () {
        cb(null, task);
      }, 1000);
    });
  }
  // queuing each request in the user-specific queue to process sequentially
  test_queue[req.user.id](req.user, function (err, user) {
    if (err) {
      return;
    }
    res.json(200, user);
    console.log(user.id + ' done');
  });
},
This will be fast, and it blocks I/O only for the resource you actually want to serialize.

Close listener after idle time

I have a simple Node.js server that is started automatically.
It uses express to host the endpoint, which is started with a simple app.listen(port); command.
Since it starts automatically, I'd like to shut the server down after an idle period - say, 3 minutes.
I've coded it manually with the function below, which is called on each app.post:
// Idle timer
var timer;
function resetIdleTimer() {
  if (timer != null) clearTimeout(timer);
  timer = setTimeout(function () {
    logger.info('idle shutdown');
    process.exit();
  }, 3 * 60 * 1000);
}
This seems a little crude though, so I wondered if there is a neater way (some sort of timer within express, maybe).
Looking in the express docs I didn't see an easy way to configure this.
Is there a neater way to have this idle shutdown implemented?
app.listen() returns a wrapped HTTP server (as can be seen here in the source), on which you can then call the .close() method.
var app = express();
var server = app.listen(port);

setTimeout(function() {
  server.close();
}, 3 * 60 * 1000);
This will prevent the server from accepting new connections. When it has finished serving existing connections, it will stop gracefully, and Node.js will then exit entirely.
Edit: You might also find this GitHub issue relevant.
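Combining that with the idle timer from the question gives a more graceful variant: reset the timer from a middleware and call server.close() instead of process.exit() (a sketch; port and logger are the same as in the question):
var express = require('express');
var app = express();
var server = app.listen(port); // port as in the question

var timer;
function resetIdleTimer() {
  if (timer != null) clearTimeout(timer);
  timer = setTimeout(function () {
    logger.info('idle shutdown'); // logger as in the question
    // stop accepting new connections; node exits once
    // existing connections have finished
    server.close();
  }, 3 * 60 * 1000);
}

app.use(function (req, res, next) {
  resetIdleTimer(); // any request postpones the shutdown
  next();
});
resetIdleTimer(); // start the countdown at startup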
Take a look at forever. You can require it as a module in your application, and it provides you with some functions that can help you achieve what you are looking for (such as forever.stop(index), which terminates the node process running at that index). Before terminating the process, you could retrieve the list of processes and manipulate the strings to get the uptime. Then I would monitor the time that passes between server calls: if there is a gap of 3 minutes between requests, I would call forever.stop() to terminate the process.
I don't think it's "crude" to use your timer solution; I would just take a slightly different tack:
app.timeOutDate = new Date().valueOf() + 1000 * 60 * 3; // 3 minutes from now, in ms

function quitIfTimedout(req, res, next) {
  if (new Date().valueOf() > app.timeOutDate) {
    logger.info('idle shutdown');
    process.exit();
  } else {
    app.timeOutDate = new Date().valueOf() + 1000 * 60 * 3; // reset
    next();
  }
}

app.all('*', quitIfTimedout);
However, this won't actually quit after 3 minutes; it will instead quit on the first request that arrives after 3 minutes have passed, so it might not solve your problem.
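To make it quit after 3 idle minutes rather than waiting for the next request, you could also poll the deadline on an interval (a sketch building on the code above; the 30-second poll period is arbitrary):
// Check the deadline every 30 seconds, so the process exits even
// when no further requests arrive.
setInterval(function () {
  if (new Date().valueOf() > app.timeOutDate) {
    logger.info('idle shutdown');
    process.exit();
  }
}, 30 * 1000);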

Node js - http.request() problems with connection pooling

Consider the following simple Node.js application:
var http = require('http');
http.createServer(function() { }).listen(8124); // Prevent process shutting down

var requestNo = 1;
var maxRequests = 2000;

function requestTest() {
  http.request({ host: 'www.google.com', method: 'GET' }, function(res) {
    console.log('Completed ' + (requestNo++));
    if (requestNo <= maxRequests) {
      requestTest();
    }
  }).end();
}

requestTest();
It makes 2000 HTTP requests to google.com, one after the other. The problem is it gets to request No. 5 and pauses for about 3 mins, then continues processing requests 6 - 10, then pauses for another 3 minutes, then requests 11 - 15, pauses, and so on. Edit: I tried changing www.google.com to localhost, an extremely basic Node.js app running on my machine that returns "Hello world"; I still get the 3 minute pause.
Now I read I can increase the connection pool limit:
http.globalAgent.maxSockets = 20;
Now if I run it, it processes requests 1 - 20, then pauses for 3 mins, then requests 21 - 40, then pauses, and so on.
Finally, after a bit of research, I learned I could disable connection pooling entirely by setting agent: false in the request options:
http.request({ host: 'www.google.com', method: 'GET', agent: false }, function(res) {
...snip....
...and it'll run through all 2000 requests just fine.
My question: is it a good idea to do this? Is there a danger that I could end up with too many HTTP connections? And why does it pause for 3 mins? Surely if I've finished with the connection it should be added straight back into the pool ready for the next request to use, so why is it waiting 3 mins? Forgive my ignorance.
Failing that, what is the best strategy for a Node.js app making a potentially large number HTTP requests, without locking up, or crashing?
I'm running Node.js version 0.10 on Mac OSX 10.8.2.
Edit: I've found if I convert the above code into a for loop and try to establish a bunch of connections at the same time, I start getting errors after about 242 connections. The error is:
Error was thrown: connect EMFILE
(libuv) Failed to create kqueue (24)
...and the code...
for (var i = 1; i <= 2000; i++) {
  (function(requestNo) {
    var request = http.request({ host: 'www.google.com', method: 'GET', agent: false }, function(res) {
      console.log('Completed ' + requestNo);
    });
    request.on('error', function(e) {
      console.log(e.name + ' was thrown: ' + e.message);
    });
    request.end();
  })(i);
}
I don't know if a heavily loaded Node.js app could ever reach that many simultaneous connections.
You have to consume the response.
Remember, in v0.10, we landed streams2. That means that data events don't happen until you start looking for them. So, you can do stuff like this:
http.createServer(function(req, res) {
  // this does some I/O, async
  // in 0.8, you'd lose data chunks, or even the 'end' event!
  lookUpSessionInDb(req, function(er, session) {
    if (er) {
      res.statusCode = 500;
      res.end("oopsie");
    } else {
      // no data lost
      req.on('data', handleUpload);
      // end event didn't fire while we were looking it up
      req.on('end', function() {
        res.end('ok, got your stuff');
      });
    }
  });
});
However, the flip side of streams that don't lose data when you're not reading it, is that they actually don't lose data if you're not reading it! That is, they start out paused, and you have to read them to get anything out.
So, what's happening in your test is that you're making a bunch of requests and not consuming the responses, and then eventually the socket gets killed by google because nothing is happening, and it assumes you've died.
There are some cases where it's impossible to consume the incoming message: that is, if you don't add a 'response' event handler on a request, or if you completely write and finish the response message on a server without ever reading the request. In those cases, we just dump the data in the garbage for you.
However, if you are listening to the 'response' event, it's your responsibility to handle the object. Add a response.resume() in your first example, and you'll see it processes on through at a reasonable pace.
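Applied to the loop from the question, that's a one-line change (sketch):
function requestTest() {
  http.request({ host: 'www.google.com', method: 'GET' }, function(res) {
    res.resume(); // consume the response so the socket is released promptly
    console.log('Completed ' + (requestNo++));
    if (requestNo <= maxRequests) {
      requestTest();
    }
  }).end();
}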
