Nodejs Keep-alive with Unix Domain sockets - node.js

Keep-alive works just fine over TCP, but Unix domain sockets give me weird behavior. If I send a couple thousand requests like this:
request.post('http://unix:/tmp/http.sock:/check', {
  json: {
    ...
  },
  forever: true,
  pool: {maxSockets: 10},
  headers: {
    'Host': '',
    'Connection': 'keep-alive'
  }
})
a kernel trace will show 2000 sockets being created (and never closed), one for each request. I'd expect only 10 sockets to be created and reused as necessary.
Is there a way to set things up so unix sockets are kept alive and reused the same way TCP sockets are?

From the request documentation:
Note that if you are sending multiple requests in a loop and creating multiple new pool objects, maxSockets will not work as intended. To work around this, either use request.defaults with your pool options or create the pool object with the maxSockets property outside of the loop.
So it seems like you need to create the pool object outside the loop in order for sockets to be reused as you expect.

This behavior is broken in Node prior to version v8.7.0. A commit by user bengl fixing keep-alive for Unix domain sockets was put into the v8.7.0 build. That build was released about 6 days ago.
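For anyone who wants to check this on their own Node version, here is a minimal sketch that uses the core http module instead of request. It assumes a Unix-like system; the socket path, the runDemo and post helper names, and the request count are all made up for illustration. It starts a throwaway server on a Unix domain socket, sends a batch of POSTs through one shared keep-alive agent, and reports how many connections the server actually saw.

```javascript
const http = require('http');
const os = require('os');
const path = require('path');

// Hypothetical socket path for this demo only.
const socketPath = path.join(os.tmpdir(), `keepalive-demo-${process.pid}.sock`);

// Send one JSON POST over the Unix socket using the given agent.
function post(agent, body) {
  return new Promise((resolve, reject) => {
    const payload = JSON.stringify(body);
    const req = http.request({
      socketPath,
      path: '/check',
      method: 'POST',
      agent,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(payload),
      },
    }, (res) => {
      res.resume(); // drain the body so the socket can go back to the agent
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(payload);
  });
}

// Start a server on the socket, fire `total` requests through ONE shared
// agent, and resolve with the number of connections the server accepted.
function runDemo(total) {
  const agent = new http.Agent({ keepAlive: true, maxSockets: 10 });
  let connections = 0;
  const server = http.createServer((req, res) => {
    req.resume();
    req.on('end', () => res.end('ok'));
  });
  server.on('connection', () => { connections += 1; });
  return new Promise((resolve, reject) => {
    server.on('error', reject);
    server.listen(socketPath, () => {
      const jobs = [];
      for (let i = 0; i < total; i += 1) jobs.push(post(agent, { n: i }));
      Promise.all(jobs).then(() => {
        agent.destroy(); // close idle keep-alive sockets so the server can stop
        server.close(() => resolve(connections));
      }, reject);
    });
  });
}
```

Calling runDemo(2000).then(console.log) on a Node version that includes the fix should report a number no larger than the agent's maxSockets, while an affected version would be expected to report roughly one connection per request.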

Related

How to get a count of the current open sockets in Node?

I am using the request module to crawl a list of URLs and would like to limit the number of open sockets to 2:
var req = request.defaults({
  forever: true,
  pool: {maxSockets: 1}
});
req(options, function(error, response, body) {
  ... code ...
  done();
});
However, when looping over an array of URLs and issuing a new request for each, that does not seem to work.
Is there a way to get the current number of open sockets to test it?
I believe that maxSockets maps to http.Agent.maxSockets, which limits the number of concurrent requests to the same origin (host:port).
This comment, from the developer of request, suggests the same:
actually, pooling controls the agent passed to core. each agent holds all hosts and throttles the maxSockets per host
In other words, you can't use it to limit the number of concurrent requests in general. For that, you need to use an external solution, for instance using limiter or async.queue.
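As for counting open sockets: if you have a reference to the agent in use (for example http.globalAgent, or one you created yourself), a rough count can be read from its documented sockets, freeSockets and requests properties. The helper name countOpenSockets below is made up, and this is only a sketch of the idea, not something request provides.

```javascript
const http = require('http');

// Sum what an agent is currently tracking. `agent.sockets` holds sockets in
// use, `agent.freeSockets` holds idle keep-alive sockets, and `agent.requests`
// holds requests still waiting for a socket. Each is an object keyed by
// origin name whose values are arrays.
function countOpenSockets(agent) {
  const count = (byName) =>
    Object.values(byName || {}).reduce((sum, list) => sum + list.length, 0);
  return {
    inUse: count(agent.sockets),
    idle: count(agent.freeSockets),
    queuedRequests: count(agent.requests),
  };
}
```

Logging countOpenSockets(http.globalAgent) from a timer while the crawl runs gives a quick view of whether the limit is being respected per host.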

Throttling event-driven Nodejs HTTP requests

I have a Node net.Server that listens to a legacy system on a TCP socket. When a message is received, it sends an http request to another http server. Simplified, it looks like this:
var request = require('request-promise');
...
socket.on('readable', function () {
var msg = parse(socket.read());
var postOptions = {
uri: 'http://example.com/go',
method: 'POST',
json: msg,
headers: {
'Content-Type': 'application/json'
}
};
request(postOptions);
})
The problem is that the socket is readable about 1000 times per second. The requests then overload the http server. Almost immediately, we get multiple-second response times.
In running Apache benchmark, it's clear that the http server can handle well over 1000 requests per second in under 100ms response time - if we limit the number of concurrent requests to about 100.
So my question is, what is the best way to limit the concurrent requests outstanding using the request-promise (by extension, request, and core.http.request) library when each request is fired separately within an event callback?
Request's documentation says:
Note that if you are sending multiple requests in a loop and creating multiple new pool objects, maxSockets will not work as intended. To work around this, either use request.defaults with your pool options or create the pool object with the maxSockets property outside of the loop.
I'm pretty sure that this paragraph is telling me the answer to my problem, but I can't make sense of it. I've tried using defaults to limit the number of open sockets:
var rp = require('request-promise');
var request = rp.defaults({pool: {maxSockets: 50}});
Which doesn't help. My only thought at the moment is to manually manage a queue, but I expect that would be unnecessary if I only knew the conventional way to do it.
Well, you need to throttle your requests, right? I have worked around this in two ways, but let me show you one pattern I always use. I often use throttle-exec and Promise to make a wrapper for request. You can install it with npm install throttle-exec and use a native or third-party Promise. Here is my gist for this wrapper: https://gist.github.com/ans-4175/d7faec67dc6374803bbc
How do you use it? It's simple, just like an ordinary request.
var Request = require("./Request");
Request({
  url: url_endpoint,
  json: param,
  method: 'POST'
})
.then(function(result){
  console.log(result);
})
.catch(function(err){
  // handle or log the failure
  console.error(err);
});
Tell me after you implement it. Either way I have another wrapper :)
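If you would rather not pull in a dependency, the "manually manage a queue" idea from the question can be sketched in a few lines. This is not the throttle-exec API or the gist above; createLimiter is a made-up name, and the limit of 100 in the usage note comes from the benchmark figure in the question.

```javascript
// Returns a function that runs promise-returning tasks with at most
// `maxConcurrent` of them in flight; the rest wait in a FIFO queue.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  function next() {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)                // run the task (sync throws become rejections)
      .then(resolve, reject)     // settle the caller's promise
      .then(() => {              // free the slot and start the next task
        active -= 1;
        next();
      });
  }

  return function limit(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}
```

In the readable handler this would look something like var limit = createLimiter(100); created once outside the handler, then limit(function () { return request(postOptions); }) inside it. Note that the queue is unbounded, so if messages keep arriving faster than the server can answer, memory will grow.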

How to use Request js (Node js Module) pools

Can someone explain how to use the request.js pool hash?
The github notes say this about pools:
pool - A hash object containing the agents for these requests. If omitted this
request will use the global pool which is set to node's default maxSockets.
pool.maxSockets - Integer containing the maximum amount of sockets in the pool.
I have this code for writing to a CouchDB instance (note the question marks). Basically, any user who connects to my Node server will write to the DB independent of each other:
var request = require('request');
request({
//pool:, // ??????????????????
'pool.maxSockets' : 100, // ??????????????????
'method' : 'PUT',
'timeout' : 4000,
'strictSSL' : true,
'auth' : {
'username' : myUsername,
'password' : myPassword
},
'headers' : {
'Content-Type': 'application/json;charset=utf-8',
'Content-Length': myData.length
},
'json' : myData,
'url': myURL
}, function (error, response, body){
if (error == null) {
log('Success: ' + body);
}
else {
log('Error: ' + error);
}
});
What's best for high throughput/performance?
What are the drawbacks of a high 'maxSockets' number?
How do I create a separate pool to use instead of the global pool? Why do I only want to create a separate pool?
The pool option in request uses an agent, which is the same as http.Agent from the standard http library. See the documentation for http.Agent and the agent option of http.request.
Usage
var http = require('http');
var request = require('request');
var pool = new http.Agent(); // your pool/agent
http.request({hostname: 'localhost', port: 80, path: '/', agent: pool}).end();
request({url: "http://www.google.com", pool: pool});
If you are curious about what an agent looks like, you can print it to the console:
{ domain: null,
_events: { free: [Function] },
_maxListeners: 10,
options: {},
requests: {},
sockets: {},
maxSockets: 5,
createConnection: [Function] }
maxSockets determines how many concurrent sockets the agent can have open per host; it is present in an agent by default with the value 5. Typically you would set it beforehand. Passing pool.maxSockets explicitly overrides the maxSockets property in pool. This option only makes sense when you pass the pool option.
So there are different ways to use it:
Don't give the agent option: it will be undefined and http.globalAgent will be used. This is the default case.
Give it as false: this disables pooling.
Provide your own agent, as in the example above.
Answering your questions in reverse.
A pool is meant to keep a certain number of sockets for use by the program. First, the sockets are reused for different requests, which reduces the overhead of creating new sockets. Second, it uses fewer sockets for requests, but consistently; it will not take up all available sockets. Third, it maintains a queue of requests, so some waiting time is implied.
A pool acts as both a cache and a throttle. The throttle effect is more visible when you have more requests and fewer sockets. When using the global pool, it may limit the functioning of two different clients, and there are no guarantees on waiting time. Having a separate pool for each will be fairer to both (think of one making more requests than the other).
The maxSockets property sets the maximum possible concurrency. Raising it increases the overall throughput/performance. The drawback is that the throttle effect is reduced, so you cannot control peak overhead. Setting it to a large number is like having no pooling at all, and you would start getting errors such as no socket being available. It cannot be more than the maximum limit allowed by the OS.
So what is best for high throughput/performance? There is a physical limit on throughput. Once you reach that limit, response time increases with the number of connections. You can keep increasing maxSockets until then, but after that increasing it will not help.
You should take a look at the forever-agent module, which is a wrapper around http.Agent.
Generally, the pool is a hash object that contains a number of http agents. It tries to reuse sockets created by keep-alive connections, per host:port. For example, if you performed several requests to www.domain1.com:80 and www.domain2.com:80 and a response does not contain the header Connection: close, the socket is put in the pool and given to pending requests.
If no pending requests need this pooled socket, it will be destroyed.
maxSockets is the maximum number of concurrent sockets for a single host:port; the default value is 5. I would suggest choosing this value with your scenario in mind:
For the hot sites that your requests visit often, you'd better create a separate pool so that new requests can pick up idle sockets quickly. The point is to reduce the number of pending requests to those sites by increasing the maxSockets value of the pool. Note that it doesn't matter if you set maxSockets very high when the origin server manages the connection itself via the response header Connection: close.
For the sites that your requests rarely visit, use pool: false to disable pooling.
You can specify a separate pool for your request this way:
// create a separate socket pool with 10 concurrent sockets as max value.
var separateReqPool = {maxSockets: 10};
var request = require('request');
request({url: 'http://localhost:8080/', pool: separateReqPool}, function(e, resp){
});
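The same hot-site versus rare-site split can be sketched with core http agents instead of request's pool option. The host list, the agentFor helper and the maxSockets value of 10 are assumptions for illustration only.

```javascript
const http = require('http');

// Hypothetical list of hosts that receive most of the traffic.
const HOT_HOSTS = new Set(['www.domain1.com', 'www.domain2.com']);
const hotAgents = new Map();

// Frequently visited hosts get a dedicated keep-alive agent (created once and
// reused); everything else gets `false`, which tells http.request to use a
// one-off agent with default values, so no sockets are pooled for it.
function agentFor(hostname) {
  if (!HOT_HOSTS.has(hostname)) return false;
  if (!hotAgents.has(hostname)) {
    hotAgents.set(hostname, new http.Agent({ keepAlive: true, maxSockets: 10 }));
  }
  return hotAgents.get(hostname);
}

// Example wrapper: picks the agent based on the target host.
function get(hostname, path, callback) {
  return http.get({ hostname, path, agent: agentFor(hostname) }, callback);
}
```

Keeping one agent per hot host means a burst of requests to one site cannot use up the sockets meant for another.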

Possible to simulate several concurrent connections to test a nodejs app

I have a simple node.js / socket.io (websockets) application running on localhost. I am trying to see how many concurrent connections it can handle. Is it possible to simulate several concurrent users on localhost itself?
This is my half baked attempt using socket.io-client:
function connectAndSend(){
socket.emit('qand',{
code :'ubuntu'
});
}
socket.on('connect', function () {
});
socket.on('q', function (data) {
console.log(data);
});
function callConnect(){
console.log('calling');
connectAndSend() ;
setTimeout(callConnect,100) ;
}
callConnect() ;
As I see it this only 'emits' a new message every 100 ms and is not simulating concurrent connections.
In your call to connect, you must tell socket.io to create a new connection for each call to connect. For example:
var socket = io.connect(server, { "force new connection": true });
Also, if you want to raise the outbound TCP connection limit (which seems to default to 5 connections per target), do something like
require('http').globalAgent.maxSockets = 1000;
before connecting.
But note that creating and closing tcp sockets at a fast rate will make TCP connections pile up in state TIME_WAIT and depending on your OS and your network settings you'll hit a limit pretty soon, meaning you'll have to wait for those old sockets to timeout before you can establish new connections.
If I recall correctly, the limit was around 16k connections (per target ip/port combo) on Windows (both Server 2008 R2 and Windows 7), and the default TIME_WAIT timeout in Windows is 4 minutes, so if you create more than 16k connections in 4 minutes on Windows, you'll probably hit that wall.
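To put those figures in perspective, here is the arithmetic as a tiny sketch. The 16000 and 4-minute values are just the recalled numbers from above, not measured ones, and the function name is made up.

```javascript
// If every closed connection lingers in TIME_WAIT for `timeWaitSeconds`, and
// at most `connectionLimit` connections per target may exist at once, this is
// the highest rate of new short-lived connections that can be sustained.
function maxSustainedConnectionRate(connectionLimit, timeWaitSeconds) {
  return connectionLimit / timeWaitSeconds;
}

const rate = maxSustainedConnectionRate(16000, 4 * 60);
console.log(rate.toFixed(1) + ' new connections per second');
```

Under those assumptions the ceiling is roughly 67 new connections per second to one target, which a load test that opens and closes sockets in a tight loop can exceed quickly.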
Check here:
Long connections with Node.js, how to reduce memory usage and prevent memory leak? Also related with V8 and webkit-devtools
and specifically the test procedure used by the author of the question mentioned above
EDIT:
You can use the following tools to check how many requests per second your server is capable of serving:
ab - http://httpd.apache.org/docs/2.2/programs/ab.html
siege - http://www.joedog.org/siege-home/

Sockets don't appear to be closing when using Node.js http.get

OS X 10.8.3
Node 0.10.0
I'm using the 'http' module to make requests of the Facebook graph API.
Here are the options that I pass to 'http.get':
var options = {host: 'graph.facebook.com',
port: 80,
path: '/' + fb_id + '/picture'}; //fb_id is a Facebook user identifier
My code looks like this:
http.get(options,
function(res) {
...some stuff...
DONE(RESULT); //DONE is a callback function
}).on('error', function(e) {
...some error handling...
});
What I observe is that I can only do as many requests as the value of http.globalAgent.maxSockets. Once I reach that many requests, the next call to http.get never (apparently) connects. I've verified that I'm not getting errors on the requests.
It's as though the sockets are not being closed after the response comes in.
Is there something I need to do as part of the response handler to ensure that the socket is closed?
Are these sockets not closing because of the default keepalive behavior?
How should I proceed to debug this?
Try setting agent: false in the options. Default behaviour is indeed to keep connections open for HTTP keep-alive:
var options = {host: 'graph.facebook.com',
port: 80,
path: '/' + fb_id + '/picture',
agent: false};
Node's http module documentation states that agent defaults to the global agent: http://nodejs.org/api/http.html#http_http_globalagent, which means that keep-alive is shared regardless of the module that originates the request.
BTW, responding to Wes' comment of Apr9'13 at 20:47: it does not matter how many times you load a node module; it will be loaded only once and shared by all the modules.
What you're experiencing is a pool exhaustion problem. The simplest way to avoid it is to use a new agent (http://nodejs.org/api/http.html#http_class_http_agent) with your desired maxSockets. Remember that the agent you create can be shared between modules if you place it in an export of that module (modules in node are stateful!).
I experienced the same behavior, except that my connections were finally reused after a timeout period. Check if the connections are reused after a certain period of time (couple of minutes), and also check if response headers contain 'Connection: keep-alive'.
If that's the case a possible solution would be to use the 'Connection: Close' header instead of keep-alive, that way pooled connections could be reused earlier as in the usual setup. I am not sure if this leads to any performance issues using the facebook endpoints.
var options = {host: 'graph.facebook.com',
port: 80,
path: '/' + fb_id + '/picture',
headers: { 'Connection':'Close' }
};
For me using agent:false did not work because the vast number of requests I sent exhausted server resources.
