How to pass an active WebSocket to a clustered thread in Node.js?

Node.js exposes a handy way to pass net.Socket instances to child processes (cluster.Worker) via:
var cluster = require("cluster");
var socket; // some instance of net.Socket
var worker = cluster.fork();
worker.on("online", function() {
    worker.send("socket", socket);
});
Which is super cool and works handily. But how would I do this with a WebSocket connection? I'm open to trying any module.
Currently I've tried using various modules like ws. Most of them keep the initial HTTP request's net.Socket and then upgrade it, but none seem simple enough to pass to the child process as a plain net.Socket, because they carry lots of handshake state required by the WebSocket spec, as far as I can tell.
I know there are hackish solutions, like opening a WebSocket server on the child process on a unique port and then telling the WebSocket connection to reconnect on that port, but then I need an open port for every child process. Or piping all data to the WebSocket connection through process.send, so the main thread does all the I/O, but that defeats some of the performance benefit of running work on multiple processes.
So does anyone have any ideas?

Welp, I figured it out. ws may have been too much for my intended purposes. Instead I found a fairly obscure WebSocket library, lark-websocket, which exposes a function that, given a net.Socket, can wrap it up in their Client class and work with it as a WebSocket. The only issue was that both the parent and child processes would then try to ping the connection on the other end, so I had to fork it and add a way for the parent process to pause pinging.
Here's some example code for anyone interested:
var cluster = require("cluster");
var ws = require('lark-websocket');

if(cluster.isMaster) { // make a child process and pipe all ws connections to it
    var worker = cluster.fork();
    worker.once("online", function() {
        console.log("worker online with pid", worker.process.pid);
    });

    ws.createServer(function(client, request) {
        worker.send("socket", client._socket); // send all websocket clients to the worker process
    }).listen(27015);
}
else { // we are a worker, so we handle the ws connections
    process.on("message", function(message, handler) {
        if(message === "socket") { // Node.js delivers the socket itself as the handle argument; "socket" is the message label used here, since passing sockets between processes is sketchy
            var client = ws.createClient(handler);
            client.on('message', function(msg) {
                console.log("worker " + process.pid + " got:", msg);
                client.send("I got your: " + msg);
            });
        }
    });
}

Related

How to fork a process in node that writes express response

I'd like to fork a long-running express request in node and send an express response with the child, allowing the parent to serve other requests. I'm already using cluster, but I'd like to fork another process in addition to the cluster for specific long-running requests. What I'd like to prevent is all the processes in the cluster being consumed by a specific long-running request type, while most of the other requests are fast.
Thanks
var express = require('express');
var webserver = express();

webserver.get("/test", function(request, response) {
    // long running HTTP request
    response.send(...);
});
What I'm thinking of is something like following, although I'm not sure this works:
var cp = require('child_process');
var express = require('express');
var webserver = express();

webserver.get("/test", function(request, response) {
    var child = cp.fork('do_nothing.js');
    child.on("message", function(message) {
        if(message == "start") {
            response.send(...);
            process.exit();
        }
    });
    child.send("start");
});
Let me know if anyone knows how to do this.
Edit: So, the idea is that the child could take a long time. There are a limited number of processes in the cluster serving express responses and I don't want to consume them all on a specific long-running request type. In the code below, the entire cluster would be consumed by the long running express requests.
while(1) {
    if(rand() % 100 == 0) {
        if(fork() == 0) {
            sleep(hour(1));
            exit(0);
        }
    } else {
        sleep(second(1));
    }
    waitpid(WAIT_ANY, &status, WNOHANG);
}
Edit: I am going to mark the self-answer as solved. I'm sure there's a way to pass a socket to a child but it's not really necessary because the cluster master can manage all child processes. Thanks for your help.
Your second code block is confusing because it appears that you're killing the parent process with process.exit() rather than the child.
In any case, if we assume the problem is this:
You have a cluster of "regular processes".
Occasionally, you want to take an incoming request that was assigned to one of the cluster processes and pass it off to a long running child that will eventually send the response.
After sending the response, the long running child process should exit.
You have a couple of options.
You can have the clustered process that was assigned the request, start up a child, send it some initial data and listen for a message back from the child. When it gets the message back from the child, it can send the response and kill the child. This appears to be what you're attempting to do in your second code block.
You can have the clustered process that was assigned the request, start up a child and reassign the request socket to the child process and the child can then own that socket from then on. When it finally sends the response, it can then exit itself.
The first is simpler because no socket reassignment from one process to another is required. To implement the second, you'd have to write or find the code to do the socket reassignment and then reconstitute it as an express request within the child. The cluster module does something like this, so the code is out there to be found and learned from, but I'm not aware of a trivial way to do it.
Personally, I don't see any particular downside to the first. I suppose if the clustered process were to die for some reason, you'd lose the long-running request socket, but hopefully you can just code your clustered processes not to die unnecessarily.
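If you go with the first option, a minimal sketch might look like the following (the worker filename long-task.js, the message shape, and the port are placeholders, not from the question):
var cp = require('child_process');
var express = require('express');
var webserver = express();

webserver.get('/slow', function (request, response) {
    // 'long-task.js' is a hypothetical worker script that does the slow work
    var child = cp.fork('long-task.js');
    child.once('message', function (result) {
        response.send(result); // reply once the child reports back
        child.kill();          // the long-running child is done, clean it up
    });
    child.send({ start: true, query: request.query });
});

webserver.listen(3000); // illustrative port
In long-task.js, the child would listen for the message, do the slow work, and process.send() the result back before exiting.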
You can read this article on sending a socket to a new node.js process:
Sending a socket to a forked process
And, this node.js doc on sending a socket:
Example: sending a socket object
So, I've verified that this is not necessary for my use case, but I was able to get it working using the code below. It's not exactly what the OP asks for, but it works.
What it's doing is sending an instruction to the cluster master, which forks the additional process upon receipt of the slow express request.
Since the express request doesn't need to know the status of the newly forked cluster worker, it just handles the slow request as normal and then exits.
The instruction to the cluster master informs the master not to replace the dying slow express request process, so the number of workers reverts to the original number after the slow request finishes.
The pool will grow when there are slow requests, but revert to its normal size afterwards. This prevents, say, 20 simultaneous slow requests from bringing down the cluster.
var cluster = require('cluster');
var express = require('express');
var sleep = require('sleep');   // npm "sleep" module, used here to simulate the slow request
var logger = console;           // stand-in for whatever logger the app uses
var webserver = express();

var numberOfWorkers = 10;
var workerCount = 0;
var slowRequestPids = { };

if (cluster.isMaster) {
    for(var i = 0; i < numberOfWorkers; i++) {
        workerCount++;
        cluster.fork();
    }

    cluster.on('exit', function(worker) {
        workerCount--;
        var pidString = String(worker.process.pid);
        if(pidString in slowRequestPids) {
            delete slowRequestPids[pidString];
            if(workerCount >= numberOfWorkers) {
                logger.info('not forking replacement for slow process');
                return;
            }
        }
        logger.info('forking replacement for a process that died unexpectedly');
        workerCount++;
        cluster.fork();
    });

    // Note: on newer Node versions the cluster "message" handler receives (worker, msg).
    cluster.on("message", function(msg) {
        if(typeof msg.fork != "undefined" && workerCount < 100) {
            logger.info("forking additional process upon slow request");
            slowRequestPids[msg.fork] = 1;
            workerCount++;
            cluster.fork();
        }
    });
    return;
}

webserver.use("/slow", function(req, res) {
    process.send({ fork: String(process.pid) });
    sleep.sleep(300);
    res.send({ response_from: "virtual child" });
    res.on("finish", function() {
        logger.info('process exits, restoring cluster to original size');
        process.exit();
    });
});

webserver.listen(8080); // illustrative port; each worker listens on the shared port

Node.js cluster module appears to break Socket.io handshake

I have the following simple WebSocket server built around the Socket.io library:
var PROCESSES = 1,
    cluster = require('cluster'),
    i;

if (cluster.isMaster) {
    for (i = 0; i < PROCESSES; i++) {
        console.log('Forking worker', i);
        cluster.fork();
    }
} else {
    (function () {
        var server = require('http').Server(),
            io = require('socket.io')(server);

        io.on('connection', function (socket) {
            socket.on('message', function (message) {
                socket.emit('message', message + ' too!');
            });
        });

        server.listen(8080);
    })();
}
When started, it creates a single server process which listens for WebSocket connections and echoes a variation of the message back to the client:
$ iocat --socketio ws://localhost:8080
> i am hungry
i am hungry too!
> i like you
i like you too!
>
Now, when I change the PROCESSES variable to a number larger than 1, the client can no longer connect.
var PROCESSES = 2,
...
...results in...
$ iocat --socketio ws://localhost:8080
> client.on error
$ iocat -v --socketio ws://localhost:8080
> SIOClient> SIOClient: url-> ws://localhost:8080
SIOClient> onError { [Error: xhr poll error] description: 400 }
client.on error
My gut feeling is that the cluster module, when given more than one worker process, inappropriately switches from one process to another mid-handshake. But I would have thought that the entire connection, from the client initiating the handshake to the closing of the socket at the very very end, occurred over one persistent, keep-alive'd connection.
So what exactly is going on here? And how could it be worked around? I'm familiar with the idea of using a Redis store to share state between server processes on different machines, but that feels like too much infrastructure for my use case (collecting a stream of events from the client and replying with an acknowledgement).
Versions: socket.io#1.3.3, node#0.10.36, seen on OS X 10.10 and CentOS 6.6
socket.io is not a simple wrapper over WebSockets; it does much more. The opening handshake is an HTTP request to decide on a protocol (WebSocket, polling, flash sockets, etc.), followed by, in your case, probably a WebSocket request. If those hit different processes, the handshake will fail.
socket.io requires that you use sticky sessions, to ensure that a given client hits the same process each time. They suggest using the sticky-session module if you want to use cluster.
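For reference, a sketch of what the sticky-session wiring might look like for the server above (based on the sticky-session README; the exact API may differ between versions, and the port is illustrative):
var http = require('http');
var sticky = require('sticky-session');

var server = http.createServer();

if (!sticky.listen(server, 8080)) {
    // Master process: sticky-session forks workers and routes each client
    // (by IP hash) to the same worker every time, so the handshake stays
    // on one process.
    server.once('listening', function () {
        console.log('server started on port 8080');
    });
} else {
    // Worker process: set up socket.io as before.
    var io = require('socket.io')(server);
    io.on('connection', function (socket) {
        socket.on('message', function (message) {
            socket.emit('message', message + ' too!');
        });
    });
}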

Handle new TCP connections synchronously

I know nodejs is asynchronous by nature and it is preferable to use it that way, but I have a use case where we need to handle incoming TCP connections synchronously. Once a new connection is received, we need to connect to some other TCP server, perform some bookkeeping, etc., and only then handle the next connection. Since the number of connections is limited, it is fine to handle this synchronously.
Looking for an elegant way to handle this scenario.
var net = require('net');

net.createServer(function(connection) {
    console.log('Received a connection - ');
    var testvar = null;
    var sock = new net.Socket();
    sock.connect(PORT, HOST, function() {
        console.log('Connected to server - ');
    });
    //Other listeners
});
In the above code, if two connections are received simultaneously, the output may be the following (because of the asynchronous nature):
Received a connection
Received a connection
Connected to server
Connected to server
But the expectation is:
Received a connection
Connected to server
Received a connection
Connected to server
What is the proper way of doing this?
One solution is to implement a queue and emit 'done' or 'complete' events to move on to the next connection.
For this we may have to take the connection callback out of the createServer call. How do we handle the scoping of the connection and other variables (testvar) in that case?
And what happens to data/messages received on connections that are queued but not yet processed, before their 'data' listeners are registered?
Any other, better solutions would be helpful.
I think it is important to separate the concepts of synchronous code vs serial code. You want to process each request serially, but that can still be accomplished while handling each request asynchronously. For your case, the easiest way would probably be to have a queue of requests to handle instead.
var net = require('net');

var inProgress = false;
var queue = [];

net.createServer(function(sock) {
    queue.push(sock);
    processQueue();
});

function processQueue() {
    if (inProgress || queue.length === 0) return;
    inProgress = true;
    handleSockSerial(queue.shift(), function() {
        inProgress = false;
        processQueue();
    });
}

function handleSockSerial(sock, callback) {
    // Do all your stuff and then call 'callback' when you are done.
}
Note: as long as you are using Node >= 0.10, the data coming in on the socket will be buffered until you read it.
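For completeness, here is one way handleSockSerial might be filled in for the question's scenario (PORT and HOST are the upstream server's address, as in the question; the error handling is illustrative):
var net = require('net');

function handleSockSerial(sock, callback) {
    console.log('Received a connection - ');
    var upstream = net.connect(PORT, HOST, function() {
        console.log('Connected to server - ');
        // ... bookkeeping for this connection goes here ...
        callback(); // let the queue move on to the next connection
    });
    upstream.on('error', function(err) {
        console.error('upstream error', err);
        callback(); // do not stall the queue if the upstream connection fails
    });
}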

How do I shutdown a Node.js http(s) server immediately?

I have a Node.js application that contains an http(s) server.
In a specific case, I need to shut down this server programmatically. What I am currently doing is calling its close() function, but this does not help, as it waits for any kept-alive connections to finish first.
So, basically, this shuts down the server, but only after a minimum wait of 120 seconds. But I want the server to shut down immediately, even if this means breaking off currently handled requests.
What I cannot do is a simple
process.exit();
as the server is only part of the application, and the rest of the application should keep running. What I am looking for is conceptually something like server.destroy();.
How could I achieve this?
PS: The keep-alive timeout for connections is usually required, hence it is not a viable option to decrease this time.
The trick is that you need to subscribe to the server's connection event, which gives you the socket of each new connection. You need to remember this socket and, directly after having called server.close(), destroy it using socket.destroy().
Additionally, you need to listen to the socket's close event to remove it from your bookkeeping if it leaves naturally because its keep-alive timeout runs out.
I have written a small sample application you can use to demonstrate this behavior:
// Create a new server on port 4000
var http = require('http');
var server = http.createServer(function (req, res) {
    res.end('Hello world!');
}).listen(4000);

// Maintain a hash of all connected sockets
var sockets = {}, nextSocketId = 0;
server.on('connection', function (socket) {
    // Add a newly connected socket
    var socketId = nextSocketId++;
    sockets[socketId] = socket;
    console.log('socket', socketId, 'opened');

    // Remove the socket when it closes
    socket.on('close', function () {
        console.log('socket', socketId, 'closed');
        delete sockets[socketId];
    });

    // Extend socket lifetime for demo purposes
    socket.setTimeout(4000);
});

// Count down from 10 seconds
(function countDown (counter) {
    console.log(counter);
    if (counter > 0)
        return setTimeout(countDown, 1000, counter - 1);

    // Close the server
    server.close(function () { console.log('Server closed!'); });

    // Destroy all open sockets
    for (var socketId in sockets) {
        console.log('socket', socketId, 'destroyed');
        sockets[socketId].destroy();
    }
})(10);
Basically, what it does is to start a new HTTP server, count from 10 to 0, and close the server after 10 seconds. If no connection has been established, the server shuts down immediately.
If a connection has been established and it is still open, it is destroyed.
If it had already died naturally, only a message is printed out at that point in time.
I found a way to do this without having to keep track of the connections or having to force them closed. I'm not sure how reliable it is across Node versions or if there are any negative consequences to this but it seems to work perfectly fine for what I'm doing. The trick is to emit the "close" event using setImmediate right after calling the close method. This works like so:
server.close(callback);
setImmediate(function(){server.emit('close')});
At least for me, this ends up freeing the port so that I can start a new HTTP(S) service by the time the callback is called (which is pretty much instantly). Existing connections stay open. I'm using this to automatically restart the HTTPS service after renewing a Let's Encrypt certificate.
If you need to keep the process alive after closing the server, then Golo Roden's solution is probably the best.
But if you're closing the server as part of a graceful shutdown of the process, you just need this:
var server = require('http').createServer(myFancyServerLogic);
server.on('connection', function (socket) { socket.unref(); });
server.listen(80);

function myFancyServerLogic(req, res) {
    req.connection.ref();
    res.end('Hello World!', function () {
        req.connection.unref();
    });
}
Basically, the sockets that your server uses will only keep the process alive while they're actually serving a request. While they're just sitting there idly (because of a Keep-Alive connection), a call to server.close() will close the process, as long as there's nothing else keeping the process alive. If you need to do other things after the server closes, as part of your graceful shutdown, you can hook into process.on('beforeExit', callback) to finish your graceful shutdown procedures.
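For example, a final cleanup hook might look like this (a minimal sketch; what you do inside it is whatever your application needs):
process.on('beforeExit', function () {
    // Fires once the event loop drains, i.e. after server.close() has finished
    // and the idle keep-alive sockets (unref'd above) no longer hold the process open.
    console.log('all requests served, running final cleanup before exit');
});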
The https://github.com/isaacs/server-destroy library provides an easy way to destroy() a server with the behavior desired in the question (by tracking opened connections and destroying each of them on server destroy, as described in other answers).
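Usage is roughly like this (a sketch based on the library's README; check the package for the current API, and the port is illustrative):
var http = require('http');
var enableDestroy = require('server-destroy');

var server = http.createServer(function (req, res) {
    res.end('Hello world!');
});
server.listen(4000);
enableDestroy(server); // patches the server with a destroy() method that tracks open connections

// Later, when the server must go away immediately:
server.destroy(function () {
    console.log('server and all of its connections are closed');
});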
As others have said, the solution is to keep track of all open sockets and close them manually. My node package killable can do this for you. An example (using express, but you can use killable on any http.Server instance):
var killable = require('killable');
var app = require('express')();
var server;

app.get('/', function (req, res, next) {
    res.send('Server is going down NOW!');
    server.kill(function () {
        // the server is down when this is called. That won't take long.
    });
});

server = app.listen(8080);
killable(server);
Yet another nodejs package to perform a shutdown killing connections: http-shutdown, which seems reasonably maintained at the time of writing (Sept. 2016) and worked for me on NodeJS 6.x
From the documentation
Usage
There are currently two ways to use this library. The first is explicit wrapping of the Server object:
// Create the http server
var server = require('http').createServer(function(req, res) {
    res.end('Good job!');
});

// Wrap the server object with additional functionality.
// This should be done immediately after server construction, or before you start listening.
// Additional functionality needs to be added for http server events to properly shutdown.
server = require('http-shutdown')(server);

// Listen on a port and start taking requests.
server.listen(3000);

// Sometime later... shutdown the server.
server.shutdown(function() {
    console.log('Everything is cleanly shutdown.');
});
The second is implicitly adding prototype functionality to the Server object:
// .extend adds a .withShutdown prototype method to the Server object
require('http-shutdown').extend();

var server = require('http').createServer(function(req, res) {
    res.end('Good job!');
}).withShutdown(); // <-- Easy to chain. Returns the Server object

// Sometime later, shutdown the server.
server.shutdown(function() {
    console.log('Everything is cleanly shutdown.');
});
My best guess would be to kill the connections manually (i.e. to forcibly close their sockets).
Ideally, this should be done by digging into the server's internals and closing its sockets by hand. Alternatively, one could run a shell command that does the same (provided the server has the proper privileges, etc.).
I have answered a variation of "how to terminate an HTTP server" many times on different node.js support channels. Unfortunately, I couldn't recommend any of the existing libraries because they are lacking in one way or another. I have since put together a package that (I believe) handles all the cases expected of graceful HTTP server termination.
https://github.com/gajus/http-terminator
The main benefit of http-terminator is that:
it does not monkey-patch Node.js API
it immediately destroys all sockets without an attached HTTP request
it allows graceful timeout to sockets with ongoing HTTP requests
it properly handles HTTPS connections
it informs connections using keep-alive that the server is shutting down by setting a connection: close header
it does not terminate the Node.js process
Usage:
import http from 'http';
import {
    createHttpTerminator,
} from 'http-terminator';

const server = http.createServer();

const httpTerminator = createHttpTerminator({
    server,
});

await httpTerminator.terminate();
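In practice, terminate() is typically wired into a shutdown signal rather than called immediately; a minimal sketch using the terminator created above (the signal choice and logging are illustrative):
// Hypothetical wiring: terminate the server when the process is asked to stop.
process.on('SIGTERM', async () => {
    await httpTerminator.terminate();
    console.log('HTTP server terminated; remaining cleanup can proceed');
});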
const Koa = require('koa')
const app = new Koa()

let keepAlive = true

app.use(async (ctx) => {
    let url = ctx.request.url
    // destroy socket
    if (keepAlive === false) {
        ctx.response.set('Connection', 'close')
    }
    switch (url) {
        case '/restart':
            ctx.body = 'success'
            process.send('restart')
            break;
        default:
            ctx.body = 'world-----' + Date.now()
    }
})

const server = app.listen(9011)

process.on('message', (data, sendHandle) => {
    if (data == 'stop') {
        keepAlive = false
        server.close();
    }
})

// process.exit(code); // code 0 for success and 1 for fail, called once the server has closed

Nodeunit Execution Order?

I am trying to test my web server using nodeunit:
test.js
exports.basic = testCase({
    setUp: function (callback) {
        this.ws = new WrappedServer();
        this.ws.run(PORT);
        callback();
    },
    tearDown: function (callback) {
        delete this.ws;
        callback();
    },
    testFoo: function(test) {
        var socket = ioClient.connect(URL);
        console.log('before client emit');
        socket.emit('INIT', 1, 1);
        console.log('after client emit');
    }
});
and this is my very simple nodejs server:
WrappedServer.prototype.run = function(port) {
    this.server = io.listen(port, {'log level': 2});
    this.attachCallbacks();
};

WrappedServer.prototype.attachCallbacks = function() {
    var ws = this;
    ws.server.sockets.on('connection', function(socket) {
        ws.attachDebugToSocket(socket);

        console.log('socket before attaching INIT');
        socket.on('INIT', function(userId, roomId) {
            // do something here
        });
        console.log('socket finished attaching INIT');
    });
};
Basically I am getting this error:
[...cts/lolol/nodejs/testing](testingServer)$ nodeunit ws.js
info - socket.io started
before client emit
after client emit
info - handshake authorized 1013616781193777373
The "sys" module is now called "util". It should have a similar interface.
socket before attaching INIT
socket finished attaching INIT
info - transport end
Somehow, the socket emits INIT BEFORE the server attaches callbacks for sockets.
Why is this happening? In addition, what's the right way to do this?
I'm assuming you were expecting the order to be this?
socket before attaching INIT
socket finished attaching INIT
before client emit
after client emit
From the small amount of code given, the issue is probably two things.
First, and probably the main issue, is that your ioClient.connect will not connect immediately. You need to wait for it to actually connect (via a callback or event), then emit INIT, and then execute the test's callback function.
Second, you should probably do the same thing with your run command. listen will not start listening immediately, so you're occasionally going to get inconsistent results if it hasn't started listening by the time your test executes. You should also pass the setUp callback to io.listen.
Update
To be clear for listen, just like most things in node, the socketio server's listen method is asynchronous. Calling the method tells it to start listening, but there is some time in the background where the server sets up the networking stuff to start listening. Just like node's core listen, http://nodejs.org/docs/latest/api/net.html#server.listen, socket.io's version takes a callback argument that is called once the server is up and listening.
io.listen(port, {'log level': 2}, callback);
Unless socket.io starts giving you errors about failing to connect, this probably is not an issue, but it is something to keep in mind. Treating asynchronous actions as if they were instantaneous is an easy way to make bugs that only come up occasionally. Since your run wraps listen, I think in general, not just for testing, passing a callback to run would be a very good idea.
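A sketch of how the test and server wrapper might change accordingly (the socket.io-client 'connect' event name and the use of test.done() are assumptions about the asker's setup):
// Forward a callback from run() into io.listen so setUp only finishes once the
// server is actually listening.
WrappedServer.prototype.run = function(port, callback) {
    this.server = io.listen(port, {'log level': 2}, callback);
    this.attachCallbacks();
};

exports.basic = testCase({
    setUp: function (callback) {
        this.ws = new WrappedServer();
        this.ws.run(PORT, callback); // setUp completes when the server is listening
    },
    tearDown: function (callback) {
        delete this.ws;
        callback();
    },
    testFoo: function (test) {
        var socket = ioClient.connect(URL);
        socket.on('connect', function () { // only emit once actually connected
            socket.emit('INIT', 1, 1);
            test.done();
        });
    }
});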
