Is it possible to keep websockets connected on Node.js server restart? - node.js

Using any technique, is it possible to keep the websocket connected when restarting Node.js server?

No, not directly. By design, an OS cleans up any resources owned by a process when that process shuts down. This is how the OS prevents resource leaks over time as processes start up and shut down. So, when your server process shuts down, any sockets it still has open will get shut down by the OS.
The usual design solution for this is to code clients that automatically reconnect if they lose their webSocket connection when the client didn't intend to lose the connection. This type of auto-reconnect behavior is built into socket.io (a layer that sits on top of webSocket) for this exact reason.
If you insert a proxy in front of your server, configured so that clients connect to the proxy and the proxy then connects through to your server, then it might be possible to teach the proxy to auto-reconnect to the server in a way that the clients would not know anything had happened (as long as they didn't try to send messages while the server was down). Of course, then you have the same restart issue with the proxy.
In some operating systems, it is possible to transfer ownership of TCP sockets from one process on the same host to another process. So, I could conceive of a scheme (which I have not tried) where you could fire up a temporary process, transfer all the webSocket sockets from your server to this temporary process, then restart your server, then after the new server instance comes up, transfer the sockets back and then kill the temporary process.
Since there are multiple other reasons why a webSocket might be unintentionally disconnected, I think the client-side reconnect is a solution that covers both the server restart and many other potential things that could happen in your system and for which code has already been written.
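As a rough illustration of the client-side reconnect idea, here is a minimal sketch. The keepConnected helper, its connect factory and its schedule option are hypothetical names invented for this example. It assumes the socket is an EventEmitter-style object (such as a net.Socket) that emits 'close', and it only reconnects when the close was not requested by the client itself.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper: keeps a connection alive by reconnecting whenever the
// socket closes without close() having been called on the returned wrapper.
// `connect` is a caller-supplied factory that returns an EventEmitter-style
// socket emitting 'close'; `schedule` defaults to setTimeout.
function keepConnected(connect, { delayMs = 1000, schedule = setTimeout } = {}) {
  let socket = null;
  let closedByUs = false;
  let attempts = 0;

  function open() {
    attempts += 1;
    socket = connect();
    // Swallow 'error' so it does not crash the process; 'close' drives the retry.
    socket.on('error', () => {});
    socket.once('close', () => {
      if (!closedByUs) schedule(open, delayMs);
    });
  }

  open();
  return {
    get attempts() { return attempts; },
    current: () => socket,
    close() {
      closedByUs = true; // an intentional close must not trigger a reconnect
      if (typeof socket.destroy === 'function') socket.destroy();
    },
  };
}
```

With a real TCP client this could be called as keepConnected(() => net.connect(3000, 'localhost')), where the port and host are placeholders. socket.io already does something similar internally, so this is only needed when working with raw sockets.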

Related

Horizontal scaling with a node.js app & socket io

My team and I are working on a digital signage platform.
We have ~ 2000 Raspberry Pi around the world connected to a Nodejs server using Socket IO. The Raspberries are initiating the connection.
We would like to be able to scale our application horizontally across multiple servers, but we have a problem that we can’t figure out.
Basically, the application stores the sockets of the connected Raspberry in an array.
We have an external program that calls the API on the server; the server then searches for the sockets that are "impacted" by the API call and sends them the information.
After a lot of searching, we assume that we have to store the sockets (or their IDs) elsewhere (Redis?) to make the application stateless. Then any server can respond to an API call and look up the sockets in a central place.
Unfortunately, we can’t find any detailed example on how to do that.
Can you please help us ?
Thanks
(You can't store sockets from multiple server instances in a shared datastore like redis: they only make sense in the context of the server where they were initiated).
You will need a cluster of node.js servers to handle this. There are various ways to make a cluster. They all involve directing incoming connections from your RPis to a "generic" hostname, for example server.example.com. Behind that server.example.com hostname will be multiple node.js servers.
Each incoming connection from each RPi connects to just one of those multiple servers. (You know this, I believe.) This means one node.js server in your cluster "owns" each individual RPi.
(Telling you how to rig up a cluster of node.js servers is beyond the scope of this answer. Hints: round-robin DNS or a reverse-proxy nginx front end.)
Then, you want to route -- to fan out -- the incoming data from each API call to each server in the cluster, so the server can route it to the RPis it owns.
Here's a good way to handle that:
Set up a redis cache or other shared data store. It can be very small.
When each node.js server starts, have it register itself as active. That is, have it place its own specific address for handling API calls into the shared data store. The specific address is probably of the form 12.34.56.78:3000: that is, an IP address and port.
Have each server update that address every so often, once a minute or so, to show it is still alive.
When an API call arrives at server.example.com, it will come to a more-or-less randomly chosen node.js server instance.
Get that server to read the list of server addresses from the redis cache.
Get that server to repeat the API call to all servers except itself. Add a parameter like repeated=yes to the repeated API calls.
Then, each server looks at its list of connected sockets and does what your application requires.
On server shutdown, have the server unregister itself -- remove its address from redis -- if possible.
In other words, build a way of fanning out the API calls to all active node.js servers in your cluster.
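The registration and fan-out steps above can be sketched roughly as follows. This is not a definitive implementation: an in-memory Map stands in for the Redis cache, the names createRegistry and fanOutTargets are made up for this example, and the two-minute expiry is an assumption chosen to tolerate one missed once-a-minute heartbeat.

```javascript
// Stand-in for the shared data store. A real deployment would keep this in
// Redis (or similar) so that every node.js server sees the same list.
function createRegistry(ttlMs = 120000) { // assumed expiry: two missed heartbeats
  const lastSeen = new Map(); // server address -> time of last heartbeat (ms)
  return {
    // Called on startup and then periodically to show the server is alive.
    heartbeat(address, now = Date.now()) { lastSeen.set(address, now); },
    // Called on clean shutdown.
    unregister(address) { lastSeen.delete(address); },
    // Addresses whose last heartbeat is recent enough to count as alive.
    active(now = Date.now()) {
      return [...lastSeen]
        .filter(([, seenAt]) => now - seenAt <= ttlMs)
        .map(([address]) => address);
    },
  };
}

// Decide which servers an incoming API call must be repeated to.
// A call that already carries repeated=yes is never forwarded again.
function fanOutTargets(registry, selfAddress, isRepeated, now = Date.now()) {
  if (isRepeated) return [];
  return registry.active(now).filter((address) => address !== selfAddress);
}
```

The server that receives the original API call would then send the same request, with repeated=yes added, to each address returned by fanOutTargets, and every server (including the first one) acts on the sockets it owns.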
If this must scale up to a very large number of node.js servers (more than a hundred or so), or to many hundreds of API calls a minute, you probably should investigate using message queuing software.
SECURE YOUR REDIS server from random cybercreeps on the internet.

Multiple Socket.io app processes cause each client socket to connect and disconnect repeatedly

I am working on a nodejs app with Socket.io. I tested it in a single process using PM2 and there were no errors. Then I moved to our production environment (we use a Google Cloud Compute instance).
I run 3 app processes and an iOS client connects to the server.
However, the iOS client doesn't keep the socket connection. It doesn't send a disconnect to the server, yet it gets disconnected and reconnects to the server. This happens continuously.
I am not sure why the server disconnects the client.
If you have any hint or answer for this, I would appreciate it.
That's probably because requests end up on a different machine rather than the one they originated from.
Straight from Socket.io Docs: Using Multiple Nodes:
If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.
What you need to do:
Enable session affinity, a.k.a sticky sessions.
If you want to work with rooms/namespaces you also need to use a centralised memory store to keep track of namespace information, such as the Redis/Redis Adapter.
But I'd advise you to read the documentation piece I posted, things might have changed a bit since the last time I've implemented something like this.
By default, the socket.io client "tests" out the connection to its server with a couple of http requests. If you have multiple servers and those initial http requests don't go to the exact same server each time, then the socket.io connection will never get established properly and will not switch over to webSocket, and it will keep attempting to use http polling.
There are two ways to fix this.
You can configure your clients to just assume the webSocket protocol will work. This will initiate the connection with one and only one http connection which will then be immediately upgraded to the webSocket protocol (with socket.io running on top of that). In socket.io, this is a transport option specified with the initial connection.
You can configure your server infrastructure to be sticky so that a request from a given client always goes back to the exact same server. There are lots of ways to do this depending upon your server architecture and how the load balancing is done between your servers.
If your servers are keeping any client state local to the server (and not in a shared database that all servers access), then you will need even a dropped connection and reconnect to go back to the same server, and sticky connections will be your only solution. You can read more about sticky sessions on the socket.io website.
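One common way to get sticky behaviour is to derive the target server from the client's IP address. The sketch below illustrates that idea only; pickBackend and the backend addresses are hypothetical, and a real setup would normally rely on the load balancer's own affinity feature rather than hand-written code.

```javascript
const crypto = require('node:crypto');

// Map a client IP to one backend so that repeated requests from the same
// address always reach the same server (IP-hash affinity).
function pickBackend(clientIp, backends) {
  const digest = crypto.createHash('sha1').update(clientIp).digest();
  // Use the first 4 bytes of the hash as an unsigned integer index.
  return backends[digest.readUInt32BE(0) % backends.length];
}
```

The trade-offs are worth keeping in mind: all clients behind one NAT address land on the same server, and adding or removing a backend changes the mapping for many clients.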
Thanks for your replies.
I finally figured out the issue. It was caused by the TTL of the backend service in the Google Cloud Load Balancer. The default TTL was 30 seconds, which made each socket connection disconnect and reconnect.
So I updated the value to 3600s, and after that the connection stayed open.

Node.js & Socket.io with High Availability

We have a node.js server that is primarily used with socket.io for browser inter connectivity in a web application.
We want to have a high availability solution which would theoretically consist of two node.js servers, one as a primary server and the other as a backup should the primary fail. The solution would allow that if or when the primary node.js server goes down the backup would take over to provide seamless functionality without interruption.
Is there a solution that allows socket.io to maintain the array of client connections over multiple servers without duplication of clients or of messages sent?
Is there another paradigm we should be considering for HA and node.js?
There is no way to have a webSocket auto fail-over without any interruption to a new server when the one it is currently connected to goes down. The webSockets that were connected to the server that went down will die. That's just how TCP sockets work.
Fortunately with socket.io, the client will quickly realize that the connection has been lost (within seconds) and the clients will try to reconnect fairly quickly. If your backup server is immediately in place (e.g. hot standby) to handle the incoming socket.io connections, then the reconnect will be fairly seamless from the client point of view. It will appear to just be a momentary network interruption from the client's point of view.
On the server, however, you need to not only have a backup, but you have to be able to restore any state that was present for each connection. If the connections are just pipes for delivering notifications and are stateless, then this is fairly easy since your backup server that receives the reconnects will immediately be in business.
If your socket.io connections are stateful on the server-side, then you will need a way to restore/access that state when the backup server takes over. One way of doing this is by keeping the state in a redis server that is separate from your web server (though you will then need a backup/high availability plan for the redis server too).
Socket.IO on the primary and backup servers can be connected to a Redis server. This keeps the sessions from the primary server available to the backup server. The clients must connect again to the new server when the primary fails.
Socket.IO-Redis
HAProxy is used for load balancing between multiple node.js instances. Whether to use HAProxy depends on how you are going to deal with failure of the primary server. If you have a method to switch the primary server automatically, then HAProxy will not be of much use; otherwise you can configure HAProxy to forward requests to the backup server if the primary server is unreachable.
Other options similar to HA-Proxy are:
node-http-proxy
Nginx

How nodejs can maintain multiple concurrent connections? [closed]

I have been reading a lot about nodejs but am still not clear about the following:
With the TCP protocol, the client and server agree on one port and can then maintain a connection. The server knows the IP address of the client and hence can send back messages. With nodejs, multiple clients can connect to the same nodejs server on the same port. How is this possible? How can multiple connections be established on the same port by the same server?
If the client is behind NAT, its IP can be dynamic, so how can the nodejs server send data to the client?
What is the resource utilization of maintaining persistent connections on the server and client?
What happens when the nodejs server crashes? How can the client initiate the connection again?
If there is a network problem on the client side and it terminates and re-initiates the connection every 5 minutes, is there a way to handle this scenario?
With the TCP protocol, the client and server agree on one port and can then maintain a connection. The server knows the IP address of the client and hence can send back messages. With nodejs, multiple clients can connect to the same nodejs server on the same port. How is this possible? How can multiple connections be established on the same port by the same server?
There is no limitation to the number of connections which can be maintained on a single port (although, in practice, there may be operating system or hardware limitations). Of course, only a single process can listen on a port, but that has nothing to do with connections. That's not node-specific, all TCP servers work like that.
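To see this in practice, the following sketch opens several client connections to a single listening port and records how the server sees each one. The observeConnections function is a name made up for this example, and it uses only Node's built-in net module on the loopback interface.

```javascript
const net = require('node:net');

// Open `count` client connections to one listening port and report the
// (server port, client port) pair the server observes for each of them.
function observeConnections(count) {
  return new Promise((resolve, reject) => {
    const seen = [];
    const server = net.createServer((socket) => {
      socket.on('error', reject);
      seen.push({ serverPort: socket.localPort, clientPort: socket.remotePort });
      socket.end(); // nothing to send in this sketch
      // Stop listening once every client has arrived; the callback runs
      // after all accepted connections have closed.
      if (seen.length === count) server.close(() => resolve(seen));
    });
    server.once('error', reject);
    // Port 0 asks the OS for any free port.
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      for (let i = 0; i < count; i += 1) {
        net.connect(port, '127.0.0.1').once('error', reject);
      }
    });
  });
}
```

Every entry shares the same serverPort, while each clientPort differs: the server tells connections apart by the client's address and port, not by using a separate server port per client.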
If the client is behind NAT, its IP can be dynamic, so how can the nodejs server send data to the client?
You're essentially asking how NAT works. Again, there's nothing node-specific in this case. The NAT server simply alters packet headers as necessary, and maintains translation tables for routing, just like with any connection.
What is the resource utilization of maintaining persistent connections on the server and client?
The overhead is really quite minimal for just the connection itself. A little bit of memory, but almost insignificant in the big picture. If you're storing additional associated data with each connection, that may be different. Node.js handles a large number of concurrent connections very well, but if you're concerned, you can always search for benchmark tests, or write your own.
What happens when the nodejs server crashes? How can the client initiate the connection again?
Sockets emit both close and error events. Simply listen for them, and attempt to reconnect afterwards, probably with a back-off delay.
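A minimal sketch of that pattern with Node's built-in net module follows. The function names, the 500 ms base delay and the 30 s cap are assumptions picked for illustration, and the sketch retries forever, which a real client would probably want to limit.

```javascript
const net = require('node:net');

// Exponential back-off: 500 ms, 1 s, 2 s, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Connect and, whenever the socket closes, try again after a growing delay.
// `onConnect` is called with each successfully connected socket.
function connectWithBackoff(port, host, onConnect, attempt = 0) {
  let failures = attempt;
  const socket = net.connect(port, host);
  socket.on('connect', () => {
    failures = 0; // a successful connection resets the back-off
    onConnect(socket);
  });
  // 'error' is always followed by 'close', so the retry lives in 'close'.
  socket.on('error', () => {});
  socket.on('close', () => {
    setTimeout(
      () => connectWithBackoff(port, host, onConnect, failures + 1),
      backoffDelay(failures),
    );
  });
  return socket;
}
```

The retry is attached to 'close' rather than 'error' because a net.Socket emits 'close' after an error as well as after a normal shutdown, so one handler covers both cases.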
If there is a network problem on the client side and it terminates and re-initiates the connection every 5 minutes, is there a way to handle this scenario?
Not entirely sure what you're asking here. The client simply needs to reconnect as stated in the previous question/answer. If you're associating certain data with a client socket, and you want a grace period where the client has a chance to reconnect before that data is freed, then you'll need to set a timeout which will free those resources after a certain amount of time. Then, in the connection listener, or perhaps on an authentication event, you'll want to analyse the new connection to see if it matches a recently disconnected client.
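The grace-period idea could look roughly like the sketch below. It assumes each client presents some stable identifier when it connects (for example after authentication); createSessionStore, its method names and the 30 second default are all hypothetical.

```javascript
// Keeps per-client data alive for `graceMs` after a disconnect so that a
// client which reconnects in time gets its previous data back.
function createSessionStore(graceMs = 30000) {
  const sessions = new Map(); // clientId -> { data, timer }
  return {
    // Call from the connection (or authentication) handler.
    connect(clientId, freshData) {
      const existing = sessions.get(clientId);
      if (existing) {
        clearTimeout(existing.timer); // reconnected within the grace period
        existing.timer = null;
        return existing.data;
      }
      sessions.set(clientId, { data: freshData, timer: null });
      return freshData;
    },
    // Call from the socket's 'close' handler.
    disconnect(clientId) {
      const session = sessions.get(clientId);
      if (!session) return;
      clearTimeout(session.timer);
      session.timer = setTimeout(() => sessions.delete(clientId), graceMs);
      session.timer.unref(); // do not keep the process alive just for cleanup
    },
    has: (clientId) => sessions.has(clientId),
  };
}
```

A client that comes back before the timer fires receives the same data object it had before; after the timer fires it is treated as a new client.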
I would definitely recommend looking at socket.io, especially if your use case is web-based, although it can be used for more than just browser/server connections. It will do a lot of the things you seem to be concerned about (reconnection, resource association, disconnect grace period, etc) more or less automatically.
The TCP connection is kept continuously open. The saving comes from the client not having to continuously poll the server to check whether there are any new messages, as it would with AJAX polling. Creating a new connection every couple of seconds so that the client can refresh from the server is heavy on the server, proxies and routers. In the case of Node.js, the connection is kept open, but it is not active until the client or server has something to send. I found a good article here: http://www.html5rocks.com/en/tutorials/websockets/basics/
Imagine having 1000 chat clients, each asking the server every 3 seconds whether any new messages have arrived. That results in 20,000 requests and responses per minute on the server (1000 clients times 20 polls per minute). In the case of Node.js, the server will send a message to a client only when there is a message to send, while all 1000 connections will be idle in the meantime but will be kept open.
TCP connections to a server are always initiated on the same listening port, such as port 80, and the server keeps using that port for every connection. The connections are told apart by the client's IP address and the ephemeral port assigned on the client side when each connection is opened. So you would still need to keep connections continuously open, but you will not have to send polling messages continuously like you needed before.

how to efficiently transfer file between 2 node.js instances?

I'm developing a chat application using app.js, which is a webkit+node.js framework.
So I have node.js plus a bridged web browser environment on both sides.
I want to make a file transfer feature somewhat similar to the Skype one.
So, the initial idea is to:
1. Connect clients to the main server.
2. Each client gets the IP of the opposite one.
3. Start a socket or websocket server on both clients and connect them to each other.
4. The sender reads the file and transmits it to the receiver.
Questions are:
1. I'm not really sure that one client can "see" the other.
2. A file is binary data, but websockets are made for text messages, so I need some kind of encoding/decoding. I thought about base64, but it has 30% of "overhead" information. So I need something more efficient (base 128?).
3. If it is not efficient to use websockets, should I use TCP sockets instead? What problems can appear if I decide to use them?
Yeah, I know about node2node and BinaryJS, I just don't know whether I should use them or not. And I really want to do something myself.
OK, with your communication looking like this:
(C->N)<->N<->(N->C)
(...) is installed on one client's machine. N's are node servers, C's are web clients.
This is out of your control. Some file sharing apps send test packets from the central server to clients, to check whether ports are open and NAT rules are configured correctly, etc. Your clients will start their own servers on some port, your master server can potentially create a test connection to these servers to see whether they're started correctly and open to the web, BEFORE telling other clients that they can send files.
Websockets are great for status messages from your servers to the web GUIs and for general client-to-client communication. For the actual file transfers, I would use TCP sockets; see the next point. On the other hand, base64 encoding is really not a slow process: play with it and benchmark its performance, then decide with some data to back up your decision.
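As a starting point for that kind of measurement, the sketch below reports how much larger a payload becomes after base64 encoding. The base64Size helper is a name invented for this example. Note also that the WebSocket protocol (RFC 6455) defines binary frames in addition to text frames, so base64 is only needed when a text-only channel is being used.

```javascript
// Compare the size of a raw payload with its base64-encoded form.
function base64Size(byteLength) {
  const raw = Buffer.alloc(byteLength, 0xab); // dummy binary payload
  const encoded = raw.toString('base64');
  return { rawBytes: raw.length, encodedChars: encoded.length };
}
```

Base64 emits 4 characters for every 3 input bytes, so the encoded form is about one third larger than the original, slightly more than the 30% mentioned in the question.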
You could use a combination: websockets from your servers to the web GUIs, but TCP communication between the servers themselves. TCP servers (and streams) aren't hard to set up in Node, I see no disadvantages. It might actually be less complicated than installing node2node on those servers, since TCP is already built-in.
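A TCP transfer between two Node processes can be sketched with the built-in net, fs and stream modules, as below. The function names startReceiver and sendFile are made up for this example, the sketch handles exactly one file per receiver, and it sends raw bytes with no file name, size, or integrity check, all of which a real application would need to add.

```javascript
const net = require('node:net');
const fs = require('node:fs');
const { pipeline } = require('node:stream');

// Receiver: listen for one connection and stream its bytes into destPath.
// Resolves with the chosen port and a `done` promise for the transfer.
function startReceiver(destPath, host = '127.0.0.1', port = 0) {
  let finish;
  let fail;
  const done = new Promise((resolve, reject) => { finish = resolve; fail = reject; });
  const server = net.createServer((socket) => {
    server.close(); // one transfer per receiver in this sketch
    pipeline(socket, fs.createWriteStream(destPath), (err) => (err ? fail(err) : finish()));
  });
  return new Promise((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, host, () => resolve({ port: server.address().port, done }));
  });
}

// Sender: stream the file at srcPath to the receiver and close the socket.
function sendFile(srcPath, port, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const socket = net.connect(port, host);
    pipeline(fs.createReadStream(srcPath), socket, (err) => (err ? reject(err) : resolve()));
  });
}
```

Because both sides use streams, the file is never loaded into memory as a whole and no text encoding such as base64 is involved.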
