How can Node.js maintain multiple concurrent connections? [closed] - node.js

I have been reading a lot about Node.js, but the following points are still not clear to me:
With the TCP protocol, the client and server agree on one port and can then maintain a connection. The server knows the IP address of the client and can therefore send messages back. With Node.js, multiple clients can connect to the same server on the same port. How is this possible? How can multiple connections be established on the same port by the same server?
If the client is behind NAT, its IP address can be dynamic, so how can the Node.js server send data to the client?
What is the resource utilization of maintaining persistent connections on the server and the client?
What happens when the Node.js server crashes? How can the client initiate the connection again?
If there is a network problem on the client side and it terminates and re-initiates the connection every 5 minutes, is there a way this scenario can be handled?

With the TCP protocol, the client and server agree on one port and can then maintain a connection. The server knows the IP address of the client and can therefore send messages back. With Node.js, multiple clients can connect to the same server on the same port. How is this possible? How can multiple connections be established on the same port by the same server?
There is no limit to the number of connections that can be maintained on a single port (although, in practice, there may be operating system or hardware limits). Of course, only a single process can listen on a port, but that has nothing to do with connections. This is not Node-specific; all TCP servers work like that.
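To make that concrete, here is a minimal sketch with Node's built-in net module (the port number is arbitrary): a single process listens on one port, and every accepted socket is a separate connection identified by the client's address and port.

// minimal sketch: one listening port, many concurrent connections (port is arbitrary)
const net = require('net');

let connections = 0;

const server = net.createServer((socket) => {
  // each accepted socket is a distinct (client IP, client port) pair
  const peer = `${socket.remoteAddress}:${socket.remotePort}`;
  connections++;
  console.log(`+ ${peer} (open connections: ${connections})`);

  socket.on('close', () => {
    connections--;
    console.log(`- ${peer} (open connections: ${connections})`);
  });
});

// one process listens on port 8124, but it can hold many simultaneous sockets
server.listen(8124);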
If the client is behind NAT, its IP address can be dynamic, so how can the Node.js server send data to the client?
You're essentially asking how NAT works. Again, there is nothing Node-specific here. The NAT device simply rewrites packet headers as necessary and maintains translation tables for routing, just as it does for any connection. Because the client initiated the connection, the NAT device already has a mapping for it, and the server's replies flow back through that mapping.
What is the resource utilization of maintaining persistent connections on the server and the client?
The overhead is really quite minimal for just the connection itself. A little bit of memory, but almost insignificant in the big picture. If you're storing additional associated data with each connection, that may be different. Node.js handles a large number of concurrent connections very well, but if you're concerned, you can always search for benchmark tests, or write your own.
What happens when the Node.js server crashes? How can the client initiate the connection again?
Sockets emit both close and error events. Simply listen for them, and attempt to reconnect afterwards, probably with a back-off delay.
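For example, a minimal reconnect sketch with the built-in net module (the host, port, and delay values are placeholders):

// minimal sketch: reconnect with exponential back-off (host/port are placeholders)
const net = require('net');

let delay = 1000;            // start at 1 second
const MAX_DELAY = 60 * 1000; // cap at 1 minute

function connect() {
  const socket = net.connect({ host: 'example.com', port: 8124 });

  socket.on('connect', () => {
    delay = 1000; // reset the back-off once we are connected again
    console.log('connected');
  });

  socket.on('error', (err) => {
    console.error('socket error:', err.message);
    // an 'error' is followed by 'close', so reconnection is handled there
  });

  socket.on('close', () => {
    console.log(`disconnected, retrying in ${delay} ms`);
    setTimeout(connect, delay);
    delay = Math.min(delay * 2, MAX_DELAY);
  });
}

connect();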
If there is a network problem on the client side and it terminates and re-initiates the connection every 5 minutes, is there a way this scenario can be handled?
Not entirely sure what you're asking here. The client simply needs to reconnect, as described in the previous question/answer. If you're associating certain data with a client socket and you want a grace period in which the client has a chance to reconnect before that data is freed, then you'll need to set a timeout that frees those resources after a certain amount of time. Then, in the connection listener, or perhaps on an authentication event, you'll want to check whether the new connection matches a recently disconnected client.
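A rough sketch of that grace-period idea (the session map, the clientId handshake, and the 30-second window are assumptions made purely for illustration):

// rough sketch: keep per-client state around for a grace period after a disconnect
const sessions = new Map();        // clientId -> { data, socket, cleanupTimer }
const GRACE_PERIOD_MS = 30 * 1000; // assumed grace period

// call this once the client has identified itself (e.g. after an auth event)
function onClientIdentified(socket, clientId) {
  let session = sessions.get(clientId);
  if (session) {
    // the client came back in time: cancel the pending cleanup and reuse its state
    clearTimeout(session.cleanupTimer);
    session.cleanupTimer = null;
  } else {
    session = { data: {}, socket: null, cleanupTimer: null };
    sessions.set(clientId, session);
  }
  session.socket = socket;

  socket.on('close', () => {
    // only start the grace period if a newer socket hasn't already replaced this one
    if (session.socket !== socket) return;
    session.cleanupTimer = setTimeout(() => sessions.delete(clientId), GRACE_PERIOD_MS);
  });
}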
I would definitely recommend looking at socket.io, especially if your use case is web-based, although it can be used for more than just browser/server connections. It will do a lot of the things you seem to be concerned about (reconnection, resource association, disconnect grace period, etc) more or less automatically.

The TCP connection is kept continuously open. The saving comes from the client not having to continuously poll the server (long polling) to check whether there are any new messages, as in the case of AJAX. Creating a new connection every couple of seconds so the client can refresh from the server is heavy on the server, proxies and routers. With Node.js the connection is kept open, but it is not active until the client or server has something to send. There is a good article on this here: http://www.html5rocks.com/en/tutorials/websockets/basics/
Imagine having 1000 chat clients, each of which asks the server every 3 seconds whether any new messages have arrived. That results in 1000 requests (and as many responses) every 3 seconds, i.e. around 20,000 requests per minute on the server. With Node.js, the server sends a message to a client only when there is actually something to send, while all 1000 connections stay idle, but open, in the meantime.
TCP connections are all initiated to the same server port, such as port 80, and the server keeps using that port for every connection; the individual connections are told apart by the client's IP address and ephemeral source port (the four-tuple of source/destination address and port). So you would still keep the connections continuously open, but you no longer have to send polling messages all the time like before.
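A rough sketch of that chat scenario using the ws package (one common WebSocket library for Node; the port and message shape are arbitrary): connections sit idle until the server actually has something to push.

// rough sketch: idle persistent connections, the server only writes when there is news
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

// no polling: every open connection just waits here until broadcast() is called
function broadcast(message) {
  for (const client of wss.clients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(JSON.stringify(message));
    }
  }
}

// e.g. push an incoming chat message to every connected client as it arrives
wss.on('connection', (socket) => {
  socket.on('message', (raw) => broadcast({ text: raw.toString() }));
});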

Related

Multiple Socket.io app processes cause each client socket to connect and disconnect repeatedly

I am working on a Node.js app with Socket.io. I did a test in a single process using PM2 and there were no errors. Then I moved to our production environment (we use a Google Cloud Compute instance).
I run 3 app processes and an iOS client connects to the server.
However, the iOS client doesn't keep the socket connection. It doesn't send a disconnect to the server, but it gets disconnected and reconnects to the server, and this happens continuously.
I am not sure why the server disconnects the client.
If you have any hint or answer for this, I would appreciate it.
That's probably because requests end up on a different process or machine than the one they originated from.
Straight from Socket.io Docs: Using Multiple Nodes:
If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.
What you need to do:
Enable session affinity, a.k.a. sticky sessions.
If you want to work with rooms/namespaces, you also need to use a centralised memory store to keep track of namespace information, such as Redis with the Redis adapter.
But I'd advise you to read the documentation piece I posted; things might have changed a bit since the last time I implemented something like this.
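For reference, the adapter part looks roughly like this on the server (this uses the socket.io-redis package as it existed around socket.io 2.x; newer releases use @socket.io/redis-adapter, and the host/port values here are placeholders). Sticky sessions themselves are configured on the load balancer, not in this code.

// outline: socket.io with the Redis adapter so multiple processes share rooms/namespaces
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');

// every server process points at the same Redis instance (host/port are placeholders)
io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

io.on('connection', (socket) => {
  socket.join('some-room');
  // emits to a room now reach clients connected to other processes as well
  io.to('some-room').emit('hello');
});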
By default, the socket.io client "tests out" the connection to its server with a couple of HTTP requests. If you have multiple server processes and those initial HTTP requests don't go to the exact same server each time, then the socket.io connection will never get established properly, will not switch over to webSocket, and will keep attempting to use HTTP polling.
There are two ways to fix this.
You can configure your clients to just assume the webSocket protocol will work. This will initiate the connection with one and only one http connection which will then be immediately upgraded to the webSocket protocol (with socket.io running on top of that). In socket.io, this is a transport option specified with the initial connection.
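With the socket.io client, that option looks roughly like this (option names have varied across versions, so check the docs for the release you are using):

// client side: skip the HTTP long-polling probe and connect with WebSocket directly
const socket = io('https://example.com', {
  transports: ['websocket']
});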
You can configure your server infrastructure to be sticky so that a request from a given client always goes back to the exact same server. There are lots of ways to do this depending upon your server architecture and how the load balancing is done between your servers.
If your servers keep any client state local to the server (and not in a shared database that all servers access), then you will need even a dropped-and-reconnected connection to go back to the same server, and sticky connections will be your only solution. You can read more about sticky sessions on the socket.io website here.
Thanks for your replies.
I finally figured out the issue. It was caused by the TTL of the backend service in the Google Cloud Load Balancer. The default TTL was 30 seconds, which forced each socket connection to disconnect and reconnect.
So I updated the value to 3600s and then I could keep the connection.

Load balancing sockets on a horizontally scaling WebSocket server?

Every few months, when thinking through a personal project that involves sockets, I find myself asking: "How would you properly load balance sockets on a dynamically, horizontally scaling WebSocket server?"
I understand the theory behind horizontally scaling the WebSockets and using pub/sub models to get data to the right server that holds the socket connection for a specific user. I think I understand ways to effectively identify the server with the fewest current socket connections that I would want to route a new socket connection to. What I don't understand is how to effectively route new socket connections to the server you've picked with a low socket count.
I don't imagine this answer would be tied to a specific server implementation, but rather could be applied to most servers. I could easily see myself implementing this with Vert.x, Node.js, or even Perfect.
First off, you need to define the bounds of the problem you're asking about. If you're truly talking about dynamic horizontal scaling where you spin up and down servers based on total load, then that's an even more involved problem than just figuring out where to route the latest incoming new socket connection.
To solve that problem, you have to have a way of "moving" a socket from one host to another so you can clear connections from a host that you want to spin down (I'm assuming here that true dynamic scaling goes both up and down). The usual way I've seen that done is by engaging a cooperating client where you tell the client to reconnect and when it reconnects it is load balanced onto a different server so you can clear off the one you wanted to spin down. If your client has auto-reconnect logic already (like socket.io does), you can just have the server close the connection and the client will automatically re-connect.
As for load balancing the incoming client connections, you have to decide what load metric you want to use. Ultimately, you need a score for each server process that tells you how "busy" you think it is so you can put new connections on the least busy server. A rudimentary score would just be number of current connections. If you have large numbers of connections per server process (tens of thousands) and there's no particular reason in your app that some might be lots more busy than others, then the law of large numbers probably averages out the load so you could get away with just how many connections each server has. If the use of connections is not that fair or even, then you may have to also factor in some sort of time moving average of the CPU load along with the total number of connections.
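As a toy illustration of such a score (the weights and sampling interval are entirely arbitrary and would need tuning for a real workload):

// toy sketch: a per-process "busyness" score from connection count plus a CPU moving average
const os = require('os');

let connectionCount = 0; // incremented/decremented by your connection handlers
let cpuAverage = 0;      // smoothed, normalised 1-minute load average

// sample the load average every few seconds and smooth it
setInterval(() => {
  const load = os.loadavg()[0] / os.cpus().length;
  cpuAverage = 0.8 * cpuAverage + 0.2 * load;
}, 5000).unref();

// the weighting of CPU vs. connection count is an arbitrary assumption
function loadScore() {
  return connectionCount + cpuAverage * 10000;
}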
If you're going to load balance across multiple physical servers, then you will need a load balancer or proxy service that everyone connects to initially, and that proxy can look at the metrics for all currently running servers in the pool and assign the connection to the one with the lowest current score. That can either be done with a proxy scheme or (more scalable) via a redirect so the proxy gets out of the way after the initial assignment.
You could then also have a process that regularly examines your load score (however you decided to calculate it) on all the servers in the cluster and decides when to spin a new server up or when to spin one down or when things are too far out of balance on a given server and that server needs to be told to kick several connections off, forcing them to rebalance.
What I don't understand is how to effectively route new socket connections to the server you've picked with a low socket count.
As described above, you either use a proxy scheme or a redirect scheme. At a slightly higher cost at connection time, I favor the redirect scheme because it's more scalable when running and creates fewer points of failure for an existing connection. All clients connect to your incoming connection gateway server which is responsible for knowing the current load score for each of the servers in the farm and based on that, it assigns an incoming connection to the host with the lowest score and this new connection is then redirected to reconnect to one of the specific servers in your farm.
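A toy sketch of that gateway (how the scores get reported, the host names, and the /assign endpoint are all assumptions; a real deployment would also secure and cache this):

// toy sketch: an assignment endpoint that tells clients which WebSocket host to use
const http = require('http');

// in reality each server in the farm would report its score to the gateway periodically
const scores = new Map([
  ['ws1.example.com', 120],
  ['ws2.example.com', 45],
  ['ws3.example.com', 300],
]);

function leastLoadedHost() {
  let bestHost = null;
  let bestScore = Infinity;
  for (const [host, score] of scores) {
    if (score < bestScore) {
      bestScore = score;
      bestHost = host;
    }
  }
  return bestHost;
}

// a client GETs /assign first, then opens its WebSocket connection to the host it was given
http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ host: leastLoadedHost() }));
}).listen(8080);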
I have also seen load balancing done purely by a custom DNS implementation. Client requests IP address for farm.somedomain.com and that custom DNS server gives them the IP address of the host it wants them assigned to. Each client that looks up the IP address for farm.somedomain.com may get a different IP address. You spin hosts up or down by adding or removing them from the custom DNS server and it is that custom DNS server that has to contain the logic for knowing the load balancing logic and the current load scores of all the running hosts.
Route the websocket requests to a load balancer that makes the decision about where to send the connections.
As an example, HAProxy has a leastconn method for long connections that picks the least recently used server with the lowest connection count.
The HAProxy backend server weightings can also be modified by external inputs; @jfriend00 detailed the technicalities of weighting in their answer.
I found this project that might be useful:
https://github.com/apundir/wsbalancer
A snippet from the description:
Websocket balancer is a stateful reverse proxy for websockets. It distributes incoming websockets across multiple available backends. In addition to load balancing, the balancer also takes care of transparently switching from one backend to another in case of mid-session abnormal failure.
During this failover, the remote client connection is retained as-is, so the remote client does not even see the failover. Every attempt is made to ensure that no messages are dropped during the failover.
Regarding your question: the new connection will be routed by the load balancer, if it is configured to do so.
As @Matt mentioned, for example with HAProxy using the leastconn option.

Websocket (node.js) connection limit, clients are getting disconnected after reaching 400-450 connections

I have a big problem with the socket.io connection limit. If the number of connected clients (browsers) is more than 400-450, users start getting disconnected. I increased the soft and hard TCP limits, but it didn't help.
The problem only occurs with browsers. When I tried to connect using the socket.io-client module from another Node.js server, I reached 5000 connected clients.
It's a very big problem for me and has totally blocked me. Please help.
Update
I have tried the standard WebSocket library (the ws module for Node.js) and the problem was similar. I could reach only 456 connected clients.
Update 2
I divided the connected clients between a few server instances, with each group of clients connecting on a different port. Unfortunately this change didn't help; the total number of connected users was the same as before.
Solved (2018)
There were not enough open ports allowed for the Linux user that runs the pm2 manager (the "pm2" or "pm" username).
You may be hitting a limit in your operating system. There are security limits on the number of concurrently open files; take a look at this thread.
https://github.com/socketio/socket.io/issues/1393
Update:
I wanted to expand this answer because I was answering from mobile before. Each new connection that gets established is going to open a new file descriptor under your node process. Of course, each connection is going to use some portion of RAM. You would most likely run into the FD limit first before running out of RAM (but that depends on your server).
Check your FD limits: https://rtcamp.com/tutorials/linux/increase-open-files-limit/
And lastly, I suspect your single-client concurrency test was not using the correct flags to force new connections. If you want to test concurrent connections from one client, you need to set a flag on the client connection:
var socket = io.connect('http://localhost:3000', {'force new connection': true});

How to make the safest Websocket Authentication

I have seen this question answered a few times, but I have a very specific problem with it.
I am currently making a game where an HTML5 program talks to a C++ program on the server side. The game also includes matches with valuable prizes, and therefore low latency between the client and the server, as well as security, are high priorities.
And that leads to my question: Is it safe enough to authenticate a websocket session (TLS encrypted) a single time when it is started, or should I send the SESSIONID with every message sent from the client to the server?
This question is very opinion-based, and does not really fit the nature of Stack Overflow questions.
Here is my opinion:
The WebSocket protocol is implemented on top of TCP, a connection-based transport protocol. That means a connection is established and then persists until it is closed by the client or the server. Interception in between is very unlikely.
After the TCP connection is established, the WebSocket client sends HTTP headers, just like any other HTTP request would, but does not close the connection; it waits for a response from the server which, if everything is fine, is a header approving the protocol upgrade to WebSocket communication. From then on, the WebSocket is valid for communication on both the client and server side. Since the TCP connection is persistent, sending the session with every request is pointless; it was already sent once when the connection was established.
So no, it is not a good idea to send session details with every message; it is simply pointless. You are better off making sure that restoring your session is a secure process, and that merely obtaining a client's cookies will not allow someone to connect as another user.
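To illustrate the "authenticate once at the handshake" approach, here is a sketch using the ws package over TLS (the verifySession() helper, the query-string token, and the certificate paths are placeholders for whatever session scheme you actually use):

// sketch: authenticate once, during the WebSocket upgrade, over TLS
const https = require('https');
const fs = require('fs');
const WebSocket = require('ws');

// placeholder: swap in your real session/token validation
function verifySession(token) {
  return token === 'expected-session-id';
}

const server = https.createServer({
  key: fs.readFileSync('key.pem'),   // placeholder paths
  cert: fs.readFileSync('cert.pem'),
});

const wss = new WebSocket.Server({ noServer: true });

// reject the upgrade entirely if the session cannot be validated
server.on('upgrade', (req, socket, head) => {
  const token = new URL(req.url, 'https://localhost').searchParams.get('session');
  if (!verifySession(token)) {
    socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
    socket.destroy();
    return;
  }
  wss.handleUpgrade(req, socket, head, (ws) => wss.emit('connection', ws, req));
});

wss.on('connection', (ws) => {
  // from here on the connection is trusted; no need to resend the session id per message
  ws.on('message', (msg) => ws.send(`echo: ${msg}`));
});

server.listen(8443);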

TCP Servers: Drop Connection, instead of resetting or responding?

Is it possible in Node.JS to "drop" a connection in such a way that
The client never receives a response (200, 404 or otherwise)
The client is never notified that the connection is terminated (never receives connection reset or end of stream)
The server's resources are released (the server should not attempt to maintain the connection in any way)
I am specifically asking about Node.JS HTTP servers (which are really just complex TCP servers) on Solaris, but if there are cases on other OSes (Windows, Linux) or programming languages (C/C++, Java) that permit this, I am interested.
Why do I want this?
To annoy or slow down (possibly single-threaded) robots such as phpMyAdmin Probe.
I know this is not really something that matters, but these types of questions can better help me learn the boundaries of my programs.
I am aware that the client host is likely to re-transmit the packets of the connection since I am never sending reset.
These are not possible in a generic TCP stack (vs. your own custom TCP stack). The reasons are:
Closing a socket notifies the peer: it normally sends a FIN (or an RST if unread data is pending), so the client learns that the connection was terminated.
Even if the client somehow missed that, it would continue to think the connection is open while the server has closed it. As soon as the client sends any packet on this connection, the server's TCP stack is going to respond with an RST.
You may want to explore firewalling these robots and block / rate limit their IP addresses with something like iptables (linux) or the equivalent on solaris.
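Within Node.js itself, the closest approximation only satisfies the first two points, not the third, since the process still holds the socket. Here is a sketch of a "tarpit" that simply never answers suspicious requests (the path check is just an example):

// sketch: never answer suspicious probes, so a (single-threaded) robot just sits and waits
const http = require('http');

const server = http.createServer((req, res) => {
  if (req.url.startsWith('/phpmyadmin')) {
    // send nothing and never end the response; the client eventually times out on its own
    // (the OS and Node still hold the connection, so server resources are NOT released;
    //  a true silent drop needs firewall rules, as noted above)
    req.socket.setTimeout(0); // stop Node from closing the idle socket itself
    return;
  }
  res.end('ok');
});

// newer Node versions also enforce request/headers timeouts that may close such sockets
server.requestTimeout = 0;
server.headersTimeout = 0;

server.listen(8080);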
Closing a connection normally should NOT send an RST; there is a (four-way) FIN/ACK teardown process.
