We have a client-server application in use on our SUSE Linux server.
Sometimes the TCP socket on the client side somehow goes away, while
on the server side the other end of the connection remains open.
At the end of the day, when we stop the backend on the Linux server, the backend tries to close all remaining TCP connections, including those "zombie" sockets
(I watch this with strace).
When the backend tries to close a TCP connection whose client side no longer exists, it sends a [FIN, ACK] packet to the target. And of course, nothing comes back.
The backend keeps retransmitting this packet. At first it waits only a few hundredths of a second before repeating it, then longer and longer, until towards the end it waits whole seconds between retries. After 15 seconds it times out and moves on to close the next
TCP connection.
Now, I do not know where this 15-second timeout comes from, and I would like to change it.
Thank you very much in advance.
I think you may have two problems.
You should detect the client disconnecting and close the server's end of the socket so you free those resources ASAP. You may set a timeout yourself in the application layer for connections with no activity. Read this.
If you cannot handle those "zombie" sockets in the application layer, you may change the timeout in the OS. Read this.
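If the goal is just to avoid waiting out the kernel's FIN retransmissions for peers that have vanished, one option is to abort such connections instead of closing them gracefully. Here is a minimal sketch in Python, assuming Linux semantics for SO_LINGER (on Linux, the retransmission behaviour for orphaned sockets is also tunable via the net.ipv4.tcp_orphan_retries sysctl, which is my best guess for where the 15 seconds come from):

import socket
import struct

# Sketch: SO_LINGER with l_onoff=1 and l_linger=0 makes close() send an RST
# immediately instead of retransmitting a FIN to an unreachable peer.
# Caution: this aborts the connection and discards any unsent data.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack('ii', 1, 0))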
In Chrome, Socket IO seems to stop transmitting data. Is there an internal reason for this?
I've tried a very simple client and a simple server, but consistently the server stops receiving any emits after 5 minutes; it then reconnects and is fine for another 5 minutes.
On top of the internal ping mechanism I have a polling mechanism which sends back session data every 20 seconds.
I don't use WebSocket with NodeJS or Socket.io, but I experienced the same behaviour with Jetty. It turns out that Jetty has an idle timeout that defaults to 5 minutes (300 seconds) for all WebSocket sessions. You could change the default idle timeout to an appropriate value, or ping/pong those connections before they time out.
In my situation, I decided to use ping/pong, as it also helps determine when the connection is no longer there. I observed that in some cases the connection was not closed even when the network was down.
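As a rough illustration of the ping/pong approach, here is a sketch using Python's websocket-client package (the URL and intervals are placeholders, not values from this question):

import websocket

# Sketch: let the client library send protocol-level pings on a fixed
# interval so an idle timeout never sees a silent connection.
ws = websocket.WebSocketApp("ws://example.com/socket")  # hypothetical endpoint
ws.run_forever(ping_interval=30, ping_timeout=10)  # ping every 30 s, fail after 10 s of silence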
According to the engine.io docs (engine.io is used by socket.io), the server has a default pingInterval of 25 seconds. So unless you inadvertently disabled or changed the default options, the ping/pong mechanism should be in place.
How do I find out from a socket client program that the remote connection is down (e.g. the server is down)? When I do a recv and the server is down, it blocks if I do not set any timeout. However, in my case I cannot pick a reliable timeout value to get around this, because otherwise the recv times out even when the server is up but the response simply takes longer than the timeout I have set.
Unfortunately, ZeroMQ just passes this on to the next layer. So the protocol you are implementing on top of ZeroMQ will have to handle this.
Heartbeats are recommended. Basically, just have one side send a message if the connection is otherwise idle. The other side can treat the absence of such messages as a failure condition and close the connection.
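A minimal sketch of such a heartbeat, assuming pyzmq and an invented endpoint and message format (the protocol on top of ZeroMQ is yours to define, so everything here is illustrative):

import time
import zmq

HEARTBEAT_INTERVAL = 1.0  # send a heartbeat after this much idle time (illustrative)
LIVENESS_TIMEOUT = 5.0    # declare the peer dead after this much silence (illustrative)

ctx = zmq.Context()
sock = ctx.socket(zmq.DEALER)
sock.connect("tcp://localhost:5555")  # hypothetical endpoint

last_sent = last_heard = time.monotonic()
while True:
    if sock.poll(timeout=100):  # any incoming frame counts as liveness
        sock.recv_multipart()
        last_heard = time.monotonic()
    now = time.monotonic()
    if now - last_sent >= HEARTBEAT_INTERVAL:
        sock.send_multipart([b"HEARTBEAT"])  # invented message type
        last_sent = now
    if now - last_heard >= LIVENESS_TIMEOUT:
        sock.close(linger=0)  # treat prolonged silence as a dead connection
        break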
You may wish to modify your higher-level protocols to be more robust. For example, you can submit a command, query its status, and allow the other side to forget about the command. That way, if the connection is lost, you can reconnect and query any outstanding commands. Any commands it doesn't know about didn't get through, and you can resubmit them. Once you get a reply with the result of a command, you can tell the other side that it can now forget the response.
This allows you to keep the connection active while a long-running command is ongoing. Every so often you ask, "is everything okay?", and the other side responds, "yes". You can use long polling, where the other side delays its response for a second or so while the command is in progress. This allows it to return the result immediately rather than having to wait up to a second for your next query.
The specifics depend on your exact requirements, but you must design this correctly into your protocol.
If the remote host goes down without sending you a TCP FIN packet, then you have no way to detect that from a blocked recv alone. You can test this behaviour by firewalling a port after a connection has been established on it. Your program will "hang" forever.
However, the Linux kernel supports a mechanism called TCP keepalives, which is meant to close a TCP connection after a given idle timeout. If you can't set socket options from your application, then there is no reliable way to use it. A last resort might be to use features of the application protocol (can you name it?); if that protocol has no features for connection handling, you may have to invent something of your own on top of it.
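A minimal sketch of enabling TCP keepalives from the application, assuming Python on Linux (the three interval values are illustrative, not recommendations):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)     # turn keepalives on
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before the first probe (Linux-specific)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before the kernel drops the connection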
I created a game using node.js and socket.io. All works well, but from time to time the game's socket server stops responding to any connections. When I go to Process information -> Files and connections (in webmin), I see there are many connections with CLOSE_WAIT and FIN_WAIT2 statuses. I think the problem is these connections, because the game fails when there are about 1,000 of them. The server OS is Ubuntu Linux 12.04.
How can I kill these connections or increase the maximum number allowed?
To add to Jim's answer, I think there is a problem in your client's handling of socket closure. It seems your client is not closing its sockets properly (for both server-initiated and client-initiated closes), and that is why your server has so many connections stuck in wait states.
You don't need to kill connections or increase the number allowed. You need to fix a defect in the application on one side of the connection, specifically, the side which does not initiate the close.
See Figure 13 of RFC 793. Your programs are at step 3 of the close sequence. The side which you see in FIN-WAIT-2 is behaving correctly. It has initiated the close and the TCP stack has sent a FIN packet on the network. The side in CLOSE-WAIT has the defect. The TCP stack on that side has received and acknowledged the FIN packet, but the application has failed to notice. How the application is expected to detect that the remote side has closed the connection will depend on your platform. Unfortunately, I am old, and don't know node.js or socket.io.
What happens in C is that the socket appears readable, but a read() returns zero bytes. When the application sees this, it is expected to call close(). You will find something equivalent in the docs for node.js or socket.io.
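For illustration, here is the same idea sketched in Python rather than C (the echo loop is invented; the empty read is the point):

import socket

def handle(conn: socket.socket) -> None:
    while True:
        data = conn.recv(4096)
        if not data:        # zero-length read: the peer closed its end
            conn.close()    # closing promptly keeps the socket out of CLOSE_WAIT
            return
        conn.sendall(data)  # invented echo behaviour, just to give the loop a body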
When you find it, consider answering your own question here and accepting the answer.
Linux has the SO_REUSEADDR socket option. It allows immediate reuse of the same port. Someone who knows your toolset can tell you how to set socket options; you may already know how. I do not know this toolset.
From an older Java docset:
http://docs.oracle.com/javase/1.5.0/docs/guide/net/socketOpt.html
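As a sketch in Python, which is one toolset among many (the address and port are placeholders):

import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow rebinding while old connections linger
srv.bind(("0.0.0.0", 8080))  # hypothetical address and port
srv.listen()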
I have a web application where the user needs to be constantly connected. By default, socket.io will disconnect the connection after 60 seconds. I have 'reconnection' turned on, though, so it is essentially closing and reopening the connection every minute. This can cause issues with feeds/notifications to my connected clients. Would it be safe to set this timeout to, let's say, 10 minutes, or possibly higher? Is there a reason it is so low right now?
My guess is that you may be misinterpreting the 'close timeout' configuration. It does not cause the connection to be closed after 60 seconds. (Heartbeats would be pointless if clients constantly reconnected).
If a client disconnects, close timeout is the amount of time the server will wait before releasing resources associated with that connection. Essentially, this allows clients with intermittent connectivity issues to attempt to reconnect before the server has forgotten about them. Setting close timeout to ten minutes is probably a bad idea since it will tie up server resources.
If your clients are, in fact, disconnecting every 60 seconds, then, like samjm said, something else is wrong.
I don't believe your socket should disconnect after 60 seconds; I would investigate why that is actually happening. After handshaking correctly, the socket should heartbeat and stay open indefinitely (barring network issues out of your control) until either the client or the server closes the connection. That is definitely my experience.
The fact that your connection actually closes suggests that it may not be handshaking correctly, or that heartbeats are not being received.
You might have already figured this out, but your socket might be disconnecting after 60 seconds because you're not sending a heartbeat ("2::") back to the server.
Here's some Python code that works with the websocket-client module.
# on_message handles messages from the server
def on_message(ws, message):
    if message[:3] == '2::':  # socket.io heartbeat frame
        ws.send('2::')        # echo the heartbeat back so the server keeps the connection open
I am learning the TCP/IP stack and server-client connections. I wrote a simple client and server, and they were able to transfer data to each other without any issues. I am running the client and server on the same machine. When I closed the server with Ctrl+C, I found the kernel was sending an RST instead of a FIN (please refer to my question: Active closure of server sockets).
With a little more investigation, I realized one of my clients was in a read call while the corresponding server thread was in an infinite while loop doing nothing (some buggy, dumb coding on my part). When I removed that infinite while loop, I saw the expected behaviour: a FIN being sent in both directions.
So I am wondering why the TCP layer was sending an RST in the first case.
Eventually, you give up on waiting for the other end to accept the data. When a socket is closed while unread data is still sitting in its receive buffer, the kernel sends an RST rather than a FIN, because it knows that data can never be delivered to the application.
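A small sketch that should reproduce this behaviour on the loopback interface (Python; the port is hypothetical; watch the packets with tcpdump or the syscalls with strace, as in the question):

import socket
import time

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 9999))  # hypothetical port
srv.listen()

cli = socket.create_connection(("127.0.0.1", 9999))
conn, _ = srv.accept()

cli.sendall(b"hello")  # data the server never reads, like the infinite-loop thread
time.sleep(0.1)        # let the bytes land in the server's receive buffer
conn.close()           # unread data pending: the kernel answers with RST, not FIN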