What is the initial retransmission timeout for TCP on Linux?

I am interested in Linux. Let's say a client is trying to connect to a server which is down. What is the initial retransmission timeout value? Does it back off exponentially?
Now consider the server side: the server receives the SYN packet and sends a SYN-ACK, but the client dies before completing the handshake. What is the retransmission policy in this scenario?

What is the initial retransmission timeout value?
It is both platform- and configuration-dependent. On modern Linux kernels the initial RTO is 1 second (per RFC 6298), and how long a connection attempt keeps retrying is governed by the net.ipv4.tcp_syn_retries and net.ipv4.tcp_synack_retries sysctls, for the client and server sides of the handshake respectively.
Does it exponentially back off?
All retransmissions in TCP do that: the timeout doubles after every unanswered retransmission.
What is the retransmission policy for this scenario?
See above: the server retransmits the SYN-ACK with exponential backoff until net.ipv4.tcp_synack_retries is exhausted, then drops the half-open connection.
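If you need to bound how long a single connect() attempt keeps retrying, Linux also exposes a per-socket knob. Here is a minimal sketch, assuming Linux and a made-up server address, using the TCP_SYNCNT socket option; the 1-second initial RTO and the doubling themselves are kernel behaviour and are not settable per socket:

    /* Sketch: limit SYN retransmissions for one connect() attempt on Linux.
     * TCP_SYNCNT is Linux-specific; the server address/port below are
     * placeholders used purely for illustration. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        /* With a 1 s initial RTO and exponential backoff (1 s, 2 s, 4 s, 8 s),
         * 3 SYN retransmits means connect() gives up after roughly 15 seconds. */
        int syn_retries = 3;
        if (setsockopt(fd, IPPROTO_TCP, TCP_SYNCNT,
                       &syn_retries, sizeof(syn_retries)) < 0)
            perror("setsockopt(TCP_SYNCNT)");

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);                      /* hypothetical port */
        inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr);  /* unreachable TEST-NET address */

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            perror("connect");   /* typically ETIMEDOUT when the server is down */

        close(fd);
        return 0;
    }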

TCP sockets connection becomes unreliable with SO_REUSEADDR

I have an app where a single client talks to a single server. Normally, the client does a single connect, and then calls send repeatedly, and there's no problem.
However, I need to do a version where the client sets up a connection for each individual send (a bit like HTTP with and without keep-alive). In this version, the client calls socket, connect, send once, and then close.
The problem with this is that I very quickly run out of ephemeral client ports, and the connect fails. To get around this I call setsockopt with SO_REUSEADDR, and then bind to port 0, before calling connect (see here, for example).
This works, except that the TCP connection is no longer reliable. I get occasional incorrect data, presumably because there's still data around when the TCP connection is closed.
Is there any way to make this reliable (and fast)? shutdown before close doesn't help. Maybe I can get select to tell me if the socket is ready for output, but that seems like overkill.
Do you have to use TCP? If so, you will probably have to maintain an open connection and route your messages over that one connection.
There is SCTP, which may be a good fit for your use case - a reliable datagram protocol:
Like TCP, SCTP provides reliable, connection-oriented data delivery with congestion control. Unlike TCP, SCTP also provides message boundary preservation, ordered and unordered message delivery, multi-streaming and multi-homing. Detection of data corruption, loss of data and duplication of data is achieved by using checksums and sequence numbers. A selective retransmission mechanism is applied to correct loss or corruption of data.
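As a rough illustration of what the SCTP route looks like, here is a sketch only; it assumes the kernel's SCTP module is available and that a server is listening on 127.0.0.1 port 9999, both of which are made-up details. The one-to-one socket style keeps the familiar connect/send API while preserving message boundaries:

    /* Sketch: one-to-one style SCTP client on Linux.
     * Assumes kernel SCTP support and a listening server at
     * 127.0.0.1:9999 -- both are assumptions for illustration. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        /* IPPROTO_SCTP selects SCTP; SOCK_STREAM gives the TCP-like
         * one-to-one association style. */
        int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9999);
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }

        /* Unlike TCP, each send() here is one SCTP message: the receiver
         * gets it with its boundaries intact, so a single long-lived
         * association can carry many independent messages. */
        const char *msg = "hello";
        if (send(fd, msg, strlen(msg), 0) < 0)
            perror("send");

        close(fd);
        return 0;
    }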

Which is the better way to implement heartbeat on the client side for websockets?

On the server side for websockets there is already a ping/pong implementation, where the server sends a ping and the client replies with a pong, to let the server know whether a client is connected or not. But there isn't anything implemented in reverse to let the client know whether the server is still connected to it.
There are two ways to go about this that I have read:
1. Every client sends a message to the server every x seconds; whenever sending throws an error, the server is down, so reconnect.
2. The server sends a message to every client every x seconds; the client receives it and updates a variable, and a thread on the client side checks every x seconds whether that variable has changed. If it hasn't changed in a while, the client hasn't heard from the server and can assume the server is down, so it re-establishes the connection.
Either method lets the client figure out whether the server is still online. With the first, the clients send traffic to the server; with the second, the server sends traffic out to the clients. Both seem easy enough to implement, but I'm not sure which is the better way in terms of being more efficient/cost-effective.
Server upload speeds are higher than client upload speeds, but server CPUs are an expensive resource while client CPUs are relatively cheap. Offloading logic onto the client is a more cost-effective approach...
Having said that, servers must implement this specific logic (actually, all ping/timeout logic), otherwise they might be left with "half-open" sockets that drain resources but aren't connected to any client.
Remember that sockets (file descriptors) are a limited resource. Not only do they use memory even when no traffic is present, but they prevent new clients from connecting when the resource is maxed out.
Hence, servers must clear out dead sockets, either using timeouts or by implementing ping.
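The timeout half of that can be as simple as recording when each socket was last heard from and sweeping the table periodically. Here is a language-agnostic sketch in C; the connection table, the 30-second limit and the sweep call are all illustrative assumptions, and a real websocket server would normally lean on its framework's ping support instead:

    /* Sketch of server-side dead-socket sweeping by timeout.
     * The connection table, 30 s limit and sweep interval are
     * illustrative assumptions, not part of any websocket framework. */
    #include <time.h>
    #include <unistd.h>

    #define MAX_CONNS  1024
    #define IDLE_LIMIT 30   /* seconds without any traffic (or pong) */

    struct conn {
        int    fd;          /* -1 when the slot is free */
        time_t last_seen;   /* updated on every read or pong */
    };

    static struct conn conns[MAX_CONNS];

    /* Call this every few seconds from the server's event loop. */
    static void sweep_dead_connections(void)
    {
        time_t now = time(NULL);
        for (int i = 0; i < MAX_CONNS; i++) {
            if (conns[i].fd >= 0 && now - conns[i].last_seen > IDLE_LIMIT) {
                close(conns[i].fd);   /* reclaim the file descriptor */
                conns[i].fd = -1;
            }
        }
    }

    int main(void)
    {
        for (int i = 0; i < MAX_CONNS; i++)
            conns[i].fd = -1;         /* start with an empty table */
        /* ... accept() clients, update last_seen on every read or pong ... */
        sweep_dead_connections();
        return 0;
    }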
P.S.
I'm not a node.js expert, but this type of logic should be implemented using the WebSocket protocol ping rather than by your application. You should probably look into your node.js server / websocket framework and check how to enable pinging.
You should set pings to accommodate your specific environment. For example, if you host on Heroku, Heroku enforces a timeout of ~55 seconds, so your pings should be sent before that timeout occurs.

Read the contents of a TCP socket's send-q in Linux

I have a TCP client sending data to a server continuously. After the client successfully connects to the server, it sends data continuously at intervals of a few seconds.
When the link between the client and server is disconnected after a few sends, I understand that TCP retransmits the data according to the value of tcp_retries2. I configured this value to 8, so that I get a write error after about 100 seconds.
But there will still be some unacknowledged packets in the send-q.
Is there a way to read the content of these unacknowledged packets in the send-q from my program before closing the socket, or should I remember the sent data and resend it after connecting again? Is there any other way to implement this?
You can get the size of the send queue with an ioctl:
SIOCOUTQ
Returns the amount of unsent data in the socket send queue.
The socket must not be in LISTEN state, otherwise an error
(EINVAL) is returned. SIOCOUTQ is defined in
<linux/sockios.h>. Alternatively, you can use the synonymous
TIOCOUTQ, defined in <sys/ioctl.h>.
Note that the send-q only tells you how much data the remote kernel has not yet acknowledged; even when it drains, that only means the remote kernel accepted the data, not that the application running on that host handled it. Most failures occur in the network between the communicating parties, but this metric can't be used as definite proof of successful transmission.
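To be clear, the ioctl gives you the size of the queue, not its contents. A minimal sketch (bytes_pending is a hypothetical helper name; it assumes fd is a connected Linux TCP socket, and the bare socket in main exists only so the example compiles and runs):

    /* Sketch: query the pending byte count in a Linux TCP socket's send queue.
     * SIOCOUTQ is Linux-specific. */
    #include <linux/sockios.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Returns the number of bytes still sitting in the socket's send queue
     * (written by the application but not yet ACKed by the peer), or -1 on
     * error. Only meaningful on a connected TCP socket. */
    static int bytes_pending(int fd)
    {
        int pending = 0;
        if (ioctl(fd, SIOCOUTQ, &pending) < 0) {
            perror("ioctl(SIOCOUTQ)");
            return -1;
        }
        return pending;
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);  /* stand-in for a real, connected socket */
        printf("pending: %d bytes\n", bytes_pending(fd));
        close(fd);
        return 0;
    }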
Once the application has handed its data to TCP, it is TCP's responsibility to keep track of the acknowledgement of the packets. If ACKs are not forthcoming, it tries its best to get the packets delivered based on the RTO algorithm. Until an ACK is received, the data is kept in the send queue. I do not think the application has any way to examine the contents of that queue.
// should I remember the sent data and resend it after connecting again //
How would you do this? The previous connection's state is gone, isn't it? Unless the client and server applications maintain some shared understanding of what was received and sent before the disconnection, you have to start fresh with the new connection.
No, there isn't.
If you need to know that the peer application has received the data, you need to have the peer application acknowledge it back to your application via your application protocol, and treat any unacknowledged data as needing re-sending from your application somehow. This also brings in the question of transactional idempotence, so that you can resend with impunity.
It takes two to tango. When you close your end of the connection, the connection still has to wait for the other end to close too; think of the 3-way handshake in reverse.
How long do you wait between closing the connection and re-opening it? You must wait at least the TIME_WAIT interval before trying to reconnect using the same connection info.

Identifying remote disconnection in socket client

How do I find out from a socket client program that the remote connection is down (e.g. the server is down)? When I do a recv and the server is down, it blocks if I do not set a timeout. However, in my case I cannot use a reliable timeout value to get around this, because the recv would also time out when the server is up but the response simply takes longer than the timeout I have set.
Unfortunately, ZeroMQ just passes this on to the next layer. So the protocol you are implementing on top of ZeroMQ will have to handle this.
Heartbeats are recommended. Basically, just have one side send a message if the connection is otherwise idle. The other side can treat the absence of such messages as a failure condition and close the connection.
You may wish to modify your higher-level protocols to be more robust. For example, you can submit a command, query its status, and allow the other side to forget about the command. That way, if the connection is lost, you can reconnect and query any outstanding commands. Any commands it doesn't have, you know didn't get through, and you can resubmit them. Once you get a reply with the result of a command, you can tell the other side that it can now forget the response.
This allows you to keep the connection active while a long-running command is ongoing. Every so often you ask, "is everything okay". The other side responds, "yes". You can use long polling where the other side delays responding for a second or so while the command is in process. This allows it to return the results immediately rather than having to wait a second for your next query.
The specifics depend on your exact requirements, but you must design this correctly into your protocol.
If the remote host goes down without sending you a TCP FIN packet, then you have no way to detect that. You can test that behaviour by firewalling a port after a connection has been established on that port: your program will "hang" forever.
However, the Linux kernel supports a mechanism called TCP keepalive, which is meant to detect and close a dead TCP connection after a given timeout. If you can't specify any timeout for your application, then keepalive won't help you reliably either. Your last option might be to use features of the application protocol (can you name it?); if that protocol has no provisions for connection handling, you may have to invent something of your own on top of it.
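If you can enable it, a sketch of turning on TCP keepalive on Linux looks like this; the 30 s / 5 s / 3-probe values are made-up examples, and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific options:

    /* Sketch: enable TCP keepalive on a Linux socket so a dead peer is
     * detected even while the connection is idle. The timing values are
     * illustrative; tune them for your environment. */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static int enable_keepalive(int fd)
    {
        int on = 1;
        int idle = 30;      /* seconds of idle time before the first probe */
        int interval = 5;   /* seconds between probes */
        int count = 3;      /* unanswered probes before the kernel drops the connection */

        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0 ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0 ||
            setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0) {
            perror("setsockopt");
            return -1;
        }
        return 0;   /* a blocked recv() then fails with ETIMEDOUT once the peer is gone */
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }
        enable_keepalive(fd);
        /* ... connect() and recv() as usual ... */
        close(fd);
        return 0;
    }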

SYN Denial Of Service attack

This may be a trivial question. It is regarding SYN cookies.
Why are only half-open connections considered a DoS attack?
It is possible for a client to complete the handshake (SYN, SYN-ACK, ACK) and never send anything after that. That also consumes system resources.
So if a client floods the server with complete (SYN, SYN-ACK, ACK) sequences, why is that not considered a DoS attack?
A SYN flood attack, which is what you are describing, is a specific form of Denial of Service attack. A DoS can take many forms, often unrelated to SYN requests.
The reason a SYN flood attack is effective is that you can forge the client IP address. This allows a single client to send a very large number of SYN requests, but since the SYN-ACK will never be received, the ACK is never sent, and the server is left waiting for the response, tying up the available connections on the server. A client that sends both the SYN and the final ACK cannot hide behind a forged address in the same way, so it does not tie up the server's available connections as cheaply. A large number of useless complete (SYN, SYN-ACK, ACK) handshakes would still be a DoS attack, just not such an effective one.
In a SYN flood the client does not have to track state or complete connections. The client produces a number of SYN packets with spoofed IP, which can be done very fast. The server (when not using SYN cookies) consumes resources waiting for each connection attempt to time out or complete. As a result, this is a very effective DoS with leverage of resources consumed by the (DoS-ing) client vs. attacked server in the range of 1:10000.
If the client completes each connection, then the leverage disappears: the server no longer has to wait and the client has to start tracking state. Thus we have 1:1 instead of 1:10000. There is a secondary issue here: buffers for half-open connections are (or were when this attack was first invented) small compared to those for established connections and are (were) exhausted more easily.
SYN cookies allow the server to forget about the connection immediately after replying to the SYN packet, until it receives an ACK with the correct sequence number. Here again the resource use becomes 1:1.
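On Linux, whether SYN cookies are in use is a kernel setting rather than anything per application. A small sketch that simply reads the current value of /proc/sys/net/ipv4/tcp_syncookies (normally you would just run sysctl; this only shows where the knob lives):

    /* Sketch: check whether SYN cookies are enabled on this Linux host.
     * Equivalent to `sysctl net.ipv4.tcp_syncookies`; shown in C only to
     * illustrate where the setting lives. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/net/ipv4/tcp_syncookies", "r");
        if (!f) { perror("fopen"); return 1; }

        int enabled = 0;
        if (fscanf(f, "%d", &enabled) == 1)
            printf("net.ipv4.tcp_syncookies = %d\n", enabled);  /* non-zero means enabled */

        fclose(f);
        return 0;
    }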
Yes, Sockstress is an old attack, from 2008, that completes TCP handshakes and then lowers the Window size to zero (or some other small value), tying up connections at layer 4, somewhat similar to the SlowLoris layer 7 attack.
