Linux: socket close() vs shutdown()

On Linux, shutdown() takes a parameter, SHUT_RD, SHUT_WR or SHUT_RDWR, to shut down only part of the communication channel. But in terms of the TCP messages sent to the peer, how does it work?
In the TCP state machine, closing works as a four-way handshake:
(1) (2)
FIN---------->
<----------ACK
<----------FIN
ACK----------->
So what messages does it send when I do a shutdown(sock, SHUT_RD) or shutdown(sock, SHUT_WR)?

shutdown(sd, SHUT_WR) sends a FIN which the peer responds to with an ACK. Any further attempts to write to the socket will incur an error. However the peer can still continue to send data.
shutdown(sd, SHUT_RD) sends nothing on the network: it just conditions the local API to return end of stream for any subsequent reads on the socket. The behaviour when data arrives on a socket that has been shut down for reading is system-dependent: Unix will ACK it and throw it away; Linux will ACK it and buffer it, which will eventually stall the sender; Windows will issue an RST, which the sender sees as 'connection reset by peer'.
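The local effects described above can be observed with a short Python sketch. It assumes Linux and a loopback TCP connection; `tcp_pair` and `half_close_demo` are hypothetical helper names introduced here for illustration, and the exact exception type is what CPython raises for EPIPE.

```python
import socket


def tcp_pair():
    """Return two connected TCP sockets on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    a = socket.create_connection(listener.getsockname())
    b, _ = listener.accept()
    listener.close()
    return a, b


def half_close_demo():
    local, peer = tcp_pair()
    try:
        local.sendall(b"last words")
        local.shutdown(socket.SHUT_WR)  # FIN follows the queued data

        # The peer reads the data and then end of stream (empty bytes).
        peer_saw = b""
        while True:
            chunk = peer.recv(4096)
            if not chunk:
                break
            peer_saw += chunk

        # The peer-to-local direction is still open.
        peer.sendall(b"reply")
        reply = local.recv(4096)

        # Writing after SHUT_WR fails locally.
        try:
            local.send(b"x")
            write_error = None
        except OSError as exc:
            write_error = exc

        # SHUT_RD sends nothing; it only makes local reads return EOF.
        local.shutdown(socket.SHUT_RD)
        read_after_shut_rd = local.recv(4096)
    finally:
        local.close()
        peer.close()
    return {
        "peer_saw": peer_saw,
        "reply": reply,
        "write_error": write_error,
        "read_after_shut_rd": read_after_shut_rd,
    }


if __name__ == "__main__":
    print(half_close_demo())
```

The sketch does not exercise the system-dependent case where the peer keeps sending after the local SHUT_RD.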

The FIN packets don't have to be symmetric. Each end sends FIN when its local writer has closed the socket.

shutdown() on the write side sends a FIN, and close() closes the socket, so anything sent to the closed socket afterwards results in a RST packet.
I ran into an interesting problem where the sender did the following:
Packet A sent
Packet B sent
shutdown for write (FIN)
shutdown for read
close socket
but the receiver got the packets out of order:
Packet A
Packet containing FIN
Packet B
The ACK from the receiver then caused a connection reset.

Related

What happens with messages that are being sent over a Unix socket while the receiving end is not reading from the socket?

I am currently learning about IPC and Unix domain sockets. I was wondering what happens with messages that are being sent over a Unix socket while the receiving end is not reading from the socket?
Do they get sent regardless of whether someone is reading, or do they stay in some sort of queue waiting for a reader?
Based on my research (Linux), in the case of a datagram (message-oriented) Unix socket:
if the receiving end has not called bind() on the socket, the sender's sendto() will fail;
if the receiving end has called bind() on the socket but does not keep calling recvfrom(), the sender will enqueue a batch of messages up to some limit and then stall;
if the receiving end resumes calling recvfrom(), the sender will resume.
See also: sysctl net.unix.max_dgram_qlen (for queue size).
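The three cases above can be reproduced with a small sketch. It assumes Linux; `unix_dgram_demo` and the socket file name are hypothetical, and the sender is made non-blocking so that a full queue shows up as an exception instead of a stall. The number of messages queued before the sender would block depends on net.unix.max_dgram_qlen and the send buffer size, so no particular count is assumed.

```python
import os
import socket
import tempfile


def unix_dgram_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "receiver.sock")
        sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        sender.setblocking(False)  # a full queue raises instead of stalling

        # Case 1: nobody has bound the address yet, so sendto() fails.
        try:
            sender.sendto(b"hello", path)
            unbound_error = None
        except OSError as exc:
            unbound_error = exc

        # Case 2: the receiver binds but never reads; messages pile up
        # until the kernel refuses to queue more.
        receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        receiver.bind(path)
        queued = 0
        while queued < 100_000:  # safety cap for the sketch
            try:
                sender.sendto(b"msg %d" % queued, path)
            except BlockingIOError:
                break
            queued += 1

        # Case 3: once the receiver reads, the sender can make progress.
        first = receiver.recv(64)
        try:
            sender.sendto(b"after drain", path)
            resumed = True
        except BlockingIOError:
            resumed = False

        sender.close()
        receiver.close()
    return {
        "unbound_error": unbound_error,
        "queued": queued,
        "first": first,
        "resumed": resumed,
    }


if __name__ == "__main__":
    print(unix_dgram_demo())
```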

TCP send() is blocking

I am running an application that uses a blocking TCP socket. TCP send() is blocked, but netstat shows Send-Q and Recv-Q = 0.
Can someone suggest why send() would be blocked?
The two reasons I can think of would be:
The receiving program keeps the socket open but does not read the data. In this case, once the receiving socket buffer becomes full, the sender cannot send any more, its socket send buffer fills up and send() blocks.
The network connection between the sender and the receiver is completely blocked after you initially connect. This would typically result in the sender socket failing after a timeout; it would also usually not be repeatable.
Neither case exactly agrees with your netstat results, but from experience I'd say tcp send() does not block unless the socket send buffer is full.
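The first reason can be illustrated with a sketch in which the receiver accepts the connection but never reads. It assumes a loopback TCP connection; `fill_send_path` is a hypothetical name, and the socket is made non-blocking so that the point where a blocking send() would stall appears as BlockingIOError. The byte count reached depends on kernel buffer sizes and autotuning.

```python
import socket


def fill_send_path(chunk_size=64 * 1024, cap=256 * 1024 * 1024):
    """Send until the kernel refuses more data; return (bytes_sent, would_block)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()  # accepted, but recv() is never called
    listener.close()

    sender.setblocking(False)
    payload = b"\0" * chunk_size
    total = 0
    would_block = False
    try:
        while total < cap:  # safety cap for the sketch
            try:
                total += sender.send(payload)
            except BlockingIOError:
                # Receive buffer and send buffer are both full:
                # a blocking send() would wait here.
                would_block = True
                break
    finally:
        sender.close()
        receiver.close()
    return total, would_block


if __name__ == "__main__":
    sent, blocked = fill_send_path()
    print(f"queued {sent} bytes before send() would block: {blocked}")
```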

Linux - timeout on disconnected send

I'm developing a server application and have the following problem with sending data back to a client that suddenly terminates the connection.
When I call send on a blocking socket with a write timeout set via setsockopt(SO_SNDTIMEO) and the client disconnects during sending (i.e. a few bytes are sent, then the client properly terminates TCP, as can be seen in Wireshark), send still blocks until the send timeout elapses. The following call to send returns an error as expected.
I would expect TCP termination (FIN/ACK) to cause the blocking send to return immediately, not after the timeout.
Has anyone ever seen such behaviour? Is it normal?
No, the FIN sent from the client does not unblock the send() at the server. When the client calls close(), the FIN is sent to the server and the connection is closed in the direction from client to server. The direction from server to client stays open until the server calls close() and a FIN is sent to the client.
When the client sends FIN and the server sends ACK back, the connection on the server is in the CLOSE_WAIT state and the connection on the client is in the FIN_WAIT_2 state. The server may still send data and the client may still receive data until the server closes the connection.
A connection closed by the peer cannot be detected through send(). Only recv() detects it.
If your code must act immediately when the client closes the connection, then it must call poll() or select() and use non-blocking send() and recv() calls.
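A minimal sketch of that behaviour, assuming Linux and a loopback connection: the client half-closes with shutdown(SHUT_WR) so that only a FIN is involved, the server's send() still succeeds, and the FIN is noticed only through select() plus a recv() that returns empty bytes. `detect_peer_fin` and `tcp_pair` are hypothetical names.

```python
import select
import socket


def tcp_pair():
    """Return (server_side, client) connected over loopback TCP."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    listener.close()
    return server_side, client


def detect_peer_fin():
    server_side, client = tcp_pair()
    try:
        client.shutdown(socket.SHUT_WR)  # client sends FIN

        # send() does not report the FIN: the server-to-client
        # direction is still open, so the call succeeds.
        sent = server_side.send(b"still open")

        # The FIN makes the socket readable; recv() then returns b"".
        readable, _, _ = select.select([server_side], [], [], 2.0)
        saw_eof = bool(readable) and server_side.recv(4096) == b""

        # The client can still receive what the server sent.
        client_got = client.recv(4096)
    finally:
        server_side.close()
        client.close()
    return sent, saw_eof, client_got


if __name__ == "__main__":
    print(detect_peer_fin())
```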

read the content of send-q TCP socket in linux

I have a TCP client sending data to a server continuously. After the client connects successfully to the server, it sends data at intervals of a few seconds.
When the link between the client and server went down after some data had been sent, I learned that TCP retransmits the data according to the value of tcp_retries2. I configured this value to be 8, so that I get a write error after 100 seconds.
But there will be some unacknowledged packets in the send-q.
Is there a way to read the content of these unacknowledged packets in the send-q from my program before closing the socket, or should I remember the sent data and resend it after connecting again? Is there any other way to implement this?
You can get the size of sendq with an ioctl:
SIOCOUTQ
Returns the amount of unsent data in the socket send queue.
The socket must not be in LISTEN state, otherwise an error
(EINVAL) is returned. SIOCOUTQ is defined in
<linux/sockios.h>. Alternatively, you can use the synonymous
TIOCOUTQ, defined in <sys/ioctl.h>.
Note that sendq only tells you what the kernel of the remote system accepted; it does not guarantee that the application running on that host handled it. Most failures occur in the network between the communicating parties, but this metric can't be used as definite proof of successful transmission.
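For illustration, the same ioctl can be issued from Python. This is a sketch assuming Linux, where CPython exposes the request code as termios.TIOCOUTQ; `send_queue_size` and `outq_demo` are hypothetical names. On loopback the ACK comes back almost immediately, so the value read right after a send may already be 0.

```python
import fcntl
import socket
import struct
import termios


def send_queue_size(sock):
    """Return the byte count the kernel reports for the socket's send queue."""
    # The ioctl writes a C int into the buffer passed as the third argument.
    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def outq_demo(payload=b"x" * 1000):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    listener.close()
    try:
        client.sendall(payload)
        pending = send_queue_size(client)
    finally:
        client.close()
        server_side.close()
    return pending


if __name__ == "__main__":
    print("bytes still in the send queue:", outq_demo())
```

The call returns only a size; it gives no access to the queued bytes themselves.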
Once the application has given its data to TCP, it is the responsibility of TCP to keep track of the acknowledgement of the packets. If ACKs are not forthcoming, it tries its best to get the packet delivered based on RTO algorithm. Now until ACK is received, the data is kept in TCP_SEND_Q. I do not think there is any control from the application to determine current state of TCP_SEND_Q.
//should i remember the send data and resend it after connecting again//
How would you do this? The previous connection state is gone, isn't it? Unless the client and the server applications maintain some understanding of what was sent and received, you have to start fresh with the new connection.
No there isn't.
If you need to know that the peer application has received the data, you need to have the peer application acknowledge it back to your application via your application protocol, and treat any unacknowledged data as needing re-sending from your application somehow. This also brings in the question of transactional idempotence, so that you can resend with impunity.
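One way to picture an application-level acknowledgement with idempotent resends is the sketch below. It is not tied to any particular wire protocol: `Outbox`, `Inbox` and the integer message ids are hypothetical, and the network is simulated by direct method calls.

```python
class Outbox:
    """Sender side: keep every message until the peer application acks it."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # message id -> payload

    def enqueue(self, payload):
        msg_id = self._next_id
        self._next_id += 1
        self._pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        self._pending.pop(msg_id, None)  # a duplicate ack is harmless

    def unacked(self):
        """Messages to resend after reconnecting, in original order."""
        return sorted(self._pending.items())


class Inbox:
    """Receiver side: deliver each id once, so resends are idempotent."""

    def __init__(self):
        self._seen = set()
        self.delivered = []

    def receive(self, msg_id, payload):
        if msg_id not in self._seen:
            self._seen.add(msg_id)
            self.delivered.append(payload)
        return msg_id  # the application-level acknowledgement


def resend_demo():
    outbox, inbox = Outbox(), Inbox()
    first = outbox.enqueue(b"a")
    second = outbox.enqueue(b"b")
    outbox.enqueue(b"c")  # never leaves the host before the link drops

    outbox.ack(inbox.receive(*first))  # delivered and acknowledged
    inbox.receive(*second)             # delivered, but the ack is lost

    # After reconnecting, resend everything that was never acknowledged.
    for msg_id, payload in outbox.unacked():
        outbox.ack(inbox.receive(msg_id, payload))
    return inbox.delivered, outbox.unacked()


if __name__ == "__main__":
    print(resend_demo())
```

The receiver-side id check is what lets the sender resend without worrying about duplicates.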
It takes two to tango. You can close your end of the connection and it waits for the other end of the connection to drop, too. Think 3-way handshake in reverse.
How long do you wait between closing the connection and re-opening it? You must wait at least the TIME_WAIT period before trying to reconnect using the same connection info.

Epoll and remote 1-way shutdown

Assume a TCP socket on the local linux host is in a connected state with a remote host. The local host is using epoll_wait to be notified of events on the socket with the remote host.
If the remote host were to call:
shutdown(s,SHUT_WR);
on its connected socket to indicate it is done transmitting, what event(s) will epoll_wait return on the local host for its socket?
I'm assuming EPOLLIN would always get returned and a subsequent recv call would return 0 to indicate the remote side has finished transmitting.
What about EPOLLHUP or EPOLLRDHUP? (And what is the difference between these two events)?
Or even EPOLLERR ?
If the remote host calls "close" instead of "shutdown", does the answer to any of the above change?
I'm answering this myself after doing the heavy lifting to find the answer.
A socket listening for epoll events will typically receive an EPOLLRDHUP (in addition to EPOLLIN) event flag when the remote peer calls close or shutdown(SHUT_WR). This does not necessarily mean the socket is dead. Subsequent calls to recv() will return any unread data on the socket, and eventually "0" will be returned to indicate EOF. It may even be possible to send data back if the remote peer only did a half-close of its socket.
The one notable exception is when the remote peer has the SO_LINGER option enabled on its socket with a linger value of "0". Closing such a socket may result in a TCP RST being sent instead of a FIN. From what I've read, a connection reset event will generate either an EPOLLHUP or an EPOLLERR. (I haven't had time to confirm, but it makes sense.)
There is some documentation to suggest there are older Linux implementations that don't support EPOLLRDHUP, as such EPOLLHUP gets generated instead.
And for what it is worth, in my particular case, I found that it is not too interesting to have code that special cases EPOLLHUP or EPOLLRDHUP events. Instead, just treat these events the same as EPOLLIN/EPOLLOUT and call recv() (or send() as appropriate). But pay close attention to return codes returned back from recv() and send().
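The event flags discussed in this answer can be checked with a short sketch using Python's select.epoll. It assumes Linux and a loopback TCP connection; `flags_after`, `half_close` and `abortive_close` are hypothetical names, and the linger structure is packed as two C ints as on common Linux platforms.

```python
import select
import socket
import struct

FLAG_NAMES = {
    select.EPOLLIN: "EPOLLIN",
    select.EPOLLRDHUP: "EPOLLRDHUP",
    select.EPOLLHUP: "EPOLLHUP",
    select.EPOLLERR: "EPOLLERR",
}


def tcp_pair():
    """Return (local, peer) connected over loopback TCP."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    peer = socket.create_connection(listener.getsockname())
    local, _ = listener.accept()
    listener.close()
    return local, peer


def half_close(peer):
    peer.shutdown(socket.SHUT_WR)  # FIN only; the peer can still receive


def abortive_close(peer):
    # SO_LINGER enabled with a timeout of 0: close() sends RST, not FIN.
    peer.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    peer.close()


def flags_after(peer_action):
    """Run peer_action on the remote end; return the epoll flag names seen locally."""
    local, peer = tcp_pair()
    ep = select.epoll()
    try:
        ep.register(local.fileno(), select.EPOLLIN | select.EPOLLRDHUP)
        peer_action(peer)
        events = ep.poll(2.0)  # wait up to two seconds for the first event
        mask = events[0][1] if events else 0
    finally:
        ep.close()
        local.close()
        peer.close()  # closing an already closed socket is a no-op
    return {name for bit, name in FLAG_NAMES.items() if mask & bit}


if __name__ == "__main__":
    print("shutdown(SHUT_WR):", sorted(flags_after(half_close)))
    print("linger 0 close   :", sorted(flags_after(abortive_close)))
```

EPOLLHUP and EPOLLERR are always reported by epoll, so they do not need to be requested in register().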
