what will happen if we close a closed socket - linux

I wonder what will happen if we close a closed socket or a non-existing socket?
Will the exception affect the other sockets which are sending/receiving packets?
Edit:
Sorry, I didn't say it clearly. I mean I know what it will return from close or shutdown function and what the return means, but I don't know what it affects the existing sockets.

Potentially, yes. If you call close on a random integer which used to be an fd, you might break some other part of your code that's just opened another connection that got given the same fd number. Therefore, you should never double-close an fd: although it's perfectly safe from the kernel's point of view (you harmlessly get EBADF), it can seriously mess up your application.

Or close(): per http://pubs.opengroup.org/onlinepubs/000095399/functions/close.html
will return -1 and set errno to EBADF. The fildes argument is not a valid file descriptor.

Related

What Error Shall We handle in Using Sockets

When using non-blocking sockets to communicate with clients, what error codes we must care and do something other than calling close() directly? Can anyone list them and give a few comments about what extra work we must do?
Currently we only handle the EAGAIN and EWOULDBLOCK, for all the other errors we just close the socket. Is this kind of socket exception routine enough for a server software?
I think you've mostly got it right. There are two other transient errors that my code retries on rather than closing the socket, they are:
EINTR - this can be returned if the system call was interrupted by delivery of a signal. The usual appropriate response is to try the system call again. (See here for details on why this can occur)
ENOBUFS - Under certain circumstances, send() can return this if the network interface's output queue is full. (yes, you'd expect send() to return EWOULDBLOCK or EAGAIN instead, but I have seen this returned)

Can dup2 really return EINTR?

In the spec and two implementations:
According to POSIX, dup2() may return EINTR.
The linux man pages list it as permitted.
The FreeBSD man pages indicate it's not ever returned. Is this a bug - since its close implementation can EINTR (at least for TCP linger if nothing else).
In reality, can Linux return EINTR for dup2()? Presumably if so, it would be because close() decided to wait and a signal arrived (TCP linger or dodgy file system drivers that try to sync when closing).
In reality, does FreeBSD guarantee not to return EINTR for dup2()? In that case, it must be that it doesn't bother waiting for any outstanding operations on the old fd and simply unlinks the fd.
What does POSIX dup2() mean when it refers to "closing" (not in italics), rather than referencing the actual close() function - are we to understand it's just talking about "closing" it in an informal way (unlinking the file descriptor), or is it attempting to say that the effect should be as if the close() function were first called and then dup2() were called atomically.
If fildes2 is already a valid open file descriptor, it shall be closed first, unless fildes is equal to fildes2 in which case dup2() shall return fildes2 without closing it.
If dup2() does have to close, wait, then atomically dup, it's going to be a nightmare for implementors! It's much worse than the EINTR with close() fiasco. Cowardly POSIX doesn't even say if the dup took place in the case of EINTR...
Here's the relevant information from the C/POSIX library documentation with respect to the standard Linux implementation:
If OLD and NEW are different numbers, and OLD is a valid
descriptor number, then `dup2' is equivalent to:
close (NEW);
fcntl (OLD, F_DUPFD, NEW)
However, `dup2' does this atomically; there is no instant in the
middle of calling `dup2' at which NEW is closed and not yet a
duplicate of OLD.
It lists the possible error values returned by dup and dup2 as EBADF, EINVAL, and EMFILE, and no others. The documentation states that all functions that can return EINTR are listed as such, which indicates that these don't. Note that these are implemented via fcntl, not a call to close.
8 years later this still seems to be undocumented.
I looked at the linux sources and my conclusion is that dup2 can't return EINTR in a current version of Linux.
In particular, the function do_dup2 in fs/file.c ignores the return value of filp_close, which is what can cause close to return EINTR in some cases (see fs/open.c and fs/file.c).
The way dup2 works is it first makes the atomic file descriptor update, and then waits for any flushing that needs to happen on close. Any errors happening on flush are simply ignored.

Handling short reads using epoll()

Let's say client sent 100 bytes of data but somehow server only received 90 bytes. How do I handle this case? If server calls the "read" function inside of while loop checking the total received data then the server will wait forever for the pack last 10 bytes..
Also, it could happen that client got disconnected in the middle of data transfer. In this case also server will wait forever until it receives all the data which won't arrive..
I am using tcp but in real world network environment, this situation could happen. Thanks in advance...
You do not call the read() function in a loop until you receieve the number of bytes you require. Instead, you set the socket to nonblocking and call the read() function in a loop until it returns 0 (indicating end of stream) or an error.
In the normal case the loop will terminate by read() returning -1, with errno set to EAGAIN. This indicates that the connection hasn't been closed, but no more data is available at the current time. At this point, if you do not have enough data from the client yet, you simply save the data that you do have for later, and return to the main epoll() loop.
If and when the remainder of the data arrives, the socket will be returned as readable by epoll(), you will read() the rest of the data, retreieve the saved data and process it all.
This means that you need space in your per-socket data structure to store the read-but-not-processed-yet data.
You must carefully check the return value of read. It can return any of three things:
A positive number, indicating some bytes were read.
Zero, indicating the other end has gracefully closed the connection.
-1, meaning an error occurred. (If the socket is non-blocking, then the error EAGAIN or EWOULDBLOCK means the connection is still open but no data is ready for you right now, so you need to wait until epoll says there is more data for you.)
If your code is not checking for each of these three things and handling them differently, then it is almost certainly broken.
These cover all of the cases you are asking about, like a client sending 90 bytes then closing or rudely breaking the connection (because read() will return 0 or -1 for those cases).
If you are worried that a client might send 90 bytes and then never send any more, and never close the connection, then you have to implement your own timeouts. For that your best bet is non-blocking sockets and putting a timeout on select() / poll() / epoll(), ditching the connection if it is idle for too long.
TCP connection is a bi-directional stream layered on top of packet-based network. It's a common occurrence to read only part of what the other side sent. You have to read in a loop, appending until you have a complete message. For that you need an application level protocol - types, structure, and semantics of messages - that you use on top of TCP (FTP, HTTP, SMTP, etc. are such protocols).
To answer the specific second part of the question - add EPOLLRDHUP to the set of epoll(7) events to get notified when connection drops.
In addition to what caf has said, I'd recommend just subscribing EPOLLRDHUP because this is the only safe way to figure out whether a connection was closed (read() == 0 is not reliable as, caf mentioned this too, may be true in case of an error). EPOLLERR is always subscribed to, even if you didn't specifically asked for it. The correct behaviour is to close the connection using close() in case of EPOLLRDHUP and probably even when EPOLLERR is set.
For more information, I've given a similar answer here: epoll_wait() receives socket closed twice (read()/recv() returns 0)

Sockets & File Descriptor Reuse (or lack thereof)

I am getting the error "Too many open files" after the call to socket in the server code below. This code is called repeatedly, and it only occurs just after server_SD gets the value 1022. so i am assuming that i am hitting the limit of 1024 as proscribed by "ulimit -n". What i don't understand is that i am closing the Socket, which should make the fd reusable, but this seems not to be happening.
Notes: Using linux, and yes the client is closed also, no i am not a root user so moving the limits is not an option, I should have a maximum of 20 (or so) sockets open at one time. Over the lifetime of my program i would expect to open & close close to 1000000 sockets (hence need to reuse very strong).
server_SD = socket (AF_INET, SOCK_STREAM, 0);
bind (server_SD, (struct sockaddr *) &server_address, server_len)
listen (server_SD,1)
client_SD = accept (server_SD, (struct sockaddr *)&client_address, &client_len)
// read, write etc...
shutdown (server_SD, 2);
close (server_SD)
Does anyone know how to guarantee closure & re-usability ?
Thanks.
Run your program under valgrind with the --track-fds=yes option:
valgrind --track-fds=yes myserver
You may also need --trace-children=yes if your program uses a wrapper or it puts itself in the background.
If it doesn't exit on its own, interrupt it or kill the process with "kill pid" (not -9) after it accumulates some leaked file descriptors. On exit, valgrind will show the file descriptors that are still open and the stack trace corresponding to where they were created.
Running your program under strace to log all system calls may also be helpful. Another helpful command is /usr/sbin/lsof -p pid to display all currently used file descriptors and what they are being used for.
From your description it looks like you are opening server socket for each accept(2). That is not necessary. Create server socket once, bind(2) it, listen(2), then call accept(2) on it in a loop (or better yet - give it to poll(2))
Edit 0:
By the way, shutdown(2) on listening socket is totally meaningless, it's intended for connected sockets only.
Perhaps your problem is that you're not specifying the SO_REUSEADDR flag?
From the socket manpage:
SO_REUSEADDR
Indicates that the rules used in validating addresses supplied in a bind(2) call should allow reuse of local addresses. For PF_INET sockets this means that a socket may bind, except when there is an active listening socket bound to the address. When the listening socket is bound to INADDR_ANY with a specific port then it is not possible to bind to this port for any local address.
Are you using fork()? if so, your children may be inheriting the opened file descriptors.
If this is the case, you should have the child close any fds that don't belong to it.
This looks like you might have a "TIME_WAIT" problem. IIRC, TIME_WAIT is one of the status a TCP socket can be in, and it's entered when both side have closed the connection, but the system keeps the socket for a while, to avoid delayed messages to be accepted as proper payload for subsequent connections.
You shoud maybe have a look at this (bottom of page 99 and top of 100). And maybe that other question.
One needs to close the client before closing the server (reverse order to my code above!)
Thanks all who offered suggestions !

When does the write() system call write all of the requested buffer versus just doing a partial write?

If I am counting on my write() system call to write say e.g., 100 bytes, I always put that write() call in a loop that checks to see if the length that gets returned is what I expected to send and, if not, it bumps the buffer pointer and decreases the length by the amount that was written.
So once again I just did this, but now that there's StackOverflow, I can ask you all if people know when my writes will write ALL that I ask for versus give me back a partial write?
Additional comments: X-Istence's reply reminded me that I should have noted that the file descriptor was blocking (i.e., not non-blocking). I think he is suggesting that the only way a write() on a blocking file descriptor will not write all the specified data is when the write() is interrupted by a signal. This seems to make at least intuitive sense to me...
write may return partial write especially using operations on sockets or if internal buffers full. So good way is to do following:
while(size > 0 && (res=write(fd,buff,size))!=size) {
if(res<0 && errno==EINTR)
continue;
if(res < 0) {
// real error processing
break;
}
size-=res;
buf+=res;
}
Never relay on what usually happens...
Note: in case of full disk you would get ENOSPC not partial write.
You need to check errno to see if your call got interrupted, or why write() returned early, and why it only wrote a certain number of bytes.
From man 2 write
When using non-blocking I/O on objects such as sockets that are subject to flow control, write() and writev() may write fewer bytes than requested; the return value must be noted, and the remainder of the operation should be retried when possible.
Basically, unless you are writing to a non-blocking socket, the only other time this will happen is if you get interrupted by a signal.
[EINTR] A signal interrupted the write before it could be completed.
See the Errors section in the man page for more information on what can be returned, and when it will be returned. From there you need to figure out if the error is severe enough to log an error and quit, or if you can continue the operation at hand!
This is all discussed in the book: Advanced Unix Programming by Marc J. Rochkind, I have written countless programs with the help of this book, and would suggest it while programming for a UNIX like OS.
Writes shouldn't have any reason to ever write a partial buffer afaik. Possible reasons I could think of for a partial write is if you run out of disk space, you're writing past the end of a block device, or if you're writing to a char device / some other sort of device.
However, the plan to retry writes blindly is probably not such a good one - check errno to see whether you should be retrying first.

Resources