Using a thread to write and select to read - Linux

Has anyone tried creating a socket in non-blocking mode and using a dedicated thread to write to it, while using the select system call to detect when data is available to read?
If the socket is non-blocking, the write call returns immediately and the application does not know the status of the write (whether it succeeded or failed).
Is there a way of knowing the status of the write call without having to block on it?

"Has anyone tried creating a socket in non-blocking mode and using a dedicated thread to write to it, while using the select system call to detect when data is available to read?"
Yes, and it works fine. Sockets are bi-directional. They have separate buffers for reading and writing. It is perfectly acceptable to have one thread writing data to a socket while another thread is reading data from the same socket at the same time. Both threads can use select() at the same time.
"If the socket is non-blocking, the write call will return immediately and the application will not know the status of the write (if it passed or failed)."
The same is true for blocking sockets, too. Outbound data is buffered in the kernel and transmitted in the background. The difference between the two types is that if the write buffer is full (such as if the peer is not reading and acking data fast enough), a non-blocking socket will fail to accept more data and report an error code (WSAEWOULDBLOCK on Windows, EAGAIN or EWOULDBLOCK on other platforms), whereas a blocking socket will wait for buffer space to clear up and then write the pending data into the buffer. Same thing with reading. If the inbound kernel buffer is empty, a non-blocking socket will fail with the same error code, whereas a blocking socket will wait for the buffer to receive data.
select() can be used with both blocking and non-blocking sockets. It is just more commonly used with non-blocking sockets than blocking sockets.
"Is there a way of knowing the status of the write call without having to block on it?"
On non-Windows platforms, about all you can do is use select() or equivalent to detect when the socket can accept new data before writing to it. On Windows, there are ways to receive a notification when a pending read/write operation completes if it does not finish right away.
But either way, outbound data is written into a kernel buffer and not transmitted right away. Writing functions, whether called on blocking or non-blocking sockets, merely report the status of writing data into that buffer, not the status of transmitting the data to the peer. The only way to know the status of the transmission is to have the peer explicitly send back a reply message once it has received the data. Some protocols do that, and others do not.

"Is there a way of knowing the status of the write call without having to block on it?"
If the result of the write call is -1, then check errno for EAGAIN or EWOULDBLOCK. If it's one of those errors, then it's benign and you can go back to waiting on a select call. Sample code below.
    ssize_t result = write(sock, buffer, size);
    if ((result == -1) && ((errno == EAGAIN) || (errno == EWOULDBLOCK)))
    {
        // write failed because the socket isn't ready to handle more data. Try again later (or wait for select)
    }
    else if (result == -1)
    {
        // fatal socket error
    }
    else
    {
        // result == number of bytes sent.
        // TCP - may be less than the number of bytes passed in to the write/send call.
        // UDP - number of bytes sent (should be the entire thing)
    }

Related

Is it possible for UnixStream::send to return EWOULDBLOCK?

I am using UnixStream and I am not calling set_nonblocking(true). I thought a blocking socket would never return EWOULDBLOCK; is that true?

"Can a blocking socket return EWOULDBLOCK?"
It depends on the OS implementation, but for POSIX-compliant OSs and Linux the answer is NO, as long as no socket timeout has been set.
From recv() of POSIX.1-2017:
[EAGAIN] or [EWOULDBLOCK]
The socket's file descriptor is marked O_NONBLOCK and no data is
waiting to be received; or MSG_OOB is set and no out-of-band data is
available and either the socket's file descriptor is marked O_NONBLOCK
or the socket does not support blocking to await out-of-band data.
Linux is not POSIX-certified, but the answer is still NO, apart from the receive-timeout case that its man page mentions. From recv(2):
EAGAIN or EWOULDBLOCK
The socket is marked nonblocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data was received. POSIX.1 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.

The general answer is no; however, I've seen two exceptions:
If you set a timeout on the socket with SO_RCVTIMEO or SO_SNDTIMEO, a receive or send will return with EWOULDBLOCK if the timeout elapses while no input data becomes available or the output buffer remains full, respectively.
If you call send/recv with the MSG_DONTWAIT flag, effectively turning the socket into a non-blocking one for that call.
I don't know if UnixStream actually exposes the above functionality. This is a POSIX question rather than a Rust one, and I'm not familiar with Rust.
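To make the two exceptions concrete, here is a small C sketch (C rather than Rust, since this is really about the POSIX calls underneath). The helper names set_recv_timeout, recv_nowait and would_block are invented for illustration. On a socket left in blocking mode, both paths can end with recv() returning -1 and errno set to EAGAIN or EWOULDBLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Exception 1 (hypothetical helper): give a blocking socket a receive
 * timeout. A later recv() that sees no data within timeout_ms fails with
 * EAGAIN/EWOULDBLOCK even though O_NONBLOCK is not set. */
int set_recv_timeout(int fd, int timeout_ms)
{
    assert(timeout_ms > 0);               /* an all-zero timeval disables the timeout */
    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Exception 2 (hypothetical helper): a single receive attempt that never
 * blocks, regardless of the socket's own blocking mode. */
ssize_t recv_nowait(int fd, void *buf, size_t len)
{
    return recv(fd, buf, len, MSG_DONTWAIT);
}

/* POSIX allows EAGAIN and EWOULDBLOCK to differ, so check both. */
int would_block(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK;
}
```

So code that never calls set_nonblocking(true) can still see this error if something else has configured a timeout on the descriptor.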

Will select() block if called while there is still data to be read?

If a socket has data to be read and the select() function is called, will select():
Return immediately, indicating the socket is ready for reading, or
Block until more data is received on the socket?

It can easily be tested, but I assure you select() will never block if there is data already available to read on one of the readfds. If it did block in that case, it wouldn't be very useful for programming with non-blocking I/O. Take the example where you are looping on select(), you see that there is data to be read, and you read it. Then while you are processing the data read, more data comes in. When you return to select() it blocks, waiting for more data. However your peer on the other side of the connection is waiting for a response to the data already sent. Your program ends up blocking forever. You could work around it with timeouts and such, but the whole point is to make non-blocking I/O efficient.
If an fd is at EOF, select() will never block even if called multiple times.

man 2 select seems to answer this question pretty directly:
select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.
So at least according to the manual, it would return immediately if there is any data available.
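Since this "can easily be tested", here is one way to do it: a sketch that uses a local socketpair instead of a real network connection. Both function names are made up for this example. readable_now() asks select() with a zero timeout, and the demo checks that the socket keeps being reported as ready for as long as the byte stays unread.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: 1 if select() reports fd as readable right now,
 * 0 otherwise (including when select() fails). */
int readable_now(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    struct timeval tv = {0, 0};           /* zero timeout: poll, never wait */
    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    return n > 0 && FD_ISSET(fd, &rfds);
}

/* Demonstration: returns 1 if select() behaved as described above,
 * 0 if it did not, -1 if the test setup itself failed. */
int demo_select_with_pending_data(void)
{
    int sv[2];
    char byte = 'x';
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    int before = readable_now(sv[0]);     /* nothing has been sent yet */
    if (send(sv[1], &byte, 1, 0) != 1)
        return -1;
    int first = readable_now(sv[0]);      /* data is pending */
    int second = readable_now(sv[0]);     /* still unread, so still ready */
    if (recv(sv[0], &byte, 1, 0) != 1)
        return -1;
    int after = readable_now(sv[0]);      /* buffer drained again */

    close(sv[0]);
    close(sv[1]);
    return !before && first && second && !after;
}
```

Readiness here is level-triggered: it reflects the current state of the receive buffer, not the arrival of new data.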

Linux TCP socket in blocking mode

When I create a TCP socket in blocking mode and use the send (or sendto) functions, when will the function call return?
Will it have to wait until the other side of the socket has received the data? In that case, if there is a traffic jam on the internet, could it block for a long time?

Both the sender and the receiver (and possibly intermediaries) will buffer the data.
Sending data successfully is no guarantee that the receiving end has received it.
Normally, writes to a blocking socket won't block as long as there is space in the sending-side buffer.
Once the sender's buffer is full, the write WILL block until enough space has been freed for all of the data to be copied into it.
If the write is partially successful (the receiver closed the socket, shut it down or an error occurred), then the write might return fewer bytes than you had intended. A subsequent write should give an error or return 0 - such conditions are irreversible on TCP sockets.
Note that if a subsequent send() or write() gives an error, then some previously written data could be lost forever. I don't think there is a real way of knowing how much data actually arrived (or was acknowledged, anyway).

Handling short reads using epoll()

Let's say the client sent 100 bytes of data but somehow the server only received 90 bytes. How do I handle this case? If the server calls the "read" function inside a while loop that checks the total received data, then the server will wait forever for the last 10 bytes.
Also, the client could get disconnected in the middle of a data transfer. In this case, too, the server will wait forever for data that will never arrive.
I am using TCP, but in a real-world network environment this situation could happen. Thanks in advance.

You do not call the read() function in a loop until you receive the number of bytes you require. Instead, you set the socket to non-blocking and call the read() function in a loop until it returns 0 (indicating end of stream) or an error.
In the normal case the loop will terminate by read() returning -1, with errno set to EAGAIN. This indicates that the connection hasn't been closed, but no more data is available at the current time. At this point, if you do not have enough data from the client yet, you simply save the data that you do have for later, and return to the main epoll() loop.
If and when the remainder of the data arrives, the socket will be returned as readable by epoll(), and you will read() the rest of the data, retrieve the saved data and process it all.
This means that you need space in your per-socket data structure to store the read-but-not-processed-yet data.
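A minimal sketch of that read loop with a per-socket buffer might look like the following. The struct conn layout, the 4096-byte capacity, the enum names and drain_socket() are all assumptions made for this example. It uses recv() with no flags, which behaves like read() on a socket, and it expects the descriptor to already be in non-blocking mode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

enum drain_result { DRAIN_OPEN, DRAIN_CLOSED, DRAIN_ERROR, DRAIN_FULL };

/* Hypothetical per-connection state: bytes read but not yet processed. */
struct conn {
    int fd;               /* non-blocking socket */
    size_t len;           /* bytes currently saved in buf */
    char buf[4096];       /* arbitrary capacity chosen for this sketch */
};

/* Read everything currently available into c->buf.
 * DRAIN_OPEN:   no more data for now; go back to the epoll loop.
 * DRAIN_CLOSED: the peer closed the stream (recv returned 0).
 * DRAIN_ERROR:  a real error; errno holds the cause.
 * DRAIN_FULL:   the buffer is full; process saved data before reading on. */
enum drain_result drain_socket(struct conn *c)
{
    assert(c != NULL && c->len <= sizeof c->buf);
    for (;;) {
        if (c->len == sizeof c->buf)
            return DRAIN_FULL;
        ssize_t n = recv(c->fd, c->buf + c->len, sizeof c->buf - c->len, 0);
        if (n > 0)
            c->len += (size_t)n;          /* save the bytes and keep reading */
        else if (n == 0)
            return DRAIN_CLOSED;
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            return DRAIN_OPEN;
        else if (errno != EINTR)          /* EINTR: simply retry the recv */
            return DRAIN_ERROR;
    }
}
```

After each DRAIN_OPEN the caller checks whether c->len now covers a complete message and, if not, simply waits for the next readable event.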

You must carefully check the return value of read. It can return any of three things:
A positive number, indicating some bytes were read.
Zero, indicating the other end has gracefully closed the connection.
-1, meaning an error occurred. (If the socket is non-blocking, then the error EAGAIN or EWOULDBLOCK means the connection is still open but no data is ready for you right now, so you need to wait until epoll says there is more data for you.)
If your code is not checking for each of these three things and handling them differently, then it is almost certainly broken.
These cover all of the cases you are asking about, like a client sending 90 bytes then closing or rudely breaking the connection (because read() will return 0 or -1 for those cases).
If you are worried that a client might send 90 bytes and then never send any more, and never close the connection, then you have to implement your own timeouts. For that your best bet is non-blocking sockets and putting a timeout on select() / poll() / epoll(), ditching the connection if it is idle for too long.
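For the idle-timeout case, something along these lines could work for a single descriptor. This is only a sketch: recv_or_idle() and the IDLE_TIMEOUT sentinel are names I made up, and a real epoll-based server would more likely track a last-activity timestamp per connection and pass a timeout to epoll_wait() instead.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

#define IDLE_TIMEOUT (-2)   /* hypothetical sentinel: nothing arrived in time */

/* Wait up to idle_timeout_ms for input on fd, then make one receive attempt.
 * Returns > 0 for bytes received, 0 if the peer closed the connection,
 * -1 on error (errno set), or IDLE_TIMEOUT if the connection stayed idle. */
ssize_t recv_or_idle(int fd, void *buf, size_t len, int idle_timeout_ms)
{
    assert(idle_timeout_ms >= 0);         /* a negative value would wait forever */
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    int n = poll(&pfd, 1, idle_timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return IDLE_TIMEOUT;              /* caller decides whether to drop the client */
    /* poll() also wakes up for hangups and errors; recv() reports which. */
    return recv(fd, buf, len, MSG_DONTWAIT);
}
```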

A TCP connection is a bi-directional stream layered on top of a packet-based network. It's a common occurrence to read only part of what the other side sent. You have to read in a loop, appending until you have a complete message. For that you need an application-level protocol (types, structure, and semantics of messages) that you use on top of TCP (FTP, HTTP, SMTP, etc. are such protocols).
To answer the specific second part of the question: add EPOLLRDHUP to the set of epoll(7) events to get notified when the connection drops.

In addition to what caf has said, I'd recommend subscribing to EPOLLRDHUP, because this is the only safe way to figure out whether a connection was closed (read() == 0 is not reliable because, as caf mentioned too, it may also occur in case of an error). EPOLLERR is always subscribed to, even if you didn't specifically ask for it. The correct behaviour is to close the connection using close() in case of EPOLLRDHUP, and probably also when EPOLLERR is set.
For more information, I've given a similar answer here: epoll_wait() receives socket closed twice (read()/recv() returns 0)
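Registering for EPOLLRDHUP could look roughly like this (Linux only). The helper names watch_socket and should_close are hypothetical. EPOLLHUP is included in the check as well because, like EPOLLERR, the kernel reports it whether or not it was requested.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/* Hypothetical helper: register fd for input and for notification when the
 * peer closes (or shuts down the writing half of) the connection. */
int watch_socket(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLRDHUP;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: 1 if the reported events mean the connection is
 * gone or broken and should be closed, 0 otherwise. */
int should_close(uint32_t events)
{
    return (events & (EPOLLRDHUP | EPOLLHUP | EPOLLERR)) != 0;
}
```

In the event loop, each entry returned by epoll_wait() would be passed through should_close(ev.events) before any attempt to read. Note that EPOLLIN may be set at the same time if the peer sent data just before closing, so whether to drain that data first is a choice left to the application.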

can the send function block

I'm writing a chat program and for the server, when I send data can the send() function take a long time to send out the data?
Here is my problem:
I'm using Linux 2.6 with epoll, and the server is single-threaded.
If send() blocks, then all other activity on the server will stop. For example, if there is a very slow client that does not send ACK responses to a TCP packet for a long time, will the send function just move on right away, or will it wait a long time for the client? What I don't want is for a single slow client (or a few of them) to cause delays in the chat server.
What I want is for send() to be non-blocking and to return very quickly. If it doesn't send all the data, it should simply return the amount sent, and I will remove that from the buffer and keep sending the rest each time the client is serviced until all data is sent. Basically, I don't want send to block for a long time on a slow or unresponsive client.

You can set a socket to non-blocking mode and a send will not block. The problem is that you'll have to manage the fact that a partial write occurred and send the rest of the data when the write file descriptor becomes active again.
In general I've found that doing both send and recv in non-blocking mode, while complicating the program, works pretty well.
Use something like:
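    /* needs <fcntl.h>; fd is the socket descriptor */
    int flags;
    if (-1 == (flags = fcntl(fd, F_GETFL, 0)))
        flags = 0;  /* could not read the current flags; start from none */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);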
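Managing the partial writes usually means keeping a small output queue per client. Below is a minimal sketch of that idea. The struct outbuf layout, its 8192-byte size and the flush_outbuf() name are assumptions for illustration, and the socket is expected to be in non-blocking mode already. MSG_NOSIGNAL is added so that a vanished client produces an EPIPE error instead of a SIGPIPE signal.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-client output queue. */
struct outbuf {
    char data[8192];      /* arbitrary capacity chosen for this sketch */
    size_t len;           /* bytes still waiting to be sent */
};

/* Push as much queued data as the kernel will take right now.
 * Returns 0 when the queue is empty, 1 when data remains (wait for the
 * socket to become writable, e.g. EPOLLOUT, and call again), or -1 on a
 * fatal error, in which case the client should be dropped. */
int flush_outbuf(int fd, struct outbuf *ob)
{
    assert(ob != NULL && ob->len <= sizeof ob->data);
    while (ob->len > 0) {
        ssize_t n = send(fd, ob->data, ob->len, MSG_NOSIGNAL);
        if (n > 0) {
            /* Partial write: discard what was accepted, keep the rest. */
            memmove(ob->data, ob->data + n, ob->len - (size_t)n);
            ob->len -= (size_t)n;
        } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return 1;                     /* slow client: its buffer is full */
        } else if (n < 0 && errno == EINTR) {
            continue;                     /* interrupted by a signal: retry */
        } else {
            return -1;
        }
    }
    return 0;
}
```

With this shape, a slow client only ever costs the server one failed send() per wakeup, and the rest of the chat traffic keeps flowing.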
