Get all available items from a futures Stream (non-blocking) - rust

I have a WebSocket connection which wraps a futures_core::stream::Stream (incoming) and Sink (outgoing).
I want to decode and process all available messages from the Stream without blocking. Clearly at the socket level it's a TCP/IP stream of bytes, and there are going to be 0..N messages sitting in the socket receive buffer waiting for a call to read(). A non-blocking call to read could well read multiple pipelined WebSocket frames. At the level of the Rust abstraction this might be possible with fn poll_next(...):
The trait is modelled after Future, but allows poll_next to be called
even after a value has been produced, yielding None once the stream
has been fully exhausted.
However, I don't know how to use this poll method directly without the async/await syntax, and even if I can, I don't see how it solves the problem. If I call it in a loop while I get back Some(frame), collecting the frames in a Vec, it will still suspend the task when it runs out of buffered frames and return Poll::Pending - so I won't be able to do anything with the collected frames immediately anyway. Ideally I need to process the collected frames when I get Poll::Pending without suspending anything, and then call it again allowing it to suspend only the second time around, if need be. Is there a solution possible here that doesn't involve discarding all of the future abstractions and resorting to buffering and parsing web socket frames myself?

You seem to have a misunderstanding of how suspensions work. When the parent function calls poll_next in a loop, it is not poll_next returning Poll::Pending that causes a suspension. The suspension only happens when the function containing the loop itself returns Poll::Pending as a result. But nothing says you have to do that immediately: you are free to process the frames you have collected before returning control to the executor.
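A minimal sketch of that pattern, assuming the stream already yields decoded frames and using futures::future::poll_fn (the helper name drain_ready is illustrative, not from the original post):

use futures::future::poll_fn;
use futures::stream::{Stream, StreamExt};
use std::task::Poll;

// Collect every frame that is ready right now, without ever suspending the task.
async fn drain_ready<S: Stream + Unpin>(stream: &mut S) -> Vec<S::Item> {
    poll_fn(|cx| {
        let mut batch = Vec::new();
        // Pull items for as long as the stream has them immediately available.
        while let Poll::Ready(Some(item)) = stream.poll_next_unpin(cx) {
            batch.push(item);
        }
        // Returning Ready completes this little future right away, even though
        // the stream itself just reported Pending (or ended with None).
        Poll::Ready(batch)
    })
    .await
}

The caller can process the returned batch right away and only afterwards await the stream again, which is the point at which the task may actually suspend.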

Related

Is AsyncReadExt::read_u64 cancel safe?

In the documentation for AsyncReadExt::read_u64 it says it has the same errors as AsyncReadExt::read_exact, but says nothing about cancellation safety.
The same holds for all the other read_<type> functions on AsyncReadExt.
It seems likely that they have the same cancellation safety as read_exact (that is, none), but is that true?
Is there another way to read the next 4 bytes in a cancel safe way?
There's stuff in Tokio that covers my use case at a higher level, but I'd like to know how I would do this myself.
No, it's not cancel safe.
While the implementations of read_exact and the read_* functions differ, they do the exact same thing:
Poll the underlying AsyncRead into a buffer, propagating errors appropriately.
If the reader returns Poll::Pending, propagate that.
If the buffer is full, return Ok(()).
If the buffer isn't full, repeat the whole thing over again.
If the future is cancelled after some bytes have been read, it leaves the reader in an unknown state, which makes these methods not cancel safe.
Edit: making these methods cancel safe is difficult. The only ways to do it would be to rewrite them so that, when the future is dropped, it either somehow communicates its internal state to a listener on the outside (probably via a channel) or somehow runs itself to completion. It would be preferable to rewrite the surrounding code so that it doesn't depend on cancel safety.
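One common workaround, sketched here under the assumption that the bytes crate is available and that big-endian order is wanted (the helper name read_u64_cancel_safe is illustrative, not a tokio API): keep the accumulation buffer outside the future, so that bytes already read survive if the future is dropped.

use bytes::{Buf, BytesMut};
use tokio::io::{AsyncRead, AsyncReadExt};

// `buf` is owned by the caller and persists across cancellations, so any bytes
// that were already read are not lost when the returned future is dropped.
async fn read_u64_cancel_safe<R: AsyncRead + Unpin>(
    reader: &mut R,
    buf: &mut BytesMut,
) -> std::io::Result<u64> {
    while buf.len() < 8 {
        // read_buf appends to `buf` and performs at most one underlying read,
        // so dropping this future between polls cannot lose data.
        if reader.read_buf(buf).await? == 0 {
            return Err(std::io::ErrorKind::UnexpectedEof.into());
        }
    }
    Ok(buf.get_u64()) // consumes 8 big-endian bytes from the front of `buf`
}

Because each read_buf call appends to the caller-owned buffer, cancelling the future between awaits loses nothing: the next call simply picks up where the buffer left off.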

Is there a way to make StreamExt::next non blocking (fail fast) if the stream is empty (need to wait for the next element)?

Currently I am doing something like this
use std::time::Duration;
use futures::StreamExt; // or tokio_stream::StreamExt
use tokio::time::timeout;

while let Ok(option_element) = timeout(Duration::from_nanos(1), stream.next()).await {
    ...
}
to drain the items already in the rx buffer of the stream. I don't want to wait for the next element that has not been received.
I think the timeout would slow down the while loop.
I am wondering whether there is a better way to do this without using the timeout?
Possibly like this https://github.com/async-rs/async-std/issues/579 but for the streams in futures/tokio.
The direct answer to your question is to use the FutureExt::now_or_never method from the futures crate as in stream.next().now_or_never().
However, it is important to avoid writing a busy loop that waits on several things by calling now_or_never on each of them in turn. This is bad because it blocks the thread; prefer a different solution such as tokio::select! for waiting on multiple things at once. For the special case where you are constantly checking whether the task should shut down, see this other question instead.
On the other hand, an example where using now_or_never is perfectly fine is when you want to empty a queue of the items available right now so you can batch-process them in some manner. This is fine because the now_or_never loop stops spinning as soon as it has emptied the queue.
Beware that if the stream has already ended, now_or_never will still "succeed": next() completes immediately with None in that case, so you get Some(None) rather than None.
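A small sketch of that batching pattern (drain_available is an illustrative name, not part of the futures API):

use futures::{FutureExt, Stream, StreamExt};

// Take only the items that are already available; this never waits.
fn drain_available<S: Stream + Unpin>(stream: &mut S) -> Vec<S::Item> {
    let mut batch = Vec::new();
    // now_or_never polls the next() future exactly once:
    //   Some(Some(item)) -> an item was ready immediately
    //   Some(None)       -> the stream has ended
    //   None             -> nothing is ready right now (Pending)
    while let Some(Some(item)) = stream.next().now_or_never() {
        batch.push(item);
    }
    batch
}

Note that now_or_never polls with a no-op waker, so nothing will wake the task when new items arrive; after processing the batch, await stream.next() normally to suspend until the next item.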

Do AsyncIO stream writers/readers require manually ensuring that all data is sent/received?

When dealing with sockets, you need to make sure that all data is sent/received, since you may receive incomplete chunks of data when reading. From the docs:
In general, they return when the associated network buffers have been filled (send) or emptied (recv). They then tell you how many bytes they handled. It is your responsibility to call them again until your message has been completely dealt with.
Emphasis mine. It then shows sample implementations that ensure all data has been handled in each direction.
Is the same true though when dealing with AsyncIO wrappers over sockets?
For read, it seems to be required as the docs mention that it "[reads] up to n bytes.".
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
Is this correct? Do I need to check how much was read using read, but can just drain the StreamWriter and know that everything was sent?
I thought that my above assumptions were correct, but then I had a look at the example TCP client immediately below the method docs:
import asyncio

async def tcp_echo_client(message):
    reader, writer = await asyncio.open_connection(
        '127.0.0.1', 8888)

    print(f'Send: {message!r}')
    writer.write(message.encode())

    data = await reader.read(100)
    print(f'Received: {data.decode()!r}')

    print('Close the connection')
    writer.close()

asyncio.run(tcp_echo_client('Hello World!'))
And it doesn't do any kind of checking. It assumes everything is both read and written the first time.
For read, [checking for incomplete read] seems to be required as the docs mention that it "[reads] up to n bytes.".
Correct, and this is a useful feature for many kinds of processing, as it allows you to read new data as it arrives from the peer and process it incrementally, without having to know how much to expect at any point. If you do know exactly how much you expect and need to read that amount of bytes, you can use readexactly.
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
This is partially correct. Yes, asyncio will automatically keep writing the data you give it in the background until all is written, so you don't need to (nor can you) ensure it by checking the return value of write.
However, a sequence of stream.write(data); await stream.drain() will not pause the coroutine until all data has been transmitted to the OS. This is because drain doesn't wait for all data to be written, it only waits until it hits a "low watermark", trying to ensure (misguidedly according to some) that the buffer never becomes empty as long as there are new writes. As far as I know, in current asyncio there is no way to wait until all data has been sent - except for manually tweaking the watermarks, which is inconvenient and which the documentation warns against. The same applies to awaiting the return value of write() introduced in Python 3.8.
This is not as bad as it sounds simply because a successful write itself doesn't guarantee that the data was actually transmitted to, let alone received by the peer - it could be languishing in the socket buffer, or in network equipment along the way. But as long as you can rely on the system to send out the data you gave it as fast as possible, you don't really care whether some of it is in an asyncio buffer or in a kernel buffer. (But you still need to await drain() to ensure backpressure.)
The one time you do care is when you are about to exit the program or the event loop; in that case, a portion of the data being stuck in an asyncio buffer means that the peer will never see it. This is why, starting with 3.7, asyncio provides a wait_closed() method which you can await after calling close() to ensure that all the data has been sent. One could imagine a flush() method that does the same, but without having to actually close the socket (analogous to the method of the same name on file objects, and with equivalent semantics), but currently there are no plans to add it.

using MPI_Send_variable many times in a row before MPI_Recv_variable

To my current understanding, after calling MPI_Send the calling thread should block until the variable is received, so my code below shouldn't work. However, I tried sending several variables in a row and then receiving them gradually while doing operations on them, and it still worked... See below. Can someone clarify, step by step, what is going on here?
MATLAB code (I am using a MATLAB MEX wrapper around the MPI functions):
% send
if mpirank==0
    % arguments to MPI_Send_variable are (variable, destination, tag)
    MPI_Send_variable(x,0,'A_22') % thread 0 should block here!
    MPI_Send_variable(y,0,'A_12')
    MPI_Send_variable(z,1,'A_11')
    MPI_Send_variable(w,1,'A_21')
end
% receive
if mpirank==0
    % arguments to MPI_Recv_variable are (source, tag)
    a=MPI_Recv_variable(0,'A_12')*MPI_Recv_variable(0,'A_22');
end
if mpirank==1
    c=MPI_Recv_variable(0,'A_21')*MPI_Recv_variable(0,'A_22');
end
MPI_SEND is a blocking call only in the sense that it blocks until it is safe for the user to reuse the buffer provided to it. The important text to read here is in Section 3.4 of the MPI standard:
The send call described in Section 3.2.1 uses the standard communication mode. In this mode, it is up to MPI to decide whether outgoing messages will be buffered. MPI may buffer outgoing messages. In such a case, the send call may complete before a matching receive is invoked. On the other hand, buffer space may be unavailable, or MPI may choose not to buffer outgoing messages, for performance reasons. In this case, the send call will not complete until a matching receive has been posted, and the data has been moved to the receiver.
The part you're running up against is the statement that MPI may buffer outgoing messages, in which case the send call may complete before a matching receive is invoked. If your message is sufficiently small (and there are sufficiently few of them), MPI will copy your send buffers into an internal buffer and keep track of things internally until the message has been received remotely. There is no guarantee that when MPI_SEND returns, the message has been received.
On the other hand, if you do want to know that the message was actually received, you can use MPI_SSEND. That function will synchronize (hence the extra S) both sides before allowing them to return from the MPI_SSEND and the matching receive call on the other end.
In a correct MPI program, you cannot do a blocking send to yourself without first posting a nonblocking receive. So a correct version of your program would look something like this:
Irecv(..., &req1);
Irecv(..., &req2);
Send(... to self ...);
Send(... to self ...);
Wait(&req1, ...);
/* do work */
Wait(&req2, ...);
/* do more work */
Your code is technically incorrect, but it works because the MPI implementation uses internal buffers to hold your send data before it is transmitted to the receiver (or matched to the later receive operation, in the case of sends to self). An MPI implementation is not required to have such buffers (generally called "eager buffers"), but most implementations do.
Since the data you are sending is small, the eager buffers are generally sufficient to hold it temporarily. If you send large enough data, the MPI implementation will not have enough eager buffer space and your program will deadlock. Try sending, for example, 10 MB instead of a double to see the deadlock.
I assume that there is just an MPI_Send() behind MPI_Send_variable() and an MPI_Recv() behind MPI_Recv_variable().
How can a process ever receive a message that it sent to itself if both the send and the receive operations are blocking? Either the send to self or the receive from self must be non-blocking, or you will get a deadlock and sending to self would effectively be forbidden.
Following #Greginozemtsev's answer to "Is the behavior of MPI communication of a rank with itself well-defined?", the MPI standard states that send to self and receive from self are allowed. I guess this implies that it's non-blocking in this particular case.
In MPI 3.0, Section 3.2.4 (Blocking Receive), page 59, the wording has not changed since MPI 1.1:
Source = destination is allowed, that is, a process can send a message to itself.
(However, it is unsafe to do so with the blocking send
and receive operations described above, since this may lead to deadlock.
See Section 3.5.)
I read Section 3.5, but it's not clear enough for me...
I guess the parenthetical remark is there to tell us that talking to oneself is not good practice, at least for MPI communication!

Will select() block if called while there is still data to be read?

If a socket has data to be read and the select() function is called, will select():
Return immediately, indicating the socket is ready for reading, or
Block until more data is received on the socket?
It can easily be tested, but I assure you select() will never block if there is data already available to read on one of the readfds. If it did block in that case, it wouldn't be very useful for programming with non-blocking I/O. Consider what would happen if it did: you loop on select(), see that there is data to be read, and read it. While you are processing that data, more data comes in. When you return to select(), it would block waiting for yet more data to arrive, even though unread data is already sitting in the buffer. Meanwhile your peer on the other side of the connection is waiting for a response to the data it already sent, so your program ends up blocking forever. You could work around that with timeouts and such, but the whole point of select() is to make non-blocking I/O efficient.
If an fd is at EOF, select() will never block even if called multiple times.
man 2 select seems to answer this question pretty directly:
select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.
So at least according to the manual, it would return immediately if there is any data available.
