How are you supposed to handle a spurious wakeup of a Parker?

According to the crossbeam::Parker documentation:
The park method blocks the current thread unless or until the token is available, at which point it automatically consumes the token. It may also return spuriously, without consuming the token.
How are you supposed to detect that a spurious wakeup occurred? Internally, it appears that the parker uses an atomic to track if the token has been consumed or not, but aside from the park and park_timeout methods, there doesn't seem to be a way to query its status.

You are supposed to handle it in some other manner. For example, if you are implementing an mpsc channel manually, your recv function might look something like this:
loop {
    // Check whether the thing we are waiting for is already available.
    if let Some(message) = self.try_recv() {
        return message;
    }
    // Not available yet: park until a sender unparks us (or spuriously).
    park();
}
In this case, if a spurious wake-up happens, the loop tries again to obtain the thing it is waiting for; since the wake-up was spurious, the thing is not available, and the loop just goes back to sleep. Once a send actually happens, the sender will unpark the receiver, at which point try_recv will succeed.
An example of such a channel implementation is available here (source); it uses a Condvar instead of parking the thread, but the idea is the same.
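To make the pattern concrete, here is a minimal, self-contained sketch of the same retry loop using crossbeam's Parker and Unparker (shown with the crossbeam_utils crate, which also re-exports Parker as crossbeam::sync::Parker). The Mutex-wrapped queue is a stand-in "channel" for illustration, not crossbeam's actual implementation:

use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

use crossbeam_utils::sync::Parker;

fn main() {
    let queue = Arc::new(Mutex::new(VecDeque::new()));
    let parker = Parker::new();
    let unparker = parker.unparker().clone();

    let producer_queue = Arc::clone(&queue);
    thread::spawn(move || {
        producer_queue.lock().unwrap().push_back(42);
        // Make the token available: if the consumer is parked it wakes up,
        // and if it is mid-check, its next park() returns immediately.
        unparker.unpark();
    });

    // A spurious wake-up simply falls through to the next iteration,
    // finds the queue empty, and parks again; no detection is needed.
    let message = loop {
        if let Some(message) = queue.lock().unwrap().pop_front() {
            break message;
        }
        parker.park();
    };
    println!("received {message}");
}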

This has been acknowledged as an issue in the relevant GitHub repository, and a pull request has been filed to fix it. Once that pull request is merged and released, I'll update this answer with the version that fixes the issue and mark this question as resolved.

Related

Rust concurrency question with SyncSender

I am new to Rust and trying to understand the Dining Philosophers code here:
https://google.github.io/comprehensive-rust/exercises/day-4/solutions-morning.html
By the time execution reaches the following lines in the main thread, isn't it possible that none of the spawned threads have started executing their logic, resulting in nothing in rx and the program simply quitting?
for thought in rx {
    println!("{}", thought);
}
When iterating over a channel, it internally calls Receiver::recv, where the documentation specifies
This function will always block the current thread if there is no data available and it’s possible for more data to be sent (at least one sender still exists). Once a message is sent to the corresponding Sender (or SyncSender), this receiver will wake up and return that message.
So the receiver will block until it has data available, or until all the senders have been dropped.
Yes, execution can reach for thought in rx { ... } before the threads have even started. However, this will still work because iterating over a Receiver will wait until there is a message and will only stop if all Senders have been destroyed (ergo it is no longer possible to receive any messages).
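Here is a minimal sketch (not the Comprehensive Rust solution itself; the sleep and messages are made up for illustration) demonstrating that behavior:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();

    for i in 0..3 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Even if the main thread reaches its `for` loop long before
            // this thread runs, the message below is not lost.
            thread::sleep(Duration::from_millis(100));
            tx.send(format!("thought {i}")).unwrap();
        });
    }
    // Drop the original Sender so the loop below can terminate once
    // every clone held by the spawned threads is gone.
    drop(tx);

    // Each iteration blocks in recv() until a message arrives; the loop
    // ends only when all Senders have been dropped.
    for thought in rx {
        println!("{thought}");
    }
}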

Is AsyncReadExt::read_u64 cancel safe?

In the documentation for AsyncReadExt::read_u64 it says it has the same errors as AsyncReadExt::read_exact, but says nothing about cancellation safety.
The same holds for all the other read_<type> functions on AsyncReadExt.
It seems likely that they have the same cancellation safety as read_exact (that is, none), but is that true?
Is there another way to read the next 8 bytes in a cancel safe way?
There's stuff in Tokio that covers my use case at a higher level, but I'd like to know how I would do this myself.
No, it's not cancel safe.
While the implementations of read_exact and the read_* functions differ, they do the exact same thing:
Poll the underlying AsyncRead into a buffer, propagating errors appropriately.
If the reader returns Poll::Pending, propagate that.
If the buffer is full, return Ok(()).
If the buffer isn't full, repeat the whole thing over again.
If the future is canceled after some bytes have been read, it leaves the reader in an unknown state, thereby rendering these methods not cancel safe.
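For instance, here is a sketch of how that bites in practice with a timeout (assumes Tokio; the stream type and duration are illustrative):

use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;
use tokio::time::{sleep, Duration};

async fn demo(stream: &mut TcpStream) -> std::io::Result<()> {
    tokio::select! {
        value = stream.read_u64() => {
            println!("got {}", value?);
        }
        _ = sleep(Duration::from_secs(1)) => {
            // read_u64's future is dropped here. It may already have
            // consumed 1..=7 bytes; they are lost, and the next read
            // starts mid-integer, desynchronizing the stream.
            println!("timed out");
        }
    }
    Ok(())
}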
Edit: making these methods cancel safe is difficult. The only way to do it is to rewrite them to do one of two things: when the future is dropped, somehow communicate the internal state to a listener on the outside, probably via a channel; or have the future somehow run itself to completion when it's dropped. It would be preferable to rewrite the surrounding code to not depend on cancel safety.
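One practical workaround is to keep the partial-read state outside the future, so cancellation cannot lose it. Below is a sketch of that idea, assuming Tokio; read_u64_resumable is a hypothetical helper, not a Tokio API. It relies on AsyncReadExt::read, whose documentation does state it is cancel safe:

use tokio::io::{AsyncRead, AsyncReadExt};

// Hypothetical helper: reads 8 big-endian bytes "resumably". The caller
// owns `buf` and `filled`, so progress survives cancellation (e.g. if
// this future is dropped by a select! arm) and the next call resumes.
async fn read_u64_resumable<R: AsyncRead + Unpin>(
    reader: &mut R,
    buf: &mut [u8; 8],
    filled: &mut usize,
) -> std::io::Result<u64> {
    while *filled < 8 {
        // `read` is cancel safe: if this await is canceled, no bytes were
        // consumed, and `*filled` still records the earlier progress.
        let n = reader.read(&mut buf[*filled..]).await?;
        if n == 0 {
            return Err(std::io::ErrorKind::UnexpectedEof.into());
        }
        *filled += n;
    }
    *filled = 0; // reset so the next call starts a fresh integer
    Ok(u64::from_be_bytes(*buf))
}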

Do AsyncIO stream writers/readers require manually ensuring that all data is sent/received?

When dealing with sockets, you need to make sure that all data is sent/received, since you may receive incomplete chunks of data when reading. From the docs:
In general, they return when the associated network buffers have been filled (send) or emptied (recv). They then tell you how many bytes they handled. It is your responsibility to call them again until your message has been completely dealt with.
Emphasis mine. It then shows sample implementations that ensure all data has been handled in each direction.
Is the same true though when dealing with AsyncIO wrappers over sockets?
For read, it seems to be required, as the docs mention that it "[reads] up to n bytes".
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
Is this correct? Do I need to check how much was read using read, but can just drain the StreamWriter and know that everything was sent?
I thought that my above assumptions were correct until I had a look at the example TCP client immediately below the method docs:
import asyncio

async def tcp_echo_client(message):
    reader, writer = await asyncio.open_connection(
        '127.0.0.1', 8888)

    print(f'Send: {message!r}')
    writer.write(message.encode())

    data = await reader.read(100)
    print(f'Received: {data.decode()!r}')

    print('Close the connection')
    writer.close()

asyncio.run(tcp_echo_client('Hello World!'))
And it doesn't do any kind of checking. It assumes everything is both read and written the first time.
For read, [checking for an incomplete read] seems to be required, as the docs mention that it "[reads] up to n bytes".
Correct, and this is a useful feature for many kinds of processing, as it allows you to read new data as it arrives from the peer and process it incrementally, without having to know how much to expect at any point. If you do know exactly how much you expect and need to read that amount of bytes, you can use readexactly.
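For example, here is a sketch of a hypothetical length-prefixed protocol (a 4-byte big-endian length followed by the payload), where readexactly fits naturally:

import asyncio

async def read_frame(reader: asyncio.StreamReader) -> bytes:
    # readexactly() keeps reading until exactly n bytes have arrived,
    # raising IncompleteReadError if the peer closes early.
    header = await reader.readexactly(4)
    length = int.from_bytes(header, 'big')
    return await reader.readexactly(length)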
For write though, it seems like as long as you call drain afterwards, you know that it's all sent. The docs don't explicitly say that it must be called repeatedly, and write doesn't return anything.
This is partially correct. Yes, asyncio will automatically keep writing the data you give it in the background until all is written, so you don't need to (nor can you) ensure it by checking the return value of write.
However, a sequence of stream.write(data); await stream.drain() will not pause the coroutine until all data has been transmitted to the OS. This is because drain doesn't wait for all data to be written, it only waits until it hits a "low watermark", trying to ensure (misguidedly according to some) that the buffer never becomes empty as long as there are new writes. As far as I know, in current asyncio there is no way to wait until all data has been sent - except for manually tweaking the watermarks, which is inconvenient and which the documentation warns against. The same applies to awaiting the return value of write() introduced in Python 3.8.
This is not as bad as it sounds simply because a successful write itself doesn't guarantee that the data was actually transmitted to, let alone received by the peer - it could be languishing in the socket buffer, or in network equipment along the way. But as long as you can rely on the system to send out the data you gave it as fast as possible, you don't really care whether some of it is in an asyncio buffer or in a kernel buffer. (But you still need to await drain() to ensure backpressure.)
The one time you do care is when you are about to exit the program or the event loop; in that case, a portion of the data being stuck in an asyncio buffer means that the peer will never see it. This is why, starting with 3.7, asyncio provides a wait_closed() method which you can await after calling close() to ensure that all the data has been sent. One could imagine a flush() method that does the same, but without having to actually close the socket (analogous to the method of the same name on file objects, and with equivalent semantics), but currently there are no plans to add it.
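Putting that together, here is a minimal sketch of the sequence described above (the host and port are placeholders); drain() provides backpressure, while close() plus wait_closed() is what actually guarantees buffered data was handed off before teardown:

import asyncio

async def main():
    reader, writer = await asyncio.open_connection('127.0.0.1', 8888)

    # write() only queues the bytes; drain() applies backpressure but
    # may return while data is still sitting in asyncio's buffer.
    writer.write(b'important final message')
    await writer.drain()

    # Only close() followed by wait_closed() (Python >= 3.7) ensures
    # everything buffered was sent before we tear the connection down.
    writer.close()
    await writer.wait_closed()

asyncio.run(main())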

using MPI_Send_variable many times in a row before MPI_Recv_variable

To my current understanding, after calling MPI_Send, the calling thread should block until the variable is received, so my code below shouldn't work. However, I tried sending several variables in a row and receiving them gradually while doing operations on them and this still worked... See below. Can someone clarify step by step what is going on here?
MATLAB code (because I am using a MATLAB MEX wrapper for the MPI functions):
%send
if mpirank==0
    %arguments to MPI_Send_variable are (variable, destination, tag)
    MPI_Send_variable(x,0,'A_22') %thread 0 should block here!
    MPI_Send_variable(y,0,'A_12')
    MPI_Send_variable(z,1,'A_11')
    MPI_Send_variable(w,1,'A_21')
end
%receive
if mpirank==0
    %arguments to MPI_Recv_variable are (source, tag)
    a=MPI_Recv_variable(0,'A_12')*MPI_Recv_variable(0,'A_22');
end
if mpirank==1
    c=MPI_Recv_variable(0,'A_21')*MPI_Recv_variable(0,'A_22');
end
MPI_SEND is a blocking call only in the sense that it blocks until it is safe for the user to reuse the buffer provided to it. The important text to read here is Section 3.4 of the MPI standard:
The send call described in Section 3.2.1 uses the standard communication mode. In this mode, it is up to MPI to decide whether outgoing messages will be buffered. MPI may buffer outgoing messages. In such a case, the send call may complete before a matching receive is invoked. On the other hand, buffer space may be unavailable, or MPI may choose not to buffer outgoing messages, for performance reasons. In this case, the send call will not complete until a matching receive has been posted, and the data has been moved to the receiver.
The part you're running up against is that MPI may buffer outgoing messages, in which case the send call may complete before a matching receive is invoked. If your message is sufficiently small (and there are sufficiently few of them), MPI will copy your send buffers into an internal buffer and keep track of things internally until the message has been received remotely. There's no guarantee that when MPI_SEND is done, the message has been received.
On the other hand, if you do want to know that the message was actually received, you can use MPI_SSEND. That function will synchronize (hence the extra S) both sides before allowing them to return from the MPI_SSEND and the matching receive call on the other end.
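A quick sketch of the difference (x, dest, and tag are placeholders):

/* Standard mode: may return as soon as the data is safely buffered,
 * possibly long before any matching receive is posted. */
MPI_Send(&x, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);

/* Synchronous mode: returns only once the matching receive has been
 * posted and started, so completion implies the rendezvous happened. */
MPI_Ssend(&x, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);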
In a correct MPI program, you cannot do a blocking send to yourself without first posting a nonblocking receive. So a correct version of your program would look something like this:
/* Sketch with placeholder variables a, b, x, y and tags TAG1, TAG2;
 * `rank` is this process's own rank. Post the receives first... */
MPI_Request req1, req2;
MPI_Irecv(&a, 1, MPI_DOUBLE, rank, TAG1, MPI_COMM_WORLD, &req1);
MPI_Irecv(&b, 1, MPI_DOUBLE, rank, TAG2, MPI_COMM_WORLD, &req2);
/* ...so the blocking sends to self have a matching receive waiting. */
MPI_Send(&x, 1, MPI_DOUBLE, rank, TAG1, MPI_COMM_WORLD);
MPI_Send(&y, 1, MPI_DOUBLE, rank, TAG2, MPI_COMM_WORLD);
MPI_Wait(&req1, MPI_STATUS_IGNORE);
/* do work */
MPI_Wait(&req2, MPI_STATUS_IGNORE);
/* do more work */
Your code is technically incorrect, but it works because the MPI implementation is using internal buffers to hold your send data before it is transmitted to the receiver (or matched to the later receive operation, in the case of self sends). An MPI implementation is not required to have such buffers (generally called "eager buffers"), but most implementations do.
Since the data you are sending is small, the eager buffers are generally sufficient to buffer them temporarily. If you send large enough data, the MPI implementation will not have enough eager buffer space and your program will deadlock. Try sending, for example, 10 MB instead of a double in your program to notice the deadlock.
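A self-contained sketch of that experiment (the 10 MB size is arbitrary; the actual eager limit is implementation- and transport-specific):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 10 * 1024 * 1024;  /* ~10 MB, well past typical eager limits */
    char *buf = malloc(n);

    /* A double-sized self-send would complete via the eager buffers, but
     * this large send blocks until a matching receive is posted, and the
     * receive below is never reached: deadlock. */
    MPI_Send(buf, n, MPI_CHAR, rank, 0, MPI_COMM_WORLD);
    MPI_Recv(buf, n, MPI_CHAR, rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    free(buf);
    MPI_Finalize();
    return 0;
}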
I assume that MPI_Send_variable() is just MPI_Send() underneath, and MPI_Recv_variable() is MPI_Recv().
How can a process ever receive a message that it sent to itself if both the send and the receive operations are blocking? Either the send to self or the receive from self must be nonblocking, or you will get a deadlock; otherwise sending to self would effectively be forbidden.
Following @Greginozemtsev's answer to "Is the behavior of MPI communication of a rank with itself well-defined?", the MPI standard states that send to self and receive from self are allowed. I guess it implies that it's nonblocking in this particular case.
In MPI 3.0, Section 3.2.4 (Blocking Receive), page 59, the words have not changed since MPI 1.1:
Source = destination is allowed, that is, a process can send a message to itself. (However, it is unsafe to do so with the blocking send and receive operations described above, since this may lead to deadlock. See Section 3.5.)
I read Section 3.5, but it's not clear enough for me...
I guess the parenthetical is there to tell us that talking to oneself is not good practice, at least for MPI communications!

winapi apc function parameter passing - what is the best practice

Hi, I am using WinAPI's QueueUserAPC to invoke an APC function call in another thread.
My question is: what is the best practice for passing a parameter to it, with respect to object lifetime and allocation/deallocation responsibility?
DWORD WINAPI QueueUserAPC(PAPCFUNC pfnAPC, HANDLE hThread, ULONG_PTR dwData);
I am using dwData to pass a pointer to some data, and I was wondering how I should handle it.
I need to make sure that it lives until the receiving thread has finished using it.
Should I use a smart pointer to make sure the data is deallocated when it is no longer used?
I guess that allocating in the calling thread and deallocating in the receiving one is possible, but probably not such a good thing.
Is there anything else that can be done?
I would also like to avoid synchronization between the two threads just to signal that the receiving thread is done with the data...
Thanks!
Allocating in the sending thread and deallocating in the receiving one is easy, but it has one main drawback: it may leak. Even if you handle the sending failure, the receiving thread may finish before it has a chance to execute the APC.
Probably the easiest way to avoid the leak is to create a queue for sent data (maybe a queue per thread), and when a thread finishes, traverse its queue and free all the pending data.
But as usual, the devil is in the details...
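For the simple alloc-in-sender / free-in-receiver variant, here is a minimal sketch (Payload, ApcRoutine, and SendPayload are made-up names); note that it still has the leak described above if the target thread exits before entering an alertable wait:

#include <windows.h>
#include <memory>
#include <string>

struct Payload {
    std::string text;
};

// APC routine: runs in the target thread the next time it enters an
// alertable wait (SleepEx, WaitForSingleObjectEx, ...). Reclaiming the
// pointer into a unique_ptr guarantees the deallocation happens here.
void CALLBACK ApcRoutine(ULONG_PTR dwData) {
    std::unique_ptr<Payload> payload(reinterpret_cast<Payload*>(dwData));
    // ... use payload->text ...
}

// Sender side: release ownership only after queuing succeeds, so a
// failed QueueUserAPC frees the allocation on return.
bool SendPayload(HANDLE hThread, std::string text) {
    auto payload = std::make_unique<Payload>();
    payload->text = std::move(text);
    if (QueueUserAPC(ApcRoutine, hThread,
                     reinterpret_cast<ULONG_PTR>(payload.get())) == 0) {
        return false;  // not queued; the unique_ptr frees the payload
    }
    payload.release();  // ownership transferred to the APC
    return true;
}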
