Can I clone a future? - rust

I want to write some generic retry logic for a future.
I know the concrete return type and want to retry the same future.
My code only has access to the future - I do not want to wrap every fn call site in a closure to enable recreating it.
It seems that a "future" is a combination of (fn, args), and when .await is called, it runs and waits for the result in place.
If I am able to clone all of the args, would it be possible to create a clone of the not-started future to retry it if it fails the first time?

The problem is that a not-yet-started future is the same type as a future that has already started - the future transforms itself in-place. So while in theory a Future could be Clone, that would place severe constraints on the state it's allowed to keep during its whole lifetime. For futures implemented with async fn not only would the initial state (the parameters passed to async fn) have to be Clone, but also so would all the local variables that cross .await points.
A simple experiment shows that the current async doesn't auto-implement Clone the way it does e.g. Send, even for async functions where that would be safe. For example:
use std::future::Future;
// A retry helper that would need to clone the future it is given.
async fn retry(f: impl Future + Clone) {
    todo!()
}
fn main() {
    // fails to compile:
    retry(async {});
    // ^^^^^^^^ the trait `Clone` is not implemented for `impl Future`
}
"I do not want to wrap every fn call site in a closure to enable recreating it."
In this situation that's probably exactly what you need to do. Or use some sort of macro if the closure requires too much boilerplate.
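The closure approach can be sketched with the standard library alone. This is a minimal illustration, not a production retry policy: the `retry` signature, the `max_attempts` parameter, the tiny `block_on` helper and the flaky operation in `demo` are all assumptions made for the example.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical helper: calls `make_future` to build a fresh future for each
/// attempt and returns the first `Ok`, or the last `Err` once `max_attempts`
/// attempts have been made. At least one attempt is always made.
async fn retry<F, Fut, T, E>(mut make_future: F, max_attempts: usize) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, E>>,
{
    let mut attempt = 1;
    loop {
        match make_future().await {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => attempt += 1,
        }
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor, only here so the example runs without a runtime crate.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical flaky operation: the first two attempts fail, the third succeeds.
fn demo() -> (Result<u32, String>, u32) {
    let calls = Cell::new(0u32);
    let result = block_on(retry(
        || {
            calls.set(calls.get() + 1);
            let n = calls.get();
            async move {
                if n < 3 {
                    Err(format!("attempt {n} failed"))
                } else {
                    Ok(n)
                }
            }
        },
        5,
    ));
    (result, calls.get())
}

fn main() {
    let (result, calls) = demo();
    println!("result: {result:?} after {calls} calls");
}
```

The closure plays the role of the (fn, args) pair from the question: each call produces a new, not-yet-started future, so nothing has to be cloned.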

A Future can be cloned via https://docs.rs/futures/latest/futures/future/trait.FutureExt.html#method.shared. This is useful to pass the future to multiple consumers, but not suitable for retry.
To have retries with Futures, you need some kind of Future factory, to create a new Future for a retry when an error occurs. Ideally this retry mechanism would be wrapped in its own Future, to hide the complexity for consumers.
There's a crate which does that already: https://docs.rs/futures-retry/latest/futures_retry/struct.FutureRetry.html

Related

Does the Future trait implementation "force" you to break noalias in rust?

I was thinking about the Rust async infrastructure, and at the heart of the API lies the Future trait, which is:
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}
According to the docs
The core method of future, poll, attempts to resolve the future into a final value. This method does not block if the value is not ready. Instead, the current task is scheduled to be woken up when it’s possible to make further progress by polling again. The context passed to the poll method can provide a Waker, which is a handle for waking up the current task.
This implies that the Context value passed to .poll() (particularly the waker) needs some way to refer to the pinned future &mut self in order to wake it up. However &mut implies that the reference is not aliased. Am I misunderstanding something or is this a "special case" where aliasing &mut is allowed? If the latter, are there other such special cases besides UnsafeCell?
"needs some way to refer to the pinned future &mut self in order to wake it up."
No, it doesn't. The key thing to understand is that a “task” is not the future: it is the future and what the executor knows about it. What exactly the waker mutates is up to the executor, but it must be something that isn't the Future. I say “must” not just because of Rust mutability rules, but because Futures don't contain any state that says whether they have been woken. So, there isn't anything there to usefully mutate; 100% of the bytes of the future's memory are dedicated to the specific Future implementation and none of the executor's business.
Well, on the very next page of the book you will notice that the task contains a boxed future and the waker is created from a reference to the task. So there is a reference to the future held by the task, albeit an indirect one.
OK, let's look at those data structures. Condensed:
struct Executor {
    ready_queue: Receiver<Arc<Task>>,
}
struct Task {
    future: Mutex<Option<BoxFuture<'static, ()>>>,
    task_sender: SyncSender<Arc<Task>>,
}
The reference to the task is an Arc<Task>, and the future itself is inside a Mutex (interior mutability) in the task. Therefore:
- It is not possible to get an &mut Task from the Arc<Task>, because Arc doesn't allow that.
- The future is in a Mutex, which checks at run time that there is at most one mutable reference to it.
The only things you can do with an Arc<Task> are:
- clone it and send it;
- get & access to the future in a Mutex (which allows requesting run-time-checked mutation access to the Future);
- get & access to the task_sender (which allows sending things to ready_queue).
So, in this case, when the waker is called, it sort-of doesn't even mutate anything specific to the Task at all: it makes a clone of the Arc<Task> (which increments an atomic reference count stored next to the Task) and puts it on the ready_queue (which mutates storage shared between the Sender and Receiver).
Another executor might indeed have task-specific state in the Task that is mutated, such as a flag marking that the task is already woken and doesn't need to be woken again. That flag might be stored in an AtomicBool field in the task. But still, it does not alias with any &mut of the Future, because it's not part of the Future but of the task.
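To make the flag idea concrete, here is a small sketch using only the standard library. `TaskState`, `YieldOnce` and `poll_twice` are hypothetical names invented for this example and do not come from any executor; the point is only that the waker mutates executor-side state while the future is being polled through its own exclusive reference.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical per-task state owned by the executor, not by the future.
struct TaskState {
    woken: AtomicBool,
}

impl Wake for TaskState {
    fn wake(self: Arc<Self>) {
        // Only the executor-side flag is touched; the future is never accessed.
        self.woken.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical future: asks to be woken and returns Pending once, then Ready.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.yielded {
            Poll::Ready(7)
        } else {
            self.yielded = true;
            // The waker is invoked while `poll` holds the only &mut to the future.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Polls the future twice by hand and reports whether the flag was set.
fn poll_twice() -> (bool, u32) {
    let state = Arc::new(TaskState { woken: AtomicBool::new(false) });
    let waker = Waker::from(Arc::clone(&state));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(YieldOnce { yielded: false });

    let first = fut.as_mut().poll(&mut cx);
    assert!(first.is_pending());
    // Read and clear the flag, as an executor might before polling again.
    let was_woken = state.woken.swap(false, Ordering::SeqCst);

    let second = fut.as_mut().poll(&mut cx);
    match second {
        Poll::Ready(value) => (was_woken, value),
        Poll::Pending => unreachable!("YieldOnce is ready on its second poll"),
    }
}

fn main() {
    let (was_woken, value) = poll_twice();
    println!("woken flag set during first poll: {was_woken}, output: {value}");
}
```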
All that said, there actually is something special about Futures and noalias — but it's not about executors, it's about Pin. Pinning explicitly allows the pinned type to contain “self-referential” pointers into itself, so Rust does not declare noalias for Pin<&mut T>. However, exactly what the language rules around this are is still not quite rigorously specified; the current situation is just considered a kludge so that async functions can be correctly compiled, I think.
There is no such special case. Judging from your comment about the Rust executor, you are misunderstanding how interior mutability works.
The example in the Rust book uses an Arc wrapped Task structure, with the future contained in a Mutex. When the task is run, it locks the mutex and obtains the singular &mut reference that's allowed to exist.
Now look at how the example implements wake_by_ref, and notice how it never touches the future at all. Indeed, that function would not be able to lock the future, as the upper level already holds the lock. Since it could not safely obtain a &mut reference, the mutex prevents it from getting one, and there is no aliasing issue.
The restriction for UnsafeCell and its wrappers is that only one &mut reference may exist for an object at any point in time. However, multiple & immutable references may exist to the UnsafeCell or structures containing it just fine - that is the point of interior mutability.
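The run-time check can be observed directly with a standard-library Mutex. In this sketch (the `demo` function and its values are made up for illustration), two shared handles point at the same data, and the second handle is refused access while the first one holds the lock:

```rust
use std::sync::{Arc, Mutex, TryLockError};

/// Returns whether a second lock attempt was refused while the first guard
/// was alive, and the value seen after the guard was released.
fn demo() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(0_i32));
    let other = Arc::clone(&shared); // a second shared (&) path to the same data

    let blocked;
    {
        // The guard is the single route to mutable access at this moment.
        let mut guard = shared.lock().unwrap();
        *guard += 1;
        // A second attempt through the other handle is refused instead of aliasing.
        blocked = matches!(other.try_lock(), Err(TryLockError::WouldBlock));
    }

    // Once the guard is dropped, the other handle can lock normally.
    let value = *other.lock().unwrap();
    (blocked, value)
}

fn main() {
    let (blocked, value) = demo();
    println!("second lock refused: {blocked}, value afterwards: {value}");
}
```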

Is there a rust feature for async analogous to the recv_timeout function?

I'm trying to call an async function inside non-async context, and I'm having a really hard time with it.
Channels have been far easier to use for me - it's pretty simple and intuitive.
recv means block the thread until you receive something.
try_recv means see if something's there, otherwise error out.
recv_timeout means try for a certain amount of milliseconds, and then error out if nothing's there after the timeout.
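For reference, the three channel calls described above behave like this with std::sync::mpsc. This is a small sketch; the 50 ms timeout and the value 42 are arbitrary choices for the example.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, TryRecvError};
use std::thread;
use std::time::Duration;

/// Exercises try_recv, recv_timeout and recv on one channel.
fn demo() -> (bool, bool, i32) {
    let (tx, rx) = mpsc::channel::<i32>();

    // Nothing has been sent yet: try_recv returns immediately with Empty.
    let empty = matches!(rx.try_recv(), Err(TryRecvError::Empty));

    // recv_timeout waits up to the given Duration, then reports Timeout.
    let timed_out = matches!(
        rx.recv_timeout(Duration::from_millis(50)),
        Err(RecvTimeoutError::Timeout)
    );

    // recv blocks the calling thread until a value arrives.
    let sender = thread::spawn(move || tx.send(42).unwrap());
    let value = rx.recv().unwrap();
    sender.join().unwrap();

    (empty, timed_out, value)
}

fn main() {
    let (empty, timed_out, value) = demo();
    println!("empty: {empty}, timed out: {timed_out}, received: {value}");
}
```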
I've been looking all over in the documentation of std::future::Future, but I don't see any way to do something similar. None of the functions that I've tried are simple solutions, and they all either take or give weird results that require even more unwrapping.
The Future trait in the standard library is very rudimentary and provides a stable foundation for others to build on.
Async runtimes (such as tokio, async-std, smol) include combinators that can take a future and turn it into another future. The tokio library has one such combinator called timeout.
Here is an example, which times out after 1 second while attempting to receive on a oneshot channel.
use std::time::Duration;
use tokio::{runtime::Runtime, sync::oneshot, time::{timeout, error::Elapsed}};
fn main() {
    // Create a oneshot channel; its receiver implements `Future`, so we can use it as an example.
    let (_tx, rx) = oneshot::channel::<()>();
    // Create a new tokio runtime. If you already have an async environment,
    // you probably want to use tokio::spawn instead in order to re-use the existing runtime.
    let rt = Runtime::new().unwrap();
    // block_on is a function on the runtime which makes the current thread poll the future
    // until it has completed. async move { } creates an async block, which implements `Future`.
    let output: Result<_, Elapsed> = rt.block_on(async move {
        // The timeout function returns another future, which outputs a `Result<T, Elapsed>`. If the
        // future times out, the `Elapsed` error is returned.
        timeout(Duration::from_secs(1), rx).await
    });
    println!("{:?}", output);
}

How to move closures forever

I'm designing a little struct that runs closures for me and I can set them to stop:
pub fn run(&self, f: Box<dyn Fn()>) {
    let should_continue = self.should_continue.clone();
    self.run_thread = Some(std::thread::spawn(move || {
        while should_continue.load(Ordering::Relaxed) {
            // f should run fast so that `should_continue` is read frequently
            f();
        }
    }));
}
As you can see, I'm passing the Fn in a Box, which gives me an error about the Box not being shareable between threads. Actually, I don't care about the closure once I pass it to run, so I wanted to move the closure into this function, since I won't use it anymore. I cannot mark the Fn as Send because the f that I'm actually going to pass does not implement Send.
So, how can I move a closure completely?
//move this closure to inside of run
self.run(||{});
Having a buildable reproduction case rather than code with random unprovided dependencies is useful so here's what I understand of your code.
The error I get is that the dyn Fn cannot be sent between threads, which is very different from shared: while there are many things which cannot be shared (Sync) between threads (they can only be used from one thread at a time), there are also things which must remain on their original thread at all times. Rc, for instance, is not Send: because it is not a thread-safe reference-counted pointer, sending an Rc to a different thread would break its guarantees, so that is not allowed.
dyn Fn is opaque and offers no real guarantee as to what it's doing internally except for, well, being callable multiple times. So as far as the compiler is concerned it could close over something which isn't Send (e.g. a reference to a !Sync type, or an Rc, ...), which means the compiler assumes the Fn isn't Send either.
The solution is simple: just define f: Box<dyn Fn() + Send>, this way within run you guarantee that the function can, in fact, be sent between threads; and the caller to run will get an error if they're trying to send a function which can not be sent.
In the demo, run_ok uses a trivial closure, so there is no issue with sending it over. run_not_ok closes over an Rc, so the function doesn't compile (just uncomment it to see). run_ok2 is the same function as run_not_ok using an Arc instead of the Rc, and compiles fine.
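The demo code itself is not reproduced here. The following is a hedged reconstruction of what such a demo might look like: the `Runner` struct, its `new` and `stop` methods, and the bodies of `run_ok`, `run_ok2` and `run_not_ok` are assumptions, and `run` takes `&mut self` so that it can store the thread handle.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical reconstruction of the struct from the question.
struct Runner {
    should_continue: Arc<AtomicBool>,
    run_thread: Option<JoinHandle<()>>,
}

impl Runner {
    fn new() -> Self {
        Runner {
            should_continue: Arc::new(AtomicBool::new(true)),
            run_thread: None,
        }
    }

    /// `+ Send` promises that the boxed closure may be moved to another thread.
    fn run(&mut self, f: Box<dyn Fn() + Send>) {
        let should_continue = Arc::clone(&self.should_continue);
        self.run_thread = Some(thread::spawn(move || {
            while should_continue.load(Ordering::Relaxed) {
                f();
            }
        }));
    }

    /// Assumed helper: clear the flag and wait for the worker to finish.
    fn stop(&mut self) {
        self.should_continue.store(false, Ordering::Relaxed);
        if let Some(handle) = self.run_thread.take() {
            handle.join().unwrap();
        }
    }
}

/// A trivial closure captures nothing, so it is `Send`.
fn run_ok() {
    let mut runner = Runner::new();
    runner.run(Box::new(|| {}));
    runner.stop();
}

/// Closing over an `Arc` is fine, because `Arc<AtomicUsize>` is `Send`.
fn run_ok2() -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let seen = Arc::clone(&counter);
    let mut runner = Runner::new();
    runner.run(Box::new(move || {
        counter.fetch_add(1, Ordering::Relaxed);
    }));
    // Wait until the worker has called the closure at least once.
    while seen.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    runner.stop();
    seen.load(Ordering::Relaxed)
}

// Uncommenting this should fail to compile, because `Rc` is not `Send`:
// fn run_not_ok() {
//     let rc = std::rc::Rc::new(0);
//     let mut runner = Runner::new();
//     runner.run(Box::new(move || println!("{rc}")));
// }

fn main() {
    run_ok();
    println!("closure ran {} times", run_ok2());
}
```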

Is there a way to spawn a thread with a specified lifetime in Rust? [duplicate]

This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed 2 years ago.
I'm quite new to Rust, so I've encountered a few things I'm not used to. One issue that's causing me some grief is related to threads.
I would like to spawn a thread that executes a struct's method, but I'm not able to because the method needs to have a 'static lifetime. I'd prefer the method (and by extension the struct) didn't have a 'static lifetime.
If I know for certain that the thread will exit before the struct's instantiated value is dropped, is there a way to communicate this with Rust? In other words, can I tell Rust that I can guarantee the value will not have been dropped until after the thread exits? Or perhaps is there a way to pass a lifetime to a thread?
If none of this is possible, what can be done instead? I've looked into making the code asynchronous instead, but haven't had any success in fixing the issues described above.
If the method and struct must have a 'static lifetime, how might I go about appropriately specifying this?
Here's a simplified example of the problem:
pub struct Thing {
    value: i32,
}
impl Thing {
    pub fn new(value: i32) -> Thing {
        Thing {
            value,
        }
    }
    fn in_thread(&self) {
        println!("in thread");
        // do things that block the thread
    }
    pub fn spawn_thread(&self) {
        std::thread::spawn(move || {
            self.in_thread(); // problem occurs here
        });
    }
}
"If none of this is possible, what can be done instead? I've looked into making the code asynchronous instead, but haven't had any success in fixing the issues described above."
I wouldn't recommend passing data to other threads via references. Instead, try to design your program so that the thread can own the data. You can do this either by moving the data into the thread when spawning it or by passing the data via a channel.
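Both options can be sketched against the Thing type from the question. This is only an illustration: changing `spawn_thread` to take `self` by value, returning a value from the thread, and the worker that sums values are assumptions made for the example.

```rust
use std::sync::mpsc;
use std::thread::{self, JoinHandle};

pub struct Thing {
    value: i32,
}

impl Thing {
    pub fn new(value: i32) -> Thing {
        Thing { value }
    }

    fn in_thread(&self) -> i32 {
        // Stand-in for work that blocks the thread.
        self.value * 2
    }

    /// Takes `self` by value, so the spawned thread owns the Thing and no
    /// 'static reference is needed.
    pub fn spawn_thread(self) -> JoinHandle<i32> {
        thread::spawn(move || self.in_thread())
    }
}

fn demo() -> (i32, i32) {
    // Option 1: move the data into the thread when spawning it.
    let handle = Thing::new(21).spawn_thread();
    let doubled = handle.join().unwrap();

    // Option 2: pass owned data to a worker thread through a channel.
    let (tx, rx) = mpsc::channel::<Thing>();
    let worker = thread::spawn(move || rx.iter().map(|thing| thing.value).sum::<i32>());
    tx.send(Thing::new(1)).unwrap();
    tx.send(Thing::new(2)).unwrap();
    drop(tx); // closing the channel ends the worker's loop
    let sum = worker.join().unwrap();

    (doubled, sum)
}

fn main() {
    let (doubled, sum) = demo();
    println!("doubled: {doubled}, sum: {sum}");
}
```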

Do you always need an async fn main() if using async in Rust?

I'm researching and playing with Rust's async/.await to write a service in Rust that will pull from some websockets and do something with that data. A colleague of mine (who did this similar "data feed importing" in C#) has told me to handle these feeds asynchronously, since threads would be bad performance-wise.
It's my understanding that, to do any async in Rust, you need a runtime (e.g. Tokio). After inspecting most code I've found on the subject it seems that a prerequisite is to have a:
#[tokio::main]
async fn main() {
    // ...
}
which provides the necessary runtime which manages our async code. I came to this conclusion because you cannot use .await in scopes which are not async functions or blocks.
This leads me to my main question: if intending to use async in Rust, do you always need an async fn main() as described above? If so, how do you structure your synchronous code? Can structs have async methods and functions implemented (or should they even)?
All of this stems from my initial approach to writing this service, because the way I envisioned it is to have some sort of struct which would handle multiple websocket feeds and if they need to be done asynchronously, then by this logic, that struct would have to have async logic in it.
No. The #[tokio::main] is just a convenience feature which you can use to create a Tokio runtime and launch the main function inside it.
If you want to explicitly initialize a runtime instance, you can use the Builder. The runtime has a spawn method, which takes a future (for example an async block) and runs it inside the runtime, and a block_on method, which drives a future to completion on the current thread; neither requires the calling function to be async. This allows you to create a Tokio runtime anywhere in your non-async code.
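The general point, that a synchronous main can drive async code and that structs can have async methods, can be illustrated without any runtime crate. The sketch below is not Tokio: `block_on` is a deliberately tiny hand-written stand-in for a runtime, and `Feed` and `next_message` are hypothetical names. Real code would normally hand the future to the runtime instead.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical struct with an async method.
struct Feed {
    name: String,
}

impl Feed {
    async fn next_message(&self) -> String {
        format!("{}: hello", self.name)
    }
}

/// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Stand-in for a runtime: polls one future on the current thread until it completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// An ordinary synchronous function that runs async code to completion.
fn fetch_one() -> String {
    let feed = Feed { name: "prices".to_string() };
    let message = block_on(feed.next_message());
    message
}

// `main` is a plain function; no attribute macro or `async fn main` is involved.
fn main() {
    println!("{}", fetch_one());
}
```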
