How to do an "fire and forget" call in rust? - multithreading

What I would like to achieve using the Rust language is to execute a closure or method after X seconds (without waiting for it). Being more used to languages like C#, a simple solution would be to spawn a thread, sleep for X seconds, and execute whatever needs to be done.
I tried this:
fn fire_and_forget(&self) {
    thread::spawn(|| {
        sleep(Duration::from_secs(10));
        self.do_something();
    });
}
Obviously, this doesn't work. Depending on my context, I get errors like error[E0277]: cannot be sent between threads safely or error[E0759]: self has an anonymous lifetime '_ but it needs to satisfy a 'static lifetime requirement.
I hope there is a solution for this. The "context" I am referring to is quite specific: the fire_and_forget method will be called from the outside, so I am not sure I can change its signature to use generic lifetimes.
Any idea if there is a solution / workaround / unsafe way for this?

If you check out thread::spawn's signature, you'll see that the closure is required to have a 'static lifetime. This means that it must own all of its data and cannot borrow from its environment. This rule exists because there is nothing that guarantees the parent thread will outlive the newly spawned thread, and thus there's no guarantee that any references inside the thread::spawn closure stay valid.
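For reference, these are the relevant bounds in the standard library's declaration of thread::spawn; both the closure and its return value must be Send and 'static:

pub fn spawn<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T,
    F: Send + 'static,
    T: Send + 'static,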
There are a few work-arounds here, although they depend on the structure of self and your program logic. They all boil down to getting owned data in the end and moving that owned data into the closure.
Let's assume you want to capture self's state at the time of calling fire_and_forget and Self: Clone. In that case you can write the following:
let slf = self.clone(); // obtain an owned version of `self`
thread::spawn(move || { // `move` makes the closure take ownership of `slf`
    sleep(Duration::from_secs(10));
    slf.do_something() // borrows data from the closure, not the environment
});
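For completeness, here is a minimal, self-contained version of that approach; the Worker type, its name field, and the fixed ten-second delay are just stand-ins for whatever Self and timing you actually have, assuming Self: Clone:

use std::thread;
use std::time::Duration;

#[derive(Clone)]
struct Worker {
    name: String,
}

impl Worker {
    fn do_something(&self) {
        println!("{} fired after the delay", self.name);
    }

    fn fire_and_forget(&self) {
        let slf = self.clone(); // owned snapshot of the current state
        thread::spawn(move || {
            thread::sleep(Duration::from_secs(10));
            slf.do_something();
        });
    }
}

fn main() {
    Worker { name: "example".into() }.fire_and_forget();
    // Keep the process alive long enough for the demo; if main exits first,
    // the detached thread is simply torn down with the process.
    thread::sleep(Duration::from_secs(11));
}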
If Self doesn't implement Clone or it's not okay to use the old state, you'll need some kind of synchronization / shared access to its current state, e.g. achievable through Arc<Mutex<State>>. Then you'll have to clone the Arc, move it into the closure and call do_something with the shared reference to State.
struct State {}

struct Foo {
    state: Arc<Mutex<State>>,
}

impl Foo {
    fn do_something(state: &State) {}

    fn fire_and_forget(&self) {
        let owned_state = Arc::clone(&self.state); // obtain an owned handle to the state
        thread::spawn(move || { // `move` makes the closure take ownership of `owned_state`
            sleep(Duration::from_secs(10));
            let state = owned_state.lock().unwrap(); // get exclusive access to the state
            Self::do_something(&*state) // borrows data from the closure, not the environment
        });
    }
}
Ultimately the solution will depend on your specific requirements, but as a general rule, a closure can't be 'static if it borrows from its environment.

Related

Can the borrow checker know when an Arc is "released"? Can a 'static lifetime be granted temporarily?

I'm trying to speed up a computationally-heavy Rust function by making it concurrent using only the built-in thread support. In particular, I want to alternate between quick single-threaded phases (where the main thread has mutable access to a big structure) and concurrent phases (where many worker threads run with read-only access to the structure). I don't want to make extra copies of the structure or force it to be 'static. Where I'm having trouble is convincing the borrow checker that the worker threads have finished.
Ignoring the borrow checker, an Arc reference seems like it does all that is needed. The reference count in the Arc increases with the .clone() for each worker, then decreases as the workers conclude and I join all the worker threads. If (and only if) the Arc reference count is 1, it should be safe for the main thread to resume. The borrow checker, however, doesn't seem to know about Arc reference counts, and insists that my structure needs to be 'static.
Here's some sample code which works fine if I don't use threads, but won't compile if I switch the comments to enable the multi-threaded case.
struct BigStruct {
    data: Vec<usize>,
    // Lots more
}

pub fn main() {
    let ref_bigstruct = &mut BigStruct { data: Vec::new() };
    for i in 0..3 {
        ref_bigstruct.data.push(i); // Phase where main thread has write access
        run_threads(ref_bigstruct); // Phase where worker threads have read-only access
    }
}

fn run_threads(ref_bigstruct: &BigStruct) {
    let arc_bigstruct = Arc::new(ref_bigstruct);
    {
        let arc_clone_for_worker = arc_bigstruct.clone();
        // SINGLE-THREADED WORKS:
        worker_thread(arc_clone_for_worker);
        // MULTI-THREADED DOES NOT COMPILE:
        // let handle = thread::spawn(move || { worker_thread(arc_clone_for_worker); });
        // handle.join();
    }
    assert!(Arc::strong_count(&arc_bigstruct) == 1);
    println!("??? How can I tell the borrow checker that all borrows of ref_bigstruct are done?")
}

fn worker_thread(my_struct: Arc<&BigStruct>) {
    println!(" worker says len()={}", my_struct.data.len());
}
I'm still learning about Rust lifetimes, but what I think (fear?) I need is an operation that will take an ordinary (not 'static) reference to my structure and give me an Arc that I can clone into immutable references with a 'static lifetime for use by the workers. Once all the worker Arc references are dropped, the borrow checker needs to allow my thread-spawning function to return. For safety, I assume this would panic if the reference count is >1. While this seems like it would generally conform with Rust's safety requirements, I don't see how to do it.
The underlying problem is not that the borrow checker fails to follow Arc, and the solution is not to use Arc. The problem is that the borrow checker is unable to understand that the reason a thread must be 'static is that it may outlive the spawning thread, and thus that if you immediately .join() it, everything is fine.
And the solution is to use scoped threads, that is, threads that allow you to use non-'static data because they always immediately .join(), and thus the spawned thread cannot outlive the spawning thread. The problem is that there are no scoped threads in the standard library. Well, there are, however they're unstable.
So if you insist on not using crates, for some reason, you have no choice but to use unsafe code (don't, really). But if you can use external crates, then you can use the well-known crossbeam crate with its crossbeam::scope function, at least until std's scoped threads are stabilized.
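Here is a minimal sketch of the question's code using scoped threads, assuming a toolchain where std::thread::scope is available (it was stabilized in Rust 1.63); crossbeam::scope works the same way on older toolchains:

use std::thread;

struct BigStruct {
    data: Vec<usize>,
}

pub fn main() {
    let bigstruct = &mut BigStruct { data: Vec::new() };
    for i in 0..3 {
        bigstruct.data.push(i); // single-threaded phase: write access
        run_threads(bigstruct); // concurrent phase: shared read-only access
    }
}

fn run_threads(bigstruct: &BigStruct) {
    // Every thread spawned inside the scope is joined before `scope` returns,
    // so the compiler knows the borrow of `bigstruct` cannot escape:
    // no Arc, no 'static bound, no extra copies.
    thread::scope(|s| {
        for _ in 0..3 {
            s.spawn(move || worker_thread(bigstruct));
        }
    });
}

fn worker_thread(my_struct: &BigStruct) {
    println!(" worker says len()={}", my_struct.data.len());
}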
In Rust's Arc<T>, T is by definition immutable. This means that in order to use Arc to let threads access data that is going to change, you also need to wrap it in some type that provides interior mutability.
Rust provides a type that is especially suited for either a single writer or multiple parallel readers, called RwLock.
So for your simple example, this would probably look something like this:
use std::{sync::{Arc, RwLock}, thread};

struct BigStruct {
    data: Vec<usize>,
    // Lots more
}

pub fn main() {
    let arc_bigstruct = Arc::new(RwLock::new(BigStruct { data: Vec::new() }));
    for i in 0..3 {
        arc_bigstruct.write().unwrap().data.push(i); // Phase where main thread has write access
        run_threads(&arc_bigstruct); // Phase where worker threads have read-only access
    }
}

fn run_threads(ref_bigstruct: &Arc<RwLock<BigStruct>>) {
    {
        let arc_clone_for_worker = ref_bigstruct.clone();
        // MULTI-THREADED
        let handle = thread::spawn(move || { worker_thread(&arc_clone_for_worker); });
        handle.join().unwrap();
    }
    assert!(Arc::strong_count(ref_bigstruct) == 1);
}

fn worker_thread(my_struct: &Arc<RwLock<BigStruct>>) {
    println!(" worker says len()={}", my_struct.read().unwrap().data.len());
}
Which outputs
worker says len()=1
worker says len()=2
worker says len()=3
As for your question, the borrow checker does not know when an Arc is released, as far as I know. The references are counted at runtime.

Why doesn't Rayon require Arc<_>?

On page 465 of Programming Rust you can find this code and explanation (emphasis added by me):
use std::sync::Arc;

fn process_files_in_parallel(filenames: Vec<String>,
                             glossary: Arc<GigabyteMap>)
    -> io::Result<()>
{
    ...
    for worklist in worklists {
        // This call to .clone() only clones the Arc and bumps the
        // reference count. It does not clone the GigabyteMap.
        let glossary_for_child = glossary.clone();
        thread_handles.push(
            spawn(move || process_files(worklist, &glossary_for_child))
        );
    }
    ...
}
We have changed the type of glossary: to run the analysis in parallel, the caller must pass in an Arc<GigabyteMap>, a smart pointer to a GigabyteMap that’s been moved into the heap, by doing Arc::new(giga_map). When we call glossary.clone(), we are making a copy of the Arc smart pointer, not the whole GigabyteMap. This amounts to incrementing a reference count. With this change, the program compiles and runs, because it no longer depends on reference lifetimes. As long as any thread owns an Arc<GigabyteMap>, it will keep the map alive, even if the parent thread bails out early. There won’t be any data races, because data in an Arc is immutable.
In the next section they show this rewritten with Rayon,
extern crate rayon;
use rayon::prelude::*;

fn process_files_in_parallel(filenames: Vec<String>, glossary: &GigabyteMap)
    -> io::Result<()>
{
    filenames.par_iter()
        .map(|filename| process_file(filename, glossary))
        .reduce_with(|r1, r2| {
            if r1.is_err() { r1 } else { r2 }
        })
        .unwrap_or(Ok(()))
}
You can see in the section rewritten to use Rayon that it accepts &GigabyteMap rather than Arc<GigabyteMap>. They don't explain how this works though. Why doesn't Rayon require Arc<GigabyteMap>? How does Rayon get away with accepting a direct reference?
Rayon can guarantee that the parallel iterator does not outlive the current stack frame, unlike what I assume is thread::spawn in the first code example. Specifically, par_iter under the hood uses something like Rayon's scope function, which allows one to spawn a unit of work that is "attached" to the stack and is joined before the stack frame ends.
Because Rayon can guarantee (via lifetime bounds, from the user's perspective) that the tasks/threads are joined before the function calling par_iter exits, it can provide this API which is more ergonomic to use than the standard library's thread::spawn.
Rayon expands on this in the scope function's documentation.
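To make the guarantee concrete, here is a small sketch using rayon::scope directly; the GigabyteMap alias and the process_file stub are stand-ins for the book's types, not part of Rayon's API:

// Cargo.toml: rayon = "1"
use std::collections::HashMap;

// Stand-ins for the book's types, just so the sketch compiles.
type GigabyteMap = HashMap<String, String>;

fn process_file(filename: &str, _glossary: &GigabyteMap) {
    println!("processing {}", filename);
}

// rayon::scope joins every task spawned inside it before returning, so the
// tasks may borrow `glossary` from the caller's stack frame; no Arc and no
// 'static bound are required.
fn process_files_in_parallel(filenames: &[String], glossary: &GigabyteMap) {
    rayon::scope(|s| {
        for name in filenames {
            s.spawn(move |_| process_file(name, glossary));
        }
    });
}

fn main() {
    let glossary = GigabyteMap::new();
    let files = vec!["a.txt".to_string(), "b.txt".to_string()];
    process_files_in_parallel(&files, &glossary);
}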

How to defer lifetime checking to runtime

I'm trying to pass a non-static closure into tokio. Obviously this doesn't work. Is there a way to make sure the lifetimes are appropriate at runtime? Here's what I tried:
Attempt with Arc
In order to not pass the closure directly into tokio, I put it into the struct that manages our timers:
type Delays<'l, K: Eq + Hash + Debug + Copy + Send> = HashMap<K, Box<dyn FnOnce() + 'l + Send>>;

pub struct Timers<'l, K: Eq + Hash + Debug + Clone + Send> {
    delays: Arc<Mutex<Delays<'l, K>>>,
}
The impl for that struct lets us easily add and remove timers. My plan was to somehow pass a 'static closure into tokio, by only moving a Weak reference to the mutexed hashmap:
// remember handler function
delays.insert(key.clone(), Box::new(func));
// create a weak reference to the delay map to pass into the closure
let weak_handlers = Arc::downgrade(&self.delays);
// task that runs after a delay
let task = Delay::new(Instant::now() + delay)
    .map_err(|e| warn!("Tokio timer error: {}", e)) // Map the error type to ()
    .and_then(move |_| {
        // get the handler from the table, of which we have only a weak ref
        let handler = Weak::upgrade(&weak_handlers)
            .ok_or(())? // If the Arc was dropped, return an error, aborting the future
            .lock()
            .remove(&key)
            .ok_or(())?; // If the handler isn't there anymore, we can abort as well
        // call the handler
        handler();
        Ok(())
    });
So with the Weak we make sure that we abort if the hash table has been dropped.
It's important to know that the lifetime 'l is the same as that of the Timers struct, but how can I tell the compiler? Also, I think the real problem is that Weak<T>: 'static is not satisfied.
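For illustration, the Weak-upgrade-or-abort idea itself type-checks once the stored closures are 'static (i.e. the 'l parameter is dropped); here is a minimal sketch of just that pattern, with u32 keys and a plain thread standing in for the tokio timer:

use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};
use std::thread;
use std::time::Duration;

type Handlers = HashMap<u32, Box<dyn FnOnce() + Send + 'static>>;

struct Timers {
    delays: Arc<Mutex<Handlers>>,
}

impl Timers {
    fn add(&self, key: u32, delay: Duration, func: impl FnOnce() + Send + 'static) {
        // remember the handler function
        self.delays.lock().unwrap().insert(key, Box::new(func));
        // only a Weak reference is moved into the task, so the task does not
        // keep the handler table alive
        let weak_handlers: Weak<Mutex<Handlers>> = Arc::downgrade(&self.delays);
        thread::spawn(move || {
            thread::sleep(delay);
            // if the Timers struct was dropped in the meantime, upgrade()
            // returns None and the timer silently aborts
            if let Some(handlers) = weak_handlers.upgrade() {
                if let Some(handler) = handlers.lock().unwrap().remove(&key) {
                    handler();
                }
            }
        });
    }
}

fn main() {
    let timers = Timers { delays: Arc::new(Mutex::new(HashMap::new())) };
    timers.add(1, Duration::from_millis(100), || println!("fired"));
    thread::sleep(Duration::from_millis(200));
}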
Writing it myself using unsafe
I tried building something similar to Sc to achieve this. First, is Sc going to work here? I read the code and understand it. I can't see any obvious problems - though it was kind of hard to come to the conclusion that the map method is actually safe, because the reference will definitely be dropped at the end of the map and not stored somewhere.
So I tried to adapt Sc for my needs. This is only a rough outline and I know there are some issues with this, but I believe something like this should be possible:
Have a struct Doa<T> that will own T
Doa::ref(&self) -> DoaRef<T> will produce an opaque object that internally contains a *const u8 to the owned object.
DoaRef doesn't contain references with non-static lifetimes and thus can be passed to tokio.
Have impl<T> Drop for Doa<T> that sets that *const u8 to null
So the DoaRef can now check if the value still exists and get a reference to it.
I also tried to make sure that the lifetime of &self in ref must be longer than the lifetimes of references in T, to ensure this works only if Doa really lives longer than the object the pointer points to.
struct Doa<'t, T: 'l> { ... }
pub fn ref(&'s self) -> DoaRef<T> where 't: 'a
But then T is lifetime-constrained, and since DoaRef is parameterized over it, DoaRef: 'static doesn't hold anymore.
Or is there some crate, or maybe even something in std that can do this?

Getting around Rust ownership problems when using state machine pattern

This question is about a specific pattern of ownership that may arise when implementing a state machine for a video game in Rust, where states can hold a reference to "global" borrowed context and where state machines own their states. I've tried to cut out as many details as I can while still motivating the problem, but it's a fairly large and tangled issue.
Here is the state trait:
pub trait AppState<'a> {
    fn update(&mut self, Duration) -> Option<Box<AppState<'a> + 'a>>;
    fn enter(&mut self, Box<AppState<'a> + 'a>);
    // a number of other methods
}
I'm implementing states with a boxed trait object instead of an enum because I expect to have quite a lot of them. States return a Some(State) in their update method in order to cause their owning state machine to switch to a new state. I added a lifetime parameter because without it, the compiler was generating boxes with type: Box<AppState + 'static>, making the boxes useless because states contain mutable state.
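That default is Rust's trait-object lifetime elision: Box<dyn Trait> means Box<dyn Trait + 'static>, so a state that borrows non-'static data will not fit unless the box is annotated. A tiny self-contained illustration of the difference, using modern dyn syntax and a made-up State trait:

trait State {}

struct Borrowing<'a> {
    data: &'a str,
}

impl<'a> State for Borrowing<'a> {}

// `Box<dyn State>` means `Box<dyn State + 'static>`, so this signature
// rejects any state that borrows non-'static data:
fn store_static(_state: Box<dyn State>) {}

// Annotating the box with a lifetime lets it hold borrowing states:
fn store_borrowing<'a>(_state: Box<dyn State + 'a>) {}

fn main() {
    let text = String::from("level data");
    let state = Borrowing { data: &text };
    // store_static(Box::new(state)); // error[E0597]: `text` does not live long enough
    store_borrowing(Box::new(state)); // compiles
}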
Speaking of state machines, here it is:
pub struct StateMachine<'s> {
    current_state: Box<AppState<'s> + 's>,
}

impl<'s> StateMachine<'s> {
    pub fn switch_state(&'s mut self, new_state: Box<AppState<'s> + 's>) -> Box<AppState<'s> + 's> {
        mem::replace(&mut self.current_state, new_state);
    }
}
A state machine always has a valid state. By default, it starts with a Box<NullState>, which is a state that does nothing. I have omitted NullState for brevity. By itself, this seems to compile fine.
The InGame state is designed to implement a basic gameplay scenario:
type TexCreator = TextureCreator<WindowContext>;

pub struct InGame<'tc> {
    app: AppControl,
    tex_creator: &'tc TexCreator,
    tileset: Tileset<'tc>,
}

impl<'tc> InGame<'tc> {
    pub fn new(app: AppControl, tex_creator: &'tc TexCreator) -> InGame<'tc> {
        // ... load tileset ...
        InGame {
            app,
            tex_creator,
            tileset,
        }
    }
}
This game depends on Rust SDL2. This particular set of bindings requires that textures be created by a TextureCreator, and that the textures not outlive their creator. Texture requires a lifetime parameter to ensure this. Tileset holds a texture and therefore exports this requirement. This means that I cannot store a TextureCreator within the state itself (though I'd like to), since a mutably-borrowed InGame could have the texture creator moved out. Therefore, the texture creator is owned in main, and a reference to it is passed in when we create our main state:
fn main() {
    let app_control = // ...
    let tex_creator = // ...
    let in_game = Box::new(states::InGame::new(app_control, &tex_creator));
    let state_machine = states::StateMachine::new();
    state_machine.switch_state(in_game);
}
I feel this program should be valid, because I have ensured that tex_creator outlives any possible state, and that the state machine is the shortest-lived variable. However, I get the following error:
error[E0597]: `state_machine` does not live long enough
  --> src\main.rs:46:1
   |
39 |     state_machine.switch_state( in_game );
   |     ------------- borrow occurs here
...
46 | }
   | ^ `state_machine` dropped here while still borrowed
   |
   = note: values in a scope are dropped in the opposite order they are created
This doesn't make sense to me, because state_machine is only borrowed by the method invocation, but the compiler is saying that it's still borrowed when the method is over. I wish the error message let me trace who the borrower is; I don't understand why the borrow isn't returned when the method returns.
Essentially, I want the following:
That states be implemented by trait.
That states be owned by the state machine.
That states be able to contain references to arbitrary non-static data with lifetime greater than that of the state machine.
That when a state is swapped out, the old box still be valid so that it can be moved into the constructor of the new state. This will allow the new state to switch back to the preceding state without requiring it to be re-constructed.
That a state can signal a state change by returning a new state from 'update'. The old state must be able to construct this new state within itself.
Are these constraints possible to satisfy, and if so, how?
I apologize for the long-winded question and the likelihood that I've missed something obvious, as there are a number of decisions made in the implementation above where I'm not confident I understand the semantics of the lifetimes. I've tried to search for examples of this pattern online, but it seems a lot more complicated and constrained than the toy examples I've seen.
In StateMachine::switch_state, you don't want to use the 's lifetime on &mut self; 's represents the lifetime of resources borrowed by a state, not the lifetime of the state machine. Notice that by doing that, the type of self ends up with 's twice: the full type is &'s mut StateMachine<'s>; you only need to use 's on StateMachine, not on the reference.
In a mutable reference (&'a mut T), T is invariant, hence 's is invariant too. This means that the compiler considers that the state machine has the same lifetime as whatever it borrows. Therefore, after calling switch_state, the compiler considers that the state machine ends up borrowing itself.
In short, change &'s mut self to &mut self:
impl<'s> StateMachine<'s> {
    pub fn switch_state(&mut self, new_state: Box<AppState<'s> + 's>) -> Box<AppState<'s> + 's> {
        mem::replace(&mut self.current_state, new_state)
    }
}
You also need to declare state_machine in main as mutable:
let mut state_machine = states::StateMachine::new();
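To see why &'s mut StateMachine<'s> is so restrictive, here is a minimal, self-contained illustration of the same anti-pattern with a made-up Holder type: tying the borrow of self to the struct's own lifetime parameter keeps self borrowed for the rest of that lifetime.

struct Holder<'a> {
    value: &'a str,
}

impl<'a> Holder<'a> {
    // Anti-pattern: the borrow of `self` must last for all of 'a, i.e. for as
    // long as the data the struct borrows, so it effectively never ends.
    fn touch_bad(&'a mut self) {}

    // Fine: the borrow of `self` only lasts for the duration of the call.
    fn touch_good(&mut self) {}
}

fn main() {
    let s = String::from("hello");

    let mut good = Holder { value: &s };
    good.touch_good();
    good.touch_good(); // OK: the previous borrow already ended

    let mut bad = Holder { value: &s };
    bad.touch_bad();
    // bad.touch_bad(); // error[E0499]: cannot borrow `bad` as mutable more than once at a time
}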

Borrow data out of a mutex "borrowed value does not live long enough"

How can I return an iterator over data within a mutex which itself is contained within a struct? The error the compiler gives is "borrowed value does not live long enough".
How do I get the lifetime of the value to extend into the outer scope?
Here is a minimal demo of what I am trying to achieve.
use std::sync::{Mutex, Arc};
use std::vec::Vec;
use std::slice::Iter;

#[derive(Debug)]
struct SharedVec {
    pub data: Arc<Mutex<Vec<u32>>>,
}

impl SharedVec {
    fn iter(&self) -> Iter<u32> {
        self.data.lock().unwrap().iter()
    }
}

fn main() {
    let sv = SharedVec {
        data: Arc::new(Mutex::new(vec![1, 2, 3, 4, 5]))
    };
    for element in sv.data.lock().unwrap().iter() { // This works
        println!("{:?}", element);
    }
    for element in sv.iter() { // This does not work
        println!("{:?}", element);
    }
}
Rust playground link: http://is.gd/voukyN
You cannot do it exactly the way you have written it here.
Mutexes in Rust use the RAII pattern for acquisition and release: you acquire a mutex by calling the corresponding method on it, which returns a special guard value. When this guard goes out of scope, the mutex is released.
To make this pattern safe, Rust uses its borrowing system. You can access the value inside the mutex only through the guard returned by lock(), and you can only do so by reference - MutexGuard<T> implements Deref<Target = T> and DerefMut, so you can get &T or &mut T out of it.
This means that every value you derive from a mutexed value will necessarily have its lifetime linked to the lifetime of the guard. However, in your case you're trying to return Iter<u32> with its lifetime parameter tied to the lifetime of self. The following is the full signature of the iter() method, without lifetime elision, and its body with an explicit temporary variable:
fn iter<'a>(&'a self) -> Iter<'a, u32> {
    let guard = self.data.lock().unwrap();
    guard.iter()
}
Here the lifetime of the guard.iter() result is tied to that of guard, which is strictly smaller than 'a because guard only lives inside the scope of the method body. This is a violation of the borrowing rules, so the compiler fails with an error.
When iter() returns, guard is destroyed and the lock is released, so Rust in fact prevented you from making an actual logical error! The same code in C++ would compile and behave incorrectly because you would access protected data without locking it, causing data races at the very least. Just another demonstration of the power of Rust :)
I don't think you'll be able to do what you want without nasty hacks or boilerplate wrappers around standard types. And I personally think this is good - you have to manage your mutexes as explicitly as possible in order to avoid deadlocks and other nasty concurrency problems. And Rust already makes your life much easier because it enforces the absence of data races through its borrowing system, which is exactly the reason why the guard system behaves as described above.
As Vladimir Matveev's answer mentions, this isn't possible with return values. You can achieve your goal if you pass the iterator into a function instead of returning it:
impl SharedVec {
    fn iter<R>(&self, func: impl FnOnce(Iter<'_, u32>) -> R) -> R {
        let guard = self.data.lock().unwrap();
        func(guard.iter())
    }
}
This function is used like this:
sv.iter(|iter| {
    for element in iter {
        println!("{:?}", element);
    }
});
This type of function wrapping will have to be repeated with every type of iterator. If you end up doing that, it may be easier to hand over a mutable slice or &mut SharedVec instead, making the closure choose the iteration method.
This method works because the lock is never released while the iterator is in use, so the data stays protected from other threads writing at the same time.
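Another option along the lines of the "boilerplate wrappers" Vladimir Matveev mentions is to hand the guard itself to the caller and let the caller build the iterator, so that the iterator's lifetime is tied to a guard the caller is holding. A minimal sketch:

use std::sync::{Arc, Mutex, MutexGuard};

struct SharedVec {
    data: Arc<Mutex<Vec<u32>>>,
}

impl SharedVec {
    // Return the guard; the caller iterates while holding the lock.
    fn lock(&self) -> MutexGuard<'_, Vec<u32>> {
        self.data.lock().unwrap()
    }
}

fn main() {
    let sv = SharedVec { data: Arc::new(Mutex::new(vec![1, 2, 3])) };
    let guard = sv.lock();
    for element in guard.iter() {
        println!("{:?}", element);
    }
} // `guard` is dropped here and the lock is released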

Resources