How to defer lifetime checking to runtime - rust

I'm trying to pass a non-static closure into tokio. Obviously this doesn't work. Is there a way to make sure the lifetimes are appropriate at runtime? Here's what I tried:
Attempt with Arc
In order to not pass the closure directly into tokio, I put it into the struct that manages our timers:
type Delays<'l, K: Eq + Hash + Debug + Copy + Send> = HashMap<K, Box<dyn FnOnce() + 'l + Send>>;
pub struct Timers<'l, K: Eq + Hash + Debug + Clone + Send> {
delays: Arc<Mutex<Delays<'l, K>>>,
}
The impl for that struct lets us easily add and remove timers. My plan was to somehow pass a static closure into tokio, by only moving a Weak reference to the mutexed hashmap:
// remember handler function
delays.insert(key.clone(), Box::new(func));
// create a weak reference to the delay map to pass into the closure
let weak_handlers = Arc::downgrade(&self.delays);
// task that runs after a delay
let task = Delay::new(Instant::now() + delay)
.map_err(|e| warn!("Tokio timer error: {}", e)) // Map the error type to ()
.and_then(move |_| {
// get the handler from the table, of which we have only a weak ref.
let handler = Weak::upgrade(&weak_handlers)
.ok_or(())? // If the Arc was dropped, return an error, which aborts the future
.lock()
.remove(&key)
.ok_or(())?; // If the handler isn't there anymore, we can abort as well.
// call the handler
handler();
Ok(())
});
So with the Weak we make sure that we abort, if the hash table was dropped.
It's important to know that the lifetime 'l is the same as that of the Timers struct, but how can I tell the compiler? Also, I think the real problem is that Weak<T>: 'static is not satisfied.
Writing it myself using unsafe
I tried building something similar to Sc to achieve this. First, is Sc going to work here? I read the code and understand it. I can't see any obvious problems - though it was kind of hard to come to the conclusion that the map method is actually safe, because the reference will definitely be dropped at the end of the map and not stored somewhere.
So I tried to adapt Sc for my needs. This is only a rough outline and I know there are some issues with this, but I believe something like this should be possible:
Have a struct Doa<T> that will own T
Doa::ref(&self) -> DoaRef<T> will produce an opaque object that internally contains a *const u8 to the owned object.
DoaRef doesn't contain references with non-static lifetimes and thus can be passed to tokio.
Have impl<T> Drop for Doa<T> that sets that *const u8 to null
So the DoaRef can now check if the value still exists and get a reference to it.
I also tried to make sure that the lifetime of &self in ref must be longer than the lifetimes of references in T, to ensure this works only if Doa really lives longer than the object the pointer points to.
struct Doa<'t, T: 'l> { ... }
pub fn ref(&'s self) -> DoaRef<T> where 't: 'a
But then T is lifetime-constrained, and since DoaRef is parameterized over it, DoaRef: 'static doesn't hold anymore.
Or is there some crate, or maybe even something in std that can do this?
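One possible direction, offered only as a sketch: the 'static requirement on a spawned task cannot be relaxed at runtime, so the version below assumes the handlers themselves can be made 'static (owning their data, for example through Arc), and only the liveness of the table is checked at runtime through Weak. It uses std::thread in place of the tokio Delay so that it runs with the standard library alone. Timers::add, the u32 key and fired_count are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Weak};
use std::thread::{self, JoinHandle};
use std::time::Duration;

// Assumption: handlers own everything they need, so they can be 'static.
type Handler = Box<dyn FnOnce() + Send + 'static>;
type Delays = HashMap<u32, Handler>;

pub struct Timers {
    delays: Arc<Mutex<Delays>>,
}

impl Timers {
    pub fn new() -> Self {
        Timers { delays: Arc::new(Mutex::new(HashMap::new())) }
    }

    // Hypothetical API: stores the handler and returns a handle whose result
    // tells whether the handler actually ran.
    pub fn add(
        &self,
        key: u32,
        delay: Duration,
        func: impl FnOnce() + Send + 'static,
    ) -> JoinHandle<bool> {
        self.delays.lock().unwrap().insert(key, Box::new(func));
        // Only a weak reference moves into the task, so the task itself is 'static.
        let weak: Weak<Mutex<Delays>> = Arc::downgrade(&self.delays);
        thread::spawn(move || {
            thread::sleep(delay);
            // Runtime check: give up if the table has been dropped in the meantime.
            let delays = match weak.upgrade() {
                Some(delays) => delays,
                None => return false,
            };
            // The lock is released at the end of this statement, before the call.
            let handler = delays.lock().unwrap().remove(&key);
            match handler {
                Some(handler) => {
                    handler();
                    true
                }
                None => false,
            }
        })
    }
}

// Returns how many times the handler fired, optionally dropping the timers
// before the delay has elapsed.
pub fn fired_count(drop_first: bool) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let timers = Timers::new();
    let c = Arc::clone(&counter);
    let handle = timers.add(1, Duration::from_millis(50), move || {
        c.fetch_add(1, Ordering::SeqCst);
    });
    if drop_first {
        drop(timers);
    }
    handle.join().unwrap();
    counter.load(Ordering::SeqCst)
}
```

With this shape, the lifetime 'l disappears from the types entirely; whether that is acceptable depends on whether the handlers can own their captured state.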

Related

How to understand "primitive types are Sync" in rust?

I am reading https://doc.rust-lang.org/book/ch16-04-extensible-concurrency-sync-and-send.html
It says
In other words, any type T is Sync if &T (an immutable reference to T) is Send, meaning the reference can be sent safely to another thread. Similar to Send, primitive types are Sync
How should I understand this? If primitive types are Sync, then an integer type such as i32 is Sync, so &i32 can be sent safely to another thread. But I don't think that's right: I don't think a reference to a number in the main thread can be sent to another thread.
You're confusing the concept of thread-safety with the concept of lifetime.
You are correct in that a reference to a value owned by main() can't be sent to a spawned thread:
// No-op function that statically proves we have an &i32.
fn opaque(_: &i32) {}
fn main() {
let x = 0i32;
std::thread::spawn(|| opaque(&x)); // E0373
}
This doesn't fail because &x is not Send (it is), but because std::thread::spawn() requires that the closure is 'static, and it isn't if it captures a reference to something that doesn't have static lifetime. The compiler gives us this hint along with the error:
note: function requires argument type to outlive `'static`
We can prove this by obtaining a reference with static lifetime (&'static i32) and sending it to a thread, which does work:
fn opaque(_: &i32) {}
fn main() {
static X: i32 = 0;
std::thread::spawn(|| opaque(&X));
}
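To see that the 'static bound, and not Send, is what rejects the first program, here is a small sketch using std::thread::scope (stable since Rust 1.63). It lifts the 'static requirement by guaranteeing that the spawned threads are joined before the scope ends. The helper name sum_in_thread is made up for this illustration.

```rust
use std::thread;

// Borrows two integers that are NOT 'static and reads them on another thread.
// This compiles only because &i32 is Send (that is, because i32 is Sync).
pub fn sum_in_thread(x: &i32, y: &i32) -> i32 {
    thread::scope(|s| {
        // The spawned closure captures the borrowed integers by reference.
        let handle = s.spawn(|| *x + *y);
        handle.join().unwrap()
    })
}
```

Calling it with references to two local variables works, even though those locals live on the caller's stack.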

on-the-fly substitution of `Option<Arc<Mutex<Box<dyn T>>>>`

Suppose I have an object video_source: Option<Arc<Mutex<Box<dyn GetVideo>>>> and I pass it to a thread:
std::thread::spawn(||{
loop {
if let Some(video_source) = video_source {
let video_frame = video_source.lock().unwrap().get();
}
}
})
where
trait GetVideo {
fn get(&self) -> Vec<u8>;
}
What if I want to change the video source on the fly? Well, I'd do this on another thread:
*video_source.unwrap().lock().unwrap() = Box::new(other_source);
I want to make this idea more generic. I want a type that permits such thing. Here's my sketch:
use std::sync::{Arc, Mutex};
pub type OnTheFlyInner<T> = Box<T + Send + Sync>;
pub type OnTheFly<T> = Arc<Mutex<OnTheFlyInner<T>>>;
//I'd like this to be a method of `OnTheFly`
pub fn on_the_fly_substitute(on_the_fly: &mut Option<OnTheFly>, substitute_by: Option<OnTheFlyInner>) {
if let Some(substitute_by) = substitute_by {
if let Some(on_the_fly) = on_the_fly {
*on_the_fly.lock().unwrap() = substitute_by;
}
} else {
on_the_fly.take();
}
}
However, I cannot make something generic over T where T is a trait, it should be a type.
Any ideas?
Bounty
This is solved by @user4815162342. But what if I want to make one OnTheFly object point to the same thing as the other one?
First, you are correct that T cannot be a trait like GetVideo; traits are not types. However, T can be dyn GetVideo.
Second, your aliases have generic parameters, so they should be reflected as such in the function signature:
pub fn on_the_fly_substitute<T>(on_the_fly: &mut Option<OnTheFly<T>>, substitute_by: Option<OnTheFlyInner<T>>)
                            ^^^                                 ^^^                                      ^^^
Third, your alias looks like an attempt to constrain T to be Send + Sync, but aliases cannot define additional bounds. You would instead put them on the function (with ?Sized since you want to allow trait objects):
pub fn on_the_fly_substitute<T>(on_the_fly: &mut Option<OnTheFly<T>>, substitute_by: Option<OnTheFlyInner<T>>)
where
T: ?Sized + Send + Sync
{
...
}
Note: your function body does not require Send and Sync so these bounds should probably not be included.
Fourth, Option<Arc<Mutex<Box<dyn GetVideo>>>> is not thread safe. You'll need to constrain that the trait object is at least Send:
Option<Arc<Mutex<Box<dyn GetVideo + Send>>>>
                                  ^^^^^^
Fifth, a complete example is lacking, but you appear to be wanting multiple threads to modify the same video_source. This would likely not compile since you would need multiple threads to keep a &mut _ in order to change it.
If you want shared ownership of a value that might not exist, move the option into the Mutex and adjust your function and aliases accordingly:
video_source: Arc<Mutex<Option<Box<dyn GetVideo>>>>
Sixth, your comment "I'd like this to be a method of OnTheFly" is misguided. Aliases are just aliases, you'd need a method on the aliased Option/Arc type. Keep it as a free function, introduce an extension trait for it, or create it as a wrapper type instead of an alias if you want more fine-grained control.
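As a rough sketch of the wrapper-type option, with the Option moved inside the Mutex as suggested above: the names OnTheFly::substitute, OnTheFly::with, Fixed, boxed and demo are all illustrative assumptions, not part of the original code. Cloning the wrapper shares the same slot, which is one way to make two OnTheFly handles point to the same thing.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

pub trait GetVideo {
    fn get(&self) -> Vec<u8>;
}

// A wrapper type instead of an alias, so it can carry its own methods.
pub struct OnTheFly<T: ?Sized> {
    inner: Arc<Mutex<Option<Box<T>>>>,
}

// Cloning shares the same slot: every handle observes every substitution.
impl<T: ?Sized> Clone for OnTheFly<T> {
    fn clone(&self) -> Self {
        OnTheFly { inner: Arc::clone(&self.inner) }
    }
}

impl<T: ?Sized> OnTheFly<T> {
    pub fn new(value: Option<Box<T>>) -> Self {
        OnTheFly { inner: Arc::new(Mutex::new(value)) }
    }

    // Replaces (or clears, with None) the current value for all handles.
    pub fn substitute(&self, by: Option<Box<T>>) {
        *self.inner.lock().unwrap() = by;
    }

    // Runs `f` on the current value while the lock is held.
    pub fn with<R>(&self, f: impl FnOnce(Option<&T>) -> R) -> R {
        let guard = self.inner.lock().unwrap();
        f(guard.as_deref())
    }
}

// Hypothetical video source that always yields the same byte.
struct Fixed(u8);

impl GetVideo for Fixed {
    fn get(&self) -> Vec<u8> {
        vec![self.0]
    }
}

fn boxed(byte: u8) -> Box<dyn GetVideo + Send> {
    Box::new(Fixed(byte))
}

// Reads a frame before a substitution, after it (from another thread),
// and after the source has been cleared.
pub fn demo() -> (Option<Vec<u8>>, Option<Vec<u8>>, Option<Vec<u8>>) {
    let source = OnTheFly::new(Some(boxed(1)));
    let reader = source.clone();
    let before = reader.with(|s| s.map(|s| s.get()));
    source.substitute(Some(boxed(7)));
    let worker = reader.clone();
    let after = thread::spawn(move || worker.with(|s| s.map(|s| s.get())))
        .join()
        .unwrap();
    source.substitute(None);
    let cleared = reader.with(|s| s.map(|s| s.get()));
    (before, after, cleared)
}
```

The Send bound lives on the trait object (dyn GetVideo + Send) rather than on the wrapper, so the wrapper itself stays usable with non-Send contents in single-threaded code.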

Cannot borrow as immutable because it is also borrowed as mutable when implementing an ECS

I am trying to write a simple ECS:
struct Ecs {
component_sets: HashMap<TypeId, Box<dyn Any>>,
}
impl Ecs {
pub fn read_all<Component>(&self) -> &SparseSet<Component> {
self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast_ref::<SparseSet<Component>>()
.unwrap()
}
pub fn write_all<Component>(&mut self) -> &mut SparseSet<Component> {
self.component_sets
.get_mut(&TypeId::of::<Component>())
.unwrap()
.downcast_mut::<SparseSet<Component>>()
.unwrap()
}
}
I am trying to get mutable access to a certain component while another is immutable. This testing code triggers the error:
fn testing() {
let all_pos = { ecs.write_all::<Pos>() };
let all_vel = { ecs.read_all::<Vel>() };
for (p, v) in all_pos.iter_mut().zip(all_vel.iter()) {
p.x += v.x;
p.y += v.y;
}
}
And the error
error[E0502]: cannot borrow `ecs` as immutable because it is also borrowed as mutable
--> src\ecs.rs:191:25
|
190 | let all_pos = { ecs.write_all::<Pos>() };
| --- mutable borrow occurs here
191 | let all_vel = { ecs.read_all::<Vel>() };
| ^^^ immutable borrow occurs here
My understanding of the borrow checker rules tells me that it's totally fine to get references to different component sets mutably or immutably (that is, &mut SparseSet<Pos> and &SparseSet<Vel>) since they are two different types. In order to get these references though, I need to go through the main ECS struct which owns the sets, which is where the compiler complains (i.e. first I use &mut Ecs when I call ecs.write_all and then &Ecs on ecs.read_all).
My first instinct was to enclose the statements in a scope, thinking it could just drop the &mut Ecs after I get the reference to the inner component set so as not to have both mutable and immutable Ecs references alive at the same time. This is probably very stupid, yet I don't fully understand how, so I wouldn't mind some more explaining there.
I suspect one additional level of indirection is needed (similar to RefCell's borrow and borrow_mut) but I am not sure what exactly I should wrap and how I should go about it.
Update
Solution 1: make the method signature of write_all take a &self despite returning a RefMut<'_, SparseSet<Component>> by wrapping the SparseSet in a RefCell (as illustrated in the answer below by Kevin Reid).
Solution 2: similar to the above (the method signature takes &self), but using this piece of unsafe code:
fn write_all<Component>(&self) -> &mut SparseSet<Component> {
let set = self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast_ref::<SparseSet<Component>>()
.unwrap();
unsafe {
let set_ptr = set as *const SparseSet<Component>;
let set_ptr = set_ptr as *mut SparseSet<Component>;
&mut *set_ptr
}
}
What are the benefits of using solution 1? Is the implied runtime borrow checking provided by RefCell a hindrance in this case, or would it actually prove useful?
Would the use of unsafe be tolerable in this case? Are there benefits? (e.g. performance)
it's totally fine to get references to different component sets mutably or immutably
This is true: we can safely have multiple mutable, or mutable and immutable references, as long as no mutable reference points to the same data as any other reference.
However, not every means of obtaining those references will be accepted by the compiler's borrow checker. This doesn't mean they're unsound; just that we haven't convinced the compiler that they're safe. In particular, the only way the compiler understands to have simultaneous references is a struct's fields, because the compiler can know those are disjoint using a purely local analysis (looking only at the code of a single function):
struct Ecs {
pub pos: SparseSet<Pos>,
pub vel: SparseSet<Vel>,
}
for (p, v) in ecs.pos.iter_mut().zip(ecs.vel.iter()) {
p.x += v.x;
p.y += v.y;
}
This would compile, because the compiler can see that the references refer to different subsets of memory. It will not compile if you replace ecs.pos with a method ecs.pos() — let alone a HashMap. As soon as you get a function involved, information about field borrowing is hidden. Your function
pub fn write_all<Component>(&mut self) -> &mut SparseSet<Component>
has the elided lifetimes (lifetimes the compiler picks for you because every & must have a lifetime)
pub fn write_all<'a, Component>(&'a mut self) -> &'a mut SparseSet<Component>
which are the only information the compiler will use about what is borrowed. Hence, the 'a mutable reference to the SparseSet is borrowing all of the Ecs (as &'a mut self) and you can't have any other access to it.
The ways to arrange to be able to have multiple mutable references in a mostly-statically-checked way are discussed in the documentation page on Borrow Splitting. However, all of those are based on having some statically known property, which you don't. There's no way to express “this is okay as long as the Component type is not equal to another call's”. Therefore, to do this you do need RefCell, our general-purpose helper for runtime borrow checking.
Given what you've got already, the simplest thing to do is to replace SparseSet<Component> with RefCell<SparseSet<Component>>:
// no mut; changed return type
pub fn write_all<Component>(&self) -> RefMut<'_, SparseSet<Component>> {
self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast_ref::<RefCell<SparseSet<Component>>>() // changed type
.unwrap()
.borrow_mut() // added this line
}
Note the changed return type, because borrowing a RefCell must return an explicit handle in order to track the duration of the borrow. However, a Ref or RefMut acts mostly like an & or &mut thanks to deref coercion. (Your code that inserts items in the map, which you didn't show in the question, will also need a RefCell::new.)
Another option is to put the interior mutability — likely via RefCell, but not necessarily — inside the SparseSet type, or create a wrapper type that does that. This might or might not help the code be cleaner.
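For reference, here is a self-contained sketch of the RefCell approach. Since SparseSet was not shown in the question, a plain Vec stands in for it, and register, cell, step and double_write_is_rejected are hypothetical helpers added only so the example can run on its own.

```rust
use std::any::{Any, TypeId};
use std::cell::{Ref, RefCell, RefMut};
use std::collections::HashMap;

#[derive(Default)]
pub struct Ecs {
    // Each entry holds a RefCell<Vec<C>> behind the type-erased Box.
    component_sets: HashMap<TypeId, Box<dyn Any>>,
}

impl Ecs {
    // Assumption: Vec<C> stands in for the question's SparseSet<C>.
    pub fn register<C: 'static>(&mut self, items: Vec<C>) {
        self.component_sets
            .insert(TypeId::of::<C>(), Box::new(RefCell::new(items)));
    }

    fn cell<C: 'static>(&self) -> &RefCell<Vec<C>> {
        self.component_sets
            .get(&TypeId::of::<C>())
            .expect("component type not registered")
            .downcast_ref::<RefCell<Vec<C>>>()
            .expect("stored set has an unexpected type")
    }

    pub fn read_all<C: 'static>(&self) -> Ref<'_, Vec<C>> {
        self.cell::<C>().borrow()
    }

    // Takes &self: exclusivity is checked at runtime by the RefCell.
    pub fn write_all<C: 'static>(&self) -> RefMut<'_, Vec<C>> {
        self.cell::<C>().borrow_mut()
    }
}

pub struct Pos { pub x: i32, pub y: i32 }
pub struct Vel { pub x: i32, pub y: i32 }

// Mutable access to Pos and shared access to Vel at the same time.
pub fn step(ecs: &Ecs) {
    let mut all_pos = ecs.write_all::<Pos>();
    let all_vel = ecs.read_all::<Vel>();
    for (p, v) in all_pos.iter_mut().zip(all_vel.iter()) {
        p.x += v.x;
        p.y += v.y;
    }
}

// A second mutable borrow of the same set is detected at runtime.
pub fn double_write_is_rejected(ecs: &Ecs) -> bool {
    let _first = ecs.write_all::<Pos>();
    ecs.cell::<Pos>().try_borrow_mut().is_err()
}
```

Note that borrow_mut panics on a conflicting borrow, whereas try_borrow_mut returns an error; which of the two suits an ECS better is a design decision this sketch does not settle.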

How to store async closure created at runtime in a struct?

I'm learning Rust's async/await feature, and stuck with the following task. I would like to:
Create an async closure (or better to say async block) at runtime;
Pass created closure to constructor of some struct and store it;
Execute created closure later.
Looking through similar questions I wrote the following code:
use tokio;
use std::pin::Pin;
use std::future::Future;
struct Services {
s1: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>,
}
impl Services {
fn new(f: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>) -> Self {
Services { s1: f }
}
}
enum NumberOperation {
AddOne,
MinusOne
}
#[tokio::main]
async fn main() {
let mut input = vec![1,2,3];
let op = NumberOperation::AddOne;
let s = Services::new(Box::new(|numbers: &mut Vec<usize>| Box::pin(async move {
for n in numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
})));
(s.s1)(&mut input).await;
assert_eq!(input, vec![2,3,4]);
}
But above code won't compile, because of invalid lifetimes.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Also it's not clear why do we have to use Pin<Box> as return type?
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
Thanks
Update
There is a similar question, How can I store an async function in a struct and call it from a struct instance?, and my example is actually based on one of its answers. The second answer there covers closures created at runtime, but it seems to work only when I pass an owned variable; in my example I would like to pass a mutable reference, not an owned variable, to the closure created at runtime.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Let's take a closer look at what happens when you invoke the closure:
(s.s1)(&mut input).await;
// ^^^^^^^^^^^^^^^^^^
// closure invocation
The closure immediately returns a future. You could assign that future to a variable and hold on to it until later:
let future = (s.s1)(&mut input);
// do some other stuff
future.await;
The problem is, because the future is boxed, it could be held around for the rest of the program's life without ever being driven to completion; that is, it could have 'static lifetime. And input must obviously remain borrowed until the future resolves: else imagine, for example, what would happen if "some other stuff" above involved modifying, moving or even dropping input—consider what would then happen when the future is run?
One solution would be to pass ownership of the Vec into the closure and then return it again from the future:
let s = Services::new(Box::new(move |mut numbers| Box::pin(async move {
for n in &mut numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
numbers
})));
let output = (s.s1)(input).await;
assert_eq!(output, vec![2,3,4]);
See it on the playground.
@kmdreko's answer shows how you can instead actually tie the lifetime of the borrow to that of the returned future.
Also it's not clear why do we have to use Pin as return type?
Let's look at a stupidly simple async block:
async {
let mut x = 123;
let r = &mut x;
some_async_fn().await;
*r += 1;
x
}
Notice that execution may pause at the await. When that happens, the incumbent values of x and r must be stored temporarily (in the Future object: it's just a struct, in this case with fields for x and r). But r is a reference to another field in the same struct! If the future were then moved from its current location to somewhere else in memory, r would still refer to the old location of x and not the new one. Undefined Behaviour. Bad bad bad.
You may have observed that the future can also hold references to things that are stored elsewhere, such as the &mut input in @kmdreko's answer; because they are borrowed, those also cannot be moved for the duration of the borrow. So why can't the immovability of the future similarly be enforced by r's borrowing of x, without pinning? Well, the future's lifetime would then depend on its content, and such circularities are impossible in Rust.
This, generally, is the problem with self-referential data structures. Rust's solution is to prevent them from being moved: that is, to "pin" them.
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
In your specific example, the closure and future can reside on the stack and you can simply get rid of all the boxing and pinning (the borrow-checker can ensure stack items don’t move without explicit pinning). However, if you want to return the Services from a function, you'll run into difficulties stating its type parameters: impl Trait would normally be your go-to solution for this type of problem, but it's limited and does not (currently) extend to associated types, such as that of the returned future.
There are work-arounds, but using boxed trait objects is often the most practical solution, although it introduces heap allocations and an additional layer of indirection with commensurate runtime cost. Such trait objects are however unavoidable where a single instance of your Services structure may hold different closures in s1 over the course of its life, where you're returning them from trait methods (which could not use impl Trait when this was written; return-position impl Trait in traits has been stable since Rust 1.75), or where you're interfacing with a library that does not provide any alternative.
If you want your example to work as is, the missing component is communicating to the compiler what lifetime associations are allowed. Boxed trait objects like Box<dyn Future<...>> are constrained to be 'static by default, which means they cannot hold references to non-static objects. This is a problem because your closure returns a Future that needs to keep a reference to numbers in order to work.
The direct fix is to annotate that the dyn FnOnce can return a Future that can be bound to the life of the first parameter. This requires a higher-ranked trait bound and the syntax looks like for<'a>:
struct Services {
s1: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>,
}
impl Services {
fn new(f: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>) -> Self {
Services { s1: f }
}
}
The rest of your code now compiles without modification, check it out on the playground.
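Here is a standard-library-only sketch of the same idea, for readers who want to try it without tokio. The block_on function is a deliberately naive busy-polling executor written for this illustration (an assumption, not tokio's API), and BoxFuture, Service, make_service and run are likewise hypothetical names.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The returned future may borrow from the argument for the lifetime 'a.
type BoxFuture<'a> = Pin<Box<dyn Future<Output = ()> + 'a>>;
type Service = Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> BoxFuture<'a>>;

pub struct Services {
    s1: Service,
}

pub enum NumberOperation {
    AddOne,
    MinusOne,
}

// Builds the closure at runtime; `op` is moved into the closure and then
// into the async block.
fn make_service(op: NumberOperation) -> Service {
    Box::new(move |numbers| {
        Box::pin(async move {
            for n in numbers.iter_mut() {
                match op {
                    NumberOperation::AddOne => *n += 1,
                    NumberOperation::MinusOne => *n -= 1,
                }
            }
        })
    })
}

// Waker that does nothing; sufficient for a loop that simply polls again.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Naive executor for illustration only: polls in a loop until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

// Stores the closure, calls it later with a mutable borrow, and returns
// the vector once the future (and with it the borrow) has finished.
pub fn run(op: NumberOperation, mut input: Vec<usize>) -> Vec<usize> {
    let services = Services { s1: make_service(op) };
    block_on((services.s1)(&mut input));
    input
}
```

The type aliases are only there to keep the for<'a> signature readable; they do not change what the compiler checks.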

Borrow data out of a mutex "borrowed value does not live long enough"

How can I return an iterator over data within a mutex which is itself contained within a struct? The error the compiler gives is "borrowed value does not live long enough".
How do I get the lifetime of the value to extend into the outer scope?
Here is a minimal demo of what I am trying to achieve.
use std::sync::{Mutex, Arc};
use std::vec::{Vec};
use std::slice::{Iter};
#[derive(Debug)]
struct SharedVec {
pub data: Arc<Mutex<Vec<u32>>>,
}
impl SharedVec {
fn iter(& self) -> Iter<u32> {
self.data.lock().unwrap().iter()
}
}
fn main() {
let sv = SharedVec {
data: Arc::new(Mutex::new(vec![1, 2, 3, 4, 5]))
};
for element in sv.data.lock().unwrap().iter() { // This works
println!("{:?}", element);
}
for element in sv.iter() { // This does not work
println!("{:?}", element);
}
}
Rust playground link: http://is.gd/voukyN
You cannot do it exactly how you have written here.
Mutexes in Rust use the RAII pattern for acquisition and release: you acquire a mutex by calling the corresponding method on it, which returns a special guard value. When this guard goes out of scope, the mutex is released.
To make this pattern safe, Rust uses its borrowing system. You can access the value inside the mutex only through the guard returned by lock(), and you can only do so by reference: MutexGuard<T> implements Deref<Target=T> and DerefMut<Target=T>, so you can get &T or &mut T out of it.
This means that every value you derive from a mutexed value will necessarily have its lifetime linked to the lifetime of the guard. However, in your case you're trying to return Iter<u32> with its lifetime parameter tied to the lifetime of self. The following is the full signature of iter() method, without lifetime parameters elision, and its body with explicit temporary variables:
fn iter<'a>(&'a self) -> Iter<'a, u32> {
let guard = self.data.lock().unwrap();
guard.iter()
}
Here the lifetime of the guard.iter() result is tied to that of guard, which is strictly shorter than 'a because guard only lives inside the scope of the method body. This is a violation of the borrowing rules, and so the compiler fails with an error.
When iter() returns, guard is destroyed and the lock is released, so Rust in fact prevented you from making an actual logical error! The same code in C++ would compile and behave incorrectly because you would access protected data without locking it, causing data races at the very least. Just another demonstration of the power of Rust :)
I don't think you'll be able to do what you want without nasty hacks or boilerplate wrappers around standard types. And I personally think this is good: you have to manage your mutexes as explicitly as possible in order to avoid deadlocks and other nasty concurrency problems. Rust already makes your life much easier because it enforces the absence of data races through its borrowing system, which is exactly why the guard system behaves as described above.
As Vladimir Matveev's answer mentions, this isn't possible with return values. You can achieve your goal if you pass the iterator into a function instead of returning it:
impl SharedVec {
fn iter<R>(&self, func: impl FnOnce(Iter<'_, u32>) -> R) -> R {
let guard = self.data.lock().unwrap();
func(guard.iter())
}
}
This function is used like this:
sv.iter(|iter| {
for element in iter {
println!("{:?}", element);
}
});
This kind of wrapper has to be repeated for every type of iterator. If you end up doing that, it may be easier to hand over a mutable slice or &mut SharedVec instead, letting the closure choose the iteration method.
This approach works because the lock is held for as long as the closure runs, so the data stays protected from concurrent access by other threads.
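Two other shapes are sometimes used for the same problem, shown here only as a sketch: returning the guard itself, so that the caller decides how long the lock is held, or returning a cloned snapshot when iterating over slightly stale data is acceptable. The method names lock and snapshot and the helper sum_while_locked are assumptions made for this example.

```rust
use std::sync::{Arc, Mutex, MutexGuard};

pub struct SharedVec {
    data: Arc<Mutex<Vec<u32>>>,
}

impl SharedVec {
    pub fn new(items: Vec<u32>) -> Self {
        SharedVec { data: Arc::new(Mutex::new(items)) }
    }

    // Hands the guard to the caller; the lock is held until the guard is dropped.
    pub fn lock(&self) -> MutexGuard<'_, Vec<u32>> {
        self.data.lock().unwrap()
    }

    // Copies the data out, so no lock is held while the caller iterates.
    pub fn snapshot(&self) -> Vec<u32> {
        self.data.lock().unwrap().clone()
    }
}

// Iterates while the guard is alive, then releases the lock on return.
pub fn sum_while_locked(sv: &SharedVec) -> u32 {
    let guard = sv.lock();
    let total: u32 = guard.iter().sum();
    total
}
```

Returning the guard keeps the locking explicit at the call site, while the snapshot trades an allocation for a shorter critical section.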
