I'm learning Rust's async/await feature, and stuck with the following task. I would like to:
Create an async closure (or better to say async block) at runtime;
Pass created closure to constructor of some struct and store it;
Execute created closure later.
Looking through similar questions I wrote the following code:
use tokio;
use std::pin::Pin;
use std::future::Future;
struct Services {
s1: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>,
}
impl Services {
fn new(f: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>) -> Self {
Services { s1: f }
}
}
enum NumberOperation {
AddOne,
MinusOne
}
#[tokio::main]
async fn main() {
let mut input = vec![1,2,3];
let op = NumberOperation::AddOne;
let s = Services::new(Box::new(|numbers: &mut Vec<usize>| Box::pin(async move {
for n in numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
})));
(s.s1)(&mut input).await;
assert_eq!(input, vec![2,3,4]);
}
But above code won't compile, because of invalid lifetimes.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Also it's not clear why do we have to use Pin<Box> as return type?
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
Thanks
Update
There is similar question How can I store an async function in a struct and call it from a struct instance?
. Although that's a similar question and actually my example is based on one of the answers from that question. Second answer contains information about closures created at runtime, but seems it works only when I pass an owned variable, but in my example I would like to pass to closure created at runtime mutable reference, not owned variable.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Let's take a closer look at what happens when you invoke the closure:
(s.s1)(&mut input).await;
// ^^^^^^^^^^^^^^^^^^
// closure invocation
The closure immediately returns a future. You could assign that future to a variable and hold on to it until later:
let future = (s.s1)(&mut input);
// do some other stuff
future.await;
The problem is, because the future is boxed, it could be held around for the rest of the program's life without ever being driven to completion; that is, it could have 'static lifetime. And input must obviously remain borrowed until the future resolves: else imagine, for example, what would happen if "some other stuff" above involved modifying, moving or even dropping input—consider what would then happen when the future is run?
One solution would be to pass ownership of the Vec into the closure and then return it again from the future:
let s = Services::new(Box::new(move |mut numbers| Box::pin(async move {
for n in &mut numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
numbers
})));
let output = (s.s1)(input).await;
assert_eq!(output, vec![2,3,4]);
See it on the playground.
#kmdreko's answer shows how you can instead actually tie the lifetime of the borrow to that of the returned future.
Also it's not clear why do we have to use Pin as return type?
Let's look at a stupidly simple async block:
async {
let mut x = 123;
let r = &mut x;
some_async_fn().await;
*r += 1;
x
}
Notice that execution may pause at the await. When that happens, the incumbent values of x and r must be stored temporarily (in the Future object: it's just a struct, in this case with fields for x and r). But r is a reference to another field in the same struct! If the future were then moved from its current location to somewhere else in memory, r would still refer to the old location of x and not the new one. Undefined Behaviour. Bad bad bad.
You may have observed that the future can also hold references to things that are stored elsewhere, such as the &mut input in #kmdreko's answer; because they are borrowed, those also cannot be moved for the duration of the borrow. So why can't the immovability of the future similarly be enforced by r's borrowing of x, without pinning? Well, the future's lifetime would then depend on its content—and such circularities are impossible in Rust.
This, generally, is the problem with self-referential data structures. Rust's solution is to prevent them from being moved: that is, to "pin" them.
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
In your specific example, the closure and future can reside on the stack and you can simply get rid of all the boxing and pinning (the borrow-checker can ensure stack items don’t move without explicit pinning). However, if you want to return the Services from a function, you'll run into difficulties stating its type parameters: impl Trait would normally be your go-to solution for this type of problem, but it's limited and does not (currently) extend to associated types, such as that of the returned future.
There are work-arounds, but using boxed trait objects is often the most practical solution—albeit it introduces heap allocations and an additional layer of indirection with commensurate runtime cost. Such trait objects are however unavoidable where a single instance of your Services structure may hold different closures in s1 over the course of its life, where you're returning them from trait methods (which currently can’t use impl Trait), or where you're interfacing with a library that does not provide any alternative.
If you want your example to work as is, the missing component is communicating to the compiler what lifetime associations are allowed. Trait objects like dyn Future<...> are constrained to be 'static by default, which means it cannot have references to non-static objects. This is a problem because your closure returns a Future that needs to keep a reference to numbers in order to work.
The direct fix is to annotate that the dyn FnOnce can return a Future that can be bound to the life of the first parameter. This requires a higher-ranked trait bound and the syntax looks like for<'a>:
struct Services {
s1: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>,
}
impl Services {
fn new(f: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>) -> Self {
Services { s1: f }
}
}
The rest of your code now compiles without modification, check it out on the playground.
Related
Consider the following Rust code:
use std::future::Future;
use std::pin::Pin;
fn main() {
let mut v: Vec<_> = Vec::new();
for _ in 1..10 {
v.push(wrap_future(Box::pin(async {})));
}
}
fn wrap_future<T>(a: Pin<Box<dyn Future<Output=T>>>) -> impl Future<Output=T> {
async {
println!("doing stuff before awaiting");
let result=a.await;
println!("doing stuff after awaiting");
result
}
}
As you can see, the futures I'm putting into the Vec don't need to be boxed, since they are all the same type and the compiler can infer what that type is.
I would like to create a struct that has this Vec<...> type as one of its members, so that I could add a line at the end of main():
let thing = MyStruct {myvec: v};
without any additional overhead (i.e. boxing).
Type inference and impl Trait syntax aren't allowed on struct members, and since the future type returned by an async block exists entirely within the compiler and is exclusive to that exact async block, there's no way to reference it by name. It seems to me that what I want to do is impossible. Is it? If so, will it become possible in a future version of Rust?
I am aware that it would be easy to sidestep this problem by simply boxing all the futures in the Vec as I did the argument to wrap_future() but I'd prefer not to do this if I can avoid it.
I am well aware that doing this would mean that there could be only one async block in my entire codebase whose result values could possibly be added to such a Vec, and thus that there could be only one function in my entire codebase that could create values that could possibly be pushed to it. I am okay with this limitation.
Nevermind, I'm stupid. I forgot that structs could have type parameters.
struct MyStruct<F> where F: Future<Output=()> {
myvec: Vec<F>,
}
I am trying to write a simple ECS:
struct Ecs {
component_sets: HashMap<TypeId, Box<dyn Any>>,
}
impl Ecs {
pub fn read_all<Component>(&self) -> &SparseSet<Component> {
self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast_ref::<SparseSet<Component>>()
.unwrap()
}
pub fn write_all<Component>(&mut self) -> &mut SparseSet<Component> {
self.component_sets
.get_mut(&TypeId::of::<Component>())
.unwrap()
.downcast_mut::<SparseSet<Component>>()
.unwrap()
}
}
I am trying to get mutable access to a certain component while another is immutable. This testing code triggers the error:
fn testing() {
let all_pos = { ecs.write_all::<Pos>() };
let all_vel = { ecs.read_all::<Vel>() };
for (p, v) in all_pos.iter_mut().zip(all_vel.iter()) {
p.x += v.x;
p.y += v.y;
}
}
And the error
error[E0502]: cannot borrow `ecs` as immutable because it is also borrowed as mutable
--> src\ecs.rs:191:25
|
190 | let all_pos = { ecs.write_all::<Pos>() };
| --- mutable borrow occurs here
191 | let all_vel = { ecs.read_all::<Vel>() };
| ^^^ immutable borrow occurs here
My understanding of the borrow checker rules tells me that it's totally fine to get references to different component sets mutably or immutably (that is, &mut SparseSet<Pos> and &SparseSet<Vel>) since they are two different types. In order to get these references though, I need to go through the main ECS struct which owns the sets, which is where the compiler complains (i.e. first I use &mut Ecs when I call ecs.write_all and then &Ecs on ecs.read_all).
My first instinct was to enclose the statements in a scope, thinking it could just drop the &mut Ecs after I get the reference to the inner component set so as not to have both mutable and immutable Ecs references alive at the same time. This is probably very stupid, yet I don't fully understand how, so I wouldn't mind some more explaining there.
I suspect one additional level of indirection is needed (similar to RefCell's borrow and borrow_mut) but I am not sure what exactly I should wrap and how I should go about it.
Update
Solution 1: make the method signature of write_all take a &self despite returning a RefMut<'_, SparseSet<Component>> by wrapping the SparseSet in a RefCell (as illustrated in the answer below by Kevin Reid).
Solution 2: similar as above (method signature takes &self) but uses this piece of unsafe code:
fn write_all<Component>(&self) -> &mut SparseSet<Component> {
let set = self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast_ref::<SparseSet<Component>>()
.unwrap();
unsafe {
let set_ptr = set as *const SparseSet<Component>;
let set_ptr = set_ptr as *mut SparseSet<Component>;
&mut *set_ptr
}
}
What are benefits of using solution 1, is the implied runtime borrow-checking provided by RefCell an hindrance in this case or would it actually prove useful?
Would the use of unsafe be tolerable in this case? Are there benefits? (e.g. performance)
it's totally fine to get references to different component sets mutably or immutably
This is true: we can safely have multiple mutable, or mutable and immutable references, as long as no mutable reference points to the same data as any other reference.
However, not every means of obtaining those references will be accepted by the compiler's borrow checker. This doesn't mean they're unsound; just that we haven't convinced the compiler that they're safe. In particular, the only way the compiler understands to have simultaneous references is a struct's fields, because the compiler can know those are disjoint using a purely local analysis (looking only at the code of a single function):
struct Ecs {
pub pos: SparseSet<Pos>,
pub vel: SparseSet<Vel>,
}
for (p, v) in ecs.pos.iter_mut().zip(ecs.vel.iter()) {
p.x += v.x;
p.y += v.y;
}
This would compile, because the compiler can see that the references refer to different subsets of memory. It will not compile if you replace ecs.pos with a method ecs.pos() — let alone a HashMap. As soon as you get a function involved, information about field borrowing is hidden. Your function
pub fn write_all<Component>(&mut self) -> &mut SparseSet<Component>
has the elided lifetimes (lifetimes the compiler picks for you because every & must have a lifetime)
pub fn write_all<'a, Component>(&'a mut self) -> &'a mut SparseSet<Component>
which are the only information the compiler will use about what is borrowed. Hence, the 'a mutable reference to the SparseSet is borrowing all of the Ecs (as &'a mut self) and you can't have any other access to it.
The ways to arrange to be able to have multiple mutable references in a mostly-statically-checked way are discussed in the documentation page on Borrow Splitting. However, all of those are based on having some statically known property, which you don't. There's no way to express “this is okay as long as the Component type is not equal to another call's”. Therefore, to do this you do need RefCell, our general-purpose helper for runtime borrow checking.
Given what you've got already, the simplest thing to do is to replace SparseSet<Component> with RefCell<SparseSet<Component>>:
// no mut; changed return type
pub fn write_all<Component>(&self) -> RefMut<'_, SparseSet<Component>> {
self.component_sets
.get(&TypeId::of::<Component>())
.unwrap()
.downcast::<RefCell<SparseSet<Component>>>() // changed type
.unwrap()
.borrow_mut() // added this line
}
Note the changed return type, because borrowing a RefCell must return an explicit handle in order to track the duration of the borrow. However, a Ref or RefMut acts mostly like an & or &mut thanks to deref coercion. (Your code that inserts items in the map, which you didn't show in the question, will also need a RefCell::new.)
Another option is to put the interior mutability — likely via RefCell, but not necessarily — inside the SparseSet type, or create a wrapper type that does that. This might or might not help the code be cleaner.
As Rust reference documention said
Breaking the pointer aliasing rules. &mut T and &T follow LLVM’s scoped noalias model, except if the &T contains an UnsafeCell.
It's really ambiguous.
I want to know that what's the exactly moment an undefined behavior of &mut noalias occurred in Rust.
Is it any of below, or something else?
When defining two &mut that point to the same address?
When two &mut that point to the same address exposed to rust?
When perfrom any operation on a &mut that point to the same address of any other &mut?
For example, this code is observously UB:
unsafe {
let mut x = 123usize;
let a = (&mut x as *mut usize).as_mut().unwrap(); // created, but not accessed
let b = (&mut x as *mut usize).as_mut().unwrap(); // created, accessed
*b = 666;
drop(a);
}
But what if I modify the code like this:
struct S<'a> {
ref_x: &'a mut usize
}
fn main() {
let mut x = 123;
let s = S { ref_x: &mut x }; // like the `T` in `ManuallyDrop<T>`
let taken = unsafe { std::ptr::read(&s as *const S) }; // like `ManuallyDrop<T>::take`
// at thist ime, we have two `&mut x`
*(taken.ref_x) = 666;
drop(s);
// UB or not?
}
Is the second version also UB?
The second version is totally the same implemention to std::mem::ManuallyDrop. If the second version is UB, is it a security bug of std::mem::ManuallyDrop<T>?
What the aliasing restriction is not
It is actually common to have multiple existing &mut T aliasing the same item.
The simplest example is:
fn main() {
let mut i = 32;
let j = &mut i;
let k = &mut *j;
*k = 3;
println!("{}", i);
}
Note, though, that due to borrowing rules you cannot access the other aliases simultaneously.
If you look at the implementation of ManuallyDrop::take:
pub unsafe fn take(slot: &mut ManuallyDrop<T>) -> T {
ptr::read(&slot.value)
}
You will note that there are no simultaneously accessible &mut T: calling the function re-borrows ManuallyDrop making slot the only accessible mutable reference.
Why is aliasing in Rust so ill-defined
It's really ambiguous. I want to know that what's the exactly moment an undefined behavior of &mut noalias occurred in Rust.
Tough luck, because as specified in the Nomicon:
Unfortunately, Rust hasn't actually defined its aliasing model.
The reason is that the language team wants to make sure that the definition they reach is both safe (demonstrably so), practical, and yet does not close the door to possible refinements. It's a tall order.
The Rust Unsafe Code Guidelines Working Group is still working on establishing the exact boundaries, and in particular Ralf Jung is working on an operational model for aliasing called Stacked Borrows.
Note: the Stacked Borrows model is implemented in MIRI, and therefore you can validate your code against the Stacked Borrows model simply by executing your code in MIRI. Of course Stacked Borrows is still experimental, so this doesn't guarantee anything.
What caution recommends
I personally subscribe to caution. Seeing as the exact model is unspecified, the rules are ever changing and therefore I recommend taking the stricter interpretation possible.
Thus, I interpret the no-aliasing rule of &mut T as:
At any point in the code, there shall not be two accessible references in scope which alias the same memory if one of them is &mut T.
That is, I consider that forming a &mut T to an instance T for which another &T or &mut T is in scope without invalidating the aliases (via borrowing) is ill-formed.
It may very well be overly cautious, but at least if the aliasing model ends up being more conservative than planned, my code will still be valid.
I'm trying to pass a non-static closure into tokio. Obviously this doesn't work. Is there a way to make sure the lifetimes are appropriate at runtime? Here's what I tried:
Attempt with Arc
In order to not pass the closure directly into tokio, I put it into the struct that manages our timers:
type Delays<'l, K: Eq + Hash + Debug + Copy + Send> = HashMap<K, Box<dyn FnOnce() + 'l + Send>>;
pub struct Timers<'l, K: Eq + Hash + Debug + Clone + Send> {
delays: Arc<Mutex<Delays<'l, K>>>,
}
The impl for that struct lets us easily add and remove timers. My plan was to somehow pass a static closure into tokio, by only moving a Weak reference
to the mutexed hashmap:
// remember handler function
delays.insert(key.clone(), Box::new(func));
// create a weak reference to the delay map to pass into the closure
let weak_handlers = Arc::downgrade(&self.delays);
// task that runs after a delay
let task = Delay::new(Instant::now() + delay)
.map_err(|e| warn!("Tokio timer error: {}", e)) // Map the error type to ()
.and_then(move |_| {
// get the handler from the table, of which we have only a weak ref.
let handler = Weak::upgrade(&weak_handlers)
.ok_or(())? // If the Arc dropped, return an error and thus aborting the future
.lock()
.remove(&key)
.ok_or(())?; // If the handler isn't there anymore, we can abort aswell.
// call the handler
handler();
Ok(())
});
So with the Weak we make sure that we abort, if the hash table was dropped.
It's important to know that the lifetime 'l is the same as that of the Timers struct, but how can I tell the compiler? Also, I think the real problem is that Weak<T>: 'static is not satisfied.
Writing it myself using unsafe
I tried building something similar to Sc to achieve this. First, is Sc going to work here? I read the code and understand it. I can't see any obvious problems - though it was kind of hard to come to the conclusion that the map method is actually safe, because the reference will definitely be dropped at the end of the map and not stored somewhere.
So I tried to adapt Sc for my needs. This is only a rough outline and I know there are some issues with this, but I believe something like this should be possible:
Have a struct Doa<T> that will own T
Doa::ref(&self) -> DoaRef<T> will produce a opaque object that internally contain a *const u8 to the owned object.
DoaRef doesn't contain references with non-static lifetimes and thus can be passed to tokio.
Have impl<T> Drop for Doa<T> that sets that *const u8 to null
So the DoaRef can now check if the value still exists and get a reference to it.
I also tried to make sure that the lifetime of &self in ref must be longer than the lifetimes of references in T, to ensure this works only if Doa really lives longer than the object the pointer points to.
struct Doa<'t, T: 'l> { ... }
pub fn ref(&'s self) -> DoaRef<T> where 't: 'a
But then T is lifetime-contrained and since DoaRef is parameterized over it DoaRef: 'static doesn't hold anymore.
Or is there some crate, or maybe even something in std that can do this?
This question already has answers here:
Is there any way to return a reference to a variable created in a function?
(5 answers)
Closed 4 years ago.
I have the following function as part of a Rust WASM application to convert a Boxed closure into the Rust-representation for a JavaScript function.
use js_sys::Function;
type Callback = Rc<RefCell<Option<Closure<FnMut()>>>>;
fn to_function(callback: &Callback) -> &Function {
callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
}
However, the compiler complains that the return value uses a borrowed value (obtained with callback.borrow()) so cannot be returned.
Hence, I decided to add lifetime annotations to inform the compiler that this new reference should live as long as the input.
use js_sys::Function;
type Callback = Rc<RefCell<Option<Closure<FnMut()>>>>;
fn to_function<'a>(callback: &'a Callback) -> &'a Function {
callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
}
Unfortunately, this hasn't helped and I get the same error. What am I doing wrong here?
Yeah, this isn't going to work.
callback.borrow().as_ref().unwrap().as_ref().unchecked_ref()
Let's break this down in steps:
You're borrowing &RefCell<Option<Closure<FnMut()>>> - so you now have Ref<Option<...>>, which is step #1 of your issues. When this happens, this intermediary value now has a different lifetime than 'a (inferior, to be precise). Anything stemming from this will inherit this lesser lifetime. Call it 'b for now
You then as_ref this Ref, turning it into Option<&'b Closure<FnMut()>>
Rust then converts &'b Closure<FnMut()> into &'b Function
Step 1 is where the snafu happens. Due to the lifetime clash, you're left with this mess. A semi-decent way to solve it the following construct:
use std::rc::{Rc};
use std::cell::{RefCell, Ref};
use std::ops::Deref;
struct CC<'a, T> {
inner: &'a Rc<RefCell<T>>,
borrow: Ref<'a, T>
}
impl<'a, T> CC<'a, T> {
pub fn from_callback(item:&'a Rc<RefCell<T>>) -> CC<'a, T> {
CC {
inner: item,
borrow: item.borrow()
}
}
pub fn to_function(&'a self) -> &'a T {
self.borrow.deref()
}
}
It's a bit unwieldy, but it's probably the cleanest way to do so.
A new struct CC is defined, containing a 'a ref to Rc<RefCell<T>> (where the T generic in your case would end up being Option<Closure<FnMut()>>) and a Ref to T with lifetime 'a, auto-populated on the from_callback constructor.
The moment you generate this object, you'll have a Ref with the same lifetime as the ref you gave as an argument, making the entire issue go away. From there, you can call to_function to retrieve a &'a reference to your inner type.
There is a gotcha to this: as long as a single of these objects exists, you will (obviously) not be able to borrow_mut() on the RefCell, which may or may not kill your use case (as one doesn't use a RefCell for the fun of it). Nevertheless, these objects are relatively cheap to instantiate, so you can afford to bin them once you're done with them.
An example with Function and Closure types replaced with u8 (because js_sys cannot be imported into the sandbox) is available here.
Although I really like Sébastien's answer and explanation, I ended up going for Ömer's suggestion of using a macro, simply for the sake of conciseness. I'll post the macro in case it's of use to anyone else.
macro_rules! callback_to_function {
($callback:expr) => {
$callback
.borrow()
.as_ref()
.unwrap()
.as_ref()
.unchecked_ref()
};
}
I'll leave Sébastien's answer as the accepted one as I believe it is the more "correct" way to solve this issue and he provides a great explanation.