How do I avoid lifetime issues when refactoring a function?

Playground if you want to jump directly into the code.
Problem
I'm trying to implement a function filter_con<T, F>(v: Vec<T>, predicate: F) that filters a Vec concurrently, using an async predicate.
That is, instead of doing:
let arr = vec![...];
let arr_filtered = join_all(arr.into_iter().map(|it| async move {
if some_getter(&it).await > some_value {
Some(it)
} else {
None
}
}))
.await
.into_iter()
.filter_map(|it| it)
.collect::<Vec<T>>()
every time I need to filter a Vec, I want to be able to write:
let arr = vec![...];
let arr_filtered = filter_con(arr, |it| async move {
some_getter(&it).await > some_value
}).await
Tentative implementation
I've extracted the logic into its own function, but I am running into lifetime issues:
async fn filter_con<T, B, F>(arr: Vec<T>, predicate: F) -> Vec<T>
where
F: FnMut(&T) -> B,
B: futures::Future<Output = bool>,
{
join_all(arr.into_iter().map(|it| async move {
if predicate(&it).await {
Some(it)
} else {
None
}
}))
.await
.into_iter()
.filter_map(|p| p)
.collect::<Vec<_>>()
}
error[E0507]: cannot move out of a shared reference
I don't know what I'm moving out of predicate?
For more details, see the playground.

You won't be able to make the predicate an FnOnce, because, if you have 10 items in your Vec, you'll need to call the predicate 10 times, but an FnOnce only guarantees it can be called once, which could lead to something like this:
let vec = vec![1, 2, 3];
let has_drop_impl = String::from("hello");
filter_con(vec, |&i| async {
    drop(has_drop_impl); // consumes the captured String, so this can only run once
    i < 5
}).await;
So F must be either an FnMut or an Fn. The standard library Iterator::filter takes an FnMut, though this can be a source of confusion (it is the captured variables of the closure that need a mutable reference, not the elements of the iterator).
Because the predicate is an FnMut, any caller needs to be able to get an &mut F. For Iterator::filter, this can be used to do something like this:
let vec = vec![1, 2, 3];
let mut count = 0;
vec.into_iter().filter(|&x| {
count += 1; // this line makes the closure an `FnMut`
x < 2
})
However, by passing the iterator to join_all, you are essentially allowing the futures to be driven in any order, potentially overlapping in time. That would require aliased &mut references to the closure, and aliased mutable references are always undefined behaviour. This issue shows a slightly more cut-down version of the same problem: https://github.com/rust-lang/rust/issues/69446.
I'm still not 100% sure of the details, but it seems the compiler is being conservative here and refuses to let you create the closure in the first place, to prevent soundness issues.
I'd recommend making your function only accept Fns. This way, your runtime is free to call the function however it wants. This does mean that your closure cannot have mutable state, but this is unlikely to be a problem in a tokio application. For the counting example, the "correct" solution is to use an AtomicUsize (or equivalent), which allows mutation via shared reference. If you're referencing mutable state in your filter call, it should be thread safe, and thread safe data structures generally allow mutation via shared reference.
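As a minimal sketch of the counting example under an Fn bound: the helper names filter_shared and count_calls_and_filter are hypothetical, and the filter is synchronous to keep the example free of dependencies. The point is only that the closure mutates the counter through a shared reference, so it still implements Fn:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical helper that only accepts `Fn`, so the predicate
/// can be called through a shared reference.
fn filter_shared<T, F>(items: Vec<T>, predicate: F) -> Vec<T>
where
    F: Fn(&T) -> bool,
{
    items.into_iter().filter(|it| predicate(it)).collect()
}

/// Counts predicate calls without making the closure `FnMut`.
fn count_calls_and_filter(items: Vec<i32>) -> (Vec<i32>, usize) {
    let count = AtomicUsize::new(0);
    let kept = filter_shared(items, |&x| {
        // `fetch_add` takes `&self`, so `count` is captured by shared reference.
        count.fetch_add(1, Ordering::Relaxed);
        x < 2
    });
    (kept, count.load(Ordering::Relaxed))
}
```

Replacing the AtomicUsize with a plain `let mut count = 0` would turn the closure into an FnMut, and the call to filter_shared would then be rejected.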
Given that restriction, the following gives the answer you expect:
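use futures::future::join_all;
use std::future::Future;

async fn filter_con<T, B, F>(arr: Vec<T>, predicate: F) -> Vec<T>
where
    F: Fn(&T) -> B, // `Fn`, so the predicate can be shared between futures
    B: Future<Output = bool>,
{
    // Each future owns its item and only borrows the predicate.
    join_all(arr.into_iter().map(|it| async {
        if predicate(&it).await {
            Some(it)
        } else {
            None
        }
    }))
    .await
    .into_iter()
    .filter_map(|p| p)
    .collect::<Vec<_>>()
}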
Playground

Related

What would be the benefit of reassigning the result of an operation to the same field after mem::swap with an empty Vec?

I've been going through the code base of a Rust library (Carboxyl) and found some odd code. It swaps a Vec with an empty one, runs filter_map over the old contents, and assigns the result back to the original field.
impl<A: Send + Sync + Clone + 'static> Source<A> {
/// Make the source send an event to all its observers.
pub fn send(&mut self, a: A) {
use std::mem;
let mut new_callbacks = vec!();
mem::swap(&mut new_callbacks, &mut self.callbacks);
self.callbacks = new_callbacks
.into_iter()
.filter_map(|mut callback| {
let result = callback(a.clone());
match result {
Ok(_) => Some(callback),
Err(_) => None,
}
})
.collect();
}
}
What would be the possible benefits of doing so, compared to calling filter_map on the original field directly and reassigning the result to it?
The benefit is that it's possible at all.
Because callbacks is part of self, which is behind a mutable reference, you can't move out of it with into_iter().
Once drain_filter (since renamed extract_if) stabilizes, this can be done in place, but until then this is the solution.
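A sketch of two alternatives follows. It assumes a simplified stand-in for the library's Source, whose callbacks return Result<(), ()>; the subscribe and len helpers are hypothetical. std::mem::take is a shorter spelling of the swap-with-empty pattern, and Vec::retain_mut (stable since Rust 1.61) covers this particular case in place:

```rust
/// Simplified stand-in for the library's `Source`; the callback type is an assumption.
struct Source<A> {
    callbacks: Vec<Box<dyn FnMut(A) -> Result<(), ()>>>,
}

impl<A: Clone> Source<A> {
    fn new() -> Self {
        Source { callbacks: Vec::new() }
    }

    /// Hypothetical registration helper, only here to make the sketch usable.
    fn subscribe(&mut self, callback: impl FnMut(A) -> Result<(), ()> + 'static) {
        self.callbacks.push(Box::new(callback));
    }

    fn len(&self) -> usize {
        self.callbacks.len()
    }

    /// Same idea as the original: move the Vec out, leaving an empty one behind.
    fn send_take(&mut self, a: A) {
        let old = std::mem::take(&mut self.callbacks);
        self.callbacks = old
            .into_iter()
            .filter_map(|mut callback| match callback(a.clone()) {
                Ok(_) => Some(callback),
                Err(_) => None,
            })
            .collect();
    }

    /// In-place variant: keep only the callbacks that still succeed.
    fn send_retain(&mut self, a: A) {
        self.callbacks.retain_mut(|callback| callback(a.clone()).is_ok());
    }
}
```

Both methods drop every callback that returns Err; send_retain avoids building a second Vec.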

Iterate over Vec<"CustomStruct"> => unsatisfied trait bounds

I'm participating in this year's Advent of Code and wanted to take the opportunity to learn Rust. (So, if you're also participating, the following section might spoil something).
I want to iterate over the Vec<Lanternfish> and decrement the internal_counter value of each item in it. I tried the following:
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_couner()).collect();
The compiler gives me the following error: method cannot be called on Vec<Lanternfish> due to unsatisfied trait bounds
I understand that the iterator function is not available for this, however I don't understand exactly how to fix the problem.
#[derive(Debug)]
struct Lanternfish {
internal_counter: u8,
}
impl Lanternfish {
fn new() -> Self {
Lanternfish {
internal_counter: 8,
}
}
fn decrement_counter(&mut self) {
self.internal_counter -= 1
}
}
fn part_one(content: &str) {
let content: Vec<char> = content.chars().filter(|char| char.is_digit(10)).collect();
let mut fish_list: Vec<Lanternfish> = init_list(content);
let test: Vec<Lanternfish> = fish_list.map(|fish| fish.decrement_counter()).collect();
}
fn init_list(initial_values: Vec<char>) -> Vec<Lanternfish> {
let mut all_lanternfish: Vec<_> = Vec::new();
for value in initial_values {
all_lanternfish.push(Lanternfish{internal_counter: value as u8});
}
all_lanternfish
}
The way to iterate over a Vec and call a mutating function on each element is:
for fish in &mut fish_list {
fish.decrement_counter();
}
Here is what this line actually does:
fish_list.map(|fish| fish.decrement_couner).collect();
1. It tries to call map on the Vec. Vec doesn't have that method; Iterator does, but you'd need to call iter(), iter_mut() or into_iter() on the Vec first.
2. Assuming you get the right map, it then evaluates the lambda |fish| fish.decrement_couner on each element. Typo aside, this is not a function call but a field access, and Lanternfish doesn't have a field called decrement_couner. A call would need parentheses.
3. Assuming you fix the function call, it then collects all the results of the calls (a bunch of () "unit" values, since decrement_counter doesn't return anything) into a new Vec, which has type Vec<()>.
4. Finally, it tries to bind that to a variable of type Vec<Lanternfish>, which fails.
Meanwhile, the function calls will have modified the original Vec, if you used iter_mut(). Otherwise, the function calls will not compile.
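To make the difference concrete, here is a small sketch of both options. The function names decrement_all and decremented_copy are made up for illustration: the first mutates in place through iter_mut(), the second uses map to build a new Vec<Lanternfish>, which requires the closure to return a Lanternfish rather than ():

```rust
#[derive(Debug, PartialEq)]
struct Lanternfish {
    internal_counter: u8,
}

impl Lanternfish {
    fn decrement_counter(&mut self) {
        self.internal_counter -= 1;
    }
}

/// Mutates every fish in place; `iter_mut()` yields `&mut Lanternfish`.
fn decrement_all(fish_list: &mut [Lanternfish]) {
    fish_list
        .iter_mut()
        .for_each(|fish| fish.decrement_counter());
}

/// Builds a new Vec instead; here `map` must return a `Lanternfish`, not `()`.
fn decremented_copy(fish_list: &[Lanternfish]) -> Vec<Lanternfish> {
    fish_list
        .iter()
        .map(|fish| Lanternfish {
            internal_counter: fish.internal_counter - 1,
        })
        .collect()
}
```

Calling decrement_all(&mut fish_list) is equivalent to the for loop over &mut fish_list shown earlier in this answer.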

Can I have a mutable reference to a type and its trait object in the same scope? [duplicate]

Why can I have multiple mutable references to a static type in the same scope?
My code:
static mut CURSOR: Option<B> = None;
struct B {
pub field: u16,
}
impl B {
pub fn new(value: u16) -> B {
B { field: value }
}
}
struct A;
impl A {
pub fn get_b(&mut self) -> &'static mut B {
unsafe {
match CURSOR {
Some(ref mut cursor) => cursor,
None => {
CURSOR= Some(B::new(10));
self.get_b()
}
}
}
}
}
fn main() {
// first creation of A, get a mutable reference to b and change its field.
let mut a = A {};
let mut b = a.get_b();
b.field = 15;
println!("{}", b.field);
// second creation of A, get a mutable reference to b and change its field.
let mut a_1 = A {};
let mut b_1 = a_1.get_b();
b_1.field = 16;
println!("{}", b_1.field);
// Third creation of A, get a mutable reference to b and change its field.
let mut a_2 = A {};
let b_2 = a_2.get_b();
b_2.field = 17;
println!("{}", b_2.field);
// now I can change them all
b.field = 1;
b_1.field = 2;
b_2.field = 3;
}
I am aware of the borrowing rules. At any given time, you may have either:
one or more references (&T) to a resource, or
exactly one mutable reference (&mut T).
In the above code, I have a struct A with the get_b() method for returning a mutable reference to B. With this reference, I can mutate the fields of struct B.
The strange thing is that more than one mutable reference can be created in the same scope (b, b_1, b_2) and I can use all of them to modify B.
Why can I have multiple mutable references with the 'static lifetime shown in main()?
My attempt at explaining this behavior is that I am returning a mutable reference with a 'static lifetime, so every time I call get_b() it returns the same mutable reference, and in the end it is just one identical reference. Is this reasoning right? Why am I able to use all of the mutable references obtained from get_b() individually?
There is only one reason for this: you have lied to the compiler. You are misusing unsafe code and have violated Rust's core tenet about mutable aliasing. You state that you are aware of the borrowing rules, but then you go out of your way to break them!
unsafe code gives you a small set of extra abilities, but in exchange you are now responsible for avoiding every possible kind of undefined behavior. Multiple mutable aliases are undefined behavior.
The fact that there's a static involved is completely orthogonal to the problem. You can create multiple mutable references to anything (or nothing) with whatever lifetime you care about:
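fn foo() -> (&'static mut i32, &'static mut i32, &'static mut i32) {
    let somewhere = 0x42 as *mut i32;
    // Undefined behaviour: three aliasing mutable references to an arbitrary address.
    unsafe { (&mut *somewhere, &mut *somewhere, &mut *somewhere) }
}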
In your original code, you state that calling get_b is safe for anyone to do any number of times. This is not true. The entire function should be marked unsafe, along with copious documentation about what is and is not allowed to prevent triggering unsafety. Any unsafe block should then have corresponding comments explaining why that specific usage doesn't break the rules needed. All of this makes creating and using unsafe code more tedious than safe code, but compared to C where every line of code is conceptually unsafe, it's still a lot better.
You should only use unsafe code when you know better than the compiler. For most people in most cases, there is very little reason to create unsafe code.
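For comparison, a safe sketch of the same lazily initialized global is shown below. It assumes a Mutex is acceptable for your use case; get_b is rewritten as a free function, set_field is a hypothetical helper, and OnceLock requires Rust 1.70 or later. The MutexGuard enforces at runtime that only one mutable access exists at a time:

```rust
use std::sync::{Mutex, MutexGuard, OnceLock};

struct B {
    field: u16,
}

// Lazily initialized global; no `static mut` and no `unsafe`.
static CURSOR: OnceLock<Mutex<B>> = OnceLock::new();

/// Returns exclusive access to the global `B`, creating it on first use.
/// The guard releases the lock when it is dropped.
fn get_b() -> MutexGuard<'static, B> {
    CURSOR
        .get_or_init(|| Mutex::new(B { field: 10 }))
        .lock()
        .unwrap()
}

/// Hypothetical helper: update the field and return the new value.
fn set_field(value: u16) -> u16 {
    let mut b = get_b();
    b.field = value;
    b.field
}
```

Holding two guards at once on the same thread would block or panic instead of silently aliasing, which is the trade-off for staying in safe code.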

How to store async closure created at runtime in a struct?

I'm learning Rust's async/await feature, and stuck with the following task. I would like to:
Create an async closure (or better to say async block) at runtime;
Pass created closure to constructor of some struct and store it;
Execute created closure later.
Looking through similar questions I wrote the following code:
use tokio;
use std::pin::Pin;
use std::future::Future;
struct Services {
s1: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>,
}
impl Services {
fn new(f: Box<dyn FnOnce(&mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()>>>>) -> Self {
Services { s1: f }
}
}
enum NumberOperation {
AddOne,
MinusOne
}
#[tokio::main]
async fn main() {
let mut input = vec![1,2,3];
let op = NumberOperation::AddOne;
let s = Services::new(Box::new(|numbers: &mut Vec<usize>| Box::pin(async move {
for n in numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
})));
(s.s1)(&mut input).await;
assert_eq!(input, vec![2,3,4]);
}
But above code won't compile, because of invalid lifetimes.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Also it's not clear why do we have to use Pin<Box> as return type?
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
Thanks
Update
There is a similar question: How can I store an async function in a struct and call it from a struct instance? My example is actually based on one of the answers to that question. The second answer covers closures created at runtime, but it seems to work only when I pass an owned variable, whereas in my example I would like to pass a mutable reference to the closure created at runtime.
How to specify lifetimes to make above example compile (so Rust will know that async closure should live as long as input). As I understand in provided example Rust requires closure to have static lifetime?
Let's take a closer look at what happens when you invoke the closure:
(s.s1)(&mut input).await;
// ^^^^^^^^^^^^^^^^^^
// closure invocation
The closure immediately returns a future. You could assign that future to a variable and hold on to it until later:
let future = (s.s1)(&mut input);
// do some other stuff
future.await;
The problem is, because the future is boxed, it could be held around for the rest of the program's life without ever being driven to completion; that is, it could have 'static lifetime. And input must obviously remain borrowed until the future resolves: else imagine, for example, what would happen if "some other stuff" above involved modifying, moving or even dropping input—consider what would then happen when the future is run?
One solution would be to pass ownership of the Vec into the closure and then return it again from the future:
let s = Services::new(Box::new(move |mut numbers| Box::pin(async move {
for n in &mut numbers {
match op {
NumberOperation::AddOne => *n = *n + 1,
NumberOperation::MinusOne => *n = *n - 1,
};
}
numbers
})));
let output = (s.s1)(input).await;
assert_eq!(output, vec![2,3,4]);
See it on the playground.
@kmdreko's answer shows how you can instead actually tie the lifetime of the borrow to that of the returned future.
Also it's not clear why do we have to use Pin as return type?
Let's look at a stupidly simple async block:
async {
let mut x = 123;
let r = &mut x;
some_async_fn().await;
*r += 1;
x
}
Notice that execution may pause at the await. When that happens, the incumbent values of x and r must be stored temporarily (in the Future object: it's just a struct, in this case with fields for x and r). But r is a reference to another field in the same struct! If the future were then moved from its current location to somewhere else in memory, r would still refer to the old location of x and not the new one. Undefined Behaviour. Bad bad bad.
You may have observed that the future can also hold references to things that are stored elsewhere, such as the &mut input in @kmdreko's answer; because they are borrowed, those also cannot be moved for the duration of the borrow. So why can't the immovability of the future similarly be enforced by r's borrowing of x, without pinning? Well, the future's lifetime would then depend on its content, and such circularities are impossible in Rust.
This, generally, is the problem with self-referential data structures. Rust's solution is to prevent them from being moved: that is, to "pin" them.
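The following sketch polls such a future by hand to show what pinning buys you. YieldOnce is a hypothetical stand-in for some_async_fn(), and NoopWake is a do-nothing waker used only so the future can be polled without a runtime. After the first poll the future is suspended with r pointing at x; moving the Pin<Box<_>> afterwards only moves the pointer, while the future's state stays where it is on the heap:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that suspends exactly once before completing.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// The async block from the answer, boxed and pinned.
fn self_borrowing() -> Pin<Box<dyn Future<Output = i32>>> {
    Box::pin(async {
        let mut x = 123;
        let r = &mut x;
        YieldOnce(false).await; // suspended here while `r` points at `x`
        *r += 1;
        x
    })
}

fn run_with_move_between_polls() -> i32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut fut = self_borrowing();
    // The first poll runs up to the await and suspends.
    assert!(fut.as_mut().poll(&mut cx).is_pending());

    // Moving the `Pin<Box<_>>` moves only the pointer, not the future's state.
    let mut moved = fut;
    match moved.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => unreachable!("YieldOnce only suspends once"),
    }
}
```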
Is it possible somehow to refactor code and eliminate: Box::new(|arg: T| Box::pin(async move {}))? Maybe there is some crate?
In your specific example, the closure and future can reside on the stack and you can simply get rid of all the boxing and pinning (the borrow-checker can ensure stack items don’t move without explicit pinning). However, if you want to return the Services from a function, you'll run into difficulties stating its type parameters: impl Trait would normally be your go-to solution for this type of problem, but it's limited and does not (currently) extend to associated types, such as that of the returned future.
There are work-arounds, but using boxed trait objects is often the most practical solution—albeit it introduces heap allocations and an additional layer of indirection with commensurate runtime cost. Such trait objects are however unavoidable where a single instance of your Services structure may hold different closures in s1 over the course of its life, where you're returning them from trait methods (which currently can’t use impl Trait), or where you're interfacing with a library that does not provide any alternative.
If you want your example to work as is, the missing piece is telling the compiler which lifetime associations are allowed. A boxed trait object such as Box<dyn Future<...>> is constrained to be 'static by default, which means it cannot hold references to non-static data. This is a problem because your closure returns a Future that needs to keep a reference to numbers in order to work.
The direct fix is to annotate that the dyn FnOnce can return a Future that can be bound to the life of the first parameter. This requires a higher-ranked trait bound and the syntax looks like for<'a>:
struct Services {
s1: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>,
}
impl Services {
fn new(f: Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>) -> Self {
Services { s1: f }
}
}
The rest of your code now compiles without modification; check it out on the playground.
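For reference, here is a dependency-free sketch of the whole thing. It assumes a hand-written block_on in place of tokio; the Service type alias, make_services, and run are hypothetical names introduced only to keep the example short:

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The returned future may borrow the argument for `'a`.
type Service =
    Box<dyn for<'a> FnOnce(&'a mut Vec<usize>) -> Pin<Box<dyn Future<Output = ()> + 'a>>>;

struct Services {
    s1: Service,
}

enum NumberOperation {
    AddOne,
    MinusOne,
}

/// Builds the closure at runtime, capturing `op` by value.
fn make_services(op: NumberOperation) -> Services {
    Services {
        s1: Box::new(move |numbers: &mut Vec<usize>| {
            Box::pin(async move {
                for n in numbers.iter_mut() {
                    match op {
                        NumberOperation::AddOne => *n += 1,
                        NumberOperation::MinusOne => *n -= 1,
                    }
                }
            })
        }),
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for an async runtime: polls the future until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

/// The borrow of `input` ends when the future completes, so it can be returned.
fn run(op: NumberOperation, mut input: Vec<usize>) -> Vec<usize> {
    let services = make_services(op);
    block_on((services.s1)(&mut input));
    input
}
```

The busy-polling block_on is only suitable for futures that complete immediately, as this one does; a real application would keep using its runtime.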

How do I mutate the item in Iterator::find's closure?

I would like to use Iterator::find on libusb::Devices object, which has a signature like so:
fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
where Self: Sized, P: FnMut(&Self::Item) -> bool
I want to find a device with a particular vendor_id, which requires calling Device::device_descriptor on each device. However, the device_descriptor method requires a &mut to each device, and the find method only gives a &.
Does this mean that it's impossible to use mutable methods on any of the Iterator's methods (find, filter, etc.)?
Here's the example I'm trying to get working:
let mut device = context
.devices()
.unwrap()
.iter()
.find(&mut |dev: &libusb::Device| {
dev.device_descriptor().unwrap().vendor_id() == vendor_id
})
.unwrap();
Here is the error I'm getting:
error: cannot borrow immutable borrowed content `*dev` as mutable
Does this mean that it's impossible to use mutable methods on any of the Iterator's methods (find, filter, etc.)?
In the methods that receive a parameter of type F: Fn*(&Self::Item), yes. One cannot call a method that expects a mutable reference (&mut) on a reference (&). For example:
let mut x = vec![10];
// (&x)[0] = 20; // not ok
(&mut x)[0] = 20; // ok
//(& (&x))[0] = 20; // not ok
//(& (&mut x))[0] = 20; // not ok
(&mut (&mut x))[0] = 20; // ok
Note that this rule also applies to auto deref.
Some methods of Iterator receive a parameter of type F: Fn*(Self::Item), like map, filter_map, etc. These methods allow functions that mutate the item.
One interesting question is: Why do some methods expect Fn*(&Self::Item) and others Fn*(Self::Item)?
The methods that still need the item afterwards, like filter (which yields the item if the predicate returns true), cannot pass Self::Item as the argument, because that would give ownership of the item to the function. For this reason, methods like filter pass &Self::Item, so they can use the item later.
On the other hand, methods like map and filter_map do not need the item after they are used as arguments (the items are being mapped after all), so they pass the item as Self::Item.
In general, it is possible to use filter_map to replace the use of filter in cases that the items need to be mutated. In your case, you can do this:
extern crate libusb;
fn main() {
let mut context = libusb::Context::new().expect("context creation");
let mut filtered: Vec<_> = context.devices()
.expect("devices list")
.iter()
.filter_map(|mut r| {
if let Ok(d) = r.device_descriptor() {
if d.vendor_id() == 7531 {
return Some(r);
}
}
None
})
.collect();
for d in &mut filtered {
// same as: for d in filtered.iter_mut()
println!("{:?}", d.device_descriptor());
}
}
filter_map drops the None values and yields the values that were wrapped in Some.
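The same pattern can be sketched without libusb. Device below is a made-up type whose device_descriptor method needs &mut self, standing in for the real one; because filter_map hands each item over by value, the closure can bind it as mut and call the mutating method before deciding whether to keep it:

```rust
/// Made-up stand-in for a device whose accessor needs `&mut self`.
struct Device {
    vendor_id: u16,
    reads: u32,
}

impl Device {
    fn device_descriptor(&mut self) -> u16 {
        self.reads += 1; // the mutation that rules out `filter`
        self.vendor_id
    }
}

/// Keeps only the devices with the given vendor id.
fn with_vendor(devices: Vec<Device>, vendor_id: u16) -> Vec<Device> {
    devices
        .into_iter()
        .filter_map(|mut device| {
            if device.device_descriptor() == vendor_id {
                Some(device)
            } else {
                None
            }
        })
        .collect()
}
```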
Answer migrated from: How can I apply mutating method calls to each struct when using Iter::find?
You can use Iterator::find_map instead, whose closure takes in elements by value which can then be easily mutated:
games
.iter_mut()
.find_map(|game| {
let play = rng.gen_range(1..=10);
game.play(play).then(|| game)
})
Playground
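A self-contained sketch of the find_map approach follows. It uses a hypothetical Game type and a fixed guess instead of a random number generator; iter_mut() yields &mut Game, which find_map passes to the closure by value, so the closure can both call the mutating method and return the reference:

```rust
/// Hypothetical game: `play` mutates the game and reports whether the guess won.
struct Game {
    id: u32,
    target: u32,
    attempts: u32,
}

impl Game {
    fn play(&mut self, guess: u32) -> bool {
        self.attempts += 1;
        guess == self.target
    }
}

/// Plays each game in order and returns the first one that is won.
fn first_winner(games: &mut [Game], guess: u32) -> Option<&mut Game> {
    games
        .iter_mut()
        .find_map(|game| if game.play(guess) { Some(game) } else { None })
}
```

Like find, find_map stops at the first match, so games after the winner are never played.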
