Avoiding "cannot move out of borrowed content" without the use of "to_vec"? - rust

I'm learning Rust and have a simple program, shown below.
#[derive(Debug)]
pub struct Foo {
    bar: String,
}
pub fn gather_foos<'a>(data: &'a Vec<Vec<&'a Foo>>) -> Vec<Vec<&'a Foo>> {
    let mut ret: Vec<Vec<&Foo>> = Vec::new();
    for i in 0..data.len() {
        if meets_requirements(&data[i]) {
            ret.push(data[i].to_vec());
        }
    }
    return ret
}
fn meets_requirements<'a>(_data: &'a Vec<&'a Foo>) -> bool {
    true
}
fn main() {
    let foo = Foo{
        bar: String::from("bar"),
    };
    let v1 = vec![&foo, &foo, &foo];
    let v2 = vec![&foo, &foo];
    let data = vec![v1, v2];
    println!("{:?}", gather_foos(&data));
}
The program simply loops through an array of arrays of a struct, checks if the array of structs meets some requirement and returns an array of arrays that meets said requirement.
I'm sure there's a more efficient way of doing this without the need to call to_vec(), which I had to implement in order to avoid the error cannot move out of borrowed content, but I'm not sure what that solution is.
I'm learning about Box<T> now and think it might provide a solution to my needs? Thanks for any help!!

The error is showing up because you're trying to move ownership of one of the vectors in the input vector to the output vector, which is not allowed since you've borrowed the input vector immutably. to_vec() creates a copy, which is why it works when you use it.
The solution depends on what you're trying to do. If you don't need the original input (you only want the matched ones), you can simply pass the input by value rather than by reference, which will allow you to consume the vector and move items to the output.
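A minimal sketch of the by-value approach. The predicate body is an assumption made up for illustration (the question's version always returns true), so that the filtering is visible:

```rust
#[derive(Debug)]
pub struct Foo {
    bar: String,
}

// Hypothetical predicate: keep inner vectors with at least three items.
fn meets_requirements(data: &[&Foo]) -> bool {
    data.len() >= 3
}

// `data` is taken by value, so each inner Vec is moved into the
// result instead of being copied with to_vec().
pub fn gather_foos<'a>(data: Vec<Vec<&'a Foo>>) -> Vec<Vec<&'a Foo>> {
    data.into_iter()
        .filter(|inner| meets_requirements(inner))
        .collect()
}
```

The caller gives up the input: after `gather_foos(data)`, `data` can no longer be used.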
If you do need the original input, but you don't want to copy the vectors with to_vec(), you may want to use references in the output. Note that the function then returns a vector of references to vectors, rather than a vector of owned vectors.
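A sketch of the reference-returning variant, again with a made-up predicate. The two lifetime names are my own choice: 'a is the borrow of the outer vector, 'b is the lifetime of the Foo references stored inside it.

```rust
#[derive(Debug)]
pub struct Foo {
    bar: String,
}

// Hypothetical predicate: keep inner vectors with at least three items.
fn meets_requirements(data: &[&Foo]) -> bool {
    data.len() >= 3
}

// Nothing is copied or moved: the result borrows the inner vectors
// that still live inside `data`.
pub fn gather_foos<'a, 'b>(data: &'a [Vec<&'b Foo>]) -> Vec<&'a Vec<&'b Foo>> {
    data.iter()
        .filter(|inner| meets_requirements(inner))
        .collect()
}
```

The result cannot outlive `data`, which is the price of not copying.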
For other cases, there are other options. If you need the data to be owned by multiple items for some reason, you could try Rc<T> or Arc<T> for reference-counted smart pointers, which can be cloned to provide immutable access to the same data by multiple owners.
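For the shared-ownership case, a rough sketch with Rc could look like the following. It assumes the inner vectors are wrapped in Rc up front, which changes the data layout from the question, and the predicate is again invented for illustration:

```rust
use std::rc::Rc;

#[derive(Debug)]
pub struct Foo {
    bar: String,
}

// Hypothetical predicate: keep inner vectors with at least three items.
fn meets_requirements(data: &[Foo]) -> bool {
    data.len() >= 3
}

// Cloning an Rc only bumps a reference count; the Vec itself is shared
// between the input and the output.
pub fn gather_foos(data: &[Rc<Vec<Foo>>]) -> Vec<Rc<Vec<Foo>>> {
    data.iter()
        .filter(|inner| meets_requirements(inner))
        .map(Rc::clone)
        .collect()
}
```

Use Arc instead of Rc if the data has to cross thread boundaries.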


Creating a `Pin<Box<[T; N]>>` in Rust when `[T; N]` is too large to be created on the stack

Generalized Question
How can I implement a general function pinned_array_of_default in stable Rust where [T; N] is too large to fit on the stack?
fn pinned_array_of_default<T: Default, const N: usize>() -> Pin<Box<[T; N]>> {
    unimplemented!()
}
Alternatively, T can implement Copy if that makes the process easier.
fn pinned_array_of_element<T: Copy, const N: usize>(x: T) -> Pin<Box<[T; N]>> {
    unimplemented!()
}
Keeping the solution in safe Rust would have been preferable, but it seems unlikely that it is possible.
Approaches
Initially I was hoping that by requiring Default I could let Box::default() handle the initial allocation. However, it still creates the array on the stack first, so this will not work for large values of N.
let boxed: Box<[T; N]> = Box::default();
let foo = Pin::new(boxed);
I suspect I need to use MaybeUninit to achieve this and there is a Box::new_uninit() function, but it is currently unstable and I would ideally like to keep this within stable Rust. I am also somewhat unsure whether transmuting Pin<Box<MaybeUninit<B>>> to Pin<Box<B>> could somehow have negative effects on the Pin.
Background
The purpose behind using a Pin<Box<[T; N]>> is to hold a block of pointers where N is some constant factor/multiple of the page size.
#[repr(C)]
#[derive(Copy, Clone)]
pub union Foo<R: ?Sized> {
    assigned: NonNull<R>,
    next_unused: Option<NonNull<Self>>,
}
Each pointer may or may not be in use at a given point in time. An in-use Foo points to R, and an unused/empty Foo holds either a pointer to the next empty Foo in the block or None. A pointer to the first unused Foo in the block is stored separately. When a block is full, a new block is created and the pointer chain of unused positions continues through the next block.
The box needs to be pinned since it will contain self-referential pointers, and outside structs will hold pointers into assigned positions in each block.
I know that Foo is wildly unsafe by Rust standards, but the general question of creating a Pin<Box<[T; N]>> still stands.
A way to construct a large array on the heap and avoid creating it on the stack is to proxy through a Vec. You can construct the elements and use .into_boxed_slice() to get a Box<[T]>. You can then use .try_into() to convert it to a Box<[T; N]>. And then use .into() to convert it to a Pin<Box<[T; N]>>:
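use std::convert::TryInto; // not needed from the 2021 edition on
use std::pin::Pin;
fn pinned_array_of_default<T: Default, const N: usize>() -> Pin<Box<[T; N]>> {
    // Build the elements directly in the Vec's heap buffer.
    let mut vec = vec![];
    vec.resize_with(N, T::default);
    // Box<[T]> -> Box<[T; N]> only fails on a length mismatch.
    let boxed: Box<[T; N]> = match vec.into_boxed_slice().try_into() {
        Ok(boxed) => boxed,
        Err(_) => unreachable!(),
    };
    boxed.into()
}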
You can optionally make this look more straightforward if you add T: Clone so that you can do vec![T::default(); N] and/or add T: Debug so you can use .unwrap() or .expect() on the conversion result.
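For illustration, here is a sketch of the shorter variant with both extra bounds, plus a guess at how the Copy-based pinned_array_of_element from the question could be filled in the same way. Neither is the only way to write it:

```rust
use std::convert::TryInto; // already in the prelude from the 2021 edition on
use std::fmt::Debug;
use std::pin::Pin;

// `Clone` enables `vec![value; N]`; `Debug` enables `.expect()` on the
// conversion result, whose error type is `Box<[T]>`.
fn pinned_array_of_default<T, const N: usize>() -> Pin<Box<[T; N]>>
where
    T: Default + Clone + Debug,
{
    let boxed: Box<[T; N]> = vec![T::default(); N]
        .into_boxed_slice()
        .try_into()
        .expect("the vector holds exactly N elements");
    boxed.into()
}

// `Copy` implies `Clone`, so `vec![x; N]` works without extra bounds.
fn pinned_array_of_element<T: Copy, const N: usize>(x: T) -> Pin<Box<[T; N]>> {
    let boxed: Box<[T; N]> = match vec![x; N].into_boxed_slice().try_into() {
        Ok(boxed) => boxed,
        Err(_) => unreachable!("the vector holds exactly N elements"),
    };
    boxed.into()
}
```

A call such as `pinned_array_of_default::<u64, 100_000>()` never materializes the array as a stack value in this function, because the elements are written straight into the Vec's heap buffer.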
See also:
Creating a fixed-size array on heap in Rust

Refactoring out `clone` when Copy trait is not implemented?

Is there a way to get rid of clone(), given the restrictions I've noted in the comments? I would really like to know if it's possible to use borrowing in this case, where modifying the third-party function signature is not possible.
// We should keep the "data" hidden from the consumer
mod le_library {
    pub struct Foobar {
        data: Vec<i32> // Something that doesn't implement Copy
    }
    impl Foobar {
        pub fn new() -> Foobar {
            Foobar {
                data: vec![1, 2, 3],
            }
        }
        pub fn foo(&self) -> String {
            let i = third_party(self.data.clone()); // Refactor out clone?
            format!("{}{}", "foo!", i)
        }
    }
    // Can't change the signature, suppose this comes from a crate
    pub fn third_party(data:Vec<i32>) -> i32 {
        data[0]
    }
}
use le_library::Foobar;
fn main() {
    let foobar = Foobar::new();
    let foo = foobar.foo();
    let foo2 = foobar.foo();
    println!("{}", foo);
    println!("{}", foo2);
}
As long as your foo() method accepts &self, it is not possible, because the
pub fn third_party(data: Vec<i32>) -> i32
signature is quite unambiguous: regardless of what this third_party function does, its API states that it needs its own instance of Vec, by value. This precludes borrowing of any form, and because foo() accepts self by reference, you can't really do anything except clone.
Also, presumably third_party is written without any weird unsafe hacks, so it is safe to assume that the Vec passed into it is eventually dropped and deallocated. Therefore, unsafely creating a copy of the original Vec without cloning it (by copying its internal pointers) is out of the question: you would end up with a use-after-free or a double free.
While your question does not state it, the fact that you want to preserve the original value of data is kind of a natural assumption. If this assumption can be relaxed, and you're actually okay with giving the data instance out and e.g. replacing it with an empty vector internally, then there are several things you can potentially do:
Switch foo(&self) to foo(&mut self), then you can quite easily extract data and replace it with an empty vector.
Use Cell or RefCell to store the data. This way, you can continue to use foo(&self), at the cost of some runtime checks when you extract the value out of a cell and replace it with some default value.
Both these approaches, however, will result in you losing the original Vec. With the given third-party API there is no way around that.
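Both options can be sketched as follows. CellFoobar is a name invented here to keep the two variants side by side, and third_party is a stand-in for the external function:

```rust
use std::cell::RefCell;
use std::mem;

// Stand-in for the external function; its signature is fixed.
pub fn third_party(data: Vec<i32>) -> i32 {
    data[0]
}

// Option 1: take `&mut self` and move the Vec out.
pub struct Foobar {
    data: Vec<i32>,
}

impl Foobar {
    pub fn new() -> Foobar {
        Foobar { data: vec![1, 2, 3] }
    }

    pub fn foo(&mut self) -> String {
        // `mem::take` leaves an empty Vec behind and returns the old one.
        let i = third_party(mem::take(&mut self.data));
        format!("foo!{}", i)
    }
}

// Option 2: keep `&self` by storing the Vec in a RefCell.
pub struct CellFoobar {
    data: RefCell<Vec<i32>>,
}

impl CellFoobar {
    pub fn new() -> CellFoobar {
        CellFoobar { data: RefCell::new(vec![1, 2, 3]) }
    }

    pub fn foo(&self) -> String {
        // `RefCell::take` panics if the cell is currently borrowed.
        let i = third_party(self.data.take());
        format!("foo!{}", i)
    }
}
```

With this stand-in, a second call to foo() on the same value would panic, because data[0] then indexes an empty vector. That is the "losing the original Vec" problem in practice.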
If you still can somehow influence this external API, then the best solution would be to change it to accept &[i32], which can easily be obtained from Vec<i32> with borrowing.
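If the external signature could be changed, the result might look like this sketch. It assumes the function really only needs to read the data:

```rust
// Hypothetical revised signature: borrow a slice instead of owning a Vec.
pub fn third_party(data: &[i32]) -> i32 {
    data[0]
}

pub struct Foobar {
    data: Vec<i32>,
}

impl Foobar {
    pub fn new() -> Foobar {
        Foobar { data: vec![1, 2, 3] }
    }

    pub fn foo(&self) -> String {
        // `&self.data` is a `&Vec<i32>`, which coerces to `&[i32]`.
        let i = third_party(&self.data);
        format!("foo!{}", i)
    }
}
```

No clone is needed, and foo() can be called any number of times.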
No, you can't get rid of the call to clone here.
The problem here is with the third-party library. As the function third_party is written now, it's true that it could take a &Vec<i32>; it doesn't require ownership, since it only reads out a value that is Copy. However, since the implementation is outside of your control, nothing prevents the maintainer from changing it to take advantage of owning the Vec. It's possible that whatever it does would be easier or require less memory if it were allowed to overwrite the provided memory, and the function's author is leaving the door open to do so in the future. If that's not the case, it might be worth suggesting a change to the third-party function's signature and relying on clone in the meantime.

Why does std::vec::Vec implement two kinds of the Extend trait?

The struct std::vec::Vec implements two kinds of Extend, as specified here – impl<'a, T> Extend<&'a T> for Vec<T> and impl<T> Extend<T> for Vec<T>. The documentation states that the first kind is an "Extend implementation that copies elements out of references before pushing them onto the Vec". I'm rather new to Rust, and I'm not sure if I'm understanding it correctly.
I would guess that the first kind is used with the equivalent of C++ normal iterators, and the second kind is used with the equivalent of C++ move iterators.
I'm trying to write a function that accepts any data structure that will allow inserting i32s to the back, so I take a parameter that implements both kinds of Extend, but I can't figure out how to specify the generic parameters to get it to work:
fn main() {
    let mut vec = std::vec::Vec::<i32>::new();
    add_stuff(&mut vec);
}
fn add_stuff<'a, Rec: std::iter::Extend<i32> + std::iter::Extend<&'a i32>>(receiver: &mut Rec) {
    let x = 1 + 4;
    receiver.extend(&[x]);
}
The compiler complains that &[x] "creates a temporary which is freed while still in use" which makes sense because 'a comes from outside the function add_stuff. But of course what I want is for receiver.extend(&[x]) to copy the element out of the temporary array slice and add it to the end of the container, so the temporary array will no longer be used after receiver.extend returns. What is the proper way to express what I want?
Seen from outside add_stuff, Rec must be extendable with a reference whose lifetime is only determined inside add_stuff. You can therefore require that Rec be extendable with references of any lifetime, using a higher-ranked trait bound:
fn main() {
    let mut vec = std::vec::Vec::<i32>::new();
    add_stuff(&mut vec);
}
fn add_stuff<Rec>(receiver: &mut Rec)
where
    for<'a> Rec: std::iter::Extend<&'a i32>
{
    let x = 1 + 4;
    receiver.extend(&[x]);
}
Moreover, as you see, the trait bounds were overly tight. One of them should be enough if you use receiver consistently within add_stuff.
That said, I would simply require Extend<i32> and make sure that add_stuff does the right thing internally (if possible):
fn add_stuff<Rec>(receiver: &mut Rec)
where
    Rec: std::iter::Extend<i32>
{
    let x = 1 + 4;
    receiver.extend(std::iter::once(x));
}
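When the values to add are themselves borrowed, "doing the right thing internally" can mean copying them out of the references before calling extend. A small sketch, where add_all is a hypothetical helper that is not part of the question:

```rust
// Hypothetical helper: accepts any collection that can be extended
// with owned i32 values.
fn add_all<Rec>(receiver: &mut Rec, values: &[i32])
where
    Rec: Extend<i32>,
{
    // `copied()` turns the iterator of `&i32` into an iterator of `i32`,
    // so only the `Extend<i32>` bound is needed.
    receiver.extend(values.iter().copied());
}
```

This works with Vec<i32> as well as with other standard collections such as VecDeque<i32>, since they all implement Extend<i32>.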

Is there a way to remove entries from a generic first vector that are present in another vector?

I have a problem understanding ownership when a higher order function is called. I am supposed to remove entries from the first vector if the elements exist in the second vector so I came up with this attempt:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
    a.iter()
        .filter(|incoming| !b.contains(incoming))
        .collect::<Vec<T>>()
}
I can't change the function signature. The .collect() call doesn't work because all I am getting is a reference to elements in a. While this is generic, I don't know if the result is copy-able or clone-able. I also probably can't dereference the elements in a.
Is there a way to fix this piece of code without rewriting it from scratch?
For this particular test ... you can consume the vector instead of relying on references. The signature yields values and not references. As such, to pass the test you only have to use into_iter instead:
    a.into_iter() // <----------- call into_iter
        .filter(|incoming| !b.contains(incoming))
        .collect::<Vec<T>>()
This consumes the values and returns them out again.
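Put together, the whole function with the unchanged signature would read roughly like this:

```rust
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
    // `into_iter` yields owned `T` values, so `collect` can build a
    // `Vec<T>` without requiring `T: Clone` or `T: Copy`.
    a.into_iter()
        .filter(|incoming| !b.contains(incoming))
        .collect()
}
```

Because no Clone bound is involved, this also works for element types such as String.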
Destroying the incoming allocation to create a new allocation isn't very efficient. Instead, write code that is more directly in line with the problem statement:
fn array_diff<T: PartialEq>(mut a: Vec<T>, b: Vec<T>) -> Vec<T> {
    a.retain(|aa| !b.contains(aa));
    a
}
Adding mut in the signature doesn't change the signature because no one can tell that you've added it. It's the exact same as:
fn array_diff<T: PartialEq>(a: Vec<T>, b: Vec<T>) -> Vec<T> {
    let mut a = a;
    a.retain(|aa| !b.contains(aa));
    a
}

Passing a member of a struct to a method of the same struct in Rust

I am facing a borrowing problem in Rust. I have found a way to solve it, but I don't think it is a good answer, so I am wondering if there is another way.
I use the following example code to describe my situation:
struct S {
    val: u8
}
impl S {
    pub fn f1(&mut self) {
        println!("F1");
        self.f2(self.val);
    }
    pub fn f2(&mut self, input: u8) {
        println!("F2");
        // Do something with input
    }
}
fn main() {
    let mut s = S {
        val: 0
    };
    s.f1();
}
Structure S has a method, f2, which takes an additional argument input to do something. There is another method, f1, which calls f2 with the val of structure S. Outside callers may call either f1 or f2 for different use cases.
When I compiled the above code, I got the following error message:
src\main.rs:9:17: 9:25 error: cannot use `self.val` because it was mutably borrowed [E0503]
src\main.rs:9 self.f2(self.val);
^~~~~~~~
src\main.rs:9:9: 9:13 note: borrow of `*self` occurs here
src\main.rs:9 self.f2(self.val);
^~~~
I roughly understand how borrowing works in Rust. So I know that I can solve the problem by changing the implementation of f1 to:
pub fn f1(&mut self) {
    let v = self.val;
    println!("F1");
    self.f2(v);
}
However, this solution feels a little redundant to me. I am wondering if there is a way to solve this problem without an extra variable binding.
Your solution works not because of an extra variable binding, but rather because of an extra copy. Integer types can be implicitly copied, so let v = self.val creates a copy of the value. That copy is not borrowed from self but owned, so the compiler allows you to call f2 with it.
If you write self.f2(self.val), the compiler will also attempt to make a copy of self.val. However, with the borrow checker that produced the error above, it is not possible to make the copy at this location, because self is already mutably borrowed for the function call. So the call is rejected unless you copy the value before it. This is not a syntax limitation but an enforcement of the borrow checker. In any case, it's better to write the copying and the call in the order in which they actually happen.
If the type you're trying to use as an argument were not Copy (e.g. a String), you would need to write let v = self.val.clone(); self.f2(v); to ask the compiler for a copy explicitly. Making such calls without making a copy is not allowed. You would probably need to make the method non-mutable or eliminate the argument somehow.
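As a sketch of the non-Copy case, here is the same shape with a String field. The log field is made up for this example so that f2 has something observable to do with its input:

```rust
struct S {
    val: String,
    // Hypothetical field, only here to make the effect of f2 visible.
    log: Vec<String>,
}

impl S {
    pub fn f1(&mut self) {
        // Explicit clone: `v` is an owned String, independent of `self`.
        let v = self.val.clone();
        self.f2(v);
    }

    pub fn f2(&mut self, input: String) {
        self.log.push(input);
    }
}
```

After f1() returns, self.val is still intact and the log holds the cloned copy.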
You can use this trick for copyable values:
pub fn f1(&mut self) {
    println!("F1");
    match self.val {x => self.f2(x)};
}
However, using an explicit temporary variable is clearer and more idiomatic.
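One caveat on the age of this thread: the error output in the question comes from an old compiler. As far as I'm aware, compilers with two-phase borrows (which shipped together with the 2018 edition's borrow checker) evaluate the arguments before the mutable borrow of the receiver is activated, so the direct call should compile there. A sketch, with a last_input field invented so the result can be checked:

```rust
struct S {
    val: u8,
    // Hypothetical field, only here to observe what f2 received.
    last_input: Option<u8>,
}

impl S {
    pub fn f1(&mut self) {
        // `self.val` is read (and copied) before the `&mut self`
        // borrow for the call to f2 takes effect.
        self.f2(self.val);
    }

    pub fn f2(&mut self, input: u8) {
        self.last_input = Some(input);
    }
}
```

If this fails to compile on your toolchain, the explicit temporary shown earlier remains the portable form.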
