Idiomatic append operation - rust

I'm writing a function that will transfer the contents from one Vec to another.
I managed to write two different versions of the same code. One is cleaner, but is potentially slower.
Version 1:
fn move_values<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
loop {
let value = buffer.pop();
if value.is_none() {
return;
}
recipient.push(value.unwrap());
}
}
Version 2:
fn move_values<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
for value in buffer.iter() {
recipient.push(value.clone());
}
buffer.clear();
}
My gut feeling is that Version 1 is faster because it only requires a single pass through the buffer, while Version 2 is more "Rusty" because it iterates over a collection rather than using loop.
Which of these is more idiomatic or "better practice" in general?
Note, I'm aware of append, I'm trying to do this by hand for educational purposes.

Neither. There's a built-in operation for this, Vec::append:
Moves all the elements of other into Self, leaving other empty.
fn move_values<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
recipient.append(buffer);
}
Neither of your functions even compiles:
fn move_values_1<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
loop {
let value = buffer.pop();
if value.is_none() {
return;
}
recipient.push_front(card.unwrap());
}
}
error[E0425]: unresolved name `card`
--> src/main.rs:7:30
|
7 | recipient.push_front(card.unwrap());
| ^^^^ unresolved name
fn move_values_2<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
for value in buffer.iter() {
recipient.push_front(value.clone());
}
buffer.clear();
}
error: no method named `push_front` found for type `&mut std::vec::Vec<T>` in the current scope
--> src/main.rs:7:19
|
7 | recipient.push_front(card.unwrap());
| ^^^^^^^^^^
if I were to implement it myself
Well, there's a reason that it's implemented for you, but sure... let's dig in.
Checking if something is_some or is_none can usually be avoided by pattern matching. For example:
fn move_values_1<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
while let Some(v) = buffer.pop() {
recipient.push(v);
}
}
Of course, this moves everything in reverse order because pushing and popping to a Vec both occur at the end.
Calling clone doesn't do what you want unless your trait bounds say that T implements Clone. Otherwise, you are just cloning the reference itself.
You can avoid the need for cloning if you drain the values from one collection and insert them into the other:
for value in buffer.drain(..) {
recipient.push(value);
}
But that for loop is silly, just extend the collection using the iterator:
recipient.extend(buffer.drain(..));
I'd still use the built-in append method when transferring between collections of the same type, as it is probably optimized for the precise data layout and potentially specialized for certain types of data.
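To make the ordering difference concrete, here is a small sketch of the three variants discussed above. The function names are made up for illustration; only pop, push, drain, extend and append come from Vec itself. None of them needs T: Clone, because the values are moved rather than copied.

```rust
// Hypothetical helper names; each function empties `buffer` into `recipient`.

/// Pops from the back of `buffer`, so elements arrive in reverse order.
fn move_reversed<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
    while let Some(v) = buffer.pop() {
        recipient.push(v);
    }
}

/// Drains from the front, so the original order is preserved.
fn move_in_order<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
    recipient.extend(buffer.drain(..));
}

/// The built-in operation; it also preserves order.
fn move_with_append<T>(buffer: &mut Vec<T>, recipient: &mut Vec<T>) {
    recipient.append(buffer);
}
```

Starting from a recipient of [0] and a buffer of [1, 2, 3], the first function leaves [0, 3, 2, 1] in the recipient, while the other two leave [0, 1, 2, 3]. All three leave the buffer empty.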

Related

Given an optional mutable reference, can I pass it to another function without moving it?

I am working on a little Rust project where many functions take an optional mutable reference to a struct. For simplicity, let's say that this struct is a String. So the functions look something like this:
fn append_to_string(maybe_string: Option<&mut String>) {
if let Some(s) = maybe_string {
s.push('1');
}
}
My main function has ownership of the optional structure. So it can easily call these functions using Option::as_mut:
fn main() {
let mut maybe_string = Some(String::new());
append_to_string(maybe_string.as_mut());
println!("{:?}", maybe_string);
}
This all seems to work fine. But the problem comes when one of the functions needs to call others.
fn append_multiple_to_string(maybe_string: Option<&mut String>) {
for _ in 0..2 {
append_to_string(maybe_string);
}
}
I can't compile this, because append_multiple_to_string moves maybe_string into append_to_string in the first iteration of the loop, so it can't use it again in subsequent iterations.
I've actually figured out a way to make this work by extracting the reference from the Option and constructing a new Option for each iteration, like this:
fn append_multiple_to_string(maybe_string: Option<&mut String>) {
match maybe_string {
Some(string) => {
for _ in 0..2 {
append_to_string(Some(string));
}
}
None => {
for _ in 0..2 {
append_to_string(None);
}
}
}
}
But this feels very cumbersome and I don't love that I have to repeat basically the same code twice to make it work. I feel like I must be missing a more elegant way of doing this, but I just don't seem to be able to figure out what it is. I guess I could make a macro that could take one copy of the code and expand it, but I have written macros before and I find them to be difficult to write and maintain, so I'd rather avoid that.
I assume that there is no way to make a copy of the Option to pass in, because then I would have two simultaneous mutable references to the same data. So am I just stuck with the ugly code that I have?
I'm open to changing the argument type away from Option<&mut String>, but I'm not sure what to change it to so that I can avoid this problem. If I do need to change it, I would prefer not to change it in such a way that the functions can change the value of main's maybe_string.is_some(). That is to say, with my current code, if a function calls maybe_string.take(), it is only taking the value out of its copy of the Option, not main's copy.
I would also prefer not to solve this problem using unsafe code.
You can use Option::as_deref_mut:
fn append_multiple_to_string(mut maybe_string: Option<&mut String>) {
// ^^^
for _ in 0..2 {
append_to_string(maybe_string.as_deref_mut());
// ^^^^^^^^^^^^^^^
}
}
as_deref_mut will turn a &'a mut Option<P> (where P is a mutable pointer type implementing DerefMut) into an Option<&'a mut P::Target>. In the case of a &mut T pointer, this means it will just turn a &'a mut Option<&'b mut T> into an Option<&'a mut T>, but it also works with other mutable pointer types, such as &'a mut Option<Box<T>> → Option<&'a mut T> or &'a mut Option<String> → Option<&'a mut str>.
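As a quick illustration of the other pointer types mentioned above, here is a minimal sketch. The functions shout and bump are hypothetical and exist only to consume the resulting options; as_deref_mut itself is the standard Option method.

```rust
// Hypothetical consumers that take an optional mutable reference by value.
fn shout(s: Option<&mut str>) {
    if let Some(s) = s {
        s.make_ascii_uppercase();
    }
}

fn bump(n: Option<&mut i32>) {
    if let Some(n) = n {
        *n += 1;
    }
}

/// Calls each consumer twice on an owned Option, reborrowing every time.
fn demo() -> (Option<String>, Option<Box<i32>>) {
    // &mut Option<String> -> Option<&mut str>
    let mut name = Some(String::from("abc"));
    shout(name.as_deref_mut());
    shout(name.as_deref_mut());

    // &mut Option<Box<i32>> -> Option<&mut i32>
    let mut boxed = Some(Box::new(1));
    bump(boxed.as_deref_mut());
    bump(boxed.as_deref_mut());

    (name, boxed)
}
```

Each call produces a fresh, shorter-lived Option, so the owning Option is still usable afterwards and nothing is moved out of it.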

How can you use an immutable Option by reference that contains a mutable reference?

Here's a Thing:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
And here's a function that tries to mutate a Thing and returns either true or false, depending on if a Thing is available:
fn try_increment(handle: Option<&mut Thing>) -> bool {
if let Some(t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
Here's a sample of usage:
fn main() {
try_increment(None);
let mut thing = Thing(0);
try_increment(Some(&mut thing));
try_increment(Some(&mut thing));
try_increment(None);
}
As written above, it works just fine. Output below:
warning: increment failed
incremented: 1
incremented: 2
warning: increment failed
The problem arises when I want to write a function that mutates the Thing twice. For example, the following does not work:
fn try_increment_twice(handle: Option<&mut Thing>) {
try_increment(handle);
try_increment(handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
The error makes perfect sense. The first call to try_increment(handle) gives ownership of handle away and so the second call is illegal. As is often the case, the Rust compiler yields a sensible error message:
|
24 | try_increment(handle);
| ------ value moved here
25 | try_increment(handle);
| ^^^^^^ value used here after move
|
In an attempt to solve this, I thought it would make sense to pass handle by reference. It should be an immutable reference, mind, because I don't want try_increment to be able to change handle itself (assigning None to it, for example) only to be able to call mutations on its value.
My problem is that I couldn't figure out how to do this.
Here is the closest working version that I could get:
struct Thing(i32);
impl Thing {
pub fn increment_self(&mut self) {
self.0 += 1;
println!("incremented: {}", self.0);
}
}
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
// PROBLEM: this line is allowed!
// (*handle) = None;
if let Some(ref mut t) = handle {
t.increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
This code runs, as expected, but the Option is now passed about by mutable reference and that is not what I want:
I'm allowed to mutate the Option by reassigning None to it, breaking all following mutations. (Uncomment the (*handle) = None; line, for example.)
It's messy: There are a whole lot of extraneous &mut's lying about.
It's doubly messy: Heaven only knows why I must use ref mut in the if let statement while the convention is to use &mut everywhere else.
It defeats the purpose of having the complicated borrow-checking and mutability checking rules in the compiler.
Is there any way to actually achieve what I want: passing an immutable Option around, by reference, and actually being able to use its contents?
You can't extract a mutable reference from an immutable one, even a reference to its internals. That's kind of the point! Multiple aliases of immutable references are allowed so, if Rust allowed you to do that, you could have a situation where two pieces of code are able to mutate the same data at the same time.
Rust provides several escape hatches for interior mutability, for example the RefCell:
use std::cell::RefCell;
fn try_increment(handle: &Option<RefCell<Thing>>) -> bool {
if let Some(t) = handle {
t.borrow_mut().increment_self();
true
} else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(handle: Option<RefCell<Thing>>) {
try_increment(&handle);
try_increment(&handle);
}
fn main() {
let thing = RefCell::new(Thing(0));
try_increment_twice(Some(thing));
try_increment_twice(None);
}
TL;DR: The answer is No, I can't.
After the discussions with @Peter Hall and @Stargateur, I have come to understand why I need to use &mut Option<&mut Thing> everywhere. RefCell<> would also be a feasible workaround, but it is no neater and does not really achieve the pattern I was originally seeking to implement.
The problem is this: if one were allowed to mutate the object for which one has only an immutable reference to an Option<&mut T> one could use this power to break the borrowing rules entirely. Concretely, you could, essentially, have many mutable references to the same object because you could have many such immutable references.
I knew there was only one mutable reference to the Thing (owned by the Option<>) but, as soon as I started taking references to the Option<>, the compiler no longer knew that there weren't many of those.
The best version of the pattern is as follows:
fn try_increment(handle: &mut Option<&mut Thing>) -> bool {
if let Some(ref mut t) = handle {
t.increment_self();
true
}
else {
println!("warning: increment failed");
false
}
}
fn try_increment_twice(mut handle: Option<&mut Thing>) {
try_increment(&mut handle);
try_increment(&mut handle);
}
fn main() {
try_increment_twice(None);
let mut thing = Thing(0);
try_increment_twice(Some(&mut thing));
try_increment_twice(None);
}
Notes:
The Option<> holds the only extant mutable reference to the Thing
try_increment_twice() takes ownership of the Option<>
try_increment() must take the Option<> as &mut so that the compiler knows that it has the only mutable reference to the Option<>, during the call
If the compiler knows that try_increment() has the only mutable reference to the Option<> which holds the unique mutable reference to the Thing, the compiler knows that the borrow rules have not been violated.
Another Experiment
The problem of the mutability of Option<> remains because one can call take() et al. on a mutable Option<>, breaking everything following.
To implement the pattern that I wanted, I need something that is like an Option<> but, even if it is mutable, it cannot be mutated. Something like this:
struct Handle<'a> {
value: Option<&'a mut Thing>,
}
impl<'a> Handle<'a> {
fn new(value: &'a mut Thing) -> Self {
Self {
value: Some(value),
}
}
fn empty() -> Self {
Self {
value: None,
}
}
fn try_mutate<T, F: Fn(&mut Thing) -> T>(&mut self, mutation: F) -> Option<T> {
if let Some(ref mut v) = self.value {
Some(mutation(v))
}
else {
None
}
}
}
Now, I thought, I can pass around &mut Handles all day long and know that someone who has a Handle can only mutate its contents, not the handle itself.
Unfortunately, even this gains nothing because, if you have a mutable reference, you can always reassign it with the dereferencing operator:
fn try_increment(handle: &mut Handle) -> bool {
if let Some(_) = handle.try_mutate(|t| { t.increment_self() }) {
// This breaks future calls:
(*handle) = Handle::empty();
true
}
else {
println!("warning: increment failed");
false
}
}
Which is all fine and well.
Bottom line conclusion: just use &mut Option<&mut T>
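For what it's worth, the Option::as_deref_mut approach from the earlier answer also seems to fit this case. The sketch below keeps the original by-value signature of try_increment and only changes try_increment_twice; returning the two booleans is my addition so the result can be inspected. Because try_increment receives its own Option, setting it to None inside the callee cannot affect the caller's handle.

```rust
struct Thing(i32);

impl Thing {
    fn increment_self(&mut self) {
        self.0 += 1;
    }
}

// Takes the Option by value, so it cannot clear the caller's Option.
fn try_increment(handle: Option<&mut Thing>) -> bool {
    match handle {
        Some(t) => {
            t.increment_self();
            true
        }
        None => false,
    }
}

// Each call reborrows the inner reference instead of moving `handle`.
fn try_increment_twice(mut handle: Option<&mut Thing>) -> (bool, bool) {
    let first = try_increment(handle.as_deref_mut());
    let second = try_increment(handle.as_deref_mut());
    (first, second)
}
```

The binding in try_increment_twice still has to be declared mut, but the mutability stays local to that function.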

Can a type know when a mutable borrow to itself has ended?

I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?
Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let mut f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.
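A self-contained version of the sketch above might look like the following. The gcd helper, the denominator field name and the Display implementation are my assumptions, since the answer leaves them unspecified; the rest follows the code shown earlier.

```rust
mod frac {
    use std::fmt;
    use std::ops::{Deref, DerefMut};

    pub struct RawFraction {
        pub numerator: i32,
        pub denominator: i32,
    }

    impl RawFraction {
        // Euclid's algorithm (hypothetical helper, not part of the answer).
        fn gcd(mut a: i32, mut b: i32) -> i32 {
            while b != 0 {
                let r = a % b;
                a = b;
                b = r;
            }
            a.abs()
        }

        pub fn reduce(&mut self) {
            let g = Self::gcd(self.numerator, self.denominator);
            if g > 1 {
                self.numerator /= g;
                self.denominator /= g;
            }
        }
    }

    // Guard that re-establishes the invariant when it is dropped.
    pub struct RefMut<'a>(&'a mut RawFraction);

    impl<'a> Deref for RefMut<'a> {
        type Target = RawFraction;
        fn deref(&self) -> &RawFraction {
            &*self.0
        }
    }

    impl<'a> DerefMut for RefMut<'a> {
        fn deref_mut(&mut self) -> &mut RawFraction {
            &mut *self.0
        }
    }

    impl<'a> Drop for RefMut<'a> {
        fn drop(&mut self) {
            self.0.reduce();
        }
    }

    // The field is private, so code outside `frac` must go through `borrow_mut`.
    pub struct Fraction(RawFraction);

    impl Fraction {
        pub fn new(numerator: i32, denominator: i32) -> Self {
            let mut raw = RawFraction { numerator, denominator };
            raw.reduce();
            Fraction(raw)
        }

        pub fn borrow_mut(&mut self) -> RefMut<'_> {
            RefMut(&mut self.0)
        }
    }

    impl fmt::Display for Fraction {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "{}/{}", self.0.numerator, self.0.denominator)
        }
    }
}
```

In a statement like f.borrow_mut().numerator += 3; the temporary RefMut is dropped at the end of the statement, which is when reduce runs. As the disclaimer points out, a caller can still skip that step with mem::forget, so this is a convenience rather than a guarantee.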
Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.

Game loop in Rust while satisfying the borrow checker [duplicate]

I'm writing a game engine. In the engine, I've got a game state which contains the list of entities in the game.
I want to provide an update function on my game state which will in turn tell each entity to update. Each entity needs to be able to refer to the game state in order to correctly update itself.
Here's a simplified version of what I have so far.
pub struct GameState {
pub entities: Vec<Entity>,
}
impl GameState {
pub fn update(&mut self) {
for mut t in self.entities.iter_mut() {
t.update(self);
}
}
}
pub struct Entity {
pub value: i64,
}
impl Entity {
pub fn update(&mut self, container: &GameState) {
self.value += container.entities.len() as i64;
}
}
fn main() {
let mut c = GameState { entities: vec![] };
c.entities.push(Entity { value: 1 });
c.entities.push(Entity { value: 2 });
c.entities.push(Entity { value: 3 });
c.update();
}
The problem is the borrow checker doesn't like me passing the gamestate to the entity:
error[E0502]: cannot borrow `*self` as immutable because `self.entities` is also borrowed as mutable
--> example.rs:8:22
|
7 | for mut t in self.entities.iter_mut() {
| ------------- mutable borrow occurs here
8 | t.update(self);
| ^^^^ immutable borrow occurs here
9 | }
| - mutable borrow ends here
error: aborting due to previous error
Can anyone give me some suggestions on better ways to design this that fits with Rust better?
Thanks!
First, let's answer the question you didn't ask: Why is this not allowed?
The answer lies around the guarantees that Rust makes about & and &mut pointers. A & pointer is guaranteed to point to an immutable object, i.e. it's impossible for the objects behind the pointer to mutate while you can use that pointer. A &mut pointer is guaranteed to be the only active pointer to an object, i.e. you can be sure that nobody is going to observe or mutate the object while you're mutating it.
Now, let's look at the signature of Entity::update:
impl Entity {
pub fn update(&mut self, container: &GameState) {
// ...
}
}
This method takes two parameters: a &mut Entity and a &GameState. But hold on, we can get another reference to self through the &GameState! For example, suppose that self is the first entity. If we do this:
impl Entity {
pub fn update(&mut self, container: &GameState) {
let self_again = &container.entities[0];
// ...
}
}
then self and self_again alias each other (i.e. they refer to the same thing), which is not allowed as per the rules I mentioned above because one of the pointers is a mutable pointer.
What can you do about this?
One option is to remove an entity from the entities vector before calling update on it, then inserting it back after the call. This solves the aliasing problem because we can't get another alias to the entity from the game state. However, removing the entity from the vector and reinserting it are operations with linear complexity (the vector needs to shift all the following items), and if you do it for each entity, then the main update loop runs in quadratic complexity. You can work around that by using a different data structure; this can be as simple as a Vec<Option<Entity>>, where you simply take the Entity from each Option, though you might want to wrap this into a type that hides all None values to external code. A nice consequence is that when an entity has to interact with other entities, it will automatically skip itself when iterating on the entities vector, since it's no longer there!
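A rough sketch of the Vec<Option<Entity>> idea could look like this. It reuses the Entity and GameState names from the question, but the body of Entity::update (counting the other entities) is only an illustrative assumption.

```rust
pub struct Entity {
    pub value: i64,
}

pub struct GameState {
    pub entities: Vec<Option<Entity>>,
}

impl Entity {
    pub fn update(&mut self, container: &GameState) {
        // The entity being updated has been taken out of its slot,
        // so only the *other* entities are counted here.
        let others = container.entities.iter().flatten().count();
        self.value += others as i64;
    }
}

impl GameState {
    pub fn update(&mut self) {
        for i in 0..self.entities.len() {
            // `take` leaves None in the slot and hands us the owned entity,
            // so no borrow of `self.entities` is held during the call.
            if let Some(mut entity) = self.entities[i].take() {
                entity.update(self);
                self.entities[i] = Some(entity);
            }
        }
    }
}
```

Taking and restoring a slot is constant time, so the whole update pass stays linear in the number of entities.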
A variation on the above is to simply take ownership of the whole vector of entities and temporarily replace the game state's vector of entities with an empty one.
impl GameState {
pub fn update(&mut self) {
let mut entities = std::mem::take(&mut self.entities);
for t in entities.iter_mut() {
t.update(self);
}
self.entities = entities;
}
}
This has one major downside: Entity::update will not be able to interact with the other entities.
Another option is to wrap each entity in a RefCell.
use std::cell::RefCell;
pub struct GameState {
pub entities: Vec<RefCell<Entity>>,
}
impl GameState {
pub fn update(&mut self) {
for t in self.entities.iter() {
t.borrow_mut().update(self);
}
}
}
By using RefCell, we can avoid retaining a mutable borrow on self. Here, we can use iter instead of iter_mut to iterate on entities. In return, we now need to call borrow_mut to obtain a mutable pointer to the value wrapped in the RefCell.
RefCell essentially performs borrow checking at runtime. This means that you can end up writing code that compiles fine but panics at runtime. For example, if we write Entity::update like this:
impl Entity {
pub fn update(&mut self, container: &GameState) {
for entity in container.entities.iter() {
self.value += entity.borrow().value;
}
}
}
the program will panic:
thread 'main' panicked at 'already mutably borrowed: BorrowError', ../src/libcore/result.rs:788
That's because we end up calling borrow on the entity that we're currently updating, which is still borrowed by the borrow_mut call done in GameState::update. Entity::update doesn't have enough information to know which entity is self, so you would have to use try_borrow or borrow_state (which are both unstable as of Rust 1.12.1) or pass additional data to Entity::update to avoid panics with this approach.
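On current stable Rust, RefCell::try_borrow is available, so the panic can be avoided by skipping any entity that is already mutably borrowed. The sketch below is one way to do that; the from_values and values helpers are hypothetical and exist only to make the example easy to exercise.

```rust
use std::cell::RefCell;

pub struct Entity {
    pub value: i64,
}

pub struct GameState {
    pub entities: Vec<RefCell<Entity>>,
}

impl Entity {
    pub fn update(&mut self, container: &GameState) {
        for cell in container.entities.iter() {
            // The entity being updated is already mutably borrowed,
            // so try_borrow fails for it and it is skipped.
            if let Ok(other) = cell.try_borrow() {
                self.value += other.value;
            }
        }
    }
}

impl GameState {
    // Hypothetical constructor for the example.
    pub fn from_values(values: &[i64]) -> Self {
        GameState {
            entities: values
                .iter()
                .map(|&value| RefCell::new(Entity { value }))
                .collect(),
        }
    }

    // Hypothetical accessor for the example.
    pub fn values(&self) -> Vec<i64> {
        self.entities.iter().map(|c| c.borrow().value).collect()
    }

    pub fn update(&mut self) {
        for cell in self.entities.iter() {
            cell.borrow_mut().update(self);
        }
    }
}
```

Note that entities updated earlier in the pass are seen with their new values by later ones, so the result depends on iteration order.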

vector method push_all is not found for a custom struct

So in this simple example
#![feature(collections)]
struct User {
reference: String,
email: String
}
fn main() {
let rows = vec![
vec!["abcd".to_string(), "test@test.com".to_string()],
vec!["efgh".to_string(), "test1@test.com".to_string()],
vec!["wfee".to_string(), "test2@test.com".to_string()],
vec!["rrgr".to_string(), "test3@test.com".to_string()]
];
let mut rows_mut: Vec<Vec<String>> = Vec::new();
rows_mut.push_all(&rows);
let mut users_mut: Vec<User> = Vec::new();
let users = vec![
User { reference: "ref1".to_string(), email: "test@test.com".to_string() },
User { reference: "ref2".to_string(), email: "test1@test.com".to_string() }
];
users_mut.push_all(&users);
}
I'm getting an error
src/main.rs:24:12: 24:28 error: no method named `push_all` found for type `collections::vec::Vec<User>` in the current scope
src/main.rs:24 users_mut.push_all(&users);
^~~~~~~~~~~~~~~~
error: aborting due to previous error
Why does it work for Vec<String>, but not for Vec<User>? Is the only way in this case to iterate and add elements one by one?
Look at the definition of push_all:
impl<T> Vec<T> where T: Clone {
fn push_all(&mut self, other: &[T]);
}
Appends all elements in a slice to the Vec.
Iterates over the slice other, clones each element, and then appends it to this Vec. The other vector is traversed in-order.
(Emphasis mine.)
Your type must implement Clone because it clones each value. String does; User doesn’t. You can add #[derive(Clone)] to it.
If you are willing to consume the source vector, you should use x.extend(y.into_iter()) which avoids needing to clone the values.
Of course, in this trivial case, if the only difference is mutability, just add mut to the initial binding. This works for function arguments too: the part before the colon in each argument is a pattern, as with let, so fn foo(mut x: Vec<T>) { … } works fine and is equivalent to fn foo(x: Vec<T>) { let mut x = x; … }.
Because, if you go to the documentation for Vec::push_all and scroll up a little, you'll see this line:
impl<T: Clone> Vec<T>
This means that the following methods are only implemented for Vec<T> when T implements Clone. In this case, T would be User, and User does not implement Clone. Therefore, the method does not exist.
You can solve this by adding #[derive(Clone)] before struct User {...}.
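push_all was an unstable method and is not available on stable Rust today; Vec::extend_from_slice plays the same role and has the same T: Clone requirement. Here is a minimal sketch of both the cloning and the consuming variants, where copy_users and move_users are hypothetical helper names (Debug and PartialEq are derived only so the results can be compared).

```rust
#[derive(Clone, Debug, PartialEq)]
struct User {
    reference: String,
    email: String,
}

/// Clones every element of the slice; requires `User: Clone`.
fn copy_users(users: &[User]) -> Vec<User> {
    let mut users_mut = Vec::new();
    users_mut.extend_from_slice(users);
    users_mut
}

/// Consumes the source vector, so no cloning (and no `Clone` bound) is needed.
fn move_users(users: Vec<User>) -> Vec<User> {
    let mut users_mut = Vec::new();
    users_mut.extend(users);
    users_mut
}
```

Without the derive, copy_users fails to compile for the same reason push_all did, while move_users still works.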
