Idiomatically access an element of a vector mutably and immutably - rust

How would you mutate a vector in such a way where you would need an immutable reference to said vector to determine how you would need to mutate the vector? For example, I have a piece of code that looks something like this, and I want to duplicate the last element of the vector:
let mut vec: Vec<usize> = vec![123, 42, 10];
// Doesn't work of course:
vec.push(*vec.last().unwrap());
// Works, but is this necessary?
let x = *vec.last().unwrap();
vec.push(x);

immutable reference [...] to determine how you would need to mutate the vector?
The short answer is you don't. Any mutation to the vector could possibly invalidate all existing references, making any future operations access invalid data, potentially causing segfaults. Safe Rust doesn't allow for that possibility.
Your second example creates a copy of the value in the vector, so it no longer matters what happens to the vector; that value will continue to be valid.
What's unfortunate about the first example is that if you follow the order of operations, a human can tell that the immutable value is retrieved before the mutation happens. In fact, that's why the multiple-statement version is possible at all! This is indeed a current limitation of the Rust borrow checker. There is investigation ongoing to see if some of these types of limitations can be lifted.
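To make the "copy the value out first" idea concrete, here is a minimal sketch. The helper name duplicate_last is made up for illustration; it is just one way to package the two-statement version so that the shared borrow is over before the push starts.

```rust
/// Appends a copy of the last element, if there is one.
/// (`duplicate_last` is a hypothetical helper name, not a std API.)
fn duplicate_last(v: &mut Vec<usize>) {
    // `last()` borrows `v` immutably; `copied()` turns Option<&usize> into
    // Option<usize>, so no reference into the vector is left alive
    // by the time `push` needs mutable access.
    if let Some(last) = v.last().copied() {
        v.push(last);
    }
}

fn main() {
    let mut vec: Vec<usize> = vec![123, 42, 10];
    duplicate_last(&mut vec);
    println!("{:?}", vec); // [123, 42, 10, 10]
}
```

Using if let instead of unwrap also means an empty vector is simply left untouched rather than causing a panic.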

Related

Is Vec<&&str> the same as Vec<&str>?

I'm learning Rust and I'm trying to solve an advent of code challenge (day 9 2015).
I created a situation where I end up with a variable that has the type Vec<&&str> (note the double '&', it's not a typo). I'm now wondering if this type is different than Vec<&str>. I can't figure out if a reference to a reference to something would ever make sense. I know I can avoid this situation by using String for the from and to variables. I'm asking if Vec<&&str> == Vec<&str> and if I should try and avoid Vec<&&str>.
Here is the code that triggered this question:
use itertools::Itertools;
use std::collections::HashSet;
use std::fs;
fn main() {
let contents = fs::read_to_string("input.txt").unwrap();
let mut vertices: HashSet<&str> = HashSet::new();
for line in contents.lines() {
let data: Vec<&str> = line.split(" ").collect();
let from = data[0];
let to = data[2];
vertices.insert(from);
vertices.insert(to);
}
// `Vec<&&str>` originates from here
let permutations_iter = vertices.iter().permutations(vertices.len());
for perm in permutations_iter {
let length_trip = compute_length_of_trip(&perm);
}
}
fn compute_length_of_trip(trip: &Vec<&&str>) -> u32 {
...
}
Are Vec<&str> and Vec<&&str> different types?
I'm now wondering if this type is different than Vec<&str>.
Yes, a Vec<&&str> is a type different from Vec<&str> - you can't pass a Vec<&&str> where a Vec<&str> is expected and vice versa. Vec<&str> stores string slice references, which you can think of as pointers to data inside some strings. Vec<&&str> stores references to such string slice references, i.e. pointers to pointers to data. With the latter, accessing the string data requires an additional indirection.
However, Rust's auto-dereferencing makes it possible to use a Vec<&&str> much like you'd use a Vec<&str> - for example, v[0].len() will work just fine on either, v[some_idx].chars() will iterate over chars with either, and so on. The only difference is that Vec<&&str> stores the data more indirectly and therefore requires a bit more work on each access, which can lead to slightly less efficient code.
Note that you can always convert a Vec<&&str> to Vec<&str> - but since doing so requires allocating a new vector, if you decide you don't want Vec<&&str>, it's better not to create it in the first place.
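As a rough illustration of both points (auto-dereferencing, and the conversion that needs a new vector), here is a small sketch. The function names total_len and flatten are invented for this example.

```rust
/// Sums the string lengths; method calls auto-dereference through `&&str`.
fn total_len(v: &[&&str]) -> usize {
    v.iter().map(|s| s.len()).sum()
}

/// Builds a new `Vec<&str>` by copying each inner `&str` out of the `&&str`.
/// This allocates a fresh vector, as noted above.
fn flatten<'a>(v: &[&&'a str]) -> Vec<&'a str> {
    v.iter().map(|s| **s).collect()
}

fn main() {
    let a = "London";
    let b = "Dublin";
    let nested: Vec<&&str> = vec![&a, &b];

    println!("{}", nested[0].len()); // 6, same call as on a Vec<&str>
    println!("{}", total_len(&nested)); // 12

    let flat: Vec<&str> = flatten(&nested);
    println!("{:?}", flat); // ["London", "Dublin"]
}
```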
Can I avoid Vec<&&str> and how?
Since a &str is Copy, you can avoid the creation of Vec<&&str> by adding a .copied() when you iterate over vertices, i.e. change vertices.iter() to vertices.iter().copied(). If you don't need vertices sticking around, you can also use vertices.into_iter(), which will give out &str, as well as free the vertices set as soon as the iteration is done.
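A minimal sketch of the .copied() approach, using only the standard library (so without itertools). The helper name names and the sample input lines are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Collects the set's string slices as `&str` rather than `&&str`.
/// (`names` is a hypothetical helper, not part of the original code.)
fn names<'a>(vertices: &HashSet<&'a str>) -> Vec<&'a str> {
    // `iter()` yields `&&str`; `copied()` dereferences once, yielding `&str`.
    let mut out: Vec<&'a str> = vertices.iter().copied().collect();
    // HashSet iteration order is unspecified, so sort for a stable result.
    out.sort();
    out
}

fn main() {
    // Stand-in for the contents of input.txt.
    let contents = String::from("London to Dublin = 464\nLondon to Belfast = 518");
    let mut vertices: HashSet<&str> = HashSet::new();
    for line in contents.lines() {
        let data: Vec<&str> = line.split(' ').collect();
        vertices.insert(data[0]);
        vertices.insert(data[2]);
    }
    println!("{:?}", names(&vertices)); // ["Belfast", "Dublin", "London"]
}
```

The same .copied() call can be placed in front of .permutations(...) in the original code, which should then produce Vec<&str> items.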
The reason why the additional reference arises and the ways to avoid it have been covered on StackOverflow before.
Should I avoid Vec<&&str>?
There is nothing inherently wrong with Vec<&&str> that would require one to avoid it. In most code you'll never notice the difference in efficiency between Vec<&&str> and Vec<&str>. Having said that, there are some reasons to avoid it beyond performance in microbenchmarks. The additional indirection in Vec<&&str> requires the exact &strs it was created from (and not just the strings that own the data) to stick around and outlive the new collection. This is not relevant in your case, but would become noticeable if you wanted to return the permutations to the caller that owns the strings. Also, there is value in the simpler type that doesn't accumulate a reference on each transformation. Just imagine needing to transform the Vec<&&str> further into a new vector - you wouldn't want to deal with Vec<&&&str>, and so on for every new transformation.
Regarding performance, less indirection is usually better since it avoids an extra memory access and increases data locality. However, one should also note that a Vec<&str> takes up 16 bytes per element (on 64-bit architectures) because a slice reference is represented by a "fat pointer", i.e. a pointer/length pair. A Vec<&&str> (as well as Vec<&&&str> etc.) on the other hand takes up only 8 bytes per element, because a reference to a fat reference is represented by a regular "thin" pointer. So if your vector measures millions of elements, a Vec<&&str> might be more efficient than Vec<&str> simply because it occupies less memory. As always, if in doubt, measure.
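The size claim is easy to check yourself. This sketch compares against size_of::<usize>() rather than hard-coding 16 and 8, so it should hold on 32-bit targets as well.

```rust
use std::mem::size_of;

fn main() {
    // A `&str` is a fat pointer: a data pointer plus a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    // A reference to a `&str` is an ordinary thin pointer.
    assert_eq!(size_of::<&&str>(), size_of::<usize>());
    println!(
        "&str: {} bytes, &&str: {} bytes",
        size_of::<&str>(),
        size_of::<&&str>()
    );
}
```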
The reason you have &&str is that the &str data is owned by vertices, and when you create an iterator over that data you simply get a reference to it, hence the &&str.
There's really nothing to avoid here. It simply shows your iterator references the data that is inside the HashSet.

Is it possible to create a Rust function that inputs and outputs an edited value, without killing its lifetime?

I need a function that accepts a vector without killing its lifetime, while still being able to modify it (so no reference).
The modification in this case is the following: You have a vector of a certain structure that has a children attribute, which is a vector of this same type. The function takes the last struct of the vector, goes into its children, again takes the last element of the children's vector, goes into that element's children, and so on, n times. Then the function returns the n-th level child.
How would I go about making a compilable code like following pseudo code?
fn g(vector: Vec<...>, n: numeric input) {
let temporary;
n times {
temporary = vector.last().unwrap().children;
}
return temporary;
}
let vec = Vec // Existing Vector
g(vec).push() // pushes into a certain child element of the vec element, this child is received over the function.
However, the above won't work: by giving the vector to the function, its ownership is transferred to the function and its lifetime ends there.
(This is all very difficult to do correctly without an M.R.E. or even a copypaste of your compiler-errors - thus I'll proceed via theoretical analysis and educated guessing.)
Even though you pass in the tokens via &mut, that gets automagically converted into a nonexclusive & reference because that's what's in the signature. You can't then get an exclusive reference from that, only a mutable name that holds nonexclusive references.
You need an exclusive reference in order to do that push operation - the "many readers XOR one writer" principle requires that.
Once you have outermost_vector as an exclusive reference, you then need to assign more exclusive references to it. last returns nonexclusive references, so that's out. But since outermost_vector is now an exclusive reference, you can call slice::last_mut on it. Then - assuming all of this is even possible in the first place - you just need to follow the compiler-errors to fix up other minor things such as the intricacies of correctly pattern-matching the intermediate values.
You also need to fix the function's overall typing; Vec::push returns (), which thankfully is not a member of the type Vec<Token>.
And as to the title of your question: since you're taking an &'a mut Vec<Token>, if you want to return some data borrowed from that then you should type your return as &'a mut Vec<Token>, not Vec<Token> over which your function has no ownership.
(Question was edited significantly, new answer follows:)
Your function must accept vector not by value, but by exclusive reference (spelled &mut rather than &). Then you can return another exclusive reference, one which is borrowed from the exclusive reference that was passed-in, but points directly to the desired receiver of push. The function's type signature would be something like fn g(vector: &mut Vec<T>, n: usize) -> &mut Vec<T>.
Passing by value would cause the function to take ownership of the whole thing - all owned data not returned are dropped, and this is probably not what you want.
Passing by nonexclusive reference is not what you want either, since there is no way to then convert that back to an exclusive reference with which to call push - even if you're back at the callsite where you have ownership of the data from which the reference is borrowed-from.
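Here is one possible sketch of such a function. The Node struct is a made-up stand-in for your structure, and returning an Option (instead of the bare &mut Vec<T> in the signature above) is my own choice so that a missing level yields None rather than a panic.

```rust
// `Node` is a hypothetical stand-in for the asker's structure.
#[derive(Debug, Default)]
struct Node {
    children: Vec<Node>,
}

/// Descends `n` levels, each time through the last element, and returns an
/// exclusive reference to the `children` vector reached.
/// Returns `None` if some level along the way is empty.
fn g(vector: &mut Vec<Node>, n: usize) -> Option<&mut Vec<Node>> {
    let mut current = vector;
    for _ in 0..n {
        // `last_mut` hands out an exclusive reference, unlike `last`.
        current = &mut current.last_mut()?.children;
    }
    Some(current)
}

fn main() {
    // One root element that has one (childless) child.
    let mut vec = vec![Node { children: vec![Node::default()] }];

    // Push into the children of the last child of the last root element.
    if let Some(target) = g(&mut vec, 2) {
        target.push(Node::default());
    }

    // `vec` is still owned here and reflects the change.
    println!("{}", vec[0].children[0].children.len()); // 1
}
```

Because g only borrows the vector, the caller keeps ownership, and the borrow ends as soon as the returned reference is no longer used.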

Erroneous mutable borrow (E0502) when trying to remove and insert into a HashMap

I am a beginner to Rust and tried using a HashMap<u64, u64>. I want to remove an element and insert it with a modified value:
let mut r = HashMap::new();
let mut i = 2;
...
if r.contains_key(&i) {
let v = r.get(&i).unwrap();
r.remove(&i);
r.insert(i, v+1);
}
Now, the borrow checker complains that r is borrowed immutably, then mutably, and then immutably again in the three lines of the if-block.
I don't understand what's going on...I guess since the get, remove and insert methods have r as implicit argument, it is borrowed in the three calls. But why is it a problem that this borrow in the remove call is mutable?
But why is it a problem that this borrow in the remove call is mutable?
The problem is the overlap: Rust allows either any number of immutable borrows or a single mutable borrow, and the two kinds cannot overlap.
The issue here is that v is a reference to the map contents, meaning the existence of v requires borrowing the map until v stops being used. Which thus overlaps with both remove and insert calls, and forbids them.
Now there are various ways to fix this. Since in this specific case you're using u64 which is Copy, you can just dereference and it'll copy the value you got from the map, removing the need for a borrow:
if r.contains_key(&i) {
let v = *r.get(&i).unwrap();
r.remove(&i);
r.insert(i, v+1);
}
this is limited in its flexibility though, as it only works for Copy types[0].
In this specific case it probably doesn't matter that much, because Copy is cheap, but it would still make more sense to use the advanced APIs Rust provides, for safety, for clarity, and because you'll eventually need them for less trivial types.
The simplest is to just use get_mut: where get returns an Option<&V>, get_mut returns an Option<&mut V>, meaning you can... update the value in-place, you don't need to get it out, and you don't need to insert it back in (nor do you need a separate lookup but you already didn't need that really):
if let Some(v) = r.get_mut(&i) {
*v += 1;
}
more than sufficient for your use case.
The second option is the Entry API, and the thing which will ruin every other hashmap API for you forever. I'm not joking, every other language becomes ridiculously frustrating, you may want to avoid clicking on that link (though you will eventually need to learn about it anyway, as it solves real borrowing and efficiency issues).
It doesn't really show its stuff here because your use case is simple and get_mut more than does the job, but anyway, you could write the increment as:
r.entry(i).and_modify(|v| *v+=1);
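Where the Entry API starts to pay off is the "update or insert" case. A small sketch follows; the helper name bump is hypothetical, and note that unlike the snippet above it also inserts a starting value when the key is missing, which goes slightly beyond what the question asked for.

```rust
use std::collections::HashMap;

/// Increments the counter for `key`, starting at 1 when the key is absent.
/// (`bump` is a hypothetical helper name.)
fn bump(r: &mut HashMap<u64, u64>, key: u64) {
    // A single lookup handles both the "present" and the "absent" case.
    r.entry(key).and_modify(|v| *v += 1).or_insert(1);
}

fn main() {
    let mut r = HashMap::new();
    r.insert(2, 10);

    bump(&mut r, 2); // existing key: 10 becomes 11
    bump(&mut r, 7); // missing key: inserted as 1

    println!("{} {}", r[&2], r[&7]); // 11 1
}
```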
Incidentally in most languages (and certainly in Rust as well) when you insert an item in a hashmap, the old value gets evicted if there was one. So the remove call was already redundant and wholly unnecessary.
And pattern-matching an Option (such as that returned by HashMap::get) is generally safer, cleaner, and faster than painstakingly and procedurally doing all the low-level bits.
So even without using advanced APIs, the original code can be simplified to:
if let Some(&v) = r.get(&i) {
r.insert(i, v+1);
}
I'd still recommend the get_mut version over that as it is simpler, avoids the double lookup, and works on non-Copy types, but YMMV.
Also, unlike most languages, Rust's HashMap::insert returns the old value (if any); not a concern here, but it can be useful in some cases.
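A quick sketch of that return value, in case it helps:

```rust
use std::collections::HashMap;

fn main() {
    let mut r: HashMap<u64, u64> = HashMap::new();

    // No previous value under this key, so `insert` returns None.
    assert_eq!(r.insert(2, 10), None);

    // The key exists: the value is replaced and the old one is handed back.
    assert_eq!(r.insert(2, 11), Some(10));

    // Only the new value remains; no explicit `remove` was needed.
    assert_eq!(r.get(&2), Some(&11));
    assert_eq!(r.len(), 1);
}
```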
[0] as well as Clone ones, by explicitly calling .clone(), that may or may not translate to a significant performance impact depending on the type you're cloning.
The problem is that you keep an immutable reference when getting v. Since it is a u64, just clone it (a cheap copy for this type) so that no reference is involved any more:
let v = r.get(&i).unwrap().clone();

Why do immutable references to Copy types in Rust exist?

So I just started learning rust (first few chapters of "the book") and am obviously quite a noob. I finished the ownership-basics chapter (4) and wrote some test programs to make sure I understood everything. I seem to have the basics down but I asked myself why immutable references to copy-types are even possible. I will try to explain my thoughts with examples.
I thought that you might want to store a reference to a copy-type so you can check its value later instead of having a copy of the old value, but this can't be it, since the underlying value can't be changed as long as it is borrowed.
The most basic example of this would be this code:
let mut x = 10; // push i32
let x_ref = &x; // push immutable reference to x
// x = 100; change x which is disallowed since it's borrowed currently
println!("{}", x_ref); // do something with the reference since you want the current value of x
The only reason for this I can currently think of (with my current knowledge) is that they just exist so you can call generic methods which require references (like cmp) with them.
This code demonstrates this:
let x = 10; // push i32
// let ordering = 10.cmp(x); try to compare it but you can't since cmp wants a reference
let ordering = 10.cmp(&x); // this works since it's now a reference
So, is that the only reason you can create immutable references to copy-types?
Disclaimer:
I don't see Just continue reading the book as a valid answer. However I fully understand if you say something like Yes you need those for this and this use-case (optional example), it will be covered in chapter X. I hope you understand what I mean :)
EDIT:
Maybe worth mentioning, I'm a C# programmer and not new to programming itself.
EDIT 2:
I don't know if this is technically a duplicate of this question but I do not fully understand the question and the answer so I hope for a more simple answer understandable by a real noob.
An immutable reference to a Copy-type is still "an immutable reference". The code that gets passed the reference can't change the original value. It can make a (hopefully) trivial copy of that value, but it can still only ever change that copy after doing so.
That is, the original owner of the value is ensured that - while receivers of the reference may decide to make a copy and change that - the state of whatever is referenced can't ever change. If the receiver wants to change the value, it can feel free; nobody else is going to see it, though.
Immutable references to primitives are no different. Primitives are Copy everywhere, so you are probably already used to what "an immutable reference" means semantically for them. For instance:
fn print_the_age(age: &i32) { ... }
That function could make a copy via *age and change it. But the caller will not see that change and it does not make much sense to do so in the first place.
Update due to comment: There is no advantage per se, at least as far as primitives are concerned (larger types may be costly to copy). It does boil down to the semantic relationship between the owner of the i32 and the receiver: "Here is a reference, it is guaranteed to not change while you have that reference, I - the owner - can't change or move or deallocate it, and there is no other thread, including my own, that could possibly do that".
Consider where the reference is coming from: If you receive an &i32, wherever it is coming from can't change and can't deallocate. The i32 may be part of a larger type, which - due to handing out a reference - can't move, change or get de-allocated; the receiver is guaranteed of that. It's hard to say there is an advantage per se in here; it might be advantageous to communicate more detailed type (and lifetime!) relationships this way.
They're very useful, because they can be passed to generic functions that expect a reference:
fn map_vec<T, U>(v: &Vec<T>, f: impl Fn(&T) -> U) -> Vec<U> {...}
If immutable references to Copy types were forbidden, we would need two versions (written here with a hypothetical T: !Copy negative bound, which stable Rust does not support):
fn map_vec_own<T: !Copy, U>(v: &Vec<T>, f: impl Fn(&T) -> U) -> Vec<U> {...}
fn map_vec_copy<T: Copy, U>(v: &Vec<T>, f: impl Fn( T) -> U) -> Vec<U> {...}
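For completeness, here is a sketch of the single generic version with a body filled in. The body is my own guess at what was elided; the point is only that the same function serves a Copy element type (i32) and a non-Copy one (String).

```rust
/// Applies `f` to a reference to each element, whatever the element type.
/// The body is an assumed implementation of the elided `{...}`.
fn map_vec<T, U>(v: &Vec<T>, f: impl Fn(&T) -> U) -> Vec<U> {
    v.iter().map(f).collect()
}

fn main() {
    let numbers = vec![1, 2, 3]; // i32 is Copy
    let words = vec![String::from("a"), String::from("bcd")]; // String is not

    let doubled = map_vec(&numbers, |n| n * 2);
    let lengths = map_vec(&words, |w| w.len());

    println!("{:?} {:?}", doubled, lengths); // [2, 4, 6] [1, 3]
}
```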
Immutable references are, naturally, used to provide access to the referenced data. For instance, you could have loaded a dictionary and have multiple threads reading from it at the same time, each using their own immutable reference. Because the references are immutable those threads will not corrupt that common data.
Using only mutable references, you can't be sure of that so you need to make full copies. Copying data takes time and space, which are always limited. The primary question for performance tends to be if your data fits in CPU cache.
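The dictionary scenario might look roughly like the sketch below. It assumes std::thread::scope (available since Rust 1.63); the function name lookup_all and the sample entries are invented for illustration.

```rust
use std::collections::HashMap;
use std::thread;

/// Looks up each word on its own thread; all threads share one `&HashMap`.
/// (`lookup_all` is a hypothetical helper name.)
fn lookup_all(dict: &HashMap<&str, &str>, words: &[&str]) -> Vec<Option<String>> {
    thread::scope(|s| {
        // Each spawned closure gets its own copy of the shared reference.
        let handles: Vec<_> = words
            .iter()
            .map(|&w| s.spawn(move || dict.get(w).map(|d| d.to_string())))
            .collect();
        // Join in spawn order so results line up with `words`.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let dict = HashMap::from([
        ("rust", "a systems language"),
        ("vec", "a growable array"),
    ]);
    let found = lookup_all(&dict, &["rust", "vec", "missing"]);
    println!("{:?}", found);
}
```

No thread receives a copy of the dictionary itself, only a pointer-sized reference to it, and none of them can modify it.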
I'm guessing you were thinking of "copy" types as ones that fit in the same space as the reference itself, i.e. sizeof(type) <= sizeof(type*). Rust's Copy trait indicates data that could be safely copied, no matter the size. These are orthogonal concepts; for instance, a pointer might not be safely copied without adjusting a reference count, or an array might be copyable but take gigabytes of memory. This is why Rc<T> has the Clone trait, not Copy.
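Both halves of that can be seen in a few lines. This is only an illustrative sketch; the 4096-byte array stands in for "large", since an actual multi-gigabyte array would not fit on the stack.

```rust
use std::rc::Rc;

fn main() {
    // An array of Copy elements is itself Copy, regardless of its size.
    let big = [0u8; 4096];
    let big_copy = big; // bitwise copy; `big` stays usable afterwards
    assert_eq!(big.len(), big_copy.len());

    // Rc has to update its reference count, so it is Clone but not Copy.
    let shared = Rc::new(String::from("dictionary"));
    let also_shared = Rc::clone(&shared);
    assert_eq!(Rc::strong_count(&shared), 2);
    assert_eq!(*also_shared, "dictionary");
}
```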

cannot borrow foo as immutable because it is also borrowed as mutable

I have following (innocent enough) Rust code:
let file = &Path(some_file_name);
let mut buf = [0u8, ..12];
match io::file_reader(file) {
Ok(reader) => reader.read(buf, buf.len()),
Err(msg) => println(msg)
}
The rustc complains that
cannot borrow buf[] as immutable because it is also borrowed as mutable
If changing the corresponding line to:
Ok(reader) => reader.read(buf, 12),
all will work just fine. But it is less satisfactory, since now the length of the buffer is duplicated in the code. Although I vaguely understand why rustc complains, I would still like to argue that rustc should be able to infer that len() is a pure function and has no side effect, so the code is valid. Besides, it is quite a common pattern to read into a buffer that way.
So what is the idiomatic Rust way here?
EDIT: The code was for Rust 0.8. As @pnkfelix pointed out, the Reader.read API has been changed since then. It doesn't need the second parameter any more.
This answer is for my version of rustc:
rustc 0.9-pre (61443dc 2013-12-01)
The current version of the Reader trait has a different interface than the one you listed. Instead of taking both (a slice of) an output buffer and a length, it now just takes (a slice of) an output buffer. It can get the length of the output buffer from the slice, so you don't need to repeat yourself.
The reason Rust is complaining is that it is trying to ensure that you do not have read/write aliasing of memory. It is trying to stop you from passing an immutable-borrow of buf into one context and a mutable-borrow of buf into another context.
When you say len() is a pure function, I take it you mean that it does not write to any mutable state. However, in the general case, it could be reading mutable state. (That is not the case here, since we are dealing with a fixed size buffer. But still, in general one could imagine that we are dealing with some auto-resizing array abstraction.)
So there is an effect, just one that people do not often think about: that of reading.
I suspect the idiomatic way of dealing with the problem you saw (ignoring the fact that the API has changed) would be to avoid having the overlapping borrows of buf, e.g. like so:
Ok(reader) => { let l = buf.len(); reader.read(buf, l) },
This way, you don't repeat yourself; you're just providing two non-overlapping extents where buf is borrowed in different ways.
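For readers on present-day Rust rather than 0.8/0.9, the same idea can be sketched with std::io::Read, whose read method takes only a mutable slice. The helper name read_prefix is made up, and a byte slice stands in for the file so the example is self-contained.

```rust
use std::io::Read;

/// Reads into a fixed 12-byte buffer and returns the bytes actually read.
/// (`read_prefix` is a hypothetical helper name.)
fn read_prefix(mut reader: impl Read) -> std::io::Result<Vec<u8>> {
    let mut buf = [0u8; 12];
    // The slice carries its own length, so there is no second argument
    // and therefore no overlapping borrow of `buf`.
    let n = reader.read(&mut buf)?;
    Ok(buf[..n].to_vec())
}

fn main() -> std::io::Result<()> {
    // A byte slice implements `Read`, standing in for a file here.
    let data = read_prefix(&b"hello, borrow checker"[..])?;
    println!("{}", String::from_utf8_lossy(&data)); // hello, borro
    Ok(())
}
```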

Resources