Why the result of Index::index is dereferenced? - rust

I have some code like this
struct MyStruct {
id: u32
}
fn main() {
let vec: Vec<MyStruct> = vec![MyStruct {
id: 1
}];
let my_struct = vec[0];
}
I thought my_struct must be of reference type, but the compiler proved I was wrong:
error[E0507]: cannot move out of index of `Vec<MyStruct>`
--> src/main.rs:10:21
|
10 | let my_struct = vec[0];
| ^^^^^^ move occurs because value has type `MyStruct`, which does not implement the `Copy` trait
|
help: consider borrowing here
|
10 | let my_struct = &vec[0];
So I read the doc of trait Index and noticed the sentence below:
container[index] is actually syntactic sugar for *container.index(index)
It's counterintuitive. Why rust dereferences it? Does it mean that it's more friendly to types those impl Copy?

There are plenty of other questions asking about this (1 2 3) but none really address the why part.
Rust has made many design decisions work to make ownership clear. For example, you cannot pass a value, v, directly to a function f that expects a reference. It must be f(&v). There are places where this isn't always the case (macros like println! and auto-ref for method calls being prime exceptions), but many parts of the language follow this principle.
The behavior of the index operator is pretty much the same as the normal field access operator .. If you have my_struct: &MyStruct then my_struct.id will yield a u32 and not a &u32. The default behavior is to move fields. You have to introduce a & (i.e. &my_struct.id) to get a reference to the field. So the same is with the index operator. If you want to make it clear that you want only a reference to the element, then you need to introduce a &.

Related

Missing lifetime specifier in returned HashMap<&i32, SimpleCallback>

Consider the following:
use std::collections::HashMap;
use std::vec::Vec;
use crate::core::process_call_backs::SimpleCallback;
pub fn make_adventure_list(adventure_list: Vec<SimpleCallback>) -> HashMap<&i32, SimpleCallback> {
let mut adventures = HashMap::new();
let mut count = 1;
for adventure in adventure_list {
adventures.insert(count, adventure);
count = count + 1;
}
adventures;
}
I get the error:
error[E0106]: missing lifetime specifier
--> core/src/core/create_adventures.rs:5:76
|
5 | pub fn make_adventure_list(adventure_list: Vec<SimpleCallback>) -> HashMap<&i32, SimpleCallback> {
| ^ help: consider giving it an explicit bounded or 'static lifetime: `&'static`
|
= help: this function's return type contains a borrowed value with an elided lifetime, but the lifetime cannot be derived from the arguments
I understand the meaning of this error, but not I'm sure how to implement the fix. Do I need to make adventure_list mutable?
I guess you want to map numbers to callbacks. But what you wrote is mapping references to numbers to callbacks.
Now, references have a lifetime. In your case, you start out with count - which lives only inside your function. Thus, even if you wanted to refer (i.e. have a reference) to it in your result, this would go wrong, as count goes out of scope at the end of the function.
What you almost certainly want, is your result type to be HashMap<i32, SimpleCallback>.
Remark: Since references have lifetimes, Rust suggests to add a static lifetime, meaning that you have references to numbers that are available for the whole run of the program (as opposed to only within your function). But, as said, you almost certainly do not want references to numbers, but simply numbers.

Why do I get "no method named push found for type Option" with a vector of vectors?

I tried to use a String vector inside another vector:
let example: Vec<Vec<String>> = Vec::new();
for _number in 1..10 {
let mut temp: Vec<String> = Vec::new();
example.push(temp);
}
I should have 10 empty String vectors inside my vector, but:
example.get(0).push(String::from("test"));
fails with
error[E0599]: no method named `push` found for type `std::option::Option<&std::vec::Vec<std::string::String>>` in the current scope
--> src/main.rs:9:20
|
9 | example.get(0).push(String::from("test"));
| ^^^^
Why does it fail? Is it even possible to have an vector "inception"?
I highly recommend reading the documentation of types and methods before you use them. At the very least, look at the function's signature. For slice::get:
pub fn get<I>(&self, index: I) -> Option<&<I as SliceIndex<[T]>>::Output>
where
I: SliceIndex<[T]>,
While there's some generics happening here, the important part is that the return type is an Option. An Option<Vec> is not a Vec.
Refer back to The Rust Programming Language's chapter on enums for more information about enums, including Option and Result. If you wish to continue using the semantics of get, you will need to:
Switch to get_mut as you want to mutate the inner vector.
Make example mutable.
Handle the case where the indexed value is missing. Here I use an if let.
let mut example: Vec<_> = std::iter::repeat_with(Vec::new).take(10).collect();
if let Some(v) = example.get_mut(0) {
v.push(String::from("test"));
}
If you want to kill the program if the value is not present at the index, the shortest thing is to use the index syntax []:
example[0].push(String::from("test"));

When would an implementation want to take ownership of self in Rust?

I'm reading through the Rust documentation on lifetimes. I tried something like:
struct S {
x: i8,
}
impl S {
fn fun(self) {}
fn print(&self) {
println!("{}", self.x);
}
}
fn main() {
let s = S { x: 1 };
s.fun();
s.print();
}
I get the following error:
error[E0382]: borrow of moved value: `s`
--> src/main.rs:16:5
|
15 | s.fun();
| - value moved here
16 | s.print();
| ^ value borrowed here after move
|
= note: move occurs because `s` has type `S`, which does not implement the `Copy` trait
This is because the fun(self) method takes ownership of the s instance. This is solved by changing to fun(&self).
I can't see why you would ever want to have a method on an object take control of itself. I can think of only one example, a destructor method, but if you wanted to do dispose of the object then it would be taken care of by the owner of the object anyway (i.e. scope of main in this example).
Why is it possible to write a method that takes ownership of the struct? Is there ever any circumstance where you would want this?
The idiomatic way to refer to a method that "takes control" of self in the Rust standard library documentation is to say that it "consumes" it. If you search for this, you should find some examples:
Option::unwrap_or_default
A lot in the Iterator trait.
As to why: you can try rewriting Iterator::map — you would end up having a lifetime parameter wandering around that would quickly become unmanageable. Why? Because the Map iterator is based upon the previous one, so the borrow checker will enforce that you can only use one of the two at the same time.
Conversion from type A to type B commonly involves functions taking self by value. See the implementors of Into and From traits for concrete examples.

Why are explicit lifetimes needed in Rust?

I was reading the lifetimes chapter of the Rust book, and I came across this example for a named/explicit lifetime:
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
It's quite clear to me that the error being prevented by the compiler is the use-after-free of the reference assigned to x: after the inner scope is done, f and therefore &f.x become invalid, and should not have been assigned to x.
My issue is that the problem could have easily been analyzed away without using the explicit 'a lifetime, for instance by inferring an illegal assignment of a reference to a wider scope (x = &f.x;).
In which cases are explicit lifetimes actually needed to prevent use-after-free (or some other class?) errors?
The other answers all have salient points (fjh's concrete example where an explicit lifetime is needed), but are missing one key thing: why are explicit lifetimes needed when the compiler will tell you you've got them wrong?
This is actually the same question as "why are explicit types needed when the compiler can infer them". A hypothetical example:
fn foo() -> _ {
""
}
Of course, the compiler can see that I'm returning a &'static str, so why does the programmer have to type it?
The main reason is that while the compiler can see what your code does, it doesn't know what your intent was.
Functions are a natural boundary to firewall the effects of changing code. If we were to allow lifetimes to be completely inspected from the code, then an innocent-looking change might affect the lifetimes, which could then cause errors in a function far away. This isn't a hypothetical example. As I understand it, Haskell has this problem when you rely on type inference for top-level functions. Rust nipped that particular problem in the bud.
There is also an efficiency benefit to the compiler — only function signatures need to be parsed in order to verify types and lifetimes. More importantly, it has an efficiency benefit for the programmer. If we didn't have explicit lifetimes, what does this function do:
fn foo(a: &u8, b: &u8) -> &u8
It's impossible to tell without inspecting the source, which would go against a huge number of coding best practices.
by inferring an illegal assignment of a reference to a wider scope
Scopes are lifetimes, essentially. A bit more clearly, a lifetime 'a is a generic lifetime parameter that can be specialized with a specific scope at compile time, based on the call site.
are explicit lifetimes actually needed to prevent [...] errors?
Not at all. Lifetimes are needed to prevent errors, but explicit lifetimes are needed to protect what little sanity programmers have.
Let's have a look at the following example.
fn foo<'a, 'b>(x: &'a u32, y: &'b u32) -> &'a u32 {
x
}
fn main() {
let x = 12;
let z: &u32 = {
let y = 42;
foo(&x, &y)
};
}
Here, the explicit lifetimes are important. This compiles because the result of foo has the same lifetime as its first argument ('a), so it may outlive its second argument. This is expressed by the lifetime names in the signature of foo. If you switched the arguments in the call to foo the compiler would complain that y does not live long enough:
error[E0597]: `y` does not live long enough
--> src/main.rs:10:5
|
9 | foo(&y, &x)
| - borrow occurs here
10 | };
| ^ `y` dropped here while still borrowed
11 | }
| - borrowed value needs to live until here
The lifetime annotation in the following structure:
struct Foo<'a> {
x: &'a i32,
}
specifies that a Foo instance shouldn't outlive the reference it contains (x field).
The example you came across in the Rust book doesn't illustrate this because f and y variables go out of scope at the same time.
A better example would be this:
fn main() {
let f : Foo;
{
let n = 5; // variable that is invalid outside this block
let y = &n;
f = Foo { x: y };
};
println!("{}", f.x);
}
Now, f really outlives the variable pointed to by f.x.
Note that there are no explicit lifetimes in that piece of code, except the structure definition. The compiler is perfectly able to infer lifetimes in main().
In type definitions, however, explicit lifetimes are unavoidable. For example, there is an ambiguity here:
struct RefPair(&u32, &u32);
Should these be different lifetimes or should they be the same? It does matter from the usage perspective, struct RefPair<'a, 'b>(&'a u32, &'b u32) is very different from struct RefPair<'a>(&'a u32, &'a u32).
Now, for simple cases, like the one you provided, the compiler could theoretically elide lifetimes like it does in other places, but such cases are very limited and do not worth extra complexity in the compiler, and this gain in clarity would be at the very least questionable.
If a function receives two references as arguments and returns a reference, then the implementation of the function might sometimes return the first reference and sometimes the second one. It is impossible to predict which reference will be returned for a given call. In this case, it is impossible to infer a lifetime for the returned reference, since each argument reference may refer to a different variable binding with a different lifetime. Explicit lifetimes help to avoid or clarify such a situation.
Likewise, if a structure holds two references (as two member fields) then a member function of the structure may sometimes return the first reference and sometimes the second one. Again explicit lifetimes prevent such ambiguities.
In a few simple situations, there is lifetime elision where the compiler can infer lifetimes.
I've found another great explanation here: http://doc.rust-lang.org/0.12.0/guide-lifetimes.html#returning-references.
In general, it is only possible to return references if they are
derived from a parameter to the procedure. In that case, the pointer
result will always have the same lifetime as one of the parameters;
named lifetimes indicate which parameter that is.
The case from the book is very simple by design. The topic of lifetimes is deemed complex.
The compiler cannot easily infer the lifetime in a function with multiple arguments.
Also, my own optional crate has an OptionBool type with an as_slice method whose signature actually is:
fn as_slice(&self) -> &'static [bool] { ... }
There is absolutely no way the compiler could have figured that one out.
As a newcomer to Rust, my understanding is that explicit lifetimes serve two purposes.
Putting an explicit lifetime annotation on a function restricts the type of code that may appear inside that function. Explicit lifetimes allow the compiler to ensure that your program is doing what you intended.
If you (the compiler) want(s) to check if a piece of code is valid, you (the compiler) will not have to iteratively look inside every function called. It suffices to have a look at the annotations of functions that are directly called by that piece of code. This makes your program much easier to reason about for you (the compiler), and makes compile times managable.
On point 1., Consider the following program written in Python:
import pandas as pd
import numpy as np
def second_row(ar):
return ar[0]
def work(second):
df = pd.DataFrame(data=second)
df.loc[0, 0] = 1
def main():
# .. load data ..
ar = np.array([[0, 0], [0, 0]])
# .. do some work on second row ..
second = second_row(ar)
work(second)
# .. much later ..
print(repr(ar))
if __name__=="__main__":
main()
which will print
array([[1, 0],
[0, 0]])
This type of behaviour always surprises me. What is happening is that df is sharing memory with ar, so when some of the content of df changes in work, that change infects ar as well. However, in some cases this may be exactly what you want, for memory efficiency reasons (no copy). The real problem in this code is that the function second_row is returning the first row instead of the second; good luck debugging that.
Consider instead a similar program written in Rust:
#[derive(Debug)]
struct Array<'a, 'b>(&'a mut [i32], &'b mut [i32]);
impl<'a, 'b> Array<'a, 'b> {
fn second_row(&mut self) -> &mut &'b mut [i32] {
&mut self.0
}
}
fn work(second: &mut [i32]) {
second[0] = 1;
}
fn main() {
// .. load data ..
let ar1 = &mut [0, 0][..];
let ar2 = &mut [0, 0][..];
let mut ar = Array(ar1, ar2);
// .. do some work on second row ..
{
let second = ar.second_row();
work(second);
}
// .. much later ..
println!("{:?}", ar);
}
Compiling this, you get
error[E0308]: mismatched types
--> src/main.rs:6:13
|
6 | &mut self.0
| ^^^^^^^^^^^ lifetime mismatch
|
= note: expected type `&mut &'b mut [i32]`
found type `&mut &'a mut [i32]`
note: the lifetime 'b as defined on the impl at 4:5...
--> src/main.rs:4:5
|
4 | impl<'a, 'b> Array<'a, 'b> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...does not necessarily outlive the lifetime 'a as defined on the impl at 4:5
--> src/main.rs:4:5
|
4 | impl<'a, 'b> Array<'a, 'b> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
In fact you get two errors, there is also one with the roles of 'a and 'b interchanged. Looking at the annotation of second_row, we find that the output should be &mut &'b mut [i32], i.e., the output is supposed to be a reference to a reference with lifetime 'b (the lifetime of the second row of Array). However, because we are returning the first row (which has lifetime 'a), the compiler complains about lifetime mismatch. At the right place. At the right time. Debugging is a breeze.
The reason why your example does not work is simply because Rust only has local lifetime and type inference. What you are suggesting demands global inference. Whenever you have a reference whose lifetime cannot be elided, it must be annotated.
I think of a lifetime annotation as a contract about a given ref been valid in the receiving scope only while it remains valid in the source scope. Declaring more references in the same lifetime kind of merges the scopes, meaning that all the source refs have to satisfy this contract.
Such annotation allow the compiler to check for the fulfillment of the contract.

How to retrieve a user-defined type from a for loop?

I defined an Attribute type and I have a Vec<Attribute> that I am looping over to retrieve the "best" one. This was similar to my first attempt:
#[derive(Debug)]
struct Attribute;
impl Attribute {
fn new() -> Self {
Self
}
}
fn example(attrs: Vec<Attribute>, root: &mut Attribute) {
let mut best_attr = &Attribute::new();
for a in attrs.iter() {
if is_best(a) {
best_attr = a;
}
}
*root = *best_attr;
}
// simplified for example
fn is_best(_: &Attribute) -> bool {
true
}
I had the following compile error:
error[E0507]: cannot move out of borrowed content
--> src/lib.rs:17:13
|
17 | *root = *best_attr;
| ^^^^^^^^^^ cannot move out of borrowed content
After some searching for a solution, I resolved the error by doing the following:
Adding a #[derive(Clone)] attribute to my Attribute struct
Replacing the final statement with *root = best_attr.clone();
I don't fully understand why this works, and I feel like this is a rough solution to the problem I was having. How does this resolve the error, and is this the correct way to solve this problem?
You are experiencing the basis of the Rust memory model:
every object can (and must!) be owned by only exactly one other object
most types are never implicitly copied and always moved (there are some exceptions: types that implement Copy)
Take this code for example:
let x = String::new();
let y = x;
println!("{}", x);
it generates the error:
error[E0382]: borrow of moved value: `x`
--> src/main.rs:4:20
|
3 | let y = x;
| - value moved here
4 | println!("{}", x);
| ^ value borrowed here after move
|
= note: move occurs because `x` has type `std::string::String`, which does not implement the `Copy` trait
x, of type String, is not implicitly copyable, and thus has been moved into y. x cannot be used any longer.
In your code, when you write *root = *best_attr, you are first dereferencing the reference best_attr, then assigning the dereferenced value to *root. Your Attribute type is not Copy, thus this assignment should be a move.
Then, the compiler complains:
cannot move out of borrowed content
Indeed, best_attr is an immutable reference, which does not allow you to take ownership of the value behind it (it doesn't even allow modifying it). Allowing such a move would put the object owning the value behind the reference in an undefined state, which is exactly what Rust aims to prevent.
In this case, your best option is indeed to create a new object with the same value as the first one, which is exactly what the trait Clone is made for.
#[derive(Clone)] allows you to mark your structs as Clone-able, as long as all of their fields are Clone. In more complex cases, you'll have to implement the trait by hand.

Resources