How lifetime parameters and borrowing interact in function signatures - rust

Let's say I have a function with the following signature in Rust:
fn f<'a>(x: &'a i32) -> &'a i32;
Let's say I do the following then:
let x = 0;
let y = f(&x);
In that case, the Rust borrow checker considers y to borrow x. Why? What is the reason on a deeper level than "because you used the same lifetime parameter in the parameter type and the return type"?

The function signature
fn f<'a>(x: &'a i32) -> &'a i32;
means that the reference returned by f is valid for at most the lifetime 'a of the borrow passed in as x (typically because it points at the same data), hence it can't outlive what x refers to. For example, this won't work:
// Compile error
let y = {
    let x = 0;
    f(&x)
    // x is dropped here
};
// Here y still "exists", but x doesn't (y outlives x)
To your specific question:
Let's say I do the following then:
let x = 0;
let y = f(&x);
In that case, the Rust borrow checker considers y to borrow x. Why?
The answer is because the function signature of f tells it so. To give you an example, suppose that we change the signature to this:
fn f<'a, 'b>(x: &'a i32, z: &'b i32) -> &'a i32;
Then we call f like this:
let x = 0;
let z = 1;
let y = f(&x, &z);
In the code above, y borrows x, but not z. That's because the return type of f uses the lifetime 'a, which is the lifetime of the borrow of x, and never mentions 'b.
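To see that only the first argument is tied to the result, here is a minimal sketch. The names first and outlives_second are made up for illustration, and the body simply returns x, which is one possible implementation consistent with the signature above:

```rust
// Hypothetical implementation of the two-lifetime signature:
// the result is tied to 'a only, so it may only come from x.
fn first<'a, 'b>(x: &'a i32, _z: &'b i32) -> &'a i32 {
    x
}

// y is still usable after z has gone out of scope, because the
// type of y mentions 'a (the borrow of x) and not 'b (the borrow of z).
fn outlives_second() -> i32 {
    let x = 0;
    let y;
    {
        let z = 1;
        y = first(&x, &z);
    } // z is dropped here; y does not borrow it
    *y
}
```

If the return type were &'b i32 instead, the same code should be rejected, since y would then borrow z.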

Using the syntax from the Rustonomicon, we can elaborate the second snippet
let x: i32 = 0;
'a: { // definition of a lifetime (not real syntax)
    let y: &'a i32 = f::<'a>(&'a x) // &'a x is also not real, though interestingly rustc recognizes it
}
The lifetime 'a is introduced by the compiler because you wrote a & and it needs to know how long that borrow will last (note that you cannot manually specify lifetimes for borrows). The type of the function also means that y has 'a in its type.
The compiler needs to figure out where 'a starts and ends. The rules are that 'a must start right before you perform the borrow, that x cannot be moved during 'a (since it's borrowed), and that y cannot be used after 'a (because its type says so). If the compiler can pick start and end points for 'a so that the rules hold, then it compiles. If it can't, there's an error.
There is no direct relationship between x and y; it goes through the lifetime 'a that they both interact with. E.g. you can't move x and then read *y because 'a must end before you move x and after you read *y, but such a time does not exist. TL;DR: think about borrow checking as lifetime inference.
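As a rough illustration of "pick start and end points for 'a", here is a sketch that uses assignment to x instead of a move (an i32 is Copy, so a move would not conflict; assigning while borrowed is the analogous conflict). The body of f and the name read_then_modify are assumptions for the example:

```rust
// One possible body for the signature from the question.
fn f<'a>(x: &'a i32) -> &'a i32 {
    x
}

fn read_then_modify() -> (i32, i32) {
    let mut x = 0;
    let y = f(&x); // 'a starts at this borrow
    let seen = *y; // last use of y, so 'a can end here
    x += 1;        // accepted: no borrow of x is live any more
    // Reading *y at this point would force 'a to extend past `x += 1`,
    // and no valid choice of 'a would exist, so the code would be rejected.
    (seen, x)
}
```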

Related

Writing generic implementation of sum on an iterator in Rust

I have been learning Rust, coming from a Swift, C and C++ background. I feel like I have a basic understanding of ownership, borrowing and traits. To exercise a bit, I decided to implement a sum function on a generic slice [T] where T has a default value and can be added to itself.
This is how far I got:
use std::ops::Add;
trait Summable {
    type Result;
    fn sum(&self) -> Self::Result;
}
impl<T> Summable for [T]
where
    T: Add<Output = T> + Default,
{
    type Result = T;
    fn sum(&self) -> T {
        let x = T::default();
        self.iter().fold(x, |a, b| a + b)
    }
}
The compiler complains with "expected type parameter T, found &T" for a + b.
I understand why the error happens, but not exactly how to fix it. Yes, the type of x is T. It cannot be &T because, if nothing else, if the slice is empty, that's the value that is returned and I cannot return a reference to something created inside the function. Plus, the default function returns a new value that the code inside the function owns. Makes sense. And yes, b should be a shared reference to the values in the slice since I don't want to consume them (not T) and I don't want to mutate them (not &mut T).
But that means I need to add T to &T, and return a T because I am returning a new value (the sum) which will be owned by the caller. How?
PS: Yes, I know this function exists already. This is a learning exercise.
The std::ops::Add trait has an optional Rhs type parameter that defaults to Self:
pub trait Add<Rhs = Self> {
    type Output;
    fn add(self, rhs: Rhs) -> Self::Output;
}
Because you've omitted the Rhs type parameter from the T: Add<Output = T> bound, it defaults to T: hence to your a you can add a T, but not an &T.
Either specify that T: for<'a> Add<&'a T, Output = T>; or else somehow obtain an owned T from b, e.g. via T: Copy or T: Clone.
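Both options could look roughly like the sketch below. The trait is the one from the question with only the bound changed; sum_cloned is a hypothetical free function added here to show the Clone route:

```rust
use std::ops::Add;

trait Summable {
    type Result;
    fn sum(&self) -> Self::Result;
}

// Option 1: require that a T can have a &T added to it, for any lifetime.
impl<T> Summable for [T]
where
    T: Default + for<'a> Add<&'a T, Output = T>,
{
    type Result = T;
    fn sum(&self) -> T {
        // acc: T, item: &T, so this uses the Add<&T> impl.
        self.iter().fold(T::default(), |acc, item| acc + item)
    }
}

// Option 2 (hypothetical helper): keep Add<Output = T> and clone each element.
fn sum_cloned<T>(items: &[T]) -> T
where
    T: Default + Clone + Add<Output = T>,
{
    items.iter().fold(T::default(), |acc, item| acc + item.clone())
}
```

The first form only works for types that implement Add<&T>, which the primitive numeric types do; the second works for any cloneable T at the cost of a clone per element.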

Why can I not call FnMut twice in a line?

Taking example snippets from here: the following doesn't compile
fn foobar<F>(mut f: F)
where
    F: FnMut(i32) -> i32,
{
    println!("{}", f(f(2)));
    // error: cannot borrow `f` as mutable more than once at a time
}
fn main() {
    foobar(|x| x * 2);
}
but this does
fn foobar<F>(mut f: F)
where
    F: FnMut(i32) -> i32,
{
    let tmp = f(2);
    println!("{}", f(tmp));
}
fn main() {
    foobar(|x| x * 2);
}
I don't understand why the first snippet is illegal: it's effectively the same as the second one, just written more concisely. More specifically, why must f(f(2)) mutably borrow f twice? It can simply borrow the inner f to compute the value of f(2), and then borrow the outer f and apply it to the value.
More specifically, why must f(f(2)) mutably borrow f twice?
The borrows here happen in the order of expression evaluation, and expression evaluation is always left-to-right, even when the expressions in question are trivial variable accesses. f(f(2)) is made up of two subexpressions, f and f(2), which are evaluated as follows:
1. Evaluate the function value, f (and borrow it as &mut because we're calling a FnMut).
2. Evaluate the argument, f(2).
   a. Evaluate the function value, f; error because it's already borrowed.
   b. Evaluate the argument, 2.
   c. Call the borrow of f with the argument, 2. This is the result of f(2).
3. Call the borrow of f with the argument, the result of evaluating f(2). This is the result of f(f(2)).
The borrow checker could soundly accept this case, but doing so would require it to recognize that the first borrow hasn't been used yet, which it does not currently do for this kind of call.
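The usual workaround is the one from the question: finish the inner call before starting the outer one, so the two mutable borrows never overlap. A small sketch, where apply_twice is a made-up name and signature:

```rust
// Hypothetical helper: calls an FnMut on its own result.
fn apply_twice<F>(mut f: F, start: i32) -> i32
where
    F: FnMut(i32) -> i32,
{
    // First borrow of f begins and ends within this statement.
    let inner = f(start);
    // Second borrow of f; the first one is no longer live.
    f(inner)
}
```

Since the closure is FnMut, it may carry mutable state (for example a call counter), and that state is updated once per call in the order the calls are written.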

Can I simplify tuple expression that uses a double reference?

I have an expression that creates variables that are double references, only to have to dereference the variable to use it in a call. I suspect there is a simpler syntax and way to call my function.
I have two types A and B, neither movable, but both cloneable. I am calling a function with a Vec of tuples of A and B and need to index the Vec to get out the tuple, destructure the values from the tuple into local variables, and use one value immutably but the other mutably in another function call.
pub fn my_func(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize; // Would be a loop index normally.
    let (a, b) = &mut v[i]; // This makes b a double reference.
    x.do_something(*b); // This expects &mut B as parameter.
}
How can I change the signature to my_func, the indexing of v and the destructuring into a and b in a consistent way to simplify things? Can I get away with fewer ampersands and muts and derefs? Note that I do not need a to be mutable, just b.
Ignoring &mut C since it is required, if you count one point for each ampersand &, one point for each mut, and one point for each deref *, then you get 8 points. A solution that scores fewer "points" is a winner.
I don't think there is any way to simplify the function's signature. You could replace Vec<T> with [T], provided you don't need to push or pop elements from the vector, but this change doesn't affect the "score" per your definition.
Both a and b turn into double references because of match ergonomics.
The pattern on the left side of the let doesn't have exactly the same "shape" as the expression on the right, so the compiler "fixes" it for you. The type of the expression is &mut (&A, &mut B). The pattern doesn't match the outer &mut, so the compiler pushes it inside the tuple, giving (&mut &A, &mut &mut B). Now the shapes match: the outer type is a 2-tuple, so a is &mut &A and b is &mut &mut B.
I've come up with a few variations on your function:
pub struct A;
pub struct B;
pub struct C;
impl C {
    fn do_something(&mut self, b: &mut B) {}
}
pub fn my_func_v1(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize;
    let (a, b) = &mut v[i];
    x.do_something(b);
}
pub fn my_func_v2(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize;
    let &mut (a, &mut ref mut b) = &mut v[i];
    x.do_something(b);
}
pub fn my_func_v2a(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize;
    let (a, &mut ref mut b) = v[i];
    x.do_something(b);
}
pub fn my_func_v3(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize;
    let (a, b) = &mut v[i];
    let (a, b) = (*a, &mut **b);
    // Or:
    //let a = *a;
    //let b = &mut **b;
    x.do_something(b);
}
pub fn my_func_v4(v: &mut Vec<(&A, &mut B)>, x: &mut C) {
    let i = 0_usize;
    let e = &mut v[i];
    let (a, b) = (e.0, &mut *e.1);
    // Or:
    //let a = e.0;
    //let b = &mut *e.1;
    x.do_something(b);
}
v1 is identical, except I wrote b instead of *b. The compiler sees an expression of type &mut &mut B and a parameter of type &mut B, and will transparently dereference the outer reference. This might not work if the parameter type is generic (here, it's the nongeneric &mut B).
v2 is the "direct" way to avoid making a and b double references. As you can see, it's not very pretty. First, I added &mut in front of the tuple, in order to match the type of the expression on the right, to prevent match ergonomics from kicking in. Next, we can't just write b because the compiler interprets this pattern as a move/copy, but we can't move out of a mutable reference and &mut T is not Copy. &mut ref mut b is the pattern version of a reborrow.
v2a is like v2, but with the &mut removed on both sides (thanks loganfsmyth for the reminder!). v[i] is of type (&A, &mut B), so it cannot be moved, but it's an lvalue, so if we re-reference parts of it that cannot be moved/copied, then the whole thing won't be moved at all.
Remember that in patterns, &mut deconstructs a reference, while ref mut constructs a reference. Now, &mut ref mut might seem like a noop, but it's not. A double mutable reference &mut &mut T actually has two distinct lifetimes; let's name them 'a for the inner lifetime and 'b for the outer lifetime, giving &'b mut &'a mut T ('b is shorter than 'a). When you dereference such a value (either with the * operator or with a &mut pattern), the output is &'b mut T, not &'a mut T. If it were &'a mut T, then you could end up with more than one mutable reference to the same memory location. b yields a &'a mut T, while &mut ref mut b yields a &'b mut T.
v3 uses the dereferencing operator instead of &mut patterns, which is hopefully easier to understand. We need to explicitly reborrow (&mut **b), unfortunately; *b is interpreted as a move out of a mutable reference, again.
When b is &mut &mut B and we pass either b or *b to do_something, neither of these two are actually "correct". The correct expression is &mut **b. However, the compiler automatically references and dereferences arguments (including the receiver) in function/method calls, but not in other contexts (such as the initializer for a local variable).
v4 saves a couple * by relying on the auto-deref from using the . operator.
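To make the difference between the implicit and the explicit reborrow concrete, here is a small self-contained sketch. Counter, bump and bump_first are hypothetical stand-ins for B, do_something and my_func:

```rust
// Hypothetical stand-in for B.
struct Counter(u32);

// Hypothetical stand-in for do_something: expects a plain &mut Counter.
fn bump(c: &mut Counter) {
    c.0 += 1;
}

fn bump_first(v: &mut [(&str, &mut Counter)]) -> u32 {
    // Match ergonomics: c has type &mut &mut Counter here.
    let (_name, c) = &mut v[0];
    // In argument position the compiler dereferences and reborrows for us.
    bump(c);
    // Outside a call, the reborrow has to be spelled out.
    let inner: &mut Counter = &mut **c;
    bump(inner);
    // Field access through the double reference relies on auto-deref.
    c.0
}
```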
One option is to explicitly declare the b binding to be a reference into v[i] directly. This can be done with
let (a, ref mut b) = v[i];
x.do_something(b);
For simplifying the signature, you're pretty limited, but one place to start would be
pub fn my_func(v: &mut [(&A, &mut B)], x: &mut C) {
assuming your function doesn't actually try to add/remove items in the vector.

Returning closures, but the type cannot be inferred

While learning Rust closures, I tried to return a function, as one would in Java:
fn equal_5<T>() -> T
where
    T: Fn(u32) -> bool,
{
    let x: u32 = 5;
    |z| z == x
}
But when I use it:
let compare = equal_5();
println!("eq {}", compare(6));
I get a build error:
11 |     let compare = equal_5();
   |         ------- consider giving `compare` a type
12 |     println!("eq {}", compare(6));
   |                       ^^^^^^^^^^ cannot infer type
   |
   = note: type must be known at this point
See: https://doc.rust-lang.org/stable/rust-by-example/trait/impl_trait.html
Currently T simply describes a type, which in this case implements the Fn trait. In other words, T isn't a concrete type: as a generic parameter it is chosen by the caller, and nothing at the call site tells the compiler what it should be. In fact, with closures it's impossible to write out a concrete type, because each closure has its own unique, anonymous type (even if two closures are exactly the same, they have different types).
To get around directly declaring the type of the closure (which is impossible) we can use the impl keyword. What the impl keyword does is convert our description of a type (trait bounds) into an invisible concrete type which fits those bounds.
So this works:
fn equal_5() -> impl Fn(u32) -> bool {
    let x: u32 = 5;
    move |z| z == x
}
let compare = equal_5();
println!("eq {}", compare(6));
One thing to note is that we can also do this dynamically, using a Box and a dyn trait object. So this also works, but it incurs the costs associated with heap allocation and dynamic dispatch.
fn equal_5() -> Box<dyn Fn(u32) -> bool> {
    let x: u32 = 5;
    Box::new(move |z| z == x)
}
let compare = equal_5();
println!("eq {}", compare(6));
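One case where the boxed form is actually needed is when the function may return one of several different closures, since each closure has its own type and impl Trait stands for a single one. A sketch with made-up helper names (equal_to and comparator are not from the question):

```rust
// Hypothetical generalisation of equal_5: one closure type, so impl Trait works.
fn equal_to(n: u32) -> impl Fn(u32) -> bool {
    move |z| z == n
}

// Two different closures behind one return type: each branch has its own
// anonymous type, so they are boxed as trait objects.
fn comparator(strict: bool, n: u32) -> Box<dyn Fn(u32) -> bool> {
    if strict {
        Box::new(move |z| z == n)
    } else {
        Box::new(move |z| z >= n)
    }
}
```

Both results are called the same way, e.g. equal_to(5)(6) or comparator(false, 5)(6).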
The compiler seems to be complaining that it's expecting a type parameter but finds a closure instead. It knows the closure's type and doesn't need a type parameter, but that type is anonymous and cannot be written in the signature, so you can either use impl or a Box. The closure will also need to use move in order to move the data stored in x into the closure itself, or else it won't be accessible after equal_5() returns, and you'll get a compiler error that x doesn't live long enough.
fn equal_5() -> impl Fn(u32) -> bool {
    let x: u32 = 5;
    move |z| z == x
}
or
fn equal_5() -> Box<dyn Fn(u32) -> bool> {
    let x: u32 = 5;
    Box::new(move |z| z == x)
}

What is the difference between "eq()" and "=="?

This is what the std says:
pub trait PartialEq<Rhs: ?Sized = Self> {
    /// This method tests for `self` and `other` values to be equal, and is used
    /// by `==`.
    #[must_use]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn eq(&self, other: &Rhs) -> bool;
    /// This method tests for `!=`.
    #[inline]
    #[must_use]
    #[stable(feature = "rust1", since = "1.0.0")]
    fn ne(&self, other: &Rhs) -> bool {
        !self.eq(other)
    }
}
And the link: https://doc.rust-lang.org/src/core/cmp.rs.html#207
This is my code:
fn main() {
    let a = 1;
    let b = &a;
    println!("{}", a==b);
}
and the compiler told me:
error[E0277]: can't compare `{integer}` with `&{integer}`
 --> src\main.rs:4:21
  |
4 |     println!("{}", a==b);
  |                     ^^ no implementation for `{integer} == &{integer}`
  |
  = help: the trait `PartialEq<&{integer}>` is not implemented for `{integer}`
But when I used eq(), it compiled:
fn main() {
    let a = 1;
    let b = &a;
    println!("{}", a.eq(b));
}
It's actually quite simple, but it requires a bit of knowledge. The expression a == b is syntactic sugar for PartialEq::eq(&a, &b) (otherwise, we'd be moving a and b by trying to test if they're equal if we're dealing with non-Copy types).
In our case, the function PartialEq::eq needs to take two arguments, both of which are of type &i32. We see that a : i32 and b : &i32. Thus, &b will have type &&i32, not &i32.
It makes sense that we'd get a type error by trying to compare two things with different types. a has type i32 and b has type &i32, so it makes sense that no matter how the compiler secretly implements a == b, we might get a type error for trying to do it.
On the other hand, in the case where a : i32, the expression a.eq(b) is syntactic sugar for PartialEq::eq(&a, b). There's a subtle difference here - there's no &b. In this case, both &a and b have type &i32, so this is totally fine.
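In practice there are a few ways to make the two sides line up. The sketch below (compare_all is a made-up name) shows three that should all compile when a: i32 and b: &i32:

```rust
// Hypothetical helper comparing an i32 with an &i32 in three ways.
fn compare_all(a: i32, b: &i32) -> (bool, bool, bool) {
    // Dereference b so both operands are i32.
    let by_deref = a == *b;
    // Or reference a so both operands are &i32.
    let by_ref = &a == b;
    // Or call the method: the receiver a is auto-referenced to &a.
    let by_method = a.eq(b);
    (by_deref, by_ref, by_method)
}
```

All three end up comparing the underlying integers, so they give the same answer.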
The difference between a.eq(b) and a == b is that the dot operator performs autoref/autoderef on the receiver for call-by-reference methods.
So when you write a.eq(b), the compiler looks at the PartialEq::eq(&self, other: &Rhs) signature, sees the &self receiver, and takes a reference to a for you.
When you write a == b, it desugars to PartialEq::eq(&a, &b) with no further adjustment; since a: i32 and b: &i32 in your case, this requires i32: PartialEq<&i32>, which does not exist, hence the error no implementation for `{integer} == &{integer}`.
But why doesn't it do the same for operators? See:
Tracking issue: Allow autoderef and autoref in operators (experiment) #44762
Related information:
What are Rust's exact auto-dereferencing rules?
