Rust Lifetime Book confusion - rust

I am very confused about lifetimes, in particular an example in the Rust book:
https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html
Says that:
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
Needs a lifetime specifier so the compiler knows the lifetime? Doesn't the compiler already know the lifetime? x and y are both parameters to the function, so x and y will last for the whole duration of the function so why would we need a lifetime specifier here? I don't think x and y will ever run into lifetime issues based on the code as is.

Related

vector of string slices goes out of scope but original string remains, why is checker saying there is an error?

Beginner at Rust here. I understand why the code below has an error: test(x) creates y, then returns a value that references the &str owned by y. y is destroyed as it goes out of scope, so it can't do that.
Here's my issue: the &str owned by y is actually a slice of x, which has NOT gone out of scope yet, so technically the reference should still work.
enum TestThing<'a> {
    Blah(&'a str)
}
fn test(x: &str) -> Vec<TestThing> {
    let y = x.split(" ").collect::<Vec<&str>>();
    parse(&y)
}
fn parse<'a>(x: &'a Vec<&str>) -> Vec<TestThing<'a>> {
    let mut result: Vec<TestThing> = vec![];
    for v in x {
        result.push(TestThing::Blah(v));
    }
    result
}
Is the checker just being over-zealous here? Is there a method around this? Am I missing something? Is this just something to do with split? I also tried cloning v, and that didn't work either.
Move the lifetime from the outer reference to the inner one: change x: &'a Vec<&str> to x: &Vec<&'a str>. The returned values borrow from the original string, not from the Vec, so it is the string slices that need the lifetime 'a.
P.S. Using a slice (&[&'a str]) would be better, since it's smaller and more flexible; see "Why is it discouraged to accept a reference to a String (&String), Vec (&Vec), or Box (&Box) as a function argument?". Some kind of impl Iterator or impl IntoIterator would be even more flexible.
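A minimal sketch of the suggested fix, using the slice form of the parameter. The derive on TestThing and the `for &v in x` pattern are additions for illustration and are not part of the original question:

```rust
// Debug and PartialEq are derived only so the result can be compared and printed.
#[derive(Debug, PartialEq)]
enum TestThing<'a> {
    Blah(&'a str),
}

fn test(x: &str) -> Vec<TestThing<'_>> {
    // `y` is a temporary Vec, but the slices inside it borrow from `x`.
    let y = x.split(" ").collect::<Vec<&str>>();
    parse(&y)
}

// The lifetime 'a is attached to the inner string slices, not to the outer
// borrow of the slice, so the result does not depend on how long `x` is borrowed.
fn parse<'a>(x: &[&'a str]) -> Vec<TestThing<'a>> {
    let mut result = Vec::new();
    // `&v` copies each `&'a str` out of the slice.
    for &v in x {
        result.push(TestThing::Blah(v));
    }
    result
}
```

With this signature, the Vec returned from test is tied only to the string passed in, so dropping y at the end of test is no longer a problem.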

Why can't the Rust compiler infer that one argument outlives another?

I have the following code:
struct Solver<'a> {
    guesses: Vec<&'a str>,
}
impl<'a> Solver<'a> {
    fn register_guess(&mut self, guess: &'a str) {
        self.guesses.push(guess);
    }
}
fn foo(mut solver: Solver, guess: &str) {
    solver.register_guess(guess)
}
It doesn't compile:
   |
11 | fn foo(mut solver: Solver, guess: &str) {
   |        ----------                 - let's call the lifetime of this reference `'1`
   |        |
   |        has type `Solver<'2>`
12 |     solver.register_guess(guess)
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ argument requires that `'1` must outlive `'2`
The error message says that the argument guess must outlive solver. It's plainly obvious to me that that's true: the lifetime of solver ends at the end of the function, and the lifetime of guess doesn't. This seems like something the compiler should be able to infer, and compile without error.
Why isn't that the case? Does this code actually have some way for solver to outlive guess? Or is it just that the compiler doesn't try to do this kind of inference at all?
I know how to fix it --- change the function to fn foo<'a>(mut solver: Solver<'a>, guess: &'a str) --- but I'm asking why I should have to do that.
While solver itself can't outlive guess, the lifetime it refers to very well could. For example, imagine invoking foo() with a Solver<'static>. That kind of solver would expect guess to be &'static str and might store the data referred to by guess in a global variable. (Remember that the compiler doesn't consider what register_guess() does while borrow-checking foo(), it just considers its signature.)
More generally, Solver<'a> might contain references to 'a data that outlives solver itself. Nothing stops register_guess() from storing the contents of guess inside such references. If guess isn't guaranteed to live at least as long as 'a, then foo() is simply unsound. For example, take this alternative definition of Solver:
struct Solver<'a> {
    guesses: &'a mut Vec<&'a str>,
}
With unchanged signature of register_guess(), foo() would allow unsound code like this:
fn main() {
    let mut guesses = vec![];
    let solver = Solver { guesses: &mut guesses };
    {
        let guess = "foo".to_string();
        // stores temporary "foo" to guesses, which outlives it
        foo(solver, guess.as_str());
    }
    println!("{}", guesses[0]); // UB: use after free
}
This error comes from Rust's rules of lifetime elision. One of these rules states that:
Each elided lifetime in the parameters becomes a distinct lifetime parameter
Rust conservatively assumes that each unspecified lifetime is different. If you want some lifetimes to be equal, you must specify it explicitly. Your problem is similar to a simple function that takes two string slices and returns the longer one. You must write the signature of such a function as fn longer<'a>(x: &'a str, y: &'a str) -> &'a str, or the compiler will reject it as well.
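As a sketch of the fix mentioned in the question, naming the lifetime ties guess to the lifetime stored inside Solver. Returning the solver from foo is an assumption added here so the result can be inspected; the original foo returns nothing:

```rust
struct Solver<'a> {
    guesses: Vec<&'a str>,
}

impl<'a> Solver<'a> {
    fn register_guess(&mut self, guess: &'a str) {
        self.guesses.push(guess);
    }
}

// Both parameters now mention the same 'a, so the compiler can see that
// `guess` lives at least as long as the references the solver may store.
// Returning the solver is an illustrative change, not part of the question.
fn foo<'a>(mut solver: Solver<'a>, guess: &'a str) -> Solver<'a> {
    solver.register_guess(guess);
    solver
}
```

A caller can then pass a guess borrowed from a local String, as long as that String outlives every use of the solver.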

What does "smaller" mean for multiple references that share a lifetime specifier?

The Rust Programming Language says:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
The function signature now tells Rust that for some lifetime 'a, the
function takes two parameters, both of which are string slices that
live at least as long as lifetime 'a. The function signature also
tells Rust that the string slice returned from the function will live
at least as long as lifetime 'a. In practice, it means that the
lifetime of the reference returned by the longest function is the
same as the smaller of the lifetimes of the references passed in.
These relationships are what we want Rust to use when analyzing this
code.
I don't get why it says:
In practice, it means that the lifetime of the reference returned by
the longest function is the same as the smaller of the lifetimes of
the references passed in.
Note the word "smaller". For both parameters and returned values, we specified 'a which is the same. Why does the book say "smaller"? If that was the case, we would have different specifiers ('a, 'b).
It's important to note that in the example, whatever is passed as x and y does not have to have identical lifetimes.
Let's rework the example to use &i32 which makes for easier demonstrations:
fn biggest<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x > y {
        x
    } else {
        y
    }
}
Now, given the following example:
fn main() {
    let bigger;
    let a = 64;
    {
        let b = 32;
        bigger = biggest(&a, &b);
    }
    dbg!(bigger);
}
We have 3 different lifetimes:
bigger which lives until the end of main
a which lives until the end of main
b which lives until the end of its block
Now let's take the description apart:
the lifetime of the reference returned by the [...] function
In our case this would be bigger, which lives until the end of main
smaller of the lifetimes of the references passed in
We passed in a, which gets dropped at the end of main, and b, which gets dropped at the end of its block. Since b gets dropped first, it is the "smaller" lifetime.
And sure enough, the compiler yells at us:
error[E0597]: `b` does not live long enough
  --> src/main.rs:7:30
   |
7  |         bigger = biggest(&a, &b);
   |                              ^^ borrowed value does not live long enough
8  |     }
   |     - `b` dropped here while still borrowed
9  |
10 |     dbg!(bigger);
   |          ------ borrow later used here
But if we move bigger inside the block, and hence reduce its lifetime, the code compiles:
fn main() {
    let a = 64;
    {
        let b = 32;
        let bigger = biggest(&a, &b);
        dbg!(bigger);
    }
    println!("This works!");
}
Now a and b still have different lifetimes, but the code compiles since the smaller of both lifetimes (b's) is the same as bigger's lifetime.
Did this help illustrate what "smaller" lifetime means?

rust lifetimes and borrow checker

I'm reading through the "Learn rust" tutorial, and I'm trying to understand lifetimes. Chapter 10-3 has the following non-working example:
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
The paragraph below explains the error as :
When we’re defining this function, we don’t know the concrete values that will be passed into this function, so we don’t know whether the if case or the else case will execute.
However, if I change the block of code to do something else; say, print x and return y; so that we know what is being returned each time, the same error occurs. Why?
fn longest(x: &str, y: &str) -> &str {
    println!("{}", x);
    y
}
The book also says:
The borrow checker can’t determine this either, because it doesn’t know how the lifetimes of x and y relate to the lifetime of the return value.
My doubts are:
Is the borrow checker capable of tracking lifetimes across functions? If so, can you please provide an example?
I don't understand the error. x and y are references passed into longest, so the compiler should know that its owner is elsewhere(and that its lifetime would continue beyond longest). When the compiler sees that the return values are either x or y, why is there a confusion on lifetimes?
Think of the function as a black box. When checking a call, neither you nor the compiler looks at what happens inside; only the signature counts. You may say that the compiler "knows", but that's not really true. Imagine that it returns x or y based on the result of a remote HTTP call. How can it know in advance?
Yet it needs to provide some guarantee that the returned reference is safe to use. That works by forcing you (i.e. the developer) to explicitly specify the relationships between the input parameters and the returned value.
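To illustrate that only the signature matters, here is a sketch of the variant from the question that prints x and returns y. Annotating only y is enough, because the signature then states that the result borrows from y alone. The function name and the demo helper are hypothetical names chosen for this example:

```rust
// Only `y` shares a lifetime with the return value; `x` keeps its own
// elided lifetime and is free to end as soon as the call returns.
fn print_first_return_second<'a>(x: &str, y: &'a str) -> &'a str {
    println!("{}", x);
    y
}

// Hypothetical caller: the first argument is dropped before the result is used.
fn demo() -> &'static str {
    let kept = "kept";
    let result;
    {
        let temporary = String::from("temporary");
        result = print_first_return_second(temporary.as_str(), kept);
    } // `temporary` is dropped here; `result` does not borrow from it.
    result
}
```

Without the annotation, the compiler has no way to tell from the signature whether the result borrows from x or from y, which is why the unannotated version is rejected even though the body always returns y.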
First you need to specify the lifetimes of the parameters. I'll use 'x, for x, 'y for y and 'r for the result. Thus our function will look like:
fn longest<'x, 'y, 'r>(x: &'x str, y: &'y str) -> &'r str
But this is not enough. We still need to tell the compiler what the relationships are. There are two ways to do it (the syntax will be explained below):
Inside the <> brackets, like this: <'a, 'b: 'a>
In a where clause, like this: where 'b: 'a
Both options are the same, but the where clause will be more readable if you have a large number of generic parameters.
Back to the problem. We need to tell the compiler that 'r depends on both 'x and 'y and that it will be valid as long as they are valid. We can do that by saying 'long: 'short which translates to "lifetime 'long must be valid at least as long as lifetime 'short".
Thus we need to modify our function like this:
fn longest<'x, 'y, 'r>(x: &'x str, y: &'y str) -> &'r str
where
    'x: 'r,
    'y: 'r,
{
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
I.e. we are saying that our returned value will not outlive the function parameters, thus preventing a "use after free" situation.
PS: In this example you can actually do it with only one lifetime parameter, as we are not interested in the relationship between them. In this case the lifetime will be the smaller one of x/y:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str
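A small sketch of how the three-lifetime version behaves at a call site, assuming two strings with different scopes. The demo helper and its variable names are made up for this example:

```rust
fn longest<'x, 'y, 'r>(x: &'x str, y: &'y str) -> &'r str
where
    'x: 'r,
    'y: 'r,
{
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical caller with two different input lifetimes.
fn demo() -> usize {
    let outer = String::from("a long-lived string");
    let len;
    {
        let inner = String::from("short");
        // 'r can be at most as long as the scope of `inner`,
        // so `result` must not be used after this block.
        let result = longest(outer.as_str(), inner.as_str());
        len = result.len();
    }
    len
}
```

Moving the use of result outside the inner block would be rejected, because 'r would then have to outlive inner.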

The “outlives” relation and actual scopes

I was going through the legendary RFC 1214 and it seems that I’m missing something crucial.
struct Foo;
struct Bar<'a> {
    foo: &'a Foo
}
fn f<'x, 'y>(_: &'x Foo, _: &'y Bar<'x>)
    where 'y: 'x, 'x: 'y {
}
fn g<'x>(x: &'x Foo) {
    let y = Bar {foo : x};
    f(x, &y); // ?
}
fn main(){
    let x = Foo;
    g(&x);
}
In this code I went to great lengths to make sure that 'x : 'y and not 'y : 'x. The function that defines x calls the function that defines y, I believe this is already enough to guarantee that x outlives y, but I also put a reference to x inside y, just to make sure.
Now, the constraints in f are such that the invocation of this function can’t possibly be valid. I mean, well, it can, if and only if 'x == 'y, but it totally looks like x lives strictly longer than y, as it is defined in the outer scope.
Nevertheless, this code typechecks and compiles. How is this possible?
Lifetimes have variance, that is, the compiler can choose to shorten the lifetime of a &'a Foo to some &'b Foo. The lifetime of a reference like that just means that the Foo lasts at least as long as 'a: a shorter lifetime still satisfies this guarantee. This is what is happening here: the 'x lifetime is being shortened to have the same lifetime as the &y reference.
You can use invariance to stop this compiling: if the lifetime 'x cannot be shortened, then the code will stop compiling as you expect.
use std::cell::Cell;
struct Foo;
struct Bar<'a> {
    foo: Cell<&'a Foo>
}
fn f<'x, 'y>(_: Cell<&'x Foo>, _: &'y Bar<'x>)
    where 'y: 'x, 'x: 'y {
}
fn g<'x>(x: Cell<&'x Foo>) {
    let y = Bar {foo : x.clone()};
    f(x, &y); // ?
}
fn main(){
    let x = Foo;
    g(Cell::new(&x));
}
<anon>:16:10: 16:11 error: `y` does not live long enough
<anon>:16 f(x, &y); // ?
^
<anon>:14:28: 17:2 note: reference must be valid for the lifetime 'x as defined on the block at 14:27...
<anon>:14 fn g<'x>(x: Cell<&'x Foo>) {
<anon>:15 let y = Bar {foo : x.clone()};
<anon>:16 f(x, &y); // ?
<anon>:17 }
<anon>:15:34: 17:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 15:33
<anon>:15 let y = Bar {foo : x.clone()};
<anon>:16 f(x, &y); // ?
<anon>:17 }
What is happening here is Cell<T> is invariant in T, because it is readable and writable. This in particular means that Cell<&'x Foo> cannot be shortened to Cell<&'y Foo>: filling it with a reference &'y Foo that is truly 'y (i.e. only lasts for 'y) will mean the reference is dangling once the cell leaves 'y (but is still in 'x).
Here are three things which in combination explain the behavior you see:
The 'x on f is a completely different, independent lifetime parameter from the 'x in g. The compiler can choose different concrete lifetimes to substitute for each.
'x : 'y, 'y: 'x means that 'x == 'y (this is not real syntax).
If you have a reference, you can implicitly create another reference with a shorter lifetime from it. Consider for example the function fn mangle_a_string<'a>(_s: &'a str) -> &'a str { "a static string" }
So what happens in f(x, &y) is that the first argument is coerced to a reference with a shorter lifetime, matching the second argument's lifetime, to satisfy the bounds in the where clause.
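The implicit shortening described above can be sketched with plain string slices. The names pick_first and demo are hypothetical and only serve this illustration:

```rust
// Both parameters and the result share one lifetime 'a.
fn pick_first<'a>(first: &'a str, _second: &'a str) -> &'a str {
    first
}

fn demo() -> usize {
    let long_lived: &'static str = "static";
    let local = String::from("local");
    // `long_lived` is 'static, but it is implicitly coerced to the shorter
    // lifetime of the borrow of `local`, so a single 'a fits both arguments.
    let picked = pick_first(long_lived, local.as_str());
    picked.len()
}
```

The call compiles even though the two arguments start out with different lifetimes, which is the same mechanism that lets f(x, &y) satisfy both 'x: 'y and 'y: 'x.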
