Understanding lifetimes: borrowed value does not live long enough - rust

fn main() {
    let long;
    let str1 = "12345678".to_string();
    {
        let str2 = "123".to_string();
        long = longest(&str1, &str2);
    }
    println!("the longest string is: {}", long);
}
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
gives
error[E0597]: `str2` does not live long enough
 --> src/main.rs:6:31
  |
6 |         long = longest(&str1, &str2);
  |                               ^^^^^ borrowed value does not live long enough
7 |     }
  |     - `str2` dropped here while still borrowed
8 |     println!("the longest string is: {}", long);
  |                                           ---- borrow later used here
My theory is that, since the function longest has only one lifetime parameter, the compiler makes both x and y have the lifetime of str1. So Rust is protecting me from calling longest and possibly receiving back str2, whose lifetime is shorter than that of str1, the lifetime chosen for 'a.
Is my theory right?

Let's take a closer look at the signature for longest:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str
What this means is that for a given lifetime 'a, both arguments need to last at least for the length of that lifetime (or longer, which doesn't really matter since you can safely shorten any lifetime in this case without any particular difference) and the return value also lives as long as that lifetime, because the return value comes from one of the arguments and therefore "inherits" the lifetime.
The reason is that, when compiling the function, the compiler can't know whether x or y will be returned, so it has to assume that either can be. Since you've bound both of them with the same lifetime (x, y and the return value have to live at least for the duration of 'a), the resulting lifetime 'a is the shorter of the two. Now let's examine the usage of the function:
let long;
let str1 = "12345678".to_string();
{
    let str2 = "123".to_string();
    long = longest(&str1, &str2);
}
You have two lifetimes here, the one outside the braces (the main() body lifetime) and the lifetime inside the braces (since everything between the braces is destroyed after the closing brace). Because you're storing the strings as String by using .to_string() (owned strings) rather than &'static str (borrowed string literals stored in the program executable file), the string data gets destroyed as soon as it leaves the scope, which, in the case of str2, is the brace scope. The lifetime of str2 ends before the lifetime of str1, therefore, the lifetime of the return value comes from str2 rather than str1.
You then try to store the return value into long — a variable outside the inner brace scope, i.e. into a variable with a lifetime of the main() body rather than the scope. But since the lifetime of str2 restricts the lifetime of the return value for longest in this situation, the return value of longest doesn't live after the braced scope — the owned string you used to store str2 is dropped at the end of the braced scope, releasing resources required to store it, i.e. from a memory safety standpoint it no longer exists.
If you try this, however, everything works fine:
let long;
let str1 = "12345678";
{
    let str2 = "123";
    long = longest(str1, str2);
}
println!("the longest string is: {}", long);
But why? Remember what I said about how you stored the strings, more specifically, what I said about borrowed string literals which are stored in the executable file. These have a 'static lifetime, which means the entire duration of the program's runtime existence. This means that &'static to anything (not just str) always lives long enough, since now you're referring to memory space inside the executable file (allocated at compile time) rather than a resource on the heap managed by String and dropped when the braced scope ends. You're no longer dealing with a managed resource, you're dealing with a resource managed at compile time, and that pleases the borrow checker by eliminating possible issues with its duration of life, since it's always 'static.
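As a minimal sketch of one way to keep owned Strings and still satisfy the borrow checker, the example below assumes it is acceptable to declare str2 in the same scope as str1 (that relocation is an assumption, not something the question asked for). Both owners then outlive every use of long, so the single lifetime 'a can cover the println! call.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let str1 = "12345678".to_string();
    // Assumption: str2 may live in the outer scope, next to str1.
    let str2 = "123".to_string();
    let long;
    {
        // Both borrows are valid until the end of main, so 'a can be
        // long enough to reach the println! below.
        long = longest(&str1, &str2);
    }
    println!("the longest string is: {}", long);
}
```

The inner braces are kept only to mirror the original layout; they no longer limit any owner's scope.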

From the developer's perspective this code looks fine: we are printing long in the outer scope, so there should be no problem at all.
But Rust's compiler strictly checks borrowed values: it needs to be sure that every value lives long enough for the variables that depend on it.
The compiler sees that long depends on a value whose lifetime is shorter than its own, i.e. str2, so it gives an error. Behind the scenes this is done by the borrow checker.
You can see more details about borrow checker here

Related

Why does Rust translate same lifetime specifiers as the smaller one

In Rust docs, we see this example:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
And explanation looks like this:
The function signature now tells Rust that for some lifetime 'a, the
function takes two parameters, both of which are string slices that
live at least as long as lifetime 'a. The function signature also
tells Rust that the string slice returned from the function will live
at least as long as lifetime 'a. In practice, it means that the
lifetime of the reference returned by the longest function is the same
as the smaller of the lifetimes of the references passed in
Note the words after "in practice". It mentions that:
In practice, it means that the
lifetime of the reference returned by the longest function is the same
as the smaller of the lifetimes of the references passed in
I don't understand why, in practice, it means that the lifetime of the returned reference is the same as the smaller of those two parameters' lifetimes. Is this something I need to memorize? The parameters and the returned value clearly all have the same specifier 'a. Why does Rust think that this means the returned value should have the smaller lifetime of the two passed in?
Why does rust think that this means returned value should have SMALLER lifetime of those 2 passed ?
Because that's the only thing that makes sense. Imagine this situation:
let a = "foo"; // &'static str
let s = "bar".to_string(); // String
let b = s.as_str(); // &str (non-static, borrows from s)
let longest = longest(a, b);
The lifetime of a is 'static, i.e. a lasts as long as the program. The lifetime of b is shorter than that, as it's tied to the lifetime of the variable s. But longest only accepts one lifetime!
What Rust does is compute a lifetime that is an intersection of the 'static lifetime of a and the tied-to-s lifetime of b, and uses that as the lifetime of (this invocation of) longest(). If such a lifetime cannot be found, you get a borrow checking error. If it can be found, it's no longer than the shortest source lifetime.
In the above case, the intersection of 'static and the lifetime tied to s is the lifetime tied to s, so that's what's used for the lifetime 'a in longest().
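A small sketch of that intersection, under the assumption that we only need a plain number out of the inner scope (the variable names r and result_len are made up for illustration). The reference returned by longest is limited to the lifetime tied to s, but a usize copied out of it borrows nothing and can leave the block.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let a: &'static str = "foo";
    let result_len;
    {
        let s = "barbaz".to_string();
        let b = s.as_str(); // borrows from s
        // 'a is inferred as the overlap of 'static and the borrow of s,
        // which is just the borrow of s.
        let r = longest(a, b);
        // A usize holds no borrow, so it may outlive s.
        result_len = r.len();
    }
    println!("longest had {} bytes", result_len);
}
```

Trying to move r itself out of the block instead of its length would bring back the E0597 error, because r may point into s.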

How is output lifetime of a function calculated?

In Rust book (https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html), this code is used as example (paraphrased):
fn main() {
    let string1 = String::from("long string is long");
    {
        let string2 = String::from("xyz");
        let result = longest(string1.as_str(), string2.as_str()); // line 5
        println!("The longest string is {}", result); // line 6
    }
}
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}
I am confused why this code compiles at all.
Regarding the longest function, the book says, "the generic lifetime 'a will get the concrete lifetime that is equal to the smaller of the lifetimes of x and y".
The book then talked as if string1.as_str() and string2.as_str() live as long as string1 and string2 respectively. But why would they? These two references were not used after line 5, and by line 6, they should have been dead. Why there wasn't an error at line 6 for using result when it is no longer live?
One could say that presence of result somehow extends the input lifetimes, but wouldn't that contradict the notion that "output lifetime is the intersection of input lifetimes"?
Where do I get it wrong?
But why would they? These two references were not used after line 5, and by line 6, they should have been dead.
But they're not dead. In fact, one of them is definitely in result and is getting used on Line 6. A reference can last, at minimum, until the end of the current expression (generally, but not always, until a semicolon), and at maximum as long as the thing it points to continues existing. The lifetime parameter from the output of longest requires that it last as long as result is in scope. Notably, the scope of result is no larger than the scope of either string1 or string2, so there's no issue. If we tried to assign the result of longest to a variable that outlives string2, then we'd have a problem. For instance, this won't compile.
fn main() {
    let string1 = String::from("long string is long");
    let mut result = "";
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}
Because that would require result to outlive string2, which is a problem.
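If the result really does need to outlive string2, one possible workaround (a sketch, assuming an extra allocation is acceptable) is to copy the chosen slice into an owned String before the inner scope ends. The owned copy does not borrow from either input.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");
    let result: String;
    {
        let string2 = String::from("xyz");
        // to_owned() copies the bytes, so the borrow of string2 ends
        // with this statement.
        result = longest(string1.as_str(), string2.as_str()).to_owned();
    }
    println!("The longest string is {}", result);
}
```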
The confusion seems to me to originate in the type &'a str. The 'a is the lifetime of the data to which the reference refers, i.e. the region of validity of that data. It is not the region for which the reference variable itself is valid.
So...
string1.as_str() returns a &'s1 str, where 's1 is the lifetime of string1.
string2.as_str() returns a &'s2 str, where 's2 is the lifetime of string2.
longest must infer a single generic lifetime 'a from two parameter lifetimes 's1 and 's2. It does this by choosing the common overlap, the shorter lifetime 's2. This is a form of subtyping: references valid for longer lifetimes can be used transparently as references valid for shorter lifetimes.
So result is a &'s2 str.
Line 6 references result of type &'s2 str where 's2 is the lifetime of string2. The reference name itself is available, and the referent data is valid for the lifetime of string2, 's2, which we are still within.
string1.as_str() and string2.as_str() live as long as string1 and string2 respectively
That's loosely worded. They are references, and they do not last past line 5. But they are references with a lifetime equal to the lifetime of string1 and string2 respectively. This means the data they point to is to live at least that long.
Why there wasn't an error at line 6 for using result when it is no longer live?
result has type &'s2 str, so it is valid. It is a copy of one of the two (temporary) references which are inputs to longest in line 5. The name result is valid in line 6, and it points to data which is guaranteed to live as long as 's2 which is still valid. So, there is no error.
One could say that presence of result somehow extends the input lifetimes
There is no such extension. Those temporaries do not exist past line 5. But one of them is copied into result, and the lifetime is not how long the temporary lives (how long the input reference &'a str lives), but how long the referent data lives.
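The distinction between how long a reference variable exists and how long its referent is valid can be sketched as follows. The helper pass_through and the variable names are hypothetical, chosen only for this illustration.

```rust
// Hypothetical helper: the local `inner` disappears when the function
// returns, yet the reference it held stays valid for all of 'a.
fn pass_through<'a>(data: &'a str) -> &'a str {
    let inner: &'a str = data;
    inner
}

fn main() {
    let owner = String::from("xyz");
    let outer: &str;
    {
        // `inner` as a variable only exists inside this block...
        let inner: &str = owner.as_str();
        // ...but shared references are Copy, so the copy keeps the lifetime
        // of the borrow of `owner`, not the scope of `inner`.
        outer = inner;
    }
    println!("{}", outer);
    println!("{}", pass_through(&owner));
}
```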

In Rust, can you own a string literal?

According to The Rust book:
Each value in Rust has a variable that’s called its owner. There can be only one owner at a time. When the owner goes out of scope, the value will be dropped.
According to rust-lang.org:
Static items do not call drop at the end of the program.
After reading this SO post, and given the code below, I understand that foo is a value whose variable y, equivalent to &y since "string literals are string slices", is called its owner. Is that correct? Or do static items have no owner?
let x = String::from("foo"); // heap allocated, mutable, owned
let y = "foo"; // statically allocated in the Rust executable, immutable
I'm wondering because unlike an owned String, string literals are not moved, presumably because they're stored in .rodata in the executable.
fn main() {
    let s1 = "foo"; // as opposed to String::from("foo")
    let s2 = s1; // not moved
    let s3 = s2; // no error, unlike String::from("foo")
}
UPDATE: According to The Rust book:
These ampersands are references, and they allow you to refer to some value without taking ownership of it...Another data type that does not have ownership is the slice.
Since string literals are string slices (&str) (see citation above), they, logically, do not have ownership. The rationale seems to be that the compiler requires a data structure with a known size: a reference:
let s1: str = "foo"; // [rustc E0277] the size for values of type `str` cannot be known at compilation time [E]
A string slice reference (&str) does not own the string slice that it points to, it borrows it. You can have several immutable references to an object, that's why your second code sample is correct and the borrow checker is happy to accept it.
I think you can say that types with the 'static lifetime have no owner, or that something outside of the main function owns it. The owner only matters when the lifetime of the owning object ends (if you own it, you need to free resources). For references only lifetimes matter.
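To make the difference concrete, here is a small sketch contrasting a copied &'static str with a moved String. The helper copy_twice is hypothetical and exists only to show that a shared reference can be duplicated freely.

```rust
// Hypothetical helper: returns the same reference twice. This compiles
// because &str is Copy; nothing is moved out of `s`.
fn copy_twice(s: &'static str) -> (&'static str, &'static str) {
    (s, s)
}

fn main() {
    let s1 = "foo";
    let s2 = s1; // copies the reference, s1 stays usable
    let s3 = s2;
    println!("{} {} {}", s1, s2, s3);

    let o1 = String::from("foo");
    let o2 = o1; // moves ownership of the heap buffer
    // println!("{}", o1); // would fail: error[E0382], borrow of moved value
    println!("{}", o2);

    println!("{:?}", copy_twice("foo"));
}
```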

Do lifetime annotations in Rust change the lifetime of the variables?

The Rust chapter states that the annotations don't tamper with the lifetime of a variable but how true is that? According to the book, the function longest takes two references of the strings and return the longer one. But here in the error case
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}
it does actually change the lifetime of the result variable, doesn't it?
We’ve told Rust that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in.
What the book is merely suggesting is that a lifetime parameter of a function cannot interfere with the affected value's lifetime. They cannot make a value live longer (or the opposite) than what is already prescribed by the program.
However, different function signatures can decide the lifetime of those references. Since references are covariant with respect to their lifetimes, you can turn a reference of a "wider" lifetime into a smaller one within that lifetime.
For example, given the definition
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str
, the lifetimes of the two input references must match. However, we can write this:
let local = "I am local string.".to_string();
longest(&local, "I am &'static str!");
The string literal, which has a 'static lifetime, is compatible with the lifetime 'a, in this case mainly constrained by the string local.
Likewise, in the example above, the lifetime 'a has to be constrained to the nested string string2, otherwise it could not be passed by reference to the function. This also means that the output reference is restrained by this lifetime, which is why the code fails to compile when attempting to use the output of longest outside the scope of string2:
error[E0597]: `string2` does not live long enough
  --> src/main.rs:14:44
   |
14 |         result = longest(string1.as_str(), string2.as_str());
   |                                            ^^^^^^^ borrowed value does not live long enough
15 |     }
   |     - `string2` dropped here while still borrowed
16 |     println!("The longest string is {}", result);
   |                                          ------ borrow later used here
See also this question for an extended explanation of lifetimes and their covariance/contravariance characteristics:
How can this instance seemingly outlive its own parameter lifetime?
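The covariance this answer relies on can be sketched in isolation. The function shorten and its parameter names are hypothetical; it simply returns a &'static str where a reference with the shorter lifetime 'a is expected, which the compiler accepts without any conversion.

```rust
// Hypothetical helper: a &'static str is accepted wherever a &'a str is
// expected, because a longer-lived reference can stand in for a
// shorter-lived one.
fn shorten<'a>(long_lived: &'static str, _anchor: &'a str) -> &'a str {
    long_lived
}

fn main() {
    let local = "I am local string.".to_string();
    // The result is treated as borrowing for the lifetime of `local`,
    // even though the data actually lives for the whole program.
    let r: &str = shorten("I am &'static str!", &local);
    println!("{}", r);
}
```

The opposite direction, returning a &'a str where &'static str is required, is rejected by the compiler.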
First, it's important to understand the difference between a lifetime and a scope. References have lifetimes, which are dependent on the scopes of the variables they refer to.
A variable scope is lexical:
fn main() {
    let string1 = String::from("long string is long"); // <-- scope of string1 begins here
    let result;
    {
        let string2 = String::from("xyz"); // <-- scope of string2 begins here
        result = longest(string1.as_str(), string2.as_str());
        // <-- scope of string2 ends here
    }
    println!("The longest string is {}", result);
    // <-- scope of string1 ends here
}
When you create a new reference to a variable, the lifetime of the reference is tied solely to the scope of the variable. Other references have different lifetime information attached to them, depending on where the reference came from and what information is known in that context. When you put named lifetime annotations on a type, the type-checker simply ensures that the lifetime information attached to any references is compatible with the annotations.
fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        // The lifetime of result cannot be longer than `'a`
        result = longest(string1.as_str(), string2.as_str());
        // But a reference to string2 also has lifetime `'a`, which means that
        // the lifetime `'a` is only valid for the scope of string2
        // <-- i.e. to here
    }
    // But then we try to use it here, oops!
    println!("The longest string is {}", result);
}
We’ve told Rust that the lifetime of the reference returned by the longest function is the same as the smaller of the lifetimes of the references passed in.
Sort of. We did tell this information to Rust; however, the borrow checker will still check whether it is true! If it isn't already true then we will get an error. We can't change the truthfulness of that information; we can only tell Rust the constraints we want, and it will tell us if we are right.
In your example, you could make the main function valid by changing the lifetime annotations on longest:
fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y // oops!
    }
}
But now you get an error inside longest because it no longer meets the requirements: it is now never valid to return y because its lifetime could be shorter than 'a. In fact, the only ways to implement this function correctly are:
Return x
Return a slice of x
Return a &'static str — since 'static outlives all other lifetimes
Use unsafe code
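As a sketch of the first and third options combined, the hypothetical function x_or_fallback below keeps the two-lifetime signature but never returns y: it returns either x or a &'static str. The function name and the fallback text are invented for this example, and this changes what the function computes, so it is not a drop-in replacement for longest.

```rust
// Hypothetical variant with separate lifetimes: only `x` or a 'static
// literal is ever returned, so the result depends on 'a alone.
fn x_or_fallback<'a, 'b>(x: &'a str, y: &'b str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        "y was at least as long" // &'static str, valid for any 'a
    }
}

fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        // The borrow of string2 ('b) is not part of the return type,
        // so it can end when this block ends.
        result = x_or_fallback(string1.as_str(), string2.as_str());
    }
    println!("The result is {}", result);
}
```

With this signature the main function from the question compiles, because result is only tied to string1.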

Why does the compiler tell me to "consider using a `let` binding" when I already am?

What is my error and how to fix it?
fn get_m() -> Vec<i8> {
    vec![1, 2, 3]
}

fn main() {
    let mut vals = get_m().iter().peekable();
    println!("Saw a {:?}", vals.peek());
}
(playground)
The compiler's error suggests "consider using a let binding" — but I already am:
error[E0597]: borrowed value does not live long enough
 --> src/main.rs:6:45
  |
6 |     let mut vals = get_m().iter().peekable();
  |                    -------                  ^ temporary value dropped here while still borrowed
  |                    |
  |                    temporary value created here
7 |     println!("Saw a {:?}", vals.peek());
8 | }
  | - temporary value needs to live until here
  |
  = note: consider using a `let` binding to increase its lifetime
This is obviously a newbie question -- though I thought I'd written enough Rust at this point that I had a handle on the borrow checker... apparently I haven't.
This question is similar to Using a `let` binding to increase value lifetime, but doesn't involve breaking down an expression into multiple statements, so I don't think the problem is identical.
The problem is that the Peekable iterator lives to the end of the function, but it holds a reference to the vector returned by get_m, which only lasts as long as the statement containing that call.
There are actually a lot of things going on here, so let's take it step by step:
get_m allocates and returns a vector, of type Vec<i8>.
We make the call .iter(). Surprisingly, Vec<i8> has no iter method, nor does it implement any trait that has one. So there are three sub-steps here:
Any method call checks whether its self value implements the Deref trait, and applies it if necessary. Vec<i8> does implement Deref, so we implicitly call its deref method. However, deref takes its self argument by reference, which means that get_m() is now an rvalue appearing in an lvalue context. In this situation, Rust creates a temporary to hold the value, and passes a reference to that. (Keep an eye on this temporary!)
We call deref, yielding a slice of type &[i8] borrowing the vector's elements.
This slice has an iter method (an inherent method on slices in current Rust; older versions provided it through the SliceExt trait). Finally! This iter also takes its self argument by reference, and returns a std::slice::Iter holding a reference to the slice.
We make the call .peekable(). As before, std::slice::Iter has no peekable method of its own, but it does implement Iterator, and Iterator provides peekable (older versions routed this through a separate IteratorExt trait). This takes its self by value, so the Iter is consumed, and we get a std::iter::Peekable back in return, again holding a reference to the slice.
This Peekable is then bound to the variable vals, which lives to the end of the function.
The temporary holding the original Vec<i8>, to whose elements the Peekable refers, now dies. Oops. This is the borrowed value not living long enough.
But the temporary dies there only because that's the rule for temporaries. If we give it a name, then it lasts as long as its name is in scope:
let vec = get_m();
let mut peekable = vec.iter().peekable();
println!("Saw a {:?}", peekable.peek());
I think that's the story. What still confuses me, though, is why that temporary doesn't live longer, even without a name. The Rust reference says, "A temporary's lifetime equals the largest lifetime of any reference that points to it." But that's clearly not the case here.
This is happening because you are calling .iter().peekable() on the temporary vector returned by get_m(); that temporary is dropped at the end of the statement while vals still borrows from it.
Basically, you want something like this:
fn get_m() -> Vec<i8> {
    vec![1, 2, 3]
}
fn main() {
    let vals = get_m();
    let mut val = vals.iter().peekable();
    println!("Saw a {:?}", val.peek());
}
(Playground)
Result:
Saw a Some(1)
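Another option, sketched here on the assumption that the vector is not needed after iteration, is to consume it with into_iter() instead of borrowing it with iter(). The iterator then owns the elements, so there is no temporary to outlive. The helper first_peek is hypothetical and only wraps the call so it can be checked.

```rust
fn get_m() -> Vec<i8> {
    vec![1, 2, 3]
}

// Hypothetical helper: into_iter() takes the Vec by value, so the
// Peekable owns its data and borrows nothing from a temporary.
fn first_peek() -> Option<i8> {
    let mut vals = get_m().into_iter().peekable();
    // peek() yields Option<&i8>; copied() turns it into Option<i8>.
    vals.peek().copied()
}

fn main() {
    println!("Saw a {:?}", first_peek());
}
```

The trade-off is that the original Vec is gone once the iterator is created, whereas the let-binding version keeps it available.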

Resources