Rust temporary variable lifetime in method chaining

I'm trying to learn Rust's lifetime rules by comparing them to similar concepts in C++, which I'm more familiar with. Most of the time my intuition works well and I can make sense of the rules. However, in the following case, I'm not sure whether my understanding is correct.
In Rust, a temporary value lives until the end of its statement, except when the last temporary value is bound to a name using let.
struct A(u8);
struct B(u8);
impl A {
    fn get_b(&mut self) -> Option<B> {
        Some(B(self.0))
    }
}
fn a(v: u8) -> A {
    A(v)
}
// temporary A's lifetime is the end of the statement
// temporary B binds to a name so lives until the enclosing block
let b = a(1).get_b();
// temporary A's lifetime is the end of the statement
// temporary B's lifetime extends to the enclosing block,
// so that taking reference of temporary works similar as above
let b = &a(2).get_b();
If the temporary value is in an if condition, then according to the reference its lifetime is instead limited to the condition expression.
// Both temporary A and temporary B are dropped before printing "some"
if a(3).get_b().unwrap().0 <= 3 {
    println!("some");
}
Now to the question:
If let is placed in the if condition, then because of pattern matching we are binding to an inner part of the temporary value. I'd expect the temporary value bound by let to be extended to the enclosing block, while the other temporary values should still have a lifetime limited to the if condition.
(In this case everything is actually copied, so I would say even temporary B could be dropped, but that's a separate question.)
However, both temporaries' lifetimes are extended to the enclosing if block.
// Both temporary A's and temporary B's lifetimes are extended to the end of the enclosing block,
// which is the if statement
if let Some(B(v @ 0..=4)) = a(4).get_b() {
    println!("some {}", v);
}
Should this be considered an inconsistency in Rust? Or am I misunderstanding and there is a consistent rule that can explain this behavior?
Full code example:
playground
The same thing implemented in C++ that matches my expectation
Note the output from Rust is
some 4
Drop B 4
Drop A 4
while the output from C++ is
Drop A 4
some 4
Drop B 4
I have read this Reddit thread and Rust issue, which I think are quite relevant, but I still can't find a clear set of lifetime rules that works for all the cases in Rust.
Update:
What I'm unclear about is why the temporary lifetime rule for the if condition expression does not apply to if let. I think let Some(B(v @ 0..=4)) = a(4).get_b() should be the condition expression, and thus temporary A's lifetime should be limited to it rather than to the entire if statement.
Extending temporary B's lifetime to the entire if statement is expected, because B is borrowed by the pattern matching.

An if let construct is just syntactic sugar for a match construct. let Some(B(v @ 0..=4)) = a(4).get_b() is not a condition of the kind used in a regular if expression, because it is not an expression that evaluates to bool. Given your example:
if let Some(B(v @ 0..=4)) = a(4).get_b() {
    println!("some {}", v);
}
It will behave exactly the same as the example below. No exceptions. if let is rewritten into match before the type or borrow checkers are even run.
match a(4).get_b() {
    Some(B(v @ 0..=4)) => {
        println!("some {}", v);
    }
    _ => {}
}
Temporaries live as long as they do in match blocks because that is sometimes useful. For example, if your last function were fn get_b(&mut self) -> Option<&B> and the temporary did not live for the entire match block, the code would not pass the borrow checker.
Conditions of a plain if don't follow the same rule because the final value of an if condition cannot hold a reference to anything: it has to evaluate to a plain bool.
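The drop order described above can be observed with a small sketch. This is not the playground code linked in the question; the LOG thread-local and the log helper are hypothetical names introduced here so that the order of events can be inspected, and the types A and B mirror the ones in the question with Drop implementations added.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical event log, used only to record the order of drops.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: impl Into<String>) {
    LOG.with(|l| l.borrow_mut().push(event.into()));
}

fn take_log() -> Vec<String> {
    LOG.with(|l| std::mem::take(&mut *l.borrow_mut()))
}

struct A(u8);
struct B(u8);

impl Drop for A {
    fn drop(&mut self) {
        log(format!("drop A {}", self.0));
    }
}

impl Drop for B {
    fn drop(&mut self) {
        log(format!("drop B {}", self.0));
    }
}

impl A {
    fn get_b(&mut self) -> Option<B> {
        Some(B(self.0))
    }
}

fn a(v: u8) -> A {
    A(v)
}

// Plain `if`: both temporaries are dropped at the end of the condition.
fn plain_if() -> Vec<String> {
    take_log();
    if a(3).get_b().unwrap().0 <= 3 {
        log("some");
    }
    take_log()
}

// `if let`: both temporaries stay alive while the body runs.
fn if_let() -> Vec<String> {
    take_log();
    if let Some(B(v @ 0..=4)) = a(4).get_b() {
        log(format!("some {}", v));
    }
    take_log()
}

fn main() {
    println!("{:?}", plain_if());
    println!("{:?}", if_let());
}
```

In the if let case the recorded order matches the Rust output quoted in the question: the body runs first, then B is dropped, then A.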
See:
Rust issue 37612

Related

Why does assigning a reference to a variable make me not able to return it

In this code:
fn main() {
    let a = {
        &mut vec![1]
    };
    let b = {
        let temp = &mut vec![1];
        temp
    };
    println!("{a:?} {b:?}");
}
Why is a valid and b not valid ("temporary value dropped while borrowed [E0716]")?
It would make sense to me if they were both problematic, why isn't the vec in a getting dropped?
Is it simply that the compiler can understand the first example, but the second one is too hard for it to understand?
In one sentence: a is a temporary while b is not.
temp is a variable; variables are always dropped at the end of the enclosing scope. The scope ends before we assign it to b.
In contrast, the vec![] in a is a temporary, as it is not assigned to a variable. Temporaries are generally dropped at the end of the statement, however because the statement is a let declaration, the temporary inside it is subject to temporary lifetime extension and its lifetime is extended to match the lifetime of a itself, that is, until the enclosing block of a.
Note that to be precise, temp is also assigned a temporary that is subject to temporary lifetime extension - but its extended lifetime matches the lifetime of temp, as it is part of its declaration.
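As a minimal sketch of how the b case could be made to compile, assuming it is acceptable to give the Vec a named owner in the outer scope (the names owner and extended_and_fixed are hypothetical and not from the question):

```rust
fn extended_and_fixed() -> (Vec<i32>, Vec<i32>) {
    // Temporary lifetime extension: the unnamed Vec lives as long as `a`.
    let a = &mut vec![1];
    a.push(2);

    // Hypothetical fix for `b`: the Vec is owned by a variable in the
    // outer scope, so the reference returned from the block stays valid.
    let mut owner = vec![1];
    let b = {
        let temp = &mut owner;
        temp.push(3);
        temp
    };

    (a.to_vec(), b.to_vec())
}

fn main() {
    let (a, b) = extended_and_fixed();
    println!("{a:?} {b:?}");
}
```

Here the inner block only creates and returns a reference; nothing that the reference points to is dropped when the block ends.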

Who has the ownership of a temporary value when it is created and referred by a struct? [duplicate]

Coming from C++, I'm rather surprised that this code is valid in Rust:
let x = &mut String::new();
x.push_str("Hello!");
In C++, you can't take the address of a temporary, and a temporary won't outlive the expression it appears in.
How long does the temporary live in Rust? And since x is only a borrow, who is the owner of the string?
Why is it legal to borrow a temporary?
It's legal for the same reason it's illegal in C++ — because someone said that's how it should be.
How long does the temporary live in Rust? And since x is only a borrow, who is the owner of the string?
The reference says:
the temporary scope of an expression is the smallest scope that contains the expression and is for one of the following:
The entire function body.
A statement.
The body of an if, while or loop expression.
The else block of an if expression.
The condition expression of an if or while expression, or a match guard.
The expression for a match arm.
The second operand of a lazy boolean expression.
Essentially, you can treat your code as:
let mut a_variable_you_cant_see = String::new();
let x = &mut a_variable_you_cant_see;
x.push_str("Hello!");
See also:
Why can I return a reference to a local literal but not a variable?
What is the scope of unnamed values?
Are raw pointers to temporaries ok in Rust?
From the Rust Reference:
Temporary lifetimes
When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead
This applies, because String::new() is a value expression and being just below &mut it is in a place expression context. Now the reference operator only has to pass through this temporary memory location, so it becomes the value of the whole right side (including the &mut).
When a temporary value expression is being created that is assigned into a let declaration, however, the temporary is created with the lifetime of the enclosing block instead
Since it is assigned to the variable it gets a lifetime until the end of the enclosing block.
This also answers this question about the difference between
let a = &String::from("abcdefg"); // ok!
and
let a = String::from("abcdefg").as_str(); // compile error
In the second variant the temporary is passed into as_str(), so its lifetime ends at the end of the statement.
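A minimal sketch of the usual workaround for the second variant, assuming the goal is simply to keep a &str around: bind the String to a variable first, so that it has a named owner that outlives the borrow. The function name lengths and the variable owner are hypothetical.

```rust
fn lengths() -> (usize, usize) {
    // The temporary String is extended to the end of the enclosing block.
    let a = &String::from("abcdefg");

    // Workaround for the failing variant: give the String a named owner,
    // then borrow from it.
    let owner = String::from("abcdefg");
    let b = owner.as_str();

    (a.len(), b.len())
}

fn main() {
    println!("{:?}", lengths());
}
```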
Rust's MIR provides some insight on the nature of temporaries; consider the following simplified case:
fn main() {
let foo = &String::new();
}
and the MIR it produces (standard comments replaced with mine):
fn main() -> () {
    let mut _0: ();
    scope 1 {
        let _1: &std::string::String; // the reference is declared
    }
    scope 2 {
    }
    let mut _2: std::string::String; // the owner is declared
    bb0: {
        StorageLive(_1); // the reference becomes applicable
        StorageLive(_2); // the owner becomes applicable
        _2 = const std::string::String::new() -> bb1; // the owner gets a value; go to basic block 1
    }
    bb1: {
        _1 = &_2; // the reference now points to the owner
        _0 = ();
        StorageDead(_1); // the reference is no longer applicable
        drop(_2) -> bb2; // the owner's value is dropped; go to basic block 2
    }
    bb2: {
        StorageDead(_2); // the owner is no longer applicable
        return;
    }
}
You can see that an "invisible" owner receives a value before a reference is assigned to it and that the reference is dropped before the owner, as expected.
What I'm not sure about is why there is a seemingly useless scope 2 and why the owner is not put inside any scope; I'm suspecting that MIR just isn't 100% ready yet.

Where is a MutexGuard if I never assign it to a variable?

I don't understand "where" the MutexGuard in the inner block of code is. The mutex is locked and unwrapped, yielding a MutexGuard. Somehow this code manages to dereference that MutexGuard and then mutably borrow that object. Where did the MutexGuard go? Also, confusingly, this dereference cannot be replaced with deref_mut. Why?
use std::sync::Mutex;
fn main() {
    let x = Mutex::new(Vec::new());
    {
        let y: &mut Vec<_> = &mut *x.lock().unwrap();
        y.push(3);
        println!("{:?}, {:?}", x, y);
    }
    let z = &mut *x.lock().unwrap();
    println!("{:?}, {:?}", x, z);
}
Summary: because *x.lock().unwrap() performs an implicit borrow of the operand x.lock().unwrap(), the operand is treated as a place context. But since our actual operand is not a place expression, but a value expression, it gets assigned to an unnamed memory location (basically a hidden let binding)!
See below for a more detailed explanation.
Place expressions and value expressions
Before we dive in, first two important terms. Expressions in Rust are divided into two main categories: place expressions and value expressions.
Place expressions represent a value that has a home (a memory location). For example, if you have let x = 3; then x is a place expression. Historically this was called lvalue expression.
Value expressions represent a value that does not have a home (we can only use the value, there is no memory location associated with it). For example, if you have fn bar() -> i32 then bar() is a value expression. Literals like 3.14 or "hi" are value expressions too. Historically these were called rvalue expressions.
There is a good rule of thumb to check if something is a place or value expression: "does it make sense to write it on the left side of an assignment?". If it does (like my_variable = ...;) it is a place expression, if it doesn't (like 3 = ...;) it's a value expression.
There also exist place contexts and value contexts. These are basically the "slots" in which expressions can be placed. There are only a few place contexts, which (usually, see below) require a place expression:
Left side of a (compound) assignment expression (⟨place context⟩ = ...;, ⟨place context⟩ += ...;)
Operand of a borrow expression (&⟨place context⟩ and &mut ⟨place context⟩)
... plus a few more
Note that place expressions are strictly more "powerful". They can be used in a value context without a problem, because they also represent a value.
(relevant chapter in the reference)
Temporary lifetimes
Let's build a small dummy example to demonstrate a thing Rust does:
struct Foo(i32);
fn get_foo() -> Foo {
Foo(0)
}
let x: &Foo = &get_foo();
This works!
We know that the expression get_foo() is a value expression. And we know that the operand of a borrow expression is a place context. So why does this compile? Didn't place contexts need place expressions?
Rust creates temporary let bindings! From the reference:
When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead [...].
So the above code is equivalent to:
let _compiler_generated = get_foo();
let x: &Foo = &_compiler_generated;
This is what makes your Mutex example work: the MutexGuard is assigned to a temporary unnamed memory location! That's where it lives. Let's see:
&mut *x.lock().unwrap();
The x.lock().unwrap() part is a value expression: it has the type MutexGuard and is returned by a function (unwrap()) just like get_foo() above. Then there is only one last question left: is the operand of the deref * operator a place context? I didn't mention it in the list of place contexts above...
Implicit borrows
The last piece in the puzzle are implicit borrows. From the reference:
Certain expressions will treat an expression as a place expression by implicitly borrowing it.
These include "the operand of the dereference operator (*)"! And all operands of any implicit borrow are place contexts!
So because *x.lock().unwrap() performs an implicit borrow, the operand x.lock().unwrap() is a place context, but since our actual operand is not a place, but a value expression, it gets assigned to an unnamed memory location!
Why doesn't this work for deref_mut()
There is an important detail of "temporary lifetimes". Let's look at the quote again:
When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead [...].
Depending on the situation, Rust chooses memory locations with different lifetimes! In the &get_foo() example above, the temporary unnamed memory location had a lifetime of the enclosing block. This is equivalent to the hidden let binding I showed above.
However, this "temporary unnamed memory location" is not always equivalent to a let binding! Let's take a look at this case:
fn takes_foo_ref(_: &Foo) {}
takes_foo_ref(&get_foo());
Here, the Foo value lives only until the end of the statement that contains the takes_foo_ref call, and no longer!
In general, if the reference to the temporary is used as an argument for a function call, the temporary is dropped at the end of the enclosing statement. This also includes the &self (and &mut self) parameter. So in get_foo().deref_mut(), the Foo object would also live only until the end of that statement. But since deref_mut() returns a reference to the Foo object, keeping that reference around would give a "does not live long enough" error (reported as E0716, "temporary value dropped while borrowed", by current compilers).
That's of course also the case for x.lock().unwrap().deref_mut() -- that's why we get the error.
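The difference between the two temporary lifetimes can be made visible with a Drop implementation. The following is a sketch under assumptions: the LOG thread-local, the log helper and the trace function are hypothetical names introduced here, and Foo, get_foo and takes_foo_ref mirror the dummy example above.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical event log, used only to record when Foo is dropped.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: impl Into<String>) {
    LOG.with(|l| l.borrow_mut().push(event.into()));
}

struct Foo(i32);

impl Drop for Foo {
    fn drop(&mut self) {
        log(format!("drop Foo({})", self.0));
    }
}

fn get_foo() -> Foo {
    Foo(0)
}

fn takes_foo_ref(_: &Foo) {
    log("in call");
}

fn trace() -> Vec<String> {
    LOG.with(|l| l.borrow_mut().clear());

    // Temporary used as a function argument: dropped at the end of
    // this statement.
    takes_foo_ref(&get_foo());
    log("after call statement");

    {
        // Temporary in a `let` initializer under `&`: extended to the
        // end of the enclosing block.
        let _x: &Foo = &get_foo();
        log("end of block");
    }
    log("after block");

    LOG.with(|l| l.borrow().clone())
}

fn main() {
    for event in trace() {
        println!("{event}");
    }
}
```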
In the deref operator (*) case, the temporary lives for the enclosing block (equivalent to a let binding). I can only assume that this is a special case in the compiler: the compiler knows that a call to deref() or deref_mut() always returns a reference to the self receiver, so it wouldn't make sense to borrow the temporary for only the function call.
Here are my thoughts:
let y: &mut Vec<_> = &mut *x.lock().unwrap();
A couple of things going on under the surface for your current code:
Your .lock() yields a LockResult<MutexGuard<Vec<_>>>.
You called unwrap() on the LockResult and got a MutexGuard<Vec<_>>.
Because MutexGuard<T> implements the DerefMut trait, the * operator can dereference it to the Vec inside, and &mut then borrows that Vec, yielding a &mut Vec<_>.
In Rust, you usually don't call deref_mut yourself; the compiler inserts the call for you when you use the * operator or when deref coercion applies.
If you want to get your MutexGuard, you should not dereference it:
let mut y = x.lock().unwrap();
(*y).push(3);
println!("{:?}, {:?}", x, y);
//Output: Mutex { data: <locked> }, MutexGuard { lock: Mutex { data: <locked> } }
From what I have seen online, people usually do make the MutexGuard explicit by saving it into a variable, and dereference it when it is being used, like my modified code above. I don't think there is an official pattern about this. Sometimes it will also save you from making a temporary variable.
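One way to check where the guard lives is to probe the mutex with try_lock, which never blocks and fails while the lock is held. The sketch below assumes a Mutex<Vec<i32>>; the function names guard_scope and with_named_guard are hypothetical. It also shows that deref_mut() is fine once the guard has a named owner.

```rust
use std::ops::DerefMut;
use std::sync::Mutex;

// Returns (lock held inside the block?, lock held after the block?).
fn guard_scope() -> (bool, bool) {
    let x = Mutex::new(Vec::new());
    let held_inside;
    {
        // The MutexGuard is an unnamed temporary; because it sits under
        // `&mut` in a `let` initializer, it lives until the block ends.
        let y: &mut Vec<i32> = &mut *x.lock().unwrap();
        y.push(3);
        // `try_lock` fails here because the hidden guard is still alive.
        held_inside = x.try_lock().is_err();
    }
    // The hidden guard was dropped with the block, so the lock is free.
    let held_after = x.try_lock().is_err();
    (held_inside, held_after)
}

// Calling `deref_mut()` explicitly works when the guard is a variable.
fn with_named_guard() -> Vec<i32> {
    let x = Mutex::new(vec![1]);
    let mut guard = x.lock().unwrap();
    let v: &mut Vec<i32> = guard.deref_mut();
    v.push(2);
    v.to_vec()
}

fn main() {
    println!("{:?}", guard_scope());
    println!("{:?}", with_named_guard());
}
```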
