Value moved in an empty expression statement - rust

On my first day of fiddling with Rust, I tried to execute an empty expression statement. The compiler threw a "value borrowed here after move" error when I tried printing the variable used inside the empty expression statement.
Here is the sample code:
fn main() {
    let tstr = String::from("test"); // "move occurs because `tstr` has type `String`, which does not implement the `Copy` trait"
    tstr; // causes move? <- "value moved here"
    println!("{}", tstr); // "value borrowed here after move"
}
Does the empty expression call some hidden trait of the String method which takes ownership of the object? Or is something else at play here?

Rust is generally considered to be an "expression oriented" language. Broadly speaking, this means that most language constructs are expressions. Expressions are things that evaluate to values, and critically, do so independent of the context that they appear in.
A concrete consequence of this is that whether you write
let foo = <expression>;
or
<expression>;
doesn't affect[1] how Rust evaluates <expression>. In particular, it doesn't change whether evaluating <expression> causes a value to be moved.
Since evaluating the right-hand side of
let new_tstr = tstr;
clearly involves moving tstr, so too does evaluating
tstr;
[1] At the language level. Rust may of course compile these statements differently so long as the observable behavior remains the same. See "as-if rule."
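As a rough sketch of the same point (the function name is made up for illustration): an expression statement only moves a value if evaluating the expression itself moves it. Evaluating a Copy value or a borrow leaves the variable usable afterwards.

```rust
// rustc warns about statements that have no effect; silence that for the demo.
#[allow(path_statements, unused_must_use)]
fn expression_statements() -> (i32, String) {
    let n: i32 = 5;
    n; // i32 is Copy, so evaluating `n` copies it and `n` stays usable

    let tstr = String::from("test");
    &tstr; // evaluates a borrow of `tstr`, not a move

    // A bare `tstr;` here would move the String out of the variable and
    // drop it, so the use of `tstr` below would no longer compile.
    (n, tstr)
}

fn main() {
    let (n, tstr) = expression_statements();
    println!("{} {}", n, tstr);
}
```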

Related

Why does Rust fail to correctly infer the type?

Why does this work fine:
let items = [1, 2, 3];
let mut cumulator = 0;
for next in items.iter() {
cumulator += next;
}
println!("Final {}", cumulator);
But this fail?:
let items = [1, 2, 3];
let mut cumulator = 0;
for next in items.iter() {
cumulator += next.pow(2);
}
println!("Final {}", cumulator);
Error on .pow(2):
no method named `pow` found for reference `&{integer}` in the current scope
method not found in `&{integer}` rustc (E0599)
My IDE identifies next as i32 and the first code example works fine. But the compiler has an issue the moment I reference next.pow() or any function on next. The compiler complains that next is an ambiguous integer type.
Sure, I can fix this either by explicitly declaring the array as [i32; 3] or by using an interim variable before cumulator which is also explicitly declared i32. But these seem unnecessary and a bit clunky.
So why is the compiler happy in the first case and not in the second?
Calling methods on objects is kind of funny, because it conveys zero information. That is, if I write
a + b
Then, even if Rust knows nothing about a and b, it can now assume that a is Add where the Rhs type is the type of b. We can refine the types and, hopefully, get more information down the road. Similarly, if I write foobar(), where foobar is a local variable, then Rust knows it has to be at least FnOnce.
However, if foo is a variable, and I write
foo.frobnicate()
Then Rust has no idea what to do with that. Is it an inherent impl on some type? Is it a trait function? It could be literally anything. If it's inherent, then it could even be in a module that we haven't imported, so we can't simply check everything that's in scope.
In your case, pow isn't even a trait function, it's actually several different functions. Even if it was a trait function, we couldn't say anything, because we don't, a priori, know which trait. So Rust sees next.pow(2) and bails out immediately, rather than trying to do something unexpected.
In your other case, Rust is able to infer the type. At the end of the function, all it knows about the type is that it's an {integer} on which Add is defined, and Rust has integer defaulting rules that kick in to turn that into i32, in the absence of any other information.
Could they have applied integer defaulting to next.pow(2)? Possibly, but I'm glad they didn't. Integers are already a special case in Rust (integers and floats are the only types with polymorphic literals), so minimizing the amount of special casing required by the compiler is, at least in my mind, a good thing. The defaulting rules kicked in in the first example because nothing caused it to bail out, and they would have in the second if it hadn't already encountered the bigger error condition of "calling an impl function on an unknown type".
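For reference, here is a minimal sketch of the two workarounds the question alludes to. The function names are hypothetical. Either a type annotation on the array or a suffix on one literal pins the element type before the method call, so pow can be resolved.

```rust
// Workaround 1: annotate the array type.
fn sum_of_squares_annotated() -> i32 {
    let items: [i32; 3] = [1, 2, 3];
    let mut cumulator = 0;
    for next in items.iter() {
        // `next` is now known to be &i32, so i32::pow is found via auto-deref.
        cumulator += next.pow(2);
    }
    cumulator
}

// Workaround 2: suffix one literal; the other elements must match it.
fn sum_of_squares_suffixed() -> i32 {
    let items = [1_i32, 2, 3];
    let mut cumulator = 0;
    for next in items.iter() {
        cumulator += next.pow(2);
    }
    cumulator
}

fn main() {
    println!("Final {}", sum_of_squares_annotated());
    println!("Final {}", sum_of_squares_suffixed());
}
```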

Why does Rust allow code with the wrong return type, but only with a trailing semicolon?

Consider the following Rust code:
fn f() -> i32 {
    loop {
        println!("Infinite loop!");
    }
    println!("Unreachable");
}
This compiles (with a warning) and runs, despite the fact that the return type is wrong.
It would seem that the compiler is OK with the return type of () in the last line because it detects that this code is unreachable.
However, if we remove the last semicolon:
fn f() -> i32 {
    loop {
        println!("Infinite loop!");
    }
    println!("Unreachable")
}
Then the code no longer compiles, giving a type error:
error[E0308]: mismatched types
--> src/main.rs:14:5
|
14 | println!("Unreachable")
| ^^^^^^^^^^^^^^^^^^^^^^^ expected `i32`, found `()`
|
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
Why is this? Isn't the return type the same, (), in both of these code snippets?
Note: I'm interested in understanding why the Rust compiler behaves differently on these two examples, i.e. how the Rust compiler is implemented. I did not mean to ask a philosophical question about how it "should" behave, from the perspective of language design (I understand that such a question would probably be off-topic).
The return type in the first code block is actually ! (called never) because you have a loop that never exits (so Rust gives you a warning saying the last statement is unreachable). The full type would be:
fn f() -> !
I suspect ! is more like the 'bottom' type in Rust than anything else. In the second case, your function likely errors out in an earlier stage during type checking because of the mismatch between i32 and () before the compiler gets to the 'unreachability' analysis, like it does in the first example.
edit: as suggested, here is the relevant part of the Rust book https://doc.rust-lang.org/book/ch19-04-advanced-types.html#the-never-type-that-never-returns
(Converting Sven's first comment into an answer)
The Rust compiler needs to infer a type for the function body. In the first case, there is no return expression, and apparently the compiler infers ! as the return type because of the infinite loop, which makes sense. In the second case, there's a return expression, so the type inference solver uses that to infer the type, which also makes sense.
I don't think this is specified in the language reference, nor do I think it matters in any way – just omit the unreachable statement and you'll be fine.
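To illustrate the never type mentioned in these answers: a loop with no break has type !, and ! coerces to any other type, so a function declared to return a value may end in such a loop without a trailing expression. This is only a sketch. first_multiple_of is a hypothetical helper and it assumes step > 0, otherwise it never returns.

```rust
// Returns the smallest positive multiple of `step` that is >= `at_least`.
// Assumes step > 0.
fn first_multiple_of(step: u32, at_least: u32) -> u32 {
    let mut n = step;
    // This loop contains no `break`, so the loop expression has type `!`.
    // `!` coerces to u32, which is why no trailing value is needed here.
    loop {
        if n >= at_least {
            return n;
        }
        n += step;
    }
}

fn main() {
    println!("{}", first_multiple_of(3, 10));
}
```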

Comparing value enclosed in RefCell<T>

I have a structure with a field defined as follows:
log_str: RefCell<String>
I performed various calls to borrow_mut() to call push_str(.) on the field. At the end, I'm assessing its value using:
assert_eq!(os.log_str.borrow(), "<expected value>");
Nonetheless, the line of the assert raises a compile-time error with the message:
error[E0369]: binary operation == cannot be applied to type std::cell::Ref<'_, std::string::String>
I understand why the error is happening, since the compiler even hints:
an implementation of std::cmp::PartialEq might be missing for std::cell::Ref<'_, std::string::String>
My question is: how should I compare the value enclosed in a RefCell<T> (typically in this case, comparing the enclosed string with an expected value).
Thanks !
You want to de-reference the borrowed value:
assert_eq!(*os.log_str.borrow(), "<expected value>");
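A self-contained sketch of that comparison, assuming a stand-in struct named Os (the real type from the question is not shown). Dereferencing the Ref gives a String, which can be compared with a &str directly. Calling as_str() on the borrow is an alternative.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the asker's structure.
struct Os {
    log_str: RefCell<String>,
}

fn build_log() -> Os {
    let os = Os { log_str: RefCell::new(String::new()) };
    // Each borrow_mut() guard is dropped at the end of its statement.
    os.log_str.borrow_mut().push_str("hello");
    os.log_str.borrow_mut().push_str(" world");
    os
}

fn main() {
    let os = build_log();
    // Dereference the Ref guard: String implements PartialEq<&str>.
    assert_eq!(*os.log_str.borrow(), "hello world");
    // Alternatively, compare &str with &str.
    assert_eq!(os.log_str.borrow().as_str(), "hello world");
}
```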

Where is a MutexGuard if I never assign it to a variable?

I don't understand "where" the MutexGuard in the inner block of code is. The mutex is locked and unwrapped, yielding a MutexGuard. Somehow this code manages to dereference that MutexGuard and then mutably borrow that object. Where did the MutexGuard go? Also, confusingly, this dereference cannot be replaced with deref_mut. Why?
use std::sync::Mutex;
fn main() {
    let x = Mutex::new(Vec::new());
    {
        let y: &mut Vec<_> = &mut *x.lock().unwrap();
        y.push(3);
        println!("{:?}, {:?}", x, y);
    }
    let z = &mut *x.lock().unwrap();
    println!("{:?}, {:?}", x, z);
}
Summary: because *x.lock().unwrap() performs an implicit borrow of the operand x.lock().unwrap(), the operand is treated as a place context. But since our actual operand is not a place expression, but a value expression, it gets assigned to an unnamed memory location (basically a hidden let binding)!
See below for a more detailed explanation.
Place expressions and value expressions
Before we dive in, first two important terms. Expressions in Rust are divided into two main categories: place expressions and value expressions.
Place expressions represent a value that has a home (a memory location). For example, if you have let x = 3; then x is a place expression. Historically this was called lvalue expression.
Value expressions represent a value that does not have a home (we can only use the value, there is no memory location associated with it). For example, if you have fn bar() -> i32 then bar() is a value expression. Literals like 3.14 or "hi" are value expressions too. Historically these were called rvalue expressions.
There is a good rule of thumb to check if something is a place or value expression: "does it make sense to write it on the left side of an assignment?". If it does (like my_variable = ...;) it is a place expression, if it doesn't (like 3 = ...;) it's a value expression.
There also exist place contexts and value contexts. These are basically the "slots" in which expressions can be placed. There are only a few place contexts, which (usually, see below) require a place expression:
Left side of a (compound) assignment expression (⟨place context⟩ = ...;, ⟨place context⟩ += ...;)
Operand of a borrow expression (&⟨place context⟩ and &mut ⟨place context⟩)
... plus a few more
Note that place expressions are strictly more "powerful". They can be used in a value context without a problem, because they also represent a value.
(relevant chapter in the reference)
Temporary lifetimes
Let's build a small dummy example to demonstrate a thing Rust does:
struct Foo(i32);
fn get_foo() -> Foo {
    Foo(0)
}
let x: &Foo = &get_foo();
This works!
We know that the expression get_foo() is a value expression. And we know that the operand of a borrow expression is a place context. So why does this compile? Didn't place contexts need place expressions?
Rust creates temporary let bindings! From the reference:
When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead [...].
So the above code is equivalent to:
let _compiler_generated = get_foo();
let x: &Foo = &_compiler_generated;
This is what makes your Mutex example work: the MutexGuard is assigned to a temporary unnamed memory location! That's where it lives. Let's see:
&mut *x.lock().unwrap();
The x.lock().unwrap() part is a value expression: it has the type MutexGuard and is returned by a function (unwrap()) just like get_foo() above. Then there is only one last question left: is the operand of the deref * operator a place context? I didn't mention it in the list of place contexts above...
Implicit borrows
The last piece in the puzzle are implicit borrows. From the reference:
Certain expressions will treat an expression as a place expression by implicitly borrowing it.
These include "the operand of the dereference operator (*)"! And all operands of any implicit borrow are place contexts!
So because *x.lock().unwrap() performs an implicit borrow, the operand x.lock().unwrap() is a place context, but since our actual operand is not a place, but a value expression, it gets assigned to an unnamed memory location!
Why doesn't this work for deref_mut()
There is an important detail of "temporary lifetimes". Let's look at the quote again:
When using a value expression in most place expression contexts, a temporary unnamed memory location is created initialized to that value and the expression evaluates to that location instead [...].
Depending on the situation, Rust chooses memory locations with different lifetimes! In the &get_foo() example above, the temporary unnamed memory location had a lifetime of the enclosing block. This is equivalent to the hidden let binding I showed above.
However, this "temporary unnamed memory location" is not always equivalent to a let binding! Let's take a look at this case:
fn takes_foo_ref(_: &Foo) {}
takes_foo_ref(&get_foo());
Here, the Foo value only lives for the duration of the takes_foo_ref call and not longer!
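One way to observe the two temporary lifetimes is to log when the temporary is dropped. The following is a sketch with made-up names (Noisy, make, takes_ref, drop_order), not anything from the standard library. The temporary behind the let-bound reference is dropped at the end of the block, while the one passed as an argument is dropped at the end of its statement.

```rust
use std::cell::RefCell;

// A value that records its own drop in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn make<'a>(name: &'static str, log: &'a RefCell<Vec<String>>) -> Noisy<'a> {
    Noisy { name, log }
}

fn takes_ref(_: &Noisy<'_>) {}

fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        // Temporary lives until the end of the enclosing block.
        let _extended: &Noisy<'_> = &make("extended", &log);
        // Temporary lives only until the end of this statement.
        takes_ref(&make("argument", &log));
        log.borrow_mut().push("end of block".to_string());
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```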
In general, if the reference to the temporary is used as an argument for a function call, the temporary lives only for that function call. This also includes the &self (and &mut self) parameter. So in get_foo().deref_mut(), the Foo object would also only live for the duration of deref_mut(). But since deref_mut() returns a reference to the Foo object, we would get a "does not live long enough" error.
That's of course also the case for x.lock().unwrap().deref_mut() -- that's why we get the error.
In the deref operator (*) case, the temporary lives for the enclosing block (equivalent to a let binding). I can only assume that this is a special case in the compiler: the compiler knows that a call to deref() or deref_mut() always returns a reference to the self receiver, so it wouldn't make sense to borrow the temporary for only the function call.
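If you do want to call deref_mut() explicitly, one way around the error is to give the guard a name so that it is no longer a temporary. A minimal sketch, where push_with_named_guard is a hypothetical helper:

```rust
use std::ops::DerefMut;
use std::sync::Mutex;

fn push_with_named_guard(x: &Mutex<Vec<i32>>, value: i32) -> usize {
    // The guard is bound to a variable, so it lives until the end of the function.
    let mut guard = x.lock().unwrap();
    // The reference returned by deref_mut() borrows from `guard`, which outlives it.
    let y: &mut Vec<i32> = guard.deref_mut();
    y.push(value);
    y.len()
}

fn main() {
    let x = Mutex::new(Vec::new());
    let len = push_with_named_guard(&x, 3);
    println!("{} {:?}", len, x.lock().unwrap());
}
```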
Here are my thoughts:
let y: &mut Vec<_> = &mut *x.lock().unwrap();
A couple of things going on under the surface for your current code:
Your .lock() yields a LockResult<MutexGuard<Vec>>
You called unwrap() on the LockResult and get a MutexGuard<Vec>
Because MutexGuard<T> implements the DerefMut trait, it can be dereferenced by the * operator, and re-borrowing the result with &mut yields a &mut Vec.
In Rust, I believe you don't usually call deref_mut yourself; rather, the compiler inserts the call for you.
If you want to get your MutexGuard, you should not dereference it:
let mut y = x.lock().unwrap();
(*y).push(3);
println!("{:?}, {:?}", x, y);
//Output: Mutex { data: <locked> }, MutexGuard { lock: Mutex { data: <locked> } }
From what I have seen online, people usually do make the MutexGuard explicit by saving it into a variable, and dereference it when it is being used, like my modified code above. I don't think there is an official pattern about this. Sometimes it will also save you from making a temporary variable.

How do I tell Rust that my Option's value actually does outlive the closure passed to and_then?

Here's a common pattern I find myself running into:
let maybe_vec = Some(vec!["val"]); // I have an option with something in it
maybe_vec.and_then(|vec| vec.get(0)); // then I want to transform the something
This gives me
src/lib.rs:317:34: 317:37 error: `vec` does not live long enough
src/lib.rs:317 maybe_vec.and_then(|vec| vec.get(0));
^~~
src/lib.rs:317:9: 317:45 note: reference must be valid for the method call at 317:8...
src/lib.rs:317 maybe_vec.and_then(|vec| vec.get(0));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib.rs:317:34: 317:44 note: ...but borrowed value is only valid for the block at 317:33
src/lib.rs:317 maybe_vec.and_then(|vec| vec.get(0));
^~~~~~~~~~
To me this error seems overly pedantic - vec might not live long enough, but in this particular case vec is the thing inside maybe_vec, which clearly is going to live long enough. Do I need to provide some sort of lifetime annotations here, or am I just going about this wrong?
Unfortunately for you, the compiler is correct here. and_then consumes the Option and the value inside the option along with it. The value is provided to the closure. You can see this by using the trick of assigning a variable to the unit-type (()):
let a = Some(vec![1]);
a.and_then(|z| { let () = z; });
// error: expected `collections::vec::Vec<_>`,
When you call get, it returns a reference to an item in the slice, but the Vec now only lives inside the closure. Once the closure exits, it's gone! Instead, you can change the Option into a reference, then get the value that way. This leaves the original Vec in place:
let a = Some(vec![1]);
a.as_ref().and_then(|z| z.get(0));
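Wrapped in a function, the as_ref() approach might look like the sketch below. The function name first is made up here. Because the Option is only borrowed, the returned reference can point into the Vec that still lives in the caller.

```rust
// Borrow the Option, turn it into Option<&Vec<i32>>, then look up the first element.
fn first(maybe_vec: &Option<Vec<i32>>) -> Option<&i32> {
    maybe_vec.as_ref().and_then(|v| v.get(0))
}

fn main() {
    let a = Some(vec![1, 2, 3]);
    println!("{:?}", first(&a));
    // `a` was not consumed, so it is still usable here.
    println!("{:?}", a);
}
```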
