Consider the following Rust program:
fn main()
{
    let mut x = 1;
    let mut r = &x;
    r;
    let y = 1;
    r = &y;
    x = 2;
    r;
}
It compiles without any errors and I agree with that behaviour.
The problem is that I am not able to reach the same conclusion when trying to reason about this formally:
1. The type of the variable r is &'a i32 for some lifetime 'a.
2. The type of &x is &'b i32 for some lifetime 'b.
3. The lifetime 'a includes x = 2;.
4. From let mut r = &x; we know that 'b: 'a.
5. Because of 3 and 4 we know that the lifetime 'b includes x = 2;.
6. Because of 2 and 5 we are doing x = 2; while borrowing x, so the program should be invalid.
What is wrong with the formal reasoning above, and how would the correct reasoning be?
The lifetime 'a includes x = 2;.
Because of 3 and 4 we know that the lifetime 'b includes x = 2;.
They don't. r is reassigned on line 7, which ends the borrow of x, because r thereafter holds a completely new value independent of the old one. And yes, rustc is smart enough to work at that granularity: if you remove the final r; it will warn that
value assigned to r is never read
at line 7 (whereas you would not get that warning with Go, for instance, because its compiler doesn't work at that fine a granularity).
Rustc can thus infer that the smallest necessary length for 'b stops somewhere between the end of line 5 and the start of line 7.
Since x doesn't need to be updated before line 8, there is no conflict.
If you remove the assignment, however, 'b now has to extend to the last line of the enclosing function, triggering a conflict.
Your reasoning seems to be that of lexical lifetimes rather than NLL. You may want to go through RFC 2094; it is very detailed. In essence, it works in terms of liveness constraints on values and solving those constraints. In fact, the RFC introduces liveness with an example that is a somewhat more complicated version of your situation:
let mut foo: T = ...;
let mut bar: T = ...;
let mut p: &'p T = &foo;
// `p` is live here: its value may be used on the next line.
if condition {
    // `p` is live here: its value will be used on the next line.
    print(*p);
    // `p` is DEAD here: its value will not be used.
    p = &bar;
    // `p` is live here: its value will be used later.
}
// `p` is live here: its value may be used on the next line.
print(*p);
// `p` is DEAD here: its value will not be used.
Also note this quote which very much applies to your misunderstanding:
The key point is that p becomes dead (not live) in the span before it is reassigned. This is true even though the variable p will be used again, because the value that is in p will not be used.
So you really need to reason about values, not variables. Thinking in SSA form probably helps there.
Applying this to your version:
let mut x = 1;
let mut r = &x;
// `r` is live here: its value will be used on the next line
r;
// `r` is DEAD here: its value will never be used
let y = 1;
r = &y;
// `r` is live here: its value will be used later
x = 2;
r;
// `r` is DEAD here: the scope ends
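One way to make the "values, not variables" view concrete is to give each value assigned to r its own name, SSA-style. The sketch below is a hypothetical rewrite of the program above (the names r1, r2 and ssa_view are made up for illustration, and the results are returned only so they can be inspected):

```rust
// SSA-style rewrite: each value that `r` held gets its own name.
fn ssa_view() -> (i32, i32, i32) {
    let mut x = 1;
    let r1 = &x; // first value of `r`: borrows `x`
    let seen_first = *r1; // last use of `r1`; the borrow of `x` can end here
    let y = 1;
    let r2 = &y; // second value of `r`: borrows `y`, unrelated to `x`
    x = 2; // fine: no live reference to `x` remains
    let seen_second = *r2; // only `r2` (a borrow of `y`) is still live
    (seen_first, x, seen_second)
}

fn main() {
    println!("{:?}", ssa_view()); // prints (1, 2, 1)
}
```

Written this way, the assignment x = 2 sits after the last use of r1 and only overlaps with r2, which never referred to x.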
Life before NLL
Before discussing non-lexical lifetimes (NLL), let's first discuss "ordinary" lifetimes. In older versions of Rust, before NLL was introduced, the code below would not compile because r is still in scope when x is mutated on line 3.
let mut x = 1;
let mut r = &x;
x = 2; // Compile error
To fix this, we need to explicitly make r go out of scope before x is mutated:
let mut x = 1;
{
    let mut r = &x;
}
x = 2;
At this point you might think: if r is not used anymore after the line x = 2, the first snippet should be safe. Can the compiler be smarter, so that we don't need to explicitly make r go out of scope as we did in the second snippet?
The answer is yes, and that's when NLL comes in.
Life after NLL
After NLL was introduced in Rust, life became easier. The code below will compile:
let mut x = 1;
let mut r = &x;
x = 2; // Compiles under NLL
But remember, it will compile as long as r is not used after the mutation of x. For example, this won't compile even under NLL:
let mut x = 1;
let mut r = &x;
x = 2; // Compile error: cannot assign to `x` because it is borrowed
r; // borrow later used here
Although the rules of NLL described in RFC 2094 are quite complex, they can be summarized roughly and approximately (in most cases) as:
A program is valid as long as every owned value is not mutated between the assignment of a variable referring to it and the usage of that variable.
The code below is valid because x is mutated before the assignment of r and before the usage of r:
let mut x = 1;
x = 2; // x is mutated
let mut r = &x; // r is assigned here
r; // r is used here
The code below is valid because x is mutated after the assignment of r and after the usage of r:
let mut x = 1;
let mut r = &x; // r is assigned here
r; // r is used here
x = 2; // x is mutated
The code below is NOT valid because x is mutated after the assignment of r and before the usage of r:
let mut x = 1;
let mut r = &x; // r is assigned here
x = 2; // x is mutated
r; // r is used here -> compile error
As for your specific program, it is valid because when x is mutated (x = 2), there is no variable referring to x anymore; r now refers to y because of the previous line (r = &y). Therefore, the rule is still adhered to.
let mut x = 1;
let mut r = &x;
r;
let y = 1;
r = &y;
// This mutation of x is seemingly sandwiched between
// the assignment of r above and the usage of r below,
// but it's okay as r is now referring to y and not x
x = 2;
r;
Related
The following code deadlocks because the mutex is not unlocked after the last use of v:
use std::sync::{Arc, Mutex};
fn main() {
    let a = Arc::new(Mutex::new(3));
    let mut v = a.lock().unwrap();
    *v += 1;
    println!("v is {v}");
    // drop(v);
    let b = Arc::clone(&a);
    std::thread::spawn(move || {
        let mut w = b.lock().unwrap();
        *w += 1;
        println!("w is {w}");
    }).join().unwrap();
}
The fix is to uncomment the explicit drop(v). Why does the compiler not automatically drop v after its last use?
In contrast, the Rust compiler knows to correctly drop v early in the following case:
fn main() {
    let mut a = 3;
    let v = &mut a;
    *v += 1;
    println!("v is {v}");
    let w = &mut a;
    *w += 1;
    println!("w is {w}");
}
This behavior seems natural, I would expect the compiler to do the same above.
Values are dropped when their scope ends, not after their last use. What may be confusing you is that the borrow checker knows references are inconsequential after their last use, and thus considers their lifetimes differently for the purposes of enforcing Rust's referential guarantees.
Technically v in the second example is not dropped until the end of the scope either, but there is no drop logic for references. See What are non-lexical lifetimes?
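Since a value is dropped when its scope ends, an alternative to calling drop(v) explicitly is to confine the guard to an inner block. The following is a minimal sketch of that idea; the function name increment_in_two_steps is made up for this illustration, and the final value is returned only so it can be checked:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn increment_in_two_steps() -> i32 {
    let a = Arc::new(Mutex::new(3));
    {
        let mut v = a.lock().unwrap();
        *v += 1;
    } // `v` goes out of scope here, so the guard is dropped and the mutex unlocked

    let b = Arc::clone(&a);
    thread::spawn(move || {
        // This no longer deadlocks: the main thread holds no guard.
        let mut w = b.lock().unwrap();
        *w += 1;
    })
    .join()
    .unwrap();

    // Bind the result first so the temporary guard is released before returning.
    let result = *a.lock().unwrap();
    result
}

fn main() {
    println!("final value is {}", increment_in_two_steps()); // prints "final value is 5"
}
```

The block makes the guard's scope match its last use, which is what the borrow checker does implicitly for plain references but never does for values with drop logic.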
How can I know the type of a binding if I use automatic type deduction when creating it? What if the expression on the right side is a borrow (like let x = &5;): will the binding be a value or a borrow? What happens if I re-assign a borrow or a value?
Just to check: I can re-assign a borrow if I use let mut x: &mut T = &mut T{}; or let mut x: &T = &T{};, right?
I sense some confusion between binding and assigning:
Binding introduces a new variable, and associates it to a value,
Assigning overwrites a value with another.
This can be illustrated in two simple lines:
let mut x = 5; // Binding
x = 10; // Assigning
A binding may appear in multiple places in Rust:
let statements,
if let/while let conditions,
cases in a match expression,
and even in a for expression, on the left side of in.
Whenever there is a binding, Rust's grammar also allows pattern matching:
in the case of let statements and for expressions, the patterns must be irrefutable,
in the case of if let, while let and match cases, the patterns may fail to match.
Pattern matching means that the type of the variable introduced by the binding differs based on how the binding is made:
let x = &5; // x: &i32
let &y = &5; // y: i32
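To check the inferred types, one option is to add explicit annotations and let the compiler confirm them. The sketch below does that for the two bindings above and adds a tuple pattern; the function name binding_types is hypothetical:

```rust
fn binding_types() -> (i32, i32, i32) {
    let x = &5;
    let _: &i32 = x; // compiles only because `x` is a reference

    let &y = &5; // the `&` in the pattern strips one level of reference
    let _: i32 = y; // compiles only because `y` is a plain i32

    let (a, b) = (1, 2); // tuple pattern: irrefutable, so allowed in `let`
    (*x, y, a + b)
}

fn main() {
    println!("{:?}", binding_types()); // prints (5, 5, 3)
}
```

If either annotation were wrong, for example let _: i32 = x;, the program would be rejected with a mismatched types error.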
Assigning always requires using =, the assignment operator.
When assigning, the former value is overwritten, and drop is called on it if it implements Drop.
let mut x = 5;
x = 6;
// Now x == 6, drop was not called because it's a i32.
let mut s = String::from("Hello, World!");
s = String::from("Hello, 神秘德里克!");
// Now s == "Hello, 神秘德里克!", drop was called because it's a String.
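The drop on assignment can be observed with a type that records when it is dropped. This is only a sketch: Noisy, its fields, and drops_on_assignment are names invented for the illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A value that appends its name to a shared log when it is dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drops_on_assignment() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut n = Noisy { name: "first", log: Rc::clone(&log) };
    assert_eq!(n.name, "first");

    // Assigning overwrites the old value, which is dropped right here.
    n = Noisy { name: "second", log: Rc::clone(&log) };

    // Snapshot the log while the second value is still alive.
    let snapshot = log.borrow().clone();
    assert_eq!(n.name, "second");
    snapshot
}

fn main() {
    println!("{:?}", drops_on_assignment()); // prints ["first"]
}
```

Only "first" appears in the snapshot: the second value is dropped later, when n goes out of scope at the end of the function.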
The value that is overwritten may be as simple as an integer or float, a more involved struct or enum, or a reference.
let mut r = &5;
r = &6;
// Now r points to 6, drop was not called as it's a reference.
Overwriting a reference does not overwrite the value pointed to by the reference, but the reference itself. The original value still lives on, and will be dropped when it's ready.
To overwrite the pointed to value, one needs to use *, the dereference operator:
let mut x = 5;
let r = &mut x;
*r = 6;
// r still points to x, and now x = 6.
If the type of the dereferenced value requires it, drop will be called:
let mut s = String::from("Hello, World!");
let r = &mut s;
*r = String::from("Hello, 神秘德里克!");
// r still points to s, and now s = "Hello, 神秘德里克!".
I invite you to use the playground and toy around; you can start from here:
fn main() {
    let mut s = String::from("Hello, World!");
    {
        let r = &mut s;
        *r = String::from("Hello, 神秘德里克!");
    }
    println!("{}", s);
}
Hopefully, things should be a little clearer now, so let's check your samples.
let x = &5;
x is a reference to i32 (&i32). What happens is that the compiler will introduce a temporary in which 5 is stored, and then borrow this temporary.
let mut x: &mut T = T{};
This is impossible. The type of T{} is T, not &mut T, so this fails to compile. You could change it to let mut x: &mut T = &mut T{};.
And your last example is similar.
I have the following Rust program and I expect it to result in a compilation error since x is reassigned later. But it compiles and gives output. Why?
fn main() {
    let (x, y) = (1, 3);
    println!("X is {} and Y is {}", x, y);
    let x: i32 = 565;
    println!("Now X is {}", x);
}
Rust actually lets you shadow other variables in a block, so let x: i32 = 565; is defining a new variable x that shadows the x defined earlier with let (x,y) = (1,3);. Note that you could even have redefined x to have a different type since the second x is a whole new variable!
fn main() {
    let x = 1;
    println!("Now X is {}", x);
    let x = "hi";
    println!("Now X is {}", x);
}
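Shadowing is also scoped: a shadow introduced inside a block disappears when the block ends, and the outer variable becomes visible again. A small sketch (the function name shadow_demo and the variable names are made up for the illustration):

```rust
fn shadow_demo() -> (i32, usize, i32, i32) {
    let x = 1;
    let first = x; // copy of the original `x`

    let x = "hi"; // a new variable of a different type; the old `x` is merely hidden
    let len = x.len();

    let y = 10;
    let inner = {
        let y = y * 2; // shadows `y` only inside this block
        y
    };
    // Out here, `y` refers to the outer variable again.
    (first, len, inner, y)
}

fn main() {
    println!("{:?}", shadow_demo()); // prints (1, 2, 20, 10)
}
```

None of these let statements mutate anything, which is why no mut is needed and why the compiler raises no "reassignment" error.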
This reddit thread goes into more detail about why this is useful. The two things that are mentioned that seem interesting are:
For operations which take ownership of the variable, but return another variable of the same type, it sometimes "looks nice" to redefine the returned variable to have the same name. From here:
let iter = vec.into_iter();
let iter = modify(iter);
let iter = double(iter);
Or to make a variable immutable:
let mut x;
// Code where `x` is mutable
let x = x;
// Code where `x` is immutable
I want to enter a loop with a variable n which is borrowed by the function. At each step, n takes a new value; when exiting the loop, the job is done, with the help of other variables, and n will never be used again.
If I don't use references, I have something like this:
fn test(n: Thing) -> usize {
    // stuff
    let mut n = n;
    for i in 1..10 {
        let (q, m) = n.do_something(...);
        n = m;
        // stuff with x
    }
    x
}
x is the result of some computation with q and m, but it is a usize and I didn't encounter any issue in this part of the code. I didn't test this code, but this is the idea. I could make code written like this work.
Since I want to do it with a reference; I tried to write:
fn test(n: &Thing) -> usize {
    // stuff
    let mut n = n;
    for i in 1..10 {
        let (q, m) = (*n).do_something(...);
        n = &m;
        // stuff with x
    }
    x
}
Now the code will not compile because m has a shorter lifetime than n. I tried to make it work by doing some tricky things or by cloning things, but this can't be the right way. In C, the code would work because we don't care about what n is pointing to when exiting the loop since n isn't used after the loop. I perfectly understand that this is where Rust and C differ, but I am pretty sure a clean way of doing it in Rust exists.
Consider my question as very general; I am not asking for some ad-hoc solution for a specific problem.
As Chris Emerson points out, what you are doing is unsafe and it is probably not appropriate to write code like that in C either. The variable you are taking a reference to goes out of scope at the end of each loop iteration, and thus you would have a dangling pointer at the beginning of the next iteration. This would lead to all of the memory errors that Rust attempts to prevent; Rust has prevented you from doing something bad that you thought was safe.
If you want something that can be either borrowed or owned, that's a Cow:
use std::borrow::Cow;

#[derive(Clone)]
struct Thing;

impl Thing {
    fn do_something(&self) -> (usize, Thing) {
        (1, Thing)
    }
}

fn test(n: &Thing) -> usize {
    let mut n = Cow::Borrowed(n);
    let mut x = 0;
    for _ in 1..10 {
        let (q, m) = n.do_something();
        n = Cow::Owned(m);
        x = x + q;
    }
    x
}

fn main() {
    println!("{}", test(&Thing));
}
If I understand this right, the problem is not related to life outside of the loop; m doesn't live long enough to keep a reference for the next iteration.
let mut n = n;
for i in 1..10 {
    let (q, m) = (*n).do_something(...);
    n = &m;
} // At this point m is no longer live, i.e. it doesn't live until the next iteration.
Again it depends on the specific types/lifetimes, but you could potentially assign m to a variable with a longer lifetime, but then you're back to the first example.
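For illustration, here is one way the "variable with a longer lifetime" idea can be made to compile: keep the owned value in an Option declared outside the loop and re-derive the reference at the start of each iteration. This is only a sketch under assumptions; the Thing type with a counter field and the body of do_something are invented for the example, not taken from the question.

```rust
// Hypothetical stand-in for the question's type.
struct Thing(usize);

impl Thing {
    // Returns some number plus a new Thing, like the question's method.
    fn do_something(&self) -> (usize, Thing) {
        (self.0, Thing(self.0 + 1))
    }
}

fn test(n: &Thing) -> usize {
    // Owned storage that outlives every iteration of the loop.
    let mut owned: Option<Thing> = None;
    let mut x = 0;
    for _ in 1..10 {
        // Use the value from the previous iteration, or the argument the first time.
        let current: &Thing = owned.as_ref().unwrap_or(n);
        let (q, m) = current.do_something();
        // `current` is not used past this point, so `owned` may be overwritten.
        owned = Some(m);
        x += q;
    }
    x
}

fn main() {
    println!("{}", test(&Thing(1))); // prints 45
}
```

This is essentially what the Cow version does, written by hand: the loop owns every intermediate value, and only the initial one is borrowed.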
What happens if I borrow a dereferenced pointer?
let a = some object;
let b = &a;
let c = &*b;
What object has c borrowed? Does the dereferencing op create a temporary object like a function's return value? How does it obey the borrowing rules?
I'm also confused about Box's mutability semantics.
let mut a = Box::new(8usize);
*a = 1;
This code works just fine without something like Box::new_mut(). But
let mut a = &8usize;
*a = 1;
An error occurs.
Quick answer: It depends.
Long answer: keep reading...
What happens if I borrow a dereferenced pointer?
If you check out the LLVM IR that Rust generates, you can see everything in fine detail:
let a = 8usize;
let b = &a;
let c = &*b;
gets expanded to
let a;
// %a = alloca i64
let b;
// %b = alloca i64*
let c;
// %c = alloca i64*
a = 8usize;
// store i64 8, i64* %a
b = &a;
// store i64* %a, i64** %b
let tmp = *b;
// %0 = load i64** %b
c = tmp;
// store i64* %0, i64** %c
Now, LLVM can easily optimize this stuff out. It gets more complicated once you implement traits like Deref on your own types. Then a function call is obviously involved, but it will most likely be optimized out again, since you shouldn't be doing complicated stuff in the deref function.
What object has c borrowed?
c borrows a
Does the dereferencing op create a temporary object like a function's return value?
Nope, see above
How does it obey the borrowing rules.
*b behaves as if it were a. If you take a reference to it you get a reference to a.
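A quick way to convince yourself of this is to compare addresses. The sketch below uses std::ptr::eq for that; the function name reborrow_targets_original is made up for the illustration.

```rust
fn reborrow_targets_original() -> bool {
    let a = 8usize;
    let b = &a;
    let c = &*b; // reborrow through `b`

    // All three references point at the same memory: no temporary was created.
    std::ptr::eq(b, c) && std::ptr::eq(c, &a)
}

fn main() {
    println!("{}", reborrow_targets_original()); // prints true
}
```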
To answer your second issue:
A Box owns the object it points to. Since you declared your Box to be mutable, you can take either a mutable reference or any number of non-mutable references to your object. This means that when you dereference the Box, Rust decides, depending on the situation, whether to automatically create a mutable borrow. In the case of the assignment, your dereferenced Box is on the left side of the assignment operator, therefore Rust tries to get a mutable reference to the object.
If you give your variables types, this becomes more obvious:
let mut x = Box::new(8usize);
{
    let y: &usize = &*x;
}
{
    let y: &mut usize = &mut *x;
    *x = 99; // cannot assign to `*x` because it is borrowed
    *y += 1; // the mutable borrow `y` is still in use here
}
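To contrast the two cases from the question in code that compiles: assigning through a mutable Box works directly, while assigning through a reference requires that reference to be a &mut. This is a minimal sketch, and the function name assign_through_box_and_ref is hypothetical.

```rust
fn assign_through_box_and_ref() -> (usize, usize) {
    // The Box owns its contents, so a `mut` binding is enough to assign through it.
    let mut boxed = Box::new(8usize);
    *boxed = 1;

    // A shared reference (`&value`) would reject the assignment;
    // a mutable reference to a mutable variable allows it.
    let mut value = 8usize;
    let r = &mut value;
    *r = 2;

    (*boxed, value)
}

fn main() {
    println!("{:?}", assign_through_box_and_ref()); // prints (1, 2)
}
```

In let mut a = &8usize; the mut only allows a to be pointed at something else; it does not make the referenced value mutable.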