I've read What are non-lexical lifetimes?. With the non-lexical borrow checker, the following code compiles:
fn main() {
let mut scores = vec![1, 2, 3];
let score = &scores[0]; // borrows `scores`, but never used
// its lifetime can end here
scores.push(4); // borrows `scores` mutably, and succeeds
}
It seems reasonable in the case above, but when it comes to a mutex lock, we don't want it to be released prematurely.
In the following code, I would like to lock a shared structure first and then execute a closure, mainly to avoid deadlock. However, I'm not sure if the lock will be released prematurely.
use lazy_static::lazy_static; // 1.3.0
use std::sync::Mutex;
struct Something;
lazy_static! {
static ref SHARED: Mutex<Something> = Mutex::new(Something);
}
pub fn lock_and_execute(f: Box<Fn()>) {
let _locked = SHARED.lock(); // `_locked` is never used.
// does its lifetime end here?
f();
}
Does Rust treat locks specially, so that their lifetimes are guaranteed to extend to the end of their scope? Must we use that variable explicitly to avoid premature dropping of the lock, like in the following code?
pub fn lock_and_execute(f: Box<Fn()>) {
let locked = SHARED.lock(); // - lifetime begins
f(); // |
drop(locked); // - lifetime ends
}
There is a misunderstanding here: NLL (non-lexical lifetimes) affects the borrow-checks, not the actual lifetime of the objects.
Rust uses RAII1 extensively, and thus the Drop implementation of a number of objects, such as locks, has side-effects which have to occur at a well-determined and predictable point in the flow of execution.
NLL did NOT change the lifetime of such objects, and therefore their destructor is executed at exactly the same point that it was before: at the end of their lexical scope, in reverse order of creation.
NLL did change the understanding of the compiler of the use of lifetimes for the purpose of borrow-checking. This does not, actually, cause any code change; this is purely analysis. This analysis was made more clever, to better recognize the actual scope in which a reference is used:
Prior to NLL, a reference was considered "in use" from the moment it was created to the moment it was dropped, generally its lexical scope (hence the name).
NLL, instead:
Tries to defer the start of the "in use" span, if possible.
Ends the "in use" span with the last use of the reference.
In the case of a Ref<'a> (from RefCell), the Ref<'a> will be dropped at the end of the lexical scope, at which point it will use the reference to RefCell to decrement the counter.
NLL does not peel away layers of abstractions, so must consider that any object containing a reference (such as Ref<'a>) may access said reference in its Drop implementation. As a result, any object that contains a reference, such as a lock, will force NLL to consider that the "in use" span of the reference extends until they are dropped.
1 Resource Acquisition Is Initialization, whose original meaning is that once a variable constructor has been executed it has acquired the resources it needed and is not in a half-baked state, and which is generally used to mean that the destruction of said variable will release any resources it owned.
Does Rust treat locks specially, so that their lifetimes are guaranteed to extend to the end of their scope?
No. This is the default for every type, and has nothing to do with the borrow checker.
Must we use that variable explicitly to avoid premature dropping of the lock
No.
All you need to do is ensure that the lock guard is bound to a variable. Your example does this (let _lock = ...), so the lock will be dropped at the end of scope. If you had used the _ pattern instead, the lock would have been dropped immediately:
You can prove this for yourself by testing if the lock has indeed been dropped:
pub fn lock_and_execute() {
let shared = Mutex::new(Something);
println!("A");
let _locked = shared.lock().unwrap();
// If `_locked` was dropped, then we can re-lock it:
println!("B");
shared.lock().unwrap();
println!("C");
}
fn main() {
lock_and_execute();
}
This code will deadlock, as the same thread attempts to acquire the lock twice.
You could also attempt to use a method that requires &mut self to see that the immutable borrow is still held by the guard, which has not been dropped:
pub fn lock_and_execute() {
let mut shared = Mutex::new(Something);
println!("A");
let _locked = shared.lock().unwrap();
// If `_locked` was dropped, then we can re-lock it:
println!("B");
shared.get_mut().unwrap();
println!("C");
}
error[E0502]: cannot borrow `shared` as mutable because it is also borrowed as immutable
--> src/main.rs:13:5
|
9 | let _locked = shared.lock().unwrap();
| ------ immutable borrow occurs here
...
13 | shared.get_mut().unwrap();
| ^^^^^^^^^^^^^^^^ mutable borrow occurs here
...
16 | }
| - immutable borrow might be used here, when `_locked` is dropped and runs the `Drop` code for type `std::sync::MutexGuard`
See also:
Where is a MutexGuard if I never assign it to a variable?
How to lock a Rust struct the way a struct is locked in Go?
Why does _ destroy at the end of statement?
Related
I'm having difficulty understanding why Rust's borrow checker does not allow multiple mutable borrows when it is safe to do so.
Let's give an example:
fn borrow_mut(s : &mut String) {
s.push_str(" world!");
println!("{}", s);
}
fn main() {
let mut s = String::from("hello");
let rs : &mut String = &mut s;
// second mutable borrow
borrow_mut(&mut s);
println!("{rs}");
}
This code fails to compile with the following message:
error[E0499]: cannot borrow `s` as mutable more than once at a time
--> main.rs:11:16
|
8 | let rs : &mut String = &mut s;
| ------ first mutable borrow occurs here
...
11 | borrow_mut(&mut s);
| ^^^^^^ second mutable borrow occurs here
12 |
13 | println!("{rs}");
| -- first borrow later used here
rs points on the variable of type String in the stack frame. String contains pointer on memory in the heap. So even if the string reallocates its data in borrow_mut(), both pointers are still valid, so this code should be safe.
Could someone explain the reason of why the borrow checker prevents multiple mutable borrows even when it's safe?
It's for thread safety, to avoid data races. If two such mutable borrowings can exist, then two threads of execution can both attempt to modify the original data. If they do, all sorts of nasty race conditions can arise, e.g. if both threads try to append to the string:
The underlying array holding the data can get reallocated twice, with one of them leaked
The appended data could end up writing out of bounds due to time-of-check/time-of-use issues
You could end up with inconsistent definitions of the length and capacity
On some architectures and data sizes, tearing could mean a single logical value is read half as the old version and half as the updated value (producing something that could easily be unrelated to either the old or new value)
etc.
Borrows as a language feature mean that the function can temporarily hand off its unique mutable-ownership to some other function; while that other function holds the borrow, the original object can't be accessed through anything but that mutable borrow. It also means that for non-mutable borrows, it can prevent mutable borrows that might causes races between reads through the non-mutable borrow and writes through the mutable borrow. The borrow checker is preventing you from launching a thread that modifies s, then calling borrow_mut from the main thread, and the two threads producing garbage or crashing the program when they modify s simultaneously.
To be clear, with an advanced borrow-checker in some future version of Rust, this code could be made to work (the code you wrote does nothing inherently unsafe). But fully analyzing deep code paths to ensure nothing evil could possibly occur is hard, and it's relatively easy to impose stricter rules (which might be loosened in the future if they're sure it won't impose restrictions on the language design that bite them later). Your code would work just fine if you passed the single mutable borrow you already had into borrow_mut after all; your code is not made worse by doing things The Rust Way™.
So I'm pursuing my Rust adventures (loving it) and I'm exploring threads. As usual I stumbled upon an error that I do not understand.
Here is a minimal example:
use std::thread;
pub fn compute_something(input: &Vec<&usize>) -> usize {
input.iter().map(|v| *v).sum()
}
pub fn main() {
let items = vec![0, 1, 2, 3, 4, 5];
let mut slice: Vec<&usize> = Vec::new();
slice.push(&items[1]); // borrowed value does not live long enough
// argument requires that `items` is borrowed for `'static`
slice.push(&items[2]); // borrowed value does not live long enough
// argument requires that `items` is borrowed for `'static`
assert_eq!(3, compute_something(&slice));
let h = thread::spawn(move || compute_something(&slice));
match h.join() {
Ok(result) => println!("Result: {:?}", result),
Err(e) => println!("Nope: {:?}", e)
}
} // `items` dropped here while still borrowed
I have of course made a playground to illustrate.
If I drop the thread part (everything after the assert_eq! line) and just call compute_something(&slice) it compiles fine.
There are three main things I don't understand here:
Why is it a problem to drop items while borrowed at the end of the program, shouldn't the runtime clean-up the memory just fine? It's not like I'm gonna be able to access slice outside of main.
What is still borrowing items at the end of the program? slice? If so, why does that same program compile by just removing everything after the assert_eq! line? I can't see how it changes the borrowing pattern.
Why is calling compute_something from inside the thread's closure creating the issue and how do I solve it?
You move slice into the closure that you pass to thread::spawn(). Since the closure passed to thread::spawn() must be 'static, this implies that the vector being moved into the closure must not borrow anything that isn't 'static either. The compiler therefore deduces the type of slice to be Vec<&'static usize>.
But it does borrow something that's not 'static -- the values that you try to push into it borrow from a variable local to main(), and so the compiler complains about this.
The simplest way to fix this case is to have slice be a Vec<usize> and not borrow from items at all.
Another option is to use scoped threads from the crossbeam crate, which know how to borrow from local variables safely by enforcing that all threads are joined before a scope ends.
To directly answer the questions you posed:
Why is it a problem to drop items while borrowed at the end of the program, shouldn't the runtime clean-up the memory just fine?
When main() terminates, all threads are also terminated -- however, there is a brief window of time during which the values local to main() have been destroyed but before the threads are terminated. There can exist dangling references during this window, and that violates Rust's memory safety model. This is why thread::spawn() requires a 'static closure.
Even though you join the thread yourself, the borrow checker doesn't know that joining the thread ends the borrow. (This is the problem that crossbeam's scoped threads solve.)
What is still borrowing items at the end of the program?
The vector that was moved into the closure is still borrowing items.
Why is calling compute_something from inside the thread's closure creating the issue and how do I solve it?
Calling this function isn't creating the issue. Moving slice into the closure is creating the issue.
Here is the way I solved this issue.
I used Box::Leak: https://doc.rust-lang.org/std/boxed/struct.Box.html#method.leak
let boxed_data = data.into_boxed_slice();
let boxed_data_static_ref = Box::leak(boxed_data);
let compressed_data = &boxed_data_static_ref[start_of_data..start_of_data+zfr.compressed_size as usize];
let handles = (0..NUM_THREADS).map(|thread_nr| {
thread::spawn(move || {
main2(thread_nr, compressed_data, zfr.crc);
})
}).collect::<Vec<_>>();
for h in handles {
h.join().unwrap();
}
In How does Rust prevent data races when the owner of a value can read it while another thread changes it?, I understand I need &mut self, when we want to mutate an object, even when the method is called with an owned value.
But how about primitive values, like i32? I ran this code:
fn change_aaa(bbb: &mut i32) {
*bbb = 3;
}
fn main() {
let mut aaa: i32 = 1;
change_aaa(&mut aaa); // somehow run this asynchronously
aaa = 2; // ... and will have data race here
}
My questions are:
Is this safe in a non concurrent situation?
According to The Rust Programming Language, if we think of the owned value as a pointer, it is not safe according the following rules, however, it compiles.
Two or more pointers access the same data at the same time.
At least one of the pointers is being used to write to the data.
There’s no mechanism being used to synchronize access to the data.
Is this safe in a concurrent situation?
I tried, but I find it hard to put change_aaa(&mut aaa) into a thread, according to Why can't std::thread::spawn accept arguments in Rust? and How does Rust prevent data races when the owner of a value can read it while another thread changes it?. However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
The signature of change_aaa doesn't allow it to move the reference into another thread. For example, you might imagine a change_aaa() implemented like this:
fn change_aaa(bbb: &mut i32) {
std::thread::spawn(move || {
std::thread::sleep(std::time::Duration::from_secs(1));
*bbb = 100; // ha ha ha - kaboom!
});
}
But the above doesn't compile. This is because, after desugaring the lifetime elision, the full signature of change_aaa() is:
fn change_aaa<'a>(bbb: &'a mut i32)
The lifetime annotation means that change_aaa must support references of any lifetime 'a chosen by the caller, even a very short one, such as one that invalidates the reference as soon as change_aaa() returns. And this is exactly how change_aaa() is called from main(), which can be desugared to:
let mut aaa: i32 = 1;
{
let aaa_ref = &mut aaa;
change_aaa(aaa_ref);
// aaa_ref goes out of scope here, and we're free to mutate
// aaa as we please
}
aaa = 2; // ... and will have data race here
So the lifetime of the reference is short, and ends just before the assignment to aaa. On the other hand, thread::spawn() requires a function bound with 'static lifetime. That means that the closure passed to thread::spawn() must either only contain owned data, or references to 'static data (data guaranteed to last until the end of the program). Since change_aaa() accepts bbb with with lifetime shorter than 'static, it cannot pass bbb to thread::spawn().
To get a grip on this you can try to come up with imaginative ways to write change_aaa() so that it writes to *bbb in a thread. If you succeed in doing so, you will have found a bug in rustc. In other words:
However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
It is designed to be impossible to do this, except through types that are explicitly designed to make it safe (e.g. Arc to prolong the lifetime, and Mutex to make writes data-race-safe).
Is this safe in a non concurrent situation? According to this post, if we think owned value if self as a pointer, it is not safe according the following rules, however, it compiles.
Two or more pointers access the same data at the same time.
At least one of the pointers is being used to write to the data.
There’s no mechanism being used to synchronize access to the data.
It is safe according to those rules: there is one pointer accessing data at line 2 (the pointer passed to change_aaa), then that pointer is deleted and another pointer is used to update the local.
Is this safe in a concurrent situation? I tried, but I find it hard to put change_aaa(&mut aaa) into a thread, according to post and post. However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
While it is possible to put change_aaa(&mut aaa) in a separate thread using scoped threads, the corresponding lifetimes will ensure the compiler rejects any code trying to modify aaa while that thread runs. You will essentially have this failure:
fn main(){
let mut aaa: i32 = 1;
let r = &mut aaa;
aaa = 2;
println!("{}", r);
}
error[E0506]: cannot assign to `aaa` because it is borrowed
--> src/main.rs:10:5
|
9 | let r = &mut aaa;
| -------- borrow of `aaa` occurs here
10 | aaa = 2;
| ^^^^^^^ assignment to borrowed `aaa` occurs here
11 | println!("{}", r);
| - borrow later used here
I understand you're not allowed to create two mutable references to an object at once in Rust. I don't entirely understand why the following code works:
fn main() {
let mut string = String::from("test");
let mutable_reference: &mut String = &mut string;
mutable_reference.push_str(" test");
// as I understand it, this creates a new mutable reference (2nd?)
test(&mut *mutable_reference);
println!("{}", mutable_reference);
}
fn test(s: &mut String) {
s.push_str(" test");
}
The rule
There shall only be one usable mutable reference to a particular value at any point in time.
This is NOT a spatial exclusion (there CAN be multiple references to the same piece) but a temporal exclusion.
The mechanism
In order to enforce this, &mut T is NOT Copy; therefore calling:
test(mutable_reference);
should move the reference into test.
Actually doing this would make it unusable later on and not be very ergonomic, so the Rust compiler inserts an automatic reborrowing, much like you did yourself:
test(&mut *mutable_reference);
You can force the move if you wanted to:
test({ let x = mutable_reference; x });
The effect
Re-borrowing is, in essence, just borrowing:
mutable_reference is borrowed for as long as the unnamed temporary mutable reference exists (or anything that borrows from it),
the unnamed temporary mutable reference is moved into test,
at the of expression, the unnamed temporary mutable reference is destroyed, and therefore the borrow of mutable_reference ends.
Is there more than one mutable pointer somewhere in memory referring to the same location? Yes.
Is there more than one mutable pointer in the code to the same location which is usable? No.
Re-borrowing a mutable pointer locks out the one you're re-borrowing from.
I analyzed the differences in MIR between test(mutable_reference) and test(&mut *mutable_reference). It appears that the latter only introduces an additional level of indirection:
When you are using an extra dereference in the function call, it doesn't create a mutable reference that holds; in other words, it doesn't cause any actual difference outside the function call.
I understand you're not allowed to create two mutable references to an object at once in Rust. I don't entirely understand why the following code works:
fn main() {
let mut string = String::from("test");
let mutable_reference: &mut String = &mut string;
mutable_reference.push_str(" test");
// as I understand it, this creates a new mutable reference (2nd?)
test(&mut *mutable_reference);
println!("{}", mutable_reference);
}
fn test(s: &mut String) {
s.push_str(" test");
}
The rule
There shall only be one usable mutable reference to a particular value at any point in time.
This is NOT a spatial exclusion (there CAN be multiple references to the same piece) but a temporal exclusion.
The mechanism
In order to enforce this, &mut T is NOT Copy; therefore calling:
test(mutable_reference);
should move the reference into test.
Actually doing this would make it unusable later on and not be very ergonomic, so the Rust compiler inserts an automatic reborrowing, much like you did yourself:
test(&mut *mutable_reference);
You can force the move if you wanted to:
test({ let x = mutable_reference; x });
The effect
Re-borrowing is, in essence, just borrowing:
mutable_reference is borrowed for as long as the unnamed temporary mutable reference exists (or anything that borrows from it),
the unnamed temporary mutable reference is moved into test,
at the of expression, the unnamed temporary mutable reference is destroyed, and therefore the borrow of mutable_reference ends.
Is there more than one mutable pointer somewhere in memory referring to the same location? Yes.
Is there more than one mutable pointer in the code to the same location which is usable? No.
Re-borrowing a mutable pointer locks out the one you're re-borrowing from.
I analyzed the differences in MIR between test(mutable_reference) and test(&mut *mutable_reference). It appears that the latter only introduces an additional level of indirection:
When you are using an extra dereference in the function call, it doesn't create a mutable reference that holds; in other words, it doesn't cause any actual difference outside the function call.