Why does Rust prevent multiple mutable references? - multithreading

As the title says, why does Rust prevent multiple mutable references? I have read the chapter in the Rust book, and I understand that in multi-threaded code this protects us from data races, but let's look at this code:
fn main() {
let mut x1 = String::from("hello");
let r1 = &mut x1;
let r2 = &mut x1;
r1.insert(0, 'w');
}
This code does not run concurrently, so there is no possibility of a data race.
What is more, when I create a new thread and want to use a variable from the parent thread in it, I have to move the variable, so only the new thread owns it.
The only reason I can see is that a programmer can get lost in the code as it grows. With multiple places where one piece of data can be modified, we can get bugs even if the code does not run in parallel.

The idea that Rust forbids two simultaneous mutable references only to prevent data races is a common misconception. That is only one of the reasons. Forbidding two mutable references also makes it easy to maintain invariants on types and lets the compiler enforce that those invariants are not violated.
Take this piece of C++ code for an example:
#include <vector>
int main() {
std::vector<int> foo = { 1, 2, 3 };
for (auto& e: foo) {
if (e % 2 == 0) {
foo.push_back(e+1);
}
}
return 0;
}
This is unsafe because you cannot mutate a vector while you are iterating over it. Mutating the vector might reallocate its internal buffer, which invalidates all references into it. In C++, this is undefined behavior (UB). In Java or C#, the equivalent code throws a runtime exception; Python raises one for dictionaries and sets, though not for lists.
Rust, however, prevents this kind of issue at compile time:
fn main() {
let mut foo = vec![1, 2, 3];
for e in foo {
if e % 2 == 0 {
foo.push(e+1);
}
}
}
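gives an error (here a use-after-move, because `for e in foo` consumes the vector; iterating over `&foo` as the hint suggests is rejected as well, with borrow error E0502):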
error[E0382]: borrow of moved value: `foo`
--> src/main.rs:6:13
|
2 | let mut foo = vec![1, 2, 3];
| ------- move occurs because `foo` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait
3 |
4 | for e in foo {
| ---
| |
| value moved here
| help: consider borrowing to avoid moving into the for loop: `&foo`
5 | if e % 2 == 0 {
6 | foo.push(e+1);
| ^^^ value borrowed here after move
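For comparison, one way to get the intended effect in safe Rust is to separate the read phase from the write phase. This is only a minimal sketch; the helper name push_successors_of_evens is made up for illustration and is not part of the original answer:

```rust
/// Hypothetical helper: for every even element, append `element + 1`.
fn push_successors_of_evens(values: &mut Vec<i32>) {
    // Read phase: the shared borrow taken by `iter()` ends once `collect` returns.
    let to_add: Vec<i32> = values
        .iter()
        .filter(|e| **e % 2 == 0)
        .map(|e| e + 1)
        .collect();
    // Write phase: no iterator is alive any more, so mutating is allowed.
    values.extend(to_add);
}

fn main() {
    let mut foo = vec![1, 2, 3];
    push_successors_of_evens(&mut foo);
    println!("{:?}", foo);
}
```

Because the new elements are computed before the vector is touched, a reallocation inside extend cannot invalidate anything that is still in use.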

The big benefit of this restriction is that Rust can prevent data races at compile time. A data race occurs when two pointers point to the same piece of data, at least one of them is used to write to it, and there is no mechanism to synchronize access between them. In that situation, you could imagine one pointer reading the data while, in the middle of the read, the other pointer modifies it, so we get corrupted data back. To fix this error, you can switch these references back to immutable references.
Rust enforces a “single writer or multiple readers” rule: either you can read and write the value, or it can be shared by any number of readers, but never both at the same time.
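The rule can be sketched in a few lines. The function name readers_then_writer is an assumption made for this illustration; it relies on the fact that a borrow ends at its last use, so readers and a writer may follow each other in the same scope:

```rust
/// Returns the modified string and the combined length seen by the readers.
fn readers_then_writer() -> (String, usize) {
    let mut s = String::from("hello");

    // Any number of shared (read-only) references may coexist.
    let r1 = &s;
    let r2 = &s;
    let total_len = r1.len() + r2.len();
    // `r1` and `r2` are not used past this point, so their borrows end here.

    // Now a single exclusive (mutable) reference is allowed.
    let w = &mut s;
    w.insert(0, 'w');

    (s, total_len)
}

fn main() {
    let (s, total_len) = readers_then_writer();
    println!("{} {}", s, total_len);
}
```

Using r1 again after the call to insert would bring back the compile error, because the shared and the exclusive borrow would then overlap.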

Related

Why is mutating an owned value and borrowed reference safe in Rust?

In How does Rust prevent data races when the owner of a value can read it while another thread changes it?, I learned that we need &mut self when we want to mutate an object, even when the method is called on an owned value.
But how about primitive values, like i32? I ran this code:
fn change_aaa(bbb: &mut i32) {
*bbb = 3;
}
fn main() {
let mut aaa: i32 = 1;
change_aaa(&mut aaa); // somehow run this asynchronously
aaa = 2; // ... and will have data race here
}
My questions are:
Is this safe in a non-concurrent situation?
According to The Rust Programming Language, if we think of the owned value as a pointer, it is not safe according to the following rules; however, it compiles.
Two or more pointers access the same data at the same time.
At least one of the pointers is being used to write to the data.
There’s no mechanism being used to synchronize access to the data.
Is this safe in a concurrent situation?
I tried, but I find it hard to put change_aaa(&mut aaa) into a thread, according to Why can't std::thread::spawn accept arguments in Rust? and How does Rust prevent data races when the owner of a value can read it while another thread changes it?. However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
The signature of change_aaa doesn't allow it to move the reference into another thread. For example, you might imagine a change_aaa() implemented like this:
fn change_aaa(bbb: &mut i32) {
std::thread::spawn(move || {
std::thread::sleep(std::time::Duration::from_secs(1));
*bbb = 100; // ha ha ha - kaboom!
});
}
But the above doesn't compile. This is because, after desugaring the lifetime elision, the full signature of change_aaa() is:
fn change_aaa<'a>(bbb: &'a mut i32)
The lifetime annotation means that change_aaa must support references of any lifetime 'a chosen by the caller, even a very short one, such as one that invalidates the reference as soon as change_aaa() returns. And this is exactly how change_aaa() is called from main(), which can be desugared to:
let mut aaa: i32 = 1;
{
let aaa_ref = &mut aaa;
change_aaa(aaa_ref);
// aaa_ref goes out of scope here, and we're free to mutate
// aaa as we please
}
aaa = 2; // ... and will have data race here
So the lifetime of the reference is short, and ends just before the assignment to aaa. On the other hand, thread::spawn() requires a closure bounded by the 'static lifetime. That means that the closure passed to thread::spawn() must contain only owned data or references to 'static data (data guaranteed to last until the end of the program). Since change_aaa() accepts bbb with a lifetime shorter than 'static, it cannot pass bbb to thread::spawn().
To get a grip on this you can try to come up with imaginative ways to write change_aaa() so that it writes to *bbb in a thread. If you succeed in doing so, you will have found a bug in rustc. In other words:
However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
It is designed to be impossible to do this, except through types that are explicitly designed to make it safe (e.g. Arc to prolong the lifetime, and Mutex to make writes data-race-safe).
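As a rough sketch of that safe route (the function name change_in_thread is made up for this example), the value can be wrapped in Arc<Mutex<_>> so that the spawned closure owns a handle and satisfies the 'static bound:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Writes to the shared value from another thread, then from the caller.
/// Returns the value seen after the thread finished and the final value.
fn change_in_thread() -> (i32, i32) {
    let aaa = Arc::new(Mutex::new(1));

    // The clone is an owned handle, so the closure holds no borrowed data.
    let for_thread = Arc::clone(&aaa);
    let handle = thread::spawn(move || {
        *for_thread.lock().unwrap() = 3;
    });
    handle.join().unwrap();

    let after_thread = *aaa.lock().unwrap();
    *aaa.lock().unwrap() = 2;
    let final_value = *aaa.lock().unwrap();
    (after_thread, final_value)
}

fn main() {
    let (after_thread, final_value) = change_in_thread();
    println!("{} {}", after_thread, final_value);
}
```

Arc keeps the integer alive for as long as any thread holds a handle, and Mutex serializes the writes, so both ingredients named above are present.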
Is this safe in a non-concurrent situation? According to this post, if we think of the owned value itself as a pointer, it is not safe according to the following rules; however, it compiles.
Two or more pointers access the same data at the same time.
At least one of the pointers is being used to write to the data.
There’s no mechanism being used to synchronize access to the data.
It is safe according to those rules: one pointer accesses the data on the second line of main (the pointer passed to change_aaa), then that pointer goes away and another pointer is used to update the local.
Is this safe in a concurrent situation? I tried, but I find it hard to put change_aaa(&mut aaa) into a thread, according to post and post. However, is it designed to be hard or impossible to do this, or just because I am unfamiliar with Rust?
While it is possible to put change_aaa(&mut aaa) in a separate thread using scoped threads, the corresponding lifetimes will ensure the compiler rejects any code trying to modify aaa while that thread runs. You will essentially have this failure:
fn main(){
let mut aaa: i32 = 1;
let r = &mut aaa;
aaa = 2;
println!("{}", r);
}
error[E0506]: cannot assign to `aaa` because it is borrowed
--> src/main.rs:10:5
|
9 | let r = &mut aaa;
| -------- borrow of `aaa` occurs here
10 | aaa = 2;
| ^^^^^^^ assignment to borrowed `aaa` occurs here
11 | println!("{}", r);
| - borrow later used here
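For the scoped-thread variant mentioned above, a minimal sketch could look like the following. It assumes std::thread::scope, which is available in the standard library since Rust 1.63; the wrapper name run_scoped is made up for illustration:

```rust
use std::thread;

fn change_aaa(bbb: &mut i32) {
    *bbb = 3;
}

/// Returns the value right after the scoped thread ended and the final value.
fn run_scoped() -> (i32, i32) {
    let mut aaa: i32 = 1;

    thread::scope(|s| {
        // The thread borrows `aaa` mutably for the duration of the scope.
        s.spawn(|| change_aaa(&mut aaa));
        // Writing `aaa = 2;` here would be rejected: `aaa` is still borrowed.
    });
    // `scope` returns only after every spawned thread has been joined,
    // so the mutable borrow is over and `aaa` is ours again.
    let after_thread = aaa;
    aaa = 2;
    (after_thread, aaa)
}

fn main() {
    let (after_thread, final_value) = run_scoped();
    println!("{} {}", after_thread, final_value);
}
```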

Insert into a HashMap based on another value in the same Hashmap

I'm trying to insert a value into a HashMap based on another value in the same HashMap, like so:
use std::collections::HashMap;
fn main() {
let mut some_map = HashMap::new();
some_map.insert("a", 1);
let some_val = some_map.get("a").unwrap();
if *some_val != 2 {
some_map.insert("b", *some_val);
}
}
which gives this warning:
warning: cannot borrow `some_map` as mutable because it is also borrowed as immutable
--> src/main.rs:10:9
|
7 | let some_val = some_map.get("a").unwrap();
| -------- immutable borrow occurs here
...
10 | some_map.insert("b", *some_val);
| ^^^^^^^^ --------- immutable borrow later used here
| |
| mutable borrow occurs here
|
= note: `#[warn(mutable_borrow_reservation_conflict)]` on by default
= warning: this borrowing pattern was not meant to be accepted, and may become a hard error in the future
= note: for more information, see issue #59159 <https://github.com/rust-lang/rust/issues/59159>
If I were instead trying to update an existing value, I could use interior mutation and RefCell, as described here.
If I were trying to insert or update a value based on itself, I could use the entry API, as described here.
I could work around the issue with cloning, but I would prefer to avoid that since the retrieved value in my actual code is somewhat complex. Will this require unsafe code?
EDIT
Since the previous answer was simply wrong and didn't answer the question at all, here is code that doesn't produce any warning (playground).
Now it's a HashMap with Rc<_> values, and val_rc is a reference-counted pointer to the actual data (the number 1 in this case). Cloning an Rc only increments the counter, so it is cheap regardless of how large the data is. Note, though, that only one copy of the number exists, so some_map["a"] and some_map["b"] refer to a single piece of memory; if you mutate the value through one of them (which requires interior mutability, e.g. Rc<RefCell<_>>), the change is visible through the other as well. Also note that Rc::new already places its value on the heap, so there is no need to wrap heavy objects in an additional Box.
use std::collections::HashMap;
use std::rc::Rc;
fn main() {
let mut some_map = HashMap::new();
some_map.insert("a", Rc::new(1));
let val_rc = Rc::clone(some_map.get("a").unwrap());
if *val_rc != 2 {
some_map.insert("b", val_rc);
}
}
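To make the sharing visible, here is a small sketch that adds RefCell so the shared value can actually be mutated. The RefCell wrapper and the function name shared_entries are assumptions added for this illustration; they are not part of the code above:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Returns whether both keys point at the same allocation, and the value
/// read through "b" after mutating through "a".
fn shared_entries() -> (bool, i32) {
    let mut some_map: HashMap<&str, Rc<RefCell<i32>>> = HashMap::new();
    some_map.insert("a", Rc::new(RefCell::new(1)));

    // Cloning the Rc copies the pointer, not the data; the borrow of
    // `some_map` ends with this statement.
    let val_rc = Rc::clone(some_map.get("a").unwrap());
    let differs = *val_rc.borrow() != 2;
    if differs {
        some_map.insert("b", val_rc);
    }

    // Mutate through "a" ...
    *some_map.get("a").unwrap().borrow_mut() += 41;
    // ... and observe the change through "b".
    let same_allocation =
        Rc::ptr_eq(some_map.get("a").unwrap(), some_map.get("b").unwrap());
    let b_value = *some_map.get("b").unwrap().borrow();
    (same_allocation, b_value)
}

fn main() {
    let (same_allocation, b_value) = shared_entries();
    println!("{} {}", same_allocation, b_value);
}
```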
Previous version of answer
It's hard to tell exactly what you're looking for, but in this particular case, if you only need to check the value, let the borrow end before you update the HashMap. Some quick and dirty code would look like this:
use std::collections::HashMap;
fn main() {
    let mut some_map = HashMap::new();
    some_map.insert("a", 1);
    // Both variables are initialized inside the block below.
    let is_ok;
    let copied_val;
    {
        // The immutable borrow of `some_map` ends with this block.
        let some_val = some_map.get("a").unwrap();
        is_ok = *some_val != 2;
        copied_val = *some_val;
    }
    if is_ok {
        some_map.insert("b", copied_val);
    }
}

Rust - writing to indices of a vector across multiple threads

I have a circular ring buffer (implemented as a vector) where I want one thread to periodically write to the ring buffer and another to periodically read from the ring buffer. Is it possible to create a vector that can be read and written at the same time so long as threads accessing the vector are not at the same index?
What I am hoping to achieve:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
let vec = Arc::new(vec![Mutex::new(1), Mutex::new(2),Mutex::new(3)]);
{
let vec = vec.clone();
thread::spawn(move|| {
let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
s2 = 7;
});
}
println!("{}", vec[2].lock().unwrap());
}
Compiler output is:
Compiling playground v0.0.1 (/playground)
warning: variable `s2` is assigned to, but never used
--> src/main.rs:12:21
|
12 | let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
| ^^
|
= note: `#[warn(unused_variables)]` on by default
= note: consider using `_s2` instead
warning: value assigned to `s2` is never read
--> src/main.rs:13:13
|
13 | s2 = 7;
| ^^
|
= note: `#[warn(unused_assignments)]` on by default
= help: maybe it is overwritten before being read?
error[E0596]: cannot borrow data in an `Arc` as mutable
--> src/main.rs:12:27
|
12 | let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
| ^^^ cannot borrow as mutable
|
= help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::sync::Arc<std::vec::Vec<std::sync::Mutex<i32>>>`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0596`.
error: could not compile `playground`.
To learn more, run the command again with --verbose.
Foiled by the rust type system trying to prevent a race condition :(
What I don't want
An implementation that involves having the lock scope including the vector.
An atomic read and write to the vector is not an option since the vector will contain images.
Link to playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5b5efe91bdd45c658d11f1cefb16045e
First, I would recommend using std::sync::RwLock, because it allows multiple readers to read data simultaneously.
Second, spawning threads can lead to performance bottlenecks in your code. Try using a thread pool.
Of course, the exact choice will depend on benchmark results, but those are general recommendations.
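A rough sketch of the RwLock suggestion, with one lock per slot so that the lock never covers the whole vector. The Vec<u8> stand-in for an image and the function name per_slot_rwlock are assumptions for illustration only, and plain threads are used instead of a pool to keep the example dependency-free:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical ring buffer in which every slot has its own lock.
/// Returns the slot length observed by each reader thread.
fn per_slot_rwlock() -> Vec<usize> {
    let slots: Arc<Vec<RwLock<Vec<u8>>>> =
        Arc::new((0..3).map(|_| RwLock::new(Vec::new())).collect());

    // Writer thread: locks slot 2 only; the other slots stay available.
    let writer = {
        let slots = Arc::clone(&slots);
        thread::spawn(move || {
            let mut slot = slots[2].write().unwrap();
            slot.extend_from_slice(&[1, 2, 3, 4]);
        })
    };
    writer.join().unwrap();

    // Several readers may hold read locks on the same slot at once.
    let readers: Vec<_> = (0..2)
        .map(|_| {
            let slots = Arc::clone(&slots);
            thread::spawn(move || {
                let len = slots[2].read().unwrap().len();
                len
            })
        })
        .collect();
    readers.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", per_slot_rwlock());
}
```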
Your code is mostly correct, except for one crucial part. You are using Mutex, which implements the interior mutability pattern and also provides thread safety.
Interior mutability moves the checks of the XOR borrowing rule (either N immutable borrows or just one mutable borrow) from compile time to run time. So, Mutex ensures that at any time there is only one reader or only one writer.
When you try to get a mutable reference from vec, like this
vec.get_mut(..)
you are essentially ignoring the benefits provided by interior mutability. The compiler can't guarantee that the XOR rule is not broken, because you borrow vec as mutable.
The obvious solution is to borrow vec as immutable and rely on Mutex, rather than on the compiler's borrowing rules, to safeguard against race conditions.
let mut s2 = vec
.get(2) // Get immutable reference to the third item (index 2)
.unwrap() // Ensure that it exists
.lock() // Lock mutex.
.unwrap(); // Ensure mutex isn't poisoned.
// s2 is now `std::sync::MutexGuard<i32>`, which implements `std::ops::DerefMut`,
// so it can get us mutable reference to data.
*s2 = 7;
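Put together with the code from the question, a complete version might look like this. It is a sketch; the function name write_slot_from_thread and the explicit join are additions that the original snippet did not have:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Writes 7 into the third slot from another thread and reads it back.
fn write_slot_from_thread() -> i32 {
    let vec = Arc::new(vec![Mutex::new(1), Mutex::new(2), Mutex::new(3)]);

    let handle = {
        let vec = Arc::clone(&vec);
        thread::spawn(move || {
            // Shared access to the Vec, exclusive access to one element.
            let mut s2 = vec.get(2).unwrap().lock().unwrap();
            *s2 = 7;
        })
    };
    // Without the join, the read below could run before the write.
    handle.join().unwrap();

    let value = *vec[2].lock().unwrap();
    value
}

fn main() {
    println!("{}", write_slot_from_thread());
}
```

Only the element at index 2 is locked while it is written, so other threads remain free to lock other indices in the meantime.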

Borrowing error when pushing reference into vector that is on the same scope [duplicate]

For reasons related to code organization, I need the compiler to accept the following (simplified) code:
fn f() {
let mut vec = Vec::new();
let a = 0;
vec.push(&a);
let b = 0;
vec.push(&b);
// Use `vec`
}
The compiler complains
error: `a` does not live long enough
--> src/main.rs:8:1
|
4 | vec.push(&a);
| - borrow occurs here
...
8 | }
| ^ `a` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
error: `b` does not live long enough
--> src/main.rs:8:1
|
6 | vec.push(&b);
| - borrow occurs here
7 | // Use `vec`
8 | }
| ^ `b` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
However, I'm having a hard time convincing the compiler to drop the vector before the variables it references. vec.clear() doesn't work, and neither does drop(vec). mem::transmute() doesn't work either (to force vec to be 'static).
The only solution I found was to transmute the reference into &'static _. Is there any other way? Is it even possible to compile this in safe Rust?
Is it even possible to compile this in safe Rust?
No. What you are trying to do is inherently unsafe in the general case.
The collection contains a reference to a variable that will be dropped before the collection itself is dropped. This means that the destructor of the collection has access to references that are no longer valid. The destructor could choose to dereference one of those values, breaking Rust's memory safety guarantees.
note: values in a scope are dropped in the opposite order they are created
As the compiler tells you, you need to reorder your code. You didn't actually say what the limitations are for "reasons related to code organization", but the straightforward fix is:
fn f() {
let a = 0;
let b = 0;
let mut vec = Vec::new();
vec.push(&a);
vec.push(&b);
}
A less obvious one is:
fn f() {
let a;
let b;
let mut vec = Vec::new();
a = 0;
vec.push(&a);
b = 0;
vec.push(&b);
}
That all being said, with non-lexical lifetimes (NLL) your original code works! NLL has been the default since Rust 1.31 for the 2018 edition and since Rust 1.36 for the 2015 edition. The borrow checker is now more granular about how long a value needs to live.
But wait; I just said that the collection might access invalid memory if a value inside it is dropped before the collection, and now the compiler is allowing that to happen? What gives?
This is because the standard library pulls a sneaky trick on us. Collections like Vec or HashSet guarantee that they do not access their generic parameters in the destructor. They communicate this to the compiler using the unstable #[may_dangle] feature.
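As a quick check, the shape of the original code can be compiled on a current toolchain. In this sketch the function name sum_refs, the concrete values, and the summing step are assumptions added so that the vector is actually used:

```rust
/// Same declaration order as in the question: the container first,
/// the referenced values afterwards.
fn sum_refs() -> i32 {
    let mut vec = Vec::new();
    let a = 1;
    vec.push(&a);
    let b = 2;
    vec.push(&b);
    // Use `vec`: the references are dereferenced here, while `a` and `b`
    // are still alive.
    let total: i32 = vec.iter().map(|r| **r).sum();
    total
    // `b` and `a` are dropped before `vec`, which is accepted because
    // dropping a `Vec<&i32>` never reads the references it holds.
}

fn main() {
    println!("{}", sum_refs());
}
```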
See also:
Moved variable still borrowing after calling `drop`?
"cannot move out of variable because it is borrowed" when rotating variables

Resources