Rust - writing to indices of a vector across multiple threads

I have a circular ring buffer (implemented as a vector) where I want one thread to periodically write to the ring buffer and another to periodically read from the ring buffer. Is it possible to create a vector that can be read and written at the same time so long as threads accessing the vector are not at the same index?
What I am hoping to achieve:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
let vec = Arc::new(vec![Mutex::new(1), Mutex::new(2),Mutex::new(3)]);
{
let vec = vec.clone();
thread::spawn(move|| {
let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
s2 = 7;
});
}
println!("{}", vec[2].lock().unwrap());
}
Compiler output is:
Compiling playground v0.0.1 (/playground)
warning: variable `s2` is assigned to, but never used
--> src/main.rs:12:21
|
12 | let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
| ^^
|
= note: `#[warn(unused_variables)]` on by default
= note: consider using `_s2` instead
warning: value assigned to `s2` is never read
--> src/main.rs:13:13
|
13 | s2 = 7;
| ^^
|
= note: `#[warn(unused_assignments)]` on by default
= help: maybe it is overwritten before being read?
error[E0596]: cannot borrow data in an `Arc` as mutable
--> src/main.rs:12:27
|
12 | let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
| ^^^ cannot borrow as mutable
|
= help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::sync::Arc<std::vec::Vec<std::sync::Mutex<i32>>>`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0596`.
error: could not compile `playground`.
To learn more, run the command again with --verbose.
Foiled by the rust type system trying to prevent a race condition :(
What I don't want
An implementation that involves having the lock scope including the vector.
An atomic read and write to the vector is not an option since the vector will contain images.
Link to playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5b5efe91bdd45c658d11f1cefb16045e

First, I recommend std::sync::RwLock, because it allows multiple readers to read the data simultaneously.
Second, spawning a new thread for every task can become a performance bottleneck. Consider using a thread pool.
The right choice depends on benchmark results, but those are the general recommendations.
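As a rough sketch of the RwLock suggestion, assuming one lock per slot of the ring buffer: the `Frame` alias and the function name are made up for illustration, and a plain `Vec<u8>` stands in for an image.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for an image; the real element type is up to you.
type Frame = Vec<u8>;

/// Writes one frame into `slot`, then lets `readers` threads read it concurrently.
fn read_slot_concurrently(slot: usize, readers: usize) -> Vec<usize> {
    let ring: Arc<Vec<RwLock<Frame>>> =
        Arc::new((0..3).map(|_| RwLock::new(Frame::new())).collect());

    {
        // Exclusive access to this one slot only; the other slots stay available.
        let mut frame = ring[slot].write().unwrap();
        frame.extend_from_slice(&[1, 2, 3, 4]);
    }

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let ring = Arc::clone(&ring);
            thread::spawn(move || {
                // Any number of read guards for the same slot may coexist.
                let frame = ring[slot].read().unwrap();
                frame.len()
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", read_slot_concurrently(1, 4));
}
```

Whether RwLock actually beats Mutex here depends on how often readers overlap, so it is worth measuring both.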
Your code is mostly correct, except for one crucial part. You are using Mutex, which implements the interior mutability pattern and also provides thread safety.
Interior mutability moves the check of the XOR borrowing rule (either N immutable borrows or exactly one mutable borrow) from compile time to run time. A Mutex enforces it by granting access to only one thread at a time, whether that thread reads or writes.
When you try to get a mutable reference from vec, like this
vec.get_mut(..)
you are ignoring the benefit that interior mutability provides. Arc only hands out shared references to its contents (it does not implement DerefMut), so the compiler rejects the mutable borrow of vec because it cannot guarantee that the XOR rule holds.
The solution is to borrow vec immutably and rely on the Mutex, rather than the compiler's borrowing rules, to guard against race conditions.
let mut s2 = vec
.get(2) // Get an immutable reference to the third item (index 2)
.unwrap() // Ensure that it exists
.lock() // Lock mutex.
.unwrap(); // Ensure mutex isn't poisoned.
// s2 is now `std::sync::MutexGuard<i32>`, which implements `std::ops::DerefMut`,
// so it can get us mutable reference to data.
*s2 = 7;
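Put together, the program from the question might look like the sketch below. The function name is made up for illustration, and the join is my addition: without it the main thread could read the slot before the writer has run.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns a writer thread that stores `value` in slot `index`, waits for it,
/// then reads the slot back from the calling thread.
fn write_slot_from_thread(index: usize, value: i32) -> i32 {
    // One Mutex per slot: locking slot 2 does not block slots 0 and 1.
    let slots = Arc::new(vec![Mutex::new(1), Mutex::new(2), Mutex::new(3)]);

    let writer = {
        let slots = Arc::clone(&slots);
        thread::spawn(move || {
            // `get` only needs `&Vec`, which `Arc` is able to hand out.
            let mut guard = slots.get(index).expect("index out of range").lock().unwrap();
            *guard = value;
        })
    };

    // Join so the write is guaranteed to have happened before the read.
    writer.join().unwrap();

    let result = *slots[index].lock().unwrap();
    result
}

fn main() {
    println!("{}", write_slot_from_thread(2, 7));
}
```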

Related

Why does Rust not allow multiple mutable borrows when it's safe?

I'm having difficulty understanding why Rust's borrow checker does not allow multiple mutable borrows when it is safe to do so.
Let's give an example:
fn borrow_mut(s : &mut String) {
s.push_str(" world!");
println!("{}", s);
}
fn main() {
let mut s = String::from("hello");
let rs : &mut String = &mut s;
// second mutable borrow
borrow_mut(&mut s);
println!("{rs}");
}
This code fails to compile with the following message:
error[E0499]: cannot borrow `s` as mutable more than once at a time
--> main.rs:11:16
|
8 | let rs : &mut String = &mut s;
| ------ first mutable borrow occurs here
...
11 | borrow_mut(&mut s);
| ^^^^^^ second mutable borrow occurs here
12 |
13 | println!("{rs}");
| -- first borrow later used here
rs points to a variable of type String in the stack frame. The String contains a pointer to memory on the heap. So even if the string reallocates its data in borrow_mut(), both references are still valid, so this code should be safe.
Could someone explain why the borrow checker prevents multiple mutable borrows even when it's safe?
It's for thread safety, to avoid data races. If two such mutable borrowings can exist, then two threads of execution can both attempt to modify the original data. If they do, all sorts of nasty race conditions can arise, e.g. if both threads try to append to the string:
The underlying array holding the data can get reallocated twice, with one of them leaked
The appended data could end up writing out of bounds due to time-of-check/time-of-use issues
You could end up with inconsistent definitions of the length and capacity
On some architectures and data sizes, tearing could mean a single logical value is read half as the old version and half as the updated value (producing something that could easily be unrelated to either the old or new value)
etc.
Borrows as a language feature mean that the function can temporarily hand off its unique mutable-ownership to some other function; while that other function holds the borrow, the original object can't be accessed through anything but that mutable borrow. It also means that for non-mutable borrows, it can prevent mutable borrows that might cause races between reads through the non-mutable borrow and writes through the mutable borrow. The borrow checker is preventing you from launching a thread that modifies s, then calling borrow_mut from the main thread, and the two threads producing garbage or crashing the program when they modify s simultaneously.
To be clear, with an advanced borrow-checker in some future version of Rust, this code could be made to work (the code you wrote does nothing inherently unsafe). But fully analyzing deep code paths to ensure nothing evil could possibly occur is hard, and it's relatively easy to impose stricter rules (which might be loosened in the future if they're sure it won't impose restrictions on the language design that bite them later). Your code would work just fine if you passed the single mutable borrow you already had into borrow_mut after all; your code is not made worse by doing things The Rust Way™.
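A minimal sketch of that last suggestion, passing the existing borrow along instead of taking a second one. The wrapper function `greet` is made up for illustration, and the extra push through `rs` is only there to show that `rs` is usable again after the call.

```rust
fn borrow_mut(s: &mut String) {
    s.push_str(" world!");
}

fn greet() -> String {
    let mut s = String::from("hello");
    let rs: &mut String = &mut s;
    // Reborrow through `rs` instead of borrowing `s` a second time.
    borrow_mut(rs);
    // `rs` is usable again once the reborrow made for the call has ended.
    rs.push('!');
    s
}

fn main() {
    println!("{}", greet());
}
```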

Will the non-lexical lifetime borrow checker release locks prematurely?

I've read "What are non-lexical lifetimes?". With the non-lexical borrow checker, the following code compiles:
fn main() {
let mut scores = vec![1, 2, 3];
let score = &scores[0]; // borrows `scores`, but never used
// its lifetime can end here
scores.push(4); // borrows `scores` mutably, and succeeds
}
It seems reasonable in the case above, but when it comes to a mutex lock, we don't want it to be released prematurely.
In the following code, I would like to lock a shared structure first and then execute a closure, mainly to avoid deadlock. However, I'm not sure if the lock will be released prematurely.
use lazy_static::lazy_static; // 1.3.0
use std::sync::Mutex;
struct Something;
lazy_static! {
static ref SHARED: Mutex<Something> = Mutex::new(Something);
}
pub fn lock_and_execute(f: Box<Fn()>) {
let _locked = SHARED.lock(); // `_locked` is never used.
// does its lifetime end here?
f();
}
Does Rust treat locks specially, so that their lifetimes are guaranteed to extend to the end of their scope? Must we use that variable explicitly to avoid premature dropping of the lock, like in the following code?
pub fn lock_and_execute(f: Box<Fn()>) {
let locked = SHARED.lock(); // - lifetime begins
f(); // |
drop(locked); // - lifetime ends
}
There is a misunderstanding here: NLL (non-lexical lifetimes) affects the borrow-checks, not the actual lifetime of the objects.
Rust uses RAII [1] extensively, and thus the Drop implementation of a number of objects, such as locks, has side-effects which have to occur at a well-determined and predictable point in the flow of execution.
NLL did NOT change the lifetime of such objects, and therefore their destructor is executed at exactly the same point that it was before: at the end of their lexical scope, in reverse order of creation.
NLL did change the understanding of the compiler of the use of lifetimes for the purpose of borrow-checking. This does not, actually, cause any code change; this is purely analysis. This analysis was made more clever, to better recognize the actual scope in which a reference is used:
Prior to NLL, a reference was considered "in use" from the moment it was created to the moment it was dropped, generally its lexical scope (hence the name).
NLL, instead:
Tries to defer the start of the "in use" span, if possible.
Ends the "in use" span with the last use of the reference.
In the case of a Ref<'a> (from RefCell), the Ref<'a> will be dropped at the end of the lexical scope, at which point it will use the reference to RefCell to decrement the counter.
NLL does not peel away layers of abstractions, so must consider that any object containing a reference (such as Ref<'a>) may access said reference in its Drop implementation. As a result, any object that contains a reference, such as a lock, will force NLL to consider that the "in use" span of the reference extends until they are dropped.
[1] Resource Acquisition Is Initialization, whose original meaning is that once a variable's constructor has been executed it has acquired the resources it needed and is not in a half-baked state, and which is generally used to mean that the destruction of said variable will release any resources it owned.
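To see that destructors still run at the end of the lexical scope, in reverse order of creation, here is a small sketch. The `Noisy` type and the shared log are made up for illustration.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        // Neither value is used again, yet both stay alive until the closing brace.
        log.borrow_mut().push("end of scope");
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

The log ends up as "end of scope", "second", "first": NLL did not move either destructor earlier, even though neither binding is used after its declaration.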
Does Rust treat locks specially, so that their lifetimes are guaranteed to extend to the end of their scope?
No. This is the default for every type, and has nothing to do with the borrow checker.
Must we use that variable explicitly to avoid premature dropping of the lock
No.
All you need to do is ensure that the lock guard is bound to a variable. Your example does this (let _locked = ...), so the lock will be released at the end of the scope. If you had used the _ pattern instead, the guard would have been dropped immediately.
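A small sketch of the difference between the two patterns, using try_lock to check whether the mutex is free. The function name is made up; as far as I know, recent compilers reject the `let _ = ...lock()` form by default through the `let_underscore_lock` lint, so the sketch opts out of that lint explicitly.

```rust
use std::sync::Mutex;

// The lint would otherwise reject the `let _ = ...` line below.
#[allow(let_underscore_lock)]
fn lock_state_after_binding() -> (bool, bool) {
    let shared = Mutex::new(0);

    // `_` does not bind anything, so the guard is dropped at the end of the statement.
    let _ = shared.lock().unwrap();
    let free_after_underscore = shared.try_lock().is_ok();

    // `_guard` is a real binding, so the guard lives until the end of the scope.
    let _guard = shared.lock().unwrap();
    let free_after_named = shared.try_lock().is_ok();

    (free_after_underscore, free_after_named)
}

fn main() {
    println!("{:?}", lock_state_after_binding());
}
```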
You can prove this for yourself by testing if the lock has indeed been dropped:
pub fn lock_and_execute() {
let shared = Mutex::new(Something);
println!("A");
let _locked = shared.lock().unwrap();
// If `_locked` was dropped, then we can re-lock it:
println!("B");
shared.lock().unwrap();
println!("C");
}
fn main() {
lock_and_execute();
}
This code will deadlock, as the same thread attempts to acquire the lock twice.
You could also attempt to use a method that requires &mut self to see that the immutable borrow is still held by the guard, which has not been dropped:
pub fn lock_and_execute() {
let mut shared = Mutex::new(Something);
println!("A");
let _locked = shared.lock().unwrap();
// If `_locked` was dropped, then we can borrow `shared` mutably:
println!("B");
shared.get_mut().unwrap();
println!("C");
}
error[E0502]: cannot borrow `shared` as mutable because it is also borrowed as immutable
--> src/main.rs:13:5
|
9 | let _locked = shared.lock().unwrap();
| ------ immutable borrow occurs here
...
13 | shared.get_mut().unwrap();
| ^^^^^^^^^^^^^^^^ mutable borrow occurs here
...
16 | }
| - immutable borrow might be used here, when `_locked` is dropped and runs the `Drop` code for type `std::sync::MutexGuard`
See also:
Where is a MutexGuard if I never assign it to a variable?
How to lock a Rust struct the way a struct is locked in Go?
Why does _ destroy at the end of statement?

Borrowing error when pushing reference into vector that is on the same scope [duplicate]

For reasons related to code organization, I need the compiler to accept the following (simplified) code:
fn f() {
let mut vec = Vec::new();
let a = 0;
vec.push(&a);
let b = 0;
vec.push(&b);
// Use `vec`
}
The compiler complains
error: `a` does not live long enough
--> src/main.rs:8:1
|
4 | vec.push(&a);
| - borrow occurs here
...
8 | }
| ^ `a` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
error: `b` does not live long enough
--> src/main.rs:8:1
|
6 | vec.push(&b);
| - borrow occurs here
7 | // Use `vec`
8 | }
| ^ `b` dropped here while still borrowed
|
= note: values in a scope are dropped in the opposite order they are created
However, I'm having a hard time convincing the compiler to drop the vector before the variables it references. vec.clear() doesn't work, and neither does drop(vec). mem::transmute() doesn't work either (to force vec to be 'static).
The only solution I found was to transmute the reference into &'static _. Is there any other way? Is it even possible to compile this in safe Rust?
Is it even possible to compile this in safe Rust?
No. What you are trying to do is inherently unsafe in the general case.
The collection contains a reference to a variable that will be dropped before the collection itself is dropped. This means that the destructor of the collection has access to references that are no longer valid. The destructor could choose to dereference one of those values, breaking Rust's memory safety guarantees.
note: values in a scope are dropped in the opposite order they are created
As the compiler tells you, you need to reorder your code. You didn't actually say what the limitations are for "reasons related to code organization", but the straightforward fix is:
fn f() {
let a = 0;
let b = 0;
let mut vec = Vec::new();
vec.push(&a);
vec.push(&b);
}
A less obvious one is:
fn f() {
let a;
let b;
let mut vec = Vec::new();
a = 0;
vec.push(&a);
b = 0;
vec.push(&b);
}
That all being said, with non-lexical lifetimes (the default since Rust 1.31 for the 2018 edition and since Rust 1.36 for the 2015 edition), your original code works! The borrow checker is more granular about how long a value needs to live.
But wait; I just said that the collection might access invalid memory if a value inside it is dropped before the collection, and now the compiler is allowing that to happen? What gives?
This is because the standard library pulls a sneaky trick on us. Collections like Vec or HashSet guarantee that they do not access their generic parameters in the destructor. They communicate this to the compiler using the unstable #[may_dangle] feature.
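For reference, here is the original ordering as a sketch that should compile on a current compiler. The function name and the summing are made up so that the vector is actually used.

```rust
fn sum_of_late_values() -> i32 {
    // The container is created before the values it will reference.
    let mut vec = Vec::new();
    let a = 1;
    vec.push(&a);
    let b = 2;
    vec.push(&b);

    // Last use of `vec`. It is dropped after `a` and `b`, but `Vec` promises
    // not to look at its elements while being dropped.
    let total: i32 = vec.iter().map(|r| **r).sum();
    total
}

fn main() {
    println!("{}", sum_of_late_values());
}
```

A container of your own with a Drop implementation would not get this treatment, since the compiler has to assume that its destructor may read the stored references.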
See also:
Moved variable still borrowing after calling `drop`?
"cannot move out of variable because it is borrowed" when rotating variables

Is it possible to share data with threads without any cloning?

When I'm delegating work to threads I often have a piece of data that will outlive all of the threads, such as numbers in the following example:
use std::thread;
fn main() {
let numbers = vec![1, 2, 3];
let thread_a = thread::spawn(|| println!("{}", numbers.len()));
let thread_b = thread::spawn(|| println!("{}", numbers.len()));
thread_a.join().unwrap();
thread_b.join().unwrap();
}
It's not modified anywhere, and because of the joins, it's guaranteed that the threads are done using it. However, Rust's borrow checker is not able to tell:
error[E0373]: closure may outlive the current function, but it borrows `numbers`, which is owned by the current function
--> src/main.rs:6:34
|
6 | let thread_a = thread::spawn(|| println!("{}", numbers.len()));
| ^^ ------- `numbers` is borrowed here
| |
| may outlive borrowed value `numbers`
|
note: function requires argument type to outlive `'static`
--> src/main.rs:6:20
|
6 | let thread_a = thread::spawn(|| println!("{}", numbers.len()));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `numbers` (and any other referenced variables), use the `move` keyword
|
6 | let thread_a = thread::spawn(move || println!("{}", numbers.len()));
| ^^^^^^^
The solutions I've seen so far all involve cloning the piece of data (or cloning an Arc of the data). Is it possible to do it without any cloning, though?
You might have the wrong idea: cloning an Arc is just incrementing a reference counter and making a copy of a pointer; it doesn't perform any additional allocation. Of course, creating the Arc involves an allocation, but then, you're already allocating in order to construct the Vec, so one additional fixed-size allocation isn't likely to hurt.
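A quick sketch of what cloning an Arc does: the function name is made up, and Arc::ptr_eq is used only to show that both handles refer to the same allocation.

```rust
use std::sync::Arc;
use std::thread;

fn len_via_arc() -> (usize, bool) {
    let numbers = Arc::new(vec![1, 2, 3]);
    // Bumps the reference count and copies a pointer; the Vec is not copied.
    let for_thread = Arc::clone(&numbers);
    let same_allocation = Arc::ptr_eq(&numbers, &for_thread);

    let handle = thread::spawn(move || for_thread.len());
    let len = handle.join().unwrap();
    (len, same_allocation)
}

fn main() {
    println!("{:?}", len_via_arc());
}
```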
If all you really need is the length, you can just compute that outside the thread's closure and store it in a variable; a usize has no problems crossing a thread boundary.
The issue is that the compiler is unable to infer from the use of join() that a given thread is bound to a limited lifetime... it doesn't even try.
Before Rust 1.0, there was a thread::scoped constructor that allowed you to pass in non-'static references, but it had to be de-stabilised due to a memory safety issue. See "How can I pass a reference to a stack variable to a thread?" for alternatives.
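On newer toolchains (Rust 1.63 and later), the standard library offers std::thread::scope, which covers this case without any Arc or cloning. A minimal sketch, with a made-up function name:

```rust
use std::thread;

fn lengths_from_scoped_threads() -> (usize, usize) {
    let numbers = vec![1, 2, 3];
    // `thread::scope` joins every thread spawned on `s` before it returns,
    // so the closures may borrow `numbers` directly.
    let result = thread::scope(|s| {
        let a = s.spawn(|| numbers.len());
        let b = s.spawn(|| numbers.len());
        (a.join().unwrap(), b.join().unwrap())
    });
    result
}

fn main() {
    println!("{:?}", lengths_from_scoped_threads());
}
```

Because the scope itself guarantees the joins, the compiler can accept the borrows without a 'static bound.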

Resources