Is it possible to share data with threads without any cloning? - multithreading

When I'm delegating work to threads I often have a piece of data that will outlive all of the threads, such as numbers in the following example:
use std::thread;
fn main() {
    let numbers = vec![1, 2, 3];
    let thread_a = thread::spawn(|| println!("{}", numbers.len()));
    let thread_b = thread::spawn(|| println!("{}", numbers.len()));
    thread_a.join().unwrap();
    thread_b.join().unwrap();
}
It's not modified anywhere, and because of the joins, it's guaranteed that the threads are done using it. However, Rust's borrow checker is not able to tell:
error[E0373]: closure may outlive the current function, but it borrows `numbers`, which is owned by the current function
 --> src/main.rs:6:34
  |
6 |     let thread_a = thread::spawn(|| println!("{}", numbers.len()));
  |                                  ^^                ------- `numbers` is borrowed here
  |                                  |
  |                                  may outlive borrowed value `numbers`
  |
note: function requires argument type to outlive `'static`
 --> src/main.rs:6:20
  |
6 |     let thread_a = thread::spawn(|| println!("{}", numbers.len()));
  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `numbers` (and any other referenced variables), use the `move` keyword
  |
6 |     let thread_a = thread::spawn(move || println!("{}", numbers.len()));
  |                                  ^^^^^^^
The solutions I've seen so far all involve cloning the piece of data (or cloning an Arc of the data). Is it possible to do it without any cloning, though?

You might have the wrong idea: cloning an Arc is just incrementing a reference counter and making a copy of a pointer; it doesn't perform any additional allocation. Of course, creating the Arc involves an allocation, but then, you're already allocating in order to construct the Vec, so one additional fixed-size allocation isn't likely to hurt.
If all you really need is the length, you can just compute that outside the thread's closure and store it in a variable; a usize has no problems crossing a thread boundary.
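As a minimal sketch of the two options above (the function names `lengths_via_arc` and `length_by_value` are made up for illustration): cloning the Arc hands each thread a pointer to the same allocation, and a precomputed usize can simply be moved into the closure.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: shares one Vec between two threads via Arc.
// Returns both lengths plus whether the clone points at the same allocation.
fn lengths_via_arc() -> (usize, usize, bool) {
    let numbers = Arc::new(vec![1, 2, 3]);
    // Arc::clone copies a pointer and bumps a counter; the Vec is not copied.
    let for_a = Arc::clone(&numbers);
    let for_b = Arc::clone(&numbers);
    let same_allocation = Arc::ptr_eq(&numbers, &for_a);
    let thread_a = thread::spawn(move || for_a.len());
    let thread_b = thread::spawn(move || for_b.len());
    (
        thread_a.join().unwrap(),
        thread_b.join().unwrap(),
        same_allocation,
    )
}

// Hypothetical helper: only the length crosses the thread boundary.
fn length_by_value() -> usize {
    let numbers = vec![1, 2, 3];
    let len = numbers.len(); // usize is Copy and 'static
    thread::spawn(move || len).join().unwrap()
}

fn main() {
    println!("{:?}", lengths_via_arc());
    println!("{}", length_by_value());
}
```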
The issue is that the compiler is unable to infer from the use of join() that a given thread is bound to a limited lifetime... it doesn't even try.
Before Rust 1.0, there was a thread::scoped constructor that allowed you to pass in non-'static references, but it had to be de-stabilised due to a memory safety issue. See "How can I pass a reference to a stack variable to a thread?" for alternatives.
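One such alternative, on newer toolchains, is std::thread::scope (stable since Rust 1.63, so not available when this answer was written). The sketch below assumes such a compiler; the wrapper function `lengths_via_scope` is a made-up name. Every thread spawned inside the scope is joined before the scope returns, which is what lets the closures borrow `numbers` without cloning.

```rust
use std::thread;

// Hypothetical helper: both threads borrow `numbers` directly, no Arc and no clone.
fn lengths_via_scope() -> (usize, usize) {
    let numbers = vec![1, 2, 3];
    let lengths = thread::scope(|s| {
        // These closures capture `numbers` by reference.
        let thread_a = s.spawn(|| numbers.len());
        let thread_b = s.spawn(|| numbers.len());
        (thread_a.join().unwrap(), thread_b.join().unwrap())
    });
    // `numbers` is still owned by this function after the scope ends.
    assert_eq!(numbers.len(), 3);
    lengths
}

fn main() {
    println!("{:?}", lengths_via_scope());
}
```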

Related

Unreasonable "cannot borrow `a` as immutable because it is also borrowed as mutable"?

I have seen "cannot borrow as immutable because it is also borrowed as mutable", and my question is not a duplicate, since my code has Non-Lexical Lifetimes enabled.
I'm wondering if there is a fundamental reason why the following code:
fn f1(a: &u32) {
    print!("{:?}", a);
}
fn main() {
    let mut a = 3;
    let b = &mut a;
    f1(&a);
    *b += 1;
    print!("{:?}", b);
}
must result in the following error:
error[E0502]: cannot borrow `a` as immutable because it is also borrowed as mutable
  --> src/bin/client/main.rs:91:6
   |
90 |   let b = &mut a;
   |           ------ mutable borrow occurs here
91 |   f1(&a);
   |      ^^ immutable borrow occurs here
92 |   *b += 1;
   |   ------- mutable borrow later used here
Now, I know that on the line f1(&a), we'll have one mutable reference (b) and one immutable reference (&a), and according to these rules this can't happen. But having 1 mutable and 1 immutable reference can only cause a problem if their usages are interleaved, right? That is, in theory, shouldn't Rust be able to observe that b is not used within &a's existence, and thus accept this program?
Is this just a limitation of the compiler? Or am I overlooking some other memory danger here?
That is, in theory, shouldn't Rust be able to observe that b is not used within &a's existence, and thus accept this program?
Maybe, though it's possible that there are edge cases where this would be a problem. I would expect optimisations to be an issue here: for example, once Rust is finally able to tag &mut as noalias without LLVM immediately miscompiling things, your code would be UB if it were allowed.
Is this just a limitation of the compiler?
In this case no, it's literally a limitation of the language specification. There are situations which are limitations of the compiler, like loop mutations, but here you're trying to do something the language's rules explicitly and specifically forbid.
Even Polonius will not change that.
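For comparison, here is a minimal sketch of a variant that the borrow checker does accept: instead of creating a second, independent borrow of `a`, the shared reference is reborrowed from `b` itself, so `b` is only temporarily frozen for the duration of the call. The wrapper `read_then_bump` is a made-up name, and `f1` returns the value here (rather than printing it) only so the result can be checked.

```rust
fn f1(a: &u32) -> u32 {
    *a
}

// Hypothetical helper: reads through a reborrow of `b`, then mutates through `b`.
fn read_then_bump() -> (u32, u32) {
    let mut a = 3;
    let b = &mut a;
    // `&*b` is a shared reborrow derived from `b`; it ends when `f1` returns.
    let seen = f1(&*b);
    *b += 1;
    (seen, *b)
}

fn main() {
    println!("{:?}", read_then_bump());
}
```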

Rust - writing to indices of a vector across multiple threads

I have a circular ring buffer (implemented as a vector) where I want one thread to periodically write to the ring buffer and another to periodically read from the ring buffer. Is it possible to create a vector that can be read and written at the same time so long as threads accessing the vector are not at the same index?
What I am hoping to achieve:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
    let vec = Arc::new(vec![Mutex::new(1), Mutex::new(2), Mutex::new(3)]);
    {
        let vec = vec.clone();
        thread::spawn(move|| {
            let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
            s2 = 7;
        });
    }
    println!("{}", vec[2].lock().unwrap());
}
Compiler output is:
   Compiling playground v0.0.1 (/playground)
warning: variable `s2` is assigned to, but never used
  --> src/main.rs:12:21
   |
12 |             let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
   |                     ^^
   |
   = note: `#[warn(unused_variables)]` on by default
   = note: consider using `_s2` instead
warning: value assigned to `s2` is never read
  --> src/main.rs:13:13
   |
13 |             s2 = 7;
   |             ^^
   |
   = note: `#[warn(unused_assignments)]` on by default
   = help: maybe it is overwritten before being read?
error[E0596]: cannot borrow data in an `Arc` as mutable
  --> src/main.rs:12:27
   |
12 |             let mut s2 = *vec.get_mut(2).unwrap().lock().unwrap();
   |                           ^^^ cannot borrow as mutable
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::sync::Arc<std::vec::Vec<std::sync::Mutex<i32>>>`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0596`.
error: could not compile `playground`.
To learn more, run the command again with --verbose.
Foiled by the Rust type system trying to prevent a race condition :(
What I don't want
An implementation in which a single lock covers the whole vector.
An atomic read and write to the vector is not an option since the vector will contain images.
Link to playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=5b5efe91bdd45c658d11f1cefb16045e
First, I would recommend std::sync::RwLock, because it allows multiple readers to read data simultaneously.
Second, spawning threads can become a performance bottleneck in your code. Consider using a thread pool.
Of course, the exact choice depends on benchmark results; these are only general recommendations.
Your code is mostly correct, except for one crucial part. You are using Mutex, which implements the interior mutability pattern and also provides thread safety.
Interior mutability moves the check of the XOR borrowing rule (either N immutable borrows or exactly one mutable borrow) from compile time to run time. A Mutex therefore ensures that at any moment at most one thread, reader or writer, has access to the data.
When you try to get a mutable reference from vec, like this:
vec.get_mut(..)
you are essentially ignoring the benefits provided by interior mutability. The compiler can't guarantee that the XOR rule is upheld, because you are trying to borrow vec mutably while it is shared through an Arc.
The obvious solution is to borrow vec immutably and rely on the Mutex, rather than on the compiler's borrowing rules, to guard against race conditions:
let mut s2 = vec
    .get(2)    // Get an immutable reference to the element at index 2.
    .unwrap()  // Ensure that it exists.
    .lock()    // Lock the mutex.
    .unwrap(); // Ensure the mutex isn't poisoned.
// s2 is now a `std::sync::MutexGuard<i32>`, which implements `std::ops::DerefMut`,
// so it can give us a mutable reference to the data.
*s2 = 7;
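Put together, a minimal runnable sketch of the per-element locking approach might look like the following. The function name `write_slot` and the variable names are made up for illustration, and the writer thread is joined here only so that the final read is deterministic.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: one Mutex per slot, so locking slot 2 leaves slots 0 and 1 free.
fn write_slot() -> Vec<i32> {
    let slots = Arc::new(vec![Mutex::new(1), Mutex::new(2), Mutex::new(3)]);
    let writer = {
        let slots = Arc::clone(&slots);
        thread::spawn(move || {
            // Shared access to the Vec, exclusive access to one element.
            let mut slot = slots[2].lock().unwrap();
            *slot = 7;
        })
    };
    // Wait for the writer so the read below observes the new value.
    writer.join().unwrap();
    slots.iter().map(|m| *m.lock().unwrap()).collect()
}

fn main() {
    println!("{:?}", write_slot());
}
```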

Borrowing error when pushing reference into vector that is on the same scope [duplicate]

For reasons related to code organization, I need the compiler to accept the following (simplified) code:
fn f() {
    let mut vec = Vec::new();
    let a = 0;
    vec.push(&a);
    let b = 0;
    vec.push(&b);
    // Use `vec`
}
The compiler complains
error: `a` does not live long enough
 --> src/main.rs:8:1
  |
4 |     vec.push(&a);
  |               - borrow occurs here
...
8 | }
  | ^ `a` dropped here while still borrowed
  |
  = note: values in a scope are dropped in the opposite order they are created
error: `b` does not live long enough
 --> src/main.rs:8:1
  |
6 |     vec.push(&b);
  |               - borrow occurs here
7 |     // Use `vec`
8 | }
  | ^ `b` dropped here while still borrowed
  |
  = note: values in a scope are dropped in the opposite order they are created
However, I'm having a hard time convincing the compiler to drop the vector before the variables it references. vec.clear() doesn't work, and neither does drop(vec). mem::transmute() doesn't work either (to force vec to be 'static).
The only solution I found was to transmute the reference into &'static _. Is there any other way? Is it even possible to compile this in safe Rust?
Is it even possible to compile this in safe Rust?
No. What you are trying to do is inherently unsafe in the general case.
The collection contains a reference to a variable that will be dropped before the collection itself is dropped. This means that the destructor of the collection has access to references that are no longer valid. The destructor could choose to dereference one of those values, breaking Rust's memory safety guarantees.
note: values in a scope are dropped in the opposite order they are created
As the compiler tells you, you need to reorder your code. You didn't actually say what the limitations are for "reasons related to code organization", but the straightforward fix is:
fn f() {
    let a = 0;
    let b = 0;
    let mut vec = Vec::new();
    vec.push(&a);
    vec.push(&b);
}
A less obvious one is:
fn f() {
    let a;
    let b;
    let mut vec = Vec::new();
    a = 0;
    vec.push(&a);
    b = 0;
    vec.push(&b);
}
That all being said, once non-lexical lifetimes are enabled, your original code will work! The borrow checker becomes more granular about how long a value needs to live.
But wait; I just said that the collection might access invalid memory if a value inside it is dropped before the collection, and now the compiler is allowing that to happen? What gives?
This is because the standard library pulls a sneaky trick on us. Collections like Vec or HashSet guarantee that they do not access their generic parameters in the destructor. They communicate this to the compiler using the unstable #[may_dangle] feature.
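The following is a small sketch of both points, assuming a compiler with non-lexical lifetimes. `Noisy`, `drop_order` and `sum_of_late_borrows` are made-up names for illustration. The first function uses the question's original ordering with a plain Vec; the second uses a type with its own Drop impl to make the reverse drop order visible.

```rust
use std::cell::RefCell;

// Hypothetical type: records its name in a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// The question's ordering: the vector is declared before the values it borrows.
fn sum_of_late_borrows() -> i32 {
    let mut vec = Vec::new();
    let a = 1;
    vec.push(&a);
    let b = 2;
    vec.push(&b);
    vec.iter().map(|x| **x).sum()
}

// Values in a scope are dropped in the opposite order they are created.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
    }
    log.into_inner()
}

fn main() {
    println!("{}", sum_of_late_borrows());
    println!("{:?}", drop_order());
}
```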
See also:
Moved variable still borrowing after calling `drop`?
"cannot move out of variable because it is borrowed" when rotating variables

Why does Rust want to borrow a variable as mutable more than once at a time?

I'm attempting to implement a dynamic programming problem in Rust to gain familiarity with the language. Like many dynamic programming problems, this uses memoization to reduce the running time. Unfortunately, my first-pass solution yields errors. I've pared the code down to the following. Warning - it's now a bit nonsensical:
use std::collections::HashMap;
fn repro<'m>(memo: &'m mut HashMap<i32, Vec<i32>>) -> Option<&'m Vec<i32>> {
    {
        let script_a = repro(memo);
        let script_b = repro(memo);
    }
    memo.get(&0)
}
fn main() {}
The compilation error is:
error[E0499]: cannot borrow `*memo` as mutable more than once at a time
 --> src/main.rs:6:30
  |
5 |         let script_a = repro(memo);
  |                              ---- first mutable borrow occurs here
6 |         let script_b = repro(memo);
  |                              ^^^^ second mutable borrow occurs here
7 |     }
  |     - first borrow ends here
Why is the variable memo borrowed multiple times? In my view, it should be borrowed once when I compute script_a, then that borrow ends, then it gets borrowed again for script_b.
Currently, borrows last for the block they are defined in (#9113 might change this if implemented).
The problem is that script_a, which holds a reference into the map and so keeps the first mutable borrow alive, is valid for the whole block, and you then try to borrow the same map mutably again:
let script_a = repro(memo);
let script_b = repro(memo);
// script_a is still alive
The bigger problem is the infinite recursion. In any case, script_a is a reference to data inside the hash map, so the map is still borrowed by the time you call let script_b = repro(memo);.
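One way to sidestep the overlapping borrows, sketched below under the assumption that cloning the cached values is acceptable, is to have the recursive function return owned data, so that no reference into the map outlives each call. The function `script` and the way it combines sub-results are invented for illustration and are not the asker's actual algorithm.

```rust
use std::collections::HashMap;

// Hypothetical memoized function: returns an owned Vec, so the borrow of
// `memo` ends as soon as each call returns.
fn script(n: i32, memo: &mut HashMap<i32, Vec<i32>>) -> Vec<i32> {
    if let Some(cached) = memo.get(&n) {
        return cached.clone();
    }
    let result = if n <= 0 {
        vec![0] // base case, which also terminates the recursion
    } else {
        let mut a = script(n - 1, memo);
        let b = script(n - 2, memo);
        a.extend(b); // arbitrary combination step, for illustration only
        a
    };
    memo.insert(n, result.clone());
    result
}

fn main() {
    let mut memo = HashMap::new();
    println!("{:?}", script(2, &mut memo));
}
```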
