How do I share a mutable object between threads using Arc?

I'm trying to share a mutable object between threads in Rust using Arc, but I get this error:
error[E0596]: cannot borrow data in a `&` reference as mutable
--> src/main.rs:11:13
|
11 | shared_stats_clone.add_stats();
| ^^^^^^^^^^^^^^^^^^ cannot borrow as mutable
This is the sample code:
use std::{sync::Arc, thread};
fn main() {
let total_stats = Stats::new();
let shared_stats = Arc::new(total_stats);
let threads = 5;
for _ in 0..threads {
let mut shared_stats_clone = shared_stats.clone();
thread::spawn(move || {
shared_stats_clone.add_stats();
});
}
}
struct Stats {
hello: u32,
}
impl Stats {
pub fn new() -> Stats {
Stats { hello: 0 }
}
pub fn add_stats(&mut self) {
self.hello += 1;
}
}
What can I do?

Arc's documentation says:
Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc. If you need to mutate through an Arc, use Mutex, RwLock, or one of the Atomic types.
You will likely want a Mutex combined with an Arc:
use std::{
sync::{Arc, Mutex},
thread,
};
struct Stats;
impl Stats {
fn add_stats(&mut self, _other: &Stats) {}
}
fn main() {
let shared_stats = Arc::new(Mutex::new(Stats));
let threads = 5;
for _ in 0..threads {
let my_stats = shared_stats.clone();
thread::spawn(move || {
let mut shared = my_stats.lock().unwrap();
shared.add_stats(&Stats);
});
// Note: Immediately joining, no multithreading happening!
// THIS WAS A LIE, see below
}
}
This is largely cribbed from the Mutex documentation.
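The documentation quoted above also mentions the atomic types. For a plain counter like the one in the question, a minimal sketch could look like the following. It assumes that `hello` is the only field that needs updating; the `hello()` getter and the `count_with_threads` helper are hypothetical names added for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

struct Stats {
    hello: AtomicU32,
}

impl Stats {
    fn new() -> Stats {
        Stats { hello: AtomicU32::new(0) }
    }

    // Takes `&self`, not `&mut self`: the atomic provides interior mutability.
    fn add_stats(&self) {
        self.hello.fetch_add(1, Ordering::Relaxed);
    }

    // Hypothetical getter, added for this sketch.
    fn hello(&self) -> u32 {
        self.hello.load(Ordering::Relaxed)
    }
}

// Hypothetical helper: spawns `threads` workers and returns the final count.
fn count_with_threads(threads: usize) -> u32 {
    let shared = Arc::new(Stats::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let stats = Arc::clone(&shared);
            thread::spawn(move || stats.add_stats())
        })
        .collect();
    // Keep the handles and join them so every increment has happened.
    for handle in handles {
        handle.join().unwrap();
    }
    shared.hello()
}

fn main() {
    println!("{}", count_with_threads(5)); // prints 5
}
```

This avoids locking entirely, but it only fits when each update is a single atomic operation; for anything involving several fields a Mutex is the simpler choice.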
How can I use shared_stats after the for? (I'm talking about the Stats object). It seems that the shared_stats cannot be easily converted to Stats.
As of Rust 1.15, it's possible to get the value back. See my additional answer for another solution as well.
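As a rough sketch of getting the value back: once every thread has been joined and all other clones of the Arc are gone, `Arc::try_unwrap` followed by `Mutex::into_inner` can return the plain `Stats`. The `collect_stats` helper is a hypothetical name, and the sketch assumes no other clone of the Arc is kept alive anywhere.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug)]
struct Stats {
    hello: u32,
}

// Hypothetical helper: runs the workers and returns the unwrapped Stats.
fn collect_stats(threads: usize) -> Stats {
    let shared = Arc::new(Mutex::new(Stats { hello: 0 }));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let my_stats = Arc::clone(&shared);
            thread::spawn(move || {
                my_stats.lock().unwrap().hello += 1;
            })
        })
        .collect();
    // Joining guarantees each thread has finished and dropped its clone.
    for handle in handles {
        handle.join().unwrap();
    }
    // Only one strong reference is left, so try_unwrap succeeds.
    let mutex = Arc::try_unwrap(shared).expect("other Arc clones still exist");
    // into_inner consumes the Mutex and returns the protected value.
    mutex.into_inner().unwrap()
}

fn main() {
    let stats = collect_stats(5);
    println!("{:?}", stats); // prints Stats { hello: 5 }
}
```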
[A comment in the example] says that there is no multithreading. Why?
Because I got confused! :-)
In the example code, the result of thread::spawn (a JoinHandle) is immediately dropped because it's not stored anywhere. When the handle is dropped, the thread is detached and may or may not ever finish. I was confusing it with JoinGuard, an old, removed API that joined when it was dropped. Sorry for the confusion!
For a bit of editorial, I suggest avoiding mutability completely:
use std::{ops::Add, thread};
#[derive(Debug)]
struct Stats(u64);
// Implement addition on our type
impl Add for Stats {
type Output = Stats;
fn add(self, other: Stats) -> Stats {
Stats(self.0 + other.0)
}
}
fn main() {
let threads = 5;
// Start threads to do computation
let threads: Vec<_> = (0..threads).map(|_| thread::spawn(|| Stats(4))).collect();
// Join all the threads, fail if any of them failed
let result: Result<Vec<_>, _> = threads.into_iter().map(|t| t.join()).collect();
let result = result.unwrap();
// Add up all the results
let sum = result.into_iter().fold(Stats(0), |sum, i| sum + i);
println!("{:?}", sum);
}
Here, we keep each JoinHandle and then wait for all the threads to finish. We then collect the results and add them all up. This is the common map-reduce pattern. Note that no thread needs any mutability; it all happens in the main thread.

Related

Proper way to share references to Vec between threads

I am new to Rust and I am attempting to create a Vec that will live on the main thread, and pass a reference to another thread, which then pushes members onto the vector, for the main thread to use.
use std::{thread};
fn main() {
let mut v: Vec<u8> = Vec::new();
let _ = thread::spawn(move || {
vec_push(&mut v, 0)
});
for i in v.iter_mut() {
println!("poo {}", i);
}
}
fn vec_push(v: &mut Vec<u8>, n: u8) {
v.push(n);
}
This is a simplified version of what I am trying to do. In my main code I want it to be a Vec of TcpStreams.
I think this post would also apply to maintaining a struct (that doesn't implement Copy) between threads.
I get this error
error[E0382]: borrow of moved value: `v`
--> src/main.rs:8:11
|
4 | let mut v: Vec<u8> = Vec::new();
| ----- move occurs because `v` has type `Vec<u8>`, which does not implement the `Copy` trait
5 | let _ = thread::spawn(move || {
| ------- value moved into closure here
6 | vec_push(&mut v, 0)
| - variable moved due to use in closure
7 | });
8 | for i in v.iter_mut() {
| ^^^^^^^^^^^^ value borrowed here after move
Is there a better way to do this? Am I missing some basic concept?
Any help would be useful. I am used to C, where I can just throw around references willy-nilly.
What you are doing is wildly unsound. You are trying to have two mutable references to an object, which is strictly forbidden in Rust. Rust forbids this to prevent data races that would result in memory unsafety.
If you want to mutate an object from different threads you have to synchronize it somehow. The easiest way to do it is by using a Mutex. This probably won't be very efficient in a high-contention scenario (as locking a mutex can become your bottleneck), but it will be safe.
To share this Mutex between threads you can wrap it in an Arc (an atomically reference-counted smart pointer). So your code can be transformed to something like this:
use std::thread;
use std::sync::{Arc, Mutex};
fn main() {
let v = Arc::new(Mutex::new(Vec::new()));
let v_clone = Arc::clone(&v);
let t = thread::spawn(move || {
vec_push(v_clone, 0)
});
t.join().unwrap();
for i in v.lock().unwrap().iter_mut() {
println!("poo {}", i);
}
}
fn vec_push(v: Arc<Mutex<Vec<u8>>>, n: u8) {
v.lock().unwrap().push(n);
}
You probably will also want to join your spawned thread, so you should name it.
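If the spawned thread only needs the vector for a bounded period, scoped threads from the standard library are another option. This is a minimal sketch, assuming Rust 1.63 or later (where `std::thread::scope` is available); `fill_in_thread` is a hypothetical helper name. The scope joins every thread before it returns, so a plain `&mut` borrow is accepted without Arc or Mutex.

```rust
use std::thread;

fn vec_push(v: &mut Vec<u8>, n: u8) {
    v.push(n);
}

// Hypothetical helper: lets one scoped thread fill the vector, then returns it.
fn fill_in_thread() -> Vec<u8> {
    let mut v: Vec<u8> = Vec::new();
    thread::scope(|s| {
        // The closure borrows `v` mutably; the borrow ends when the scope does.
        s.spawn(|| vec_push(&mut v, 0));
    });
    // All scoped threads have been joined here, so `v` is usable again.
    v
}

fn main() {
    for i in fill_in_thread().iter() {
        println!("item {}", i);
    }
}
```

This only works while a single thread mutates the vector at a time; if the main thread must also touch it while the worker runs, the Arc and Mutex version above is still needed.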

Unsafely define struct to share mutable references between threads

I need to unsafely define a Rust struct that I can share between 2 threads and mutate the content of the struct from both threads.
I do not want to use Mutex or RwLock because I want to implement the thread safety myself. For performance reasons, I do not want to check the mutex every time I access the content when I know it is not in the critical section.
If I only use Arc to share the struct between threads, I get cannot borrow data in an Arc as mutable and help: trait DerefMut is required to modify through a dereference, but it is not implemented for std::sync::Arc<Foo>.
The safe way to do this:
struct Foo {
bar: usize,
}
impl Foo {
pub fn set_bar(&mut self, a: usize) {
self.bar = a;
}
}
fn main() {
let mut foo = Foo { bar: 32 };
foo.bar = 33;
let foo_arc = std::sync::Arc::new(std::sync::Mutex::new(foo));
let foo_arc_2 = std::sync::Arc::clone(&foo_arc);
let handle = std::thread::spawn(move || {
foo_arc_2.lock().unwrap().set_bar(32);
});
foo_arc.lock().unwrap().set_bar(31);
handle.join().unwrap();
}
What I unsafely want to achieve:
struct Foo {
bar: usize,
// My own lock
// lock: std::sync::Mutex<usize>,
}
unsafe impl Sync for Foo {}
impl Foo {
pub fn set_bar(&mut self, a: usize) {
self.bar = a;
}
}
fn main() {
let mut foo = Foo { bar: 32 };
foo.bar = 33;
let foo_arc = std::sync::Arc::new(foo);
let foo_arc_2 = std::sync::Arc::clone(&foo_arc);
let handle = std::thread::spawn(move || {
foo_arc_2.set_bar(32);
});
foo_arc.set_bar(31);
handle.join().unwrap();
}
I might not have to use Arc and use something more low level unknown to me at the moment.
If you want to do this to later use it in production, don't do it! Many people smarter than you and me have already done this correctly. Use what they wrote instead. If you want to do this as an exercise, or for learning purposes, then go ahead and do it.
If you want to provide a type with interior mutability then you must use UnsafeCell. This type is at the core of every form of interior mutability in Rust, and using it is the only way to get a &mut T from &T. You should read its documentation very carefully, along with the documentation of the cell module and The Nomicon (preferably all of it, but at least the concurrency chapter).
If you prefer watching videos, Jon Gjengset has, among many others, this amazing video on cell types, and this video on atomic memory and implementing a (bad) mutex.
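For learning purposes only, here is a rough sketch of what building on UnsafeCell can look like: a toy spin lock. `SpinLock`, `with_lock`, and `run` are hypothetical names invented for this illustration, not an existing API, and the design is deliberately naive (no fairness, no poisoning).

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical toy lock; not suitable for production.
struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: `value` is only reachable through `with_lock`, which admits
// one thread at a time.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    // Runs `f` with exclusive access to the protected value.
    // Caveat: if `f` panics, the lock is never released; a real
    // implementation would release it from a guard's Drop impl.
    fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Spin until we flip `locked` from false to true.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // SAFETY: we hold the lock, so no other reference to the value exists.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

struct Foo {
    bar: usize,
}

// Hypothetical driver: two threads each increment `bar` 1000 times.
fn run() -> usize {
    let foo = Arc::new(SpinLock::new(Foo { bar: 0 }));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let foo = Arc::clone(&foo);
            thread::spawn(move || {
                for _ in 0..1000 {
                    foo.with_lock(|f| f.bar += 1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    foo.with_lock(|f| f.bar)
}

fn main() {
    println!("{}", run()); // prints 2000
}
```

Note that the `unsafe impl Sync` is attached to the wrapper that enforces the locking, not to `Foo` itself; the unsafe code is sound only because every access goes through `with_lock`.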

Vector holding mutable functions

I would like to have a vector with functions. Then I would like to iterate on this vector and execute functions one by one. The functions would mutate an external state. Additionally, I would like to be able to place the same function twice in the vector.
The problems I have are:
I cannot dereference and execute the function from the vector,
Adding the same function to the vector twice fails with the understandable error that I cannot have two mutable references.
The closest I got is:
fn main() {
let mut c = 0;
{
let mut f = ||{c += 1};
let mut v: Vec<&mut FnMut()> = vec![];
v.push(&mut f);
// How to execute the stored function? The following complains about
// an immutable reference:
// assignment into an immutable reference
// (v[0])();
// How to store the same function twice? The following will fail with:
// cannot borrow `f` as mutable more than once at a time
// v.push(&mut f);
}
println!("c {}", c);
}
For the first problem, I don't really know why no mutable dereference happens here (in my opinion, it should), but there is a simple workaround: just do the dereference and then reference manually:
(&mut *v[0])();
Your second problem is more complex, though. There is no simple solution, because what you're trying to do violates Rust aliasing guarantees, and since you did not describe the purpose of it, I can't suggest alternatives properly. In general, however, you can overcome this error by switching to runtime borrow-checking with Cell/RefCell or Mutex (the latter is when you need concurrent access). With Cell (works nice for primitives):
use std::cell::Cell;
fn main() {
let c = Cell::new(0);
{
let f = || { c.set(c.get() + 1); };
let mut v: Vec<&dyn Fn()> = vec![];
v.push(&f);
v.push(&f);
v[0]();
v[1]();
}
println!("c {}", c.get());
}
With RefCell (works nice for more complex types):
use std::cell::RefCell;
fn main() {
let c = RefCell::new(0);
{
let f = || { *c.borrow_mut() += 1; };
let mut v: Vec<&dyn Fn()> = vec![];
v.push(&f);
v.push(&f);
v[0]();
v[1]();
}
println!("c {}", *c.borrow());
}
As you can see, now you have &Fn() (spelled &dyn Fn() in current Rust) instead of &mut FnMut(). Shared references can be aliased freely, and the closure's captured environment may also contain aliased references (immutable, of course).
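Another possibility, if the design allows it, is to stop capturing the state and pass it to each function explicitly. This is only a sketch under that assumption; `increment`, `double`, and `run_all` are hypothetical names. Plain function pointers are Copy, so the same function can appear in the vector any number of times.

```rust
// Hypothetical step functions that mutate state handed to them.
fn increment(state: &mut i32) {
    *state += 1;
}

fn double(state: &mut i32) {
    *state *= 2;
}

// Runs every stored function in order against the same state.
fn run_all(state: &mut i32, steps: &[fn(&mut i32)]) {
    for step in steps {
        step(state);
    }
}

fn main() {
    let mut c = 0;
    // The cast on the first element fixes the element type to a function pointer.
    let steps: Vec<fn(&mut i32)> = vec![increment as fn(&mut i32), increment, double];
    run_all(&mut c, &steps);
    println!("c {}", c); // prints "c 4"
}
```

Because the mutable borrow of the state exists only for the duration of each call, no runtime borrow checking is needed.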

How can multiple threads share an iterator?

I've been working on a function that will copy a bunch of files from a source to a destination using Rust and threads. I'm having some trouble making the threads share the iterator. I am still not used to the borrowing system:
extern crate libc;
extern crate num_cpus;
use libc::{c_char, size_t};
use std::thread;
use std::fs::copy;
fn python_str_array_2_str_vec<T, U, V>(_: T, _: U) -> V {
unimplemented!()
}
#[no_mangle]
pub extern "C" fn copyFiles(
sources: *const *const c_char,
destinies: *const *const c_char,
array_len: size_t,
) {
let src: Vec<&str> = python_str_array_2_str_vec(sources, array_len);
let dst: Vec<&str> = python_str_array_2_str_vec(destinies, array_len);
let mut iter = src.iter().zip(dst);
let num_threads = num_cpus::get();
let threads = (0..num_threads).map(|_| {
thread::spawn(|| while let Some((s, d)) = iter.next() {
copy(s, d);
})
});
for t in threads {
t.join();
}
}
fn main() {}
I'm getting this compilation error that I have not been able to solve:
error[E0597]: `src` does not live long enough
--> src/main.rs:20:20
|
20 | let mut iter = src.iter().zip(dst);
| ^^^ does not live long enough
...
30 | }
| - borrowed value only lives until here
|
= note: borrowed value must be valid for the static lifetime...
error[E0373]: closure may outlive the current function, but it borrows `**iter`, which is owned by the current function
--> src/main.rs:23:23
|
23 | thread::spawn(|| while let Some((s, d)) = iter.next() {
| ^^ ---- `**iter` is borrowed here
| |
| may outlive borrowed value `**iter`
|
help: to force the closure to take ownership of `**iter` (and any other referenced variables), use the `move` keyword, as shown:
| thread::spawn(move || while let Some((s, d)) = iter.next() {
I've seen the following questions already:
Value does not live long enough when using multiple threads
I'm not using chunks, I would like to try to share an iterator through the threads although creating chunks to pass them to the threads will be the classic solution.
Unable to send a &str between threads because it does not live long enough
I've seen some of the answers to use channels to communicate with the threads, but I'm not quite sure about using them. There should be an easier way of sharing just one object through threads.
Why doesn't a local variable live long enough for thread::scoped
This got my attention, scoped is supposed to fix my error, but since it is in the unstable channel I would like to see if there is another way of doing it just using spawn.
Can someone explain how should I fix the lifetimes so the iterator can be accessed from the threads?
Here's a minimal, reproducible example of your problem:
use std::thread;
fn main() {
let src = vec!["one"];
let dst = vec!["two"];
let mut iter = src.iter().zip(dst);
thread::spawn(|| {
while let Some((s, d)) = iter.next() {
println!("{} -> {}", s, d);
}
});
}
There are multiple related problems:
The iterator lives on the stack and the thread's closure takes a reference to it.
The closure takes a mutable reference to the iterator.
The iterator itself has a reference to a Vec that lives on the stack.
The Vec itself has references to string slices that likely live on the stack but are not guaranteed to live longer than the thread either way.
Said another way, the Rust compiler has stopped you from executing four separate pieces of memory unsafety.
A main thing to recognize is that any thread you spawn might outlive the place where you spawned it. Even if you call join right away, the compiler cannot statically verify that will happen, so it has to take the conservative path. This is the point of scoped threads — they guarantee the thread exits before the stack frame they were started in.
Additionally, you are attempting to use a mutable reference in multiple concurrent threads. There's zero guarantee that the iterator (or any of the iterators it was built on) can be safely called in parallel. It's entirely possible that two threads call next at exactly the same time. The two pieces of code run in parallel and write to the same memory address. One thread writes half of the data and the other thread writes the other half, and now your program crashes at some arbitrary point in the future.
Using a tool like crossbeam, your code would look something like:
use crossbeam; // 0.7.3
fn main() {
let src = vec!["one"];
let dst = vec!["two"];
let mut iter = src.iter().zip(dst);
while let Some((s, d)) = iter.next() {
crossbeam::scope(|scope| {
scope.spawn(|_| {
println!("{} -> {}", s, d);
});
})
.unwrap();
}
}
As mentioned, this will only spawn one thread at a time, waiting for it to finish. An alternative to get more parallelism (the usual point of this exercise) is to interchange the calls to next and spawn. This requires transferring ownership of s and d to the thread via the move keyword:
use crossbeam; // 0.7.3
fn main() {
let src = vec!["one", "alpha"];
let dst = vec!["two", "beta"];
let mut iter = src.iter().zip(dst);
crossbeam::scope(|scope| {
while let Some((s, d)) = iter.next() {
scope.spawn(move |_| {
println!("{} -> {}", s, d);
});
}
})
.unwrap();
}
If you add a sleep call inside the spawn, you can see the threads run in parallel.
I'd have written it using a for loop, however:
let iter = src.iter().zip(dst);
crossbeam::scope(|scope| {
for (s, d) in iter {
scope.spawn(move |_| {
println!("{} -> {}", s, d);
});
}
}).unwrap();
In the end, the iterator is exercised on the current thread, and each value returned from the iterator is then handed off to a new thread. The new threads are guaranteed to exit before the captured references become invalid.
You may be interested in Rayon, a crate that allows easy parallelization of certain types of iterators.
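If the goal really is to have a fixed number of worker threads pull from one shared iterator, one possible sketch using only the standard library wraps the iterator in a Mutex and uses scoped threads. It assumes Rust 1.63 or later for `std::thread::scope`; `process_pairs` is a hypothetical helper, and formatting a string stands in for the real work (such as copying a file).

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: `workers` threads pull pairs from one shared iterator.
fn process_pairs(src: &[&str], dst: &[&str], workers: usize) -> Vec<String> {
    // The Mutex makes calling `next` from several threads safe.
    let iter = Mutex::new(src.iter().zip(dst.iter()));
    let done = Mutex::new(Vec::new());

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // The lock guard is a temporary, released at the end of this statement.
                let next = iter.lock().unwrap().next();
                match next {
                    Some((s, d)) => done.lock().unwrap().push(format!("{} -> {}", s, d)),
                    None => break,
                }
            });
        }
    });

    // All workers have exited; take the results out of the Mutex.
    let mut done = done.into_inner().unwrap();
    done.sort(); // completion order is not deterministic
    done
}

fn main() {
    let src = vec!["one", "alpha"];
    let dst = vec!["two", "beta"];
    for line in process_pairs(&src, &dst, 2) {
        println!("{}", line);
    }
}
```

Each worker holds the lock only long enough to fetch the next item, so the actual work runs in parallel.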
See also:
How can I pass a reference to a stack variable to a thread?
Lifetime troubles sharing references between threads
How do I use static lifetimes with threads?
Thread references require static lifetime?
Lifetime woes when using threads
Cannot call a function in a spawned thread because it "does not fulfill the required lifetime"

Explain the behavior of *Rc::make_mut and why it differs compared to Mutex

I needed to pass a resource between several functions that take a closure as an argument. The data is handled inside these closures, and I wanted changes made to a variable in one place to be reflected everywhere else.
The first thing I thought was to use Rc. I had previously used Arc to handle the data between different threads, but since these functions aren't running in different threads I chose Rc instead.
The most simplified code that I have, to show my doubts:
I used RefCell because I suspected that this syntax would not work as I expected:
*Rc::make_mut(&mut rc_pref_temp)...
use std::sync::Arc;
use std::rc::Rc;
use std::sync::Mutex;
use std::cell::RefCell;
use std::cell::Cell;
fn main() {
test2();
println!("---");
test();
}
#[derive(Debug, Clone)]
struct Prefe {
name_test: RefCell<u64>,
}
impl Prefe {
fn new() -> Prefe {
Prefe {
name_test: RefCell::new(3 as u64),
}
}
}
fn test2(){
let mut prefe: Prefe = Prefe::new();
let mut rc_pref = Rc::new(Mutex::new(prefe));
println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);
let mut rc_pref_temp = rc_pref.clone();
*rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);
*rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);
println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);
}
fn test(){
let mut prefe: Prefe = Prefe::new();
let mut rc_pref = Rc::new(prefe);
println!("rc_pref: {:?}", rc_pref.name_test);
let mut rc_pref_temp = rc_pref.clone();
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);
println!("rc_pref: {:?}", rc_pref.name_test);
}
The code is simplified, the scenario where it is used is totally different. I note this to avoid comments like "you can lend a value to the function", because what interests me is to know why the cases exposed work in this way.
stdout:
rc_pref Mutex: RefCell { value: 3 }
rc_pref_clone Mutex: RefCell { value: 4 }
rc_pref_clone Mutex: RefCell { value: 5 }
rc_pref Mutex: RefCell { value: 5 }
---
rc_pref: RefCell { value: 3 }
rc_pref_clone: RefCell { value: 4 }
rc_pref_clone: RefCell { value: 5 }
rc_pref: RefCell { value: 3 }
About test()
I'm new to Rust so I don't know if this crazy syntax is the right way.
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
When running test() you can see that the previous syntax works, because it increases the value, but this increase does not affect the clones. I expected that with the use of *Rc::make_mut(&mut rc_pref_temp)... the clones of a shared reference would reflect the same values.
If Rc has references to the same object, why do the changes to an object not apply to the rest of the clones? Why does this work this way? Am I doing something wrong?
Note: I use RefCell because during some tests I thought it might have something to do with the problem.
About test2()
I've got it working as expected using Mutex with Rc, but I do not know if this is the correct way. I have some idea of how Mutex and Arc work, but after using this syntax:
*Rc::make_mut(&mut rc_pref_temp)...
and then using Mutex in test2(), I wonder whether Mutex is responsible not only for changing the data but also for reflecting the changes in all the cloned references.
Do the shared references actually point to the same object? I want to think they do, but with the above code where the changes are not reflected without the use of Mutex, I have some doubts.
You need to read and understand the documentation for functions you use before you use them. Rc::make_mut says, emphasis mine:
Makes a mutable reference into the given Rc.
If there are other Rc or Weak pointers to the same value, then
make_mut will invoke clone on the inner value to ensure unique
ownership. This is also referred to as clone-on-write.
See also get_mut, which will fail rather than cloning.
You have multiple Rc pointers because you called rc_pref.clone(). Thus, when you call make_mut, the inner value will be cloned and the Rc pointers will now be disassociated from each other:
use std::rc::Rc;
fn main() {
let counter = Rc::new(100);
let mut counter_clone = counter.clone();
println!("{}", Rc::strong_count(&counter)); // 2
println!("{}", Rc::strong_count(&counter_clone)); // 2
*Rc::make_mut(&mut counter_clone) += 50;
println!("{}", Rc::strong_count(&counter)); // 1
println!("{}", Rc::strong_count(&counter_clone)); // 1
println!("{}", counter); // 100
println!("{}", counter_clone); // 150
}
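For contrast, the quoted documentation points to get_mut, which refuses instead of cloning. A small sketch of that behavior follows; `try_bump` is a hypothetical helper added for illustration.

```rust
use std::rc::Rc;

// Hypothetical helper: adds 50 only if `rc` is the sole owner of the value.
fn try_bump(rc: &mut Rc<i32>) -> bool {
    match Rc::get_mut(rc) {
        Some(value) => {
            *value += 50;
            true
        }
        // Another Rc (or Weak) points at the value, so no &mut is handed out.
        None => false,
    }
}

fn main() {
    let mut counter = Rc::new(100);
    let counter_clone = Rc::clone(&counter);

    println!("{}", try_bump(&mut counter)); // false: two owners
    drop(counter_clone);
    println!("{}", try_bump(&mut counter)); // true: unique again
    println!("{}", counter); // 150
}
```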
The version with the Mutex works because it's completely different. You aren't calling a function which clones the inner value anymore. Of course, it doesn't make sense to use a Mutex when you don't have threads. The single-threaded equivalent of a Mutex is... RefCell!
I honestly don't know how you found Rc::make_mut; I've never even heard of it before. The module documentation for cell doesn't mention it, nor does the module documentation for rc.
I'd highly encourage you to take a step back and re-read through the documentation. The second edition of The Rust Programming Language has a chapter on smart pointers, including Rc and RefCell. Read the module-level documentation for rc and cell as well.
Here's what your code should look like. Note the usage of borrow_mut.
fn main() {
let prefe = Rc::new(Prefe::new());
println!("prefe: {:?}", prefe.name_test); // 3
let prefe_clone = prefe.clone();
*prefe_clone.name_test.borrow_mut() += 1;
println!("prefe_clone: {:?}", prefe_clone.name_test); // 4
*prefe_clone.name_test.borrow_mut() += 1;
println!("prefe_clone: {:?}", prefe_clone.name_test); // 5
println!("prefe: {:?}", prefe.name_test); // 5
}

Resources