Why does the Rust playground not produce different results for threads? - multithreading

Consider the following sample code from the concurrency chapter of the Rust book:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
let data = Arc::new(Mutex::new(vec![1, 2, 3]));
for i in 0..3 {
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
data[0] += i;
println!("{}", data[0]);
});
}
thread::sleep(Duration::from_millis(50));
}
My friend and I separately ran this code on the Rust playground and always got the same order: 3, 4, 4, so it seems the threads are always started in the order of 2, 1, 0.
With multi-threaded programming, shouldn't we never know which thread will start first, as there is no fixed order of running the spawned threads?
Is the Rust playground considered a single computer?

This may not be the only thing at play here, but the playground does caching; if you don't change the code, it won't re-run it.
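Caching aside, the order in which the spawned threads acquire the lock is not guaranteed; only the final value is. As a minimal sketch (the helper name `run_workers` and the extra `seen` vector are illustrative additions, not part of the book's example), the sleep can be replaced with join handles so that the observed order and the final value can be inspected separately:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs the three workers from the question and returns the values in the
/// order the threads obtained the lock, plus the final value of data[0].
fn run_workers() -> (Vec<i32>, i32) {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    // Hypothetical helper state: records what each thread saw, in lock order.
    let seen = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..3)
        .map(|i| {
            let data = Arc::clone(&data);
            let seen = Arc::clone(&seen);
            thread::spawn(move || {
                let mut data = data.lock().unwrap();
                data[0] += i;
                seen.lock().unwrap().push(data[0]);
            })
        })
        .collect();

    // Joining replaces the sleep: main waits until every worker is done.
    for handle in handles {
        handle.join().unwrap();
    }

    let order = seen.lock().unwrap().clone();
    let final_value = data.lock().unwrap()[0];
    (order, final_value)
}

fn main() {
    let (order, final_value) = run_workers();
    // The order may differ between runs; the scheduler decides it.
    println!("observed order: {:?}", order);
    // The final value is always 1 + 0 + 1 + 2 = 4.
    println!("final value: {}", final_value);
}
```

On a lightly loaded machine the same order may well show up run after run, but nothing in the program enforces it.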

Related

Explain the behavior of *Rc::make_mut and why it differs compared to Mutex

I needed to pass a resource between several functions that take a closure as an argument. The data is handled inside these closures, and I wanted the changes made to a variable there to be reflected everywhere else.
The first thing I thought was to use Rc. I had previously used Arc to handle the data between different threads, but since these functions aren't running in different threads I chose Rc instead.
I used RefCell because I suspected that this syntax would not work as I expected:
*Rc::make_mut(&mut rc_pref_temp)...
Here is the most simplified code I have to illustrate my doubts:
use std::sync::Arc;
use std::rc::Rc;
use std::sync::Mutex;
use std::cell::RefCell;
use std::cell::Cell;
fn main() {
test2();
println!("---");
test();
}
#[derive(Debug, Clone)]
struct Prefe {
name_test: RefCell<u64>,
}
impl Prefe {
fn new() -> Prefe {
Prefe {
name_test: RefCell::new(3 as u64),
}
}
}
fn test2(){
let mut prefe: Prefe = Prefe::new();
let mut rc_pref = Rc::new(Mutex::new(prefe));
println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);
let mut rc_pref_temp = rc_pref.clone();
*rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);
*rc_pref_temp.lock().unwrap().name_test.get_mut() += 1;
println!("rc_pref_clone Mutex: {:?}", rc_pref_temp.lock().unwrap().name_test);
println!("rc_pref Mutex: {:?}", rc_pref.lock().unwrap().name_test);
}
fn test(){
let mut prefe: Prefe = Prefe::new();
let mut rc_pref = Rc::new(prefe);
println!("rc_pref: {:?}", rc_pref.name_test);
let mut rc_pref_temp = rc_pref.clone();
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
println!("rc_pref_clone: {:?}", rc_pref_temp.name_test);
println!("rc_pref: {:?}", rc_pref.name_test);
}
The code is simplified, the scenario where it is used is totally different. I note this to avoid comments like "you can lend a value to the function", because what interests me is to know why the cases exposed work in this way.
stdout:
rc_pref Mutex : RefCell { value: 3 }
rc_pref_clone Mutex : RefCell { value: 4 }
rc_pref_clone Mutex : RefCell { value: 5 }
rc_pref Mutex : RefCell { value: 5 }
---
rc_pref : RefCell { value: 3 }
rc_pref_clone : RefCell { value: 4 }
rc_pref_clone : RefCell { value: 5 }
rc_pref : RefCell { value: 3 }
About test()
I'm new to Rust so I don't know if this crazy syntax is the right way.
*((*Rc::make_mut(&mut rc_pref_temp)).name_test).get_mut() += 1;
When running test() you can see that the previous syntax works, because it increases the value, but this increase does not affect the clones. I expected that with the use of *Rc::make_mut(& mut rc_pref_temp)... that the clones of a shared reference would reflect the same values.
If Rc has references to the same object, why do the changes to an object not apply to the rest of the clones? Why does this work this way? Am I doing something wrong?
Note: I use RefCell because during some of my tests I thought it might be needed.
About test2()
I've got it working as expected using Mutex with Rc, but I do not know if this is the correct way. I have some idea of how Mutex and Arc work, but after trying this syntax:
*Rc::make_mut(&mut rc_pref_temp)...
and then using Mutex in test2(), I wonder whether Mutex is responsible not only for changing the data but also for reflecting the changes in all the cloned references.
Do the shared references actually point to the same object? I want to think they do, but with the above code where the changes are not reflected without the use of Mutex, I have some doubts.
You need to read and understand the documentation for functions you use before you use them. Rc::make_mut says, emphasis mine:
Makes a mutable reference into the given Rc.
If there are other Rc or Weak pointers to the same value, then
make_mut will invoke clone on the inner value to ensure unique
ownership. This is also referred to as clone-on-write.
See also get_mut, which will fail rather than cloning.
You have multiple Rc pointers because you called rc_pref.clone(). Thus, when you call make_mut, the inner value will be cloned and the Rc pointers will now be disassociated from each other:
use std::rc::Rc;
fn main() {
let counter = Rc::new(100);
let mut counter_clone = counter.clone();
println!("{}", Rc::strong_count(&counter)); // 2
println!("{}", Rc::strong_count(&counter_clone)); // 2
*Rc::make_mut(&mut counter_clone) += 50;
println!("{}", Rc::strong_count(&counter)); // 1
println!("{}", Rc::strong_count(&counter_clone)); // 1
println!("{}", counter); // 100
println!("{}", counter_clone); // 150
}
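The quoted documentation also mentions get_mut, which refuses access instead of cloning. A small sketch of that behavior (the function name `get_mut_demo` is only for illustration):

```rust
use std::rc::Rc;

/// Returns (whether get_mut was refused while shared, final value).
fn get_mut_demo() -> (bool, i32) {
    let mut counter = Rc::new(100);
    let counter_clone = Rc::clone(&counter);

    // Two pointers share the value, so exclusive access is refused
    // and nothing is cloned.
    let refused_while_shared = Rc::get_mut(&mut counter).is_none();

    // Once the other pointer is gone, the Rc is unique again.
    drop(counter_clone);
    if let Some(value) = Rc::get_mut(&mut counter) {
        *value += 50;
    }

    (refused_while_shared, *counter)
}

fn main() {
    let (refused, value) = get_mut_demo();
    println!("refused while shared: {}", refused); // true
    println!("value: {}", value); // 150
}
```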
The version with the Mutex works because it's completely different. You aren't calling a function which clones the inner value anymore. Of course, it doesn't make sense to use a Mutex when you don't have threads. The single-threaded equivalent of a Mutex is... RefCell!
I honestly don't know how you found Rc::make_mut; I've never even heard of it before. The module documentation for cell doesn't mention it, nor does the module documentation for rc.
I'd highly encourage you to take a step back and re-read through the documentation. The second edition of The Rust Programming Language has a chapter on smart pointers, including Rc and RefCell. Read the module-level documentation for rc and cell as well.
Here's what your code should look like. Note the usage of borrow_mut.
use std::rc::Rc;
// Prefe (with its RefCell<u64> field) is the struct defined in the question.
fn main() {
    let prefe = Rc::new(Prefe::new());
    println!("prefe: {:?}", prefe.name_test); // 3
    let prefe_clone = prefe.clone();
    *prefe_clone.name_test.borrow_mut() += 1;
    println!("prefe_clone: {:?}", prefe_clone.name_test); // 4
    *prefe_clone.name_test.borrow_mut() += 1;
    println!("prefe_clone: {:?}", prefe_clone.name_test); // 5
    println!("prefe: {:?}", prefe.name_test); // 5
}
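Since the question mentions passing the resource to functions that take closures, here is a self-contained sketch of that shape. The `run` function is a hypothetical stand-in for the asker's closure-taking functions, and `shared_counter_demo` exists only to make the result checkable:

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
struct Prefe {
    name_test: RefCell<u64>,
}

// Hypothetical stand-in for a function that takes a closure as an argument.
fn run<F: Fn()>(callback: F) {
    callback();
}

/// Returns (final counter value, number of Rc pointers still alive).
fn shared_counter_demo() -> (u64, usize) {
    let prefe = Rc::new(Prefe { name_test: RefCell::new(3) });

    // Each closure owns its own Rc pointer to the same Prefe.
    let for_first = Rc::clone(&prefe);
    run(move || {
        *for_first.name_test.borrow_mut() += 1;
    });

    let for_second = Rc::clone(&prefe);
    run(move || {
        *for_second.name_test.borrow_mut() += 1;
    });

    // Both closures were moved into `run` and dropped there,
    // so `prefe` is the only pointer left.
    let value = *prefe.name_test.borrow();
    (value, Rc::strong_count(&prefe))
}

fn main() {
    let (value, pointers) = shared_counter_demo();
    println!("value: {}", value); // 5
    println!("strong count: {}", pointers); // 1
}
```

No make_mut is involved, so the inner value is never cloned and every pointer observes the same counter.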

How to tell Rust to let me modify a shared variable hidden behind an RwLock?

Safe Rust demands that, at any given time, a resource has either:
One or more shared references (&T), or
Exactly one mutable reference (&mut T).
I want to have one Vec that can be read by multiple threads and written by one, but only one of those should be possible at a time (as the language demands).
So I use an RwLock.
I need a Vec<i8>. To let it outlive the main function, I Box it and then wrap that in an RwLock, like this:
fn main() {
println!("Hello, world!");
let mut v = vec![0, 1, 2, 3, 4, 5, 6];
let val = RwLock::new(Box::new(v));
for i in 0..10 {
thread::spawn(move || threadFunc(&val));
}
loop {
let mut VecBox = (val.write().unwrap());
let ref mut v1 = *(*VecBox);
v1.push(1);
//And be very busy.
thread::sleep(Duration::from_millis(10000));
}
}
fn threadFunc(val: &RwLock<Box<Vec<i8>>>) {
loop {
//Use Vec
let VecBox = (val.read().unwrap());
let ref v1 = *(*VecBox);
println!("{}", v1.len());
//And be very busy.
thread::sleep(Duration::from_millis(1000));
}
}
Rust refuses to compile this:
capture of moved value: `val`
--> src/main.rs:14:43
|
14 | thread::spawn(move || threadFunc(&val));
| ------- ^^^ value captured here after move
| |
| value moved (into closure) here
Without the thread:
for i in 0..10 {
threadFunc(&val);
}
It compiles. The problem is with the closure. I have to "move" it, or else Rust complains that it can outlive main. I also can't clone val (RwLock doesn't implement clone()).
What should I do?
Note that there's no structural difference between using a RwLock and a Mutex; they just have different access patterns. See Concurrent access to vector from multiple threads using a mutex lock for related discussion.
The problem centers around the fact that you've transferred ownership of the vector (in the RwLock) to some thread; therefore your main thread doesn't have it anymore. You can't access it because it's gone.
In fact, you'd hit the same problem even earlier, because you tried to pass the vector to each of the threads. You only have one vector to give away, so only one thread could have it.
You need thread-safe shared ownership, provided by Arc:
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;
fn main() {
println!("Hello, world!");
let v = vec![0, 1, 2, 3, 4, 5, 6];
let val = Arc::new(RwLock::new(v));
for _ in 0..10 {
let v = val.clone();
thread::spawn(move || thread_func(v));
}
for _ in 0..5 {
{
let mut val = val.write().unwrap();
val.push(1);
}
thread::sleep(Duration::from_millis(1000));
}
}
fn thread_func(val: Arc<RwLock<Vec<i8>>>) {
loop {
{
let val = val.read().unwrap();
println!("{}", val.len());
}
thread::sleep(Duration::from_millis(100));
}
}
Other things to note:
I removed the infinite loop in main so that the code can actually finish.
I fixed all of the compiler warnings. If you are going to use a compiled language, pay attention to the warnings.
unnecessary parentheses
snake_case identifiers. Definitely do not use PascalCase for local variables; that's used for types. camelCase does not get used in Rust.
I added some blocks to shorten the lifetime that the read / write locks will be held. Otherwise there's a lot of contention and the child threads never have a chance to get a read lock.
let ref v1 = *(*foo); is non-idiomatic. Prefer let v1 = &**foo. You don't even need to do that at all, thanks to Deref.
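The last two notes can be seen in isolation. The sketch below (single-threaded on purpose; `guard_demo` is an illustrative name) shows that a lock guard dereferences to the vector without any manual `*(*...)`, and that a writer can only get in once the read guard's block has ended:

```rust
use std::sync::RwLock;

/// Returns (length read through the guard,
///          whether a writer was refused while the read guard was alive,
///          whether a writer succeeded after the guard was dropped).
fn guard_demo() -> (usize, bool, bool) {
    let lock = RwLock::new(vec![0i8, 1, 2]);

    let len;
    let blocked_while_reading;
    {
        let guard = lock.read().unwrap();
        // Method calls auto-deref through the guard to the Vec.
        len = guard.len();
        // If a plain reference is needed, deref coercion provides it.
        let as_ref: &Vec<i8> = &guard;
        assert_eq!(as_ref.len(), len);
        // try_write does not block; it reports failure while a reader exists.
        blocked_while_reading = lock.try_write().is_err();
    } // the read guard is dropped here, releasing the lock

    let writable_after = lock.try_write().is_ok();
    (len, blocked_while_reading, writable_after)
}

fn main() {
    let (len, blocked, writable) = guard_demo();
    println!("len = {}", len); // 3
    println!("writer blocked while reading: {}", blocked); // true
    println!("writer allowed afterwards: {}", writable); // true
}
```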

How to multi-thread function calls on read-only data without cloning it? [duplicate]

This question already has an answer here:
Lifetime of variables passed to a new thread
Take this simple example where we're using an immutable list of vectors to calculate new values.
Given this working, single threaded example:
use std::collections::LinkedList;
fn calculate_vec(v: &Vec<i32>) -> i32 {
let mut result: i32 = 0;
for i in v {
result += *i;
}
return result;
}
fn calculate_from_list(list: &LinkedList<Vec<i32>>) -> LinkedList<i32> {
let mut result: LinkedList<i32> = LinkedList::new();
for v in list {
result.push_back(calculate_vec(v));
}
return result;
}
fn main() {
let mut list: LinkedList<Vec<i32>> = LinkedList::new();
// some arbitrary values
list.push_back(vec![0, -2, 3]);
list.push_back(vec![3, -4, 3]);
list.push_back(vec![7, -10, 6]);
let result = calculate_from_list(&list);
println!("Here's the result!");
for i in result {
println!("{}", i);
}
}
Assuming calculate_vec is a processor-intensive function, we may want to use multiple threads to run it. The following example works but requires what I think is an unnecessary vector clone.
use std::collections::LinkedList;
fn calculate_vec(v: &Vec<i32>) -> i32 {
let mut result: i32 = 0;
for i in v {
result += *i;
}
return result;
}
fn calculate_from_list(list: &LinkedList<Vec<i32>>) -> LinkedList<i32> {
use std::thread;
let mut result: LinkedList<i32> = LinkedList::new();
let mut join_handles = LinkedList::new();
for v in list {
let v_clone = v.clone(); // <-- how to avoid this clone?
join_handles.push_back(thread::spawn(move || calculate_vec(&v_clone)));
}
for j in join_handles {
result.push_back(j.join().unwrap());
}
return result;
}
fn main() {
let mut list: LinkedList<Vec<i32>> = LinkedList::new();
// some arbitrary values
list.push_back(vec![0, -2, 3]);
list.push_back(vec![3, -4, 3]);
list.push_back(vec![7, -10, 6]);
let result = calculate_from_list(&list);
println!("Here's the result!");
for i in result {
println!("{}", i);
}
}
This example works, but only when the vector is cloned.
Logically, however, I don't think this should be needed, since the vector is immutable.
There is no reason each call to calculate_vec should need to allocate a new vector.
How could this simple example be multi-threaded without needing to clone the data before it's passed to the closure?
Update: here's a working example that uses Arc, based on @ker's suggestion, although it does need to take ownership.
Note 1) I'm aware there are third-party libraries to handle threading, but I would be interested to know if this is possible using Rust's standard library.
Note 2) There are quite a few similar questions on threading, but the examples often involve threads writing to data, which isn't the case here.
There are multiple ways to solve your problem.
Move the list into an Arc<LinkedList<Vec<i32>>> and clone that. After the calculation, you can use try_unwrap to get your LinkedList<Vec<i32>> back. This works with just the Rust standard library. Here's a working example that uses Arc, though LinkedList was replaced by Vec to allow indexing. Also note that the function needs to own the argument being passed to it in this case.
Use the crossbeam crate to create threads that can reference their scope, freeing you from the need to do all that join_handles code by hand. This will have a minimal impact on your code, since it works exactly like you want.
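crossbeam::scope(|scope| {
    for v in list {
        // `move` copies the reference `v` into the closure; the trailing
        // semicolon discards the join handle so the loop body has type ().
        // Newer crossbeam releases pass the scope to the closure
        // (`move |_| ...`) and return a Result from `scope`.
        scope.spawn(move || calculate_vec(v));
    }
});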
Use the scoped_threadpool crate. It works just like crossbeam but doesn't create one thread per task; instead, it spreads the tasks over a limited number of threads. (thanks @delnan)
Use the rayon crate for direct data parallelism:
use rayon::prelude::*;
list.par_iter().map(|v| calculate_vec(&v)).collect()
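Regarding Note 1 in the question: the standard library has since gained scoped threads. Assuming a toolchain of Rust 1.63 or newer (where std::thread::scope is stable), a sketch of the question's function without any clone and without taking ownership could look like this. Here calculate_vec is simplified to take a slice, which is a change from the original signature:

```rust
use std::collections::LinkedList;
use std::thread;

fn calculate_vec(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn calculate_from_list(list: &LinkedList<Vec<i32>>) -> LinkedList<i32> {
    thread::scope(|scope| {
        // Spawn one thread per vector. Each thread only borrows its vector;
        // the scope guarantees all threads finish before `list` goes away.
        let handles: Vec<_> = list
            .iter()
            .map(|v| scope.spawn(move || calculate_vec(v)))
            .collect();

        // Join in spawn order so the results line up with the input.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let mut list: LinkedList<Vec<i32>> = LinkedList::new();
    list.push_back(vec![0, -2, 3]);
    list.push_back(vec![3, -4, 3]);
    list.push_back(vec![7, -10, 6]);

    for sum in calculate_from_list(&list) {
        println!("{}", sum); // 1, then 2, then 3
    }
}
```

Like the crossbeam variant, this still starts one thread per vector, so a pool-based approach such as rayon may be the better fit for long lists.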

How do I modify a value in one thread and read the value in another thread using shared memory?

The following Python code creates a thread (actually a process) and passes it an array containing two floats. Every 5 seconds the thread adds 1 to one float and subtracts 1 from the other, while the main thread continuously prints the two floats:
from multiprocessing import Process, Array
from time import sleep
def target(states):
    while True:
        states[0] -= 1
        states[1] += 1
        sleep(5)
def main():
    states = Array("d", [0.0, 0.0])
    process = Process(target=target, args=(states,))
    process.start()
    while True:
        print(states[0])
        print(states[1])
if __name__ == "__main__":
    main()
How can I do the same thing using shared memory in Rust? I've tried doing the following (playground):
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new([0.0]));
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
data[0] = 1.0;
});
print!("{}", data[0]);
}
But that's giving a compile error:
error: cannot index a value of type `std::sync::Arc<std::sync::Mutex<[_; 1]>>`
--> <anon>:12:18
|>
12 |> print!("{}", data[0]);
|> ^^^^^^^
And even if that'd work, it does something different. I've read this, but I've still no idea how to do it.
Your code is not that far off! :)
Let's look at the compiler error first: it says that you are apparently attempting to index something. This is true, you want to index the data variable (with data[0]), but the compiler complains that the value you want to index is of type std::sync::Arc<std::sync::Mutex<[_; 1]>> and cannot be indexed.
If you look at the type, you can quickly see: my array is still wrapped in a Mutex<T> which is wrapped in an Arc<T>. This brings us to the solution: you have to lock for read access, too. So you have to add the lock().unwrap() like in the other thread:
print!("{}", data.lock().unwrap()[0]);
But now a new compiler error arises: use of moved value: `data`. Dang! This comes from your name shadowing. You say let data = data.clone(); before starting the thread; this shadows the original data. So how about we replace it by let data_for_thread = data.clone() and use data_for_thread in the other thread? You can see the working result here on the playground.
Making it do the same thing as the Python example isn't that hard anymore then, is it?
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
    let data = Arc::new(Mutex::new([0.0, 0.0]));
    let data_for_thread = data.clone();
    thread::spawn(move || {
        loop {
            thread::sleep(Duration::from_secs(5));
            // The guard is dropped at the end of each iteration, releasing the lock.
            let mut data = data_for_thread.lock().unwrap();
            data[0] += 1.0;
            data[1] -= 1.0;
        }
    });
    loop {
        let data = data.lock().unwrap();
        println!("{}, {}", data[0], data[1]);
    }
}
You can try it here on the playground, although I changed a few minor things to allow running on the playground.
Ok, so let's first fix the compiler error:
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new([0.0]));
let thread_data = data.clone();
thread::spawn(move || {
let mut data = thread_data.lock().unwrap();
data[0] = 1.0;
});
println!("{}", data.lock().unwrap()[0]);
}
The variable thread_data is moved into the thread, which is why it cannot be accessed after the thread is spawned.
But this still has a problem: the spawned thread runs concurrently with the main thread, so the final print statement will usually execute before the thread changes the value (the outcome depends on scheduling).
To fix this you have to wait for the thread to finish before printing the value:
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new([0.0]));
let thread_data = data.clone();
let t = thread::spawn(move || {
let mut data = thread_data.lock().unwrap();
data[0] = 1.0;
});
t.join().unwrap();
println!("{}", data.lock().unwrap()[0]);
}
This will always produce the correct result.
If one thread updates shared data, other threads are not guaranteed to see the updated value unless the accesses are synchronized. In languages such as Java, the options are:
Declare the variable as volatile, which ensures that readers observe the latest write because the value is read from main memory rather than from a cache.
Make all updates and reads synchronized, which can be costly in terms of performance but prevents data corruption and inconsistency caused by unsynchronized reads and writes from different threads.
Rust has no volatile variable modifier; its counterparts are the atomic types in std::sync::atomic and locks such as Mutex or RwLock, and safe Rust refuses to compile unsynchronized shared mutation in the first place.
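In Rust terms, the closest counterpart to a lock-free shared variable is an atomic type. A minimal sketch, assuming an integer counter is acceptable in place of the floats from the question (the name `atomic_demo` and the thread and iteration counts are arbitrary choices for illustration):

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::thread;

/// Four threads each add 1 to a shared counter 1000 times.
fn atomic_demo() -> i64 {
    let counter = Arc::new(AtomicI64::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // An atomic read-modify-write; no Mutex is needed.
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    counter.load(Ordering::SeqCst)
}

fn main() {
    println!("{}", atomic_demo()); // 4000
}
```

For data that is not a single machine word, such as the array of floats above, Arc<Mutex<...>> as in the earlier answers remains the straightforward choice.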

Printing the Arc and Mutex types

How can I print the values of a Vec that is encapsulated by a Mutex and Arc? I'm really new to Rust, so I'm not sure if I am phrasing this well.
This is the code I have, loosely based on the documentation.
use std::sync::{Arc, Mutex};
use std::thread;
fn main() {
let data = Arc::new(Mutex::new(vec![104, 101, 108, 108, 111]));
for i in 0..2 {
let data = data.clone();
thread::spawn(move || {
let mut data = data.lock().unwrap();
data[i] += 1;
});
}
println!("{:?}", String::from_utf8(data).unwrap());
thread::sleep_ms(50);
}
The error that the compiler gives me:
$ rustc datarace_fixed.rs
datarace_fixed.rs:14:37: 14:41 error: mismatched types:
expected collections::vec::Vec<u8>,
found alloc::arc::Arc<std::sync::mutex::Mutex<collections::vec::Vec<_>>>
(expected struct collections::vec::Vec,
found struct alloc::arc::Arc) [E0308]
datarace_fixed.rs:14 println!("{:?}", String::from_utf8(data).unwrap());
To work with a Mutex value you have to lock the Mutex, just like you do in the spawned threads. (playpen):
let data = data.lock().unwrap();
println!("{:?}", String::from_utf8(data.clone()).unwrap());
Note that String::from_utf8 consumes the vector (in order to wrap it in a String without extra allocations), which is obvious from it taking a value vec: Vec<u8> and not a reference. Since we aren't ready to relinquish our hold on data we have to clone it when using this method.
A cheaper alternative would be to use the slice-based version of from_utf8 (playpen):
use std::str::from_utf8;
let data = data.lock().unwrap();
println!("{:?}", from_utf8(&data).unwrap());
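Putting the pieces together, a complete sketch might look as follows. It assumes the threads are joined rather than waited for with a sleep, so the result is deterministic; `bump_and_render` is an illustrative name:

```rust
use std::str::from_utf8;
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments the first two bytes on separate threads, then renders the
/// bytes as text without consuming or cloning the vector.
fn bump_and_render() -> String {
    let data = Arc::new(Mutex::new(vec![104u8, 101, 108, 108, 111])); // "hello"

    let handles: Vec<_> = (0..2)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut data = data.lock().unwrap();
                data[i] += 1;
            })
        })
        .collect();

    // Wait for both threads instead of sleeping.
    for handle in handles {
        handle.join().unwrap();
    }

    // Lock once more to read; &guard coerces to &[u8] for from_utf8.
    let guard = data.lock().unwrap();
    let text = from_utf8(&guard).unwrap().to_owned();
    text
}

fn main() {
    println!("{:?}", bump_and_render()); // "ifllo"
}
```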
