Read and Write the same variable in multiple threads in Rust - multithreading

I'm doing a simple program to simulate a supermarket queue to learn Rust. The program consists in two threads. One is adding people (char 'x') to the supermarket queue and the other one is removing them.
I use Arc and Mutex to handle the concurrency but it seems that the first thread never free the var, so the second doesn't work.
The code is the follows:
let queue = Arc::new(Mutex::new(Vec::<char>::new()));
let queue1 = Arc::clone(&queue);
thread::spawn(move || loop {
let mut aux = queue1.lock().unwrap();
aux.push('x');
print_queue(&aux);
thread::sleep(Duration::from_secs(3));
});
thread::spawn(move || loop {
let mut aux = queue.lock().unwrap();
println!("AUX state: {:?}", aux);
if !&aux.is_empty() {
aux.pop();
}
print_queue(&aux);
let mut rng = rand::thread_rng();
thread::sleep(Duration::from_secs(rng.gen_range(1..10)));
});
The print of the AUX state never shows. What I'm doing wrong?

Your code is using this pattern:
loop {
let mut guard = mutex.lock().unwrap();
// do something with `guard`
thread::sleep(duration);
// `guard` is implicitly dropped at the end of the scope
}
The problem with this code is that the mutex guard (in this case guard) holds the lock, and it holds it until it is dropped, which in this case happens when the variable goes out of scope. But it goes out of scope after the thread is sleeping, so the thread will still hold the lock while it sleeps.
To avoid this, you should drop the guard immediately once you're done using it, before the thread sleeps:
loop {
{
let mut guard = mutex.lock().unwrap();
// do something with `guard`
// implicitly drop `guard` at the end of the inner scope, before sleep
}
thread::sleep(duration);
}
or
loop {
let mut guard = mutex.lock().unwrap();
// do something with `guard`
// explicitly drop `guard` before sleep
drop(guard);
thread::sleep(duration);
}

Related

How to create threads in a for loop and get the return value from each?

I am writing a program that pings a set of targets 100 times, and stores each RTT value returned from the ping into a vector, thus giving me a set of RTT values for each target. Say I have n targets, I would like all of the pinging to be done concurrently. The rust code looks like this:
let mut sample_rtts_map = HashMap::new();
for addr in targets.to_vec() {
let mut sampleRTTvalues: Vec<f32> = vec![];
//sample_rtts_map.insert(addr, sampleRTTvalues);
thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
// thread::sleep(Duration::from_millis(5000));
}
});
}
The hashmap is used to tell which vector of values belongs to which target. The problem is, how do I retrieve the updated sampleRTTvalues from each thread after the thread is done executing? I would like something like:
let (name, sampleRTTvalues) = thread::spawn(...)
The name, being the name of the thread, and sampleRTTvalues being the vector. However, since I'm creating threads in a for loop, each thread is being instantiated the same way, so how I differentiate them?
Is there some better way to do this? I've looked into schedulers, future, etc., but it seems my case can just be done with simple threads.
I go the desired behavior with the following code:
use std::thread;
use std::sync::mpsc;
use std::collections::HashMap;
use rand::Rng;
use std::net::{Ipv4Addr,Ipv6Addr,IpAddr};
const RTT_ONE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,0,1));
const RTT_TWO: IpAddr = IpAddr::V6(Ipv6Addr::new(0,0,0,0,0,0,0,1));
const RTT_THREE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,1,1));//idk how ip adresses work, forgive if this in invalid but you get the idea
fn ping(address: IpAddr) -> f32 {
rand::thread_rng().gen_range(5.0..107.0)
}
fn main() {
let targets = [RTT_ONE,RTT_TWO,RTT_THREE];
let mut sample_rtts_map: HashMap<IpAddr,Vec<f32>> = HashMap::new();
for addr in targets.into_iter() {
let (sample_values,moved_values) = mpsc::channel();
let mut sampleRTTvalues: Vec<f32> = vec![];
thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
//thread::sleep(Duration::from_millis(5000));
}
});
sample_rtts_map.insert(addr,moved_values.recv().unwrap());
}
}
note that the use rand::Rng can be removed when implementing, as it is only so the example works. what this does is pass data from the spawned thread to the main thread, and in the method used it waits until the data is ready before adding it to the hash map. If this is problematic (takes a long time, etc.) then you can use try_recv instead of recv which will add an error / option type that will return a recoverable error if the value is ready when unwrapped, or return the value if it's ready
You can use a std::sync::mpsc channel to collect your data:
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::thread;
fn ping(_: &str) -> f32 { 0.0 }
fn main() {
let targets = ["a", "b"]; // just for example
let mut sample_rtts_map = HashMap::new();
let (tx, rx) = channel();
for addr in targets {
let tx = tx.clone();
thread::spawn(move || {
for _ in 0..100 {
let sampleRTT = ping(addr);
tx.send((addr, sampleRTT));
}
});
}
drop(tx);
// exit loop when all thread's tx have dropped
while let Ok((addr, sampleRTT)) = rx.recv() {
sample_rtts_map.entry(addr).or_insert(vec![]).push(sampleRTT);
}
println!("sample_rtts_map: {:?}", sample_rtts_map);
}
This will run all pinging threads simultaneously, and collect data in main thread synchronously, so that we can avoid using locks. Do not forget to drop sender in main thread after cloning to all pinging threads, or the main thread will hang forever.

Why do Rust mutexes not seem to give the lock to the thread that wanted to lock it last?

I wanted to write a program that spawns two threads that lock a Mutex, increase it, print something, and then unlock the Mutex so the other thread can do the same. I added some sleep time to make it more consistent, so I thought the output should be something like:
ping pong ping pong …
but the actual output is pretty random. Most of the time, it is just
ping ping ping … pong
But there's no consistency at all; sometimes there is a “pong” in the middle too.
I was of the belief that mutexes had some kind of way to determine who wanted to lock it last but it doesn’t look like that’s the case.
How does the locking actually work?
How can I get the desired output?
use std::sync::{Arc, Mutex};
use std::{thread, time};
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || loop {
let mut data = data1.lock().unwrap();
thread::sleep(ten_millis);
println!("ping ");
*data += 1;
if *data > 10 {
break;
}
});
let b = thread::spawn(move || loop {
let mut data = data2.lock().unwrap();
thread::sleep(ten_millis);
println!("pong ");
*data += 1;
if *data > 10 {
break;
}
});
a.join().unwrap();
b.join().unwrap();
}
Mutex and RwLock both defer to OS-specific primitives and cannot be guaranteed to be fair. On Windows, they are both implemented with SRW locks which are specifically documented as not fair. I didn't do research for other operating systems but you definitely cannot rely on fairness with std::sync::Mutex, especially if you need this code to be portable.
A possible solution in Rust is the Mutex implementation provided by the parking_lot crate, which provides an unlock_fair method, which is implemented with a fair algorithm.
From the parking_lot documentation:
By default, mutexes are unfair and allow the current thread to re-lock the mutex before another has the chance to acquire the lock, even if that thread has been blocked on the mutex for a long time. This is the default because it allows much higher throughput as it avoids forcing a context switch on every mutex unlock. This can result in one thread acquiring a mutex many more times than other threads.
However in some cases it can be beneficial to ensure fairness by forcing the lock to pass on to a waiting thread if there is one. This is done by using this method instead of dropping the MutexGuard normally.
While parking_lot::Mutex doesn't claim to be fair without specifically using the unlock_fair method, I found that your code produced the same number of pings as pongs, by just making that switch (playground), not even using the unlock_fair method.
Usually mutexes are unlocked automatically, when a guard goes out of scope. To make it unlock fairly, you need to insert this method call before the guard is dropped:
let b = thread::spawn(move || loop {
let mut data = data1.lock();
thread::sleep(ten_millis);
println!("pong ");
*data += 1;
if *data > 10 {
break;
}
MutexGuard::unlock_fair(data);
});
The order of locking the mutex is not guaranteed in any way; it's possible for the first thread to acquire the lock 100% of the time, while the second thread 0%
The threads are scheduled by the OS and the following scenario is quite possible:
the OS gives CPU time to the first thread and it acquires the lock
the OS gives CPU time to the second thread, but the lock is taken, hence it goes to sleep
The fist thread releases the lock, but is still allowed to run by the OS. It goes for another iteration of the loop and re-acquires the lock
The other thread cannot proceed, because the lock is still taken.
If you give the second thread more time to acquire the lock you will see the expected ping-pong pattern, although there is no guarantee (a bad OS may decide to never give CPU time to some of your threads):
use std::sync::{Arc, Mutex};
use std::{thread, time};
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || loop {
let mut data = data1.lock().unwrap();
*data += 1;
if *data > 10 {
break;
}
drop(data);
thread::sleep(ten_millis);
println!("ping ");
});
let b = thread::spawn(move || loop {
let mut data = data2.lock().unwrap();
*data += 1;
if *data > 10 {
break;
}
drop(data);
thread::sleep(ten_millis);
println!("pong ");
});
a.join().unwrap();
b.join().unwrap();
}
You can verify that by playing with the sleep time. The lower the sleep time, the more irregular the ping-pong alternations will be, and with values as low as 10ms, you may see ping-ping-pong, etc.
Essentially, a solution based on time is bad by design. You can guarantee that "ping" will be followed by "pong" by improving the algorithm. For instance you can print "ping" on odd numbers and "pong" on even numbers:
use std::sync::{Arc, Mutex};
use std::{thread, time};
const MAX_ITER: i32 = 10;
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || 'outer: loop {
loop {
thread::sleep(ten_millis);
let mut data = data1.lock().unwrap();
if *data > MAX_ITER {
break 'outer;
}
if *data & 1 == 1 {
*data += 1;
println!("ping ");
break;
}
}
});
let b = thread::spawn(move || 'outer: loop {
loop {
thread::sleep(ten_millis);
let mut data = data2.lock().unwrap();
if *data > MAX_ITER {
break 'outer;
}
if *data & 1 == 0 {
*data += 1;
println!("pong ");
break;
}
}
});
a.join().unwrap();
b.join().unwrap();
}
This isn't the best implementation, but I tried to do it with as few modifications as possible to the original code.
You may also consider an implementation with a Condvar, a better solution, in my opinion, as it avoids the busy waiting on the mutex (ps: also removed the code duplication):
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
const MAX_ITER: i32 = 10;
fn main() {
let cv1 = Arc::new((Condvar::new(), Mutex::new(1)));
let cv2 = cv1.clone();
let a = thread::spawn(ping_pong_task("ping", cv1, |x| x & 1 == 1));
let b = thread::spawn(ping_pong_task("pong", cv2, |x| x & 1 == 0));
a.join().unwrap();
b.join().unwrap();
}
fn ping_pong_task<S: Into<String>>(
msg: S,
cv: Arc<(Condvar, Mutex<i32>)>,
check: impl Fn(i32) -> bool) -> impl Fn()
{
let message = msg.into();
move || {
let (condvar, mutex) = &*cv;
let mut value = mutex.lock().unwrap();
loop {
if check(*value) {
println!("{} ", message);
*value += 1;
condvar.notify_all();
}
if *value > MAX_ITER {
break;
}
value = condvar.wait(value).unwrap();
}
}
}
I was of the belief that mutexes had some kind of way to determine who wanted to lock it last but it doesn’t look like that’s the case.
Nope. The job of a mutex is just to make the code run as fast as possible. Alternation gives the worst performance because you're constantly blowing out the CPU caches. You are asking for the worst possible implementation of a mutex.
How does the locking actually work?
The scheduler tries to get as much work done as possible. It's your job to write code that only does the work you really want to get done.
How can I get the desired output?
Don't use two threads if you just want to do one thing then something else then the first thing again. Use threads when you don't care about the order in which work is done and just want to get as much work done as possible.

Is there an API to race N threads (or N closures on N threads) to completion?

Given several threads that complete with an Output value, how do I get the first Output that's produced? Ideally while still being able to get the remaining Outputs later in the order they're produced, and bearing in mind that some threads may or may not terminate.
Example:
struct Output(i32);
fn main() {
let mut spawned_threads = Vec::new();
for i in 0..10 {
let join_handle: ::std::thread::JoinHandle<Output> = ::std::thread::spawn(move || {
// pretend to do some work that takes some amount of time
::std::thread::sleep(::std::time::Duration::from_millis(
(1000 - (100 * i)) as u64,
));
Output(i) // then pretend to return the `Output` of that work
});
spawned_threads.push(join_handle);
}
// I can do this to wait for each thread to finish and collect all `Output`s
let outputs_in_order_of_thread_spawning = spawned_threads
.into_iter()
.map(::std::thread::JoinHandle::join)
.collect::<Vec<::std::thread::Result<Output>>>();
// but how would I get the `Output`s in order of completed threads?
}
I could solve the problem myself using a shared queue/channels/similar, but are there built-in APIs or existing libraries which could solve this use case for me more elegantly?
I'm looking for an API like:
fn race_threads<A: Send>(
threads: Vec<::std::thread::JoinHandle<A>>
) -> (::std::thread::Result<A>, Vec<::std::thread::JoinHandle<A>>) {
unimplemented!("so far this doesn't seem to exist")
}
(Rayon's join is the closest I could find, but a) it only races 2 closures rather than an arbitrary number of closures, and b) the thread pool w/ work stealing approach doesn't make sense for my use case of having some closures that might run forever.)
It is possible to solve this use case using pointers from How to check if a thread has finished in Rust? just like it's possible to solve this use case using an MPSC channel, however here I'm after a clean API to race n threads (or failing that, n closures on n threads).
These problems can be solved by using a condition variable:
use std::sync::{Arc, Condvar, Mutex};
#[derive(Debug)]
struct Output(i32);
enum State {
Starting,
Joinable,
Joined,
}
fn main() {
let pair = Arc::new((Mutex::new(Vec::new()), Condvar::new()));
let mut spawned_threads = Vec::new();
let &(ref lock, ref cvar) = &*pair;
for i in 0..10 {
let my_pair = pair.clone();
let join_handle: ::std::thread::JoinHandle<Output> = ::std::thread::spawn(move || {
// pretend to do some work that takes some amount of time
::std::thread::sleep(::std::time::Duration::from_millis(
(1000 - (100 * i)) as u64,
));
let &(ref lock, ref cvar) = &*my_pair;
let mut joinable = lock.lock().unwrap();
joinable[i] = State::Joinable;
cvar.notify_one();
Output(i as i32) // then pretend to return the `Output` of that work
});
lock.lock().unwrap().push(State::Starting);
spawned_threads.push(Some(join_handle));
}
let mut should_stop = false;
while !should_stop {
let locked = lock.lock().unwrap();
let mut locked = cvar.wait(locked).unwrap();
should_stop = true;
for (i, state) in locked.iter_mut().enumerate() {
match *state {
State::Starting => {
should_stop = false;
}
State::Joinable => {
*state = State::Joined;
println!("{:?}", spawned_threads[i].take().unwrap().join());
}
State::Joined => (),
}
}
}
}
(playground link)
I'm not claiming this is the simplest way to do it. The condition variable will awake the main thread every time a child thread is done. The list can show the state of each thread, if one is (about to) finish, it can be joined.
No, there is no such API.
You've already been presented with multiple options to solve your problem:
Use channels
Use a CondVar
Use futures
Sometimes when programming, you have to go beyond sticking pre-made blocks together. This is supposed to be a fun part of programming. I encourage you to embrace it. Go create your ideal API using the components available and publish it to crates.io.
I really don't see what's so terrible about the channels version:
use std::{sync::mpsc, thread, time::Duration};
#[derive(Debug)]
struct Output(i32);
fn main() {
let (tx, rx) = mpsc::channel();
for i in 0..10 {
let tx = tx.clone();
thread::spawn(move || {
thread::sleep(Duration::from_millis((1000 - (100 * i)) as u64));
tx.send(Output(i)).unwrap();
});
}
// Don't hold on to the sender ourselves
// Otherwise the loop would never terminate
drop(tx);
for r in rx {
println!("{:?}", r);
}
}

Unable to iterate over Arc Mutex

Consider the following code, I append each of my threads to a Vector in order to join them up to the main thread after I have spawned each thread, however I am not able to call iter() on my vector of JoinHandlers.
How can I go about doing this?
fn main() {
let requests = Arc::new(Mutex::new(Vec::new()));
let threads = Arc::new(Mutex::new(Vec::new()));
for _x in 0..100 {
println!("Spawning thread: {}", _x);
let mut client = Client::new();
let thread_items = requests.clone();
let handle = thread::spawn(move || {
for _y in 0..100 {
println!("Firing requests: {}", _y);
let start = time::precise_time_s();
let _res = client.get("http://jacob.uk.com")
.header(Connection::close())
.send().unwrap();
let end = time::precise_time_s();
thread_items.lock().unwrap().push((Request::new(end-start)));
}
});
threads.lock().unwrap().push((handle));
}
// src/main.rs:53:22: 53:30 error: type `alloc::arc::Arc<std::sync::mutex::Mutex<collections::vec::Vec<std::thread::JoinHandle<()>>>>` does not implement any method in scope named `unwrap`
for t in threads.iter(){
println!("Hello World");
}
}
First, you don't need threads to be contained in Arc in Mutex. You can keep it just Vec:
let mut threads = Vec::new();
...
threads.push(handle);
This is so because you don't share threads between, well, threads. You only access it from the main thread.
Second, if for some reason you do need to keep it in Arc (e.g. if your example does not reflect the actual structure of your program which is more complex), then you need to lock the mutex to obtain a reference to the contained vector, just as you do when pushing:
for t in threads.lock().unwrap().iter() {
...
}

How to terminate or suspend a Rust thread from another thread?

Editor's note — this example was created before Rust 1.0 and the specific types have changed or been removed since then. The general question and concept remains valid.
I have spawned a thread with an infinite loop and timer inside.
thread::spawn(|| {
let mut timer = Timer::new().unwrap();
let periodic = timer.periodic(Duration::milliseconds(200));
loop {
periodic.recv();
// Do my work here
}
});
After a time based on some conditions, I need to terminate this thread from another part of my program. In other words, I want to exit from the infinite loop. How can I do this correctly? Additionally, how could I to suspend this thread and resume it later?
I tried to use a global unsafe flag to break the loop, but I think this solution does not look nice.
For both terminating and suspending a thread you can use channels.
Terminated externally
On each iteration of a worker loop, we check if someone notified us through a channel. If yes or if the other end of the channel has gone out of scope we break the loop.
use std::io::{self, BufRead};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;
fn main() {
println!("Press enter to terminate the child thread");
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
println!("Working...");
thread::sleep(Duration::from_millis(500));
match rx.try_recv() {
Ok(_) | Err(TryRecvError::Disconnected) => {
println!("Terminating.");
break;
}
Err(TryRecvError::Empty) => {}
}
});
let mut line = String::new();
let stdin = io::stdin();
let _ = stdin.lock().read_line(&mut line);
let _ = tx.send(());
}
Suspending and resuming
We use recv() which suspends the thread until something arrives on the channel. In order to resume the thread, you need to send something through the channel; the unit value () in this case. If the transmitting end of the channel is dropped, recv() will return Err(()) - we use this to exit the loop.
use std::io::{self, BufRead};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;
fn main() {
println!("Press enter to wake up the child thread");
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
println!("Suspending...");
match rx.recv() {
Ok(_) => {
println!("Working...");
thread::sleep(Duration::from_millis(500));
}
Err(_) => {
println!("Terminating.");
break;
}
}
});
let mut line = String::new();
let stdin = io::stdin();
for _ in 0..4 {
let _ = stdin.lock().read_line(&mut line);
let _ = tx.send(());
}
}
Other tools
Channels are the easiest and the most natural (IMO) way to do these tasks, but not the most efficient one. There are other concurrency primitives which you can find in the std::sync module. They belong to a lower level than channels but can be more efficient in particular tasks.
The ideal solution would be a Condvar. You can use wait_timeout in the std::sync module, as pointed out by #Vladimir Matveev.
This is the example from the documentation:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;
let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = pair.clone();
thread::spawn(move|| {
let &(ref lock, ref cvar) = &*pair2;
let mut started = lock.lock().unwrap();
*started = true;
// We notify the condvar that the value has changed.
cvar.notify_one();
});
// wait for the thread to start up
let &(ref lock, ref cvar) = &*pair;
let mut started = lock.lock().unwrap();
// as long as the value inside the `Mutex` is false, we wait
loop {
let result = cvar.wait_timeout(started, Duration::from_millis(10)).unwrap();
// 10 milliseconds have passed, or maybe the value changed!
started = result.0;
if *started == true {
// We received the notification and the value has been updated, we can leave.
break
}
}
Having been back to this question several times myself, here's what I think addresses OP's intent and others' best practice of getting the thread to stop itself. Building on the accepted answer, Crossbeam is a nice upgrade to mpsc in allowing message endpoints to be cloned and moved. It also has a convenient tick function. The real point here is it has try_recv() which is non-blocking.
I'm not sure how universally useful it'd be to put a message checker in the middle of an operational loop like this. I haven't found that Actix (or previously Akka) could really stop a thread without--as stated above--getting the thread to do it itself. So this is what I'm using for now (wide open to correction here, still learning myself).
// Cargo.toml:
// [dependencies]
// crossbeam-channel = "0.4.4"
use crossbeam_channel::{Sender, Receiver, unbounded, tick};
use std::time::{Duration, Instant};
fn main() {
let (tx, rx):(Sender<String>, Receiver<String>) = unbounded();
let rx2 = rx.clone();
// crossbeam allows clone and move of receiver
std::thread::spawn(move || {
// OP:
// let mut timer = Timer::new().unwrap();
// let periodic = timer.periodic(Duration::milliseconds(200));
let ticker: Receiver<Instant> = tick(std::time::Duration::from_millis(500));
loop {
// OP:
// periodic.recv();
crossbeam_channel::select! {
recv(ticker) -> _ => {
// OP: Do my work here
println!("Hello, work.");
// Comms Check: keep doing work?
// try_recv is non-blocking
// rx, the single consumer is clone-able in crossbeam
let try_result = rx2.try_recv();
match try_result {
Err(_e) => {},
Ok(msg) => {
match msg.as_str() {
"END_THE_WORLD" => {
println!("Ending the world.");
break;
},
_ => {},
}
},
_ => {}
}
}
}
}
});
// let work continue for 10 seconds then tell that thread to end.
std::thread::sleep(std::time::Duration::from_secs(10));
println!("Goodbye, world.");
tx.send("END_THE_WORLD".to_string());
}
Using strings as a message device is a tad cringeworthy--to me. Could do the other suspend and restart stuff there in an enum.

Resources