Why does this use of Condvar wait and notify not deadlock?

https://doc.rust-lang.org/stable/std/sync/struct.Condvar.html
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);

// Inside of our lock, spawn a new thread, and then wait for it to start.
thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut started = lock.lock().unwrap(); // #1
    *started = true;
    // We notify the condvar that the value has changed.
    cvar.notify_one();
});

// Wait for the thread to start up.
let (lock, cvar) = &*pair;
let mut started = lock.lock().unwrap(); // #2
while !*started {
    started = cvar.wait(started).unwrap();
}
If I understand this correctly, the code in the spawned thread (#1) might run after the main thread has already locked the mutex (#2). In that case, the main thread would never unlock it, because the spawned thread can't lock it and change the value, so the loop would keep running forever... Or is it because of some Condvar mechanics?

wait's documentation says (emphasis added):
pub fn wait<'a, T>(
    &self,
    guard: MutexGuard<'a, T>
) -> LockResult<MutexGuard<'a, T>>
This function will atomically unlock the mutex specified (represented by guard) and block the current thread. This means that any calls to notify_one or notify_all which happen logically after the mutex is unlocked are candidates to wake this thread up. When this function call returns, the lock specified will have been re-acquired.
When you call cvar.wait inside the loop it unlocks the mutex, which allows the spawned thread to lock it and set started to true.
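To make that concrete, here is a small sketch (mine, not from the question or answer) that forces the ordering the asker is worried about: the main thread takes the lock first, and the program still finishes because wait releases the lock while it blocks.

use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        // Sleep so the main thread is guaranteed to take the lock first (#2).
        thread::sleep(Duration::from_millis(100));
        let (lock, cvar) = &*pair2;
        // This lock() succeeds even though the main thread "owns" the mutex,
        // because wait() released it while blocking.
        let mut started = lock.lock().unwrap();
        *started = true;
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    let mut started = lock.lock().unwrap(); // main locks first
    while !*started {
        // wait() atomically unlocks the mutex, blocks, and re-locks on return.
        started = cvar.wait(started).unwrap();
    }
    println!("no deadlock: the spawned thread got the lock during wait()");
}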

Related

Wake another thread periodically with Rust

Condvar in Rust is good for waking another thread, but in the case below I don't really need the bool value; I just want to wake the other thread periodically.
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    // Inside of our lock, spawn a new thread, and then wait for it to start.
    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut started = lock.lock().unwrap();
        // We notify the condvar that the value has changed.
        loop {
            *started = true;
            cvar.notify_one();
            std::thread::sleep(std::time::Duration::from_millis(20));
        }
    });

    // Wait for the thread to start up.
    let (lock, cvar) = &*pair;
    let mut started = lock.lock().unwrap();
    loop {
        started = cvar.wait(started).unwrap();
        println!("done");
    }
    println!("end");
}
Also, this example does not even work, I don't know why. It should wake the main thread every 20 ms.
The reason this doesn't work is that you're holding the mutex while you sleep. The main thread can only be woken up once it has received a notify_one and the mutex becomes lockable again, but the spawned thread keeps the mutex locked forever.
In fact, you don't need the lock at all in your spawned thread, and you could make the mutex carry no data by constructing it as Mutex::new(()). However, if you do that, it is possible that your main thread hasn't finished one loop iteration by the time the spawned thread finishes its sleep. Locking the mutex ensures that when you call notify_one, the main thread is definitely waiting to be notified. Whether you want that behavior is up to you: with the lock taken in the spawned thread, the main thread is woken up immediately if its previous iteration took longer than one tick; without the locking, wake-ups may be skipped and the next wake-up is aligned to the next tick.
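For reference, a minimal sketch of one way to fix the original example while keeping the Condvar (my own variant, not taken from the answer): hold the mutex only around the flag update and notification, and sleep outside of it.

use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        loop {
            {
                // Lock only long enough to set the flag and notify,
                // then let the guard drop before sleeping.
                let mut started = lock.lock().unwrap();
                *started = true;
                cvar.notify_one();
            }
            thread::sleep(Duration::from_millis(20));
        }
    });

    let (lock, cvar) = &*pair;
    let mut started = lock.lock().unwrap();
    for _ in 0..5 {
        started = cvar.wait(started).unwrap();
        println!("done"); // printed roughly every 20 ms
    }
}

With this change the spawned thread can only acquire the mutex while the main thread is parked in wait, so "done" is printed for each tick (modulo spurious wakeups, which Condvar permits).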
But really, do what @Finomnis suggested in their answer and use channels.
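For completeness, here is a rough sketch of the channel-based approach (the specifics are mine, not taken from the linked answer), using std::sync::mpsc:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel::<()>();

    // Ticker thread: send a unit value every 20 ms.
    thread::spawn(move || loop {
        if tx.send(()).is_err() {
            break; // receiver was dropped, stop ticking
        }
        thread::sleep(Duration::from_millis(20));
    });

    // Main thread: block until the next tick arrives.
    for _ in 0..5 {
        rx.recv().unwrap();
        println!("done");
    }
}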

Concurrency in a rust tui app - lock starvation issue

I have a tui app (built with the tui-rs crate) with its state inside a global AppState struct.
The app has 2 threads:
The main thread draws the contents to the screen in a loop. It needs an immutable reference to AppState to do the drawing.
A second thread is occasionally spawned to do some CPU-heavy work. As a result it needs to update AppState and thus needs a mutable reference.
pub fn draw_screen() -> Result<(), Box<dyn Error>> {
    let mut terminal = init_terminal().unwrap();

    // prepare app state
    let mut state = AppState::fresh_state();
    let arc_state = Arc::new(Mutex::new(state));

    loop {
        terminal.draw(|f| {
            // needs an immutable reference to AppState
            do_some_drawing(arc_state.clone());
        });

        // needs a mutable reference to AppState
        thread::spawn(|| {
            do_some_calc(arc_state.clone());
        });
    }

    Ok(())
}
Currently this results in a hung app. The main thread keeps spinning in its loop, preventing the Mutex lock from ever being released. This means the second thread can never do its work, which in turn means the main thread (having checked whether the work is done yet) keeps spawning more worker threads, none of which can progress.
At least that's my best guess as to what's happening.
In case it matters, this is how I access the mutex in both immutable and mutable cases:
// when I need an immutable ref
let lock = thread_state.lock().unwrap();
let state = lock.deref();
// when I need a mutable ref
let mut lock = thread_state.lock().unwrap();
let state = lock.deref_mut();
What's the right way to resolve the above situation?
I've been reading about RefCell and channels but I'm not experienced enough to make a call. Would appreciate some guidance (any links to tutorials / docs super welcome).
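For context on the lock-starvation guess: a MutexGuard releases the lock only when it is dropped, so one common pattern is to keep the guard inside a small scope. A rough, self-contained sketch (the state type and loop body are placeholders, not from the actual app):

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let arc_state = Arc::new(Mutex::new(0u32)); // placeholder for AppState

    for _ in 0..3 {
        {
            // Lock only for the duration of the read; the guard is dropped
            // at the end of this inner block, releasing the lock.
            let state = arc_state.lock().unwrap();
            println!("drawing with state {}", *state);
        } // lock released here, so a worker thread can acquire it

        let worker_state = Arc::clone(&arc_state);
        thread::spawn(move || {
            let mut state = worker_state.lock().unwrap();
            *state += 1; // mutable access from the worker
        });

        thread::sleep(Duration::from_millis(10));
    }
}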

Multithreaded list iteration while using Mutex to prevent dealing with the same type at the same time

I am writing an application that needs to run on many threads at the same time. It will process a long list of items where one property of each item is a user_id. I am trying to make sure that items belonging to the same user_id are never processed at the same time. This means that the closure running on the worker threads needs to wait until no other thread is processing data for the same user.
I do not understand how to solve this. My simplified, current example, looks like this:
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::new(num_cpus::get());
    let mut locks: HashMap<String, Mutex<bool>> = HashMap::new();

    let queue = Arc::new(vec![
        "1".to_string(),
        "1".to_string(),
        "2".to_string(),
        "1".to_string(),
        "3".to_string(),
    ]);

    let count = queue.len();

    for i in 0..count {
        let user_id = queue[i].clone();

        // Problem: cannot borrow `locks` as mutable more than once at a time
        // mutable borrow starts here in previous iteration of loop
        let lock = locks.entry(user_id).or_insert(Mutex::new(true));

        pool.execute(move || {
            // Wait until the user_id becomes free.
            lock.lock().unwrap();

            // Do stuff with user_id, but never process
            // the same user_id more than once at the same time.
            println!("{:?}", user_id);
        });
    }

    pool.join();
}
I am trying to keep a map of Mutexes which I then use to wait for the user_id to become free, but the borrow checker does not allow this. The queue items and the processing code are much more complex in the actual application I am working on.
I am not allowed to change the order of the items in the queue (though some variation is acceptable because of the waiting on locks).
How can I solve this scenario?
First of all, HashMap::entry() consumes the key, so since you want to use it in the closure as well, you'll need to clone it, i.e. .entry(user_id.clone()).
Since you need to share the Mutex<bool> between the main thread and worker threads, then you need to likewise wrap that in an Arc. You can also use Entry::or_insert_with(), so you avoid needlessly creating a new Mutex unless needed.
let mut locks: HashMap<String, Arc<Mutex<bool>>> = HashMap::new();

// ...

let lock = locks
    .entry(user_id.clone())
    .or_insert_with(|| Arc::new(Mutex::new(true)))
    .clone();
Lastly, you must store the guard returned by lock(); otherwise it is dropped, and the lock released, immediately.

let _guard = lock.lock().unwrap();

Putting it all together:
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use threadpool::ThreadPool;

fn main() {
    let pool = ThreadPool::new(num_cpus::get());
    let mut locks: HashMap<String, Arc<Mutex<bool>>> = HashMap::new();

    let queue = Arc::new(vec![
        "1".to_string(),
        "1".to_string(),
        "2".to_string(),
        "1".to_string(),
        "3".to_string(),
    ]);

    let count = queue.len();

    for i in 0..count {
        let user_id = queue[i].clone();

        let lock = locks
            .entry(user_id.clone())
            .or_insert_with(|| Arc::new(Mutex::new(true)))
            .clone();

        pool.execute(move || {
            // Wait until the user_id becomes free.
            let _guard = lock.lock().unwrap();

            // Do stuff with user_id, but never process
            // the same user_id more than once at the same time.
            println!("{:?}", user_id);
        });
    }

    pool.join();
}

Understanding &* to access a Rust Arc

Reading about Condvar (condition variable for Rust) at https://doc.rust-lang.org/beta/std/sync/struct.Condvar.html I stumbled upon:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = pair.clone();

// Inside of our lock, spawn a new thread, and then wait for it to start.
thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut started = lock.lock().unwrap();
    *started = true;
    // We notify the condvar that the value has changed.
    cvar.notify_one();
});

// Wait for the thread to start up.
let (lock, cvar) = &*pair;
let mut started = lock.lock().unwrap();
while !*started {
    started = cvar.wait(started).unwrap();
}
What is the &*pair2 thing? I think it has to do with retrieving the pair from inside the Arc, but wouldn't it be better to have a simple method that retrieves the internal object of the Arc as a reference?
Can somebody explain to me exactly what &* does?
The * operator turns the Arc<T> into T. The & operator borrows that T into &T.
So when we put them together, &*pair borrows the T inside the Arc<T>, giving us a &T.
Another way of writing that code would be:
let (lock, cvar) = pair2.deref();
Indeed, the original &*pair2 actually means &*pair2.deref() – the * forces the compiler to insert a .deref() call, and it's that method which performs the actual conversion.
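A small sketch (mine, not from the answer) showing the two spellings side by side; note that calling deref() explicitly requires std::ops::Deref to be in scope:

use std::ops::Deref;
use std::sync::{Arc, Condvar, Mutex};

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));

    // `*pair` dereferences the Arc to the inner tuple; `&` re-borrows it,
    // and the tuple pattern then gives us references to each field.
    let (lock_a, cvar_a) = &*pair;

    // Equivalent: call the Deref implementation explicitly.
    let (lock_b, cvar_b) = pair.deref();

    // Both point at the same data inside the Arc.
    assert!(std::ptr::eq(lock_a, lock_b));
    assert!(std::ptr::eq(cvar_a, cvar_b));
}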

Shouldn't a loop spawned in a thread print repeatedly?

Example code:
fn main() {
    use std::thread::spawn;
    spawn(|| { loop { println!("a") } });
    // `a` is never printed
}

fn main() {
    use std::thread::spawn;
    spawn(|| { loop { println!("a") } });
    loop { }
    // `a` is printed repeatedly
}
a prints to the standard output in the second case, but the same is not true in the first case. Why is that? Shouldn't a print repeatedly in the first case as well?
Shouldn't a print repeatedly in the first case as well?
No. The documentation of thread::spawn says (emphasis mine):
The join handle will implicitly detach the child thread upon being dropped. In this case, the child thread may outlive the parent (unless the parent thread is the main thread; the whole process is terminated when the main thread finishes.) Additionally, the join handle provides a join method that can be used to join the child thread. If the child thread panics, join will return an Err containing the argument given to panic.
Your entire program exits because the main thread has exited. There was never even a chance for the child thread to start, much less print anything.
In the second example, you prevent the main thread from exiting by also causing that to spin forever.
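To see this, a variant (mine, not from the answer) where the main thread is merely kept alive for a moment lets the child run briefly before the process exits:

use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| loop {
        println!("a")
    });

    // Keep the main thread alive for a moment; `a` is printed until the
    // main thread returns and the whole process is terminated.
    thread::sleep(Duration::from_millis(10));
}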
What happens when you spawn a loop?
That thread will spin in the loop, as long as the program executes.
Idiomatically, you don't need the extra curly braces in the spawn, and it's more standard to import only std::thread and then call thread::spawn:
fn main() {
    use std::thread;
    thread::spawn(|| loop {
        println!("a")
    });
}
To have the main thread wait for the child, you need to keep the JoinHandle from thread::spawn and call join on it:
fn main() {
    use std::thread;
    let handle = thread::spawn(|| loop {
        println!("a")
    });
    handle.join().expect("The thread panicked");
}
