Match expression makes one thread monopolise the mpsc channel - multithreading

I'm doing the final project of the Rust book (ch 20 - Building a Multithreaded Web Server), and the final working code for a thread worker is:
impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Message>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            let message = receiver.lock().unwrap().recv().unwrap();

            match message {
                Message::NewJob(job) => {
                    println!("Worker {} got a job; executing.", id);
                    job();
                }
                Message::Terminate => {
                    println!("Worker {} was told to terminate.", id);
                    break;
                }
            }
        });
        // ...
    }
}
Then I decided to shorten the code by removing the message variable and applying match directly to receiver.lock().unwrap().recv().unwrap():
impl Worker {
    fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Message>>>) -> Worker {
        let thread = thread::spawn(move || loop {
            match receiver.lock().unwrap().recv().unwrap() {
                Message::NewJob(job) => {
                    println!("Worker {} got a job; executing.", id);
                    job();
                }
                Message::Terminate => {
                    println!("Worker {} was told to terminate.", id);
                    break;
                }
            }
        });
        // ...
    }
}
but now there is only one worker (the first one) responding to the job queue.
What is happening here?

The two code snippets are not equivalent. The difference lies in the way temporaries are borrowed and dropped.
In the first example, the temporary mutex guard is dropped at the end of the statement:
let message = receiver.lock().unwrap().recv().unwrap();
// the mutex guard is dropped here, and other threads can acquire the lock
while in the second snippet, the mutex guard remains alive and thus holds the lock until the end of the match statement:
// The mutex is locked here; the guard is a temporary of the match
// scrutinee, so the lock stays held until the end of the match,
// even while job() runs, blocking the other workers.
match receiver.lock().unwrap().recv().unwrap() {
    Message::NewJob(job) => {
        println!("Worker {} got a job; executing.", id);
        job();
    }
    Message::Terminate => {
        println!("Worker {} was told to terminate.", id);
        break;
    }
}
// the mutex guard is dropped here
So in the second example other threads are locked out from acquiring access to the receiving end of the channel.
To quote the answer from this thread:
when a temporary is created inside an expression, it is (only) dropped at the end of the enclosing statement:
A let binding ends its statement before the binding ("variable") is usable;
A match and its derivatives end the statement after the scope in which the binding ("variable") is usable.
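If you still want the shorter call site, one possible fix (a sketch, not from the original answers; next_message is a hypothetical helper) is to move the lock-and-receive into a function. The guard is then a temporary inside the helper and is dropped when the helper returns, before the match body runs:

fn next_message(receiver: &Mutex<mpsc::Receiver<Message>>) -> Message {
    // The MutexGuard created here is a temporary local to this call;
    // it is dropped when the function returns, releasing the lock.
    receiver.lock().unwrap().recv().unwrap()
}

In the worker loop, match next_message(&receiver) { ... } then executes jobs with the lock already released.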

Related

How to create an individual thread for sending pings on a client websocket

I wrote this code, but when sending pings, the program cannot do anything else. How can I spawn another thread to do this work while I do something else in my program?
pub fn sending_ping(addr: Addr<MicroscopeClient>) -> Result<(), ()> {
    info!("Pings started");
    spawn(async move {
        loop {
            info!("Ping");
            match addr.send(Ping {}).await {
                Ok(_) => {
                    info!("Ping sended")
                }
                Err(e) => {
                    warn!("Ping error");
                    return;
                }
            }
            std::thread::sleep(Duration::from_millis((4000) as u64));
        }
    });
    return Ok(());
}
Assuming you are using tokio, calling tokio::spawn does not necessarily guarantee that the task will be executed on a separate thread (though it may be).
The problem here is that std::thread::sleep() completely blocks the thread the task is running on: the ping itself runs asynchronously (not blocking other tasks), but once you reach the sleep, nothing else on the current thread will execute for the next 4 seconds.
You can resolve this by using a non-blocking version of sleep, such as the one provided by tokio: https://docs.rs/tokio/latest/tokio/time/fn.sleep.html
Thus your code would look like:
pub fn sending_ping(addr: Addr<MicroscopeClient>) -> Result<(), ()> {
    info!("Pings started");
    spawn(async move {
        loop {
            info!("Ping");
            match addr.send(Ping {}).await {
                Ok(_) => {
                    info!("Ping sended")
                }
                Err(e) => {
                    warn!("Ping error");
                    return;
                }
            }
            tokio::time::sleep(Duration::from_millis(4000)).await;
        }
    });
    return Ok(());
}
If you truly want to ensure that the task runs on a different thread, you would have to use std::thread::spawn, but you would then have to set up another async runtime. Instead, you could use spawn_blocking, which at least guarantees the task runs on a thread where blocking is expected.
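For illustration, a minimal sketch (assuming tokio) of pushing blocking work onto the blocking pool:

// spawn_blocking hands the closure to a dedicated pool of threads where
// blocking is acceptable, so the async worker threads stay free.
let handle = tokio::task::spawn_blocking(move || {
    std::thread::sleep(std::time::Duration::from_millis(4000)); // blocking is fine here
});
// Await the returned JoinHandle if you need to know when the work finished:
// handle.await.unwrap();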
I solved this particular problem in the following way
fn send_heartbeat(&self, ctx: &mut <Self as Actor>::Context) {
    ctx.run_interval(HEARTBEAT_INTERVAL, |act, ctx| {
        if Instant::now().duration_since(act.hb) > CLIENT_TIMEOUT {
            info!("Websocket Client heartbeat failed, disconnecting!");
            act.server_addr.do_send(Disconnect {
                id: act.user_id.clone(),
            });
            ctx.stop();
            return;
        }
    });
}
But I still don't know how to start some long process in a websocket in a separate thread, so that it does not block the websocket, with tokio="0.2.0" and actix="0.3.0".

What happens when a thread that calls Condvar::wait(condvar, some_mutex) is spuriously woken?

If a thread is sleeping, waiting on a notify_one / notify_all call, but is woken before that call occurs, does the call to wait return, or does the thread continue to wait?
I've read the docs (for Condvar::wait) and it's unclear to me whether it means that a call to wait will return as soon as the thread is woken:
Note that this function is susceptible to spurious wakeups. Condition variables normally have a boolean predicate associated with them, and the predicate must always be checked each time this function returns to protect against spurious wakeups.
If it is the case that the call to wait returns as soon as the thread is woken, then what is returned? Since it must return some kind of guard:
pub type LockResult<Guard> = Result<Guard, PoisonError<Guard>>;
A specific example, in particular the recv method:
pub struct Receiver<T> {
    shared: Arc<Shared<T>>,
}

impl<T> Receiver<T> {
    pub fn recv(&self) -> Option<T> {
        let mut inner_mutex_guard: MutexGuard<Inner<T>> = self.shared.inner.lock().unwrap();
        loop {
            match inner_mutex_guard.queue.pop_front() {
                Some(t) => return Some(t),
                None => {
                    if inner_mutex_guard.senders == 0 {
                        return None;
                    } else {
                        inner_mutex_guard = Condvar::wait(&self.shared.available, inner_mutex_guard).unwrap();
                    }
                }
            }
        }
    }
}

pub struct Inner<T> {
    queue: VecDeque<T>,
    senders: usize,
}

pub struct Shared<T> {
    inner: Mutex<Inner<T>>,
    available: Condvar,
}
Your example is correct, although I would write
inner_mutex_guard = self.shared.available.wait(inner_mutex_guard).unwrap();
instead of
inner_mutex_guard = Condvar::wait(&self.shared.available, inner_mutex_guard).unwrap();
If a spurious wakeup happens, your loop will call pop_front() again, which will potentially return None and enter another .wait().
I understand that you probably got confused by "Condition variables normally have a boolean predicate associated with them". That's because your boolean predicate is a little hidden ... it is "Does .pop_front() return Some?". Which you do check, as required, whenever a wakeup happens.
The return type of .wait() is another MutexGuard. In order for something to happen, someone else must have the possibility to change the locked value. Therefore, the value must be unlocked while waiting.
As there are several easy-to-miss race conditions in the implementation of unlock-wait-lock, it's usually done in a single call. So .wait() unlocks the value, waits for the condition to happen, and then locks the value again. That's why it returns a new MutexGuard ... it is the new locked and potentially changed value.
Although to be honest I'm not sure why they did it that way ... they could as well have done .wait(&mut MutexGuard) instead of .wait(MutexGuard) -> MutexGuard. But who knows.
Edit: I'm pretty sure they return a guard because the re-locking could fail; and therefore they don't actually return a MutexGuard but a LockResult<MutexGuard>.
Edit #2: It cannot be passed in by reference, because during the wait, no MutexGuard exists. The previous one was dropped, and the new one only exists when the data is locked again. And MutexGuard does not have an "empty" state (like None), so it actually has to be consumed and dropped. Which isn't possible with a reference.
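For reference, the signature of Condvar::wait in std makes this consume-and-return shape explicit:

// The guard is taken by value and a new (re-locked) guard is returned,
// wrapped in a LockResult in case the mutex was poisoned while waiting.
pub fn wait<'a, T>(&self, guard: MutexGuard<'a, T>) -> LockResult<MutexGuard<'a, T>>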
Explanation of CondVar
~ original answer before an example was added ~
You have to understand that a Condvar is more of an optimization: it prevents busy-waiting for a condition. That's why it takes a guard; it's supposed to be combined with a locked value that you want to watch for changes. So when it wakes up, check whether the value changed, and if not, go back to waiting. wait is meant to be called in a loop.
Here's an example:
use std::{
    sync::{Arc, Condvar, Mutex},
    thread::{sleep, JoinHandle},
    time::Duration,
};

struct Counter {
    value: Mutex<u32>,
    condvar: Condvar,
}

fn start_counting_thread(counter: Arc<Counter>) -> JoinHandle<()> {
    std::thread::spawn(move || loop {
        sleep(Duration::from_millis(100));
        let mut value = counter.value.lock().unwrap();
        *value += 1;
        counter.condvar.notify_all();
        if *value > 15 {
            break;
        }
    })
}

fn main() {
    let counter = Arc::new(Counter {
        value: Mutex::new(0),
        condvar: Condvar::new(),
    });

    let counting_thread = start_counting_thread(counter.clone());

    // Wait until the value is more than 10
    let mut value = counter.value.lock().unwrap();
    while *value <= 10 {
        println!("Value is {value}, waiting ...");
        value = counter.condvar.wait(value).unwrap();
    }
    println!("Condition met. Value is now {}.", *value);

    // Unlock the value
    drop(value);

    // Wait for the counting thread to finish
    counting_thread.join().unwrap();
}
Value is 0, waiting ...
Value is 1, waiting ...
Value is 2, waiting ...
Value is 3, waiting ...
Value is 4, waiting ...
Value is 5, waiting ...
Value is 6, waiting ...
Value is 7, waiting ...
Value is 8, waiting ...
Value is 9, waiting ...
Value is 10, waiting ...
Condition met. Value is now 11.
If you don't want to implement the loop manually but instead just wait until a condition is met, use wait_while:
use std::{
    sync::{Arc, Condvar, Mutex},
    thread::{sleep, JoinHandle},
    time::Duration,
};

struct Counter {
    value: Mutex<u32>,
    condvar: Condvar,
}

fn start_counting_thread(counter: Arc<Counter>) -> JoinHandle<()> {
    std::thread::spawn(move || loop {
        sleep(Duration::from_millis(100));
        let mut value = counter.value.lock().unwrap();
        *value += 1;
        counter.condvar.notify_all();
        if *value > 15 {
            break;
        }
    })
}

fn main() {
    let counter = Arc::new(Counter {
        value: Mutex::new(0),
        condvar: Condvar::new(),
    });

    let counting_thread = start_counting_thread(counter.clone());

    // Wait until the value is more than 10
    let mut value = counter.value.lock().unwrap();
    value = counter.condvar.wait_while(value, |val| *val <= 10).unwrap();
    println!("Condition met. Value is now {}.", *value);

    // Unlock the value
    drop(value);

    // Wait for the counting thread to finish
    counting_thread.join().unwrap();
}
Condition met. Value is now 11.

Avoid deadlock in rust when multiple spawns execute code in a loop

I am trying to run 2 threads in parallel and share some data between them. When either one of the threads contains a loop statement, access to the shared data in the other thread deadlocks.
But if I add a line of code to break out of the loop after a certain number of iterations, the deadlock is released and the operation in the other thread starts.
Rust Playground
Code:
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

#[derive(Clone, Copy)]
struct SomeNetwork {
    is_connected: bool,
}

impl SomeNetwork {
    fn connection_manager(&mut self) {
        loop {
            // if I exit the loop after a few iterations then the deadlock is removed
            // eg: when I use `for i in 0..10 {` instead of `loop`
            println!("connection_manager thread...");
            thread::sleep(Duration::from_millis(2000));
        }
    }

    fn api_calls(&self) {
        loop {
            if self.is_connected {
                //make_an_api_call()
            }
            println!("api_calls thread...");
            thread::sleep(Duration::from_millis(5000));
        }
    }

    pub fn start() {
        let self_arc = SomeNetwork {
            is_connected: false,
        };
        let self_arc = Arc::new(Mutex::new(self_arc));
        let self_cloned1 = Arc::clone(&self_arc);
        let self_cloned2 = Arc::clone(&self_arc);

        thread::Builder::new()
            .spawn(move || {
                let mut n = self_cloned1.lock().unwrap();
                n.connection_manager();
            })
            .unwrap();

        thread::Builder::new()
            .spawn(move || {
                let n = self_cloned2.lock().unwrap(); // <---- deadlock here
                n.api_calls();
            })
            .unwrap();

        loop {
            thread::sleep(Duration::from_millis(5000))
        }
    }
}

fn main() {
    SomeNetwork::start();
}
Output:
connection_manager thread...
connection_manager thread...
connection_manager thread...
connection_manager thread...
connection_manager thread...
....
Wouldn't the underlying OS take care of the scheduling once a thread goes into sleep?
What could be done here, so that I can run both threads in parallel?
The issue is that the mutex you created stays locked for the entire duration of connection_manager.
The way you use a mutex in Rust is that it wraps the data it locks. When you lock the mutex, it blocks the current thread until the lock can be obtained. Once it has been, it gives you a MutexGuard, which you can think of as a wrapper for a reference to the mutex. The MutexGuard gives you mutable access to the data inside the mutex. Once the MutexGuard is no longer needed, Rust invokes its Drop implementation, which unlocks the mutex and allows other threads to obtain it.
// Block until mutex is locked for this thread and return MutexGuard
let mut n = self_cloned1.lock().unwrap();
// Do stuff with the locked mutex
n.connection_manager();
// MutexGuard is no longer needed so it gets dropped and the mutex is released
As you can see, if connection_manager never exits, the mutex will remain locked by the first thread that obtained it.
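For completeness, the most direct fix (a sketch, not part of the original answer; do_one_connection_step is a hypothetical helper) is to hold the lock only while touching the shared data and drop the guard before sleeping:

loop {
    {
        let mut n = self_cloned1.lock().unwrap();
        n.do_one_connection_step(); // touch the shared state while holding the lock
    } // the MutexGuard is dropped here, releasing the lock
    thread::sleep(Duration::from_millis(2000)); // sleep without holding the lock
}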
What you want is probably to use a mutex with a condvar so the mutex can be released while the thread is sleeping.
Edit:
Here is a rough idea of what using condvars to handle connecting and channels to pass jobs to workers would look like. Playground Link
use std::sync::{Arc, Condvar, Mutex};
use std::thread::{self, current};
use std::time::Duration;
use crossbeam_channel::{unbounded, Receiver};

#[derive(Clone, Copy)]
struct SomeNetwork {
    is_connected: bool,
}

const TIMEOUT: Duration = Duration::from_secs(5);

impl SomeNetwork {
    fn connect(&mut self) {
        println!("connection_manager thread...");
        self.is_connected = true;
    }

    fn api_calls(&self, job: i32) {
        //println!("api_calls thread...");
        println!("[Worker {:?}] Handling job {}", current().id(), job);
        thread::sleep(Duration::from_millis(50))
    }

    pub fn start_connection_thread(
        self_data: Arc<Mutex<Self>>,
        connect_condvar: Arc<Condvar>,
        worker_condvar: Arc<Condvar>,
    ) {
        thread::Builder::new()
            .spawn(move || {
                let mut guard = self_data.lock().unwrap();
                loop {
                    // Do something with the data
                    if !guard.is_connected {
                        guard.connect();
                        // Notify all workers that the connection is ready
                        worker_condvar.notify_all();
                    }
                    // Use the condvar to release the mutex and wait until signaled to start again
                    let (new_guard, _) = connect_condvar.wait_timeout(guard, TIMEOUT).unwrap();
                    guard = new_guard;
                }
            })
            .unwrap();
    }

    pub fn start_worker_thread(
        self_data: Arc<Mutex<Self>>,
        connect_condvar: Arc<Condvar>,
        worker_condvar: Arc<Condvar>,
        requests: Receiver<i32>,
    ) {
        thread::Builder::new()
            .spawn(move || {
                loop {
                    // Wait until a request is received
                    let request = requests.recv().unwrap();
                    // Lock the mutex once we have a request
                    let mut guard = self_data.lock().unwrap();
                    // Make sure we are connected before starting tasks
                    while !guard.is_connected {
                        // Wake up 1 connection thread if the connection breaks
                        connect_condvar.notify_one();
                        // Sleep until signaled that the connection has been fixed
                        let (new_guard, _) = worker_condvar.wait_timeout(guard, TIMEOUT).unwrap();
                        guard = new_guard;
                    }
                    // Now that we have verified we are connected, handle the request
                    guard.api_calls(request);
                }
            })
            .unwrap();
    }

    pub fn start() {
        let self_arc = SomeNetwork {
            is_connected: false,
        };
        let self_arc = Arc::new(Mutex::new(self_arc));
        let connect_condvar = Arc::new(Condvar::new());
        let worker_condvar = Arc::new(Condvar::new());

        // Create a channel to send jobs to workers
        let (send, recv) = unbounded();

        Self::start_connection_thread(self_arc.clone(), connect_condvar.clone(), worker_condvar.clone());

        // Start some workers
        for _ in 0..5 {
            Self::start_worker_thread(self_arc.clone(), connect_condvar.clone(), worker_condvar.clone(), recv.clone());
        }

        // Send messages to the workers
        for message in 1..100 {
            send.send(message).unwrap();
        }

        loop {
            thread::sleep(Duration::from_millis(5000))
        }
    }
}

fn main() {
    SomeNetwork::start();
}

Tokio channel sends, but doesn't receive

TL;DR: I'm trying to have a background task that is identified by an ID and controlled via that ID through web calls, and the background task doesn't seem to be getting the message with any of the channel types I've tried.
I've tried both the std channels and tokio's, all but tokio's watch type. All have the same result, which probably means I've messed something up somewhere without realizing it, but I can't find the issue:
use std::collections::{
    hash_map::Entry::{Occupied, Vacant},
    HashMap,
};
use std::sync::Arc;
use tokio::sync::mpsc::{self, UnboundedSender};
use tokio::sync::RwLock;
use tokio::task::JoinHandle;
use uuid::Uuid;
use warp::{http, Filter};

#[derive(Default)]
pub struct Switcher {
    pub handle: Option<JoinHandle<bool>>,
    pub pipeline_end_tx: Option<UnboundedSender<String>>,
}

impl Switcher {
    pub fn set_sender(&mut self, tx: UnboundedSender<String>) {
        self.pipeline_end_tx = Some(tx);
    }
    pub fn set_handle(&mut self, handle: JoinHandle<bool>) {
        self.handle = Some(handle);
    }
}

const ADDR: [u8; 4] = [0, 0, 0, 0];
const PORT: u16 = 3000;

type RunningPipelines = Arc<RwLock<HashMap<String, Arc<RwLock<Switcher>>>>>;

#[tokio::main]
async fn main() {
    let running_pipelines = Arc::new(RwLock::new(HashMap::<String, Arc<RwLock<Switcher>>>::new()));

    let session_create = warp::post()
        .and(with_pipelines(running_pipelines.clone()))
        .and(warp::path("session"))
        .then(|pipelines: RunningPipelines| async move {
            println!("session requested OK!");
            let id = Uuid::new_v4();
            let mut switcher = Switcher::default();
            let (tx, mut rx) = mpsc::unbounded_channel::<String>();
            switcher.set_sender(tx);
            let t = tokio::spawn(async move {
                println!("Background going...");
                //This would be something processing in the background until it received the end signal
                match rx.recv().await {
                    Some(v) => {
                        println!(
                            "Got end message:{} YESSSSSS#!##!!!!!!!!!!!!!!!!1111eleven",
                            v
                        );
                    }
                    None => println!("Error receiving end signal:"),
                }
                println!("ABORTING HANDLE");
                true
            });
            let ret = HashMap::from([("session_id", id.to_string())]);
            switcher.set_handle(t);
            {
                pipelines
                    .write()
                    .await
                    .insert(id.to_string(), Arc::new(RwLock::new(switcher)));
            }
            Ok(warp::reply::json(&ret))
        });

    let session_end = warp::delete()
        .and(with_pipelines(running_pipelines.clone()))
        .and(warp::path("session"))
        .and(warp::query::<HashMap<String, String>>())
        .then(
            |pipelines: RunningPipelines, p: HashMap<String, String>| async move {
                println!("session end requested OK!: {:?}", p);
                match p.get("session_id") {
                    None => Ok(warp::reply::with_status(
                        "Please specify session to end",
                        http::StatusCode::BAD_REQUEST,
                    )),
                    Some(id) => {
                        let mut pipe = pipelines.write().await;
                        match pipe.entry(String::from(id)) {
                            Occupied(handle) => {
                                println!("occupied");
                                let (k, v) = handle.remove_entry();
                                drop(pipe);
                                println!("removed from hashmap, key:{}", k);
                                let s = v.write().await;
                                if let Some(h) = &s.handle {
                                    if let Some(tx) = &s.pipeline_end_tx {
                                        match tx.send("goodbye".to_string()) {
                                            Ok(res) => {
                                                println!(
                                                    "sent end message|{:?}| to fpipeline: {}",
                                                    res, id
                                                );
                                                //Added this to try to get it to at least Error on the other side
                                                drop(tx);
                                            },
                                            Err(err) => println!(
                                                "ERROR sending end message to pipeline({}):{}",
                                                id, err
                                            ),
                                        };
                                    } else {
                                        println!("no sender channel found for pipeline: {}", id);
                                    };
                                    h.abort();
                                } else {
                                    println!(
                                        "no luck finding the value in handle in the switcher: {}",
                                        id
                                    );
                                };
                            }
                            Vacant(_) => {
                                println!("no luck finding the handle in the pipelines: {}", id)
                            }
                        };
                        Ok(warp::reply::with_status("done", http::StatusCode::OK))
                    }
                }
            },
        );

    let routes = session_create
        .or(session_end)
        .recover(handle_rejection)
        .with(warp::cors().allow_any_origin());

    println!("starting server...");
    warp::serve(routes).run((ADDR, PORT)).await;
}

async fn handle_rejection(
    err: warp::Rejection,
) -> Result<impl warp::Reply, std::convert::Infallible> {
    Ok(warp::reply::json(&format!("{:?}", err)))
}

fn with_pipelines(
    pipelines: RunningPipelines,
) -> impl Filter<Extract = (RunningPipelines,), Error = std::convert::Infallible> + Clone {
    warp::any().map(move || pipelines.clone())
}
Dependencies:
[dependencies]
warp = "0.3"
tokio = { version = "1", features = ["full"] }
uuid = { version = "0.8.2", features = ["serde", "v4"] }
Results when I boot up, send a "create" request, and then an "end" request with the received ID:
starting server...
session requested OK!
Background going...
session end requested OK!: {"session_id": "6b984a45-38d8-41dc-bf95-422f75c5a429"}
occupied
removed from hashmap, key:6b984a45-38d8-41dc-bf95-422f75c5a429
sent end message|()| to fpipeline: 6b984a45-38d8-41dc-bf95-422f75c5a429
You'll notice that the background task starts (and doesn't end) when the "create" request is made, but when the "end" request is made, everything appears to complete successfully from the request (web) side, yet the background task never receives the message. As I've said, I've tried all the different channel types and moved things around to get it into this configuration, i.e. flattened and made as thread-safe as I could, or at least as much as I could think of. I'm greener than I would like in Rust, so any help would be VERY appreciated!
I think that the issue here is that you are sending the message and then immediately aborting the background task:
tx.send("goodbye".to_string());
//...
h.abort();
And the background task does not have time to process the message, as the abort is of higher priority.
What you need is to join the task, not to abort it.
Curiously, tokio task handles do not have a join() method; instead you await the handle itself. But for that you need to own the handle, so first you have to extract it from the Switcher:
let mut s = v.write().await;
// steal the task handle
if let Some(h) = s.handle.take() {
    //...
    tx.send("goodbye".to_string());
    //...
    // join the task
    h.await.unwrap();
}
Note that joining a task may fail, in case the task was aborted or panicked. I'm just panicking in the code above (the unwrap()), but you may want to do something different.
Or... you could simply not wait for the task. In tokio, if you drop a task handle, the task is detached. Then it will finish when it finishes.
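A minimal sketch of the two options, assuming a freshly spawned task:

let handle = tokio::spawn(async {
    // background work
});

// Option 1: join the task by awaiting the handle
// (this fails if the task panicked or was aborted)
handle.await.unwrap();

// Option 2: detach instead, by simply dropping the handle;
// the task keeps running on its own:
// drop(handle);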

How to correctly exit the thread blocking on mpsc::Receiver

impl A {
    fn new() -> (A, std::sync::mpsc::Receiver<Data>) {
        let (sender, receiver) = std::sync::mpsc::channel();
        let objA = A { sender: sender, }; // A spawns threads, clones and uses sender, etc.
        (objA, receiver)
    }
}

impl B {
    fn new() -> B {
        let (objA, receiver) = A::new();
        B {
            a: objA,
            join_handle: Some(std::thread::spawn(move || {
                loop {
                    match receiver.recv() {
                        Ok(data) => { /* Do something, inform main thread, etc. */ }
                        Err(_) => break,
                    }
                }
            })),
        }
    }
}

impl Drop for B {
    fn drop(&mut self) {
        // Want to do something like "sender.close()/receiver.close()" etc. so that the
        // following thread joins. But there is no such function. How do I break the
        // following thread?
        self.join_handle.take().unwrap().join().unwrap();
    }
}
Is there a way to cleanly exit under such a circumstance? The thing is that when either the receiver or the sender is dropped, the other end notices this and yields an error. In the receiver's case, it will be woken up and yield an error, in which case I break out of the infinite, blocking loop above. However, how do I trigger that explicitly, using this very property of channels, without resorting to extra flags in conjunction with try_recv() etc., and cleanly exit my thread deterministically?
Why not send a specific message to shut down the thread? I don't know what your data is, but most of the time it will be an enum; by adding a variant like MyData::Shutdown, you can simply break out of the loop when you receive it.
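A sketch of that approach (MyData and Payload are hypothetical names; Data is the payload type from the question):

enum MyData {
    Payload(Data),
    Shutdown,
}

// In the worker thread:
loop {
    match receiver.recv() {
        Ok(MyData::Payload(data)) => { /* process data */ }
        Ok(MyData::Shutdown) | Err(_) => break, // explicit shutdown or channel closed
    }
}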
You can wrap the a field of your B type in an Option. This way, in the Drop::drop method, you can do drop(self.a.take()), which replaces the field with None and drops the sender. This closes the channel, and your thread can then be properly joined.
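A sketch of that approach, assuming B stores a: Option<A>:

impl Drop for B {
    fn drop(&mut self) {
        // Dropping the sender (owned by A) closes the channel; recv() in the
        // worker thread then returns Err and the loop breaks.
        drop(self.a.take());
        self.join_handle.take().unwrap().join().unwrap();
    }
}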
You can create a new channel and swap your actual sender out with the dummy sender. Then you can drop your sender and therefore join the thread:
impl Drop for B {
    fn drop(&mut self) {
        let (s, _) = channel(); // dummy channel, only to have a sender to swap in
        drop(replace(&mut self.a.sender, s)); // replace is std::mem::replace
        self.join_handle.take().unwrap().join().unwrap();
    }
}
Try it out in the playpen: http://is.gd/y7A9L0
I don't know what the overhead of creating and immediately dropping a channel is, but it's not free and unlikely to be optimized out (there's an Arc in there).
On a side note, your infinite loop with a match on receiver.recv() could be replaced by a for loop using the Receiver::iter method:
for data in receiver.iter() {
    // do something with the value
}
