How to avoid a deadlock caused by a thread panic?

My server uses a Barrier to notify the client when it's safe to attempt to connect. Without the barrier we risk failing randomly, because there is no guarantee that the server socket has been bound by the time the client tries to connect.
Now imagine that the server panics, for instance because it tries to bind the socket to port 80. The client will be left wait()-ing forever. We cannot join() the server thread to find out whether it panicked, because join() is a blocking operation: if we join() we will never reach connect().
What's the proper way to do this kind of synchronization, given that the std::sync APIs do not provide methods with timeouts?
This is just an MCVE to demonstrate the issue; I hit a similar case in a unit test, which was left running forever.
use std::{
    io::prelude::*,
    net::{SocketAddr, TcpListener, TcpStream},
    sync::{Arc, Barrier},
    time::Duration,
};

fn main() {
    let port = 9090;
    //let port = 80;
    let barrier = Arc::new(Barrier::new(2));
    let server_barrier = barrier.clone();
    let client_sync = move || {
        barrier.wait();
    };
    let server_sync = Box::new(move || {
        server_barrier.wait();
    });
    server(server_sync, port);
    //server(Box::new(|| { no_sync() }), port); //use to test without synchronisation
    client(&client_sync, port);
    //client(&no_sync, port); //use to test without synchronisation
}

fn no_sync() {
    // do nothing in order to demonstrate the need for synchronization
}

fn server(sync: Box<dyn Fn() + Send + Sync>, port: u16) {
    std::thread::spawn(move || {
        //there is no guarantee when the OS will schedule the thread; sleep to make it 100% reproducible
        std::thread::sleep(Duration::from_millis(100));
        let addr = SocketAddr::from(([127, 0, 0, 1], port));
        let socket = TcpListener::bind(&addr).unwrap();
        println!("server socket bound");
        sync();
        let (mut client, _) = socket.accept().unwrap();
        client.write_all(b"hello mcve").unwrap();
    });
}

fn client(sync: &dyn Fn(), port: u16) {
    sync();
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    let mut socket = TcpStream::connect(&addr).unwrap();
    println!("client socket connected");
    let mut buf = String::new();
    socket.read_to_string(&mut buf).unwrap();
    println!("client received: {}", buf);
}

Instead of a Barrier I would use a Condvar here.
To actually solve your problem, I see at least three possible solutions:
1. Use Condvar::wait_timeout and set the timeout to a reasonable duration (e.g. 1 second, which should be enough for binding to a port).
2. Use the same method as above, but with a lower timeout (e.g. 10 ms), and check whether the Mutex is poisoned.
3. Instead of a Condvar, use a plain Mutex (making sure that the Mutex is locked by the other thread first) and then use Mutex::try_lock to check whether the Mutex is poisoned.
I would prefer solution 1 or 2 over the third one, because they avoid having to make sure that the other thread has locked the Mutex first.
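For illustration, here is a minimal, untested sketch of solution 1 adapted to the question's setup (the flag, variable names, and the one-second timeout are arbitrary, and wait_timeout is used via the wait_timeout_while convenience wrapper): the server sets a flag under a Mutex and notifies a Condvar once the socket is bound, and the client waits with a timeout so that a server panic cannot leave it blocked forever.

use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

fn main() {
    let ready = Arc::new((Mutex::new(false), Condvar::new()));
    let server_ready = ready.clone();

    std::thread::spawn(move || {
        // ... bind the listener here; if bind() panics, the flag is never set ...
        let (lock, cvar) = &*server_ready;
        *lock.lock().unwrap() = true;
        cvar.notify_one();
    });

    // Client side: wait at most one second for the server to signal readiness.
    let (lock, cvar) = &*ready;
    let guard = lock.lock().unwrap();
    let (guard, timeout) = cvar
        .wait_timeout_while(guard, Duration::from_secs(1), |ready_flag| !*ready_flag)
        .unwrap();
    if timeout.timed_out() && !*guard {
        eprintln!("server did not come up in time (it may have panicked)");
        return;
    }
    // ... safe to connect() here ...
}

Note that a Mutex only becomes poisoned if the other thread panics while actually holding the lock, which is why solutions 2 and 3 require the server to take the lock before doing the work that might panic.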

Related

How to properly use self when creating a new thread from inside a method in Rust

I am creating a server that stores the TcpStream objects inside a Vec to be used later. The problem is that the function that listens for new connections and adds them to the Vec runs forever in a separate thread and doesn't allow other threads to read the Vec.
pub struct Server {
    pub connections: Vec<TcpStream>,
}

impl Server {
    fn listen(&mut self) {
        println!("Server is listening on port 8080");
        let listener = TcpListener::bind("127.0.0.1:8080").unwrap();
        loop {
            let stream = listener.accept().unwrap().0;
            println!("New client connected: {}", stream.peer_addr().unwrap());
            //should block for write here
            self.connections.push(stream);
            //should release write lock
        }
    }

    pub fn run(self) {
        let arc_self = Arc::new(RwLock::new(self));
        let arc_self_clone = arc_self.clone();
        //blocks the lock for writing forever because of listen()
        let listener_thread = thread::spawn(move || arc_self_clone.write().unwrap().listen());
        loop {
            let mut input = String::new();
            io::stdin().read_line(&mut input).unwrap();
            if input.trim() == "1" {
                //can't read because lock blocked for writing
                for c in &arc_self.read().unwrap().connections {
                    println!("testing...");
                }
            }
        }
    }
}
In the current example the server accepts connections but does not allow the main thread to read the connections vector. I thought about making the listen function run at a fixed interval (1-5 s) so that other threads can read the vector in between, but listener.accept() blocks the thread anyway, so I don't think that is a valid solution. Ideally the listener would run forever, lock the vector for writing only when a new client connects, and leave it readable by other threads while it waits for connections.
You could just wrap connections in a RwLock instead of the entire self, as shown below, but I would recommend using a proper synchronisation primitive like a channel (a sketch of that follows after the code).
use std::io;
use std::net::{TcpListener, TcpStream};
use std::sync::{Arc, RwLock};
use std::thread;

pub struct Server {
    pub connections: RwLock<Vec<TcpStream>>,
}

impl Server {
    fn listen(&self) {
        println!("Server is listening on port 8080");
        let listener = TcpListener::bind("127.0.0.1:8080").unwrap();
        loop {
            let stream = listener.accept().unwrap().0;
            println!("New client connected: {}", stream.peer_addr().unwrap());
            //should block for write here
            self.connections.write().unwrap().push(stream);
            //should release write lock
        }
    }

    pub fn run(self) {
        let arc_self = Arc::new(self);
        let arc_self_clone = arc_self.clone();
        let listener_thread = thread::spawn(move || arc_self_clone.listen());
        loop {
            let mut input = String::new();
            io::stdin().read_line(&mut input).unwrap();
            if input.trim() == "1" {
                for c in &*arc_self.connections.try_read().unwrap() {
                    println!("testing...");
                }
            }
        }
    }
}
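As a hedged sketch of the channel suggestion (this is not part of the original answer; the port and the "1" command are just reused from the question), the listener thread can send each accepted stream to the main thread, which then owns the Vec outright and can read it without any lock:

use std::io;
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<TcpStream>();

    thread::spawn(move || {
        let listener = TcpListener::bind("127.0.0.1:8080").unwrap();
        for stream in listener.incoming() {
            let stream = stream.unwrap();
            println!("New client connected: {}", stream.peer_addr().unwrap());
            if tx.send(stream).is_err() {
                break; // main thread has gone away
            }
        }
    });

    let mut connections: Vec<TcpStream> = Vec::new();
    loop {
        let mut input = String::new();
        io::stdin().read_line(&mut input).unwrap();
        // Drain any streams accepted since the last command without blocking.
        connections.extend(rx.try_iter());
        if input.trim() == "1" {
            for _c in &connections {
                println!("testing...");
            }
        }
    }
}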

Kill warp webserver on request in rust

I'm learning Rust, and one thing I want to do is kill, or shut down, a webserver on GET /.
Is this something you can't do in with warp? Or is my implementation broken?
I've got the following code, but it just doesn't seem to want to respond to any HTTP requests.
pub async fn perform_oauth_flow(&self) {
    let (tx, rx) = channel::unbounded();
    let routes = warp::path::end().map(move || {
        println!("handling");
        tx.send("kill");
        Ok(warp::reply::with_status("OK", http::StatusCode::CREATED))
    });
    println!("Spawning server");
    let webserver_thread = thread::spawn(|| async {
        spawn(warp::serve(routes).bind(([127, 0, 0, 1], 3000)))
            .await
            .unwrap();
    });
    println!("waiting for result");
    let result = rx.recv().unwrap();
    println!("Got result");
    if result == "kill" {
        webserver_thread.join().unwrap().await;
    }
}
let webserver_thread = thread::spawn(|| async {
//                                      ^^^^^
Creating an async block is not going to execute the code inside; it is just creating a Future you need to .await. Your server never actually runs.
In general, mixing threads with async code is not going to work well. It is better to use your runtime's tasks; in the case of warp that runtime is Tokio, so use tokio::spawn():
let webserver_thread = tokio::spawn(async move {
    spawn(warp::serve(routes).bind(([127, 0, 0, 1], 3000)))
        .await
        .unwrap();
});
// ...
if result == "kill" {
    webserver_thread.await;
}
You may also find it necessary to use tokio's async channels instead of synchronous channels.
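For example, here is a hedged (untested) variation of the snippet above using tokio's unbounded channel; UnboundedSender::send is synchronous, so it can be called from inside warp's non-async handler closure, while the receiver is awaited without blocking a runtime thread:

let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel::<&'static str>();
let routes = warp::path::end().map(move || {
    let _ = tx.send("kill"); // not async, so it works in the sync closure
    warp::reply::with_status("OK", http::StatusCode::CREATED)
});
let server = tokio::spawn(warp::serve(routes).bind(([127, 0, 0, 1], 3000)));
if rx.recv().await == Some("kill") {
    // abort() cancels the server task immediately; for a clean shutdown,
    // see the next answer's bind_with_graceful_shutdown.
    server.abort();
}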
There are two issues in your code:
1. As pointed out by @ChayimFriedman's answer, you never start the server because your async block never runs.
2. Even if you replace the threads with Tokio tasks, you never tell the server to exit. You need to use bind_with_graceful_shutdown so that you can notify the server to exit.
(Untested) complete example:
pub async fn perform_oauth_flow(&self) {
    let (tx, rx) = tokio::sync::oneshot::channel();
    // A oneshot sender is consumed by send(), but warp handlers must be Fn,
    // so stash it in a Mutex<Option<_>> and take() it on the first request.
    let tx = std::sync::Mutex::new(Some(tx));
    let routes = warp::path::end().map(move || {
        println!("handling");
        if let Some(tx) = tx.lock().unwrap().take() {
            let _ = tx.send(());
        }
        warp::reply::with_status("OK", http::StatusCode::CREATED)
    });
    println!("Spawning server");
    let (_addr, server) = warp::serve(routes).bind_with_graceful_shutdown(
        ([127, 0, 0, 1], 3000),
        async {
            rx.await.ok();
        },
    );
    println!("waiting for result");
    server.await;
}

[tokio-rs][documentation] Multiple asynchronous "sub-apps" with shared state example?

A common pattern for Node.js apps is to split them into many "sub-apps" that share some state. Of course, all the "sub-apps" should be handled asynchronously.
Here's a simple example of such a Node app, with three "sub-apps":
An interval timer => Every 10 seconds, a shared itv_counter is incremented
A TCP server => For every TCP message received, a shared tcp_counter is incremented
A UDP server => For every UDP message received, a shared udp_counter is incremented
Every time one of the counters is incremented, all three counters must be printed (hence why the "sub-apps" need to share state).
Here's an implementation in Node. The nice thing about Node is that you can assume that pretty much all I/O operations are handled asynchronously by default. There's no cognitive overhead for the developer.
const dgram = require('dgram');
const net = require('net');

const tcp_port = 3000;
const udp_port = 3001;

const tcp_listener = net.createServer();
const udp_listener = dgram.createSocket('udp4');

// state shared by the 3 asynchronous applications
const shared_state = {
  itv_counter: 0,
  tcp_counter: 0,
  udp_counter: 0,
};

// itv async app: increment itv_counter every 10 seconds and print shared state
setInterval(() => {
  shared_state.itv_counter += 1;
  console.log(`itv async app: ${JSON.stringify(shared_state)}`);
}, 10_000);

// tcp async app: increment tcp_counter every time a TCP message is received and print shared state
tcp_listener.on('connection', (client) => {
  client.on('data', (_data) => {
    shared_state.tcp_counter += 1;
    console.log(`tcp async app: ${JSON.stringify(shared_state)}`);
  });
});
tcp_listener.listen(tcp_port, () => {
  console.log(`TCP listener on port ${tcp_port}`);
});

// udp async app: increment udp_counter every time a UDP message is received and print shared state
udp_listener.on('message', (_message, _client) => {
  shared_state.udp_counter += 1;
  console.log(`udp async app: ${JSON.stringify(shared_state)}`);
});
udp_listener.on('listening', () => {
  console.log(`UDP listener on port ${udp_port}`);
});
udp_listener.bind(udp_port);
Now, here's an implementation in Rust with Tokio as the asynchronous runtime.
use std::sync::{Arc, Mutex};
use std::time::Duration;
use tokio::net::{TcpListener, UdpSocket};
use tokio::prelude::*;

// state shared by the 3 asynchronous applications
#[derive(Clone, Debug)]
struct SharedState {
    state: Arc<Mutex<State>>,
}

#[derive(Debug)]
struct State {
    itv_counter: usize,
    tcp_counter: usize,
    udp_counter: usize,
}

impl SharedState {
    fn new() -> SharedState {
        SharedState {
            state: Arc::new(Mutex::new(State {
                itv_counter: 0,
                tcp_counter: 0,
                udp_counter: 0,
            })),
        }
    }
}

#[tokio::main]
async fn main() {
    let shared_state = SharedState::new();

    // itv async app: increment itv_counter every 10 seconds and print shared state
    let itv_shared_state = shared_state.clone();
    let itv_handle = tokio::spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_secs(10));
        interval.tick().await;
        loop {
            interval.tick().await;
            let mut state = itv_shared_state.state.lock().unwrap();
            state.itv_counter += 1;
            println!("itv async app: {:?}", state);
        }
    });

    // tcp async app: increment tcp_counter every time a TCP message is received and print shared state
    let tcp_shared_state = shared_state.clone();
    let tcp_handle = tokio::spawn(async move {
        let mut tcp_listener = TcpListener::bind("127.0.0.1:3000").await.unwrap();
        println!("TCP listener on port 3000");
        while let Ok((mut tcp_stream, _)) = tcp_listener.accept().await {
            let tcp_shared_state = tcp_shared_state.clone();
            tokio::spawn(async move {
                let mut buffer = [0; 1024];
                while let Ok(byte_count) = tcp_stream.read(&mut buffer).await {
                    if byte_count == 0 {
                        break;
                    }
                    let mut state = tcp_shared_state.state.lock().unwrap();
                    state.tcp_counter += 1;
                    println!("tcp async app: {:?}", state);
                }
            });
        }
    });

    // udp async app: increment udp_counter every time a UDP message is received and print shared state
    let udp_shared_state = shared_state.clone();
    let udp_handle = tokio::spawn(async move {
        let mut udp_listener = UdpSocket::bind("127.0.0.1:3001").await.unwrap();
        println!("UDP listener on port 3001");
        let mut buffer = [0; 1024];
        while let Ok(_byte_count) = udp_listener.recv(&mut buffer).await {
            let mut state = udp_shared_state.state.lock().unwrap();
            state.udp_counter += 1;
            println!("udp async app: {:?}", state);
        }
    });

    itv_handle.await.unwrap();
    tcp_handle.await.unwrap();
    udp_handle.await.unwrap();
}
First of all, as I'm not super comfortable with Tokio and async Rust yet, there might be things that are dead wrong in this implementation, or bad practice. Please let me know if that's the case (e.g. I have no clue if the three JoinHandle .await are necessary at the very end). That said, it behaves the same as the Node implementation for my simple tests.
But I'm still not sure if it's equivalent under the hood in terms of asynchronicity. Should there be a tokio::spawn for every callback in the Node app? In that case, I should wrap tcp_stream.read() and udp_listener.recv() in another tokio::spawn to mimic the Node callbacks for TCP's on('data') and UDP's on('message'), respectively. Not sure...
What would be the tokio implementation that would be totally equivalent to the Node.js app in terms of asynchronicity? In general, what's a good rule of thumb to know when something should be wrapped in a tokio::spawn?
Since each task only ever updates its own counter, one option is to treat the State struct as a token that is passed around between the tasks instead of being shared behind a Mutex.
That way every task remains responsible for updating its own counter.
As a concrete suggestion, you could use tokio::sync::mpsc::channel and set up three channels, each one handing the state from one task to the next.
If the tasks update at different rates, some of the printed values may be slightly stale, but in most cases that can be ignored. A sketch of a related channel-based arrangement follows.
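The answer describes a ring of three channels handing the State struct from task to task; as a simpler, hedged variation on the same channel idea (not the exact scheme described above, and the names are illustrative), a single task can own the State and the other tasks can send it events, so no Mutex is needed at all:

use tokio::sync::mpsc;

#[derive(Debug, Default)]
struct State {
    itv_counter: usize,
    tcp_counter: usize,
    udp_counter: usize,
}

enum Event {
    Itv,
    Tcp,
    Udp,
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<Event>(32);

    // Owner task: the only place that touches the counters.
    tokio::spawn(async move {
        let mut state = State::default();
        while let Some(event) = rx.recv().await {
            match event {
                Event::Itv => state.itv_counter += 1,
                Event::Tcp => state.tcp_counter += 1,
                Event::Udp => state.udp_counter += 1,
            }
            println!("state: {:?}", state);
        }
    });

    // Each producer clones the sender and reports events instead of mutating
    // shared memory; for example, the interval task becomes:
    let itv_tx = tx.clone();
    let itv_handle = tokio::spawn(async move {
        let mut interval = tokio::time::interval(std::time::Duration::from_secs(10));
        loop {
            interval.tick().await;
            let _ = itv_tx.send(Event::Itv).await;
        }
    });
    // ... the TCP and UDP tasks send Event::Tcp / Event::Udp in the same way.

    itv_handle.await.unwrap();
}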

What is the best way to share a big read-only structure between two threads?

One thread calculates some data that takes about 1GB of RAM and another thread only reads this data. What is the best way to implement this?
use std::thread;
use std::time::Duration;

fn main() {
    let mut shared: i32 = 0; // will be changed to big structure
    thread::spawn(move || {
        loop {
            shared += 1;
            println!("write shared {}", shared);
            thread::sleep(Duration::from_secs(2));
        }
    });
    thread::spawn(move || {
        loop {
            thread::sleep(Duration::from_secs(1));
            println!("read shared = ???"); // <---------------- ????
        }
    });
    thread::sleep(Duration::from_secs(4));
    println!("main");
}
You can run this code online (play.rust-lang.org)
The code and your statements don't really make sense together. For example, there's nothing that prevents the second thread from finishing before the first thread ever has a chance to start. Yes, I see the sleeps, but sleeping is not a viable concurrency solution.
For the question as asked, I'd use a channel. This allows one thread to produce a value and then transfer ownership of that value to another thread:
use std::thread;
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    let a = thread::spawn(move || {
        let large_value = 1;
        println!("write large_value {}", large_value);
        tx.send(large_value).expect("Unable to send");
    });

    let b = thread::spawn(move || {
        let large_value = rx.recv().expect("Unable to receive");
        println!("read shared = {}", large_value);
    });

    a.join().expect("Unable to join a");
    b.join().expect("Unable to join b");

    println!("main");
}
For the code as presented, there's really no other options besides a Mutex or a RwLock. This allows one thread to mutate the shared value for a while, then the other thread may read it for a while (subject to the vagaries of the OS scheduler):
use std::thread;
use std::time::Duration;
use std::sync::{Arc, Mutex};

fn main() {
    let shared = Arc::new(Mutex::new(0));
    let shared_1 = shared.clone();

    thread::spawn(move || {
        loop {
            let mut shared = shared_1.lock().expect("Unable to lock");
            *shared += 1;
            println!("write large_value {}", *shared);
        }
    });

    thread::spawn(move || {
        loop {
            let shared = shared.lock().expect("Unable to lock");
            println!("read shared = {}", *shared);
        }
    });

    thread::sleep(Duration::from_secs(1));
    println!("main");
}
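If the data really is read-only once it has been computed, a hedged variation (illustrative only, not from the answer above) is to combine the two ideas: send an Arc of the finished value over a channel, then clone the Arc into as many reader threads as you like. No lock is needed because nobody mutates the data afterwards, and cloning the Arc copies only a pointer, not the 1 GB payload.

use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    let producer = thread::spawn(move || {
        let large_value = vec![0u8; 1_000]; // stand-in for the ~1 GB structure
        tx.send(Arc::new(large_value)).expect("Unable to send");
    });

    let shared: Arc<Vec<u8>> = rx.recv().expect("Unable to receive");
    let readers: Vec<_> = (0..4)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                println!("reader {} sees {} bytes", i, shared.len());
            })
        })
        .collect();

    producer.join().expect("Unable to join producer");
    for r in readers {
        r.join().expect("Unable to join reader");
    }
}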
None of this is particularly unique to Rust; channels are quite popular in Go and Clojure and mutexes have existed for A Very Long Time. I'd suggest checking out any of the numerous beginner's guides on the Internet for multithreading and the perils therein.

How to terminate or suspend a Rust thread from another thread?

Editor's note — this example was created before Rust 1.0 and the specific types have changed or been removed since then. The general question and concept remains valid.
I have spawned a thread with an infinite loop and timer inside.
thread::spawn(|| {
    let mut timer = Timer::new().unwrap();
    let periodic = timer.periodic(Duration::milliseconds(200));
    loop {
        periodic.recv();
        // Do my work here
    }
});
After a time based on some conditions, I need to terminate this thread from another part of my program. In other words, I want to exit from the infinite loop. How can I do this correctly? Additionally, how could I to suspend this thread and resume it later?
I tried to use a global unsafe flag to break the loop, but I think this solution does not look nice.
For both terminating and suspending a thread you can use channels.
Terminated externally
On each iteration of a worker loop, we check if someone notified us through a channel. If yes or if the other end of the channel has gone out of scope we break the loop.
use std::io::{self, BufRead};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;

fn main() {
    println!("Press enter to terminate the child thread");
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || loop {
        println!("Working...");
        thread::sleep(Duration::from_millis(500));
        match rx.try_recv() {
            Ok(_) | Err(TryRecvError::Disconnected) => {
                println!("Terminating.");
                break;
            }
            Err(TryRecvError::Empty) => {}
        }
    });

    let mut line = String::new();
    let stdin = io::stdin();
    let _ = stdin.lock().read_line(&mut line);

    let _ = tx.send(());
}
Suspending and resuming
We use recv(), which suspends the thread until something arrives on the channel. In order to resume the thread, you need to send something through the channel; the unit value () in this case. If the transmitting end of the channel is dropped, recv() will return an error, and we use this to exit the loop.
use std::io::{self, BufRead};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    println!("Press enter to wake up the child thread");
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || loop {
        println!("Suspending...");
        match rx.recv() {
            Ok(_) => {
                println!("Working...");
                thread::sleep(Duration::from_millis(500));
            }
            Err(_) => {
                println!("Terminating.");
                break;
            }
        }
    });

    let mut line = String::new();
    let stdin = io::stdin();
    for _ in 0..4 {
        let _ = stdin.lock().read_line(&mut line);
        let _ = tx.send(());
    }
}
Other tools
Channels are the easiest and the most natural (IMO) way to do these tasks, but not the most efficient one. There are other concurrency primitives which you can find in the std::sync module. They belong to a lower level than channels but can be more efficient in particular tasks.
The ideal solution would be a Condvar. You can use Condvar::wait_timeout from std::sync, as pointed out by @Vladimir Matveev.
This is the example from the documentation:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = pair.clone();

thread::spawn(move || {
    let &(ref lock, ref cvar) = &*pair2;
    let mut started = lock.lock().unwrap();
    *started = true;
    // We notify the condvar that the value has changed.
    cvar.notify_one();
});

// wait for the thread to start up
let &(ref lock, ref cvar) = &*pair;
let mut started = lock.lock().unwrap();
// as long as the value inside the `Mutex` is false, we wait
loop {
    let result = cvar.wait_timeout(started, Duration::from_millis(10)).unwrap();
    // 10 milliseconds have passed, or maybe the value changed!
    started = result.0;
    if *started == true {
        // We received the notification and the value has been updated, we can leave.
        break
    }
}
Having been back to this question several times myself, here's what I think addresses the OP's intent and follows the advice above of getting the thread to stop itself. Building on the accepted answer, Crossbeam is a nice upgrade to mpsc in that it allows message endpoints to be cloned and moved. It also has a convenient tick function. The real point here is that it has a non-blocking try_recv().
I'm not sure how universally useful it'd be to put a message checker in the middle of an operational loop like this. I haven't found that Actix (or previously Akka) could really stop a thread without--as stated above--getting the thread to do it itself. So this is what I'm using for now (wide open to correction here, still learning myself).
// Cargo.toml:
// [dependencies]
// crossbeam-channel = "0.4.4"
use crossbeam_channel::{Sender, Receiver, unbounded, tick};
use std::time::{Duration, Instant};

fn main() {
    let (tx, rx): (Sender<String>, Receiver<String>) = unbounded();
    let rx2 = rx.clone();
    // crossbeam allows clone and move of receiver
    std::thread::spawn(move || {
        // OP:
        // let mut timer = Timer::new().unwrap();
        // let periodic = timer.periodic(Duration::milliseconds(200));
        let ticker: Receiver<Instant> = tick(Duration::from_millis(500));
        loop {
            // OP:
            // periodic.recv();
            crossbeam_channel::select! {
                recv(ticker) -> _ => {
                    // OP: Do my work here
                    println!("Hello, work.");
                    // Comms Check: keep doing work?
                    // try_recv is non-blocking
                    // rx, the single consumer, is clone-able in crossbeam
                    match rx2.try_recv() {
                        Err(_e) => {},
                        Ok(msg) => {
                            match msg.as_str() {
                                "END_THE_WORLD" => {
                                    println!("Ending the world.");
                                    break;
                                },
                                _ => {},
                            }
                        },
                    }
                }
            }
        }
    });

    // let work continue for 10 seconds, then tell that thread to end.
    std::thread::sleep(Duration::from_secs(10));
    println!("Goodbye, world.");
    let _ = tx.send("END_THE_WORLD".to_string());
}
Using strings as a message device is a tad cringeworthy, at least to me. The suspend and restart messages could be handled there with an enum instead; a sketch of that follows.
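Here is a hedged sketch of that idea (the enum, timings, and messages are illustrative, not from the answer above), keeping crossbeam's non-blocking try_recv for the control check:

use crossbeam_channel::{unbounded, Receiver, Sender, TryRecvError};

#[derive(Debug)]
enum Control {
    Suspend,
    Resume,
    Terminate,
}

fn main() {
    let (tx, rx): (Sender<Control>, Receiver<Control>) = unbounded();

    let worker = std::thread::spawn(move || {
        let mut paused = false;
        loop {
            match rx.try_recv() {
                Ok(Control::Suspend) => paused = true,
                Ok(Control::Resume) => paused = false,
                Ok(Control::Terminate) | Err(TryRecvError::Disconnected) => break,
                Err(TryRecvError::Empty) => {}
            }
            if !paused {
                println!("working...");
            }
            std::thread::sleep(std::time::Duration::from_millis(500));
        }
    });

    std::thread::sleep(std::time::Duration::from_secs(2));
    let _ = tx.send(Control::Suspend);
    std::thread::sleep(std::time::Duration::from_secs(2));
    let _ = tx.send(Control::Resume);
    std::thread::sleep(std::time::Duration::from_secs(2));
    let _ = tx.send(Control::Terminate);
    worker.join().unwrap();
}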
