How can I reliably clean up Rust threads performing blocking IO?

It seems to be a common idiom in Rust to spawn off a thread for blocking IO so you can use non-blocking channels:
use std::sync::mpsc::channel;
use std::thread;
use std::net::TcpListener;
fn main() {
let (accept_tx, accept_rx) = channel();
let listener_thread = thread::spawn(move || {
let listener = TcpListener::bind(":::0").unwrap();
for client in listener.incoming() {
if let Err(_) = accept_tx.send(client.unwrap()) {
break;
}
}
});
}
The problem is, rejoining threads like this depends on the spawned thread "realizing" that the receiving end of the channel has been dropped (i.e., calling send(..) returns Err(_)):
drop(accept_rx);
listener_thread.join(); // blocks until listener thread reaches accept_tx.send(..)
You can make dummy connections for TcpListeners, and shutdown TcpStreams via a clone, but these seem like really hacky ways to clean up such threads, and as it stands, I don't even know of a hack to trigger a thread blocking on a read from stdin to join.
How can I clean up threads like these, or is my architecture just wrong?

Neither Windows nor Linux/Unix/POSIX offers a way to cancel a thread that is both safe and reliable, so the Rust standard library does not provide one.
Here is an internals discussion about it.
There are a lot of unknowns that come from cancelling threads forcibly. It can get really messy. Beyond that, the combination of threads and blocking I/O will always face this issue: you need every blocking I/O call to have timeouts for it to even have a chance of being interruptible reliably. If one can't write async code, one needs to either use processes (which have a defined boundary and can be ended by the OS forcibly, but obviously come with heavier weight and data sharing challenges) or non-blocking I/O which will land your thread back in an event loop that is interruptible.
mio is available for async code. Tokio is a higher-level crate based on mio that makes writing non-blocking async code even more straightforward.
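For the TcpListener case from the question, the "land back in an interruptible loop" idea can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: the helper name spawn_acceptor, the shared stop flag, and the 10 ms polling interval are all assumptions made for illustration, and polling trades a little latency and CPU for the ability to join the thread promptly.

```rust
use std::io::ErrorKind;
use std::net::{TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper: accept connections on a background thread until `stop` is set.
fn spawn_acceptor(
    listener: TcpListener,
    stop: Arc<AtomicBool>,
) -> std::io::Result<(JoinHandle<()>, Receiver<TcpStream>)> {
    // In non-blocking mode, accept() returns WouldBlock instead of parking the thread.
    listener.set_nonblocking(true)?;
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            match listener.accept() {
                Ok((stream, _addr)) => {
                    // Accepted sockets may inherit non-blocking mode on some platforms.
                    let _ = stream.set_nonblocking(false);
                    if tx.send(stream).is_err() {
                        break; // the receiving end was dropped
                    }
                }
                Err(e) if e.kind() == ErrorKind::WouldBlock => {
                    // Nothing pending: sleep briefly, then re-check the stop flag.
                    thread::sleep(Duration::from_millis(10));
                }
                Err(_) => break,
            }
        }
    });
    Ok((handle, rx))
}

fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let stop = Arc::new(AtomicBool::new(false));
    let (handle, _accept_rx) = spawn_acceptor(listener, Arc::clone(&stop))?;

    // ... do other work, then request shutdown ...
    stop.store(true, Ordering::Relaxed);
    handle.join().expect("acceptor thread panicked");
    Ok(())
}
```

With this shape, join() returns within roughly one polling interval of the flag being set, without needing a dummy connection.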

Related

What is the name, if it exists, for a concurrency structure that allows parallel processing but writes in the order received?

I have a Rust-based latency-sensitive application that subscribes to a stream of incoming data, deserializes it, processes the deserialized object, and then forwards it elsewhere.
Sometimes, I receive bursts of messages and this causes the latency to degrade a bit as it is "backed up." It would be great if I could parallelize the deserialization.
However, I need to preserve the order of the messages when I forward them along. Forwarding is extremely fast, almost negligible, so the fact that forwarding is serial is okay.
Naively, I could send a tuple of (sequence_number, data) over a channel to a pool of processor threads, and each thread could, upon processing, send a tuple of (sequence_number, processed) over a different channel to a single thread that simply forwards. The forwarding thread would also keep track of the next sequence_number to send. When it receives something over the channel, it saves to a HashMap<u64, MyData>. Then while the map contains the next sequence_number, it could forward.
But it gives me pause that I couldn't find such a library on GitHub; makes me think this could be a bad idea.
So I am wondering, is there a name for this sort of thing? Does it exist in Rust or some other language? Is there a better pattern I can follow?
Not sure of a common term but you could use FuturesOrdered from the futures crate.
Here is an example (playground):
use rand::{thread_rng, Rng};
use futures::stream::FuturesOrdered;
use futures::StreamExt as _;
use std::time::Duration;
#[tokio::main]
async fn main() {
let mut ord_futures = FuturesOrdered::new();
for i in 0..100 {
// receive
ord_futures.push(async move {
tokio::time::sleep(Duration::from_secs(thread_rng().gen_range(1..5))).await;
println!("processed {i}");
i
});
}
while let Some(i) = ord_futures.next().await {
// forward
println!("received {i}");
}
}
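For comparison, the sequence-number scheme described in the question can be sketched with std threads and channels only. This is an illustrative sketch under assumptions: process is a hypothetical stand-in for the deserialization step, work is handed out round-robin over one channel per worker (std's Receiver cannot be cloned), and the forwarding step simply collects into a Vec.

```rust
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::thread;

/// Hypothetical stand-in for the expensive deserialize/process step.
fn process(input: u64) -> u64 {
    input * 2
}

/// Process `inputs` on `workers` threads and return the results in input order.
fn process_in_order(inputs: Vec<u64>, workers: usize) -> Vec<u64> {
    assert!(workers > 0, "need at least one worker");
    let (done_tx, done_rx) = channel::<(u64, u64)>();
    let mut work_txs = Vec::new();
    let mut handles = Vec::new();
    for _ in 0..workers {
        let (work_tx, work_rx) = channel::<(u64, u64)>();
        let done_tx = done_tx.clone();
        handles.push(thread::spawn(move || {
            // Each worker tags its result with the sequence number it received.
            for (seq, data) in work_rx {
                done_tx.send((seq, process(data))).unwrap();
            }
        }));
        work_txs.push(work_tx);
    }
    drop(done_tx); // only the workers hold senders now

    let total = inputs.len();
    for (seq, data) in inputs.into_iter().enumerate() {
        work_txs[seq % workers].send((seq as u64, data)).unwrap();
    }
    drop(work_txs); // closing the work channels lets the workers exit

    // Forwarding side: buffer out-of-order results, emit while the next one is present.
    let mut pending = HashMap::new();
    let mut next_seq = 0u64;
    let mut forwarded = Vec::with_capacity(total);
    for (seq, result) in done_rx {
        pending.insert(seq, result);
        while let Some(ready) = pending.remove(&next_seq) {
            forwarded.push(ready);
            next_seq += 1;
        }
    }
    for handle in handles {
        handle.join().unwrap();
    }
    forwarded
}

fn main() {
    let out = process_in_order((0..10).collect(), 4);
    println!("{:?}", out);
}
```

The HashMap only ever holds results that arrived ahead of the next expected sequence number, so its size is bounded by how far the fastest worker can run ahead of the slowest.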

Is there a recommended rust multi-threaded tcp communication program model?

While learning some Rust, I saw a lot of tutorials that used two very simple models. One is on the server side, where each accepted TcpStream is moved to a new thread; the other is on the client side, using blocking reads and then printing the output.
But for a real project, this is definitely not enough. For example, on the client side it is usually not acceptable to block the main thread while reading data, so one has to use non-blocking sockets, multiple threads, or asynchronous IO.
Since I am new to Rust, I don't plan to use async io or tokio libraries for this.
Suppose I use a thread to block reading data, and send data or close the tcp connection in the main thread.
As a general practice, since the TCP connection is used in two threads, we generally have to wrap it in Arc<Mutex<TcpStream>> to share it.
But when I need to read in the read thread, I will do Mutex::lock() to get the TcpStream, and when I send or close in the main thread, I also need to do Mutex::lock(). Won't this cause a deadlock?
Of course, another way is to poll a message queue in a new thread, and send commands like this one when the socket has a read event, or when the main thread needs to send data or close the connection. This way the access to the TcpStream is done in one thread. However, it seems to add a lot of extra code for maintaining the message queue.
If a TcpStream could be split into two ends, just like a channel, with a read end and a write end, I could conveniently use them in different threads. But there seems to be no such function.
Is there a recommended approach?
Not sure about a "recommended" approach, but you don't need a mutex to read/write from a TcpStream because io traits are implemented for &TcpStream in addition to TcpStream. This allows you to call methods like read() on &stream, which can be easily shared among threads using Arc. For example:
use std::io::{BufRead, BufReader, BufWriter, Write};
use std::net::TcpStream;
use std::sync::Arc;
fn main() -> std::io::Result<()> {
let stream = Arc::new(TcpStream::connect("127.0.0.1:34254")?);
let (r, w) = (Arc::clone(&stream), stream);
let thr1 = std::thread::spawn(move || -> std::io::Result<()> {
let r = BufReader::new(r.as_ref());
for line in r.lines() {
println!("received: {}", line?);
}
Ok(())
});
let thr2 = std::thread::spawn(move || {
let mut w = BufWriter::new(w.as_ref());
w.write_all(b"Hello\n")
});
thr1.join().unwrap()?;
thr2.join().unwrap()?;
Ok(())
}
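An alternative that comes close to the "two ends" the question asks for is TcpStream::try_clone, which returns a second handle to the same socket. The sketch below is only an illustration: echo_roundtrip and the local echo server exist purely to make the example self-contained, and it assumes msg contains no newline.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Send `msg` on one handle while a second thread reads the echoed line on a clone.
fn echo_roundtrip(msg: &str) -> std::io::Result<String> {
    // Local one-shot echo server so that the sketch is self-contained.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let server = thread::spawn(move || -> std::io::Result<()> {
        let (mut conn, _) = listener.accept()?;
        let mut line = String::new();
        BufReader::new(conn.try_clone()?).read_line(&mut line)?;
        conn.write_all(line.as_bytes())
    });

    let mut writer = TcpStream::connect(addr)?;
    // Second handle to the same underlying socket; it can move to another thread.
    let reader = writer.try_clone()?;
    let read_thread = thread::spawn(move || -> std::io::Result<String> {
        let mut line = String::new();
        BufReader::new(reader).read_line(&mut line)?;
        Ok(line)
    });

    writer.write_all(msg.as_bytes())?;
    writer.write_all(b"\n")?;
    let line = read_thread.join().expect("reader thread panicked")?;
    server.join().expect("server thread panicked")?;
    Ok(line.trim_end().to_string())
}

fn main() -> std::io::Result<()> {
    println!("echoed: {}", echo_roundtrip("Hello")?);
    Ok(())
}
```

Because both handles refer to one socket, calling shutdown on either of them affects the other, which is also one way for the main thread to unblock the reading thread.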

How can I launch a daemon in a websocket handler with actix-web?

Given a basic setup of a WebSocket server with Actix, how can I launch a daemon inside my message handler?
I've extended the example starter code linked above to call daemon(false, true) using the fork crate.
use actix::{Actor, StreamHandler};
use actix_web::{web, App, Error, HttpRequest, HttpResponse, HttpServer};
use actix_web_actors::ws;
use fork::{daemon, Fork};
/// Define HTTP actor
struct MyWs;
impl Actor for MyWs {
type Context = ws::WebsocketContext<Self>;
}
/// Handler for ws::Message message
impl StreamHandler<Result<ws::Message, ws::ProtocolError>> for MyWs {
fn handle(
&mut self,
msg: Result<ws::Message, ws::ProtocolError>,
ctx: &mut Self::Context,
) {
match msg {
Ok(ws::Message::Ping(msg)) => ctx.pong(&msg),
Ok(ws::Message::Text(text)) => {
println!("text message received");
if let Ok(Fork::Child) = daemon(false, true) {
println!("from daemon: this print but then the websocket crashes!");
};
ctx.text(text)
},
Ok(ws::Message::Binary(bin)) => ctx.binary(bin),
_ => (),
}
}
}
async fn index(req: HttpRequest, stream: web::Payload) -> Result<HttpResponse, Error> {
let resp = ws::start(MyWs {}, &req, stream);
println!("{:?}", resp);
resp
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().route("/ws/", web::get().to(index)))
.bind("127.0.0.1:8080")?
.run()
.await
}
The above code starts the server but when I send it a message, I receive a Panic in Arbiter thread.
text message received
from daemon: this print but then the websocket crashes!
thread 'actix-rt:worker:0' panicked at 'failed to park', /Users/xxx/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.25/src/runtime/basic_scheduler.rs:158:56
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Panic in Arbiter thread.
The issue with your application is that the actix-web runtime (i.e. Tokio) is multi-threaded. This is a problem because the fork() call (used internally by daemon()) only replicates the thread that called fork().
Even if your parent process has N threads, your child process will have only 1. If your parent process has any mutexes locked by those threads, their state will be replicated in the child process, but as those threads do not exist there, they will remain locked forever.
If you have an Rc/Arc it will never de-allocate its memory, because it will never be dropped, thus its internal count will never reach zero. The same applies to any pointers and shared state.
Or said more simply - your forked child will end up in an undefined state.
This is best explained in Calling fork() in a Multithreaded Environment:
The fork( ) system call creates an exact duplicate of the address
space from which it is called, resulting in two address spaces
executing the same code. Problems can occur if the forking address
space has multiple threads executing at the time of the fork( ). When
multithreading is a result of library invocation, threads are not
necessarily aware of each other's presence, purpose, actions, and so
on. Suppose that one of the other threads (any thread other than the
one doing the fork( )) has the job of deducting money from your
checking account. Clearly, you do not want this to happen twice as a
result of some other thread's decision to call fork( ).
Because of these types of problems, which in general are problems of
threads modifying persistent state, POSIX defined the behavior of
fork( ) in the presence of threads to propagate only the forking
thread. This solves the problem of improper changes being made to
persistent state. However, it causes other problems, as discussed in
the next paragraph.
In the POSIX model, only the forking thread is propagated. All the
other threads are eliminated without any form of notice; no cancels
are sent and no handlers are run. However, all the other portions of
the address space are cloned, including all the mutex state. If the
other thread has a mutex locked, the mutex will be locked in the child
process, but the lock owner will not exist to unlock it. Therefore,
the resource protected by the lock will be permanently unavailable.
Here you can find a more reputable source with more details
To answer your other question:
"how can I launch a daemon inside my message handler?"
I assume you want to implement the classical unix "fork() on accept()" model.
In that case you are out of luck, because servers such as actix-web, and async/await
in general are not designed with that in mind. Even if you have a
single-threaded async/await server, then:
When a child is forked it inherits all file descriptors from the parent. So after
a fork, the child commonly closes its listening socket in order to avoid a
resource leak - but there is no way to do that on any of the async/await based servers,
not because it's impossible to do, but because it's not implemented.
And even more important reason to do that is to prevent the child process
from accepting new connections - because even if you run a single threaded
server, it's still capable of processing many tasks concurrently - i.e.
when your handler calls .await on something, the acceptor would be free to
accept a new connection (by stealing it from the socket's queue) and start processing it.
Your parent server may have already spawned a lot of tasks and those would be
replicated in each forked child, thus executing the very same thing multiple times,
independently in each process
And well... there is no way to prevent any of that on any of the async/await
based servers I'm familiar with. You would need a custom server that:
Checks in its acceptor task if it's a child and if it detects that it's the child
it should close the listening socket and drop the acceptor.
It should not execute any other task that was forked from the parent,
but there is no way to achieve that.
In other words - async/await and "fork() on accept()" are two different and
incompatible models for processing tasks concurrently.
A possible solution would be to have a non-async acceptor daemon that only
accepts connections and forks itself, then spawns a web server in the child
and feeds it the accepted socket. But although possible, none of the servers
currently have support for that.
As described in the other answer, the async runtime you're relying on may completely break if you touch it in the child process. Touching anything can completely break assumptions the actix or tokio devs made. Wacky stuff will happen if you so much as return from the function.
See this response by one of the key authors of tokio to someone doing something similar (calling fork() in the context of a threadpool with hyper):
Threads + fork is bad news... you can fork if you immediately exec and do not allocate memory or perform any other operation that may have been corrupted by the fork.
Going back to your question:
The objective is for my websocket to respond to messages and be able to launch isolated long-running processes that launch successfully and do not exit when the websocket exits.
I don't think you want to manually fork() at all. Utility functions provided by actix/tokio should integrate well with their runtimes. You may:
Run blocking or CPU-heavy code in a dedicated thread with actix_web::block
Spawn a future with actix::AsyncContext::spawn. You would ideally want to use e.g. tokio::process::Command rather than the std version to avoid blocking in an async context.
If all you're doing in the child process is running Command::new() and later Command::spawn(), I'm pretty sure you can just call it directly. There's no need to fork; it does that internally.
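A minimal sketch of that last option, using only std::process::Command: the helper name launch_detached and the sleep program used in main are assumptions for illustration, and this does not perform full daemonization (no new session, no working-directory change).

```rust
use std::io;
use std::process::{Command, Stdio};

/// Hypothetical helper: start `program` as a separate OS process and return its PID.
fn launch_detached(program: &str, args: &[&str]) -> io::Result<u32> {
    let child = Command::new(program)
        .args(args)
        // Detach the standard streams so the child does not hold on to the server's.
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    // `Child` has no Drop implementation that kills or waits for the process,
    // so the process keeps running after this handle goes out of scope.
    Ok(child.id())
}

fn main() -> io::Result<()> {
    // Assumes a Unix-like system where a `sleep` program is on the PATH.
    let pid = launch_detached("sleep", &["1"])?;
    println!("started process {pid}");
    Ok(())
}
```

On Unix, a child that exits without being waited on stays around as a zombie until the parent waits for it or exits, so a long-lived server may still want to keep the Child handle and call try_wait() occasionally.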

How to check if a thread has finished in Rust?

When I spawn a thread in Rust, I get a JoinHandle, which is good for... joining (a blocking operation), and not much else. How can I check if a child thread has exited (i.e., JoinHandle.join() would not block) from the parent thread? Bonus points if you know how to kill a child thread.
I imagine you could do this by creating a channel, sending something to the child, and catching errors, but that seems like needless complexity and overhead.
As of Rust 1.7, there's no API in the standard library to check if a child thread has exited without blocking.
A portable workaround would be to use channels to send a message from the child to the parent to signal that the child is about to exit. Receiver has a non-blocking try_recv method. When try_recv does receive a message, you can then use join() on the JoinHandle to retrieve the thread's result.
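That workaround might look roughly like the following sketch. The function name run_and_poll, the 6 * 7 stand-in computation, and the 5 ms polling interval are illustrative assumptions.

```rust
use std::sync::mpsc::{channel, TryRecvError};
use std::thread;
use std::time::Duration;

/// Spawn a worker, poll for its completion without blocking, then join it.
fn run_and_poll() -> u32 {
    let (done_tx, done_rx) = channel::<()>();
    let handle = thread::spawn(move || {
        let result = 6 * 7; // stand-in for real work
        // Signal completion as the last step; ignore the error if nobody listens.
        let _ = done_tx.send(());
        result
    });

    loop {
        match done_rx.try_recv() {
            // Either the worker signalled, or it dropped the sender (e.g. by panicking).
            Ok(()) | Err(TryRecvError::Disconnected) => break,
            // Not finished yet: the parent is free to do other work here.
            Err(TryRecvError::Empty) => thread::sleep(Duration::from_millis(5)),
        }
    }
    // The worker is at (or very near) its end, so join() returns almost immediately.
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("worker returned {}", run_and_poll());
}
```

Treating Disconnected as "finished" matters: if the worker panics before sending, the sender is dropped during unwinding and the parent still notices.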
There are also unstable platform-specific extension traits that let you obtain the raw thread handle. You'd then have to write platform-specific code to test whether the thread has exited or not.
If you think this feature should be in Rust's standard library, you can submit an RFC (be sure to read the README first!).
Bonus points if you know how to kill a child thread.
Threads in Rust are implemented using native OS threads. Even though the operating system might provide a way to kill a thread, it's a bad idea to do so, because the resources that the thread allocated will not be cleaned up until the process ends.
The short answer is that it is not possible yet. But this is not the point that should really be addressed.
Bonus points if you know how to kill a child thread.
NEVER
Even in languages that do support killing threads (see Java here), it is recommended not to.
A thread's execution is generally coded with explicit points of interactions, and there are often implicit assumptions that no other interruption will occur.
The most egregious example is of course resources: the naive "kill" method would be to stop executing the thread; this would mean not releasing any resource. You may think about memory, it's the least of your worries. Imagine, instead, all the Mutex that are not unlocked and will create deadlocks later...
The other option would be to inject a panic in the thread, which would cause unwinding. However, you cannot just start unwinding at any point! The program would have to define safe points at which injecting a panic would be guaranteed to be safe (injecting it at any other point means potentially corrupting shared objects); how to define such safe points and inject the panic there is an open research problem in native languages, especially those executed on systems that enforce W^X (where memory pages are either Writable or Executable but never both).
In summary, there is no known way to safely (both memory-wise and functionality-wise) kill a thread.
It's possible, friends. Use refcounters which Rust will drop on end or panic. 100% safe. Example:
use std::time::Duration;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
fn main() {
// Play with this flag
let fatal_flag = true;
let do_stop = true;
let working = Arc::new(AtomicBool::new(true));
let control = Arc::downgrade(&working);
thread::spawn(move || {
while (*working).load(Ordering::Relaxed) {
if fatal_flag {
panic!("Oh, my God!");
} else {
thread::sleep(Duration::from_millis(20));
println!("I'm alive!");
}
}
});
thread::sleep(Duration::from_millis(50));
// To stop thread
if do_stop {
match control.upgrade() {
Some(working) => (*working).store(false, Ordering::Relaxed),
None => println!("Sorry, but thread died already."),
}
}
thread::sleep(Duration::from_millis(50));
// To check it's alive / died
match control.upgrade() {
Some(_) => println!("Thread alive!"),
None => println!("Thread ends!"),
}
}
Gist: https://gist.github.com/DenisKolodin/edea80f2f5becb86f718c330219178e2
At playground: https://play.rust-lang.org/?gist=9a0cf161ba0bbffe3824b9db4308e1fb&version=stable&backtrace=0
UPD: I've created thread-control crate which implements this approach: https://github.com/DenisKolodin/thread-control
I think Arc can be used to solve this problem.
If the thread exits, the reference counter is reduced by one.
As of Rust 1.61.0, there is an is_finished method.
https://doc.rust-lang.org/stable/std/thread/struct.JoinHandle.html#method.is_finished
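A short sketch of how is_finished can be used, assuming Rust 1.61 or newer; the function name, the sleep durations, and the "done" return value are made up for illustration.

```rust
use std::thread;
use std::time::Duration;

/// Poll a JoinHandle without blocking, then collect the thread's result.
fn wait_with_is_finished() -> &'static str {
    let handle = thread::spawn(|| {
        thread::sleep(Duration::from_millis(20)); // stand-in for real work
        "done"
    });

    // is_finished() never blocks; the parent can do other work between checks.
    while !handle.is_finished() {
        thread::sleep(Duration::from_millis(5));
    }

    // The thread has finished running, so join() returns promptly.
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{}", wait_with_is_finished());
}
```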

How to avoid deadlocks?

When using multiple threads, shared memory needs to be locked by critical sections. However, using critical sections causes potential deadlocks. How can they be avoided?
One way is to use a hierarchy of critical sections. If you ensure that a parent critical section is never entered within one of its children, deadlocks cannot happen. The difficulty is to enforce this hierarchy.
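One simple way to enforce such a hierarchy is to give every lock a fixed rank and always acquire locks in rank order. The sketch below assumes a hypothetical Account type whose id doubles as its rank; it illustrates the idea and is not a general-purpose library.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Account {
    id: u32, // fixed rank in the lock hierarchy (assumed unique per account)
    balance: Mutex<i64>,
}

fn transfer(from: &Account, to: &Account, amount: i64) {
    assert_ne!(from.id, to.id, "cannot transfer to the same account");
    // Always lock the lower-ranked account first, whatever the transfer direction.
    let (first, second) = if from.id < to.id { (from, to) } else { (to, from) };
    let mut first_guard = first.balance.lock().unwrap();
    let mut second_guard = second.balance.lock().unwrap();
    if from.id < to.id {
        *first_guard -= amount;
        *second_guard += amount;
    } else {
        *second_guard -= amount;
        *first_guard += amount;
    }
}

/// Run opposing transfers concurrently; with a fixed lock order this cannot deadlock.
fn run_transfers() -> (i64, i64) {
    let a = Arc::new(Account { id: 1, balance: Mutex::new(100) });
    let b = Arc::new(Account { id: 2, balance: Mutex::new(100) });

    let (a1, b1) = (Arc::clone(&a), Arc::clone(&b));
    let t1 = thread::spawn(move || {
        for _ in 0..1000 {
            transfer(&a1, &b1, 1);
        }
    });
    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let t2 = thread::spawn(move || {
        for _ in 0..1000 {
            transfer(&b2, &a2, 1);
        }
    });
    t1.join().unwrap();
    t2.join().unwrap();

    let balance_a = *a.balance.lock().unwrap();
    let balance_b = *b.balance.lock().unwrap();
    (balance_a, balance_b)
}

fn main() {
    println!("{:?}", run_transfers());
}
```

Without the ordering step, one thread could hold a while waiting for b and the other hold b while waiting for a; with it, both threads contend for the same first lock.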
The Related list to the right on this page contains a few links that provide interesting information on the topic.
In addition to that list, there are many other SO questions discussing the topic, such as
Threading Best Practices
Why is lock(this) {…} bad?
What are common reasons for deadlocks?
...and many more
You can avoid critical sections by using message passing instead (synchronous and asynchronous calls). When using synchronous calls, you still have to make sure not to make a circular call, in which thread A asks thread B a question, and B needs to ask A a question to be able to respond.
Another option is to make asynchronous calls instead. However, it is more difficult to get return values.
Note: Indeed, a message passing system is implemented using a critical section that locks the call queue, but it is abstracted away.
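As a rough Rust illustration of this message-passing style, the state below is owned by a single thread and other threads only send it requests. The Request enum, spawn_counter, and demo are hypothetical names; Add models an asynchronous call and Get a synchronous one that carries its own reply channel.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Requests understood by the owner thread (illustrative only).
enum Request {
    Add(i64),         // asynchronous: fire and forget
    Get(Sender<i64>), // synchronous: the caller waits on the reply channel
}

/// Start a thread that exclusively owns a running total; no mutex in user code.
fn spawn_counter() -> (Sender<Request>, JoinHandle<()>) {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut total = 0i64; // only this thread ever touches the state
        for request in rx {
            match request {
                Request::Add(n) => total += n,
                Request::Get(reply) => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    (tx, handle)
}

fn demo() -> i64 {
    let (tx, handle) = spawn_counter();
    tx.send(Request::Add(2)).unwrap();
    tx.send(Request::Add(40)).unwrap();

    // Synchronous call: send a request with a reply channel, then block on the answer.
    let (reply_tx, reply_rx) = channel();
    tx.send(Request::Get(reply_tx)).unwrap();
    let total = reply_rx.recv().unwrap();

    drop(tx); // closing the channel ends the owner thread's loop
    handle.join().unwrap();
    total
}

fn main() {
    println!("total = {}", demo());
}
```

The circular-call hazard mentioned above would appear here if the owner thread, while handling Get, sent its own synchronous request back to the caller and waited for the reply.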
Among the various mechanisms for guarding critical sections, semaphores and mutexes are the most popular.
A semaphore is a signalling (waiting) mechanism and a mutex is a locking mechanism. The distinction confuses many people, but in short, only the thread that locked a mutex can unlock it. With this in mind:
Don't allow any process to lock only part of the resources it needs: if a process needs 5 resources, make it wait until all of them are available.
If you use a semaphore here, you can release a resource occupied by another thread. In other words, allowing preemption is the second precaution.
In my view these two are the basic conditions; the remaining two of the four common precautions can be related to them.
If you disagree, please add comments. I will add a cleaner and clearer explanation later.
When I work in C++, the following works for me:
all public methods (excluding ctor and dtor) of a threadsafe class lock
private methods cannot call public methods
It's not a general deadlock avoidance method.
You must code multi-thread programs very carefully. There's no short-cut, you must understand the flow of your program, otherwise you'll be doomed.
The following algorithm is used to avoid deadlock:
Banker’s Algorithm
–Impose less stringent conditions than in deadlock prevention in an attempt to get better resource utilization
–Safe state
•Operating system can guarantee that all current processes can complete their work within a finite time
–Unsafe state
•Does not imply that the system is deadlocked, but that the OS cannot guarantee that all current processes can complete their work within a finite time
–Requires that resources be allocated to processes only when the allocations result in safe states.
–It has a number of weaknesses (such as requiring a fixed number of processes and resources) that prevent it from being implemented in real systems
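The safe-state test at the heart of the algorithm can be sketched as follows. This is an illustrative sketch only: safe_sequence is a made-up name, the matrices in main are a textbook-style example rather than data from this thread, and a full Banker's implementation would also tentatively apply each request before running this check.

```rust
/// Return an order in which all processes can finish, or None if the state is unsafe.
/// `allocation[i]` is what process i holds; `need[i]` is what it may still request.
fn safe_sequence(
    available: &[u32],
    allocation: &[Vec<u32>],
    need: &[Vec<u32>],
) -> Option<Vec<usize>> {
    let n = allocation.len();
    let mut work = available.to_vec();
    let mut finished = vec![false; n];
    let mut order = Vec::with_capacity(n);

    loop {
        let mut progressed = false;
        for i in 0..n {
            // Process i can finish if its remaining need fits in what is free now.
            let fits = need[i].iter().zip(&work).all(|(req, free)| req <= free);
            if !finished[i] && fits {
                // Pretend it runs to completion and returns everything it holds.
                for (free, held) in work.iter_mut().zip(&allocation[i]) {
                    *free += *held;
                }
                finished[i] = true;
                order.push(i);
                progressed = true;
            }
        }
        if !progressed {
            break;
        }
    }
    if order.len() == n { Some(order) } else { None }
}

fn main() {
    let available = [3, 3, 2];
    let allocation = vec![
        vec![0, 1, 0], vec![2, 0, 0], vec![3, 0, 2], vec![2, 1, 1], vec![0, 0, 2],
    ];
    let need = vec![
        vec![7, 4, 3], vec![1, 2, 2], vec![6, 0, 0], vec![0, 1, 1], vec![4, 3, 1],
    ];
    // Prints Some([1, 3, 4, 0, 2]) for this example.
    println!("{:?}", safe_sequence(&available, &allocation, &need));
}
```

A resource request is granted only if the state that would result from granting it still yields Some(..) from this check.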
One way is to use a non-blocking locking function. As an example, in Rust you could use std::sync::Mutex::try_lock instead of std::sync::Mutex::lock.
So if you have this example code:
use std::sync::Mutex;
fn transfer(tx: &Mutex<i32>, rx: &Mutex<i32>, amount: i32) {
let mut tx = tx.lock().unwrap();
let mut rx = rx.lock().unwrap();
*tx -= amount;
*rx += amount;
}
You could instead do something like this:
fn transfer(tx: &Mutex<i32>, rx: &Mutex<i32>, amount: i32) -> () {
loop {
// Attempt to lock both mutexes
let mut tx = tx.try_lock();
let mut rx = rx.try_lock();
// If both locks were successful,
// i.e. if neither is currently
// locked by another thread
if let Ok(ref mut tx) = tx {
if let Ok(ref mut rx) = rx {
// Perform the operations needed on
// the values inside the mutexes
**tx -= amount;
**rx += amount;
// Exit the loop
break;
}
}
// If at least one of the locks was
// not successful, restart the loop
// and try locking the values again.
// You may also want to sleep the thread
// here for a short period if you think that
// the mutexes might be locked for a while.
}
}
