How to check if a thread has finished in Rust? - multithreading

When I spawn a thread in Rust, I get a JoinHandle, which is good for... joining (a blocking operation), and not much else. How can I check if a child thread has exited (i.e., JoinHandle.join() would not block) from the parent thread? Bonus points if you know how to kill a child thread.
I imagine you could do this by creating a channel, sending something to the child, and catching errors, but that seems like needless complexity and overhead.

As of Rust 1.7, there's no API in the standard library to check if a child thread has exited without blocking.
A portable workaround would be to use channels to send a message from the child to the parent to signal that the child is about to exit. Receiver has a non-blocking try_recv method. When try_recv does receive a message, you can then use join() on the JoinHandle to retrieve the thread's result.
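The channel workaround described above could look roughly like the following. This is only a sketch: the function name run_and_poll, the summing workload, and the 5 ms polling interval are made up for illustration. Note that join() may still block for a very short moment, because the message is sent just before the thread actually exits.

```rust
use std::sync::mpsc::{channel, TryRecvError};
use std::thread;
use std::time::Duration;

/// Spawns a worker, polls for its "done" message without blocking,
/// then joins the thread to retrieve its result.
fn run_and_poll() -> u64 {
    let (done_tx, done_rx) = channel::<()>();

    let handle = thread::spawn(move || {
        // Placeholder workload (an assumption for this sketch).
        let result: u64 = (1..=10).sum();
        // Signal the parent just before exiting. The send only fails if
        // the receiver was dropped, which is ignored here.
        let _ = done_tx.send(());
        result
    });

    loop {
        match done_rx.try_recv() {
            // The child announced that it is about to exit.
            Ok(()) => break,
            // The sender was dropped without a message, e.g. the child panicked.
            Err(TryRecvError::Disconnected) => break,
            // Not finished yet; the parent is free to do other work here.
            Err(TryRecvError::Empty) => thread::sleep(Duration::from_millis(5)),
        }
    }

    handle.join().expect("worker panicked")
}

fn main() {
    println!("worker result: {}", run_and_poll());
}
```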
There are also unstable platform-specific extension traits that let you obtain the raw thread handle. You'd then have to write platform-specific code to test whether the thread has exited or not.
If you think this feature should be in Rust's standard library, you can submit an RFC (be sure to read the README first!).
Bonus points if you know how to kill a child thread.
Threads in Rust are implemented using native OS threads. Even though the operating system might provide a way to kill a thread, it's a bad idea to do so, because the resources that the thread allocated will not be cleaned up until the process ends.

The short answer is that it is not possible yet. But this is not the point that should really be addressed.
Bonus points if you know how to kill a child thread.
NEVER
Even in languages that do support killing threads (see Java here), it is recommended not to.
A thread's execution is generally coded with explicit points of interactions, and there are often implicit assumptions that no other interruption will occur.
The most egregious example is of course resources: the naive "kill" method would be to stop executing the thread; this would mean not releasing any resource. You may think about memory, but it's the least of your worries. Imagine, instead, all the Mutexes that are never unlocked and will create deadlocks later...
The other option would be to inject a panic in the thread, which would cause unwinding. However, you cannot just start unwinding at any point! The program would have to define safe points at which injecting a panic would be guaranteed to be safe (injecting it at any other point means potentially corrupting shared objects); how to define such safe points and inject the panic there is an open research problem in native languages, especially those executed on systems that enforce W^X (where memory pages are either Writable or Executable but never both).
In summary, there is no known way to safely (both memory-wise and functionality-wise) kill a thread.

It's possible, friends. Use reference counters, which Rust drops when the thread ends or panics. 100% safe. Example:
use std::time::Duration;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
fn main() {
    // Play with this flag
    let fatal_flag = true;
    let do_stop = true;
    let working = Arc::new(AtomicBool::new(true));
    let control = Arc::downgrade(&working);
    thread::spawn(move || {
        while (*working).load(Ordering::Relaxed) {
            if fatal_flag {
                panic!("Oh, my God!");
            } else {
                thread::sleep(Duration::from_millis(20));
                println!("I'm alive!");
            }
        }
    });
    thread::sleep(Duration::from_millis(50));
    // To stop thread
    if do_stop {
        match control.upgrade() {
            Some(working) => (*working).store(false, Ordering::Relaxed),
            None => println!("Sorry, but thread died already."),
        }
    }
    thread::sleep(Duration::from_millis(50));
    // To check it's alive / died
    match control.upgrade() {
        Some(_) => println!("Thread alive!"),
        None => println!("Thread ends!"),
    }
}
Gist: https://gist.github.com/DenisKolodin/edea80f2f5becb86f718c330219178e2
At playground: https://play.rust-lang.org/?gist=9a0cf161ba0bbffe3824b9db4308e1fb&version=stable&backtrace=0
UPD: I've created thread-control crate which implements this approach: https://github.com/DenisKolodin/thread-control

I think Arc can be used to solve this problem: if the thread holds a clone of the Arc, the reference counter is reduced by one when the thread exits.

As of Rust 1.61.0, JoinHandle has an is_finished method.
https://doc.rust-lang.org/stable/std/thread/struct.JoinHandle.html#method.is_finished
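A minimal sketch of polling with is_finished, assuming Rust 1.61 or later. The function name poll_until_finished, the sleep durations, and the return value 42 are arbitrary choices for illustration.

```rust
use std::thread;
use std::time::Duration;

/// Polls a worker with `JoinHandle::is_finished` and joins once it is done.
fn poll_until_finished() -> i32 {
    let handle = thread::spawn(|| {
        // Placeholder for real work.
        thread::sleep(Duration::from_millis(20));
        42
    });

    // `is_finished` never blocks; the parent can do other work between checks.
    while !handle.is_finished() {
        thread::sleep(Duration::from_millis(5));
    }

    // The thread has finished running, so this join returns promptly.
    handle.join().expect("worker panicked")
}

fn main() {
    println!("worker returned {}", poll_until_finished());
}
```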

Related

Are there any downsides to choosing not to join threads in Rust?

I have a program that uses multiple threads to brute force the decryption of some encrypted string. The main thread has a channel, and the sender is cloned and sent to each thread. When a thread finds an answer, it sends it to the receiver which is in the main thread.
In this program I am not joining the threads; instead I use the blocking call receiver.recv() to suspend the main thread until a single other thread finishes.
My hope is, once this call finishes, the main thread will return and all the other worker threads will be terminated.
Is this a poor design choice? Are there drawbacks of not having some condition in the other threads which would cause them to return when the solution has been discovered? Is it okay/safe to rely on the compiler to clean up my threads before they've technically finished?
Assuming there's no cleanup to be done, what you've done is mostly harmless. I'm assuming your worker thread looks something like this right now.
fn my_thread() {
    // ... lots of hard work ...
    channel.send(my_result);
}
and if that's the case, then "I received the result" and "the other thread is terminated" are very similar events, and the difference of "this function returned" is probably irrelevant. But suppose someone comes along and changes the code to look like this.
fn my_thread() {
    // ... lots of hard work ...
    channel.send(my_result);
    do_cleanup_stuff();
}
Now do_cleanup_stuff() might not get a chance to run, if your main thread terminates before my_thread does. If that cleanup function is important, that could cause problems. And it could be more subtle than that. If any local variable in my_thread holds a file handle or an open TCP stream or any other object with a nontrivial Drop implementation, that value may not get a chance to Drop properly if you don't join the thread.
So it's probably best practice to join everything, even if it's just a final step at the end of your main.
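One way to join everything without waiting for every worker to exhaust its search space is a shared stop flag. The sketch below is an assumption about how the brute-force program might be structured: is_solution, the candidate ranges, and the target value are hypothetical stand-ins for the real decryption check.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::channel;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the real "does this key decrypt the string" check.
fn is_solution(candidate: u64) -> bool {
    candidate == 123_456
}

/// Splits the search space across `workers` threads, waits for the first
/// answer, then asks the remaining workers to stop and joins all of them.
fn search(workers: u64, per_worker: u64) -> Option<u64> {
    let stop = Arc::new(AtomicBool::new(false));
    let (tx, rx) = channel();
    let mut handles = Vec::new();

    for w in 0..workers {
        let tx = tx.clone();
        let stop = Arc::clone(&stop);
        handles.push(thread::spawn(move || {
            let start = w * per_worker;
            for candidate in start..start + per_worker {
                // Cooperative exit: return normally so destructors run.
                if stop.load(Ordering::Relaxed) {
                    return;
                }
                if is_solution(candidate) {
                    let _ = tx.send(candidate);
                    return;
                }
            }
        }));
    }
    // Drop the original sender so recv() fails if no worker finds anything.
    drop(tx);

    let answer = rx.recv().ok();
    stop.store(true, Ordering::Relaxed);
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    answer
}

fn main() {
    println!("{:?}", search(4, 100_000));
}
```

Because each worker returns normally when it sees the flag, any cleanup code or Drop implementations in the worker still get to run before main exits.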

How can I launch a daemon in a websocket handler with actix-web?

Given a basic setup of a WebSocket server with Actix, how can I launch a daemon inside my message handler?
I've extended the example starter code linked above to call daemon(false, true) using the fork crate.
use actix::{Actor, StreamHandler};
use actix_web::{web, App, Error, HttpRequest, HttpResponse, HttpServer};
use actix_web_actors::ws;
use fork::{daemon, Fork};
/// Define HTTP actor
struct MyWs;
impl Actor for MyWs {
    type Context = ws::WebsocketContext<Self>;
}
/// Handler for ws::Message message
impl StreamHandler<Result<ws::Message, ws::ProtocolError>> for MyWs {
    fn handle(
        &mut self,
        msg: Result<ws::Message, ws::ProtocolError>,
        ctx: &mut Self::Context,
    ) {
        match msg {
            Ok(ws::Message::Ping(msg)) => ctx.pong(&msg),
            Ok(ws::Message::Text(text)) => {
                println!("text message received");
                if let Ok(Fork::Child) = daemon(false, true) {
                    println!("from daemon: this print but then the websocket crashes!");
                };
                ctx.text(text)
            },
            Ok(ws::Message::Binary(bin)) => ctx.binary(bin),
            _ => (),
        }
    }
}
async fn index(req: HttpRequest, stream: web::Payload) -> Result<HttpResponse, Error> {
    let resp = ws::start(MyWs {}, &req, stream);
    println!("{:?}", resp);
    resp
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().route("/ws/", web::get().to(index)))
        .bind("127.0.0.1:8080")?
        .run()
        .await
}
The above code starts the server but when I send it a message, I receive a Panic in Arbiter thread.
text message received
from daemon: this print but then the websocket crashes!
thread 'actix-rt:worker:0' panicked at 'failed to park', /Users/xxx/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.25/src/runtime/basic_scheduler.rs:158:56
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Panic in Arbiter thread.
The issue with your application is that the actix-web runtime (i.e. Tokio) is multi-threaded. This is a problem because the fork() call (used internally by daemon()) only replicates the thread that called fork().
Even if your parent process has N threads, your child process will have only 1. If your parent process has any mutexes locked by those other threads, their state will be replicated in the child process, but as those threads do not exist there, the mutexes will remain locked forever.
If you have an Rc/Arc, it will never deallocate its memory, because it will never be dropped, so its internal count will never reach zero. The same applies to any pointers and shared state.
Or, said more simply: your forked child will end up in an undefined state.
This is best explained in Calling fork() in a Multithreaded Environment:
The fork( ) system call creates an exact duplicate of the address
space from which it is called, resulting in two address spaces
executing the same code. Problems can occur if the forking address
space has multiple threads executing at the time of the fork( ). When
multithreading is a result of library invocation, threads are not
necessarily aware of each other's presence, purpose, actions, and so
on. Suppose that one of the other threads (any thread other than the
one doing the fork( )) has the job of deducting money from your
checking account. Clearly, you do not want this to happen twice as a
result of some other thread's decision to call fork( ).
Because of these types of problems, which in general are problems of
threads modifying persistent state, POSIX defined the behavior of
fork( ) in the presence of threads to propagate only the forking
thread. This solves the problem of improper changes being made to
persistent state. However, it causes other problems, as discussed in
the next paragraph.
In the POSIX model, only the forking thread is propagated. All the
other threads are eliminated without any form of notice; no cancels
are sent and no handlers are run. However, all the other portions of
the address space are cloned, including all the mutex state. If the
other thread has a mutex locked, the mutex will be locked in the child
process, but the lock owner will not exist to unlock it. Therefore,
the resource protected by the lock will be permanently unavailable.
Here you can find a more reputable source with more details
To answer your other question:
"how can I launch a daemon inside my message handler?"
I assume you want to implement the classical unix "fork() on accept()" model.
In that case you are out of luck, because servers such as actix-web, and async/await
in general are not designed with that in mind. Even if you have a
single-threaded async/await server, then:
When a child is forked it inherits all file descriptors from the parent. So it's
common, after a fork, for the child to close its listening socket in order to avoid a
resource leak - but there is no way to do that on any of the async/await based servers,
not because it's impossible to do, but because it's not implemented.
An even more important reason to do that is to prevent the child process
from accepting new connections - because even if you run a single threaded
server, it's still capable of processing many tasks concurrently - i.e.
when your handler calls .await on something, the acceptor would be free to
accept a new connection (by stealing it from the socket's queue) and start processing it.
Your parent server may have already spawned a lot of tasks and those would be
replicated in each forked child, thus executing the very same thing multiple times,
independently in each process
And well... there is no way to prevent any of that on any of the async/await
based servers I'm familiar with. You would need a custom server that:
Checks in its acceptor task if it's a child and if it detects that it's the child
it should close the listening socket and drop the acceptor.
It should not execute any other task that was forked from the parent,
but there is no way to achieve that.
In other words - async/await and "fork() on accept()" are two different and
incompatible models for processing tasks concurrently.
A possible solution would be to have a non-async acceptor daemon that only
accepts connections and forks itself, then spawns a web server in the child
and feeds it the accepted socket. But although possible, none of the servers
currently have support for that.
As described in the other answer, the async runtime you're relying on may completely break if you touch it in the child process. Touching anything can completely break assumptions the actix or tokio devs made. Wacky stuff will happen if you so much as return from the function.
See this response by one of the key authors of tokio to someone doing something similar (calling fork() in the context of a threadpool with hyper):
Threads + fork is bad news... you can fork if you immediately exec and do not allocate memory or perform any other operation that may have been corrupted by the fork.
Going back to your question:
The objective is for my websocket to respond to messages and be able to launch isolated long-running processes that launch successfully and do not exit when the websocket exits.
I don't think you want to manually fork() at all. Utility functions provided by actix/tokio should integrate well with their runtimes. You may:
Run blocking or CPU-heavy code in a dedicated thread with actix_web::block
Spawn a future with actix::AsyncContext::spawn. You would ideally want to use e.g. tokio::process::Command rather than the std version to avoid blocking in an async context.
If all you're doing in the child process is running Command::new() and later Command::spawn(), I'm pretty sure you can just call it directly. There's no need to fork; it does that internally.
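As a rough sketch of the last point, a long-running process can be started without any manual fork(). This uses std::process::Command so that it is self-contained; inside an async handler, tokio::process::Command would be the analogous choice, as noted above. The helper name launch_detached and the `sleep` program are assumptions for illustration (`sleep` exists on Unix-like systems), and this is not a full daemonization (no new session, no working-directory change).

```rust
use std::io;
use std::process::{Child, Command, Stdio};

/// Starts `program` as a separate process with its standard streams
/// disconnected from the parent. The std `Child` handle does not kill the
/// process when dropped, so the process keeps running on its own.
fn launch_detached(program: &str, args: &[&str]) -> io::Result<Child> {
    Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
}

fn main() {
    // `sleep 1` is only a placeholder for the real long-running job.
    match launch_detached("sleep", &["1"]) {
        Ok(child) => println!("started child with pid {}", child.id()),
        Err(e) => eprintln!("failed to launch: {e}"),
    }
}
```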

How can Rust be told that a thread does not live longer than its caller? [duplicate]

This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed 5 years ago.
I have the following code:
fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    let t1 = std::thread::spawn(|| {
        println!("{}", message);
    });
    t1.join();
}
rustc gives me the compilation error:
closure may outlive the current function, but it borrows message, which is owned by the current function
This is wrong since:
The function it's referring to here is (I believe) main. The threads will be killed or enter in UB once main is finished executing.
The function it's referring to clearly invokes .join() on said thread.
Is the previous code unsafe in any way? If so, why? If not, how can I get the compiler to understand that?
Edit: Yes I am aware I can just move the message in this case, my question is specifically asking how can I pass a reference to it (preferably without having to heap allocate it, similarly to how this code would do it:
std::thread([&message]() -> void {/* etc */});
(Just to clarify, what I'm actually trying to do is access a thread safe data structure from two threads... other solutions to the problem that don't involve making the copy work would also help).
Edit2: The question this has been marked as a duplicate of is 5 pages long, and as such I'd consider it an invalid question in its own right.
Is the previous code 'unsafe' in any way ? If so, why ?
The goal of Rust's type-checking and borrow-checking system is to disallow unsafe programs, but that does not mean that all programs that fail to compile are unsafe. In this specific case, your code is not unsafe, but it does not satisfy the type constraints of the functions you are using.
The function it's referring to clearly invokes .join() on said thread.
But there is nothing from a type-checker standpoint that requires the call to .join. A type-checking system (on its own) can't enforce that a function has or has not been called on a given object. You could just as easily imagine an example like
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let mut handles = vec![];
for i in 0..3 {
    let t1 = std::thread::spawn(|| {
        println!("{} {}", message, i);
    });
    handles.push(t1);
}
for t1 in handles {
    t1.join();
}
where a human can tell that each thread is joined before main exits. But a typechecker has no way to know that.
The function it's referring to here is (I believe) main. So presumably those threads will be killed when main exits anyway (and them running after main exits is UB).
From the standpoint of the checkers, main is just another function. There is no special knowledge that this specific function can have extra behavior. If this were any other function, the thread would not be auto-killed. Expanding on that, even for main there is no guarantee that the child threads will be killed instantly. If it takes 5ms for the child threads to be killed, that is still 5ms where the child threads could be accessing the content of a variable that has gone out of scope.
To gain the behavior that you are looking for with this specific snippet (as-is), the lifetime of the closure would have to be tied to the lifetime of the t1 object, such that the closure was guaranteed to never be used after the handles have been cleaned up. While that is certainly an option, it is significantly less flexible in the general case. Because it would be enforced at the type level, there would be no way to opt out of this behavior.
You could consider using crossbeam, specifically crossbeam::scope's .spawn, which enforces this lifetime requirement where the standard library does not, meaning a thread must stop execution before the scope is finished.
In your specific case, your code works fine as long as you transfer ownership of message to the child thread instead of borrowing it from the main function, because there is no risk of unsafe code with or without your call to .join. Your code works fine if you change
let t1 = std::thread::spawn(|| {
to
let t1 = std::thread::spawn(move || {
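If you do want to borrow instead of move, the scoped-thread idea mentioned above can be sketched with the standard library's std::thread::scope, which was stabilized in Rust 1.63 (so it postdates this answer and is not the crossbeam API itself). The function name print_from_threads and its return value are made up for illustration.

```rust
use std::thread;

/// Borrows `message` from the caller in several threads. The scope
/// guarantees that all threads are joined before it returns, which is what
/// lets the borrow checker accept the non-'static reference.
fn print_from_threads(message: &str) -> usize {
    thread::scope(|s| {
        let mut handles = Vec::new();
        for i in 0..3 {
            // `move` copies the `&str` reference and `i`, not the string data.
            handles.push(s.spawn(move || {
                println!("{} {}", message, i);
                message.len()
            }));
        }
        // Collect each thread's result; threads that are not joined
        // explicitly would still be joined when the scope ends.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<usize>()
    })
}

fn main() {
    let message = String::from("Can't shoot yourself in the foot if you ain't got no gun");
    let total = print_from_threads(&message);
    // `message` is still owned by main and usable here.
    println!("{} bytes printed in total from: {}", total, message);
}
```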

How can I reliably clean up Rust threads performing blocking IO?

It seems to be a common idiom in Rust to spawn off a thread for blocking IO so you can use non-blocking channels:
use std::sync::mpsc::channel;
use std::thread;
use std::net::TcpListener;
fn main() {
    let (accept_tx, accept_rx) = channel();
    let listener_thread = thread::spawn(move || {
        let listener = TcpListener::bind(":::0").unwrap();
        for client in listener.incoming() {
            if let Err(_) = accept_tx.send(client.unwrap()) {
                break;
            }
        }
    });
}
The problem is, rejoining threads like this depends on the spawned thread "realizing" that the receiving end of the channel has been dropped (i.e., calling send(..) returns Err(_)):
drop(accept_rx);
listener_thread.join(); // blocks until listener thread reaches accept_tx.send(..)
You can make dummy connections for TcpListeners, and shutdown TcpStreams via a clone, but these seem like really hacky ways to clean up such threads, and as it stands, I don't even know of a hack to trigger a thread blocking on a read from stdin to join.
How can I clean up threads like these, or is my architecture just wrong?
One simply cannot safely cancel a thread reliably in Windows or Linux/Unix/POSIX, so it isn't available in the Rust standard library.
Here is an internals discussion about it.
There are a lot of unknowns that come from cancelling threads forcibly. It can get really messy. Beyond that, the combination of threads and blocking I/O will always face this issue: you need every blocking I/O call to have timeouts for it to even have a chance of being interruptible reliably. If one can't write async code, one needs to either use processes (which have a defined boundary and can be ended by the OS forcibly, but obviously come with heavier weight and data sharing challenges) or non-blocking I/O which will land your thread back in an event loop that is interruptible.
mio is available for async code. Tokio is a higher-level crate based on mio which makes writing non-blocking async code even more straightforward.
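Without pulling in mio or Tokio, the "non-blocking I/O that lands back in an interruptible loop" idea can be approximated with the standard library alone. The following is a sketch under assumptions: the function name spawn_listener, the loopback address, and the 10 ms polling interval are all invented for illustration, and polling with a sleep is cruder than a real event loop.

```rust
use std::io::{self, ErrorKind};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawns an accept loop that checks `stop` between non-blocking accepts,
/// so the thread can always be asked to exit and then joined.
fn spawn_listener(
    stop: Arc<AtomicBool>,
) -> io::Result<(JoinHandle<()>, Receiver<TcpStream>, SocketAddr)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    listener.set_nonblocking(true)?;
    let addr = listener.local_addr()?;
    let (tx, rx) = channel();

    let handle = thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            match listener.accept() {
                Ok((stream, _peer)) => {
                    // The receiver was dropped: nobody wants connections anymore.
                    if tx.send(stream).is_err() {
                        break;
                    }
                }
                // No pending connection; sleep briefly and re-check the flag.
                Err(e) if e.kind() == ErrorKind::WouldBlock => {
                    thread::sleep(Duration::from_millis(10));
                }
                Err(_) => break,
            }
        }
    });
    Ok((handle, rx, addr))
}

fn main() -> io::Result<()> {
    let stop = Arc::new(AtomicBool::new(false));
    let (handle, rx, addr) = spawn_listener(Arc::clone(&stop))?;

    let _client = TcpStream::connect(addr)?;
    let accepted = rx.recv_timeout(Duration::from_secs(1)).is_ok();
    println!("accepted a connection: {accepted}");

    // Ask the listener to stop; the join returns within one polling interval.
    stop.store(true, Ordering::Relaxed);
    handle.join().expect("listener panicked");
    Ok(())
}
```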

How to avoid deadlocks?

When using multiple threads, shared memory needs to be locked by critical sections. However, using critical sections causes potential deadlocks. How can they be avoided?
One way is to use a hierarchy of critical sections. If you ensure that a parent critical section is never entered within one of its children, deadlocks cannot happen. The difficulty is to enforce this hierarchy.
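A common way to enforce such a hierarchy is to give every lock a rank and always acquire locks in rank order. The sketch below assumes a hypothetical Account type whose id field serves as the rank; the names and amounts are illustrative only.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical account whose `id` doubles as its rank in the lock hierarchy.
struct Account {
    id: u32,
    balance: Mutex<i64>,
}

/// Moves `amount` between two distinct accounts, always locking the account
/// with the lower id first so that two opposite transfers cannot deadlock.
fn transfer(from: &Account, to: &Account, amount: i64) {
    assert_ne!(from.id, to.id, "cannot transfer to the same account");
    let (first, second) = if from.id < to.id { (from, to) } else { (to, from) };
    let mut first_guard = first.balance.lock().unwrap();
    let mut second_guard = second.balance.lock().unwrap();

    // Map the ordered guards back to their roles in the transfer.
    let (from_balance, to_balance) = if from.id < to.id {
        (&mut *first_guard, &mut *second_guard)
    } else {
        (&mut *second_guard, &mut *first_guard)
    };
    *from_balance -= amount;
    *to_balance += amount;
}

/// Runs transfers in both directions concurrently and returns the balances.
fn run_transfers() -> (i64, i64) {
    let a = Arc::new(Account { id: 1, balance: Mutex::new(100) });
    let b = Arc::new(Account { id: 2, balance: Mutex::new(100) });

    let t1 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || {
            for _ in 0..1000 {
                transfer(&a, &b, 1);
            }
        })
    };
    let t2 = {
        let (a, b) = (Arc::clone(&a), Arc::clone(&b));
        thread::spawn(move || {
            for _ in 0..1000 {
                transfer(&b, &a, 2);
            }
        })
    };
    t1.join().unwrap();
    t2.join().unwrap();

    let final_a = *a.balance.lock().unwrap();
    let final_b = *b.balance.lock().unwrap();
    (final_a, final_b)
}

fn main() {
    println!("{:?}", run_transfers());
}
```

Nothing in the type system enforces the ordering here; it only holds as long as every code path that takes both locks goes through a function like transfer.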
The Related list to the right on this page contains a few links that provide interesting information on the topic.
In addition to that list, there are many other SO questions discussing the topic, such as
Threading Best Practices
Why is lock(this) {…} bad?
What are common reasons for deadlocks?
...and many more
You can avoid critical sections by using message passing instead (synchronous and asynchronous calls). When using synchronous calls, you still have to make sure not to make a circular call, in which thread A asks thread B a question, and B needs to ask A a question to be able to respond.
Another option is to make asynchronous calls instead. However, it is more difficult to get return values.
Note: Indeed, a message passing system is implemented using a critical section that locks the call queue, but it is abstracted away.
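To illustrate the message-passing approach, here is a sketch in Rust where a single thread owns the state and everyone else talks to it through a channel. The Request enum, spawn_account, and demo are hypothetical names. Deposit plays the role of an asynchronous call (no reply), while Balance is a synchronous call that carries a reply channel for the return value.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Messages understood by the owner thread.
enum Request {
    /// Asynchronous call: no reply expected.
    Deposit(i64),
    /// Synchronous call: the caller waits on the enclosed reply channel.
    Balance(Sender<i64>),
}

/// Spawns the thread that exclusively owns the balance. No other thread
/// touches the state directly, so no explicit lock is needed.
fn spawn_account(initial: i64) -> (Sender<Request>, JoinHandle<()>) {
    let (tx, rx) = channel();
    let handle = thread::spawn(move || {
        let mut balance = initial;
        // The loop ends once every sender has been dropped.
        for request in rx {
            match request {
                Request::Deposit(amount) => balance += amount,
                Request::Balance(reply) => {
                    let _ = reply.send(balance);
                }
            }
        }
    });
    (tx, handle)
}

fn demo() -> i64 {
    let (account, owner) = spawn_account(100);

    let depositors: Vec<_> = (0..4)
        .map(|_| {
            let account = account.clone();
            thread::spawn(move || {
                for _ in 0..10 {
                    account.send(Request::Deposit(5)).unwrap();
                }
            })
        })
        .collect();
    for depositor in depositors {
        depositor.join().unwrap();
    }

    // Synchronous query: send a reply channel and wait for the answer.
    let (reply_tx, reply_rx) = channel();
    account.send(Request::Balance(reply_tx)).unwrap();
    let balance = reply_rx.recv().unwrap();

    // Dropping the last sender lets the owner thread finish.
    drop(account);
    owner.join().unwrap();
    balance
}

fn main() {
    println!("final balance: {}", demo());
}
```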
Among the various mechanisms for guarding critical sections, semaphores and mutexes are the most popular.
A semaphore is a waiting (signaling) mechanism and a mutex is a locking mechanism. The distinction confuses many people, but in short: only the thread that locked a mutex can unlock it. With this in mind:
Don't allow any process to lock only part of the resources it needs; if a process needs 5 resources, it should wait until all of them are available.
If you use a semaphore here, another thread can release (un-wait) a resource that is occupied by a different thread; in other words, preemption becomes possible.
In my view these two are the basic conditions; the remaining two of the four common precautions can be related to them.
If you don't agree, please add comments. I will add a cleaner and clearer explanation later.
When I work in C++, the following works for me:
all public methods (excluding ctor and dtor) of a threadsafe class lock
private methods cannot call public methods
It's not a general deadlock avoidance method.
You must code multi-thread programs very carefully. There's no short-cut, you must understand the flow of your program, otherwise you'll be doomed.
The following algorithm is used to avoid deadlock:
Banker’s Algorithm
–Impose less stringent conditions than in deadlock prevention in an attempt to get better resource utilization
–Safe state
•Operating system can guarantee that all current processes can complete their work within a finite time
–Unsafe state
•Does not imply that the system is deadlocked, but that the OS cannot guarantee that all current processes can complete their work within a finite time
–Requires that resources be allocated to processes only when the allocations result in safe states.
–It has a number of weaknesses (such as requiring a fixed number of processes and resources) that prevent it from being implemented in real systems
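The safe-state test at the heart of the Banker's Algorithm can be sketched as follows. This is an illustrative implementation in Rust, not code from the answer: the function name is_safe and the numbers in main are made up, `allocation[p]` is what process p currently holds, and `need[p]` is what it may still request.

```rust
/// Returns true if the processes can finish in some order, i.e. each one in
/// turn can obtain its remaining need from the resources available at that
/// point and then release everything it holds.
fn is_safe(available: &[u32], allocation: &[Vec<u32>], need: &[Vec<u32>]) -> bool {
    let mut work = available.to_vec();
    let mut finished = vec![false; allocation.len()];

    loop {
        let mut progressed = false;
        for p in 0..allocation.len() {
            if finished[p] {
                continue;
            }
            // Can process p's remaining need be met right now?
            let can_run = need[p].iter().zip(&work).all(|(n, w)| n <= w);
            if can_run {
                // Pretend p runs to completion and returns its resources.
                for (w, a) in work.iter_mut().zip(&allocation[p]) {
                    *w += a;
                }
                finished[p] = true;
                progressed = true;
            }
        }
        if !progressed {
            break;
        }
    }
    finished.iter().all(|&done| done)
}

fn main() {
    // Three resource types, five processes (illustrative numbers).
    let available = [3, 3, 2];
    let allocation = vec![
        vec![0, 1, 0],
        vec![2, 0, 0],
        vec![3, 0, 2],
        vec![2, 1, 1],
        vec![0, 0, 2],
    ];
    let need = vec![
        vec![7, 4, 3],
        vec![1, 2, 2],
        vec![6, 0, 0],
        vec![0, 1, 1],
        vec![4, 3, 1],
    ];
    println!("safe: {}", is_safe(&available, &allocation, &need));
}
```

A resource request would then be granted only if the state that results from granting it still passes this check.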
One way is by using a non-blocking locking function. As an example, in Rust you could use std::sync::Mutex::try_lock instead of std::sync::Mutex::lock.
So if you have this example code:
use std::sync::Mutex;
fn transfer(tx: &Mutex<i32>, rx: &Mutex<i32>, amount: i32) {
    let mut tx = tx.lock().unwrap();
    let mut rx = rx.lock().unwrap();
    *tx -= amount;
    *rx += amount;
}
You could instead do something like this:
fn transfer(tx: &Mutex<i32>, rx: &Mutex<i32>, amount: i32) {
    loop {
        // Attempt to lock both mutexes
        let mut tx = tx.try_lock();
        let mut rx = rx.try_lock();
        // If both locks were successful,
        // i.e. if they are not currently
        // locked by another thread
        if let Ok(ref mut tx) = tx {
            if let Ok(ref mut rx) = rx {
                // Perform the operations needed on
                // the values inside the mutexes
                **tx -= amount;
                **rx += amount;
                // Exit the loop
                break;
            }
        }
        // If at least one of the locks was
        // not successful, any lock that was
        // acquired is released at the end of
        // this iteration, and the loop tries again.
        // You may also want to sleep the thread
        // here for a short period if you think that
        // the mutexes might be locked for a while.
    }
}

Resources