Interrupt std::io::read() in Rust

I have been trying to write an application that invokes a system command in a separate thread.
The key characteristic is that I want to be able to kill this command and invoke a new one from the main thread upon request.
My secondary thread looks like so:
let (tx, rx): (mpsc::Sender<String>, mpsc::Receiver<String>) = mpsc::channel();
let child_handle = stoppable_thread::spawn(move |stop| {
let mut child = Command::new("[the program]").arg(format!("-g {}", gain)).stdout(Stdio::piped()).spawn().expect("naw man");
let mut childout = child.stdout.as_mut().unwrap();
while !stop.get() {
let mut buffer = [0; 128];
childout.read(&mut buffer).unwrap(); // This call (needs use std::io::Read) blocks until the child writes more output
// Here the buffer is sent via mpsc irrelevant to the issue
}});
The problem is when I send a stop signal (or if I used an mpsc channel to notify the thread to stop) it waits for the command to output something to stdout. This is unwanted behavior.
How can I remedy this? How can I interrupt the read() function?

You can kill the child process, which will cause it to close its output, at which point read will see an EOF and return. However, this requires sending the child to the parent thread. Something like:
let (tx, rx): (mpsc::Sender<String>, mpsc::Receiver<String>) = mpsc::channel();
let (ctx, crx) = mpsc::channel();
let child_handle = stoppable_thread::spawn(move |stop| {
let mut child = Command::new("[the program]").arg(format!("-g {}", gain)).stdout(Stdio::piped()).spawn().expect("naw man");
let mut childout = child.stdout.take().unwrap();
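ctx.send(child).unwrap(); // hand the Child to the main thread so it can be killed from there
while !stop.get() {
let mut buffer = [0; 128];
childout.read(&mut buffer).unwrap(); // Blocks until output arrives; returns Ok(0) (EOF) once the child has been killed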
// Here the buffer is sent via mpsc irrelevant to the issue
}});
// When you want to stop:
let mut child = crx.recv().unwrap(); // kill() takes &mut self, so the binding must be mutable
child.kill().unwrap();
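As a rough, self-contained sketch of the same idea: instead of sending the whole Child across a channel, the spawning thread can keep the Child and move only the stdout handle into the reader thread. The function name read_until_killed and the use of the Unix sleep command as a silent child are assumptions made for illustration, not part of the original code. Note that EOF only arrives once every copy of the pipe's write end is closed, so a child that has spawned its own subprocesses may keep the read blocked.

```rust
use std::io::Read;
use std::process::{Command, Stdio};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: spawns `program`, reads its stdout on a helper thread,
/// kills the child after `delay`, and returns how many bytes were read before EOF.
fn read_until_killed(program: &str, args: &[&str], delay: Duration) -> std::io::Result<usize> {
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::piped())
        .spawn()?;
    // Move only the stdout handle into the reader thread; `child` stays here.
    let mut childout = child.stdout.take().expect("stdout was piped");

    let reader = thread::spawn(move || {
        let mut total = 0;
        let mut buffer = [0u8; 128];
        loop {
            match childout.read(&mut buffer) {
                Ok(0) => break,      // EOF: the write end of the pipe was closed
                Ok(n) => total += n, // buffer[..n] could be forwarded over mpsc here
                Err(_) => break,
            }
        }
        total
    });

    thread::sleep(delay);
    child.kill()?; // the child's end of the pipe closes, unblocking the read
    child.wait()?; // reap the process so it does not linger as a zombie
    Ok(reader.join().expect("reader thread panicked"))
}

fn main() -> std::io::Result<()> {
    // `sleep 30` prints nothing, so the reader is blocked until the kill.
    let n = read_until_killed("sleep", &["30"], Duration::from_millis(200))?;
    println!("read {} bytes before EOF", n);
    Ok(())
}
```

With this layout no stop flag is needed for the reader: the loop ends on its own when the read returns Ok(0).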

Related

Run command, stream stdout/stderr and capture results

I'm trying to use std::process::Command to run a command and stream its stdout and stderr while also capturing a copy of stdout/stderr. I found I can use spawn.
This code will capture the output, but won't stream it to stdout/stderr while it's happening:
let mut child = command
.envs(env)
.stdout(Stdio::piped()) // <=== Difference here
.spawn()
.unwrap();
let output = child
.wait_with_output().unwrap();
println!("Done {}", std::str::from_utf8(&output.stdout).unwrap());
This code will stream the output but not capture it:
let mut child = command
.envs(env)
.spawn()
.unwrap();
let output = child
.wait_with_output().unwrap();
println!("Done {}", std::str::from_utf8(&output.stdout).unwrap());
Is there a way to capture a command's output while also streaming it to the parent stdout/stderr?
There might be a less verbose way to do this, but this is the solution I came up with.
Spawn the process with a piped io for stdout and stderr. Spawn a thread for stdout and stderr. In each thread read from the pipe and output directly to stdout or stderr then write the contents to a channel.
In the main thread wait for the process to finish, then join the threads and finally read each channel to get the contents of stdout and stderr.
use std::io::{BufRead, BufReader};
use std::process::Stdio;
use std::thread;
let mut child = command
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.unwrap();
let child_stdout = child
.stdout
.take()
.expect("Internal error, could not take stdout");
let child_stderr = child
.stderr
.take()
.expect("Internal error, could not take stderr");
let (stdout_tx, stdout_rx) = std::sync::mpsc::channel();
let (stderr_tx, stderr_rx) = std::sync::mpsc::channel();
let stdout_thread = thread::spawn(move || {
let stdout_lines = BufReader::new(child_stdout).lines();
for line in stdout_lines {
let line = line.unwrap();
println!("{}", line);
stdout_tx.send(line).unwrap();
}
});
let stderr_thread = thread::spawn(move || {
let stderr_lines = BufReader::new(child_stderr).lines();
for line in stderr_lines {
let line = line.unwrap();
eprintln!("{}", line);
stderr_tx.send(line).unwrap();
}
});
let status = child
.wait()
.expect("Internal error, failed to wait on child");
stdout_thread.join().unwrap();
stderr_thread.join().unwrap();
// lines() strips the trailing newline, so join with "\n" to keep the lines separate
let stdout = stdout_rx.into_iter().collect::<Vec<String>>().join("\n");
let stderr = stderr_rx.into_iter().collect::<Vec<String>>().join("\n");
The channel isn't strictly needed. I originally wanted to mutate a string, but I'm new to threads in Rust and couldn't find any examples showing how to mutate a string in a thread and then read it back in main.
I'm accepting the other solution as it really answered my main question. I just wanted to post back to give everyone a fully-featured answer that does exactly what I originally asked.
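On the remark that the channel isn't strictly needed: one possible alternative, sketched here under assumptions, is to let each thread build its own String and return it through the JoinHandle. The helper names tee_lines and run_and_capture are made up for this illustration, and the sh command in main is only a stand-in for the real command.

```rust
use std::io::{BufRead, BufReader, Read};
use std::process::{Command, ExitStatus, Stdio};
use std::thread;

/// Hypothetical helper: echoes every line from `source` to our own stdout or
/// stderr and returns the captured text when the pipe reaches EOF.
fn tee_lines<R: Read + Send + 'static>(source: R, to_stderr: bool) -> thread::JoinHandle<String> {
    thread::spawn(move || {
        let mut captured = String::new();
        for line in BufReader::new(source).lines() {
            let line = line.expect("failed to read a line from the child");
            if to_stderr {
                eprintln!("{}", line);
            } else {
                println!("{}", line);
            }
            captured.push_str(&line);
            captured.push('\n'); // lines() strips the newline, so put it back
        }
        captured
    })
}

/// Hypothetical helper: runs `command`, streaming and capturing both pipes.
fn run_and_capture(mut command: Command) -> std::io::Result<(ExitStatus, String, String)> {
    let mut child = command
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;
    let out = tee_lines(child.stdout.take().expect("stdout was piped"), false);
    let err = tee_lines(child.stderr.take().expect("stderr was piped"), true);
    let status = child.wait()?;
    // The String each thread built is handed back as the thread's return value.
    Ok((status, out.join().unwrap(), err.join().unwrap()))
}

fn main() -> std::io::Result<()> {
    let mut command = Command::new("sh");
    command.args(["-c", "echo out; echo err 1>&2"]);
    let (status, stdout, stderr) = run_and_capture(command)?;
    println!("status: {}, stdout: {:?}, stderr: {:?}", status, stdout, stderr);
    Ok(())
}
```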
This is similar to how I stream the compilation and execution output on Rust Explorer.
To stream the output you can pipe the stdout and read it line by line using BufReader.
use std::io::BufRead;
use std::io::BufReader;
use std::process::Command;
use std::process::Stdio;
fn main() {
// Compile code.
let mut child = Command::new("bash")
.args([
"-c",
"echo 'Hello'; sleep 3s; echo 'World'"
])
.stdout(Stdio::piped())
.spawn()
.unwrap();
let stdout = child.stdout.take().unwrap();
// Stream output.
let lines = BufReader::new(stdout).lines();
for line in lines {
println!("{}", line.unwrap());
}
}

How to create threads in a for loop and get the return value from each?

I am writing a program that pings a set of targets 100 times, and stores each RTT value returned from the ping into a vector, thus giving me a set of RTT values for each target. Say I have n targets, I would like all of the pinging to be done concurrently. The rust code looks like this:
let mut sample_rtts_map = HashMap::new();
for addr in targets.to_vec() {
let mut sampleRTTvalues: Vec<f32> = vec![];
//sample_rtts_map.insert(addr, sampleRTTvalues);
thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
// thread::sleep(Duration::from_millis(5000));
}
});
}
The hashmap is used to tell which vector of values belongs to which target. The problem is, how do I retrieve the updated sampleRTTvalues from each thread after the thread is done executing? I would like something like:
let (name, sampleRTTvalues) = thread::spawn(...)
The name would be the name of the thread, and sampleRTTvalues would be the vector. However, since I'm creating the threads in a for loop, each thread is instantiated the same way, so how do I differentiate them?
Is there some better way to do this? I've looked into schedulers, future, etc., but it seems my case can just be done with simple threads.
I got the desired behavior with the following code:
use std::thread;
use std::sync::mpsc;
use std::collections::HashMap;
use rand::Rng;
use std::net::{Ipv4Addr,Ipv6Addr,IpAddr};
const RTT_ONE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,0,1));
const RTT_TWO: IpAddr = IpAddr::V6(Ipv6Addr::new(0,0,0,0,0,0,0,1));
const RTT_THREE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,1,1)); // I don't know how IP addresses work; forgive me if this is invalid, but you get the idea
fn ping(address: IpAddr) -> f32 {
rand::thread_rng().gen_range(5.0..107.0)
}
fn main() {
let targets = [RTT_ONE,RTT_TWO,RTT_THREE];
let mut sample_rtts_map: HashMap<IpAddr,Vec<f32>> = HashMap::new();
for addr in targets.into_iter() {
let (sample_values,moved_values) = mpsc::channel();
let mut sampleRTTvalues: Vec<f32> = vec![];
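thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
//thread::sleep(Duration::from_millis(5000));
}
// Hand the finished vector back to the main thread.
sample_values.send(sampleRTTvalues).unwrap();
});
// recv() blocks until this thread has sent its vector.
sample_rtts_map.insert(addr,moved_values.recv().unwrap());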
}
}
Note that the use rand::Rng can be removed when implementing, as it is only there so the example works. What this does is pass data from the spawned thread to the main thread, and recv waits until the data is ready before adding it to the hash map. Because recv is called inside the loop, the main thread waits for each thread's result before spawning the next one. If this is problematic (takes a long time, etc.) then you can use try_recv instead of recv: it returns immediately, with a recoverable error if the value is not ready yet, or with the value if it is ready.
You can use a std::sync::mpsc channel to collect your data:
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::thread;
fn ping(_: &str) -> f32 { 0.0 }
fn main() {
let targets = ["a", "b"]; // just for example
let mut sample_rtts_map = HashMap::new();
let (tx, rx) = channel();
for addr in targets {
let tx = tx.clone();
thread::spawn(move || {
for _ in 0..100 {
let sampleRTT = ping(addr);
tx.send((addr, sampleRTT)).unwrap(); // send only fails if the receiver is gone
}
});
}
drop(tx);
// exit loop when all thread's tx have dropped
while let Ok((addr, sampleRTT)) = rx.recv() {
sample_rtts_map.entry(addr).or_insert(vec![]).push(sampleRTT);
}
println!("sample_rtts_map: {:?}", sample_rtts_map);
}
This will run all pinging threads simultaneously and collect the data in the main thread synchronously, so that we can avoid using locks. Do not forget to drop the sender in the main thread after cloning it for all pinging threads, or the main thread will hang forever.
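Since the question asks how to get a return value from each thread, another option is to return the vector from the closure and collect it with JoinHandle::join. The following is only a sketch: collect_rtts is a hypothetical helper name, and ping is a placeholder that derives a fake RTT from the length of the address string.

```rust
use std::collections::HashMap;
use std::thread;

// Placeholder for the real ping: returns a fake RTT based on the address length.
fn ping(addr: &str) -> f32 {
    addr.len() as f32
}

/// Hypothetical helper: pings every target `samples` times on its own thread
/// and returns the samples keyed by target.
fn collect_rtts(targets: &[&'static str], samples: usize) -> HashMap<&'static str, Vec<f32>> {
    // Spawn every thread first, keeping each handle next to its target,
    // so that all targets are pinged concurrently.
    let handles: Vec<(&'static str, thread::JoinHandle<Vec<f32>>)> = targets
        .iter()
        .map(|&addr| {
            let handle = thread::spawn(move || (0..samples).map(|_| ping(addr)).collect());
            (addr, handle)
        })
        .collect();

    // Only now wait for the threads; join() returns the closure's value.
    handles
        .into_iter()
        .map(|(addr, handle)| (addr, handle.join().expect("ping thread panicked")))
        .collect()
}

fn main() {
    let map = collect_rtts(&["a", "bb"], 100);
    for (addr, rtts) in &map {
        println!("{}: {} samples", addr, rtts.len());
    }
}
```

Pairing each handle with its target answers the "how do I differentiate them" part: the address travels alongside the handle, so no thread names are needed.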

Rust Tokio mpsc::channel unexpected behavior for multi-task program

In the following program I use Tokio's mpsc channels. The Sender is moved to a task named input_message and the Receiver is moved to another task named printer. Both tasks are tokio::spawn()-ed in the main function. The input_message task is to read the user's input and send it through a Channel. The printer task recv() on the channel to get the user's input and simply prints it to stdout:
use std::error::Error;
use tokio::sync::mpsc;
use std::io::{BufRead, Write};
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let (tx, mut rx) = mpsc::unbounded_channel::<String>();
let printer = tokio::spawn(async move {
loop {
let res = rx.recv().await; // (11) Comment this ..
// let res = rx.try_recv(); // (12) Uncomment this ,,
if let Some(m) = res { // .. and this
// if let Ok(m) = res { // ,, and this
if m.trim() == "q".to_string() {
break;
}
println!("Received: {}", m.trim());
}
}
println!("Printer exited");
});
let input_message = tokio::spawn(async move {
let stdin = std::io::stdin();
let mut bufr = std::io::BufReader::new(stdin);
let mut buf = String::new();
loop {
// Let the printer thread print the string before asking the user's input.
std::thread::sleep(std::time::Duration::from_millis(1));
print!("Enter input: ");
std::io::stdout().flush().unwrap();
bufr.read_line(&mut buf).unwrap();
if buf.trim() == "q".to_string() {
tx.send(buf).unwrap();
break;
}
tx.send(buf).unwrap();
buf = String::new();
}
println!("InputMessage exited");
});
tokio::join!(input_message, printer);
Ok(())
}
The expected behavior of the program is to:
Ask the user a random input (q to quit)
Print that same input to stdout
Using rx.recv().await as in line 11-13, the program seems to buffer the Strings representing the user's input: the printer task does not receive the inputs and therefore does not print them to stdout. Once the quit message (i.e. q) is sent, the input_message task exits, the messages seem to be flushed out of the channel, and the receiver processes them all at once, so the printer task prints all the inputs together. Here's an example of the wrong output:
Enter input: Hello
Enter input: World
Enter input: q
InputMessage exited
Received: Hello
Received: World
Printer exited
My question here is, how is it possible that the channel buffers the messages and processes them in one go only when the sending thread exits, instead of receiving them as they are sent?
What I tried to do is to use the try_recv() function as in line 12-14 and indeed it fixes the problem. The output is correctly printed, here is an example:
Enter input: Hello
Received: Hello
Enter input: World
Received: World
Enter input: q
InputMessage exited
Printer exited
In light of this, I get confused. I get the difference between the recv().await and the try_recv() functions but I think there's something more in this case that I'm ignoring that makes the latter work and the former not work. Is anybody able to shed some light and elaborate on this? Why does try_recv() work and recv().await not, and why should recv().await not work in this scenario? In terms of efficiency is looping on try_recv() bad or "bad practice" at all?
There are a few things to point out here, but first of all, you are waiting for lines on std::io::stdin(), which blocks the thread until a line arrives on that stream. While the thread is waiting for input, no other future can be executed on this thread. This blog post is a great resource if you want to dive deeper into why you shouldn't do that.
Tokio's io module offers an async handle to stdin(), you can work with this as a quick fix, although the documentation explicitly mentions that you should spin up a dedicated (non-async) thread for interactive user input instead of using the async handle.
Swapping std::io::stdin() for tokio::io::stdin() also entails swapping out the standard library BufReader for tokio's implementation that wraps an R: AsyncRead rather than R: Read.
To prevent interleaved writes between the input task and the output task, you can use a responder channel that signals to the input task when the output has been printed. Instead of sending String over the channel, you could send a Message with these fields:
struct Message {
message: String,
done_tx: oneshot::Sender<()>,
}
After reading an input line, send the Message over the channel to the printer task. The printer task prints the String and signals through the done_tx that the input task can print the input prompt and wait for a new line.
Putting all that together with some other changes like a while loop to wait for messages, you'd end up with something like this:
use std::error::Error;
use tokio::io::{AsyncBufReadExt, AsyncWriteExt};
use tokio::sync::{mpsc, oneshot};
#[derive(Debug)]
struct Message {
done_tx: oneshot::Sender<()>,
message: String,
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let (tx, mut rx) = mpsc::unbounded_channel::<Message>();
let printer = tokio::spawn(async move {
while let Some(Message {
message: m,
done_tx,
}) = rx.recv().await
{
if m.trim() == "q".to_string() {
break;
}
println!("Received: {}", m.trim());
done_tx.send(()).unwrap();
}
println!("Printer exited");
});
let input_message = tokio::spawn(async move {
let stdin = tokio::io::stdin();
let mut stdout = tokio::io::stdout();
let mut bufr = tokio::io::BufReader::new(stdin);
let mut buf = String::new();
loop {
// Print the prompt; write_all keeps going until the whole prompt has been written.
stdout.write_all(b"Enter input: ").await.unwrap();
stdout.flush().await.unwrap();
bufr.read_line(&mut buf).await.unwrap();
let end = buf.trim() == "q";
let (done_tx, done) = oneshot::channel();
let message = Message {
message: buf,
done_tx,
};
tx.send(message).unwrap();
if end {
break;
}
done.await.unwrap();
buf = String::new();
}
println!("InputMessage exited");
});
tokio::join!(input_message, printer);
Ok(())
}

Reading and writing to a long-running std::process::Child

I have a long-running child process to which I need to read and write a lot of data. I have a reader thread and a writer thread that manipulate the child.stdout and child.stdin respectively:
extern crate scoped_threadpool;
fn main() {
// run the subprocess
let mut child = std::process::Command::new("cat")
.stdin(std::process::Stdio::piped())
.stdout(std::process::Stdio::piped())
.spawn()
.unwrap();
let child_stdout = child.stdout.as_mut().unwrap();
let child_stdin = std::sync::Mutex::new(child.stdin.as_mut().unwrap());
let mut pool = scoped_threadpool::Pool::new(2);
pool.scoped(|scope| {
// read all output from the subprocess
scope.execute(move || {
use std::io::BufRead;
let reader = std::io::BufReader::new(child_stdout);
for line in reader.lines() {
println!("{}", line.unwrap());
}
});
// write to the subprocess
scope.execute(move || {
for a in 0..1000 {
use std::io::Write;
writeln!(&mut child_stdin.lock().unwrap(), "{}", a).unwrap();
} // close child_stdin???
});
});
}
When the writer is done, I want to close child_stdin so that the subprocess finishes and exits, so that the reader sees EOF and pool.scoped returns. I can't do this without child.wait() and I can't call child.wait() because it's being borrowed by the two threads.
How do I make this program complete?
Amusingly, you've caused this yourself by sharing ownership using the Mutex ^_^. Instead of taking a reference to child.stdin, take complete ownership of it and pass it to the thread. When the thread is over, it will be dropped, closing it implicitly:
let mut child_stdin = child.stdin.unwrap();
// ...
scope.execute(move ||
for a in 0..1000 {
use std::io::Write;
writeln!(&mut child_stdin, "{}", a).unwrap();
}
// child_stdin has been moved into this closure and is now
// dropped, closing it.
);
If you'd like to still be able to call wait to get the ExitStatus, change the reference to stdout and the transfer of ownership of stdin to use Option::take. This means that child is not borrowed at all:
let mut child = // ...
let child_stdout = child.stdout.take().unwrap();
let mut child_stdin = child.stdin.take().unwrap();
// ...
child.wait().unwrap();
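Putting the pieces together, here is a minimal sketch that uses plain std::thread instead of scoped_threadpool, which is a substitution made only to keep the example dependency-free. The function name echo_through_cat is hypothetical. Both handles are taken out of the Child with Option::take, so the Child itself is never borrowed and wait can be called at the end.

```rust
use std::io::{BufRead, BufReader, Write};
use std::process::{Command, Stdio};
use std::thread;

/// Hypothetical helper: writes `count` numbers to `cat` and returns the lines it echoes.
fn echo_through_cat(count: u32) -> std::io::Result<Vec<String>> {
    let mut child = Command::new("cat")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // take() moves the handles out of `child`, leaving `child` unborrowed.
    let mut child_stdin = child.stdin.take().expect("stdin was piped");
    let child_stdout = child.stdout.take().expect("stdout was piped");

    let writer = thread::spawn(move || {
        for a in 0..count {
            writeln!(child_stdin, "{}", a).expect("write to child failed");
        }
        // child_stdin is dropped here, which closes the pipe and lets `cat` exit.
    });
    let reader = thread::spawn(move || {
        BufReader::new(child_stdout)
            .lines()
            .collect::<std::io::Result<Vec<String>>>()
    });

    writer.join().expect("writer thread panicked");
    let lines = reader.join().expect("reader thread panicked")?;
    child.wait()?;
    Ok(lines)
}

fn main() -> std::io::Result<()> {
    let lines = echo_through_cat(1000)?;
    println!("read {} lines back from cat", lines.len());
    Ok(())
}
```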

How to terminate or suspend a Rust thread from another thread?

Editor's note — this example was created before Rust 1.0 and the specific types have changed or been removed since then. The general question and concept remains valid.
I have spawned a thread with an infinite loop and timer inside.
thread::spawn(|| {
let mut timer = Timer::new().unwrap();
let periodic = timer.periodic(Duration::milliseconds(200));
loop {
periodic.recv();
// Do my work here
}
});
After some time, based on certain conditions, I need to terminate this thread from another part of my program. In other words, I want to exit from the infinite loop. How can I do this correctly? Additionally, how could I suspend this thread and resume it later?
I tried to use a global unsafe flag to break the loop, but I think this solution does not look nice.
For both terminating and suspending a thread you can use channels.
Terminated externally
On each iteration of a worker loop, we check if someone notified us through a channel. If yes or if the other end of the channel has gone out of scope we break the loop.
use std::io::{self, BufRead};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;
fn main() {
println!("Press enter to terminate the child thread");
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
println!("Working...");
thread::sleep(Duration::from_millis(500));
match rx.try_recv() {
Ok(_) | Err(TryRecvError::Disconnected) => {
println!("Terminating.");
break;
}
Err(TryRecvError::Empty) => {}
}
});
let mut line = String::new();
let stdin = io::stdin();
let _ = stdin.lock().read_line(&mut line);
let _ = tx.send(());
}
Suspending and resuming
We use recv(), which suspends the thread until something arrives on the channel. In order to resume the thread, you need to send something through the channel; the unit value () in this case. If the transmitting end of the channel is dropped, recv() will return Err(RecvError); we use this to exit the loop.
use std::io::{self, BufRead};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;
fn main() {
println!("Press enter to wake up the child thread");
let (tx, rx) = mpsc::channel();
thread::spawn(move || loop {
println!("Suspending...");
match rx.recv() {
Ok(_) => {
println!("Working...");
thread::sleep(Duration::from_millis(500));
}
Err(_) => {
println!("Terminating.");
break;
}
}
});
let mut line = String::new();
let stdin = io::stdin();
for _ in 0..4 {
let _ = stdin.lock().read_line(&mut line);
let _ = tx.send(());
}
}
Other tools
Channels are the easiest and the most natural (IMO) way to do these tasks, but not the most efficient one. There are other concurrency primitives which you can find in the std::sync module. They belong to a lower level than channels but can be more efficient in particular tasks.
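One of those lower-level primitives is an atomic flag, which is a safe replacement for the global unsafe flag mentioned in the question. The sketch below is an illustration only: the function name run_until_stopped and the timings are assumptions, and the worker only notices the flag once per iteration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical worker: loops until `stop` is set and returns the iteration count.
fn run_until_stopped(stop: Arc<AtomicBool>, period: Duration) -> u32 {
    let mut iterations = 0;
    while !stop.load(Ordering::Relaxed) {
        iterations += 1; // the periodic work would go here
        thread::sleep(period);
    }
    iterations
}

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let worker = {
        // Each thread gets its own Arc pointing at the same flag.
        let stop = Arc::clone(&stop);
        thread::spawn(move || run_until_stopped(stop, Duration::from_millis(200)))
    };

    thread::sleep(Duration::from_secs(1));
    stop.store(true, Ordering::Relaxed); // ask the worker to finish
    let iterations = worker.join().unwrap();
    println!("worker ran {} iterations", iterations);
}
```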
The ideal solution would be a Condvar. You can use Condvar::wait_timeout from the std::sync module, as pointed out by @Vladimir Matveev.
This is the example from the documentation:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;
let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = pair.clone();
thread::spawn(move|| {
let &(ref lock, ref cvar) = &*pair2;
let mut started = lock.lock().unwrap();
*started = true;
// We notify the condvar that the value has changed.
cvar.notify_one();
});
// wait for the thread to start up
let &(ref lock, ref cvar) = &*pair;
let mut started = lock.lock().unwrap();
// as long as the value inside the `Mutex` is false, we wait
loop {
let result = cvar.wait_timeout(started, Duration::from_millis(10)).unwrap();
// 10 milliseconds have passed, or maybe the value changed!
started = result.0;
if *started == true {
// We received the notification and the value has been updated, we can leave.
break
}
}
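Applied to the original question, the timed wait can double as the periodic timer: the worker sleeps on the Condvar for one period and wakes early if asked to stop. This is a sketch under assumptions; periodic_worker is a hypothetical name, and it uses Condvar::wait_timeout_while, a sibling of wait_timeout that re-checks the flag after spurious wakeups.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical worker: does one unit of work per `period` until the shared
/// flag becomes true, then returns how many units it completed.
fn periodic_worker(pair: Arc<(Mutex<bool>, Condvar)>, period: Duration) -> u32 {
    let (lock, cvar) = &*pair;
    let mut ticks = 0;
    loop {
        ticks += 1; // the periodic work goes here, done without holding the lock

        // Sleep for one period, but wake up early if the flag is set.
        let guard = lock.lock().unwrap();
        let (guard, _timed_out) = cvar
            .wait_timeout_while(guard, period, |stop| !*stop)
            .unwrap();
        if *guard {
            break;
        }
    }
    ticks
}

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let worker = {
        let pair = Arc::clone(&pair);
        thread::spawn(move || periodic_worker(pair, Duration::from_millis(200)))
    };

    thread::sleep(Duration::from_secs(1));
    *pair.0.lock().unwrap() = true; // request termination
    pair.1.notify_one(); // wake the worker immediately
    println!("worker ticked {} times", worker.join().unwrap());
}
```

Unlike a sleep-then-check loop, the worker here reacts to the stop request right away instead of finishing its current sleep.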
Having been back to this question several times myself, here's what I think addresses OP's intent and others' best practice of getting the thread to stop itself. Building on the accepted answer, Crossbeam is a nice upgrade to mpsc in allowing message endpoints to be cloned and moved. It also has a convenient tick function. The real point here is it has try_recv() which is non-blocking.
I'm not sure how universally useful it'd be to put a message checker in the middle of an operational loop like this. I haven't found that Actix (or previously Akka) could really stop a thread without--as stated above--getting the thread to do it itself. So this is what I'm using for now (wide open to correction here, still learning myself).
// Cargo.toml:
// [dependencies]
// crossbeam-channel = "0.4.4"
use crossbeam_channel::{Sender, Receiver, unbounded, tick};
use std::time::{Duration, Instant};
fn main() {
let (tx, rx):(Sender<String>, Receiver<String>) = unbounded();
let rx2 = rx.clone();
// crossbeam allows clone and move of receiver
std::thread::spawn(move || {
// OP:
// let mut timer = Timer::new().unwrap();
// let periodic = timer.periodic(Duration::milliseconds(200));
let ticker: Receiver<Instant> = tick(std::time::Duration::from_millis(500));
loop {
// OP:
// periodic.recv();
crossbeam_channel::select! {
recv(ticker) -> _ => {
// OP: Do my work here
println!("Hello, work.");
// Comms Check: keep doing work?
// try_recv is non-blocking
// rx, the single consumer is clone-able in crossbeam
let try_result = rx2.try_recv();
match try_result {
Err(_e) => {},
Ok(msg) => {
match msg.as_str() {
"END_THE_WORLD" => {
println!("Ending the world.");
break;
},
_ => {},
}
},
}
}
}
}
});
// let work continue for 10 seconds then tell that thread to end.
std::thread::sleep(std::time::Duration::from_secs(10));
println!("Goodbye, world.");
tx.send("END_THE_WORLD".to_string()).unwrap();
}
Using strings as a message device is a tad cringeworthy--to me. Could do the other suspend and restart stuff there in an enum.
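For what the enum version might look like, here is a rough sketch. It swaps crossbeam for std::sync::mpsc, using recv_timeout as a stand-in for the ticker, purely to stay dependency-free. The Control enum and the worker function are hypothetical names. One caveat: every control message restarts the timeout, so the ticks are only approximately periodic.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

// Hypothetical control messages replacing the "END_THE_WORLD" string.
enum Control {
    Suspend,
    Resume,
    Stop,
}

/// Hypothetical worker: does one unit of work each time `period` elapses
/// without a message, and returns the number of units done.
fn worker(rx: mpsc::Receiver<Control>, period: Duration) -> u32 {
    let mut ticks = 0;
    let mut suspended = false;
    loop {
        match rx.recv_timeout(period) {
            Ok(Control::Suspend) => suspended = true,
            Ok(Control::Resume) => suspended = false,
            // Stop on request, or when every sender has been dropped.
            Ok(Control::Stop) | Err(RecvTimeoutError::Disconnected) => break,
            // No message within `period`: this is the tick.
            Err(RecvTimeoutError::Timeout) => {
                if !suspended {
                    ticks += 1; // the periodic work goes here
                }
            }
        }
    }
    ticks
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || worker(rx, Duration::from_millis(200)));

    thread::sleep(Duration::from_secs(1));
    tx.send(Control::Suspend).unwrap();
    thread::sleep(Duration::from_secs(1));
    tx.send(Control::Resume).unwrap();
    thread::sleep(Duration::from_secs(1));
    tx.send(Control::Stop).unwrap();
    println!("worker ticked {} times", handle.join().unwrap());
}
```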

Resources