Reading and writing to a long-running std::process::Child - rust

I have a long-running child process that I need to write a lot of data to and read a lot of data from. A reader thread and a writer thread manipulate child.stdout and child.stdin respectively:
extern crate scoped_threadpool;
fn main() {
    // run the subprocess
    let mut child = std::process::Command::new("cat")
        .stdin(std::process::Stdio::piped())
        .stdout(std::process::Stdio::piped())
        .spawn()
        .unwrap();
    let child_stdout = child.stdout.as_mut().unwrap();
    let child_stdin = std::sync::Mutex::new(child.stdin.as_mut().unwrap());
    let mut pool = scoped_threadpool::Pool::new(2);
    pool.scoped(|scope| {
        // read all output from the subprocess
        scope.execute(move || {
            use std::io::BufRead;
            let reader = std::io::BufReader::new(child_stdout);
            for line in reader.lines() {
                println!("{}", line.unwrap());
            }
        });
        // write to the subprocess
        scope.execute(move || {
            for a in 0..1000 {
                use std::io::Write;
                writeln!(&mut child_stdin.lock().unwrap(), "{}", a).unwrap();
            } // close child_stdin???
        });
    });
}
When the writer is done, I want to close child_stdin so that the subprocess finishes and exits, the reader sees EOF, and pool.scoped returns. I can't do this without child.wait(), and I can't call child.wait() because child is borrowed by the two threads.
How do I make this program complete?

Amusingly, you've caused this yourself by sharing ownership using the Mutex ^_^. Instead of taking a reference to child.stdin, take complete ownership of it and pass it to the thread. When the thread is over, it will be dropped, closing it implicitly:
let mut child_stdin = child.stdin.unwrap();
// ...
scope.execute(move || {
    for a in 0..1000 {
        use std::io::Write;
        writeln!(&mut child_stdin, "{}", a).unwrap();
    }
    // child_stdin has been moved into this closure and is now
    // dropped, closing it.
});
If you'd like to still be able to call wait to get the ExitStatus, change the reference to stdout and the transfer of ownership of stdin to use Option::take. This means that child is not borrowed at all:
let mut child = // ...
// Both pipes are moved out with take(), leaving None behind in child.
let child_stdout = child.stdout.take().unwrap();
let mut child_stdin = child.stdin.take().unwrap();
// ...
child.wait().unwrap();
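For reference, here is a minimal sketch of how the whole program could look with both pipes taken out of child. It swaps scoped_threadpool for plain std::thread::spawn (an assumption made to keep the sketch dependency-free, not part of the original answer), and the helper name roundtrip is hypothetical:

```rust
use std::io::{BufRead, BufReader, Write};
use std::process::{Command, Stdio};
use std::thread;

/// Pipes the numbers `0..n` through `cat` and returns the echoed lines
/// together with whether the child exited successfully.
fn roundtrip(n: u32) -> (Vec<String>, bool) {
    let mut child = Command::new("cat")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("failed to spawn cat");

    // `take` moves each pipe out of `child` and leaves `None` behind,
    // so `child` itself is never borrowed by the threads.
    let child_stdout = child.stdout.take().expect("stdout was not piped");
    let mut child_stdin = child.stdin.take().expect("stdin was not piped");

    // Reader thread: collect every line the child prints until EOF.
    let reader = thread::spawn(move || {
        BufReader::new(child_stdout)
            .lines()
            .map(|line| line.expect("read failed"))
            .collect::<Vec<String>>()
    });

    // Writer thread: owns the child's stdin.
    let writer = thread::spawn(move || {
        for a in 0..n {
            writeln!(child_stdin, "{}", a).expect("write failed");
        }
        // `child_stdin` is dropped here, which closes the pipe so `cat` sees EOF.
    });

    writer.join().expect("writer panicked");
    let lines = reader.join().expect("reader panicked");
    let status = child.wait().expect("wait failed");
    (lines, status.success())
}

fn main() {
    let (lines, ok) = roundtrip(1000);
    println!("read {} lines, child succeeded: {}", lines.len(), ok);
}
```

Because each thread owns its end of the pipe, nothing borrows child, and child.wait() can be called once both threads have been joined.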

Related

What is `take()` doing here, and why do I need it?

This code spawns a child process, consuming its stderr and stdout line by line, and logging each appropriately. It compiles and works.
use std::error::Error;
use std::process::{Stdio};
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::process::{Command, Child};
use tracing::{info, warn};
macro_rules! relay_pipe_lines {
($pipe:expr, $handler:expr) => {
tokio::spawn(async move {
let mut reader = BufReader::new($pipe).lines();
loop {
let line = reader
.next_line()
.await
.unwrap_or_else(|_| Some(String::new()));
match line {
None => break,
Some(line) => $handler(line)
}
}
});
};
}
pub fn start_and_log_command(mut command: Command) -> Result<Child, Box<dyn Error>> {
command.stdout(Stdio::piped()).stderr(Stdio::piped());
let mut child = command.spawn()?;
let child_stdout = child.stdout.take().unwrap(); // remove `take` from here
let child_stderr = child.stderr.take().unwrap(); // .. or from here and it fails
let child_pid = child.id().unwrap();
relay_pipe_lines!(child_stdout, |line|info!("[pid {}:stdout]: {}", child_pid, line));
relay_pipe_lines!(child_stderr, |line|warn!("[pid {}:stderr]: {}", child_pid, line));
Ok(child)
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
tracing_subscriber::fmt::init();
info!("Tracing logging initialised.");
let mut command = Command::new("ls");
command.arg("-l");
let mut child = start_and_log_command(command)?;
// Wait for the child to exit while the spawned tasks relay its output concurrently.
let exit_status = child.wait().await.expect("Cannot reap child process");
dbg!(exit_status.success());
Ok(())
}
Removing the call to take() from either of the indicated lines makes the build fail with "child.stdout partially moved due to this method call", which I mostly understand.
I'd like to understand how using take() does not partially move child.stdout.
It's a call to Option::take(), which avoids a "partial move" by leaving None in place of the moved value. As a result, child is left in a valid state and can be returned from the function.
In other words, child.stdout.take() is equivalent to std::mem::replace(&mut child.stdout, None), and means "take the current value out of the option (whatever it is), and leave None in its place."
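To make that concrete, here is a small sketch using a hypothetical Handles struct in place of Child (the struct and the drain helper are made up for illustration). It shows that take() and mem::replace(.., None) both move the value out while leaving the struct fully initialised:

```rust
use std::mem;

// Hypothetical stand-in for a struct such as `Child` that owns optional handles.
struct Handles {
    stdout: Option<String>,
    stderr: Option<String>,
}

/// Moves both fields out of `h`; afterwards `h` holds `None` in each field.
fn drain(h: &mut Handles) -> (Option<String>, Option<String>) {
    let out = h.stdout.take();
    // Same effect as `take`, spelled out with `mem::replace`.
    let err = mem::replace(&mut h.stderr, None);
    (out, err)
}

fn main() {
    let mut h = Handles {
        stdout: Some("out".to_string()),
        stderr: Some("err".to_string()),
    };
    let (out, err) = drain(&mut h);
    println!("moved out: {:?} {:?}", out, err);
    // `h` was never partially moved, so it can still be used as a whole.
    println!("left behind: {:?} {:?}", h.stdout, h.stderr);
}
```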

Writing to stdin & reading from stdout in Rust Command process

I'll try to simplify what I'm trying to accomplish as much as possible, but in a nutshell here is my problem:
I am trying to spawn the node shell as a process in Rust. I would like to pass JavaScript code to the process's stdin and read the Node.js output from the process's stdout. This would be an interactive usage where the node shell is spawned and keeps receiving JS instructions and executing them.
I do not wish to launch the nodejs app using a file argument.
I have read quite a bit about std::process::Command, tokio, and why we can't simply write to and read from a piped process using the standard library. One of the solutions that I kept seeing online (in order to not block the main thread while reading/writing) is to use a thread for reading the output. Most solutions did not involve a continuous write/read flow.
What I have done is to spawn 2 threads, one that keeps writing to stdin and one that keeps reading from stdout. That way, I thought, I won't be blocking the main thread. However my issue is that only 1 thread can actively be used. When I have a thread for stdin, stdout does not even receive data.
Here is the code; the comments should provide more details:
pub struct Runner {
handle: Child,
pub input: Arc<Mutex<String>>,
pub output: Arc<Mutex<String>>,
input_thread: JoinHandle<()>,
output_thread: JoinHandle<()>,
}
impl Runner {
pub fn new() -> Runner {
let mut handle = Command::new("node")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn()
.expect("Failed to spawn node process!");
// begin stdout thread part
let mut stdout = handle.stdout.take().unwrap();
let output = Arc::new(Mutex::new(String::new()));
let out_clone = Arc::clone(&output);
let output_thread = spawn(move || loop {
// code here never executes...why ?
let mut buf: [u8; 1] = [0];
let mut output = out_clone.lock().unwrap();
let what_i_read = stdout.read(&mut buf);
println!("reading: {:?}", what_i_read);
match what_i_read {
Err(err) => {
println!("{}] Error reading from stream: {}", line!(), err);
break;
}
Ok(bytes_read) => {
if bytes_read != 0 {
let char = String::from_utf8(buf.to_vec()).unwrap();
output.push_str(char.as_str());
} else if output.len() != 0 {
println!("result: {}", output);
out_clone.lock().unwrap().clear();
}
}
}
});
// begin stdin thread block
let mut stdin = handle.stdin.take().unwrap();
let input = Arc::new(Mutex::new(String::new()));
let input_clone = Arc::clone(&input);
let input_thread = spawn(move || loop {
let mut in_text = input_clone.lock().unwrap();
if in_text.len() != 0 {
println!("writing: {}", in_text);
stdin.write_all(in_text.as_bytes()).expect("!write");
stdin.write_all("\n".as_bytes()).expect("!write");
in_text.clear();
}
});
Runner {
handle,
input,
output,
input_thread,
output_thread,
}
}
// this function should receive commands
pub fn execute(&mut self, str: &str) {
let input = Arc::clone(&self.input);
let mut input = input.lock().unwrap();
input.push_str(str);
}
}
In the main thread I'd like to use this as:
let mut runner = Runner::new();
runner.execute("console.log('foo')");
println!("{:?}", runner.output);
I am still new to Rust but at least I passed the point where the borrow checker makes me bang my head against the wall, I am starting to find it more pleasing now :)

How can I force a thread that is blocked reading from a file to resume in Rust?

Because Rust does not have the built-in ability to read from a file in a non-blocking manner, I have to spawn a thread that reads the file /dev/input/fs0 in order to get joystick events. Suppose the joystick is unused (nothing to read), so the reading thread is blocked while reading from the file.
Is there a way for the main thread to force the blocking read of the reading thread to resume, so the reading thread may exit cleanly?
In other languages, I would simply close the file in the main thread. This would force the blocking read to resume. But I have not found a way to do so in Rust, because reading requires a mutable reference to the file.
The idea is to call File::read only when there is available data. If there is no available data, we check a flag to see if the main thread requested to stop. If not, wait and try again.
Here is an example using the nonblock crate:
extern crate nonblock;
use std::fs::File;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
use nonblock::NonBlockingReader;
fn main() {
let f = File::open("/dev/stdin").expect("open failed");
let mut reader = NonBlockingReader::from_fd(f).expect("from_fd failed");
let exit = Arc::new(Mutex::new(false));
let texit = exit.clone();
println!("start reading, type something and enter");
thread::spawn(move || {
let mut buf: Vec<u8> = Vec::new();
while !*texit.lock().unwrap() {
let s = reader.read_available(&mut buf).expect("io error");
if s == 0 {
if reader.is_eof() {
println!("eof");
break;
}
} else {
println!("read {:?}", buf);
buf.clear();
}
thread::sleep(Duration::from_millis(200));
}
println!("stop reading");
});
thread::sleep(Duration::from_secs(5));
println!("closing file");
*exit.lock().unwrap() = true;
thread::sleep(Duration::from_secs(2));
println!("\"stop reading\" was printed before the main exit!");
}
fn read_async<F>(file: File, fun: F) -> thread::JoinHandle<()>
where F: Send + 'static + Fn(&Vec<u8>)
{
let mut reader = NonBlockingReader::from_fd(file).expect("from_fd failed");
let mut buf: Vec<u8> = Vec::new();
thread::spawn(move || {
loop {
let s = reader.read_available(&mut buf).expect("io error");
if s == 0 {
if reader.is_eof() {
break;
}
} else {
fun(&buf);
buf.clear();
}
thread::sleep(Duration::from_millis(100));
}
})
}
Here is an example using the poll binding of the nix crate. The poll function waits (with a timeout) for specific events:
extern crate nix;
use std::io::Read;
use std::os::unix::io::AsRawFd;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
use nix::poll;
fn main() {
let mut f = std::fs::File::open("/dev/stdin").expect("open failed");
let mut pfd = poll::PollFd {
fd: f.as_raw_fd(),
events: poll::POLLIN, // is there input data?
revents: poll::EventFlags::empty(),
};
let exit = Arc::new(Mutex::new(false));
let texit = exit.clone();
println!("start reading, type something and enter");
thread::spawn(move || {
let timeout = 100; // millisecs
let mut s = unsafe { std::slice::from_raw_parts_mut(&mut pfd, 1) };
let mut buffer = [0u8; 10];
loop {
if poll::poll(&mut s, timeout).expect("poll failed") != 0 {
let s = f.read(&mut buffer).expect("read failed");
println!("read {:?}", &buffer[..s]);
}
if *texit.lock().unwrap() {
break;
}
}
println!("stop reading");
});
thread::sleep(Duration::from_secs(5));
println!("closing file");
*exit.lock().unwrap() = true;
thread::sleep(Duration::from_secs(2));
println!("\"stop reading\" was printed before the main exit!");
}

Unable to iterate over Arc Mutex

Consider the following code. I append each of my threads to a Vec so that I can join them to the main thread after they have all been spawned. However, I am not able to call iter() on my vector of JoinHandles.
How can I go about doing this?
fn main() {
let requests = Arc::new(Mutex::new(Vec::new()));
let threads = Arc::new(Mutex::new(Vec::new()));
for _x in 0..100 {
println!("Spawning thread: {}", _x);
let mut client = Client::new();
let thread_items = requests.clone();
let handle = thread::spawn(move || {
for _y in 0..100 {
println!("Firing requests: {}", _y);
let start = time::precise_time_s();
let _res = client.get("http://jacob.uk.com")
.header(Connection::close())
.send().unwrap();
let end = time::precise_time_s();
thread_items.lock().unwrap().push((Request::new(end-start)));
}
});
threads.lock().unwrap().push((handle));
}
// src/main.rs:53:22: 53:30 error: type `alloc::arc::Arc<std::sync::mutex::Mutex<collections::vec::Vec<std::thread::JoinHandle<()>>>>` does not implement any method in scope named `unwrap`
for t in threads.iter(){
println!("Hello World");
}
}
First, you don't need threads to be wrapped in an Arc and a Mutex. A plain Vec is enough:
let mut threads = Vec::new();
...
threads.push(handle);
This works because threads is never shared between, well, threads. You only access it from the main thread.
Second, if for some reason you do need to keep it in Arc (e.g. if your example does not reflect the actual structure of your program which is more complex), then you need to lock the mutex to obtain a reference to the contained vector, just as you do when pushing:
for t in threads.lock().unwrap().iter() {
...
}
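As a sketch of the first suggestion: only the results need Arc<Mutex<..>>, while the handles can live in a plain Vec. Note that JoinHandle::join takes the handle by value, so the loop below iterates over the Vec itself rather than calling iter(). The run function and the (id, n) items are placeholders standing in for the request code in the question:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each push `per_worker` items into a shared
/// vector, joins them all, and returns how many items were collected.
fn run(workers: usize, per_worker: usize) -> usize {
    // Shared between threads, so it needs `Arc<Mutex<..>>`.
    let results = Arc::new(Mutex::new(Vec::new()));
    // Only touched by the spawning thread, so a plain `Vec` is enough.
    let mut handles = Vec::new();

    for id in 0..workers {
        let results = Arc::clone(&results);
        handles.push(thread::spawn(move || {
            for n in 0..per_worker {
                // Placeholder for the real work (e.g. timing a request).
                results.lock().unwrap().push((id, n));
            }
        }));
    }

    // `join` consumes the handle, so iterate by value instead of `iter()`.
    for handle in handles {
        handle.join().expect("worker panicked");
    }

    let count = results.lock().unwrap().len();
    count
}

fn main() {
    println!("collected {} results", run(4, 25));
}
```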

How to terminate or suspend a Rust thread from another thread?

Editor's note: this example was created before Rust 1.0, and the specific types have changed or been removed since then. The general question and concept remain valid.
I have spawned a thread with an infinite loop and timer inside.
thread::spawn(|| {
    let mut timer = Timer::new().unwrap();
    let periodic = timer.periodic(Duration::milliseconds(200));
    loop {
        periodic.recv();
        // Do my work here
    }
});
After some time, depending on certain conditions, I need to terminate this thread from another part of my program. In other words, I want to exit from the infinite loop. How can I do this correctly? Additionally, how could I suspend this thread and resume it later?
I tried to use a global unsafe flag to break the loop, but I think this solution does not look nice.
For both terminating and suspending a thread you can use channels.
Terminated externally
On each iteration of the worker loop, we check whether someone has notified us through a channel. If so, or if the other end of the channel has gone out of scope, we break out of the loop.
use std::io::{self, BufRead};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;
fn main() {
    println!("Press enter to terminate the child thread");
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || loop {
        println!("Working...");
        thread::sleep(Duration::from_millis(500));
        match rx.try_recv() {
            Ok(_) | Err(TryRecvError::Disconnected) => {
                println!("Terminating.");
                break;
            }
            Err(TryRecvError::Empty) => {}
        }
    });
    let mut line = String::new();
    let stdin = io::stdin();
    let _ = stdin.lock().read_line(&mut line);
    let _ = tx.send(());
}
Suspending and resuming
We use recv(), which suspends the thread until something arrives on the channel. To resume the thread, you need to send something through the channel; the unit value () in this case. If the transmitting end of the channel is dropped, recv() will return Err(RecvError); we use this to exit the loop.
use std::io::{self, BufRead};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;
fn main() {
    println!("Press enter to wake up the child thread");
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || loop {
        println!("Suspending...");
        match rx.recv() {
            Ok(_) => {
                println!("Working...");
                thread::sleep(Duration::from_millis(500));
            }
            Err(_) => {
                println!("Terminating.");
                break;
            }
        }
    });
    let mut line = String::new();
    let stdin = io::stdin();
    for _ in 0..4 {
        let _ = stdin.lock().read_line(&mut line);
        let _ = tx.send(());
    }
}
Other tools
Channels are the easiest and the most natural (IMO) way to do these tasks, but not the most efficient one. There are other concurrency primitives which you can find in the std::sync module. They belong to a lower level than channels but can be more efficient in particular tasks.
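As one example of such a lower-level primitive, an AtomicBool can serve as a safe replacement for the global unsafe flag mentioned in the question. This is only a sketch; run_until_stopped and its parameters are hypothetical names:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Runs a worker that ticks every `period` until the stop flag is set after
/// `run_for`, then returns how many ticks the worker completed.
fn run_until_stopped(period: Duration, run_for: Duration) -> u32 {
    let stop = Arc::new(AtomicBool::new(false));
    let worker_stop = Arc::clone(&stop);

    let worker = thread::spawn(move || {
        let mut ticks = 0u32;
        // `Relaxed` is enough here because no other data is published
        // through the flag; it only signals "please stop".
        while !worker_stop.load(Ordering::Relaxed) {
            // Do the periodic work here.
            ticks += 1;
            thread::sleep(period);
        }
        ticks
    });

    thread::sleep(run_for);
    stop.store(true, Ordering::Relaxed);
    worker.join().expect("worker panicked")
}

fn main() {
    let ticks = run_until_stopped(Duration::from_millis(200), Duration::from_secs(1));
    println!("worker ticked {} times before stopping", ticks);
}
```

The worker only notices the flag between sleeps, so stopping can take up to one period.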
The ideal solution would be a Condvar. You can use Condvar::wait_timeout from the std::sync module, as pointed out by @Vladimir Matveev.
This is the example from the documentation:
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
use std::time::Duration;
let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = pair.clone();
thread::spawn(move|| {
let &(ref lock, ref cvar) = &*pair2;
let mut started = lock.lock().unwrap();
*started = true;
// We notify the condvar that the value has changed.
cvar.notify_one();
});
// wait for the thread to start up
let &(ref lock, ref cvar) = &*pair;
let mut started = lock.lock().unwrap();
// as long as the value inside the `Mutex` is false, we wait
loop {
let result = cvar.wait_timeout(started, Duration::from_millis(10)).unwrap();
// 10 milliseconds have passed, or maybe the value changed!
started = result.0;
if *started == true {
// We received the notification and the value has been updated, we can leave.
break
}
}
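The documentation example only waits for a thread to start. Applied to the periodic worker from the question, the same Mutex/Condvar pair could look roughly like the sketch below. It uses Condvar::wait_timeout_while, a sibling of wait_timeout that re-checks a predicate so spurious wakeups are handled, and run_periodic is a hypothetical helper name:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Runs a worker that ticks once per `period` until it is told to stop
/// after `run_for`; returns the number of completed ticks.
fn run_periodic(period: Duration, run_for: Duration) -> u32 {
    // The bool is the "stop requested" flag guarded by the mutex.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    let worker = thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut ticks = 0u32;
        let mut stop = lock.lock().unwrap();
        loop {
            // Sleeps for up to `period`, but returns early once the flag is set.
            let (guard, _timeout) = cvar
                .wait_timeout_while(stop, period, |stopped| !*stopped)
                .unwrap();
            stop = guard;
            if *stop {
                break;
            }
            // Do the periodic work here. In this sketch the lock stays held
            // while working; real code may want to drop it first.
            ticks += 1;
        }
        ticks
    });

    thread::sleep(run_for);
    {
        let (lock, cvar) = &*pair;
        *lock.lock().unwrap() = true;
        // Wake the worker immediately instead of waiting out the period.
        cvar.notify_one();
    }
    worker.join().expect("worker panicked")
}

fn main() {
    let ticks = run_periodic(Duration::from_millis(200), Duration::from_secs(1));
    println!("worker ticked {} times before stopping", ticks);
}
```

Compared with sleeping and polling a flag, the worker here wakes as soon as it is notified, so a long period does not delay shutdown.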
Having been back to this question several times myself, here's what I think addresses OP's intent and others' best practice of getting the thread to stop itself. Building on the accepted answer, Crossbeam is a nice upgrade to mpsc in allowing message endpoints to be cloned and moved. It also has a convenient tick function. The real point here is it has try_recv() which is non-blocking.
I'm not sure how universally useful it'd be to put a message checker in the middle of an operational loop like this. I haven't found that Actix (or previously Akka) could really stop a thread without--as stated above--getting the thread to do it itself. So this is what I'm using for now (wide open to correction here, still learning myself).
// Cargo.toml:
// [dependencies]
// crossbeam-channel = "0.4.4"
use crossbeam_channel::{Sender, Receiver, unbounded, tick};
use std::time::{Duration, Instant};
fn main() {
let (tx, rx):(Sender<String>, Receiver<String>) = unbounded();
let rx2 = rx.clone();
// crossbeam allows clone and move of receiver
std::thread::spawn(move || {
// OP:
// let mut timer = Timer::new().unwrap();
// let periodic = timer.periodic(Duration::milliseconds(200));
let ticker: Receiver<Instant> = tick(std::time::Duration::from_millis(500));
loop {
// OP:
// periodic.recv();
crossbeam_channel::select! {
recv(ticker) -> _ => {
// OP: Do my work here
println!("Hello, work.");
// Comms Check: keep doing work?
// try_recv is non-blocking
// rx, the single consumer is clone-able in crossbeam
let try_result = rx2.try_recv();
match try_result {
Err(_e) => {},
Ok(msg) => {
match msg.as_str() {
"END_THE_WORLD" => {
println!("Ending the world.");
break;
},
_ => {},
}
},
_ => {}
}
}
}
}
});
// let work continue for 10 seconds then tell that thread to end.
std::thread::sleep(std::time::Duration::from_secs(10));
println!("Goodbye, world.");
// send returns a Result; unwrap it rather than silently ignoring a failure
tx.send("END_THE_WORLD".to_string()).unwrap();
}
Using strings as messages is a tad cringeworthy, at least to me. The suspend and restart commands could be expressed there as an enum instead.
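For what it's worth, here is a rough sketch of that enum idea. It uses std::sync::mpsc instead of crossbeam purely to stay dependency-free (a substitution made for this sketch, not what the code above uses), and the Control enum and worker function are hypothetical names:

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical control messages; the names are illustrative.
enum Control {
    Pause,
    Resume,
    Stop,
}

/// Ticks every `period` while running, blocks while paused, and returns the
/// number of ticks once stopped or once every sender has been dropped.
fn worker(rx: Receiver<Control>, period: Duration) -> u32 {
    let mut ticks = 0;
    let mut paused = false;
    loop {
        let msg = if paused {
            // Block until the next message instead of spinning.
            match rx.recv() {
                Ok(m) => Some(m),
                Err(_) => return ticks,
            }
        } else {
            match rx.try_recv() {
                Ok(m) => Some(m),
                Err(TryRecvError::Empty) => None,
                Err(TryRecvError::Disconnected) => return ticks,
            }
        };
        match msg {
            Some(Control::Pause) => paused = true,
            Some(Control::Resume) => paused = false,
            Some(Control::Stop) => return ticks,
            None => {
                // Do the periodic work here.
                ticks += 1;
                thread::sleep(period);
            }
        }
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || worker(rx, Duration::from_millis(50)));
    thread::sleep(Duration::from_millis(200));
    tx.send(Control::Pause).unwrap();
    thread::sleep(Duration::from_millis(200));
    tx.send(Control::Resume).unwrap();
    thread::sleep(Duration::from_millis(200));
    tx.send(Control::Stop).unwrap();
    println!("worker ticked {} times", handle.join().unwrap());
}
```

With an enum, the compiler checks that every command is handled, which a string comparison cannot do.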
