I have spawned a child process using Rust's Command API.
Now, I need to watch this process for a few seconds before moving on because the process may die early. On success, it should run "forever", so I can't just wait.
There's a nightly feature called try_wait which does what I want, but I really don't think I should run Rust nightly just for this!
I think I could start a new thread and keep it waiting until the process dies... but I don't want that thread to keep my own process from exiting, so maybe running the thread as a daemon would be a solution...
Is this the way to go or is there a nicer solution?
Currently, if you don't want to use the nightly channel, there's a crate called wait-timeout (thanks to #lukas-kalbertodt for the suggestion) that adds a wait_timeout method to std::process::Child through its ChildExt extension trait.
It can be used like this:
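use std::process::Command;
use std::time::Duration;
use wait_timeout::ChildExt; // brings wait_timeout into scope for Child

let cmd = Command::new("my_command")
    .spawn();
match cmd {
    Ok(mut child) => {
        let timeout = Duration::from_secs(1);
        // Returns as soon as the child exits, or after `timeout` at the latest
        match child.wait_timeout(timeout) {
            Ok(Some(status)) => println!("Exited with status {}", status),
            Ok(None) => println!("timeout, process is still alive"),
            Err(e) => println!("Error waiting: {}", e),
        }
    }
    Err(err) => println!("Process did not even start: {}", err),
}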
To keep monitoring the child process, just wrap this into a loop.
Notice that with Rust's nightly try_wait(), the code would look nearly identical (so once it makes it into the release channel, assuming no further changes, it should be very easy to move to that). Unlike the solution above, though, it will block for the full timeout even if the process dies earlier:
use std::process::Command;
use std::thread::sleep;
use std::time::Duration;

let cmd = Command::new("my_command")
    .spawn();
match cmd {
    Ok(mut child) => {
        let timeout = Duration::from_secs(1);
        sleep(timeout); // try_wait will not block, so we need to wait here
        match child.try_wait() {
            Ok(Some(status)) => println!("Exited with status {}", status),
            Ok(None) => println!("timeout, process is still alive"),
            Err(e) => println!("Error waiting: {}", e),
        }
    }
    Err(err) => println!("Process did not even start: {}", err),
}
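The early-exit drawback of the sleep-then-try_wait version can be reduced by polling in short slices against a deadline. The following is only a minimal sketch using the standard library (try_wait has been stable in std since Rust 1.18); the function name watch_for and the poll_every parameter are made up for illustration and are not part of any library:

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `child` until it exits or `window` has elapsed, whichever comes first.
/// Returns Ok(Some(status)) if the child exited within the window,
/// Ok(None) if it was still running when the window closed.
/// (`watch_for` and `poll_every` are hypothetical names for this sketch.)
pub fn watch_for(
    child: &mut Child,
    window: Duration,
    poll_every: Duration,
) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + window;
    loop {
        // try_wait never blocks: Some(status) means the child has exited
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        let now = Instant::now();
        if now >= deadline {
            return Ok(None);
        }
        // Sleep one polling slice, but never past the deadline
        sleep(poll_every.min(deadline - now));
    }
}
```

With a small poll_every, an early death is noticed within roughly one slice instead of only after the whole window, at the cost of a few extra wake-ups.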
My requirement is simple and comes up in many programs: send a specified message to my channel after a specified time.
I've looked through tokio for anything related to delay, interval or timeout, but none of them seems straightforward to use for this.
What I've come up with so far is to spawn an asynchronous task that sleeps for the given duration and then sends the message.
But spawning an asynchronous task seems like a relatively heavy operation for this. Is there a better solution?
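async fn my_handler(sender: mpsc::Sender<i32>, dur: Duration) {
    // `move` is required so the task owns `sender` and `dur`
    tokio::spawn(async move {
        time::sleep(dur).await;
        // Ignore the error if the receiver has been dropped in the meantime
        let _ = sender.send(0).await;
    });
}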
You could try adding a second channel and a continuously running task that buffers messages until the time they are to be received. Implementing this is more involved than it sounds, and I can only hope I'm handling cancellation correctly here:
use std::cmp::Reverse;
use tokio::sync::mpsc::{self, Receiver, Sender};
use tokio::time::{sleep_until, timeout_at, Instant};

fn make_timed_channel<T: Ord + Send + Sync + 'static>() -> (Sender<(Instant, T)>, Receiver<T>) {
    // Ord is an unnecessary requirement arising from me stuffing both the Instant and the T into the BinaryHeap
    // You could drop this requirement by using the priority_queue crate instead
    let (sender1, receiver1) = mpsc::channel::<(Instant, T)>(42);
    let (sender2, receiver2) = mpsc::channel::<T>(42);
    let mut receiver1 = Some(receiver1);
    tokio::spawn(async move {
        let mut buf = std::collections::BinaryHeap::<Reverse<(Instant, T)>>::new();
        loop {
            // Pretend we're a bounded channel or exit if the upstream closed
            if buf.len() >= 42 || receiver1.is_none() {
                match buf.pop() {
                    Some(Reverse((time, element))) => {
                        sleep_until(time).await;
                        if sender2.send(element).await.is_err() {
                            break;
                        }
                    }
                    None => break,
                }
            }
            // We have some deadline to send a message at
            else if let Some(Reverse((then, _))) = buf.peek() {
                if let Ok(recv) = timeout_at(*then, receiver1.as_mut().unwrap().recv()).await {
                    match recv {
                        Some(recv) => buf.push(Reverse(recv)),
                        None => receiver1 = None,
                    }
                } else {
                    if sender2.send(buf.pop().unwrap().0 .1).await.is_err() {
                        break;
                    }
                }
            }
            // We're empty, wait around
            else {
                match receiver1.as_mut().unwrap().recv().await {
                    Some(recv) => buf.push(Reverse(recv)),
                    None => receiver1 = None,
                }
            }
        }
    });
    (sender1, receiver2)
}
Whether this is more efficient than spawning tasks is something you'd have to benchmark. (I doubt it: IIRC, Tokio's timer uses something much fancier than a BinaryHeap to wake up at the next timeout.)
One optimization you could make if you don't need a Receiver<T> but just something that .poll().await can be called on: You could drop the second channel and maintain the BinaryHeap inside a custom receiver.
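To make the "BinaryHeap inside a custom receiver" idea concrete, here is a rough sketch. Note the assumptions: it is a blocking analogue built on std::sync::mpsc and threads rather than on Tokio, and TimedReceiver, timed_channel and Entry are names invented for this illustration. Ordering heap entries by deadline only also removes the Ord bound on T:

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::time::Instant;

// Heap entry compared by (deadline, arrival order) only, so T needs no Ord.
struct Entry<T> {
    due: Instant,
    seq: u64,
    value: T,
}

impl<T> PartialEq for Entry<T> {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl<T> Eq for Entry<T> {}
impl<T> PartialOrd for Entry<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T> Ord for Entry<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.due, self.seq).cmp(&(other.due, other.seq))
    }
}

/// Hypothetical receiver that buffers messages until their deadline.
pub struct TimedReceiver<T> {
    rx: Receiver<(Instant, T)>,
    heap: BinaryHeap<Reverse<Entry<T>>>,
    seq: u64,
    closed: bool,
}

pub fn timed_channel<T>() -> (Sender<(Instant, T)>, TimedReceiver<T>) {
    let (tx, rx) = mpsc::channel();
    (tx, TimedReceiver { rx, heap: BinaryHeap::new(), seq: 0, closed: false })
}

impl<T> TimedReceiver<T> {
    fn push(&mut self, (due, value): (Instant, T)) {
        self.heap.push(Reverse(Entry { due, seq: self.seq, value }));
        self.seq += 1;
    }

    /// Blocks until the earliest buffered message is due and returns it.
    /// Returns None once all senders are gone and the buffer is empty.
    pub fn recv(&mut self) -> Option<T> {
        loop {
            let now = Instant::now();
            match self.heap.peek().map(|Reverse(e)| e.due) {
                // Earliest message is due: hand it out
                Some(due) if due <= now => return self.heap.pop().map(|Reverse(e)| e.value),
                // Upstream closed: nothing new can arrive, just wait for the deadline
                Some(due) if self.closed => std::thread::sleep(due.saturating_duration_since(now)),
                // Wait for the deadline, but accept earlier messages meanwhile
                Some(due) => match self.rx.recv_timeout(due.saturating_duration_since(now)) {
                    Ok(msg) => self.push(msg),
                    Err(RecvTimeoutError::Timeout) => {}
                    Err(RecvTimeoutError::Disconnected) => self.closed = true,
                },
                None if self.closed => return None,
                // Buffer is empty: block until something arrives
                None => match self.rx.recv() {
                    Ok(msg) => self.push(msg),
                    Err(_) => self.closed = true,
                },
            }
        }
    }
}
```

An async version would presumably follow the same shape, with timeout_at around the channel's recv() in place of recv_timeout, but I have not written or benchmarked that.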
How can I create a future that completes upon the termination of a tokio::process::Child without closing stdin? I know there is try_wait for testing whether a process has terminated without closing stdin, but I want this behavior with future semantics.
I tried to prepare an MRE for this question in which my code panics as a result of writing to stdin after calling wait, but what I observe does not match the behavior stated in the documentation for tokio::process::Child's wait method. I would expect the line stdin.write_u8(24).await.unwrap(); to crash with a broken pipe, since stdin should have been closed by wait.
use tokio::{time, io::AsyncWriteExt}; // 1.0.1
use std::time::Duration;

#[tokio::main]
pub async fn main() {
    let mut child = tokio::process::Command::new("nano")
        .stdin(std::process::Stdio::piped())
        .spawn()
        .unwrap();
    let mut stdin = child.stdin.take().unwrap();
    let tasklet = tokio::spawn(async move {
        child.wait().await
    });
    // this delay should give tokio::spawn plenty of time to spin up
    // and call `wait` on the child (closing stdin)
    time::sleep(Duration::from_millis(1000)).await;
    // write character 24 (CANcel, ^X) to stdin to close nano
    stdin.write_u8(24).await.unwrap();
    match tasklet.await {
        Ok(exit_result) => match exit_result {
            Ok(exit_status) => eprintln!("exit_status: {}", exit_status),
            Err(terminate_error) => eprintln!("terminate_error: {}", terminate_error)
        }
        Err(join_error) => eprintln!("join_error: {}", join_error)
    }
}
So the answer to this question is to Option::take the ChildStdin out of the tokio::process::Child, as described in this GitHub issue. In that case, wait will not close stdin, and the programmer is responsible for not causing deadlocks.
The MRE above doesn't fail for two reasons: (i) I took ChildStdin out of tokio::process::Child and (ii) even if I hadn't taken it out, it still would not have been closed due to a bug in the code that will be fixed in this pull request.
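The same take-the-handle pattern can be tried without Tokio. The sketch below is a blocking std::process analogue, not the Tokio code from the question: it assumes a Unix-like system with cat on the PATH, and write_while_waiting is a name made up for this example. std's Child::wait also closes the child's stdin if it is still stored in the Child, so taking it out first is what keeps the pipe usable:

```rust
use std::io::{self, Read, Write};
use std::process::{Command, Stdio};
use std::thread;
use std::time::Duration;

/// Spawns `cat`, starts waiting on it in another thread, and only then
/// writes to its stdin. Returns what `cat` echoed and whether it succeeded.
/// (Hypothetical helper; assumes `cat` exists on this system.)
pub fn write_while_waiting(input: &[u8]) -> io::Result<(Vec<u8>, bool)> {
    let mut child = Command::new("cat")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // Take the handles out so that `wait` has nothing left to close
    let mut stdin = child.stdin.take().expect("stdin was piped");
    let mut stdout = child.stdout.take().expect("stdout was piped");

    // The waiter owns the Child; we keep the pipes
    let waiter = thread::spawn(move || child.wait());

    // Give the waiter time to actually be inside `wait`
    thread::sleep(Duration::from_millis(100));
    stdin.write_all(input)?;
    // Dropping stdin sends EOF; forgetting this would deadlock, since
    // `cat` only exits at EOF and `wait` only returns when `cat` exits
    drop(stdin);

    let mut echoed = Vec::new();
    stdout.read_to_end(&mut echoed)?;
    let status = waiter.join().expect("waiter thread panicked")?;
    Ok((echoed, status.success()))
}
```

The explicit drop(stdin) is the "programmer is responsible for not causing deadlocks" part in practice.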
let addr: SocketAddr = self.listen_bind.parse().unwrap();
let mut listener = TcpListener::bind(&addr).await?;
info!("Nightfort listening on {}", addr);
loop {
    info!("debug1");
    match listener.accept().await {
        Ok((stream, addr)) => {
            info!("debug2");
            let watcher = self.watcher.clone();
            info!("debug3");
            tokio::spawn(async move {
                info!("debug4");
                if let Err(e) = Nightfort::process(watcher, stream, addr).await {
                    error!("Error on this ranger: {}, error: {:?}", addr, e);
                }
            });
        }
        Err(e) => error!("Socket conn error {}", e),
    }
    // let (stream, addr) = listener.accept().await?;
}
I spent two days troubleshooting this weird issue. The Rust process runs fine on my local macOS, on Linux, and in Docker on Linux, but not on AWS Linux or on k8s on AWS. The main symptom is that the process hangs on accept() even though a client believes it has established a connection to the server and has started sending messages to it. ps shows the server process in the S state. The code was originally written in nightly Rust with alpha libraries, so I suspected a bug in a dependency; I then switched to stable Rust with the latest releases of the dependencies, but the issue is still there.
My rust project uses Command to execute a process.
Occasionally, when I run this code, the call to status.code() returns None. I am usually on macOS Catalina Beta 1 with rustc 1.36.0, but it happens on Travis too (I will have to go and find the logs for the OS/rustc versions there).
I used to treat this as an error, but it "randomly" caused local and Travis builds to fail, so now I'm ignoring it. It would still be nice to understand what's causing it.
In failure cases, re-running immediately will cause it to succeed.
let output = Command::new(&command)
    .args(command_args)
    .stdin(Stdio::inherit())
    .stdout(Stdio::inherit())
    .stderr(Stdio::piped())
    .output()
    .chain_err(|| "Error while attempting to spawn command to compile and run flow")?;
match output.status.code() {
    Some(0) => Ok("Flow ran to completion".to_string()),
    Some(code) => {
        error!(
            "Process STDERR:\n{}",
            String::from_utf8_lossy(&output.stderr)
        );
        bail!("Exited with status code: {}", code)
    }
    None => Ok("No return code - ignoring".to_string()),
}
My question is not why this could happen (I know the docs say "terminated by signal") but why it is happening: as far as I know, no one is sending a signal to the process, and I seriously doubt OOM or similar issues.
Read the manual:
On Unix, this will return None if the process was terminated by a signal; std::os::unix provides an extension trait for extracting the signal and other details from the ExitStatus.
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

fn main() {
    let mut child = Command::new("sleep")
        .args(&["10"])
        .spawn()
        .expect("failed to spawn child");
    child.kill().expect("failed to kill on child");
    let status = child.wait().expect("failed to wait on child");
    match status.code() {
        None => {
            println!("{:?}", status.signal());
            ()
        }
        _ => (),
    }
}
You could use from_c_int() (presumably Signal::from_c_int() from the nix crate) to pretty-print the signal type.
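Instead of silently ignoring the None case, it may help to log which signal was involved. A minimal Unix-only sketch using just the standard library; describe is a hypothetical helper name, not something from the question's code:

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

/// Turns an ExitStatus into a human-readable line for logging.
/// (`describe` is a made-up helper for this sketch; Unix only.)
pub fn describe(status: ExitStatus) -> String {
    match (status.code(), status.signal()) {
        // Normal exit: the process called exit() or returned from main
        (Some(code), _) => format!("exited with code {}", code),
        // No exit code because a signal terminated the process
        (None, Some(signal)) => format!("terminated by signal {}", signal),
        // Neither is available, e.g. the status describes a stopped process
        (None, None) => "no exit code and no signal".to_string(),
    }
}
```

Logging the signal number in the None arm should at least show whether it is always the same signal (for example SIGPIPE or SIGKILL) when the intermittent failure occurs.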
This question refers to Rust as of October 2014.
If you are using Rust 1.0 or above, you had best look elsewhere for a solution.
I have a long running Rust process that generates log values, which I'm running using Process.
It looks as though I might be able to periodically "check on" the running process using set_timeout() and wait(), with some kind of high-level loop like:
let mut child = match Command::new("thing").arg("...").spawn() {
    Ok(child) => child,
    Err(e) => fail!("failed to execute child: {}", e),
};
loop {
    child.set_timeout(Some(100));
    match child.wait() {
        // ??? Something goes here
    }
}
The things I'm not 100% sure about are: how do I tell the difference between a timeout error and a process-return error from wait(), and how do I use the PipeStream to "read as much as you can without blocking from the stream" every interval and push it out?
Is this the best approach? Should I start a task to monitor stdout and stderr instead?
To distinguish errors coming from the process from a timeout, you have to match on the values returned by wait. An example:
fn run() {
    let mut child = match Command::new("sleep").arg("1").spawn() {
        Ok(child) => child,
        Err(e) => fail!("failed to execute child: {}", e),
    };
    loop {
        child.set_timeout(Some(1000));
        match child.wait() {
            // Here assume any error is timeout, you can filter from IoErrorKind
            Err(..) => println!("Timeout"),
            Ok(ExitStatus(0)) => {
                println!("Finished without errors");
                return;
            }
            Ok(ExitStatus(a)) => {
                println!("Finished with error number: {}", a);
                return;
            }
            Ok(ExitSignal(a)) => {
                println!("Terminated by signal number: {}", a);
                return;
            }
        }
    }
}
As for the streams, have a look at wait_with_output, or implement something similar with channels and threads: http://doc.rust-lang.org/src/std/home/rustbuild/src/rust-buildbot/slave/nightly-linux/build/src/libstd/io/process.rs.html#601
Hope it helped
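For readers on current Rust, the "channels and threads" suggestion might look roughly like the sketch below. It uses today's std::process API rather than the 2014 one from the question, handles stdout only, and run_streaming is a name invented for this illustration:

```rust
use std::io::{self, BufRead, BufReader};
use std::process::{Command, ExitStatus, Stdio};
use std::sync::mpsc;
use std::thread;

/// Runs `program`, calling `on_line` for every line the child writes to
/// stdout as it arrives, and returns the exit status at the end.
/// (Hypothetical helper; stderr is left inherited for brevity.)
pub fn run_streaming(
    program: &str,
    args: &[&str],
    mut on_line: impl FnMut(&str),
) -> io::Result<ExitStatus> {
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::piped())
        .spawn()?;
    let stdout = child.stdout.take().expect("stdout was piped");

    // The reader thread blocks on the pipe and forwards complete lines
    let (tx, rx) = mpsc::channel();
    let reader = thread::spawn(move || {
        for line in BufReader::new(stdout).lines() {
            match line {
                Ok(line) => {
                    if tx.send(line).is_err() {
                        break; // receiver went away
                    }
                }
                Err(_) => break, // read error or invalid UTF-8
            }
        }
    });

    // Iteration ends when the reader thread hits EOF and drops `tx`
    for line in rx {
        on_line(&line);
    }
    reader.join().expect("reader thread panicked");
    child.wait()
}
```

To do other periodic work between lines, the for loop over rx could be replaced by rx.recv_timeout(...) in a loop, which returns a timeout error whenever no line arrived within the interval.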
Have a look in cargo:
https://docs.rs/cargo-util/0.1.1/cargo_util/struct.ProcessBuilder.html#method.exec_with_streaming
The only downside is that cargo-util seems to need openssl even with default-features=false...
But you can at least see how exec_with_streaming and read2 are implemented.