How would you stream output from a Process in Rust?

This question refers to Rust as of October 2014.
If you are using Rust 1.0 or above, you had best look elsewhere for a solution.
I have a long-running Rust process that generates log values, which I'm running using Process.
It looks as though I might be able to periodically "check on" the running process using set_timeout() and wait(), with a high-level loop something like:
let mut child = match Command::new("thing").arg("...").spawn() {
Ok(child) => child,
Err(e) => fail!("failed to execute child: {}", e),
};
loop {
child.set_timeout(Some(100));
match child.wait() {
// ??? Something goes here
}
}
The things I'm not 100% sure about are: how do I tell the difference between a timeout error and a process-return error from wait(), and how do I use the PipeStream to "read as much as you can without blocking from the stream" every interval to push out?
Is this the best approach? Should I start a task to monitor stdout and stderr instead?

To distinguish process errors from the timeout, you have to handle the values returned by wait. Here is an example:
fn run() {
let mut child = match Command::new("sleep").arg("1").spawn() {
Ok(child) => child,
Err(e) => fail!("failed to execute child: {}", e),
};
loop {
child.set_timeout(Some(1000));
match child.wait() {
// Here assume any error is timeout, you can filter from IoErrorKind
Err(..) => println!("Timeout"),
Ok(ExitStatus(0)) => {
println!("Finished without errors");
return;
}
Ok(ExitStatus(a)) => {
println!("Finished with error number: {}", a);
return;
}
Ok(ExitSignal(a)) => {
println!("Terminated by signal number: {}", a);
return;
}
}
}
}
As for the streams, look at how wait_with_output is implemented, or build something similar with channels and threads: http://doc.rust-lang.org/src/std/home/rustbuild/src/rust-buildbot/slave/nightly-linux/build/src/libstd/io/process.rs.html#601
Hope it helped

Have a look in cargo:
https://docs.rs/cargo-util/0.1.1/cargo_util/struct.ProcessBuilder.html#method.exec_with_streaming
The only downside is that cargo-util seems to need openssl even with default-features=false...
But you can at least see how it and read2 are done.
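For readers on current stable Rust, here is a minimal sketch of the "channels and threads" idea using only the standard library. It is an illustration under stated assumptions, not the approach cargo-util takes: the names Source, forward_lines, and stream_output are made up for this example, and output is handled line by line, so a partial line is only delivered once its newline (or end of stream) arrives.

```rust
use std::io::{self, BufRead, BufReader, Read};
use std::process::{Command, ExitStatus, Stdio};
use std::sync::mpsc;
use std::thread;

/// Which pipe a line came from (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Stdout,
    Stderr,
}

/// Spawn a thread that forwards every line read from `pipe` into `tx`.
fn forward_lines<R: Read + Send + 'static>(
    pipe: R,
    source: Source,
    tx: mpsc::Sender<(Source, String)>,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for line in BufReader::new(pipe).lines() {
            match line {
                Ok(line) => {
                    // Stop if the receiving side has gone away.
                    if tx.send((source, line)).is_err() {
                        break;
                    }
                }
                // Stop on read errors (for example, invalid UTF-8).
                Err(_) => break,
            }
        }
    })
}

/// Spawn `cmd`, call `on_line` for each line as it arrives, and return the
/// exit status once both pipes have been closed.
pub fn stream_output(
    cmd: &mut Command,
    mut on_line: impl FnMut(Source, &str),
) -> io::Result<ExitStatus> {
    let mut child = cmd.stdout(Stdio::piped()).stderr(Stdio::piped()).spawn()?;
    let stdout = child.stdout.take().expect("stdout was configured as piped");
    let stderr = child.stderr.take().expect("stderr was configured as piped");

    let (tx, rx) = mpsc::channel();
    let readers = vec![
        forward_lines(stdout, Source::Stdout, tx.clone()),
        forward_lines(stderr, Source::Stderr, tx),
    ];

    // This loop ends when both reader threads have dropped their senders.
    for (source, line) in rx {
        on_line(source, &line);
    }
    for reader in readers {
        let _ = reader.join();
    }
    child.wait()
}
```

A call such as stream_output(Command::new("thing").arg("..."), |source, line| println!("{:?}: {}", source, line)) would then print each line as the child emits it. One thread per pipe keeps stdout and stderr from blocking each other, at the cost of losing the exact interleaving between the two streams.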

Related

How to let a mutex timeout?

I want to let my modem-AT-command-writing-thread only write to the modem's /dev/ttyUSB3 when the modem-AT-command-reading-thread has seen an "OK" or an "ERROR".
This initially sounds like a job for a Mutex<()>, but I have an additional requirement: If the modem-AT-command-reading-thread does not see an "OK" or "ERROR" within three seconds, the writing thread should just get on with sending the next AT command. i.e. If the reading thread gets nothing, the writing thread should still send one of its AT commands every three seconds. (Modems' AT command interfaces are often not nicely behaved.)
At the moment, I have a workaround using mpsc::channel:
Set-up:
let (sender, receiver) = channel::<()>();
modem-AT-command-reading-thread:
if line.starts_with("OK") || line.contains("ERROR") {
debug!("Sending go-ahead to writing_thread.");
sender.send(()).unwrap();
}
modem-AT-command-writing-thread:
/* This receive is just a way of blocking until the modem is ready. */
match receiver.recv_timeout(Duration::from_secs(3)) {
Ok(_) => {
debug!("Received go-ahead from reading thread.");
/*
* Empty the channel, in case the modem was too effusive. We don't want
* to "bank" earlier OK/ERRORs to allow multiple AT commands to be sent in
* quick succession.
*/
while let Ok(_) = receiver.try_recv() {}
}
Err(err) => match err {
RecvTimeoutError::Timeout => {
debug!("Timed-out waiting for go-ahead from reading thread.");
}
RecvTimeoutError::Disconnected => break 'outer
},
}
I cannot find a Mutex::lock_with_timeout().
How can I implement this properly, using a Mutex<()> or similar?

You can use parking_lot's Mutex; it has try_lock_for().
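If adding a dependency is not an option, a similar "block until signalled, but give up after three seconds" behaviour can be sketched with the standard library's Condvar. This is an alternative to the parking_lot suggestion, not a description of it; the GoAhead type and its method names are hypothetical.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// A resettable "go-ahead" flag that a waiter can block on with a timeout.
pub struct GoAhead {
    ready: Mutex<bool>,
    signal: Condvar,
}

impl GoAhead {
    pub fn new() -> Self {
        GoAhead {
            ready: Mutex::new(false),
            signal: Condvar::new(),
        }
    }

    /// Called by the reading thread when it sees "OK" or "ERROR".
    pub fn notify(&self) {
        *self.ready.lock().unwrap() = true;
        self.signal.notify_one();
    }

    /// Called by the writing thread. Returns `true` if the go-ahead arrived
    /// within `timeout` and `false` if the wait timed out. The flag is
    /// cleared either way, so repeated notifications are not "banked".
    pub fn wait(&self, timeout: Duration) -> bool {
        let guard = self.ready.lock().unwrap();
        // Sleeps until the flag is set or the timeout elapses; spurious
        // wakeups are handled by the predicate.
        let (mut guard, _timed_out) = self
            .signal
            .wait_timeout_while(guard, timeout, |ready| !*ready)
            .unwrap();
        let was_ready = *guard;
        *guard = false;
        was_ready
    }
}
```

The writing thread would share an Arc<GoAhead> with the reading thread and call wait(Duration::from_secs(3)) before each AT command, sending the command whether the call returns true or false.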

How to cheaply send a delay message?

My requirement is simple and common to many programs: send a specified message to my channel after a specified time.
I've checked tokio for topics related to delay, interval or timeout, but none of them seem that straightforward to implement.
What I've come up with now is to spawn an asynchronous task, then wait or sleep for a certain amount of time, and finally send the message.
But, obviously, spawning an asynchronous task is a relatively heavy operation. Is there a better solution?
async fn my_handler(sender: mpsc::Sender<i32>, dur: Duration) {
tokio::spawn(async move {
time::sleep(dur).await;
// Ignore the error if the receiver has already been dropped.
let _ = sender.send(0).await;
});
}

You could try adding a second channel and a continuously running task that buffers messages until the time they are to be received. Implementing this is more involved than it sounds, and I hope I'm handling cancellations correctly here:
fn make_timed_channel<T: Ord + Send + Sync + 'static>() -> (Sender<(Instant, T)>, Receiver<T>) {
// Ord is an unnecessary requirement arising from me stuffing both the Instant and the T into the Binary heap
// You could drop this requirement by using the priority_queue crate instead
let (sender1, receiver1) = mpsc::channel::<(Instant, T)>(42);
let (sender2, receiver2) = mpsc::channel::<T>(42);
let mut receiver1 = Some(receiver1);
tokio::spawn(async move {
let mut buf = std::collections::BinaryHeap::<Reverse<(Instant, T)>>::new();
loop {
// Pretend we're a bounded channel or exit if the upstream closed
if buf.len() >= 42 || receiver1.is_none() {
match buf.pop() {
Some(Reverse((time, element))) => {
sleep_until(time).await;
if sender2.send(element).await.is_err() {
break;
}
}
None => break,
}
}
// We have some deadline to send a message at
else if let Some(Reverse((then, _))) = buf.peek() {
if let Ok(recv) = timeout_at(*then, receiver1.as_mut().unwrap().recv()).await {
match recv {
Some(recv) => buf.push(Reverse(recv)),
None => receiver1 = None,
}
} else {
if sender2.send(buf.pop().unwrap().0 .1).await.is_err() {
break;
}
}
}
// We're empty, wait around
else {
match receiver1.as_mut().unwrap().recv().await {
Some(recv) => buf.push(Reverse(recv)),
None => receiver1 = None,
}
}
}
});
(sender1, receiver2)
}
Whether this is more efficient than spawning tasks is something you'd have to benchmark. (I doubt it. IIRC, Tokio uses something much fancier than a BinaryHeap to track when the next timeout should fire, for example.)
One optimization you could make if you don't need a Receiver<T> but just something that .poll().await can be called on: you could drop the second channel and maintain the BinaryHeap inside a custom receiver.
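To make the buffering idea easier to follow outside of an async runtime, here is a rough standard-library analogue: one background thread keeps a min-heap of deadlines and forwards each message once it is due. It is a sketch under assumptions, not a drop-in replacement for the Tokio code above: delayed_channel and Entry are hypothetical names, the channels are unbounded, and the Ord requirement on T is avoided by ordering only on the deadline and an arrival counter.

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::thread;
use std::time::Instant;

/// A buffered message, ordered by deadline and then by arrival order.
struct Entry<T> {
    due: Instant,
    seq: u64,
    value: T,
}

impl<T> PartialEq for Entry<T> {
    fn eq(&self, other: &Self) -> bool {
        (self.due, self.seq) == (other.due, other.seq)
    }
}
impl<T> Eq for Entry<T> {}
impl<T> PartialOrd for Entry<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl<T> Ord for Entry<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.due, self.seq).cmp(&(other.due, other.seq))
    }
}

/// Messages sent as `(deadline, value)` come out of the receiver no earlier
/// than their deadline, earliest deadline first.
pub fn delayed_channel<T: Send + 'static>() -> (Sender<(Instant, T)>, Receiver<T>) {
    let (in_tx, in_rx) = mpsc::channel::<(Instant, T)>();
    let (out_tx, out_rx) = mpsc::channel::<T>();
    thread::spawn(move || {
        let mut heap: BinaryHeap<Reverse<Entry<T>>> = BinaryHeap::new();
        let mut seq = 0u64;
        let mut open = true; // false once every input sender is dropped
        loop {
            // Forward everything whose deadline has passed.
            let now = Instant::now();
            while heap.peek().map_or(false, |Reverse(e)| e.due <= now) {
                let Reverse(entry) = heap.pop().unwrap();
                if out_tx.send(entry.value).is_err() {
                    return; // receiver dropped
                }
            }
            // Wait for new input, for the next deadline, or for both.
            let next_due = heap.peek().map(|Reverse(e)| e.due);
            let received = match (open, next_due) {
                (false, None) => return,
                (false, Some(due)) => {
                    thread::sleep(due.saturating_duration_since(Instant::now()));
                    continue;
                }
                (true, None) => in_rx.recv().map_err(|_| RecvTimeoutError::Disconnected),
                (true, Some(due)) => {
                    in_rx.recv_timeout(due.saturating_duration_since(Instant::now()))
                }
            };
            match received {
                Ok((due, value)) => {
                    heap.push(Reverse(Entry { due, seq, value }));
                    seq += 1;
                }
                Err(RecvTimeoutError::Timeout) => {}
                Err(RecvTimeoutError::Disconnected) => open = false,
            }
        }
    });
    (in_tx, out_rx)
}
```

As with the async version, whether one long-lived worker beats one short-lived task per message is something only a benchmark can settle. Note that this sketch only notices a dropped receiver the next time it tries to forward a message.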

How do I schedule a repeating task in Tokio?

I am replacing synchronous socket code written in Rust with the asynchronous equivalent using Tokio. Tokio uses futures for asynchronous activity so tasks are chained together and queued onto an executor to be executed by a thread pool.
The basic pseudocode for what I want to do is like this:
let listener = tokio::net::TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let in_buf = vec![0u8; 8192];
// TODO this should happen continuously until an error happens
let read_task = tokio::io::read(socket, in_buf).and_then(move |(socket, in_buf, bytes_read)| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
Ok(())
});
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);
This pseudocode would only execute my task once. How do I run it continuously? I want it to execute and then execute again and again etc. I only want it to stop executing if it panics or has an error result code. What's the simplest way of doing that?

Using loop_fn should work:
let read_task =
futures::future::loop_fn((socket, in_buf, 0), |(socket, in_buf, bytes_read)| {
if bytes_read > 0 { /* handle bytes */ }
tokio::io::read(socket, in_buf).map(Loop::Continue)
});

A clean way to accomplish this without fighting the type system is to use the tokio-codec crate. If you want to interact with the reader as a stream of bytes instead of defining a codec, you can use tokio_codec::BytesCodec.
use tokio::codec::Decoder;
use futures::Stream;
...
let listener = tokio::net::TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let (_writer, reader) = tokio_codec::BytesCodec::new().framed(socket).split();
let read_task = reader.for_each(|bytes| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
Ok(())
}).map_err(|err| {
error!("Read error = {:?}", err);
});
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);
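Setting the futures 0.1 combinators aside, the control flow being asked for is "read, handle the bytes, repeat until EOF or an error". The sketch below shows just that loop shape with the blocking std::io::Read trait so it can be run without any runtime. It is not Tokio code, and read_until_closed is a hypothetical helper name. In current async Rust the same loop is typically written with .await on each read inside a spawned task.

```rust
use std::io::{self, ErrorKind, Read};

/// Read from `reader` until EOF, passing each chunk to `on_bytes`.
/// Returns the total number of bytes read, or the first hard error.
pub fn read_until_closed<R: Read>(
    mut reader: R,
    mut on_bytes: impl FnMut(&[u8]),
) -> io::Result<usize> {
    let mut buf = [0u8; 8192];
    let mut total = 0;
    loop {
        match reader.read(&mut buf) {
            // A zero-length read means the peer closed the connection.
            Ok(0) => return Ok(total),
            Ok(n) => {
                on_bytes(&buf[..n]);
                total += n;
            }
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            // Any other error ends the loop, as the question requires.
            Err(e) => return Err(e),
        }
    }
}
```

With a std::net::TcpStream as the reader, running this function on one thread per accepted connection gives the "execute again and again until an error" behaviour described in the question.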

How to query a child process status regularly

I have spawned a child process using Rust's Command API.
Now, I need to watch this process for a few seconds before moving on because the process may die early. On success, it should run "forever", so I can't just wait.
There's a nightly feature called try_wait which does what I want, but I really don't think I should run Rust nightly just for this!
I think I could start a new thread and keep it waiting forever or until the process dies... but I would like not to hang my process with that thread, so maybe running the thread as a daemon is a solution...
Is this the way to go or is there a nicer solution?

Currently, if you don't want to use the nightly channel, there's a crate called wait-timeout (thanks to @lukas-kalbertodt for the suggestion) that adds a wait_timeout method to std::process::Child through its ChildExt extension trait.
It can be used like this:
let cmd = Command::new("my_command")
.spawn();
match cmd {
Ok(mut child) => {
let timeout = Duration::from_secs(1);
match child.wait_timeout(timeout) {
Ok(Some(status)) => println!("Exited with status {}", status),
Ok(None) => println!("timeout, process is still alive"),
Err(e) => println!("Error waiting: {}", e),
}
}
Err(err) => println!("Process did not even start: {}", err),
}
To keep monitoring the child process, just wrap this into a loop.
Note that with Rust's nightly try_wait(), the code would look nearly identical (so once it makes it into a stable release, assuming no further changes, it should be very easy to switch over). However, because try_wait() itself never blocks, the sleep below always takes the full timeout even if the process dies earlier, unlike the solution above:
let cmd = Command::new("my_command")
.spawn();
match cmd {
Ok(mut child) => {
let timeout = Duration::from_secs(1);
sleep(timeout); // try_wait will not block, so we need to wait here
match child.try_wait() {
Ok(Some(status)) => println!("Exited with status {}", status),
Ok(None) => println!("timeout, process is still alive"),
Err(e) => println!("Error waiting: {}", e),
}
}
Err(err) => println!("Process did not even start: {}", err),
}
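The "wrap this into a loop" advice can be sketched as a small polling helper built on Child::try_wait, which is available on stable Rust today. Polling in short steps avoids the drawback mentioned above of always sleeping for the full timeout. The function name wait_up_to and the poll_every parameter are assumptions made for this example.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Poll `child` until it exits or `timeout` elapses.
/// Returns `Ok(Some(status))` if it exited and `Ok(None)` if it is still
/// running when the deadline is reached.
pub fn wait_up_to(
    child: &mut Child,
    timeout: Duration,
    poll_every: Duration,
) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait never blocks: it reports an exit status only if the
        // process has already finished.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        let now = Instant::now();
        if now >= deadline {
            return Ok(None);
        }
        // Sleep for one polling step, but never past the deadline.
        thread::sleep(poll_every.min(deadline - now));
    }
}
```

A caller that expects the child to run "forever" would treat Ok(None) as success and Ok(Some(status)) as an early death worth reporting.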

What do I wait or join on when using channels and threads?

Here's an example, but what should I wait on to decide when it is done? Is there a better way to wait for the channel to be empty and all the threads to have completed? The full example is at http://github.com/posix4e/rust_webcrawl
loop {
let n_active_threads = running_threads.compare_and_swap(0, 0, Ordering::SeqCst);
match rx.try_recv() {
Ok(new_site) => {
let new_site_copy = new_site.clone();
let tx_copy = tx.clone();
counter += 1;
print!("{} ", counter);
if !found_urls.contains(&new_site) {
found_urls.insert(new_site);
running_threads.fetch_add(1, Ordering::SeqCst);
let my_running_threads = running_threads.clone();
pool.execute(move || {
for new_url in get_websites_helper(new_site_copy) {
if new_url.starts_with("http") {
tx_copy.send(new_url).unwrap();
}
}
my_running_threads.fetch_sub(1, Ordering::SeqCst);
});
}
}
Err(TryRecvError::Empty) if n_active_threads == 0 => break,
Err(TryRecvError::Empty) => {
writeln!(&mut std::io::stderr(),
"Channel is empty, but there are {} threads running",
n_active_threads);
thread::sleep_ms(10);
},
Err(TryRecvError::Disconnected) => unreachable!(),
}
}

This is actually a very complicated question, one with a great potential for race conditions! As I understand it, you:
Have an unbounded queue
Have a set of workers that operate on the queue items
The workers can put an unknown amount of items back into the queue
Want to know when everything is "done"
One obvious issue is that it may never be done. If every worker puts one item back into the queue, you've got an infinite loop.
That being said, I feel like the solution is to track
How many items are queued
How many items are in progress
When both of these values are zero, then you are done. Easier said than done...
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize,Ordering};
use std::sync::mpsc::{channel,TryRecvError};
use std::thread;
fn main() {
let running_threads = Arc::new(AtomicUsize::new(0));
let (tx, rx) = channel();
// We prime the channel with the first bit of work
tx.send(10).unwrap();
loop {
// In an attempt to avoid a race condition, we fetch the
// active thread count before checking the channel. Otherwise,
// we might read nothing from the channel, and *then* a thread
// finishes and adds something to the queue.
let n_active_threads = running_threads.load(Ordering::SeqCst);
match rx.try_recv() {
Ok(id) => {
// I lie a bit and increment the counter to start
// with. If we let the thread increment this, we might
// read from the channel before the thread ever has a
// chance to run!
running_threads.fetch_add(1, Ordering::SeqCst);
let my_tx = tx.clone();
let my_running_threads = running_threads.clone();
// You could use a threadpool, but I'm spawning
// threads to only rely on stdlib.
thread::spawn(move || {
println!("Working on {}", id);
// Simulate work
thread::sleep(std::time::Duration::from_millis(100));
if id != 0 {
my_tx.send(id - 1).unwrap();
// Send multiple sometimes
if id % 3 == 0 && id > 2 {
my_tx.send(id - 2).unwrap();
}
}
my_running_threads.fetch_sub(1, Ordering::SeqCst);
});
},
Err(TryRecvError::Empty) if n_active_threads == 0 => break,
Err(TryRecvError::Empty) => {
println!("Channel is empty, but there are {} threads running", n_active_threads);
// We sleep a bit here, to avoid quickly spinning
// through an empty channel while the worker threads
// work.
thread::sleep(std::time::Duration::from_millis(1));
},
Err(TryRecvError::Disconnected) => unreachable!(),
}
}
}
I make no guarantees that this implementation is perfect (I probably should guarantee that it's broken, because threading is hard). One big caveat is that I don't intimately know the meanings of all the variants of Ordering, so I chose the one that looked to give the strongest guarantees.
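The "items queued plus items in progress" idea can also be collapsed into a single counter of pending items: increment it before every send, and decrement it only after a worker has finished sending its follow-up items. The sketch below is one possible variation on the code above, with the same toy workload; run_until_done and pending are hypothetical names, and the same "threading is hard" caveat applies.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, RecvTimeoutError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Process `first_id` and everything it generates; return how many items
/// were processed in total.
pub fn run_until_done(first_id: u32) -> usize {
    // Counts items that are queued or currently being worked on.
    let pending = Arc::new(AtomicUsize::new(0));
    let (tx, rx) = channel::<u32>();

    // Always count an item *before* it becomes visible in the channel.
    pending.fetch_add(1, Ordering::SeqCst);
    tx.send(first_id).unwrap();

    let mut processed = 0;
    let mut workers = Vec::new();
    loop {
        match rx.recv_timeout(Duration::from_millis(5)) {
            Ok(id) => {
                processed += 1;
                let tx = tx.clone();
                let pending = Arc::clone(&pending);
                workers.push(thread::spawn(move || {
                    // Same toy workload as above: emit one or two
                    // smaller items.
                    if id != 0 {
                        pending.fetch_add(1, Ordering::SeqCst);
                        tx.send(id - 1).unwrap();
                        if id % 3 == 0 && id > 2 {
                            pending.fetch_add(1, Ordering::SeqCst);
                            tx.send(id - 2).unwrap();
                        }
                    }
                    // This item is finished only after its children have
                    // been counted, so `pending` cannot hit zero early.
                    pending.fetch_sub(1, Ordering::SeqCst);
                }));
            }
            Err(RecvTimeoutError::Timeout) => {
                // Nothing arrived; stop only if nothing is outstanding.
                if pending.load(Ordering::SeqCst) == 0 {
                    break;
                }
            }
            Err(RecvTimeoutError::Disconnected) => unreachable!("we still hold a sender"),
        }
    }
    for worker in workers {
        worker.join().unwrap();
    }
    processed
}
```

Using recv_timeout instead of try_recv plus a sleep lets the main thread wake as soon as an item arrives, while still checking the counter periodically when the channel is quiet.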
