Continuously process child process' outputs byte for byte with a BufReader

I'm trying to interact with an external command (in this case, exiftool) and read its output byte by byte, as in the example below.
While I can get it to work if I'm willing to first read in all the output and wait for the child process to finish, using a BufReader seems to result in indefinitely waiting for the first byte. I used this example as reference for accessing stdout with a BufReader.
use std::io::{Write, Read};
use std::process::{Command, Stdio, ChildStdin, ChildStdout};
fn main() {
let mut child = Command::new("exiftool")
.arg("-#") // "Read command line options from file"
.arg("-") // use stdin for -#
.arg("-q") // "quiet processing" (only send image data to stdout)
.arg("-previewImage") // for extracting thumbnails
.arg("-b") // "Output metadata in binary format"
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn().unwrap();
{
// Pass input file names via stdin
let stdin: &mut ChildStdin = child.stdin.as_mut().unwrap();
stdin.write_all("IMG_1709.CR2".as_bytes()).unwrap();
// Leave scope:
// "When an instance of ChildStdin is dropped, the ChildStdin’s underlying file handle will
// be closed."
}
// This doesn't work:
let stdout: ChildStdout = child.stdout.take().unwrap();
let reader = std::io::BufReader::new(stdout);
for (byte_i, byte_value) in reader.bytes().enumerate() {
// This line is never printed and the program doesn't seem to terminate:
println!("Obtained byte {}: {}", byte_i, byte_value.unwrap());
// …
break;
}
// This works:
let output = child.wait_with_output().unwrap();
for (byte_i, byte_value) in output.stdout.iter().enumerate() {
println!("Obtained byte {}: {}", byte_i, byte_value);
// …
break;
}
}

You're not closing the child's stdin. Your stdin variable is a mutable reference, and dropping that has no effect on the referenced ChildStdin.
Use child.stdin.take() instead of child.stdin.as_mut():
{
// Pass input file names via stdin
let mut stdin: ChildStdin = child.stdin.take().unwrap(); // `mut` is needed for write_all
stdin.write_all("IMG_1709.CR2".as_bytes()).unwrap();
// Leave scope:
// "When an instance of ChildStdin is dropped, the ChildStdin’s underlying file handle will
// be closed."
}
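For reference, a minimal end-to-end sketch of the fix. It assumes a Unix-like system and uses `cat` as a stand-in for exiftool (any command that reads stdin until EOF behaves the same way here); the function name `pipe_through_child` is made up for this example.

```rust
use std::io::{self, BufReader, Read, Write};
use std::process::{Command, Stdio};

/// Sends `input` to a child process and collects its stdout byte by byte.
/// `program` is assumed to read stdin until EOF (e.g. `cat`).
fn pipe_through_child(program: &str, input: &[u8]) -> io::Result<Vec<u8>> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    {
        // take() moves the ChildStdin out of `child`, so it is really dropped
        // (and the pipe closed) at the end of this block.
        let mut stdin = child.stdin.take().expect("stdin was piped");
        stdin.write_all(input)?;
    }

    // The child now sees EOF on stdin, so its output can be read to the end.
    let stdout = child.stdout.take().expect("stdout was piped");
    let mut collected = Vec::new();
    for byte in BufReader::new(stdout).bytes() {
        collected.push(byte?);
    }

    child.wait()?;
    Ok(collected)
}
```

Calling `pipe_through_child("cat", b"IMG_1709.CR2")` should hand back the same bytes. Note that writing all of the input before reading any output is only safe for small inputs; with large inputs the child may block on a full stdout pipe, so the write would have to move to a separate thread.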

Related

How to create future that completes once tokio::process::Child has exited without closing stdin

How can I create a future that completes upon the termination of a tokio::process::Child without closing stdin. I know there is try_wait for testing if a process has terminated without closing stdin, but I want to have this behavior with future semantics.
I tried to prepare an MRE for this question where my code panics as a result of writing to stdin after calling wait, but what I observe does not match the behavior stated in the documentation for tokio::process::Child's wait method. I would expect the line stdin.write_u8(24).await.unwrap(); to crash with a broken pipe, since stdin should have been closed by wait.
use tokio::{time, io::AsyncWriteExt}; // 1.0.1
use std::time::Duration;
#[tokio::main]
pub async fn main() {
let mut child = tokio::process::Command::new("nano")
.stdin(std::process::Stdio::piped())
.spawn()
.unwrap();
let mut stdin = child.stdin.take().unwrap();
let tasklet = tokio::spawn(async move {
child.wait().await
});
// this delay should give tokio::spawn plenty of time to spin up
// and call `wait` on the child (closing stdin)
time::sleep(Duration::from_millis(1000)).await;
// write character 24 (CANcel, ^X) to stdin to close nano
stdin.write_u8(24).await.unwrap();
match tasklet.await {
Ok(exit_result) => match exit_result {
Ok(exit_status) => eprintln!("exit_status: {}", exit_status),
Err(terminate_error) => eprintln!("terminate_error: {}", terminate_error)
}
Err(join_error) => eprintln!("join_error: {}", join_error)
}
}

The answer is to take ChildStdin out of tokio::process::Child with Option::take, as described in this GitHub issue. In that case, wait will not close stdin, and the programmer is responsible for avoiding deadlocks.
The MRE above doesn't fail for two reasons: (i) I took ChildStdin out of tokio::process::Child and (ii) even if I hadn't taken it out, it still would not have been closed due to a bug in the code that will be fixed in this pull request.
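The same idea can be sketched with the blocking std::process API, which avoids depending on a particular Tokio version. This is an analogue, not the Tokio code from the question: a thread stands in for the spawned task, `cat` stands in for nano (assuming a Unix-like system), and `write_while_waiting` is a made-up name.

```rust
use std::io::{self, Write};
use std::process::{Command, ExitStatus, Stdio};
use std::thread;

/// Waits for the child on a separate thread while the caller keeps
/// ownership of the child's stdin and can still write to it.
fn write_while_waiting(program: &str, input: &[u8]) -> io::Result<ExitStatus> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::null())
        .spawn()?;

    // After take(), `child.stdin` is None, so wait() has nothing to close.
    let mut stdin = child.stdin.take().expect("stdin was piped");

    // `child` moves into the waiter thread; `stdin` stays with the caller.
    let waiter = thread::spawn(move || child.wait());

    stdin.write_all(input)?;
    // Closing stdin is now the caller's job; without this, a child that
    // reads until EOF never exits and the join below blocks forever.
    drop(stdin);

    waiter.join().expect("waiter thread panicked")
}
```

With `cat` as the program, `write_while_waiting("cat", b"hello\n")` should return a successful exit status once stdin is dropped.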

Why does BufReader hang when reading from stderr?

I want to execute a command and then capture any potential output to stderr. Here's what I have:
if let Ok(ref mut child) = Command::new("ssh")
.args(&[
"some_args",
"more_args"
])
.stderr(Stdio::piped())
.spawn()
{
let output = child.wait().expect("ssh command not running");
let reader = BufReader::new(child.stderr.take().expect("failed to capture stderr"));
for line in reader.lines() {
match line {
Ok(line_str) => println!("output: {}", line_str),
Err(e) => println!("output failed!"),
}
}
}
I see the output being printed, but the program then hangs. I suspect this is related to the child process exiting, leaving BufReader unable to read an EOF. A workaround was to keep a counter (let mut num_lines = 0;), increment it on every read, and break out of the for loop after a fixed number of reads, but this doesn't seem very clean. How can I get BufReader to finish reading properly?

Neither of these may solve your issue, but I'll offer the advice regardless:
Pipe-Wait-Read can deadlock
Calling child.wait() will block execution until the child has exited, returning the exit status.
Using Stdio::piped() creates a new pipe for the stdout/stderr streams in order to be processed by the application. Pipes are handled by the operating system and are not infinite; if one end of the pipe is writing data but the other side isn't reading it, it will eventually block those writes until something is read.
This code can deadlock because you're waiting on the child process to exit, but it may not be able to if it becomes blocked trying to write to an output pipe that's full and not being read from.
As an example, this deadlocks on my system (a fairly standard Ubuntu system that has 64KiB buffers for pipes):
// create a simple child process that sends 64KiB+1 random bytes to stdout
let mut child = Command::new("dd")
.args(&["if=/dev/urandom", "count=65537", "bs=1", "status=none"])
.stdout(Stdio::piped())
.spawn()
.expect("failed to execute dd");
let _status = child.wait(); // hangs indefinitely
let reader = BufReader::new(child.stdout.take().expect("failed to capture stdout"));
for _line in reader.lines() {
// do something
}
There are plenty of alternatives:
Just read the output without waiting. reader.lines() will stop iterating when it reaches the end of the stream. You can then call child.wait() if you want to know the exit status.
Use .output() instead of .spawn(). This will block until the child has exited and return an Output holding the full stdout/stderr streams as Vec<u8>s.
You can process the output streams in separate threads while you're waiting for the child to exit. If that sounds good consider using tokio::process::Command.
See How do I read the output of a child process without blocking in Rust? for more info.
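A minimal sketch of the first alternative (read to the end of the stream, then wait). The function name and the `sh` command used in the usage note are placeholders for illustration, not part of the original question.

```rust
use std::io::{self, BufRead, BufReader};
use std::process::{Command, ExitStatus, Stdio};

/// Reads the child's stderr line by line until EOF, and only then
/// asks for the exit status.
fn stderr_lines_then_status(
    program: &str,
    args: &[&str],
) -> io::Result<(Vec<String>, ExitStatus)> {
    let mut child = Command::new(program)
        .args(args)
        .stderr(Stdio::piped())
        .spawn()?;

    let stderr = child.stderr.take().expect("stderr was piped");
    let mut lines = Vec::new();
    // The iterator ends when the child closes its end of the pipe,
    // which happens at the latest when it exits.
    for line in BufReader::new(stderr).lines() {
        lines.push(line?); // stop at the first read error
    }

    // The pipe is drained, so waiting cannot deadlock on a full buffer.
    let status = child.wait()?;
    Ok((lines, status))
}
```

For example, on a Unix-like system `stderr_lines_then_status("sh", &["-c", "echo one >&2; echo two >&2"])` should yield the two lines and a successful status.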
Don't swallow errors from .lines()
reader.lines() returns an iterator that yields a result for each line. One error that can be handled to some degree is a line that isn't valid UTF-8, which yields something like this:
Err(
Custom {
kind: InvalidData,
error: "stream did not contain valid UTF-8",
},
)
However, any other error would be directly from the underlying reader and you should probably not continue iterating. Any error you receive is unlikely to be recoverable, and certainly not by continuing to ask for more lines.
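One way to act on this, sketched with a hypothetical helper `lines_until_error`: collect lines until the first error, then stop and hand the error back to the caller instead of printing and continuing.

```rust
use std::io::{self, BufRead};

/// Collects lines until EOF or the first error. On error, iteration stops
/// and the error is returned alongside the lines read so far.
fn lines_until_error<R: BufRead>(reader: R) -> (Vec<String>, Option<io::Error>) {
    let mut lines = Vec::new();
    for line in reader.lines() {
        match line {
            Ok(text) => lines.push(text),
            // Do not ask for more lines after a failure.
            Err(err) => return (lines, Some(err)),
        }
    }
    (lines, None)
}
```

The caller can then inspect `err.kind()` to decide whether the failure was an encoding problem (`ErrorKind::InvalidData`) or something that came from the underlying reader.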

How do I schedule a repeating task in Tokio?

I am replacing synchronous socket code written in Rust with the asynchronous equivalent using Tokio. Tokio uses futures for asynchronous activity so tasks are chained together and queued onto an executor to be executed by a thread pool.
The basic pseudocode for what I want to do is like this:
let listener = tokio::net::TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let in_buf = vec![0u8; 8192];
// TODO this should happen continuously until an error happens
let read_task = tokio::io::read(socket, in_buf).and_then(move |(socket, in_buf, bytes_read)| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
Ok(())
});
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);
This pseudocode would only execute my task once. How do I run it continuously? I want it to execute and then execute again and again etc. I only want it to stop executing if it panics or has an error result code. What's the simplest way of doing that?

Using loop_fn should work:
let read_task =
futures::future::loop_fn((socket, in_buf, 0), |(socket, in_buf, bytes_read)| {
if bytes_read > 0 { /* handle bytes */ }
tokio::io::read(socket, in_buf).map(Loop::Continue)
});

A clean way to accomplish this without fighting the type system is to use the tokio-codec crate; if you want to interact with the reader as a stream of bytes instead of defining a codec, you can use tokio_codec::BytesCodec.
use tokio::codec::Decoder;
use futures::Stream;
...
let listener = tokio::net::TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let (_writer, reader) = tokio_codec::BytesCodec::new().framed(socket).split();
let read_task = reader.for_each(|bytes| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
});
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);

How can I read non-blocking from stdin?

Is there a way to check whether data is available on stdin in Rust, or to do a read that returns immediately with the currently available data?
My goal is to be able to read the input produced, for instance, by cursor keys in a shell that is set up to return all read data immediately. For instance with an equivalent to: stty -echo -echok -icanon min 1 time 0.
I suppose one solution would be to use ncurses or similar libraries, but I would like to avoid any kind of large dependencies.
So far, I got only blocking input, which is not what I want:
let mut reader = stdin();
let mut s = String::new();
match reader.read_to_string(&mut s) {...} // this blocks :(

Converting OP's comment into an answer:
You can spawn a thread and send data over a channel. You can then poll that channel in the main thread using try_recv.
use std::io;
use std::sync::mpsc;
use std::sync::mpsc::Receiver;
use std::sync::mpsc::TryRecvError;
use std::{thread, time};
fn main() {
let stdin_channel = spawn_stdin_channel();
loop {
match stdin_channel.try_recv() {
Ok(key) => println!("Received: {}", key),
Err(TryRecvError::Empty) => println!("Channel empty"),
Err(TryRecvError::Disconnected) => panic!("Channel disconnected"),
}
sleep(1000);
}
}
fn spawn_stdin_channel() -> Receiver<String> {
let (tx, rx) = mpsc::channel::<String>();
thread::spawn(move || loop {
let mut buffer = String::new();
io::stdin().read_line(&mut buffer).unwrap();
tx.send(buffer).unwrap();
});
rx
}
fn sleep(millis: u64) {
let duration = time::Duration::from_millis(millis);
thread::sleep(duration);
}

Most operating systems default to working with the standard input and output in a blocking way. No wonder, then, that the Rust standard library follows suit.
To read from a blocking stream in a non-blocking way you might create a separate thread, so that the extra thread blocks instead of the main one. Checking whether a blocking file descriptor produced some input is similar: spawn a thread, make it read the data, check whether it produced any data so far.
Here's a piece of code that I use with a similar goal of processing a pipe output interactively and that can hopefully serve as an example. It sends the data over a channel, which supports the try_recv method - allowing you to check whether the data is available or not.
Someone has told me that mio might be used to read from a pipe in a non-blocking way, so you might want to check it out too. I suspect that passing the stdin file descriptor (0) to Receiver::from_raw_fd should just work.
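A rough sketch of the thread-plus-channel approach for an arbitrary blocking reader (a pipe, stdin, a file). This is not the code referred to above; `spawn_reader_channel` and the 4096-byte chunk size are assumptions made for illustration.

```rust
use std::io::Read;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Moves a blocking reader onto its own thread and forwards whatever it
/// reads as byte chunks. The receiver can be polled with try_recv().
fn spawn_reader_channel<R>(mut reader: R) -> Receiver<Vec<u8>>
where
    R: Read + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut buf = [0u8; 4096];
        loop {
            match reader.read(&mut buf) {
                // EOF or read error: leave the loop, which drops `tx` and
                // makes the receiver report Disconnected. A real program
                // would probably forward the error instead of dropping it.
                Ok(0) | Err(_) => break,
                Ok(n) => {
                    if tx.send(buf[..n].to_vec()).is_err() {
                        break; // receiver is gone, nothing left to do
                    }
                }
            }
        }
    });
    rx
}
```

On the receiving side, `try_recv()` returns `Err(TryRecvError::Empty)` while no data has arrived yet and `Err(TryRecvError::Disconnected)` once the reader has finished and all chunks have been consumed.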

You could also potentially look at using ncurses (also on crates.io), which would allow you to read in raw mode. There are a few examples in the GitHub repository which show how to do this.

"fd not available for reading or writing" when memory-mapping a file

I am trying to write data to a memory-mapped file in Rust but it won't memory map the specified file as it states the given fd is not available.
I can see it on the filesystem so it does exist with correct privileges. I suspect this is either a bug or I am not using the new IO API in the correct way.
mmap err = fd not available for reading or writing
Here's the code
use std::fs::File;
use std::os::MemoryMap;
use std::os::unix::prelude::AsRawFd;
use std::os::MapOption::{MapFd, MapWritable, MapReadable};
fn main() {
let f = File::create("test.dat").unwrap();
f.set_len(n as u64);
let fd = f.as_raw_fd();
let mmap = MemoryMap::new(n, &[MapReadable, MapWritable, MapFd(fd)]);
match mmap {
Ok(_) => println!("mmap success"),
Err(ref err) => println!("mmap err = {}", err),
}
}

Files created with File::create are in write-only mode, but you are attempting to map the file for both reading and writing. Use OpenOptions to get a file with both modes:
#![feature(os, std_misc)]
use std::fs::OpenOptions;
use std::os::MemoryMap;
use std::os::unix::prelude::AsRawFd;
use std::os::MapOption::{MapFd, MapReadable, MapWritable};
fn main() {
let n = 100;
let f = OpenOptions::new()
.read(true)
.write(true)
.truncate(true)
.create(true)
.open("test.dat")
.unwrap();
f.set_len(n as u64).unwrap();
let fd = f.as_raw_fd();
let mmap = MemoryMap::new(n, &[MapReadable, MapWritable, MapFd(fd)]);
match mmap {
Ok(_) => println!("mmap success"),
Err(err) => println!("mmap err = {}", err),
}
}
I figured this out by
Grepping the code for "fd not available for reading or writing", which leads to this line, which aligns to ErrFdNotAvail (could also have changed mmap err = {} to mmap err = {:?}).
Searching for that enum variant leads to this line, which maps the underlying libc::EACCES error.
Checking the man page for mmap to see what EACCES says:
The flag PROT_READ was specified as part of the prot argument and fd was not open for reading. The flags MAP_SHARED and PROT_WRITE were specified as part of the flags and prot argument and fd was not open for writing.
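The access-mode difference can be observed with the stable standard library alone. The sketch below does not call mmap at all (the std::os::MemoryMap API used above is not part of current std); it only shows that a handle from File::create cannot be read from, while one opened through OpenOptions with read(true) can. The helper names are made up for this example.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read};
use std::path::Path;

/// File::create opens (and truncates) the file for writing only.
fn open_write_only(path: &Path) -> io::Result<File> {
    File::create(path)
}

/// Opens the same file for both reading and writing.
fn open_read_write(path: &Path) -> io::Result<File> {
    OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)
}

/// Attempts a one-byte read; this fails if the handle is not readable.
fn try_read(file: &mut File) -> io::Result<usize> {
    let mut buf = [0u8; 1];
    file.read(&mut buf)
}
```

A read through the write-only handle returns an error, while the read/write handle returns Ok(0) for an empty file. A readable and writable mapping needs the second kind of handle.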
