I am wrapping a C/C++ library in a Rust crate and calling into it using FFI (I am not using a subprocess).
This library logs to stdout/stderr (using, say, printf() or std::cout) but I would like to "catch" this output and use Rust's log crate to control the output.
Is it possible to redirect stdout/stderr of FFI calls to log?
Please find below an example illustrating the different
steps to redirect/restore stderr (file descriptor 2).
The (C-like) style used here is intentional, to keep this example minimal; of course, you could use the libc crate and properly encapsulate all of this in a struct.
Note that, in trivial cases, you may repeat the
redirect/invoke/obtain/restore sequence as many times as you want,
provided you keep pipe_fd, saved_stderr and log_file open.
However, in non-trivial cases, some complications arise.

If the C code produces a fairly long message, how can we detect that we have read it all?
- We could inject an end-marker into STDERR_FILENO after the message is produced at the invoke step, then read log_file until this marker is detected in the obtain step. (This adds some text processing.)
- We could recreate the pipe and log_file before each redirect step, close the PIPE_WRITE end before the invoke step, then read log_file until EOF is reached and close it in the obtain step. (This adds the overhead of more system calls.)

If the C code produces a very long message, wouldn't it exceed the pipe's internal buffer capacity (and then block on writing)?
- We could execute the invoke step in a separate thread and join() it after the obtain step has completed (end-marker or EOF reached), so that the invocation still looks serial from the application's point of view; a sketch of this variant follows the example below. (This adds the overhead of spawning/joining a thread.)
- An alternative is to put all the logging part of the application in a separate thread (spawned once and for all) and keep all the invocation steps serial. (If the logging part of the application does not have to be perceived as serial, this is OK; otherwise it just pushes the same problem one thread further.)
- We could fork() to perform the redirect and invoke steps in a child process (if the application data does not have to be altered, just read), get rid of the restore step, and wait() for the child process after the obtain step has completed (end-marker or EOF reached), so that the invocation still looks serial from the application's point of view. (This adds the overhead of spawning/waiting for a process, and precludes altering the application data from the invoked code.)
// necessary for the redirection
extern "C" {
    fn pipe(fd: *mut i32) -> i32;
    fn close(fd: i32) -> i32;
    fn dup(fd: i32) -> i32;
    fn dup2(old_fd: i32, new_fd: i32) -> i32;
}

const PIPE_READ: usize = 0;
const PIPE_WRITE: usize = 1;
const STDERR_FILENO: i32 = 2;

fn main() {
    //
    // duplicate original stderr in order to restore it
    //
    let saved_stderr = unsafe { dup(STDERR_FILENO) };
    if saved_stderr == -1 {
        eprintln!("cannot duplicate stderr");
        return;
    }
    //
    // create resources (pipe + file reading from it)
    //
    let mut pipe_fd = [-1; 2];
    if unsafe { pipe(&mut pipe_fd[0]) } == -1 {
        eprintln!("cannot create pipe");
        return;
    }
    use std::os::unix::io::FromRawFd;
    let mut log_file =
        unsafe { std::fs::File::from_raw_fd(pipe_fd[PIPE_READ]) };
    //
    // redirect stderr to pipe/log_file
    //
    if unsafe { dup2(pipe_fd[PIPE_WRITE], STDERR_FILENO) } == -1 {
        eprintln!("cannot redirect stderr to pipe");
        return;
    }
    //
    // invoke some C code that should write to stderr
    //
    extern "C" {
        fn perror(txt: *const u8);
    }
    unsafe {
        dup(-1); // invalid syscall in order to set errno (used by perror)
        perror(&"something bad happened\0".as_bytes()[0]);
    };
    //
    // obtain the previous message
    //
    use std::io::Read;
    let mut buffer = [0_u8; 100];
    if let Ok(sz) = log_file.read(&mut buffer) {
        println!(
            "message ({} bytes): {:?}",
            sz,
            std::str::from_utf8(&buffer[0..sz]).unwrap(),
        );
    }
    //
    // restore initial stderr
    //
    unsafe { dup2(saved_stderr, STDERR_FILENO) };
    //
    // close resources
    //
    unsafe {
        close(saved_stderr);
        // pipe_fd[PIPE_READ] will be closed by log_file
        close(pipe_fd[PIPE_WRITE]);
    };
}
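As a side note on the "separate thread" alternative mentioned above, here is a minimal sketch (not part of the original example, error handling omitted) that reuses the extern "C" declarations and constants from the example: a reader thread drains the pipe until EOF while the FFI call runs on the calling thread, and restoring stderr plus closing the write end is what produces that EOF, so even a message larger than the pipe's buffer cannot block the writer.

fn capture_stderr(invoke: impl FnOnce()) -> String {
    use std::io::Read;
    use std::os::unix::io::FromRawFd;

    // keep a copy of the real stderr so it can be restored afterwards
    let saved_stderr = unsafe { dup(STDERR_FILENO) };
    let mut pipe_fd = [-1; 2];
    unsafe { pipe(&mut pipe_fd[0]) };

    // reader thread: owns the read end and drains it until EOF
    let read_fd = pipe_fd[PIPE_READ];
    let reader = std::thread::spawn(move || {
        let mut log_file = unsafe { std::fs::File::from_raw_fd(read_fd) };
        let mut captured = String::new();
        let _ = log_file.read_to_string(&mut captured);
        captured // the read end is closed when log_file is dropped
    });

    // redirect, invoke, restore
    unsafe { dup2(pipe_fd[PIPE_WRITE], STDERR_FILENO) };
    invoke();
    unsafe {
        dup2(saved_stderr, STDERR_FILENO); // stderr no longer feeds the pipe
        close(pipe_fd[PIPE_WRITE]); // last write end gone, reader sees EOF
        close(saved_stderr);
    }

    reader.join().expect("reader thread panicked")
}

A call site would then look like capture_stderr(|| unsafe { some_c_function() }), where some_c_function stands for whatever FFI call produces the output; the returned String can be handed to the log crate.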
I'm learning the synchronization primitives of tokio. From the example code for Notify, I found it confusing to understand why Channel<T> is mpsc.
use tokio::sync::Notify;
use std::collections::VecDeque;
use std::sync::Mutex;

struct Channel<T> {
    values: Mutex<VecDeque<T>>,
    notify: Notify,
}

impl<T> Channel<T> {
    pub fn send(&self, value: T) {
        self.values.lock().unwrap()
            .push_back(value);

        // Notify the consumer a value is available
        self.notify.notify_one();
    }

    // This is a single-consumer channel, so several concurrent calls to
    // `recv` are not allowed.
    pub async fn recv(&self) -> T {
        loop {
            // Drain values
            if let Some(value) = self.values.lock().unwrap().pop_front() {
                return value;
            }

            // Wait for values to be available
            self.notify.notified().await;
        }
    }
}
If there are elements in values, the consumer task takes one away.
If there are no elements in values, the consumer task yields until the producer notifies it.
But after writing some test code, I found no case in which the consumer loses the notification from the producer.
Could someone give me test code proving that the above Channel<T> fails to work correctly as an mpmc channel?
The following code shows why it is unsafe to use the above channel as mpmc.
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let mut i = 0;
    loop {
        let ch = Arc::new(Channel {
            values: Mutex::new(VecDeque::new()),
            notify: Notify::new(),
        });

        let mut handles = vec![];
        for i in 0..100 {
            if i % 2 == 1 {
                for _ in 0..2 {
                    let sender = ch.clone();
                    tokio::spawn(async move {
                        sender.send(1);
                    });
                }
            } else {
                for _ in 0..2 {
                    let receiver = ch.clone();
                    let handle = tokio::spawn(async move {
                        receiver.recv().await;
                    });
                    handles.push(handle);
                }
            }
        }

        futures::future::join_all(handles).await;
        i += 1;
        println!("No.{i} loop finished.");
    }
}
If the next loop does not run, it means some consumer tasks never finished, i.e. they missed a notify from a producer.
Quote from the documentation you linked:
If you have two calls to recv and two calls to send in parallel, the following could happen:
Both calls to try_recv return None.
Both new elements are added to the vector.
The notify_one method is called twice, adding only a single permit to the Notify.
Both calls to recv reach the Notified future. One of them consumes the permit, and the other sleeps forever.
Replace try_recv with self.values.lock().unwrap().pop_front() in our case; the rest of the explanation stays identical.
The third point is the important one: multiple calls to notify_one only result in a single token if no thread is waiting yet. And there is a short time window in which multiple threads have already checked for the existence of an item but aren't waiting yet.
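For completeness, the same documentation shows how this window can be closed in a multi-consumer setting: create the Notified future (and enable() it) before checking the queue, so that a notify_one() arriving between the check and the await is stored rather than lost. A sketch along those lines (adapted, not quoted verbatim, and requiring a reasonably recent tokio for Notified::enable; send stays exactly as above):

use std::collections::VecDeque;
use std::sync::Mutex;
use tokio::sync::Notify;

struct Channel<T> {
    values: Mutex<VecDeque<T>>,
    notify: Notify,
}

impl<T> Channel<T> {
    pub async fn recv(&self) -> T {
        // create the future up front so it can be enabled before the check
        let future = self.notify.notified();
        tokio::pin!(future);

        loop {
            // register this receiver *before* looking at the queue
            future.as_mut().enable();

            if let Some(value) = self.values.lock().unwrap().pop_front() {
                return value;
            }

            // wait for a notify_one()
            future.as_mut().await;

            // another receiver may have taken the value this notification
            // was for, so re-arm and check again
            future.set(self.notify.notified());
        }
    }
}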
I'm new to Rust and was trying to generate plenty of JSON data on the fly for a project, but I'm getting deadlocks.
I've tried removing the serialization (serde_json) and sending the HashMaps through the channel instead, but I still get deadlocks on my computer. If, however, I comment out the send(generator.next()) line and send a string of my own instead, the code works flawlessly, so the deadlock is caused by my DatasetGenerator, but I don't understand why.
Code summary:
Have a DatasetGenerator object that can generate sequences of "events" and serialize them to JSON.
generator.next() works like an "iterator": it increments an internal atomic counter in the generator, then generates the i-th item in the sequence and serializes it to JSON.
Have a generator threadpool generate these JSONs at high throughput (very large payloads each)
Send these JSONs through a channel to another thread (which will send them over the network, but that's irrelevant for this question)
Depending on whether I use tx_ref.send(generator_ref.next()) or tx_ref.send(some_new_string) below, my code deadlocks or succeeds:
src/main.rs:
extern crate threads_pool;

use threads_pool::*;

mod generator;

use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

fn main() {
    // N will be an argument, and a very high number. For tests use this:
    const N: i64 = 12; // Increase this if you're not getting the deadlock yet, or run cargo run again until it happens.

    let (tx, rx) = mpsc::channel();
    let tx_producer = tx.clone();
    let producer_thread = thread::spawn(move || {
        let pool = ThreadPool::new(4);
        let generator = Arc::new(generator::data_generator::DatasetGenerator::new(3000));
        for i in 0..N {
            println!("Generating #{}", i);
            let tx_ref = tx_producer.clone();
            let generator_ref = generator.clone();
            pool.execute(move || {
                ////////// v !!!DEADLOCK HERE!!! v //////////
                tx_ref.send(generator_ref.next()).expect("tx failed."); // This locks!
                //tx_ref.send(format!(" {} ", i)).expect("tx failed."); // This works!
                ////////// ^ !!!DEADLOCK HERE!!! ^ //////////
            })
            .unwrap();
        }

        println!("Generator done!");
    });

    println!("-» Consumer consuming!");
    for j in 0..N {
        let s = rx.recv().expect("rx failed");
        println!("-» Consumed #{}: {} ... ", j, &s[..10]);
    }
    println!("Consumer done!!");

    producer_thread.join().unwrap();
    println!("Success. Exit!");
}
This is my DatasetGenerator, which seems to be causing all the trouble (skipping serde and sending the HashMaps directly still deadlocks). src/generator/dataset_generator.rs:
use serde_json::Value;
use std::collections::HashMap;
use std::sync::atomic;

pub struct DatasetGenerator {
    num_features: usize,
    pub counter: atomic::AtomicI64,
    feature_names: Vec<String>,
}

type Datapoint = HashMap<String, Value>;
type Out = String;

impl DatasetGenerator {
    pub fn new(num_features: usize) -> DatasetGenerator {
        let mut feature_names = Vec::new();
        for i in 0..num_features {
            feature_names.push(format!("f_{}", i));
        }

        DatasetGenerator {
            num_features,
            counter: atomic::AtomicI64::new(0),
            feature_names,
        }
    }

    /// Generates the next item in the sequence (iterator-like).
    pub fn next(&self) -> Out {
        let value = self.counter.fetch_add(1, atomic::Ordering::SeqCst);
        self.gen(value)
    }

    /// Generates the ith item in the sequence. DEADLOCKS!!! ///////////////////////////
    pub fn gen(&self, ith: i64) -> Out {
        let mut data = Datapoint::with_capacity(self.num_features);
        for f in 0..self.num_features {
            let name = self.feature_names.get(f).unwrap();
            data.insert(name.to_string(), Value::from(ith));
        }

        serde_json::json!(data).to_string() // Tried without serialization and still deadlocks!
    }
}
Commit with deadlock code is here if you want to try out yourself with cargo run: https://github.com/AlbertoEAF/learn-rust/tree/dc5fa867e5a70b605553ef65796fdc9dd42d38a0/rest-injector
Deadlock on Windows with Rust 1.60.0:
Thank you for the help! it's greatly appreciated :)
Update
I've followed the suggestions from @kmdreko's answer below, and apparently the problem is in the generator: not all the items are generated. Even though pool.execute() is called N times, only a random number of closures c < N is executed, even if I place pool.close() before leaving the producer_thread. Why does that happen / how can it be fixed?
Fix: it turns out this lockup is caused by the threads_pool library (0.2.6). I switched the thread pool to rayon's and it worked smoothly on the first try.
One thing you should change: an mpsc::Receiver will return an error from .recv() if it cannot possibly yield a result, i.e. when it sees that all the associated mpsc::Senders have been dropped, which is a good indicator that all the work is done. Your tx_refs and even tx_producer will be dropped when their respective tasks/threads complete, but you still have tx in scope, which could theoretically give a value. This is what gives you the apparent deadlock. You should simply remove tx_producer and use tx directly, so it is moved into the producer thread and dropped accordingly.
Now you'll either see all N tasks complete, or you'll get an error indicating that some tasks did not complete. The reason not all tasks are completing is that you're creating the thread pool, spawning all the tasks, and then immediately destroying it. The threads_pool documentation says that the threads will finish their current job when the pool is destroyed, but you want to wait until all jobs have completed. For that you need to call the .close() method provided by the PoolManager trait before the end of the closure.
The reason you saw inconsistent behavior, but were helped by returning a string directly, is that those jobs required less work and the threads could get away with completing all of them before they saw their signal to exit. Your generator_ref.next() requires much more computation, so it's not surprising they'd only process four-plus-a-bit jobs before seeing that they've been told to exit.
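Putting both suggestions together, main.rs would look roughly like the sketch below (an illustration assuming the threads_pool 0.2.x API described above, in particular that close() is the PoolManager method mentioned; everything else mirrors the original code):

extern crate threads_pool;

use threads_pool::*;

mod generator;

use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

fn main() {
    const N: i64 = 12;

    let (tx, rx) = mpsc::channel();

    // `tx` is moved into the producer thread (no extra `tx_producer` kept in
    // main), so the channel disconnects once the thread and all `tx_ref`
    // clones are gone.
    let producer_thread = thread::spawn(move || {
        let mut pool = ThreadPool::new(4);
        let generator = Arc::new(generator::data_generator::DatasetGenerator::new(3000));
        for i in 0..N {
            println!("Generating #{}", i);
            let tx_ref = tx.clone();
            let generator_ref = generator.clone();
            pool.execute(move || {
                tx_ref.send(generator_ref.next()).expect("tx failed.");
            })
            .unwrap();
        }

        // wait for every queued job instead of dropping the pool (and telling
        // its threads to exit) right away
        pool.close();
        println!("Generator done!");
    });

    for j in 0..N {
        // with `tx` dropped at the end of the producer thread, this either
        // yields all N items or returns an error instead of hanging forever
        let s = rx.recv().expect("rx failed");
        println!("-» Consumed #{}: {} ... ", j, &s[..10]);
    }

    producer_thread.join().unwrap();
}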
I have a process struct, which holds a process handle:
pub struct Process {
    process: Child,
    init: bool,
}
I have a separate function where I can 'talk' to the engine.
fn talk_to_engine(&mut self, input: &String) -> String {
    let stdin = &mut self.process.stdin.as_mut().unwrap();
    stdin.write_all(input.as_bytes()).expect("Failed to write to process.");

    let mut s = String::new();
    return match self.process.stdout.as_mut().unwrap().read_to_string(&mut s) {
        Err(why) => panic!("stdout error: {}", why),
        Ok(_) => s,
    }
}
Yet when I run the function, I get a blinking cursor in the terminal and it does nothing.
EDIT: I call the init_engine function, which in turn calls the above-mentioned function:
/// Initialize the engine.
pub fn init_engine(&mut self, _protocol: String, depth: String) {
    //Stockfish::talk_to_engine(self, &protocol);
    let output = Stockfish::talk_to_engine(self, &format!("go depth {}", &depth));
    print!("{:?}", output);
    self.init = true;
}
If you call init_engine, let's say, like this: struct.init_engine("uci".to_string(), "1".to_string());
Without a full reproduction case, or even knowing what the input and the subprocess are, it's impossible to know and hard to guess, especially as you apparently didn't even try to find out what exactly was blocking.
But there are two possible problem points I can see here:
The driver only reads the output once all input has been consumed; if the subprocess interleaves reading and writing, it could fill the output pipe's buffer entirely and then block forever writing to stdout, basically deadlocking.
read_to_string reads the entirety of the stream, meaning the subprocess must write everything out and terminate, or at least close its stdout; otherwise more output remains possible and the driver will keep waiting for it.
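As an illustration only (not part of the answer, and assuming the engine speaks UCI, where a go command ends with a bestmove line), one way to sidestep both issues is to send a newline-terminated command, flush, and read line by line up to that terminator instead of calling read_to_string:

use std::io::{BufRead, BufReader, Write};
use std::process::Child;

// hypothetical helper; the `bestmove` terminator is an assumption about the
// engine's protocol, not something stated in the question
fn talk_to_engine(process: &mut Child, input: &str) -> String {
    let stdin = process.stdin.as_mut().expect("child stdin not piped");
    writeln!(stdin, "{}", input).expect("failed to write to engine"); // note the newline
    stdin.flush().expect("failed to flush engine stdin");

    let stdout = process.stdout.as_mut().expect("child stdout not piped");
    let mut reader = BufReader::new(stdout);
    let mut output = String::new();
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line).expect("stdout error") == 0 {
            break; // EOF: the engine closed its stdout
        }
        output.push_str(&line);
        if line.starts_with("bestmove") {
            break; // the engine has finished answering this `go` command
        }
    }
    output
}

In a real wrapper you would keep one BufReader alive across calls (building a fresh one per call can discard buffered bytes), but the point stands: never wait for EOF from a subprocess that is meant to stay alive.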
The second run of foo() will crash without an error message. When I remove this unsafe line, it works ok.
use std::process::{Command, Stdio};
use std::os::unix::io::FromRawFd;

fn foo() {
    let mut p = Command::new("ls");
    unsafe { p.stdout(Stdio::from_raw_fd(2)) };
    let mut child = p.spawn().expect("spawn error");
    child.wait().expect("wait error");
    println!("process: {:?}", p);
}

fn main() {
    foo();
    foo();
}
It seems the unsafe code here has some issue. Maybe it's not releasing some resource?
Is there a way to do the stdout -> stderr redirection without using unsafe code?
Stdio::from_raw_fd(2) gives ownership of file descriptor 2 to the newly constructed Stdio object. Stdio's destructor will close the file descriptor. Stdio's destructor will run when the Command is dropped, because the Command owns the Stdio.
Of course, the reason you're not getting any output on the second call to foo is that standard error has been closed!
The simple solution would be to duplicate file descriptor 2 and pass the duplicate to Stdio::from_raw_fd.
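A sketch of that fix, assuming the libc crate is available for the dup call; the duplicate, not fd 2 itself, is what gets closed when the Command is dropped:

use std::os::unix::io::FromRawFd;
use std::process::{Command, Stdio};

fn foo() {
    let mut p = Command::new("ls");
    // duplicate stderr; the Stdio owned by the Command closes this duplicate
    // on drop, leaving the real fd 2 untouched
    let stderr_dup = unsafe { libc::dup(2) };
    assert!(stderr_dup >= 0, "dup(2) failed");
    unsafe { p.stdout(Stdio::from_raw_fd(stderr_dup)) };
    let mut child = p.spawn().expect("spawn error");
    child.wait().expect("wait error");
}

fn main() {
    foo();
    foo(); // now produces output: stderr was never closed
}

Whether this can be written entirely without unsafe depends on your toolchain; newer standard libraries expose owned file-descriptor types that can express the same duplication, but the duplicated descriptor is the essential part either way.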
In Rust, a panic terminates the current thread but is not sent back to the main thread. The solution we are told is to use join. However, this blocks the currently executing thread. So if my main thread spawns 2 threads, I cannot join both of them and immediately get a panic back.
let jh1 = thread::spawn(|| { println!("thread 1"); sleep(1000000); });
let jh2 = thread::spawn(|| { panic!("thread 2") });

In the above, if I join on thread 1 and then on thread 2, I will be waiting for thread 1 before ever receiving a panic from either thread.
Although in some cases I desire the current behavior, my goal is to default to Go's behavior where I can spawn a thread and have it panic on that thread and then immediately end the main thread. (The Go specification also documents a protect function, so it is easy to achieve Rust behavior in Go).
Updated for Rust 1.10+, see revision history for the previous version of the answer
Good point: in Go the main thread doesn't get unwound, the program just crashes, but the original panic is reported. This is in fact the behavior I want (although ideally resources would get cleaned up properly everywhere).
You can achieve this with the recently stabilized std::panic::set_hook() function. With it, you can set a hook which prints the panic info and then exits the whole process, something like this:
use std::thread;
use std::panic;
use std::process;

fn main() {
    // take_hook() returns the default hook in case when a custom one is not set
    let orig_hook = panic::take_hook();
    panic::set_hook(Box::new(move |panic_info| {
        // invoke the default handler and exit the process
        orig_hook(panic_info);
        process::exit(1);
    }));

    thread::spawn(move || {
        panic!("something bad happened");
    }).join();

    // this line won't ever be invoked because of process::exit()
    println!("Won't be printed");
}
Try commenting the set_hook() call out, and you'll see that the println!() line gets executed.
However, this approach, due to the use of process::exit(), will not allow resources allocated by other threads to be freed. In fact, I'm not sure the Go runtime allows this either; it likely uses the same approach of aborting the process.
I tried to force my code to stop processing when any of the threads panicked. The only more-or-less clean solution without using unstable features was to implement the Drop trait on some struct. This can lead to a resource leak, but in my scenario I'm OK with that.
use std::process;
use std::thread;
use std::time::Duration;

static THREAD_ERROR_CODE: i32 = 0x1;
static NUM_THREADS: u32 = 17;
static PROBE_SLEEP_MILLIS: u64 = 500;

struct PoisonPill;

impl Drop for PoisonPill {
    fn drop(&mut self) {
        if thread::panicking() {
            println!("dropped while unwinding");
            process::exit(THREAD_ERROR_CODE);
        }
    }
}

fn main() {
    let mut thread_handles = vec![];

    for i in 0..NUM_THREADS {
        thread_handles.push(thread::spawn(move || {
            let b = PoisonPill;
            thread::sleep(Duration::from_millis(PROBE_SLEEP_MILLIS));
            if i % 2 == 0 {
                println!("kill {}", i);
                panic!();
            }
            println!("this is thread number {}", i);
        }));
    }

    for handle in thread_handles {
        let _ = handle.join();
    }
}
No matter how b = PoisonPill leaves its scope, normally or after a panic!, its Drop method kicks in. You can check whether the caller panicked using thread::panicking and take some action; in my case, killing the process.
Looks like exiting the whole process on a panic in any thread is now (Rust 1.62) as simple as adding this to your Cargo.toml:
[profile.release]
panic = 'abort'
[profile.dev]
panic = 'abort'
A panic in a thread then looks like this, with exit code 134:
thread '<unnamed>' panicked at 'panic in thread', src/main.rs:5:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Aborted (core dumped)