Communicate with child process multiple times - rust

I'm trying to make some communications with mongo shell. Unfortunately I receive bytes only the first time (after first show dbs\n), and after that the second show dbs\n call hasn't made. Moreover I don't see the long line ------... (I marked It), some infinite loop is observed (the program does not finish).
What am I doing wrong? I did not find such cases on the Internet when people make communications with child processes multiple times. All such examples is all about send via in once and read via out once, but not multiple times.
fn main() -> Result<(), failure::Error> {
let mut mongod = Command::new("mongo")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn()
.expect("mongo was not started");
let mut inp = mongod.stdin.as_mut().unwrap();
let r = inp.write_all(b"show dbs\n")?;
let r = inp.flush()?;
let mut reader = BufReader::new(mongod.stdout.take().unwrap());
let mut s = String::new();
while let Ok(n) = reader.read_line(&mut s) {
println!("{:?}", n);
}
// !don't see it!
println!("{}", "-------------------------------------------");
let r = inp.write_all(b"show dbs\n")?;
let r = inp.flush()?;
while let Ok(n) = reader.read_line(&mut s) {
println!("{:?}", n);
}
}

Related

"there is no signal driver running, must be called from the context of Tokio runtime" despite running in Runtime::block_on

I'm writing a commandline application using Tokio that controls its lifecycle by listening for keyboard interrupt events (i.e. ctrl + c); however, at the same time it must also monitor the other tasks that are spawned and potentially initiate an early shutdown if any of the tasks panic or otherwise encounter an error. To do this, I have wrapped tokio::select in a while loop that terminates once the application has at least had a chance to safely shut down.
However, as soon as the select block polls the future returned by tokio::signal::ctrl_c, the main thread panics with the following message:
thread 'main' panicked at 'there is no signal driver running, must be called from the context of Tokio runtime'
...which is confusing, because this is all done inside a Runtime::block_on call. I haven't published this application (yet), but the problem can be reproduced with the following code:
use tokio::runtime::Builder;
use tokio::signal;
use tokio::sync::watch;
use tokio::task::JoinSet;
fn main() {
let runtime = Builder::new_multi_thread().worker_threads(2).build().unwrap();
runtime.block_on(async {
let _rt_guard = runtime.enter();
let (ping_tx, mut ping_rx) = watch::channel(0u32);
let (pong_tx, mut pong_rx) = watch::channel(0u32);
let mut tasks = JoinSet::new();
let ping = tasks.spawn(async move {
let mut val = 0u32;
ping_tx.send(val).unwrap();
while val < 10u32 {
pong_rx.changed().await.unwrap();
val = *pong_rx.borrow();
ping_tx.send(val + 1).unwrap();
println!("ping! {}", val + 1);
}
});
let pong = tasks.spawn(async move {
let mut val = 0u32;
while val < 10u32 {
ping_rx.changed().await.unwrap();
val = *ping_rx.borrow();
pong_tx.send(val + 1).unwrap();
println!("pong! {}", val + 1);
}
});
let mut interrupt = Box::pin(signal::ctrl_c());
let mut interrupt_read = false;
while !interrupt_read && !tasks.is_empty() {
tokio::select! {
biased;
_ = &mut interrupt, if !interrupt_read => {
ping.abort();
pong.abort();
interrupt_read = true;
},
_ = tasks.join_next() => {}
}
}
});
}
Rust Playground
This example is a bit contrived, but the important parts are:
I am intentionally using Runtime::block_on() instead of tokio::main as I want to control the number of runtime threads at runtime.
Although, curiously, this example works if rewritten to use tokio::main.
I added let _rt_guard = runtime.enter() to ensure that the runtime context was set, but its presence or absence don't appear to make a difference.
I discovered the answer to this while I was finishing writing up the question, but since I couldn't find the answer by searching on the error message I'll share the answer here.
As I noted, the example I gave works if run from within a function annotated with tokio::main. Looking at the docs, the main macro expands to:
fn main() {
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap()
.block_on(async {
println!("Hello world");
})
}
The example I gave did not call enable_all() on the builder (or, more specifically, enable_io()). Adding this call to the builder chain stops the panic from occurring. The runtime.enter() call I made can also be removed, since it wasn't really doing anything.
Updated example (one of the threads still panics due to a Receiver handle being dropped, but the ctrl_c await no longer panics, which is the key):
use tokio::runtime::Builder;
use tokio::signal;
use tokio::sync::watch;
use tokio::task::JoinSet;
fn main() {
Builder::new_multi_thread()
.worker_threads(2)
.enable_io()
.build()
.unwrap()
.block_on(async {
let (ping_tx, mut ping_rx) = watch::channel(0u32);
let (pong_tx, mut pong_rx) = watch::channel(0u32);
let mut tasks = JoinSet::new();
let ping = tasks.spawn(async move {
let mut val = 0u32;
ping_tx.send(val).unwrap();
while val < 10u32 {
pong_rx.changed().await.unwrap();
val = *pong_rx.borrow();
ping_tx.send(val + 1).unwrap();
println!("ping! {}", val + 1);
}
});
let pong = tasks.spawn(async move {
let mut val = 0u32;
while val < 10u32 {
ping_rx.changed().await.unwrap();
val = *ping_rx.borrow();
pong_tx.send(val + 1).unwrap();
println!("pong! {}", val + 1);
}
});
let mut interrupt = Box::pin(signal::ctrl_c());
let mut interrupt_read = false;
while !interrupt_read && !tasks.is_empty() {
tokio::select! {
biased;
_ = &mut interrupt, if !interrupt_read => {
ping.abort();
pong.abort();
interrupt_read = true;
},
_ = tasks.join_next() => {}
}
}
});
}
Rust Playground

Why Sequential is faster than parallel?

The code is to count the frequency of each word in an article. In the code, I implemented sequential, muti-thread, and muti-thread with a thread pool.
I test the running time of three methods, however, I found that the sequential method is the fastest one. I use the article (data) at 37423.txt, the code is at play.rust-lang.org.
Below is just the single- and multi version (without the threadpool version):
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::SystemTime;
pub fn word_count(article: &str) -> HashMap<String, i64> {
let now1 = SystemTime::now();
let mut map = HashMap::new();
for word in article.split_whitespace() {
let count = map.entry(word.to_string()).or_insert(0);
*count += 1;
}
let after1 = SystemTime::now();
let d1 = after1.duration_since(now1);
println!("single: {:?}", d1.as_ref().unwrap());
map
}
fn word_count_thread(word_vec: Vec<String>, counts: &Arc<Mutex<HashMap<String, i64>>>) {
let mut p_count = HashMap::new();
for word in word_vec {
*p_count.entry(word).or_insert(0) += 1;
}
let mut counts = counts.lock().unwrap();
for (word, count) in p_count {
*counts.entry(word.to_string()).or_insert(0) += count;
}
}
pub fn mt_word_count(article: &str) -> HashMap<String, i64> {
let word_vec = article
.split_whitespace()
.map(|x| x.to_owned())
.collect::<Vec<String>>();
let count = Arc::new(Mutex::new(HashMap::new()));
let len = word_vec.len();
let q1 = len / 4;
let q2 = len / 2;
let q3 = q1 * 3;
let part1 = word_vec[..q1].to_vec();
let part2 = word_vec[q1..q2].to_vec();
let part3 = word_vec[q2..q3].to_vec();
let part4 = word_vec[q3..].to_vec();
let now2 = SystemTime::now();
let count1 = count.clone();
let count2 = count.clone();
let count3 = count.clone();
let count4 = count.clone();
let handle1 = thread::spawn(move || {
word_count_thread(part1, &count1);
});
let handle2 = thread::spawn(move || {
word_count_thread(part2, &count2);
});
let handle3 = thread::spawn(move || {
word_count_thread(part3, &count3);
});
let handle4 = thread::spawn(move || {
word_count_thread(part4, &count4);
});
handle1.join().unwrap();
handle2.join().unwrap();
handle3.join().unwrap();
handle4.join().unwrap();
let x = count.lock().unwrap().clone();
let after2 = SystemTime::now();
let d2 = after2.duration_since(now2);
println!("muti: {:?}", d2.as_ref().unwrap());
x
}
The result of mine is: single:7.93ms, muti: 15.78ms, threadpool: 25.33ms
I did the separation of the article before calculating time.
I want to know if the code has some problem.
First you may want to know the single-threaded version is mostly parsing whitespace (and I/O, but the file is small so it will be in the OS cache on the second run). At most ~20% of the runtime is the counting that you parallelized. Here is the cargo flamegraph of the single-threaded code:
In the multi-threaded version, it's a mess of thread creation, copying and hashmap overhead. To make sure it's not a "too little data" problem, I've used used 100x your input txt file and I'm measuring a 2x slowdown over the single-threaded version. According to the time command, it uses 2x CPU-time compared to wall-clock, so it seems to do some parallel work. The profile looks like this: (clickable svg version)
I'm not sure what to make of it exactly, but it's clear that memory management overhead has increased a lot. You seem to be copying strings for each hashmap, while an ideal wordcount would probably do zero string copying while counting.
More generally I think it's a bad idea to join the results in the threads - the way you do it (as opposed to a map-reduce pattern) the thread needs a global lock, so you could just pass the results back to the main thread instead for combining. I'm not sure if this is the main problem here, though.
Optimization
To avoid string copying, use HashMap<&str, i64> instead of HashMap<String, i64>. This requires some changes (lifetime annotations and thread::scope()). It makes mt_word_count() about 6x faster compared to the old version.
With a large input I'm measuring now a 4x speedup compared to word_count(). (Which is the best you can hope for with four threads.) The multi-threaded version is now also faster overall, but only by ~20% or so, for the reasons explained above. (Note that the single-threaded baseline has also profited the same &str optimization. Also, many things could still be improved/optimized, but I'll stop here.)
fn word_count_thread<'t>(word_vec: Vec<&'t str>, counts: &Arc<Mutex<HashMap<&'t str, i64>>>) {
let mut p_count = HashMap::new();
for word in word_vec {
*p_count.entry(word).or_insert(0) += 1;
}
let mut counts = counts.lock().unwrap();
for (word, count) in p_count {
*counts.entry(word).or_insert(0) += count;
}
}
pub fn mt_word_count<'t>(article: &'t str) -> HashMap<&'t str, i64> {
let word_vec = article.split_whitespace().collect::<Vec<&str>>();
// (skipping 16 unmodified lines)
let x = thread::scope(|scope| {
let handle1 = scope.spawn(move || {
word_count_thread(part1, &count1);
});
let handle2 = scope.spawn(move || {
word_count_thread(part2, &count2);
});
let handle3 = scope.spawn(move || {
word_count_thread(part3, &count3);
});
let handle4 = scope.spawn(move || {
word_count_thread(part4, &count4);
});
handle1.join().unwrap();
handle2.join().unwrap();
handle3.join().unwrap();
handle4.join().unwrap();
count.lock().unwrap().clone()
});
let after2 = SystemTime::now();
let d2 = after2.duration_since(now2);
println!("muti: {:?}", d2.as_ref().unwrap());
x
}

Rust TcpStream doesn't send data to TcpListener until program is stopped

I'm making a simple chat using Tokio for learning. I've made a server with TcpListener that works fine when I use telnets conexions. For example, if I open 2 terminals with telnet and send data, the server receives the conexion and send the data to the address that is not the sender.
The problem I have is with the TcpStream. For some reason (I think is some type of blocking problem) the stream is not sending the data until I stop the app.
I'd tried multiple configurations, but I think that the simple requiered that is close to "work" is the next one.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let mut stream = TcpStream::connect("127.0.0.1:8080").await?;
println!("INFO: Socket Address: {:?}", stream.local_addr().unwrap());
let (read, mut writer) = stream.split();
let (sender, receiver) = channel();
thread::spawn(move || loop {
let mut line_buf = String::new();
stdin().read_line(&mut line_buf);
let line = line_buf.trim_end().to_string();
if line.len() > 0 {
sender.send(line_buf.trim_end().to_string()).unwrap();
}
});
loop {
let line = receiver.recv().unwrap();
println!("DEBUG User Input {line}");
writer.write_all(line.as_bytes()).await?;
}
P.D: this code doesn't have anything for receive the messages sended by the server. For manage that, I had something like:
tokio::spawn(async move {
let (read, mut writer) = socket.split();
let mut reader = BufReader::new(read);
let mut line = String::new();
loop {
tokio::select! {
result = reader.read_line(&mut line) => {
if result.unwrap() == 0 {
break;
}
print!("DEBUG: Message Received: {line}");
tx.send((line.clone(), addr)).unwrap();
line.clear();
}
result = rx.recv() => {
let (msg, other_addr) = result.unwrap();
println!("INFO: Message Address: {addr}\n\n");
if addr != other_addr {
writer.write_all(msg.as_bytes()).await.unwrap();
}
}
}
}
});

How to create threads in a for loop and get the return value from each?

I am writing a program that pings a set of targets 100 times, and stores each RTT value returned from the ping into a vector, thus giving me a set of RTT values for each target. Say I have n targets, I would like all of the pinging to be done concurrently. The rust code looks like this:
let mut sample_rtts_map = HashMap::new();
for addr in targets.to_vec() {
let mut sampleRTTvalues: Vec<f32> = vec![];
//sample_rtts_map.insert(addr, sampleRTTvalues);
thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
// thread::sleep(Duration::from_millis(5000));
}
});
}
The hashmap is used to tell which vector of values belongs to which target. The problem is, how do I retrieve the updated sampleRTTvalues from each thread after the thread is done executing? I would like something like:
let (name, sampleRTTvalues) = thread::spawn(...)
The name, being the name of the thread, and sampleRTTvalues being the vector. However, since I'm creating threads in a for loop, each thread is being instantiated the same way, so how I differentiate them?
Is there some better way to do this? I've looked into schedulers, future, etc., but it seems my case can just be done with simple threads.
I go the desired behavior with the following code:
use std::thread;
use std::sync::mpsc;
use std::collections::HashMap;
use rand::Rng;
use std::net::{Ipv4Addr,Ipv6Addr,IpAddr};
const RTT_ONE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,0,1));
const RTT_TWO: IpAddr = IpAddr::V6(Ipv6Addr::new(0,0,0,0,0,0,0,1));
const RTT_THREE: IpAddr = IpAddr::V4(Ipv4Addr::new(127,0,1,1));//idk how ip adresses work, forgive if this in invalid but you get the idea
fn ping(address: IpAddr) -> f32 {
rand::thread_rng().gen_range(5.0..107.0)
}
fn main() {
let targets = [RTT_ONE,RTT_TWO,RTT_THREE];
let mut sample_rtts_map: HashMap<IpAddr,Vec<f32>> = HashMap::new();
for addr in targets.into_iter() {
let (sample_values,moved_values) = mpsc::channel();
let mut sampleRTTvalues: Vec<f32> = vec![];
thread::spawn(move || {
while sampleRTTvalues.len() < 100 {
let sampleRTT = ping(addr);
sampleRTTvalues.push(sampleRTT);
//thread::sleep(Duration::from_millis(5000));
}
});
sample_rtts_map.insert(addr,moved_values.recv().unwrap());
}
}
note that the use rand::Rng can be removed when implementing, as it is only so the example works. what this does is pass data from the spawned thread to the main thread, and in the method used it waits until the data is ready before adding it to the hash map. If this is problematic (takes a long time, etc.) then you can use try_recv instead of recv which will add an error / option type that will return a recoverable error if the value is ready when unwrapped, or return the value if it's ready
You can use a std::sync::mpsc channel to collect your data:
use std::collections::HashMap;
use std::sync::mpsc::channel;
use std::thread;
fn ping(_: &str) -> f32 { 0.0 }
fn main() {
let targets = ["a", "b"]; // just for example
let mut sample_rtts_map = HashMap::new();
let (tx, rx) = channel();
for addr in targets {
let tx = tx.clone();
thread::spawn(move || {
for _ in 0..100 {
let sampleRTT = ping(addr);
tx.send((addr, sampleRTT));
}
});
}
drop(tx);
// exit loop when all thread's tx have dropped
while let Ok((addr, sampleRTT)) = rx.recv() {
sample_rtts_map.entry(addr).or_insert(vec![]).push(sampleRTT);
}
println!("sample_rtts_map: {:?}", sample_rtts_map);
}
This will run all pinging threads simultaneously, and collect data in main thread synchronously, so that we can avoid using locks. Do not forget to drop sender in main thread after cloning to all pinging threads, or the main thread will hang forever.

How to save command stdout to a file?

I'm writing a function that has a particular logging requirement. I want to capture the output from a Command::new() call and save it into a file. I'm just using echo here for simplicity's sake.
fn sys_command(id: &str) -> u64 {
let mut cmd = Command::new("echo")
.args(&[id])
.stdout(Stdio::piped())
.spawn()
.expect("failed to echo");
let stdout = cmd.stdout.as_mut().unwrap();
let stdout_reader = BufReader::new(stdout);
// let log_name = format!("./tmp/log/{}.log", id);
// fs::write(log_name, &stdout_reader);
println!("{:?}", stdout_reader.buffer());
cmd.wait().expect("failed to call");
id.parse::<u64>().unwrap()
}
How can I capture the output and save it to a file? I've made a playground here. My println! call returns [].
Even reading to another buffer prints the wrong value. Below, sys_command("10") prints 3. Here's an updated playground.
fn sys_command(id: &str) -> u64 {
let mut buffer = String::new();
let mut cmd = Command::new("echo")
.args(&[id])
.stdout(Stdio::piped())
.spawn()
.expect("failed to echo");
let stdout = cmd.stdout.as_mut().unwrap();
let mut stdout_reader = BufReader::new(stdout);
let result = stdout_reader.read_to_string(&mut buffer);
println!("{:?}", result.unwrap());
cmd.wait().expect("failed to draw");
id.parse::<u64>().unwrap()
}
What am I missing?
Instead of capturing the output in memory and then writing it to a file, you should just redirect the process's output to a file. This is simpler and more efficient.
let log_name = format!("./tmp/log/{}.log", id);
let log = File::create(log_name).expect("failed to open log");
let mut cmd = Command::new("echo")
.args(&[id])
.stdout(log)
.spawn()
.expect("failed to start echo");
cmd.wait().expect("failed to finish echo");
Even reading to another buffer prints the wrong value. Below, sys_command("10") prints 3.
This is unrelated — read_to_string() returns the number of bytes read, not the contents of them, and there are 3 characters: '1' '0' '\n'.

Resources