Requests are 95% faster just from commenting out a print statement? - rust

Main reference: the Rust book.
The modified code for the program listed there is:
use std::sync::mpsc;
use std::thread;
fn main() {
let (tx1, rx) = mpsc::channel();
thread::spawn(move || {
tx1.send("hi2").unwrap();
});
let mut count = 0;
loop {
match rx.try_recv() {
Ok(msg) => {
println!("{:?}", msg);
break;
}
Err(_) => {
// println!("not yet");
count += 1;
}
}
}
print!("{:?}", count)
}
If I comment out the println statement, the count is roughly 1646 or more (it varies on every run), but it is around 10-20 if it is not commented out.
Can someone please explain why?
Update 1: as per the comments, I tried replacing println with std::io::stdout().write(b"not yet").unwrap(); and the count is around 350-380.
If I instead use let mut buffer = std::io::BufWriter::new(std::io::stdout()); buffer.write(b"not yet").unwrap(); the count is around 82.
So my final question is: does it matter, for example for the number of requests per second?

I think there is a bit of a conceptual misunderstanding here: your count code isn't measuring what you think it is measuring. Naturally, printing to the console is more expensive than incrementing a variable, but it doesn't make a big difference in the actual execution time of your program.
It appears you are trying to measure how much time your main thread spends waiting in the loop to receive a message from the second thread that you spawn. Let's say that it takes N microseconds on average for your second thread to start up after you've spawned it and send a message (there's actually the potential for a lot of variance here, as it depends on operating system scheduling and system load). Does it really matter what your main thread is doing during those N microseconds you are waiting for a message? You're either printing to the console and/or incrementing your counter, but either way, your main thread is waiting until it receives the message. The count only tells you how many loop iterations fit into that wait: each iteration takes longer when it prints, so fewer of them fit.
To show this, you can measure time elapsed like so:
use std::sync::mpsc;
fn main() {
let (tx1, rx) = mpsc::channel();
std::thread::spawn(move || {
tx1.send("hi2").unwrap();
});
let mut count = 0;
let t1 = std::time::Instant::now();
loop {
match rx.try_recv() {
Ok(msg) => {
println!("got message: {:?}", msg);
break;
}
Err(_) => {
// delta doesn't change much if this print is commented out
println!("not yet");
count += 1;
}
}
}
let t2 = std::time::Instant::now();
let delta = t2-t1;
println!("time: {}", delta.as_micros());
println!("count {}", count)
}
If you run this code multiple times with and without the println!() in the loop commented out, you will see that although count's value will differ, the delta time will not differ very much.
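If the goal is simply to wait for the message, the busy loop is not needed at all. The following is a minimal sketch, not part of the original program: it swaps try_recv for the blocking recv and times the wait with Instant::elapsed. The helper name wait_for_message is made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Instant;

/// Hypothetical helper: spawns a sender thread, then blocks on `recv`
/// until its message arrives. Returns the message and the wait in microseconds.
fn wait_for_message() -> (&'static str, u128) {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        tx.send("hi2").unwrap();
    });
    let start = Instant::now();
    // `recv` parks the main thread instead of spinning on `try_recv`.
    let msg = rx.recv().unwrap();
    (msg, start.elapsed().as_micros())
}

fn main() {
    let (msg, micros) = wait_for_message();
    println!("got {:?} after {} us", msg, micros);
}
```

With recv the main thread sleeps until the message arrives, so there is no count to compare, only the elapsed time.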

Related

Unable to use asynchronous actors

I'm trying to use actors as documented in the actix documentation, but even the doc example is not working for me. I tried the following code, which compiles but does not print the message "Received fibo message".
use actix::prelude::*;
// #[derive(Message)]
// #[rtype(Result = "Result<u64, ()>")]
// struct Fibonacci(pub u32);
struct Fibonacci(pub u32);
impl Message for Fibonacci {
type Result = Result<u64, ()>;
}
struct SyncActor;
impl Actor for SyncActor {
// It's important to note that you use "SyncContext" here instead of "Context".
type Context = SyncContext<Self>;
}
impl Handler<Fibonacci> for SyncActor {
type Result = Result<u64, ()>;
fn handle(&mut self, msg: Fibonacci, _: &mut Self::Context) -> Self::Result {
println!("Received fibo message");
if msg.0 == 0 {
Err(())
} else if msg.0 == 1 {
Ok(1)
} else {
let mut i = 0;
let mut sum = 0;
let mut last = 0;
let mut curr = 1;
while i < msg.0 - 1 {
sum = last + curr;
last = curr;
curr = sum;
i += 1;
}
Ok(sum)
}
}
}
fn main() {
System::new().block_on(async {
// Start the SyncArbiter with 2 threads, and receive the address of the Actor pool.
let addr = SyncArbiter::start(2, || SyncActor);
// send 5 messages
for n in 5..10 {
// As there are 2 threads, there are at least 2 messages always being processed
// concurrently by the SyncActor.
println!("Sending fibo message");
addr.do_send(Fibonacci(n));
}
});
}
This program prints the following line 5 times:
Sending fibo message
Two remarks. First, I'm unable to use the rtype macro, so I implement Message myself. Second, the line addr.do_send(Fibonacci(n)) does not seem to send anything to my actor. However, if I use addr.send(Fibonacci(n)).await; my message gets sent and is received on the actor side. But since I'm awaiting the send function, it processes the messages one at a time instead of using the 2 threads I have defined.
I also tried to wait with a thread::sleep after my main loop, but the messages did not arrive either.
I might be misunderstanding something, but it seems strange to me.
Cargo.toml file :
[dependencies]
actix = "0.11.1"
actix-rt = "2.2.0"
I finally managed to make it work, though I can't understand exactly why. Simply using tokio to wait for a Ctrl-C made it possible for me to call do_send/try_send and process the messages in parallel.
fn main() {
System::new().block_on(async {
// Start the SyncArbiter with 4 threads, and receive the address of the Actor pool.
let addr = SyncArbiter::start(4, || SyncActor);
// send 15 messages
for n in 5..20 {
// As there are 4 threads, there are at least 4 messages always being processed
// concurrently by the SyncActor.
println!("Sending fibo message");
addr.do_send(Fibonacci(n));
}
// This does not work
//thread::spawn(move || {
// thread::sleep(Duration::from_secs_f32(10f32));
//}).join();
// This made it work
tokio::signal::ctrl_c().await.unwrap();
println!("Ctrl-C received, shutting down");
System::current().stop();
});
}
You don't have to use crate tokio explicitly here. In your loop, just change the last line to addr.send(Fibonacci(n)).await.unwrap(). Method send returns a future and it must be awaited to resolve.

How can I asynchronously read from both stdout and stderr of a subprocess using Tokio? [duplicate]

I'm making a small ncurses application in Rust that needs to communicate with a child process. I already have a prototype written in Common Lisp. I'm trying to rewrite it because CL uses a huge amount of memory for such a small tool.
I'm having some trouble figuring out how to interact with the sub-process.
What I'm currently doing is roughly this:
Create the process:
let mut program = match Command::new(command)
.args(arguments)
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
{
Ok(child) => child,
Err(_) => {
println!("Cannot run program '{}'.", command);
return;
}
};
Pass it to an infinite (until user exits) loop, which reads and handles input and listens for output like this (and writes it to the screen):
fn listen_for_output(program: &mut Child, output_viewer: &TextViewer) {
match program.stdout {
Some(ref mut out) => {
let mut buf_string = String::new();
match out.read_to_string(&mut buf_string) {
Ok(_) => output_viewer.append_string(buf_string),
Err(_) => return,
};
}
None => return,
};
}
The call to read_to_string however blocks the program until the process exits. From what I can see read_to_end and read also seem to block. If I try running something like ls which exits right away, it works, but with something that doesn't exit like python or sbcl it only continues once I kill the subprocess manually.
Based on this answer, I changed the code to use BufReader:
fn listen_for_output(program: &mut Child, output_viewer: &TextViewer) {
match program.stdout.as_mut() {
Some(out) => {
let buf_reader = BufReader::new(out);
for line in buf_reader.lines() {
match line {
Ok(l) => {
output_viewer.append_string(l);
}
Err(_) => return,
};
}
}
None => return,
}
}
However, the problem still remains the same. It will read all lines that are available, and then block. Since the tool is supposed to work with any program, there is no way to guess when the output will end before trying to read. There doesn't appear to be a way to set a timeout for BufReader either.
Streams are blocking by default. TCP/IP streams, filesystem streams, pipe streams, they are all blocking. When you tell a stream to give you a chunk of bytes it will stop and wait till it has the given amount of bytes or till something else happens (an interrupt, an end of stream, an error).
The operating systems are eager to return the data to the reading process, so if all you want is to wait for the next line and handle it as soon as it comes in then the method suggested by Shepmaster in Unable to pipe to or from spawned child process more than once (and also in his answer here) works.
In theory it doesn't have to work, because an operating system is allowed to make the BufReader wait for more data in read, but in practice operating systems prefer early "short reads" to waiting.
This simple BufReader-based approach becomes even more dangerous when you need to handle multiple streams (like the stdout and stderr of a child process) or multiple processes. For example, a BufReader-based approach might deadlock when a child process waits for you to drain its stderr pipe while your process is blocked waiting on its empty stdout.
Similarly, you can't use BufReader when you don't want your program to wait on the child process indefinitely. Maybe you want to display a progress bar or a timer while the child is still working and gives you no output.
You can't use BufReader-based approach if your operating system happens not to be eager in returning the data to the process (prefers "full reads" to "short reads") because in that case a few last lines printed by the child process might end up in a gray zone: the operating system got them, but they're not large enough to fill the BufReader's buffer.
BufReader is limited to what the Read interface allows it to do with the stream, it's no less blocking than the underlying stream is. In order to be efficient it will read the input in chunks, telling the operating system to fill as much of its buffer as it has available.
You might be wondering why reading data in chunks is so important here, why can't the BufReader just read the data byte by byte. The problem is that to read the data from a stream we need the operating system's help. On the other hand, we are not the operating system, we work isolated from it, so as not to mess with it if something goes wrong with our process. So in order to call to the operating system there needs to be a transition to "kernel mode" which might also incur a "context switch". That is why calling the operating system to read every single byte is expensive. We want as few OS calls as possible and so we get the stream data in batches.
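To make the cost argument concrete, here is a small sketch that counts how many times the underlying reader is asked for data. CountingReader and both helper functions are hypothetical names introduced for this illustration, and an in-memory byte slice stands in for a real pipe, so each counted call represents what would be a system call on an OS stream.

```rust
use std::io::{BufReader, Read};

/// Hypothetical wrapper that counts calls to the inner reader's `read`.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Reads `data` one byte at a time directly from the counting reader.
fn count_calls_unbuffered(data: &[u8]) -> usize {
    let mut reader = CountingReader { inner: data, calls: 0 };
    let mut byte = [0u8; 1];
    while reader.read(&mut byte).unwrap() == 1 {}
    reader.calls
}

/// Reads `data` one byte at a time through a BufReader, which fetches
/// from the counting reader in large chunks.
fn count_calls_buffered(data: &[u8]) -> usize {
    let mut buffered = BufReader::new(CountingReader { inner: data, calls: 0 });
    let mut byte = [0u8; 1];
    while buffered.read(&mut byte).unwrap() == 1 {}
    buffered.get_ref().calls
}

fn main() {
    let data = vec![b'x'; 1000];
    println!("unbuffered calls: {}", count_calls_unbuffered(&data));
    println!("buffered calls:   {}", count_calls_buffered(&data));
}
```

The unbuffered loop hits the underlying reader once per byte (plus one final call that reports the end of the data), while the buffered loop only needs a handful of calls.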
To wait on a stream without blocking you'd need a non-blocking stream. MIO promises to have the required non-blocking stream support for pipes, most probably with PipeReader, but I haven't checked it out so far.
The non-blocking nature of a stream should make it possible to read data in chunks regardless of whether the operating system prefers the "short reads" or not. Because non-blocking stream never blocks. If there is no data in the stream it simply tells you so.
In the absence of a non-blocking stream you'll have to resort to spawning threads so that the blocking reads would be performed in a separate thread and thus won't block your primary thread. You might also want to read the stream byte by byte in order to react to the line separator immediately in case the operating system does not prefer the "short reads". Here's a working example: https://gist.github.com/ArtemGr/db40ae04b431a95f2b78.
P.S. Here's an example of a function that lets you monitor the standard output of a program via a shared vector of bytes:
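As a rough sketch of the thread-based approach (separate from the linked gist), each pipe can get its own reader thread that forwards lines over an mpsc channel. The Line enum, forward_lines, and run_and_collect are names invented for this illustration, and the example assumes a Unix-like system where sh is available. It uses BufReader::lines for brevity, so it relies on the short-read behaviour discussed above.

```rust
use std::io::{BufRead, BufReader, Read};
use std::process::{Command, Stdio};
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Hypothetical tag telling the consumer which pipe a line came from.
#[derive(Debug, PartialEq)]
enum Line {
    Out(String),
    Err(String),
}

/// Spawns a thread that forwards every line of `stream` to `tx`.
fn forward_lines<R: Read + Send + 'static>(stream: R, tx: Sender<Line>, wrap: fn(String) -> Line) {
    thread::spawn(move || {
        for line in BufReader::new(stream).lines() {
            match line {
                // Stop if the receiving side has gone away.
                Ok(l) => {
                    if tx.send(wrap(l)).is_err() {
                        break;
                    }
                }
                Err(_) => break,
            }
        }
    });
}

/// Runs `script` with `sh -c` and collects stdout and stderr lines
/// in the order the reader threads delivered them.
fn run_and_collect(script: &str) -> Vec<Line> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .expect("cannot spawn sh");
    let (tx, rx) = mpsc::channel();
    forward_lines(child.stdout.take().expect("no stdout"), tx.clone(), Line::Out);
    forward_lines(child.stderr.take().expect("no stderr"), tx, Line::Err);
    // The iterator ends once both reader threads have dropped their senders.
    let lines: Vec<Line> = rx.iter().collect();
    child.wait().expect("wait failed");
    lines
}

fn main() {
    for line in run_and_collect("echo hello; echo oops 1>&2") {
        println!("{:?}", line);
    }
}
```

In an interactive program the main loop would call rx.try_recv() between input events instead of collecting everything, so it never blocks on the child.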
use std::io::Read;
use std::process::{Command, Stdio};
use std::sync::{Arc, Mutex};
use std::thread;
/// Pipe streams are blocking, we need separate threads to monitor them without blocking the primary thread.
fn child_stream_to_vec<R>(mut stream: R) -> Arc<Mutex<Vec<u8>>>
where
R: Read + Send + 'static,
{
let out = Arc::new(Mutex::new(Vec::new()));
let vec = out.clone();
thread::Builder::new()
.name("child_stream_to_vec".into())
.spawn(move || loop {
let mut buf = [0];
match stream.read(&mut buf) {
Err(err) => {
println!("{}] Error reading from stream: {}", line!(), err);
break;
}
Ok(got) => {
if got == 0 {
break;
} else if got == 1 {
vec.lock().expect("!lock").push(buf[0])
} else {
println!("{}] Unexpected number of bytes: {}", line!(), got);
break;
}
}
}
})
.expect("!thread");
out
}
fn main() {
let mut cat = Command::new("cat")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.expect("!cat");
let out = child_stream_to_vec(cat.stdout.take().expect("!stdout"));
let err = child_stream_to_vec(cat.stderr.take().expect("!stderr"));
let mut stdin = match cat.stdin.take() {
Some(stdin) => stdin,
None => panic!("!stdin"),
};
}
With a couple of helpers I'm using it to control an SSH session:
try_s! (stdin.write_all (b"echo hello world\n"));
try_s! (wait_forˢ (&out, 0.1, 9., |s| s == "hello world\n"));
P.S. Note that await on a read call in async-std is blocking as well. It's just that instead of blocking a system thread it only blocks a chain of futures (essentially a stack-less green thread). The poll_read is the non-blocking interface. In async-std#499 I've asked the developers whether there's a short read guarantee from these APIs.
P.S. There might be a similar concern in Nom: "we would want to tell the IO side to refill according to the parser's result (Incomplete or not)"
P.S. Might be interesting to see how stream reading is implemented in crossterm. For Windows, in poll.rs, they are using the native WaitForMultipleObjects. In unix.rs they are using mio poll.
Tokio's Command
Here is an example of using tokio 0.2:
use std::process::Stdio;
use futures::StreamExt; // 0.3.1
use tokio::{io::BufReader, prelude::*, process::Command}; // 0.2.4, features = ["full"]
#[tokio::main]
async fn main() {
let mut cmd = Command::new("/tmp/slow.bash")
.stdout(Stdio::piped()) // Can do the same for stderr
.spawn()
.expect("cannot spawn");
let stdout = cmd.stdout().take().expect("no stdout");
// Can do the same for stderr
// To print out each line
// BufReader::new(stdout)
// .lines()
// .for_each(|s| async move { println!("> {:?}", s) })
// .await;
// To print out each line *and* collect it all into a Vec
let result: Vec<_> = BufReader::new(stdout)
.lines()
.inspect(|s| println!("> {:?}", s))
.collect()
.await;
println!("All the lines: {:?}", result);
}
Tokio-Threadpool
Here is an example of using tokio 0.1 and tokio-threadpool. We start the process in a thread using the blocking function. We convert that to a stream with stream::poll_fn.
use std::process::{Command, Stdio};
use tokio::{prelude::*, runtime::Runtime}; // 0.1.18
use tokio_threadpool; // 0.1.13
fn stream_command_output(
mut command: Command,
) -> impl Stream<Item = Vec<u8>, Error = tokio_threadpool::BlockingError> {
// Ensure that the output is available to read from and start the process
let mut child = command
.stdout(Stdio::piped())
.spawn()
.expect("cannot spawn");
let mut stdout = child.stdout.take().expect("no stdout");
// Create a stream of data
stream::poll_fn(move || {
// Perform blocking IO
tokio_threadpool::blocking(|| {
// Allocate some space to store anything read
let mut data = vec![0; 128];
// Read 1-128 bytes of data
let n_bytes_read = stdout.read(&mut data).expect("cannot read");
if n_bytes_read == 0 {
// Stdout is done
None
} else {
// Only return as many bytes as we read
data.truncate(n_bytes_read);
Some(data)
}
})
})
}
fn main() {
let output_stream = stream_command_output(Command::new("/tmp/slow.bash"));
let mut runtime = Runtime::new().expect("Unable to start the runtime");
let result = runtime.block_on({
output_stream
.map(|d| String::from_utf8(d).expect("Not UTF-8"))
.fold(Vec::new(), |mut v, s| {
print!("> {}", s);
v.push(s);
Ok(v)
})
});
println!("All the lines: {:?}", result);
}
There are numerous possible tradeoffs that can be made here. For example, always allocating 128 bytes isn't ideal, but it's simple to implement.
Support
For reference, here's slow.bash:
#!/usr/bin/env bash
set -eu
val=0
while [[ $val -lt 10 ]]; do
echo $val
val=$(($val + 1))
sleep 1
done
See also:
How do I synchronously return a value calculated in an asynchronous Future in stable Rust?
If Unix support is sufficient, you can also make the two output streams non-blocking and poll them as you would a TcpStream configured with its set_nonblocking function.
The ChildStdout and ChildStderr returned by spawning a Command each wrap a file descriptor, so you can directly modify the read behavior of these handles to make them non-blocking.
Based on the work of jcreekmore/timeout-readwrite-rs and anowell/nonblock-rs, I use this wrapper to modify the stream handles:
extern crate libc;
use std::io::Read;
use std::os::unix::io::AsRawFd;
use libc::{F_GETFL, F_SETFL, fcntl, O_NONBLOCK};
fn set_nonblocking<H>(handle: &H, nonblocking: bool) -> std::io::Result<()>
where
H: Read + AsRawFd,
{
let fd = handle.as_raw_fd();
let flags = unsafe { fcntl(fd, F_GETFL, 0) };
if flags < 0 {
return Err(std::io::Error::last_os_error());
}
let flags = if nonblocking {
flags | O_NONBLOCK
} else {
flags & !O_NONBLOCK
};
let res = unsafe { fcntl(fd, F_SETFL, flags) };
if res != 0 {
return Err(std::io::Error::last_os_error());
}
Ok(())
}
You can manage the two streams as you would any other non-blocking stream. The following example is based on the polling crate, which makes it really easy to handle read events, and on BufReader for line reading:
use std::process::{Command, Stdio};
use std::path::PathBuf;
use std::io::{BufReader, BufRead};
use std::thread;
extern crate polling;
use polling::{Event, Poller};
fn main() -> Result<(), std::io::Error> {
let path = PathBuf::from("./worker.sh").canonicalize()?;
let mut child = Command::new(path)
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.expect("Failed to start worker");
let handle = thread::spawn({
let stdout = child.stdout.take().unwrap();
set_nonblocking(&stdout, true)?;
let mut reader_out = BufReader::new(stdout);
let stderr = child.stderr.take().unwrap();
set_nonblocking(&stderr, true)?;
let mut reader_err = BufReader::new(stderr);
move || {
let key_out = 1;
let key_err = 2;
let mut out_closed = false;
let mut err_closed = false;
let poller = Poller::new().unwrap();
poller.add(reader_out.get_ref(), Event::readable(key_out)).unwrap();
poller.add(reader_err.get_ref(), Event::readable(key_err)).unwrap();
let mut line = String::new();
let mut events = Vec::new();
loop {
// Wait for at least one I/O event.
events.clear();
poller.wait(&mut events, None).unwrap();
for ev in &events {
// stdout is ready for reading
if ev.key == key_out {
let len = match reader_out.read_line(&mut line) {
Ok(len) => len,
Err(e) => {
println!("stdout read returned error: {}", e);
0
}
};
if len == 0 {
println!("stdout closed (len is null)");
out_closed = true;
poller.delete(reader_out.get_ref()).unwrap();
} else {
print!("[STDOUT] {}", line);
line.clear();
// reload the poller
poller.modify(reader_out.get_ref(), Event::readable(key_out)).unwrap();
}
}
// stderr is ready for reading
if ev.key == key_err {
let len = match reader_err.read_line(&mut line) {
Ok(len) => len,
Err(e) => {
println!("stderr read returned error: {}", e);
0
}
};
if len == 0 {
println!("stderr closed (len is null)");
err_closed = true;
poller.delete(reader_err.get_ref()).unwrap();
} else {
print!("[STDERR] {}", line);
line.clear();
// reload the poller
poller.modify(reader_err.get_ref(), Event::readable(key_err)).unwrap();
}
}
}
if out_closed && err_closed {
println!("Stream closed, exiting process thread");
break;
}
}
}
});
handle.join().unwrap();
Ok(())
}
Additionally, when used with a wrapper over an EventFd, it becomes possible to easily stop the process from another thread without blocking or active polling, using only a single thread.
EDIT: Following my tests, it seems the polling crate automatically sets the polled handles to non-blocking mode. The set_nonblocking function is still useful in case you want to use nix::poll directly.
I have encountered enough use-cases where it was useful to interact with a subprocess over line-delimited text that I wrote a crate for it, interactive_process.
I expect the original problem has long since been solved, but I thought it might be helpful to others.

Why do Rust mutexes not seem to give the lock to the thread that wanted to lock it last?

I wanted to write a program that spawns two threads that lock a Mutex, increase it, print something, and then unlock the Mutex so the other thread can do the same. I added some sleep time to make it more consistent, so I thought the output should be something like:
ping pong ping pong …
but the actual output is pretty random. Most of the time, it is just
ping ping ping … pong
But there's no consistency at all; sometimes there is a “pong” in the middle too.
I was of the belief that mutexes had some kind of way to determine who wanted to lock it last but it doesn’t look like that’s the case.
How does the locking actually work?
How can I get the desired output?
use std::sync::{Arc, Mutex};
use std::{thread, time};
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || loop {
let mut data = data1.lock().unwrap();
thread::sleep(ten_millis);
println!("ping ");
*data += 1;
if *data > 10 {
break;
}
});
let b = thread::spawn(move || loop {
let mut data = data2.lock().unwrap();
thread::sleep(ten_millis);
println!("pong ");
*data += 1;
if *data > 10 {
break;
}
});
a.join().unwrap();
b.join().unwrap();
}
Mutex and RwLock both defer to OS-specific primitives and cannot be guaranteed to be fair. On Windows, they are both implemented with SRW locks which are specifically documented as not fair. I didn't do research for other operating systems but you definitely cannot rely on fairness with std::sync::Mutex, especially if you need this code to be portable.
A possible solution in Rust is the Mutex implementation provided by the parking_lot crate, which provides an unlock_fair method, which is implemented with a fair algorithm.
From the parking_lot documentation:
By default, mutexes are unfair and allow the current thread to re-lock the mutex before another has the chance to acquire the lock, even if that thread has been blocked on the mutex for a long time. This is the default because it allows much higher throughput as it avoids forcing a context switch on every mutex unlock. This can result in one thread acquiring a mutex many more times than other threads.
However in some cases it can be beneficial to ensure fairness by forcing the lock to pass on to a waiting thread if there is one. This is done by using this method instead of dropping the MutexGuard normally.
While parking_lot::Mutex doesn't claim to be fair without specifically using the unlock_fair method, I found that your code produced the same number of pings as pongs, by just making that switch (playground), not even using the unlock_fair method.
Usually mutexes are unlocked automatically, when a guard goes out of scope. To make it unlock fairly, you need to insert this method call before the guard is dropped:
let b = thread::spawn(move || loop {
let mut data = data2.lock();
thread::sleep(ten_millis);
println!("pong ");
*data += 1;
if *data > 10 {
break;
}
MutexGuard::unlock_fair(data);
});
The order of locking the mutex is not guaranteed in any way; it's possible for the first thread to acquire the lock 100% of the time, while the second thread acquires it 0% of the time.
The threads are scheduled by the OS and the following scenario is quite possible:
The OS gives CPU time to the first thread and it acquires the lock.
The OS gives CPU time to the second thread, but the lock is taken, hence it goes to sleep.
The first thread releases the lock, but is still allowed to run by the OS. It goes for another iteration of the loop and re-acquires the lock.
The other thread cannot proceed, because the lock is still taken.
If you give the second thread more time to acquire the lock you will see the expected ping-pong pattern, although there is no guarantee (a bad OS may decide to never give CPU time to some of your threads):
use std::sync::{Arc, Mutex};
use std::{thread, time};
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || loop {
let mut data = data1.lock().unwrap();
*data += 1;
if *data > 10 {
break;
}
drop(data);
thread::sleep(ten_millis);
println!("ping ");
});
let b = thread::spawn(move || loop {
let mut data = data2.lock().unwrap();
*data += 1;
if *data > 10 {
break;
}
drop(data);
thread::sleep(ten_millis);
println!("pong ");
});
a.join().unwrap();
b.join().unwrap();
}
You can verify that by playing with the sleep time. The lower the sleep time, the more irregular the ping-pong alternations will be, and with values as low as 10ms, you may see ping-ping-pong, etc.
Essentially, a solution based on time is bad by design. You can guarantee that "ping" will be followed by "pong" by improving the algorithm. For instance you can print "ping" on odd numbers and "pong" on even numbers:
use std::sync::{Arc, Mutex};
use std::{thread, time};
const MAX_ITER: i32 = 10;
fn main() {
let data1 = Arc::new(Mutex::new(1));
let data2 = data1.clone();
let ten_millis = time::Duration::from_millis(10);
let a = thread::spawn(move || 'outer: loop {
loop {
thread::sleep(ten_millis);
let mut data = data1.lock().unwrap();
if *data > MAX_ITER {
break 'outer;
}
if *data & 1 == 1 {
*data += 1;
println!("ping ");
break;
}
}
});
let b = thread::spawn(move || 'outer: loop {
loop {
thread::sleep(ten_millis);
let mut data = data2.lock().unwrap();
if *data > MAX_ITER {
break 'outer;
}
if *data & 1 == 0 {
*data += 1;
println!("pong ");
break;
}
}
});
a.join().unwrap();
b.join().unwrap();
}
This isn't the best implementation, but I tried to do it with as few modifications as possible to the original code.
You may also consider an implementation with a Condvar, a better solution, in my opinion, as it avoids the busy waiting on the mutex (ps: also removed the code duplication):
use std::sync::{Arc, Mutex, Condvar};
use std::thread;
const MAX_ITER: i32 = 10;
fn main() {
let cv1 = Arc::new((Condvar::new(), Mutex::new(1)));
let cv2 = cv1.clone();
let a = thread::spawn(ping_pong_task("ping", cv1, |x| x & 1 == 1));
let b = thread::spawn(ping_pong_task("pong", cv2, |x| x & 1 == 0));
a.join().unwrap();
b.join().unwrap();
}
fn ping_pong_task<S: Into<String>>(
msg: S,
cv: Arc<(Condvar, Mutex<i32>)>,
check: impl Fn(i32) -> bool) -> impl Fn()
{
let message = msg.into();
move || {
let (condvar, mutex) = &*cv;
let mut value = mutex.lock().unwrap();
loop {
if check(*value) {
println!("{} ", message);
*value += 1;
condvar.notify_all();
}
if *value > MAX_ITER {
break;
}
value = condvar.wait(value).unwrap();
}
}
}
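As an alternative to the Condvar version, strict alternation can also be sketched with a pair of channels that pass a "turn" token back and forth. This swaps the mutex for message passing and is not taken from the answers here; the function name ping_pong and the log channel are assumptions made so the order can be inspected.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: two threads alternate by handing a unit "turn"
/// token to each other. Returns the order in which they logged.
fn ping_pong(rounds: usize) -> Vec<&'static str> {
    let (ping_turn_tx, ping_turn_rx) = mpsc::channel::<()>();
    let (pong_turn_tx, pong_turn_rx) = mpsc::channel::<()>();
    let (log_tx, log_rx) = mpsc::channel();

    let start = ping_turn_tx.clone();
    let ping_log = log_tx.clone();
    let a = thread::spawn(move || {
        for _ in 0..rounds {
            ping_turn_rx.recv().unwrap(); // wait for our turn
            ping_log.send("ping").unwrap();
            pong_turn_tx.send(()).unwrap(); // hand the turn to "pong"
        }
    });
    let b = thread::spawn(move || {
        for _ in 0..rounds {
            pong_turn_rx.recv().unwrap();
            log_tx.send("pong").unwrap();
            // After the last round the ping thread may already be gone,
            // so a failed send is expected and ignored.
            let _ = ping_turn_tx.send(());
        }
    });

    // Give the first turn to "ping"; ignored if rounds == 0.
    let _ = start.send(());
    a.join().unwrap();
    b.join().unwrap();
    // Both log senders were dropped with their threads, so this terminates.
    log_rx.into_iter().collect()
}

fn main() {
    for word in ping_pong(5) {
        println!("{}", word);
    }
}
```

Each thread blocks in recv until the other one hands over the turn, so no sleep or busy waiting is involved.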
I was of the belief that mutexes had some kind of way to determine who wanted to lock it last but it doesn’t look like that’s the case.
Nope. The job of a mutex is just to make the code run as fast as possible. Alternation gives the worst performance because you're constantly blowing out the CPU caches. You are asking for the worst possible implementation of a mutex.
How does the locking actually work?
The scheduler tries to get as much work done as possible. It's your job to write code that only does the work you really want to get done.
How can I get the desired output?
Don't use two threads if you just want to do one thing then something else then the first thing again. Use threads when you don't care about the order in which work is done and just want to get as much work done as possible.

Is there an API to race N threads (or N closures on N threads) to completion?

Given several threads that complete with an Output value, how do I get the first Output that's produced? Ideally while still being able to get the remaining Outputs later in the order they're produced, and bearing in mind that some threads may or may not terminate.
Example:
struct Output(i32);
fn main() {
let mut spawned_threads = Vec::new();
for i in 0..10 {
let join_handle: ::std::thread::JoinHandle<Output> = ::std::thread::spawn(move || {
// pretend to do some work that takes some amount of time
::std::thread::sleep(::std::time::Duration::from_millis(
(1000 - (100 * i)) as u64,
));
Output(i) // then pretend to return the `Output` of that work
});
spawned_threads.push(join_handle);
}
// I can do this to wait for each thread to finish and collect all `Output`s
let outputs_in_order_of_thread_spawning = spawned_threads
.into_iter()
.map(::std::thread::JoinHandle::join)
.collect::<Vec<::std::thread::Result<Output>>>();
// but how would I get the `Output`s in order of completed threads?
}
I could solve the problem myself using a shared queue/channels/similar, but are there built-in APIs or existing libraries which could solve this use case for me more elegantly?
I'm looking for an API like:
fn race_threads<A: Send>(
threads: Vec<::std::thread::JoinHandle<A>>
) -> (::std::thread::Result<A>, Vec<::std::thread::JoinHandle<A>>) {
unimplemented!("so far this doesn't seem to exist")
}
(Rayon's join is the closest I could find, but a) it only races 2 closures rather than an arbitrary number of closures, and b) the thread pool w/ work stealing approach doesn't make sense for my use case of having some closures that might run forever.)
It is possible to solve this use case using pointers from How to check if a thread has finished in Rust? just like it's possible to solve this use case using an MPSC channel, however here I'm after a clean API to race n threads (or failing that, n closures on n threads).
These problems can be solved by using a condition variable:
use std::sync::{Arc, Condvar, Mutex};
#[derive(Debug)]
struct Output(i32);
enum State {
Starting,
Joinable,
Joined,
}
fn main() {
let pair = Arc::new((Mutex::new(Vec::new()), Condvar::new()));
let mut spawned_threads = Vec::new();
let &(ref lock, ref cvar) = &*pair;
for i in 0..10 {
let my_pair = pair.clone();
let join_handle: ::std::thread::JoinHandle<Output> = ::std::thread::spawn(move || {
// pretend to do some work that takes some amount of time
::std::thread::sleep(::std::time::Duration::from_millis(
(1000 - (100 * i)) as u64,
));
let &(ref lock, ref cvar) = &*my_pair;
let mut joinable = lock.lock().unwrap();
joinable[i] = State::Joinable;
cvar.notify_one();
Output(i as i32) // then pretend to return the `Output` of that work
});
lock.lock().unwrap().push(State::Starting);
spawned_threads.push(Some(join_handle));
}
let mut should_stop = false;
while !should_stop {
let locked = lock.lock().unwrap();
let mut locked = cvar.wait(locked).unwrap();
should_stop = true;
for (i, state) in locked.iter_mut().enumerate() {
match *state {
State::Starting => {
should_stop = false;
}
State::Joinable => {
*state = State::Joined;
println!("{:?}", spawned_threads[i].take().unwrap().join());
}
State::Joined => (),
}
}
}
}
(playground link)
I'm not claiming this is the simplest way to do it. The condition variable will awake the main thread every time a child thread is done. The list can show the state of each thread, if one is (about to) finish, it can be joined.
No, there is no such API.
You've already been presented with multiple options to solve your problem:
Use channels
Use a Condvar
Use futures
Sometimes when programming, you have to go beyond sticking pre-made blocks together. This is supposed to be a fun part of programming. I encourage you to embrace it. Go create your ideal API using the components available and publish it to crates.io.
I really don't see what's so terrible about the channels version:
use std::{sync::mpsc, thread, time::Duration};
#[derive(Debug)]
struct Output(i32);
fn main() {
let (tx, rx) = mpsc::channel();
for i in 0..10 {
let tx = tx.clone();
thread::spawn(move || {
thread::sleep(Duration::from_millis((1000 - (100 * i)) as u64));
tx.send(Output(i)).unwrap();
});
}
// Don't hold on to the sender ourselves
// Otherwise the loop would never terminate
drop(tx);
for r in rx {
println!("{:?}", r);
}
}
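Building on the channel version, the kind of API the question asks for can be wrapped up in a few lines. The following is only a sketch under assumptions: the name race, its signature, and the demo function are hypothetical, and it races closures on freshly spawned threads rather than existing JoinHandles.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs each closure on its own thread and returns a
/// receiver that yields `(index, output)` pairs in completion order.
fn race<T, F>(tasks: Vec<F>) -> Receiver<(usize, T)>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    for (index, task) in tasks.into_iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            // A send error only means the receiver was dropped; ignore it.
            let _ = tx.send((index, task()));
        });
    }
    // The original sender is dropped here, so the receiver's iterator ends
    // once every task has finished.
    rx
}

/// Races three sleeping tasks and collects the results as they complete.
fn demo_completion_order() -> Vec<(usize, u64)> {
    let delays = [300u64, 100, 200];
    let tasks: Vec<_> = delays
        .iter()
        .map(|&ms| {
            move || {
                thread::sleep(Duration::from_millis(ms));
                ms
            }
        })
        .collect();
    race(tasks).into_iter().collect()
}

fn main() {
    for (index, output) in demo_completion_order() {
        println!("task {} finished with {}", index, output);
    }
}
```

Calling recv once on the returned receiver gives the first finished result, and later calls give the rest in completion order. A task that never terminates simply never sends, so the caller can stop receiving whenever it has enough results.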

How to daisy chain threads using channels in Rust?

I'm trying to implement the sieve of Eratosthenes in Rust using coroutines as a learning exercise (not homework), and I can't find any reasonable way of connecting each thread to the Receiver and Sender ends of two different channels.
The Receiver is involved in two distinct tasks, namely reporting the highest prime found so far, and supplying further candidate primes for the filter. This is fundamental to the algorithm.
Here is what I would like to do but can't because the Receiver cannot be transferred between threads. Using std::sync::Arc does not appear to help, unsurprisingly.
Please note that I do understand why this doesn't work:
fn main() {
let (basetx, baserx): (Sender<u32>, Receiver<u32>) = channel();
let max_number = 103;
thread::spawn(move|| {
generate_natural_numbers(&basetx, max_number);
});
let oldrx = &baserx;
loop {
// we need the prime in this thread
let prime = match oldrx.recv() {
Ok(num) => num,
Err(_) => { break; 0 }
};
println!("{}",prime);
// create (newtx, newrx) in a deliberately unspecified way
// now we need to pass the receiver off to the sieve thread
thread::spawn(move || {
sieve(oldrx, newtx, prime); // forwards numbers if not divisible by prime
});
oldrx = newrx;
}
}
Equivalent working Go code:
func main() {
channel := make(chan int)
var ok bool = true;
var prime int = 0;
go generate(channel, 103)
for true {
prime, ok = <- channel
if !ok {
break;
}
new_channel := make(chan int)
go sieve(channel, new_channel, prime)
channel = new_channel
fmt.Println(prime)
}
}
What is the best way to deal with a situation like this where a Receiver needs to be handed off to a different thread?
You don't really explain what problem you are having, but your code is close enough:
use std::sync::mpsc::{channel, Sender, Receiver};
use std::thread;
fn generate_numbers(tx: Sender<u8>) {
for i in 2..100 {
tx.send(i).unwrap();
}
}
fn filter(rx: Receiver<u8>, tx: Sender<u8>, val: u8) {
for v in rx {
if v % val != 0 {
tx.send(v).unwrap();
}
}
}
fn main() {
let (base_tx, base_rx) = channel();
thread::spawn(move || {
generate_numbers(base_tx);
});
let mut old_rx = base_rx;
loop {
let num = match old_rx.recv() {
Ok(v) => v,
Err(_) => break,
};
println!("prime: {}", num);
let (new_tx, new_rx) = channel();
thread::spawn(move || {
filter(old_rx, new_tx, num);
});
old_rx = new_rx;
}
}
using coroutines
Danger, Danger, Will Robinson! These are not coroutines; they are full-fledged threads! These are a lot more heavyweight compared to a coroutine.
What is the best way to deal with a situation like this where a Receiver needs to be handed off to a different thread?
Just... transfer ownership of the Receiver to the thread?

Resources