How can I do these operations in parallel? - multithreading

So here's what I'm trying to do:
use std::io::{self, Read, Write};
use std::thread;
use std::time::Duration;
use termion::color;
use termion::event::Key;
use termion::input::TermRead;
use termion::raw::IntoRawMode;
use chrono::{DateTime, TimeZone, Utc};

fn main() {
    // Initialize stdios.
    let stdout = io::stdout();
    let stdout = stdout.lock();
    let mut stdout = stdout.into_raw_mode().unwrap();
    let stdin = termion::async_stdin();
    let mut keys = stdin.keys();

    let period = 30;
    let mut scheduled_time = Utc::now().timestamp() + period;

    loop {
        let now = Utc::now().timestamp();
        if now > scheduled_time {
            // Do some operations. This function needs to be called at a
            // fixed period, e.g. every 30 seconds or every hour.
            foo();
            scheduled_time += period;
        }
        write!(stdout, "Log after foo is done\r\n").unwrap();
        stdout.flush().unwrap();

        // Wait for some fixed time before performing foo again;
        // in this example it is 30 seconds.
        thread::sleep(Duration::from_secs(period as u64 - 1));

        // Check for input from the user in parallel; termion's AsyncReader
        // does not block.
        match keys.next() {
            Some(Ok(Key::Char('q'))) => break,
            _ => (),
        }
    }
}
The first 8-10 lines initialize the stdios.
In my main loop, I want to call a function foo that does some operations, but it needs to be called at a fixed period. That is why I inserted the thread::sleep call: without it, the loop would constantly check the condition (and mostly not call foo), which causes 100% CPU usage all the time.
However, sleeping causes another problem. Say the period is 1 hour: if the user wants to quit the program in the middle of the sleep, it does not quit until the thread wakes (I guess).
I'm very unfamiliar with threads, but I need some idea of how to do this. I know a little about AsyncReader; I guess it creates a thread so that waiting for input does not block the main thread.

The issue is that the line let mut stdout = stdout.into_raw_mode().unwrap(); puts stdout into raw mode, which disables the TTY device's processing of input characters. Commenting out that line (and making the previous stdout definition mutable) will allow a ^C interrupt to kill the process.
Even though it is the stdout device being put into raw mode, the file pointers for stdout and stdin often point to the same file (unless they've been redirected), so putting stdout into raw mode seems to affect stdin as well. The TTY driver in the OS is usually responsible for converting the ^C character into a SIGINT signal, which is sent to your process and normally terminates it.
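To address the original problem directly: one way to keep the periodic schedule without blocking quit is to sleep in short ticks and check both the clock and the keyboard on every tick. Below is a minimal sketch of that approach, reusing the question's setup; the 100 ms tick length is an arbitrary choice, and foo is a stand-in for the periodic work.

use std::io::{self, Write};
use std::thread;
use std::time::Duration;
use termion::event::Key;
use termion::input::TermRead;
use termion::raw::IntoRawMode;
use chrono::Utc;

fn foo() {
    // Stand-in for the periodic work.
}

fn main() {
    let stdout = io::stdout();
    let mut stdout = stdout.lock().into_raw_mode().unwrap();
    let mut keys = termion::async_stdin().keys();

    let period = 30;
    let mut scheduled_time = Utc::now().timestamp() + period;

    loop {
        if Utc::now().timestamp() >= scheduled_time {
            foo();
            write!(stdout, "Log after foo is done\r\n").unwrap();
            stdout.flush().unwrap();
            scheduled_time += period;
        }
        // Handle 'q' promptly even if the period is an hour:
        // AsyncReader returns None when no key is pending, so this
        // check does not block.
        if let Some(Ok(Key::Char('q'))) = keys.next() {
            break;
        }
        // Sleep one short tick instead of the whole period.
        thread::sleep(Duration::from_millis(100));
    }
}

The loop now wakes 10 times a second, which is cheap compared to a busy loop, and the worst-case delay before a quit takes effect is one tick rather than one period.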

Related

CPU time sleep instead of wall-clock time sleep

Currently, I have the following Rust toy program:
use rayon::prelude::*;
use std::{env, thread, time};

/// Sleeps 1 second n times in parallel using rayon
fn rayon_sleep(n: usize) {
    let millis = vec![0; n];
    millis
        .par_iter()
        .for_each(|_| thread::sleep(time::Duration::from_millis(1000)));
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let n = args[1].parse::<usize>().unwrap();

    let now = time::Instant::now();
    rayon_sleep(n);
    println!("rayon: {:?}", now.elapsed());
}
Basically, my program accepts one input argument n. Then, I sleep for 1 second n times. The program executes the sleep tasks in parallel using rayon.
However, this is not exactly what I want. As far as I know, thread::sleep sleeps according to wall-clock time. However, I would like to keep a virtual CPU busy for 1 second in CPU time.
Is there any way to do this?
EDIT
I would like to make this point clear: I don't mind if the OS preempts the tasks. However, if this happens, then I don't want to consider the time the task spends in the ready/waiting queue.
EDIT
This is a simple, illustrative example of what I need to do. In reality, I have to develop a benchmark for a crate that allows defining and simulating models using the DEVS formalism. The benchmark aims to compare DEVS-compliant libraries with each other, and it explicitly says that the models must spend a fixed, known amount of CPU time. That is why I need to make sure of that. Thus, I cannot use a simple busy loop nor simply sleep.
I followed Sven Marnach's suggestions and implemented the following function:
use cpu_time::ThreadTime;
use rayon::prelude::*;
use std::{env, time};

/// Actively spins for 1 second of CPU time, n times in parallel, using rayon
fn rayon_sleep(n: usize) {
    let millis = vec![0; n];
    millis.par_iter().for_each(|_| {
        let duration = time::Duration::from_millis(1000);
        let mut x: u32 = 0;
        let now = ThreadTime::now(); // get current thread time
        while now.elapsed() < duration { // active sleep
            std::hint::black_box(&mut x); // to avoid compiler optimizations
            x = x.wrapping_add(1);
        }
    });
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let n = args[1].parse::<usize>().unwrap();

    let now = time::Instant::now();
    rayon_sleep(n);
    println!("rayon: {:?}", now.elapsed());
}
If I set n to 8, it takes 2 seconds, more or less. I'd expect better performance (1 second, as I have 8 vCPUs), but I guess the overhead corresponds to the OS scheduling policy.
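One plausible contributor (an assumption, not something verified here) is hyperthreading: 8 vCPUs are often 4 physical cores, and two busy threads sharing a core each accrue CPU time at roughly half speed, which alone would explain a factor of about 2. To rule out pool sizing, rayon's global pool can also be configured explicitly; a sketch:

use rayon::ThreadPoolBuilder;

fn main() {
    // rayon's global pool defaults to one worker per logical CPU.
    // Pinning it to the number of physical cores (4 is an assumed value
    // for an 8-vCPU hyperthreaded machine) keeps two spinning threads
    // from sharing one core.
    ThreadPoolBuilder::new()
        .num_threads(4)
        .build_global()
        .unwrap();

    // ... then call rayon_sleep(n) as above ...
}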

Detecting EOF without 0-byte read in Rust

I've been working on some code that reads data from a Read type (the input) in chunks and does some processing on each chunk. The issue is that the final chunk needs to be processed with a different function. As far as I can tell, there are a couple of ways to detect EOF from a Read, but none of them feel particularly ergonomic for this case. I'm looking for a more idiomatic solution.
My current approach is to maintain two buffers, so that the previous read result can be maintained if the next read reads zero bytes, which indicates EOF in this case, since the buffer is of non-zero length:
use std::io::{Read, Result};

const BUF_SIZE: usize = 0x1000;

fn process_stream<I: Read>(mut input: I) -> Result<()> {
    // Stores a chunk of input to be processed
    let mut buf = [0; BUF_SIZE];
    let mut prev_buf = [0; BUF_SIZE];
    let mut prev_read = input.read(&mut prev_buf)?;

    loop {
        let bytes_read = input.read(&mut buf)?;
        if bytes_read == 0 {
            break;
        }
        // Some function which processes the contents of a chunk
        process_chunk(&prev_buf[..prev_read]);
        prev_read = bytes_read;
        prev_buf.copy_from_slice(&buf[..]);
    }
    // Some function used to process the final chunk differently from all
    // other messages
    process_final_chunk(&prev_buf[..prev_read]);

    Ok(())
}
This strikes me as a very ugly way to do this; I shouldn't need two buffers here.
An alternative I can think of would be to impose Seek on input and use input.read_exact(). I could then check for an UnexpectedEof error kind to determine that we've hit the end of input, and seek backwards to read the final chunk again (the seek and re-read are necessary here because the contents of the buffer are undefined in the case of an UnexpectedEof error). But this doesn't seem idiomatic at all: encountering an error, seeking back, and reading again just to detect we're at the end of a file is very strange.
My ideal solution would be something like this, using an imaginary input.feof() function that returns true if the last input.read() call reached EOF, like the feof function in C:
fn process_stream<I: Read>(mut input: I) -> Result<()> {
    // Stores a chunk of input to be processed
    let mut buf = [0; BUF_SIZE];
    let mut bytes_read = 0;

    loop {
        bytes_read = input.read(&mut buf)?;
        if input.feof() {
            break;
        }
        process_chunk(&buf[..bytes_read]);
    }
    process_final_chunk(&buf[..bytes_read]);

    Ok(())
}
Can anyone suggest a way to implement this that is more idiomatic? Thanks!
When read of std::io::Read returns Ok(n), not only does that mean that the buffer buf has been filled in with n bytes of data from this source, but it also indicates that the bytes at index n and after are left untouched. With this in mind, you actually don't need a prev_buf at all, because when n is 0, all bytes of the buffer are left untouched (leaving them as the bytes of the previous read).
prog-fh's solution is what you want to go with for the kind of processing you want to do, because it will only hand off full chunks to process_chunk. With read potentially returning a value between 0 and BUF_SIZE, this is needed. For more info, see this part of the above link:
It is not an error if the returned value n is smaller than the buffer size, even when the reader is not at the end of the stream yet. This may happen for example because fewer bytes are actually available right now (e. g. being close to end-of-file) or because read() was interrupted by a signal.
However, I advise that you think about what should happen when you get an Ok(0) from read that does not represent end of file forever. See this part:
If n is 0, then it can indicate one of two scenarios:
This reader has reached its “end of file” and will likely no longer be able to produce bytes. Note that this does not mean that the reader will always no longer be able to produce bytes.
So if you were to get a sequence of reads that returned Ok(BUF_SIZE), Ok(BUF_SIZE), Ok(0), Ok(BUF_SIZE) (which is entirely possible; it just represents a hitch in the IO), would you want to not count the last Ok(BUF_SIZE) as a read chunk? If you treat Ok(0) as EOF forever, that may be a mistake here.
The only way to reliably determine what should be considered the last chunk is to have the expected length (in bytes, not number of chunks) sent beforehand as part of the protocol. Given a variable expected_len, you could then determine the start index of the last chunk as expected_len - expected_len % BUF_SIZE, with the end index being expected_len itself.
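A sketch of that arithmetic, with expected_len standing in for a hypothetical length received from the protocol (note the edge case where the length is an exact multiple of BUF_SIZE):

const BUF_SIZE: usize = 0x1000;

/// Byte range of the final chunk of a stream of expected_len bytes split
/// into BUF_SIZE chunks. expected_len is assumed to arrive beforehand as
/// part of the protocol, e.g. in a length header.
fn last_chunk_range(expected_len: usize) -> (usize, usize) {
    let mut start = expected_len - expected_len % BUF_SIZE;
    // When the length is an exact multiple of BUF_SIZE the remainder is 0,
    // so step back one chunk rather than report an empty final chunk.
    if start == expected_len && expected_len > 0 {
        start -= BUF_SIZE;
    }
    (start, expected_len)
}

fn main() {
    assert_eq!(last_chunk_range(10240), (8192, 10240)); // 2.5 chunks
    assert_eq!(last_chunk_range(8192), (4096, 8192));   // exact multiple
    assert_eq!(last_chunk_range(100), (0, 100));        // single short chunk
}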
Since you consider read_exact() a possible solution, we can assume that a non-final chunk contains exactly BUF_SIZE bytes.
Then why not just read as much as we can to fill such a buffer and process it with one function, and when that's no longer possible (because EOF is reached), process the incomplete last chunk with another function?
Note that feof() in C does not guess that EOF will be reached on the next read attempt; it just reports the EOF flag that could have been set during the previous read attempt.
Thus, for EOF to be set and feof() to report it, a read attempt returning 0 must have been encountered first (as in the example below).
use std::fs::File;
use std::io::{Read, Result};

const BUF_SIZE: usize = 0x1000;

fn process_chunk(bytes: &[u8]) {
    println!("process_chunk {}", bytes.len());
}

fn process_final_chunk(bytes: &[u8]) {
    println!("process_final_chunk {}", bytes.len());
}

fn process_stream<I: Read>(mut input: I) -> Result<()> {
    // Stores a chunk of input to be processed
    let mut buf = [0; BUF_SIZE];

    loop {
        let mut bytes_read = 0;
        while bytes_read < BUF_SIZE {
            let r = input.read(&mut buf[bytes_read..])?;
            if r == 0 {
                break;
            }
            bytes_read += r;
        }
        if bytes_read == BUF_SIZE {
            process_chunk(&buf);
        } else {
            process_final_chunk(&buf[..bytes_read]);
            break;
        }
    }

    Ok(())
}

fn main() {
    let file = File::open("data.bin").unwrap();
    process_stream(file).unwrap();
}
/*
$ dd if=/dev/random of=data.bin bs=1024 count=10
$ cargo run
process_chunk 4096
process_chunk 4096
process_final_chunk 2048
$ dd if=/dev/random of=data.bin bs=1024 count=8
$ cargo run
process_chunk 4096
process_chunk 4096
process_final_chunk 0
*/

Replacement for std::sync::Semaphore since it is deprecated?

The documentation says it is deprecated. What's the system semaphore? And what's the best replacement for this struct now?
Deprecated since 1.7.0: easily confused with system semaphore and not used enough to pull its weight
System semaphore refers to whatever semaphore the operating system provides. On POSIX (Linux, macOS) these are the functions you get from #include <semaphore.h> (man page). std::sync::Semaphore was implemented in Rust and was separate from the OS's semaphore, although it did use some OS-level synchronization primitives (std::sync::Condvar, which is based on pthread_cond_t on Linux).
std::sync::Semaphore was never stabilized. The source code for Semaphore contains an unstable attribute
#![unstable(feature = "semaphore",
reason = "the interaction between semaphores and the acquisition/release \
of resources is currently unclear",
issue = "27798")]
The issue number in the header specifies the discussion about this feature.
The best replacement within std is either a std::sync::Condvar or a busy loop paired with a std::sync::Mutex. Pick a Condvar over a busy loop if you think you might be waiting more than a few thousand clock cycles.
The documentation for Condvar has a good example of how to use it as a (binary) semaphore
use std::sync::{Arc, Mutex, Condvar};
use std::thread;

let pair = Arc::new((Mutex::new(false), Condvar::new()));
let pair2 = Arc::clone(&pair);

// Inside of our lock, spawn a new thread, and then wait for it to start.
thread::spawn(move || {
    let (lock, cvar) = &*pair2;
    let mut started = lock.lock().unwrap();
    *started = true;
    // We notify the condvar that the value has changed.
    cvar.notify_one();
});

// Wait for the thread to start up.
let (lock, cvar) = &*pair;
let mut started = lock.lock().unwrap();
while !*started {
    started = cvar.wait(started).unwrap();
}
This example could be adapted to work as a counting semaphore by changing Mutex::new(false) to Mutex::new(0) and making a few corresponding changes, as in the sketch below.
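A rough sketch of that adaptation (CountingSemaphore is a hypothetical name, not a drop-in replacement for the removed std type):

use std::sync::{Condvar, Mutex};

// A hypothetical counting semaphore built from a Mutex and a Condvar,
// following the adaptation described above.
struct CountingSemaphore {
    count: Mutex<usize>,
    cvar: Condvar,
}

impl CountingSemaphore {
    fn new(permits: usize) -> Self {
        CountingSemaphore {
            count: Mutex::new(permits),
            cvar: Condvar::new(),
        }
    }

    // Block until a permit is available, then take it.
    fn acquire(&self) {
        let mut count = self.count.lock().unwrap();
        while *count == 0 {
            count = self.cvar.wait(count).unwrap();
        }
        *count -= 1;
    }

    // Return a permit and wake one waiting thread.
    fn release(&self) {
        let mut count = self.count.lock().unwrap();
        *count += 1;
        self.cvar.notify_one();
    }
}

Wrap it in an Arc to share it between threads, acquire before entering the guarded section, and release on the way out.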

How do I print output without a trailing newline in Rust?

The macro println! in Rust always leaves a newline character at the end of each output. For example
println!("Enter the number : ");
io::stdin().read_line(&mut num);
gives the output
Enter the number :
56
I don't want the user's input 56 to be on a new line. How do I do this?
It's trickier than it would seem at first glance. Other answers mention the print! macro, but it's not quite that simple. You'll likely need to flush stdout, as it may not be written to the screen immediately. flush() is a trait method that is part of std::io::Write, so that trait needs to be in scope for it to work (this is an easy mistake to make early on).
use std::io;
use std::io::Write; // <--- bring flush() into scope

fn main() {
    println!("I'm picking a number between 1 and 100...");
    print!("Enter a number: ");
    io::stdout().flush().unwrap();

    let mut val = String::new();
    io::stdin().read_line(&mut val)
        .expect("Error getting guess");

    println!("You entered {}", val);
}
You can use the print! macro instead.
print!("Enter the number : ");
io::stdin().read_line(&mut num);
Beware:
Note that stdout is frequently line-buffered by default so it may be necessary to use io::stdout().flush() to ensure the output is emitted immediately.
Don't use the print!/println! macros; use the write!/writeln! macros instead.
It is more verbose, but print!/println! are problematic in command-line apps whose output may get piped or redirected to other programs, which is typical in Unix environments.
Those macros always use the same stdout device (requested once and buffered), but the system's stdout device changes when output is piped or redirected. So for each write to stdout you should request the current stdout device via std::io::stdout(), which is what the write!/writeln! macros let you do.
So to speak, print!/println! are broken, and there has been an open issue about this for years.
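A sketch of what that looks like in practice; unlike print!, write! returns a Result, so a broken pipe can be propagated with ? instead of causing a panic:

use std::io::{self, Write};

fn main() -> io::Result<()> {
    // Request the current stdout device for each write, as suggested
    // above, instead of relying on a cached handle.
    write!(io::stdout(), "Enter the number : ")?;
    io::stdout().flush()?;

    let mut num = String::new();
    io::stdin().read_line(&mut num)?;
    writeln!(io::stdout(), "You entered {}", num.trim())?;
    Ok(())
}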

std::sync::mpsc::channel always in the same order

No matter how many times I run the program, it always shows the numbers in the same order:
use std::sync::mpsc::channel;
use std::thread;

fn main() {
    let (tx, rx) = channel();
    for i in 0..10 {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(i).unwrap();
        });
    }

    for _ in 0..10 {
        println!("{}", rx.recv().unwrap());
    }
}
Code on the playground. The output is:
6
7
8
5
9
4
3
2
1
0
If I rebuild the project, the sequence will change. Is the sequence decided at compile time?
What order would you expect them to be in? For what it's worth, on my machine I ran the same binary twice and got slightly different results.
Ultimately, this comes down to how your operating system decides to schedule threads. You create 10 new threads and then ask the OS to run each of them when convenient. A hypothetical thread scheduler might look like this:
for thread in threads {
    if thread.runnable() {
        thread.run_for_a_time_slice();
    }
}
Where threads stores the threads in the order they were created. It's unlikely that any OS would be this naïve, but it shows the idea.
In your case, every thread is ready to run immediately, and each is so short that it can run all the way to completion before its time slice is up.
Additionally, there might be some fairness being applied to the lock that guards the channel. Perhaps it always lets the first of multiple competing threads submit a value. Unfortunately, the implementation of channels is reasonably complex, so I can't immediately say if that's the case or not.
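If a deterministic order is actually required, the usual approach is to make it explicit rather than rely on the scheduler: send an index along with each value and reorder on the receiving side. A minimal sketch:

use std::sync::mpsc::channel;
use std::thread;

fn main() {
    let (tx, rx) = channel();
    for i in 0..10 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Tag the payload with its index.
            tx.send((i, i * i)).unwrap();
        });
    }
    drop(tx); // drop the original sender so rx.iter() terminates

    // Collect in arrival order, then sort by index for a fixed order.
    let mut results: Vec<(i32, i32)> = rx.iter().collect();
    results.sort_by_key(|&(i, _)| i);
    for (i, v) in results {
        println!("thread {} sent {}", i, v);
    }
}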
