Await for future again after tokio::time::timeout - rust

Background:
I have a process using tokio::process to spawn child processes with handles in the tokio runtime.
It is also responsible for freeing the resources after killing a child and, according to the documentation (std::process::Child, tokio::process::Child), this requires the parent to wait() (or await in tokio) for the process.
Not all process behave the same to a SIGINT or a SIGTERM, so I wanted to give the child some time to die, before I send a SIGKILL.
Desired solution:
pub async fn kill(self) {
// Close input
std::mem::drop(self.stdin);
// Send gracefull signal
let pid = nix::unistd::Pid::from_raw(self.process.id() as nix::libc::pid_t);
nix::sys::signal::kill(pid, nix::sys::signal::SIGINT);
// Give the process time to die gracefully
if let Err(_) = tokio::time::timeout(std::time::Duration::from_secs(2), self.process).await
{
// Kill forcefully
nix::sys::signal::kill(pid, nix::sys::signal::SIGKILL);
self.process.await;
}
}
However this error is given:
error[E0382]: use of moved value: `self.process`
--> src/bin/multi/process.rs:46:13
|
42 | if let Err(_) = tokio::time::timeout(std::time::Duration::from_secs(2), self.process).await
| ------------ value moved here
...
46 | self.process.await;
| ^^^^^^^^^^^^ value used here after move
|
= note: move occurs because `self.process` has type `tokio::process::Child`, which does not implement the `Copy` trait
And if I obey and remove the self.process.await, I see the child process still taking resources in ps.
Question:
How can I await for an amount of time and perform actions and await again if the amount of time expired?
Note:
I solved my immediate problem by setting a tokio timer that always sends the SIGKILL after two seconds, and having a single self.process.await at the bottom. But this solution is not desirable since another process may spawn in the same PID while the timer is running.
Edit:
Adding a minimal, reproducible example (playground)
async fn delay() {
for _ in 0..6 {
tokio::time::delay_for(std::time::Duration::from_millis(500)).await;
println!("Ping!");
}
}
async fn runner() {
let delayer = delay();
if let Err(_) = tokio::time::timeout(std::time::Duration::from_secs(2), delayer).await {
println!("Taking more than two seconds");
delayer.await;
}
}

You needed to pass a mutable reference. However, you first need to pin the future in order for its mutable reference to implement Future. pin_mut re-exported from the futures crate is a good helper around this:
use futures::pin_mut;
async fn delay() {
for _ in 0..6 {
tokio::time::delay_for(std::time::Duration::from_millis(500)).await;
println!("Ping!");
}
}
async fn runner() {
let delayer = delay();
pin_mut!(delayer);
if let Err(_) = tokio::time::timeout(std::time::Duration::from_secs(2), &mut delayer).await {
println!("Taking more than two seconds");
delayer.await;
}
}

Related

How to loop over thread handles and join if finished, within another loop?

I have a program that creates threads in a loop, and also checks if they have finished and cleans them up if they have. See below for a minimal example:
use std::thread;
fn main() {
let mut v = Vec::<std::thread::JoinHandle<()>>::new();
for _ in 0..10 {
let jh = thread::spawn(|| {
thread::sleep(std::time::Duration::from_secs(1));
});
v.push(jh);
for jh in v.iter_mut() {
if jh.is_finished() {
jh.join().unwrap();
}
}
}
}
This gives the error:
error[E0507]: cannot move out of `*jh` which is behind a mutable reference
--> src\main.rs:13:17
|
13 | jh.join().unwrap();
| ^^^------
| | |
| | `*jh` moved due to this method call
| move occurs because `*jh` has type `JoinHandle<()>`, which does not implement the `Copy` trait
|
note: this function takes ownership of the receiver `self`, which moves `*jh`
--> D:\rust\.rustup\toolchains\stable-x86_64-pc-windows-msvc\lib/rustlib/src/rust\library\std\src\thread\mod.rs:1461:17
|
1461 | pub fn join(self) -> Result<T> {
How can I get the borrow checker to allow this?
JoinHandle::join actually consumes the JoinHandle.
iter_mut(), however, only borrows the elements of the vector and keeps the vector alive. Therefore your JoinHandles are only borrowed, and you cannot call consuming methods on borrowed objects.
What you need to do is to take the ownership of the elements while iterating over the vector, so they can be then consumed by join(). This is achieved by using into_iter() instead of iter_mut().
The second mistake is that you (probably accidentally) wrote the two for loops inside of each other, while they should be independent loops.
The third problem is a little more complex. You cannot check if a thread has finished and then join it the way you did. Therefore I removed the is_finished() check for now and will talk about this further down again.
Here is your fixed code:
use std::thread;
fn main() {
let mut v = Vec::<std::thread::JoinHandle<()>>::new();
for _ in 0..10 {
let jh = thread::spawn(|| {
thread::sleep(std::time::Duration::from_secs(1));
});
v.push(jh);
}
for jh in v.into_iter() {
jh.join().unwrap();
}
}
Reacting to finished threads
This one is harder. If you just want to wait until all of them are finished, the code above is the way to go.
However, if you have to react to finished threads right away, you basically have to set up some kind of event propagation. You don't want to loop over all threads over and over again until they are all finished, because that is something called idle-waiting and consumes a lot of computational power.
So if you want to achieve that there are two problems that have to be dealt with:
join() consumes the JoinHandle(), which would leave behind an incomplete Vec of JoinHandles. This isn't possible, so we need to wrap JoinHandle in a type that can actually be ripped out of the vector partially, like Option.
we need a way to signal to the main thread that a new child thread is finished, so that the main thread doesn't have to continuously iterate over the threads.
All in all this is very complex and tricky to implement.
Here is my attempt:
use std::{
thread::{self, JoinHandle},
time::Duration,
};
fn main() {
let mut v: Vec<Option<JoinHandle<()>>> = Vec::new();
let (send_finished_thread, receive_finished_thread) = std::sync::mpsc::channel();
for i in 0..10 {
let send_finished_thread = send_finished_thread.clone();
let join_handle = thread::spawn(move || {
println!("Thread {} started.", i);
thread::sleep(Duration::from_millis(2000 - i as u64 * 100));
println!("Thread {} finished.", i);
// Signal that we are finished.
// This will wake up the main thread.
send_finished_thread.send(i).unwrap();
});
v.push(Some(join_handle));
}
loop {
// Check if all threads are finished
let num_left = v.iter().filter(|th| th.is_some()).count();
if num_left == 0 {
break;
}
// Wait until a thread is finished, then join it
let i = receive_finished_thread.recv().unwrap();
let join_handle = std::mem::take(&mut v[i]).unwrap();
println!("Joining {} ...", i);
join_handle.join().unwrap();
println!("{} joined.", i);
}
println!("All joined.");
}
Important
This code is just a demonstration. It will deadlock if one of the threads panic. But this shows how complicated that problem is.
It could be solved by utilizing a drop guard, but I think this answer is convoluted enough ;)

Async loop on a new thread in rust: the trait `std::future::Future` is not implemented for `()`

I know this question has been asked many times, but I still can't figure out what to do (more below).
I'm trying to spawn a new thread using std::thread::spawn and then run an async loop inside of it.
The async function I want to run:
#[tokio::main]
pub async fn pull_tweets(pg_pool2: Arc<PgPool>, config2: Arc<Settings>) {
let mut scheduler = AsyncScheduler::new();
scheduler.every(10.seconds()).run(move || {
let arc_pool = pg_pool2.clone();
let arc_config = config2.clone();
async {
pull_from_main(arc_pool, arc_config).await;
}
});
tokio::spawn(async move {
loop {
scheduler.run_pending().await;
tokio::time::sleep(Duration::from_millis(100)).await;
}
});
}
Spawning a thread to run in:
#[actix_web::main]
async fn main() -> std::io::Result<()> {
let handle = thread::spawn(move || async {
pull_tweets(pg_pool2, config2).await;
});
}
The error:
error[E0277]: `()` is not a future
--> src/main.rs:89:9
|
89 | pull_tweets(pg_pool2, config2).await;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `()` is not a future
|
= help: the trait `std::future::Future` is not implemented for `()`
= note: required by `poll`
The last comment here does an amazing job at generalizing the problem. It seems at some point a return value is expected that implements IntoFuture and I don't have that. I tried adding Ok(()) to both the closure and the function, but no luck.
Adding into closure does literally nothing
Adding into async function gives me a new, but similar-sounding error:
`Result<(), ()>` is not a future
Then I noticed that the answer specifically talks about extension functions, which I'm not using. This also talks about extension functions.
Some other SO answers:
This is caused by a missing async.
This and this are reqwest library specific.
So none seem to work. Could someone help me understand 1)why the error exists here and 2)how to fix it?
Note 1: All of this can be easily solved by replacing std::thread::spawn with tokio::task::spawn_blocking. But I'm purposefully experimenting with thread spawn as per this article.
Note 2: Broader context on what I want to achieve: I'm pulling 150,000 tweets from twitter in an async loop. I want to compare 2 implementations: running on the main runtime vs running on separate thread. The latter is where I struggle.
Note 3: in my mind threads and async tasks are two different primitives that don't mix. Ie spawning a new thread doesn't affect how tasks behave and spawning a new task only adds work on existing thread. Please let me know if this worldview is mistaken (and what I can read).
#[tokio::main] converts your function into the following:
#[tokio::main]
pub fn pull_tweets(pg_pool2: Arc<PgPool>, config2: Arc<Settings>) {
let rt = tokio::Runtime::new();
rt.block_on(async {
let mut scheduler = AsyncScheduler::new();
// ...
});
}
Notice that it is a synchronous function, that spawns a new runtime and runs the inner future to completion. You do not await it, it is a separate runtime with it's own dedicate thread pool and scheduler:
#[actix_web::main]
async fn main() -> std::io::Result<()> {
let handle = thread::spawn(move || {
pull_tweets(pg_pool2, config2);
});
}
Note that your original example was wrong in another way:
#[actix_web::main]
async fn main() -> std::io::Result<()> {
let handle = thread::spawn(move || async {
pull_tweets(pg_pool2, config2).await;
});
}
Even if pull_tweets was an async function, the thread would not do anything, as all you do is create another future in the async block. The created future is not executed, because futures are lazy (and there is no executor context in that thread anyways).
I would structure the code to spawn the runtime directly in the new thread, and call any async functions you want from there:
#[actix_web::main]
async fn main() -> std::io::Result<()> {
let handle = thread::spawn(move || {
let rt = tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap();
rt.block_on(async {
pull_tweets(pg_pool2, config2).await;
});
});
}
pub async fn pull_tweets(pg_pool2: Arc<PgPool>, config2: Arc<Settings>) {
// ...
}

How can I make my program sleep for a long time while staying responsive?

I am working on a basic project. I have progressed a lot (first time). And I have encountered an obstacle while trying to make this as a working application.
What I mean by that is, it has a main loop which is infinite. The program will run until user presses q. By the way, the program is console application. And I'm working on Linux.
While the program is working, it needs to do some operation foo in intervals (eg. 1 hour, 30 minutes, 4 hours).
And yes that's it. I know it doesn't sound complicated but I'm stuck. So what I want is, a Stdin which waits for user input without blocking the application and during that time do foo for each 1 hour (let's stick with one hour). But between these foos, I don't want my program spin constantly. Because that causes 100% CPU usage. I want my program work when there is a need for work (eg. When elapsed time is 1 hour or When user presses q).
I used a library for manipulating the terminal. And that library has a feature for Stdin without blocking. And I looked it's implementation I saw that it creates another thread to handle this. Since I have to do other feature on my own. I need some help to figure it out.
For elapsing the time without spinning. I tried sleep main thread. It did help but it made user input not responsive. If the user presses q while main thread is sleeping. It waits for it to wake up. I don't want this. I want it to quit immediately if it is idle. (not doing foo). if it is doing foo, quit after foo is done. By the way foo is not a heavy computation in my code.
fn foo() {
println!("bar");
}
fn main() {
let mut instant = Instant::now();
let one_hour = Duration::from_secs(60 * 60);
let mut stdin = async_stdin().bytes();
loop {
// No need to work
std::thread::sleep(one_hour);
// Just to be sure
if instant.elapsed() > one_hour {
foo();
instant = Instant::now();
}
// Check for input if there is any
match stdin.next() {
Some(Ok(b'q')) => break,
_ => (),
}
}
}
foo stdin foo stdin
| | | |
| | | | <- Press q
| | | |
| | <- exit here
|<- Press q and or |
| exit immediately |
| |
| |
| |
| |
I use Rust for my project but pseudocode is fine to me. I am very new to concepts of multithreading and asynchronous. I know they are not the same. I don't even know which is needed for this. Which synchronization objects do I need? Any tips, references, links will help.
You need to run 2 threads and pass messages about exit and wait them with timeout.
use std::sync::mpsc;
use std::thread;
type Signal = ();
fn work_queue(end_receiver: mpsc::Receiver<Signal>){
loop{
foo();
match end_receiver.recv_timeout(one_hour){
Ok(_) => return, // Received signal
Err(mpsc::RecvTimeoutError::Timeout)=>foo(),
_ => handle_error(),
}
}
}
fn main(){
let (send, recv) = channel();
thread::spawn(move || {
work_queue(recv);
});
loop{
match stdin.next() {
Some(Ok(b'q')) => {
send.send(()).unwrap();
break;
},
_ => continue,
}
}
}
I didn't test this code but I hope that idea is clear.
Here's an example of what you want using tokio.
use std::io::{stdin, BufRead, Result};
use tokio::time::{interval, Duration};
#[tokio::main]
async fn main() -> Result<()> {
tokio::spawn(async {
let mut interval = interval(Duration::from_secs(60 * 60)); // one hour
interval.tick().await; // first tick fires immediately, ignore it
loop {
interval.tick().await;
// do your stuff
}
});
for line in stdin().lock().lines() {
if line? == String::from("q") {
break;
}
}
Ok(())
}
The first bit spawns an async task that loops forever, waiting an hour between ticks. While the second bit just blocks on user input. Entering "q" immediately exits main and destroys the async executor. If the task is in the middle of processing, it will wait for it to reach the next await before exiting.
I am new to rust as well and I was interested in figuring out your question. I learned a lot coming up with this answer. This answer builds on the accepted answer.
Here is a complete and functioning example that addresses a common issue with reading the keyboard: Having to press enter before it reads the bytes()
I have also created multiple (two) event producers for mpsc and added the timer crate to handle timing.
Note: This probably won't run properly in vscode or jetbrains environment terminal as it depends on termion. Run it in a true terminal.
dependencies from cargo.toml
[dependencies]
timer = "0.2.0"
chrono = "0.4.19"
termion = "1.5.6"
main.rs
extern crate timer;
extern crate chrono;
extern crate termion;
use std::thread;
use std::sync::mpsc::channel;
use termion::input::TermRead;
use std::io::{stdout, stdin};
use termion::raw::IntoRawMode;
use termion::event::Key;
enum Events { TimerEvent, QuitEvent }
pub fn main() {
let (tx, rx) = channel();
let timer = timer::Timer::new();
let timer_tx = tx.clone();
let _guard = timer.schedule_repeating(chrono::Duration::seconds(3), move || {
timer_tx.send(Events::TimerEvent).unwrap();
});
thread::spawn(move || {
//This is the trick to reading input one key at a time
stdout().into_raw_mode().unwrap();
for input in stdin().keys() {
if input.unwrap() == Key::Char('q') {
tx.send(Events::QuitEvent).unwrap();
}
}
});
loop {
let value = rx.recv().unwrap();
match value {
Events::TimerEvent => work_fn(),
Events::QuitEvent => return // return from main()
}
}
}
fn work_fn() {
println!("Timer went off! Do work.");
}

Spawn non-static future with Tokio

I have an async method that should execute some futures in parallel, and only return after all futures finished. However, it is passed some data by reference that does not live as long as 'static (it will be dropped at some point in the main method). Conceptually, it's similar to this (Playground):
async fn do_sth(with: &u64) {
delay_for(Duration::new(*with, 0)).await;
println!("{}", with);
}
async fn parallel_stuff(array: &[u64]) {
let mut tasks: Vec<JoinHandle<()>> = Vec::new();
for i in array {
let task = spawn(do_sth(i));
tasks.push(task);
}
for task in tasks {
task.await;
}
}
#[tokio::main]
async fn main() {
parallel_stuff(&[3, 1, 4, 2]);
}
Now, tokio wants futures that are passed to spawn to be valid for the 'static lifetime, because I could drop the handle without the future stopping. That means that my example above produces this error message:
error[E0759]: `array` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
--> src/main.rs:12:25
|
12 | async fn parallel_stuff(array: &[u64]) {
| ^^^^^ ------ this data with an anonymous lifetime `'_`...
| |
| ...is captured here...
...
15 | let task = spawn(do_sth(i));
| ----- ...and is required to live as long as `'static` here
So my question is: How do I spawn futures that are only valid for the current context that I can then wait until all of them completed?
It is not possible to spawn a non-'static future from async Rust. This is because any async function might be cancelled at any time, so there is no way to guarantee that the caller really outlives the spawned tasks.
It is true that there are various crates that allow scoped spawns of async tasks, but these crates cannot be used from async code. What they do allow is to spawn scoped async tasks from non-async code. This doesn't violate the problem above, because the non-async code that spawned them cannot be cancelled at any time, as it is not async.
Generally there are two approaches to this:
Spawn a 'static task by using Arc rather than ordinary references.
Use the concurrency primitives from the futures crate instead of spawning.
Generally to spawn a static task and use Arc, you must have ownership of the values in question. This means that since your function took the argument by reference, you cannot use this technique without cloning the data.
async fn do_sth(with: Arc<[u64]>, idx: usize) {
delay_for(Duration::new(with[idx], 0)).await;
println!("{}", with[idx]);
}
async fn parallel_stuff(array: &[u64]) {
// Make a clone of the data so we can shared it across tasks.
let shared: Arc<[u64]> = Arc::from(array);
let mut tasks: Vec<JoinHandle<()>> = Vec::new();
for i in 0..array.len() {
// Cloning an Arc does not clone the data.
let shared_clone = shared.clone();
let task = spawn(do_sth(shared_clone, i));
tasks.push(task);
}
for task in tasks {
task.await;
}
}
Note that if you have a mutable reference to the data, and the data is Sized, i.e. not a slice, it is possible to temporarily take ownership of it.
async fn do_sth(with: Arc<Vec<u64>>, idx: usize) {
delay_for(Duration::new(with[idx], 0)).await;
println!("{}", with[idx]);
}
async fn parallel_stuff(array: &mut Vec<u64>) {
// Swap the array with an empty one to temporarily take ownership.
let vec = std::mem::take(array);
let shared = Arc::new(vec);
let mut tasks: Vec<JoinHandle<()>> = Vec::new();
for i in 0..array.len() {
// Cloning an Arc does not clone the data.
let shared_clone = shared.clone();
let task = spawn(do_sth(shared_clone, i));
tasks.push(task);
}
for task in tasks {
task.await;
}
// Put back the vector where we took it from.
// This works because there is only one Arc left.
*array = Arc::try_unwrap(shared).unwrap();
}
Another option is to use the concurrency primitives from the futures crate. These have the advantage of working with non-'static data, but the disadvantage that the tasks will not be able to run on multiple threads at the same time.
For many workflows this is perfectly fine, as async code should spend most of its time waiting for IO anyway.
One approach is to use FuturesUnordered. This is a special collection that can store many different futures, and it has a next function that runs all of them concurrently, and returns once the first of them finished. (The next function is only available when StreamExt is imported)
You can use it like this:
use futures::stream::{FuturesUnordered, StreamExt};
async fn do_sth(with: &u64) {
delay_for(Duration::new(*with, 0)).await;
println!("{}", with);
}
async fn parallel_stuff(array: &[u64]) {
let mut tasks = FuturesUnordered::new();
for i in array {
let task = do_sth(i);
tasks.push(task);
}
// This loop runs everything concurrently, and waits until they have
// all finished.
while let Some(()) = tasks.next().await { }
}
Note: The FuturesUnordered must be defined after the shared value. Otherwise you will get a borrow error that is caused by them being dropped in the wrong order.
Another approach is to use a Stream. With streams, you can use buffer_unordered. This is a utility that uses FuturesUnordered internally.
use futures::stream::StreamExt;
async fn do_sth(with: &u64) {
delay_for(Duration::new(*with, 0)).await;
println!("{}", with);
}
async fn parallel_stuff(array: &[u64]) {
// Create a stream going through the array.
futures::stream::iter(array)
// For each item in the stream, create a future.
.map(|i| do_sth(i))
// Run at most 10 of the futures concurrently.
.buffer_unordered(10)
// Since Streams are lazy, we must use for_each or collect to run them.
// Here we use for_each and do nothing with the return value from do_sth.
.for_each(|()| async {})
.await;
}
Note that in both cases, importing StreamExt is important as it provides various methods that are not available on streams without importing the extension trait.
In case of code that uses threads for parallelism, it is possible to avoid copying by extending a lifetime with transmute. An example:
fn main() {
let now = std::time::Instant::now();
let string = format!("{now:?}");
println!(
"{now:?} has length {}",
parallel_len(&[&string, &string]) / 2
);
}
fn parallel_len(input: &[&str]) -> usize {
// SAFETY: this variable needs to be static, because it is passed into a thread,
// but the thread does not live longer than this function, because we wait for
// it to finish by calling `join` on it.
let input: &[&'static str] = unsafe { std::mem::transmute(input) };
let mut threads = vec![];
for txt in input {
threads.push(std::thread::spawn(|| txt.len()));
}
threads.into_iter().map(|t| t.join().unwrap()).sum()
}
It seems reasonable that this should also work for asynchronous code, but I do not know enough about that to say for sure.

Why can't I communicate with a forked child process using Tokio UnixStream?

I'm trying to get a parent process and a child process to communicate with each other using a tokio::net::UnixStream. For some reason the child is unable to read whatever the parent writes to the socket, and presumably the other way around.
The function I have is similar to the following:
pub async fn run() -> Result<(), Error> {
let mut socks = UnixStream::pair()?;
match fork() {
Ok(ForkResult::Parent { .. }) => {
socks.0.write_u32(31337).await?;
Ok(())
}
Ok(ForkResult::Child) => {
eprintln!("Reading from master");
let msg = socks.1.read_u32().await?;
eprintln!("Read from master {}", msg);
Ok(())
}
Err(_) => Err(Error),
}
}
The socket doesn't get closed, otherwise I'd get an immediate error trying to read from socks.1. If I move the read into the parent process it works as expected. The first line "Reading from master" gets printed, but the second line never gets called.
I cannot change the communication paradigm, since I'll be using execve to start another binary that expects to be talking to a socketpair.
Any idea what I'm doing wrong here? Is it something to do with the async/await?
When you call the fork() system call:
The child process is created with a single thread—the one that called fork().
The default executor in tokio is a thread pool executor. The child process will only get one of the threads in the pool, so it won't work properly.
I found I was able to make your program work by setting the thread pool to contain only a single thread, like this:
use tokio::prelude::*;
use tokio::net::UnixStream;
use nix::unistd::{fork, ForkResult};
use nix::sys::wait;
use std::io::Error;
use std::io::ErrorKind;
use wait::wait;
// Limit to 1 thread
#[tokio::main(core_threads = 1)]
async fn main() -> Result<(), Error> {
let mut socks = UnixStream::pair()?;
match fork() {
Ok(ForkResult::Parent { .. }) => {
eprintln!("Writing!");
socks.0.write_u32(31337).await?;
eprintln!("Written!");
wait().unwrap();
Ok(())
}
Ok(ForkResult::Child) => {
eprintln!("Reading from master");
let msg = socks.1.read_u32().await?;
eprintln!("Read from master {}", msg);
Ok(())
}
Err(_) => Err(Error::new(ErrorKind::Other, "oh no!")),
}
}
Another change I had to make was to force the parent to wait for the child to complete, by calling wait() - also something you probably do not want to be doing in a real async program.
Most of the advice I have read that if you need to fork from a threaded program, either do it before creating any threads, or call exec_ve() in the child immediately after forking (which is what you plan to do anyway).

Resources