Why do asynchronous tasks (functions) not run unless awaited? - rust

In Rust I have found that an asynchronous task or function (let's even say a future) is not invoked in the runtime unless it is awaited. In other languages such as C# or NodeJS it is possible to define async tasks and run them concurrently as an async task is meant to provide non-blocking IO. For instance:
public Task Run();
public Task ListenToMusic();
public async Task RunAndListenToMusic() {
Task run = Run(); // the task is already running
Task listenToMusic = ListenToMusic(); // the task is already running
await Task.WhenAll(run, listenToMusic);
}
I tested this mechanism in Rust using a for loop that prints sequential numbers and found that the tasks always execute in order, meaning that the second task runs only after the first one finishes.
For people like me who come from the world of .NET or Java, this is weird behavior. What is actually going on? I searched, but I need someone to explain this in a bit more detail and more simply.

Here's some Rust code that is equivalent to your example:
use tokio; // 1.14.0
async fn task1() {
for i in 0..10 {
println!("Task 1: {}", i);
}
}
async fn task2() {
for i in 0..10 {
println!("Task 2: {}", i);
}
}
#[tokio::main]
async fn main() {
let t1 = task1();
let t2 = task2();
tokio::join!(t1, t2);
}
If you run this code, you will notice that it executes all of task1 before executing task2. This is expected: tokio::join! polls both futures within a single task, so task1 keeps running until it reaches an .await that actually has to wait. However, if we add such waiting points (here I've used sleep, but the same goes for I/O operations):
use std::time::Duration;
use tokio; // 1.14.0
async fn task1() {
for i in 0..10 {
println!("Task 1: {}", i);
tokio::time::sleep(Duration::from_millis(1)).await;
}
}
async fn task2() {
for i in 0..10 {
println!("Task 2: {}", i);
tokio::time::sleep(Duration::from_millis(1)).await;
}
}
#[tokio::main]
async fn main() {
let t1 = task1();
let t2 = task2();
tokio::join!(t1, t2);
}
Now we see that the operations are interleaved: when one task has to wait at an .await, the other tasks get a chance to run, which is the whole point of async programming.
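To see the laziness from the original question in isolation, here is a minimal sketch that uses only the standard library. `poll_once` and `NoopWake` are hypothetical helpers (they are not part of Tokio or std); they stand in for what a runtime does when you `.await` something. Calling `run()` only builds the future, and the body executes when the future is first polled.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The flag flips only when the body of the async fn actually starts executing.
async fn run(started: Arc<AtomicBool>) -> u32 {
    started.store(true, Ordering::SeqCst);
    42
}

// Hypothetical waker that does nothing; enough for a future that never waits.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: poll a future exactly once, as an executor would.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    fut.as_mut().poll(&mut cx)
}

fn main() {
    let started = Arc::new(AtomicBool::new(false));

    // Calling the async fn only creates the future; nothing has run yet.
    let fut = run(Arc::clone(&started));
    println!("after calling run(): started = {}", started.load(Ordering::SeqCst)); // false

    // Polling is what drives the body.
    let result = poll_once(fut);
    println!("after polling: started = {}", started.load(Ordering::SeqCst)); // true
    println!("result = {:?}", result); // Ready(42)
}
```

In C#, calling Run() starts the work right away. The closest Tokio equivalent is handing the future to tokio::spawn, which schedules it immediately instead of waiting for an .await.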

Related

failed to run two threads using #[tokio::main] macro

I am trying to understand how the tokio runtime works. I created two runtimes (on purpose) using the #[tokio::main] macro: the first should execute function a() and the second should execute function b().
I assumed that they would both print "im awake A" and "im awake B" simultaneously forever (since they call async_task, which contains a loop); however, that is not the case: it only prints "im awake A".
Since each runtime has its own thread pool, why are they not running in parallel?
use std::thread;
fn main() {
a();
b();
}
#[tokio::main]
async fn a() {
tokio::spawn(async move { async_task("A".to_string()).await });
}
pub async fn async_task(msg: String) {
loop {
thread::sleep(std::time::Duration::from_millis(1000));
println!("im awake {}", msg);
}
}
#[tokio::main]
async fn b() {
tokio::spawn(async move { async_task("B".to_string()).await });
}
Calling a(); from the synchronous main function will block until a() finishes. Check out the documentation here: https://docs.rs/tokio/1.2.0/tokio/attr.main.html
#[tokio::main]
async fn main() {
println!("Hello world");
}
Equivalent code not using #[tokio::main]
fn main() {
tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()
.unwrap()
.block_on(async {
println!("Hello world");
})
}
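To make the blocking behaviour concrete without depending on Tokio, here is a small sketch. `block_on` and `ThreadWaker` below are hypothetical stand-ins for Runtime::block_on, written with std only, and the loop is finite so that the program terminates. Because each wrapper drives its future on the calling thread until completion, b() cannot start before a() has returned.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// Wakes the thread that is blocked inside `block_on`.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Hypothetical stand-in for Runtime::block_on: drives `fut` on the calling
// thread and does not return until the future is complete.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

type Log = Arc<Mutex<Vec<String>>>;

// Finite version of the question's async_task; records instead of printing.
async fn async_task(msg: &'static str, log: Log) {
    for i in 0..3 {
        log.lock().unwrap().push(format!("{}{}", msg, i));
    }
}

// Like functions annotated with #[tokio::main]: synchronous for the caller.
fn a(log: Log) {
    block_on(async_task("A", log))
}

fn b(log: Log) {
    block_on(async_task("B", log))
}

fn run_sequentially() -> Vec<String> {
    let log = Log::default();
    a(Arc::clone(&log)); // blocks until A is completely done
    b(Arc::clone(&log)); // only starts afterwards
    let entries = log.lock().unwrap().clone();
    entries
}

fn main() {
    println!("{:?}", run_sequentially()); // ["A0", "A1", "A2", "B0", "B1", "B2"]
}
```

With the infinite loop from the question, the first call never returns, which is why "B" never shows up.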
To get your example to work, main() could also spawn 2 threads that run a, b and wait for them to finish:
fn main() {
let t1 = thread::spawn(|| {
a();
});
let t2 = thread::spawn(|| {
b();
});
t1.join().unwrap();
t2.join().unwrap();
}
EDIT:
Note that a() and b() also do not need to use tokio::spawn as they're already executing in their own async runtime.
#[tokio::main]
async fn a() {
async_task("A".to_string()).await
}
#[tokio::main]
async fn b() {
async_task("B".to_string()).await
}
If you use tokio::spawn in a and b, you would need to await the spawned future, but tokio::spawn(task).await is basically the same as just doing task.await.
#[tokio::main] expands to a call to Runtime::block_on(), and as said in its docs (emphasis mine):
This runs the given future on the current thread, blocking until it is complete, and yielding its resolved result.
If you use Runtime::spawn() instead (and make sure not to drop the runtime because it shuts it down), it prints both from A and B correctly:
fn main() {
let _a_runtime = a();
b();
}
fn a() -> tokio::runtime::Runtime {
let runtime = tokio::runtime::Runtime::new().unwrap();
runtime.spawn(async { async_task("A".to_string()).await });
runtime
}
#[tokio::main]
async fn b() {
tokio::spawn(async move { async_task("B".to_string()).await });
}
Take a look at the documentation for the main macro. There's a clue to why this doesn't work there.
Note: This macro can be used on any function and not just the main function. Using it on a non-main function makes the function behave as if it was synchronous by starting a new runtime each time it is called. If the function is called often, it is preferable to create the runtime using the runtime builder so the runtime can be reused across calls.
So you can use it on multiple functions, but each call blocks its caller until that runtime finishes, which means you need to call each one on a separate thread. You could also construct the runtimes manually.
fn main() {
let jh1 = std::thread::spawn(|| a());
let jh2 = std::thread::spawn(|| b());
jh1.join().unwrap();
jh2.join().unwrap();
}
async fn async_task(msg: String) {
loop {
tokio::time::sleep(core::time::Duration::from_secs(1)).await;
println!("I'm awake {}", msg);
}
}
#[tokio::main]
async fn a() {
async_task("a".to_owned()).await
}
#[tokio::main]
async fn b() {
async_task("b".to_owned()).await
}

If "futures do nothing unless awaited", why does `tokio::spawn` work anyway?

I have read here that futures in Rust do nothing unless they are awaited. However, I tried a more complex example, and it is a little unclear why I get the message from the 2nd print in this example, because task::spawn gives me a JoinHandle on which I do not do any .await.
Meanwhile, I tried the same example, but with an await above the 2nd print, and now only the message from the 1st print is printed.
If I wait for all the futures at the end, both messages are printed, which I understand. My question is why I see this behaviour in the previous 2 cases.
use futures::stream::{FuturesUnordered, StreamExt};
use futures::TryStreamExt;
use rand::prelude::*;
use std::collections::VecDeque;
use std::sync::Arc;
use tokio::sync::Semaphore;
use tokio::task::JoinHandle;
use tokio::{task, time};
fn candidates() -> Vec<i32> {
Vec::from([2, 2])
}
async fn produce_event(nanos: u64) -> i32 {
println!("waiting {}", nanos);
time::sleep(time::Duration::from_nanos(nanos)).await;
1
}
async fn f(seconds: i64, semaphore: &Arc<Semaphore>) {
let mut futures = vec![];
for (i, j) in (0..1).enumerate() {
for (i, event) in candidates().into_iter().enumerate() {
let permit = Arc::clone(semaphore).acquire_owned().await;
let secs = 500;
futures.push(task::spawn(async move {
let _permit = permit;
produce_event(500); // 2nd example has an .await here
println!("Event produced at {}", seconds);
}));
}
}
}
#[tokio::main()]
async fn main() {
let semaphore = Arc::new(Semaphore::new(45000));
for _ in 0..1 {
let mut futures: FuturesUnordered<_> = (0..2).map(|moment| f(moment, &semaphore)).collect();
while let Some(item) = futures.next().await {
let () = item;
}
}
}
However, I tried a more complex example, and it is a little unclear why I get the message from the 2nd print in this example, because task::spawn gives me a JoinHandle on which I do not do any .await.
You're spawning tasks. A task is a separate thread of execution which can execute concurrently to the current task, and can be scheduled in parallel.
All the JoinHandle does there is wait for that task to end, it doesn't control the task running.
Meanwhile, I tried the same example, but with an await above the 2nd print, and now only the message from the 1st print is printed.
You spawn a bunch of tasks and make them sleep. Since you don't wait for them to terminate (don't join them) nor is there any sort of sleep in their parent task, once all the tasks have been spawned the loops terminate, you reach the end of the main function and the program terminates.
At this point all the tasks are still sleeping.
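A rough analogy with std::thread may help here. It is only an analogy (Tokio tasks are not OS threads), and the function names below are made up for illustration. The point it shows is the same one as above: spawning starts the work, while the handle is only a way to wait for the result.

```rust
use std::sync::mpsc;
use std::thread;

// The handle is dropped immediately, yet the spawned work still runs:
// we observe its message through a channel without ever joining.
fn spawn_without_join() -> String {
    let (tx, rx) = mpsc::channel();
    drop(thread::spawn(move || {
        tx.send(String::from("Event produced")).unwrap();
    }));
    rx.recv().unwrap()
}

// Joining does not start anything; it only waits for the result.
fn spawn_and_join() -> i32 {
    let handle = thread::spawn(|| 1);
    handle.join().unwrap()
}

fn main() {
    println!("{}", spawn_without_join()); // Event produced
    println!("{}", spawn_and_join()); // 1
}
```

In the question's program the equivalent of joining would be to await every JoinHandle collected in the futures vector before f returns, so that main cannot finish while tasks are still sleeping.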

Task execution pause/resume in Rust async? (tokio)

How can I pause an async task in Rust?
Swift has withCheckedContinuation(function:_:)
that pauses current task and returns saved context that can be resumed at desired time. (a.k.a. call/cc)
tokio has tokio::task::yield_now, but it can resume automatically, so that's not what I'm looking for. I mean "pause" that will never resume without explicit command.
Now I'm looking into tokio manual. It defines several synchronization features in tokio::sync module, but I couldn't find a function to pause a task directly. Am I supposed to use only the synchronization feature to simulate the suspend? Or am I missing something here?
tokio::sync::oneshot can be used for this purpose. It gives you two objects: one a future, and the other a handle you can use to cause the future to resolve. It also conveys a value, but that value can be just () if we don't need it.
In the following runnable example, main() is the task being paused and the task which resumes it is spawned, because that's the simplest thing to do; but in a real application you'd presumably pass on the sender to something else that already exists.
use std::time::Duration;
use tokio::spawn;
use tokio::time::sleep;
use tokio::sync::oneshot;
#[tokio::main]
async fn main() {
println!("one");
let (sender, receiver) = oneshot::channel::<()>();
spawn(async move {
sleep(Duration::from_millis(400)).await;
println!("two");
if let Err(_) = sender.send(()) {
println!("oops, the receiver dropped");
}
});
println!("...wait...");
match receiver.await {
Ok(()) => println!("three"),
Err(_) => println!("oops, the sender dropped"),
}
}
Note that this is not a special feature of the oneshot channel: any future which you can control the resolution of can be used for this purpose. oneshot is appropriate when you want to hand out a specific handle to this one paused task. If you instead wanted many tasks to wake up on a single notification, you could use tokio::sync::watch instead.
I don't know anything built-in, but you can build your own.
You need access to the Waker to unpark a future. You also need to keep track of whether the future has been manually unparked, because futures can be woken by the runtime even if nobody asked for it.
There are various ways to write this code, here is one:
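use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll};
// You can get rid of this `Unpin` bound, if you really want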
pub async fn park(callback: impl FnOnce(Parker) + Unpin) {
enum Park<F> {
FirstTime { callback: F },
SecondTime { unparked: Arc<AtomicBool> },
}
impl<F: FnOnce(Parker) + Unpin> Future for Park<F> {
type Output = ();
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
if let Self::SecondTime { unparked } = &*self {
return if unparked.load(Ordering::SeqCst) {
Poll::Ready(())
} else {
Poll::Pending
};
}
let unparked = Arc::new(AtomicBool::new(false));
let callback = match std::mem::replace(
&mut *self,
Self::SecondTime {
unparked: Arc::clone(&unparked),
},
) {
Self::FirstTime { callback } => callback,
Self::SecondTime { .. } => unreachable!(),
};
callback(Parker {
waker: cx.waker().clone(),
unparked,
});
Poll::Pending
}
}
Park::FirstTime { callback }.await
}
Then you call it like park(|p| { ... }).await.
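The snippet above does not show the Parker type or a caller, so here is a self-contained sketch of how the pieces might fit together. Everything that is not in the answer is an assumption: the shape of `Parker` and its `unpark` method are inferred from the fields that `park` fills in, `Park` is a condensed struct-based variant of the enum above, and `block_on`/`ThreadWaker` are a hypothetical minimal executor so that the example runs without Tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::Duration;

// Assumed shape of `Parker`, inferred from how `park` constructs it.
pub struct Parker {
    waker: Waker,
    unparked: Arc<AtomicBool>,
}

impl Parker {
    // Hypothetical method: mark the future as resumed, then wake its task.
    pub fn unpark(self) {
        self.unparked.store(true, Ordering::SeqCst);
        self.waker.wake();
    }
}

// Condensed variant of the `Park` future: the callback runs on the first poll.
struct Park<F> {
    callback: Option<F>,
    unparked: Arc<AtomicBool>,
}

impl<F: FnOnce(Parker) + Unpin> Future for Park<F> {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if let Some(callback) = self.callback.take() {
            // First poll: hand out the handle and stay suspended.
            callback(Parker {
                waker: cx.waker().clone(),
                unparked: Arc::clone(&self.unparked),
            });
            return Poll::Pending;
        }
        // Later polls: only finish once `unpark` was called explicitly.
        if self.unparked.load(Ordering::SeqCst) {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

pub async fn park(callback: impl FnOnce(Parker) + Unpin) {
    let unparked = Arc::new(AtomicBool::new(false));
    Park { callback: Some(callback), unparked }.await
}

// Hypothetical minimal executor: blocks the current thread on one future.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

// Pauses, is resumed from another thread, and reports the order of events.
fn demo() -> Vec<&'static str> {
    block_on(async {
        let mut events = vec!["before park"];
        let (tx, rx) = mpsc::channel();
        park(move |parker| {
            thread::spawn(move || {
                thread::sleep(Duration::from_millis(50));
                tx.send("unparking").unwrap();
                parker.unpark();
            });
        })
        .await;
        events.push(rx.recv().unwrap());
        events.push("after park");
        events
    })
}

fn main() {
    println!("{:?}", demo()); // ["before park", "unparking", "after park"]
}
```

The flag matters because the executor may poll the future again for unrelated reasons; without it, any stray wake-up would resume the task.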

Messaging between two tokio runtimes inside separate threads

I ran into the kind of a problem described in this question: How can I create a Tokio runtime inside another Tokio runtime without getting the error "Cannot start a runtime from within a runtime"? .
Some good Rust crates don't have an asynchronous executor. I decided to put all calls into such libraries in one thread that tolerates such operations. Another thread should be able to send non-blocking messages using a tokio::sync::mpsc channel.
I have programmed a demo stand to test implementation options. The tokio::spawn call inside each runtime is there to help me understand tokio runtimes and handles in a bit more detail; it is part of the question.
The question.
Please correct me if I misunderstand something further.
There are two tokio runtimes, each launched in its own thread. Calling tokio::spawn inside first_runtime() spawns a task on the first runtime. Calling tokio::spawn inside second_runtime() spawns a task on the second runtime. There is a tokio::sync::mpsc channel between these two tasks. Calling tx.send(...).await does not block the sending thread if the channel buffer is not full, even if the receiving thread is blocked by a thread::sleep() call.
Am I getting everything right? The output of this code tells me that I'm right, but I need confirmation of my reasoning.
use std::thread;
use std::time::Duration;
use tokio::sync::mpsc::{Sender, Receiver, channel}; // 1.12.0
#[tokio::main(worker_threads = 1)]
#[allow(unused_must_use)]
async fn first_runtime(tx: Sender<String>) {
thread::sleep(Duration::from_secs(1));
println!("first thread woke up");
tokio::spawn(async move {
for msg_id in 0..10 {
if let Err(e) = tx.send(format!("message {}", msg_id)).await {
eprintln!("[ERR]: {}", e);
} else {
println!("message {} send", msg_id);
}
}
}).await;
println!("first thread finished");
}
#[tokio::main(worker_threads = 1)]
#[allow(unused_must_use)]
async fn second_runtime(mut rx: Receiver<String>) {
thread::sleep(Duration::from_secs(3));
println!("second thread woke up");
tokio::spawn(async move {
while let Some(msg) = rx.recv().await {
println!("{} received", msg);
}
}).await;
println!("second thread finished");
}
fn main() {
let (tx, rx) = channel::<String>(5);
thread::spawn(move || { first_runtime(tx); });
second_runtime(rx);
}
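The buffering part of this reasoning can be checked in isolation. The sketch below uses std::sync::mpsc::sync_channel as a stand-in for Tokio's bounded channel, which is an assumption on my part: the two differ in that Tokio's send is a future, but both accept messages without any receiver activity until the buffer is full. `fill_channel` is a made-up helper name.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

// Tries to send `attempts` messages while the receiver never reads.
// Returns how many were accepted and how many were rejected as "full".
fn fill_channel(capacity: usize, attempts: usize) -> (usize, usize) {
    // `_rx` stays alive so the channel is not disconnected, but it never receives.
    let (tx, _rx) = sync_channel::<String>(capacity);
    let mut accepted = 0;
    let mut rejected = 0;
    for id in 0..attempts {
        match tx.try_send(format!("message {}", id)) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is still alive"),
        }
    }
    (accepted, rejected)
}

fn main() {
    // Same numbers as the question: buffer of 5, ten messages.
    println!("{:?}", fill_channel(5, 10)); // (5, 5)
}
```

This matches what the demo should show: the first five sends complete while the second thread is still asleep, and the remaining ones wait until the receiver starts draining the channel. Tokio's bounded Sender has a try_send as well, so I believe the same check carries over, but I have not run that variant.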

How can I run a set of functions on a recurring interval without running the same function at the same time using only the standard Rust library?

I would like to use Rust to create a simple scheduler that runs multiple concurrent functions at a defined interval but does not start more if the previous ones haven't finished.
For example, if the defined interval is one second, the scheduler should run the functions and not start more if the previous functions have not returned. The goal is to prevent running the same function multiple times at once.
I created a working example with Go like this:
package main
import (
"fmt"
"sync"
"time"
)
func myFunc(wg *sync.WaitGroup) {
fmt.Printf("now: %+s\n", time.Now())
time.Sleep(3 * time.Second)
wg.Done()
}
func main() {
quit := make(chan bool)
t := time.NewTicker(time.Second)
go func() {
for {
select {
case <-t.C:
var wg sync.WaitGroup
for i := 0; i <= 4; i++ {
wg.Add(1)
go myFunc(&wg)
}
wg.Wait()
fmt.Printf("--- done ---\n\n")
case <-quit:
return
}
}
}()
<-time.After(time.Minute)
close(quit)
}
Since I didn't find something like Go's NewTicker within the Rust standard library, I used Tokio and came up with this:
extern crate futures;
extern crate tokio;
use futures::future::lazy;
use std::{thread, time};
use tokio::prelude::*;
use tokio::timer::Interval;
fn main() {
let task = Interval::new(time::Instant::now(), time::Duration::new(1, 0))
.for_each(|interval| {
println!("Interval: {:?}", interval);
for i in 0..5 {
tokio::spawn(lazy(move || {
println!("I am i: {}", i);
thread::sleep(time::Duration::from_secs(3));
Ok(())
}));
}
Ok(())
})
.map_err(|e| panic!("interval errored; err={:?}", e));
tokio::run(task);
}
The problem I have with this approach is that the tasks don't wait for the previously started functions to finish, so the functions start again regardless of whether earlier ones are still running. I am missing something like Go's sync.WaitGroup here. What could be used to achieve the same results as in the working example?
Is it possible to achieve this by only using the standard library? This is mainly for learning purposes, probably there is a pretty straightforward way of doing it and I could avoid extra complexity.
In the end, I would like to periodically monitor some sites via HTTP (get just the returned status code) but don't query all of them again until I have all the responses.
Since you want concurrency and will only use the standard library, then you basically must use threads.
Here, we start a thread for every function for every iteration of the scheduler loop, allowing them to run in parallel. We then wait for all functions to finish, preventing ever running the same function twice concurrently.
use std::{
thread,
time::{Duration, Instant},
};
fn main() {
let scheduler = thread::spawn(|| {
let wait_time = Duration::from_millis(500);
// Make this an infinite loop
// Or some control path to exit the loop
for _ in 0..5 {
let start = Instant::now();
eprintln!("Scheduler starting at {:?}", start);
let thread_a = thread::spawn(a);
let thread_b = thread::spawn(b);
thread_a.join().expect("Thread A panicked");
thread_b.join().expect("Thread B panicked");
let runtime = start.elapsed();
if let Some(remaining) = wait_time.checked_sub(runtime) {
eprintln!(
"schedule slice has time left over; sleeping for {:?}",
remaining
);
thread::sleep(remaining);
}
}
});
scheduler.join().expect("Scheduler panicked");
}
fn a() {
eprintln!("a");
thread::sleep(Duration::from_millis(100))
}
fn b() {
eprintln!("b");
thread::sleep(Duration::from_millis(200))
}
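If the list of functions is not fixed at two, the same join-everything idea can be written with scoped threads (std::thread::scope, available since Rust 1.63). This is a sketch under my own naming, not code from the question: `run_round` and `run_rounds` are hypothetical helpers, and the scope plays the role of Go's sync.WaitGroup because it returns only after every spawned thread has finished.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

// Runs every job on its own thread and returns only when all have finished.
fn run_round(jobs: &[&(dyn Fn() + Sync)]) {
    thread::scope(|s| {
        for &job in jobs {
            s.spawn(move || job());
        }
    }); // the scope joins all spawned threads before returning
}

// Repeats `run_round`, sleeping for whatever is left of the interval.
// A new round never starts while a job from the previous round is running.
fn run_rounds(jobs: &[&(dyn Fn() + Sync)], interval: Duration, rounds: usize) {
    for _ in 0..rounds {
        let start = Instant::now();
        run_round(jobs);
        if let Some(remaining) = interval.checked_sub(start.elapsed()) {
            thread::sleep(remaining);
        }
    }
}

fn main() {
    let calls = AtomicUsize::new(0);
    let job = || {
        calls.fetch_add(1, Ordering::SeqCst);
        thread::sleep(Duration::from_millis(20));
    };
    // Two jobs per round, three rounds.
    run_rounds(&[&job, &job], Duration::from_millis(50), 3);
    println!("jobs run: {}", calls.load(Ordering::SeqCst)); // jobs run: 6
}
```

If a round takes longer than the interval, checked_sub returns None and the next round starts immediately, which is the same policy as the loop above.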
You could also make use of a Barrier to start each function in a thread once and then synchronize all of them at the end of execution:
use std::{
sync::{Arc, Barrier},
thread,
time::Duration,
};
fn main() {
let scheduler = thread::spawn(|| {
let barrier = Arc::new(Barrier::new(2));
fn with_barrier(barrier: Arc<Barrier>, f: impl Fn()) -> impl Fn() {
move || {
// Make this an infinite loop
// Or some control path to exit the loop
for _ in 0..5 {
f();
barrier.wait();
}
}
}
let thread_a = thread::spawn(with_barrier(barrier.clone(), a));
let thread_b = thread::spawn(with_barrier(barrier.clone(), b));
thread_a.join().expect("Thread A panicked");
thread_b.join().expect("Thread B panicked");
});
scheduler.join().expect("Scheduler panicked");
}
fn a() {
eprintln!("a");
thread::sleep(Duration::from_millis(100))
}
fn b() {
eprintln!("b");
thread::sleep(Duration::from_millis(200))
}
I wouldn't use either of these solutions, personally. I'd find a crate where someone else has written and tested the code I need.
See also:
Does Rust have an equivalent of Python's threading.Timer?
Is there a way to schedule a task at a specific time or with an interval?
How do I emulate a timer inside an object that will periodically mutate the object?
