I am using Abortable from the futures crate to suspend the execution of a Future. Say I have an abortable future whose async function itself awaits other async functions. My question is: if I abort the root Future, are the child Futures aborted instantly at the same time, or are they left dangling?
I read the source code for Abortable, in particular the code for try_poll:
fn try_poll<I>(
mut self: Pin<&mut Self>,
cx: &mut Context<'_>,
poll: impl Fn(Pin<&mut T>, &mut Context<'_>) -> Poll<I>,
) -> Poll<Result<I, Aborted>> {
// Check if the task has been aborted
if self.is_aborted() {
return Poll::Ready(Err(Aborted));
}
// attempt to complete the task
if let Poll::Ready(x) = poll(self.as_mut().project().task, cx) {
return Poll::Ready(Ok(x));
}
// Register to receive a wakeup if the task is aborted in the future
self.inner.waker.register(cx.waker());
// Check to see if the task was aborted between the first check and
// registration.
// Checking with `is_aborted` which uses `Relaxed` is sufficient because
// `register` introduces an `AcqRel` barrier.
if self.is_aborted() {
return Poll::Ready(Err(Aborted));
}
Poll::Pending
}
My understanding is that once abort is called, it will propagate to the downstream Futures in the sense that when the root Future is aborted, it will stop polling its child Future (because Poll::Ready(Err(Aborted)) will be returned), which will in turn stop polling its child. If this reasoning is true, then the effect of calling abort is immediate.
Another argument is that if Future is pull-based, the root node should be invoked first and then propagate to the sub tasks until the leaf one is invoked and aborted (and then go back to root). This means that there is a latency between when the abort method is called and when the leaf Future actually stops polling. Might be relevant, but this blogpost mentions dangling tasks, and I am concerned this is the case.
For example, here is a toy example that I wrote:
use futures::future::{AbortHandle, Abortable};
use tokio::{time::sleep};
use std::{time::{Duration, SystemTime}};
/*
* main
* \
* child
* | \
* | \
* leaf1 leaf2
*/
async fn leaf2() {
println!("This will not be printed")
}
async fn leaf1(s: String) {
println!("[{:?}] ====== in a ======", SystemTime::now());
for i in 0..100000 {
println!("[{:?}] before sleep i is {}", SystemTime::now(), i);
sleep(Duration::from_millis(1)).await;
println!("[{:?}] {}! i is {}", SystemTime::now(), s.clone(), i);
}
}
async fn child(s: String) {
println!("[{:?}] ====== in child ======", SystemTime::now());
leaf1(s.clone()).await;
leaf2().await
}
#[tokio::main]
async fn main() {
let (abort_handle, abort_registration) = AbortHandle::new_pair();
let result_fut = Abortable::new(child(String::from("Hello")), abort_registration);
tokio::spawn(async move {
println!("{:?} ^^^^^ before sleep ^^^^^", SystemTime::now());
sleep(Duration::from_millis(100)).await;
println!("{:?} ^^^^^ after sleep, about to abort ^^^^^", SystemTime::now());
abort_handle.abort();
println!("{:?} ***** operation aborted *****", SystemTime::now());
});
println!("{:?} ====== before main sleeps ======", SystemTime::now());
sleep(Duration::from_millis(5)).await;
println!("{:?} ====== after main wakes up from sleep and now getting results \
======", SystemTime::now());
result_fut.await.unwrap();
}
I am personally leaning more towards the first argument that there is no latency between abortion of the root and abortion of leaf because the leaf doesn't need to know it needs to abort (the leaf only pulls when root tells it to). The example above prints the time the child is executed and the time when the root is aborted. The execution of child is always before the root is aborted, but I am not sure if this can prove that my first argument is true, so I would like to know what y'all think!
Yes. A future only executes when it is polled, and an aborted future is no longer polled, so its child futures are not polled either and execution stops immediately.
Of course, execution will stop only after reaching the next yield point, and spawned tasks using tokio::spawn() will not stop.
Dropping a Future drops all the state within that Future, including its child Futures. Remember that Futures in Rust are state machines. If you have safely dropped a Future's state, then its execution must already have stopped before the drop; otherwise there would be a data race.
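To make the drop behaviour concrete, here is a minimal std-only sketch (not the futures crate itself). The names `Never`, `SetOnDrop`, `leaf`, `root`, and `leaf_dropped_with_root` are made up for illustration. It polls a root future once by hand, then drops it, and checks that a guard living inside the leaf future was dropped too.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical leaf-level future that never completes.
struct Never;
impl Future for Never {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Pending
    }
}

// Sets the shared flag when it is dropped.
struct SetOnDrop(Arc<AtomicBool>);
impl Drop for SetOnDrop {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

async fn leaf(flag: Arc<AtomicBool>) {
    let _guard = SetOnDrop(flag); // lives inside the leaf's state machine
    Never.await;
}

async fn root(flag: Arc<AtomicBool>) {
    leaf(flag).await;
}

fn leaf_dropped_with_root() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let mut fut = Box::pin(root(flag.clone()));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // One poll drives execution down into `leaf`, which then suspends.
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    assert!(!flag.load(Ordering::SeqCst));

    // Dropping the root drops the nested leaf state, including the guard.
    drop(fut);
    flag.load(Ordering::SeqCst)
}

fn main() {
    assert!(leaf_dropped_with_root());
}
```

Nothing here tells the leaf to stop. It simply is never polled again, and its state is destroyed together with its parent.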
Specifically, dropping a tokio JoinHandle conceptually only drops the handle itself; it does nothing to the task that the handle represents. In other words, if you use tokio::spawn(), then whatever Future you put into that task becomes unrelated to your current Future (except that you can receive a result back through the JoinHandle and explicitly call abort). This is why tokio's tasks are described as detached by default.
Let's talk about timing. Abortable does not directly "drop" your Future; it does so indirectly. If you call AbortHandle::abort(), then immediately an atomic boolean flag is flipped and then Waker::wake() is called, but the async executor is unaware of this at that moment. The executor still thinks your Abortable is either progressing or suspended. These two cases should be discussed separately:
Case 1: the executor was still polling your Future at the time of the atomic boolean flip. Then, according to the source code:
If this poll happens to be the one that completes the future, then the Abortable returns a completion.
If this poll returns Pending in the end, then the Abortable has a high chance of returning an aborted error and a small chance of going back into suspension, depending on when the atomic flag flip becomes observable to the polling thread.
If either a completion or an aborted error is returned, the Abortable as a Future is considered completed and eventually(*) gets dropped. When it is dropped, all its child Futures are dropped as well.
If the code is written in a poor style and this poll happens to run a long computation or a blocking call, then unfortunately everything else has to wait however long it takes for this poll to return, and the executor may be starved in the meantime.
Case 2: your Future was suspended at the time of the atomic boolean flip. The Waker::wake() call tells the executor to poll this Abortable at least once at some point in the future. Some time later, the executor gives your Abortable a poll, and the poll returns an aborted error almost immediately. Then the same thing happens: your Abortable as a Future is considered completed and eventually gets dropped.
In summary, no, the abort does not happen at the same time as the drop.
*: I'm not sure when the Future gets dropped after its completion. It would really depend on how Rust compiler transforms the .await point and I haven't looked into it yet.
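The timing described above can be sketched with a simplified stand-in for Abortable. This is an assumption-laden sketch, not the futures crate's implementation: `MiniAbortable`, `CountPolls`, and `abort_takes_effect_on_next_poll` are hypothetical names, and the waker registration and the second flag check from the real try_poll are left out. It shows that flipping the flag does nothing by itself; the abort is only observed at the next poll, and from then on the inner future is not polled.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Inner future that counts how often it is polled and never completes.
struct CountPolls(Arc<AtomicUsize>);
impl Future for CountPolls {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.0.fetch_add(1, Ordering::SeqCst);
        Poll::Pending
    }
}

#[derive(Debug, PartialEq)]
struct Aborted;

// Simplified wrapper: only the "check flag, then poll inner" part.
struct MiniAbortable<F> {
    task: Pin<Box<F>>,
    aborted: Arc<AtomicBool>,
}

impl<F: Future> Future for MiniAbortable<F> {
    type Output = Result<F::Output, Aborted>;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.aborted.load(Ordering::SeqCst) {
            return Poll::Ready(Err(Aborted)); // inner future is not polled
        }
        self.task.as_mut().poll(cx).map(Ok)
    }
}

// Returns (inner polls before abort, inner polls after abort, saw Aborted).
fn abort_takes_effect_on_next_poll() -> (usize, usize, bool) {
    let polls = Arc::new(AtomicUsize::new(0));
    let aborted = Arc::new(AtomicBool::new(false));
    let mut fut = MiniAbortable {
        task: Box::pin(CountPolls(polls.clone())),
        aborted: aborted.clone(),
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    assert!(Pin::new(&mut fut).poll(&mut cx).is_pending());
    let before = polls.load(Ordering::SeqCst);

    // "abort": only a flag flip; nothing is dropped or stopped yet.
    aborted.store(true, Ordering::SeqCst);

    let second = Pin::new(&mut fut).poll(&mut cx);
    let saw_aborted = matches!(second, Poll::Ready(Err(Aborted)));
    (before, polls.load(Ordering::SeqCst), saw_aborted)
}

fn main() {
    assert_eq!(abort_takes_effect_on_next_poll(), (1, 1, true));
}
```

In a real executor the gap between the flag flip and that next poll is the latency discussed above.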
Related
On my runtime, I have the following Rust code:
pub fn reduce(heap: &Heap, prog: &Program, tids: &[usize], root: u64, debug: bool) -> Ptr {
// Halting flag
let stop = &AtomicBool::new(false);
let barr = &Barrier::new(tids.len());
let locs = &tids.iter().map(|x| AtomicU64::new(u64::MAX)).collect::<Vec<AtomicU64>>();
// Spawn a thread for each worker
std::thread::scope(|s| {
for tid in tids {
s.spawn(move || {
reducer(heap, prog, tids, stop, barr, locs, root, *tid, debug);
});
}
});
// Return whnf term ptr
return load_ptr(heap, root);
}
This will spawn many threads, in order to perform a parallel computation. The problem is, the reduce function is called thousands of times, and the overhead of spawning threads can be considerable. When I implemented the same thing in C, I just kept the threads open, and sent a message in order to activate them. In Rust, with the std::thread::scope idiom, I'm not sure how to do so. Is it possible to keep the threads spawned after the first call to reduce, by just modifying that one function? That is, without changing anything else on my code?
Threads spawned using the std::thread::scope API cannot outlive the calling function. For long-running threads you'll need to spawn them using std::thread::spawn.
Once you've made that change rustc will be very upset with you due to lifetime errors because you are sending non-static references into the spawned threads. Fixing those errors will require some changes. That path is long and full of learning.
If you want something that works really well and is simple, consider instead using the excellent rayon crate. Rayon maintains its own global thread pool so you don't have to.
Using rayon will look something like this:
use rayon::prelude::*; // brings par_iter() into scope
tids.par_iter()
.for_each(|tid| {
reducer(heap, prog, tids, stop, barr, locs, root, *tid, debug);
});
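If you do want to go the std::thread::spawn route mentioned earlier in this answer, a rough std-only sketch of "keep the threads open and send them a message" could look like the following. Everything here is hypothetical: the `Pool` type, its methods, and the placeholder work (`root + tid`) stand in for your real `reducer`, and the workers only receive owned data, which is what sidesteps the non-'static reference problem.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

// Long-lived workers: one job channel per worker, one shared result channel.
struct Pool {
    jobs: Vec<Sender<u64>>,
    results: Receiver<u64>,
    handles: Vec<JoinHandle<()>>,
}

impl Pool {
    fn new(workers: usize) -> Pool {
        let (res_tx, results) = channel();
        let mut jobs = Vec::new();
        let mut handles = Vec::new();
        for tid in 0..workers {
            let (job_tx, job_rx) = channel::<u64>();
            let res_tx = res_tx.clone();
            handles.push(thread::spawn(move || {
                // Sleeps in recv until a job arrives; the loop ends once
                // the job sender is dropped.
                for root in job_rx {
                    // Placeholder for the real per-thread work.
                    res_tx.send(root + tid as u64).unwrap();
                }
            }));
            jobs.push(job_tx);
        }
        Pool { jobs, results, handles }
    }

    // Activates every worker once and waits for all of them.
    fn reduce(&self, root: u64) -> u64 {
        for tx in &self.jobs {
            tx.send(root).unwrap();
        }
        (0..self.jobs.len())
            .map(|_| self.results.recv().unwrap())
            .sum()
    }

    fn shutdown(self) {
        drop(self.jobs); // closes the job channels, so the worker loops end
        for handle in self.handles {
            handle.join().unwrap();
        }
    }
}

fn main() {
    let pool = Pool::new(4);
    // The threads are spawned once and reused across calls.
    assert_eq!(pool.reduce(10), 46); // (10+0) + (10+1) + (10+2) + (10+3)
    assert_eq!(pool.reduce(0), 6);
    pool.shutdown();
}
```

Note that this changes the call site (something has to own the `Pool`), so it does not meet the "modify only that one function" wish. That is part of why rayon's global pool is the simpler option.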
This question may relate more to async programming in general than to Rust. But after a lot of googling, there are still some points I think I am missing. Since I am learning Rust, I will put it in Rust terms.
Let me give my understanding of async programming first. After all, this is the basis, and I may be wrong:
To make a program run efficiently, handling tasks concurrently is essential. So threads are used, and a thread can be joined whenever its data is needed. But threads alone are not enough to handle many tasks, as a server must. So a thread pool is used, but how do you fetch data when it is needed without knowing which thread to wait on? So callback functions (cb for short) come up. With a cb, you only need to consider what to do inside the cb. In addition, to reduce CPU overhead, green threads come up.
But what if the async waits need to happen one after another, which leads to "callback hell"? The "future/promise" style comes up, which lets code look like sync code, or like a chain (as in JavaScript). But the code still does not look very nice. Finally, the "async/await" style comes up, as further syntactic sugar over the "future/promise" style. Usually, "async/await" combined with green threads is called a "coroutine", whether it uses one native thread or multiple native threads for the async tasks.
=============================================
As far as I know at this point, the keyword "await" can only be used inside an "async" function, and only an "async" function can be "awaited". But why? And what is it for, given that there is already "async"? Anyway, I tested the code below:
use async_std::{task};
// async fn easy_task() {
// for i in 0..100 {
// dbg!(i);
// }
// println!("finished easy task");
// }
async fn heavy_task(cnt1: i32, cnt2: i32) {
for i in 0..cnt1 {
println!("heavy_task1 cnt:{}", i);
}
println!("heavy task: waiting sub task");
// normal_sub_task(cnt2);
sub_task(cnt2).await;
println!("heavy task: sub task finished");
for i in 0..cnt1 {
println!("heavy_task2 cnt:{}", i);
}
println!("finished heavy task");
}
fn normal_sub_task(cnt: i32) {
println!("normal sub_task: start sub task");
for i in 0..cnt {
println!("normal sub task cnt:{}", i);
}
println!("normal sub_task: finished sub task");
}
async fn sub_task(cnt: i32) {
println!("sub_task: start sub task");
for i in 0..cnt {
println!("sub task cnt:{}", i);
}
println!("sub_task: finished sub task");
}
fn outer_task(cnt: i32) {
for i in 0..cnt {
println!("outer task cnt:{}", i);
}
println!("finished outer task");
}
fn main() {
// let _easy_f = easy_task();
let heavy_f = heavy_task(3000, 500);
let handle = task::spawn(heavy_f);
print!("=================after spawn==============");
outer_task(5000);
// task::join_handle(handle);
task::block_on(handle);
}
The conclusions I drew from the test are:
1. Whether I await the async sub_task or just call normal_sub_task (the sync version) in the middle of the async heavy_task(), the code after it (the heavy loop heavy_task2) never cuts in line.
2. Whether I await the async sub_task or just call normal_sub_task (the sync version) in the middle of the async heavy_task(), outer_task sometimes cuts in line, interrupting heavy_task1 or sub_task/normal_sub_task.
So what is the meaning of "await"? It seems that only the keyword "async" does anything here.
reference:
async_std
sing_dance_example from rust asyncbook
module Task in official rust module
recommended article of rust this week about async-programming
another article about rust thread and async-programming using future crates
stackoverflow question:What is the purpose of async/await in Rust?
Conclusion 2 seems to contradict what Shepmaster said: "...we felt async functions should run synchronously to the first await."
The await keyword suspends the execution of an asynchronous function until the awaited future (future.await) produces a value.
It has the same meaning as in all the other languages that use the await concept.
When a future is awaited, the "status of execution" of the async function is persisted into an internal execution context, and other async functions have the opportunity to progress if they are ready to run.
When the awaited future completes, the async function resumes at the exact point of suspension.
If you think "I only need async" and write something like:
// OK: let result = future.await
let result = future
You don't get a value, but something that represents a value that will be ready in the future.
And if you mark a function async without awaiting anything inside its body, you are injecting into the asynchronous engine a sequential task that, when executed, runs to completion like a normal function, preventing asynchronous behavior.
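A small std-only sketch of the first point: calling an async fn runs none of its body, it only builds a future. The names `compute` and `call_then_poll` are made up for illustration, and the future is polled by hand with a do-nothing waker instead of a real runtime.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

async fn compute(ran: Arc<AtomicBool>) -> u32 {
    ran.store(true, Ordering::SeqCst);
    42
}

// Returns (body ran right after the call?, produced value, body ran after polling?).
fn call_then_poll() -> (bool, u32, bool) {
    let ran = Arc::new(AtomicBool::new(false));

    // `let result = future`: only the state machine is created here.
    let fut = compute(ran.clone());
    let ran_after_call = ran.load(Ordering::SeqCst);

    // Polling (which is what `.await` arranges) actually runs the body.
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let value = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => unreachable!("`compute` has no suspension point"),
    };
    (ran_after_call, value, ran.load(Ordering::SeqCst))
}

fn main() {
    assert_eq!(call_then_poll(), (false, 42, true));
}
```

The single poll also illustrates the second point: with no await inside, the body runs straight to completion like a normal function.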
Some more comments about your code
The confusion probably arises from a misunderstanding of the task concept.
When learning async in Rust I found the async book pretty useful.
The book defines tasks as:
Tasks are the top-level futures that have been submitted to an executor
heavy_task is really the only task in your example, because it is the only future submitted to the async runtime (with task::spawn; task::block_on then waits on its handle).
For example, the function outer_task has nothing to do with the asynchronous world: it is not a task, and it gets executed immediately when called.
heavy_task behaves asynchronously and awaits the sub_task(cnt2) future, but the sub_task future, once executed, runs immediately to completion.
So, as it stands, your code behaves practically sequentially.
But keep in mind that in reality things are more subtle: in the presence of other async tasks, the await inside heavy_task can work as a suspension point and give other tasks the opportunity to run toward completion.
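To see an await acting as a suspension point, here is a std-only sketch with a toy round-robin executor. It is not how async_std schedules tasks; `YieldNow`, `task`, and `run` are hypothetical names. Two tasks log their steps. When they await something that actually returns Pending, the steps interleave; when they await nothing, each task runs to completion before the other starts.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Returns Pending once, then Ready: a minimal real suspension point.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;

async fn task(name: &'static str, log: Log, yield_between: bool) {
    for i in 0..2 {
        log.borrow_mut().push(format!("{}{}", name, i));
        if yield_between {
            YieldNow(false).await; // other tasks may run here
        }
    }
}

// Toy executor: polls every unfinished task in turn until all are done.
fn run(yield_between: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>> = Vec::new();
    tasks.push(Box::pin(task("a", log.clone(), yield_between)));
    tasks.push(Box::pin(task("b", log.clone(), yield_between)));

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        tasks.retain_mut(|t| t.as_mut().poll(&mut cx).is_pending());
    }
    let out = log.borrow().clone();
    out
}

fn main() {
    assert_eq!(run(true), ["a0", "b0", "a1", "b1"]); // interleaved
    assert_eq!(run(false), ["a0", "a1", "b0", "b1"]); // sequential
}
```

Your sub_task behaves like the second case: awaiting it never returns Pending, so there is nothing for the executor to interleave.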
I'm trying to understand Future::select: in this example, the future with a longer time delay is returned first.
When I read this article with its example, I get cognitive dissonance. The author writes:
The select function runs two (or more in case of select_all) futures and returns the first one coming to completion. This is useful for implementing timeouts.
It seems I don't understand the sense of select.
extern crate futures; // v0.1 (old)
extern crate tokio_core;
use std::thread;
use std::time::Duration;
use futures::{Async, Future};
use tokio_core::reactor::Core;
struct Timeout {
time: u32,
}
impl Timeout {
fn new(period: u32) -> Timeout {
Timeout { time: period }
}
}
impl Future for Timeout {
type Item = u32;
type Error = String;
fn poll(&mut self) -> Result<Async<u32>, Self::Error> {
thread::sleep(Duration::from_secs(self.time as u64));
println!("Timeout is done with time {}.", self.time);
Ok(Async::Ready(self.time))
}
}
fn main() {
let mut reactor = Core::new().unwrap();
let time_out1 = Timeout::new(5);
let time_out2 = Timeout::new(1);
let task = time_out1.select(time_out2);
let mut reactor = Core::new().unwrap();
reactor.run(task);
}
I need to process the early future with the smaller time delay, and then work with the future with a longer delay. How can I do it?
TL;DR: use tokio::time
If there's one thing to take away from this: never perform blocking or long-running operations inside of asynchronous operations.
If you want a timeout, use something from tokio::time, such as delay_for or timeout:
use futures::future::{self, Either}; // 0.3.1
use std::time::Duration;
use tokio::time; // 0.2.9
#[tokio::main]
async fn main() {
let time_out1 = time::delay_for(Duration::from_secs(5));
let time_out2 = time::delay_for(Duration::from_secs(1));
match future::select(time_out1, time_out2).await {
Either::Left(_) => println!("Timer 1 finished"),
Either::Right(_) => println!("Timer 2 finished"),
}
}
What's the problem?
To understand why you get the behavior you do, you have to understand the implementation of futures at a high level.
When you call run, there's a loop that calls poll on the passed-in future. It loops until the future returns success or failure, otherwise the future isn't done yet.
Your implementation of poll "locks up" this loop for 5 seconds because nothing can break the call to sleep. By the time the sleep is done, the future is ready, thus that future is selected.
The implementation of an async timeout conceptually works by checking the clock every time it's polled, saying if enough time has passed or not.
The big difference is that when a future returns that it's not ready, another future can be checked. This is what select does!
A dramatic re-enactment:
sleep-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Hold on a seconnnnnnnn [... 5 seconds pass ...] nnnnd. Yes!
simplistic async-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Checks watch No.
select: Hey future2, are you ready to go?
future2: Checks watch No.
core: Hey select, are you ready to go?
[... polling continues ...]
[... 1 second passes ...]
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Checks watch No.
select: Hey future2, are you ready to go?
future2: Checks watch Yes!
This simple implementation polls the futures over and over until they are all complete. This is not the most efficient, and not what most executors do.
See How do I execute an async/await function without using any external dependencies? for an implementation of this kind of executor.
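As a rough illustration of the simplistic variant, here is a std-only sketch written against today's std::future::Future rather than futures 0.1. `PolledTimer` and `first_ready` are hypothetical names. The timer only checks the clock when polled and never blocks, and the loop plays the role of both core and select by asking each future in turn.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// "Checks watch": compares the clock with a deadline and returns at once.
struct PolledTimer {
    deadline: Instant,
}

impl PolledTimer {
    fn new(delay: Duration) -> Self {
        PolledTimer { deadline: Instant::now() + delay }
    }
}

impl Future for PolledTimer {
    type Output = ();
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if Instant::now() >= self.deadline {
            Poll::Ready(())
        } else {
            Poll::Pending // not ready, but the caller is not blocked
        }
    }
}

// Busy-polls both timers; returns 1 or 2 for whichever is ready first.
fn first_ready(delay1: Duration, delay2: Duration) -> u8 {
    let mut t1 = PolledTimer::new(delay1);
    let mut t2 = PolledTimer::new(delay2);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if Pin::new(&mut t1).poll(&mut cx).is_ready() {
            return 1;
        }
        if Pin::new(&mut t2).poll(&mut cx).is_ready() {
            return 2;
        }
        std::thread::yield_now(); // still effectively a busy-wait
    }
}

fn main() {
    let winner = first_ready(Duration::from_millis(200), Duration::from_millis(10));
    assert_eq!(winner, 2); // the shorter delay wins, unlike the sleep version
}
```

Because neither poll blocks, the future with the shorter delay gets a chance to report first, which is what the sleep-based Timeout prevents.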
smart async-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Checks watch No, but I'll call you when something changes.
select: Hey future2, are you ready to go?
future2: Checks watch No, but I'll call you when something changes.
[... core stops polling ...]
[... 1 second passes ...]
future2: Hey core, something changed.
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Checks watch No.
select: Hey future2, are you ready to go?
future2: Checks watch Yes!
This more efficient implementation hands a waker to each future when it is polled. When a future is not ready, it saves that waker for later. When something changes, the waker notifies the core of the executor that now would be a good time to re-check the futures. This allows the executor to not perform what is effectively a busy-wait.
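The waker hand-off can be sketched with std only, again against the current Future trait rather than futures 0.1. The names `SmartTimer`, `ThreadWaker`, `block_on`, and `timed_wait` are made up for this example, and using a helper thread per timer is an assumption made for brevity (real runtimes use a timer wheel or similar). The future stores the waker when it is not ready; the helper thread calls it after the delay; the executor parks instead of spinning.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::{Duration, Instant};

// Waking means: unpark the thread that runs the executor loop.
struct ThreadWaker(Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

struct Shared {
    done: bool,
    waker: Option<Waker>,
}

struct SmartTimer {
    shared: Arc<Mutex<Shared>>,
}

impl SmartTimer {
    fn new(delay: Duration) -> Self {
        let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
        let for_thread = shared.clone();
        thread::spawn(move || {
            thread::sleep(delay); // blocking happens off the executor thread
            let mut s = for_thread.lock().unwrap();
            s.done = true;
            if let Some(waker) = s.waker.take() {
                waker.wake(); // "Hey core, something changed."
            }
        });
        SmartTimer { shared }
    }
}

impl Future for SmartTimer {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut s = self.shared.lock().unwrap();
        if s.done {
            Poll::Ready(())
        } else {
            // "No, but I'll call you when something changes."
            s.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

// Minimal executor: poll, then sleep until some waker unparks this thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        thread::park(); // may wake spuriously; the loop just polls again
    }
}

fn timed_wait(ms: u64) -> Duration {
    let start = Instant::now();
    block_on(SmartTimer::new(Duration::from_millis(ms)));
    start.elapsed()
}

fn main() {
    assert!(timed_wait(20) >= Duration::from_millis(20));
}
```

Between the first poll and the wake-up, the executor thread is parked and uses no CPU.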
The generic solution
When you have an operation that is blocking or long-running, then the appropriate thing to do is to move that work out of the async loop. See What is the best approach to encapsulate blocking I/O in future-rs? for details and examples.
I want to terminate reading from a tokio::io::lines stream. I merged it with a oneshot future and terminated it, but tokio::run was still working.
use futures::{sync::oneshot, *}; // 0.1.27
use std::{io::BufReader, time::Duration};
use tokio::prelude::*; // 0.1.21
fn main() {
let (tx, rx) = oneshot::channel::<()>();
let lines = tokio::io::lines(BufReader::new(tokio::io::stdin()));
let lines = lines.for_each(|item| {
println!("> {:?}", item);
Ok(())
});
std::thread::spawn(move || {
std::thread::sleep(Duration::from_millis(5000));
println!("system shutting down");
let _ = tx.send(());
});
let lines = lines.select2(rx);
tokio::run(lines.map(|_| ()).map_err(|_| ()));
}
How can I stop reading from this?
There's nothing wrong with your strategy, but it will only work with futures that don't execute a blocking operation via Tokio's blocking (the traditional kind of blocking should never be done inside a future).
You can test this by replacing the tokio::io::lines(..) future with a simple interval future:
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
The problem is that tokio::io::Stdin internally uses tokio_threadpool::blocking.
When you use Tokio thread pool blocking (emphasis mine):
NB: The entire task that called blocking is blocked whenever the
supplied closure blocks, even if you have used future combinators such
as select - the other futures in this task will not make progress
until the closure returns. If this is not desired, ensure that
blocking runs in its own task (e.g. using
futures::sync::oneshot::spawn).
Since this will block every other future in the combinator, your Receiver will not be able to get a signal from the Sender until the blocking ends.
Please see How can I read non-blocking from stdin? or you can use tokio-stdin-stdout, which creates a channel to consume data from stdin thread. It also has a line-by-line example.
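The "separate stdin thread plus channel" idea can be sketched with std only. This is not the tokio-stdin-stdout API, just the underlying pattern; `spawn_line_reader` and `read_until` are hypothetical names. The blocking reads happen on their own thread, and the consumer waits on the channel with a deadline, so shutdown no longer depends on the read returning.

```rust
use std::io::{BufRead, Cursor};
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

// Moves the blocking line reads onto a dedicated thread.
fn spawn_line_reader<R: BufRead + Send + 'static>(reader: R) -> Receiver<String> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        for line in reader.lines() {
            match line {
                // Stop if the consumer has gone away.
                Ok(text) => {
                    if tx.send(text).is_err() {
                        break;
                    }
                }
                Err(_) => break,
            }
        }
    });
    rx
}

// Collects lines until the deadline passes or the reader thread finishes.
fn read_until(rx: &Receiver<String>, deadline: Instant) -> Vec<String> {
    let mut lines = Vec::new();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(line) => lines.push(line),
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    lines
}

fn main() {
    // An in-memory reader keeps the example self-contained. For real input,
    // `std::io::BufReader::new(std::io::stdin())` could be passed instead.
    let rx = spawn_line_reader(Cursor::new("first\nsecond\n"));
    let lines = read_until(&rx, Instant::now() + Duration::from_secs(1));
    assert_eq!(lines, ["first", "second"]);
}
```

With real stdin, the reader thread stays blocked in its read after the deadline, but the rest of the program is free to shut down.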
Thank you for your comment and correcting my sentences.
I tried to stop this non-blocking Future and succeeded.
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
My understanding is that, for this case, it would work to wrap the blocking Future with tokio_threadpool::blocking.
I'll try it later.
Thank you very much.
I need to pause the current thread in Rust and notify it from another thread. In Java I would write:
synchronized(myThread) {
myThread.wait();
}
and from the second thread (to resume main thread):
synchronized(myThread){
myThread.notify();
}
Is it possible to do the same in Rust?
Using a channel that sends type () is probably easiest:
use std::sync::mpsc::channel;
use std::thread;
let (tx,rx) = channel();
// Spawn your worker thread, giving it `tx` and whatever else it needs
thread::spawn(move|| {
// Do whatever
tx.send(()).expect("Could not send signal on channel.");
// Continue
});
// Do whatever
rx.recv().expect("Could not receive from channel.");
// Continue working
The () type is because it's effectively zero-information, which means it's pretty clear you're only using it as a signal. The fact that it's size zero means it's also potentially faster in some scenarios (but realistically probably not any faster than a normal machine word write).
If you just need to notify the program that a thread is done, you can keep its join handle and wait for it to finish.
let handle = thread::spawn( ... ); // `spawn` returns a JoinHandle; the thread is not joined automatically
handle.join().expect("Could not join thread");
You can use std::thread::park() and std::thread::Thread::unpark() to achieve this.
In the thread you want to wait,
fn worker_thread() {
std::thread::park();
}
in the controlling thread, which has a thread handle already,
fn main_thread(worker_thread: std::thread::Thread) {
worker_thread.unpark();
}
Note that the parked thread can wake up spuriously, which means it can sometimes wake up without any other thread calling unpark on it. You should prepare for this situation in your code, or use something like std::sync::mpsc::channel, as suggested in the accepted answer.
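One common way to prepare for spurious wakeups is to pair park with a flag and re-check it in a loop. A minimal sketch, where `wait_for_flag` and the variable names are made up for illustration:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn wait_for_flag() -> bool {
    let ready = Arc::new(AtomicBool::new(false));
    let ready_in_worker = Arc::clone(&ready);

    let worker = thread::spawn(move || {
        // Re-check the flag after every wakeup: if park returns
        // spuriously, the flag is still false and we park again.
        while !ready_in_worker.load(Ordering::Acquire) {
            thread::park();
        }
        true
    });

    // Pretend to do some work in the controlling thread.
    thread::sleep(Duration::from_millis(10));

    // Set the flag first, then unpark. If the worker has not parked yet,
    // it sees the flag (or the stored unpark token) and does not block.
    ready.store(true, Ordering::Release);
    worker.thread().unpark();

    worker.join().unwrap()
}

fn main() {
    assert!(wait_for_flag());
}
```

The flag carries the actual "you may continue" information; park and unpark only avoid spinning while waiting for it.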
There are multiple ways to achieve this in Rust.
The underlying model in Java is that each object contains both a mutex and a condition variable, if I remember correctly. So using a mutex and condition variable would work...
... however, I would personally switch to using a channel instead:
the "waiting" thread has the receiving end of the channel, and waits for it
the "notifying" thread has the sending end of the channel, and sends a message
It is easier to manipulate than a condition variable, notably because there is no risk to accidentally use a different mutex when locking the variable.
The std::sync::mpsc module has two kinds of channels (asynchronous and synchronous), depending on your needs. Here, the asynchronous one matches more closely: std::sync::mpsc::channel.
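For comparison, the mutex-and-condition-variable route mentioned at the start of this answer could be sketched like this. `wait_until_notified` is a hypothetical name. The boolean behind the mutex plays the role of the condition you would guard with synchronized in Java, and the while loop handles spurious wakeups.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn wait_until_notified() -> bool {
    // The Mutex guards the condition; the Condvar signals changes to it.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair_in_notifier = Arc::clone(&pair);

    let notifier = thread::spawn(move || {
        let (lock, cvar) = &*pair_in_notifier;
        *lock.lock().unwrap() = true; // like setting state inside synchronized
        cvar.notify_one();            // like notify()
    });

    let (lock, cvar) = &*pair;
    let mut done = lock.lock().unwrap();
    // Loop: wait() can return without a notification.
    while !*done {
        done = cvar.wait(done).unwrap(); // like wait(): releases the lock while blocked
    }
    let result = *done;
    drop(done);

    notifier.join().unwrap();
    result
}

fn main() {
    assert!(wait_until_notified());
}
```

Keeping the Mutex and Condvar together in one tuple is what avoids the "wrong mutex" mistake mentioned above.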
There is a monitor crate that provides this functionality by combining Mutex with Condvar in a convenience structure.
(Full disclosure: I am the author.)
Briefly, it can be used like this:
let mon = Arc::new(Monitor::new(false));
{
let mon = mon.clone();
let _ = thread::spawn(move || {
thread::sleep(Duration::from_millis(1000));
mon.with_lock(|mut done| { // done is a monitor::MonitorGuard<bool>
*done = true;
done.notify_one();
});
});
}
mon.with_lock(|mut done| {
while !*done {
done.wait();
}
println!("finished waiting");
});
Here, mon.with_lock(...) is semantically equivalent to Java's synchronized(mon) {...}.