Idiomatic Progress and Cancellation in Rust


As a C# backend developer, I am very much used to adding progress and cancellation parameters to any function that can potentially be long-running (regardless of whether it is synchronous or asynchronous). In C#, there is a canonical way to do this:
void LongRunningFun(int actualParam, IProgress<double> progress, CancellationToken ct) { ... }
async Task LongRunningFunAsync(int actualParam, IProgress<double> progress, CancellationToken ct) { ... }
Is there also a canonical way to introduce progress and cancellation to functions in Rust?
For example, I have seen that (Tokio?) tasks can be cancelled without need for an extra cancellation token parameter. However, is it idiomatic to write asynchronous code just for the sake of getting cancellation for free (even in situations where I am not interested in concurrency)?
Regarding progress, I could think of a Progress trait similar to the IProgress interface in C#. But is there also a standard trait for that purpose, or do I have to write my own one?
I am looking forward to seeing how actual Rust developers handle progress and cancellation. 😊

There is currently no standard way to do cancellation or progress. I would pass a reference (or an Rc) to a Cell<bool> that is set to true when cancellation is requested. You can wrap that in a nice struct (the following is not thread-safe; you can use Arc and atomics instead):
use std::cell::Cell;
use std::rc::Rc;
pub struct CancellationSource {
token: Rc<Cell<bool>>,
}
impl CancellationSource {
pub fn new() -> Self {
Self {
token: Rc::new(Cell::new(false)),
}
}
// Cell::set only needs a shared reference, so &self is enough here.
pub fn request_cancellation(&self) {
self.token.set(true);
}
pub fn token(&self) -> CancellationToken {
CancellationToken {
token: Rc::clone(&self.token),
}
}
}
#[derive(Clone)]
pub struct CancellationToken {
token: Rc<Cell<bool>>,
}
impl CancellationToken {
pub fn is_cancellation_requested(&self) -> bool {
self.token.get()
}
}
A future can only be cancelled (dropped) while it is suspended at an .await point; it never stops in the middle of synchronous code. So just making your code async won't help, unless you manually introduce yield points. It is also worth noting that the current cancellation story of futures is considered bad (because it can leave futures in an inconsistent state) and the Rust teams are looking for alternative ideas (one possible solution being voluntary cancellation like C#'s).
There may be a crate for cancellation, but I don't know of one.
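The thread-safe variant mentioned above (Arc and atomics) can be sketched as follows, together with one possible way to report progress. This is only a sketch: the Cancelled error type, the long_running_sum function and the choice of a plain FnMut(f64) closure in place of a Progress trait are assumptions made for illustration, not an established convention.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Owner side: the only place where cancellation can be requested.
#[derive(Default)]
pub struct CancellationSource {
    flag: Arc<AtomicBool>,
}

/// Worker side: can only observe the request. Cheap to clone, and `Send`.
#[derive(Clone)]
pub struct CancellationToken {
    flag: Arc<AtomicBool>,
}

impl CancellationSource {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn request_cancellation(&self) {
        self.flag.store(true, Ordering::SeqCst);
    }

    pub fn token(&self) -> CancellationToken {
        CancellationToken { flag: Arc::clone(&self.flag) }
    }
}

impl CancellationToken {
    pub fn is_cancellation_requested(&self) -> bool {
        self.flag.load(Ordering::SeqCst)
    }
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct Cancelled;

/// Sums `0..n`. Checks the token before each step and reports progress
/// as a fraction in `[0.0, 1.0]` after each step.
pub fn long_running_sum(
    n: u64,
    mut progress: impl FnMut(f64),
    ct: &CancellationToken,
) -> Result<u64, Cancelled> {
    let mut total = 0;
    for i in 0..n {
        if ct.is_cancellation_requested() {
            return Err(Cancelled);
        }
        total += i;
        progress((i + 1) as f64 / n as f64);
    }
    Ok(total)
}
```

A caller would create a CancellationSource, hand source.token() to the worker (possibly on another thread), and pass any closure as the progress sink, for example one that pushes the values into a Vec or prints them.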

Related

Sleep in Future::poll

I am trying to create a future polling for inputs from the crossterm crate, which does not provide an asynchronous API, as far as I know.
At first I tried to do something like the following :
use crossterm::event::poll as crossterm_poll;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::Duration;
use tokio::time::{sleep, timeout};
struct Polled {}
impl Polled {
pub fn new() -> Polled {
Polled {}
}
}
impl Future for Polled {
type Output = bool;
fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
// If there are events pending, it returns "Ok(true)", else it returns instantly
let poll_status = crossterm_poll(Duration::from_secs(0));
if poll_status.is_ok() && poll_status.unwrap() {
return Poll::Ready(true);
}
Poll::Pending
}
}
pub async fn poll(d: Duration) -> Result<bool, ()> {
let polled = Polled::new();
match timeout(d, polled).await {
Ok(b) => Ok(b),
Err(_) => Err(()),
}
}
It technically works, but obviously the program started using 100% CPU all the time, since the executor always tries to poll the future in case there's something new. Thus I wanted to add some asynchronous equivalent of sleep that would delay the next time the executor tries to poll the Future, so I tried adding the following (right before returning Poll::Pending), which obviously did not work since sleep_future::poll() just returns Pending:
let mut sleep_future = sleep(Duration::from_millis(50));
tokio::pin!(sleep_future);
sleep_future.poll(cx);
cx.waker().wake_by_ref();
The fact that poll is not async forbids the use of async functions, and I'm starting to wonder whether what I want to do is actually feasible, or whether I'm approaching my first problem the wrong way.
Is finding a way to do some async sleep the right way to go?
If not, what is? Am I missing something in the asynchronous paradigm?
Or is it just sometimes impossible to wrap some synchronous logic into a Future if the crate does not give you the necessary tools to do so?
Thanks in advance anyway !
EDIT : I found a way to do what I want using an async block :
pub async fn poll(d: Duration) -> Result<bool, ()> {
let mdr = async {
loop {
let a = crossterm_poll(Duration::from_secs(0));
if a.is_ok() && a.unwrap() {
break;
}
sleep(Duration::from_millis(50)).await;
}
true
};
match timeout(d, mdr).await {
Ok(b) => Ok(b),
Err(_) => Err(()),
}
}
Is it the idiomatic way to do so ? Or did I miss something more elegant ?

Yes, using an async block is a good way to compose futures, like your custom poller and tokio's sleep.
However, if you did want to write your own Future which also invokes tokio's sleep, here's what you would need to do differently:
Don't call wake_by_ref() immediately — the sleep future will take care of that when its time comes, and that's how you avoid spinning (using 100% CPU).
You must construct the sleep() future once when you intend to sleep (not every time you're polled), then store it in your future (this will require pin-projection) and poll the same future again the next time you're polled. That's how you ensure you wait the intended amount of time and not shorter.
Async blocks are usually a much easier way to get the same result.
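For reference, the hand-written variant described above could look roughly like the sketch below. It is deliberately generic so that it only needs the standard library: check stands in for the crossterm poll call and make_delay for a closure such as || tokio::time::sleep(Duration::from_millis(50)). The names PollWithDelay and block_on are made up for this example, and boxing the delay future is an assumption chosen to avoid pin-projection.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Resolves to `true` once `check` returns true, waiting on a delay future
/// between checks.
pub struct PollWithDelay<C, S, F> {
    check: C,
    make_delay: S,
    // The delay currently being waited on. It is created once per wait and
    // polled again on later wake-ups, never recreated on every poll.
    delay: Option<Pin<Box<F>>>,
}

impl<C, S, F> PollWithDelay<C, S, F> {
    pub fn new(check: C, make_delay: S) -> Self {
        Self { check, make_delay, delay: None }
    }
}

impl<C, S, F> Future for PollWithDelay<C, S, F>
where
    C: FnMut() -> bool + Unpin,
    S: FnMut() -> F + Unpin,
    F: Future<Output = ()>,
{
    type Output = bool;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<bool> {
        let this = self.get_mut();
        loop {
            if let Some(delay) = this.delay.as_mut() {
                // No manual wake_by_ref(): the delay future registered the
                // waker and will wake this task when it completes.
                if delay.as_mut().poll(cx).is_pending() {
                    return Poll::Pending;
                }
                this.delay = None;
            }
            if (this.check)() {
                return Poll::Ready(true);
            }
            this.delay = Some(Box::pin((this.make_delay)()));
        }
    }
}

/// Minimal single-future executor, only here so the sketch can run without
/// an async runtime.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<Fut: Future>(fut: Fut) -> Fut::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

Inside a tokio application the future would simply be awaited (for instance wrapped in timeout as in the question) instead of being driven by block_on. Comparing this with the async block in the question shows why the async block is usually preferred.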

How can I store variables in traits so that it can be used in other methods on trait?

I have several objects in my code that have common functionality. These objects act essentially like services where they have a life-time (controlled by start/stop) and perform work until their life-time ends. I am in the process of trying to refactor my code to reduce duplication but getting stuck.
At a high level, my objects all have the code below implemented in them.
impl SomeObject {
fn start(&self) {
// starts a new thread.
// stores the 'JoinHandle' so thread can be joined later.
}
fn do_work(&self) {
// perform work in the context of the new thread.
}
fn stop(&self) {
// interrupts the work that this object is doing.
// stops the thread.
}
}
Essentially, these objects act like "services" so in order to refactor, my first thought was that I should create a trait called "service" as shown below.
trait Service {
fn start(&self) {}
fn do_work(&self);
fn stop(&self) {}
}
Then, I can just update my objects to each implement the "Service" trait. The issue that I am having though, is that since traits are not allowed to have fields/properties, I am not sure how I can go about saving the 'JoinHandle' in the trait so that I can use it in the other methods.
Is there an idiomatic way to handle this problem in Rust?
tldr; how can I save variables in a trait so that they can be re-used in different trait methods?
Edit:
Here is the solution I settled on. Any feedback is appreciated.
extern crate log;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::JoinHandle;
pub struct Context {
name: String,
thread_join_handle: JoinHandle<()>,
is_thread_running: Arc<AtomicBool>,
}
pub trait Service {
fn start(&self, name: String) -> Context {
log::trace!("starting service, name=[{}]", name);
let is_thread_running = Arc::new(AtomicBool::new(true));
let cloned_is_thread_running = is_thread_running.clone();
let thread_join_handle = std::thread::spawn(move || {
// Run until stop() clears the flag; the thread then exits so that join() can return.
while cloned_is_thread_running.load(Ordering::SeqCst) {
Self::do_work();
}
});
log::trace!("started service, name=[{}]", name);
return Context {
name: name,
thread_join_handle: thread_join_handle,
is_thread_running: is_thread_running,
};
}
fn stop(context: Context) {
log::trace!("stopping service, name=[{}]", context.name);
context.is_thread_running.store(false, Ordering::SeqCst);
context
.thread_join_handle
.join()
.expect("joining service thread");
log::trace!("stopped service, name=[{}]", context.name);
}
fn do_work();
}

I think you just need to save your JoinHandle in the struct's state itself, then the other methods can access it as well because they all get all the struct's data passed to them already.
struct SomeObject {
// None until start() has been called.
join_handle: Option<JoinHandle<()>>,
}
impl Service for SomeObject {
fn start(&mut self) {
// starts a new thread.
// stores the 'JoinHandle' so thread can be joined later.
self.join_handle = Some(what_ever_you_wanted_to_set); // Store it here like this
}
fn do_work(&self) {
// perform work in the context of the new thread.
}
fn stop(&self) {
// interrupts the work that this object is doing.
// stops the thread.
}
}
Hopefully that works for you.

Depending on how you're using the trait, you could restructure your code so that start returns a JoinHandle and the other functions take the join handle as input:
trait Service {
fn start(&self) -> JoinHandle<()>;
fn do_work(&self, handle: &mut JoinHandle<()>);
fn stop(&self, handle: JoinHandle<()>);
}
(maybe with different function arguments depending on what you need). This way you could probably cut down on duplicate code by putting all the code that handles the handles (haha) outside of the structs themselves, which makes it more generic. If you want the structs to use the JoinHandle outside of this trait, I'd say it's best to just do what Equinox suggested and make it a field.
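One way to flesh this out is sketched below. The Running struct, the Arc<Self> receiver on start, and the Counter example service are all assumptions made for illustration; they are not taken from the question or the answers.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Everything needed to stop a running service (hypothetical helper type).
pub struct Running {
    stop_flag: Arc<AtomicBool>,
    handle: JoinHandle<()>,
}

impl Running {
    /// Signals the worker thread to stop and waits for it to finish.
    pub fn stop(self) {
        self.stop_flag.store(true, Ordering::SeqCst);
        self.handle.join().expect("service thread panicked");
    }
}

pub trait Service: Send + Sync + 'static {
    /// One unit of work; called repeatedly on the worker thread.
    fn do_work(&self);

    /// Spawns the worker thread. The thread bookkeeping lives in the
    /// returned `Running` value, so implementors need no extra fields.
    fn start(self: Arc<Self>) -> Running
    where
        Self: Sized,
    {
        let stop_flag = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop_flag);
        let handle = thread::spawn(move || {
            while !flag.load(Ordering::SeqCst) {
                self.do_work();
            }
        });
        Running { stop_flag, handle }
    }
}

/// Example service that just counts how often it was asked to work.
#[derive(Default)]
pub struct Counter {
    ticks: AtomicUsize,
}

impl Counter {
    pub fn ticks(&self) -> usize {
        self.ticks.load(Ordering::SeqCst)
    }
}

impl Service for Counter {
    fn do_work(&self) {
        self.ticks.fetch_add(1, Ordering::SeqCst);
        thread::sleep(Duration::from_millis(1));
    }
}
```

Usage would be along the lines of let running = Arc::new(Counter::default()).start(); followed later by running.stop();. Each service then only implements do_work.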

Is there a way to get a reference to the current task's context in an async function in rust?

In a rust async function is there any way to get access to the current Context without writing an explicit implementation of a Future?

Before actually answering the question, it is useful to remember what a Context is; whenever you are writing an implementation of a Future that depends on outside resources (say, I/O), you do not want to busy-wait anything. As a result, you'll most likely have implementations of Future where you'll return Pending and then wake it up. Context (and Waker) exist for that purpose.
However, this is what they are: low-level, implementation details. If you are using a Future already as opposed to writing a low-level implementation of one, the Waker will most likely be contained somewhere, but not directly accessible to you.
As a result of this, a Waker directly leaking is an implementation detail leak 99.9% of the time and not actually recommended. A Waker being used as part of a bigger struct is perfectly fine, however, but this is where you'll need to implement your own Future from scratch. There is no other valid use case for this, and in normal terms, you should never need direct access to a Waker.
Due to the limitations of the playground, I sadly cannot show you a live example of when it is useful to get this Waker; however, such a future setup may be used in such a situation: let's assume we're building the front door of a house. We have a doorbell and a door, and we want to be notified when somebody rings the doorbell. However, we don't want to have to wait at the door for visitors.
We therefore make two objects: a FrontDoor and a Doorbell, and we give the option to wire() the Doorbell to connect the two.
pub struct FrontDoor {
doorbell: Arc<RwLock<Doorbell>>
}
impl FrontDoor {
pub fn new() -> FrontDoor {
FrontDoor {
doorbell: Arc::new(RwLock::new(Doorbell {
waker: None,
visitor: false
}))
}
}
pub fn wire(&self) -> Arc<RwLock<Doorbell>> {
self.doorbell.clone() // We retrieve the bell
}
}
impl Future for FrontDoor {
type Output = ();
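fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
// Take the write lock: we may need to store the waker for ring() to use.
self.doorbell.write().map(|mut guard| {
if guard.visitor {
Poll::Ready(())
} else {
// Register the current task's waker so that Doorbell::ring can wake us.
guard.waker = Some(cx.waker().clone());
Poll::Pending
}
}).unwrap_or(Poll::Pending)
}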
}
pub struct Doorbell {
waker: Option<Waker>,
pub visitor: bool
}
impl Doorbell {
pub fn ring(&mut self) {
self.visitor = true;
self.waker.as_ref().map(|waker| waker.wake_by_ref());
}
}
Our FrontDoor implements Future, which means we can just throw it on an executor of your choice; waker is contained in the Doorbell object and allows us to "ring" and wake up our future.
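To see the doorbell work end to end without an async runtime, here is a self-contained sketch using only the standard library. The block_on helper and the ring_after_delay demo function are assumptions added for this example; a real program would hand the FrontDoor to whatever executor it already uses.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, RwLock};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

#[derive(Default)]
pub struct Doorbell {
    waker: Option<Waker>,
    pub visitor: bool,
}

impl Doorbell {
    pub fn ring(&mut self) {
        self.visitor = true;
        // Wake the task that is waiting at the door, if there is one.
        if let Some(waker) = self.waker.take() {
            waker.wake();
        }
    }
}

#[derive(Default)]
pub struct FrontDoor {
    doorbell: Arc<RwLock<Doorbell>>,
}

impl FrontDoor {
    pub fn wire(&self) -> Arc<RwLock<Doorbell>> {
        Arc::clone(&self.doorbell)
    }
}

impl Future for FrontDoor {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut bell = self.doorbell.write().unwrap();
        if bell.visitor {
            Poll::Ready(())
        } else {
            // Store the waker so that ring() can wake this task later.
            bell.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}

/// Minimal executor: parks the current thread until the waker unparks it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Demo: another thread rings the bell shortly after we start waiting.
pub fn ring_after_delay() {
    let door = FrontDoor::default();
    let bell = door.wire();
    let visitor = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        bell.write().unwrap().ring();
    });
    block_on(door); // returns once the bell has been rung
    visitor.join().unwrap();
}
```

The waiting thread sleeps in thread::park() instead of spinning, and the only thing that touches the Waker is the FrontDoor future itself plus the Doorbell it owns.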

Improve Rust's Future to do not create separate thread

I have written a simple future based on this tutorial which looks like this:
extern crate chrono; // 0.4.6
extern crate futures; // 0.1.25
use std::{io, thread};
use chrono::{DateTime, Duration, Utc};
use futures::{Async, Future, Poll, task};
pub struct WaitInAnotherThread {
end_time: DateTime<Utc>,
running: bool,
}
impl WaitInAnotherThread {
pub fn new(how_long: Duration) -> WaitInAnotherThread {
WaitInAnotherThread {
end_time: Utc::now() + how_long,
running: false,
}
}
pub fn run(&mut self, task: task::Task) {
let lend = self.end_time;
thread::spawn(move || {
while Utc::now() < lend {
let delta_sec = lend.timestamp() - Utc::now().timestamp();
if delta_sec > 0 {
thread::sleep(::std::time::Duration::from_secs(delta_sec as u64));
}
task.notify();
}
println!("the time has come == {:?}!", lend);
});
}
}
impl Future for WaitInAnotherThread {
type Item = ();
type Error = Box<io::Error>;
fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
if Utc::now() < self.end_time {
println!("not ready yet! parking the task.");
if !self.running {
println!("side thread not running! starting now!");
self.run(task::current());
self.running = true;
}
Ok(Async::NotReady)
} else {
println!("ready! the task will complete.");
Ok(Async::Ready(()))
}
}
}
So the question is: how do I replace pub fn run(&mut self, task: task::Task) with something that does not create a new thread for the future to resolve? It would be useful if someone could rewrite my code with the run function replaced by a version that uses no separate thread; that would help me understand how things should be done. I also know that tokio has a timeout implementation, but I need this code for learning.

I think I understand what you mean.
Let's say you have two tasks, Main and Worker1. In this case you are polling Worker1 to wait for an answer, BUT there is a better way: wait for the completion of Worker1. This can be done without any Future: you simply call the Worker1 function from Main, and when the worker is done, Main goes on. You need no future, you are simply calling a function, and the division into Main and Worker1 is just an over-complication.
Now, I think your question becomes relevant the moment you add at least one more worker. Let's add Worker2: you want Main to resume the computation as soon as one of the two tasks completes, and you don't want those tasks to be executed in another thread/process, maybe because you are using asynchronous calls (which simply means the threading is done somewhere else, or you are low-level enough that you receive hardware interrupts).
Since Worker1 and Worker2 have to share the same thread, you need a way to save the current execution state of Main, create the one for one of the workers, and after a certain amount of work, time or some other event (a scheduler), switch to the other worker, and so on. This is a multi-tasking system, and there are various software implementations of it in Rust; but with hardware support you can do things that you cannot do in software only (like having the hardware prevent one task from accessing the resources of another), plus you can have the CPU take care of the task switching and all... Well, this is what threads and processes are.
Futures are not what you are looking for; they are higher level, and you can find software schedulers that support them.

Is there a way to emulate the Java behaviour of calling a parent class' static method for simple global-ish error handling?

I'm trying to implement a simple interpreter in Rust for a made up programming language called rlox, following Bob Nystrom's book Crafting Interpreters.
I want errors to be able to occur in any child module, and for them to be "reported" in the main module (this is done in the book, with Java, by simply calling a static method on the containing class which prints the offending token and line). However, if an error occurs, it's not like I can just return early with Result::Err (which is, I assume, the idiomatic way to handle errors in Rust) because the interpreter should keep running - continually looking for errors.
Is there an (idiomatic) way for me to emulate the Java behaviour of calling a parent class' static method from a child class in Rust with modules? Should I abandon something like that entirely?
I thought about a strategy where I inject a reference to some ErrorReporter struct as a dependency into the Scanner and Token structs, but that seems unwieldy to me (I don't feel like an error reporter should be part of the struct's signature, am I wrong?):
struct Token {
error_reporter: Rc<ErrorReporter>, // Should I avoid this?
token_type: token::Type,
lexeme: String,
line: u32
}
This is the layout of my project if you need to visualise what I'm talking about with regards to module relationships. Happy to provide some source code if necessary.
rlox [package]
└───src
├───main.rs (uses scanner + token mods, should contain logic for handling errors)
├───lib.rs (just exports scanner and token mods)
├───scanner.rs (uses token mod, declares scanner struct and impl)
└───token.rs (declares token struct and impl)

Literal translation
Importantly, a Java static method has no access to any instance state. That means that it can be replicated in Rust by either a function or an associated function, neither of which have any state. The only difference is in how you call them:
fn example() {}
impl Something {
fn example() {}
}
fn main() {
example();
Something::example();
}
Looking at the source you are copying, it doesn't "just" report the error, it has code like this:
public class Lox {
static boolean hadError = false;
static void error(int line, String message) {
report(line, "", message);
}
private static void report(int line, String where, String message) {
System.err.println(
"[line " + line + "] Error" + where + ": " + message);
hadError = true;
}
}
I'm no JVM expert, but I'm pretty sure that using a static variable like that means that your code is no longer thread safe. You simply can't do that in safe Rust; you can't "accidentally" make memory-unsafe code.
The most literal translation of this that is safe would use associated functions and atomic variables:
use std::sync::atomic::{AtomicBool, Ordering};
// AtomicBool::new is a const fn; the older ATOMIC_BOOL_INIT constant is deprecated.
static HAD_ERROR: AtomicBool = AtomicBool::new(false);
struct Lox;
impl Lox {
fn error(line: usize, message: &str) {
Lox::report(line, "", message);
}
fn report(line: usize, where_it_was: &str, message: &str) {
eprintln!("[line {}] Error{}: {}", line, where_it_was, message);
HAD_ERROR.store(true, Ordering::SeqCst);
}
}
You can also choose more rich data structures to store in your global state by using lazy_static and a Mutex or RwLock, if you need them.
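As a rough sketch of such a richer global, the following keeps every reported message in a Mutex-protected Vec. Note that it swaps in a plain static with std's const Mutex::new (available in recent Rust versions) instead of lazy_static, and the function names report, had_error and take_errors are made up for this example.

```rust
use std::sync::Mutex;

// Global list of reported errors. `Mutex::new` and `Vec::new` are both
// const fns, so no lazy initialisation is needed here.
static ERRORS: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Prints the error and remembers it in the global list.
pub fn report(line: usize, message: &str) {
    let entry = format!("[line {}] Error: {}", line, message);
    eprintln!("{}", entry);
    ERRORS.lock().unwrap().push(entry);
}

/// Returns true if at least one error is currently stored.
pub fn had_error() -> bool {
    !ERRORS.lock().unwrap().is_empty()
}

/// Removes and returns all stored errors, leaving the list empty.
pub fn take_errors() -> Vec<String> {
    std::mem::take(&mut *ERRORS.lock().unwrap())
}
```

Any module can call report without threading a parameter through, at the cost of the usual drawbacks of global state.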
Idiomatic translation
Although it might be convenient, I don't think such a design is good. Global state is simply terrible. I'd prefer to use dependency injection.
Define an error reporter structure that has the state and methods you need and pass references to the error reporter down to where it needs to be:
struct LoggingErrorSink {
had_error: bool,
}
impl LoggingErrorSink {
fn error(&mut self, line: usize, message: &str) {
self.report(line, "", message);
}
fn report(&mut self, line: usize, where_it_was: &str, message: &str) {
eprintln!("[line {} ] Error {}: {}", line, where_it_was, message);
self.had_error = true;
}
}
fn some_parsing_thing(errors: &mut LoggingErrorSink) {
errors.error(0, "It's broken");
}
In reality, I'd rather define a trait for things that allow reporting errors and implement it for a concrete type. Rust makes this nice because there's zero performance difference when using these generics.
trait ErrorSink {
fn error(&mut self, line: usize, message: &str) {
self.report(line, "", message);
}
fn report(&mut self, line: usize, where_it_was: &str, message: &str);
}
struct LoggingErrorSink {
had_error: bool,
}
impl ErrorSink for LoggingErrorSink {
fn report(&mut self, line: usize, where_it_was: &str, message: &str) {
eprintln!("[line {} ] Error {}: {}", line, where_it_was, message);
self.had_error = true;
}
}
fn some_parsing_thing<L>(errors: &mut L)
where
L: ErrorSink,
{
errors.error(0, "It's broken");
}
There are lots of variants of implementing this, all depending on your tradeoffs.
You could choose to have the logger take &self instead of &mut, which would force this case to use something like a Cell to gain internal mutability of had_error.
You could use something like an Rc to avoid adding any extra lifetimes to the calling chain.
You could choose to store the logger as a struct member instead of a function parameter.
For your extra keyboard work, you get the benefit of being able to test your errors. Simply whip up a dummy implementation of the trait that saves information to internal variables and pass it in at test time.
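A dummy implementation for tests might look like the sketch below. RecordingErrorSink and the scan_line function are hypothetical names for this example; scan_line merely stands in for whatever scanner code reports errors and keeps going.

```rust
pub trait ErrorSink {
    fn error(&mut self, line: usize, message: &str) {
        self.report(line, "", message);
    }
    fn report(&mut self, line: usize, where_it_was: &str, message: &str);
}

/// Test double: records reports instead of printing them.
#[derive(Default)]
pub struct RecordingErrorSink {
    pub reports: Vec<(usize, String)>,
}

impl ErrorSink for RecordingErrorSink {
    fn report(&mut self, line: usize, _where_it_was: &str, message: &str) {
        self.reports.push((line, message.to_string()));
    }
}

/// Hypothetical scanner step: reports every '@' as an error but keeps
/// scanning, and returns how many characters were accepted.
pub fn scan_line<S: ErrorSink>(line_no: usize, source: &str, errors: &mut S) -> usize {
    let mut accepted = 0;
    for c in source.chars() {
        if c == '@' {
            errors.error(line_no, "unexpected character '@'");
        } else {
            accepted += 1;
        }
    }
    accepted
}
```

A test can then pass a RecordingErrorSink into the code under test and assert on the contents of reports afterwards, without capturing stderr.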
Opinions, ahoy!
a strategy where I inject a reference to some ErrorReporter struct as a dependency into the Scanner
Yes, dependency injection is an amazing solution to a large number of coding issues.
and Token structs
I don't know why a token would need to report errors, but it would make sense for the tokenizer to do so.
but that seems unwieldy to me. I don't feel like an error reporter should be part of the struct's signature, am I wrong?
I'd say yes, you are wrong; you've stated this as an absolute truth, of which very few exist in programming.
Concretely, very few people care about what is inside your type, probably only the implementer. The person who constructs a value of your type might care a little because they need to pass in dependencies, but this is a good thing. They now know that this value can generate errors that they need to handle "out-of-band", as opposed to reading some documentation after their program doesn't work.
A few more people care about the actual signature of your type. This is a double-edged blade. In order to have maximal performance, Rust will force you to expose your generic types and lifetimes in your type signatures. Sometimes, this sucks, but either the performance gain is worth it, or you can hide it somehow and take the tiny hit. That's the benefit of a language that gives you choices.
See also
How to synchronize a static variable among threads running different instances of a class in Java?
Where are static methods and static variables stored in Java?
Static fields in a struct in Rust
How can you make a safe static singleton in Rust?
How do I create a global, mutable singleton?
How can I avoid a ripple effect from changing a concrete struct to generic?
