Vector of futures in Rust doesn't execute concurrently [duplicate]

I'm trying to understand Future::select: in this example, the future with the longer time delay is returned first.
When I read this article with its example, it contradicts my expectations. The author writes:
The select function runs two (or more in case of select_all) futures and returns the first one coming to completion. This is useful for implementing timeouts.
It seems I don't understand the point of select.
extern crate futures; // v0.1 (old)
extern crate tokio_core;

use std::thread;
use std::time::Duration;

use futures::{Async, Future};
use tokio_core::reactor::Core;

struct Timeout {
    time: u32,
}

impl Timeout {
    fn new(period: u32) -> Timeout {
        Timeout { time: period }
    }
}

impl Future for Timeout {
    type Item = u32;
    type Error = String;

    fn poll(&mut self) -> Result<Async<u32>, Self::Error> {
        thread::sleep(Duration::from_secs(self.time as u64));
        println!("Timeout is done with time {}.", self.time);
        Ok(Async::Ready(self.time))
    }
}

fn main() {
    let time_out1 = Timeout::new(5);
    let time_out2 = Timeout::new(1);
    let task = time_out1.select(time_out2);

    let mut reactor = Core::new().unwrap();
    reactor.run(task);
}
I need to process the early future with the smaller time delay, and then work with the future with a longer delay. How can I do it?

TL;DR: use tokio::time
If there's one thing to take away from this: never perform blocking or long-running operations inside of asynchronous operations.
If you want a timeout, use something from tokio::time, such as delay_for or timeout:
use futures::future::{self, Either}; // 0.3.1
use std::time::Duration;
use tokio::time; // 0.2.9

#[tokio::main]
async fn main() {
    let time_out1 = time::delay_for(Duration::from_secs(5));
    let time_out2 = time::delay_for(Duration::from_secs(1));

    match future::select(time_out1, time_out2).await {
        Either::Left(_) => println!("Timer 1 finished"),
        Either::Right(_) => println!("Timer 2 finished"),
    }
}
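Since the quoted article mentions timeouts specifically, here is a minimal sketch of that use-case with the same tokio 0.2 API as above: wrap the slow future in tokio::time::timeout and you get an Err once the deadline passes.

use std::time::Duration;
use tokio::time; // 0.2

#[tokio::main]
async fn main() {
    let slow = time::delay_for(Duration::from_secs(5));

    match time::timeout(Duration::from_secs(1), slow).await {
        Ok(_) => println!("finished in time"),
        Err(_) => println!("timed out after 1 second"),
    }
}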
What's the problem?
To understand why you get the behavior you do, you have to understand the implementation of futures at a high level.
When you call run, there's a loop that calls poll on the passed-in future. It keeps looping until the future reports success or failure; anything else means the future isn't done yet and will be polled again.
Your implementation of poll "locks up" this loop for 5 seconds because nothing can break the call to sleep. By the time the sleep is done, the future is ready, thus that future is selected.
The implementation of an async timeout conceptually works by checking the clock every time it's polled and reporting whether enough time has passed.
The big difference is that when a future returns that it's not ready, another future can be checked. This is what select does!
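To make that concrete, here is a minimal sketch (assuming the same futures 0.1 and tokio_core APIs as the question) of a timeout whose poll checks the clock instead of sleeping:

use std::time::{Duration, Instant};
use futures::{task, Async, Future}; // futures 0.1, like the question

struct AsyncTimeout {
    deadline: Instant,
}

impl AsyncTimeout {
    fn new(secs: u64) -> AsyncTimeout {
        AsyncTimeout {
            deadline: Instant::now() + Duration::from_secs(secs),
        }
    }
}

impl Future for AsyncTimeout {
    type Item = ();
    type Error = String;

    fn poll(&mut self) -> Result<Async<()>, Self::Error> {
        if Instant::now() >= self.deadline {
            Ok(Async::Ready(())) // enough time has passed
        } else {
            // Ask to be polled again soon. A real timer would register a
            // wake-up instead of busy-polling like this.
            task::current().notify();
            Ok(Async::NotReady)
        }
    }
}

fn main() {
    let mut reactor = tokio_core::reactor::Core::new().unwrap();
    let task = AsyncTimeout::new(5).select(AsyncTimeout::new(1));
    let _ = reactor.run(task);
    println!("the 1-second timeout finished first");
}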
A dramatic re-enactment:
sleep-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: Hold on a seconnnnnnnn [... 5 seconds pass ...] nnnnd. Yes!
simplistic async-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: (checks watch) No.
select: Hey future2, are you ready to go?
future2: (checks watch) No.
core: Hey select, are you ready to go?
[... polling continues ...]
[... 1 second passes ...]
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: (checks watch) No.
select: Hey future2, are you ready to go?
future2: (checks watch) Yes!
This simple implementation polls the futures over and over until they are all complete. This is not the most efficient, and not what most executors do.
See How do I execute an async/await function without using any external dependencies? for an implementation of this kind of executor.
smart async-based timer
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: (checks watch) No, but I'll call you when something changes.
select: Hey future2, are you ready to go?
future2: (checks watch) No, but I'll call you when something changes.
[... core stops polling ...]
[... 1 second passes ...]
future2: Hey core, something changed.
core: Hey select, are you ready to go?
select: Hey future1, are you ready to go?
future1: (checks watch) No.
select: Hey future2, are you ready to go?
future2: (checks watch) Yes!
This more efficient implementation hands a waker to each future when it is polled. When a future is not ready, it saves that waker for later. When something changes, the waker notifies the core of the executor that now would be a good time to re-check the futures. This allows the executor to not perform what is effectively a busy-wait.
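A rough illustration of that idea, written against today's std::future types rather than futures 0.1, with a throwaway thread standing in for a real timer driver (a real executor's timer is far more careful about storing and re-registering wakers):

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::{Duration, Instant};

struct SmartTimeout {
    deadline: Instant,
    waker_registered: bool,
}

impl Future for SmartTimeout {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if Instant::now() >= self.deadline {
            return Poll::Ready(());
        }
        if !self.waker_registered {
            // Hand the waker to something that will call it when the deadline
            // passes; here a throwaway thread stands in for a real timer driver.
            let waker = cx.waker().clone();
            let deadline = self.deadline;
            std::thread::spawn(move || {
                let now = Instant::now();
                if deadline > now {
                    std::thread::sleep(deadline - now);
                }
                waker.wake();
            });
            self.waker_registered = true;
        }
        Poll::Pending
    }
}

fn main() {
    // futures 0.3's block_on is enough to drive this single future.
    futures::executor::block_on(SmartTimeout {
        deadline: Instant::now() + Duration::from_secs(1),
        waker_registered: false,
    });
    println!("smart timer fired");
}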
The generic solution
When you have an operation that is blocking or long-running, the appropriate thing to do is to move that work out of the async loop. See What is the best approach to encapsulate blocking I/O in future-rs? for details and examples.
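One concrete option with tokio is to hand the work to its blocking thread pool. A minimal sketch assuming tokio's spawn_blocking (available in 0.2 and later):

use std::time::Duration;

#[tokio::main]
async fn main() {
    // The closure runs on tokio's dedicated blocking thread pool, so the
    // async executor keeps polling other futures in the meantime.
    let answer = tokio::task::spawn_blocking(|| {
        std::thread::sleep(Duration::from_secs(5)); // blocking work
        42
    })
    .await
    .expect("the blocking task panicked");

    println!("blocking work finished with {}", answer);
}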

Related

Is there any way to keep threads spawned in Rust - i.e., to "memoize" a thread object?

On my runtime, I have the following Rust code:
pub fn reduce(heap: &Heap, prog: &Program, tids: &[usize], root: u64, debug: bool) -> Ptr {
    // Halting flag
    let stop = &AtomicBool::new(false);
    let barr = &Barrier::new(tids.len());
    let locs = &tids.iter().map(|_| AtomicU64::new(u64::MAX)).collect::<Vec<AtomicU64>>();

    // Spawn a thread for each worker
    std::thread::scope(|s| {
        for tid in tids {
            s.spawn(move || {
                reducer(heap, prog, tids, stop, barr, locs, root, *tid, debug);
            });
        }
    });

    // Return whnf term ptr
    return load_ptr(heap, root);
}
This will spawn many threads in order to perform a parallel computation. The problem is, the reduce function is called thousands of times, and the overhead of spawning threads can be considerable. When I implemented the same thing in C, I just kept the threads open and sent a message in order to activate them. In Rust, with the std::thread::scope idiom, I'm not sure how to do so. Is it possible to keep the threads spawned after the first call to reduce, by just modifying that one function? That is, without changing anything else in my code?
Threads spawned using the std::thread::scope API won't be able to outlive the calling function. For long-running threads you'll need to spawn them using std::thread::spawn.
Once you've made that change, rustc will be very upset with you due to lifetime errors, because you are sending non-static references into the spawned threads. Fixing those errors will require some changes. That path is long and full of learning.
If you want something that works really well and is simple, consider instead using the excellent rayon crate. Rayon maintains its own global thread pool so you don't have to.
Using rayon will look something like this:
use rayon::prelude::*;

tids.par_iter()
    .for_each(|tid| {
        reducer(heap, prog, tids, stop, barr, locs, root, *tid, debug);
    });
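If you do want to keep your own threads alive between calls (the "keep the threads open and send a message" approach from the question), the usual shape is a long-lived worker that blocks on a channel and is woken by messages. A rough sketch with hypothetical names (Job, worker), reduced to a single worker for brevity:

use std::sync::mpsc;
use std::thread;

// Hypothetical message type for waking the worker.
enum Job {
    Work(u64),
    Shutdown,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Job>();

    // Spawned once; stays alive and blocks on the channel between jobs.
    let worker = thread::spawn(move || {
        while let Ok(job) = rx.recv() {
            match job {
                Job::Work(n) => println!("worker handling job {}", n),
                Job::Shutdown => break,
            }
        }
    });

    for n in 0..3 {
        tx.send(Job::Work(n)).unwrap();
    }
    tx.send(Job::Shutdown).unwrap();
    worker.join().unwrap();
}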

Rust: can tokio be understood as similar to JavaScript's event loop or be used like it?

I'm not sure if tokio is similar to the event loop in JavaScript, i.e. also a non-blocking runtime, or if it can be used to work in a similar way. In my understanding, tokio is a runtime for futures in Rust. Therefore it must implement some kind of userland threads or tasks, which can be achieved with an event loop (at least partly) to schedule new tasks.
Let's take the following Javascript code:
console.log('hello1');
setTimeout(() => console.log('hello2'), 0);
console.log('hello3');
setTimeout(() => console.log('hello4'), 0);
console.log('hello5');
The output will be
hello1
hello3
hello5
hello2
hello4
How can I do this in tokio? Is tokio meant to work like this overall? I tried the following code
async fn set_timeout(f: impl Fn(), ms: u64) {
    tokio::time::sleep(tokio::time::Duration::from_millis(ms)).await;
    f()
}

#[tokio::main]
async fn main() {
    println!("hello1");
    tokio::spawn(async { set_timeout(|| println!("hello2"), 0) }).await;
    println!("hello3");
    tokio::spawn(async { set_timeout(|| println!("hello4"), 0) }).await;
    println!("hello5");
}
The output is just
hello1
hello3
hello5
If I change the code to
println!("hello1");
tokio::spawn(async {set_timeout(|| println!("hello2"), 0)}.await).await;
println!("hello3");
tokio::spawn(async {set_timeout(|| println!("hello4"), 0)}.await).await;
println!("hello5");
The output is
hello1
hello2
hello3
hello4
hello5
but then I don't get the point of the whole async/await/future feature, because my "async" set_timeout tasks are actually blocking the other println statements.
In short: yes, Tokio is meant to work much like the JavaScript event loop. However, there are three problems with your first snippet.
First, it returns from main() before waiting for things to play out. Unlike your JavaScript code, which presumably runs in the browser, and runs the timeouts even after the code you typed in the console has finished running, the Rust code is in a short-lived executable which terminates after main(). Whatever things were scheduled to happen later won't occur if the executable stops running because it returned from main().
The second issue is that the anonymous async block that calls the set_timeout() async function doesn't do anything with its return value. An important difference between async functions in Rust and JavaScript is that in Rust you can't just call an async function and be done with it. In JavaScript an async function returns a promise, and if you don't await that promise, the event loop will still execute the code of the async function in the background. In Rust, an async function returns a future, but it is not associated with any event loop, it is just prepared for someone to run it. You then need to either await it with .await (with the same meaning as in JavaScript) or explicitly pass it to tokio::spawn() to execute in the background (with the same meaning as calling but not awaiting the function in JavaScript). Your async block does neither, so the invocation of set_timeout() is a no-op.
Finally, the code immediately awaits the task created by spawn(), which defeats the purpose of calling spawn() in the first place - tokio::spawn(foo()).await is functionally equivalent to foo().await for any foo().
The first issue can be resolved by adding a tiny sleep at the end of main. (This is not the proper fix, but will serve to demonstrate what happens.) The second issue can be fixed by removing the async block and just passing the return value of set_timeout() to tokio::spawn(). The third issue is resolved by removing the unnecessary .await of the task.
#[tokio::main]
async fn main() {
    println!("hello1");
    tokio::spawn(set_timeout(|| println!("hello2"), 0));
    println!("hello3");
    tokio::spawn(set_timeout(|| println!("hello4"), 0));
    println!("hello5");
    tokio::time::sleep(tokio::time::Duration::from_millis(1)).await;
}
This code will print the "expected" 1, 3, 5, 4, 2 (although the order is not guaranteed in programs like this). Real code would not end with a sleep; instead, it would await the tasks it has created, as shown in Shivam's answer.
Unlike JavaScript, Rust does not start executing an async function until the future is awaited. That means set_timeout(|| println!("hello2"), 0) only creates a new future; it doesn't execute anything. Only when you await it does it run. .await suspends the current task until the future is complete, so awaiting each future in sequence is not really asynchronous. To make your code concurrent like JavaScript, you can use the join! macro:
use tokio::join;
use tokio::time::*;

async fn set_timeout(f: impl Fn(), ms: u64) {
    sleep(Duration::from_millis(ms)).await;
    f()
}

#[tokio::main]
async fn main() {
    println!("hello1");
    let fut_1 = tokio::spawn(set_timeout(|| println!("hello2"), 0));
    println!("hello3");
    let fut_2 = tokio::spawn(set_timeout(|| println!("hello4"), 0));
    println!("hello5");
    join!(fut_1, fut_2);
}
You can use FuturesOrdered if you want something that feels like Promise.all.
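For a feel of what that can look like, here is a rough sketch assuming futures 0.3 and a tokio runtime with the time feature enabled; treat it as illustrative rather than canonical:

use futures::stream::{FuturesOrdered, StreamExt}; // futures 0.3

#[tokio::main]
async fn main() {
    // Build an ordered collection of futures; results come back in the order
    // they were added, much like Promise.all.
    let tasks: FuturesOrdered<_> = (1..=3u64)
        .map(|n| async move {
            tokio::time::sleep(tokio::time::Duration::from_millis(10 * n)).await;
            n
        })
        .collect();

    let results: Vec<u64> = tasks.collect().await;
    println!("{:?}", results); // [1, 2, 3]
}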
More info:
https://news.ycombinator.com/item?id=21473777
https://rust-lang.github.io/async-book/06_multiple_futures/01_chapter.html

Does Rust currently have a library implementing functions similar to JavaScript's setTimeout and setInterval?

Does Rust currently have a library implementing functions similar to JavaScript's setTimeout and setInterval? That is, a library where I can call setTimeout and setInterval multiple times to manage multiple tasks at the same time.
I feel that tokio is not particularly convenient to use. I imagine it to be used like this:
fn callback1() {
    println!("callback1");
}

fn callback2() {
    println!("callback2");
}

set_interval(callback1, 10);
set_interval(callback1, 20);
set_timeout(callback1, 30);
Of course, I can simulate a function to make it work:
// just for test, not what I wanted at all
type rust_listener_callback = fn();

fn set_interval(func: rust_listener_callback, duration: i32) {
    func()
}

fn set_timeout(func: rust_listener_callback, duration: i32) {
    func();
}
If set_interval is implemented this way, combining multiple timers, adding and removing them dynamically, and cancelling them is not particularly convenient:
use std::time::Duration;
use tokio::time;

async fn set_interval(func: rust_listener_callback, duration: u64) {
    let mut interval = time::interval(Duration::from_millis(duration));
    tokio::spawn(async move {
        loop {
            interval.tick().await;
            func()
        }
    }).await;
}
// emm maybe loop can be removed, just a sample
What I want to know is whether there is a library that already does this, instead of writing it myself.
I have some idea of how I would write it myself: generally, all callbacks are turned into a task queue or task tree, and then tokio::time::delay_for can be used to execute them one by one, but the details are actually more complicated.
However, I think this general capability may already have been implemented and I just haven't found it yet, so I want to ask here. Thank you very much.
And importantly, I hope it can run on a single thread.
setTimeout can be done like this without the need for a crate:
tokio::spawn(async move {
    tokio::time::sleep(Duration::from_secs(5)).await;
    // code goes here
});
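A setInterval-style loop can be sketched in the same spirit, again assuming a tokio runtime and using tokio::time::interval:

tokio::spawn(async move {
    let mut interval = tokio::time::interval(Duration::from_secs(5));
    loop {
        interval.tick().await;
        // code goes here
    }
});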
I asked myself the same question a few days ago, created a solution for this (for tokio runtimes), and found your stackoverflow post just now.
https://crates.io/crates/tokio-js-set-interval
code.rs
use std::time::Duration;
use tokio_js_set_interval::{set_interval, set_timeout};

#[tokio::main]
async fn main() {
    println!("hello1");
    set_timeout!(println!("hello2"), 0);
    println!("hello3");
    set_timeout!(println!("hello4"), 0);
    println!("hello5");

    // give enough time before tokio's runtime exits
    tokio::time::sleep(Duration::from_millis(1)).await;
}
But this must be used with caution. There is no guarantee that the futures will be executed (because tokio's runtime must run long enough). Use it only:
for educational purposes,
and for low-priority background tasks that you don't rely on always being executed.
I created a library just for this, which allows setting many timeouts using only one tokio task (instead of spawning a new task for each timeout); this provides better performance and lower memory usage.
The library also supports cancelling timeouts, and provides some ways to optimize the performance of the timeouts.
Check it out:
https://crates.io/crates/set_timeout
Usage example:
use std::time::Duration;
use set_timeout::TimeoutScheduler; // assumed import path from the set_timeout crate

#[tokio::main]
async fn main() {
    let scheduler = TimeoutScheduler::new(None);

    // schedule a future which will run after 1.234 seconds from now.
    scheduler.set_timeout(Duration::from_secs_f32(1.234), async move {
        println!("It works!");
    });

    // make sure that the main task doesn't end before the timeout is executed, because if the main
    // task returns the runtime stops running.
    tokio::time::sleep(Duration::from_secs(2)).await;
}

How can I stop reading from a tokio::io::lines stream?

I want to terminate reading from a tokio::io::lines stream. I merged it with a oneshot future and fired the oneshot, but tokio::run kept running.
use futures::{sync::oneshot, *}; // 0.1.27
use std::{io::BufReader, time::Duration};
use tokio::prelude::*; // 0.1.21

fn main() {
    let (tx, rx) = oneshot::channel::<()>();

    let lines = tokio::io::lines(BufReader::new(tokio::io::stdin()));
    let lines = lines.for_each(|item| {
        println!("> {:?}", item);
        Ok(())
    });

    std::thread::spawn(move || {
        std::thread::sleep(Duration::from_millis(5000));
        println!("system shutting down");
        let _ = tx.send(());
    });

    let lines = lines.select2(rx);

    tokio::run(lines.map(|_| ()).map_err(|_| ()));
}
How can I stop reading from this?
There's nothing wrong with your strategy, but it will only work with futures that don't execute a blocking operation via Tokio's blocking (the traditional kind of blocking should never be done inside a future).
You can test this by replacing the tokio::io::lines(..) future with a simple interval future:
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
The problem is that tokio::io::Stdin internally uses tokio_threadpool::blocking.
When you use Tokio thread pool blocking (emphasis mine):
NB: The entire task that called blocking is blocked whenever the supplied closure blocks, even if you have used future combinators such as select - the other futures in this task will not make progress until the closure returns. If this is not desired, ensure that blocking runs in its own task (e.g. using futures::sync::oneshot::spawn).
Since this will block every other future in the combinator, your Receiver will not be able to get a signal from the Sender until the blocking ends.
Please see How can I read non-blocking from stdin?, or you can use tokio-stdin-stdout, which creates a channel to consume data from a dedicated stdin thread. It also has a line-by-line example.
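The idea behind both suggestions is the same: do the blocking read on a dedicated thread and hand the data to the async side over a channel. A rough sketch of that shape, written against modern tokio (1.x) rather than the 0.1 code in the question:

use std::io::BufRead;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel::<String>();

    // Blocking reads happen on a plain OS thread, never on the async executor.
    std::thread::spawn(move || {
        let stdin = std::io::stdin();
        for line in stdin.lock().lines() {
            let line = match line {
                Ok(line) => line,
                Err(_) => break,
            };
            if tx.send(line).is_err() {
                break; // receiver dropped: stop reading
            }
        }
    });

    // The receiving end can now be combined with timeouts, shutdown signals,
    // etc., without being wedged by a blocking read.
    while let Some(line) = rx.recv().await {
        println!("> {:?}", line);
    }
}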
Thank you for your comment and correcting my sentences.
I tried to stop this non-blocking Future and succeeded.
let lines = Interval::new(Instant::now(), Duration::from_secs(1));
My understanding is that it would work for this case to wrap the blocking future with tokio_threadpool::blocking.
I'll try it later.
Thank you very much.

How to freeze a thread and notify it from another?

I need to pause the current thread in Rust and notify it from another thread. In Java I would write:
synchronized(myThread) {
    myThread.wait();
}
and from the second thread (to resume main thread):
synchronized(myThread) {
    myThread.notify();
}
Is it possible to do the same in Rust?
Using a channel that sends type () is probably easiest:
use std::sync::mpsc::channel;
use std::thread;

let (tx, rx) = channel();

// Spawn your worker thread, giving it `tx` and whatever else it needs
thread::spawn(move || {
    // Do whatever
    tx.send(()).expect("Could not send signal on channel.");
    // Continue
});

// Do whatever
rx.recv().expect("Could not receive from channel.");
// Continue working
The () type is used because it carries effectively zero information, which makes it pretty clear you're only using it as a signal. The fact that it is zero-sized means it's also potentially faster in some scenarios (but realistically probably not any faster than a normal machine word write).
If you just need to notify the program that a thread is done, you can grab its JoinHandle and wait for it to finish:
let handle = thread::spawn( ... );
handle.join().expect("Could not join thread");
You can use std::thread::park() and std::thread::Thread::unpark() to achieve this.
In the thread you want to wait,
fn worker_thread() {
    std::thread::park();
}
in the controlling thread, which has a thread handle already,
fn main_thread(worker_thread: std::thread::Thread) {
    worker_thread.unpark();
}
Note that the parking thread can wake up spuriously, which means the thread can sometimes wake up without any other thread calling unpark on it. You should prepare for this situation in your code, or use something like the std::sync::mpsc::channel suggested in the accepted answer.
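A minimal sketch of guarding against that, assuming an atomic flag shared between the two threads:

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let ready = Arc::new(AtomicBool::new(false));
    let ready2 = Arc::clone(&ready);

    let worker = thread::spawn(move || {
        // Keep parking until the flag is actually set, so a spurious wakeup
        // just loops back into park().
        while !ready2.load(Ordering::Acquire) {
            thread::park();
        }
        println!("worker woke up for real");
    });

    ready.store(true, Ordering::Release);
    worker.thread().unpark();
    worker.join().unwrap();
}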
There are multiple ways to achieve this in Rust.
The underlying model in Java is that each object contains both a mutex and a condition variable, if I remember correctly. So using a mutex and condition variable would work...
... however, I would personally switch to using a channel instead:
the "waiting" thread has the receiving end of the channel, and waits for it
the "notifying" thread has the sending end of the channel, and sends a message
It is easier to manipulate than a condition variable, notably because there is no risk of accidentally using a different mutex when locking the variable.
The std::sync::mpsc module has two kinds of channel (asynchronous and synchronous) depending on your needs. Here, the asynchronous one matches more closely: std::sync::mpsc::channel.
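For completeness, the Mutex plus Condvar route mentioned above might be sketched like this; a rough illustration, not a recommendation over channels:

use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut ready = lock.lock().unwrap();
        *ready = true;
        cvar.notify_one(); // like myThread.notify()
    });

    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    // Loop to guard against spurious wakeups, like myThread.wait() in Java.
    while !*ready {
        ready = cvar.wait(ready).unwrap();
    }
    println!("notified");
}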
There is a monitor crate that provides this functionality by combining Mutex with Condvar in a convenience structure.
(Full disclosure: I am the author.)
Briefly, it can be used like this:
let mon = Arc::new(Monitor::new(false));
{
    let mon = mon.clone();
    let _ = thread::spawn(move || {
        thread::sleep(Duration::from_millis(1000));
        mon.with_lock(|mut done| { // done is a monitor::MonitorGuard<bool>
            *done = true;
            done.notify_one();
        });
    });
}
mon.with_lock(|mut done| {
    while !*done {
        done.wait();
    }
    println!("finished waiting");
});
Here, mon.with_lock(...) is semantically equivalent to Java's synchronized(mon) {...}.
