How to get the result of the first future to complete in a collection of futures?

How to get the result of the first future to complete in a collection of futures? - rust

I have a bunch of async_std::UdpSocket and want to drive them all until the first one completes, then get the result from its corresponding u8 buffer.
I've tried
// build sockets and buffers
let sockets = Vec::new();
let buffers = HashMap::new();
for addr in ["127.0.0.1:4000", "10.0.0.1:8080", "192.168.0.1:9000"] {
let socket = UdpSocket::bind(addr.parse().unwrap()).await?;
let buf = [0u8; 1500];
sockets.push(socket);
buffers.insert(socket.peer_addr()?, buf);
}
// create an iterator of tasks reading from each socket
let socket_tasks = sockets.into_iter().map(|socket| {
let socket_addr = socket.peer_addr().unwrap();
let buf = buffers.get_mut(&socket_addr).unwrap();
socket
.recv_from(buf)
.and_then(|_| async move { Ok(socket_addr) })
});
// wait for the first socket to return a value (DOESN'T WORK)
let buffer = try_join_all(socket_tasks)
.and_then(|socket_addr| async { Ok(buffers.get(&socket_addr)) })
.await
However this does not work because futures::future::try_join_all drives all futures to completion and returns a Vec<(SocketAddr)>. I want to get the result when a single future completes and get just a SocketAddr--something akin to select! but over a collection.
How can this be done?

If you only need the first future you can also use futures::future::select_all(). If there are many futures FuturesUnordered may be more performant because it only polls future when they awake, but if there are few select_all() is likely to be more performant as it is doing less work.

FuturesUnordered is a Stream that will produce the results of futures in the order they complete. You can socket_tasks.collect() into this type, and then use the next() method of StreamExt to create a future that will complete when the first future in the whole collection completes.

Related

How to use tokio DuplexStream in spawned parallel tasks?

Suppose I use a pair of DuplexStreams in an application in order to allow bidirectional communication between server and client.
use use tokio::io::duplex;
let (upstream, dwstream) = duplex(64*1024);
Then I use dwstream to send data to the client and read the data sent by the client from dwstream.
If all these operations are executed in one async block, of course there is no problem. But now, for concurrency, I spawn a new async task for reading the dwstream, and in another task I need to write data to the dwstream from time to time.
In order to move the dwstream to the new task, I have to create an Arc<Mutex<DuplexStream>> object and then clone it before it supports Move.
However, in order to read in the read task, the object must always be locked(), which actually makes it impossible to get the object in the write task.
let (upstream, dwstream) = duplex(64*1024);
let dwrc = Arc::new(Mutex::new(dwstream));
let dwrc1 = dwrc.clone();
// write task
tokio::spwan(async move {
... some operations ...
dwrc.lock().unwrap().write_all(...).await;
});
tokio::spwan(async move {
let data = dwrc1.lock().unwrap().read().await; // here I think dwstream is locked for a long time.
});
How can I resolve it?

You're supposed to write to one stream in one task, and read from the other stream in the other task. Currently, you're writing to and reading from the same stream (dwstream) across the two tasks.
let (mut upstream, mut dwstream) = duplex(64*1024);
// write task
tokio::spawn(async move {
upstream.write_all(b"abc").await;
});
// read task
tokio::spawn(async move {
let mut buf = vec![];
dwstream.read(&mut buf).await;
});
playground

What's a good detailed explanation of Tokio's simple TCP echo server example (on GitHub and API reference)?

Tokio has the same example of a simple TCP echo server on its:
GitHub main page (https://github.com/tokio-rs/tokio)
API reference main page (https://docs.rs/tokio/0.2.18/tokio/)
However, in both pages, there is no explanation of what's actually going on. Here's the example, slightly modified so that the main function does not return Result<(), Box<dyn std::error::Error>>:
use tokio::net::TcpListener;
use tokio::prelude::*;
#[tokio::main]
async fn main() {
if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
tokio::spawn(async move {
let mut buf = [0; 1024];
// In a loop, read data from the socket and write the data back.
loop {
let n = match tcp_stream.read(&mut buf).await {
// socket closed
Ok(n) if n == 0 => return,
Ok(n) => n,
Err(e) => {
eprintln!("failed to read from socket; err = {:?}", e);
return;
}
};
// Write the data back
if let Err(e) = tcp_stream.write_all(&buf[0..n]).await {
eprintln!("failed to write to socket; err = {:?}", e);
return;
}
}
});
}
}
}
After reading the Tokio documentation (https://tokio.rs/docs/overview/), here's my mental model of this example. A task is spawned for each new TCP connection. And a task is ended whenever a read/write error occurs, or when the client ends the connection (i.e. n == 0 case). Therefore, if there are 20 connected clients at a point in time, there would be 20 spawned tasks. However, under the hood, this is NOT equivalent to spawning 20 threads to handle the connected clients concurrently. As far as I understand, this is basically the problem that asynchronous runtimes are trying to solve. Correct so far?
Next, my mental model is that a tokio scheduler (e.g. the multi-threaded threaded_scheduler which is the default for apps, or the single-threaded basic_scheduler which is the default for tests) will schedule these tasks concurrently on 1-to-N threads. (Side question: for the threaded_scheduler, is N fixed during the app's lifetime? If so, is it equal to num_cpus::get()?). If one task is .awaiting for the read or write_all operations, then the scheduler can use the same thread to perform more work for one of the other 19 tasks. Still correct?
Finally, I'm curious whether the outer code (i.e. the code that is .awaiting for tcp_listener.accept()) is itself a task? Such that in the 20 connected clients example, there aren't really 20 tasks but 21: one to listen for new connections + one per connection. All of these 21 tasks could be scheduled concurrently on one or many threads, depending on the scheduler. In the following example, I wrap the outer code in a tokio::spawn and .await the handle. Is it completely equivalent to the example above?
use tokio::net::TcpListener;
use tokio::prelude::*;
#[tokio::main]
async fn main() {
let main_task_handle = tokio::spawn(async move {
if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
tokio::spawn(async move {
// ... same as above ...
});
}
}
});
main_task_handle.await.unwrap();
}

This answer is a summary of an answer I received on Tokio's Discord from Alice Ryhl. Big thank you!
First of all, indeed, for the multi-threaded scheduler, the number of OS threads is fixed to num_cpus.
Second, Tokio can swap the currently running task at every .await on a per-thread basis.
Third, the main function runs in its own task, which is spawned by the #[tokio::main] macro.
Therefore, for the first code block example, if there are 20 connected clients, there would be 21 tasks: one for the main macro + one for each of the 20 open TCP streams. For the second code block example, there would be 22 tasks because of the extra outer tokio::spawn but it's needless and doesn't add any concurrency.

How do I schedule a repeating task in Tokio?

I am replacing synchronous socket code written in Rust with the asynchronous equivalent using Tokio. Tokio uses futures for asynchronous activity so tasks are chained together and queued onto an executor to be executed by a thread pool.
The basic pseudocode for what I want to do is like this:
let tokio::net::listener = TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let in_buf = vec![0u8; 8192];
// TODO this should happen continuously until an error happens
let read_task = tokio::io::read(socket, in_buf).and_then(move |(socket, in_buf, bytes_read)| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
Ok(())
};
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);
This pseudocode would only execute my task once. How do I run it continuously? I want it to execute and then execute again and again etc. I only want it to stop executing if it panics or has an error result code. What's the simplest way of doing that?

Using loop_fn should work:
let read_task =
futures::future::loop_fn((socket, in_buf, 0), |(socket, in_buf, bytes_read)| {
if bytes_read > 0 { /* handle bytes */ }
tokio::io::read(socket, in_buf).map(Loop::Continue)
});

A clean way to accomplish this and not have to fight the type system is to use tokio-codec crate; if you want to interact with the reader as a stream of bytes instead of defining a codec you can use tokio_codec::BytesCodec.
use tokio::codec::Decoder;
use futures::Stream;
...
let tokio::net::listener = TcpListener::bind(&sock_addr).unwrap();
let server_task = listener.incoming().for_each(move |socket| {
let (_writer, reader) = tokio_codec::BytesCodec::new().framed(socket).split();
let read_task = reader.for_each(|bytes| {
/* ... Logic I want to happen repeatedly as bytes are read ... */
});
tokio::spawn(read_task);
Ok(())
}).map_err(|err| {
error!("Accept error = {:?}", err);
});
tokio::run(server_task);

Join futures with limited concurrency

I have a large vector of Hyper HTTP request futures and want to resolve them into a vector of results. Since there is a limit of maximum open files, I want to limit concurrency to N futures.
I've experimented with Stream::buffer_unordered but seems like it executed futures one by one.

We've used code like this in a project to avoid opening too many TCP sockets. These futures have Hyper futures within, so it seems exactly the same case.
// Convert the iterator into a `Stream`. We will process
// `PARALLELISM` futures at the same time, but with no specified
// order.
let all_done =
futures::stream::iter(iterator_of_futures.map(Ok))
.buffer_unordered(PARALLELISM);
// Everything after here is just using the stream in
// some manner, not directly related
let mut successes = Vec::with_capacity(LIMIT);
let mut failures = Vec::with_capacity(LIMIT);
// Pull values off the stream, dividing them into success and
// failure buckets.
let mut all_done = all_done.into_future();
loop {
match core.run(all_done) {
Ok((None, _)) => break,
Ok((Some(v), next_all_done)) => {
successes.push(v);
all_done = next_all_done.into_future();
}
Err((v, next_all_done)) => {
failures.push(v);
all_done = next_all_done.into_future();
}
}
}
This is used in a piece of example code, so the event loop (core) is explicitly driven. Watching the number of file handles used by the program showed that it was capped. Additionally, before this bottleneck was added, we quickly ran out of allowable file handles, whereas afterward we did not.

Asynchronously reconnecting a client to a server in an infinite loop

I'm not able to create a client that tries to connect to a server and:
if the server is down it has to try again in an infinite loop
if the server is up and connection is successful, when the connection is lost (i.e. server disconnects the client) the client has to restart the infinite loop to try to connect to the server
Here's the code to connect to a server; currently when the connection is lost the program exits. I'm not sure what the best way to implement it is; maybe I have to create a Future with an infinite loop?
extern crate tokio_line;
use tokio_line::LineCodec;
fn get_connection(handle: &Handle) -> Box<Future<Item = (), Error = io::Error>> {
let remote_addr = "127.0.0.1:9876".parse().unwrap();
let tcp = TcpStream::connect(&remote_addr, handle);
let client = tcp.and_then(|stream| {
let (sink, from_server) = stream.framed(LineCodec).split();
let reader = from_server.for_each(|message| {
println!("{}", message);
Ok(())
});
reader.map(|_| {
println!("CLIENT DISCONNECTED");
()
}).map_err(|err| err)
});
let client = client.map_err(|_| { panic!()});
Box::new(client)
}
fn main() {
let mut core = Core::new().unwrap();
let handle = core.handle();
let client = get_connection(&handle);
let client = client.and_then(|c| {
println!("Try to reconnect");
get_connection(&handle);
Ok(())
});
core.run(client).unwrap();
}
Add the tokio-line crate with:
tokio-line = { git = "https://github.com/tokio-rs/tokio-line" }

The key question seems to be: how do I implement an infinite loop using Tokio? By answering this question, we can tackle the problem of reconnecting infinitely upon disconnection. From my experience writing asynchronous code, recursion seems to be a straightforward solution to this problem.
UPDATE: as pointed out by Shepmaster (and the folks of the Tokio Gitter), my original answer leaks memory since we build a chain of futures that grows on each iteration. Here follows a new one:
Updated answer: use loop_fn
There is a function in the futures crate that does exactly what you need. It is called loop_fn. You can use it by changing your main function to the following:
fn main() {
let mut core = Core::new().unwrap();
let handle = core.handle();
let client = future::loop_fn((), |_| {
// Run the get_connection function and loop again regardless of its result
get_connection(&handle).map(|_| -> Loop<(), ()> {
Loop::Continue(())
})
});
core.run(client).unwrap();
}
The function resembles a for loop, which can continue or break depending on the result of get_connection (see the documentation for the Loop enum). In this case, we choose to always continue, so it will infinitely keep reconnecting.
Note that your version of get_connection will panic if there is an error (e.g. if the client cannot connect to the server). If you also want to retry after an error, you should remove the call to panic!.
Old answer: use recursion
Here follows my old answer, in case anyone finds it interesting.
WARNING: using the code below results in unbounded memory growth.
Making get_connection loop infinitely
We want to call the get_connection function each time the client is disconnected, so that is exactly what we are going to do (look at the comment after reader.and_then):
fn get_connection(handle: &Handle) -> Box<Future<Item = (), Error = io::Error>> {
let remote_addr = "127.0.0.1:9876".parse().unwrap();
let tcp = TcpStream::connect(&remote_addr, handle);
let handle_clone = handle.clone();
let client = tcp.and_then(|stream| {
let (sink, from_server) = stream.framed(LineCodec).split();
let reader = from_server.for_each(|message| {
println!("{}", message);
Ok(())
});
reader.and_then(move |_| {
println!("CLIENT DISCONNECTED");
// Attempt to reconnect in the future
get_connection(&handle_clone)
})
});
let client = client.map_err(|_| { panic!()});
Box::new(client)
}
Remember that get_connection is non-blocking. It just constructs a Box<Future>. This means that when calling it recursively, we still don't block. Instead, we get a new future, which we can link to the previous one by using and_then. As you can see, this is different to normal recursion since the stack doesn't grow on each iteration.
Note that we need to clone the handle (see handle_clone), and move it into the closure passed to reader.and_then. This is necessary because the closure is going to live longer than the function (it will be contained in the future we are returning).
Handling errors
The code you provided doesn't handle the case in which the client is unable to connect to the server (nor any other errors). Following the same principle shown above, we can handle errors by changing the end of get_connection to the following:
let handle_clone = handle.clone();
let client = client.or_else(move |err| {
// Note: this code will infinitely retry, but you could pattern match on the error
// to retry only on certain kinds of error
println!("Error connecting to server: {}", err);
get_connection(&handle_clone)
});
Box::new(client)
Note that or_else is like and_then, but it operates on the error produced by the future.
Removing unnecessary code from main
Finally, it is not necessary to use and_then in the main function. You can replace your main by the following code:
fn main() {
let mut core = Core::new().unwrap();
let handle = core.handle();
let client = get_connection(&handle);
core.run(client).unwrap();
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to get the result of the first future to complete in a collection of futures? - rust

If you only need the first future you can also use futures::future::select_all(). If there are many futures FuturesUnordered may be more performant because it only polls future when they awake, but if there are few select_all() is likely to be more performant as it is doing less work.

FuturesUnordered is a Stream that will produce the results of futures in the order they complete. You can socket_tasks.collect() into this type, and then use the next() method of StreamExt to create a future that will complete when the first future in the whole collection completes.

Related

How to use tokio DuplexStream in spawned parallel tasks?

What's a good detailed explanation of Tokio's simple TCP echo server example (on GitHub and API reference)?

How do I schedule a repeating task in Tokio?

Join futures with limited concurrency

Asynchronously reconnecting a client to a server in an infinite loop

Categories

Resources