How to use tokio DuplexStream in spawned parallel tasks? - rust

Suppose I use a pair of DuplexStreams in an application in order to allow bidirectional communication between server and client.
use use tokio::io::duplex;
let (upstream, dwstream) = duplex(64*1024);
Then I use dwstream to send data to the client and read the data sent by the client from dwstream.
If all these operations are executed in one async block, of course there is no problem. But now, for concurrency, I spawn a new async task for reading the dwstream, and in another task I need to write data to the dwstream from time to time.
In order to move the dwstream to the new task, I have to create an Arc<Mutex<DuplexStream>> object and then clone it before it supports Move.
However, in order to read in the read task, the object must always be locked(), which actually makes it impossible to get the object in the write task.
let (upstream, dwstream) = duplex(64*1024);
let dwrc = Arc::new(Mutex::new(dwstream));
let dwrc1 = dwrc.clone();
// write task
tokio::spwan(async move {
... some operations ...
dwrc.lock().unwrap().write_all(...).await;
});
tokio::spwan(async move {
let data = dwrc1.lock().unwrap().read().await; // here I think dwstream is locked for a long time.
});
How can I resolve it?

You're supposed to write to one stream in one task, and read from the other stream in the other task. Currently, you're writing to and reading from the same stream (dwstream) across the two tasks.
let (mut upstream, mut dwstream) = duplex(64*1024);
// write task
tokio::spawn(async move {
upstream.write_all(b"abc").await;
});
// read task
tokio::spawn(async move {
let mut buf = vec![];
dwstream.read(&mut buf).await;
});
playground

Related

How to get the result of the first future to complete in a collection of futures?

I have a bunch of async_std::UdpSocket and want to drive them all until the first one completes, then get the result from its corresponding u8 buffer.
I've tried
// build sockets and buffers
let sockets = Vec::new();
let buffers = HashMap::new();
for addr in ["127.0.0.1:4000", "10.0.0.1:8080", "192.168.0.1:9000"] {
let socket = UdpSocket::bind(addr.parse().unwrap()).await?;
let buf = [0u8; 1500];
sockets.push(socket);
buffers.insert(socket.peer_addr()?, buf);
}
// create an iterator of tasks reading from each socket
let socket_tasks = sockets.into_iter().map(|socket| {
let socket_addr = socket.peer_addr().unwrap();
let buf = buffers.get_mut(&socket_addr).unwrap();
socket
.recv_from(buf)
.and_then(|_| async move { Ok(socket_addr) })
});
// wait for the first socket to return a value (DOESN'T WORK)
let buffer = try_join_all(socket_tasks)
.and_then(|socket_addr| async { Ok(buffers.get(&socket_addr)) })
.await
However this does not work because futures::future::try_join_all drives all futures to completion and returns a Vec<(SocketAddr)>. I want to get the result when a single future completes and get just a SocketAddr--something akin to select! but over a collection.
How can this be done?
If you only need the first future you can also use futures::future::select_all(). If there are many futures FuturesUnordered may be more performant because it only polls future when they awake, but if there are few select_all() is likely to be more performant as it is doing less work.
FuturesUnordered is a Stream that will produce the results of futures in the order they complete. You can socket_tasks.collect() into this type, and then use the next() method of StreamExt to create a future that will complete when the first future in the whole collection completes.

Rust: Safe multi threading with recursion

I'm new to Rust.
For learning purposes, I'm writing a simple program to search for files in Linux, and it uses a recursive function:
fn ffinder(base_dir:String, prmtr:&'static str, e:bool, h:bool) -> std::io::Result<()>{
let mut handle_vec = vec![];
let pth = std::fs::read_dir(&base_dir)?;
for p in pth {
let p2 = p?.path().clone();
if p2.is_dir() {
if !h{ //search doesn't include hidden directories
let sstring:String = get_fname(p2.display().to_string());
let slice:String = sstring[..1].to_string();
if slice != ".".to_string() {
let handle = thread::spawn(move || {
ffinder(p2.display().to_string(),prmtr,e,h);
});
handle_vec.push(handle);
}
}
else {//search include hidden directories
let handle2 = thread::spawn(move || {
ffinder(p2.display().to_string(),prmtr,e,h);
});
handle_vec.push(handle2);
}
}
else {
let handle3 = thread::spawn(move || {
if compare(rmv_underline(get_fname(p2.display().to_string())),rmv_underline(prmtr.to_string()),e){
println!("File found at: {}",p2.display().to_string().blue());
}
});
handle_vec.push(handle3);
}
}
for h in handle_vec{
h.join().unwrap();
}
Ok(())
}
I've tried to use multi threading (thread::spawn), however it can create too many threads, exploding the OS limit, which breaks the program execution.
Is there a way to multi thread with recursion, using a safe,limited (fixed) amount of threads?
As one of the commenters mentioned, this is an absolutely perfect case for using Rayon. The blog post mentioned doesn't show how Rayon might be used in recursion, only making an allusion to crossbeam's scoped threads with a broken link. However, Rayon provides its own scoped threads implementation that solves your problem as well in that only uses as many threads as you have cores available, avoiding the error you ran into.
Here's the documentation for it:
https://docs.rs/rayon/1.0.1/rayon/fn.scope.html
Here's an example from some code I recently wrote. Basically what it does is recursively scan a folder, and each time it nests into a folder it creates a new job to scan that folder while the current thread continues. In my own tests it vastly outperforms a single threaded approach.
let source = PathBuf::from("/foo/bar/");
let (tx, rx) = mpsc::channel();
rayon::scope(|s| scan(&source, tx, s));
fn scan<'a, U: AsRef<Path>>(
src: &U,
tx: Sender<(Result<DirEntry, std::io::Error>, u64)>,
scope: &Scope<'a>,
) {
let dir = fs::read_dir(src).unwrap();
dir.into_iter().for_each(|entry| {
let info = entry.as_ref().unwrap();
let path = info.path();
if path.is_dir() {
let tx = tx.clone();
scope.spawn(move |s| scan(&path, tx, s)) // Recursive call here
} else {
// dbg!("{}", path.as_os_str().to_string_lossy());
let size = info.metadata().unwrap().len();
tx.send((entry, size)).unwrap();
}
});
}
I'm not an expert on Rayon, but I'm fairly certain the threading strategy works like this:
Rayon creates a pool of threads to match the number of logical cores you have available in your environment. The first call to the scoped function creates a job that the first available thread "steals" from the queue of jobs available. Each time we make another recursive call, it doesn't necessarily execute immediately, but it creates a new job that an idle thread can then "steal" from the queue. If all of the threads are busy, the job queue just fills up each time we make another recursive call, and each time a thread finishes its current job it steals another job from the queue.
The full code can be found here: https://github.com/1Dragoon/fcp
(Note that repo is a work in progress and the code there is currently typically broken and probably won't work at the time you're reading this.)
As an exercise to the reader, I'm more of a sys admin than an actual developer, so I also don't know if this is the ideal approach. From Rayon's documentation linked earlier:
scope() is a more flexible building block compared to join(), since a loop can be used to spawn any number of tasks without recursing
The language of that is a bit confusing. I'm not sure what they mean by "without recursing". Join seems to intend for you to already have tasks known about ahead of time and to execute them in parallel as threads become available, whereas scope seems to be more aimed at only creating jobs when you need them rather than having to know everything you need to do in advance. Either that or I'm understanding their meaning backwards.

What's a good detailed explanation of Tokio's simple TCP echo server example (on GitHub and API reference)?

Tokio has the same example of a simple TCP echo server on its:
GitHub main page (https://github.com/tokio-rs/tokio)
API reference main page (https://docs.rs/tokio/0.2.18/tokio/)
However, in both pages, there is no explanation of what's actually going on. Here's the example, slightly modified so that the main function does not return Result<(), Box<dyn std::error::Error>>:
use tokio::net::TcpListener;
use tokio::prelude::*;
#[tokio::main]
async fn main() {
if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
tokio::spawn(async move {
let mut buf = [0; 1024];
// In a loop, read data from the socket and write the data back.
loop {
let n = match tcp_stream.read(&mut buf).await {
// socket closed
Ok(n) if n == 0 => return,
Ok(n) => n,
Err(e) => {
eprintln!("failed to read from socket; err = {:?}", e);
return;
}
};
// Write the data back
if let Err(e) = tcp_stream.write_all(&buf[0..n]).await {
eprintln!("failed to write to socket; err = {:?}", e);
return;
}
}
});
}
}
}
After reading the Tokio documentation (https://tokio.rs/docs/overview/), here's my mental model of this example. A task is spawned for each new TCP connection. And a task is ended whenever a read/write error occurs, or when the client ends the connection (i.e. n == 0 case). Therefore, if there are 20 connected clients at a point in time, there would be 20 spawned tasks. However, under the hood, this is NOT equivalent to spawning 20 threads to handle the connected clients concurrently. As far as I understand, this is basically the problem that asynchronous runtimes are trying to solve. Correct so far?
Next, my mental model is that a tokio scheduler (e.g. the multi-threaded threaded_scheduler which is the default for apps, or the single-threaded basic_scheduler which is the default for tests) will schedule these tasks concurrently on 1-to-N threads. (Side question: for the threaded_scheduler, is N fixed during the app's lifetime? If so, is it equal to num_cpus::get()?). If one task is .awaiting for the read or write_all operations, then the scheduler can use the same thread to perform more work for one of the other 19 tasks. Still correct?
Finally, I'm curious whether the outer code (i.e. the code that is .awaiting for tcp_listener.accept()) is itself a task? Such that in the 20 connected clients example, there aren't really 20 tasks but 21: one to listen for new connections + one per connection. All of these 21 tasks could be scheduled concurrently on one or many threads, depending on the scheduler. In the following example, I wrap the outer code in a tokio::spawn and .await the handle. Is it completely equivalent to the example above?
use tokio::net::TcpListener;
use tokio::prelude::*;
#[tokio::main]
async fn main() {
let main_task_handle = tokio::spawn(async move {
if let Ok(mut tcp_listener) = TcpListener::bind("127.0.0.1:8080").await {
while let Ok((mut tcp_stream, _socket_addr)) = tcp_listener.accept().await {
tokio::spawn(async move {
// ... same as above ...
});
}
}
});
main_task_handle.await.unwrap();
}
This answer is a summary of an answer I received on Tokio's Discord from Alice Ryhl. Big thank you!
First of all, indeed, for the multi-threaded scheduler, the number of OS threads is fixed to num_cpus.
Second, Tokio can swap the currently running task at every .await on a per-thread basis.
Third, the main function runs in its own task, which is spawned by the #[tokio::main] macro.
Therefore, for the first code block example, if there are 20 connected clients, there would be 21 tasks: one for the main macro + one for each of the 20 open TCP streams. For the second code block example, there would be 22 tasks because of the extra outer tokio::spawn but it's needless and doesn't add any concurrency.

How can I share or avoid sharing a websocket resource between two threads?

I am using tungstenite to build a chat server, and the way I want to do it relies on having many threads that communicate with each other through mpsc. I want to start up a new thread for each user that connects to the server and connect them to a websocket, and also have that thread be able to read from mpsc so that the server can send messages out through that connection.
The problem is that the mpsc read blocks the thread, but I can't block the thread if I want to be reading from it. The only thing I could think of to work around that is to make two threads, one for inbound and one for outbound messages, but that requires me to share my websocket connection with both workers, which of course I cannot do.
Here's a heavily truncated version of my code where I try to make two workers in the Action::Connect arm of the match statement, which gives error[E0382]: use of moved value: 'websocket' for trying to move it into the second worker's closure:
extern crate tungstenite;
extern crate workerpool;
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc::{self, Sender, Receiver};
use workerpool::Pool;
use workerpool::thunk::{Thunk, ThunkWorker};
use tungstenite::server::accept;
pub enum Action {
Connect(TcpStream),
Send(String),
}
fn main() {
let (main_send, main_receive): (Sender<Action>, Receiver<Action>) = mpsc::channel();
let worker_pool = Pool::<ThunkWorker<()>>::new(8);
{
// spawn thread to listen for users connecting to the server
let main_send = main_send.clone();
worker_pool.execute(Thunk::of(move || {
let listener = TcpListener::bind(format!("127.0.0.1:{}", 8080)).unwrap();
for (_, stream) in listener.incoming().enumerate() {
main_send.send(Action::Connect(stream.unwrap())).unwrap();
}
}));
}
let mut users: Vec<Sender<String>> = Vec::new();
// process actions from children
while let Some(act) = main_receive.recv().ok() {
match act {
Action::Connect(stream) => {
let mut websocket = accept(stream).unwrap();
let (user_send, user_receive): (Sender<String>, Receiver<String>) = mpsc::channel();
let main_send = main_send.clone();
// thread to read user input and propagate it to the server
worker_pool.execute(Thunk::of(move || {
loop {
let message = websocket.read_message().unwrap().to_string();
main_send.send(Action::Send(message)).unwrap();
}
}));
// thread to take server output and propagate it to the server
worker_pool.execute(Thunk::of(move || {
while let Some(message) = user_receive.recv().ok() {
websocket.write_message(tungstenite::Message::Text(message.clone())).unwrap();
}
}));
users.push(user_send);
}
Action::Send(message) => {
// take user message and echo to all users
for user in &users {
user.send(message.clone()).unwrap();
}
}
}
}
}
If I create just one thread for both in and output in that arm, then user_receive.recv() blocks the thread so I can't read any messages with websocket.read_message() until I get an mpsc message from the main thread. How can I solve both problems? I considered cloning the websocket but it doesn't implement Clone and I don't know if just making a new connection with the same stream is a reasonable thing to try to do, it seems hacky.
The problem is that the mpsc read blocks the thread
You can use try_recv to avoid thread blocking. The another implementation of mpsc is crossbeam_channel. That project is a recommended replacement even by the author of mpsc
I want to start up a new thread for each user that connects to the server
I think the asyn/await approach will be much better from most of the prospectives then thread per client one. You can read more about it there

Join futures with limited concurrency

I have a large vector of Hyper HTTP request futures and want to resolve them into a vector of results. Since there is a limit of maximum open files, I want to limit concurrency to N futures.
I've experimented with Stream::buffer_unordered but seems like it executed futures one by one.
We've used code like this in a project to avoid opening too many TCP sockets. These futures have Hyper futures within, so it seems exactly the same case.
// Convert the iterator into a `Stream`. We will process
// `PARALLELISM` futures at the same time, but with no specified
// order.
let all_done =
futures::stream::iter(iterator_of_futures.map(Ok))
.buffer_unordered(PARALLELISM);
// Everything after here is just using the stream in
// some manner, not directly related
let mut successes = Vec::with_capacity(LIMIT);
let mut failures = Vec::with_capacity(LIMIT);
// Pull values off the stream, dividing them into success and
// failure buckets.
let mut all_done = all_done.into_future();
loop {
match core.run(all_done) {
Ok((None, _)) => break,
Ok((Some(v), next_all_done)) => {
successes.push(v);
all_done = next_all_done.into_future();
}
Err((v, next_all_done)) => {
failures.push(v);
all_done = next_all_done.into_future();
}
}
}
This is used in a piece of example code, so the event loop (core) is explicitly driven. Watching the number of file handles used by the program showed that it was capped. Additionally, before this bottleneck was added, we quickly ran out of allowable file handles, whereas afterward we did not.

Resources