My main goal is to write an API server that retrieves part of its information from another, external API server. However, that external server is quite fragile, so I would like to limit the global number of concurrent requests made to it, for example to 10 or 20.
Thus, my idea was to write something like an HttpPool, which consumes tasks via a crossbeam bounded channel and distributes them among tokio tasks. The idea was to use a bounded channel to avoid publishing too much work, and a fixed set of tasks to limit the number of concurrent requests against the external API.
It seems to work as long as I do not create more than 8 tasks. If I define more, it blocks after fetching the first tasks from the queue.
use std::{error::Error, result::Result};

use tokio::sync::oneshot::Sender;
use tokio::time::timeout;
use tokio::time::{sleep, Duration};

use crossbeam_channel;

#[derive(Debug)]
struct HttpTaskRequest {
    url: String,
    result: Sender<String>,
}

type PoolSender = crossbeam_channel::Sender<HttpTaskRequest>;
type PoolReceiver = crossbeam_channel::Receiver<HttpTaskRequest>;

#[derive(Debug)]
struct HttpPool {
    size: i32,
    sender: PoolSender,
    receiver: PoolReceiver,
}

impl HttpPool {
    fn new(capacity: i32) -> Self {
        let (tx, rx) = crossbeam_channel::bounded::<HttpTaskRequest>(capacity as usize);
        HttpPool {
            size: capacity,
            sender: tx,
            receiver: rx,
        }
    }

    async fn start(self) -> Result<HttpPool, Box<dyn Error>> {
        for i in 0..self.size {
            let task_receiver = self.receiver.clone();
            tokio::spawn(async move {
                loop {
                    match task_receiver.recv() {
                        Ok(request) => {
                            if request.result.is_closed() {
                                println!("Task[{i}] received url {} already closed by receiver, seems to reach timeout already", request.url);
                            } else {
                                println!("Task[{i}] started to work {:?}", request.url);
                                let resp = reqwest::get("https://httpbin.org/ip").await;
                                println!("Resp: {:?}", resp);
                                println!("Done Send request for url {}", request.url);
                                request.result.send("Result".to_owned()).expect("Failed to send result");
                            }
                        }
                        Err(err) => println!("Error: {err}"),
                    }
                }
            });
        }
        Ok(self)
    }

    pub async fn request(&self, url: String) -> Result<(), Box<dyn Error>> {
        let (os_sender, os_receiver) = tokio::sync::oneshot::channel::<String>();
        let request = HttpTaskRequest {
            result: os_sender,
            url: url.clone(),
        };
        self.sender.send(request).expect("Failed to publish message to task group");
        // check if a timeout or value was returned
        match timeout(Duration::from_millis(100), os_receiver).await {
            Ok(res) => {
                println!("Request finished without reaching the timeout {}", res.unwrap());
            }
            Err(_) => {
                println!("Request {url} run into timeout");
            }
        }
        Ok(())
    }
}

#[tokio::main]
async fn main() {
    let http_pool = HttpPool::new(20).start().await.expect("Failed to start http pool");
    for i in 0..10 {
        let url = format!("T{}", i.to_string());
        http_pool.request(url).await.expect("Failed to request message");
    }
    loop {}
}
Maybe somebody can explain why the code blocks? Is it related to tokio::spawn?
I guess my attempt is wrong, so please let me know if there is another way to handle it. The goal can be summarized like this: I would like to request URLs and process them in such a fashion that no more than N concurrent requests are made against the external API server.
I have read this question: How can I perform parallel asynchronous HTTP GET requests with reqwest?. But in that answer the work is known up front, which is not the case in my example; my requests arrive on the fly, hence I am not sure how to handle them.
I have finally solved the mystery about the blocking in my code example above. As we can see, I used the crate crossbeam_channel, which does not cooperate with async code. If we call recv on this type of channel, the thread blocks until a message is received. Hence, there is no way for us to return to the tokio scheduler, which implies that no other task is able to run. To refresh your memory: async code only returns control to the scheduler when an .await is reached.
Furthermore, the code worked when I spawned fewer tasks than worker threads. The default number of worker threads equals the CPU core count, in my case eight. Hence, when I started more tasks than that, all worker threads were blocked and the application froze.
The fix was to replace the crate crossbeam-channel with async-channel, as stated on the tokio tutorial page.
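A minimal sketch of the worker loop after that switch (assuming async-channel 1.x, with the PoolSender/PoolReceiver aliases changed to point at async_channel::Sender/Receiver):

async fn start(self) -> Result<HttpPool, Box<dyn Error>> {
    for i in 0..self.size {
        // async_channel::Receiver is cloneable, like the crossbeam one
        let task_receiver = self.receiver.clone();
        tokio::spawn(async move {
            // recv().await yields back to the tokio scheduler while the
            // channel is empty, instead of blocking the worker thread
            while let Ok(request) = task_receiver.recv().await {
                println!("Task[{i}] started to work {:?}", request.url);
                let resp = reqwest::get("https://httpbin.org/ip").await;
                println!("Resp: {:?}", resp);
                // ignore the error: the caller may have timed out already
                let _ = request.result.send("Result".to_owned());
            }
        });
    }
    Ok(self)
}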
In case my answer is vague, I recommend reading the following posts:
https://github.com/tokio-rs/tokio/discussions/3858
https://ryhl.io/blog/async-what-is-blocking/
https://crates.io/crates/async-channel
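As an aside, since the question also asked for other ways to handle this: the same global cap on in-flight requests can be enforced without a hand-written pool at all, for example with a tokio::sync::Semaphore. A minimal sketch (not from the original post; tokio 1.x and reqwest assumed):

use std::sync::Arc;
use tokio::sync::Semaphore;

// At most 10 requests are in flight at any time, however fast URLs arrive.
async fn fetch_limited(urls: Vec<String>) {
    let limit = Arc::new(Semaphore::new(10));
    let mut handles = Vec::new();
    for url in urls {
        let limit = Arc::clone(&limit);
        handles.push(tokio::spawn(async move {
            // Wait here until one of the 10 permits is free.
            let _permit = limit.acquire_owned().await.expect("semaphore closed");
            let resp = reqwest::get(&url).await;
            println!("{url}: {:?}", resp.map(|r| r.status()));
            // _permit is dropped here, releasing the slot for the next task.
        }));
    }
    for handle in handles {
        let _ = handle.await;
    }
}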
Related
I'm learning the synchronization primitives of tokio. From the example code of Notify, I found it confusing to understand why Channel<T> is only mpsc.
use tokio::sync::Notify;
use std::collections::VecDeque;
use std::sync::Mutex;

struct Channel<T> {
    values: Mutex<VecDeque<T>>,
    notify: Notify,
}

impl<T> Channel<T> {
    pub fn send(&self, value: T) {
        self.values.lock().unwrap()
            .push_back(value);
        // Notify the consumer a value is available
        self.notify.notify_one();
    }

    // This is a single-consumer channel, so several concurrent calls to
    // `recv` are not allowed.
    pub async fn recv(&self) -> T {
        loop {
            // Drain values
            if let Some(value) = self.values.lock().unwrap().pop_front() {
                return value;
            }
            // Wait for values to be available
            self.notify.notified().await;
        }
    }
}
If there are elements in values, the consumer task takes one away.
If there are no elements in values, the consumer task yields until the producer notifies it.
But after I wrote some test code, I found that in no case does the consumer lose the notice from the producer.
Could someone give me test code to prove that the above Channel<T> fails to work as an mpmc channel?
The following code shows why it is unsafe to use the above channel as mpmc.
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let mut i = 0;
    loop {
        let ch = Arc::new(Channel {
            values: Mutex::new(VecDeque::new()),
            notify: Notify::new(),
        });
        let mut handles = vec![];
        // spawn 100 senders and 100 receivers per round
        for j in 0..100 {
            if j % 2 == 1 {
                for _ in 0..2 {
                    let sender = ch.clone();
                    tokio::spawn(async move {
                        sender.send(1);
                    });
                }
            } else {
                for _ in 0..2 {
                    let receiver = ch.clone();
                    let handle = tokio::spawn(async move {
                        receiver.recv().await;
                    });
                    handles.push(handle);
                }
            }
        }
        futures::future::join_all(handles).await;
        i += 1;
        println!("No.{i} loop finished.");
    }
}
If the next loop iteration is never reached, there are consumer tasks that never finish, i.e. consumer tasks that missed a notify.
Quote from the documentation you linked:
If you have two calls to recv and two calls to send in parallel, the following could happen:
1. Both calls to try_recv return None.
2. Both new elements are added to the vector.
3. The notify_one method is called twice, adding only a single permit to the Notify.
4. Both calls to recv reach the Notified future. One of them consumes the permit, and the other sleeps forever.
Replace try_recv with self.values.lock().unwrap().pop_front() in our case; the rest of the explanation stays identical.
The third point is the important one: multiple calls to notify_one result in only a single permit if no task is waiting yet. And there is a short time window in which multiple tasks have already checked for the existence of an item but are not yet waiting.
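For reference, the Notify documentation closes exactly this window for the single-consumer case by creating the Notified future before the check and registering it with enable(). A sketch of recv rewritten that way (following the tokio 1.x docs; it still does not make the channel mpmc):

impl<T> Channel<T> {
    pub async fn recv(&self) -> T {
        // Create the Notified future once, before checking for values.
        let future = self.notify.notified();
        tokio::pin!(future);
        loop {
            // Register this task as a waiter *before* the check, so a
            // notify_one between the check and the await cannot be lost.
            future.as_mut().enable();
            if let Some(value) = self.values.lock().unwrap().pop_front() {
                return value;
            }
            future.as_mut().await;
            // Re-arm the future for the next iteration.
            future.set(self.notify.notified());
        }
    }
}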
I am working on this kind of code:
database.rs
use tokio::task::JoinHandle;
use tokio_postgres::{Client, Connection, Error, NoTls, Socket, tls::NoTlsStream};

use crate::secret;

pub struct DatabaseConnection {
    pub client: Client,
    pub connection: Connection<Socket, NoTlsStream>
    // pub connection: JoinHandle<Connection<Socket, NoTlsStream>>
}

impl DatabaseConnection {
    async fn new() -> Result<DatabaseConnection, Error> {
        let (new_client, new_connection) = DatabaseConnection::create_connection().await.expect("Error, amazing is amazing");
        Ok(Self {
            client: new_client,
            // connection: tokio::spawn(async move { new_connection })
            connection: new_connection
        })
    }

    async fn create_connection() -> Result<(Client, Connection<Socket, NoTlsStream>), Error> {
        // Connect to the database.
        let (client, connection) =
            tokio_postgres::connect(
                &format!(
                    "postgres://{user}:{pswd}@localhost/{db}",
                    user = secret::USERNAME,
                    pswd = secret::PASSWORD,
                    db = secret::DATABASE
                )[..],
                NoTls)
            .await?;
        Ok((client, connection))
    }
}

pub async fn init_db() -> Result<(), Error> {
    let database_connection =
        DatabaseConnection::new()
            .await;

    let db_connection = tokio::spawn(async move {
        if let Err(e) = database_connection {
            println!("Connection error: {:?}", e);
        }
    });

    let create = database_connection.unwrap().client
        .query("CREATE TABLE person (
            id SERIAL PRIMARY KEY,
            name VARCHAR NOT NULL
        )", &[]).await?;

    Ok(())
}
main.rs
#[tokio::main]
async fn main() {
    match database::init_db().await {
        Ok(()) => println!("Successfully connected to the database"),
        Err(e) => eprintln!("On main: {}", e)
    }
}
I can't use the database_connection variable to perform my SQL statements because it's moved into the tokio task.
I already tried returning connection: tokio::spawn(async move { new_connection }) from my struct, but the task is never spawned until I access the attribute, and then it hands back an already-terminated database connection.
How can I solve this?
Thanks in advance
Going with two concurrent tasks (a dedicated event loop for the connection) allows you to share your connection between various program modules and to have a non-blocking (potentially even non-async) query API.
This can be achieved by implementing a reactor pattern, but it is not as trivial as in some other languages, because Rust makes sure you follow strict multi-thread correctness.
Let's say that task 1 is your main program, and it spawns task 2, a DatabaseConnection reactor event loop. Since this instance might be accessed by multiple threads at the same time to start or process an SQL query, wrap it in Arc<Mutex<DatabaseConnection>>.
To execute a query, task 1 needs to send an SQL command to task 2 and wait for the result. One way to do this is to use an mpsc channel for sending commands and a oneshot channel for the result. oneshot is similar to a promise/future in principle: you can await on one end, and push the value and wake the waiter from the other end.
For a code example, check out the channels tutorial. The "Spawn manager task" chapter corresponds to your DatabaseConnection task 2, waiting for SQL queries and processing them; the "Receive responses" chapter shows how to use oneshot to send the result back.
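A condensed sketch of that pattern applied here (DbCommand, run_db_task and query are names I made up, not tutorial or tokio-postgres API; tokio 1.x and a current tokio-postgres assumed):

use tokio::sync::{mpsc, oneshot};

// One command per query: the SQL text plus a oneshot to deliver the answer.
struct DbCommand {
    sql: String,
    respond_to: oneshot::Sender<Result<u64, tokio_postgres::Error>>,
}

// Task 2: owns the client, drives the connection, serves queries.
async fn run_db_task(mut commands: mpsc::Receiver<DbCommand>) -> Result<(), tokio_postgres::Error> {
    let (client, connection) =
        tokio_postgres::connect("postgres://user:pswd@localhost/db", tokio_postgres::NoTls).await?;
    // The connection future must be polled, or the client makes no progress.
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("connection error: {e}");
        }
    });
    while let Some(cmd) = commands.recv().await {
        let result = client.execute(cmd.sql.as_str(), &[]).await;
        let _ = cmd.respond_to.send(result); // requester may have given up
    }
    Ok(())
}

// Task 1 side: send a command, then await its oneshot for the result.
async fn query(tx: &mpsc::Sender<DbCommand>, sql: &str) -> Result<u64, tokio_postgres::Error> {
    let (respond_to, response) = oneshot::channel();
    tx.send(DbCommand { sql: sql.to_owned(), respond_to })
        .await
        .expect("db task stopped");
    response.await.expect("db task dropped the query")
}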
Also note that async doesn't imply that you have to block on await. If your program is simple enough, there is a possibility to avoid having the reactor while still not blocking. This can be done with tokio::select! inside a loop. An example of such usage can be found in the select tutorial, in the chapter "Resuming an async operation". Imagine that action() is your .query() method. Note that they call it but do not await it. The select! is then able to return when the query operation's results are ready, and while they are not ready, you are free to do other async work.
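Roughly, the shape from that chapter (a sketch; action() here is a stand-in for your not-yet-awaited query future, and the event channel is just an example of "other work"):

use std::time::Duration;

// Stand-in for the .query() call: created first, awaited inside select!.
async fn action() -> u64 {
    tokio::time::sleep(Duration::from_millis(100)).await;
    42
}

async fn drive(mut events: tokio::sync::mpsc::Receiver<String>) {
    let operation = action(); // call it here, but do not await it
    tokio::pin!(operation);
    loop {
        tokio::select! {
            result = &mut operation => {
                println!("query finished: {result:?}");
                break;
            }
            Some(event) = events.recv() => {
                // Other async work keeps running while the query is pending.
                println!("handled event: {event}");
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = tokio::sync::mpsc::channel(8);
    tx.send("tick".to_owned()).await.unwrap();
    drop(tx); // no more events; select! then only waits on the operation
    drive(rx).await;
}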
I have a loop where I do some work and send the result with a Sender. The work takes time and I need to retry it in case of failure. It's possible that while I retry it, the receiver has been closed and my retries will be a waste of time. Because of this, I need a way to check whether the Receiver is still alive without sending a message.
In an ideal world, I want my code to look like this in pseudocode:
let (tx, rx) = tokio::sync::mpsc::channel(1);

tokio::spawn(async move {
    // do some stuff with rx and drop it after some time
    rx.recv(...).await;
});

let mut attempts = 0;
loop {
    if tx.is_closed() {
        break;
    }
    if let Ok(result) = do_work().await {
        attempts = 0;
        let _ = tx.send(result).await;
    } else {
        if attempts >= 10 {
            break;
        } else {
            attempts += 1;
            continue;
        }
    }
};
The problem is that Sender doesn't have an is_closed method. It does have pub fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), ClosedError>>, but I don't know what Context is or where I can find it.
When I don't have a value to send, how can I check if the sender is able to send?
Sender has a try_send method:
Attempts to immediately send a message on this Sender
This method differs from send by returning immediately if the channel's buffer is full or no receiver is waiting to acquire some data. Compared with send, this function has two failure cases instead of one (one for disconnection, one for a full buffer).
Use it instead of send and check for the error (note that try_send is synchronous, so there is no .await):

if let Err(TrySendError::Closed(_)) = tx.try_send(result) {
    break;
}
It is possible to do what you want by using poll_fn from the futures crate. It adapts a function returning Poll into a Future:
use futures::future::poll_fn; // 0.3.5
use std::future::Future;
use tokio::sync::mpsc::{channel, error::ClosedError, Sender}; // 0.2.22
use tokio::time::delay_for; // 0.2.22

fn wait_until_ready<'a, T>(
    sender: &'a mut Sender<T>,
) -> impl Future<Output = Result<(), ClosedError>> + 'a {
    poll_fn(move |cx| sender.poll_ready(cx))
}

#[tokio::main]
async fn main() {
    let (mut tx, mut rx) = channel::<i32>(1);

    tokio::spawn(async move {
        // Receive one value and close the channel;
        let val = rx.recv().await;
        println!("{:?}", val);
    });

    wait_until_ready(&mut tx).await.unwrap();
    tx.send(123).await.unwrap();

    wait_until_ready(&mut tx).await.unwrap();
    delay_for(std::time::Duration::from_secs(1)).await;
    tx.send(456).await.unwrap(); // 456 likely never printed out,
                                 // despite having a positive readiness response
                                 // and the send "succeeding"
}
Note, however, that in the general case this is susceptible to TOCTOU. Even though Sender's poll_ready reserves a slot in the channel for later usage, it is possible that the receiving end is closed between the readiness check and the actual send. I tried to indicate this in the code.
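For later readers: newer tokio releases (1.x) added exactly what the question asks for to mpsc::Sender, so the poll_fn workaround is unnecessary there (a short sketch, assuming tokio 1.x):

// both of these exist on tokio::sync::mpsc::Sender in tokio 1.x
if tx.is_closed() {
    // the receiver was dropped; stop retrying
}
tx.closed().await; // or: resolves once the receiver goes away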
Send a null message that the receiver ignores. It could be anything. For example, if you're sending T now you could change it to Option<T> and have the receiver ignore Nones.
Yeah, that will work, although I don't really like this approach, since I need to change the communication format.
I wouldn't get hung up on the communication format. This isn't a well-defined network protocol that should be isolated from implementation details; it's an internal communication mechanism between two pieces of your own code.
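For what it's worth, here is a tiny self-contained sketch of that Option<T> probe idea (tokio 1.x assumed; the names are mine):

use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<Option<String>>(1);

    // Receiver side: ignore the None probes.
    let consumer = tokio::spawn(async move {
        while let Some(msg) = rx.recv().await {
            if let Some(value) = msg {
                println!("got {value}");
            }
        }
    });

    // Sender side: a None carries no data, it only tests liveness.
    if tx.send(None).await.is_err() {
        println!("receiver gone, stop retrying");
    }
    tx.send(Some("real work".to_owned())).await.unwrap();

    drop(tx); // close the channel so the consumer loop ends
    consumer.await.unwrap();
}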
I'm trying to write a simple Rust program that reads Docker stats using shiplift and exposes them as Prometheus metrics using rust-prometheus.
The shiplift stats example runs correctly on its own, and I'm trying to integrate it into the server as
fn handle(_req: Request<Body>) -> Response<Body> {
    let docker = Docker::new();
    let containers = docker.containers();
    let id = "my-id";
    let stats = containers
        .get(&id)
        .stats().take(1).wait();
    for s in stats {
        println!("{:?}", s);
    }
    // ...
}

// in main
let make_service = || {
    service_fn_ok(handle)
};

let server = Server::bind(&addr)
    .serve(make_service);
but it appears that the stream hangs forever (I cannot produce any error message).
I've also tried the same refactor (using take and wait instead of tokio::run) in the shiplift example, but in that case I get the error executor failed to spawn task: tokio::spawn failed (is a tokio runtime running this future?). Is tokio somehow required by shiplift?
EDIT:
If I've understood correctly, my attempt does not work because wait blocks the tokio executor, so stats will never produce results.
shiplift's API is asynchronous, meaning its functions, like stats(), return a Future or Stream instead of blocking the main thread until a result is ready. A Future won't actually do any I/O until it is passed to an executor. You need to pass the Future to tokio::run as in the example you linked to. You should read the tokio docs to get a better understanding of how to write asynchronous code in Rust.
There were quite a few mistakes in my understanding of how hyper works. Basically:
if a service should handle futures, do not use service_fn_ok to create it (it is meant for synchronous services): use service_fn;
do not use wait: all futures use the same executor, so execution will just hang forever (there is a warning in the docs, but oh well...);
as ecstaticm0rse notices, hyper::rt::spawn could be used to read stats asynchronously, instead of doing it in the service
Is tokio somehow required by shiplift?
Yes. It uses hyper, which throws executor failed to spawn task if the default tokio executor is not available (working with futures nearly always requires an executor anyway).
Here is a minimal version of what I ended up with (tokio 0.1.20 and hyper 0.12):
use std::net::SocketAddr;
use std::time::{Duration, Instant};
use tokio::prelude::*;
use tokio::timer::Interval;
use hyper::{
    Body, Response, service::service_fn_ok,
    Server, rt::{spawn, run}
};

fn init_background_task(swarm_name: String) -> impl Future<Item = (), Error = ()> {
    Interval::new(Instant::now(), Duration::from_secs(1))
        .map_err(|e| panic!(e))
        .for_each(move |_instant| {
            futures::future::ok(()) // unimplemented: call shiplift here
        })
}

fn init_server(address: SocketAddr) -> impl Future<Item = (), Error = ()> {
    let service = move || {
        service_fn_ok(|_request| Response::new(Body::from("unimplemented")))
    };
    Server::bind(&address)
        .serve(service)
        .map_err(|e| panic!("Server error: {}", e))
}

fn main() {
    let background_task = init_background_task("swarm_name".to_string());
    let server = init_server(([127, 0, 0, 1], 9898).into());

    run(hyper::rt::lazy(move || {
        spawn(background_task);
        spawn(server);
        Ok(())
    }));
}
I'm trying to move a receiver, referenced via an Arc, into a thread, so I can do centralized pub-sub via a dispatcher. However, I get the following error:
src/dispatcher.rs:58:11: 58:24 error: the trait `core::marker::Sync` is not implemented for the type `core::cell::UnsafeCell<std::sync::mpsc::Flavor<dispatcher::DispatchMessage>>` [E0277]
src/dispatcher.rs:58 thread::spawn(move || {
^~~~~~~~~~~~~
src/dispatcher.rs:58:11: 58:24 note: `core::cell::UnsafeCell<std::sync::mpsc::Flavor<dispatcher::DispatchMessage>>` cannot be shared between threads safely
src/dispatcher.rs:58 thread::spawn(move || {
Wat! I thought only Send was required for moving across channels? The code of DispatchMessage is:
#[derive(PartialEq, Debug, Clone)]
enum DispatchType {
    ChangeCurrentChannel,
    OutgoingMessage,
    IncomingMessage
}

#[derive(Clone)]
struct DispatchMessage {
    dispatch_type: DispatchType,
    payload: String
}
Both String and surely the enum are Send, right? Why is it complaining about Sync?
The relevant part from the dispatcher:
pub fn start(&self) {
    let shared_subscribers = Arc::new(self.subscribers);
    for ref broadcaster in &self.broadcasters {
        let shared_broadcaster = Arc::new(Mutex::new(broadcaster));
        let broadcaster = shared_broadcaster.clone();
        let subscribers = shared_subscribers.clone();
        thread::spawn(move || {
            loop {
                let message = &broadcaster.lock().unwrap().recv().ok().expect("Couldn't receive message in broadcaster");
                match subscribers.get(type_to_str(&message.dispatch_type)) {
                    Some(ref subs) => {
                        for sub in subs.iter() { sub.send(*message).unwrap(); }
                    },
                    None => ()
                }
            }
        });
    }
}
Full dispatcher code is in this gist: https://gist.github.com/timonv/5cdc56bf671cee69d3fa
If it's still relevant, built against the 5-2-2015 nightly.
Arc requires Sync, and it seems to me like you're attempting to put channels inside an Arc. Channels are not Sync, neither Sender nor Receiver.
Without knowing what you're trying to do, here are some things that may help you:
it's possible to clone Sender, so where you would probably Arc a T and share it between many threads, you can instead clone a Sender and send it to many threads, since it is Send
otherwise (and especially for Receiver, which you can't clone) you have to stick it inside an Arc<Mutex<T>>, which makes it Sync.
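To illustrate both bullets with std::sync::mpsc (a minimal sketch, not the dispatcher code from the question):

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<String>();

    // Sender is cheap to clone: give every producer thread its own.
    for i in 0..3 {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(format!("hello from producer {i}")).unwrap();
        });
    }
    drop(tx); // drop the original so the channel closes when producers finish

    // Receiver cannot be cloned; share it behind Arc<Mutex<_>> instead.
    let rx = Arc::new(Mutex::new(rx));
    let mut workers = Vec::new();
    for _ in 0..2 {
        let rx = Arc::clone(&rx);
        workers.push(thread::spawn(move || {
            // Note: the lock is held while waiting, so consumers take turns.
            while let Ok(msg) = rx.lock().unwrap().recv() {
                println!("{msg}");
            }
        }));
    }
    for worker in workers {
        worker.join().unwrap();
    }
}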
Although Jorge is correct in the general sense, the problem with this particular code is that creating an Arc<Mutex<T>> takes ownership of its argument, which thus cannot be a reference. This makes sense when you think about it: how can you lock something that is not yours? Or, more concretely, we need to lock whatever is at that memory location, not the pointer to it.
Changing the code to create the Arc Mutex when the broadcaster is added to the struct solves the problem. This would change that part of the code to:
pub fn register_broadcaster(&mut self, broadcaster: &mut Broadcast) {
    let handle = Arc::new(Mutex::new(broadcaster.broadcast_handle()));
    self.broadcasters.push(handle);
}
And then the start method of the dispatcher would look like:
pub fn start(&self) {
    // Assuming that broadcasters.clone() copies the vector, but increases the ref count on its elements
    for broadcaster in self.broadcasters.clone() {
        let subscribers = self.subscribers.clone();
        thread::spawn(move || {
            loop {
                let message = broadcaster.lock().unwrap().recv().ok().expect("Couldn't receive message in broadcaster or channel hung up");
                match subscribers.get(type_to_str(&message.dispatch_type)) {
                    Some(ref subs) => {
                        for sub in subs.iter() { sub.send(message.clone()).unwrap(); }
                    },
                    None => ()
                }
            }
        });
    }
}