How to remotely shut down running tasks with Tokio - rust

I have a UDP socket that is receiving data
pub async fn start() -> Result<(), std::io::Error> {
    loop {
        let mut data = vec![0; 1024];
        socket.recv_from(&mut data).await?;
    }
}
This code is currently blocked on the .await when there is no data coming in. I want to gracefully shut down my server from my main thread, so how do I send a signal to this .await that it should stop sleeping and shut down instead?

Note: The Tokio website has a page on graceful shutdown.
If you have more than one task to kill, you should use a broadcast channel to send shutdown messages. You can use it together with tokio::select!.
use tokio::sync::broadcast::Receiver;

// You may want to log errors rather than return them in this function.
pub async fn start(mut kill: Receiver<()>) -> Result<(), std::io::Error> {
    tokio::select! {
        output = real_start() => output,
        _ = kill.recv() => Err(...),
    }
}

pub async fn real_start() -> Result<(), std::io::Error> {
    loop {
        let mut data = vec![0; 1024];
        socket.recv_from(&mut data).await?;
    }
}
Then to kill all the tasks, send a message on the channel.
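For example, here is a minimal sketch of wiring this up, assuming the start function above (the channel capacity and the number of tasks are arbitrary choices here):
use tokio::sync::broadcast;

#[tokio::main]
async fn main() {
    // Capacity 1 is enough, since we only ever send one shutdown message.
    let (kill_tx, _) = broadcast::channel::<()>(1);

    // Each task gets its own Receiver via subscribe().
    let task_a = tokio::spawn(start(kill_tx.subscribe()));
    let task_b = tokio::spawn(start(kill_tx.subscribe()));

    // ... later, tell every task to shut down.
    let _ = kill_tx.send(());

    // Wait for the tasks to finish.
    let _ = task_a.await;
    let _ = task_b.await;
}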
To kill only a single task, you can use the JoinHandle::abort method, which will kill the task as soon as possible. Note that this method is only available in Tokio 1.x and 0.3.x; to abort a task on Tokio 0.2.x, use the abortable approach described below.
let task = tokio::spawn(start());
...
task.abort();
As an alternative to JoinHandle::abort, you can use abortable from the futures crate. When you spawn the task, you do the following:
use futures::future::abortable;

let (task, handle) = abortable(start());
tokio::spawn(task);
Then later you can kill the task by calling the abort method.
handle.abort();
Of course, a channel with select! can also be used to kill a single task, perhaps combined with a oneshot channel rather than a broadcast channel, as in the sketch below.
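A minimal sketch of that variant, assuming the real_start function from above (the concrete error returned on shutdown is just an illustrative placeholder):
use tokio::sync::oneshot;

pub async fn start(kill: oneshot::Receiver<()>) -> Result<(), std::io::Error> {
    tokio::select! {
        output = real_start() => output,
        // A oneshot Receiver is itself a future, so it can be selected on directly.
        _ = kill => Err(std::io::Error::new(std::io::ErrorKind::Other, "shutdown")),
    }
}

// Elsewhere:
// let (kill_tx, kill_rx) = oneshot::channel::<()>();
// let task = tokio::spawn(start(kill_rx));
// ...
// let _ = kill_tx.send(());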
All of these methods guarantee that the real_start function is killed at an .await. It is not possible to kill the task while it is running code between two .awaits. You can read more about why this is here.
The mini-redis project contains an accessible real-world example of graceful shutdown of a server. Additionally, the Tokio tutorial has chapters on both select and channels.

Related

Why does my asynchronous request pool (using crossbeam channels) block?

My main goal is to write an API server, which retrieves part of its information from another, external API server. However, that API server is quite fragile, so I would like to limit the global number of concurrent requests made to it, for example to 10 or 20.
Thus, my idea was to write something like an HttpPool, which consumes requests via a bounded crossbeam channel and distributes them among a fixed set of tokio tasks. The bounded channel keeps me from publishing too much work, and the fixed set of tasks limits the number of concurrent requests against the external API.
It seems to work if I do not create more than 8 tasks. If I define more, it blocks after fetching the first tasks from the queue.
use std::{error::Error, result::Result};
use tokio::sync::oneshot::Sender;
use tokio::time::timeout;
use tokio::time::{sleep, Duration};
use crossbeam_channel;

#[derive(Debug)]
struct HttpTaskRequest {
    url: String,
    result: Sender<String>,
}

type PoolSender = crossbeam_channel::Sender<HttpTaskRequest>;
type PoolReceiver = crossbeam_channel::Receiver<HttpTaskRequest>;

#[derive(Debug)]
struct HttpPool {
    size: i32,
    sender: PoolSender,
    receiver: PoolReceiver,
}

impl HttpPool {
    fn new(capacity: i32) -> Self {
        let (tx, rx) = crossbeam_channel::bounded::<HttpTaskRequest>(capacity as usize);
        HttpPool {
            size: capacity,
            sender: tx,
            receiver: rx,
        }
    }

    async fn start(self) -> Result<HttpPool, Box<dyn Error>> {
        for i in 0..self.size {
            let task_receiver = self.receiver.clone();
            tokio::spawn(async move {
                loop {
                    match task_receiver.recv() {
                        Ok(request) => {
                            if request.result.is_closed() {
                                println!("Task[{i}] received url {} already closed by receiver, seems to reach timeout already", request.url);
                            } else {
                                println!("Task[{i}] started to work {:?}", request.url);
                                let resp = reqwest::get("https://httpbin.org/ip").await;
                                println!("Resp: {:?}", resp);
                                println!("Done Send request for url {}", request.url);
                                request.result.send("Result".to_owned()).expect("Failed to send result");
                            }
                        }
                        Err(err) => println!("Error: {err}"),
                    }
                }
            });
        }
        Ok(self)
    }

    pub async fn request(&self, url: String) -> Result<(), Box<dyn Error>> {
        let (os_sender, os_receiver) = tokio::sync::oneshot::channel::<String>();
        let request = HttpTaskRequest {
            result: os_sender,
            url: url.clone(),
        };
        self.sender.send(request).expect("Failed to publish message to task group");
        // check if a timeout or value was returned
        match timeout(Duration::from_millis(100), os_receiver).await {
            Ok(res) => {
                println!("Request finished without reaching the timeout {}", res.unwrap());
            }
            Err(_) => {
                println!("Request {url} run into timeout");
            }
        }
        Ok(())
    }
}

#[tokio::main]
async fn main() {
    let http_pool = HttpPool::new(20).start().await.expect("Failed to start http pool");
    for i in 0..10 {
        let url = format!("T{}", i.to_string());
        http_pool.request(url).await.expect("Failed to request message");
    }
    loop {}
}
Maybe somebody can explain why the code blocks? Is it related to tokio::spawn?
I guess my attempt is wrong, so please let me know if there is another way to handle it. The goal can be summarized like this: I would like to request URLs and process them in such a way that no more than N concurrent requests are made against the API server.
I have read this question: How can I perform parallel asynchronous HTTP GET requests with reqwest?. But there, in that answer, the work is known up front, which is not the case in my example. Requests arrive on the fly, so I am not sure how to handle them.
I have finally solved the mystery of the blocking in my code example above. As we can see, I used the crate crossbeam_channel, which does not cooperate with async code. If we call recv on this type of channel, the thread blocks until a message is received. Hence there is no way for us to return to the tokio scheduler, which implies that no other task is able to run. To refresh your memory, async code only hands control back to the scheduler when an .await is reached.
Furthermore, the code was working as long as we spawned fewer tasks than worker threads. The default number of worker threads is equal to the CPU core count, in my case eight. Hence, once I started more tasks than that, all worker threads were blocked and the application froze.
The fix was to replace the crate crossbeam-channel with async-channel, as stated on the tokio tutorial page; a sketch of the change is below.
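A minimal sketch of the fixed worker loop with async-channel (the names here are illustrative, not the exact code from my project):
use async_channel::{bounded, Receiver};
use tokio::sync::oneshot;

#[derive(Debug)]
struct HttpTaskRequest {
    url: String,
    result: oneshot::Sender<String>,
}

async fn worker(id: i32, task_receiver: Receiver<HttpTaskRequest>) {
    // recv().await suspends this task instead of blocking the worker thread,
    // so the tokio scheduler stays free to run other tasks in the meantime.
    while let Ok(request) = task_receiver.recv().await {
        println!("Task[{id}] started to work {}", request.url);
        let resp = reqwest::get("https://httpbin.org/ip").await;
        println!("Resp: {:?}", resp);
        let _ = request.result.send("Result".to_owned());
    }
}

// In HttpPool::new:     let (tx, rx) = bounded::<HttpTaskRequest>(capacity as usize);
// In HttpPool::start:   tokio::spawn(worker(i, self.receiver.clone()));
// In HttpPool::request: self.sender.send(request).await now needs .await as well.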
In case my answer is vague, I recommend reading the following posts:
https://github.com/tokio-rs/tokio/discussions/3858
https://ryhl.io/blog/async-what-is-blocking/
https://crates.io/crates/async-channel

Why can't I communicate with a forked child process using Tokio UnixStream?

I'm trying to get a parent process and a child process to communicate with each other using a tokio::net::UnixStream. For some reason the child is unable to read whatever the parent writes to the socket, and presumably the other way around.
The function I have is similar to the following:
pub async fn run() -> Result<(), Error> {
    let mut socks = UnixStream::pair()?;

    match fork() {
        Ok(ForkResult::Parent { .. }) => {
            socks.0.write_u32(31337).await?;
            Ok(())
        }
        Ok(ForkResult::Child) => {
            eprintln!("Reading from master");
            let msg = socks.1.read_u32().await?;
            eprintln!("Read from master {}", msg);
            Ok(())
        }
        Err(_) => Err(Error),
    }
}
The socket doesn't get closed, otherwise I'd get an immediate error trying to read from socks.1. If I move the read into the parent process it works as expected. The first line "Reading from master" gets printed, but the second line never gets called.
I cannot change the communication paradigm, since I'll be using execve to start another binary that expects to be talking to a socketpair.
Any idea what I'm doing wrong here? Is it something to do with the async/await?
When you call the fork() system call:
The child process is created with a single thread—the one that called fork().
The default executor in tokio is a thread pool executor. The child process will only get one of the threads in the pool, so it won't work properly.
I found I was able to make your program work by setting the thread pool to contain only a single thread, like this:
use tokio::prelude::*;
use tokio::net::UnixStream;
use nix::unistd::{fork, ForkResult};
use nix::sys::wait;
use std::io::Error;
use std::io::ErrorKind;
use wait::wait;

// Limit to 1 thread
#[tokio::main(core_threads = 1)]
async fn main() -> Result<(), Error> {
    let mut socks = UnixStream::pair()?;

    match fork() {
        Ok(ForkResult::Parent { .. }) => {
            eprintln!("Writing!");
            socks.0.write_u32(31337).await?;
            eprintln!("Written!");
            wait().unwrap();
            Ok(())
        }
        Ok(ForkResult::Child) => {
            eprintln!("Reading from master");
            let msg = socks.1.read_u32().await?;
            eprintln!("Read from master {}", msg);
            Ok(())
        }
        Err(_) => Err(Error::new(ErrorKind::Other, "oh no!")),
    }
}
Another change I had to make was to force the parent to wait for the child to complete, by calling wait() - also something you probably do not want to be doing in a real async program.
Most of the advice I have read says that if you need to fork from a threaded program, either do it before creating any threads, or call execve() in the child immediately after forking (which is what you plan to do anyway).

How can I send a message to a specific thread?

I need to create some threads, where some of them will run until their runner variable is changed. This is my minimal code.
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let mut log_runner = Arc::new(Mutex::new(true));
    println!("{}", *log_runner.lock().unwrap());
    let mut threads = Vec::new();
    {
        let mut log_runner_ref = Arc::clone(&log_runner);
        // log runner thread
        let handle = thread::spawn(move || {
            while *log_runner_ref.lock().unwrap() == true {
                // DO SOME THINGS CONTINUOUSLY
                println!("I'm a separate thread!");
            }
        });
        threads.push(handle);
    }

    // let the main thread to sleep for x time
    thread::sleep(Duration::from_millis(1));
    // stop the log_runner thread
    *log_runner.lock().unwrap() = false;
    // join all threads
    for handle in threads {
        handle.join().unwrap();
        println!("Thread joined!");
    }
    println!("{}", *log_runner.lock().unwrap());
}
It looks like I'm able to stop the log runner thread by setting log_runner to false after the sleep. Is there a way to mark the threads with some name/ID or something similar, and send a message to a specific thread using that marker (name/ID)?
If I understand it correctly, let (tx, rx) = mpsc::channel(); can be used for sending messages to all the threads rather than to a specific one. I could send some identifier with each message and have every thread check whether the identifier matches its own before acting on it, but I would like to avoid that broadcasting effect.
MPSC stands for Multiple Producers, Single Consumer. As such, no, you cannot use that by itself to send a message to all threads, since for that you'd have to be able to duplicate the consumer. There are tools for this, but the choice of them requires a bit more info than just "MPMC" or "SPMC".
Honestly, if you can rely on channels for messaging (there are cases where it'd be a bad idea), you can create a channel per thread, assign the ID outside of the thread, and keep a HashMap instead of a Vec with the IDs associated with the threads. Receiver<T> can be moved into the thread (it implements Send if T implements Send), so you can quite literally move it in.
You then keep the Sender outside and send stuff to it :-) See the sketch below.
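A minimal sketch of that approach (the IDs, message type, and "stop" message are arbitrary choices for illustration):
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

fn main() {
    // One Sender per thread, keyed by an ID we assign ourselves.
    let mut senders: HashMap<u32, Sender<String>> = HashMap::new();
    let mut handles = Vec::new();

    for id in 0..3u32 {
        let (tx, rx) = channel::<String>();
        senders.insert(id, tx);
        // The Receiver is moved into its thread; nobody else can read from it.
        handles.push(thread::spawn(move || {
            for msg in rx {
                println!("thread {id} got: {msg}");
                if msg == "stop" {
                    break;
                }
            }
        }));
    }

    // Message only thread 1; the other threads never see it.
    senders[&1].send("hello".to_string()).unwrap();

    // Tell every thread to stop, then join them all.
    for tx in senders.values() {
        tx.send("stop".to_string()).unwrap();
    }
    for handle in handles {
        handle.join().unwrap();
    }
}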

How to use Rust shiplift from a hyper server

I'm trying to write a simple Rust program that reads Docker stats using shiplift and exposes them as Prometheus metrics using rust-prometheus.
The shiplift stats example runs correctly on its own, and I'm trying to integrate it in the server as
fn handle(_req: Request<Body>) -> Response<Body> {
    let docker = Docker::new();
    let containers = docker.containers();
    let id = "my-id";
    let stats = containers
        .get(&id)
        .stats().take(1).wait();
    for s in stats {
        println!("{:?}", s);
    }
    // ...
}

// in main
let make_service = || {
    service_fn_ok(handle)
};

let server = Server::bind(&addr)
    .serve(make_service);
but it appears that the stream hangs forever (I cannot produce any error message).
I've also tried the same refactor (using take and wait instead of tokio::run) in the shiplift example, but in that case I get the error executor failed to spawn task: tokio::spawn failed (is a tokio runtime running this future?). Is tokio somehow required by shiplift?
EDIT:
If I've understood correctly, my attempt does not work because wait will block the tokio executor and stats will never produce results.
shiplift's API is asynchronous, meaning stats() and the other functions return a Future (or Stream) instead of blocking the main thread until a result is ready. A Future won't actually do any I/O until it is passed to an executor. You need to pass the Future to tokio::run as in the example you linked to. You should read the tokio docs to get a better understanding of how to write asynchronous code in Rust.
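For illustration, a rough sketch of that idea in the futures 0.1 style the question uses; it mirrors the question's own stats pipeline, with the container id as a placeholder:
use shiplift::Docker;
use tokio::prelude::*; // brings the futures 0.1 Future/Stream combinators into scope

fn main() {
    let docker = Docker::new();
    let containers = docker.containers();
    let id = "my-id";
    // Building the pipeline does no I/O yet; it only describes the work.
    let print_stats = containers
        .get(&id)
        .stats()
        .take(1)
        .for_each(|s| {
            println!("{:?}", s);
            Ok(())
        })
        .map_err(|e| eprintln!("stats error: {:?}", e));
    // The I/O happens only once the future is handed to the executor.
    tokio::run(print_stats);
}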
There were quite a few mistakes in my understanding of how hyper works. Basically:
if a service should handle futures, do not use service_fn_ok to create it (it is meant for synchronous services): use service_fn;
do not use wait: all futures use the same executor, the execution will just hang forever (there is a warning in the docs but oh well...);
as ecstaticm0rse notices, hyper::rt::spawn could be used to read stats asynchronously, instead of doing it in the service
Is tokio somehow required by shiplift?
Yes. It uses hyper, which throws executor failed to spawn task if the default tokio executor is not available (working with futures nearly always requires an executor anyway).
Here is a minimal version of what I ended up with (tokio 0.1.20 and hyper 0.12):
use std::net::SocketAddr;
use std::time::{Duration, Instant};
use tokio::prelude::*;
use tokio::timer::Interval;
use hyper::{
    Body, Response, service::service_fn_ok,
    Server, rt::{spawn, run}
};

fn init_background_task(swarm_name: String) -> impl Future<Item = (), Error = ()> {
    Interval::new(Instant::now(), Duration::from_secs(1))
        .map_err(|e| panic!(e))
        .for_each(move |_instant| {
            futures::future::ok(()) // unimplemented: call shiplift here
        })
}

fn init_server(address: SocketAddr) -> impl Future<Item = (), Error = ()> {
    let service = move || {
        service_fn_ok(|_request| Response::new(Body::from("unimplemented")))
    };
    Server::bind(&address)
        .serve(service)
        .map_err(|e| panic!("Server error: {}", e))
}

fn main() {
    let background_task = init_background_task("swarm_name".to_string());
    let server = init_server(([127, 0, 0, 1], 9898).into());
    run(hyper::rt::lazy(move || {
        spawn(background_task);
        spawn(server);
        Ok(())
    }));
}

Can a Tokio task terminate the whole runtime gracefully?

I start up a Tokio runtime with code like this:
tokio::run(my_future);
My future goes on to start a bunch of tasks in response to various conditions.
One of those tasks is responsible for determining when the program should shut down. However, I don't know how to have that task gracefully terminate the program. Ideally, I'd like to find a way for this task to cause the run function call to terminate.
Below is an example of the kind of program I would like to write:
extern crate tokio;

use tokio::prelude::*;
use std::time::Duration;
use std::time::Instant;
use tokio::timer::{Delay, Interval};

fn main() {
    let kill_future = Delay::new(Instant::now() + Duration::from_secs(3));
    let time_print_future = Interval::new_interval(Duration::from_secs(1));

    let mut runtime = tokio::runtime::Runtime::new().expect("failed to start new Runtime");
    runtime.spawn(time_print_future.for_each(|t| Ok(println!("{:?}", t))).map_err(|_| ()));
    runtime.spawn(
        kill_future
            .map_err(|_| {
                eprintln!("Timer error");
            })
            .map(move |()| {
                // TODO
                unimplemented!("Shutdown the runtime!");
            }),
    );

    // TODO
    unimplemented!("Block until the runtime is shutdown");

    println!("Done");
}
shutdown_now seems promising, but upon further investigation, it's probably not going to work. In particular, it takes ownership of the runtime, and Tokio is probably not going to allow both the main thread (where the runtime was created) and some random task to own the runtime.
You can use a oneshot channel to communicate from inside the runtime to outside. When the delay expires, we send a single message through the channel.
Outside of the runtime, once we receive that message we initiate a shutdown of the runtime and wait for it to finish.
use std::time::{Duration, Instant};
use tokio::{
    prelude::*,
    runtime::Runtime,
    sync::oneshot,
    timer::{Delay, Interval},
}; // 0.1.15

fn main() {
    let mut runtime = Runtime::new().expect("failed to start new Runtime");

    let (tx, rx) = oneshot::channel();

    runtime.spawn({
        let every_second = Interval::new_interval(Duration::from_secs(1));
        every_second
            .for_each(|t| Ok(println!("{:?}", t)))
            .map_err(drop)
    });

    runtime.spawn({
        let in_three_seconds = Delay::new(Instant::now() + Duration::from_secs(3));
        in_three_seconds
            .map_err(|_| eprintln!("Timer error"))
            .and_then(move |_| tx.send(()))
    });

    rx.wait().expect("unable to wait for receiver");

    runtime
        .shutdown_now()
        .wait()
        .expect("unable to wait for shutdown");

    println!("Done");
}
See also:
How do I gracefully shutdown the Tokio runtime in response to a SIGTERM?
Is there any way to shutdown `tokio::runtime::current_thread::Runtime`?
How can I stop the hyper HTTP web server and return an error?

Resources