Multi-threaded communication with an external process in Rust

A Rust newbie here.
I would like to launch an external long-running process and talk to it over pipes from multiple threads in Rust.
I am getting lifetime errors and can't figure out the proper way to please the lifetime checker. How can I restructure this?
Consider the following example:
use std::process::{Command, Stdio, ChildStdin};
use std::sync::Mutex;
use std::io::{Write};
use std::thread;
struct Element {
    sink: Mutex<Option<ChildStdin>>
}
impl Element {
    fn launch_process(&self) {
        let child =
            Command::new("sed").args(&["s/foo/bar/g"])
                .stdin(Stdio::piped())
                .spawn()
                .unwrap();
        let mut sink = self.sink.lock().unwrap();
        *sink = child.stdin;
    }
    fn tx(&self, content: &[u8]) {
        let mut sink = self.sink.lock().unwrap();
        sink.as_mut().unwrap().write(content);
    }
    fn start_tx(&self) {
        thread::spawn( || {
            self.tx(b"foo fighters");
        });
    }
}
fn main() {
    let e = Element {
        sink: Mutex::new(None)
    };
    e.launch_process();
    e.start_tx();
}
If I remove the thread::spawn bit then everything works as expected. With thread::spawn in place, I get the error:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
--> src/main.rs:28:24
|
28 | thread::spawn( || {
| ________________________^
29 | | self.tx(b"foo fighters");
30 | | });
| |_________^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 27:5...
--> src/main.rs:27:5
|
27 | / fn start_tx(&self) {
28 | | thread::spawn( || {
29 | | self.tx(b"foo fighters");
30 | | });
31 | | }
| |_____^
= note: ...so that the types are compatible:
expected &&Element
found &&Element
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `[closure#src/main.rs:28:24: 30:10 self:&&Element]` will meet its required lifetime bounds
--> src/main.rs:28:9
|
28 | thread::spawn( || {
| ^^^^^^^^^^^^^
error: aborting due to previous error

You can't pass &self (a temporary borrow) to a thread started with thread::spawn, because the thread may keep running after the reference is no longer valid.
To use data from such a thread you have two main options:
1) Give ownership (which is exclusive) of the object to the thread, i.e. use a move || closure, and don't try to use that object afterwards from the main thread or any other thread.
2) Wrap the object in Arc to get shared thread-safe ownership, and send a clone to the thread (Arc::clone is cheap, and the underlying data is shared).
When the compiler says that you need a "static lifetime", don't read that as "must live forever". For all practical purposes, it means "temporary references are not allowed".
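As a rough sketch of the second option, the element can be wrapped in an Arc and a clone of that Arc moved into the thread. To keep the example runnable without spawning sed, a Vec<u8> stands in for ChildStdin here; that substitution, the generic parameter W and the helper run_demo are assumptions of this sketch and are not part of the question's code.

```rust
use std::io::Write;
use std::sync::{Arc, Mutex};
use std::thread;

// `W` stands in for `ChildStdin`; any `Write + Send + 'static` type fits.
struct Element<W> {
    sink: Mutex<Option<W>>,
}

impl<W: Write + Send + 'static> Element<W> {
    fn tx(&self, content: &[u8]) {
        let mut sink = self.sink.lock().unwrap();
        sink.as_mut().unwrap().write_all(content).unwrap();
    }

    // Clones the `Arc`, so the spawned thread owns its own handle to the element.
    fn start_tx(this: &Arc<Self>) -> thread::JoinHandle<()> {
        let element = Arc::clone(this);
        thread::spawn(move || {
            element.tx(b"foo fighters");
        })
    }
}

// Hypothetical helper: runs one writer thread and returns what it wrote.
fn run_demo() -> Vec<u8> {
    let e = Arc::new(Element {
        sink: Mutex::new(Some(Vec::<u8>::new())),
    });
    Element::start_tx(&e).join().unwrap();
    let written = e.sink.lock().unwrap().take().unwrap();
    written
}

fn main() {
    let written = run_demo();
    println!("{}", String::from_utf8_lossy(&written));
}
```

With a real child process, the same shape should apply with ChildStdin in place of Vec<u8>; the main thread keeps its own Arc and can keep calling tx while other threads write.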

Related

How can I use a channel between an async closure and my main thread in rust?

I am trying to use a channel to communicate between an event handler and the main thread of my program using async Rust. The event handler in question is this one from the matrix-rust-sdk.
I can see that this exact pattern is used in the code here.
But when I tried literally the same thing in my own code, it gives me a really strange lifetime error.
error: lifetime may not live long enough
--> src/main.rs:84:13
|
80 | move |event: OriginalSyncRoomMessageEvent, room: Room| {
| ------------------------------------------------------
| | |
| | return type of closure `impl futures_util::Future<Output = ()>` contains a lifetime `'2`
| lifetime `'1` represents this closure's body
...
84 | / async move {
85 | | if let Room::Joined(room) = room {
86 | | if room.room_id() == room_id {
87 | | match event.content.msgtype {
... |
94 | | }
95 | | }
| |_____________^ returning this value requires that `'1` must outlive `'2`
|
= note: closure implements `Fn`, so references to captured variables can't escape the closure
I tried to make a much simpler example, and the weird lifetime error remains:
use tokio::sync::mpsc;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let (tx, _rx) = mpsc::channel(32);
    let closure = move || async {
        tx.send("hi");
    };
    Ok(())
}
Gives me:
error: lifetime may not live long enough
--> src/main.rs:9:27
|
9 | let closure = move || async {
| ___________________-------_^
| | | |
| | | return type of closure `impl Future<Output = ()>` contains a lifetime `'2`
| | lifetime `'1` represents this closure's body
10 | | tx.send("hi");
11 | | };
| |_____^ returning this value requires that `'1` must outlive `'2`
|
= note: closure implements `Fn`, so references to captured variables can't escape the closure
So how can I use a channel in an async closure? Why doesn't my code work when the code in the matrix-rust-sdk does?
I think you meant || async move instead of move || async.
use tokio::sync::mpsc;
#[tokio::main]
async fn main() {
    let (tx, _rx) = mpsc::channel(32);
    let closure = || async move {
        tx.send("hi").await.unwrap();
    };
    closure().await;
}
I think in most cases, |args| async move {} is what you want when you need an async closure, although I don't completely understand the differences either.
For more information, this might help:
What is the difference between `|_| async move {}` and `async move |_| {}`.
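To make the difference concrete for the channel example: with move || async { ... } the closure owns tx and the async block only borrows it from the closure, so the returned future would point into the closure's own state. With || async move { ... } the async block takes ownership of tx, so the future is self-contained, at the cost of the closure being callable only once. Below is a minimal sketch of the working form using only the standard library; std::sync::mpsc stands in for tokio's channel, and the small block_on helper stands in for a real executor (both are assumptions made to keep the example dependency-free).

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::mpsc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal stand-in for a runtime: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical helper: sends one message from inside the future and reads it back.
fn run_demo() -> &'static str {
    let (tx, rx) = mpsc::channel();
    // `async move` moves `tx` into the future, so nothing borrows from the closure.
    let closure = || async move {
        tx.send("hi").unwrap();
    };
    block_on(closure());
    rx.recv().unwrap()
}

fn main() {
    println!("{}", run_demo());
}
```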
I don't think your minimal example represents the actual problem in your real code, though. This is a minimal example that does:
#[derive(Clone, Debug)]
struct RoomId(u32);
#[derive(Clone, Debug)]
struct Room {
    id: RoomId,
}
impl Room {
    fn room_id(&self) -> &RoomId {
        &self.id
    }
}
#[tokio::main]
async fn main() {
    let dm_room = Room { id: RoomId(42) };
    let dm_room_closure = dm_room.clone();
    let closure = move || {
        let room_id = dm_room_closure.room_id();
        async move {
            println!("{}", room_id.0);
        }
    };
    closure().await;
}
error: lifetime may not live long enough
--> src/main.rs:23:9
|
20 | let closure = move || {
| -------
| | |
| | return type of closure `impl Future<Output = ()>` contains a lifetime `'2`
| lifetime `'1` represents this closure's body
...
23 | / async move {
24 | | println!("{}", room_id.0);
25 | | }
| |_________^ returning this value requires that `'1` must outlive `'2`
|
= note: closure implements `Fn`, so references to captured variables can't escape the closure
The real problem here is caused by the fact that room_id contains a reference to dm_room_closure, but dm_room_closure does not get kept alive by the innermost async move context.
To fix this, make sure that the async move keeps dm_room_closure alive by moving it in as well. In this case, this is as simple as creating the room_id variable inside of the async move:
#[derive(Clone, Debug)]
struct RoomId(u32);
#[derive(Clone, Debug)]
struct Room {
    id: RoomId,
}
impl Room {
    fn room_id(&self) -> &RoomId {
        &self.id
    }
}
#[tokio::main]
async fn main() {
    let dm_room = Room { id: RoomId(42) };
    let dm_room_closure = dm_room.clone();
    let closure = move || {
        async move {
            let room_id = dm_room_closure.room_id();
            println!("{}", room_id.0);
        }
    };
    closure().await;
}
42
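If the closure has to stay callable more than once (as an event handler usually does), another option is to copy the needed data out of the captured value before the async block starts, so that the future owns everything it uses. The sketch below assumes RoomId is cheap to clone, and it uses a hand-rolled block_on as a stand-in for tokio so that it runs with the standard library alone; run_demo and block_on are illustrative names, not part of the original code.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Clone, Debug)]
struct RoomId(u32);

#[derive(Clone, Debug)]
struct Room {
    id: RoomId,
}

impl Room {
    fn room_id(&self) -> &RoomId {
        &self.id
    }
}

// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal stand-in for a runtime: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Hypothetical helper: calls the same closure twice and collects the results.
fn run_demo() -> Vec<u32> {
    let dm_room = Room { id: RoomId(42) };
    let closure = move || {
        // Clone the id while the closure still owns `dm_room`...
        let room_id = dm_room.room_id().clone();
        // ...then move the owned copy into the future.
        async move { room_id.0 }
    };
    vec![block_on(closure()), block_on(closure())]
}

fn main() {
    println!("{:?}", run_demo());
}
```

Because the future only holds the cloned RoomId, the closure merely borrows dm_room on each call and can be invoked repeatedly.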
So I finally fixed the error. It turns out that something in Room doesn't implement Copy and therefore it was causing some sort of state sharing despite the Clones. I fixed it by passing the RoomId as a string. Since the lifetime error message is entirely opaque, there was no way to see which moved variable was actually causing the problem. Off to file a compiler bug report.

In Rust, I have a large number of receiver objects I'd like to manage, however I'm running into lifetime issues using Select

Because there may be a large number of objects, I'd like to have a way to add them to the select list, remove them for processing and then add them back, all without having to rebuild the select list each time an object is added back for waiting. It looks something like this:
use std::collections::HashMap;
use crossbeam::{Select, Sender, Receiver};
struct WaitList<'a> {
    sel: Select<'a>,
    objects: HashMap<u128, Object>,
    sel_index: HashMap<usize, u128>,
}
impl<'a> WaitList<'a> {
    fn new() -> Self { Self { sel: Select::new(), objects: HashMap::new(), sel_index: HashMap::new() } }
    fn select(&self) -> &Object {
        let oper = self.sel.select();
        let index = oper.index();
        let id = self.sel_index.get(&index).unwrap();
        let obj = self.objects.get(&id).unwrap();
        obj.cmd = oper.recv(&obj.receiver).unwrap();
        self.sel.remove(index);
        obj
    }
    fn add_object(&self, object: Object) {
        let id = object.id;
        self.objects.insert(id, object);
        self.add_select(id);
    }
    fn add_select(&self, id: u128) {
        let idx = self.sel.recv(&self.objects.get(&id).unwrap().receiver);
        self.sel_index.insert(idx, id);
    }
}
Over time the select list will contain more dead entries than live ones, and I'll rebuild it at that point. But I'd like to not have to rebuild it every time. Here's the detailed error message:
Checking test-select v0.1.0 (/Users/bruce/Projects/rust/examples/test-select)
error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
--> src/main.rs:28:47
|
28 | let idx = self.sel.recv(&self.objects.get(&id).unwrap().receiver);
| ^^^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 27:5...
--> src/main.rs:27:5
|
27 | / fn add_select(&self, id: u128) {
28 | | let idx = self.sel.recv(&self.objects.get(&id).unwrap().receiver);
29 | | self.sel_index.insert(idx, id);
30 | | }
| |_____^
note: ...so that reference does not outlive borrowed content
--> src/main.rs:28:34
|
28 | let idx = self.sel.recv(&self.objects.get(&id).unwrap().receiver);
| ^^^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 9:6...
--> src/main.rs:9:6
|
9 | impl<'a> WaitList<'a> {
| ^^
note: ...so that the types are compatible
--> src/main.rs:28:28
|
28 | let idx = self.sel.recv(&self.objects.get(&id).unwrap().receiver);
| ^^^^
= note: expected `&mut crossbeam::Select<'_>`
found `&mut crossbeam::Select<'a>`
While I believe I understand the issue (the borrow of the receiver from the hash table doesn't live long enough), I'm having a difficult time coming up with an alternative, and I'm not seeing a clean way to borrow the information. I considered creating a struct to contain the borrow and using that instead of a plain id in sel_index; however, that runs into the same lifetime problem.
struct SingleWaiter<'a> {
    id: u128,
    receiver: &'a Receiver<Command>
}
I feel like I'm missing something or not understanding something, as it seems like it shouldn't be that difficult to do what I want to do. I can imagine that the choice of HashMap for holding objects might be the issue; a Vec felt wrong, as I'm adding and removing entries. BTW, the HashMap isn't normally part of the wait list. It is part of something else, but the problem remains the same regardless of where the HashMap lives.

How to create a thread manager?

I have a data stream that I want to process in the background, but I want to create a struct or some functions to manage this stream.
In C++ land, I would create a class that abstracts all of this away. It would have a start method which would initialize the data stream and start a thread for processing. It would have a stop method that stops the processing and joins the thread.
However, this isn't really Rusty, and it doesn't even work in Rust.
Example (Playground)
use std::thread;
use std::time::Duration;
struct Handler {
    worker_handle: Option<thread::JoinHandle<()>>,
    stop_flag: bool, // Should be atomic, but lazy for example
}
impl Handler {
    pub fn new() -> Handler {
        let worker_handle = None;
        let stop_flag = true;
        return Handler { worker_handle, stop_flag };
    }
    pub fn start(&mut self) {
        self.stop_flag = false;
        self.worker_handle = Some(std::thread::spawn(move || {
            println!("Spawned");
            self.worker_fn();
        }));
    }
    pub fn stop(&mut self) {
        let handle = match self.worker_handle {
            None => return,
            Some(x) => x,
        };
        self.stop_flag = true;
        handle.join();
    }
    fn worker_fn(&mut self) {
        while !self.stop_flag {
            println!("Working!");
        }
    }
}
fn main() {
    let mut handler = Handler::new();
    handler.start();
    thread::sleep(Duration::from_millis(10000));
    handler.stop();
}
Output:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
--> src/main.rs:20:54
|
20 | self.worker_handle = Some(std::thread::spawn(move || {
| ______________________________________________________^
21 | | println!("Spawned");
22 | | self.worker_fn();
23 | | }));
| |_________^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 18:5...
--> src/main.rs:18:5
|
18 | / pub fn start(&mut self) {
19 | | self.stop_flag = false;
20 | | self.worker_handle = Some(std::thread::spawn(move || {
21 | | println!("Spawned");
22 | | self.worker_fn();
23 | | }));
24 | | }
| |_____^
= note: ...so that the types are compatible:
expected &mut Handler
found &mut Handler
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the type `[closure#src/main.rs:20:54: 23:10 self:&mut Handler]` will meet its required lifetime bounds
--> src/main.rs:20:35
|
20 | self.worker_handle = Some(std::thread::spawn(move || {
| ^^^^^^^^^^^^^^^^^^
error: aborting due to previous error
Even if I cheat and remove the call to worker_fn, I still can't really work with JoinHandles like I might expect from C++ land.
error[E0507]: cannot move out of `self.worker_handle.0` which is behind a mutable reference
--> src/main.rs:27:28
|
27 | let handle = match self.worker_handle {
| ^^^^^^^^^^^^^^^^^^ help: consider borrowing here: `&self.worker_handle`
28 | None => return,
29 | Some(x) => x,
| -
| |
| data moved here
| move occurs because `x` has type `std::thread::JoinHandle<()>`, which does not implement the `Copy` trait
error: aborting due to previous error
So it's clear that I'm going outside of the usual Rust model and probably shouldn't be using this strategy.
But I still want to build some kind of interface that lets me simply spin up the data stream without worrying about managing threads, and I'm not really sure of the best way to do this.
So it seems I have two core problems.
1) How can I create a function to be run in a thread which accepts data from an outside source, and can be signaled to quit safely? If I had an atomic bool for killing it, how would I share that between the threads?
2) How do I handle joining the thread when I'm done? The stop method needs to clean up the thread, but I don't know how to track a reference to it.
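One possible shape for both points, offered as a hedged sketch and not as the only way to do it: share the stop flag as an Arc<AtomicBool> so the worker owns its own handle to it, and store the JoinHandle in an Option so that stop can move it out with Option::take before joining. The iteration counter returned by the worker is an assumption added here only to make the result observable.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

struct Handler {
    worker_handle: Option<thread::JoinHandle<u64>>,
    stop_flag: Arc<AtomicBool>,
}

impl Handler {
    fn new() -> Self {
        Handler {
            worker_handle: None,
            stop_flag: Arc::new(AtomicBool::new(true)),
        }
    }

    fn start(&mut self) {
        if self.worker_handle.is_some() {
            return; // already running
        }
        self.stop_flag.store(false, Ordering::SeqCst);
        // The thread gets its own clone of the flag instead of borrowing `self`.
        let stop_flag = Arc::clone(&self.stop_flag);
        self.worker_handle = Some(thread::spawn(move || {
            let mut iterations = 0u64;
            while !stop_flag.load(Ordering::SeqCst) {
                iterations += 1; // placeholder for processing the data stream
                thread::sleep(Duration::from_millis(1));
            }
            iterations
        }));
    }

    // Signals the worker, then joins it; returns None if nothing was running.
    fn stop(&mut self) -> Option<u64> {
        self.stop_flag.store(true, Ordering::SeqCst);
        // `take` moves the handle out of the Option, leaving None behind.
        let handle = self.worker_handle.take()?;
        handle.join().ok()
    }
}

fn main() {
    let mut handler = Handler::new();
    handler.start();
    thread::sleep(Duration::from_millis(20));
    println!("worker result: {:?}", handler.stop());
}
```

Data from an outside source could be fed to the worker in the same way, for example by moving the receiving end of a std::sync::mpsc channel into the closure alongside the flag.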

How to implement poll on a Future in a non-blocking manner in futures 0.3?

Update: This question is not relevant anymore since Context got removed from the poll() signature.
I'm trying to implement a simple Future with the futures crate v0.3: opening a File.
My future currently looks like this:
struct OpenFuture {
    options: OpenOptions,
    path: PathBuf,
    output: Option<io::Result<File>>,
}
To implement Future, I came up with this:
impl Future for OpenFuture {
    type Output = io::Result<File>;
    fn poll(mut self: PinMut<Self>, cx: &mut Context) -> Poll<Self::Output> {
        if let Some(file) = self.file.take() {
            Poll::Ready(file)
        } else {
            let waker = cx.waker().clone();
            cx.spawner().spawn_obj(
                Box::new(lazy(move |_| {
                    // self.file = Some(self.options.open(&self.path));
                    waker.wake();
                })).into(),
            );
            Poll::Pending
        }
    }
}
If the output is Option::Some, it can be taken and the future is ready; this is straightforward. But if it's not ready, I don't want to block the thread, as mentioned in the documentation:
An implementation of poll should strive to return quickly, and must never block. Returning quickly prevents unnecessarily clogging up threads or event loops. If it is known ahead of time that a call to poll may end up taking awhile, the work should be offloaded to a thread pool (or something similar) to ensure that poll can return quickly.
So I want to offload the work. Since I have a Context passed to the poll method, I have access to a Spawn and a Waker. The Spawn should take a task, that opens the file. After the file is open, it notifies with waker.wake().
The provided code doesn't work when uncommenting the line due to a lifetime error:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
|
| Box::new(lazy(move |_| {
| _______________________________________^
| | self.file = Some(self.options.open(&self.path));
| | waker.wake();
| | })).into(),
| |_________________________^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body...
|
| fn poll(mut self: PinMut<Self>, cx: &mut Context) -> Poll<Self::Output> {
| _____________^
| | if let Some(file) = self.file.take() {
| | Poll::Ready(file)
| | } else {
... |
| | }
| | }
| |_____________^
= note: ...so that the types are compatible:
expected std::pin::PinMut<'_, _>
found std::pin::PinMut<'_, _>
= note: but, the lifetime must be valid for the static lifetime...
= note: ...so that the expression is assignable:
expected std::future::FutureObj<'static, _>
found std::future::FutureObj<'_, _>
How can this be solved?
Additionally, Spawn::spawn_obj returns a Result. How can this result be handled? Is it recommended to just return io::ErrorKind::Other?
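One way to offload the open call without borrowing self into the spawned task is to keep the result slot in shared state (an Arc<Mutex<...>>) and hand the worker clones of what it needs. The sketch below is written against the Future trait as it exists in today's standard library (Pin<&mut Self> and Context<'_>), not the PinMut-era API used in the question. A plain thread stands in for the spawner, and Shared, started, ThreadWaker and block_on are names invented for this example.

```rust
use std::fs::{File, OpenOptions};
use std::future::Future;
use std::io;
use std::path::PathBuf;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// State shared between the future and the worker thread (hypothetical name).
struct Shared {
    output: Option<io::Result<File>>,
    waker: Option<Waker>,
}

struct OpenFuture {
    options: OpenOptions,
    path: PathBuf,
    shared: Arc<Mutex<Shared>>,
    started: bool,
}

impl OpenFuture {
    fn new(options: OpenOptions, path: PathBuf) -> Self {
        let shared = Arc::new(Mutex::new(Shared { output: None, waker: None }));
        OpenFuture { options, path, shared, started: false }
    }
}

impl Future for OpenFuture {
    type Output = io::Result<File>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = &mut *self; // allowed because every field is `Unpin`
        {
            let mut shared = this.shared.lock().unwrap();
            if let Some(result) = shared.output.take() {
                return Poll::Ready(result);
            }
            // Remember the most recent waker so the worker can notify the task.
            shared.waker = Some(cx.waker().clone());
        }
        if !this.started {
            this.started = true;
            // Clone what the worker needs instead of borrowing `self`.
            let (options, path) = (this.options.clone(), this.path.clone());
            let shared = Arc::clone(&this.shared);
            thread::spawn(move || {
                let result = options.open(&path);
                let mut shared = shared.lock().unwrap();
                shared.output = Some(result);
                if let Some(waker) = shared.waker.take() {
                    waker.wake();
                }
            });
        }
        Poll::Pending
    }
}

// Tiny executor for the demo: parks the thread until the waker unparks it.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let mut options = OpenOptions::new();
    options.read(true);
    let result = block_on(OpenFuture::new(options, PathBuf::from("does/not/exist.txt")));
    println!("opened: {}", result.is_ok());
}
```

Spawning one thread per future is only for illustration; a thread pool would be the more realistic place to send the blocking call.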

How to set a lifetime to a value captured in a closure?

I wrote what I think is simple code:
#![feature(plugin)]
#![plugin(rocket_codegen)]
extern crate rocket;
extern crate statsd;
use rocket::{Data, Request};
use rocket::fairing::AdHoc;
use statsd::Client;
#[get("/")]
fn index() -> &'static str {
    "Hello, World"
}
fn main() {
    let mut client = Client::new("127.0.0.1:9125", "miniserver-rs").unwrap();
    rocket::ignite()
        .attach(AdHoc::on_request(|request, data|{
            client.incr("http.requests");
            println!("Request URI: {}", request.uri());
        }))
        .mount("/", routes![index])
        .launch();
    client.incr("server.bootstrap");
}
I try to send some metrics on each request, but I get the following compiler error:
Compiling miniserver-rs v0.1.0 (main.rs)
error[E0373]: closure may outlive the current function, but it borrows `client`, which is owned by the current function
--> src\main.rs:19:33
|
19 | .attach(AdHoc::on_request(|request, _data|{
| ^^^^^^^^^^^^^^^^ may outlive borrowed value `client`
20 | client.incr("http.requests");
| ------ `client` is borrowed here help: to force the closure to take ownership of `client` (and any other referenced variables), use the `move` keyword
|
19 | .attach(AdHoc::on_request(move |request, _data|{
| ^^^^^^^^^^^^^^^^^^^^^
error[E0387]: cannot borrow data mutably in a captured outer variable in an `Fn` closure
--> src\main.rs:20:11
|
20 | client.incr("http.requests");
| ^^^^^^
|
help: consider changing this closure to take self by mutable reference
--> src\main.rs:19:33
|
19 | .attach(AdHoc::on_request(|request, _data|{
| _________________________________^
20 | | client.incr("http.requests");
21 | | println!("Request URI: {}", request.uri());
22 | | }))
| |_______^
I understand that client is captured in a closure and owned by another function (main) that could live less than the closure. I cannot move it because Client doesn't implement Copy, so the reference could not be used afterwards.
I also understand that I cannot borrow mutable data in a closure (Client is mutable). After a lot of searching, I conclude that I need to use Arc/Rc in combination with Mutex/RwLock/RefCell, but before going further, I want to be sure this is required.
Let's look at the requirements. You want to call statsd::Client::incr(&mut client, metric) from inside the closure, so you need mutable access to client. This is a variable you close over with ||.
Now AdHoc::on_request<F>(f: F) requires F: Fn(...) + Send + Sync + 'static. Fn means you only have immutable access to your capture via &self. The 'static bound means the capture cannot be a reference itself, so it requires move ||. Finally Sync means you can't use Cell or RefCell to get a mutable reference from &self.client, since Rocket will share it between threads.
Just like you suspected, the canonical solution to have shared mutable access through a Send + Sync value is to use Arc<Mutex<_>>. This also solves the problem of "losing access by moving". Your code would look like the following (untested):
use std::sync::{Arc, Mutex};
fn main() {
    let client = Arc::new(Mutex::new(
        Client::new("127.0.0.1:9125", "miniserver-rs").unwrap()));
    // shallow-clone the Arc to move it into the closure
    let rocket_client = client.clone();
    rocket::ignite()
        .attach(AdHoc::on_request(move |request, _data| {
            rocket_client.lock()
                .unwrap()
                .incr("http.requests");
            println!("Request URI: {}", request.uri());
        }))
        .mount("/", routes![index])
        .launch();
    client.lock()
        .unwrap()
        .incr("server.bootstrap");
}
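The same pattern can be tried without Rocket or statsd. In the sketch below, Counter and register are made-up stand-ins (assumptions of this example, not real Rocket or statsd APIs): Counter::incr needs &mut self just like Client::incr, and register demands the same Fn + Send + Sync + 'static bounds described above and then calls the callback from several threads.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for `statsd::Client`: `incr` needs `&mut self`.
struct Counter {
    hits: u32,
}

impl Counter {
    fn incr(&mut self) {
        self.hits += 1;
    }
}

// Hypothetical stand-in for `AdHoc::on_request`: same bounds on the callback,
// which is then invoked once from each of four threads.
fn register<F>(callback: F)
where
    F: Fn(&str) + Send + Sync + 'static,
{
    let callback = Arc::new(callback);
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let callback = Arc::clone(&callback);
            thread::spawn(move || (*callback)(&format!("/request/{}", i)))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}

// Hypothetical helper: four "requests" plus one increment from the caller.
fn run_demo() -> u32 {
    let counter = Arc::new(Mutex::new(Counter { hits: 0 }));
    // Clone the Arc so the closure owns one handle and this function keeps the other.
    let request_counter = Arc::clone(&counter);
    register(move |_uri| {
        request_counter.lock().unwrap().incr();
    });
    counter.lock().unwrap().incr();
    let hits = counter.lock().unwrap().hits;
    hits
}

fn main() {
    println!("{}", run_demo());
}
```

The closure itself only ever sees &self, so it satisfies Fn; the mutation happens behind the Mutex, which is what makes the shared access Sync.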
