Safely move or dereference Receiver in a Fn? - rust

I'm working on an app that optionally uses a GUI to display video data that's roughly structured like this:
fn main() {
    let (window_tx, window_rx) =
        MainContext::channel::<MyStruct>(PRIORITY_DEFAULT);

    let some_thread = thread::spawn(move || -> () {
        // send data to window_tx
    });

    let application =
        gtk::Application::new(Some("com.my.app"), Default::default());

    application.connect_activate(move |app: &gtk::Application| {
        build_ui(app, window_rx);
    });
    application.run();

    some_thread.join().unwrap();
}

fn build_ui(application: &gtk::Application, window_rx: Receiver<MyStruct>) {
    window_rx.attach( ... );
}
The gtk rust library requires a Fn callback passed to application.connect_activate on startup, so I can't use a FnOnce or FnMut closure to move the glib::Receiver in the callback. The compiler throws this error:
error[E0507]: cannot move out of `window_rx`, a captured variable in an `Fn` closure
I've tried to avoid the move by wrapping window_rx in an Rc, i.e.:
let r = Rc::new(RefCell::new(window_rx));
application.connect_activate(move |app: &gtk::Application| {
    build_ui(app, Rc::clone(&r));
});
But upon dereferencing the Rc in my build_ui function, I get this error:
error[E0507]: cannot move out of an `Rc`
The fallback I've used thus far is to move the channel creation and thread creation into my build_ui function, but because the GUI is not required, I was hoping to avoid using GTK and the callback entirely when the GUI is not used. Is there some way I can either safely move window_rx within the closure or otherwise dereference it in the callback without causing an error?

When you need to move a value out from code that, by the type system but not in practice, could be called more than once, the simple tool to reach for is Option. Wrapping the value in an Option allows it to be swapped with an Option::None.
When you need something to be mutable even though you're inside a Fn, you need interior mutability; in this case, Cell will do. Here's a complete compilable program that approximates your situation:
use std::cell::Cell;
// Placeholders to let it compile
use std::sync::mpsc;

fn wants_fn_callback<F>(_f: F) where F: Fn() + 'static {}
struct MyStruct;

fn main() {
    let (_, window_rx) = mpsc::channel::<MyStruct>();
    let window_rx: Cell<Option<mpsc::Receiver<MyStruct>>> = Cell::new(Some(window_rx));

    wants_fn_callback(move || {
        let _: mpsc::Receiver<MyStruct> = window_rx.take().expect("oops, called twice");
    });
}
Cell::take() removes the Option<Receiver> from the Cell, leaving None in its place. The expect then removes the Option wrapper (and handles the possibility of the function being called twice by panicking in that case).
Applied to your original problem, this would be:
let window_rx: Cell<Option<Receiver<MyStruct>>> = Cell::new(Some(window_rx));

application.connect_activate(move |app: &gtk::Application| {
    build_ui(app, window_rx.take().expect("oops, called twice"));
});
However, be careful: if the library requires a Fn closure, there might be some condition under which the function could be called more than once, in which case you should be prepared to do something appropriate in that circumstance. If there isn't such a condition, then the library's API should be improved to take a FnOnce instead.
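If you would rather keep the Rc from your earlier attempt, the same Option::take trick works through a RefCell: store an Option inside the cell and take it out once in the callback. Here is a minimal sketch, again using std's mpsc channel and a made-up wants_fn_callback function as stand-ins for the glib/gtk types:

use std::cell::RefCell;
use std::rc::Rc;
use std::sync::mpsc;

struct MyStruct;

// Stand-in for connect_activate, which only accepts a Fn callback.
fn wants_fn_callback<F>(_f: F) where F: Fn() + 'static {}

fn main() {
    let (_tx, window_rx) = mpsc::channel::<MyStruct>();

    // Rc<RefCell<Option<...>>> lets a Fn closure take the receiver exactly once
    // while another handle to it can still live outside the closure.
    let window_rx = Rc::new(RefCell::new(Some(window_rx)));
    let rx_for_callback = Rc::clone(&window_rx);

    wants_fn_callback(move || {
        let _rx: mpsc::Receiver<MyStruct> = rx_for_callback
            .borrow_mut()
            .take()
            .expect("callback called twice");
    });
}

Compared to the Cell version, this adds a runtime borrow check, so the plain Cell<Option<...>> is the lighter choice when you don't need an extra handle.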

Related

Rust - How to pass function parameters to closure

I'm trying to write a function that takes two parameters. The function starts two threads and uses one of the parameters inside one of the thread closures. This doesn't work because of the error "Borrowed data escapes outside of closure". Here's the code.
pub fn measure_stats(testdatapath: &PathBuf, filenameprefix: &String) {
    let (tx, rx) = mpsc::channel();
    let filename = format!("test.txt");

    let measure_thread = thread::spawn(move || {
        let stats = sar();
        fs::write(filename, stats).expect("failed to write output to file");

        // Send a signal that we're done.
        let _ = tx.send(());
    });

    thread::spawn(move || {
        let mut n = 0;
        loop {
            // Break if the measure thread is done.
            match rx.try_recv() {
                Ok(_) | Err(TryRecvError::Disconnected) => break,
                Err(TryRecvError::Empty) => {}
            }

            let filename = format!("{:04}.img", n);
            let filepath = Path::new(testdatapath).join(&filename);
            random_file_write(&filepath).unwrap();
            random_file_read(&filepath).unwrap();
            fs::remove_file(&filepath).expect("failed to remove file");
            n += 1;
        }
    });

    measure_thread.join().expect("joining measure thread panicked");
}
The problem is that testdatapath escapes the function body. I think this is a problem because the lifetime of testdatapath is only guaranteed until the end of the closure, but it needs to be the lifetime of the entire program. But it's a little confusing to me.
I've tried cloning the variable, but that didn't help. I'm not sure how I'm supposed to do this. How do I use a function parameter inside the closure or accomplish the same goal some other more canonical way?
If it's okay for the function not to return until both threads complete, then use std::thread::scope() to create scoped threads instead of std::thread::spawn(). Scoped threads allow borrowing data whereas regular spawning cannot, but require the threads to all terminate before the scope ends and the function that created them returns.
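A minimal sketch of the scoped-thread approach, with a hypothetical do_work function standing in for the real measurement and file I/O:

use std::path::PathBuf;
use std::thread;

// Hypothetical stand-in for the real work each thread performs.
fn do_work(path: &PathBuf) {
    println!("working in {}", path.display());
}

pub fn measure_stats(testdatapath: &PathBuf) {
    // Scoped threads may borrow `testdatapath` because the scope guarantees
    // that every spawned thread finishes before measure_stats returns.
    thread::scope(|s| {
        s.spawn(|| do_work(testdatapath));
        s.spawn(|| do_work(testdatapath));
    });
}

fn main() {
    measure_stats(&PathBuf::from("/tmp/testdata"));
}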
If this has to be a “background” task, then you need to make sure that all the data used by each thread is owned, i.e. not a reference. In this case, that means you should change the parameters to be owned:
pub fn measure_stats(testdatapath: PathBuf, filenameprefix: String) {
Then, those values will be moved into the receiving thread, without any lifetime constraints.
You're trying to make testdatapath live longer than the function. Because it's a borrowed value, and you can't guarantee that the original PathBuf will outlive the closure running in the new thread, the compiler warns that you're assuming this will be the case without taking any precautions to ensure it.
The three simplest choices:
Move the PathBuf into the function instead of borrowing it (remove the &).
Use an Arc.
Clone it and move the clone into the thread (see the sketch below).
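For the last option, the detail that matters is where the clone happens: clone before building the closure and move the owned clone in; cloning inside the closure still captures a borrow of the original. A minimal sketch:

use std::path::PathBuf;
use std::thread;

pub fn measure_stats(testdatapath: &PathBuf) {
    // Clone *outside* the closure, then `move` the owned clone in.
    let owned_path: PathBuf = testdatapath.clone();

    let handle = thread::spawn(move || {
        println!("using {}", owned_path.display());
    });

    handle.join().expect("worker thread panicked");
}

fn main() {
    measure_stats(&PathBuf::from("/tmp/testdata"));
}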

How do I define the lifetime for a tokio task spawned from a class?

I'm attempting to write a generic set_interval function helper:
pub fn set_interval<F, Fut>(mut f: F, dur: Duration)
where
    F: Send + 'static + FnMut() -> Fut,
    Fut: Future<Output = ()> + Send + 'static,
{
    let mut interval = tokio::time::interval(dur);
    tokio::spawn(async move {
        // first tick is at 0ms
        interval.tick().await;
        loop {
            interval.tick().await;
            tokio::spawn(f());
        }
    });
}
This works fine until it's called from inside a class:
fn main() {}

struct Foo {}

impl Foo {
    fn bar(&self) {
        set_interval(|| self.task(), Duration::from_millis(1000));
    }

    async fn task(&self) {
    }
}
self is not 'static, and we can't restrict lifetime parameter to something that is less than 'static because of tokio::task.
Is it possible to modify set_interval implementation so it works in cases like this?
Link to playground
P.S. Tried to
let instance = self.clone();
set_interval(move || instance.task(), Duration::from_millis(1000));
but I also get an error: error: captured variable cannot escape FnMut closure body
Is it possible to modify set_interval implementation so it works in cases like this?
Not really. Though spawn-ing f() really doesn't help either, as it precludes a simple "callback owns the object" solution (as you need either both callback and future to own the object, or just future).
I think that leaves two solutions:
Convert everything to shared mutability: wrap the object in an Arc (plus interior mutability if it needs to be mutated). The callback owns one Arc, and on each tick it clones that and moves the clone into the future (the task method).
Have the future (task) acquire the object from some external source instead of being called on one, this way the intermediate callback doesn't need to do anything. Or the callback can do the acquiring and move that into the future, same diff.
Incidentally at this point it could make sense to just create the future directly, but allow cloning it. So instead of taking a callback set_interval would take a clonable future, and it would spawn() clones of its stored future instead of creating them anew.
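A sketch of that variant, using a hypothetical set_interval_clone and std::future::ready as a trivially clonable future (an async block would not work here, since async blocks don't implement Clone):

use std::future::Future;
use std::time::Duration;

// Hypothetical variant: store one future and spawn a fresh clone of it on
// every tick instead of calling a callback to build a new one.
pub fn set_interval_clone<Fut>(fut: Fut, dur: Duration)
where
    Fut: Future<Output = ()> + Clone + Send + 'static,
{
    let mut interval = tokio::time::interval(dur);
    tokio::spawn(async move {
        interval.tick().await; // the first tick completes immediately
        loop {
            interval.tick().await;
            tokio::spawn(fut.clone());
        }
    });
}

#[tokio::main]
async fn main() {
    // std::future::Ready implements Clone, so it works as a demo future.
    set_interval_clone(std::future::ready(()), Duration::from_millis(200));
    tokio::time::sleep(Duration::from_secs(1)).await;
}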
As mentioned by @Masklinn, you can clone the Arc to allow for this. Note that cloning an Arc does not clone the underlying data, just the pointer, so it is generally cheap and should not have a major impact on performance.
Here is an example. The following code will produce the error async block may outlive the current function, but it borrows data, which is owned by the current function:
use std::sync::Arc;

#[tokio::main]
async fn main() {
    // 🛑 Error: async block may outlive the current function, but it borrows data, which is owned by the current function
    let data = Arc::new("Hello, World".to_string());

    tokio::task::spawn(async {
        println!("1: {}", data.len());
    });

    tokio::task::spawn(async {
        println!("2: {}", data.len());
    });
}
Rust unhelpfully suggests adding move to both async blocks, but that results in a use-of-moved-value error, because data can only be moved into one of the two blocks.
To fix the problem, we can clone the Arc for each task and then add the move keyword to the async blocks:
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let data = Arc::new("Hello, World".to_string());

    let data_for_task_1 = data.clone();
    tokio::task::spawn(async move {
        println!("1: {}", data_for_task_1.len());
    });

    let data_for_task_2 = data.clone();
    tokio::task::spawn(async move {
        println!("2: {}", data_for_task_2.len());
    });
}
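Applied to the original set_interval question, the same Arc-cloning pattern looks roughly like the sketch below; the name field and the body of task are made up for illustration:

use std::future::Future;
use std::sync::Arc;
use std::time::Duration;

pub fn set_interval<F, Fut>(mut f: F, dur: Duration)
where
    F: Send + 'static + FnMut() -> Fut,
    Fut: Future<Output = ()> + Send + 'static,
{
    let mut interval = tokio::time::interval(dur);
    tokio::spawn(async move {
        interval.tick().await; // first tick is at 0ms
        loop {
            interval.tick().await;
            tokio::spawn(f());
        }
    });
}

struct Foo {
    name: String, // hypothetical field, just so task() has something to print
}

impl Foo {
    fn bar(self: Arc<Self>) {
        // The callback owns one Arc; each tick clones it and moves the clone
        // into the spawned future, so nothing borrows `self`.
        set_interval(
            move || {
                let this = Arc::clone(&self);
                async move { this.task().await }
            },
            Duration::from_millis(1000),
        );
    }

    async fn task(&self) {
        println!("tick from {}", self.name);
    }
}

#[tokio::main]
async fn main() {
    let foo = Arc::new(Foo { name: "foo".to_string() });
    foo.bar();
    tokio::time::sleep(Duration::from_millis(3500)).await;
}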

Waiting on multiple futures borrowing mutable self

Each of the following methods needs &mut self to operate, and the following code gives the error:
cannot borrow *self as mutable more than once at a time
How can I achieve this correctly?
loop {
    let future1 = self.handle_new_connections(sender_to_connector.clone());
    let future2 = self.handle_incoming_message(&mut receiver_from_peers);
    let future3 = self.handle_outgoing_message();

    tokio::pin!(future1, future2, future3);

    tokio::select! {
        _ = future1 => {},
        _ = future2 => {},
        _ = future3 => {}
    }
}
You are not allowed to have multiple mutable references to an object, and there's a good reason for that.
Imagine you pass an object mutably to two different functions and they edit it concurrently with no synchronization mechanism in place; you'd end up with a race condition.
To prevent this bug, Rust allows only one mutable reference to an object at a time, but you can have multiple immutable references, which is why you often see people use interior mutability patterns.
In your case, you don't want the data to be modified by two different threads at the same time, so you'd wrap it in a Mutex or RwLock; and since you want multiple threads to be able to own the value, you'd wrap that in an Arc.
Here you can read about interior mutability in more detail.
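A minimal sketch of that shape, using tokio's RwLock and a made-up Shared struct in place of the parts of self that the handlers mutate:

use std::sync::Arc;
use tokio::sync::RwLock;

// Hypothetical shared state; in the question this would be whatever parts of
// `self` the three handlers need to mutate.
struct Shared {
    counter: u64,
}

async fn handle_incoming(state: Arc<RwLock<Shared>>) {
    state.write().await.counter += 1;
}

async fn handle_outgoing(state: Arc<RwLock<Shared>>) {
    state.write().await.counter += 2;
}

#[tokio::main]
async fn main() {
    let state = Arc::new(RwLock::new(Shared { counter: 0 }));

    loop {
        // Each future owns its own Arc clone, so no `&mut self` is required.
        tokio::select! {
            _ = handle_incoming(Arc::clone(&state)) => {},
            _ = handle_outgoing(Arc::clone(&state)) => {},
        }

        if state.read().await.counter >= 10 {
            break;
        }
    }
}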
Alternatively, when declaring your functions you could add proper lifetimes to indicate that the resulting Future will be awaited in the same context; since your code waits for the futures before the next iteration, that would do the trick as well.
I encountered the same problem when dealing with async code. Here is what I figured out:
Let's say you have an Engine, that contains both incoming and outgoing:
struct Engine {
    log: Arc<Mutex<Vec<String>>>,
    outgoing: UnboundedSender<String>,
    incoming: UnboundedReceiver<String>,
}
Our goal is to create two functions, process_incoming and process_logic, and then poll them simultaneously without fighting Rust's borrow checker.
What is important here is that:
You cannot pass &mut self to these async functions simultaneously.
Each of incoming and outgoing is held by at most one function.
Data accessed by both process_incoming and process_logic needs to be wrapped in a lock.
Any attempt to lock Engine directly will lead to a deadlock at runtime.
So that leaves us giving up on methods in favor of associated functions:
impl Engine {
    // ...

    async fn process_logic(outgoing: &mut UnboundedSender<String>, log: Arc<Mutex<Vec<String>>>) {
        loop {
            Delay::new(Duration::from_millis(1000)).await.unwrap();
            let msg: String = "ping".into();
            println!("outgoing: {}", msg);
            log.lock().push(msg.clone());
            outgoing.send(msg).await.unwrap();
        }
    }

    async fn process_incoming(
        incoming: &mut UnboundedReceiver<String>,
        log: Arc<Mutex<Vec<String>>>,
    ) {
        while let Some(msg) = incoming.next().await {
            println!("incoming: {}", msg);
            log.lock().push(msg);
        }
    }
}
And we can then write main as:
fn main() {
    futures::executor::block_on(async {
        let mut engine = Engine::new();

        let a = Engine::process_incoming(&mut engine.incoming, engine.log.clone()).fuse();
        let b = Engine::process_logic(&mut engine.outgoing, engine.log).fuse();
        futures::pin_mut!(a, b);

        select! {
            _ = a => {},
            _ = b => {},
        }
    });
}
I put the whole example here.
It's a workable solution; just be aware that you need to add futures and futures-timer to your dependencies.

Is there a way of spawning new threads with copies of existing data that have a non-static lifetime?

I have a problem that is similar to what is discussed in Is there a succinct way to spawn new threads with copies of existing data?. Unlike the linked question, I am trying to move an object with an associated lifetime into the new thread.
Intuitively, what I am trying to do is to copy everything that is necessary to continue the computation to the new thread and exit the old one. However, when trying to move cloned data (with a lifetime) to the new thread, I get the following error:
error[E0759]: data has lifetime 'a but it needs to satisfy a 'static lifetime requirement
I created a reproducible example based on the referenced question here. This is just to exemplify the problem. Here, the lifetimes could be removed easily but in my actual use-case the data I want to move to the thread is much more complex.
Is there an easy way of making this work with Rust?
A qualified answer to the question in the title is "yes", but we can't do it by copying non-static references. The reasons for this seeming limitation are sound. The way we can get the required data/objects into the thread closures is by passing ownership of them (or copies of them, or other concrete objects that represent them) to the closures.
It may not be immediately clear how to do this with a complex library like pyo3, since much of the API returns reference types to objects rather than concrete objects that can be passed as-is to other threads, but the library does provide ways to pass Python data/objects to other threads, which I'll cover in the second example below.
The start() function will need to put a 'static bound on the closure type associated with its data parameter because within its body, start() is passing these closures on to other threads. The compiler is working to guarantee that the closures aren't holding on to references to anything that may evaporate if a thread runs longer than its parent, which is why it gripes without the 'static guarantee.
fn start<'a>(data: Vec<Arc<dyn Fn() -> f64 + Send + Sync + 'static>>,
             more_data: String)
{
    for _ in 1..=4 {
        let cloned_data = data.clone();
        let cloned_more_data = more_data.clone();

        thread::spawn(move || foo(cloned_data, cloned_more_data));
    }
}
A 'static bound is different than a 'static lifetime applied to a reference (data: 'static vs. &'static data). In the case of a bound, it only means the type it's applied to doesn't contain any non-static references (if it even holds any references at all). It's pretty common to see this bound applied to method parameters in threaded code.
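A tiny illustration of that difference, with a hypothetical spawn_with helper that imposes the same bound as thread::spawn:

use std::thread;

// T: 'static only requires that T contains no non-static references;
// owned types like String satisfy it.
fn spawn_with<T: Send + 'static>(value: T) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // The thread owns `value`, so nothing can dangle.
        drop(value);
    })
}

fn main() {
    let owned = String::from("owned data");
    spawn_with(owned).join().unwrap(); // OK: String holds no borrows

    let local = String::from("local");
    let borrowed: &str = &local;
    // spawn_with(borrowed); // error: `local` does not live long enough,
    //                       // because this &str is not a 'static reference
    println!("{}", borrowed);
}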
As this applies specifically to the pyo3 problem space, we can avoid forming closures that contain non-static references by converting any such references to owned objects, then when the callback, running in another thread, needs to do something with them, it can acquire the GIL and cast them back to Python object references.
More about this in the code comments below. I took a simple example from the pyo3 GitHub README and combined it with the code provided in the playground example.
Something to watch out for when applying this pattern is deadlock. The threads will need to acquire the GIL in order to use the Python objects they have access to. In the example, once the parent thread is done spawning new threads, it releases the GIL when it goes out of scope. The parent then waits for the child threads to complete by joining their handles.
use std::thread;
use std::thread::JoinHandle;
use std::sync::Arc;

use pyo3::prelude::*;
use pyo3::types::IntoPyDict;
use pyo3::types::PyDict;

type MyClosure<'a> = dyn Fn() -> f64 + Send + Sync + 'a;

fn main() -> Result<(), ()>
{
    match Python::with_gil(|py| main_(py)
                               .map_err(|e| e.print_and_set_sys_last_vars(py)))
    {
        Ok(handles) => {
            for handle in handles {
                handle.join().unwrap();
            }
        },
        Err(e) => { println!("{:?}", e); },
    }
    Ok(())
}

fn main_(py: Python) -> PyResult<Vec<JoinHandle<()>>>
{
    let sys = py.import("sys")?;
    let version = sys.get("version")?.extract::<String>()?;
    let locals = [("os", py.import("os")?)].into_py_dict(py);
    let code = "os.getenv('USER') or os.getenv('USERNAME') or 'Unknown'";
    let user = py.eval(code, None, Some(&locals))?.extract::<String>()?;

    println!("Hello {}, I'm Python {}", user, version);

    // The thread will do something with the `locals` dictionary. In order to
    // pass this reference object to the thread, first convert it to a
    // non-reference object.

    // Convert `locals` to `PyObject`.
    let locals_obj = locals.to_object(py);

    // Now we can move `locals_obj` into the thread without concern.
    let closure: Arc<MyClosure<'_>> = Arc::new(move || {
        // We can print out the PyObject which reveals it to be a tuple
        // containing a pointer value.
        println!("{:?}", locals_obj);

        // If we want to do anything with the `locals` object, we can cast it
        // back to a `PyDict` reference. We'll need to acquire the GIL first.
        Python::with_gil(|py| {
            // We have the GIL, cast the dict back to a PyDict reference.
            let py_dict = locals_obj.cast_as::<PyDict>(py).unwrap();

            // Printing it out reveals it to be a dictionary with the key `os`.
            println!("{:?}", py_dict);
        });
        1.
    });

    let data = vec![closure];
    let more = "Important data.".to_string();
    let handles = start(data, more);

    Ok(handles)
}

fn start<'a>(data: Vec<Arc<MyClosure<'static>>>,
             more: String
            ) -> Vec<JoinHandle<()>>
{
    let mut handles = vec![];

    for _ in 1..=4 {
        let cloned_data = data.clone();
        let cloned_more = more.clone();

        let h = thread::spawn(move || foo(cloned_data, cloned_more));
        handles.push(h);
    }
    handles
}

fn foo<'a>(data: Vec<Arc<MyClosure<'a>>>,
           more: String)
{
    for closure in data {
        closure();
    }
}
Output:
Hello todd, I'm Python 3.8.10 (default, Jun 2 2021, 10:49:15)
[GCC 9.4.0]
Py(0x7f3329ccdd40)
Py(0x7f3329ccdd40)
Py(0x7f3329ccdd40)
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
Py(0x7f3329ccdd40)
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
Something to consider: you may be able to minimize, or eliminate, the need to pass Python objects to the threads by extracting all the information needed from them into Rust objects and passing those to threads instead.

Cannot move out of captured outer variable in an `Fn` closure

I'm trying to figure out how to send a function through a channel, and how to avoid extra cloning in order to execute the function at the other end. If I remove the extra cloning operation inside the closure, I get the following error:
error: cannot move out of captured outer variable in an 'Fn' closure
Ignoring the fact that this code does absolutely nothing, and makes use of a global mutable static Sender<T>, it represents what I'm trying to achieve while giving the proper compiler errors. This code is not meant to be ran, just compiled.
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use std::collections::LinkedList;
use std::sync::mpsc::{Sender, Receiver};

type SafeList = Arc<Mutex<LinkedList<u8>>>;
type SendableFn = Arc<Mutex<(Fn() + Send + Sync + 'static)>>;

static mut tx: *mut Sender<SendableFn> = 0 as *mut Sender<SendableFn>;

fn main() {
    let list: SafeList = Arc::new(Mutex::new(LinkedList::new()));

    loop {
        let t_list = list.clone();
        run(move || {
            foo(t_list.clone());
        });
    }
}

fn run<T: Fn() + Send + Sync + 'static>(task: T) {
    unsafe {
        let _ = (*tx).send(Arc::new(Mutex::new(task)));
    }
}

#[allow(dead_code)]
fn execute(rx: Receiver<SendableFn>) {
    for t in rx.iter() {
        let mut guard = t.lock().unwrap();
        let task = guard.deref_mut();
        task();
    }
}

#[allow(unused_variables)]
fn foo(list: SafeList) { }
Is there a better method to getting around that error and/or another way I should be sending functions through channels?
The problem with Fn() is that you can call it multiple times. If you moved out of a captured value, that value would not be available anymore at the next call. You need a FnOnce() to make sure calling the closure also moves out of it, so it's gone and can't be called again.
There's no way to have an Arc<Mutex<(FnOnce() + Send + Sync + 'static)>>. This would again require that you statically guarantee that after you call the function, noone else can call it again. Which you cannot, since someone else might have another Arc pointing to your FnOnce. What you can do is box it and send it as Box<FnOnce() + Send + Sync + 'static>. There's only ever one owner of a Box.
The trouble with FnOnce() is that you can't really call it while it's in the Box, because that would require moving it out of the Box to call it, and since we don't know its size, we cannot move it out of the Box. In the future, Box<FnOnce()> closures might become directly usable.
"Luckily" this problem comes up often enough that there's FnBox. Sadly, it requires nightly to work. Also, I couldn't figure out how to use the function call syntax described in the docs, but you can manually call call_box on the Box<FnBox()>. Try it out in the Playground.

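For what it's worth, on current stable Rust a boxed closure of type Box<dyn FnOnce()> is directly callable, so the boxing approach described above now works without FnBox or nightly. A minimal sketch of sending closures over a channel this way:

use std::sync::mpsc;
use std::thread;

// A boxed FnOnce can be sent through a channel and called exactly once
// on the receiving end.
type Task = Box<dyn FnOnce() + Send + 'static>;

fn main() {
    let (tx, rx) = mpsc::channel::<Task>();

    let sender = thread::spawn(move || {
        let greeting = String::from("hello from the other thread");
        tx.send(Box::new(move || println!("{}", greeting))).unwrap();
    });

    // Each task is moved out of the channel and consumed by the call.
    for task in rx {
        task();
    }

    sender.join().unwrap();
}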