Pass a String parameter to several threads in Rust [duplicate] - multithreading

This question already has an answer here:
Sharing String between threads in Rust
(1 answer)
Closed 2 years ago.
I have a struct like this:
pub struct Connector {
host: String,
executor: ThreadPool,
}
This struct features a method, that dispatches work (TCP-connections) on several threads via TcpStream::connect().
Unfortunately I got an error when referencing a field of a struct.
self.executor.execute(move || {
if let Ok(_) = TcpStream::connect((self.host.as_str(), 8080)) {
println!("Connection established");
//...
}
});
This code above leads to following error message:
std::sync::mpsc::Sender<std::boxed::Box<(dyn threadpool::FnBox + std::marker::Send + 'static)>>` cannot be shared between threads safely
I have also tried to make the host-field a &'static str, but since this is data from external input this does not work (I can't transform a String-object into a 'static str).
How can I insert the struct's field into its threads as parameters?
Edit: Creating a new variable inside the scope leads to:
creates a temporary which is freed while still in use

In your code you're not passing a String parameter; you're passing a &str parameter. Since you're effectively passing it to a different thread, the Rust compiler cannot check whether the reference will live long enough. Passing a &'static str would indeed have been a good solution if the string was actually static, but if that's not the case, you have to pass something that's owned.
In this case, the answer is somewhat complicated by the fact that TcpStream::connect() likely accepts a T: ToSocketAddrs and what works will depend on what impls exist for ToSocketAddrs. For example, tokio's ToSocketAddrs is implemented for (&'_ str, u16) but not for (String, u16). It is implemented for String, though, so perhaps the simplest solution here is to build a String and allow the closure to take ownership of that String, along these lines:
let addr = format!("{}:{}", &self.host, 8080);
self.executor.execute(move || {
if let Ok(_) = TcpStream::connect(addr) {
println!("Connection established");
//...
}
});

Related

Can't Deserialize in generic trait function due to lifetime issue, but it works when out of generic function [duplicate]

This question already has answers here:
Rust and serde deserializing using generics
(2 answers)
Closed 4 days ago.
I would like to implement a trait function that takes a path and gives an owned value.
use bevy::prelude::*;
use serde::de;
pub trait RonResource<'a, T: de::Deserialize<'a> + Resource> {
fn import_from_ron(path: &str) -> T {
let ron_string = std::fs::read_to_string(path).unwrap();
ron::from_str::<T>(
&ron_string
).expect(("Failed to load: ".to_owned() + path).as_str())
}
}
I implement the 'a lifetime because de::Deserialize needs it.
But the compiled tells me that "ron_string" will be dropped while still borrowed. (at the "&ron_string" line)
This code works fine if I implement it without generics, like this for exemple:
let settings = ron::from_str::<GasParticleSystemSettings>(
&std::fs::read_to_string("assets/settings/gas_ps_settings.ron").unwrap()
).expect("Failed to load settings/gas_ps_settings.ron")
.side_gas;
I don't get why the value need to "survive" the whole function as it will not be needed. Otherwise, the specific code wouldn't work!
The problem is when using Deserialize<'a> the deserialized object might contain references to the original str ie something like this:
struct HasRef<'a> {
s: &'a str,
}
is allowed. It would contain references to ron_string which is dropped at the end of the function.
What's totally save though is to just require DeserializeOwned instead at which point you can't do 0-copy deserialization any more but you don't have to keep the original string around either:
pub trait RonResource<T: de::DeserializeOwned + Resource> {
fn import_from_ron(path: &str) -> T {
let ron_string = std::fs::read_to_string(path).unwrap();
ron::from_str::<T>(
&ron_string
).expect(("Failed to load: ".to_owned() + path).as_str())
}
}
Your second example works presumably because you only access GasParticleSystemSettings.side_gas which I assume either does not contain references or is straight up Copy neither of which would create a lifetime issue.

Is there a way of spawning new threads with copies of existing data that have a non-static lifetime?

I have a problem that is similar to what is discussed in Is there a succinct way to spawn new threads with copies of existing data?. Unlike the linked question, I am trying to move an object with an associated lifetime into the new thread.
Intuitively, what I am trying to do is to copy everything that is necessary to continue the computation to the new thread and exit the old one. However, when trying to move cloned data (with a lifetime) to the new thread, I get the following error:
error[E0759]: data has lifetime 'a but it needs to satisfy a 'static lifetime requirement
I created a reproducible example based on the referenced question here. This is just to exemplify the problem. Here, the lifetimes could be removed easily but in my actual use-case the data I want to move to the thread is much more complex.
Is there an easy way of making this work with Rust?
A qualified answer to the question in the title is "yes", but we can't do it by copying non-static references. The reasons for this seeming limitation are sound. The way we can get the required data/objects into the thread closures is by passing ownership of them (or copies of them, or other concrete objects that represent them) to the closures.
It may not be immediately clear on how to do this with a complex library like pyo3 since much of the API returns reference types to objects rather than concrete objects that can be passed as-is to other threads, but the library does provide ways to pass Python data/objects to other threads, which I'll cover in the second example below.
The start() function will need to put a 'static bound on the closure type associated with its data parameter because within its body, start() is passing these closures on to other threads. The compiler is working to guarantee that the closures aren't holding on to references to anything that may evaporate if a thread runs longer than its parent, which is why it gripes without the 'static guarantee.
fn start<'a>(data : Vec<Arc<dyn Fn() -> f64 + Send + Sync + 'static>>,
more_data : String)
{
for _ in 1..=4 {
let cloned_data = data.clone();
let cloned_more_data = more_data.clone();
thread::spawn(move || foo(cloned_data, cloned_more_data));
}
}
A 'static bound is different than a 'static lifetime applied to a reference (data: 'static vs. &'static data). In the case of a bound, it only means the type it's applied to doesn't contain any non-static references (if it even holds any references at all). It's pretty common to see this bound applied to method parameters in threaded code.
As this applies specifically to the pyo3 problem space, we can avoid forming closures that contain non-static references by converting any such references to owned objects, then when the callback, running in another thread, needs to do something with them, it can acquire the GIL and cast them back to Python object references.
More about this in the code comments below. I took a simple example from the pyo3 GitHub README and combined it with the code provided in the playground example.
Something to watch out for when applying this pattern is deadlock. The threads will need to acquire the GIL in order to use the Python objects they have access to. In the example, once the parent thread is done spawning new threads, it releases the GIL when it goes out of scope. The parent then waits for the child threads to complete by joining their handles.
use std::thread;
use std::thread::JoinHandle;
use std::sync::Arc;
use pyo3::prelude::*;
use pyo3::types::IntoPyDict;
use pyo3::types::PyDict;
type MyClosure<'a> = dyn Fn() -> f64 + Send + Sync + 'a;
fn main() -> Result<(), ()>
{
match Python::with_gil(|py| main_(py)
.map_err(|e| e.print_and_set_sys_last_vars(py)))
{
Ok(handles) => {
for handle in handles {
handle.join().unwrap();
}},
Err(e) => { println!("{:?}", e); },
}
Ok(())
}
fn main_(py: Python) -> PyResult<Vec<JoinHandle<()>>>
{
let sys = py.import("sys")?;
let version = sys.get("version")?.extract::<String>()?;
let locals = [("os", py.import("os")?)].into_py_dict(py);
let code = "os.getenv('USER') or os.getenv('USERNAME') or 'Unknown'";
let user = py.eval(code, None, Some(&locals))?.extract::<String>()?;
println!("Hello {}, I'm Python {}", user, version);
// The thread will do something with the `locals` dictionary. In order to
// pass this reference object to the thread, first convert it to a
// non-reference object.
// Convert `locals` to `PyObject`.
let locals_obj = locals.to_object(py);
// Now we can move `locals_obj` into the thread without concern.
let closure: Arc<MyClosure<'_>> = Arc::new(move || {
// We can print out the PyObject which reveals it to be a tuple
// containing a pointer value.
println!("{:?}", locals_obj);
// If we want to do anything with the `locals` object, we can cast it
// back to a `PyDict` reference. We'll need to acquire the GIL first.
Python::with_gil(|py| {
// We have the GIL, cast the dict back to a PyDict reference.
let py_dict = locals_obj.cast_as::<PyDict>(py).unwrap();
// Printing it out reveals it to be a dictionary with the key `os`.
println!("{:?}", py_dict);
});
1.
});
let data = vec![closure];
let more = "Important data.".to_string();
let handles = start(data, more);
Ok(handles)
}
fn start<'a>(data : Vec<Arc<MyClosure<'static>>>,
more : String
) -> Vec<JoinHandle<()>>
{
let mut handles = vec![];
for _ in 1..=4 {
let cloned_data = data.clone();
let cloned_more = more.clone();
let h = thread::spawn(move || foo(cloned_data, cloned_more));
handles.push(h);
}
handles
}
fn foo<'a>(data : Vec<Arc<MyClosure<'a>>>,
more : String)
{
for closure in data {
closure();
}
}
Output:
Hello todd, I'm Python 3.8.10 (default, Jun 2 2021, 10:49:15)
[GCC 9.4.0]
Py(0x7f3329ccdd40)
Py(0x7f3329ccdd40)
Py(0x7f3329ccdd40)
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
Py(0x7f3329ccdd40)
{'os': <module 'os' from '/usr/lib/python3.8/os.py'>}
Something to consider: you may be able to minimize, or eliminate, the need to pass Python objects to the threads by extracting all the information needed from them into Rust objects and passing those to threads instead.

Rust cannot return value referencing local variable [duplicate]

This question already has answers here:
Is there any way to return a reference to a variable created in a function?
(5 answers)
Closed 2 years ago.
I guess this is a quite basic Rust question, but I have this piece of code
struct Stock<'a> {
name: &'a str,
value: u8,
}
impl<'a> Stock<'a> {
fn read(file: &mut File) -> Stock<'a> {
let mut value = vec![0; 1];
file.read(&mut value);
let name_size = read_u32(file);
let mut name = vec![0; name_size as usize];
file.read(&mut name);
return Stock {
name: str::from_utf8(&name).unwrap(),
value: value[0],
};
}
}
and of course, it doesn't work, because I am referencing a local value name. To my limited knowledge, this is because outside the scope of read, name doesn't exist. However wouldn't from_utf8 create a new variable that is not scoped to read?
Also, I've read a bit about other people with this issue, and the suggestion is always to switch to the owned String. But why can't I do something such as name: str::from_utf8(&name).unwrap().to_owned()?
However wouldn't from_utf8 create a new variable that is not scoped to read?
No, it gives you a borrow with the same lifetime as the one you pass.
Even if it created a new variable, you would have to tie it to somewhere, like the scope of read(), so again you would have the same problem.
The issue is that your Stock struct holds a reference. That means whatever it points to it has to live for all the time Stock is alive. But any variables you create in the local scope of read() will die when read() ends.
Instead, what you want is for Stock to own the value so that it keeps alive it itself.
You most likely should change the struct field to name: String and get rid of the 'a lifetime parameter.

Share function reference between threads in Rust [duplicate]

This question already has answers here:
Sending trait objects between threads in Rust
(1 answer)
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed 3 years ago.
I want to share a function reference between threads but the Rust compiler says `dyn for<'r> std::ops::Fn(&'r std::string::String) -> std::string::String` cannot be shared between threads safely. I'm well informed about Send, Sync, and Arc<T> when sharing "regular" values between threads but in this case I can't understand the problem. A function has a static address during the runtime of the program, therefore I can't see a problem here.
How can I make this work?
fn main() {
// pass a function..
do_sth_multithreaded(&append_a);
do_sth_multithreaded(&identity);
}
fn append_a(string: &String) -> String {
let mut string = String::from(string);
string.push('a');
string
}
fn identity(string: &String) -> String {
String::from(string)
}
fn do_sth_multithreaded(transform_fn: &dyn Fn(&String) -> String) {
for i in 0..4 {
let string = format!("{}", i);
thread::spawn(move || {
println!("Thread {}: {}", i, transform_fn(&string))
});
}
}
A function has a static address during the runtime of the program, therefore I can't see a problem here.
That's nice for functions, but you're passing a &dyn Fn, and that could just as well be a closure or (in unstable Rust) a custom object implementing that trait. And this object might not have a static address. So you can't guarantee that the object will outlive the threads you spawn.
But that's not even what the compiler is complaining about (yet!). It's actually complaining that it doesn't know whether you're allowed to access the Fn from another thread. Again, not relevant for function pointers, but relevant for closures.
Here's a signature that works for your example:
fn do_sth_multithreaded(transform_fn: &'static (dyn Fn(&String) -> String + Sync))
Note the 'static lifetime bound and the Sync bound.
But while the static lifetime works for this case, it probably means you can't ever send closures. To make that work, you need to use a scoped thread system (for example, from the crossbeam crate) to make sure do_sth_multithreaded waits for the threads to finish before returning. Then you can relax the static lifetime bound.

Wrapping a struct that borrows something [duplicate]

This question already has answers here:
Why can't I store a value and a reference to that value in the same struct?
(4 answers)
How to return a reference to a sub-value of a value that is under a mutex?
(5 answers)
Returning a RWLockReadGuard independently from a method
(2 answers)
How can I store a Chars iterator in the same struct as the String it is iterating on?
(2 answers)
Closed 3 years ago.
I would like to wrap a low-level third-party API with my own struct and functions to make it more friendly. Unfortunately, the third-party API needs a reference to a socket in its constructor, which means I'd like my Struct to "own" the socket (someone has to own it so that it can be borrowed, right? and I'd like that hidden as an implementation detail in my API).
The third-party API looks something like this:
struct LowLevelApi<'a> {
stream: &'a mut TcpStream,
// ...
}
impl<'a> LowLevelApi<'a> {
pub fn new(socket: &'a mut TcpStream, ... ) -> LowLevelApi<'a> {
// ...
}
}
I would like to make the interface to my function look like:
pub fn new(host: String, port: u16, ...) -> HighLevelApi {
// ...
}
I tried this:
pub struct HighLevelApi<'a> {
stream: TcpStream,
low: LowLevelApi<'a>
}
impl <'a> HighLevelApi<'a> {
pub fn new(host: String, port: u16) -> HighLevelApi<'a> {
// Ignore lack of error checking for now
let mut stream = TcpStream::connect(format!("{}:{}", host, port)).unwrap();
HighLevelApi {
stream,
low: LowLevelApi::new(&mut stream)
}
}
}
Rust is (rightly) angry: It has no way of knowing that I'm not going to do something bad with the low field later. And even worse, I would need to somehow guarantee that when my structure gets dropped, low gets dropped first, and stream second (since by that point, any relationship between the two is lost - or rather, there never is/was a relationship between the two).
(actually, it's worse than that, because the stream local variable gets moved into the new struct, so the local can't possibly be borrowed by LowLevelApi, but I can't think of a way to initialize HighLevelApi with the stream from the struct, since there's no way to get a handle to that from within the struct's initialization, is there? But based on my guess about what would happen in the paragraph above, it doesn't really matter since it still wouldn't do what I wanted)
What are examples of the various techniques that can be used to store a wrap a third-party (not under my control) struct that needs a reference to something?
The Rental crate seems to do what is needed here, albeit with documentation and examples that leave a lot to the imagination (i.e. trial and error).
Here's roughly what solves this
rental! {
pub mod rentals {
#[rental_mut]
pub struct Wrapper {
stream: Box<TcpStream>,
low: LowLevelApi<'stream>
}
}
}
pub struct HighLevelApi {
wrapper: rentals::Wrapper
}
impl HighLevelApi {
pub fn new(host: String, port: u16) -> HighLevelApi {
Api {
// Ignore lack of error checking for now
wrapper: rentals::Wrapper::new(
Box::new(TcpStream::connect(format!("{}:{}", host, port)).unwrap()),
|s| LowLevelApi::new(s)
)
}
}
pub fn do_something(&mut self) {
self.wrapper.rent_mut(|ll| ll.do_something()) // ll is the LowLevelApi
}
}
I noticed two important things that made this work:
The lifetime name on low in the Wrapper struct must match the name of the "owning" field (in this case "'stream")
You never get direct access to the reference - you get it through a callback/closure:
In the auto-generated constructor (new()) the second parameter isn't a LowLevelApi, it's a closure that gets the &mut TcpStream, and that closure is then expected to return a LowLevelApi
When you want to actually use the LowLevelApi you can "rent" it, hence the wrapper.rent_mut(f) where f is the closure that gets passed a LowLevelApi (ll) and can do what it needs
With these facts, the rest of the Rental documentation makes a lot more sense.

Resources