How do I use static lifetimes with threads? - rust

I'm currently struggling with lifetimes in Rust (1.0), especially when it comes to passing structs via channels.
How would I get this simple example to compile:
use std::sync::mpsc::{Receiver, Sender};
use std::sync::mpsc;
use std::thread::spawn;
use std::io;
use std::io::prelude::*;
struct Message<'a> {
    text: &'a str,
}
fn main() {
    let (tx, rx): (Sender<Message>, Receiver<Message>) = mpsc::channel();
    let _handle_receive = spawn(move || {
        for message in rx.iter() {
            println!("{}", message.text);
        }
    });
    let stdin = io::stdin();
    for line in stdin.lock().lines() {
        let message = Message {
            text: &line.unwrap()[..],
        };
        tx.send(message).unwrap();
    }
}
I get:
error[E0597]: borrowed value does not live long enough
  --> src/main.rs:23:20
   |
23 |             text: &line.unwrap()[..],
   |                    ^^^^^^^^^^^^^ does not live long enough
...
26 |     }
   |     - temporary value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...
I can see why this is (line only lives for one iteration of for), but I can't figure out what the right way of doing this is.
Should I, as the compiler hints, try to convert the &str into &'static str?
Am I leaking memory if every line would have a 'static lifetime?
When am I supposed to use 'static anyway? Is it something I should try to avoid or is it perfectly OK?
Is there a better way of passing Strings in structs via channels?
I apologize for those naive questions. I've spent quite some time searching already, but I can't quite wrap my head around it. It's probably my dynamic language background getting in the way :)
As an aside: Is &input[..] for converting a String into a &str considered OK? It's the only stable way I could find to do this.

You can't convert &'a T into &'static T except by leaking memory. Luckily, this is not necessary at all. There is no reason to send borrowed pointers to the thread and keep the lines on the main thread. You don't need the lines on the main thread. Just send the lines themselves, i.e. send String.
If access from multiple threads were necessary (and you don't want to clone), use Arc<String> (in the future, Arc<str> may also work). This way the string is properly shared between threads, so that it will be deallocated exactly when no thread uses it any more.
Sending non-'static references between threads is unsafe because you never know how long the other thread will keep using them, so you don't know when the borrow expires and the object can be freed. Note that scoped threads (which aren't in 1.0 but are being redesigned as we speak) don't have this problem and do allow this, but regular, spawned threads do have it.
'static is not something you should avoid, it is perfectly fine for what it does: Denoting that a value lives for the entire duration the program is running. But if that is not what you're trying to convey, of course it is the wrong tool.

Think about it this way: A thread has no syntactical lifetime, i.e. the thread will not be dropped at the end of the code block where it was created. Whatever data you send to the thread, you must be sure that it will live as long as the thread does, which means forever. Which means 'static.
What can go wrong in your case is that the main loop sends a reference to the thread and destroys the string before the thread has handled it. The thread would then access invalid memory when dealing with the string.
One option would be to put your lines into some statically allocated container, but this would mean that you can never destroy those strings. Generally speaking, that is a bad idea. Another option is to ask: does the main thread actually need the line once it is read? What if the main thread transferred responsibility for the line to the handling thread?
struct Message {
    text: String,
}
for line in stdin.lock().lines() {
    let message = Message {
        text: line.unwrap(),
    };
    tx.send(message).unwrap();
}
Now you are transferring ownership (move) from the main thread to the handler thread. Because you move your value, no references are involved and no checks for lifetime apply anymore.
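To make the ownership transfer concrete, here is a minimal, self-contained sketch of the same idea. The helper name forward_lines is hypothetical, and it takes an iterator of lines instead of reading stdin so that it can be run without input; the receiving thread simply collects the texts it was given.

```rust
use std::sync::mpsc;
use std::thread;

// Owns its text, so it carries no lifetime parameter.
struct Message {
    text: String,
}

// Hypothetical helper: sends every line to a worker thread as an owned
// `Message` and returns the texts the worker received.
fn forward_lines<I>(lines: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let (tx, rx) = mpsc::channel::<Message>();

    let receiver = thread::spawn(move || {
        let mut seen = Vec::new();
        // `rx.iter()` ends once every sender has been dropped.
        for message in rx.iter() {
            seen.push(message.text);
        }
        seen
    });

    for line in lines {
        // The String is moved into the message and then into the channel.
        tx.send(Message { text: line }).unwrap();
    }
    // Close the channel so the receiving loop terminates.
    drop(tx);

    receiver.join().unwrap()
}

fn main() {
    let received = forward_lines(vec!["hello".to_string(), "world".to_string()]);
    for text in &received {
        println!("{}", text);
    }
}
```

Because each String is moved, the main thread can no longer touch a line after sending it, which is exactly why no lifetime annotation is needed.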

Related

Reading a vector from multiple threads [duplicate]

This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed last month.
I have a function that returns a vector of strings, which is read by multiple threads later. How do I do this in Rust?
fn get_list() -> Vec<String> { ... }
fn read_vec() {
    let v = get_list();
    for i in 1..10 {
        handles.push(thread::spawn (|| { do_work(&v); }));
    }
    handles.join();
}
I think I need to extend the lifetime of v to static and pass it as an immutable ref to the threads, but I am not sure how.
The problem you are facing is that the threads spawned by thread::spawn run for an unknown amount of time. You'll need to make sure that your Vec<String> outlives these threads. There are several ways to do that:
You can use atomic reference counting by creating an Arc<Vec<String>> and cloning it for each thread. The Vec<String> will be deallocated only when all Arcs are dropped.
You can leak the Vec<String>. I personally like this approach, but only if you need the Vec<String> for the entire runtime of your program. To achieve this, you can turn your Vec<String> into a &'static [String] by using Vec::leak.
You can ensure that your threads will not run after the read_vec function returns. This is what you're essentially doing by calling handles.join(). However, the compiler doesn't see that these threads are joined later, and there might be edge cases where they are not joined (what happens when the 2nd thread::spawn panics?). To make this explicit, use the scope function in std::thread.
Of course, you can also just clone the Vec<String> and give each thread its own copy.
TL;DR:
For this particular use-case, I'd recommend std::thread::scope. If the Vec<String> lives for the entire duration of your program, leaking it using Vec::leak is a great and often under-used solution. For more complex scenarios, wrapping the Vec<String> in an Arc is probably the right way to go.
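As a rough sketch of two of these options, the code below shows std::thread::scope and Arc side by side. The bodies of get_list and do_work are made-up stand-ins (the question does not show them), and the thread count of 4 is arbitrary.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-ins for the question's `get_list` and `do_work`.
fn get_list() -> Vec<String> {
    vec!["a".to_string(), "bb".to_string(), "ccc".to_string()]
}

fn do_work(v: &[String]) -> usize {
    v.iter().map(|s| s.len()).sum()
}

// Scoped threads: every thread spawned inside the scope is joined before
// `thread::scope` returns, so the closures may borrow `v` directly.
fn read_vec_scoped() -> Vec<usize> {
    let v = get_list();
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..4 {
            handles.push(s.spawn(|| do_work(&v)));
        }
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<usize>>()
    })
}

// Arc: each thread owns a clone of the reference-counted pointer, and the
// vector is freed when the last clone is dropped.
fn read_vec_arc() -> Vec<usize> {
    let v = Arc::new(get_list());
    let mut handles = Vec::new();
    for _ in 0..4 {
        let v = Arc::clone(&v);
        handles.push(thread::spawn(move || do_work(&v)));
    }
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect::<Vec<usize>>()
}

fn main() {
    println!("{:?}", read_vec_scoped());
    println!("{:?}", read_vec_arc());
}
```

The scoped version needs no heap-allocated wrapper and no 'static bound; the Arc version is the one to reach for when the threads must be able to outlive the function that created the vector.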

Mutably use a value inside and outside of a closure

Is there a way to mutably borrow (or move a reference to) some value into a closure and continue using it outside, in a cleaner way?
For example, I have this code:
let queue = Arc::new(RefCell::new(Vec::new()));
let cqueue = Arc::clone(&queue);
EntityEventQueue::register_receiver(&entity_equeue, "position-callback",
    Box::new( move |e| {
        cqueue.borrow_mut().push(e.clone());
    }));
// mutate queue
// mutate queue
It works, but I heard that RefCell is bad practice outside some specific uses. Is there a way that I can use queue both inside and outside of the closure?
And if there is not, do you know a better way of implementing this? The one requirement is that the queue must be outside of the EntityEventQueue structure.
(I created the register_receiver method, so it can be altered. Its signature is pub fn register_receiver(this: &Arc<RefCell<Self>>, name: &str, callback: Box<dyn FnMut(...) + 'a>).)
You should use a synchronization mechanism instead of RefCell, for example a Mutex or an RwLock, depending on your write pattern. A quick rule of thumb:
One writer (at a time) and several readers -> RwLock
Many writers and many readers -> Mutex
Both are in std, but other synchronization libraries and mechanisms are available too.
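A minimal sketch of the Mutex variant follows. EventSource is a hypothetical, much simplified stand-in for the question's EntityEventQueue, with events reduced to &str and a Send bound on the callback assumed; only the Arc<Mutex<Vec<_>>> pattern is the point here.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for `EntityEventQueue`: stores boxed callbacks and
// calls each of them when an event is dispatched.
struct EventSource {
    receivers: Vec<Box<dyn FnMut(&str) + Send>>,
}

impl EventSource {
    fn new() -> Self {
        EventSource { receivers: Vec::new() }
    }

    fn register_receiver(&mut self, callback: Box<dyn FnMut(&str) + Send>) {
        self.receivers.push(callback);
    }

    fn dispatch(&mut self, event: &str) {
        for receiver in self.receivers.iter_mut() {
            receiver(event);
        }
    }
}

fn collect_events() -> Vec<String> {
    // The queue lives outside the event source and is shared via Arc.
    let queue = Arc::new(Mutex::new(Vec::new()));
    let cqueue = Arc::clone(&queue);

    let mut source = EventSource::new();
    source.register_receiver(Box::new(move |e: &str| {
        // Inside the closure: lock, push, unlock at the end of the statement.
        cqueue.lock().unwrap().push(e.to_string());
    }));

    source.dispatch("moved");
    // Outside the closure: the same queue can still be mutated.
    queue.lock().unwrap().push("added outside".to_string());
    source.dispatch("moved again");

    let events = queue.lock().unwrap().clone();
    events
}

fn main() {
    println!("{:?}", collect_events());
}
```

Each lock guard is a temporary that is released at the end of its statement, so the closure and the outer code never hold the lock at the same time.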

How can I make a variable borrow for 'static?

In vulkano, to create a CpuAccessibleBuffer you need to give it some data, and the CpuAccessibleBuffer::from_data function requires the data to have the 'static lifetime.
I have some data in a &[u8] slice (created at runtime) that I would like to pass to that function.
However, it errors with this message:
argument requires that `data` is borrowed for `'static`
So how can I make the lifetime of the data 'static?
You should use CpuAccessibleBuffer::from_iter instead, it does the same thing but does not require the collection to be Copy or 'static:
let data: &[u8] = todo!();
let _ = CpuAccessibleBuffer::from_iter(
    device,
    usage,
    host_cached,
    data.iter().copied(), // <--- pass like so
);
Or if you actually have a Vec<u8>, you can pass it directly:
let data: Vec<u8> = todo!();
let _ = CpuAccessibleBuffer::from_iter(
    device,
    usage,
    host_cached,
    data, // <--- pass like so
);
If you really must create the data at runtime, and you really need it to last for 'static, then you can use one of the memory-leaking methods such as Box::leak or Vec::leak to deliberately leak a heap allocation and ensure it is never freed.
While leaking memory is normally something one avoids, in this case it's actually a sensible thing to do. If the data must live forever then leaking it is actually the correct thing to do, semantically speaking. You don't want the memory to be freed, not ever, which is exactly what happens when memory is leaked.
Example:
fn require_static_data(data: &'static [u8]) {
    unimplemented!()
}
fn main() {
    let data = vec![1, 2, 3, 4];
    require_static_data(data.leak());
}
That said, really think over the reallys I led with. Make sure you understand why the code you're calling wants 'static data and ask yourself why your data isn't already 'static.
Is it possible to create the data at compile time? Rust has a powerful compile-time macro system. It's possible, for example, to use include_bytes! to embed a file's contents in your executable as a 'static byte array, and a build script can do some processing on the file before it's embedded.
Is there another API you can use, another function call you're not seeing that doesn't require 'static?
(These questions aren't for you specifically, but for anyone who comes across this Q&A in the future.)
If the data is created at runtime, it can't have a static lifetime unless it is deliberately leaked. Static means that the data is present for the whole lifetime of the program, which is necessary in some contexts, especially when threading is involved. One way for data to be static is, as Paul already answered, explicitly declaring it as such, i.e.:
static CONSTANT_VALUE: i32 = 0;
However, there's no universally applicable way to make arbitrary data static. This type of inference is made at compile-time by the borrow checker, not by the programmer.
Usually if a function requires 'static (type) arguments (as in this case) it means that anything less could potentially be unsafe, and you need to reorganize the way data flows in and out of your program to provide this type of data safely. Unfortunately, that's not something SO can provide within the scope of this question.
Make a constant with static lifetime:
static NUM: i32 = 18;

How to move closures forever

I'm designing a little struct that runs closures for me and I can set them to stop:
pub fn run(&mut self, f: Box<dyn Fn()>) {
    let should_continue = self.should_continue.clone();
    self.run_thread = Some(std::thread::spawn(move || {
        while should_continue.load(Ordering::Relaxed) {
            // f should run fast so that `should_continue` is read frequently
            f();
        }
    }));
}
As you can see, I'm passing the Fn in a Box, which gives me an error about the Box not being shareable between threads. Actually, I don't care about f once I pass it to run, so I wanted to move the closure into this function, since I won't use it anymore. I cannot mark the Fn as Send because the f that I'm actually going to pass does not implement Send.
So, how can I move a closure completely?
//move this closure to inside of run
self.run(||{});
Having a buildable reproduction case rather than code with random unprovided dependencies is useful so here's what I understand of your code.
The error I get is that the dyn Fn cannot be sent between threads, which is very different from shared: while there are many things which cannot be shared between threads (they are not Sync, so they can only be used from one thread at a time), there are also things which must remain on their original thread at all times. Rc, for instance, is not Send: because it's not a thread-safe reference-counted pointer, sending an Rc to a different thread would break its guarantees, so that's not allowed.
dyn Fn is opaque and offers no real guarantee as to what it's doing internally except for, well, being callable multiple times. So as far as the compiler is concerned it could close over something which isn't Send (e.g. a reference to a !Sync type, or an Rc, ...), which means the compiler assumes the Fn isn't Send either.
The solution is simple: just define f: Box<dyn Fn() + Send>, this way within run you guarantee that the function can, in fact, be sent between threads; and the caller to run will get an error if they're trying to send a function which can not be sent.
demo
run_ok uses a trivial closure, there is no issue with sending it over. run_not_ok closes over an Rc, and the function therefore doesn't compile (just uncomment it to see). run_ok2 is the same function as run_not_ok using an Arc instead of the Rc, and compiles fine.
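The linked demo is not reproduced here, so the following is only a sketch in the same spirit. The Runner struct, its fields, and the stop and count_calls helpers are assumptions reconstructed from the question's snippet, not the original code.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

// Hypothetical reconstruction of the struct from the question.
struct Runner {
    should_continue: Arc<AtomicBool>,
    run_thread: Option<JoinHandle<()>>,
}

impl Runner {
    fn new() -> Self {
        Runner {
            should_continue: Arc::new(AtomicBool::new(true)),
            run_thread: None,
        }
    }

    // `+ Send` promises that the boxed closure may be moved to another
    // thread; callers passing a non-Send closure get a compile error.
    fn run(&mut self, f: Box<dyn Fn() + Send>) {
        let should_continue = Arc::clone(&self.should_continue);
        self.run_thread = Some(thread::spawn(move || {
            while should_continue.load(Ordering::Relaxed) {
                f();
            }
        }));
    }

    fn stop(&mut self) {
        self.should_continue.store(false, Ordering::Relaxed);
        if let Some(handle) = self.run_thread.take() {
            handle.join().unwrap();
        }
    }
}

// Runs a counting closure on the worker thread and reports how many
// times it was called before the runner was stopped.
fn count_calls() -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let worker_counter = Arc::clone(&counter);

    let mut runner = Runner::new();
    // Capturing an Arc keeps the closure Send; capturing an Rc here
    // would be rejected by the compiler.
    runner.run(Box::new(move || {
        worker_counter.fetch_add(1, Ordering::Relaxed);
    }));

    // Wait until the closure has run at least once, then stop.
    while counter.load(Ordering::Relaxed) == 0 {
        thread::yield_now();
    }
    runner.stop();
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("closure ran {} times", count_calls());
}
```

The closure is still moved completely into run; the Send bound only restricts which closures callers are allowed to hand over.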

How can Rust be told that a thread does not live longer than its caller? [duplicate]

This question already has an answer here:
How can I pass a reference to a stack variable to a thread?
(1 answer)
Closed 5 years ago.
I have the following code:
fn main() {
    let message = "Can't shoot yourself in the foot if you ain't got no gun";
    let t1 = std::thread::spawn(|| {
        println!("{}", message);
    });
    t1.join();
}
rustc gives me the compilation error:
closure may outlive the current function, but it borrows message, which is owned by the current function
This is wrong since:
The function it's referring to here is (I believe) main. The threads will be killed or enter in UB once main is finished executing.
The function it's referring to clearly invokes .join() on said thread.
Is the previous code unsafe in any way? If so, why? If not, how can I get the compiler to understand that?
Edit: Yes, I am aware I can just move the message in this case; my question is specifically asking how I can pass a reference to it (preferably without having to heap allocate it), similarly to how this C++ code would do it:
std::thread([&message]() -> void {/* etc */});
(Just to clarify, what I'm actually trying to do is access a thread safe data structure from two threads... other solutions to the problem that don't involve making the copy work would also help).
Edit2: The question this has been marked as a duplicate of is 5 pages long, and as such I'd consider it an invalid question in its own right.
Is the previous code 'unsafe' in any way? If so, why?
The goal of Rust's type-checking and borrow-checking system is to disallow unsafe programs, but that does not mean that all programs that fail to compile are unsafe. In this specific case, your code is not unsafe, but it does not satisfy the type constraints of the functions you are using.
The function it's referring to clearly invokes .join() on said thread.
But there is nothing from a type-checker standpoint that requires the call to .join. A type-checking system (on its own) can't enforce that a function has or has not been called on a given object. You could just as easily imagine an example like
let message = "Can't shoot yourself in the foot if you ain't got no gun";
let mut handles = vec![];
for i in 0..3 {
    let t1 = std::thread::spawn(|| {
        println!("{} {}", message, i);
    });
    handles.push(t1);
}
for t1 in handles {
    t1.join();
}
where a human can tell that each thread is joined before main exits. But a typechecker has no way to know that.
The function it's referring to here is (I believe) main. So presumably those threads will be killed when main exits anyway (and them running after main exits is UB).
From the standpoint of the checkers, main is just another function. There is no special knowledge that this specific function can have extra behavior. If this were any other function, the thread would not be auto-killed. Expanding on that, even for main there is no guarantee that the child threads will be killed instantly. If it takes 5ms for the child threads to be killed, that is still 5ms where the child threads could be accessing the content of a variable that has gone out of scope.
To gain the behavior that you are looking for with this specific snippet (as-is), the lifetime of the closure would have to be tied to the lifetime of the t1 object, such that the closure was guaranteed to never be used after the handles have been cleaned up. While that is certainly an option, it is significantly less flexible in the general case. Because it would be enforced at the type level, there would be no way to opt out of this behavior.
You could consider using crossbeam, specifically crossbeam::scope's .spawn, which enforces this lifetime requirement where the standard library does not, meaning a thread must stop execution before the scope is finished.
In your specific case, your code works fine as long as you transfer ownership of message to the child thread instead of borrowing it from the main function, because there is no risk of unsafe code with or without your call to .join. Your code works fine if you change
let t1 = std::thread::spawn(|| {
to
let t1 = std::thread::spawn(move || {
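If borrowing really is required, the scoped-thread idea mentioned above is also available in the standard library as std::thread::scope since Rust 1.63. The sketch below is one way it could look; the helper name borrowed_len and its return value are made up for illustration.

```rust
use std::thread;

// Hypothetical helper: a scoped thread borrows `message` without moving
// or heap-allocating it, and hands back its length.
fn borrowed_len(message: &str) -> usize {
    thread::scope(|s| {
        // Threads spawned through `s` must finish before `scope` returns,
        // which is what lets the closure borrow from the caller's frame.
        s.spawn(|| {
            println!("{}", message);
            message.len()
        })
        .join()
        .unwrap()
    })
}

fn main() {
    // A runtime-built String, so the borrow is clearly not 'static.
    let message = String::from("Can't shoot yourself in the foot if you ain't got no gun");
    let len = borrowed_len(&message);
    // `message` is still usable here because it was only borrowed.
    println!("{} bytes: {}", len, message);
}
```

Here the join is part of the type-level contract of the scope, which is the guarantee that a plain thread::spawn cannot express.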
