Are Fn() + 'static and FnMut() + 'static equivalent? - rust

I have what I think to be a simple question, but not much luck finding an answer for it.
Background
I understand the difference between Fn and FnMut in Rust, but I see quite often the need to accept closures that require a 'static lifetime bound.
The Question
Is an Fn() + 'static equivalent to FnMut() + 'static?
My Opinion
I believe they are. An Fn captures immutable references to its environment and an FnMut captures mutable ones, but with the 'static lifetime bound the only things the closure can hold are owned values, so it will almost always be a move closure. Since only owned values or special &'static references can satisfy 'static, it seems pointless to want or need an FnMut() in this case: there is no mutable reference to the closure's environment to be had.
Am I wrong in this conclusion? My guess is yes, otherwise there would probably be a Clippy lint for this.

Any closure type can capture any kind of data. The difference is how the closure can access the captured data while it is executed. An Fn closure receives a shared reference to its captured data. An FnMut closure receives a mutable reference to its captured data, so it can mutate it. And finally, an FnOnce closure receives ownership of the captured data, which is why you can call it only once.
The 'static trait bound means that the captured data has static lifetime. This is completely orthogonal to the question of what a closure can do with its captured data while it is called.
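To make the distinction concrete, here is a minimal sketch of a closure that owns all of its captured data (so it is 'static) and still needs FnMut because it mutates that data on every call. The names call_twice_mut and make_counter are made up for this illustration.

```rust
// Accepts any closure that owns (or 'static-borrows) its captures
// and may mutate them when called.
fn call_twice_mut<F: FnMut() -> u32 + 'static>(mut f: F) -> u32 {
    f();
    f()
}

// Returns a closure that owns `count` (it is moved in), so the closure
// is 'static. It mutates `count` on every call, so it is FnMut, not Fn.
fn make_counter() -> impl FnMut() -> u32 + 'static {
    let mut count = 0;
    move || {
        count += 1;
        count
    }
}

fn main() {
    let result = call_twice_mut(make_counter());
    println!("{result}"); // prints 2

    // A function bounded by `Fn() -> u32 + 'static` would reject
    // `make_counter()`: a call through `&self` cannot mutate `count`.
}
```

So the two bounds are not interchangeable: FnMut() + 'static accepts strictly more closures than Fn() + 'static.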

Related

When is a static lifetime not appropriate?

I have found a lot of information across the web about Rust lifetimes, including information about static lifetimes. It makes sense to me that, in certain situations, you must guarantee that a reference will outlive everything.
For instance, I have a reference that I’m passing to a thread, and the compiler is requesting that the reference be marked as static. In this scenario, that seems to make sense because the compiler can’t know how long the thread will live and thus needs to ensure the passed reference outlives the thread. (I think that’s correct?)
I don’t know where this comes from, but I am always concerned that marking something with a static lifetime is something to be skeptical of, and avoided when possible.
So I wonder if that’s correct. Should I be critical of marking things with a static lifetime? Are there situations when the compiler will want to require one, but an alternate strategy might actually be more optimal?
What are some concrete ways that I can reason about the application of a static lifetime, and possibly determine when it might not be appropriate?
As you might have already guessed, there is no definitive, technical answer to this.
To a newcomer to Rust, 'static references seem to defeat the entire purpose of the borrowing system, and there is an urge to avoid them. With more experience, that urge goes away.
First of all, 'static is not as bad as it seems, since all things that have no other lifetimes associated with them are 'static, e.g. String::new(). Notice that 'static does not mean that the value in question truly lives forever. It just means that the value can be made to live forever. In your threading example, the thread can't make any promises about its own lifetime, so it needs to be able to make everything passed to it live forever. Any owned value which does not include lifetimes shorter than 'static (like vec![1,2,3]) can be made to live forever (simply by not destroying it) and is therefore 'static.
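A small sketch of that point, using a hypothetical helper require_static whose only job is to demand the T: 'static bound:

```rust
// Accepts any value whose type contains no borrow shorter than 'static.
fn require_static<T: 'static>(value: T) -> T {
    value
}

fn main() {
    // Owned values satisfy `T: 'static` even though they are dropped
    // at the end of `main`, long before the program exits.
    let s = require_static(String::from("hello"));
    let v = require_static(vec![1, 2, 3]);
    println!("{s} {v:?}");

    // A reference to a local does not satisfy the bound:
    // let local = 5;
    // require_static(&local); // error: `local` does not live long enough
}
```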
Second, &'static - the static reference - does not come up often anyway. If it does, you'll usually be aware of why. You won't see a lot of fn foo(bar: &'static Bar) because there simply aren't that many use-cases for it, not because it is actively avoided.
There are situations where 'static does come up in surprising ways. Off the top of my head:
A Box<dyn Trait> is implicitly a Box<dyn Trait + 'static>. This is because when the type of the value inside the Box gets erased, it might have had lifetimes associated with it; and all (different) types must be valid for as long as the Box lives. Therefore all types need to share a common denominator with respect to their lifetimes, and Rust is defined to choose 'static. This choice is usually ok, but can lead to surprising "requires 'static" errors. You can generalize this explicitly to Box<dyn Trait + 'a>.
If you have a custom impl Drop on your type, the drop checker may not be able to prove that the destructor is unable to observe values that have already been dropped. To prevent the Drop impl from accessing references to such values, the compiler requires all borrowed data inside the type to strictly outlive the value itself, which 'static references always do and shorter borrows sometimes do not. This restriction can be lifted by an unsafe impl (currently through the unstable #[may_dangle] attribute).
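The Box<dyn Trait> case can be sketched as follows. The trait Greet, the type Borrowed and the function boxed are invented for this example; the point is only the explicit + 'a on the trait object.

```rust
trait Greet {
    fn greet(&self) -> String;
}

// A type that borrows its data, so it is not 'static in general.
struct Borrowed<'a>(&'a str);

impl<'a> Greet for Borrowed<'a> {
    fn greet(&self) -> String {
        format!("hello, {}", self.0)
    }
}

// Written as `Box<dyn Greet>`, the return type would mean
// `Box<dyn Greet + 'static>` and this function would be rejected.
fn boxed<'a>(name: &'a str) -> Box<dyn Greet + 'a> {
    Box::new(Borrowed(name))
}

fn main() {
    let name = String::from("world"); // a local, non-'static buffer
    let g = boxed(&name);
    println!("{}", g.greet());
}
```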
Instead of &'static T, pass Arc<T> to the thread. This has only a tiny cost and ensures lifetimes will not be longer than necessary.
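A minimal sketch of the Arc<T> approach, assuming the data is a Vec<u32> and the worker only needs to read it (sum_on_thread is a name chosen for this example):

```rust
use std::sync::Arc;
use std::thread;

// Sums the shared data on a worker thread. The data is kept alive by
// reference counting rather than by a 'static borrow.
fn sum_on_thread(data: Arc<Vec<u32>>) -> u32 {
    let worker_copy = Arc::clone(&data); // bumps the count; no deep copy
    let handle = thread::spawn(move || worker_copy.iter().sum::<u32>());
    handle.join().expect("worker thread panicked")
}

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    let total = sum_on_thread(Arc::clone(&data));
    // The caller still has access to the same allocation.
    println!("{total} from {data:?}");
}
```

The Vec is freed as soon as the last Arc pointing to it is dropped, on whichever thread that happens.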

Why is the "move" keyword necessary when it comes to threads; why would I ever not want that behavior?

For example (taken from the Rust docs):
let v = vec![1, 2, 3];
let handle = thread::spawn(move || {
    println!("Here's a vector: {:?}", v);
});
This is not a question about what move does, but about why it is necessary to specify.
In cases where you want the closure to take ownership of an outside value, would there ever be a reason not to use the move keyword? If move is always required in these cases, is there any reason why the presence of move couldn't just be implied/omitted? For example:
let v = vec![1, 2, 3];
let handle = thread::spawn(/* move is implied here */ || {
    // Compiler recognizes that `v` exists outside of this closure's
    // scope and does black magic to make sure the closure takes
    // ownership of `v`.
    println!("Here's a vector: {:?}", v);
});
The above example gives the following compile error:
closure may outlive the current function, but it borrows `v`, which is owned by the current function
When the error magically goes away simply by adding move, I can't help but wonder to myself: why would I ever not want that behavior?
I'm not suggesting anything is wrong with the required syntax. I'm just trying to gain a deeper understanding of move from people who understand Rust better than I do. :)
It's all about lifetime annotations, and a design decision Rust made long ago.
See, your thread::spawn example fails to compile because thread::spawn expects a 'static closure. Since the new thread can run longer than the code that spawned it, we have to make sure that any captured data stays alive after the caller returns. The solution, as you pointed out, is to pass ownership of the data with move.
But the 'static constraint is a lifetime annotation, and a fundamental principle of Rust is that lifetime annotations never affect run-time behavior. In other words, lifetime annotations are only there to convince the compiler that the code is correct; they can't change what the code does.
If Rust inferred the move keyword based on whether the callee expects 'static, then changing the lifetimes in thread::spawn may change when the captured data is dropped. This means that a lifetime annotation is affecting runtime behavior, which is against this fundamental principle. We can't break this rule, so the move keyword stays.
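To see that move really does change when captured data is dropped, here is a small single-threaded sketch. The Noisy type, its ping method and the drop_order function are made up for the illustration; the shared log only records the order of events.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Records its own destruction in a shared log.
struct Noisy {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Noisy {
    // Does nothing; calling it makes the closure use `value` by reference.
    fn ping(&self) {}
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push("value dropped");
    }
}

// Runs the same closure body with or without `move` and returns the
// order in which the closure and the captured value were dropped.
fn drop_order(use_move: bool) -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let value = Noisy { log: Rc::clone(&log) };
        if use_move {
            let closure = move || value.ping();
            closure();
            // `closure` goes out of scope here and takes `value` with it.
        } else {
            let closure = || value.ping();
            closure();
            // `closure` goes out of scope here; only a borrow ends.
        }
        log.borrow_mut().push("closure dropped");
    } // without `move`, `value` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("with move:    {:?}", drop_order(true));
    println!("without move: {:?}", drop_order(false));
}
```

With move, the value is destroyed together with the closure; without it, the value outlives the closure. If a 'static bound on the callee could switch between these two behaviors, a lifetime annotation would be changing what the program does.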
Addendum: Why are lifetime annotations erased?
To give us the freedom to change how lifetime inference works, which allows for improvements like non-lexical lifetimes (NLL).
So that alternative Rust implementations like mrustc can save effort by ignoring lifetimes.
Much of the compiler assumes that lifetimes work this way, so to make it otherwise would take a huge effort with dubious gain. (See this article by Aaron Turon; it's about specialization, not closures, but its points apply just as well.)
There are actually a few things in play here. To help answer your question, we must first understand why move exists.
Rust has 3 types of closures:
FnOnce, a closure that consumes its captured variables (and hence can only be called once),
FnMut, a closure that mutably borrows its captured variables, and
Fn, a closure that immutably borrows its captured variables.
When you create a closure, Rust infers which trait to use based on how the closure uses the values from the environment. The manner in which a closure captures its environment depends on its type. An FnOnce captures by value (which may be a move, or a copy if the type is Copy), an FnMut mutably borrows, and an Fn immutably borrows. However, if you use the move keyword when declaring a closure, it will always "capture by value", that is, take ownership of the variables it captures. Thus, the move keyword is irrelevant for FnOnce closures, but it changes how Fn and FnMut closures capture data.
Coming to your example, Rust infers the type of the closure to be an Fn, because println! only requires a reference to the value(s) it is printing (the Rust book page you linked talks about this when explaining the error without move). The closure thus attempts to borrow v, and the standard lifetime rules apply. Since thread::spawn requires that the closure passed to it have a 'static lifetime, the captured environment must also be 'static, but a borrow of v cannot be, because v is dropped when the current function returns. That causes the error. You must thus explicitly specify that you want the closure to take ownership of v.
This can be further exemplified by changing the closure to something that the compiler would infer to be an FnOnce, such as || v. Since the compiler infers that the closure is an FnOnce, it captures v by value by default, and the line let handle = thread::spawn(|| v); compiles without requiring the move.
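Both situations can be sketched with the standard library alone. The first function repeats the || v case; the second uses std::thread::scope (stable since Rust 1.63), where the closure is allowed to borrow and move would be unwanted because the caller keeps using the vector. The function names are invented for this example.

```rust
use std::thread;

// The closure returns `v`, which moves it out of the environment, so
// the compiler captures `v` by value even without `move`.
fn send_and_return(v: Vec<u32>) -> Vec<u32> {
    let handle = thread::spawn(|| v);
    handle.join().expect("worker thread panicked")
}

// Scoped threads are joined before `scope` returns, so their closures
// may borrow locals instead of owning them.
fn borrow_in_scope() -> (usize, Vec<u32>) {
    let mut v = vec![1, 2, 3];
    let len = thread::scope(|s| {
        let handle = s.spawn(|| v.len()); // borrows `v`, no `move`
        handle.join().expect("worker thread panicked")
    });
    v.push(4); // `v` is still owned by this function
    (len, v)
}

fn main() {
    println!("{:?}", send_and_return(vec![1, 2, 3]));
    println!("{:?}", borrow_in_scope());
}
```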
The existing answers have great information, which led me to an understanding that is easier for me to think about, and hopefully easier for other Rust newcomers to get.
Consider this simple Rust program:
fn print_vec(v: &Vec<u32>) {
    println!("Here's a vector: {:?}", v);
}
fn main() {
    let mut v: Vec<u32> = vec![1, 2, 3];
    print_vec(&v); // `print_vec()` borrows `v`
    v.push(4);
}
Now, asking why the move keyword can't be implied is like asking why the "&" in print_vec(&v) can't also be implied.
Rust’s central feature is ownership. You can't just tell the compiler, "Hey, here's a bunch of code I wrote, now please discern perfectly everywhere I intend to reference, borrow, copy, move, etc. Kthnxsbye!" Symbols and keywords like & and move are a necessary and integral part of the language.
In hindsight, this seems really obvious, and makes my question seem a little silly!

Do I have to create distinct structs for both owned (easy-to-use) and borrowed (more efficient) data structures?

I have a Message<'a> which holds &'a str references into a mostly short-lived buffer.
Those references mandate a specific program flow as they are guaranteed to never outlive the lifetime 'a of the buffer.
Now I also want to have an owned version of Message, such that it can be moved around, sent via threads, etc.
Is there an idiomatic way to achieve this? I thought that Cow<'a, str> might help, but unfortunately, Cow does not magically allocate in case &'a str would outlive the buffer's lifetime.
AFAIK, Cow is not special: even when a Cow holds the Owned variant, its type still carries 'a and must pass the borrow checker on 'a.
Definition of std::borrow::Cow:
pub enum Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}
Is there an idiomatic way to have an owned variant of Message? For some reason we have &str and String, &[u8] and Vec<u8>, ... does that mean people generally would go for &msg and Message?
I suppose I still have to think about if an owned variant is really, really needed, but my experience shows that having an escape hatch for owned variants generally improves prototyping speed.
Yes, you need to have multiple types, one representing the owned concept and one representing the borrowed concept.
You'll see the same technique throughout the standard library and third-party crates.
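A minimal sketch of such a pair of types, mirroring the &str / String split. The question does not show the fields of Message, so topic and body, the "topic:body" wire format, and every method name below are assumptions made for the example.

```rust
use std::thread;

// Borrowed form: cheap to build, tied to the buffer's lifetime 'a.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Message<'a> {
    topic: &'a str,
    body: &'a str,
}

// Owned form: allocates, but has no lifetime parameter.
#[derive(Debug, Clone, PartialEq)]
struct OwnedMessage {
    topic: String,
    body: String,
}

impl<'a> Message<'a> {
    // Hypothetical format: "topic:body".
    fn parse(buffer: &'a str) -> Option<Message<'a>> {
        let (topic, body) = buffer.split_once(':')?;
        Some(Message { topic, body })
    }

    fn to_owned_message(&self) -> OwnedMessage {
        OwnedMessage {
            topic: self.topic.to_owned(),
            body: self.body.to_owned(),
        }
    }
}

impl OwnedMessage {
    // Cheap view back into the borrowed form.
    fn as_message(&self) -> Message<'_> {
        Message { topic: &self.topic, body: &self.body }
    }
}

// Only the owned form can cross a `thread::spawn` boundary.
fn body_len_on_thread(msg: OwnedMessage) -> usize {
    thread::spawn(move || msg.body.len())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let buffer = String::from("greeting:hello");
    let borrowed = Message::parse(&buffer).expect("missing ':' separator");
    let owned = borrowed.to_owned_message();
    assert_eq!(owned.as_message(), borrowed);
    println!("{}", body_len_on_thread(owned));
}
```

Functions that only read a message can take Message<'_> and be called with either form through as_message, which keeps the duplication small.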
See also:
How to abstract over a reference to a value or a value itself?
How to avoid writing duplicate accessor functions for mutable and immutable references in Rust?

What are the examples of unsafely specified lifetimes? [duplicate]

I have been learning about lifetimes for the last three days, and they are starting to make sense to me. However, despite a lot of experimenting, I didn't manage to specify lifetimes in a way that would lead to unsafe behavior at runtime, because the compiler seems to be smart enough to prevent such cases by refusing to compile.
Hence I have the chain of questions below:
Is it true that the Rust compiler will catch every unsafe use of lifetime specifiers?
If yes, then why does Rust require manually specifying lifetimes, when it could do it on its own by deducing the unsafe scenarios? Or is it just a relic that will go away once the compiler becomes powerful enough to apply lifetime elision everywhere?
If no, what are some examples of unsafe use of lifetime specifiers? They'd clearly prove the necessity of manually specifying lifetimes.
It is not possible (barring any compiler bugs) to induce undefined behavior with lifetime specifiers unless you use unsafe code (either in the function or elsewhere). However, lifetime specifiers are still necessary because sometimes there is ambiguity in what the proper lifetime should be. For example:
fn foo(bar: &i32, baz: &i32) -> &i32 {
    // ...
}
What should the lifetime of the return type be? The compiler cannot infer this because it could be tied to either bar or baz, and each case would affect how long the return value lasts and therefore how the function can be used. The body of the function cannot be used to infer the lifetime because type and lifetime checks must be possible to complete using only the signature of the function. The only way to remove this ambiguity is to explicitly state what lifetime the return value should have:
fn foo<'a>(bar: &i32, baz: &'a i32) -> &'a i32 {
    // ...
}
You can read more about the lifetime elision rules here.
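As a sketch of how the chosen annotation affects callers, the version of foo below ties the result to baz only, so the value passed as bar may go out of scope while the result is still in use. The function body and the demo function are filled in for illustration.

```rust
// The result is tied to `baz` only, so `bar` may be short-lived.
fn foo<'a>(_bar: &i32, baz: &'a i32) -> &'a i32 {
    baz
}

fn demo() -> i32 {
    let long_lived = 2;
    let result;
    {
        let short_lived = 1;
        result = foo(&short_lived, &long_lived);
    } // `short_lived` is gone; `result` only borrows `long_lived`
    *result
}

fn main() {
    println!("{}", demo()); // prints 2
}
```

Had the signature been fn foo<'a>(bar: &'a i32, baz: &'a i32) -> &'a i32, the same caller would be rejected, because the result would then also be tied to short_lived. The function body alone cannot tell the compiler which of these contracts the author intends to promise.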

Can I use a mutable reference method like a value-passing one?

Can I use a mutable reference method like a value-passing one? For example, can I use
o.mth(&mut self, ...)
as
o.mth(self, ...)
This would allow me to return the result without worrying about the lifetime of o. It might involve a move closure, or some kind of wrapper?
For context, I'm trying to return a boxed iterator over CSV records using the rust-csv package but the iterator can't outlive the reader, which Reader::records(&'t mut self) borrows mutably. Contrast this with BufRead::lines(self), which consumes its reader and hence can be returned without lifetime problems.
No, you cannot. The self, &self, and &mut self method forms exist because they behave differently, have different restrictions, and allow different things.
In this case, you'd probably ultimately end up trying to create an iterator that yields references into itself, which isn't allowed, or to store a value and a reference to that value in the same struct, which is also disallowed.
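For contrast, this is roughly why the BufRead::lines(self) case mentioned in the question works: the method takes the reader by value, so the iterator owns it and yields owned Strings. The sketch uses only the standard library; boxed_lines and collect_lines are names invented here, and nothing in it changes how Reader::records(&mut self) behaves.

```rust
use std::io::{self, BufRead, Cursor};

// Takes the reader by value, so the returned iterator owns it and
// borrows nothing from the caller.
fn boxed_lines<R: BufRead + 'static>(
    reader: R,
) -> Box<dyn Iterator<Item = io::Result<String>>> {
    Box::new(reader.lines())
}

// Builds an in-memory reader and drains the boxed iterator.
fn collect_lines(text: &str) -> Vec<String> {
    let reader = Cursor::new(text.as_bytes().to_vec());
    boxed_lines(reader)
        .map(|line| line.expect("in-memory read failed"))
        .collect()
}

fn main() {
    println!("{:?}", collect_lines("a,b\nc,d"));
}
```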
