Do Rust lifetimes influence the semantics of the compiled program?

I'm trying to grok lifetimes in Rust and asked myself whether they are "just" a safety measure (and a way to communicate how safety is ensured, or not, in the case of errors) or if there are cases where different choices of lifetimes actually change how the program runs, i.e. whether lifetimes make a semantic difference to the compiled program.
By "lifetimes" I mean all the pesky little 'a, 'b, 'static markers we include to make the borrow checker happy. Of course, writing
{
    let foo = File::open("foo.txt")?;
}
foo.write_all(b"bar");
instead of
let foo = File::open("foo.txt")?;
foo.write_all(b"bar");
will close the file descriptor before the write occurs, even if we could access foo afterwards, but that kind of scoping and destructor-calling also happens in C++.

No, lifetimes do not affect the generated machine code in any way. At the end of the day, it's all "just pointers" to the compiled code.
Because we are humans speaking a human language, we tend to lump two different but related concepts together: concrete lifetimes and generic lifetime parameters.
All programming languages have concrete lifetimes. That just corresponds to when a resource will be released. That's what your example shows, and indeed, C++ works the same as Rust does there. This is often known as Resource Acquisition Is Initialization (RAII). Garbage-collected languages have lifetimes too, but it can be harder to nail down exactly when they end.
What makes Rust neat in this area are the generic lifetime parameters, the things we know as 'a or 'static. These allow the compiler to track the underlying pointers so that the programmer doesn't need to worry if the pointer will remain valid long enough. This works for storing references in structs and passing them to and from functions.
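To make that concrete, here is a minimal sketch (the names Wrapper and longest are invented for illustration). The 'a parameters only let the compiler check that no reference outlives the data it points into; they produce no code of their own:
// A struct that stores a reference must declare a lifetime parameter so the
// compiler can verify the struct never outlives the data it borrows.
struct Wrapper<'a> {
    text: &'a str,
}

// The returned reference is tied to both inputs: it may not outlive either one.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let owned = String::from("hello world");
    let wrapped = Wrapper { text: &owned };
    println!("{}", longest(wrapped.text, "hi"));
}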

Given a Rust object, is it possible to wrap it so that multiple references and a mutable reference are allowed but do not cause problems?
For example, a Vec that has multiple references and a single mutable reference.
Yes, but...
The type you're looking for is RefCell, but read on before jumping the gun!
Rust is a single-ownership language. It always will be. It's exactly that feature that makes Rust as thread-safe and memory-safe as it is. You cannot fully circumvent this short of wrapping your entire program in unsafe and using raw pointers exclusively; and if you're going to do that, you might as well write C, since you're no longer getting any benefit out of using Rust.
So, at any given moment in your program, there must either be one thing writing to this memory or several things reading. That's the fundamental law of single-ownership. Keep that in mind; you cannot get around that. What I'm about to say still follows that rule.
Usually, we enforce this with our type signatures. If I take a &T, then I'm just an alias and won't write to it. If I take a &mut T, then nobody else can see what I'm doing till I forfeit that reference. That's usually good enough, and if we can, we want to do it that way, since we get guarantees at compile-time.
But it doesn't always work that way. Sometimes we can't prove that what we're doing is okay. Sometimes I've got two functions each holding an ostensibly mutable reference, but I know, due to some other guarantees Rust doesn't know about, that only one of them will be writing to it at a time. Enter RefCell. RefCell<T> contains a single T and pretends to be immutable, but it lets you borrow the thing inside either mutably or immutably with try_borrow_mut and try_borrow. When we call one of these functions, we get a reference-like value that can read (and, in the mutable case, write) the original data, even though we started with a &RefCell<T> that doesn't look mutable.
But the fundamental law still holds. Note that those try_* functions return a Result, i.e. they might fail. If two functions simultaneously try to get try_borrow_mut references, the second one will fail, and it's your job to deal with that eventuality (even if "deal with that" means panic! in your particular use case). All we've done is move the single-ownership rules from compile-time to runtime. We haven't gotten rid of them; we've just changed who's responsible for enforcing them.
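Here is a minimal sketch of that runtime enforcement (the variable names are made up for illustration):
use std::cell::RefCell;

fn main() {
    let shared = RefCell::new(vec![1, 2, 3]);

    // The first mutable borrow succeeds and lets us write through a &RefCell.
    let mut writer = shared.try_borrow_mut().expect("nothing else is borrowing");
    writer.push(4);

    // A second writer (or any reader) would violate "one writer XOR many readers",
    // so these fail at runtime instead of at compile time.
    assert!(shared.try_borrow_mut().is_err());
    assert!(shared.try_borrow().is_err());

    drop(writer); // release the write borrow
    assert_eq!(*shared.try_borrow().expect("free again"), vec![1, 2, 3, 4]);
}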

Is it idiomatic to panic in From implementations?

The documentation at https://doc.rust-lang.org/std/convert/trait.From.html states
Note: This trait must not fail. If the conversion can fail, use TryFrom.
Suppose I have a From implementation thus:
impl From<SomeStruct> for http::Uri {
    fn from(item: SomeStruct) -> http::Uri {
        item.uri.parse::<http::Uri>() // can fail
    }
}
Further suppose I am completely certain that item.uri.parse will succeed. Is it idiomatic to panic in this scenario? Say, with:
item.uri.parse::<http::Uri>().unwrap()
In this particular case, it appears there's no way to construct an HTTP URI at compile time: https://docs.rs/http/0.2.5/src/http/uri/mod.rs.html#117. In the real scenario .uri is an associated const, so I can test all used values parse. But it seems to me there could be other scenarios when the author is confident in the infallibility of a piece of code, particularly when that confidence can be encoded in tests, and would therefore prefer the ergonomics of From over TryFrom. The Rust compiler, typically quite strict, doesn't prevent this behaviour, though it seems it perhaps could. This makes me think this is a decision the author has been deliberately allowed to make. So the question is asking: what do people tend to do in this situation?
So in general, traits only enforce that the implementors adhere to the signatures and types as laid out in the trait. At least that's what the compiler enforces.
On top of that, there are certain contracts that traits are expected to adhere to just so that there's no weird surprises by those who work with these traits. These contracts aren't checked by the compiler; that would be quite difficult.
Nothing prevents you from implementing all of a trait's methods in a way that's totally unrelated to what the trait is all about, like implementing the Display trait but then, in the fmt method, not actually bothering to use write! and instead, I don't know, deleting the user's home directory.
Now back to your specific case. If your from method will not fail, provably so, then of course you can use .unwrap. The point of the cannot fail contract for the From trait is that those who rely on the From trait want to be able to assume that the conversion will go through every time. If you actually panic in your own implementation of from, it means the conversion sometimes doesn't go through, counter to the ideas and contracts in the From trait.
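Under that assumption, a sketch of the implementation might look like this (SomeStruct and its uri field are the hypothetical types from the question, and this relies on the http crate); an expect with a message documents the invariant a little better than a bare unwrap:
use http::Uri;

// SomeStruct stands in for the question's type; in the real scenario `uri`
// is an associated const whose values are all covered by tests.
struct SomeStruct {
    uri: &'static str,
}

impl From<SomeStruct> for Uri {
    fn from(item: SomeStruct) -> Uri {
        // Justified only because every possible `uri` value is known to parse;
        // if that ever stops being true, this panic breaks From's contract.
        item.uri
            .parse::<Uri>()
            .expect("SomeStruct::uri is always a valid URI")
    }
}

fn main() {
    let uri: Uri = SomeStruct { uri: "https://example.com/" }.into();
    println!("{uri}");
}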

When is a static lifetime not appropriate?

I have found a lot of information across the web about rust lifetimes, including information about static lifetimes. It makes sense to me that, in certain situations, you must guarantee that a reference will outlive everything.
For instance, I have a reference that I'm passing to a thread, and the compiler is requesting that the reference be marked as 'static. In this scenario, that seems to make sense because the compiler can't know how long the thread will live and thus needs to ensure the passed reference outlives the thread. (I think that's correct?)
I don’t know where this comes from, but I am always concerned that marking something with a static lifetime is something to be skeptical of, and avoided when possible.
So I wonder if that’s correct. Should I be critical of marking things with a static lifetime? Are there situations when the compiler will want to require one, but an alternate strategy might actually be more optimal?
What are some concrete ways that I can reason about the application of a static lifetime, and possibly determine when it might not be appropriate?
As you might have already guessed, there is no definitive, technical answer to this.
To a newcomer to Rust, 'static references seem to defeat the entire purpose of the borrowing system, and there is an urge to avoid them. Once you get more experienced, that urge will go away.
First of all, 'static is not as bad as it seems, since all things that have no other lifetimes associated with them are 'static, e.g. String::new(). Notice that 'static does not mean that the value in question truly lives forever. It just means that the value can be made to live forever. In your threading example, the thread can't make any promises about its own lifetime, so it needs to be able to make all things passed to it live forever. Any owned value which does not contain lifetimes shorter than 'static (like vec![1,2,3]) can be made to live forever (simply by not destroying it) and is therefore 'static.
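A small sketch of that distinction, with takes_static as a made-up stand-in for the kind of bound a spawned thread imposes:
// Any fully owned value satisfies a 'static bound; a reference to a local does not.
fn takes_static<T: 'static>(_value: T) {}

fn main() {
    takes_static(String::new()); // owned, contains no borrows: 'static
    takes_static(vec![1, 2, 3]); // same here
    let local = String::from("hi");
    takes_static(local.clone()); // an owned clone is fine
    // takes_static(&local);     // would not compile: the reference is tied to `local`
}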
Second, &'static - the static reference - does not come up often anyway. If it does, you'll usually be aware of why. You won't see a lot of fn foo(bar: &'static Bar) because there simply aren't that many use-cases for it, not because it is actively avoided.
There are situations where 'static does come up in surprising ways. Off the top of my head:
A Box<dyn Trait> is implicitly a Box<dyn Trait + 'static>. This is because when the type of the value inside the Box gets erased, it might have had lifetimes associated with it, and all (different) types must be valid for as long as the Box lives. Therefore all types need to share a common denominator with respect to their lifetimes, and Rust is defined to choose 'static. This choice is usually fine, but it can lead to surprising "requires 'static" errors. You can generalize this explicitly to Box<dyn Trait + 'a> (see the sketch after these examples).
If you have a custom impl Drop on your type, the Drop-checker may not be able to prove that the destructor is unable to observe values that have already been dropped. To prevent the Drop impl from accessing references to values that have already been dropped, the compiler requires the entire type to only have 'static references inside of it. This can be overcome by an unsafe impl, which lifts the 'static-requirement.
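To illustrate the Box<dyn Trait + 'a> point, here is a minimal sketch; the Greet trait and Borrowed struct are invented for this example. With a plain Box<dyn Greet> return type (implicitly + 'static) this would not compile, because the boxed value borrows from its argument:
trait Greet {
    fn greet(&self) -> String;
}

struct Borrowed<'a> {
    name: &'a str,
}

impl<'a> Greet for Borrowed<'a> {
    fn greet(&self) -> String {
        format!("hello, {}", self.name)
    }
}

// Spelling out + 'a relaxes the implicit 'static requirement on the trait object.
fn boxed_greeter<'a>(name: &'a str) -> Box<dyn Greet + 'a> {
    Box::new(Borrowed { name })
}

fn main() {
    let name = String::from("world");
    println!("{}", boxed_greeter(&name).greet());
}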
Instead of &'static T, pass Arc<T> to the thread. This has only a tiny cost and ensures lifetimes will not be longer than necessary.
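A minimal sketch of that advice, assuming the thread only needs read access to some shared data:
use std::sync::Arc;
use std::thread;

fn main() {
    // The Vec is a stand-in for whatever shared data the thread needs to read.
    let data = Arc::new(vec![1, 2, 3]);

    let for_thread = Arc::clone(&data); // bump the reference count for the thread
    let handle = thread::spawn(move || for_thread.iter().sum::<i32>());

    println!("sum = {}", handle.join().unwrap());
    // The allocation is freed once the last Arc (here `data`) is dropped,
    // instead of having to live for the whole program like a &'static T.
}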

Rust Ownership Smart Pointers

I've recently started learning Rust and just learned about the Smart Pointers (Box, Rc and RefCell).
In the guide they talked about Rc implementing "shared ownership". But if I understood it correctly, the whole point of the ownership system is that there can only be one owner.
And to me (still a Rust newbie) it seems as if Rc and RefCell take ownership of the value they contain and just "expose" different types of references to the contained value?
Am I wrong and if yes: why is Rust allowed to "cheat" the ownership system like that and would I be theoretically able to implement my own "cheating" types?
if I understood it correctly, the whole point of the ownership system is that there can only be one owner.
No. Rust guarantees that there can be no more than a single mutable borrow and there cannot be mutable and non-mutable borrows at the same time. It doesn't say anything about owners.
why is Rust allowed to "cheat" the ownership system
It doesn't.
would I be theoretically able to implement my own "cheating" types
Yes. Those types are all implemented in Rust¹. Those types are battle-tested and perfectly safe under Rust's safety rules, but they require the use of unsafe at a lower level.
Note that unsafe doesn't suspend the rule that you can have one mutable borrow XOR any number of non-mutable borrows; it merely hands you tools (raw pointers) with which you could break it anyway. Doing so would, of course, actually be unsafe and trigger undefined behavior.
1: That said, some of those types are implemented using features that are still private to the compiler, so you wouldn't be able to do everything as efficiently as the standard library, and Box and UnsafeCell are special to the language and cannot be reproduced by a normal library. There are, for example, many crates providing Rc or Arc alternatives which are better than the standard ones in some cases.
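For illustration, here is a small usage sketch of Rc showing shared ownership without any cheating: mutable access is only handed out while there is exactly one owner.
use std::rc::Rc;

fn main() {
    let mut first = Rc::new(vec![1, 2, 3]);
    let second = Rc::clone(&first); // a second owner; the Vec itself is not copied

    println!("owners: {}", Rc::strong_count(&first)); // 2

    // Rc only hands out shared access, so the borrow rules still hold:
    // get_mut returns a &mut Vec only while there is exactly one owner.
    assert!(Rc::get_mut(&mut first).is_none());

    drop(second); // back to a single owner
    Rc::get_mut(&mut first).unwrap().push(4);
    println!("{:?}", first); // [1, 2, 3, 4]
}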

Can I avoid using explicit lifetime specifiers and instead use reference counting (Rc)?

I am reading the Rust Book and everything was pretty simple to understand (thanks to the book's authors), until the section about lifetimes. I spent all day, reading a lot of articles on lifetimes and still I am very insecure about using them correctly.
What I do understand, though, is that the concept of explicit lifetime specifiers aims to solve the problem of dangling references. I also know that Rust has reference-counting smart pointers (Rc) which I believe is the same as shared_ptr in C++, which has the same purpose: to prevent dangling references.
Given that those lifetimes are so horrendous to me, and smart pointers are very familiar and comfortable for me (I used them in C++ a lot), can I avoid the lifetimes in favor of smart pointers? Or are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
In order to read existing Rust code, you probably don't need to understand lifetimes. The borrow checker understands them, so if the code compiles, the lifetimes are correct and you can just review what the code does.
I am very insecure about using them correctly.
The most important thing to understand about lifetime annotations is that they do nothing at runtime. Rather, they are a way to express to the compiler the relationship between references. For example, if an input and the output of a function have the same lifetime, that means that the output contains a reference to the input (or part of it) and therefore is not allowed to live longer than the input. Using them "incorrectly" means that you are telling the compiler something about the lifetime of a reference which it can prove to be untrue - and it will give you an error, so there is nothing to be insecure about!
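As a small illustration (find_prefix is a made-up function), the 'a below only records that the returned slice borrows from haystack and therefore may not outlive it:
fn find_prefix<'a>(haystack: &'a str, needle: &str) -> Option<&'a str> {
    // The returned &str points into `haystack`; `needle` is unrelated to it.
    if haystack.starts_with(needle) {
        Some(&haystack[..needle.len()])
    } else {
        None
    }
}

fn main() {
    let owned = String::from("hello world");
    println!("{:?}", find_prefix(&owned, "hello")); // Some("hello")
}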
can I avoid the lifetimes in favor of smart pointers?
You could choose to avoid using references altogether and use Rc everywhere. You would be missing out on one of the big features of Rust: lifetimes and references form one of the most important zero-cost abstractions, which enable Rust to be fast and safe at the same time. There is code written in Rust that nobody would attempt to write in C/C++ because a human could never be absolutely certain that they haven't introduced a memory bug. Avoiding Rust references in favour of smart pointers will mostly result in slower code, because smart pointers have runtime overhead.
Many APIs use references. In order to use those APIs you will need to have at least some grasp of what is going on.
The best way to understand is just to write code and gain an intuition from what works and what doesn't. Rust's error messages are excellent and will help a lot with forming that intuition.
