Why does Rust require ownership annotations instead of inferring them? [duplicate]

This question already has answers here:
Why are explicit lifetimes needed in Rust?
(10 answers)
Closed 2 years ago.
How come Rust does not fully infer ownership of its variables? Why are annotations needed?

If that were even possible, I believe it would be a terrible user experience, because:
if the compiler cannot deduce the ownership of an object, the resulting error can barely be understood (much like the trial-and-error approach to debugging C++ templates);
the ownership policy doesn't seem to be easy to grasp (that's one opinion, though), and trying to understand which semantics the compiler has chosen could lead to unexpected behavior (compare JavaScript's weird implicit type conversions);
more bugs could be introduced during refactoring (implied by the point above);
whole-program inference would certainly take a huge amount of time, if it is even a solvable problem.
However, if you struggle with the lack of polymorphism, it is usually possible to parametrize a function over the kind of ownership, which might be considered a somewhat explicit alternative to inference, e.g.:
fn print_str(s: impl AsRef<str>) {
    println!("{}", s.as_ref());
}

fn main() {
    print_str("borrowed");
    print_str("owned".to_owned());
}

Related

Is it idiomatic to panic in From implementations?

The documentation at https://doc.rust-lang.org/std/convert/trait.From.html states
Note: This trait must not fail. If the conversion can fail, use TryFrom.
Suppose I have a From implementation thus:
impl From<SomeStruct> for http::Uri {
    fn from(item: SomeStruct) -> http::Uri {
        item.uri.parse::<http::Uri>() // can fail
    }
}
Further suppose I am completely certain that item.uri.parse will succeed. Is it idiomatic to panic in this scenario? Say, with:
item.uri.parse::<http::Uri>().unwrap()
In this particular case, it appears there's no way to construct an HTTP URI at compile time: https://docs.rs/http/0.2.5/src/http/uri/mod.rs.html#117. In the real scenario .uri is an associated const, so I can test that every value used parses successfully. But it seems to me there could be other scenarios where the author is confident in the infallibility of a piece of code, particularly when that confidence can be encoded in tests, and would therefore prefer the ergonomics of From over TryFrom. The Rust compiler, typically quite strict, doesn't prevent this behaviour, though it seems it perhaps could. This makes me think it's a decision the author has been deliberately allowed to make. So the question is: what do people tend to do in this situation?
So in general, traits only enforce that implementors adhere to the signatures and types laid out in the trait. At least, that's what the compiler enforces.
On top of that, there are certain contracts that traits are expected to adhere to, just so there are no weird surprises for those who work with these traits. These contracts aren't checked by the compiler; that would be quite difficult.
Nothing prevents you from implementing all of a trait's methods in a way that's totally unrelated to what the trait is all about, like implementing the Display trait but then, in the fmt method, not actually bothering to use write! and instead, I don't know, deleting the user's home directory.
Now back to your specific case. If your from method will not fail, provably so, then of course you can use .unwrap(). The point of the cannot-fail contract for the From trait is that those who rely on From want to be able to assume that the conversion will go through every time. If you actually panic in your own implementation of from, it means the conversion sometimes doesn't go through, counter to the ideas and contracts behind the From trait.
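If you're not completely certain, the conventional route is TryFrom, which surfaces the possible failure in the signature. Here's a minimal sketch, using std's IpAddr and a placeholder SomeStruct in place of the question's http::Uri (those exact types are specific to the question):

use std::convert::TryFrom;
use std::net::IpAddr;

// Placeholder standing in for the question's struct.
struct SomeStruct {
    uri: &'static str,
}

impl TryFrom<SomeStruct> for IpAddr {
    type Error = std::net::AddrParseError;

    fn try_from(item: SomeStruct) -> Result<Self, Self::Error> {
        item.uri.parse() // the possible failure is now in the signature
    }
}

fn main() {
    assert!(IpAddr::try_from(SomeStruct { uri: "127.0.0.1" }).is_ok());
    assert!(IpAddr::try_from(SomeStruct { uri: "not an address" }).is_err());
}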

Is `UnsafeCell` a compiler optimization barrier in Rust? [duplicate]

This question already has answers here:
Is this error due to the compiler's special knowledge about RefCell?
(1 answer)
How does the Rust compiler know `Cell` has internal mutability?
(3 answers)
Closed 2 years ago.
In the Unsafe Code Guidelines Reference, it says
All interior mutation in Rust has to happen inside an UnsafeCell, so all data structures that have interior mutability must (directly or indirectly) use UnsafeCell for this purpose.
Also, in a discussion about UnsafeCell, it says
UnsafeCell is basically an optimization barrier to the compiler.
Is it true that UnsafeCell acts as a compiler optimization barrier in Rust? If yes, which line in the standard library source code emits that barrier, and how does it work?
[UPDATE]
The answer to a related question gives a very nice explanation. The TL;DR version is: UnsafeCell<T> is marked with #[lang = "unsafe_cell"], which forces it to be invariant over T.
Now I think this is not very much connected to optimization, but interacts more closely with lifetime analysis.
For the notion of variance in Rust, The Rustonomicon Book gives a detailed explanation.
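To make the variance point concrete, here is a small sketch of my own (not from the linked answer). The first function compiles because shared references are covariant over their pointee; the commented-out one is rejected because UnsafeCell<T> is invariant over T:

use std::cell::UnsafeCell;

// Compiles: &T is covariant over T, so the longer-lived &'static str
// can be shortened to &'a str behind the outer reference.
fn covariant<'a>(r: &'a &'static str) -> &'a &'a str {
    r
}

// Rejected if uncommented: UnsafeCell<T> is invariant over T, so the
// same lifetime shortening is not allowed through the cell.
// fn invariant<'a>(r: &'a UnsafeCell<&'static str>) -> &'a UnsafeCell<&'a str> {
//     r
// }

fn main() {
    let s: &'static str = "hello";
    let _ = covariant(&s);
    let _cell = UnsafeCell::new(0u8); // keeps the import in use
}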

Can I avoid using explicit lifetime specifiers and instead use reference counting (Rc)?

I am reading the Rust Book and everything was pretty simple to understand (thanks to the book's authors), until the section about lifetimes. I spent all day reading a lot of articles on lifetimes, and still I am very insecure about using them correctly.
What I do understand, though, is that the concept of explicit lifetime specifiers aims to solve the problem of dangling references. I also know that Rust has reference-counting smart pointers (Rc), which I believe are the same as shared_ptr in C++ and serve the same purpose: preventing dangling references.
Given that those lifetimes are so horrendous to me, and smart pointers are very familiar and comfortable to me (I used them in C++ a lot), can I avoid lifetimes in favor of smart pointers? Or are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
are lifetimes an inevitable thing that I'll have to understand and use in Rust code?
In order to read existing Rust code, you probably don't need to understand lifetimes. The borrow checker understands them, so if the code compiles, the lifetimes are correct and you can just focus on what the code does.
I am very insecure about using them correctly.
The most important thing to understand about lifetime annotations is that they do nothing. Rather, they are a way to express to the compiler the relationship between references. For example, if an input and output to a function have the same lifetime, that means the output contains a reference to the input (or part of it) and therefore is not allowed to live longer than the input. Using them "incorrectly" means that you are telling the compiler something about the lifetime of a reference which it can prove to be untrue - and it will give you an error, so there is nothing to be insecure about!
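Here is a minimal sketch of that relationship: the returned reference borrows from the input, so both share 'a, and the compiler will reject any use of the output after the input is gone.

// The output borrows from the input, so they share the lifetime 'a:
// the returned &str must not outlive the string it points into.
fn first_word<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let text = String::from("hello world");
    let word = first_word(&text);
    println!("{}", word); // fine: `text` is still alive here
}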
can I avoid the lifetimes in favor of smart pointers?
You could choose to avoid using references altogether and use Rc everywhere. You would be missing out on one of the big features of Rust: lifetimes and references form one of the most important zero-cost abstractions, which enable Rust to be fast and safe at the same time. There is code written in Rust that nobody would attempt to write in C/C++ because a human could never be absolutely certain that they haven't introduced a memory bug. Avoiding Rust references in favour of smart pointers will mostly result in slower code, because smart pointers have runtime overhead.
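As a rough illustration of that overhead, cloning an Rc bumps a reference count at runtime, while a plain borrow is just a pointer:

use std::rc::Rc;

fn main() {
    let shared = Rc::new(vec![1, 2, 3]);
    let alias = Rc::clone(&shared); // increments the count at runtime
    assert_eq!(Rc::strong_count(&shared), 2);
    drop(alias); // decrements it again

    let borrowed: &[i32] = &shared; // a borrow costs nothing at runtime
    assert_eq!(borrowed.len(), 3);
}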
Many APIs use references. In order to use those APIs you will need to have at least some grasp of what is going on.
The best way to understand is just to write code and gain an intuition from what works and what doesn't. Rust's error messages are excellent and will help a lot with forming that intuition.

Why can't Rust do more complex type inference? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
In Chapter 3 of The Rust Programming Language, the following code is used as an example for a kind of type inference that Rust cannot manage:
fn main() {
    let condition = true;
    let number = if condition { 5 } else { "six" };
    println!("The value of number is: {}", number);
}
with the explanation that:
Rust needs to know at compile time what type the number variable is, definitively, so it can verify at compile time that its type is valid everywhere we use number. Rust wouldn’t be able to do that if the type of number was only determined at runtime; the compiler would be more complex and would make fewer guarantees about the code if it had to keep track of multiple hypothetical types for any variable.
I'm not certain I understand the rationale, because the example does seem like something where a simple compiler could infer the type.
What exactly makes this kind of type inference so difficult? In this case, the value of condition can clearly be inferred at compile time (it's true), and thus so can the type of number (it's i32?).
I can see how things could become a lot more complicated, if you were trying to infer types across multiple compilation units for instance, but is there something about this specific example that would add a lot of complexity to the compiler?
There are three main reasons I can think of:
1. Action at a distance effects
Let's suppose the language worked that way. Since we're extending type inference, we might as well make the language even smarter and have it infer return types as well. This allows me to write something like:
pub fn get_flux_capacitor() {
    let is_prod = true;
    if is_prod { FluxCapacitor::new() } else { MovieProp::new() }
}
And elsewhere in my project, I can get a FluxCapacitor by calling that function. However, one day, I change is_prod to false. Now, instead of getting an error that my function is returning the wrong type, I will get errors at every call site. A small change inside one function has led to errors in entirely unchanged files! That's pretty weird.
(If we don't want to add inferred return types, just imagine it's a very long function instead.)
2. Compiler internals exposed
What happens in the case where it's not so simple? Surely this should be the same as the above example:
pub fn get_flux_capacitor() {
    let is_prod = (1 + 1) == 2;
    ...
}
But how far does that extend? The compiler's constant propagation is mostly an implementation detail. You don't want the types in your program to depend on how smart this version of the compiler is.
3. What did you actually mean?
As a human looking at this code, it looks like something is missing. Why are you branching on true at all? Why not just write FluxCapacitor::new()? Perhaps there's missing logic to check whether an env=DEV environment variable is set. Perhaps a trait object should actually be used so that you can take advantage of runtime polymorphism.
In this kind of situation where you're asking the computer to do something that doesn't seem quite right, Rust often chooses to throw its hands up and ask you to fix the code.
You're right that in this very specific case (where condition = true statically), the compiler could be made to detect that the else branch is unreachable and that number must therefore be 5.
This is just a contrived example, though... in the more general case, the value of condition would only be known dynamically at runtime.
It's in that case, as others have said, that inference becomes hard to implement.
On that topic, there are two things I haven't seen mentioned yet.
1. The Rust language design tends to err on the side of doing things as explicitly as possible
2. Rust type inference is only local
On point #1, the explicit way for Rust to deal with the "this type can be one of multiple types" use case is enums.
You can define something like this:
#[derive(Debug)]
enum Whatsit {
    Num(i32),
    Text(&'static str),
}
and then do let number = if condition { Whatsit::Num(5) } else { Whatsit::Text("six") };
On point #2, let's see how the enum (while wordier) is the preferred approach in the language. In the example from the book we just try printing the value of number.
In a more realistic scenario, we would at some point use number for something other than printing.
That means passing it to another function or including it in another type. Or (to even enable the use of println!) implementing the Debug or Display traits for it. Local inference means that, if you couldn't name the type of number in Rust, you would not be able to do any of these things.
Suppose you want to create a function that does something with a number;
with the enum you would write:
fn do_something(number: Whatsit)
but without it...
fn do_something(number: /* what type is this? */)
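To sketch how the enum version might look with a (hypothetical) body filled in, using the Whatsit enum defined above:

// Pattern matching recovers the concrete value inside each variant.
fn do_something(number: Whatsit) {
    match number {
        Whatsit::Num(n) => println!("got a number: {}", n),
        Whatsit::Text(t) => println!("got text: {}", t),
    }
}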
In a nutshell, you're right that in principle it IS doable for the compiler to synthesize a type for number. For instance, the compiler might create an anonymous enum like Whatsit above when compiling that code.
But you - the programmer - would not know the name of that type, would not be able to refer to it, and wouldn't even know what you could do with it (can I multiply two "numbers"?), and this would greatly limit its usefulness.
A similar approach was followed, for instance, to add closures to the language: the compiler knows the specific type of each closure, but you, the programmer, do not. If you're interested, I can try to find the discussions on the difficulties this approach introduced into the design of the language.

Do Rust lifetimes influence the semantics of the compiled program?

I'm trying to grok lifetimes in Rust and asked myself whether they are "just" a safety measure (and a way to communicate how safety is ensured, or not, in the case of errors) or if there are cases where different choices of lifetimes actually change how the program runs, i.e. whether lifetimes make a semantic difference to the compiled program.
And with "lifetimes" I refer to all the pesky little 'a, 'b, 'static markers we include to make the borrow checker happy. Of course, writing
{
    let foo = File::open("foo.txt")?;
}
foo.write_all(b"bar");
instead of
let foo = File::open("foo.txt")?;
foo.write_all(b"bar");
will close the file descriptor before the write occurs, even if we could access foo afterwards, but that kind of scoping and destructor-calling also happens in C++.
No, lifetimes do not affect the generated machine code in any way. At the end of the day, it's all "just pointers" to the compiled code.
Because we are humans speaking a human language, we tend to lump two different but related concepts together: concrete lifetimes and generic lifetime parameters.
All programming languages have concrete lifetimes. They simply correspond to when a resource will be released. That's what your example shows, and indeed C++ works the same way as Rust there. This pattern is often known as Resource Acquisition Is Initialization (RAII). Garbage-collected languages have lifetimes too, although it can be harder to nail down exactly when they end.
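A small sketch of a concrete lifetime, using a Drop implementation to make the release point visible:

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    let _outer = Noisy("outer");
    {
        let _inner = Noisy("inner");
    } // "dropping inner" prints here: its concrete lifetime ends
    println!("inner is gone, outer is still alive");
} // "dropping outer" prints here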
What makes Rust neat in this area are the generic lifetime parameters, the things we know as 'a or 'static. These allow the compiler to track the underlying pointers so that the programmer doesn't need to worry if the pointer will remain valid long enough. This works for storing references in structs and passing them to and from functions.
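And a sketch of a generic lifetime parameter: 'a records, purely at compile time, that the struct borrows its string; per the point above, the generated code is the same as an unannotated pointer version would be.

// The compiler rejects any use of a Wrapper that outlives the String
// it borrows from; 'a is how that relationship is expressed.
struct Wrapper<'a> {
    inner: &'a str,
}

fn main() {
    let owned = String::from("hello");
    let w = Wrapper { inner: &owned };
    println!("{}", w.inner); // fine: `owned` outlives `w`
}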
