T.into() vs Some(x): are there any downsides to using .into()?

I just observed for the first time that I could create Options with .into() instead of wrapping in Some(). Are there any downsides to this approach?

Codegen-wise, there shouldn't be any downsides, since .into() just wraps the value in Some() as well. The only problem would be if LLVM did not inline the call.
On the readability side, though, .into() is far less clear than Some(). .into() is highly generic, which means you could end up having to add type annotations, and that is more effort than just wrapping the value yourself. Even in cases where you don't need annotations, it can be difficult for the reader to tell what the type of the expression is.
IMO, .into() should be used where the exact type is not important and is only an implementation detail. The meaning of the type should not change. Going from Foo::Color to Foo::BetterForInternalUseColor is an implementation detail and does not change meaning. Going from T to Option<T> does.
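A minimal sketch of the readability point (the helper names wrap_explicit and wrap_into are made up for illustration): both forms produce the same value, but the .into() form only compiles once the target type is known from context.

```rust
// Hypothetical helpers for illustration; neither name comes from the question.
fn wrap_explicit(x: i32) -> Option<i32> {
    // The constructor states the result type right at the call site.
    Some(x)
}

fn wrap_into(x: i32) -> Option<i32> {
    // Relies on the standard `impl<T> From<T> for Option<T>`;
    // the target type is taken from the function's return type.
    x.into()
}

fn main() {
    let x: i32 = 5;

    // Without an annotation the compiler cannot pick a target type:
    // let unclear = x.into(); // error: type annotations needed

    let annotated: Option<i32> = x.into();
    assert_eq!(annotated, Some(x));
    assert_eq!(wrap_explicit(x), wrap_into(x));
}
```

A reader of wrap_into has to look at the signature to learn that an Option is produced, whereas Some(x) says so directly.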

Related

An easy way to find unwrap() usages of Result only

In Rust, two of the most commonly used enums, Option and Result, have a method with the same name, unwrap(). I'm not sure why the Rust authors chose to give both enums the same method name. It's clear that the two enums are somewhat similar, but that decision can make it harder to find all the usages of, say, Result's method only. And I think in a Rust project it would be very useful if we could easily find all the places where we have unwrap() or something else that might panic, for example if we start off with a proof-of-concept implementation that is OK to panic but later decide to handle errors properly.
Option's unwrap() could also panic, of course, but usually we would have made sure that wouldn't be possible, so there is a clear difference, compared to Result, where we generally expect there might be an error. (Also, I know Option's unwrap() can generally be avoided by using alternatives, but sometimes it does make code simpler.)
Update
It seems from the comments I should probably clarify why I said sometimes Option's unwrapping should be considered safe. I guess an example would be best:
if o.is_none() {
    // ...
    return ...;
}
// ...
o.unwrap() // <--- Here I do NOT expect a None
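For comparison, here is a sketch of the same early-return shape written with match, so that the check and the extraction are a single construct and no unwrap() call remains. The function describe is hypothetical and only stands in for the elided code above.

```rust
// Hypothetical example: `describe` is not from the question.
fn describe(o: Option<u32>) -> String {
    // The `None` arm returns early, so `value` is bound only when a value
    // is present and there is no separate `unwrap()` to keep in sync.
    let value = match o {
        Some(v) => v,
        None => return String::from("nothing"),
    };
    format!("got {}", value)
}

fn main() {
    println!("{}", describe(Some(3)));
    println!("{}", describe(None));
}
```

Written this way, a text search for unwrap() would only turn up the calls that can really panic.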

Is it idiomatic to panic in From implementations?

The documentation at https://doc.rust-lang.org/std/convert/trait.From.html states
Note: This trait must not fail. If the conversion can fail, use TryFrom.
Suppose I have a From implementation thus:
impl From<SomeStruct> for http::Uri {
    fn from(item: SomeStruct) -> http::Uri {
        item.uri.parse::<http::Uri>() // can fail
    }
}
Further suppose I am completely certain that item.uri.parse will succeed. Is it idiomatic to panic in this scenario? Say, with:
item.uri.parse::<http::Uri>().unwrap()
In this particular case, it appears there's no way to construct an HTTP URI at compile time: https://docs.rs/http/0.2.5/src/http/uri/mod.rs.html#117. In the real scenario .uri is an associated const, so I can test all used values parse. But it seems to me there could be other scenarios when the author is confident in the infallibility of a piece of code, particularly when that confidence can be encoded in tests, and would therefore prefer the ergonomics of From over TryFrom. The Rust compiler, typically quite strict, doesn't prevent this behaviour, though it seems it perhaps could. This makes me think this is a decision the author has been deliberately allowed to make. So the question is asking: what do people tend to do in this situation?
So in general, traits only enforce that the implementors adhere to the signatures and types as laid out in the trait. At least that's what the compiler enforces.
On top of that, there are certain contracts that traits are expected to adhere to just so that there's no weird surprises by those who work with these traits. These contracts aren't checked by the compiler; that would be quite difficult.
Nothing prevents you from implementing all of a trait's methods in a way that's totally unrelated to what the trait is about, like implementing the Display trait but then, in the fmt method, not bothering to use write! and instead, I don't know, deleting the user's home directory.
Now back to your specific case. If your from method will not fail, provably so, then of course you can use .unwrap. The point of the cannot fail contract for the From trait is that those who rely on the From trait want to be able to assume that the conversion will go through every time. If you actually panic in your own implementation of from, it means the conversion sometimes doesn't go through, counter to the ideas and contracts in the From trait.
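To make the contrast concrete without depending on the http crate, here is a sketch using only the standard library. RawPort and Port are hypothetical stand-ins for SomeStruct and http::Uri; the fallible parse is surfaced through TryFrom instead of being hidden behind an unwrap() inside From.

```rust
use std::convert::TryFrom;
use std::num::ParseIntError;

// Hypothetical stand-ins for `SomeStruct` and `http::Uri` in the question.
struct RawPort {
    text: &'static str,
}

#[derive(Debug, PartialEq)]
struct Port(u16);

impl TryFrom<RawPort> for Port {
    type Error = ParseIntError;

    // The parse can fail, so the failure is part of the signature.
    fn try_from(item: RawPort) -> Result<Self, Self::Error> {
        item.text.parse::<u16>().map(Port)
    }
}

fn main() {
    let ok = Port::try_from(RawPort { text: "8080" });
    assert_eq!(ok, Ok(Port(8080)));

    let bad = Port::try_from(RawPort { text: "not a port" });
    assert!(bad.is_err());
}
```

A caller who is certain the input is valid can still write Port::try_from(raw).unwrap(), which keeps the possible panic visible at the call site.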

Why isn't there implicit type conversion (coercion) between primitive types in Rust

I am reading through Rust by Example, and I am curious about why we cannot coerce a decimal to a u8, like in the following snippet:
let decimal = 65.4321_f32;
// Error! No implicit conversion
let integer: u8 = decimal;
But explicit casting is allowed, so I don't understand why can't we have it implicit too.
Is this a language design decision? What advantages does this bring?
Safety is a big part of the design of Rust and its standard library. A lot of the focus is on memory safety but Rust also tries to help prevent common bugs by forcing you to make decisions where data could be lost or where your program could panic.
A good example of this is that it uses the Option type instead of null. If you are given an Option<T> you are now forced to decide what to do with it. You could decide to unwrap it and panic, or you could use unwrap_or to provide a sensible default. Your decision, but you have to make it.
To convert an f64 (or, as in the question, an f32) to a u8 you can use the as operator. It doesn't happen automatically because Rust can't decide for you what should happen when the number is too big or too small. Or maybe you want to do something with the fractional part? Do you want to round it up, down, or to the nearest integer?
Even the as operator is considered by some[1] to be an early design mistake, since you can easily lose data unintentionally - especially when your code evolves over time and the types are less visible because of type inference.
[1] https://github.com/rust-lang/rfcs/issues/2784#issuecomment-543180066
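A short sketch of what the as operator does with values that do not fit, next to a checked integer conversion. The helper names cast_with_as and checked_to_u8 are made up for illustration; the saturating float-to-integer behaviour shown is what current Rust releases (1.45 and later) do.

```rust
use std::convert::TryFrom;

// Plain `as` cast: drops the fractional part and saturates at the bounds of u8.
fn cast_with_as(x: f32) -> u8 {
    x as u8
}

// Hypothetical checked alternative for integers: the caller has to handle
// values that do not fit instead of silently losing data.
fn checked_to_u8(x: i32) -> Option<u8> {
    u8::try_from(x).ok()
}

fn main() {
    assert_eq!(cast_with_as(65.4321), 65); // fractional part is discarded
    assert_eq!(cast_with_as(300.0), 255); // too large: clamps to u8::MAX
    assert_eq!(cast_with_as(-1.0), 0); // too small: clamps to 0
    assert_eq!(cast_with_as(f32::NAN), 0); // NaN becomes 0

    assert_eq!(checked_to_u8(65), Some(65));
    assert_eq!(checked_to_u8(300), None);
}
```

None of the as casts report that information was lost, which is the kind of silent decision an implicit conversion would make on your behalf.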

Is it safe and defined behavior to transmute between a T and an UnsafeCell<T>?

A recent question was looking for the ability to construct self-referential structures. In discussing possible answers for the question, one potential answer involved using an UnsafeCell for interior mutability and then "discarding" the mutability through a transmute.
Here's a small example of such an idea in action. I'm not deeply interested in the example itself, but it's just enough complication to require a bigger hammer like transmute as opposed to just using UnsafeCell::new and/or UnsafeCell::into_inner:
use std::{
    cell::UnsafeCell, mem, rc::{Rc, Weak},
};

// This is our real type.
struct ReallyImmutable {
    value: i32,
    myself: Weak<ReallyImmutable>,
}

fn initialize() -> Rc<ReallyImmutable> {
    // This mirrors ReallyImmutable but we use `UnsafeCell`
    // to perform some initial interior mutation.
    struct NotReallyImmutable {
        value: i32,
        myself: Weak<UnsafeCell<NotReallyImmutable>>,
    }

    let initial = NotReallyImmutable {
        value: 42,
        myself: Weak::new(),
    };

    // Without interior mutability, we couldn't update the `myself` field
    // after we've created the `Rc`.
    let second = Rc::new(UnsafeCell::new(initial));

    // Tie the recursive knot
    let new_myself = Rc::downgrade(&second);

    unsafe {
        // Should be safe as there can be no other accesses to this field
        (&mut *second.get()).myself = new_myself;

        // No one outside of this function needs the interior mutability
        // TODO: Is this call safe?
        mem::transmute(second)
    }
}

fn main() {
    let v = initialize();
    println!("{} -> {:?}", v.value, v.myself.upgrade().map(|v| v.value))
}
This code appears to print out what I'd expect, but that doesn't mean that it's safe or using defined semantics.
Is transmuting from an UnsafeCell<T> to a T memory safe? Does it invoke undefined behavior? What about transmuting in the opposite direction, from a T to an UnsafeCell<T>?
(I am still new to SO and not sure if "well, maybe" qualifies as an answer, but here you go. ;)
Disclaimer: The rules for these kinds of things are not (yet) set in stone. So, there is no definitive answer yet. I'm going to make some guesses based on (a) what kinds of compiler transformations LLVM does/we will eventually want to do, and (b) what kind of models I have in my head that would define the answer to this.
Also, I see two parts to this: The data layout perspective, and the aliasing perspective. The layout issue is that NotReallyImmutable could, in principle, have a totally different layout than ReallyImmutable. I don't know much about data layout, but with UnsafeCell becoming repr(transparent) and that being the only difference between the two types, I think the intent is for this to work. You are, however, relying on repr(transparent) being "structural" in the sense that it should allow you to replace things in larger types, which I am not sure has been written down explicitly anywhere. Sounds like a proposal for a follow-up RFC that extends the repr(transparent) guarantees appropriately?
As far as aliasing is concerned, the issue is breaking the rules around &T. I'd say that, as long as you never have a live &T around anywhere when writing through the &UnsafeCell<T>, you are good -- but I don't think we can guarantee that quite yet. Let's look in more detail.
Compiler perspective
The relevant optimizations here are the ones that exploit &T being read-only. So if you reordered the last two lines (transmute and the assignment), that code would likely be UB as we may want the compiler to be able to "pre-fetch" the value behind the shared reference and re-use that value later (i.e. after inlining this).
But in your code, we would only emit "read-only" annotations (noalias in LLVM) after the transmute comes back, and the data is indeed read-only starting there. So, this should be good.
Memory models
The "most aggressive" of my memory models essentially asserts that all values are always valid, and I think even that model should be fine with your code. &UnsafeCell is a special case in that model where validity just stops, and nothing is said about what lives behind this reference. The moment the transmute returns, we grab the memory it points to and make it all read-only, and even if we did that "recursively" through the Rc (which my model doesn't, but only because I couldn't figure out a good way to make it do so) you'd be fine as you don't mutate any more after the transmute. (As you may have noticed, this is the same restriction as in the compiler perspective. The point of these models is to allow compiler optimizations, after all. ;)
(As a side-note, I really wish miri was in better shape right now. Seems I have to try and get validation to work again in there, because then I could tell you to just run your code in miri and it'd tell you if that version of my model is okay with what you are doing :D )
I am thinking about other models currently that only check things "on access", but haven't worked out the UnsafeCell story for that model yet. What this example shows is that the model may have to contain ways for a "phase transition" of memory first being UnsafeCell, but later having normal sharing with read-only guarantees. Thanks for bringing this up, that will make for some nice examples to think about!
So, I think I can say that (at least from my side) there is the intent to allow this kind of code, and doing so does not seem to prevent any optimizations. Whether we'll actually manage to find a model that everybody can agree with and that still allows this, I cannot predict.
The opposite direction: T -> UnsafeCell<T>
Now, this is more interesting. The problem is that, as I said above, you must not have a &T live when writing through an UnsafeCell<T>. But what does "live" mean here? That's a hard question! In some of my models, this could be as weak as "a reference of that type exists somewhere and the lifetime is still active", i.e., it could have nothing to do with whether the reference is actually used. (That's useful because it lets us do more optimizations, like moving a load out of a loop even if we cannot prove that the loop ever runs -- which would introduce a use of an otherwise unused reference.) And since &T is Copy, you cannot even really get rid of such a reference either. So, if you have x: &T, then after let y: &UnsafeCell<T> = transmute(x), the old x is still around and its lifetime still active, so writing through y could well be UB.
I think you'd have to somehow restrict the aliasing that &T allows, very carefully making sure that nobody still holds such a reference. I'm not going to say "this is impossible" because people keep surprising me (especially in this community ;) but TBH I cannot think of a way to make this work. I'd be curious if you have an example though where you think this is reasonable.
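As a side note on the motivating example, and not an answer to the transmute question itself: the same self-referential Rc can be built with no unsafe code using Rc::new_cyclic from the standard library (stable since Rust 1.60). The sketch below swaps that technique in for the UnsafeCell and transmute approach, reusing the type and function names from the question.

```rust
use std::rc::{Rc, Weak};

struct ReallyImmutable {
    value: i32,
    myself: Weak<ReallyImmutable>,
}

fn initialize() -> Rc<ReallyImmutable> {
    // `new_cyclic` hands the closure a `Weak` pointing at the allocation
    // that is being created, so the knot is tied during construction.
    Rc::new_cyclic(|weak| ReallyImmutable {
        value: 42,
        myself: weak.clone(),
    })
}

fn main() {
    let v = initialize();
    println!("{} -> {:?}", v.value, v.myself.upgrade().map(|v| v.value));
}
```

This sidesteps both the layout and the aliasing concerns discussed above, because no UnsafeCell or transmute is involved.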

Why does Rust need the `if let` syntax?

Coming from other functional languages (and being a Rust newbie), I'm a bit surprised by the motivation of Rust's if let syntax. The RFC mentions that without if let, the "idiomatic solution today for testing and unwrapping an Option<T>" is either
match opt_val {
    Some(x) => {
        do_something_with(x);
    }
    None => {}
}
or
if opt_val.is_some() {
    let x = opt_val.unwrap();
    do_something_with(x);
}
In Scala, it would be possible to do exactly the same, but the idiomatic solution is rather to map over an Option (or to foreach if it is only for the side effect of doing_something_with(x)).
Why isn't it an idiomatic solution to do the same in Rust?
opt_val.map(|x| do_something_with(x));
map() is intended for transforming an optional value, while if let is mostly needed to perform side effects. Rust is not a pure language, so any of its code blocks can contain side effects, but the semantics of map still apply. Using map() to perform side effects, while certainly possible, will only confuse readers of your code. Note that it should not carry a performance penalty, at least in simple code: the LLVM optimizer is perfectly capable of inlining the closure directly into the calling function, so it turns out to be equivalent to a match statement.
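A small sketch of that division of labour, with hypothetical helper names doubled and record: map when a new value is wanted, if let when only the side effect matters.

```rust
// Hypothetical helpers for illustration.
fn doubled(opt_val: Option<i32>) -> Option<i32> {
    // `map` transforms the contained value and yields a new Option.
    opt_val.map(|x| x * 2)
}

fn record(opt_val: Option<i32>, log: &mut Vec<i32>) {
    // `if let` runs a side effect only when a value is present.
    if let Some(x) = opt_val {
        log.push(x);
    }
}

fn main() {
    assert_eq!(doubled(Some(21)), Some(42));
    assert_eq!(doubled(None), None);

    let mut log = Vec::new();
    record(Some(1), &mut log);
    record(None, &mut log);
    assert_eq!(log, vec![1]);
}
```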
Before if let, the only ways to perform side effects on an Option were a match or an if with an Option::is_some() check. The match approach is the safest one, but it is very verbose, especially when a lot of nested checks are needed:
match o1 {
    Some(v1) => match v1.f {
        Some(v2) => match some_function(v2) {
            Some(r) => ...
            None => {}
        }
        None => {}
    }
    None => {}
}
Note the prominent rightward drift and a lot of syntactical noise. And it only gets worse if branches are not simple matches but proper blocks with multiple statements.
The if option.is_some() approach, on the other hand, is slightly less verbose but still reads very badly. Also, its condition check and the unwrap() are not statically tied together, so it is possible to get it wrong without the compiler noticing.
if let solves the verbosity problem, based on the same pattern matching infrastructure as match (so it is harder to get wrong than if option.is_some()) and, as a side benefit, allows using arbitrary types in patterns, not only Option. For example, some types may not provide map()-like methods; if let will still work with them very nicely. So if let is a clear win, hence it is idiomatic.
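For illustration, here is how the nested match shown earlier might look with if let. Outer, some_function and process are hypothetical stand-ins for the elided types and functions; the nested pattern is one option among several.

```rust
// Hypothetical stand-ins for the types and function in the nested match.
struct Outer {
    f: Option<u32>,
}

fn some_function(v: u32) -> Option<u32> {
    v.checked_add(1)
}

fn process(o1: Option<Outer>, out: &mut Vec<u32>) {
    // A nested pattern handles the first two levels in one step,
    // and no empty `None => {}` arms are needed.
    if let Some(Outer { f: Some(v2) }) = o1 {
        if let Some(r) = some_function(v2) {
            out.push(r);
        }
    }
}

fn main() {
    let mut out = Vec::new();
    process(Some(Outer { f: Some(1) }), &mut out);
    process(Some(Outer { f: None }), &mut out);
    process(None, &mut out);
    assert_eq!(out, vec![2]);
}
```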
.map() is specific to the Option<T> type, but if let (and while let!) are features that work with all Rust types.
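A brief sketch of both constructs on types other than a plain Option check; the function names parse_or_zero and drain_sum are made up for illustration.

```rust
// Hypothetical helpers for illustration.
fn parse_or_zero(text: &str) -> i32 {
    // `if let` works with any pattern, here the `Ok` variant of a Result.
    if let Ok(n) = text.parse::<i32>() {
        n
    } else {
        0
    }
}

fn drain_sum(mut stack: Vec<i32>) -> i32 {
    let mut total = 0;
    // `while let` keeps looping for as long as the pattern matches.
    while let Some(top) = stack.pop() {
        total += top;
    }
    total
}

fn main() {
    assert_eq!(parse_or_zero("42"), 42);
    assert_eq!(parse_or_zero("forty-two"), 0);
    assert_eq!(drain_sum(vec![1, 2, 3]), 6);
}
```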
Because your solution creates a closure, which uses resources, whereas if let desugars exactly to your first example, which doesn't. I also find it more readable.
Rust is all about zero-cost abstractions that make programming nicer, and if let and while let are good examples of those (at least IMO -- I realize it's a matter of personal preference). They're not strictly necessary, but they sure feel good to use (also see: Clojure, where they were likely lifted from).
A quote from a related issue against The Rust Programming Language:
You'd better think about if let as a one-handed match with no exhaustive checking enforced. It often has nothing to do with if or let, that's why it is so confusing. :)
Maybe it would be better to rename the whole operator, call it match once or something like that.
(slightly modified to make sense without the context)
