Design choice of default trait implementation vs. derive macro in Rust

After reading quite a bit, I haven't found much guidance on how to decide whether to implement functionality as a trait with a default implementation or as a derive macro. The functionality of the two appears quite similar in many situations, so I'm wondering what considerations go into choosing one over the other.
The main difference I see is that the default functionality of a trait can be overridden by the consumer of the trait, while this isn't possible for the output of a derive macro.
The other big difference is that the macro has access to the token stream of the annotated item while a default trait implementation does not.
What are other practical differences between the two (i.e. not so concerned about technical details of how the two operate behind the scenes)?
Some possible use cases where this decision needs to be made are:
If I know the default implementation of a trait will not need to be overridden, is a derive macro preferred?
If I do not need access to the item's internals, is it preferred to use a trait with a default implementation?
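For concreteness, here is a minimal sketch of the two approaches being compared (the trait Describe and the types Widget and Gadget are invented for illustration):

// Approach 1: a trait with a default method body.
// Every implementor gets describe() for free but may override it.
trait Describe {
    fn name(&self) -> String;
    fn describe(&self) -> String {
        format!("a thing called {}", self.name())
    }
}

struct Widget;

impl Describe for Widget {
    fn name(&self) -> String {
        "widget".to_string()
    }
    // describe() falls back to the default body unless overridden here.
}

// Approach 2: a derive macro generates a complete impl from the item's
// token stream, as #[derive(Debug)] does; its output cannot be partially
// overridden, only replaced by writing the impl by hand instead.
#[derive(Debug)]
struct Gadget {
    id: u32,
}

fn main() {
    println!("{}", Widget.describe());  // uses the default method body
    println!("{:?}", Gadget { id: 1 }); // uses the derived Debug impl
}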

Related

Why do I need "use rand::Rng" to call gen() on rand::thread_rng()?

When I'm using Rust's rand crate, if I want to produce a random number, I would write:
use rand::{self, Rng};
let rand = rand::thread_rng().gen::<usize>();
If I don't use rand::Rng, an error occurs:
no method named gen found for struct rand::prelude::ThreadRng in the current scope
That's quite different from what I'm used to. Usually I treat modules like:
import rand from "path";
rand.generate();
Once I import the mod I don't need to import something else, and I can use every method it exports.
Why must I use rand::Rng to enable the gen method on rand::thread_rng()?
That's quite different from what I used to know.
It feels different because it is indeed different. You are probably used to dynamic dispatch via some kind of virtual method table (as in e.g. C++), or, in case of JS, to dynamic dispatch by looking up either the own properties of the receiver object, or its ancestors via the __proto__-chain. In any case, the object on which you are invoking a method carries around some data that tells it how to get the method that you're invoking. Given the signature of the invoked method, the receiver object itself knows how to get the method with that signature.
That's not the only way, though. For example,
modules / functors in OCaml or SML
typeclasses in Haskell
implicits / givens in Scala
traits in Rust
work on a rather different principle: the methods are not tied to the receiver, but to the module / typeclass / given / trait instances. In each case, those are entities that are separate from the receiver of the method call. It opens some new possibilities, e.g. it allows you to do some ad-hoc polymorphism (i.e. to define instances of traits after the fact, for types that are not necessarily under your control). At the same time, the compiler typically requires a bit more information from you in order to be able to select the correct instances: it behaves somewhat like a little type-directed search engine, or even a little "theorem prover", and for this to work, you have to tell the compiler where to look for the suitable building blocks for those synthetically generated instances.
If you've never worked with a language whose compiler has a subsystem that is "searching for instances" based on type information, this will indeed feel quite foreign. The error messages and the solution approaches feel rather different, because instead of comparing your implementation against an interface and looking for conflicts, you have to guide this instance-searching mechanism by providing more hints (e.g. by importing more traits).
In your particular case, rand::thread_rng returns a struct ThreadRng. On its own, the struct knows nothing about the gen method, because this method is not tied directly to the struct. Instead, it's defined in the Rng trait. But at the same time, it could be defined in some entirely unrelated trait, and have some completely different meaning. In order to disambiguate the intended meaning, you therefore have to explicitly specify that you want to work with the Rng trait. This is why you have to mention it in the use-clause.
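A small sketch of that last point, with invented traits Greet and Shout: the same method name can live on several unrelated traits, so the compiler resolves the call through whichever trait you have brought into scope.

mod a {
    pub trait Greet {
        fn hello(&self) -> String {
            "hello".to_string()
        }
    }
    impl Greet for i32 {}
}

mod b {
    pub trait Shout {
        fn hello(&self) -> String {
            "HELLO".to_string()
        }
    }
    impl Shout for i32 {}
}

fn main() {
    let x: i32 = 1;
    // Without any `use`, `x.hello()` fails with "no method named `hello`
    // found for type `i32` in the current scope".
    use a::Greet;
    println!("{}", x.hello()); // resolves through Greet, the trait in scope
    // The other trait's method of the same name can still be reached via an
    // explicit path, without importing it:
    println!("{}", b::Shout::hello(&x));
}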
I don't know the specific library you're using, but I can guess at the problem. I would guess that Rng is a trait which defines gen. Traits can be thought of as somewhat like Java's interfaces: they enable ad-hoc polymorphism by allowing you to define different behaviors for the same function on different datatypes.
However, Rust's traits fix one major problem (well, they fix several major problems, but one that's relevant here) with Java's interfaces. In Java, if you define an interface, then anyone writing a class can implement the interface, but you can't implement it for other people. In particular, the built-in types String and int and the like can never implement any new interfaces downstream. In Rust, either the trait writer or the struct/enum writer can implement the trait.
But this poses another issue. Now, if I have a value foo of type Foo and I write foo.bar(), then bar might not be a method defined on Foo; it might be something some trait writer implemented in some other file. We can't go search every Rust file on your computer for possible matching traits, so Rust makes the logical decision to restrict this search to traits that are in scope. If you want to call foo.bar() and bar is a method on trait Bar, then trait Bar has to be in scope when you call it. Otherwise, Rust won't see it.
So, in your case, thread_rng() returns a rand::prelude::ThreadRng. The method gen is not defined on rand::prelude::ThreadRng. Instead, it's defined on a trait called rand::Rng which is implemented by ThreadRng. That trait has to be in scope to use the method.
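As a sketch of the "implement it for other people's types" point (the trait WordCount here is invented): in Rust you can define a trait in one module and implement it for a type you don't own, such as str, but callers still have to bring the trait into scope, exactly like rand::Rng above.

mod ext {
    // An "extension trait": it adds a method to a type we don't own.
    pub trait WordCount {
        fn word_count(&self) -> usize;
    }

    impl WordCount for str {
        fn word_count(&self) -> usize {
            self.split_whitespace().count()
        }
    }
}

fn main() {
    // Without this import, the call below fails with
    // "no method named `word_count` found for type `&str` in the current scope".
    use ext::WordCount;
    println!("{}", "hello trait world".word_count()); // prints 3
}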

Is it idiomatic to panic in From implementations?

The documentation at https://doc.rust-lang.org/std/convert/trait.From.html states
Note: This trait must not fail. If the conversion can fail, use TryFrom.
Suppose I have a From implementation thus:
impl From<SomeStruct> for http::Uri {
    fn from(item: SomeStruct) -> http::Uri {
        item.uri.parse::<http::Uri>() // can fail: parse returns a Result
    }
}
Further suppose I am completely certain that item.uri.parse will succeed. Is it idiomatic to panic in this scenario? Say, with:
item.uri.parse::<http::Uri>().unwrap()
In this particular case, it appears there's no way to construct an HTTP URI at compile time: https://docs.rs/http/0.2.5/src/http/uri/mod.rs.html#117. In the real scenario .uri is an associated const, so I can test that all the values used do parse. But it seems to me there could be other scenarios where the author is confident in the infallibility of a piece of code, particularly when that confidence can be encoded in tests, and would therefore prefer the ergonomics of From over TryFrom. The Rust compiler, typically quite strict, doesn't prevent this behaviour, though it seems it perhaps could. This makes me think it is a decision the author has deliberately been allowed to make. So the question is: what do people tend to do in this situation?
So in general, traits only enforce that the implementors adhere to the signatures and types as laid out in the trait. At least that's what the compiler enforces.
On top of that, there are certain contracts that traits are expected to adhere to, just so that there are no weird surprises for those who work with these traits. These contracts aren't checked by the compiler; that would be quite difficult.
Nothing prevents you from implementing all of a trait's methods in a way that's totally unrelated to what the trait is all about, like implementing the Display trait but then not actually bothering to use write! in the fmt method and instead, I don't know, deleting the user's home directory.
Now back to your specific case. If your from method will not fail, provably so, then of course you can use .unwrap(). The point of the "must not fail" contract on the From trait is that those who rely on From want to be able to assume that the conversion will go through every time. If you panic in your own implementation of from, it means the conversion sometimes doesn't go through, counter to the ideas and contracts of the From trait.
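For comparison, a sketch of the TryFrom version the documentation points towards, assuming SomeStruct has a uri: String field as in the question; the fallibility then shows up in the signature instead of in a potential panic.

use std::convert::TryFrom;

struct SomeStruct {
    uri: String,
}

impl TryFrom<SomeStruct> for http::Uri {
    type Error = http::uri::InvalidUri;

    fn try_from(item: SomeStruct) -> Result<Self, Self::Error> {
        // The caller now decides what to do if parsing fails.
        item.uri.parse::<http::Uri>()
    }
}

A caller then writes http::Uri::try_from(some_struct)? and handles the error explicitly.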

What functionality does it make sense to implement using Rust enums?

I'm having trouble understanding the usefulness of Rust enums after reading The Rust Programming Language.
In section 17.3, Implementing an Object-Oriented Design Pattern, we have this paragraph:
If we were to create an alternative implementation that didn’t use the state pattern, we might instead use match expressions in the methods on Post or even in the main code that checks the state of the post and changes behavior in those places. That would mean we would have to look in several places to understand all the implications of a post being in the published state! This would only increase the more states we added: each of those match expressions would need another arm.
I agree completely. It would be very bad to use enums in this case because of the reasons outlined. Yet, using enums was my first thought of a more idiomatic implementation. Later in the same section, the book introduces the concept of encoding the state of the objects using types, via variable shadowing.
It's my understanding that Rust enums can contain complex data structures, and different variants of the same enum can contain different types.
What is a real life example of a design in which enums are the better option? I can only find fake or very simple examples in other sources.
I understand that Rust uses enums for things like Result and Option, but those are very simple uses. I was thinking of some functionality with a more complex behavior.
This turned out to be a somewhat open-ended question, but I could not find a useful answer by searching Google. I'm happy to change this to a more closed question if someone could be so kind as to help me rephrase it.
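To make the premise concrete, this is the kind of enum the question is about, where each variant carries different data (a made-up example):

enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Label(String), // variants can hold entirely different types
}

fn describe(shape: &Shape) -> String {
    match shape {
        Shape::Circle { radius } => format!("circle with area {}", std::f64::consts::PI * radius * radius),
        Shape::Rectangle { width, height } => format!("rectangle with area {}", width * height),
        Shape::Label(text) => format!("label saying {:?}", text),
    }
}

fn main() {
    let shapes = vec![
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
        Shape::Label("hello".to_string()),
    ];
    for s in &shapes {
        println!("{}", describe(s));
    }
}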
A fundamental trade-off between these choices in a broad sense has a name: "the expression problem". You should find plenty on Google under that name, both in general and in the context of Rust.
In the context of the question, the "problem" is to write the code in such a way that both adding a new state and adding a new operation on states does not involve modifying existing implementations.
When using a trait object, it is easy to add a state, but not an operation. To add a state, one defines a new type and implements the trait. To add an operation, naively, one adds a method to the trait but has to intrusively update the trait implementations for all states.
When using an enum for state, it is easy to add a new operation, but not a new state. To add an operation, one defines a new function. To add a new state, naively, one must intrusively modify all the existing operations to handle the new state.
If I explained this well enough, hopefully it should be clear that both will have a place. They are in a way dual to one another.
With this lens, an enum would be a better fit when the operations on the enum are expected to change more than its set of variants. For example, suppose you were trying to represent an abstract syntax tree for C++, which changes every three years. The set of AST node types may not change frequently relative to the set of operations you may want to perform on AST nodes.
With that said, there are solutions for the harder direction in both cases, but they remain somewhat more difficult. And which code must be modified may not be the primary concern.
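A compact sketch of the enum side of that trade-off, using a tiny expression tree as a stand-in (all names invented): adding a new operation is just one new function, whereas adding a new variant would force every existing match to grow an arm.

enum Expr {
    Num(f64),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

// Operation 1: evaluate. Adding it required no changes to Expr or to other code.
fn eval(e: &Expr) -> f64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Neg(a) => -eval(a),
    }
}

// Operation 2: pretty-print. Again just a new, self-contained function.
fn show(e: &Expr) -> String {
    match e {
        Expr::Num(n) => n.to_string(),
        Expr::Add(a, b) => format!("({} + {})", show(a), show(b)),
        Expr::Neg(a) => format!("-{}", show(a)),
    }
}

// Adding a new variant, say Expr::Mul, would force both matches above to be
// updated; exhaustiveness checking at least makes the compiler point them out.

fn main() {
    let e = Expr::Add(
        Box::new(Expr::Num(1.0)),
        Box::new(Expr::Neg(Box::new(Expr::Num(2.0)))),
    );
    println!("{} = {}", show(&e), eval(&e));
}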

What is the difference between the derive attribute and implementing traits for structures yourself?

I was wondering: what is the difference between, for example, the #[derive(Debug)] attribute and implementing the std::fmt::Debug trait for some struct myself (or any other trait that can be added with a derive attribute), assuming that my implementation is efficient? Will there be a compile-time difference? A performance difference?
When should I use derive attribute and when should I implement traits by myself?
derive just provides a generally useful implementation of the trait (a default implementation which would often be what a user would want).
You should derive if the behaviour of derive is what you're looking for, and that's it. It's a convenience for generally useful behaviour e.g. structural equality for (Partial)Eq, printing a literal-ish struct for Debug, etc… It doesn't make a difference when using the structure / trait.
A derive is just a special kind of macro (custom derives are proc macros), so compilation will be infinitesimally slower: rustc needs to run the proc macro first, then compile its output. I doubt that makes any real difference, though.
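A minimal sketch of the two options for Debug; for a plain struct like this the derived and hand-written impls behave the same.

use std::fmt;

// Option 1: let the compiler generate the impl.
#[derive(Debug)]
struct Derived {
    x: i32,
}

// Option 2: write the same trait impl by hand.
struct Manual {
    x: i32,
}

impl fmt::Debug for Manual {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Manual").field("x", &self.x).finish()
    }
}

fn main() {
    println!("{:?}", Derived { x: 1 }); // Derived { x: 1 }
    println!("{:?}", Manual { x: 1 });  // Manual { x: 1 }
}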

Why is Default not implemented for Mutex, RwLock, Condvar, Duration?

The Default trait can be #[derive(..)]d only if the contents of the deriving type also implement Default. This means the trait gets easier to use the more it is implemented. However, I notice some types from std are missing implementations, although they have perfectly valid defaults (sometimes depending on generic params).
Mutex<T> and RwLock<T> could implement it via new(T::default()) (where T: Default)
Condvar could simply implement it via Condvar::new()
Duration could derive it (to get a zero duration, which is a sensible default)
Is there a technical reason for those omissions?
Some people have asked a similar question about Debug implementations; see "Missing Debug Implementations — #31869". Debug can likewise only be derived under conditions similar to Default's.
Unfortunately the corresponding PR, "libcore: add Debug implementations to most missing types #32054", seems to suggest that some types were not Debug simply because no-one had written a Debug implementation for them. For some other types it is controversial what the implementation should do, and there is some reluctance to add them to the standard library.
It is reasonable to assume that at least some types are not Default for the same non-technical reasons.
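As a sketch of the practical workaround: outside the standard library you cannot add such an impl yourself, because both the trait (Default) and the type (e.g. Mutex<T>) are foreign (the orphan rule), but a local newtype can delegate to Mutex::new(T::default()) in the way the question suggests.

use std::sync::Mutex;

// A local wrapper supplying the Default behaviour described in the question.
struct DefaultMutex<T>(Mutex<T>);

impl<T: Default> Default for DefaultMutex<T> {
    fn default() -> Self {
        DefaultMutex(Mutex::new(T::default()))
    }
}

// With the wrapper, #[derive(Default)] works again for containing types.
#[derive(Default)]
struct Shared {
    counter: DefaultMutex<u64>,
}

fn main() {
    let shared = Shared::default();
    *shared.counter.0.lock().unwrap() += 1;
    println!("{}", *shared.counter.0.lock().unwrap());
}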

Resources