What is the difference between `dyn` and generics? - rust

I'm reading some code and it has a consume function which makes it possible for me to pass my own function f.
fn consume<R, F>(self, _timestamp: Instant, len: usize, f: F) -> Result<R>
where
F: FnOnce(&mut [u8]) -> Result<R>,
I wrote some similar code, but like this:
pub fn phy_receive(
&mut self,
f: &mut dyn FnMut(&[u8])
) -> u8 {
and to be fair I don't know what is the difference, aside from FnOnce vs FnMut. What is the difference between using dyn vs a generic type parameter to specify this function?

Using dyn with types results in dynamic dispatch (hence the dyn keyword), whereas using a (constrained) generic parameter results in monomorphism.
General explanation
Dynamic dispatch
Dynamic dispatch means that a method call is resolved at runtime. It is generally more expensive in terms of runtime resources than monomorphism.
For example, say you have the following trait
trait MyTrait {
fn f(&self);
fn g(&self);
}
and a struct MyStruct which implements that trait. If you use a dyn reference to that trait (e.g. &dyn MyTrait), and pass a reference to a MyStruct object to it, what happens is the following:
A "vtable" data structure is created. This is a table containing pointers to the MyStruct implementations of f and g.
A pointer to this vtable is stored with the &dyn MyTrait reference, hence the reference will be twice its usual size; sometimes &dyn references are called "fat references" for this reason.
Calling f and g will then result in indirect function calls using the pointers stored in the vtable.
Monomorphism
Monomorphism means that the code is generated at compile-time. It's similar to copy and paste. Using MyTrait and MyStruct defined in the previous section, imagine you have a function like the following:
fn sample<T: MyTrait>(t: T) { ... }
And you pass a MyStruct to it:
sample(MyStruct);
What happens is the following:
During compile time, a copy of the sample function is created specifically for the MyStruct type. In very simple terms, this is as if you copied and pasted the sample function definition and replaced T with MyStruct:
fn sample__MyStruct(t: MyStruct) { ... }
The sample(MyStruct) call gets compiled into sample__MyStruct(MyStruct).
This means that in general, monomorphism can be more expensive in terms of binary code size (since you are essentially duplicating similar chunks of code, but for different types), but there's no runtime cost like there is with dynamic dispatch.
Monomorphism is also generally more expensive in terms of compile times: because it essentially does copy-paste of code, codebases that use monomorphism abundantly tend to compile a bit slower.
Your example
Since FnMut is just a trait, the above discussion applies directly to your question. Here's the trait definition:
pub trait FnMut<Args>: FnOnce<Args> {
pub extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}
Disregarding the extern "rust-call" weirdness, this is a trait just like MyTrait above. This trait is implemented by certain Rust functions, so any of those functions is analogous to MyStruct from above. Using &dyn FnMut<...> will result in dynamic dispatch, and using <T: FnMut<...>> will result in monomorphism.
My 2 cents and general advice
Certain situations will require you to use a dynamic dispatch. For example, if you have a Vec of external objects implementing a certain trait, you have no choice but to use dynamic dispatch. For example,
Vec<Box<dyn Debug>>. If those objects are internal to your code, though, you could use an enum type and monomorphism.
If your trait contains an associated type or a generic method, you will have to use monomorphism, because such traits are not object safe.
Everything else being equal, my advice is to pick one preference and stick with it in your codebase. From what I've seen, most people tend to prefer defaulting to generics and monomorphism.

Related

Concrete Option<Box<impl T>> conversion to Option<Box<dyn T>> in Rust [duplicate]

I'm having trouble understanding how values of boxed traits come into existence. Consider the following code:
trait Fooer {
fn foo(&self);
}
impl Fooer for i32 {
fn foo(&self) { println!("Fooer on i32!"); }
}
fn main() {
let a = Box::new(32); // works, creates a Box<i32>
let b = Box::<i32>::new(32); // works, creates a Box<i32>
let c = Box::<dyn Fooer>::new(32); // doesn't work
let d: Box<dyn Fooer> = Box::new(32); // works, creates a Box<Fooer>
let e: Box<dyn Fooer> = Box::<i32>::new(32); // works, creates a Box<Fooer>
}
Obviously, variant a and b work, trivially. However, variant c does not, probably because the new function takes only values of the same type which is not the case since Fooer != i32. Variant d and e work, which lets me suspect that some kind of automatic conversion from Box<i32> to Box<dyn Fooer> is being performed.
So my questions are:
Does some kind of conversion happen here?
If so, what the mechanism behind it and how does it work? (I'm also interested in the low level details, i.e. how stuff is represented under the hood)
Is there a way to create a Box<dyn Fooer> directly from an i32? If not: why not?
However, variant c does not, probably because the new function takes only values of the same type which is not the case since Fooer != i32.
No, it's because there is no new function for Box<dyn Fooer>. In the documentation:
impl<T> Box<T>
pub fn new(x: T) -> Box<T>
Most methods on Box<T> allow T: ?Sized, but new is defined in an impl without a T: ?Sized bound. That means you can only call Box::<T>::new when T is a type with a known size. dyn Fooer is unsized, so there simply isn't a new function to call.
In fact, that function can't exist in today's Rust. Box<T>::new needs to know the concrete type T so that it can allocate memory of the right size and alignment. Therefore, you can't erase T before you send it to Box::new. (It's conceivable that future language extensions may allow functions to accept unsized parameters; however, it's unclear whether even unsized_locals would actually enable Box<T>::new to accept unsized T.)
For the time being, unsized types like dyn Fooer can only exist behind a "fat pointer", that is, a pointer to the object and a pointer to the implementation of Fooer for that object. How do you get a fat pointer? You start with a thin pointer and coerce it. That's what's happening in these two lines:
let d: Box<Fooer> = Box::new(32); // works, creates a Box<Fooer>
let e: Box<Fooer> = Box::<i32>::new(32); // works, creates a Box<Fooer>
Box::new returns a Box<i32>, which is then coerced to Box<Fooer>. You could consider this a conversion, but the Box isn't changed; all the compiler does is stick an extra pointer on it and forget its original type. rodrigo's answer goes into more detail about the language-level mechanics of this coercion.
Hopefully all of this goes to explain why the answer to
Is there a way to create a Box<Fooer> directly from an i32?
is "no": the i32 has to be boxed before you can erase its type. It's the same reason you can't write let x: Fooer = 10i32.
Related
Why can't I write a function with the same type as Box::new?
Are polymorphic variables allowed?
How do you actually use dynamically sized types in Rust?
Why is `let ref a: Trait = Struct` forbidden?
I'll try to explain what conversions (coercions) happen in your code.
There is a marker trait named Unsize that, between others:
Unsize is implemented for:
T is Unsize<Trait> when T: Trait.
[...]
This trait, AFAIK, is not used directly for coercions. Instead, CoerceUnsized is used. This trait is implemented in a lot of cases, some of them are quite expected, such as:
impl<'a, 'b, T, U> CoerceUnsized<&'a U> for &'b T
where
'b: 'a,
T: Unsize<U> + ?Sized,
U: ?Sized
that is used to coerce &i32 into &Fooer.
The interesting, not so obvious implementation for this trait, that affects your code is:
impl<T, U> CoerceUnsized<Box<U>> for Box<T>
where
T: Unsize<U> + ?Sized,
U: ?Sized
This, together with the definition of the Unsize marker, can be somewhat read as: if U is a trait and T implements U, then Box<T> can be coerced into Box<U>.
About your last question:
Is there a way to create a Box<Fooer> directly from an i32? If not: why not?
Not that I know of. The problem is that Box::new(T) requires a sized value, since the value passed is moved into the box, and unsized values cannot be moved.
In my opinion, the easiest way to do that is to simply write:
let c = Box::new(42) as Box<Fooer>;
That is, you create a Box of the proper type and then coerce to the unsized one (note it looks quite similar to your d example).

What is the idiomatic way in rust for a function accepts a closure as argument or return a closure?

What is the idiomatic way in rust for a function accepts a closure as argument or return a closure?
I see it can be done in at least the below 3 ways:
// 1
pub fn run_with_envs_guard1(envs: &HashMap<&str, &str>, f: &dyn FnOnce()) {}
// 2
pub fn run_with_envs_guard2(envs: &HashMap<&str, &str>, f: Box<dyn FnOnce()>) {}
// 3
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Are there really some differences among these 3 ways? If yes, pls help to clarify, and which way is more idiomatic i should choose?
I am learning rust still, sorry if all the above ways are some bad/strange things.
Maybe a more specific question, why in way 1 and 2 i need the dyn keyword, but in 3 i don't, from my understanding, these all need dynamic dispatching, is it? as the actual function cannot be determined in compiling time
Abdul answers the first half of your question (and I agree completely with what he said), so I'll take a stab at the second half.
If you want to return a closure from a function, you can't return a type parameter, because that implies that you're returning an instance of any FnOnce, at the caller's choice. You can't return a &FnOnce, because you (usually) need to pass ownership to the caller. You could make it work with Box<FnOnce>, but that tends to just be clunky to work with. When returning closures from functions, I'm partial to the impl trait syntax.
pub fn test() -> impl FnOnce() {
|| { println!("It worked!") }
}
In argument position, writing impl FnOnce() as the type of something is equivalent to defining a type argument, as Abdul did in his answer. However, in return position, it's an entirely new feature that returns an opaque value. It says "I'm returning an FnOnce, and I'm not telling you which one it is". It's the same concept as a trait object, but without the overhead of throwing it in a box.
Responding to your edit
i don't, from my understanding, these all need dynamic dispatching, is it? as the actual function cannot be determined in compiling time
This is actually not necessarily true. If you see the dyn keyword, then there's definitely a dynamic (runtime) dispatch happening. To understand your other example, though, let's consider a simple trait that doesn't have the baggage of FnOnce.
pub trait MyTrait {}
struct Foo;
struct Bar;
impl MyTrait for Foo {}
impl MyTrait for Bar {}
pub fn example<T: MyTrait>(_arg: T) {
println!("It works!");
}
fn main() {
example(Foo);
example(Bar);
}
I claim there's no dynamic dispatch happening here. Rust monomorphizes functions with type parameters. That means that example is like a template function in C++. Every instantiation of it will end up being a separate function. So, really, during Rust's compilation, this will end up being more like
struct Foo;
struct Bar;
pub fn example1(_arg: Foo) {
println!("It works!");
}
pub fn example2(_arg: Foo) {
println!("It works!");
}
fn main() {
example1(Foo);
example2(Bar);
}
Two unrelated functions that happen to do something similar. Rust resolves all of the linkage statically, so there's no dispatch happening at runtime. In fact, we can prove it. Take the code I just posted above and compile it with debugging symbols on (rustc -g filename.rs). Then use a tool like nm (available on most Linux machines by default) to list all of the symbols in the linker table. Assuming you didn't turn any optimizations on, you should see two example functions. This is what they look like in my linker
0000000000005340 t _ZN10code7example17h46383f9ad372dc94E
00000000000053a0 t _ZN10code7example17h97b400359a146fcaE
or, with nm -C to demangle the function names
0000000000005340 t code::example
00000000000053a0 t code::example
Two different functions, each of which takes concrete arguments of specific types.
Your proposed FnOnce would work the same way.
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Every closure in Rust has a distinct type, so every time this function is called, a new version of run_with_envs_guard3 will get made, specifically for that closure. That new function will know exactly what to do for the closure you just gave it. In 99% of cases, if you have optimizations turned on, these made-up local functions will get inlined and optimized out, so no harm done. But there's no dynamic dispatch here.
In the other two examples, we have a dyn FnOnce, which is more like what you'd expect coming from a traditionally object-oriented language. dyn FnOnce contains a dynamic pointer to some function somewhere that will be dispatched at runtime, the way you'd expect.
I would prefer the third one. Because the rust documentation suggest to Use FnOnce as a bound when you want to accept a parameter of function-like type and only need to call it once.
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
This means that the F to be bound by FnOnce(ie, F must implement FnOnce)

What is the meaning of a dyn Trait in argument position?

I can see the difference between dyn and (static) impl Traits in return position, such as:
fn foo() -> Box<dyn Trait> {}
vs
fn foo() -> impl Trait {}
Where in the dyn version I'm allowed to return different types as long they all implement the Trait, while in the impl version I'm only allowed to return the same type (same applies if I return a reference).
But I can't see the purpose of a dyn Trait in argument position such as:
fn foo(x: &dyn Trait) {}
vs
fn foo(x: &impl Trait) {} // syntatic sugar of `fn foo<T: Trait>(x: &T){}`
What is the difference between the two? Why would I use one or the other? And what does the dyn version allow me to do that the static one doesn't (that I cannot do for example by relaxing the implicit Sized restriction with ?Sized)?
If you're familiar with C++/Java, then dyn corresponds to "interface reference", so it implies dynamic polymorphism (thus requires a bunch of jumps over references, so it's a bit slower).
impl being a syntatic sugar, defines a template for functions, thus every time you use the function with another type, you'll get a separate copy of function compiled specifically for this type. So no extra jumping, but your executable bloats with these copies. Rust's ideology tells to create templates with <T>, impl, unless there're too many versions compiled that the executable is too bloated.

Usage of dyn in rust [duplicate]

I have recently seen code using the dyn keyword:
fn foo(arg: &dyn Display) {}
fn bar() -> Box<dyn Display> {}
What does this syntax mean?
TL;DR: It's a syntax for specifying the type of a trait object and must be specified for clarity reasons.
Since Rust 1.0, traits have led a double life. Once a trait has been declared, it can be used either as a trait or as a type:
// As a trait
impl MyTrait for SomeType {}
// As a type!
impl MyTrait {}
impl AnotherTrait for MyTrait {}
As you can imagine, this double meaning can cause some confusion. Additionally, since the MyTrait type is an unsized / dynamically-sized type, this can expose people to very complex error messages.
To ameliorate this problem, RFC 2113 introduced the dyn syntax. This syntax is available starting in Rust 1.27:
use std::{fmt::Display, sync::Arc};
fn main() {
let display_ref: &dyn Display = &42;
let display_box: Box<dyn Display> = Box::new(42);
let display_arc: Arc<dyn Display> = Arc::new(42);
}
This new keyword parallels the impl Trait syntax and strives to make the type of a trait object more obviously distinct from the "bare" trait syntax.
dyn is short for "dynamic" and refers to the fact that trait objects perform dynamic dispatch. This means that the decision of exactly which function is called will occur at program run time. Contrast this to static dispatch which uses the impl Trait syntax.
The syntax without dyn is now deprecated and it's likely that in a subsequent edition of Rust it will be removed.
Why would I implement methods on a trait instead of as part of the trait?
What makes something a "trait object"?
TLDR: "dyn" allows you to store in a Box a mix of Apples and Oranges, because they all implement the same trait of Fruit, which is what your Box is using as a type constraint, instead of just a generic type. This is because Generic allows any ONE of Apple OR Orange, but not both:
Vec<Box<T>> --> Vector can hold boxes of either Apples OR Oranges structs
Vec<Box<dyn Fruit>> --> Vector can now hold a mix of boxes of Apples AND Oranges Structs
If you want to store multiple types to the same instance of a data-structure, you have to use a trait wrapping a generic type and tag it as a "dyn", which will then cause that generic type to be resolved each time it's called, during runtime.
Sometimes, rather than using a type (String, &str, i32, etc...) or generic (T, Vec, etc...), we are using a trait as the type constraint (i.e. TryFrom). This is to allow us to store multiple types (all implementing the required trait), in the same data-structure instance (you will probably need to Box<> it too).
"dyn" basically tells the compiler that we don't know what the type is going to be at compile-time in place of the trait, and that it will be determined at run-time. This allows the final type to actually be a mixture of types that all implement the trait.
For generics, the compiler will hard-code the type in place of our generic type at the first use of the call to our data-structure consuming the generics. Every other call to store data in that same data-structure is expected to be using the same type as in the first call.
WARNING
As with all things, there is a performance penalty for implementing added flexibility, and this case definitely has a performance penalty.
I found this blog post to explain this feature really clearly: https://medium.com/digitalfrontiers/rust-dynamic-dispatching-deep-dive-236a5896e49b
Relevant excerpt:
struct Service<T:Backend>{
backend: Vec<T> // Either Vec<TypeA> or Vec<TypeB>, not both
}
...
let mut backends = Vec::new();
backends.push(TypeA);
backends.push(TypeB); // <---- Type error here
vs
struct Service{
backends: Vec<Box<dyn Backend>>
}
...
let mut backends = Vec::new();
backends.push( Box::new(PositiveBackend{}) as Box<dyn Backend>);
backends.push( Box::new(NegativeBackend{}) as Box<dyn Backend>);
The dyn keyword is used to indicate that a type is a trait object. According to the Rust docs:
A trait object is an opaque value of another type that implements a
set of traits.
In other words, we do not know the specific type of the object at compile time, we just know that the object implements the trait.
Because the size of a trait object is unknown at compile time they must be placed behind a pointer. For example, if Trait is your trait name then you can use your trait objects in the following manner:
Box<dyn Trait>
&dyn Trait
and other pointer types
The variables/parameters which hold the trait objects are fat pointers which consists of the following components:
pointer to the object in memory
pointer to that object’s vtable, a vtable is a table with pointers which point to the actual method(s) implementation(s).
See my answer on What makes something a “trait object”? for further details.

How does the mechanism behind the creation of boxed traits work?

I'm having trouble understanding how values of boxed traits come into existence. Consider the following code:
trait Fooer {
fn foo(&self);
}
impl Fooer for i32 {
fn foo(&self) { println!("Fooer on i32!"); }
}
fn main() {
let a = Box::new(32); // works, creates a Box<i32>
let b = Box::<i32>::new(32); // works, creates a Box<i32>
let c = Box::<dyn Fooer>::new(32); // doesn't work
let d: Box<dyn Fooer> = Box::new(32); // works, creates a Box<Fooer>
let e: Box<dyn Fooer> = Box::<i32>::new(32); // works, creates a Box<Fooer>
}
Obviously, variant a and b work, trivially. However, variant c does not, probably because the new function takes only values of the same type which is not the case since Fooer != i32. Variant d and e work, which lets me suspect that some kind of automatic conversion from Box<i32> to Box<dyn Fooer> is being performed.
So my questions are:
Does some kind of conversion happen here?
If so, what the mechanism behind it and how does it work? (I'm also interested in the low level details, i.e. how stuff is represented under the hood)
Is there a way to create a Box<dyn Fooer> directly from an i32? If not: why not?
However, variant c does not, probably because the new function takes only values of the same type which is not the case since Fooer != i32.
No, it's because there is no new function for Box<dyn Fooer>. In the documentation:
impl<T> Box<T>
pub fn new(x: T) -> Box<T>
Most methods on Box<T> allow T: ?Sized, but new is defined in an impl without a T: ?Sized bound. That means you can only call Box::<T>::new when T is a type with a known size. dyn Fooer is unsized, so there simply isn't a new function to call.
In fact, that function can't exist in today's Rust. Box<T>::new needs to know the concrete type T so that it can allocate memory of the right size and alignment. Therefore, you can't erase T before you send it to Box::new. (It's conceivable that future language extensions may allow functions to accept unsized parameters; however, it's unclear whether even unsized_locals would actually enable Box<T>::new to accept unsized T.)
For the time being, unsized types like dyn Fooer can only exist behind a "fat pointer", that is, a pointer to the object and a pointer to the implementation of Fooer for that object. How do you get a fat pointer? You start with a thin pointer and coerce it. That's what's happening in these two lines:
let d: Box<Fooer> = Box::new(32); // works, creates a Box<Fooer>
let e: Box<Fooer> = Box::<i32>::new(32); // works, creates a Box<Fooer>
Box::new returns a Box<i32>, which is then coerced to Box<Fooer>. You could consider this a conversion, but the Box isn't changed; all the compiler does is stick an extra pointer on it and forget its original type. rodrigo's answer goes into more detail about the language-level mechanics of this coercion.
Hopefully all of this goes to explain why the answer to
Is there a way to create a Box<Fooer> directly from an i32?
is "no": the i32 has to be boxed before you can erase its type. It's the same reason you can't write let x: Fooer = 10i32.
Related
Why can't I write a function with the same type as Box::new?
Are polymorphic variables allowed?
How do you actually use dynamically sized types in Rust?
Why is `let ref a: Trait = Struct` forbidden?
I'll try to explain what conversions (coercions) happen in your code.
There is a marker trait named Unsize that, between others:
Unsize is implemented for:
T is Unsize<Trait> when T: Trait.
[...]
This trait, AFAIK, is not used directly for coercions. Instead, CoerceUnsized is used. This trait is implemented in a lot of cases, some of them are quite expected, such as:
impl<'a, 'b, T, U> CoerceUnsized<&'a U> for &'b T
where
'b: 'a,
T: Unsize<U> + ?Sized,
U: ?Sized
that is used to coerce &i32 into &Fooer.
The interesting, not so obvious implementation for this trait, that affects your code is:
impl<T, U> CoerceUnsized<Box<U>> for Box<T>
where
T: Unsize<U> + ?Sized,
U: ?Sized
This, together with the definition of the Unsize marker, can be somewhat read as: if U is a trait and T implements U, then Box<T> can be coerced into Box<U>.
About your last question:
Is there a way to create a Box<Fooer> directly from an i32? If not: why not?
Not that I know of. The problem is that Box::new(T) requires a sized value, since the value passed is moved into the box, and unsized values cannot be moved.
In my opinion, the easiest way to do that is to simply write:
let c = Box::new(42) as Box<Fooer>;
That is, you create a Box of the proper type and then coerce to the unsized one (note it looks quite similar to your d example).

Resources