I can see the difference between dyn and (static) impl Traits in return position, such as:
fn foo() -> Box<dyn Trait> {}
vs
fn foo() -> impl Trait {}
Where in the dyn version I'm allowed to return different types as long as they all implement the Trait, while in the impl version I'm only allowed to return a single concrete type (the same applies if I return a reference).
But I can't see the purpose of a dyn Trait in argument position such as:
fn foo(x: &dyn Trait) {}
vs
fn foo(x: &impl Trait) {} // syntactic sugar for `fn foo<T: Trait>(x: &T) {}`
What is the difference between the two? Why would I use one or the other? And what does the dyn version allow me to do that the static one doesn't (that I cannot do for example by relaxing the implicit Sized restriction with ?Sized)?
If you're familiar with C++/Java, then dyn corresponds to an "interface reference", so it implies dynamic polymorphism: each method call goes through a pointer in a vtable, which makes it a bit slower and usually prevents inlining.
impl, being syntactic sugar for a type parameter, defines a template for the function: every time you use the function with another type, a separate copy is compiled specifically for that type. So there is no extra indirection, but your executable grows with each copy. The usual Rust advice is to default to generics (<T> or impl) and to switch to dyn only when so many copies get compiled that the executable becomes too bloated.
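A minimal sketch of the difference, using a hypothetical Shape trait with Circle and Square types (none of these names come from the question). Both functions accept a reference to a single value, but only the dyn version lets one compiled function serve a collection of mixed types.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Circle(f64);
struct Square(f64);

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// Static dispatch: one copy of this function is compiled per concrete type.
fn area_static(x: &impl Shape) -> f64 {
    x.area()
}

// Dynamic dispatch: a single compiled function; `area` is looked up in the vtable.
fn area_dyn(x: &dyn Shape) -> f64 {
    x.area()
}

// Only possible with `dyn`: the elements have different concrete types.
fn total_area(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|s| area_dyn(*s)).sum()
}

fn main() {
    let c = Circle(1.0);
    let s = Square(2.0);
    println!("{}", area_static(&s));
    println!("{}", area_dyn(&c));
    let shapes: [&dyn Shape; 2] = [&c, &s];
    println!("{}", total_area(&shapes));
}
```

Note that a generic version declared with T: Shape + ?Sized would also accept a &dyn Shape; it is then simply instantiated with T = dyn Shape.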
In Rust for Rustaceans, Jon Gjengset states (emphasis mine):
When you’re given the choice between static and dynamic dispatch,
there is rarely a clear-cut right answer. Broadly speaking, though, you’ll
want to use static dispatch in your libraries and dynamic dispatch in your
binaries. In a library, you want to allow your users to decide what kind of
dispatch is best for them, since you don’t know what their needs are. If you use dynamic dispatch, they’re forced to do the same, whereas if you use
static dispatch, they can choose whether to use dynamic dispatch or not.
My interpretation of this was:
In a library one should use: fn execute(x: Arc<impl MyTrait>) -> ... This is static dispatch.
In a binary: let x: Arc<dyn MyTrait> = Arc::new(my_obj) This is dynamic.
Then I want to call execute(x), but the compiler complains:
error[E0277]: the size for values of type `dyn MyTrait` cannot be known at compilation time
  --> src/main.rs:12:13
   |
12 |     execute(x);
   |     ------- ^ doesn't have a size known at compile-time
   |     |
   |     required by a bound introduced by this call
   |
   = help: the trait `Sized` is not implemented for `dyn MyTrait`
What am I missing?
Update:
One of MyTrait's methods is:
fn build(conf: MyStruct) -> Self where Self: Sized;
I added the Sized bound so that this method is "skipped" for trait objects. In other words, this method would never be called from the execute function.
Your interpretation is correct. However, every generic argument in Rust is, unless explicitly stated otherwise, assumed to be Sized. So when you write
fn execute(x: Arc<impl MyTrait>) {}
That's equivalent[1] to
fn execute<T>(x: Arc<T>)
where T: MyTrait {}
And since we didn't opt out of the Sized constraint, this gives
fn execute<T>(x: Arc<T>)
where T: MyTrait + Sized {}
And while dyn MyTrait absolutely does implement MyTrait, it does not implement Sized. So if we write any of the following
fn execute(x: Arc<impl MyTrait + ?Sized>) {}
fn execute<T: MyTrait + ?Sized>(x: Arc<T>) {}
fn execute<T>(x: Arc<T>) where T: MyTrait + ?Sized {}
Then Rust will accept a trait object as argument to this function.
[1] Almost equivalent. You can't write execute::<MyType>() if you're using the impl syntax.
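Putting the pieces together, here is a small sketch of the ?Sized version. The MyObj type, its value field and the value method are assumptions made up for illustration; only execute, MyTrait, MyStruct and build come from the question.

```rust
use std::sync::Arc;

// Hypothetical configuration type; the question only names it.
struct MyStruct {
    value: u32,
}

trait MyTrait {
    // `where Self: Sized` keeps this constructor out of the vtable,
    // so the trait can still be used as `dyn MyTrait`.
    fn build(conf: MyStruct) -> Self
    where
        Self: Sized;

    // Hypothetical method so that `execute` has something to call.
    fn value(&self) -> u32;
}

struct MyObj {
    value: u32,
}

impl MyTrait for MyObj {
    fn build(conf: MyStruct) -> Self {
        MyObj { value: conf.value }
    }

    fn value(&self) -> u32 {
        self.value
    }
}

// `?Sized` opts out of the implicit `Sized` bound, so T may be `dyn MyTrait`.
fn execute(x: Arc<impl MyTrait + ?Sized>) -> u32 {
    x.value()
}

fn main() {
    // Static dispatch: T = MyObj.
    let concrete = Arc::new(MyObj::build(MyStruct { value: 1 }));
    println!("{}", execute(concrete));

    // Dynamic dispatch: T = dyn MyTrait.
    let object: Arc<dyn MyTrait> = Arc::new(MyObj::build(MyStruct { value: 2 }));
    println!("{}", execute(object));
}
```

This is the choice the book quote is describing: the library function is generic, and the binary decides whether to instantiate it with a concrete type or with a trait object.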
What is the idiomatic way in Rust for a function to accept a closure as an argument or to return a closure?
I see it can be done in at least the below 3 ways:
// 1
pub fn run_with_envs_guard1(envs: &HashMap<&str, &str>, f: &dyn FnOnce()) {}
// 2
pub fn run_with_envs_guard2(envs: &HashMap<&str, &str>, f: Box<dyn FnOnce()>) {}
// 3
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Are there really any differences among these 3 ways? If yes, please help clarify them. Which way is more idiomatic, and which should I choose?
I am still learning Rust, sorry if all of the above are bad or strange.
Maybe a more specific question: why do I need the dyn keyword in ways 1 and 2, but not in 3? From my understanding, these all need dynamic dispatch, don't they? The actual function cannot be determined at compile time.
Abdul answers the first half of your question (and I agree completely with what he said), so I'll take a stab at the second half.
If you want to return a closure from a function, you can't return a type parameter, because that implies that you're returning an instance of any FnOnce, at the caller's choice. You can't return a &dyn FnOnce(), because you (usually) need to pass ownership to the caller. You could make it work with Box<dyn FnOnce()>, but that tends to just be clunky to work with. When returning closures from functions, I'm partial to the impl Trait syntax.
pub fn test() -> impl FnOnce() {
|| { println!("It worked!") }
}
In argument position, writing impl FnOnce() as the type of something is equivalent to defining a type argument, as Abdul did in his answer. However, in return position, it's an entirely new feature that returns an opaque value. It says "I'm returning an FnOnce, and I'm not telling you which one it is". It's the same concept as a trait object, but without the overhead of throwing it in a box.
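To make the trade-off concrete, here is a small sketch (make_adder and make_op are made-up names). Return-position impl Trait works when the function always returns the same closure type; once different branches return different closures, a boxed trait object is one way to give them a common type.

```rust
// Return-position `impl Trait`: one concrete (but unnamed) closure type,
// chosen by the function rather than the caller. No allocation is needed.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// The two branches produce closures of different types, so `impl Fn` would
// not compile here; boxing them as trait objects gives them one common type.
fn make_op(double: bool) -> Box<dyn Fn(i32) -> i32> {
    if double {
        Box::new(|x: i32| x * 2)
    } else {
        Box::new(|x: i32| x + 1)
    }
}

fn main() {
    let add_five = make_adder(5);
    println!("{}", add_five(1));
    println!("{}", make_op(true)(10));
    println!("{}", make_op(false)(10));
}
```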
Responding to your edit
i don't, from my understanding, these all need dynamic dispatching, is it? as the actual function cannot be determined in compiling time
This is actually not necessarily true. If you see the dyn keyword, then there's definitely a dynamic (runtime) dispatch happening. To understand your other example, though, let's consider a simple trait that doesn't have the baggage of FnOnce.
pub trait MyTrait {}
struct Foo;
struct Bar;
impl MyTrait for Foo {}
impl MyTrait for Bar {}
pub fn example<T: MyTrait>(_arg: T) {
println!("It works!");
}
fn main() {
example(Foo);
example(Bar);
}
I claim there's no dynamic dispatch happening here. Rust monomorphizes functions with type parameters. That means that example is like a template function in C++. Every instantiation of it will end up being a separate function. So, really, during Rust's compilation, this will end up being more like
struct Foo;
struct Bar;
pub fn example1(_arg: Foo) {
println!("It works!");
}
pub fn example2(_arg: Bar) {
println!("It works!");
}
fn main() {
example1(Foo);
example2(Bar);
}
Two unrelated functions that happen to do something similar. Rust resolves all of the linkage statically, so there's no dispatch happening at runtime. In fact, we can prove it. Take the code I just posted above and compile it with debugging symbols on (rustc -g filename.rs). Then use a tool like nm (available on most Linux machines by default) to list all of the symbols in the binary's symbol table. Assuming you didn't turn any optimizations on, you should see two example functions. This is what they look like in my binary
0000000000005340 t _ZN10code7example17h46383f9ad372dc94E
00000000000053a0 t _ZN10code7example17h97b400359a146fcaE
or, with nm -C to demangle the function names
0000000000005340 t code::example
00000000000053a0 t code::example
Two different functions, each of which takes concrete arguments of specific types.
Your proposed FnOnce would work the same way.
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Every closure in Rust has a distinct type, so every time this function is called, a new version of run_with_envs_guard3 will get made, specifically for that closure. That new function will know exactly what to do for the closure you just gave it. In 99% of cases, if you have optimizations turned on, these made-up local functions will get inlined and optimized out, so no harm done. But there's no dynamic dispatch here.
In the other two examples, we have a dyn FnOnce, which is more like what you'd expect coming from a traditionally object-oriented language. dyn FnOnce contains a dynamic pointer to some function somewhere that will be dispatched at runtime, the way you'd expect.
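For comparison, here is a sketch of the three ways of accepting a closure side by side. The function names are made up, and the closures return a usize (unlike the FnOnce() in the question) purely so that the result is observable. One detail worth noting: an FnOnce cannot be called through a plain reference, because calling it consumes it, so the borrowed variant below takes &mut dyn FnMut instead.

```rust
use std::collections::HashMap;

// Generic parameter: static dispatch, monomorphized per closure type.
fn run_generic<F: FnOnce() -> usize>(envs: &HashMap<&str, &str>, f: F) -> usize {
    envs.len() + f()
}

// Boxed trait object: dynamic dispatch; the Box owns the closure, so the
// call is allowed to consume it.
fn run_boxed(envs: &HashMap<&str, &str>, f: Box<dyn FnOnce() -> usize>) -> usize {
    envs.len() + f()
}

// Borrowed trait object: dynamic dispatch without taking ownership.
fn run_borrowed(envs: &HashMap<&str, &str>, f: &mut dyn FnMut() -> usize) -> usize {
    envs.len() + f()
}

fn main() {
    let mut envs = HashMap::new();
    envs.insert("HOME", "/root");

    let name = String::from("guard");
    println!("{}", run_generic(&envs, || name.len()));
    println!("{}", run_boxed(&envs, Box::new(|| 10usize)));

    let mut calls: usize = 0;
    println!("{}", run_borrowed(&envs, &mut || {
        calls += 1;
        calls
    }));
}
```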
I would prefer the third one, because the Rust documentation suggests: "Use FnOnce as a bound when you want to accept a parameter of function-like type and only need to call it once."
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
This means that F is bound by FnOnce (i.e., F must implement FnOnce).
I'm reading some code and it has a consume function which makes it possible for me to pass my own function f.
fn consume<R, F>(self, _timestamp: Instant, len: usize, f: F) -> Result<R>
where
F: FnOnce(&mut [u8]) -> Result<R>,
I wrote some similar code, but like this:
pub fn phy_receive(
&mut self,
f: &mut dyn FnMut(&[u8])
) -> u8 {
and to be fair I don't know what is the difference, aside from FnOnce vs FnMut. What is the difference between using dyn vs a generic type parameter to specify this function?
Using dyn with types results in dynamic dispatch (hence the dyn keyword), whereas using a (constrained) generic parameter results in monomorphization.
General explanation
Dynamic dispatch
Dynamic dispatch means that a method call is resolved at runtime. It is generally more expensive in terms of runtime resources than monomorphization.
For example, say you have the following trait
trait MyTrait {
fn f(&self);
fn g(&self);
}
and a struct MyStruct which implements that trait. If you use a dyn reference to that trait (e.g. &dyn MyTrait), and pass a reference to a MyStruct object to it, what happens is the following:
A "vtable" data structure is created. This is a table containing pointers to the MyStruct implementations of f and g.
A pointer to this vtable is stored with the &dyn MyTrait reference, hence the reference will be twice its usual size; sometimes &dyn references are called "fat references" for this reason.
Calling f and g will then result in indirect function calls using the pointers stored in the vtable.
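The "twice its usual size" claim can be checked directly with std::mem::size_of. In this sketch the methods return a u32 (an assumption, unlike the signatures above) so that the calls have a visible result, and call_both is a made-up helper.

```rust
use std::mem::size_of;

trait MyTrait {
    fn f(&self) -> u32;
    fn g(&self) -> u32;
}

struct MyStruct;

impl MyTrait for MyStruct {
    fn f(&self) -> u32 {
        1
    }
    fn g(&self) -> u32 {
        2
    }
}

// Both calls go through the vtable stored alongside the data pointer.
fn call_both(t: &dyn MyTrait) -> u32 {
    t.f() + t.g()
}

fn main() {
    // A plain reference is one pointer wide...
    println!("{}", size_of::<&MyStruct>());
    // ...while a trait-object reference carries a data pointer and a vtable pointer.
    println!("{}", size_of::<&dyn MyTrait>());
    println!("{}", call_both(&MyStruct));
}
```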
Monomorphization
Monomorphization means that the code is generated at compile-time. It's similar to copy and paste. Using MyTrait and MyStruct defined in the previous section, imagine you have a function like the following:
fn sample<T: MyTrait>(t: T) { ... }
And you pass a MyStruct to it:
sample(MyStruct);
What happens is the following:
During compile time, a copy of the sample function is created specifically for the MyStruct type. In very simple terms, this is as if you copied and pasted the sample function definition and replaced T with MyStruct:
fn sample__MyStruct(t: MyStruct) { ... }
The sample(MyStruct) call gets compiled into sample__MyStruct(MyStruct).
This means that in general, monomorphization can be more expensive in terms of binary code size (since you are essentially duplicating similar chunks of code, but for different types), but there's no runtime cost like there is with dynamic dispatch.
Monomorphization is also generally more expensive in terms of compile times: because it essentially does copy-paste of code, codebases that use generics abundantly tend to compile a bit slower.
Your example
Since FnMut is just a trait, the above discussion applies directly to your question. Here's the trait definition:
pub trait FnMut<Args>: FnOnce<Args> {
extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}
Disregarding the extern "rust-call" weirdness, this is a trait just like MyTrait above. This trait is implemented by certain Rust functions and closures, so any of them is analogous to MyStruct from above. Using &dyn FnMut<...> will result in dynamic dispatch, and using <T: FnMut<...>> will result in monomorphization.
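Applied to the question's phy_receive, the two styles could look roughly like this. The Device struct, its frame field and the sum_with helper are hypothetical and exist only to make the sketch runnable.

```rust
// Hypothetical device holding one received frame; not the asker's real type.
struct Device {
    frame: Vec<u8>,
}

impl Device {
    // Dynamic dispatch: one compiled method; the callback is called via a vtable.
    fn phy_receive_dyn(&mut self, f: &mut dyn FnMut(&[u8])) -> u8 {
        f(self.frame.as_slice());
        self.frame.len() as u8
    }

    // Static dispatch: a separate copy is generated for each callback type F.
    fn phy_receive_generic<F: FnMut(&[u8])>(&mut self, mut f: F) -> u8 {
        f(self.frame.as_slice());
        self.frame.len() as u8
    }
}

// Runs the same callback through either variant and returns (byte sum, length).
fn sum_with(dynamic: bool) -> (u32, u8) {
    let mut dev = Device { frame: vec![1, 2, 3] };
    let mut sum = 0u32;
    let mut add = |bytes: &[u8]| {
        sum += bytes.iter().map(|b| u32::from(*b)).sum::<u32>();
    };
    let len = if dynamic {
        dev.phy_receive_dyn(&mut add)
    } else {
        // `&mut F` implements FnMut whenever F does, so this also compiles.
        dev.phy_receive_generic(&mut add)
    };
    (sum, len)
}

fn main() {
    println!("{:?}", sum_with(true));
    println!("{:?}", sum_with(false));
}
```

From the caller's side the two are nearly interchangeable; the difference is in how the call to f is compiled.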
My 2 cents and general advice
Certain situations will require you to use dynamic dispatch. For example, if you have a Vec of external objects implementing a certain trait, you have no choice but to use dynamic dispatch, as in Vec<Box<dyn Debug>>. If those objects are internal to your code, though, you could use an enum type and monomorphization.
If your trait contains a generic method (one that is not restricted with where Self: Sized), you will have to use monomorphization, because such traits are not object safe. A trait with an associated type can still be used as a trait object, but only with that type spelled out, as in dyn Iterator<Item = u8>.
Everything else being equal, my advice is to pick one preference and stick with it in your codebase. From what I've seen, most people tend to prefer defaulting to generics and monomorphization.
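As a rough illustration of the Vec<Box<dyn Debug>> versus enum point above, here is a sketch with made-up Cat, Dog and Pet types:

```rust
use std::fmt::Debug;

#[derive(Debug)]
struct Cat;
#[derive(Debug)]
struct Dog;

// Open set of types: anything implementing `Debug` can be stored, at the
// cost of boxing and dynamic dispatch.
fn describe_dyn(items: &[Box<dyn Debug>]) -> Vec<String> {
    items.iter().map(|item| format!("{:?}", item)).collect()
}

// Closed set of types known to this crate: an enum needs no trait object.
enum Pet {
    Cat(Cat),
    Dog(Dog),
}

fn describe_enum(items: &[Pet]) -> Vec<String> {
    items
        .iter()
        .map(|pet| match pet {
            Pet::Cat(c) => format!("{:?}", c),
            Pet::Dog(d) => format!("{:?}", d),
        })
        .collect()
}

fn main() {
    let boxed: Vec<Box<dyn Debug>> = vec![Box::new(Cat), Box::new(Dog), Box::new(42)];
    println!("{:?}", describe_dyn(&boxed));

    let pets = vec![Pet::Cat(Cat), Pet::Dog(Dog)];
    println!("{:?}", describe_enum(&pets));
}
```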
I have recently seen code using the dyn keyword:
fn foo(arg: &dyn Display) {}
fn bar() -> Box<dyn Display> {}
What does this syntax mean?
TL;DR: It's a syntax for specifying the type of a trait object and must be specified for clarity reasons.
Since Rust 1.0, traits have led a double life. Once a trait has been declared, it can be used either as a trait or as a type:
// As a trait
impl MyTrait for SomeType {}
// As a type!
impl MyTrait {}
impl AnotherTrait for MyTrait {}
As you can imagine, this double meaning can cause some confusion. Additionally, since the MyTrait type is an unsized / dynamically-sized type, this can expose people to very complex error messages.
To ameliorate this problem, RFC 2113 introduced the dyn syntax. This syntax is available starting in Rust 1.27:
use std::{fmt::Display, sync::Arc};
fn main() {
let display_ref: &dyn Display = &42;
let display_box: Box<dyn Display> = Box::new(42);
let display_arc: Arc<dyn Display> = Arc::new(42);
}
This new keyword parallels the impl Trait syntax and strives to make the type of a trait object more obviously distinct from the "bare" trait syntax.
dyn is short for "dynamic" and refers to the fact that trait objects perform dynamic dispatch. This means that the decision of exactly which function is called will occur at program run time. Contrast this to static dispatch which uses the impl Trait syntax.
The syntax without dyn was deprecated, and since the 2021 edition of Rust it is a hard error.
Why would I implement methods on a trait instead of as part of the trait?
What makes something a "trait object"?
TLDR: "dyn" allows you to store in a Box a mix of Apples and Oranges, because they all implement the same trait of Fruit, which is what your Box is using as a type constraint, instead of just a generic type. This is because Generic allows any ONE of Apple OR Orange, but not both:
Vec<Box<T>> --> Vector can hold boxes of either Apples OR Oranges structs
Vec<Box<dyn Fruit>> --> Vector can now hold a mix of boxes of Apples AND Oranges Structs
If you want to store multiple types in the same instance of a data structure, you have to use a trait object: put the trait behind a pointer and mark it with "dyn". Method calls on it are then resolved at run time, each time they are made.
Sometimes, rather than using a concrete type (String, &str, i32, etc...) or a generic (T, Vec<T>, etc...), we use a trait as the type (e.g. dyn Display). This allows us to store multiple types (all implementing the required trait) in the same data-structure instance (you will probably need to Box<> it too).
"dyn" basically tells the compiler that we don't know at compile time what the concrete type behind the trait is going to be, and that it will be determined at run time. This allows the stored values to actually be a mixture of types that all implement the trait.
For generics, the compiler substitutes one concrete type for the generic parameter of each data-structure instance. Every value stored in that same instance must then have that same type.
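Here is the Apples and Oranges example as a runnable sketch. The Fruit trait's name method and the two helper functions are assumptions added for illustration.

```rust
trait Fruit {
    fn name(&self) -> String;
}

struct Apple;
struct Orange;

impl Fruit for Apple {
    fn name(&self) -> String {
        "apple".to_string()
    }
}

impl Fruit for Orange {
    fn name(&self) -> String {
        "orange".to_string()
    }
}

// Generic: T is fixed to one concrete type per call, so the basket is homogeneous.
fn names_generic<T: Fruit>(basket: &[Box<T>]) -> Vec<String> {
    basket.iter().map(|f| f.name()).collect()
}

// Trait object: every element may be a different type implementing Fruit.
fn names_dyn(basket: &[Box<dyn Fruit>]) -> Vec<String> {
    basket.iter().map(|f| f.name()).collect()
}

fn main() {
    // Only apples: the element type is Box<Apple>.
    let apples = vec![Box::new(Apple), Box::new(Apple)];
    println!("{:?}", names_generic(&apples));

    // A mix of apples and oranges needs `dyn Fruit`.
    let mixed: Vec<Box<dyn Fruit>> = vec![Box::new(Apple), Box::new(Orange)];
    println!("{:?}", names_dyn(&mixed));
}
```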
WARNING
As with most things, the added flexibility has a cost: calls through "dyn" go through a vtable at run time and usually cannot be inlined, so there is a performance penalty compared to generics.
I found this blog post to explain this feature really clearly: https://medium.com/digitalfrontiers/rust-dynamic-dispatching-deep-dive-236a5896e49b
Relevant excerpt:
struct Service<T:Backend>{
backend: Vec<T> // Either Vec<TypeA> or Vec<TypeB>, not both
}
...
let mut backends = Vec::new();
backends.push(TypeA);
backends.push(TypeB); // <---- Type error here
vs
struct Service{
backends: Vec<Box<dyn Backend>>
}
...
let mut backends = Vec::new();
backends.push( Box::new(PositiveBackend{}) as Box<dyn Backend>);
backends.push( Box::new(NegativeBackend{}) as Box<dyn Backend>);
The dyn keyword is used to indicate that a type is a trait object. According to the Rust docs:
A trait object is an opaque value of another type that implements a
set of traits.
In other words, we do not know the specific type of the object at compile time, we just know that the object implements the trait.
Because the size of a trait object is unknown at compile time, it must be placed behind a pointer. For example, if Trait is your trait name, then you can use your trait objects in the following manner:
Box<dyn Trait>
&dyn Trait
and other pointer types
The variables/parameters which hold trait objects are fat pointers, which consist of the following components:
a pointer to the object in memory
a pointer to that object's vtable, which is a table of pointers to the actual method implementations.
See my answer on What makes something a “trait object”? for further details.
From the Rust documentation:
Rust supports powerful local type inference in the bodies of functions, but it deliberately does not perform any reasoning about types for item signatures. However, for ergonomic reasons, a very restricted secondary inference algorithm called “lifetime elision” does apply when judging lifetimes. Lifetime elision is concerned solely with inferring lifetime parameters using three easily memorizable and unambiguous rules. This means lifetime elision acts as a shorthand for writing an item signature, while not hiding away the actual types involved as full local inference would if applied to it.
I don't understand what this means. What are item signatures? What does "infer lifetime parameters" mean? Some examples or analogies would be helpful.
An item signature is the bit which gives the name and types of your function, i.e. everything you need to call it (without needing to know how it's implemented); for example:
fn foo(x: u32) -> u32;
Here's another which takes a &str reference:
fn bar<'a>(s: &'a str) -> &'a str;
In Rust, all references have an attached lifetime; this is part of the type. The above bar function says more than just "this function takes a reference to a string and returns another one". It says "this function takes a string reference, and returns another which is valid for as long as the one it's given". This is an important part of Rust's ownership system.
However, it's annoying and a pain to specify these lifetimes every time, so Rust has "lifetime elision" (i.e. "not explicitly writing them out"). All that means is that for a few very common cases, you can leave the lifetime annotations out and Rust will implicitly add them for you. This is purely a convenience for programmers so that they don't have to write so many lifetimes in "obvious" cases.
The rules are listed in the book, but for completeness they are:
Every lifetime in the function parameters which isn't otherwise specified is different. For example:
fn f(x: &T, y: &U)
means:
fn f<'a, 'b>(x: &'a T, y: &'b U)
i.e. there's no automatic link between those lifetimes.
If there's only one input lifetime, it's used for every output lifetime. For example:
struct U<'a> {} // struct with a lifetime parameter
fn f(x: &T) -> &U
becomes:
fn f<'a>(x: &'a T) -> &'a U<'a>
Otherwise, if there are multiple input lifetimes but one of them is &self or &mut self (i.e. it's a method), then all the elided output lifetimes get the same as self. This covers the common case that a method returns a reference to one of its fields. For example:
impl S {
fn get_my_item(&self, key: &str) -> &str {}
}
becomes:
fn get_my_item<'a,'b>(&'a self, key: &'b str) -> &'a str // use the self lifetime
The documentation has some more examples.
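As a small runnable illustration of the second and third rules (all names here are made up): the elided and explicit signatures of first_word mean the same thing, and lookup only compiles because the reference returned by get_my_item is tied to self rather than to key.

```rust
// Rule 2: a single input lifetime is assigned to every elided output lifetime.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

struct Store {
    item: String,
}

impl Store {
    // Rule 3: with `&self` present, the elided output lifetime is that of
    // `self`, not that of `key`.
    fn get_my_item(&self, _key: &str) -> &str {
        &self.item
    }
}

// `key` is dropped when this function returns, yet the result stays valid
// because it borrows from `store` only.
fn lookup(store: &Store) -> &str {
    let key = String::from("temporary");
    store.get_my_item(&key)
}

fn main() {
    println!("{}", first_word("hello world"));
    println!("{}", first_word_explicit("hello world"));
    let store = Store { item: "value".to_string() };
    println!("{}", lookup(&store));
}
```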