I have the module mod1.rs:
pub struct Foo;
impl Foo {}
impl Drop for Foo {
    fn drop(&mut self) {}
}
In file2.rs I wrote use mod1::Foo;.
What do I actually have in file2.rs? Only struct Foo, impl Foo? What about impl Drop for Foo?
If I get all traits for Foo in file2.rs, and I write
fn my_func(foo: Foo)..., what do I have here? Is Foo a struct or a trait (impl Foo) here?
I read the Rust book and the reference, but they only explain explicit usage and don't mention what happens with a trait that has the same name as the type (the impl). The Rust book tells you to import traits explicitly; if that's so and Drop is not imported by use mod1::Foo, this is a really, really bad thing.
In file2.rs I wrote use mod1::Foo;.
What do I actually have in file2.rs? Only struct Foo, impl Foo? What about impl Drop for Foo?
When you use a type like a struct or an enum, you get all of its inherent methods, i.e. those defined in impl Foo. You'd also be able to access any public fields on the type.
If I get all traits for Foo in file2.rs, and I write fn my_func(foo: Foo), what do I have here? Is Foo a struct or a trait (impl Foo) here?
impl Foo is not a trait. trait Bar defines a trait. impl Bar for Foo implements a trait for the type Foo. impl Foo creates inherent methods; these are not related to traits.
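For example (a minimal sketch; Bar and the method names here are only illustrative):
pub struct Foo;

impl Foo {
    // inherent method: belongs to the type itself, no trait involved
    pub fn inherent_method(&self) {}
}

pub trait Bar {
    fn trait_method(&self);
}

impl Bar for Foo {
    // trait method: only callable where the Bar trait is in scope
    fn trait_method(&self) {}
}

fn main() {
    let foo = Foo;
    foo.inherent_method(); // available wherever Foo itself is usable
    foo.trait_method();    // needs Bar in scope (here it's defined in the same file)
}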
I read the Rust book and the reference, but they only explain explicit usage and don't mention what happens with a trait that has the same name as the type (the impl). The Rust book tells you to import traits explicitly; if that's so and Drop is not imported by use mod1::Foo, this is a really, really bad thing.
That would have been a very bad decision for the language designers to make. Thankfully, they didn't do that. Importing something simply allows the code that imported it to use it; it doesn't cause code to stop existing when it isn't imported.
The compiler itself is the user of types that implement Drop, so you can think of it as if the compiler implementation had use Drop in it somewhere. That's probably not literally true, but it's a useful mental model. Just because your code doesn't import Drop doesn't mean some other code couldn't.
As mentioned elsewhere, you don't have to import Drop anyway, as it's included in the prelude.
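A quick way to convince yourself (a sketch of the two-file setup, with a println! added so the drop is visible):
// mod1.rs
pub struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        println!("Foo dropped");
    }
}

// main.rs (standing in for file2.rs)
mod mod1;
use mod1::Foo;

fn main() {
    let _foo = Foo; // no `use std::ops::Drop` anywhere in this file
}                   // prints "Foo dropped" when _foo goes out of scope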
In the following example code the trait Foo requires that the associated type X implement the Clone trait.
When using the impl Foo<X = Baz> syntax in the do_it function signature, cargo check does not complain that Baz does not implement the Clone trait.
However, cargo check does complain about this issue in the impl Foo for Bar block.
I would have expected impl Foo<X = Baz> to complain in the same way.
trait Foo {
    type X: Clone;
}

struct Bar;
struct Baz;

impl Foo for Bar {
    type X = Baz; // <- complains Baz does not impl Clone trait
}

fn do_it(foo: impl Foo<X = Baz>) {} // <- does not complain
This is not the case if X is a generic parameter. In that case, cargo check indicates that the Clone trait bound is not satisfied by foo: impl Foo<Baz>:
trait Foo<X>
where
    X: Clone,
{
}

struct Bar;
struct Baz;

fn do_it(foo: impl Foo<Baz>) {} // <- complains Baz does not impl Clone trait
Is this intended behavior and if so why?
This is described in the RFC introducing associated types, in a short sentence:
The BOUNDS and WHERE_CLAUSE on associated types are obligations for the implementor of the trait, and assumptions for users of the trait:
trait Graph {
    type N: Show + Hash;
    type E: Show + Hash;
    ...
}

impl Graph for MyGraph {
    // Both MyNode and MyEdge must implement Show and Hash
    type N = MyNode;
    type E = MyEdge;
    ...
}

fn print_nodes<G: Graph>(g: &G) {
    // here, can assume G::N implements Show
    ...
}
What this means is that the one responsible for proving that the bounds hold is not the user of the trait (do_it() in your example) but the implementor of the trait. This is in contrast to generic parameters of traits, where the proof obligation is on the user.
The difference makes sense when you look at it: with generic parameters, the types are foreign and unknown inside the trait implementation, so it must assume the bounds hold. The user of the trait, on the other hand, has concrete types for them (even if they are themselves generic, they're still concrete types from the trait's point of view), so it has to prove that the bounds hold. With associated types the story is reversed: the implementor knows the concrete type, while the user assumes a generic one (even if, as in your code, it constrains it to a specific type, in the general case it is still unknown).
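Concretely, a sketch based on your types (deriving Clone for Baz is added here so the impl side compiles):
trait Foo {
    type X: Clone;
}

#[derive(Clone)]
struct Baz;

struct Bar;

impl Foo for Bar {
    // the implementor proves the bound: this only compiles because Baz: Clone
    type X = Baz;
}

// the user assumes the bound: F::X can be cloned without any extra where clause
fn duplicate<F: Foo>(x: F::X) -> (F::X, F::X) {
    (x.clone(), x)
}

fn main() {
    let _pair = duplicate::<Bar>(Baz);
}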
Note that where bounds on associated types (type Foo where Self::Foo: Clone), which were introduced together with generic associated types (yes, I know the RFC I linked brings them up, but as far as I know they were not implemented at the time and were eventually implemented as part of GATs with different semantics), behave differently again from normal associated type bounds: the user has to prove them too (I think both sides need to prove them, but I'm not sure). This is because they're expected to be used with generic parameters on associated types, so they're similar to generic parameters on traits, or to where clauses on them.
What is the idiomatic way in Rust for a function to accept a closure as an argument, or to return a closure?
I see it can be done in at least the three ways below:
// 1
pub fn run_with_envs_guard1(envs: &HashMap<&str, &str>, f: &dyn FnOnce()) {}
// 2
pub fn run_with_envs_guard2(envs: &HashMap<&str, &str>, f: Box<dyn FnOnce()>) {}
// 3
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Are there really any differences among these 3 ways? If so, please help clarify, and which way is more idiomatic, i.e. which should I choose?
I am still learning Rust, so sorry if the ways above are bad or strange.
Maybe a more specific question: why do I need the dyn keyword in ways 1 and 2, but not in 3? From my understanding, these all need dynamic dispatch, don't they, since the actual function cannot be determined at compile time?
Abdul answers the first half of your question (and I agree completely with what he said), so I'll take a stab at the second half.
If you want to return a closure from a function, you can't return a type parameter, because that implies that you're returning an instance of any FnOnce, at the caller's choice. You can't return a &dyn FnOnce, because you (usually) need to pass ownership to the caller. You could make it work with Box<dyn FnOnce>, but that tends to be clunky to work with. When returning closures from functions, I'm partial to the impl Trait syntax.
pub fn test() -> impl FnOnce() {
    || { println!("It worked!") }
}
In argument position, writing impl FnOnce() as the type of something is equivalent to defining a type argument, as Abdul did in his answer. However, in return position, it's an entirely new feature that returns an opaque value. It says "I'm returning an FnOnce, and I'm not telling you which one it is". It's the same concept as a trait object, but without the overhead of throwing it in a box.
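Calling it is then just (a small usage sketch, assuming test is in scope):
fn main() {
    let f = test(); // all we know is that it's *some* FnOnce()
    f();            // prints "It worked!"
}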
Responding to your edit
from my understanding, these all need dynamic dispatch, don't they? as the actual function cannot be determined at compile time
This is actually not necessarily true. If you see the dyn keyword, then there's definitely a dynamic (runtime) dispatch happening. To understand your other example, though, let's consider a simple trait that doesn't have the baggage of FnOnce.
pub trait MyTrait {}

struct Foo;
struct Bar;

impl MyTrait for Foo {}
impl MyTrait for Bar {}

pub fn example<T: MyTrait>(_arg: T) {
    println!("It works!");
}

fn main() {
    example(Foo);
    example(Bar);
}
I claim there's no dynamic dispatch happening here. Rust monomorphizes functions with type parameters. That means that example is like a template function in C++. Every instantiation of it will end up being a separate function. So, really, during Rust's compilation, this will end up being more like
struct Foo;
struct Bar;

pub fn example1(_arg: Foo) {
    println!("It works!");
}

pub fn example2(_arg: Bar) {
    println!("It works!");
}

fn main() {
    example1(Foo);
    example2(Bar);
}
Two unrelated functions that happen to do something similar. Rust resolves all of the linkage statically, so there's no dispatch happening at runtime. In fact, we can prove it. Take the code I just posted above and compile it with debugging symbols on (rustc -g filename.rs). Then use a tool like nm (available on most Linux machines by default) to list all of the symbols in the linker table. Assuming you didn't turn any optimizations on, you should see two example functions. This is what they look like in my linker
0000000000005340 t _ZN10code7example17h46383f9ad372dc94E
00000000000053a0 t _ZN10code7example17h97b400359a146fcaE
or, with nm -C to demangle the function names
0000000000005340 t code::example
00000000000053a0 t code::example
Two different functions, each of which takes concrete arguments of specific types.
Your proposed FnOnce would work the same way.
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
Every closure in Rust has a distinct type, so every time this function is called, a new version of run_with_envs_guard3 will get made, specifically for that closure. That new function will know exactly what to do for the closure you just gave it. In 99% of cases, if you have optimizations turned on, these made-up local functions will get inlined and optimized out, so no harm done. But there's no dynamic dispatch here.
In the other two examples, we have a dyn FnOnce, which is more like what you'd expect coming from a traditionally object-oriented language. dyn FnOnce contains a dynamic pointer to some function somewhere that will be dispatched at runtime, the way you'd expect.
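As a side note, way 1 (&dyn FnOnce()) can't actually be called, because calling an FnOnce consumes it and you can't move it out from behind a shared reference; the boxed form in way 2 does work. A minimal sketch of the boxed variant:
use std::collections::HashMap;

pub fn run_with_envs_guard2(_envs: &HashMap<&str, &str>, f: Box<dyn FnOnce()>) {
    f(); // dispatched at runtime through the vtable stored with the box
}

fn main() {
    let envs = HashMap::new();
    run_with_envs_guard2(&envs, Box::new(|| println!("called through dyn FnOnce")));
}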
I would prefer the third one, because the Rust documentation suggests to use FnOnce as a bound when you want to accept a parameter of function-like type and only need to call it once.
pub fn run_with_envs_guard3<F: FnOnce()>(envs: &HashMap<&str, &str>, f: F) {}
This means that F is bound by FnOnce (i.e., F must implement FnOnce).
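A small usage sketch of this version (the body just calls the closure; the empty HashMap is a stand-in):
use std::collections::HashMap;

pub fn run_with_envs_guard3<F: FnOnce()>(_envs: &HashMap<&str, &str>, f: F) {
    f();
}

fn main() {
    let envs: HashMap<&str, &str> = HashMap::new();
    // each closure has its own type, so each call monomorphizes its own copy
    run_with_envs_guard3(&envs, || println!("first closure"));
    run_with_envs_guard3(&envs, || println!("second closure"));
}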
I'm reading some code and it has a consume function which makes it possible for me to pass my own function f.
fn consume<R, F>(self, _timestamp: Instant, len: usize, f: F) -> Result<R>
where
    F: FnOnce(&mut [u8]) -> Result<R>,
I wrote some similar code, but like this:
pub fn phy_receive(
    &mut self,
    f: &mut dyn FnMut(&[u8])
) -> u8 {
and to be fair I don't know what the difference is, aside from FnOnce vs FnMut. What is the difference between using dyn and a generic type parameter to specify this function?
Using dyn with types results in dynamic dispatch (hence the dyn keyword), whereas using a (constrained) generic parameter results in monomorphism.
General explanation
Dynamic dispatch
Dynamic dispatch means that a method call is resolved at runtime. It is generally more expensive in terms of runtime resources than monomorphism.
For example, say you have the following trait
trait MyTrait {
    fn f(&self);
    fn g(&self);
}
and a struct MyStruct which implements that trait. If you use a dyn reference to that trait (e.g. &dyn MyTrait), and pass a reference to a MyStruct object to it, what happens is the following:
A "vtable" data structure is created. This is a table containing pointers to the MyStruct implementations of f and g.
A pointer to this vtable is stored with the &dyn MyTrait reference, hence the reference will be twice its usual size; sometimes &dyn references are called "fat references" for this reason.
Calling f and g will then result in indirect function calls using the pointers stored in the vtable.
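Putting those three steps into code (a minimal sketch; the println! bodies are just there to make the calls observable):
trait MyTrait {
    fn f(&self);
    fn g(&self);
}

struct MyStruct;

impl MyTrait for MyStruct {
    fn f(&self) { println!("MyStruct::f"); }
    fn g(&self) { println!("MyStruct::g"); }
}

fn call_through_vtable(t: &dyn MyTrait) {
    // both calls are indirect: the function pointers come from the vtable
    t.f();
    t.g();
}

fn main() {
    call_through_vtable(&MyStruct);
}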
Monomorphism
Monomorphism means that the code is generated at compile-time. It's similar to copy and paste. Using MyTrait and MyStruct defined in the previous section, imagine you have a function like the following:
fn sample<T: MyTrait>(t: T) { ... }
And you pass a MyStruct to it:
sample(MyStruct);
What happens is the following:
During compile time, a copy of the sample function is created specifically for the MyStruct type. In very simple terms, this is as if you copied and pasted the sample function definition and replaced T with MyStruct:
fn sample__MyStruct(t: MyStruct) { ... }
The sample(MyStruct) call gets compiled into sample__MyStruct(MyStruct).
This means that in general, monomorphism can be more expensive in terms of binary code size (since you are essentially duplicating similar chunks of code, but for different types), but there's no runtime cost like there is with dynamic dispatch.
Monomorphism is also generally more expensive in terms of compile times: because it essentially does copy-paste of code, codebases that use monomorphism abundantly tend to compile a bit slower.
Your example
Since FnMut is just a trait, the above discussion applies directly to your question. Here's the trait definition:
pub trait FnMut<Args>: FnOnce<Args> {
    pub extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}
Disregarding the extern "rust-call" weirdness, this is a trait just like MyTrait above. This trait is implemented by certain Rust functions, so any of those functions is analogous to MyStruct from above. Using &dyn FnMut<...> will result in dynamic dispatch, and using <T: FnMut<...>> will result in monomorphism.
My 2 cents and general advice
Certain situations will require you to use dynamic dispatch. For example, if you have a Vec of external objects implementing a certain trait, such as Vec<Box<dyn Debug>>, you have no choice but to use dynamic dispatch. If those objects are internal to your code, though, you could use an enum type and monomorphism instead (see the sketch below).
If your trait contains a generic method, you will have to use monomorphism, because such traits are not object safe. A trait with an associated type can still be used as a trait object, but only if the associated type is spelled out (e.g. dyn Iterator<Item = u32>).
Everything else being equal, my advice is to pick one preference and stick with it in your codebase. From what I've seen, most people tend to prefer defaulting to generics and monomorphism.
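A sketch of that enum alternative (Shape and its variants are made-up names):
use std::fmt::Debug;

// closed set of internal types: a plain enum, no trait objects needed
#[derive(Debug)]
enum Shape {
    Circle(f64),
    Square(f64),
}

fn main() {
    // open set of external types: dynamic dispatch, one vtable pointer per element
    let open: Vec<Box<dyn Debug>> = vec![Box::new(1u32), Box::new("two")];
    // closed set: monomorphized code, no vtable
    let closed: Vec<Shape> = vec![Shape::Circle(1.0), Shape::Square(2.0)];
    println!("{:?} {:?}", open, closed);
}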
I have recently seen code using the dyn keyword:
fn foo(arg: &dyn Display) {}
fn bar() -> Box<dyn Display> {}
What does this syntax mean?
TL;DR: It's a syntax for specifying the type of a trait object and must be specified for clarity reasons.
Since Rust 1.0, traits have led a double life. Once a trait has been declared, it can be used either as a trait or as a type:
// As a trait
impl MyTrait for SomeType {}
// As a type!
impl MyTrait {}
impl AnotherTrait for MyTrait {}
As you can imagine, this double meaning can cause some confusion. Additionally, since the MyTrait type is an unsized / dynamically-sized type, this can expose people to very complex error messages.
To ameliorate this problem, RFC 2113 introduced the dyn syntax. This syntax is available starting in Rust 1.27:
use std::{fmt::Display, sync::Arc};

fn main() {
    let display_ref: &dyn Display = &42;
    let display_box: Box<dyn Display> = Box::new(42);
    let display_arc: Arc<dyn Display> = Arc::new(42);
}
This new keyword parallels the impl Trait syntax and strives to make the type of a trait object more obviously distinct from the "bare" trait syntax.
dyn is short for "dynamic" and refers to the fact that trait objects perform dynamic dispatch. This means that the decision of exactly which function is called will occur at program run time. Contrast this to static dispatch which uses the impl Trait syntax.
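For instance, a small sketch of the contrast (print_dyn and print_impl are made-up names):
use std::fmt::Display;

// dynamic dispatch: one compiled function, the concrete type is decided at run time
fn print_dyn(value: &dyn Display) {
    println!("{}", value);
}

// static dispatch: a separate copy is monomorphized for each concrete argument type
fn print_impl(value: impl Display) {
    println!("{}", value);
}

fn main() {
    print_dyn(&42);
    print_impl("hello");
}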
The syntax without dyn is now deprecated and it's likely that in a subsequent edition of Rust it will be removed.
See also: Why would I implement methods on a trait instead of as part of the trait? and What makes something a "trait object"?
TLDR: "dyn" allows you to store in a Box a mix of Apples and Oranges, because they all implement the same trait of Fruit, which is what your Box is using as a type constraint, instead of just a generic type. This is because Generic allows any ONE of Apple OR Orange, but not both:
Vec<Box<T>> --> the vector can hold boxes of either Apple OR Orange structs, but not a mix
Vec<Box<dyn Fruit>> --> the vector can now hold a mix of boxed Apple AND Orange structs
If you want to store multiple types in the same instance of a data structure, you have to use a trait as the element type and tag it as dyn, which causes the concrete type behind it to be resolved at runtime.
Sometimes, rather than using a concrete type (String, &str, i32, etc...) or a generic (T, Vec<T>, etc...), we use a trait as the type constraint (e.g. TryFrom). This allows us to store multiple types (all implementing the required trait) in the same data-structure instance (you will probably need to Box<> them too).
dyn basically tells the compiler that we don't know at compile time which concrete type will stand in for the trait, and that it will be determined at runtime. This allows the final collection to actually be a mixture of types that all implement the trait.
For generics, the compiler hard-codes a concrete type in place of the generic parameter when the data structure is instantiated. Every later call that stores data in that same data structure is expected to use the same type as the first call.
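A sketch of the fruit analogy above (Fruit, Apple, and Orange are illustrative names):
trait Fruit {}

struct Apple;
struct Orange;

impl Fruit for Apple {}
impl Fruit for Orange {}

fn main() {
    // one Vec holding a mix of Apples and Oranges behind `dyn Fruit`
    let basket: Vec<Box<dyn Fruit>> = vec![Box::new(Apple), Box::new(Orange)];
    println!("{} fruits in the basket", basket.len());
}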
WARNING
As with all things, added flexibility comes at a cost, and this case definitely has a performance penalty: calls through a trait object are dispatched at runtime via the vtable.
I found this blog post to explain this feature really clearly: https://medium.com/digitalfrontiers/rust-dynamic-dispatching-deep-dive-236a5896e49b
Relevant excerpt:
struct Service<T: Backend> {
    backend: Vec<T> // Either Vec<TypeA> or Vec<TypeB>, not both
}
...
let mut backends = Vec::new();
backends.push(TypeA);
backends.push(TypeB); // <---- Type error here
vs
struct Service {
    backends: Vec<Box<dyn Backend>>
}
...
let mut backends = Vec::new();
backends.push(Box::new(PositiveBackend{}) as Box<dyn Backend>);
backends.push(Box::new(NegativeBackend{}) as Box<dyn Backend>);
The dyn keyword is used to indicate that a type is a trait object. According to the Rust docs:
A trait object is an opaque value of another type that implements a
set of traits.
In other words, we do not know the specific type of the object at compile time, we just know that the object implements the trait.
Because the size of a trait object is unknown at compile time, it must be placed behind a pointer. For example, if Trait is your trait's name, then you can use your trait objects in the following manner:
Box<dyn Trait>
&dyn Trait
and other pointer types
The variables/parameters which hold trait objects are fat pointers (demonstrated below), which consist of the following components:
a pointer to the object in memory
a pointer to that object’s vtable; a vtable is a table of pointers to the actual method implementations.
See my answer on What makes something a “trait object”? for further details.
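The fat-pointer part is easy to observe with a size check (a small sketch using std::mem::size_of):
use std::fmt::Display;
use std::mem::size_of;

fn main() {
    // a plain reference is one pointer wide
    println!("&u32:         {} bytes", size_of::<&u32>());
    // a trait-object reference carries a data pointer plus a vtable pointer
    println!("&dyn Display: {} bytes", size_of::<&dyn Display>());
}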
I have a trait Foo inheriting from another trait Bar. Bar has an associated type Baz. Foo constrains Baz such that Baz must implement Hoge.
trait Hoge {}
trait Bar {
    type Baz;
}
trait Foo: Bar where Self::Baz: Hoge {}
However, when I define a generic function requiring the generic type T to implement Foo,
// [DESIRED CODE]
fn fizz<T: Foo>(buzz: T) {
    // ...
}
rustc complains with E0277 unless I constrain T explicitly:
fn fizz<T: Foo>(buzz: T) where T::Baz: Hoge {
    // ...
}
I do not understand why I need to do this. I would like to be able to write [DESIRED CODE]. What is the recommended way to do this?
Sadly (or not), you have to repeat the bounds.
Last year I opened an issue thinking that the type checker was being inconsistent. The code is similar to yours.
#arielb1 closed the issue and said that this was the intended behavior and gave this explanation:
The thing is that we don't want too many bounds to be implicitly available for functions, as this can lead to fragility with distant changes causing functions to stop compiling. There are basically 3 kinds of bounds available to a function:
bounds from explicit where-clauses - e.g. T: B when you have that clause. This includes the "semi-explicit" Sized bound.
bounds from supertraits of explicit where-clauses - a where-clause adds bounds for its supertraits (as trait B: A, the T: B bound adds a T: A bound).
bounds from the lifetime properties of arguments (outlives/implicator/implied bounds). These are only lifetime bounds, and irrelevant for the current problem. rust-lang/rfcs#1214 involved them a great deal.
If your bound isn't in the list, you will have to add it explicitly if you want to use it. I guess this should be a FAQ entry.
Today I opened an issue to request that this information be added to the docs.
It is possible to work around this behaviour by using another associated type in Foo, since the compiler accepts implicit bounds when they are part of the associated type definition (playground):
trait Foo: Bar<Baz = Self::HogeBaz> {
    type HogeBaz: Hoge;
}
The new associated type can be hidden in a helper trait to avoid having to include it in every implementation. Full example (with renaming for clarity)
(playground)
trait Bound {
    fn bound();
}

trait Trait {
    type Type;
}

trait BoundedTypeHelper: Trait<Type = Self::BoundedType> {
    type BoundedType: Bound;
}

impl<T> BoundedTypeHelper for T
where
    T: Trait,
    Self::Type: Bound,
{
    type BoundedType = Self::Type;
}

trait UserTrait: BoundedTypeHelper {}

fn fizz<T: UserTrait>() {
    T::Type::bound()
}
I am with you in thinking that the original where-based bound ought to be treated as part of the trait definition and applied implicitly. It feels very arbitrary that bounding associated types inline works but where clauses do not.