Why is the `Sized` bound necessary in this trait? - rust

I have a trait with two associated functions:
trait WithConstructor: Sized {
fn new_with_param(param: usize) -> Self;
fn new() -> Self {
Self::new_with_param(0)
}
}
Why does the default implementation of the second method (new()) force me to put the Sized bound on the type? I think it's because of the stack pointer manipulation, but I'm not sure.
If the compiler needs to know the size to allocate memory on the stack,
why does the following example not require Sized for T?
struct SimpleStruct<T> {
field: T,
}
fn main() {
let s = SimpleStruct { field: 0u32 };
}

As you probably already know, types in Rust can be sized and unsized. Unsized types, as their name suggests, do not have a size required to store values of this type which is known to the compiler. For example, [u32] is an unsized array of u32s; because the number of elements is not specified anywhere, the compiler doesn't know its size. Another example is a bare trait object type, for example, Display, when it is used directly as a type:
let x: Display = ...;
In this case, the compiler does not know which type is actually used here, it is erased, therefore it does not know the size of values of these types. The above line is not valid - you can't make a local variable without knowing its size (to allocate enough bytes on the stack), and you can't pass the value of an unsized type into a function as an argument or return it from one.
Unsized types can be used through a pointer, however, which can carry additional information - the length of available data for slices (&[u32]) or a pointer to a virtual table (Box<SomeTrait>). Because pointers always have a fixed and known size, they can be stored in local variables and be passed into or returned from functions.
Given any concrete type you can always say whether it is sized or unsized. With generics, however, a question arises - is some type parameter sized or not?
fn generic_fn<T>(x: T) -> T { ... }
If T is unsized, then such a function definition is incorrect, as you can't pass unsized values around directly. If it is sized, then all is OK.
In Rust all generic type parameters are sized by default everywhere - in functions, in structs and in traits. They have an implicit Sized bound; Sized is a trait for marking sized types:
fn generic_fn<T: Sized>(x: T) -> T { ... }
This is because in the overwhelming number of times you want your generic parameters to be sized. Sometimes, however, you'd want to opt-out of sizedness, and this can be done with ?Sized bound:
fn generic_fn<T: ?Sized>(x: &T) -> u32 { ... }
Now generic_fn can be called like generic_fn("abcde"), and T will be instantiated with str which is unsized, but that's OK - this function accepts a reference to T, so nothing bad happens.
However, there is another place where question of sizedness is important. Traits in Rust are always implemented for some type:
trait A {
fn do_something(&self);
}
struct X;
impl A for X {
fn do_something(&self) {}
}
However, this is only necessary for convenience and practicality purposes. It is possible to define traits to always take one type parameter and to not specify the type the trait is implemented for:
// this is not actual Rust but some Rust-like language
trait A<T> {
fn do_something(t: &T);
}
struct X;
impl A<X> {
fn do_something(t: &X) {}
}
That's how Haskell type classes work, and, in fact, that's how traits are actually implemented in Rust at a lower level.
Each trait in Rust has an implicit type parameter, called Self, which designates the type this trait is implemented for. It is always available in the body of the trait:
trait A {
fn do_something(t: &Self);
}
This is where the question of sizedness comes into the picture. Is the Self parameter sized?
It turns out that no, Self is not sized by default in Rust. Each trait has an implicit ?Sized bound on Self. One of the reasons this is needed because there are a lot of traits which can be implemented for unsized types and still work. For example, any trait which only contains methods which only take and return Self by reference can be implemented for unsized types. You can read more about motivation in RFC 546.
Sizedness is not an issue when you only define the signature of the trait and its methods. Because there is no actual code in these definitions, the compiler can't assume anything. However, when you start writing generic code which uses this trait, which includes default methods because they take an implicit Self parameter, you should take sizedness into account. Because Self is not sized by default, default trait methods can't return Self by value or take it as a parameter by value. Consequently, you either need to specify that Self must be sized by default:
trait A: Sized { ... }
or you can specify that a method can only be called if Self is sized:
trait WithConstructor {
fn new_with_param(param: usize) -> Self;
fn new() -> Self
where
Self: Sized,
{
Self::new_with_param(0)
}
}

Let's see what would happen if you did this with an unsized type.
new() moves the result of your new_with_param(_) method to the caller. But unless the type is sized, how many bytes should be moved? We simply cannot know. That's why move semantics require Sized types.
Note: The various Boxes have been designed to offer runtime services for exactly this problem.

Related

How to define an adapter trait where some implementations need a lifetime on &self?

I'm writing a set of benchmarks for different key-value stores, and would like to have a single adapter trait that I can use in all the benchmarks, and then implement it for each key-value store.
This worked well for two of them. However, the third required me to add a lifetime on my trait, and after fighting the borrow checker for an entire day, I still can't seem to get it right.
I've distilled it down to this minimal repro: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=54fec74cb70c63c03f25ec7a9dfc7e60
What I don't understand is why the borrow on txn lives longer than the scope of benchmark(). It seems to me that it should live for only that one line.
How can I define the AdapterTransaction trait to resolve this, that still allows implementations to choose their own return type?
edit
added that I need to be able to use the AdapterTransaction implementations with a factory trait
The main problem in your first playground is the lifetime on &self being the same as the generic lifetime on the trait.
pub trait AdapterTransaction<'a, T: AsRef<[u8]>> {
fn get(&'a self, key: &[u8]) -> Option<T>;
}
Because they are the same, it requires the borrow of the underlying type to live at least as long as the type itself. This isn't true because even though the type is owned, the borrow would only last for the duration of the function call. In benchmark<'a,...>(), the lifetime 'a is picked by the caller, and there is no way a borrow within that function can be long enough. There would have been a quick fix to remove the 'a parameter on benchmark and replace it with a higher ranked trait bound (playground).
fn benchmark<U: AsRef<[u8]>, T: for<'a> AdapterTransaction<'a, U>>(txn: T)
In this example, 'a isn't chosen by the caller anymore, so the compiler is free to use a valid lifetime for the call.
As for the 2nd part of your question, traits can define associated types which can change depending on the implementation. You could have a trait that has an associated Output, which can change for each implemented type. There is a big difference with doing a generic param vs an associated type since in the former case, you are allowed to implement multiple generic variants of a trait for the same type. (This is how From<T> works, for example).
pub trait AdapterTransaction<'a> {
type Output;
fn get(&'a self, key: &[u8]) -> Option<Self::Output>;
}
impl<'a> AdapterTransaction<'a> for AdapterImpl {
type Output = &'a [u8];
fn get(&'a self, key: &[u8]) -> Option<Self::Output> {
Some(self.txn.get(&key))
}
}
fn benchmark<T>(txn: T)
where
for<'a> T: AdapterTransaction<'a>,
{
let _ = txn.get(&[]).unwrap();
}
Edit: Some of my initial assumptions weren't exact, it's not necessary to implement on &'a Type if the trait lifetime is used in a non-conflicting way.

Why AsRef can not be used as parameter type annotation but fine with generic declaration?

playground
use std::path::Path;
// fn f1(p: AsRef<Path>) {
// println!("{}", p.as_ref().display());
// }
fn f2<P: AsRef<Path>>(p: P) {
println!("{}", p.as_ref().display());
}
fn main() {
f2("/tmp/test.jpg");
}
The compiler will complain about the size of Path is not known for f1
AsRef is a trait, not a type. Your definition of f1() uses it in place of a type. This legacy syntax is short for dyn AsRef<Path>, and denotes an arbitrary type implementing the trait AsRef<Path>, with dynamic dispatch at runtime. The size of an arbitrary type implementing the trait obviously isn't known at compile time, so you can use trait objects only behind pointers, e.g. &dyn AsRef<Path> or Box<dyn AsRef<Path>>. The compiler does not complain about the size of Path being unknown, it complains about the size of the trait object being unknown.
Trait bounds on generic types, on the other hand, expect traits, not types. P again is an arbitrary type implementing AsRef<Path>, but this arbitrary type needs to be known at compile time, and the compiler emits a new compiled version of f2() for every type that is actually used. This way, the size of each individual type is known at compile time.

Why does returning `Self` in trait work, but returning `Option<Self>` requires `Sized`?

This trait definition compiles fine:
trait Works {
fn foo() -> Self;
}
This, however, does lead to an error:
trait Errors {
fn foo() -> Option<Self>;
}
error[E0277]: the size for values of type `Self` cannot be known at compilation time
--> src/lib.rs:6:5
|
6 | fn foo() -> Option<Self>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `Self`
= note: to learn more, visit <https://doc.rust-lang.org/book/second-edition/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>
= help: consider adding a `where Self: std::marker::Sized` bound
= note: required by `std::option::Option`
With the : Sized supertrait bound, it works.
I know that the Self type in traits is not automatically bound to be Sized. And I understand that Option<Self> cannot be returned (via the stack) unless it is sized (which, in turn, requires Self to be sized). However, the same would go for Self as return type, right? It also cannot be stored on the stack unless it's sized.
Why doesn't the first trait definition already trigger that error?
(This question is related, but it doesn't answer my exact question – unless I didn't understand it.)
There are two sets of checks happening here, which is why the difference appears confusing.
Each type in the function signature is checked for validity. Option inherently requires T: Sized. A return type that doesn't require Sized is fine:
trait Works {
fn foo() -> Box<Self>;
}
The existing answer covers this well.
Any function with a body also checks that all of the parameters are Sized. Trait functions without a body do not have this check applied.
Why is this useful? Allowing unsized types to be used in trait methods is a key part of allowing by-value trait objects, a very useful feature. For example, FnOnce does not require that Self be Sized:
pub trait FnOnce<Args> {
type Output;
extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}
fn call_it(f: Box<dyn FnOnce() -> i32>) -> i32 {
f()
}
fn main() {
println!("{}", call_it(Box::new(|| 42)));
}
A big thanks to pnkfelix and nikomatsakis for answering my questions on this topic.
It's in the error message:
= note: required by `std::option::Option`
Option requires the type to be Sized because it allocates on the stack. All type parameters to a concrete type definition are bound to Sized by default. Some types choose to opt out with a ?Sized bound but Option does not.
Why doesn't the first trait definition already trigger that error?
I think this is a conscious design decision, due to history, future-proofing and ergonomics.
First of all, Self is not assumed to be Sized in a trait definition because people are going to forget to write where Self: ?Sized, and those traits would be less useful. Letting traits be as flexible as possible by default is a sensible design philosophy; push errors to impls or make developers explicitly add constraints where they are needed.
With that in mind, imagine that trait definitions did not permit unsized types to be returned by a method. Every trait method that returns Self would have to also specify where Self: Sized. Apart from being a lot of visual noise, this would be bad for future language development: if unsized types are allowed to be returned in the future (e.g. to be used with placement-new), then all of these existing traits would be overly constrained.
The issue with Option is just the tip of the iceberg and the others have already explained that bit; I'd like to elaborate on your question in the comment:
is there a way I can implement fn foo() -> Self with Self not being
Sized? Because if there is no way to do that, I don't see the point in
allowing to return Self without a Sized bound.
That method indeed makes it impossible (at least currently) to utilize the trait as a trait object due to 2 issues:
Method has no receiver:
Methods that do not take a self parameter can't be called since there
won't be a way to get a pointer to the method table for them.
Method references the Self type in its arguments or return type:
This renders the trait not object-safe, i.e. it is impossible to create a trait object from it.
You are still perfectly able to use it for other things, though:
trait Works {
fn foo() -> Self;
}
#[derive(PartialEq, Debug)]
struct Foo;
impl Works for Foo {
fn foo() -> Self {
Foo
}
}
fn main() {
assert_eq!(Foo::foo(), Foo);
}

How do I use unsized types/traits in Rust? [duplicate]

This question already has answers here:
What is the correct way to return an Iterator (or any other trait)?
(2 answers)
Closed 6 years ago.
I have the following
trait T {}
type Iter = fn() -> Iterator<Item = T>;
fn func(iter: Iter) {
for a in iter() {
// ...
}
}
I would like iter to return an Iterator with move semantics, so I shouldn't have to return &Iterator. Problem is, Iterator is a trait, so it's unsized. The above code gets a compile error saying that Iterable does not satisfy the Sized trait because all local variables have to be statically sized.
On top of this, T is also a trait, therefore unsized, so I can't bind a to it either because it's unsized.
I'm new to Rust so this is really tripping me up. How do I use unsized types?
You probably shouldn't use unsized types at all here. Use generics instead:
trait Foo {} // T is a common name for type parameters, so use a different name
fn func<I, F>(iter: F) where I: Iterator, I::Item: Foo, F: FnOnce() -> I {
for a in iter() {
// ...
}
}
Check out the book chapters on generics and Fn* traits.
And by the way, accepting a function doesn't seem very useful, since you just call it once at the start. It seems much simpler to write the function like this:
fn func<I>(iter: I) where I: Iterator, I::Item: Foo {
for a in iter {
// ...
}
}
Unsized types only "work" if there is an additional level of indirection involved. Example:
trait Ttait {}
fn foo(x: Trait) {} // error: x would be unsized which is not allowed
fn bar(x: &Trait) {} // OK: x is just a reference
fn baz(x: Box<Trait>) {} // OK: x is just an "owning pointer"
But instead of using traits as types, you should probably prefer to use traits as bounds for type parameters of generic functions and structs:
fn generic<Type: Trait>(x: Type) {}
This is actually not a function but a family of functions. For each type Type that implements Trait the compiler will create a special version of generic if needed. If you have some value x of a concrete type that implements Trait and write generic(x) you will be invoking a function that is created especially for that type which is unsized. This is called "monomorphization".
So, traits have two uses. They act as unsized types (for trait objects, dynamic dispatch, dynamic polymorphism) as well as type bounds for generic code (static dispatch, static polymorphism). My suggestion would be to avoid trait objects if you can and prefer generic code.
Without knowing exactly what you're trying to do, let me show you one possibility:
trait Trait {
fn foo(&self);
}
fn func<'x,I>(iterable: I) where I: IntoIterator<Item = &'x Trait> {
for a in iterable {
a.foo();
}
}
struct Example;
impl Trait for Example {
fn foo(&self) {
println!("Example::foo()");
}
}
fn main() {
let arr = [Example, Example];
func(arr.iter().map(|x| x as &Trait));
}
This is an example which demonstrates both kinds of trait uses. func is generic over the kinds of iterables it takes but uses dynamic dispatch to invoke the right foo function. This is probably closest to what you were trying to do.
You could also go "full generic" (avoiding the dynamic dispatch for the foo calls. See delnan's answer) or "full dynamic" (making func agnostic about what kind of iterator it actually deals with at runtime). With more context we can probably make better suggestions.

How do I deal with wrapper type invariance in Rust?

References to wrapper types like &Rc<T> and &Box<T> are invariant in T (&Rc<T> is not a &Rc<U> even if T is a U). A concrete example of the issue (Rust Playground):
use std::rc::Rc;
use std::rc::Weak;
trait MyTrait {}
struct MyStruct {
}
impl MyTrait for MyStruct {}
fn foo(rc_trait: Weak<MyTrait>) {}
fn main() {
let a = Rc::new(MyStruct {});
foo(Rc::downgrade(&a));
}
This code results in the following error:
<anon>:15:23: 15:25 error: mismatched types:
expected `&alloc::rc::Rc<MyTrait>`,
found `&alloc::rc::Rc<MyStruct>`
Similar example (with similar error) with Box<T> (Rust Playground):
trait MyTrait {}
struct MyStruct {
}
impl MyTrait for MyStruct {}
fn foo(rc_trait: &Box<MyTrait>) {}
fn main() {
let a = Box::new(MyStruct {});
foo(&a);
}
In these cases I could of course just annotate a with the desired type, but in many cases that won't be possible because the original type is needed as well. So what do I do then?
What you see here is not related to variance and subtyping at all.
First, the most informative read on subtyping in Rust is this chapter of Nomicon. You can find there that in Rust subtyping relationship (i.e. when you can pass a value of one type to a function or a variable which expects a variable of different type) is very limited. It can only be observed when you're working with lifetimes.
For example, the following piece of code shows how exactly &Box<T> is (co)variant:
fn test<'a>(x: &'a Box<&'a i32>) {}
fn main() {
static X: i32 = 12;
let xr: &'static i32 = &X;
let xb: Box<&'static i32> = Box::new(xr); // <---- start of box lifetime
let xbr: &Box<&'static i32> = &xb;
test(xbr); // Covariance in action: since 'static is longer than or the
// same as any 'a, &Box<&'static i32> can be passed to
// a function which expects &'a Box<&'a i32>
//
// Note that it is important that both "inner" and "outer"
// references in the function signature are defined with
// the same lifetime parameter, and thus in `test(xbr)` call
// 'a gets instantiated with the lifetime associated with
// the scope I've marked with <----, but nevertheless we are
// able to pass &'static i32 as &'a i32 because the
// aforementioned scope is less than 'static, therefore any
// shared reference type with 'static lifetime is a subtype of
// a reference type with the lifetime of that scope
} // <---- end of box lifetime
This program compiles, which means that both & and Box are covariant over their respective type and lifetime parameters.
Unlike most of "conventional" OOP languages which have classes/interfaces like C++ and Java, in Rust traits do not introduce subtyping relationship. Even though, say,
trait Show {
fn show(&self) -> String;
}
highly resembles
interface Show {
String show();
}
in some language like Java, they are quite different in semantics. In Rust bare trait, when used as a type, is never a supertype of any type which implements this trait:
impl Show for i32 { ... }
// the above does not mean that i32 <: Show
Show, while being a trait, indeed can be used in type position, but it denotes a special unsized type which can only be used to form trait objects. You cannot have values of the bare trait type, therefore it does not even make sense to talk about subtyping and variance with bare trait types.
Trait objects take form of &SomeTrait or &mut SomeTrait or SmartPointer<SomeTrait>, and they can be passed around and stored in variables and they are needed to abstract away the actual implementation of the trait. However, &T where T: SomeTrait is not a subtype of &SomeTrait, and these types do not participate in variance at all.
Trait objects and regular pointers have incompatible internal structure: &T is just a regular pointer to a concrete type T, while &SomeTrait is a fat pointer which contains a pointer to the original value of a type which implements SomeTrait and also a second pointer to a vtable for the implementation of SomeTrait of the aforementioned type.
The fact that passing &T as &SomeTrait or Rc<T> as Rc<SomeTrait> works happens because Rust does automatic coercion for references and smart pointers: it is able to construct a fat pointer &SomeTrait for a regular reference &T if it knows T; this is quite natural, I believe. For instance, your example with Rc::downgrade() works because Rc::downgrade() returns a value of type Weak<MyStruct> which gets coerced to Weak<MyTrait>.
However, constructing &Box<SomeTrait> out of &Box<T> if T: SomeTrait is much more complex: for one, the compiler would need to allocate a new temporary value because Box<T> and Box<SomeTrait> has different memory representations. If you have, say, Box<Box<T>>, getting Box<Box<SomeTrait>> out of it is even more complex, because it would need creating a new allocation on the heap to store Box<SomeTrait>. Thus, there are no automatic coercions for nested references and smart pointers, and again, this is not connected with subtyping and variance at all.
In the case of Rc::downgrade this is actually just a failure of the type inference in this particular case, and will work if it is done as a separate let:
fn foo(rc_trait: Weak<MyTrait>) {}
fn main() {
let a = Rc::new(MyStruct {});
let b = Rc::downgrade(&a);
foo(b);
}
Playground
For Box<T> it is very likely you don't actually want a reference to the box as the argument, but a reference to the contents. In which case there is no invariance to deal with:
fn foo(rc_trait: &MyTrait) {}
fn main() {
let a = Box::new(MyStruct {});
foo(a.as_ref());
}
Playground
Similarly, for the case with Rc<T>, if you write a function that takes an Rc<T> you probably want a clone (i.e. a reference counted reference), and not a normal reference:
fn foo(rc_trait: Rc<MyTrait>) {}
fn main() {
let a = Rc::new(MyStruct {});
foo(a.clone());
}
Playground

Resources