This question already has answers here:
What does a lifetime mean when returning a conservative impl trait?
(2 answers)
Closed 8 months ago.
I am reading the async book. In section async lifetimes there is a code snippet whose grammar I am not familiar with:
fn foo_expanded<'a>(x: &'a u8) -> impl Future<Output = u8> + 'a {
async move { *x }
}
In impl Future<Output = u8> + 'a, what is impl Trait + 'lifetime here?
Update: I am more asking what it is instead of the lifetime logic explanation. A definition from the official doc will be
much appreciated.
what is impl Trait + 'lifetime here?
It is an Impl trait, here particularly used as an abstract return type
Impl trait
Syntax
ImplTraitType : impl TypeParamBounds
ImplTraitTypeOneBound : impl TraitBound
impl Trait provides ways to specify unnamed but concrete types that
implement a specific trait. It can appear in two sorts of places:
argument position (where it can act as an anonymous type parameter to
functions), and return position (where it can act as an abstract
return type).
trait Trait {}
// argument position: anonymous type parameter
fn foo(arg: impl Trait) {
}
// return position: abstract return type
fn bar() -> impl Trait {
}
Where the grammar of TypeParamBounds allows both trait and lifetime bounds
Trait and lifetime bounds
Syntax
TypeParamBounds :
TypeParamBound ( + TypeParamBound )* +?
TypeParamBound :
Lifetime | TraitBound
TraitBound :
?? ForLifetimes? TypePath
| ( ?? ForLifetimes? TypePath )
LifetimeBounds :
( Lifetime + )* Lifetime?
Lifetime :
LIFETIME_OR_LABEL
| 'static
| '_
Particularly noting that the grammar of TypeParamBounds is one or optionally several TypeParamBounds combined, each of which in turn is either TraitBound or Lifetime (bounds).
Specifying Multiple Trait Bounds with the + Syntax describe in some detail that we may combine more than one trait bound, but the same applies for lifetime bounds (where the grammar allows it). One could argue that the section should also mention lifetime bounds.
Not entirely familiar, but logically it makes sense:
The function returns a Future that, when completed returns a byte. When the async fn is actually executed, it moves x, which is a reference with a lifetime. Technically x could be dropped before the Future is completed. To prevent this the lifetime trait guarantees that the return value of the function lives at least as long as x
Related
I was writing some code where I want to use an iterator, or its reversed version depending on a flag, but the straightforward code gives an error
pub fn eggs<I,T>(iter:I)->Box<dyn Iterator<Item=T>>
where I:Iterator<Item=T>+DoubleEndedIterator
{
Box::new(iter.rev())
}
pub fn bacon<I,T>(iter:I, reverse:bool) -> Box<dyn Iterator<Item=T>>
where I:Iterator<Item=T>+DoubleEndedIterator
{
if reverse {
Box::new(iter.rev())
} else {
Box::new(iter)
}
}
fn main()
{
let pants:String = "pants".into();
eggs(pants.chars());
}
fails to compile:
error[E0310]: the parameter type `I` may not live long enough
--> src/main.rs:5:5
|
2 | pub fn eggs<I,T>(iter:I)->Box<dyn Iterator<Item=T>>
| - help: consider adding an explicit lifetime bound...: `I: 'static`
...
5 | Box::new(iter.rev())
| ^^^^^^^^^^^^^^^^^^^^ ...so that the type `Rev<I>` will meet its required lifetime bounds
With my limited understanding of Rust, I'm not sure where those lifetime bounds are coming from. There aren't any on the Iterator trait, or the Rev struct, and the parameter is being moved.
What is the proper way to declare these sorts of functions given that 'static isn't really an option.
rust playground
This doesn't have to do with .rev() at all, but with returning Box<dyn Iterator>:
// error[E0310]: the parameter type `I` may not live long enough
fn boxed_iter<I, T>(iter: I) -> Box<dyn Iterator<Item = T>>
// - help: consider adding an explicit lifetime bound...: `I: 'static`
where
I: Iterator<Item = T>,
{
Box::new(iter)
// ^^^^^^^^^^^^^^ ...so that the type `I` will meet its required lifetime bounds
}
The reason for this is that trait objects like Box<dyn Trait> have an implicit 'static lifetime if not specified. So when the compiler tries to cast Box<I> to Box<dyn Iterator>, it fails if I is does not also have a 'static lifetime. (There are some more specific rules if the trait contains lifetimes itself; you can read about those in more detail here.)
If you instead want a shorter lifetime, you need to specify it explicitly as Box<dyn 'a + Trait>. So for example:
fn boxed_iter<'a, I, T>(iter: I) -> Box<dyn 'a + Iterator<Item = T>>
where
I: 'a + Iterator<Item = T>,
{
Box::new(iter)
}
Frxstrem's answer is excellent. I just want to add that, if you know that the return value of your function has a specific concrete type, you can use the special impl trait syntax.
In the case of your eggs function, the return type is probably something like Rev<I>. That type, on its own, isn't very illuminating, but we know that such a type exists and the only thing we care about is that it's an iterator, so we can write
pub fn eggs<I,T>(iter:I) -> impl Iterator<Item=T> + DoubleEndedIterator
where I: Iterator<Item=T> + DoubleEndedIterator {
iter.rev()
}
Now the compiler still understands that there is a single concrete type and will act accordingly (no need to box the value or have dynamic dispatch), but we as the programmers still only have to care about the Iterator and DoubleEndedIterator aspects of it. Zero-cost abstraction at its finest.
Your bacon function can't benefit from this, as it could return either an I or a Rev<I> depending on input, so the dynamic dispatch is actually necessary. In that case, you'll need to follow Frxstrem's answer to correctly box your iterator.
use std::iter::Iterator;
trait ListTerm<'a> {
type Iter: Iterator<Item = &'a u32>;
fn iter(&'a self) -> Self::Iter;
}
enum TermValue<'a, LT>
where
LT: ListTerm<'a> + Sized + 'a,
{
Str(LT),
}
error[E0392]: parameter `'a` is never used
--> src/main.rs:8:16
|
8 | enum TermValue<'a, LT>
| ^^ unused type parameter
|
= help: consider removing `'a` or using a marker such as `std::marker::PhantomData`
'a clearly is being used. Is this a bug, or are parametric enums just not really finished? rustc --explain E0392 recommends the use of PhantomData<&'a _>, but I don't think there's any opportunity to do that in my use case.
'a clearly is being used.
Not as far as the compiler is concerned. All it cares about is that all of your generic parameters are used somewhere in the body of the struct or enum. Constraints do not count.
What you might want is to use a higher-ranked lifetime bound:
enum TermValue<LT>
where
for<'a> LT: 'a + ListTerm<'a> + Sized,
{
Str(LT),
}
In other situations, you might want to use PhantomData to indicate that you want a type to act as though it uses the parameter:
use std::marker::PhantomData;
struct Thing<'a> {
// Causes the type to function *as though* it has a `&'a ()` field,
// despite not *actually* having one.
_marker: PhantomData<&'a ()>,
}
And just to be clear: you can use PhantomData in an enum; put it in one of the variants:
enum TermValue<'a, LT>
where
LT: 'a + ListTerm<'a> + Sized,
{
Str(LT, PhantomData<&'a ()>),
}
DK. answered how to circumvent the issue (by using PhantomData as suggested), and hinted that the issue was that 'a was unused in the definition, but why would the compiler care about that?
'a is a lifetime marker. It is used by the borrow-checker to identify the relationship between the lifetime of different objects, as well as their borrow status.
When borrowing an object, you may borrow it either mutably (&mut T) or immutably (&T), and in accordance with the Mutability XOR Aliasing principle underpinning Rust's memory safety it changes everything:
You can have multiple concurrent &T
You can only have a single &mut T, and it excludes concurrent &T
When you parameterize your struct or enum with 'a, you announce your intention to borrow something whose lifetime will be some 'a. You do not, however, announce whether you will be borrowing mutably or immutably, and this detail is critical.
The compiler, therefore, will peer at the internals of your data type and check whether you use a mutable or immutable reference to deduce, by itself, which kind of borrow will occur when you use the data type.
And here, because 'a is unused, it cannot find any such use and therefore cannot compile your code.
It is arguable whether the compiler peering inside the data type is a good thing or not, since changing the internals of this data type (from &T to &mut T) could lead to compilation failures without changing the type interface.
It is important, thus, to remember that how you use the generic parameters (owning, borrowing mutably or borrowing immutably) is NOT an implementation detail.
If you aren't using an associated type (like <LT as ListTerm<'a>>::Iter) in the enum definition, you probably don't need to make 'a a parameter at all.
I assume you want the LT: ListTerm<'a> bound so that you can write one or more fns or an impl that uses LT as a ListTerm. In which case, you can easily parameterize the type with just <LT>, and put the 'a generic and trait bound only on the items that require it:
trait ListTerm<'a> {
type Iter: Iterator<Item = &'a u32>;
fn iter(&'a self) -> Self::Iter;
}
enum TermValue<LT> { // no 'a parameter here...
Str(LT),
}
impl<'a, LT> TermValue<LT> // ... just here
where
LT: ListTerm<'a>,
{
fn iter(&'a self) -> LT::Iter {
match *self {
TermValue::Str(ref term) => term.iter(),
}
}
}
Some standard library types like std::collections::HashMap<K, V> do this: the K: Hash + Eq bound isn't on the type itself. Alternatively, you could have a where clause on each method where the bound is needed. The difference between a where clause on an impl and one on a fn is not significant unless you're implementing a trait (see this question).
The main reason for using PhantomData is that you want to express some constraint that the compiler can't figure out by itself. You don't need PhantomData to express "Any TermData<LT> is only valid as long as its contained LT is valid", because the compiler already enforces that (by "peering inside" the type, as in Matthieu's answer).
While trying to understand the Any trait better, I saw that it has an impl block for the trait itself. I don't understand the purpose of this construct, or even if it has a specific name.
I made a little experiment with both a "normal" trait method and a method defined in the impl block:
trait Foo {
fn foo_in_trait(&self) {
println!("in foo")
}
}
impl dyn Foo {
fn foo_in_impl(&self) {
println!("in impl")
}
}
impl Foo for u8 {}
fn main() {
let x = Box::new(42u8) as Box<dyn Foo>;
x.foo_in_trait();
x.foo_in_impl();
let y = &42u8 as &dyn Foo;
y.foo_in_trait();
y.foo_in_impl(); // May cause an error, see below
}
Editor's note
In versions of Rust up to and including Rust 1.15.0, the line
y.foo_in_impl() causes the error:
error: borrowed value does not live long enough
--> src/main.rs:20:14
|
20 | let y = &42u8 as &Foo;
| ^^^^ does not live long enough
...
23 | }
| - temporary value only lives until here
|
= note: borrowed value must be valid for the static lifetime...
This error is no longer present in subsequent versions, but the
concepts explained in the answers are still valid.
From this limited experiment, it seems like methods defined in the impl block are more restrictive than methods defined in the trait block. It's likely that there's something extra that doing it this way unlocks, but I just don't know what it is yet! ^_^
The sections from The Rust Programming Language on traits and trait objects don't make any mention of this. Searching the Rust source itself, it seems like only Any and Error use this particular feature. I've not seen this used in the handful of crates where I have looked at the source code.
When you define a trait named Foo that can be made into an object, Rust also defines a trait object type named dyn Foo. In older versions of Rust, this type was only called Foo (see What does "dyn" mean in a type?). For backwards compatibility with these older versions, Foo still works to name the trait object type, although dyn syntax should be used for new code.
Trait objects have a lifetime parameter that designates the shortest of the implementor's lifetime parameters. To specify that lifetime, you write the type as dyn Foo + 'a.
When you write impl dyn Foo { (or just impl Foo { using the old syntax), you are not specifying that lifetime parameter, and it defaults to 'static. This note from the compiler on the y.foo_in_impl(); statement hints at that:
note: borrowed value must be valid for the static lifetime...
All we have to do to make this more permissive is to write a generic impl over any lifetime:
impl<'a> dyn Foo + 'a {
fn foo_in_impl(&self) { println!("in impl") }
}
Now, notice that the self argument on foo_in_impl is a borrowed pointer, which has a lifetime parameter of its own. The type of self, in its full form, looks like &'b (dyn Foo + 'a) (the parentheses are required due to operator precedence). A Box<u8> owns its u8 – it doesn't borrow anything –, so you can create a &(dyn Foo + 'static) out of it. On the other hand, &42u8 creates a &'b (dyn Foo + 'a) where 'a is not 'static, because 42u8 is put in a hidden variable on the stack, and the trait object borrows this variable. (That doesn't really make sense, though; u8 doesn't borrow anything, so its Foo implementation should always be compatible with dyn Foo + 'static... the fact that 42u8 is borrowed from the stack should affect 'b, not 'a.)
Another thing to note is that trait methods are polymorphic, even when they have a default implementation and they're not overridden, while inherent methods on a trait objects are monomorphic (there's only one function, no matter what's behind the trait). For example:
use std::any::type_name;
trait Foo {
fn foo_in_trait(&self)
where
Self: 'static,
{
println!("{}", type_name::<Self>());
}
}
impl dyn Foo {
fn foo_in_impl(&self) {
println!("{}", type_name::<Self>());
}
}
impl Foo for u8 {}
impl Foo for u16 {}
fn main() {
let x = Box::new(42u8) as Box<dyn Foo>;
x.foo_in_trait();
x.foo_in_impl();
let x = Box::new(42u16) as Box<Foo>;
x.foo_in_trait();
x.foo_in_impl();
}
Sample output:
u8
dyn playground::Foo
u16
dyn playground::Foo
In the trait method, we get the type name of the underlying type (here, u8 or u16), so we can conclude that the type of &self will vary from one implementer to the other (it'll be &u8 for the u8 implementer and &u16 for the u16 implementer – not a trait object). However, in the inherent method, we get the type name of dyn Foo (+ 'static), so we can conclude that the type of &self is always &dyn Foo (a trait object).
I suspect that the reason is very simple: may be overridden or not?
A method implemented in a trait block can be overridden by implementors of the trait, it just provides a default.
On the other hand, a method implemented in an impl block cannot be overridden.
If this reasoning is right, then the error you get for y.foo_in_impl() is just a lack of polish: it should have worked. See Francis Gagné's more complete answer on the interaction with lifetimes.
use std::iter::Iterator;
trait ListTerm<'a> {
type Iter: Iterator<Item = &'a u32>;
fn iter(&'a self) -> Self::Iter;
}
enum TermValue<'a, LT>
where
LT: ListTerm<'a> + Sized + 'a,
{
Str(LT),
}
error[E0392]: parameter `'a` is never used
--> src/main.rs:8:16
|
8 | enum TermValue<'a, LT>
| ^^ unused type parameter
|
= help: consider removing `'a` or using a marker such as `std::marker::PhantomData`
'a clearly is being used. Is this a bug, or are parametric enums just not really finished? rustc --explain E0392 recommends the use of PhantomData<&'a _>, but I don't think there's any opportunity to do that in my use case.
'a clearly is being used.
Not as far as the compiler is concerned. All it cares about is that all of your generic parameters are used somewhere in the body of the struct or enum. Constraints do not count.
What you might want is to use a higher-ranked lifetime bound:
enum TermValue<LT>
where
for<'a> LT: 'a + ListTerm<'a> + Sized,
{
Str(LT),
}
In other situations, you might want to use PhantomData to indicate that you want a type to act as though it uses the parameter:
use std::marker::PhantomData;
struct Thing<'a> {
// Causes the type to function *as though* it has a `&'a ()` field,
// despite not *actually* having one.
_marker: PhantomData<&'a ()>,
}
And just to be clear: you can use PhantomData in an enum; put it in one of the variants:
enum TermValue<'a, LT>
where
LT: 'a + ListTerm<'a> + Sized,
{
Str(LT, PhantomData<&'a ()>),
}
DK. answered how to circumvent the issue (by using PhantomData as suggested), and hinted that the issue was that 'a was unused in the definition, but why would the compiler care about that?
'a is a lifetime marker. It is used by the borrow-checker to identify the relationship between the lifetime of different objects, as well as their borrow status.
When borrowing an object, you may borrow it either mutably (&mut T) or immutably (&T), and in accordance with the Mutability XOR Aliasing principle underpinning Rust's memory safety it changes everything:
You can have multiple concurrent &T
You can only have a single &mut T, and it excludes concurrent &T
When you parameterize your struct or enum with 'a, you announce your intention to borrow something whose lifetime will be some 'a. You do not, however, announce whether you will be borrowing mutably or immutably, and this detail is critical.
The compiler, therefore, will peer at the internals of your data type and check whether you use a mutable or immutable reference to deduce, by itself, which kind of borrow will occur when you use the data type.
And here, because 'a is unused, it cannot find any such use and therefore cannot compile your code.
It is arguable whether the compiler peering inside the data type is a good thing or not, since changing the internals of this data type (from &T to &mut T) could lead to compilation failures without changing the type interface.
It is important, thus, to remember that how you use the generic parameters (owning, borrowing mutably or borrowing immutably) is NOT an implementation detail.
If you aren't using an associated type (like <LT as ListTerm<'a>>::Iter) in the enum definition, you probably don't need to make 'a a parameter at all.
I assume you want the LT: ListTerm<'a> bound so that you can write one or more fns or an impl that uses LT as a ListTerm. In which case, you can easily parameterize the type with just <LT>, and put the 'a generic and trait bound only on the items that require it:
trait ListTerm<'a> {
type Iter: Iterator<Item = &'a u32>;
fn iter(&'a self) -> Self::Iter;
}
enum TermValue<LT> { // no 'a parameter here...
Str(LT),
}
impl<'a, LT> TermValue<LT> // ... just here
where
LT: ListTerm<'a>,
{
fn iter(&'a self) -> LT::Iter {
match *self {
TermValue::Str(ref term) => term.iter(),
}
}
}
Some standard library types like std::collections::HashMap<K, V> do this: the K: Hash + Eq bound isn't on the type itself. Alternatively, you could have a where clause on each method where the bound is needed. The difference between a where clause on an impl and one on a fn is not significant unless you're implementing a trait (see this question).
The main reason for using PhantomData is that you want to express some constraint that the compiler can't figure out by itself. You don't need PhantomData to express "Any TermData<LT> is only valid as long as its contained LT is valid", because the compiler already enforces that (by "peering inside" the type, as in Matthieu's answer).
While trying to understand the Any trait better, I saw that it has an impl block for the trait itself. I don't understand the purpose of this construct, or even if it has a specific name.
I made a little experiment with both a "normal" trait method and a method defined in the impl block:
trait Foo {
fn foo_in_trait(&self) {
println!("in foo")
}
}
impl dyn Foo {
fn foo_in_impl(&self) {
println!("in impl")
}
}
impl Foo for u8 {}
fn main() {
let x = Box::new(42u8) as Box<dyn Foo>;
x.foo_in_trait();
x.foo_in_impl();
let y = &42u8 as &dyn Foo;
y.foo_in_trait();
y.foo_in_impl(); // May cause an error, see below
}
Editor's note
In versions of Rust up to and including Rust 1.15.0, the line
y.foo_in_impl() causes the error:
error: borrowed value does not live long enough
--> src/main.rs:20:14
|
20 | let y = &42u8 as &Foo;
| ^^^^ does not live long enough
...
23 | }
| - temporary value only lives until here
|
= note: borrowed value must be valid for the static lifetime...
This error is no longer present in subsequent versions, but the
concepts explained in the answers are still valid.
From this limited experiment, it seems like methods defined in the impl block are more restrictive than methods defined in the trait block. It's likely that there's something extra that doing it this way unlocks, but I just don't know what it is yet! ^_^
The sections from The Rust Programming Language on traits and trait objects don't make any mention of this. Searching the Rust source itself, it seems like only Any and Error use this particular feature. I've not seen this used in the handful of crates where I have looked at the source code.
When you define a trait named Foo that can be made into an object, Rust also defines a trait object type named dyn Foo. In older versions of Rust, this type was only called Foo (see What does "dyn" mean in a type?). For backwards compatibility with these older versions, Foo still works to name the trait object type, although dyn syntax should be used for new code.
Trait objects have a lifetime parameter that designates the shortest of the implementor's lifetime parameters. To specify that lifetime, you write the type as dyn Foo + 'a.
When you write impl dyn Foo { (or just impl Foo { using the old syntax), you are not specifying that lifetime parameter, and it defaults to 'static. This note from the compiler on the y.foo_in_impl(); statement hints at that:
note: borrowed value must be valid for the static lifetime...
All we have to do to make this more permissive is to write a generic impl over any lifetime:
impl<'a> dyn Foo + 'a {
fn foo_in_impl(&self) { println!("in impl") }
}
Now, notice that the self argument on foo_in_impl is a borrowed pointer, which has a lifetime parameter of its own. The type of self, in its full form, looks like &'b (dyn Foo + 'a) (the parentheses are required due to operator precedence). A Box<u8> owns its u8 – it doesn't borrow anything –, so you can create a &(dyn Foo + 'static) out of it. On the other hand, &42u8 creates a &'b (dyn Foo + 'a) where 'a is not 'static, because 42u8 is put in a hidden variable on the stack, and the trait object borrows this variable. (That doesn't really make sense, though; u8 doesn't borrow anything, so its Foo implementation should always be compatible with dyn Foo + 'static... the fact that 42u8 is borrowed from the stack should affect 'b, not 'a.)
Another thing to note is that trait methods are polymorphic, even when they have a default implementation and they're not overridden, while inherent methods on a trait objects are monomorphic (there's only one function, no matter what's behind the trait). For example:
use std::any::type_name;
trait Foo {
fn foo_in_trait(&self)
where
Self: 'static,
{
println!("{}", type_name::<Self>());
}
}
impl dyn Foo {
fn foo_in_impl(&self) {
println!("{}", type_name::<Self>());
}
}
impl Foo for u8 {}
impl Foo for u16 {}
fn main() {
let x = Box::new(42u8) as Box<dyn Foo>;
x.foo_in_trait();
x.foo_in_impl();
let x = Box::new(42u16) as Box<Foo>;
x.foo_in_trait();
x.foo_in_impl();
}
Sample output:
u8
dyn playground::Foo
u16
dyn playground::Foo
In the trait method, we get the type name of the underlying type (here, u8 or u16), so we can conclude that the type of &self will vary from one implementer to the other (it'll be &u8 for the u8 implementer and &u16 for the u16 implementer – not a trait object). However, in the inherent method, we get the type name of dyn Foo (+ 'static), so we can conclude that the type of &self is always &dyn Foo (a trait object).
I suspect that the reason is very simple: may be overridden or not?
A method implemented in a trait block can be overridden by implementors of the trait, it just provides a default.
On the other hand, a method implemented in an impl block cannot be overridden.
If this reasoning is right, then the error you get for y.foo_in_impl() is just a lack of polish: it should have worked. See Francis Gagné's more complete answer on the interaction with lifetimes.