Inconsistency of lifetime bound requirement when storing closures

When I try to store closures in a HashMap, the compiler reports a lifetime bound requirement that seems inconsistent to me.
use std::collections::HashMap;
struct NoBox<C: Fn() -> ()>(HashMap<String, C>);
impl<C> NoBox<C>
where
    C: Fn() -> (),
{
    fn new() -> NoBox<C> {
        NoBox(HashMap::new())
    }
    fn add(&mut self, str: &str, closure: C) {
        self.0.insert(str.to_string(), closure);
    }
}
This is OK; the compiler is happy with it. However, when I try to wrap the closure in a trait object and store that, the compiler imposes a 'static lifetime bound on it.
struct Boxed(HashMap<String, Box<dyn Fn() -> ()>>);
impl Boxed {
    fn new() -> Boxed {
        Boxed(HashMap::new())
    }
    fn add<C>(&mut self, str: &str, closure: C)
    where
        C: Fn() -> (), // adding `+ 'static` here fixes the error
    {
        // error: type parameter C may not live long enough,
        // consider adding a 'static lifetime bound
        self.0.insert(str.to_string(), Box::new(closure));
    }
}
According to the compiler's complaint, C may not live long enough, so adding a 'static bound makes sense.
But why doesn't the first case, without boxing, have this requirement?
To my understanding, if C holds a reference to something that is dropped early, then storing it in NoBox would cause the same invalid-reference problem. To me, this looks like an inconsistency.

NoBox is not a problem because, if the closure captures a reference with some lifetime, the type of NoBox itself still contains that lifetime: the closure type is a generic parameter of NoBox, so it is always part of the full type.
For example, suppose we're storing a closure that captures something with lifetime 'a. Then the closure's struct will look like this (this is not how the compiler actually desugars closures, but it is enough for the example):
struct Closure<'a> {
    captured: &'a i32,
}
When it is stored in NoBox, the type will be NoBox<Closure<'a>>, so we know it cannot outlive 'a. Note that this type may never actually be written out explicitly, especially with closures, but the type the compiler infers still has the lifetime in it.
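As a rough sketch of that point, here is NoBox holding a closure that borrows a local. The closure returns usize instead of () so the result can be observed, and the names `text` and `demo` are made up for this illustration; they are not from the question.

```rust
use std::collections::HashMap;

// Same shape as the question's NoBox, but the closure returns a value.
struct NoBox<C: Fn() -> usize>(HashMap<String, C>);

impl<C: Fn() -> usize> NoBox<C> {
    fn new() -> Self {
        NoBox(HashMap::new())
    }

    fn add(&mut self, name: &str, closure: C) {
        self.0.insert(name.to_string(), closure);
    }
}

// Hypothetical driver function for the illustration.
fn demo() -> usize {
    let text = String::from("borrowed");
    // The inferred type is NoBox<{closure borrowing `text`}>, so the borrow's
    // lifetime is part of the map's type and `map` cannot outlive `text`.
    let mut map = NoBox::new();
    map.add("len", || text.len());
    let total: usize = map.0.values().map(|f| f()).sum();
    total
}

fn main() {
    println!("{}", demo());
}
```

No 'static bound is needed here because nothing about the closure type has been erased.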
With Boxed, on the other hand, the concrete closure type is erased, so the lifetime no longer appears in the type and the stored closure could accidentally outlive 'a. The compiler therefore requires the closure to be 'static, unless you explicitly specify otherwise:
struct Boxed<'a>(HashMap<String, Box<dyn Fn() + 'a>>);
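A minimal sketch of how the lifetime-parameterized Boxed<'a> could be used, assuming the closures return a String so there is something to check. The `call` helper, the `greeting` local and `demo` are hypothetical names added for this example.

```rust
use std::collections::HashMap;

// The boxed closures may borrow data that lives at least as long as 'a.
struct Boxed<'a>(HashMap<String, Box<dyn Fn() -> String + 'a>>);

impl<'a> Boxed<'a> {
    fn new() -> Self {
        Boxed(HashMap::new())
    }

    // C only has to outlive 'a, not 'static.
    fn add<C>(&mut self, name: &str, closure: C)
    where
        C: Fn() -> String + 'a,
    {
        self.0.insert(name.to_string(), Box::new(closure));
    }

    // Hypothetical helper: run a stored closure by name.
    fn call(&self, name: &str) -> Option<String> {
        self.0.get(name).map(|f| f())
    }
}

fn demo() -> Option<String> {
    let greeting = String::from("hello");
    let mut map = Boxed::new();
    // This closure borrows `greeting`, so it is not 'static.
    map.add("greet", || greeting.clone());
    let result = map.call("greet");
    result
}

fn main() {
    println!("{:?}", demo());
}
```

The trade-off is that the 'a parameter now travels with Boxed everywhere it is named, which is the information the plain Box<dyn Fn()> version had thrown away.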

Related

Storing a function pointer taking generic parameter slice in struct

I'm in a situation where I need to store away a function that implements a trait into a struct. Here is some reduced code
struct Node<T> {
    compute_function: Box<dyn Fn(&[T]) -> T>,
}
impl<T: Debug + 'static> OtherHoldingStruct<T> {
    pub fn create_node<F: Fn(&[T]) -> T>(..., _compute_func: F) {
        Node {
            compute_function: Box::new(_compute_func),
            //~~~~ the parameter type `impl Fn(&[T]) -> T` may not live long enough
            //~~~~ ...so that the type `impl Fn(&[T]) -> T` will meet its required lifetime bounds rustc(E0310)
        }
    }
}
What I gather is that, since I'm trying to accept a function type that takes a reference to a slice, the compiler wants some assurance about how the lifetime of that slice reference will behave. What I'm not sure of is how to provide it.
I considered adding a lifetime to create_node
impl<T: Debug + 'static> OtherHoldingStruct<T> {
    pub fn create_node<'a, F: Fn(&'a [T]) -> T>(..., _compute_func: F) {
        Node {
            compute_function: Box::new(_compute_func),
            //~~~~ expected a `std::ops::Fn<(&[T],)>` closure, found `impl Fn(&'a [T]) -> T`
        }
    }
}
which then seems to barf at not being able to match closures to the type.

The problem is not the slice reference — it's a lifetime requirement on the function type F itself. If you're going to store the function, then the function itself must be able to live for 'static (unless there's an explicitly permitted shorter lifetime).
The compiler error arises because dyn Fn (like any other dyn type) inside a Box has an implicit + 'static bound unless you specify a different lifetime. Thus, the bounds for F in create_node are Fn(&[T]) -> T, but the bounds required for compute_function are Fn(&[T]) -> T + 'static, which produces the error you saw.
The fix is to add a 'static bound on F:
pub fn create_node<F: Fn(&[T]) -> T + 'static>(_compute_func: F) -> Node<T> {
//                                  ^^^^^^^^^
This bound disallows passing, for example, a closure that captures a non-static reference, which has to be invalid anyway since Node could live indefinitely, unlike that reference.
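To make that concrete, here is a small self-contained sketch. It uses a free function instead of the question's OtherHoldingStruct impl (whose other parameters were elided), and `offset` and `demo` are invented names for the illustration.

```rust
struct Node<T> {
    // Without an explicit lifetime, this is `dyn Fn(&[T]) -> T + 'static`.
    compute_function: Box<dyn Fn(&[T]) -> T>,
}

// Simplified stand-in for the question's create_node.
fn create_node<T, F>(compute_func: F) -> Node<T>
where
    T: 'static,
    F: Fn(&[T]) -> T + 'static,
{
    Node {
        compute_function: Box::new(compute_func),
    }
}

fn demo() -> i32 {
    let offset = 10;
    // `move` copies `offset` into the closure, so the closure owns all of its
    // captures and satisfies the 'static bound.
    let node = create_node(move |xs: &[i32]| xs.iter().sum::<i32>() + offset);
    (node.compute_function)(&[1, 2, 3])
}

fn main() {
    println!("{}", demo());
}
```

Dropping the `move` keyword would make the closure borrow `offset`, and the 'static bound would then reject it at the call site.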

How to define an adapter trait where some implementations need a lifetime on &self?

I'm writing a set of benchmarks for different key-value stores, and would like to have a single adapter trait that I can use in all the benchmarks, and then implement it for each key-value store.
This worked well for two of them. However, the third required me to add a lifetime on my trait, and after fighting the borrow checker for an entire day, I still can't seem to get it right.
I've distilled it down to this minimal repro: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=54fec74cb70c63c03f25ec7a9dfc7e60
What I don't understand is why the borrow on txn lives longer than the scope of benchmark(). It seems to me that it should live for only that one line.
How can I define the AdapterTransaction trait to resolve this, that still allows implementations to choose their own return type?
edit
added that I need to be able to use the AdapterTransaction implementations with a factory trait

The main problem in your first playground is the lifetime on &self being the same as the generic lifetime on the trait.
pub trait AdapterTransaction<'a, T: AsRef<[u8]>> {
    fn get(&'a self, key: &[u8]) -> Option<T>;
}
Because they are the same, the borrow of the underlying value must last for the whole of 'a. In benchmark<'a, ...>(), the lifetime 'a is picked by the caller, so it necessarily outlives the function body, and no borrow of the local txn taken inside the function can last that long. This is why the borrow appears to live longer than the scope of benchmark(). A quick fix is to remove the 'a parameter on benchmark and replace it with a higher-ranked trait bound:
fn benchmark<U: AsRef<[u8]>, T: for<'a> AdapterTransaction<'a, U>>(txn: T)
In this example, 'a isn't chosen by the caller anymore, so the compiler is free to use a valid lifetime for the call.
As for the 2nd part of your question, traits can define associated types which can change depending on the implementation. You could have a trait that has an associated Output, which can change for each implemented type. There is a big difference with doing a generic param vs an associated type since in the former case, you are allowed to implement multiple generic variants of a trait for the same type. (This is how From<T> works, for example).
pub trait AdapterTransaction<'a> {
    type Output;
    fn get(&'a self, key: &[u8]) -> Option<Self::Output>;
}
impl<'a> AdapterTransaction<'a> for AdapterImpl {
    type Output = &'a [u8];
    fn get(&'a self, key: &[u8]) -> Option<Self::Output> {
        Some(self.txn.get(&key))
    }
}
fn benchmark<T>(txn: T)
where
    for<'a> T: AdapterTransaction<'a>,
{
    let _ = txn.get(&[]).unwrap();
}
Edit: Some of my initial assumptions weren't exact, it's not necessary to implement on &'a Type if the trait lifetime is used in a non-conflicting way.
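For reference, a runnable sketch of the associated-type version. `MemTxn` is a hypothetical in-memory store standing in for a real key-value transaction, and `benchmark` here takes a key and returns a bool only so the behaviour can be checked. None of these details come from the original playground.

```rust
use std::collections::HashMap;

pub trait AdapterTransaction<'a> {
    type Output;
    fn get(&'a self, key: &[u8]) -> Option<Self::Output>;
}

// Hypothetical in-memory store used only for this illustration.
struct MemTxn {
    data: HashMap<Vec<u8>, Vec<u8>>,
}

impl<'a> AdapterTransaction<'a> for MemTxn {
    // The returned slice borrows from the transaction for 'a.
    type Output = &'a [u8];

    fn get(&'a self, key: &[u8]) -> Option<Self::Output> {
        self.data.get(key).map(|v| v.as_slice())
    }
}

fn benchmark<T>(txn: T, key: &[u8]) -> bool
where
    for<'a> T: AdapterTransaction<'a>,
{
    // The compiler picks a lifetime that only covers this statement.
    let found = txn.get(key).is_some();
    found
}

fn demo() -> (bool, bool) {
    let make = || {
        let mut data = HashMap::new();
        data.insert(b"k".to_vec(), b"value".to_vec());
        MemTxn { data }
    };
    (benchmark(make(), b"k"), benchmark(make(), b"missing"))
}

fn main() {
    println!("{:?}", demo());
}
```

Another store could pick an owned Output such as Vec<u8> with the same trait, which is the flexibility the question asked for.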

Compiler complains about missing PhantomData when a lifetime is only needed for another trait parameter [duplicate]

use std::iter::Iterator;
trait ListTerm<'a> {
    type Iter: Iterator<Item = &'a u32>;
    fn iter(&'a self) -> Self::Iter;
}
enum TermValue<'a, LT>
where
    LT: ListTerm<'a> + Sized + 'a,
{
    Str(LT),
}
error[E0392]: parameter `'a` is never used
 --> src/main.rs:8:16
  |
8 | enum TermValue<'a, LT>
  |                ^^ unused type parameter
  |
  = help: consider removing `'a` or using a marker such as `std::marker::PhantomData`
'a clearly is being used. Is this a bug, or are parametric enums just not really finished? rustc --explain E0392 recommends the use of PhantomData<&'a _>, but I don't think there's any opportunity to do that in my use case.

'a clearly is being used.
Not as far as the compiler is concerned. All it cares about is that all of your generic parameters are used somewhere in the body of the struct or enum. Constraints do not count.
What you might want is to use a higher-ranked lifetime bound:
enum TermValue<LT>
where
    for<'a> LT: 'a + ListTerm<'a> + Sized,
{
    Str(LT),
}
In other situations, you might want to use PhantomData to indicate that you want a type to act as though it uses the parameter:
use std::marker::PhantomData;
struct Thing<'a> {
    // Causes the type to function *as though* it has a `&'a ()` field,
    // despite not *actually* having one.
    _marker: PhantomData<&'a ()>,
}
And just to be clear: you can use PhantomData in an enum; put it in one of the variants:
enum TermValue<'a, LT>
where
    LT: 'a + ListTerm<'a> + Sized,
{
    Str(LT, PhantomData<&'a ()>),
}
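A short sketch of constructing and using the PhantomData variant. `Numbers` is a hypothetical implementor of ListTerm backed by a Vec<u32>, and `demo` is just a driver for the example.

```rust
use std::marker::PhantomData;

trait ListTerm<'a> {
    type Iter: Iterator<Item = &'a u32>;
    fn iter(&'a self) -> Self::Iter;
}

// Hypothetical implementor, not part of the question.
struct Numbers(Vec<u32>);

impl<'a> ListTerm<'a> for Numbers {
    type Iter = std::slice::Iter<'a, u32>;

    fn iter(&'a self) -> Self::Iter {
        self.0.iter()
    }
}

enum TermValue<'a, LT>
where
    LT: 'a + ListTerm<'a>,
{
    // The marker makes 'a count as used; it is zero-sized at runtime.
    Str(LT, PhantomData<&'a ()>),
}

fn demo() -> usize {
    // PhantomData is a unit-like value, so constructing it costs nothing.
    let value: TermValue<'_, Numbers> = TermValue::Str(Numbers(vec![4, 5]), PhantomData);
    let count = match &value {
        TermValue::Str(term, _) => term.iter().count(),
    };
    count
}

fn main() {
    println!("{}", demo());
}
```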

DK. answered how to circumvent the issue (by using PhantomData as suggested), and hinted that the issue was that 'a was unused in the definition, but why would the compiler care about that?
'a is a lifetime marker. It is used by the borrow-checker to identify the relationship between the lifetime of different objects, as well as their borrow status.
When borrowing an object, you may borrow it either mutably (&mut T) or immutably (&T), and in accordance with the Mutability XOR Aliasing principle underpinning Rust's memory safety it changes everything:
You can have multiple concurrent &T
You can only have a single &mut T, and it excludes concurrent &T
When you parameterize your struct or enum with 'a, you announce your intention to borrow something whose lifetime will be some 'a. You do not, however, announce whether you will be borrowing mutably or immutably, and this detail is critical.
The compiler, therefore, will peer at the internals of your data type and check whether you use a mutable or immutable reference to deduce, by itself, which kind of borrow will occur when you use the data type.
And here, because 'a is unused, it cannot find any such use and therefore cannot compile your code.
It is arguable whether the compiler peering inside the data type is a good thing or not, since changing the internals of this data type (from &T to &mut T) could lead to compilation failures without changing the type interface.
It is important, thus, to remember that how you use the generic parameters (owning, borrowing mutably or borrowing immutably) is NOT an implementation detail.

If you aren't using an associated type (like <LT as ListTerm<'a>>::Iter) in the enum definition, you probably don't need to make 'a a parameter at all.
I assume you want the LT: ListTerm<'a> bound so that you can write one or more fns or an impl that uses LT as a ListTerm. In which case, you can easily parameterize the type with just <LT>, and put the 'a generic and trait bound only on the items that require it:
trait ListTerm<'a> {
    type Iter: Iterator<Item = &'a u32>;
    fn iter(&'a self) -> Self::Iter;
}
enum TermValue<LT> { // no 'a parameter here...
    Str(LT),
}
impl<'a, LT> TermValue<LT> // ... just here
where
    LT: ListTerm<'a>,
{
    fn iter(&'a self) -> LT::Iter {
        match *self {
            TermValue::Str(ref term) => term.iter(),
        }
    }
}
Some standard library types like std::collections::HashMap<K, V> do this: the K: Hash + Eq bound isn't on the type itself. Alternatively, you could have a where clause on each method where the bound is needed. The difference between a where clause on an impl and one on a fn is not significant unless you're implementing a trait (see this question).
The main reason for using PhantomData is that you want to express some constraint that the compiler can't figure out by itself. You don't need PhantomData to express "Any TermValue<LT> is only valid as long as its contained LT is valid", because the compiler already enforces that (by "peering inside" the type, as in Matthieu's answer).
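Here is a runnable sketch of the approach above, with the lifetime kept off the enum and placed only on the impl. `Numbers` is a hypothetical ListTerm implementor added so there is something to iterate over.

```rust
trait ListTerm<'a> {
    type Iter: Iterator<Item = &'a u32>;
    fn iter(&'a self) -> Self::Iter;
}

// Hypothetical implementor backed by a Vec<u32>.
struct Numbers(Vec<u32>);

impl<'a> ListTerm<'a> for Numbers {
    type Iter = std::slice::Iter<'a, u32>;

    fn iter(&'a self) -> Self::Iter {
        self.0.iter()
    }
}

// No lifetime parameter and no bound on the type itself.
enum TermValue<LT> {
    Str(LT),
}

// The lifetime and the trait bound appear only where they are needed.
impl<'a, LT> TermValue<LT>
where
    LT: ListTerm<'a>,
{
    fn iter(&'a self) -> LT::Iter {
        match self {
            TermValue::Str(term) => term.iter(),
        }
    }
}

fn demo() -> u32 {
    let value = TermValue::Str(Numbers(vec![1, 2, 3]));
    let total: u32 = value.iter().sum();
    total
}

fn main() {
    println!("{}", demo());
}
```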

What does 'impl MyTrait' do without 'for MyStruct' in Rust? [duplicate]

While trying to understand the Any trait better, I saw that it has an impl block for the trait itself. I don't understand the purpose of this construct, or even if it has a specific name.
I made a little experiment with both a "normal" trait method and a method defined in the impl block:
trait Foo {
    fn foo_in_trait(&self) {
        println!("in foo")
    }
}
impl dyn Foo {
    fn foo_in_impl(&self) {
        println!("in impl")
    }
}
impl Foo for u8 {}
fn main() {
    let x = Box::new(42u8) as Box<dyn Foo>;
    x.foo_in_trait();
    x.foo_in_impl();
    let y = &42u8 as &dyn Foo;
    y.foo_in_trait();
    y.foo_in_impl(); // May cause an error, see below
}
Editor's note
In versions of Rust up to and including Rust 1.15.0, the line
y.foo_in_impl() causes the error:
error: borrowed value does not live long enough
  --> src/main.rs:20:14
   |
20 |     let y = &42u8 as &Foo;
   |              ^^^^ does not live long enough
...
23 | }
   | - temporary value only lives until here
   |
   = note: borrowed value must be valid for the static lifetime...
This error is no longer present in subsequent versions, but the
concepts explained in the answers are still valid.
From this limited experiment, it seems like methods defined in the impl block are more restrictive than methods defined in the trait block. It's likely that there's something extra that doing it this way unlocks, but I just don't know what it is yet! ^_^
The sections from The Rust Programming Language on traits and trait objects don't make any mention of this. Searching the Rust source itself, it seems like only Any and Error use this particular feature. I've not seen this used in the handful of crates where I have looked at the source code.

When you define a trait named Foo that can be made into an object, Rust also defines a trait object type named dyn Foo. In older versions of Rust, this type was only called Foo (see What does "dyn" mean in a type?). For backwards compatibility with these older versions, Foo still works to name the trait object type, although dyn syntax should be used for new code.
Trait objects have a lifetime parameter that designates the shortest of the implementor's lifetime parameters. To specify that lifetime, you write the type as dyn Foo + 'a.
When you write impl dyn Foo { (or just impl Foo { using the old syntax), you are not specifying that lifetime parameter, and it defaults to 'static. This note from the compiler on the y.foo_in_impl(); statement hints at that:
note: borrowed value must be valid for the static lifetime...
All we have to do to make this more permissive is to write a generic impl over any lifetime:
impl<'a> dyn Foo + 'a {
    fn foo_in_impl(&self) { println!("in impl") }
}
Now, notice that the self argument on foo_in_impl is a borrowed pointer, which has a lifetime parameter of its own. The type of self, in its full form, looks like &'b (dyn Foo + 'a) (the parentheses are required due to operator precedence). A Box<u8> owns its u8 (it doesn't borrow anything), so you can create a &(dyn Foo + 'static) out of it. On the other hand, &42u8 creates a &'b (dyn Foo + 'a) where 'a is not 'static, because 42u8 is put in a hidden variable on the stack, and the trait object borrows this variable. (That doesn't really make sense, though; u8 doesn't borrow anything, so its Foo implementation should always be compatible with dyn Foo + 'static... the fact that 42u8 is borrowed from the stack should affect 'b, not 'a.)
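As an illustration of why the generic impl matters, here is a sketch with an implementor that really does borrow non-'static data. The trait here has a `name` method, and `Borrowed`, `shout` and `demo` are hypothetical names chosen for this example rather than anything from the question.

```rust
trait Foo {
    fn name(&self) -> String;
}

// Generic over the trait object's lifetime bound, so it also applies to
// implementors that hold non-'static borrows.
impl<'a> dyn Foo + 'a {
    fn shout(&self) -> String {
        self.name().to_uppercase()
    }
}

// Hypothetical implementor that borrows a string slice.
struct Borrowed<'s>(&'s str);

impl<'s> Foo for Borrowed<'s> {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

fn demo() -> String {
    let owned = String::from("local");
    let b = Borrowed(&owned);
    // The object's lifetime bound is tied to `owned`, so it is not 'static.
    let obj: &dyn Foo = &b;
    let out = obj.shout();
    out
}

fn main() {
    println!("{}", demo());
}
```

With a plain `impl dyn Foo` block instead, the call to `shout` should be rejected, because `Borrowed<'s>` cannot be turned into a `dyn Foo + 'static`.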
Another thing to note is that trait methods are polymorphic, even when they have a default implementation and they're not overridden, while inherent methods on a trait object are monomorphic (there's only one function, no matter what's behind the trait). For example:
use std::any::type_name;
trait Foo {
    fn foo_in_trait(&self)
    where
        Self: 'static,
    {
        println!("{}", type_name::<Self>());
    }
}
impl dyn Foo {
    fn foo_in_impl(&self) {
        println!("{}", type_name::<Self>());
    }
}
impl Foo for u8 {}
impl Foo for u16 {}
fn main() {
    let x = Box::new(42u8) as Box<dyn Foo>;
    x.foo_in_trait();
    x.foo_in_impl();
    let x = Box::new(42u16) as Box<dyn Foo>;
    x.foo_in_trait();
    x.foo_in_impl();
}
Sample output:
u8
dyn playground::Foo
u16
dyn playground::Foo
In the trait method, we get the type name of the underlying type (here, u8 or u16), so we can conclude that the type of &self will vary from one implementer to the other (it'll be &u8 for the u8 implementer and &u16 for the u16 implementer – not a trait object). However, in the inherent method, we get the type name of dyn Foo (+ 'static), so we can conclude that the type of &self is always &dyn Foo (a trait object).

I suspect that the reason is very simple: can the method be overridden or not?
A method implemented in a trait block can be overridden by implementors of the trait; the trait just provides a default.
On the other hand, a method implemented in an impl block cannot be overridden.
If this reasoning is right, then the error you get for y.foo_in_impl() is just a lack of polish: it should have worked. See Francis Gagné's more complete answer on the interaction with lifetimes.
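A small sketch of that difference, using made-up types `Plain` and `Custom` and method names `describe` and `fixed` that are not from the question:

```rust
trait Foo {
    // Default method: implementors may override it.
    fn describe(&self) -> String {
        String::from("default")
    }
}

impl<'a> dyn Foo + 'a {
    // Inherent method on the trait object: a single fixed implementation
    // that implementors cannot replace.
    fn fixed(&self) -> String {
        format!("fixed around {}", self.describe())
    }
}

struct Plain;
struct Custom;

impl Foo for Plain {}

impl Foo for Custom {
    fn describe(&self) -> String {
        String::from("custom")
    }
}

fn demo() -> Vec<String> {
    let items: Vec<Box<dyn Foo>> = vec![Box::new(Plain), Box::new(Custom)];
    let out: Vec<String> = items.iter().map(|f| f.fixed()).collect();
    out
}

fn main() {
    println!("{:?}", demo());
}
```

Only the part that goes through `describe` varies between the two implementors; the `fixed` wrapper is the same for both.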