Rust, serde Deserialize and Higher-Ranked Trait Bounds (for<'a>)

I am trying to get a deeper understanding of how Rust works. I am doing some serializing and deserializing to save and load a struct with a generic type. I got it to work, but I don't understand HRTBs and why they made the code work.
Initially I had this:
use serde::Deserialize;
use bincode;
use std::fs;
#[derive(Deserialize)]
pub struct Construct<T> {
data: Vec<T>
}
impl <'a, T: Deserialize<'a>> Construct<T> {
pub fn load() -> Self {
match fs::read("data.sav") {
Ok(d) => {
let c: Construct<T> = bincode::deserialize(&d).unwrap();
c
},
Err(e) => {
println!("{e}, passing empty Construct");
Self { data: Vec::new() }
}
}
}
}
which produces this error:
error[E0597]: `d` does not live long enough
--> src/main.rs:14:49
|
10 | impl <'a, T: Deserialize<'a>> Construct<T> {
| -- lifetime `'a` defined here
...
14 | let c: Construct<T> = bincode::deserialize(&d).unwrap();
| ---------------------^^-
| | |
| | borrowed value does not live long enough
| argument requires that `d` is borrowed for `'a`
15 | c
16 | },
| - `d` dropped here while still borrowed
I fixed the impl block by using a higher-ranked trait bound, and it works just fine.
...
impl <T: for<'a> Deserialize<'a>> Construct<T> {
pub fn load() -> Self {
...
As I understand it, Deserialize needs to make sure that the input reference lives as long as the output structure (https://serde.rs/lifetimes.html). My understanding of the difference between the bound in the first example and for<'a> is that in the first example the lifetime is provided by the caller, whereas with for<'a> the lifetime comes from the impl itself (see: How does "for<>" syntax differ from a regular lifetime bound?).
Am I right in thinking that with the for<'a> syntax we are getting the lifetime from the implementation block, and that this gives us a longer lifetime than the one we would get from calling the function? Is there another way to code this load function without using HRTBs?

Am I right in thinking that with the for<'a> syntax we are getting the lifetime from the implementation block
Yes: it comes from the call bincode::deserialize(&d). Specifically, it is the lifetime of the borrow of d.
and that gives us a longer lifetime than from calling the function
No, a shorter one: instead of a caller-chosen lifetime, which will always be longer than d's lifetime (because d is declared inside our function), we get a lifetime that covers only d.
Is there another way to code this load function without using HRTBs?
Yes, by bounding T with DeserializeOwned. But this just hides the HRTB: DeserializeOwned uses one behind the scenes.
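To see the same mechanics without pulling in serde, here is a minimal std-only sketch. The trait Parse<'de> and the function load_owned are hypothetical stand-ins for Deserialize<'de> and the load method from the question; they are not part of serde or bincode.

```rust
// Hypothetical stand-in for serde's Deserialize<'de>: builds a value from
// input that is borrowed for 'de.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Self;
}

// An owned type can be parsed from input of any lifetime...
impl<'de> Parse<'de> for String {
    fn parse(input: &'de str) -> Self {
        input.trim().to_owned()
    }
}

// ...while a borrowing type only implements Parse for its own lifetime.
impl<'de> Parse<'de> for &'de str {
    fn parse(input: &'de str) -> Self {
        input.trim()
    }
}

// The buffer is a local, so its borrow is shorter than any lifetime the
// caller could name. `for<'de>` demands that T can be parsed from input of
// every lifetime, including that short one.
fn load_owned<T: for<'de> Parse<'de>>() -> T {
    let buffer = String::from("  saved data  ");
    T::parse(&buffer)
}
```

Calling load_owned::<String>() compiles and returns the trimmed text. Calling load_owned::<&str>() is rejected, because &'x str implements Parse<'de> only for 'de = 'x and so cannot satisfy the bound for every lifetime.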

Related

Lifetime in mutable structure with HashSet

I'm having trouble understanding why Rust doesn't like my remove_str method here:
use std::cell::RefCell;
use std::collections::HashSet;
#[derive(Hash, Eq, PartialEq)]
struct StringWrap<'a>{
s: &'a String,
}
struct Container<'a>{
m: HashSet<StringWrap<'a>>
}
impl<'a> Container<'a>{
fn remove_str(&mut self, s: &str){
let string = String::from(s);
let to_remove = StringWrap{s: &string};
self.m.remove(&to_remove);
}
}
It chokes with:
error[E0597]: `string` does not live long enough
--> tests/worksheet.rs:17:39
|
14 | impl<'a> Container<'a>{
| -- lifetime `'a` defined here
...
17 | let to_remove = StringWrap{s: &string};
| ^^^^^^^ borrowed value does not live long enough
18 | self.m.remove(&to_remove);
| ------------------------- argument requires that `string` is borrowed for `'a`
19 | }
| - `string` dropped here while still borrowed
As far as I can see, my string and to_remove live long enough to allow the .remove call to do its job. Is it because remove is potentially asynchronous or something like that?
Thanks for any help or insight!

As far as I can see, my string and to_remove live long enough to allow the .remove call to do its job. Is it because remove is potentially asynchronous or something like that?
No, it's because HashSet::remove must be called with something that the item becomes when borrowed:
pub fn remove<Q: ?Sized>(&mut self, value: &Q) -> bool
where
T: Borrow<Q>,
Q: Hash + Eq,
However, unless you manually implement Borrow for StringWrap, only the blanket reflexive implementation will apply, and thus remove can only be called with a value of type &StringWrap<'a>. Note the lifetime requirement.
What you need to do to make this work is to implement Borrow for StringWrap. You could, for example, do the following:
use std::borrow::Borrow;
impl Borrow<str> for StringWrap<'_> {
fn borrow(&self) -> &str {
self.s
}
}
and then Container::remove_str can merely forward its argument to HashSet::remove:
impl Container<'_> {
fn remove_str(&mut self, s: &str) {
self.m.remove(s);
}
}
See it on the playground.
All that said, it's rather unusual to store references in a HashSet: typically one would move ownership of the stored Strings into the set, which would render this problem moot as no lifetimes would be at play.
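Putting the pieces together, a complete version might look like the sketch below. The demo function and the bool return value of remove_str are additions for illustration only. The Borrow<str> impl relies on the derived Hash and PartialEq delegating to the single String field, which hashes and compares like str.

```rust
use std::borrow::Borrow;
use std::collections::HashSet;

#[derive(Hash, Eq, PartialEq)]
struct StringWrap<'a> {
    s: &'a String,
}

// Borrow requires that the borrowed form hashes and compares exactly like
// the owner. That holds here because the derived impls delegate to the
// single field, and String hashes and compares like str.
impl Borrow<str> for StringWrap<'_> {
    fn borrow(&self) -> &str {
        self.s
    }
}

struct Container<'a> {
    m: HashSet<StringWrap<'a>>,
}

impl Container<'_> {
    // Returns true if an entry equal to `s` was present.
    fn remove_str(&mut self, s: &str) -> bool {
        self.m.remove(s)
    }
}

// Hypothetical helper, for demonstration only: reports whether each removal
// succeeded and how many entries are left.
fn demo() -> (bool, bool, usize) {
    let a = String::from("a");
    let b = String::from("b");
    let mut c = Container { m: HashSet::new() };
    c.m.insert(StringWrap { s: &a });
    c.m.insert(StringWrap { s: &b });
    let removed = c.remove_str("a");
    let missing = c.remove_str("zzz");
    (removed, missing, c.m.len())
}
```

No temporary String or StringWrap is created in remove_str any more, so there is nothing that would need to live for 'a.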

Can I limit the lifetime pollution from a struct?

I have a struct which contains some stuff. I implement the Iterator trait for that struct, and return a tuple of references to internal data in the struct. That necessitates that I annotate at least some things with lifetimes. What I want is to minimize the lifetime annotation, especially when it comes to other structs which have the original struct as a member.
some code:
pub struct LogReader<'a> {
data:String,
next_fn:fn(&mut LogReader)->Option<(&'a str,&'a [ConvertedValue])>,
//...
}
pub struct LogstreamProcessor {
reader: LogReader, // doesn't work without polluting LogstreamProcessor with lifetimes
//...
}
impl<'a> Iterator for LogReader<'a > {
type Item = (&'a str,&'a[ConvertedValue]);
fn next(&mut self) -> Option<(&'a str,&'a[ConvertedValue])>{(self.next_fn)(self)}
}
impl <'a> LogReader<'a> {
pub fn new(textFile:Option<bool>) -> LogReader<'a> {
LogReader {
next_fn:if textFile.unwrap_or(false) { LogReader::readNextText }else{ LogReader::readNextRaw },
data: "blah".to_string()
}
}
fn readNextText(&mut self)->Option<(&str,&[ConvertedValue])>{unimplemented!();}
fn readNextRaw(&mut self)->Option<(&str,&[ConvertedValue])>{unimplemented!();}
}

Can I limit the lifetime pollution from a struct?
In general, if you're using them in any of your struct's fields, then you can't. They are made explicit for very good reasons (see Why are explicit lifetimes needed in Rust?), and once you have a struct containing objects that require explicit lifetimes, those lifetimes must be propagated.
Note that usually this isn't a concern to consumers of the struct, since the concrete lifetimes are then imposed by the compiler:
struct NameRef<'a>(&'a str);
let name = NameRef("Jake"); // 'a is 'static
One could also slightly mitigate the "noise" on the implementation of next by using the definition of Self::Item.
impl<'a> Iterator for LogReader<'a > {
type Item = (&'a str,&'a[ConvertedValue]);
fn next(&mut self) -> Option<Self::Item> {
(self.next_fn)(self)
}
}
However, your concern actually hides a more serious issue: contrary to what you describe, the values returned from next are not necessarily internal data from the struct. They live for as long as the generic lifetime 'a, and nothing inside LogReader is actually bound by that lifetime.
This means two things:
(1) I could pass a function that gives something completely different, and it would work just fine:
static NO_DATA: &[()] = &[()];
fn my_next_fn<'a>(reader: &mut LogReader<'a>) -> Option<(&'a str, &'a[ConvertedValue])> {
Some(("wat", NO_DATA))
}
(2) Even if I wanted my function to return something from the log reader's internal data, it wouldn't work because the lifetimes do not match at all. Let's try it out anyway to see what happens:
static DATA: &[()] = &[()];
fn my_next_fn<'a>(reader: &mut LogReader<'a>) -> Option<(&'a str, &'a[ConvertedValue])> {
Some((&reader.data[0..4], DATA))
}
fn main() {
let mut a = LogReader {
data: "This is DATA!".to_owned(),
next_fn: my_next_fn
};
println!("{:?}", a.next());
}
The compiler would throw you this:
error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
--> src/main.rs:26:12
|
26 | Some((&reader.data[0..4], DATA))
| ^^^^^^^^^^^^^^^^^
|
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the body at 25:88...
--> src/main.rs:25:89
|
25 | fn my_next_fn<'a>(reader: &mut LogReader<'a>) -> Option<(&'a str, &'a[ConvertedValue])> {
| _________________________________________________________________________________________^ starting here...
26 | | Some((&reader.data[0..4], DATA))
27 | | }
| |_^ ...ending here
note: ...so that reference does not outlive borrowed content
--> src/main.rs:26:12
|
26 | Some((&reader.data[0..4], DATA))
| ^^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime 'a as defined on the body at 25:88...
--> src/main.rs:25:89
|
25 | fn my_next_fn<'a>(reader: &mut LogReader<'a>) -> Option<(&'a str, &'a[ConvertedValue])> {
| _________________________________________________________________________________________^ starting here...
26 | | Some((&reader.data[0..4], DATA))
27 | | }
| |_^ ...ending here
note: ...so that expression is assignable (expected std::option::Option<(&'a str, &'a [()])>, found std::option::Option<(&str, &[()])>)
--> src/main.rs:26:5
|
26 | Some((&reader.data[0..4], DATA))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...where the anonymous lifetime #1 is the lifetime of the borrow of the log reader. Forcing &mut LogReader to also have the lifetime 'a (&'a mut LogReader<'a>) would lead to further lifetime issues when attempting to implement Iterator. This boils down to the fact that 'a is incompatible with references to the fields of LogReader itself.
So, how should we fix that?
but that doesn't change the fact that the return type has references and so lifetime annotations come into it
Although that is not accurate (since lifetime elision can occur in some cases), that gives a hint to the solution: either avoid returning references at all or delegate data to a separate object, so that 'a can be bound to that object's lifetime. The final part of the answer to your question is in Iterator returning items by reference, lifetime issue.
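As a rough illustration of the second option (delegating the data to a separate object), consider the sketch below. LogData, collect_lines, and the line-by-line item type are assumptions made for the example; the original reader yields (&str, &[ConvertedValue]) pairs and dispatches through next_fn, which is left out here.

```rust
// Hypothetical owner of the log text; the reader only borrows from it.
struct LogData {
    text: String,
}

// 'a is now the lifetime of a real borrow, so items can be &'a str.
struct LogReader<'a> {
    remaining: &'a str,
}

impl<'a> LogReader<'a> {
    fn new(data: &'a LogData) -> Self {
        LogReader { remaining: &data.text }
    }
}

impl<'a> Iterator for LogReader<'a> {
    type Item = &'a str;

    // Yields one line per call, borrowed from LogData rather than from self.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining.is_empty() {
            return None;
        }
        let (line, rest) = match self.remaining.split_once('\n') {
            Some((line, rest)) => (line, rest),
            None => (self.remaining, ""),
        };
        self.remaining = rest;
        Some(line)
    }
}

// Hypothetical helper: the returned lines borrow from `data`, not from the
// short-lived reader.
fn collect_lines(data: &LogData) -> Vec<&str> {
    LogReader::new(data).collect()
}
```

The lifetime parameter does not disappear, but it now describes something real. A struct such as LogstreamProcessor could own the LogData and create a LogReader on demand, which keeps the lifetime out of its own definition.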

Define a trait with a function that returns an associated type with the same lifetime as one parameter

I'm trying to define a trait with a function that returns an associated type with the same lifetime as one parameter.
Conceptually something like the following (which doesn't work: lifetime parameter not allowed on this type [Self::Output]):
trait Trait {
type Output;
fn get_output<'a>(&self, input: &'a i32) -> Self::Output<'a>;
}
I found several questions about lifetimes for associated types on Stack Overflow and the Internet, but none seem to help. Some suggested defining the lifetime on the whole trait:
trait Trait<'a> {
type Output;
fn get_output(&self, input: &'a i32) -> Self::Output;
}
but this doesn't work either: it compiles, but then the following function fails to compile:
fn f<'a, T>(var: &T)
where T: Trait<'a>
{
let input = 0i32;
let output = var.get_output(&input);
}
giving an error:
error: `input` does not live long enough
--> <anon>:9:35
|
| let output = var.get_output( &input );
| ^^^^^ does not live long enough
| }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 7:48...
--> <anon>:7:49
|
| fn f<'a, T>( var : &T ) where T : Trait<'a> {
| _________________________________________________^ starting here...
| | let input = 0i32;
| | let output = var.get_output( &input );
| | }
| |_^ ...ending here
How should I define the trait so that it behaves the way I want?

This was impossible when this answer was written, even in nightly Rust. (Generic associated types, the feature described below, have since been stabilized in Rust 1.65.)
This requires some form of Higher Kinded Types (HKT), and the approach envisaged at the time was dubbed Associated Type Constructor (ATC).
The main motivation for introducing ATC is actually this very use case.
With ATC, the syntax would be:
trait Trait {
type Output<'a>;
fn get_output<'a>(&self, input: &'a i32) -> Self::Output<'a>;
}
Note: if you wish to learn about ATCs, see Niko's series of articles.
For the particular example you have, you can work around this with HRTB (Higher-Ranked Trait Bounds) as noted by Shepmaster:
fn f<T>(var: &T)
where for<'a> T: Trait<'a>
{ ... }
However, this is fairly limited (notably, limited to trait bounds).
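On Rust 1.65 or later, where this feature is stable under the name generic associated types, the syntax can be exercised as in the following sketch. Echo is a hypothetical implementor added for the example, and f mirrors the function from the question.

```rust
trait Trait {
    // The associated type takes its own lifetime parameter.
    type Output<'a>;
    fn get_output<'a>(&self, input: &'a i32) -> Self::Output<'a>;
}

// Hypothetical implementor whose output borrows from the input.
struct Echo;

impl Trait for Echo {
    type Output<'a> = &'a i32;

    fn get_output<'a>(&self, input: &'a i32) -> Self::Output<'a> {
        input
    }
}

// No lifetime parameter is needed on f: 'a is chosen at the call to
// get_output, so borrowing a local variable is fine.
fn f<T: Trait>(var: &T) {
    let input = 0i32;
    let _output = var.get_output(&input);
}
```

Compared with the Trait<'a> workaround, the lifetime is attached to the method call instead of to the trait, so callers do not need a for<'a> bound.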

This requires using higher-ranked trait bounds:
fn f<T>(var: &T)
where for<'a> T: Trait<'a>
{
let input = 0i32;
let output = var.get_output(&input);
}
See also:
How does for<> syntax differ from a regular lifetime bound?

Restrict lifetime parameter to scope of parameters of a function

Consider the following example
trait MyTrait<'a> {
type N: 'a;
fn func(&'a self) -> Self::N;
}
fn myfunc<'a, T: 'a + MyTrait<'a>>(g: T) {
g.func();
}
fn main() {}
Compiling this small program fails with:
error[E0597]: `g` does not live long enough
--> src/main.rs:8:5
|
8 | g.func();
| ^ borrowed value does not live long enough
9 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 7:1...
--> src/main.rs:7:1
|
7 | fn myfunc<'a, T: 'a + MyTrait<'a>>(g: T) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As far as I understand, the lifetime parameter 'a is not restricted and could be arbitrary. However, g is a parameter and its lifetime is only the function scope, therefore it does not satisfy the condition of lifetime 'a in the definition of method func.
What I really want is that the associated type N is always restricted to the lifetime of self in MyTrait. That's why I came up with the explicit lifetime parameter 'a of MyTrait. I want function myfunc to work, i.e. 'a should somehow be restricted to the lifetime of the parameter g.
What is the "correct" way to solve this problem?
A very simple example is
struct MyPtr<'a> {
x: &'a usize,
}
struct MyStruct {
data: Vec<usize>,
}
impl<'a> MyTrait<'a> for MyStruct {
type N = MyPtr<'a>;
fn func(&'a self) -> Self::N {
MyPtr { x: &self.data[0] }
}
}
Note that this is extremely simplified, of course. The idea is that N always contains a reference to something contained in MyTrait and should therefore never outlive MyTrait.

What you want is not to bind a generic lifetime, but to allow "any" lifetime:
fn myfunc<T: for<'a> MyTrait<'a>>(g: T) {
g.func();
}
Fully working example in the playground.
The best source for an explanation is How does for<> syntax differ from a regular lifetime bound?.
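For completeness, here is a self-contained sketch that combines the types from the question with the higher-ranked bound. The function first_value is a hypothetical addition; it pins the associated type to MyPtr<'a> so that the result of func can actually be read.

```rust
trait MyTrait<'a> {
    type N: 'a;
    fn func(&'a self) -> Self::N;
}

struct MyPtr<'a> {
    x: &'a usize,
}

struct MyStruct {
    data: Vec<usize>,
}

impl<'a> MyTrait<'a> for MyStruct {
    type N = MyPtr<'a>;
    fn func(&'a self) -> Self::N {
        MyPtr { x: &self.data[0] }
    }
}

// T must implement MyTrait<'a> for every 'a, so the short borrow of the
// parameter `g` is acceptable.
fn myfunc<T: for<'a> MyTrait<'a>>(g: T) {
    g.func();
}

// Hypothetical variant that also names the associated type, so the value
// behind the pointer can be returned.
fn first_value<T>(g: T) -> usize
where
    T: for<'a> MyTrait<'a, N = MyPtr<'a>>,
{
    let ptr = g.func();
    *ptr.x
}
```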

What are the differences between specifying lifetime parameters on an impl or on a method?

In Rust 1.3.0, the Deref trait has the following signature in the documentation:
pub trait Deref {
type Target: ?Sized;
fn deref(&'a self) -> &'a Self::Target;
}
I would implement it without naming the lifetimes, since they get elided anyway. However, in the docs example it looks like this:
use std::ops::Deref;
struct DerefExample<T> {
value: T
}
impl<T> Deref for DerefExample<T> {
type Target = T;
fn deref<'a>(&'a self) -> &'a T {
&self.value
}
}
fn main() {
let x = DerefExample { value: 'a' };
assert_eq!('a', *x);
}
This works all well and good, but if I specify the lifetime parameter 'a on the impl instead of the method:
struct DerefExample<T> {
value: T
}
impl<'a, T> Deref for DerefExample<T> {
type Target = T;
fn deref(&'a self) -> &'a T {
&self.value
}
}
I get the following error:
error[E0308]: method not compatible with trait
--> src/main.rs:10:5
|
10 | / fn deref(&'a self) -> &'a T {
11 | | &self.value
12 | | }
| |_____^ lifetime mismatch
|
= note: expected type `fn(&DerefExample<T>) -> &T`
found type `fn(&'a DerefExample<T>) -> &'a T`
note: the anonymous lifetime #1 defined on the method body at 10:5...
--> src/main.rs:10:5
|
10 | / fn deref(&'a self) -> &'a T {
11 | | &self.value
12 | | }
| |_____^
note: ...does not necessarily outlive the lifetime 'a as defined on the impl at 7:1
--> src/main.rs:7:1
|
7 | / impl<'a, T> Deref for DerefExample<T> {
8 | | type Target = T;
9 | |
10 | | fn deref(&'a self) -> &'a T {
11 | | &self.value
12 | | }
13 | | }
| |_^
This confuses me. The method's signature is no different than the one from the docs. In addition, I thought that the difference between specifying the lifetime parameter on the impl or on the method directly is in the scope of the parameter only, so it can be used in the entire impl block instead of just the method. What am I missing here?

Yes, there is a difference.
The method's signature is no different than the one from the docs.
The fact that it looks like this in docs is a fault of rustdoc, and has since been resolved.
If you press the [src] link in the upper right corner of the documentation, you will be redirected to the actual source of Deref, which looks as follows (I've removed extra attributes and comments):
pub trait Deref {
type Target: ?Sized;
fn deref<'a>(&'a self) -> &'a Self::Target;
}
You can see that deref() is declared to have a lifetime parameter.
I thought that the difference between specifying the lifetime parameter on the impl or on the method directly is in the scope of the parameter only.
And this is wrong. The difference is not in scope only. I don't think I will be able to provide convincing side-by-side examples where a semantic difference is visible, but consider the following reasoning.
First, lifetime parameters are no different from generic type parameters. It is no coincidence that they use similar declaration syntax. Like generic parameters, lifetime parameters participate in the method/function signature, so if you want to implement a trait which has a method with lifetime parameters, your implementation must have the same lifetime parameters as well (modulo possible renaming).
Second, lifetime parameters in impl signature are used to express different kinds of lifetime relationship than those on functions. For methods, it is always the caller who determines the actual lifetime parameter they want to use. It is, again, similar to how generic methods work - the caller may instantiate its type parameters with any type they need. It is very important, for Deref in particular - you would want that anything which implements Deref may be dereferenced with the lifetime of the reference the method is called on, not something else.
With impl, however, lifetime parameters are chosen not when the method which uses this parameter is called, but when the appropriate impl is chosen by the compiler. It may do so (and usually does so) based on the type of the value, which precludes the user from specifying arbitrary lifetimes when the method is called. For example:
struct Bytes<'a>(&'a [u8]);
impl<'a> Bytes<'a> {
fn first_two(&self) -> &'a [u8] {
&self.0[..2]
}
}
Here, the first_two() method returns a slice with the lifetime of the value stored inside the Bytes structure. The caller of the method can't decide which lifetime they want: it is always fixed to the lifetime of the slice inside the structure this method is called on. It is also impossible to move the lifetime parameter down to the method while keeping the same semantics, since a method-level parameter would be chosen by the caller rather than fixed by the type.
In your case the lifetime parameter you specified participates neither in the signature of the impl nor in any associated types, so it theoretically could be used as if it was declared on each function separately (because it can be arbitrary when the method is called), but then the reasoning about method signatures (provided above) kicks in.
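The difference can be observed with the Bytes type above. In this sketch, first_two_short, prefix, and prefix_len are hypothetical additions used only to contrast the two kinds of lifetime.

```rust
struct Bytes<'a>(&'a [u8]);

impl<'a> Bytes<'a> {
    // 'a comes from the impl: it is fixed by the type of the value.
    fn first_two(&self) -> &'a [u8] {
        &self.0[..2]
    }

    // Hypothetical variant with an elided, per-call lifetime: the result
    // is tied to the borrow of `self`, not to the data inside.
    fn first_two_short(&self) -> &[u8] {
        &self.0[..2]
    }
}

// Compiles: the returned slice borrows from `data`, so it may outlive the
// temporary wrapper.
fn prefix(data: &[u8]) -> &[u8] {
    let wrapper = Bytes(data);
    wrapper.first_two()
}

// The short variant is still usable while the wrapper is alive.
fn prefix_len(data: &[u8]) -> usize {
    let wrapper = Bytes(data);
    wrapper.first_two_short().len()
}
```

Swapping first_two for first_two_short inside prefix is rejected by the borrow checker, because the returned slice would then be tied to the local wrapper.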
