Rust lifetimes for struct references - rust

I've just started with Rust and can't quite grasp lifetimes well enough to resolve the following issue by myself:
This test project is about simulating a bit to allow tracing it through various bitwise operations, e.g. let newbit = oldbit1 ^ oldbit2 and looking at newbit I can tell afterwards it came out of an XOR operation with oldbit1 and oldbit2 as operands.
#[derive(Copy,Clone)]
pub enum TraceOperation {
AND,
OR,
XOR,
NOT,
}
#[derive(Copy,Clone)]
pub struct TraceBit<'a> {
source_a: Option<&'a TraceBit<'a>>,
source_b: Option<&'a TraceBit<'a>>,
source_op: Option<TraceOperation>,
value: bool,
}
This compiles, but I don't fully understand why the lifetime parameters are needed that way. I assume that the compiler cannot expect that the members source_a and source_b live as long as the struct itself as this may not hold true, so explicit lifetimes are required.
Is this assumption correct?
Further I don't fully understand why I have to re-specify the lifetime parameter for the reference type, i.e. why I have to write source_a: Option<&'a TraceBit<'a>> as opposed to source_a: Option<&'a TraceBit>.
What is the second lifetime used for? How do I read that line out loud? I have:
"source_a is a variable of type Option that may have Some reference (that is valid at least as long as the struct itself and as long as member source_b) to an instance of TraceBit"
My final issue is that I cannot make it work using an overloaded operator:
use std::ops::BitXor;
impl<'a> BitXor for TraceBit<'a> {
type Output = Self;
fn bitxor(self, rhs: Self) -> Self {
let valA: usize = if self.value { 1 } else { 0 };
let valB: usize = if rhs.value { 1 } else { 0 };
let val = if valA ^ valB != 0 { true } else { false };
TraceBit { source_a: Some(&self), source_b: Some(&rhs), source_op: Some(TraceOperation::XOR), value: val }
}
}
This is basically pure guessing based on BitXor documentation. So what I try to do, in a very explicit manner, is to perform an xor operation on the two input variables and create a new TraceBit as output with the inputs stored in it as reference.
error[E0597]: `self` does not live long enough
--> libbittrace/src/lib.rs:37:30
|
37 | TraceBit { source_a: Some(&self), source_b: Some(&rhs), source_op: Some(TraceOperation::XOR), value: val }
| ^^^^ does not live long enough
38 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the impl at 31:1...
--> libbittrace/src/lib.rs:31:1
|
31 | / impl<'a> BitXor for TraceBit<'a> {
32 | | type Output = Self;
33 | | fn bitxor(self, rhs: Self) -> Self {
34 | | let valA: usize = if self.value { 1 } else { 0 };
... |
40 | |
41 | | }
| |_^
error[E0597]: `rhs` does not live long enough
--> libbittrace/src/lib.rs:37:53
|
37 | TraceBit { source_a: Some(&self), source_b: Some(&rhs), source_op: Some(TraceOperation::XOR), value: val }
| ^^^ does not live long enough
38 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the impl at 31:1...
--> libbittrace/src/lib.rs:31:1
|
31 | / impl<'a> BitXor for TraceBit<'a> {
32 | | type Output = Self;
33 | | fn bitxor(self, rhs: Self) -> Self {
34 | | let valA: usize = if self.value { 1 } else { 0 };
... |
40 | |
41 | | }
| |_^
error: aborting due to 2 previous errors
It seems like nothing lives longer than the xor operation itself, but how can I resolve this?
I've tried various workarounds and changes to the code, but to no avail, and in any case I would rather understand the issue than guess a correct solution.

Tree-like structures that own their children need an owning pointer type such as Box (here Option<Box<TraceBit>>). In general, in structs you should prefer owned types.
Rust references aren't mere pointers. They are borrows (compile-time read/write locks) of data that must exist as owned somewhere else.
So if you have an owned version of TraceBit:
pub struct TraceBit {
source_a: Option<Box<TraceBit>>,
}
then a reference to it has the type &'a TraceBit, but a reference to a type doesn't change how the type looks internally, so the type of source_a is still Option<Box<TraceBit>>. You can keep getting &'a TraceBit references recursively, step by step:
trace_bit = trace_bit.source_a.as_ref().unwrap();
but there's no construct in Rust where taking a reference to the root of a tree suddenly changes the whole tree into a tree of references, so the type you are creating can't exist, and that's why you can't get type annotations right.
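As a rough illustration of the owned approach, here is a minimal sketch of the question's XOR operator in which each TraceBit owns its operands through Box. The constructor `new` is a hypothetical helper that is not in the original code, and the enum variant names are kept as in the question.

```rust
use std::ops::BitXor;

#[derive(Copy, Clone, Debug, PartialEq)]
pub enum TraceOperation {
    AND,
    OR,
    XOR,
    NOT,
}

#[derive(Clone, Debug)]
pub struct TraceBit {
    // Owned children: the tree of operands lives inside the result itself.
    source_a: Option<Box<TraceBit>>,
    source_b: Option<Box<TraceBit>>,
    source_op: Option<TraceOperation>,
    value: bool,
}

impl TraceBit {
    // Hypothetical constructor for a bit with no recorded history.
    pub fn new(value: bool) -> Self {
        TraceBit { source_a: None, source_b: None, source_op: None, value }
    }
}

impl BitXor for TraceBit {
    type Output = TraceBit;

    // Both operands are taken by value and moved into the result,
    // so nothing borrowed has to outlive this function.
    fn bitxor(self, rhs: TraceBit) -> TraceBit {
        let value = self.value ^ rhs.value;
        TraceBit {
            source_a: Some(Box::new(self)),
            source_b: Some(Box::new(rhs)),
            source_op: Some(TraceOperation::XOR),
            value,
        }
    }
}
```

Because the operator consumes its operands, `a ^ b` moves both `a` and `b` into the result; clone them first if they are needed afterwards.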

Maybe instead of passing references around, you should use a contained and cloneable name type.
use std::rc::Rc;
#[derive(Debug)]
pub enum TraceOperation {
AND,
OR,
XOR,
NOT,
}
#[derive(Debug)]
pub enum BitName<T> {
Name(Rc<T>),
Combination(Rc<(TraceOperation, BitName<T>, BitName<T>)>),
}
impl<T> Clone for BitName<T> {
fn clone(&self) -> Self {
match self {
&BitName::Name(ref x) => BitName::Name(Rc::clone(x)),
&BitName::Combination(ref x) => BitName::Combination(Rc::clone(x)),
}
}
}
impl<T> From<T> for BitName<T> {
fn from(x:T) -> Self {
BitName::Name(Rc::new(x))
}
}
impl<T> BitName<T> {
pub fn combine(op : TraceOperation, a : &Self, b :&Self) -> Self {
BitName::Combination(Rc::new((op, (*a).clone(), (*b).clone())))
}
}
fn main() {
let x : BitName<String> = BitName::from(String::from("x"));
let y : BitName<String> = BitName::from(String::from("y"));
let xandy = BitName::combine(TraceOperation::AND, &x, &y);
println!("{:?}", xandy);
}

Related

Compiler error - mismatched types when trying to implement a proxy pattern in rust

I am trying to implement a proxy in Rust that allows me to read a model, but also to perform mutable operations. To avoid having two different implementations of the proxy (one mutable, one immutable), I made the proxy implementation generic.
I would like the proxy to create other instances of self, in different situations, and this is where I got a type mismatch.
The code below reproduces the problem I have:
use std::borrow::{Borrow, BorrowMut};
struct Model {
data: Vec<u32>,
}
impl Model {
fn get_proxy(&self) -> Proxy<&Model> {
Proxy::new(self)
}
fn get_proxy_mut(&mut self) -> Proxy<&mut Model> {
Proxy::new(self)
}
}
struct Proxy<M: Borrow<Model>> {
model: M
}
impl<M: Borrow<Model>> Proxy<M> {
fn new(model: M) -> Self {
Self {model}
}
fn get_another(&self) -> Proxy<M> {
// FIRST ERROR here
self.model.borrow().get_proxy()
}
}
impl<M: BorrowMut<Model>> Proxy<M> {
fn get_another_mut(&mut self) -> Proxy<M> {
// SECOND ERROR here
self.model.borrow_mut().get_proxy_mut()
}
}
The first compiler error looks like
error[E0308]: mismatched types
--> src\sandbox.rs:28:9
|
21 | impl<'m, M: Borrow<Model>> Proxy<M> {
| - this type parameter
...
26 | fn get_another(&self) -> Proxy<M> {
| -------- expected `Proxy<M>` because of return type
27 | // FIRST ERROR here
28 | self.model.borrow().get_proxy()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `M`, found `&Model`
|
= note: expected struct `Proxy<M>`
found struct `Proxy<&Model>`
Any idea how to fix the first error? (I assume the second one will be fixed in a similar way).

Custom data format - `Deserializer::deserialize_str` implementation

I am trying to implement a custom data format with serde, and I've been struggling with the deserialize_str method:
pub struct Deserializer<R> {
rdr: R,
}
impl<'de, 'a, R: io::Read + 'de> de::Deserializer<'de> for &'a mut Deserializer<R> {
fn deserialize_str<V>(self, visitor: V) -> Result<V::Value>
where
V: Visitor<'de>,
{
let len = self.read_i16()?; // implementation below
if len == 0 || len == -1 {
return visitor.visit_borrowed_str("");
}
let len = len as usize;
let buf = self.read_exact(len)?; // implementation below
let out_str = std::str::from_utf8(&buf)?;
// visitor.visit_borrowed_str(out_str) doesn't compile
visitor.visit_str(out_str) // compiles but errors
}
}
impl<R: io::Read> Deserializer<R> {
fn read_exact(&mut self, len: usize) -> Result<Vec<u8>> {
let mut buf = vec![0; len];
self.rdr.read_exact(&mut buf)?;
Ok(buf)
}
fn read_i16(&mut self) -> io::Result<i16> {
self.rdr.read_i16::<byteorder::NetworkEndian>()
}
}
When using visitor.visit_borrowed_str(out_str), I get the error
|
94 | impl<'de, 'a, R: io::Read + 'de> de::Deserializer<'de> for &'a mut Deserializer<R> {
| --- lifetime `'de` defined here
...
149 | let out_str = std::str::from_utf8(&buf)?;
| ^^^^ borrowed value does not live long enough
150 |
151 | visitor.visit_borrowed_str(out_str)
| ----------------------------------- argument requires that `buf` is borrowed for `'de`
152 | }
| - `buf` dropped here while still borrowed
I understand that out_str needs to somehow live longer than its scope, but I can't find a way to go about it.
To use visit_borrowed_str, you need to hand it a reference to something that lives at least as long as 'de, i.e. the input data itself. A temporary Vec created by read_exact won't do; you need access to the underlying slice, e.g. std::str::from_utf8(&self.rdr.get_ref()[self.rdr.position() as usize..][..len]) or similar (this assumes R is something like a std::io::Cursor over the input bytes). If you want to keep R a generic std::io::Read, I don't think you can use visit_borrowed_str. serde_json, for example, handles this by having a special Read trait that returns a reference to the underlying data if it can, and it only uses visit_borrowed_str when it does have such a reference.
Also, if you ask a deserializer to deserialize to a borrowed string when it can't, it must necessarily error. That holds true for serde_json as well. So the error from visit_str is not an error in your deserializer implementation, but an error in how you use the deserializer. You should have asked to deserialize to a String or Cow<str> instead (not that your deserializer could ever give you a Cow::Borrowed, but asking for a &str just isn't a good idea with any deserializer; asking for a Cow<str> is what is generally recommended).
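The borrowed-versus-owned distinction can be sketched with the standard library alone. The trait `StrSource` and the types `SliceSource` and `IoSource` below are hypothetical stand-ins invented for this illustration; they are not serde or serde_json APIs. A source backed by a slice can hand out a string borrowed for 'de, while a source backed by a generic Read has to copy.

```rust
use std::borrow::Cow;
use std::error::Error;
use std::io::Read;

// Hypothetical trait: yields a string that is borrowed from the input
// when possible, and owned otherwise.
trait StrSource<'de> {
    fn read_str(&mut self, len: usize) -> Result<Cow<'de, str>, Box<dyn Error>>;
}

// Input that is already in memory for the whole lifetime 'de.
struct SliceSource<'de> {
    data: &'de [u8],
    pos: usize,
}

impl<'de> StrSource<'de> for SliceSource<'de> {
    fn read_str(&mut self, len: usize) -> Result<Cow<'de, str>, Box<dyn Error>> {
        let end = self
            .pos
            .checked_add(len)
            .filter(|&e| e <= self.data.len())
            .ok_or("unexpected end of input")?;
        // Copy the shared reference out so the slice keeps the 'de lifetime
        // instead of being tied to the short borrow of `self`.
        let data: &'de [u8] = self.data;
        let bytes = &data[self.pos..end];
        self.pos = end;
        Ok(Cow::Borrowed(std::str::from_utf8(bytes)?))
    }
}

// Input behind a generic reader: the bytes only exist in a temporary buffer,
// so the only option is to return an owned String.
struct IoSource<R> {
    rdr: R,
}

impl<'de, R: Read> StrSource<'de> for IoSource<R> {
    fn read_str(&mut self, len: usize) -> Result<Cow<'de, str>, Box<dyn Error>> {
        let mut buf = vec![0u8; len];
        self.rdr.read_exact(&mut buf)?;
        Ok(Cow::Owned(String::from_utf8(buf)?))
    }
}
```

In a real serde Deserializer, the borrowed case would correspond to calling visit_borrowed_str and the owned case to visit_str or visit_string.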

"cannot infer an appropriate lifetime" when attempting to return a chunked response with hyper

I would like to return binary data in chunks of specific size. Here is a minimal example.
I made a wrapper struct for hyper::Response to hold my data like status, status text, headers and the resource to return:
pub struct Response<'a> {
pub resource: Option<&'a Resource>
}
This struct has a build method that creates the hyper::Response:
impl<'a> Response<'a> {
pub fn build(&mut self) -> Result<hyper::Response<hyper::Body>, hyper::http::Error> {
let mut response = hyper::Response::builder();
match self.resource {
Some(r) => {
let chunks = r.data
.chunks(100)
.map(Result::<_, std::convert::Infallible>::Ok);
response.body(hyper::Body::wrap_stream(stream::iter(chunks)))
},
None => response.body(hyper::Body::from("")),
}
}
}
There is also another struct holding the database content:
pub struct Resource {
pub data: Vec<u8>
}
Everything works until I try to create a chunked response. The Rust compiler gives me the following error:
error[E0495]: cannot infer an appropriate lifetime due to conflicting requirements
--> src/main.rs:14:15
|
14 | match self.resource {
| ^^^^^^^^^^^^^
|
note: first, the lifetime cannot outlive the lifetime `'a` as defined on the impl at 11:6...
--> src/main.rs:11:6
|
11 | impl<'a> Response<'a> {
| ^^
note: ...so that the types are compatible
--> src/main.rs:14:15
|
14 | match self.resource {
| ^^^^^^^^^^^^^
= note: expected `Option<&Resource>`
found `Option<&'a Resource>`
= note: but, the lifetime must be valid for the static lifetime...
note: ...so that the types are compatible
--> src/main.rs:19:31
|
19 | response.body(hyper::Body::wrap_stream(stream::iter(chunks)))
| ^^^^^^^^^^^^^^^^^^^^^^^^
= note: expected `From<&[u8]>`
found `From<&'static [u8]>`
I don't know how to fulfill these lifetime requirements. How can I do this correctly?
The problem is not the 'a itself, but the fact that the slice method chunks() returns an iterator that borrows the original slice. You are trying to create a stream from this Chunks<'_, u8> value, but the stream is required to be 'static. Even if your Response did not have the 'a lifetime, you would still have r.data borrowed, and it would still fail.
Remember that here 'static does not mean that the value lives forever, but that it can be made to live as long as necessary. That is, the future must not hold any (non-'static) borrows.
You could clone all the data, but if it is very big, that can be costly. If so, you could try using Bytes, which is like Vec<u8> but reference counted.
It looks like there is no Bytes::chunks() function that returns an iterator of Bytes. Fortunately it is easy to do it by hand.
Lastly, remember that iterators in Rust are lazy, so they keep the original data borrowed, even if it is a Bytes. So we need to collect them into a Vec to actually own the data (playground):
pub struct Resource {
pub data: Bytes,
}
impl<'a> Response<'a> {
pub fn build(&mut self) -> Result<hyper::Response<hyper::Body>, hyper::http::Error> {
let mut response = hyper::Response::builder();
match self.resource {
Some(r) => {
let len = r.data.len();
let chunks = (0..len)
.step_by(100)
.map(|x| {
let range = x..len.min(x + 100);
Ok(r.data.slice(range))
})
.collect::<Vec<Result<Bytes, std::convert::Infallible>>>();
response.body(hyper::Body::wrap_stream(stream::iter(chunks)))
}
None => response.body(hyper::Body::from("")),
}
}
}
UPDATE: We can avoid the call to collect() if we notice that stream::iter() takes ownership of an IntoIterator that can be evaluated lazily, as long as we make it 'static. It can be done if we do a (cheap) clone of r.data and move it into the lambda (playground):
let data = r.data.clone();
let len = data.len();
let chunks = (0..len).step_by(100)
.map(move |x| {
let range = x .. len.min(x + 100);
Result::<_, std::convert::Infallible>::Ok(data.slice(range))
});
response.body(hyper::Body::wrap_stream(stream::iter(chunks)))
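The same move-into-the-closure idea can be tried without hyper or the bytes crate. The sketch below uses Arc<[u8]> as a stand-in for Bytes, and `owned_chunks` is a hypothetical helper name. Unlike Bytes::slice, it copies each chunk into a Vec, so it only illustrates how the iterator becomes 'static and stays lazy; it assumes `size` is greater than zero, since step_by panics on a zero step.

```rust
use std::sync::Arc;

// Returns a lazy iterator of owned chunks. The closure owns a handle to the
// data (moved in), so the iterator holds no borrow and can be 'static.
fn owned_chunks(data: Arc<[u8]>, size: usize) -> impl Iterator<Item = Vec<u8>> + 'static {
    let len = data.len();
    (0..len).step_by(size).map(move |start| {
        let end = len.min(start + size);
        // Stand-in for Bytes::slice: this copies the chunk.
        data[start..end].to_vec()
    })
}

// Helper used only to check at compile time that a value is 'static.
fn assert_static<T: 'static>(_: &T) {}
```

Cloning the Arc before the call is cheap (a reference-count increment), which mirrors the cheap clone of r.data in the update above.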

How to return a struct with a reference to self in Rust?

I've implemented a struct which has a list of crontab entries, each of which knows its own recurrence (such as */5 * * * * in crontab):
extern crate chrono;
use chrono::NaiveDateTime;
pub struct Crontab<'a> {
entries: Vec<Entry<'a>>,
}
pub struct Entry<'a> {
pub recurrence: Recurrence,
pub command: &'a str,
}
pub struct Recurrence {
minutes: Vec<u8>,
hours: Vec<u8>,
days_of_month: Vec<u8>,
months: Vec<u8>,
days_of_week: Vec<u8>,
}
Based on the current time you can get the next occurrence of a command:
impl Recurrence {
pub fn next_match(&self, after: NaiveDateTime) -> NaiveDateTime {
unimplemented!()
}
}
I'm trying to write a function on Crontab to get the Entry which will run next (that is, for which recurrence.next_match() is the lowest).
impl<'a> Crontab<'a> {
fn next_run(&self, from: NaiveDateTime) -> Run<'a> {
&self.entries
.into_iter()
.map(|entry| Run {
entry: &entry,
datetime: entry.recurrence.next_match(from),
})
.min_by(|this, other| this.datetime.cmp(&other.datetime))
.unwrap()
}
}
struct Run<'a> {
entry: &'a Entry<'a>,
datetime: NaiveDateTime,
}
This generates the error:
error[E0308]: mismatched types
--> src/main.rs:30:9
|
29 | fn next_run(&self, from: NaiveDateTime) -> Run<'a> {
| ------- expected `Run<'a>` because of return type
30 | / &self.entries
31 | | .into_iter()
32 | | .map(|entry| Run {
33 | | entry: &entry,
... |
36 | | .min_by(|this, other| this.datetime.cmp(&other.datetime))
37 | | .unwrap()
| |_____________________^ expected struct `Run`, found &Run<'_>
|
= note: expected type `Run<'a>`
found type `&Run<'_>`
Similar variants I've tried fail to compile with messages such as "cannot move out of borrowed content" (if changing the return type to &Run<'a>) or that the &entry does not live long enough.
It seems to make most sense that the Run should have a reference to rather than a copy of the Entry, but I'm not sure how to juggle both the lifetimes and references to get to that point (and I don't know whether 'a refers to the same lifetime in both structs). What am I missing here?
As described in Is there any way to return a reference to a variable created in a function?, you cannot create a value in a function and return a reference to it. Nothing would own the result of your iterator chain, thus the reference would point at invalid data.
That doesn't even really matter: as pointed out in the comments, you cannot call into_iter on self.entries because you cannot move out of borrowed content to start with, as described in Cannot move out of borrowed content. This means that we cannot have an owned value of an Entry as the result of the iterator chain to start with.
Crontab owns the Entry; as soon as the Crontab moves, any reference to any Entry becomes invalid. This means that any references need to be tied to how long self lives; the generic lifetime 'a cannot come into play:
fn next_run(&self, from: NaiveDateTime) -> Run {
self.entries
.iter()
.map(|entry| Run {
entry,
datetime: entry.recurrence.next_match(from),
})
.min_by(|this, other| this.datetime.cmp(&other.datetime))
.unwrap()
}
Or the explicit version:
fn next_run<'b>(&'b self, from: NaiveDateTime) -> Run<'b> { /* ... */ }
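For reference, here is a self-contained sketch of the explicit version that compiles without chrono. It makes several assumptions that are not in the original: timestamps are plain u64 values instead of NaiveDateTime, Recurrence is reduced to a hypothetical `interval` field, `next_match` returns the next multiple of that interval, and `next_run` returns an Option instead of unwrapping.

```rust
pub struct Recurrence {
    // Hypothetical simplification: fire every `interval` ticks (must be > 0).
    interval: u64,
}

impl Recurrence {
    // Next multiple of `interval` strictly after `after`.
    pub fn next_match(&self, after: u64) -> u64 {
        (after / self.interval + 1) * self.interval
    }
}

pub struct Entry<'a> {
    pub recurrence: Recurrence,
    pub command: &'a str,
}

pub struct Crontab<'a> {
    entries: Vec<Entry<'a>>,
}

pub struct Run<'a> {
    entry: &'a Entry<'a>,
    datetime: u64,
}

impl<'a> Crontab<'a> {
    // The returned Run borrows from `self`, so its lifetime is 'b (the borrow
    // of the Crontab), not the generic 'a of the command strings.
    fn next_run<'b>(&'b self, from: u64) -> Option<Run<'b>> {
        self.entries
            .iter()
            .map(|entry| Run {
                entry,
                datetime: entry.recurrence.next_match(from),
            })
            .min_by_key(|run| run.datetime)
    }
}
```

Returning None for an empty Crontab avoids the panic that unwrap() would cause; whether that is preferable depends on the caller.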

Return type for rusqlite MappedRows

I am trying to write a method that returns a rusqlite::MappedRows:
pub fn dump<F>(&self) -> MappedRows<F>
where F: FnMut(&Row) -> DateTime<UTC>
{
let mut stmt =
self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
let c: F = |row: &Row| {
let created_at: DateTime<UTC> = row.get(0);
created_at
};
stmt.query_map(&[], c).unwrap()
}
I am getting stuck on a compiler error:
error[E0308]: mismatched types
--> src/main.rs:70:20
|
70 | let c: F = |row: &Row| {
| ____________________^ starting here...
71 | | let created_at: DateTime<UTC> = row.get(0);
72 | | created_at
73 | | };
| |_________^ ...ending here: expected type parameter, found closure
|
= note: expected type `F`
= note: found type `[closure#src/main.rs:70:20: 73:10]`
What am I doing wrong here?
I tried passing the closure directly to query_map but I get the same compiler error.
I'll divide the answer into two parts: the first about how to fix the return type without considering the borrow checker, the second about why it doesn't work even if you fix the return type.
§1.
Every closure has a unique, anonymous type, so c cannot be of any type F the caller provides. That means this line will never compile:
let c: F = |row: &Row| { ... } // no, wrong, always.
Instead, the type should be propagated out from the dump function, i.e. something like:
// ↓ no generics
pub fn dump(&self) -> MappedRows<“type of that c”> {
..
}
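When this answer was written, stable Rust did not provide a way to name that type, but nightly could do so with the "impl Trait" feature (return-position impl Trait has since been stabilized, in Rust 1.26, so the feature attribute is no longer needed on current toolchains):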
#![feature(conservative_impl_trait)]
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
..
}
// note: wrong, see §2.
The impl F here means that, “we are going to return a MappedRows<T> type where T: F, but we are not going to specify what exactly is T; the caller should be ready to treat anything satisfying F as a candidate of T”.
As your closure does not capture any variables, you could in fact turn c into a function. We could name a function pointer type, without needing "impl Trait".
// ↓~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> MappedRows<fn(&Row) -> DateTime<UTC>> {
let mut stmt = self.conn.prepare("SELECT created_at FROM work ORDER BY created_at ASC").unwrap();
fn c(row: &Row) -> DateTime<UTC> {
row.get(0)
}
stmt.query_map(&[], c as fn(&Row) -> DateTime<UTC>).unwrap()
}
// note: wrong, see §2.
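The point about closure types can be reproduced without rusqlite. The sketch below uses hypothetical function names and plain i32 arithmetic. It shows the signature that cannot work (commented out), an opaque return type, and the function-pointer alternative for closures that capture nothing. It assumes a toolchain where return-position impl Trait is available.

```rust
// The caller would choose F here, so a closure created inside the function
// can never be "the" F. Uncommenting this gives error[E0308]: mismatched types.
// fn make_generic<F: Fn(i32) -> i32>() -> F { |x| x + 1 }

// Opaque return type: the function, not the caller, picks the concrete type.
fn make_opaque() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

// A plain function has a nameable function-pointer type.
fn make_pointer() -> fn(i32) -> i32 {
    fn add_one(x: i32) -> i32 {
        x + 1
    }
    add_one
}

// A closure that captures nothing also coerces to a function pointer.
fn make_pointer_from_closure() -> fn(i32) -> i32 {
    |x| x * 2
}
```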
Anyway, if we do use "impl Trait", since MappedRows is used as an Iterator, it is more appropriate to just say so:
#![feature(conservative_impl_trait)]
pub fn dump<'c>(&'c self) -> impl Iterator<Item = Result<DateTime<UTC>>> + 'c {
..
}
// note: wrong, see §2.
(Without the 'c bound the compiler complains with E0564; it seems lifetime elision doesn't work with impl Trait yet.)
If you cannot use "impl Trait" (it was nightly-only when this was written), you could wrap the trait object in a Box, at the cost of heap allocation and dynamic dispatch:
pub fn dump(&self) -> Box<Iterator<Item = Result<DateTime<UTC>>>> {
...
Box::new(stmt.query_map(&[], c).unwrap())
}
// note: wrong, see §2.
§2.
The above fixes work if you want to, say, just return an independent closure or iterator. But they do not work if you return rusqlite::MappedRows. The compiler will not allow this because of a lifetime issue:
error: `stmt` does not live long enough
--> 1.rs:23:9
|
23 | stmt.query_map(&[], c).unwrap()
| ^^^^ does not live long enough
24 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 15:80...
--> 1.rs:15:81
|
15 | pub fn dump(conn: &Connection) -> MappedRows<impl FnMut(&Row) -> DateTime<UTC>> {
| ^
And this is correct. MappedRows<F> is actually MappedRows<'stmt, F>; this type is valid only while the original SQLite statement object (having the 'stmt lifetime) outlives it. Thus the compiler complains that stmt is dead when you return from the function.
Indeed, if the statement is dropped before we iterate on those rows, we will get garbage results. Bad!
What we need to do is to make sure all rows are read before dropping the statement.
You could collect the rows into a vector, thus disassociating the result from the statement, at the cost of storing everything in memory:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump(&self) -> Vec<Result<DateTime<UTC>>> {
..
let it = stmt.query_map(&[], c).unwrap();
it.collect()
}
Or invert the control, let dump accept a function, which dump will call while keeping stmt alive, at the cost of making the calling syntax weird:
// ↓~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pub fn dump<F>(&self, mut f: F) where F: FnMut(Result<DateTime<UTC>>) {
...
for res in stmt.query_map(&[], c).unwrap() {
f(res);
}
}
x.dump(|res| println!("{:?}", res));
Or split dump into two functions, and let the caller keep the statement alive, at the cost of exposing an intermediate construct to the user:
#![feature(conservative_impl_trait)]
pub fn create_dump_statement(&self) -> Statement {
self.conn.prepare("SELECT '2017-03-01 12:34:56'").unwrap()
}
pub fn dump<'s>(&self, stmt: &'s mut Statement) -> impl Iterator<Item = Result<DateTime<UTC>>> + 's {
stmt.query_map(&[], |row| row.get(0)).unwrap()
}
...
let mut stmt = x.create_dump_statement();
for res in x.dump(&mut stmt) {
println!("{:?}", res);
}
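The three workarounds above can be compared side by side in a sketch that uses only the standard library. `Statement` here is a hypothetical stand-in that owns a few i64 rows, and its `query_map` merely imitates the borrowing relationship of the real rusqlite method; none of the signatures below are rusqlite's.

```rust
// Hypothetical stand-in for a prepared statement that owns its rows.
struct Statement {
    rows: Vec<i64>,
}

impl Statement {
    // The returned iterator borrows the statement for 's, like MappedRows does.
    fn query_map<'s, F>(&'s mut self, f: F) -> impl Iterator<Item = i64> + 's
    where
        F: FnMut(&i64) -> i64 + 's,
    {
        self.rows.iter().map(f)
    }
}

fn prepare() -> Statement {
    Statement { rows: vec![3, 1, 2] }
}

// Option 1: read everything before the statement is dropped.
fn dump_collected() -> Vec<i64> {
    let mut stmt = prepare();
    let rows: Vec<i64> = stmt.query_map(|row| *row * 10).collect();
    rows
}

// Option 2: inverted control, the callback runs while the statement is alive.
fn dump_with<F: FnMut(i64)>(mut f: F) {
    let mut stmt = prepare();
    for value in stmt.query_map(|row| *row * 10) {
        f(value);
    }
}

// Option 3: the caller owns the statement and keeps it alive.
fn dump_split<'s>(stmt: &'s mut Statement) -> impl Iterator<Item = i64> + 's {
    stmt.query_map(|row| *row * 10)
}
```

In option 1 the result is bound to a local before returning, so the iterator that borrows `stmt` is dropped before `stmt` itself goes out of scope.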
The issue here is that you are implicitly trying to return a closure, so to find explanations and examples you can search for that.
The use of the generic <F> means that the caller decides the concrete type of F and not the function dump.
What you would like to achieve instead requires the long-awaited "impl Trait" feature.

Resources