GAT-related lifetime conflicts with mutex temporary lifetime - rust

I'm experimenting with GATs to enhance the API to an in-memory data store. The data is organized in values, where each value contains, among other things, a lookup key. You can think of it like a row in a database table, where the whole row is a "value", but it also contains a primary-key column or columns.
The idea is to describe this by a trait, so you can look for a particular value by providing the key. The key must be able to refer into the value, so that if the key-part of the value is String, you can look it up using just &str. This is where GATs enter the picture:
pub trait Value {
    type Key<'a>: PartialEq where Self: 'a;
    fn as_key<'a>(&'a self) -> Self::Key<'a>;
}
The Key<'a> GAT provides a lifetime that as_key() can use to return a value that refers to inner data. Note that as_key() can't just return a reference to the key because the returned key can be something that doesn't exist verbatim inside Self, such as a composite key. For example, these are all possible:
struct Data {
    s: String,
    n: u64,
    // ... more fields ...
}
// example 1: expose key as self.s as a &str key
impl Value for Data {
    type Key<'a> = &'a str;
    fn as_key(&self) -> &str { &self.s }
}
// example 2: expose key as a pair of (self.s.as_str(), self.n)
impl Value for Data {
    type Key<'a> = (&'a str, u64);
    fn as_key(&self) -> (&str, u64) { (&self.s, self.n) }
}
An example of generic code that makes use of this trait could look like this:
pub struct Table<T> {
    data: Vec<T>,
}
impl<T: Value> Table<T> {
    fn find<'a: 'k, 'k>(&'a self, k: T::Key<'k>) -> Option<usize> {
        self.data.iter().position(|v| v.as_key() == k)
    }
}
This works beautifully and you can play around with it in the playground. (A more realistic example would require Ord or Hash from Value::Key and build a more sophisticated storage, but this is enough to show the idea.)
Now, let's make a simple change and store the table data in a Mutex. The code looks almost the same, and since it only returns the position, the mutex manipulation should remain internal to the implementation:
struct Table<T> {
    data: Mutex<Vec<T>>,
}
impl<T: Value> Table<T> {
    pub fn find<'a: 'k, 'k>(&'a self, k: T::Key<'k>) -> Option<usize> {
        let data = self.data.lock().unwrap();
        data.iter().position(|v| v.as_key() == k)
    }
}
However, the above code doesn't compile - it complains that "data doesn't live long enough":
error[E0597]: `data` does not live long enough
--> src/main.rs:18:9
|
16 | pub fn find<'a: 'k, 'k>(&'a self, k: T::Key<'k>) -> Option<usize> {
| -- lifetime `'k` defined here
17 | let data = self.data.lock().unwrap();
18 | data.iter().position(|v| v.as_key() == k)
| ^^^^^^^^^^^ ---------- argument requires that `data` is borrowed for `'k`
| |
| borrowed value does not live long enough
19 | }
| - `data` dropped here while still borrowed
Playground
I don't quite understand this error - why would data need to live for the lifetime of the key we're comparing it to? I tried:
changing lifetimes so that lifetimes of 'k and 'a are fully decoupled
extracting the comparison to a simple function that receives &'a T and &T::Key<'b>, and returns a bool after comparing them (and which compiles on its own)
replacing Iterator::position() with an explicit for loop
But nothing helped, the error always remained in some form. Note that it's perfectly legal to place both v.as_key() and k in the same closure (e.g. like this), it's only when you attempt to compare them that the error arises.
My intuitive understanding of the problem is that the Eq bound associated with Value::Key<'a> only applies to another Key<'a>.
Is it possible to rework the lifetimes or the as_key() interface to work around this issue? Is this a variant of the issue described here?
EDIT: relaxing the PartialEq bound to HRTB for<'b> PartialEq<Self::Key<'b>> as suggested by kmdreko fixes the above examples, but breaks with generics. For example, this implementation of Value fails to compile:
struct NoKey<T>(T);
impl<T> Value for NoKey<T> {
    type Key<'a> = () where T: 'a;
    fn as_key(&self) -> () {
        ()
    }
}
with the error:
error[E0311]: the parameter type `T` may not live long enough
--> src/lib.rs:38:20
|
38 | type Key<'a> = () where T: 'a;
| ^^ ...so that the type `NoKey<T>` will meet its required lifetime bounds
Playground

Quoting the question: "My intuitive understanding of the problem is that the Eq bound associated with Value::Key<'a> only applies to another Key<'a>."
This is correct. You can relax this constraint by using a higher-ranked trait bound to require Key<'a> to be comparable to all Key<'b>:
pub trait Value {
    type Key<'a>: for<'b> PartialEq<Self::Key<'b>> // <-----
    where
        Self: 'a;
    fn as_key<'a>(&'a self) -> Self::Key<'a>;
}
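I don't think there's any other way. Most concrete types are covariant in their lifetime parameter, so the compiler can shorten a longer lifetime until both sides of the comparison agree. In generic code, however, T::Key<'k> is a projection the compiler cannot look inside, so it treats it as invariant in 'k, and I don't think there is a way to declare that 'k can be shortened.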
The issue with generics pointed out in the edited question can be worked around by requiring the generic to be 'static (playground). Note that the 'static bound only applies to the value as a whole, the key may still refer to parts of the value.
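Putting the pieces of this answer together, a minimal sketch of the Mutex-backed table could look like the following. The Table::new constructor and the pub fields are assumptions added so the sketch can be exercised; NoKey carries the 'static bound mentioned above. The &str key is used here because references of different lifetimes are comparable with each other in the standard library.

```rust
use std::sync::Mutex;

pub trait Value {
    // A key borrowed for 'a must be comparable with a key borrowed for any 'b.
    type Key<'a>: for<'b> PartialEq<Self::Key<'b>>
    where
        Self: 'a;
    fn as_key<'a>(&'a self) -> Self::Key<'a>;
}

pub struct Data {
    pub s: String,
    pub n: u64,
}

impl Value for Data {
    type Key<'a> = &'a str;
    fn as_key(&self) -> &str {
        &self.s
    }
}

// Generic value without a meaningful key; `T: 'static` is the workaround
// for the "may not live long enough" error from the edited question.
pub struct NoKey<T>(pub T);

impl<T: 'static> Value for NoKey<T> {
    type Key<'a> = ();
    fn as_key(&self) {}
}

pub struct Table<T> {
    data: Mutex<Vec<T>>,
}

impl<T: Value> Table<T> {
    // Hypothetical constructor, not part of the original question.
    pub fn new(data: Vec<T>) -> Self {
        Table { data: Mutex::new(data) }
    }

    pub fn find<'a: 'k, 'k>(&'a self, k: T::Key<'k>) -> Option<usize> {
        let data = self.data.lock().unwrap();
        // The borrow of the guard and 'k no longer have to be the same lifetime.
        data.iter().position(|v| v.as_key() == k)
    }
}
```

With this in place, a lookup such as table.find("beta") only borrows the mutex guard for the duration of the call.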

Related

Iterator that owns another iterator and creates items with generic lifetime

I want to make an iterator that owns another iterator and gives items based on that other iterator. Originally, the inner iterator is formed from the result of a database query, it gives the raw data rows as arrived from the DB. The outer iterator takes items of the inner iterator and puts them into a struct that is meaningful within my program. Because different software versions store the same data in different database table structures, I have a parser trait that takes a row and creates a structure. My outer iterator takes two parameters for creation: the iterator for the DB rows and an object which implements how to parse the data.
But I ran into a lifetime error whose cause I don't really see, and following the compiler's hints only led me in circles: I literally follow the compiler's advice and end up back at the same problem. I tried to mimic the code and bring it to a minimal form that reproduces the same compiler errors. I'm not entirely sure whether it could be minimized further, but I also wanted it to resemble my real code.
Here is the sample:
struct Storeroom<'a> {
    storeroom_id: i64,
    version: &'a str
}
trait StoreroomParser {
    fn parse(&self, row: Row) -> Result<Storeroom, Error>;
}
struct StoreroomParserX;
impl StoreroomParser for StoreroomParserX {
    fn parse(&self, row: Row) -> Result<Storeroom, Error> {
        Ok(Storeroom { storeroom_id: row.dummy, version: "0.0.0"})
    }
}
struct StoreroomIterator {
    rows: Box<dyn Iterator<Item = Row>>,
    parser: Box<dyn StoreroomParser>
}
impl StoreroomIterator {
    fn new() -> Result<Self, Error> {
        let mut rows: Vec<Row> = vec![];
        rows.push(Row { dummy: 4});
        rows.push(Row { dummy: 6});
        rows.push(Row { dummy: 8});
        let rows = Box::new(rows.into_iter());
        let parser = Box::new(StoreroomParserX {});
        Ok(Self {rows, parser})
    }
}
impl Iterator for StoreroomIterator {
    type Item<'a> = Result<Storeroom<'a>, Error>;
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(nextrow) = self.rows.next() {
            Some(self.parser.parse(nextrow))
        }
        else {
            None
        }
    }
}
During my first attempt, the compiler suggested adding a lifetime annotation to the Item type declaration, because it uses a struct that requires a lifetime. But this resulted in the following errors:
error[E0658]: generic associated types are unstable
--> src/main.rs:59:5
|
59 | type Item<'a> = Result<Storeroom<'a>, Error>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: see issue #44265 <https://github.com/rust-lang/rust/issues/44265> for more information
error[E0195]: lifetime parameters or bounds on type `Item` do not match the trait declaration
--> src/main.rs:59:14
|
59 | type Item<'a> = Result<Storeroom<'a>, Error>;
| ^^^^ lifetimes do not match type in trait
Here's the sample code on Playground.
When I tried to mitigate this by moving the lifetime annotation to the impl block instead, I provoked the following error I can't progress from:
error: lifetime may not live long enough
--> src/main.rs:61:13
|
56 | impl<'a> Iterator for StoreroomIterator<'a> {
| -- lifetime `'a` defined here
...
59 | fn next(&mut self) -> Option<Self::Item> {
| - let's call the lifetime of this reference `'1`
60 | if let Some(nextrow) = self.rows.next() {
61 | Some(self.parser.parse(nextrow))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ associated function was supposed to return data with lifetime `'a` but it is returning data with lifetime `'1`
Playground.
I've been stuck on this problem for about a week now. Do you have any ideas how to resolve these errors? I'm also thinking that I should probably just use map() on the rows with whatever closure it takes to properly convert the data, but at this point it would definitely feel like a compromise.

So you say you force version to be 'static with Box::leak(). If so, you can just remove the lifetime parameter entirely:
struct Storeroom {
    storeroom_id: i64,
    version: &'static str
}
playground
You also mention that the compiler "forces" you to Box<dyn> rows and parser. You can avoid that by making StoreroomIterator generic over two types for the two members. Only change needed is to take rows in the constructor:
struct StoreroomIterator<R: Iterator<Item = Row>, P: StoreroomParser> {
    rows: R,
    parser: P
}
impl<R: Iterator<Item = Row>> StoreroomIterator<R, StoreroomParserX> {
    fn new(rows: R) -> Result<Self, Error> {
        let parser = StoreroomParserX {};
        Ok(Self { rows, parser })
    }
}
playground
It may be possible to get everything to work with lifetimes as well, but from your incomplete example, it's hard to say exactly. You may want to store a String containing the version in Storeroom and then add a version() method to generate a Version on demand, rather than generating them all up front. But it's hard to say without knowing what this all is for. You may just want to switch to a different library for handling version comparisons.
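For reference, here is a sketch that combines both suggestions (a 'static version and a generic iterator) into one piece, including the Iterator impl that the snippets above leave out. Row and Error are stand-ins: the question never shows their definitions, so the ones below are assumptions.

```rust
// Placeholder error type; the real one is not shown in the question.
#[derive(Debug)]
struct Error;

// Placeholder row type with the single field the question's code uses.
struct Row {
    dummy: i64,
}

#[derive(Debug)]
struct Storeroom {
    storeroom_id: i64,
    version: &'static str,
}

trait StoreroomParser {
    fn parse(&self, row: Row) -> Result<Storeroom, Error>;
}

struct StoreroomParserX;

impl StoreroomParser for StoreroomParserX {
    fn parse(&self, row: Row) -> Result<Storeroom, Error> {
        Ok(Storeroom { storeroom_id: row.dummy, version: "0.0.0" })
    }
}

struct StoreroomIterator<R: Iterator<Item = Row>, P: StoreroomParser> {
    rows: R,
    parser: P,
}

impl<R: Iterator<Item = Row>> StoreroomIterator<R, StoreroomParserX> {
    fn new(rows: R) -> Result<Self, Error> {
        Ok(Self { rows, parser: StoreroomParserX })
    }
}

impl<R: Iterator<Item = Row>, P: StoreroomParser> Iterator for StoreroomIterator<R, P> {
    // No lifetime parameter is needed because `Storeroom` no longer has one.
    type Item = Result<Storeroom, Error>;

    fn next(&mut self) -> Option<Self::Item> {
        let row = self.rows.next()?;
        Some(self.parser.parse(row))
    }
}
```

Since Item no longer borrows from the iterator, the ordinary Iterator trait is sufficient and no generic associated type is involved.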

Entry::Occupied.get() returns a value referencing data owned by the current function even though hashmap should have the ownership

My goal was to implement the suggested improvement on the Cacher struct from chapter 13.1 of the Rust book, that is, to create a struct which takes a function and uses memoization to reduce the number of calls to the given function. To do this, I created a struct with a HashMap:
struct Cacher<T, U, V>
    where T: Fn(&U) -> V, U: Eq + Hash
{
    calculation: T,
    map: HashMap<U,V>,
}
and two methods: a constructor and one which is responsible for the memoization.
impl<T, U, V> Cacher<T, U, V>
    where T: Fn(&U) -> V, U: Eq + Hash
{
    fn new(calculation: T) -> Cacher<T,U,V> {
        Cacher {
            calculation,
            map: HashMap::new(),
        }
    }
    fn value(&mut self, arg: U) -> &V {
        match self.map.entry(arg){
            Entry::Occupied(occEntry) => occEntry.get(),
            Entry::Vacant(vacEntry) => {
                let argRef = vacEntry.key();
                let result = (self.calculation)(argRef);
                vacEntry.insert(result)
            }
        }
    }
}
I used the Entry enum because I didn't find a better way of deciding whether the HashMap contains a key and, if it doesn't, calculating the value, inserting it into the HashMap and returning a reference to it.
If I try to compile the code above, I get an error which says that occEntry is borrowed by its .get() method (which is fine by me) and that .get() "returns a value referencing data owned by the current function".
My understanding is that the compiler thinks that the value which occEntry.get() refers to is owned by the function value(...). But shouldn't I get a reference to the value of type V, which is owned by the HashMap? Is the compiler getting confused because the value is owned by the function and saved as result for a short moment?
let result = (self.calculation)(argRef);
vacEntry.insert(result)
Please note that it is necessary to save the result temporarily, because the insert method consumes the entry (and with it the key), so argRef is no longer valid afterwards. I also acknowledge that the signature of value can be problematic (see Mutable borrow from HashMap and lifetime elision), but I tried to avoid a Copy trait bound.
For quick reproduction of the problem I append the use statements necessary. Thanks for your help.
use std::collections::HashMap;
use std::cmp::Eq;
use std::hash::Hash;
use std::collections::hash_map::{OccupiedEntry, VacantEntry, Entry};

Let's take a look at OccupiedEntry::get()'s signature:
pub fn get(&self) -> &V
What this signature is telling us is that the reference obtained from the OccupiedEntry can only live as long as the OccupiedEntry itself. However, the OccupiedEntry is a local variable, thus it's dropped when the function returns.
What we want is a reference whose lifetime is bound to the HashMap's lifetime. Both Entry and OccupiedEntry have a lifetime parameter ('a), which is linked to the &mut self parameter in HashMap::entry. We need a method on OccupiedEntry that returns a &'a V. There's no such method, but there's one that returns a &'a mut V: into_mut. A mutable reference can be implicitly coerced to a shared reference, so all we need to do to make your method compile is to replace get() with into_mut().
fn value(&mut self, arg: U) -> &V {
    match self.map.entry(arg) {
        Entry::Occupied(occ_entry) => occ_entry.into_mut(),
        Entry::Vacant(vac_entry) => {
            let arg_ref = vac_entry.key();
            let result = (self.calculation)(arg_ref);
            vac_entry.insert(result)
        }
    }
}
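For a quick end-to-end check, the corrected method can be dropped back into the Cacher from the question. The sketch below is just the question's struct with into_mut() applied; nothing beyond the standard library is assumed.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::hash::Hash;

struct Cacher<T, U, V>
where
    T: Fn(&U) -> V,
    U: Eq + Hash,
{
    calculation: T,
    map: HashMap<U, V>,
}

impl<T, U, V> Cacher<T, U, V>
where
    T: Fn(&U) -> V,
    U: Eq + Hash,
{
    fn new(calculation: T) -> Self {
        Cacher {
            calculation,
            map: HashMap::new(),
        }
    }

    fn value(&mut self, arg: U) -> &V {
        match self.map.entry(arg) {
            // `into_mut` consumes the entry and returns a reference tied to
            // the map's borrow, not to the local entry.
            Entry::Occupied(occ_entry) => occ_entry.into_mut(),
            Entry::Vacant(vac_entry) => {
                // Compute first: `insert` consumes the entry and its key.
                let result = (self.calculation)(vac_entry.key());
                vac_entry.insert(result)
            }
        }
    }
}
```

Calling value twice with the same argument runs the closure only once; the second call takes the Occupied branch.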

Is there a way to pass a reference to a generic function and return an impl Trait that isn't related to the argument's lifetime?

I've worked down a real-life example in a web app, which I've solved using unnecessary heap allocation, to the following example:
// Try replacing with (_: &String)
fn make_debug<T>(_: T) -> impl std::fmt::Debug {
    42u8
}
fn test() -> impl std::fmt::Debug {
    let value = "value".to_string();
    // try removing the ampersand to get this to compile
    make_debug(&value)
}
pub fn main() {
    println!("{:?}", test());
}
As is, compiling this code gives me:
error[E0597]: `value` does not live long enough
--> src/main.rs:9:16
|
5 | fn test() -> impl std::fmt::Debug {
| -------------------- opaque type requires that `value` is borrowed for `'static`
...
9 | make_debug(&value)
| ^^^^^^ borrowed value does not live long enough
10 | }
| - `value` dropped here while still borrowed
I can fix this error in at least two ways:
Instead of passing in a reference to value in test(), pass in value itself
Instead of the parameter T, explicitly state the type of the argument for make_debug as &String or &str
My understanding of what's happening is that, when there is a parameter, the borrow checker is assuming that any lifetime on that parameter affects the output impl Debug value.
Is there a way to keep the code parameterized, continue passing in a reference, and get the borrow checker to accept it?

I think this is due to the rules around how impl trait opaque types capture lifetimes.
If there are lifetimes inside an argument T, then an impl trait has to incorporate them. Additional lifetimes in the type signature follow the normal rules.
For more information please see:
https://github.com/rust-lang/rust/issues/43396#issuecomment-349716967
https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md#lifetime-parameters
https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md#assumption-3-there-should-be-an-explicit-marker-when-a-lifetime-could-be-embedded-in-a-return-type
https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md#scoping-for-type-and-lifetime-parameters
A more complete answer
Original goal: the send_form function takes an input parameter of type &T which is rendered to a binary representation. That binary representation is owned by the resulting impl Future, and no remnant of the original &T remains. Therefore, the lifetime of &T need not outlive the impl Trait. All good.
The problem arises when T itself, additionally, contains references with lifetimes. If we were not using impl Trait, our signature would look something like this:
fn send_form<T>(self, data: &T) -> SendFormFuture;
And by looking at SendFormFuture, we can readily observe that there is no remnant of T in there at all. Therefore, even if T has lifetimes of its own to deal with, we know that all references are used within the body of send_form, and never used again afterward by SendFormFuture.
However, with impl Future as the output, we get no such guarantees. There's no way to know if the concrete implementation of Future in fact holds onto the T.
In the case where T has no references, this still isn't a problem. Either the impl Future references the T, and fully takes ownership of it, or it doesn't reference it, and no lifetime issues arise.
However, if T does have references, you could end up in a situation where the concrete impl Future is holding onto a reference stored in the T. Even though the impl Future has ownership of the T itself, it doesn't have ownership of the values referenced by the T.
This is why the borrow check must be conservative, and insist that any references inside T must have a 'static lifetime.
The only workaround I can see is to bypass impl Future and be explicit in the return type. Then, you can demonstrate to the borrow checker quite easily that the output type does not reference the input T type at all, and any references in it are irrelevant.
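Applied to the small make_debug example from the question, the "explicit return type" workaround might look like the sketch below. The function name debug_of_local is made up for illustration (it plays the role of test() in the question), and u8 stands in for whatever concrete type the real function produces.

```rust
use std::fmt::Debug;

// The return type is concrete, so nothing about `T` (or any lifetime
// inside it) can be captured by the result.
fn make_debug<T>(_: T) -> u8 {
    42u8
}

// Hypothetical caller: it can still hide the type behind `impl Debug`,
// because this function has no generic parameters to capture.
fn debug_of_local() -> impl Debug {
    let value = "value".to_string();
    // Passing a reference to a local is accepted now.
    make_debug(&value)
}
```

The generic parameter and the reference argument are both kept; only the opaque return type of the generic function is given up.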
The original code in the actix web client for send_form looks like:
https://docs.rs/awc/0.2.1/src/awc/request.rs.html#503-522
pub fn send_form<T: Serialize>(
    self,
    value: &T,
) -> impl Future<
    Item = ClientResponse<impl Stream<Item = Bytes, Error = PayloadError>>,
    Error = SendRequestError,
> {
    let body = match serde_urlencoded::to_string(value) {
        Ok(body) => body,
        Err(e) => return Either::A(err(Error::from(e).into())),
    };
    // set content-type
    let slf = self.set_header_if_none(
        header::CONTENT_TYPE,
        "application/x-www-form-urlencoded",
    );
    Either::B(slf.send_body(Body::Bytes(Bytes::from(body))))
}
You may need to patch the library or write your own function that does the same thing but with a concrete type. If anyone else knows how to deal with this apparent limitation of impl trait I'd love to hear it.
Here's how far I've gotten on a rewrite of send_form in awc (the actix-web client library):
pub fn send_form_alt<T: Serialize>(
    self,
    value: &T,
// ) -> impl Future<
//     Item = ClientResponse<impl Stream<Item = Bytes, Error = PayloadError>>,
//     Error = SendRequestError,
) -> Either<
    FutureResult<String, actix_http::error::Error>,
    impl Future<
        Item = crate::response::ClientResponse<impl futures::stream::Stream>,
        Error = SendRequestError,
    >,
> {
Some caveats so far:
Either::B is necessarily an opaque impl trait of Future.
The first param of FutureResult might actually be Void or whatever the Void equivalent in Rust is called.

Impl trait with generic associated type in return position causes lifetime error

I need to store a fn(I) -> O (where I and O can be references) in a 'static struct. O needs to implement a trait with a 'static associated type, and that associated type is also stored in the struct. Neither I nor O itself gets stored inside the struct, so their lifetimes shouldn't matter. But the compiler still complains about I not living long enough.
trait IntoState {
    type State: 'static;
    fn into_state(self) -> Self::State;
}
impl IntoState for &str {
    type State = String;
    fn into_state(self) -> Self::State {
        self.to_string()
    }
}
struct Container<F, S> {
    func: F,
    state: S,
}
impl<I, O> Container<fn(I) -> O, O::State>
where
    O: IntoState,
{
    fn new(input: I, func: fn(I) -> O) -> Self {
        // I & O lives only in the next line of code. O gets converted into
        // a `'static` (`String`), that is stored in `Container`.
        let state = func(input).into_state();
        Container { func, state }
    }
}
fn map(i: &str) -> impl '_ + IntoState {
    i
}
fn main() {
    let _ = {
        // create a temporary value
        let s = "foo".to_string();
        // the temporary actually only needs to live in `new`. It is
        // never stored in `Container`.
        Container::new(s.as_str(), map)
        // ERR: ^ borrowed value does not live long enough
    };
    // ERR: `s` dropped here while still borrowed
}
playground

As far as I can tell, the compiler's error message is misleading; what it actually requires is an explicitly defined associated type:
fn map(i: &str) -> impl '_ + IntoState<State = String> {
    i
}
The excellent answer given to the question "Why does the compiler not infer the concrete type of an associated type of an impl trait return value?" provides enough information on why this is actually needed.
See also Rust issue #42940 - impl-trait return type is bounded by all input type parameters, even when unnecessary
You can use a generic type parameter instead of returning an impl in which case you don't have to specify the associated type:
fn map<T: IntoState>(i: T) -> T {
    i
}
Apparently there is still some confusion about what exactly is going on here, so I'll try to distill my comments into a short answer.
The problem here is the prototype of the function map():
fn map(i: &str) -> impl '_ + IntoState
This specifies that the return type of map() is some type implementing IntoState, with an unspecified associated type State. The return type has a lifetime parameter with the lifetime of the argument i; let's call that lifetime 'a, and the full return type T<'a>. The associated type State of this return type is now <T<'a> as IntoState>::State, which is parametrized by 'a. The compiler is currently not able to eliminate this lifetime parameter from the associated type, in spite of the 'static declaration in the trait definition. When the associated type is explicitly specified as String, the compiler simply uses String instead of <T<'a> as IntoState>::State, so the lifetime parameter is gone and we don't get an error anymore.
This compiler shortcoming is discussed in this Github issue.

How to write a proper generic function signature when borrowing data across multiple traits

While developing on a private project I ran into a lifetime problem related to borrowing the same object over multiple structs and traits. This is a bunch of stripped-down definitions I used:
trait WorkspaceLog {
    fn get(&self) -> usize;
}
struct TheLog<'a>(&'a FilesystemOverlay);
impl<'a> WorkspaceLog for TheLog<'a> {
    fn get(&self) -> usize {
        (self.0).0
    }
}
trait WorkspaceController<'a> {
    type Log: WorkspaceLog;
    fn get_log(&'a self) -> Self::Log;
}
struct FilesystemOverlay(usize);
struct FSWorkspaceController<'a>(&'a mut FilesystemOverlay);
impl<'a> WorkspaceController<'a> for FSWorkspaceController<'a> {
    type Log = TheLog<'a>;
    fn get_log(&'a self) -> Self::Log {
        TheLog(&*self.0)
    }
}
trait AsWorkspaceController<'a> {
    type Controller: WorkspaceController<'a>;
    fn get_controller(self) -> Self::Controller;
}
impl<'a> AsWorkspaceController<'a> for &'a mut FilesystemOverlay {
    type Controller = FSWorkspaceController<'a>;
    fn get_controller(self) -> FSWorkspaceController<'a> {
        FSWorkspaceController(self)
    }
}
So far, so good. This basically enables me to borrow a mut ref of FilesystemOverlay as some other interface, providing additional functionality. This interface, in turn, allows me to borrow essentially the same thing as yet another thing that provides the final data. This works as long as I directly use FilesystemOverlay:
fn init1(control_dir: &mut FilesystemOverlay) -> usize {
    let controller = control_dir.get_controller();
    let log = controller.get_log();
    log.get()
}
However, if I replace the concrete reference with a type parameter, the compilation fails, telling me that controller doesn't live long enough since it, for reasons I don't understand, thinks that get_log borrows controller beyond the end of the function and thus way longer than the program logic requires:
fn init2<'a: 'b, 'b, O>(control_dir: O) -> usize
    where O: AsWorkspaceController<'b>+'a {
    let controller = control_dir.get_controller();
    let log = controller.get_log();
    log.get()
}
fn main() {
    let mut control_dir = FilesystemOverlay(5);
    dbg!(init1(&mut control_dir));
    dbg!(init2(&mut control_dir));
}
I tried several approaches, but so far I have been unable to figure out the proper signature of init2. This is the error I get:
error[E0597]: `controller` does not live long enough
--> test.rs:53:15
|
53 | let log = controller.get_log();
| ^^^^^^^^^^ borrowed value does not live long enough
54 | log.get()
55 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'b as defined on the function body at 50:18...
--> test.rs:50:18
|
50 | fn init2<'a: 'b, 'b, O>(control_dir: O) -> usize
| ^^
error: aborting due to previous error
For more information about this error, try `rustc --explain E0597`.
This is the full code on the rust playground.
So, how do I need to change the signature of init2 so that the compiler understands that controller may be dropped after the call to log.get()? Do I need other changes in the above types as well?
Edit: I've made some additional experiments and this is the closest I could manage to create. This one has two lifetimes and a signature that late-binds, but it still gives a warning about UB. Does anyone understand why?

With the help of a kind and knowledgeable person on GitHub I was able to create a working version of the code, see https://github.com/rust-lang/rust/issues/58868. The key was to use a higher-ranked trait bound (for<'b>) on the declaration of the Controller type inside AsWorkspaceController:
trait AsWorkspaceController<'a> {
    type Controller: for<'b> WorkspaceController<'b>+'a;
    fn get_controller(&'a mut self) -> Self::Controller;
}
See the full code on the playground.
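The linked playground is not reproduced here, so the following is only a reconstruction of how the remaining pieces might be adapted to the new trait. The impl of WorkspaceController for every borrow lifetime 'b, the impl of AsWorkspaceController on FilesystemOverlay itself, and the signature of init2 are assumptions made to fit the higher-ranked bound.

```rust
trait WorkspaceLog {
    fn get(&self) -> usize;
}

struct FilesystemOverlay(usize);

struct TheLog<'a>(&'a FilesystemOverlay);

impl<'a> WorkspaceLog for TheLog<'a> {
    fn get(&self) -> usize {
        (self.0).0
    }
}

trait WorkspaceController<'a> {
    type Log: WorkspaceLog;
    fn get_log(&'a self) -> Self::Log;
}

struct FSWorkspaceController<'a>(&'a mut FilesystemOverlay);

// Assumption: implemented for every borrow lifetime 'b, independent of the
// lifetime 'a of the wrapped mutable reference.
impl<'a, 'b> WorkspaceController<'b> for FSWorkspaceController<'a> {
    type Log = TheLog<'b>;
    fn get_log(&'b self) -> Self::Log {
        TheLog(&*self.0)
    }
}

trait AsWorkspaceController<'a> {
    // The controller must be usable with any (shorter) borrow of itself.
    type Controller: for<'b> WorkspaceController<'b> + 'a;
    fn get_controller(&'a mut self) -> Self::Controller;
}

// Assumption: implemented on the overlay itself, since the method now
// takes `&'a mut self`.
impl<'a> AsWorkspaceController<'a> for FilesystemOverlay {
    type Controller = FSWorkspaceController<'a>;
    fn get_controller(&'a mut self) -> Self::Controller {
        FSWorkspaceController(self)
    }
}

fn init2<'a, O>(control_dir: &'a mut O) -> usize
where
    O: AsWorkspaceController<'a>,
{
    let controller = control_dir.get_controller();
    // The borrow of `controller` can be as short as this function body.
    let log = controller.get_log();
    log.get()
}
```

The point of the for<'b> bound is that get_log no longer has to borrow controller for the caller-chosen lifetime 'a, so controller can be dropped at the end of init2.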
