Passing on lifetimes of underlying reference fields? - rust

I'm trying to make two structs that operate on an underlying dataset; one providing immutable "read" operations, the other allowing modification. For this to work, I need to be able to use the read functions from within the modifying object - as such I create a temporary new read object within the modifier function with a view onto the underlying data.
Here's some code:
struct Read<'db> {
x: &'db i32
}
impl<'db> Read<'db> {
pub fn get(&'db self) -> &'db i32 { self.x }
}
struct Write<'db> {
x: &'db mut i32
}
impl<'db> Write<'db> {
fn new(x: &mut i32) -> Write { Write{x: x} }
fn as_read(&'db self) -> Read<'db> {
Read{x: self.x}
}
pub fn get(&'db self) -> &'db i32 { self.as_read().get() }
}
fn main() {
let mut x = 69i32;
let y = Write::new(&mut x);
println!("{}", y.get());
}
It doesn't compile - it seems that despite my best efforts, the lifetime of the reference returned from Read::get is bound to the Write::get's scope, rather than the Write's 'db lifetime. How can I make it compile? (And, is what I want to do possible? Is it the simplest/most concise way of doing it?)

The point the compiler is trying to get across is that &'db self actually means self: &'db Write<'db>. This means that you tie the reference AND the type to the same lifetime. What you actually want in your case is self: &'a Write<'db> where 'a lives just for the as_read function. To be able to return a 'db reference from a 'a reference, you need to specify that 'a lives at least as long as 'db by constraining 'a: 'db.
fn as_read<'a: 'db>(self: &'a Write<'db>) -> Read<'db> {
Read{x: self.x}
}
pub fn get<'a: 'db>(self: &'a Write<'db>) -> &'db i32 { self.as_read().get() }
or more concisely
fn as_read<'a: 'db>(&'a self) -> Read<'db> {
Read{x: self.x}
}
pub fn get<'a: 'db>(&'a self) -> &'db i32 { self.as_read().get() }
Try it out in the Playground

Related

Immutable struct with mutable reference members

Is my understanding correct that in Rust it is not possible to protect reference members of a struct from modification while having the reference target values mutable? (Without runtime borrow checking that is.) For example:
struct MyData<'a> {
pub some_ref: &'a mut i32,
}
fn doit<'a>(data: &mut MyData<'a>, other_ref: &'a mut i32) {
// I want to be able to do the following here:
*data.some_ref = 22;
// but make it impossible to do the following:
data.some_ref = other_ref;
}
Not being able to change the reference value may be useful in certain FFI situations. FFI and the performance requirements reasons prevent the use of runtime borrow checking here.
In C++ it can be expressed like this:
struct MyData {
int* const some_ref;
};
void doit(const MyData &data, int* other_ref) {
// this is allowed:
*data.some_ref = 22;
// this is not:
data.some_ref = other_ref; // compile error
}
You can create a wrapper type around the reference. If the constructor is private, and so is the wrapped reference field, you cannot replace the reference itself. You can then implement DerefMut to allow changing the referent.
pub struct ImmRef<'a> {
inner: &'a mut i32,
}
impl<'a> ImmRef<'a> {
fn new(inner: &'a mut i32) -> Self { Self { inner } }
}
impl std::ops::Deref for ImmRef<'_> {
type Target = i32;
fn deref(&self) -> &Self::Target { &*self.inner }
}
impl std::ops::DerefMut for ImmRef<'_> {
fn deref_mut(&mut self) -> &mut Self::Target { &mut *self.inner }
}
struct MyData<'a> {
pub some_ref: ImmRef<'a>,
}
fn doit<'a>(data: &mut MyData<'a>, other_ref: &'a mut i32) {
// I want to be able to do the following here:
*data.some_ref = 22;
// but make it impossible to do the following:
// data.some_ref = other_ref;
}
You can mark the newtype #[repr(transparent)] for FFI purposes.
But do note that if the code has some ImmRef<'a> available it can use tools such as std::mem::replace() to replace the reference.
Rust does not allow you to specify the mutability of individual fields like you can via const in C++. Instead, you should simply encapsulate the data by making it private and only allow modification through methods that you dictate:
struct MyData<'a> {
some_ref: &'a mut i32,
}
impl MyData<'_> {
pub fn set_ref(&mut self, other: i32) {
*self.some_ref = other;
}
}
That way, the field some_ref cannot be modified directly (outside of the module) and must use the available method.

Return a reference to a member iterator

I have a struct that has an iterator over a borrowed slice, and I want to create a method that returns an iterator that does some computation on the contained iterator
Basically this:
#[derive(Clone)]
struct Foo<'a> {
inner: ::std::iter::Cycle<::std::slice::Iter<'a, u8>>,
}
impl<'a> Foo<'a> {
pub fn new<T: AsRef<[u8]> + ?Sized>(vals: &'a T) -> Foo<'a> {
Self {
inner: vals.as_ref().iter().cycle(),
}
}
pub fn iter(&mut self) -> impl Iterator<Item = u8> + 'a {
self.inner.by_ref().map(Clone::clone) // simple case but this could be any computation
}
}
Which yields a cryptic error.
Here's my understanding: I need to bind the returned iterator's lifetime to the &mut self of iter() so that &mut self is only dropped when .collect() is called on the returned iterator.
BUT, changing to fn iter(&'a mut self) isn't good enough because &'a mut self now lives for the entire duration of the struct instance (meaning that calling collect() doesn't drop that reference). Which prevents something like this
#[test]
fn test(){
let mut foo = Foo::new("hello, there");
foo.iter().zip(0..4).collect::<Vec<_>>(); // call once, ok
foo.iter().zip(0..4).collect::<Vec<_>>(); // error because 2 &mut references exist at the same
}
rust playground link
The way I would do this is to make a struct just for iterating over those items. It makes things much easier.
...
pub fn iter(&mut self) -> ClonedFooIter<'a, '_> {
ClonedFooIter {
inner: &mut self.inner,
}
}
}
pub struct ClonedFooIter<'a, 'b> {
inner: &'b mut std::iter::Cycle<::std::slice::Iter<'a, u8>>,
}
impl<'a, 'b> Iterator for ClonedFooIter<'a, 'b> {
type Item = u8;
fn next(&mut self) -> Option<Self::Item> {
self.inner.next().cloned()
}
}
The idea is that the returned struct only lives till the mutable borrow of self lives.
Using a struct makes naming the iterator type easier too.

Borrowing an object as mutable twice for unrelated, sequential uses

I'm trying to implement an abstraction that allows me to read from either a directory or a zip file. I start by implementing something of this sort:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
But then I realize that the zip-rs package's zip::ZipFile is constructed from a mutable reference to the zip::ZipArchive which it is located in, so I end up with the following code:
use std::path::Path;
use std::error::Error;
use std::fs::File;
use std::io::prelude::*;
use zip::{ZipArchive, read::ZipFile};
use std::marker::PhantomData;
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
pub struct DirectoryFileOpener<'a> {
root: &'a Path
}
impl<'a> DirectoryFileOpener<'a> {
pub fn new(root: &'a Path) -> Self {
DirectoryFileOpener { root }
}
}
impl<'a> FileOpener<'a> for DirectoryFileOpener<'a> {
type ReaderType = File;
fn open(&'a mut self, file_name: &str) -> Result<File, Box<dyn Error>> {
Ok(File::open(self.root.join(file_name))?)
}
}
pub struct ZipFileOpener<'a, R: Read + Seek> {
zip: ZipArchive<R>,
phantom: PhantomData<&'a Self>
}
impl<'a, R: Read + Seek> ZipFileOpener<'a, R> {
pub fn new(zip: ZipArchive<R>) -> Self {
ZipFileOpener { zip, phantom: PhantomData }
}
}
impl<'a, R: Read + Seek> FileOpener<'a> for ZipFileOpener<'a, R> {
type ReaderType = ZipFile<'a>;
fn open(&'a mut self, file_name: &str) -> Result<ZipFile<'a>, Box<dyn Error>> {
Ok(self.zip.by_name(file_name)?)
}
}
I'm not sure if that's the most optimal way to write that, but at least it compiles. Then I try to use it as such:
fn load(root: &Path) -> Result<...> {
let mut opener = io::DirectoryFileOpener::new(root);
let a = Self::parse_a(opener.open("a.txt")?)?;
let b = Self::parse_b(opener.open("b.txt")?, a)?;
}
and I get cannot borrow 'opener' as mutable more than once at a time. This does not surprise me much, as I indeed use open(), which borrows opener as mutable, twice - although a is only a u64, and from my point of view it is unrelated to the lifetime of opener.open(), from the compiler's point of view it has to be in the same lifetime of the line below it, and thus we attempt to borrow opener as mutable twice.
However, I then look at the following code, which compiles and works well and which I started this whole thing by trying to improve:
fn load_zip(root: &Path) -> Result<...> {
let file = File::open(root)?;
let mut zip = ZipArchive::new(file)?;
let a = Self::parse_a(zip.by_name("a.txt")?)?;
let b = Self::parse_b(zip.by_name("b.txt")?, a)?;
}
This throws me off completely, because the function by_name() also borrows zip as mutable, and is also called twice! Why is it allowed to borrow zip as mutable twice here but not in the previous case?
After researching the issue and Rust's semantics deeper, and building on top of the notes by trentcl, I came to realize that the problem essentially boils down to defining the FileOpener trait where the lifetime argument is bound to the associated type and not to the trait itself, e.g.
pub trait FileOpener {
type ReaderType: Read;
fn open(&'a mut self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
impl<'a, R: Read + Seek> FileOpener for ZipFileOpener<R> {
type ReaderType = ZipFile<'a>;
...
}
However, this is known as generic associated types (GAT), and is not yet supported in Rust. The GAT RFC does however mention that in some cases the problem can be circumvented by binding the lifetime to the trait itself and using higher-rank trait bounds (HRTB) in the receiving function, which yields the following working solution to this question:
pub trait FileOpener<'a> {
type ReaderType: Read;
fn open(&'a self, file_name: &str) -> Result<Self::ReaderType, Box<dyn Error>>;
}
...
fn load<T: for<'a> FileOpener<'a>>(opener: T) -> ... {
let a = parse_a(opener.open("a.txt")?)?;
let b = parse_b(opener.open("b.txt")?, a)?;
}
This is because the HRTB allows us to bind T to a FileOpener without binding a specific lifetime to it, which enables the late binding of different lifetimes for each call to opener.open()

impl Iterator failing for iterator with multiple lifetime parameters

I've got code that looks (a little) like this:
struct OutputIterator<'r, 'i: 'r> {
input_handler: &'r mut InputHandler<'i>
}
impl<'r, 'i> Iterator for OutputIterator<'r, 'i> {
type Item = i32;
fn next(&mut self) -> Option<Self::Item> {
self.input_handler.inputs.next().map(|x| x * 2)
}
}
struct InputHandler<'a> {
inputs: Box<dyn Iterator<Item = i32> + 'a>
}
impl<'a> InputHandler<'a> {
fn outputs<'b>(&'b mut self) -> OutputIterator<'b, 'a> {
OutputIterator { input_handler: self }
}
}
fn main() {
let mut input_handler = InputHandler {
inputs: Box::new(vec![1,2,3,4,5].into_iter())
};
for output in input_handler.outputs() {
println!("{}", output);
}
}
Basically, the user can supply an iterator of inputs, and then get an iterator of outputs (in my real code the connection between inputs and outputs is much more complex involving a bunch of internal state. Multiple inputs might go towards one output or vice-versa).
This works, but I would like to change it use impl Iterator both to hide the OutputIterator type and to allow for easier swapping out of the return type in testing with a fake. My best attempt at that changes the InputHandler impl like so:
impl<'a> InputHandler<'a> {
fn outputs<'b>(&'b mut self) -> impl Iterator<Item = i32> + 'b {
OutputIterator { input_handler: self }
}
}
Unfortunately, that gets me: error[E0700]: hidden type for `impl Trait` captures lifetime that does not appear in bounds
Is there a way to make this work? It's important for the interface that InputHandler take an iterator with some lifetime, and that obviously has to be passed on to the OutputIterator somehow, but I'd really like to abstract those details away from the caller. In principle, the caller should only have to worry about making sure that the inputs Iterator and InputHandler outlive the OutputIterator so I think the logical lifetime bound on the OutputIterator here is the smaller of those two? Any clarity on why this error is happening or how to fix it would be great!
In case it helps, here's a rust playground with the code in it.
Using a workaround from https://github.com/rust-lang/rust/issues/34511#issuecomment-373423999, via https://stackoverflow.com/a/50548538/1757964:
trait Captures<'a> {}
impl<'a, T: ?Sized> Captures<'a> for T {}
impl<'a> InputHandler<'a> {
fn outputs<'b>(&'b mut self) -> impl Iterator<Item = i32> + Captures<'a> + 'b {
OutputIterator { input_handler: self }
}
}

lifetime with closure captures in rust

How can I reduce the lifetime of a closure?
I was trying to make a method, which returns an iterator related to self. I didn't want to make new struct or something, so I just made it return filters and maps, and confronted some borrow checker errors.
The following code was my first try.
fn f<'b>(&'b self) -> impl Iterator<Item = u8> {
(0..self.some_number())
.filter(|&i| self.some_bool_function(i))
.map(|i| i as u8)
}
The following code replicates my question.
struct A(bool);
impl A {
fn f<'a>(&'a self) -> impl Iterator<Item = u8> + 'a {
(0..1).filter(|&i| self.0)
}
}
or even shorter,
fn g<'a>(t:&'a ()) -> impl 'a + FnMut() {
|| *t
}
This would not compile, because the closure may outlive self. I don't know how to make this work, without moving self.
If you return a closure, you must ensure that the closure has everything it needs - even after returning (i.e. after the (temporary) function parameters are popped from the stack).
Thus, I think you want to move the stuff you return into the closure:
impl A {
fn f<'a>(&'a self) -> impl Iterator<Item = u8> + 'a {
(0..1).filter(move |&i| self.0)
}
}
Resp.
fn g<'a>(t:&'a ()) -> impl 'a + FnMut() {
move || *t
}
Resp (extending your first example):
struct A(bool);
impl A {
fn some_number(&self) -> usize {
6
}
fn some_bool_function(&self, i: usize) -> bool {
i%2==0
}
fn f<'b>(&'b self) -> impl Iterator<Item = u8> + 'b {
(0..self.some_number())
.filter(move |&i| self.some_bool_function(i))
.map(|i| i as u8)
}
}

Resources