Lifetimes and references to objects containing references


Let's say I have a struct with a reference in it, and another struct with a reference to that struct, something like this:
struct Image<'a> {
pixel_data: &'a mut Vec<u8>,
size: (i32, i32),
}
struct SubImage<'a> {
image: &'a mut Image<'a>,
offset: (i32, i32),
size: (i32, i32),
}
The structs have nearly identical interfaces, the difference being that SubImage adjusts position parameters based on its offset before forwarding to the corresponding functions of the contained Image reference. I would like these structs to be mostly interchangeable, but I can't seem to figure out how to get lifetimes right. Originally, I was just using Image, and could pass around objects simply, without ever mucking about with lifetime specifiers:
fn main() {
let mut pixel_data: Vec<u8> = Vec::new();
let mut image = Image::new(&mut pixel_data, (1280, 720));
render(&mut image);
}
fn render(image: &mut Image) {
image.rect_fill(0, 0, 10, 10);
}
Then I created SubImage, and wanted to do things like this:
fn render2(image: &mut Image) {
let mut sub = SubImage {
image: image, // line 62
offset: (100, 100),
size: (600, 400),
};
sub.rect_fill(0, 0, 10, 10);
}
This, however, causes a compiler error:
main.rs:62:16: 62:21 error: cannot infer an appropriate lifetime for automatic coercion due to conflicting requirements
The compiler's suggestion is to change the signature to this:
fn render2<'a>(image: &'a mut Image<'a>)
However, that just pushes the problem up to the function which called render2, and took a &mut Image. And this is quite annoying, as the function calls go a few layers deep, and I didn't have to do any of this when I was just using the Image class (which also has a reference), and adjusting the offsets inline.
So first of all, I don't even understand why this is necessary (admittedly my understanding of Rust lifetimes is limited). And secondly (my main question), is there anything I can do to SubImage to make these explicit lifetimes unnecessary?

Yes, this error may be confusing but there is a legitimate reason for it.
struct SubImage<'a> {
image: &'a mut Image<'a>,
offset: (i32, i32),
size: (i32, i32),
}
Here you declare that the reference to Image must live exactly as long as the borrowed data inside the image itself - the same lifetime parameter 'a is used both in the reference and as a parameter for Image: &'a mut Image<'a>.
However, render2() violates this requirement. The actual signature of render2() is as follows:
fn render2<'b, 'a>(image: &'b mut Image<'a>)
Therefore, it tries to create SubImage with &'b mut Image<'a>, where 'b is not necessarily equal to 'a (and in this particular case, it most certainly is not), so the compiler bails out.
This signature is also the only reason you can call the function with &mut image in main(): &mut image has the lifetime of the image variable, but the Image stored in that variable has the lifetime of pixel_data, which is slightly longer. The following code is not valid Rust, but it is close to how the compiler understands things and it demonstrates the problem:
fn main() {
'a: {
let mut pixel_data: Vec<u8> = Vec::new();
'b: {
let mut image: Image<'a> = Image::new(&'a mut pixel_data, (1280, 720));
render2::<'b, 'a>(&'b mut image);
}
}
}
When you declare render2() as
fn render2<'a>(image: &'a mut Image<'a>)
you indeed do "push" the problem upstream - now the function can't be called at all with &mut image, and you can now see why - it would require unifying 'a and 'b lifetimes, which is impossible because 'a is longer than 'b.
The proper solution is to use separate lifetimes for the reference to the Image and for the Image itself in the definition of SubImage:
struct SubImage<'b, 'a:'b> {
image: &'b mut Image<'a>,
offset: (i32, i32),
size: (i32, i32),
}
Now 'b and 'a may be different lifetimes, though in order for this to work you have to bound 'a lifetime with 'b, that is, 'a must live at least as long as 'b. This is exactly the semantics your code needs. If this constraint is not enforced, then it would be possible for the referenced image to "die" before the reference to it goes out of scope, which is a violation of safety rules of Rust.
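To see the two-lifetime definition working end to end, here is a minimal sketch. The set methods, the tiny 4x4 buffer and the demo function are assumptions made up for illustration (the question does not show the real method bodies); the point is only that render2 keeps its plain &mut Image signature.

```rust
// Simplified stand-in for the question's Image; the method body is an assumption.
struct Image<'a> {
    pixel_data: &'a mut Vec<u8>,
    size: (i32, i32),
}

impl<'a> Image<'a> {
    // Hypothetical helper: write one byte if (x, y) lies inside the image.
    fn set(&mut self, x: i32, y: i32, value: u8) {
        if x >= 0 && y >= 0 && x < self.size.0 && y < self.size.1 {
            let idx = (y * self.size.0 + x) as usize;
            self.pixel_data[idx] = value;
        }
    }
}

// 'b is the borrow of the Image, 'a is the borrow of the pixel data.
struct SubImage<'b, 'a: 'b> {
    image: &'b mut Image<'a>,
    offset: (i32, i32),
    size: (i32, i32),
}

impl<'b, 'a: 'b> SubImage<'b, 'a> {
    // Adjust the position by the offset, then forward to the Image.
    fn set(&mut self, x: i32, y: i32, value: u8) {
        if x >= 0 && y >= 0 && x < self.size.0 && y < self.size.1 {
            self.image.set(x + self.offset.0, y + self.offset.1, value);
        }
    }
}

// No explicit lifetimes: elision already yields two distinct ones here.
fn render2(image: &mut Image) {
    let mut sub = SubImage { image, offset: (1, 1), size: (2, 2) };
    sub.set(0, 0, 7);
}

// Hypothetical driver: returns the buffer after rendering into a 4x4 image.
fn demo() -> Vec<u8> {
    let mut pixel_data = vec![0u8; 16];
    {
        let mut image = Image { pixel_data: &mut pixel_data, size: (4, 4) };
        render2(&mut image);
    }
    pixel_data
}
```

Calling demo() writes a single byte at position (1, 1) of the 4x4 buffer, i.e. index 5, and none of the callers above render2 need lifetime annotations.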

is there anything I can do to SubImage to make these explicit lifetimes not necessary?
Vladimir's answer is spot on, but I'd encourage you to change your code a bit. A lot of my original code had very similar references to things with references. If you need that, then having separate lifetimes can help a lot. However, I'd just embed the Image in SubImage:
struct Image<'a> {
pixel_data: &'a mut Vec<u8>,
size: (i32, i32),
}
struct SubImage<'a> {
image: Image<'a>,
offset: (i32, i32),
size: (i32, i32),
}
In my case, I wasn't really gaining anything by having nested references. Embedding the struct directly makes it a bit bigger, but can make access a bit faster (one less pointer chase). Importantly in this case, it removes the need for a second lifetime.
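A rough sketch of what the embedded variant might look like in use follows. The set and into_image helpers and the demo function are hypothetical, not part of the original code. One consequence worth noting: SubImage now owns the Image, so the caller has to hand the Image over and take it back afterwards if it is still needed.

```rust
struct Image<'a> {
    pixel_data: &'a mut Vec<u8>,
    size: (i32, i32),
}

struct SubImage<'a> {
    image: Image<'a>, // owned, so only one lifetime parameter is needed
    offset: (i32, i32),
    size: (i32, i32),
}

impl<'a> SubImage<'a> {
    // Hypothetical helper: write one byte at a position relative to the offset.
    fn set(&mut self, x: i32, y: i32, value: u8) {
        if x < 0 || y < 0 || x >= self.size.0 || y >= self.size.1 {
            return;
        }
        let (w, _h) = self.image.size;
        let idx = ((y + self.offset.1) * w + (x + self.offset.0)) as usize;
        if let Some(p) = self.image.pixel_data.get_mut(idx) {
            *p = value;
        }
    }

    // Hypothetical helper: give the owned Image back to the caller.
    fn into_image(self) -> Image<'a> {
        self.image
    }
}

// Hypothetical driver: returns the buffer after one write through a SubImage.
fn embedded_demo() -> Vec<u8> {
    let mut pixel_data = vec![0u8; 16];
    let image = Image { pixel_data: &mut pixel_data, size: (4, 4) };
    let mut sub = SubImage { image, offset: (2, 1), size: (2, 2) };
    sub.set(1, 1, 9); // lands at (3, 2) in the full image
    let _image = sub.into_image();
    pixel_data
}
```

Whether this trade is acceptable depends on whether other code needs the Image while the SubImage exists; if it does, the two-lifetime reference version is the better fit.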

Related

Specifying lifetime causing immutable/mutable borrow conflict

Small example to illustrate the problem. The following compiles:
fn main() {
let value: u32 = 15;
let mut ref_to_value: &u32 = &0;
fill_object(&mut ref_to_value, &value);
println!("referring to value {}", ref_to_value);
}
fn fill_object<'a>(m: &mut &'a u32, v: &'a u32) {
*m = v;
}
Now, the following gives a compile time error about mutable borrow followed by immutable borrow:
fn fill_object<'a>(m: &'a mut &'a u32, v: &'a u32) {
*m = v;
}
fill_object(&mut ref_to_value, &value);
| ----------------- mutable borrow occurs here
5 | println!("referring to value {}", ref_to_value);
| ^^^^^^^^^^^^
| |
| immutable borrow occurs here
| mutable borrow later used here
Why? I'm presuming that because I have now specified a lifetime of 'a for the reference to ref_to_value, the mutable borrow now lasts for the entire scope (i.e. main), whereas before, without the 'a lifetime on that reference, the mutable borrow was limited?
I'm looking for clarity on how to think about this.

Your intuition is spot on. With one lifetime,
fn fill_object<'a>(m: &'a mut &'a u32, v: &'a u32) {
*m = v;
}
All three references are required to live for the same length, so if v lives a long time then the mutable reference must as well. This is not intuitive behavior, so it's generally a bad idea to tie together lifetimes like this. If you don't specify any lifetimes, Rust gives each reference a different one implicitly. So the following are equivalent.
fn fill_object(m: &mut &u32, v: &u32)
fn fill_object<'a, 'b, 'c>(m: &'a mut &'b u32, v: &'c u32)
(Note: The inferred lifetime of returned values is a bit more complicated but isn't in play here)
So, your partially-specified lifetimes are equivalent to
fn fill_object<'a>(m: &mut &'a u32, v: &'a u32);
fn fill_object<'a, 'b>(m: &'b mut &'a u32, v: &'a u32);
As a side note, &mut &u32 is a weird type for multiple reasons. Putting aside the fact that u32 is Copy (and hence, outside of generics, there is little point in taking immutable references to it), a mutable reference to an immutable reference is just confusing. I'm not sure what your real use case is. If this was just a test example, then sure. But if this is your real program, I recommend considering whether you can get away with fill_object(&mut u32, u32), and if you really need the nested reference, you might consider making it a bit easier to swallow with a structure.
struct MyIntegerCell<'a>(&'a u32);
Plus some documentation as to why this is necessary. And then you would have fill_object(&mut MyIntegerCell, u32) or something like that.
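As a sketch of how the wrapper could be used: the version below keeps v as a reference, on the assumption that the cell really does need to store a borrowed value. The cell_demo function is hypothetical. The important detail is that the borrow of the cell itself is left elided, so it is not tied to 'a.

```rust
/// A re-pointable reference to a u32 (the wrapper suggested above).
struct MyIntegerCell<'a>(&'a u32);

// The outer `&mut` gets its own, shorter lifetime through elision;
// only the stored reference and `v` share 'a.
fn fill_object<'a>(m: &mut MyIntegerCell<'a>, v: &'a u32) {
    m.0 = v;
}

// Hypothetical driver mirroring the question's main().
fn cell_demo() -> u32 {
    let value: u32 = 15;
    let mut cell = MyIntegerCell(&0);
    fill_object(&mut cell, &value);
    // The mutable borrow of `cell` has already ended, so reading is fine.
    *cell.0
}
```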

Mutable borrow with returned object of longer lifetime

I'm currently trying to implement a mutable slice/view into a buffer that supports taking subslices safely for in-memory message traversal. A minimal example would be
struct MutView<'s> {
data: &'s mut [u8]
}
impl<'s> MutView<'s> {
pub fn new(data: &'s mut [u8]) -> Self {
MutView { data }
}
pub fn subview<'a>(&'a mut self, start: usize, end: usize) -> MutView<'a> {
MutView::new(&mut self.data[start..end])
}
pub fn get<'a>(&'a mut self) -> &'a mut [u8] {
self.data
}
}
However there is a problem with this design, in that if I have a MutView<'s> that is created locally in a function and 's is longer than the scope of the function (say 's = 'static) I have no way of returning a subview from the function. Something like
fn get_subview<'s>(data: &'s mut [u8]) -> MutView<'s> {
MutView::new(data).subview(0, 5)
}
Gives a compile error since MutView::new(data) is a local temporary, so clearly MutView<'a> returned by subview() cannot be returned from the function.
Changing the signature of subview() to
pub fn subview<'a>(&'a mut self, start: usize, end: usize) -> MutView<'s> {
MutView::new(&mut self.data[start..end])
}
Isn't possible since the borrow checker complains that the lifetime of the returned object must be longer than the lifetime of &'a mut self.
The problem I see is that the borrow checker is set up to handle cases where subview() is returning data owned by the &'a mut self, whereas in this case I am returning data owned by some underlying buffer that outlives lifetime 'a, so for safety reasons I still want to do a mutable borrow of &'a mut self for the duration of the lifetime of returned object while also allowing the &'a mut self to be dropped without shortening the lifetime of the returned subview.
The only option I see is to add an into_subview and into_slice method that consumes self. However for my specific case I am generating get_/set_ methods for reading messages with a given schema, meaning I have get/set methods per field in the message body and adding an extra into_* method means a lot of extra code to generate/compile, as well as additional usage complexity. Therefore I would like to avoid this if possible.
Is there a good way of handling this kind of dependency currently in Rust?

Returning a MutView<'s> from subview is unsound.
It would allow users to call subview multiple times and yield potentially overlapping ranges which would violate Rust's referential guarantees that mutable references are exclusive. This can be done easily with immutable references since they can be shared, but there are much stricter requirements for mutable references. For this reason, mutable references derived from self must have their lifetime bound to self in order to "lock out" access to it while the mutable borrow is still in use. The compiler is enforcing that by telling you &mut self.data[..] is &'a mut [u8] instead of &'s mut [u8].
The only option I see is to add an into_subview and into_slice method that consumes self.
That is the main option I see. The key property you need to guarantee is exclusivity, and consuming self would remove it from the equation. You can also take inspiration from the mutable methods on slices like split_mut, split_at_mut, chunks_mut, etc. which are carefully designed to get multiple mutable elements/sub-slices at the same time.
You could use std::mem::transmute to force the lifetimes to be what you want (Warning: transmute is very unsafe and is easy to use incorrectly), however, you then are burdened with upholding the referential guarantees mentioned above yourself. The subview() -> MutView<'s> function should then be marked unsafe with the safety requirement that the ranges are exclusive. I do not recommend doing that except in exceptional cases where you are returning multiple mutable references and have checked that they don't overlap.
I'd have to see exactly what kind of API you're hoping to design to give better advice.
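For reference, here is a minimal sketch of the consuming approach, under the assumption that a by-value API is acceptable. The method names into_subview, into_slice and split_at, and the view_demo function, are illustrative choices rather than an existing API. Because each method takes self by value, the old view can no longer be used, so the result may safely keep the full 's lifetime.

```rust
struct MutView<'s> {
    data: &'s mut [u8],
}

impl<'s> MutView<'s> {
    fn new(data: &'s mut [u8]) -> Self {
        MutView { data }
    }

    // Consumes the view, so the result can carry 's instead of a shorter borrow.
    fn into_subview(self, start: usize, end: usize) -> MutView<'s> {
        MutView::new(&mut self.data[start..end])
    }

    fn into_slice(self) -> &'s mut [u8] {
        self.data
    }

    // Modeled on slice::split_at_mut: two views that cannot overlap.
    fn split_at(self, mid: usize) -> (MutView<'s>, MutView<'s>) {
        let (left, right) = self.data.split_at_mut(mid);
        (MutView::new(left), MutView::new(right))
    }
}

// The function from the question now compiles.
fn get_subview<'s>(data: &'s mut [u8]) -> MutView<'s> {
    MutView::new(data).into_subview(0, 5)
}

// Hypothetical driver: writes through two disjoint views of one buffer.
fn view_demo() -> Vec<u8> {
    let mut buf = vec![0u8; 8];
    {
        let view = get_subview(&mut buf); // covers buf[0..5]
        let (head, tail) = view.split_at(2); // buf[0..2] and buf[2..5]
        head.into_slice()[0] = 1; // buf[0]
        tail.into_slice()[2] = 3; // buf[4]
    }
    buf
}
```

The cost is the one described in the question: every accessor that should hand out long-lived data needs a consuming counterpart.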

How are lifetimes of struct arguments treated in calls to functions with lifetime parameters?

This code passes the compiler (for clarification lifetimes are not elided):
struct Foo<'a> {
_field: &'a i32,
}
fn test<'a, 'b, 'c>(_x: &'a mut Foo<'c>, _y: &'b bool) { // case 1
}
fn main() {
let f = &mut Foo { _field: &0 };
{
let p = false;
test(f, &p);
}
}
If I use 'b instead of 'c in test's definition like so:
fn test<'a, 'b>(_x: &'a mut Foo<'b>, _y: &'b bool) { // case 2
}
the code fails to compile ("p does not live long enough")!
What I would expect to happen at the call of test in case 2 is:
'a is set to the actual lifetime of f,
'b is set to the intersection of the Foo's actual lifetime and &p's actual lifetime which is &p's lifetime,
and everything should be fine, as in case 1.
Instead, what actually seems to happen in case 2 is that 'b is forced to become the lifetime of the Foo which is too big for &p's lifetime, hence the compiler error 'p does not live long enough'. True?
Even stranger (case 3): this only fails if test takes a &mut. If I leave the <'b> in, but remove the mut like so:
fn test<'a, 'b>(_x: &'a Foo<'b>, _y: &'b bool) { // case 3
}
the code passes again.
Anyone to shed light on this?
Cheers.

Noting the difference with mut was a key observation. I think that it will make more sense if you change the type of the second argument and give one possible implementation:
fn test<'a, 'b>(_x: &'a mut Foo<'b>, _y: &'b i32) {
_x._field = _y;
}
This function has the ability to mutate _x. That mutation also includes storing a new reference in _field. However, if we were able to store a reference that had a shorter lifetime (the intersection you mentioned), as soon as the inner block ended, the reference in the Foo would become invalid and we would have violated Rust's memory safety guarantees!
When you use an immutable reference, you don't have this danger, so the compiler allows it.
You have discovered an important thing - Rust doesn't always care what you do in the function. When checking if a function call is valid, only the type signature of the function is used.
I'm sure there's a fancy way of saying this using the proper terms like contravariance and covariance, but I don't know those well enough to use them properly! ^_^
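In variance terms, &'a mut T is invariant in T while &'a T is covariant in T, which is why only the &mut signature pins 'b to the lifetime stored in the Foo. The sketch below illustrates both cases; the function names store and sum, and the demo function, are made up for this example.

```rust
struct Foo<'a> {
    field: &'a i32,
}

// Case 2 shape: through `&mut`, Foo<'b> must match exactly,
// so `y` has to live as long as the reference already stored in the Foo.
fn store<'a, 'b>(x: &'a mut Foo<'b>, y: &'b i32) {
    x.field = y;
}

// Case 3 shape: through a shared reference, a Foo holding a long-lived
// reference can be viewed as a Foo holding a shorter-lived one.
fn sum<'a, 'b>(x: &'a Foo<'b>, y: &'b i32) -> i32 {
    *x.field + *y
}

// Hypothetical driver showing which calls are accepted.
fn variance_demo() -> (i32, i32) {
    let long_lived = 10;
    let mut foo = Foo { field: &0 };
    let mut total = 0;
    {
        let short_lived = 5;
        // Fine: covariance lets 'b shrink to the inner block.
        total += sum(&foo, &short_lived);
        // store(&mut foo, &short_lived);
        // ^ rejected when `foo` is still used after this block, as it is below.
    }
    // Fine: `long_lived` outlives every later use of `foo`.
    store(&mut foo, &long_lived);
    (total, *foo.field)
}
```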

Can you control borrowing a struct vs borrowing a field?

I'm working on a program involving a struct along these lines:
struct App {
data: Vec<u8>,
overlay: Vec<(usize, Vec<u8>)>,
sink: Sink,
}
In brief the data field holds some bytes and overlay is a series of byte sequences to be inserted at specific indices. The Sink type is unimportant except that it has a function like:
impl Sink {
fn process<'a>(&mut self, input: Vec<&'a [u8]>) {
// ...
}
}
I've implemented an iterator to merge the information from data and overlay for consumption by Sink.
struct MergeIter<'a, 'b> {
data: &'a Vec<u8>,
overlay: &'b Vec<(usize, Vec<u8>)>,
// iterator state etc.
}
impl<'a, 'b> Iterator for MergeIter<'a, 'b> {
type Item = &'a [u8];
// ...
}
This is I think a slight lie, because the lifetime of each &[u8] returned by the iterator isn't always that of the original data. The data inserted from overlay has a different lifetime, but I don't see how I can annotate this more accurately. Anyway, the borrow checker doesn't seem to mind - the following approach works:
fn merge<'a, 'b>(data: &'a Vec<u8>, overlay: &'b Vec<(usize, Vec<u8>)>, start: usize) -> Vec<&'a [u8]> {
MergeIter::new(data, overlay, start).collect()
}
impl App {
fn process(&mut self) {
let merged = merge(&self.data, &self.overlay, 0);
// inspect contents of 'merged'
self.sink.process(merged);
}
}
I end up using this merge function all over the place, but always against the same data/overlay. So I figure I'll add an App::merge function for convenience, and here's where the problem begins:
impl App {
fn merge<'a>(&'a self, start: usize) -> Vec<&'a [u8]> {
MergeIter::new(&self.data, &self.overlay, start).collect()
}
fn process(&mut self) {
let merged = self.merge(0);
// inspect contents of 'merged'
self.sink.process(merged);
}
}
App::process now fails to pass the borrow checker - it refuses to allow the mutable borrow of self.sink while self is borrowed.
I've wrestled with this for some time, and if I've understood correctly the problem isn't with process but with this signature:
fn merge<'a>(&'a self, start: usize) -> Vec<&'a [u8]> {
Here I've essentially told the borrow checker that the references returned in the vector are equivalent to the self borrow.
Even though I feel like I've now understood the problem, I still feel like my hands are tied. Leaving the lifetime annotations out doesn't help (because the compiler does the equivalent?), and with only the two references involved there's no way I can see to tell Rust that the output reference has a lifetime bound to something else.
I also tried this:
fn merge<'a, 'b>(&'b self, start: usize) -> Vec<&'a [u8]> {
let data: &'a Vec<u8> = &self.data;
MergeIter::new(&self.data, &self.overlay, start).collect()
}
but the compiler complains about the let statement ("unable to infer appropriate lifetime due to conflicting requirements" -- I also find it infuriating that the compiler doesn't explain said requirements).
Is it possible to achieve this? The Rust Reference is kind of light on lifetime annotations and associated syntax.
rustc 1.0.0-nightly (706be5ba1 2015-02-05 23:14:28 +0000)

As long as the method merge takes &self, you cannot accomplish what you desire: it borrows all of each of its arguments and this cannot be altered.
The solution is to change it so that it doesn’t take self, but instead takes the individual fields you wish to be borrowed:
impl App {
...
fn merge<'a>(data: &'a Vec<u8>, overlay: &Vec<(usize, Vec<u8>)>, start: usize) -> Vec<&'a [u8]> {
MergeIter::new(data, overlay, start).collect()
}
fn process(&mut self) {
let merged = Self::merge(&self.data, &self.overlay, 0);
... // inspect contents of 'merged'
self.sink.process(merged);
}
}
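Here is a self-contained sketch of this pattern. Sink and the merge body are simplified stand-ins (the real Sink and MergeIter are not shown in the question), and the sketch assumes the overlay entries are sorted by index. It also ties both inputs to one lifetime 'a, since the returned slices can come from either data or overlay, and it spells the lifetime out because elision cannot pick an output lifetime when there are two reference parameters and no self.

```rust
// Stand-in for the question's Sink: just counts the bytes it receives.
struct Sink {
    total: usize,
}

impl Sink {
    fn process(&mut self, input: Vec<&[u8]>) {
        self.total += input.iter().map(|s| s.len()).sum::<usize>();
    }
}

struct App {
    data: Vec<u8>,
    overlay: Vec<(usize, Vec<u8>)>,
    sink: Sink,
}

impl App {
    // Associated function (no `self`): borrows only the two fields it reads.
    // Simplified stand-in for MergeIter; assumes `overlay` is sorted by index.
    fn merge<'a>(data: &'a [u8], overlay: &'a [(usize, Vec<u8>)], start: usize) -> Vec<&'a [u8]> {
        let mut out = Vec::new();
        let mut pos = start;
        for (idx, bytes) in overlay {
            if *idx < pos || *idx > data.len() {
                continue; // skip entries outside the remaining range
            }
            out.push(&data[pos..*idx]);
            out.push(bytes.as_slice());
            pos = *idx;
        }
        out.push(&data[pos..]);
        out
    }

    fn process(&mut self) {
        // `data` and `overlay` are borrowed immutably, `sink` mutably:
        // disjoint fields, so the borrow checker accepts this.
        let merged = Self::merge(&self.data, &self.overlay, 0);
        self.sink.process(merged);
    }
}

// Hypothetical driver: "ab" + "XY" + "cdef" is 8 bytes in total.
fn app_demo() -> usize {
    let mut app = App {
        data: b"abcdef".to_vec(),
        overlay: vec![(2, b"XY".to_vec())],
        sink: Sink { total: 0 },
    };
    app.process();
    app.sink.total
}
```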

Yes, you've guessed correctly - the error happens because when you have merge method accept &self, the compiler can't know at its call site that it uses only some fields - merge signature only tells it that the data it returns is somehow derived from self, but it doesn't tell how - and so the compiler assumes the "worst" case and prevents you from accessing other fields self has.
I'm afraid there is no way to fix this at the moment, and I'm not sure there ever will be any. However, you can use macros to shorten merge invocations:
macro_rules! merge {
($this:ident, $start:expr) => {
MergeIter::new(&$this.data, &$this.overlay, $start).collect()
}
}
fn process(&mut self) {
let merged = merge!(self, 0);
// inspect contents of 'merged'
self.sink.process(merged);
}

Need help on lifetime issue

pub struct Decoder<'a> {
reader: &'a mut io::Reader+'a,
}
impl<'a> Decoder<'a> {
pub fn from_reader(r: &'a mut io::Reader) -> Decoder<'a> {
Decoder {
reader: r,
}
}
}
// shortcut method to accept bytes to decode
pub fn decode<'a, T: Decodable<Decoder<'a>, IoError>>(data: Vec<u8>) -> DecodeResult<T> {
let mut r = MemReader::new(data);
let mut decoder = Decoder::from_reader(&mut r); // error: `r` does not live long enough
Decodable::decode(&mut decoder)
}
I have two questions here.
How do you read this declaration (what does it mean): reader: &'a mut io::Reader+'a? I took it from the code of the std JSON encoder.
I wrote a shortcut method that wraps a Vec<u8> in a MemReader, so that I can just interface with io::Reader. But the compiler complains: error: `r` does not live long enough. How do I make this right?
Update: I uploaded the code to GitHub.

The first 'a means that the Reader object itself has lifetime 'a. The second 'a means that the Reader object doesn't contain references that outlive 'a. Since Reader is a trait, it could be implemented by a struct that has lifetime parameters. This bound applies to those potential lifetime parameters.
The problem is that the bound on T, Decodable<Decoder<'a>, IoError>, references the lifetime parameter 'a. However, the Decoder you're creating references a local variable, whereas 'a refers to a lifetime that lives longer than the function call (because it's an input parameter specified implicitly at the call site).
I think there's no way to make this function compile successfully without unsafe code for the moment. In fact, Encoder::buffer_encode seems to be having the same issue (#14302) and uses a similar workaround. transmute allows us to coerce the local lifetime to 'a.
pub fn decode<'a, T: Decodable<Decoder<'a>, IoError>>(data: Vec<u8>) -> DecodeResult<T> {
let mut r = MemReader::new(data);
let mut decoder = unsafe { mem::transmute(Decoder::from_reader(&mut r)) };
Decodable::decode(&mut decoder)
}
