Rust lifetime subtyping doesn't work with Cell


Given a value of type Vec<&'static str>, I can freely convert that to Vec<&'r str>, as 'r is a subregion of 'static. That seems to work for most types, e.g. Vec, pairs etc. However, it doesn't work for types like Cell or RefCell. Concretely, down_vec compiles, but down_cell doesn't:
use std::cell::Cell;
fn down_vec<'p, 'r>(x: &'p Vec<&'static str>) -> &'p Vec<&'r str> {
    x
}
fn down_cell<'p, 'r>(x: &'p Cell<&'static str>) -> &'p Cell<&'r str> {
    x
}
Giving the error:
error[E0308]: mismatched types
--> src/lib.rs:9:5
|
9 | x
| ^ lifetime mismatch
|
= note: expected reference `&'p std::cell::Cell<&'r str>`
found reference `&'p std::cell::Cell<&'static str>`
note: the lifetime `'r` as defined on the function body at 8:18...
--> src/lib.rs:8:18
|
8 | fn down_cell<'p, 'r>(x: &'p Cell<&'static str>) -> &'p Cell<&'r str> {
| ^^
= note: ...does not necessarily outlive the static lifetime
Why does this not work for Cell? How does the compiler track that it doesn't work? Is there an alternative that can make it work?

Cell and RefCell are different because they allow mutation of the internal value through a shared reference.
To see why this is important, we can write a function that uses down_cell to leak a reference to freed memory:
fn oops() -> &'static str {
    let cell = Cell::new("this string doesn't matter");
    let local = String::from("this string is local to oops");
    let broken = down_cell(&cell); // use our broken function to rescope the Cell
    broken.set(&local); // use the rescoped Cell to mutate `cell`
    cell.into_inner() // return a reference to `local`
} // uh-oh! `local` is dropped here
oops contains no unsafe blocks, yet it would compile if down_cell were accepted, so to prevent access to freed memory the compiler must reject down_cell.
At the type level, the explanation is that Cell<T> and RefCell<T> contain an UnsafeCell<T>, which makes them invariant in T, whereas Box<T> and Vec<T> are covariant in T.
The reason Vec, Box and other container-like structures can be covariant is because those containers require &mut access to mutate their contents, and &mut T is itself invariant in T. You couldn't write a function like oops using down_vec -- the compiler wouldn't allow it.
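The question also asks whether there is an alternative. One option, sketched here under the assumption that the inner type is Copy, is to build a fresh Cell with the shorter lifetime instead of reinterpreting the existing one. The helper name rewrap is made up for this illustration and is not part of std.

```rust
use std::cell::Cell;

// Accepted: Vec<T> is covariant in T, so the inner lifetime may shrink.
fn down_vec<'p, 'r>(x: &'p Vec<&'static str>) -> &'p Vec<&'r str> {
    x
}

// Hypothetical helper (not in std): copy the value out and wrap it in a
// brand-new Cell. The new Cell is independent of the original, so writes
// to it cannot smuggle a short-lived reference into `x`.
fn rewrap<'r>(x: &Cell<&'static str>) -> Cell<&'r str> {
    Cell::new(x.get())
}
```

Because the new Cell does not share storage with the original, setting a local reference on it leaves the original untouched, which is exactly the property that down_cell could not guarantee.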
References
The Subtyping and Variance chapter of the Rustonomicon
How does the Rust compiler know `Cell` has internal mutability?
Why does linking lifetimes matter only with mutable references?

Related

Why is it possible to return a mutable reference to a literal from a function?

The current edition of The Rustonomicon has this example code:
use std::mem;
pub struct IterMut<'a, T: 'a>(&'a mut [T]);
impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        let slice = mem::replace(&mut self.0, &mut []);
        if slice.is_empty() {
            return None;
        }
        let (l, r) = slice.split_at_mut(1);
        self.0 = r;
        l.get_mut(0)
    }
}
I'm confused about this line in particular:
let slice = mem::replace(&mut self.0, &mut []);
// ^^^^^^^
How does this borrow check? If this were an immutable borrow, RFC 1414 indicates that the [] rvalue should have 'static lifetime, so that an immutable borrow would borrow-check, but the example shows a mutable borrow! It seems that one of two things must be going on:
Either [] is a temporary (so that it can be used mutably), in which case it would not have 'static lifetime, and should not borrow-check;
Or that [] has 'static lifetime, and therefore it should not be possible to take a mutable borrow (since we don't guarantee exclusive access as we take the borrow), and should not borrow-check.
What am I missing?
Related:
Why can I return a reference to a local literal but not a variable?
That question focuses on immutable references; this one is about mutable references.
Why is it legal to borrow a temporary?
That question focuses on taking references inside of a function; this one is about returning a reference.

TL;DR: empty arrays are special cased in the compiler and it's safe because you can't ever dereference the pointer of a zero-length array, so there's no possible mutable aliasing.
RFC 1414, rvalue static promotion, discusses the mechanism by which values are promoted to static values. It has a section about possible extensions for mutable references (bolding mine):
It would be possible to extend support to &'static mut references,
as long as there is the additional constraint that the
referenced type is zero sized.
This again has precedence in the array reference constructor:
// valid code today
let y: &'static mut [u8] = &mut [];
The rules would be similar:
If a mutable reference to a constexpr rvalue is taken. (&mut <constexpr>)
And the constexpr does not contain a UnsafeCell { ... } constructor.
And the constexpr does not contain a const fn call returning a type containing a UnsafeCell.
And the type of the rvalue is zero-sized.
Then instead of translating the value into a stack slot, translate
it into a static memory location and give the resulting reference a
'static lifetime.
The zero-sized restriction is there because
aliasing mutable references are only safe for zero sized types
(since you never dereference the pointer for them).
From this, we can tell that mutable references to empty arrays are currently special-cased in the compiler. In Rust 1.39, the discussed extension has not been implemented:
struct Zero;
fn example() -> &'static mut Zero {
&mut Zero
}
error[E0515]: cannot return reference to temporary value
--> src/lib.rs:4:5
|
4 | &mut Zero
| ^^^^^----
| | |
| | temporary value created here
| returns a reference to data owned by the current function
While the array version does work:
fn example() -> &'static mut [i32] {
&mut []
}
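To tie this back to the iterator in the question, here is a small sketch of the same mem::replace(.., &mut []) trick outside of an Iterator impl. The function names empty_mut and pop_front are invented for this example.

```rust
use std::mem;

// Compiles because `&mut []` is promoted to a `'static` empty slice.
fn empty_mut() -> &'static mut [i32] {
    &mut []
}

// Hypothetical helper: detach the first element from a mutable slice view,
// using the same `mem::replace(.., &mut [])` trick as the Nomicon iterator.
fn pop_front<'a>(view: &mut &'a mut [i32]) -> Option<&'a mut i32> {
    // Swap in an empty slice so we hold the `&'a mut [i32]` by value.
    let taken = mem::replace(view, &mut []);
    // If `taken` is empty we return None and `view` stays empty.
    let (first, rest) = taken.split_first_mut()?;
    *view = rest;
    Some(first)
}
```

The empty slice only serves as a placeholder while the original reference is moved out; since it has no elements, nothing can be read or written through it.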
See also:
Why is it legal to borrow a temporary?
Why can I return a reference to a local literal but not a variable?

What does it mean that Box is covariant if Box<dyn B> is not a subtype of Box<dyn A> where B: A?

After I read the subtyping chapter of the Nomicon, I couldn't wrap my head around covariance of a type parameter. Especially for the Box<T> type, which is described as: T is covariant.
However, if I write this code:
trait A {}
trait B: A {}
struct C;
impl A for C {}
impl B for C {}
fn foo(v: Box<dyn A>) {}
fn main() {
let c = C;
let b: Box<dyn B> = Box::new(c);
foo(b);
}
(Playground)
error[E0308]: mismatched types
--> src/main.rs:13:9
|
13 | foo(b);
| ^ expected trait `A`, found trait `B`
|
= note: expected type `std::boxed::Box<(dyn A + 'static)>`
found type `std::boxed::Box<dyn B>`
B is clearly a "subtype" of A and Box is covariant over its input. I don't know why it doesn't work or why it won't do any type coercion. Why would they consider Box<T> to be covariant where the only use cases are invariants?

What subtyping and variance means in Rust
The Nomicon is not a fully polished document. Right now, 5 of the most recent 10 issues in that repo specifically deal with subtyping or variance based on their title alone. The concepts in the Nomicon can require substantial effort, but the information is generally there.
First off, check out some initial paragraphs (emphasis mine):
Subtyping in Rust is a bit different from subtyping in other languages. This makes it harder to give simple examples, which is a problem since subtyping, and especially variance, are already hard to understand properly.
To keep things simple, this section will consider a small extension to the Rust language that adds a new and simpler subtyping relationship. After establishing concepts and issues under this simpler system, we will then relate it back to how subtyping actually occurs in Rust.
It then goes on to show some trait-based code. Reiterating the point, this code is not Rust code anymore; traits do not form subtypes in Rust!
Later on, there's this quote:
First and foremost, subtyping references based on their lifetimes is the entire point of subtyping in Rust. The only reason we have subtyping is so we can pass long-lived things where short-lived things are expected.
Rust's notion of subtyping only applies to lifetimes.
What's an example of subtyping and variance?
Variant lifetimes
Here's an example of subtyping and variance of lifetimes at work inside of a Box.
A failing case
fn smaller<'a>(v: Box<&'a i32>) {
bigger(v)
}
fn bigger(v: Box<&'static i32>) {}
error[E0308]: mismatched types
--> src/lib.rs:2:12
|
2 | bigger(v)
| ^ lifetime mismatch
|
= note: expected type `std::boxed::Box<&'static i32>`
found type `std::boxed::Box<&'a i32>`
note: the lifetime 'a as defined on the function body at 1:12...
--> src/lib.rs:1:12
|
1 | fn smaller<'a>(v: Box<&'a i32>) {
| ^^
= note: ...does not necessarily outlive the static lifetime
A working case
fn smaller<'a>(v: Box<&'a i32>) {}
fn bigger(v: Box<&'static i32>) {
smaller(v)
}
Invariant lifetimes
Here's a case that works:
struct S<'a>(&'a i32);
fn smaller<'a>(_v: &S<'a>, _x: &'a i32) {}
fn bigger(v: &S<'static>) {
let x: i32 = 1;
smaller(v, &x);
}
The same code with all the references changed to mutable references will fail because mutable references are invariant:
struct S<'a>(&'a mut i32);
fn smaller<'a>(_v: &mut S<'a>, _x: &'a mut i32) {}
fn bigger(v: &mut S<'static>) {
let mut x: i32 = 1;
smaller(v, &mut x);
}
error[E0597]: `x` does not live long enough
--> src/lib.rs:7:16
|
7 | smaller(v, &mut x);
| -----------^^^^^^-
| | |
| | borrowed value does not live long enough
| argument requires that `x` is borrowed for `'static`
8 | }
| - `x` dropped here while still borrowed
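If the two lifetimes do not actually need to be equal, one way out (a sketch, not taken from the answer itself) is to give the second argument its own lifetime parameter. The functions return a value here only so that the effect is observable.

```rust
struct S<'a>(&'a mut i32);

// Two independent lifetimes: `'a` stays pinned by the invariant `&mut S<'a>`,
// while `'b` is free to be as short as the local borrow requires.
fn smaller<'a, 'b>(v: &mut S<'a>, x: &'b mut i32) -> i32 {
    *x += *v.0;
    *x
}

fn bigger(v: &mut S<'static>) -> i32 {
    let mut x: i32 = 1;
    // `x` no longer has to be borrowed for `'static`.
    smaller(v, &mut x)
}
```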
Addressing specific points
B is clearly a "subtype" of A
It is not.
Box is covariant over its input
It is, where covariance is only applicable to lifetimes.
I don't know why it doesn't work or why it won't do any type coercion.
This is covered by Why doesn't Rust support trait object upcasting?
Why would they consider Box<T> to be covariant
Because it is, for the things in Rust to which variance is applied.
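For the original Box<dyn B> to Box<dyn A> problem, one workaround that does not depend on compiler support for trait object upcasting is to let each implementor do the conversion itself. This is only a sketch: the methods into_a and name are hypothetical additions to the traits from the question.

```rust
trait A {
    fn name(&self) -> String;
}

trait B: A {
    // Hypothetical conversion method: each implementor knows its concrete
    // type, so it can perform the ordinary unsizing coercion to Box<dyn A>.
    fn into_a(self: Box<Self>) -> Box<dyn A>;
}

struct C;

impl A for C {
    fn name(&self) -> String {
        String::from("C")
    }
}

impl B for C {
    fn into_a(self: Box<Self>) -> Box<dyn A> {
        self // Box<C> coerces to Box<dyn A>
    }
}

fn foo(v: Box<dyn A>) -> String {
    v.name()
}
```

A caller holding a Box<dyn B> can then write foo(b.into_a()). Note that this is a coercion performed inside the impl, not subtyping, which is consistent with the point that variance only concerns lifetimes.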
See also
How do I deal with wrapper type invariance in Rust?
Why does linking lifetimes matter only with mutable references?
What is an example of contravariant use in Rust?

To add a bit:
I think the confusion here is mainly due to a common misconception that when we say Foo<T>, T is always assumed to be an owned type. In fact, T can refer to a reference type, such as &i32.
As for (co)variance, Wikipedia defines it as:
Variance refers to how subtyping between more complex types relates to subtyping between their components.
In Rust, as pointed out by others, subtyping applies to lifetimes only. Subtrait relationships don’t define subtypes: If trait A is a subtrait of trait B, it doesn’t mean that A is a subtype of B.
An example of subtyping among lifetimes: A shared reference (e.g. &'a i32) is a subtype of another shared reference (e.g. &'b i32) if and only if the former’s lifetime outlives the latter’s lifetime ('a outlives 'b). Below is some code that demonstrates it:
fn main() {
    let r1: &'static i32 = &42;
    // This obviously works
    let b1: Box<&'static i32> = Box::new(r1);
    // This also works
    // because Box<T> is covariant over T
    // and `&'static i32` is a subtype of `&i32`.
    // NOTE that T here is `&i32`, NOT `i32`
    let b2: Box<&i32> = Box::new(r1);
    let x: i32 = 42;
    let r2: &i32 = &x;
    // This does NOT work
    // because `&i32` is NOT a subtype of `&'static i32`
    // (it is the other way around)
    let b3: Box<&'static i32> = Box::new(r2);
}

Why can't I call a method with a temporary value?

I can't call Foo::new(words).split_first() in the following code
fn main() {
let words = "Sometimes think, the greatest sorrow than older";
/*
let foo = Foo::new(words);
let first = foo.split_first();
*/
let first = Foo::new(words).split_first();
println!("{}", first);
}
struct Foo<'a> {
part: &'a str,
}
impl<'a> Foo<'a> {
fn split_first(&'a self) -> &'a str {
self.part.split(',').next().expect("Could not find a ','")
}
fn new(s: &'a str) -> Self {
Foo { part: s }
}
}
the compiler will give me an error message
error[E0716]: temporary value dropped while borrowed
--> src/main.rs:8:17
|
8 | let first = Foo::new(words).split_first();
| ^^^^^^^^^^^^^^^ - temporary value is freed at the end of this statement
| |
| creates a temporary which is freed while still in use
9 |
10 | println!("{}", first);
| ----- borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
If I bind the value of Foo::new(words) first, then call the split_first method there is no problem.
These two methods of calling should intuitively be the same but are somehow different.

Short answer: remove the 'a lifetime for the self parameter of split_first: fn split_first(&self) -> &'a str (playground).
Long answer:
When you write this code:
struct Foo<'a> {
part: &'a str,
}
impl<'a> Foo<'a> {
fn new(s: &'a str) -> Self {
Foo { part: s }
}
}
You are telling the compiler that all Foo instances are related to some lifetime 'a that must be equal to or shorter than the lifetime of the string passed as parameter to Foo::new. That lifetime 'a may be different from the lifetime of each Foo instance. When you then write:
let words = "Sometimes think, the greatest sorrow than older";
Foo::new(words)
The compiler infers that the lifetime 'a must be equal to or shorter than the lifetime of words. Barring any other constraints the compiler will use the lifetime of words, which is 'static so it is valid for the full life of the program.
When you add your definition of split_first:
fn split_first(&'a self) -> &'a str
You are adding an extra constraint: you are saying that 'a must also be equal to or shorter than the lifetime of self. The compiler will therefore take the shorter of the lifetime of words and the lifetime of the temporary Foo instance, which is the lifetime of the temporary. @AndersKaseorg's answer explains why that doesn't work.
By removing the 'a lifetime on the self parameter, I am decorrelating 'a from the lifetime of the temporary, so the compiler can again infer that 'a is the lifetime of words, which is long enough for the program to work.
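Put together, the corrected type looks like this. The wrapper function first_part is a hypothetical caller added only to show that the one-liner with a temporary now borrow-checks.

```rust
struct Foo<'a> {
    part: &'a str,
}

impl<'a> Foo<'a> {
    fn new(s: &'a str) -> Self {
        Foo { part: s }
    }

    // `&self` gets its own anonymous lifetime; the result is tied only to `'a`.
    fn split_first(&self) -> &'a str {
        self.part.split(',').next().expect("Could not find a ','")
    }
}

// Hypothetical caller: the temporary Foo is dropped at the end of the
// expression, but the returned &str borrows from `words`, not from the Foo.
fn first_part(words: &str) -> &str {
    Foo::new(words).split_first()
}
```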

Foo::new(words).split_first() would be interpreted roughly as
let tmp = Foo::new(words);
let ret = tmp.split_first();
drop(tmp);
ret
If Rust allowed you to do this, the references in ret would point [edit: would be allowed by the type of split_first to point*] into the now dropped value of tmp. So it’s a good thing that Rust disallows this. If you wrote the equivalent one-liner in C++, you’d silently get undefined behavior.
By writing the let binding yourself, you delay the drop until the end of the scope, thus extending the region where it’s safe to have these references.
For more details, see temporary lifetimes in the Rust Reference.
* Edit: As pointed out by Jmb, the real problem in this particular example is that the type
fn split_first(&'a self) -> &'a str
isn’t specific enough, and a better solution is to refine the type to:
fn split_first<'b>(&'b self) -> &'a str
which can be abbreviated:
fn split_first(&self) -> &'a str
This conveys the intended guarantee that the returned references do not point into the Foo<'a> (only into the string itself).

How can I create an Iter over a Vec contained in a RefCell? [duplicate]

Given the following struct and impl:
use std::slice::Iter;
use std::cell::RefCell;
struct Foo {
bar: RefCell<Vec<u32>>,
}
impl Foo {
pub fn iter(&self) -> Iter<u32> {
self.bar.borrow().iter()
}
}
fn main() {}
I get an error message about a lifetime issue:
error: borrowed value does not live long enough
--> src/main.rs:9:9
|
9 | self.bar.borrow().iter()
| ^^^^^^^^^^^^^^^^^ does not live long enough
10 | }
| - temporary value only lives until here
|
note: borrowed value must be valid for the anonymous lifetime #1 defined on the body at 8:36...
--> src/main.rs:8:37
|
8 | pub fn iter(&self) -> Iter<u32> {
| _____________________________________^ starting here...
9 | | self.bar.borrow().iter()
10 | | }
| |_____^ ...ending here
How am I able to return and use bar's iterator?

You cannot do this because it would allow you to circumvent runtime checks for uniqueness violations.
RefCell provides you a way to "defer" mutability exclusiveness checks to runtime, in exchange allowing mutation of the data it holds inside through shared references. This is done using RAII guards: you can obtain a guard object using a shared reference to RefCell, and then access the data inside RefCell using this guard object:
&'a RefCell<T> -> Ref<'a, T> (with borrow) or RefMut<'a, T> (with borrow_mut)
&'b Ref<'a, T> -> &'b T
&'b mut RefMut<'a, T> -> &'b mut T
The key point here is that 'b is different from 'a, which allows one to obtain &mut T references without having a &mut reference to the RefCell. However, these references will be linked to the guard instead and can't live longer than the guard. This is done intentionally: Ref and RefMut destructors toggle various flags inside their RefCell to force mutability checks and to force borrow() and borrow_mut() to panic if these checks fail.
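The effect of the guards can be observed with the non-panicking try_borrow_mut. The following is a minimal sketch; the function name guard_blocks_mutation is made up for the illustration.

```rust
use std::cell::RefCell;

// Returns (blocked_while_guard_alive, allowed_after_guard_dropped).
fn guard_blocks_mutation() -> (bool, bool) {
    let cell = RefCell::new(vec![1u32, 2, 3]);

    // A live shared guard marks the RefCell as borrowed.
    let guard = cell.borrow();
    let blocked = cell.try_borrow_mut().is_err();

    // Dropping the guard resets the flag, so a mutable borrow succeeds again.
    drop(guard);
    let allowed = cell.try_borrow_mut().is_ok();

    (blocked, allowed)
}
```

If a plain &T could escape the guard, the flag would be reset while the reference was still in use, and the runtime check would no longer protect anything.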
The simplest thing you can do is to return a wrapper around Ref, a reference to which would implement IntoIterator:
use std::cell::Ref;
struct VecRefWrapper<'a, T: 'a> {
    r: Ref<'a, Vec<T>>
}
impl<'a, 'b: 'a, T: 'a> IntoIterator for &'b VecRefWrapper<'a, T> {
    type IntoIter = Iter<'a, T>;
    type Item = &'a T;
    fn into_iter(self) -> Iter<'a, T> {
        self.r.iter()
    }
}
(try it on playground)
You can't implement IntoIterator for VecRefWrapper directly because then the internal Ref will be consumed by into_iter(), giving you essentially the same situation you're in now.
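For completeness, here is how the wrapper might be wired into the Foo type from the question. The iter method body and the sum function are my own additions for illustration, not part of the original answer.

```rust
use std::cell::{Ref, RefCell};
use std::slice::Iter;

struct Foo {
    bar: RefCell<Vec<u32>>,
}

struct VecRefWrapper<'a, T: 'a> {
    r: Ref<'a, Vec<T>>,
}

impl<'a, 'b: 'a, T: 'a> IntoIterator for &'b VecRefWrapper<'a, T> {
    type IntoIter = Iter<'a, T>;
    type Item = &'a T;

    fn into_iter(self) -> Iter<'a, T> {
        self.r.iter()
    }
}

impl Foo {
    // Returns the guard wrapped in a struct instead of a bare slice iterator.
    fn iter(&self) -> VecRefWrapper<'_, u32> {
        VecRefWrapper { r: self.bar.borrow() }
    }
}

// Hypothetical consumer: the wrapper (and its Ref guard) lives for the
// duration of the loop, then releases the borrow.
fn sum(foo: &Foo) -> u32 {
    let mut total = 0;
    for &x in &foo.iter() {
        total += x;
    }
    total
}
```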

Alternate Solution
Here is an alternate solution that uses interior mutability as it was intended. Instead of creating an iterator for &T values, we should create an iterator for Ref<T> values, which dereference automatically.
use std::cell::Ref; // needed for Ref and Ref::map_split
struct Iter<'a, T> {
    inner: Option<Ref<'a, [T]>>,
}
impl<'a, T> Iterator for Iter<'a, T> {
    type Item = Ref<'a, T>;
    fn next(&mut self) -> Option<Self::Item> {
        match self.inner.take() {
            Some(borrow) => match *borrow {
                [] => None,
                [_, ..] => {
                    let (head, tail) = Ref::map_split(borrow, |slice| {
                        (&slice[0], &slice[1..])
                    });
                    self.inner.replace(tail);
                    Some(head)
                }
            },
            None => None,
        }
    }
}
Playground
Explanation
The accepted answer has a few significant drawbacks that may confuse those new to Rust. I will explain how, in my personal experience, the accepted answer might actually be harmful to a beginner, and why I believe this alternative uses interior mutability and iterators as they were intended.
As the previous answer importantly highlights, using RefCell creates a divergent type hierarchy that isolates mutable and immutable access to a shared value, but you do not have to worry about lifetimes to solve the iteration problem:
RefCell<T> .borrow() -> Ref<T> .deref() -> &T
RefCell<T> .borrow_mut() -> RefMut<T> .deref_mut() -> &mut T
The key to solving this without lifetimes is the Ref::map method, which is critically missed in the book. Ref::map "makes a new reference to a component of the borrowed data", or in other words converts a Ref<T> of the outer type to a Ref<U> of some inner value:
Ref::map(Ref<T>, ...) -> Ref<U>
Ref::map and its counterpart RefMut::map are the real stars of the interior mutability pattern, not borrow() and borrow_mut().
Why? Because unlike borrow() and borrow_mut(), Ref::map and RefMut::map allow you to create references to interior values that can be "returned".
Consider adding a first() method to the Foo struct described in the question:
fn first(&self) -> &u32 {
&self.bar.borrow()[0]
}
Nope, .borrow() makes a temporary Ref that only lives until the method returns:
error[E0515]: cannot return value referencing temporary value
--> src/main.rs:9:11
|
9 | &self.bar.borrow()[0]
| ^-----------------^^^
| ||
| |temporary value created here
| returns a value referencing data owned by the current function
error: aborting due to previous error; 1 warning emitted
We can make it more obvious what is happening if we break it up and make the implicit dereference explicit:
fn first(&self) -> &u32 {
let borrow: Ref<_> = self.bar.borrow();
let bar: &Vec<u32> = borrow.deref();
&bar[0]
}
Now we can see that .borrow() creates a Ref<T> that is owned by the method's scope. It isn't returned, so it is dropped when the method ends, before the caller can use the reference it provided. So, what we really need is to return an owned type instead of a reference. We want to return a Ref<T>, as it implements Deref for us!
Ref::map will help us do just that for component (internal) values:
fn first(&self) -> Ref<u32> {
Ref::map(self.bar.borrow(), |bar| &bar[0])
}
Of course, the .deref() will still happen automatically, and Ref<u32> will mostly be referentially transparent as &u32.
Gotcha. One easy mistake to make when using Ref::map is to try to create an owned value in the closure, which is not possible, just as it wasn't when we tried to use borrow(). Consider the type signature of the second parameter, the function: FnOnce(&T) -> &U. It returns a reference, not an owned type!
This is why we use a slice in the answer &v[..] instead of trying to use the vector's .iter() method, which returns an owned std::slice::Iter<'a, T>. Slices are a reference type.
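A self-contained sketch of both uses of Ref::map follows. The method name as_slice is a hypothetical addition; first is the method discussed above.

```rust
use std::cell::{Ref, RefCell};

struct Foo {
    bar: RefCell<Vec<u32>>,
}

impl Foo {
    // Project the guard onto a single element.
    fn first(&self) -> Ref<'_, u32> {
        Ref::map(self.bar.borrow(), |bar| &bar[0])
    }

    // Hypothetical helper: project the guard onto a slice, which is a
    // reference type and therefore fits the `FnOnce(&T) -> &U` closure.
    fn as_slice(&self) -> Ref<'_, [u32]> {
        Ref::map(self.bar.borrow(), |bar| &bar[..])
    }
}
```

The returned Ref keeps the RefCell marked as borrowed for as long as the caller holds it, so the runtime check stays meaningful.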
Additional Thoughts
Alright, so now I will attempt to justify why this solution is better than the accepted answer.
First, the use of IntoIterator is inconsistent with the Rust standard library, and arguably the purpose and intent of the trait. The trait method consumes self: fn into_iter(self) -> ....
let v = vec![1,2,3,4];
let i = v.into_iter();
// v is no longer valid, it was moved into the iterator
Using IntoIterator indirectly for a wrapper is inconsistent as you consume the wrapper and not the collection. In my experience, beginners will benefit from sticking with the conventions. We should use a regular Iterator.
Next, the IntoIterator trait is implemented for the reference &VecRefWrapper and not the owned type VecRefWrapper.
Suppose you are implementing a library. The consumers of your API will have to seemingly arbitrarily decorate owned values with reference operators, as is demonstrated in the example on the playground:
for &i in &foo.iter() {
println!("{}", i);
}
This is a subtle and confusing distinction if you are new to Rust. Why do we have to take a reference to the value when it is anonymously owned by - and should only exist for - the scope of the loop?
Finally, the solution above shows how it is possible to drill all the way into your data with interior mutability, and makes the path forward for implementing a mutable iterator clear as well. Use RefMut.
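The answer does not spell out the mutable version, so the following is only my guess at what it could look like, using RefMut::map_split from std. The names IterMut and iter_mut are hypothetical.

```rust
use std::cell::{RefCell, RefMut};

// Hypothetical mutable counterpart of the `Iter` struct above.
struct IterMut<'a, T> {
    inner: Option<RefMut<'a, [T]>>,
}

impl<'a, T> Iterator for IterMut<'a, T> {
    type Item = RefMut<'a, T>;

    fn next(&mut self) -> Option<Self::Item> {
        let borrow = self.inner.take()?;
        if borrow.is_empty() {
            return None;
        }
        // Split the guard into one guard for the head and one for the tail.
        let (head, tail) = RefMut::map_split(borrow, |slice| {
            slice.split_first_mut().expect("slice checked to be non-empty")
        });
        self.inner = Some(tail);
        Some(head)
    }
}

// Hypothetical constructor: borrow the Vec mutably and view it as a slice.
fn iter_mut<T>(cell: &RefCell<Vec<T>>) -> IterMut<'_, T> {
    IterMut {
        inner: Some(RefMut::map(cell.borrow_mut(), |v| &mut v[..])),
    }
}
```

Each item is its own RefMut guard, so the RefCell stays mutably borrowed until every yielded guard and the iterator itself have been dropped.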

From my research there is currently no solution to this problem. The biggest problem here is self-referentiality and the fact that Rust cannot prove your code to be safe, or at least not in a generic fashion.
I think it's safe to assume that crates like ouroboros, self-cell and owning_ref are solutions if you know that your struct (T in Ref<T>) does not contain any smart pointers or anything else that could invalidate pointers you might obtain in your "dependent" struct.
Note that self-cell does this safely with an extra heap allocation, which might be OK in some cases.
There was also an RFC for adding map_value to Ref<T>, but as you can see, there is always some way to invalidate pointers in general. That does not mean your specific case is wrong; it just means the feature will probably never be added to the core library or language, because it cannot be guaranteed for every T.
So, no answer, sorry. impl IntoIterator for &T works, but I think it's rather a hack, and it forces you to write for x in &iter instead of for x in iter.

Why does the Rust compiler request I constrain a generic type parameter's lifetime (error E0309)?

Why does the Rust compiler emit an error requesting me to constrain the lifetime of the generic parameter in the following structure?
pub struct NewType<'a, T> {
x: &'a T,
}
error[E0309]: the parameter type `T` may not live long enough
--> src/main.rs:2:5
|
2 | x: &'a T,
| ^^^^^^^^
|
= help: consider adding an explicit lifetime bound `T: 'a`...
note: ...so that the reference type `&'a T` does not outlive the data it points at
--> src/main.rs:2:5
|
2 | x: &'a T,
| ^^^^^^^^
I can fix it by changing to
pub struct NewType<'a, T>
where
T: 'a,
{
x: &'a T,
}
I don't understand why it is necessary to add the T: 'a part to the structure definition. I cannot think of a way that the data contained in T could outlive the reference to T. The referent of x needs to outlive the NewType struct and if T is another structure then it would need to meet the same criteria for any references it contains as well.
Is there a specific example where this type of annotation would be necessary or is the Rust compiler just being pedantic?

What T: 'a is saying is that any references in T must outlive 'a.
What this means is that you can't do something like:
let mut o: Option<&str> = Some("foo");
let mut nt = NewType { x: &o }; // o has a reference to &'static str, ok.
{
let s = "bar".to_string();
let o2: Option<&str> = Some(&s);
nt.x = &o2;
}
This would be dangerous because nt would have a dangling reference to s after the block. In this case it would also complain that o2 didn't live long enough.
I can't think of a way you can have a &'a reference to something which contains shorter-lifetime references, and clearly the compiler knows this in some way (because it's telling you to add the constraint). However I think it's helpful in some ways to spell out the restriction, since it makes the borrow checker less magic: you can reason about it just from type declarations and function signatures, without having to look at how the fields are defined (often implementation details which aren't in documentation) or at the implementation of a function.
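As far as I know, compilers from Rust 1.31 onward infer the T: 'a bound from the field type, so the original definition is accepted as written; the requirement itself is still enforced. A minimal sketch, where new and get are helper methods added for illustration:

```rust
// No explicit `T: 'a` bound here; the compiler infers it from the field type.
pub struct NewType<'a, T> {
    x: &'a T,
}

impl<'a, T> NewType<'a, T> {
    // Hypothetical constructor.
    pub fn new(x: &'a T) -> Self {
        NewType { x }
    }

    // Hypothetical accessor returning the stored reference.
    pub fn get(&self) -> &'a T {
        self.x
    }
}
```

Trying to store a reference to a value whose own references are shorter-lived than 'a is still rejected, as in the nt.x = &o2 example.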
