Issues in the Rust reference book - rust

https://doc.rust-lang.org/reference/types/closure.html#capture-modes
use std::collections::HashSet;

struct SetVec {
    set: HashSet<u32>,
    vec: Vec<u32>,
}

impl SetVec {
    fn populate(&mut self) {
        let vec = &mut self.vec;
        self.set.iter().for_each(|&n| {
            vec.push(n);
        })
    }
}
If, instead, the closure were to use self.vec directly, then it would
attempt to capture self by mutable reference. But since self.set is
already borrowed to iterate over, the code would not compile.
The reference book says the code would not compile, but the code does compile.
So why?

This was true in the past, but it has since changed and the reference was not updated: since edition 2021, closures capture disjoint fields.
See reference issue #1066.
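As a quick check, the variant described in the quoted passage (using self.vec directly inside the closure) does compile on edition 2021, because the closure captures only the disjoint place self.vec rather than all of self. A minimal sketch:
use std::collections::HashSet;

struct SetVec {
    set: HashSet<u32>,
    vec: Vec<u32>,
}

impl SetVec {
    fn populate(&mut self) {
        // Edition 2021: the closure captures `self.vec` by mutable reference,
        // which is disjoint from the shared borrow of `self.set` held by `iter()`.
        self.set.iter().for_each(|&n| {
            self.vec.push(n);
        })
    }
}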

Related

Rust multiple mutable borrows are not allowed ONLY when lifetime is stated [duplicate of "Why does this mutable borrow live beyond its scope?" below]


Why does borrowing this temporary value work? [duplicate]

I have the following code:
#[derive(Debug)]
pub enum List<'a> {
    Nil,
    Cons(i32, &'a List<'a>),
}

use List::{Cons, Nil};

fn main() {
    let x = Cons(1, &Cons(2, &Nil));
    println!("{:?}", x);
}
It works fine. I don't understand why this code doesn't report any error: isn't the Cons(2, &Nil) dropped before Cons(1, _) is constructed?
Moreover, after I added an empty impl Drop for List, the above code doesn't work any more:
impl<'a> Drop for List<'a> {
    fn drop(&mut self) {}
}
It reports errors that the borrowed values do not live long enough, for both Cons(2, _) and Nil.
Why is there such a difference between before and after adding the impl Drop?
Isn't the Cons(2, &Nil) dropped before constructing Cons(1, _)?
If you bind a reference to a temporary, Rust extends the lifetime of the temporary as needed for the binding; see this answer for details.
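As a standalone illustration (an assumed example, not from the question), the same extension applies to a plain let binding that borrows a temporary:
fn main() {
    // The temporary `String` is not dropped at the end of this statement;
    // its lifetime is extended to match the binding `s`.
    let s = &String::from("temporary");
    println!("{s}");
}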
Why is there such a difference between before and after adding the impl Drop?
See this comment. The extended lifetime of the temporary matches the lifetime of x in your example. When a struct containing references has no Drop implementation,
it’s permissible for reference and referent to have identical lifetimes: the reference can’t be used unsafely. Introducing a Drop impl to the situation requires the referent to strictly outlive the reference, to ensure there is a clear order in which to run drop methods.
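The same difference can be reproduced with a small assumed type, independent of List: the code compiles until a Drop impl forces the referent to strictly outlive the reference.
struct Noisy<'a>(&'a String);

// Comment this impl out and the example below compiles.
impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {}
}

fn main() {
    let n;
    let s = String::from("hi");
    n = Noisy(&s); // error[E0597]: `s` does not live long enough
    // `s` is declared after `n`, so it is dropped first, while `n`'s drop
    // method could still observe the borrowed String when it runs.
}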

Cannot borrow as immutable because it is also borrowed as mutable when implementing an ECS

I am trying to write a simple ECS:
use std::any::{Any, TypeId};
use std::collections::HashMap;

struct Ecs {
    component_sets: HashMap<TypeId, Box<dyn Any>>,
}

impl Ecs {
    pub fn read_all<Component: 'static>(&self) -> &SparseSet<Component> {
        self.component_sets
            .get(&TypeId::of::<Component>())
            .unwrap()
            .downcast_ref::<SparseSet<Component>>()
            .unwrap()
    }

    pub fn write_all<Component: 'static>(&mut self) -> &mut SparseSet<Component> {
        self.component_sets
            .get_mut(&TypeId::of::<Component>())
            .unwrap()
            .downcast_mut::<SparseSet<Component>>()
            .unwrap()
    }
}
I am trying to get mutable access to a certain component while another is immutable. This testing code triggers the error:
fn testing() {
    let all_pos = { ecs.write_all::<Pos>() };
    let all_vel = { ecs.read_all::<Vel>() };

    for (p, v) in all_pos.iter_mut().zip(all_vel.iter()) {
        p.x += v.x;
        p.y += v.y;
    }
}
And the error
error[E0502]: cannot borrow `ecs` as immutable because it is also borrowed as mutable
--> src\ecs.rs:191:25
|
190 | let all_pos = { ecs.write_all::<Pos>() };
| --- mutable borrow occurs here
191 | let all_vel = { ecs.read_all::<Vel>() };
| ^^^ immutable borrow occurs here
My understanding of the borrow checker rules tells me that it's totally fine to get references to different component sets mutably or immutably (that is, &mut SparseSet<Pos> and &SparseSet<Vel>) since they are two different types. In order to get these references though, I need to go through the main ECS struct which owns the sets, which is where the compiler complains (i.e. first I use &mut Ecs when I call ecs.write_all and then &Ecs on ecs.read_all).
My first instinct was to enclose the statements in a scope, thinking it would just drop the &mut Ecs after I get the reference to the inner component set, so as not to have both mutable and immutable Ecs references alive at the same time. This is probably very stupid, yet I don't fully understand why it doesn't work, so I wouldn't mind some more explanation there.
I suspect one additional level of indirection is needed (similar to RefCell's borrow and borrow_mut) but I am not sure what exactly I should wrap and how I should go about it.
Update
Solution 1: make the method signature of write_all take a &self despite returning a RefMut<'_, SparseSet<Component>> by wrapping the SparseSet in a RefCell (as illustrated in the answer below by Kevin Reid).
Solution 2: similar to the above (the method signature takes &self), but uses this piece of unsafe code:
fn write_all<Component>(&self) -> &mut SparseSet<Component> {
    let set = self.component_sets
        .get(&TypeId::of::<Component>())
        .unwrap()
        .downcast_ref::<SparseSet<Component>>()
        .unwrap();

    unsafe {
        let set_ptr = set as *const SparseSet<Component>;
        let set_ptr = set_ptr as *mut SparseSet<Component>;
        &mut *set_ptr
    }
}
What are the benefits of using solution 1? Is the implied runtime borrow-checking provided by RefCell a hindrance in this case, or would it actually prove useful?
Would the use of unsafe be tolerable in this case? Are there benefits (e.g. performance)?
it's totally fine to get references to different component sets mutably or immutably
This is true: we can safely have multiple mutable, or mutable and immutable references, as long as no mutable reference points to the same data as any other reference.
However, not every means of obtaining those references will be accepted by the compiler's borrow checker. This doesn't mean they're unsound; just that we haven't convinced the compiler that they're safe. In particular, the only way the compiler understands to have simultaneous references is a struct's fields, because the compiler can know those are disjoint using a purely local analysis (looking only at the code of a single function):
struct Ecs {
    pub pos: SparseSet<Pos>,
    pub vel: SparseSet<Vel>,
}

for (p, v) in ecs.pos.iter_mut().zip(ecs.vel.iter()) {
    p.x += v.x;
    p.y += v.y;
}
This would compile, because the compiler can see that the references refer to different subsets of memory. It will not compile if you replace ecs.pos with a method ecs.pos() — let alone a HashMap. As soon as you get a function involved, information about field borrowing is hidden. Your function
pub fn write_all<Component>(&mut self) -> &mut SparseSet<Component>
has the elided lifetimes (lifetimes the compiler picks for you because every & must have a lifetime)
pub fn write_all<'a, Component>(&'a mut self) -> &'a mut SparseSet<Component>
which are the only information the compiler will use about what is borrowed. Hence, the 'a mutable reference to the SparseSet is borrowing all of the Ecs (as &'a mut self) and you can't have any other access to it.
The ways to arrange to be able to have multiple mutable references in a mostly-statically-checked way are discussed in the documentation page on Borrow Splitting. However, all of those are based on having some statically known property, which you don't. There's no way to express “this is okay as long as the Component type is not equal to another call's”. Therefore, to do this you do need RefCell, our general-purpose helper for runtime borrow checking.
Given what you've got already, the simplest thing to do is to replace SparseSet<Component> with RefCell<SparseSet<Component>>:
// no mut; changed return type
pub fn write_all<Component: 'static>(&self) -> RefMut<'_, SparseSet<Component>> {
    self.component_sets
        .get(&TypeId::of::<Component>())
        .unwrap()
        .downcast_ref::<RefCell<SparseSet<Component>>>() // changed type
        .unwrap()
        .borrow_mut() // added this line
}
Note the changed return type, because borrowing a RefCell must return an explicit handle in order to track the duration of the borrow. However, a Ref or RefMut acts mostly like an & or &mut thanks to deref coercion. (Your code that inserts items in the map, which you didn't show in the question, will also need a RefCell::new.)
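For completeness, a sketch of how the calling code from the question might then look, assuming a matching read_all that returns Ref<'_, SparseSet<Component>> and the Pos, Vel and SparseSet types from the question:
fn testing(ecs: &Ecs) {
    // Both calls now take &self, so there is no longer a conflicting
    // &mut Ecs / &Ecs pair; clashes on the *same* component type are
    // detected at runtime by the RefCell instead.
    let mut all_pos = ecs.write_all::<Pos>(); // RefMut<SparseSet<Pos>>
    let all_vel = ecs.read_all::<Vel>();      // Ref<SparseSet<Vel>>

    for (p, v) in all_pos.iter_mut().zip(all_vel.iter()) {
        p.x += v.x;
        p.y += v.y;
    }
}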
Another option is to put the interior mutability — likely via RefCell, but not necessarily — inside the SparseSet type, or create a wrapper type that does that. This might or might not help the code be cleaner.

Why does this mutable borrow live beyond its scope?

I've encountered a confusing error about the use of a mutable and immutable borrow at the same time, after I expect the mutable borrow to end. I've done a lot of research on similar questions (1, 2, 3, 4, 5) which has led me to believe my problem has something to do with lexical lifetimes (though turning on the NLL feature and compiling on nightly doesn't change the result), I just have no idea what; my situation doesn't seem to fit into any of the scenarios of the other questions.
pub enum Chain<'a> {
    Root {
        value: String,
    },
    Child {
        parent: &'a mut Chain<'a>,
    },
}

impl Chain<'_> {
    pub fn get(&self) -> &String {
        match self {
            Chain::Root { ref value } => value,
            Chain::Child { ref parent } => parent.get(),
        }
    }

    pub fn get_mut(&mut self) -> &mut String {
        match self {
            Chain::Root { ref mut value } => value,
            Chain::Child { ref mut parent } => parent.get_mut(),
        }
    }
}
#[test]
fn test() {
    let mut root = Chain::Root { value: "foo".to_string() };
    {
        let mut child = Chain::Child { parent: &mut root };
        *child.get_mut() = "bar".to_string();
    } // I expect child's borrow to go out of scope here
    assert_eq!("bar".to_string(), *root.get());
}
playground
The error is:
error[E0502]: cannot borrow `root` as immutable because it is also borrowed as mutable
--> example.rs:36:36
|
31 | let mut child = Chain::Child { parent: &mut root };
| --------- mutable borrow occurs here
...
36 | assert_eq!("bar".to_string(), *root.get());
| ^^^^
| |
| immutable borrow occurs here
| mutable borrow later used here
I understand why an immutable borrow happens there, but I do not understand how a mutable borrow is used there. How can both be used at the same place? I'm hoping someone can explain what is happening and how I can avoid it.
In short, &'a mut Chain<'a> is extremely limiting and pervasive.
For an immutable reference &T<'a>, the compiler is allowed to shorten the lifetime of 'a when necessary to match other lifetimes or as part of NLL (this is not always the case, it depends on what T is). However, it cannot do so for mutable references &mut T<'a>, otherwise you could assign it a value with a shorter lifetime.
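A minimal sketch of that difference, using hypothetical helper functions (not from the question):
// Accepted: behind a shared reference, the inner lifetime may shrink
// ('long is shortened to 'short), because `&T` is covariant in T.
fn shorten<'short, 'long: 'short>(r: &'short &'long str) -> &'short &'short str {
    r
}

// Rejected: behind a mutable reference, the inner lifetime is invariant.
// If this compiled, the caller could store a short-lived &str through the
// reference and later read it back as a &'long str.
// fn shorten_mut<'short, 'long: 'short>(r: &'short mut &'long str) -> &'short mut &'short str {
//     r // error: `&mut` is invariant in its pointee, so 'long cannot be shortened here
// }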
So when the compiler tries to reconcile the lifetimes when the reference and the parameter are linked &'a mut T<'a>, the lifetime of the reference is conceptually expanded to match the lifetime of the parameter. Which essentially means you've created a mutable borrow that will never be released.
Applying that knowledge to your question: creating a reference-based hierarchy is really only possible if the nested values are covariant over their lifetimes. Which excludes:
mutable references
trait objects
structs with interior mutability
Refer to these variations on the playground to see how these don't quite work as expected.
See also:
Why does linking lifetimes matter only with mutable references?
How do I implement the Chain of Responsibility pattern using a chain of trait objects?
What are the differences when getting an immutable reference from a mutable reference with self-linked lifetimes?
For fun, I'll include a case where the Rust standard library does this sort of thing on purpose. The signature of std::thread::scope looks like:
pub fn scope<'env, F, T>(f: F) -> T
where
F: for<'scope> FnOnce(&'scope Scope<'scope, 'env>) -> T
The Scope that is provided to the user-defined function intentionally has its lifetimes tied in a knot to ensure it is only used in intended ways. This is not always the case since structs may be covariant or contravariant over their generic types, but Scope is defined to be invariant. Also, the only function that can be called on it is .spawn() which intentionally takes &'scope self as the self-parameter as well, ensuring that the reference does not have a shorter lifetime than what is given by scope.
Internally, the standard library contains this documentation (source):
Invariance over 'scope, to make sure 'scope cannot shrink, which is necessary for soundness.
Without invariance, this would compile fine but be unsound:
std::thread::scope(|s| {
    s.spawn(|| {
        let a = String::from("abcd");
        s.spawn(|| println!("{a:?}")); // might run after `a` is dropped
    });
});
Even if the lifetime of the reference is invariant with respect to itself, this still avoids many problems above because it uses an immutable reference and interior-mutability. If the parameter to .spawn() required &'scope mut self, then this would not work and run into the same problems above when trying to spawn more than one thread.
The issue isn't lexical lifetimes, and adding an explicit drop won't change the error. The issue is with the &'a mut Chain<'a>: that forces root to be borrowed for its entire life, rendering it useless after the borrow is dropped. As per the comment below, doing this with lifetimes is basically impossible. I would suggest using a Box instead. Changing the struct to
pub enum Chain {
    Root {
        value: String,
    },
    Child {
        parent: Box<Chain>,
    },
}
and adjusting the other methods as necessary. Alternatively, use an Rc<RefCell<Chain>> if you want the original to remain usable without consuming self.
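A sketch of what that adjustment might look like, assuming the same get/get_mut API as the question (the lifetime parameters disappear, and Box dereferences like a plain reference):
impl Chain {
    pub fn get(&self) -> &String {
        match self {
            Chain::Root { value } => value,
            Chain::Child { parent } => parent.get(),
        }
    }

    pub fn get_mut(&mut self) -> &mut String {
        match self {
            Chain::Root { value } => value,
            Chain::Child { parent } => parent.get_mut(),
        }
    }
}
Note that with Box the child owns its parent, so the original test would move root into the child rather than borrow it; the Rc<RefCell<Chain>> variant mentioned above is what keeps the original usable.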

Why is the mutable reference not moved here?

I was under the impression that mutable references (i.e. &mut T) are always moved. That makes perfect sense, since they allow exclusive mutable access.
In the following piece of code I assign a mutable reference to another mutable reference and the original is moved. As a result I cannot use the original any more:
let mut value = 900;
let r_original = &mut value;
let r_new = r_original;
*r_original; // error: use of moved value *r_original
If I have a function like this:
fn make_move(_: &mut i32) {
}
and modify my original example to look like this:
let mut value = 900;
let r_original = &mut value;
make_move(r_original);
*r_original; // no complain
I would expect that the mutable reference r_original is moved when I call the function make_move with it. However that does not happen. I am still able to use the reference after the call.
If I use a generic function make_move_gen:
fn make_move_gen<T>(_: T) {
}
and call it like this:
let mut value = 900;
let r_original = &mut value;
make_move_gen(r_original);
*r_original; // error: use of moved value *r_original
The reference is moved again and therefore the program behaves as I would expect.
Why is the reference not moved when calling the function make_move?
Code example
There might actually be a good reason for this.
&mut T isn't actually a type: all borrows are parametrized by some (potentially inexpressible) lifetime.
When one writes
fn move_try(val: &mut ()) {
    { let new = val; }
    *val
}

fn main() {
    move_try(&mut ());
}
the type inference engine infers typeof new == typeof val, so they share the original lifetime. This means the borrow from new does not end until the borrow from val does.
This means it's equivalent to
fn move_try<'a>(val: &'a mut ()) {
    { let new: &'a mut _ = val; }
    *val
}

fn main() {
    move_try(&mut ());
}
However, when you write
fn move_try(val: &mut ()) {
    { let new: &mut _ = val; }
    *val
}

fn main() {
    move_try(&mut ());
}
a cast happens - the same kind of thing that lets you cast away pointer mutability. This means that the lifetime is some (seemingly unspecifiable) 'b < 'a. This involves a cast, and thus a reborrow, and so the reborrow is able to fall out of scope.
An always-reborrow rule would probably be nicer, but explicit declaration isn't too problematic.
I asked something along those lines here.
It seems that in some (many?) cases, instead of a move, a re-borrow takes place. Memory safety is not violated, only the "moved" value is still around. I could not find any docs on that behavior either.
@Levans opened a GitHub issue here, although I'm not entirely convinced this is just a doc issue: dependably moving out of a &mut reference seems central to Rust's approach to ownership.
It's an implicit reborrow, a topic that is not well documented.
This question has already been answered pretty well:
how implicit reborrow works
how reborrow works along with borrow split
If I tweak the generic one a bit, it does not complain either:
fn make_move_gen<T>(_: &mut T) {
}
or
let _ = *r_original;
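A related workaround, sketched here rather than taken from the answers above, is to reborrow explicitly at the call site, which leaves the original reference usable even with the fully generic function:
fn make_move_gen<T>(_: T) {}

fn main() {
    let mut value = 900;
    let r_original = &mut value;
    // `&mut *r_original` creates a fresh, shorter-lived reborrow;
    // the reborrow is moved into the function, not `r_original` itself.
    make_move_gen(&mut *r_original);
    println!("{}", *r_original); // r_original is still usable
}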
