I'm doing Advent Of Code Day 7 in Rust. I have to parse a tree out of order like so:
a(10)
c(5) -> a, b
b(20)
That says c is the root with a and b as its children.
I handle this by parsing each line, making an object, and storing it in a hash by name. If it shows up later as a child, like a, I can use that hash to look up the object and attach it as a child. If it shows up as a child before being defined, like b, I can create a partial version and update it later via the hash. The above would be something like:
let mut np = NodeParser{
map: HashMap::new(),
root: None,
};
{
// This would be the result of parsing "a(10)".
{
let a = Node{
name: "a".to_string(),
weight: Some(10),
children: None
};
np.map.insert( a.name.clone(), a );
}
// This is the result of parsing "c(5) -> a, b".
// Note that it creates 'b' with incomplete data.
{
let b = Node{
name: "b".to_string(),
weight: None,
children: None
};
np.map.insert("b".to_string(), b);
let c = Node{
name: "c".to_string(),
weight: Some(5),
children: Some(vec![
*np.map.get("a").unwrap(),
// ^^^^^^^^^^^^^^^^^^^^^^^^^ cannot move out of borrowed content
*np.map.get("b").unwrap()
// ^^^^^^^^^^^^^^^^^^^^^^^^^ cannot move out of borrowed content
])
};
np.map.insert( c.name.clone(), c );
}
// Parsing "b(20)", it's already seen b, so it updates it.
// This also updates the entry in c.children. It avoids
// having to search all nodes for any with b as a child.
{
let mut b = np.map.get_mut( "b" ).unwrap();
b.weight = Some(20);
}
}
I might want to look up a node and look at its children.
// And if I wanted to look at the children of c...
let node = np.map.get("c").unwrap();
for child in node.children.unwrap() {
// ^^^^ cannot move out of borrowed content
println!("{:?}", child);
}
Rust does not like this. It doesn't like that both NodeParser.map and Node.children own a node.
error[E0507]: cannot move out of borrowed content
--> /Users/schwern/tmp/test.rs:46:21
|
46 | *np.map.get("a").unwrap(),
| ^^^^^^^^^^^^^^^^^^^^^^^^^ cannot move out of borrowed content
error[E0507]: cannot move out of borrowed content
--> /Users/schwern/tmp/test.rs:49:21
|
49 | *np.map.get("b").unwrap()
| ^^^^^^^^^^^^^^^^^^^^^^^^^ cannot move out of borrowed content
It doesn't like that the for loop is trying to borrow the node to iterate because I've already borrowed the node from the NodeParser that owns it.
error[E0507]: cannot move out of borrowed content
--> /Users/schwern/tmp/test.rs:68:18
|
68 | for child in node.children.unwrap() {
| ^^^^ cannot move out of borrowed content
I think I understand what I'm doing wrong, but I'm not sure how to make it right.
How should I construct this to make the borrow checker happy? Because of the way NodeParser.map and Node.children must be linked, copying is not an option.
Here is the code to test with. In the real code both Node and NodeParser have implementations and methods.
One option is unsafe code ... but I would suggest avoiding that if you're using the Advent of Code to learn idiomatic Rust rather than dropping all the safety it's trying to give you.
Another option is to reference count the Node instances so that the borrow checker is happy and the compiler knows how to clean things up. The std::rc::Rc type does this for you ... and essentially every call to clone() just increments a reference count and returns a new Rc instance. Then every time an object is dropped, the Drop implementation just decrements the reference count.
As for the iteration .. for x in y is syntactic sugar for for x in y.into_iter(). This is attempting to move the contents of children out of node (notice in the IntoIterator trait, into_iter(self) takes ownership of self). To rectify this, you can ask for a reference instead when iterating, using for x in &y. This essentially becomes for x in y.iter(), which does not move the contents.
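A small sketch of the by-reference iteration described above. The Node type here is a simplified stand-in for the one in the question, and child_names and demo are hypothetical helper names. One detail worth noting: when children is an Option<Vec<_>>, writing for child in &node.children iterates over the Option (yielding the whole Vec at most once), so this sketch flattens it to reach the individual children.

```rust
// Simplified stand-in for the question's Node (no weight field).
struct Node {
    name: String,
    children: Option<Vec<Node>>,
}

/// Collects the names of the direct children without moving them out of `node`.
fn child_names(node: &Node) -> Vec<String> {
    let mut names = Vec::new();
    // `iter()` on the Option yields `&Vec<Node>` (or nothing for `None`);
    // `flatten()` then walks the vector by reference, so nothing is moved.
    for child in node.children.iter().flatten() {
        names.push(child.name.clone());
    }
    names
}

/// Hypothetical driver: builds c -> [a, b] and lists c's children.
fn demo() -> Vec<String> {
    let leaf = |name: &str| Node { name: name.to_string(), children: None };
    let c = Node {
        name: "c".to_string(),
        children: Some(vec![leaf("a"), leaf("b")]),
    };
    let names = child_names(&c);
    // `c` is still fully usable here because only references were handed out.
    assert_eq!(c.name, "c");
    names
}
```

Calling demo() returns the names "a" and "b" in that order.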
Here are these suggestions in action.
use std::collections::HashMap;
use std::rc::Rc;
struct NodeParser {
map: HashMap<String, Rc<Node>>,
root: Option<Node>,
}
#[derive(Debug)]
struct Node {
name: String,
children: Option<Vec<Rc<Node>>>,
}
fn main() {
let mut np = NodeParser{
map: HashMap::new(),
root: None,
};
let a = Rc::new(Node{ name: "a".to_string(), children: None });
np.map.insert( a.name.clone(), a.clone() );
let b = Rc::new(Node{ name: "b".to_string(), children: None });
np.map.insert( b.name.clone(), b.clone() );
let c = Rc::new(Node{
name: "c".to_string(),
children: Some(vec![a, b])
});
np.map.insert( c.name.clone(), c.clone() );
let node = np.map.get("c").unwrap();
for child in &node.children {
println!("{:?}", child);
}
}
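The example above leaves out the weight field and the later update of b. Rc alone only gives shared, read-only access, so one common way to cover that part of the question is to wrap each node in Rc<RefCell<Node>>. The following is a minimal sketch under that assumption, not the asker's real parser: the Shared alias and the get_or_create, define, children_of, and demo functions are hypothetical names, and children is a plain Vec rather than an Option.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical alias: one node shared between the map and its parent's child list.
type Shared = Rc<RefCell<Node>>;

struct Node {
    name: String,
    weight: Option<u32>,
    children: Vec<Shared>,
}

/// Returns the node registered under `name`, creating a partial one if needed.
fn get_or_create(map: &mut HashMap<String, Shared>, name: &str) -> Shared {
    let entry = map.entry(name.to_string()).or_insert_with(|| {
        Rc::new(RefCell::new(Node {
            name: name.to_string(),
            weight: None,
            children: Vec::new(),
        }))
    });
    Rc::clone(entry)
}

/// Records one parsed line such as "c(5) -> a, b".
fn define(map: &mut HashMap<String, Shared>, name: &str, weight: u32, child_names: &[&str]) {
    let children: Vec<Shared> = child_names
        .iter()
        .map(|child| get_or_create(map, child))
        .collect();
    let node = get_or_create(map, name);
    // The update goes through the RefCell, so every Rc handle sees it.
    let mut node_mut = node.borrow_mut();
    node_mut.weight = Some(weight);
    node_mut.children = children;
}

/// Lists (name, weight) for the direct children of `name`.
fn children_of(map: &HashMap<String, Shared>, name: &str) -> Vec<(String, Option<u32>)> {
    let mut out = Vec::new();
    if let Some(node) = map.get(name) {
        for child in &node.borrow().children {
            let child = child.borrow();
            out.push((child.name.clone(), child.weight));
        }
    }
    out
}

/// Replays the three input lines from the question, in their original order.
fn demo() -> Vec<(String, Option<u32>)> {
    let mut map = HashMap::new();
    define(&mut map, "a", 10, &[]);
    define(&mut map, "c", 5, &["a", "b"]); // creates a partial `b`
    define(&mut map, "b", 20, &[]); // fills in `b`; `c` sees the new weight
    children_of(&map, "c")
}
```

Calling demo() returns ("a", Some(10)) and ("b", Some(20)): the weight set on b after c was built is visible through c's child list. If nodes also needed a pointer back to their parent, that link would normally be a Weak to avoid a reference cycle.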
EDIT: I will expand on my comment here. You can use lifetimes here too if you want, but I'm concerned that the lifetime solution will work against the MCVE and won't work once applied to the actual problem the OP (not just of this question... others as well) actually has. Lifetimes are tricky in Rust and small things like re-ordering the instantiation of variables to allow the lifetime solution can throw people off. My concern being they will run into lifetime issues and therefore the answers won't be appropriate to their actual situation even if it works for the MCVE. Maybe I overthink that though..
In my efforts to learn Rust's notorious borrow checker I'm hopelessly stuck. I've constructed a tree structure that loosely represents a file system. As soon as I try to borrow an enum value and mutate the underlying value, I get an error that I don't understand. Pretend that cache_files actually has some elements in it. Here's the code:
enum UnixFile{
FILE(UFile),
FOLDER(UFolder)
}
struct UFile {
name: String,
parent: Weak<RefCell<UnixFile>>,
size: usize,
}
struct UFolder {
name: String,
parent: Weak<RefCell<UnixFile>>,
files: Vec<Rc<RefCell<UnixFile>>>,
}
fn main() {
let root = Rc::new(RefCell::new(
UnixFile::FOLDER(
UFolder {
name: String::from("root"),
parent: Weak::new(),
files: vec![],
})));
let mut current = root.clone();
let mut f = &*(*current).borrow_mut();
let mut cache_files:Vec<Rc<RefCell<UnixFile>>> = Vec::new();
match f {
UnixFile::FOLDER(mut folder) => {
folder.files.append(&mut cache_files);
},
UnixFile::FILE(file) => {}
}
}
And this is the resulting error:
error[E0507]: cannot move out of `f` as enum variant `FOLDER` which is behind a shared reference
--> src/main.rs:43:11
|
43 | match f {
| ^
44 | UnixFile::FOLDER(mut folder) => {
| ----------
| |
| data moved here
| move occurs because `folder` has type `UFolder`, which does not implement the `Copy` trait
So what does this error mean? And how can I put the code in a position that I actually can add a file from cache_files to the folder?
First, you do &* for f. This gives you a shared reference. You cannot mutate it.
Instead, do &mut *. You can also simplify the expression further, removing the parentheses and asterisk thanks to auto-deref:
let mut f = &mut *current.borrow_mut();
The second problem is more subtle. When you write UnixFile::FOLDER(mut folder) in the match arm, the mut forces the compiler to move the value out of the enum instead of binding folder to a reference. The explanation of why is complicated and not really relevant (the mut opts the binding out of match ergonomics, but you don't need to understand that). What you need to know is that you can just remove the mut; folder will then have type &mut UFolder, and everything will work.
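Putting both fixes together, a self-contained sketch might look like the following. The type and variant names are taken from the question; the Shared alias and the add_files and demo functions are hypothetical helpers added for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Variant names are kept as in the question; the lints are silenced only
// to keep this sketch quiet.
#[allow(dead_code, non_camel_case_types)]
enum UnixFile {
    FILE(UFile),
    FOLDER(UFolder),
}

#[allow(dead_code)]
struct UFile {
    name: String,
    parent: Weak<RefCell<UnixFile>>,
    size: usize,
}

#[allow(dead_code)]
struct UFolder {
    name: String,
    parent: Weak<RefCell<UnixFile>>,
    files: Vec<Rc<RefCell<UnixFile>>>,
}

type Shared = Rc<RefCell<UnixFile>>;

/// Moves everything from `cache_files` into `target` if it is a folder and
/// returns how many entries the folder holds afterwards (0 for a plain file).
fn add_files(target: &Shared, cache_files: &mut Vec<Shared>) -> usize {
    let mut guard = target.borrow_mut();
    // `&mut *guard` is a mutable reference to the enum; with no `mut` in the
    // pattern, `folder` binds as `&mut UFolder` instead of moving the value.
    match &mut *guard {
        UnixFile::FOLDER(folder) => {
            folder.files.append(cache_files);
            folder.files.len()
        }
        UnixFile::FILE(_) => 0,
    }
}

/// Hypothetical driver: a root folder receives two cached files.
fn demo() -> (usize, usize) {
    let root: Shared = Rc::new(RefCell::new(UnixFile::FOLDER(UFolder {
        name: String::from("root"),
        parent: Weak::new(),
        files: vec![],
    })));
    let mut cache_files: Vec<Shared> = ["a.txt", "b.txt"]
        .iter()
        .map(|name| {
            Rc::new(RefCell::new(UnixFile::FILE(UFile {
                name: name.to_string(),
                parent: Rc::downgrade(&root),
                size: 0,
            })))
        })
        .collect();
    let in_folder = add_files(&root, &mut cache_files);
    (in_folder, cache_files.len())
}
```

Calling demo() returns (2, 0): both files end up in the folder and the cache is left empty, because Vec::append drains its argument.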
You are moving a value (the enum) that is owned by something else. Either implement Clone/Copy for it, or wrap it in Rc (Arc in multithreaded code) so that you clone the reference-counted handle rather than the value.
I'm trying to learn a bit of Rust through a toy application, which involves a tree data structure that is filled dynamically by querying an external source. In the beginning, only the root node is present. The tree structure provides a method get_children(id) that returns a [u32] of the IDs of all the node's children — either this data is already known, or the external source is queried and all the nodes are inserted into the tree.
I'm running into the following problem with the borrow checker that I can't seem to figure out:
struct Node {
id: u32,
value: u64, // in my use case, this type is much larger and should not be copied
children: Option<Vec<u32>>,
}
struct Tree {
nodes: std::collections::HashMap<u32, Node>,
}
impl Tree {
fn get_children(&mut self, id: u32) -> Option<&[u32]> {
// This will perform external queries and add new nodes to the tree
None
}
fn first_even_child(&mut self, id: u32) -> Option<u32> {
let children = self.get_children(id)?;
let result = children.iter().find(|&id| self.nodes.get(id).unwrap().value % 2 == 0)?;
Some(*result)
}
}
Which results in:
error[E0502]: cannot borrow `self.nodes` as immutable because it is also borrowed as mutable
--> src/lib.rs:19:43
|
18 | let children = self.get_children(id)?;
| ---- mutable borrow occurs here
19 | let result = children.iter().find(|&id| self.nodes.get(id).unwrap().value % 2 == 0)?;
| ---- ^^^^^ ---------- second borrow occurs due to use of `self.nodes` in closure
| | |
| | immutable borrow occurs here
| mutable borrow later used by call
Since get_children might insert nodes into the tree, we need a &mut self reference. However, the way I see it, after the value of children is known, self no longer needs to be borrowed mutably. Why does this not work, and how would I fix it?
EDIT -- my workaround
After Chayim Friedman's answer, I decided against returning Self. I mostly ran into the above problem when first calling get_children to get a list of IDs and then using nodes.get() to obtain the corresponding Node. Instead, I refactored to provide the following functions:
impl Tree {
fn load_children(&mut self, id: u32) {
// If not present yet, perform queries to add children to the tree
}
fn iter_children(&self, id: u32) -> Option<IterChildren> {
// Provides an iterator over the children of node `id`
}
}
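A minimal sketch of how that split might be filled in. The external query is replaced by a hard-coded, purely hypothetical stand-in, and the IterChildren type is replaced by an impl Iterator return type for brevity; demo is a hypothetical driver.

```rust
use std::collections::HashMap;

struct Node {
    id: u32,
    value: u64,
    children: Option<Vec<u32>>,
}

struct Tree {
    nodes: HashMap<u32, Node>,
}

impl Tree {
    /// Mutating step: makes sure the children of `id` are present in the tree.
    fn load_children(&mut self, id: u32) {
        let already_loaded = self.nodes.get(&id).map_or(true, |n| n.children.is_some());
        if already_loaded {
            return;
        }
        // Hypothetical stand-in for the external query: two made-up children.
        let fetched = [(id * 10 + 1, 7u64), (id * 10 + 2, 8u64)];
        let ids: Vec<u32> = fetched.iter().map(|&(child_id, _)| child_id).collect();
        for &(child_id, value) in &fetched {
            self.nodes.insert(child_id, Node { id: child_id, value, children: None });
        }
        if let Some(node) = self.nodes.get_mut(&id) {
            node.children = Some(ids);
        }
    }

    /// Read-only step: iterates over the already loaded children of `id`.
    fn iter_children(&self, id: u32) -> Option<impl Iterator<Item = &Node> + '_> {
        let ids = self.nodes.get(&id)?.children.as_ref()?;
        Some(ids.iter().filter_map(move |child_id| self.nodes.get(child_id)))
    }

    fn first_even_child(&mut self, id: u32) -> Option<u32> {
        // The mutable borrow ends when `load_children` returns, so the
        // shared borrow taken by `iter_children` no longer conflicts with it.
        self.load_children(id);
        self.iter_children(id)?.find(|n| n.value % 2 == 0).map(|n| n.id)
    }
}

/// Hypothetical driver: a tree that starts with only the root node (id 1).
fn demo() -> Option<u32> {
    let mut tree = Tree { nodes: HashMap::new() };
    tree.nodes.insert(1, Node { id: 1, value: 3, children: None });
    tree.first_even_child(1)
}
```

With the made-up data above, demo() returns Some(12), the id of the child whose value is 8.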
Downgrading a mutable reference into a shared reference does not release the exclusive borrow: the resulting shared reference must still be treated as the only way to reach the value. This is necessary for e.g. Cell::from_mut(), which has the following signature:
pub fn from_mut(t: &mut T) -> &Cell<T>
This method relies on the uniqueness guarantee of &mut T to ensure that no references to T are kept directly, only via the Cell. If downgrading the reference meant that uniqueness could be violated, this method would be unsound, because the value inside the Cell could be changed through another shared reference (via interior mutability) while a plain &T to it still existed.
For more about this see Common Rust Lifetime Misconceptions: downgrading mut refs to shared refs is safe.
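To make the Cell::from_mut() point concrete, here is a small sketch (demo is a hypothetical function name). The &mut borrow of x is turned into a &Cell<u32> that can be freely copied and mutated through, which is only sound because nothing else can reach x while those handles are in use.

```rust
use std::cell::Cell;

fn demo() -> u32 {
    let mut x = 1u32;
    // The exclusive borrow of `x` is converted into a shared `&Cell<u32>`.
    let cell: &Cell<u32> = Cell::from_mut(&mut x);
    // Shared references are `Copy`, so there are now two aliases to one value.
    let a = cell;
    let b = cell;
    a.set(10);
    // Touching `x` directly at this point would be rejected by the compiler,
    // because the borrow that started as `&mut x` is still in use below.
    b.set(b.get() + 1);
    // After the last use of `a` and `b`, the borrow ends and `x` is usable again.
    x
}
```

Calling demo() returns 11.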
To solve this you need to get both shared references from the same shared reference that was created from the mutable reference. You can, for example, also return &Self from get_children():
fn get_children(&mut self, id: u32) -> Option<(&Self, &[u32])> {
// This will perform external queries and add new nodes to the tree
Some((self, &[]))
}
fn first_even_child(&mut self, id: u32) -> Option<u32> {
let (this, children) = self.get_children(id)?;
let result = children.iter().find(|&id| this.nodes.get(id).unwrap().value % 2 == 0)?;
Some(*result)
}
I have a tree structure with a node and children, and a loop from a GUI library which expects a function to run on each iteration. I'm struggling to get the borrow checker to let me keep a reference to the node I'm processing - it complains that nodes doesn't live long enough.
Here's a minimal reproduction:
#[derive(Debug)]
struct Node {
value: u64,
children: Vec<Node>,
}
fn run_loop<F>(mut handler: F)
where
F: 'static + FnMut(),
{
for _ in 0..500 {
handler();
}
}
fn main() {
let nodes = vec![
Node {
value: 1,
children: vec![Node {
value: 3,
children: vec![],
}],
},
Node {
value: 2,
children: vec![],
},
];
let mut node = &nodes[0];
run_loop(move || {
println!("Node: {:?}", node);
node = &node.children[0];
});
}
error[E0597]: `nodes` does not live long enough
--> src/main.rs:30:21
|
30 | let mut node = &nodes[0];
| ^^^^^ borrowed value does not live long enough
31 |
32 | / run_loop(move || {
33 | | println!("Node: {:?}", node);
34 | | node = &node.children[0];
35 | | });
| |______- argument requires that `nodes` is borrowed for `'static`
36 | }
| - `nodes` dropped here while still borrowed
What's the best way to make this work? I can't change the structure of run_loop. Ideally I wouldn't change the structure of Node (it's an object returned from a third-party library so while I could parse the object out into a new data structure, that wouldn't be elegant). Can I make the borrow checker happy with this just making changes in main?
it complains that nodes doesn't live long enough.
That's because it doesn't. The run_loop function requires its argument to live forever ('static). The nodes variable does not live forever, and consequently the closure that captures it does not live forever.
The easy fix would be to change run_loop to not require an argument that lives forever (by removing the 'static constraint), but if you cannot do that then you can just make nodes live forever instead. You can do this by "leaking" it.
let nodes = vec![ /*...*/ ];
let nodes = Vec::leak(nodes);
let mut node = &nodes[0];
When this answer was written, Vec::leak required nightly; it has been stable since Rust 1.47. On older compilers, Box has a similar leak function that is available in stable.
let nodes = vec![ /*...*/ ];
let nodes = Box::leak(nodes.into_boxed_slice());
let mut node = &nodes[0];
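A self-contained sketch of the Box::leak variant. It shortens run_loop to three iterations, stops descending at a leaf instead of indexing children[0] unconditionally, and records the visited values in an Rc<RefCell<Vec<u64>>> log so the result can be inspected; sample and demo are hypothetical helper names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    value: u64,
    children: Vec<Node>,
}

// Same shape as the question's run_loop, shortened to three iterations.
fn run_loop<F>(mut handler: F)
where
    F: 'static + FnMut(),
{
    for _ in 0..3 {
        handler();
    }
}

fn sample() -> Vec<Node> {
    vec![
        Node { value: 1, children: vec![Node { value: 3, children: vec![] }] },
        Node { value: 2, children: vec![] },
    ]
}

fn demo() -> Vec<u64> {
    // Leaking hands back a reference that is valid for the rest of the
    // program; the memory is intentionally never freed.
    let nodes: &'static [Node] = Box::leak(sample().into_boxed_slice());
    let mut node: &'static Node = &nodes[0];

    // Log of visited values, shared between the closure and this function.
    let log: Rc<RefCell<Vec<u64>>> = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&log);

    run_loop(move || {
        sink.borrow_mut().push(node.value);
        // Descend to the first child if there is one; stay put at a leaf.
        if let Some(next) = node.children.first() {
            node = next;
        }
    });

    let visited = log.borrow().clone();
    visited
}
```

Calling demo() returns [1, 3, 3]: the loop visits the first root, then its only child, and then stays on that leaf.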
The leak solution didn't work for my actual use case, and in any case doesn't really represent the semantics of the situation or generalize very well (what if you don't want to leak the contents forever? Or what if it's not a vector you're working with?).
I ended up deciding that the best solution was just to do unsafe pointer manipulation:
let nodes = Box::pin(nodes);
let mut node_ptr = std::ptr::NonNull::from(&nodes[0]);
run_loop(move || {
let node = unsafe { node_ptr.as_ref() };
println!("Node: {:?}", node);
node_ptr = std::ptr::NonNull::from(&(node.children[0]));
});
In my actual implementation I put both nodes and node_ptr in a single struct so that provides some guarantee that the nodes won't be dropped before node_ptr.
I'm going to leave this open since I'd love to see a solution that doesn't require unsafe, but am posting this here since for now at least it's the best I have.
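One possible way to avoid unsafe here, offered as a sketch rather than a drop-in answer, is to move nodes into the closure and remember the current position as a path of child indices instead of a reference. The closure then owns everything it touches, so it satisfies 'static without leaking. The node_at, sample, and demo names are hypothetical, run_loop is shortened to three iterations, and the log exists only so the result can be checked.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    value: u64,
    children: Vec<Node>,
}

// Same shape as the question's run_loop, shortened to three iterations.
fn run_loop<F>(mut handler: F)
where
    F: 'static + FnMut(),
{
    for _ in 0..3 {
        handler();
    }
}

fn sample() -> Vec<Node> {
    vec![
        Node { value: 1, children: vec![Node { value: 3, children: vec![] }] },
        Node { value: 2, children: vec![] },
    ]
}

/// Resolves a path of indices (root index first) to a node, if it exists.
fn node_at<'a>(roots: &'a [Node], path: &[usize]) -> Option<&'a Node> {
    let (&first, rest) = path.split_first()?;
    let mut node = roots.get(first)?;
    for &index in rest {
        node = node.children.get(index)?;
    }
    Some(node)
}

fn demo() -> Vec<u64> {
    let nodes = sample();
    let mut path = vec![0usize]; // start at nodes[0]

    let log: Rc<RefCell<Vec<u64>>> = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&log);

    // `nodes`, `path` and `sink` are all moved into the closure, so it
    // borrows nothing from this stack frame.
    run_loop(move || {
        if let Some(node) = node_at(&nodes, &path) {
            sink.borrow_mut().push(node.value);
            if !node.children.is_empty() {
                path.push(0); // descend to the first child next time
            }
        }
    });

    let visited = log.borrow().clone();
    visited
}
```

Calling demo() returns [1, 3, 3]. The trade-off is that each iteration walks the path from the root again, which costs time proportional to the depth of the current node.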
I am trying to define a recursive struct similar to a linked list for a tree traversal. A node has some data and access to its parent. The child node should borrow its parent mutably to ensure exclusive access, and release it once it's dropped. I can define this struct using immutable references, but not when I make the parent reference mutable. When making the parent reference mutable, I am confused by the compiler error and do not understand it.
How can I define the lifetimes for such a recursive structure with a mutable parent reference?
Here is a minimal example. This compiles but uses a readonly reference:
struct Node<'a> {
// Parent reference. `None` indicates a root node.
// I want this to be a mutable reference.
pub parent: Option<&'a Node<'a>>,
// This field just represents some data attached to this node.
pub value: u32,
}
// Creates a root node
// I use a static lifetime since there's no parent for the root so there are no constraints there
fn root_node(value: u32) -> Node<'static> {
Node {
parent: None,
value,
}
}
// Creates a child node
// The lifetimes indicates that the parent must outlive its child
fn child_node<'inner, 'outer: 'inner>(
parent: &'inner mut Node<'outer>,
value: u32,
) -> Node<'inner> {
Node {
parent: Some(parent),
value,
}
}
// An example function using the struct
fn main() {
let mut root = root_node(0);
let mut c1 = child_node(&mut root, 1);
let mut c2 = child_node(&mut c1, 2);
{
let mut c3 = child_node(&mut c2, 3);
let c4 = child_node(&mut c3, 4);
let mut cur = Some(&c4);
while let Some(n) = cur {
println!("{}", n.value);
cur = n.parent;
}
}
{
let c5 = child_node(&mut c2, 5);
let mut cur = Some(&c5);
while let Some(n) = cur {
println!("{}", n.value);
cur = n.parent;
}
}
println!("{}", c2.value);
}
I want a mutable reference, so I tried to replace the Node struct to use a mutable reference:
struct Node<'a> {
// Parent reference. `None` indicates a root node.
// I want this to be a mutable reference.
pub parent: Option<&'a mut Node<'a>>,
// This field just represents some data attached to this node.
pub value: u32,
}
But then I get the following error:
error[E0623]: lifetime mismatch
--> src/main.rs:25:22
|
21 | parent: &'inner mut Node<'outer>,
| ------------------------
| |
| these two types are declared with different lifetimes...
...
25 | parent: Some(parent),
| ^^^^^^ ...but data from `parent` flows into `parent` here
I do not understand the relationship between mutability and data flowing into a field. In the immutable case, I was already requiring the functions to pass mutable/exclusive references. I've been trying various combinations of lifetimes (using a single lifetime, reversing their relationship, etc.) but was unsuccessful.
It is not possible to implement this kind of recursive structure with mutable references due to variance.
The Rustonomicon has a section on variance, with the following table:
| | 'a | T |
|-----------|-----------|-----------|
| &'a T | covariant | covariant |
| &'a mut T | covariant | invariant |
In particular, &'a mut T is invariant with regard to T.
The core issue here is that a Node only knows the lifetimes of its parent, not the lifetime of all its ancestors. Even if in my case I'm just interested in mutating the value field of the ancestor, &mut Node also gives access to modify the parent field of any ancestor up the chain where we don't have access to the precise lifetime.
Here is an example where my struct could cause unsoundness with a mutable parent reference.
The following code would be accepted if T was covariant in &'a mut T:
fn main() {
let mut root: Node<'static> = root_node(0);
// where 'a corresponds to `root`
let mut c1: Node<'a> = child_node(&mut root, 1);
{
let mut evil_root: Node<'static> = root_node(666);
{
// where 'b corresponds to `c1`
let mut c2: Node<'b> = child_node(&mut c1, 2);
// where 'c corresponds to `c2`
let mut c3: Node<'c> = child_node(&mut c2, 3);
// Here is the issue: `c3` knows that its ancestors live at least as long
// as `c2`. But it does not know how long exactly.
// With covariance, the lifetime of `evil_root` would be compatible since
// it outlives `c2`. And because `&mut T` enables to mutate any field
// we could do the following:
let c2_ref: &mut Node<'c> = c3.parent.unwrap();
let c1_ref: &mut Node<'c> = c2_ref.parent.unwrap();
c1_ref.parent = Some(&mut evil_root);
}
}
// Trying to access the parent of `c1` now causes a read-after-free
println!("{}", c1.parent.unwrap().value);
}
The invariance rule ensures that the code above is rejected by the compiler and there is no unsoundness.
Because &mut allows modifying any field, including ones that hold references, and because this kind of recursion does not keep track of all the parent lifetimes, it would be unsound.
To safely implement such a recursive struct, Rust would need a kind of reference that allows mutating value (since it has a static lifetime, no issue there) but not parent.
In the minimal example I posted above it could be achieved using immutable references for the parents and placing the node data behind a Cell or RefCell. Another possible solution (but I haven't looked much into it) would be to place the mutable parent references behind a Pin but dereferencing it would be unsafe: I'd have to manually ensure that I am never changing the parent reference.
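A minimal sketch of the Cell variant mentioned above: parents are reached through shared references, and only value is mutable. The bump_ancestors and demo functions are hypothetical additions. Note that this gives up the exclusive-access guarantee the question asked for, since several children can now point at the same parent at once.

```rust
use std::cell::Cell;

struct Node<'a> {
    // Shared parent reference; `None` indicates a root node.
    parent: Option<&'a Node<'a>>,
    // Interior mutability: the data can change through a shared reference.
    value: Cell<u32>,
}

fn root_node(value: u32) -> Node<'static> {
    Node { parent: None, value: Cell::new(value) }
}

// A single lifetime is enough here because `&'a Node<'a>` is covariant in 'a.
fn child_node<'a>(parent: &'a Node<'a>, value: u32) -> Node<'a> {
    Node { parent: Some(parent), value: Cell::new(value) }
}

/// Adds 1 to the value of every ancestor of `node`.
fn bump_ancestors(node: &Node<'_>) {
    let mut cur = node.parent;
    while let Some(n) = cur {
        n.value.set(n.value.get() + 1);
        cur = n.parent;
    }
}

/// Hypothetical driver: returns the final values of the root and of c1.
fn demo() -> (u32, u32) {
    let root = root_node(0);
    let c1 = child_node(&root, 1);
    {
        let c2 = child_node(&c1, 2);
        bump_ancestors(&c2);
    }
    {
        let c3 = child_node(&c1, 3);
        bump_ancestors(&c3);
    }
    (root.value.get(), c1.value.get())
}
```

Calling demo() returns (2, 3): each of the two short-lived children bumped both c1 and the root once.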
My actual use case is a bit more complex, so I'll try to instead restructure it to remove the need for the recursive struct by storing my data in a stack backed by a Vec.
I want to be able to store a struct called Child inside a Parent, where the Child contains a reference back to the parent.
It works if I have the Child structs directly inside the parent like this:
struct Parent<'s> {
cache: RefCell<Vec<Child<'s>>>
}
But if I move the Vec into a separate struct, then it will fail to compile with lifetime errors.
struct Parent<'s> {
cache: RefCell<Cache<'s>>
}
struct Cache<'s> {
children: Vec<Child<'s>>
}
Is it possible to make this example work with the separate structs?
Here's the full working code, which compiles fine. When I move the children into the separate struct, it fails.
My analysis of the problem:
When Parent contains children directly, 's is the same lifetime as the scope of the Parent struct itself, thus I can call methods that take &'s self on Parent.
When Parent contains Cache which contains children, 's is the same lifetime as the scope of the Cache struct, which is created before Parent, thus it is impossible to call methods on Parent that take &'s self. Attempting to do so gives the error
<anon>:33:15: 33:16 error: `p` does not live long enough
<anon>:33 let obj = p.create_object();
^
<anon>:30:48: 38:2 note: reference must be valid for the block suffix following statement 0 at 30:47...
<anon>:30 let cache = Cache { children: Vec::new() }; // the lifetime `'s` is essentially from this line to the end of the program
<anon>:31 let mut p = Parent { cache: RefCell::new(cache) }; // although the Parent instance was created here, 's still refers to the lifetime before it
<anon>:32 // this fails because p doesn't live long enough
<anon>:33 let obj = p.create_object();
I need a way of shortening 's to the scope of Parent, not the scope of the Cache.
Disclaimer:
This question is very similar to one I asked earlier (https://stackoverflow.com/questions/32579518/rust-lifetime-error-with-self-referencing-struct?noredirect=1#comment53014063_32579518) that was marked as duplicate. I've read through the answer and I believe I'm beyond that as I can get the lifetimes of references right (as shown in my first example). I'm asking this (now slightly different) question again because I now have a concrete example that works, and one that doesn't work. I'm sure that what can be done with one struct can be done with two, right?
You can make it compile by forcing the Cache and the Parent to have the same lifetime by defining them in the same let binding.
fn main() {
let (cache, mut p);
cache = Cache { children: Vec::new() };
p = Parent { cache: RefCell::new(cache) };
let obj = p.create_object();
let c1 = Child { parent: &p, data: 1 };
p.cache.borrow_mut().children.push(c1);
}
Here, we're essentially declaring a destructured tuple and then initializing it. We cannot initialize the tuple directly on the let binding:
let (cache, mut p) = (Cache { children: Vec::new() }, Parent { cache: RefCell::new(cache) });
because the initializer for p references cache, but that name is not defined until the end of the let statement. The separate initialization works because the compiler tracks which variables are initialized; if you swap the order of the assignments, you'll get a compiler error:
<anon>:31:38: 31:43 error: use of possibly uninitialized variable: `cache` [E0381]
<anon>:31 p = Parent { cache: RefCell::new(cache) };
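For comparison, here is a sketch of the two-struct layout written for a current compiler. It assumes non-lexical lifetimes (the default since the 2018 edition), under which the Cache can simply be built inline and the tuple-binding trick should not be needed. The add_child and child_data methods and the demo function are hypothetical stand-ins for the question's create_object, whose body is not shown.

```rust
use std::cell::RefCell;

struct Child<'s> {
    parent: &'s Parent<'s>,
    data: u32,
}

struct Cache<'s> {
    children: Vec<Child<'s>>,
}

struct Parent<'s> {
    cache: RefCell<Cache<'s>>,
}

impl<'s> Parent<'s> {
    /// Stores a child that points back at this parent. Taking `&'s self`
    /// borrows the parent (shared) for as long as it is in use.
    fn add_child(&'s self, data: u32) {
        self.cache.borrow_mut().children.push(Child { parent: self, data });
    }

    /// Returns the data of every child whose back-pointer is this parent.
    fn child_data(&self) -> Vec<u32> {
        self.cache
            .borrow()
            .children
            .iter()
            .filter(|child| std::ptr::eq(child.parent, self))
            .map(|child| child.data)
            .collect()
    }
}

fn demo() -> Vec<u32> {
    let p = Parent { cache: RefCell::new(Cache { children: Vec::new() }) };
    p.add_child(1);
    p.add_child(2);
    let data = p.child_data();
    data
}
```

Calling demo() returns [1, 2]. This relies on none of the three types implementing Drop; once a value holds references to itself like this, it can also no longer be moved or borrowed mutably.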