I'm studying Rust and I'm stuck on a question about Rc, Weak and RefCell. The use case is to implement a fully functioning tree, where each node has one parent and a list of children. The documentation provides a good starting point:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
#[derive(Debug)]
struct Node {
value: i32,
parent: RefCell<Weak<Node>>,
children: RefCell<Vec<Rc<Node>>>,
}
fn main() {
let leaf = Rc::new(Node {
value: 3,
parent: RefCell::new(Weak::new()),
children: RefCell::new(vec![]),
});
println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
let branch = Rc::new(Node {
value: 5,
parent: RefCell::new(Weak::new()),
children: RefCell::new(vec![Rc::clone(&leaf)]),
});
*leaf.parent.borrow_mut() = Rc::downgrade(&branch);
println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
}
But when I try to expand upon this, there's a problem with the design: I cannot add a new node to an already existing one. The branch node has a child, but I think this is only possible because we already know that leaf is going to be the one and only child before branch is created. If that is not the case, can we somehow change branch after it has been created inside an Rc and add leaf as a child?
Or should I leave this design and adopt a design that looks more like:
#[derive(Debug)]
struct Node {
value: i32,
parent: Weak<RefCell<Node>>,
children: Vec<Rc<RefCell<Node>>>,
}
You can add new leaves with the following code:
let new_leaf = Rc::new(Node {
value: 4,
parent: RefCell::new(Rc::downgrade(&branch)),
children: RefCell::new(vec![])
});
branch.children.borrow_mut().push(new_leaf);
That being said, the alternative Node type you suggest seems to be what most Rustaceans would go with and are more familiar with.
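As a rough sketch of how that alternative layout could be used, the following wraps each node in Rc<RefCell<Node>> and links nodes after both already exist. The helper names new_node, add_child and parent_value are hypothetical, introduced only for illustration.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Debug)]
struct Node {
    value: i32,
    parent: Weak<RefCell<Node>>,
    children: Vec<Rc<RefCell<Node>>>,
}

// Hypothetical helper: creates a detached node with no parent and no children.
fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node {
        value,
        parent: Weak::new(),
        children: Vec::new(),
    }))
}

// Hypothetical helper: links `child` under `parent` after both have been created.
fn add_child(parent: &Rc<RefCell<Node>>, child: Rc<RefCell<Node>>) {
    child.borrow_mut().parent = Rc::downgrade(parent);
    parent.borrow_mut().children.push(child);
}

// Returns the value stored in the node's parent, if the parent is still alive.
fn parent_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let parent = node.borrow().parent.upgrade()?;
    let value = parent.borrow().value;
    Some(value)
}

fn main() {
    let branch = new_node(5);
    let leaf = new_node(3);
    add_child(&branch, Rc::clone(&leaf));
    add_child(&branch, new_node(4));
    println!("children = {}", branch.borrow().children.len());
    println!("leaf parent = {:?}", parent_value(&leaf));
}
```

With this layout the whole node sits behind one RefCell, so a single borrow_mut() gives access to every field, at the cost of coarser runtime borrow checking.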
The following example illustrates what I am trying to do:
use rayon::prelude::*;
struct Parent {
children: Vec<Child>,
}
struct Child {
value: f64,
index: usize,
// will keep the distances to other children in the children vector of parent
distances: Vec<f64>,
}
impl Parent {
fn calculate_distances(&mut self) {
self.children
.par_iter_mut()
.for_each(|x| x.calculate_distances(&self.children));
}
}
impl Child {
fn calculate_distances(&mut self, children: &[Child]) {
children
.iter()
.enumerate()
.for_each(|(i, x)| self.distances[i] = (self.value - x.value).abs());
}
}
The above won't compile. The problem is that I can't access &self.children in the closure of the first for_each. I understand why the borrow checker doesn't allow that, so my question is whether there is a way to make it work with few changes. The solutions I have found so far are not really satisfying. One would be to clone children at the beginning of Parent::calculate_distances and use the clone inside the closure (which adds an unnecessary clone). The other would be to extract the value field of Child like this:
use rayon::prelude::*;
struct Parent {
children: Vec<Child>,
values: Vec<f64>
}
struct Child {
index: usize,
// will keep the distances to other children in the children vector of parent
distances: Vec<f64>,
}
impl Parent {
fn calculate_distances(&mut self) {
let values = &self.values;
self.children
.par_iter_mut()
.for_each(|x| x.calculate_distances(values));
}
}
impl Child {
fn calculate_distances(&mut self, values: &[f64]) {
for i in 0..values.len(){
self.distances[i]= (values[self.index]-values[i]).abs();
}
}
}
While this would be efficient, it totally messes up my real code, and value conceptually belongs to Child. I am relatively new to Rust and wondered whether there is a nice way of doing this. As far as I understand, there would need to be a way to tell the compiler that I only change the distances field in the parallel iterator while the value field stays constant. Maybe this is a place to use unsafe? Anyway, I would really appreciate a hint in the right direction, or at least confirmation that my code really has to become so much messier to make it work :)
Rust tries hard to prevent you from doing precisely what you want to do: retain access to the whole collection while modifying it. If you're unwilling to adjust the layout of your data to accommodate the borrow checker, you can use interior mutability to make Child::calculate_distances take &self rather than &mut self. Then your problem goes away because it's perfectly fine to hand out multiple shared references to self.children.
Ideally you'd use a RefCell because you don't access the same Child from multiple threads. But Rust doesn't allow that because, based on the signatures of the functions involved, you could do so, which would be a data race. Declaring distances: RefCell<Vec<f64>> makes Child no longer Sync, removing access to Vec<Child>::par_iter().
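To make the Sync point concrete, here is a small compile-time check using only the standard library. The names is_sync, ChildWithMutex and ChildWithRefCell are hypothetical and exist only for this sketch.

```rust
use std::cell::RefCell;
use std::sync::Mutex;

// Compiles only when `T` may be shared between threads by reference.
fn is_sync<T: Sync>() -> bool {
    true
}

#[allow(dead_code)]
struct ChildWithMutex {
    value: f64,
    distances: Mutex<Vec<f64>>,
}

#[allow(dead_code)]
struct ChildWithRefCell {
    value: f64,
    distances: RefCell<Vec<f64>>,
}

fn main() {
    // `Mutex<Vec<f64>>` is `Sync`, so the struct containing it is too.
    println!("ChildWithMutex is Sync: {}", is_sync::<ChildWithMutex>());

    // Uncommenting the next line is a compile error, because
    // `RefCell<Vec<f64>>` is not `Sync`:
    // is_sync::<ChildWithRefCell>();
}
```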
What you can do is use a Mutex. Although it feels wasteful at first, keep in mind that each call to Child::calculate_distances() receives a different Child, so the mutex will always be uncontended and therefore cheap to lock (no system call involved). And you'd lock it only once per Child::calculate_distances(), not on every access to the array. The code would look like this:
use rayon::prelude::*;
use std::sync::Mutex;
struct Parent {
children: Vec<Child>,
}
struct Child {
value: f64,
index: usize,
// will keep the distances to other children in the children vector of parent
distances: Mutex<Vec<f64>>,
}
impl Parent {
fn calculate_distances(&mut self) {
self.children
.par_iter()
.for_each(|x| x.calculate_distances(&self.children));
}
}
impl Child {
fn calculate_distances(&self, children: &[Child]) {
let mut distances = self.distances.lock().unwrap();
children
.iter()
.enumerate()
.for_each(|(i, x)| distances[i] = (self.value - x.value).abs());
}
}
You can also try replacing std::sync::Mutex with parking_lot::Mutex which is smaller (only one byte of overhead, no allocation), faster, and doesn't require the unwrap() because it doesn't do lock poisoning.
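For readers who want to try the idea without pulling in rayon, here is a minimal standard-library sketch of the same pattern. It swaps rayon's par_iter() for std::thread::scope with one thread per child, which is an assumption made purely to keep the example dependency-free; make_children, calculate_all and distances_of are hypothetical helpers.

```rust
use std::sync::Mutex;
use std::thread;

struct Child {
    value: f64,
    distances: Mutex<Vec<f64>>,
}

impl Child {
    fn calculate_distances(&self, children: &[Child]) {
        // Locked once per call; no other thread ever locks this child's mutex.
        let mut distances = self.distances.lock().unwrap();
        for (i, other) in children.iter().enumerate() {
            distances[i] = (self.value - other.value).abs();
        }
    }
}

// Hypothetical helper: builds children with zeroed distance vectors.
fn make_children(values: &[f64]) -> Vec<Child> {
    values
        .iter()
        .map(|&value| Child {
            value,
            distances: Mutex::new(vec![0.0; values.len()]),
        })
        .collect()
}

// Hypothetical driver: one scoped thread per child, standing in for rayon.
fn calculate_all(children: &[Child]) {
    thread::scope(|s| {
        for child in children {
            // Only shared references cross the thread boundary.
            s.spawn(move || child.calculate_distances(children));
        }
    });
}

// Hypothetical helper: copies out the computed distances of one child.
fn distances_of(child: &Child) -> Vec<f64> {
    child.distances.lock().unwrap().clone()
}

fn main() {
    let children = make_children(&[1.0, 4.0, 6.0]);
    calculate_all(&children);
    for child in &children {
        println!("{:?}", distances_of(child));
    }
}
```

Because every thread only reads value from the other children and writes through its own mutex, no lock is ever contended, which matches the reasoning above.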
I am trying to implement Red-Black Tree in Rust. After 2 days of battling with the compiler, I am ready to give up and am here asking for help.
This question helped me quite a bit: How do I handle/circumvent "Cannot assign to ... which is behind a & reference" in Rust?
I looked at existing sample code for RB-Trees in Rust, but all of the ones I saw use some form of unsafe operations or null, which we are not supposed to use here.
I have the following code:
#[derive(Debug, Clone, PartialEq)]
pub enum Colour {
Red,
Black,
}
type T_Node<T> = Option<Box<Node<T>>>;
#[derive(Debug, Clone, PartialEq)]
pub struct Node<T: Copy + Clone + Ord> {
value: T,
colour: Colour,
parent: T_Node<T>,
left: T_Node<T>,
right: T_Node<T>,
}
impl<T: Copy + Clone + Ord> Node<T>
{
pub fn new(value: T) -> Node<T>
{
Node {
value: value,
colour: Colour::Red, // add a new node as red, then fix violations
parent: None,
left: None,
right: None,
// height: 1,
}
}
pub fn insert(&mut self, value: T)
{
if self.value == value
{
return;
}
let mut leaf = if value < self.value { &mut self.left } else { &mut self.right };
match leaf
{
None =>
{
let mut new_node = Node::new(value);
new_node.parent = Some(Box::new(self));
new_node.colour = Colour::Red;
(*leaf) = Some(Box::new(new_node));
},
Some(ref mut leaf) =>
{
leaf.insert(value);
}
};
}
}
The line new_node.parent = Some(Box::new(self)); gives me the error.
I understand why the error happens (self is declared as a mutable reference), but I have no idea how to fix it. I need self to be a mutable reference so I can modify my tree (unless you can suggest something better).
I tried to declare the T_Node to have a mutable reference instead of just Node, but that just created more problems.
I am also open to suggestions for a better choice of variable types and what not.
Any help is appreciated.
There are some faults in the design which make it impossible to go any further without making some changes.
First, Box doesn't support shared ownership but you require that because the same node is referenced by parent (rbtree.right/rbtree.left) and child (rbtree.parent). For that you need Rc.
So instead of Box, you will need to switch to Rc:
type T_Node<T> = Option<Rc<Node<T>>>;
But this doesn't solve the problem. Now your node is inside an Rc, and Rc doesn't allow mutation of its contents (you can mutate through get_mut, but that requires the Rc to be unique, which is not the case here). You won't be able to do much with your tree unless you can mutate a node.
So you need to use the interior mutability pattern. For that we will add an additional layer of RefCell.
type T_Node<T> = Option<Rc<RefCell<Node<T>>>>;
Now, this will allow us to mutate the contents inside.
But this doesn't solve it. Because you need to hold a reference from the child to the parent as well, you will end up creating a reference cycle.
Luckily, the Rust book explains how to fix a reference cycle in exactly this scenario:
To make the child node aware of its parent, we need to add a parent field to our Node struct definition. The trouble is in deciding what the type of parent should be. We know it can’t contain an Rc, because that would create a reference cycle with leaf.parent pointing to branch and branch.children pointing to leaf, which would cause their strong_count values to never be 0. Thinking about the relationships another way, a parent node should own its children: if a parent node is dropped, its child nodes should be dropped as well. However, a child should not own its parent: if we drop a child node, the parent should still exist. This is a case for weak references!
So we need child to hold a weak reference to parent. This can be done as:
type Child<T> = Option<Rc<RefCell<Node<T>>>>;
type Parent<T> = Option<Weak<RefCell<Node<T>>>>;
Now we have fixed most of the design.
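To illustrate why the weak parent link avoids a cycle, the following sketch inspects the reference counts after linking two nodes. The Node layout here is a simplified assumption (no colour field), and new_link, link_counts and parent_is_freed are hypothetical helpers.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

type Child<T> = Option<Rc<RefCell<Node<T>>>>;
type Parent<T> = Option<Weak<RefCell<Node<T>>>>;

#[allow(dead_code)]
struct Node<T> {
    value: T,
    parent: Parent<T>,
    left: Child<T>,
    right: Child<T>,
}

// Hypothetical helper: a detached node wrapped for shared, mutable access.
fn new_link<T>(value: T) -> Rc<RefCell<Node<T>>> {
    Rc::new(RefCell::new(Node {
        value,
        parent: None,
        left: None,
        right: None,
    }))
}

// Returns (strong, weak) counts of the root after a child points back at it.
fn link_counts() -> (usize, usize) {
    let root = new_link(10);
    let child = new_link(5);
    child.borrow_mut().parent = Some(Rc::downgrade(&root));
    root.borrow_mut().left = Some(Rc::clone(&child));
    (Rc::strong_count(&root), Rc::weak_count(&root))
}

// Shows that the root is freed even though the child still refers to it.
fn parent_is_freed() -> bool {
    let root = new_link(10);
    let child = new_link(5);
    child.borrow_mut().parent = Some(Rc::downgrade(&root));
    root.borrow_mut().left = Some(Rc::clone(&child));
    drop(root);
    let freed = child
        .borrow()
        .parent
        .as_ref()
        .and_then(|weak| weak.upgrade())
        .is_none();
    freed
}

fn main() {
    println!("root (strong, weak) = {:?}", link_counts());
    println!("parent freed after drop = {}", parent_is_freed());
}
```

The child's back-pointer only raises the weak count, so the root's strong count stays at 1 and dropping the root releases it.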
One more thing we should do: instead of exposing Node directly, encapsulate it in a struct RBTree that holds the root of the tree, so that operations like insert, search, and delete can be called on RBTree. This keeps things simple and makes the implementation more logical.
pub struct RBTree<T: Ord> {
root: Child<T>,
}
Now, let's write an insert implementation similar to yours:
impl<T: Ord> RBTree<T> {
pub fn insert(&mut self, value: T) {
fn insert<T: Ord>(child: &mut Child<T>, mut new_node: Node<T>) {
let child = child.as_ref().unwrap();
let mut child_mut_borrow = child.borrow_mut();
if child_mut_borrow.value == new_node.value {
return;
}
let leaf = if child_mut_borrow.value > new_node.value {
&mut child_mut_borrow.left
} else {
&mut child_mut_borrow.right
};
match leaf {
Some(_) => {
insert(leaf, new_node);
}
None => {
new_node.parent = Some(Rc::downgrade(&child));
*leaf = Some(Rc::new(RefCell::new(new_node)));
}
};
}
let mut new_node = Node::new(value);
if self.root.is_none() {
new_node.parent = None;
self.root = Some(Rc::new(RefCell::new(new_node)));
} else {
// We ensure that a `None` is never sent to insert()
insert(&mut self.root, new_node);
}
}
}
I defined an insert function inside RBTree::insert just to keep the recursive calls simple. The outer function handles the empty-root case, and all further insertions are carried out inside the nested insert function.
Basically, we start with:
let mut new_node = Node::new(value);
This creates a new node.
Then,
if self.root.is_none() {
new_node.parent = None;
self.root = Some(Rc::new(RefCell::new(new_node)));
} else {
// We ensure that a `None` is never sent to insert()
insert(&mut self.root, new_node);
}
If root is None, insert at root, otherwise call insert with root itself. So the nested insert function basically receives the parent in which left and right child are checked and the insertion is made.
Then, the control moves to the nested insert function.
We define the following two lines for making it convenient to access inner data:
let child = child.as_ref().unwrap();
let mut child_mut_borrow = child.borrow_mut();
Just like in your implementation, we return if value is already there:
if child_mut_borrow.value == new_node.value {
return;
}
Now we store a mutable reference to either left or right child:
let leaf = if child_mut_borrow.value > new_node.value {
&mut child_mut_borrow.left
} else {
&mut child_mut_borrow.right
};
Now, a check is made on the child if it is None or Some. In case of None, we make the insertion. Otherwise, we call insert recursively:
match leaf {
Some(_) => {
insert(leaf, new_node);
}
None => {
new_node.parent = Some(Rc::downgrade(&child));
*leaf = Some(Rc::new(RefCell::new(new_node)));
}
};
Rc::downgrade(&child) is for generating a weak reference.
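The reverse direction, going from a child back to its parent, uses Weak::upgrade. As a hedged sketch (the simplified Node, the Link alias, and the helpers attach_left and depth are assumptions for illustration, not part of the code above), walking up the parent links could look like this:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical alias to keep signatures short.
type Link<T> = Rc<RefCell<Node<T>>>;

#[allow(dead_code)]
struct Node<T> {
    value: T,
    parent: Option<Weak<RefCell<Node<T>>>>,
    left: Option<Link<T>>,
}

// Hypothetical helper: creates a root node without a parent.
fn new_root<T>(value: T) -> Link<T> {
    Rc::new(RefCell::new(Node {
        value,
        parent: None,
        left: None,
    }))
}

// Hypothetical helper: creates a node and hangs it under `parent` as left child.
fn attach_left<T>(parent: &Link<T>, value: T) -> Link<T> {
    let child = Rc::new(RefCell::new(Node {
        value,
        parent: Some(Rc::downgrade(parent)),
        left: None,
    }));
    parent.borrow_mut().left = Some(Rc::clone(&child));
    child
}

// Counts how many parent links lie between `node` and the root.
fn depth<T>(node: &Link<T>) -> usize {
    let mut depth = 0;
    let mut current = Rc::clone(node);
    loop {
        // `upgrade` yields `None` if the parent has already been dropped.
        let parent = current
            .borrow()
            .parent
            .as_ref()
            .and_then(|weak| weak.upgrade());
        match parent {
            Some(p) => {
                depth += 1;
                current = p;
            }
            None => return depth,
        }
    }
}

fn main() {
    let root = new_root(10);
    let a = attach_left(&root, 5);
    let b = attach_left(&a, 2);
    println!("depth(root) = {}, depth(b) = {}", depth(&root), depth(&b));
}
```

The same upward walk is what a red-black tree needs when it inspects a node's parent and grandparent while fixing colour violations after an insert.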
Here is a working sample: Playground
In Chapter 4 of "Programming Rust" by Jim Blandy & Jason Orendorff it says,
It follows that the owners and their owned values form trees: your owner is your parent, and the values you own are your children. And at the ultimate root of each tree is a variable; when that variable goes out of scope, the entire tree goes with it. We can see such an ownership tree in the diagram for composers: it’s not a “tree” in the sense of a search tree data structure, or an HTML document made from DOM elements. Rather, we have a tree built from a mixture of types, with Rust’s single-owner rule forbidding any rejoining of structure that could make the arrangement more complex than a tree. Every value in a Rust program is a member of some tree, rooted in some variable.
An example is provided,
This is simplified and pretty, but is there any mechanism to generate an "ownership tree" visualization with Rust or the Rust tooling? Can I dump an ownership tree when debugging?
There isn't really a specific tool for it, but you can get pretty close by deriving the Debug trait. When you derive the Debug trait for a struct, it will give you a recursive representation of all owned data, terminating at primitive types such as str, u32 etc, or when it encounters a custom Debug implementation.
For example, this program here:
use rand;
#[derive(Debug)]
enum State {
Good,
Bad,
Ugly(&'static str),
}
#[derive(Debug)]
struct ExampleStruct {
x_factor: Option<f32>,
children: Vec<ExampleStruct>,
state: State,
}
impl ExampleStruct {
fn random(max_depth: usize) -> Self {
use rand::Rng;
let mut rng = rand::thread_rng();
let child_count = match max_depth {
0 => 0,
_ => rng.gen::<usize>() % max_depth,
};
let mut children = Vec::with_capacity(child_count);
for _ in 0..child_count {
children.push(ExampleStruct::random(max_depth - 1));
}
let state = if rng.gen() {
State::Good
} else if rng.gen() {
State::Bad
} else {
State::Ugly("really ugly")
};
Self {
x_factor: Some(rng.gen()),
children,
state,
}
}
}
fn main() {
let foo = ExampleStruct::random(3);
dbg!(foo);
}
prints something like this:
[src/main.rs:51] foo = ExampleStruct {
x_factor: Some(
0.27388978,
),
children: [
ExampleStruct {
x_factor: Some(
0.5051847,
),
children: [
ExampleStruct {
x_factor: Some(
0.9675246,
),
children: [],
state: Ugly(
"really ugly",
),
},
],
state: Bad,
},
ExampleStruct {
x_factor: Some(
0.70672345,
),
children: [],
state: Ugly(
"really ugly",
),
},
],
state: Bad,
}
Note that not all the data is in line: The children live somewhere else on the heap. They are not stored inside the ExampleStruct, they are simply owned by it.
This could get confusing if you store references to things, because Debug may start to traverse these references. It does not matter to Debug that they aren't owned. In fact, this is the case with the &'static str inside State::Ugly. The actual bytes that make up the string are not owned by any variable, they are hard coded and live inside the program itself. They will exist for as long as the program is running.
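A small sketch of that effect, with hypothetical types Owner and Viewer: the derived Debug output of Viewer includes the borrowed Owner in full, even though Viewer does not own it.

```rust
#[derive(Debug)]
struct Owner {
    items: Vec<u32>,
}

#[derive(Debug)]
struct Viewer<'a> {
    label: &'static str,
    // Borrowed, not owned: `Debug` prints through the reference anyway.
    borrowed: &'a Owner,
}

// Formats a `Viewer` that merely borrows an `Owner`.
fn render() -> String {
    let owner = Owner { items: vec![1, 2] };
    let viewer = Viewer {
        label: "view",
        borrowed: &owner,
    };
    format!("{:?}", viewer)
}

fn main() {
    // prints: Viewer { label: "view", borrowed: Owner { items: [1, 2] } }
    println!("{}", render());
}
```

Nothing in the output distinguishes the borrowed Owner from an owned field, so the printed tree is an upper bound on the ownership tree rather than an exact picture of it.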
I have a structure with a parent property I want to add to a queue. The parent is the same type as itself, so I need to wrap it in a Box.
use std::collections::vec_deque::VecDeque;
struct GraphNode {
value: u32,
parent: Option<Box<&GraphNode>>,
}
fn main() {
let mut queue: VecDeque<GraphNode> = VecDeque::new();
let parent = GraphNode {
value: 23,
parent: Option::None,
};
let second = GraphNode { value: 42, parent };
let third = GraphNode {
value: 19,
parent: Option::Some(Box::from(&parent)),
};
queue.push_front(parent);
queue.push_front(second);
queue.push_front(third);
}
error[E0106]: missing lifetime specifier
--> src/main.rs:5:24
|
5 | parent: Option<Box<&GraphNode>>,
| ^ expected lifetime parameter
The parent can be null, so I get that it needs to be Box<Option<&GraphNode>>, but I get the error expected lifetime parameter, however what's in the docs isn't really making sense to me.
There's also the issue that when I create a Box, to save to the parent, the value is moved. I don't want to move the value, I just want to save a reference in the box.
I think you are looking for std::rc::Rc, not Box.
use std::collections::vec_deque::VecDeque;
use std::rc::Rc;
struct GraphNode {
value: u32,
parent: Option<Rc<GraphNode>>,
}
fn main() {
let mut queue: VecDeque<Rc<GraphNode>> = VecDeque::new();
let parent = Rc::new(GraphNode {
value: 23,
parent: None,
});
let second = Rc::new(GraphNode {
value: 42,
parent: None,
});
let third = Rc::new(GraphNode {
value: 19,
parent: Some(parent.clone()), // Clones the reference, still point to the same thing.
});
queue.push_front(parent);
queue.push_front(second);
queue.push_front(third);
}
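Rc (reference counted) is a way to have multiple "owners" of the same object. When cloning, you are just cloning the pointer and incrementing the count, so every clone refers to the same value. Note that Rc only hands out shared access; to mutate the value through it you additionally need interior mutability such as Cell or RefCell.
The lifetime problems you encountered arise because you are storing a plain borrowed reference (made with &), and a struct that holds one needs a lifetime parameter.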
If you want to know more about lifetimes, here's the entry from the book.
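As a quick check that a cloned Rc really points at the same node, here is a sketch using Rc::ptr_eq and Rc::strong_count. Wrapping value in a Cell is an assumption added here so that a change can be observed through both handles; the function demo is hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

struct GraphNode {
    // `Cell` is an addition for this sketch: `Rc` alone gives read-only access.
    value: Cell<u32>,
    parent: Option<Rc<GraphNode>>,
}

// Returns (same allocation?, strong count of parent, parent value seen via child).
fn demo() -> (bool, usize, u32) {
    let parent = Rc::new(GraphNode {
        value: Cell::new(23),
        parent: None,
    });
    let third = Rc::new(GraphNode {
        value: Cell::new(19),
        parent: Some(Rc::clone(&parent)),
    });

    // Mutate through one handle, then observe through the other.
    parent.value.set(24);
    let seen = third.parent.as_ref().map(|p| p.value.get()).unwrap_or(0);
    let same = third
        .parent
        .as_ref()
        .map_or(false, |p| Rc::ptr_eq(p, &parent));

    (same, Rc::strong_count(&parent), seen)
}

fn main() {
    println!("{:?}", demo());
}
```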
I'm trying to model a structure for a UI library where there exists a ViewNode which owns a RenderNode which owns a LayoutNode. These structures should at the same time form three distinct trees. A ViewTree, a RenderTree, and a Layout tree.
Is there any way of modeling this ownership without resorting to Rc? I don't want to use Rc<> because the ownership is clear from my point of view: the trees should never own their children (except for ViewNode); the wrapper is the owner. Each layer should also be able to be pulled out into a library, and I don't want to force users of the library to use Rc<>.
Below is what I would want to do but what doesn't work. Should I go about this in a different way perhaps?
#[derive(Debug)]
struct LayoutNode<'a> {
// .. Some fields
children: Vec<&'a LayoutNode<'a>>,
}
#[derive(Debug)]
struct RenderNode<'a> {
// .. Some fields
layout_node: LayoutNode<'a>,
children: Vec<&'a RenderNode<'a>>,
}
#[derive(Debug)]
struct ViewNode<'a> {
// .. Some fields
render_node: RenderNode<'a>,
children: Vec<ViewNode<'a>>,
}
fn make_tree<'a>() -> ViewNode<'a> {
let layout_child = LayoutNode { children: vec![] };
let layout = LayoutNode { children: vec![&layout_child] };
let render_child = RenderNode { layout_node: layout_child, children: vec![] };
let render = RenderNode { layout_node: layout, children: vec![&render_child] };
let view_child = ViewNode { render_node: render_child, children: vec![] };
let view = ViewNode { render_node: render, children: vec![view_child] };
view
}
fn main() {
println!("{:?}", make_tree())
}
You can use a memory arena that uses indices instead of reference counted pointers.
Using indextree as an example:
pub struct NodeId {
index: usize,
}
pub struct Node<T> {
parent: Option<NodeId>,
previous_sibling: Option<NodeId>,
next_sibling: Option<NodeId>,
first_child: Option<NodeId>,
last_child: Option<NodeId>,
removed: bool,
/// The actual data which will be stored within the tree
pub data: T,
}
pub struct Arena<T> {
nodes: Vec<Node<T>>,
}
The NodeId struct is a simple integer index.
The nodes store the indices of nearby nodes (parent, previous_sibling, etc.) to allow easy traversal.
A downside of this method is that it's very similar to manual memory management, in that you need to ensure that nodes are added/removed correctly to avoid dangling references. indextree has a lot of error checking when adding/removing nodes in the tree for this reason.
You might also want to have a look at petgraph: While this is a Graph instead of a Tree you can use it as a tree.
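To show the idea without depending on a crate, here is a minimal hand-rolled arena. It is not indextree's API: the methods add, parent_data and child_count, and the children vector in place of sibling links, are assumptions chosen to keep the sketch short.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

struct Node<T> {
    parent: Option<NodeId>,
    children: Vec<NodeId>,
    data: T,
}

struct Arena<T> {
    nodes: Vec<Node<T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    // Stores a node in the arena and registers it with its parent, if any.
    fn add(&mut self, data: T, parent: Option<NodeId>) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node {
            parent,
            children: Vec::new(),
            data,
        });
        if let Some(NodeId(p)) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    // Looks up the parent's data through its index rather than a reference.
    fn parent_data(&self, id: NodeId) -> Option<&T> {
        let parent = self.nodes[id.0].parent?;
        Some(&self.nodes[parent.0].data)
    }

    fn child_count(&self, id: NodeId) -> usize {
        self.nodes[id.0].children.len()
    }
}

fn main() {
    let mut arena = Arena::new();
    let root = arena.add("root", None);
    let a = arena.add("a", Some(root));
    let _b = arena.add("b", Some(root));
    println!("root has {} children", arena.child_count(root));
    println!("parent of a = {:?}", arena.parent_data(a));
}
```

The arena is the single owner of every node, so no lifetimes or Rc appear in the node type. This sketch never removes nodes; supporting removal is where the bookkeeping mentioned above becomes necessary.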