Pass vector of borrowed variables to new Struct

In an effort to learn about structs, borrowing, and lifetimes, I am putting together a toy library that would handle nodes and edges for a graph. It's been instructive, but I'm stuck when finally instantiating a Graph instance with Nodes that have already been borrowed by multiple Edges.
The error I'm receiving:
error[E0505]: cannot move out of `n0` because it is borrowed
--> src/lib.rs:94:18
|
90 | let e0 = Edge::new(&n0, &n1);
| --- borrow of `n0` occurs here
...
94 | vec![n0, n1, n2],
| ^^ move out of `n0` occurs here
95 | vec![e0, e1, e2],
| -- borrow later used here
Code being used:
use std::fmt;
use uuid::Uuid;
#[derive(PartialEq)]
struct Node {
id: Uuid,
label: Option<String>,
}
impl fmt::Display for Node {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "<Node {}>", self.id)
}
}
impl Node {
fn new() -> Node {
Node {
id: Uuid::new_v4(),
label: None,
}
}
fn new_with_id(id: Uuid) -> Node {
Node {
id,
label: None,
}
}
}
struct Edge<'a> {
nodes: (&'a Node, &'a Node),
}
impl fmt::Display for Edge<'_> {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "<Edge ({}, {})>", self.nodes.0, self.nodes.1)
}
}
impl Edge<'_> {
fn new<'a>(n0: &'a Node, n1: &'a Node) -> Edge<'a> {
Edge {
nodes: (n0, n1)
}
}
}
struct Graph<'a> {
nodes: Vec<Node>,
edges: Vec<Edge<'a>>,
}
impl Graph<'_> {
fn new<'a>(nodes: Vec<Node>, edges: Vec<Edge>) -> Graph {
Graph {
nodes,
edges,
}
}
}
///////////////////////////////////////////////////////////////////////
// Tests
///////////////////////////////////////////////////////////////////////
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn create_edge() {
let n0 = Node::new();
let n1 = Node::new();
let e0 = Edge::new(&n0, &n1);
println!("Created node: {}", n0);
println!("Created node: {}", n1);
println!("Created edge: {}", e0);
assert!(e0.nodes.0 == &n0 && e0.nodes.1 == &n1);
}
#[test]
fn create_undirected_graph() {
let n0 = Node::new();
let n1 = Node::new();
let n2 = Node::new();
let e0 = Edge::new(&n0, &n1);
let e1 = Edge::new(&n1, &n2);
let e2 = Edge::new(&n2, &n0);
let g0 = Graph::new(
vec![n0, n1, n2],
vec![e0, e1, e2],
);
}
}
It feels like I would want to modify the struct Graph definition to expect borrowed instances in the vectors, but I'm running into a bunch of compiler errors as I go in that direction.
Any help or guidance would be much appreciated!

As soon as you borrow any data structure, you may not change it except through that very borrow. Now, your edges immutably borrow your nodes. Since your nodes are in a Vec, your edges also implicitly borrow the whole vector. (If they didn't, you could, say, resize the vector, which would change the position of your nodes.) This means that you are not allowed to change anything about your vector of nodes.
However, you are trying to move your vector of nodes into a new memory location inside your struct. Since Vec is not Copy, this invalidates your previous vector, which is clearly "changing" it.
There are a number of ways you can go to avoid this issue.
Borrow the vector of nodes
Rather than moving the nodes into the Graph, you could borrow them.
struct Graph<'a> {
nodes: &'a Vec<Node>,
edges: Vec<Edge<'a>>,
}
While this should resolve your current compilation error, the resulting structure wouldn't be very pleasant to handle, because it cannot exist on its own. It can only exist as long as the vector of nodes exists elsewhere.
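A minimal sketch of this borrowing variant, under a few assumptions: the id is a plain u32 rather than the question's Uuid (to avoid the external crate), the field is a slice &'a [Node] instead of &'a Vec<Node>, and triangle is a hypothetical helper. The key point is that the nodes are collected into their final container first and only borrowed afterwards.

```rust
struct Node {
    id: u32, // stand-in for the question's Uuid
}

struct Edge<'a> {
    nodes: (&'a Node, &'a Node),
}

struct Graph<'a> {
    nodes: &'a [Node], // borrowed: the caller keeps ownership of the nodes
    edges: Vec<Edge<'a>>,
}

/// Hypothetical helper: builds a triangle over the first three nodes.
/// Panics if fewer than three nodes are supplied.
fn triangle(nodes: &[Node]) -> Graph<'_> {
    let edges = vec![
        Edge { nodes: (&nodes[0], &nodes[1]) },
        Edge { nodes: (&nodes[1], &nodes[2]) },
        Edge { nodes: (&nodes[2], &nodes[0]) },
    ];
    Graph { nodes, edges }
}
```

The caller creates the Vec<Node> first and passes a reference to it; nothing is moved after the edges have borrowed from it, so E0505 should no longer apply.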
Use reference-counting
If you want to avoid the reliance on an "outside borrow", you can fall back to good old reference counting. If you know C++, this is similar in concept to a shared pointer. It's available in the standard library as std::rc::Rc. It puts an object on the heap along with a counter of how many references to it exist, and makes sure that the memory is not freed as long as at least one reference exists. If you put your nodes behind such an abstraction, you don't need lifetimes anymore (since the required properties are ensured at runtime).
In this case you would also change your Edges slightly.
struct Edge {
nodes: (Rc<Node>, Rc<Node>),
}
struct Graph {
nodes: Vec<Rc<Node>>,
edges: Vec<Edge>,
}
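As a rough illustration of how these Rc-based definitions could be used, here is a sketch that builds the same three-node triangle. The u32 id and the triangle helper are assumptions made for this example, not part of the original code.

```rust
use std::rc::Rc;

struct Node {
    id: u32, // stand-in for the question's Uuid
}

struct Edge {
    nodes: (Rc<Node>, Rc<Node>),
}

struct Graph {
    nodes: Vec<Rc<Node>>,
    edges: Vec<Edge>,
}

/// Hypothetical helper: three nodes connected in a cycle.
fn triangle() -> Graph {
    let nodes: Vec<Rc<Node>> = (0..3).map(|id| Rc::new(Node { id })).collect();
    // Rc::clone only bumps the reference count; the Node itself is not copied.
    let edges = vec![
        Edge { nodes: (Rc::clone(&nodes[0]), Rc::clone(&nodes[1])) },
        Edge { nodes: (Rc::clone(&nodes[1]), Rc::clone(&nodes[2])) },
        Edge { nodes: (Rc::clone(&nodes[2]), Rc::clone(&nodes[0])) },
    ];
    // Moving `nodes` is fine here: the edges hold their own Rc handles,
    // not borrows of the vector.
    Graph { nodes, edges }
}
```

Note that Rc only gives shared, read-only access; mutating a node after construction would additionally need something like RefCell.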
Reference nodes by their ID or index
While the previous approach works just fine, you often want to avoid additional heap allocations. One of the standard ways to deal with issues such as this one is not to store references to the nodes in your edges, but rather to store either their index or, in your case, potentially their UUID.
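A sketch of the index-based variant might look like the following. The validation in Graph::new and the endpoints helper are assumptions added for illustration; the original question has neither.

```rust
struct Node {
    label: Option<String>,
}

struct Edge {
    nodes: (usize, usize), // positions in Graph::nodes instead of references
}

struct Graph {
    nodes: Vec<Node>,
    edges: Vec<Edge>,
}

impl Graph {
    /// Returns None if any edge points outside the node vector.
    fn new(nodes: Vec<Node>, edges: Vec<Edge>) -> Option<Graph> {
        let valid = edges
            .iter()
            .all(|e| e.nodes.0 < nodes.len() && e.nodes.1 < nodes.len());
        if valid {
            Some(Graph { nodes, edges })
        } else {
            None
        }
    }

    /// Resolves an edge's indices back to node references.
    fn endpoints(&self, edge: &Edge) -> (&Node, &Node) {
        (&self.nodes[edge.nodes.0], &self.nodes[edge.nodes.1])
    }
}
```

Because the edges hold plain integers, the struct has no lifetime parameter and can be moved or returned freely. The trade-off is that an index is only meaningful for the graph it was created for, and removing nodes later would invalidate stored indices.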
Self-referential structs
The structure you were trying to create is self-referential. That means that your field edges wanted to reference data from another field in the same struct, namely nodes. While it isn't impossible to create such structures in Rust, getting it right is often not easy and rather error-prone. There are some libraries performing black magic to make this easier, such as ouroboros. However, I would try to avoid such incantations if possible.

Related

Multi-dimensional vector borrowing

I'm trying to implement a coding exercise, but I've run into a wall regarding multi-dimensional vectors and borrowing.
The code is accessible in this playground, but I'll add here a snippet for reference:
type Matrix = Rc<RefCell<Vec<Vec<String>>>>;
/// sequence -> target string
/// dictionary -> array of 'words' that can be used to construct the 'sequence'
/// returns -> 2d array of all the possible combinations to create the 'sequence' from the 'dictionary'
pub fn all_construct<'a>(sequence: &'a str, dictionary: &'a [&str]) -> Matrix {
let memo: Rc<RefCell<HashMap<&str, Matrix>>> = Rc::new(RefCell::new(HashMap::new()));
all_construct_memo(sequence, dictionary, Rc::clone(&memo))
}
fn all_construct_memo<'a>(
sequence: &'a str,
dictionary: &'a [&str],
memo: Rc<RefCell<HashMap<&'a str, Matrix>>>,
) -> Matrix {
if memo.borrow().contains_key(sequence) {
return Rc::clone(&memo.borrow()[sequence]);
}
if sequence.is_empty() {
return Rc::new(RefCell::new(Vec::new()));
}
let ways = Rc::new(RefCell::new(Vec::new()));
for word in dictionary {
if let Some(new_sequence) = sequence.strip_prefix(word) {
let inner_ways = all_construct_memo(new_sequence, dictionary, Rc::clone(&memo));
for mut entry in inner_ways.borrow_mut().into_iter() { // error here
entry.push(word.to_string());
ways.borrow_mut().push(entry);
}
}
}
memo.borrow_mut().insert(sequence, Rc::clone(&ways));
Rc::clone(&ways)
}
The code doesn't compile.
Questions:
1. This feels overly complicated. Is there a simpler way to do it?
1.1. For the Matrix type, I tried getting by with just Vec<Vec<String>>, but that didn't get me very far. What's the way to properly encode a 2d Vector that allows for mutability and sharing, without using extra crates?
1.2. Is there a better way to pass the memo object?
2. I don't really understand the compiler error here. Can you help me with that?
error[E0507]: cannot move out of dereference of `RefMut<'_, Vec<Vec<String>>>`
--> src/lib.rs:31:30
|
31 | for mut entry in inner_ways.borrow_mut().into_iter() { // error here
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ move occurs because value has type `Vec<Vec<String>>`, which does not implement the `Copy` trait
For more information about this error, try `rustc --explain E0507`.
Thank you!
2d vecs work fine, and for jagged arrays like yours, your implementation is correct. Your issues stem from a needless use of Rc and RefCell. Because of the way you're calling things, a single, mutable reference will work.
Consider the following, modified, example:
use std::collections::HashMap;
type Vec2<T> = Vec<Vec<T>>;
fn all_constructs<'a>(sequence: &'a str, segments: &[&'a str]) -> Vec2<&'a str> {
let mut cache = HashMap::new();
all_constructs_memo(sequence, segments, &mut cache)
}
fn all_constructs_memo<'a>(
sequence: &'a str,
segments: &[&'a str],
cache: &mut HashMap<&'a str, Vec2<&'a str>>
) -> Vec2<&'a str> {
// If we have the answer cached, return the cache
if let Some(constructs) = cache.get(sequence) {
return constructs.to_vec();
}
// We don't have it cached, so figure it out
let mut constructs = Vec::new();
for segment in segments {
if *segment == sequence {
constructs.push(vec![*segment]);
} else if let Some(sub_sequence) = sequence.strip_suffix(segment) {
let mut sub_constructs = all_constructs_memo(sub_sequence, segments, cache);
sub_constructs.iter_mut().for_each(|c| c.push(*segment));
constructs.append(&mut sub_constructs);
}
}
cache.insert(sequence, constructs.clone());
return constructs;
}
It's identical, except for 4 differences:
1.) I removed all Rc and RefCell. There is a single HashMap reference
2.) Instead of having all_constructs_memo("", ...) -> Vec::new(), I just added a branch in the iterator if *segment == sequence to test for single-segment matches that way.
3.) I wrote Vec2 instead of Matrix
4.) strip_suffix instead of strip_prefix, just because adding to the end of vecs is a little more efficient than adding to the front.
Here's a playground link with tests against a non-memoized reference implementation
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1b488aafda6466629c17c8a7de8f3e42

Can I move from &mut self to &self within the same function?

I'm trying to learn a bit of Rust through a toy application, which involves a tree data structure that is filled dynamically by querying an external source. In the beginning, only the root node is present. The tree structure provides a method get_children(id) that returns a [u32] of the IDs of all the node's children — either this data is already known, or the external source is queried and all the nodes are inserted into the tree.
I'm running into the following problem with the borrow checker that I can't seem to figure out:
struct Node {
id: u32,
value: u64, // in my use case, this type is much larger and should not be copied
children: Option<Vec<u32>>,
}
struct Tree {
nodes: std::collections::HashMap<u32, Node>,
}
impl Tree {
fn get_children(&mut self, id: u32) -> Option<&[u32]> {
// This will perform external queries and add new nodes to the tree
None
}
fn first_even_child(&mut self, id: u32) -> Option<u32> {
let children = self.get_children(id)?;
let result = children.iter().find(|&id| self.nodes.get(id).unwrap().value % 2 == 0)?;
Some(*result)
}
}
Which results in:
error[E0502]: cannot borrow `self.nodes` as immutable because it is also borrowed as mutable
--> src/lib.rs:19:43
|
18 | let children = self.get_children(id)?;
| ---- mutable borrow occurs here
19 | let result = children.iter().find(|&id| self.nodes.get(id).unwrap().value % 2 == 0)?;
| ---- ^^^^^ ---------- second borrow occurs due to use of `self.nodes` in closure
| | |
| | immutable borrow occurs here
| mutable borrow later used by call
Since get_children might insert nodes into the tree, we need a &mut self reference. However, the way I see it, after the value of children is known, self no longer needs to be borrowed mutably. Why does this not work, and how would I fix it?
EDIT -- my workaround
After Chayim Friedman's answer, I decided against returning Self. I mostly ran into the above problem when first calling get_children to get a list of IDs and then using nodes.get() to obtain the corresponding Node. Instead, I refactored to provide the following functions:
impl Tree {
fn load_children(&mut self, id: u32) {
// If not present yet, perform queries to add children to the tree
}
fn iter_children(&self, id: u32) -> Option<IterChildren> {
// Provides an iterator over the children of node `id`
}
}
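The two-phase split from the edit above could be fleshed out roughly as follows. This is only a sketch: the "external query" is a made-up stand-in that invents two children per node, and iter_children returns an impl Iterator rather than the IterChildren type named in the question.

```rust
use std::collections::HashMap;

struct Node {
    id: u32,
    value: u64,
    children: Option<Vec<u32>>,
}

struct Tree {
    nodes: HashMap<u32, Node>,
}

impl Tree {
    /// Mutating phase: fills in the children of `id` if they are not known yet.
    fn load_children(&mut self, id: u32) {
        let needs_load = matches!(self.nodes.get(&id), Some(n) if n.children.is_none());
        if !needs_load {
            return;
        }
        // Hypothetical stand-in for the external query.
        let child_ids = vec![id * 2 + 1, id * 2 + 2];
        for &child in &child_ids {
            let node = Node { id: child, value: u64::from(child), children: None };
            self.nodes.insert(child, node);
        }
        if let Some(node) = self.nodes.get_mut(&id) {
            node.children = Some(child_ids);
        }
    }

    /// Read-only phase: only needs &self, so it can be combined with other reads.
    fn iter_children(&self, id: u32) -> Option<impl Iterator<Item = &Node> + '_> {
        let ids = self.nodes.get(&id)?.children.as_ref()?;
        Some(ids.iter().filter_map(move |child| self.nodes.get(child)))
    }

    fn first_even_child(&mut self, id: u32) -> Option<u32> {
        self.load_children(id); // the mutable borrow ends with this statement
        self.iter_children(id)?.find(|n| n.value % 2 == 0).map(|n| n.id)
    }
}
```

Since load_children returns nothing, no reference derived from the &mut self borrow survives the call, which is what lets the subsequent shared borrows go through.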
Downgrading a mutable reference into a shared reference produces a reference that should be kept unique. This is necessary for e.g. Cell::from_mut(), which has the following signature:
pub fn from_mut(t: &mut T) -> &Cell<T>
This method relies on the uniqueness guarantee of &mut T to ensure no references to T are kept directly, only via Cell. If downgrading the reference meant that uniqueness could be violated, this method would be unsound, because the value inside the Cell could be changed through another shared reference (via interior mutability).
For more about this see Common Rust Lifetime Misconceptions: downgrading mut refs to shared refs is safe.
To solve this you need to get both shared references from the same shared reference that was created from the mutable reference. You can, for example, also return &Self from get_children():
fn get_children(&mut self, id: u32) -> Option<(&Self, &[u32])> {
// This will perform external queries and add new nodes to the tree
Some((self, &[]))
}
fn first_even_child(&mut self, id: u32) -> Option<u32> {
let (this, children) = self.get_children(id)?;
let result = children.iter().find(|&id| this.nodes.get(id).unwrap().value % 2 == 0)?;
Some(*result)
}

What's the best way to have a struct that contains a reference to another struct of the same type? [duplicate]

I am trying to implement a scenegraph-like data structure in Rust. I would like an equivalent to this C++ code expressed in safe Rust:
struct Node
{
Node* parent; // should be mutable, and nullable (no parent)
std::vector<Node*> child;
virtual ~Node()
{
for(auto it = child.begin(); it != child.end(); ++it)
{
delete *it;
}
}
void addNode(Node* newNode)
{
if (newNode->parent)
{
auto& siblings = newNode->parent->child;
siblings.erase(std::find(siblings.begin(), siblings.end(), newNode));
}
newNode->parent = this;
child.push_back(newNode);
}
}
Properties I want:
the parent takes ownership of its children
the nodes are accessible from the outside via a reference of some kind
events that touch one Node can potentially mutate the whole tree
Rust tries to ensure memory safety by forbidding you from doing things that might potentially be unsafe. Since this analysis is performed at compile-time, the compiler can only reason about a subset of manipulations that are known to be safe.
In Rust, you could easily store either a reference to the parent (by borrowing the parent, thus preventing mutation) or the list of child nodes (by owning them, which gives you more freedom), but not both (without using unsafe). This is especially problematic for your implementation of addNode, which requires mutable access to the given node's parent. However, if you store the parent pointer as a mutable reference, then, since only a single mutable reference to a particular object may be usable at a time, the only way to access the parent would be through a child node, and you'd only be able to have a single child node, otherwise you'd have two mutable references to the same parent node.
If you want to avoid unsafe code, there are many alternatives, but they'll all require some sacrifices.
The easiest solution is to simply remove the parent field. We can define a separate data structure to remember the parent of a node while we traverse a tree, rather than storing it in the node itself.
First, let's define Node:
#[derive(Debug)]
struct Node<T> {
data: T,
children: Vec<Node<T>>,
}
impl<T> Node<T> {
fn new(data: T) -> Node<T> {
Node { data: data, children: vec![] }
}
fn add_child(&mut self, child: Node<T>) {
self.children.push(child);
}
}
(I added a data field because a tree isn't super useful without data at the nodes!)
Let's now define another struct to track the parent as we navigate:
#[derive(Debug)]
struct NavigableNode<'a, T: 'a> {
node: &'a Node<T>,
parent: Option<&'a NavigableNode<'a, T>>,
}
impl<'a, T> NavigableNode<'a, T> {
fn child(&self, index: usize) -> NavigableNode<T> {
NavigableNode {
node: &self.node.children[index],
parent: Some(self)
}
}
}
impl<T> Node<T> {
fn navigate<'a>(&'a self) -> NavigableNode<T> {
NavigableNode { node: self, parent: None }
}
}
This solution works fine if you don't need to mutate the tree as you navigate it and you can keep the parent NavigableNode objects around (which works fine for a recursive algorithm, but doesn't work too well if you want to store a NavigableNode in some other data structure and keep it around). The second restriction can be alleviated by using something other than a borrowed pointer to store the parent; if you want maximum genericity, you can use the Borrow trait to allow direct values, borrowed pointers, Boxes, Rc's, etc.
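To illustrate how the parent chain can be used during a recursive-style walk, here is a small sketch. The path_from_root helper and the demo function are hypothetical additions; the struct definitions mirror the ones above, minus the Debug derives.

```rust
struct Node<T> {
    data: T,
    children: Vec<Node<T>>,
}

struct NavigableNode<'a, T> {
    node: &'a Node<T>,
    parent: Option<&'a NavigableNode<'a, T>>,
}

impl<'a, T> NavigableNode<'a, T> {
    fn child(&self, index: usize) -> NavigableNode<'_, T> {
        NavigableNode { node: &self.node.children[index], parent: Some(self) }
    }

    /// Hypothetical helper: collects the data from the root down to this node
    /// by following the parent links upwards and reversing the result.
    fn path_from_root(&self) -> Vec<&T> {
        let mut path = Vec::new();
        let mut current = Some(self);
        while let Some(nav) = current {
            path.push(&nav.node.data);
            current = nav.parent;
        }
        path.reverse();
        path
    }
}

fn demo() -> Vec<i32> {
    let leaf = Node { data: 4, children: vec![] };
    let tree = Node {
        data: 1,
        children: vec![
            Node { data: 2, children: vec![leaf] },
            Node { data: 3, children: vec![] },
        ],
    };
    let root = NavigableNode { node: &tree, parent: None };
    // Each NavigableNode must stay alive while its descendants are in use.
    let child = root.child(0);
    let grandchild = child.child(0);
    let path: Vec<i32> = grandchild.path_from_root().into_iter().copied().collect();
    path
}
```

The chain of local variables (root, child, grandchild) is exactly the "keep the parent objects around" restriction mentioned above.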
Now, let's talk about zippers. In functional programming, zippers are used to "focus" on a particular element of a data structure (list, tree, map, etc.) so that accessing that element takes constant time, while still retaining all the data of that data structure. If you need to navigate your tree and mutate it during the navigation, while retaining the ability to navigate up the tree, then you could turn a tree into a zipper and perform the modifications through the zipper.
Here's how we could implement a zipper for the Node defined above:
#[derive(Debug)]
struct NodeZipper<T> {
node: Node<T>,
parent: Option<Box<NodeZipper<T>>>,
index_in_parent: usize,
}
impl<T> NodeZipper<T> {
fn child(mut self, index: usize) -> NodeZipper<T> {
// Remove the specified child from the node's children.
// A NodeZipper shouldn't let its users inspect its parent,
// since we mutate the parents
// to move the focused nodes out of their list of children.
// We use swap_remove() for efficiency.
let child = self.node.children.swap_remove(index);
// Return a new NodeZipper focused on the specified child.
NodeZipper {
node: child,
parent: Some(Box::new(self)),
index_in_parent: index,
}
}
fn parent(self) -> NodeZipper<T> {
// Destructure this NodeZipper
let NodeZipper { node, parent, index_in_parent } = self;
// Destructure the parent NodeZipper
let NodeZipper {
node: mut parent_node,
parent: parent_parent,
index_in_parent: parent_index_in_parent,
} = *parent.unwrap();
// Insert the node of this NodeZipper back in its parent.
// Since we used swap_remove() to remove the child,
// we need to do the opposite of that.
parent_node.children.push(node);
let len = parent_node.children.len();
parent_node.children.swap(index_in_parent, len - 1);
// Return a new NodeZipper focused on the parent.
NodeZipper {
node: parent_node,
parent: parent_parent,
index_in_parent: parent_index_in_parent,
}
}
fn finish(mut self) -> Node<T> {
while let Some(_) = self.parent {
self = self.parent();
}
self.node
}
}
impl<T> Node<T> {
fn zipper(self) -> NodeZipper<T> {
NodeZipper { node: self, parent: None, index_in_parent: 0 }
}
}
To use this zipper, you need to have ownership of the root node of the tree. By taking ownership of the nodes, the zipper can move things around in order to avoid copying or cloning nodes. When we move a zipper, we actually drop the old zipper and create a new one (though we could also do it by mutating self, but I thought it was clearer that way, plus it lets you chain method calls).
If the above options are not satisfactory, and you must absolutely store the parent of a node in a node, then the next best option is to use Rc<RefCell<Node<T>>> to refer to the children and Weak<RefCell<Node<T>>> to refer to the parent, so that each parent keeps its children alive but not the other way around. Rc enables shared ownership, but adds overhead to perform reference counting at runtime. RefCell enables interior mutability, but adds overhead to keep track of the active borrows at runtime. Weak is like Rc, but it doesn't increment the strong reference count; this is used to break reference cycles, which would prevent the reference count from dropping to zero, causing a memory leak. See DK.'s answer for an implementation using Rc, Weak and RefCell.
The problem is that this data structure is inherently unsafe; it doesn't have a direct equivalent in Rust that doesn't use unsafe. This is by design.
If you want to translate this into safe Rust code, you need to be more specific about what, exactly, you want from it. I know you listed some properties above, but often people coming to Rust will say "I want everything I have in this C/C++ code", to which the direct answer is "well, you can't."
You're also, unavoidably, going to have to change how you approach this. The example you've given has pointers without any ownership semantics, mutable aliasing, and cycles; all of which Rust will not allow you to simply ignore like C++ does.
The simplest solution is to just get rid of the parent pointer, and maintain that externally (like a filesystem path). This also plays nicely with borrowing because there are no cycles anywhere:
pub struct Node1 {
children: Vec<Node1>,
}
If you need parent pointers, you could go half-way and use Ids instead:
use std::collections::BTreeMap;
type Id = usize;
pub struct Tree {
descendants: BTreeMap<Id, Node2>,
root: Option<Id>,
}
pub struct Node2 {
parent: Id,
children: Vec<Id>,
}
The BTreeMap is effectively your "address space", bypassing borrowing and aliasing issues by not directly using memory addresses.
Of course, this introduces the problem of a given Id not being tied to the particular tree, meaning that the node it belongs to could be destroyed, and now you have what is effectively a dangling pointer. But, that's the price you pay for having aliasing and mutation. It's also less direct.
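To make the Id-based design more concrete, here is a sketch that adds a reparenting operation similar to addNode from the question. It deviates from the definitions above in ways that are assumptions of this example: parent is an Option<Id> so that a root can exist, the map is a field called nodes, and ids come from a simple next_id counter.

```rust
use std::collections::BTreeMap;

type Id = usize;

#[derive(Default)]
struct Tree {
    nodes: BTreeMap<Id, Node>,
    next_id: Id, // hypothetical id allocator
}

struct Node {
    parent: Option<Id>,
    children: Vec<Id>,
}

impl Tree {
    /// Creates a node, optionally attaching it to `parent`, and returns its id.
    fn insert(&mut self, parent: Option<Id>) -> Id {
        let id = self.next_id;
        self.next_id += 1;
        self.nodes.insert(id, Node { parent: None, children: Vec::new() });
        if let Some(p) = parent {
            self.reparent(id, p);
        }
        id
    }

    /// Moves `child` under `new_parent`, detaching it from its old parent.
    /// Returns false if either id is unknown. Does not check for cycles.
    fn reparent(&mut self, child: Id, new_parent: Id) -> bool {
        if !self.nodes.contains_key(&child) || !self.nodes.contains_key(&new_parent) {
            return false;
        }
        let old_parent = self.nodes[&child].parent;
        if let Some(old) = old_parent {
            if let Some(old_node) = self.nodes.get_mut(&old) {
                old_node.children.retain(|&c| c != child);
            }
        }
        if let Some(parent_node) = self.nodes.get_mut(&new_parent) {
            parent_node.children.push(child);
        }
        if let Some(child_node) = self.nodes.get_mut(&child) {
            child_node.parent = Some(new_parent);
        }
        true
    }
}
```

Every mutation goes through &mut Tree, so the borrow checker still guarantees that only one place modifies the structure at a time; stale ids are caught at runtime by the lookups.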
Or, you could go whole-hog and use reference-counting and dynamic borrow checking:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
// Note: do not derive Clone to make this move-only.
pub struct Node3(Rc<RefCell<Node3_>>);
pub type WeakNode3 = Weak<RefCell<Node3>>;
pub struct Node3_ {
parent: Option<WeakNode3>,
children: Vec<Node3>,
}
impl Node3 {
pub fn add(&self, node: Node3) {
// No need to remove from old parent; move semantics mean that must have
// already been done.
(node.0).borrow_mut().parent = Some(Rc::downgrade(&self.0));
self.children.push(node);
}
}
Here, you'd use Node3 to transfer ownership of a node between parts of the tree, and WeakNode3 for external references. Or, you could make Node3 cloneable and add back the logic in add to make sure a given node doesn't accidentally stay a child of the wrong parent.
This is not strictly better than the second option, because this design absolutely cannot benefit from static borrow-checking. The second one can at least prevent you from mutating the graph from two places at once at compile time; here, if that happens, you'll just crash.
The point is: you can't just have everything. You have to decide which operations you actually need to support. At that point, it's usually just a case of picking the types that give you the necessary properties.
In certain cases, you can also use an arena. An arena guarantees that values stored in it will have the same lifetime as the arena itself. This means that adding more values will not invalidate any existing lifetimes, but moving the arena will. Thus, such a solution is not viable if you need to return the tree.
This solves the problem by removing the ownership from the nodes themselves.
Here's an example that also uses interior mutability to allow a node to be mutated after it is created. In other cases, you can remove this mutability if the tree is constructed once and then simply navigated.
use std::{
cell::{Cell, RefCell},
fmt,
};
use typed_arena::Arena; // 1.6.1
struct Tree<'a, T: 'a> {
nodes: Arena<Node<'a, T>>,
}
impl<'a, T> Tree<'a, T> {
fn new() -> Tree<'a, T> {
Self {
nodes: Arena::new(),
}
}
fn new_node(&'a self, data: T) -> &'a mut Node<'a, T> {
self.nodes.alloc(Node {
data,
tree: self,
parent: Cell::new(None),
children: RefCell::new(Vec::new()),
})
}
}
struct Node<'a, T: 'a> {
data: T,
tree: &'a Tree<'a, T>,
parent: Cell<Option<&'a Node<'a, T>>>,
children: RefCell<Vec<&'a Node<'a, T>>>,
}
impl<'a, T> Node<'a, T> {
fn add_node(&'a self, data: T) -> &'a Node<'a, T> {
let child = self.tree.new_node(data);
child.parent.set(Some(self));
self.children.borrow_mut().push(child);
child
}
}
impl<'a, T> fmt::Debug for Node<'a, T>
where
T: fmt::Debug,
{
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{:?}", self.data)?;
write!(f, " (")?;
for c in self.children.borrow().iter() {
write!(f, "{:?}, ", c)?;
}
write!(f, ")")
}
}
fn main() {
let tree = Tree::new();
let head = tree.new_node(1);
let _left = head.add_node(2);
let _right = head.add_node(3);
println!("{:?}", head); // 1 (2 (), 3 (), )
}
TL;DR: DK.'s reference-counted version (Node3) doesn't compile because the type of parent differs from what Rc::downgrade(&self.0) produces; fix it by wrapping the result in a WeakNode. Also, in the line directly below, self doesn't have a children field, but self.0 (once borrowed) does.
I corrected DK.'s version so that it compiles and works. Here is my code:
dk_tree.rs
use std::cell::RefCell;
use std::rc::{Rc, Weak};
// Note: do not derive Clone to make this move-only.
pub struct Node(Rc<RefCell<Node_>>);
pub struct WeakNode(Weak<RefCell<Node_>>);
struct Node_ {
parent: Option<WeakNode>,
children: Vec<Node>,
}
impl Node {
pub fn new() -> Self {
Node(Rc::new(RefCell::new(Node_ {
parent: None,
children: Vec::new(),
})))
}
pub fn add(&self, node: Node) {
// No need to remove from old parent; move semantics mean that must have
// already been done.
node.0.borrow_mut().parent = Some(WeakNode(Rc::downgrade(&self.0)));
self.0.borrow_mut().children.push(node);
}
// just to have something visually printed
pub fn to_str(&self) -> String {
let mut result_string = "[".to_string();
for child in self.0.borrow().children.iter() {
result_string.push_str(&format!("{},", child.to_str()));
}
result_string += "]";
result_string
}
}
and then the main function in main.rs:
mod dk_tree;
use crate::dk_tree::{Node};
fn main() {
let root = Node::new();
root.add(Node::new());
root.add(Node::new());
let inner_child = Node::new();
inner_child.add(Node::new());
inner_child.add(Node::new());
root.add(inner_child);
let result = root.to_str();
println!("{result:?}");
}
The reason I made WeakNode look more like Node is to make conversion between the two easier.

Can a type know when a mutable borrow to itself has ended?

I have a struct and I want to call one of the struct's methods every time a mutable borrow to it has ended. To do so, I would need to know when the mutable borrow to it has been dropped. How can this be done?
Disclaimer: The answer that follows describes a possible solution, but it's not a very good one, as described by this comment from Sebastien Redl:
[T]his is a bad way of trying to maintain invariants. Mostly because dropping the reference can be suppressed with mem::forget. This is fine for RefCell, where if you don't drop the ref, you will simply eventually panic because you didn't release the dynamic borrow, but it is bad if violating the "fraction is in shortest form" invariant leads to weird results or subtle performance issues down the line, and it is catastrophic if you need to maintain the "thread doesn't outlive variables in the current scope" invariant.
Nevertheless, it's possible to use a temporary struct as a "staging area" that updates the referent when it's dropped, and thus maintain the invariant correctly; however, that version basically amounts to making a proper wrapper type and a kind of weird way to use it. The best way to solve this problem is through an opaque wrapper struct that doesn't expose its internals except through methods that definitely maintain the invariant.
Without further ado, the original answer:
Not exactly... but pretty close. We can use RefCell<T> as a model for how this can be done. It's a bit of an abstract question, but I'll use a concrete example to demonstrate. (This won't be a complete example, but something to show the general principles.)
Let's say you want to make a Fraction struct that is always in simplest form (fully reduced, e.g. 3/5 instead of 6/10). You write a struct RawFraction that will contain the bare data. RawFraction instances are not always in simplest form, but they have a method fn reduce(&mut self) that reduces them.
Now you need a smart pointer type that you will always use to mutate the RawFraction, which calls .reduce() on the pointed-to struct when it's dropped. Let's call it RefMut, because that's the naming scheme RefCell uses. You implement Deref<Target = RawFraction>, DerefMut, and Drop on it, something like this:
use std::ops::{Deref, DerefMut};
pub struct RefMut<'a>(&'a mut RawFraction);
impl<'a> Deref for RefMut<'a> {
type Target = RawFraction;
fn deref(&self) -> &RawFraction {
self.0
}
}
impl<'a> DerefMut for RefMut<'a> {
fn deref_mut(&mut self) -> &mut RawFraction {
self.0
}
}
impl<'a> Drop for RefMut<'a> {
fn drop(&mut self) {
self.0.reduce();
}
}
Now, whenever you have a RefMut to a RawFraction and drop it, you know the RawFraction will be in simplest form afterwards. All you need to do at this point is ensure that RefMut is the only way to get &mut access to the RawFraction part of a Fraction.
pub struct Fraction(RawFraction);
impl Fraction {
pub fn new(numerator: i32, denominator: i32) -> Self {
// create a RawFraction, reduce it and wrap it up
}
pub fn borrow_mut(&mut self) -> RefMut {
RefMut(&mut self.0)
}
}
Pay attention to the pub markings (and lack thereof): I'm using those to ensure the soundness of the exposed interface. All three types should be placed in a module by themselves. It would be incorrect to mark the RawFraction field pub inside Fraction, since then it would be possible (for code outside the module) to create an unreduced Fraction without using new or get a &mut RawFraction without going through RefMut.
Supposing all this code is placed in a module named frac, you can use it something like this (assuming Fraction implements Display):
let mut f = frac::Fraction::new(3, 10);
println!("{}", f); // prints 3/10
f.borrow_mut().numerator += 3;
println!("{}", f); // prints 3/5
The types encode the invariant: Wherever you have Fraction, you can know that it's fully reduced. When you have a RawFraction, &RawFraction, etc., you can't be sure. If you want, you may also make RawFraction's fields non-pub, so that you can't get an unreduced fraction at all except by calling borrow_mut on a Fraction.
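Putting the pieces together, a complete version of the frac module might look like this. The RawFraction fields, the gcd helper and the Display implementation are assumptions filled in for the sketch (the answer leaves them open), and zero or i32::MIN components are not handled.

```rust
mod frac {
    use std::fmt;
    use std::ops::{Deref, DerefMut};

    pub struct RawFraction {
        pub numerator: i32,
        pub denominator: i32,
    }

    fn gcd(mut a: u32, mut b: u32) -> u32 {
        while b != 0 {
            let t = a % b;
            a = b;
            b = t;
        }
        a
    }

    impl RawFraction {
        // Private: only this module decides when reduction happens.
        fn reduce(&mut self) {
            let g = gcd(self.numerator.unsigned_abs(), self.denominator.unsigned_abs());
            if g > 1 {
                self.numerator /= g as i32;
                self.denominator /= g as i32;
            }
        }
    }

    pub struct RefMut<'a>(&'a mut RawFraction);

    impl Deref for RefMut<'_> {
        type Target = RawFraction;
        fn deref(&self) -> &RawFraction {
            &*self.0
        }
    }

    impl DerefMut for RefMut<'_> {
        fn deref_mut(&mut self) -> &mut RawFraction {
            &mut *self.0
        }
    }

    impl Drop for RefMut<'_> {
        // Runs when the guard goes out of scope, restoring the invariant.
        fn drop(&mut self) {
            self.0.reduce();
        }
    }

    pub struct Fraction(RawFraction);

    impl Fraction {
        pub fn new(numerator: i32, denominator: i32) -> Self {
            let mut raw = RawFraction { numerator, denominator };
            raw.reduce();
            Fraction(raw)
        }

        pub fn borrow_mut(&mut self) -> RefMut<'_> {
            RefMut(&mut self.0)
        }
    }

    impl fmt::Display for Fraction {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "{}/{}", self.0.numerator, self.0.denominator)
        }
    }
}
```

In a statement such as f.borrow_mut().numerator += 3; the RefMut is a temporary, so it is dropped (and the fraction reduced) at the end of that statement.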
Basically the same thing is done in RefCell. There you want to reduce the runtime borrow-count when a borrow ends. Here you want to perform an arbitrary action.
So let's re-use the concept of writing a function that returns a wrapped reference:
struct Data {
content: i32,
}
impl Data {
fn borrow_mut(&mut self) -> DataRef {
println!("borrowing");
DataRef { data: self }
}
fn check_after_borrow(&self) {
if self.content > 50 {
println!("Hey, content should be <= {:?}!", 50);
}
}
}
struct DataRef<'a> {
data: &'a mut Data
}
impl<'a> Drop for DataRef<'a> {
fn drop(&mut self) {
println!("borrow ends");
self.data.check_after_borrow()
}
}
fn main() {
let mut d = Data { content: 42 };
println!("content is {}", d.content);
{
let b = d.borrow_mut();
//let c = &d; // Compiler won't let you have another borrow at the same time
b.data.content = 123;
println!("content set to {}", b.data.content);
} // borrow ends here
println!("content is now {}", d.content);
}
This results in the following output:
content is 42
borrowing
content set to 123
borrow ends
Hey, content should be <= 50!
content is now 123
Be aware that you can still obtain an unchecked mutable borrow with e.g. let c = &mut d;. This will be silently dropped without calling check_after_borrow.
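The two escape hatches mentioned in this thread, a plain &mut borrow and std::mem::forget, can be observed with a small experiment. This sketch replaces the println! calls with a counter (checks_run, a name invented here) so that the effect is measurable; the set method is likewise a hypothetical convenience.

```rust
struct Data {
    content: i32,
    checks_run: u32, // counts how often the "borrow ended" hook fired
}

impl Data {
    fn borrow_mut(&mut self) -> DataRef<'_> {
        DataRef { data: self }
    }
}

struct DataRef<'a> {
    data: &'a mut Data,
}

impl DataRef<'_> {
    fn set(&mut self, value: i32) {
        self.data.content = value;
    }
}

impl Drop for DataRef<'_> {
    fn drop(&mut self) {
        self.data.checks_run += 1;
    }
}

fn demo() -> [u32; 3] {
    let mut d = Data { content: 42, checks_run: 0 };

    // Normal use: the temporary guard is dropped at the end of the statement.
    d.borrow_mut().set(60);
    let after_guard = d.checks_run;

    // mem::forget consumes the guard without running Drop, so the hook is skipped.
    std::mem::forget(d.borrow_mut());
    let after_forget = d.checks_run;

    // A plain mutable borrow bypasses the guard type entirely.
    let raw = &mut d;
    raw.content = 123;
    let after_plain = d.checks_run;

    [after_guard, after_forget, after_plain]
}
```

Only the first mutation triggers the hook, which is why the guard approach should not be relied on for invariants that affect safety.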

Destructuring a struct containing a borrow in a function argument

I am trying to implement a system that would use borrow checking/lifetimes in order to provide safe custom indices on a collection. Consider the following code:
struct Graph(i32);
struct Edge<'a>(&'a Graph, i32);
impl Graph {
pub fn get_edge(&self) -> Edge {
Edge(&self, 0)
}
pub fn split(&mut self, Edge(_, edge_id): Edge) {
self.0 = self.0 + edge_id;
}
pub fn join(&mut self, Edge(_, edge0_id): Edge, Edge(_, edge1_id): Edge) {
self.0 = self.0 + edge0_id + edge1_id;
}
}
fn main() {
let mut graph = Graph(0);
let edge = graph.get_edge();
graph.split(edge)
}
References to the graph borrowed by the Edge struct should be dropped when methods such as split or join are called. This would fulfill the API invariant that all edge indices must be destroyed when the graph is mutated. However, the compiler doesn't get it. It fails with messages like
error[E0502]: cannot borrow `graph` as mutable because it is also borrowed as immutable
--> src/main.rs:23:5
|
22 | let edge = graph.get_edge();
| ----- immutable borrow occurs here
23 | graph.split(edge)
| ^^^^^ mutable borrow occurs here
24 | }
| - immutable borrow ends here
If I understand this correctly, the compiler fails to realise that the borrowing of the graph that happened in the edge struct is actually being released when the function is called. Is there a way to teach the compiler what I am trying to do here?
Bonus question: is there a way to do exactly the same but without actually borrowing the graph in the Edge struct? The edge struct is only used as a temporary for the purpose of traversal and will never be part of an external object state (I have 'weak' versions of the edge for that).
Addendum: After some digging around, it seems to be really far from trivial. First of all, Edge(_, edge_id) does not actually destructure the Edge, because _ does not get bound at all (yes, i32 is Copy which makes things even more complicated, but this is easily remedied by wrapping it into a non-Copy struct). Second, even if I completely destructure Edge (i.e. by doing it in a separate scope), the reference to the graph is still there, even though it should have been moved (this must be a bug). It only works if I perform the destructuring in a separate function. Now, I have an idea how to circumvent it (by having a separate object that describes a state change and destructures the indices as they are supplied), but this becomes very awkward very quickly.
You have a second problem that you didn’t mention: how does split know that the user didn’t pass an Edge from a different Graph? Fortunately, it’s possible to solve both problems with higher-rank trait bounds!
First, let’s have Edge carry a PhantomData marker instead of a real reference to the graph:
use std::marker::PhantomData;
pub struct Edge<'a>(PhantomData<&'a mut &'a ()>, i32);
Second, let’s move all the Graph operations into a new GraphView object that gets consumed by operations that should invalidate the identifiers:
pub struct GraphView<'a> {
graph: &'a mut Graph,
marker: PhantomData<&'a mut &'a ()>,
}
impl<'a> GraphView<'a> {
pub fn get_edge(&self) -> Edge<'a> {
Edge(PhantomData, 0)
}
pub fn split(self, Edge(_, edge_id): Edge<'a>) {
self.graph.0 = self.graph.0 + edge_id;
}
pub fn join(self, Edge(_, edge0_id): Edge<'a>, Edge(_, edge1_id): Edge<'a>) {
self.graph.0 = self.graph.0 + edge0_id + edge1_id;
}
}
Now all we have to do is guard the construction of GraphView objects such that there’s never more than one with a given lifetime parameter 'a.
We can do this by (1) forcing GraphView<'a> to be invariant over 'a with a PhantomData member as above, and (2) only ever providing a constructed GraphView to a closure with a higher-rank trait bound that creates a fresh lifetime each time:
impl Graph {
pub fn with_view<Ret>(&mut self, f: impl for<'a> FnOnce(GraphView<'a>) -> Ret) -> Ret {
f(GraphView {
graph: self,
marker: PhantomData,
})
}
}
fn main() {
let mut graph = Graph(0);
graph.with_view(|view| {
let edge = view.get_edge();
view.split(edge);
});
}
Full demo on Rust Playground.
This isn’t totally ideal, since the caller may have to go through contortions to put all its operations inside the closure. But I think it’s the best we can do in the current Rust language, and it does allow us to enforce a huge class of compile-time guarantees that almost no other language can express at all. I’d love to see more ergonomic support for this pattern added to the language somehow—perhaps a way to create a fresh lifetime via a return value rather than a closure parameter (pub fn view(&mut self) -> exists<'a> GraphView<'a>)?
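For reference, the pieces of this answer can be assembled into one compilable sketch. Two details are assumptions of this example rather than the answer's code: get_edge returns a placeholder edge id of 1 so that split has a visible effect, and split takes Edge<'a> explicitly so that the edge's lifetime brand has to match the view's.

```rust
use std::marker::PhantomData;

pub struct Graph(i32);

// PhantomData<&'a mut &'a ()> makes the type invariant in 'a, so an Edge<'a>
// cannot be silently converted to an Edge with a different lifetime.
pub struct Edge<'a>(PhantomData<&'a mut &'a ()>, i32);

pub struct GraphView<'a> {
    graph: &'a mut Graph,
    marker: PhantomData<&'a mut &'a ()>,
}

impl<'a> GraphView<'a> {
    // Placeholder: a real implementation would look up an actual edge.
    pub fn get_edge(&self) -> Edge<'a> {
        Edge(PhantomData, 1)
    }

    // Takes the view by value: after a mutation, no edge from this view
    // can be used with it again.
    pub fn split(self, Edge(_, edge_id): Edge<'a>) {
        self.graph.0 += edge_id;
    }
}

impl Graph {
    // The higher-rank bound forces the closure to work for every 'a,
    // so each call effectively gets its own fresh lifetime brand.
    pub fn with_view<Ret>(&mut self, f: impl for<'a> FnOnce(GraphView<'a>) -> Ret) -> Ret {
        f(GraphView { graph: self, marker: PhantomData })
    }
}

fn demo() -> i32 {
    let mut graph = Graph(0);
    graph.with_view(|view| {
        let edge = view.get_edge();
        view.split(edge);
    });
    graph.0
}
```

With this shape, calling view.split a second time inside the closure should be rejected as a use after move, and an edge obtained from a view of a different graph should fail to type-check because the two lifetimes cannot be unified.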

Resources