How to achieve behavior similar to @Override in Rust - rust

I'm new to Rust and am currently rewriting some of my old Java code in it. It's my first time programming outside of OOP, so this is strange and new to me.
I'm having trouble understanding how to implement a different method (with the same name) for each instance of a struct. In essence, I'm trying to achieve behavior similar to abstract class, extends, and @Override in Java.
Perhaps this example explains better what I'm trying to do. In it, I try to give each instance of AbstractNode different execute() logic.
Create a struct called "AbstractNode" that holds some data and has 3 methods associated with it (validate(), log(), execute())
struct AbstractNode {
pub data: //data here
pub validate: bool,
pub log: String,
}
trait NodeFunctions {
fn validate(&self)->bool{false}
fn log(&self){println!("/")}
fn execute(&self){}
}
impl NodeFunctions for AbstractNode{
fn validate(&self)->bool{
self.validate
}
fn log(&self){
println!("{}/", self.log);
}
fn execute(&self){
//--this function is the problem, because I don't want its behavior to be
//shared between all instances of Abstract node--
}
}
I then instantiate several nodes. If possible I would also like to define the body of the execute() somewhere in here.
let node1 = AbstractNode{
data: //data
validate: false,
log: "node1".to_string(),
};
let node2 = AbstractNode{
data: //data
validate: 1>0,
log: "node2".to_string(),
};
let node3 = AbstractNode{
data: //data
validate: true,
log: "node3".to_string(),
};
//...
It is called from main like so. If the condition in validate() is true, the log() method is executed first, which is the same for all nodes. Then execute() runs, which is not the same for all nodes.
fn main(){
let mut node_tree = vec![
node1,
node2,
node3
//...
];
for node in node_tree.iter() {
if node.validate(){
node.log();
node.execute(); //<--
break;
}
};
}
Each node should be able to hold different logic under the execute() method, and I don't know how to define this per-node behavior.
I hope this question is clear enough. If you don't understand what I'm trying to achieve, please ask additional questions.
Thanks in advance.

You could somewhat replicate it using closures. However, you'll still end up with parts that can't be generic per Node if you also want to be able to mutate it within the closure.
I've renamed and removed some parts, to simplify the examples.
First, you'll need a NodeData type, which holds your data. I'm assuming you want to be able to mutate it within the "execute" method. Then you'll need a Node type, which holds the data, along with the boxed closure for that Node instance.
struct NodeData {}
struct Node {
data: NodeData,
f: Box<dyn Fn(&mut NodeData)>,
}
Then we'll implement a method for creating a Node instance, along with the execute method that calls the closure.
This is where the limitation of using closures appears: you cannot pass the closure a mutable reference to the Node itself, because the Node is already borrowed when you access self.f to call the closure.
impl Node {
fn with<F>(f: F) -> Self
where
F: Fn(&mut NodeData) + 'static,
{
Self {
data: NodeData {},
f: Box::new(f),
}
}
fn execute(&mut self) {
(self.f)(&mut self.data);
}
}
An example of using it would then look like this:
let mut nodes: Vec<Node> = vec![];
nodes.push(Node::with(|_node_data| {
println!("I'm a node");
}));
nodes.push(Node::with(|_node_data| {
println!("I'm another node");
}));
nodes.push(Node::with(|_node_data| {
println!("I'm also a node");
}));
for node in &mut nodes {
node.execute();
}
Again, this works. However, NodeData cannot easily be made generic, because modifying the data inside the closure then becomes increasingly difficult.
Of course, you could fall back to making NodeData a HashMap, so that each node can store anything under a String key with some enum value.
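The HashMap idea can be sketched as follows. This is only an illustration under assumptions: the Value enum, its variants, and the run_demo helper are hypothetical names, not part of the question.

```rust
use std::collections::HashMap;

// Hypothetical value type; the variants are placeholders for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

type NodeData = HashMap<String, Value>;

struct Node {
    data: NodeData,
    f: Box<dyn Fn(&mut NodeData)>,
}

impl Node {
    fn with<F>(f: F) -> Self
    where
        F: Fn(&mut NodeData) + 'static,
    {
        Self { data: NodeData::new(), f: Box::new(f) }
    }

    fn execute(&mut self) {
        (self.f)(&mut self.data);
    }
}

// Builds two nodes with different behavior and runs each of them once.
fn run_demo() -> Vec<Node> {
    let mut nodes = vec![
        Node::with(|data| {
            // Increment a counter, creating it on first use.
            let next = match data.get("count") {
                Some(Value::Int(n)) => n + 1,
                _ => 1,
            };
            data.insert("count".to_string(), Value::Int(next));
        }),
        Node::with(|data| {
            data.insert("label".to_string(), Value::Text("hello".to_string()));
        }),
    ];
    for node in &mut nodes {
        node.execute();
    }
    nodes
}

fn main() {
    for node in run_demo() {
        println!("{:?}", node.data);
    }
}
```

The price of this flexibility is that every access has to check at runtime which variant is stored under a key.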
Although you didn't want to have separate types, using them does make this somewhat easier, as each node type can hold a different kind of data.
With separate types, we can have a single trait Node that provides the execute method.
trait Node {
fn execute(&mut self);
}
Now define multiple types and implement Node for each of them. Again, using a trait instead of a closure has two advantages:
Every node type you define can contain any kind of data you'd like.
In this case, execute can actually modify Self, which the closure solution cannot.
struct NodeA {}
struct NodeB {}
struct NodeC {}
impl Node for NodeA {
fn execute(&mut self) {
println!("I'm a node");
}
}
impl Node for NodeB {
fn execute(&mut self) {
println!("I'm another node");
}
}
impl Node for NodeC {
fn execute(&mut self) {
println!("I'm also a node");
}
}
You can still have a single Vec of nodes, as trait objects can easily be boxed.
let mut nodes: Vec<Box<dyn Node>> = vec![];
nodes.push(Box::new(NodeA {}));
nodes.push(Box::new(NodeB {}));
nodes.push(Box::new(NodeC {}));
for node in &mut nodes {
node.execute();
}
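To tie this back to the question, here is a rough sketch of how default trait methods can play the role of the shared validate() and log(), while execute() is left for each type to define. The name() method, the String return values, and run_first_valid are assumptions made so the result is easy to inspect.

```rust
trait Node {
    fn name(&self) -> String;
    // Shared behavior: every node gets these defaults unless it overrides them.
    fn validate(&self) -> bool {
        true
    }
    fn log(&self) -> String {
        format!("{}/", self.name())
    }
    // Per-type behavior: there is no default, so each type must supply its own.
    fn execute(&mut self) -> String;
}

struct NodeA;
struct NodeB {
    runs: u32,
}

impl Node for NodeA {
    fn name(&self) -> String {
        "node_a".to_string()
    }
    // Overrides the default, much like @Override in Java.
    fn validate(&self) -> bool {
        false
    }
    fn execute(&mut self) -> String {
        "A executed".to_string()
    }
}

impl Node for NodeB {
    fn name(&self) -> String {
        "node_b".to_string()
    }
    fn execute(&mut self) -> String {
        self.runs += 1;
        format!("B executed {} time(s)", self.runs)
    }
}

// Mirrors the loop from the question: run the first node that validates.
fn run_first_valid(nodes: &mut [Box<dyn Node>]) -> Option<(String, String)> {
    for node in nodes.iter_mut() {
        if node.validate() {
            return Some((node.log(), node.execute()));
        }
    }
    None
}

fn main() {
    let mut nodes: Vec<Box<dyn Node>> = vec![Box::new(NodeA), Box::new(NodeB { runs: 0 })];
    if let Some((log, result)) = run_first_valid(&mut nodes) {
        println!("{log} {result}");
    }
}
```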

You could have AbstractNode store a closure that takes a reference to Self:
struct AbstractNode {
pub validate: bool,
pub log: String,
pub executor: Box<dyn Fn(&Self)>
}
The NodeFunctions implementation for AbstractNode would simply call the executor closure:
impl NodeFunctions for AbstractNode {
fn execute(&self) {
(self.executor)(self)
}
// ...
}
Now, every time you create a new instance of an AbstractNode, you can give it a custom executor function. The executor takes a reference to self and can therefore access the node's data:
let node1 = AbstractNode {
validate: false,
log: "node1".to_string(),
executor: Box::new(|n| println!("Executing node #1. Is Valid? {}", n.validate))
};
// node1.execute() prints: Executing node #1. Is Valid? false
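Putting the pieces together, a self-contained sketch might look like this. To keep it short, execute is an inherent method rather than part of NodeFunctions, and the executor returns a String instead of printing. Both are simplifications for illustration, and build_nodes and run_first_valid are hypothetical helpers.

```rust
struct AbstractNode {
    validate: bool,
    log: String,
    // Per-instance behavior; it returns a String only so the result is easy to inspect.
    executor: Box<dyn Fn(&AbstractNode) -> String>,
}

impl AbstractNode {
    fn execute(&self) -> String {
        (self.executor)(self)
    }
}

fn build_nodes() -> Vec<AbstractNode> {
    vec![
        AbstractNode {
            validate: false,
            log: "node1".to_string(),
            executor: Box::new(|n: &AbstractNode| format!("{} does nothing", n.log)),
        },
        AbstractNode {
            validate: true,
            log: "node2".to_string(),
            executor: Box::new(|n: &AbstractNode| format!("{} runs custom logic", n.log)),
        },
    ]
}

// Returns the result of the first node whose validate flag is set.
fn run_first_valid(nodes: &[AbstractNode]) -> Option<String> {
    nodes.iter().find(|n| n.validate).map(|n| n.execute())
}

fn main() {
    let nodes = build_nodes();
    println!("{:?}", run_first_valid(&nodes));
}
```

Because the executor only receives a shared reference, it can read the node but not mutate it without interior mutability.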

Related

Can you store functions as fields in a struct? How to store instructions?

In my model, I have a Petgraph graph whose nodes are structs with the following fields:
struct ControlBloc
{
name:String,
message_inbox:Vec<MessageObj>,
blocked:bool,
instruct:String,
inbox_capacity:f64,
buffer:Vec<MessageObj>,
number_discarded:u32,
clock_queue:SendingQueue,
clock_speed:f64,
}
In it there is a field called instruct in which I want to store instructions. I want to code the model in a way such that after some time, all the nodes will execute the instructions that are stored in the struct. Instructions can be for example send messages to other nodes, computing something... I want something versatile.
Is there a way to store functions as fields in a struct? and then after some time, the function stored can be called and whatever function will be executed?
One way that I see doing this is maybe using enum to store all the function names then using a function to map whatever enum to the corresponding function, for example:
enum FuncName {
SendMessage,
ComputeSize,
StoreSomething,
DoNothing,
}
fn exec_function(func:FuncName)
{
match func {
FuncName::SendMessage => send_message_function(input1,input2),
FuncName::ComputeSize => compute_size_function(input1,input2,input3),
FuncName::StoreSomething => store_something_function(input1),
FuncName::DoNothing => (),
}
}
However, in this case you can't really customize the inputs of the functions behind FuncName. Either they are always preset to the same values, or exec_function has to take every input of every function in FuncName, which seems like overkill. Even then, I don't really see how to pass them in and store them in the struct.
Is there a way to store the function itself in the struct? I know I'm breaking many Rust rules, but say I had a variable already declared as let bloc = ControlBloc::new(...); then you could set the function with, for example, bloc.instruct = send_message_function(node1,node2); and calling bloc.instruct would then run whatever function is stored there.
Is something like this possible, am I dreaming, or is it just very difficult? (I am still learning the language.)
What you can do is store a Box<dyn Fn()> in your struct:
struct Foo {
instruct: Box<dyn Fn(Vec<i32>)>
}
fn sum(vec: Vec<i32>) {
let sum: i32 = vec.into_iter().sum();
println!("{}", sum);
}
fn main() {
let foo = Foo {
instruct: Box::new(|vec| {
let sum: i32 = vec.into_iter().sum();
println!("{}", sum);
})
};
(foo.instruct)(vec![1, 2, 3, 4]);
let foo = Foo {
instruct: Box::new(sum)
};
(foo.instruct)(vec![1, 2, 3, 4]);
}
Fn is implemented automatically by closures which only take immutable references to captured variables or don’t capture anything at all, as well as (safe) function pointers (with some caveats, see their documentation for more details). Additionally, for any type F that implements Fn, &F implements Fn, too.
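Since closures can capture their environment, the arguments can be bound when the instruction is stored and the call can happen later, which is close to what the question asks for. A minimal sketch, assuming simplified stand-ins for the question's types (this ControlBloc, send_message, compute_size, and make_send are all hypothetical):

```rust
struct ControlBloc {
    name: String,
    // The stored instruction. Its inputs are captured when the closure is built.
    instruct: Box<dyn Fn() -> String>,
}

// Hypothetical instruction functions with different parameter lists.
fn send_message(from: &str, to: &str) -> String {
    format!("{from} -> {to}")
}

fn compute_size(a: u32, b: u32, c: u32) -> String {
    format!("size = {}", a * b * c)
}

// Binds the arguments now; the call itself happens whenever the closure is run.
fn make_send(from: String, to: String) -> Box<dyn Fn() -> String> {
    Box::new(move || send_message(&from, &to))
}

fn main() {
    let mut bloc = ControlBloc {
        name: "bloc1".to_string(),
        instruct: make_send("node1".to_string(), "node2".to_string()),
    };
    println!("{}: {}", bloc.name, (bloc.instruct)());

    // Swap in an instruction with a different set of inputs.
    bloc.instruct = Box::new(|| compute_size(2, 3, 4));
    println!("{}: {}", bloc.name, (bloc.instruct)());
}
```

The move keyword makes the closure own its captured arguments, so it satisfies the implicit 'static bound of the boxed trait object.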
EDIT
In my example I used Vec<i32> as a stand-in for multiple arguments. However, if you are going to have a set of instructions that each take a different number of arguments, but always the same number for a given instruction, you might consider creating a trait Instruct and a struct for every instruction that implements it.
struct Foo<T> {
instruct: Box<dyn Instruct<T>>
}
trait Instruct<T> {
fn run(&self) -> T;
}
struct CalcSum {
f: Box<dyn Fn() -> i32>
}
impl CalcSum {
fn new(arg: Vec<i32>) -> CalcSum {
CalcSum {
f: Box::new(move || arg.iter().sum::<i32>()),
}
}
}
impl Instruct<i32> for CalcSum {
fn run(&self) -> i32 {
(self.f)()
}
}
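A usage sketch of the trait approach follows. As a variation, the arguments are stored as plain fields instead of inside a boxed closure, and Scale is a hypothetical second instruction added only to show a different argument count.

```rust
trait Instruct<T> {
    fn run(&self) -> T;
}

struct Foo<T> {
    instruct: Box<dyn Instruct<T>>,
}

// Variation on CalcSum: the input is kept as a field rather than in a closure.
struct CalcSum {
    values: Vec<i32>,
}

impl Instruct<i32> for CalcSum {
    fn run(&self) -> i32 {
        self.values.iter().sum()
    }
}

// A second, hypothetical instruction with a different number of inputs.
struct Scale {
    value: i32,
    factor: i32,
}

impl Instruct<i32> for Scale {
    fn run(&self) -> i32 {
        self.value * self.factor
    }
}

fn main() {
    let mut foo: Foo<i32> = Foo {
        instruct: Box::new(CalcSum { values: vec![1, 2, 3, 4] }),
    };
    println!("{}", foo.instruct.run());

    // The stored instruction can be replaced by any other Instruct<i32>.
    foo.instruct = Box::new(Scale { value: 6, factor: 7 });
    println!("{}", foo.instruct.run());
}
```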

How can I determine if I have a unique Arc when it's dropped?

I've an Arc<Mutex<Thing>> field in a struct which is cloned many times. It is shared between concurrent threads. Drop::drop is called for each clone as it goes out of scope. Is there any way to determine when Drop::drop is called for the last (unique) Arc<Mutex<Thing>>?
It's clear that strong_count is subject to data races (I've seen them). So, you can't count on Arc::strong_count() == 1 (no pun intended).
I found that I couldn't use Arc::try_unwrap() due to a move issue.
Arc::is_unique() is private.
Other than keeping an Arc<AtomicUsize> field, which is incremented on clone and decremented on drop, is there any way to determine if a drop is for a unique Arc<Mutex<Thing>>?
Here's an MRE:
use std::sync::{Arc};
#[derive(Debug)]
enum Action {
One, Two, Three
}
// Thing trait which operates on an Action, which should be an enum, allowing for
// different action sets.
trait Thing<T> {
fn disconnected(&self);
fn action(&self, action: T);
}
// There are many instances of an ActionController.
// There may be zero or more clones of an instance.
// The final drop of the instances should call thing.disconnected()
// In a multi-core environment, the same instance may be running on multiple cores
// ActionController should not be generic.
#[derive(Clone)]
struct ActionController {
id: usize,
thing: Arc<dyn Thing<Action>>,
}
impl ActionController {
fn new(id: usize, thing: Box<dyn Thing<Action>>) -> Self {
Self { id, thing: Arc::from(thing) }
}
fn invoke(&self, action: Action) {
self.thing.action(action);
}
}
//
// To work around the drop issue, I've implemented Clone for ActionController which
// performs a fetch_add(1) on clone and a fetch_sub(1) on drop. This provides
// sufficient information to call disconnected() -- but it just seems like there's
// got to be a better way.
impl Drop for ActionController {
fn drop(&mut self) {
// drop will be called for each clone of a Controller instance. When
// the unique instance is dropped, disconnected() must be called
self.thing.disconnected();
}
}
struct Controlled {}
impl Thing<Action> for Controlled {
fn disconnected(&self) { println!("disconnected")}
fn action(&self, action: Action) {println!("action: {:#?}", action)}
}
fn bad() {
let controlled = Controlled{};
let controlled = Box::new(controlled) as Box<dyn Thing<Action>>;
let controller = ActionController::new(1, controlled);
let clone = controller.clone();
controller.invoke(Action::One);
clone.invoke(Action::Two);
drop (controller);
clone.invoke(Action::Three);
}
fn main() {
bad();
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn incorrect() {
bad();
}
}
Arc::try_unwrap is probably the intended way to do this - is it possible to restructure your code to avoid the move issues you were running into?
Why do you want to know? If you have some extra cleanup code that needs to be executed before the Mutex<Thing> is dropped, maybe you could use an Arc<MyLockedThing> instead, where MyLockedThing is a struct containing a Mutex<Thing> that impls Drop to do the cleanup?
It seems like you want to be notified when the data inside the Arc is to be dropped. If so, this can be done by implementing Drop on the type "inside" the Arc.
Define a newtype:
struct ThingAction(Box<dyn Thing<Action>>);
impl Thing<Action> for ThingAction {
fn disconnected(&self) {
self.0.disconnected()
}
fn action(&self, action: Action) {
self.0.action(action)
}
}
And implement Drop:
impl Drop for ThingAction {
fn drop(&mut self) {
self.disconnected()
}
}
Then use the newtype:
#[derive(Clone)]
struct ActionController {
id: usize,
thing: Arc<ThingAction>,
}
impl ActionController {
fn new(id: usize, thing: Box<dyn Thing<Action>>) -> Self {
Self { id, thing: Arc::new(ThingAction(thing)) }
}
}
I don't think there's any perfect way to do this without stdlib support (go check out Arc::drop).
Weak::strong_count or Weak::upgrade is less subject to races. If you downgrade your Arc and then drop it, and the weak reference's strong count is 0 or upgrading it fails, you know the Arc is dead. However, there is no guarantee that the current thread killed it: two threads might have dropped their Arcs concurrently before either had time to check the weak reference's strong count.
I think the only bulletproof way would be to get notified by a Drop stored inside the Arc, that you're guaranteed is only called once.
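A small sketch of that guarantee: the Drop implementation of the value inside the Arc runs exactly once, no matter which thread releases the last handle. Connection and the counter are hypothetical stand-ins for the real resource and its disconnected() call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Stand-in for the shared resource; `disconnects` counts how often cleanup ran.
struct Connection {
    disconnects: Arc<AtomicUsize>,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // Runs when the last strong Arc<Connection> goes away.
        self.disconnects.fetch_add(1, Ordering::SeqCst);
    }
}

// Clones the Arc into `workers` threads, drops every handle, and reports
// how many times the cleanup ran.
fn run(workers: usize) -> usize {
    let disconnects = Arc::new(AtomicUsize::new(0));
    let shared = Arc::new(Connection {
        disconnects: Arc::clone(&disconnects),
    });

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let local = Arc::clone(&shared);
            thread::spawn(move || drop(local))
        })
        .collect();

    drop(shared); // the original handle may or may not be the last one released
    for handle in handles {
        handle.join().unwrap();
    }
    disconnects.load(Ordering::SeqCst)
}

fn main() {
    println!("cleanup ran {} time(s)", run(8));
}
```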

What's the best way to have a struct that contains a reference to another struct of the same type? [duplicate]

I am trying to implement a scenegraph-like data structure in Rust. I would like an equivalent to this C++ code expressed in safe Rust:
#include <algorithm>
#include <vector>
struct Node
{
Node* parent = nullptr; // should be mutable, and nullable (no parent)
std::vector<Node*> child;
virtual ~Node()
{
for(auto it = child.begin(); it != child.end(); ++it)
{
delete *it;
}
}
void addNode(Node* newNode)
{
if (newNode->parent)
{
// Detach the node from its previous parent's child list.
auto& siblings = newNode->parent->child;
siblings.erase(std::remove(siblings.begin(), siblings.end(), newNode), siblings.end());
}
newNode->parent = this;
child.push_back(newNode);
}
};
Properties I want:
the parent takes ownership of its children
the nodes are accessible from the outside via a reference of some kind
events that touch one Node can potentially mutate the whole tree
Rust tries to ensure memory safety by forbidding you from doing things that might potentially be unsafe. Since this analysis is performed at compile-time, the compiler can only reason about a subset of manipulations that are known to be safe.
In Rust, you could easily store either a reference to the parent (by borrowing the parent, thus preventing mutation) or the list of child nodes (by owning them, which gives you more freedom), but not both (without using unsafe). This is especially problematic for your implementation of addNode, which requires mutable access to the given node's parent. However, if you store the parent pointer as a mutable reference, then, since only a single mutable reference to a particular object may be usable at a time, the only way to access the parent would be through a child node, and you'd only be able to have a single child node, otherwise you'd have two mutable references to the same parent node.
If you want to avoid unsafe code, there are many alternatives, but they'll all require some sacrifices.
The easiest solution is to simply remove the parent field. We can define a separate data structure to remember the parent of a node while we traverse a tree, rather than storing it in the node itself.
First, let's define Node:
#[derive(Debug)]
struct Node<T> {
data: T,
children: Vec<Node<T>>,
}
impl<T> Node<T> {
fn new(data: T) -> Node<T> {
Node { data: data, children: vec![] }
}
fn add_child(&mut self, child: Node<T>) {
self.children.push(child);
}
}
(I added a data field because a tree isn't super useful without data at the nodes!)
Let's now define another struct to track the parent as we navigate:
#[derive(Debug)]
struct NavigableNode<'a, T: 'a> {
node: &'a Node<T>,
parent: Option<&'a NavigableNode<'a, T>>,
}
impl<'a, T> NavigableNode<'a, T> {
fn child(&self, index: usize) -> NavigableNode<T> {
NavigableNode {
node: &self.node.children[index],
parent: Some(self)
}
}
}
impl<T> Node<T> {
fn navigate<'a>(&'a self) -> NavigableNode<T> {
NavigableNode { node: self, parent: None }
}
}
This solution works fine if you don't need to mutate the tree as you navigate it and you can keep the parent NavigableNode objects around (which works fine for a recursive algorithm, but doesn't work too well if you want to store a NavigableNode in some other data structure and keep it around). The second restriction can be alleviated by using something other than a borrowed pointer to store the parent; if you want maximum genericity, you can use the Borrow trait to allow direct values, borrowed pointers, Boxes, Rc's, etc.
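As a rough usage sketch, the parent links let a NavigableNode walk back up to the root. The path_to_root method and the demo function are hypothetical additions, and the types are repeated (without the Debug derives) so the example stands alone.

```rust
struct Node<T> {
    data: T,
    children: Vec<Node<T>>,
}

struct NavigableNode<'a, T> {
    node: &'a Node<T>,
    parent: Option<&'a NavigableNode<'a, T>>,
}

impl<'a, T> NavigableNode<'a, T> {
    fn child(&self, index: usize) -> NavigableNode<'_, T> {
        NavigableNode {
            node: &self.node.children[index],
            parent: Some(self),
        }
    }

    // Hypothetical helper: collects the data from this node up to the root.
    fn path_to_root(&self) -> Vec<&'a T> {
        let mut path = vec![&self.node.data];
        let mut current = self.parent;
        while let Some(p) = current {
            path.push(&p.node.data);
            current = p.parent;
        }
        path
    }
}

fn demo() -> Vec<i32> {
    let tree = Node {
        data: 1,
        children: vec![
            Node { data: 2, children: vec![] },
            Node {
                data: 3,
                children: vec![Node { data: 4, children: vec![] }],
            },
        ],
    };
    let root = NavigableNode { node: &tree, parent: None };
    // Each step borrows the previous NavigableNode, so all of them must stay alive.
    let second = root.child(1);
    let leaf = second.child(0);
    let path: Vec<i32> = leaf.path_to_root().into_iter().copied().collect();
    path
}

fn main() {
    println!("{:?}", demo());
}
```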
Now, let's talk about zippers. In functional programming, zippers are used to "focus" on a particular element of a data structure (list, tree, map, etc.) so that accessing that element takes constant time, while still retaining all the data of that data structure. If you need to navigate your tree and mutate it during the navigation, while retaining the ability to navigate up the tree, then you could turn a tree into a zipper and perform the modifications through the zipper.
Here's how we could implement a zipper for the Node defined above:
#[derive(Debug)]
struct NodeZipper<T> {
node: Node<T>,
parent: Option<Box<NodeZipper<T>>>,
index_in_parent: usize,
}
impl<T> NodeZipper<T> {
fn child(mut self, index: usize) -> NodeZipper<T> {
// Remove the specified child from the node's children.
// A NodeZipper shouldn't let its users inspect its parent,
// since we mutate the parents
// to move the focused nodes out of their list of children.
// We use swap_remove() for efficiency.
let child = self.node.children.swap_remove(index);
// Return a new NodeZipper focused on the specified child.
NodeZipper {
node: child,
parent: Some(Box::new(self)),
index_in_parent: index,
}
}
fn parent(self) -> NodeZipper<T> {
// Destructure this NodeZipper
let NodeZipper { node, parent, index_in_parent } = self;
// Destructure the parent NodeZipper
let NodeZipper {
node: mut parent_node,
parent: parent_parent,
index_in_parent: parent_index_in_parent,
} = *parent.unwrap();
// Insert the node of this NodeZipper back in its parent.
// Since we used swap_remove() to remove the child,
// we need to do the opposite of that.
parent_node.children.push(node);
let len = parent_node.children.len();
parent_node.children.swap(index_in_parent, len - 1);
// Return a new NodeZipper focused on the parent.
NodeZipper {
node: parent_node,
parent: parent_parent,
index_in_parent: parent_index_in_parent,
}
}
fn finish(mut self) -> Node<T> {
while let Some(_) = self.parent {
self = self.parent();
}
self.node
}
}
impl<T> Node<T> {
fn zipper(self) -> NodeZipper<T> {
NodeZipper { node: self, parent: None, index_in_parent: 0 }
}
}
To use this zipper, you need to have ownership of the root node of the tree. By taking ownership of the nodes, the zipper can move things around in order to avoid copying or cloning nodes. When we move a zipper, we actually drop the old zipper and create a new one (though we could also do it by mutating self, but I thought it was clearer that way, plus it lets you chain method calls).
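Here is a condensed, self-contained sketch of using the zipper to edit a node and then reassemble the tree. The types are repeated in shortened form, and leaf and demo are hypothetical helpers.

```rust
#[derive(Debug, PartialEq)]
struct Node<T> {
    data: T,
    children: Vec<Node<T>>,
}

struct NodeZipper<T> {
    node: Node<T>,
    parent: Option<Box<NodeZipper<T>>>,
    index_in_parent: usize,
}

impl<T> NodeZipper<T> {
    // Focus on a child, taking it out of its parent's list.
    fn child(mut self, index: usize) -> NodeZipper<T> {
        let child = self.node.children.swap_remove(index);
        NodeZipper {
            node: child,
            parent: Some(Box::new(self)),
            index_in_parent: index,
        }
    }

    // Move back up, putting the focused node back where it came from.
    // Panics if called on the root, like the version above.
    fn parent(self) -> NodeZipper<T> {
        let NodeZipper { node, parent, index_in_parent } = self;
        let mut up = *parent.unwrap();
        // Undo the swap_remove: push to the end, then swap into place.
        up.node.children.push(node);
        let last = up.node.children.len() - 1;
        up.node.children.swap(index_in_parent, last);
        up
    }

    fn finish(mut self) -> Node<T> {
        while self.parent.is_some() {
            self = self.parent();
        }
        self.node
    }
}

fn leaf(data: i32) -> Node<i32> {
    Node { data, children: vec![] }
}

fn demo() -> Node<i32> {
    let tree = Node { data: 1, children: vec![leaf(2), leaf(3), leaf(4)] };
    let zipper = NodeZipper { node: tree, parent: None, index_in_parent: 0 };
    // Walk down to the first child, change it, and add a grandchild.
    let mut focus = zipper.child(0);
    focus.node.data = 20;
    focus.node.children.push(leaf(5));
    // Walking back up reassembles the tree with the edits in place.
    focus.finish()
}

fn main() {
    println!("{:?}", demo());
}
```

Note that the sibling order is preserved even though swap_remove reorders the children temporarily.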
If the above options are not satisfactory, and you must absolutely store the parent of a node in a node, then the next best option is to use Rc<RefCell<Node<T>>> to hold the children and Weak<RefCell<Node<T>>> to refer to the parent. Rc enables shared ownership, but adds overhead to perform reference counting at runtime. RefCell enables interior mutability, but adds overhead to keep track of the active borrows at runtime. Weak is like Rc, but it doesn't increment the reference count; this is used to break reference cycles, which would prevent the reference count from dropping to zero, causing a memory leak. See DK.'s answer for an implementation using Rc, Weak and RefCell.
The problem is that this data structure is inherently unsafe; it doesn't have a direct equivalent in Rust that doesn't use unsafe. This is by design.
If you want to translate this into safe Rust code, you need to be more specific about what, exactly, you want from it. I know you listed some properties above, but often people coming to Rust will say "I want everything I have in this C/C++ code", to which the direct answer is "well, you can't."
You're also, unavoidably, going to have to change how you approach this. The example you've given has pointers without any ownership semantics, mutable aliasing, and cycles; all of which Rust will not allow you to simply ignore like C++ does.
The simplest solution is to just get rid of the parent pointer, and maintain that externally (like a filesystem path). This also plays nicely with borrowing because there are no cycles anywhere:
pub struct Node1 {
children: Vec<Node1>,
}
If you need parent pointers, you could go half-way and use Ids instead:
use std::collections::BTreeMap;
type Id = usize;
pub struct Tree {
descendants: BTreeMap<Id, Node2>,
root: Option<Id>,
}
pub struct Node2 {
parent: Id,
children: Vec<Id>,
}
The BTreeMap is effectively your "address space", bypassing borrowing and aliasing issues by not directly using memory addresses.
Of course, this introduces the problem of a given Id not being tied to the particular tree, meaning that the node it belongs to could be destroyed, and now you have what is effectively a dangling pointer. But, that's the price you pay for having aliasing and mutation. It's also less direct.
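A minimal sketch of the Id approach, with a lookup that returns None for an unknown Id instead of dangling. The field names, the next_id counter, and the add and reparent methods are assumptions for illustration, and reparent does not check for cycles.

```rust
use std::collections::BTreeMap;

type Id = usize;

struct Node {
    parent: Option<Id>,
    children: Vec<Id>,
}

#[derive(Default)]
struct Tree {
    nodes: BTreeMap<Id, Node>,
    next_id: Id,
}

impl Tree {
    // Inserts a node under `parent` (or as a root when `parent` is None).
    // Returns None if the parent id is not in this tree.
    fn add(&mut self, parent: Option<Id>) -> Option<Id> {
        let id = self.next_id;
        if let Some(p) = parent {
            self.nodes.get_mut(&p)?.children.push(id);
        }
        self.nodes.insert(id, Node { parent, children: Vec::new() });
        self.next_id += 1;
        Some(id)
    }

    // Moves `id` under `new_parent`, detaching it from its old parent first.
    // Note: this sketch does not guard against creating a cycle.
    fn reparent(&mut self, id: Id, new_parent: Id) -> Option<()> {
        if !self.nodes.contains_key(&new_parent) {
            return None;
        }
        let old = self.nodes.get(&id)?.parent;
        if let Some(old) = old {
            self.nodes.get_mut(&old)?.children.retain(|&c| c != id);
        }
        self.nodes.get_mut(&new_parent)?.children.push(id);
        self.nodes.get_mut(&id)?.parent = Some(new_parent);
        Some(())
    }
}

fn main() {
    let mut tree = Tree::default();
    let root = tree.add(None).unwrap();
    let a = tree.add(Some(root)).unwrap();
    let b = tree.add(Some(root)).unwrap();
    tree.reparent(b, a).unwrap();
    println!("children of root: {:?}", tree.nodes[&root].children);
    println!("parent of b: {:?}", tree.nodes[&b].parent);
}
```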
Or, you could go whole-hog and use reference-counting and dynamic borrow checking:
use std::cell::RefCell;
use std::rc::{Rc, Weak};
// Note: do not derive Clone to make this move-only.
pub struct Node3(Rc<RefCell<Node3_>>);
pub type WeakNode3 = Weak<RefCell<Node3>>;
pub struct Node3_ {
parent: Option<WeakNode3>,
children: Vec<Node3>,
}
impl Node3 {
pub fn add(&self, node: Node3) {
// No need to remove from old parent; move semantics mean that must have
// already been done.
(node.0).borrow_mut().parent = Some(Rc::downgrade(&self.0));
self.children.push(node);
}
}
Here, you'd use Node3 to transfer ownership of a node between parts of the tree, and WeakNode3 for external references. Or, you could make Node3 cloneable and add back the logic in add to make sure a given node doesn't accidentally stay a child of the wrong parent.
This is not strictly better than the second option, because this design absolutely cannot benefit from static borrow-checking. The second one can at least prevent you from mutating the graph from two places at once at compile time; here, if that happens, you'll just crash.
The point is: you can't just have everything. You have to decide which operations you actually need to support. At that point, it's usually just a case of picking the types that give you the necessary properties.
In certain cases, you can also use an arena. An arena guarantees that values stored in it will have the same lifetime as the arena itself. This means that adding more values will not invalidate any existing lifetimes, but moving the arena will. Thus, such a solution is not viable if you need to return the tree.
This solves the problem by removing the ownership from the nodes themselves.
Here's an example that also uses interior mutability to allow a node to be mutated after it is created. In other cases, you can remove this mutability if the tree is constructed once and then simply navigated.
use std::{
cell::{Cell, RefCell},
fmt,
};
use typed_arena::Arena; // 1.6.1
struct Tree<'a, T: 'a> {
nodes: Arena<Node<'a, T>>,
}
impl<'a, T> Tree<'a, T> {
fn new() -> Tree<'a, T> {
Self {
nodes: Arena::new(),
}
}
fn new_node(&'a self, data: T) -> &'a mut Node<'a, T> {
self.nodes.alloc(Node {
data,
tree: self,
parent: Cell::new(None),
children: RefCell::new(Vec::new()),
})
}
}
struct Node<'a, T: 'a> {
data: T,
tree: &'a Tree<'a, T>,
parent: Cell<Option<&'a Node<'a, T>>>,
children: RefCell<Vec<&'a Node<'a, T>>>,
}
impl<'a, T> Node<'a, T> {
fn add_node(&'a self, data: T) -> &'a Node<'a, T> {
let child = self.tree.new_node(data);
child.parent.set(Some(self));
self.children.borrow_mut().push(child);
child
}
}
impl<'a, T> fmt::Debug for Node<'a, T>
where
T: fmt::Debug,
{
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "{:?}", self.data)?;
write!(f, " (")?;
for c in self.children.borrow().iter() {
write!(f, "{:?}, ", c)?;
}
write!(f, ")")
}
}
fn main() {
let tree = Tree::new();
let head = tree.new_node(1);
let _left = head.add_node(2);
let _right = head.add_node(3);
println!("{:?}", head); // 1 (2 (), 3 (), )
}
TL;DR: DK.'s second version doesn't compile because parent has another type than self.0, fix it by converting it to a WeakNode. Also, in the line directly below, "self" doesn't have a "children" attribute but self.0 has.
I corrected DK.'s version so that it compiles and works. Here is my code:
dk_tree.rs
use std::cell::RefCell;
use std::rc::{Rc, Weak};
// Note: do not derive Clone to make this move-only.
pub struct Node(Rc<RefCell<Node_>>);
pub struct WeakNode(Weak<RefCell<Node_>>);
struct Node_ {
parent: Option<WeakNode>,
children: Vec<Node>,
}
impl Node {
pub fn new() -> Self {
Node(Rc::new(RefCell::new(Node_ {
parent: None,
children: Vec::new(),
})))
}
pub fn add(&self, node: Node) {
// No need to remove from old parent; move semantics mean that must have
// already been done.
node.0.borrow_mut().parent = Some(WeakNode(Rc::downgrade(&self.0)));
self.0.borrow_mut().children.push(node);
}
// just to have something visually printed
pub fn to_str(&self) -> String {
let mut result_string = "[".to_string();
for child in self.0.borrow().children.iter() {
result_string.push_str(&format!("{},", child.to_str()));
}
result_string += "]";
result_string
}
}
and then the main function in main.rs:
mod dk_tree;
use crate::dk_tree::{Node};
fn main() {
let root = Node::new();
root.add(Node::new());
root.add(Node::new());
let inner_child = Node::new();
inner_child.add(Node::new());
inner_child.add(Node::new());
root.add(inner_child);
let result = root.to_str();
println!("{result:?}");
}
The reason I made WeakNode look more like Node is to make converting between the two easier.
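One possible way to use that symmetry is sketched below: a child reaches its parent by upgrading the weak handle. The name field, upgrade, downgrade, parent_name, and the WeakNode returned from add are hypothetical additions. Note that upgrade hands out a second strong handle, which relaxes the move-only idea for as long as that handle lives.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Node(Rc<RefCell<Node_>>);
pub struct WeakNode(Weak<RefCell<Node_>>);

struct Node_ {
    name: String, // hypothetical payload, added only to make the result visible
    parent: Option<WeakNode>,
    children: Vec<Node>,
}

impl WeakNode {
    // Some(node) while the target is alive, None once it has been dropped.
    pub fn upgrade(&self) -> Option<Node> {
        self.0.upgrade().map(Node)
    }
}

impl Node {
    pub fn new(name: &str) -> Self {
        Node(Rc::new(RefCell::new(Node_ {
            name: name.to_string(),
            parent: None,
            children: Vec::new(),
        })))
    }

    pub fn downgrade(&self) -> WeakNode {
        WeakNode(Rc::downgrade(&self.0))
    }

    // Takes ownership of the child and returns a weak handle for outside use.
    pub fn add(&self, node: Node) -> WeakNode {
        node.0.borrow_mut().parent = Some(self.downgrade());
        let handle = node.downgrade();
        self.0.borrow_mut().children.push(node);
        handle
    }

    pub fn parent_name(&self) -> Option<String> {
        let parent = self.0.borrow().parent.as_ref()?.upgrade()?;
        let name = parent.0.borrow().name.clone();
        Some(name)
    }
}

fn demo() -> (Option<String>, bool) {
    let root = Node::new("root");
    let child = root.add(Node::new("child"));
    let seen = child.upgrade().and_then(|c| c.parent_name());
    drop(root); // dropping the root also drops the child it owns
    (seen, child.upgrade().is_none())
}

fn main() {
    println!("{:?}", demo());
}
```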

Correct way to implement container-element relationship in idiomatic Rust

I know why Rust doesn't like my code. However, I don't know what would be the idiomatic Rust approach to the problem.
I'm a C# programmer, and while I feel I understand Rust's system, I think my "old" approach to some problems doesn't work in Rust at all.
This code reproduces the problem I'm having, and it probably doesn't look like idiomatic Rust (or maybe it doesn't even look good in C# either):
//a "global" container for the elements and some extra data
struct Container {
elements: Vec<Element>,
global_contextual_data: i32,
//... more contextual data fields
}
impl Container {
//this just calculates whatever I need based on the contextual data
fn calculate_contextual_data(&self) -> i32 {
//This function will end up using the elements vector and the other fields as well,
//and will do some wacky maths with it.
//That's why I currently have the elements stored in the container
}
}
struct Element {
element_data: i32,
//other fields
}
impl Element {
//I need to take a mutable reference to update element_data,
//and a reference to the container to calculate something that needs
//this global contextual data... including the other elements, as previously stated
fn update_element_data(&mut self, some_data: i32, container: &Container) {
self.element_data *= some_data + container.calculate_contextual_data() //do whatever maths I need
}
}
fn main(){
//let it be mutable so I can assign the elements later
let mut container = Container {
elements: vec![],
global_contextual_data: 1
};
//build a vector of elements
let elements = vec![
Element {
element_data: 5
},
Element {
element_data: 7
}
];
//this works
container.elements = elements;
//and this works, but container is now borrowed as mutable
for elem in container.elements.iter_mut() {
elem.element_data += 1; //and while this works
let some_data = 2;
//i can't borrow it as immutable here and pass to the other function
elem.update_element_data(some_data, &container);
}
}
I understand why elem.update_element_data(some_data, &container); won't work: I'm already borrowing it as mutable when I call iter_mut. Maybe each element should have a reference to the container? But then wouldn't I have more opportunities to break at borrow-checking?
I don't think it's possible to bring my old approach to this new system. Maybe I need to rewrite the whole thing. Can someone point me to the right direction? I've just started programming in Rust, and while the ownership system is making some sort of sense to me, the code I should write "around" it is still not that clear.
I came across this question: What's the Rust way to modify a structure within nested loops? It gave me insight into my problem.
I revisited the problem and boiled it down to sharing the vector by borrowing it for writes and reads at the same time. This is simply forbidden by Rust. I don't want to circumvent the borrow checker using unsafe. I was wondering, though, how much data should I copy?
My Element, which in reality is the entity of a game (I'm simulating a clicker game) has both mutable and immutable properties, which I broke apart.
struct Entity {
r#type: EntityType, // `type` is a keyword, so it needs the raw identifier form
starting_price: f64,
...
...
status: Cell<EntityStatus>
}
Every time I need to change the status of an entity, I need to call get and set methods on the status field. EntityStatus derives Clone, Copy.
I could even put the fields directly on the struct and have them all be Cells but then it would be cumbersome to work with them (lots of calls to get and set), so I went for the more aesthetically pleasant approach.
By allowing myself to copy the status, edit and set it back, I could borrow the array immutably twice (.iter() instead of .iter_mut()).
I was afraid that the performance would be bad due to the copying, but in reality it was pretty good once I compiled with opt-level=3. If it gets problematic, I might change the fields to be Cells or come up with another approach.
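The copy-out, edit, copy-back pattern described above could look roughly like this. The fields of EntityStatus, the pricing formula, and the helper functions are invented for illustration.

```rust
use std::cell::Cell;

#[derive(Clone, Copy, Debug, PartialEq)]
struct EntityStatus {
    owned: u32,
    price: f64,
}

struct Entity {
    starting_price: f64,        // immutable after construction
    status: Cell<EntityStatus>, // the only part that changes
}

fn entity(starting_price: f64) -> Entity {
    Entity {
        starting_price,
        status: Cell::new(EntityStatus { owned: 0, price: starting_price }),
    }
}

// Reads every entity while updating each one; both borrows are shared.
fn buy_one_of_each(entities: &[Entity]) {
    for entity in entities.iter() {
        // A second shared borrow of the same slice is fine.
        let total_owned: u32 = entities.iter().map(|e| e.status.get().owned).sum();
        let mut status = entity.status.get(); // copy out
        status.owned += 1;
        status.price = entity.starting_price * (1.0 + total_owned as f64);
        entity.status.set(status); // copy back
    }
}

fn main() {
    let entities = vec![entity(10.0), entity(20.0)];
    buy_one_of_each(&entities);
    for e in &entities {
        println!("{:?}", e.status.get());
    }
}
```

Cell only works this way because EntityStatus is Copy; a status holding a String or Vec would need RefCell or a different design.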
Just do the computation outside:
#[derive(Debug)]
struct Container {
elements: Vec<Element>
}
impl Container {
fn compute(&self) -> i32 {
return 42;
}
fn len(&self) -> usize {
return self.elements.len();
}
fn at_mut(&mut self, index: usize) -> &mut Element {
return &mut self.elements[index];
}
}
#[derive(Debug)]
struct Element {
data: i32
}
impl Element {
fn update(&mut self, data: i32, computed_data: i32) {
self.data *= data + computed_data;
}
}
fn main() {
let mut container = Container {
elements: vec![Element {data: 1}, Element {data: 3}]
};
println!("{:?}", container);
for i in 0..container.len() {
let computed_data = container.compute();
container.at_mut(i).update(2, computed_data);
}
println!("{:?}", container);
}
Another option is to add an update_element to your container:
#[derive(Debug)]
struct Container {
    elements: Vec<Element>
}
impl Container {
    fn compute(&self) -> i32 {
        let sum = self.elements.iter().map(|e| {e.data}).reduce(|a, b| {a + b});
        return sum.unwrap_or(0);
    }
    fn len(&self) -> usize {
        return self.elements.len();
    }
    fn at_mut(&mut self, index: usize) -> &mut Element {
        return &mut self.elements[index];
    }
    fn update_element(&mut self, index: usize, data: i32) {
        let computed_data = self.compute();
        self.at_mut(index).update(data, computed_data);
    }
}
#[derive(Debug)]
struct Element {
    data: i32
}
impl Element {
    fn update(&mut self, data: i32, computed_data: i32) {
        self.data *= data + computed_data;
    }
}
fn main() {
    let mut container = Container {
        elements: vec![Element {data: 1}, Element {data: 3}]
    };
    println!("{:?}", container);
    for i in 0..container.len() {
        let computed_data = container.compute();
        container.at_mut(i).update(2, computed_data);
    }
    println!("{:?}", container);
    for i in 0..container.len() {
        container.update_element(i, 2);
    }
    println!("{:?}", container);
}

How to take ownership of Any::downcast_ref from trait object?

I've met a conflict with Rust's ownership rules and a trait object downcast. This is a sample:
use std::any::Any;
trait Node{
    fn gen(&self) -> Box<Node>;
}
struct TextNode;
impl Node for TextNode{
    fn gen(&self) -> Box<Node>{
        Box::new(TextNode)
    }
}
fn main(){
    let mut v: Vec<TextNode> = Vec::new();
    let node = TextNode.gen();
    let foo = &node as &Any;
    match foo.downcast_ref::<TextNode>(){
        Some(n) => {
            v.push(*n);
        },
        None => ()
    };
}
The TextNode::gen method has to return Box<Node> instead of Box<TextNode>, so I have to downcast it to Box<TextNode>.
Any::downcast_ref's return value is Option<&T>, so I can't take ownership of the downcast result and push it to v.
====edit=====
As I am not good at English, my question is vague.
I am implementing (copying may be more precise) the template parser from the Go standard library.
What I really need is a vector, Vec<Box<Node>> or Vec<Box<Any>>, that can hold TextNode, NumberNode, ActionNode: any type of node that implements the Node trait can be pushed into it.
If every node type implements a copy method that returns Box<Any>, then downcasting to the concrete type works. But to copy a Vec<Box<Any>>, since you don't know the concrete type of each element, you have to check them one by one, which is really inefficient.
If the copy method returns Box<Node> instead, then copying a Vec<Box<Node>> is simple. But there seems to be no way to get the concrete type back from the trait object.
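If you control the Node trait, you can have gen return a Box<dyn Any> (written Box<Any> before the dyn keyword existed) and use the Box::downcast method, which consumes the box and hands back an owned Box<TextNode>.
It would look like this:
use std::any::Any;
trait Node {
    fn gen(&self) -> Box<dyn Any>; // downcast works on Box<dyn Any>
}
struct TextNode;
impl Node for TextNode {
    fn gen(&self) -> Box<dyn Any> {
        Box::new(TextNode)
    }
}
fn main() {
    let mut v: Vec<TextNode> = Vec::new();
    let node = TextNode.gen();
    // downcast returns Result<Box<TextNode>, Box<dyn Any>>
    if let Ok(n) = node.downcast::<TextNode>() {
        v.push(*n); // *n moves the TextNode out of the box
    }
}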
Generally speaking, you should not jump to using Any. I know it looks familiar when you come from a language with subtype polymorphism and want to recreate a hierarchy of types with some root type (as in this case: you're trying to recreate the "TextNode is a Node" relationship and create a Vec of Nodes). I did it too, and so did many others: I bet the number of SO questions on Any outnumbers the times Any is actually used on crates.io.
While Any does have its uses, in Rust it has alternatives.
In case you have not looked at them, I wanted to make sure you considered doing this with:
enums
Given different Node types you can express the "a Node is any of these types" relationship with an enum:
struct TextNode;
struct XmlNode;
struct HtmlNode;
enum Node {
    Text(TextNode),
    Xml(XmlNode),
    Html(HtmlNode),
}
With that you can put them all in one Vec and do different things depending on the variant, without downcasting:
let v: Vec<Node> = vec![
    Node::Text(TextNode),
    Node::Xml(XmlNode),
    Node::Html(HtmlNode),
];
for n in &v {
    match n {
        &Node::Text(_) => println!("TextNode"),
        &Node::Xml(_) => println!("XmlNode"),
        &Node::Html(_) => println!("HtmlNode"),
    }
}
Note that adding a variant means potentially changing your code in many places: the enum itself and all the functions that do something with the enum (to add the logic for the new variant). But then again, with Any it's mostly the same: all those functions might need to add the downcast to the new variant.
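With an enum, taking ownership of the concrete node needs no downcast at all: matching on the value moves the inner struct out. Here is a small sketch, assuming a hypothetical `text` field on TextNode and a hypothetical `collect_text` helper; neither appears in the question.

```rust
// Hypothetical node types; the `text` field is an assumption for illustration.
struct TextNode {
    text: String,
}
struct NumberNode;

enum Node {
    Text(TextNode),
    Number(NumberNode),
}

// Consumes the list and moves every TextNode out of it.
// Non-text variants are simply dropped.
fn collect_text(nodes: Vec<Node>) -> Vec<TextNode> {
    nodes
        .into_iter()
        .filter_map(|node| match node {
            Node::Text(t) => Some(t), // `t` is an owned TextNode
            Node::Number(_) => None,
        })
        .collect()
}

fn main() {
    let nodes = vec![
        Node::Text(TextNode { text: "hello".to_string() }),
        Node::Number(NumberNode),
        Node::Text(TextNode { text: "world".to_string() }),
    ];
    let texts = collect_text(nodes);
    for t in &texts {
        println!("{}", t.text);
    }
}
```

Because the match is exhaustive, adding a new variant to Node makes the compiler point at every match that still needs an arm for it.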
Trait objects (not Any)
You can try putting the actions you'd want to perform on the various types of nodes in the trait, so you don't need to downcast, but just call methods on the trait object.
This is essentially what you were doing, except putting the method on the Node trait instead of downcasting.
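A rough sketch of what that could look like. The `render` and `box_clone` method names, the struct fields, and the `render_all` and `copy_all` helpers are all assumptions for illustration, not taken from the question. `box_clone` is one common way to copy a Vec<Box<dyn Node>> without knowing the concrete types.

```rust
trait Node {
    // Behaviour every node provides (hypothetical method name).
    fn render(&self) -> String;
    // Lets a boxed node copy itself without the caller knowing its type.
    fn box_clone(&self) -> Box<dyn Node>;
}

#[derive(Clone)]
struct TextNode {
    text: String,
}

#[derive(Clone)]
struct NumberNode {
    value: i64,
}

impl Node for TextNode {
    fn render(&self) -> String {
        self.text.clone()
    }
    fn box_clone(&self) -> Box<dyn Node> {
        Box::new(self.clone())
    }
}

impl Node for NumberNode {
    fn render(&self) -> String {
        self.value.to_string()
    }
    fn box_clone(&self) -> Box<dyn Node> {
        Box::new(self.clone())
    }
}

// Calls the trait method on each node; no downcasting involved.
fn render_all(nodes: &[Box<dyn Node>]) -> String {
    nodes.iter().map(|n| n.render()).collect::<Vec<_>>().join(" ")
}

// Copies the whole list through the trait object.
fn copy_all(nodes: &[Box<dyn Node>]) -> Vec<Box<dyn Node>> {
    nodes.iter().map(|n| n.box_clone()).collect()
}

fn main() {
    let nodes: Vec<Box<dyn Node>> = vec![
        Box::new(TextNode { text: "answer:".to_string() }),
        Box::new(NumberNode { value: 42 }),
    ];
    let copy = copy_all(&nodes);
    println!("{}", render_all(&copy));
}
```

The trade-off is that every operation you need has to be expressible as a method on the trait; code that truly needs the concrete type still has to fall back to an enum or to Any.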
A (more) idiomatic way to solve the problem:
use std::any::Any;
pub trait Nodeable {
    fn as_any(&self) -> &dyn Any;
}
#[derive(Clone, Debug)]
struct TextNode {}
impl Nodeable for TextNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}
fn main() {
    let mut v: Vec<Box<dyn Nodeable>> = Vec::new();
    let node = TextNode {}; // or impl TextNode::new
    v.push(Box::new(node));
    // the downcast back to TextNode could be solved like this:
    if let Some(b) = v.pop() { // only if we have a node…
        // downcast_ref returns Option<&TextNode>; unwrap panics on a type mismatch
        let n = (*b).as_any().downcast_ref::<TextNode>().unwrap(); // this is safe here *)
        println!("{:?}", n);
    };
}
*) This is safe here because the only Nodeable pushed into the vector is a TextNode, so the unwrap cannot fail. Only types that implement Nodeable can end up in the vector, but once more than one of them does, handle the None case instead of unwrapping.
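Note that downcast_ref only yields a reference. If you need an owned TextNode, as in the question, one option is a companion method that consumes the box. The sketch below assumes a hypothetical `into_any` trait method, a hypothetical `text` field and a hypothetical `take_text` helper; `Box<dyn Any>::downcast` is the standard library call that does the actual conversion.

```rust
use std::any::Any;

trait Node {
    // Consumes the boxed node and hands it back as Box<dyn Any>
    // (hypothetical method name, not part of the question's trait).
    fn into_any(self: Box<Self>) -> Box<dyn Any>;
}

#[derive(Debug, PartialEq)]
struct TextNode {
    text: String,
}
struct NumberNode;

impl Node for TextNode {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

impl Node for NumberNode {
    fn into_any(self: Box<Self>) -> Box<dyn Any> {
        self
    }
}

// Returns the owned TextNode if the box holds one, otherwise None.
fn take_text(node: Box<dyn Node>) -> Option<TextNode> {
    node.into_any()
        .downcast::<TextNode>() // Result<Box<TextNode>, Box<dyn Any>>
        .ok()
        .map(|boxed| *boxed) // move the value out of the box
}

fn main() {
    let nodes: Vec<Box<dyn Node>> = vec![
        Box::new(TextNode { text: "hi".to_string() }),
        Box::new(NumberNode),
    ];
    let mut v: Vec<TextNode> = Vec::new();
    for node in nodes {
        if let Some(text) = take_text(node) {
            v.push(text);
        }
    }
    for t in &v {
        println!("{}", t.text);
    }
}
```

Returning an Option rather than unwrapping keeps the helper usable when the vector mixes several node types.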
