When to use a field or a method in Rust structs?

Being quite new to Rust and programming in general, I have a basic question that is probably easily answered. Let's say I have a rectangle object, which has a width and a height, like so:
struct Rectangle1 {
    width: u32,
    height: u32,
}
I can create such an object with a new method:
fn new(width: u32, height: u32) -> Self {
    Rectangle1 {
        width,
        height,
    }
}
Now, let's say I want to use that object later and need its area. What is better practice: should the area be a field or a method?
So either I just implement this:
fn area(&self) -> u32 {
    self.width * self.height
}
Or, since area is inherent to that object, I give the object an "area" field:
struct Rectangle2 {
    width: u32,
    height: u32,
    area: u32,
}
And implement the new method like this instead:
fn new(width: u32, height: u32) -> Self {
    Rectangle2 {
        width,
        height,
        area: width * height,
    }
}
Now somewhere else in the code, when the area is needed:
let rect1 = Rectangle1::new(30, 50);
let rect2 = Rectangle2::new(30, 50);
println!(
    "The area of the rectangle 1 is {} square pixels.",
    rect1.area()
);
println!(
    "The area of the rectangle 2 is {} square pixels.",
    rect2.area
);
In this simple example I can't see when one would be better than the other. What should be preferred anyway? Is there one way that is less common because of something I am not aware of?

TL;DR: it depends on the use case.
IMHO
This becomes opinionated very fast. Generally speaking, your answer will be driven by use cases.
On modern systems, experience says that it's better to keep dependent values as methods and to cache the result only as a special-case optimization.
Example 1:
Width and height remain constant over the life of the Rectangle; pre-calculating may be useful. (Consider having to do it for 10^6 rectangles, for example.)
Example 2:
Your width and height get modified... then do you precalculate? Do you cache the result?
Example 3:
You are constantly updating the third value whenever the user updates either of the other two :-)
@prog-fh commented that accessing memory is expensive. I would say that you have to consider that both ways: fetching two values into the CPU and multiplying them can potentially be more expensive than accessing exactly one pre-calculated value.
So IMHO, and in line with what everyone says:
It depends :-)
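To make the trade-off concrete, here is a minimal sketch. The names CachedRectangle and set_width are illustrative and do not come from the question. Once the area is stored as a field, every mutation has to go through a setter that refreshes it, otherwise the cached value goes stale; the method version has no such invariant to maintain.

```rust
// Variant 1: the area is computed on demand, so it can never be out of date.
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

// Variant 2 (hypothetical): the area is cached in a field.
// All changes are meant to go through `set_width`, which refreshes the cache.
struct CachedRectangle {
    width: u32,
    height: u32,
    area: u32,
}

impl CachedRectangle {
    fn new(width: u32, height: u32) -> Self {
        CachedRectangle { width, height, area: width * height }
    }

    fn set_width(&mut self, width: u32) {
        self.width = width;
        self.area = self.width * self.height; // keep the cache in sync
    }

    fn area(&self) -> u32 {
        self.area
    }
}

fn main() {
    let mut plain = Rectangle { width: 30, height: 50 };
    plain.width = 10; // direct mutation is fine: area() recomputes
    println!("plain area: {}", plain.area());

    let mut cached = CachedRectangle::new(30, 50);
    cached.set_width(10); // mutation goes through the setter
    println!("cached area: {}", cached.area());
}
```

Which variant is cheaper depends on how often the dimensions change compared with how often the area is read, which is the "it depends" above.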

Related

Best way in Rust to count leaves in a binary search tree?

I'm developing a basic implementation of a binary search tree in Rust. I was creating a method for counting leaves, but ran into some very strange looking code to get it to work. I wanted to clarify if the way I did it is:
Considered appropriate by Rust standards/convention
Efficient
I'm using an enum that differentiates between a node or nothing being present:
pub enum BST<T: Ord> {
    Node {
        value: T, // template with type T
        left: Box<BST<T>>,
        right: Box<BST<T>>,
    },
    Empty,
}
Now, count_leaves(&self) is first checking if the provided type is either a Node or Empty. If it's Empty, I can just return 0, but if it's a valid Node then I need to check if the left and right children are Empty. If so, then I can return a 1 because I'm at a leaf.
pub fn count_leaves(&self) -> u32 {
    match self {
        BST::Node {
            value: _,
            ref left,
            ref right,
        } => {
            match (&**left, &**right) {
                (BST::Empty, BST::Empty) => 1,
                _ => {
                    left.count_leaves() + right.count_leaves()
                }
            }
        },
        BST::Empty => 0
    }
}
So, to check if both left and right are BST::Empty, I wanted to use a tuple! But in doing so, Rust tries to move both left and right into the tuple. Since my type BST<T> does not implement the Copy trait, this is not possible. Also, since left and right are both boxes and borrowed, something simply like this is not possible:
match (left, right) {
    BST::Empty => {},
    _ => {}
}
In order to use this tuple, it looks like I need to first dereference the borrowed box using *, then dereference that box again into its type using a second *, and then finally borrow using & to avoid a move. This gives the weird looking (&**left, &**right).
From my testing this works, but I thought it looked really strange. Should I rewrite this in a more readable way (if there is one)?
I've considered using Option<> instead of the enum with the Node and Empty, but I wasn't sure if that would lead to anything more readable or more efficient.
Thanks!
EDIT:
Just wanted to clarify that when I say leaves I mean a node in the tree with no children, not a non-empty node.
You're just overthinking it. You already have a base case for when a node is empty so you don't need both matches. When possible you want to ignore the boxes in favor of implicitly using Deref to perform operations on them.
pub fn count_leaves(&self) -> u32 {
    match self {
        BST::Node { left, right, .. } => 1 + left.count_leaves() + right.count_leaves(),
        BST::Empty => 0,
    }
}
By manually checking whether both sides are empty before calling count_leaves on them, you might actually be decreasing performance. A recursive function call (or any function call, really) can be very cheap since your code is already at the processor. However, it takes a (very tiny) amount of time for a processor to read a value through a pointer, so ideally you only need to do it once per value. That said, the compiler is made of eldritch sorcery, so it will probably figure out the best way to optimize your code either way. Another option that may help is to add an #[inline] hint to the function; it is only a hint, and the compiler decides whether unrolling the recursive call one or more times would actually help performance.
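Note that the one-line match shown above adds 1 for every Node, so it returns the number of nodes in the tree. If a leaf means a node with no children, as the question's edit specifies, one possible sketch keeps the emptiness check but hides the Box behind a small helper, relying on method-call auto-dereferencing instead of &**. The helper name is_empty and the free function leaf are hypothetical and not part of the original post.

```rust
pub enum BST<T: Ord> {
    Node {
        value: T,
        left: Box<BST<T>>,
        right: Box<BST<T>>,
    },
    Empty,
}

impl<T: Ord> BST<T> {
    // Hypothetical helper: true for the Empty variant.
    fn is_empty(&self) -> bool {
        matches!(self, BST::Empty)
    }

    pub fn count_leaves(&self) -> u32 {
        match self {
            BST::Empty => 0,
            // `left.is_empty()` auto-dereferences through the Box,
            // so no explicit `&**left` is needed.
            BST::Node { left, right, .. } if left.is_empty() && right.is_empty() => 1,
            BST::Node { left, right, .. } => left.count_leaves() + right.count_leaves(),
        }
    }
}

// Hypothetical convenience constructor for a node without children.
fn leaf(value: i32) -> BST<i32> {
    BST::Node {
        value,
        left: Box::new(BST::Empty),
        right: Box::new(BST::Empty),
    }
}

fn main() {
    // A root (2) with two childless children (1 and 3): two leaves, three nodes.
    let tree = BST::Node {
        value: 2,
        left: Box::new(leaf(1)),
        right: Box::new(leaf(3)),
    };
    println!("leaves: {}", tree.count_leaves());
}
```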
You may find it helpful to change the structure of your BST. Because your tree is an enum, it needs to be matched every time you perform any operation on it.
use std::cmp::Ordering;

pub struct BST<T> {
    left: Option<Box<BST<T>>>,
    right: Option<Box<BST<T>>>,
    data: T,
}
impl<T> BST<T> {
    pub fn new_root(data: T) -> Self {
        BST {
            left: None,
            right: None,
            data,
        }
    }
    // As written, this counts this node plus everything below it,
    // i.e. every node in the subtree.
    pub fn count_leaves(&self) -> u64 {
        let left_leaves = self.left.as_ref().map_or(0, |x| x.count_leaves());
        let right_leaves = self.right.as_ref().map_or(0, |x| x.count_leaves());
        left_leaves + right_leaves + 1
    }
}
impl<T: Ord> BST<T> {
    pub fn insert(&mut self, data: T) {
        let side = match self.data.cmp(&data) {
            Ordering::Less | Ordering::Equal => &mut self.left,
            Ordering::Greater => &mut self.right,
        };
        if let Some(node) = side {
            node.insert(data);
        } else {
            *side = Some(Box::new(Self::new_root(data)));
        }
    }
}
Now this works well, but it also introduces a new problem that I'm guessing you were attempting to avoid with your solution: you can't create an empty BST<T>. This may make initializing your program difficult. We can fix this by using a small wrapper struct (e.g. pub struct BinarySearchTree<T>(Option<BST<T>>)). This is also what std::collections::LinkedList does. You may also be surprised to learn that this cuts our memory footprint in half compared to the original post. That is because Empty requires just as much space as Node, so the original design has to allocate the entire next layer of the tree even though it is never used.
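A minimal sketch of such a wrapper might look like the following. The struct names follow the answer; the methods new, insert, and len on the wrapper, and len on the node, are assumptions added for illustration.

```rust
use std::cmp::Ordering;

struct BST<T> {
    left: Option<Box<BST<T>>>,
    right: Option<Box<BST<T>>>,
    data: T,
}

impl<T: Ord> BST<T> {
    fn new_root(data: T) -> Self {
        BST { left: None, right: None, data }
    }

    // Same placement rule as the answer's insert.
    fn insert(&mut self, data: T) {
        let side = match self.data.cmp(&data) {
            Ordering::Less | Ordering::Equal => &mut self.left,
            Ordering::Greater => &mut self.right,
        };
        if let Some(node) = side {
            node.insert(data);
        } else {
            *side = Some(Box::new(Self::new_root(data)));
        }
    }

    // Number of nodes in this subtree (hypothetical helper).
    fn len(&self) -> usize {
        1 + self.left.as_ref().map_or(0, |n| n.len())
            + self.right.as_ref().map_or(0, |n| n.len())
    }
}

// The wrapper: None represents the empty tree.
pub struct BinarySearchTree<T>(Option<BST<T>>);

impl<T: Ord> BinarySearchTree<T> {
    pub fn new() -> Self {
        BinarySearchTree(None)
    }

    pub fn insert(&mut self, data: T) {
        if let Some(root) = self.0.as_mut() {
            root.insert(data);
        } else {
            self.0 = Some(BST::new_root(data));
        }
    }

    pub fn len(&self) -> usize {
        self.0.as_ref().map_or(0, |root| root.len())
    }
}

fn main() {
    let mut tree = BinarySearchTree::new();
    println!("empty tree has {} nodes", tree.len());
    for value in [5, 2, 8] {
        tree.insert(value);
    }
    println!("after three inserts: {} nodes", tree.len());
}
```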

How to manipulate a content variable from inside struct's function?

I'm new to Rust. I followed the tutorial from their doc here. In Listing 5-13, we have an implementation of struct that prints an area.
My question is how to manipulate self.width or self.height from within the struct's function, such as this to be possible:
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}
impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
    // I need to use mut self instead of &self so I can access self variable here
    // but since I pass the ownership into this function, the println!
    // throws an error: "value borrowed here after move"
    // but if I don't do that, set_width cannot manipulate the width
    fn set_width(mut self, width: u32) {
        self.width = width;
    }
}
fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };
    rect1.set_width(50);
    // value borrowed here after move error
    println!("The area of the rectangle is {} square pixels.", rect1.area());
}
Taking self, whether mut or not, means the method consumes self. If you want to modify the structure in place, you need &mut self. In fact, this is covered by the text below Listing 5-13:
We’ve chosen &self here for the same reason we used &Rectangle in the function version: we don’t want to take ownership, and we just want to read the data in the struct, not write to it. If we wanted to change the instance that we’ve called the method on as part of what the method does, we’d use &mut self as the first parameter. Having a method that takes ownership of the instance by using just self as the first parameter is rare; this technique is usually used when the method transforms self into something else and you want to prevent the caller from using the original instance after the transformation.
(emphasis mine)
&mut was covered in the previous section, 4.2.
self is both magical and not: it is syntactic sugar for self: Self (likewise &self for self: &Self and &mut self for self: &mut Self) and allows calling the method on an instance (inst.method()), but that aside it follows the normal Rust rules with respect to ownership and borrowing.
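Applied to the code in the question, the change is small: set_width takes &mut self, and the binding in main is declared mut so that it can be borrowed mutably. A sketch of the adjusted program:

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Reads only, so a shared borrow is enough.
    fn area(&self) -> u32 {
        self.width * self.height
    }

    // Mutates in place, so it borrows the instance mutably
    // instead of taking ownership of it.
    fn set_width(&mut self, width: u32) {
        self.width = width;
    }
}

fn main() {
    // `mut` is required on the binding to call a `&mut self` method.
    let mut rect1 = Rectangle {
        width: 30,
        height: 50,
    };
    rect1.set_width(50);
    // rect1 was only borrowed, so it is still usable here.
    println!("The area of the rectangle is {} square pixels.", rect1.area());
}
```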

How can I read from a component and write to a new entity with that same component in Rust SPECS?

I have an entity that is generating other entities. For example, the generator has a position component, and I want the generated entity to have the same position as the generator.
In the generating system, it seems that I need to both read and write a component, which doesn't sound possible. The only option seems to be a LazyUpdate, but I would like to avoid this because it requires a call to World::maintain, and I want to use the generated entity in another system within the same frame.
My code for the system:
#[derive(Debug)]
struct Position {
    x: f32, // meters
    y: f32,
    z: f32,
    direction: f32,
}
impl Component for Position {
    type Storage = VecStorage<Self>;
}
struct GenerateEntity;
impl<'a> System<'a> for GenerateEntity {
    type SystemData = (
        ReadStorage<'a, Generator>,
        ReadStorage<'a, Position>,
        Entities<'a>,
    );
    fn run(&mut self, (gen, pos, entities): Self::SystemData) {
        for (pos, gen) in (&pos, &gen).join() {
            let generated = entities.create();
            // This gives an error because position can only be read
            pos.insert(generated, pos.clone());
        }
    }
}
How do I get around this problem?
"it would seem that I would need to both read and write a component, which doesn't sound possible"
Sure it is: use a WriteStorage.
The name is slightly misleading. WriteStorage isn't write-only storage; it's mutable storage, which includes reading.
The only issue is that you will likely not be able to insert into the position storage while you are iterating over it. You'd need to store the changes you want to make during the loop and apply them afterwards.
(Also, as the comments point out, you should rename the pos in your loop, which refers to a single component, so that it doesn't shadow the pos you take as an argument, which refers to the entire storage.)
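The "collect during the loop, apply afterwards" step can be illustrated without SPECS at all. The sketch below is a std-only model: a HashMap stands in for the Position storage and plain u32 ids stand in for entities, so none of the names here (World, generate_entities, next_id) are SPECS API. The assumption is that the same two-phase shape would carry over to a WriteStorage<Position>.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Position {
    x: f32,
    y: f32,
}

// Stand-in for the ECS world: `positions` models the component storage,
// `generators` the entities that spawn others, `next_id` entity allocation.
struct World {
    positions: HashMap<u32, Position>,
    generators: Vec<u32>,
    next_id: u32,
}

fn generate_entities(world: &mut World) {
    // Phase 1: only read from the storage and remember what to insert.
    let mut pending: Vec<(u32, Position)> = Vec::new();
    for gen_id in &world.generators {
        if let Some(pos) = world.positions.get(gen_id) {
            pending.push((world.next_id, pos.clone()));
            world.next_id += 1;
        }
    }

    // Phase 2: the iteration is over, so the storage can be mutated.
    for (id, pos) in pending {
        world.positions.insert(id, pos);
    }
}

fn main() {
    let mut world = World {
        positions: HashMap::new(),
        generators: vec![0],
        next_id: 1,
    };
    world.positions.insert(0, Position { x: 1.0, y: 2.0 });

    generate_entities(&mut world);
    println!("generated entity position: {:?}", world.positions.get(&1));
}
```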

How to convert &Vector<Mutex> to Vector<Mutex>

I'm working my way through the Rust examples. There is this piece of code:
fn new(name: &str, left: usize, right: usize) -> Philosopher {
    Philosopher {
        name: name.to_string(),
        left: left,
        right: right,
    }
}
What is the best way to adapt this to a vector? This works:
fn new(v: Vec<Mutex<()>>) -> Table {
    Table {
        forks: v
    }
}
Then I tried the following:
fn new(v: &Vec<Mutex<()>>) -> Table {
    Table {
        forks: v.to_vec()
    }
}
But that gives me:
the trait `core::clone::Clone` is not implemented
for the type `std::sync::mutex::Mutex<()>`
Which makes sense. But what must I do if I want to pass a reference to Table and do not want to store a reference inside the Table struct?
The error message actually explains a lot here. When you call to_vec on a &Vec<_>, you have to make a clone of the entire vector, because a Vec owns its data while a reference does not. Since the vector owns all the items inside it, cloning the vector also means cloning every item.
However, your vector contains a Mutex, which cannot be cloned. A mutex represents unique access to some data, so having two separate mutexes to the same data would be pointless.
Instead, you probably want to share references to the mutex, not clone it completely. Chances are, you want an Arc:
use std::sync::{Arc, Mutex};
fn main() {
    let things = vec![Arc::new(Mutex::new(()))];
    things.to_vec();
}
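Carried over to the Table from the question, this might look like the sketch below. It assumes Table has a single forks field, as the question's constructor suggests; the slice parameter is a choice made here, not something the original code used. Cloning the vector now only clones the Arc handles, so both vectors refer to the same mutexes.

```rust
use std::sync::{Arc, Mutex};

struct Table {
    forks: Vec<Arc<Mutex<()>>>,
}

impl Table {
    // Borrows the caller's forks and stores its own Vec of Arc handles.
    // `to_vec` clones each Arc (a reference-count bump), not the Mutex.
    fn new(forks: &[Arc<Mutex<()>>]) -> Table {
        Table {
            forks: forks.to_vec(),
        }
    }
}

fn main() {
    let forks = vec![Arc::new(Mutex::new(())), Arc::new(Mutex::new(()))];
    let table = Table::new(&forks);

    // The caller still owns `forks`, and both vectors share the mutexes.
    println!("same mutex: {}", Arc::ptr_eq(&forks[0], &table.forks[0]));
    println!("handles to fork 0: {}", Arc::strong_count(&forks[0]));
}
```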

Mutable self while reading from owner object

I have one object that owns another. The owned object has a mutating method that depends on non-mutating methods of its owner. The architecture (simplified as much as possible) looks like this:
struct World {
    animals: Vec<Animal>,
}
impl World {
    fn feed_all(&mut self) {
        for i in 0..self.animals.len() {
            self.animals[i].feed(self);
        }
    }
}
struct Animal {
    food: f32,
}
impl Animal {
    fn inc_food(&mut self) {
        self.food += 1.0;
    }
    fn feed(&mut self, world: &World) {
        // Imagine this is a much more complex calculation, involving many
        // queries to world.animals, several loops, and a bunch of if
        // statements. In other words, something so complex it can't just
        // be moved outside feed() and pass its result in as a pre-computed value.
        for other_animal in world.animals.iter() {
            self.food += 10.0 / (other_animal.food + self.food);
        }
    }
}
fn main() {
    let mut world = World {
        animals: Vec::with_capacity(1),
    };
    world.animals.push(Animal { food: 0.0 });
    world.feed_all();
}
The above does not compile. The compiler says:
error[E0502]: cannot borrow `*self` as immutable because `self.animals` is also borrowed as mutable
 --> src/main.rs:8:34
  |
8 |             self.animals[i].feed(self);
  |             ------------         ^^^^- mutable borrow ends here
  |             |                    |
  |             |                    immutable borrow occurs here
  |             mutable borrow occurs here
I understand why that error occurs, but what is the idiomatic Rust way to do this?
Just to be clear, the example code is not real. It's meant to present the core problem as simply as possible. The real application I'm writing is much more complex and has nothing to do with animals and feeding.
Assume it is not practical to pre-compute the food value before the call to feed(). In the real app, the method that's analogous to feed() makes many calls to the World object and does a lot of complex logic with the results.
You'd want to compute the argument first in a form that doesn't alias self, then pass that in. As it stands, it seems a little strange that an animal decides how much food it's going to eat by looking at every other animal... regardless, you could add a method Animal::decide_feed_amount(&self, world: &World) -> f32. You can call that safely (&self and &World are both immutable, so that's OK), store the result in a variable, then pass that to Animal::feed.
Edit to address your Edit: well, you're kinda screwed, then. Rust's borrow checker is not sophisticated enough to prove that the mutations you make to the Animal cannot possibly interfere with any possible immutable access to the containing World. Some things you can try:
Do a functional-style update. Make a copy of the Animal you want to update so that it has its own lifetime, update it, then overwrite the original. If you duplicate the whole array up front, this gives you what is effectively an atomic update of the whole array.
As someone who worked on a simulator for like half a decade, I wish I'd done something like that instead of mutating updates. sigh
Change to Vec<Option<Animal>> which will allow you to move (not copy) an Animal out of the array, mutate it, then put it back (see std::mem::replace). Downside is that now everything has to check to see if there's an animal in each position of the array.
Put the Animals inside Cells or RefCells, which will allow you to mutate them from immutable references. It does this by performing dynamic borrow checking which is infinitely slower (no checks vs. some checks), but is still "safe".
Absolute last resort: unsafe. But really, if you do that, you're throwing all your memory safety guarantees out the window, so I wouldn't recommend it.
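As an illustration of the Vec<Option<Animal>> option from the list above, here is a minimal sketch. It uses Option::take, which is equivalent to std::mem::replace(slot, None). One behavioral difference from the question's loop is an assumption worth flagging: while an animal is moved out, its slot is None, so it does not see itself when iterating over world.animals.

```rust
struct Animal {
    food: f32,
}

struct World {
    animals: Vec<Option<Animal>>,
}

impl Animal {
    fn feed(&mut self, world: &World) {
        // `flatten` skips empty slots, including the one left behind
        // by the animal currently being fed.
        for other_animal in world.animals.iter().flatten() {
            self.food += 10.0 / (other_animal.food + self.food);
        }
    }
}

impl World {
    fn feed_all(&mut self) {
        for i in 0..self.animals.len() {
            // Move the animal out of the vector, leaving None in its slot.
            if let Some(mut animal) = self.animals[i].take() {
                // The animal is now an independent value, so `self`
                // can be borrowed immutably while it is mutated.
                animal.feed(self);
                // Put it back where it came from.
                self.animals[i] = Some(animal);
            }
        }
    }
}

fn main() {
    let mut world = World {
        animals: vec![Some(Animal { food: 1.0 }), Some(Animal { food: 2.0 })],
    };
    world.feed_all();
    for animal in world.animals.iter().flatten() {
        println!("food: {}", animal.food);
    }
}
```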
In summary: Rust is doing the right thing by refusing to compile what I wrote. There's no way to know at compile time that I won't invalidate the data I'm using. If I get a mutable pointer to one animal, the compiler can't know that my read-only access to the vector isn't invalidated by my mutations to that particular animal.
Because this can't be determined at compile time, we need some kind of runtime check, or we need to use unsafe operations to bypass the safety checks altogether.
RefCell is the way to go if we want safety at the cost of runtime checks. UnsafeCell is at least one option to solve this without the overhead, at the cost of safety of course.
I've concluded that RefCell is preferable in most cases. The overhead should be minimal. That's especially true if we're doing anything even moderately complex with the values once we obtain them: The cost of the useful operations will dwarf the cost of RefCell's checks. While UnsafeCell might be a little faster, it invites us to make mistakes.
Below is an example program solving this class of problem with RefCell. Instead of animals and feeding, I chose players, walls, and collision detection. Different scenery, same idea. This solution is generalizable to a lot of very common problems in game programming. For example:
A map composed of 2D tiles, where the render state of each tile depends on its neighbors. E.g. grass next to water needs to render a coast texture. The render state of a given tile updates when that tile or any of its neighbors changes.
An AI declares war against the player if any of the AI's allies are at war with the player.
A chunk of terrain is calculating its vertex normals, and it needs to know the vertex positions of the neighboring chunks.
Anyway, here's my example code:
use std::cell::RefCell;
struct Vector2 {x: f32, y: f32}
impl Vector2 {
    fn add(&self, other: &Vector2) -> Vector2 {
        Vector2 {x: self.x + other.x, y: self.y + other.y}
    }
}
struct World {
    players: Vec<RefCell<Player>>,
    walls: Vec<Wall>
}
struct Wall;
impl Wall {
    fn intersects_line_segment(&self, start: &Vector2, stop: &Vector2) -> bool {
        // Pretend this actually does a computation.
        false
    }
}
struct Player {position: Vector2, velocity: Vector2}
impl Player {
    fn collides_with_anything(&self, world: &World, start: &Vector2, stop: &Vector2) -> bool {
        for wall in world.walls.iter() {
            if wall.intersects_line_segment(start, stop) {
                return true;
            }
        }
        for cell in world.players.iter() {
            // A shared borrow is enough here, since we only read the other player.
            match cell.try_borrow() {
                Ok(player) => {
                    if player.intersects_line_segment(start, stop) {
                        return true;
                    }
                },
                // We don't want to collision detect against this player. Nor can we,
                // because we've already mutably borrowed this player. So its RefCell
                // will return Err.
                Err(_) => {}
            }
        }
        false
    }
    fn intersects_line_segment(&self, start: &Vector2, stop: &Vector2) -> bool {
        // Pretend this actually does a computation.
        false
    }
    fn update_position(&mut self, world: &World) {
        let new_position = self.position.add(&self.velocity);
        if !Player::collides_with_anything(self, world, &self.position, &new_position) {
            self.position = new_position;
        }
    }
}
fn main() {
    let world = World {
        players: vec!(
            RefCell::new(
                Player {
                    position: Vector2 { x: 0.0, y: 0.0},
                    velocity: Vector2 { x: 1.0, y: 1.0}
                }
            ),
            RefCell::new(
                Player {
                    position: Vector2 { x: 1.1, y: 1.0},
                    velocity: Vector2 { x: 0.0, y: 0.0}
                }
            )
        ),
        walls: vec!(Wall, Wall)
    };
    for cell in world.players.iter() {
        // Holds the mutable borrow of this player for the rest of the iteration.
        let mut player = cell.borrow_mut();
        player.update_position(&world);
    }
}
The above could be altered to use UnsafeCell with very few changes. But again, I think RefCell is preferable in this case and in most others.
Thanks to @DK for putting me on the right track to this solution.
