Passing structures as arguments in Rust - struct

I am building a graphics engine from scratch in Rust.
I have a struct Point:
struct Point {
x: i32,
y: i32
}
and a triangle struct, which is just three points:
struct Triangle (Point,Point,Point)
.
I made a trait Engine that I will later implement with SDL
trait Engine {
fn draw_triangle(&self,tri:Triangle){
self.draw_line(tri.0,tri.1);
self.draw_line(tri.1,tri.2); // <-
self.draw_line(tri.2,tri.0);
}
fn draw_line(&self,p1:Point,p2:Point)
}
I get an error on line with an arrow: use of moved value tri.1, value used after move.
I know this has something to do with references and ownership, and have experimented changing things, but I just don't know what I am doing.
I have searched and searched, but to no avail: I cannot comprehend.
Can anyone tell me why this is not working? More than a solution, I wish to understand

On the first draw_line call, you pass two points tri.0 and tri.1. This "moves" the variables into the draw_line function, and they cannot be used afterwards. There are two ways to get around doing this.
Firstly, you can copy (or clone) the points when calling the functions by simply adding #[derive(Clone, Copy)] above the struct Point line. This allows Points to be either cloned explicitly (by using the .clone() method) or copied implicitly. This means that the data will be copied every time you pass it into a function, which in this case isn't a big deal. However, for larger structs, this will want to be avoided and as such you can use the following method.
Secondly, you can pass the Points as a reference instead of by value. This can be accomplished by changing the draw_line signature to accept a Point reference as follows:
fn draw_line(&self, p1: &Point, p2: &Point);
However, this means you now also need to pass the Points by reference, leading to the following trait definition:
trait Engine {
fn draw_triangle(&self, tri: Triangle) {
self.draw_line(&tri.0, &tri.1);
self.draw_line(&tri.1, &tri.2);
self.draw_line(&tri.2, &tri.0);
}
fn draw_line(&self, p1: &Point, p2: &Point);
}
This leads to the Points themselves not being passed into the function, but instead a reference to a single Point - meaning the structs don't need to be copied six times (two for each function call). If this method was called a lot, or the structs were larger, this could be a significant speedup.
The following chapter of the rust book provides a thorough explanation of ownership, references, and borrowing: https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html.

Related

Does Rust's borrow checker really mean that I should re-structure my program?

So I've read Why can't I store a value and a reference to that value in the same struct? and I understand why my naive approach to this was not working, but I'm still very unclear how to better handle my situation.
I have a program I wanted to structure like follows (details omitted because I can't make this compile anyway):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : Vec<Species>,
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new() -> Self {...} //I don't think there's any way to implement this
pub fn run(&self) {...}
}
The idea is that I create a vector of Species (which won't change for the lifetime of Simulation, except in specific mutex-guarded fields) and then a grid representing which species live where, which will change freely. This implementation won't work, at least not any way I've been able to figure out. As I understand it, the issue is that pretty much however I make my new method, the moment it returns, all of the references in grid would becomine invalid as Simulation and therefor Simulation.species are moved to another location in the stack. Even if I could prove to the compiler that species and its contents would continue to exist, they actually won't be in the same place. Right?
I've looked into various ways around this, such as making species as an Arc on the heap or using usizes instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse. What I'm starting to think is that I need to really re-structure my code to look something like this (details filled in with placeholders because now it actually runs):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : &'a Vec<Species>, //Now just holds a reference rather than data
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new(species : &'a Vec <Species>) -> Self { //has to be given pre-created species
let grid = vec!(species.first().unwrap(); 10);
Self{species, grid}
}
pub fn run(&self) {
let mut population = self.grid[0].population.lock().unwrap();
println!("Population: {}", population);
*population += 1;
}
}
pub fn top_level(){
let species = vec![Species{index: 0, population : Mutex::new(0_)}];
let simulation = Simulation::new(&species);
simulation.run();
}
As far as I can tell this runs fine, and ticks off all the ideal boxes:
grid uses simple references with minimal boilerplate for me
these references are checked at compile time with minimal overhead for the system
Safety is guaranteed by the compiler (unlike a custom map based approach)
But, this feels very weird to me: the two-step initialization process of creating owned memory and then references can't be abstracted any way that I can see, which feels like I'm exposing an implementation detail to the calling function. top_level has to also be responsible for establishing any other functions or (scoped) threads to run the simulation, call draw/gui functions, etc. If I need multiple levels of references, I believe I will need to add additional initialization steps to that level.
So, my question is just "Am I doing this right?". While I can't exactly prove this is wrong, I feel like I'm losing a lot of near-universal abstraction of the call structure. Is there really no way to return species and simulation as a pair at the end (with some one-off update to make all references point to the "forever home" of the data).
Phrasing my problem a second way: I do not like that I cannot have a function with a signature of ()-> Simulation, when I can can have a pair of function calls that have that same effect. I want to be able to encapsulate the creation of this simulation. I feel like the fact that this approach cannot do so indicates I'm doing something wrong, and that there may be a more idiomatic approach I'm missing.
I've looked into various ways around this, such as making species as an Arc on the heap or using usizes instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse.
Don't assume that, test it. I once had a self-referential (using ouroboros) structure much like yours, with a vector of things and a vector of references to them. I tried rewriting it to use indices instead of references, and it was faster.
Rc/Arc is also an option worth trying out — note that there is only an extra cost to the reference counting when an Arc is cloned or dropped. Arc<Species> doesn't cost any more to dereference than &Species, and you can always get an &Species from an Arc<Species>. So the reference counting only matters if and when you're changing which Species is in an element of Grid.
If you're owning a Vec of objects, then want to also keep track of references to particular objects in that Vec, a usize index is almost always the simplest design. It might feel like extra boilerplate to you now, but it's a hell of a lot better than properly dealing with keeping pointers in check in a self-referential struct (as somebody who's made this mistake in C++ more than I should have, trust me). Rust's rules are saving you from some real headaches, just not ones that are obvious to you necessarily.
If you want to get fancy and feel like a raw usize is too arbitrary, then I recommend you look at slotmap. For a simple SlotMap, internally it's not much more than an array of values, iteration is fast and storage is efficient. But it gives you generational indices (slotmap calls these "keys") to the values: each value is embellished with a "generation" and each index also internally keeps hold of a its generation, therefore you can safely remove and replace items in the Vec without your references suddenly pointing at a different object, it's really cool.

Tell the Rust compiler that the lifetime of a parameter is always identical to a struct's lifetime

Sorry if this is obvious, I'm starting out with Rust.
I'm trying to implement a simple Composition relationship (one object is the single owner of another one, and the inner object is destroyed automatically when the outer object dies).
I originally thought it would be as simple as declaring a struct like this:
struct Outer {
inner: Inner
}
To my knowledge, that does exactly what I want: the inner attribute is owned by the outer struct, and will be destroyed whenever the outer object disappears.
However, in my case, the Inner type is from a library, and has a lifetime parameter.
// This is illegal
struct Outer {
inner: Inner<'a>
}
// I understand why, I'm using an undeclared lifetime parameter
I have read a bit on lifetime parameters (but I'm not yet completely used to them), and I'm unsure whether there is some syntax to tell the Rust compiler that “the lifetime this field expects is its owner's“, or whether it's just not possible (in which case, what would be the proper way to architecture this code?).
Edit: more detailed situation
I'm writing a project with Vulkano. I want to bundle multiple structures into a single structure so I can pass all at once to functions throughout the project.
Here, I have:
The Engine struct, which should hold everything
The Instance struct, which represents the Vulkan API
The PhysicalDevice struct, which represents a specific GPU, and can only be used as long as its matching Instance exists
The struct I'm struggling with is PhysicalDevice:
// https://github.com/vulkano-rs/vulkano/blob/c6959aa961c9c4bac59f53c99e73620b458d8d82/vulkano/src/device/physical.rs#L297
pub struct PhysicalDevice<'a> {
instance: &'a Arc<Instance>,
device: usize,
}
I want to create a struct that looks like:
pub struct Engine {
instance: Arc<Instance>,
device: PhysicalDevice,
}
Because the Engine struct owns PhysicalDevice as well as Arc<Instance>, and the instance referenced by PhysicalDevice is the same as the one referenced by the Engine, the PhysicalDevice's lifetime requirement should always be valid (since the contained instance cannot be freed before the Engine is freed).
I don't have very good reason of using this architecture, apart from the fact that this is the standard way of bundling related data in other languages. If this not proper "good practices" in Rust, I'm curious as to what the recommended approach would be.
“the lifetime this field expects is its owner's“,
This is impossible.
Generally, you should understand any type with a lifetime parameter (whether it is a reference &'a T or a struct Foo<'a> or anything else) as pointing to something elsewhere and which lives independently.
In your case, whatever Inner borrows (whatever its lifetime is about) cannot be a part of Outer. This is because if it was, Outer would be borrowing parts of itself, which makes Outer impossible to use normally. (Doing so requires pinning, preventing the struct from moving and thereby invalidating the references, which is not currently possible to set up without resorting to unsafe code.)
So, there are three cases you might have:
Your full situation is
struct Outer {
some_data: SomeOtherType,
inner: Inner<'I_want_to_borrow_the_some_data_field>,
}
This is not possible in basic Rust; you need to either
put some_data somewhere other than Outer,
use the ouroboros library which provides mechanisms to build sound self-referential structs (at the price of heap allocations and a complex interface),
or design your data structures differently in some other way.
The data Inner borrows is already independent. In that case, the correct solution is to propagate the lifetime parameter.
struct Outer<'a> {
inner: Inner<'a>,
}
There is not actually any borrowed data; Inner provides for the possibility but isn't actually using it in this case (or the borrowed data is a compile-time constant). In this case, you can use the 'static lifetime.
struct Outer {
inner: Inner<'static>,
}
You can use a bound on Self:
struct Outer<'a> where Self: 'a {
inner: Inner<'a>
}

Is my understanding of a Rust vector that supports Rc or Box wrapped types correct?

I'm not looking for code samples. I want to state my understanding of Box vs. Rc and have you tell me if my understanding is right or wrong.
Let's say I have some trait ChattyAnimal and a struct Cat that implements this trait, e.g.
pub trait ChattyAnimal {
fn make_sound(&self);
}
pub struct Cat {
pub name: String,
pub sound: String
}
impl ChattyAnimal for Cat {
fn make_sound(&self) {
println!("Meow!");
}
}
Now let's say I have other structs (Dog, Cow, Chicken, ...) that also implement the ChattyAnimal trait, and let's say I want to store all of these in the same vector.
So step 1 is I would have to use a Box type, because the Rust compiler cannot determine the size of everything that might implement this trait. And therefore, we must store these items on the heap – viola using a Box type, which is like a smarter pointer in C++. Anything wrapped with Box is automatically deleted by Rust when it goes out of scope.
// I can alias and use my Box type that wraps the trait like this:
pub type BoxyChattyAnimal = Box<dyn ChattyAnimal>;
// and then I can use my type alias, i.e.
pub struct Container {
animals: Vec<BoxyChattyAnimal>
}
Meanwhile, with Box, Rust's borrow checker requires changing when I pass or reassign the instance. But if I actually want to have multiple references to the same underlying instance, I have to use Rc. And so to have a vector of ChattyAnimal instances where each instance can have multiple references, I would need to do:
pub type RcChattyAnimal = Rc<dyn ChattyAnimal>;
pub struct Container {
animals: Vec<RcChattyAnimal>
}
One important take away from this is that if I want to have a vector of some trait type, I need to explicitly set that vector's type to a Box or Rc that wraps my trait. And so the Rust language designers force us to think about this in advance so that a Box or Rc cannot (at least not easily or accidentally) end up in the same vector.
This feels like a very and well thought design – helping prevent me from introducing bugs in my code. Is my understanding as stated above correct?
Yes, all this is correct.
There's a second reason for this design: it allows the compiler to verify that the operations you're performing on the vector elements are using memory in a safe way, relative to how they're stored.
For example, if you had a method on ChattyAnimal that mutates the animal (i.e. takes a &mut self argument), you could call that method on elements of a Vec<Box<dyn ChattyAnimal>> as long as you had a mutable reference to the vector; the Rust compiler would know that there could only be one reference to the ChattyAnimal in question (because the only reference is inside the Box, which is inside the Vec, and you have a mutable reference to the Vec so there can't be any other references to it). If you tried to write the same code with a Vec<Rc<dyn ChattyAnimal>>, the compiler would complain; it wouldn't be able to completely eliminate the possibility that your code might be mutating the animal at the same time as the code that called it was in the middle of trying to read the animal, which might lead to some inconsistencies in the calling code.
As a consequence, the compiler needs to know that all the elements of the Vec have their memory treated in the same way, so that it can check to make sure that a reference to some arbitrary element of the Vec is being used appropriately.
(There's a third reason, too, which is performance; because the compiler knows that this is a "vector of Boxes" or "vector of Rcs", it can generate code that assumes a particular storage mechanism. For example, if you have a vector of Rcs, and clone one of the elements, the machine code that the compiler generates will work simply by going to the memory address listed in the vector and adding 1 to the reference count stored there – there's no need for any extra levels of indirection. If the vector were allowed to mix different allocation schemes, the generated code would have to be a lot more complex, because it wouldn't be able to assume things like "there is a reference count", and would instead need to (at runtime) find the appropriate piece of code for dealing with the memory allocation scheme in use, and then run it; that would be much slower.)

Why do I get double references when filtering a Vec?

I am always having difficulties writing a proper iteration over Vec.
Perhaps this is because I don't understand yet properly when and why references are introduced.
For example
pub fn notInCheck(&self) -> bool { .... } // tells whether king left in check
pub fn apply(&self, mv: Move) -> Self { ... } // applies a move and returns the new position
pub fn rawMoves(&self, vec: &mut Vec<Move>) { ... } // generates possible moves, not taking chess into account
/// List of possible moves in a given position.
/// Verified to not leave the king of the moving player in check.
pub fn moves(&self) -> Vec<Move> {
let mut vec: Vec<Move> = Vec::with_capacity(40);
self.rawMoves(&mut vec);
vec.iter().filter(|m| self.apply(**m).notInCheck()).map(|m| *m).collect()
}
where
#[derive(Clone, Copy, Debug, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[repr(transparent)]
pub struct Move {
mv: u32,
}
The iteration I wrote first was:
vec.iter().filter(|m| self.apply(m).notInCheck()).collect()
but, of course, the compiler gave all sorts of errors. In addressing these errors, I arrived finally at the version shown above, but while the compiler is happy, I'm not sure I am.
It looks like the vector doesn't hold Move's at all, but merely references to Moves? But then, where are the Move's stored? In addition, the filter() function adds another level of indirection. Is this correct? Please explain to me!
Bonus question: When I have vector elements with a type that implements Copy, is there a way to avoid all this useless reference taking stuff. I understand how it would make sense with vector elements of a notable size one does not want to copy around. However, I definitely want to avoid &&value in filter(). Can I?
It looks like the vector doesn't hold Move's at all, but merely references to Moves? But then, where are the Move's stored? In addition, the filter() function adds another level of indirection. Is this correct? Please explain to me!
No, Vec<Move> definitely holds moves. The part you're missing is that aside from filter() getting a reference to the iterator item, slice::iter creates an iterator on references to the slice (or vec here) items, so Vec<Move> -> Iterator<Item=&Move> -> filter(predicate: FnMut(&&Move) -> bool), and that's why you've got two indirections in your filter callback.
When I have vector elements with a type that implements Copy, is there a way to avoid all this useless reference taking stuff. I understand how it would make sense with vector elements of a notable size one does not want to copy around. However, I definitely want to avoid &&value in filter(). Can I?
Yes. You can use into_iter which will consume the source vector but iterate on the contained values directly, or you can use the Iterator::copied adapter which will Copy the iterator item, therefore going from Iterator<Item=&T> to Iterator<Item=T>. However filter will never get a T, the most it can get is an &T since otherwise the item would get "lost" (it would be consumed by the filter, which would only return a boolean, yielding… nothing useful).
The alternative is to use something like filter_map which does get a T input, and returns an Option<U>. Because (as the name indicates) it both filters and maps, it gets to consume the input item and either return an output item (possibly the same) or return "nothing" and essentially remove the item from the collection.
Incidentally, there's also an Iterator::cloned adapter for types which are Clone but not (necessarily) Copy.
Also you could have basically done that by hand by just flipping map and filter around in the original:
vec.iter().map(|m| *m).filter(|m| self.apply(*m).notInCheck()).collect()
map transforms the Iterator<Item=&T> into an Iterator<Item=T>, then filter just gets an &T instead of an &&T.
That aside, I don't really get why apply needs to consume the input move. Or why rawMoves doesn't just… create the vector internally and return it? I get the optimisation allowing for reusing the buffer but that seems like a case of premature optimisation maybe?
And your Move seems… both over-complicated and a bit too simple? If you just want to newtype a u32 then using a tuple-struct seems more than sufficient.
And repr(transparent) is wholly unnecessary, it's only a concern in FFI contexts where the newtype is intended as a type-safety measure which only exists on the Rust side (aka the newtype itself is not visible from / exposed to C, only the wrapped type is).

Using same reference for multiple method parameters

I'll preface by saying I'm very new to Rust, and I'm still wrapping my head around the semantics of the borrow-checker. I have some understanding of why it doesn't like my code, but I'm not sure how to resolve it in an idiomatic way.
I have a method in Rust which accepts 3 parameters with a signature that looks something like this:
fn do_something(&mut self, mem: &mut impl TraitA, bus: &mut impl TraitB, int_lines: &impl TraitC) -> ()
I also have a struct which implements all three of these traits; however, the borrow-checker is complaining when I attempt to use the same reference for multiple parameters:
cannot borrow `*self` as mutable more than once at a time
And also:
cannot borrow `*self` as immutable because it is also borrowed as mutable
My first question is whether this is a shortcoming of the borrow-checker (being unable to recognize that the same reference is being passed), or by design (I suspect this is the case, since from the perspective of the called method each reference is distinct and thus the ownership of each can be regarded separately).
My second question is what the idiomatic approach would be. The two solutions I see are:
a) Combining all three traits into one. While this is technically trivial given my library's design, it would make the code decidedly less clean since the three traits are used to interface with unrelated parts of the struct's state. Furthermore, since this is a library (the do_something method is part of a test), it hinders the possibility of separating the state out into separate structs.
b) Moving each respective part of the struct's state into separate structs, which are then owned by the main struct. This seems like the better option to me, especially since it does not require any changes to the library code itself.
Please let me know if I'm missing another solution, or if there's a way to convince the borrow-checker to accept my original design.
The borrow checker is operating as designed. It only knows you are passing three different mutable references into the same function: it does not know what the function will do with these, even if they do happen to be references to the same struct. Within the function they are three different mutable references to the same struct.
If the three different traits represent three different functional aspects, then your best approach might be to split the struct into different structs, each implementing one of the traits, as you have proposed.
If you would prefer to keep a single struct, and if the function will always be called with a single struct, then you can just pass it in once like this:
fn do_something(&mut self, proc: &mut (impl TraitA + TraitB + TraitC)) -> () { ... }

Resources