Inefficient instance construction? - rust

Here is a simple struct
pub struct Point {
x: uint,
y: uint
}
impl Point {
pub fn new() -> Point {
Point{x: 0u, y: 0u}
}
}
fn main() {
let p = box Point::new();
}
My understanding of how the constructor function works is as follows. The new() function creates an instance of Point in its local stack and returns it. Data from this instance is shallow copied into the heap memory created by box. The pointer to the heap memory is then assigned to the variable p.
Is my understanding correct? Does two separate memory regions get initialized to create one instance? This seems to be an inefficient way to initialize an instance compared to C++ where we get to directly write to the memory of the instance from the constructor.

From a relevant guide:
You may think that this gives us terrible performance: return a value and then immediately box it up ?! Isn't this pattern the worst of both worlds? Rust is smarter than that. There is no copy in this code. main allocates enough room for the box, passes a pointer to that memory into foo as x, and then foo writes the value straight into the Box.
This is important enough that it bears repeating: pointers are not for optimizing returning values from your code. Allow the caller to choose how they want to use your output.
While this talks about boxing the value, I believe the mechanism is general enough, and not specific to boxes.

Just to expand a bit on #Shepmaster's answer:
Rust (and LLVM) supports RVO, or return value optimization, where if a return value is used in a context like box, Rust is smart enough to generate code that uses some sort of out pointer to avoid the copy by writing the return value directly into its usage site. box is one of the major uses of RVO, but it can be used for other types and situations as well.

Related

Does Rust's borrow checker really mean that I should re-structure my program?

So I've read Why can't I store a value and a reference to that value in the same struct? and I understand why my naive approach to this was not working, but I'm still very unclear how to better handle my situation.
I have a program I wanted to structure like follows (details omitted because I can't make this compile anyway):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : Vec<Species>,
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new() -> Self {...} //I don't think there's any way to implement this
pub fn run(&self) {...}
}
The idea is that I create a vector of Species (which won't change for the lifetime of Simulation, except in specific mutex-guarded fields) and then a grid representing which species live where, which will change freely. This implementation won't work, at least not any way I've been able to figure out. As I understand it, the issue is that pretty much however I make my new method, the moment it returns, all of the references in grid would becomine invalid as Simulation and therefor Simulation.species are moved to another location in the stack. Even if I could prove to the compiler that species and its contents would continue to exist, they actually won't be in the same place. Right?
I've looked into various ways around this, such as making species as an Arc on the heap or using usizes instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse. What I'm starting to think is that I need to really re-structure my code to look something like this (details filled in with placeholders because now it actually runs):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : &'a Vec<Species>, //Now just holds a reference rather than data
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new(species : &'a Vec <Species>) -> Self { //has to be given pre-created species
let grid = vec!(species.first().unwrap(); 10);
Self{species, grid}
}
pub fn run(&self) {
let mut population = self.grid[0].population.lock().unwrap();
println!("Population: {}", population);
*population += 1;
}
}
pub fn top_level(){
let species = vec![Species{index: 0, population : Mutex::new(0_)}];
let simulation = Simulation::new(&species);
simulation.run();
}
As far as I can tell this runs fine, and ticks off all the ideal boxes:
grid uses simple references with minimal boilerplate for me
these references are checked at compile time with minimal overhead for the system
Safety is guaranteed by the compiler (unlike a custom map based approach)
But, this feels very weird to me: the two-step initialization process of creating owned memory and then references can't be abstracted any way that I can see, which feels like I'm exposing an implementation detail to the calling function. top_level has to also be responsible for establishing any other functions or (scoped) threads to run the simulation, call draw/gui functions, etc. If I need multiple levels of references, I believe I will need to add additional initialization steps to that level.
So, my question is just "Am I doing this right?". While I can't exactly prove this is wrong, I feel like I'm losing a lot of near-universal abstraction of the call structure. Is there really no way to return species and simulation as a pair at the end (with some one-off update to make all references point to the "forever home" of the data).
Phrasing my problem a second way: I do not like that I cannot have a function with a signature of ()-> Simulation, when I can can have a pair of function calls that have that same effect. I want to be able to encapsulate the creation of this simulation. I feel like the fact that this approach cannot do so indicates I'm doing something wrong, and that there may be a more idiomatic approach I'm missing.
I've looked into various ways around this, such as making species as an Arc on the heap or using usizes instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse.
Don't assume that, test it. I once had a self-referential (using ouroboros) structure much like yours, with a vector of things and a vector of references to them. I tried rewriting it to use indices instead of references, and it was faster.
Rc/Arc is also an option worth trying out — note that there is only an extra cost to the reference counting when an Arc is cloned or dropped. Arc<Species> doesn't cost any more to dereference than &Species, and you can always get an &Species from an Arc<Species>. So the reference counting only matters if and when you're changing which Species is in an element of Grid.
If you're owning a Vec of objects, then want to also keep track of references to particular objects in that Vec, a usize index is almost always the simplest design. It might feel like extra boilerplate to you now, but it's a hell of a lot better than properly dealing with keeping pointers in check in a self-referential struct (as somebody who's made this mistake in C++ more than I should have, trust me). Rust's rules are saving you from some real headaches, just not ones that are obvious to you necessarily.
If you want to get fancy and feel like a raw usize is too arbitrary, then I recommend you look at slotmap. For a simple SlotMap, internally it's not much more than an array of values, iteration is fast and storage is efficient. But it gives you generational indices (slotmap calls these "keys") to the values: each value is embellished with a "generation" and each index also internally keeps hold of a its generation, therefore you can safely remove and replace items in the Vec without your references suddenly pointing at a different object, it's really cool.

why does value allocated in stack didn't result in double free pointer?

Please tell me why didn't result in double free pointer that the value is allocated in stack? Thanks.
#[test]
fn read_value_that_allocated_in_stack_is_no_problem() {
let origin = Value(1);
let copied = unsafe { std::ptr::read(&origin) };
assert_eq!(copied, Value(1));
assert_eq!(copied, origin);
}
/// test failed as expected: double free detected
#[test]
fn read_value_that_allocated_in_heap_will_result_in_double_free_problem() {
let origin = Box::new(Value(1));
let copied = unsafe { std::ptr::read(&origin) };
assert_eq!(copied, Box::new(Value(1)));
assert_eq!(copied, origin);
}
#[derive(Debug, PartialEq)]
struct Value<T>(T);
The unsafe method you are using just creates a bitwise copy of the referenced value. When you do this with a Box, it's not okay but for something like your Value struct containing an integer, it is okay to make the copy as Drop of integers has no side effects while drop of Box accesses global allocator and changes the state.
If you do not understand any term I used for this explanation, try to search it or ask in the comments.
Those tests hide the fact that you use different types in them. It isn't really about stack or heap.
In the first one you use Value<i32> type, which is your custom type, presumably without custom Drop implemented. If so, then Rust will call Drop on each member, in this case the i32 member. Which does nothing. And so nothing happens when both objects go out of scope. Even if you implement Drop, it would have to have some serious side-effects (like call to free) for it to fail.
In the second one you actually use Box type, which does implement Drop. Internally it calls free (in addition to dropping the underlying object). And so free is called twice on drop, trying to free the same pointer (because of the unsafe copy).
This is not a double free because we do not free the memory twice. Because we do not free memory at all.
However, whether this is valid or UB is another question. Miri does not flag it as UB, however Miri does not aim to flag any UB out there. If Value<i32> was Copy, that would be fine. As it is not, it depends on the question whether std::ptr::read() invalidates the original data, i.e. is it always invalid to use a data that was std::ptr::read()'ed, or only if it violates Stacked Borrows semantics, like in the case of copying the Box itself where both destructors try to access the Box thereafter?
The answer is that it's not decided yet. As per UCG issue #307, "Are Copy implementations semantically relevant (besides specialization)?":
#steffahn
Overall, this still being an open question means that while miri doesn't complain, one should avoid code like this because it's not yet certain that it won't be UB, right?
#RalfJung
Yes.
In conclusion, you should avoid code like that.

Is my understanding of a Rust vector that supports Rc or Box wrapped types correct?

I'm not looking for code samples. I want to state my understanding of Box vs. Rc and have you tell me if my understanding is right or wrong.
Let's say I have some trait ChattyAnimal and a struct Cat that implements this trait, e.g.
pub trait ChattyAnimal {
fn make_sound(&self);
}
pub struct Cat {
pub name: String,
pub sound: String
}
impl ChattyAnimal for Cat {
fn make_sound(&self) {
println!("Meow!");
}
}
Now let's say I have other structs (Dog, Cow, Chicken, ...) that also implement the ChattyAnimal trait, and let's say I want to store all of these in the same vector.
So step 1 is I would have to use a Box type, because the Rust compiler cannot determine the size of everything that might implement this trait. And therefore, we must store these items on the heap – viola using a Box type, which is like a smarter pointer in C++. Anything wrapped with Box is automatically deleted by Rust when it goes out of scope.
// I can alias and use my Box type that wraps the trait like this:
pub type BoxyChattyAnimal = Box<dyn ChattyAnimal>;
// and then I can use my type alias, i.e.
pub struct Container {
animals: Vec<BoxyChattyAnimal>
}
Meanwhile, with Box, Rust's borrow checker requires changing when I pass or reassign the instance. But if I actually want to have multiple references to the same underlying instance, I have to use Rc. And so to have a vector of ChattyAnimal instances where each instance can have multiple references, I would need to do:
pub type RcChattyAnimal = Rc<dyn ChattyAnimal>;
pub struct Container {
animals: Vec<RcChattyAnimal>
}
One important take away from this is that if I want to have a vector of some trait type, I need to explicitly set that vector's type to a Box or Rc that wraps my trait. And so the Rust language designers force us to think about this in advance so that a Box or Rc cannot (at least not easily or accidentally) end up in the same vector.
This feels like a very and well thought design – helping prevent me from introducing bugs in my code. Is my understanding as stated above correct?
Yes, all this is correct.
There's a second reason for this design: it allows the compiler to verify that the operations you're performing on the vector elements are using memory in a safe way, relative to how they're stored.
For example, if you had a method on ChattyAnimal that mutates the animal (i.e. takes a &mut self argument), you could call that method on elements of a Vec<Box<dyn ChattyAnimal>> as long as you had a mutable reference to the vector; the Rust compiler would know that there could only be one reference to the ChattyAnimal in question (because the only reference is inside the Box, which is inside the Vec, and you have a mutable reference to the Vec so there can't be any other references to it). If you tried to write the same code with a Vec<Rc<dyn ChattyAnimal>>, the compiler would complain; it wouldn't be able to completely eliminate the possibility that your code might be mutating the animal at the same time as the code that called it was in the middle of trying to read the animal, which might lead to some inconsistencies in the calling code.
As a consequence, the compiler needs to know that all the elements of the Vec have their memory treated in the same way, so that it can check to make sure that a reference to some arbitrary element of the Vec is being used appropriately.
(There's a third reason, too, which is performance; because the compiler knows that this is a "vector of Boxes" or "vector of Rcs", it can generate code that assumes a particular storage mechanism. For example, if you have a vector of Rcs, and clone one of the elements, the machine code that the compiler generates will work simply by going to the memory address listed in the vector and adding 1 to the reference count stored there – there's no need for any extra levels of indirection. If the vector were allowed to mix different allocation schemes, the generated code would have to be a lot more complex, because it wouldn't be able to assume things like "there is a reference count", and would instead need to (at runtime) find the appropriate piece of code for dealing with the memory allocation scheme in use, and then run it; that would be much slower.)

Reuse raw pointer to rehydrate Rust Box

I have a function that coverts a Box::into_raw result into a u64. I later 're-Box' with from the u64.
// Somewhere at inception
let bbox = Box::new(MyStruct::from_u32(1u32).unwrap());
let rwptr = Box::into_raw(bbox);
let bignum_ptr = rwptr as u64;
// Later in life
let rehyrdrate: Box<MyStruct> = unsafe {
Box::from_raw(bignum_ptr as *mut MyStruct)
};
What I would like to do is 're-Box' that bignum_ptr again, and again, as needed. Is this possible?
A box owns the data it points to, and will deallocate/drop it when it goes out of scope, so if you need use the same pointer in more than one place, Box is not the correct type. To support multiple "revivals" of the pointer, you can use a reference instead:
// safety contract: bignum_ptr comes from a valid pointer, there are no
// mutable references
let rehydrate: &MyStruct = unsafe { &*(bignum_ptr as *const MyStruct) };
When the time comes to free the initial box and its data (and you know that no outstanding references exist), only then re-create the box using Box::from_raw:
// safety contract: bignum_ptr comes from a valid pointer, no references
// of any kind remain
drop(unsafe { Box::from_raw(bignum_ptr as *const MyStruct) });
What I would like to do is 're-Box' that bignum_ptr again, and again, as needed. Is this possible?
If you mean creating many boxes from the same pointer without taking it out each time, no.
If you mean putting it in and out repeatedly and round-tripping every time via an integer, probably yes; however, I would be careful with code like that. Most likely, it will work, but be aware that the memory model for Rust is not formalized and the rules around pointer provenance may change in the future. Even the C and C++ standards (from where Rust memory model comes from) have open questions around those, including round-tripping via an integer type.
Furthermore, your code assumes a pointer fits in a u64, which is likely true for most architectures, but maybe not all in the future.
At the very least, I suggest you use mem::transmute rather than a cast.
In short: don't do it. There is likely a better design for what you are trying to achieve.

How to create a DST type?

DST (Dynamically Sized Types) are a thing in Rust now. I have used them successfully, with a flexible last member which is known to the compiler (such as [u8]).
What I am looking to do, however, is to create a custom DST. Say, for example:
struct Known<S> {
dropit: fn (&mut S) -> (),
data: S,
}
struct Unknown {
dropit: fn (&mut ()) -> (),
data: (),
}
With an expected usage being Box<Known<S>> => Box<Unknown> => Box<Known<S>>, where the middleware need not know about concrete types.
Note: yes, I know about Any, and no I am not interested in using it.
I am open to suggestions in the layout of both Known and Unknown, however:
size_of::<Box<Known>>() = size_of::<Box<Unknown>>() = size_of::<Box<u32>>(); that is it should be a thin pointer.
dropping Box<Unknown> drops its content
cloning Box<Unknown> (assuming a clonable S), clones its content
ideally, fn dup(u: &Unknown) -> Box<Unknown> { box u.clone() } works
I have particular difficulties with (3) and (4), I could solve (3) with manually allocating memory (not using box, but directly calling malloc) but I would prefer providing an idiomatic experience to the user.
I could not find any documentation on how to inform box of the right size to allocate.
There are exactly two types of unsized objects at present: slices ([T]), where it adds a length member; and trait objects (Trait, Trait + Send, &c.), where it adds a vtable including a destructor which knows how large an object to free.
There is not currently any mechanism for declaring your own variety of unsized objects.
At this point, you should seek inspiration from Arc::new_uinint_slice and Arc::from_ptr.
We've no nice mechanism to make custom DSTs play nicely together though, making Arc<Known<T>> nasty.
We still always create Arc<dyn Trait> with CoerceUnsized because you cannot make trait objects form DSTs currently.
You could try using the vptr crate, which stores the vtable pointer with the data instead of with the pointer.

Resources