How can the memory address of a struct be the same inside a function and after the struct has been returned? - rust

The phrase "when scope exits the values get automatically popped from stack" is repeated many times, but the example I provide here disproves the statement:
fn main() {
    let foo = foobar();
    println!("The address in main {:p}", &foo);
}

fn foobar() -> Employee {
    let emp = Employee {
        company: String::from("xyz"),
        name: String::from("somename"),
        age: 50,
    };
    println!("The address inside func {:p}", &emp);
    emp
}

#[derive(Debug)]
struct Employee {
    name: String,
    company: String,
    age: u32,
}
The output is:
The address inside func 0x7fffc34011e8
The address in main 0x7fffc34011e8
This makes sense. When I use Box to create the struct the address differs as I expected.
If the function moves ownership of the return value to the caller, and the memory corresponding to that function's frame is popped after the function finishes (so it would be unsafe to keep using it), then how is the struct created inside the function still accessible after the function exits?
The same thing happens when returning an array. Where are these elements stored: on the stack or on the heap?
Will the compiler do escape analysis at compile time and move the values to the heap like Go does?
I'm sure that Employee doesn't implement the Copy trait.

In many languages, variables are just a convenient means for
humans to name some values.
Even if, from a logical point of view, we can assume that there is one
specific storage location for each specific variable, and we can reason about
this in terms of copies and moves, it does not imply that these copies
and moves physically happen (notably because of the optimizer).
Moreover, when reading various documents about Rust, we often find
the term binding instead of variable; this reinforces the idea
that we just refer to a value that exists somewhere.
It is exactly the same as writing let a = something(); then let b = a;,
then again let c = b;... we simply change our mind about the name,
but no data is actually moved.
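To make that concrete, here is a small sketch of my own (not part of the original answer): rebinding a String several times never touches the owned character data; at most the small stack header is copied, and the heap pointer reported by as_ptr stays the same.

fn main() {
    let a = String::from("something");
    let heap_ptr = a.as_ptr(); // address of the character data on the heap
    let b = a;                 // move: `b` is now the name for the same value
    let c = b;                 // move again
    // The owned heap data was never copied or relocated by these moves.
    assert_eq!(heap_ptr, c.as_ptr());
    println!("binding c lives at {:p}, its heap data at {:p}", &c, c.as_ptr());
}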
When it comes to debugging, the generated code is generally
sub-optimal: each variable is given its own storage
so that these variables can be inspected in memory.
This can be misleading about the true nature of the optimised code.
Back to your example: you detected that Rust decided to perform
a kind of return value optimization (a common C++ term nowadays),
because it knows that a temporary value must appear in the calling
context to receive the result, and this result comes from a local
variable inside the function.
So, instead of creating two different storage locations and copying or moving from
one to the other, it is better to use the same storage: the local
variable is stored outside the function (where the result is
expected).
From a logical point of view it does not change anything, but it
is much more efficient.
And when code inlining comes into play, no one can predict where
our variables/values/bindings are actually stored.
Some comments below state that this return-value optimisation
can be counted on, since it is part of the Rust ABI.
(I was not aware of that, still a beginner ;^)

Related

Is reading from a file with multiple threads considered undefined behavior in rust?

See the example below. It seems like this might be UB by definition, but it remains unclear to me.
use std::{fs, thread};

fn main() {
    let mut threads = Vec::new();
    for _ in 0..100 {
        let thread = thread::spawn(move || {
            fs::read_to_string("./config/init.json")
                .unwrap()
                .trim()
                .to_string()
        });
        threads.push(thread);
    }
    for handler in threads {
        handler.join().unwrap();
    }
}
On most operating systems only individual read operations are guaranteed to be atomic. read_to_string may perform multiple distinct reads, which means that it's not guaranteed to be atomic between multiple threads/processes. If another process is modifying this file concurrently, read_to_string could return a mixture of data from before and after the modification. In other words, each read_to_string operation is not guaranteed to return an identical result, and some may even fail while others succeed if another process deletes the file while the program is running.
However, none of this behavior is classified as "undefined." Absent hardware problems, you are guaranteed to get back a std::io::Result<String> in a valid state, which is something you can reason about. Once UB is invoked, you can no longer reason about the state of the program.
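As a sketch of my own (not from the original answer), this is what reasoning about that Result might look like in the question's code: each thread matches on the io::Result instead of unwrapping, so a concurrent deletion or modification of the file shows up as an ordinary Err or stale data, never as UB.

use std::{fs, thread};

fn main() {
    let handles: Vec<_> = (0..100)
        .map(|i| {
            thread::spawn(move || {
                // Each read returns an io::Result in a well-defined state.
                match fs::read_to_string("./config/init.json") {
                    Ok(contents) => println!("thread {i}: read {} bytes", contents.len()),
                    Err(e) => eprintln!("thread {i}: read failed: {e}"),
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}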
By way of analogy, consider a choose your own adventure book. At the end of each segment you'll have some instructions like "If you choose to go into the cave, go to page 53. If you choose to take the path by the river, go to page 20." Then you turn to the appropriate page and keep reading. This is a bit like Result -- if you have an Ok you do one thing, but if you have an Err you do another thing.
Once undefined behavior is invoked, this kind of choice no longer makes sense because the program is in a state where the rules of the language no longer apply. The program could do anything, including deleting random files from your hard drive. In the book analogy, the book caught fire. Trying to follow the rules of the book no longer makes any sense, and you hope the book doesn't burn your house down with it.
In Rust you're not supposed to be able to invoke UB without using the unsafe keyword, so if you don't see that keyword anywhere then UB isn't on the table.

Does moving data to Rc/Arc always copy it from the stack to the heap?

Take a look at the following simple example:
use std::rc::Rc;

struct MyStruct {
    a: i8,
}

fn main() {
    let mut my_struct = MyStruct { a: 0 };
    my_struct.a = 5;
    let my_struct_rc = Rc::new(my_struct);
    println!("my_struct_rc.a = {}", my_struct_rc.a);
}
The official documentation of Rc says:
The type Rc<T> provides shared ownership of a value of type T,
allocated in the heap.
In theory that is clear. But, firstly, my_struct is not immediately wrapped in an Rc, and secondly, MyStruct is a very simple type. I can see two scenarios here:
1. When my_struct is moved into the Rc, the memory content is literally copied from the stack to the heap.
2. The compiler is able to resolve that my_struct will be moved into the Rc, so it puts it on the heap from the beginning.
If number 1 is true, then there might be a hidden performance bottleneck, since when reading the code one does not explicitly see memory being copied (I am assuming MyStruct is much more complex).
If number 2 is true, I wonder whether the compiler is always able to resolve such things. The provided example is very simple, but I can imagine that my_struct is much more complex and is mutated several times by different functions before being moved to the Rc.
Tl;dr It could be either scenario, but for the most part, you should just write code in the most obvious way and let the compiler worry about it.
According to the semantics of the abstract machine, that is, the theoretical model of computation that defines Rust's behavior, there is always a copy. In fact, there are at least two: my_struct is first created in the stack frame of main, but then has to be moved into the stack frame of Rc::new. Then Rc::new has to create an allocation and move my_struct a second time, from its own stack frame into the newly allocated memory*. Each of these moves is conceptually a copy.
However, this analysis isn't particularly useful for predicting the performance of code in practice, for three reasons:
Copies are actually pretty darn cheap. Moving my_struct from one place to another may actually be much cheaper, in the long run, than referencing it with a pointer. Copying a chunk of bytes is easy to optimize on modern processors; following a pointer to some arbitrary location is not. (Bear in mind also that the complexity of the structure is irrelevant, because all moves are bytewise copies; for instance, moving any Vec is just copying three usizes regardless of the contents; see the sketch after this list.)
Unless you have measured the performance and shown that excessive copying is a problem, don't assume that it is: you may accidentally pessimize instead of optimizing your code. Measure first.
The semantics of the abstract machine is not the semantics of your real machine. The whole point of an optimizing compiler is to figure out the best way to transform one to the other. Under reasonable assumptions, it's very unlikely that the code here would result in 2 copies with optimizations turned on. But how the compiler eliminates one or both copies may be dependent on the rest of the code: not just on the snippet that contains them but on how the data is initialized and so forth. Real machine performance is complicated and generally requires analysis of more than just a few lines at a time. Again, this is the whole point of an optimizing compiler: it can do a much more comprehensive analysis, much faster than you or I can.
Even if the compiler leaves a copy "on the table", you shouldn't assume without evidence that removing the copy would make things better simply because it is a copy. Measure first.
It probably doesn't matter anyway, in this case. Requesting a new allocation from the heap is likely† more expensive than copying a bunch of bytes from one place to another, so fiddling around with 1 fast copy vs. no copies while ignoring a (plausible) big bottleneck is probably a waste of time. Don't try to optimize things before you've profiled your application or library to see where the most performance is being lost. Measure first.
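Here is the sketch referred to above, my own illustration of the "three usizes" point: the size of a Vec value itself, which is all that a move copies, is the same whether it owns zero bytes or a million.

use std::mem::size_of_val;

fn main() {
    let small: Vec<u8> = Vec::new();
    let big: Vec<u8> = vec![0u8; 1_000_000];
    // A move copies only this header (pointer, capacity, length),
    // never the heap contents, so both values are the same size.
    assert_eq!(size_of_val(&small), size_of_val(&big));
    println!("a Vec header is {} bytes", size_of_val(&big)); // 24 on a 64-bit target
}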
See also
Questions about overflowing the stack by accidentally putting large data on it (to which the solution is usually to use Vec instead of an array):
How to allocate arrays on the heap in Rust 1.0?
Thread '<main>' has overflowed its stack when allocating a large array using Box
* Rc, although part of the standard library, is written in plain Rust code, which is how I analyze it here. Rc could theoretically be subject to guaranteed optimizations that aren't available to ordinary code, but that doesn't happen to be relevant to this case.
† Depending at least on the allocator and on whether new memory must be acquired from the OS or if a recently freed allocation can be re-used.
You can just test what happens:
Try to use my_struct after creating an Rc out of it. The value has been moved, so you can't use it.
use std::rc::Rc;

struct MyStruct {
    a: i8,
}

fn main() {
    let mut my_struct = MyStruct { a: 0 };
    my_struct.a = 5;
    let my_struct_rc = Rc::new(my_struct);
    println!("my_struct_rc.a = {}", my_struct_rc.a);
    // Add this line. Compilation error "borrow of moved value"
    println!("my_struct.a = {}", my_struct.a);
}
Make your struct implement the Copy trait, and it will be automatically copied into the Rc::new function. Now the code above works, because the my_struct variable is not moved anywhere, just copied.
#[derive(Clone, Copy)]
struct MyStruct {
    a: i8,
}
The compiler is able to resolve that my_struct will be moved into the Rc, so it puts it on the heap from the beginning.
Take a look at the Rc::new source code (I removed an irrelevant comment).
struct RcBox<T: ?Sized> {
    strong: Cell<usize>,
    weak: Cell<usize>,
    value: T,
}

// ...

pub fn new(value: T) -> Rc<T> {
    Self::from_inner(Box::into_raw_non_null(box RcBox {
        strong: Cell::new(1),
        weak: Cell::new(1),
        value,
    }))
}
It takes the value you pass to it, and creates a Box, so it's always put on the heap. This is plain Rust and I don't think it performs too many sophisticated optimizations, but that may change.
Note that "move" in Rust may also copy data implicitly, and this may depend on the current compiler's behavior. In that case, if you are concerned about performance you can try to make the struct as small as possible, and store some information on the heap. For example when a Vec<T> is moved, as far as I know it only copies the capacity, length and pointer to the heap, but the actual array which is on the heap is not copied element by element, so only a few bytes are copied when moving a vector (assuming the data is copied, because that's also subject to compiler optimizations in case copying is not actually needed).

What happens in memory when ownership is transferred out of a box?

Does the variable s in print_struct refer to data on the heap or on the stack?
struct Structure {
    x: f64,
    y: u32,
    /* Use a box, so that Structure isn't copy */
    z: Box<char>,
}

fn main() {
    let my_struct_boxed = Box::new(Structure {
        x: 2.0,
        y: 325,
        z: Box::new('b'),
    });
    let my_struct_unboxed = *my_struct_boxed;
    print_struct(my_struct_unboxed);
}

fn print_struct(s: Structure) {
    println!("{} {} {}", s.x, s.y, s.z);
}
As I understand it, let my_struct_unboxed = *my_struct_boxed; transfers the ownership away from the box, to my_struct_unboxed, and then to s in the function print_struct.
What happens with the actual data? Initially it is copied from the stack onto the heap by calling Box::new(...), but is the data somehow moved or copied back to the stack at some point? If so, how? And when is drop called? When s goes out of scope?
The Structure data in my_struct_boxed exists on the heap and the Structure data in my_struct_unboxed exists on the stack.
Therefore naïvely speaking (no compiler optimizations), a move or copy operation when dereferencing (*) your Box will always involve copying of the data. On the borrow-checker/static-analysis side, since the Copy trait is not implemented for Structure, this represents a transfer of ownership of the data to the my_struct_unboxed variable.
When you call print_struct, another copy takes place, copying the bits in memory that represent your Structure from the local variable into the function argument's slot on the call stack. Semantically, this again represents a transfer of ownership into the print_struct function.
Finally when print_struct goes out of scope, it drops the Structure which it owns.
Reference: std::marker::Copy
Excerpt
It's important to note that in these two examples, the only difference
is whether you are allowed to access [your variable] after the assignment. Under the
hood, both a copy and a move can result in bits being copied in
memory, although this is sometimes optimized away.
Note the last part "this is sometimes optimized away". This is why the earlier descriptions were simplified to assume no compiler optimizations i.e. naïve. In a lot of cases, the compiler will aggressively optimize and inline the code especially with higher values for the opt-level flag.
If so, how?
Both "copy" and "move" are semantically memcpy (though that may be optimised to something else, or even nothing whatsoever).
And when is drop called? When s goes out of scope?
Yes. When print_struct ends it cleans up its local scope, and drops s.
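To see that ordering directly, here is a small sketch of my own (not from the original answer): giving Structure a Drop impl that prints makes it visible that s is dropped when print_struct returns, before control comes back to main.

struct Structure {
    x: f64,
    y: u32,
    z: Box<char>,
}

impl Drop for Structure {
    fn drop(&mut self) {
        println!("dropping Structure with x = {}", self.x);
    }
}

fn print_struct(s: Structure) {
    println!("{} {} {}", s.x, s.y, s.z);
    // `s` goes out of scope here, so this is where it is dropped.
}

fn main() {
    let my_struct = Structure { x: 2.0, y: 325, z: Box::new('b') };
    print_struct(my_struct);
    println!("back in main, after the drop message has been printed");
}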

Is it possible to have safe mutable aliasing to non-overlapping memory?

I'm looking for a way to take a large object and break it into smaller mutable child objects, which can be processed in parallel.
Something like:
struct PixelBuffer { data: Vec<u32>, width: u32, height: u32 }
struct PixelBlock { data: Vec<u32> }

impl PixelBuffer {
    fn decompose<'a>(&'a mut self) -> Vec<Guard<'a, PixelBlock>> {
        ...
    }
}
Where the resulting PixelBlocks can be processed in parallel, and the parent PixelBuffer will remain locked until all Guard<PixelBlock> are dropped.
This is effectively mutable pointer aliasing; the large data block in PixelBuffer will be directly modified via each PixelBlock.
However, each PixelBlock is a non-overlapping segment of the internal data in PixelBuffer.
You can certainly do this in unsafe code (internal buffer is a raw pointer; generate a new external pointer for each PixelBlock); but is it possible to achieve the same result using safe code?
(NB. I'm open to using a data block allocated from libc::malloc if that'll help?)
This works fine and is a natural consequence of how, e.g., iterators work: the next method hands out a sequence of values that are not lifetime-connected to the reference they come from, i.e. fn next(&mut self) -> Option<Self::Item>. This automatically means that any iterator that yields &mut pointers (like slice.iter_mut()) is yielding pointers to non-overlapping memory, because anything else would be incorrect.
One way to use this in parallel is something like my simple_parallel library, e.g. Pool::for_.
(You'll need to give more details about the internals of PixelBuffer to be more specific about how to do it in this case.)
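As a small sketch of my own of that iterator point: iter_mut hands out &mut references to distinct, non-overlapping elements, so all of them can be collected and used at the same time without violating the aliasing rules.

fn main() {
    let mut values = [1, 2, 3];
    // Every &mut i32 here points at a different element, so holding
    // them all at once is fine; overlapping ones could never be produced.
    let refs: Vec<&mut i32> = values.iter_mut().collect();
    for r in refs {
        *r *= 10;
    }
    assert_eq!(values, [10, 20, 30]);
}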
There is no way to completely avoid unsafe Rust, because the compiler cannot currently evaluate the safety of sub-slices. However, the standard library contains code that provides a safe wrapper that you can use.
Read up on std::slice::Chunks and std::slice::ChunksMut.
Sample code: https://play.rust-lang.org/?gist=ceec5be3e1530c0a6d3b&version=stable
However, your next problem is sending the slices to separate threads, because the best way to do that would be thread::scoped, which is currently deprecated due to some safety problems that were discovered this year...
Also, keep in mind that Vec<_> owns its contents, whereas slices are just a view. Generally, you want to write most functions in terms of slices, and keep only one "Vec" to hold the data.
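A minimal sketch of my own (with made-up buffer sizes) of the chunks_mut approach: one flat Vec<u32> is split into disjoint mutable row slices, and each &mut [u32] could then be handed to a separate worker.

fn main() {
    let width = 4usize;
    let height = 4usize;
    let mut data = vec![0u32; width * height];

    // chunks_mut yields non-overlapping &mut [u32] blocks (one row each),
    // so mutating one block can never alias another.
    for (row, block) in data.chunks_mut(width).enumerate() {
        for pixel in block.iter_mut() {
            *pixel = row as u32;
        }
    }

    assert_eq!(data[0], 0);
    assert_eq!(data[width * height - 1], (height - 1) as u32);
}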

Are objects accessed indirectly in D?

From what I've read, all objects in D are fully location independent. How is this requirement achieved?
One thing that comes to mind is that references are not pointers to the objects themselves, but to some proxy, so when you move an object (in memory) you just update that proxy, not all the references used in the program.
But this is just my guess. How is it actually done in D?
edit: bottom line up front, no proxy object, objects are referenced directly through regular pointers. /edit
structs aren't allowed to keep a pointer to themselves, so if they get copied, they should continue to just work. This isn't strictly enforced by the language though:
struct S {
    S* lol;
    void beBad() {
        lol = &this; // this compiler will allow this....
    }
}

S pain() {
    S s;
    s.beBad();
    return s;
}

void main() {
    S s;
    s = pain();
    assert(s.lol !is &s); // but it will also move the object without notice!
}
(EDIT: actually, I guess you could use a postblit to update internal pointers, so it isn't quite without notice. If you're careful enough, you could make it work, but then again, if you're careful enough, you can shoot between your toes without hitting your foot too. EDIT2: Actually no, the compiler/runtime is still allowed to move it without even calling the postblit. One example of where this happens is if it copies a stack frame to the heap to make a closure. The struct data is moved to a new address without being informed. So yeah. /edit)
And actually, that assert isn't guaranteed to pass: the compiler might choose to run pain straight on the local object declared in main, so the pointer would work (though I'm not able to force this optimization here for a demo; generally, when you return a struct from a function, it is actually done via a hidden pointer the caller passes - the caller says "put the return value right here", thus avoiding a copy/move in some cases).
But anyway, the point just is that the compiler is free to copy or not to copy a struct at its leisure, so if you do keep the address of this around in it, it may become invalid without notice; keeping that pointer is not a compile error, but it is undefined behavior.
The situation is different with classes. Classes are allowed to keep references to this internally, since a class is (in theory, as realized by the garbage collector implementation) an independent object with an infinite lifetime. While it may be moved (such as by a moving GC, which is not implemented in D today), if it is moved, all references to it, internal and external, would also be required to be updated.
So classes can't have the memory pulled out from under them like structs can (unless you the programmer take matters into your own hands and bypass the GC...)
The location-independence thing, I'm pretty sure, refers only to structs and only to the rule that they can't have pointers to themselves. There's no magic done with references or pointers - they indeed work with memory addresses, no proxy objects.