Implementation of AnyMap and runtime overhead of `struct Port(u32);` - metaprogramming

I was reading "24 days of Rust" and the example of AnyMap usage just blew my mind. Consider the following code:
#[deriving(Show)]
struct Port(u32);
#[deriving(Show)]
struct ConnectionLimit(u32);
It says:
Here the Port and ConnectionLimit types are abstractions over the
underlying integer (with no overhead at runtime!).
Very well, I can understand how this could be achieved. All types are checked during compilation, and at runtime we have only u32s. But in that case, how is it possible to create a map from some TypeId to Box<Any>? And how can Any be cast to a concrete type, like u32?
The source code of AnyMap is quite complicated and involves a lot of metaprogramming. How does it work? Or is there simply a mistake in "24 days of Rust", and Port and ConnectionLimit actually do have runtime overhead?

Actually, it is pretty common in C, C++ and other systems languages to have different types with the same in-memory representation.
In memory:
a u32 is 4 contiguous bytes of memory
a Port is 4 contiguous bytes of memory
a ConnectionLimit is 4 contiguous bytes of memory
Notably, compared to many other languages, there is no "virtual table" or other extraneous information stored in memory for each instance of those types.
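The zero-overhead claim is easy to check directly (using the modern #[derive(Debug)] spelling; #[deriving(Show)] in the quoted article is pre-1.0 syntax):

```rust
use std::mem::size_of;

#[derive(Debug)]
struct Port(u32);

#[derive(Debug)]
struct ConnectionLimit(u32);

fn main() {
    // No tag, no vtable, no extra field: each newtype is exactly a u32.
    assert_eq!(size_of::<Port>(), size_of::<u32>());
    assert_eq!(size_of::<ConnectionLimit>(), size_of::<u32>());
    println!("each wrapper is {} bytes", size_of::<Port>());
    // prints "each wrapper is 4 bytes"
}
```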
As for AnyMap: at the point where you store your object in the map, the compiler knows the object's type and can therefore supply the correct TypeId. That TypeId then has to be carried along with the object data, because once it is lost it cannot be recovered.
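A minimal sketch of that idea (not AnyMap's real implementation, which adds more machinery): key a HashMap by TypeId, store Box<dyn Any>, and downcast on the way out.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Port(u32);

// A toy version of AnyMap: at most one value per type, keyed by TypeId.
struct TinyAnyMap {
    data: HashMap<TypeId, Box<dyn Any>>,
}

impl TinyAnyMap {
    fn new() -> Self {
        TinyAnyMap { data: HashMap::new() }
    }

    // The concrete type T is known *here*, at the call site, so the
    // compiler can supply TypeId::of::<T>() as the key.
    fn insert<T: Any>(&mut self, value: T) {
        self.data.insert(TypeId::of::<T>(), Box::new(value));
    }

    // downcast_ref compares the stored TypeId against T's and only then
    // reinterprets the data -- this is why the TypeId must be carried
    // alongside the value.
    fn get<T: Any>(&self) -> Option<&T> {
        self.data
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

fn main() {
    let mut map = TinyAnyMap::new();
    map.insert(Port(8080));
    assert_eq!(map.get::<Port>(), Some(&Port(8080)));
    assert_eq!(map.get::<u32>(), None); // Port and u32 are distinct keys
}
```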

Related

What is uninitialized memory and why isn't it initialized when allocating?

Taking this signature for a method of the GlobalAllocator:
unsafe fn alloc(&self, layout: Layout) -> *mut u8
and this sentence from the method's documentation:
The allocated block of memory may or may not be initialized.
Suppose that we are going to allocate some chunk of memory for an [i32; 10]. Assuming the size of an i32 is 4 bytes, our example array would need 40 bytes of storage.
Now the allocator has found a memory spot that fits our requirements: some 40 bytes of a memory region... but what's there? I always read the term garbage data and assume it's just old data already stored there by another process, program, etc.
What is uninitialized memory? Just data that is not initialized with zeros or with some default value for the type that we want to store there?
Why isn't memory always initialized before returning the pointer? Is it too costly? But the memory must be initialized in order to use it properly and not cause UB. Why then doesn't it come already initialized?
When some resource is deallocated, nothing must be left pointing to that freed memory. Does that place get zeroed? What really happens when you deallocate some piece of memory?
What is uninitialized memory? Just data that is not initialized with zeros or with some default value for the type that we want to store there?
It's worse than either of those. Reading from uninitialized memory is undefined behavior, as in you can no longer reason about a program which does so. Practically, compilers often optimize assuming that code paths that would trigger undefined behavior are never executed and their code can be removed. Or not, depending on how aggressive the compiler is.
If you could reliably read from the pointer, it would contain arbitrary data. It may be zeroes, it may be old data structures, it may be parts of old data structures. It may even be things like passwords and encryption keys, which is another reason why reading uninitialized memory is problematic.
Why isn't memory always initialized before returning the pointer? Is it too costly? But the memory must be initialized in order to use it properly and not cause UB. Why then doesn't it come already initialized?
Yes, cost is the issue. The first thing that is typically done after allocating a piece of memory is to write to it. Having the allocator "pre-initialize" the memory is wasteful when the caller is going to overwrite it anyway with the values it wants. This is especially significant with large buffers used for IO or other large storage.
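The standard library reflects this trade-off: Vec::with_capacity reserves memory without initializing it, and a slot only becomes readable once an element has been written there. A small illustration:

```rust
fn main() {
    // Reserves room for 1024 u8s but initializes nothing:
    // len() is 0, so no uninitialized byte is ever readable from safe code.
    let mut buf: Vec<u8> = Vec::with_capacity(1024);
    assert_eq!(buf.len(), 0);
    assert!(buf.capacity() >= 1024);

    // Each push initializes exactly one slot before it becomes visible.
    buf.push(42);
    assert_eq!(buf[0], 42);

    // By contrast, vec![0u8; 1024] pays to zero all 1024 bytes up front,
    // which is wasted work if you are about to overwrite them anyway
    // (e.g. with the result of a read() call).
    let zeroed = vec![0u8; 1024];
    assert_eq!(zeroed.len(), 1024);
}
```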
When some resource is deallocated, nothing must be left pointing to that freed memory. Does that place get zeroed? What really happens when you deallocate some piece of memory?
It's up to how the memory allocator is implemented. Most don't waste processing power to clear the data that's been deallocated, since it will be overwritten anyway when it's reallocated. Some allocators may write some bookkeeping data to the freed space. GlobalAllocator is an interface to whatever allocator the system comes with, so it can vary depending on the environment.
I always read the term garbage data and assume it's just old data already stored there by another process, program, etc.
Worth noting: all modern desktop OSs have memory isolation between processes - your program cannot access the memory of other processes or the kernel (unless you explicitly share it via specialized functionality). The kernel will clear memory before it assigns it to your process, to prevent leaking sensitive data. But you can see old data from your own process, for the reasons described above.
What you are asking are implementation details that can even vary from run to run. From the perspective of the abstract machine and thus the optimizer they don't matter.
Turning contents of uninitialized memory into almost any type (other than MaybeUninit) is immediate undefined behavior.
let mem: *mut u8 = unsafe { alloc(...) };
let x: u8 = unsafe { ptr::read(mem) };
if x != x {
    print!("wtf");
}
This may or may not print, crash, or delete the contents of your hard drive, possibly even before reaching the alloc call, because the optimizer can work backwards and eliminate the entire code block once it proves that all execution paths are UB.
This can happen purely due to assumptions the optimizer relies on, i.e. even when the underlying allocator is well-behaved. But real systems may also behave non-deterministically. E.g. theoretically, on a freshly booted embedded system, memory might be in an uninitialized state that doesn't reliably return 0 or 1. Or on Linux, madvise(MADV_FREE) can cause allocations to return inconsistent results over time until they are initialized.
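The sanctioned way to hold not-yet-initialized memory is MaybeUninit, which defers the "this is now a valid value" claim until assume_init. A minimal sketch:

```rust
use std::mem::MaybeUninit;

fn main() {
    // Reserve space for an i32 without initializing it.
    let mut slot: MaybeUninit<i32> = MaybeUninit::uninit();

    // Writing through the MaybeUninit initializes the memory...
    slot.write(7);

    // ...and only after that write is assume_init sound. Calling it
    // before initialization would be undefined behavior, exactly as
    // described above.
    let value: i32 = unsafe { slot.assume_init() };
    assert_eq!(value, 7);
}
```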

Find total size of struct at runtime

Is there any way to calculate the total stack and heap size of a struct at runtime?
As far as I can tell, std::mem::{size_of, size_of_val} only work for stack-allocated values, but a struct might contain heap-allocated buffers, too (e.g. Vec).
Servo was using the heapsize crate to measure the size of heap allocations during program execution.
You can call the heap_size_of function to measure the allocated heap size by jemalloc.
Be aware that you can get different results with different allocators.
However, from the crate's GitHub page: "This crate is not maintained and is no longer used by Servo. At the time of writing, Servo uses the internal malloc_size_of instead."
You can either use the heapsize crate or check the implementation details of malloc_size_of as well.
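Without any crate, you can approximate the "deep" size of a simple container by hand. This sketch only accounts for the Vec's own header and its heap buffer; it ignores allocator bookkeeping and any heap data nested inside the elements:

```rust
use std::mem::{size_of, size_of_val};

// Rough estimate of a Vec's total footprint: its own header
// (pointer + capacity + length) plus the heap buffer it owns.
// Capacity, not length, is what was actually allocated.
fn approx_total_size(v: &Vec<u64>) -> usize {
    size_of_val(v) + v.capacity() * size_of::<u64>()
}

fn main() {
    let v: Vec<u64> = Vec::with_capacity(10);
    // size_of_val reports only the header (3 pointer-sized fields)...
    assert_eq!(size_of_val(&v), 3 * size_of::<usize>());
    // ...while the heap buffer adds at least 10 * 8 bytes on top.
    assert!(approx_total_size(&v) >= size_of_val(&v) + 10 * size_of::<u64>());
    println!("approx total: {} bytes", approx_total_size(&v));
}
```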

What goes on the stack and what goes on the heap in Rust?

I am really confused about Rust's system of memory allocation.
In Java you use new to allocate memory on the heap. In C you use malloc(); everything else goes on the stack.
I thought, in Rust, Box<T> allocates memory on the heap but after reading "Defining Our Own Smart Pointer" section in chapter 15.2 in The Rust Programming Language it seems like MyBox<T> doesn't have any special annotation to make the value of T live on the heap.
What exactly goes on the stack and what goes on the heap?
Is the implementation of MyBox<T> essentially the same as Box<T>?
If the implementations are identical, what makes T stored on the heap rather than the stack?
If the implementations aren't identical what makes a Box<T> allocate memory on the heap?
This is hard to say in general. Rust usually avoids allocating anything on the heap. The compiler will never perform an implicit heap allocation, but many library functions can do it for you. At least anything that is dynamically sized (e.g. Vec<T>) will need something on the heap under the hood; for the rest, the documentation should hint at it.
Note that even in C, many functions can do heap allocation without an explicit call to malloc. E.g. I recently had to debug a memory leak where a developer called getaddrinfo without a corresponding freeaddrinfo, overlooking that this function allocates memory on the heap. This class of bugs should be really rare in Rust, however, thanks to RAII.
Not at all! The book is simplifying things here to avoid details not important for this section.
—
Box is a compiler built-in. Under the hood, what allocates the memory is an allocator defined in liballoc. You can think of this allocator as providing malloc-like functionality. In practice, the default allocator uses jemalloc on most targets; it is also possible to use a custom allocator, for example the alloc_system crate uses the system's malloc/realloc/free functions to build its allocator.
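A sketch of roughly what Box::new arranges, written against the current stable std::alloc API (this is an illustration of the idea, not Box's actual implementation, and the allocator defaults have changed since that answer was written):

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

fn main() {
    // Roughly what Box::new(123_u32) does via the global allocator:
    let layout = Layout::new::<u32>();
    unsafe {
        // 1. ask the allocator for heap space of the right size/alignment
        let p = alloc(layout) as *mut u32;
        assert!(!p.is_null(), "allocation failed");

        // 2. move the value into that heap space
        ptr::write(p, 123);
        assert_eq!(ptr::read(p), 123);

        // 3. what dropping the Box does: return the memory
        dealloc(p as *mut u8, layout);
    }
}
```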
Just to set the record straight: malloc is heap, and new is heap. Stack allocation doesn't require any malloc; memory from malloc can only be released with free, and memory from new can only be released with delete, otherwise you get a memory leak. In those languages, the developer manages what is stack allocation and what is heap allocation.
As for the question: Box::new allocates on the heap. The memory is reclaimed when the Box is dropped (you don't see this happen), unless you transfer ownership.
To complement: types that have a known size at compile time are stored entirely on the stack. This tells you that Rust manages what goes on the stack and what goes on the heap. Rust's ownership and borrowing rules should clarify this.
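The stack/heap split can be observed directly: a value held by variable lives in the function's frame, while boxing it moves the payload to the heap and leaves only a pointer-sized handle behind.

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    // A fixed-size array by value lives wherever its variable lives:
    // here, directly in the function's stack frame.
    let on_stack = [0u8; 1000];
    assert_eq!(size_of_val(&on_stack), 1000);

    // Box::new moves the payload to the heap; only a pointer remains local.
    let on_heap: Box<[u8; 1000]> = Box::new([1u8; 1000]);
    assert_eq!(size_of::<Box<[u8; 1000]>>(), size_of::<usize>());
    assert_eq!(on_heap[999], 1);
}
```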

Maximum size of an array in 32 bits?

According to the Rust Reference:
The isize type is a signed integer type with the same number of bits as the platform's pointer type. The theoretical upper bound on object and array size is the maximum isize value. This ensures that isize can be used to calculate differences between pointers into an object or array and can address every byte within an object along with one byte past the end.
This obviously constrains an array to at most 2G elements on a 32-bit system; however, what is not clear is whether an array is also constrained to at most 2 GB of memory.
In C or C++, you would be able to cast the pointers to the first and one-past-last elements to char* and take the difference of those two pointers, effectively limiting the array to 2 GB (lest it overflow ptrdiff_t).
Is an array on a 32-bit platform also limited to 2 GB in Rust? Or not?
The internals of Vec do cap the allocation size to 4 GB on 32-bit targets, both in with_capacity and grow_capacity, using
let size = capacity.checked_mul(mem::size_of::<T>())
.expect("capacity overflow");
which will panic if the byte count overflows usize.
As such, Vec-allocated slices are also capped in this way in Rust. Given that this is because of an underlying restriction in the allocation API, I would be surprised if any typical type could circumvent this. And if they did, Index on slices would be unsafe due to pointer overflow. So I hope not.
It might still not be possible to allocate all 4GB for other reasons, though. In particular, allocate won't let you allocate more than 2GB (isize::MAX bytes), so Vec is restricted to that.
Rust uses LLVM as compiler backend. The LLVM instruction for pointer arithmetic (GetElementPtr) takes signed integer offsets and has undefined behavior on overflow, so it is impossible to index into arrays larger than 2GB when targeting a 32-bit platform.
To avoid undefined behavior, Rust will refuse to allocate more than 2 GB in a single allocation. See Rust issue #18726 for details.
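The overflow check quoted above can be written out by hand; on any target, a capacity request whose byte count doesn't fit in usize yields None, which is exactly where Vec panics:

```rust
use std::mem::size_of;

fn main() {
    // usize::MAX elements of 4 bytes each cannot be represented
    // as a byte count at all -- this is where Vec would panic.
    let too_big = usize::MAX.checked_mul(size_of::<u32>());
    assert_eq!(too_big, None);

    // A reasonable request multiplies fine:
    let ok = 1024usize.checked_mul(size_of::<u32>());
    assert_eq!(ok, Some(4096));

    // On top of that, single allocations are capped at isize::MAX bytes,
    // so even a representable size can be refused by the allocator.
    assert!(4096 <= isize::MAX as usize);
}
```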

GCC's std::string - why so weird implementation

When I was looking at the way std::string is implemented in GCC, I noticed that sizeof(std::string) is exactly equal to the size of a pointer (4 bytes in a 32-bit build, 8 bytes in a 64-bit build). As a string should hold a pointer to the string buffer and its length as a bare minimum, this made me think that the std::string object in GCC is actually a pointer to some internal structure that holds this data.
As a consequence, when a new string is created, one dynamic memory allocation should occur (even if the string is empty).
In addition to the performance overhead, this also causes memory overhead (the kind that occurs when allocating very small chunks of memory).
So I see only downsides of such design. What am I missing? What are the upsides and what is the reason for such implementation in the first place?
Read the long comment at the top of <bits/basic_string.h>, it explains what the pointer points to and where the string length (and reference count) are stored and why it's done that way.
However, C++11 doesn't allow a reference-counted copy-on-write std::string, so the GCC implementation will have to change. But doing so would break the ABI, so it is being delayed until an ABI change is inevitable. We don't want to change the ABI, then have to change it again a few months later, then again. When it changes, it should only change once, to minimise the hassle for users.