What is the null pointer optimization in Rust?

In Learning Rust With Entirely Too Many Linked Lists, the author mentions:
However, if we have a special kind of enum:
enum Foo {
    A,
    B(ContainsANonNullPtr),
}
the null pointer optimization kicks in, which eliminates the space needed for the tag. If the variant is A, the whole enum is set to all 0's. Otherwise, the variant is B. This works because B can never be all 0's, since it contains a non-zero pointer.
I guess the author is saying that (assuming the tag is 4 bits and the payload is 4 bits) after
let test = Foo::A;
the memory layout is
0000 0000
but after
let test = Foo::B(ptr);
the memory layout is
some 8-bit non-zero value
What exactly is optimized here? Aren't both representations always 8 bits? What does it mean when the author claims:
It means &, &mut, Box, Rc, Arc, Vec, and several other important types in Rust have no overhead when put in an Option

The null pointer optimization basically means this: if you have an enum with two variants, where one variant has no associated data and the other holds data for which the all-zeros bit pattern isn't a valid value, then the enum itself takes exactly the same amount of space as that associated value, and the all-zeros bit pattern is used to indicate the data-less variant.
In other words, this means that Option<&T> is exactly the same size as &T instead of requiring an extra word.
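You can check this directly; a minimal probe (the commented sizes assume a 64-bit target):
use std::mem::size_of;

fn main() {
    // None is encoded as the all-zeros (null) bit pattern, so no tag word is added.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>()); // both 8
    assert_eq!(size_of::<Option<Box<u32>>>(), size_of::<Box<u32>>()); // both 8
}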

A Rust enum is a tagged union. Without the optimization it looks like
Foo::A;    // tag 0x00, data 0xXX (unused)
Foo::B(2); // tag 0x01, data 0x02
The null pointer optimization removes the separate tag field:
Foo::A;    // tag+data 0x00
Foo::B(2); // tag+data 0x02

I'm also working through Too Many Linked Lists; perhaps this code snippet can deepen your understanding:
pub enum WithNullPtrOptimization {
    A,
    B(String),
}

pub enum WithoutNullPtrOptimization {
    A,
    B(u32),
}

fn main() {
    // String's internal buffer pointer can never be null, so the all-zeros
    // pattern encodes A and no separate tag is needed:
    println!("{} {}", std::mem::size_of::<WithNullPtrOptimization>(), std::mem::size_of::<String>()); // 24 24
    // Every u32 bit pattern is a valid value, so a separate tag (padded for
    // alignment) must be stored:
    println!("{} {}", std::mem::size_of::<WithoutNullPtrOptimization>(), std::mem::size_of::<u32>()); // 8 4
}


How do double ampersands passed to size_of_val work?

I read a book published by Apress named Beginning Rust - Get Started with Rust 2021 Edition.
In one of the code examples, the author does not explain in detail how the code works. Here is the code snippet:
/* In a 64-bit system, it prints:
   16 16 16; 8 8 8
   In a 32-bit system, it prints:
   8 8 8; 4 4 4
*/
fn main() {
    use std::mem::*;
    let a: &str = "";
    let b: &str = "0123456789";
    let c: &str = "abcdè";
    print!("{} {} {}; ",
        size_of_val(&a),
        size_of_val(&b),
        size_of_val(&c));
    print!("{} {} {}",
        size_of_val(&&a),
        size_of_val(&&b),
        size_of_val(&&c));
}
My question is: how does this work? size_of_val takes a reference, and a, b, and c are already references (&str), yet in the print! statement the author puts another ampersand before each variable. In addition, when we pass a variable without the extra ampersand, such as size_of_val(a), the sizes we get are 0 for a, 10 for b, and 6 for c; but when we pass it with the ampersand, such as size_of_val(&a), we get the sizes described in the comment above main (16 16 16 or 8 8 8). Finally, in the second print! the author puts double ampersands to get the size of the reference itself. How does that work? I thought it would be an error, since size_of_val accepts only a single reference.
The size_of_val() function is declared as follows:
pub fn size_of_val<T>(val: &T) -> usize
where
    T: ?Sized,
That means: given any type T (the ?Sized constraint means "really any type, even unsized ones"), we take a reference to T and give back a usize.
Let's take a as an example (b and c are the same).
When we evaluate size_of_val(a), the compiler knows that a has type &str, and thus it infers the generic parameter to be str (without a reference), so the full call is size_of_val::<str>(a /* &str */), which matches the signature: we pass a &str for T == str.
What is the size of a str? A str is a contiguous sequence of bytes encoding the string as UTF-8. a contains "", the empty string, which is of course zero bytes long, so size_of_val() returns 0. b holds 10 ASCII characters, each one byte long when UTF-8 encoded, so together they are 10 bytes. c contains four ASCII characters (abcd), so four bytes, plus one Unicode character (è) that is two bytes wide, encoded as \xC3\xA8 (195 and 168 in decimal): a total of six bytes.
What happens when we calculate size_of_val(&a)? &a is a &&str because a is a &str, so the compiler infers T to be &str. The size of a &str is constant and always double the size of a plain pointer: a &str, i.e. a pointer to str, must store both the data address and the length. On 64-bit platforms this is 16 (8 * 2); on 32-bit ones it is 8 (4 * 2). This is called a fat pointer: a pointer that carries additional metadata besides the address (note that it is not guaranteed to be exactly twice the pointer size, so don't rely on that, but in practice it is).
When we evaluate size_of_val(&&a), the type of &&a is &&&str, so T is inferred to be &&str. While &str (a pointer to str) is a fat pointer, a pointer to a fat pointer is an ordinary thin pointer (the opposite of a fat pointer: it carries only the address, with no additional metadata), so it is one machine word in size: 8 bytes on 64-bit platforms or 4 bytes on 32-bit ones.
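Putting the three cases together (a small check; the commented sizes assume a 64-bit target):
use std::mem::size_of_val;

fn main() {
    let c: &str = "abcdè";
    println!("{}", size_of_val(c));   // 6: the UTF-8 bytes of the str itself
    println!("{}", size_of_val(&c));  // 16: &str is a fat pointer (address + length)
    println!("{}", size_of_val(&&c)); // 8: &&str is a thin pointer
}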

Is it sound to transmute a MaybeUninit<[T; N]> to [MaybeUninit<T>; N]?

Is the following code sound?
#![feature(maybe_uninit)]

use std::mem;

const N: usize = 2; // or another number
type T = String; // or any other type

fn main() {
    unsafe {
        // create an uninitialized array
        let t: mem::MaybeUninit<[T; N]> = mem::MaybeUninit::uninitialized();
        // convert it to an array of uninitialized values
        let mut t: [mem::MaybeUninit<T>; N] = mem::transmute(t);
        // initialize the values
        t[0].set("Hi".to_string());
        t[1].set("there".to_string());
        // use the values
        println!("{} {}", t[0].get_ref(), t[1].get_ref());
        // drop the values
        mem::replace(&mut t[0], mem::MaybeUninit::uninitialized()).into_initialized();
        mem::replace(&mut t[1], mem::MaybeUninit::uninitialized()).into_initialized();
    }
}
I should note that miri runs it without problems.
Correction: The answer below still holds in the general case, but in the case of MaybeUninit there are a few handy special cases about memory layout that make this actually safe to do:
First, the documentation for MaybeUninit has a layout section stating that
MaybeUninit<T> is guaranteed to have the same size and alignment as T.
Secondly, the language reference says this about array layouts:
Arrays are laid out so that the nth element of the array is offset from the start of the array by n * the size of the type bytes. An array of [T; n] has a size of size_of::<T>() * n and the same alignment of T.
This means that the layout of MaybeUninit<[T; n]> and the layout of [MaybeUninit<T>; n] are the same.
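For comparison, here is a sketch of the same program written against today's stable MaybeUninit API (the method names differ from the nightly build used in the question; the concrete array size and element type are illustrative):
use std::mem::{transmute, MaybeUninit};

fn main() {
    // create an uninitialized array
    let t: MaybeUninit<[String; 2]> = MaybeUninit::uninit();
    // sound: MaybeUninit<[T; N]> and [MaybeUninit<T>; N] share the same layout
    let mut t: [MaybeUninit<String>; 2] = unsafe { transmute(t) };
    // initialize the values
    t[0].write("Hi".to_string());
    t[1].write("there".to_string());
    // SAFETY: every element has been initialized above
    let t: [String; 2] = unsafe { transmute(t) };
    println!("{} {}", t[0], t[1]);
}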
Original answer:
From what I can tell, this is one of those things that are likely to work but not guaranteed, and may be subject to compiler-specific or platform-specific behavior.
MaybeUninit is defined as follows in the current source:
#[allow(missing_debug_implementations)]
#[unstable(feature = "maybe_uninit", issue = "53491")]
pub union MaybeUninit<T> {
    uninit: (),
    value: ManuallyDrop<T>,
}
Since it's not marked with the #[repr] attribute (as opposed to for instance ManuallyDrop), it's in the default representation, of which the reference says this:
Nominal types without a repr attribute have the default representation. Informally, this representation is also called the rust representation.
There are no guarantees of data layout made by this representation.
In order to transmute from Wrapper<[T]> to [Wrapper<T>], it must be the case that the memory layout of Wrapper<T> is exactly the same as the memory layout of T. This is the case for a number of wrappers, such as the previously mentioned ManuallyDrop, and those will usually be marked with the #[repr(transparent)] attribute.
But in this case, this is not necessarily true. Since () is a zero-sized type, it's likely that the compiler will use the same memory layout for T and MaybeUninit<T> (and this is why it's working for you), but it is also possible that the compiler decides to use some other memory layout (e.g. for optimization purposes), in which case transmuting will no longer work.
As a specific example, the compiler may choose the following memory layout for MaybeUninit<T>:
+---+...+---+---+
|     T     | b |   where b is an "is initialized" flag
+---+...+---+---+
According to the quote above, the compiler is allowed to do this. In that case, MaybeUninit<[T]> and [MaybeUninit<T>] have different memory layouts, since MaybeUninit<[T]> has one b for the entire array, while [MaybeUninit<T>] has one b per element:
MaybeUninit<[T]>:
+------+------+...+--------+---+
| T[0] | T[1] | … | T[n-1] | b |
+------+------+...+--------+---+
Total size: n * size_of::<T>() + 1
[MaybeUninit<T>]:
+------+------+------+------+...+--------+--------+
| T[0] | b[0] | T[1] | b[1] | … | T[n-1] | b[n-1] |
+------+------+------+------+...+--------+--------+
Total size: n * (size_of::<T>() + 1)

What is the difference between casting to `i32` from `usize` versus the other way?

I am making a function that builds an array of n random numbers, but the comparison in my while loop throws an error.
while ar.len() as i32 < size { }
It complains with: expected one of `!`, `(`, `+`, `,`, `::`, `<`, or `>`, found `{`.
If I remove the as i32, it complains about mismatched types; if I instead add as usize to the size variable, it doesn't complain.
When you cast from a smaller-sized type to a larger one, you won't lose any data, but the data will now take up more space.
When you cast from a larger-sized type to a smaller one, you might lose some of your data, but the data will take up less space.
Pretend I have a box of size 1 that can hold the numbers 0 to 9 and another box of size 2 that can hold the numbers 0 to 99.
If I want to store the number 7; both boxes will work, but I will have space left over if I use the larger box. I could move the value from the smaller box to the larger box without any trouble.
If I want to store the number 42; only one box can fit the number: the larger one. If I try to take the number and cram it in the smaller box, something will be lost, usually the upper parts of the number. In this case, my 42 would be transformed into a 2! Oops!
In addition, signedness plays a role; when you cast between signed and unsigned numbers, you might incorrectly reinterpret the value, as a number like -1 becomes 255!
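A quick sketch of both effects:
fn main() {
    let fits: i32 = 42;
    println!("{}", fits as u8); // 42: small enough, nothing lost

    let too_big: i32 = 300;
    println!("{}", too_big as u8); // 44: upper bits discarded (300 - 256)

    let negative: i8 = -1;
    println!("{}", negative as u8); // 255: same bits, reinterpreted as unsigned
}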
See also:
How do I convert between numeric types safely and idiomatically?
In this particular case, it's a bit more complicated. A usize is defined to be a "pointer-sized integer", which is usually the native size of the machine. On a 64-bit x64 processor, that means a usize is 64 bits, and on a 32-bit x86 processor, it will be 32 bits.
Casting a usize to an i32 will thus operate differently depending on which type of machine you are running on.
The error message you get is because the code you've tried isn't syntactically correct, and the compiler isn't giving a good error message.
You really want to type
while (ar.len() as i32) < size { }
The parentheses ensure that operator precedence is applied as intended.
To be on the safe side, I'd cast to the larger type:
while ar.len() < size as usize { }
See also:
How do I convert a usize to a u32 using TryFrom?
How to idiomatically convert between u32 and usize?
Why is type conversion from u64 to usize allowed using `as` but not `From`?
It seems that your size is of type i32. You either need parentheses:
while (ar.len() as i32) < size { }
or cast size to usize:
while ar.len() < size as usize { }
since len() returns a usize and the types on both sides of the comparison need to match. You need the parentheses in the first case so that the < operator compares ar.len() as i32 with size, which is your intention, rather than trying to parse i32 < size as part of the cast.

Why does an enum require extra memory size?

My understanding is that a Rust enum is like a union in C, and that the system allocates space for the largest of the data types in the enum.
enum E1 {
    DblVal1(f64),
}

enum E2 {
    DblVal1(f64),
    DblVal2(f64),
    DblVal3(f64),
    DblVal4(f64),
}

fn main() {
    println!("Size is {}", std::mem::size_of::<E1>());
    println!("Size is {}", std::mem::size_of::<E2>());
}
Why does E1 take up 8 bytes as expected, while E2 takes up 16 bytes?
In Rust, unlike in C, enums are tagged unions. That is, the enum knows which value it holds. So 8 bytes wouldn't be enough because there would be no room for the tag.
As a first approximation, you can assume that an enum is the size of the maximum of its variants plus a discriminant value to know which variant it is, rounded up to be efficiently aligned. The alignment depends on the platform.
This isn't always true; some types are "clever" and pack a bit tighter, such as Option<&T>. Your E1 is another example: it doesn't need a discriminant because it has only one variant.
The actual memory layout of an enum is undefined and is up to the whim of the compiler. If you have an enum with variants that have no values, you can use a repr attribute to specify the total size.
You can also use a union in Rust. Unions do not have a tag/discriminant and are the size of the largest field (plus any alignment padding). In exchange, they are unsafe to read, as you can't be statically sure which field is active.
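A sketch that makes the discriminant's cost visible (the commented sizes assume a 64-bit target):
#![allow(dead_code)]
use std::mem::size_of;

enum E1 { DblVal1(f64) }
enum E2 { DblVal1(f64), DblVal2(f64) }
union U { a: f64, b: u64 }

fn main() {
    println!("{}", size_of::<E1>()); // 8: one variant, no discriminant needed
    println!("{}", size_of::<E2>()); // 16: 8-byte payload + discriminant, aligned to 8
    println!("{}", size_of::<U>());  // 8: a union stores no discriminant
}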
See also:
How to specify the representation type for an enum in Rust to interface with C++?
Why does Nil increase one enum size but not another? How is memory allocated for Rust enums?
What is the overhead of Rust's Option type?
Can I use the "null pointer optimization" for my own non-pointer types?
Why does Rust not have unions?

Is Option<T> optimized to a single byte when T allows it?

Suppose we have an enum Foo { A, B, C }.
Is an Option<Foo> optimized to a single byte in this case?
Bonus question: if so, what are the limits of the optimization process? Enums can be nested and contain other types. Is the compiler always capable of calculating the maximum number of combinations and then choosing the smallest representation?
The compiler is not very smart when it comes to optimizing the layout of enums for space. Given:
enum Option<T> { None, Some(T) }
enum Weird<T> { Nil, NotNil { x: isize, y: T } }
enum Foo { A, B, C }
There's really only one case the compiler considers:
An Option-like enum: one variant carrying no data ("nullary"), one variant containing exactly one datum. When used with a pointer known to never be null (currently, only references and Box<T>) the representation will be that of a single pointer, null indicating the nullary variant. As a special case, Weird will receive the same treatment, but the value of the y field will be used to determine which variant the value represents.
Beyond this, there are many, many possible optimizations available, but the compiler doesn't do them yet. In particular, your case will not be represented as a single byte. For a single enum, not considering the nested case, it will be represented as the smallest integer type that fits the discriminant.
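Enum layout is unspecified and the compiler's niche optimizations have expanded since this answer was written, so it's worth probing what your toolchain actually does; a minimal check:
use std::mem::size_of;

#[allow(dead_code)]
enum Foo { A, B, C }

fn main() {
    println!("{}", size_of::<Foo>());         // 1
    println!("{}", size_of::<Option<Foo>>()); // compiler-dependent; 1 on recent rustc
}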
