Is Option<T> optimized to a single byte when T allows it?

Suppose we have an enum Foo { A, B, C }.
Is an Option<Foo> optimized to a single byte in this case?
Bonus question: if so, what are the limits of the optimization process? Enums can be nested and contain other types. Is the compiler always capable of calculating the maximum number of combinations and then choosing the smallest representation?

The compiler is not very smart when it comes to optimizing the layout of enums for space. Given:
enum Option<T> { None, Some(T) }
enum Weird<T> { Nil, NotNil { x: i32, y: T } }
enum Foo { A, B, C }
There's really only one case the compiler considers:
An Option-like enum: one variant carrying no data ("nullary") and one variant containing exactly one datum. When used with a pointer known to never be null (currently, only references and Box<T>), the representation will be that of a single pointer, with null indicating the nullary variant. As a special case, Weird will receive the same treatment, but the value of the y field will be used to determine which variant the value represents.
Beyond this, there are many, many possible optimizations available, but the compiler doesn't do them yet. In particular, your case will not be represented as a single byte. For a single enum, not considering the nested case, it will be represented as the smallest integer type that can distinguish its variants.
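If you want to see what your compiler actually does, a minimal sketch along these lines prints the relevant sizes; the exact numbers are implementation details and depend on the compiler version and which layout optimizations it applies:

use std::mem::size_of;

#[allow(dead_code)]
enum Foo { A, B, C }

fn main() {
    // All results are layout details and may change between compiler versions.
    println!("Foo:             {}", size_of::<Foo>());
    println!("Option<Foo>:     {}", size_of::<Option<Foo>>());
    // Null pointer optimization: None is represented by the null pointer value.
    println!("Option<&u8>:     {}", size_of::<Option<&u8>>());
    println!("Option<Box<u8>>: {}", size_of::<Option<Box<u8>>>());
}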

Related

What's a unique/special case that requires tuple variants in an enum?

From the Rust tutorial, enum types can be defined with struct-like variants:
enum Shape {
    Circle { center: Point, radius: f64 },
    Rectangle { top_left: Point, bottom_right: Point },
}
and with tuple (non-struct) variants (thanks @ÖmerErden for the correct nomenclature):
enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Point),
}
As far as I can tell both are identical and have the exact same use cases in match, etc., but the struct variant forces you to use named fields and the tuple variant forces you to use ordered parameters.
Is there any other difference I am missing, in usage or output (performance, memory footprint, etc.)? Or is the latter just a lazy-syntax version of the struct one with the exact same outcome after compilation?
I only found two obvious usages:
to wrap a type with minimum boilerplate, so every time you need to go down to the wrapped object, it's just self.0.
struct MyString(String);
when the .0, .1, etc. syntax adds more clarity and you can't do it with named fields (since an identifier cannot start with a digit, fields cannot be given purely numeric names).
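As a minimal sketch (the Point type, Line variant, and describe function below are made up for illustration), the two forms behave the same; only the binding syntax in match differs:

struct Point { x: f64, y: f64 }

enum Shape {
    Circle { center: Point, radius: f64 }, // struct variant: bind fields by name
    Line(Point, Point),                    // tuple variant: bind fields by position
}

fn describe(s: &Shape) -> f64 {
    match s {
        Shape::Circle { radius, .. } => *radius,
        Shape::Line(a, b) => (b.x - a.x).hypot(b.y - a.y),
    }
}

fn main() {
    let c = Shape::Circle { center: Point { x: 0.0, y: 0.0 }, radius: 1.0 };
    let l = Shape::Line(Point { x: 0.0, y: 0.0 }, Point { x: 3.0, y: 4.0 });
    println!("{} {}", describe(&c), describe(&l)); // 1 5
}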

Rust traits required to get an algorithm to work with both f32 and f64 in Rust [duplicate]

I'm looking to write a function that can accept any floating point data, similar to the following form:
fn multiply<F: Float>(floating_point_number: F) -> F {
    floating_point_number * 2
}
But I can't find the syntax for it in the documentation, or a trait that is common to floating point numbers only.
Currently, all of the generic machinery for primitive numeric types in Rust is available in the official num crate. This crate contains, among other things, a number of traits which are implemented for the various primitive numeric types; in particular, there is Float, which represents a floating-point number.
The Float trait provides a lot of methods specific to floating-point numbers, and it also extends the Num and NumCast traits, which allow you to perform numeric operations and to construct the generic type from arbitrary primitive numbers. With Float your code could look like this:
use num::{Float, NumCast};

fn multiply<F: Float>(n: F) -> F {
    n * NumCast::from(2).unwrap()
}
NumCast::from() returns Option because not all numeric casts make sense, but in this particular case it is guaranteed to work, hence I used unwrap().
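A quick, self-contained usage sketch (assuming the num crate is a dependency) shows the same generic function working for both f32 and f64:

use num::{Float, NumCast};

fn multiply<F: Float>(n: F) -> F {
    n * NumCast::from(2).unwrap()
}

fn main() {
    let a: f32 = multiply(1.5f32);
    let b: f64 = multiply(1.5f64);
    println!("{} {}", a, b); // 3 3
}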

How can I create specific values for generic numeric types?

I'm working on a library which will provide a trait for axis-aligned bounding boxes (AABB) operations. The trait is declared like this:
trait Aabb {
    type Precision: Zero + One + Num + PartialOrd + Copy;
    // [...]
}
I don't care which precision the user chooses, as long as these constraints are respected (though I don't really expect integer types to be chosen).
I'm having trouble using literals. Some operations require constant values, as an example:
let extension = 0.1;
aabb.extend(extension);
This doesn't work because Aabb::extend expects Aabb::Precision and not a float. My solution was something like this:
let mut ten = Aabb::Precision::zero();
for _ in 0..10 {
    ten = ten + Aabb::Precision::one();
}
aabb_extension = Aabb::Precision::one() / ten;
This works, but I need to resort to this every time I need a specific number and it is getting cumbersome. Is this really the only way?
Basically, yes. Unless you can answer the question of "how do you support converting a literal 0 to MyCustomTypeThatImplementsTheTrait".
You can't have it both ways — you can't ask for something to be generic and then use concrete literals.
You can have different workarounds. Providing base values like "zero" and "one", or having a "convert a specific type to yourself" method, for example.
You could also re-evaluate what you are attempting to do; perhaps you are thinking at too low a level. Indeed, what does it mean to "extend by 0.1" a type that represents points as floating point values between 0 and 1?
Maybe it would be better to have an expand_by_percentage method instead, or something else that makes sense in the domain.
See also:
How do I use integer number literals when using generic types?
Cannot create a generic function that uses a literal zero
Dividing a const by a generic in Rust
How can I create an is_prime function that is generic over various integer types?
In this case, I would recommend that you create your own trait and provide default implementations of the methods.
For example, I would naively imagine:
trait ApproximateValue: Zero + One {
    fn approximate(val: f64) -> Self {
        // some algorithm to create "val" from Zero and One
    }
}
then, your Precision associated type will have a bound of ApproximateValue and you will just call Precision::approximate(0.1).
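A minimal sketch of that idea, assuming the num-traits crate (the Box1D type and extend_by_tenth function are made up for illustration): bound Precision by FromPrimitive so the conversion from a literal happens at the call site.

use num_traits::{FromPrimitive, Num, One, Zero};

trait Aabb {
    type Precision: Zero + One + Num + PartialOrd + Copy + FromPrimitive;
    fn extend(&mut self, amount: Self::Precision);
}

fn extend_by_tenth<A: Aabb>(aabb: &mut A) {
    // from_f64 returns Option because the conversion can fail for some types.
    let amount = <A::Precision as FromPrimitive>::from_f64(0.1)
        .expect("0.1 is not representable in this Precision type");
    aabb.extend(amount);
}

#[derive(Debug)]
struct Box1D { min: f32, max: f32 }

impl Aabb for Box1D {
    type Precision = f32;
    fn extend(&mut self, amount: f32) {
        self.min -= amount;
        self.max += amount;
    }
}

fn main() {
    let mut b = Box1D { min: 0.0, max: 1.0 };
    extend_by_tenth(&mut b);
    println!("{:?}", b); // Box1D { min: -0.1, max: 1.1 }
}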

Set of structs in Go

If I have a number of structs that I want to store:
type Stuff struct {
    a string
    b string
}
I can do it with a slice, but it seems like it would use less memory to use a proper set structure.
Unfortunately Go doesn't have a set structure. Everyone recommends using map[Stuff]struct{} but that doesn't work because Stuff is a struct. Anyone have any good solutions? Ideally without having to download a library.
Usually set and map data structures require more memory than storing the values in a plain array or slice, because a set or map also provides extra features, such as uniqueness or retrieval by key, efficiently.
If you want minimal memory usage, simply store the values in a slice, e.g. []Stuff. If you use the values in multiple places, it might also be profitable to store pointers, e.g. []*Stuff, so that each place that stores the same Stuff value can store the same pointer (without duplicating the value).
If you only want to store unique struct values, then indeed the set would be the most convenient choice, in Go realized with a map.
There's nothing wrong with map[Stuff]struct{}, it works. The requirement for the key type for maps:
The comparison operators == and != must be fully defined for operands of the key type; thus the key type must not be a function, map, or slice.
Stuff is a struct, and structs in Go are comparable if:
Struct values are comparable if all their fields are comparable. Two struct values are equal if their corresponding non-blank fields are equal.
If your Stuff struct is what you posted, it is comparable: it only contains fields of the comparable type string.
Also note that if you want a set data structure, it's clearer to use bool as the value type (e.g. map[Stuff]bool) and true as the value. Then you can simply use indexing to test whether a value is in the map: the index expression yields the zero value of the value type (false for bool) if the key (Stuff in your case) is not in the map, properly telling you the value is not in the "set"; and if it is in the map, its associated true value is the result of the index expression, properly telling you it is in the set.
As icza said, a map of structs is a valid option.
But instead of implementing the Set yourself, I would use one of the generic set implementations that have been available since Go 1.18.
See this one for example: https://github.com/amit7itz/goset
package main

import (
    "fmt"

    "github.com/amit7itz/goset"
)

type Stuff struct {
    a string
    b string
}

func main() {
    set := goset.NewSet[Stuff]()
    set.Add(Stuff{a: "1", b: "2"})
    set.Add(Stuff{a: "2", b: "3"})
    set.Add(Stuff{a: "2", b: "3"})
    fmt.Println(set)       // Set[main.Stuff]{{2 3} {1 2}}
    fmt.Println(set.Len()) // 2
}

Why does an enum require extra memory size?

My understanding is that a Rust enum is like a union in C, and that the compiler will allocate enough space for the largest of the variants in the enum.
enum E1 {
    DblVal1(f64),
}

enum E2 {
    DblVal1(f64),
    DblVal2(f64),
    DblVal3(f64),
    DblVal4(f64),
}

fn main() {
    println!("Size is {}", std::mem::size_of::<E1>());
    println!("Size is {}", std::mem::size_of::<E2>());
}
Why does E1 take up 8 bytes as expected, but E2 takes up 16 bytes?
In Rust, unlike in C, enums are tagged unions. That is, the enum knows which value it holds. So 8 bytes wouldn't be enough because there would be no room for the tag.
As a first approximation, you can assume that an enum is the size of the maximum of its variants plus a discriminant value to know which variant it is, rounded up to be efficiently aligned. The alignment depends on the platform.
This isn't always true; some types are "clever" and pack a bit tighter, such as Option<&T>. Your E1 is another example; it doesn't need a discriminant because it has only one variant.
The actual memory layout of an enum is undefined and is up to the whim of the compiler. If you have an enum with variants that have no values, you can use a repr attribute to specify the total size.
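For example, a minimal sketch: on an enum whose variants carry no data, repr(u8) pins the discriminant, and therefore the whole enum, to a single byte (the Flag type is made up for illustration):

#[allow(dead_code)]
#[repr(u8)]
enum Flag {
    A,
    B,
    C,
}

fn main() {
    assert_eq!(std::mem::size_of::<Flag>(), 1);
}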
You can also use a union in Rust. These do not have a tag/discriminant value and are the size of the largest variant (perhaps adding alignment as well). In exchange, these are unsafe to read as you can't be statically sure what variant it is.
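A minimal sketch of that comparison (the type names are made up, and the exact sizes are platform and compiler dependent): the union stays at the size of its largest field because it carries no tag, while the enum needs room for a discriminant plus alignment padding.

use std::mem::size_of;

#[allow(dead_code)]
enum TaggedPair {
    A(f64),
    B(f64),
}

#[allow(dead_code)]
union UntaggedPair {
    a: f64,
    b: f64,
}

fn main() {
    println!("enum:  {}", size_of::<TaggedPair>());   // typically 16: data + tag + padding
    println!("union: {}", size_of::<UntaggedPair>()); // 8: largest field only, no tag
}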
See also:
How to specify the representation type for an enum in Rust to interface with C++?
Why does Nil increase one enum size but not another? How is memory allocated for Rust enums?
What is the overhead of Rust's Option type?
Can I use the "null pointer optimization" for my own non-pointer types?
Why does Rust not have unions?

Resources