I've just learned that an Enum can be initialized with a custom data type.
Wouldn't that make Enums a subset of Structs? An Enum could be a Struct with a single unnamed variable which is initialized once and cannot be changed (like final variables in Java). Also, no methods can be implemented to enums.
Like this:
enum E {
ONE(String)
}
struct S {
one: String
}
Thus, in memory, both a single variable struct and enum would look the same.
Is this true or am I missing something?
It's actually the opposite: structs are subsets of enums. Any struct can be represented as one-variant enum:
struct Record { ... }
struct TupleLike(...);
enum Record { Variant { ... } }
enum TupleLike { Variant(...) }
This enum will even have the same in-memory representation! (though it isn't guaranteed).
On the other hand, enums with multiple variants cannot be described precisely as structs. They are usually implemented as tagged unions, for example:
enum E {
S(String),
I(i32),
}
E::I(123);
type E_Discriminant = u8;
const E_S: E_Discriminant = 0;
const E_I: E_Discriminant = 1;
union E_Payload {
s: String,
i: i32,
}
struct E {
discriminant: E_Discriminant,
payload: E_Payload,
}
E { discriminant: E_I, payload: E_Payload { i: 123 } };
But even doing that manually will not provide you the whole language experience of using enums: you will unable to use pattern matching, accessing variants will be unsafe (and dangerous), etc..
However, when only one variant is needed, structs are more comfortable to use, and that's why they're there.
Related
I would like to create a structure that's something like a compile-time immutable map with safely checked keys at compile-time. More generally, I would like an iterable associative array with safe key access.
My first attempt at this was using a const HashMap (such as described here) but then the keys are not safely accessible:
use phf::{phf_map};
static COUNTRIES: phf::Map<&'static str, &'static str> = phf_map! {
"US" => "United States",
"UK" => "United Kingdom",
};
COUNTRIES.get("EU") // no compile-time error
Another option I considered was using an enumerable enum with the strum crate as described here:
use strum::IntoEnumIterator; // 0.17.1
use strum_macros::EnumIter; // 0.17.1
#[derive(Debug, EnumIter)]
enum Direction {
NORTH,
SOUTH,
EAST,
WEST,
}
fn main() {
for direction in Direction::iter() {
println!("{:?}", direction);
}
}
This works, except that enum values in rust can only be integers. To assign a different value would require something like implementing a value() function for the enum with a match statement, (such as what's described here), however this means that any time the developer decides to append a new item, the value function must be updated as well, and rewriting the enum name in two places every time isn't ideal.
My last attempt was to use consts in an impl, like so:
struct MyType {
value: &'static str
}
impl MyType {
const ONE: MyType = MyType { value: "one" };
const TWO: MyType = MyType { value: "two" };
}
This allows single-write implementations and the objects are safely-accessible compile-time constants, however there's no way that I've found to iterate over them (as expressed by work-arounds here) (although this may be possible with some kind of procedural macro).
I'm coming from a lot of TypeScript where this kind of task is very simple:
const values = {
one: "one",
two: "two" // easy property addition
}
values.three; // COMPILE-TIME error
Object.keys(values).forEach(key => {...}) // iteration
Or even in Java where this can be done simply with enums with properties.
I'm aware this smells a bit like an XY problem, but I don't really think it's an absurd thing to ask generally for a safe, iterable, compile-time immutable constant associative array (boy is it a mouthful though). Is this pattern possible in Rust? The fact that I can't find anything on it and that it seems so difficult leads me to believe what I'm doing isn't the best practice for Rust code. In that case, what are the alternatives? If this is a bad design pattern for Rust, what would a good substitute be?
#JakubDóka How would I implement it? I did some looking at procedural macros and couldn't seem to understand how to implement such a macro.
macro_rules! decl_named_iterable_enum {
(
// notice how we format input as it should be inputted (good practice)
// here is the indentifier bound to $name variable, when we later mention it
// it will be replaced with the passed value
$name:ident {
// the `$(...)*` matches 0-infinity of consecutive `...`
// the `$(...)?` matches 0-1 of `...`
$($variant:ident $(= $repr:literal)?,)*
}
) => {
#[derive(Clone, Copy)]
enum $name {
// We use the metavar same way we bind it,
// just ommitting its token type
$($variant),*
// ^ this will insert `,` between the variants
}
impl $name {
// same story just with additional tokens
pub const VARIANTS: &[Self] = &[$(Self::$variant),*];
pub const fn name(self) -> &'static str {
match self {
$(
// see comments on other macro branches, this si a
// common way to handle optional patterns
Self::$variant => decl_named_iterable_enum!(#repr $variant $($repr)?),
)*
}
}
}
};
// this branch will match if literal is present
// in this case we just ignore the name
(#repr $name:ident $repr:literal) => {
$repr
};
// fallback for no literal provided,
// we stringify the name of variant
(#repr $name:ident) => {
stringify!($name)
};
}
// this is how you use the macro, similar to typescript
decl_named_iterable_enum! {
MyEnum {
Variant,
Short = "Long",
}
}
// some example code collecting names of variants
fn main() {
let name_list = MyEnum::VARIANTS
.iter()
.map(|v| v.name())
.collect::<Vec<_>>();
println!("{name_list:?}");
}
// Exercise for you:
// 1. replace `=` for name override with `:`
// 2. add a static `&[&str]` names accessed by `MyEnum::VARIANT_NAMES`
An enum is clearly a kind of key/value pair structure. Consequently, it would be nice to automatically create a dictionary from one wherein the enum variants become the possible keys and their payload the associated values. Keys without a payload would use the unit value. Here is a possible usage example:
enum PaperType {
PageSize(f32, f32),
Color(String),
Weight(f32),
IsGlossy,
}
let mut dict = make_enum_dictionary!(
PaperType,
allow_duplicates = true,
);
dict.insert(dict.PageSize, (8.5, 11.0));
dict.insert(dict.IsGlossy, ());
dict.insert_def(dict.IsGlossy);
dict.remove_all(dict.PageSize);
Significantly, since an enum is merely a list of values that may optionally carry a payload, auto-magically constructing a dictionary from it presents some semantic issues.
How does a strongly typed Dictionary<K, V> maintain the discriminant/value_type dependency inherent with enums where each discriminant has a specific payload type?
enum Ta {
K1(V1),
K2(V2),
...,
Kn(Vn),
}
How do you conveniently refer to an enum discriminant in code without its payload (Ta.K1?) and what type is it (Ta::Discriminant?) ?
Is the value to be set and get the entire enum value or just the payload?
get(&self, key: Ta::Discriminant) -> Option<Ta>
set(&mut self, value: Ta)
If it were possible to augment an existing enum auto-magically with another enum of of its variants then a reasonably efficient solution seems plausible in the following pseudo code:
type D = add_discriminant_keys!( T );
impl<D> for Vec<D> {
fn get(&self, key: D::Discriminant) -> Option<D> { todo!() }
fn set(&mut self, value: D) { todo!() }
}
I am not aware whether the macro, add_discriminant_keys!, or the construct, D::Discriminant, is even feasible. Unfortunately, I am still splashing in the shallow end of the Rust pool, despite this suggestion. However, the boldness of its macro language suggests many things are possible to those who believe.
Handling of duplicates is an implementation detail.
Enum discriminants are typically functions and therefore have a fixed pointer value (as far as I know). If such values could become constants of an associated type within the enum (like a trait) with attributes similar to what has been realized by strum::EnumDiscriminants things would look good. As it is, EnumDiscriminants seems like a sufficient interim solution.
A generic implementation over HashMap using strum_macros crate is provided based on in the rust playground; however, it is not functional there due to the inability of rust playground to load the strum crate from there. A macro derived solution would be nice.
First, like already said here, the right way to go is a struct with optional values.
However, for completeness sake, I'll show here how you can do that with a proc macro.
When you want to design a macro, especially a complicated one, the first thing to do is to plan what the emitted code will be. So, let's try to write the macro's output for the following reduced enum:
enum PaperType {
PageSize(f32, f32),
IsGlossy,
}
I will already warn you that our macro will not support brace-style enum variants, nor combining enums (your add_discriminant_keys!()). Both are possible to support, but both will complicate this already-complicated answer more. I'll refer to them shortly at the end.
First, let's design the map. It will be in a support crate. Let's call this crate denum (a name will be necessary later, when we'll refer to it from our macro):
pub struct Map<E> {
map: std::collections::HashMap<E, E>, // You can use any map implementation you want.
}
We want to store the discriminant as a key, and the enum as the value. So, we need a way to refer to the free discriminant. So, let's create a trait Enum:
pub trait Enum {
type DiscriminantsEnum: Eq + Hash; // The constraints are those of `HashMap`.
}
Now our map will look like that:
pub struct Map<E: Enum> {
map: std::collections::HashMap<E::DiscriminantsEnum, E>,
}
Our macro will generate the implementation of Enum. Hand-written, it'll be the following (note that in the macro, I wrap it in const _: () = { ... }. This is a technique used to prevent names polluting the global namespaces):
#[derive(PartialEq, Eq, Hash)]
pub enum PaperTypeDiscriminantsEnum {
PageSize,
IsGlossy,
}
impl Enum for PaperType {
type DiscriminantsEnum = PaperTypeDiscriminantsEnum;
}
Next. insert() operation:
impl<E: Enum> Map<E> {
pub fn insert(discriminant: E::DiscriminantsEnum, value: /* What's here? */) {}
}
There is no way in current Rust to refer to an enum discriminant as a distinct type. But there is a way to refer to struct as a distinct type.
We can think about the following:
pub struct PageSize;
But this pollutes the global namespace. Of course, we can call it something like PaperTypePageSize, but I much prefer something like PaperTypeDiscriminants::PageSize.
Modules to the rescue!
#[allow(non_snake_case)]
pub mod PaperTypeDiscriminants {
#[derive(Clone, Copy)]
pub struct PageSize;
#[derive(Clone, Copy)]
pub struct IsGlossy;
}
Now we need a way in insert() to validate the the provided discriminant indeed matches the wanted enum, and to refer to its value. A new trait!
pub trait EnumDiscriminant: Copy {
type Enum: Enum;
type Value;
fn to_discriminants_enum(self) -> <Self::Enum as Enum>::DiscriminantsEnum;
fn to_enum(self, value: Self::Value) -> Self::Enum;
}
And here's how our macro will implements it:
impl EnumDiscriminant for PaperTypeDiscriminants::PageSize {
type Enum = PaperType;
type Value = (f32, f32);
fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum { PaperTypeDiscriminantsEnum::PageSize }
fn to_enum(self, (v0, v1): Self::Value) -> Self::Enum { Self::Enum::PageSize(v0, v1) }
}
impl EnumDiscriminant for PaperTypeDiscriminants::IsGlossy {
type Enum = PaperType;
type Value = ();
fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum { PaperTypeDiscriminantsEnum::IsGlossy }
fn to_enum(self, (): Self::Value) -> Self::Enum { Self::Enum::IsGlossy }
}
And now insert():
pub fn insert<D>(&mut self, discriminant: D, value: D::Value)
where
D: EnumDiscriminant<Enum = E>,
{
self.map.insert(
discriminant.to_discriminants_enum(),
discriminant.to_enum(value),
);
}
And trivially insert_def():
pub fn insert_def<D>(&mut self, discriminant: D)
where
D: EnumDiscriminant<Enum = E, Value = ()>,
{
self.insert(discriminant, ());
}
And get() (note: seprately getting the value is possible when removing, by adding a method to the trait EnumDiscriminant with the signature fn enum_to_value(enum_: Self::Enum) -> Self::Value. It can be unsafe fn and use unreachable_unchecked() for better performance. But with get() and get_mut(), that returns reference, it's harder because you can't get a reference to the discriminant value. Here's a playground that does that nonetheless, but requires nightly):
pub fn get_entry<D>(&self, discriminant: D) -> Option<&E>
where
D: EnumDiscriminant<Enum = E>,
{
self.map.get(&discriminant.to_discriminants_enum())
}
get_mut() is very similar.
Note that my code doesn't handle duplicates but instead overwrites them, as it uses HashMap. However, you can easily create your own map that handles duplicates in another way.
Now that we have a clear picture in mind what the macro should generate, let's write it!
I decided to write it as a derive macro. You can use an attribute macro too, and even a function-like macro, but you must call it at the declaration site of your enum - because macros cannot inspect code other than the code the're applied to.
The enum will look like:
#[derive(denum::Enum)]
enum PaperType {
PageSize(f32, f32),
Color(String),
Weight(f32),
IsGlossy,
}
Usually, when my macro needs helper code, I put this code in crate and the macro in crate_macros, and reexports the macro from crate. So, the code in denum (besides the aforementioned Enum, EnumDiscriminant and Map):
pub use denum_macros::Enum;
denum_macros/src/lib.rs:
use proc_macro::TokenStream;
use quote::{format_ident, quote};
#[proc_macro_derive(Enum)]
pub fn derive_enum(item: TokenStream) -> TokenStream {
let item = syn::parse_macro_input!(item as syn::DeriveInput);
if item.generics.params.len() != 0 {
return syn::Error::new_spanned(
item.generics,
"`denum::Enum` does not work with generics currently",
)
.into_compile_error()
.into();
}
if item.generics.where_clause.is_some() {
return syn::Error::new_spanned(
item.generics.where_clause,
"`denum::Enum` does not work with `where` clauses currently",
)
.into_compile_error()
.into();
}
let (vis, name, variants) = match item {
syn::DeriveInput {
vis,
ident,
data: syn::Data::Enum(syn::DataEnum { variants, .. }),
..
} => (vis, ident, variants),
_ => {
return syn::Error::new_spanned(item, "`denum::Enum` works only with enums")
.into_compile_error()
.into()
}
};
let discriminants_mod_name = format_ident!("{}Discriminants", name);
let discriminants_enum_name = format_ident!("{}DiscriminantsEnum", name);
let mut discriminants_enum = Vec::new();
let mut discriminant_structs = Vec::new();
let mut enum_discriminant_impls = Vec::new();
for variant in variants {
let variant_name = variant.ident;
discriminant_structs.push(quote! {
#[derive(Clone, Copy)]
pub struct #variant_name;
});
let fields = match variant.fields {
syn::Fields::Named(_) => {
return syn::Error::new_spanned(
variant.fields,
"`denum::Enum` does not work with brace-style variants currently",
)
.into_compile_error()
.into()
}
syn::Fields::Unnamed(fields) => Some(fields.unnamed),
syn::Fields::Unit => None,
};
let value_destructuring = fields
.iter()
.flatten()
.enumerate()
.map(|(index, _)| format_ident!("v{}", index));
let value_destructuring = quote!((#(#value_destructuring,)*));
let value_builder = if fields.is_some() {
value_destructuring.clone()
} else {
quote!()
};
let value_type = fields.into_iter().flatten().map(|field| field.ty);
enum_discriminant_impls.push(quote! {
impl ::denum::EnumDiscriminant for #discriminants_mod_name::#variant_name {
type Enum = #name;
type Value = (#(#value_type,)*);
fn to_discriminants_enum(self) -> #discriminants_enum_name { #discriminants_enum_name::#variant_name }
fn to_enum(self, #value_destructuring: Self::Value) -> Self::Enum { Self::Enum::#variant_name #value_builder }
}
});
discriminants_enum.push(variant_name);
}
quote! {
#[allow(non_snake_case)]
#vis mod #discriminants_mod_name { #(#discriminant_structs)* }
const _: () = {
#[derive(PartialEq, Eq, Hash)]
pub enum #discriminants_enum_name { #(#discriminants_enum,)* }
impl ::denum::Enum for #name {
type DiscriminantsEnum = #discriminants_enum_name;
}
#(#enum_discriminant_impls)*
};
}
.into()
}
This macro has several flaws: it doesn't handle visibility modifiers and attributes correctly, for example. But in the general case, it works, and you can fine-tune it more.
If you want to also support brace-style variants, you can create a struct with the data (instead of the tuple we use currently).
Combining enum is possible if you'll not use a derive macro but a function-like macro, and invoke it on both enums, like:
denum::enums! {
enum A { ... }
enum B { ... }
}
Then the macro will have to combine the discriminants and use something like Either<A, B> when operating with the map.
Unfortunately, a couple of questions arise in that context:
should it be possible to use enum types only once? Or are there some which might be there multiple times?
what should happen if you insert a PageSize and there's already a PageSize in the dictionary?
All in all, a regular struct PaperType is much more suitable to properly model your domain. If you don't want to deal with Option, you can implement the Default trait to ensure that some sensible defaults are always available.
If you really, really want to go with a collection-style interface, the closest approximation would probably be a HashSet<PaperType>. You could then insert a value PaperType::PageSize.
I'm just learning Rust.
So I know this works:
enum Animal {
Cat { name: String, weight: f64 }
}
fn main() {
let a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
match a {
Animal::Cat{name, weight} => { println!("Cat n={} w={}", name, weight); }
}
}
but why can I not assign from an enum field directly like this:
...
let wt = a.weight;
Or am I using the wrong syntax? Or is it because Rust cannot guarantee an Animal instance will be of type Cat?
Is the only way to access fields of an instance of enum struct or tuple variant by using match?
It is because weight isn't part of the Animal enum, it's part of the Cat type of the enum. Say in the future you add another type of animal that doesn't have weight information:
enum Animal {
Cat { name: String, weight: f64 },
Insect { species: String, height: f64 },
}
Then a.weight wouldn't always have a value. You can use if let or match to conditionally get data from an enum if it exists:
if let Animal::Cat { weight, .. } = a {
println!("Your cat weights {} kg.", weight);
} else {
println!("get a cat");
};
Or is it because Rust cannot guarantee an Animal instance will be of type Cat?
Yes. In a normal enum, you'd have multiple variants, which may or may not have the same types, and thus fields (in fact some may not have fields at all).
So while it would technically be possible for rust to have attribute access on enums with a single variant or where all variants have a compatible version of the attribute, those are such rare cases they don't really matter and the value of such complexity would be essentially nil.
I don't rightly see the point of using a single-variant enum tho, why are you not just using a struct?
struct Cat { name: String, weight: f64 }
?
Is the only way to access fields of an instance of enum struct or tuple variant by using match?
You could also use an if let though that's essentially the same.
Or you can have a method which exposes the information (with a match or if let inside), as convenience (or because the variants are private).
Let say I define and instantiate an enum as follows:
enum MyEnum {
EmptyVariant,
TupleVariant(u8),
StructVariant {
key: u8,
value: char,
}
}
let instance = MyEnum::StructVariant{key: 8, value: 'a'};
Is it possible to match against this variant without destructuring? For example, instead of doing:
if let MyEnum::StructVariant{key, value} = instance {
eprintln!("key, value = {}, {}", key, value);
}
I would rather write something like:
if let MyEnum::StructVariant{VARIANT_MEMBERS} = instance {
eprintln!("key, value = {}, {}", VARIANT_MEMBERS.key, VARIANT_MEMBERS.value);
}
In this example, writing out the members of the struct variant is benign, but in the case where the variant has many members it makes the code difficult to read.
I don't think it is possible as of today.
I did see the following pattern, which in practice achieves what you asked for just with an extra intermediary type:
Instead place all the desired members of the StructVariant in an actual struct type, and have your enum use that struct as the only field of the StructVariant.
struct MyVariant {
key: u8,
value: char
}
enum MyEnum {
EmptyVariant,
TupleVariant(u8),
StructVariant(MyVariant)
}
if let MyEnum::StructVariant(x) = instance {
eprintln!("key, value = {}, {}", x.key, x.value);
}
This also proves handy when you want to pass the members of StructVariant around, to other functions or types.
When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.
In C I would just use a void pointer, C-like example:
struct SomeTool {
int type;
void *custom_data;
};
void invoke(SomeTool *tool) {
StructOnlyForThisTool *data = malloc(sizeof(*data));
/* ... fill in the data ... */
tool.custom_data = custom_data;
}
void execute(SomeTool *tool) {
StructOnlyForThisTool *data = tool.custom_data;
if (data.foo_bar) { /* do something */ }
}
When writing something similar in Rust, replacing void * with Option<Box<Any>>, however I'm finding that accessing the data is unreasonably verbose, eg:
struct SomeTool {
type: i32,
custom_data: Option<Box<Any>>,
};
fn invoke(tool: &mut SomeTool) {
let data = StructOnlyForThisTool { /* my custom data */ }
/* ... fill in the data ... */
tool.custom_data = Some(Box::new(custom_data));
}
fn execute(tool: &mut SomeTool) {
let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
if data.foo_bar { /* do something */ }
}
There is one line here which I'd like to be able to write in a more compact way:
tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
tool.custom_data.as_ref().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()
While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.
By convention, the uses of unwrap here aren't dangerous because:
While only some tools define custom data, the ones that do always define it.
When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
Any time these conventions aren't followed, its a bug and should panic.
Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?
Some possible options:
Remove the Option, just use Box<Any> with Box::new(()) representing None so access can be simplified a little.
Use a macro or function to hide verbosity - passing in the Option<Box<Any>>: will work of course, but prefer not - would use as a last resort.
Add a trait to Option<Box<Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
Update 1): since asking this question a point I didn't include seems relevant.
There may be multiple callback functions like execute which must all be able to access the custom_data. At the time I didn't think this was important to point out.
Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope, I found the only reliable way to do this was to write a macro.
If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field implementation closure using (self.impl_)(). For a more general approach, that will also work when you have more methods on the implementation, read on.
An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:
You no longer need an explicit type field because the run-time type information is incorporated in the trait object.
Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.
The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
Here is an example implementation:
struct SomeTool {
impl_: Box<SomeToolImpl>,
}
impl SomeTool {
fn execute(&mut self) {
self.impl_.execute();
}
}
trait SomeToolImpl {
fn execute(&mut self);
}
struct SpecificTool1 {
foo_bar: bool
}
impl SpecificTool1 {
pub fn new(foo_bar: bool) -> SomeTool {
let my_data = SpecificTool1 { foo_bar: foo_bar };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool1 {
fn execute(&mut self) {
println!("I am {}", self.foo_bar);
}
}
struct SpecificTool2 {
num: u64
}
impl SpecificTool2 {
pub fn new(num: u64) -> SomeTool {
let my_data = SpecificTool2 { num: num };
SomeTool { impl_: Box::new(my_data) }
}
}
impl SomeToolImpl for SpecificTool2 {
fn execute(&mut self) {
println!("I am {}", self.num);
}
}
pub fn main() {
let mut tool1: SomeTool = SpecificTool1::new(true);
let mut tool2: SomeTool = SpecificTool2::new(42);
tool1.execute();
tool2.execute();
}
Note that, in this design, it doesn't make sense to make implementation an Option because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.