Rust safe const iterable associative array - rust

I would like to create a structure that's something like a compile-time immutable map with safely checked keys at compile-time. More generally, I would like an iterable associative array with safe key access.
My first attempt at this was using a const HashMap (such as described here) but then the keys are not safely accessible:
use phf::{phf_map};
static COUNTRIES: phf::Map<&'static str, &'static str> = phf_map! {
"US" => "United States",
"UK" => "United Kingdom",
};
COUNTRIES.get("EU") // no compile-time error
Another option I considered was using an enumerable enum with the strum crate as described here:
use strum::IntoEnumIterator; // 0.17.1
use strum_macros::EnumIter; // 0.17.1
#[derive(Debug, EnumIter)]
enum Direction {
NORTH,
SOUTH,
EAST,
WEST,
}
fn main() {
for direction in Direction::iter() {
println!("{:?}", direction);
}
}
This works, except that enum values in Rust can only be integers. Assigning a different value would require something like implementing a value() function for the enum with a match statement (such as what's described here). However, this means that any time the developer appends a new item, the value function must be updated as well, and rewriting the enum name in two places every time isn't ideal.
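For illustration, the enum-plus-match approach described above might look like the following sketch. The `value` method and the `ALL` constant are hypothetical names, and the variant list is maintained by hand rather than by strum:

```rust
// Hypothetical sketch: an enum whose variants map to string values via `match`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Direction {
    North,
    South,
    East,
    West,
}

impl Direction {
    // Hand-maintained list of variants; it must be updated together with the enum.
    const ALL: [Direction; 4] = [
        Direction::North,
        Direction::South,
        Direction::East,
        Direction::West,
    ];

    // The match is exhaustive, so a new variant without an arm here fails to compile.
    const fn value(self) -> &'static str {
        match self {
            Direction::North => "north",
            Direction::South => "south",
            Direction::East => "east",
            Direction::West => "west",
        }
    }
}
```

The compiler catches a missing `match` arm, but each variant name still has to be written more than once, which is the duplication complained about above.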
My last attempt was to use consts in an impl, like so:
struct MyType {
value: &'static str
}
impl MyType {
const ONE: MyType = MyType { value: "one" };
const TWO: MyType = MyType { value: "two" };
}
This allows single-write implementations and the objects are safely-accessible compile-time constants, however there's no way that I've found to iterate over them (as expressed by work-arounds here) (although this may be possible with some kind of procedural macro).
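One manual workaround, sketched here under the assumption that a hand-written list is acceptable, is to add an associated array next to the constants. The `ALL` constant and the `all_values` helper are hypothetical names:

```rust
struct MyType {
    value: &'static str,
}

impl MyType {
    const ONE: MyType = MyType { value: "one" };
    const TWO: MyType = MyType { value: "two" };

    // Hypothetical addition: a hand-written list that exists only for iteration.
    const ALL: [MyType; 2] = [Self::ONE, Self::TWO];
}

// Collects every value by walking the `ALL` list.
fn all_values() -> Vec<&'static str> {
    MyType::ALL.iter().map(|item| item.value).collect()
}
```

This keeps key access checked at compile time (`MyType::THREE` does not exist), but each new constant must also be added to `ALL` by hand, so it is no longer a single-write solution.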
I'm coming from a lot of TypeScript where this kind of task is very simple:
const values = {
one: "one",
two: "two" // easy property addition
}
values.three; // COMPILE-TIME error
Object.keys(values).forEach(key => {...}) // iteration
Or even in Java where this can be done simply with enums with properties.
I'm aware this smells a bit like an XY problem, but I don't really think it's an absurd thing to ask generally for a safe, iterable, compile-time immutable constant associative array (boy is it a mouthful though). Is this pattern possible in Rust? The fact that I can't find anything on it and that it seems so difficult leads me to believe what I'm doing isn't the best practice for Rust code. In that case, what are the alternatives? If this is a bad design pattern for Rust, what would a good substitute be?

@JakubDóka How would I implement it? I did some looking at procedural macros and couldn't seem to understand how to implement such a macro.
macro_rules! decl_named_iterable_enum {
(
// notice how we format input as it should be inputted (good practice)
// here is the identifier bound to the $name variable; when we later mention it
// it will be replaced with the passed value
$name:ident {
// the `$(...)*` matches 0-infinity of consecutive `...`
// the `$(...)?` matches 0-1 of `...`
$($variant:ident $(= $repr:literal)?,)*
}
) => {
#[derive(Clone, Copy)]
enum $name {
// We use the metavar same way we bind it,
// just omitting its token type
$($variant),*
// ^ this will insert `,` between the variants
}
impl $name {
// same story just with additional tokens
pub const VARIANTS: &[Self] = &[$(Self::$variant),*];
pub const fn name(self) -> &'static str {
match self {
$(
// see comments on other macro branches, this is a
// common way to handle optional patterns
Self::$variant => decl_named_iterable_enum!(#repr $variant $($repr)?),
)*
}
}
}
};
// this branch will match if literal is present
// in this case we just ignore the name
(#repr $name:ident $repr:literal) => {
$repr
};
// fallback for no literal provided,
// we stringify the name of variant
(#repr $name:ident) => {
stringify!($name)
};
}
// this is how you use the macro, similar to typescript
decl_named_iterable_enum! {
MyEnum {
Variant,
Short = "Long",
}
}
// some example code collecting names of variants
fn main() {
let name_list = MyEnum::VARIANTS
.iter()
.map(|v| v.name())
.collect::<Vec<_>>();
println!("{name_list:?}");
}
// Exercise for you:
// 1. replace `=` for name override with `:`
// 2. add a static `&[&str]` names accessed by `MyEnum::VARIANT_NAMES`
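One possible take on the exercise, as a compact sketch rather than the answer author's own solution: the macro name `named_enum`, the `@repr` helper rule and the `Number` enum are all made up for illustration.

```rust
macro_rules! named_enum {
    // `Name { Variant, Other: "override", }` with an optional trailing comma.
    ($name:ident { $($variant:ident $(: $repr:literal)?),* $(,)? }) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        enum $name { $($variant),* }

        impl $name {
            pub const VARIANTS: &'static [Self] = &[$(Self::$variant),*];
            // Names are listed in the same order as `VARIANTS`.
            pub const VARIANT_NAMES: &'static [&'static str] =
                &[$(named_enum!(@repr $variant $($repr)?)),*];
        }
    };
    // An explicit literal wins over the variant identifier.
    (@repr $variant:ident $repr:literal) => { $repr };
    // Otherwise fall back to the stringified identifier.
    (@repr $variant:ident) => { stringify!($variant) };
}

named_enum! {
    Number {
        One,
        Two: "deux",
    }
}
```

Referring to a missing key such as `Number::Three` is rejected at compile time, while `Number::VARIANTS` and `Number::VARIANT_NAMES` provide the iteration.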

Related

Metaprogramming name to function and type lookup in Rust?

I am working on a system which produces and consumes large numbers of "events", they are a name with some small payload of data, and an attached function which is used as a kind of fold-left over the data, something like a reducer.
I receive from the upstream something like {t: 'fieldUpdated', p: {new: 'new field value'}}, and must in my program associate the fieldUpdated "callback" function with the incoming event and apply it. There is a confirmation command I must echo back (which follows a programmatic naming convention), and each type is custom.
I tried using simple macros to do codegen for the structs, callbacks, and with the paste::paste! macro crate, and with the stringify macro I made quite good progress.
Regrettably, however, I did not find a good way to metaprogram these into a list or map using macros. Extending an enum through macros doesn't seem to be possible, and solutions such as the use of ctors seem extremely hacky.
My ideal case is something like this:
type evPayload = {
new: String
}
let evHandler = fn(evPayload: )-> Result<(), Error> { Ok(()) }
// ...
let data = r#"{"t": "fieldUpdated", "p": {"new": "new field value"}}"#;
let v: Value = serde_json::from_str(data)?;
Given only knowledge of data, how can I use macros (the boilerplate is actually 2-3 types, 3 functions, and some factory and helper functions) in a way that lets me do a name-to-function lookup?
It seems like Serde's adjacently or internally tagged enums would get me there, if I could modify an enum in a macro https://serde.rs/enum-representations.html#internally-tagged
It almost feels like I need a macro which can either maintain an enum, or I can "cheat" and use module scoped ctors to do a quasi-static initialization of the names and types into a map.
My program would have on the order of 40-100 of these, with anything from 3-10 in a module. I don't think ctors are necessarily a problem here, but the fact that they're a little grey area handshake, and that ctors might preclude one day being able to cross-compile to wasm put me off a little.
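As a point of comparison, a name-to-function lookup that needs neither ctors nor an extensible enum can be sketched with a plain constant table. Everything here is hypothetical: the `Handler` signature, the `handlers!` macro, the `dispatch` function and the `fieldRemoved` event exist only for illustration, and the payload is passed as a raw string instead of a parsed JSON value.

```rust
// Hypothetical handler signature: takes the raw payload, returns a confirmation or an error.
type Handler = fn(&str) -> Result<String, String>;

fn field_updated(payload: &str) -> Result<String, String> {
    Ok(format!("fieldUpdated applied: {payload}"))
}

fn field_removed(payload: &str) -> Result<String, String> {
    Ok(format!("fieldRemoved applied: {payload}"))
}

// The macro keeps each event name and its handler together on one line of the table.
macro_rules! handlers {
    ($($name:literal => $func:ident),* $(,)?) => {
        const HANDLERS: &[(&str, Handler)] = &[$(($name, $func)),*];
    };
}

handlers! {
    "fieldUpdated" => field_updated,
    "fieldRemoved" => field_removed,
}

// Linear search is used for simplicity; with 40-100 entries a map may be preferable.
fn dispatch(name: &str, payload: &str) -> Result<String, String> {
    HANDLERS
        .iter()
        .find(|(event, _)| *event == name)
        .map(|(_, handler)| handler(payload))
        .unwrap_or_else(|| Err(format!("no handler registered for {name}")))
}
```

The limitation is that all handlers must be listed in one macro invocation, so entries spread over many modules would still have to be gathered in a single place.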
I actually had need of something similar today; the enum macro part specifically. But beware of my method: here be dragons!
Someone more experienced than me — and less mad — should probably vet this. Please do not assume my SAFETY comments to be correct.
Also, if you don't have variants that collide with Rust keywords, you might want to tear out the '_' prefix hack entirely. I used a static mut byte array for that purpose, as manipulating strings was an order of magnitude slower, but that was benchmarked in a simplified function. There are likely better ways of doing this.
Finally, I am using it where failing to parse must cause a panic, so error handling is somewhat limited.
With that being said, here's my current solution:
use std::fmt::Display;
use std::str::{from_utf8_unchecked, FromStr};
/// NOTE: It is **imperative** that the length of this array is greater than the length of the longest variant name + 1
static mut CHECK_BUFF: [u8; 32] = [b'_'; 32];
macro_rules! str_enums {
($enum:ident: $($variant:ident),* $(,)?) => {
#[allow(non_camel_case_types)]
#[derive(Debug, Default, Hash, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum $enum {
#[default]
UNINIT,
$($variant),*,
UNKNOWN
}
impl FromStr for $enum {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
unsafe {
// `len` must be computed and checked before the buffer is written
let len = s.len() + 1;
assert!(CHECK_BUFF.len() >= len);
// SAFETY: Currently only single threaded
CHECK_BUFF[1..len].copy_from_slice(s.as_bytes());
// SAFETY: Safe as long as CHECK_BUFF.len() >= s.len() + 1
match from_utf8_unchecked(&CHECK_BUFF[..len]) {
$(stringify!($variant) => Ok(Self::$variant),)*
_ => Err(format!(
"{} variant not accounted for: {s} ({},)",
stringify!($enum),
from_utf8_unchecked(&CHECK_BUFF[..len])
))
}
}
}
}
impl From<&$enum> for &'static str {
fn from(variant: &$enum) -> Self {
unsafe {
match variant {
// SAFETY: The first byte is always '_', and stripping it off should be safe.
$($enum::$variant => from_utf8_unchecked(&stringify!($variant).as_bytes()[1..]),)*
$enum::UNINIT => {
eprintln!("uninitialized {}!", stringify!($enum));
""
}
$enum::UNKNOWN => {
eprintln!("unknown {}!", stringify!($enum));
""
}
}
}
}
}
impl Display for $enum {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", Into::<&str>::into(self))
}
}
};
}
And then I call it like so:
str_enums!(
AttributeKind:
_alias,
_allowduplicate,
_altlen,
_api,
...
_enum,
_type,
_struct,
);
str_enums!(
MarkupKind:
_alias,
_apientry,
_command,
_commands,
...
);
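For comparison, the same '_' prefix trick can be sketched without `static mut` or `unsafe`. This is an alternative, not the answer author's method, and it has not been benchmarked against the buffer approach. The macro name `prefixed_str_enum` and the `as_str` method are hypothetical, and the `UNINIT`/`UNKNOWN` variants are left out for brevity.

```rust
use std::str::FromStr;

macro_rules! prefixed_str_enum {
    ($enum:ident: $($variant:ident),* $(,)?) => {
        #[allow(non_camel_case_types)]
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum $enum { $($variant),* }

        impl $enum {
            // Variant name without the leading '_' (one ASCII byte, so slicing at 1 is valid).
            fn as_str(self) -> &'static str {
                match self {
                    $(Self::$variant => &stringify!($variant)[1..],)*
                }
            }
        }

        impl FromStr for $enum {
            type Err = String;
            fn from_str(s: &str) -> Result<Self, Self::Err> {
                // Compare the input against each variant name with its prefix removed.
                $(
                    if s == &stringify!($variant)[1..] {
                        return Ok(Self::$variant);
                    }
                )*
                Err(format!("{} variant not accounted for: {}", stringify!($enum), s))
            }
        }
    };
}

prefixed_str_enum!(AttributeKind: _alias, _enum, _type);
```

With this in place, `"alias".parse::<AttributeKind>()` yields `AttributeKind::_alias`, and `AttributeKind::_type.as_str()` yields `"type"`.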

Rust treat two different structs as one

I have two different structs with similar functions. Suppose that the program chooses which struct to use based on user input.
I want to write something like this
fn main() {
...
let obj = if a == first {
first_object //struct First
} else {
second_object//struct Second
};
obj.read();
obj.write();
obj.some_another_method();
}
I have tried to make an enumeration
pub enum ObjectKind {
FirstObject(First),
SecondObject(Second)
}
But I cannot use methods in this case
let something = ObjectKind::FirstObject(first_object);
something.read()
//no method named `read` found for enum `structs::ObjectKind` in the current scope
//method not found in `structs::ObjectKind`
But I cannot use methods in this case
An enum is a proper type in and of itself, it's not a "union" of existing types. You can just define the methods on the enum to forward to the relevant object e.g.
impl ObjectKind {
fn read(&self) {
match self {
ObjectKind::FirstObject(f) => f.read(),
ObjectKind::SecondObject(s) => s.read(),
}
}
}
As this would probably be a bit repetitive, you can use a macro to make the forwarding easier.
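A minimal sketch of such a forwarding macro, assuming every forwarded method takes `&self` and returns `&'static str` (the `forward!` macro and the method bodies are hypothetical):

```rust
struct First;
struct Second;

impl First {
    fn read(&self) -> &'static str { "first read" }
    fn write(&self) -> &'static str { "first write" }
}

impl Second {
    fn read(&self) -> &'static str { "second read" }
    fn write(&self) -> &'static str { "second write" }
}

enum ObjectKind {
    FirstObject(First),
    SecondObject(Second),
}

// Generates one forwarding method on `ObjectKind` per listed method name.
macro_rules! forward {
    ($($method:ident),* $(,)?) => {
        impl ObjectKind {
            $(
                fn $method(&self) -> &'static str {
                    match self {
                        ObjectKind::FirstObject(inner) => inner.$method(),
                        ObjectKind::SecondObject(inner) => inner.$method(),
                    }
                }
            )*
        }
    };
}

forward!(read, write);
```

Methods with other signatures would need additional macro arms or hand-written forwarding.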
Alternatively, you might be able to define a trait for the common behaviour and use a trait object to dynamically dispatch between the two, but that can be somewhat restrictive.
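The trait-object route could be sketched as follows. The trait name `Object`, the struct fields and the `choose` function are assumptions made for this example:

```rust
// Common behaviour shared by both structs.
trait Object {
    fn read(&self) -> String;
    fn write(&mut self, data: &str);
}

struct First {
    buf: String,
}

struct Second {
    lines: Vec<String>,
}

impl Object for First {
    fn read(&self) -> String {
        self.buf.clone()
    }
    fn write(&mut self, data: &str) {
        self.buf.push_str(data);
    }
}

impl Object for Second {
    fn read(&self) -> String {
        self.lines.join("\n")
    }
    fn write(&mut self, data: &str) {
        self.lines.push(data.to_string());
    }
}

// Picks the concrete type at run time and hides it behind `Box<dyn Object>`.
fn choose(use_first: bool) -> Box<dyn Object> {
    if use_first {
        Box::new(First { buf: String::new() })
    } else {
        Box::new(Second { lines: Vec::new() })
    }
}
```

The caller then works with `obj.read()` and `obj.write(..)` without knowing which struct was chosen. The restriction mentioned above shows up once the trait needs generic methods or methods returning `Self`, which cannot be called through `dyn Object`.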

Difference between match binding and match destructuring?

Recently I was reading about the match keyword in the Rust Book. What confused me was the difference between Binding and Destructuring. In my understanding, both of these provide a way to access variables in an expression. Binding can specify a range matching, but you can achieve it with Destructuring and Guards. So can someone show some cases that only Binding can do or explain the real difference between these two concepts?
Here you can see one scenario where a binding is needed because destructuring doesn't satisfy our current need. If we simply destructure the struct, we only get access to its inner fields. This means that the expression on the right-hand side of the match arm won't have access to the methods defined on the struct.
In my example I also match against a specific value of ex.some_value. This is of course not necessary and can be done with a guard instead, but this way is more concise if the condition isn't very complex.
struct Example {
some_value: i32,
some_other_value: String
}
impl Example {
pub fn some_fn(&mut self) {}
}
fn main() {
let ex = Example { some_value: 42, some_other_value: "Foobar".to_string() };
match ex {
mut new_ex @ Example { some_value: 43, .. } => new_ex.some_fn(),
Example { some_value: first, some_other_value: second } => println!("first value: {}\nSecond value: {}", first, second),
}
}
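The range case raised in the question can be sketched side by side. Both functions below are hypothetical and are intended to behave identically; the `@` form simply names the value while testing it against a pattern, whereas the guard form binds first and tests afterwards.

```rust
// Binds the matched number to `n` while also testing it against a pattern.
fn describe_with_binding(value: u32) -> String {
    match value {
        n @ 1..=5 => format!("small: {n}"),
        n @ (6 | 7) => format!("six or seven: {n}"),
        n => format!("other: {n}"),
    }
}

// The same logic written with plain bindings and guards.
fn describe_with_guard(value: u32) -> String {
    match value {
        n if (1..=5).contains(&n) => format!("small: {n}"),
        n if n == 6 || n == 7 => format!("six or seven: {n}"),
        n => format!("other: {n}"),
    }
}
```

One practical difference is that the compiler can take patterns into account when checking exhaustiveness, while guards are treated as opaque conditions.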

How do I process enum/struct/field attributes in a procedural macro?

Serde supports applying custom attributes that are used with #[derive(Serialize)]:
#[derive(Serialize)]
struct Resource {
// Always serialized.
name: String,
// Never serialized.
#[serde(skip_serializing)]
hash: String,
// Use a method to decide whether the field should be skipped.
#[serde(skip_serializing_if = "Map::is_empty")]
metadata: Map<String, String>,
}
I understand how to implement a procedural macro (Serialize in this example) but what should I do to implement #[serde(skip_serializing)]? I was unable to find this information anywhere. The docs don't even mention this. I have tried to look at the serde-derive source code but it is very complicated for me.
First you must register all of your attributes in the same place you register your procedural macro. Let's say we want to add two attributes (we are not yet saying what they will belong to: structs, fields, or both):
#[proc_macro_derive(FiniteStateMachine, attributes(state_transitions, state_change))]
pub fn fxsm(input: TokenStream) -> TokenStream {
// ...
}
After that you may already compile your user code with the following:
#[derive(Copy, Clone, Debug, FiniteStateMachine)]
#[state_change(GameEvent, change_condition)] // optional
enum GameState {
#[state_transitions(NeedServer, Ready)]
Prepare { players: u8 },
#[state_transitions(Prepare, Ready)]
NeedServer,
#[state_transitions(Prepare)]
Ready,
}
Without that, the compiler will give an error with a message like:
state_change does not belong to any known attribute.
These attributes are optional and all we have done is allow them to be specified. When you derive your procedural macro you may check for anything you want (including the existence of attributes) and panic! on some condition with a meaningful message, which the compiler will report.
Now we will talk about handling the attribute! Let's set aside the state_transitions attribute, because its handling does not differ much from handling struct/enum attributes (it is only a little more code), and talk about state_change. The syn crate gives you all the needed information about definitions (unfortunately not about implementations, meaning impl blocks, but this is enough for handling attributes). In more detail, we need syn::DeriveInput, syn::Body, syn::Variant, syn::Attribute and finally syn::MetaItem.
To handle the attribute of a field you need to walk through all these structures from one to the next. When you reach Vec<syn::Attribute>, you have what you want: a list of all attributes of a field. Here our state_transitions can be found. When you find it, you may want to get its content, which can be done by matching on the syn::MetaItem enum. Just read the docs :) Here is a simple example which panics when it finds the state_change attribute on the target type, and which also checks whether the target type derives Copy, Clone, or neither:
#[proc_macro_derive(FiniteStateMachine, attributes(state_transitions, state_change))]
pub fn fxsm(input: TokenStream) -> TokenStream {
// Construct a string representation of the type definition
let s = input.to_string();
// Parse the string representation
let ast = syn::parse_derive_input(&s).unwrap();
// Build the impl
let gen = impl_fsm(&ast);
// Return the generated impl
gen.parse().unwrap()
}
fn impl_fsm(ast: &syn::DeriveInput) -> Tokens {
const STATE_CHANGE_ATTR_NAME: &'static str = "state_change";
if let syn::Body::Enum(ref variants) = ast.body {
// Looks for state_change attribute (our attribute)
if let Some(ref a) = ast.attrs.iter().find(|a| a.name() == STATE_CHANGE_ATTR_NAME) {
if let syn::MetaItem::List(_, ref nested) = a.value {
panic!("Found our attribute with contents: {:?}", nested);
}
}
// Looks for derive impls (not our attribute)
if let Some(ref a) = ast.attrs.iter().find(|a| a.name() == "derive") {
if let syn::MetaItem::List(_, ref nested) = a.value {
if derives(nested, "Copy") {
return gen_for_copyable(&ast.ident, &variants, &ast.generics);
} else if derives(nested, "Clone") {
return gen_for_clonable(&ast.ident, &variants, &ast.generics);
} else {
panic!("Unable to produce Finite State Machine code on an enum which derives neither Copy nor Clone.");
}
} else {
panic!("Unable to produce Finite State Machine code on an enum which derives neither Copy nor Clone.");
}
} else {
panic!("How have you been able to call me without derive!?!?");
}
} else {
panic!("Finite State Machine must be derived on a enum.");
}
}
fn derives(nested: &[syn::NestedMetaItem], trait_name: &str) -> bool {
nested.iter().find(|n| {
if let syn::NestedMetaItem::MetaItem(ref mt) = **n {
if let syn::MetaItem::Word(ref id) = *mt {
return id == trait_name;
}
return false
}
false
}).is_some()
}
You may be interested in reading serde_codegen_internals, serde_derive, serenity's #[command] attr, another small project of mine - unique-type-id, fxsm-derive. The last link is actually my own project to explain to myself how to use procedural macros in Rust.
Since Rust 1.15 and an update of the syn crate, it is no longer possible to check the derives of an enum or struct; however, everything else works okay.
You implement attributes on fields as part of the derive macro for the struct (you can only implement derive macros for structs and enums).
Serde does this by checking every field for an attribute within the structures provided by syn and changing the code generation accordingly.
You can find the relevant code here: https://github.com/serde-rs/serde/blob/master/serde_derive/src/internals/attr.rs
To expand on Victor Polevoy's answer when it comes to the state_transitions attribute: here is an example of how to extract the variant attribute #[state_transitions(NeedServer, Ready)] on an enum that derives #[derive(FiniteStateMachine)]:
#[derive(FiniteStateMachine)]
enum GameState {
#[state_transitions(NeedServer, Ready)] // <-- extract this
Prepare { players: u8 },
#[state_transitions(Prepare, Ready)]
NeedServer,
#[state_transitions(Prepare)]
Ready,
}
use proc_macro::TokenStream;
#[proc_macro_derive(FiniteStateMachine, attributes(state_transitions))]
pub fn finite_state_machine(input: TokenStream) -> TokenStream {
// The target type must be spelled out so that syn knows what to parse into
let ast: syn::DeriveInput = syn::parse(input).unwrap();
// Extract the enum variants
let variants: Vec<&syn::Variant> = match &ast.data {
syn::Data::Enum(ref data_enum) => data_enum.variants.iter().collect(),
other => panic!("#[derive(FiniteStateMachine)] expects enum, got {:#?}", other)
};
// For each variant, extract the attributes
let _ = variants.iter().map(|variant| {
let attrs = variant.attrs.iter()
// checks attribute named "state_transitions(...)"
.find_map(|attr| match attr.path.is_ident("state_transitions") {
true => Some(&attr.tokens),
false => None,
})
.expect("#[derive(FiniteStateMachine)] expects attribute macros #[state_transitions(...)] on each variant, found none");
// outputs: attr: "(NeedServer, Ready)"
eprintln!("attr: {:#?}", attrs.to_string());
// do something with the extracted attributes
...
})
.collect();
...
}
The content of the extracted attrs (typed TokenStream) looks like this:
TokenStream [
Group {
delimiter: Parenthesis,
stream: TokenStream [
Ident {
ident: "NeedServer",
span: #0 bytes(5511..5521),
},
Punct {
ch: ',',
spacing: Alone,
span: #0 bytes(5521..5522),
},
Ident {
ident: "Ready",
span: #0 bytes(5523..5528),
},
],
span: #0 bytes(5510..5529),
},
]
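As a dependency-free illustration of the "do something with the extracted attributes" step, the rendered string shown above can be split into state names. This is only a sketch: `parse_state_list` is a hypothetical helper, it assumes the attribute arguments are a flat comma-separated list of identifiers, and in a real macro syn's own parsing facilities are likely a more robust choice than string handling.

```rust
// Splits rendered attribute arguments such as "(NeedServer, Ready)" into state names.
fn parse_state_list(rendered: &str) -> Vec<String> {
    rendered
        .trim()
        .trim_start_matches('(')
        .trim_end_matches(')')
        .split(',')
        .map(str::trim)
        // An empty argument list like "()" yields no names.
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect()
}
```

Inside the derive function this would be called as `parse_state_list(&attrs.to_string())` for each variant.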

Accessing tuple from within an enum

I have a Rust enum defined like this
enum MyFirstEnum {
TupleType(f32, i8, String),
StuctType {varone: i32, vartwo: f64},
NewTypeTuple(i32),
SomeVarName
}
I have the following code:
let mfe: MyFirstEnum = MyFirstEnum::TupleType(3.14, 1, "Hello".to_string());
I'm following the Rust documentation and this looks fine. I don't need to define everything in the enum, but how would I go about accessing the mid element in the enum tuple?
mfe.TupleType.1 and mfe.1 don't work when I add them to a println!
I know Rust provides the facility to do pattern matching to obtain the value, but if I changed the code to define the other variants within the enum, the code to output a particular variant would quickly become a mess.
Is there a simple way to output the variant of the tuple (or any other variant) in the enum?
This is a common misconception: enum variants are not their own types (at least in Rust 1.9). Therefore when you create a variable like this:
let value = MyFirstEnum::TupleType(3.14, 1, "Hello".to_string());
The fact that it's a specific variant is immediately "lost". You will need to pattern match to prevent accessing the enum as the wrong variant. You may prefer to use an if let statement instead of a match:
if let MyFirstEnum::TupleType(f, i, s) = value {
// Values available here
println!("f: {:?}", f);
}
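If the same element is needed in several places, the pattern match can be wrapped once in an accessor method so that call sites stay short. A minimal sketch, where `tuple_mid` is a hypothetical method name:

```rust
enum MyFirstEnum {
    TupleType(f32, i8, String),
    StuctType { varone: i32, vartwo: f64 },
    NewTypeTuple(i32),
    SomeVarName,
}

impl MyFirstEnum {
    // Returns the middle element when `self` is `TupleType`, and `None` for any other variant.
    fn tuple_mid(&self) -> Option<i8> {
        match self {
            MyFirstEnum::TupleType(_, mid, _) => Some(*mid),
            _ => None,
        }
    }
}
```

Returning `Option` makes the caller deal with the case where the value is a different variant, which is exactly the information that is lost once the value has type `MyFirstEnum`.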
Example solution:
enum MyFirstEnum {
TupleType(f32, i8, String),
// StuctType { varone: i32, vartwo: f64 },
// NewTypeTuple(i32),
// SomeVarName,
}
fn main() {
let mfe: MyFirstEnum = MyFirstEnum::TupleType(3.14, 1, "Hello".to_string());
let MyFirstEnum::TupleType(value, id, text) = &mfe;
println!("[{}; {}; {}]", value, id, text);
//or
match &mfe {
MyFirstEnum::TupleType(value, id, text) => {
println!("[{}; {}; {}]", value, id, text);
}
// _ => {}
}
}

Resources