Automatically create enum variants - rust

I want attribute macros that automatically add the annotated structs as variants of an enum.
For example
#[add_enum(AutoEnum)]
struct A(usize);
#[add_enum(AutoEnum)]
struct B(i32);
#[auto_enum]
enum AutoEnum{}
The above should expand to:
struct A(usize);
struct B(i32);
enum AutoEnum{
A(A),
B(B),
}
I have looked around and could not find a way to get information on other attribute macros (e.g. #[add_enum(AutoEnum)] from #[auto_enum]).
Is this possible?

Here's a crazy idea that probably won't work in your real code but should work in the example you gave (after a slight modification), so 🤷.
Unlike every other item in Rust, macros can shadow each other: if you have two macros with the same name, the one defined later shadows the first.
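A quick way to see the shadowing in isolation (greeting! is just an illustrative name, and both macros are written by hand here rather than generated):

```rust
// Two hand-written macros with the same name; the second shadows the first
// for every invocation that textually follows it.
macro_rules! greeting {
    () => { "first" };
}

// Expanded while only the first definition is in scope.
const BEFORE: &str = greeting!();

macro_rules! greeting {
    () => { "second" };
}

// Expanded after the second definition, so it sees the shadowing macro.
const AFTER: &str = greeting!();

fn main() {
    println!("{BEFORE} {AFTER}");
}
```

The lookup is textual, so the order of definitions and invocations in the file is what matters.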
So I thought: what if we define macros for all possible names of the enum variants, then shadow them for each actual variant? This gives us a way to expand every variant to what we want.
It will be clearer with an example. Suppose we want to only support letters A-C as variant names, and only support one letter (no AC). We first need some place to define the "default" no-variant macros, and it needs to be before any variant. So let's modify the example as follows:
#[auto_enum(declare)] // New!
enum AutoEnum {}
#[add_enum(AutoEnum)]
struct A(usize);
#[add_enum(AutoEnum)]
struct B(i32);
#[auto_enum(define)] // Changed! (This is not necessary).
enum AutoEnum {}
Then, the #[auto_enum(declare)] will expand to the following three "default" macros, that pass their input as-is to the next macro except the last macro that creates the final enum:
macro_rules! AutoEnum_A {
( $($t:tt)* ) => { AutoEnum_B! { $($t)* } };
}
macro_rules! AutoEnum_B {
( $($t:tt)* ) => { AutoEnum_C! { $($t)* } };
}
macro_rules! AutoEnum_C {
( $($t:tt)* ) => {
enum AutoEnum { $($t)* }
};
}
Now, every #[add_enum] call will expand to a macro that shadows the "default" macro, and instead of passing the input to the next macro as-is, it adds its variant to it:
// #[add_enum(AutoEnum)]
// struct A(usize);
macro_rules! AutoEnum_A {
( $($t:tt)* ) => { AutoEnum_B! { $($t)* A(usize), } };
}
// #[add_enum(AutoEnum)]
// struct B(i32);
macro_rules! AutoEnum_B {
( $($t:tt)* ) => { AutoEnum_C! { $($t)* B(i32), } };
}
Finally, the #[auto_enum(define)] will expand to a call to the first macro, a call that eventually at the end of the chain will generate the enum:
AutoEnum_A! {}
Of course, each character position has 63 possibilities (uppercase letters, lowercase letters, digits, and underscore), so supporting names of even 5 characters would require 63^5 = 992436543 macros, which makes this not really usable. Still, it's an interesting idea to explore.
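To check the chain end to end, here is a sketch with every macro written out by hand, i.e. what the attribute macros would have to emit. It assumes the variants should wrap the structs (A(A), B(B)) as in the question's desired expansion, and the derives are only there so the result can be printed and compared:

```rust
#[derive(Debug, PartialEq)]
struct A(usize);
#[derive(Debug, PartialEq)]
struct B(i32);

// "Default" macros (what #[auto_enum(declare)] would emit): each one forwards
// its input unchanged, and the last one emits the enum.
macro_rules! AutoEnum_A {
    ( $($t:tt)* ) => { AutoEnum_B! { $($t)* } };
}
macro_rules! AutoEnum_B {
    ( $($t:tt)* ) => { AutoEnum_C! { $($t)* } };
}
macro_rules! AutoEnum_C {
    ( $($t:tt)* ) => {
        #[derive(Debug, PartialEq)]
        enum AutoEnum { $($t)* }
    };
}

// Shadowing macros (what each #[add_enum(AutoEnum)] would emit): each one
// appends its own variant before forwarding.
macro_rules! AutoEnum_A {
    ( $($t:tt)* ) => { AutoEnum_B! { $($t)* A(A), } };
}
macro_rules! AutoEnum_B {
    ( $($t:tt)* ) => { AutoEnum_C! { $($t)* B(B), } };
}

// What #[auto_enum(define)] would emit: start the chain with no variants.
AutoEnum_A! {}

fn main() {
    let values = [AutoEnum::A(A(1)), AutoEnum::B(B(-2))];
    println!("{values:?}");
}
```

The compiler will likely warn that the two shadowed default macros are unused, which is expected here.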
I'm still seeking better ideas that are hopefully also usable.

Related

Zip iterables with Optional and Non Optional parameter in macro

For the testing part of my lexer, I came up with a simple macro that lets me define the expected token type (enum) and the token literal (string):
macro_rules! token_test {
($($ttype:ident: $literal:literal)*) => {
{
vec!($($ttype,)*).iter().zip(vec!($($literal,)*).iter())
}
}
}
and then I can use it like this:
for (ttype, literal) in token_test! {
Let: "let" Identifier: "five" Assign: "=" Int: "5" Semicolon: ";"
} {
//...
}
However, this is a little bit verbose, and we don't need to specify the literal for most of the tokens since I have another macro that transforms an enum variant into a string (e.g. Let -> "let").
So what I hope to do is something like:
for (ttype, literal) in token_test! {
Let Identifier: "five" Assign Int: "5" Semicolon
} {
//...
}
And if I understood properly, I can use optional parameters to match either TYPE: LITERAL or TYPE. Maybe something like:
macro_rules! token_test {
($($ttype:ident$(: $literal:literal)?)*) => {
{
//...
}
}
}
So my question is: is there a way to build a Vec out of this?
To be more clear:
In the case of no literal passed, it should add the string representation of my enum (eg: Let -> "let")
In the case of literal passed, it should add the literal directly
Made it work with the following macro (any improvement welcomed):
macro_rules! token_test {
($($ttype:ident$(: $literal:literal)?)*) => {
vec!($($ttype,)*).iter().zip(vec!(
$(
{
let mut literal = $ttype.as_str().unwrap();
$(literal = $literal;)?
literal
}
),*).iter())
}
}
This 'iterates' over the macro arguments and initially sets the local literal to the result of as_str, which transforms an enum variant into a string. Then, if $literal is present, it overwrites the local value with it. Finally, the block evaluates to the local literal variable.
Improvement
macro_rules! some_or_none {
() => { None };
($entity:literal) => { Some($entity) }
}
macro_rules! token_test {
($($ttype:ident$(: $literal:literal)?)*) => {
vec!($($ttype,)*).iter().zip(vec!($(
some_or_none!($($literal)?).unwrap_or($ttype.as_str().unwrap())
),*))
}
}
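Removed some unnecessary scopes and the second .iter(), and added the some_or_none macro. The intent is to skip as_str when a literal is provided; note, however, that unwrap_or evaluates its argument eagerly, so as_str is still called here (unwrap_or_else with a closure would make it lazy).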
Further improvement
In the above example, two macros are provided. One is clearly a "private" macro, since it exists only to implement the other. However, there is a small catch in how macro exports work: unlike functions, an exported macro cannot call a helper macro that is defined in the same scope but is not accessible from the caller. See this playground example. This is not a problem if you don't intend to export the macro, which is plausible since its only purpose is to be used in a test suite. However, you might still want to expose it publicly at the crate level without exposing some_or_none!. The conventional way to do this is to fold some_or_none! into token_test! as an internal rule whose name is prefixed with #:
macro_rules! token_test {
(#some_or_none) => {
None
};
(#some_or_none $entity:literal) => {
Some($entity)
};
($($ttype:ident $(: $literal:literal)?)*) => {
vec!($($ttype,)*)
.iter()
.zip(vec!($(
token_test!(#some_or_none $($literal)?)
.unwrap_or($ttype.as_str().unwrap())
),*))
};
}
With this version, you can safely export token_test without any fears as shown in this playground.
Little bit more
original idea from steffahn on the Rust Forum
There is another, similar way to solve this without involving unwrap_or: instead of wrapping the value in an Option in some_or_none, we can write two rules that take either TYPE: LITERAL or TYPE, like so:
macro_rules! token_test {
(#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
(#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
($($ttype:ident $(: $literal:literal)?)*) => {
vec!($($ttype,)*)
.iter()
.zip(vec![$(token_test!(#ttype_or_literal $ttype$(: $literal)?)),*])
};
}
And again
As I only need an iterable that can be destructured as (type, literal), an array of pairs is enough:
macro_rules! token_test {
(#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
(#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
($($ttype:ident $(: $literal:literal)?)*) => {
[$(($ttype, token_test!(#ttype_or_literal $ttype$(: $literal)?))),*]
};
}
so no more vec and no more zip.
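For reference, a self-contained sketch of the array version in use. The TokenType enum and its as_str method are stand-ins I made up for the lexer types, which are not shown in the question:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenType {
    Let,
    Identifier,
    Assign,
    Int,
    Semicolon,
}
use TokenType::*;

impl TokenType {
    // Hypothetical stand-in for the variant-to-string helper: only tokens
    // with a fixed spelling have a string representation.
    fn as_str(&self) -> Option<&'static str> {
        match self {
            Let => Some("let"),
            Assign => Some("="),
            Semicolon => Some(";"),
            Identifier | Int => None,
        }
    }
}

macro_rules! token_test {
    // Internal rules: pick the literal if present, else the variant's string.
    (#ttype_or_literal $ttype:ident) => { $ttype.as_str().unwrap() };
    (#ttype_or_literal $ttype:ident: $literal:literal) => { $literal };
    // Public rule: build an array of (type, literal) pairs.
    ($($ttype:ident $(: $literal:literal)?)*) => {
        [$(($ttype, token_test!(#ttype_or_literal $ttype$(: $literal)?))),*]
    };
}

fn main() {
    for (ttype, literal) in token_test!(Let Identifier: "five" Assign Int: "5" Semicolon) {
        println!("{ttype:?} => {literal:?}");
    }
}
```

Because the literal rule never touches as_str, tokens without a fixed spelling (Identifier, Int) do not hit the unwrap.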
A Smart trick
A user on the Rust forum gave this potential trick involving ignoring the second argument if it exists. I made the solution a little bit more compact by not having two macros:
macro_rules! token_test {
(#ignore_second $value:expr $(, $_ignored:expr)? $(,)?) => { $value };
($($ttype:ident $(: $literal:literal)?)*) => {
[$(($ttype, token_test!(#ignore_second $($literal,)? $ttype.as_str().unwrap()))),*]
};
}
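The "ignore the second argument" rule can be tried on its own. Here is a small sketch with the rule pulled out into a separate macro (first_of! is a name I picked for illustration):

```rust
// Evaluates to the first expression and drops an optional second one.
// A trailing comma is accepted as well.
macro_rules! first_of {
    ($value:expr $(, $_ignored:expr)? $(,)?) => { $value };
}

fn main() {
    // With an override present, the override comes first and wins.
    let with_override = first_of!("five", "fallback");
    // Without an override, only the fallback is passed.
    let without_override = first_of!("fallback");
    println!("{with_override} {without_override}");
}
```

Since the ignored expression is never emitted in the expansion, it is not evaluated at all, so a fallback that would panic is harmless when an override is given.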

How to match an argument with dots in Rust macros?

I am writing a program that contains a lot of match blocks, because I keep calling methods and functions that return Result values.
So I was thinking maybe a macro will reduce the amount of code.
And the final macro is like this:
#[macro_export]
macro_rules! ok_or_return {
//when calls on methods
($self: ident, $method: ident($($args: tt)*), $Error: ident::$err: ident) => {{
match $self.$method($($args)*) {
Ok(v) => v,
Err(e) => {
dbg!(e);
return Err($Error::$err);
}
}
}};
}
As you can see, I use $($args: tt)* to match multiple arguments, and it works pretty well. It even compiles when I pass a method call like struct.method() as an argument.
Like:
ok_or_return!(self, meth(node.get_num()), Error::GetNumError);
However, when I pass the same kind of expression as the first argument, it fails to compile. Changing the fragment specifier to tt didn't help either. For example:
ok_or_return!(self.people, meth(node.get_num()), Error::GetNumError);
So my question is: why can node.get_num() be matched while self.people can't?
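For what it's worth, the difference comes from the fragment specifiers: $self: ident accepts exactly one identifier and tt accepts exactly one token tree, while self.people is three tokens (self, ., people). node.get_num() works because it sits inside $($args: tt)*, which accepts any number of token trees. One possible fix is to match the receiver as an expr. A sketch, where the People and Holder types and the renamed $receiver metavariable are my own assumptions:

```rust
#[derive(Debug, PartialEq)]
enum Error {
    GetNumError,
}

macro_rules! ok_or_return {
    // `$receiver:expr` accepts any expression (`self`, `self.people`, ...),
    // not just a single identifier.
    ($receiver:expr, $method:ident($($args:tt)*), $Error:ident::$err:ident) => {{
        match $receiver.$method($($args)*) {
            Ok(v) => v,
            Err(e) => {
                dbg!(e);
                return Err($Error::$err);
            }
        }
    }};
}

// Hypothetical types, only here to exercise the macro.
struct People {
    count: i32,
}

impl People {
    fn meth(&self, n: i32) -> Result<i32, String> {
        if n >= 0 {
            Ok(self.count + n)
        } else {
            Err("negative input".to_string())
        }
    }
}

struct Holder {
    people: People,
}

impl Holder {
    fn run(&self, n: i32) -> Result<i32, Error> {
        // The receiver is now a field access rather than a bare identifier.
        let value = ok_or_return!(self.people, meth(n), Error::GetNumError);
        Ok(value)
    }
}

fn main() {
    let holder = Holder { people: People { count: 1 } };
    println!("{:?}", holder.run(2));
    println!("{:?}", holder.run(-1));
}
```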

Is it possible to match structs to compare them?

Similar to How to match struct fields in Rust?, is it possible to match a struct like Default without physically writing out the fields? I do not want to write out the fields constantly.
Something along the lines of:
let someValue = Struct { /* ... */ };
match someValue {
Struct::default() => println!("Default!"),
_ => println!("Not Default"),
}
This gives an error.
I did some testing on the Rust Playground but I only ended up running into the problem of matching named variables described in the docs.
What is your best solution to comparing many structs? Is it using #[derive(PartialEq)] and if statements?
Rust's patterns aren't values to compare to. They're more related to variable assignment (destructuring).
There is a "match guard" syntax that can be used (it requires Struct to implement PartialEq):
match some_value {
tmp if tmp == Struct::default() => println!("Default!"),
_ => println!("Not Default"),
}
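A complete version of the guard approach, as a sketch. The Settings struct and the describe function are made-up names, and both derives are assumed to be acceptable for your type:

```rust
// Default provides the value to compare against; PartialEq provides `==`.
#[derive(Debug, Default, PartialEq)]
struct Settings {
    retries: u32,
    verbose: bool,
}

fn describe(value: &Settings) -> &'static str {
    match value {
        // The pattern `tmp` matches anything; the guard does the comparison.
        tmp if *tmp == Settings::default() => "Default!",
        _ => "Not Default",
    }
}

fn main() {
    println!("{}", describe(&Settings::default()));
    println!("{}", describe(&Settings { retries: 3, verbose: false }));
}
```

With only one comparison, a plain if value == Settings::default() reads just as well; the guard form pays off when it sits next to other match arms.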

Can I 'enumerate' with Rust's variadic macros?

Essentially I have a macro that looks like:
macro_rules! my_macro {
( $expr:expr; $( $pat:pat ),* ) => {
match $expr {
$(
$pat => $(some-macro-magic-here),
)*
}
}
}
Is there anything that can go into $(some-macro-magic-here), so that
my_macro!(foo; A, B, C)
will expand to
match foo {
A => 2,
B => 4,
C => 6,
}
?
Is there some other way I might be able to get a similar feature that effectively lets me "enumerate" over the sequence of inputs for the macro?
I think I could probably write a recursive macro to get a similar effect, but I'm wondering if there's a more elegant/idiomatic way to go about it than what I'm thinking of.
Because macros aren't allowed to store or manipulate "variables" in any form, this problem becomes very difficult. You could, however, use an iterator to do something to the same effect, by creating an iterator that "enumerates" over the input the way you want it (using std::iter::successors, for example), and simply calling iterator.next().unwrap() in $(some-macro-magic-here).
You cannot build such a match statement arm by arm, because Rust does not allow a macro to expand to match arms. In other words, this does not work right now:
match val {
my_macro! (A, B, C)
}
However, in this case we can work around it by using nested if let blocks and a recursive macro:
macro_rules! my_macro {
($expr:expr; $($pat:pat),*) => {
my_macro!($expr; 2, 2; $($pat),*)
};
($expr:expr; $curr:expr, $step:literal; $pat:pat) => {
if let $pat = $expr {
$curr
} else {
unreachable!()
}
};
($expr:expr; $curr:expr, $step:literal; $pat:pat, $($rest:pat),*) => {
if let $pat = $expr {
$curr
} else {
my_macro! ($expr; $curr+$step, $step; $($rest),*)
}
}
}
It generates nested if let expressions and adds 2 once per level, which produces the expected constants. Alternatively, you could replace the repeated addition with a multiplication, but the compiler should constant-fold it anyway.
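Here is a usage sketch of the same recursive approach. It is a slight variant of my own: the internal rules are marked with @step and listed first so that each call can only match the intended rule. The Letter enum and value_of are illustrative names:

```rust
#[derive(Debug, Clone, Copy)]
enum Letter {
    A,
    B,
    C,
}

macro_rules! my_macro {
    // Last pattern: no more alternatives to try.
    (@step $expr:expr; $curr:expr, $step:literal; $pat:pat) => {
        if let $pat = $expr { $curr } else { unreachable!() }
    };
    // More patterns remain: test the first, recurse with the counter bumped.
    (@step $expr:expr; $curr:expr, $step:literal; $pat:pat, $($rest:pat),*) => {
        if let $pat = $expr {
            $curr
        } else {
            my_macro!(@step $expr; $curr + $step, $step; $($rest),*)
        }
    };
    // Public entry point: start counting at 2 with a step of 2.
    ($expr:expr; $($pat:pat),*) => {
        my_macro!(@step $expr; 2, 2; $($pat),*)
    };
}

fn value_of(letter: Letter) -> i32 {
    my_macro!(letter; Letter::A, Letter::B, Letter::C)
}

fn main() {
    for letter in [Letter::A, Letter::B, Letter::C] {
        println!("{letter:?} => {}", value_of(letter));
    }
}
```

Note that $expr is repeated in every level of the expansion, so it should be a cheap, side-effect-free expression such as a variable.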

How to replace one identifier in an expression with another one via Rust macro?

I'm trying to build a macro that does some code transformation, and should be able to parse its own syntax.
Here is the simplest example I can think of:
replace!(x, y, x * 100 + z) ~> y * 100 + z
This macro should be able to replace the first identifier with the second in the expression provided as third parameter. The macro should have some understanding of the language of the third parameter (which in my particular case, as opposed to the example, wouldn't parse in Rust) and apply recursively over it.
What's the most effective way to build such a macro in Rust? I'm aware of the proc_macro approach and the macro_rules! one. However I am not sure whether macro_rules! is powerful enough to handle this and I couldn't find much documentation in how to build my own transformations using proc_macro. Can anyone point me in the right direction?
Solution with macro_rules! macro
To implement this with declarative macros (macro_rules!) is a bit tricky but possible. However, it's necessary to use a few tricks.
But first, here is the code (Playground):
macro_rules! replace {
// This is the "public interface". The only thing we do here is to delegate
// to the actual implementation. The implementation is more complicated to
// call, because it has an "out" parameter which accumulates the tokens we
// will generate.
($x:ident, $y:ident, $($e:tt)*) => {
replace!(#impl $x, $y, [], $($e)*)
};
// Recursion stop: if there are no tokens to check anymore, we just emit
// what we accumulated in the out parameter so far.
(#impl $x:ident, $y:ident, [$($out:tt)*], ) => {
$($out)*
};
// This is the arm that's used when the first token in the stream is an
// identifier. We potentially replace the identifier and push it to the
// out tokens.
(#impl $x:ident, $y:ident, [$($out:tt)*], $head:ident $($tail:tt)*) => {{
replace!(
#impl $x, $y,
[$($out)* replace!(#replace $x $y $head)],
$($tail)*
)
}};
// These arms are here to recurse into "groups" (tokens inside of a
// (), [] or {} pair)
(#impl $x:ident, $y:ident, [$($out:tt)*], ( $($head:tt)* ) $($tail:tt)*) => {{
replace!(
#impl $x, $y,
[$($out)* ( replace!($x, $y, $($head)*) ) ],
$($tail)*
)
}};
(#impl $x:ident, $y:ident, [$($out:tt)*], [ $($head:tt)* ] $($tail:tt)*) => {{
replace!(
#impl $x, $y,
[$($out)* [ replace!($x, $y, $($head)*) ] ],
$($tail)*
)
}};
(#impl $x:ident, $y:ident, [$($out:tt)*], { $($head:tt)* } $($tail:tt)*) => {{
replace!(
#impl $x, $y,
[$($out)* { replace!($x, $y, $($head)*) } ],
$($tail)*
)
}};
// This is the standard recursion case: we have a non-identifier token as
// head, so we just put it into the out parameter.
(#impl $x:ident, $y:ident, [$($out:tt)*], $head:tt $($tail:tt)*) => {{
replace!(#impl $x, $y, [$($out)* $head], $($tail)*)
}};
// Helper to replace the identifier if it's the needle.
(#replace $needle:ident $replacement:ident $i:ident) => {{
// This is a trick to check two identifiers for equality. Note that
// the patterns in this macro don't contain any meta variables (the
// outer meta variables $needle and $i are interpolated).
macro_rules! __inner_helper {
// Identifiers equal, emit $replacement
($needle $needle) => { $replacement };
// Identifiers not equal, emit original
($needle $i) => { $i };
}
__inner_helper!($needle $i)
}}
}
fn main() {
let foo = 3;
let bar = 7;
let z = 5;
dbg!(replace!(abc, foo, bar * 100 + z)); // no replacement
dbg!(replace!(bar, foo, bar * 100 + z)); // replace `bar` with `foo`
}
It outputs:
[src/main.rs:56] replace!(abc , foo , bar * 100 + z) = 705
[src/main.rs:57] replace!(bar , foo , bar * 100 + z) = 305
How does this work?
There are two main tricks one needs to understand before understanding this macro: push-down accumulation and how to check two identifiers for equality.
Furthermore, just to be sure: the #foobar things at the start of the macro patterns are not a special feature, but simply a convention to mark internal helper rules (also see: "The Little Book of Rust Macros", StackOverflow question).
Push down accumulation is well described in this chapter of "The little book of Rust macros". The important part is:
All macros in Rust must result in a complete, supported syntax element (such as an expression, item, etc.). This means that it is impossible to have a macro expand to a partial construct.
But often it is necessary to have partial results, for example when processing some input token by token. To solve this, one basically has an "out" parameter which is just a list of tokens that grows with each recursive macro call. This works because macro input can be arbitrary tokens and doesn't have to be a valid Rust construct.
This pattern only makes sense for macros that work as "incremental TT munchers", which my solution does. There is also a chapter about this pattern in TLBORM.
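To see push-down accumulation on a smaller problem, here is a sketch that reverses a list of expressions into an array. The reversed! macro and its @acc marker are my own illustration and not part of the replace! macro:

```rust
macro_rules! reversed {
    // Recursion stop: no input left, so emit the accumulated tokens as one
    // complete array expression.
    (@acc [$($out:tt)*]) => {
        [$($out)*]
    };
    // Take the head expression and push it onto the FRONT of the accumulator.
    (@acc [$($out:tt)*] $head:expr $(, $tail:expr)*) => {
        reversed!(@acc [$head, $($out)*] $($tail),*)
    };
    // Public entry point: start with an empty accumulator.
    ($($item:expr),* $(,)?) => {
        reversed!(@acc [] $($item),*)
    };
}

fn main() {
    let values = reversed!(1, 2, 3);
    println!("{values:?}");
}
```

The intermediate calls carry token lists such as [2, 1,] that never have to be valid Rust on their own; only the final expansion does.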
The second key point is to check two identifiers for equality. This is done with an interesting trick: the macro defines a new macro which is then immediately used. Let's take a look at the code:
(#replace $needle:ident $replacement:ident $i:ident) => {{
macro_rules! __inner_helper {
($needle $needle) => { $replacement };
($needle $i) => { $i };
}
__inner_helper!($needle $i)
}}
Let's go through two different invocations:
replace!(#replace foo bar baz): this expands to:
macro_rules! __inner_helper {
(foo foo) => { bar };
(foo baz) => { baz };
}
__inner_helper!(foo baz)
And the __inner_helper! invocation now clearly takes the second pattern, resulting in baz.
replace!(#replace foo bar foo) on the other hand expands to:
macro_rules! __inner_helper {
(foo foo) => { bar };
(foo foo) => { foo };
}
__inner_helper!(foo foo)
This time, the __inner_helper! invocation takes the first pattern, resulting in bar.
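The same trick can be isolated into a tiny macro that only answers "are these two identifiers equal?". A sketch, where ident_eq! and __ident_eq_helper! are names I made up:

```rust
macro_rules! ident_eq {
    ($a:ident, $b:ident) => {{
        // By the time this inner macro is defined, $a and $b have already
        // been replaced by the concrete identifiers, so its patterns are
        // plain tokens without any meta variables.
        macro_rules! __ident_eq_helper {
            // Both identifiers are spelled the same.
            ($a $a) => { true };
            // Anything else: the identifiers differ.
            ($a $b) => { false };
        }
        __ident_eq_helper!($a $b)
    }};
}

fn main() {
    // The identifiers are only compared as tokens; they don't need to
    // refer to anything that exists.
    println!("{}", ident_eq!(foo, foo));
    println!("{}", ident_eq!(foo, bar));
}
```

The double braces in the outer rule give every expansion its own block, so each invocation gets a fresh helper that does not leak into the surrounding scope.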
I learned this trick from a crate that offers basically only exactly that: a macro checking two identifiers for equality. But unfortunately, I cannot find this crate anymore. Let me know if you know the name of that crate!
This implementation has a few limitations, however:
As an incremental TT muncher, it recurses for each token in the input, so it's easy to reach the recursion limit (which can be increased, but that's not optimal). It might be possible to write a non-recursive version of this macro, but so far I haven't found a way to do that.
macro_rules! macros are a bit strange when it comes to identifiers. The solution presented above might behave strangely with self as the identifier. See this chapter for more information on that topic.
Solution with proc-macro
Of course this can also be done via a proc-macro. It also involves less strange tricks. My solution looks like this:
extern crate proc_macro;
use proc_macro::{
Ident, TokenStream, TokenTree,
token_stream,
};
#[proc_macro]
pub fn replace(input: TokenStream) -> TokenStream {
let mut it = input.into_iter();
// Get first parameters
let needle = get_ident(&mut it);
let _comma = it.next().unwrap();
let replacement = get_ident(&mut it);
let _comma = it.next().unwrap();
// Return the remaining tokens, but replace identifiers.
it.map(|tt| {
match tt {
// Comparing `Ident`s can only be done via string comparison right
// now. Note that this ignores syntax contexts which can be a
// problem in some situation.
TokenTree::Ident(ref i) if i.to_string() == needle.to_string() => {
TokenTree::Ident(replacement.clone())
}
// All other tokens are just forwarded
other => other,
}
}).collect()
}
/// Extract an identifier from the iterator.
fn get_ident(it: &mut token_stream::IntoIter) -> Ident {
match it.next() {
Some(TokenTree::Ident(i)) => i,
_ => panic!("oh noes!"),
}
}
Using this proc macro with the main() example from above works exactly the same.
Note: error handling was ignored here to keep the example short. Please see this question on how to do error reporting in proc macros.
Apart from this, that code doesn't need as much explanation, I think. This proc macro version also doesn't suffer from the recursion limit problem that the macro_rules! macro has.
