Rust preprocessor: Plain text replacement like #define for syntax simplification - rust

In C, you can define macros using #define to have something globally replaced by the C preprocessor. How can I achieve the same plain-text replacement in rust?
My goal is to define macros that simplifies Rust syntax and reduce boilerplate, such as replacing !!. with .unwrap(). to simulate the "non-null assertion operator" syntax in Kotlin, Swift, or TypeScript, and replacing (\w+)\?\? with Option<$1> to simplify optional type syntax.
For example, the following struct
struct ReturnPath {
name: Option<String>,
file_type: Option<String>,
mtime: Option<i64>
}
would be much shorter in the processed syntax:
struct ReturnPath {
name: String??,
file_type: String??,
mtime: i64??
}

How can I achieve the same plain-text replacement in rust?
You can't do this without a DSL macro, essentially.
C macros are token-replacement and happen before the syntax-parsing step -- Rust macros are more akin to Lisp macros, which happen during the parsing step. See: https://softwareengineering.stackexchange.com/a/28387
replacing (\w+)?? with Option<$1> to simplify optional type syntax.
This also won't do what you want it to do in more complicated scenarios, such as if you have Result<T, R>?? or Vec<&'static str>?? or something like that. This is because Rust's grammar is more powerful than regexes -- it is mostly context-free with some context-sensitive parts (link).
What you're probably looking for is some sort of declarative and syntactical replacement from ($t:ty) \? \? => Option<$t>.
In addition, this regex can also probably match on non-types like a?? if a: Option<Option<T>>.
If you wanted to create a special syntax for this, you'd probably do something like:
macro_rules! shorthand_struct_fields {
/* rows of idents to types and then adjust struct definition but with T?? substituted for Option<T> */
}
struct Foo {
shorthand_struct_fields! {
x: i32??,
}
}
I'm not sure if this is such a good idea, but it you do want to create this large macro, you would be best suited by writing a declarative macros with sub-macros that use # rules, like so: http://danielkeep.github.io/tlborm/book/pat-internal-rules.html

Related

Can I execute a function inside a declarative macro?

I'm stuck trying to "rename" a identifier inside a declarative macro, I have the renaming function already, but can't find a way to "execute" it inside the macro. Is this possible with declarative macros?
Example:
fn to_pascal_case(s: &str) -> String { ... }
macro_rules! test_gl_enum {
($n: ident) => { struct to_pascal_case(stringify!($n)); }
}
test_gl_enum!(VERTEX_SHADER);
expands to:
struct to_pascal_case(());
desired result would be:
struct VertexShader;
You can't execute arbitrary code in a declarative macro. You have a couple of options:
Proc Macro
Proc macros are basically just functions that take a TokenStream and return a TokenStream (with some variations between function-like, attribute and derive macros). They can contain arbitrary code, so you could absolutely parse the identifier and apply some transformation to it.
If you're going to take this route, you will likely want to use syn and quote, and I'd also very strongly recommend going through the proc macro workshop, which is a series of exercises to teach you the basics.
Or use a crate
As is often the case, someone else has also had this problem, and has written a crate for it: https://crates.io/crates/casey
With that, you can write:
use casey::*;
macro_rules! test_gl_enum {
($n:ident) => {
struct pascal!($n);
};
}
Here, pascal!() is a proc macro that does the transformation, and since it's a macro (rather than a function) it is also evaluated at compile time, so the expansion won't look like struct pascal!(my_enum_name);

How to define recursive type alias in Rust?

So in Rust, I can define recursive type using enum like this:
enum Trie {
Child(HashMap<char, Box<Trie>>)
}
However this is a little bit verbose, considering that this enum has only one member.
May I know how to make it simpler? Let's say using type alias like this?
// This code does not compile in rustc 1.55.0
type Trie = HashMap<char, Box<Trie>>;
Using a single field struct compiles:
struct Trie(HashMap<char, Box<Trie>>);

Is it possible to use functions on Rust's constant generics

So say I'm writing a wrapper type for the array.
struct Array<const L: usize, T> {
raw: [T;L]
}
And I have some function that mutates the length of the array wrapper, say the function is concatenation:
impl<const L: usize, T> Array<L, T> {
fn concat<const L2: usize>(self, other: Array<L, T>) -> Array<{L + L2}, T> {todo!();}
}
When I try to compile this code, the rust compiler gets extremely mad. Thinking it might have something to do with adding types corresponding to implementing multiple, traits, i tried multiplication instead of addition, which also didn't work.
I know that rust can evaluate some expressions at compile time, is this just a case where that isn't allowed, or am I missing something?
When I try to compile this code, the rust compiler gets extremely mad. […] I know that rust can evaluate some expressions at compile time, is this just a case where that isn't allowed, or am I missing something?
You say that the compiler gets mad at you, but have you considered listening at what it was telling you?
Plugging your code into the playground, the first error is a trivial showstopper of
error: type parameters must be declared prior to const parameters
--> src/lib.rs:1:30
|
1 | struct Array<const L: usize, T> {
| -----------------^- help: reorder the parameters: lifetimes, then types, then consts: `<T, const L: usize>
so it kind-of doesn't seem to because that's trivial to fix.
Now fixing that we get to the meat of the issue:
= help: const parameters may only be used as standalone arguments, i.e. `L`
= help: use `#![feature(const_generics)]` and `#![feature(const_evaluatable_checked)]` to allow generic const expressions
that seems to explain the entirety of the thing: what's currently enabled in Rust is the
Const generics MVP
MVP as in minimum viable products aka the smallest featureset which can be useful, and what that translates to is explained in the introductory blog post.
The first limitation is
Only integral types are permitted for const generics
which is fine here because you're using integral types, but the second limitation is
No complex generic expressions in const arguments
What is a complex generic expression? Anything outside of:
A standalone const parameter.
A literal (i.e. an integer, bool, or character).
A concrete constant expression (enclosed by {}), involving no generic parameters.
What you're trying to do does not work (outside of nightly with both const_generics and const_evaluatable_checked enabled) because you're writing constant expressions involving at least one generic parameter, which is not part of the MVP.

How to distinguish different kinds of items in macro_rules macros?

I want to write a macro_rules based macro that will be used to wrap a series of type aliases and struct definitions. I can match on "items" with $e:item, but this will match both aliases and structs. I would like to treat the two separately (I need to add some #[derive(...)] just on the structs). Do I have to imitate their syntax directly by matching on something like type $name:ident = $type:ty; or is there a better way? This route seems annoying for structs because of regular vs tuple like. If I also wanted to distinguish functions that would be really painful because they have a lot of syntactical variation.
I believe for that problem somewhat simple cases can be solved with macro_rules!, but that probably would be limited (you can't lookahead) and super error-prone. I only cover an example for types, I hope that would be convincing enough to avoid macro_rules!. Consider this simple macro:
macro_rules! traverse_types {
($(type $tp:ident = $alias:ident;)*) => {
$(type $tp = $alias;)*
}
}
traverse_types!(
type T = i32;
type Y = Result<i32, u64>;
);
That works fine for the trivial aliases. At some point you probably also would like to handle generic type aliases (i.e. in the form type R<Y> = ...). Ok, your might still rewrite the macro to the recursive form (and that already a non-trivial task) to handle all of cases. Then you figure out that generics can be complex (type-bounds, lifetime parameters, where-clause, default types, etc):
type W<A: Send + Sync> = Option<A>;
type X<A: Iterator<Item = usize>> where A: 'static = Option<A>;
type Y<'a, X, Y: for<'t> Trait<'t>> = Result<&'a X, Y>;
type Z<A, B = u64> = Result<A, B>;
Probably all of these cases still can be handled with a barely readable macro_rules!. Nevertheless it would be really hard to understand (even to the person who wrote it) what's going on. Besides, it is hard to support new syntax (e.g. impl-trait alias type T = impl K), you may even need to have a complete rewrite of the macro. And I only cover the type aliases part, there's more to handle for the structs.
My point is: one better avoid macro_rules! for that (and similar) problem(-s), procedural macros is a way much a better tool for that. It easier to read (and thus extend) and handles new syntax for free (if syn and quote crates are maintained). For the type alias this can be done as simple as:
extern crate proc_macro;
use proc_macro::TokenStream;
use syn::parse::{Parse, ParseStream};
struct TypeAliases {
aliases: Vec<syn::ItemType>,
}
impl Parse for TypeAliases {
fn parse(input: ParseStream) -> syn::Result<Self> {
let mut aliases = vec![];
while !input.is_empty() {
aliases.push(input.parse()?);
}
Ok(Self { aliases })
}
}
#[proc_macro]
pub fn traverse_types(token: TokenStream) -> TokenStream {
let input = syn::parse_macro_input!(token as TypeAliases);
// do smth with input here
// You may remove this binding by implementing a `Deref` or `quote::ToTokens` for `TypeAliases`
let aliases = input.aliases;
let gen = quote::quote! {
#(#aliases)*
};
TokenStream::from(gen)
}
For the struct parsing code is the same using ItemStruct type and also you need a lookahead to determine wether it's an type-alias or a struct, there's very similar example at syn for that.

Why does rust parser need the fn keyword?

I've been reading the blogs about rust and this closure for example made me wonder:
fn each<E>(t: &Tree<E>, f: &fn(&E) -> bool) {
if !f(&t.elem) {
return;
}
for t.children.each |child| { each(child, f); }
}
why couldn't it be:
each<E>(t: &Tree<E>, f: &(&E) -> bool) {
if !f(&t.elem) {
return;
}
for t.children.each |child| { each(child, f); }
}
Maybe i'm missing something on the class system that would prevent this.
It makes parsing more complicated for compilers, syntax-highlighters, shell scripts and humans (i.e. everyone).
For example, with fn, foo takes a function that has two int arguments and returns nothing, and bar takes a pointer to a tuple of 2 ints
fn foo(f: &fn(int, int)) {}
fn bar(t: &(int, int)) {}
Without fn, the arguments for both become &(int, int) and the compiler couldn't distinguish them. Sure one could come up with other rules so they are written differently, but these almost certainly have no advantages over just using fn.
Some of the fn's may seem extraneous, but this has a side benefit that rust code is extremely easy to navigate with 'grep'. To find the definition of function 'foo', you simply grep "fn\sfoo". To see the major definitions in a source file, just grep for "(fn|struct|trait|impl|enum|type)".
This is extremely helpful at this early stage when rust lacks IDE's, and probably does simplify the grammar in other ways.
Making the grammar less ambiguous than C++ is a major goal , it makes generic programming easier (you dont have to bring in so much definition into a compile unit, in a specific order, to correctly parse it), and should make future tooling easier. A feature to auto-insert 'fn's would be pretty straightforward compared to many of the issues current C++ IDE's have to deal with.

Resources