Rust: How to allow and detect optional punctuation in macro_rules?


I have been playing with Rust for a while and decided it is time to start with macros. I want to create a macro that performs a bitwise AND on specific bits of unsigned integer variables. Here is what I currently have, which works:
macro_rules! AND {
($($val:ident.$bit:literal), *) => {
{
let mut val_out = 0x01;
$(
val_out &= ($val >> $bit);
)*
val_out & 0x01
}
};
}
fn main() {
let x = 0x01;
let y = 0x02;
let z = 0x10;
println!("{}", AND!(x.0, y.1, z.4)); // Prints 1
println!("{}", AND!(x.0, y.1, z.0)); // Prints 0
}
What I want to do is also allow for a negation operator. This is what I have that compiles, but I cannot figure out how to determine if the exclamation point was matched or not.
macro_rules! AND {
($($(!)?$val:ident.$bit:literal), *) => {
{
let mut val_out = 0x01;
$(
val_out &= ($val >> $bit);
)*
val_out & 0x01
}
};
}
fn main() {
let x = 0x01;
let y = 0x02;
let z = 0x10;
println!("{}", AND!(!x.0, !y.1, z.4)); // Prints 1, would like to print 0
println!("{}", AND!(x.0, y.1, !z.0)); // Prints 0, would like to print 1
}
I have tried to match an expression instead of the exclamation point, but expressions have to be the last thing matched. I also tried to match an ident after the literal, with the idea that an underscore after the literal could denote negation, but when I used an if or match statement to determine if the ident was matched, I get an error:
macro_rules! AND {
($($val:ident.$bit:literal$($negate:ident)?), *) => {
{
let mut val_out = 0x01;
$(
if $negate == "_" {
val_out &= (!$val >> $bit);
}
else {
val_out &= ($val >> $bit);
}
)*
val_out & 0x01
}
};
}
fn main() {
let x = 0x01;
let y = 0x02;
let z = 0x10;
println!("{}", AND!(x.0, y.1, z.4_));
}
error: variable 'negate' is still repeating at this depth
--> src/main.rs:8:20
|
8 | if $negate == "_" {
| ^^^^^^^
Any ideas would be appreciated. Thanks!

Your proposed macro syntax with a leading but optional symbol (e.g. a !) is very difficult to express with Rust's macro_rules!. One way to simplify it is to require a symbol in both cases (e.g. + for positive and - for negated). Then you can use a secondary macro to distinguish the two cases one by one:
// Auxiliary macro to distinguish positive and negative cases
macro_rules! and_aux {
($var:ident + $val:ident $bit:literal) => {
$var &= ($val >> $bit);
};
($var:ident - $val:ident $bit:literal) => {
$var &= !($val >> $bit);
};
}
// Main macro
macro_rules! AND {
($($t:tt $val:ident . $bit:literal), *) => {{
let mut val_out = 0x01;
$(
and_aux!(val_out $t $val $bit);
)*
val_out & 0x01
}};
}
fn main() {
let x = 0x01;
let y = 0x02;
let z = 0x10;
println!("{}", AND!(+x.0, +y.1, +z.4)); // Prints 1
println!("{}", AND!(+x.0, +y.1, +z.0)); // Prints 0
println!("{}", AND!(-x.0, -y.1, +z.4)); // Prints 0
println!("{}", AND!(+x.0, +y.1, -z.0)); // Prints 1
}
Of course, you can also use other symbols, or you can employ brackets, which make a lot of things possible in macro_rules!. This is partly because brackets (that is, (), [], and {}) are the only elements that can group several tokens into a single tt (token tree), which you typically need for more advanced macro_rules! macros.
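To illustrate how brackets group tokens, here is a minimal sketch. The helper count_tts is a hypothetical name introduced only for this illustration. It counts how many token trees it receives, which shows that a bracketed group counts as one tt no matter how many tokens it contains:

```rust
// Hypothetical helper (not part of the answer above): counts the
// token trees it is given by peeling off one `tt` per recursion step.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // Three separate identifiers are three token trees.
    assert_eq!(count_tts!(a b c), 3);
    // The same identifiers wrapped in parentheses form a single token tree.
    assert_eq!(count_tts!((a b c)), 1);
    // `!`, `x`, `.` and `0` are four token trees when left ungrouped...
    assert_eq!(count_tts!(! x . 0), 4);
    // ...but each bracketed group is exactly one.
    assert_eq!(count_tts!([! x . 0] {y . 1}), 2);
    println!("all counts match");
}
```

This is why a syntax such as AND!([!x.0], [y.1]) would be easier to parse: every item could be captured with a single $item:tt.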
However, you can actually make your original macro syntax work, that is, if you like complex hard-to-debug macros. For instance, you can do a recursive macro that only parses a little bit of your input with each recursion, forwarding some intermediate representation to the next invocation. E.g.:
// Auxiliary macro, does the heavy lifting
macro_rules! and_inner {
// Finishing rule, assembles the actual output
( # $var:ident { $( $finished:tt )* } from { $(,)? } ) => {
{
let mut $var = 0x01;
$( $finished )*
$var & 0x01
}
};
// Parse negated case
( # $var:ident {
$( $finished:tt )*
} from {
! $val:ident . $bit:literal , // only this line is processed here
$( $rest_input:tt )*
}
) => {
and_inner!(# $var {
$( $finished )*
$var &= !($val >> $bit);
} from {
$( $rest_input )*
})
};
// Parse positive case
( # $var:ident {
$( $finished:tt )*
} from {
$val:ident . $bit:literal , // only this line is processed here
$( $rest_input:tt )*
}
) => {
and_inner!(# $var {
$( $finished )*
$var &= ($val >> $bit);
} from {
$( $rest_input )*
})
};
}
// Main macro
macro_rules! AND {
// Entry rule prepares input for internal macro
( $( $input:tt )* ) => {
and_inner!(# tmp_var { } from { $($input)* , })
};
}

You can try what is called an incremental TT muncher:
macro_rules! _AND {
// Positive case: AND in the selected bit, then recurse on the remaining tokens
($val_out:ident , $val:ident.$bit:literal $($tail:tt)*) => {
$val_out &= ($val >> $bit);
_AND!{$val_out $($tail)*};
};
// Negated case: same, but with the value inverted first
($val_out:ident , ! $val:ident.$bit:literal $($tail:tt)*) => {
$val_out &= (!$val >> $bit);
_AND!{$val_out $($tail)*};
};
// Base case: no tokens left
($val_out:ident) => { }
}
macro_rules! AND {
($($tail:tt)*) => {
{
let mut val_out = 0x01;
_AND!{val_out , $($tail)*};
val_out & 0x01
}
};
}
fn main() {
let x = 0x01;
let y = 0x02;
let z = 0x10;
println!("{}", AND!(!x.0, !y.1, z.4)); // Prints 0
println!("{}", AND!(x.0, y.1, !z.0)); // Prints 1
}
The idea is that you parse the whole contents of the macro as a list of token trees (tt), which can be basically anything, and pass them on to a recursive macro that eats a few of them in each iteration.
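The same muncher idea can also be folded into a single macro by using internal rules. The following is only a sketch under a few assumptions: the macro name and_bits and the @acc marker are hypothetical names chosen for this example, and the catch-all compile_error! rule is an addition that stops malformed input from recursing until the recursion limit is hit.

```rust
// Sketch of a single-macro TT muncher. `and_bits` and `@acc` are
// illustrative names, not taken from the answers above.
macro_rules! and_bits {
    // Base case: nothing left to consume.
    (@acc $acc:ident;) => {};
    // Negated item: consume `!ident.bit` plus an optional `, rest`.
    (@acc $acc:ident; ! $val:ident . $bit:literal $(, $($rest:tt)*)?) => {
        $acc &= !($val >> $bit);
        and_bits!(@acc $acc; $($($rest)*)?);
    };
    // Plain item: consume `ident.bit` plus an optional `, rest`.
    (@acc $acc:ident; $val:ident . $bit:literal $(, $($rest:tt)*)?) => {
        $acc &= $val >> $bit;
        and_bits!(@acc $acc; $($($rest)*)?);
    };
    // Anything else after `@acc` is malformed: fail with a readable message.
    (@acc $($bad:tt)*) => {
        compile_error!("expected comma-separated items of the form `x.0` or `!x.0`");
    };
    // Public entry point: set up the accumulator and start munching.
    ($($input:tt)*) => {{
        let mut acc = 0x01;
        and_bits!(@acc acc; $($input)*);
        acc & 0x01
    }};
}

fn main() {
    let x = 0x01;
    let y = 0x02;
    let z = 0x10;
    println!("{}", and_bits!(x.0, y.1, z.4)); // Prints 1
    println!("{}", and_bits!(!x.0, !y.1, z.4)); // Prints 0
    println!("{}", and_bits!(x.0, y.1, !z.0)); // Prints 1
}
```

The internal rules come before the entry rule on purpose: macro_rules! tries rules top to bottom, and the entry rule would otherwise match every input, including the recursive calls.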

Related

Double vs single bracket in rust macros

I've been looking at rust macros recently and have found conflicting examples of macros using brackets (as laid out below). I'd like to know what the difference between each of these are and which one should be used when building macros. I'd also like to know whether any docs exist for any of this, as I can't find anything on the interwebs.
macro_rules! mac_a {
($x:ident,$y:expr) => { // <-- outer curlies
{ // <-- inner curlies
let $x = $y;
println!("{} {}", $x, $y);
}
};
}
macro_rules! mac_b {
($x:ident,$y:expr) => { // <-- outer curlies
// <-- no inner brackets / curlies
let $x = $y;
println!("{} {}", $x, $y);
};
}
// Does not compile
// macro_rules! mac_c {
// ($x:ident,$y:expr) => ( // <-- outer brackets
// ( // <-- inner brackets
// let $x = $y;
// println!("{} {}", $x, $y);
// )
// );
// }
macro_rules! mac_c2 {
($x:expr,$y:expr) => ( // <-- outer brackets
( // <-- inner brackets
println!("{} {}", $x, $y)
)
);
}
macro_rules! mac_d {
($x:ident,$y:expr) => ( // <-- outer brackets
// <-- no inner brackets / curlies
let $x = $y;
println!("{} {}", $x, $y);
);
}
fn main() {
mac_a!(a, 1);
mac_b!(b, 2);
// mac_c!(c, 3); // Does not compile
mac_c2!(3, 3);
mac_d!(d, 4);
}
All of the above except mac_c compile, and there are differences between each hence the need for mac_c2 with ident and let removed. I don't know why they can't be included ¯\_(ツ)_/¯

AFAIK the outer curlies/brackets are equivalent and simply serve to delimit each individual macro expansion. OTOH the inner curlies/brackets are part of the generated code, and their contents must therefore be legal wherever the macro is used. If we expand the macro invocations in your main function, we get:
fn main() {
// mac_a!(a, 1);
{ // <-- inner curlies
let a = 1;
println!("{} {}", a, 1);
}
// mac_b!(b, 2);
// <-- no inner brackets / curlies
let b = 2;
println!("{} {}", b, 2);
// mac_c!(c, 3); // Does not compile
( // <-- inner brackets
let c = 3; // Invalid code
println!("{} {}", c, 3);
)
// mac_c2!(3, 3);
( // <-- inner brackets
println!("{} {}", 3, 3)
)
// mac_d!(d, 4);
// <-- no inner brackets / curlies
let d = 4;
println!("{} {}", d, 4);
}
Note BTW that there is therefore a difference for what variables are still around after the macro invocations:
fn main() {
mac_a!(a, 1);
mac_b!(b, 2);
// mac_c!(c, 3); // Does not compile
mac_c2!(3, 3);
mac_d!(d, 4);
// println!("{}", a); // Does not compile because the `let a = ...` was done inside curlies
println!("{} {}", b, d); // Works because `let b = ...` and `let d = ...` were inserted in the current scope
}
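The two points above can be condensed into one small sketch. The macro names (bind_paren, bind_brack, bind_brace, bind_block) are hypothetical and exist only for this illustration. The first three differ only in their outer delimiters and behave identically, while the fourth adds an inner block, which turns the expansion into an expression with its own scope:

```rust
// The outer delimiter of a rule body can be (), [] or {}: all three
// expand to the same statement. Names here are illustrative only.
macro_rules! bind_paren { ($x:ident, $y:expr) => ( let $x = $y; ); }
macro_rules! bind_brack { ($x:ident, $y:expr) => [ let $x = $y; ]; }
macro_rules! bind_brace { ($x:ident, $y:expr) => { let $x = $y; }; }

// The inner curlies are part of the generated code: the expansion is a
// block expression, and `$x` only lives inside that block.
macro_rules! bind_block { ($x:ident, $y:expr) => {{ let $x = $y; $x * 2 }}; }

fn main() {
    bind_paren!(a, 1);
    bind_brack!(b, 2);
    bind_brace!(c, 3);
    // All three bindings landed in the scope of `main`.
    println!("{} {} {}", a, b, c); // Prints 1 2 3

    // The block form yields a value; `d` is not visible afterwards.
    let doubled = bind_block!(d, 4);
    println!("{}", doubled); // Prints 8
}
```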

Rust macro: capture exactly matching tokens

My goal is to write a macro expand! such that:
struct A;
struct B;
struct Mut<T>;
expand!() => ()
expand!(A) => (A,)
expand!(mut A) => (Mut<A>,)
expand!(A, mut B) => (A, Mut<B>,)
// etc
[Edit] added trailing comma for consistent tuple syntax.
I wrote this macro so far:
macro_rules! to_type {
( $ty:ty ) => { $ty };
( mut $ty:ty ) => { Mut<$ty> };
}
macro_rules! expand {
( $( $(mut)? $ty:ty ),* ) => {
(
$( to_type!($ty) ),*
,)
};
}
What I'm struggling with, is capturing the mut token. How can I assign it to a variable and reuse it in the macro body? Is it possible to work on more than 1 token at a time?

Something like this?
macro_rules! expand {
(#phase2($($ty_final:ty),*),) => {
($($ty_final,)*)
};
(#phase2($($ty_final:ty),*), mut $ty:ty, $($rest:tt)*) => {
expand!(#phase2($($ty_final,)* Mut::<$ty>), $($rest)*)
};
(#phase2($($ty_final:ty),*), $ty:ty, $($rest:tt)*) => {
expand!(#phase2($($ty_final,)* $ty), $($rest)*)
};
($($t:tt)*) => {
expand!(#phase2(), $($t)*)
};
}
struct A;
struct B;
struct Mut<T>(std::marker::PhantomData<T>);
fn main() {
#[allow(unused_parens)]
let _: expand!() = ();
#[allow(unused_parens)]
let _: expand!(A,) = (A,);
#[allow(unused_parens)]
let _: expand!(mut B,) = (Mut::<B>(Default::default()),);
#[allow(unused_parens)]
let _: expand!(A, mut B,) = (A, Mut::<B>(Default::default()));
}

What is a good way to match strings against patterns and extract values?

I am trying to get something like this (doesn't work):
match input {
"next" => current_question_number += 1,
"prev" => current_question_number -= 1,
"goto {x}" => current_question_number = x,
// ...
_ => status = "Unknown Command".to_owned()
}
I tried two different versions of Regex:
go_match = regex::Regex::new(r"goto (\d+)?").unwrap();
// ...
match input {
...
x if go_match.is_match(x) => current_question_number = go_match.captures(x).unwrap().get(1).unwrap().as_str().parse().unwrap(),
_ => status = "Unknown Command".to_owned()
}
and
let cmd_match = regex::Regex::new(r"([a-zA-Z]+) (\d+)?").unwrap();
// ...
if let Some(captures) = cmd_match.captures(input.as_ref()) {
let cmd = captures.get(1).unwrap().as_str().to_lowercase();
if let Some(param) = captures.get(2) {
let param = param.as_str().parse().unwrap();
match cmd.as_ref() {
"goto" => current_question_number = param,
}
} else {
match cmd.as_ref() {
"next" => current_question_number += 1,
"prev" => current_question_number -= 1,
}
}
} else {
status = "Unknown Command".to_owned();
}
Both seem like a ridiculously long and complicated way to do something pretty common. Am I missing something?

You can create a master Regex that captures all the interesting components then build a Vec of all the captured pieces. This Vec can then be matched against:
extern crate regex;
use regex::Regex;
fn main() {
let input = "goto 4";
let mut current_question_number = 0;
// Create a regex that matches on the union of all commands
// Each command and argument is captured
// Using the "extended mode" flag to write a nicer Regex
let input_re = Regex::new(
r#"(?x)
(next) |
(prev) |
(goto)\s+(\d+)
"#
).unwrap();
// Execute the Regex
let captures = input_re.captures(input).map(|captures| {
captures
.iter() // All the captured groups
.skip(1) // Skipping the complete match
.flat_map(|c| c) // Ignoring all empty optional matches
.map(|c| c.as_str()) // Grab the original strings
.collect::<Vec<_>>() // Create a vector
});
// Match against the captured values as a slice
match captures.as_ref().map(|c| c.as_slice()) {
Some(["next"]) => current_question_number += 1,
Some(["prev"]) => current_question_number -= 1,
Some(["goto", x]) => {
let x = x.parse().expect("can't parse number");
current_question_number = x;
}
_ => panic!("Unknown Command: {}", input),
}
println!("Now at question {}", current_question_number);
}

You have a mini language for picking questions:
pick the next question
pick the prev question
goto a specific question
If your requirements end here, a Regex-based solution fits perfectly.
If your DSL may evolve, a parser-based solution is worth considering.
The parser combinator library nom is a powerful tool for building a grammar from basic elements.
Your language has these characteristics:
it has three alternative statements (alt!): next, prev, goto \d+
the most complex statement, "goto {number}", is composed of the keyword (tag!) goto in front of (preceded!) a number (digit!)
any amount of whitespace (ws!) has to be ignored
These requirements translate into this implementation:
#[macro_use]
extern crate nom;
use nom::{IResult, digit};
use nom::types::CompleteStr;
// we have for now two types of outcome: absolute or relative cursor move
pub enum QMove {
Abs(i32),
Rel(i32)
}
pub fn question_picker(input: CompleteStr) -> IResult<CompleteStr, QMove> {
ws!(input,
alt!(
map!(
tag!("next"),
|_| QMove::Rel(1)
) |
map!(
tag!("prev"),
|_| QMove::Rel(-1)
) |
preceded!(
tag!("goto"),
map!(
digit,
|s| QMove::Abs(std::str::FromStr::from_str(s.0).unwrap())
)
)
)
)
}
fn main() {
let mut current_question_number = 60;
let first_line = "goto 5";
let outcome = question_picker(CompleteStr(first_line));
match outcome {
Ok((_, QMove::Abs(n))) => current_question_number = n,
Ok((_, QMove::Rel(n))) => current_question_number += n,
Err(err) => {panic!("error: {:?}", err)}
}
println!("Now at question {}", current_question_number);
}

You can use str::split for this:
fn run(input: &str) {
let mut toks = input.split(' ').fuse();
let first = toks.next();
let second = toks.next();
match first {
Some("next") => println!("next found"),
Some("prev") => println!("prev found"),
Some("goto") => match second {
Some(num) => println!("found goto with number {}", num),
_ => println!("goto with no parameter"),
},
_ => println!("invalid input {:?}", input),
}
}
fn main() {
run("next");
run("prev");
run("goto 10");
run("this is not valid");
run("goto"); // also not valid but for a different reason
}
will output
next found
prev found
found goto with number 10
invalid input "this is not valid"
goto with no parameter
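Building on the split-based approach, a version that actually updates the question number could look like the sketch below. It rests on several assumptions: the function name apply is hypothetical, and so are the i32 counter type and the use of String as the error type. Only the standard library is used.

```rust
// Sketch: parse one command and return the new question number.
// `apply` and its signature are illustrative assumptions.
fn apply(input: &str, current: i32) -> Result<i32, String> {
    // `fuse` guarantees that `next` keeps returning `None` once exhausted.
    let mut toks = input.split_whitespace().fuse();
    // Tuple fields are evaluated left to right, so this reads up to three tokens.
    match (toks.next(), toks.next(), toks.next()) {
        (Some("next"), None, None) => Ok(current + 1),
        (Some("prev"), None, None) => Ok(current - 1),
        (Some("goto"), Some(num), None) => num
            .parse::<i32>()
            .map_err(|e| format!("bad number {:?}: {}", num, e)),
        _ => Err(format!("Unknown Command: {:?}", input)),
    }
}

fn main() {
    let mut current_question_number = 5;
    for cmd in ["next", "goto 10", "prev", "goto", "goto x"] {
        match apply(cmd, current_question_number) {
            Ok(n) => current_question_number = n,
            Err(status) => println!("{}", status),
        }
    }
    println!("Now at question {}", current_question_number); // Prints: Now at question 9
}
```

Matching on a tuple of the first three tokens rejects trailing garbage such as "next 3" or "goto 1 2" without any extra bookkeeping.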

How do you write a macro with chainable tokens?

I'm not really sure how to phrase this, so the question title is pretty rubbish, but here's what I'm trying to do:
I can write this macro:
macro_rules! op(
( $v1:ident && $v2:ident ) => { Op::And($v1, $v2) };
( $v1:ident || $v2:ident ) => { Op::Or($v1, $v2) };
);
Which I can use like this:
let _ = op!(Expr || Expr);
let _ = op!(Expr && Expr);
What I want to do is to write an arbitrary sequence of tokens like this:
let _ = op!(Expr || Expr || Expr && Expr || Expr);
Which resolves into a Vec of tokens, like:
vec!(T::Expr(e1), T::Or, T::Expr(e2), T::Or, ...)
I can write a vec! like macro:
macro_rules! query(
( $( $x:expr ),* ) => {
{
let mut temp_vec = Vec::new();
$(temp_vec.push($x);)*
temp_vec
}
};
);
...but I can't see how to convert the arbitrary symbols (eg. &&) into tokens as the macro runs.
Is this possible somehow?
playpen link: http://is.gd/I9F5YV

It seems that it's impossible to capture arbitrary symbols into a typed macro variable during macro expansion: as the language reference says, "valid designators are item, block, stmt, pat, expr, ty (type), ident, path, tt". So the best I could suggest is to use "ident"-valid tokens, like "and"/"or" instead of "&&"/"||", for example:
macro_rules! query_op(
( and ) => { "T::And" };
( or ) => { "T::Or" };
( $e:ident ) => { concat!("T::Expr(", stringify!($e), ")") };
);
macro_rules! query(
( $( $x:ident )* ) => {
{
let mut temp_vec = Vec::new();
$(temp_vec.push(query_op!($x));)*
temp_vec
}
};
);
fn main() {
let q = query!(Expr1 or Expr2 and Expr3 or Expr4);
println!("{:?}", q);
}
Outputs:
["T::Expr(Expr1)", "T::Or", "T::Expr(Expr2)", "T::And", "T::Expr(Expr3)", "T::Or", "T::Expr(Expr4)"]
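As a possible alternative on current Rust, literal && and || tokens can be matched directly in a macro_rules! pattern (the op! macro in the question already does this), and a recursive macro can consume them one at a time. The sketch below makes some assumptions: the enum T is a stand-in whose Expr variant stores the identifier's name as a string, which may not match the real type in the question, and @push is a hypothetical internal marker.

```rust
// Stand-in token type for illustration; the real `T` in the question
// presumably wraps an expression rather than a name.
#[derive(Debug, PartialEq)]
enum T {
    Expr(&'static str),
    And,
    Or,
}

macro_rules! query {
    // Base case: all tokens consumed.
    (@push $v:ident;) => {};
    // Literal operator tokens are matched as-is.
    (@push $v:ident; && $($rest:tt)*) => {
        $v.push(T::And);
        query!(@push $v; $($rest)*);
    };
    (@push $v:ident; || $($rest:tt)*) => {
        $v.push(T::Or);
        query!(@push $v; $($rest)*);
    };
    // Any identifier becomes an expression token.
    (@push $v:ident; $e:ident $($rest:tt)*) => {
        $v.push(T::Expr(stringify!($e)));
        query!(@push $v; $($rest)*);
    };
    // Unsupported token: report it instead of recursing forever.
    (@push $($bad:tt)*) => {
        compile_error!("query! only accepts identifiers, `&&` and `||`");
    };
    // Entry point.
    ($($input:tt)*) => {{
        let mut v = Vec::new();
        query!(@push v; $($input)*);
        v
    }};
}

fn main() {
    let q = query!(Expr1 || Expr2 && Expr3);
    println!("{:?}", q);
    // Prints: [Expr("Expr1"), Or, Expr("Expr2"), And, Expr("Expr3")]
}
```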

Why does this rust HashMap macro no longer work?

I previously used:
#[macro_export]
macro_rules! map(
{ T:ident, $($key:expr => $value:expr),+ } => {
{
let mut m = $T::new();
$(
m.insert($key, $value);
)+
m
}
};
)
To create objects, like this:
let mut hm = map! {HashMap, "a string" => 21, "another string" => 33};
However, this no longer seems to work. The compiler reports:
- Failed:
macros.rs:136:24: 136:31 error: no rules expected the token `HashMap`
macros.rs:136 let mut hm = map! {HashMap, "a string" => 21, "another string" => 33};
^~~~~~~
What's changed with macro definitions that makes this no longer work?
The basic example below works fine:
macro_rules! foo(
{$T:ident} => { $T; };
)
struct Blah;
#[test]
fn test_map_create() {
let mut bar = foo!{Blah};
}
So this seem to be some change to how the {T:ident, $(...), +} expansion is processed?
What's going on here?

You’re lacking the $ symbol before T.
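For reference, a corrected version of the macro from the question might look like the sketch below. Apart from the added $, it assumes current Rust syntax: the definition uses braces, so it needs no trailing semicolon, and the optional trailing comma in the pattern is an addition that the original did not have.

```rust
use std::collections::HashMap;

// Same macro as in the question, with `$T` instead of `T`.
// The `$(,)?` at the end is an optional extra that allows a trailing comma.
macro_rules! map {
    ($T:ident, $($key:expr => $value:expr),+ $(,)?) => {{
        let mut m = $T::new();
        $(
            m.insert($key, $value);
        )+
        m
    }};
}

fn main() {
    let hm = map! {HashMap, "a string" => 21, "another string" => 33};
    println!("{:?}", hm.get("a string")); // Prints Some(21)
    println!("{}", hm.len()); // Prints 2
}
```

Without the $, T:ident is read as the literal tokens T, : and ident, so the macro only accepts invocations that start with exactly those tokens, which is why the compiler rejects HashMap.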

How about this?
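// nightly rust (std::any::type_name_of_val was stabilized in Rust 1.76, so newer toolchains do not need the feature gate below)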
#![feature(type_name_of_val)]
use std::collections::{BTreeMap, HashMap};
macro_rules! map {
($T:ty; $( $key:literal : $val:expr ),* ,) => {
{
// $T is the full type, e.g. HashMap<String, i32>; <$T>::new() is the qualified-path form needed to call new() on a `ty` fragment
let mut m: $T = <$T>::new();
$(
m.insert($key.into(), $val);
)*
m
}
};
// handle no trailing comma
($T:ty; $( $key:literal : $val:expr ),*) => {
map!{$T; $( $key : $val ,)*}
}
}
fn main() {
let hm = map! {HashMap<String, i32>;
"a": 1,
"b": 2,
"c": 3,
};
let bm = map! {BTreeMap<String, i32>;
"1": 1,
"2": 2,
"3": 3
};
println!("typeof hm = {}", std::any::type_name_of_val(&hm));
println!("typeof bm = {}", std::any::type_name_of_val(&bm));
dbg!(hm, bm);
}
Additionally, use {} for the definition: macro_rules! map { ... } is then parsed as an item and needs no trailing semicolon, unlike the macro_rules! map( ... ) form.
