pass macro arguments to other macro - rust

I am new to Rust. I am trying to create a macro that takes a buffer, decodes some data out of it, and creates a given list of variables. If an error occurs, it should print the error and continue, since I am going to call it in a loop where I receive buffers. Something like this:
for bin_ref in bufs {
extract!( bin_ref anime &str episodes u32 season u32);
//if everything goes ok then do some cool stuff with
//above variables otherwise take next buf_ref
}
How can I do this? So far I have come up with this approach:
#[macro_export]
macro_rules! extract {
( $buf:ident $($var:ident $typ:ty),* ) => {
$(
ext_type!( $buf $var $typ );
)*
};
}
#[macro_export]
macro_rules! ext_type {
( $buf:ident $var:ident &str ) => {
let mut $var : &str = ""; //some string specific function
println!("doing cool things with '{}' which is string ",$var);
};
( $buf:ident $var:ident u32 ) => {
let mut $var : u32 = 34; //some u32 specific function
println!("doing cool things with '{}' which is u32",$var);
}
}
I have the following test function:
fn macro_test() {
let mut bin_ref : &[u8] = &[0u8;100];
ext_type!(bin_ref anime &str); // works
ext_type!(bin_ref episodes u32 ); // works
extract!( bin_ref username &str, password &str ); // does not work. why ??
}
When I compile this, I get the following error:
error: no rules expected the token `&str`
--> src/easycode.rs:11:34
|
11 | ext_type!( $buf $var $typ );
| ^^^^ no rules expected this token in macro call
...
19 | macro_rules! ext_type {
| --------------------- when calling this macro
...
48 | extract!( bin_ref username &str, password &str );
| ------------------------------------------------- in this macro invocation
Why can't I just pass $typ to the ext_type! macro? It works when called directly from code.

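The ext_type! macro's rules require the literal tokens &str and u32 at the end. These literal tokens cannot match the fragment captured by $typ:ty in extract!: once a fragment has been matched as a ty, it is forwarded as a single opaque unit, and the receiving macro can no longer look at the tokens inside it. For a forwarded fragment to match literal tokens, it must have been captured as a tt, ident or lifetime.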
The only option that will work in this case is tt, which simply put, is just a parser token. A type is often composed of more than one token though; case in point &str, which consists of two tokens & and str. We must thus use a repetition to fully capture a type with tts: $($typ:tt)+ will do nicely.
Using an unbounded repetition with tt comes at a cost, however: a tt will match almost everything, so simply substituting $typ:ty with $($typ:tt)+ will not work, as the $typ repetition will capture everything up to the end of the macro invocation! To prevent this from happening, we must delimit the type token tree in the macro rule matcher to stop it from consuming everything. At the cost of making the invocation slightly more verbose, wrapping the repetition in parentheses will serve us well and stop the token tree matching exactly where we want it to. The modified macro looks like this:
#[macro_export]
macro_rules! extract {
( $buf:ident $($var:ident ($($typ:tt)+)),* ) => {
$(
ext_type!( $buf $var $($typ)+);
)*
};
}
Note the replacement of $typ:ty with ($($typ:tt)+) (which is a token tree repetition wrapped in parentheses) in the matcher, and the replacement of $typ with $($typ)+ in the transcriber.
The macro rule is invoked as follows:
extract!(bin_ref username (&str), password (&str), id (u32));
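For completeness, here is a minimal end-to-end sketch of the two macros working together. The decoding bodies inside ext_type! are placeholders I made up for illustration (reading the buffer as UTF-8, using its length), and decode is a hypothetical helper, not part of the question:

```rust
// Inner macro: each rule ends in literal tokens, so it can only be matched
// by raw tokens, not by an already-parsed `ty` fragment.
macro_rules! ext_type {
    ( $buf:ident $var:ident &str ) => {
        // Placeholder decoder (assumption): read the buffer as UTF-8 text.
        let $var: &str = std::str::from_utf8($buf).unwrap_or("");
    };
    ( $buf:ident $var:ident u32 ) => {
        // Placeholder decoder (assumption): use the buffer length as the value.
        let $var: u32 = $buf.len() as u32;
    };
}

// Outer macro: the type is captured as parenthesised token trees and
// forwarded token by token, so the literal rules above can still match.
macro_rules! extract {
    ( $buf:ident $( $var:ident ( $($typ:tt)+ ) ),* ) => {
        $( ext_type!( $buf $var $($typ)+ ); )*
    };
}

// Hypothetical helper showing that the declared variables are usable
// after the macro call.
fn decode(buf: &[u8]) -> (String, u32) {
    extract!(buf title (&str), size (u32));
    (title.to_string(), size)
}

fn main() {
    let (title, size) = decode(b"naruto");
    println!("{title} {size}");
}
```

Because title and size are identifiers supplied by the caller, the let bindings created inside the macros are visible at the call site.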

Related

Matching an i32 value with character constants

I'm writing a parser in Rust, which needs at various points to match the current token against candidate values. Some of the candidate values are characters, others are integer constants, so the token is declared as i32, which would be plenty to accommodate both. (All the characters to be matched against are ASCII.)
The problem is that when I supply a character constant like '(' to be matched against, the compiler complains that it expected i32 and is getting char.
I tried writing e.g. '(' as i32 but an as expression is not allowed as a match candidate.
Obviously I could look up the ASCII values and provide them as numbers, but it seems there should be a more readable solution. Declaring the token as char doesn't really seem correct, as it sometimes needs to hold integers that are not actually characters.
What's the recommended way to solve this problem?
It’s a bit verbose, but your match arms could be of the form c if c == i32::from(b'(').
Another alternative would be to match on u8::try_from(some_i32), which returns a Result (branch arms Ok(b'(') and then either Err(_) if some_i32 == … or Err(_) => { match some_i32 { … } }).
Yet another would be to change the type from i32 to your own enum, which is probably the cleanest option but might require some convincing of the Rust compiler to get an i32-like representation if you need that for some reason.
Finally, you could define const PAREN_OPEN: i32 = b'(' as i32; and use PAREN_OPEN as the pattern.
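A rough sketch of three of the options above side by side (guard, named constant, and u8::try_from). The function names and the returned labels are made up for illustration:

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition on

// Named constant: `as` is allowed in a const initializer, and a const
// can be used as a pattern.
const SEMICOLON: i32 = b';' as i32;

// Hypothetical classifier using a match guard and the constant.
fn classify(tok: i32) -> &'static str {
    match tok {
        c if c == i32::from(b'(') => "open paren",
        SEMICOLON => "semicolon",
        _ => "other",
    }
}

// Hypothetical classifier that first narrows the token to a byte.
// `u8::try_from` returns a `Result`, so the arms are `Ok`/`Err`.
fn classify_via_u8(tok: i32) -> &'static str {
    match u8::try_from(tok) {
        Ok(b'(') => "open paren",
        Ok(b')') => "close paren",
        Ok(_) => "other byte",
        Err(_) => "not a byte",
    }
}

fn main() {
    println!("{}", classify(i32::from(b'(')));
    println!("{}", classify_via_u8(1000));
}
```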
Since as expressions are allowed in constants, and matching is allowed against constants, you can use a constant:
const LPAREN: i32 = '(' as i32;
match v {
LPAREN => { ... }
// ...
}
If you can use nightly, you can use the inline_const_pat feature to reduce the boilerplate:
#![feature(inline_const_pat)]
match v {
const { '(' as i32 } => { ... }
// ...
}
Another way: here's a small proc macro that will replace the characters with their numerical value (it does not work with nested char patterns):
use proc_macro::TokenStream;
use quote::ToTokens;
#[proc_macro]
pub fn i32_match(input: TokenStream) -> TokenStream {
let mut input = syn::parse_macro_input!(input as syn::ExprMatch);
for arm in &mut input.arms {
if let syn::Pat::Lit(lit) = &mut arm.pat {
if let syn::Expr::Lit(syn::ExprLit { lit, .. }) = &mut *lit.expr {
if let syn::Lit::Char(ch) = lit {
*lit = syn::Lit::Int(syn::LitInt::new(
&(ch.value() as i32).to_string(),
ch.span(),
));
}
}
}
}
input.into_token_stream().into()
}
i32_match! {
match v {
'(' => { ... }
// ...
}
}

How do I get the value and type of a Literal in a procedural macro?

I am implementing a function-like procedural macro which takes a single string literal as an argument, but I don't know how to get the value of the string literal.
If I print the variable, it shows a bunch of fields, which includes both the type and the value. They are clearly there, somewhere. How do I get them?
extern crate proc_macro;
use proc_macro::{TokenStream,TokenTree};
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input: Vec<TokenTree> = input.into_iter().collect();
let literal = match &input.get(0) {
Some(TokenTree::Literal(literal)) => literal,
_ => panic!()
};
// can't do anything with "literal"
// println!("{:?}", literal.lit.symbol); says "unknown field"
format!("{:?}", format!("{:?}", literal)).parse().unwrap()
}
#![feature(proc_macro_hygiene)]
extern crate macros;
fn main() {
let value = macros::my_macro!("hahaha");
println!("it is {}", value);
// prints "it is Literal { lit: Lit { kind: Str, symbol: "hahaha", suffix: None }, span: Span { lo: BytePos(100), hi: BytePos(108), ctxt: #0 } }"
}
After running into the same problem countless times already, I finally wrote a library to help with this: litrs on crates.io. It compiles faster than syn and lets you inspect your literals.
use std::convert::TryFrom;
use litrs::StringLit;
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input = input.into_iter().collect::<Vec<_>>();
if input.len() != 1 {
let msg = format!("expected exactly one input token, got {}", input.len());
return quote! { compile_error!(#msg) }.into();
}
let string_lit = match StringLit::try_from(&input[0]) {
// Error if the token is not a string literal
Err(e) => return e.to_compile_error(),
Ok(lit) => lit,
};
// `StringLit::value` returns the actual string value represented by the
// literal. Quotes are removed and escape sequences replaced with the
// corresponding value.
let v = string_lit.value();
// TODO: implement your logic here
}
See the documentation of litrs for more information.
To obtain more information about a literal, litrs uses the Display impl of Literal to obtain a string representation (as it would be written in source code) and then parses that string. For example, if the string starts with 0x one knows it has to be an integer literal, if it starts with r#" one knows it is a raw string literal. The crate syn does exactly the same.
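To make the idea concrete, here is a toy sketch of prefix-based classification on the source text of a literal. It is not how litrs or syn are actually implemented; LitKind and classify_literal are names invented for this example, and only the prefixes mentioned above are handled:

```rust
// Toy classification of a literal from its source representation.
#[derive(Debug, PartialEq)]
enum LitKind {
    HexInteger,
    RawString,
    Str,
    Other,
}

// `src` stands for what `literal.to_string()` would give inside a proc macro.
fn classify_literal(src: &str) -> LitKind {
    if src.starts_with("0x") {
        LitKind::HexInteger
    } else if src.starts_with("r#") || src.starts_with("r\"") {
        LitKind::RawString
    } else if src.starts_with('"') {
        LitKind::Str
    } else {
        // Everything else (decimal numbers, chars, byte strings, ...) is
        // left unclassified in this sketch.
        LitKind::Other
    }
}

fn main() {
    for src in ["0xff", "r#\"raw\"#", "\"hahaha\"", "1.5"] {
        println!("{src} -> {:?}", classify_literal(src));
    }
}
```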
Of course, it seems a bit wasteful to write and run a second parser given that rustc already parsed the literal. Yes, that's unfortunate, and having a better API in proc_macro would be preferable. But right now, I think litrs (or syn, if you are using syn anyway) is the best solution.
(PS: I'm usually not a fan of promoting one's own libraries on Stack Overflow, but I am very familiar with the problem OP is having and I very much think litrs is the best tool for the job right now.)
If you're writing procedural macros, I'd recommend that you look into using the crates syn (for parsing) and quote (for code generation) instead of using proc-macro directly, since those are generally easier to deal with.
In this case, you can use syn::parse_macro_input to parse a token stream into any syntactic element of Rust (such as literals, expressions, functions); it also takes care of error messages in case parsing fails.
You can use LitStr to represent a string literal, if that's exactly what you need. The .value() function will give you a String with the contents of that literal.
You can use quote::quote to generate the output of the macro, and use # to insert the contents of a variable into the generated code.
use proc_macro::TokenStream;
use syn::{parse_macro_input, LitStr};
use quote::quote;
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
// macro input must be `LitStr`, which is a string literal.
// if not, a relevant error message will be generated.
let input = parse_macro_input!(input as LitStr);
// get value of the string literal.
let str_value = input.value();
// do something with value...
let str_value = str_value.to_uppercase();
// generate code, include `str_value` variable (automatically encodes
// `String` as a string literal in the generated code)
(quote!{
#str_value
}).into()
}
I always want a string literal, so I found this solution that is good enough. Literal implements ToString, which I can then use with .parse().
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input: Vec<TokenTree> = input.into_iter().collect();
let value = match &input.get(0) {
Some(TokenTree::Literal(literal)) => literal.to_string(),
_ => panic!()
};
let str_value: String = value.parse().unwrap();
// do whatever
format!("{:?}", str_value).parse().unwrap()
}
I had a similar problem when parsing a doc attribute, which is also represented as a TokenStream. This is not an exact answer, but maybe it will point you in the right direction:
fn from(value: &Vec<Attribute>) -> Vec<String> {
let mut lines = Vec::new();
for attr in value {
if !attr.path.is_ident("doc") {
continue;
}
if let Ok(Meta::NameValue(nv)) = attr.parse_meta() {
if let Lit::Str(lit) = nv.lit {
lines.push(lit.value());
}
}
}
lines
}

Can I detect if an expression is a mutable variable using Rust's macro repetition?

macro_rules! log {
($($x:expr),*) => {
{
$(
//how to detect $x in the Macro repetition?
)*
}
};
}
I don't want to use ($($x:ident),*). The log! macro can log the expression and then I want to match the variable type.
Macros only can access token streams, and cannot say anything more about those tokens, apart from what is explicitly given to the macro.
For example, consider this code:
fn main() {
let mut foo: i32 = 1;
my_macro!(foo);
}
The only information available to the macro my_macro is the input token foo. It can match that foo is an identifier, but it can't say anything more about it. The facts that there is a mutable binding called foo and that this binding is an i32 are simply not available inside the macro.
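What a macro can do is emit code that the compiler type-checks later. As a sketch (type_of and describe! are hypothetical names), the expansion below calls a generic helper, so the type is resolved after expansion, not by the macro itself. Mutability of the binding still cannot be detected this way:

```rust
// Generic helper: the compiler fills in `T` when the expanded code is
// type-checked, long after the macro has run.
fn type_of<T>(_: &T) -> &'static str {
    std::any::type_name::<T>()
}

// The macro only sees tokens; it pairs the source text of each expression
// with a call to the helper above.
macro_rules! describe {
    ($($x:expr),+ $(,)?) => {
        vec![ $( format!("{} has type {}", stringify!($x), type_of(&$x)) ),+ ]
    };
}

fn main() {
    let mut foo: i32 = 1;
    foo += 1;
    let label = "hi";
    for line in describe!(foo, label) {
        println!("{line}");
    }
}
```

Note that the exact strings returned by std::any::type_name are documented as best-effort, so they are fine for logging but should not be relied on for logic.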

Why does the tt metavariable reject multiple tokens? [duplicate]

I'm reading a book about Rust and have started playing with Rust macros. All metavariable types are explained there and have examples, except the last one: tt. According to the book, it is “a single token tree”. I'm curious, what is it and what is it used for? Can you please provide an example?
That's a notion introduced to ensure that whatever is in a macro invocation correctly matches (), [] and {} pairs. tt will match any single token or any pair of parenthesis/brackets/braces with their content.
For example, for the following program:
fn main() {
println!("Hello world!");
}
The token trees would be:
fn
main
()
    ∅
{ println!("Hello world!"); }
    println
    !
    ("Hello world!")
        "Hello world!"
    ;
Each one forms a tree where simple tokens (fn, main etc.) are leaves, and anything surrounded by (), [] or {} has a subtree. Note that ( does not appear alone in the token tree: it's not possible to match ( without matching the corresponding ).
For example:
macro_rules! match_fn { // the macro name is illustrative; any name works
(fn $name:ident $params:tt $body:tt) => { /* … */ }
}
would match the above function with $name → main, $params → (), $body → { println!("Hello world!"); }.
Token tree is the least demanding metavariable type: it matches anything. It's often used in macros which have a “don't really care” part, and especially in macros which have a “head” and a “tail” part. For example, the println! macros have a branch matching ($fmt:expr, $($arg:tt)*) where $fmt is the format string, and $($arg:tt)* means “all the rest” and is just forwarded to format_args!. Which means that println! does not need to know the actual format and do complicated matching with it.
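Two small sketches of tt in practice, with made-up macro names: count_tts! counts top-level token trees (a delimited group counts as one), and log_line! uses the head/tail pattern to forward "all the rest" to format! without inspecting it:

```rust
// Counts top-level token trees; anything inside (), [] or {} is one tree.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// Head/tail pattern: the level is matched precisely, the rest is
// forwarded untouched to `format!`.
macro_rules! log_line {
    ($level:ident, $($arg:tt)*) => {
        format!("[{}] {}", stringify!($level), format!($($arg)*))
    };
}

fn main() {
    // `fn`, `main`, `()` and `{ ... }` are four token trees.
    let n = count_tts!(fn main() { println!("Hello world!"); });
    println!("{n}");
    println!("{}", log_line!(INFO, "x = {}", 5));
}
```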

Identifier interpolation in macro definition [duplicate]

Basically, there are two parts to this question:
Can you pass an unknown identifier to a macro in Rust?
Can you combine strings to generate new variable names in a Rust macro?
For example, something like:
macro_rules! expand(
($x:ident) => (
let mut x_$x = 0;
)
)
Calling expand!(hi) obviously fails because hi is an unknown identifier; but can you somehow do this?
ie. The equivalent in C of something like:
#include <stdio.h>
#define FN(Name, base) \
int x1_##Name = 0 + base; \
int x2_##Name = 2 + base; \
int x3_##Name = 4 + base; \
int x4_##Name = 8 + base; \
int x5_##Name = 16 + base;
int main() {
FN(hello, 10)
printf("%d %d %d %d %d\n", x1_hello, x2_hello, x3_hello, x4_hello, x5_hello);
return 0;
}
Why you say, what a terrible idea. Why would you ever want to do that?
I'm glad you asked!
Consider this rust block:
{
let marker = 0;
let borrowed = borrow_with_block_lifetime(data, &marker);
unsafe {
perform_ffi_call(borrowed);
}
}
You now have a borrowed value with an explicitly bounded lifetime (marker) that isn't using a structure lifetime, but that we can guarantee exists for the entire scope of the ffi call; at the same time we don't run into obscure errors where a * is de-referenced unsafely inside an unsafe block and so the compiler doesn't catch it as an error, despite the error being made inside a safe block.
(see also Why are all my pointers pointing to the same place with to_c_str() in rust?)
The use of a macro that can declare temporary variables for this purpose would considerably ease the troubles I have fighting with the compiler. That's why I want to do this.
Yes, however this is only available as a nightly-only experimental API, which may be removed.
You can pass arbitrary identifier into a macro and yes, you can concatenate identifiers into a new identifier using concat_idents!() macro:
#![feature(concat_idents)]
macro_rules! test {
($x:ident) => ({
let z = concat_idents!(hello_, $x);
z();
})
}
fn hello_world() { }
fn main() {
test!(world);
}
However, as far as I know, because concat_idents!() itself is a macro, you can't use this concatenated identifier everywhere you could use plain identifier, only in certain places like in example above, and this, in my opinion, is a HUGE drawback. Just yesterday I tried to write a macro which could remove a lot of boilerplate in my code, but eventually I was not able to do it because macros do not support arbitrary placement of concatenated identifiers.
BTW, if I understand your idea correctly, you don't really need concatenating identifiers to obtain unique names. Rust macros, contrary to the C ones, are hygienic. This means that all names of local variables introduced inside a macro won't leak to the scope where this macro is called. For example, you could assume that this code would work:
macro_rules! test {
($body:expr) => ({ let x = 10; $body })
}
fn main() {
let y = test!(x + 10);
println!("{}", y);
}
That is, we create a variable x and put an expression after its declaration. It is then natural to think that x in test!(x + 10) refers to that variable declared by the macro, and everything should be fine, but in fact this code won't compile:
main3.rs:8:19: 8:20 error: unresolved name `x`.
main3.rs:8 let y = test!(x + 10);
^
main3.rs:3:1: 5:2 note: in expansion of test!
main3.rs:8:13: 8:27 note: expansion site
error: aborting due to previous error
So if all you need is uniqueness of locals, then you can safely do nothing and use any names you want, they will be unique automatically. This is explained in macro tutorial, though I find the example there somewhat confusing.
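As a sketch of both sides of hygiene (macro names are made up): if the caller passes the identifier in, the binding created by the macro is visible to the caller's expression, while a local that the macro introduces on its own never clashes with a caller variable of the same name:

```rust
// The identifier comes from the caller, so `$body` can refer to it.
macro_rules! with_ten {
    ($name:ident, $body:expr) => {{
        let $name = 10;
        $body
    }};
}

// `x` here is private to the macro; it is a different variable from any
// `x` at the call site.
macro_rules! times_two {
    ($e:expr) => {{
        let x = 2;
        $e * x
    }};
}

fn main() {
    let y = with_ten!(x, x + 10);
    println!("{y}");

    let x = 5;
    // The caller's `x` (5) is used inside `$e`, the macro's `x` (2) as factor.
    println!("{}", times_two!(x + 1));
}
```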
There is also https://github.com/dtolnay/paste, which works well in cases where concat_idents is underpowered or in cases where you can't target the nightly compiler.
use paste::paste;
macro_rules! foo_macro {
( $( $name:ident ),+ ) => {
paste! {
// generate one test function per identifier
$(
#[test]
fn [<test_ $name>]() {
assert!(false);
}
)+
}
};
}
In cases where concat_idents doesn't work (which is most cases I'd like to use it) changing the problem from concatenated identifiers to using namespaces does work.
That is, instead of the non-working code:
macro_rules! test {
($x:ident) => ({
struct concat_idents!(hello_, $x) {}
enum concat_idents!(hello_, $x) {}
})
}
The user can name the namespace, and then have preset names as shown below:
macro_rules! test {
($x:ident) => ({
mod $x {
struct HelloStruct {}
enum HelloEnum {}
}
})
}
Now you have a name based on the macro's argument. This technique is only helpful in specific cases.
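A runnable sketch of the namespace approach at item level. The macro name make_ns, the field value and the function name() are assumptions added for illustration; the point is only that each invocation gets its own module, so the preset names never collide:

```rust
// Each invocation creates a module named after the argument; the items
// inside always have the same preset names.
macro_rules! make_ns {
    ($x:ident) => {
        pub mod $x {
            #[derive(Debug, Default, PartialEq)]
            pub struct HelloStruct {
                pub value: i32,
            }

            // Returns the module's own name, taken from the macro argument.
            pub fn name() -> &'static str {
                stringify!($x)
            }
        }
    };
}

make_ns!(world);
make_ns!(moon);

fn main() {
    let a = world::HelloStruct::default();
    let b = moon::HelloStruct { value: 3 };
    println!("{} {:?} / {} {:?}", world::name(), a, moon::name(), b);
}
```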
You can collect your identifiers into a struct if you don't want to use nightly and external crates and your identifiers are types.
use std::fmt::Debug;
fn print_f<T: Debug>(v: &T){
println!("{:?}", v);
}
macro_rules! print_all {
($($name:ident),+) => {
struct Values{
$($name: $name),+
}
let values = Values{
$(
$name: $name::default()
),+
};
$(
print_f(&values.$name);
)+
};
}
fn main(){
print_all!(String, i32, usize);
}
This code prints
""
0
0
If you fear that Values will conflict with some other type name, you can use a long UUID as part of the name:
struct Values_110cf51d7a694c808e6fe79bf1485d5b{
$($name:$name),+
}
