How to convert a string literal into code, say a closure, in Rust?

As the title says, I want to convert a string literal into code. So, naturally, I came up with a macro:
// main.rs
use quote::quote;
use syn::ExprClosure;
// declare a macro to expand the entire input TokenStream as a token tree.
macro_rules! foo {
($t:tt) => {
$t
};
}
fn main() {
let code = "|x:&i32|x+1";
// parse string literal to syn::ExprClosure type.
let expr = syn::parse_str::<ExprClosure>(code).unwrap();
// quote! returns a value of type proc_macro2::TokenStream.
let expr_ts = quote! {
let slice = vec![1,2,3];
let slice2: Vec<i32> = slice.iter()
.map(#expr)
.collect();
println!("{:#?}",slice);
};
// call foo! to expand `expr_ts`
foo!(expr_ts);
}
This compiles without errors but prints nothing at all.
As far as I can tell, it prints nothing because $t in foo! is not expanded the way I want it to be. So I tried the following to find out whether the type of $t was wrong:
macro_rules! foo {
($t:tt) => {
println!("{:#?}", $t);
};
}
This successfully printed the syntax tree of $t (which contains the parsed syn::ExprClosure).
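One way to check what the declarative macro actually receives, sketched with the standard library only (the names show_tokens and received_tokens are made up for this illustration): stringify! turns the matched token tree back into text, which suggests that foo!(expr_ts) is handed the identifier expr_ts itself and never the contents of the variable.

```rust
// A declarative macro operates on the tokens written at the call site.
macro_rules! show_tokens {
    ($t:tt) => {
        stringify!($t)
    };
}

// Hypothetical helper: returns the text of the token the macro matched.
fn received_tokens() -> &'static str {
    // Stand-in for the real TokenStream value from the question.
    let expr_ts = "|x: &i32| x + 1";
    let _ = expr_ts; // the macro below never sees this value
    show_tokens!(expr_ts)
}
```

Calling received_tokens() yields the variable's name rather than the closure source, so expanding `$t` as a statement only evaluates the variable and discards it.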
Update:
After doing some research on macros, I found that the problem was most likely a misuse (or misunderstanding) of the two kinds of macro: procedural vs. declarative.
So I refactored the code, and it worked:
// another_crate::lib
use proc_macro::TokenStream;
use quote::quote;
use syn::ExprClosure;
#[proc_macro]
pub fn foo(_item: TokenStream) -> TokenStream {
let code = "|x:&i32|x+1";
let expr = syn::parse_str::<ExprClosure>(code).unwrap();
let expr_ts = quote! {
let slice = vec![1,2,3];
let slice2: Vec<i32> = slice.iter()
.map(#expr)
.collect();
println!("{:?}",slice2);
};
TokenStream::from(expr_ts)
}
// main.rs
use another_crate::foo;
fn main() {
foo!();
}
// output
// [2, 3, 4]
The quote! macro evaluates to an expression of type proc_macro2::TokenStream. Meanwhile Rust procedural macros are expected to return the type proc_macro::TokenStream.
The difference between the two types is that proc_macro types are entirely specific to procedural macros and cannot ever exist in code outside of a procedural macro, while proc_macro2 types may exist anywhere including tests and non-macro code like main.rs and build.rs.
As the quote documentation cited above notes, proc_macro::TokenStream cannot be used in main.rs, and that is exactly where I went wrong: I called quote! in main.rs, which returns a proc_macro2::TokenStream, and expected the compiler to parse and expand it. That cannot work, because the compiler only expands a proc_macro::TokenStream returned by a procedural macro crate (another_crate here), not a proc_macro2::TokenStream value built in main.rs.
Now the question is whether I can implement the conversion with a declarative macro alone. It seems that procedural macros cannot take a variable as an argument, which means the string literal has to be hardcoded.
Really appreciate any replies regarding my concern.
Thanks.

Related

Attempt to implement sscanf in Rust, failing when passing &str as argument

Problem:
I'm new to Rust, and I'm trying to implement a macro that simulates sscanf from C.
So far it works with any numeric type, but not with strings, since the input I am parsing is already a string.
macro_rules! splitter {
( $string:expr, $sep:expr) => {{
// the inner braces make the expansion a single block expression
let parts: Vec<&str> = $string.split($sep).collect();
parts
}};
}
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
$x = res[i].parse::<$y>().unwrap_or_default();
i+=1;
)*
};
}
fn main() {
let mut a :u8; let mut b :i32; let mut c :i16; let mut d :f32;
let buffer = "00:98;76,39.6";
let sep = [':',';',','];
scan_to_types!(buffer,sep,[u8,i32,i16,f32],a,b,c,d); // this will work
println!("{} {} {} {}",a,b,c,d);
}
This obviously won't work with &str, because the expansion calls parse::<&str>() and &str does not implement FromStr:
let a :u8; let b :i32; let c :i16; let d :f32; let e :&str;
let buffer = "02:98;abc,39.6";
let sep = [':',';',','];
scan_to_types!(buffer,sep,[u8,i32,&str,f32],a,b,e,d);
println!("{} {} {} {}",a,b,e,d);
$x = res[i].parse::<$y>().unwrap_or_default();
| ^^^^^ the trait `FromStr` is not implemented for `&str`
What I have tried
I tried comparing types with TypeId and an if/else inside the macro to skip the parsing, but the same error occurs, because both branches are still type-checked for every type and the expansion is not valid code:
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
if TypeId::of::<$y>() == TypeId::of::<&str>(){
$x = res[i];
}else{
$x = res[i].parse::<$y>().unwrap_or_default();
}
i+=1;
)*
};
}
Is there a way to set conditions or skip a repetition inside a macro? Or is there a better approach to building sscanf with macros? I have already written functions that parse those strings, but I couldn't pass types as arguments or make them generic.
Note before the answer: you probably don't want to emulate sscanf() in Rust. There are many very capable parsers in Rust, so you should probably use one of them.
Simple answer: the simplest way to address your problem is to replace the use of &str with String, which makes your macro compile and run. If your code is not performance-critical, that is probably all you need. If you care about performance and about avoiding allocation, read on.
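A minimal sketch of that simple fix, assuming a cut-down variant of the macro from the question (the iterator-based splitting, the `ident` targets, and the scan_demo helper are illustrative choices, not the asker's exact code). Every target, including the text field, is a type that implements FromStr, so parse() works uniformly:

```rust
// Simplified variant of the question's macro: split the buffer once and
// parse each piece into the requested type, falling back to the default.
macro_rules! scan_to_types {
    ($buffer:expr, $sep:expr, [$($y:ty),+], $($x:ident),+) => {{
        let mut parts = $buffer.split($sep);
        $(
            $x = parts.next().unwrap_or("").parse::<$y>().unwrap_or_default();
        )+
    }};
}

// Hypothetical helper that runs the macro on the sample input.
fn scan_demo() -> (u8, i32, String, f32) {
    let a: u8;
    let b: i32;
    let e: String;
    let d: f32;
    let sep = [':', ';', ','];
    // `String` implements `FromStr`, so the text field is parsed (and
    // copied) like any other field.
    scan_to_types!("02:98;abc,39.6", &sep[..], [u8, i32, String, f32], a, b, e, d);
    (a, b, e, d)
}
```

The cost is one allocation per text field, which is the downside discussed next.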
A downside of String is that under the hood it copies the string data from the string you're scanning into a freshly allocated owned string. Your original approach of using an &str should have allowed for your &str to directly point into the data that was scanned, without any copying. Ideally we'd like to write something like this:
trait MyFromStr {
fn my_from_str(s: &str) -> Self;
}
// when called on a type that impls `FromStr`, use `parse()`
impl<T: FromStr + Default> MyFromStr for T {
fn my_from_str(s: &str) -> T {
s.parse().unwrap_or_default()
}
}
// when called on &str, just return it without copying
impl MyFromStr for &str {
fn my_from_str(s: &str) -> &str {
s
}
}
Unfortunately that doesn't compile, complaining of a "conflicting implementation of trait MyFromStr for &str", even though there is no conflict between the two implementations, as &str doesn't implement FromStr. But the way Rust currently works, a blanket implementation of a trait precludes manual implementations of the same trait, even on types not covered by the blanket impl.
In the future this will be resolved by specialization. Specialization is not yet part of stable Rust, and might not come to stable Rust for years, so we have to think of another solution. In case of macro usage, we can just let the compiler "specialize" for us by creating two traits with the same name. (This is similar to the autoref-based specialization invented by David Tolnay, but even simpler because it doesn't require autoref resolution to work, as we have the types provided explicitly.)
We create separate traits for parsed and unparsed values, and implement them as needed:
trait ParseFromStr {
fn my_from_str(s: &str) -> Self;
}
impl<T: FromStr + Default> ParseFromStr for T {
fn my_from_str(s: &str) -> T {
s.parse().unwrap_or_default()
}
}
pub trait StrFromStr {
fn my_from_str(s: &str) -> &str;
}
impl StrFromStr for &str {
fn my_from_str(s: &str) -> &str {
s
}
}
Then in the macro we just call <$y>::my_from_str() and let the compiler generate the correct code. Since macros are untyped, this works because we never need to provide a single "trait bound" that would disambiguate which my_from_str() we want. (Such a trait bound would require specialization.)
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
#[allow(unused_assignments)]
{
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
$x = <$y>::my_from_str(&res[i]);
i+=1;
)*
}
};
}
Complete example in the playground.
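For reference, the pieces above can be assembled into one self-contained sketch. This is not the linked playground code: the macro is simplified (it walks a split iterator instead of indexing a Vec and takes `ident` targets), and scan_borrowed is a hypothetical helper added for illustration.

```rust
use std::str::FromStr;

// Parsed fields: anything that implements `FromStr` (and `Default`).
trait ParseFromStr {
    fn my_from_str(s: &str) -> Self;
}

impl<T: FromStr + Default> ParseFromStr for T {
    fn my_from_str(s: &str) -> T {
        s.parse().unwrap_or_default()
    }
}

// Unparsed fields: `&str` is handed back without copying.
trait StrFromStr {
    fn my_from_str(s: &str) -> &str;
}

impl StrFromStr for &str {
    fn my_from_str(s: &str) -> &str {
        s
    }
}

macro_rules! scan_to_types {
    ($buffer:expr, $sep:expr, [$($y:ty),+], $($x:ident),+) => {{
        let mut parts = $buffer.split($sep);
        $(
            // Exactly one of the two traits applies to each `$y`.
            $x = <$y>::my_from_str(parts.next().unwrap_or(""));
        )+
    }};
}

// Hypothetical helper: the returned `&str` borrows from `buffer`.
fn scan_borrowed(buffer: &str) -> (u8, i32, &str, f32) {
    let a: u8;
    let b: i32;
    let e: &str;
    let d: f32;
    let sep = [':', ';', ','];
    scan_to_types!(buffer, &sep[..], [u8, i32, &str, f32], a, b, e, d);
    (a, b, e, d)
}
```

With the input "02:98;abc,39.6", the third field comes back as a slice of the original buffer rather than a freshly allocated String.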

How to pretty print Syn AST?

I'm trying to use syn to create an AST from a Rust file and then using quote to write it to another. However, when I write it, it puts extra spaces between everything.
Note that the example below is just to demonstrate the minimum reproducible problem I'm having. I realize that if I just wanted to copy the code over I could copy the file but it doesn't fit my case and I need to use an AST.
pub fn build_file() {
let current_dir = std::env::current_dir().expect("Unable to get current directory");
let rust_file = std::fs::read_to_string(current_dir.join("src").join("lib.rs")).expect("Unable to read rust file");
let ast = syn::parse_file(&rust_file).expect("Unable to create AST from rust file");
std::fs::write("src/utils.rs", quote::quote!(#ast).to_string()).expect("Unable to write file");
}
The file that it creates an AST of is this:
#[macro_use]
extern crate foo;
mod test;
fn init(handle: foo::InitHandle) {
handle.add_class::<Test::test>();
}
What it outputs is this:
# [macro_use] extern crate foo ; mod test ; fn init (handle : foo :: InitHandle) { handle . add_class :: < Test :: test > () ; }
I've even tried running it through rustfmt after writing it to the file like so:
utils::write_file("src/utils.rs", quote::quote!(#ast).to_string());
match std::process::Command::new("cargo").arg("fmt").output() {
Ok(_v) => (),
Err(e) => std::process::exit(1),
}
But it doesn't seem to make any difference.
The quote crate is not really concerned with pretty printing the generated code. You can run it through rustfmt, you just have to execute rustfmt src/utils.rs or cargo fmt -- src/utils.rs.
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;
fn write_and_fmt<P: AsRef<Path>, S: ToString>(path: P, code: S) -> io::Result<()> {
fs::write(&path, code.to_string())?;
Command::new("rustfmt")
.arg(path.as_ref())
.spawn()?
.wait()?;
Ok(())
}
Now you can just execute:
write_and_fmt("src/utils.rs", quote::quote!(#ast)).expect("unable to save or format");
See also "Any interest in a pretty-printing crate for Syn?" on the Rust forum.
As Martin mentioned in his answer, prettyplease can be used to format code fragments, which is quite useful when testing proc macros, where the standard to_string() output of proc_macro2::TokenStream is rather hard to read.
Here is a code sample that pretty-prints a proc_macro2::TokenStream parsable as a syn::Item:
fn pretty_print_item(item: proc_macro2::TokenStream) -> String {
let item = syn::parse2(item).unwrap();
let file = syn::File {
attrs: vec![],
items: vec![item],
shebang: None,
};
prettyplease::unparse(&file)
}
I used this in my tests to help me see where the generated code is wrong:
assert_eq!(
expected.to_string(),
generate_event().to_string(),
"\n\nActual:\n {}",
pretty_print_item(generate_event())
);
Please see the new prettyplease crate. Advantages:
It can be used directly as a library.
It can handle code fragments while rustfmt only handles full files.
It is fast because it uses a simpler algorithm.
Similar to other answers, I also use prettyplease.
I use this little trick to pretty-print a proc_macro2::TokenStream (e.g. what you get from calling quote::quote!):
fn pretty_print(ts: &proc_macro2::TokenStream) -> String {
let file = syn::parse_file(&ts.to_string()).unwrap();
prettyplease::unparse(&file)
}
Basically, I convert the token stream to an unformatted String, then parse that String into a syn::File, and then pass that to prettyplease package.
Usage:
#[test]
fn it_works() {
let tokens = quote::quote! {
struct Foo {
bar: String,
baz: u64,
}
};
let formatted = pretty_print(&tokens);
let expected = "struct Foo {\n bar: String,\n baz: u64,\n}\n";
assert_eq!(formatted, expected);
}

How do I get the value and type of a Literal in a procedural macro?

I am implementing a function-like procedural macro which takes a single string literal as an argument, but I don't know how to get the value of the string literal.
If I print the variable, it shows a bunch of fields, which includes both the type and the value. They are clearly there, somewhere. How do I get them?
extern crate proc_macro;
use proc_macro::{TokenStream,TokenTree};
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input: Vec<TokenTree> = input.into_iter().collect();
let literal = match &input.get(0) {
Some(TokenTree::Literal(literal)) => literal,
_ => panic!()
};
// can't do anything with "literal"
// println!("{:?}", literal.lit.symbol); says "unknown field"
format!("{:?}", format!("{:?}", literal)).parse().unwrap()
}
// main.rs (the crate that uses the macro)
#![feature(proc_macro_hygiene)]
extern crate macros;
fn main() {
let value = macros::my_macro!("hahaha");
println!("it is {}", value);
// prints "it is Literal { lit: Lit { kind: Str, symbol: "hahaha", suffix: None }, span: Span { lo: BytePos(100), hi: BytePos(108), ctxt: #0 } }"
}
After running into the same problem countless times already, I finally wrote a library to help with this: litrs on crates.io. It compiles faster than syn and lets you inspect your literals.
use std::convert::TryFrom;
use litrs::StringLit;
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input = input.into_iter().collect::<Vec<_>>();
if input.len() != 1 {
let msg = format!("expected exactly one input token, got {}", input.len());
return quote! { compile_error!(#msg) }.into();
}
let string_lit = match StringLit::try_from(&input[0]) {
// Error if the token is not a string literal
Err(e) => return e.to_compile_error(),
Ok(lit) => lit,
};
// `StringLit::value` returns the actual string value represented by the
// literal. Quotes are removed and escape sequences replaced with the
// corresponding value.
let v = string_lit.value();
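// TODO: implement your logic here
// Placeholder so the function returns a value: emit the string back as a literal.
quote! { #v }.into()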
}
See the documentation of litrs for more information.
To obtain more information about a literal, litrs uses the Display impl of Literal to obtain a string representation (as it would be written in source code) and then parses that string. For example, if the string starts with 0x one knows it has to be an integer literal, if it starts with r#" one knows it is a raw string literal. The crate syn does exactly the same.
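The prefix checks described above can be sketched in plain Rust. This is only an illustration of the idea, not the actual implementation in litrs or syn; the LiteralKind enum and classify_literal function are hypothetical names, and real literal parsing covers many more cases (suffixes, byte strings, floats, escapes, and so on).

```rust
// Hypothetical classification of a literal's source text by its prefix.
#[derive(Debug, PartialEq)]
enum LiteralKind {
    Integer,
    RawString,
    String,
    Other,
}

fn classify_literal(source: &str) -> LiteralKind {
    if source.starts_with("0x") {
        // A hexadecimal prefix can only start an integer literal.
        LiteralKind::Integer
    } else if source.starts_with("r#\"") || source.starts_with("r\"") {
        LiteralKind::RawString
    } else if source.starts_with('"') {
        LiteralKind::String
    } else {
        // Everything else is left unclassified in this sketch.
        LiteralKind::Other
    }
}
```

The input here stands for the string produced by the Display impl of Literal, that is, the literal as it was written in the source file.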
Of course, it seems a bit wasteful to write and run a second parser given that rustc has already parsed the literal. Yes, that's unfortunate, and having a better API in proc_macro would be preferable. But right now, I think litrs (or syn, if you are using syn anyway) is the best solution.
(PS: I'm usually not a fan of promoting one's own libraries on Stack Overflow, but I am very familiar with the problem OP is having and I very much think litrs is the best tool for the job right now.)
If you're writing procedural macros, I'd recommend that you look into using the crates syn (for parsing) and quote (for code generation) instead of using proc-macro directly, since those are generally easier to deal with.
In this case, you can use syn::parse_macro_input to parse a token stream into any syntactic element of Rust (such as literals, expressions, or functions); it also takes care of the error messages if parsing fails.
You can use LitStr to represent a string literal, if that's exactly what you need. The .value() function will give you a String with the contents of that literal.
You can use quote::quote to generate the output of the macro, and use # to insert the contents of a variable into the generated code.
use proc_macro::TokenStream;
use syn::{parse_macro_input, LitStr};
use quote::quote;
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
// macro input must be `LitStr`, which is a string literal.
// if not, a relevant error message will be generated.
let input = parse_macro_input!(input as LitStr);
// get value of the string literal.
let str_value = input.value();
// do something with value...
let str_value = str_value.to_uppercase();
// generate code, include `str_value` variable (automatically encodes
// `String` as a string literal in the generated code)
(quote!{
#str_value
}).into()
}
I always want a string literal, so I found this solution that is good enough. Literal implements ToString, which I can then use with .parse().
#[proc_macro]
pub fn my_macro(input: TokenStream) -> TokenStream {
let input: Vec<TokenTree> = input.into_iter().collect();
let value = match &input.get(0) {
Some(TokenTree::Literal(literal)) => literal.to_string(),
_ => panic!()
};
let str_value: String = value.parse().unwrap();
// do whatever
format!("{:?}", str_value).parse().unwrap()
}
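One caveat with this approach, shown as a standard-library-only sketch: the text produced by Literal::to_string() is the literal as written in the source, so for a string literal it still includes the surrounding quotes, and parsing that text into a String simply copies it. The hypothetical naive_unquote helper below strips one pair of plain double quotes and nothing more; escape sequences and raw strings are not handled, which is what LitStr::value() or litrs are for.

```rust
// Mirrors `value.parse().unwrap()` from the snippet above: parsing a
// string into a `String` cannot fail and returns the text unchanged.
fn parse_as_string(source: &str) -> String {
    source.parse().unwrap()
}

// Hypothetical helper: removes one pair of surrounding double quotes.
// Escapes such as `\n` or `\"` and raw strings are NOT handled.
fn naive_unquote(source: &str) -> Option<&str> {
    source.strip_prefix('"')?.strip_suffix('"')
}
```

Here parse_as_string keeps the quote characters, while naive_unquote returns the inner text for simple literals and None for anything that is not wrapped in double quotes.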
I had a similar problem when parsing the doc attribute, which is also represented as a TokenStream. This is not an exact answer, but it may point in the right direction (the code is written against the syn 1.x API):
fn from(value: &Vec<Attribute>) -> Vec<String> {
let mut lines = Vec::new();
for attr in value {
if !attr.path.is_ident("doc") {
continue;
}
if let Ok(Meta::NameValue(nv)) = attr.parse_meta() {
if let Lit::Str(lit) = nv.lit {
lines.push(lit.value());
}
}
}
lines
}

Is there a way to get the file and the module path of where a procedural macro is attached at compile-time?

I'm looking for the equivalent of file!() & module_path!() in a procedural macro context.
For example, the following doesn't work:
file.rs:
#[some_attribute]
const A: bool = true;
macro.rs:
#[proc_macro_attribute]
pub fn some_attribute(attr: TokenStream, input: TokenStream) -> TokenStream {
println!("{}", file!());
input
}
This prints macro.rs which makes sense, but what I want is file.rs. Is there a way to achieve this? Is there also a similar way for module_path!()?
A requirement of this is that has to happen at compile-time.
I'm trying to create a file in the OUT_DIR containing constant values where the attribute is added with the module and the file that they are in.
I had the same problem and found out that Rust added a new experimental API to proc macros (#54725) which allows exactly what you want:
#![feature(proc_macro_span)]
use proc_macro::{Span, TokenStream};
#[proc_macro]
pub fn do_something(_item: TokenStream) -> TokenStream {
let span = Span::call_site();
let source = span.source_file();
format!("println!(r#\"Path: {}\"#)", source.path().to_str().unwrap())
.parse()
.unwrap()
}
// main.rs
use my_macro_crate::*;
fn main() {
println!("Hello, world!");
do_something!();
}
Will output:
Hello, world!
Path: src\main.rs
Important
Apart from this API being experimental, the path might not be a real OS path. This can be the case if the Span was generated by a macro. Visit the documentation here.
The problem here is that println!("{}", file!()); is executed at compile time and not at runtime. Similar to an answer that was recently given here, you can edit the original function and insert your code at the beginning of it, so that it is executed at runtime instead. You can still use the built-in macros file!() and module_path!(). Here is a macro.rs with this approach:
#[proc_macro_attribute]
pub fn some_attribute(_attr: TokenStream, input: TokenStream) -> TokenStream {
// prefix to be added to the function's body
let mut prefix: TokenStream = "
println!(\"Called from {:?} inside module path {:?}\",
file!(), module_path!());
".parse().unwrap();
// edit TokenStream
input.into_iter().map(|tt| {
match tt {
TokenTree::Group(ref g) // match function body
if g.delimiter() == proc_macro::Delimiter::Brace => {
// add logic before function body
prefix.extend(g.stream());
// return new function body as TokenTree
TokenTree::Group(proc_macro::Group::new(
proc_macro::Delimiter::Brace, prefix.clone()))
},
other => other, // else just forward
}
}).collect()
}
You can use it like this in your main.rs:
use mylib::some_attribute;
#[some_attribute]
fn yo() -> () { println!("yo"); }
fn main() { yo(); }
Note that the code is added before what's inside of the function's body. We could have inserted it at the end, but this would break the possibility of returning a value without a semicolon.
EDIT: Later realized that the OP wants it to run at compile time.

Why does the println! function use an exclamation mark in Rust?

In Swift, ! means to unwrap an optional (possible value).
println! is not a function, it is a macro. Macros use ! to distinguish them from normal method calls. The documentation contains more information.
See also:
What is the difference between macros and functions in Rust?
Rust uses the Option type to denote optional data. It has an unwrap method.
Rust 1.13 added the question mark operator ? as an analog of the try! macro (originally proposed via RFC 243).
An excellent explanation of the question mark operator is in The Rust Programming Language.
use std::fmt::Error; // stand-in: any error type works here
fn foo() -> Result<i32, Error> {
Ok(4)
}
fn bar() -> Result<i32, Error> {
let a = foo()?;
Ok(a + 4)
}
The question mark operator also extends to Option, so you may see it used to unwrap a value or return None from the function. This is different from just unwrapping as the program will not panic:
fn foo() -> Option<i32> {
None
}
fn bar() -> Option<i32> {
let a = foo()?;
Some(a + 4)
}
println! is a macro in Rust, which means that the compiler rewrites the code for you at compile time.
For example this:
fn main() {
let x = 5;
println!("{}", x);
}
Will be converted to something like this at compile time (the exact expansion is an internal detail and differs between compiler versions):
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::v1::*;
#[macro_use]
extern crate std;
fn main() {
let x = 5;
{
::std::io::_print(::core::fmt::Arguments::new_v1(
&["", "\n"],
&match (&x,) {
(arg0,) => [::core::fmt::ArgumentV1::new(
arg0,
::core::fmt::Display::fmt,
)],
},
));
};
}
Notice that x is passed by reference (&x).
It's a macro because it does things that functions can't do:
It parses the format string at compile time, and generates type safe code
It has a variable number of arguments
It has named arguments ("keyword arguments")
println!("My name is {first} {last}", first = "John", last = "Smith");
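As a small illustration of the variable-argument point, here is a sketch of a user-defined declarative macro (the names sum_all and sum_demo are made up for this example) that accepts any number of comma-separated expressions, something a plain function signature cannot express:

```rust
// Hypothetical macro: adds up however many expressions it is given.
macro_rules! sum_all {
    // Base case: a single expression.
    ($x:expr) => { $x };
    // Recursive case: peel off the first expression and recurse on the rest.
    ($x:expr, $($rest:expr),+) => { $x + sum_all!($($rest),+) };
}

// Calls the macro with one argument and with four arguments.
fn sum_demo() -> (i32, i32) {
    (sum_all!(1), sum_all!(1, 2, 3, 4))
}
```

println! relies on a compiler built-in for its format string, so this sketch only mirrors the variadic part of what it does.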
sources:
https://doc.rust-lang.org/rust-by-example/hello/print.html
https://www.reddit.com/r/rust/comments/4qor4o/newb_question_why_is_println_a_macro/
Does println! borrow or own the variable?
