I am new to Rust and trying to figure out how to get into a workflow of discovering the structure of this AST returned by syn.
[package]
name = "rust-ast"
version = "0.1.0"
authors = ["Foo Bar <foo@bar.com>"]
edition = "2018"
[dependencies]
syn = { version = "1.0.89", features = ["full", "printing", "visit", "extra-traits"] }
That is the Cargo.toml, and here is the main file:
use syn;
use std::env;
use std::fs::File;
use std::io::Read;
use std::process;
fn main() {
let mut args = env::args();
let _ = args.next(); // executable name
let filename = match (args.next(), args.next()) {
(Some(filename), None) => filename,
_ => {
eprintln!("Usage: dump-syntax path/to/filename.rs");
process::exit(1);
}
};
let mut file = File::open(&filename).expect("Unable to open file");
let mut src = String::new();
file.read_to_string(&mut src).expect("Unable to read file");
let syntax = syn::parse_file(&src).expect("Unable to parse file");
let mut str = String::from("");
for item in syntax.items {
match item {
syn::Item::Use(x) => {
match x.tree {
syn::path::Path { .. } => {
println!("{:#?}", x);
},
}
str.push_str("load");
},
_ => println!("Skip")
}
}
// let iterator = syntax.iter();
// for val in iterator {
// println!("Got: {:#?}", val);
// }
}
I was able to earlier print out the x, which showed:
However, I am getting this error now:
this expression has type `syn::UseTree`
expected enum `syn::UseTree`, found struct `syn::Path`
First of all, how do I discover the API in VSCode? I have enabled the rust plugin so I can click through some definitions, but I look at that terminal output of the AST "types", and then I try to backward figure out what the type is in VSCode. Usually I resort to looking it up on the docs page, but any tips on how to figure out what the API should be would help teach me to fish. But for this particular question, what am I doing wrong? I am simply trying to destructure the AST as much as possible down to the leaves, to get more familiar with Rust (I am just beginning).
But for this particular question, what am I doing wrong?
x.tree is of type UseTree which is an enum:
pub enum UseTree {
Path(UsePath),
Name(UseName),
... other variants snipped ...
}
Therefore you need to match on the Path variant of UseTree:
match x.tree {
UseTree::Path { .. } => { ... },
_ => todo!(),
}
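As a minimal sketch of what walking such an enum down to its leaves can look like, the snippet below uses a simplified stand-in for UseTree built only from std types. The variant names follow syn, but the payloads (plain strings instead of syn's identifier and token types) and the helpers leaves and example_tree are assumptions made for illustration, not syn's actual API.

```rust
// Simplified stand-in for syn::UseTree, using only std types.
// The variant names follow syn; the payloads are reduced to plain strings.
#[derive(Debug)]
enum UseTree {
    Path { ident: String, tree: Box<UseTree> }, // `a::<rest>`
    Name(String),                               // `a`
    Rename { ident: String, rename: String },   // `a as b`
    Glob,                                       // `*`
    Group(Vec<UseTree>),                        // `{a, b}`
}

// Walk the tree down to its leaves, collecting one full path per leaf.
fn leaves(tree: &UseTree, prefix: &str, out: &mut Vec<String>) {
    match tree {
        UseTree::Path { ident, tree } => {
            leaves(tree, &format!("{}{}::", prefix, ident), out)
        }
        UseTree::Name(ident) => out.push(format!("{}{}", prefix, ident)),
        UseTree::Rename { ident, rename } => {
            out.push(format!("{}{} as {}", prefix, ident, rename))
        }
        UseTree::Glob => out.push(format!("{}*", prefix)),
        UseTree::Group(items) => {
            for item in items {
                leaves(item, prefix, out);
            }
        }
    }
}

// Hypothetical helper: builds the tree for `use std::{env, fs::File};`.
fn example_tree() -> UseTree {
    UseTree::Path {
        ident: "std".to_string(),
        tree: Box::new(UseTree::Group(vec![
            UseTree::Name("env".to_string()),
            UseTree::Path {
                ident: "fs".to_string(),
                tree: Box::new(UseTree::Name("File".to_string())),
            },
        ])),
    }
}
```

Every variant gets its own match arm, so the compiler reports any variant that was forgotten. The same shape should carry over to the real syn types once each payload struct has been looked up in the documentation.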
First of all, how do I discover the API in VSCode?
I don't use VS Code, but I find that clicking through works well for Rust code and the standard library, and less well for macro-heavy libraries such as syn. Fortunately, the documentation for syn is really good.
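One editor-independent trick that may help when the type of a value is unclear is to ask the compiler for its name via std::any::type_name. This is only a sketch: the helper name type_of is made up for illustration, and the returned string is meant for diagnostics, so its exact format is not guaranteed.

```rust
use std::any::type_name;

// Returns the compiler's name for the static type of `value`.
// The exact string is meant for diagnostics only and is not guaranteed to be stable.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}
```

Calling println!("{}", type_of(&x.tree)); in the question's loop would then print the full path of the type of x.tree, which can be searched for in the syn documentation.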
I am simply trying to destructure the AST as much as possible down to the leaves, to get more familiar with Rust (I am just beginning).
syn can be quite complex and so I wouldn't advise this as an approach to learning the language.
I can't figure out how to do import- and instancing-lines such that they tolerate non-existing files/modules and structs.
I tried making a macro that unwraps into such lines based on what files it finds in the directory, using a crate I found that had promise - include_optional - which allows to check for existence of files already at compile-time (since it's a macro).
However, I can't figure out how to use it properly in a macro, neither did I manage to use it without macro using the example at bottom of the docs conditional compilation chapter.
if cfg!(unix) { "unix" } else if cfg!(windows) { "windows" } else { "unknown" } (from the docs)
vs
if include_optional::include_bytes_optional!("day1.rs").is_some() { Some(&day01::Day01 {}) } else { None } // assume day1.rs and thus Day01 are non-existent (my attempt at doing same thing)
My if-statement compiles both branches, including the unreachable one (causing a compilation error), even though according to the docs it supposedly doesn't for cfg! ("conditional compilation").
Essentially, what I want is something of this form:
// Macro to generate code based on how many files/structs has been created
// There are anywhere between 1-25 days
get_days_created!();
/* // Let's assume 11 have been created so far, then the macro should evaluate to this code:
* mod day1;
* use day1 as day01;
* // ...
* mod day11;
* use day11 as day11;
*
* // ...
* fn main() -> Result<(), i32> {
* let days : Vec<&dyn Day> = vec![
* &day01::Day01 {},
* // ...
* &day11::Day11 {},
* ];
* // ...
* }
*/
The solution is to create a proc_macro. These function similarly to regular macros, except that you write an ordinary function containing the code to execute: it receives a TokenStream holding the tokens passed to the macro, and returns a TokenStream holding the tokens the macro should expand to.
To create a proc_macro, the first and most important thing to know is that you can't declare one just anywhere. Instead, you need to create a new library crate and set proc-macro = true in its Cargo.toml file. You can then declare the macros in its lib.rs. An example TOML would look something like this:
[package]
name = "some_proc_macro_lib"
version = "0.1.0"
edition = "2021"
[lib]
proc-macro = true
[dependencies]
glob = "0.3.0"
regex = "1.7.0"
Then you can create your macros in this library as regular functions, with the #[proc_macro] attribute/annotation. Here's an example lib.rs with as few dependencies as possible. For my exact question, the input TokenStream is irrelevant and can be ignored, and instead you want to generate and return a new one:
use proc_macro::TokenStream;
use glob::glob;
use regex::Regex;
#[proc_macro]
pub fn import_days(_: TokenStream) -> TokenStream {
let mut stream = TokenStream::new();
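// Anchor on the literal "day" prefix; a leading greedy `.+` would leave only the last digit for the capture group.
let re = Regex::new(r"^day(\d+)$").unwrap();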
for entry in glob("./src/day*.rs").expect("Failed to read pattern") {
if let Ok(path) = entry {
let prefix = path.file_stem().unwrap().to_str().unwrap();
let caps = re.captures(prefix);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day = &format!("{}", prefix);
let day_padded = &format!("day{:0>2}", n);
stream.extend(format!("mod {};", day).parse::<TokenStream>().unwrap());
if n < 10 {
stream.extend(format!("use {} as {};", day, day_padded).parse::<TokenStream>().unwrap());
}
}
}
}
return proc_macro::TokenStream::from(stream);
}
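The naming logic inside the loop can also be checked on its own, outside a proc-macro crate. The following is a minimal std-only sketch of that logic; the helper names day_number and import_lines are hypothetical, the file stems are assumed to look like day1 or day11, and it parses the number with strip_prefix instead of a regex. Unlike import_days, it skips the alias whenever the padded name equals the module name, which is a deliberate difference.

```rust
// Hypothetical helper: extract the day number from a file stem such as "day7" or "day11".
fn day_number(stem: &str) -> Option<u32> {
    stem.strip_prefix("day")?.parse().ok()
}

// Build the source text the macro would emit for one file stem.
fn import_lines(stem: &str) -> Option<String> {
    let n = day_number(stem)?;
    let padded = format!("day{:0>2}", n);
    let mut code = format!("mod {};", stem);
    // Only alias when the padded name differs from the module name.
    if padded != stem {
        code.push_str(&format!(" use {} as {};", stem, padded));
    }
    Some(code)
}
```

Inside the macro, the returned string could be turned into tokens with .parse::<TokenStream>() in the same way import_days does.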
The question could be considered answered with this already, but the answer can and should be further expanded on in my opinion. And as such I will do so.
Some additional explanations and suggestions, beyond the scope of the question
There are however quite a few other crates beside proc_macro that can aid you with both parsing the input stream, and building the output one. Of note are the dependencies syn and quote, and to aid them both there's the crate proc_macro2.
The syn crate
With syn you get helpful types, methods and macros for parsing the input TokenStream. Essentially, with a struct Foo implementing syn::parse::Parse and the macro call let foo = syn::parse_macro_input!(input as Foo), you can much more easily parse the input into a custom struct thanks to syn::parse::ParseStream. An example would be something like this:
use proc_macro2::Ident;
use syn;
use syn::parse::{Parse, ParseStream};
#[derive(Debug, Default)]
struct Foo {
idents: Vec<Ident>,
}
impl Parse for Foo {
fn parse(input: ParseStream) -> syn::Result<Self> {
let mut foo = Foo::default();
while !input.is_empty() {
let fn_ident = input.parse::<Ident>()?;
foo.idents.push(fn_ident);
// Optional comma: Ok vs Err doesn't matter. Just consume if it exists and ignore failures.
input.parse::<syn::token::Comma>().ok();
}
return Ok(foo);
}
}
Note that the syn::Result return-type allows for nice propagation of parsing-errors when using the sugary ? syntax: input.parse::<SomeType>()?
The quote crate
With quote you get a helpful macro for generating a token stream in a way more akin to how macro_rules does it. As an argument you write essentially regular code, and tell it to use the value of variables by prefixing them with #.
Do note that you can't just pass it variables containing strings and expect them to expand into identifiers, as strings expand to the value "foo" (quotes included), i.e. mod "day1"; instead of mod day1;. You need to turn them into either:
a proc_macro2::Ident
syn::Ident::new(foo_str, proc_macro2::Span::call_site())
or a proc_macro2::TokenStream
foo_str.parse::<TokenStream>().unwrap()
The latter also lets you convert longer strings containing more than a single Ident, and handles things such as literals, making it possible to skip the quote! macro entirely and just use this token stream directly (as seen in import_days).
Here's an example that creates a struct with dynamic name, and implements a specific trait for it:
use proc_macro2::TokenStream;
use quote::quote;
// ...
let mut stream = TokenStream::new();
stream.extend(quote!{
#[derive(Debug)]
pub struct #day_padded_upper {}
impl Day for #day_padded_upper {
#trait_parts
}
});
return proc_macro::TokenStream::from(stream);
Finally, on how to implement my question
This 'chapter' is a bit redundant, as I essentially answered the question with the first two code snippets (the .toml and fn import_days), and the rest could have been left as an exercise for the reader. However, while the question is about reading the filesystem at compile time in a macro to 'dynamically' change its expansion (sort of), I wrote it in a more general form, asking how to achieve a specific result (as old me didn't know macros could do that). So for completeness I'll include this 'chapter' nevertheless.
There is also the fact that the last macro in this 'chapter' - impl_day (which wasn't mentioned at all in the question) - serves as a good example of how to achieve two adjacent but important and relevant tasks.
Retrieving and using call-site's filename.
Parsing the input TokenStream using the syn dependency as shown above.
In other words: knowing all the above, this is how you can create macros for importing all targeted files, instantiating structs for all targeted files, as well as to declare + define the struct from current file's name.
Importing all targeted files:
See import_days above at the start.
Instantiating Vec with structs from all targeted files:
#[proc_macro]
pub fn instantiate_days(_: proc_macro::TokenStream) -> proc_macro::TokenStream {
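// Anchor on the literal "day" prefix; a leading greedy `.+` would leave only the last digit for the capture group.
let re = Regex::new(r"^day(\d+)$").unwrap();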
let mut stream = TokenStream::new();
let mut block = TokenStream::new();
for entry in glob("./src/day*.rs").expect("Failed to read pattern") {
match entry {
Ok(path) => {
let prefix = path.file_stem().unwrap().to_str().unwrap();
let caps = re.captures(prefix);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day_padded = &format!("day{:0>2}", n);
let day_padded_upper = &format!("Day{:0>2}", n);
let instance = &format!("&{}::{} {{}}", day_padded, day_padded_upper).parse::<TokenStream>().unwrap();
block.extend(quote!{
v.push( #instance );
});
}
},
Err(e) => println!("{:?}", e),
}
}
stream.extend(quote!{
{
let mut v: Vec<&dyn Day> = Vec::new();
#block
v
}
});
return proc_macro::TokenStream::from(stream);
}
Declaring and defining struct for current file invoking this macro:
#[derive(Debug, Default)]
struct DayParser {
parts: Vec<Ident>,
}
impl Parse for DayParser {
fn parse(input: ParseStream) -> syn::Result<Self> {
let mut day_parser = DayParser::default();
while !input.is_empty() {
let fn_ident = input.parse::<Ident>()?;
// Optional, Ok vs Err doesn't matter. Just consume if it exists.
input.parse::<syn::token::Comma>().ok();
day_parser.parts.push(fn_ident);
}
return Ok(day_parser);
}
}
#[proc_macro]
pub fn impl_day(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
let mut stream = TokenStream::new();
let span = Span::call_site();
// Note: Span::source_file() is an unstable API, so this line needs a nightly toolchain.
let binding = span.source_file().path();
let file = binding.to_str().unwrap();
let re = Regex::new(r".*day(\d+)\.rs$").unwrap();
let caps = re.captures(file);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day_padded_upper = format!("Day{:0>2}", n).parse::<TokenStream>().unwrap();
let day_parser = syn::parse_macro_input!(input as DayParser);
let mut trait_parts = TokenStream::new();
for (k, fn_ident) in day_parser.parts.into_iter().enumerate() {
let k = k+1;
let trait_part_ident = format!("part_{}", k).parse::<TokenStream>().unwrap();
// let trait_part_ident = proc_macro::Ident::new(format!("part_{}", k).as_str(), span);
trait_parts.extend(quote!{
fn #trait_part_ident(&self, input: &str) -> Result<String, ()> {
return Ok(format!("Part {}: {:?}", #k, #fn_ident(input)));
}
});
}
stream.extend(quote!{
#[derive(Debug)]
pub struct #day_padded_upper {}
impl Day for #day_padded_upper {
#trait_parts
}
});
} else {
// don't generate anything
let str = format!("Tried to implement Day for a file with malformed name: file = \"{}\" , re = \"{:?}\"", file, re);
println!("{}", str);
// compile_error!(str); // can't figure out how to use these
}
return proc_macro::TokenStream::from(stream);
}
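On the commented-out compile_error! above: it cannot be called with a runtime string directly, because it would fire while the macro crate itself is being compiled. What tends to work instead is to make the macro expand to a compile_error! invocation, so that the error is raised at the call site. A minimal std-only sketch of building that invocation as source text follows; the helper name compile_error_source is hypothetical.

```rust
// Hypothetical helper: build the source text of a `compile_error!` invocation.
// `{:?}` formats the message as a quoted, escaped string literal.
fn compile_error_source(message: &str) -> String {
    format!("compile_error!({:?});", message)
}
```

In the else branch of impl_day, something like stream.extend(compile_error_source(&str).parse::<TokenStream>().unwrap()); should then turn the malformed file name into a real compilation error instead of a printed message. If syn is already a dependency, syn::Error::new(span, message).to_compile_error() is an alternative that produces such tokens directly.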
I'm trying to parse a file with syn, and add a line to the single function in it. However, it seems to not modify the file at all when writing it back out. I'm fairly sure that I don't understand fully proc-macro and am using it wrong.
In my Cargo.toml I define a lib and bin like so:
[lib]
name = "gen"
path = "src/gen.rs"
proc-macro = true
[[bin]]
name = "main"
path = "src/main.rs"
In my gen.rs file, I define a macro to take in the input, get the function and modify it like so:
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro]
pub fn gen(input: TokenStream) -> TokenStream {
let item = syn::parse(input.clone());
match item {
Ok(mut v) => {
let fn_item = match &mut v {
syn::Item::Fn(fn_item) => fn_item,
_ => panic!("expected fn"),
};
fn_item.block.stmts.insert(
0,
syn::parse(quote!(println!("count me in");).into()).unwrap(),
);
use quote::ToTokens;
return v.into_token_stream().into();
}
Err(error) => {
println!("{:?}", error);
return input;
}
};
}
Now in my main.rs file, I read the file, convert it to a TokenStream, and use my macro on it and write out the output to a file:
fn main() {
if let Err(error) = try_main() {
let _ = writeln!(io::stderr(), "{}", error);
process::exit(1);
}
}
fn try_main() -> Result<(), Error> {
let mut args = env::args_os();
let _ = args.next(); // executable name
let filepath = PathBuf::from("./src/file-to-parse.rs");
let code = fs::read_to_string(&filepath).map_err(Error::ReadFile)?;
let syntax = syn::parse_file(&code).map_err({
|error| Error::ParseFile {
error,
filepath,
source_code: code,
}
})?;
let mut token_stream = TokenStream::new();
syntax.to_tokens(&mut token_stream);
let file_contents_updated = gen::gen!(&token_stream);
std::fs::write("./src/file-updated.rs", file_contents_updated.to_string());
Ok(())
}
Running this, my output file looks the same as the input. For reference, my input file looks like:
fn init() {
println!("Hello, world!");
}
Yes, you've misunderstood what proc macros do.
gen::gen!(&token_stream) invokes gen!() at compile time with the literal tokens & token_stream, not with the contents of the variable. Since that doesn't look like a function, syn fails to parse it, and your code runs println!("{:?}", error); return input;. So the macro returns its input unchanged, which is why the output is the same as the input. (Returning the input on a parse failure is, by the way, a bad idea for a proc macro: a parsing failure should abort compilation. Use return error.into_compile_error().into() instead.)
You can use syn and quote for general-purpose code generation, but you should not use proc macros for that; rather, use them as ordinary libraries. That is, call gen::gen(token_stream) as a function instead of invoking gen::gen!(&token_stream). You can then also drop the #[proc_macro] attribute and put the function in the same crate as main. Note that the types from the proc_macro crate only work inside a macro expansion, so such a function has to take and return proc_macro2::TokenStream.
The following code reads from my server successfully, but I cannot seem to get the correct syntax or semantics to write back to the server when a particular line command is recognized. Do I need to create a FramedWriter? Most examples I have found split the socket, but that seems like overkill; I expect the codec to be able to handle the bidirectional I/O by providing some async write method.
# Cargo.toml
[dependencies]
tokio = { version = "0.3", features = ["full"] }
tokio-util = { version = "0.4", features = ["codec"] }
//! main.rs
use tokio::net::{ TcpStream };
use tokio_util::codec::{ Framed, LinesCodec };
use tokio::stream::StreamExt;
use std::error::Error;
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let saddr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), 8081);
let conn = TcpStream::connect(saddr).await?;
let mut server = Framed::new(conn, LinesCodec::new_with_max_length(1024));
while let Some(Ok(line)) = server.next().await {
match line.as_str() {
"READY" => println!("Want to write a line to the stream"),
_ => println!("{}", line),
}
}
Ok({})
}
According to the documentation, Framed implements Stream and Sink traits. Sink defines only the bare minimum of low-level sending methods. To get the high-level awaitable methods like send() and send_all(), you need to use the SinkExt extension trait.
For example:
use futures::sink::SinkExt;
// ...
while let Some(Ok(line)) = server.next().await {
match line.as_str() {
"READY" => server.send("foo").await?,
_ => println!("{}", line),
}
}
My simple Rust program uses the jsonpath crate to look up some values in a JSON document.
The code that I am using is the following
use serde::Deserialize;
extern crate jsonpath;
extern crate serde_json;
use jsonpath::Selector;
use serde_json::Value;
use std::any::{Any, TypeId};
fn main() {
let jsondoc = r#"
{
"a": 10,
"b": "a string",
"c" : false,
"point" : {
"x" : 1,
"y": 2
}
}
"#;
let json: Value = serde_json::from_str(jsondoc).unwrap(); // Parse JSON document
let selector1 = Selector::new("$.a").unwrap(); // Create a JSONPath selector
let result1: Vec<f64> = selector1.find(&json)
.map(|t| t.as_f64().unwrap())
.collect();
println!("{:?}", result1);
let selector2 = Selector::new("$.b").unwrap(); // Create a JSONPath selector
let result2: Vec<&str> = selector2.find(&json)
.map(|t| t.as_str().unwrap())
.collect();
println!("{:?}", result2);
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().unwrap())
.collect();
println!("{:?}", result3);
}
A concern is that if I change the json value "a" from 10 to "ten" (different data type) the code crashes in this statement: map(|t| t.as_f64().unwrap()) (cannot unwrap)
How do I protect the code in order to avoid panics?
As mentioned in comments, the problem here is that unwrap panics on errors. The solution is to propagate your result around your program, bubbling it up until you want to handle it.
There's one complication, though: as_*() methods don't return a Result, but rather an Option. If we want to treat it as an error case and propagate it around the program, turning it into a Result will make that much nicer.
Here's a simple way to do that using ok_or:
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or("expected boolean, found something else").unwrap())
.collect();
println!("{:?}", result3);
If you want to make this error message better and/or include more information in it, I think making a custom error enum is probably the best solution. There's a lot of detail on that in this answer to a related question about error handling, but in short, if you create your own error type containing more information, you can easily make the errors much more readable:
use snafu::Snafu; // 0.6.8
#[derive(Debug, Snafu)]
enum MyError {
#[snafu(display("expected {}, found {}", expected, actual))]
WrongValueType {
expected: &'static str,
actual: serde_json::Value,
}
}
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or_else(|| MyError::WrongValueType { expected: "boolean", actual: t.clone() }).unwrap())
.collect();
println!("{:?}", result3);
This uses the snafu crate, but there are other options (or it can be done manually). Again, see the above linked answer for more information.
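For reference, doing it manually might look roughly like the sketch below, which uses only std. It assumes the same variant as above, except that actual is stored as a String, a stand-in for serde_json::Value chosen so that the snippet has no dependencies.

```rust
use std::fmt;

// Hand-written equivalent of a small error enum, using only std.
// `actual` is kept as a String here as a stand-in for serde_json::Value.
#[derive(Debug)]
enum MyError {
    WrongValueType { expected: &'static str, actual: String },
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::WrongValueType { expected, actual } => {
                write!(f, "expected {}, found {}", expected, actual)
            }
        }
    }
}

// Implementing Error lets the type convert into Box<dyn Error> via `?`.
impl std::error::Error for MyError {}
```

The Display impl plays the role of the snafu display attribute, and the empty Error impl is enough because both Debug and Display are already provided.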
Alright - we have a Result. Now, we need to propagate it up. This adds some boilerplate, but fortunately, it shouldn't have to change the style all that much. In particular, Iterator::collect can be used to collect into anything implementing FromIterator, and Result implements FromIterator.
Replacing each map with something like this will iterate until either all results are found Ok(), or one is an Err:
let selector3 = Selector::new("$.c").unwrap(); // Create a JSONPath selector
let result3: Vec<bool> = selector3.find(&json)
.map(|t| t.as_bool().ok_or("expected boolean, found something else"))
// &'static str is our error type
.collect::<Result<Vec<bool>, &'static str>>()
.unwrap();
println!("{:?}", result3);
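The short-circuiting behaviour of collecting into a Result can be seen in isolation with a small std-only sketch. The function parse_all and its string inputs are made up for illustration and are unrelated to jsonpath.

```rust
// Parse every input, stopping at the first one that fails.
fn parse_all(inputs: &[&str]) -> Result<Vec<i64>, String> {
    inputs
        .iter()
        // Attach the offending input to the error message.
        .map(|s| s.parse::<i64>().map_err(|e| format!("{:?}: {}", s, e)))
        .collect()
}
```

If every element parses, the result is Ok with the whole vector; otherwise the first Err is returned and the remaining elements are not examined.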
We're still unwrapping, so it'll still panic, but that brings the error one step up. To deal with it at the top level, you'll need to stick this into a function returning a Result itself, and then match on that result to deal with the error case. It's a common pattern to do this by turning our main function into just error handling, and then to make our "real main" a separate function returning a Result. Here's an application of that on your snippet of code:
// `Box<dyn std::error::Error>` encapsulates "any error" without giving
// access to the details
fn try_main() -> Result<(), Box<dyn std::error::Error>> {
let jsondoc = r#"
{
"a": 10,
"b": "a string",
"c" : false,
"point" : {
"x" : 1,
"y": 2
}
}
"#;
let json: Value = serde_json::from_str(jsondoc)?; // Parse JSON document
let selector1 = Selector::new("$.a").unwrap(); // Create a JSONPath selector
let result1: Vec<f64> = selector1.find(&json)
.map(|t| t.as_f64().ok_or("expected number, found something else"))
.collect::<Result<_, _>>()?;
println!("{:?}", result1);
// ... the remaining selectors can be translated the same way.
Ok(())
}
fn main() {
match try_main() {
Ok(()) => {}
Err(e) => {
eprintln!("Error occurred: {}", e);
std::process::exit(1);
}
}
}
I'm having the same problem as "Is there any straightforward way for Clap to display help when no command is provided?", but the solution proposed in that question is not good enough for me.
.setting(AppSettings::ArgRequiredElseHelp) stops the program if no arguments are provided, and I need the program to carry on execution even if no arguments are provided. I need the help to be displayed in addition.
You could write the help string to a buffer before parsing the arguments, and print it afterwards.
use clap::{App, SubCommand};
use std::str;
fn main() {
let mut app = App::new("myapp")
.version("0.0.1")
.about("My first CLI APP")
.subcommand(SubCommand::with_name("ls").about("List anything"));
let mut help = Vec::new();
app.write_long_help(&mut help).unwrap();
let _ = app.get_matches();
println!("{}", str::from_utf8(&help).unwrap());
}
Or you could use get_matches_safe:
use clap::{App, AppSettings, ErrorKind, SubCommand};
fn main() {
let app = App::new("myapp")
.setting(AppSettings::ArgRequiredElseHelp)
.version("0.0.1")
.about("My first CLI APP")
.subcommand(SubCommand::with_name("ls").about("List anything"));
let matches = app.get_matches_safe();
match matches {
Err(e) => {
if e.kind == ErrorKind::MissingArgumentOrSubcommand {
println!("{}", e.message)
}
}
_ => (),
}
}