How to write a custom derive macro? - rust

I'm trying to write my own derive mode macro in Rust, and the documentation on it is somewhat lacking in examples.
I have a struct like:
#[derive(MyMacroHere)]
struct Example {
id: i64,
value: Option<String>,
}
I want my macro to generate a method à la
fn set_fields(&mut self, id: i64, value: Option<String>) {
// ...
}
What are the basic steps to use the TokenStream type to achieve something like that?

Create a crate for your procedural macros:
cargo new my_derive --lib
Edit the Cargo.toml to make it a procedural macro crate:
[lib]
proc-macro = true
Implement your procedural macro:
extern crate proc_macro;
use proc_macro::TokenStream;
#[proc_macro_derive(MyMacroHere)]
pub fn my_macro_here_derive(input: TokenStream) -> TokenStream {
// ...
}
Import the procedural macro and use it:
extern crate my_derive;
use my_derive::MyMacroHere;
#[derive(MyMacroHere)]
struct Example {
id: i64,
value: Option<String>,
}
The hard part is the implementation of the macro. Most people use the syn and quote crates to parse the incoming Rust code and then generate new code.
For example, syn's documentation starts with an example of a custom derive. You will parse the struct (or enum or union) and then handle the various ways of defining a struct (unit, tuple, named fields). You'll collect the information you need (type, maybe name), then you'll generate the appropriate code.
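Before writing any macro code, it can help to write out by hand what the derive should produce, since that is roughly what the quote! template will end up containing. A minimal sketch for the struct in the question, assuming the method simply assigns each argument to the field of the same name (the question leaves the body unspecified):

```rust
struct Example {
    id: i64,
    value: Option<String>,
}

// Hand-written stand-in for what #[derive(MyMacroHere)] could emit.
// Assumption: each parameter overwrites the field with the same name.
impl Example {
    fn set_fields(&mut self, id: i64, value: Option<String>) {
        self.id = id;
        self.value = value;
    }
}

fn main() {
    let mut e = Example { id: 0, value: None };
    e.set_fields(7, Some("seven".to_string()));
    println!("{} {:?}", e.id, e.value);
}
```

In the real macro, the field names and types would come from the struct parsed by syn, and an impl block of this shape would be generated with those names interpolated.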
See also:
How to Write a Custom derive Macro
Documentation for proc_macro
Documentation for syn
Documentation for quote
Is it possible to add your own derivable traits, or are these fixed by the compiler?

Related

Cargo: How to include the entire directory or file in feature flags?

I'm working on a Rust project and using Cargo feature flags for conditional compilation of some code. In some cases I have to put an entire file behind a feature flag, so adding #[cfg(feature="my-flag")] over every function and use statement doesn't make much sense.
To put the entire file behind the feature flag, I'm thinking of surrounding all the contents of the file in a block and adding the feature flag to the block.
#[cfg(feature="my-flag")]
{
use crate::access_control::{func1, func2};
use crate::models;
...
#[derive(Debug)]
pub enum MyEnum{..}
#[derive(Clone)]
pub struct MyStruct{..}
pub fn my_func() {...}
fn my_func_internal() {...}
...
}
But I'm getting the error Syntax Error: expected an item after attributes
There are also some cases where I want an entire directory to be behind a feature flag. How should I go about it? Adding feature flags to every file is one way. Does a better way exist?
As in @MarcusDunn's answer, the proper solution is to apply the attribute to the mod declaration:
// This conditionally includes a module which implements WEBP support.
#[cfg(feature = "webp")]
pub mod webp;
However for the sake of completeness, I would point out that attributes can be applied to the item they're in instead of being applied to the following item. These are called "inner attributes" and are specified by adding a ! after the #:
#![cfg(feature="my-flag")] // Applies to the whole file
use crate::access_control::{func1, func2};
use crate::models;
#[derive(Debug)]
pub enum MyEnum {}
#[derive(Clone)]
pub struct MyStruct {}
pub fn my_func() {}
fn my_func_internal() {}
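The attempt in the question fails because #[cfg] has to be attached to an item, and a bare { ... } block at file level is not one. An inline mod is an item, so it can carry the attribute for everything inside it. A small sketch, assuming a fallback is wanted when the feature is off (the gated module name and the returned strings are made up for illustration):

```rust
// Compiled only when the crate is built with `--features my-flag`.
#[cfg(feature = "my-flag")]
mod gated {
    pub fn my_func() -> &'static str {
        "my-flag enabled"
    }
}

// Fallback with the same public surface, compiled when the feature is off.
#[cfg(not(feature = "my-flag"))]
mod gated {
    pub fn my_func() -> &'static str {
        "my-flag disabled"
    }
}

fn main() {
    println!("{}", gated::my_func());
}
```

Callers use gated::my_func() in both configurations, so the cfg attributes stay in one place instead of being repeated on every item.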
From https://doc.rust-lang.org/cargo/reference/features.html
// This conditionally includes a module which implements WEBP support.
#[cfg(feature = "webp")]
pub mod webp;
This could be an entire directory or a single file, depending on how you structure your modules.
You can use conditional compilation on modules. Perhaps something like this would work for your use case:
#[cfg(feature = "feat")]
use feat::S;
#[cfg(not(feature = "feat"))]
use no_feat::S;
mod feat {
pub const S: &str = "feat";
}
mod no_feat {
pub const S: &str = "no_feat";
}
fn main() {
println!("{}", S);
}
Running with cargo run:
no_feat
Running with cargo run --features feat:
feat
You can use the cfg-if crate:
cfg_if::cfg_if! {
if #[cfg(feature = "my-flag")] {
// ...
}
}

Strange struct definition from crate nix

I just encountered this weird struct definition in fcntl.rs from the nix crate.
pub struct OFlag: c_int {
/// Mask for the access mode of the file.
O_ACCMODE;
// other fields are omitted.
}
A normal struct in my perspective will be something like this:
struct Person{
name: String,
age: u8,
}
So, here are my doubts:
what is OFlag: c_int?
c_int is a type alias of i32: pub type c_int = i32;
Why don't its fields have any type annotation?
My surmise is that OFlag is of type c_int, and the fields are something similar to an enum's variants (compliant with the open syscall's function signature int open(const char *pathname, int flags, mode_t mode)). But this is just guesswork; an explanation citing the official Rust documentation would be appreciated.
The code you quoted is not valid Rust code on its own. It's code that gets passed to an internal macro of the nix crate called libc_bitflags!(). This macro takes the quoted code as input and transforms it into valid Rust code.
The libc_bitflags!() macro is a simple wrapper around the bitflags!() macro from the bitflags crate. The wrapper simplifies creating bitflags structs that take all their values from constants defined in the libc crate. For example this invocation
libc_bitflags!{
pub struct ProtFlags: libc::c_int {
PROT_NONE;
PROT_READ;
PROT_WRITE;
PROT_EXEC;
}
}
gets expanded to
bitflags!{
pub struct ProtFlags: libc::c_int {
const PROT_NONE = libc::PROT_NONE;
const PROT_READ = libc::PROT_READ;
const PROT_WRITE = libc::PROT_WRITE;
const PROT_EXEC = libc::PROT_EXEC;
}
}
which in turn will be expanded to Rust code by the bitflags!() macro. The libc::c_int type is used as the type of the bits field of the resulting struct. The constants inside will become associated constants of the resulting struct. See the documentation of the bitflags crate for further details.
If you look at the file, you can see that the code is inside the macro libc_bitflags!. The definition of the macro is here. There you can see that the macro ::bitflags::bitflags! is called and that libc_bitflags! forwards almost the full input to bitflags!. You can read more about that crate here.
Now to your questions:
After macro expansion, OFlag will be a struct with a single field of type c_int:
pub struct OFlag {
bits: c_int,
}
The fields don't need a type because they won't exist anymore in the expanded code (after the macro was run). The bits are of the "type of the struct" so in your case c_int. The fields will be converted to associated constants:
impl OFlag {
    // An associated constant needs an explicit type annotation.
    pub const O_ACCMODE: Self = Self { bits: libc::O_ACCMODE };
}
You can create an expansion of an example in the playground (Tools -> Expand macros)
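To make the shape of the expanded code more concrete, here is a hand-written, std-only sketch of a bitflags-style struct. It is not the actual output of bitflags!, which generates considerably more and varies between bitflags versions. The constant values are illustrative placeholders rather than the real libc values, and the contains method and BitOr impl only approximate what the crate provides:

```rust
use std::ops::BitOr;

// Illustrative values only; the real ones come from the libc crate
// and can differ between platforms.
const PROT_READ: i32 = 0x1;
const PROT_WRITE: i32 = 0x2;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProtFlags {
    bits: i32,
}

impl ProtFlags {
    // Each name listed in the macro input becomes an associated constant.
    pub const PROT_READ: Self = Self { bits: PROT_READ };
    pub const PROT_WRITE: Self = Self { bits: PROT_WRITE };

    // Raw integer value, suitable for passing to a C API.
    pub fn bits(&self) -> i32 {
        self.bits
    }

    // True if every bit set in `other` is also set in `self`.
    pub fn contains(&self, other: Self) -> bool {
        (self.bits & other.bits) == other.bits
    }
}

// Lets flags be combined with `|`, as with plain C integer flags.
impl BitOr for ProtFlags {
    type Output = Self;

    fn bitor(self, rhs: Self) -> Self {
        Self { bits: self.bits | rhs.bits }
    }
}

fn main() {
    let rw = ProtFlags::PROT_READ | ProtFlags::PROT_WRITE;
    println!("bits = {}, readable = {}", rw.bits(), rw.contains(ProtFlags::PROT_READ));
}
```

The point of the wrapper type is that a ProtFlags value can only be built from the named constants and their combinations, while bits() still yields the plain integer the system call expects.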

How can I find where a symbol used in `derive` is defined?

I come from a Java background, in which finding a symbol definition location is straightforward: it's either in the same package or it comes via the imports.
Here Deserialize is defined in the serde crate, yet the source file contains no references to the serde crate:
#[derive(Deserialize)]
struct Args {
arg_spec: Vec<String>,
flag_short: bool,
flag_porcelain: bool,
flag_branch: bool,
flag_z: bool,
flag_ignored: bool,
flag_untracked_files: Option<String>,
flag_ignore_submodules: Option<String>,
flag_git_dir: Option<String>,
flag_repeat: bool,
flag_list_submodules: bool,
}
(source)
Notice the declaration:
#[macro_use]
extern crate serde_derive;
(source)
This brings all the macros defined in the serde_derive crate into scope. One of those is the Deserialize macro which helps implement the Deserialize trait.
This was the old way of doing things. In the 2018 edition, the preferred way is to import macros with the more familiar use statements.

How can I fix cannot find derive macro `Deserialize` in this scope? [duplicate]

I have this:
#[derive(FromPrimitive)]
pub enum MyEnum {
Var1 = 1,
Var2
}
And an error:
error: cannot find derive macro `FromPrimitive` in this scope
|
38 | #[derive(FromPrimitive)]
| ^^^^^^^^^^^^^
Why do I get this? How do I fix it?
The compiler has a small set of built-in derive macros. For any others, you have to import the custom derives before they can be used.
Before Rust 1.30, you need to use #[macro_use] on the extern crate line of the crate providing the macros. With Rust 1.30 and up, you can import them with a use statement instead.
In this case, you need to import FromPrimitive from the num_derive crate:
After Rust 1.30
use num_derive::FromPrimitive; // 0.2.4 (the derive)
use num_traits::FromPrimitive; // 0.2.6 (the trait)
Before Rust 1.30
#[macro_use]
extern crate num_derive; // 0.2.4
extern crate num_traits; // 0.2.6
use num_traits::FromPrimitive;
Usage
#[derive(Debug, FromPrimitive)]
pub enum MyEnum {
Var1 = 1,
Var2,
}
fn main() {
println!("{:?}", MyEnum::from_u8(2));
}
Each project has its own crate containing its derive macros. A small sample:
Num (e.g. FromPrimitive) => num_derive
Serde (e.g. Serialize, Deserialize) => serde_derive
Diesel (e.g. Insertable, Queryable) => diesel (it's actually the same as the regular crate!)
Some crates re-export their derive macros. For example, you can use the derive feature of Serde and then import it from the serde crate directly:
[dependencies]
serde = { version = "1.0", features = ["derive"] }
use serde::{Serialize, Deserialize}; // imports both the trait and the derive macro
FromPrimitive was actually part of the standard library before Rust 1.0. It wasn't useful enough to continue existing in the standard library, so it was moved to the external num crate. Some very old references might not have been updated for this change.
For more information about converting C-like enums to and from integers, see:
How do I match enum values with an integer?
How do I get the integer value of an enum?
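For comparison, the same conversion can be written by hand with only the standard library, which is roughly the boilerplate a derive like FromPrimitive saves. This is a sketch using TryFrom instead of the num_traits trait; choosing u8 as the error type is an arbitrary decision made for this example:

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub enum MyEnum {
    Var1 = 1,
    Var2, // implicit discriminant: 2
}

impl TryFrom<u8> for MyEnum {
    // On failure, hand back the value that did not match any variant.
    type Error = u8;

    fn try_from(n: u8) -> Result<Self, Self::Error> {
        match n {
            1 => Ok(MyEnum::Var1),
            2 => Ok(MyEnum::Var2),
            other => Err(other),
        }
    }
}

fn main() {
    println!("{:?}", MyEnum::try_from(2u8)); // Ok(Var2)
    println!("{:?}", MyEnum::try_from(9u8)); // Err(9)
}
```

The match arms have to be kept in sync with the enum by hand, which is the main argument for using a derive once the enum grows.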

How to introspect all available methods and members of a Rust type?

Is there a way to print out a complete list of available members of a type or instance in Rust?
For example:
In Python, I can use print(dir(object))
In C, Clang has a Python API that can parse C code and introspect it.
Being unfamiliar with Rust tools, I'm interested to know if there is some way to do this, either at run-time or compile-time, either with compiler features (macros for example), or using external tools.
This question is intentionally broad because the exact method isn't important. It is common in any language to want to find all of a variable's methods/functions. Not knowing Rust well, I'm not limiting the question to specific methods for discovery.
The reason I don't define the exact method is that I assume IDEs will need this information, so there will need to be some kinds of introspection available to support this (eventually). For all I know, Rust has something similar.
I don't think this is a duplicate of Get fields of a struct type in a macro since this answer could include use of external tools (not necessarily macros).
Is there a way to print out a complete list of available members of a type or instance in Rust?
Currently, there is no built-in API for getting the fields at runtime. However, you can retrieve the fields in two different ways:
Declarative Macros
Procedural Macros
Solution By Using Declarative Macro
macro_rules! generate_struct {
($name:ident {$($field_name:ident : $field_type:ty),+}) => {
struct $name { $($field_name: $field_type),+ }
impl $name {
fn introspect() {
$(
let field_name = stringify!($field_name);
let field_type = stringify!($field_type);
println!("Field Name: {:?} , Field Type: {:?}",field_name,field_type);
)*
}
}
};
}
generate_struct! { MyStruct { num: i32, s: String } }
fn main() {
MyStruct::introspect();
}
This will give you the output:
Field Name: "num" , Field Type: "i32"
Field Name: "s" , Field Type: "String"
Playground
Solution Using Procedural Macro
Since procedural macros are more complicated than declarative macros, you had better read some references (ref1, ref2, ref3) before starting.
We are going to write a custom derive named "Introspect". To create this custom derive, we need to parse the incoming TokenStream into a struct definition with the help of the syn crate.
#[proc_macro_derive(Introspect)]
pub fn derive_introspect(input: TokenStream) -> TokenStream {
let input = parse_macro_input!(input as ItemStruct);
// ...
}
Since our input can be parsed as an ItemStruct, and ItemStruct has a fields field, we can use it to get the fields of our struct.
Once we have these fields, we can parse them as named fields and print their names and types accordingly.
input
.fields
.iter()
.for_each(|field| match field.parse_named() {
Ok(field) => println!("{:?}", field),
Err(_) => println!("Field can not be parsed successfully"),
});
If you want to attach this behavior to your custom derive you can use the following with the help of the quote crate:
let name = &input.ident;
let output = quote! {
impl #name {
pub fn introspect(){
input
.fields
.iter()
.for_each(|field| match field.parse_named() {
Ok(field) => println!("{:?}", field),
Err(_) => println!("Field can not be parsed successfully"),
});
}
}
};
// Return output TokenStream so your custom derive behavior will be attached.
TokenStream::from(output)
Since the behaviour is injected into your struct as an introspect function, you can call it in your application like this:
#[derive(Introspect)]
struct MyStruct {
num: i32,
text: String
}
MyStruct::introspect();
Note: The example you are looking for is similar to this question, so this Proc Macro Answer and Declarative Macro Answer should give you insight as well.
To expand on my comment, you can use rustdoc, the Rust documentation generator, to view almost everything you're asking for (at compile time). rustdoc will show:
Structs (including public members and their impl blocks)
Enums
Traits
Functions
Any documentation comments written by the crate author with /// or //!.
rustdoc also automatically links to the source of each file in the [src] link.
Here is an example of the output of rustdoc.
Standard Library
The standard library API reference is available here and is available for anything in the std namespace.
Crates
You can get documentation for any crate available on crates.io on docs.rs. This automatically generates documentation for each crate every time it is released on crates.io.
Your Project
You can generate documentation for your project with Cargo, like so:
cargo doc
This will also automatically generate documentation for your dependencies (but not the standard library).
I have written a very simple crate which uses a procedural macro. It gives you access to member information plus some simple information about the struct or enum you use. Information about methods cannot be given, because procedural macros simply can't get this information, and as far as I know there is no other mechanism that provides it.
I don't think there is anything that will do this out of the box.
It may be possible to write a compiler plugin which can do that by examining the AST.
If you need the field names inside your program then you probably need to use macros. Either wrap your struct definition in a macro and pattern match to create a function that returns the names, or use a procedural macro to derive a trait with such functions for your structs.
See examples in syn for derived traits. In particular, see syn::Data::Struct which has fields.
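The first option above, wrapping the struct definition in a macro, could look roughly like this. It is a sketch that returns the names instead of printing them. The with_field_names macro and the field_names function are hypothetical names chosen for this example, and the pattern only handles plain named fields (no attributes, generics, or visibility modifiers):

```rust
// Wraps a struct definition and also generates a `field_names` function.
macro_rules! with_field_names {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[allow(dead_code)] // the fields are only declared here, never read
        struct $name {
            $($field: $ty),*
        }

        impl $name {
            // The names are turned into string literals at compile time.
            fn field_names() -> &'static [&'static str] {
                &[$(stringify!($field)),*]
            }
        }
    };
}

with_field_names! {
    struct MyStruct {
        num: i32,
        s: String,
    }
}

fn main() {
    println!("{:?}", MyStruct::field_names()); // ["num", "s"]
}
```

Returning a slice rather than printing leaves it to the caller to decide what to do with the names, for example building column lists or validating input keys.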
In response to @AlexandreMahdhaoui's question: at least on recent Rust versions, the proc macro from the accepted answer will not work, because values computed in the macro have to be passed into quote! using "#". So you could try something like the following:
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, ItemStruct};

#[proc_macro_derive(Introspect)]
pub fn derive(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as ItemStruct);
    let ident = input.ident;
    // Build one description string per field while the macro runs.
    let field_data = input.fields.iter().map(|f| {
        let field_type = f.ty.clone();
        format!(
            "Name={}, Type={}",
            // unwrap() panics for tuple structs, whose fields have no ident.
            f.ident.clone().unwrap().to_string(),
            quote!(#field_type).to_string()
        )
    }).collect::<Vec<_>>();
    // #ident and #(#field_data),* interpolate the values computed above.
    let output = quote! {
        impl #ident {
            pub fn introspect() {
                println!("{:#?}", vec![#(#field_data),*]);
            }
        }
    };
    TokenStream::from(output)
}
#[derive(Introspect)]
struct Test {
size: u8,
test_field: u8,
}
fn main() {
Test::introspect();
}
Regarding methods defined in impl blocks, I didn't find any information about them in the output, so I'm not sure whether it's possible. Perhaps somebody could share in the comments?
I use something like this:
println!("{:?}", variable); // struct, enum whatever
If it's a large type, use the # version:
println!("{:#?}", variable); // struct, enum whatever
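Note that this only works for types that implement Debug, and it shows the field values of one instance rather than the type's methods. A minimal sketch with a made-up Point struct:

```rust
// Debug formatting requires the type to implement `Debug`,
// which is usually done with a derive.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?}", p);  // compact, single line
    println!("{:#?}", p); // pretty-printed, one field per line
}
```

The compact form prints Point { x: 1, y: 2 }, while the # form puts each field on its own indented line.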

Resources