Related
Ultimately, I'd like to be able to wrap a function in a proc_macro like this:
native_function! {
fn sum(x:i32, y:i32) -> i32 {
x+y
}
}
I'm trying to find what I need to put in this function:
#[proc_macro]
pub fn native_function(item: TokenStream) -> TokenStream {
...
}
Inside the function above, I need to be able to capture the types of the parameters defined in the function; in this case - "i32"
I've tried all sorts of ways to do this, but currently I'm stumped. I'm trying to avoid using a derive macro since I don't want to have to complete and instantiate a struct. I don't even need a full solution, but if someone can point me to the function/object/library I need to be using to that would be great.
syn is the cannonical parsing library for proc macros.
Using it this is easy:
let f = syn::parse_macro_input!(item as syn::ItemFn);
let types: Vec<syn::Type> = f
.sig
.inputs
.into_iter()
.filter_map(|arg| match arg {
syn::FnArg::Receiver(_) => None,
syn::FnArg::Typed(syn::PatType { ty, .. }) => Some(*ty),
})
.collect();
By the way, your macro can be an attribute macro.
If you want to know whethere the type is a some known type, note first that you can never be sure; that is because macros operate without type information, and code like struct i32; is legal and will shadow the primitive i32 type.
If you've accepted this limitation, what you actually want is to compare it to a path type. Path types are represented by the Path variant of syn::Type. It has qself, which is the T in <T as Trait>::AssociatedType and None in a simple path (not fully-qualified, just A::B::C), and path, which is the Trait::AssociatedType or A::B::C. Paths are complex; they can contain generics, for example, but if all you want is to check if this is one-segment type of simple identifier like i32, syn has got you covered: just use Path::is_ident(). So:
let is_i32 = match ty {
syn::Type::Path(ty) => ty.path.is_ident("i32"),
_ => false,
}
If you want to compare against a more complex type, you will have to walk and match segment-by-segment.
The following compiles:
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
let mut result = (0..list.len() - 1)
.map(|i| format!("For want of a {} the {} was lost.", list[i], list[i + 1]))
.collect::<Vec<String>>();
result.push(format!("And all for the want of a {}.", list[0]));
result.join("\n")
}
The following does not (see Playground):
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
let mut result = (0..list.len() - 1)
.map(|i| format!("For want of a {} the {} was lost.", list[i], list[i + 1]))
.collect::<Vec<String>>()
.push(format!("And all for the want of a {}.", list[0]))
.join("\n");
result
}
The compiler tells me
error[E0599]: no method named `join` found for type `()` in the current scope
--> src/lib.rs:9:10
|
9 | .join("\n");
| ^^^^
I get the same type of error if I try to compose just with push.
What I would expect is that collect returns B, aka Vec<String>. Vec is not (), and Vec of course has the methods I want to include in the list of composed functions.
Why can't I compose these functions? The explanation might include describing the "magic" of terminating the expression after collect() to get the compiler to instantiate the Vec in a way that does not happen when I compose with push etc.
If you read the documentation for Vec::push and look at the signature of the method, you will learn that it does not return the Vec:
pub fn push(&mut self, value: T)
Since there is no explicit return type, the return type is the unit type (). There is no method called join on (). You will need to write your code in multiple lines.
See also:
What is the purpose of the unit type in Rust?
I'd write this more functionally:
use itertools::Itertools; // 0.8.0
pub fn build_proverb(list: &[&str]) -> String {
let last = list
.get(0)
.map(|d| format!("And all for the want of a {}.", d));
list.windows(2)
.map(|d| format!("For want of a {} the {} was lost.", d[0], d[1]))
.chain(last)
.join("\n")
}
fn main() {
println!("{}", build_proverb(&["nail", "shoe"]));
}
See also:
What's an idiomatic way to print an iterator separated by spaces in Rust?
Thank you to everyone for the useful interactions. Everything stated in the previous response is precisely correct. And, there is a bigger picture as I'm learning Rust.
Coming from Haskell (with C training years ago), I bumped into the OO method chaining approach that uses a pointer to chain between method calls; no need for pure functions (i.e., what I was doing with let mut result = ..., which was then used/required to change the value of the Vec using push in result.push(...)). What I believe is a more general observation is that, in OO, it is "aok" to return unit because method chaining does not require a return value.
The custom code below defines push as a trait; it uses the same inputs as the "OO" push, but returns the updated self. Perhaps only as a side comment, this makes the function pure (output depends on input) but in practice, means the push defined as a trait enables the FP composition of functions I had come to expect was a norm (fair enough I thought at first given how much Rust borrows from Haskell).
What I was trying to accomplish, and at the heart of the question, is captured by the code solution that #Stargateur, #E_net4 and #Shepmaster put forward. With only the smallest edits is as follows:
(see playground)
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
list.windows(2)
.map(|d| format!("For want of a {} the {} was lost.", d[0], d[1]))
.collect::<Vec<_>>()
.push(format!("And all for the want of a {}.", list[0]))
.join("\n")
}
The solution requires that I define push as a trait that return the self,
type Vec in this instance.
trait MyPush<T> {
fn push(self, x: T) -> Vec<T>;
}
impl<T> MyPush<T> for Vec<T> {
fn push(mut self, x: T) -> Vec<T> {
Vec::push(&mut self, x);
self
}
}
Final observation, in surveying many of the Rust traits, I could not find a trait function that returns () (modulo e.g., Write that returns Result ()).
This contrasts with what I learned here to expect with struct and enum methods. Both traits and the OO methods have access to self and thus have each been described as "methods", but there seems to be an inherent difference worth noting: OO methods use a reference to enable sequentially changing self, FP traits (if you will) uses function composition that relies on the use of "pure", state-changing functions to accomplish the same (:: (self, newValue) -> self).
Perhaps as an aside, where Haskell achieves referential transparency in this situation by creating a new copy (modulo behind the scenes optimizations), Rust seems to accomplish something similar in the custom trait code by managing ownership (transferred to the trait function, and handed back by returning self).
A final piece to the "composing functions" puzzle: For composition to work, the output of one function needs to have the type required for the input of the next function. join worked both when I was passing it a value, and when I was passing it a reference (true with types that implement IntoIterator). So join seems to have the capacity to work in both the method chaining and function composition styles of programming.
Is this distinction between OO methods that don't rely on a return value and traits generally true in Rust? It seems to be the case "here and there". Case in point, in contrast to push where the line is clear, join seems to be on its way to being part of the standard library defined as both a method for SliceConcatExt and a trait function for SliceConcatExt (see rust src and the rust-lang issue discussion). The next question, would unifying the approaches in the standard library be consistent with the Rust design philosophy? (pay only for what you use, safe, performant, expressive and a joy to use)
I want to write a macro_rules based macro that will be used to wrap a series of type aliases and struct definitions. I can match on "items" with $e:item, but this will match both aliases and structs. I would like to treat the two separately (I need to add some #[derive(...)] just on the structs). Do I have to imitate their syntax directly by matching on something like type $name:ident = $type:ty; or is there a better way? This route seems annoying for structs because of regular vs tuple like. If I also wanted to distinguish functions that would be really painful because they have a lot of syntactical variation.
I believe for that problem somewhat simple cases can be solved with macro_rules!, but that probably would be limited (you can't lookahead) and super error-prone. I only cover an example for types, I hope that would be convincing enough to avoid macro_rules!. Consider this simple macro:
macro_rules! traverse_types {
($(type $tp:ident = $alias:ident;)*) => {
$(type $tp = $alias;)*
}
}
traverse_types!(
type T = i32;
type Y = Result<i32, u64>;
);
That works fine for the trivial aliases. At some point you probably also would like to handle generic type aliases (i.e. in the form type R<Y> = ...). Ok, your might still rewrite the macro to the recursive form (and that already a non-trivial task) to handle all of cases. Then you figure out that generics can be complex (type-bounds, lifetime parameters, where-clause, default types, etc):
type W<A: Send + Sync> = Option<A>;
type X<A: Iterator<Item = usize>> where A: 'static = Option<A>;
type Y<'a, X, Y: for<'t> Trait<'t>> = Result<&'a X, Y>;
type Z<A, B = u64> = Result<A, B>;
Probably all of these cases still can be handled with a barely readable macro_rules!. Nevertheless it would be really hard to understand (even to the person who wrote it) what's going on. Besides, it is hard to support new syntax (e.g. impl-trait alias type T = impl K), you may even need to have a complete rewrite of the macro. And I only cover the type aliases part, there's more to handle for the structs.
My point is: one better avoid macro_rules! for that (and similar) problem(-s), procedural macros is a way much a better tool for that. It easier to read (and thus extend) and handles new syntax for free (if syn and quote crates are maintained). For the type alias this can be done as simple as:
extern crate proc_macro;
use proc_macro::TokenStream;
use syn::parse::{Parse, ParseStream};
struct TypeAliases {
aliases: Vec<syn::ItemType>,
}
impl Parse for TypeAliases {
fn parse(input: ParseStream) -> syn::Result<Self> {
let mut aliases = vec![];
while !input.is_empty() {
aliases.push(input.parse()?);
}
Ok(Self { aliases })
}
}
#[proc_macro]
pub fn traverse_types(token: TokenStream) -> TokenStream {
let input = syn::parse_macro_input!(token as TypeAliases);
// do smth with input here
// You may remove this binding by implementing a `Deref` or `quote::ToTokens` for `TypeAliases`
let aliases = input.aliases;
let gen = quote::quote! {
#(#aliases)*
};
TokenStream::from(gen)
}
For the struct parsing code is the same using ItemStruct type and also you need a lookahead to determine wether it's an type-alias or a struct, there's very similar example at syn for that.
The following compiles:
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
let mut result = (0..list.len() - 1)
.map(|i| format!("For want of a {} the {} was lost.", list[i], list[i + 1]))
.collect::<Vec<String>>();
result.push(format!("And all for the want of a {}.", list[0]));
result.join("\n")
}
The following does not (see Playground):
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
let mut result = (0..list.len() - 1)
.map(|i| format!("For want of a {} the {} was lost.", list[i], list[i + 1]))
.collect::<Vec<String>>()
.push(format!("And all for the want of a {}.", list[0]))
.join("\n");
result
}
The compiler tells me
error[E0599]: no method named `join` found for type `()` in the current scope
--> src/lib.rs:9:10
|
9 | .join("\n");
| ^^^^
I get the same type of error if I try to compose just with push.
What I would expect is that collect returns B, aka Vec<String>. Vec is not (), and Vec of course has the methods I want to include in the list of composed functions.
Why can't I compose these functions? The explanation might include describing the "magic" of terminating the expression after collect() to get the compiler to instantiate the Vec in a way that does not happen when I compose with push etc.
If you read the documentation for Vec::push and look at the signature of the method, you will learn that it does not return the Vec:
pub fn push(&mut self, value: T)
Since there is no explicit return type, the return type is the unit type (). There is no method called join on (). You will need to write your code in multiple lines.
See also:
What is the purpose of the unit type in Rust?
I'd write this more functionally:
use itertools::Itertools; // 0.8.0
pub fn build_proverb(list: &[&str]) -> String {
let last = list
.get(0)
.map(|d| format!("And all for the want of a {}.", d));
list.windows(2)
.map(|d| format!("For want of a {} the {} was lost.", d[0], d[1]))
.chain(last)
.join("\n")
}
fn main() {
println!("{}", build_proverb(&["nail", "shoe"]));
}
See also:
What's an idiomatic way to print an iterator separated by spaces in Rust?
Thank you to everyone for the useful interactions. Everything stated in the previous response is precisely correct. And, there is a bigger picture as I'm learning Rust.
Coming from Haskell (with C training years ago), I bumped into the OO method chaining approach that uses a pointer to chain between method calls; no need for pure functions (i.e., what I was doing with let mut result = ..., which was then used/required to change the value of the Vec using push in result.push(...)). What I believe is a more general observation is that, in OO, it is "aok" to return unit because method chaining does not require a return value.
The custom code below defines push as a trait; it uses the same inputs as the "OO" push, but returns the updated self. Perhaps only as a side comment, this makes the function pure (output depends on input) but in practice, means the push defined as a trait enables the FP composition of functions I had come to expect was a norm (fair enough I thought at first given how much Rust borrows from Haskell).
What I was trying to accomplish, and at the heart of the question, is captured by the code solution that #Stargateur, #E_net4 and #Shepmaster put forward. With only the smallest edits is as follows:
(see playground)
pub fn build_proverb(list: &[&str]) -> String {
if list.is_empty() {
return String::new();
}
list.windows(2)
.map(|d| format!("For want of a {} the {} was lost.", d[0], d[1]))
.collect::<Vec<_>>()
.push(format!("And all for the want of a {}.", list[0]))
.join("\n")
}
The solution requires that I define push as a trait that return the self,
type Vec in this instance.
trait MyPush<T> {
fn push(self, x: T) -> Vec<T>;
}
impl<T> MyPush<T> for Vec<T> {
fn push(mut self, x: T) -> Vec<T> {
Vec::push(&mut self, x);
self
}
}
Final observation, in surveying many of the Rust traits, I could not find a trait function that returns () (modulo e.g., Write that returns Result ()).
This contrasts with what I learned here to expect with struct and enum methods. Both traits and the OO methods have access to self and thus have each been described as "methods", but there seems to be an inherent difference worth noting: OO methods use a reference to enable sequentially changing self, FP traits (if you will) uses function composition that relies on the use of "pure", state-changing functions to accomplish the same (:: (self, newValue) -> self).
Perhaps as an aside, where Haskell achieves referential transparency in this situation by creating a new copy (modulo behind the scenes optimizations), Rust seems to accomplish something similar in the custom trait code by managing ownership (transferred to the trait function, and handed back by returning self).
A final piece to the "composing functions" puzzle: For composition to work, the output of one function needs to have the type required for the input of the next function. join worked both when I was passing it a value, and when I was passing it a reference (true with types that implement IntoIterator). So join seems to have the capacity to work in both the method chaining and function composition styles of programming.
Is this distinction between OO methods that don't rely on a return value and traits generally true in Rust? It seems to be the case "here and there". Case in point, in contrast to push where the line is clear, join seems to be on its way to being part of the standard library defined as both a method for SliceConcatExt and a trait function for SliceConcatExt (see rust src and the rust-lang issue discussion). The next question, would unifying the approaches in the standard library be consistent with the Rust design philosophy? (pay only for what you use, safe, performant, expressive and a joy to use)
As a project for learning rust, I am writing a program which can parse sgf files (a format for storing go games, and technically also other games). Currently the program is supposed to parse strings of the type (this is just an exampel) ";B[ab]B[cd]W[ef]B[gh]" into [Black((0,1)),Black((2,3,)),White((4,5)),Black((6,7))]
For this I am using the parser-combinators library.
I have run into the following error:
main.rs:44:15: 44:39 error: can't infer the "kind" of the closure; explicitly annotate it; e.g. `|&:| {}` [E0187]
main.rs:44 pmove().map(|m| {Property::White(m)})
^~~~~~~~~~~~~~~~~~~~~~~~
main.rs:44:15: 44:39 error: mismatched types:
expected `closure[main.rs:39:15: 39:39]`,
found `closure[main.rs:44:15: 44:39]`
(expected closure,
found a different closure) [E0308]
main.rs:44 pmove().map(|m| {Property::White(m)})
^~~~~~~~~~~~~~~~~~~~~~~~
error: aborting due to 2 previous errors
Could not compile `go`.
The function in question is below. I am completely new to rust, so I can't really isolate the problem further or recreate it in a context without the parser-combinators library (might even have something to do with that library?).
fn parse_go_sgf(input: &str) -> Vec<Property> {
let alphabetic = |&:| {parser::satisfy(|c| {c.is_alphabetic()})};
let prop_value = |&: ident, value_type| {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(|c| c == '['),
parser::satisfy(|c| c == ']'),
value_type
)
)
};
let pmove = |&:| {
alphabetic().and(alphabetic())
.map(|a| {to_coord(a.0, a.1)})
};
let pblack = prop_value(
parser::string("B"),
pmove().map(|m| {Property::Black(m)}) //This is where I am first calling the map function.
);
let pwhite = prop_value(
parser::string("W"),
pmove().map(|m| {Property::White(m)}) //This is where the compiler complains
);
let pproperty = parser::try(pblack).or(pwhite);
let mut pnode = parser::spaces()
.with(parser::string(";"))
.with(parser::many(pproperty));
match pnode.parse(input) {
Ok((value, _)) => value,
Err(err) => {
println!("{}",err);
vec!(Property::Unkown)
}
}
}
So I am guessing this has something to do with closures all having different types. But in other cases it seems possible to call the same function with different closures. For example
let greater_than_forty_two = range(0, 100)
.find(|x| *x > 42);
let greater_than_forty_three = range(0, 100)
.find(|x| *x > 43);
Seems to work just fine.
So what is going on in my case that is different.
Also, as I am just learning, any general comments on the code are also welcome.
Unfortunately, you stumbled upon one of the rough edges in the Rust type system (which is, given the closure-heavy nature of parser-combinators, not really unexpected).
Here is a simplified example of your problem:
fn main() {
fn call_closure_fun<F: Fn(usize)>(f: F) { f(12) } // 1
fn print_int(prefix: &str, i: usize) { println!("{}: {}", prefix, i) }
let call_closure = |&: closure| call_closure_fun(closure); // 2
call_closure(|&: i| print_int("first", i)); // 3.1
call_closure(|&: i| print_int("second", i)); // 3.2
}
It gives exactly the same error as your code:
test.rs:8:18: 8:47 error: mismatched types:
expected `closure[test.rs:7:18: 7:46]`,
found `closure[test.rs:8:18: 8:47]`
(expected closure,
found a different closure) [E0308]
test.rs:8 call_closure(|&: i| print_int("second", i));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We have (referenced in the comments in the code):
a function which accepts a closure of some concrete form;
a closure which calls a function (1) with its own argument;
two invocations of closure (1), passing different closures each time.
Rust closures are unboxed. It means that for each closure the compiler generates a fresh type which implements one of closure traits (Fn, FnMut, FnOnce). These types are anonymous - they don't have a name you can write out. All you know is that these types implement a certain trait.
Rust is a strongly- and statically-typed language: the compiler must know exact type of each variable and each parameter at the compile time. Consequently it has to assign types for every parameter of every closure you write. But what type should closure argument of (2) have? Ideally, it should be some generic type, just like in (1): the closure should accept any type as long as it implements a trait. However, Rust closures can't be generic, and so there is no syntax to specify that. So Rust compiler does the most natural thing it can - it infers the type of closure argument based on the first use of call_closure, i.e. from 3.1 invocation - that is, it assigns the anonymous type of the closure in 3.1!
But this anonymous type is different from the anonymous type of the closure in 3.2: the only thing they have in common is that they both implement Fn(usize). And this is exactly what error is about.
The best solution would be to use functions instead of closures because functions can be generic. Unfortunately, you won't be able to do that either: your closures return structures which contain closures inside themselves, something like
pub struct Satisfy<I, Pred> { ... }
where Pred is later constrained to be Pred: FnMut(char) -> bool. Again, because closures have anonymous types, you can't specify them in type signatures, so you won't be able to write out the signature of such generic function.
In fact, the following does work (because I've extracted closures for parser::satisfy() calls to parameters):
fn prop_value<'r, I, P, L, R>(ident: I, value_type: P, l: L, r: R) -> pp::With<pp::With<pp::With<pp::Spaces<&'r str>, I>, pp::Spaces<&'r str>>, pp::Between<pp::Satisfy<&'r str, L>, pp::Satisfy<&'r str, R>, P>>
where I: Parser<Input=&'r str, Output=&'r str>,
P: Parser<Input=&'r str, Output=Property>,
L: Fn(char) -> bool,
R: Fn(char) -> bool {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(l),
parser::satisfy(r),
value_type
)
)
}
And you'd use it like this:
let pblack = prop_value(
parser::string("B"),
pmove().map(|&: m| Property::Black(m)),
|c| c == '[', |c| c == ']'
);
let pwhite = prop_value(
parser::string("W"),
pmove().map(|&: m| Property::White(m)),
|c| c == '[', |c| c == ']'
);
pp is introduced with use parser::parser as pp.
This does work, but it is really ugly - I had to use the compiler error output to actually determine the required return type. With the slightest change in the function it will have to be adjusted again. Ideally this is solved with unboxed abstract return types - there is a postponed RFC on them - but we're still not there yet.
As the author of parser-combinators I will just chime in on another way of solving this, without needing to use the compiler to generate the return type.
As each parser is basically just a function together with 2 associated types there are implementations for the Parser trait for all function types.
impl <I, O> Parser for fn (State<I>) -> ParseResult<O, I>
where I: Stream { ... }
pub struct FnParser<I, O, F>(F);
impl <I, O, F> Parser for FnParser<I, O, F>
where I: Stream, F: FnMut(State<I>) -> ParseResult<O, I> { ... }
These should all be replaced by a single trait and the FnParser type removed, once the orphan checking allows it. In the meantime we can use the FnParser type to create a parser from a closure.
Using these traits we can essentially hide the big parser type returned from in Vladimir Matveev's example.
fn prop_value<'r, I, P, L, R>(ident: I, value_type: P, l: L, r: R, input: State<&'r str>) -> ParseResult<Property, &'r str>
where I: Parser<Input=&'r str, Output=&'r str>,
P: Parser<Input=&'r str, Output=Property>,
L: Fn(char) -> bool,
R: Fn(char) -> bool {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(l),
parser::satisfy(r),
value_type
)
).parse_state(input)
}
And we can now construct the parser with this
let parser = FnParser(move |input| prop_value(ident, value_type, l, r, input));
And this is basically the best we can do at the moment using rust. Unboxed anonymous return types would make all of this significantly easier since complex return types would not be needed (nor created since the library itself could be written to utilize this, avoiding the complex types entirely).
Two facets of Rust's closures are causing your problem, one, closures cannot be generic, and two, each closure is its own type. Because closure's cannot be generic,prop_value's parameter value_type must be a specific type. Because each closure is a specific type, the closure you pass to prop_value in pwhite is a different type from the one in pblack. What the compiler does is conclude that the value_type must have the type of the closure in pblack, and when it gets to pwhite it finds a different closure, and gives an error.
Judging from your code sample, the simplest solution would probably be to make prop_value a generic fn - it doesn't look like it needs to be a closure. Alternatively, you could declare its parameter value_type to be a closure trait object, e.g. &Fn(...) -> .... Here is a simplifed example demonstrating these approaches:
fn higher_fn<F: Fn() -> bool>(f: &F) -> bool {
f()
}
let higher_closure = |&: f: &Fn() -> bool | { f() };
let closure1 = |&:| { true };
let closure2 = |&:| { false };
higher_fn(&closure1);
higher_fn(&closure2);
higher_closure(&closure1);
higher_closure(&closure2);
The existing answer(s) are good, but I wanted to share an even smaller example of the problem:
fn thing<F: FnOnce(T), T>(f: F) {}
fn main() {
let caller = |&: f| {thing(f)};
caller(|&: _| {});
caller(|&: _| {});
}
When we define caller, its signature is not fully fixed yet. When we call it the first time, type inference sets the input and output types. In this example, after the first call, caller will be required to take a closure with a specific type, the type of the first closure. This is because each and every closure has its own unique, anonymous type. When we call caller a second time, the second closure's (unique, anonymous) type doesn't fit!
As #wingedsubmariner points out, there's no way to create closures with generic types. If we had hypothetical syntax like for<F: Fn()> |f: F| { ... }, then perhaps we could work around this. The suggestion to make a generic function is a good one.